November 26, 2020 - 14 min
Docker is a really cool technology to have applications run in the same, predictable context, ensuring the “but it works on my machine” comments are a thing of the past. Not to mention how Docker integrates with Docker Compose and Kubernetes for larger-scale applications! But that’s for another day.
In this article, you will learn about some good principles to containerise Node.js apps efficiently. These principles are probably quite general and well-known, but they’re easy to overlook for beginners to Docker and Dockerfiles.
You will need to have Docker and Node.js installed locally to follow along comfortably.
Let’s start with a simple Next.js application:
$ npm create next-app --use-npm
and let’s write the following into a file called Dockerfile at the root of the project:
FROM node:14.15.0
WORKDIR /app
COPY . .
RUN npm ci
RUN npm run build
EXPOSE 3000
USER node
CMD ["npm", "run", "start"]
Okay, what just happened? 😄
FROM <image>: indicates what pre-existing Docker image we want to build upon
WORKDIR <some-path-in-the-container>: indicates that our application code will live in a directory inside the container (will be created if the path does not exist)
COPY <from-context> <to-container>: copies files from the build context into the container’s filesystem
RUN <command>: runs a command as part of the docker build process only
EXPOSE <port>: does nothing more than document which port the app uses; it does not publish the port by itself
USER <user>: by default, we executed all the previous commands as the default user for the Docker image we chose above (which often is root, with superuser privileges); it’s good practice to use a different user to run the app.
CMD <command>: the command that will be executed when we run docker run <image-tag>; it is not executed during the build process.
Assuming we have Docker installed, we could now run
$ docker build --tag my-containerised-nextjs-app .
This command kindly asks the Docker daemon running on our machine to build the Dockerfile we just wrote with the files present in the current directory (the .) and to call that image my-containerised-nextjs-app (with the --tag flag).
After the build completes, run
$ docker run --rm -p 3000:3000 my-containerised-nextjs-app
and we should have our app running at http://localhost:3000.
This second command orders the Docker daemon to run the image named my-containerised-nextjs-app, mapping port 3000 on our machine to port 3000 inside the container so that we can interact with our Next.js app. This port mapping is done through the -p hostPort:containerPort flag. Go ahead and change the port mapping and see if you can still access the app.
The --rm flag indicates that we do not want the Docker daemon to keep the container when we stop its execution (with CTRL+C); this is useful not to clog our machines with stopped containers and their logs while developing.
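To experiment with the port mapping, here is a minimal sketch of a shell helper. The run_app function and the DRY_RUN variable are hypothetical names introduced only for illustration; the docker run flags are the ones used in the command above.

```shell
# Hypothetical helper (not part of the original setup): maps a host port of
# our choice to port 3000 inside the container.
run_app() {
  host_port="${1:-3000}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Illustration only: print the command instead of executing it.
    echo "docker run --rm -p ${host_port}:3000 my-containerised-nextjs-app"
  else
    docker run --rm -p "${host_port}:3000" my-containerised-nextjs-app
  fi
}

# Preview the command that would serve the app on host port 8080.
DRY_RUN=1 run_app 8080
```

With the host port set to 8080, the app would be reachable at http://localhost:8080 while still listening on port 3000 inside the container.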
Voilà, we have containerised our Next.js app. The End. 🎉
Wait! Don’t leave, there’s more! 👇
The section above does the job, but it’s not optimal. If we only have a limited number of pipeline minutes and we don’t like wasting resources ♻️, this setup will eat more than is necessary. There are a few strategies to optimise a Dockerfile:
- Create a comprehensive .dockerignore file
- Use an appropriate base image
- Decouple the dependency installation step from application code
- Decouple the build from the resulting built application
- (Optional) Purge node_modules of unnecessary files
If we look at the logs from building the Docker image, we could read:
Step 4/8 : RUN npm ci
 ---> Running in 559b499f0e2c
npm WARN prepare removing existing node_modules/ before installation
That means we copied the node_modules folder from our local context over to the Docker image being built. What’s more, npm ci removes it to make sure it works consistently. That is an indication that we should not copy node_modules in the first place.
To let Docker know some files and folders should be ignored, we can list them in a .dockerignore file. For example, add the following to a file named .dockerignore:
node_modules
Run the build again and the warning disappears.
What’s more, we can see how much less was sent to the Docker build context:
# BEFORE
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  101.8MB
# AFTER
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  400.4kB
In practice, we would want to add as many files as possible to our .dockerignore to ensure we’re not wasting time and energy on files which won’t be necessary for the build. This can include documentation files (e.g. *.md), configuration files which are only necessary when developing (e.g. .prettier*) as well as our Git history (e.g. .git).
With the following .dockerignore
node_modules
*.md
.git
we end up sending the bare minimum to the Docker daemon:
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  248.3kB
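As a sketch, a more thorough .dockerignore could be written from the shell as follows. The .next and .prettier* entries are assumptions about what the project contains locally, so adjust the list to the actual repository.

```shell
# Write a .dockerignore at the root of the project (overwrites an existing one).
cat > .dockerignore <<'EOF'
# dependencies are reinstalled by npm ci inside the image
node_modules
# local build output (assumed): the image runs its own build
.next
# Git history
.git
# documentation
*.md
# formatter configuration only needed while developing
.prettier*
EOF

# Show the result
cat .dockerignore
```

Note that a pattern such as *.md only matches files directly under the root of the build context; **/*.md would also match nested ones.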
Docker images come in all shapes and sizes. When it comes to building or deploying our app, the smaller the size, the better. For our Next.js app, we don’t need all that’s available in node:14.15.0. For example, we don’t need curl. We can instead use the alpine flavour of that image, namely node:14.15.0-alpine.
The difference in size between node:14.15.0 and node:14.15.0-alpine is enormous!
REPOSITORY   TAG              SIZE
node         14.15.0          943MB
node         14.15.0-alpine   117MB
Our final containerised Next.js app will hence be much smaller thanks to this teeny-tiny modification of our Dockerfile:
- FROM node:14.15.0
+ FROM node:14.15.0-alpine
  WORKDIR /app
  COPY . .
  RUN npm ci
  RUN npm run build
  EXPOSE 3000
  USER node
  CMD ["npm", "run", "start"]
For reference, if we want to inspect the size of the Docker images we have available locally, we can run docker image ls --all.
Our Docker image has size 1.05GB if we use node:14.15.0:
REPOSITORY                    TAG      IMAGE ID       CREATED          SIZE
my-containerised-nextjs-app   latest   c78d43a2c5bc   15 seconds ago   1.05GB
and it drops to 229MB if we use node:14.15.0-alpine:
REPOSITORY                    TAG      IMAGE ID       CREATED          SIZE
my-containerised-nextjs-app   latest   d0746ef6e8a4   21 seconds ago   229MB
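To put these numbers in perspective, the relative savings can be worked out with a few lines of shell arithmetic. This is only a back-of-the-envelope sketch: percent_saved is a hypothetical helper, it uses integer maths, and 1.05GB is taken as 1050MB.

```shell
# Percentage saved when going from one size to another (sizes in MB, rounded down).
percent_saved() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

echo "base image: $(percent_saved 943 117)% smaller"    # 943MB -> 117MB
echo "app image:  $(percent_saved 1050 229)% smaller"   # 1.05GB (as 1050MB) -> 229MB
```

This prints 87% for the base image and 78% for the application image.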
Our setup is already much better with a suitable .dockerignore file and with a much leaner base image. However, even when we haven’t made any changes to our package.json, the npm ci step still has to run whenever we make any change to our application code.
This is because Docker caches the result of each step as an intermediate image layer, and a change in the copied files invalidates every step that follows. For example, if we were to run the build again (docker build --tag my-containerised-nextjs-app .) without changing anything, we would see that the cache is used for all steps, resulting in a nearly instant build (because nothing happened, really 😄):
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  248.3kB
Step 1/8 : FROM node:14.15.0-alpine
 ---> 9db54a688554
Step 2/8 : WORKDIR /app
 ---> Using cache
 ---> d5d4291dc699
Step 3/8 : COPY . .
 ---> Using cache
 ---> b56455e98a93
Step 4/8 : RUN npm ci
 ---> Using cache
 ---> ba4ab784f27e
Step 5/8 : RUN npm run build
 ---> Using cache
 ---> a51c18594a20
Step 6/8 : EXPOSE 3000
 ---> Using cache
 ---> 806dc06cbb6c
Step 7/8 : USER node
 ---> Using cache
 ---> b86db08de717
Step 8/8 : CMD ["npm", "run", "start"]
 ---> Using cache
 ---> d0746ef6e8a4
Successfully built d0746ef6e8a4
Successfully tagged my-containerised-nextjs-app:latest
If we now modify anything in the folder that is not ignored thanks to our .dockerignore file, Docker will have to bust the cache from Step 3 onwards (the “Using cache” expression is not printed any more). Let’s create an empty file and run the build again:
# create an empty file
$ touch file
# build again
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  248.8kB
Step 1/8 : FROM node:14.15.0-alpine
 ---> 9db54a688554
Step 2/8 : WORKDIR /app
 ---> Using cache
 ---> d5d4291dc699
Step 3/8 : COPY . .
 ---> e2b761bb08a6 # The cache is not used anymore
Step 4/8 : RUN npm ci
 ---> Running in b36f2511e117
...
To remedy this and ensure that npm ci runs only when necessary, we can first copy the package.json and package-lock.json files, run npm ci, and then bring in the rest of the files to build the project:
  FROM node:14.15.0-alpine
  WORKDIR /app
- COPY . .
+ COPY package.json .
+ COPY package-lock.json .
  RUN npm ci
+ COPY . .
  RUN npm run build
  EXPOSE 3000
  USER node
  CMD ["npm", "run", "start"]
After we modify the Dockerfile by applying the diff above, we see that Docker no longer leverages the cache from the first changed step onwards, because the steps are different. This always happens when we modify the layers and/or their order.
However, after running this build for a first time, subsequent builds will be much faster as the npm ci step will use the cache unless we modify package.json or package-lock.json:
# create another file to see when Docker stops using the cache
$ touch another-file
# run the build again
$ docker build --tag my-containerised-nextjs-app .
Sending build context to Docker daemon  249.3kB
Step 1/10 : FROM node:14.15.0-alpine
 ---> 9db54a688554
Step 2/10 : WORKDIR /app
 ---> Using cache
 ---> d5d4291dc699
Step 3/10 : COPY package.json .
 ---> Using cache
 ---> 139b5a611c0b
Step 4/10 : COPY package-lock.json .
 ---> Using cache
 ---> 5bc380de1397
Step 5/10 : RUN npm ci
 ---> Using cache
 ---> dff6b2da4bdc
Step 6/10 : COPY . .
 ---> 08911c09b40f
Step 7/10 : RUN npm run build
 ---> Running in f5374a54d9ad
> email@example.com build /app
> next build
...
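The effect can be illustrated without Docker. The sketch below is only an analogy, not how Docker computes its cache internally: it assumes the dependency layer depends on nothing but the contents of the two manifest files, and uses cksum as a stand-in fingerprint. The deps_key function and the temporary files are hypothetical.

```shell
# Work in a throwaway directory with two fake manifest files.
workdir="$(mktemp -d)"
echo '{"name":"demo","version":"1.0.0"}' > "$workdir/package.json"
echo '{"lockfileVersion":1}' > "$workdir/package-lock.json"

# Stand-in for the cache key of the dependency layer: a checksum of the manifests only.
deps_key() {
  cat "$workdir/package.json" "$workdir/package-lock.json" | cksum
}

before="$(deps_key)"

# Adding an unrelated application file leaves the key untouched...
touch "$workdir/another-file"
after_touch="$(deps_key)"

# ...whereas editing package.json changes it.
echo '{"name":"demo","version":"1.0.1"}' > "$workdir/package.json"
after_edit="$(deps_key)"

if [ "$before" = "$after_touch" ]; then
  echo "unrelated file added: the npm ci layer can be reused"
fi
if [ "$before" != "$after_edit" ]; then
  echo "package.json edited: npm ci has to run again"
fi

rm -rf "$workdir"
```

Copying the manifests in their own steps gives Docker the same property: only a change to those two files invalidates the dependency installation.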
For this section to make more sense, we will create another Next.js app that uses some dev dependencies which are necessary only when building the app but not when running it in production.
$ npm create next-app --use-npm --example with-tailwindcss with-tailwindcss-app
Copy the Dockerfile and .dockerignore files created in the other example into with-tailwindcss-app. Build the Docker image and run it to make sure it’s working well before continuing:
$ docker build --tag with-tailwindcss-app .
$ docker run --rm -p 3000:3000 with-tailwindcss-app
As we can see in the package.json file, we have a few dependencies which are only ever going to be used when building our app. We won’t need them in the final Docker image. So let’s get rid of them!
We are going to leverage multi-stage Docker builds. They sound more complicated than they are :)
Here’s a multi-stage build Dockerfile: in the builder stage, we install dependencies and build our app, and in the second stage, we copy relevant files from the builder to our final image and install only production dependencies.
FROM node:14.15.0-alpine as builder
WORKDIR /app
COPY package.json .
COPY package-lock.json .
RUN npm ci
COPY . .
RUN npm run build

FROM node:14.15.0-alpine
WORKDIR /app
COPY --from=builder /app/package.json .
COPY --from=builder /app/package-lock.json .
RUN npm ci --production
COPY --from=builder /app/.next .next
EXPOSE 3000
USER node
CMD ["npm", "run", "start"]
If our app also contained a Next.js config file and a public folder, we would also need to copy them from the builder:
  ...
  COPY --from=builder /app/.next .next
+ COPY --from=builder /app/next.config.js .
+ COPY --from=builder /app/public public
  ...
Comparing the sizes, we see that we removed 39MB of dev dependencies, not bad.
- without multi-stage build
REPOSITORY             TAG      IMAGE ID       CREATED          SIZE
with-tailwindcss-app   latest   88089e6f965d   18 minutes ago   269MB
- with multi-stage build
REPOSITORY             TAG      IMAGE ID       CREATED          SIZE
with-tailwindcss-app   latest   ba7307736fb0   25 seconds ago   230MB
Finally, to make our Docker image the leanest, we will purge our node modules from all the files which were included by the library authors and aren’t directly useful to our Docker image. For example, documentation files and READMEs, config files for Prettier and such, etc.
Note that this is not without risk as we could be removing files that we actually need to run the application correctly (we had a file named test.js that contained important code and it was mercilessly removed by the tool, which broke the application 😅).
We will add another stage called production-deps to make our node modules go through a diet with node-prune (GitHub repo). The list of files and folders that node-prune will remove is largely inspired by Yarn’s default list (generated through yarn autoclean --init).
To install node-prune, we need curl to be a locally available executable, so we can either choose a base image that contains it out of the box, or we can add it to node:14.15.0-alpine. Because we won’t need anything else, we will choose the latter.
  FROM node:14.15.0-alpine as builder
  WORKDIR /app
  COPY package.json .
  COPY package-lock.json .
  RUN npm ci
  COPY . .
  RUN npm run build

+ FROM node:14.15.0-alpine as production-deps
+ WORKDIR /app
  COPY --from=builder /app/package.json .
  COPY --from=builder /app/package-lock.json .
  RUN npm ci --production
+ # Add curl
+ RUN apk --no-cache add curl
+ # Install node-prune https://github.com/tj/node-prune
+ RUN curl -sf https://gobinaries.com/tj/node-prune | sh
+ RUN node-prune /app

  FROM node:14.15.0-alpine
  WORKDIR /app
+ COPY --from=production-deps /app/package.json .
+ COPY --from=production-deps /app/package-lock.json .
+ COPY --from=production-deps /app/node_modules node_modules
  COPY --from=builder /app/.next .next
  EXPOSE 3000
  USER node
  CMD ["npm", "run", "start"]
Try it by building the image once more.
When node-prune ran, it printed out in the console that it removed 12MB:
...
Step 15/24 : RUN node-prune /app
 ---> Running in 69e030ab8b70
files total 14,269
files removed 3,722
size removed 12 MB
duration 216ms
...
However, surprisingly, a whopping 32MB were removed in comparison to our previous Dockerfile. (It seems node-prune’s tallying is not totally accurate: when I tested locally, it removed 20.7 MB instead, though that still doesn’t explain the other 11.3 MB 🤷)
$ docker image ls with-tailwindcss-app
REPOSITORY             TAG      IMAGE ID       CREATED         SIZE
with-tailwindcss-app   latest   c5e2081dba30   9 seconds ago   198MB
Let’s ensure our app is running well without the files node-prune removed and we can be on our merry way! 🐳 🎉
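Given the earlier anecdote about a removed test.js, one hedged way to get an early warning is to list files with that name under node_modules before pruning and review them by hand. This is a sketch: list_suspects is a hypothetical helper, and matching on the name test.js is an assumption based on that single anecdote, not on node-prune’s actual rules.

```shell
# Hypothetical pre-flight check: print files named test.js under a directory.
list_suspects() {
  find "$1" -type f -name 'test.js' | sort
}

# Demonstration on a throwaway tree; inside a real project we would run:
#   list_suspects node_modules
demo="$(mktemp -d)"
mkdir -p "$demo/node_modules/some-lib" "$demo/node_modules/other-lib"
touch "$demo/node_modules/some-lib/index.js" "$demo/node_modules/some-lib/test.js"
touch "$demo/node_modules/other-lib/index.js"

list_suspects "$demo/node_modules"
```

Any file it reports that is required at runtime is a reason to be careful with pruning, or to check the running app thoroughly afterwards.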
To summarise what we’ve learnt so far, here’s a small checklist to review our own Docker build setups:
- I have a comprehensive .dockerignore file
- I use an appropriate base image
- I decoupled the dependency installation step to improve caching
- I decoupled the resulting built application from the build step
- I purged node_modules of unnecessary files
Time to flex! 💪 Show me your before and after Docker image sizes in the comments below!
Personal blog written by Robin Cussol
I like math and I like code. Oh, and writing too.