Kubernetes has a reputation for being complex, and that reputation is not undeserved. But the complexity is manageable when you build up from first principles rather than trying to absorb the entire API surface at once. This guide takes a Node.js application from "runs on my laptop" to a production Kubernetes deployment with rolling updates and automated rollbacks.
Prerequisites
You'll need Docker Desktop (which ships an optional local Kubernetes cluster you enable in its settings), kubectl, and Helm installed. For a production cluster, we'll reference AWS EKS, but the manifests work on any conformant Kubernetes distribution.
Step 1: Containerise Your Application
A production Docker image should be small, reproducible, and run as a non-root user. A multi-stage build achieves this:
# Build stage
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
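# Install all dependencies, including any devDependencies the build step needs
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only production modules reach the final image
RUN npm prune --omit=dev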
# Production stage
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=build --chown=appuser:appgroup /app/dist ./dist
COPY --from=build --chown=appuser:appgroup /app/node_modules ./node_modules
USER appuser
EXPOSE 3000
CMD ["node", "dist/server.js"]
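Key decisions: an Alpine base (the Alpine layer itself is only about 5 MB, so node:20-alpine comes in at a fraction of the roughly 1 GB Debian-based node image), a non-root user (limits what an attacker can do if the container is compromised), and copying only the compiled output and runtime dependencies (no source code in production).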
Build and test locally: docker build -t my-api:latest . && docker run -p 3000:3000 my-api:latest
Step 2: Writing Production Kubernetes Manifests
A minimal production deployment needs three resources: a Deployment, a Service, and a HorizontalPodAutoscaler. The Deployment carries most of the production-critical settings, so it is the one shown here:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  labels:
    app: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0 # zero-downtime: never remove a pod before adding one
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: my-api
          image: your-registry/my-api:1.0.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          env:
            - name: NODE_ENV
              value: production
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: my-api-secrets
                  key: database-url
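The critical settings: maxUnavailable: 0 with maxSurge: 1 means Kubernetes starts a new pod and waits for it to become ready before removing an old one, so capacity never drops below the desired replica count during a rollout. The two probes do different jobs: the readiness probe keeps traffic away from pods that aren't ready to serve, while the liveness probe restarts containers that have stopped responding.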
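The /health and /ready paths in the manifest have to exist in the application. Here is a minimal sketch of how they might be served with Node's built-in http module. The state object, probeResponse, and createProbeServer are hypothetical names chosen for illustration, and a real service would probably mount these routes in its existing web framework instead.

```javascript
const http = require('node:http');

// Hypothetical readiness state: flip `ready` to true once startup work
// (database connection, cache warm-up) has finished.
const state = { ready: false };

// Pure routing helper so the probe logic can be tested without a socket.
function probeResponse(path, appState) {
  if (path === '/health') {
    // Liveness: the process is up and the event loop is responsive.
    return { status: 200, body: 'ok' };
  }
  if (path === '/ready') {
    // Readiness: only accept traffic once dependencies are available.
    return appState.ready
      ? { status: 200, body: 'ready' }
      : { status: 503, body: 'not ready' };
  }
  return null; // not a probe path; let the application handle it
}

// Wraps the helper in an HTTP server. Anything that is not a probe path
// gets a 404 in this sketch.
function createProbeServer(appState) {
  return http.createServer((req, res) => {
    const result = probeResponse(req.url, appState) ?? { status: 404, body: 'not found' };
    res.writeHead(result.status, { 'Content-Type': 'text/plain' });
    res.end(result.body);
  });
}

// Usage (not executed here):
//   const server = createProbeServer(state);
//   server.listen(3000, () => { state.ready = true; });
```

Keeping the liveness check trivial is deliberate. If /health depended on the database, a database outage would make Kubernetes restart every pod, which rarely helps.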
Step 3: Secrets Management
Never put secrets in your manifests or Docker images. Use Kubernetes Secrets for development and a proper secrets manager (AWS Secrets Manager, HashiCorp Vault) for production, synced into the cluster via the External Secrets Operator.
# Create a secret (development only — use External Secrets in production)
kubectl create secret generic my-api-secrets --from-literal=database-url='postgresql://user:password@host:5432/db'
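On the application side, the secret arrives as the DATABASE_URL environment variable defined in the Deployment. A small sketch of reading it at startup and failing fast when it is missing might look like this. The loadConfig function and the shape of the object it returns are assumptions for illustration, not part of any library.

```javascript
// Reads required settings from an environment-like object and fails fast
// with a message that names the missing variable but never prints values.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    nodeEnv: env.NODE_ENV ?? 'development',
    databaseUrl: env.DATABASE_URL,
  };
}

// Usage (not executed here):
//   const config = loadConfig();
```

Crashing at startup when the secret is absent means a misconfigured pod never passes its readiness probe, so the rolling update stalls instead of replacing healthy pods.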
Step 4: Setting Up Helm
Helm is the package manager for Kubernetes. Instead of maintaining separate YAML files for dev/staging/production with duplicated configuration, Helm charts let you parameterise your manifests:
helm create my-api-chart
# This generates a chart with templates for Deployment, Service, Ingress, HPA
Replace hardcoded values with template variables like {{ .Values.image.tag }} and {{ .Values.replicaCount }}, which are the names the scaffolded values.yaml defines. Then deploy different configurations per environment:
# Staging
helm upgrade --install my-api ./my-api-chart -f values.staging.yaml --set image.tag=abc1234
# Production
helm upgrade --install my-api ./my-api-chart -f values.production.yaml --set image.tag=1.2.0
Step 5: Zero-Downtime Deployments
With maxUnavailable: 0 set and readiness probes configured, Kubernetes already replaces pods without dropping capacity. But there's one more step: ensure your application handles SIGTERM gracefully. When Kubernetes terminates a pod, it sends SIGTERM and waits up to terminationGracePeriodSeconds (default 30 seconds) before force-killing it with SIGKILL. Your application should stop accepting new connections, let in-flight requests finish, close its database connections, and then exit:
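process.on('SIGTERM', () => {
  console.log('SIGTERM received, closing server gracefully');
  // Force exit after 25 seconds if graceful shutdown hangs
  // (stays under the default 30-second grace period)
  setTimeout(() => process.exit(1), 25000).unref();
  // Stop accepting new connections; the callback runs once in-flight requests finish
  server.close(async () => {
    try {
      await db.pool.end(); // close database connections
      process.exit(0);
    } catch (err) {
      console.error('Error during shutdown', err);
      process.exit(1);
    }
  });
});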
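If you want to unit-test this behaviour, one option is to wrap the same pattern in a small factory that takes its dependencies as arguments. The sketch below makes several assumptions: createShutdownHandler, closeResources, and the injectable exit function are hypothetical names, and server only needs a close(callback) method like the one http.Server provides. It also ignores repeated signals, which the inline handler does not.

```javascript
// Builds a SIGTERM handler. `closeResources` is a hypothetical async cleanup
// hook (for example, closing a database pool). `exit` is injectable for tests.
function createShutdownHandler({ server, closeResources, timeoutMs = 25000, exit = process.exit }) {
  let shuttingDown = false;

  return function onSignal() {
    if (shuttingDown) return false; // ignore repeated signals
    shuttingDown = true;

    // Safety net: force a non-zero exit if draining takes too long.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // do not keep the process alive just for this timer

    // Stop accepting connections; the callback runs after in-flight requests end.
    server.close(async (err) => {
      let code = err ? 1 : 0;
      try {
        await closeResources();
      } catch (cleanupError) {
        code = 1;
      }
      clearTimeout(timer);
      exit(code);
    });
    return true;
  };
}

// Usage (not executed here):
//   process.on('SIGTERM', createShutdownHandler({
//     server,
//     closeResources: () => db.pool.end(),
//   }));
```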
Step 6: Automated Rollbacks
Kubernetes stores rollout history. If a deployment causes errors, roll back with a single command:
kubectl rollout undo deployment/my-api
# Or inspect the revision history and roll back to a specific revision
kubectl rollout history deployment/my-api
kubectl rollout undo deployment/my-api --to-revision=3
For automated rollbacks triggered by error rate spikes, integrate your deployment pipeline with your monitoring stack. Argo Rollouts provides progressive delivery with canary deployments and automatic metric-based rollbacks — highly recommended for production systems handling significant traffic.
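To make the idea of a metric-based rollback concrete, here is a rough sketch of the decision a pipeline step might make after a deploy. Everything in it is an assumption for illustration: the shouldRollBack name, the shape of the sample object, and the 5% and 100-request thresholds. It is not how Argo Rollouts is configured. It only shows the kind of check such a tool automates.

```javascript
// Decides whether a post-deploy error rate justifies rolling back.
// `sample` holds request counts gathered from your monitoring stack over
// some observation window (how they are fetched is out of scope here).
function shouldRollBack(sample, { maxErrorRate = 0.05, minRequests = 100 } = {}) {
  const { totalRequests, errorRequests } = sample;
  if (totalRequests < minRequests) {
    return false; // too little traffic to judge reliably
  }
  return errorRequests / totalRequests > maxErrorRate;
}

// Usage (not executed here): if shouldRollBack(sample) returns true, the
// pipeline would run `kubectl rollout undo deployment/my-api`.
```

The minimum-request guard matters. Without it, a single failed request out of three in the first seconds after a deploy would trigger a rollback.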
Wrapping Up
The Kubernetes learning curve is real, but the payoff is a deployment platform that is reliable, auditable, and self-healing. Start with these fundamentals — multi-stage Docker builds, proper resource requests/limits, readiness probes, and rolling updates — and your system will be significantly more robust than a fleet of manually managed servers.
The next steps from here: set up KEDA for event-driven autoscaling, configure PodDisruptionBudgets for maintenance windows, and instrument your pods with Prometheus metrics.