You’re running a microservice-based application in Kubernetes. Maybe your application has 2 microservices, or 5, or 25. Everything’s humming along fine until you roll out a new update and the application starts misbehaving. Users are complaining that they can’t log in, those who can are getting blank pages, etc. Since you deployed with your continuous delivery process, you’re not quite sure what’s broken, as all of the Pods seem to be healthy and running fine. What do you do?
One option is to start troubleshooting the app to determine what’s broken and try to fix it with an update. While it’s good to dig in to better understand the problem, do you want to do that on your production system while users aren’t able to get value out of your app? Probably not.
What you’d most likely want to do is to get your application back to its latest healthy state so that you can get customers back online and then troubleshoot the application update in a different environment without impacting your application availability. That’s where rolling back your entire application to its previous state comes into play, but how to do it so that you know everything’s at the previous (working) version?
Yipee is in the incubation phase of developing capabilities in our upcoming onPrem/inCluster product. Deployed into the cluster, it will be able to discover applications deployed into the cluster and keep track of the state of your application as it changes over time. We’re also working on an advanced ability to diff Kubernetes applications, either between revisions in time or between instances of the deployed application in different environments.
Of course, you will always be in control of which applications we’re able to see and we’ll also give you the ability to let us know that a particular application isn’t able to handle rollbacks, such as in the case when complex data migrations have to be performed manually.
For those applications that are suitable for rollback, an operator will be able to view the application as it’s deployed and initiate a rollback when things have gone wrong. We can either initiate the standard rollback mechanisms or use an alternate process whereby we shut down the app, execute any custom Jobs necessary to prepare the application for rollback, and then bring the application back up using the configuration that existed prior to the failed update. If you’re running a CD process such as GitOps, we’ll write the appropriate configuration back into a git repository you’ve given us so that your normal CD process handles the update as usual.
With Yipee + CI/CD + Kubernetes, you can deploy your application confidently knowing that Yipee has your (roll)back.
Interested in participating as an early adopter? Visit us at https://yipee.io/rollback_apps/ and let us know.
If you need a refresher on the characteristics and limitations of Kubernetes’ built-in rollback features, read on…
Built-in Kubernetes Update and Rollback Mechanisms
Kubernetes has an update/rollback mechanism for Pods created by Deployments, DaemonSets, and StatefulSets (hereafter referred to as Controllers) as long as you set the updateStrategy for these to RollingUpdate. Even if you leave that set at OnDelete (the default for DaemonSets and StatefulSets in older API versions), you can manually effect a rollback by updating the PodTemplate to its previous configuration in the appropriate controller and manually deleting the pods created by the controller. This is sometimes handy if you have a complicated rollback which requires undoing data transformation as you roll back.
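For reference, the strategy is declared on the controller itself. Here is a minimal, illustrative StatefulSet sketch (the names and image are invented for the example):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  updateStrategy:
    type: RollingUpdate   # set to OnDelete to require manual pod deletion instead
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: mysql
        image: mysql:8
```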
Also, there is a separate rolling update mechanism for (legacy) ReplicationControllers, which I’ll discuss in a bit.
Commands which trigger rolling pod updates
Not all updates to a controller initiate a rolling update, even if the updateStrategy is set to RollingUpdate. For a rolling update to occur, you need to update something in the controller’s PodTemplate. This makes sense, since anything else in the controller doesn’t affect the configuration of the Pods at all (which is what rolling updates are for). You can, for example, scale up/down the number of pods in a Deployment, and that would just result in adding/removing pods without changing the currently running pods’ configurations.
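To make that concrete, here is a hedged sketch of a Deployment (names are illustrative): only edits under spec.template produce a rolling update, while fields like spec.replicas do not.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
spec:
  replicas: 3                 # changing this only scales pods; no rolling update
  selector:
    matchLabels:
      app: db
  template:                   # any change under the PodTemplate triggers a rolling update
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: mysql
        image: mysql:8        # e.g. bumping this image starts a new rollout revision
```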
Also, the methods for updating/rolling back controllers/pods differ depending on the controller type. ReplicationControllers have an earlier version of the rollout/rollback capability, while Deployments, DaemonSets, and StatefulSets use a newer (and more robust) mechanism.
Deployments, DaemonSets, and StatefulSets
There are various ways you can initiate a rolling update for these types of controllers:
- kubectl set <parameter> <controller type> [<controller name>] <container name>=<new value> ¹

$ kubectl set image deployment db mysql=mysql:8
- kubectl edit <controller type>/<controller name> (change some parameters in the PodTemplate in your editor and save)
$ kubectl edit deployment/db
- kubectl apply -f <filename> ²
$ kubectl apply -f database.yaml
Note that the controller’s revisionHistoryLimit setting controls how many total versions are available for rollback. The default value of 10 should give plenty of leeway to roll back to a good earlier version, but you can change this to your liking.
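As a sketch, the limit is just a field on the controller spec (controller name is illustrative; remaining fields elided):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
spec:
  revisionHistoryLimit: 5   # keep only the 5 most recent revisions (default: 10)
  # ... remaining spec unchanged ...
```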
The StatefulSet rolling update proceeds in the opposite order from pod creation, so pods are updated in reverse ordinal order.
ReplicationControllers
Rolling updates for ReplicationControllers work a bit differently. Done manually, you would create a new ReplicationController (with a new name) with your updated attributes, plus at least one label/selector pair such that the new ReplicationController matches only the newly created pods. Then you deploy the new ReplicationController to create the new pods. Once those Pods are up and running, you scale down the old ReplicationController until all of its pods are gone. Kubernetes supplies a kubectl subcommand, rolling-update, which helps manage this process and has options that save you from copying and editing the new ReplicationController YAML yourself.
- kubectl rolling-update <controller name> [<new controller name>] --image=<new container image> ³

$ kubectl rolling-update dbv1 dbv2 --image=mysql:8
$ kubectl rolling-update dbv1 --image=mysql:8
$ kubectl rolling-update dbv1 -f databasev2.yaml
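For the -f variant, the new ReplicationController file (databasev2.yaml in the example above) might look like this sketch; all names, labels, and images are illustrative:

```yaml
# databasev2.yaml -- sketch of the "new" ReplicationController
apiVersion: v1
kind: ReplicationController
metadata:
  name: dbv2              # must differ from the old RC's name (dbv1)
spec:
  replicas: 3
  selector:
    app: db
    version: v2           # extra selector value so dbv2 matches only its own pods
  template:
    metadata:
      labels:
        app: db
        version: v2
    spec:
      containers:
      - name: mysql
        image: mysql:8
```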
The biggest drawback with this whole approach is that there is no history of updates to which you can easily roll back. You need to keep track of changes yourself and roll back to the previous configuration manually.
OK. Now a quick refresher on rolling back controllers to previous states. In general, this can happen in one of two ways: automatic rollback (when a ReadinessProbe fails during the update) or manual rollback (when there were no probe failures). Automatic rollback is generally what you’d want, unless there are breaking changes that impact multiple microservices.
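The probe that gates an update is declared on the container. A minimal sketch, with values mirroring the MySQL example used later in this post:

```yaml
# Container fragment from a PodTemplate
containers:
- name: mysql
  image: mysql:8
  ports:
  - containerPort: 3306
  readinessProbe:           # pods that fail this probe receive no traffic,
    tcpSocket:              # which stalls the rolling update
      port: 3306
    initialDelaySeconds: 5
    periodSeconds: 2
```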
Rolling Back Deployments, DaemonSets, and StatefulSets
Assuming you used one of the update methods mentioned previously, you can use the kubectl rollout subcommand in order to view previous rolling updates and to revert back to a previous version of your controller/pods.
kubectl rollout history
The kubectl rollout history subcommand shows you information about current and past revisions of your controller. Let’s take an example of an initial deployment of a simple wordpress app consisting of a pod containing the wordpress container and another containing MySQL.
$ kubectl rollout history deployment/db
deployments "db"
REVISION  CHANGE-CAUSE
1         <none>
Now make some changes that will trigger an update:
$ kubectl set image deployment/db mysql=mysql:8.0.13
deployment.apps "db" image updated
And running the kubectl rollout history command again:

$ kubectl rollout history deployment/db
deployments "db"
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
Now there are two revisions: the original plus our updated deployment. You can view the details of each revision with kubectl rollout history --revision=<revision number>:
$ kubectl rollout history deployment/db --revision=2
deployments "db" with revision #2
Pod Template:
  Labels:       app=wordpress
                component=db
                name=wordpress
                pod-template-hash=1288161368
  Containers:
   mysql:
    Image:      mysql:8.0.13
    Port:       3306/TCP
    Host Port:  0/TCP
    Liveness:   exec [mysqladmin ping] delay=30s timeout=5s period=5s #success=1 #failure=5
    Readiness:  tcp-socket :3306 delay=5s timeout=1s period=2s #success=1 #failure=3
    Environment:
      MYSQL_DATABASE:             wpdb
      MYSQL_PASSWORD:             <set to the key 'password' in secret 'dbpass'>  Optional: false
      MYSQL_RANDOM_ROOT_PASSWORD: yes
      MYSQL_USER:                 wpdbuser
    Mounts:
      /var/lib/mysql from wpdata (rw)
  Volumes:
   wpdata:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  wpdata-claim
    ReadOnly:   false
kubectl rollout undo
Now suppose that the pods updated correctly, but for some reason the app is no longer functioning. We can use the kubectl rollout undo command to roll back to the previous version, or kubectl rollout undo --to-revision=<revision number> to roll back to a specific revision. ⁶
$ kubectl rollout undo deployment/db --to-revision=1
deployment.apps "db"
Rolling Back ReplicationControllers
A rollback for ReplicationControllers is essentially just a rolling update back to the old version. To do it, just append the --rollback=true flag to the original kubectl rolling-update command.
$ kubectl rolling-update dbv1 dbv2 --image=mysql:8 --rollback=true
$ kubectl rolling-update dbv1 -f databasev2.yaml --rollback=true
This will revert the changes. Note that you can only go back a single revision.
You can imagine for a complex application with many microservices that it can be challenging to know how to get your application back into a known good state. You’d need to figure out which controllers actually updated (i.e. had a change in the PodTemplateSpec that required a rolling update) and hunt up their previous versions and roll them back, while leaving the others running. One might even imagine adding or updating an initContainer for a previous deployment to handle any data migrations so that the application container itself will be able to work as intended. Technically that isn’t a rollback, but rather an update to a new version of the microservice which works like the old one once the data migration has been completed.
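A hypothetical sketch of that idea: a copy of the previous Deployment’s PodTemplate with an added initContainer that reverses the migration before the old application container starts. The migration image and command here are invented purely for illustration.

```yaml
# Fragment of a rollback Deployment's PodTemplate
spec:
  template:
    spec:
      initContainers:
      - name: undo-migration
        image: example/db-migrate:1.0               # hypothetical migration tool
        command: ["migrate", "down", "--to", "v1"]  # runs to completion before mysql starts
      containers:
      - name: mysql
        image: mysql:5                              # the previous, known-good image
```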
Also, if there were changes to other resources, e.g. Services, you might need to re-apply their original configurations, which aren’t recorded in any rollback history.
Hopefully this has given you a better understanding of updates and rollbacks. If you’re interested in participating in our early adopter program for a simplified update and rollback mechanism, visit us at https://yipee.io/rollback_apps/ and let us know.
1. Applies to the parameters env, image, resources, and serviceaccount. Check kubectl set <parameter> --help for the exact syntax, as it varies between parameters.
2. Requires the resource to have been created with kubectl apply -f or with kubectl create -f --save-config.
3. Applies only to ReplicationControllers.
4. Applies only to ReplicationControllers.
5. Note that this method can change more than just the image; it requires a new metadata.name value and at least one common label in the spec.selector value, and it must use the same metadata.namespace value as before.
6. In this case, you might want to check whether the update introduced any database schema changes that must be reverted before rolling back; otherwise the application still won’t work once rolled back.