Kubernetes Data Protection: A Guide to Application-Aware Backups with Bareos

Companies and organizations continuously transform enterprise application delivery. Some have moved their applications to containers, others have transformed existing applications into microservices, or developed new cloud-native applications. 

When it comes to data protection and backups, containerized apps are not that different from traditional applications running on virtual machines or bare metal. The backup solution should consider both an application’s state and its data.

This post introduces a backup concept for Kubernetes deployments, creating file-based backups of applications running in multiple containers and pods. It makes use of Bareos’ feature for client-initiated backups and offers the following benefits: 

  • Protection of business-critical containerized applications 
  • Quick restore and easy redeployment of application data 
  • Flexible approach, easy to migrate within and across the cloud 
  • Consistent backup concept with certified Open Source software 

Stateful Applications in K8s

A typical Kubernetes setup includes many components, for example, containers, pods, services, certificates, and secrets. Backup concepts exist that handle the Kubernetes objects and their configuration, but this approach is different: an application-aware backup secures the current state of the applications at the time of the backup. This can include data in memory and pending transactions. 

In the example setup, a typical WordPress blog with a MySQL database is deployed to Kubernetes. The blog software runs in one container, the database backend in another one. To make it more interesting, those two containers belong to different pods. 

The goal is to create:

  • a file-based backup of the WordPress installation
  • a logical backup of the database 

The challenge is a closed setup: the two pods and their containers can’t be accessed from the outside. Bareos solves this with client-initiated connections: the File Daemon runs as a sidecar container, connects to the Director on its own, and uses the same settings as WordPress to access the MySQL database. 

Sidecar containers for backup and restore

In Kubernetes, a pod is a group of one or more containers. A sidecar container is a utility container that runs in the same pod as a main container. The sidecar and the primary application container therefore share resources, for example, the pod’s network interfaces and storage volumes. 

Sidecar containers are mainly used to extend the main containers’ functionality without having to change their codebase. In this setup, the pod with the WordPress installation has a sidecar container which runs the Bareos File Daemon. The sidecar has access to the same persistent volume as the WordPress container, so it can back up and restore the volume’s contents. Additionally, the sidecar container includes the MySQL credentials, so it can back up and restore the database. 

Kubernetes clusters (v1.29+) can use “native sidecars”. These are defined as initContainers with restartPolicy: Always. This makes the sidecar start before the main application containers and keep running for the full Pod lifetime.
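To make the native sidecar idea more concrete, here is a minimal sketch of a pod manifest, wrapped in a shell function so it can be piped to kubectl. It is not taken from the example repository: the sidecar image name, the volume name, and the claim name are hypothetical placeholders, and a real deployment would also pass the Director and MySQL settings as environment variables.

```shell
#!/bin/sh
# Print a sketch of a Pod with a native sidecar (initContainer with
# restartPolicy: Always). All names marked "hypothetical" are assumptions.
native_sidecar_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: wordpress
spec:
  initContainers:
    - name: bareos-fd
      image: registry.example.com/bareos-fd-sidecar:latest  # hypothetical image
      restartPolicy: Always          # marks this initContainer as a native sidecar
      volumeMounts:
        - name: wordpress-data       # same volume as the main container
          mountPath: /var/www/html
  containers:
    - name: wordpress
      image: docker.io/library/wordpress:latest
      volumeMounts:
        - name: wordpress-data
          mountPath: /var/www/html
  volumes:
    - name: wordpress-data           # hypothetical volume name
      persistentVolumeClaim:
        claimName: wordpress-pvc     # hypothetical claim name
EOF
}
```

With a working cluster context, the output could be applied with: native_sidecar_manifest | kubectl apply -f -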

Bareos client in a Kubernetes pod

Three steps are required to implement the application-aware backup with Bareos: 

  1. Build a container image which runs the Bareos File Daemon and the MySQL client. 
  2. Deploy the sidecar image to Kubernetes. 
  3. Configure the Bareos Director and set up the new client. 

1) Create the sidecar image

A container image is built that runs the Bareos File Daemon and the MySQL client. The image can be customized using environment variables at deployment time. 

The example Dockerfile installs bareos-filedaemon and mysql on an AlmaLinux base image:

FROM docker.io/almalinux:8
RUN curl -fsSL -o /etc/yum.repos.d/bareos.repo https://download.bareos.org/bareos/release/21/EL_8/bareos.repo \
  && dnf install -y bareos-filedaemon mysql \
  && dnf clean all
COPY *.* /
ENTRYPOINT [ "/entrypoint.sh" ]
CMD [ "-f" ]

The entrypoint script reads configuration templates and replaces variables at deployment time. This is necessary because logging into the container and adjusting settings in a text editor is not possible. 

Basically, the Bareos File Daemon needs these four settings: 

  1. Name of the Bareos Director
  2. Password of the Bareos Director
  3. Address of the Bareos Director
  4. Name of the File Daemon (the client)

For the MySQL client, the template sets the database backend’s hostname, username and password.
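The template substitution could look roughly like the sketch below. It is not the entrypoint script from the repository: the @PLACEHOLDER@ syntax, the environment variable names, and the default values are assumptions chosen for illustration. The example prints the rendered File Daemon configuration to standard output; a real entrypoint would write it to the File Daemon's configuration directory and then start the daemon with the arguments passed to the container.

```shell
#!/bin/sh
# Hypothetical entrypoint helper. Placeholder syntax and variable names
# are assumptions, not taken from the original image.

# Escape characters that are special in a sed replacement: \, & and the
# "|" delimiter used below. Values must not contain newlines.
sed_escape() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Read a template on stdin, replace the four placeholders with the values
# of the matching environment variables, and write the result to stdout.
render_template() {
  # Fail with an error message if a required setting is missing or empty.
  : "${DIRECTOR_NAME:?must be set}" "${DIRECTOR_PASSWORD:?must be set}"
  : "${DIRECTOR_ADDRESS:?must be set}" "${CLIENT_NAME:?must be set}"
  sed \
    -e "s|@DIRECTOR_NAME@|$(sed_escape "$DIRECTOR_NAME")|g" \
    -e "s|@DIRECTOR_PASSWORD@|$(sed_escape "$DIRECTOR_PASSWORD")|g" \
    -e "s|@DIRECTOR_ADDRESS@|$(sed_escape "$DIRECTOR_ADDRESS")|g" \
    -e "s|@CLIENT_NAME@|$(sed_escape "$CLIENT_NAME")|g"
}

# Example values; in Kubernetes these would come from the pod's environment.
: "${DIRECTOR_NAME:=bareos-dir}"
: "${DIRECTOR_PASSWORD:=BareosFDPassword}"
: "${DIRECTOR_ADDRESS:=bareos.example.com}"
: "${CLIENT_NAME:=wordpress-fd}"

# Render an inline template for the File Daemon side of the connection.
render_template <<'EOF'
Director {
  Name = @DIRECTOR_NAME@
  Password = "@DIRECTOR_PASSWORD@"
  Address = @DIRECTOR_ADDRESS@
  Connection From Client To Director = yes
}
Client {
  Name = @CLIENT_NAME@
}
EOF
```

Escaping the values before handing them to sed keeps passwords that contain characters such as & or | from corrupting the rendered configuration.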

2) Deploy to Kubernetes

In the GitHub repository, the k8s directory offers Kubernetes kustomization files (YAML) that can be adjusted: kustomization.yaml, wordpress.yaml, mysql.yaml, and ingress.yaml. 

The WordPress and sidecar containers share the same document root (/var/www/html) via volumeMounts, so the backup container can back up the same data. 

3) Configure the Bareos Director

To set up the new File Daemon, you need three configuration files. The first, wordpress-fd.conf, contains the client configuration and enables client-initiated connections.

With this feature, the Bareos Director doesn’t have to know how to reach the pod with the WordPress and sidecar containers. When the pod starts, the File Daemon contacts the Bareos Director itself. 

Example client resource: 

Client {
  Name = wordpress-fd
  Password = "BareosFDPassword"

  Connection From Client To Director = yes
  Connection From Director To Client = no

  Address = localhost
} 

The FileSet contains the document root for WordPress and uses the bpipe plugin to stream a database dump to Bareos for backup.

FileSet {
  Name = "wordpress-set"
  Include {
    Options {
      Signature = MD5
    }
    File = /var/www/html
    Plugin = "bpipe:file=/MYSQL/all.sql:reader=mysqldump --all-databases:writer=mysql"
  }
}

The backup job itself is defined as: 

Job {
  Name = "wordpress"
  JobDefs = "DefaultJob"
  FileSet = "wordpress-set"
  Client = "wordpress-fd"
}

Reload the Bareos configuration and check the schedule (for example, with the bconsole command status scheduler) to confirm the job appears.
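On the Director host, these console steps can be scripted. The helper below is a small sketch that pipes commands into bconsole, one per line. The function name and the BCONSOLE override are hypothetical conveniences, and the sketch assumes bconsole is installed and already configured to reach the Director.

```shell
#!/bin/sh
# Send one or more console commands to the Bareos Director, one per line.
# BCONSOLE is a hypothetical override (useful for dry runs); by default
# the bconsole binary from PATH is used.
run_bconsole() {
  printf '%s\n' "$@" | "${BCONSOLE:-bconsole}"
}

# Example usage on the Director host (left commented out on purpose):
#   run_bconsole "reload" "status scheduler job=wordpress"
#   run_bconsole "run job=wordpress yes"
```

Setting BCONSOLE=cat prints the commands instead of sending them, which is a cheap way to check a script before it touches the Director.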

Conclusion

When it comes to creating file-based backups, containerized apps are not that different from traditional applications running on virtual machines or bare metal. The backup software should be able to handle the application’s state and its data. Following the application-aware approach, this concept backs up Kubernetes applications running in multiple containers and pods: a stateful WordPress application with two persistence backends, the filesystem and the database backend. 

During restore, both the files and the database (its current state) have to be restored. Even if the entire Kubernetes cluster fails, it’s easy to restore the setup with this approach: after redeploying the applications to Kubernetes, Bareos can restore the WordPress files and the MySQL database in only a few minutes.
