
Running etcd on a ramdisk


My current Kubernetes setup is quite simple:

  • Cluster consists of different virtual machines
  • All virtual machines are running on the same host
  • They all share the same physical disk for their root filesystems (though on different partitions)

This doesn’t sound like a production-ready setup, but it’s a cheap and workable solution, and it’s certainly a good learning platform for a variety of subjects:

  • Networking and bridging between multiple virtual machines
  • Firewalls
  • NATs
  • etcd’s write performance

etcd regularly complains about write latency: it syncs its write-ahead log to disk on every write, so it generally needs a fast disk.

Unfortunately, I don’t have too many options when it comes to shared disks, so another idea had to be found.

Fortunately, I remembered once using a ramdisk to improve Firefox’s performance.

Gotchas #

A ramdisk is volatile, so whatever it holds is gone after a reboot. To keep etcd's data complete, two things have to happen in the right order:

  1. Copy the data to RAM before etcd starts
  2. Copy it back to the hard drive after etcd has stopped

Preparation #

tmpfs #

The ramdisk is a tmpfs mount, which is quickly set up with an entry in /etc/fstab:

tmpfs    /mnt/etcd-ramdisk    tmpfs    defaults,size=1G      0       0
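
To apply the entry without a reboot and confirm the mount is really memory-backed, something like the following can help. This is a minimal sketch: fs_type and is_tmpfs are hypothetical helper names, and it assumes GNU stat.

```shell
#!/bin/sh
# Print the filesystem type backing a path (GNU stat prints "tmpfs" for a ramdisk).
fs_type() {
    stat -f -c %T "$1"
}

# Succeed only if the given path sits on a tmpfs.
is_tmpfs() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}

# Usage (as root), after adding the fstab line:
#   mkdir -p /mnt/etcd-ramdisk && mount /mnt/etcd-ramdisk
#   is_tmpfs /mnt/etcd-ramdisk && echo "ramdisk is mounted"
```

Calling mount with only the mount point makes it look up the remaining options in fstab, so the size=1G limit applies here as well.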

etcd configuration #

etcd is configured using the parameter --data-dir=/var/lib/etcd, with the corresponding volumeMount:

volumeMounts:
- mountPath: /var/lib/etcd
  name: etcd-data

So changing the volume to

volumes:
- hostPath:
    path: /mnt/etcd-ramdisk
    type: DirectoryOrCreate
  name: etcd-data

was enough to make etcd read from the ramdisk.

rsync that shit #

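At first I thought a simple

rsync -aAXq /var/lib/etcd /mnt/etcd-ramdisk

would do. However, I quickly found out that SELinux did not seem to like this approach. I found a workable solution and extended the command to

rsync -aAXq --filter="-x security.selinux" /var/lib/etcd/ /mnt/etcd-ramdisk

The filter keeps rsync from copying the security.selinux extended attribute, and the trailing slash on the source makes rsync copy the directory's contents rather than the directory itself.

Copying back to disk is similar, with the addition of --delete, which tells rsync to delete extraneous files from the receiving side:

rsync -aAXq --filter="-x security.selinux" --delete /mnt/etcd-ramdisk/ /var/lib/etcd/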
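
The two copy directions can be wrapped in a small script so they are easy to try by hand. This is only a sketch: the function names, the DISK_DIR/RAM_DIR variables and the DRY_RUN switch are made up for illustration, while the rsync flags and paths are the ones used in this post.

```shell
#!/bin/sh
# Paths from this post; override via the environment when experimenting.
DISK_DIR="${DISK_DIR:-/var/lib/etcd}"
RAM_DIR="${RAM_DIR:-/mnt/etcd-ramdisk}"

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disk -> ramdisk: meant to run before etcd starts.
to_ram() {
    run rsync -aAXq --filter="-x security.selinux" "$DISK_DIR/" "$RAM_DIR"
}

# Ramdisk -> disk: meant to run as a backup and after etcd has stopped.
# --delete removes files on disk that no longer exist on the ramdisk.
to_disk() {
    run rsync -aAXq --filter="-x security.selinux" --delete "$RAM_DIR/" "$DISK_DIR/"
}
```

With DRY_RUN=1 set, calling to_ram or to_disk only prints the rsync command line, which is a cheap way to double-check the direction before letting --delete loose on real data.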

Now that copying to and from the ramdisk works, regular backups to disk can be made. The only thing left to do is to run the copy at the right times: to the ramdisk before etcd starts, and back to disk after it stops.
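
Before trusting the ramdisk, it may be worth checking that the copy actually landed there. The sketch below assumes etcd's usual data directory layout (a member/ directory containing snap/ and wal/); has_etcd_data is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if DIR looks like a populated etcd data directory.
# Assumption: etcd keeps its data under member/snap and member/wal.
has_etcd_data() {
    [ -d "$1/member/snap" ] && [ -d "$1/member/wal" ]
}

# Usage:
#   has_etcd_data /mnt/etcd-ramdisk || echo "ramdisk looks empty" >&2
```

The reason to bother is that a copy back to disk with --delete from an empty ramdisk would mirror that emptiness onto the disk copy.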

systemd that shit #

As etcd is run as a container by crio, systemd’s ordering dependencies can be leveraged. systemd also allows a service to execute different commands depending on its lifecycle:

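[Unit]
Description=etcd sync to ramdisk
# Order this unit before crio, which runs the etcd container
Before=crio.service

[Service]
# Seconds between periodic backups to disk (900 s = 15 minutes)
Environment="SLEEP=900"
# Before starting: copy the on-disk data to the ramdisk
ExecStartPre=rsync -aAXq --filter="-x security.selinux" /var/lib/etcd/ /mnt/etcd-ramdisk
# While running: copy the ramdisk back to disk, then sleep (also after a failed rsync, to avoid a tight retry loop)
ExecStart=/bin/sh -c 'while true; do rsync -aAXq --filter="-x security.selinux" --delete /mnt/etcd-ramdisk/ /var/lib/etcd/; sleep $SLEEP; done'
# On stop: copy the ramdisk back to disk one last time
ExecStop=rsync -aAXq --filter="-x security.selinux" --delete /mnt/etcd-ramdisk/ /var/lib/etcd/

[Install]
WantedBy=default.target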

Before=crio.service tells systemd two things:

  1. Start the etcd-sync.service before crio.service starts
  2. Stop the etcd-sync.service after crio.service ends

Using ExecStartPre, ExecStart and ExecStop, the copy direction follows the state of the service: to the ramdisk before it starts, back to disk periodically while it runs, and back to disk once more when it stops.