Max Uptime: Building Resilient PostgreSQL with CloudNativePG

A data center. Photo by Leif Christoph Gottwald on Unsplash.

Intro

Postgres is one of the most popular databases for OLTP workloads. OLTP performance directly affects end users, so it is important to ensure high availability and fast recovery when your system faces an attack or a serious failure that brings it down.

To achieve these critical objectives, Postgres leverages robust features such as streaming replication for high availability and continuous archiving of Write-Ahead Logs (WAL) for point-in-time recovery. Streaming replication allows changes from a primary database to be continuously streamed to one or more standby servers, ensuring data redundancy and enabling rapid failover in case of a primary instance failure. Meanwhile, WAL archiving, coupled with base backups, provides the foundation for restoring the database to any specific moment, significantly minimizing data loss even in catastrophic scenarios.

You will need these

  • CloudNativePG
  • Kubernetes
  • Barman
  • S3 compatible storage

CloudNativePG

CloudNativePG is a Kubernetes operator designed to automate the management of PostgreSQL clusters directly within a Kubernetes environment. This includes handling critical aspects like high availability through streaming replication, automated failover, and efficient disaster recovery using continuous archiving of Write-Ahead Logs (WAL) to object storage. It empowers users to define the desired state of their PostgreSQL instances declaratively, allowing the operator to continuously reconcile and maintain that state, ultimately simplifying database operations in a cloud-native setup.

Kubernetes

Kubernetes is a container orchestrator: it automates the deployment, scaling, and management of containerized applications. We will use these capabilities to run a robust Postgres database.

Barman

Barman is a Python-based administration tool specifically designed for disaster recovery of PostgreSQL servers; CloudNativePG integrates with it through the Barman Cloud plugin. It enables organizations to remotely back up multiple critical databases, thereby reducing risk and assisting DBAs during recovery operations.

We will utilize Barman to perform backup operations to an S3-compatible bucket. I suggest using one of the following platforms, as Barman does not support every S3-compatible bucket: AWS S3, GCP GCS, Azure Blob Storage, MinIO, DigitalOcean Spaces, etc.

S3 Compatible storage

S3 compatible storage is essentially any storage system that you can interact with and manage using the widely adopted S3 (Amazon Simple Storage Service) protocol.

We will use S3 compatible storage to store backup data.

How to set up

I assume you have already set up your Kubernetes cluster, as this post will focus on PostgreSQL. You can follow this documentation to set up a Kubernetes cluster: K3s Quick-Start Guide. I usually use K3s on my on-premises VMs. You can, however, use another Kubernetes variant like Minikube for local development, or a managed Kubernetes service such as EKS, GKE, etc.

Create the S3 secret

  • Create the S3 secret in the default namespace, since we install CloudNativePG there
kubectl create secret generic s3-credentials \
  --from-literal=access_key_id='<your access_key_id>' \
  --from-literal=secret_access_key='<your secret_access_key>'
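Equivalently, the same secret can be declared as a manifest, which is handy if you manage everything with kubectl apply or GitOps. A sketch; the stringData values are placeholders you must replace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
  namespace: default
type: Opaque
stringData:                     # stringData accepts plain text; Kubernetes base64-encodes it for you
  access_key_id: "<your access_key_id>"
  secret_access_key: "<your secret_access_key>"
```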

Create and apply these manifests

You can apply the manifest by using this command kubectl apply -f <manifest.yaml>.

  • postgresql-cluster.yaml. This manifest creates the Postgres cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-database
  namespace: default
spec:
  instances: 1
  storage:
    size: 5Gi
  postgresql:
    parameters:
      max_connections: "100"
  imageName: ghcr.io/cloudnative-pg/postgresql:16.2
  resources:
    requests:
      cpu: "150m"
      memory: "384Mi"
    limits:
      cpu: "400m"
      memory: "768Mi"
  env:
  - name: AWS_REQUEST_CHECKSUM_CALCULATION
    value: "when_required"
  - name: AWS_RESPONSE_CHECKSUM_VALIDATION
    value: "when_required"
  plugins:
  - name: barman-cloud.cloudnative-pg.io
    isWALArchiver: true
    parameters:
      barmanObjectName: postgres-database-backup-store
  • Get the Postgres credentials
  • List the Kubernetes secrets with kubectl get secret and look for the one of type kubernetes.io/basic-auth. Usually the secret name is postgres-database-app. Our applications will use this secret as the Postgres credential.
  • Get the database username
kubectl get secret postgres-database-app -o jsonpath='{.data.username}' | base64 --decode
  • Get the database password
kubectl get secret postgres-database-app -o jsonpath='{.data.password}' | base64 --decode
  • Get the database host
kubectl get secret postgres-database-app -o jsonpath='{.data.host}' | base64 --decode
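Kubernetes stores Secret values base64-encoded, which is why each command above pipes the jsonpath output through base64 --decode. A pure-shell illustration of the round trip, using a sample value rather than a real credential:

```shell
# Encode a sample value the way Kubernetes stores it, then decode it back.
encoded=$(printf 'app_user' | base64)
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"   # prints the original value: app_user
```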

Usually, you will get postgres-database-rw.

If your apps run in another Kubernetes namespace, use the fully qualified domain name (FQDN):
postgres-database-rw.default.svc.cluster.local
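With the three values decoded, you can assemble a standard libpq connection URL for your applications. A minimal sketch with placeholder values; the database name app is an assumption based on CloudNativePG's default bootstrap database, so substitute your own:

```shell
# Placeholder values; in practice these come from the decoded secret above.
PG_USER="app"
PG_PASSWORD="s3cret"
PG_HOST="postgres-database-rw.default.svc.cluster.local"

# Standard PostgreSQL connection URL; 5432 is the default Postgres port.
DATABASE_URL="postgresql://${PG_USER}:${PG_PASSWORD}@${PG_HOST}:5432/app"
echo "$DATABASE_URL"
```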

It is better to create the credentials again in the namespace where your applications run. Replace my-app-namespace below with your own namespace (mine is blog).

kubectl create secret generic postgres-credentials \
  --from-literal=PG_USER="${PG_USERNAME}" \
  --from-literal=PG_PASSWORD="${PG_PASSWORD}" \
  --from-literal=PG_HOST="${PG_HOST}" \
  --namespace=my-app-namespace
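Your application can then consume that secret as environment variables. A sketch of the relevant container spec fragment; the container name and image are hypothetical:

```yaml
# Fragment of a Deployment's pod template in the application namespace
containers:
- name: my-app            # hypothetical container name
  image: my-app:latest    # hypothetical image
  envFrom:
  - secretRef:
      name: postgres-credentials   # exposes PG_USER, PG_PASSWORD, PG_HOST as env vars
```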
  • Create objectstore.yaml. This manifest defines the object store used for backup and restore.
apiVersion: barmancloud.cnpg.io/v1
kind: ObjectStore
metadata:
  name: postgres-database-backup-store
  namespace: default
spec:
  configuration:
    destinationPath: "s3://blogbackup/"
    endpointURL: "https://storage.googleapis.com"
    s3Credentials:
      accessKeyId:
        name: s3-credentials
        key: access_key_id
      secretAccessKey:
        name: s3-credentials
        key: secret_access_key
    wal:
      compression: gzip
      maxParallel: 2
    data:
      compression: gzip
      jobs: 2
  retentionPolicy: "6m"
  • Create on-demand-backup.yaml to trigger an on-demand backup.
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: postgres-database-on-demand-backup
  namespace: default
spec:
  cluster:
    name: postgres-database
  method: plugin
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
  • Create schedule-backup.yaml to set up a scheduled backup
# schedule-backup.yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgres-database-weekly-backup
  namespace: default
spec:
  cluster:
    name: postgres-database
  schedule: '0 0 17 * * 0'  # six fields, seconds first: 17:00 every Sunday
  backupOwnerReference: self
  method: plugin
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
  • Create postgresql-restore.yaml as well, to test the restore.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-database-restored
  namespace: default
spec:
  instances: 1
  imagePullPolicy: IfNotPresent
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3
  bootstrap:
    recovery:
      source: source-cluster
  plugins:
  - name: barman-cloud.cloudnative-pg.io
    isWALArchiver: true
    parameters:
      barmanObjectName: postgres-database-backup-store
  externalClusters:
  - name: source-cluster
    plugin:
      name: barman-cloud.cloudnative-pg.io
      parameters:
        barmanObjectName: postgres-database-backup-store
        serverName: postgres-database
  storage:
    size: 5Gi
    storageClass: local-path
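The manifest above restores to the latest archived WAL. If you need point-in-time recovery instead, CloudNativePG lets you add a recoveryTarget to the recovery stanza. A sketch; the timestamp is an example and must fall within the window your base backups and WAL archive actually cover:

```yaml
# Replace the bootstrap section of postgresql-restore.yaml with:
bootstrap:
  recovery:
    source: source-cluster
    recoveryTarget:
      targetTime: "2025-06-01 12:00:00"   # example timestamp; recovery stops here
```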

Conclusion

In this guide, we've explored a robust strategy for ensuring the high availability and fast recovery of PostgreSQL databases, a critical concern for any OLTP workload. By leveraging the power of Kubernetes with a lightweight distribution like K3s, and combining it with the specialized capabilities of CloudNativePG and Barman Cloud, we've demonstrated how to build a resilient and easily manageable PostgreSQL infrastructure.

About the Author

Data & Platform Engineer with 3+ years of experience in cloud, DevOps, and data solutions. I love turning complex data into clear, actionable insights and am passionate about building efficient, scalable systems, especially with Kubernetes (which even runs this blog!).

Let's connect on LinkedIn