Hello Zsolt,

On Wed, Jan 4, 2023 at 10:00 PM Zsolt Kozak <koza...@gmail.com> wrote:

> Hello,
>
> I have some problems backing up Kubernetes PVCs with the Bacula Kubernetes
> Plugin.
>
> I am using the latest 13.0.1 Bacula from the community builds on Debian
> Bullseye hosts.
>
> Backing up the Kubernetes objects other than Persistent Volume Claims
> (PVCs) works like a charm. I've installed the Kubernetes plugin and the
> latest Bacula File Daemon on the master node (control plane) of our
> Kubernetes cluster. Bacula can access the Kubernetes cluster and back up
> every single object as YAML files.
>
> The interesting part comes with trying to backup a PVC...
>
> First of all, I was able to build my own Bacula Backup Proxy Pod Image from
> the source, and it is deployed in our local Docker image repository (repo).
> The Bacula File Daemon is configured properly, I guess. The backup process
> started and the following things happened.
>

You mentioned you could run a Kubernetes backup of all resources
successfully, so the Bacula File Daemon should be OK.


> 1. Bacula File Daemon deployed Bacula Backup Proxy Pod Image into the
> Kubernetes cluster, so Bacula-backup container pod started.
>
> 2. I got into the pod and I could see the Baculatar application started and
> running.
>
> 3. The k8s_backend application started on the Bacula File Daemon host
> (kubernetes.server) in 2 instances.
> 4. From the Bacula-backup pod I could check that Baculatar could connect
> to the k8s_backend at the default 9104 port (kubernetes.server:9104).
>
All fine so far!


> 5. Checking the job's console messages with Bat, I saw that the Bacula
> File Daemon started to process the configured PVC and started to write a
> pvc.tar, but then nothing happened.
> 6. After the default 600-second timeout, the job was cancelled.
>

Ok, so we have a problem.

> 7. It may be important that the Bacula File Daemon could not delete the
> Bacula-backup pod. (It could create it but could not delete it.)
>

This is a design decision: if there is a failure with the bacula-backup
pod, Bacula does not remove it, and the Kubernetes admin has to remove it
manually.
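If the proxy pod is left behind, it has to be removed by hand before the
next run. A minimal sketch, assuming the pod keeps the default
bacula-backup name and runs in the "namespace" namespace shown in your
logs (adjust both to your environment):

```shell
# Check whether a leftover proxy pod is still present
kubectl get pods -n namespace | grep bacula-backup

# Remove it so the next backup job can create a fresh one
kubectl delete pod bacula-backup -n namespace
```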


> Could you please tell me what's wrong?
>
>
> Here are some log parts. (I've changed some sensitive data.)
>
>
> Bacula File Daemon configuration:
>
> FileSet {
>     Name = "Kubernetes Set"
>     Include {
>         Options {
>             signature = SHA512
>             compression = GZIP
>             Verify = pins3
>         }
>         Plugin = "kubernetes: \
>             debug=1 \
>             baculaimage=repo/bacula-backup:04jan23 \
>             namespace=namespace \
>             pvcdata \
>             pluginhost=kubernetes.server \
>             timeout=120 \
>             verify_ssl=0 \
>             fdcertfile=/etc/bacula/certs/bacula-backup.cert \
>             fdkeyfile=/etc/bacula/certs/bacula-backup.key"
>     }
> }
>
>
>
> Bacula File Daemon debug log (parts):
>
>
> DEBUG:[baculak8s/jobs/estimation_job.py:134 in processing_loop] processing
> get_annotated_namespaced_pods_data:namespace:nrfound:0
> DEBUG:[baculak8s/plugins/kubernetes_plugin.py:319 in
> list_pvcdata_for_namespace] list pvcdata for namespace:namespace
> pvcfilter=True estimate=False
> DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:108 in
> pvcdata_list_namespaced] pvcfilter: True
> DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:112 in
> pvcdata_list_namespaced] found:some-claim
> DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:127 in
> pvcdata_list_namespaced] add pvc: {'name': 'some-claim', 'node_name': None,
> 'storage_class_name': 'nfs-client', 'capacity': '2Gi', 'fi':
> <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>}
> DEBUG:[baculak8s/jobs/estimation_job.py:165 in processing_loop] processing
> list_pvcdata_for_namespace:namespace:nrfound:1
> DEBUG:[baculak8s/jobs/estimation_job.py:172 in processing_loop]
> PVCDATA:some-claim:{'name': 'some-claim', 'node_name': 'node1',
> 'storage_class_name': 'nfs-client', 'capacity': '2Gi', 'fi':
> <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>}
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> I000041
> Start backup volume claim: some-claim
>
> DEBUG:[baculak8s/jobs/job_pod_bacula.py:298 in prepare_bacula_pod]
> prepare_bacula_pod:token=xx88M5oggQJ....4YDbSwBRxTOhT namespace=namespace
> DEBUG:[baculak8s/jobs/job_pod_bacula.py:136 in prepare_pod_yaml] pvcdata:
> {'name': 'some-claim', 'node_name': 'node1', 'storage_class_name':
> 'nfs-client', 'capacity': '2Gi', 'fi':
> <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>}
> DEBUG:[baculak8s/plugins/k8sbackend/baculabackup.py:102 in
> prepare_backup_pod_yaml] host:kubernetes.server port:9104
> namespace:namespace image:repo/bacula-backup:04jan23
> job:KubernetesBackup.2023-01-04_21.05.03_10:410706
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> I000149
> Prepare Bacula Pod on: node1 with: repo/bacula-backup:04jan23
> <IfNotPresent> kubernetes.server:9104
>
> DEBUG:[baculak8s/jobs/job_pod_bacula.py:198 in prepare_connection_server]
> prepare_connection_server:New ConnectionServer: 0.0.0.0:9104
> DEBUG:[baculak8s/util/sslserver.py:180 in listen]
> ConnectionServer:Listening...
> DEBUG:[baculak8s/jobs/job_pod_bacula.py:307 in prepare_bacula_pod]
> prepare_bacula_pod:start pod
> INFO:[baculak8s/plugins/kubernetes_plugin.py:771 in backup_pod_isready]
> backup_pod_status:isReady: False / 0
> INFO:[baculak8s/plugins/kubernetes_plugin.py:771 in backup_pod_isready]
> backup_pod_status:isReady: True / 1
> DEBUG:[baculak8s/jobs/estimation_job.py:183 in _estimate_file] {'name':
> 'some-claim', 'node_name': 'node1', 'storage_class_name': 'nfs-client',
> 'capacity': '2Gi', 'fi': <baculak8s.entities.file_info.FileInfo object at
> 0x7ffaa55bfcc0>}
> DEBUG:[baculak8s/jobs/estimation_job.py:190 in _estimate_file] file_info:
> {FileInfo
> name:/@kubernetes/namespaces/namespace/persistentvolumeclaims/some-claim.tar
> namespace:None type:F objtype:pvcdata cached:False}
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> C000079
>
> FNAME:/@kubernetes/namespaces/namespace/persistentvolumeclaims/some-claim.tar
>
>
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> C000040
> TSTAMP:1672861077 1672861077 1672861077
>
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> C000031
> STAT:F 2147483648 0 0 100640 1
>
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> F000000
> (EOD PACKET)
>
> DEBUG:[baculak8s/jobs/backup_job.py:77 in __backup_pvcdata]
> backup_pvcdata:data recv
> DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet
> C000005
> DATA
>
> DEBUG:[baculak8s/util/sslserver.py:193 in handle_connection]
> ConnectionServer:Connection from: ('192.168.XX.YY', 10541)
> DEBUG:[baculak8s/util/sslserver.py:145 in gethello] ['Hello',
> 'KubernetesBackup.2023-01-04_21.05.03_10', '410706']
> DEBUG:[baculak8s/util/token.py:57 in check_auth_data] AUTH_DATA:Token:
> xx88M5oggQJuGsPbtD........ohQjeU7PkA4YDbSwBRxTOhT
> DEBUG:[baculak8s/util/token.py:59 in check_auth_data]
> RECV_TOKEN_DATA:Token: xx88M5oggQJuGsPbtD....ohQjeU7PkA4YDbSwBRxTOhT
> DEBUG:[baculak8s/util/sslserver.py:105 in authenticate]
> ConnectionServer:Authenticated
>
> .... after timeout
>
> DEBUG:[baculak8s/jobs/job_pod_bacula.py:121 in handle_pod_data_recv]
> handle_pod_data_recv:EOT
> DEBUG:[baculak8s/util/sslserver.py:201 in handle_connection]
> ConnectionServer:Finish - disconnect.
> DEBUG:[baculak8s/jobs/backup_job.py:85 in __backup_pvcdata]
> backup_pvcdata:logs recv
>
>
>
>
> Job messages:
>
> bacula-dir No prior or suitable Full backup found in catalog for the
> current FileSet. Doing FULL backup.
>  The FileSet "Kubernetes Set" was modified on 2023-01-04 20:20:41, this is
> after the last successful backup on 2023-01-04 19:19:49.
> bacula-sd Ready to append to end of Volume "Full-XXX" size=3,838,161,002
> bacula-fd Connected to Storage at bacula.server:9103 with TLS
> bacula-sd Volume "Full-XXXX" previously written, moving to end of data.
> bacula-dir Connected to Client "bacula-fd" at kubernetes.server:9102 with
> TLS
>  Using Device "FileStorageEeyoreFull" to write.
>  Connected to Storage "InternalStorageFull" at bacula.server:9103 with TLS
>  Start Backup JobId 410830, Job=KubernetesBackup.2023-01-04_21.05.03_10
> bacula-fd kubernetes: Prepare Bacula Pod on: node with:
> repo/bacula-backup:04jan23  kubernetes.server:9104
>  kubernetes: Processing namespace: namespace
>  kubernetes: Start backup volume claim: some-claim
>  kubernetes: Connected to Kubernetes 1.25 - v1.25.4.
> bacula-dir
> Error: Bacula Enterprise bacula-dir 13.0.1 (05Aug22):
>   Build OS:               x86_64-pc-linux-gnu-bacula-enterprise debian 11.2
>   JobId:                  410830
>   Job:                    KubernetesBackup.2023-01-04_21.05.03_10
>   Backup Level:           Full (upgraded from Differential)
>   Client:                 "bacula-fd" 13.0.1 (05Aug22)
> x86_64-pc-linux-gnu-bacula-enterprise,debian,10.11
>   FileSet:                "Kubernetes Set" 2023-01-04 20:20:41
>   Pool:                   "Full-Pool" (From Job FullPool override)
>   Catalog:                "MyCatalog" (From Client resource)
>   Storage:                "InternalStorageFull" (From Pool resource)
>   Scheduled time:         04-Jan-2023 21:05:03
>   Start time:             04-Jan-2023 21:27:04
>   End time:               04-Jan-2023 21:29:06
>   Elapsed time:           2 mins 2 secs
>   Priority:               10
>   FD Files Written:       23
>   SD Files Written:       0
>   FD Bytes Written:       52,784 (52.78 KB)
>   SD Bytes Written:       0 (0 B)
>   Rate:                   0.4 KB/s
>   Software Compression:   100.0% 1.0:1
>   Comm Line Compression:  5.6% 1.1:1
>   Snapshot/VSS:           no
>   Encryption:             yes
>   Accurate:               yes
>   Volume name(s):         Full-XXXX
>   Volume Session Id:      43
>   Volume Session Time:    1672853724
>   Last Volume Bytes:      3,838,244,105 (3.838 GB)
>   Non-fatal FD errors:    3
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  SD despooling Attributes
>   Termination:            *** Backup Error ***
>  Fatal error: catreq.c:680 Restore object create error.
> bacula-fd
> Error: kubernetes: PTCOMM cannot get packet header from backend.
> bacula-dir Fatal error: sql_create.c:1273 Create db Object record INSERT
> INTO RestoreObject
> (ObjectName,PluginName,RestoreObject,ObjectLength,ObjectFullLength,ObjectIndex,ObjectType,ObjectCompression,FileIndex,JobId)
>
> VALUES ('RestoreOptions','kubernetes: \n            debug=1 \n
>  baculaimage=repo/bacula-backup:04jan23 \n            namespace=namespace
> \n            pvcdata \n
> pluginhost=kubernetes.server \n            timeout=120 \n
>  verify_ssl=0 \n            fdcertfile=/etc/bacula/certs/bacula-backup.cert
> \n
> fdkeyfile=/etc/bacula/certs/bacula-backup.key','# Plugin configuration
> file\n# Version 1\nOptPrompt=\"K8S config
> file\"\nOptDefault=\"*None*\"\nconfig=@STR@\n\n
> OptPrompt=\"K8S API server 
> URL/Host\"\nOptDefault=\"*None*\"\nhost=@STR@\n\nOptPrompt=\"K8S
> Bearertoken\"\nOptDefault=\"*None*\"\ntoken=@STR@\n\nOptPrompt=\"K8S API
> server cert verification\"\n
> OptDefault=\"True\"\nverify_ssl=@BOOL@\n\nOptPrompt=\"Custom CA Certs
> file to use\"\nOptDefault=\"*None*\"\nssl_ca_cert=@STR@\n\nOptPrompt=\"Output
> format when saving to file (JSON, YAML)\"\n
> OptDefault=\"RAW\"\noutputformat=@STR@\n\nOptPrompt=\"The address for
> listen to incoming backup pod
> data\"\nOptDefault=\"*FDAddress*\"\nfdaddress=@STR@\n\n
> OptPrompt=\"The port for opening socket for
> listen\"\nOptDefault=\"9104\"\nfdport=@INT32@\n\nOptPrompt=\"The endpoint
> address for backup pod to connect\"\n
> OptDefault=\"*FDAddress*\"\npluginhost=@STR@\n\nOptPrompt=\"The endpoint
> port to connect\"\nOptDefault=\"9104\"\n
> pluginport=@INT32@\n\n',859,859,0,27,0,1,410830) failed. ERR=Data too
> long for column 'PluginName' at row 1
>

It seems to be an issue related to the Catalog update. Are you using Bacula
with a MySQL or a PostgreSQL Catalog?
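For context on why the INSERT fails: the whole multi-line plugin command
from the FileSet is what gets stored in the PluginName column, and it is
several hundred bytes long. A quick length check (string reconstructed
approximately, whitespace included, from the failing INSERT in the job log;
the exact column type is an assumption until we see the schema):

```python
# Plugin command string as stored in RestoreObject.PluginName, reconstructed
# (whitespace approximated) from the failing INSERT in the job log.
plugin_name = (
    "kubernetes: \n            debug=1 \n"
    "            baculaimage=repo/bacula-backup:04jan23 \n"
    "            namespace=namespace \n            pvcdata \n"
    "            pluginhost=kubernetes.server \n            timeout=120 \n"
    "            verify_ssl=0 \n"
    "            fdcertfile=/etc/bacula/certs/bacula-backup.cert \n"
    "            fdkeyfile=/etc/bacula/certs/bacula-backup.key"
)

# Well over 255 bytes, so a 255-byte column (e.g. TINYBLOB) would reject it
# with "Data too long for column 'PluginName'".
print(len(plugin_name))
```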

Can you check if you have a "RestoreObject" table in the bacula database,
and share the table information? For a MySQL database, you can use the
following query:

SELECT * FROM `INFORMATION_SCHEMA`.`TABLES` WHERE TABLE_NAME LIKE 'RestoreObject';


For PostgreSQL, "\dt+ RestoreObject" should work.
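To zero in on the failing column itself, a standard INFORMATION_SCHEMA
query (table and column names taken straight from the error message) shows
its declared type and maximum length on MySQL:

```sql
-- Show the declared type and maximum length of RestoreObject.PluginName
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'RestoreObject'
  AND COLUMN_NAME = 'PluginName';
```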

It seems to me we would need a bug report for this issue.

Best regards,
Ana

> bacula-sd Sending spooled attrs to the Director. Despooling 8,214 bytes ...
> bacula-fd
> Error: kubernetes: Error closing backend. Err=Child exited with code 1
>  Fatal error: kubernetes: Wrong backend response to JobEnd command.
> bacula-sd Elapsed time=00:02:02, Transfer rate=659  Bytes/second
> bacula-fd
> Error: kubernetes: PTCOMM cannot get packet header from backend.
>
> Error: kubernetes: Cannot successfully start bacula-backup pod in expected
> time!
>
> Error: kubernetes: Job already running in 'namespace' namespace. Check
> logs or delete bacula-backup Pod manually.
>
>
> Best regards,
> Zsolt
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
