One of our Virtual Full jobs has been failing every day during Always 
Incremental consolidation, and I'm having trouble figuring out why.

*list joblog jobid=18065
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
 2024-07-23 09:01:22 bareos-dir-prod JobId 18065: Start Virtual Backup 
JobId 18065, Job=node0057-AI.2024-07-23_09.00.04_41
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Bootstrap records written 
to /var/lib/bareos/bareos-dir-prod.restore.45.bsr
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Consolidating JobIds 
17646,13356,13422,13488,13556,13625,13694 containing 2025684 files
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Connected Storage daemon 
at bareos-sd-t1-prod.foo.bar.edu:9103, encryption: 
TLS_CHACHA20_POLY1305_SHA256 TLSv1.3
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065:  Encryption: 
TLS_CHACHA20_POLY1305_SHA256 TLSv1.3
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Using Device 
"FileStorage5" to read.
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Max configured use 
duration=72,000 sec. exceeded. Marking Volume 
"node0057-AI-Consolidated-12250" as Used.
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Created new Volume 
"node0057-AI-Consolidated-12276" in catalog.
 2024-07-23 09:01:41 bareos-dir-prod JobId 18065: Using Device 
"FileStorageConsolidated5" to write.
 2024-07-23 09:01:41 bareos-sd-t1-prod JobId 18065: Labeled new Volume 
"node0057-AI-Consolidated-12276" on device "FileStorageConsolidated5" 
(/var/lib/bareos/storage).
 2024-07-23 09:01:41 bareos-sd-t1-prod JobId 18065: Wrote label to 
prelabeled Volume "node0057-AI-Consolidated-12276" on device 
"FileStorageConsolidated5" (/var/lib/bareos/storage)
 2024-07-23 09:01:41 bareos-sd-t1-prod JobId 18065: Ready to read from 
volume "node0057-AI-Consolidated-6560" on device "FileStorage5" 
(/var/lib/bareos/storage).
 2024-07-23 09:01:41 bareos-sd-t1-prod JobId 18065: Forward spacing Volume 
"node0057-AI-Consolidated-6560" to file:block 0:274.
 2024-07-23 09:02:15 bareos-dir-prod JobId 18065: Insert of attributes 
batch table with 800001 entries start
 2024-07-23 09:02:49 bareos-dir-prod JobId 18065: Insert of attributes 
batch table done
 2024-07-23 09:02:49 bareos-dir-prod JobId 18065: Fatal error: Director's 
comm line to SD dropped.
 2024-07-23 09:02:49 bareos-dir-prod JobId 18065: Insert of attributes 
batch table with 3303 entries start
 2024-07-23 09:02:49 bareos-dir-prod JobId 18065: Insert of attributes 
batch table done
 2024-07-23 09:02:49 bareos-dir-prod JobId 18065: Replicating deleted files 
from jobids 17646,13356,13422,13488,13556,13625,13694 to jobid 18065
 2024-07-23 09:03:03 bareos-dir-prod JobId 18065: Error: Bareos 
bareos-dir-prod 22.1.4 (28Feb24):
  Build OS:               Red Hat Enterprise Linux release 9.1 (Plow)
  JobId:                  18065
  Job:                    node0057-AI.2024-07-23_09.00.04_41
  Backup Level:           Virtual Full
  Client:                 "node0057.foo.bar.edu-fd" 22.1.5 (04Jun24) Red 
Hat Enterprise Linux Server release 7.9 (Maipo),redhat
  FileSet:                "LinuxAll" 2023-10-22 13:39:16
  Pool:                   "node0057-AI-Consolidated" (From Job Pool's 
NextPool resource)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "bareos-sd-t1-prod-Consolidated" (From Storage 
from Pool's NextPool resource)
  Scheduled time:         23-Jul-2024 09:00:04
  Start time:             21-May-2024 02:00:01
  End time:               21-May-2024 02:30:31
  Elapsed time:           30 mins 30 secs
  Priority:               10
  SD Files Written:       0
  SD Bytes Written:       110,846,184 (110.8 MB)
  Rate:                   60.6 KB/s
  Volume name(s):         node0057-AI-Consolidated-12276
  Volume Session Id:      19
  Volume Session Time:    1721645047
  Last Volume Bytes:      275 (275 B)
  SD Errors:              0
  SD termination status:  Error
  Accurate:               yes
  Bareos binary info:     Bareos subscription release
  Job triggered by:       User
  Termination:            *** Backup Error ***

*

The Director and SD run on separate hosts, and the failure seems to occur 
only with jobs from this client, node0057. I enabled debug tracing on the SD 
but haven't seen anything that makes sense to me.

...
bareos-sd-t1-prod (200): stored/mac.cc:195-18065 before write JobId=18065 
FI=804499 SessId=19 Strm=1998 len=65
bareos-sd-t1-prod (200): stored/mac.cc:195-18065 before write JobId=18065 
FI=804499 SessId=19 Strm=MD5 len=16
bareos-sd-t1-prod (100): stored/mac.cc:655-18065 ok=0
bareos-sd-t1-prod (130): stored/label.cc:627-18065 session_label 
record=fc052df8
bareos-sd-t1-prod (150): stored/label.cc:652-18065 Write sesson_label 
record JobId=18065 FI=EOS_LABEL SessId=19 Strm=18065 len=234 remainder=0
bareos-sd-t1-prod (150): stored/label.cc:660-18065 Leave WriteSessionLabel 
Block=390293886d File=0d
bareos-sd-t1-prod (100): stored/block.cc:567-18065 return WriteBlockToDev, 
job is canceled
bareos-sd-t1-prod (100): stored/mac.cc:684-18065 Set ok=FALSE after 
WriteBlockToDevice.
bareos-sd-t1-prod (200): stored/mac.cc:687-18065 Flush block to device pos 
0:390293886
bareos-sd-t1-prod (100): stored/acquire.cc:538-18065 releasing device 
"FileStorageConsolidated5" (/var/lib/bareos/storage)
bareos-sd-t1-prod (100): stored/acquire.cc:560-18065 There are 0 writers in 
ReleaseDevice
bareos-sd-t1-prod (50): stored/askdir.cc:366-18065 >dird UpdCat 
Job=node0057-AI.2024-07-23_09.00.04_41 FileAttributes bareos-sd-t1-prod 
(50): stored/askdir.cc:369-18065 create_jobmedia error BnetRecv
bareos-sd-t1-prod (200): stored/mac.cc:229-18062 bareos-sd-t1-prod (200): 
stored/acquire.cc:568-18065 ===== Wrote block new pos 2:4028935146
bareos-sd-t1-prod (50): stored/askdir.cc:298-18065 Update cat 
VolBytes=390293887
bareos-sd-t1-prod (50): stored/askdir.cc:317-18065 >dird bareos-sd-t1-prod 
(200): stored/acquire.cc:587-18065 dir_update_vol_info. Release 
vol=node0057-AI-Consolidated-12276 dev="FileStorageConsolidated5" 
(/var/lib/bareos/storage)
bareos-sd-t1-prod (150): stored/vol_mgr.cc:695-18065 === clear in_use 
vol=node0057-AI-Consolidated-12276
bareos-sd-t1-prod (150): stored/vol_mgr.cc:712-18065 === set not reserved 
vol=node0057-AI-Consolidated-12276 num_writers=0 dev_reserved=0 
dev="FileStorageConsolidated5" (/var/lib/bareos/storage)
bareos-sd-t1-prod (150): stored/vol_mgr.cc:740-18065 === clear in_use 
vol=node0057-AI-Consolidated-12276
bareos-sd-t1-prod (150): stored/vol_mgr.cc:751-18065 === remove volume 
node0057-AI-Consolidated-12276 dev="FileStorageConsolidated5" 
(/var/lib/bareos/storage)
...
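
The trace mixes in output from other running jobs (note the stray 18062 entry above), so it can help to filter it down to a single job. Below is a minimal Python sketch, assuming each trace entry has the shape seen in the excerpt (`daemon (level): source:line-jobid message`). The regular expression and the `lines_for_job` helper are illustrative names of my own, not anything Bareos ships.

```python
import re

# Assumed entry shape, inferred from the excerpt above (not an official format):
#   bareos-sd-t1-prod (100): stored/mac.cc:655-18065 ok=0
TRACE_RE = re.compile(
    r"^(?P<daemon>\S+) \((?P<level>\d+)\): "
    r"(?P<src>\S+?):(?P<line>\d+)-(?P<jobid>\d+) (?P<msg>.*)$"
)


def lines_for_job(lines, jobid):
    """Yield parsed trace entries whose source tag ends in -<jobid>.

    Entries that do not match the assumed shape are skipped.
    """
    for raw in lines:
        match = TRACE_RE.match(raw.rstrip("\n"))
        if match and int(match.group("jobid")) == jobid:
            yield match.groupdict()
```

Only the first tag on a physical line is inspected, so where two threads wrote to the same line (as with the 18062/18065 entry above) the second entry is attributed to the first job and would need to be checked by hand.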

I can provide client/job/pool/storage configurations if they seem relevant 
here, and am continuing to poke at this myself. Thanks for any thoughts on 
troubleshooting.

Josh
