Hello Azur It looks that problem was an error on the connection:
*11-jan 05:42 server00-dir JobId 56875: Error: Director's connection to SD for this Job was lost. * You could try enable "Heartbeat Interval in your config files: https://github.com/wanderleihuttel/bacula-utils/blob/master/dicas/heartbeat_interval.md Best regards *Wanderlei Hüttel* http://www.bacula.com.br Em sex, 11 de jan de 2019 às 06:08, <azu...@pobox.sk> escreveu: > Hi all, > > i'm having problems with virtual full backup on one of our clients. > Everything was working fine for years but, suddenly, Bacula is unable > to complete the job - i was trying to run it about 10 times during > past few days, all of them ends with the same weird error. > > The error is occuring near of the end of the job, after 240 volumes > were created (we have a limit of 5 GB for volume size) and 1,2 TB of > data was copied. > > Here are last lines from job report: > > 11-jan 05:40 server08-sd JobId 56875: Ready to read from volume > "cloud0015-16052" on File device "cloud0015-file-storage" > (/backup/cloud0015). > 11-jan 05:40 server08-sd JobId 56875: Forward spacing Volume > "cloud0015-16052" to addr=226 > 11-jan 05:40 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17313" as Used. > 11-jan 05:40 server08-sd JobId 56875: End of medium on Volume > "cloud0015-17313" Bytes=5,368,688,735 Blocks=83,220 at 11-Jan-2019 > 05:40. > 11-jan 05:40 server00-dir JobId 56875: Created new > Volume="cloud0015-17314", Pool="cloud0015-pool", > MediaType="File-cloud0015" in catalog. > 11-jan 05:40 server08-sd JobId 56875: Labeled new Volume > "cloud0015-17314" on File device "cloud0015-file-storage-2" > (/backup/cloud0015). > 11-jan 05:40 server08-sd JobId 56875: Wrote label to prelabeled Volume > "cloud0015-17314" on File device "cloud0015-file-storage-2" > (/backup/cloud0015) > 11-jan 05:40 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:40 server08-sd JobId 56875: New volume "cloud0015-17314" > mounted on device "cloud0015-file-storage-2" (/backup/cloud0015) at > 11-Jan-2019 05:40. > 11-jan 05:40 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:40 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:41 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:41 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:41 server08-sd JobId 56875: End of Volume "cloud0015-16052" > at addr=5368688648 on device "cloud0015-file-storage" > (/backup/cloud0015). > 11-jan 05:41 server08-sd JobId 56875: Ready to read from volume > "cloud0015-16053" on File device "cloud0015-file-storage" > (/backup/cloud0015). > 11-jan 05:41 server08-sd JobId 56875: Forward spacing Volume > "cloud0015-16053" to addr=226 > 11-jan 05:41 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17314" as Used. > 11-jan 05:41 server08-sd JobId 56875: End of medium on Volume > "cloud0015-17314" Bytes=5,368,688,678 Blocks=83,220 at 11-Jan-2019 > 05:41. > 11-jan 05:41 server00-dir JobId 56875: Created new > Volume="cloud0015-17315", Pool="cloud0015-pool", > MediaType="File-cloud0015" in catalog. > 11-jan 05:41 server08-sd JobId 56875: Labeled new Volume > "cloud0015-17315" on File device "cloud0015-file-storage-2" > (/backup/cloud0015). > 11-jan 05:41 server08-sd JobId 56875: Wrote label to prelabeled Volume > "cloud0015-17315" on File device "cloud0015-file-storage-2" > (/backup/cloud0015) > 11-jan 05:41 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:41 server08-sd JobId 56875: New volume "cloud0015-17315" > mounted on device "cloud0015-file-storage-2" (/backup/cloud0015) at > 11-Jan-2019 05:41. > 11-jan 05:41 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:42 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:42 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:42 server00-dir JobId 56875: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:42 server00-dir JobId 56875: Error: Director's connection to > SD for this Job was lost. > > > > > > This is what is logged to syslog on SD side: > > Jan 11 05:42:43 server08 bacula-sd[13585]: server08-sd: > vbackup.c:130-56875 Fatal append error on device > "cloud0015-file-storage-2" (/backup/cloud0015): ERR=file_dev.c:190 > Could not > open(/backup/cloud0015/cloud0015-17315,OPEN_READ_WRITE,0640): ERR=No > such file or directory > > > > > > > > > There's no reason why the volume wasn't created. Permissions are ok: > > # ls -ld /backup/cloud0015/ > drwxr-x--- 2 bacula tape 36864 Jan 11 05:41 /backup/cloud0015/ > > > Free disk space is sufficient: > > # df -h /backup/ > Filesystem Size Used Avail Use% Mounted on > /dev/sdc1 7.3T 5.1T 2.2T 70% /backup > > > > > > > > Another strange thing is that while virtual full was running, new > incremental backup was queue for that client. After full failed, > incremetnal was run AND completed fine while using THAT ONE ERRONEOUS > VOLUME: > > 11-jan 05:42 server00-dir JobId 56897: Start Backup JobId 56897, > Job=cloud0015.2019-01-11_01.00.17_25 > 11-jan 05:42 server00-dir JobId 56897: There are no more Jobs > associated with Volume "cloud0015-17315". Marking it purged. > 11-jan 05:42 server00-dir JobId 56897: All records pruned from Volume > "cloud0015-17315"; marking it "Purged" > 11-jan 05:42 server00-dir JobId 56897: Recycled volume "cloud0015-17315" > 11-jan 05:42 server00-dir JobId 56897: Using Device > "cloud0015-file-storage" to write. > 11-jan 05:42 cloud0015-fd JobId 56897: shell command: run > ClientRunBeforeJob "/var/www/backends/webbackend/misc/mysql_backup.py" > 11-jan 05:42 server00-dir JobId 56897: Sending Accurate information to the > FD. > 11-jan 05:52 server08-sd JobId 56897: Recycled volume > "cloud0015-17315" on File device "cloud0015-file-storage" > (/backup/cloud0015), all previous data lost. > 11-jan 05:52 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 05:58 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 06:03 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 06:06 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 06:13 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 06:16 server08-sd JobId 56897: Elapsed time=00:24:42, Transfer > rate=3.260 M Bytes/second > 11-jan 06:16 server00-dir JobId 56897: Volume used once. Marking > Volume "cloud0015-17315" as Used. > 11-jan 06:16 server08-sd JobId 56897: Sending spooled attrs to the > Director. Despooling 10,006,797 bytes ... > > > > > This starts to happen on Bacula 7.4.4 but after few days i decided to > upgrade to 9.2.1, problem persists. Debian linux, 64bit, i'm using > original Debian packages of Bacula. > > I also checked the health of HDD where backups are stored, done short > and long SMART tests, file system checks etc. No errors at all. > > Any hints? Thanks > > azur > > > > > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users >
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users