On 9/20/2017 4:42 AM, Matthias Koch-Schirrmeister wrote:
On 19.09.2017 at 15:46, Can Şirin wrote:
  My jobs have almost 30M files x 8 parallel jobs, and it takes about
20 hours to despool attributes.
So you are saying that the whole backup run times out because the
database is taking so long? From what I could see with "top", the DB
never created noticeable load on the system.

I must admit that I do not know what "despooling" in this case means.
Does it mean "updating the database entries for the present backup run"?

Yes. Job attributes, file metadata, etc. that are to be stored in the db are spooled to a file in /var/spool/bacula while the fd is actively transmitting data. After the data transmission is complete (or if the spool file grows large enough?), Bacula reads the cached metadata from the file and inserts it into the db in batch mode. This is done both to speed up the db updates and to prevent constant db traffic from slowing down collection of the fd's data.
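
For what it's worth, attribute spooling is controlled per Job in bacula-dir.conf; a minimal sketch (the job and client names below are placeholders, not taken from your setup):

    Job {
      Name = "example-backup"      # placeholder
      Client = example-fd          # placeholder
      # FileSet, Schedule, Storage, Pool, etc. as usual
      # Spool the catalog (attribute) records to the spool directory and
      # insert them into the db in one batch at the end of the job,
      # instead of one db round trip per file.
      Spool Attributes = yes
    }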

If so, why do I finally get an "FD error"?

fafnir Fatal error: Network error with FD during Backup: ERR=Connection
reset by peer
fafnir Fatal error: No Job status returned from FD.

Because Bacula expects the TCP connections to stay up for the duration of the job. If the connection is dropped for any reason, whether by the fd itself or by a router or switch between the fd and the dir, the dir sees it as a failed connection and aborts the job. There are several ways the connection can be dropped: the fd goes into sleep mode and powers down the Ethernet PHY, a switch port the fd is connected to powers down due to inactivity, a buggy 802.3az (Energy Efficient Ethernet) implementation, and so on.
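
If that is the suspicion, one common mitigation is to have Bacula send keepalives on otherwise idle connections via the Heartbeat Interval directive; a minimal sketch for the client side (the daemon name and interval are just examples, not from your configs):

    # In bacula-fd.conf
    FileDaemon {
      Name = example-fd          # placeholder
      # Send a heartbeat when the connection has been idle this long,
      # so routers, switches, and power-saving NICs do not decide the
      # session is dead during a long despool.
      Heartbeat Interval = 60
    }

The Director and Storage daemons accept the same directive, if I remember correctly. If the culprit really is 802.3az/NIC power saving on the client, disabling EEE in the driver or with ethtool (where the NIC supports it) is a separate step outside Bacula.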


