Re: [Bacula-users] Debugging a backup job
On 30/05/2019 22:15, Gestió Servidors wrote:

> Hello, after running some new tests, I think the problem was caused by the backup size and the network topology. There are two firewalls between my Bacula server and my "backup" server. Also, my server shares some NFS resources, so its load is sometimes heavy... After modifying the fileset to exclude two very big folders, the backup job finished OK.

You still need to implement the heartbeat settings, as pointed out earlier. That will make such failures less likely in the future and allow you to put the big directories back into the backup. And when a directory grows big enough to cause the problem again, the failure won't recur, because you will already have prevented it.

Cheers,
GaryB-)

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Debugging a backup job
Hello,

after running some new tests, I think the problem was caused by the backup size and the network topology. There are two firewalls between my Bacula server and my "backup" server. Also, my server shares some NFS resources, so its load is sometimes heavy... After modifying the fileset to exclude two very big folders, the backup job finished OK.

Thanks!
Re: [Bacula-users] Debugging a backup job
Hello,

On Thu, 30 May 2019 at 11:32, Josh Fisher wrote:

> Disregard my last post about it being a firewall. Obviously some data is being transferred, so I don't know where my brain was.

I think you may be right about a firewall problem. Even if some data was transferred, that doesn't mean the connection cannot time out because of a firewall, especially when "baculaserver" in the logs is a Director. Firewalls love to reset or close inactive connections.

> Wanderlei Huttel has given the most likely fix; to enable heartbeat. I ran into this once before. Since the client is a server, I'm going to assume that it didn't go into sleep mode. However, there are probably switches / routers in between this server and the director. Bacula-dir maintains a TCP connection with the client for the duration of a job and some energy-efficient switches do not handle this essentially inactive connection very well. They cut power to the port in such a way that the server bacula-dir is running on thinks that the other end dropped the connection. Bacula's heartbeat facility should keep the Dir->FD connection active and prevent the switch from putting the port into power-saving mode.

Yes, it should be the right solution.

Best regards
--
Radosław Korzeniewski
rados...@korzeniewski.net
Re: [Bacula-users] Debugging a backup job
Disregard my last post about it being a firewall. Obviously some data is being transferred, so I don't know where my brain was.

Wanderlei Huttel has given the most likely fix: enable heartbeat. I ran into this once before. Since the client is a server, I'm going to assume that it didn't go into sleep mode. However, there are probably switches / routers in between this server and the director. Bacula-dir maintains a TCP connection with the client for the duration of a job, and some energy-efficient switches do not handle this essentially inactive connection very well. They cut power to the port in such a way that the server bacula-dir is running on thinks that the other end dropped the connection. Bacula's heartbeat facility should keep the Dir->FD connection active and prevent the switch from putting the port into power-saving mode.

On 5/29/2019 11:46 AM, Josh Fisher wrote:

> On 5/29/2019 6:05 AM, Gestió Servidors wrote:
>
>> Hello,
>>
>> a backup job from a server is failing continuously. From the Bacula console, I have reconfigured debugging with "setdebug level=99 trace=1 client=my_server", but the job is not returning any more information, so I don't know why it is failing. I have rerun it four times and, every time, after writing 80 GB, the job fails with this message:
>>
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: Network error with FD during Backup: ERR=Connection timed out
>
> Is a firewall on the client blocking connections to bacula-fd (TCP port 9102)?
>
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Error: bsock.c:577 Read error from client:my_server_IP_address:9103: ERR=No data available
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: No Job status returned from FD.
>>
>> I have not set any limit in the configuration, such as "Maximum Volume Bytes", so I don't understand what is happening.
>>
>> I would like to know how to debug this job.
>>
>> Could anyone help me?
>>
>> Thanks.
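The idea of keeping an otherwise idle connection alive can be illustrated at the operating-system level with TCP keepalive. The sketch below is only an analogue of what a heartbeat achieves, not Bacula's own code: the helper name `enable_keepalive` and the 60-second idle value are assumptions chosen for illustration, and `TCP_KEEPIDLE` is only set on platforms where Python exposes it.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_seconds: int = 60) -> None:
    """Ask the OS to send TCP keepalive probes on an otherwise idle socket.

    Hypothetical helper for illustration; the 60 s default is an assumption.
    """
    # Turn keepalive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE (seconds of idleness before the first probe) is not
    # available on every platform, so set it only where Python exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s, idle_seconds=60)
        enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
        print("keepalive enabled:", enabled)
```

Periodic probes like these give firewalls and switches along the path some traffic to see, which is the effect the heartbeat suggestion in this thread is after. For the actual Bacula settings, refer to the heartbeat link posted in the thread.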
Re: [Bacula-users] Debugging a backup job
Hello,

Include the heartbeat interval in the configs:
https://github.com/wanderleihuttel/bacula-utils/blob/master/dicas/heartbeat_interval.md

Best regards
*Wanderlei Hüttel*
http://www.bacula.com.br

On Wed, 29 May 2019 at 13:18, David Brodbeck wrote:

> Since you're backing up *some* data, this doesn't seem like a firewall issue.
>
> I would check the client's syslog, first. See if the fd is crashing, or if you're getting I/O errors that are stalling the transfer. Sometimes a bad disk will only show up during backups, if the bad sectors are in an area that's rarely accessed.
>
> Since this is a server, power management is probably not an issue, but this *is* what I'd expect to see if a client went into sleep mode during a backup. Happens a lot with desktop systems.
>
> On Wed, May 29, 2019 at 8:11 AM Gestió Servidors wrote:
>
>> Hello,
>>
>> a backup job from a server is failing continuously. From the Bacula console, I have reconfigured debugging with "setdebug level=99 trace=1 client=my_server", but the job is not returning any more information, so I don't know why it is failing. I have rerun it four times and, every time, after writing 80 GB, the job fails with this message:
>>
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: Network error with FD during Backup: ERR=Connection timed out
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Error: bsock.c:577 Read error from client:my_server_IP_address:9103: ERR=No data available
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second
>> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: No Job status returned from FD.
>>
>> I have not set any limit in the configuration, such as "Maximum Volume Bytes", so I don't understand what is happening.
>>
>> I would like to know how to debug this job.
>>
>> Could anyone help me?
>>
>> Thanks.
Re: [Bacula-users] Debugging a backup job
Since you're backing up *some* data, this doesn't seem like a firewall issue.

I would check the client's syslog, first. See if the fd is crashing, or if you're getting I/O errors that are stalling the transfer. Sometimes a bad disk will only show up during backups, if the bad sectors are in an area that's rarely accessed.

Since this is a server, power management is probably not an issue, but this *is* what I'd expect to see if a client went into sleep mode during a backup. Happens a lot with desktop systems.

On Wed, May 29, 2019 at 8:11 AM Gestió Servidors wrote:

> Hello,
>
> a backup job from a server is failing continuously. From the Bacula console, I have reconfigured debugging with "setdebug level=99 trace=1 client=my_server", but the job is not returning any more information, so I don't know why it is failing. I have rerun it four times and, every time, after writing 80 GB, the job fails with this message:
>
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: Network error with FD during Backup: ERR=Connection timed out
> 2019-05-29 11:16:20 baculaserver JobId 44381: Error: bsock.c:577 Read error from client:my_server_IP_address:9103: ERR=No data available
> 2019-05-29 11:16:20 baculaserver JobId 44381: Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: No Job status returned from FD.
>
> I have not set any limit in the configuration, such as "Maximum Volume Bytes", so I don't understand what is happening.
>
> I would like to know how to debug this job.
>
> Could anyone help me?
>
> Thanks.

--
David Brodbeck
System Administrator, Department of Mathematics
University of California, Santa Barbara
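As a quick sanity check on the quoted log, the elapsed time multiplied by the transfer rate should land near the amount of data reported as written. The sketch below does that arithmetic. The function names are hypothetical, and it assumes that "M Bytes" means decimal megabytes, which the log does not confirm.

```python
def elapsed_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string, as printed in the job log, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def transferred_megabytes(elapsed: str, rate_mb_per_s: float) -> float:
    """Estimate the data moved as elapsed time multiplied by average rate."""
    return elapsed_seconds(elapsed) * rate_mb_per_s


if __name__ == "__main__":
    # Values taken from the log line:
    # "Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second"
    total_mb = transferred_megabytes("02:11:15", 11.06)
    print(f"{elapsed_seconds('02:11:15')} s, about {total_mb / 1000:.1f} GB")
```

Under that assumption the estimate comes out in the same range as the 80 GB mentioned in the report, which is consistent with the job moving data steadily for over two hours before the connection was lost.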
Re: [Bacula-users] Debugging a backup job
On 5/29/2019 6:05 AM, Gestió Servidors wrote:

> Hello,
>
> a backup job from a server is failing continuously. From the Bacula console, I have reconfigured debugging with "setdebug level=99 trace=1 client=my_server", but the job is not returning any more information, so I don't know why it is failing. I have rerun it four times and, every time, after writing 80 GB, the job fails with this message:
>
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: Network error with FD during Backup: ERR=Connection timed out

Is a firewall on the client blocking connections to bacula-fd (TCP port 9102)?

> 2019-05-29 11:16:20 baculaserver JobId 44381: Error: bsock.c:577 Read error from client:my_server_IP_address:9103: ERR=No data available
> 2019-05-29 11:16:20 baculaserver JobId 44381: Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: No Job status returned from FD.
>
> I have not set any limit in the configuration, such as "Maximum Volume Bytes", so I don't understand what is happening.
>
> I would like to know how to debug this job.
>
> Could anyone help me?
>
> Thanks.
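One way to answer the port question from the Director host is a plain TCP connection attempt. This is a minimal sketch: `port_is_open` is a hypothetical helper, and the 5-second timeout is an assumption.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and connects; the context
        # manager closes the socket again as soon as the check succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `port_is_open("my_server", 9102)` would test the file daemon port mentioned above, with `my_server` standing in for the real client name as it does in the thread. A successful connect only shows that the port is reachable at that moment. It does not rule out a device on the path dropping the connection later, once it has been idle for a while.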