Re: [Bacula-users] Debugging a backup job

2019-05-30 Thread Gary R. Schmidt

On 30/05/2019 22:15, Gestió Servidors wrote:

> Hello,
>
> After running some new tests, I think the problem was caused by the
> backup size and the network topology. Between my Bacula server and my
> "backup" server there are two firewalls. Also, my server shares some
> NFS resources, so its load is sometimes heavy... After modifying the
> fileset to exclude two very big folders, the backup job finished OK.

You still need to implement the heartbeat settings, as pointed out 
earlier; that will make such failures less likely in the future.

It will also allow you to put the big directories back into the backup.

And when a directory grows big enough to cause the problem again, it 
won't, because you have stopped it from happening.


Cheers,
Gary B-)



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Debugging a backup job

2019-05-30 Thread Gestió Servidors
Hello,

After running some new tests, I think the problem was caused by the 
backup size and the network topology. Between my Bacula server and my 
"backup" server there are two firewalls. Also, my server shares some 
NFS resources, so its load is sometimes heavy... After modifying the 
fileset to exclude two very big folders, the backup job finished OK.

Thanks!



Re: [Bacula-users] Debugging a backup job

2019-05-30 Thread Radosław Korzeniewski
Hello,

On Thu, 30 May 2019 at 11:32, Josh Fisher wrote:

> Disregard my last post about it being a firewall. Obviously some data is
> being transferred, so I don't know where my brain was.
>
I think you could be right about a firewall problem. Even if some data
was transferred, that doesn't mean the connection cannot time out because
of a firewall, especially when "baculaserver" in the logs is a Director.
Firewalls love to reset/close any inactive connections.

> Wanderlei Huttel has given the most likely fix; to enable heartbeat. I ran
> into this once before. Since the client is a server, I'm going to assume
> that it didn't go into sleep mode. However, there are probably switches /
> routers in between this server and the director. Bacula-dir maintains a TCP
> connection with the client for the duration of a job and some
> energy-efficient switches do not handle this essentially inactive
> connection very well. They cut power to the port in such a way that the
> server bacula-dir is running on thinks that the other end dropped the
> connection. Bacula's heartbeat facility should keep the Dir->FD connection
> active and prevent the switch from putting the port into power-saving mode.
>
Yes, it should be the right solution.

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net


Re: [Bacula-users] Debugging a backup job

2019-05-30 Thread Josh Fisher
Disregard my last post about it being a firewall. Obviously some data is 
being transferred, so I don't know where my brain was.


Wanderlei Huttel has given the most likely fix: enable heartbeat. I 
ran into this once before. Since the client is a server, I'm going to 
assume that it didn't go into sleep mode. However, there are probably 
switches / routers in between this server and the director. Bacula-dir 
maintains a TCP connection with the client for the duration of a job and 
some energy-efficient switches do not handle this essentially inactive 
connection very well. They cut power to the port in such a way that the 
server bacula-dir is running on thinks that the other end dropped the 
connection. Bacula's heartbeat facility should keep the Dir->FD 
connection active and prevent the switch from putting the port into 
power-saving mode.
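
To illustrate the idea of keeping an idle connection alive, here is a minimal Python sketch that turns on TCP keepalive probes for a socket. It is only an analogy for what the heartbeat achieves, not Bacula's code or configuration; the function name `enable_keepalive` and the timing values are assumptions chosen for the example.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 60, count: int = 5) -> None:
    """Ask the kernel to send TCP keepalive probes on an otherwise idle socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below are platform-specific (available on Linux),
    # so each one is applied only if this Python build exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of silence before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes tolerated before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

With probes enabled, an otherwise silent connection still carries periodic traffic, which is the property that stops firewalls and switches from treating it as dead.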





Re: [Bacula-users] Debugging a backup job

2019-05-29 Thread Wanderlei Huttel
Hello

Add a heartbeat interval to the configs:
https://github.com/wanderleihuttel/bacula-utils/blob/master/dicas/heartbeat_interval.md


Best regards

*Wanderlei Hüttel*
http://www.bacula.com.br




Re: [Bacula-users] Debugging a backup job

2019-05-29 Thread David Brodbeck
Since you're backing up *some* data, this doesn't seem like a firewall
issue.

I would check the client's syslog, first. See if the fd is crashing, or if
you're getting I/O errors that are stalling the transfer. Sometimes a bad
disk will only show up during backups, if the bad sectors are in an area
that's rarely accessed.
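
A first pass over the client's syslog could look like the Python sketch below. The search patterns and the /var/log/syslog path in the usage comment are assumptions; the actual log location and message wording depend on the system.

```python
import re
from typing import Iterable, List

# Hypothetical patterns: the file daemon's name, kernel I/O errors, crashes.
# Adjust them to whatever the client actually logs.
SUSPECT = re.compile(r"bacula-fd|I/O error|segfault", re.IGNORECASE)


def suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the suspect patterns."""
    return [line.rstrip("\n") for line in lines if SUSPECT.search(line)]


# Usage (path is an assumption; some systems use /var/log/messages):
#     with open("/var/log/syslog", errors="replace") as fh:
#         for hit in suspicious_lines(fh):
#             print(hit)
```

Comparing the timestamps of any hits with the time the job failed shows whether the two are related.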

Since this is a server, power management is probably not an issue, but this
*is* what I'd expect to see if a client went into sleep mode during a
backup. Happens a lot with desktop systems.




-- 
David Brodbeck
System Administrator, Department of Mathematics
University of California, Santa Barbara


Re: [Bacula-users] Debugging a backup job

2019-05-29 Thread Josh Fisher


On 5/29/2019 6:05 AM, Gestió Servidors wrote:

> Hello,
>
> A backup job from a server is failing continuously. From the Bacula
> console, I have enabled debugging with "setdebug level=99 trace=1
> client=my_server", but the job is not returning any more info, so I
> don't know why it is failing. I have rerun it four times and, every
> time, after writing 80 GB, the job fails with this message:
>
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: Network error with FD during Backup: ERR=Connection timed out



Is a firewall on the client blocking connections to bacula-fd (TCP port 9102)?
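
One way to check this from the Director's side is a plain TCP connection test, sketched below in Python. The helper name `port_is_reachable` is made up for this example, and `my_server` in the usage comment is just the placeholder client name from the log.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Usage, run on the machine where bacula-dir lives:
#     port_is_reachable("my_server", 9102)
```

A successful connect only shows that the port is open right now; it says nothing about whether an idle connection gets dropped later in the job.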



> 2019-05-29 11:16:20 baculaserver JobId 44381: Error: bsock.c:577 Read error from client:my_server_IP_address:9103: ERR=No data available
> 2019-05-29 11:16:20 baculaserver JobId 44381: Elapsed time=02:11:15, Transfer rate=11.06 M Bytes/second
> 2019-05-29 11:16:20 baculaserver JobId 44381: Fatal error: No Job status returned from FD.
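
As a sanity check on those numbers, the elapsed time and transfer rate from the log can be multiplied out. This small Python sketch assumes "M Bytes" means 10^6 bytes, which is a guess about the log format; `transferred_bytes` is a name made up here.

```python
def transferred_bytes(elapsed: str, rate_mb_per_s: float) -> float:
    """Estimate bytes moved from an 'HH:MM:SS' elapsed time and a rate in MB/s."""
    hours, minutes, seconds = (int(part) for part in elapsed.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    # Assumption: 1 MB = 10**6 bytes.
    return total_seconds * rate_mb_per_s * 1_000_000


total = transferred_bytes("02:11:15", 11.06)
print(f"{total / 1e9:.1f} GB, {total / 2**30:.1f} GiB")  # prints "87.1 GB, 81.1 GiB"
```

That is in line with the "80 GB" reported, so data was flowing for a little over two hours before the connection was lost.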

> I have not set any limit in the configuration, such as "Maximum Volume
> Bytes", so I don't understand what is happening.
>
> I would like to know how to debug this job.
>
> Could anyone help me?
>
> Thanks.



