Re: [Bacula-users] SPEED!

2011-07-07 Thread Glen Barber
On 7/7/11 2:36 AM, J. Echter wrote:
> Am 07.07.2011 04:43, schrieb Glen Barber:
>> On 7/6/11 12:37 PM, J. Echter wrote:
>>> backup speed has nothing to do with regular backup speed.
>>>
>> Can you explain exactly what this means?
>>
> Sorry, I meant backup speed has nothing to do with regular *network* speed.
> 

I see.

-- 
Glen Barber

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] SPEED!

2011-07-06 Thread Glen Barber
On 7/6/11 12:37 PM, J. Echter wrote:
> 
> backup speed has nothing to do with regular backup speed.
> 

Can you explain exactly what this means?

-- 
Glen Barber



Re: [Bacula-users] timeout error

2010-08-09 Thread Glen Barber
On 8/9/10 1:32 PM, Jesus arteche wrote:
> hey guys,
> 
> I'm getting this error in a job:
> 
> 09-Aug 15:14 InternetServer-dir JobId 6680: Fatal error: Network error with FD during Backup: ERR=Connection timed out
> 09-Aug 15:14 InternetServer-sd JobId 6680: Job gestiondocumental.2010-08-09_12.52.59.04 marked to be canceled.
> 09-Aug 15:14 InternetServer-sd JobId 6680: Job gestiondocumental.2010-08-09_12.52.59.04 marked to be canceled.
> 09-Aug 15:14 InternetServer-sd JobId 6680: Job write elapsed time = 02:11:16, Transfer rate = 400.4 K bytes/second
> 09-Aug 15:14 InternetServer-dir JobId 6680: Fatal error: No Job status returned from FD.
> 09-Aug 15:14 InternetServer-dir JobId 6680: Error: Bacula InternetServer-dir 2.4.4 (28Dec08): 09-Aug-2010 15:14:34
> 
> I moved my server to a new location, so the backup now runs from a data
> center to my office. I guess this error may be caused by latency, but I
> don't know; I have good bandwidth in both directions.
> 
> Does anyone know how to solve it?
> 
> thanks
> 

Hello,

If you haven't done so already, you might try adding a heartbeat
interval to the client FD.

% grep -i heartbeat /usr/local/etc/bacula-fd.conf
  Heartbeat Interval = 15

http://www.bacula.org/manuals/en/install/install/Storage_Daemon_Configuratio.html
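
For anyone who prefers a scripted check over grep, here is a minimal Python
sketch of the same lookup.  It assumes that Bacula ignores case and spaces in
directive keywords, so it normalizes the keyword before comparing.  The
function name find_heartbeat and the sample configuration text are made up
for illustration and are not part of Bacula.

```python
import re

def find_heartbeat(conf_text):
    """Return the Heartbeat Interval values found in Bacula config text.

    Keyword matching ignores case and spaces, which is an assumption
    about how Bacula treats directive names.
    """
    values = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]          # drop trailing comments
        if "=" not in line:
            continue
        keyword, value = line.split("=", 1)
        if re.sub(r"\s+", "", keyword).lower() == "heartbeatinterval":
            values.append(value.strip())
    return values

# Hypothetical file daemon resource, used only to exercise the function.
sample = """
FileDaemon {
  Name = client-fd            # hypothetical client name
  Heartbeat Interval = 15
}
"""
print(find_heartbeat(sample))
```

An empty list would suggest the directive is not set in that file.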

Regards,
-- 
Glen Barber



[Bacula-users] Odd Maximum Spool Size behavior after network connection interruption

2010-07-29 Thread Glen Barber
Hi,

I've seen this odd behavior twice now, and I'm having trouble tracking 
down the cause.  Both times, it was after a network connection issue 
during a backup.

When the network connection is interrupted, the SD stops writing to the
spool, the spool file is removed, and the failed job gets rescheduled.
However, the SD never seems to become aware that the spool file has been
removed, and doesn't use all available space.

I have Maximum Spool Size set to 25GB.  It doesn't appear to be using
(25GB - failed_job_spoolfile_size), though, as the 25GB spool size is
very generously high.  In the most recent situation, the SD was spooling
130MB chunks.

The only time I realize there is an issue is when I see large numbers of
"User specified spool size reached." messages, and the spool is written
to tape roughly once a minute or so.  Killing the SD gets everything
back on track, and there don't appear to be any other issues outside
of constant tape writes.

Has anyone else seen this?  Might this be something that is corrected in 
5.0.3?  I have no compelling reason to upgrade otherwise, and this 
really isn't a problem.

I'm using bacula 5.0.0 on FreeBSD 7.2-RELEASE.

Thanks, and regards.

-- 
Glen Barber



Re: [Bacula-users] How does Bacula determine network transfer speed?

2010-04-22 Thread Glen Barber
Hi Martin,

Martin Simmons wrote: 
> >>>>> On Thu, 22 Apr 2010 10:27:30 -0400, Glen Barber said:
> > 
> > I'm seeing what I believe to be unusual behavior with regards to transfer
> > speeds.  Ultimately, I'd like to find out if either bacula-fd or
> > bacula-dir determine the maximum possible transfer speed during a backup,
> > and if that speed is locked during the process.
> 
> No, they don't work at that level.  They open tcp streams and let the
> underlying OSes deal with data transfer.  The backup data itself is
> transferred from the bacula-fd to the bacula-sd (not the bacula-dir).
> 
> netstat -s might be a good place to start looking for network stats, though
> they will be per machine rather than per process.

The data I have been able to obtain so far indicated something
bacula-specific.  As always, thank you for the explanation and suggestion.

Regards,
-- 
Glen Barber



[Bacula-users] How does Bacula determine network transfer speed?

2010-04-22 Thread Glen Barber
Howdy,

I'm seeing what I believe to be unusual behavior with regards to transfer
speeds.  Ultimately, I'd like to find out if either bacula-fd or
bacula-dir determine the maximum possible transfer speed during a backup,
and if that speed is locked during the process.

During the past few days, I've been seeing anywhere between 0.5 and 3.5%
packet loss during large-packet pings (1450 byte packets), though that
isn't the reason for the post.  It appears the transfer rate of the
backup, in full, is determined in the beginning phases of the
director / file-daemon negotiation, though I have not found any
documentation to verify this.  Unfortunately, the Director Services Daemon
documentation in the developer's manual is "to be written." [1]

All machines are running 5.0.0.

On Tuesday, the backups were running unusually far behind schedule.  The
particular job that was running was transferring at approximately 2Mb/s.
I was able to do some basic testing to see what other transfers would be
affected, and found that I could get ~9.6Mb/s between the backup server
and the same client.  At 9:00AM, I cancelled the job to let some smaller
jobs complete, and rescheduled the job afterwards.  The rescheduled job
ran significantly faster than the original, even though the two runs
started within about two hours of each other.

The transfer information from the cancelled job follows:
Elapsed time:   1 hour 14 mins 8 secs
FD Bytes Written:   1,035,309,129 (1.035 GB)
Rate:   232.8 KB/s

The transfer information from the rescheduled job, which was later
cancelled for further testing, follows:
Elapsed time:   13 mins 26 secs
FD Bytes Written:   1,260,431,584 (1.260 GB)
Rate:   1563.8 KB/s

Outside of cancelling the job, I made no other changes between these jobs,
and I cannot explain the large difference in rate.  As I am unable to find
documentation on how the negotiation takes place, my theory is that the
slower running jobs are initiated during a time in which the network is
experiencing packet loss, and never recover from that state.

Further data I have seen seems to suggest this is the case.  The following
log snippets are from the same host, on two separate days:
Elapsed time:   12 hours 54 mins 16 secs
FD Bytes Written:   7,364,615,843 (7.364 GB)
Rate:   158.5 KB/s

Elapsed time:   10 hours 10 mins 4 secs
SD Bytes Written:   51,384,264,311 (51.38 GB)

Interestingly, the latter log snippet maintained a transfer rate of 12Mb/s
throughout the entire 10 hour period, according to my firewall network
graph, stopping only to despool to tape.
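
The reported rates can be cross-checked against the byte counts and elapsed
times above.  The sketch below is only arithmetic on the figures quoted in
this message; the job labels are made up for readability, and treating 1 KB
as 1000 bytes is an assumption (it is what reproduces the 232.8, 1563.8 and
158.5 KB/s values).

```python
def rate_bytes_per_sec(total_bytes, hours=0, minutes=0, seconds=0):
    """Average transfer rate in bytes per second over the elapsed time."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return total_bytes / elapsed

# (bytes written, hours, minutes, seconds) as quoted in the job reports.
jobs = {
    "cancelled job": (1_035_309_129, 1, 14, 8),
    "rescheduled job": (1_260_431_584, 0, 13, 26),
    "slow 12-hour run": (7_364_615_843, 12, 54, 16),
    "fast 10-hour run": (51_384_264_311, 10, 10, 4),
}

for name, (nbytes, h, m, s) in jobs.items():
    bps = rate_bytes_per_sec(nbytes, h, m, s)
    # KB/s assumes 1 KB = 1000 bytes; Mb/s is megabits (8 bits per byte).
    print(f"{name}: {bps / 1000:.1f} KB/s, {bps * 8 / 1e6:.2f} Mb/s")
```

For the 10-hour run this works out to a little over 11 Mb/s averaged across
the whole job, which is in line with the roughly 12Mb/s seen on the firewall
graph once the pauses for despooling are taken into account.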

Does bacula lock in to a particular transfer rate until the backup reaches
completion?

Thanks, and regards.

[1] - http://www.bacula.org/5.0.x-manuals/en/developers/developers/Director_Services_Daemon.html

-- 
Glen Barber



[Bacula-users] "Got BNET_SIG -4 from SD" error

2010-04-19 Thread Glen Barber
Hi,

A while back[1], I was having an issue with one machine failing on full
backups, erroring with "Network send error to SD" in the log.  This
particular job would bomb at around 3 hours into the backup.  I suspected
the network at the first sign of a problem, but was unable to pinpoint the
specific cause of the failures.  Since the original problem, I've upgraded
all my machines from 2.4.3 to 5.0.0.

Recently, this machine was moved to a different colo, so I scheduled a
full backup after enabling logging with '-d 150' on the FD, the same debug
level I was using previously, in hopes I would not need to refer to the
log.

Unfortunately I did, and I'm not quite sure I understand the error.  Here
is what I see in the debug output:


servername-fd: heartbeat.c:91-0 Got BNET_SIG -4 from SD
servername-fd: heartbeat.c:96-0 wait_intr=1 stop=1
servername-fd: backup.c:1023-14662 Send data to SD len=65536
servername-fd: backup.c:1023-14662 Send data to SD len=65536
servername-fd: backup.c:1023-14662 Send data to SD len=65536
servername-fd: heartbeat.c:142-14662 Send kill to heartbeat id
servername-fd: backup.c:211-14662 end blast_data ok=0
servername-fd: job.c:1626-14662 Error in blast_data.
servername-fd: job.c:276-14662 Quit command loop. Canceled=1
servername-fd: job.c:303-14662 End FD msg: Jmsg \
Job=servername.2010-04-19_09.42.48_34 type=3 \
level=1271721175 servername-fd \
JobId 14662: Fatal error: backup.c:1019 \
Network send error to SD.  ERR=Broken pipe
servername-fd: job.c:382-14662 Calling term_find_files
servername-fd: job.c:385-14662 Done with term_find_files
servername-fd: jcr.c:183-14662 write_last_jobs seek to 188
servername-fd: job.c:387-0 Done with free_jcr


Though I found information on bnet_sig()[2], I am not clear on how it
would receive '-4', and more importantly, what that signal is telling the
FD.
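
When sifting through a long '-d 150' log, it can help to split each line into
fields and filter by job.  The following is a rough Python sketch; the layout
"<daemon>: <file>:<line>-<jobid> <message>" is inferred only from the lines
shown above and is an assumption, not a documented format.  The names
DEBUG_LINE and parse_debug_line are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the debug output above (an assumption):
#   "<daemon>: <file>:<line>-<jobid> <message>"
DEBUG_LINE = re.compile(
    r"^(?P<daemon>\S+): (?P<file>[\w.]+):(?P<line>\d+)-(?P<jobid>\d+) (?P<msg>.*)$"
)

def parse_debug_line(text):
    """Split one debug line into fields, or return None if it does not match."""
    match = DEBUG_LINE.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    fields["jobid"] = int(fields["jobid"])
    return fields

log = """\
servername-fd: heartbeat.c:91-0 Got BNET_SIG -4 from SD
servername-fd: heartbeat.c:96-0 wait_intr=1 stop=1
servername-fd: backup.c:1023-14662 Send data to SD len=65536
servername-fd: backup.c:211-14662 end blast_data ok=0
"""

records = [r for r in map(parse_debug_line, log.splitlines()) if r]
# Count lines per source file, then show messages not tied to a job id.
print(Counter(r["file"] for r in records))
print([r["msg"] for r in records if r["jobid"] == 0])
```

Continuation lines, such as the wrapped Jmsg text, do not match the pattern
and are skipped.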

I found one post from December [3] with the same error, though it suggests
the NIC had fallen asleep due to power saving features, which is not the
case here.  More importantly, this backup chewed through over 35GB before
it got to this point.

Any insight would be appreciated.

[1] - http://adsm.org/lists/html/Bacula-users/2010-02/msg00532.html
[2] - http://oss.org.cn/man/network/bacula/bacula_dev/TCP_IP_Network_Protocol.html#SECTION000188000
[3] - http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg38708.html

-- 
Glen Barber



Re: [Bacula-users] Bacula suddenly choking on Full backups with Unknown term code

2010-02-22 Thread Glen Barber
Hi Martin,

Martin Simmons wrote: 
> >>>>> On Sun, 21 Feb 2010 12:15:01 -0500, Glen Barber said:
> > 
> > fd JobId 13934: Fatal error: backup.c:892 Network send error to SD. ERR=Broken pipe
> > sd JobId 13934: Job client.2010-02-20_17.43.07 marked to be canceled.
> > sd JobId 13934: Fatal error: append.c:259 Network error on data channel. ERR=Connection reset by peer
> > sd JobId 13934: Job write elapsed time = 02:58:46, Transfer rate = 1.451 M bytes/second
> > sd JobId 13934: Error: bsock.c:444 Read error from client:xxx.xxx.xxx.xxx:36643: ERR=Connection reset by peer
> > dir JobId 13934: Error: Bacula dir 2.4.3 (10Oct08): 20-Feb-2010 21:12:25
> > 
> 
> The fd got "Network send error to SD. ERR=Broken pipe" so the fd's OS thinks
> that the socket was closed by the peer (i.e. the sd).
> 
> Conversely, the sd got "Network error on data channel. ERR=Connection reset by
> peer" so the sd's OS thinks the socket was forcibly closed by the peer
> (i.e. the fd).
> 
> They can't both be right, unless something in between is messing up.  That
> looks very much like a network problem to me, maybe not in the colo switch but
> somewhere in between.
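
The point that each error is only the local OS's view of the connection can
be seen with a small Python sketch.  It uses a local socket pair rather than
a real network link or Bacula itself, so it only illustrates how a writer
comes to see "Broken pipe" once its peer is gone; the function name is made
up, and the exact errno can vary by platform.

```python
import errno
import os
import socket

def send_after_peer_close():
    """Write to a local socket whose other end is already closed.

    Returns the errno the OS reports to the writer (0 if the send worked).
    """
    writer, peer = socket.socketpair()
    peer.close()                      # the "other side" goes away first
    try:
        writer.send(b"backup data")
    except OSError as exc:
        return exc.errno
    finally:
        writer.close()
    return 0

code = send_after_peer_close()
# On Linux this is expected to be EPIPE, whose message is "Broken pipe".
print(code == errno.EPIPE, os.strerror(code))
```

If something in the path drops the connection, each end can be left with a
different error like this, even though neither daemon closed the socket
itself.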

I haven't yet completely dismissed this possibility, but the reproducible
mbox failures are far too strange for me not to think there may be a file
issue also.  I've copied this file to the backup server itself and run an
individual backup on it without an issue.

Another run earlier today shows this in the debug log:

fd: backup.c:895-0 Send data to SD len=65552
fd: heartbeat.c:95-0 wait_intr=0 stop=0
fd: heartbeat.c:95-0 wait_intr=0 stop=0
fd: heartbeat.c:95-0 wait_intr=0 stop=0
fd: heartbeat.c:95-0 wait_intr=0 stop=0
[ ... a few more times ... ]
fd: heartbeat.c:139-0 Send kill to heartbeat id
fd: backup.c:197-0 end blast_data ok=0
fd: job.c:1447-0 Error in blast_data.

I'd like to be able to view the datastream as this failure occurs, but I
don't see how to accomplish this in the documentation.  I can use truss or
ktrace if needed, but if bacula has a built-in function, that would be
even better.

Best,

-- 
Glen Barber



[Bacula-users] Bacula suddenly choking on Full backups with Unknown term code

2010-02-21 Thread Glen Barber
Howdy,

I'm running bacula 2.4.3 on FreeBSD, which up until recently hadn't been
giving me issues.

I run daily incrementals, weekly differentials, and monthly fulls on
colo-stored clients.  One of these client machines began failing to complete
differential and full backups, with an "Unknown term code" in the email
notification, with the following in the log:

fd JobId 13934: Fatal error: backup.c:892 Network send error to SD. ERR=Broken pipe
sd JobId 13934: Job client.2010-02-20_17.43.07 marked to be canceled.
sd JobId 13934: Fatal error: append.c:259 Network error on data channel. ERR=Connection reset by peer
sd JobId 13934: Job write elapsed time = 02:58:46, Transfer rate = 1.451 M bytes/second
sd JobId 13934: Error: bsock.c:444 Read error from client:xxx.xxx.xxx.xxx:36643: ERR=Connection reset by peer
dir JobId 13934: Error: Bacula dir 2.4.3 (10Oct08): 20-Feb-2010 21:12:25

In November, I changed the fileset for this client, after which a full
backup was scheduled and terminated successfully.  Since that initial full
backup, two full and seven differential backups have terminated
successfully.  Incremental backups are not affected.

I initially began to suspect the network, but the colo switch does not show
errors.  I've already enabled the heartbeat on the client with settings as
low as 15 seconds, with no luck.  I ran the client fd with -d200 to track
the failures, and found the backup was choking on a mbox file.

I created a new job for this particular file, again with -d200, which failed
as well.  Since we are lucky enough that bacula will tell us how many bytes
were transferred, I was able to use dd(1) to get to that point in the file,
where I found the contents to be a base64-encoded PDF.  Removing that email,
a subsequent backup of that file was successful.  A backup of only that
one email resulted in a failure.
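
For reference, the same "jump to the byte offset and look around" step can be
done without dd(1).  This is a minimal Python sketch, assuming the byte count
reported for the failed job is used as the offset; the function name
read_window, the throwaway demo file, and the offset 111 are made up for
illustration.

```python
import os
import tempfile

def read_window(path, offset, length=512):
    """Return up to `length` bytes of `path` starting at byte `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)

# Demonstration on a throwaway file; a real run would point at the mbox
# and use the byte count reported for the failed job as the offset.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    # 11 header bytes + 100 filler bytes, then a base64-looking marker.
    tmp.write(b"From alice\n" + b"A" * 100 + b"JVBERi0xLjQK" + b"B" * 100)
    demo_path = tmp.name

print(read_window(demo_path, 111, 12))
os.remove(demo_path)
```

Reading a few hundred bytes on either side of the offset is usually enough to
identify which message the transfer stopped in.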

file(1) output of the mbox file shows: ASCII mail text, with very long lines

I moved the mbox to a location which does not get backed up, and found the
same failure on another user's mbox with the same file(1) output.

Have I hit a bug?  I know 2.4.3 is rather old, and an upgrade is in the near
future, but I'd hate to upgrade to find the same problem, so I'm hoping this
is something someone has seen before.

-- 
Glen Barber



Re: [Bacula-users] Tool to test a Fileset what passes?

2010-01-31 Thread Glen Barber
Hi Thomas,

Thomas Schweikle wrote: 
> Hi!
> 
> Is there a tool available to test a given file-set and show which
> files from a system pass and which are ignored?
> 
> I'd like to test file-sets before I apply them to the server maybe
> not backing up files that should have been backed up ...!
> 

Have a look at the 'estimate' command.  With the 'listing' keyword, it
prints the files that would be backed up.

http://www.bacula.org/en/dev-manual/Bacula_Console.html

Regards,

-- 
Glen Barber



Re: [Bacula-users] Continue to spool to disk when tape is full?

2009-12-11 Thread Glen Barber
On Fri, Dec 11, 2009 at 6:01 PM, John Drescher wrote:
>>
>> Bacula is capable of only one spool file, it seems?
>>
>> I suppose a more specific question would be "can I have Bacula close a
>> spool file if it cannot despool, and write to tape when it is
>> available?"
>>
>
> Looks like that is item #14 on the projects list.
>
> http://bacula.git.sourceforge.net/git/gitweb.cgi?p=bacula/bacula;a=blob;f=bacula/projects;hb=HEAD
>

Ah, yes that does look like what I am looking for.

> If you do not know, the projects list is a list of submitted feature
> requests that generally get voted on before each major release, and the
> results of the vote steer the developers on what is most important to
> users.
>

That is good to know.  Thanks for the quick reply!


-- 
Glen Barber



Re: [Bacula-users] Continue to spool to disk when tape is full?

2009-12-11 Thread Glen Barber
Hi John,

On Fri, Dec 11, 2009 at 4:26 PM, John Drescher wrote:
> On Fri, Dec 11, 2009 at 4:13 PM, Glen Barber wrote:
>> Hi Martin,
>>
>> On Fri, Dec 11, 2009 at 3:48 PM, Martin Simmons wrote:
>>>>
>>>> If I understand the documentation correctly, raising dir.conf Maximum
>>>> Concurrent Jobs will interleave multiple backups to tape, which I want
>>>> to avoid.  Is there something obvious I've missed in the
>>>> documentation, or is Bacula expected to stop spooling to disk when the
>>>> tape is full?
>>>
>>> The latter -- it stops because it can't simultaneously read and write the
>>> spool file.
>>>
>
> Is simultaneous spooling / despooling in bacula yet? I thought that
> was not yet implemented.
>

I don't necessarily mean "simultaneous" - in my scenario, the
despooling was not possible.

> What I have seen is that if there is no appendable volume in the device
> (and it's not an autochanger that can load new volumes), the next
> attempt to spool will be blocked.
>

Bacula is capable of only one spool file, it seems?

I suppose a more specific question would be "can I have Bacula close a
spool file if it cannot despool, and write to tape when it is
available?"

Regards,

-- 
Glen Barber



Re: [Bacula-users] Continue to spool to disk when tape is full?

2009-12-11 Thread Glen Barber
Hi Martin,

On Fri, Dec 11, 2009 at 3:48 PM, Martin Simmons wrote:
>>
>> If I understand the documentation correctly, raising dir.conf Maximum
>> Concurrent Jobs will interleave multiple backups to tape, which I want
>> to avoid.  Is there something obvious I've missed in the
>> documentation, or is Bacula expected to stop spooling to disk when the
>> tape is full?
>
> The latter -- it stops because it can't simultaneously read and write the
> spool file.
>

That makes sense.  Thanks for the explanation.



-- 
Glen Barber



[Bacula-users] Continue to spool to disk when tape is full?

2009-12-10 Thread Glen Barber
Hello,

I'm running Bacula version 2.4.3 on a machine with a single tape
drive, and on occasion the tape will become full overnight.  'status
dir' shows the following:

server1.YYYY-MM-DD_HH.MM.SS is waiting on max Storage jobs

Since the machine has plenty of space to spool the data, I'd like to
continue to spool to disk when the tape is full so despooling to tape
can take place when the tape is changed.

I have the following Maximum Concurrent Jobs set:

bacula-dir.conf:   Maximum Concurrent Jobs = 1
bacula-sd.conf:   Maximum Concurrent Jobs = 30

If I understand the documentation correctly, raising dir.conf Maximum
Concurrent Jobs will interleave multiple backups to tape, which I want
to avoid.  Is there something obvious I've missed in the
documentation, or is Bacula expected to stop spooling to disk when the
tape is full?

Regards,

-- 
Glen Barber
