It looks like you are hitting a socket timeout. Set "Heartbeat
Interval" in the Director config so the channel is kept open, and
see if that helps.
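
As a rough sketch, the directive would go in a resource like the one
below. The 60-second value is only an example, not a recommendation,
and which resources accept "Heartbeat Interval" (the Director resource
in bacula-dir.conf, or the daemon resources in bacula-fd.conf and
bacula-sd.conf) depends on your Bacula version, so check the manual
for 2.0.3 before relying on it:

```conf
Director {
  Name = archive2-dir        # director name as it appears in your job log
  # ... your existing directives stay as they are ...
  Heartbeat Interval = 60    # assumed example value, in seconds
}
```

Reload or restart the daemon after the change so the new setting is
picked up.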

Jason

Michael Proto wrote:
> Hello all,
>
> I'm seeing a strange problem with my bacula-fd clients after upgrading
> all of my systems to v2.0.3 (client and server). Intermittently, when
> performing a backup of some random client I'll see the following error
> in the Director:
>
> 30-Mar 02:09 archive2-dir: Start Backup JobId 1046,
> Job=guildenstern-a.2007-03-30_01.05.43
> 30-Mar 02:09 archive2-dir: guildenstern-a.2007-03-30_01.05.43 Fatal
> error: Socket error on Storage command: ERR=No data available
> 30-Mar 02:09 archive2-dir: guildenstern-a.2007-03-30_01.05.43 Error:
> Bacula 2.0.3 (06Mar07): 30-Mar-2007 02:09:12
>   JobId:                  1046
>   Job:                    guildenstern-a.2007-03-30_01.05.43
>   Backup Level:           Incremental, since=2007-03-29 02:06:06
>   Client:                 "guildenstern-a-fd" 2.0.3 (06Mar07)
> i686-pc-linux-gnu,debian,3.1
>   FileSet:                "guildenstern" 2007-03-18 21:37:31
>   Pool:                   "Daily" (From Run pool override)
>   Storage:                "ADIC-Library1" (From Job resource)
>   Scheduled time:         30-Mar-2007 01:05:42
>   Start time:             30-Mar-2007 02:09:05
>   End time:               30-Mar-2007 02:09:12
>   Elapsed time:           7 secs
>   Priority:               10
>   FD Files Written:       0
>   SD Files Written:       0
>   FD Bytes Written:       0 (0 B)
>   SD Bytes Written:       0 (0 B)
>   Rate:                   0.0 KB/s
>   Software Compression:   None
>   VSS:                    no
>   Encryption:             no
>   Volume name(s):
>   Volume Session Id:      44
>   Volume Session Time:    1175201849
>   Last Volume Bytes:      204,618,000,384 (204.6 GB)
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:
>   SD termination status:  Error
>   Termination:            *** Backup Error ***
>
> If I re-run the job just after the failure, the client works as
> expected. I have about 80 clients, all different platforms (Linux,
> FreeBSD, and Windows), and this seems to only affect the Linux clients.
> Of those Linux clients that are failing it occurs on a variety of
> distributions/versions (Debian v3.0 & v3.1, RHEL v3 & v4) and it's
> hit-or-miss whether a given Linux client will work on the first try or
> not, but in all cases I've seen (thus far), the re-run job works fine.
> Some days, a given client will work on the first try, and then the next
> day it fails, then works again the following day, etc. I haven't
> determined any rhyme or reason to it other than that it's just Linux
> clients that are affected. Currently, about 30% of my clients on a given
> day exhibit this behavior.
>
> To work around the problem I've added the following entries to the
> default job resource:
>
> JobDefs {
>   Name = "DefaultJob"
>   Type = Backup
>   Reschedule On Error = yes
>   Reschedule Times = 3
>   Reschedule Interval = 90 seconds
>   ...
>
> This does help my regularly-scheduled jobs to complete without having to
> manually re-run them, but this is not ideal and I'd like to determine
> why the first backup of a given client is failing.
>
> I built and packaged all the Bacula Linux clients myself (so they all
> pull from the same set of config files for quick installation), and I
> used the following compile-time flags when building them:
>
> --with-openssl --enable-client-only --enable-static-fd --enable-smartalloc
>
> I'm using the static-bacula-fd binary (instead of the bacula-fd binary)
> for maximum portability. They were built on a Debian Sarge host and then
> packaged into appropriate distribution packages.
>
> On one of the often-affected hosts I now have the client started with
> the following flags (out of /etc/inittab):
>
> /sbin/static-bacula-fd -fvc -d100 /etc/bacula/bacula-fd.conf > /tmp/bacula-fd.out
>
> When the client fails, I see the modification timestamp update on the
> resultant /tmp/bacula-fd.out file, but it's currently empty. Do I need to
> redirect stderr to this file instead of stdout?
>
> Anyone have any ideas what might be causing these errors or how I can go
> about debugging this unusual (and while not critical, still very
> annoying) problem?
>
>
>
> Thanks!
> Michael Proto
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>   
