Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Chris Morris
For the benefit of list readers and those who may still be trying to 
help me troubleshoot, here is an update.

I have finally been able to duplicate the errors on my own terms.  I 
could never duplicate this before, because I would sit at a terminal 
window and make the snapshot, mount the snapshot, browse the snapshot, 
and umount the snapshot in that order.  Finally, I decided to try to 
*behave* more like I expect a script to work. 

I opened two terminal windows.  In one, I started the snapshot 
generation process.  In the other, I tried to mount the snapshot before 
it was finished.  It, of course, didn't work.  More importantly, 
however, _I got the exact errors that I would randomly get in my 
automated overnight backups._

Now that I can reproduce the error, implementing a fix is trivial.  
Many thanks to those who provided assistance with this matter.
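
The message above does not say what the fix was.  One possible approach, 
shown here only as a minimal sketch, is to retry the mount a few times 
instead of failing on the first attempt.  The retry helper, its name, and 
its arguments are hypothetical and do not come from the thread or from 
the snapshot utility.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not the poster's actual fix):
# run a command up to ATTEMPTS times, sleeping DELAY_SECONDS between
# tries, so a mount issued while the snapshot is still being generated
# gets another chance before the job is failed.
#
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0    # the command succeeded
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "retry: giving up after $n attempts: $*" >&2
            return 1    # propagate failure so the caller can abort
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

A call might look like "retry 10 30 /usr/local/bin/sudo 
/usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var", where the 
attempt count and delay are illustrative values only.  Retrying treats 
the symptom; making sure the snapshot step has finished before the mount 
starts is the more direct cure.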

Thank you,

Chris Morris

-- 
S i x  F e e t  U p  |  "Nowhere to go but open source"
Silicon Valley: +1 (650) 401-8579 x609
Midwest: +1 (317) 861-5948 x609
Toll-Free: 1-866-SIX-FEET
mailto:[EMAIL PROTECTED]
http://www.sixfeetup.com  |  Zope/Plone Custom Development



John Drescher wrote:
>
>
> On 7/11/07, *Chris Morris* <[EMAIL PROTECTED]> wrote:
>
> Dan Langille wrote:
> >
> > You say the jobs fail.  What is the "failure"?  Error message?
> >
> >
> Below, I've pasted in a failure notification email that Bacula
> automatically sends.
>
>    11-Jul 07:18 admin01-dir: Start Backup JobId 255,
> Job=app11_BSD.2007-07-10_23.35.04
>    11-Jul 01:19 app11-fd: DIR and FD clocks differ by -21547 seconds, FD
> automatically adjusting.
>    11-Jul 01:19 app11-fd: ClientRunBeforeJob: run command
> "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
>    11-Jul 01:22 app11-fd: ClientRunBeforeJob: mount:
> /var/.snap/autogen_bkup.0: Resource temporarily unavailable
>    11-Jul 01:22 app11-fd: ClientRunBeforeJob: run command
> "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
>    11-Jul 01:24 app11-fd: ClientRunBeforeJob: run command
> "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup
> /mnt/var"
>    11-Jul 01:24 app11-fd: ClientRunBeforeJob: mount: /dev/md0:
> Input/output error
>    11-Jul 01:24 app11-fd: ClientRunBeforeJob: snapshot:ERROR: unable to
> mount "/dev/md0" under "/mnt/var"
>    11-Jul 01:24 app11-fd: app11_BSD.2007-07-10_23.35.04 Error:
> Runscript: ClientRunBeforeJob returned non-zero status=1. ERR=Child
> exited with code 1
>    11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Fatal error:
> Bad response to ClientRunBeforeJob command: wanted 2000 OK RunBefore
>    , got 2905 Bad RunBeforeJob command.
>
>
>
> This says your ClientRunBeforeJob has failed because it could not 
> perform the mount of /dev/md0 on /mnt/var. Have you checked into that? 
> Is bacula-fd running as user bacula? Possibly this is a permissions 
> issue.
>
> John

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread John Drescher

On 7/11/07, Chris Morris <[EMAIL PROTECTED]> wrote:


Dan Langille wrote:
>
> You say the jobs fail.  What is the "failure"?  Error message?
>
>
Below, I've pasted in a failure notification email that Bacula
automatically sends.

   11-Jul 07:18 admin01-dir: Start Backup JobId 255,
Job=app11_BSD.2007-07-10_23.35.04
   11-Jul 01:19 app11-fd: DIR and FD clocks differ by -21547 seconds, FD
automatically adjusting.
   11-Jul 01:19 app11-fd: ClientRunBeforeJob: run command
"/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: mount:
/var/.snap/autogen_bkup.0: Resource temporarily unavailable
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: run command
"/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: run command
"/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup
/mnt/var"
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: mount: /dev/md0:
Input/output error
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: snapshot:ERROR: unable to
mount "/dev/md0" under "/mnt/var"
   11-Jul 01:24 app11-fd: app11_BSD.2007-07-10_23.35.04 Error:
Runscript: ClientRunBeforeJob returned non-zero status=1. ERR=Child
exited with code 1
   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Fatal error:
Bad response to ClientRunBeforeJob command: wanted 2000 OK RunBefore
   , got 2905 Bad RunBeforeJob command.




This says your ClientRunBeforeJob has failed because it could not perform
the mount of /dev/md0 on /mnt/var. Have you checked into that? Is bacula-fd
running as user bacula? Possibly this is a permissions issue.

John


Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Chris Morris
Dan Langille wrote:
>
> You say the jobs fail.  What is the "failure"?  Error message?
>
>   
Below, I've pasted in a failure notification email that Bacula 
automatically sends.

   11-Jul 07:18 admin01-dir: Start Backup JobId 255, 
Job=app11_BSD.2007-07-10_23.35.04
   11-Jul 01:19 app11-fd: DIR and FD clocks differ by -21547 seconds, FD 
automatically adjusting.
   11-Jul 01:19 app11-fd: ClientRunBeforeJob: run command 
"/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: mount: 
/var/.snap/autogen_bkup.0: Resource temporarily unavailable
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: run command 
"/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: run command 
"/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup 
/mnt/var"
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: mount: /dev/md0: 
Input/output error
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: snapshot:ERROR: unable to 
mount "/dev/md0" under "/mnt/var"
   11-Jul 01:24 app11-fd: app11_BSD.2007-07-10_23.35.04 Error: 
Runscript: ClientRunBeforeJob returned non-zero status=1. ERR=Child 
exited with code 1
   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Fatal error: 
Bad response to ClientRunBeforeJob command: wanted 2000 OK RunBefore
   , got 2905 Bad RunBeforeJob command.

   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Error: Bacula 
2.0.3 (06Mar07): 11-Jul-2007 07:23:21
  JobId:                  255
  Job:                    app11_BSD.2007-07-10_23.35.04
  Backup Level:           Differential, since=2007-07-10 07:32:06
  Client:                 "app11-fd" 2.0.3 (06Mar07) amd64-portbld-freebsd6.2,freebsd,6.2-RC1
  FileSet:                "defaultBSD" 2007-07-09 23:05:00
  Pool:                   "Default" (From Job resource)
  Storage:                "storage01" (From Job resource)
  Scheduled time:         10-Jul-2007 23:35:03
  Start time:             11-Jul-2007 07:18:09
  End time:               11-Jul-2007 07:23:21
  Elapsed time:           5 mins 12 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Volume name(s):
  Volume Session Id:      95
  Volume Session Time:    1183747854
  Last Volume Bytes:      403,356,186,192 (403.3 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:
  SD termination status:  OK
  Termination:            *** Backup Error ***








Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Dan Langille
On 11 Jul 2007 at 11:15, Chris Morris wrote:

> Since I've introduced FreeBSD snapshots into my Bacula plan, I've 
> started getting random backup failures.  A server will fail one day, and 
> back up just fine the next.  A server will back up fine one day and fail 
> the next.  ...all with no changes to the Bacula configuration.
> 
> Below, I've posted pertinent portions of configuration files and message 
> logs.  Please let me know if I need to supply any further information to 
> help track this down.

You say the jobs fail.  What is the "failure"?  Error message?

> 
> Pertinent portions of bacula-dir.conf only; sensitive information has 
> been replaced with:  <*REMOVED*>
> 
> JobDefs {
>   Name = "BSD"
>   Type = Backup
>   FileSet = "defaultBSD"
>   Storage = storage01
>   Messages = Standard
>   Pool = Default
>   ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
>   ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
>   ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var"
>   ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /usr:autogen_bkup /mnt/usr"
>   ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/var"
>   ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/usr"

I suggest creating scripts on the client, and moving these commands 
into those scripts.  It makes the JobDefs easier to read.  Sure, you 
have to copy stuff to the client, but I think that's cleaner.

YMMV.
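
As a rough illustration of the suggestion above, a client-side script 
could wrap the same six commands in two functions.  This is only a 
sketch: the function names and the SUDO/SNAPSHOT override variables are 
hypothetical, and only the command paths and arguments are taken from 
the JobDefs quoted in this message.

```shell
#!/bin/sh
# Sketch of a client-side wrapper (names here are illustrative
# assumptions).  SUDO and SNAPSHOT default to the paths used in the
# quoted JobDefs and can be overridden for a dry run.
SUDO=${SUDO-/usr/local/bin/sudo}
SNAPSHOT=${SNAPSHOT-/usr/local/sbin/snapshot}

# Make both snapshots, then mount both, in the same order as the
# JobDefs.  Stops at the first failing step with a non-zero status.
snapshot_before_job() {
    for fs in var usr; do
        $SUDO "$SNAPSHOT" make -g1 "/$fs:autogen_bkup" || return 1
    done
    for fs in var usr; do
        $SUDO "$SNAPSHOT" mount "/$fs:autogen_bkup" "/mnt/$fs" || return 1
    done
}

# Unmount both snapshots; try the second even if the first fails.
snapshot_after_job() {
    status=0
    for fs in var usr; do
        $SUDO "$SNAPSHOT" umount "/mnt/$fs" || status=1
    done
    return "$status"
}

# Dispatch on the first argument: "before" or "after".
case "${1-}" in
    before) snapshot_before_job ;;
    after)  snapshot_after_job ;;
esac
```

The JobDefs would then need only one ClientRunBeforeJob line calling the 
script with "before" and one ClientRunAfterJob line calling it with 
"after"; where the script lives on the client is up to you.  Running it 
with SUDO set to the empty string and SNAPSHOT=echo prints the commands 
without executing them.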

>   Priority = 10
> }
> 
> Typical job, as they are all nearly identical:
> 
> Job {
>   Name = "app06_BSD"
>   Client = "app06-fd"
>   Schedule = "MonCycle"
>   JobDefs = "BSD"
>   Write Bootstrap = "/var/db/bacula/app06.bsr"
> }
> 
> Typical client, as they are all nearly identical:
> 
> Client {
>   Name = app06-fd
>   Address = app06
>   FDPort = 9102
>   Catalog = MyCatalog
>   Password = "<*REMOVED*>"   # password for FileDaemon
>   File Retention = 30 days   # 30 days
>   Job Retention = 6 months   # six months
>   AutoPrune = yes            # Prune expired Jobs/Files
> }
> 
> My primary FileSet resource:
> 
> FileSet {
>   Name = "defaultBSD"
>   Include {
>     Options {
>       signature = MD5
>       compression = GZIP
>     }
>     File = /
>     File = /mnt/usr
>     File = /mnt/var
>   }
>   Exclude {
>     File = /proc
>     File = /tmp
>     File = /.journal
>     File = /.fsck
>   }
> }
> 
> Finally, I get the same message from my /var/log/messages file at every 
> failure.  The lines before and after this have nothing to do with the 
> backup.
> 
> Jul 11 08:19:07 app11 sudo:   <*REMOVED*> : TTY=unknown ;
> PWD=/usr/local/etc/rc.d ; USER=root ;
> COMMAND=/usr/local/sbin/snapshot make -g1 /var:autogen_bkup
> Jul 11 08:21:34 app11 kernel: fsync: giving up on dirty
> Jul 11 08:21:34 app11 kernel: 0xff005b07cd90: tag devfs, type VCHR
> Jul 11 08:21:34 app11 kernel: usecount 1, writecount 0, refcount 604
> mountedhere 0xff011f1ba200
> Jul 11 08:21:34 app11 kernel: flags ()
> Jul 11 08:21:34 app11 kernel: v_object 0xff005d3a ref 0
> pages 8572 
> Jul 11 08:21:34 app11 kernel: lock type devfs: EXCL (count 1) by
> thread 0xff00abc57980 (pid 46772)
> Jul 11 08:21:34 app11 kernel: dev da0s1d
> Jul 11 08:22:03 app11 sudo:   <*REMOVED*> : TTY=unknown ;
> PWD=/usr/local/etc/rc.d ; USER=root ;
> COMMAND=/usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
> Jul 11 08:24:13 app11 sudo:   <*REMOVED*> : TTY=unknown ;
> PWD=/usr/local/etc/rc.d ; USER=root ;
> COMMAND=/usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var
> Jul 11 08:24:13 app11 kernel: g_vfs_done():md0[READ(offset=65536,
> length=8192)]error = 5

This looks like an OS issue, not a Bacula issue.  I suggest following 
up on the FreeBSD mailing lists.

-- 
Dan Langille - http://www.langille.org/





[Bacula-users] Random Backup Failures

2007-07-11 Thread Chris Morris
Since I've introduced FreeBSD snapshots into my Bacula plan, I've 
started getting random backup failures.  A server will fail one day, and 
back up just fine the next.  A server will back up fine one day and fail 
the next.  ...all with no changes to the Bacula configuration.

Below, I've posted pertinent portions of configuration files and message 
logs.  Please let me know if I need to supply any further information to 
help track this down.

Pertinent portions of bacula-dir.conf only; sensitive information has 
been replaced with:  <*REMOVED*>

JobDefs {
  Name = "BSD"
  Type = Backup
  FileSet = "defaultBSD"
  Storage = storage01
  Messages = Standard
  Pool = Default
  ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
  ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
  ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var"
  ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /usr:autogen_bkup /mnt/usr"
  ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/var"
  ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/usr"
  Priority = 10
}

Typical job, as they are all nearly identical:

Job {
  Name = "app06_BSD"
  Client = "app06-fd"
  Schedule = "MonCycle"
  JobDefs = "BSD"
  Write Bootstrap = "/var/db/bacula/app06.bsr"
}

Typical client, as they are all nearly identical:

Client {
  Name = app06-fd
  Address = app06
  FDPort = 9102
  Catalog = MyCatalog
  Password = "<*REMOVED*>"   # password for FileDaemon
  File Retention = 30 days   # 30 days
  Job Retention = 6 months   # six months
  AutoPrune = yes            # Prune expired Jobs/Files
}

My primary FileSet resource:

FileSet {
  Name = "defaultBSD"
  Include {
    Options {
      signature = MD5
      compression = GZIP
    }
    File = /
    File = /mnt/usr
    File = /mnt/var
  }
  Exclude {
    File = /proc
    File = /tmp
    File = /.journal
    File = /.fsck
  }
}

Finally, I get the same message from my /var/log/messages file at every 
failure.  The lines before and after this have nothing to do with the 
backup.

Jul 11 08:19:07 app11 sudo:   <*REMOVED*> : TTY=unknown ;
PWD=/usr/local/etc/rc.d ; USER=root ;
COMMAND=/usr/local/sbin/snapshot make -g1 /var:autogen_bkup
Jul 11 08:21:34 app11 kernel: fsync: giving up on dirty
Jul 11 08:21:34 app11 kernel: 0xff005b07cd90: tag devfs, type VCHR
Jul 11 08:21:34 app11 kernel: usecount 1, writecount 0, refcount 604
mountedhere 0xff011f1ba200
Jul 11 08:21:34 app11 kernel: flags ()
Jul 11 08:21:34 app11 kernel: v_object 0xff005d3a ref 0
pages 8572 
Jul 11 08:21:34 app11 kernel: lock type devfs: EXCL (count 1) by
thread 0xff00abc57980 (pid 46772)
Jul 11 08:21:34 app11 kernel: dev da0s1d
Jul 11 08:22:03 app11 sudo:   <*REMOVED*> : TTY=unknown ;
PWD=/usr/local/etc/rc.d ; USER=root ;
COMMAND=/usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
Jul 11 08:24:13 app11 sudo:   <*REMOVED*> : TTY=unknown ;
PWD=/usr/local/etc/rc.d ; USER=root ;
COMMAND=/usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var
Jul 11 08:24:13 app11 kernel: g_vfs_done():md0[READ(offset=65536,
length=8192)]error = 5


