On Wednesday 14 December 2005 04:22, Rick Knight wrote:
> Kern Sibbald wrote:
> >Hello,
> >
> >If you are able to reproduce this easily, could you turn on debug level 100
> >by putting -d100 on the command line when you start it, then capture the
> >output? This may help me understand what is going on.
> >
> >I've tried everything I can to duplicate this, but all my tests run fine.
> >
> >Hmmm. Normally, it wouldn't be the OS that is causing problems, but I'm
> > open to almost any suggestion -- the goal being to fix it ...
> >
> >On Tuesday 13 December 2005 22:07, James Peverill wrote:
> >>I am also getting this error since upgrading to 1.38 from the 1.36
> >>branch... been trying to figure out why for a few days now.
> >>
> >>I am also running slackware 10... I wonder if this is related.  Not
> >>running 2.4 kernel though, running 2.6.11.12 right now.  I am backing up
> >>to disk, not tapes.  Manually scheduled jobs run just fine, and
> >>automatic jobs run sometimes (the first job)... but most of the time
> >>they block indefinitely with "waiting to reserve a device".
> >>
> >>I figured I had just screwed up a configuration file somehow, but I have
> >>tweaked my configuration files with no improvement.  Maybe this is tied
> >>to a package in slackware 10?  Any other users running slackware 10 with
> >>1.38 successfully?
> >>
> >>james peverill
> >>
> >>Richard W. Knight wrote:
> >>>Kern Sibbald wrote:
> >>>>On Tuesday 13 December 2005 13:43, Rick Knight wrote:
> >>>>>Kern Sibbald wrote:
> >>>>>>Hello,
> >>>>>>
> >>>>>>On Monday 12 December 2005 19:35, Richard W. Knight wrote:
> >>>>>>>Hi all,
> >>>>>>>
> >>>>>>>A couple of weeks ago I upgraded my Bacula installation from 1.34 to
> >>>>>>>1.38.1. I made a few other changes at the same time and everything has
> >>>>>>>been working well since. Yesterday I decided to upgrade to 1.38.3. I
> >>>>>>>built from source using the same configuration options that I used to
> >>>>>>>build 1.38.1 with the addition of "--with-python". The build went OK,
> >>>>>>>no errors. I stopped 1.38.1 and started 1.38.3. Everything seemed to be
> >>>>>>>OK. I ran a couple of small test backups and there were no errors so I
> >>>>>>>assumed that the upgrade went fine. Last night the scheduled backup ran
> >>>>>>>and after the first job, instead of continuing on to the next job, I got
> >>>>>>>this message ...
> >>>>>>>
> >>>>>>>12-Dec 07:29 MyJob-SMB-sd: Job MyJob.2005-12-12_01.05.01 waiting to
> >>>>>>>reserve a device.
> >>>>>>>
> >>>>>>>This morning, when I first saw this message, I just did a mount from
> >>>>>>>bconsole and the job continued. I have Bacula configured for 6 jobs per
> >>>>>>>media and the tape wasn't full, so the job should have just started as it
> >>>>>>>always has. Now it's time to back up the catalog, to file, and I'm
> >>>>>>>getting the same message.
> >>>>>>>
> >>>>>>>I have an HP DDS2 drive, no changer, running on Slackware 10. I'm using
> >>>>>>>the same conf files that worked fine on 1.38.1. Can anyone tell me why
> >>>>>>>I'm now getting this error message?
> >>>>>>
> >>>>>>You are not by any chance running on a 2.4 kernel with /lib/tls?
> >>>>>>
> >>>>>>Could you send me your bacula-dir.conf and bacula-sd.conf along with the
> >>>>>>job report that shows the jobs blocking?
> >>>>>>
> >>>>>>>Thanks,
> >>>>>>>RickKnight
> >>>>>>>
> >>>>>
> >>>>>Thanks Kern,
> >>>>>
> >>>>>I am running a 2.4.26 kernel, but I don't know about /lib/tls. I don't
> >>>>>have a /lib/tls on my system. How can I tell?
> >>>>
> >>>>Try:
> >>>>
> >>>> ls -l /lib/tls
> >>>>
> >>>>If it exists, then that could explain why it *appears* that Bacula is
> >>>>not seeing some pthread broadcasts that would allow it to continue.
> >>>>This is a bit of a long shot, but at this point, I need to consider
> >>>>all possibilities ...
> >>>>
> >>>>In the meantime, I'll take a careful look at your config.  Perhaps I
> >>>>have missed something important that I can add to my test cases.  All
> >>>>my tests here succeeded perfectly ...
> >>>>
> >>>>By the way, getting the message that a job is waiting to reserve a
> >>>>drive is not in itself bad. This happens in my test case. However, at
> >>>>some point when the drive is available, the job should continue as it
> >>>>does in my test case.
> >>>>
> >>>>>Also, the .conf files are attached along with the log.
> >>>>>
> >>>>>Thanks again,
> >>>>>RickKnight
> >>>
> >>>Kern,
> >>>
> >>>I just got another job log email. Something I noticed is a clock
> >>>difference between the Director and File daemons. The two lines below
> >>>are from this morning's log (also attached). Could that be causing a
> >>>problem?
> >>>
> >>>12-Dec 19:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01
> >>>waiting to reserve a device.
> >>>13-Dec 04:33 knight-linux-fd: DIR and FD clocks differ by 676 seconds,
> >>>FD automatically adjusting.
> >>>
> >>>
> >>>Thanks again,
> >>>Rick Knight
> >>>
> >>>------------------------------------------------------------------------
> >>>
> >>>12-Dec 04:29 knight-linux-SMB-dir: Start Backup JobId 559, Job=Knight-Linux.2005-12-12_01.05.01
> >>>12-Dec 04:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01 waiting to reserve a device.
> >>>12-Dec 05:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01 waiting to reserve a device.
> >>>12-Dec 07:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01 waiting to reserve a device.
> >>>12-Dec 11:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01 waiting to reserve a device.
> >>>12-Dec 19:29 knight-linux-SMB-sd: Job Knight-Linux.2005-12-12_01.05.01 waiting to reserve a device.
> >>>13-Dec 04:33 knight-linux-fd: DIR and FD clocks differ by 676 seconds, FD automatically adjusting.
> >>>13-Dec 04:22 knight-linux-SMB-sd: Volume "DailyIncr-0011" previously written, moving to end of data.
> >>>13-Dec 04:22 knight-linux-SMB-sd: Ready to append to end of Volume "DailyIncr-0011" at file=4.
> >>>13-Dec 04:44 knight-linux-SMB-dir: Bacula 1.38.3 (09Dec05): 13-Dec-2005 04:44:48
> >>>  JobId:                  559
> >>>  Job:                    Knight-Linux.2005-12-12_01.05.01
> >>>  Backup Level:           Incremental, since=2005-12-11 01:07:20
> >>>  Client:                 "knight-linux-fd" i686-pc-linux-gnu,slackware,Slackware 9.0.0
> >>>  FileSet:                "Knight-Linux" 2005-11-21 21:06:17
> >>>  Pool:                   "DailyPool"
> >>>  Storage:                "HPSureStoreDAT-8"
> >>>  Scheduled time:         12-Dec-2005 01:05:00
> >>>  Start time:             12-Dec-2005 04:29:22
> >>>  End time:               13-Dec-2005 04:44:48
> >>>  Priority:               10
> >>>  FD Files Written:       262
> >>>  SD Files Written:       262
> >>>  FD Bytes Written:       675,038,414
> >>>  SD Bytes Written:       675,072,191
> >>>  Rate:                   7.7 KB/s
> >>>  Software Compression:   None
> >>>  Volume name(s):         DailyIncr-0011
> >>>  Volume Session Id:      3
> >>>  Volume Session Time:    1134331562
> >>>  Last Volume Bytes:      2,190,220,863
> >>>  Non-fatal FD errors:    0
> >>>  SD Errors:              0
> >>>  FD termination status:  OK
> >>>  SD termination status:  OK
> >>>  Termination:            Backup OK
> >>>
> >>>13-Dec 04:44 knight-linux-SMB-dir: Begin pruning Jobs.
> >>>13-Dec 04:44 knight-linux-SMB-dir: No Jobs found to prune.
> >>>13-Dec 04:44 knight-linux-SMB-dir: Begin pruning Files.
> >>>13-Dec 04:50 knight-linux-SMB-dir: Pruned Files from 2 Jobs for client knight-linux-fd from catalog.
> >>>13-Dec 04:50 knight-linux-SMB-dir: End auto prune.
>
> Kern,
>
> Here's some more info from my Director Status.
>
> Running Jobs:
>  JobId Level   Name                       Status
> ======================================================================
>    560 Full    BackupCatalog.2005-12-12_01.10.00 is waiting for higher priority jobs to finish
>    561 Increme  Knight-Linux_SMB.2005-12-13_01.05.00 is running
>    562 Increme  Knight-Linux.2005-12-13_01.05.01 is waiting on max Storage jobs
>    563 Full    BackupCatalog.2005-12-13_01.10.00 is waiting execution
> ====
>
> Catalog backs up to File and job 560 should be done. Also, at the time
> I'm looking at this, job 561 has finished so 562 should have just
> started, but I had to mount the volume in the drive. It should have
> already been mounted, unless the prior evening's Catalog job (runAfter is
> in Catalog only) is not finished because it could not eject the tape
> because the tape drive is busy with the next job? Or am I just confused?

I suspect that there are two problems here. 1. You probably don't have 
Maximum Concurrent Jobs set in your Director's Storage resource, and 2. it 
looks like there may be a problem with the way the SD in 1.38 is trying to 
open drives, which causes it to wait.  I'm working on a solution to that now.
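
As a rough sketch of point 1 only: the directive goes in the Storage resource
of bacula-dir.conf. The resource name below is taken from the job report; the
address, password, device, media type, and the value 2 are placeholders I am
assuming for illustration, not values from the actual configuration:

  # bacula-dir.conf (Director) -- illustrative fragment, placeholders marked
  Storage {
    Name = HPSureStoreDAT-8          # name as shown in the job report
    Address = sd.example.org         # placeholder
    Password = "xxxxxxxx"            # placeholder
    Device = "DDS-Drive"             # placeholder; must match bacula-sd.conf
    Media Type = DDS-2               # placeholder
    Maximum Concurrent Jobs = 2      # assumed value; adjust to your setup
  }

The Director has to be restarted, or told to reload its configuration from
bconsole, before a change like this takes effect.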

>
> Thanks,
> Rick Knight

-- 
Best regards,

Kern

  (">
  /\
  V_V


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
