Hi,

My bacula-sd is deadlocking during copy jobs.

Version: 5.0.1
Compile options: --with-readline=/usr/include/readline --disable-conio 
--with-mysql --enable-smartalloc
Linux version: i686-pc-linux-gnu debian 5.0.4

Every day I copy every job made that night from 2 disk pools to a migrate pool 
using copy jobs (pool uncopied).
The migrate pool uses an autochanger with two drives. The bacula-sd autochanger 
config is included below.

Both pools load-balance their jobs over the 2 drives (using the Maximum 
Concurrent Jobs = 1 setting in bacula-sd), as expected.

After a while the load on both the dir and the sd drops to 0.

When I run status stor in the console I see the following (it stops at 
"Used Volume status:" and waits forever):


*status stor
The defined Storage resources are:
     1: Migrate
     2: diskbackup
     3: diskbackup2
Select Storage resource (1-3): 1
Connecting to Storage daemon Migrate at bacula-sd.solcon.nl:9103

bacula-sd Version: 5.0.1 (24 February 2010) i686-pc-linux-gnu debian 5.0.4
Daemon started 08-Mar-10 10:07, 20 Jobs run since started.
 Heap: heap=2,367,488 smbytes=1,724,139 max_bytes=1,985,497 bufs=250 
max_bufs=293
Sizes: boffset_t=8 size_t=4 int32_t=4 int64_t=8

Running Jobs:
Reading: Full Copy job D2D2T2 JobId=131743 Volume="disk2-1265"
    pool="Disk2-Pool" device="diskbackup2" (/bacula/diskbackup2)
    Files=3,337 Bytes=45,779,673 Bytes/sec=140,428
    FDSocket closed
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
===================================================================
131559  Full        184    70.61 M  OK       08-Mar-10 11:20 D2D2T
131561  Full        215    32.03 M  OK       08-Mar-10 11:22 D2D2T
131735  Full        233    1.541 G  OK       08-Mar-10 11:23 D2D2T2
131563  Full         41    2.123 M  OK       08-Mar-10 11:25 D2D2T
131737  Full        118    241.3 M  OK       08-Mar-10 11:28 D2D2T2
131565  Full     21,836    239.7 M  OK       08-Mar-10 11:30 D2D2T
131739  Full      2,069    596.7 M  OK       08-Mar-10 11:32 D2D2T2
131567  Full        122    315.9 M  OK       08-Mar-10 11:34 D2D2T
131741  Full        141    2.779 M  OK       08-Mar-10 11:35 D2D2T2
131569  Full        187    20.59 M  OK       08-Mar-10 11:37 D2D2T
====

Device status:
Autochanger "TandbergT40" with devices:
   "Drive-1" (/dev/st0)
   "Drive-2" (/dev/st1)
Device "Drive-1" (/dev/st0) is mounted with:
    Volume:      B4MO03
    Pool:        Migrate-Pool
    Media type:  LTO-4
    Slot 19 is loaded in drive 0.
    Total Bytes=4,032,451,584 Blocks=62,506 Bytes/block=64,513
    Positioned at File=8 Block=0
Device "Drive-2" (/dev/st1) is mounted with:
    Volume:      B4MO01
    Pool:        Migrate-Pool
    Media type:  LTO-4
    Slot 22 is loaded in drive 1.
    Total Bytes=3,954,521,088 Blocks=61,298 Bytes/block=64,513
    Positioned at File=17 Block=0
Device "diskbackup" (/bacula/diskbackup) is not open.
Device "diskrestore" (/bacula/diskbackup) is not open.
Device "diskbackup2" (/bacula/diskbackup2) is mounted with:
    Volume:      disk2-1265
    Pool:        *unknown*
    Media type:  File
    Total Bytes Read=0 Blocks Read=0 Bytes/block=0
    Positioned at File=0 Block=1,575,192,713
====

Used Volume status:



Restarting bacula-sd is the only way to get it working again.

I tried running bacula-sd manually under gdb, but gdb does not show anything 
useful, only output like this:

[New Thread 0xb61e0b90 (LWP 8026)]
[New Thread 0xb57dfb90 (LWP 8027)]
[New Thread 0xb4fdeb90 (LWP 8028)]
[New Thread 0xb47ddb90 (LWP 8029)]
[Thread 0xb57dfb90 (LWP 8027) exited]
[Thread 0xb61e0b90 (LWP 8026) exited]
[Thread 0xb47ddb90 (LWP 8029) exited]
[Thread 0xb4fdeb90 (LWP 8028) exited]
[New Thread 0xb4fdeb90 (LWP 8046)]
[Thread 0xb4fdeb90 (LWP 8046) exited]
[New Thread 0xb4fdeb90 (LWP 8047)]
[Thread 0xb4fdeb90 (LWP 8047) exited]
[New Thread 0xb4fdeb90 (LWP 8050)]
[Thread 0xb4fdeb90 (LWP 8050) exited]
[New Thread 0xb4fdeb90 (LWP 8051)]
[New Thread 0xb47ddb90 (LWP 8052)]
[Thread 0xb47ddb90 (LWP 8052) exited]
[New Thread 0xb47ddb90 (LWP 8053)]
[New Thread 0xb61e0b90 (LWP 8062)]
[New Thread 0xb57dfb90 (LWP 8063)]
[Thread 0xb57dfb90 (LWP 8063) exited]
[Thread 0xb61e0b90 (LWP 8062) exited]
[New Thread 0xb57dfb90 (LWP 8064)]
[Thread 0xb57dfb90 (LWP 8064) exited]
[New Thread 0xb57dfb90 (LWP 8076)]


What could the problem be, and how do I make a good trace using gdb? I tried the 
way described in the manual:
http://bacula.org/5.0.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000

I don't understand the part about "thread apply all bt".
Please help me out.
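
For reference, my understanding of that manual step is that you attach gdb to
the already running daemon once it hangs and ask for a backtrace of every
thread. A rough sketch of what I think is meant, assuming gdb and pidof are
installed and only one bacula-sd process is running. trace_cmd is a name I made
up, not something from Bacula, and I may well be wrong here:

```shell
#!/bin/sh
# Sketch only: trace_cmd is a made-up helper, not part of Bacula.
# It prints a gdb command line that attaches to a running process (-p PID),
# runs "thread apply all bt" (a backtrace of every thread) in batch mode,
# and then exits, which detaches gdb from the process again.
trace_cmd() {
    printf 'gdb -p %s -batch -ex "thread apply all bt"' "$1"
}

# Usage once the daemon hangs (as root), assuming a single bacula-sd process:
#   sh -c "$(trace_cmd "$(pidof bacula-sd)")" > /tmp/bacula-sd.trace 2>&1
```

If this is right, the resulting file should contain one stack per thread, which
I could then post here.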

Thanks and regards,

Jan Jaap

bacula-sd config, autochanger part:


Autochanger {
  Name = TandbergT40
  Device = Drive-1
  Device = Drive-2
  Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg3
}

Device {
  Name = Drive-1                      #
  Drive Index = 0
  Media Type = LTO-4
  Archive Device = /dev/st0
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Alert Command = "/bin/sh -c '/usr/sbin/smartctl -H -l error %c'"
  Spool Directory = /bacula/spool
  Maximum Concurrent Jobs = 1
}

Device {
  Name = Drive-2                      #
  Drive Index = 1
  Media Type = LTO-4
  Archive Device = /dev/st1
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Alert Command = "/bin/sh -c '/usr/sbin/smartctl -H -l error %c'"
  Spool Directory = /bacula/spool
  Maximum Concurrent Jobs = 1
}

                                          
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
