Hello,

On Thursday 13 November 2008 17:03:10, Pasi Kärkkäinen wrote:
> Hello list!
>
> I'm using Bacula 2.5.19 and trying 'copy jobs' feature to copy jobs from
> disk volumes/pools to tape.
>
> Sometimes bacula-sd seems to get stuck.. it hangs without doing anything.
> Now it happened when tape got full and Bacula started to change the tape on
> the drive (using autoloader):
>
> bacula-sd JobId 3082: Start Copying JobId 3082, Job=CopyPool4UncopiedToTape.2008-11-13_10.53.04.54
> bacula-sd JobId 3082: Using Device "IBM-LTO3-Drive"
> bacula-sd JobId 3082: Ready to read from volume "Pool4-Vol-0127" on device "FSDevice4" (/mnt/backup1/pool04).
> bacula-sd JobId 3082: Forward spacing Volume "Pool4-Vol-0127" to file:block 0:218.
> bacula-sd JobId 3082: End of Volume "756NNNL3" at 764:10067 on device "IBM-LTO3-Drive" (/dev/nst0). Write of 64512 bytes got -1.
> bacula-sd JobId 3082: Re-read of last block succeeded.
> bacula-sd JobId 3082: End of medium on Volume "756NNNL3" Bytes=725,237,130,240 Blocks=11,241,894 at 13-Nov-2008 11:51.
> bacula-sd JobId 3082: 3307 Issuing autochanger "unload slot 3, drive 0" command.
>
> <nothing happens after this>
>
>
> *sta
> Status available for:
>      1: Director
>      2: Storage
>      3: Client
>      4: All
> Select daemon type for status (1-4): 2
>
> ...
>
> Device status:
> Autochanger "IBM-LTO3-AutoChanger" with devices:
>    "IBM-LTO3-Drive" (/dev/nst0)
> Device "FSDevice0" (/mnt/backup1/pool00) is not open.
> Device "FSDevice1" (/mnt/backup1/pool01) is not open.
> Device "FSDevice2" (/mnt/backup1/pool02) is not open.
> Device "FSDevice3" (/mnt/backup1/pool03) is not open.
> Device "FSDevice4" (/mnt/backup1/pool04) is mounted with:
>     Volume: Pool4-Vol-0127
>     Pool: Pool4
>     Media type: File4
>     Total Bytes Read=1,649,507,328 Blocks Read=25,569 Bytes/block=64,512
>     Positioned at File=0 Block=1,649,507,534
> Device "IBM-LTO3-Drive" (/dev/nst0) is not open.
>     Device is being initialized.
>     Drive 0 is not loaded.
> ====
>
> Used Volume status:
>
> <hangs here and nothing happens>
>
>
> I can exit bconsole by pressing CTRL+C multiple times.. if I restart
> bconsole and run that again, it gets stuck again..
>
> I tried 'strace -p <pid>' to see what bacula-sd is doing:
>
> # strace -p 7339
> Process 7339 attached - interrupt to quit
> select(5, [4], NULL, NULL, NULL <unfinished ...>
> Process 7339 detached
>
> So.. bacula-sd seems to be stuck on select() ..
>
> Running 'mtx' seems to work fine.. at the same time when bacula-sd is
> stuck.
>
> # mtx -f /dev/sg3 status
> Storage Changer /dev/sg3:1 Drives, 8 Slots ( 0 Import/Export )
> Data Transfer Element 0:Empty
> Storage Element 1:Full :VolumeTag=179MMML3
> Storage Element 2:Full :VolumeTag=658NNNL3
> Storage Element 3:Full :VolumeTag=756NNNL3
> Storage Element 4:Full :VolumeTag=177MMML3
> Storage Element 5:Full :VolumeTag=655NNNL3
> Storage Element 6:Full :VolumeTag=656NNNL3
> Storage Element 7:Full :VolumeTag=657NNNL3
> Storage Element 8:Full :VolumeTag=CLNU38L1
>
>
> Any ideas how to fix this? Other than restarting Bacula..

Could you stop all daemons with a SIGSEGV to force a backtrace?

    killall -SEGV bacula-sd bacula-dir

(You will find two kinds of files, *traceback and *bactrace, in the working
directory.)

Afterwards, if you can post the results to a pastebin, they will give more
information about your problem.

Bye

> I don't see any IO errors in dmesg and/or messages.
>
> -- Pasi

_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
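
For anyone following the backtrace request in the reply above, here is a minimal shell sketch of the two steps: send the signal, then look for the resulting files. It is only an illustration under stated assumptions. The helper names `force_backtrace` and `list_traces`, the `BACULA_WORKDIR` variable, and its default path are hypothetical and not part of Bacula; the real working directory is whatever your daemon configuration points at. Only the `killall -SEGV` command and the `*traceback` / `*bactrace` file patterns come from the thread.

```shell
#!/bin/sh
# Sketch only: helper names, BACULA_WORKDIR and its default are assumptions.
# Set BACULA_WORKDIR to the working directory your Bacula daemons really use.
WORKDIR="${BACULA_WORKDIR:-/var/lib/bacula}"

# Send SIGSEGV to the storage daemon and director, as suggested in the thread.
# Note that this terminates both daemons.
force_backtrace() {
    killall -SEGV bacula-sd bacula-dir
}

# Print every *traceback and *bactrace file found directly in the given directory.
list_traces() {
    dir="$1"
    for f in "$dir"/*traceback "$dir"/*bactrace; do
        # An unmatched glob stays literal, so only print paths that exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

A typical run would call `force_backtrace`, wait a few seconds for the files to be written, and then call `list_traces "$WORKDIR"` to see what to upload.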
