1. Did you run the btape "test" command before using Bacula? 2. Did you do all the 9 steps suggested in the tape testing chapter of the manual?
On Wednesday 03 January 2007 19:32, Alan Davis wrote: > > I'm consistently running into a problem where whenever the SD has to > reposition the tape to start a new job it fails the tape because the > number of volume files it counts on the tape doesn't match the number > stored in the database for that volume. The files on tape is always 1 > less than the catalog entry. > > I can reliably reproduce the failure with the following steps. > > Setup: > set all but one tape to disabled in the pool to ensure I know which tape > will be chosen for backup > purge files/jobs from target tape - this sets it's status to purged > ensure that volume is set to recycle > > Test: > run job - this one creates 2 volfiles > check count of volfiles in catalog > unmount volume > remount volume - forces next job to seek to end of data > run job - the volume will be set to error because the SD finds 1 volume > file on tape and catalog says 2 > > Check tape EOM: > unmount volume > mtx load 1 0 > mt status - shows file marker at 0 > mt eom > mt status - shows file marker at 2 > (see transcript below for actual commands and output) > > Alternate test: > run job - creates 2 volfiles > run job - creates 2 volfiles > unmount volume > remount volume - forces next job to seek to end of data > run job - the volume will be set to error because the SD finds 3 volume > files on tape and catalog says 4 > > > I'm running 1.39.22 with DIR, SD and FD on Solaris 10 x86. > The configuration passes btape test and autochanger. > The drives are HP TZ89 on this system and HP LTO-3 on a second SD server > that is exhibiting the same problem. > > It seems that the SD isn't finding itself at the right location after > the > seek to EOM. What /should/ the file location be from mt status after > seeking to EOM with 2 volfiles written to tape? > > I'm using the following directives in the Device entry (wrapped > incorrectly by the emailer, of course): > > Device { > Name = Drive-0 # > Drive Index = 0 > Media Type = DLT-8000 > Archive Device = /dev/rmt/0cbn > AutomaticMount = yes; # when device opened, read it > AutoSelect = yes; > AlwaysOpen = yes; > RemovableMedia = yes; > RandomAccess = no; > AutoChanger = yes > Changer Command = "/opt/bacula/bin/mtx-changer %c %o %S %a %d" > Changer Device = /dev/changer > # Enable the Alert command only if you have the mtx package loaded > Alert Command = "/sbin/sh -c '/usr/local/sbin/tapeinfo -f %c > |/usr/bin/grep TapeAlert|/usr/bin/cat'" > Offline On Unmount = no > Hardware End of Medium = yes > BSF at EOM = yes > Backward Space Record = yes > Backward Space File = yes > Fast Forward Space File = yes > Use MTIOCGET = yes > TWO EOF = yes > LabelMedia = yes; # lets Bacula label unlabeled > media > Maximum Network Buffer Size = 65536 > } > > Do I have a combination of settings that works under btape but is > incorrect for the logic of the SD? > > What tools does bacula provide that I can use to track this down > further? > > Here's a transcript of a session that reproduces the problem: > > ### Begin transcript > # bconsole > Connecting to Director labtape.ruckus.com:9101 > 1000 OK: athos-dir Version: 1.39.22 (08 Septermber 2006) > Enter a period to cancel a command. > *list nextvol > The defined Job resources are: > 1: TestBackup > 2: Sol10x86FullBackup > 3: u2FullBackupJob > 4: WinXPFullBackup > 5: Client1 > 6: BackupCatalog > 7: RestoreFiles > Select Job resource (1-7): 1 > The next Volume to be used by Job "TestBackup" (Default, Incremental) > will be EJB084 > You have messages. > *messages > 03-Jan 10:47 athos-dir: Recycled volume "EJB084" > *run > A job name must be specified. > The defined Job resources are: > 1: TestBackup > 2: Sol10x86FullBackup > 3: u2FullBackupJob > 4: WinXPFullBackup > 5: Client1 > 6: BackupCatalog > 7: RestoreFiles > Select Job resource (1-7): 1 > Run Backup job > JobName: TestBackup > FileSet: TestFileSet1 > Level: Full > Client: athos-fd > Storage: TL800 > Pool: Default > When: 2007-01-03 11:08:13 > Priority: 10 > OK to run? (yes/mod/no): yes > Job started. JobId=54 > * > 03-Jan 11:08 athos-dir: Start Backup JobId 54, > Job=TestBackup.2007-01-03_11.08.16 > 03-Jan 11:08 athos-sd: Recycled volume "EJB084" on device "Drive-0" > (/dev/rmt/0cbn), all previous data lost. > * > You have messages. > *messages > 03-Jan 11:15 athos-sd: Job write elapsed time = 00:07:17, Transfer rate > = 3.286 M bytes/second > 03-Jan 11:15 athos-dir: Bacula 1.39.22 (09Sep06): 03-Jan-2007 11:15:46 > JobId: 54 > Job: TestBackup.2007-01-03_11.08.16 > Backup Level: Full > Client: "athos-fd" i386-pc-solaris2.10,solaris,5.10 > FileSet: "TestFileSet1" 2006-11-17 17:33:12 > Pool: "Default" (From Job resource) > Storage: "TL800" (From Job resource) > Scheduled time: 03-Jan-2007 11:08:13 > Start time: 03-Jan-2007 11:08:18 > End time: 03-Jan-2007 11:15:46 > Elapsed time: 7 mins 28 secs > Priority: 10 > FD Files Written: 16,470 > SD Files Written: 16,470 > FD Bytes Written: 1,433,764,300 (1.433 GB) > SD Bytes Written: 1,436,193,872 (1.436 GB) > Rate: 3200.4 KB/s > Software Compression: None > Volume name(s): EJB084 > Volume Session Id: 2 > Volume Session Time: 1167801329 > Last Volume Bytes: 1,437,907,968 (1.437 GB) > Non-fatal FD errors: 0 > SD Errors: 0 > FD termination status: OK > SD termination status: OK > Termination: Backup OK > > 03-Jan 11:15 athos-dir: Begin pruning Jobs. > 03-Jan 11:15 athos-dir: No Jobs found to prune. > 03-Jan 11:15 athos-dir: Begin pruning Files. > 03-Jan 11:15 athos-dir: Pruned Files from 1 Jobs for client athos-fd > from catalog. > 03-Jan 11:15 athos-dir: End auto prune. > > *sqlquery > Entering SQL query mode. > Terminate each query with a semicolon. > Terminate query mode with a blank line. > Enter SQL query: select mediaid, volumename, volfiles from media where > Add to SQL query: volumename='EJB084'; > +---------+------------+----------+ > | mediaid | volumename | volfiles | > +---------+------------+----------+ > | 12 | EJB084 | 2 | > +---------+------------+----------+ > Enter SQL query: > End query mode. > *unmount > The defined Storage resources are: > 1: File > 2: TL800 > Select Storage resource (1-2): 2 > Connecting to Storage daemon TL800 at labtape.ruckus.com:9103 ... > Enter autochanger drive[0]: > 3307 Issuing autochanger "unload slot 1, drive 0" command. > 3002 Device "Drive-0" (/dev/rmt/0cbn) unmounted. > *mount > The defined Storage resources are: > 1: File > 2: TL800 > Select Storage resource (1-2): 2 > Enter autochanger drive[0]: > Enter autochanger slot: 1 > 3301 Issuing autochanger "loaded? drive 0" command. > 3302 Autochanger "loaded? drive 0", result: nothing loaded. > 3304 Issuing autochanger "load slot 1, drive 0" command. > 3305 Autochanger "load slot 1, drive 0", status is OK. > 3301 Issuing autochanger "loaded? drive 0" command. > 3302 Autochanger "loaded? drive 0", result is Slot 1. > 3001 Mounted Volume: EJB084 > 3001 Device "Drive-0" (/dev/rmt/0cbn) is mounted with Volume "EJB084" > *sqlquery > Entering SQL query mode. > Terminate each query with a semicolon. > Terminate query mode with a blank line. > Enter SQL query: select mediaid, volumename, volfiles from media where > Add to SQL query: volumename='EJB084'; > +---------+------------+----------+ > | mediaid | volumename | volfiles | > +---------+------------+----------+ > | 12 | EJB084 | 2 | > +---------+------------+----------+ > Enter SQL query: > End query mode. > *run > A job name must be specified. > The defined Job resources are: > 1: TestBackup > 2: Sol10x86FullBackup > 3: u2FullBackupJob > 4: WinXPFullBackup > 5: Client1 > 6: BackupCatalog > 7: RestoreFiles > Select Job resource (1-7): 3 > Run Backup job > JobName: u2FullBackupJob > FileSet: u2FullBackupFileSet > Level: Full > Client: u2-fd > Storage: TL800 > Pool: Default > When: 2007-01-03 11:21:28 > Priority: 10 > OK to run? (yes/mod/no): yes > Job started. JobId=55 > * > *sqlquery > Entering SQL query mode. > Terminate each query with a semicolon. > Terminate query mode with a blank line. > Enter SQL query: select mediaid, volumename, volfiles from media where > Add to SQL query: volumename='EJB084'; > +---------+------------+----------+ > | mediaid | volumename | volfiles | > +---------+------------+----------+ > | 12 | EJB084 | 2 | > +---------+------------+----------+ > Enter SQL query: > End query mode. > You have messages. > *messages > 03-Jan 11:21 athos-dir: Start Backup JobId 55, > Job=u2FullBackupJob.2007-01-03_11.21.32 > 03-Jan 11:21 athos-sd: 3301 Issuing autochanger "loaded? drive 0" > command. > 03-Jan 11:21 athos-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 1. > 03-Jan 11:21 athos-sd: Volume "EJB084" previously written, moving to end > of data. > * > 03-Jan 11:22 athos-sd: u2FullBackupJob.2007-01-03_11.21.32 Error: I > cannot write on Volume "EJB084" because: > The number of files mismatch! Volume=1 Catalog=2 > 03-Jan 11:22 athos-sd: Marking Volume "EJB084" in Error in Catalog. > *sqlquery > Entering SQL query mode. > Terminate each query with a semicolon. > Terminate query mode with a blank line. > Enter SQL query: select mediaid, volumename, volfiles from media where > Add to SQL query: volumename='EJB084'; > +---------+------------+----------+ > | mediaid | volumename | volfiles | > +---------+------------+----------+ > | 12 | EJB084 | 2 | > +---------+------------+----------+ > Enter SQL query: > End query mode. > You have messages. > * > 03-Jan 11:24 athos-sd: Job u2FullBackupJob.2007-01-03_11.21.32 waiting. > Cannot find any appendable volumes. > Please use the "label" command to create a new Volume for: > Storage: "Drive-0" (/dev/rmt/0cbn) > Media type: DLT-8000 > Pool: Default > *cancel > Automatically selected Job: JobId=55 > Job=u2FullBackupJob.2007-01-03_11.21.32 > Confirm cancel (yes/no): yes > 2001 Job u2FullBackupJob.2007-01-03_11.21.32 marked to be canceled. > 3000 Job u2FullBackupJob.2007-01-03_11.21.32 marked to be canceled. > * > 03-Jan 11:26 athos-sd: Job u2FullBackupJob.2007-01-03_11.21.32 canceled > while waiting for mount on Storage Device ""Drive-0" (/dev/rmt/0cbn)". > 03-Jan 11:26 athos-dir: Bacula 1.39.22 (09Sep06): 03-Jan-2007 11:26:25 > JobId: 55 > Job: u2FullBackupJob.2007-01-03_11.21.32 > Backup Level: Full > Client: "u2-fd" i686-pc-linux-gnu,suse,9 > FileSet: "u2FullBackupFileSet" 2007-01-03 00:20:14 > Pool: "Default" (From Job resource) > Storage: "TL800" (From Job resource) > Scheduled time: 03-Jan-2007 11:21:28 > Start time: 03-Jan-2007 11:21:35 > End time: 03-Jan-2007 11:26:25 > Elapsed time: 4 mins 50 secs > Priority: 10 > FD Files Written: 0 > SD Files Written: 0 > FD Bytes Written: 0 (0 B) > SD Bytes Written: 0 (0 B) > Rate: 0.0 KB/s > Software Compression: None > Volume name(s): > Volume Session Id: 3 > Volume Session Time: 1167801329 > Last Volume Bytes: 0 (0 B) > Non-fatal FD errors: 0 > SD Errors: 0 > FD termination status: Canceled > SD termination status: Canceled > Termination: Backup Canceled > > *umount > The defined Storage resources are: > 1: File > 2: TL800 > Select Storage resource (1-2): 2 > Enter autochanger drive[0]: > 3307 Issuing autochanger "unload slot 1, drive 0" command. > 3002 Device "Drive-0" (/dev/rmt/0cbn) unmounted. > *quit > # mtx status > Storage Changer /dev/changer:2 Drives, 10 Slots ( 0 Import/Export ) > Data Transfer Element 0:Empty > Data Transfer Element 1:Full (Storage Element 3 Loaded):VolumeTag = > EJB069 > Storage Element 1:Full :VolumeTag=EJB084 > Storage Element 2:Full :VolumeTag=DAI296 > Storage Element 3:Empty > Storage Element 4:Full :VolumeTag=EJB067 > Storage Element 5:Full :VolumeTag=EJB065 > Storage Element 6:Full :VolumeTag=EJB066 > Storage Element 7:Full :VolumeTag=EJB094 > Storage Element 8:Full :VolumeTag=EJB059 > Storage Element 9:Full :VolumeTag=DAI259 > Storage Element 10:Full > # mtx load 1 0 > Loading media from Storage Element 1 into drive 0...done > # mt status > DEC DLT TZ89 tape drive: > sense key(0x6)= Unit Attention residual= 0 retries= 0 > file no= 0 block no= 0 > # mt status > DEC DLT TZ89 tape drive: > sense key(0x0)= No Additional Sense residual= 0 retries= 0 > file no= 0 block no= 0 > #mt eom > #mt status > DEC DLT TZ89 tape drive: > sense key(0x0)= No Additional Sense residual= 0 retries= 0 > file no= 2 block no= 0 > > ### End Transcript > ---- > Alan Davis > Senior Architect > Ruckus Network, Inc. > 703.464.6578 (o) > 410.365.7175 (m) > [EMAIL PROTECTED] > alancdavis AIM > > > > > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users > ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys - and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users