On 2011-07-11 06:13, Martin Simmons wrote: >>>>>> On Sun, 10 Jul 2011 12:17:55 +0000, Steve Costaras said: >> Importance: Normal >> Sensitivity: Normal >> >> I am trying a full backup/multi-job to a single client and all was going >> well until this morning when I received the error below. All other jobs >> were also canceled. >> >> My question is two fold: >> >> 1) What the heck is this error? I can unmount the drive, issue a rawfill to >> the tape w/ btape and no problems? >> ... >> 3000 OK label. VolBytes=1024 DVD=0 Volume="FA0016" Device="LTO4" (/dev/nst0) >> Requesting to mount LTO4 ... >> 3905 Bizarre wait state 7 >> Do not forget to mount the drive!!! >> 2011-07-10 03SD-loki JobId 6: Wrote label to prelabeled Volume "FA0016" on >> device "LTO4" (/dev/nst0) >> 2011-07-10 03SD-loki JobId 6: New volume "FA0016" mounted on device "LTO4" >> (/dev/nst0) at 10-Jul-2011 03:51. >> 2011-07-10 03SD-loki JobId 6: Fatal error: block.c:439 Attempt to write on >> read-only Volume. dev="LTO4" (/dev/nst0) >> 2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0016" Bytes=1,024 >> Blocks=0 at 10-Jul-2011 03:51. >> 2011-07-10 03SD-loki JobId 6: Fatal error: Job 6 canceled. >> 2011-07-10 03SD-loki JobId 6: Fatal error: device.c:192 Catastrophic error. >> Cannot write overflow block to device "LTO4" (/dev/nst0). ERR=Input/output >> error > Do you regularly see the "3905 Bizarre wait state 7" message? It could be an > indication of problems (and everything after that could be a consequence of > it). > > What are the messages that lead up to that point? Nothing, really, this was the 17th tape in a row on a ~3day (so far) backup. No messages in /var/log/messages. Previous messages from bacula are below as you can see it just blows chunks right after FA0016 is mounted, all concurrent jobs are killed. And I've tested that tape before the backup ran and again right after this failure with btape. no problems.
----------- *label storage=LTO4 pool=BackupSetFA volume=FA0015 Connecting to Storage daemon LTO4 at loki:9103 ... Sending label command for Volume "FA0015" Slot 14 ... 3000 OK label. VolBytes=1024 DVD=0 Volume="FA0015" Device="LTO4" (/dev/nst0) Requesting to mount LTO4 ... 3001 Device "LTO4" (/dev/nst0) is mounted with Volume "FA0015" * 2011-07-10 00SD-loki JobId 3: Wrote label to prelabeled Volume "FA0015" on device "LTO4" (/dev/nst0) 2011-07-10 00SD-loki JobId 3: New volume "FA0015" mounted on device "LTO4" (/dev/nst0) at 10-Jul-2011 00:48. * 2011-07-10 00SD-loki JobId 3: Despooling elapsed time = 01:21:56, Transfer rate = 70.98 M Bytes/second 2011-07-10 00SD-loki JobId 3: Alert: smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen 2011-07-10 00SD-loki JobId 3: Alert: Home page is http://smartmontools.sourceforge.net/ 2011-07-10 00SD-loki JobId 3: Alert: 2011-07-10 00SD-loki JobId 3: Alert: TapeAlert: OK 2011-07-10 00SD-loki JobId 3: Alert: 2011-07-10 00SD-loki JobId 3: Alert: Error counter log: 2011-07-10 00SD-loki JobId 3: Alert: Errors Corrected by Total Correction Gigabytes Total 2011-07-10 00SD-loki JobId 3: Alert: ECC rereads/ errors algorithm processed uncorrected 2011-07-10 00SD-loki JobId 3: Alert: fast | delayed rewrites corrected invocations [10^9 bytes] errors 2011-07-10 00SD-loki JobId 3: Alert: read: 0 0 0 0 0 0.000 0 2011-07-10 00SD-loki JobId 3: Alert: write: 3010 0 3010 3010 3010 0.000 0 2011-07-10 00SD-loki JobId 3: Sending spooled attrs to the Director. Despooling 65,784,417 bytes ... 2011-07-10 00DIR-loki JobId 3: Bacula DIR-loki 5.0.3 (04Aug10): 10-Jul-2011 00:58:04 Build OS: x86_64-unknown-linux-gnu ubuntu 10.04 JobId: 3 Job: JOB-loki_var_ftp_.2011-07-07_17.45.00_05 Backup Level: Full Client: "FD-loki" 5.0.3 (04Aug10) x86_64-unknown-linux-gnu,ubuntu,10.04 FileSet: "FS-loki_var_ftp_" 2011-07-06 18:00:00 Pool: "BackupSetFA" (From Run FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "LTO4" (From Pool resource) Scheduled time: 07-Jul-2011 17:45:00 Start time: 07-Jul-2011 17:50:30 End time: 10-Jul-2011 00:58:04 Elapsed time: 2 days 7 hours 7 mins 34 secs Priority: 50 FD Files Written: 186,287 SD Files Written: 186,287 FD Bytes Written: 2,925,298,735,317 (2.925 TB) SD Bytes Written: 2,925,332,067,132 (2.925 TB) Rate: 14740.4 KB/s Software Compression: None VSS: no Encryption: no Accurate: yes Volume name(s): FA0001|FA0002|FA0005|FA0006|FA0010|FA0011|FA0014|FA0015 Volume Session Id: 4 Volume Session Time: 1310078212 Last Volume Bytes: 35,227,823,104 (35.22 GB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination: Backup OK 2011-07-10 00DIR-loki JobId 3: Begin pruning Jobs older than 1 year 25 days . 2011-07-10 00DIR-loki JobId 3: No Jobs found to prune. 2011-07-10 00DIR-loki JobId 3: Begin pruning Jobs. 2011-07-10 00DIR-loki JobId 3: No Files found to prune. 2011-07-10 00DIR-loki JobId 3: End auto prune. 2011-07-10 03SD-loki JobId 6: End of Volume "FA0015" at 98:1604 on device "LTO4" (/dev/nst0). Write of 2097152 bytes got -1. 2011-07-10 03SD-loki JobId 6: Re-read of last block succeeded. 2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0015" Bytes=844,973,861,888 Blocks=402,915 at 10-Jul-2011 03:30. 2011-07-10 03SD-loki JobId 6: Please mount Volume "FA0016" or label a new one for: Job: JOB-loki_var_ftp_pub_Multimedia_DVD.2011-07-07_17.45.01_08 Storage: "LTO4" (/dev/nst0) Pool: BackupSetFA Media type: LTO4 *umount storage=LTO4 3001 Device "LTO4" (/dev/nst0) unmounted. *label storage=LTO4 pool=BackupSetFA volume=FA0016 Connecting to Storage daemon LTO4 at loki:9103 ... Sending label command for Volume "FA0016" Slot 15 ... 3000 OK label. VolBytes=1024 DVD=0 Volume="FA0016" Device="LTO4" (/dev/nst0) Requesting to mount LTO4 ... 3905 Bizarre wait state 7 Do not forget to mount the drive!!! 2011-07-10 03SD-loki JobId 6: Wrote label to prelabeled Volume "FA0016" on device "LTO4" (/dev/nst0) 2011-07-10 03SD-loki JobId 6: New volume "FA0016" mounted on device "LTO4" (/dev/nst0) at 10-Jul-2011 03:51. 2011-07-10 03SD-loki JobId 6: Fatal error: block.c:439 Attempt to write on read-only Volume. dev="LTO4" (/dev/nst0) 2011-07-10 03SD-loki JobId 6: End of medium on Volume "FA0016" Bytes=1,024 Blocks=0 at 10-Jul-2011 03:51. 2011-07-10 03SD-loki JobId 6: Fatal error: Job 6 canceled. 2011-07-10 03SD-loki JobId 6: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "LTO4" (/dev/nst0). ERR=Input/output error * 2011-07-10 03SD-loki JobId 6: Despooling elapsed time = 02:32:53, Transfer rate = 93.64 M Bytes/second 2011-07-10 03SD-loki JobId 6: Job write elapsed time = 57:37:54, Transfer rate = 8.278 M Bytes/second 2011-07-10 03FD-loki JobId 6: Error: bsock.c:393 Write error sending 65536 bytes to Storage daemon:loki:9103: ERR=Connection reset by peer 2011-07-10 03FD-loki JobId 6: Fatal error: backup.c:1024 Network send error to SD. ERR=Connection reset by peer 2011-07-10 03SD-loki JobId 7: Fatal error: block.c:439 Attempt to write on read-only Volume. dev="LTO4" (/dev/nst0) 2011-07-10 03SD-loki JobId 7: Fatal error: spool.c:301 Fatal append error on device "LTO4" (/dev/nst0): ERR=block.c:1015 Read zero bytes at 0:0 on device "LTO4" (/dev/nst0). 2011-07-10 03SD-loki JobId 7: Despooling elapsed time = 00:00:01, Transfer rate = 858.9 G Bytes/second * 2011-07-10 03DIR-loki JobId 6: Error: Bacula DIR-loki 5.0.3 (04Aug10): 10-Jul-2011 03:52:08 Build OS: x86_64-unknown-linux-gnu ubuntu 10.04 JobId: 6 Job: JOB-loki_var_ftp_pub_Multimedia_DVD.2011-07-07_17.45.01_08 Backup Level: Full Client: "FD-loki" 5.0.3 (04Aug10) x86_64-unknown-linux-gnu,ubuntu,10.04 FileSet: "FS-loki_var_ftp_pub_Multimedia_DVD" 2011-07-06 18:00:01 Pool: "BackupSetFA" (From Run FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "LTO4" (From Pool resource) Scheduled time: 07-Jul-2011 17:45:01 Start time: 07-Jul-2011 17:50:30 End time: 10-Jul-2011 03:52:08 Elapsed time: 2 days 10 hours 1 min 38 secs Priority: 50 FD Files Written: 452 SD Files Written: 452 FD Bytes Written: 1,717,640,639,816 (1.717 TB) SD Bytes Written: 1,717,632,388,872 (1.717 TB) Rate: 8222.4 KB/s Software Compression: None VSS: no Encryption: no Accurate: yes Volume name(s): FA0011|FA0012|FA0015 Volume Session Id: 6 Volume Session Time: 1310078212 Last Volume Bytes: 1,024 (1.024 KB) Non-fatal FD errors: 1 SD Errors: 0 FD termination status: Error SD termination status: Error Termination: *** Backup Error *** * 2011-07-10 03SD-loki JobId 7: Job write elapsed time = 57:26:37, Transfer rate = 16.61 M Bytes/second 2011-07-10 03FD-loki JobId 7: Error: bsock.c:393 Write error sending 65536 bytes to Storage daemon:loki:9103: ERR=Connection reset by peer 2011-07-10 03FD-loki JobId 7: Fatal error: backup.c:1024 Network send error to SD. ERR=Connection reset by peer 2011-07-10 03SD-loki JobId 5: Fatal error: block.c:439 Attempt to write on read-only Volume. dev="LTO4" (/dev/nst0) 2011-07-10 03SD-loki JobId 5: Fatal error: spool.c:301 Fatal append error on device "LTO4" (/dev/nst0): ERR=block.c:1015 Read zero bytes at 0:0 on device "LTO4" (/dev/nst0). 2011-07-10 03SD-loki JobId 5: Despooling elapsed time = 00:00:01, Transfer rate = 141.9 G Bytes/second 2011-07-10 03DIR-loki JobId 7: Error: Bacula DIR-loki 5.0.3 (04Aug10): 10-Jul-2011 03:57:49 Build OS: x86_64-unknown-linux-gnu ubuntu 10.04 JobId: 7 Job: JOB-loki_var_ftp_pub_mirrors.2011-07-07_17.45.01_09 Backup Level: Full Client: "FD-loki" 5.0.3 (04Aug10) x86_64-unknown-linux-gnu,ubuntu,10.04 FileSet: "FS-loki_var_ftp_pub_mirrors" 2011-07-06 18:00:01 Pool: "BackupSetFA" (From Run FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "LTO4" (From Pool resource) Scheduled time: 07-Jul-2011 17:45:01 Start time: 07-Jul-2011 17:50:31 End time: 10-Jul-2011 03:57:49 Elapsed time: 2 days 10 hours 7 mins 18 secs Priority: 50 FD Files Written: 403,525 SD Files Written: 403,525 FD Bytes Written: 3,435,178,457,337 (3.435 TB) SD Bytes Written: 3,435,250,296,123 (3.435 TB) Rate: 16417.6 KB/s Software Compression: None VSS: no Encryption: no Accurate: yes Volume name(s): FA0003|FA0004|FA0008|FA0009|FA0013|FA0014 Volume Session Id: 7 Volume Session Time: 1310078212 Last Volume Bytes: 1,024 (1.024 KB) Non-fatal FD errors: 1 SD Errors: 0 FD termination status: Error SD termination status: Error Termination: *** Backup Error *** 2011-07-10 03DIR-loki JobId 5: Error: Bacula DIR-loki 5.0.3 (04Aug10): 10-Jul-2011 03:58:22 Build OS: x86_64-unknown-linux-gnu ubuntu 10.04 JobId: 5 Job: JOB-loki_var_ftp_pub_Multimedia_BluRay.2011-07-07_17.45.01_07 Backup Level: Full Client: "FD-loki" 5.0.3 (04Aug10) x86_64-unknown-linux-gnu,ubuntu,10.04 FileSet: "FS-loki_var_ftp_pub_Multimedia_BluRay" 2011-07-06 18:00:01 Pool: "BackupSetFA" (From Run FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "LTO4" (From Pool resource) Scheduled time: 07-Jul-2011 17:45:01 Start time: 07-Jul-2011 17:50:30 End time: 10-Jul-2011 03:58:22 Elapsed time: 2 days 10 hours 7 mins 52 secs Priority: 50 FD Files Written: 683 SD Files Written: 683 FD Bytes Written: 2,718,316,374,045 (2.718 TB) SD Bytes Written: 2,718,316,489,228 (2.718 TB) Rate: 12989.4 KB/s Software Compression: None VSS: no Encryption: no Accurate: yes Volume name(s): FA0000|FA0001|FA0004|FA0005|FA0009|FA0010 Volume Session Id: 5 Volume Session Time: 1310078212 Last Volume Bytes: 1,024 (1.024 KB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: Error Termination: *** Backup Error *** *umount storage=LTO4 3002 Device "LTO4" (/dev/nst0) unmounted. ------------ ------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users