Once again amanda reported an error within the daily report-mail. Beside this I 
found an reported error in dmesg.
 
st3: Sense Key : Aborted Command [current] 
st3: Add. Sense: Information unit iuCRC error detected
st3: Sense Key : Aborted Command [current] 
st3: Add. Sense: Information unit iuCRC error detected
st3: Sense Key : Aborted Command [current] 
st3: Add. Sense: Information unit iuCRC error detected
st3: Sense Key : Aborted Command [current] 
st3: Add. Sense: Information unit iuCRC error detected
INFO: task taper:26832 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
taper         D 0000000000000008     0 26832  26831 0x00000080
 ffff8809665d7b58 0000000000000082 ffff8809665d7b20 ffff8809665d7b1c
 ffff880828ed5000 ffff88083fc25300 ffff88085c495f80 0000000000000400
 ffff8808e9fa9a78 ffff8809665d7fd8 000000000000f4e8 ffff8808e9fa9a78
Call Trace:
 [<ffffffff814ed975>] schedule_timeout+0x215/0x2e0
 [<ffffffff814ed5f3>] wait_for_common+0x123/0x180
 [<ffffffff8105e7f0>] ? default_wake_function+0x0/0x20
 [<ffffffff8124e7a2>] ? __generic_unplug_device+0x32/0x40
 [<ffffffff814ed70d>] wait_for_completion+0x1d/0x20
 [<ffffffffa020c4c3>] st_do_scsi.clone.0+0x213/0x380 [st]
 [<ffffffffa020ca45>] st_int_ioctl+0x1a5/0xce0 [st]
 [<ffffffff81112ffe>] ? generic_file_aio_write+0xbe/0xe0
 [<ffffffffa020f089>] st_ioctl+0x439/0x11c0 [st]
 [<ffffffff811892f2>] vfs_ioctl+0x22/0xa0
 [<ffffffff81189494>] do_vfs_ioctl+0x84/0x580
 [<ffffffff81176602>] ? vfs_write+0x132/0x1a0
 [<ffffffff81189a11>] sys_ioctl+0x81/0xa0
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
 
In the first step, I restricted my Adaptec 29320LPE to U160 mode. Up to know 
I'm running amdump without any errors.
 
It may be possible, that the error has gone with the needed reboot to 
reconfigure the scsi-adapter. Otherwise the reconfiguration of the scsi-apadter 
solved the problem.
 
So I will keep a look on it. I think amanda is not responsible for the errors. 
Seems to be a problem within my used kernel/driver configuration.
 
Thanks
 
Markus
 
>>> Hi there,
>>> during my daily backup I recieve the error-message for quite a few days.
>>> These dumps were to tape AKREG1-08.
>>> *** A TAPE ERROR OCCURRED: [runtapes volumes already written].
>>> There are 506710M of dumps left in the holding disk.
>>> Run amflush to flush them to tape.
>>> Beside this error, I found the following lines in the corresponding 
>>> taper.log (taper.20120328210001.debug)
>>> 1332980895.993331: taper: warning: Got EIO on /dev/nst3, assuming end 
>>> of tape
>>What is before that line, what amanda was trying to do when it got EIO?
>>Is it when it wrote the tape label? or did it wrote a lot of data before the 
>>error? Did it hit the end of tape?
>
>In the beginning, amanda starts
>
>[root@backup02 ]# cat /var/log/amanda/server/daily1/taper.20120328210001.debug
>1332961201.685286: taper: pid 24858 ruid 33 euid 33 version 2.6.1p2: start at 
>Wed Mar 28 21:00:01 2012
>1332961201.686081: taper: taper: pid 24858 executable taper version 2.6.1p2
>1332961201.686657: taper: pid 24858 ruid 33 euid 33 version 2.6.1p2: rename at 
>Wed Mar 28 21:00:01 2012
>1332961201.686824: taper: getcmd: START-TAPER 20120328210001
>1332961201.721631: taper: putresult: 25 TAPER-OK
>1332961625.751742: taper: getcmd: FILE-WRITE 08-00010 
>/data1/amanda/daily1/20120328210001/servername01.domain.tld._opt_online.1 
>servername01.domain.tld /opt/online/ 1 20120328210001 0
>1332961625.752021: taper: putresult: 26 REQUEST-NEW-TAPE
>1332961625.752172: taper: getcmd: NEW-TAPE
>
>amanda wrote some data to the tape and then reports EIO. But the tapes end 
>wasn't hit.
>
>1332980856.560330: taper: getcmd: FILE-WRITE 01-00093 
>/data1/amanda/daily1/20120328210001/servername0N.domain.tld._data1_BACKUP.0 
>servername0N.domain.tld /data1/BACKUP 0 20120328210001 0
>1332980856.560442: taper: Building type 3 (FILE) header of size 32768 using:
>1332980856.560452: taper: Contents of *(dumpfile_t *)0x7f4889d1f1f0:
>1332980856.560457: taper:     type             = 3 (FILE)
>1332980856.560461: taper:     datestamp        = '20120328210001'
>1332980856.560465: taper:     dumplevel        = 0
>1332980856.560469: taper:     compressed       = -32
>1332980856.560473: taper:     encrypted        = 0
>1332980856.560477: taper:     comp_suffix      = '.gz'
>1332980856.560481: taper:     encrypt_suffix   = ''
>1332980856.560485: taper:     name             = 'servername0N.domain.tld'
>1332980856.560489: taper:     disk             = '/data1/BACKUP'
>1332980856.560493: taper:     program          = '/bin/tar'
>1332980856.560497: taper:     application      = ''
>1332980856.560501: taper:     srvcompprog      = ''
>1332980856.560504: taper:     clntcompprog     = ''
>1332980856.560508: taper:     srv_encrypt      = ''
>1332980856.560512: taper:     clnt_encrypt     = ''
>1332980856.560516: taper:     recover_cmd      = '/bin/tar -f... -'
>1332980856.560520: taper:     uncompress_cmd   = '/usr/bin/gzip -dc |'
>1332980856.560524: taper:     encrypt_cmd      = ''
>1332980856.560528: taper:     decrypt_cmd      = ''
>1332980856.560532: taper:     srv_decrypt_opt  = ''
>1332980856.560536: taper:     clnt_decrypt_opt = ''
>1332980856.560539: taper:     cont_filename    = ''
>1332980856.560543: taper:     dle_str          = (null)
>1332980856.560547: taper:     is_partial       = 0
>1332980856.560554: taper:     partnum          = 0
>1332980856.560559: taper:     totalparts       = 0
>1332980856.560563: taper:     blocksize        = 32768
>1332980895.993331: taper: warning: Got EIO on /dev/nst3, assuming end of tape
>1332980895.993377: taper: Device /dev/nst3 error = 'No space left on device'
>1332980895.993441: taper: Device /dev/nst3 setting status flag(s): 
>DEVICE_STATUS_VOLUME_ERROR
>1332980903.652733: taper: device /dev/nst3 ran out of space
>1332981061.192603: taper: putresult: 4 PARTIAL
>1332981061.192955: taper: getcmd: FILE-WRITE 01-00094 
>/data1/amanda/daily1/20120328210001/servername0N.domain.tld._data1_BACKUP.0 
>servername0N.domain.tld /data1/BACKUP 0 20120328210001 0
>1332981061.193110: taper: putresult: 26 REQUEST-NEW-TAPE
>1332981061.193234: taper: getcmd: NO-NEW-TAPE "runtapes volumes already 
>written"
>1332981063.305514: taper: putresult: 10 FAILED
>1332984143.214650: taper: getcmd: QUIT
>1332984143.214684: taper: putresult: 2 QUITTING
>
>
>All those runs were finished this the notice "runtapes volumes already written"
>
>USAGE BY TAPE:
>  Label           Time      Size      %    Nb    Nc
>AKREG1-08       0:04    13208M    1.7    46    46 (today)
>AKREG1-07       3:37    53203M    6.9    46    46 (yesterday)
>AKREG1-05      13:49   242502M   31.3    46    46 (the day before yesterday)
>
>>> 1332980895.993377: taper: Device /dev/nst3 error = 'No space left on 
>>> device'
>>> 1332980895.993441: taper: Device /dev/nst3 setting status flag(s):
>>> DEVICE_STATUS_VOLUME_ERROR
>>> 1332980903.652733: taper: device /dev/nst3 ran out of space
>>> 1332981061.192603: taper: putresult: 4 PARTIAL
>>> 1332981061.192955: taper: getcmd: FILE-WRITE 01-00094 
>>> /data1/amanda/daily1/20120328210001/server.domain.de._data1_BACKUP.0
>>> server.domain.de /data1/BACKUP 0 20120328210001 0
>>> 1332981061.193110: taper: putresult: 26 REQUEST-NEW-TAPE
>>> 1332981061.193234: taper: getcmd: NO-NEW-TAPE "runtapes volumes 
>>> already written"
>>> 1332981063.305514: taper: putresult: 10 FAILED
>>> 1332984143.214650: taper: getcmd: QUIT
>>> 1332984143.214684: taper: putresult: 2 QUITTING At first, I expected 
>>> an dirty tape device 
>>> (http://wiki.zmanda.com/index.php/No_space_left_on_device). But after 
>>> cleaning the device, it even doesn't work poperly.
>>> I'm using three LTO4 and one LTO5 device. The LTO4 drives are from 
>>> the same typ (TANDBERG LTO-4 HH, same firmware). The LTO4 tapes are 
>>> connected to two Adaptec 29320LPE PCIe Ultra320 SCSI adapter.
>>> For all three LTO4 devices I'm using the same device-definition in 
>>> amanda.conf define tapetype tandberg-lto4hh {
>>>    comment "Tandberg LTO-4 HH; compression disabled"
>>>    length 794001824 kbytes
>>>    filemark 0 kbytes
>>>    speed 50554 kps
>>>    blocksize 32 kbytes
>>> }
>>> But only this one device is causing problems. Then I expected a 
>>> hardwareerror. So I downloaded the Tandberg test software. I was able 
>>> to write and read a hugh amount of random data several times on the 
>>> device.
>>> Selected Target: 3 - TANDBERG LTO-4 HH         W619 SCSI
>>> >>> Tape IO Test Start
>>> Param: WR, 64, Random Data
>>> Checking block size info...Completed
>>> Setting block size info...Completed
>>> Get current compression status - Enabled Disabling 
>>> compression...Completed Rewinding...Completed Resetting IO log 
>>> parameters...Completed Writing logical beginning of tape...Completed 
>>> Writing 750.000 MB data to the tape...Completed Retrieving log status 
>>> parameters...Completed Write Transfer Rate = 4527 MB/min 
>>> Rewinding...Completed Locating the first data...Completed Reading 
>>> 750.000 MB data from the tape...Completed Retrieving log status 
>>> parameters...Completed Read Transfer Rate = 4529 MB/min 
>>> Rewinding...Completed Erasing Tape...Completed
>>> >>> Tape IO Test Passed
>>> Writing directly to the tape using 'tar' is also posible and doesn't 
>>> cause any error. Just amanda keeps on telling me, that the device 
>>> reports 'EIO' and won't write the backupdata to tape. Running amflush 
>>> leads to the same problem.
>>> On the serverside (RHEL 6.2), I use the following repository based
>>> versions:
>>> - amanda-server-2.6.1p2-7.el6.x86_64
>>> - amanda-client-2.6.1p2-7.el6.x86_64
>>> - amanda-2.6.1p2-7.el6.x86_64
>>> Can anyone give me a short hint, where I should take a look to?
>>> Thanks in advance
>>> Markus

Reply via email to