Re: amreport claims tape is full but it is not

2016-04-27 Thread Chris Nighswonger
On Tue, Apr 26, 2016 at 12:48 PM, Chris Nighswonger <
cnighswon...@foundations.edu> wrote:

> So the Quantum tech had me run a device health test on the drive using
> their xTalk utility. The drive failed with some sort of comm error similar
> to what Amanda was seeing. So he then had me clean the drive twice and
> re-run the health test. The drive passed this time around.
>
>
> We will see what happens during the nightly backup run tonight.
>

The taper failed again on an assumed EOT. I reran xTalk and the health test
failed again. Sent off the new logs to Quantum. They are shipping a new
library to me overnight. Should be on line in time for tomorrow night's
backup run.

Kudos to Quantum (and their Rapid Replacement Program)!


Re: amreport claims tape is full but it is not

2016-04-26 Thread Chris Nighswonger
On Tue, Apr 19, 2016 at 11:35 AM, Alan Hodgson wrote:

> On Tuesday, April 19, 2016 09:57:32 AM Chris Nighswonger wrote:
> > I attempted Alan's suggestion and dd'd to the tape. /dev/md0 is > 130G and
> > dd borked almost immediately:
> >
> > root@scriptor:/home/manager# dd if=/dev/md0 of=/dev/st0
> > dd: writing to ‘/dev/st0’: Input/output error
> > 3+0 records in
> > 2+0 records out
> > 1024 bytes (1.0 kB) copied, 15.2882 s, 0.1 kB/s
> >
> > Funny thing is that the Superloader passes all of its internal diagnostics
> > as far as I can tell.
> >
> > Suggestions on how to verify which piece of hardware is causing the problem
> > would be greatly appreciated.
> >
>
> That's kind of tough unless you have either a separate tape drive, or a
> separate server/SAS controller/cable combination to test.
>
> It seems to be talking OK to the loader, though, so odds are it's the drive.
>
> Quantum's pretty good about taking RMA's on the Superloaders without too much
> fuss, as long as you have a maintenance contract in place.
>
>
So the Quantum tech had me run a device health test on the drive using
their xTalk utility. The drive failed with some sort of comm error similar
to what Amanda was seeing. So he then had me clean the drive twice and
re-run the health test. The drive passed this time around.

So if it was dirty heads, why did the drive not indicate cleaning needed?
Or does the drive just indicate cleaning based on tape motion hours?

Here are the stats retrieved from the drive. Only 35 hrs in motion. Seems a
bit early for a cleaning...


Drive Information Report
Drive Type              : Ultrium 4 HH SCSI
Drive Serial Number     : HU
Media Changer Present   : No
Vendor ID               : QUANTUM
Product ID              : ULTRIUM 4
Product Revision Level  : W61T
Product Family          : 800.0 / 1600.0 GB
Servo FW Revision       : N/A
Power On Hours          : 4681
Tape Motion Hours       : 35
Load Count              : 308
Cleaning Tape Count     : 2


We will see what happens during the nightly backup run tonight.


Re: amreport claims tape is full but it is not

2016-04-19 Thread Alan Hodgson
On Tuesday, April 19, 2016 09:57:32 AM Chris Nighswonger wrote:
> I attempted Alan's suggestion and dd'd to the tape. /dev/md0 is > 130G and
> dd borked almost immediately:
> 
> root@scriptor:/home/manager# dd if=/dev/md0 of=/dev/st0
> dd: writing to ‘/dev/st0’: Input/output error
> 3+0 records in
> 2+0 records out
> 1024 bytes (1.0 kB) copied, 15.2882 s, 0.1 kB/s
> 
> Funny thing is that the Superloader passes all of its internal diagnostics
> as far as I can tell.
> 
> Suggestions on how to verify which piece of hardware is causing the problem
> would be greatly appreciated.
> 

That's kind of tough unless you have either a separate tape drive, or a 
separate server/SAS controller/cable combination to test.

It seems to be talking OK to the loader, though, so odds are it's the drive.

Quantum's pretty good about taking RMA's on the Superloaders without too much 
fuss, as long as you have a maintenance contract in place.



Re: amreport claims tape is full but it is not

2016-04-19 Thread Chris Nighswonger
I attempted Alan's suggestion and dd'd to the tape. /dev/md0 is > 130G and
dd borked almost immediately:

root@scriptor:/home/manager# dd if=/dev/md0 of=/dev/st0
dd: writing to ‘/dev/st0’: Input/output error
3+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 15.2882 s, 0.1 kB/s
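
For anyone comparing several such runs, here is a rough Python sketch (not part of the original post) that pulls the numbers out of a dd summary like the one above. The name `parse_dd_summary` is made up for illustration, and the patterns assume the GNU dd output format shown here.

```python
import re

def parse_dd_summary(text):
    """Extract record counts, byte count and elapsed time from GNU dd output."""
    rec_in = re.search(r"(\d+)\+(\d+) records in", text)
    rec_out = re.search(r"(\d+)\+(\d+) records out", text)
    copied = re.search(r"(\d+) bytes.*?copied, ([\d.]+) s", text)
    if not (rec_in and rec_out and copied):
        raise ValueError("text does not look like a dd summary")
    nbytes = int(copied.group(1))
    seconds = float(copied.group(2))
    return {
        "records_in": int(rec_in.group(1)),    # full blocks read
        "records_out": int(rec_out.group(1)),  # full blocks written
        "bytes": nbytes,
        "seconds": seconds,
        "bytes_per_second": nbytes / seconds if seconds else 0.0,
    }

sample = """\
3+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 15.2882 s, 0.1 kB/s
"""
stats = parse_dd_summary(sample)
# Blocks that were read but never written: the write that hit the I/O error.
lost = stats["records_in"] - stats["records_out"]
print(stats["bytes"], lost, round(stats["bytes_per_second"]))
```

With dd's default 512-byte block size, the figures above mean the drive accepted two blocks and failed on the third.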

Funny thing is that the Superloader passes all of its internal diagnostics
as far as I can tell.

Suggestions on how to verify which piece of hardware is causing the problem
would be greatly appreciated.

Kind regards,
Chris




Re: amreport claims tape is full but it is not

2016-04-15 Thread Chris Nighswonger
Well, perhaps it is a bad device. A run of amflush CONFIG resulted in
another failure:

Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: updating state
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Amanda::Taper::Scan::traditional stage 2: scan for any reusable volume
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: too early for another 'status' invocation
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: loading next relative to 5: 6
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: using drive 0
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: unloading drive 0
Fri Apr 15 11:52:31 2016: thd-0x18bae00: taper: invoking /usr/sbin/mtx -f
/dev/sg4 unload 5 0
Fri Apr 15 11:53:41 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: unload complete
Fri Apr 15 11:53:41 2016: thd-0x18bae00: taper: invoking /usr/sbin/mtx -f
/dev/sg4 load 6 0
Fri Apr 15 11:55:01 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: polling 'tape:/dev/nst0' to see if it's ready
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper:
Quantum-Superloader3-LTO-V4: setting current slot to 6
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper: Slot 6 with label
campus-NGH874L4 is usable
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper:
Amanda::Taper::Scan::traditional result: 'campus-NGH874L4' on
tape:/dev/nst0 slot 6, mode 2
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper: Amanda::Taper::Scribe
preparing to write, part size 0, using LEOM (falling back to holding disk
as cache) (splitter)  (LEOM supported)
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper: Starting  -> )>
Fri Apr 15 11:55:04 2016: thd-0x18bae00: taper: Final linkage:
 -(PULL_BUFFER)-> 
-(PUSH_BUFFER)-> 
Fri Apr 15 11:55:09 2016: thd-0x18bae00: taper: Building type TAPESTART
header of 32768-32768 bytes with name='campus-NGH874L4' disk='' dumplevel=0
and blocksize=32768
Fri Apr 15 11:55:11 2016: thd-0x18bae00: taper: Device tape:/dev/nst0 error
= 'Error writing filemark: Input/output error'
Fri Apr 15 11:55:11 2016: thd-0x18bae00: taper: Device tape:/dev/nst0
setting status flag(s): DEVICE_STATUS_DEVICE_ERROR, and
DEVICE_STATUS_VOLUME_ERROR
Fri Apr 15 11:55:27 2016: thd-0x18bae00: taper: tape campus-NGH874L4 kb 0
fm -1 [OK]


Re: amreport claims tape is full but it is not

2016-04-15 Thread Chris Nighswonger
On Fri, Apr 15, 2016 at 11:17 AM, Alan Hodgson wrote:

> On Friday, April 15, 2016 09:31:26 AM you wrote:
> > I'm trying to figure out how a LTO4 tape was filled up by 32k of data
> > or am I missing something here?
> >
>
> Any time I've started getting premature end-of-tape it has been because the
> drive is malfunctioning. Try a few times with just dd on a new tape and see
> how much it can write before it gets EOT.
>
>
Ouch! This drive is only a couple of months old...

I'll give this a try if switching the device type does not work.

Kind regards,
Chris


Re: amreport claims tape is full but it is not

2016-04-15 Thread Chris Nighswonger
On Fri, Apr 15, 2016 at 11:24 AM, Jean-Louis Martineau <
jmartin...@carbonite.com> wrote:

> On 15/04/16 11:15 AM, Chris Nighswonger wrote:
>
>> warning: Got EIO
>>
> What can Amanda do when it gets an EIO?
> Check the system log for errors.
> Amanda requires a non-rewinding device; you must use /dev/nst0 instead of
> /dev/st0.
>

Interesting. I ran the last Quantum drive as /dev/st0 and never had issues.
I've changed over to /dev/nst0 and will see how it runs now.


> Is the tape drive in variable block size mode?
>   mt -f /dev/nst0 status


backup@scriptor:~/campus$ mt -f /dev/nst0 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x46 (LTO-4).
Soft error count since last status=0
General status bits on (4101):
 BOT ONLINE IM_REP_EN
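
A quick way to read that output programmatically, as a hedged Python sketch: `parse_mt_status` is a hypothetical helper, and the patterns only cover the fields shown above. It relies on the Linux st driver convention that a reported block size of 0 means variable-block mode.

```python
import re

def parse_mt_status(text):
    """Pull block size, density code and status flags out of `mt status` text."""
    info = {"block_size": None, "density_code": None, "flags": []}
    m = re.search(r"Tape block size (\d+) bytes", text)
    if m:
        info["block_size"] = int(m.group(1))
    m = re.search(r"Density code (0x[0-9a-fA-F]+)", text)
    if m:
        info["density_code"] = int(m.group(1), 16)
    m = re.search(r"General status bits on \([0-9a-fA-F]+\):\s*(.+)", text)
    if m:
        info["flags"] = m.group(1).split()
    # The Linux st driver reports block size 0 for variable-block mode.
    info["variable_block"] = info["block_size"] == 0
    return info

sample = """\
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x46 (LTO-4).
Soft error count since last status=0
General status bits on (4101):
 BOT ONLINE IM_REP_EN
"""
status = parse_mt_status(sample)
print(status)
```

So the drive here is in variable-block mode, positioned at beginning of tape, and online.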

Thanks for your help.

Kind regards,
Chris


Re: amreport claims tape is full but it is not

2016-04-15 Thread Alan Hodgson
On Friday, April 15, 2016 09:31:26 AM you wrote:
> I'm trying to figure out how a LTO4 tape was filled up by 32k of data
> or am I missing something here?
> 

Any time I've started getting premature end-of-tape it has been because the 
drive is malfunctioning. Try a few times with just dd on a new tape and see 
how much it can write before it gets EOT.
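
One way to script that test, as a rough Python sketch rather than anything from the thread: `write_until_error` is a made-up name, the 32 KiB block size just mirrors the Amanda blocksize mentioned elsewhere in this thread, and the device path in the commented example is an assumption. Running it against a real tape overwrites whatever is on it.

```python
import errno
import os

def write_until_error(device, block_size=32768, max_bytes=1 << 30):
    """Write zero-filled blocks to `device` until a write fails or
    `max_bytes` is reached. Returns (bytes_written, errno_name_or_None)."""
    block = bytes(block_size)
    written = 0
    error = None
    fd = os.open(device, os.O_WRONLY)
    try:
        while written + block_size <= max_bytes:
            try:
                written += os.write(fd, block)
            except OSError as exc:
                # For example EIO or ENOSPC, as seen in the taper log.
                error = errno.errorcode.get(exc.errno, str(exc.errno))
                break
    finally:
        try:
            os.close(fd)  # on a tape device, close can itself report an error
        except OSError as exc:
            error = error or errno.errorcode.get(exc.errno, str(exc.errno))
    return written, error

# Example (destroys data on the loaded tape; the device path is an assumption):
# print(write_until_error("/dev/nst0", max_bytes=10 * 1024**3))
```

A healthy drive should reach `max_bytes` with no error; a failing one would be expected to stop after a few blocks, much like the dd run reported later in the thread.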



Re: amreport claims tape is full but it is not

2016-04-15 Thread Chris Nighswonger
On Fri, Apr 15, 2016 at 10:21 AM, Jean-Louis Martineau <
jmartin...@carbonite.com> wrote:

> Which version of amanda are you using?
>

3.3.3 (on Ubuntu 14.04.4 LTS)


> Any other error message in the report?
>

none

> What is the setting of max-dle-by-volume?
>
>   amgetconf CONFIG max-dle-by-volume
>

backup@scriptor:~/campus$ amgetconf campus max-dle-by-volume
10



> post the taper debug file and the amdump.[x] file
>


There's a lot of stuff to sanitize in those files before posting here, but
here are most likely the relevant lines from the taper debug file:

Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: Building type SPLIT_FILE
header of 32768-32768 bytes with name='some_client' disk='/path/to/data'
dumplevel=1 and blocksize=32768
Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: warning: Got EIO on
/dev/st0, assuming end of tape
Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: Device tape:/dev/st0 error
= 'No space left on device'
Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: Device tape:/dev/st0
setting status flag(s): DEVICE_STATUS_VOLUME_ERROR
Thu Apr 14 22:15:27 2016: thd-0xf98c00: taper: Will request retry of failed
split part.
Thu Apr 14 22:15:27 2016: thd-0xf98c00: taper: tape campus-NGH873L4 kb 0 fm
1 [OK]
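
To fish lines like these out of a long taper debug file, something along these lines might help. It is only a sketch in Python: the function names are hypothetical and the patterns are based solely on the log lines quoted above.

```python
import re

def find_device_errors(log_text):
    """Return (device, message) pairs for taper "Device ... error = '...'" lines.

    Whitespace is collapsed first because mail clients wrap long log lines.
    """
    flat = " ".join(log_text.split())
    return re.findall(r"Device (\S+) error = '([^']*)'", flat)

def saw_eio(log_text):
    """True if the taper logged the 'Got EIO ... assuming end of tape' warning."""
    flat = " ".join(log_text.split())
    return re.search(r"Got EIO on \S+, assuming end of tape", flat) is not None

sample = """\
Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: warning: Got EIO on
/dev/st0, assuming end of tape
Thu Apr 14 22:15:16 2016: thd-0x21e5050: taper: Device tape:/dev/st0 error
= 'No space left on device'
"""
print(find_device_errors(sample), saw_eio(sample))
```

The point of checking for both is that the "No space left on device" error here follows directly from the EIO warning, so the "tape full" report is an assumption made by the taper, not a measured end of tape.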


This is a new Quantum Superloader, so I can't imagine the heads are dirty
already.

Here is the tapetype as suggested by amtapetype:

define tapetype LTO-4 {
comment "Created by amtapetype; compression enabled"
length 794405408 kbytes
filemark 1385 kbytes
speed 77291 kps
blocksize 32 kbytes
}



The size of this backup job does not exceed 400GB at its largest size.
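
The arithmetic behind that claim, as a small Python sketch. The figures come from the tapetype and the amreport excerpt in this message; the assumptions are that Amanda's "kbytes" are units of 1024 bytes and that "400GB" means 400 x 10^9 bytes.

```python
KIB = 1024

def fill_fraction(used_kb, length_kb):
    """Fraction of the tape's nominal length that was actually written."""
    return used_kb / length_kb

TAPE_LENGTH_KB = 794405408   # "length" from the tapetype above
TAPE_SPEED_KPS = 77291       # "speed" from the tapetype above
USED_KB = 32                 # size amreport showed for campus-NGH873L4
BACKUP_BYTES = 400 * 10**9   # upper bound on the job (assumes GB = 10^9 bytes)

capacity_gib = TAPE_LENGTH_KB * KIB / 1024**3
backup_kb = BACKUP_BYTES / KIB
hours_to_fill = TAPE_LENGTH_KB / TAPE_SPEED_KPS / 3600

print(f"capacity: {capacity_gib:.1f} GiB")
print(f"job fits on one tape: {backup_kb < TAPE_LENGTH_KB}")
print(f"fraction used at reported EOT: {fill_fraction(USED_KB, TAPE_LENGTH_KB):.2e}")
print(f"time to fill at measured speed: {hours_to_fill:.2f} h")
```

Under those assumptions the largest job needs roughly half the tape, so a genuine end-of-tape after 32k is not plausible.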

Kind regards,
Chris


---

These dumps were to tape campus-NGH873L4.
Not using all tapes because 1 tapes filled; runtapes=1 does not allow
additional tapes.
There are 112130560k of dumps left in the holding disk.
They will be flushed on the next run.

...


Tape Time (hrs:min)         0:00       0:00       0:00
Tape Size (meg)              0.0        0.0        0.0
Tape Used (%)                0.0        0.0        0.0
DLEs Taped                     1          0          1  1:1
Parts Taped                    1          0          1  1:1
Avg Tp Write Rate (k/s)      0.0        --         0.0

USAGE BY TAPE:
  Label            Time   Size    %  DLEs Parts
  campus-NGH873L4  0:00    32k  0.0     1     1

-


> I'm trying to figure out how a LTO4 tape was filled up by 32k of data
> or am I missing something here?
>
> Amanda was doing this last week, so I loaded blank tapes and started the
> cycle fresh.
>
> Here is the contents of tapelist for this backup:
>
> 20160414221501 campus-NGH873L4 reuse BARCODE:NGH873L4 BLOCKSIZE:32
> 20160413221501 campus-NGH872L4 reuse BARCODE:NGH872L4 BLOCKSIZE:32
> 20160412221501 campus-NGH871L4 reuse BARCODE:NGH871L4 BLOCKSIZE:32
> 20160411221501 campus-NGH870L4 reuse BARCODE:NGH870L4 BLOCKSIZE:32
> 20160408221501 campus-NGH869L4 reuse BARCODE:NGH869L4 BLOCKSIZE:32
> 20160406222549 campus-NGH863L4 reuse BARCODE:NGH863L4 BLOCKSIZE:32
> 0 campus-NGH862L4 reuse BARCODE:NGH862L4 BLOCKSIZE:32
> 0 campus-NGH861L4 reuse BARCODE:NGH861L4 BLOCKSIZE:32
> 0 campus-NGH876L4 reuse BARCODE:NGH876L4 BLOCKSIZE:32
> 0 campus-NGH875L4 reuse BARCODE:NGH875L4 BLOCKSIZE:32
> 0 campus-NGH874L4 reuse BARCODE:NGH874L4 BLOCKSIZE:32
> 0 campus-NGH864L4 reuse BARCODE:NGH864L4 BLOCKSIZE:32
> 0 campus-NGH865L4 reuse BARCODE:NGH865L4 BLOCKSIZE:32
>
>
> Kind regards,
> Chris
>
>
>
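
For reference, a tapelist like the one quoted above can be summarised with a few lines of Python. This is only a sketch: `parse_tapelist` is a hypothetical helper, the field layout is inferred from the lines in this thread, and reading a datestamp of 0 as "labelled but never written" is an assumption.

```python
def parse_tapelist(text):
    """Parse tapelist lines of the form shown above into dicts."""
    tapes = []
    for raw in text.splitlines():
        fields = raw.lstrip("> ").split()  # tolerate mail quote markers
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        entry = {
            "datestamp": fields[0],
            "label": fields[1],
            "reuse": fields[2] == "reuse",
        }
        for extra in fields[3:]:
            key, sep, value = extra.partition(":")
            if sep:
                entry[key.lower()] = value  # e.g. barcode, blocksize
        tapes.append(entry)
    return tapes

sample = """\
> 20160414221501 campus-NGH873L4 reuse BARCODE:NGH873L4 BLOCKSIZE:32
> 20160413221501 campus-NGH872L4 reuse BARCODE:NGH872L4 BLOCKSIZE:32
> 0 campus-NGH874L4 reuse BARCODE:NGH874L4 BLOCKSIZE:32
"""
tapes = parse_tapelist(sample)
never_written = [t["label"] for t in tapes if t["datestamp"] == "0"]
print(len(tapes), never_written)
```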