I know *my* mid afternoon and just-before-I-leave cronjobs have an “amcheck” which does indeed interfere with “amrecovers” going on from another node. I just figured you had already thought of that possibility! So no sweat — I’ve got this problem too. ;) Deb Baddorf
> On Feb 17, 2017, at 3:17 AM, ts...@portrix-systems.de wrote: > > ARGH, this is awkward, I just figured out what was calling amcheck on Feb 14 > 12:00. One of our cronjobs that is testing whether a new tape for the night > has to be inserted. > > Didn't expect amcheck to use the same interactivity Mail-Reporting and > Mail-Template. Case closed. > > > > On 16/02/17 13:11, ts...@portrix-systems.de wrote: >> The patches work great, thanks again. I have noticed another issue >> regarding "device busy" though: >> >> During backup, after the first tape change, I get more "AMANDA VOLUME >> REQUEST" mails, even though the tape is still beeing written to and far >> from beeing full. >> >> I don't think the patches have caused this, I have seen it before. It is >> quite difficult to reproduce and I am thinking about setting up a test >> system with a virtual Tape Device like mhvtl (never tried it before) in >> order to test it reliably. Testing a potential patch will take some >> time, too. >> >> Right now I believe it is only happening when amanda is still dumping to >> the holding disk after the first tape change. And it is caused by >> something calling amcheck while writing to tape is in progress. >> >> Normally I would leave the backup running over night and the next day >> when I do the first tape change, only writing to tape is left to do. I >> don't get erroneous mails in that case. >> >> The erroneous mails look like this: >> >>> Subject: AMANDA VOLUME REQUEST: new volume >>> >>> Can't open tape device /dev/tape/by-id/scsi-3500110a00132b8f2-nst: >>> Device or resource busy >>> >>> Insert a new volume in drive_daily >>> or write the name of a new changer in '/tmp/input_daily' >>> or write 'abort' in the file to abort the scan. >> >> In the debug log dir I can find (mail arrived at 12:00): >> >> amcheck-device.20170214120001.debug: >> >>> Tue Feb 14 12:00:01.472364882 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: pid 30687 ruid 500 euid 500 version 3.4.2: start at >>> Tue Feb 14 12:00:01 2017 >>> Tue Feb 14 12:00:01.472544844 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: reading config file >>> /usr/local/amandatest/etc/amanda/ptxhamdaily/amanda.conf >>> Tue Feb 14 12:00:01.483178644 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: pid 30687 ruid 500 euid 500 version 3.4.2: rename at >>> Tue Feb 14 12:00:01 2017 >>> Tue Feb 14 12:00:01.528612782 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: Device tape:/dev/tape/by-id/scsi-3500110a00132b8f2-nst >>> error = 'Can't open tape device >>> /dev/tape/by-id/scsi-3500110a00132b8f2-nst: Device or resource busy' >>> Tue Feb 14 12:00:01.528639498 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: Device tape:/dev/tape/by-id/scsi-3500110a00132b8f2-nst >>> setting status flag(s): DEVICE_STATUS_DEVICE_BUSY >>> Tue Feb 14 12:00:01.529026195 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: new Amanda::Changer::Error: type='fatal', >>> message='Can't open tape device >>> /dev/tape/by-id/scsi-3500110a00132b8f2-nst: Device or resource busy' >>> Tue Feb 14 12:00:01.529453221 2017: pid 30687: thd-0x1357e10: >>> amcheck-device: cmd: /usr/bin/Mail -s AMANDA VOLUME REQUEST: new >>> volume x...@xxx.xxx >> >> >> amcheck.20170214120001.debug: >> >>> Tue Feb 14 12:00:01.346665049 2017: pid 30684: thd-0x1e9d010: amcheck: >>> pid 30684 ruid 0 euid 500 version 3.4.2: start at Tue Feb 14 12:00:01 >>> 2017 >>> Tue Feb 14 12:00:01.346740802 2017: pid 30684: thd-0x1e9d010: amcheck: >>> reading config file >>> /usr/local/amandatest/etc/amanda/ptxhamdaily/amanda.conf >>> Tue Feb 14 12:00:01.347261333 2017: pid 30684: thd-0x1e9d010: amcheck: >>> pid 30684 ruid 0 euid 500 version 3.4.2: rename at Tue Feb 14 12:00:01 >>> 2017 >>> Tue Feb 14 12:00:01.355625384 2017: pid 30686: thd-0x1e9d010: >>> amcheck-server: new message: amcheck.c:940:4:2800027 Amanda Tape >>> Server Host Check >>> Tue Feb 14 12:00:01.355650277 2017: pid 30686: thd-0x1e9d010: >>> amcheck-server: Not value for key 'dle_hostname' in message { >>> "source_filename" : "amcheck.c", >>> "source_line" : "940", >>> "severity" : "message", >>> "process" : "amcheck-server", >>> "running_on" : "amanda-server", >>> "component" : "amanda", >>> "module" : "amanda", >>> "code" : "2800027", >>> "message" : "Amanda Tape Server Host Check" >>> } >>> Tue Feb 14 12:00:01.356030840 2017: pid 30686: thd-0x1e9d010: >>> amcheck-server: Spawning >>> "/usr/local/amandatest/libexec/amanda/amcheck-device >>> /usr/local/amandatest/libexec/amanda/amcheck-device ptxhamdaily >>> ptxhamdaily" in pipeline >>> Tue Feb 14 12:00:01.528935767 2017: pid 30686: thd-0x1e9d010: >>> amcheck-server: new message: amcheck.c:867:4:123 slot 1: Can't open >>> tape device /dev/tape/by-id/scsi-3500110a00132b8f2-nst: Device or >>> resource busy >>> Tue Feb 14 12:00:01.528954799 2017: pid 30686: thd-0x1e9d010: >>> amcheck-server: Not value for key 'dle_hostname' in message , >>> { >>> "source_filename" : "amcheck.c", >>> "source_line" : "867", >>> "severity" : "message", >>> "process" : "amcheck-server", >>> "running_on" : "amanda-server", >>> "component" : "amanda", >>> "module" : "amanda", >>> "code" : "123", >>> "errstr" : "slot 1: Can't open tape device >>> /dev/tape/by-id/scsi-3500110a00132b8f2-nst: Device or resource busy", >>> "message" : "slot 1: Can't open tape device >>> /dev/tape/by-id/scsi-3500110a00132b8f2-nst: Device or resource busy" >>> } >> >> Please let me know if more info is required. >> >> >> >> On 10/02/17 17:25, Jean-Louis Martineau wrote: >>> I committed both patches. >>> >>> Let me know if you have other issues or if it succeed! >>> >>> Jean-Louis >>> >>> On 10/02/17 09:37 AM, ts...@portrix-systems.de >>> <mailto:ts...@portrix-systems.de> wrote: >>>> A quick test with amflush on previous holding-disk content didn't show >>>> the issue anymore. Tape was ejected, no more locking, flushing was >>>> able to finish both tapes. Will try a real backup next week. >>>> >>>> Thanks alot, >>>> Thilo >>>> >>>> >>>> On 09/02/17 17:33, Jean-Louis Martineau wrote: >>>>> This patch is also needed. >>>>> >>>>> Please try them and let me know if they works. >>>>> >>>>> Jean-Louis >>>>> >>>>> On 09/02/17 10:58 AM, Jean-Louis Martineau wrote: >>>>>> Hi, >>>>>> >>>>>> Can you try the attached patch. >>>>>> >>>>>> Jean-Louis >>>>>> >>>>>> On 08/02/17 12:58 PM, ts...@portrix-systems.de >>> <mailto:ts...@portrix-systems.de> >>>>>> <mailto:ts...@portrix-systems.de> wrote: >>>>>> > Hi, >>>>>> > >>>>>> > tried this with: >>>>>> > >>>>>> >>> http://www.zmanda.com/downloads/community/Amanda/3.4.2/Redhat_Enterprise_6.0/amanda-backup_server-3.4.2-1.rhel6.x86_64.rpm >>> >>>>>> >>>>>> > (and also 3.4.1) >>>>>> > >>>>>> > Oracle Linux Server release 6.8 >>>>>> > # uname -rsv >>>>>> > Linux 4.1.12-61.1.23.el6uek.x86_64 #2 SMP Tue Dec 20 16:51:41 PST >>>>>> 2016 >>>>>> > >>>>>> > >>>>>> > Tapeinfo: >>>>>> > Vendor ID: 'HP ' >>>>>> > Product ID: 'Ultrium 4-SCSI ' >>>>>> > Revision: 'H63Z' >>>>>> > >>>>>> > >>>>>> > The test backup doesn't fit on one tape. After I get the E-Mail from >>>>>> > the interactivity module, I can see with lsof that the taper still >>>>>> has >>>>>> > a lock on /dev/nst1: >>>>>> > >>>>>> > > # lsof /dev/nst1 >>>>>> >> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME >>>>>> >> taper 1168 amandabackup 5u CHR 9,129 0t0 1812371 /dev/nst1 >>>>>> > >>>>>> > touching the check-file looks like this in >>>>>> > /var/log/amanda/server/ptxhamtest/taper.20170208122530.debug: >>>>>> > >>>>>> >> Wed Feb 08 16:21:12.035824026 2017: pid 1168: thd-0x240ad70: taper: >>>>>> >> xfer_queue_message: MSG: <XMsg@0x7ff8bc012390 type=XMSG_PART_DONE >>>>>> >> elt=<XferDestTaperSplitter@0x24090a0> version=0> >>>>>> >> Wed Feb 08 16:21:12.036846793 2017: pid 1168: thd-0x240ad70: taper: >>>>>> >> Building type SPLIT_FILE header of 32768-32768 bytes with >>>>>> >> name='localhost' disk='/amanda/testbackupsrc' dumplevel=0 and >>>>>> >> blocksize=32768 >>>>>> >> Wed Feb 08 16:22:52.113249836 2017: pid 1168: thd-0x240ad70: taper: >>>>>> >> Device tape:/dev/nst1 error = 'No space left on device' >>>>>> >> Wed Feb 08 16:22:52.113278102 2017: pid 1168: thd-0x240ad70: taper: >>>>>> >> Device tape:/dev/nst1 setting status flag(s): >>>>>> DEVICE_STATUS_VOLUME_ERROR >>>>>> >> Wed Feb 08 16:22:53.101250403 2017: pid 1168: thd-0x240ad70: taper: >>>>>> >> xfer_queue_message: MSG: <XMsg@0x7ff8bc012670 type=XMSG_PART_DONE >>>>>> >> elt=<XferDestTaperSplitter@0x24090a0> version=0> >>>>>> >> Wed Feb 08 16:23:19.976805822 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> Will request retry of failed split part. >>>>>> >> Wed Feb 08 16:23:19.977170240 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> tape HAM-M-029-L4 kb 754974720 fm 73 [OK] >>>>>> >> Wed Feb 08 16:23:19.980883703 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> Device is in variable block size >>>>>> >> Wed Feb 08 16:23:40.882859608 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 16:23:40.882974991 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> cmd: /usr/bin/Mail -s AMANDA VOLUME REQUEST: new volume >>> x...@xxx.xxx <mailto:x...@xxx.xxx> >>>>>> <mailto:x...@xxx.xxx> >>>>>> >> >>>>>> >> Wed Feb 08 16:23:50.888937917 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 16:24:00.960351367 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 16:24:10.960959061 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> > ... >>>>>> >> Wed Feb 08 17:20:55.510368422 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 17:21:05.512939917 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 17:21:15.515187884 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Wed Feb 08 17:21:25.517437824 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> Device tape:/dev/nst1 error = 'Can't open tape device /dev/nst1: >>>>>> >> Device or resource busy' >>>>>> >> Wed Feb 08 17:21:25.517455484 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> Device tape:/dev/nst1 setting status flag(s): >>>>>> DEVICE_STATUS_DEVICE_BUSY >>>>>> >> Wed Feb 08 17:21:25.517680908 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> Slot 1 without label can be labeled >>>>>> >> Wed Feb 08 17:21:25.517775016 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> new Amanda::Changer::Error: type='fatal', message='Can't open tape >>>>>> >> device /dev/nst1: Device or resource busy' >>>>>> >> Wed Feb 08 17:21:25.518772880 2017: pid 1168: thd-0x1454d20: taper: >>>>>> >> cmd: /usr/bin/Mail -s AMANDA VOLUME REQUEST: new volume >>> x...@xxx.xxx <mailto:x...@xxx.xxx> >>>>>> <mailto:x...@xxx.xxx> >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > if I remove "eject-volume yes" from amanda.conf, lsof doesn't show >>>>>> > /dev/nst1 as open after the mail and logfile looks like this when >>>>>> > touching the check-file: >>>>>> > >>>>>> >> Sun Feb 05 22:01:45.892621331 2017: pid 14498: thd-0x36592a0: >>>>>> taper: >>>>>> >> xfer_queue_message: MSG: <XMsg@0x7f5770012690 type=XMSG_PART_DONE >>>>>> >> elt=<XferDestTaperSplitter@0x3226530> version=0> >>>>>> >> Sun Feb 05 22:01:45.893809646 2017: pid 14498: thd-0x36592a0: >>>>>> taper: >>>>>> >> Building type SPLIT_FILE header of 32768-32768 bytes with >>>>>> >> name='localhost' disk='abc-rman' dumplevel=0 and blocksize=32768 >>>>>> >> Sun Feb 05 22:02:27.050993044 2017: pid 14498: thd-0x36592a0: >>>>>> taper: >>>>>> >> Device tape:/dev/nst1 error = 'No space left on device' >>>>>> >> Sun Feb 05 22:02:27.051019886 2017: pid 14498: thd-0x36592a0: >>>>>> taper: >>>>>> >> Device tape:/dev/nst1 setting status flag(s): >>>>>> DEVICE_STATUS_VOLUME_ERROR >>>>>> >> Sun Feb 05 22:02:28.006847900 2017: pid 14498: thd-0x36592a0: >>>>>> taper: >>>>>> >> xfer_queue_message: MSG: <XMsg@0x7f5770016d70 type=XMSG_PART_DONE >>>>>> >> elt=<XferDestTaperSplitter@0x3226530> version=0> >>>>>> >> Sun Feb 05 22:02:54.991253275 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> Will request retry of failed split part. >>>>>> >> Sun Feb 05 22:02:54.992089576 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> tape HAM-M-025-L4 kb 804625975 fm 137 [OK] >>>>>> >> Sun Feb 05 22:02:54.993163873 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Sun Feb 05 22:02:54.993302419 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> cmd: /usr/bin/Mail -s AMANDA VOLUME REQUEST: new volume >>> x...@xxx.xxx <mailto:x...@xxx.xxx> >>>>>> <mailto:x...@xxx.xxx> >>>>>> >> >>>>>> >> Sun Feb 05 22:03:04.998826673 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Sun Feb 05 22:03:15.005238432 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Sun Feb 05 22:03:25.009766390 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> > ... >>>>>> >> Mon Feb 06 09:13:13.437584985 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Mon Feb 06 09:13:23.455771038 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Mon Feb 06 09:13:33.473860747 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> new Amanda::Changer::Error: type='failed', reason='notfound', >>>>>> >> message='No acceptable volumes found' >>>>>> >> Mon Feb 06 09:13:38.596670849 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> Device is in variable block size >>>>>> >> Mon Feb 06 09:13:40.734257638 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> Slot 1 with label HAM-M-026-L4 is usable >>>>>> >> Mon Feb 06 09:13:45.889592195 2017: pid 14498: thd-0x2258610: >>>>>> taper: >>>>>> >> Building type TAPESTART header of 32768-32768 bytes with >>>>>> >> name='HAM-M-026-L4' disk='' dumplevel=0 and blocksize=32768 >>>>>> > >>>>>> > >>>>>> > Thanks, >>>>>> > Thilo >>>>> >>>>>