Hello,
after an upgrade to Amanda-3.4.4, I discovered that amrecover eats too
much memory. Trying to recover a single directory from a huge GNU tar
image (900 GB), amrecover gets killed because of memory constraints.
System is running CentOS 5.11 x86_64 using 8 GB RAM and 24 GB Swap
available for several years using Amanda to backup to a 90 slot vTape
iSCSI-attached.
Downgrade to Amanda-3.4.3 helped. The same recover procedure (same
backup date, same directory to recover) works well using this version.
Last lines of amidxtaped.20170603173611.debug (timezone is CEST (GMT+2)):
Sat Jun 03 17:13:55.770129000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198642624 kb
Sat Jun 03 17:14:20.073570000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198645984 kb
Sat Jun 03 17:15:14.519889000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198646752 kb
Sat Jun 03 17:21:39.862161000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198647680 kb
Sat Jun 03 17:21:40.388673000 2017: pid 720: thd-0x20d85690: amidxtaped:
xfer_cancel_with_error: Error writing to fd 8: Broken pipe
Sat Jun 03 17:21:40.388729000 2017: pid 720: thd-0x20d85690: amidxtaped:
xfer_queue_message: MSG: <XMsg@0x2b2e0c0008c0 type=XMSG_ERROR
elt=<XferElemen
tGlue@0x20d86020> version=0>
Sat Jun 03 17:21:40.388900000 2017: pid 720: thd-0x20d85690: amidxtaped:
xfer_queue_message: MSG: <XMsg@0x2b2e0c009ab0 type=XMSG_CANCEL
elt=<XferSourceRecovery@0x20a3d000> version=0>
Sat Jun 03 17:21:40.845451000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
Cancelling <Xfer@0x20d849c0 (<XferSourceRecovery@0x20a3d000> ->
<XferDestFd@0x20d83010>)>
Sat Jun 03 17:21:40.922716000 2017: pid 720: thd-0x20d85690: amidxtaped:
sending XMSG_CRC message 0x20d83010
Sat Jun 03 17:21:40.922777000 2017: pid 720: thd-0x20d85690: amidxtaped:
pull_and_write CRC: 1b42fe0e size 203415191552
Sat Jun 03 17:21:40.922793000 2017: pid 720: thd-0x20d85690: amidxtaped:
xfer_queue_message: MSG: <XMsg@0x2b2e0c009a40 type=XMSG_CRC
elt=<XferDestFd@0x20d83010> version=0>
Sat Jun 03 17:21:40.922818000 2017: pid 720: thd-0x20d85690: amidxtaped:
xfer_queue_message: MSG: <XMsg@0x2b2e0c001ae0 type=XMSG_DONE
elt=<XferElementGlue@0x20d86020> version=0>
Sat Jun 03 17:21:41.313146000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
dest_crc: 1b42fe0e:203415191552
Sat Jun 03 17:21:41.674875000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1893:info:4900012 198647680 kb
Sat Jun 03 17:21:42.462789000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1921:error:4900055 Error
writing to fd 8: Broken pipe
Sat Jun 03 17:21:42.958237000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
/opt/amanda/lib/amanda/perl/Amanda/Restore.pm:2138:error:4900068 Error
writing to fd 8: Broken pipe
Sat Jun 03 17:21:43.406247000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
user_message feedback: Error writing to fd 8: Broken pipe
Sat Jun 03 17:21:44.597549000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
CTL >> MESSAGE Error writing to fd 8: Broken pipe
Sat Jun 03 17:21:46.940293000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
exiting with 1
Sat Jun 03 17:21:46.990525000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_utime : 352
Sat Jun 03 17:21:46.990566000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_stime : 463
Sat Jun 03 17:21:46.990576000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_maxrss : 38984
Sat Jun 03 17:21:46.990595000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_ixrss : 0
Sat Jun 03 17:21:46.990604000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_idrss : 0
Sat Jun 03 17:21:46.990611000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_isrss : 0
Sat Jun 03 17:21:46.990619000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_minflt : 13614
Sat Jun 03 17:21:46.990627000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_majflt : 2372
Sat Jun 03 17:21:46.990634000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_nswap : 0
Sat Jun 03 17:21:46.990641000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_inblock : 0
Sat Jun 03 17:21:46.990649000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_oublock : 0
Sat Jun 03 17:21:46.990656000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_msgsnd : 0
Sat Jun 03 17:21:46.990663000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_msgrcv : 0
Sat Jun 03 17:21:46.990670000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_nsignals: 0
Sat Jun 03 17:21:46.990678000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_nvcsw : 1477628
Sat Jun 03 17:21:46.990685000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
ru_nivcsw : 4970121
Sat Jun 03 17:21:47.010360000 2017: pid 720: thd-0x1f6147f0: amidxtaped:
pid 720 finish time Sat Jun 3 17:21:47 2017
Lines from /var/log/messages:
Jun 3 17:17:42 ente -- MARK --
Jun 3 17:21:36 ente kernel: sshd invoked oom-killer: gfp_mask=0x201d2,
order=0, oomkilladj=0
Jun 3 17:21:37 ente kernel:
Jun 3 17:21:37 ente kernel: Call Trace:
Jun 3 17:21:37 ente kernel: [<ffffffff800cb6ae>] out_of_memory+0x8e/0x2f3
Jun 3 17:21:37 ente kernel: [<ffffffff8000f698>] __alloc_pages+0x27f/0x308
Jun 3 17:21:37 ente kernel: [<ffffffff80013083>]
__do_page_cache_readahead+0x96/0x17b
Jun 3 17:21:37 ente kernel: [<ffffffff800139cc>]
filemap_nopage+0x14c/0x360
Jun 3 17:21:37 ente kernel: [<ffffffff8000896c>]
__handle_mm_fault+0x1fd/0x1037
Jun 3 17:21:37 ente kernel: [<ffffffff800671ae>] do_page_fault+0x499/0x842
Jun 3 17:21:37 ente kernel: [<ffffffff800a3f9a>]
autoremove_wake_function+0x0/0x2e
Jun 3 17:21:37 ente kernel: [<ffffffff8003abee>] tty_ldisc_deref+0x6a/0x7d
Jun 3 17:21:37 ente kernel: [<ffffffff8005ddf9>] error_exit+0x0/0x84
Jun 3 17:21:38 ente kernel:
Jun 3 17:21:39 ente kernel: Mem-info:
Jun 3 17:21:40 ente kernel: Node 0 DMA per-cpu:
Jun 3 17:21:41 ente kernel: cpu 0 hot: high 0, batch 1 used:0
Jun 3 17:21:41 ente kernel: cpu 0 cold: high 0, batch 1 used:0
Jun 3 17:21:41 ente kernel: cpu 1 hot: high 0, batch 1 used:0
Jun 3 17:21:41 ente kernel: cpu 1 cold: high 0, batch 1 used:0
Jun 3 17:21:41 ente kernel: Node 0 DMA32 per-cpu:
Jun 3 17:21:41 ente kernel: cpu 0 hot: high 186, batch 31 used:9
Jun 3 17:21:42 ente kernel: cpu 0 cold: high 62, batch 15 used:51
Jun 3 17:21:42 ente kernel: cpu 1 hot: high 186, batch 31 used:18
Jun 3 17:21:42 ente kernel: cpu 1 cold: high 62, batch 15 used:37
Jun 3 17:21:42 ente kernel: Node 0 Normal per-cpu:
Jun 3 17:21:42 ente kernel: cpu 0 hot: high 186, batch 31 used:17
Jun 3 17:21:42 ente kernel: cpu 0 cold: high 62, batch 15 used:14
Jun 3 17:21:42 ente kernel: cpu 1 hot: high 186, batch 31 used:56
Jun 3 17:21:42 ente kernel: cpu 1 cold: high 62, batch 15 used:54
Jun 3 17:21:42 ente kernel: Node 0 HighMem per-cpu: empty
Jun 3 17:21:42 ente kernel: Free pages: 39636kB (0kB HighMem)
Jun 3 17:21:42 ente kernel: Active:945135 inactive:1018975 dirty:0
writeback:0 unstable:0 free:9909 slab:6321 mapped-file:1157
mapped-anon:1963393 pagetables:19050
Jun 3 17:21:42 ente kernel: Node 0 DMA free:10084kB min:12kB low:12kB
high:16kB active:0kB inactive:0kB present:9652kB pages_scanned:0
all_unreclaimable? yes
Jun 3 17:21:43 ente kernel: lowmem_reserve[]: 0 3383 7928 7928
Jun 3 17:21:43 ente kernel: Node 0 DMA32 free:23032kB min:4860kB
low:6072kB high:7288kB active:1406744kB inactive:1978380kB
present:3464992kB pages_scanned:16393700 all_unreclaimable? yes
Jun 3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 4545 4545
Jun 3 17:21:43 ente kernel: Node 0 Normal free:6520kB min:6528kB
low:8160kB high:9792kB active:2374184kB inactive:2097132kB
present:4654080kB pages_scanned:10006802 all_unreclaimable? yes
Jun 3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 0 0
Jun 3 17:21:43 ente kernel: Node 0 HighMem free:0kB min:128kB low:128kB
high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Jun 3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 0 0
Jun 3 17:21:43 ente kernel: Node 0 DMA: 5*4kB 4*8kB 5*16kB 5*32kB
3*64kB 5*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 2*4096kB = 10084kB
Jun 3 17:21:43 ente kernel: Node 0 DMA32: 32*4kB 1*8kB 1*16kB 19*32kB
8*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 5*4096kB = 23032kB
Jun 3 17:21:43 ente kernel: Node 0 Normal: 126*4kB 56*8kB 48*16kB
2*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 1*4096kB = 6520kB
Jun 3 17:21:43 ente kernel: Node 0 HighMem: empty
Jun 3 17:21:43 ente kernel: 7119 pagecache pages
Jun 3 17:21:43 ente kernel: Swap cache: add 19085619, delete 19079797,
find 1173083/1206685, race 0+462
Jun 3 17:21:43 ente kernel: Free swap = 0kB
Jun 3 17:21:43 ente kernel: Total swap = 25165816kB
Jun 3 17:21:43 ente kernel: Free swap: 0kB
Jun 3 17:21:44 ente kernel: 2228224 pages of RAM
Jun 3 17:21:44 ente kernel: 215581 reserved pages
Jun 3 17:21:44 ente kernel: 15624 pages shared
Jun 3 17:21:44 ente kernel: 5822 pages swap cached
Jun 3 17:21:44 ente kernel: Out of memory: Killed process 707, UID 0,
(amrecover).