I've updated the server to amanda-backup_server-3.5-1 (64bit) which appears to
have fixed the issue.
The client that failed most regularly is running amanda-backup_client-3.3.9-1
(32bit).
I'll keep monitoring this in case the situation changes but it looks like it's
working properly now.
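
For anyone who wants to watch for a recurrence, below is a minimal sketch of a
connection probe. It is not part of amanda. It assumes the client accepts TCP
connections on port 10080 (the port used in the iperf test quoted below) and
only checks that control port, not the data connections a dump opens later.
The names probe and monitor are made up for illustration.

```python
import errno
import socket
import time

AMANDA_PORT = 10080  # assumption: same port as the iperf test in the quoted mail


def probe(host, port=AMANDA_PORT, timeout=3.0):
    """Try one TCP connect and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # This is the errno behind the "No route to host" message.
            return "no-route-to-host"
        if exc.errno == errno.ECONNREFUSED:
            return "refused"
        return "error: %s" % exc


def monitor(host, port=AMANDA_PORT, count=5, interval=60.0):
    """Probe repeatedly, print a timestamped line per attempt, return statuses."""
    results = []
    for i in range(count):
        status = probe(host, port)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), host, port, status)
        results.append(status)
        if i < count - 1:
            time.sleep(interval)
    return results


if __name__ == "__main__":
    # "bentley" is the client that failed most often in this thread.
    monitor("bentley")
```

Run from the backup server during the dump window, a "no-route-to-host" line
with a timestamp gives something to match against the system and firewall
logs on either side.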
On 05/10/17 08:58, Tom Robinson wrote:
>
> It may well be just that I can't see the wood for the trees when looking at
> the logs, but I can't find the problem :-(
>
> I'm running daily manual dumps of the FAILED DLEs to keep backups intact!
>
> I'm still getting the following:
>
> FAILURE DUMP SUMMARY:
> bentley Resources lev 1 FAILED [too many dumper retry: [request failed: No route to host]]
> bentley sysadmin lev 1 FAILED [too many dumper retry: [request failed: No route to host]]
>
> Apart from the two KVM hosts, all these systems are KVM Guests. The backup
> server is a KVM guest.
> Has anyone seen, or does anyone know of, issues that may occur with amanda on
> virtualised infrastructure?
>
> From my understanding of KVM networking between guests, whole network frames
> are dumped and picked up between them. This allows higher transport speeds.
> I've tested the throughput with iperf and have seen throughput as high as
> 25Gbps. The following iperf session shows the connection between the failed
> guest, bentley, and the backup server. I've only shown the 'server' side
> results for iperf below:
>
> # systemctl stop xinetd
>
> # iperf -p 10080 -s
>
> Server listening on TCP port 10080
> TCP window size: 85.3 KByte (default)
>
> [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39214
> [ ID] Interval Transfer Bandwidth
> [ 4] 0.0-10.0 sec 20.5 GBytes 17.6 Gbits/sec
> [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39215
> [ 4] 0.0-10.0 sec 20.7 GBytes 17.8 Gbits/sec
> [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39218
> [ 4] 0.0-10.0 sec 21.3 GBytes 18.3 Gbits/sec
> [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39223
> [ 4] 0.0-10.0 sec 21.4 GBytes 18.4 Gbits/sec
>
> Any clues/help for the above are appreciated.
>
> I'm now also getting some other strange errors that I've never seen before.
> These report as 'FAILED' but further on into the report they appear to have
> completed without issue. What do the error codes signify (e.g. FAILED
> [02-00098] etc.)?
>
> ---8<---
>
> FAILURE DUMP SUMMARY:
> ---8<---
> bentley ECN lev 0 FAILED [02-00098]
> bentley Repair lev 1 FAILED [06-00229]
> garage /var lev 1 FAILED [shm_ring cancelled]
> modena /usr/src lev 1 FAILED [12-00205]
>
> ---8<---
> NOTES:
> planner: Last full dump of bentley:ECN on tape daily02 overwritten in 5 runs.
> planner: Last level 1 dump of bentley:ECN on tape daily01 overwritten in 4 runs.
> planner: Last full dump of bentley:Repair on tape daily07 overwritten in 2 runs.
> planner: Last full dump of garage:/var on tape daily01 overwritten in 4 runs.
>
> ---8<---
> DUMP SUMMARY:
>                                DUMPER STATS               TAPER STATS
> HOSTNAME DISK     L ORIG-KB OUT-KB COMP% MMM:SS    KB/s MMM:SS     KB/s
> -------- -------- - ------- ------ ----- ------ ------- ------ --------
> ---8<---
> bentley  ECN      0   19790  19790    --   0:03  7325.0   0:00 197900.0
> bentley  Repair   1 10 0.0                 0:00     4.2   0:00      0.0
> garage   /var     1    7000   7000    --   0:00 33341.0   0:00      7.0
> modena   /usr/src 1     190     14   7.4   0:04     3.3   0:00    140.0
> ---8<---
>
>
> What are the error codes and did amanda dump these OK or not?
>
> Kind regards,
> Tom
>
>
> Tom Robinson
> IT Manager/System Administrator
>
> MoTeC Pty Ltd
>
> 121 Merrindale Drive
> Croydon South
> 3136 Victoria
> Australia
>
> T: +61 3 9761 5050
> F: +61 3 9761 5051
> E: tom.robin...@motec.com.au
> On 13/09/17 23:09, Jean-Louis Martineau wrote:
>> Tom,
>>
>> It is the system that returns the "No route to host" error.
>> You should check your system logs (on server, client, router, firewall, nat,
>> ...) for network errors.
>>
>> Jean-Louis
>>
>> On 12/09/17 06:01 PM, Tom Robinson wrote:
>>> bump
>>>
>>> On 11/09/17 12:45, Tom Robinson wrote:
>>> > Hi,
>>> >
>>> > I've recently migrated our backup server from CentOS 5 to CentOS 7. I've
>>> > also upgraded from amanda 3.3.7 to 3.4.5.
>>> >
>>> > The amcheck works fine and reports no issues. Yet, on backup runs on some
>>> > DLEs I get the error:
>>> >
>>> > dump failed: [request failed: No route to host](too)
>>> >
>>> > It also appears to be random as to which DLEs fail. Sometimes it's just
>>> > one or two on a client.
>>> >