On 2017-05-31 00:37, Steven Haigh wrote:
On 31/05/17 00:18, Boris Ostrovsky wrote:
On 05/30/2017 06:27 AM, Steven Haigh wrote:
Just wanted to give this a nudge to try and get some suggestions on
where to go / what to do about this.

On 28/05/17 09:44, Steven Haigh wrote:
The last couple of days running on kernel 4.9.29 and 4.9.30 with Xen
4.9.0-rc6 I've had a number of ethernet lock ups that have taken my
system off the network.

This is a new development - but I'm not sure if it's kernel or Xen related.

Since no one seems to have seen this it would be useful to narrow it down
a bit.

Do you observe this on rc5? Or with 4.9.28 kernel? Any particular load
that you are using? Do you see this on a specific NIC?

This install is currently using Xen 4.9-rc7 and kernel 4.9.30. I suspect
there may be a connection between disk activity and the ethernet adapter
locking up - but I haven't been able to prove this in any valid way yet.

I am currently running this script on the server in question to try and
get a log of how often the adapter locks up. I only added the logger
line tonight - so I don't have a great deal of historical data to add as
yet.

#!/bin/bash
# Ping 10.1.1.2 every 5 seconds; if it stops answering, log the event
# and reset the NIC.
while true; do
        if ! ping -c1 10.1.1.2 > /dev/null 2>&1; then
                logger 'No response. Resetting enp5s0'
                mii-tool -R enp5s0
        fi
        sleep 5
done

Just to keep kicking this along a little bit, my logs so far have shown:
messages:May 31 00:20:10 No response. Resetting enp5s0
messages:May 31 04:20:08 No response. Resetting enp5s0
messages:May 31 12:21:37 No response. Resetting enp5s0

It's almost spooky that it's roughly 20 minutes past the hour on each reset.
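
To check whether the resets really do cluster around the same minute as more data comes in, the logged lines can be tallied by minute past the hour. This is only a rough sketch: minute_histogram is a made-up name, and it assumes the syslog-style HH:MM:SS timestamps shown above.

```bash
#!/bin/bash
# Hypothetical helper: count NIC reset log entries per minute-past-the-hour.
# Reads syslog-style lines on stdin and prints "<minute> <count>" pairs.
minute_histogram() {
        awk '/Resetting/ {
                # Find the first field that looks like HH:MM:SS.
                for (i = 1; i <= NF; i++)
                        if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
                                split($i, t, ":")
                                count[t[2]]++
                                break
                        }
        }
        END { for (m in count) print m, count[m] }' | sort
}
```

Something like `grep -h 'Resetting enp5s0' /var/log/messages* | minute_histogram` would feed it, assuming the entries end up in /var/log/messages on this system. For the three entries above it prints "20 2" and "21 1".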

I've checked against the cron logs, but I can't find anything that would be scheduled on the Dom0 at that time.

The logs also show that after running mii-tool to reset the ethernet adapter, connectivity has returned straight away.
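
Because the reset brings the link back immediately, whatever state the adapter was in is gone by the time anyone looks. One possible approach is to snapshot the interface just before resetting it. The following is a sketch under assumptions: snapshot_nic and the output directory are hypothetical, and the set of commands is only a guess at what might be useful here.

```bash
#!/bin/bash
# Hypothetical helper: save interface state to a timestamped file
# before the NIC is reset. Prints the path of the file it wrote.
snapshot_nic() {
        local iface=$1 dir=$2
        local out
        mkdir -p "$dir" || return 1
        out="$dir/nic-$iface-$(date +%Y%m%dT%H%M%S).log"
        {
                echo "== snapshot of $iface at $(date) =="
                # Link state and RX/TX counters as the kernel sees them.
                ip -s link show dev "$iface" || true
                # Driver-level statistics, if ethtool is installed.
                ethtool -S "$iface" || true
                # Recent kernel messages, in case the driver logged anything.
                dmesg | tail -n 50 || true
        } > "$out" 2>&1
        echo "$out"
}
```

In the watchdog loop this could be called as `snapshot_nic enp5s0 /var/log/nic-lockups` (directory name made up) on the line before `mii-tool -R enp5s0`. Errors from missing tools land in the snapshot file rather than stopping the loop.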

The network adapter uses the r8169 kernel module, and shows as:
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

I have a DomU backup script that runs *in* a DomU at 01:00 each night - that causes a lot of disk activity - but alas, that time hasn't lined up with anything as yet...

Still seem to be fidgeting in the dark :(

--
Steven Haigh

net...@crc.id.au     http://www.crc.id.au
+61 (3) 9001 6090    0412 935 897

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
