Hi Hunter
I run an ESXi host on a USDT system and use a USB3 LAN dongle to give me a
separate network for user/management traffic so I can use the onboard one
for iSCSI. This was done following the article here:
https://www.virtuallyghetto.com/2016/03/working-usb-ethernet-adapter-nic-for-esxi.html
I have noticed that the USB interface drops packets all the time. That is
not a big problem if the protocols can handle it, and RDP etc. suffers no
real issues. But running something like TotalNetworkMonitor on a VM there,
you do see up to 50% or so of the ping packets lost in its probes.
It could be that you are seeing similar behaviour, where the protocol
doesn't handle lost packets too well...
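To put a number on that sort of loss from the command line, something like
the following could be used. It is only a sketch: it assumes the summary
line that Linux (iputils) ping prints, ping_loss is a name made up here,
and 192.0.2.10 is a placeholder address.

```shell
# Print the packet-loss percentage found in ping's summary line.
# Assumes a summary such as:
#   100 packets transmitted, 50 received, 50% packet loss, time 99123ms
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example (placeholder address, not from the original message):
#   ping -c 100 192.0.2.10 | ping_loss
```

Running it against both the USB adapter's network and the onboard one
would show whether the loss is specific to the USB device.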
regards
Dave
On Fri, 20 Jul 2018 03:15:34 +0100, Hunter Goatley
<[email protected]> wrote:
Here's where we stand on our cluster communications errors: nothing we
did worked. We tried different ports on the switch. We tried forcing
1Gbps. We tried forcing the port down to 10 Mbps. That actually seemed
to help slightly, in that we only lost communications every 63 seconds
or so, instead of every 15--60 seconds. But it would lose and
re-establish connection to the cluster every 63 seconds.
So I decided to try setting up and using a TAP device, just to see what
would happen.
Using the dedicated Ethernet card, it made no difference. It still lost
communications every 63 seconds.
When I say dedicated Ethernet card, I probably should have stated
earlier that it's a USB -> Ethernet device plugged into the system. I
don't know what brand or model, but I can find out, if anyone wants to
know.
So I decided to try tunneling through the "real" Ethernet port used by
the Linux system. After figuring out what to do for the missing tunctl
command under CentOS, I was able to set up a tunnel, and I did "attach
xq tap:tap0". I then booted the system and wonder of wonders, miracle of
miracles, it was seven minutes into the boot (yes, it takes a long
time, mounting a slew of disks that needed to be rebuilt) before it lost
communications. But it re-established them immediately, and as of my
typing this, it has been twenty-nine minutes since that happened. No
further drops. Normally, I wouldn't think twenty-nine minutes is enough
to prove anything, but when it was dropping every 15--63 seconds for two
solid days, this sounds like a fix to me.
So what does it mean? One thing it suggests is that the USB Ethernet
device may be buggy or bad. I mean, it seems to work OK for TCP/IP
communications, etc, but it sure sounds like it may be the part
responsible for the problems. Especially since tunneling through the
built-in Ethernet card seems to work and tunneling through the USB
device did not.
These are the commands I used to set up the tap device for CentOS:
brctl addbr br0
ifconfig eno1 0.0.0.0            # eno1 is the host's Ethernet device
ifconfig br0 XXX.XX.XX.XX up     # the IP address of the host system
brctl addif br0 eno1
brctl setfd br0 0
#tunctl -t tap0
ip tuntap add tap0 mode tap      # Replacement for tunctl on CentOS 7
brctl addif br0 tap0
ifconfig tap0 up
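
For anyone without brctl or ifconfig installed, the same setup can
probably be done with iproute2 alone. This is an untested sketch, not what
was actually run here: setup_tap_bridge, run and DRY_RUN are names made up
for illustration, the address is a placeholder, and the forward_delay
option assumes a reasonably recent iproute2. By default it only prints the
commands; set DRY_RUN=0 and run as root to execute them.

```shell
# Print each command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bridge the host's Ethernet device with a new tap0 device on br0.
# $1 = host Ethernet device (e.g. eno1)
# $2 = host address in CIDR form (placeholder, e.g. 192.0.2.10/24)
setup_tap_bridge() {
    host_if=$1
    host_addr=$2
    run ip link add name br0 type bridge forward_delay 0  # brctl addbr + setfd
    run ip addr flush dev "$host_if"                      # ifconfig eno1 0.0.0.0
    run ip link set "$host_if" master br0                 # brctl addif br0 eno1
    run ip addr add "$host_addr" dev br0                  # host IP moves to br0
    run ip link set br0 up
    run ip tuntap add dev tap0 mode tap                   # tunctl replacement
    run ip link set tap0 master br0                       # brctl addif br0 tap0
    run ip link set tap0 up
}

# Example: setup_tap_bridge eno1 192.0.2.10/24
```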
I then just did "attach xq tap:tap0" in the init file. I guess I should
set up a special MAC address, but I haven't yet, and so far, nothing
seems amiss.
While I thought having a dedicated Ethernet device would be the simplest
thing, I can live with tunneling it through the shared Ethernet device,
especially since it works and the former does not. ;-)
Thank you for all of your input over the past couple of days, and thank
you for all of your work on SIMH!
Hunter