On 11/19/22 06:50, hw wrote:
On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote:
... I suggest trying a Category 6A factory patch cable at least 2 meters
long.
I tried it with a 10m cat6 cable and the connection was intermittent. It's the
same (as in "identical to") cable that works between the other server and a
client.
Okay. I suggest putting a unique mark or serial number on each cable for
tracking purposes until you resolve the intermittent connection issue.
What OSes are the various machines running?
Fedora on the server and Debian on the backup server, Fedora on the client.
Okay. If the NIC works correctly in the backup server with Fedora,
maybe you should just use Fedora.
Do you compile your own kernels and/or NIC drivers?
No, I'm using the kernels that come with the distributions.
Okay. That is the safest approach.
I did compile the
driver (i.e. module) from the source on Intel's web site to see if a different
driver would make a difference, and it didn't, so I restored the "original"
module.
Okay. Too bad it did not work; that seemed like a good suggestion.
If you have another Broadcom NIC, what happens if you swap it with the
Intel NIC in the backup server?
I haven't tried yet because when I swap cards around, I'll have to redo the
configuration and the server has some network cards passed through to a VM
running OPNsense. I don't want to mess with that.
Perhaps that is a good reason to do some devops development -- e.g.
write a data-driven script that reads a configuration file to
interconnect the VM virtual network interfaces and host physical network
interfaces.
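A minimal sketch of what such a data-driven script could look like, assuming the goal is to attach host and VM interfaces to Linux bridges with iproute2. The bridge and interface names (br-lan, enp3s0, vnet0, and so on) and the JSON layout are hypothetical, not taken from the actual setup, and PCI passthrough would need a different mechanism. The script only prints the commands so they can be reviewed before anything is changed:

```python
import json

# Hypothetical configuration: bridge name -> interfaces to attach.
# None of these names come from the real server; adjust to match it.
EXAMPLE_CONFIG = json.loads("""
{
  "bridges": {
    "br-lan": ["enp3s0", "vnet0"],
    "br-wan": ["enp4s0", "vnet1"]
  }
}
""")


def bridge_commands(config):
    """Return iproute2 commands that create each bridge and attach its members."""
    commands = []
    for bridge, members in sorted(config["bridges"].items()):
        commands.append(["ip", "link", "add", "name", bridge, "type", "bridge"])
        for iface in members:
            commands.append(["ip", "link", "set", "dev", iface, "master", bridge])
        commands.append(["ip", "link", "set", "dev", bridge, "up"])
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in bridge_commands(EXAMPLE_CONFIG):
        print(" ".join(cmd))
```

Keeping the mapping in a file means that after swapping cards, only the interface names in the configuration need to change.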
I prefer to use a dedicated hardware device for my LAN (UniFi Security
Gateway).
I suspect it's a mainboard issue. I pulled the Intel card and then the on-board
network card quit working.
With the current Debian installation? Did you try the d-i rescue shell
or any live sticks?
I plugged the Intel card back in and the on-board card
worked again. I'd try disabling the on-board card but there is no option to do
that in the BIOS.
Okay. That indicates the issue is software.
Do you have any diagnostic information that indicates the Intel NIC is
overheating?
No, the idea that it might overheat is from internet searches revealing that
some people had issues with the card overheating and adding a fan blowing on the
heatsink fixed the problem. I always had a fan blowing over it from the top of
the card, so that should be fine, and placing another fan directly on the
heatsink didn't make a difference. I took the extra fan out today when I was at
it because it's awfully loud --- it's an old Delta fan from 2003 that comes from
an old IBM server and it makes a good airstream :)
The heat sink looks fine and unfortunately, it's designed in such a way that I
can't remove it without breaking the pins holding the heatsink to the card, so I
decided not to touch it. That's how I discovered that the on-board network card
quit working when the Intel card wasn't plugged in ...
Perhaps it's some kind of resource conflict or incompatibility, or the board is
broken.
At this point, all I can suggest is a program of A/B testing to isolate
the faulty hardware and/or software component(s). Beware that you may
have multiple faults, so be meticulous.
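One way to keep such a test program organized is to write the plan down before starting. The sketch below assumes a one-factor-at-a-time approach: start from the current (failing) configuration and change exactly one component per trial. The factor labels are hypothetical shorthand for the components mentioned in this thread, not exact part names:

```python
def one_factor_plan(baseline, alternatives):
    """Return the baseline plus one trial per alternative value,
    each differing from the baseline in exactly one factor."""
    plan = [dict(baseline)]
    for factor, values in alternatives.items():
        for value in values:
            if value != baseline[factor]:
                trial = dict(baseline)
                trial[factor] = value
                plan.append(trial)
    return plan


# Hypothetical labels for the components discussed in this thread.
BASELINE = {"cable": "cat6-10m", "nic": "Intel", "os": "Debian"}
ALTERNATIVES = {
    "cable": ["cat6a-2m"],
    "nic": ["Broadcom"],
    "os": ["Fedora"],
}

if __name__ == "__main__":
    for number, trial in enumerate(one_factor_plan(BASELINE, ALTERNATIVES)):
        # Record the observed result next to each trial as you go.
        print(number, trial)
```

Changing one factor at a time will not expose two faults that only show up together, so if every single-factor trial still fails, combinations of changes need to be tested as well.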
I prefer FreeBSD for my servers. The "Intel® Ethernet Controller
Products 27.7 Release Notes" indicate the "ix" driver is supported and
tested on FreeBSD 13 and FreeBSD 12.3 ("Fedora" and "Debian" appear
nowhere in that document):
https://www.intel.com/content/www/us/en/download/19622/intel-ethernet-product-software-release-notes.html
My FreeBSD-12.3-RELEASE-amd64 SOHO server has a man page ixgbe(4):

NAME
     ixgbe - Intel(R) 10Gb Ethernet driver for the FreeBSD operating system

SYNOPSIS
     To compile this driver into the kernel, place the following lines in
     your kernel configuration file:

           device iflib
           device ixgbe

     Alternatively, to load the driver as a module at boot time, place the
     following line in loader.conf(5):

           if_ixgbe_load="YES"

David