On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote:
> On 11/18/22 05:23, hw wrote:
> > On Tue, 2022-11-15 at 16:42 -0800, David Christensen wrote:
> > > On 11/15/22 07:15, hw wrote:
> > > > On Tue, 2022-11-15 at 12:38 +0100, hw wrote:
> > > > > On Mon, 2022-11-14 at 13:21 +0100, hw wrote:
> [...]
> > > What is the cable type?  Length?  Factory or home made?
> >
> > I got a new cable today which is rated as cat 8.1.  It's only 1.5 meters
> > long.  I have tried 3 different cables now, two of them about 1.5 and
> > another 10 meters long.  Before I got the new cable, I tried the other
> > port on the nic, and it made no difference.
> >
> > Even with the new cable, the connection is intermittent :(
>
>
> Different category cables have different characteristic impedance, and
> the NIC's are designed for specific cables.
> [...]
> So, I suggest trying a Category 6A factory patch cable at least 2 meters
> long.

I tried it with a 10m cat6 cable and the connection was intermittent.  It's
the same (as in "identical to") cable that works between the other server
and a client.

> > > What is connected to the other end of the cable?  If it is a NIC in
> > > another server, what happens if you swap the two NIC's?
> >
> > It's connected to a Broadcom NetXtreme II BCM57810 in another server.
> > The other server has an identical mainboard and CPU in it, and the other
> > port on the Broadcom is connected to a client with the same card, and
> > that connection works fine.  So I'm assuming that the Broadcom card is ok.
>
>
> What OS's for the various machines?

Fedora on the server and Debian on the backup server, Fedora on the client.

> Do you compile your own kernels and/or NIC drivers?

No, I'm using the kernels that come with the distributions.  I did compile
the driver (i.e. module) from the source on Intel's web site to see if a
different driver would make a difference, and it didn't, so I restored the
"original" module.

> > I'm about to move the client into a new case in a couple days and then I
> > might swap the Broadcom from the client into the backup server.
>
>
> If you have another Broadcom NIC, what happens if you swap it with the
> Intel NIC in the backup server?

I haven't tried yet because when I swap cards around, I'll have to redo the
configuration and the server has some network cards passed through to a VM
running OPNsense.  I don't want to mess with that.

I suspect it's a mainboard issue.  I pulled the Intel card and then the
on-board network card quit working.  I plugged the Intel card back in and
the on-board card worked again.  I'd try disabling the on-board card but
there is no option to do that in the BIOS.

> > Maybe I can reseat the heat sink on the card with new thermal paste.
> > Overheating might explain why the connection is intermittent.
>
>
> Do you have any diagnostic information that indicates the Intel NIC is
> overheating?

No, the idea that it might overheat is from internet searches revealing that
some people had issues with the card overheating, and that adding a fan
blowing on the heatsink fixed the problem.  I always had a fan blowing over
it from the top of the card, so that should be fine, and placing another fan
directly on the heatsink didn't make a difference.

I took the extra fan out today while I was at it because it's awfully loud
--- it's an old Delta fan from 2003 that comes from an old IBM server and it
makes a good airstream :)

The heatsink looks fine and unfortunately, it's designed in such a way that
I can't remove it without breaking the pins holding the heatsink to the
card, so I decided not to touch it.  That's how I discovered that the
on-board network card quit working when the Intel card wasn't plugged in ...

Perhaps it's some kind of resource conflict or incompatibility, or the board
is broken.
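
To get some actual diagnostic information on when the link drops, instead of
guessing at overheating, something like the following could log it.  This is
only a minimal sketch, assuming a Linux kernel that exposes the counter
/sys/class/net/<iface>/carrier_changes.  The interface name enp1s0 and the
function names are made up for illustration and are not taken from the setup
described above.

```python
import time
from pathlib import Path


def read_carrier_changes(iface, sysfs_root="/sys/class/net"):
    """Return the kernel's cumulative count of link up/down transitions."""
    path = Path(sysfs_root) / iface / "carrier_changes"
    return int(path.read_text().strip())


def count_new_changes(samples):
    """Turn successive counter readings into per-interval increases."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def watch(iface, interval=5.0, rounds=12, sysfs_root="/sys/class/net"):
    """Poll the counter and print a timestamped line whenever it moves."""
    last = read_carrier_changes(iface, sysfs_root)
    for _ in range(rounds):
        time.sleep(interval)
        now = read_carrier_changes(iface, sysfs_root)
        if now != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {iface}: {now - last} carrier change(s)")
        last = now
```

Calling watch("enp1s0") with the real interface name (hypothetical here)
prints one line per polling interval in which the link state changed.  The
timestamps could then be compared against temperature readings or load, which
would show whether the drops line up with heat at all.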