Re: Intel X540-AT2 and Debian: intermittent connection
On Sun, 2022-11-20 at 10:25 -0800, David Christensen wrote:
> On 11/20/22 08:25, hw wrote:
> > On Sun, 2022-11-20 at 12:45 +0100, hw wrote:
> > > [...]
>
> I am unable to determine if Intel has fixed any device driver bugs for
> the X540-AT2 adapter since FreeBSD-12.3-RELEASE-amd64-memstick.img was
> released.

IIRC, the card has been EOL for 2 years or so. Intel probably won't touch it unless the driver supports other cards that are still being sold.

The network interfaces I got in FreeBSD were ix0 and ix1.
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/20/22 08:25, hw wrote:
> On Sun, 2022-11-20 at 12:45 +0100, hw wrote:
> > [...]
> > I don't know, I'll try the Debian rescue and FreeBSD when the currently running backups are finished.
>
> I booted the Debian rescue and the LEDs on the network cards don't light up and the link remains down even when I replug the cable. I rebooted into the installed Debian and the LEDs are back on and the link is up and I can ping. After 53 pings, the address is unreachable again.
>
> So I booted FreeBSD and went into the shell. dmesg reported link up and down as I plugged the cable, and ifconfig kept saying 'no carrier'. After giving the interface an IPv6 address with ifconfig it still said no carrier, and after replugging the cable, I could ping 127 times. Then ping stopped for a few seconds before it pinged one more time, and now it's stuck. After a while, it pinged again, and after another while, ping stopped again.
>
> There are no further messages in dmesg and ifconfig keeps saying the interface is active even when it doesn't ping. FreeBSD ping times are almost 100 ms lower than the ones with Linux.
>
> This is probably not a software issue. I'll switch cards around in a couple days or so. That's gonna suck but now I want to know. Stay tuned ...

Regarding the FreeBSD installer rescue shell/live CD, I am wondering if the device driver is loaded (?).
This is the FreeBSD installer I use:

https://download.freebsd.org/ftp/releases/ISO-IMAGES/12.3/
FreeBSD-12.3-RELEASE-amd64-memstick.img

Booting and choosing "Welcome" -> "Shell":

    # freebsd-version ; uname -a
    12.3-RELEASE
    FreeBSD 12.3-RELEASE FreeBSD 12.3-RELEASE r371126 GENERIC amd64

    # ls -l /boot/kernel/*ixgbe*
    ls: /boot/kernel/*ixgbe*: No such file or directory

Searching, I found:

    # ls -l /boot/kernel/if_ix.ko
    -r-xr-xr-x 1 root wheel 351112 Dec 2 2021 /boot/kernel/if_ix.ko

    # md5sum /boot/kernel/if_ix.ko
    dd37c6a5077ca77289be338566ab3ade /boot/kernel/if_ix.ko

If I try to load it:

    # kldload if_ix
    module_register: cannot register pci/ix from if_ix.ko; already loaded from kernel
    kldload: can't load if_ix: module already loaded or in kernel

Using a recently updated VM:

    2022-11-20 09:36:10 toor@f3 ~
    # freebsd-version ; uname -a
    12.3-RELEASE-p7
    FreeBSD f3.tracy.holgerdanske.com 12.3-RELEASE-p6 FreeBSD 12.3-RELEASE-p6 GENERIC amd64

    2022-11-20 09:36:24 toor@f3 ~
    # ll /boot/kernel/if_ix.ko
    -r-xr-xr-x 2 root wheel 351112 2022/01/30 20:50:57 /boot/kernel/if_ix.ko

    2022-11-20 09:44:00 toor@f3 ~
    # md5sum /boot/kernel/if_ix.ko
    dd37c6a5077ca77289be338566ab3ade /boot/kernel/if_ix.ko

    2022-11-20 09:36:29 toor@f3 ~
    # kldload if_ix
    kldload: can't load if_ix: module already loaded or in kernel

So, it looks like the driver is loaded in FreeBSD-12.3-RELEASE-amd64-memstick.img, and the driver has not been updated on the 12.3-R branch.

Looking at the FreeBSD download page, the date on the installer is 2021-Dec-02 06:37.

Looking at the Intel driver page for FreeBSD:

https://www.intel.com/content/www/us/en/download/14303/intel-network-adapters-driver-for-pcie-10-gigabit-network-connections-under-freebsd.html

The tarball is:

    ix-3.3.31.tar.gz

I assume the version number is 3.3.31.

STFW for FreeBSD-12.3-RELEASE source code, I see:

https://github.com/freebsd/freebsd-src/tree/releng/12.3/sys/dev/ixgbe

Looking at a few *.h and *.c files, I am unable to find a version number.
Looking at "Intel® Ethernet Controller Products 27.7 Release Notes" -> "2.0 Fixed Issues" -> "2.3 Intel® Ethernet 500 Series Network Adapters", I see:

    None for this release.

I am unable to determine if Intel has fixed any device driver bugs for the X540-AT2 adapter since FreeBSD-12.3-RELEASE-amd64-memstick.img was released.

David
Re: Intel X540-AT2 and Debian: intermittent connection
On Sat, 2022-11-19 at 17:35 -0800, David Christensen wrote: > On 11/19/22 15:51, hw wrote: > > On Sat, 2022-11-19 at 13:35 -0800, David Christensen wrote: > > > On 11/19/22 06:50, hw wrote: > > > > On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote: > > > > > > > > ... I suggest trying a Category 6A factory patch cable at least 2 > > > > > meters > > > > > long. > > > > > > > > I tried it with a 10m cat6 cable and the connection was intermittent. > > > > It's > > > > the > > > > same (as in "identical to") cable that works between the other server > > > > and a > > > > client. > > > > > > > > > Okay. I suggest putting a unique mark/ serial number on each cable for > > > tracking purposes until you resolve the intermittent connection issue. > > > > What for? All the cables I used except for the new ones are known good. > > > Sanity check/ OCD. I went through a period with SATA III drive problems > and marked all of my SATA cables to help with troubleshooting. Hm I can see that for when you have so many of them that they're hard to tell apart. > [...] > > > Perhaps that is a good reason to do some devops development -- e.g. > > > write a data-driven script that reads a configuration file to > > > interconnect the VM virtual network interfaces and host physical network > > > interfaces. > > > > Why would I do that? How would a script figure out which interface is which > > and > > how would it guarantee that they will be exactly the same as seen by > > OPNsense > > running in that VM when I switch them around? I'm not saying it's > > impossible, > > but I'd rather resolve this problem in a timely manner and not in a couple > > years > > when I might have finished the script and tested it in a bunch of servers > > which > > aren't even relevant. > > > I agree that creating software for devops can be difficult and time > consuming, but it is nice to have when done. I have built up a > collection of shell and Perl scripts over the years that are very useful. 
I do that when it makes sense, not when it doesn't. You should try to set up a VM with OPNsense and a couple network cards you have to pass through so you see how much fun that is. It took me a day or two to get it to work stably, and it was a one-time endeavour. You'd have to have some robot arm to pull the server from the rack, take off the cover (requires two arms maybe) and have them move the cards in the PCI slots, controlled by an AI that's actually smart enough to understand what it's doing and able to do the testing as well. Good luck with programming that :)

> > > > I suspect it's a mainboard issue.
>
> The clues support that hypothesis.

Or the card is broken. At least Intel makes cards that appear to behave somewhat consistently even when they don't work ;)

> [...]
> > > Did you try the d-i rescue shell
> >
> > You mean the rescue system that comes with the Debian installer? No, I haven't.
> > How would that make a difference?
>
> It would provide data point for troubleshooting.

The LEDs on the cards don't come on when the rescue system is running and it doesn't work at all.

> > > or any live sticks?
> >
> > only the Fedora one
>
> That indicates a bad NIC and/or a bad PCIe slot.

Right, FreeBSD also makes an intermittent connection.

> > > > I plugged the Intel card back in and the on-card worked again. I'd try disabling the on-board card but there is no option to do that in the BIOS.
> > >
> > > Okay. That indicates the issue is software.
> >
> > How would it be a software issue affecting a network card from a totally different manufacturer in a PCI slot that the BIOS doesn't have an option to disable the on-board network card?
>
> Without extensive engineering information and the right test equipment, who knows?

Something that isn't there can't have an effect ...

> [...]
> It sounds like you could use more spare parts and/or computers.

I already have too many.

> Let us know what happens with the Broadcom card and with whatever BSD
> you pick (the FreeBSD installer includes a rescue shell and a live system).

ok
Re: Intel X540-AT2 and Debian: intermittent connection
On Sun, 2022-11-20 at 12:45 +0100, hw wrote:
> [...]
> I don't know, I'll try the Debian rescue and FreeBSD when the currently
> running backups are finished.

I booted the Debian rescue and the LEDs on the network cards don't light up and the link remains down even when I replug the cable. I rebooted into the installed Debian and the LEDs are back on and the link is up and I can ping. After 53 pings, the address is unreachable again.

So I booted FreeBSD and went into the shell. dmesg reported link up and down as I plugged the cable, and ifconfig kept saying 'no carrier'. After giving the interface an IPv6 address with ifconfig it still said no carrier, and after replugging the cable, I could ping 127 times. Then ping stopped for a few seconds before it pinged one more time, and now it's stuck. After a while, it pinged again, and after another while, ping stopped again.

There are no further messages in dmesg and ifconfig keeps saying the interface is active even when it doesn't ping. FreeBSD ping times are almost 100 ms lower than the ones with Linux.

This is probably not a software issue. I'll switch cards around in a couple days or so. That's gonna suck but now I want to know. Stay tuned ...
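The "53 pings, then unreachable, then back again" pattern described above can be recorded instead of counted by eye. What follows is only a minimal sketch, not something anyone in this thread ran: the function names `watch_host` and `summarize_runs` are made up for illustration, and the `ping -c 1 -W 1` flags assume Linux iputils (on FreeBSD the `-W` unit differs, and FreeBSD 12 uses `ping6` for IPv6 targets).

```shell
#!/bin/sh
# Hypothetical helper: probe HOST once per second, COUNT times,
# printing 1 for a reply and 0 for a timeout.
# Assumes Linux iputils ping (-c count, -W timeout in seconds).
watch_host() {
    host=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        if ping -c 1 -W 1 "$host" >/dev/null 2>&1; then
            echo 1
        else
            echo 0
        fi
        i=$((i + 1))
        sleep 1
    done
}

# Hypothetical helper: collapse a stream of 1/0 lines into runs,
# e.g. "up 53" followed by "down 12".
summarize_runs() {
    awk '
        function label(s) { return (s == 1) ? "up" : "down" }
        NR == 1     { state = $1; n = 1; next }
        $1 == state { n++; next }
                    { print label(state), n; state = $1; n = 1 }
        END         { if (NR > 0) print label(state), n }
    '
}
```

Something like `watch_host 192.0.2.10 600 | summarize_runs` (the address is a placeholder) would then give the length of each up and down stretch over ten minutes, which makes it easier to compare Debian, Fedora and FreeBSD runs.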
Re: Intel X540-AT2 and Debian: intermittent connection
On Sun, 2022-11-20 at 10:46 +0100, hede wrote:
> On Sun, 20 Nov 2022 00:51:20 +0100 hw wrote:
>
> > Unfortunately it doesn't work anymore with Fedora either ... I tried it with a
> > live system if it would work and it didn't.
>
> The source of connection resets can be diverse.

Is it a connection reset? I can ping for a short time, then the address is unreachable, then the pings get through again.

> Sometimes dmesg will show useful info, sometimes not. It can be anything, from
> the link layer (ethernet re-negotiation) to some upper layers (arp, ip, etc.).
> What kind of logs and status apps did you examine already? (dmesg,
> ethtool/mii-tool, syslog, systemd journal, journal for which kind of services,
> etc.)

I checked dmesg and messages, and they only show the link coming up when I unplug and replug the cable. Ethtool doesn't show anything special, either. There are no services running using the card; its sole purpose is to make backups faster via rsync (I don't have a 10 Gbit switch).

> Does the live system use the same kernel as the installed one? That's
> typically not the case as those get updated very frequently. As such the
> driver can still be different. (can, maybe, not a must)

I don't know, I'll try the Debian rescue and FreeBSD when the currently running backups are finished.

> Does the X540-AT2 use external or builtin firmware? With external firmware
> even that can differ between systems and firmware is also a potential source
> of connection problems.

I don't know. While I was trying to fix the problem, I installed the Linux firmware package, and there doesn't seem to be any particular firmware for the card. I haven't seen any notices about firmware being loaded and having the firmware package installed didn't make a difference.

    dmesg | grep -i firm
    [0.156099] Spectre V2 : Enabling Restricted Speculation for firmware calls
    [0.895840] GHES: APEI firmware first mode is enabled by WHEA _OSC.
    [3.489013] 3w-sas: scsi0: Firmware FH9X 5.12.00.007, BIOS BE9X 5.11.00.006, Phys: 8.
    [4.689010] 3w-sas: scsi7: Firmware FH9X 5.12.00.007, BIOS BE9X 5.11.00.006, Phys: 8.

Hm, is it normal for both controllers to say 'Phys: 8'? They are working fine, but perhaps they're conflicting with the network card.

> For the cable: my own experience is that with shorter connections the cable is
> more irrelevant. On shorter connections even cat 5 works on 10 GBit. I had to
> use those for some room-to-room connection (wall-moulded cables for fire
> protection between two adjacent rooms, not simply exchangeable). They are
> perfectly working in full speed. So if you tried several cat 6 cables 10 m and
> less, which are working between other systems, I don't think(!) the cable is
> of interest here...

I don't think so, either. When you use a short patch cable like 50 or 25cm there could be issues because those can be of really poor quality; or when you have 50--100 meters of a quality that works fine up to 30 meters, you might see transmission errors and delays in the connection. The shorter cables are usually fine unless you have one that doesn't work at all, but that's easy to figure out. Ethernet is remarkably robust, at least at 1 Gbit.

I think that the network card works fine and that the mainboard is having some issue that somehow sometimes prevents everything from being transmitted through that card. I'll find out if the card works once some parts I'm waiting on have arrived. (I could pull the SAS controllers but without them the server is rather useless because the disks won't be connected ...)
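Since ethtool "doesn't show anything special" in its default output, the per-driver statistics from `ethtool -S` may be worth a look as well. The following is only a sketch under assumptions: the function name `nonzero_error_counters` is invented here, and the counter names matched (anything containing "err", "drop" or "miss") are a guess, because `ethtool -S` counter names are driver-specific.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -S IFACE` output on stdin and print
# only counters whose name suggests an error and whose value is non-zero.
# The name pattern is an assumption; ixgbe may use other counter names.
nonzero_error_counters() {
    awk -F': *' '
        $1 ~ /(err|drop|miss)/ && $2 + 0 > 0 {
            gsub(/^[ \t]+/, "", $1)   # strip the leading indentation
            print $1 "=" $2
        }
    '
}
```

Running `ethtool -S eth0 | nonzero_error_counters` (with the real interface name instead of `eth0`) once while the link pings and once after it has stopped would show whether any error counter moves in between; CRC or length errors would point at the physical layer, while flat counters would fit the mainboard theory better.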
Re: Intel X540-AT2 and Debian: intermittent connection
On Sun, 20 Nov 2022 00:51:20 +0100 hw wrote:

> Unfortunately it doesn't work anymore with Fedora either ... I tried it with a
> live system if it would work and it didn't.

The source of connection resets can be diverse. Sometimes dmesg will show useful info, sometimes not. It can be anything, from the link layer (ethernet re-negotiation) to some upper layers (arp, ip, etc.).

What kind of logs and status apps did you examine already? (dmesg, ethtool/mii-tool, syslog, systemd journal, journal for which kind of services, etc.)

Does the live system use the same kernel as the installed one? That's typically not the case as those get updated very frequently. As such the driver can still be different. (can, maybe, not a must)

Does the X540-AT2 use external or builtin firmware? With external firmware even that can differ between systems and firmware is also a potential source of connection problems.

For the cable: my own experience is that with shorter connections the cable is more irrelevant. On shorter connections even cat 5 works on 10 GBit. I had to use those for some room-to-room connection (wall-moulded cables for fire protection between two adjacent rooms, not simply exchangeable). They are perfectly working in full speed. So if you tried several cat 6 cables 10 m and less, which are working between other systems, I don't think(!) the cable is of interest here...

hede
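To answer the kernel, driver and firmware questions above with data, the output of `ethtool -i` can be reduced to one comparable line per system. This is a sketch, not a tested recipe: `driver_summary` is a name made up for this example, and it assumes the usual `driver:`, `version:` and `firmware-version:` fields that `ethtool -i` prints.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -i IFACE` output on stdin and print
# driver name, driver version and firmware version on a single line.
driver_summary() {
    awk -F': ' '
        $1 == "driver"           { d = $2 }
        $1 == "version"          { v = $2 }
        $1 == "firmware-version" { f = $2 }
        END { printf "driver=%s version=%s firmware=%s\n", d, v, f }
    '
}
```

Saving the result of `uname -r; ethtool -i eth0 | driver_summary` (interface name assumed) on the installed Debian, the Debian rescue system and the Fedora live stick would show at a glance whether the three actually run different driver versions.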
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/19/22 15:51, hw wrote: On Sat, 2022-11-19 at 13:35 -0800, David Christensen wrote: On 11/19/22 06:50, hw wrote: On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote: ... I suggest trying a Category 6A factory patch cable at least 2 meters long. I tried it with a 10m cat6 cable and the connection was intermittent. It's the same (as in "identical to") cable that works between the other server and a client. Okay. I suggest putting a unique mark/ serial number on each cable for tracking purposes until you resolve the intermittent connection issue. What for? All the cables I used except for the new ones are known good. Sanity check/ OCD. I went through a period with SATA III drive problems and marked all of my SATA cables to help with troubleshooting. What OS's for the various machines? Fedora on the server and Debian on the backup server, Fedora on the client. Okay. If the NIC works correctly in the backup server with Fedora, maybe you should just use Fedora. Unfortunately it doesn't work anymore with Fedora either ... I tried it with a live system if it would work and it didn't. Okay. Perhaps that is a good reason to do some devops development -- e.g. write a data-driven script that reads a configuration file to interconnect the VM virtual network interfaces and host physical network interfaces. Why would I do that? How would a script figure out which interface is which and how would it guarantee that they will be exactly the same as seen by OPNsense running in that VM when I switch them around? I'm not saying it's impossible, but I'd rather resolve this problem in a timely manner and not in a couple years when I might have finished the script and tested it in a bunch of servers which aren't even relevant. I agree that creating software for devops can be difficult and time consuming, but it is nice to have when done. I have built up a collection of shell and Perl scripts over the years that are very useful. I suspect it's a mainboard issue. 
The clues support that hypothesis. I pulled the Intel card and then the on- board network card quit working. With the current Debian installation? yes Did you try the d-i rescue shell You mean the rescue system that comes with the Debian installer? No, I haven't. How would that make a difference? It would provide data point for troubleshooting. or any live sticks? only the Fedora one That indicates a bad NIC and/or a bad PCIe slot. I plugged the Intel card back in and the on-card worked again. I'd try disabling the on-board card but there is no option to do that in the BIOS. Okay. That indicates the issue is software. How would it be a software issue affecting a network card from a totally different manufacturer in a PCI slot that the BIOS doesn't have an option to disable the on-board network card? Without extensive engineering information and the right test equipment, who knows? At this point, all I can suggest is a program of A/B testing to isolate the faulty hardware and/or software component(s). Beware that you may have multiple faults, so be meticulous. The only thing I can do is try the network card that's in the client now. It'll be about a week before I can get to that. I prefer FreeBSD for my servers. The "Intel ® Ethernet Controller Products 27.7 Release Notes" indicate the "ix" driver is supported and tested on FreeBSD 13 and FreeBSD 12.3 ("Fedora" and "Debian" appear nowhere in that document): Good idea, I can try this maybe: https://www.nomadbsd.org/ I'd be surprised if it worked, but maybe it does and if it does, I could just as well use FreeBSD for the backup server. It sounds like you could use more spare parts and/or computers. Let us know what happens with the Broadcom card and with whatever BSD you pick (the FreeBSD installer includes a rescue shell and a live system). David
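On the question raised above of how a script could tell which interface is which: on Linux, sysfs exposes the MAC address and the PCI address of each interface, and the PCI address is what a passthrough configuration refers to. The sketch below only lists that mapping; it does not touch any VM configuration, the function name `list_nics` is invented here, and it assumes the usual `/sys/class/net/<iface>/address` and `device` entries plus a `readlink` that supports `-f`.

```shell
#!/bin/sh
# Hypothetical helper: print "name mac pci-address" for every interface.
# ROOT defaults to /sys/class/net; interfaces without a backing device
# (lo, bridges, taps) are reported as "virtual".
list_nics() {
    root=${1:-/sys/class/net}
    for dir in "$root"/*; do
        [ -e "$dir/address" ] || continue
        name=$(basename "$dir")
        mac=$(cat "$dir/address")
        if [ -e "$dir/device" ]; then
            # The device symlink resolves to the bus device directory,
            # whose last path component is the PCI address for PCI NICs.
            bus=$(basename "$(readlink -f "$dir/device")")
        else
            bus=virtual
        fi
        printf '%s %s %s\n' "$name" "$mac" "$bus"
    done
}
```

Saving the output of `list_nics` before and after moving cards gives a record of which MAC ended up at which PCI address. It does not solve the harder part of the objection, namely guaranteeing that OPNsense inside the VM sees the same interface names afterwards.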
Re: Intel X540-AT2 and Debian: intermittent connection
On Sat, 2022-11-19 at 13:35 -0800, David Christensen wrote: > On 11/19/22 06:50, hw wrote: > > On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote: > > > > ... I suggest trying a Category 6A factory patch cable at least 2 meters > > > long. > > > > I tried it with a 10m cat6 cable and the connection was intermittent. It's > > the > > same (as in "identical to") cable that works between the other server and a > > client. > > > Okay. I suggest putting a unique mark/ serial number on each cable for > tracking purposes until you resolve the intermittent connection issue. What for? All the cables I used except for the new ones are known good. > > > What OS's for the various machines? > > > > Fedora on the server and Debian on the backup server, Fedora on the client. > > > Okay. If the NIC works correctly in the backup server with Fedora, > maybe you should just use Fedora. Unfortunately it doesn't work anymore with Fedora either ... I tried it with a live system if it would work and it didn't. > > > Do you compile your own kernels and/or NIC drivers? > > > > No, I'm using the kernels that come with the distributions. > > > Okay. That is the safest approach. And it's convenient :) > > I did compile the > > driver (i. e. module) from the source on Intels web site to see if a > > different > > driver would make a difference, and it didn't, so I restored the "original" > > module. > > > Okay. Too bad it did not work; that seemed like a good suggestion. It's good that it didn't make a difference because I won't be able to keep the Intel source working. Sooner or later the kernel will be incompatible, and until then, I might have to recompile it for every new kernel version. It's always better to use hardware that is supported by the modules that come with the kernel. > > > If you have another Broadcom NIC, what happens if you swap it with the > > > Intel NIC in the backup server? 
> > > > I haven't tried yet because when I swap cards around, I'll have to redo the > > configuration and the server has some network cards passed through to a VM > > running OPNsense. I don't want to mess with that. > > > Perhaps that is a good reason to do some devops development -- e.g. > write a data-driven script that reads a configuration file to > interconnect the VM virtual network interfaces and host physical network > interfaces. Why would I do that? How would a script figure out which interface is which and how would it guarantee that they will be exactly the same as seen by OPNsense running in that VM when I switch them around? I'm not saying it's impossible, but I'd rather resolve this problem in a timely manner and not in a couple years when I might have finished the script and tested it in a bunch of servers which aren't even relevant. > I prefer to use a dedicated hardware device for my LAN (UniFi Security > Gateway). Ubiquity sucks. I'd prefer to run OPNsense on dedicated hardware, but electricity is insanely expensive here, and OPNsense works fine in this VM with no issues whatsoever in over a year now. > > I suspect it's a mainboard issue. I pulled the Intel card and then the on- > > board > > network card quit working. > > > With the current Debian installation? yes > Did you try the d-i rescue shell You mean the rescue system that comes with the Debian installer? No, I haven't. How would that make a difference? > or any live sticks? only the Fedora one > > I plugged the Intel card back in and the on-card > > worked again. I'd try disabling the on-board card but there is no option to > > do > > that in the BIOS. > > > Okay. That indicates the issue is software. How would it be a software issue affecting a network card from a totally different manufacturer in a PCI slot that the BIOS doesn't have an option to disable the on-board network card? > > > Do you have any diagnostic information that indicates the Intel NIC is > > > overheating? 
> > > > No, the idea that it might overheat is from internet searches revealing that > > some people had issues with the card overheating and adding a fan blowing on > > the > > heatsink fixed the problem. I always had a fan blowing over it from the top > > of > > the card, so that should be fine, and placing another fan directly on the > > heatsink didn't make a difference. I took the extra fan out today when I > > was at > > it because it's awfully loud --- it's an old Delta fan from 2003 that comes > > from > > an old IBM server and it makes a good airstream :) > > > > The heat sink looks fine and unfortunately, it's designed in such a way that > > I > > can't remove it without breaking the pins holding the heatsink to the card, > > so I > > decided not to touch it. That's how I discovered that the on-board network > > card > > quit working when the Intel card wasn't plugged in ... > > > > Perhaps it's some kind of resource conflict or incompatibility, or the board > > is > > broken. > > > At this point,
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/19/22 13:35, David Christensen wrote:
> The "Intel® Ethernet Controller Products 27.7 Release Notes" indicate the "ix" driver is supported and tested on FreeBSD 13 and FreeBSD 12.3 ("Fedora" and "Debian" appear nowhere in that document):
>
> https://www.intel.com/content/www/us/en/download/19622/intel-ethernet-product-software-release-notes.html

Correction -- Table 2 on page 6 has a column "Debian 11" that indicates the ixgbe driver is "Supported Not Tested".

David
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/19/22 06:50, hw wrote: On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote: ... I suggest trying a Category 6A factory patch cable at least 2 meters long. I tried it with a 10m cat6 cable and the connection was intermittent. It's the same (as in "identical to") cable that works between the other server and a client. Okay. I suggest putting a unique mark/ serial number on each cable for tracking purposes until you resolve the intermittent connection issue. What OS's for the various machines? Fedora on the server and Debian on the backup server, Fedora on the client. Okay. If the NIC works correctly in the backup server with Fedora, maybe you should just use Fedora. Do you compile your own kernels and/or NIC drivers? No, I'm using the kernels that come with the distributions. Okay. That is the safest approach. I did compile the driver (i. e. module) from the source on Intels web site to see if a different driver would make a difference, and it didn't, so I restored the "original" module. Okay. Too bad it did not work; that seemed like a good suggestion. If you have another Broadcom NIC, what happens if you swap it with the Intel NIC in the backup server? I haven't tried yet because when I swap cards around, I'll have to redo the configuration and the server has some network cards passed through to a VM running OPNsense. I don't want to mess with that. Perhaps that is a good reason to do some devops development -- e.g. write a data-driven script that reads a configuration file to interconnect the VM virtual network interfaces and host physical network interfaces. I prefer to use a dedicated hardware device for my LAN (UniFi Security Gateway). I suspect it's a mainboard issue. I pulled the Intel card and then the on-board network card quit working. With the current Debian installation? Did you try the d-i rescue shell or any live sticks? I plugged the Intel card back in and the on-card worked again. 
I'd try disabling the on-board card but there is no option to do that in the BIOS. Okay. That indicates the issue is software. Do you have any diagnostic information that indicates the Intel NIC is overheating? No, the idea that it might overheat is from internet searches revealing that some people had issues with the card overheating and adding a fan blowing on the heatsink fixed the problem. I always had a fan blowing over it from the top of the card, so that should be fine, and placing another fan directly on the heatsink didn't make a difference. I took the extra fan out today when I was at it because it's awfully loud --- it's an old Delta fan from 2003 that comes from an old IBM server and it makes a good airstream :) The heat sink looks fine and unfortunately, it's designed in such a way that I can't remove it without breaking the pins holding the heatsink to the card, so I decided not to touch it. That's how I discovered that the on-board network card quit working when the Intel card wasn't plugged in ... Perhaps it's some kind of resource conflict or incompatibility, or the board is broken. At this point, all I can suggest is a program of A/B testing to isolate the faulty hardware and/or software component(s). Beware that you may have multiple faults, so be meticulous. I prefer FreeBSD for my servers. 
The "Intel® Ethernet Controller Products 27.7 Release Notes" indicate the "ix" driver is supported and tested on FreeBSD 13 and FreeBSD 12.3 ("Fedora" and "Debian" appear nowhere in that document):

https://www.intel.com/content/www/us/en/download/19622/intel-ethernet-product-software-release-notes.html

My FreeBSD-12.3-RELEASE-amd64 SOHO server has a man page ixgbe(4):

    NAME
         ixgbe - Intel(R) 10Gb Ethernet driver for the FreeBSD operating system

    SYNOPSIS
         To compile this driver into the kernel, place the following lines in
         your kernel configuration file:

               device iflib
               device ixgbe

         Alternatively, to load the driver as a module at boot time, place the
         following line in loader.conf(5):

               if_ixgbe_load="YES"

David
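The A/B testing program suggested above is easier to keep meticulous if every trial is written down in the same shape. This is a minimal sketch of one way to do that, assuming a plain CSV log; the function names, the column layout (date, card, slot, OS, cable, result) and the `ok`/`fail` vocabulary are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append one trial to a CSV log.
# usage: record_trial LOGFILE CARD SLOT OS CABLE RESULT   (RESULT: ok|fail)
record_trial() {
    printf '%s,%s,%s,%s,%s,%s\n' "$(date +%F)" "$2" "$3" "$4" "$5" "$6" >> "$1"
}

# Hypothetical helper: count failures grouped by one column of the log.
# usage: failures_by_column LOGFILE COLUMN   (2=card 3=slot 4=os 5=cable)
failures_by_column() {
    awk -F, -v col="$2" '
        { total[$col]++ }
        $6 == "fail" { failed[$col]++ }
        END {
            for (k in total)
                printf "%s %d/%d failed\n", k, failed[k], total[k]
        }
    ' "$1" | sort
}
```

For example, after `record_trial trials.csv X540 slot1 debian cat6-10m fail` and a few more entries, `failures_by_column trials.csv 2` groups the results by card and `failures_by_column trials.csv 3` by slot. If failures follow the slot rather than the card, that would support the mainboard theory; with multiple faults the picture may be less clean.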
Re: Intel X540-AT2 and Debian: intermittent connection
On Fri, 2022-11-18 at 21:27 -0500, Porter Smith wrote:
> Usually it is considered best practice to use nics from a known comparable
> vendor for example Intel dual port nics can be found on sites like Amazon or
> Newegg for a reasonable amount of $.

What do you think I'm using?
Re: Intel X540-AT2 and Debian: intermittent connection
On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote: > On 11/18/22 05:23, hw wrote: > > On Tue, 2022-11-15 at 16:42 -0800, David Christensen wrote: > > > On 11/15/22 07:15, hw wrote: > > > > On Tue, 2022-11-15 at 12:38 +0100, hw wrote: > > > > > On Mon, 2022-11-14 at 13:21 +0100, hw wrote: > [...] > > > What is the cable type? Length? Factory or home made? > > > > I got a new cable today which is rated as cat 8.1. It's only 1.5 meters > > long. > > I have tried 3 different cables now, two of them about 1.5 and another 10 > > meters > > long. Before I got the new cable, I tried the other port on the nic, and it > > made no difference. > > > > Even with the new cable, the connection is intermittent :( > > > Different category cables have different characteristic impedance, and > the NIC's are designed for specific cables. > [...] > So, I suggest trying a Category 6A factory patch cable at least 2 meters > long. I tried it with a 10m cat6 cable and the connection was intermittent. It's the same (as in "identical to") cable that works between the other server and a client. > > > What is connected to the other end of the cable? If it is a NIC in > > > another server, what happens if you swap the two NIC's? > > > > It's connected to a Broadcom NetXtreme II BCM57810 in another server. The > > other > > server has an identical mainboard and CPU in it, and the other port on the > > Broadcom is connected to a client with the same card, and that connection > > works > > fine. So I'm assuming that the Broadcom card is ok. > > > What OS's for the various machines? Fedora on the server and Debian on the backup server, Fedora on the client. > Do you compile your own kernels and/or NIC drivers? No, I'm using the kernels that come with the distributions. I did compile the driver (i. e. module) from the source on Intels web site to see if a different driver would make a difference, and it didn't, so I restored the "original" module. 
> > I'm about to move the client into a new case in a couple days and then I > > might > > swap the Broadcom from the client into the backup server. > > > If you have another Broadcom NIC, what happens if you swap it with the > Intel NIC in the backup server? I haven't tried yet because when I swap cards around, I'll have to redo the configuration and the server has some network cards passed through to a VM running OPNsense. I don't want to mess with that. I suspect it's a mainboard issue. I pulled the Intel card and then the on-board network card quit working. I plugged the Intel card back in and the on-card worked again. I'd try disabling the on-board card but there is no option to do that in the BIOS. > > Maybe I can reseat the heat sink on the card with new thermal paste. > > Overheating might explain why the connection is intermittent. > > > Do you have any diagnostic information that indicates the Intel NIC is > overheating? No, the idea that it might overheat is from internet searches revealing that some people had issues with the card overheating and adding a fan blowing on the heatsink fixed the problem. I always had a fan blowing over it from the top of the card, so that should be fine, and placing another fan directly on the heatsink didn't make a difference. I took the extra fan out today when I was at it because it's awfully loud --- it's an old Delta fan from 2003 that comes from an old IBM server and it makes a good airstream :) The heat sink looks fine and unfortunately, it's designed in such a way that I can't remove it without breaking the pins holding the heatsink to the card, so I decided not to touch it. That's how I discovered that the on-board network card quit working when the Intel card wasn't plugged in ... Perhaps it's some kind of resource conflict or incompatibility, or the board is broken.
networking is getting weirder (Re: Intel X540-AT2 and Debian: intermittent connection)
On Fri, 2022-11-18 at 16:00 +0100, hw wrote:
> On Fri, 2022-11-18 at 09:35 -0500, Jeffrey Walton wrote:
> > On Mon, Nov 14, 2022 at 6:25 AM hw wrote:
> > >
> > > I have an X540-AT2 network card in my backup server and it worked when
> > > I was running Fedora on the server.
> > >
> > > I installed Debian on it and wanted to make backups with rsync, but
> > > the connection via this network card is now intermittent where it used
> > > to be stable with Fedora.
> >
> > Fedora uses the latest version of a package that's available at the
> > release date. Maybe Fedora was using a newer driver than Debian?
>
> Then it should have worked when I booted a Fedora live from an USB stick ...
>
> > It looks like there's several updated Linux drivers at
> > https://www.intel.com/content/www/us/en/products/sku/60020/intel-ethernet-controller-x540at2/downloads.html
> > . Maybe you can try one of the newer drivers on the Debian machine?
>
> Yep, thanks, I tried that and it didn't make a difference.
>
> I've never had a broken network card and I think it's strange that the
> connection is intermittent. If it was broken, would it have a connection
> at all? I guess my best chance is reseating the heatsink.

So I pulled the Intel card today and I don't think I can reseat the heatsink
because it's attached with pins that'll break if I try to take the heatsink
off. It looks fine anyway.

So I switched the server on without the Intel card installed and now the
on-board network card doesn't work anymore :( Ethtool says no link detected,
'ip link' says DOWN, I can't bring the interface up. The light on the card
is green, the switch port it's connected to is green. It was working fine
yesterday.

I'll plug the Intel card back in and see what happens ...

Ok, the on-board card is working again. I can't tell if the connection is
intermittent now because no pings seem to go through at all.

What's going on? I'm starting to think this mainboard has issues ...
I've never seen anything like this before.
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/18/22 06:35, Jeffrey Walton wrote:
> On Mon, Nov 14, 2022 at 6:25 AM hw wrote:
> > I have an X540-AT2 network card in my backup server and it worked when I
> > was running Fedora on the server.
> >
> > I installed Debian on it and wanted to make backups with rsync, but the
> > connection via this network card is now intermittent where it used to be
> > stable with Fedora.
>
> Fedora uses the latest version of a package that's available at the
> release date. Maybe Fedora was using a newer driver than Debian?
>
> It looks like there's several updated Linux drivers at
> https://www.intel.com/content/www/us/en/products/sku/60020/intel-ethernet-controller-x540at2/downloads.html
> . Maybe you can try one of the newer drivers on the Debian machine?

Thank you for the link. A newer device driver could very well be the answer
to the OP's intermittent connection problems.

This Intel web page has a link to download the latest driver source code
tarball:

https://www.intel.com/content/www/us/en/download/14302/intel-network-adapter-driver-for-pcie-intel-10-gigabit-ethernet-network-connections-under-linux.html

    "Detailed Description
    Overview
    This is the most current release of the ixgbe driver for Linux, which
    supports kernel versions 2.6.18 up through 5.18.
    ...
    This download is valid for the product(s) listed below.
    ...
    Intel® Ethernet Controller X540-AT2
    ..."

(There is a similar link for FreeBSD; which may be another option for the
OP.)

Searching Debian packages for ixgbe, I see packages for "Data Plane
Development Kit (librte-pmd-ixgbe runtime library)", but not device drivers
(?). Similarly so for Debian backports.

The Linux kernel 5.10 includes the ixgbe driver, but I am unable to
determine a version number:

https://www.kernel.org/doc/html/v5.10/networking/device_drivers/ethernet/intel/ixgbe.html

    "Linux Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express
    Adapters
    ...
    Identifying Your Adapter
    The driver is compatible with devices based on the following:
    ...
    Intel(R) Ethernet Controller X540
    ..."

This web page has a HOWTO for updating the Linux ixgbe driver using Intel
source code:

https://www.xmodulo.com/download-install-ixgbe-driver-ubuntu-debian.html

David
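On the question of which driver version is actually running: `ethtool -i` on the Linux side prints driver, version and firmware fields for an interface, and `modinfo -F version` prints a module's version field if it has one. The sketch below only condenses the `ethtool -i` output into one line so the Debian, Fedora and self-compiled cases can be compared side by side. The function name `driver_summary` and the interface name in the usage comment are made up for illustration. On newer kernels the in-tree module may not carry a separate version string, in which case the reported version is likely just the kernel release; treat that as an assumption to verify.

```shell
#!/bin/sh
# Condense `ethtool -i` output (read from stdin) into a single line:
#   driver=<name> version=<ver> firmware=<fw>
# Fields that are missing or empty are reported as "unknown".
# driver_summary is a hypothetical helper name, not part of ethtool.
driver_summary() {
    awk -F': *' '
        $1 == "driver"           { drv = $2 }
        $1 == "version"          { ver = $2 }
        $1 == "firmware-version" { fw  = $2 }
        END {
            printf "driver=%s version=%s firmware=%s\n",
                (drv == "" ? "unknown" : drv),
                (ver == "" ? "unknown" : ver),
                (fw  == "" ? "unknown" : fw)
        }'
}

# Usage on the affected machine (enp1s0f0 is a placeholder interface name):
#   ethtool -i enp1s0f0 | driver_summary
#   modinfo -F version ixgbe    # may print nothing for an in-tree build
```

Running it once with the distribution module and once with the module built from the Intel tarball would at least confirm that the swap took effect.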
Re: Intel X540-AT2 and Debian: intermittent connection
Usually it is considered best practice to use NICs from a known compatible
vendor. For example, Intel dual-port NICs can be found on sites like Amazon
or Newegg for a reasonable amount of $.

On November 18, 2022 8:02:43 PM EST, David Christensen wrote:
>On 11/18/22 05:23, hw wrote:
>> On Tue, 2022-11-15 at 16:42 -0800, David Christensen wrote:
>>> On 11/15/22 07:15, hw wrote:
>>>> On Tue, 2022-11-15 at 12:38 +0100, hw wrote:
>>>>> On Mon, 2022-11-14 at 13:21 +0100, hw wrote:
>>>>>
>>>>> Any ideas? Backups over an 1GB link are excruciatingly slow ...
>>>>>
>>>> Update: I booted a Fedora live system and the connection is also
>>>> intermittent. So it's not a Debian issue. It's still an issue,
>>>> though ...
>>>
>>> What is the cable type? Length? Factory or home made?
>>
>> I got a new cable today which is rated as cat 8.1. It's only 1.5 meters
>> long. I have tried 3 different cables now, two of them about 1.5 and
>> another 10 meters long. Before I got the new cable, I tried the other
>> port on the nic, and it made no difference.
>>
>> Even with the new cable, the connection is intermittent :(
>
>Different category cables have different characteristic impedance, and the
>NIC's are designed for specific cables. The EE buzzword is "transmission
>line". You want to use the cables that Intel designed for -- Category 6A or
>Category 6, 100 meters or 55 meters maximum (respectively):
>
>https://www.intel.com/content/www/us/en/support/articles/07404/ethernet-products.html
>
>I do not see a specification for minimum length; either on Intel or STFW.
>Back in the day of 10BASE-*, I seem to recall hearing, reading, and/or
>learning 2 meters minimum.
>
>So, I suggest trying a Category 6A factory patch cable at least 2 meters
>long.
>
>>> What is connected to the other end of the cable? If it is a NIC in
>>> another server, what happens if you swap the two NIC's?
>>
>> It's connected to a Broadcom NetXtreme II BCM57810 in another server.
>> The other server has an identical mainboard and CPU in it, and the other
>> port on the Broadcom is connected to a client with the same card, and
>> that connection works fine. So I'm assuming that the Broadcom card is ok.
>
>What OS's for the various machines?
>
>Do you compile your own kernels and/or NIC drivers?
>
>> I'm about to move the client into a new case in a couple days and then I
>> might swap the Broadcom from the client into the backup server.
>
>If you have another Broadcom NIC, what happens if you swap it with the
>Intel NIC in the backup server?
>
>> Maybe I can reseat the heat sink on the card with new thermal paste.
>> Overheating might explain why the connection is intermittent.
>
>Do you have any diagnostic information that indicates the Intel NIC is
>overheating?
>
>David
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/18/22 05:23, hw wrote: On Tue, 2022-11-15 at 16:42 -0800, David Christensen wrote: On 11/15/22 07:15, hw wrote: On Tue, 2022-11-15 at 12:38 +0100, hw wrote: On Mon, 2022-11-14 at 13:21 +0100, hw wrote: Any ideas? Backups over an 1GB link are excruciatingly slow ... Update: I booted a Fedora live system and the connection is also intermittent. So it's not a Debian issue. It's still an issue, though ... What is the cable type? Length? Factory or home made? I got a new cable today which is rated as cat 8.1. It's only 1.5 meters long. I have tried 3 different cables now, two of them about 1.5 and another 10 meters long. Before I got the new cable, I tried the other port on the nic, and it made no difference. Even with the new cable, the connection is intermittent :( Different category cables have different characteristic impedance, and the NIC's are designed for specific cables. The EE buzzword is "transmission line". You want to use the cables that Intel designed for -- Category 6A or Category 6, 100 meters or 55 meters maximum (respectively): https://www.intel.com/content/www/us/en/support/articles/07404/ethernet-products.html I do not see a specification for minimum length; either on Intel or STFW. Back in the day of 10BASE-*, I seem to recall hearing, reading, and/or learning 2 meters minimum. So, I suggest trying a Category 6A factory patch cable at least 2 meters long. What is connected to the other end of the cable? If it is a NIC in another server, what happens if you swap the two NIC's? It's connected to a Broadcom NetXtreme II BCM57810 in another server. The other server has an identical mainboard and CPU in it, and the other port on the Broadcom is connected to a client with the same card, and that connection works fine. So I'm assuming that the Broadcom card is ok. What OS's for the various machines? Do you compile your own kernels and/or NIC drivers? 
I'm about to move the client into a new case in a couple days and then I might swap the Broadcom from the client into the backup server. If you have another Broadcom NIC, what happens if you swap it with the Intel NIC in the backup server? Maybe I can reseat the heat sink on the card with new thermal paste. Overheating might explain why the connection is intermittent. Do you have any diagnostic information that indicates the Intel NIC is overheating? David
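To make the cable figures above easy to check against the cables already tried, here is a small sketch that encodes only the two maximum lengths quoted in this message (Category 6A: 100 m, Category 6: 55 m). The function names are made up for illustration, lengths are whole meters, and the recalled 2 meter minimum is deliberately left out because it is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Maximum run length in meters for a cable category, using only the two
# figures quoted above (Category 6A: 100 m, Category 6: 55 m).
# cable_max_m and cable_check are hypothetical helper names.
cable_max_m() {
    case "$1" in
        6a|6A) echo 100 ;;
        6)     echo 55 ;;
        *)     return 1 ;;   # category not covered by the quoted figures
    esac
}

# cable_check CATEGORY LENGTH_M   (LENGTH_M in whole meters)
# Returns 0 if within the quoted maximum, 1 if longer, 2 if no figure exists.
cable_check() {
    max=$(cable_max_m "$1") || { echo "cat $1: no figure quoted"; return 2; }
    if [ "$2" -le "$max" ]; then
        echo "cat $1, $2 m: within $max m"
    else
        echo "cat $1, $2 m: exceeds $max m"
        return 1
    fi
}

# Usage:
#   cable_check 6 10     # the 10 m cat6 cable mentioned in the thread
#   cable_check 8.1 2    # cat 8.1 is not covered by the quoted figures
```

This says nothing about whether a given cable is electrically sound; it only shows that the lengths tried so far are far below the quoted maximums.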
Re: Intel X540-AT2 and Debian: intermittent connection
On Fri, 2022-11-18 at 09:35 -0500, Jeffrey Walton wrote:
> On Mon, Nov 14, 2022 at 6:25 AM hw wrote:
> >
> > I have an X540-AT2 network card in my backup server and it worked when I
> > was running Fedora on the server.
> >
> > I installed Debian on it and wanted to make backups with rsync, but the
> > connection via this network card is now intermittent where it used to be
> > stable with Fedora.
>
> Fedora uses the latest version of a package that's available at the
> release date. Maybe Fedora was using a newer driver than Debian?

Then it should have worked when I booted a Fedora live from an USB stick ...

> It looks like there's several updated Linux drivers at
> https://www.intel.com/content/www/us/en/products/sku/60020/intel-ethernet-controller-x540at2/downloads.html
> . Maybe you can try one of the newer drivers on the Debian machine?

Yep, thanks, I tried that and it didn't make a difference.

I've never had a broken network card and I think it's strange that the
connection is intermittent. If it was broken, would it have a connection at
all? I guess my best chance is reseating the heatsink.
Re: Intel X540-AT2 and Debian: intermittent connection
On Mon, Nov 14, 2022 at 6:25 AM hw wrote:
>
> I have an X540-AT2 network card in my backup server and it worked when I
> was running Fedora on the server.
>
> I installed Debian on it and wanted to make backups with rsync, but the
> connection via this network card is now intermittent where it used to be
> stable with Fedora.

Fedora uses the latest version of a package that's available at the release
date. Maybe Fedora was using a newer driver than Debian?

It looks like there's several updated Linux drivers at
https://www.intel.com/content/www/us/en/products/sku/60020/intel-ethernet-controller-x540at2/downloads.html
. Maybe you can try one of the newer drivers on the Debian machine?

Jeff
Re: Intel X540-AT2 and Debian: intermittent connection
On Tue, 2022-11-15 at 16:42 -0800, David Christensen wrote: > On 11/15/22 07:15, hw wrote: > > On Tue, 2022-11-15 at 12:38 +0100, hw wrote: > > > On Mon, 2022-11-14 at 13:21 +0100, hw wrote: > > > > On Mon, 2022-11-14 at 12:28 +0100, stefano gozzi wrote: > > > > > Please loot at this: > > > > > https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/ > > > > > > > > > > It seems that you need a 8x pcie slot to work fine > > > > > > > > Thanks, the card is in an 8x slot and has been working fine with > > > > Fedora. I > > > > didn't change anything but using Debian instead of Fedora. > > > > > > Ok I pulled the server from the rack and put another fan to blow directly > > > on > > > the > > > card in case it might overheat. > > > > > > And I have to correct myself. The card is in an 8x slot and according to > > > the > > > manual of the mainboard it's supposed to be 8x and not 4x. I pulled it > > > and > > > put > > > it back in. > > > > > > However, lspci says "LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)". > > > > > > Usually cards in PCI slots with 4 instead of 8 lanes still work fine, and > > > the > > > card did work in that slot with Fedora. > > > > > > I found that I had to unplug the network cable and to plug it back in > > > before I > > > could send/receive pings. I already tried a different network cable and > > > it > > > didn't make a difference. > > > > > > I suspect that Debian might be doing something differently or not doing > > > that > > > Fedora does which causes the intermittent connections. > > > > > > Any ideas? Backups over an 1GB link are excruciatingly slow ... > > > > > > > Update: I booted a Fedora live system and the connection is also > > intermittent. > > So it's not a Debian issue. It's still an issue, though ... > > > What is the cable type? Length? Factory or home made? I got a new cable today which is rated as cat 8.1. 
It's only 1.5 meters long. I have tried 3 different cables now, two of them about 1.5 and another 10 meters long. Before I got the new cable, I tried the other port on the nic, and it made no difference. Even with the new cable, the connection is intermittent :( > What is connected to the other end of the cable? If it is a NIC in > another server, what happens if you swap the two NIC's? It's connected to a Broadcom NetXtreme II BCM57810 in another server. The other server has an identical mainboard and CPU in it, and the other port on the Broadcom is connected to a client with the same card, and that connection works fine. So I'm assuming that the Broadcom card is ok. I'm about to move the client into a new case in a couple days and then I might swap the Broadcom from the client into the backup server. Maybe I can reseat the heat sink on the card with new thermal paste. Overheating might explain why the connection is intermittent.
Re: Intel X540-AT2 and Debian: intermittent connection
On 11/15/22 07:15, hw wrote: On Tue, 2022-11-15 at 12:38 +0100, hw wrote: On Mon, 2022-11-14 at 13:21 +0100, hw wrote: On Mon, 2022-11-14 at 12:28 +0100, stefano gozzi wrote: Please loot at this: https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/ It seems that you need a 8x pcie slot to work fine Thanks, the card is in an 8x slot and has been working fine with Fedora. I didn't change anything but using Debian instead of Fedora. Ok I pulled the server from the rack and put another fan to blow directly on the card in case it might overheat. And I have to correct myself. The card is in an 8x slot and according to the manual of the mainboard it's supposed to be 8x and not 4x. I pulled it and put it back in. However, lspci says "LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)". Usually cards in PCI slots with 4 instead of 8 lanes still work fine, and the card did work in that slot with Fedora. I found that I had to unplug the network cable and to plug it back in before I could send/receive pings. I already tried a different network cable and it didn't make a difference. I suspect that Debian might be doing something differently or not doing that Fedora does which causes the intermittent connections. Any ideas? Backups over an 1GB link are excruciatingly slow ... Update: I booted a Fedora live system and the connection is also intermittent. So it's not a Debian issue. It's still an issue, though ... What is the cable type? Length? Factory or home made? What is connected to the other end of the cable? If it is a NIC in another server, what happens if you swap the two NIC's? David
Re: Intel X540-AT2 and Debian: intermittent connection
On Tue, 2022-11-15 at 12:38 +0100, hw wrote:
> On Mon, 2022-11-14 at 13:21 +0100, hw wrote:
> > On Mon, 2022-11-14 at 12:28 +0100, stefano gozzi wrote:
> > > Please look at this:
> > > https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/
> > >
> > > It seems that you need a 8x pcie slot to work fine
> >
> > Thanks, the card is in an 8x slot and has been working fine with Fedora.
> > I didn't change anything but using Debian instead of Fedora.
>
> Ok I pulled the server from the rack and put another fan to blow directly
> on the card in case it might overheat.
>
> And I have to correct myself. The card is in an 8x slot and according to
> the manual of the mainboard it's supposed to be 8x and not 4x. I pulled it
> and put it back in.
>
> However, lspci says "LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)".
>
> Usually cards in PCI slots with 4 instead of 8 lanes still work fine, and
> the card did work in that slot with Fedora.
>
> I found that I had to unplug the network cable and to plug it back in
> before I could send/receive pings. I already tried a different network
> cable and it didn't make a difference.
>
> I suspect that Debian might be doing something differently or not doing
> that Fedora does which causes the intermittent connections.
>
> Any ideas? Backups over an 1GB link are excruciatingly slow ...

Update: I booted a Fedora live system and the connection is also
intermittent. So it's not a Debian issue. It's still an issue, though ...
Re: Intel X540-AT2 and Debian: intermittent connection
On Mon, 2022-11-14 at 13:21 +0100, hw wrote:
> On Mon, 2022-11-14 at 12:28 +0100, stefano gozzi wrote:
> > Please look at this:
> > https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/
> >
> > It seems that you need a 8x pcie slot to work fine
>
> Thanks, the card is in an 8x slot and has been working fine with Fedora. I
> didn't change anything but using Debian instead of Fedora.

Ok I pulled the server from the rack and put another fan to blow directly on
the card in case it might overheat.

And I have to correct myself. The card is in an 8x slot and according to the
manual of the mainboard it's supposed to be 8x and not 4x. I pulled it and
put it back in.

However, lspci says "LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)".

Usually cards in PCI slots with 4 instead of 8 lanes still work fine, and
the card did work in that slot with Fedora.

I found that I had to unplug the network cable and to plug it back in before
I could send/receive pings. I already tried a different network cable and it
didn't make a difference.

I suspect that Debian might be doing something differently or not doing
that Fedora does which causes the intermittent connections.

Any ideas? Backups over an 1GB link are excruciatingly slow ...
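For the "Width x4 (downgraded)" observation above, a rough sketch of how the negotiated link width could be compared with what the card advertises, by reading the LnkCap and LnkSta lines from `lspci -vv`. The function name `pcie_width_check` and the bus address in the usage comment are made up for illustration, and lspci usually needs root to show these capability lines.

```shell
#!/bin/sh
# Read `lspci -vv` output for one device on stdin and compare the negotiated
# PCIe link width (LnkSta) with the width the device advertises (LnkCap).
# Exit status: 0 = full width, 1 = fewer lanes than advertised, 2 = not found.
# pcie_width_check is a hypothetical helper name, not an lspci feature.
pcie_width_check() {
    awk '
        /LnkCap:/ && match($0, /Width x[0-9]+/) {
            cap = substr($0, RSTART + 7, RLENGTH - 7)   # digits after "Width x"
        }
        /LnkSta:/ && match($0, /Width x[0-9]+/) {
            sta = substr($0, RSTART + 7, RLENGTH - 7)
        }
        END {
            if (cap == "" || sta == "") { print "link width not found"; exit 2 }
            if (sta + 0 < cap + 0) {
                printf "downgraded: x%d of x%d lanes\n", sta, cap
                exit 1
            }
            printf "ok: x%d of x%d lanes\n", sta, cap
        }'
}

# Usage (01:00.0 is a placeholder bus address; find the real one with lspci):
#   lspci -vv -s 01:00.0 | pcie_width_check
```

For scale: at 5 GT/s with 8b/10b encoding each lane carries about 4 Gbit/s, so an x4 link offers roughly 16 Gbit/s per direction. That fits the remark that the card still worked in this slot, though a slot that is wired for x8 and trains at x4 can also hint at a seating or contact problem.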
Re: Intel X540-AT2 and Debian: intermittent connection
On Mon, 2022-11-14 at 12:28 +0100, stefano gozzi wrote:
> Please look at this:
> https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/
>
> It seems that you need a 8x pcie slot to work fine

Thanks, the card is in an 8x slot and has been working fine with Fedora. I
didn't change anything but using Debian instead of Fedora.
Re: Intel X540-AT2 and Debian: intermittent connection
Please look at this:
https://www.linuxquestions.org/questions/linux-networking-3/intel-x540-t2-network-card-installed-but-only-at-100mbit-cant-change-or-improve-4175686736/

It seems that you need an 8x PCIe slot for it to work fine.

On Mon, Nov 14, 2022 at 12:25 PM hw wrote:
>
> Hi,
>
> I have an X540-AT2 network card in my backup server and it worked when I
> was running Fedora on the server.
>
> I installed Debian on it and wanted to make backups with rsync, but the
> connection via this network card is now intermittent where it used to be
> stable with Fedora.
>
> The link always shows as up. Every now and then rsync halts and pings
> don't get through. After a while, rsync continues to work and pings also
> get through again.
>
> The interface is directly connected (i. e. without a switch in between) to
> another 10GB network card.
>
> When I make the backups over the "normal" 1GB link through a different
> network card (via switch), the 1GB connection is fine.
>
> Do I need to do something special to get the X540-AT2 to work? Is this a
> known issue with Debian? How could I debug this?
Intel X540-AT2 and Debian: intermittent connection
Hi,

I have an X540-AT2 network card in my backup server and it worked when I was
running Fedora on the server.

I installed Debian on it and wanted to make backups with rsync, but the
connection via this network card is now intermittent where it used to be
stable with Fedora.

The link always shows as up. Every now and then rsync halts and pings don't
get through. After a while, rsync continues to work and pings also get
through again.

The interface is directly connected (i. e. without a switch in between) to
another 10GB network card.

When I make the backups over the "normal" 1GB link through a different
network card (via switch), the 1GB connection is fine.

Do I need to do something special to get the X540-AT2 to work? Is this a
known issue with Debian? How could I debug this?
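On "How could I debug this?": one possible starting point, offered as a sketch rather than a recipe, is to log a long ping run and then list exactly where replies went missing, so the outages can be lined up against dmesg and the interface counters. The function name `ping_gaps`, the log file name and the address in the usage comment are placeholders. It assumes Linux iputils `ping` output with `icmp_seq=` fields, ignores sequence wrap-around, and does not report losses after the last reply received.

```shell
#!/bin/sh
# Read Linux `ping` output on stdin and print each run of missing replies as
#   lost seq A-B (N packets)
# Only "bytes from" reply lines count as received; error lines such as
# "Destination Host Unreachable" are ignored even though they carry icmp_seq.
# ping_gaps is a hypothetical helper name.
ping_gaps() {
    awk '
        /bytes from/ && match($0, /icmp_seq=[0-9]+/) {
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0   # digits after "icmp_seq="
            if (seen && seq > last + 1)
                printf "lost seq %d-%d (%d packets)\n", last + 1, seq - 1, seq - last - 1
            last = seq
            seen = 1
        }'
}

# Usage (192.0.2.10 stands in for the address of the directly connected peer):
#   ping -D 192.0.2.10 | tee ping.log     # -D prefixes each line with a timestamp
#   ping_gaps < ping.log
# While it runs, in other terminals:
#   dmesg -w                              # watch for link up/down or driver messages
#   ethtool -S IFACE                      # compare error counters before and after a gap
```

If the gaps coincide with link messages in dmesg, that points toward the physical layer or the card; if the link stays up and only the counters move, the driver or the slot becomes more interesting.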