Re: tg3 pxe weirdness
On Wed, Sep 27, 2017 at 3:35 PM, Berend De Schouwer wrote:
> On Mon, 2017-09-25 at 15:11 +0530, Siva Reddy Kallam wrote:
>> On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer wrote:
>> > On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
>> > > Can you please share below details?
>> > > 1) Model and Manufacturer of the system
>> > > 2) Linux distro/kernel used?
>> >
>> > 4.13.3 gets a little further, but after some more data is transferred
>> > the tg3 driver still crashes. This is unfortunately before I've got a
>> > writeable filesystem.
>> >
>> > The last line is:
>> > tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
>> >
>> > I've got some ideas to get the full dmesg.
>> >
>> > As with the other kernels it works OK on 1Gbps, but not slower
>> > switches.
>>
>> I am suspecting with link aware mode, the clock speed could be slow
>> and boot code does not complete within the expected time with lower
>> link speeds. So, Providing a patch to override clock.
>> Can you please try with attached debug patch and provide us the
>> feedback with 100M link?
>> If it solves this issue, we will work on proper changes.
>
> This does work on 4.13.3 and PXE for me.
>
> I've tested on 1 Gbps, 100 Mbps and 10 Mbps. I've done some
> preliminary testing (eg. large file copies.)

Good. We will work on required changes and upstream proper patch after
sanity test with multiple speeds.
Re: tg3 pxe weirdness
On Mon, 2017-09-25 at 15:11 +0530, Siva Reddy Kallam wrote:
> On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer wrote:
> > On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> > > Can you please share below details?
> > > 1) Model and Manufacturer of the system
> > > 2) Linux distro/kernel used?
> >
> > 4.13.3 gets a little further, but after some more data is transferred
> > the tg3 driver still crashes. This is unfortunately before I've got a
> > writeable filesystem.
> >
> > The last line is:
> > tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
> >
> > I've got some ideas to get the full dmesg.
> >
> > As with the other kernels it works OK on 1Gbps, but not slower
> > switches.
>
> I am suspecting with link aware mode, the clock speed could be slow
> and boot code does not complete within the expected time with lower
> link speeds. So, Providing a patch to override clock.
> Can you please try with attached debug patch and provide us the
> feedback with 100M link?
> If it solves this issue, we will work on proper changes.

This does work on 4.13.3 and PXE for me.

I've tested on 1 Gbps, 100 Mbps and 10 Mbps. I've done some
preliminary testing (eg. large file copies.)
Re: tg3 pxe weirdness
On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer wrote:
> On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
>> Can you please share below details?
>> 1) Model and Manufacturer of the system
>> 2) Linux distro/kernel used?
>
> 4.13.3 gets a little further, but after some more data is transferred
> the tg3 driver still crashes. This is unfortunately before I've got a
> writeable filesystem.
>
> The last line is:
> tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
>
> I've got some ideas to get the full dmesg.
>
> As with the other kernels it works OK on 1Gbps, but not slower
> switches.

I am suspecting with link aware mode, the clock speed could be slow
and boot code does not complete within the expected time with lower
link speeds. So, Providing a patch to override clock.
Can you please try with attached debug patch and provide us the
feedback with 100M link?
If it solves this issue, we will work on proper changes.

Attachment: 0001-tg3-Add-clock-override-support-for-5762.patch (binary data)
Re: tg3 pxe weirdness
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

4.13.3 gets a little further, but after some more data is transferred
the tg3 driver still crashes. This is unfortunately before I've got a
writeable filesystem.

The last line is:
tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2

I've got some ideas to get the full dmesg.

As with the other kernels it works OK on 1Gbps, but not slower
switches.
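
For anyone collecting these timeouts from captured logs, the line quoted above can be picked apart mechanically. What follows is only a sketch: it assumes both fields are printed in hexadecimal (the value 4c00 suggests so), and the helper name parse_stop_block is made up for illustration, not something from the driver.

```python
import re

# Matches the tg3 timeout message quoted in this thread.
# Assumption: both "ofs" and "enable_bit" are hexadecimal numbers.
STOP_BLOCK_RE = re.compile(
    r"tg3_stop_block timed out, "
    r"ofs=(?P<ofs>[0-9a-fA-F]+) enable_bit=(?P<bit>[0-9a-fA-F]+)"
)


def parse_stop_block(line):
    """Return (register_offset, enable_bit) as integers, or None if no match."""
    match = STOP_BLOCK_RE.search(line)
    if match is None:
        return None
    return int(match.group("ofs"), 16), int(match.group("bit"), 16)


if __name__ == "__main__":
    sample = "tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2"
    print(parse_stop_block(sample))
```

The offset and bit can then be looked up against the register definitions in the driver source for the kernel being tested.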
Re: tg3 pxe weirdness
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer wrote:
> > Hi,
> >
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that fails to receive network packets under some very specific
> > conditions.
> >
> > Berend
>
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

3.10.107 behaves like CentOS' kernel.
Re: tg3 pxe weirdness
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer wrote:
> > Hi,
> >
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that fails to receive network packets under some very specific
> > conditions.
> >
> > Berend
>
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

4.13.3 mainline locks up, but network dumps confirm it gets further.
The second DHCP query completes, and the first NFS mount completes.

I have to rely on network dumps since on PXE and SATA boot it breaks
VGA output, so I can't see what it's doing, or why it stops.
Re: tg3 pxe weirdness
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer wrote:
> > Hi,
> >
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that fails to receive network packets under some very specific
> > conditions.
> >
> > Berend
>
> Can you please share below details?
> 1) Model and Manufacturer of the system

HP EliteDesk 705 G3 SFF

lspci -n on the network card:
01:00.0 0200: 14e4:1687 (rev 10)

dmesg from tg3:
tg3 :01:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
tg3 :01:00.0: setting latency timer to 64
tg3 :01:00.0: eth0: Tigon3 [partno(none) rev 5762100] (PCI Express) MAC address 70:5a:0f:3d:9f:4f
tg3 :01:00.0: eth0: attached PHY is 5762C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
tg3 :01:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 :01:00.0: eth0: dma_rwctrl[0001] dma_mask[64-bit]

The same motherboard with add-on network cards does boot using PXE.

It's an EFI board but Secureboot is currently disabled.

> 2) Linux distro/kernel used?

CentOS 6 + updates. I've tried all the CentOS 6 kernels up to
2.6.32-696.10.2.el6.x86_64

I've tried updating the tg3 driver from 3.137 to both 3.137h and
3.137o, with the same result.

I'm currently in the process of making 4.13.2 boot.
Re: tg3 pxe weirdness
On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer wrote:
> Hi,
>
> I've got a machine with a Broadcom bcm5762c, using the tg3 driver, that
> fails to receive network packets under some very specific conditions.
>
> It works perfectly using a 1Gbps switch. If, however, it first uses
> PXE and then loads the Linux tg3 driver, and the switch is 100Mbps, it
> no longer receives packets larger than ICMP.
>
> It does do ARP and ping.
>
> If it boots using PXE on a 1Gbps switch, boots into Linux, and then
> it's plugged into 100 Mbps it also stops receiving packets.
>
> mii-diag and dmesg confirm auto-negotiated speed and flow control, and
> confirm temporary disconnect as the cables are moved.
>
> PXE boots using UNDI, which then transfers a kernel using TFTP, which
> transfers correctly. The kernel boots, loads the tg3 driver, connects
> the network. Up to this point everything works. Ping will work too.
> Any other network traffic fails.
>
> Booting from a harddrive works fine. I assume the UNDI driver
> somewhere breaks auto-negotiation. I've tried using mii-tool and
> ethtool, but I haven't managed to make it work yet.
>
> Is it possible to get negotiation working after PXE boot? Are there
> any tg3 driver flags that might make a difference?
>
>
> Berend

Can you please share below details?
1) Model and Manufacturer of the system
2) Linux distro/kernel used?
tg3 pxe weirdness
Hi,

I've got a machine with a Broadcom bcm5762c, using the tg3 driver, that
fails to receive network packets under some very specific conditions.

It works perfectly using a 1Gbps switch. If, however, it first uses
PXE and then loads the Linux tg3 driver, and the switch is 100Mbps, it
no longer receives packets larger than ICMP.

It does do ARP and ping.

If it boots using PXE on a 1Gbps switch, boots into Linux, and then
it's plugged into 100 Mbps it also stops receiving packets.

mii-diag and dmesg confirm auto-negotiated speed and flow control, and
confirm temporary disconnect as the cables are moved.

PXE boots using UNDI, which then transfers a kernel using TFTP, which
transfers correctly. The kernel boots, loads the tg3 driver, connects
the network. Up to this point everything works. Ping will work too.
Any other network traffic fails.

Booting from a harddrive works fine. I assume the UNDI driver
somewhere breaks auto-negotiation. I've tried using mii-tool and
ethtool, but I haven't managed to make it work yet.

Is it possible to get negotiation working after PXE boot? Are there
any tg3 driver flags that might make a difference?


Berend
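
As a cross-check on what mii-diag reports, the negotiated link state can also be read from sysfs once the kernel is up. This is a minimal sketch, assuming the standard /sys/class/net/<iface>/ attributes operstate, speed and duplex are available on the kernel in use; the function name read_link_state and the interface name eth0 are only illustrative.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Collect operstate, speed (Mbps) and duplex for one interface.

    Attributes that are missing or unreadable (reading "speed" can fail
    while the link is down) are reported as None.
    """
    base = Path(sysfs_root) / iface
    state = {}
    for name in ("operstate", "speed", "duplex"):
        try:
            state[name] = (base / name).read_text().strip()
        except OSError:
            state[name] = None
    return state


if __name__ == "__main__":
    # "eth0" is an assumption; substitute the interface bound to tg3.
    print(read_link_state("eth0"))
```

Running it once on the 1Gbps switch and once on the 100Mbps switch gives two snapshots to compare with the mii-diag output.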