Re: tg3 pxe weirdness

2017-09-27 Thread Siva Reddy Kallam
On Wed, Sep 27, 2017 at 3:35 PM, Berend De Schouwer wrote:
> On Mon, 2017-09-25 at 15:11 +0530, Siva Reddy Kallam wrote:
>> On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer
>>  wrote:
>> > On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
>> > >
>> > >
>> > > Can you please share below details?
>> > > 1) Model and Manufacturer of the system
>> > > 2) Linux distro/kernel used?
>> >
>> > 4.13.3 gets a little further, but after some more data is
>> > transferred
>> > the tg3 driver still crashes.  This is unfortunately before I've
>> > got a
>> > writeable filesystem.
>> >
>> > The last line is:
>> > tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
>> >
>> > I've got some ideas to get the full dmesg.
>> >
>> > As with the other kernels it works OK on 1Gbps, but not slower
>> > switches.
>>
>> I am suspecting with link aware mode, the clock speed could be slow
>> and boot code does not
>> complete within the expected time with lower link speeds. So,
>> Providing a patch to override clock.
>> Can you please try with attached debug patch and provide us the
>> feedback with 100M link?
>> If it solves this issue, we will work on proper changes.
>
> This does work on 4.13.3 and PXE for me.
>
> I've tested on 1 Gbps, 100 Mbps and 10 Mbps.  I've done some
> preliminary testing (eg. large file copies.)

Good. We will work on the required changes and upstream a proper patch
after sanity testing at multiple link speeds.


Re: tg3 pxe weirdness

2017-09-27 Thread Berend De Schouwer
On Mon, 2017-09-25 at 15:11 +0530, Siva Reddy Kallam wrote:
> On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer
>  wrote:
> > On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> > > 
> > > 
> > > Can you please share below details?
> > > 1) Model and Manufacturer of the system
> > > 2) Linux distro/kernel used?
> > 
> > 4.13.3 gets a little further, but after some more data is
> > transferred
> > the tg3 driver still crashes.  This is unfortunately before I've
> > got a
> > writeable filesystem.
> > 
> > The last line is:
> > tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
> > 
> > I've got some ideas to get the full dmesg.
> > 
> > As with the other kernels it works OK on 1Gbps, but not slower
> > switches.
> 
> I am suspecting with link aware mode, the clock speed could be slow
> and boot code does not
> complete within the expected time with lower link speeds. So,
> Providing a patch to override clock.
> Can you please try with attached debug patch and provide us the
> feedback with 100M link?
> If it solves this issue, we will work on proper changes.

This does work on 4.13.3 and PXE for me.

I've tested on 1 Gbps, 100 Mbps and 10 Mbps links, and I've done some
preliminary testing (e.g. large file copies).


Re: tg3 pxe weirdness

2017-09-25 Thread Siva Reddy Kallam
On Fri, Sep 22, 2017 at 9:04 PM, Berend De Schouwer wrote:
> On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
>>
>>
>> Can you please share below details?
>> 1) Model and Manufacturer of the system
>> 2) Linux distro/kernel used?
>
> 4.13.3 gets a little further, but after some more data is transferred
> the tg3 driver still crashes.  This is unfortunately before I've got a
> writeable filesystem.
>
> The last line is:
> tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2
>
> I've got some ideas to get the full dmesg.
>
> As with the other kernels it works OK on 1Gbps, but not slower
> switches.

I suspect that in link-aware mode the clock speed could be too slow at
lower link speeds, so the boot code does not complete within the
expected time. I am providing a patch to override the clock.
Can you please try the attached debug patch and give us feedback
with a 100M link?
If it solves this issue, we will work on proper changes.


0001-tg3-Add-clock-override-support-for-5762.patch
Description: Binary data


Re: tg3 pxe weirdness

2017-09-22 Thread Berend De Schouwer
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> 
> 
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

4.13.3 gets a little further, but after some more data is transferred
the tg3 driver still crashes.  This is unfortunately before I've got a
writeable filesystem.

The last line is:
tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2

I've got some ideas to get the full dmesg.

As with the other kernels it works OK on 1Gbps, but not slower
switches.
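
For anyone collecting these messages from a serial console or netconsole
capture, here is a minimal sketch that pulls the fields out of the
timeout line quoted above. The function name and regular expression are
my own; it assumes both values are printed in hexadecimal, and it matches
the PCI address loosely because it appears here as ":01:00.0".

```python
import re

# Loose pattern for the tg3_stop_block timeout line quoted in this thread.
# Assumption: ofs and enable_bit are hexadecimal without a 0x prefix.
STOP_BLOCK_RE = re.compile(
    r"tg3\s+(?P<pci>\S*):\s+tg3_stop_block timed out,\s+"
    r"ofs=(?P<ofs>[0-9a-fA-F]+)\s+enable_bit=(?P<bit>[0-9a-fA-F]+)"
)


def parse_stop_block(line):
    """Return the PCI address, register offset and enable bit, or None."""
    m = STOP_BLOCK_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m.group("pci"),
        "ofs": int(m.group("ofs"), 16),
        "enable_bit": int(m.group("bit"), 16),
    }


if __name__ == "__main__":
    sample = "tg3 :01:00.0: tg3_stop_block timed out, ofs=4c00 enable_bit=2"
    print(parse_stop_block(sample))
```

The offset can then be looked up among the register definitions in the
driver's tg3.h to see which hardware block failed to stop.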


Re: tg3 pxe weirdness

2017-09-22 Thread Berend De Schouwer
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer
>  wrote:
> > Hi,
> > 
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that
> > fails to receive network packets under some very specific
> > conditions.
> > 
> > 
> > Berend
> 
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

3.10.107 behaves like CentOS' kernel.


Re: tg3 pxe weirdness

2017-09-22 Thread Berend De Schouwer
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer
>  wrote:
> > Hi,
> > 
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that
> > fails to receive network packets under some very specific
> > conditions.
> > 
> > 
> > Berend
> 
> Can you please share below details?
> 1) Model and Manufacturer of the system
> 2) Linux distro/kernel used?

4.13.3 mainline locks up, but network dumps confirm it gets further. 
The second DHCP query completes, and the first NFS mount completes.

I have to rely on network dumps since on PXE and SATA boot it breaks
VGA output, so I can't see what it's doing, or why it stops.


Re: tg3 pxe weirdness

2017-09-22 Thread Berend De Schouwer
On Fri, 2017-09-22 at 11:51 +0530, Siva Reddy Kallam wrote:
> On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer
>  wrote:
> > Hi,
> > 
> > I've got a machine with a Broadcom bcm5762c, using the tg3 driver,
> > that
> > fails to receive network packets under some very specific
> > conditions.
> > 
> > 
> > Berend
> 
> Can you please share below details?
> 1) Model and Manufacturer of the system

HP EliteDesk 705 G3 SFF
lspci -n on the network card: 01:00.0 0200: 14e4:1687 (rev 10)

dmesg from tg3: 
tg3 :01:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
tg3 :01:00.0: setting latency timer to 64
tg3 :01:00.0: eth0: Tigon3 [partno(none) rev 5762100] (PCI Express) MAC address 70:5a:0f:3d:9f:4f
tg3 :01:00.0: eth0: attached PHY is 5762C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
tg3 :01:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
tg3 :01:00.0: eth0: dma_rwctrl[0001] dma_mask[64-bit]

The same motherboard with add-on network cards does boot using PXE.
It's an EFI board, but Secure Boot is currently disabled.
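
In case it helps when comparing against other machines, here is a rough
sketch that splits an "lspci -n" line like the one above into numeric
IDs. The helper names are hypothetical. The only facts it relies on are
that 14e4 is the Broadcom PCI vendor ID and that class 0200 is an
Ethernet controller.

```python
import re

# Matches "slot class: vendor:device (rev NN)" as printed by "lspci -n".
LSPCI_N_RE = re.compile(
    r"^(?P<slot>[0-9a-fA-F:.]+)\s+(?P<cls>[0-9a-fA-F]{4}):\s+"
    r"(?P<vendor>[0-9a-fA-F]{4}):(?P<device>[0-9a-fA-F]{4})"
    r"(?:\s+\(rev\s+(?P<rev>[0-9a-fA-F]+)\))?"
)

BROADCOM_VENDOR_ID = 0x14E4
ETHERNET_CLASS = 0x0200


def parse_lspci_n(line):
    """Parse one 'lspci -n' line into a dict of ints, or return None."""
    m = LSPCI_N_RE.match(line.strip())
    if m is None:
        return None
    rev = m.group("rev")
    return {
        "slot": m.group("slot"),
        "class": int(m.group("cls"), 16),
        "vendor": int(m.group("vendor"), 16),
        "device": int(m.group("device"), 16),
        "rev": int(rev, 16) if rev is not None else None,
    }


def is_broadcom_ethernet(info):
    """True if the parsed entry is an Ethernet controller from Broadcom."""
    return info["vendor"] == BROADCOM_VENDOR_ID and info["class"] == ETHERNET_CLASS


if __name__ == "__main__":
    info = parse_lspci_n("01:00.0 0200: 14e4:1687 (rev 10)")
    print(info, is_broadcom_ethernet(info))
```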


> 2) Linux distro/kernel used?

CentOS 6 + updates.  I've tried all the CentOS 6 kernels up to
2.6.32-696.10.2.el6.x86_64.

I've tried updating the tg3 driver from 3.137 to both 3.137h and
3.137o, with the same result.

I'm currently in the process of making 4.13.2 boot.


Re: tg3 pxe weirdness

2017-09-21 Thread Siva Reddy Kallam
On Thu, Sep 21, 2017 at 7:53 PM, Berend De Schouwer wrote:
> Hi,
>
> I've got a machine with a Broadcom bcm5762c, using the tg3 driver, that
> fails to receive network packets under some very specific conditions.
>
> It works perfectly using a 1Gbps switch.  If, however, it first uses
> PXE and then loads the Linux tg3 driver, and the switch is 100Mbps, it
> no longer receives packets larger than ICMP.
>
> It does do ARP and ping.
>
> If it boots using PXE on a 1Gbps switch, boots into Linux, and then
> it's plugged into 100 Mbps it also stops receiving packets.
>
> mii-diag and dmesg confirm auto-negotiated speed and flow control, and
> confirm temporary disconnect as the cables are moved.
>
> PXE boots using UNDI, which then transfers a kernel using TFTP, which
> transfers correctly.  The kernel boots, loads the tg3 driver, connects
> the network.  Up to this point everything works.  Ping will work too.
> Any other network traffic fails.
>
> Booting from a harddrive works fine.  I assume the UNDI driver
> somewhere breaks auto-negotiation.  I've tried using mii-tool and
> ethtool, but I haven't managed to make it work yet.
>
> Is it possible to get negotiation working after PXE boot?  Are there
> any tg3 driver flags that might make a difference?
>
>
> Berend

Can you please share the details below?
1) Model and Manufacturer of the system
2) Linux distro/kernel used?


tg3 pxe weirdness

2017-09-21 Thread Berend De Schouwer
Hi,

I've got a machine with a Broadcom bcm5762c, using the tg3 driver, that
fails to receive network packets under some very specific conditions.

It works perfectly using a 1Gbps switch.  If, however, it first uses
PXE and then loads the Linux tg3 driver, and the switch is 100Mbps, it
no longer receives packets larger than ICMP.

It does do ARP and ping.

If it boots using PXE on a 1Gbps switch, boots into Linux, and then
it's plugged into 100 Mbps it also stops receiving packets.

mii-diag and dmesg confirm auto-negotiated speed and flow control, and
confirm temporary disconnect as the cables are moved.
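
As a cross-check on what mii-diag reports, the negotiated link state can
also be read from sysfs on kernels that expose it. This is only a sketch:
the function names are mine, "eth0" is taken from the dmesg output in this
thread, and it assumes the operstate, speed and duplex attributes exist
under /sys/class/net/<iface>/ on the kernel in question. Reads can fail
when the link is down, so errors are turned into None.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Read operstate, speed and duplex for iface; missing values are None."""
    base = Path(sysfs_root) / iface
    state = {}
    for name in ("operstate", "speed", "duplex"):
        try:
            state[name] = (base / name).read_text().strip()
        except OSError:
            # Attribute missing, or the read was rejected (e.g. no link).
            state[name] = None
    return state


def negotiated_mbps(state):
    """Return the link speed in Mbps as an int, or None if it is unknown."""
    try:
        mbps = int(state["speed"])
    except (TypeError, ValueError):
        return None
    # Assumption: a non-positive value means the speed is unknown.
    return mbps if mbps > 0 else None


if __name__ == "__main__":
    state = read_link_state("eth0")
    print(state, negotiated_mbps(state))
```

Running it once on the 1Gbps switch and once on the 100Mbps switch gives
a quick record of what was negotiated in each case.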

PXE boots using UNDI, which then transfers a kernel using TFTP, which
transfers correctly.  The kernel boots, loads the tg3 driver, connects
the network.  Up to this point everything works.  Ping will work too. 
Any other network traffic fails.

Booting from a hard drive works fine.  I assume the UNDI driver
somewhere breaks auto-negotiation.  I've tried using mii-tool and
ethtool, but I haven't managed to make it work yet.

Is it possible to get negotiation working after PXE boot?  Are there
any tg3 driver flags that might make a difference?


Berend