Re: ibm_newemac tx problem with jumbo frame enabled

2011-12-08 Thread Prashant Bhole
On Thu, Dec 8, 2011 at 3:33 AM, Benjamin Herrenschmidt 
b...@kernel.crashing.org wrote:

 On Wed, 2011-12-07 at 13:35 +0530, Prashant Bhole wrote:
  Still couldn't find anything like fifo overflow...
  I noticed one more thing, this problem happens only when mtu size on
  the initiator (the other end) is set to 4088, regardless of any mtu
  size set for EMAC.

 Did you check all the registers that may carry errors? Nothing showed
 up? Did you check that things like pause frames were properly
 negotiated on both sides? Have you tried playing with the pause and FIFO
 thresholds?

 Other than using the tx timeout to perform resets I don't see a good way
 to fix that problem.

 Cheers,
 Ben.


I checked the RX descriptor status, the TX descriptor status and the ethtool output.
However, I don't know much about pause frames. How do I check whether
they are properly negotiated on both sides?
I still need to try changing the pause and FIFO thresholds.

ethtool output after disconnection is as follows:
# ethtool -S eth0
NIC statistics:
 rx_packets: 330939
 rx_bytes: 804963241
 tx_packets: 248554
 tx_bytes: 798853638
 rx_packets_csum: 330716
 tx_packets_csum: 179526
 tx_undo: 0
 rx_dropped_stack: 0
 rx_dropped_oom: 0
 rx_dropped_error: 0
 rx_dropped_resize: 0
 rx_dropped_mtu: 0
 rx_stopped: 0
 rx_bd_errors: 0
 rx_bd_overrun: 0
 rx_bd_bad_packet: 0
 rx_bd_runt_packet: 0
 rx_bd_short_event: 0
 rx_bd_alignment_error: 0
 rx_bd_bad_fcs: 0
 rx_bd_packet_too_long: 0
 rx_bd_out_of_range: 0
 rx_bd_in_range: 0
 rx_parity: 0
 rx_fifo_overrun: 0
 rx_overrun: 0
 rx_bad_packet: 0
 rx_runt_packet: 0
 rx_short_event: 0
 rx_alignment_error: 0
 rx_bad_fcs: 0
 rx_packet_too_long: 0
 rx_out_of_range: 0
 rx_in_range: 0
 tx_dropped: 0
 tx_bd_errors: 0
 tx_bd_bad_fcs: 0
 tx_bd_carrier_loss: 0
 tx_bd_excessive_deferral: 0
 tx_bd_excessive_collisions: 0
 tx_bd_late_collision: 0
 tx_bd_multple_collisions: 0
 tx_bd_single_collision: 0
 tx_bd_underrun: 0
 tx_bd_sqe: 0
 tx_parity: 0
 tx_underrun: 0
 tx_sqe: 0
 tx_errors: 0
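
With this many counters it is easy to overlook a single non-zero value. The following is a minimal sketch, assuming the `name: value` layout of the listing above; the helper name `nonzero_error_counters` and the choice of which counters count as plain traffic totals are illustrative assumptions, not part of the driver or of ethtool.

```python
import re

# Counters that are expected to be non-zero on a working link
# (names taken from the listing above; the grouping is an assumption).
TRAFFIC_TOTALS = {
    "rx_packets", "rx_bytes", "tx_packets", "tx_bytes",
    "rx_packets_csum", "tx_packets_csum",
}

def nonzero_error_counters(ethtool_stats):
    """Return {name: value} for every non-zero counter that is not a traffic total."""
    found = {}
    for line in ethtool_stats.splitlines():
        # Match lines such as " rx_fifo_overrun: 0"; the "NIC statistics:"
        # header does not match because it contains a space before the colon.
        m = re.match(r"\s*(\w+):\s*(\d+)\s*$", line)
        if m is None:
            continue
        name, value = m.group(1), int(m.group(2))
        if name not in TRAFFIC_TOTALS and value != 0:
            found[name] = value
    return found

# A shortened excerpt of the output quoted above.
sample = """NIC statistics:
 rx_packets: 330939
 tx_undo: 0
 rx_fifo_overrun: 0
 tx_errors: 0
"""
print(nonzero_error_counters(sample))
```

Feeding it the full `ethtool -S eth0` text before and after the stall would show whether any error counter moved at all.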


Thanks,
Prashant
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: ibm_newemac tx problem with jumbo frame enabled

2011-12-08 Thread Benjamin Herrenschmidt
On Thu, 2011-12-08 at 18:31 +0530, Prashant Bhole wrote:

 
 I checked the RX descriptor status, the TX descriptor status and the ethtool
 output.
 However, I don't know much about pause frames. How do I check whether
 they are properly negotiated on both sides?
 I still need to try changing the pause and FIFO thresholds.
 
 ethtool output after disconnection is as follows:
 # ethtool -S eth0
 NIC statistics:
  rx_packets: 330939
  rx_bytes: 804963241
  tx_packets: 248554
  tx_bytes: 798853638
  rx_packets_csum: 330716
  tx_packets_csum: 179526
  tx_undo: 0

 .../...

Ok so none of the error counters seem to trip, odd. No idea what's up,
you may want to ask the folks at APM (CCed Tirumala).

I wonder also if we are properly enabling the reporting of error
interrupts... if we got that wrong we may never detect FIFO overruns.
What you describe really looks like a fifo overrun to me.

Additionally, look at emac_configure() and see how it configures the pause
packet thresholds; maybe you can tweak the watermark to be more
aggressive. Also check that pause is actually enabled (with ethtool) and
that the PHY negotiated it properly (that the link partner supports
pause frames).
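
One way to check the local pause settings is `ethtool -a <interface>`, which prints the pause parameters. The sketch below is only an illustration: the helper names `parse_pause` and `read_pause` are made up here, the sample text follows the usual `ethtool -a` layout rather than anything captured from this board, and it reports only the local side, not what the link partner advertised.

```python
import subprocess

def parse_pause(text):
    """Parse `ethtool -a` style text into {"autonegotiate": bool, "rx": bool, "tx": bool}."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        # Only the three plain on/off lines are of interest; the header line
        # ("Pause parameters for eth0:") is skipped because its key differs.
        if sep and key in ("autonegotiate", "rx", "tx"):
            result[key] = value.strip().lower() == "on"
    return result

def read_pause(ifname="eth0"):
    """Run the ethtool binary (must be installed) and parse its pause report."""
    out = subprocess.run(
        ["ethtool", "-a", ifname],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_pause(out)

# Illustrative text only, not output from the board discussed in this thread.
sample = """Pause parameters for eth0:
Autonegotiate:  on
RX:             off
TX:             off
"""
print(parse_pause(sample))
```

If RX or TX pause turns out to be off, `ethtool -A` can be used to change the setting, provided the driver supports it.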

Cheers,
Ben.





RE: ibm_newemac tx problem with jumbo frame enabled

2011-12-08 Thread Tirumala Marri
Hi Ben,

-----Original Message-----
From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
Sent: Thursday, December 08, 2011 2:59 PM
To: Prashant Bhole
Cc: linuxppc-...@ozlabs.org; Tirumala Marri
Subject: Re: ibm_newemac tx problem with jumbo frame enabled

On Thu, 2011-12-08 at 18:31 +0530, Prashant Bhole wrote:


 I checked the RX descriptor status, the TX descriptor status and the ethtool
 output.
 However, I don't know much about pause frames. How do I check whether
 they are properly negotiated on both sides?
 I still need to try changing the pause and FIFO thresholds.

 ethtool output after disconnection is as follows:
 # ethtool -S eth0
 NIC statistics:
  rx_packets: 330939
  rx_bytes: 804963241
  tx_packets: 248554
  tx_bytes: 798853638
  rx_packets_csum: 330716
  tx_packets_csum: 179526
  tx_undo: 0

 .../...

Ok so none of the error counters seem to trip, odd. No idea what's up,
you may want to ask the folks at APM (CCed Tirumala).

I wonder also if we are properly enabling the reporting of error
interrupts... if we got that wrong we may never detect FIFO overruns.
What you describe really looks like a fifo overrun to me.

Additionally, look at emac_configure() and see how it configures the pause
packet thresholds; maybe you can tweak the watermark to be more
aggressive. Also check that pause is actually enabled (with ethtool) and
that the PHY negotiated it properly (that the link partner supports
pause frames).

I will take a look.
Thx,
Marri


Re: ibm_newemac tx problem with jumbo frame enabled

2011-12-07 Thread Prashant Bhole
On Fri, Nov 25, 2011 at 10:55 AM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
 On Fri, 2011-11-18 at 10:33 +0530, Prashant Bhole wrote:
 Hi,
 I have been facing a problem with the ibm_newemac driver (v3.54).
 The board gets disconnected and cannot be pinged during
 heavy network traffic. In my case I am running IOmeter
 All-in-One with 8 threads on the iSCSI target. MTU is 4088.

 I found that after executing emac_full_tx_reset(), the board can
 be pinged again. After another 5-6 seconds of heavy traffic,
 traffic stops again. The cycle repeats after each full tx reset.

 Is this a known issue? What could cause this?
 Any pointers would be greatly appreciated.

 Not that I know of. Can you check if any of the error reporting
 registers trip anything ? Could it just be a fifo overflow which we may
 not be handling properly in the driver ?

 Cheers,
 Ben.

Still couldn't find anything like fifo overflow...
I noticed one more thing, this problem happens only when mtu size on
the initiator (the other end) is set to 4088, regardless of any mtu size set
for EMAC.


-
Prashant


Re: ibm_newemac tx problem with jumbo frame enabled

2011-12-07 Thread Benjamin Herrenschmidt
On Wed, 2011-12-07 at 13:35 +0530, Prashant Bhole wrote:
 Still couldn't find anything like fifo overflow...
 I noticed one more thing, this problem happens only when mtu size on
 the initiator (the other end) is set to 4088, regardless of any mtu
 size set for EMAC. 

Did you check all the registers that may carry errors? Nothing showed
up? Did you check that things like pause frames were properly
negotiated on both sides? Have you tried playing with the pause and FIFO
thresholds?

Other than using the tx timeout to perform resets I don't see a good way
to fix that problem.

Cheers,
Ben.




Re: ibm_newemac tx problem with jumbo frame enabled

2011-11-24 Thread Benjamin Herrenschmidt
On Fri, 2011-11-18 at 10:33 +0530, Prashant Bhole wrote:
 Hi,
 I have been facing a problem with the ibm_newemac driver (v3.54).
 The board gets disconnected and cannot be pinged during
 heavy network traffic. In my case I am running IOmeter
 All-in-One with 8 threads on the iSCSI target. MTU is 4088.
 
 I found that after executing emac_full_tx_reset(), the board can
 be pinged again. After another 5-6 seconds of heavy traffic,
 traffic stops again. The cycle repeats after each full tx reset.
 
 Is this a known issue? What could cause this?
 Any pointers would be greatly appreciated.

Not that I know of. Can you check if any of the error reporting
registers trip anything ? Could it just be a fifo overflow which we may
not be handling properly in the driver ?

Cheers,
Ben.




ibm_newemac tx problem with jumbo frame enabled

2011-11-17 Thread Prashant Bhole
Hi,
I have been facing a problem with the ibm_newemac driver (v3.54).
The board gets disconnected and cannot be pinged during
heavy network traffic. In my case I am running IOmeter
All-in-One with 8 threads on the iSCSI target. MTU is 4088.

I found that after executing emac_full_tx_reset(), the board can
be pinged again. After another 5-6 seconds of heavy traffic,
traffic stops again. The cycle repeats after each full tx reset.

Is this a known issue? What could cause this?
Any pointers would be greatly appreciated.


-
Prashant