Re: IPoIB issues

2010-03-10 Thread Eli Cohen
On Thu, Mar 11, 2010 at 09:47:31AM +0200, Or Gerlitz wrote:
> >The patch does not address these failures directly but maybe as a
> >side effect they would go away too.
> The patch seems to solve a case of possible "live lock" happening in
> a node which has both CM and datagram neighbors, e.g. where ipoib has
> called netif_stop etc. but there is now room in the QP for more
> postings, which could turn into letting the network layer continue to
> post if the CQ would have been polled. It's hard to see how this
> relates to the post send error print.
Right, I meant that they could disappear because the system no longer
gets into the state in which they show up, but the patch __does not__
address that problem.
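
To make the "live lock" scenario described above concrete, here is a
minimal userspace sketch. It is not the IPoIB driver code: struct mock_tx
and the mock_* functions are hypothetical names, and waking the queue as
soon as a single slot is free is a simplifying assumption. It only shows
that once the queue has been stopped, completed sends do not help until
someone polls the CQ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a TX ring; names do not come from the driver. */
struct mock_tx {
    unsigned int size;   /* ring capacity */
    unsigned int head;   /* next slot to post */
    unsigned int tail;   /* oldest slot not yet reclaimed */
    unsigned int done;   /* completions sitting unpolled in the CQ */
    bool stopped;        /* models netif_stop_queue() having been called */
};

/* Post one packet; stop the queue when the ring becomes full. */
bool mock_xmit(struct mock_tx *tx)
{
    if (tx->stopped)
        return false;
    tx->head++;
    if (tx->head - tx->tail == tx->size)
        tx->stopped = true;
    return true;
}

/* Hardware finishes one send: a completion lands in the CQ. */
void mock_hw_complete(struct mock_tx *tx)
{
    if (tx->done < tx->head - tx->tail)
        tx->done++;
}

/* Poll the CQ: reclaim slots and wake the queue if there is room again. */
void mock_poll_cq(struct mock_tx *tx)
{
    tx->tail += tx->done;
    tx->done = 0;
    if (tx->stopped && tx->head - tx->tail < tx->size)
        tx->stopped = false;   /* models netif_wake_queue() */
}

/* Walk through the stuck state and the recovery. */
void mock_tx_self_check(void)
{
    struct mock_tx tx = { .size = 2 };

    assert(mock_xmit(&tx));
    assert(mock_xmit(&tx));
    assert(tx.stopped);        /* ring full, queue stopped */

    mock_hw_complete(&tx);     /* room exists in hardware ... */
    assert(!mock_xmit(&tx));   /* ... but nothing polled the CQ yet */

    mock_poll_cq(&tx);         /* polling reclaims the slot and wakes us */
    assert(!tx.stopped);
    assert(mock_xmit(&tx));
}
```

Calling mock_tx_self_check() from a main() runs the scenario; if
mock_poll_cq() is never called, mock_xmit() keeps returning false, which
is the stuck state discussed here.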

> 
> >I think printing the return value is in place so in the future we will have 
> >more information in such cases.
> I posted a patch that does this, but I think it missed the 2.6.34
> merge cycle.
> 
Can you push them to OFED-1.5.1? We'll remove the patch later when
it's in the kernel but at least we'll have the information handy
if/when we need it.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: IPoIB issues

2010-03-10 Thread Or Gerlitz

Eli Cohen wrote:
> The patch does not address these failures directly but maybe as a
> side effect they would go away too.

The patch seems to solve a case of possible "live lock" happening in a
node which has both CM and datagram neighbors, e.g. where ipoib has
called netif_stop etc. but there is now room in the QP for more postings,
which could turn into letting the network layer continue to post if the
CQ would have been polled. It's hard to see how this relates to the post
send error print.

> I think printing the return value is in place so in the future we will
> have more information in such cases.

I posted a patch that does this, but I think it missed the 2.6.34 merge
cycle.


Or.


Re: IPoIB issues

2010-03-10 Thread Eli Cohen
On Wed, Mar 10, 2010 at 05:30:38PM +0200, Moni Shoua wrote:
> Hi Eli
> Although Josh already reported that the patch seems to fix the issue, I
> have a question.
> 
> The "post_send failed" prints occurred while working in datagram mode. I
> don't know if Josh verified that, but I don't expect these prints to go
> away, even with the patch. Am I right?
The patch does not address these failures directly but maybe as a side
effect they would go away too. Maybe Josh can share with us his
experience.

> 
> BTW, what could be the reason for UD QP post_send() failures?
> 

Usually they should not fail unless the WR is malformed or the QP has
all available WR outstanding, which should not happen in IPoIB. I
think printing the return value is in place so in the future we will
have more information in such cases.
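
As a rough illustration of why printing the return value helps, here is
a minimal userspace sketch. It is not the verbs API or the IPoIB code:
struct mock_qp, struct mock_wr, mock_post_send() and send_and_log() are
hypothetical, and the choice of -EINVAL for a malformed WR and -ENOMEM
for a full send queue is an assumption made for the example. The point
is only that the two failure causes named above become distinguishable
once the error code is logged.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for a QP and a work request. */
struct mock_qp {
    unsigned int max_send_wr;   /* send queue depth */
    unsigned int outstanding;   /* WRs posted but not yet completed */
};

struct mock_wr {
    int num_sge;                /* number of scatter/gather entries */
};

/* Returns 0 on success or a negative errno (assumed error codes). */
int mock_post_send(struct mock_qp *qp, const struct mock_wr *wr)
{
    if (wr->num_sge <= 0)
        return -EINVAL;         /* malformed WR */
    if (qp->outstanding >= qp->max_send_wr)
        return -ENOMEM;         /* every send WR is already outstanding */
    qp->outstanding++;
    return 0;
}

/* Log the return value instead of a bare "post_send failed". */
int send_and_log(struct mock_qp *qp, const struct mock_wr *wr)
{
    int rc = mock_post_send(qp, wr);

    if (rc)
        fprintf(stderr, "ib0: post_send failed, error %d (%s)\n",
                rc, strerror(-rc));
    return rc;
}

/* Exercise both failure causes and the success path. */
void post_send_self_check(void)
{
    struct mock_qp qp = { .max_send_wr = 1, .outstanding = 0 };
    struct mock_wr good = { .num_sge = 1 };
    struct mock_wr bad = { .num_sge = 0 };

    assert(send_and_log(&qp, &bad) == -EINVAL);
    assert(send_and_log(&qp, &good) == 0);
    assert(send_and_log(&qp, &good) == -ENOMEM);
}
```

With only "post_send failed" in the log, the two error paths look
identical; with the code printed, a full queue and a bad WR can be told
apart at a glance.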



Re: IPoIB issues

2010-03-10 Thread Moni Shoua
Eli Cohen wrote:
> I just posted a patch which might fix your problem. Please try it and
> let us know if it fixed anything.
> 
Hi Eli
Although Josh already reported that the patch seems to fix the issue, I
have a question.

The "post_send failed" prints occurred while working in datagram mode. I
don't know if Josh verified that, but I don't expect these prints to go
away, even with the patch. Am I right?

BTW, what could be the reason for UD QP post_send() failures?

>>
>> In datagram mode, I see errors on the boot servers of the form:
>>
>> ib0: post_send failed
>> ib0: post_send failed
>> ib0: post_send failed
>>
>>
>> When using connected mode, I hit a different error:
>>
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 1999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 2999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> ...
>> ...
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 61824999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>>
>>
>> The errors seem to hit only after NFS comes into play.  Once it
>> starts, the NETDEV WATCHDOG messages continue until I run
>> 'ifconfig ib0 down up'.  I've tried tuning send_queue_size and
>> recv_queue_size on both sides, the txqueuelen of the ib0 interface, and
>> the NFS rsize/wsize.  None of it seems to help greatly.  Does anyone
>> have any ideas about what I can do to try to fix these problems?
>>
>> -JE


Re: IPoIB issues

2010-03-03 Thread Josh England
I've applied the patch and initial testing has not produced any
transmit timeout errors.  I'll be doing some heavier testing in the
next couple days, but it looks good so far.  Thanks for the quick
turn-around!

-JE

On Wed, Mar 3, 2010 at 4:29 AM, Eli Cohen wrote:
> I just posted a patch which might fix your problem. Please try it and
> let us know if it fixed anything.
>
> On Tue, Mar 02, 2010 at 01:54:09PM -0800, Josh England wrote:
>> Hello,
>>
>> I've been running into several issues using IPoIB.  The 2 primary uses
>> are for read-only NFS to the clients (over TCP) and access to an
>> ethernet-connected parallel filesystem (Panasas) through router nodes
>> passing IPoIB<-->10GbE.
>>
>> All nodes are running CentOS 5.3 and OFED 1.4.2, although I have played
>> with OFED 1.5 and seen similar results.  Client nodes mount their NFS root
>> from boot servers via IPoIB with a ratio of 80:1.  The boot servers are the
>> ones that seem to have issues.  The fabric itself consists of ~1000 nodes
>> interconnected such that there is 2:1 oversubscription within any single
>> rack, and 20:1 oversubscription between racks (through the core switch).  I
>> don't know how much the oversubscription comes into play here as I can
>> reproduce the error within a single rack.
>>
>> In datagram mode, I see errors on the boot servers of the form:
>>
>> ib0: post_send failed
>> ib0: post_send failed
>> ib0: post_send failed
>>
>>
>> When using connected mode, I hit a different error:
>>
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 1999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 2999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> ...
>> ...
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 61824999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>>
>>
>> The errors seem to hit only after NFS comes into play.  Once it
>> starts, the NETDEV WATCHDOG messages continue until I run
>> 'ifconfig ib0 down up'.  I've tried tuning send_queue_size and
>> recv_queue_size on both sides, the txqueuelen of the ib0 interface, and
>> the NFS rsize/wsize.  None of it seems to help greatly.  Does anyone
>> have any ideas about what I can do to try to fix these problems?
>>
>> -JE


Re: IPoIB issues

2010-03-03 Thread Eli Cohen
I just posted a patch which might fix your problem. Please try it and
let us know if it fixed anything.

On Tue, Mar 02, 2010 at 01:54:09PM -0800, Josh England wrote:
> Hello,
> 
> I've been running into several issues using IPoIB.  The 2 primary uses
> are for read-only NFS to the clients (over TCP) and access to an
> ethernet-connected parallel filesystem (Panasas) through router nodes
> passing IPoIB<-->10GbE.
> 
> All nodes are running CentOS 5.3 and OFED 1.4.2, although I have played
> with OFED 1.5 and seen similar results.  Client nodes mount their NFS root
> from boot servers via IPoIB with a ratio of 80:1.  The boot servers are the
> ones that seem to have issues.  The fabric itself consists of ~1000 nodes
> interconnected such that there is 2:1 oversubscription within any single rack,
> and 20:1 oversubscription between racks (through the core switch).  I
> don't know how much the oversubscription comes into play here as I can
> reproduce the error within a single rack.
> 
> In datagram mode, I see errors on the boot servers of the form:
> 
> ib0: post_send failed
> ib0: post_send failed
> ib0: post_send failed
> 
> 
> When using connected mode, I hit a different error:
> 
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 1999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 2999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
> ...
> ...
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 61824999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
> 
> 
> The errors seem to hit only after NFS comes into play.  Once it
> starts, the NETDEV WATCHDOG messages continue until I run
> 'ifconfig ib0 down up'.  I've tried tuning send_queue_size and
> recv_queue_size on both sides, the txqueuelen of the ib0 interface, and
> the NFS rsize/wsize.  None of it seems to help greatly.  Does anyone
> have any ideas about what I can do to try to fix these problems?
> 
> -JE