Re: IPoIB issues
On Thu, Mar 11, 2010 at 09:47:31AM +0200, Or Gerlitz wrote:
> >The patch does not address these failures directly but maybe as a
> >side effect they would go away too.
> The patch seems to solve a possible "live lock" on a node that has
> both CM and datagram neighbors, e.g. where ipoib has called
> netif_stop etc. but there is now room in the QP for more postings,
> which would let the network layer continue to post if the CQ were
> polled. It's hard to see how this relates to the post_send error
> print.

Right, I meant that they could disappear because the system no longer
gets into the state in which they show up, but the patch __does not__
address that problem.

> >I think printing the return value is in order, so that in the future
> >we will have more information in such cases.
> I posted a patch that does this, but I think it missed the 2.6.34
> merge cycle.

Can you push it to OFED-1.5.1? We'll remove the patch later when
it's in the kernel, but at least we'll have the information handy
if/when we need it.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IPoIB issues
Eli Cohen wrote:
> The patch does not address these failures directly but maybe as a
> side effect they would go away too.

The patch seems to solve a possible "live lock" on a node that has
both CM and datagram neighbors, e.g. where ipoib has called
netif_stop etc. but there is now room in the QP for more postings,
which would let the network layer continue to post if the CQ were
polled. It's hard to see how this relates to the post_send error
print.

> I think printing the return value is in order, so that in the future
> we will have more information in such cases.

I posted a patch that does this, but I think it missed the 2.6.34
merge cycle.

Or.
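The stall described above can be illustrated with a toy model. This is only a sketch and not the IPoIB driver code: `ToyTxQueue`, its methods, and the ring size are hypothetical names invented for illustration. It assumes the queue is woken only from the completion-polling path, so if nothing polls the CQ the queue stays stopped even though the hardware has already finished the sends.

```python
class ToyTxQueue:
    """Toy send ring whose 'stopped' flag is cleared only when completions are polled."""

    def __init__(self, size):
        self.size = size
        self.tx_head = 0      # count of sends posted so far (hypothetical counter)
        self.tx_tail = 0      # count of sends reaped so far
        self.completed = 0    # sends finished by the "hardware" but not yet polled
        self.stopped = False

    def outstanding(self):
        return self.tx_head - self.tx_tail

    def post(self):
        """Post one send; refuse while the stack-facing queue is stopped."""
        if self.stopped:
            return False
        self.tx_head += 1
        if self.outstanding() == self.size:
            self.stopped = True   # stands in for stopping the netdev queue
        return True

    def hardware_completes(self, n):
        """The hardware finishes up to n sends; nobody has looked at the CQ yet."""
        self.completed = min(self.completed + n, self.outstanding())

    def poll_cq(self):
        """Reap finished sends and wake the queue if there is room again."""
        self.tx_tail += self.completed
        self.completed = 0
        if self.stopped and self.outstanding() < self.size:
            self.stopped = False


q = ToyTxQueue(size=4)
filled = [q.post() for _ in range(4)]     # fills the ring and stops the queue
q.hardware_completes(4)                   # room exists, but the CQ is not polled
blocked = q.post()                        # still refused: the queue was never woken
q.poll_cq()                               # polling reaps completions and wakes it
resumed = q.post()
print(filled, blocked, resumed)
```

In this model the only way out of the stopped state is a call to `poll_cq()`, which is the dependency the paragraph above points at.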
Re: IPoIB issues
On Wed, Mar 10, 2010 at 05:30:38PM +0200, Moni Shoua wrote:
> Hi Eli
> Although Josh already reported that the patch seems to fix the issue,
> I have a question.
>
> The "post_send failed" prints appeared while working in datagram mode.
> I don't know if Josh verified that, but I don't expect these prints to
> go away, even with the patch. Am I right?

The patch does not address these failures directly but maybe as a
side effect they would go away too. Maybe Josh can share his
experience with us.

> BTW, what could be the reason for UD QP post_send() failures?

Usually they should not fail unless the WR is malformed or the QP has
all available WRs outstanding, which should not happen in IPoIB. I think
printing the return value is in order, so that in the future we will
have more information in such cases.
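To make the two failure causes and the value of logging the return code concrete, here is a small sketch. It is a model only, not the verbs API: `toy_post_send` and `format_failure` are hypothetical helpers, and the mapping of a malformed WR to `-EINVAL` and a full queue to `-ENOMEM` is an assumption made for illustration (real drivers may return other codes).

```python
import errno


def toy_post_send(outstanding, queue_depth, wr_is_valid):
    """Return 0 on success or a negative errno, mimicking kernel-style return codes."""
    if not wr_is_valid:
        return -errno.EINVAL      # assumed code for a malformed work request
    if outstanding >= queue_depth:
        return -errno.ENOMEM      # assumed code for "all WRs outstanding"
    return 0


def format_failure(ifname, rc):
    """Build a log line that includes the return code, unlike a bare 'post_send failed'."""
    name = errno.errorcode.get(-rc, "unknown")
    return f"{ifname}: post_send failed, error {rc} ({name})"


rc = toy_post_send(outstanding=64, queue_depth=64, wr_is_valid=True)
if rc:
    print(format_failure("ib0", rc))
```

With the code in the message, a full send queue and a malformed WR can be told apart from the log alone.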
Re: IPoIB issues
Eli Cohen wrote:
> I just posted a patch which might fix your problem. Please try it and
> let us know if it fixed anything.

Hi Eli

Although Josh already reported that the patch seems to fix the issue,
I have a question.

The "post_send failed" prints appeared while working in datagram mode.
I don't know if Josh verified that, but I don't expect these prints to
go away, even with the patch. Am I right?

BTW, what could be the reason for UD QP post_send() failures?

>> In datagram mode, I see errors on the boot servers of the form:
>>
>> ib0: post_send failed
>> ib0: post_send failed
>> ib0: post_send failed
>>
>> When using connected mode, I hit a different error:
>>
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 1999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 2999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> ...
>> ...
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 61824999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>>
>> The errors seem to hit only after NFS comes into play. Once it
>> starts, the NETDEV WATCHDOG messages continue until I run
>> 'ifconfig ib0 down up'. I've tried tuning send_queue_size and
>> recv_queue_size on both sides, the txqueuelen of the ib0 interface,
>> and the NFS rsize/wsize. None of it seems to help greatly. Does
>> anyone have any ideas about what I can do to try to fix these
>> problems?
>>
>> -JE
Re: IPoIB issues
I've applied the patch and initial testing has not produced any
transmit timeout errors. I'll be doing some heavier testing in the next
couple of days, but it looks good so far. Thanks for the quick
turn-around!

-JE

On Wed, Mar 3, 2010 at 4:29 AM, Eli Cohen wrote:
> I just posted a patch which might fix your problem. Please try it and
> let us know if it fixed anything.
>
> On Tue, Mar 02, 2010 at 01:54:09PM -0800, Josh England wrote:
>> Hello,
>>
>> I've been running into several issues using IPoIB. The 2 primary uses
>> are for read-only NFS to the clients (over TCP) and access to an
>> ethernet-connected parallel filesystem (Panasas) through router nodes
>> passing IPoIB<-->10GbE.
>>
>> All nodes are running CentOS 5.3 and OFED 1.4.2, although I have played
>> with OFED 1.5 and seen similar results. Client nodes mount their NFS root
>> from boot servers via IPoIB with a ratio of 80:1. The boot servers are the
>> ones that seem to have issues. The fabric itself consists of ~1000 nodes
>> interconnected such that there is 2:1 oversubscription within any single
>> rack, and 20:1 oversubscription between racks (through the core switch). I
>> don't know how much the oversubscription comes into play here as I can
>> reproduce the error within a single rack.
>>
>> In datagram mode, I see errors on the boot servers of the form:
>>
>> ib0: post_send failed
>> ib0: post_send failed
>> ib0: post_send failed
>>
>> When using connected mode, I hit a different error:
>>
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 1999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 2999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>> ...
>> ...
>> NETDEV WATCHDOG: ib0: transmit timed out
>> ib0: transmit timeout: latency 61824999 msecs
>> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>>
>> The errors seem to hit only after NFS comes into play. Once it
>> starts, the NETDEV WATCHDOG messages continue until I run
>> 'ifconfig ib0 down up'. I've tried tuning send_queue_size and
>> recv_queue_size on both sides, the txqueuelen of the ib0 interface,
>> and the NFS rsize/wsize. None of it seems to help greatly. Does
>> anyone have any ideas about what I can do to try to fix these
>> problems?
>>
>> -JE
Re: IPoIB issues
I just posted a patch which might fix your problem. Please try it and
let us know if it fixed anything.

On Tue, Mar 02, 2010 at 01:54:09PM -0800, Josh England wrote:
> Hello,
>
> I've been running into several issues using IPoIB. The 2 primary uses
> are for read-only NFS to the clients (over TCP) and access to an
> ethernet-connected parallel filesystem (Panasas) through router nodes
> passing IPoIB<-->10GbE.
>
> All nodes are running CentOS 5.3 and OFED 1.4.2, although I have played
> with OFED 1.5 and seen similar results. Client nodes mount their NFS root
> from boot servers via IPoIB with a ratio of 80:1. The boot servers are the
> ones that seem to have issues. The fabric itself consists of ~1000 nodes
> interconnected such that there is 2:1 oversubscription within any single
> rack, and 20:1 oversubscription between racks (through the core switch). I
> don't know how much the oversubscription comes into play here as I can
> reproduce the error within a single rack.
>
> In datagram mode, I see errors on the boot servers of the form:
>
> ib0: post_send failed
> ib0: post_send failed
> ib0: post_send failed
>
> When using connected mode, I hit a different error:
>
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 1999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 2999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
> ...
> ...
> NETDEV WATCHDOG: ib0: transmit timed out
> ib0: transmit timeout: latency 61824999 msecs
> ib0: queue stopped 1, tx_head 2154042680, tx_tail 2154039464
>
> The errors seem to hit only after NFS comes into play. Once it
> starts, the NETDEV WATCHDOG messages continue until I run
> 'ifconfig ib0 down up'. I've tried tuning send_queue_size and
> recv_queue_size on both sides, the txqueuelen of the ib0 interface,
> and the NFS rsize/wsize. None of it seems to help greatly. Does
> anyone have any ideas about what I can do to try to fix these
> problems?
>
> -JE
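For reading the "queue stopped" lines in the report, the send backlog can be worked out from the two counters. The sketch below assumes tx_head and tx_tail are free-running unsigned 32-bit counters, so the difference is taken modulo 2^32; `outstanding_sends` is a hypothetical helper, not a driver function. Both counters are identical in every watchdog report quoted, which is consistent with nothing being posted or reaped while the queue was stopped.

```python
def outstanding_sends(tx_head, tx_tail, counter_bits=32):
    """Posted-but-unreaped sends, assuming free-running unsigned counters that wrap."""
    return (tx_head - tx_tail) % (1 << counter_bits)


# Values copied from the "queue stopped" lines in the report above.
backlog = outstanding_sends(2154042680, 2154039464)
print(backlog)  # 3216
```

The modulo keeps the result correct even if tx_head has wrapped past zero while tx_tail has not.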