Re: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
On 9/30/2010 5:35 PM, Ralph Campbell wrote:
> I was looking at the Rx connection tear down and found a bug.
> I don't know if it would cause this panic but you might try it.
> I haven't stress tested it but it compiles and basic network
> connections work.
>
> I also don't like the call to cancel_delayed_work(&priv->cm.stale_task)
> at the end of ipoib_cm_dev_stop(). I think it should be called after
> ib_destroy_cm_id() and priv->cm.id = NULL.
>

Ralph,

I have managed to recreate this crash a few times under stress. I expect
to be able to try your patch some time next week, and will let you know.
Thanks for taking time to look into this.

Thanks
Pradeep

> On Thu, 2010-09-02 at 20:41 -0700, Pradeep Satyanarayana wrote:
>> Ralph,
>>
>> I see the following crash sporadically (only under stress) with a Sles11SP1
>> (which is 2.6.32 kernel).
>> I saw this crash with V4 of your patch and have not yet had a chance to try
>> V5. Have you seen this in your testing? If this not the crash stack can you
>> please share what your patch fixes?
>>
>> <4>ib0: RX drain timing out
>> <4>idr_remove called for id=11491974 which is not allocated.
>> <4>Call Trace:
>> <4>[c00749fe33b0] [c00129e4] .show_stack+0x6c/0x198 (unreliable)
>> <4>[c00749fe3460] [c02ea594] .sub_remove+0x1ec/0x1f8
>> <4>[c00749fe3520] [c02ea5e0] .idr_remove+0x40/0xf8
>> <4>[c00749fe35b0] [d00012d84d70] .cm_destroy_id+0xa0/0x520 [ib_cm]
>> <4>[c00749fe3680] [d0001b7fb644] .ipoib_cm_free_rx_reap_list+0xd4/0x190 [ib_ipoib]
>> <4>[c00749fe3740] [d0001b7fe404] .ipoib_cm_dev_stop+0x23c/0x360 [ib_ipoib]
>> <4>[c00749fe3800] [d0001b7f4dbc] .ipoib_ib_dev_stop+0xe4/0x4b0 [ib_ipoib]
>> <4>[c00749fe3960] [d0001b7f0f30] .ipoib_stop+0x88/0x178 [ib_ipoib]
>> <4>[c00749fe39f0] [c04eacf4] .dev_close+0xdc/0x148
>> <4>[c00749fe3a80] [c04ea2b8] .dev_change_flags+0x1f0/0x288
>> <4>[c00749fe3b20] [d0001b7f11b8] .ipoib_remove_one+0xb8/0x140 [ib_ipoib]
>> <4>[c00749fe3bc0] [d0001210425c] .ib_unregister_client+0xb4/0x1b8 [ib_core]
>> <4>[c00749fe3c90] [d0001b7ffde8] .ipoib_cleanup_module+0x20/0x60 [ib_ipoib]
>> <4>[c00749fe3d20] [c00ec408] .SyS_delete_module+0x238/0x320
>> <4>[c00749fe3e30] [c00085b4] syscall_exit+0x0/0x40
>> <1>Unable to handle kernel paging request for data at address 0x4527228d1ffb
>> <1>Faulting instruction address: 0xc05a8e88
>> 12:mon> e
>> cpu 0x12: Vector: 300 (Data Access) at [c00749fe3250]
>>     pc: c05a8e88: .wait_for_common+0xb8/0x268
>>     lr: c05a8e20: .wait_for_common+0x50/0x268
>>     sp: c00749fe34d0
>>    msr: 80009032
>>    dar: 4527228d1ffb
>>  dsisr: 4200
>>   current = 0xc0074b4ce0e0
>>   paca    = 0xc0f64a00
>>     pid   = 13605, comm = modprobe
>> 12:mon>
>>
>> Thanks
>> Pradeep
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
I was looking at the Rx connection tear down and found a bug.
I don't know if it would cause this panic but you might try it.
I haven't stress tested it but it compiles and basic network
connections work.

I also don't like the call to cancel_delayed_work(&priv->cm.stale_task)
at the end of ipoib_cm_dev_stop(). I think it should be called after
ib_destroy_cm_id() and priv->cm.id = NULL.

On Thu, 2010-09-02 at 20:41 -0700, Pradeep Satyanarayana wrote:
> Ralph,
>
> I see the following crash sporadically (only under stress) with a Sles11SP1
> (which is 2.6.32 kernel).
> I saw this crash with V4 of your patch and have not yet had a chance to try
> V5. Have you seen this in your testing? If this not the crash stack can you
> please share what your patch fixes?
>
> <4>ib0: RX drain timing out
> <4>idr_remove called for id=11491974 which is not allocated.
> <4>Call Trace:
> <4>[c00749fe33b0] [c00129e4] .show_stack+0x6c/0x198 (unreliable)
> <4>[c00749fe3460] [c02ea594] .sub_remove+0x1ec/0x1f8
> <4>[c00749fe3520] [c02ea5e0] .idr_remove+0x40/0xf8
> <4>[c00749fe35b0] [d00012d84d70] .cm_destroy_id+0xa0/0x520 [ib_cm]
> <4>[c00749fe3680] [d0001b7fb644] .ipoib_cm_free_rx_reap_list+0xd4/0x190 [ib_ipoib]
> <4>[c00749fe3740] [d0001b7fe404] .ipoib_cm_dev_stop+0x23c/0x360 [ib_ipoib]
> <4>[c00749fe3800] [d0001b7f4dbc] .ipoib_ib_dev_stop+0xe4/0x4b0 [ib_ipoib]
> <4>[c00749fe3960] [d0001b7f0f30] .ipoib_stop+0x88/0x178 [ib_ipoib]
> <4>[c00749fe39f0] [c04eacf4] .dev_close+0xdc/0x148
> <4>[c00749fe3a80] [c04ea2b8] .dev_change_flags+0x1f0/0x288
> <4>[c00749fe3b20] [d0001b7f11b8] .ipoib_remove_one+0xb8/0x140 [ib_ipoib]
> <4>[c00749fe3bc0] [d0001210425c] .ib_unregister_client+0xb4/0x1b8 [ib_core]
> <4>[c00749fe3c90] [d0001b7ffde8] .ipoib_cleanup_module+0x20/0x60 [ib_ipoib]
> <4>[c00749fe3d20] [c00ec408] .SyS_delete_module+0x238/0x320
> <4>[c00749fe3e30] [c00085b4] syscall_exit+0x0/0x40
> <1>Unable to handle kernel paging request for data at address 0x4527228d1ffb
> <1>Faulting instruction address: 0xc05a8e88
> 12:mon> e
> cpu 0x12: Vector: 300 (Data Access) at [c00749fe3250]
>     pc: c05a8e88: .wait_for_common+0xb8/0x268
>     lr: c05a8e20: .wait_for_common+0x50/0x268
>     sp: c00749fe34d0
>    msr: 80009032
>    dar: 4527228d1ffb
>  dsisr: 4200
>   current = 0xc0074b4ce0e0
>   paca    = 0xc0f64a00
>     pid   = 13605, comm = modprobe
> 12:mon>
>
> Thanks
> Pradeep

IB/ipoib: fix race when handling IPOIB_CM_RX_DRAIN_WRID

From: Ralph Campbell

ipoib_cm_start_rx_drain() calls ib_post_send() and *then* moves the
struct ipoib_cm_rx onto the rx_drain_list. The ib_post_send() will
trigger a completion callback to ipoib_cm_handle_rx_wc() which tries
to move the rx_drain_list to the rx_reap_list but if the callback
happens before ipoib_cm_start_rx_drain() has moved the structure,
it is left in limbo. The fix is to change ipoib_cm_start_rx_drain()
to put the struct on the rx_drain_list and then call ib_post_send().
Also, only move one struct from rx_flush_list to rx_drain_list since
concurrent IPOIB_CM_RX_DRAIN_WRID events on different QPs could put
multiple ipoib_cm_rx structs on rx_flush_list.

Signed-off-by: Ralph Campbell
---

 drivers/infiniband/ulp/ipoib/ipoib_cm.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index bb10041..dfff159 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -216,15 +216,21 @@ static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
 	    !list_empty(&priv->cm.rx_drain_list))
 		return;
 
+	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
+
+	/*
+	 * Put p on rx_drain_list before calling ib_post_send() or there
+	 * is a race with the ipoib_cm_handle_rx_wc() completion handler
+	 * trying to remove it from rx_drain_list.
+	 */
+	list_move(&p->list, &priv->cm.rx_drain_list);
+
 	/*
 	 * QPs on flush list are error state.  This way, a "flush
 	 * error" WC will be immediately generated for each WR we post.
 	 */
-	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
 	if (ib_post_send(p->qp, &ipoib_cm_rx_drain_wr, &bad_wr))
 		ipoib_warn(priv, "failed to post drain wr\n");
-
-	list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_drain_list);
 }
 
 static void ipoib_cm_rx_event_handler(struct ib_event *event, void *ctx)
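
For readers following the thread, the ordering problem described in the commit message can be modelled in user space. The sketch below is only an illustration under stated assumptions, not driver code: `struct rx_ctx`, `enum rx_list`, `handle_drain_wc()`, `post_drain_wr()` and the two `start_drain_*()` helpers are hypothetical names, and the completion is modelled as firing synchronously inside the post, which is the worst case of the race rather than how the hardware normally behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: which list an rx context currently sits on. */
enum rx_list { RX_FLUSH, RX_DRAIN, RX_REAP };

struct rx_ctx {
	enum rx_list on;
};

/*
 * Stand-in for the drain completion handler: it only moves entries
 * that are already on the drain list over to the reap list.
 */
static void handle_drain_wc(struct rx_ctx *p)
{
	if (p->on == RX_DRAIN)
		p->on = RX_REAP;
}

/*
 * Stand-in for posting the drain work request. Assumption: the
 * completion is delivered before the post returns (worst case).
 */
static void post_drain_wr(struct rx_ctx *p)
{
	handle_drain_wc(p);
}

/* Old ordering: post first, then move to the drain list. */
static enum rx_list start_drain_post_first(struct rx_ctx *p)
{
	post_drain_wr(p);	/* completion sees nothing on the drain list */
	p->on = RX_DRAIN;	/* entry now waits for a completion that already ran */
	return p->on;
}

/* Patched ordering: move to the drain list, then post. */
static enum rx_list start_drain_move_first(struct rx_ctx *p)
{
	p->on = RX_DRAIN;
	post_drain_wr(p);	/* completion finds the entry and reaps it */
	return p->on;
}
```

Calling `start_drain_post_first()` on a context that starts on the flush list leaves it on the drain list with no completion left to move it, while `start_drain_move_first()` ends with the context on the reap list.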
RE: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
I haven't seen this stack trace before.
Since it involves the RX QP connection and my patch changes the
TX QP connection, I doubt my patch has any effect on this case.
When I get some time, I will look to see if I can find similar
races in the RX connection set up & tear down that might exist.

From: Pradeep Satyanarayana [prade...@linux.vnet.ibm.com]
Sent: Thursday, September 02, 2010 8:41 PM
To: Ralph Campbell
Cc: Roland Dreier; linux-rdma@vger.kernel.org
Subject: Re: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path

Ralph,

I see the following crash sporadically (only under stress) with a Sles11SP1
(which is 2.6.32 kernel).
I saw this crash with V4 of your patch and have not yet had a chance to try
V5. Have you seen this in your testing? If this not the crash stack can you
please share what your patch fixes?

<4>ib0: RX drain timing out
<4>idr_remove called for id=11491974 which is not allocated.
<4>Call Trace:
<4>[c00749fe33b0] [c00129e4] .show_stack+0x6c/0x198 (unreliable)
<4>[c00749fe3460] [c02ea594] .sub_remove+0x1ec/0x1f8
<4>[c00749fe3520] [c02ea5e0] .idr_remove+0x40/0xf8
<4>[c00749fe35b0] [d00012d84d70] .cm_destroy_id+0xa0/0x520 [ib_cm]
<4>[c00749fe3680] [d0001b7fb644] .ipoib_cm_free_rx_reap_list+0xd4/0x190 [ib_ipoib]
<4>[c00749fe3740] [d0001b7fe404] .ipoib_cm_dev_stop+0x23c/0x360 [ib_ipoib]
<4>[c00749fe3800] [d0001b7f4dbc] .ipoib_ib_dev_stop+0xe4/0x4b0 [ib_ipoib]
<4>[c00749fe3960] [d0001b7f0f30] .ipoib_stop+0x88/0x178 [ib_ipoib]
<4>[c00749fe39f0] [c04eacf4] .dev_close+0xdc/0x148
<4>[c00749fe3a80] [c04ea2b8] .dev_change_flags+0x1f0/0x288
<4>[c00749fe3b20] [d0001b7f11b8] .ipoib_remove_one+0xb8/0x140 [ib_ipoib]
<4>[c00749fe3bc0] [d0001210425c] .ib_unregister_client+0xb4/0x1b8 [ib_core]
<4>[c00749fe3c90] [d0001b7ffde8] .ipoib_cleanup_module+0x20/0x60 [ib_ipoib]
<4>[c00749fe3d20] [c00ec408] .SyS_delete_module+0x238/0x320
<4>[c00749fe3e30] [c00085b4] syscall_exit+0x0/0x40
<1>Unable to handle kernel paging request for data at address 0x4527228d1ffb
<1>Faulting instruction address: 0xc05a8e88
12:mon> e
cpu 0x12: Vector: 300 (Data Access) at [c00749fe3250]
    pc: c05a8e88: .wait_for_common+0xb8/0x268
    lr: c05a8e20: .wait_for_common+0x50/0x268
    sp: c00749fe34d0
   msr: 80009032
   dar: 4527228d1ffb
 dsisr: 4200
  current = 0xc0074b4ce0e0
  paca    = 0xc0f64a00
    pid   = 13605, comm = modprobe
12:mon>

Thanks
Pradeep
Re: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
Ralph,

I see the following crash sporadically (only under stress) with SLES 11 SP1
(which is a 2.6.32 kernel).
I saw this crash with V4 of your patch and have not yet had a chance to try
V5. Have you seen this in your testing? If this is not the crash stack, can
you please share what your patch fixes?

<4>ib0: RX drain timing out
<4>idr_remove called for id=11491974 which is not allocated.
<4>Call Trace:
<4>[c00749fe33b0] [c00129e4] .show_stack+0x6c/0x198 (unreliable)
<4>[c00749fe3460] [c02ea594] .sub_remove+0x1ec/0x1f8
<4>[c00749fe3520] [c02ea5e0] .idr_remove+0x40/0xf8
<4>[c00749fe35b0] [d00012d84d70] .cm_destroy_id+0xa0/0x520 [ib_cm]
<4>[c00749fe3680] [d0001b7fb644] .ipoib_cm_free_rx_reap_list+0xd4/0x190 [ib_ipoib]
<4>[c00749fe3740] [d0001b7fe404] .ipoib_cm_dev_stop+0x23c/0x360 [ib_ipoib]
<4>[c00749fe3800] [d0001b7f4dbc] .ipoib_ib_dev_stop+0xe4/0x4b0 [ib_ipoib]
<4>[c00749fe3960] [d0001b7f0f30] .ipoib_stop+0x88/0x178 [ib_ipoib]
<4>[c00749fe39f0] [c04eacf4] .dev_close+0xdc/0x148
<4>[c00749fe3a80] [c04ea2b8] .dev_change_flags+0x1f0/0x288
<4>[c00749fe3b20] [d0001b7f11b8] .ipoib_remove_one+0xb8/0x140 [ib_ipoib]
<4>[c00749fe3bc0] [d0001210425c] .ib_unregister_client+0xb4/0x1b8 [ib_core]
<4>[c00749fe3c90] [d0001b7ffde8] .ipoib_cleanup_module+0x20/0x60 [ib_ipoib]
<4>[c00749fe3d20] [c00ec408] .SyS_delete_module+0x238/0x320
<4>[c00749fe3e30] [c00085b4] syscall_exit+0x0/0x40
<1>Unable to handle kernel paging request for data at address 0x4527228d1ffb
<1>Faulting instruction address: 0xc05a8e88
12:mon> e
cpu 0x12: Vector: 300 (Data Access) at [c00749fe3250]
    pc: c05a8e88: .wait_for_common+0xb8/0x268
    lr: c05a8e20: .wait_for_common+0x50/0x268
    sp: c00749fe34d0
   msr: 80009032
   dar: 4527228d1ffb
 dsisr: 4200
  current = 0xc0074b4ce0e0
  paca    = 0xc0f64a00
    pid   = 13605, comm = modprobe
12:mon>

Thanks
Pradeep
[PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
Basically, a struct sk_buff has a pointer to a struct dst_entry which
has a pointer to a struct neighbour. IPoIB uses struct neighbour.ha
to store the IB "hardware address" and a pointer to a struct ipoib_neigh.
When using connected mode, struct ipoib_neigh has a pointer to
struct ipoib_cm_tx which contains pointers back to struct ipoib_neigh
and ipoib_path.

The core network code will call ipoib_neigh_cleanup() when it is
destroying struct neighbour and IPoIB should guarantee that the
struct ipoib_neigh and all the memory it points to are destroyed.
Also, the connected mode RC QP contains a pointer back to the
struct ipoib_cm_tx which can be dereferenced in the CQ completion
handler.

The problem is that there are several places where the
struct ipoib_cm_tx can be used after it has been freed.
The easiest way to reproduce this bug is to run a UDP bandwidth test
while loading/unloading the IPoIB module or ifup/ifdown the interface.

The fix is rather complex since the RC QP connection setup, tear down,
and CQ completion draining are asynchronous processes.
The struct ipoib_cm_tx goes through four "states":

1) Newly created by ipoib_cm_create_tx()
   neigh!=NULL, flags==0, and linked on priv->cm.start_list.
2) Being used by ipoib_cm_tx_start()
   neigh!=NULL, not on priv->cm.start_list, flags==0, and in
   the process of starting CM.
3) Being used by CM or sending data on the RC QP
   neigh!=NULL, not on priv->cm.start_list,
   flags & IPOIB_FLAG_INITIALIZED.
4) Being destroyed
   tx->neigh==NULL and on priv->cm.reap_task list or being destroyed
   by ipoib_cm_tx_reap().
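
The four states listed above could be told apart with a small predicate. What follows is a hedged user-space sketch, not code from the patch: `struct tx_model`, `enum tx_state`, `tx_classify()` and `TX_FLAG_INITIALIZED` are hypothetical stand-ins, and the real driver keeps these fields in struct ipoib_cm_tx and tests its flag bit differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the IPOIB_FLAG_INITIALIZED bit (assumed mask form). */
#define TX_FLAG_INITIALIZED 0x1u

/* Hypothetical reduction of the fields the state description refers to. */
struct tx_model {
	const void *neigh;	/* non-NULL until teardown begins */
	unsigned int flags;
	bool on_start_list;	/* linked on the start list? */
};

/* Values follow the numbering 1) to 4) in the description. */
enum tx_state {
	TX_NEW = 1,	/* created, waiting on the start list */
	TX_STARTING,	/* taken off the start list, CM being started */
	TX_ACTIVE,	/* initialized, in use by CM or the RC QP */
	TX_DYING	/* neigh cleared, queued for or undergoing reaping */
};

static enum tx_state tx_classify(const struct tx_model *tx)
{
	if (tx->neigh == NULL)
		return TX_DYING;
	if (tx->flags & TX_FLAG_INITIALIZED)
		return TX_ACTIVE;
	return tx->on_start_list ? TX_NEW : TX_STARTING;
}
```

Checking `neigh == NULL` first mirrors the description, where a cleared neigh pointer marks a tx that is being destroyed regardless of its flags.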

Signed-off-by: Ralph Campbell
---

 drivers/infiniband/ulp/ipoib/ipoib.h           |   14 +
 drivers/infiniband/ulp/ipoib/ipoib_cm.c        |  108 ++
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |  266
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   76 ++-
 4 files changed, 223 insertions(+), 241 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 753a983..5a842d7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -415,9 +415,7 @@ static inline struct ipoib_neigh **to_ipoib_neigh(struct neighbour *neigh)
 				     INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
-				      struct net_device *dev);
-void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
+void ipoib_neigh_flush(struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
 
@@ -464,7 +462,8 @@ void ipoib_dev_cleanup(struct net_device *dev);
 
 void ipoib_mcast_join_task(struct work_struct *work);
 void ipoib_mcast_carrier_on_task(struct work_struct *work);
-void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb);
+void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb,
+		      struct ipoib_neigh *neigh);
 
 void ipoib_mcast_restart_task(struct work_struct *work);
 int ipoib_mcast_start_thread(struct net_device *dev);
@@ -570,6 +569,7 @@ void ipoib_cm_dev_cleanup(struct net_device *dev);
 struct ipoib_cm_tx *ipoib_cm_create_tx(struct net_device *dev, struct ipoib_path *path,
 				       struct ipoib_neigh *neigh);
 void ipoib_cm_destroy_tx(struct ipoib_cm_tx *tx);
+void ipoib_cm_flush_path(struct net_device *dev, struct ipoib_path *path);
 void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb,
 			   unsigned int mtu);
 void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc);
@@ -659,6 +659,12 @@ void ipoib_cm_destroy_tx(struct ipoib_cm_tx *tx)
 }
 
 static inline
+void ipoib_cm_flush_path(struct net_device *dev, struct ipoib_path *path)
+{
+	return;
+}
+
+static inline
 int ipoib_cm_add_mode_attr(struct net_device *dev)
 {
 	return 0;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index bb10041..c1f3a65 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -799,31 +799,14 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
 
 	if (wc->status != IB_WC_SUCCESS &&
 	    wc->status != IB_WC_WR_FLUSH_ERR) {
-		struct ipoib_neigh *neigh;
-
 		ipoib_dbg(priv, "failed cm send event "
 			   "(status=%d, wrid=%d vend_err %x)\n",
 			   wc->status, wr_id, wc->vendor_err);
 
 		spin_lock_irqsave(&priv->lock, flags);
-		neigh = tx->neigh;
-
-		if (neigh) {
-			neigh->cm = NULL;
-			list_del(&neigh->list);
-			if (neigh->ah)
-				ipoib_put_ah(neigh->ah);
-			ipoib_neigh_free(dev, neigh);
-
-			tx->neigh =