Re: [ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-13 Thread Or Gerlitz

Stefan Roscher wrote:

> yes, this problem also exists in 2.6.25-rc1. It was introduced by a patch
> from Roland:
> http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=efcd99717f76c6d19dd81203c60fe198480de522
>
> In ipoib_cm_dev_stop(), the error, drain and flush lists are spliced onto
> a local list after a timeout.
> Previously, a list_for_each_entry loop iterated over this local list and
> destroyed all the QPs on it.
> With the patch above, that loop moved to ipoib_cm_free_rx_reap_list(),
> which iterates over the device's reap_list rather than the former local
> list, so the QPs on the local list are never destroyed.
> Pradeep's patch now splices all QPs from the error, drain and flush lists
> onto the reap_list after a timeout, so that they are all freed in
> ipoib_cm_free_rx_reap_list().


OK, so send the patch to Roland for review before you put it in ofed.

Or.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-13 Thread Stefan Roscher
On Wednesday 13 February 2008 09:04:53 Or Gerlitz wrote:
> Pradeep Satyanarayana wrote:
> > This patch fixes the "fail to destroy ipoib rx QP" bug
> > (https://bugs.openfabrics.org/show_bug.cgi?id=906).
> > With it, the usecnt issue previously reported on ehca is resolved and the
> > QP can be destroyed.
> > 
> > As per Eli's request, I am splitting up the patches. This is the first
> > portion of yesterday's patch.
> > Tested on ppc64 machines with ehca and mthca.
> 
> Also here, does this problem exist in the 2.6.25-rc1 upstream code as 
> well? From the change log I don't understand the source of the problem 
> (only the symptom of failing to destroy ipoib/cm rx QP) and the solution.
> 
> Or.

Hi,
yes, this problem also exists in 2.6.25-rc1. It was introduced by a patch
from Roland:
http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=efcd99717f76c6d19dd81203c60fe198480de522

In ipoib_cm_dev_stop(), the error, drain and flush lists are spliced onto
a local list after a timeout.
Previously, a list_for_each_entry loop iterated over this local list and
destroyed all the QPs on it.
With the patch above, that loop moved to ipoib_cm_free_rx_reap_list(),
which iterates over the device's reap_list rather than the former local
list, so the QPs on the local list are never destroyed.
Pradeep's patch now splices all QPs from the error, drain and flush lists
onto the reap_list after a timeout, so that they are all freed in
ipoib_cm_free_rx_reap_list().

Stefan


Re: [ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-13 Thread Shirley Ma
On Wed, 2008-02-13 at 10:04 +0200, Or Gerlitz wrote:
> Also here, does this problem exist in the 2.6.25-rc1 upstream code as 
> well? From the change log I don't understand the source of the problem 
> (only the symptom of failing to destroy ipoib/cm rx QP) and the solution.
> 
> Or.

I believe so. This is not a new problem in the OFED-1.3 release.

Thanks
Shirley



Re: [ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-13 Thread Or Gerlitz

Pradeep Satyanarayana wrote:

> This patch fixes the "fail to destroy ipoib rx QP" bug
> (https://bugs.openfabrics.org/show_bug.cgi?id=906).
> With it, the usecnt issue previously reported on ehca is resolved and the
> QP can be destroyed.
>
> As per Eli's request, I am splitting up the patches. This is the first
> portion of yesterday's patch.
> Tested on ppc64 machines with ehca and mthca.


Also here, does this problem exist in the 2.6.25-rc1 upstream code as 
well? From the change log I don't understand the source of the problem 
(only the symptom of failing to destroy ipoib/cm rx QP) and the solution.


Or.




> Signed-off-by: Pradeep Satyanarayana <[EMAIL PROTECTED]>
> ---
>
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-11 14:28:47.0 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:44:07.0 -0500
> @@ -883,9 +883,9 @@ void ipoib_cm_dev_stop(struct net_device
>      /*
>       * assume the HW is wedged and just free up everything.
>       */
> -   list_splice_init(&priv->cm.rx_flush_list, &list);
> -   list_splice_init(&priv->cm.rx_error_list, &list);
> -   list_splice_init(&priv->cm.rx_drain_list, &list);
> +   list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_reap_list);
> +   list_splice_init(&priv->cm.rx_error_list, &priv->cm.rx_reap_list);
> +   list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
>      break;
>     }
>     spin_unlock_irq(&priv->lock);



[ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-12 Thread Pradeep Satyanarayana
This patch fixes the "fail to destroy ipoib rx QP" bug
(https://bugs.openfabrics.org/show_bug.cgi?id=906).
With it, the usecnt issue previously reported on ehca is resolved and the
QP can be destroyed.

As per Eli's request, I am splitting up the patches. This is the first
portion of yesterday's patch.
Tested on ppc64 machines with ehca and mthca.

Signed-off-by: Pradeep Satyanarayana <[EMAIL PROTECTED]>
---

--- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-11 14:28:47.0 -0500
+++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:44:07.0 -0500
@@ -883,9 +883,9 @@ void ipoib_cm_dev_stop(struct net_device
     /*
      * assume the HW is wedged and just free up everything.
      */
-   list_splice_init(&priv->cm.rx_flush_list, &list);
-   list_splice_init(&priv->cm.rx_error_list, &list);
-   list_splice_init(&priv->cm.rx_drain_list, &list);
+   list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_reap_list);
+   list_splice_init(&priv->cm.rx_error_list, &priv->cm.rx_reap_list);
+   list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
     break;
    }
    spin_unlock_irq(&priv->lock);
