Re: [openib-general] oops at device removal

2007-01-29 Thread Sean Hefty
> @@ -71,6 +70,7 @@ struct mcast_device {
>  	int start_port;
>  	int end_port;
>  	struct mcast_port   port[0];
> +	struct ib_event_handler event_handler;
>  };

The mcast_port data is allocated at the end of the structure, so event_handler 
will need to be placed above port[0] in the structure.
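
The layout problem can be checked with offsetof. The following is a minimal user-space sketch, not the kernel code: handler_stub, port_stub, dev_bad and dev_good are hypothetical stand-ins for the real structures, and it assumes a compiler that accepts GNU C zero-length arrays (port[0]) followed by another member, as GCC does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are larger. */
struct handler_stub { void *fn; void *dev; };
struct port_stub    { void *dev; long port_num; };

/* Layout as in the posted patch: a member after the trailing array. */
struct dev_bad {
	int start_port;
	int end_port;
	struct port_stub port[0];          /* GNU C zero-length array */
	struct handler_stub event_handler; /* starts where port[0] starts */
};

/* Layout with the handler moved above the trailing array. */
struct dev_good {
	int start_port;
	int end_port;
	struct handler_stub event_handler;
	struct port_stub port[0];
};

/* Returns 1 if event_handler begins at the same offset as port[0]. */
static int bad_layout_overlaps(void)
{
	return offsetof(struct dev_bad, event_handler) ==
	       offsetof(struct dev_bad, port);
}

/* Returns 1 if port[0] begins before event_handler ends. */
static int good_layout_overlaps(void)
{
	return offsetof(struct dev_good, port) <
	       offsetof(struct dev_good, event_handler) +
	       sizeof(struct handler_stub);
}

/* Self-check that can be called from a test program. */
static void check_layouts(void)
{
	assert(bad_layout_overlaps());
	assert(!good_layout_overlaps());
}
```

In the first layout, writing port[0] after allocating room for the ports also writes over event_handler, because both start at the same offset. In the second layout the port data starts only after event_handler ends.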

- Sean

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] oops at device removal

2007-01-28 Thread Michael S. Tsirkin
> We have observed the following crash:

OK, I think I see a reason for this.

I notice the following in code, file multicast.c, function mcast_add_one:

ib_set_client_data(device, &mcast_client, dev);

INIT_IB_EVENT_HANDLER(&event_handler, device,
                      mcast_event_handler);
ib_register_event_handler(&event_handler);

So it seems that if I have 2 devices, event_handler will be registered twice.
This will trigger data corruption, as the same entry will be added to the list twice.

Or so it seems. Sean, what's the idea here?
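
The corruption can be reproduced outside the kernel. The following is a minimal user-space sketch under stated assumptions: list_head and list_add_tail here are simplified re-implementations in the style of the kernel list helpers, not the kernel's own code, and the two scenarios (one handler on two per-device lists, and one handler added twice to a single list) are illustrations, since the exact list layout used by the registration call is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified circular doubly linked list, in the style of the kernel's. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = node;
	node->next = head;
	node->prev = prev;
	prev->next = node;
}

/* Walk forward from head; return 1 if we reach head within max_steps. */
static int list_is_closed(const struct list_head *head, int max_steps)
{
	const struct list_head *p = head->next;

	while (max_steps-- > 0) {
		if (p == head)
			return 1;
		p = p->next;
	}
	return 0;
}

/* One handler node added to two lists: the first list no longer closes. */
static int shared_node_breaks_first_list(void)
{
	struct list_head dev_a, dev_b, handler;

	list_init(&dev_a);
	list_init(&dev_b);
	list_add_tail(&handler, &dev_a);
	list_add_tail(&handler, &dev_b); /* overwrites handler's links */
	return !list_is_closed(&dev_a, 16) && list_is_closed(&dev_b, 16);
}

/* One handler node added twice to the same list: it points at itself. */
static int same_list_twice_self_loops(void)
{
	struct list_head dev, handler;

	list_init(&dev);
	list_add_tail(&handler, &dev);
	list_add_tail(&handler, &dev);
	return handler.next == &handler && !list_is_closed(&dev, 16);
}
```

In both cases a walk that starts at the affected list head never gets back to it, and a later removal of the node works on links that no longer describe the list it was first added to.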

-- 
MST




Re: [openib-general] oops at device removal

2007-01-28 Thread Michael S. Tsirkin
> Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
> Subject: Re: oops at device removal
> 
> > We have observed the following crash:
> 
> OK, I think I see a reason for this.
> 
> I notice the following in code, file multicast.c, function mcast_add_one:
> 
> ib_set_client_data(device, &mcast_client, dev);
> 
> INIT_IB_EVENT_HANDLER(&event_handler, device,
>                       mcast_event_handler);
> ib_register_event_handler(&event_handler);
> 
> So it seems like if I have 2 devices, event_handler will be registered twice.
> This will trigger data corruption as same entry will be added to list twice.
> 
> Or so it seems. Sean, what's the idea here?

It seems something like the following would fix it (untested).



Make new multicast code not crash on platforms with multiple HCAs.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

---

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fde977e..e51a078 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -51,7 +51,6 @@ static struct ib_client mcast_client = {
 };
 
 static struct ib_sa_client sa_client;
-static struct ib_event_handler event_handler;
 static struct workqueue_struct *mcast_wq;
 static union ib_gid mgid0;
 
@@ -71,6 +70,7 @@ struct mcast_device {
 	int start_port;
 	int end_port;
 	struct mcast_port   port[0];
+	struct ib_event_handler event_handler;
 };
 
 enum mcast_state {
@@ -793,8 +793,8 @@ static void mcast_add_one(struct ib_device *device)
 	dev->device = device;
 	ib_set_client_data(device, &mcast_client, dev);
 
-	INIT_IB_EVENT_HANDLER(&event_handler, device, mcast_event_handler);
-	ib_register_event_handler(&event_handler);
+	INIT_IB_EVENT_HANDLER(&dev->event_handler, device, mcast_event_handler);
+	ib_register_event_handler(&dev->event_handler);
 }
 
 static void mcast_remove_one(struct ib_device *device)
@@ -807,7 +807,7 @@ static void mcast_remove_one(struct ib_device *device)
 	if (!dev)
 		return;
 
-	ib_unregister_event_handler(&event_handler);
+	ib_unregister_event_handler(&dev->event_handler);
 	flush_workqueue(mcast_wq);
 
 	for (i = 0; i <= dev->end_port - dev->start_port; i++) {
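
One consequence of embedding the handler in the per-device structure is that a callback can get from the handler pointer back to its own device. The following is a minimal user-space sketch of that idea, not code from the patch: dev_stub, handler_stub, on_event and the container_of_sketch macro are hypothetical names, and the macro is a simplified version of the usual container_of idiom.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a handler with a callback pointer. */
struct handler_stub {
	void (*fn)(struct handler_stub *handler);
};

/* Hypothetical per-device structure with its own embedded handler. */
struct dev_stub {
	int start_port;
	int end_port;
	struct handler_stub event_handler;
	int events_seen;
};

/* Callback: find the owning device from the handler it was given. */
static void on_event(struct handler_stub *handler)
{
	struct dev_stub *dev =
		container_of_sketch(handler, struct dev_stub, event_handler);

	dev->events_seen++;
}

/* Two devices, each with its own handler, keep separate state. */
static int two_devices_stay_independent(void)
{
	struct dev_stub a = { 1, 2, { on_event }, 0 };
	struct dev_stub b = { 1, 1, { on_event }, 0 };

	a.event_handler.fn(&a.event_handler);
	a.event_handler.fn(&a.event_handler);
	b.event_handler.fn(&b.event_handler);
	return a.events_seen == 2 && b.events_seen == 1;
}
```

With a single static handler shared by all devices, the callback has no such way to tell the devices apart from the handler pointer alone.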

-- 
MST
