RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-08-18 Thread Devesh Sharma
Hi Chuk,

Can this patch be lined up for merger. Looks like there are no oppositions to 
this change

-Regards
 Devesh

 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Devesh Sharma
 Sent: Thursday, July 31, 2014 10:45 AM
 To: Chuck Lever
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
  -Original Message-
  From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
  ow...@vger.kernel.org] On Behalf Of Chuck Lever
  Sent: Thursday, July 31, 2014 12:09 AM
  To: Devesh Sharma
  Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
  r...@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
  On Jul 22, 2014, at 1:06 AM, Devesh Sharma
 devesh.sha...@emulex.com
  wrote:
 
   -Original Message-
   From: Chuck Lever [mailto:chuck.le...@oracle.com]
   Sent: Monday, July 21, 2014 11:01 PM
   To: Devesh Sharma
   Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
   r...@vger.kernel.org
   Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma
   provider module
  
  
   On Jul 21, 2014, at 1:07 PM, Devesh Sharma
  devesh.sha...@emulex.com
   wrote:
  
   Until that support is in place, obviously I would prefer that
   the removal of the underlying driver be prevented while there
   are NFS mounts in place. I think that's what NFS users have
   come to
  expect.
  
   In other words, don't allow device removal until we have
   support for device insertion :-)
  
   This needs a fresh series of patches?
  
   do not allow removal would likely take an approach similar to
   what you originally posted, unless someone has an idea how to use a
   CM_EVENT to make this work, or there is an objection from the
   kernel
  RDMA folks.
  
   Okay, I will re-work the patch if need be.
 
  Devesh, if there isn't one already, could you file an upstream bug at
 
http://bugzilla.linux-nfs.org
 
  that documents the shutdown hang and perhaps summarizes this thread?
  Thanks!
 
 A bug has been filed
 https://bugzilla.linux-nfs.org/show_bug.cgi?id=266
 
 
  --
  Chuck Lever
  chuck[dot]lever[at]oracle[dot]com
 
 
 
  --
  To unsubscribe from this list: send the line unsubscribe linux-rdma
  in the body of a message to majord...@vger.kernel.org More
 majordomo
  info at http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-08-18 Thread Chuck Lever


 On Aug 18, 2014, at 5:52 AM, Devesh Sharma devesh.sha...@emulex.com wrote:
 
 Hi Chuk,
 
 Can this patch be lined up for merger. Looks like there are no oppositions to 
 this change

Sorry, I was expecting a fresh rewritten patch. I haven't done a deep review of 
the original.

Anything you want to merge should be independently tested, of course. For 
example Steve should include it in his testing branch.


 
 -Regards
 Devesh
 
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Devesh Sharma
 Sent: Thursday, July 31, 2014 10:45 AM
 To: Chuck Lever
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Chuck Lever
 Sent: Thursday, July 31, 2014 12:09 AM
 To: Devesh Sharma
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On Jul 22, 2014, at 1:06 AM, Devesh Sharma
 devesh.sha...@emulex.com
 wrote:
 
 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Monday, July 21, 2014 11:01 PM
 To: Devesh Sharma
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma
 provider module
 
 
 On Jul 21, 2014, at 1:07 PM, Devesh Sharma
 devesh.sha...@emulex.com
 wrote:
 
 Until that support is in place, obviously I would prefer that
 the removal of the underlying driver be prevented while there
 are NFS mounts in place. I think that's what NFS users have
 come to
 expect.
 
 In other words, don't allow device removal until we have
 support for device insertion :-)
 
 This needs a fresh series of patches?
 
 do not allow removal would likely take an approach similar to
 what you originally posted, unless someone has an idea how to use a
 CM_EVENT to make this work, or there is an objection from the
 kernel
 RDMA folks.
 
 Okay, I will re-work the patch if need be.
 
 Devesh, if there isn't one already, could you file an upstream bug at
 
  http://bugzilla.linux-nfs.org
 
 that documents the shutdown hang and perhaps summarizes this thread?
 Thanks!
 
 A bug has been filed
 https://bugzilla.linux-nfs.org/show_bug.cgi?id=266
 
 
 --
 Chuck Lever
 chuck[dot]lever[at]oracle[dot]com
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma
 in the body of a message to majord...@vger.kernel.org More
 majordomo
 info at http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-30 Thread Chuck Lever

On Jul 22, 2014, at 1:06 AM, Devesh Sharma devesh.sha...@emulex.com wrote:

 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Monday, July 21, 2014 11:01 PM
 To: Devesh Sharma
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On Jul 21, 2014, at 1:07 PM, Devesh Sharma devesh.sha...@emulex.com
 wrote:
 
 Until that support is in place, obviously I would prefer that the
 removal of the underlying driver be prevented while there are NFS
 mounts in place. I think that's what NFS users have come to expect.
 
 In other words, don't allow device removal until we have support
 for device insertion :-)
 
 This needs a fresh series of patches?
 
 do not allow removal would likely take an approach similar to what you
 originally posted, unless someone has an idea how to use a CM_EVENT to
 make this work, or there is an objection from the kernel RDMA folks.
 
 Okay, I will re-work the patch if need be.

Devesh, if there isn’t one already, could you file an upstream bug at

  http://bugzilla.linux-nfs.org

that documents the shutdown hang and perhaps summarizes this thread?
Thanks!

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-30 Thread Devesh Sharma
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Chuck Lever
 Sent: Thursday, July 31, 2014 12:09 AM
 To: Devesh Sharma
 Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On Jul 22, 2014, at 1:06 AM, Devesh Sharma devesh.sha...@emulex.com
 wrote:
 
  -Original Message-
  From: Chuck Lever [mailto:chuck.le...@oracle.com]
  Sent: Monday, July 21, 2014 11:01 PM
  To: Devesh Sharma
  Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux-
  r...@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
  On Jul 21, 2014, at 1:07 PM, Devesh Sharma
 devesh.sha...@emulex.com
  wrote:
 
  Until that support is in place, obviously I would prefer that
  the removal of the underlying driver be prevented while there
  are NFS mounts in place. I think that's what NFS users have come to
 expect.
 
  In other words, don't allow device removal until we have support
  for device insertion :-)
 
  This needs a fresh series of patches?
 
  do not allow removal would likely take an approach similar to what
  you originally posted, unless someone has an idea how to use a
  CM_EVENT to make this work, or there is an objection from the kernel
 RDMA folks.
 
  Okay, I will re-work the patch if need be.
 
 Devesh, if there isn't one already, could you file an upstream bug at
 
   http://bugzilla.linux-nfs.org
 
 that documents the shutdown hang and perhaps summarizes this thread?
 Thanks!

A bug has been filed
https://bugzilla.linux-nfs.org/show_bug.cgi?id=266

 
 --
 Chuck Lever
 chuck[dot]lever[at]oracle[dot]com
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Devesh Sharma
Shirley,

Once rmmod is issued, the connection corresponding to the active mount is 
destroyed and all the associated resources 
Are freed. As per the processing logic of DEVICE-REMOVAL event, nfs-rdma 
wakes-up all the  waiters, This results into
Re-establishment efforts, since the device is not present any more, 
rdma_resolve_address() fails with CM resolution
Error. This loop continues forever.

I am yet to find out which part of ocrdma is blocked. I am putting some debug 
messages to find it out. I will get back to 
The group with an update.

-Regards
 Devesh

 -Original Message-
 From: Shirley Ma [mailto:shirley...@oracle.com]
 Sent: Friday, July 18, 2014 9:18 PM
 To: Steve Wise; Devesh Sharma; 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On 07/18/2014 06:27 AM, Steve Wise wrote:
  We can't really deal with a CM_DEVICE_REMOVE event while there are
  active NFS mounts.
 
  System shutdown ordering should guarantee (one would hope) that
 NFS
  mount points are unmounted before the RDMA/IB core infrastructure
  is torn down. Ordering shouldn't matter as long all NFS activity
  has ceased before the CM tries to remove the device.
 
  So if something is hanging up the CM, there's something xprtrdma is
  not cleaning up properly.
 
 
 
  Devesh, how are you reproducing this?  Are you just rmmod'ing the
  ocrdma module while there are active mounts?
 
  Yes, I am issuing rmmod while there is an active mount. In my case
  rmmod ocrdma remains blocked forever.
 Where is it blocked?
 
  Off-the-course of this discussion: Is there a reasoning behind not
  using
  ib_register_client()/ib_unregister_client() framework?
 
  I think the idea is that you don't need to use it if you are
  transport-independent and use the rdmacm...
 
 
 
  --
  To unsubscribe from this list: send the line unsubscribe linux-rdma
  in the body of a message to majord...@vger.kernel.org More
 majordomo
  info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Devesh Sharma
In rpcrdma_ep_connect():

write_lock(ia-ri_qplock);
old = ia-ri_id;
ia-ri_id = id;
write_unlock(ia-ri_qplock);

rdma_destroy_qp(old);
rdma_destroy_id(old);  = Cm -id is destroyed here.


If following code fails in rpcrdma_ep_connect():
id = rpcrdma_create_id(xprt, ia,
(struct sockaddr *)xprt-rx_data.addr);
if (IS_ERR(id)) {
rc = -EHOSTUNREACH;
goto out;
}

it leaves old cm-id still alive. This will always fail if Device is removed 
abruptly.

In rdma_resolve_addr()/rdma_destroy_id() cm_dev is referenced/de-referenced 
here (cma.c):

static int cma_acquire_dev(struct rdma_id_private *id_priv,
   struct rdma_id_private *listen_id_priv) {
.
.
if (!ret)
cma_attach_to_dev(id_priv, cma_dev);
}

static void cma_release_dev(struct rdma_id_private *id_priv)
{
mutex_lock(lock);
list_del(id_priv-list);
cma_deref_dev(id_priv-cma_dev);
.
.
}

Since as per design of nfs-rdma at-least previously known good cm-id always 
remains live utill
another good cm-id is created, cma_dev-refcount never becomes 0 upon device 
removal .
Thus blocking the rmmod vendor driver forever.

-Regards
 Devesh

 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Devesh Sharma
 Sent: Monday, July 21, 2014 11:42 AM
 To: Shirley Ma; Steve Wise; 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 Shirley,
 
 Once rmmod is issued, the connection corresponding to the active mount is
 destroyed and all the associated resources Are freed. As per the processing
 logic of DEVICE-REMOVAL event, nfs-rdma wakes-up all the  waiters, This
 results into Re-establishment efforts, since the device is not present any
 more, rdma_resolve_address() fails with CM resolution Error. This loop
 continues forever.
 
 I am yet to find out which part of ocrdma is blocked. I am putting some debug
 messages to find it out. I will get back to The group with an update.
 
 -Regards
  Devesh
 
  -Original Message-
  From: Shirley Ma [mailto:shirley...@oracle.com]
  Sent: Friday, July 18, 2014 9:18 PM
  To: Steve Wise; Devesh Sharma; 'Chuck Lever'
  Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
  On 07/18/2014 06:27 AM, Steve Wise wrote:
   We can't really deal with a CM_DEVICE_REMOVE event while there
   are active NFS mounts.
  
   System shutdown ordering should guarantee (one would hope) that
  NFS
   mount points are unmounted before the RDMA/IB core
 infrastructure
   is torn down. Ordering shouldn't matter as long all NFS activity
   has ceased before the CM tries to remove the device.
  
   So if something is hanging up the CM, there's something xprtrdma
   is not cleaning up properly.
  
  
  
   Devesh, how are you reproducing this?  Are you just rmmod'ing the
   ocrdma module while there are active mounts?
  
   Yes, I am issuing rmmod while there is an active mount. In my case
   rmmod ocrdma remains blocked forever.
  Where is it blocked?
 
   Off-the-course of this discussion: Is there a reasoning behind not
   using
   ib_register_client()/ib_unregister_client() framework?
  
   I think the idea is that you don't need to use it if you are
   transport-independent and use the rdmacm...
  
  
  
   --
   To unsubscribe from this list: send the line unsubscribe linux-rdma
   in the body of a message to majord...@vger.kernel.org More
  majordomo
   info at  http://vger.kernel.org/majordomo-info.html
  
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Chuck Lever
Hi Devesh-

Thanks for drilling into this further.

On Jul 21, 2014, at 7:48 AM, Devesh Sharma devesh.sha...@emulex.com wrote:

 In rpcrdma_ep_connect():
 
 write_lock(ia-ri_qplock);
old = ia-ri_id;
ia-ri_id = id;
write_unlock(ia-ri_qplock);
 
rdma_destroy_qp(old);
rdma_destroy_id(old);  = Cm -id is destroyed here.
 
 
 If following code fails in rpcrdma_ep_connect():
 id = rpcrdma_create_id(xprt, ia,
(struct sockaddr *)xprt-rx_data.addr);
if (IS_ERR(id)) {
rc = -EHOSTUNREACH;
goto out;
}
 
 it leaves old cm-id still alive. This will always fail if Device is removed 
 abruptly.

For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep-rep_connected
to -ENODEV.

Then:

 929 int
 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 931 {
 932 struct rdma_cm_id *id, *old;
 933 int rc = 0;
 934 int retry_count = 0;
 935 
 936 if (ep-rep_connected != 0) {
 937 struct rpcrdma_xprt *xprt;
 938 retry:
 939 dprintk(RPC:   %s: reconnecting...\n, __func__);

ep-rep_connected is probably -ENODEV after a device removal. It would be
possible for the connect worker to destroy everything associated with this
connection in that case to ensure the underlying object reference counts
are cleared.

The immediate danger is that if there are pending RPCs, they could exit while
qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external().
Checking for NULL pointers inside the ri_qplock would prevent that.

However, NFS mounts via this adapter will hang indefinitely after all
transports are torn down and the adapter is gone. The only thing that can be
done is something drastic like “echo b  /proc/sysrq_trigger” on the client.

Thus, IMO hot-plugging or passive fail-over are the only scenarios where
this makes sense. If we have an immediate problem here, is it a problem with
system shutdown ordering that can be addressed in some other way?

Until that support is in place, obviously I would prefer that the removal of
the underlying driver be prevented while there are NFS mounts in place. I
think that’s what NFS users have come to expect.

In other words, don’t allow device removal until we have support for device
insertion :-)


 In rdma_resolve_addr()/rdma_destroy_id() cm_dev is referenced/de-referenced 
 here (cma.c):
 
 static int cma_acquire_dev(struct rdma_id_private *id_priv,
   struct rdma_id_private *listen_id_priv) {
 .
 .
 if (!ret)
cma_attach_to_dev(id_priv, cma_dev);
 }
 
 static void cma_release_dev(struct rdma_id_private *id_priv)
 {
mutex_lock(lock);
list_del(id_priv-list);
cma_deref_dev(id_priv-cma_dev);
 .
 .
 }
 
 Since as per design of nfs-rdma at-least previously known good cm-id always 
 remains live utill
 another good cm-id is created, cma_dev-refcount never becomes 0 upon device 
 removal .
 Thus blocking the rmmod vendor driver forever.
 
 -Regards
 Devesh
 
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Devesh Sharma
 Sent: Monday, July 21, 2014 11:42 AM
 To: Shirley Ma; Steve Wise; 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 Shirley,
 
 Once rmmod is issued, the connection corresponding to the active mount is
 destroyed and all the associated resources Are freed. As per the processing
 logic of DEVICE-REMOVAL event, nfs-rdma wakes-up all the  waiters, This
 results into Re-establishment efforts, since the device is not present any
 more, rdma_resolve_address() fails with CM resolution Error. This loop
 continues forever.
 
 I am yet to find out which part of ocrdma is blocked. I am putting some debug
 messages to find it out. I will get back to The group with an update.
 
 -Regards
 Devesh
 
 -Original Message-
 From: Shirley Ma [mailto:shirley...@oracle.com]
 Sent: Friday, July 18, 2014 9:18 PM
 To: Steve Wise; Devesh Sharma; 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On 07/18/2014 06:27 AM, Steve Wise wrote:
 We can't really deal with a CM_DEVICE_REMOVE event while there
 are active NFS mounts.
 
 System shutdown ordering should guarantee (one would hope) that
 NFS
 mount points are unmounted before the RDMA/IB core
 infrastructure
 is torn down. Ordering shouldn't matter as long all NFS activity
 has ceased before the CM tries to remove the device.
 
 So if something is hanging up the CM, there's something xprtrdma
 is not cleaning up properly.
 
 
 
 Devesh, how are you reproducing this?  Are you just rmmod'ing the
 ocrdma

RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Steve Wise


 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Monday, July 21, 2014 9:54 AM
 To: Devesh Sharma
 Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; 
 linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 Hi Devesh-
 
 Thanks for drilling into this further.
 
 On Jul 21, 2014, at 7:48 AM, Devesh Sharma devesh.sha...@emulex.com wrote:
 
  In rpcrdma_ep_connect():
 
  write_lock(ia-ri_qplock);
 old = ia-ri_id;
 ia-ri_id = id;
 write_unlock(ia-ri_qplock);
 
 rdma_destroy_qp(old);
 rdma_destroy_id(old);  = Cm -id is destroyed 
  here.
 
 
  If following code fails in rpcrdma_ep_connect():
  id = rpcrdma_create_id(xprt, ia,
 (struct sockaddr *)xprt-rx_data.addr);
 if (IS_ERR(id)) {
 rc = -EHOSTUNREACH;
 goto out;
 }
 
  it leaves old cm-id still alive. This will always fail if Device is removed 
  abruptly.
 
 For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep-rep_connected
 to -ENODEV.
 
 Then:
 
  929 int
  930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
  931 {
  932 struct rdma_cm_id *id, *old;
  933 int rc = 0;
  934 int retry_count = 0;
  935
  936 if (ep-rep_connected != 0) {
  937 struct rpcrdma_xprt *xprt;
  938 retry:
  939 dprintk(RPC:   %s: reconnecting...\n, __func__);
 
 ep-rep_connected is probably -ENODEV after a device removal. It would be
 possible for the connect worker to destroy everything associated with this
 connection in that case to ensure the underlying object reference counts
 are cleared.
 
 The immediate danger is that if there are pending RPCs, they could exit while
 qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external().
 Checking for NULL pointers inside the ri_qplock would prevent that.
 
 However, NFS mounts via this adapter will hang indefinitely after all
 transports are torn down and the adapter is gone. The only thing that can be
 done is something drastic like echo b  /proc/sysrq_trigger on the client.
 
 Thus, IMO hot-plugging or passive fail-over are the only scenarios where
 this makes sense. If we have an immediate problem here, is it a problem with
 system shutdown ordering that can be addressed in some other way?
 
 Until that support is in place, obviously I would prefer that the removal of
 the underlying driver be prevented while there are NFS mounts in place. I
 think that's what NFS users have come to expect.
 
 In other words, don't allow device removal until we have support for device
 insertion :-)
 
 


If we fix the above problems on provider unload, shouldn't the mount recover if 
the
provider module is subsequently loaded?  Or another provider configured such 
that
rdma_resolve_addr/route() then picks an active device?



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Chuck Lever

On Jul 21, 2014, at 11:03 AM, Steve Wise sw...@opengridcomputing.com wrote:

 
 
 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Monday, July 21, 2014 9:54 AM
 To: Devesh Sharma
 Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; 
 linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 Hi Devesh-
 
 Thanks for drilling into this further.
 
 On Jul 21, 2014, at 7:48 AM, Devesh Sharma devesh.sha...@emulex.com wrote:
 
 In rpcrdma_ep_connect():
 
 write_lock(ia-ri_qplock);
   old = ia-ri_id;
   ia-ri_id = id;
   write_unlock(ia-ri_qplock);
 
   rdma_destroy_qp(old);
   rdma_destroy_id(old);  = Cm -id is destroyed 
 here.
 
 
 If following code fails in rpcrdma_ep_connect():
 id = rpcrdma_create_id(xprt, ia,
   (struct sockaddr *)xprt-rx_data.addr);
   if (IS_ERR(id)) {
   rc = -EHOSTUNREACH;
   goto out;
   }
 
 it leaves old cm-id still alive. This will always fail if Device is removed 
 abruptly.
 
 For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep-rep_connected
 to -ENODEV.
 
 Then:
 
 929 int
 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 931 {
 932 struct rdma_cm_id *id, *old;
 933 int rc = 0;
 934 int retry_count = 0;
 935
 936 if (ep-rep_connected != 0) {
 937 struct rpcrdma_xprt *xprt;
 938 retry:
 939 dprintk(RPC:   %s: reconnecting...\n, __func__);
 
 ep-rep_connected is probably -ENODEV after a device removal. It would be
 possible for the connect worker to destroy everything associated with this
 connection in that case to ensure the underlying object reference counts
 are cleared.
 
 The immediate danger is that if there are pending RPCs, they could exit while
 qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external().
 Checking for NULL pointers inside the ri_qplock would prevent that.
 
 However, NFS mounts via this adapter will hang indefinitely after all
 transports are torn down and the adapter is gone. The only thing that can be
 done is something drastic like echo b  /proc/sysrq_trigger on the client.
 
 Thus, IMO hot-plugging or passive fail-over are the only scenarios where
 this makes sense. If we have an immediate problem here, is it a problem with
 system shutdown ordering that can be addressed in some other way?
 
 Until that support is in place, obviously I would prefer that the removal of
 the underlying driver be prevented while there are NFS mounts in place. I
 think that's what NFS users have come to expect.
 
 In other words, don't allow device removal until we have support for device
 insertion :-)
 
 
 
 
 If we fix the above problems on provider unload, shouldn't the mount recover 
 if the
 provider module is subsequently loaded?  Or another provider configured such 
 that
 rdma_resolve_addr/route() then picks an active device?

On device removal, we have to destroy everything.

On insertion, we’ll have to create a fresh PD and MRs, which isn’t done now
during reconnect. So, insertion needs work too.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Steve Wise


 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Monday, July 21, 2014 10:21 AM
 To: Steve Wise
 Cc: Devesh Sharma; Shirley Ma; Hefty, Sean; Roland Dreier; 
 linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 
 On Jul 21, 2014, at 11:03 AM, Steve Wise sw...@opengridcomputing.com wrote:
 
 
 
  -Original Message-
  From: Chuck Lever [mailto:chuck.le...@oracle.com]
  Sent: Monday, July 21, 2014 9:54 AM
  To: Devesh Sharma
  Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; 
  linux-rdma@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider 
  module
 
  Hi Devesh-
 
  Thanks for drilling into this further.
 
  On Jul 21, 2014, at 7:48 AM, Devesh Sharma devesh.sha...@emulex.com 
  wrote:
 
  In rpcrdma_ep_connect():
 
  write_lock(ia-ri_qplock);
old = ia-ri_id;
ia-ri_id = id;
write_unlock(ia-ri_qplock);
 
rdma_destroy_qp(old);
rdma_destroy_id(old);  = Cm -id is destroyed 
  here.
 
 
  If following code fails in rpcrdma_ep_connect():
  id = rpcrdma_create_id(xprt, ia,
(struct sockaddr *)xprt-rx_data.addr);
if (IS_ERR(id)) {
rc = -EHOSTUNREACH;
goto out;
}
 
  it leaves old cm-id still alive. This will always fail if Device is 
  removed
abruptly.
 
  For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep-rep_connected
  to -ENODEV.
 
  Then:
 
  929 int
  930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
  931 {
  932 struct rdma_cm_id *id, *old;
  933 int rc = 0;
  934 int retry_count = 0;
  935
  936 if (ep-rep_connected != 0) {
  937 struct rpcrdma_xprt *xprt;
  938 retry:
  939 dprintk(RPC:   %s: reconnecting...\n, __func__);
 
  ep-rep_connected is probably -ENODEV after a device removal. It would be
  possible for the connect worker to destroy everything associated with this
  connection in that case to ensure the underlying object reference counts
  are cleared.
 
  The immediate danger is that if there are pending RPCs, they could exit 
  while
  qp/cm_id are NULL, triggering a panic in 
  rpcrdma_deregister_frmr_external().
  Checking for NULL pointers inside the ri_qplock would prevent that.
 
  However, NFS mounts via this adapter will hang indefinitely after all
  transports are torn down and the adapter is gone. The only thing that can 
  be
  done is something drastic like echo b  /proc/sysrq_trigger on the 
  client.
 
  Thus, IMO hot-plugging or passive fail-over are the only scenarios where
  this makes sense. If we have an immediate problem here, is it a problem 
  with
  system shutdown ordering that can be addressed in some other way?
 
  Until that support is in place, obviously I would prefer that the removal 
  of
  the underlying driver be prevented while there are NFS mounts in place. I
  think that's what NFS users have come to expect.
 
  In other words, don't allow device removal until we have support for device
  insertion :-)
 
 
 
 
  If we fix the above problems on provider unload, shouldn't the mount 
  recover if the
  provider module is subsequently loaded?  Or another provider configured 
  such that
  rdma_resolve_addr/route() then picks an active device?
 
 On device removal, we have to destroy everything.
 
 On insertion, we'll have to create a fresh PD and MRs, which isn't done now
 during reconnect. So, insertion needs work too.
 

Got it, thanks! 



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Devesh Sharma
 -Original Message-
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Monday, July 21, 2014 8:53 PM
 To: 'Chuck Lever'
 Cc: Devesh Sharma; 'Shirley Ma'; 'Hefty, Sean'; 'Roland Dreier'; linux-
 r...@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 
  -Original Message-
  From: Chuck Lever [mailto:chuck.le...@oracle.com]
  Sent: Monday, July 21, 2014 10:21 AM
  To: Steve Wise
  Cc: Devesh Sharma; Shirley Ma; Hefty, Sean; Roland Dreier;
  linux-rdma@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
  On Jul 21, 2014, at 11:03 AM, Steve Wise sw...@opengridcomputing.com
 wrote:
 
  
  
   -Original Message-
   From: Chuck Lever [mailto:chuck.le...@oracle.com]
   Sent: Monday, July 21, 2014 9:54 AM
   To: Devesh Sharma
   Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier;
   linux-rdma@vger.kernel.org
   Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma
   provider module
  
   Hi Devesh-
  
   Thanks for drilling into this further.
  
   On Jul 21, 2014, at 7:48 AM, Devesh Sharma
 devesh.sha...@emulex.com wrote:
  
   In rpcrdma_ep_connect():
  
   write_lock(ia-ri_qplock);
 old = ia-ri_id;
 ia-ri_id = id;
 write_unlock(ia-ri_qplock);
  
 rdma_destroy_qp(old);
 rdma_destroy_id(old);  = Cm -id is destroyed
 here.
  
  
   If following code fails in rpcrdma_ep_connect():
   id = rpcrdma_create_id(xprt, ia,
 (struct sockaddr *)xprt-rx_data.addr);
 if (IS_ERR(id)) {
 rc = -EHOSTUNREACH;
 goto out;
 }
  
   it leaves old cm-id still alive. This will always fail if Device
   is removed
 abruptly.
  
   For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets
   ep-rep_connected to -ENODEV.
  
   Then:
  
   929 int
   930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia
   *ia)
   931 {
   932 struct rdma_cm_id *id, *old;
   933 int rc = 0;
   934 int retry_count = 0;
   935
   936 if (ep-rep_connected != 0) {
   937 struct rpcrdma_xprt *xprt;
   938 retry:
   939 dprintk(RPC:   %s: reconnecting...\n, 
   __func__);
  
   ep-rep_connected is probably -ENODEV after a device removal. It
   ep-would be
   possible for the connect worker to destroy everything associated
   with this connection in that case to ensure the underlying object
   reference counts are cleared.
  
   The immediate danger is that if there are pending RPCs, they could
   exit while qp/cm_id are NULL, triggering a panic in
 rpcrdma_deregister_frmr_external().
   Checking for NULL pointers inside the ri_qplock would prevent that.
  
   However, NFS mounts via this adapter will hang indefinitely after
   all transports are torn down and the adapter is gone. The only
   thing that can be done is something drastic like echo b 
 /proc/sysrq_trigger on the client.
  
   Thus, IMO hot-plugging or passive fail-over are the only scenarios
   where this makes sense. If we have an immediate problem here, is it
   a problem with system shutdown ordering that can be addressed in
 some other way?
  
   Until that support is in place, obviously I would prefer that the
   removal of the underlying driver be prevented while there are NFS
   mounts in place. I think that's what NFS users have come to expect.
  
   In other words, don't allow device removal until we have support
   for device insertion :-)

This needs a fresh series of patches?

  
  
  
  
   If we fix the above problems on provider unload, shouldn't the mount
   recover if the provider module is subsequently loaded?  Or another
   provider configured such that
   rdma_resolve_addr/route() then picks an active device?
 
  On device removal, we have to destroy everything.
 
  On insertion, we'll have to create a fresh PD and MRs, which isn't
  done now during reconnect. So, insertion needs work too.

Makes Sense to me too.
Thanks Chuck.

-Regards
 Devesh
 
 
 Got it, thanks!
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-21 Thread Chuck Lever

On Jul 21, 2014, at 1:07 PM, Devesh Sharma devesh.sha...@emulex.com wrote:

 Until that support is in place, obviously I would prefer that the
 removal of the underlying driver be prevented while there are NFS
 mounts in place. I think that's what NFS users have come to expect.
 
 In other words, don't allow device removal until we have support
 for device insertion :-)
 
 This needs a fresh series of patches?

“do not allow removal” would likely take an approach similar to what you
originally posted, unless someone has an idea how to use a CM_EVENT to
make this work, or there is an objection from the kernel RDMA folks.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-20 Thread Devesh Sharma
 -Original Message-
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Friday, July 18, 2014 6:58 PM
 To: Devesh Sharma; 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Shirley Ma'; 'Roland Dreier'; linux-rdma@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
We can't really deal with a CM_DEVICE_REMOVE event while there are
active NFS mounts.
   
System shutdown ordering should guarantee (one would hope) that
NFS mount points are unmounted before the RDMA/IB core
infrastructure is torn down. Ordering shouldn't matter as long all
NFS activity has ceased before the CM tries to remove the device.
   
So if something is hanging up the CM, there's something xprtrdma
is not cleaning up properly.
   
  
  
   Devesh, how are you reproducing this?  Are you just rmmod'ing the
   ocrdma module while there are active mounts?
 
  Yes, I am issuing rmmod while there is an active mount. In my case
  rmmod ocrdma remains blocked forever.
  Off-the-course of this discussion: Is there a reasoning behind not
  using
  ib_register_client()/ib_unregister_client() framework?
 
 I think the idea is that you don't need to use it if you are transport-
 independent and use the rdmacm...

Okay, Makes sense..

 
 

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-20 Thread Devesh Sharma
 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Friday, July 18, 2014 8:57 PM
 To: Devesh Sharma
 Cc: Steve Wise; Hefty, Sean; Shirley Ma; Roland Dreier; linux-
 r...@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On Jul 18, 2014, at 2:19 AM, Devesh Sharma devesh.sha...@emulex.com
 wrote:
 
  -Original Message-
  From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
  ow...@vger.kernel.org] On Behalf Of Steve Wise
  Sent: Friday, July 18, 2014 1:39 AM
  To: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'
  Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
 
  -Original Message-
  From: Steve Wise [mailto:sw...@opengridcomputing.com]
  Sent: Thursday, July 17, 2014 2:56 PM
  To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
  provider module
 
 
 
  -Original Message-
  From: Hefty, Sean [mailto:sean.he...@intel.com]
  Sent: Thursday, July 17, 2014 2:50 PM
  To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
  provider module
 
  So the rdma cm is expected to increase the driver reference count
  (try_module_get) for
  each new cm id, then deference count (module_put) when cm id is
  destroyed?
 
 
  No, I think he's saying the rdma-cm posts a
  RDMA_CM_DEVICE_REMOVAL
  event to each application with rdmacm objects allocated, and each
  application is expected to destroy all the objects it has
  allocated before returning from the event handler.
 
  This is almost correct.  The applications do not have to destroy
  all the objects that
  it has
  allocated before returning from their event handler.  E.g. an app
  can queue a work
  item
  that does the destruction.  The rdmacm will block in its ib_client
  remove handler
  until all
  relevant rdma_cm_id's have been destroyed.
 
 
  Thanks for the clarification.
 
 
  And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
  rpcrdma_conn_upcall().
  It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls
  rpcrdma_conn_func() for that endpoint, which schedules
  rep_connect_worker...  and I gave up following the code path at this
 point...
  :)
 
  For this to all work correctly, it would need to destroy all the QPs,
  MRs, CQs, etc for that device _before_ destroying the rdma cm ids.
  Otherwise the provider module could be unloaded too soon...
 
  Okay, Should I try to handle device removal in this proposed fashion and
 post the v1.
 
 Hi Devesh,
 
 To make it work, xprtrdma is going to have to allow the device to be removed
 and added back while there are active NFS mounts and pending RPCs.
 AFAICT the code is not structured to do that today.
 
 Probably the place to start is to see how much work is needed to leverage
 the existing logic to watch for ENODEV and do the right things to suspend
 RPC activity until another device is inserted. It would have to work like a
 network partition that causes a transport reconnect.
 
 However, replacing everything, including all MRs and the PD, will require
 significant code churn and additional (undesirable) serialization around the
 use of QPs and cm_ids. Thus I would like to understand how much of a
 priority this is.

Sure, I got your point, this is good for long term stability that we understand
and do the right thing. However if it's not feasible to change the entire 
infrastructure
in one go OR in the near future, do we have other ideas/options to handle this 
problem

In the real world situations it's quite possible that someone attempts to 
reload/replace/unload the 
vendor driver without knowing if there is an active mount or not.

 
 --
 Chuck Lever
 chuck[dot]lever[at]oracle[dot]com
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-18 Thread Devesh Sharma
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Steve Wise
 Sent: Friday, July 18, 2014 1:39 AM
 To: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 
  -Original Message-
  From: Steve Wise [mailto:sw...@opengridcomputing.com]
  Sent: Thursday, July 17, 2014 2:56 PM
  To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
 
   -Original Message-
   From: Hefty, Sean [mailto:sean.he...@intel.com]
   Sent: Thursday, July 17, 2014 2:50 PM
   To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
   Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
   Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
   provider module
  
 So the rdma cm is expected to increase the driver reference
 count
(try_module_get) for
 each new cm id, then deference count (module_put) when cm id is
destroyed?

   
No, I think he's saying the rdma-cm posts a
 RDMA_CM_DEVICE_REMOVAL
event to each application with rdmacm objects allocated, and each
application is expected to destroy all the objects it has
allocated before returning from the event handler.
  
   This is almost correct.  The applications do not have to destroy all
   the objects that
 it has
   allocated before returning from their event handler.  E.g. an app
   can queue a work
 item
   that does the destruction.  The rdmacm will block in its ib_client
   remove handler
 until all
   relevant rdma_cm_id's have been destroyed.
  
 
  Thanks for the clarification.
 
 
 And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
 rpcrdma_conn_upcall().
 It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls
 rpcrdma_conn_func() for that endpoint, which schedules
 rep_connect_worker...  and I gave up following the code path at this point...
 :)
 
 For this to all work correctly, it would need to destroy all the QPs, MRs, 
 CQs,
 etc for that device _before_ destroying the rdma cm ids.  Otherwise the
 provider module could be unloaded too soon...

Okay, Should I try to handle device removal in this proposed fashion and post 
the v1.

 
 Steve.
 
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-18 Thread Steve Wise
   We can't really deal with a CM_DEVICE_REMOVE event while there are
   active NFS mounts.
  
   System shutdown ordering should guarantee (one would hope) that NFS
   mount points are unmounted before the RDMA/IB core infrastructure is
   torn down. Ordering shouldn't matter as long all NFS activity has
   ceased before the CM tries to remove the device.
  
   So if something is hanging up the CM, there's something xprtrdma is
   not cleaning up properly.
  
 
 
  Devesh, how are you reproducing this?  Are you just rmmod'ing the ocrdma
  module while there are active mounts?
 
 Yes, I am issuing rmmod while there is an active mount. In my case rmmod 
 ocrdma remains
 blocked forever.
 Off-the-course of this discussion: Is there a reasoning behind not using
 ib_register_client()/ib_unregister_client() framework?

I think the idea is that you don't need to use it if you are 
transport-independent and use
the rdmacm...



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-18 Thread Chuck Lever

On Jul 18, 2014, at 2:19 AM, Devesh Sharma devesh.sha...@emulex.com wrote:

 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Steve Wise
 Sent: Friday, July 18, 2014 1:39 AM
 To: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 
 -Original Message-
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Thursday, July 17, 2014 2:56 PM
 To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 
 -Original Message-
 From: Hefty, Sean [mailto:sean.he...@intel.com]
 Sent: Thursday, July 17, 2014 2:50 PM
 To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
 provider module
 
 So the rdma cm is expected to increase the driver reference
 count
 (try_module_get) for
 each new cm id, then deference count (module_put) when cm id is
 destroyed?
 
 
 No, I think he's saying the rdma-cm posts a
 RDMA_CM_DEVICE_REMOVAL
 event to each application with rdmacm objects allocated, and each
 application is expected to destroy all the objects it has
 allocated before returning from the event handler.
 
 This is almost correct.  The applications do not have to destroy all
 the objects that
 it has
 allocated before returning from their event handler.  E.g. an app
 can queue a work
 item
 that does the destruction.  The rdmacm will block in its ib_client
 remove handler
 until all
 relevant rdma_cm_id's have been destroyed.
 
 
 Thanks for the clarification.
 
 
 And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
 rpcrdma_conn_upcall().
 It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls
 rpcrdma_conn_func() for that endpoint, which schedules
 rep_connect_worker...  and I gave up following the code path at this point...
 :)
 
 For this to all work correctly, it would need to destroy all the QPs, MRs, 
 CQs,
 etc for that device _before_ destroying the rdma cm ids.  Otherwise the
 provider module could be unloaded too soon...
 
 Okay, Should I try to handle device removal in this proposed fashion and post 
 the v1.

Hi Devesh,

To make it work, xprtrdma is going to have to allow the device to be
removed and added back while there are active NFS mounts and pending RPCs.
AFAICT the code is not structured to do that today.

Probably the place to start is to see how much work is needed to leverage
the existing logic to watch for ENODEV and do the right things to suspend
RPC activity until another device is inserted. It would have to work like
a network partition that causes a transport reconnect.

However, replacing everything, including all MRs and the PD, will require
significant code churn and additional (undesirable) serialization around
the use of QPs and cm_ids. Thus I would like to understand how much of a
priority this is.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-18 Thread Shirley Ma

On 07/18/2014 06:27 AM, Steve Wise wrote:
 We can't really deal with a CM_DEVICE_REMOVE event while there are
 active NFS mounts.

 System shutdown ordering should guarantee (one would hope) that NFS
 mount points are unmounted before the RDMA/IB core infrastructure is
 torn down. Ordering shouldn't matter as long all NFS activity has
 ceased before the CM tries to remove the device.

 So if something is hanging up the CM, there's something xprtrdma is
 not cleaning up properly.



 Devesh, how are you reproducing this?  Are you just rmmod'ing the ocrdma
 module while there are active mounts?

 Yes, I am issuing rmmod while there is an active mount. In my case rmmod 
 ocrdma remains
 blocked forever.
Where is it blocked?

 Off-the-course of this discussion: Is there a reasoning behind not using
 ib_register_client()/ib_unregister_client() framework?
 
 I think the idea is that you don't need to use it if you are 
 transport-independent and use
 the rdmacm...
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Devesh Sharma
If verndor driver is attempted for removal while xprtrdma still has an
active mount, the removal of driver may never complete and can cause
unseen races or in worst case system crash.

To solve this, xprtrdma module should get reference of struct ib_device
structure for every mount. Reference is taken after local device address
resolution is completed successfuly.

reference to the struct ib_device pointer is put just before cm_id destruction.

Signed-off-by: Devesh Sharma devesh.sha...@emulex.com
---
 net/sunrpc/xprtrdma/verbs.c |   17 +++--
 net/sunrpc/xprtrdma/xprt_rdma.h |1 +
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 6ead5df..b00e55e 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -457,6 +457,11 @@ rpcrdma_create_id(struct rpcrdma_xprt *xprt,
}
wait_for_completion_interruptible_timeout(ia-ri_done,
msecs_to_jiffies(RDMA_RESOLVE_TIMEOUT) + 1);
+   if (!ia-ri_async_rc  !try_module_get(id-device-owner)) {
+   dprintk(RPC:   %s: Failed to get device module\n,
+   __func__);
+   ia-ri_async_rc = -ENODEV;
+   }
rc = ia-ri_async_rc;
if (rc)
goto out;
@@ -466,16 +471,18 @@ rpcrdma_create_id(struct rpcrdma_xprt *xprt,
if (rc) {
dprintk(RPC:   %s: rdma_resolve_route() failed %i\n,
__func__, rc);
-   goto out;
+   goto out_put;
}
wait_for_completion_interruptible_timeout(ia-ri_done,
msecs_to_jiffies(RDMA_RESOLVE_TIMEOUT) + 1);
rc = ia-ri_async_rc;
if (rc)
-   goto out;
+   goto out_put;
 
return id;
 
+out_put:
+   module_put(id-device-owner);
 out:
rdma_destroy_id(id);
return ERR_PTR(rc);
@@ -613,6 +620,7 @@ rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr 
*addr, int memreg)
rwlock_init(ia-ri_qplock);
return 0;
 out2:
+   module_put(ia-ri_id-device-owner);
rdma_destroy_id(ia-ri_id);
ia-ri_id = NULL;
 out1:
@@ -638,9 +646,11 @@ rpcrdma_ia_close(struct rpcrdma_ia *ia)
if (ia-ri_id != NULL  !IS_ERR(ia-ri_id)) {
if (ia-ri_id-qp)
rdma_destroy_qp(ia-ri_id);
+   module_put(ia-ri_id-device-owner);
rdma_destroy_id(ia-ri_id);
ia-ri_id = NULL;
}
+
if (ia-ri_pd != NULL  !IS_ERR(ia-ri_pd)) {
rc = ib_dealloc_pd(ia-ri_pd);
dprintk(RPC:   %s: ib_dealloc_pd returned %i\n,
@@ -886,6 +896,7 @@ retry:
if (ia-ri_id-device != id-device) {
printk(RPC:   %s: can't reconnect on 
different device!\n, __func__);
+   module_put(id-device-owner);
rdma_destroy_id(id);
rc = -ENETUNREACH;
goto out;
@@ -895,6 +906,7 @@ retry:
if (rc) {
dprintk(RPC:   %s: rdma_create_qp failed %i\n,
__func__, rc);
+   module_put(id-device-owner);
rdma_destroy_id(id);
rc = -ENETUNREACH;
goto out;
@@ -906,6 +918,7 @@ retry:
write_unlock(ia-ri_qplock);
 
rdma_destroy_qp(old);
+   module_put(old-device-owner);
rdma_destroy_id(old);
} else {
dprintk(RPC:   %s: connecting...\n, __func__);
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index c419498..b35fa21 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -44,6 +44,7 @@
 #include linux/spinlock.h/* spinlock_t, etc */
 #include linux/atomic.h  /* atomic_t, etc */
 #include linux/workqueue.h   /* struct work_struct */
+#include linux/module.h  /* try_module_get()/module_put() */
 
 #include rdma/rdma_cm.h  /* RDMA connection api */
 #include rdma/ib_verbs.h /* RDMA verbs api */
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Steve Wise



On 7/17/2014 9:01 AM, Devesh Sharma wrote:

If verndor driver is attempted for removal while xprtrdma still has an
active mount, the removal of driver may never complete and can cause
unseen races or in worst case system crash.

To solve this, xprtrdma module should get reference of struct ib_device
structure for every mount. Reference is taken after local device address
resolution is completed successfuly.

reference to the struct ib_device pointer is put just before cm_id destruction.

Signed-off-by: Devesh Sharma devesh.sha...@emulex.com


This seems like an issue with the rdma-cm or rdma core, not xprtrdma.  I 
see that user rdma applications cause a ref on the provider module here 
in ib_uverbs_open():


if (!try_module_get(dev-ib_dev-owner)) {
ret = -ENODEV;
goto err;


Maybe kernel applications that allocate device resources should cause a 
ref on the provider's module.


Sean/Roland,  is there some history here as to how rdma provider module 
removal should be handled?


Steve.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Chuck Lever

On Jul 17, 2014, at 11:01 AM, Steve Wise sw...@opengridcomputing.com wrote:

 
 
 On 7/17/2014 9:01 AM, Devesh Sharma wrote:
 If verndor driver is attempted for removal while xprtrdma still has an
 active mount, the removal of driver may never complete and can cause
 unseen races or in worst case system crash.
 
 To solve this, xprtrdma module should get reference of struct ib_device
 structure for every mount. Reference is taken after local device address
 resolution is completed successfuly.
 
 reference to the struct ib_device pointer is put just before cm_id 
 destruction.
 
 Signed-off-by: Devesh Sharma devesh.sha...@emulex.com
 
 This seems like an issue with the rdma-cm or rdma core, not xprtrdma.

Good point. I was wondering if svcrdma might have a similar issue.

Infrequently I see a hang during shutdown that appears to be related to
the ordering of removal of one of the RDMA modules, but I’ve never had
time to chase it.

See also: https://bugzilla.linux-nfs.org/show_bug.cgi?id=252


 I see that user rdma applications cause a ref on the provider module here in 
 ib_uverbs_open():
 
if (!try_module_get(dev-ib_dev-owner)) {
ret = -ENODEV;
goto err;
 
 
 Maybe kernel applications that allocate device resources should cause a ref 
 on the provider's module.
 
 Sean/Roland,  is there some history here as to how rdma provider module 
 removal should be handled?
 
 Steve.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Devesh Sharma

From: Steve Wise [sw...@opengridcomputing.com]
Sent: Thursday, July 17, 2014 8:31 PM
To: Devesh Sharma; Roland Dreier; Hefty, Sean
Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

On 7/17/2014 9:01 AM, Devesh Sharma wrote:
 If verndor driver is attempted for removal while xprtrdma still has an
 active mount, the removal of driver may never complete and can cause
 unseen races or in worst case system crash.

 To solve this, xprtrdma module should get reference of struct ib_device
 structure for every mount. Reference is taken after local device address
 resolution is completed successfuly.

 reference to the struct ib_device pointer is put just before cm_id 
 destruction.

 Signed-off-by: Devesh Sharma devesh.sha...@emulex.com

This seems like an issue with the rdma-cm or rdma core, not xprtrdma.  I
see that user rdma applications cause a ref on the provider module here
in ib_uverbs_open():

 if (!try_module_get(dev-ib_dev-owner)) {
 ret = -ENODEV;
 goto err;


Maybe kernel applications that allocate device resources should cause a
ref on the provider's module.

Yes, To me it looks like the right place to get reference of provider driver is
rdma_resolve_addr()--acuire_ib_dev(). However this patch provides a workaround.

Lets wait for Roland/Sean's input.


Sean/Roland,  is there some history here as to how rdma provider module
removal should be handled?

Steve.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Devesh Sharma
 Replying again, due to mail formatting issue:

 -Original Message-
 From: Chuck Lever [mailto:chuck.le...@oracle.com]
 Sent: Thursday, July 17, 2014 8:36 PM
 To: Steve Wise
 Cc: Devesh Sharma; Roland Dreier; Hefty, Sean; linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 On Jul 17, 2014, at 11:01 AM, Steve Wise sw...@opengridcomputing.com
 wrote:
 
 
 
  On 7/17/2014 9:01 AM, Devesh Sharma wrote:
  If verndor driver is attempted for removal while xprtrdma still has
  an active mount, the removal of driver may never complete and can
  cause unseen races or in worst case system crash.
 
  To solve this, xprtrdma module should get reference of struct
  ib_device structure for every mount. Reference is taken after local
  device address resolution is completed successfuly.
 
  reference to the struct ib_device pointer is put just before cm_id
 destruction.
 
  Signed-off-by: Devesh Sharma devesh.sha...@emulex.com
 
  This seems like an issue with the rdma-cm or rdma core, not xprtrdma.
 
 Good point. I was wondering if svcrdma might have a similar issue.
 
 Infrequently I see a hang during shutdown that appears to be related to the
 ordering of removal of one of the RDMA modules, but I've never had time to
 chase it.
 
 See also: https://bugzilla.linux-nfs.org/show_bug.cgi?id=252
 
 
  I see that user rdma applications cause a ref on the provider module here
 in ib_uverbs_open():
 
 if (!try_module_get(dev-ib_dev-owner)) {
 ret = -ENODEV;
 goto err;
 
 
  Maybe kernel applications that allocate device resources should cause a ref
 on the provider's module.

Yes, To me it looks like the right place to get reference of provider driver is 
rdma_resolve_addr()--acuire_ib_dev(). However this patch provides a workaround.

Let's wait for Roland/Sean's input.

 
  Sean/Roland,  is there some history here as to how rdma provider module
 removal should be handled?
 
  Steve.
 
 --
 Chuck Lever
 chuck[dot]lever[at]oracle[dot]com
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Hefty, Sean
 On 7/17/2014 9:01 AM, Devesh Sharma wrote:
  If verndor driver is attempted for removal while xprtrdma still has an
  active mount, the removal of driver may never complete and can cause
  unseen races or in worst case system crash.
 
  To solve this, xprtrdma module should get reference of struct ib_device
  structure for every mount. Reference is taken after local device address
  resolution is completed successfuly.
 
  reference to the struct ib_device pointer is put just before cm_id
 destruction.
 
  Signed-off-by: Devesh Sharma devesh.sha...@emulex.com
 
 This seems like an issue with the rdma-cm or rdma core, not xprtrdma.  I
 see that user rdma applications cause a ref on the provider module here
 in ib_uverbs_open():
 
  if (!try_module_get(dev-ib_dev-owner)) {
  ret = -ENODEV;
  goto err;
 
 
 Maybe kernel applications that allocate device resources should cause a
 ref on the provider's module.
 
 Sean/Roland,  is there some history here as to how rdma provider module
 removal should be handled?

The kernel modules should are not expected to access the rdma devices after 
their remove device callback has been invoked.  The rdma cm basically forwards 
the device removal on a per id basis.  Apps are expected to destroy the id 
after receiving that callback.  The rdma cm should block in the remove device 
call until all id's associated with the removed device have been destroyed.

- Sean
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Steve Wise


 -Original Message-
 From: Shirley Ma [mailto:shirley...@oracle.com]
 Sent: Thursday, July 17, 2014 1:58 PM
 To: Hefty, Sean; Steve Wise; Devesh Sharma; Roland Dreier
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 
 
 On 07/17/2014 09:06 AM, Hefty, Sean wrote:
  On 7/17/2014 9:01 AM, Devesh Sharma wrote:
  If verndor driver is attempted for removal while xprtrdma still has an
  active mount, the removal of driver may never complete and can cause
  unseen races or in worst case system crash.
 
  To solve this, xprtrdma module should get reference of struct ib_device
  structure for every mount. Reference is taken after local device address
  resolution is completed successfuly.
 
  reference to the struct ib_device pointer is put just before cm_id
  destruction.
 
  Signed-off-by: Devesh Sharma devesh.sha...@emulex.com
 
  This seems like an issue with the rdma-cm or rdma core, not xprtrdma.  I
  see that user rdma applications cause a ref on the provider module here
  in ib_uverbs_open():
 
   if (!try_module_get(dev-ib_dev-owner)) {
   ret = -ENODEV;
   goto err;
 
 
  Maybe kernel applications that allocate device resources should cause a
  ref on the provider's module.
 
  Sean/Roland,  is there some history here as to how rdma provider module
  removal should be handled?
 
  The kernel modules should are not expected to access the rdma devices after 
  their
 remove device callback has been invoked.  The rdma cm basically forwards the 
 device
 removal on a per id basis.  Apps are expected to destroy the id after 
 receiving that
callback.
 The rdma cm should block in the remove device call until all id's associated 
 with the
 removed device have been destroyed.
 
 So the rdma cm is expected to increase the driver reference count 
 (try_module_get) for
 each new cm id, then deference count (module_put) when cm id is destroyed?
 

No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event  to 
each
application with rdmacm objects allocated, and each application is expected to 
destroy all
the objects it has allocated before returning from the event handler.

And I think the ib_verbs core calls each ib_client's remove handler when an 
rdma provider
unregisters with the core.  

Steve.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Steve Wise


 -Original Message-
 From: Hefty, Sean [mailto:sean.he...@intel.com]
 Sent: Thursday, July 17, 2014 2:50 PM
 To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
   So the rdma cm is expected to increase the driver reference count
  (try_module_get) for
   each new cm id, then deference count (module_put) when cm id is
  destroyed?
  
 
  No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
  to each
  application with rdmacm objects allocated, and each application is expected
  to destroy all
  the objects it has allocated before returning from the event handler.
 
 This is almost correct.  The applications do not have to destroy all the 
 objects that it
has
 allocated before returning from their event handler.  E.g. an app can queue a 
 work item
 that does the destruction.  The rdmacm will block in its ib_client remove 
 handler until
all
 relevant rdma_cm_id's have been destroyed.
 

Thanks for the clarification.  

  And I think the ib_verbs core calls each ib_client's remove handler when an
  rdma provider
  unregisters with the core.
 
 yes

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Steve Wise


 -Original Message-
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Thursday, July 17, 2014 2:56 PM
 To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 
 
  -Original Message-
  From: Hefty, Sean [mailto:sean.he...@intel.com]
  Sent: Thursday, July 17, 2014 2:50 PM
  To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
So the rdma cm is expected to increase the driver reference count
   (try_module_get) for
each new cm id, then deference count (module_put) when cm id is
   destroyed?
   
  
   No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
   to each
   application with rdmacm objects allocated, and each application is 
   expected
   to destroy all
   the objects it has allocated before returning from the event handler.
 
  This is almost correct.  The applications do not have to destroy all the 
  objects that
it has
  allocated before returning from their event handler.  E.g. an app can queue 
  a work
item
  that does the destruction.  The rdmacm will block in its ib_client remove 
  handler
until all
  relevant rdma_cm_id's have been destroyed.
 
 
 Thanks for the clarification.
 

And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in 
rpcrdma_conn_upcall().
It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls 
rpcrdma_conn_func()
for that endpoint, which schedules rep_connect_worker...  and I gave up 
following the code
path at this point... :)  

For this to all work correctly, it would need to destroy all the QPs, MRs, CQs, 
etc for
that device _before_ destroying the rdma cm ids.  Otherwise the provider module 
could be
unloaded too soon...

Steve.




--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Shirley Ma


On 07/17/2014 12:55 PM, Steve Wise wrote:
 
 
 -Original Message-
 From: Hefty, Sean [mailto:sean.he...@intel.com]
 Sent: Thursday, July 17, 2014 2:50 PM
 To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

 So the rdma cm is expected to increase the driver reference count
 (try_module_get) for
 each new cm id, then deference count (module_put) when cm id is
 destroyed?


 No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
 to each
 application with rdmacm objects allocated, and each application is expected
 to destroy all
 the objects it has allocated before returning from the event handler.

 This is almost correct.  The applications do not have to destroy all the 
 objects that it
 has
 allocated before returning from their event handler.  E.g. an app can queue 
 a work item
 that does the destruction.  The rdmacm will block in its ib_client remove 
 handler until
 all
 relevant rdma_cm_id's have been destroyed.

 
 Thanks for the clarification.  

Thanks, checked the cma code, the reference count maintains there.

 And I think the ib_verbs core calls each ib_client's remove handler when an
 rdma provider
 unregisters with the core.

 yes
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Chuck Lever

On Jul 17, 2014, at 4:08 PM, Steve Wise sw...@opengridcomputing.com wrote:

 
 
 -Original Message-
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Thursday, July 17, 2014 2:56 PM
 To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 
 
 -Original Message-
 From: Hefty, Sean [mailto:sean.he...@intel.com]
 Sent: Thursday, July 17, 2014 2:50 PM
 To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
 Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 So the rdma cm is expected to increase the driver reference count
 (try_module_get) for
 each new cm id, then deference count (module_put) when cm id is
 destroyed?
 
 
 No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
 to each
 application with rdmacm objects allocated, and each application is expected
 to destroy all
 the objects it has allocated before returning from the event handler.
 
 This is almost correct.  The applications do not have to destroy all the 
 objects that
 it has
 allocated before returning from their event handler.  E.g. an app can queue 
 a work
 item
 that does the destruction.  The rdmacm will block in its ib_client remove 
 handler
 until all
 relevant rdma_cm_id's have been destroyed.
 
 
 Thanks for the clarification.
 
 
 And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in 
 rpcrdma_conn_upcall().
 It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls 
 rpcrdma_conn_func()
 for that endpoint, which schedules rep_connect_worker...  and I gave up 
 following the code
 path at this point... :)  
 
 For this to all work correctly, it would need to destroy all the QPs, MRs, 
 CQs, etc for
 that device _before_ destroying the rdma cm ids.  Otherwise the provider 
 module could be
 unloaded too soon…

We can’t really deal with a CM_DEVICE_REMOVE event while there are active
NFS mounts.

System shutdown ordering should guarantee (one would hope) that NFS
mount points are unmounted before the RDMA/IB core infrastructure is
torn down. Ordering shouldn’t matter as long all NFS activity has
ceased before the CM tries to remove the device.

So if something is hanging up the CM, there’s something xprtrdma is not
cleaning up properly.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Steve Wise


 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org 
 [mailto:linux-rdma-ow...@vger.kernel.org] On
 Behalf Of Chuck Lever
 Sent: Thursday, July 17, 2014 3:42 PM
 To: Steve Wise
 Cc: Hefty, Sean; Shirley Ma; Devesh Sharma; Roland Dreier; 
 linux-rdma@vger.kernel.org
 Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
 
 
 On Jul 17, 2014, at 4:08 PM, Steve Wise sw...@opengridcomputing.com wrote:
 
 
 
  -Original Message-
  From: Steve Wise [mailto:sw...@opengridcomputing.com]
  Sent: Thursday, July 17, 2014 2:56 PM
  To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider 
  module
 
 
 
  -Original Message-
  From: Hefty, Sean [mailto:sean.he...@intel.com]
  Sent: Thursday, July 17, 2014 2:50 PM
  To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
  Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
  Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider 
  module
 
  So the rdma cm is expected to increase the driver reference count
  (try_module_get) for
  each new cm id, then deference count (module_put) when cm id is
  destroyed?
 
 
  No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
  to each
  application with rdmacm objects allocated, and each application is 
  expected
  to destroy all
  the objects it has allocated before returning from the event handler.
 
  This is almost correct.  The applications do not have to destroy all the 
  objects
that
  it has
  allocated before returning from their event handler.  E.g. an app can 
  queue a work
  item
  that does the destruction.  The rdmacm will block in its ib_client remove 
  handler
  until all
  relevant rdma_cm_id's have been destroyed.
 
 
  Thanks for the clarification.
 
 
  And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
 rpcrdma_conn_upcall().
  It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls
 rpcrdma_conn_func()
  for that endpoint, which schedules rep_connect_worker...  and I gave up 
  following the
 code
  path at this point... :)
 
  For this to all work correctly, it would need to destroy all the QPs, MRs, 
  CQs, etc
for
  that device _before_ destroying the rdma cm ids.  Otherwise the provider 
  module could
 be
  unloaded too soon.
 
 We can't really deal with a CM_DEVICE_REMOVE event while there are active
 NFS mounts.
 
 System shutdown ordering should guarantee (one would hope) that NFS
 mount points are unmounted before the RDMA/IB core infrastructure is
 torn down. Ordering shouldn't matter as long all NFS activity has
 ceased before the CM tries to remove the device.
 
 So if something is hanging up the CM, there's something xprtrdma is not
 cleaning up properly.
 


Devesh, how are you reproducing this?  Are you just rmmod'ing the ocrdma module 
while
there are active mounts?






--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

2014-07-17 Thread Devesh Sharma
 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Steve Wise
 Sent: Friday, July 18, 2014 2:29 AM
 To: 'Chuck Lever'
 Cc: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'; linux-
 r...@vger.kernel.org
 Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
 module
 
 
 
  -Original Message-
  From: linux-rdma-ow...@vger.kernel.org
  [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Chuck Lever
  Sent: Thursday, July 17, 2014 3:42 PM
  To: Steve Wise
  Cc: Hefty, Sean; Shirley Ma; Devesh Sharma; Roland Dreier;
  linux-rdma@vger.kernel.org
  Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
  module
 
 
  On Jul 17, 2014, at 4:08 PM, Steve Wise sw...@opengridcomputing.com
 wrote:
 
  
  
   -Original Message-
   From: Steve Wise [mailto:sw...@opengridcomputing.com]
   Sent: Thursday, July 17, 2014 2:56 PM
   To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
   Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
   Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
   provider module
  
  
  
   -Original Message-
   From: Hefty, Sean [mailto:sean.he...@intel.com]
   Sent: Thursday, July 17, 2014 2:50 PM
   To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
   Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
   Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
   provider module
  
   So the rdma cm is expected to increase the driver reference
   count
   (try_module_get) for
   each new cm id, then deference count (module_put) when cm id is
   destroyed?
  
  
   No, I think he's saying the rdma-cm posts a
   RDMA_CM_DEVICE_REMOVAL event to each application with
 rdmacm
   objects allocated, and each application is expected to destroy
   all the objects it has allocated before returning from the event
   handler.
  
   This is almost correct.  The applications do not have to destroy
   all the objects
 that
   it has
   allocated before returning from their event handler.  E.g. an app
   can queue a work
   item
   that does the destruction.  The rdmacm will block in its ib_client
   remove handler
   until all
   relevant rdma_cm_id's have been destroyed.
  
  
   Thanks for the clarification.
  
  
   And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
  rpcrdma_conn_upcall().
   It sets ep-rep_connected to -ENODEV, wakes everybody up, and calls
  rpcrdma_conn_func()
   for that endpoint, which schedules rep_connect_worker...  and I gave
   up following the
  code
   path at this point... :)
  
   For this to all work correctly, it would need to destroy all the
   QPs, MRs, CQs, etc
 for
   that device _before_ destroying the rdma cm ids.  Otherwise the
   provider module could
  be
   unloaded too soon.
 
  We can't really deal with a CM_DEVICE_REMOVE event while there are
  active NFS mounts.
 
  System shutdown ordering should guarantee (one would hope) that NFS
  mount points are unmounted before the RDMA/IB core infrastructure is
  torn down. Ordering shouldn't matter as long all NFS activity has
  ceased before the CM tries to remove the device.
 
  So if something is hanging up the CM, there's something xprtrdma is
  not cleaning up properly.
 
 
 
 Devesh, how are you reproducing this?  Are you just rmmod'ing the ocrdma
 module while there are active mounts?

Yes, I am issuing rmmod while there is an active mount. In my case rmmod ocrdma 
remains blocked forever.
Off-the-course of this discussion: Is there a reasoning behind not using 
ib_register_client()/ib_unregister_client() framework?
 
 
 
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html