Re: [PATCH 5/6] opensm: Add command-line option --pidfile
Hi Bart,

On 16:20 Tue 11 Sep, Bart Van Assche wrote:
> This option is necessary to control opensm from an LSB-compliant init script.

This option should also be added to opensm.conf and documented in the man page.

-- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
how to preserve QP over HA events for librdmacm applications
Hi Sean,

We have a case here where an app which uses librdmacm wants to preserve its QP over HA events such as IB link down/up. Specifically, the sequence of operations done by the app is the following:

1. rdma_create_id using the IPoIB port space
2. rdma_bind_addr
3. rdma_create_qp using the UD QP type

We are looking for a way to reset this QP such that any pending send buffers are flushed out and the QP then returns to a functional state (RTS), ideally keeping the same QPN.

Using rdma_disconnect does move the QP to the error state and the buffers are flushed. However, there is no way to modify the QP state back to RTS, etc. via librdmacm.

Can this flushing be done somehow with the current librdmacm/libibverbs APIs, or do we need some enhancement?

Or.
RE: how to preserve QP over HA events for librdmacm applications
> Can this flushing be somehow done with the current librdmacm/libibverbs APIs or we need some enhancement?

You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway.

- Sean
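The transition sequence Sean refers to can be sketched as a small stand-alone model. This is not libibverbs code: the enum and function names below are local stand-ins invented for illustration. In a real application each step would be a call to ibv_modify_qp() with the corresponding IBV_QPS_* state and IBV_QP_* attribute mask. The sketch assumes a UD QP, for which the pkey index, port and qkey are supplied on the RESET to INIT step and the send PSN on the RTR to RTS step. The QP number is assigned at creation, so walking the states does not change it.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for ibv_qp_state (IBV_QPS_RESET, IBV_QPS_INIT, ...). */
enum qp_state { QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS, QPS_ERR };

/* Local stand-ins for the IBV_QP_* mask bits; the numeric values are
 * arbitrary and are NOT the library's values. */
enum qp_attr_bit {
    ATTR_STATE      = 1 << 0,
    ATTR_PKEY_INDEX = 1 << 1,
    ATTR_PORT       = 1 << 2,
    ATTR_QKEY       = 1 << 3,
    ATTR_SQ_PSN     = 1 << 4
};

struct qp_step {
    enum qp_state from;
    enum qp_state to;
    int required_mask;   /* attributes that must accompany the step */
};

/* Recovery path for a UD QP: ERR -> RESET -> INIT -> RTR -> RTS. */
static const struct qp_step ud_recovery_path[] = {
    { QPS_ERR,   QPS_RESET, ATTR_STATE },
    { QPS_RESET, QPS_INIT,  ATTR_STATE | ATTR_PKEY_INDEX | ATTR_PORT | ATTR_QKEY },
    { QPS_INIT,  QPS_RTR,   ATTR_STATE },
    { QPS_RTR,   QPS_RTS,   ATTR_STATE | ATTR_SQ_PSN }
};

#define UD_RECOVERY_STEPS \
    (sizeof(ud_recovery_path) / sizeof(ud_recovery_path[0]))

/* Hypothetical model of a QP: only the number and the current state. */
struct model_qp {
    unsigned int qp_num;
    enum qp_state state;
};

/* Stand-in for ibv_modify_qp(): accepts a step only if it is on the
 * recovery path and all required attribute bits are present. */
static int model_modify_qp(struct model_qp *qp, enum qp_state to, int mask)
{
    for (size_t i = 0; i < UD_RECOVERY_STEPS; i++) {
        const struct qp_step *s = &ud_recovery_path[i];
        if (s->from == qp->state && s->to == to) {
            if ((mask & s->required_mask) != s->required_mask)
                return -1;          /* a required attribute is missing */
            qp->state = to;
            return 0;
        }
    }
    return -1;                      /* transition not on the path */
}

/* Walk a QP that is in the error state back to RTS. */
static int ud_qp_recover(struct model_qp *qp)
{
    assert(qp != NULL);
    if (qp->state != QPS_ERR)
        return -1;
    for (size_t i = 0; i < UD_RECOVERY_STEPS; i++) {
        if (model_modify_qp(qp, ud_recovery_path[i].to,
                            ud_recovery_path[i].required_mask) != 0)
            return -1;
    }
    return 0;
}
```

Calling ud_qp_recover() on a model QP in QPS_ERR leaves it in QPS_RTS with qp_num untouched; asking for ERR to RTS in a single step is rejected, as is RESET to INIT without the pkey index, port and qkey bits.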
Re: how to preserve QP over HA events for librdmacm applications
On 19/09/2012 18:48, Hefty, Sean wrote:
>> Can this flushing be somehow done with the current librdmacm/libibverbs APIs or we need some enhancement?
> You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway.

Alex,

Any reason we can't deploy this hack? Is it that for the IPoIB port space it would require copying some low-level code from librdmacm or even from the kernel, e.g. the IPoIB qkey, etc.?

Or.
RE: how to preserve QP over HA events for librdmacm applications
Since we use RDMA_PS_IPOIB, we need librdmacm to help get the correct pkey_index and qkey (in the INIT-RTR transition) so that they match the values of IPoIB's own UD QP. If they do not match, then our user-space UD QP will not be able to send to or receive from IPoIB on remote machines (which is what we want to gain by using the IPoIB port space).

Maybe we can save the values used by rdma_create_qp and reuse them when we modify the UD QP state through libibverbs (ibv_modify_qp).

It would be nice if we had access to the rdma_cm modify-QP wrapper so that this could be done cleanly at the application level.

Alex
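Alex's idea of saving the values and reusing them can be sketched as follows. This is a stand-alone model under stated assumptions, not librdmacm or libibverbs code: struct model_qp_attr is a local stand-in that borrows the field names pkey_index, port_num and qkey from struct ibv_qp_attr, and the helper names are hypothetical. In a real application the values would most likely be read with ibv_query_qp() while the QP is still healthy (for example right after rdma_create_qp()), and written back through ibv_modify_qp() on the transition to INIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the few struct ibv_qp_attr fields used here. */
struct model_qp_attr {
    uint16_t pkey_index;
    uint8_t  port_num;
    uint32_t qkey;
};

/* Hypothetical application-side copy of the UD parameters that
 * rdma_create_qp() established for the IPoIB port space. */
struct ud_saved_params {
    uint16_t pkey_index;
    uint8_t  port_num;
    uint32_t qkey;
    int      valid;      /* set once the parameters have been captured */
};

/* Capture the parameters from a queried attribute structure. */
static void ud_params_save(const struct model_qp_attr *queried,
                           struct ud_saved_params *saved)
{
    assert(queried != NULL && saved != NULL);
    saved->pkey_index = queried->pkey_index;
    saved->port_num   = queried->port_num;
    saved->qkey       = queried->qkey;
    saved->valid      = 1;
}

/* Fill the attributes for the transition to INIT from the saved copy.
 * Returns -1 if nothing was captured beforehand. */
static int ud_params_fill_init(const struct ud_saved_params *saved,
                               struct model_qp_attr *attr)
{
    assert(saved != NULL && attr != NULL);
    if (!saved->valid)
        return -1;
    memset(attr, 0, sizeof(*attr));
    attr->pkey_index = saved->pkey_index;
    attr->port_num   = saved->port_num;
    attr->qkey       = saved->qkey;
    return 0;
}
```

Keeping the copy in the application avoids duplicating the librdmacm or kernel logic that derives the IPoIB qkey and pkey index, which is the concern Or raised.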
Re: how to preserve QP over HA events for librdmacm applications
On Sep 19, 2012, at 11:58 AM, Alex Rosenbaum al...@mellanox.com wrote:
> Since we use the RDMA_PS_IPOIB we need librdmacm to help get the correct pkey_index and qkey (in INIT-RTR transition) to match IPoIB's UD QP own values. If not, then our user space UD QP will not be able to send/recv from IPoIB on remote machines (which is what we want to gain by using the IPOIB port space).
> Maybe we can save the values used from the rdma_create_qp and reuse them once modify the UD QP state by libverbs (ibv_modify_qp).
> It would be nice if we had access to the rdma's modify qp wrapper to do this nicely from application level.

I too would be interested in bringing a QP from error back to a usable state. I have been debating whether to reconnect using the current RDMA calls versus trying to transition the existing RC QP. I assumed that, to transition the existing QP, I would need to open a socket to coordinate the two sides. Is that correct?

If I were instead to use rdma_connect(), does it require a new CM id or just a new QP within the same id?

Thanks,
Scott
Re: how to preserve QP over HA events for librdmacm applications
On Sep 19, 2012, at 1:05 PM, Hefty, Sean sean.he...@intel.com wrote:
>> I too would be interested in bringing a QP from error back to a usable state. I have been debating whether to reconnect using the current RDMA calls versus trying to transition the existing RC QP. I assumed to transition the existing QP that I would need to open a socket to coordinate the two sides. Is that correct?
>> If I were instead to use rdma_connect(), does it require a new CM id or just a new QP within the same id?
> What do you gain by transitioning an RC QP from error to RTS, versus just establishing a new connection?

I have a certain amount of state regarding a peer. I look up that state based on the qp_num returned within a work completion, for example. If I reconnect, I will need to migrate the state from the old qp_num to the new qp_num.

I have no preference, which is why I asked about the two options (opening a socket to coordinate state transitions versus connecting with a new QP).

Scott
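The migration Scott describes, moving per-peer state from the old qp_num to the new one after a reconnect, can be sketched with a small lookup table. Everything below is hypothetical application code: the table layout, the fixed slot count and the function names are assumptions made for illustration, and only the idea of keying on the qp_num field of a work completion (struct ibv_wc) comes from the message. A linear scan keeps the sketch short; an application with many peers would likely use a hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PEER_TABLE_SLOTS 64   /* arbitrary size for the sketch */

/* Hypothetical per-peer state owned by the application. */
struct peer_state {
    int peer_id;
};

struct peer_entry {
    uint32_t qp_num;          /* key: QP number seen in completions */
    struct peer_state *peer;
    int in_use;
};

struct peer_table {
    struct peer_entry slots[PEER_TABLE_SLOTS];
};

static void peer_table_init(struct peer_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

static struct peer_entry *peer_find(struct peer_table *t, uint32_t qp_num)
{
    for (size_t i = 0; i < PEER_TABLE_SLOTS; i++) {
        if (t->slots[i].in_use && t->slots[i].qp_num == qp_num)
            return &t->slots[i];
    }
    return NULL;
}

/* Register a peer under a QP number; fails on duplicates or a full table. */
static int peer_table_add(struct peer_table *t, uint32_t qp_num,
                          struct peer_state *peer)
{
    assert(t != NULL && peer != NULL);
    if (peer_find(t, qp_num) != NULL)
        return -1;
    for (size_t i = 0; i < PEER_TABLE_SLOTS; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].qp_num = qp_num;
            t->slots[i].peer = peer;
            t->slots[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Look up the peer for the qp_num reported in a work completion. */
static struct peer_state *peer_table_lookup(struct peer_table *t,
                                            uint32_t qp_num)
{
    struct peer_entry *e = peer_find(t, qp_num);
    return e != NULL ? e->peer : NULL;
}

/* After reconnecting with a new QP, move the state to the new key. */
static int peer_table_rekey(struct peer_table *t, uint32_t old_qp_num,
                            uint32_t new_qp_num)
{
    struct peer_entry *e = peer_find(t, old_qp_num);
    if (e == NULL || peer_find(t, new_qp_num) != NULL)
        return -1;
    e->qp_num = new_qp_num;
    return 0;
}
```

With a table like this, reconnecting with a new QP costs one peer_table_rekey() call per peer, while recovering the existing QP needs no table update because the qp_num stays the same.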
Re: how to preserve QP over HA events for librdmacm applications
On Sep 19, 2012, at 2:14 PM, Atchley, Scott atchle...@ornl.gov wrote:
> On Sep 19, 2012, at 1:05 PM, Hefty, Sean sean.he...@intel.com wrote:
>>> I too would be interested in bringing a QP from error back to a usable state. I have been debating whether to reconnect using the current RDMA calls versus trying to transition the existing RC QP. I assumed to transition the existing QP that I would need to open a socket to coordinate the two sides. Is that correct?
>>> If I were instead to use rdma_connect(), does it require a new CM id or just a new QP within the same id?
>> What do you gain by transitioning an RC QP from error to RTS, versus just establishing a new connection?
> I have a certain amount of state regarding a peer. I lookup that state based on the qp_num returned within a work completion, for example. If I reconnect, I will need to migrate the state from the old qp_num to the new qp_num.
> I have no preference which is why I asked about the two options (opening a socket to coordinate state transitions versus connecting with a new QP).

I don't know if it matters to the conversation or not, but I use an SRQ. I am unclear how to remove a QP from the SRQ. Is ibv_destroy_qp() sufficient? Or do I need to use rdma_destroy_qp()?

Basically, I use the rdma_* calls for connection setup. After that, I use only ibv_* calls for communication (Send/Recv and RDMA).

Scott
RE: how to preserve QP over HA events for librdmacm applications
> I don't know if it matters to the conversation or not, but I use an SRQ. I am unclear how to remove a QP from the SRQ. Is ibv_destroy_qp() sufficient? Or do I need to use rdma_destroy_qp()?

rdma_destroy_qp() is a wrapper around ibv_destroy_qp(), plus destroys any internally allocated resources, like CQs, if the rdma_cm allocated those for the user. If you called ibv_create_cq() yourself, either is sufficient.