question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
Sean, I'm debugging some disconnect related race in iser - and wanted to check with you something re the CM/RDMA-CM state machine: I see that when a disconnect is initiated by the passive side (iser target) of a connection, such that the active side (iser initiator) gets RDMA_CM_EVENT_DISCO

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
>>   hca_id: mlx4_0 >>             transport:                      InfiniBand (0) >>             fw_ver:                        2.6.628 >>             node_guid:                      0003:ba00:0100:b1d8 >>             sys_image_guid:                0003:ba00:0100:b1db >>             vendor_id: 

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
> I'm debugging some disconnect related race in iser - and wanted to check > with you something re the CM/RDMA-CM state machine: I see that when a > disconnect is initiated by the passive side (iser target) of a > connection, such that the active side (iser initiator) gets > RDMA_CM_EVENT_DISCONN

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
On Mon, Nov 14, 2011 at 9:16 PM, Hefty, Sean wrote: > After disconnecting, the QP should enter the timewait state for twice the > packet lifetime. Does going through timewait always holds? e.g no matter what's the return status of rdma_disconnect and/or the status of the rdma_cm disconnected eve

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
> Does going through timewait always holds? e.g no matter what's the > return status of rdma_disconnect and/or the status of the rdma_cm > disconnected event? It usually holds. It will fail if rdma_disconnect() is called from a bogus state. But otherwise, I believe that it will enter timewait o

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
> It usually holds.  It will fail if rdma_disconnect() is called from a bogus > state.  But > otherwise, I believe that it will enter timewait on failure to send or > receive a disconnect > message mmm, so can these bogus states for rdma_disconnect to be called be better defined? basically, for

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
>>>    hca_id: mlx4_0 >>>             transport:                      InfiniBand (0) >>>             fw_ver:                         2.6.628 >>>             node_guid:                      0003:ba00:0100:b1d8 >>>             sys_image_guid:                 0003:ba00:0100:b1db >>>            

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
> mmm, so can these bogus states for rdma_disconnect to be called be > better defined? basically, for the case where the rdma_cm manages the > consumer QP, this call is the only way to move an RC QP into the error > state when the QP is okay and the consumer want to flush, etc. By bogus I mean ca

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
> By bogus I mean calling disconnect when the QP has never been connected, or > calling > disconnect twice what return value can serve as bogus indication for the application? is that -EINVAL? also, basically a QP could have buffers posted to it also before being connected (e.g after RTR or there

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Or Gerlitz
On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow wrote: > SRP uses RDMA, so you cannot use UC mode. per the IB spec, RDMA write is supported for UC Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
> what return value can serve as bogus indication for the application? > is that -EINVAL? also, basically a QP could have buffers posted to it > also before being connected (e.g after RTR or there's no point in time > an rdma-cm consumer for which the cma manages the QP state is exposed > to such Q

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Dave Dillow
On Mon, Nov 14, 2011 at 04:43:32PM -0500, Or Gerlitz wrote: > On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow wrote: > > SRP uses RDMA, so you cannot use UC mode. > > per the IB spec, RDMA write is supported for UC Yeah, I read that as UD for some reason...

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
> Hello again, Vladimir! > > To set up "/dev/mst/mt25418_pci_cr0" manually I've followed up > what "/etc/init.d/mst start" does. > I've seen that it uses the "minit" tool which is part of the arch > dependent mft-package. > > Is there maybe a possibility to get "minit" for sparc64? > Otherwise

[opensm] [PATCH 1/5] Free memory from osm_subn_opt_t when osm_subn_t destroyed

2011-11-14 Thread Albert Chu
Signed-off-by: Albert L. Chu --- opensm/osm_subnet.c | 73 +++ 1 files changed, 73 insertions(+), 0 deletions(-) diff --git a/opensm/osm_subnet.c b/opensm/osm_subnet.c index 554a950..03b73dd 100644 --- a/opensm/osm_subnet.c +++ b/opensm/osm_subne

[opensm] [PATCH 2/5] Move no_fallback_routing_engine from osm_subn_opt_t to osm_opensm_t.

2011-11-14 Thread Albert Chu
no_fallback_routing_engine is a convenience flag and not a configurable option, so it should not be in osm_subn_opt_t. Signed-off-by: Albert L. Chu --- include/opensm/osm_opensm.h |4 include/opensm/osm_subnet.h |1 - opensm/osm_opensm.c |4 +++- opensm/osm_ucast_mgr.c

[opensm] [PATCH 3/5] Fix rescan config file parsing spamming and loading bugs

2011-11-14 Thread Albert Chu
This patch fixes several major issues when config files are rescanned. First, config file changes were only noticed if users listed the config file option in the file. If a user commented out an option (or uncommented a previously commented out option) the change may not be noticed. Second, unde

[opensm] [PATCH 4/5] Fix potential memleak

2011-11-14 Thread Albert Chu
Note that with the new config file parsing code, different osm_subn_opt_t structures do not share pointers to the same strdup'ed memory. Therefore, this memory must be freed before reallocing to avoid a memleak. Signed-off-by: Albert L. Chu --- opensm/main.c |2 ++ 1 files changed, 2 insertions(+
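
The ownership rule described in this patch (each structure owns its own copy of a string, so the old copy must be freed before the pointer is overwritten) can be sketched roughly as follows. This is only an illustration: `replace_owned_str` is a hypothetical helper, not a function from opensm, and the actual patch to opensm/main.c is not shown in the snippet above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of opensm): replace a heap-owned string
 * field with a private copy of new_val.
 *
 * Returns 0 on success, -1 if the copy cannot be allocated. On failure
 * the old value is left untouched, so nothing leaks and nothing dangles.
 */
static int replace_owned_str(char **field, const char *new_val)
{
    char *copy = NULL;

    if (new_val != NULL) {
        size_t len = strlen(new_val) + 1;

        copy = malloc(len);
        if (copy == NULL)
            return -1;
        memcpy(copy, new_val, len);
    }

    /* Free the old copy before overwriting the pointer; skipping this
     * step is exactly the leak the patch description warns about.
     * free(NULL) is a no-op, so an unset field needs no special case. */
    free(*field);
    *field = copy;
    return 0;
}
```

Calling the helper twice on the same field, for example first with "a" and then with "bc", leaves exactly one live allocation; passing NULL as the new value releases the field.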

[opensm] [PATCH 5/5] Remove duplicate initialization of scatter_ports

2011-11-14 Thread Albert Chu
Signed-off-by: Albert L. Chu --- opensm/osm_subnet.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/opensm/osm_subnet.c b/opensm/osm_subnet.c index 6b6bd63..f9327a6 100644 --- a/opensm/osm_subnet.c +++ b/opensm/osm_subnet.c @@ -1009,7 +1009,6 @@ void osm_subn_set_default

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Roland Dreier
On Sun, Nov 13, 2011 at 12:26 AM, Vladimir Sokolovsky wrote: > Try to update HCA's firmware to the latest version > (http://www.mellanox.com/content/pages.php?pg=firmware_download). Independent of a firmware update, have you tried with an unpatched (upstream) kernel? 2.6.39.4 would be fine, or y

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
Roland Dreier wrote: > On Sun, Nov 13, 2011 at 12:26 AM, Vladimir Sokolovsky > wrote: >> Try to update HCA's firmware to the latest version >> (http://www.mellanox.com/content/pages.php?pg=firmware_download). > > Independent of a firmware update, have you tried with an unpatched (upstream) >

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
On Mon, Nov 14, 2011 at 11:55 PM, Hefty, Sean wrote: > [...] calling disconnect is one way that a QP may be transitioned into > timewait [...] I was talking on the QP "physical" state (e.g error that causes flushes) not the state w.r.t the IB CM. Or.

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Bart Van Assche
On Mon, Nov 14, 2011 at 10:43 PM, Or Gerlitz wrote: > On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow wrote: > > SRP uses RDMA, so you cannot use UC mode. > > per the IB spec, RDMA write is supported for UC Agreed. But an SRP target does not only issue RDMA write requests, it also issues RDMA read
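
The point under discussion (UC carries RDMA write but not RDMA read, while UD carries neither) can be written down as a small lookup. This is a sketch of the InfiniBand transport service rules only; the `demo_*` names are made up for illustration and are not verbs API identifiers, and atomics and the RD service are left out.

```c
#include <stdbool.h>

/* Illustrative names only; these are not libibverbs identifiers. */
enum demo_transport { DEMO_RC, DEMO_UC, DEMO_UD };
enum demo_op { DEMO_SEND, DEMO_RDMA_WRITE, DEMO_RDMA_READ };

/* Return true if the transport service type can carry the operation. */
static bool demo_supports(enum demo_transport t, enum demo_op op)
{
    switch (t) {
    case DEMO_RC:
        return true;                  /* send, RDMA write, RDMA read */
    case DEMO_UC:
        return op != DEMO_RDMA_READ;  /* send and RDMA write only */
    case DEMO_UD:
        return op == DEMO_SEND;       /* send only */
    }
    return false;
}
```

A protocol whose target side needs RDMA reads therefore fails the `demo_supports(DEMO_UC, DEMO_RDMA_READ)` check even though RDMA write alone would be fine on UC.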

Re: [opensm] [PATCH 1/5] Free memory from osm_subn_opt_t when osm_subn_t destroyed

2011-11-14 Thread Bart Van Assche
On Mon, Nov 14, 2011 at 11:49 PM, Albert Chu wrote: > +       if (opt->vlarb_high) > +               free(opt->vlarb_high); Those if-statements are superfluous - invoking free(NULL) is safe. See e.g. http://pubs.opengroup.org/onlinepubs/009695399/functions/free.html. Bart.
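
A minimal sketch of the review comment above, assuming a cut-down stand-in structure: `struct demo_opts` and `demo_opts_destroy` are hypothetical and much smaller than the real osm_subn_opt_t; only the field name `vlarb_high` comes from the quoted patch. Because the C standard defines free(NULL) as a no-op, the cleanup needs no `if` guards.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a structure with heap-owned string options. */
struct demo_opts {
    char *vlarb_high;   /* field name taken from the quoted patch */
    char *example_str;  /* made-up second field for illustration */
};

/*
 * Release every owned string without NULL checks.
 * Returns how many fields actually held memory (for demonstration only).
 */
static int demo_opts_destroy(struct demo_opts *opt)
{
    int held = (opt->vlarb_high != NULL) + (opt->example_str != NULL);

    free(opt->vlarb_high);   /* fine even if the field is NULL */
    free(opt->example_str);

    /* Reset the pointers so a second destroy call is also harmless. */
    opt->vlarb_high = NULL;
    opt->example_str = NULL;
    return held;
}
```

Calling `demo_opts_destroy` on a structure whose fields are all NULL is well defined, which is the case the guarded version was trying to protect.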