Re: [ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
On 22:59 Tue 14 Jul , Tziporet Koren wrote:
>
> was this issue resolved?

Yes.

Sasha
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] EWG/OFED meeting minutes for July 13, 2009
These are the meeting minutes for the EWG/OFED meeting of July 13, 2009.

Meeting summary
===
1. OFED 1.5 alpha release is still not ready due to compilation issues.
2. OFED 1.4.2: Bug fixes will be submitted soon. RC is expected this week and GA next week.

Details:

1. OFED 1.5 alpha release
   * RHEL 5.4 - we will take backports
   * SLES 10 SP3 - Tziporet to ask Novell whether the kernel is available somewhere
   * New Qlogic driver will be sent to the kernel soon
   * PPC64 - compilation fails on mthca - Mellanox to fix it
   * Management package compilation still fails - Sasha is working on a fix
   * Decided to keep the sa cache patches since Qlogic are using them

2. OFED 1.4.2
   Brian found more bugs in 1.4.1 - 4 bugs are open, fixes are under testing.
   A major change in the headers will not be done in 1.4.2 since it puts release stability at risk (it can be done in 1.5).

   Discussion:
   Who will use 1.4.2:
   * Lustre customers
   * NFS/RDMA users

   1.4.2 quality disclaimer:
   * We are not going to recommend that the distros take OFED 1.4.2 as a package, since it is less tested and has not passed interop and OFA logo testing
   * A note on this should be written in the RN.

   Note: To prevent such a case in the future, EWG asks Sun to test Lustre on the OFED 1.5 RCs.

Comments:
I will be on vacation in the time frame of the next meeting. Betsy Zeller will replace me.

Tziporet
RE: [ewg] Wish to remove local sa patches from OFED 1.5
I'll let Sean explain it, but it is a user space daemon that uses an IB multicast scheme to derive path record information. The rdma_cm can use this rather than having to go to the SM to get path records, which we know does not scale. Sean can provide more details.

-Original Message-
From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
Sent: Tuesday, July 14, 2009 1:37 PM
To: Woodruff, Robert J
Cc: Tziporet Koren; Hefty, Sean; michael.bro...@qlogic.com; ewg@lists.openfabrics.org; amar.mudran...@qlogic.com
Subject: Re: [ewg] Wish to remove local sa patches from OFED 1.5

On Tue, Jul 14, 2009 at 3:58 PM, Woodruff, Robert J wrote:
> Sorry, I did not explain it clearly,
>
> What I meant to say was that the new userspace
> module could work with the rdma_cm to provide
> better scaling than the local sa cache module
> that is in the kernel, and if it does, the
> local sa cache feature might not be needed
> anymore.

What new userspace module are you referring to ?

-- Hal

> woody
>
> -Original Message-
> From: ewg-boun...@lists.openfabrics.org
> [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
> Sent: Tuesday, July 14, 2009 12:46 PM
> To: Hefty, Sean
> Cc: michael.bro...@qlogic.com; amar.mudran...@qlogic.com;
> ewg@lists.openfabrics.org
> Subject: Re: [ewg] Wish to remove local sa patches from OFED 1.5
>
> Sean Hefty wrote:
>>
>> I am working on a userspace app that should help with scaling for some
>> topologies, but I doubt it will work for all routing algorithms. I'm at
>> least a couple weeks away from posting anything.
>>
> I guess I didn't quite understand what Woody explained in the meeting
> Sorry about that
>
> Tziporet
[ewg] [PATCH OFED-1.4.2] RDMA/nes: fix qp refcount during disconnect
The qp was accessed after it had been freed during disconnect task handling, causing a system crash. Now we increment the qp's refcount before queue_work() and decrement it after the work completes.

Signed-off-by: Faisal Latif
---
 kernel_patches/fixes/nes_0350_qp_refcount.patch | 23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)
 create mode 100644 kernel_patches/fixes/nes_0350_qp_refcount.patch

diff --git a/kernel_patches/fixes/nes_0350_qp_refcount.patch b/kernel_patches/fixes/nes_0350_qp_refcount.patch
new file mode 100644
index 000..76e7bb0
--- /dev/null
+++ b/kernel_patches/fixes/nes_0350_qp_refcount.patch
@@ -0,0 +1,23 @@
+diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
+index 1856a21..96152b5 100644
+--- a/drivers/infiniband/hw/nes/nes_cm.c
++++ b/drivers/infiniband/hw/nes/nes_cm.c
+@@ -2461,6 +2461,7 @@ int nes_cm_disconn(struct nes_qp *nesqp)
+ 	if (nesqp->disconn_pending == 0) {
+ 		nesqp->disconn_pending++;
+ 		spin_unlock_irqrestore(&nesqp->lock, flags);
++		nes_add_ref(&nesqp->ibqp);
+ 		/* init our disconnect work element, to */
+ 		INIT_WORK(&nesqp->disconn_work, nes_disconnect_worker);
+ 
+@@ -2482,6 +2483,7 @@ static void nes_disconnect_worker(struct work_struct *work)
+ 	nes_debug(NES_DBG_CM, "processing AEQE id 0x%04X for QP%u.\n",
+ 			nesqp->last_aeq, nesqp->hwqp.qp_id);
+ 	nes_cm_disconn_true(nesqp);
++	nes_rem_ref(&nesqp->ibqp);
+ }
+ 
+ 
+--
+1.6.0
+
--
1.6.0
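The fix above follows a common pattern: take a reference before handing an object to deferred work, and drop it when the worker finishes. A minimal single-threaded userspace sketch of that idea follows. Every name in it (demo_qp, demo_disconn, the one-slot "work queue") is hypothetical and only stands in for the nes driver code; a real driver would use atomic counters and the kernel workqueue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue pair; not the nes driver's struct. */
struct demo_qp {
    int refcount;        /* number of current holders */
    bool freed;          /* set instead of calling free(), so it can be observed */
    int disconn_pending;
    int events_handled;
};

static void demo_qp_get(struct demo_qp *qp)
{
    qp->refcount++;
}

static void demo_qp_put(struct demo_qp *qp)
{
    if (--qp->refcount == 0)
        qp->freed = true;
}

/* One-slot "work queue" standing in for queue_work(). */
static struct demo_qp *pending_work;

static void demo_disconnect_worker(struct demo_qp *qp)
{
    qp->events_handled++;  /* safe: the queued work still owns a reference */
    demo_qp_put(qp);       /* mirrors nes_rem_ref() at the end of the worker */
}

/* Returns true if disconnect work was queued. */
static bool demo_disconn(struct demo_qp *qp)
{
    if (qp->disconn_pending != 0)
        return false;
    qp->disconn_pending++;
    demo_qp_get(qp);       /* mirrors nes_add_ref() before queue_work() */
    pending_work = qp;
    return true;
}

/* Runs the queued work, if any, as the workqueue thread would. */
static void demo_run_pending_work(void)
{
    struct demo_qp *qp = pending_work;

    pending_work = NULL;
    if (qp)
        demo_disconnect_worker(qp);
}
```

In this sketch, if the original owner drops its reference between demo_disconn() and demo_run_pending_work(), the object stays alive until the worker has run, which is the window the patch closes.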
Re: [ewg] Wish to remove local sa patches from OFED 1.5
On Tue, Jul 14, 2009 at 3:58 PM, Woodruff, Robert J wrote:
> Sorry, I did not explain it clearly,
>
> What I meant to say was that the new userspace
> module could work with the rdma_cm to provide
> better scaling than the local sa cache module
> that is in the kernel, and if it does, the
> local sa cache feature might not be needed
> anymore.

What new userspace module are you referring to ?

-- Hal

> woody
>
> -Original Message-
> From: ewg-boun...@lists.openfabrics.org
> [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
> Sent: Tuesday, July 14, 2009 12:46 PM
> To: Hefty, Sean
> Cc: michael.bro...@qlogic.com; amar.mudran...@qlogic.com;
> ewg@lists.openfabrics.org
> Subject: Re: [ewg] Wish to remove local sa patches from OFED 1.5
>
> Sean Hefty wrote:
>>
>> I am working on a userspace app that should help with scaling for some
>> topologies, but I doubt it will work for all routing algorithms. I'm at
>> least a couple weeks away from posting anything.
>>
> I guess I didn't quite understand what Woody explained in the meeting
> Sorry about that
>
> Tziporet
[ewg] [PATCH OFED-1.4.2] RDMA/nes: Make LRO as default feature
Make LRO as default feature

Signed-off-by: Faisal Latif
---
 kernel_patches/fixes/nes_0340_lro_default.patch | 15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)
 create mode 100644 kernel_patches/fixes/nes_0340_lro_default.patch

diff --git a/kernel_patches/fixes/nes_0340_lro_default.patch b/kernel_patches/fixes/nes_0340_lro_default.patch
new file mode 100644
index 000..cebd7bf
--- /dev/null
+++ b/kernel_patches/fixes/nes_0340_lro_default.patch
@@ -0,0 +1,15 @@
+diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
+index ef13030..0f6fbe7 100644
+--- a/drivers/infiniband/hw/nes/nes_nic.c
++++ b/drivers/infiniband/hw/nes/nes_nic.c
+@@ -1601,6 +1601,7 @@ struct net_device *nes_netdev_init(struct nes_device *nesdev,
+ 	netif_napi_add(netdev, &nesvnic->napi, nes_netdev_poll, 128);
+ 	nes_debug(NES_DBG_INIT, "Enabling VLAN Insert/Delete.\n");
+ 	netdev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
++	netdev->features |= NETIF_F_LRO;
+ 	netdev->vlan_rx_register = nes_netdev_vlan_rx_register;
+ 
+ 	/* Fill in the port structure */
+--
+1.5.3.3
+
--
1.6.0
RE: [ofa-general] RE: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support onlyQP1
Hi Robert,

Your suggestion to represent RDMAoE as a transport indeed makes the code simpler. Thus, we will have:

    switch (port_transport) {
    case RDMA_TRANSPORT_IB:
        ...
        break;
    case RDMA_TRANSPORT_RDMAOE:
        ...
        break;
    case RDMA_TRANSPORT_IWARP:
        ...
        break;
    };

instead of:

    switch (port_transport) {
    case RDMA_TRANSPORT_IB:
        if (port_type == IB) {
            ...
        } else {
            ...
        }
        break;
    case RDMA_TRANSPORT_IWARP:
        ...
        break;
    };

which is cleaner. In addition, for places in which IB and RDMAOE behave the same, we will have:

    case RDMA_TRANSPORT_IB:
    case RDMA_TRANSPORT_RDMAOE:
        ...
        break;

which will make this fact explicit.

The only difference is that the switch() will operate on the port transport rather than the node transport. (We can add a wrapper so that if the ib_dev didn't register a port-transport function, it defaults to the node transport.)

Thanks!
--Liran

-Original Message-
From: Liran Liss
Sent: Tuesday, July 14, 2009 11:53 AM
To: 'Woodruff, Robert J'; Eli Cohen; Hefty, Sean; Roland Dreier
Cc: ewg; general-list
Subject: RE: [ofa-general] RE: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support onlyQP1

S.B.
--Liran

> Trying to emulate IB for mad services is a total hack and not how this new transport should be added into the core. It should be its own transport type, just like iWarp was added.
> You should start with adding a new transport type to ib_verbs.h, e.g.,

LL: it is not a hack: RDMAoE will probably use mad services at least for connection management, and additional ones in the future.

--- ib_verbs.h 2009-07-13 09:06:10.0 -0400
+++ ib_verbs_new.h 2009-07-14 03:00:23.0 -0400
@@ -64,12 +64,14 @@ enum rdma_node_type {
 	RDMA_NODE_IB_CA = 1,
 	RDMA_NODE_IB_SWITCH,
 	RDMA_NODE_IB_ROUTER,
-	RDMA_NODE_RNIC
+	RDMA_NODE_RNIC,
+	RDMA_NODE_IBXOE
 };

LL: a multi-port HCA can have both IB and Ethernet ports, so this is not a per-node thing.

 enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
-	RDMA_TRANSPORT_IWARP
+	RDMA_TRANSPORT_IWARP,
+	RDMA_TRANSPORT_IBXOE
 };

LL: thanks, we will look into this. I am not sure that "transport" is the right terminology, since we are using the IB transport layer.

 enum rdma_transport_type
___
general mailing list
gene...@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
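The wrapper Liran mentions (use a per-port transport hook when the driver provides one, otherwise fall back to the node transport) could look roughly like the sketch below. This is only an illustration under assumptions: the demo_* types, the hook name get_port_transport, and the example consumer are hypothetical and are not the ib_core API.

```c
#include <stddef.h>

/* Hypothetical transport values; the names mirror the thread, not ib_verbs.h. */
enum demo_transport {
    DEMO_TRANSPORT_IB,
    DEMO_TRANSPORT_IWARP,
    DEMO_TRANSPORT_RDMAOE
};

struct demo_device {
    enum demo_transport node_transport;
    /* Optional per-port hook; NULL if the driver does not provide one. */
    enum demo_transport (*get_port_transport)(const struct demo_device *dev,
                                              unsigned port_num);
};

/* Wrapper: prefer the per-port answer, default to the node transport. */
static enum demo_transport demo_port_transport(const struct demo_device *dev,
                                               unsigned port_num)
{
    if (dev->get_port_transport)
        return dev->get_port_transport(dev, port_num);
    return dev->node_transport;
}

/* Example consumer: IB and RDMAoE share a case where they behave the same. */
static int demo_port_uses_qp1(const struct demo_device *dev, unsigned port_num)
{
    switch (demo_port_transport(dev, port_num)) {
    case DEMO_TRANSPORT_IB:
    case DEMO_TRANSPORT_RDMAOE:
        return 1;
    case DEMO_TRANSPORT_IWARP:
        return 0;
    }
    return 0;
}

/* Hypothetical dual-port HCA: port 1 is IB, port 2 is Ethernet (RDMAoE). */
static enum demo_transport demo_mixed_ports(const struct demo_device *dev,
                                            unsigned port_num)
{
    (void)dev;
    return port_num == 2 ? DEMO_TRANSPORT_RDMAOE : DEMO_TRANSPORT_IB;
}
```

Drivers that never set the hook keep their current per-node behavior, so only multi-protocol devices would need to change under this scheme.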
Re: [ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
Sasha Khapyorsky wrote:
> On 11:26 Tue 14 Jul , Jon Mason wrote:
>>
>> If you want assistance with test compiling on certain architectures,
>> I'll be happy to help.
>
> Your reports are very helpful.

was this issue resolved?

Tziporet
RE: [ewg] Wish to remove local sa patches from OFED 1.5
Sorry, I did not explain it clearly,

What I meant to say was that the new userspace
module could work with the rdma_cm to provide
better scaling than the local sa cache module
that is in the kernel, and if it does, the
local sa cache feature might not be needed
anymore.

woody

-Original Message-
From: ewg-boun...@lists.openfabrics.org [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
Sent: Tuesday, July 14, 2009 12:46 PM
To: Hefty, Sean
Cc: michael.bro...@qlogic.com; amar.mudran...@qlogic.com; ewg@lists.openfabrics.org
Subject: Re: [ewg] Wish to remove local sa patches from OFED 1.5

Sean Hefty wrote:
> I am working on a userspace app that should help with scaling for some
> topologies, but I doubt it will work for all routing algorithms. I'm at
> least a couple weeks away from posting anything.

I guess I didn't quite understand what Woody explained in the meeting
Sorry about that

Tziporet
Re: [ewg] Wish to remove local sa patches from OFED 1.5
Sean Hefty wrote:
> I am working on a userspace app that should help with scaling for some
> topologies, but I doubt it will work for all routing algorithms. I'm at
> least a couple weeks away from posting anything.

I guess I didn't quite understand what Woody explained in the meeting
Sorry about that

Tziporet
RE: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
Hal wrote,

>Unfortunately I don't think it's this simple although I wish it were.
>IBXOE is on a per port rather than a per node basis which is a
>different model than we've used for IB or iWARP.

>-- Hal

Yuk...

Can the driver/hardware present itself to the upper core layers as two separate NICs, even though the hardware has both an IB port and an Ethernet port on one card? As a multi-function PCI device or something?
Re: [ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
On 11:26 Tue 14 Jul , Jon Mason wrote:
>
> If you want assistance with test compiling on certain architectures,
> I'll be happy to help.

Your reports are very helpful.

Sasha
RE: [ewg] Wish to remove local sa patches from OFED 1.5
>> Since Qlogic are using some of the APIs in these files it was decided not to
>> remove them in 1.5
>> However Qlogic were requested to approach Sean and see if they can move
>> their implementation to the new SA API he is developing now
>
>Has this new SA API been proposed to the list as yet (and I missed it :-() ?

Um... I think there's been some miscommunication somewhere. I'm not developing an SA API, at least in my waking hours.

I am working on a userspace app that should help with scaling for some topologies, but I doubt it will work for all routing algorithms. I'm at least a couple weeks away from posting anything.

- Sean
Re: [ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
On Tue, Jul 14, 2009 at 07:09:49PM +0300, Sasha Khapyorsky wrote:
> On 10:32 Tue 14 Jul , Jon Mason wrote:
> >
> > Can you regress the infiniband-diags RPM in the nightlies to the version
> > found in 07/02 until Sasha has it working again?
>
> Why should we regress things when we are still in the pre-alpha phase? We
> need to fix problems instead.

I understand that there can be bugs in the code that will be found; that is to be expected at this early stage of development. However, I was under the impression that all code needed to compile before being submitted for inclusion in the build.

Internally, we are using the nightlies to verify my bug fixes. If we cannot build the nightlies, I cannot close my bugs. That is why we are hitting these build breaks so frequently (and why I keep bringing up this issue).

If you want assistance with test compiling on certain architectures, I'll be happy to help.

Thanks,
Jon

>
> Sasha
Re: [ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
On 10:32 Tue 14 Jul , Jon Mason wrote:
>
> Can you regress the infiniband-diags RPM in the nightlies to the version
> found in 07/02 until Sasha has it working again?

Why should we regress things when we are still in the pre-alpha phase? We need to fix problems instead.

Sasha
[ewg] OFED Nightly OFED-1.5-20090714-0600 build break in infiniband-diags
Last night's build is still breaking in infiniband-diags. Once we regress that package to the one found in the 07/02 nightly build, the nightly finishes the install without issue.

Can you regress the infiniband-diags RPM in the nightlies to the version found in 07/02 until Sasha has it working again?

Thanks,
Jon

Distribution: CentOS 5.3
CPU type: x86_64
Linux kernel: 2.6.30
OFED version: OFED-1.5-20090714-0600

# ./install.pl -- 2,3
...
Failed to build infiniband-diags RPM
See /tmp/OFED.7319.logs/infiniband-diags.rpmbuild.log

# less /tmp/OFED.7319.logs/infiniband-diags.rpmbuild.log
...
libtool: link: gcc -shared .libs/libibnetdisc_la-ibnetdisc.o .libs/libibnetdisc_la-chassis.o -losmcomp -libmad -libumad -m64 -mtune=generic -Wl,--version-script=./src/libibnetdisc.map -Wl,-soname -Wl,libibnetdisc.so.1 -o .libs/libibnetdisc.so.1.0.0
libtool: link: (cd ".libs" && rm -f "libibnetdisc.so.1" && ln -s "libibnetdisc.so.1.0.0" "libibnetdisc.so.1")
libtool: link: (cd ".libs" && rm -f "libibnetdisc.so" && ln -s "libibnetdisc.so.1.0.0" "libibnetdisc.so")
libtool: link: ar cru .libs/libibnetdisc.a libibnetdisc_la-ibnetdisc.o libibnetdisc_la-chassis.o
libtool: link: ranlib .libs/libibnetdisc.a
libtool: link: ( cd ".libs" && rm -f "libibnetdisc.la" && ln -s "../libibnetdisc.la" "libibnetdisc.la" )
make[2]: *** No rule to make target `man/ibnd_debug.3', needed by `all-am'. Stop.
make[2]: Leaving directory `/var/tmp/OFED_topdir/BUILD/infiniband-diags-1.5.2_20090709_d3b47d7/libibnetdisc'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/OFED_topdir/BUILD/infiniband-diags-1.5.2_20090709_d3b47d7'
make: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.42415 (%build)

RPM build errors:
    user vlad does not exist - using root
    group vlad does not exist - using root
    user vlad does not exist - using root
    group vlad does not exist - using root
    Bad exit status from /var/tmp/rpm-tmp.42415 (%build)
Re: [ewg] Wish to remove local sa patches from OFED 1.5
On Tue, Jul 14, 2009 at 9:58 AM, Tziporet Koren wrote:
> Jack Morgenstein wrote:
>>
>> Hello all,
>>
>> We wish to remove the local sa patches from OFED 1.5. The local SA is
>> disabled by default, and to the best of our knowledge, no one is using
>> it, though it has been around since OFED 1.3.
>> It has also never been accepted into the mainline kernel.
>>
>> We wish therefore to remove it from OFED 1.5.
>>
>> This includes, under kernel_patches/fixes, the following patches:
>> sean_local_sa_1_notifications.patch
>> sean_local_sa_2_cache.patch
>> sean_local_sa_3_disable.patch
>> sean_local_sa_4_fix_hang.patch
>>
>> If anyone objects to this removal, please let me know ASAP.
>>
>
> Since Qlogic are using some of the APIs in these files it was decided not to
> remove them in 1.5
> However Qlogic were requested to approach Sean and see if they can move
> their implementation to the new SA API he is developing now

Has this new SA API been proposed to the list as yet (and I missed it :-() ?

Thanks.

-- Hal

> so eventually we will be able to remove them
>
> Tziporet
Re: [ewg] Wish to remove local sa patches from OFED 1.5
Jack Morgenstein wrote:
> Hello all,
>
> We wish to remove the local sa patches from OFED 1.5. The local SA is
> disabled by default, and to the best of our knowledge, no one is using
> it, though it has been around since OFED 1.3.
> It has also never been accepted into the mainline kernel.
>
> We wish therefore to remove it from OFED 1.5.
>
> This includes, under kernel_patches/fixes, the following patches:
> sean_local_sa_1_notifications.patch
> sean_local_sa_2_cache.patch
> sean_local_sa_3_disable.patch
> sean_local_sa_4_fix_hang.patch
>
> If anyone objects to this removal, please let me know ASAP.

Since Qlogic are using some of the APIs in these files it was decided not to remove them in 1.5
However Qlogic were requested to approach Sean and see if they can move their implementation to the new SA API he is developing now, so eventually we will be able to remove them

Tziporet
Re: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
On Tue, Jul 14, 2009 at 9:38 AM, Eli Cohen wrote:
> On Tue, Jul 14, 2009 at 07:15:44AM -0400, Hal Rosenstock wrote:
>> On Tue, Jul 14, 2009 at 3:46 AM, Eli Cohen wrote:
>> > On Mon, Jul 13, 2009 at 03:26:34PM -0400, Hal Rosenstock wrote:
>> >> On Mon, Jul 13, 2009 at 2:14 PM, Eli Cohen wrote:
>> >> > Since RDMAoE is using Ethernet as its link layer, there is no need for
>> >> > QP0. QP1 is still needed since it handles communications between CM
>> >> > agents. This patch will create only QP1 for RDMAoE ports.
>> >>
>> >> What happens with other QP1 traffic (other than CM and SA) ?
>> > I think it should work but I haven't tried that.
>>
>> Would you ? You could try tools from infiniband-diags or ibdiagnet.
> Yes I would try that. But I need something that will not fail because
> it could not open QP0.

So opensm, ibdiagnet, and smpquery fail (error out) ?

> For example, something that uses only QP1. Are
> there any in ibutils?

In infiniband-diags, there are perfquery, saquery, vendstat, ibping, and ibsysstat which only use QP1. The latter two run as client/server and can take a GUID (not GID) as an argument.

-- Hal

>> >> Userspace
>> >> can access QP1 (and QP0).
>> > QP0 is not accessible since ib_register_mad_agent() will fail for QP0
>> > because of this:
>> >
>> >        if (!port_priv->qp_info[qp_type].qp)
>> >                return NULL;
>> >
>> > QP1 should work in the same way
>>
>> So what happens with things like PerfMgt class ? I think it ends up
>> timing out if no receiver consumer is present.
>>
>> >> Does QP0 error out ? What about QP1 ? Does
>> >> it just timeout ? If so, a direct error would be better.
>> >>
>> >
>> > See above - you can't access QP0. Do you know of a utility from
>> > userspace which sends/receives MADs on QP0 or QP1?
>>
>> Yes, opensm, infiniband-diags (various), and ibutils (ibdiagnet, etc).
>>
>> -- Hal
Re: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
On Tue, Jul 14, 2009 at 07:15:44AM -0400, Hal Rosenstock wrote:
> On Tue, Jul 14, 2009 at 3:46 AM, Eli Cohen wrote:
> > On Mon, Jul 13, 2009 at 03:26:34PM -0400, Hal Rosenstock wrote:
> >> On Mon, Jul 13, 2009 at 2:14 PM, Eli Cohen wrote:
> >> > Since RDMAoE is using Ethernet as its link layer, there is no need for
> >> > QP0. QP1 is still needed since it handles communications between CM
> >> > agents. This patch will create only QP1 for RDMAoE ports.
> >>
> >> What happens with other QP1 traffic (other than CM and SA) ?
> > I think it should work but I haven't tried that.
>
> Would you ? You could try tools from infiniband-diags or ibdiagnet.

Yes I would try that. But I need something that will not fail because it could not open QP0. For example, something that uses only QP1. Are there any in ibutils?

> >> Userspace
> >> can access QP1 (and QP0).
> > QP0 is not accessible since ib_register_mad_agent() will fail for QP0
> > because of this:
> >
> >        if (!port_priv->qp_info[qp_type].qp)
> >                return NULL;
> >
> > QP1 should work in the same way
>
> So what happens with things like PerfMgt class ? I think it ends up
> timing out if no receiver consumer is present.
>
> >> Does QP0 error out ? What about QP1 ? Does
> >> it just timeout ? If so, a direct error would be better.
> >>
> >
> > See above - you can't access QP0. Do you know of a utility from
> > userspace which sends/receives MADs on QP0 or QP1?
>
> Yes, opensm, infiniband-diags (various), and ibutils (ibdiagnet, etc).
>
> -- Hal
RE: [ofa-general] RE: [ewg] [PATCH 6/8 v3] IB/ipoib: restrict IPoIB to work on IB ports only
Oops, I meant "exits" instead of "exists"...

-Original Message-
From: Liran Liss
Sent: Tuesday, July 14, 2009 11:16 AM
To: 'Woodruff, Robert J'; Eli Cohen; Hefty, Sean; Roland Dreier
Cc: ewg; general-list
Subject: RE: [ofa-general] RE: [ewg] [PATCH 6/8 v3] IB/ipoib: restrict IPoIB to work on IB ports only

This exaclty the same as for iWARP: IPoIB checks the node transport, and if it is != IB, it exists.
For RDMAoE, we do the same check but at the port level.

-Original Message-
From: general-boun...@lists.openfabrics.org [mailto:general-boun...@lists.openfabrics.org] On Behalf Of Woodruff, Robert J
Sent: Tuesday, July 14, 2009 12:04 AM
To: Eli Cohen; Hefty, Sean; Roland Dreier
Cc: ewg; general-list
Subject: [ofa-general] RE: [ewg] [PATCH 6/8 v3] IB/ipoib: restrict IPoIB to work on IB ports only

Eli Cohen wrote,

>We don't want IPoIB to work over RDMAoE since it will give worse
>performance than working directly on Ethernet interfaces which are a
>prerequisite to RDMAoE anyway.

This is another reason why NOT to try to add IBxOE under the IB transport, but rather add it as it's own transport type. We should not need to hack all the InfiniBand ULPs to now have to know the difference between real IB and IBxOE.
Re: [ofa-general] Re: [ewg] [PATCH 1/8 v3] ib_core: Add API to support RDMAoE
On Tue, Jul 14, 2009 at 2:35 AM, Eli Cohen wrote:
> On Mon, Jul 13, 2009 at 03:26:06PM -0400, Hal Rosenstock wrote:
>> >
>> > +enum ib_port_link_type ib_get_port_link_type(struct ib_device *device, u8 port_num)
>> > +{
>> > +       return device->get_port_link_type ?
>> > +               device->get_port_link_type(device, port_num) : PORT_LINK_IB;
>>
>> So do iWARP devices return PORT_LINK_IB ? If so, that seems a little
>> weird to me.
>>
>> -- Hal
>>
>
> Maybe it's more appropriate to make this function mandatory and
> require all drivers to report the correct port type. What do you
> think?

That seems better to me; another alternative would be to require this routine for either all iWARP or IB devices as well but that might be error prone. Maybe someone else has a better idea on this.

-- Hal
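One way to read the "make this function mandatory" suggestion is to reject a device at registration time when the callback is missing, so the getter never needs a silent PORT_LINK_IB default. The sketch below illustrates that under assumptions: demo_dev, demo_register_device and DEMO_LINK_ETH are hypothetical names, and only the get_port_link_type callback name is taken from the quoted patch.

```c
#include <stddef.h>

/* Hypothetical link types; only an IB value appears in the quoted patch. */
enum demo_link_type { DEMO_LINK_IB, DEMO_LINK_ETH };

struct demo_dev {
    enum demo_link_type (*get_port_link_type)(const struct demo_dev *dev,
                                              unsigned char port_num);
};

/* Returns 0 on success, -1 if the mandatory callback is missing. */
static int demo_register_device(const struct demo_dev *dev)
{
    if (!dev || !dev->get_port_link_type)
        return -1;
    return 0;
}

/* No fallback here: registration already guaranteed the callback is set. */
static enum demo_link_type demo_get_port_link_type(const struct demo_dev *dev,
                                                   unsigned char port_num)
{
    return dev->get_port_link_type(dev, port_num);
}

/* Hypothetical Ethernet-only driver (for example an iWARP-style device). */
static enum demo_link_type demo_eth_only(const struct demo_dev *dev,
                                         unsigned char port_num)
{
    (void)dev;
    (void)port_num;
    return DEMO_LINK_ETH;
}
```

The trade-off is that every existing driver has to add the callback before it can register, which is the cost the thread is weighing against a possibly wrong default.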
Re: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
On Tue, Jul 14, 2009 at 3:46 AM, Eli Cohen wrote:
> On Mon, Jul 13, 2009 at 03:26:34PM -0400, Hal Rosenstock wrote:
>> On Mon, Jul 13, 2009 at 2:14 PM, Eli Cohen wrote:
>> > Since RDMAoE is using Ethernet as its link layer, there is no need for
>> > QP0. QP1 is still needed since it handles communications between CM
>> > agents. This patch will create only QP1 for RDMAoE ports.
>>
>> What happens with other QP1 traffic (other than CM and SA) ?
> I think it should work but I haven't tried that.

Would you ? You could try tools from infiniband-diags or ibdiagnet.

>> Userspace
>> can access QP1 (and QP0).
> QP0 is not accessible since ib_register_mad_agent() will fail for QP0
> because of this:
>
>        if (!port_priv->qp_info[qp_type].qp)
>                return NULL;
>
> QP1 should work in the same way

So what happens with things like PerfMgt class ? I think it ends up timing out if no receiver consumer is present.

>> Does QP0 error out ? What about QP1 ? Does
>> it just timeout ? If so, a direct error would be better.
>>
>
> See above - you can't access QP0. Do you know of a utility from
> userspace which sends/receives MADs on QP0 or QP1?

Yes, opensm, infiniband-diags (various), and ibutils (ibdiagnet, etc).

-- Hal
Re: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
On Mon, Jul 13, 2009 at 4:58 PM, Woodruff, Robert J wrote:
> Eli Cohen wrote,
>
>>Since RDMAoE is using Ethernet as its link layer, there is no need for QP0.
>>QP1 is still needed since it handles communications between CM agents. This
>>patch will create only QP1 for RDMAoE ports.
>
> Trying to emulate IB for mad services is a total hack and not how this
> new transport should be added into the core. It should be its own transport
> type, just like iWarp was added.
> You should start with adding a new transport type to ib_verbs.h,
> e.g.,
>
> --- ib_verbs.h 2009-07-13 09:06:10.0 -0400
> +++ ib_verbs_new.h 2009-07-14 03:00:23.0 -0400
> @@ -64,12 +64,14 @@ enum rdma_node_type {
>  	RDMA_NODE_IB_CA = 1,
>  	RDMA_NODE_IB_SWITCH,
>  	RDMA_NODE_IB_ROUTER,
> -	RDMA_NODE_RNIC
> +	RDMA_NODE_RNIC,
> +	RDMA_NODE_IBXOE
>  };
>
>  enum rdma_transport_type {
>  	RDMA_TRANSPORT_IB,
> -	RDMA_TRANSPORT_IWARP
> +	RDMA_TRANSPORT_IWARP,
> +	RDMA_TRANSPORT_IBXOE
>  };
>
>  enum rdma_transport_type

Unfortunately I don't think it's this simple although I wish it were. IBXOE is on a per port rather than a per node basis which is a different model than we've used for IB or iWARP.

-- Hal
[ewg] Re: [PATCH OFED 1.5] install.pl: mark cxgb3 and nes as available
Chien Tung wrote:
> Both cxgb3 and nes are compiling for OFED 1.5 so mark them as available.
> Also restore both userspace libraries for kernels other than 2.6.30.
>
> Signed-off-by: Chien Tung
> ---

Applied,

Thanks,
Vladimir
[ewg] Re: [PATCH] IB/qib: fix compiler errors for 2.6.30
Ralph Campbell wrote:
> Vlad,
>
> Please pull:
> git://git.openfabrics.org/~ralphc/linux-2.6/.git ofed_kernel_1_5
>
> This should fix the 2.6.30 build errors.
>
> commit 4047cc3120a7c9b36fbddc3158ae0c01ec6636d9
> Author: Ralph Campbell (QLogic)
> Date: Thu Jul 9 15:47:02 2009 -0700
>
>     IB/qib: backport QIB from 2.6.30 to 2.6.27
>
>     Signed-off-by: Ralph Campbell
>
> commit 7251f2dbf99d0a97ea936f277d85baecbeafa1fd
> Author: Ralph Campbell (QLogic)
> Date: Thu Jul 9 15:43:52 2009 -0700
>
>     IB/qib: fix compiler errors for 2.6.30
>
>     Signed-off-by: Ralph Campbell

Done,

Regards,
Vladimir
Re: [ewg] [GIT PULL OFED-1.4.2] NFSRDMA bug fixes
Jon Mason wrote:
> Hey Vlad,
> Please pull for OFED 1.4.2 fixes
> ssh://v...@sofa.openfabrics.org/home/jon/scm/ofed_kernel-1.4.git dev
>
> It contains fixes for 1675, 1676, and 1677. It also contains a fix for a
> RPC encode/decode issue found on reiserfs (which has the stale file
> handle issues). I can open a bug for this issue if you like.

Done,
Please open a bug.

Regards,
Vladimir
Re: [ewg] [GIT PULL OFED-1.5] NFSRDMA bug fixes
Jon Mason wrote:
> Hey Vlad,
> Please pull from
> ssh://v...@sofa.openfabrics.org/home/jon/scm/ofed_kernel-1.5.git dev
>
> It contains the following patches (notice that the RHEL5.4 patch is in
> this):

Done,

Regards,
Vladimir
[ewg] Re: [PATCH v4] libibmad: Handle MAD redirection
Hal Rosenstock wrote on 08.07.2009 18:48:29:
> > This patch should make its way into OFED 1.5... so who should pull it?
> > You? Vlad? Someone not on CC? Whoever, please apply for OFED 1.5 --
> > thanks!
>
> Sasha is the management maintainer. Userspace trees for OFED 1.5
> haven't been created and I think this aspect is in transition.

Sasha, can you apply this patch? Thanks!

Here's a link to the patch:
http://lists.openfabrics.org/pipermail/ewg/2009-July/013519.html

Cheers,
Joachim
RE: [ofa-general] RE: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support onlyQP1
S.B.
--Liran

> Trying to emulate IB for mad services is a total hack and not how this new transport should be added into the core. It should be its own transport type, just like iWarp was added.
> You should start with adding a new transport type to ib_verbs.h, e.g.,

LL: it is not a hack: RDMAoE will probably use mad services at least for connection management, and additional ones in the future.

--- ib_verbs.h 2009-07-13 09:06:10.0 -0400
+++ ib_verbs_new.h 2009-07-14 03:00:23.0 -0400
@@ -64,12 +64,14 @@ enum rdma_node_type {
 	RDMA_NODE_IB_CA = 1,
 	RDMA_NODE_IB_SWITCH,
 	RDMA_NODE_IB_ROUTER,
-	RDMA_NODE_RNIC
+	RDMA_NODE_RNIC,
+	RDMA_NODE_IBXOE
 };

LL: a multi-port HCA can have both IB and Ethernet ports, so this is not a per-node thing.

 enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
-	RDMA_TRANSPORT_IWARP
+	RDMA_TRANSPORT_IWARP,
+	RDMA_TRANSPORT_IBXOE
 };

LL: thanks, we will look into this. I am not sure that "transport" is the right terminology, since we are using the IB transport layer.

 enum rdma_transport_type
RE: [ofa-general] RE: [ewg] [PATCH 6/8 v3] IB/ipoib: restrict IPoIB to work on IB ports only
This is exactly the same as for iWARP: IPoIB checks the node transport, and if it is != IB, it exists.
For RDMAoE, we do the same check but at the port level.

-Original Message-
From: general-boun...@lists.openfabrics.org [mailto:general-boun...@lists.openfabrics.org] On Behalf Of Woodruff, Robert J
Sent: Tuesday, July 14, 2009 12:04 AM
To: Eli Cohen; Hefty, Sean; Roland Dreier
Cc: ewg; general-list
Subject: [ofa-general] RE: [ewg] [PATCH 6/8 v3] IB/ipoib: restrict IPoIB to work on IB ports only

Eli Cohen wrote,

>We don't want IPoIB to work over RDMAoE since it will give worse
>performance than working directly on Ethernet interfaces which are a
>prerequisite to RDMAoE anyway.

This is another reason why NOT to try to add IBxOE under the IB transport, but rather add it as its own transport type. We should not need to hack all the InfiniBand ULPs to now have to know the difference between real IB and IBxOE.
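The port-level check described above can be pictured as a ULP walking the ports of a device and attaching only to those with an IB link layer, rather than accepting or rejecting the whole device. A rough sketch follows, assuming a hypothetical demo_hca structure and at most four ports; none of these names come from the IPoIB code.

```c
/* Hypothetical per-port link layer values. */
enum demo_link { DEMO_PORT_IB, DEMO_PORT_ETH };

#define DEMO_MAX_PORTS 4

struct demo_hca {
    unsigned num_ports;
    enum demo_link port_link[DEMO_MAX_PORTS]; /* index 0 is port 1 */
};

/*
 * Marks created[i] = 1 for every port that would get an IPoIB-style
 * interface and returns how many were created. Ethernet (RDMAoE) ports
 * are skipped one by one instead of rejecting the whole device.
 */
static unsigned demo_ipoib_add_ports(const struct demo_hca *hca,
                                     int created[DEMO_MAX_PORTS])
{
    unsigned count = 0;

    for (unsigned i = 0; i < hca->num_ports && i < DEMO_MAX_PORTS; i++) {
        created[i] = (hca->port_link[i] == DEMO_PORT_IB);
        if (created[i])
            count++;
    }
    return count;
}
```

On a hypothetical dual-port HCA with one IB and one Ethernet port, this yields one interface, which a node-level check could not express.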
Re: [ewg] [PATCH 2/8 v3] ib_core: RDMAoE support only QP1
On Mon, Jul 13, 2009 at 03:26:34PM -0400, Hal Rosenstock wrote:
> On Mon, Jul 13, 2009 at 2:14 PM, Eli Cohen wrote:
> > Since RDMAoE is using Ethernet as its link layer, there is no need for QP0.
> > QP1 is still needed since it handles communications between CM agents.
> > This patch will create only QP1 for RDMAoE ports.
>
> What happens with other QP1 traffic (other than CM and SA) ?

I think it should work but I haven't tried that.

> Userspace
> can access QP1 (and QP0).

QP0 is not accessible since ib_register_mad_agent() will fail for QP0 because of this:

        if (!port_priv->qp_info[qp_type].qp)
                return NULL;

QP1 should work in the same way

> Does QP0 error out ? What about QP1 ? Does
> it just timeout ? If so, a direct error would be better.
>

See above - you can't access QP0. Do you know of a utility from userspace which sends/receives MADs on QP0 or QP1?
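The check Eli quotes gives callers the "direct error" Hal asks for: registration fails immediately when the special QP was never created, instead of a later send timing out. A small sketch of that behavior is below. The demo_* types are hypothetical, and a plain flag replaces the real qp pointer; only the shape of the check follows the quoted snippet.

```c
#include <stddef.h>

enum { DEMO_QP0 = 0, DEMO_QP1 = 1, DEMO_NUM_SPECIAL_QPS = 2 };

/* Hypothetical per-QP state: a flag instead of the real qp pointer. */
struct demo_qp_info { int created; };

struct demo_port {
    struct demo_qp_info qp_info[DEMO_NUM_SPECIAL_QPS];
};

struct demo_agent {
    struct demo_port *port;
    int qp_type;
};

/*
 * Fills in and returns `storage` on success, or NULL right away if the
 * requested special QP does not exist on this port.
 */
static struct demo_agent *demo_register_agent(struct demo_port *port,
                                              int qp_type,
                                              struct demo_agent *storage)
{
    if (qp_type < 0 || qp_type >= DEMO_NUM_SPECIAL_QPS)
        return NULL;
    if (!port->qp_info[qp_type].created)
        return NULL;    /* immediate error, not a later timeout */

    storage->port = port;
    storage->qp_type = qp_type;
    return storage;
}
```

For a port set up the way the patch describes (QP1 only), a QP0 registration returns NULL while a QP1 registration succeeds.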