[PATCH] IB/qib: Return correct MAD when setting link width to 255
From: Mitko Haralanov mi...@qlogic.com

Fix a bug which causes the driver to return incorrect MADs in response to a Set(PortInfo) that sets the link width to 0xFF or the link speed to 0xF.

Signed-off-by: Mitko Haralanov mi...@qlogic.com
Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com
---
 drivers/infiniband/hw/qib/qib_mad.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 43f8bbd..7c44726 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -705,7 +705,7 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 	lwe = pip->link_width_enabled;
 	if (lwe) {
 		if (lwe == 0xFF)
-			lwe = ppd->link_width_supported;
+			set_link_width_enabled(ppd, ppd->link_width_supported);
 		else if (lwe >= 16 || (lwe & ~ppd->link_width_supported))
 			smp->status |= IB_SMP_INVALID_FIELD;
 		else if (lwe != ppd->link_width_enabled)
@@ -720,7 +720,8 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 		 * speeds.
 		 */
 		if (lse == 15)
-			lse = ppd->link_speed_supported;
+			set_link_speed_enabled(ppd,
+					ppd->link_speed_supported);
 		else if (lse >= 8 || (lse & ~ppd->link_speed_supported))
 			smp->status |= IB_SMP_INVALID_FIELD;
 		else if (lse != ppd->link_speed_enabled)
@@ -849,7 +850,7 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 	if (clientrereg)
 		pip->clientrereg_resv_subnetto |= 0x80;
 
-	goto done;
+	goto get_only;
 
 err:
 	smp->status |= IB_SMP_INVALID_FIELD;
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
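For readers following the first hunk, the width check can be read as the standalone sketch below. It is only an illustration of the decision order in the patch (0 means no change, 0xFF means "enable everything supported", then the range and mask checks). The struct, the function name apply_link_width() and the SMP_INVALID_FIELD value are hypothetical stand-ins, not the qib driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real qib structures. */
#define SMP_INVALID_FIELD 0x1c   /* assumed status value */

struct port {
	uint8_t link_width_supported;  /* bitmask of supported widths */
	uint8_t link_width_enabled;    /* bitmask currently enabled */
};

/*
 * Apply a requested LinkWidthEnabled value.
 * Returns 0 on success, or SMP_INVALID_FIELD if the request is rejected.
 */
static int apply_link_width(struct port *p, uint8_t lwe)
{
	assert(p != NULL);

	if (lwe == 0)                      /* 0 means "no change" */
		return 0;
	if (lwe == 0xFF) {                 /* enable everything supported */
		p->link_width_enabled = p->link_width_supported;
		return 0;
	}
	if (lwe >= 16 || (lwe & ~p->link_width_supported))
		return SMP_INVALID_FIELD;  /* out of range or unsupported bit */
	p->link_width_enabled = lwe;
	return 0;
}
```

The point of the sketch is that the 0xFF case has to update the enabled state itself, rather than only overwrite the local copy of the requested value.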
RE: [PATCH 0/2] Improved node descriptions
Roland,

Thanks for the positive feedback - I'll make the changes ASAP. Sorry about the subject lines; I thought I was doing the right thing by making them match.

The change is not in OFED yet, but QLogic has been using it in our own release for a while, and it works extremely well on large fabrics.

From: rol...@purestorage.com [rol...@purestorage.com] On Behalf Of Roland Dreier [rol...@kernel.org]
Sent: Thursday, February 17, 2011 6:20 PM
To: Mike Heinz
Cc: linux-rdma@vger.kernel.org
Subject: Re: [PATCH 0/2] Improved node descriptions

On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz michael.he...@qlogic.com wrote:
> This patch addresses the problem by providing a function to build the node description. If the provided source string for the description contains an '@' character, the function will substitute the current utsname. This ensures that even after a fabric has been completely initialized, if a node's hostname changes, that change will be reflected in the next sweep of the SM, but also maintains compatibility with existing code since the behavior is unchanged if the description string does not contain an '@' character.

This looks like a reasonable approach to me, although of course the SM has no way of knowing it should update a port's node description if a hostname changes.

Aside from some minor quibbles

 - next time please use different subjects for each patch in the thread
 - the prototype of ib_build_node_desc() seems to force every call site to have a cast; maybe the function should take a pointer to struct ib_smp instead?
 - the internals of ib_build_node_desc() look a bit ugly, is there any way to make it a little cleaner?

I do like this. Does anyone have any feelings about applying this for 2.6.39? Is this shipping in OFED?

 - R.

This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
RE: [PATCH 0/2] Improved node descriptions
Hal,

I'm in the process of smoke-testing changes to meet Roland's quibbles - I'm going to change the code to pass the structure instead of a char*.

Also, I just want to point out that even in the case where no enhanced trap is used, any change will still be seen by the SM the next time it sweeps. There will still be lag involved, but I don't think it's insurmountable.

-----Original Message-----
From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
Sent: Thursday, February 17, 2011 11:19 PM
To: Roland Dreier
Cc: Mike Heinz; linux-rdma@vger.kernel.org
Subject: Re: [PATCH 0/2] Improved node descriptions

On Thu, Feb 17, 2011 at 6:20 PM, Roland Dreier rol...@kernel.org wrote:
> On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz michael.he...@qlogic.com wrote:
>> This patch addresses the problem by providing a function to build the node description. If the provided source string for the description contains an '@' character, the function will substitute the current utsname. This ensures that even after a fabric has been completely initialized, if a node's hostname changes, that change will be reflected in the next sweep of the SM, but also maintains compatibility with existing code since the behavior is unchanged if the description string does not contain an '@' character.
>
> This looks like a reasonable approach to me, although of course the SM has no way of knowing it should update a port's node description if a hostname changes.

It does; there's an enhanced trap for this now.

> Aside from some minor quibbles
>
>  - next time please use different subjects for each patch in the thread
>  - the prototype of ib_build_node_desc() seems to force every call site to have a cast; maybe the function should take a pointer to struct ib_smp instead?
>  - the internals of ib_build_node_desc() look a bit ugly, is there any way to make it a little cleaner?
>
> I do like this. Does anyone have any feelings about applying this for 2.6.39? Is this shipping in OFED?

I need a little time to review this.

-- Hal

>  - R.
How to build a kernel module with OFED kernel
A newbie question. (Please cc me as I'm not on the linux-rdma list yet.)

The system was on RHEL 5.4. I used a source tarball to build the RHEL 5.5 kernel (2.6.18-194.el5). The kernel build was placed at /usr/src/linux. I rebooted to pick up the new kernel and it worked fine. Then:

1. On the new 2.6.18-194.el5 system, install OFED-1.5.2 ... succeeded.
2. Reboot to pick up the new ofa kernel ... succeeded (checked with modinfo).
3. Run a user mode rdma application ... succeeded.
4. The new ofa kernel modules are placed (by the OFED scripts) in the /lib/modules/2.6.18-194.el5/updates/ directory.
5. Build a kernel module (xx.ko) using the attached Makefile.

Now, trying to load xx.ko (or run it after forced loading) on top of the ofa kernel kmod(s) fails. The build apparently picks up the IB kmods from the /lib/modules/2.6.18-194.el5/drivers/infiniband directory instead of the /lib/modules/2.6.18-194.el5/updates directory, together with the wrong header files from /lib/modules/2.6.18-194.el5/source/include, where "source" is the /usr/src/linux directory.

Can anyone help me with the correct procedure? I do understand the primary usage of OFED RDMA is for user mode applications, but I need to have a kernel mode driver on top of OFED RDMA for some experimental work.

Thanks,
Wendy

== Make File ==
EXTRA_CFLAGS := -I/usr/src/linux/drivers/xx/include
EXTRA_CFLAGS += -DXX_KMOD_DEF

obj-m := xx_kmod.o
xx_kmod-y := main/xx_main.o main/xx_init.o \
	libxxverbs/xx_device.o libxxverbs/xx_cm.o \
	libxxverbs/xx_ar.o libxxverbs/xx_mr.o \
	libxxverbs/xx_cq.o libxxverbs/xx_sq.o
xx_kmod-y += util/xx_perf.o
xx_kmod-y += brd/xx_brd.o

kmod:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

install:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules_install

run:
	/sbin/depmod -a; echo 5 > /proc/sys/kernel/panic; modprobe --force-modversion xx_kmod
=== End attachment ===
Re: How to build a kernel module with OFED kernel
To get rid of the symbol mismatch warnings and not force the mod load, pull the Module.symvers from /usr/src/ofa_kernel/ and use that instead of the default one in the kernel build tree. Also, you'll need to include the ofa headers ahead of the backing kernel headers. It's kind of painful to do.

You can look at my krping tree for some ideas. krping can build in two modes: it gets built as an out-of-tree module, like what you're doing, or it also has a build mode where you place the krping src code inside the ofa kernel tree and build it there using the ofa kernel Makefiles. The latter is pretty easy to do. The former method really doesn't work very well for krping.

git://git.openfabrics.org/~swise/krping.git

Steve.

On 02/18/2011 12:59 PM, Wendy Cheng wrote:
> A newbie question. (Please cc me as I'm not on the linux-rdma list yet.)
>
> The system was on RHEL 5.4. I used a source tarball to build the RHEL 5.5 kernel (2.6.18-194.el5). The kernel build was placed at /usr/src/linux. I rebooted to pick up the new kernel and it worked fine. Then:
>
> 1. On the new 2.6.18-194.el5 system, install OFED-1.5.2 ... succeeded.
> 2. Reboot to pick up the new ofa kernel ... succeeded (checked with modinfo).
> 3. Run a user mode rdma application ... succeeded.
> 4. The new ofa kernel modules are placed (by the OFED scripts) in the /lib/modules/2.6.18-194.el5/updates/ directory.
> 5. Build a kernel module (xx.ko) using the attached Makefile.
>
> Now, trying to load xx.ko (or run it after forced loading) on top of the ofa kernel kmod(s) fails. The build apparently picks up the IB kmods from the /lib/modules/2.6.18-194.el5/drivers/infiniband directory instead of the /lib/modules/2.6.18-194.el5/updates directory, together with the wrong header files from /lib/modules/2.6.18-194.el5/source/include, where "source" is the /usr/src/linux directory.
>
> Can anyone help me with the correct procedure? I do understand the primary usage of OFED RDMA is for user mode applications, but I need to have a kernel mode driver on top of OFED RDMA for some experimental work.
>
> Thanks,
> Wendy
>
> == Make File ==
> EXTRA_CFLAGS := -I/usr/src/linux/drivers/xx/include
> EXTRA_CFLAGS += -DXX_KMOD_DEF
>
> obj-m := xx_kmod.o
> xx_kmod-y := main/xx_main.o main/xx_init.o \
> 	libxxverbs/xx_device.o libxxverbs/xx_cm.o \
> 	libxxverbs/xx_ar.o libxxverbs/xx_mr.o \
> 	libxxverbs/xx_cq.o libxxverbs/xx_sq.o
> xx_kmod-y += util/xx_perf.o
> xx_kmod-y += brd/xx_brd.o
>
> kmod:
> 	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
>
> install:
> 	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules_install
>
> run:
> 	/sbin/depmod -a; echo 5 > /proc/sys/kernel/panic; modprobe --force-modversion xx_kmod
> === End attachment ===
[PATCH 0/2] updated patch for improving node descriptions.
Updated the code as per Roland's requests, except for one part:

Roland, I tried to rework the main function to be cleaner, but because of the need to split the hostname from the domain name and the need to not modify the original source string, using library functions made things worse than the existing pointer math.

---

Michael Heinz (2):
      Function for improved node descriptions
      Add support for ib_build_node_desc() to the HCAs.

 drivers/infiniband/core/mad.c           |   19 +++
 drivers/infiniband/hw/ipath/ipath_mad.c |    2 +-
 drivers/infiniband/hw/mlx4/mad.c        |    2 +-
 drivers/infiniband/hw/mthca/mthca_mad.c |    2 +-
 drivers/infiniband/hw/qib/qib_mad.c     |    2 +-
 include/rdma/ib_mad.h                   |    9 +
 6 files changed, 32 insertions(+), 4 deletions(-)

--
Signed-off-by: Michael Heinz michael.he...@qlogic.com
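As a rough illustration of the substitution described in this thread, the sketch below copies a description string into a 64-byte buffer and replaces '@' with the host part of a node name, leaving both inputs untouched. It is a userspace approximation under stated assumptions, not the code from the patch: the name build_node_desc(), its signature, the choice to replace every '@' rather than only the first, and the use of strcspn() to drop the domain name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODE_DESC_LEN 64  /* node description size used at the call sites */

/*
 * Copy src into dst (NODE_DESC_LEN bytes, zero padded, silently truncated).
 * Each '@' in src is replaced by the host part of nodename, i.e. the text
 * before the first '.'. Neither src nor nodename is modified.
 */
static void build_node_desc(char dst[NODE_DESC_LEN], const char *src,
			    const char *nodename)
{
	size_t host_len, out = 0;

	assert(dst && src && nodename);
	host_len = strcspn(nodename, ".");   /* length without the domain */
	memset(dst, 0, NODE_DESC_LEN);

	for (; *src && out < NODE_DESC_LEN; src++) {
		if (*src == '@') {
			size_t room = NODE_DESC_LEN - out;
			size_t n = host_len < room ? host_len : room;

			memcpy(dst + out, nodename, n);
			out += n;
		} else {
			dst[out++] = *src;
		}
	}
}
```

With a source string of "@ HCA-1" and a node name of "node7.example.com", this sketch produces "node7 HCA-1"; a string without '@' is copied unchanged, which matches the compatibility behavior described for the patch.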
[PATCH 2/2] Add support for ib_build_node_desc() to the HCAs.
Adds support for ib_build_node_desc() to the HCAs.

Signed-off-by: Michael Heinz michael.he...@qlogic.com
---
 drivers/infiniband/hw/ipath/ipath_mad.c |    2 +-
 drivers/infiniband/hw/mlx4/mad.c        |    2 +-
 drivers/infiniband/hw/mthca/mthca_mad.c |    2 +-
 drivers/infiniband/hw/qib/qib_mad.c     |    2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c
index ceb98ee..bd1a112 100644
--- a/drivers/infiniband/hw/ipath/ipath_mad.c
+++ b/drivers/infiniband/hw/ipath/ipath_mad.c
@@ -60,7 +60,7 @@ static int recv_subn_get_nodedescription(struct ib_smp *smp,
 	if (smp->attr_mod)
 		smp->status |= IB_SMP_INVALID_FIELD;
 
-	memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+	ib_build_node_desc(smp, ibdev->node_desc);
 
 	return reply(smp);
 }
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 57ffa50..6a89c3c 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -196,7 +196,7 @@ static void node_desc_override(struct ib_device *dev,
 	    mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
 	    mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
 		spin_lock(&to_mdev(dev)->sm_lock);
-		memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+		ib_build_node_desc((struct ib_smp *) mad, dev->node_desc);
 		spin_unlock(&to_mdev(dev)->sm_lock);
 	}
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 03a5953..b442028 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -153,7 +153,7 @@ static void node_desc_override(struct ib_device *dev,
 	    mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
 	    mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
 		mutex_lock(&to_mdev(dev)->cap_mask_mutex);
-		memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+		ib_build_node_desc((struct ib_smp *) mad, dev->node_desc);
 		mutex_unlock(&to_mdev(dev)->cap_mask_mutex);
 	}
 }
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 5ad224e..1033112 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -260,7 +260,7 @@ static int subn_get_nodedescription(struct ib_smp *smp,
 	if (smp->attr_mod)
 		smp->status |= IB_SMP_INVALID_FIELD;
 
-	memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+	ib_build_node_desc(smp, ibdev->node_desc);
 
 	return reply(smp);
 }
Re: [RFC 0/8] Reliably generate large request from SRP
On Tue, 2011-01-18 at 21:31 -0800, Roland Dreier wrote:
> Hi Dave,
>
> > Now that at least one vendor is implementing full support for the SRP indirect memory descriptor tables, we can safely expand the sg_tablesize, and realize some performance gains, in many cases quite large. I don't have vendor code that implements the full support needed for safety, but the rareness of FMR mapping failures allows the mapping code to function, at a risk, with existing targets.
>
> Have you considered using memory registration through a send queue (from the base memory management extensions)? mlx4 at least has support for this operation, which would let you pre-allocate everything and avoid the possibility of failure (I think).

I'm looking at this now, and it may not be too bad; I think I can re-use the mapping machinery pretty easily without it getting too ugly. I do have a few questions, though.

It looks like I can only have up to 256 keys for each fast page registration MR, so that seems to limit the mappings in flight -- I can see cases where we may run out and have to push the request back to the mid-layer. We'd also be cycling through the key space pretty quickly, negating some of the advantages of being able to invalidate the mappings quickly.

I'm thinking about just allocating an MR per request -- this seems like a simple way to avoid the restrictions, but would of course use more MRs. mlx4 seems to allow just shy of 512 K of them, so that's probably not a large concern.

What do you think? Is that overkill? Do I misunderstand the BMM extensions?

--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
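To make the "256 keys" concern concrete, here is a small sketch of how a key could be cycled on a single MR. It assumes that the low 8 bits of an lkey/rkey form the consumer-owned key byte and that re-registering bumps only that byte; the helper name next_rkey() and the KEY_MASK constant are hypothetical and not taken from any driver.

```c
#include <stdint.h>

/* Assumption: the low 8 bits of an lkey/rkey are the consumer-owned key byte. */
#define KEY_MASK 0xffu

/* Return rkey with its key byte advanced by one, wrapping from 255 to 0. */
static uint32_t next_rkey(uint32_t rkey)
{
	uint8_t key = (uint8_t)((rkey & KEY_MASK) + 1);

	return (rkey & ~KEY_MASK) | key;
}
```

Under that assumption, a single MR returns to its starting key after 256 re-registrations, which is the rapid cycling through the key space mentioned above; one MR per request spreads that reuse across many independent key bytes at the cost of more MRs.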