[PATCH] IB/qib: Return correct MAD when setting link width to 255

2011-02-18 Thread Mike Marciniszyn
From: Mitko Haralanov mi...@qlogic.com

Fix a bug which causes the driver to return incorrect MADs
in response to a Set(PortInfo) that sets the link width
to 0xFF or the link speed to 0xF.

Signed-off-by: Mitko Haralanov mi...@qlogic.com
Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com
---
 drivers/infiniband/hw/qib/qib_mad.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 43f8bbd..7c44726 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -705,7 +705,7 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 	lwe = pip->link_width_enabled;
 	if (lwe) {
 		if (lwe == 0xFF)
-			lwe = ppd->link_width_supported;
+			set_link_width_enabled(ppd, ppd->link_width_supported);
 		else if (lwe >= 16 || (lwe & ~ppd->link_width_supported))
 			smp->status |= IB_SMP_INVALID_FIELD;
 		else if (lwe != ppd->link_width_enabled)
@@ -720,7 +720,8 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 		 * speeds.
 		 */
 		if (lse == 15)
-			lse = ppd->link_speed_supported;
+			set_link_speed_enabled(ppd,
+					       ppd->link_speed_supported);
 		else if (lse >= 8 || (lse & ~ppd->link_speed_supported))
 			smp->status |= IB_SMP_INVALID_FIELD;
 		else if (lse != ppd->link_speed_enabled)
@@ -849,7 +850,7 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 	if (clientrereg)
 		pip->clientrereg_resv_subnetto |= 0x80;
 
-	goto done;
+	goto get_only;
 
 err:
 	smp->status |= IB_SMP_INVALID_FIELD;

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 0/2] Improved node descriptions

2011-02-18 Thread Mike Heinz
Roland,

Thanks for the positive feedback - I'll make the changes ASAP. Sorry about the
subject lines; I thought I was doing the right thing by making them match.

The change is not in OFED yet, but QLogic has been using it in our own release
for a while, and it works extremely well on large fabrics.

From: rol...@purestorage.com [rol...@purestorage.com] On Behalf Of Roland 
Dreier [rol...@kernel.org]
Sent: Thursday, February 17, 2011 6:20 PM
To: Mike Heinz
Cc: linux-rdma@vger.kernel.org
Subject: Re: [PATCH 0/2] Improved node descriptions

On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz michael.he...@qlogic.com wrote:
> This patch addresses the problem by providing a function to build the node
> description. If the provided source string for the description contains an
> '@' character, the function will substitute the current utsname.
>
> This ensures that even after a fabric has been completely initialized, if
> a node's hostname changes, that change will be reflected in the next sweep
> of the SM, but also maintains compatibility with existing code since the
> behavior is unchanged if the description string does not contain an '@'
> character.

This looks like a reasonable approach to me, although of course the SM
has no way of knowing it should update a port's node description if a
hostname changes.

Aside from some minor quibbles
 - next time please use different subjects for each patch in the thread
 - the prototype of ib_build_node_desc() seems to force every call
   site to have a cast; maybe the function should take a pointer to
   struct ib_smp instead?
 - the internals of ib_build_node_desc() look a bit ugly, is there any
   way to make it a little cleaner?
I do like this.  Does anyone have any feelings about applying this
for 2.6.39?  Is this shipping in OFED?

 - R.


This message and any attached documents contain information from QLogic 
Corporation or its wholly-owned subsidiaries that may be confidential. If you 
are not the intended recipient, you may not read, copy, distribute, or use this 
information. If you have received this transmission in error, please notify the 
sender immediately by reply e-mail and then delete this message.



RE: [PATCH 0/2] Improved node descriptions

2011-02-18 Thread Mike Heinz
Hal,

I'm in the process of smoke-testing changes to meet Roland's quibbles - I'm 
going to change the code to pass the structure instead of a char*.

Also, I just want to point out that even in the case where no enhanced trap is 
used, any change will still be seen by the SM the next time it sweeps. There 
will still be lag involved, but I don't think it's insurmountable.

-----Original Message-----
From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
Sent: Thursday, February 17, 2011 11:19 PM
To: Roland Dreier
Cc: Mike Heinz; linux-rdma@vger.kernel.org
Subject: Re: [PATCH 0/2] Improved node descriptions

On Thu, Feb 17, 2011 at 6:20 PM, Roland Dreier rol...@kernel.org wrote:
> On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz michael.he...@qlogic.com wrote:
> > This patch addresses the problem by providing a function to build the node
> > description. If the provided source string for the description contains an
> > '@' character, the function will substitute the current utsname.
> >
> > This ensures that even after a fabric has been completely initialized, if
> > a node's hostname changes, that change will be reflected in the next sweep
> > of the SM, but also maintains compatibility with existing code since the
> > behavior is unchanged if the description string does not contain an '@'
> > character.
>
> This looks like a reasonable approach to me, although of course the SM
> has no way of knowing it should update a port's node description if a
> hostname changes.

It does; There's an enhanced trap for this now.

> Aside from some minor quibbles
>  - next time please use different subjects for each patch in the thread
>  - the prototype of ib_build_node_desc() seems to force every call
>    site to have a cast; maybe the function should take a pointer to
>    struct ib_smp instead?
>  - the internals of ib_build_node_desc() look a bit ugly, is there any
>    way to make it a little cleaner?
> I do like this.  Does anyone have any feelings about applying this
> for 2.6.39?  Is this shipping in OFED?

I need a little time to review this.

-- Hal

>  - R.

How to build a kernel module with OFED kernel

2011-02-18 Thread Wendy Cheng
A newbie question (please cc me, as I'm not on the linux-rdma list yet).

The system was on RHEL 5.4. I used a source tarball to build the RHEL 5.5
kernel (2.6.18-194.el5). The kernel build was placed at /usr/src/linux.
I rebooted to pick up the new kernel, which worked fine. Then:

1. On the new 2.6.18-194.el5 system, installed OFED-1.5.2: succeeded.
2. Rebooted to pick up the new ofa kernel: succeeded (checked with modinfo).
3. Ran a user mode rdma application: succeeded.
4. The new ofa kernel modules were placed (by the OFED scripts) in the
   /lib/modules/2.6.18-194.el5/updates/ directory.
5. Built a kernel module (xx.ko) using the attached Makefile.

Now, trying to load (or run after forced loading) xx.ko on top of the
ofa kernel kmod(s) fails, as the build apparently picks up the IB kmods
from the /lib/modules/2.6.18-194.el5/drivers/infiniband directory instead
of the /lib/modules/2.6.18-194.el5/updates directory, together with the
wrong header files from /lib/modules/2.6.18-194.el5/source/include, where
source is the /usr/src/linux directory.

Can anyone help me with the correct procedure?

I do understand that the primary usage of OFED RDMA is for user mode
applications, but I need to have a kernel mode driver on top of OFED
RDMA for some experimental work.

Thanks,
Wendy

== Make File ==

EXTRA_CFLAGS:= -I/usr/src/linux/drivers/xx/include
EXTRA_CFLAGS+= -DXX_KMOD_DEF

obj-m   := xx_kmod.o

xx_kmod-y   := main/xx_main.o main/xx_init.o \
   libxxverbs/xx_device.o libxxverbs/xx_cm.o \
   libxxverbs/xx_ar.o libxxverbs/xx_mr.o \
   libxxverbs/xx_cq.o libxxverbs/xx_sq.o

xx_kmod-y   += util/xx_perf.o
xx_kmod-y   += brd/xx_brd.o

kmod:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

install:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules_install

run:
	/sbin/depmod -a; echo 5 > /proc/sys/kernel/panic; modprobe --force-modversion xx_kmod

=== End attachment ===


Re: How to build a kernel module with OFED kernel

2011-02-18 Thread Steve Wise
To get rid of the symbol mismatch warnings without forcing the module load,
pull Module.symvers from /usr/src/ofa_kernel/ and use that instead of the
default one in the kernel build tree.

Also, you'll need to include the ofa headers ahead of the backing kernel
headers. It's kind of painful to do. You can look at my krping tree for some
ideas. krping can be built in two modes: as an out-of-tree module, like what
you're doing, or by placing the krping source code inside the ofa kernel tree
and building it there using the ofa kernel Makefiles. The latter is pretty
easy to do. The former method really doesn't work very well for krping.


git://git.openfabrics.org/~swise/krping.git


Steve.






[PATCH 0/2] updated patch for improving node descriptions.

2011-02-18 Thread Michael Heinz
Updated the code as per Roland's requests, except for one part:

Roland, I tried to rework the main function to be cleaner, but because of
the need to split the hostname from the domain name and the need to not
modify the original source string, using library functions made things
worse than the existing pointer math.
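
To make the two constraints concrete (drop the domain part of the
hostname, leave the source string untouched), here is a rough userspace
sketch. It is not the code from the patch: sketch_build_node_desc() and
NODE_DESC_LEN are hypothetical names, the hostname is passed in as a
parameter instead of being read from utsname, and replacing every '@'
(rather than only the first) is an assumption. The 64-byte size matches
the memcpy length used by the mlx4 and mthca drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of a node description field (hypothetical macro name). */
#define NODE_DESC_LEN 64

/*
 * Copy src into dest, replacing each '@' with the host part of
 * nodename (everything before the first '.'). Neither src nor
 * nodename is modified. Output is truncated to NODE_DESC_LEN bytes
 * and zero-padded; a completely full field is not NUL-terminated.
 */
static void sketch_build_node_desc(char dest[NODE_DESC_LEN],
				   const char *src, const char *nodename)
{
	size_t host_len = strcspn(nodename, ".");	/* strip domain */
	size_t out = 0;

	assert(dest != NULL && src != NULL && nodename != NULL);

	memset(dest, 0, NODE_DESC_LEN);
	for (; *src && out < NODE_DESC_LEN; src++) {
		if (*src == '@') {
			size_t n = host_len;

			if (n > NODE_DESC_LEN - out)
				n = NODE_DESC_LEN - out;
			memcpy(dest + out, nodename, n);
			out += n;
		} else {
			dest[out++] = *src;
		}
	}
}
```

With a source string of "@ HCA-1" and a nodename of "node01.example.com"
this produces "node01 HCA-1"; a source string without '@' is copied
through unchanged, which is the compatibility property described in the
original cover letter.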

---

Michael Heinz (2):
  Function for improved node descriptions
  Add support for ib_build_node_desc() to the HCAs.


 drivers/infiniband/core/mad.c   |   19 +++
 drivers/infiniband/hw/ipath/ipath_mad.c |2 +-
 drivers/infiniband/hw/mlx4/mad.c|2 +-
 drivers/infiniband/hw/mthca/mthca_mad.c |2 +-
 drivers/infiniband/hw/qib/qib_mad.c |2 +-
 include/rdma/ib_mad.h   |9 +
 6 files changed, 32 insertions(+), 4 deletions(-)

-- 
Signed-off-by: Michael Heinz michael.he...@qlogic.com


[PATCH 2/2] Add support for ib_build_node_desc() to the HCAs.

2011-02-18 Thread Michael Heinz
Adds support for ib_build_node_desc() to the HCAs.

Signed-off-by: Michael Heinz michael.he...@qlogic.com

---
 drivers/infiniband/hw/ipath/ipath_mad.c |2 +-
 drivers/infiniband/hw/mlx4/mad.c|2 +-
 drivers/infiniband/hw/mthca/mthca_mad.c |2 +-
 drivers/infiniband/hw/qib/qib_mad.c |2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c
index ceb98ee..bd1a112 100644
--- a/drivers/infiniband/hw/ipath/ipath_mad.c
+++ b/drivers/infiniband/hw/ipath/ipath_mad.c
@@ -60,7 +60,7 @@ static int recv_subn_get_nodedescription(struct ib_smp *smp,
 	if (smp->attr_mod)
 		smp->status |= IB_SMP_INVALID_FIELD;
 
-	memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+	ib_build_node_desc(smp, ibdev->node_desc);
 
 	return reply(smp);
 }
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 57ffa50..6a89c3c 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -196,7 +196,7 @@ static void node_desc_override(struct ib_device *dev,
 	    mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
 	    mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
 		spin_lock(&to_mdev(dev)->sm_lock);
-		memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+		ib_build_node_desc((struct ib_smp *) mad, dev->node_desc);
 		spin_unlock(&to_mdev(dev)->sm_lock);
 	}
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 03a5953..b442028 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -153,7 +153,7 @@ static void node_desc_override(struct ib_device *dev,
 	    mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
 	    mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
 		mutex_lock(&to_mdev(dev)->cap_mask_mutex);
-		memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+		ib_build_node_desc((struct ib_smp *) mad, dev->node_desc);
 		mutex_unlock(&to_mdev(dev)->cap_mask_mutex);
 	}
 }
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 5ad224e..1033112 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -260,7 +260,7 @@ static int subn_get_nodedescription(struct ib_smp *smp,
 	if (smp->attr_mod)
 		smp->status |= IB_SMP_INVALID_FIELD;
 
-	memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+	ib_build_node_desc(smp, ibdev->node_desc);
 
 	return reply(smp);
 }



Re: [RFC 0/8] Reliably generate large request from SRP

2011-02-18 Thread David Dillow
On Tue, 2011-01-18 at 21:31 -0800, Roland Dreier wrote:
> Hi Dave,
>
> > Now that at least one vendor is implementing full support for the SRP
> > indirect memory descriptor tables, we can safely expand the sg_tablesize,
> > and realize some performance gains, in many cases quite large. I don't
> > have vendor code that implements the full support needed for safety, but
> > the rareness of FMR mapping failures allows the mapping code to function,
> > at a risk, with existing targets.
>
> Have you considered using memory registration through a send queue (from
> the base memory management extensions)?  mlx4 at least has support for
> this operation, which would let you pre-allocate everything and avoid
> the possibility of failure (I think).

I'm looking at this now, and it may not be too bad; I think I can re-use
the mapping machinery pretty easily without it getting too ugly.

I do have a few questions, though. It looks like I can only have up to
256 keys for each fast page registration MR, so that seems to limit the
mappings in flight -- I can see cases where we may run out and have to
push the request back to the mid-layer. We'd also be cycling through the
key space pretty quickly, negating some of the advantages of being able
to invalidate the mappings quickly.

I'm thinking about just allocating an MR per request -- this seems like
a simple way to avoid the restrictions, but would of course use more
MRs. mlx4 seems to allow just shy of 512 K of them, so that's probably
not a large concern.

What do you think? Is that overkill? Do I misunderstand the BMM
extensions?
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



