Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Or Gerlitz

On 1/12/2012 11:23 AM, Sebastian Riemer wrote:
We are running iSER directly on the host. KVM is compiled in but there 
aren't any VMs on our iSER test server. It is a diskless SuperMicro 
server with NFS root. On productive servers we have a live-image and 
KVM uses the iSER driven block devices for storage. This is the IB HCA 
(mlx4): Mellanox MT26428 [ConnectX IB QDR, PCIe 2.0 5GT/s] We've 
updated the firmware lately on all servers. How can I find out the 
firmware version? With tvflash or mstflint? 


If you have built the kernel IB user-space support (uverbs) and the IB 
libs, run ibv_devinfo; if not, just cat 
/sys/class/infiniband/mlx4_0/* and send the output. To be clear, iser 
does work for you on the productive servers but not on this server?
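
For readers who prefer to script the sysfs lookup, here is a minimal C sketch that reads one attribute file. The helper name read_attr is hypothetical; the fw_ver attribute under /sys/class/infiniband/mlx4_0/ is the one that shows up in the output posted later in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a sysfs-style attribute
 * file into buf. Returns 0 on success, -1 if the file cannot be read. */
int read_attr(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL && len > 0);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

Calling read_attr("/sys/class/infiniband/mlx4_0/fw_ver", buf, sizeof buf) would then leave the firmware version string in buf, assuming the mlx4 driver is loaded.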


The storage has the same IB HCA, and they are connected via a switch. 
I'll ask someone of the SysOps which one it is and if they have the 
latest firmware on it. Perhaps this could be the problem.


As it's a local protection error on TX, I don't see how this could relate 
to the target node.



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Sebastian Riemer
On 12/01/12 10:29, Or Gerlitz wrote:
 If you have built the kernel IB user-space support (uverbs) and the
 IB libs, run ibv_devinfo; if not, just cat
 /sys/class/infiniband/mlx4_0/* and send the output. To be clear, iser
 does work for you on the productive servers but not on this server?

Yes, we've got consistent OFED-1.5.4 user-space. ibv_devinfo reports a
mismatch between the kernel and the userspace libraries - kernel does
not support XRC.. ibverbs-driver-mlx4 is at version
1.0.1-1.20.g6771d22 and libibverbs is at version 1.1.4-1.24.gb89d4d7.

But O.K., the other method shows firmware version 2.9.1000.

iSER only works on productive servers, because we use the OFA kernel
modules from OFED for them at the moment (with 3.0 ported *iscsi*
drivers). But there the IPoIB traffic is too slow for us.
We connect customer VMs with IPv6 between different servers via IB.

And yes, we could also test kernel 3.2 on our iSER test server.

Regards,
Sebastian




ibv_req_notify_cq and multithreading

2012-01-12 Thread Flavio Baronti
I'm trying to have N threads reading from the same completion channel, bound to M completion queues. I would like to 
have N  M, and to ensure that only a single thread at a time can call ibv_poll_cq() on a given queue, to process the 
events in the same order they were put in the queue.


I can't understand how to properly achieve this, since:
1- If I call ibv_req_notify_cq() before ibv_poll_cq(), I might end up with two 
threads polling the same queue.
2- If I call ibv_req_notify_cq() after ibv_poll_cq(), I could end up with events in the cq not being notified in the 
channel (I read this on the IBTA 11.4.2.2, and I *think* I actually experienced this under load).


I can use option 1 with an additional lock before ibv_req_notify_cq(), but I would like to know if there is a simpler 
way which I can't see.


Thanks
Flavio


endian question about struct srp_direct_buf

2012-01-12 Thread Dan Carpenter
Sparse complains because len in struct srp_direct_buf is declared as
big endian but it's used throughout as CPU endian.  struct
srp_indirect_buf has the same thing.  It's declared one way but used the
other way.

$ grep -w len drivers/scsi -R | grep -w md
drivers/scsi/ibmvscsi/ibmvfc.c: md[i].len = sg_dma_len(sg);
drivers/scsi/ibmvscsi/ibmvstgt.c:   mlen = min(rest, md[i].len);
drivers/scsi/libsrp.c:  md->len, scsi_sg_count(sc));
drivers/scsi/libsrp.c:  len = min(scsi_bufflen(sc), md->len);
drivers/scsi/libsrp.c:  len = md->len;
drivers/scsi/libsrp.c:  err = rdma_io(sc, sg, nsg, md, 1, dir, len);
drivers/scsi/libsrp.c:  md = dma_alloc_coherent(iue->target->dev, id->table_desc.len,
drivers/scsi/libsrp.c:  sg_init_one(&dummy, md, id->table_desc.len);
drivers/scsi/libsrp.c:  err = rdma_io(sc, sg, nsg, md, nmd, dir, len);
drivers/scsi/libsrp.c:  dma_free_coherent(iue->target->dev, id->table_desc.len, md, token);
drivers/scsi/libsrp.c:  len = md->len;

Probably we should just change the declaration to u32?

regards,
dan carpenter


Re: endian question about struct srp_direct_buf

2012-01-12 Thread Bart Van Assche
On Thu, Jan 12, 2012 at 12:41 PM, Dan Carpenter
dan.carpen...@oracle.com wrote:
 Sparse complains because len in struct srp_direct_buf is declared as
 big endian but it's used throughout as CPU endian.  struct
 srp_indirect_buf has the same thing.  It's declared one way but used the
 other way.

 $ grep -w len drivers/scsi -R | grep -w md
 drivers/scsi/ibmvscsi/ibmvfc.c:         md[i].len = sg_dma_len(sg);
 drivers/scsi/ibmvscsi/ibmvstgt.c:               mlen = min(rest, md[i].len);
 drivers/scsi/libsrp.c:                  md->len, scsi_sg_count(sc));
 drivers/scsi/libsrp.c:          len = min(scsi_bufflen(sc), md->len);
 drivers/scsi/libsrp.c:          len = md->len;
 drivers/scsi/libsrp.c:  err = rdma_io(sc, sg, nsg, md, 1, dir, len);
 drivers/scsi/libsrp.c:          md = dma_alloc_coherent(iue->target->dev, id->table_desc.len,
 drivers/scsi/libsrp.c:          sg_init_one(&dummy, md, id->table_desc.len);
 drivers/scsi/libsrp.c:  err = rdma_io(sc, sg, nsg, md, nmd, dir, len);
 drivers/scsi/libsrp.c:          dma_free_coherent(iue->target->dev, id->table_desc.len, md, token);
 drivers/scsi/libsrp.c:          len = md->len;

 Probably we should just change the declaration to u32?

(resending as plain text)

No. The SRP spec says that that field is big endian and the ib_srp
driver uses that field as a big endian field. The output above (libsrp
+ ibmvstgt) is code that is used by the ibmvstgt driver only, and the
reason that driver works fine without endianness conversion is because
it is only used on PowerPC systems.

Bart.
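
To illustrate why skipping the conversion only goes unnoticed on big-endian hosts, here is a small userspace sketch. The function names are made up for the example; in the kernel the conversion would normally be be32_to_cpu() on a __be32 field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit big-endian wire field into host byte order,
 * independent of the CPU's own endianness. */
uint32_t be32_decode(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Load the same four bytes with no conversion, which is what code
 * that treats the field as CPU endian effectively does. */
uint32_t raw_load(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

For the wire bytes 00 00 02 00, be32_decode() yields 512 on every host, while raw_load() yields 512 only on a big-endian CPU such as PowerPC and 0x00020000 on a little-endian one.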


[PATCH] opensm: fixed segfault in osm_destroy

2012-01-12 Thread Alex Netes
Fixed segfault in osm_destroy() when hop_weights_file,
port_search_ordering_file or io_guid_file are configured.

The segfault was introduced by commit d71a924736707400bed47a3c69395cf864c970bb.

Signed-off-by: Alex Netes ale...@mellanox.com
---
 opensm/main.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/opensm/main.c b/opensm/main.c
index 3edc52f..c75d220 100644
--- a/opensm/main.c
+++ b/opensm/main.c
@@ -724,13 +724,13 @@ int main(int argc, char *argv[])
 			break;
 
 		case 'w':
-			opt.hop_weights_file = optarg;
+			SET_STR_OPT(opt.hop_weights_file, optarg);
 			printf(" Hop Weights File = %s\n",
 			       opt.hop_weights_file);
 			break;
 
 		case 'O':
-			opt.port_search_ordering_file = optarg;
+			SET_STR_OPT(opt.port_search_ordering_file, optarg);
 			printf(" Port Search Ordering/Dimension Ports File = %s\n",
 			       opt.port_search_ordering_file);
 			break;
@@ -959,7 +959,7 @@ int main(int argc, char *argv[])
 			break;
 
 		case 'G':
-			opt.io_guid_file = optarg;
+			SET_STR_OPT(opt.io_guid_file, optarg);
 			printf(" I/O Node Guid File: %s\n", opt.io_guid_file);
 			break;
 		case 11:
-- 
1.7.1
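
As background for the change above: optarg points into argv, so an option field that stores it directly must never be passed to free(). Presumably the teardown path now frees these strings, which would explain the crash. The sketch below shows the general "own a heap copy" pattern; set_str_opt is a hypothetical helper and is not claimed to match the actual SET_STR_OPT macro in opensm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store a private heap copy of val in *opt and
 * release any previous copy, so that *opt is always safe to free(). */
void set_str_opt(char **opt, const char *val)
{
    char *copy = NULL;

    assert(opt != NULL);
    if (val) {
        size_t n = strlen(val) + 1;

        copy = malloc(n);
        if (copy)
            memcpy(copy, val, n);
    }
    free(*opt);        /* free(NULL) is a no-op */
    *opt = copy;
}
```

The copy is made before the old value is freed, so the helper also behaves when val aliases the current *opt.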

-- 

-- Alex


Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Sebastian Riemer
On 12/01/12 11:16, Sebastian Riemer wrote:
 On 12/01/12 10:29, Or Gerlitz wrote:
   
 If you have built the kernel IB user-space support (uverbs) and the
 IB libs, run ibv_devinfo; if not, just cat
 /sys/class/infiniband/mlx4_0/* and send the output. To be clear, iser
 does work for you on the productive servers but not on this server?
 
 Yes, we've got consistent OFED-1.5.4 user-space. ibv_devinfo reports a
 mismatch between the kernel and the userspace libraries - kernel does
 not support XRC.. ibverbs-driver-mlx4 is at version
 1.0.1-1.20.g6771d22 and libibverbs is at version 1.1.4-1.24.gb89d4d7.

 But O.K., the other method shows firmware version 2.9.1000.

   

I've found out that we have two single port MHQH19B-XTR InfiniBand HCAs.

lspci output:
03:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0
5GT/s - IB QDR / 10GigE] (rev b0)
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0
5GT/s - IB QDR / 10GigE] (rev b0)

The first one is ib1. And the second is ib0.
/sys/devices/pci0000:00/0000:00:0c.0/0000:03:00.0/net/ib1
/sys/devices/pci0000:00/0000:00:0b.0/0000:04:00.0/net/ib0

The iSER traffic is on ib1 (the HCA which reported the error) and ib0 is
for IPoIB traffic. I don't know if the mlx4 driver has a problem with
that hardware config.

Here is the requested data:
mlx4_0:
board_id   MT_0D90110009
fw_ver 2.9.1000
hca_type   MT26428
hw_rev b0
node_desc  pserver214 HCA-1 (mlx4_0 - MT26428)
node_guid  0002:c903:000f:5f76
node_type  1: CA
sys_image_guid 0002:c903:000f:5f79
uevent NAME=mlx4_0

mlx4_1:
board_id   MT_0D90110009
fw_ver 2.9.1000
hca_type   MT26428
hw_rev b0
node_desc  pserver214 HCA-2 (mlx4_1 - MT26428)
node_guid  0002:c903:000f:5f26
node_type  1: CA
sys_image_guid 0002:c903:000f:5f29
uevent NAME=mlx4_1

Both are connected to the storage but in different subnets and without
multipathing.

How do I find out if ib1 is on mlx4_1 or mlx4_0?

Cheers,
Sebastian


RE: ibv_req_notify_cq and multithreading

2012-01-12 Thread Hefty, Sean
 I'm trying to have N threads reading from the same completion channel, bounded
 to M completion queues. I would like to
 have N  M, and to ensure that only a single thread at time can call
 ibv_poll_cq() on a given queue, to process the
 events in the same order they were put in the queue.
 
 I can't understand how to properly achieve this, since:
 1- If I call ibv_req_notify_cq() before ibv_poll_cq(), I might end up with two
 threads polling the same queue.
 2- If I call ibv_req_notify_cq() after ibv_poll_cq(), I could end up with
 events in the cq not being notified in the
 channel (I read this on the IBTA 11.4.2.2, and I *think* I actually
 experienced this under load).
 
 I can use option 1 with an additional lock before ibv_req_notify_cq(), but I
 would like to know if there is a simpler
 way which I can't see.

I can't think of a simpler way.  You just don't have any idea which CQ will be 
returned from the completion channel.  Would it work for your traffic pattern to 
create N completion channels and distribute the CQs among them?
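
For reference, the "option 1 plus a per-CQ lock" approach from the question can be modelled as below. This is only a toy model that does not link against libibverbs: struct model_cq, model_req_notify(), model_poll() and model_push() are stand-ins invented for the sketch. In real code the event would come from ibv_get_cq_event() and be acknowledged with ibv_ack_cq_events(), and the two stand-ins would be ibv_req_notify_cq() and ibv_poll_cq().

```c
#include <assert.h>
#include <pthread.h>

#define MODEL_CQ_DEPTH 16

/* Toy stand-in for a completion queue; not a libibverbs type. */
struct model_cq {
    pthread_mutex_t lock;   /* serializes re-arm + drain for this CQ */
    int armed;              /* set when a notification was requested */
    int entries[MODEL_CQ_DEPTH];
    int head, tail;         /* FIFO: pop at head, push at tail */
};

/* Stand-in for ibv_req_notify_cq(). */
static void model_req_notify(struct model_cq *cq)
{
    cq->armed = 1;
}

/* Stand-in for ibv_poll_cq() with one entry; returns 1 if one was popped. */
static int model_poll(struct model_cq *cq, int *out)
{
    if (cq->head == cq->tail)
        return 0;
    *out = cq->entries[cq->head++ % MODEL_CQ_DEPTH];
    return 1;
}

/* Stand-in for the HCA adding a completion (single-threaded demo only).
 * Returns 0 on success, -1 if the model queue is full. */
int model_push(struct model_cq *cq, int v)
{
    if (cq->tail - cq->head >= MODEL_CQ_DEPTH)
        return -1;
    cq->entries[cq->tail++ % MODEL_CQ_DEPTH] = v;
    return 0;
}

/* Called by whichever thread received an event for this CQ. The lock
 * keeps a second thread from polling the same CQ concurrently; re-arming
 * before draining means a completion that arrives after the last poll
 * still raises a new event. max should be at least the CQ depth. */
int handle_cq_event(struct model_cq *cq, int *out, int max)
{
    int n = 0;

    assert(cq != NULL && out != NULL && max >= 0);
    pthread_mutex_lock(&cq->lock);
    model_req_notify(cq);
    while (n < max && model_poll(cq, &out[n]))
        n++;
    pthread_mutex_unlock(&cq->lock);
    return n;
}
```

A thread that loses the race simply blocks on the lock, re-arms, and usually finds the queue already empty, which is harmless. Splitting the CQs across N channels, as suggested above, avoids the contention altogether.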


Re: Send with immediate data completion

2012-01-12 Thread Atchley, Scott
On Jan 11, 2012, at 5:22 PM, Hefty, Sean wrote:

 I'm still waiting on feedback from the IBTA, but they are looking into the
 matter.
 
 The intent is for immediate data only to be provided on receive work 
 completions.  The IBTA will clarify the spec on this.  I'll submit patches 
 that remove setting the wc flag, which may help avoid this confusion some.

Sean,

Thanks for looking into this.

Scott


RE: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Mike Marciniszyn
 Should whatever this issue is be a general PCI fixup? Like broken MSI,
 etc.

Can you point me to some details on this?

 Might be nice to include what 0x51 tunes in the commit to aid other
 people with the broken chipset :)

 Isn't it necessary to check the PCI vendor as well as the devid?

Will do both of these in a V2.

Mike

This message and any attached documents contain information from QLogic 
Corporation or its wholly-owned subsidiaries that may be confidential. If you 
are not the intended recipient, you may not read, copy, distribute, or use this 
information. If you have received this transmission in error, please notify the 
sender immediately by reply e-mail and then delete this message.



Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Or Gerlitz

On 1/12/2012 5:18 PM, Sebastian Riemer wrote:

How do I find out if ib1 is on mlx4_1 or mlx4_0


Run ip addr show and compare the link-layer address of each ibX interface with 
/sys/class/infiniband/mlx4_*/ports/1/gids/0


You didn't send the kernel logs from the failure, taken after enabling the iser 
(debug_level=2) and libiscsi (debug_libiscsi_session=1 
debug_libiscsi_conn=1) debug prints.
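
The comparison can also be automated. A 20-byte IPoIB link-layer address consists of 4 bytes of flags and QPN followed by the 16-byte port GID, so the last 16 bytes shown by ip addr show should equal the GID in sysfs. Below is a minimal sketch of that check; the function names and the example addresses in the usage note are made up.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copy only the hex digits of s, lowercased, into out (NUL-terminated,
 * at most cap - 1 digits are stored). Returns the number stored. */
static size_t hex_digits(const char *s, char *out, size_t cap)
{
    size_t n = 0;

    for (; *s; s++)
        if (isxdigit((unsigned char)*s) && n + 1 < cap)
            out[n++] = (char)tolower((unsigned char)*s);
    out[n] = '\0';
    return n;
}

/* Returns 1 if the 20-byte IPoIB hardware address ends in the given
 * 16-byte GID, 0 otherwise (including malformed input). */
int ipoib_addr_matches_gid(const char *hwaddr, const char *gid)
{
    char a[64], g[64];

    assert(hwaddr != NULL && gid != NULL);
    if (hex_digits(hwaddr, a, sizeof a) != 40)
        return 0;
    if (hex_digits(gid, g, sizeof g) != 32)
        return 0;
    return strcmp(a + 8, g) == 0;   /* skip the 4-byte flags/QPN prefix */
}
```

For example, a hardware address of 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:00:00:01 matches a GID of fe80:0000:0000:0000:0002:c903:0000:0001, so that interface would sit on the device whose ports/1/gids/0 holds this value.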




Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Or Gerlitz

On 1/11/2012 10:09 PM, Or Gerlitz wrote:
[...] I'll give 3.0.15 a try tomorrow, however, the error you're 
getting iser_drain_tx_cq:tx id 88402391f898 status 4 vend_err 57 
means that iser got a local protection error (=4) on the first buffer we 
used with IB (the connection handshake buffers belong to the IB CM, 
not to the ULP), which is the login request. Sounds like something is 
broken, maybe DMA-mapping-wise; for this reason I think it's likely that 
the problem might not hit me on my testbed [...]


Okay, I've tried 3.0.15 with your .config slightly changed for my local 
SATA disk (I will send you a copy of my .config), and iser works for me... 
so you need to try a bit harder and send me your logs... I'm using 
iscsi-initiator-utils-6.2.0.872-21.el6.x86_64


Or.

My board ID is MT_0D81120009, which is a bit different, but the HCA is a 
ConnectX b0 like yours. I'm using non-GA firmware, but I find it hard to 
believe this is the reason for your failure.



# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Firmware version: 2.9.4270
        Hardware version: b0
        Node GUID: 0x0002c9030010c6e8
        System image GUID: 0x0002c9030010c6eb
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 10
                LMC: 0
                SM lid: 6
                Capability mask: 0x02510868
                Port GUID: 0x0002c9030010c6e9




[  134.869036] iscsi: registered transport (tcp)
[  134.987553] iscsi: registered transport (iser)
[  136.075198] iser: iser_connect:connecting to: 192.168.20.19, port 0xbc0c
[  136.100162] iser: iser_cma_handler:event 0 status 0 conn 88020eb7ba80 id 8802252aec00
[  136.58] iser: iser_cma_handler:event 2 status 0 conn 88020eb7ba80 id 8802252aec00
[  136.130923] iser: iser_create_ib_conn_res:setting conn 88020eb7ba80 cma_id 8802252aec00: fmr_pool 880224c17880 qp 8802154a4600
[  136.150646] iser: iser_cma_handler:event 9 status 0 conn 88020eb7ba80 id 8802252aec00
[  136.332263] iser: iscsi_iser_ep_poll:ib conn 88020eb7ba80 rc = 1
[  136.338710] scsi3 : iSCSI Initiator over iSER, v.0.1
[  136.346240] iser: iscsi_iser_conn_bind:binding iscsi/iser conn 880225294ab8 880225294cc8 to ib_conn 88020eb7ba80
[  136.609277] scsi 3:0:0:0: RAID  Mellanox vsa  1PQ: 0 ANSI: 5
[  136.617604] scsi 3:0:0:0: Attached scsi generic sg3 type 12
[  136.623454] scsi 3:0:0:1: Direct-Access Mellanox VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
[  136.631820] sd 3:0:0:1: Attached scsi generic sg4 type 0
[  136.631848] sd 3:0:0:1: [sdc] 2147483648 512-byte logical blocks: (1.09 TB/1.00 TiB)
[  136.645040] sd 3:0:0:1: [sdc] Write Protect is off
[  136.649880] sd 3:0:0:1: [sdc] Mode Sense: 49 00 00 08
[  136.649975] sd 3:0:0:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  136.659680]  sdc: unknown partition table
[  136.664071] sd 3:0:0:1: [sdc] Attached SCSI disk
[  211.020096] sd 3:0:0:1: [sdc] Synchronizing SCSI cache
[  211.526075] iser: iscsi_iser_ep_disconnect:ib conn 88020eb7ba80 state 2
[  211.534048] iser: iser_cma_handler:event 10 status 0 conn 88020eb7ba80 id 8802252aec00
[  211.542750] iser: iser_free_ib_conn_res:freeing conn 88020eb7ba80 cma_id 8802252aec00 fmr pool 880224c17880 qp 8802154a4600
[  211.556053] iser: iser_device_try_release:device 880225b30480 refcount 0




Re: Upstream support for multicast IBoE

2012-01-12 Thread Shawn Bohrer
On Wed, Jan 11, 2012 at 09:49:25PM +0200, Or Gerlitz wrote:
 Shawn Bohrer sboh...@rgmadvisors.com wrote:
  Is there any estimate on when we might see something like this upstream?
 
 Could you elaborate a little on your use case for multicast IBoE
 traffic? E.g. what the setup looks like and how your Ethernet
 switches route that traffic.

I'm not sure exactly what you are asking here.  We do what I would
imagine is a typical one-to-many UD multicast.  We code directly to
libibverbs and librdmacm, and everything is sent IBoE.

The hosts are in a spine/leaf configuration and all traffic is sent
over vlans.  My understanding is that the multicast IBoE traffic is
simply sent as broadcast and that the adapters do the necessary
filtering.

Really from my point of view OFED already does what we want, but I
would really like to see this supported upstream.

Thanks,
Shawn


---
This email, along with any attachments, is confidential. If you 
believe you received this message in error, please contact the 
sender immediately and delete all copies of the message.  
Thank you.



Re: [PATCH] opensm: fixed segfault in osm_destroy

2012-01-12 Thread Dale Purdy

On Thu, 12 Jan 2012, Alex Netes wrote:


Fixed segfault in osm_destroy() when hop_weights_file,
port_search_ordering_file or io_guid_file are configured.

The segfault introduced by d71a924736707400bed47a3c69395cf864c970bb.

Signed-off-by: Alex Netes ale...@mellanox.com
---
opensm/main.c |6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)



The fix looks good, and works too!  Thanks.

Dale


RE: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Mike Marciniszyn

 Does this work on systems where the broken chipset might
 not be the immediate parent of the qib device (ie there are
 some PCIe switches in between)?


The code figures this out at the top of the routine and returns, changing nothing.






Re: [PATCH] opensm: Get correct guid in case of multiple ports

2012-01-12 Thread Alex Netes
Hi Goldwyn,

On 10:02 Wed 11 Jan , Goldwyn Rodrigues wrote:
 
 Hi Alex,
 
 Let me start with how we encountered the problem:
 This problem came up when our customer was using a 2 port card with only
 one of the port active. opensm could not get the guid of the port
 that was active in daemon mode.

I guess it's because your customer runs opensm with -g 0 -B on the command line.

 
  On 13:36 Wed 05 Oct , Goldwyn Rodrigues wrote:
   
   In case of multiple ports and running in daemon mode, the active port is 
   not selected because opt.guid is set to INVALID_GUID in main() but the 
   check in get_port_guid is done against zero: 
 if (port_guid == 0) {
  
  opt.guid is set to 0 by default.
  opt.guid is set to INVALID_GUID if a user used -g WRONG_GUID command line
  option when executing the SM. 
 
 What happens when -g 0 -B is specified? Check the getopt code. It sets guid
 to INVALID_GUID. Consider /etc/sysconfig/opensm  as well.

You are correct. Setting argument -g 0 will set port_guid to INVALID_GUID.
From OpenSM man page:
-g, --guid GUID in hex
This option specifies the local port GUID value with which OpenSM
should bind.  OpenSM may be bound to 1 port at a time.  If GUID given
is 0, OpenSM displays a list of possible port GUIDs and waits for user
input.  Without -g, OpenSM tries to use the default port.

So I guess the behavior of running OpenSM with -g 0 -B is undefined. I think
it's better to exit than execute OpenSM with wrong parameter.

Moreover, there is no problem when you set guid 0 in the opensm.conf and run
opensm as a daemon (actually this is the default).

 
 What happens when you provide -g WRONG_GUID -B?
 I think in this case, -B should take priority and set with the first
 active port available.

I think that in that case the user intended to bind OpenSM to a specific port, and
it could be a major issue if OpenSM automatically bound to a different
port.

 
 
  In that case, when SM runs not in daemon mode,
  SM prompts the user to choose available port GUID out of available range.
  In case when SM runs in daemon mode, it can't prompt the user so it just 
  exits.
  
   
   On second thoughts, passing port_guid is worthless because this function 
   is called only when no guid is supplied at the command prompt. So, 
   removed the port_guid parameter from the function altogether.
   
   If not in daemon mode, it would show the list of ports as intended.
   
   Also added error message if no ports are found.
   
   Signed-off-by: Goldwyn Rodrigues rgold...@suse.de
   
   diff --git a/opensm/main.c b/opensm/main.c
   index 51c8291..a236859 100644
   --- a/opensm/main.c
   +++ b/opensm/main.c
   @@ -403,7 +403,7 @@ static void show_usage(void)
 exit(2);
}

   -static ib_net64_t get_port_guid(IN osm_opensm_t * p_osm, uint64_t 
   port_guid)
   +static ib_net64_t get_port_guid(IN osm_opensm_t *p_osm)
{
 ib_port_attr_t attr_array[MAX_LOCAL_IBPORTS];
 uint32_t num_ports = MAX_LOCAL_IBPORTS;
   @@ -436,21 +436,19 @@ static ib_net64_t get_port_guid(IN osm_opensm_t * 
   p_osm, uint64_t port_guid)
cl_hton64(attr_array[0].port_guid));
 return attr_array[0].port_guid;
 }
   - /* If port_guid is 0 - use the first connected port */
   - if (port_guid == 0) {
   + /* If in daemon mode autoselect first available port */
   + if (p_osm->subn.opt.daemon) {
 for (i = 0; i < num_ports; i++)
 if (attr_array[i].link_state > IB_LINK_DOWN)
 break;
   + /* No port found which is available */
 if (i == num_ports)
   - i = 0;
   + return 0;
 printf("Using default GUID 0x%" PRIx64 "\n",
cl_hton64(attr_array[i].port_guid));
 return attr_array[i].port_guid;
 }

   - if (p_osm->subn.opt.daemon)
   - return 0;
   -
 /* More than one possible port - list all ports and let the user
  * to choose. */
 while (1) {
   @@ -1106,10 +1104,12 @@ int main(int argc, char *argv[])
then get a port GUID value with which to bind.
  */
 if (opt.guid == 0 || cl_hton64(opt.guid) == CL_HTON64(INVALID_GUID))
   - opt.guid = get_port_guid(&osm, opt.guid);
   + opt.guid = get_port_guid(&osm);

   - if (opt.guid == 0)
   + if (opt.guid == 0) {
   + printf("\nError: No available ports\n");
 goto Exit;
   + }

 status = osm_opensm_bind(&osm, opt.guid);
 if (status != IB_SUCCESS) {
   
   -- 
   Goldwyn
  
  -- 
  
  -- Alex
 
 -- 
 Goldwyn

[PATCH V1 1/6] IB: use central enum for speed instead of hard-coded values

2012-01-12 Thread Or Gerlitz
The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file. Add that enum, and
use it all over the code. Note that the IB speed/width notation is
also used by iWARP and IBoE hw drivers who apply the convention of
rate = speed X width, to advertize their port link rate.
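
As an illustration of the rate = speed X width convention, here is a small userspace sketch. The enum values mirror the constants this patch replaces (1, 2, 4, 8, 16, 32); the per-lane rates of 2.5, 5, 10, 10, 14 and 25 Gb/sec are the nominal figures, and the helper names are made up for the example. Rates are kept in tenths of Gb/sec so that SDR needs no floating point.

```c
#include <assert.h>

/* Same numeric values as the hard-coded constants being replaced. */
enum ib_port_speed {
    IB_SPEED_SDR   = 1,
    IB_SPEED_DDR   = 2,
    IB_SPEED_QDR   = 4,
    IB_SPEED_FDR10 = 8,
    IB_SPEED_FDR   = 16,
    IB_SPEED_EDR   = 32
};

/* Nominal per-lane rate in tenths of Gb/sec, or -1 if unknown. */
int lane_rate_tenths(enum ib_port_speed speed)
{
    switch (speed) {
    case IB_SPEED_SDR:   return 25;
    case IB_SPEED_DDR:   return 50;
    case IB_SPEED_QDR:   return 100;
    case IB_SPEED_FDR10: return 100;
    case IB_SPEED_FDR:   return 140;
    case IB_SPEED_EDR:   return 250;
    }
    return -1;
}

/* Port rate = per-lane rate times the lane count (1, 4, 8 or 12). */
int port_rate_tenths(enum ib_port_speed speed, int width)
{
    int lane = lane_rate_tenths(speed);

    assert(width > 0);
    return lane < 0 ? -1 : lane * width;
}
```

With these helpers a 4X QDR port comes out at 400 tenths, i.e. 40 Gb/sec, and an iWARP or IBoE driver advertising a 10 Gb/sec link could report 4X SDR or 1X QDR.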

Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---

changes from v0:
 fixed typo in the enum type name (was ib_port_seed instead of ib_port_speed)

 drivers/infiniband/core/sysfs.c  |   15 +++-
 drivers/infiniband/core/uverbs_cmd.c |3 ++
 drivers/infiniband/core/verbs.c  |1 +
 drivers/infiniband/hw/amso1100/c2_provider.c |2 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |2 +-
 drivers/infiniband/hw/cxgb4/provider.c   |2 +-
 drivers/infiniband/hw/ehca/ehca_hca.c|2 +-
 drivers/infiniband/hw/mlx4/main.c|   10 
 drivers/infiniband/hw/mlx4/qp.c  |   31 +
 drivers/infiniband/hw/nes/nes_verbs.c|2 +-
 include/rdma/ib_verbs.h  |   11 -
 11 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index c61bca3..9ce70ca 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -189,21 +189,24 @@ static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused,
 	rate = (25 * attr.active_speed) / 10;
 
 	switch (attr.active_speed) {
-	case 2:
+	case IB_SPEED_SDR:
+		speed = " SDR";
+		break;
+	case IB_SPEED_DDR:
 		speed = " DDR";
 		break;
-	case 4:
+	case IB_SPEED_QDR:
 		speed = " QDR";
 		break;
-	case 8:
+	case IB_SPEED_FDR10:
 		speed = " FDR10";
 		rate = 10;
 		break;
-	case 16:
+	case IB_SPEED_FDR:
 		speed = " FDR";
 		rate = 14;
 		break;
-	case 32:
+	case IB_SPEED_EDR:
 		speed = " EDR";
 		rate = 25;
 		break;
@@ -214,7 +217,7 @@ static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused,
 		return -EINVAL;
 
 	return sprintf(buf, "%d%s Gb/sec (%dX%s)\n",
-		       rate, (attr.active_speed == 1) ? ".5" : "",
+		       rate, (attr.active_speed == IB_SPEED_SDR) ? ".5" : "",
 		       ib_width_enum_to_int(attr.active_width), speed);
 }

diff --git a/drivers/infiniband/core/uverbs_cmd.c 
b/drivers/infiniband/core/uverbs_cmd.c
index b930da4..8722e96 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1399,6 +1399,9 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
+	if (cmd.qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
+		return -EPERM;
+
 	INIT_UDATA(&udata, buf + sizeof cmd,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 602b1bd..f73e15b 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -479,6 +479,7 @@ static const struct {
 	[IB_QPT_UD]  = (IB_QP_PKEY_INDEX		|
 			IB_QP_PORT			|
 			IB_QP_QKEY),
+	[IB_QPT_RAW_PACKET] = IB_QP_PORT,
 	[IB_QPT_UC]  = (IB_QP_PKEY_INDEX		|
 			IB_QP_PORT			|
 			IB_QP_ACCESS_FLAGS),
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c 
b/drivers/infiniband/hw/amso1100/c2_provider.c
index 12f923d..07eb3a8 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -94,7 +94,7 @@ static int c2_query_port(struct ib_device *ibdev,
 	props->pkey_tbl_len = 1;
 	props->qkey_viol_cntr = 0;
 	props->active_width = 1;
-	props->active_speed = 1;
+	props->active_speed = IB_SPEED_SDR;

return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 37c224f..0bdf09a 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1227,7 +1227,7 @@ static int iwch_query_port(struct ib_device *ibdev,
 	props->gid_tbl_len = 1;
 	props->pkey_tbl_len = 1;
 	props->active_width = 2;
-	props->active_speed = 2;
+	props->active_speed = IB_SPEED_DDR;
props-max_msg_sz = 

Re: [PATCH] opensm: Get correct guid in case of multiple ports

2012-01-12 Thread Goldwyn Rodrigues
Hi Alex,

On Thu, Jan 12, 2012 at 07:23:30PM +0200, Alex Netes wrote:
 Hi Goldwyn,
 
 On 10:02 Wed 11 Jan , Goldwyn Rodrigues wrote:
  
  Hi Alex,
  
  Let me start with how we encountered the problem:
  This problem came up when our customer was using a 2 port card with only
  one of the port active. opensm could not get the guid of the port
  that was active in daemon mode.
 
 I guess it's because your customer runs opensm with -g 0 -B on the command line.
 
  
   On 13:36 Wed 05 Oct , Goldwyn Rodrigues wrote:

In case of multiple ports and running in daemon mode, the active port 
is not selected because opt.guid is set to INVALID_GUID in main() but 
the check in get_port_guid is done against zero: 
if (port_guid == 0) {
   
   opt.guid is set to 0 by default.
   opt.guid is set to INVALID_GUID if a user used -g WRONG_GUID command 
   line
   option when executing the SM. 
  
  What happens when -g 0 -B is specified? Check the getopt code. It sets 
  guid
  to INVALID_GUID. Consider /etc/sysconfig/opensm  as well.
 
 You are correct. Setting argument -g 0 will set port_guid to INVALID_GUID.
 From OpenSM man page:
 -g, --guid GUID in hex
   This option specifies the local port GUID value with which OpenSM
   should bind.  OpenSM may be bound to 1 port at a time.  If GUID given
   is 0, OpenSM displays a list of possible port GUIDs and waits for user
   input.  Without -g, OpenSM tries to use the default port.
 
 So I guess the behavior of running OpenSM with -g 0 -B is undefined. I think
 it's better to exit than execute OpenSM with wrong parameter.

Think from a user's POV instead of a programmer's POV. A user will be
confused when he attempts to start the daemon and the daemon just exits.
Could opensm at least complain, saying that the options
are incompatible or that it does not want to use the available guids?

 
 Moreover, there is no problem when you set guid 0 in the opensm.conf and run
 opensm as a daemon (actually this is the default).

Have you tried it with multi-port? For 1 port, get_port_guid() selects the
default one because num_ports is 1 and the daemon will not exit, even if
you supply  -g 0 -B. 

BTW, We are using SLES 11.

  
  What happens when you provide -g WRONG_GUID -B?
  I think in this case, -B should take priority and set with the first
  active port available.
 
 I think that in that case the user intended to bind OpenSM to a specific port, and
 it could be a major issue if OpenSM automatically bound to a different
 port.
 
  
  
   In that case, when SM runs not in daemon mode,
   SM prompts the user to choose a port GUID out of the available range.
   When SM runs in daemon mode, it can't prompt the user, so it just exits.
   

On second thoughts, passing port_guid is worthless because this 
function is called only when no guid is supplied at the command prompt. 
So, removed the port_guid parameter from the function altogether.

If not in daemon mode, it would show the list of ports as intended.

Also added error message if no ports are found.
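
The selection rule described above can be sketched in isolation as follows.
This is only a minimal illustration under stated assumptions, not the OpenSM
code: struct port_attr, the LINK_* values and pick_daemon_port() are
hypothetical names, and the link states are assumed to be ordered so that
anything above LINK_DOWN means the port has a link.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical link states, assumed ordered so that any value above
 * LINK_DOWN means the port has a link. */
enum link_state { LINK_NO_CHANGE = 0, LINK_DOWN = 1, LINK_INIT = 2,
		  LINK_ARMED = 3, LINK_ACTIVE = 4 };

/* Hypothetical stand-in for the per-port attributes. */
struct port_attr {
	uint64_t guid;
	enum link_state state;
};

/* Daemon-mode selection: return the GUID of the first port whose link is
 * up, or 0 if no port qualifies, so the caller can report an error and
 * exit instead of binding to a port without a link. */
static uint64_t pick_daemon_port(const struct port_attr *ports, size_t n)
{
	size_t i;

	assert(ports != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (ports[i].state > LINK_DOWN)
			return ports[i].guid;
	return 0;
}
```

A return value of 0 is what lets the caller print the "no ports found" error
rather than silently falling back to port 0.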

Signed-off-by: Goldwyn Rodrigues rgold...@suse.de

diff --git a/opensm/main.c b/opensm/main.c
index 51c8291..a236859 100644
--- a/opensm/main.c
+++ b/opensm/main.c
@@ -403,7 +403,7 @@ static void show_usage(void)
 	exit(2);
 }
 
-static ib_net64_t get_port_guid(IN osm_opensm_t * p_osm, uint64_t port_guid)
+static ib_net64_t get_port_guid(IN osm_opensm_t *p_osm)
 {
 	ib_port_attr_t attr_array[MAX_LOCAL_IBPORTS];
 	uint32_t num_ports = MAX_LOCAL_IBPORTS;
@@ -436,21 +436,19 @@ static ib_net64_t get_port_guid(IN osm_opensm_t * p_osm, uint64_t port_guid)
 		       cl_hton64(attr_array[0].port_guid));
 		return attr_array[0].port_guid;
 	}
-	/* If port_guid is 0 - use the first connected port */
-	if (port_guid == 0) {
+	/* If in daemon mode autoselect first available port */
+	if (p_osm->subn.opt.daemon) {
 		for (i = 0; i < num_ports; i++)
 			if (attr_array[i].link_state > IB_LINK_DOWN)
 				break;
+		/* No port found which is available */
 		if (i == num_ports)
-			i = 0;
+			return 0;
 		printf("Using default GUID 0x%" PRIx64 "\n",
 		       cl_hton64(attr_array[i].port_guid));
 		return attr_array[i].port_guid;
 	}
 
-	if (p_osm->subn.opt.daemon)
-		return 0;
-
 	/* More than one possible port - list all ports and let the user
 	 * to choose. */
 	while (1) {
@@ -1106,10 +1104,12 @@ int main(int argc, char *argv[])
   then get a port GUID value with which to bind.
 */
   

[PATCH] RDS: Remove some unused iWARP code

2012-01-12 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

rds_iw_flush_goal() just returns a count, but it is only called in one
place and its return value is ignored there.  So delete all the dead code.

Signed-off-by: Roland Dreier rol...@purestorage.com
---
 net/rds/iw_rdma.c |   15 +--
 1 files changed, 1 insertions(+), 14 deletions(-)

diff --git a/net/rds/iw_rdma.c b/net/rds/iw_rdma.c
index 4e1de17..a817705 100644
--- a/net/rds/iw_rdma.c
+++ b/net/rds/iw_rdma.c
@@ -477,17 +477,6 @@ void rds_iw_sync_mr(void *trans_private, int direction)
 	}
 }
 
-static inline unsigned int rds_iw_flush_goal(struct rds_iw_mr_pool *pool, int free_all)
-{
-	unsigned int item_count;
-
-	item_count = atomic_read(&pool->item_count);
-	if (free_all)
-		return item_count;
-
-	return 0;
-}
-
 /*
  * Flush our pool of MRs.
  * At a minimum, all currently unused MRs are unmapped.
@@ -500,7 +489,7 @@ static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
 	LIST_HEAD(unmap_list);
 	LIST_HEAD(kill_list);
 	unsigned long flags;
-	unsigned int nfreed = 0, ncleaned = 0, unpinned = 0, free_goal;
+	unsigned int nfreed = 0, ncleaned = 0, unpinned = 0;
 	int ret = 0;
 
 	rds_iw_stats_inc(s_iw_rdma_mr_pool_flush);
@@ -514,8 +503,6 @@ static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
 	list_splice_init(&pool->clean_list, &kill_list);
 	spin_unlock_irqrestore(&pool->list_lock, flags);
 
-	free_goal = rds_iw_flush_goal(pool, free_all);
-
 	/* Batched invalidate of dirty MRs.
 	 * For FMR based MRs, the mappings on the unmap list are
 	 * actually members of an ibmr (ibmr->mapping). They either
-- 
1.7.8.3



Re: [PATCH V1 1/6] IB: use central enum for speed instead of hard-coded values

2012-01-12 Thread Roland Dreier
Seems to have the raw packet QP stuff mixed in now?


Re: [PATCH V1 1/6] IB: use central enum for speed instead of hard-coded values

2012-01-12 Thread Or Gerlitz
On Thu, Jan 12, 2012 at 9:30 PM, Roland Dreier rol...@kernel.org wrote:
 Seems to have the raw packet QP stuff mixed in now?

sorry, my bad, will fix and resend

Or.


[PATCH V2 1/6] IB: use central enum for speed instead of hard-coded values

2012-01-12 Thread Or Gerlitz
The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file. Add that enum, and
use it all over the code. Note that the IB speed/width notation is
also used by iWARP and IBoE hw drivers, which apply the convention of
rate = speed X width to advertise their port link rate.

Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---

changes from v0:
 fixed typo in the enum type name (was ib_port_seed instead of ib_port_speed)

changes from v1:
 removed raw qp code which went in by mistake
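
The "rate = speed X width" convention mentioned in the description can be
illustrated with a small standalone sketch. It is not the kernel code: the
SK_* names and both helper functions are hypothetical, the speed codes
(1, 2, 4, 8, 16, 32) and per-lane rates are taken from the rate_show() hunk
in this patch, and rates are kept in units of 0.1 Gb/sec (an assumption of
this sketch) so that SDR's 2.5 Gb/sec stays an integer.

```c
#include <assert.h>

/* Speed codes matching the values replaced by the patch (SDR = 1 ...
 * EDR = 32); the names are local to this sketch. */
enum sketch_speed { SK_SDR = 1, SK_DDR = 2, SK_QDR = 4,
		    SK_FDR10 = 8, SK_FDR = 16, SK_EDR = 32 };

/* Per-lane rate in units of 0.1 Gb/sec. */
static int lane_rate_deci(enum sketch_speed speed)
{
	switch (speed) {
	case SK_SDR:   return 25;
	case SK_DDR:   return 50;
	case SK_QDR:   return 100;
	case SK_FDR10: return 100;
	case SK_FDR:   return 140;
	case SK_EDR:   return 250;
	}
	return -1;	/* unknown speed code */
}

/* Link rate = per-lane speed x lane count, in units of 0.1 Gb/sec.
 * Returns -1 for an unknown speed code. */
static int link_rate_deci(enum sketch_speed speed, int width)
{
	int lane = lane_rate_deci(speed);

	assert(width > 0);
	return lane < 0 ? -1 : lane * width;
}
```

For example, link_rate_deci(SK_QDR, 4) yields 400, i.e. a 4X QDR port
advertises 40 Gb/sec.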

 drivers/infiniband/core/sysfs.c  |   15 +--
 drivers/infiniband/hw/amso1100/c2_provider.c |2 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |2 +-
 drivers/infiniband/hw/cxgb4/provider.c   |2 +-
 drivers/infiniband/hw/ehca/ehca_hca.c|2 +-
 drivers/infiniband/hw/mlx4/main.c|   10 +-
 drivers/infiniband/hw/nes/nes_verbs.c|2 +-
 include/rdma/ib_verbs.h  |9 +
 8 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index c61bca3..9ce70ca 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -189,21 +189,24 @@ static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused,
 	rate = (25 * attr.active_speed) / 10;
 
 	switch (attr.active_speed) {
-	case 2:
+	case IB_SPEED_SDR:
+		speed = " SDR";
+		break;
+	case IB_SPEED_DDR:
 		speed = " DDR";
 		break;
-	case 4:
+	case IB_SPEED_QDR:
 		speed = " QDR";
 		break;
-	case 8:
+	case IB_SPEED_FDR10:
 		speed = " FDR10";
 		rate = 10;
 		break;
-	case 16:
+	case IB_SPEED_FDR:
 		speed = " FDR";
 		rate = 14;
 		break;
-	case 32:
+	case IB_SPEED_EDR:
 		speed = " EDR";
 		rate = 25;
 		break;
@@ -214,7 +217,7 @@ static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused,
 		return -EINVAL;
 
 	return sprintf(buf, "%d%s Gb/sec (%dX%s)\n",
-		       rate, (attr.active_speed == 1) ? ".5" : "",
+		       rate, (attr.active_speed == IB_SPEED_SDR) ? ".5" : "",
 		       ib_width_enum_to_int(attr.active_width), speed);
 }

diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
index 12f923d..07eb3a8 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -94,7 +94,7 @@ static int c2_query_port(struct ib_device *ibdev,
 	props->pkey_tbl_len = 1;
 	props->qkey_viol_cntr = 0;
 	props->active_width = 1;
-	props->active_speed = 1;
+	props->active_speed = IB_SPEED_SDR;
 
 	return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 37c224f..0bdf09a 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1227,7 +1227,7 @@ static int iwch_query_port(struct ib_device *ibdev,
 	props->gid_tbl_len = 1;
 	props->pkey_tbl_len = 1;
 	props->active_width = 2;
-	props->active_speed = 2;
+	props->active_speed = IB_SPEED_DDR;
 	props->max_msg_sz = -1;
 
 	return 0;
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 247fe70..be1c18f 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -329,7 +329,7 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
 	props->gid_tbl_len = 1;
 	props->pkey_tbl_len = 1;
 	props->active_width = 2;
-	props->active_speed = 2;
+	props->active_speed = IB_SPEED_DDR;
 	props->max_msg_sz = -1;
 
 	return 0;
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 73edc36..9ed4d25 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -233,7 +233,7 @@ int ehca_query_port(struct ib_device *ibdev,
 		props->phys_state      = 5;
 		props->state           = rblock->state;
 		props->active_width    = IB_WIDTH_12X;
-		props->active_speed    = 0x1;
+		props->active_speed    = IB_SPEED_SDR;
 	}
 
 query_port1:
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 7b445df..6ff6bdf 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -215,16 +215,16 @@ static int ib_link_query_port(struct ib_device *ibdev, u8 port,
 
 		switch (ext_active_speed) {
 		case 1:
-			props->active_speed = 16; /* FDR */
+

Re: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Jason Gunthorpe
On Thu, Jan 12, 2012 at 08:02:52AM -0800, Mike Marciniszyn wrote:
  Should whatever this issue is be a general PCI fixup? Like broken MSI,
  etc.
 
 Can you point me to some details on this?

I can explain the broken MSI stuff, as an example. As I noted I'm not
sure what you are working around here, but if there are limits imposed
on otherwise correct values in the PCI capabilities block then I think
it is broadly applicable to handle this in core code...

There are flags in pci.h like:

PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,

Which are quirk things.. Look in drivers/pci/quirks.c to see how it is
set.

So broadly you'd make a new appropriate bus flag to control
whatever you are working around and then test and set it in quirks,
and provide core code to traverse the bus path from a device to ensure
nothing in the path sets that quirk.
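
The "traverse the bus path" part of that idea could look roughly like the
sketch below. It is a standalone illustration, not PCI core code: struct
sk_bus, SK_BUS_FLAGS_LIMIT_PAYLOAD and sk_path_has_quirk() are hypothetical
names, and a real implementation would use the kernel's own bus structure
and a new flag defined alongside PCI_BUS_FLAGS_NO_MSI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quirk bit; a real flag would be defined next to
 * PCI_BUS_FLAGS_NO_MSI and set from the quirks code. */
#define SK_BUS_FLAGS_LIMIT_PAYLOAD 0x1u

/* Hypothetical, simplified bus node: each bus points at its parent,
 * and the root bus has parent == NULL. */
struct sk_bus {
	struct sk_bus *parent;
	unsigned int flags;
};

/* Walk from the device's bus up to the root and report whether any bus
 * on the path carries the given quirk flag. */
static bool sk_path_has_quirk(const struct sk_bus *bus, unsigned int flag)
{
	assert(flag != 0);
	for (; bus != NULL; bus = bus->parent)
		if (bus->flags & flag)
			return true;
	return false;
}
```

Because the walk covers every bus up to the root, a quirky chipset is still
detected when PCIe switches sit between it and the device.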

Really depends what the problem actually is.

Jason


Re: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Roland Dreier
On Thu, Jan 12, 2012 at 9:17 AM, Mike Marciniszyn
mike.marcinis...@qlogic.com wrote:
 Does this work on systems where the broken chipset might
 not be the immediate parent of the qib device (ie there are
 some PCIe switches in between)?

 The code figures this out at the top of routine and returns, changing nothing.

OIC.

Also I see

   if (parent->vendor != 0x8086)
           return 1;

so I guess you don't need another vendor check.

Although this might be better written as PCI_VENDOR_ID_INTEL instead of 0x8086.

I guess this is OK, although as Jason said it would be much better
if the PCI core knew about these chipset errata.

 - R.


RE: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Mike Marciniszyn

if (parent->vendor != 0x8086)
 return 1;

 so I guess you don't need another vendor check.

Actually, Jason is right.  The vendor check you reference here is in 
qib_tune_pcie_coalesce() and not the routine being patched.

A bit of background here is that the issue was noted with the indicated
Harpertown root complex chip sets as follows:
- The BIOS set the root complex MaxPayLoad to 128, but the rc capabilities
  indicate 256 is possible
- To get the best performance we tried going to 256 on the rc and our card
  and noted the Poisoned TLP
- The patch is an effort to avoid having to set pcie_caps at all, as well
  as avoiding issues with the problematic chip sets
- The module parameter can still be used to experiment

We have never seen the issue with AMD or other Intel chipsets.  The problematic
device ids are not in fixup.c in lib.

I can reissue a v2 with:
- the vendor check
- use of the define when available

We probably need to do something, since the current 3.2 rc has the above risk.

Mike


This message and any attached documents contain information from QLogic 
Corporation or its wholly-owned subsidiaries that may be confidential. If you 
are not the intended recipient, you may not read, copy, distribute, or use this 
information. If you have received this transmission in error, please notify the 
sender immediately by reply e-mail and then delete this message.



Re: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Jason Gunthorpe
On Thu, Jan 12, 2012 at 02:14:12PM -0800, Mike Marciniszyn wrote:

 Actually, Jason is right.  The vendor check you reference here is in
 qib_tune_pcie_coalesce() and not the routine being patched.
 
 A bit of background here is that the issue was noted with the
 indicated Harpertown root complex chip sets as follows:
 - The BIOS set the root complex MaxPayLoad to 128, but rc capabilities 
 indicate 256 is possible
 - To get the best performance we tried going to 256 on the rc and
 our card and noted the Poisoned TLP

I don't think it is appropriate for a driver to modify the pci
configuration of the root complex.. What if other drivers also try and
modify this configuration? Chaos.

It doesn't seem to me like this has any place in the quirks thing
either. Things seem to be working properly, the MaxPayLoad of 128 is
clearly the highest the system will support correctly.

Jason


RE: [PATCH] IB/qib: detour pcie_caps for certain chip sets

2012-01-12 Thread Mike Marciniszyn
 It doesn't seem to me like this has any place in the quirks thing
 either. Things seem to be working properly, the MaxPayLoad of 128 is
 clearly the highest the system will support correctly.

 Jason

Probably the best thing to do is to unwind the module parameter default in
8d4548f2b, which would change the initial value back to 0.

That's the way the file has always been and that won't change the rc.

Mike



[PATCH] IB/qib: unwind pcie change

2012-01-12 Thread Mike Marciniszyn
Commit 8d4548f2b (IB/qib: Default some module parameters optimally)
introduced an issue with older root complexes.  They cannot handle
the pcie_caps of 0x51 (MaxReadReq 4096, MaxPayload=256).

A typical diagnostic in this situation reported by syslog contains
the text:

  [PCIe Poisoned TLP][Send DMA memory read]

Restore the module parameter default to zero, which will avoid
any changes in the root complex.
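
For reference, the relation between 0x51 and the two sizes quoted above can
be checked with a small standalone decoder. This is a sketch, not the qib
driver code: the sk_* helpers are hypothetical, the nibble layout follows
the parameter description "Payload (0..3), ReadReq (4..7)", and masking each
field to the 3-bit PCIe size code (0 = 128 bytes up to 5 = 4096 bytes) is an
assumption of this sketch. It only decodes the bit fields and does not model
how the driver treats a value of zero.

```c
#include <assert.h>

/* PCIe encodes MaxPayload and MaxReadReq sizes as a 3-bit code:
 * 0 -> 128 bytes, 1 -> 256, ... 5 -> 4096. Codes 6 and 7 are reserved,
 * for which this helper returns -1. */
static int sk_code_to_bytes(unsigned int code)
{
	assert(code <= 7);
	return code <= 5 ? 128 << code : -1;
}

/* Payload size code taken from the low nibble, per "Payload (0..3)". */
static int sk_caps_payload(unsigned int caps)
{
	return sk_code_to_bytes(caps & 0x7);
}

/* ReadReq size code taken from the next nibble, per "ReadReq (4..7)". */
static int sk_caps_readreq(unsigned int caps)
{
	return sk_code_to_bytes((caps >> 4) & 0x7);
}
```

With caps = 0x51, the payload code is 1 (256 bytes) and the read request
code is 5 (4096 bytes), matching the values in the description.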

Reviewed-by: Mark Debbage mark.debb...@qlogic.com
Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com
---
 drivers/infiniband/hw/qib/qib_pcie.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c
index 0de55c0..790646e 100644
--- a/drivers/infiniband/hw/qib/qib_pcie.c
+++ b/drivers/infiniband/hw/qib/qib_pcie.c
@@ -577,7 +577,7 @@ static int qib_tune_pcie_coalesce(struct qib_devdata *dd)
  * BIOS may not set PCIe bus-utilization parameters for best performance.
  * Check and optionally adjust them to maximize our throughput.
  */
-static int qib_pcie_caps = 0x51;
+static int qib_pcie_caps;
 module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
 MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)");
 




Re: [PATCH] RDS: Remove some unused iWARP code

2012-01-12 Thread David Miller
From: Roland Dreier rol...@kernel.org
Date: Thu, 12 Jan 2012 10:57:56 -0800

 From: Roland Dreier rol...@purestorage.com
 
 rds_iw_flush_goal() just returns a count, but it is only called in one
 place and its return value is ignored there.  So delete all the dead code.
 
 Signed-off-by: Roland Dreier rol...@purestorage.com

Applied.