Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Amir Vadai
I hope to have a fix next week for the first one. Thanks, Amir On 04/14/2010 09:48 PM, Tung, Chien Tin wrote: >> Tung, Chien Tin wrote: >> One more thing - Please open a bug regarding the num_sge limitation at: https://bugs.openfabrics.org/ >>> Done, Bug 2027. >>

[infiniband-diags] [2/2] check for duplicate port guids in libibnetdisc cache

2010-04-14 Thread Al Chu
Hey Sasha, This patch checks for duplicate port guids in a libibnetdisc cache when it is loaded and reports an error back to the user appropriately. Al -- Albert Chu ch...@llnl.gov Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory
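
A minimal sketch of the kind of check this patch describes, written in Python rather than the library's C for brevity. The function name and the plain list of GUIDs are hypothetical and are not part of the libibnetdisc API; the sketch only shows that tracking already-seen GUIDs in a set is enough to expose a duplicate while a cache is being loaded.

```python
def find_duplicate_port_guids(port_guids):
    """Return the GUIDs that occur more than once, in order of first repeat.

    port_guids is a hypothetical flat list of the port GUIDs read from a
    cache file; the real cache format is not modelled here.
    """
    seen = set()
    duplicates = []
    for guid in port_guids:
        if guid in seen and guid not in duplicates:
            duplicates.append(guid)
        seen.add(guid)
    return duplicates
```

A loader could treat a non-empty result as an error and report it to the caller instead of building the fabric.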

[infiniband-diags] [1/2] fix libibnetdisc cache error path memleak

2010-04-14 Thread Al Chu
Hey Sasha, This patch fixes a mem-leak through error paths in the libibnetdisc cache loading. If some data had not yet been "copied over" to the fabric struct and an error occurred, that memory would be leaked. Al -- Albert Chu ch...@llnl.gov Computer Scientist High Performance Systems Divisio

Re: [PATCH 1/4] opensm: added function that dumps PathRecords

2010-04-14 Thread Jim Schutt
Hi Yevgeny, On Thu, 2010-04-08 at 07:29 -0600, Yevgeny Kliteynik wrote: > Dumping SL, MTU and Rate for all the > non-switch-2-non-switch paths in the subnet. > > PRs that are dumped: > > for every non-switch source port > for every non-switch target LID in the subnet > dump PR
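
The nested loop quoted above can be sketched as follows. This is only an illustration in Python, not the opensm code: the dictionary keys are hypothetical, each port is assumed to have a single LID, and a port is paired with itself, which the actual patch may or may not do.

```python
def enumerate_path_record_pairs(ports):
    """Yield (source_guid, target_lid) for every non-switch pair.

    ports is a hypothetical list of dicts with the keys
    'guid', 'lid' and 'is_switch'.
    """
    endpoints = [p for p in ports if not p["is_switch"]]
    for src in endpoints:          # every non-switch source port
        for dst in endpoints:      # every non-switch target LID
            yield src["guid"], dst["lid"]
```

The number of records grows with the square of the number of non-switch ports, which is worth keeping in mind for large subnets.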

RE: Socket Direct Protocol: help (2)

2010-04-14 Thread Tung, Chien Tin
>I work on NE020 cards from February 2010 for an INFN experimental >proposal, called REDIGO (Read out at 10 Gbits/s), about the data >acquisition and movement systems. The convergence of storage protocols >around 10 Gigabits/s Ethernet protocols shows that one way could be the >Remote Direct Memory

OFED Cross Reference Server update

2010-04-14 Thread John Groves
As some of you know, System Fabric Works hosts an indexed, hyperlinked and searchable cross reference of the OFED code - based on the LXR cross referencing engine.  I have just added and indexed OFED-1.5.1, as well as a variant of the 1.5.1 with our new Soft RoCEE driver and library included (OFED-

RE: Socket Direct Protocol: help (2)

2010-04-14 Thread Tung, Chien Tin
>Tung, Chien Tin wrote: >>> One more thing - Please open a bug regarding the num_sge limitation at: >>> https://bugs.openfabrics.org/ >>> >> >> Done, Bug 2027. >> >> Chien >> > >And 2028 opened to request fastreg support. > I am open to test fixes for these two bugs. Chien -- To unsubscribe from

[infiniband-diags] fix libibnetdisc portguid hashing corner case

2010-04-14 Thread Al Chu
Hey Sasha, This patch fixes a corner case in libibnetdisc that was storing portguids w/ a guid of 0. This bug was relatively innocuous for ibnetdiscover b/c ibnetdiscover does not output these ports. However, it became a problem for me in the caching library as I attempted to reconstruct a fabr

Re: [PATCH] libibnetdisc: fix outstanding SMPs counting

2010-04-14 Thread Ira Weiny
On Apr 14, 2010, at 3:23 AM, Sasha Khapyorsky wrote: On 13:44 Tue 13 Apr , Ira Weiny wrote: This changes the logic. "num_smps_outstanding" is NOT the number on the wire, but it appears you have made it so. Actually yes, it made it so. This is the number which will cause process_smp_

[PATCH] ummunotify: fix umn-test build

2010-04-14 Thread Randy Dunlap
From: Randy Dunlap Add ummunotify.h to Kbuild list for export to userspace, fixing 27 build errors in umn-test.c when O=builddir is used. Signed-off-by: Randy Dunlap --- include/linux/Kbuild |1 + 1 file changed, 1 insertion(+) maybe another item to add to SubmitChecklist :( --- lnx-263

[ANNOUNCE] libcxgb4 1.0.0 Release

2010-04-14 Thread Steve Wise
The libcxgb4 package is a userspace driver for the new Chelsio T4 iWARP RNICs. It is a plug-in module for libibverbs that allows programs to use Chelsio RDMA T4 hardware directly from userspace. The initial release of this library is now available at: http://www.openfabrics.org/downloads/cxg

Re: RDMA CM problems, ib_find_cached_gid() fails

2010-04-14 Thread Mike M
On Tue, Apr 13, 2010 at 5:41 PM, Sean Hefty wrote: >>Putting in lots of printk messages, I see that the server machine >>indeed gets a connection request MAD, but that the >>ib_find_cached_gid() call inside of cma_acquire_dev() fails. >> >>Any ideas? > > What kernel version are you using, and are

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Amir Vadai
ok - actually I used it in an early version of SDP before changing to FMR... - Amir On 04/14/2010 06:08 PM, Steve Wise wrote: > Amir Vadai wrote: > >> You are right - I missed it. >> >> Andrea, Please open a bug at bugzilla (https://bugs.openfabrics.org) - >> so that you will be notified as so

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Andrea Gozzelino
Thank you very much. I will check the status of bugs 2027 and 2028. Andrea On Apr 14, 2010 05:05 PM, Steve Wise wrote: > Tung, Chien Tin wrote: > >> One more thing - Please open a bug regarding the num_sge limitation > >> at: > >> https://bugs.openfabrics.org/ > >> > > > > Done, Bug 2027. > > >

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Steve Wise
Amir Vadai wrote: You are right - I missed it. Andrea, Please open a bug at bugzilla (https://bugs.openfabrics.org) - so that you will be notified as soon as I fix SDP to not use FMR if it is not supported. As to fastreg_mrs support - I don't know this mechanism. Do you mean FRWR? ib_alloc_fa

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Steve Wise
Tung, Chien Tin wrote: One more thing - Please open a bug regarding the num_sge limitation at: https://bugs.openfabrics.org/ Done, Bug 2027. Chien And 2028 opened to request fastreg support. Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Amir Vadai
You are right - I missed it. Andrea, Please open a bug at bugzilla (https://bugs.openfabrics.org) - so that you will be notified as soon as I fix SDP to not use FMR if it is not supported. As to fastreg_mrs support - I don't know this mechanism. Do you mean FRWR? Thanks, Amir On 04/14/2010 05:54 P

RE: Socket Direct Protocol: help (2)

2010-04-14 Thread Tung, Chien Tin
>One more thing - Please open a bug regarding the num_sge limitation at: >https://bugs.openfabrics.org/ Done, Bug 2027. Chien -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.o

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Steve Wise
Hey Amir, I don't think this helps because sdp_add_device() will not add rdma devices that fail to create fmr pools. So I guess you could key off of fmr pool failures and set sdp_zcopy_thresh to 0 and allow the device to be used? But what we really need is sdp support for fastreg_mrs as an

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Amir Vadai
One more thing - Please open a bug regarding the num_sge limitation at: https://bugs.openfabrics.org/ Thanks, Amir On 04/14/2010 05:31 PM, Amir Vadai wrote: > Hi, > > FMRs are used only in a special mode called ZCopy. > > You could disable this mode by setting the module parameter > sdp_zcopy

Re: [PATCH] ummunotify: Userspace support for MMU notifications

2010-04-14 Thread Jeff Squyres
On Apr 14, 2010, at 5:06 AM, Gleb Natapov wrote: > > The Open MPI developers have spent a lot of effort trying to handle this > > purely in userspace and still do not believe that a truly robust > > solution is possible without kernel help. Perhaps they can expand on > > what the obstacles are.

Re: Socket Direct Protocol: help (2)

2010-04-14 Thread Amir Vadai
Hi, FMRs are used only in a special mode called ZCopy. You could disable this mode by setting the module parameter sdp_zcopy_thresh to 0, or by issuing: # echo 0 > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh This means that you won't get the benefits of Zero-copy. - Amir On 04/14/2010 1

[PATCH V4 2/2] mlx4/IB: Add support for enhanced atomic operations

2010-04-14 Thread Vladimir Sokolovsky
Added support for masked atomic operations: - Masked Compare and Swap - Masked Fetch and Add Signed-off-by: Vladimir Sokolovsky --- drivers/infiniband/hw/mlx4/cq.c |8 drivers/infiniband/hw/mlx4/main.c |1 + drivers/infiniband/hw/mlx4/qp.c | 27 ++

[PATCH V4 1/2] IB/core: Add support for enhanced atomic operations

2010-04-14 Thread Vladimir Sokolovsky
- Add new IB_WR_MASKED_ATOMIC_CMP_AND_SWP and IB_WR_MASKED_ATOMIC_FETCH_AND_ADD send opcodes that can be used to mark a "masked atomic compare and swap" and "masked atomic fetch and add" work request respectively. - Add masked_atomic_cap capability. - Add mask fields to atomic struct of i

[PATCH V4 0/2] Add support for enhanced atomic operations

2010-04-14 Thread Vladimir Sokolovsky
Hi Roland, This patchset adds support for the following enhanced atomic operations: - Masked atomic compare and swap - Masked atomic fetch and add These operations enable using a smaller amount of memory when using multiple locks by using portions of a 64 bit value in an atomic operation. For som
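
To illustrate how a masked compare and swap lets several locks share one 64-bit value, here is a small software model in Python. It is a sketch of the semantics as I understand them (compare only the bits selected by one mask, swap only the bits selected by another), not the verbs API or the mlx4 implementation; the function name, argument names, and the two 32-bit lock layout are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1

def masked_cmp_and_swap(value, compare, compare_mask, swap, swap_mask):
    """Model one masked compare-and-swap on a 64-bit word.

    Returns (old_value, new_value). The swap happens only when the bits
    selected by compare_mask are equal in value and compare, and only the
    bits selected by swap_mask are replaced.
    """
    old = value & MASK64
    new = old
    if ((compare ^ old) & compare_mask) == 0:
        new = (old & ~swap_mask & MASK64) | (swap & swap_mask)
    return old, new

# Hypothetical layout: two independent 32-bit locks packed in one word.
LOW_LOCK = 0x00000000FFFFFFFF
HIGH_LOCK = 0xFFFFFFFF00000000
```

With this layout, taking the low lock (compare 0, swap 1 under `LOW_LOCK`) leaves the high half untouched, and a second attempt on the low lock fails without disturbing whoever holds the high one.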

[PATCH] amso1100: Add missing memset

2010-04-14 Thread Vladimir Sokolovsky
Signed-off-by: Vladimir Sokolovsky --- drivers/infiniband/hw/amso1100/c2_rnic.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c b/drivers/infiniband/hw/amso1100/c2_rnic.c index dd05c48..bac9856 100644 --- a/drivers/infiniband/hw/

Re: [PATCH] libibnetdisc: fix outstanding SMPs counting

2010-04-14 Thread Sasha Khapyorsky
On 13:44 Tue 13 Apr , Ira Weiny wrote: > > > This changes the logic. "num_smps_outstanding" is NOT the number on the > > wire, but it appears you have made it so. Actually yes, it made it so. > > This is the number which will cause process_smp_queue to continue being > > called. > > > >

[PATCH] libibnetdisc: don't query CA ports not connected to a fabric

2010-04-14 Thread Sasha Khapyorsky
We can save some MADs by not querying CA/Router ports which are not connected to our fabric. When discovery reaches a CA or Router node it will always get PortInfo for the port which was discovered and not the others. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/libibnetdisc/src/ibnetd
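
The rule described in this patch can be sketched as a small decision function. This is an illustrative Python model only, not the libibnetdisc code: the function name, the string node types, and the choice to list switch ports as 1..num_ports are assumptions.

```python
def ports_to_query(node_type, num_ports, arrival_port):
    """Return the port numbers whose PortInfo should be requested.

    For a switch every external port is queried; for a CA or router only
    the port through which discovery reached the node is queried, since
    its other ports are not known to be on this fabric.
    """
    if node_type == "switch":
        return list(range(1, num_ports + 1))
    return [arrival_port]
```

For a two-port CA reached through port 2, this asks for one PortInfo instead of two, which is where the saving in MADs comes from.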

Re: [PATCH v3 1/2] libibnetdisc: Convert to a multi-smp algorithm

2010-04-14 Thread Sasha Khapyorsky
On 13:30 Tue 13 Apr , Ira Weiny wrote: > > If we are going to do something like this why not more like a context? > > Something like this? > > ibqueryerrors.c > query_errors_ibmad_port = mad_rpc_open_port(ibd_ca, ibd_ca_port, > mgmt_classes, 4); > > { > ibnd_conte

Re: [infiniband-diags] [0/3] support --diff and --diffcheck in ibnetdiscover

2010-04-14 Thread Sasha Khapyorsky
On 10:17 Tue 13 Apr , Al Chu wrote: > > I had considered this at one point. There were several reasons I > decided to go w/ the cache idea. Perhaps the major reason is that the > current cache system has "all" the data (nodeinfo, portinfo, etc.) > saved, whereas the normal ibnetdiscover outp

Re: [PATCH] ummunotify: Userspace support for MMU notifications

2010-04-14 Thread Gleb Natapov
On Tue, Apr 13, 2010 at 10:57:32AM -0700, Roland Dreier wrote: > > It is further claimed that "… other tricks are not robust". I wrote > > the code used in Scali/Platform MPI handling the issue. I do not > > think it's fair to claim that this MPI is not robust in this matter > > nor that is perf

Re: [PATCH] ummunotify: Userspace support for MMU notifications

2010-04-14 Thread Gleb Natapov
On Tue, Apr 13, 2010 at 08:02:54PM +0200, Peter Zijlstra wrote: > On Tue, 2010-04-13 at 10:57 -0700, Roland Dreier wrote: > > Are those system calls the only possible way that virtual to physical > > mappings can change? Can't page migration or something like that > > potentially affect things? A

RE: Socket Direct Protocol: help (2)

2010-04-14 Thread Andrea Gozzelino
On Apr 13, 2010 10:22 PM, "Tung, Chien Tin" wrote: > >>> Chien, does the NE020 support FMRs? I looked at the nes ofed-1.5 > >>> code > >>> and it appears to do nothing in the map_phys_fmr functions. > >>> > >> > >> We never implemented map_phys_fmr. Is it relevant to the # of SGEs? > >> > >No, bu