Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc3 is ready

2006-08-31 Thread Tziporet Koren
Hi Scott, 
This was my mistake (I put both the binary and the source RPMs in the
tgz, not just the source RPMs).
I fixed this (removed the binary RPMs).
All the rest was not touched.
 
Tziporet

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Scott
Weitzenkamp (sweitzen)
Sent: Friday, September 01, 2006 4:31 AM
To: Tziporet Koren; EWG
Cc: OPENIB
Subject: Re: [openfabrics-ewg] OFED 1.1-rc3 is ready

RC3 includes a bunch of binary RPMS, please remove for RC4.  Look at the
size of the RC3 tarball vs previous ones:

$ ls -s | more
total 290848
 46512 OFED-1.1-rc1.tgz
 0 OFED-1.1-rc1.tgz.md5sum
 47048 OFED-1.1-rc2.tgz
 0 OFED-1.1-rc2.tgz.md5sum
197288 OFED-1.1-rc3.tgz
 0 OFED-1.1-rc3.tgz.md5sum

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Tziporet Koren
> Sent: Thursday, August 31, 2006 9:24 AM
> To: EWG
> Cc: OPENIB
> Subject: [openfabrics-ewg] OFED 1.1-rc3 is ready
> 
> Hi,
> 
> OFED 1.1-RC3 is available on 
> https://openib.org/svn/gen2/branches/1.1/ofed/releases/
> File: OFED-1.1-rc3.tgz
> Please report any issues in bugzilla http://openib.org/bugzilla/
> 
> Schedule reminder:
> ==
> Next milestones:
> RC4 is planned for 7-Sep. It should include critical bug fixes only.
> Final release will be on 11 or 12 Sep.
> 
> Owners - please update release notes for RC4.
> 
> Tziporet & Vlad
> --
> ---
> 
> Release details:
> 
> Build_id:
> OFED-1.1-rc3
> 
> openib-1.1 (REV=9203)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git:
> ref: refs/heads/ofed_1_1
> commit 338e942a4ae10d62f2632e6292f85bb1b15d154c
> 
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
> 
> 
> OS support:
> ===
> Novell:
> - SLES 9.0 SP3
> - SLES10
> Redhat:
> - Redhat EL4 up3
> - Redhat EL4 up4
> kernel.org:
> - Kernel 2.6.17
> 
> Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped 
> from the list.
> We keep the backport patches for these OSes and make sure 
> OFED compile and
> loaded properly but will not do full QA cycle.
> 
> Systems:
> 
> * x86_64
> * x86
> * ia64
> * ppc64
> 
> Main changes from OFED-1.1-rc2:
> ===
> 1. Added ehca (IBM) driver. This driver can be compiled on 
> kernel 2.6.18 
> only
> 3. Open MPI version update to openmpi-1.1.1-1
> 4. Core: Huge pages registration is supported
> 5. IPoIB high availability script supports multicast groups
> 6. RHEL4 up4 is now supported
> 7. SDP: fixed connection refused problem; get peer name working
> 8. libsdp: several bug fixes
> 
> Limitations and known issues:
> =
> 1. SDP: For Mellanox Sinai HCAs one must use latest FW 
> version (1.1.000).
> 2. SDP: Scalability issue when many connections are opened
> 3. SDP: If RTU packet is lost Accept call blocks even if 
> client connected.
> 4. ipath driver is not supported on SLES9 SP3
> 5. Compilation on kernel 2.6.18-rc5 is failing - to be fixed in RC4
>  
> 
> Missing features that should be completed for RC4:
> ==
> None
> 
> 
> 
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
> 

___
openfabrics-ewg mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openfabrics-ewg

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] single rkey

2006-08-31 Thread Devesh Sharma
On 8/31/06, yipee <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Is it possible for several memory registrations (using ibv_reg_mr) to have a
> single rkey?
> Can I add memory registrations to a previous rkey?
No, this is not possible. A single memory registration call can cover a
large buffer, but once it is registered with the NIC you cannot make any
modifications to it, and hence multiple registrations cannot share the
same R_Key.
>
>
> thanks,
> y




Re: [openib-general] [PATCH] IB/perftest: Fix get_median, size of delta, usage(), worst latency

2006-08-31 Thread CH Ganapathi
>>> "Michael S. Tsirkin" <[EMAIL PROTECTED]> 8/31/06 2:59 PM >>>
> 3) Worst latency is delta[iters - 2] in read_lat.c, not delta[iters - 3].

>> could you explain this last bit please?

delta has (iters - 1) elements, so its index range is 0 to (iters - 2).
After sorting, delta[iters - 2] is the maximum.

Regards,
Ganapathi.
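
The indexing argument above can be checked with a small standalone
sketch. This is not the perftest source: the function names, the
uint64_t timestamps and the choice of median for an even number of
samples (mean of the two middle values) are assumptions made for
illustration. With iters timestamps there are only iters - 1
differences, so after sorting the largest one sits at index iters - 2.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for uint64_t values, ascending order. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Fill delta[0 .. iters - 2] with the differences between consecutive
 * timestamps and sort them in ascending order.
 * Returns the number of deltas, which is iters - 1.
 */
static size_t compute_deltas(const uint64_t *tstamp, size_t iters,
			     uint64_t *delta)
{
	size_t i;

	assert(iters >= 2);
	for (i = 0; i + 1 < iters; ++i)
		delta[i] = tstamp[i + 1] - tstamp[i];
	qsort(delta, iters - 1, sizeof(*delta), cmp_u64);
	return iters - 1;
}

/* Largest delta: the last valid index is iters - 2, not iters - 3. */
static uint64_t worst_latency(const uint64_t *sorted_delta, size_t iters)
{
	return sorted_delta[iters - 2];
}

/*
 * Median of the iters - 1 sorted deltas. For an even count this sketch
 * returns the (integer) mean of the two middle values; that choice is
 * an assumption, not taken from the patch.
 */
static uint64_t median_latency(const uint64_t *sorted_delta, size_t iters)
{
	size_t n = iters - 1;

	if (n % 2)
		return sorted_delta[n / 2];
	return (sorted_delta[n / 2 - 1] + sorted_delta[n / 2]) / 2;
}
```

Calling compute_deltas() on five timestamps yields four deltas, and
worst_latency() then reads index 3, the last element of the sorted
array.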





Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc3 is ready

2006-08-31 Thread Scott Weitzenkamp (sweitzen)
RC3 includes a bunch of binary RPMS, please remove for RC4.  Look at the
size of the RC3 tarball vs previous ones:

$ ls -s | more
total 290848
 46512 OFED-1.1-rc1.tgz
 0 OFED-1.1-rc1.tgz.md5sum
 47048 OFED-1.1-rc2.tgz
 0 OFED-1.1-rc2.tgz.md5sum
197288 OFED-1.1-rc3.tgz
 0 OFED-1.1-rc3.tgz.md5sum

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Tziporet Koren
> Sent: Thursday, August 31, 2006 9:24 AM
> To: EWG
> Cc: OPENIB
> Subject: [openfabrics-ewg] OFED 1.1-rc3 is ready
> 
> Hi,
> 
> OFED 1.1-RC3 is available on 
> https://openib.org/svn/gen2/branches/1.1/ofed/releases/
> File: OFED-1.1-rc3.tgz
> Please report any issues in bugzilla http://openib.org/bugzilla/
> 
> Schedule reminder:
> ==
> Next milestones:
> RC4 is planned for 7-Sep. It should include critical bug fixes only.
> Final release will be on 11 or 12 Sep.
> 
> Owners - please update release notes for RC4.
> 
> Tziporet & Vlad
> --
> ---
> 
> Release details:
> 
> Build_id:
> OFED-1.1-rc3
> 
> openib-1.1 (REV=9203)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git:
> ref: refs/heads/ofed_1_1
> commit 338e942a4ae10d62f2632e6292f85bb1b15d154c
> 
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
> 
> 
> OS support:
> ===
> Novell:
> - SLES 9.0 SP3
> - SLES10
> Redhat:
> - Redhat EL4 up3
> - Redhat EL4 up4
> kernel.org:
> - Kernel 2.6.17
> 
> Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped 
> from the list.
> We keep the backport patches for these OSes and make sure 
> OFED compile and
> loaded properly but will not do full QA cycle.
> 
> Systems:
> 
> * x86_64
> * x86
> * ia64
> * ppc64
> 
> Main changes from OFED-1.1-rc2:
> ===
> 1. Added ehca (IBM) driver. This driver can be compiled on 
> kernel 2.6.18 
> only
> 3. Open MPI version update to openmpi-1.1.1-1
> 4. Core: Huge pages registration is supported
> 5. IPoIB high availability script supports multicast groups
> 6. RHEL4 up4 is now supported
> 7. SDP: fixed connection refused problem; get peer name working
> 8. libsdp: several bug fixes
> 
> Limitations and known issues:
> =
> 1. SDP: For Mellanox Sinai HCAs one must use latest FW 
> version (1.1.000).
> 2. SDP: Scalability issue when many connections are opened
> 3. SDP: If RTU packet is lost Accept call blocks even if 
> client connected.
> 4. ipath driver is not supported on SLES9 SP3
> 5. Compilation on kernel 2.6.18-rc5 is failing - to be fixed in RC4
>  
> 
> Missing features that should be completed for RC4:
> ==
> None
> 
> 
> 
> 




Re: [openib-general] Srp question

2006-08-31 Thread Roland Dreier
Makia> We did add this (pardon my typo of max_sects below).  I did
Makia> find that by appending the max_sect= at the end of the line
Makia> we were seeing strange behaviour (it seemed that the parser
Makia> added a newline for no reason) and the only fix was to put
Makia> it at the beginning of the line.
 
Actually it's probably echo adding the newline.  You can use "echo -n"
to work around this, or just put the max_sect at the beginning of the line.
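
The trailing-newline effect can be reproduced in isolation. The sketch
below is not the ib_srp option parser: parse_max_sect() is a
hypothetical strict parser, used only to show why a value at the end of
the string would be affected while one at the beginning would not.
Plain echo appends "\n" to what it writes, so the newline ends up
attached to the last option.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical strict parser: look for "max_sect=<number>" in a
 * comma-separated option string and store the number in *out.
 * Returns 0 on success, -1 if the option is missing or malformed.
 */
static int parse_max_sect(const char *opts, unsigned long *out)
{
	const char *key = "max_sect=";
	const char *p = opts;
	char *end;

	while (p && *p) {
		if (strncmp(p, key, strlen(key)) == 0) {
			p += strlen(key);
			errno = 0;
			*out = strtoul(p, &end, 10);
			/* The value must be followed by ',' or end of string. */
			if (end == p || errno ||
			    (*end != ',' && *end != '\0'))
				return -1;
			return 0;
		}
		/* Advance to the start of the next comma-separated token. */
		p = strchr(p, ',');
		if (p)
			++p;
	}
	return -1;
}

/* Remove one trailing newline in place (what "echo -n" avoids sending). */
static void strip_trailing_newline(char *s)
{
	size_t len = strlen(s);

	if (len && s[len - 1] == '\n')
		s[len - 1] = '\0';
}
```

With this parser, "pkey=ffff,max_sect=4096\n" is rejected, while
"max_sect=4096,pkey=ffff\n" is accepted because the newline now trails
a different option. Stripping the newline makes both orderings parse.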

Makia> Sorry again about the typo (I should never attempt to work
Makia> off of memory).  With the max_sect=4096, and the
Makia> srp_sg_tablesize to 256, we are now seeing 512KB IOs.  The
Makia> new question is: is there a way to get this to 1M IOs?

Don't know... do you get that with the old IBgold SRP initiator?

 - R.




Re: [openib-general] lockdep warnings

2006-08-31 Thread Roland Dreier
Michael> Hi, Roland!  I got a load of lockdep warnings after
Michael> loading all modules and configuring ipoib. This doesn't
Michael> usually happen, not sure what I changed this time.  I'm a
Michael> bit too busy this week - could you take a look at the
Michael> log, please?

This would only happen on a mem-full HCA.  I just asked Linus to pull
the following fix.

commit 02113bd77e86386d02a9a606cdad53803a6e2794
Author: Roland Dreier <[EMAIL PROTECTED]>
Date:   Thu Aug 31 16:43:06 2006 -0700

IB/mthca: Use IRQ safe locks to protect allocation bitmaps

It is supposed to be OK to call mthca_create_ah() and mthca_destroy_ah()
from any context.  However, for mem-full HCAs, these functions use the
mthca_alloc() and mthca_free() bitmap helpers, and those helpers use
non-IRQ-safe spin_lock() internally.  Lockdep correctly warns that
this could lead to a deadlock.  Fix this by changing mthca_alloc() and
mthca_free() to use spin_lock_irqsave().

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>

diff --git a/drivers/infiniband/hw/mthca/mthca_allocator.c b/drivers/infiniband/hw/mthca/mthca_allocator.c
index 25157f5..f930e55 100644
--- a/drivers/infiniband/hw/mthca/mthca_allocator.c
+++ b/drivers/infiniband/hw/mthca/mthca_allocator.c
@@ -41,9 +41,11 @@ #include "mthca_dev.h"
 /* Trivial bitmap-based allocator */
 u32 mthca_alloc(struct mthca_alloc *alloc)
 {
+	unsigned long flags;
 	u32 obj;
 
-	spin_lock(&alloc->lock);
+	spin_lock_irqsave(&alloc->lock, flags);
+
 	obj = find_next_zero_bit(alloc->table, alloc->max, alloc->last);
 	if (obj >= alloc->max) {
 		alloc->top = (alloc->top + alloc->max) & alloc->mask;
@@ -56,19 +58,24 @@ u32 mthca_alloc(struct mthca_alloc *allo
 	} else
 		obj = -1;
 
-	spin_unlock(&alloc->lock);
+	spin_unlock_irqrestore(&alloc->lock, flags);
 
 	return obj;
 }
 
 void mthca_free(struct mthca_alloc *alloc, u32 obj)
 {
+	unsigned long flags;
+
 	obj &= alloc->max - 1;
-	spin_lock(&alloc->lock);
+
+	spin_lock_irqsave(&alloc->lock, flags);
+
 	clear_bit(obj, alloc->table);
 	alloc->last = min(alloc->last, obj);
 	alloc->top = (alloc->top + alloc->max) & alloc->mask;
-	spin_unlock(&alloc->lock);
+
+	spin_unlock_irqrestore(&alloc->lock, flags);
 }
 
 int mthca_alloc_init(struct mthca_alloc *alloc, u32 num, u32 mask,
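
For readers unfamiliar with the kind of allocator being patched, here
is a simplified user-space sketch of a bitmap allocator of the same
general shape. It is an illustration only: the names bitmap_alloc,
bitmap_obj_alloc and bitmap_obj_free are hypothetical, the top/mask
handling of the real mthca code is left out, and there is no locking,
so it does not reproduce the IRQ-safety issue itself. Comments mark
where the kernel code takes and releases the lock.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_OBJS 64u	/* illustration only; must be a multiple of 32 */

struct bitmap_alloc {
	uint32_t table[NUM_OBJS / 32];	/* one bit per object, 1 = in use */
	uint32_t last;			/* every index below this is in use */
};

static int bit_is_set(const struct bitmap_alloc *a, uint32_t i)
{
	return (a->table[i / 32] >> (i % 32)) & 1u;
}

static void bitmap_alloc_init(struct bitmap_alloc *a)
{
	memset(a, 0, sizeof(*a));
}

/* Returns an object index, or (uint32_t)-1 when nothing is free. */
static uint32_t bitmap_obj_alloc(struct bitmap_alloc *a)
{
	uint32_t obj;

	/* kernel version: spin_lock_irqsave(&alloc->lock, flags); */
	for (obj = a->last; obj < NUM_OBJS; ++obj)
		if (!bit_is_set(a, obj))
			break;
	if (obj < NUM_OBJS) {
		a->table[obj / 32] |= 1u << (obj % 32);
		a->last = obj + 1;
	} else {
		obj = (uint32_t)-1;
	}
	/* kernel version: spin_unlock_irqrestore(&alloc->lock, flags); */
	return obj;
}

static void bitmap_obj_free(struct bitmap_alloc *a, uint32_t obj)
{
	assert(obj < NUM_OBJS && bit_is_set(a, obj));

	/* kernel version: spin_lock_irqsave(&alloc->lock, flags); */
	a->table[obj / 32] &= ~(1u << (obj % 32));
	if (obj < a->last)
		a->last = obj;	/* restart the next search from the hole */
	/* kernel version: spin_unlock_irqrestore(&alloc->lock, flags); */
}
```

Both functions read and modify the same table and last fields, which
is why the real code must hold the lock around them, and why that lock
has to be IRQ-safe if the callers may run in interrupt context.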




[openib-general] [GIT PULL] please pull infiniband.git

2006-08-31 Thread Roland Dreier
Linus, please pull from

master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

to get a fix for a locking bug found by lockdep:

Roland Dreier:
  IB/mthca: Use IRQ safe locks to protect allocation bitmaps

 drivers/infiniband/hw/mthca/mthca_allocator.c |   15 +++++++++++----
 1 files changed, 11 insertions(+), 4 deletions(-)


diff --git a/drivers/infiniband/hw/mthca/mthca_allocator.c b/drivers/infiniband/hw/mthca/mthca_allocator.c
index 25157f5..f930e55 100644
--- a/drivers/infiniband/hw/mthca/mthca_allocator.c
+++ b/drivers/infiniband/hw/mthca/mthca_allocator.c
@@ -41,9 +41,11 @@ #include "mthca_dev.h"
 /* Trivial bitmap-based allocator */
 u32 mthca_alloc(struct mthca_alloc *alloc)
 {
+	unsigned long flags;
 	u32 obj;
 
-	spin_lock(&alloc->lock);
+	spin_lock_irqsave(&alloc->lock, flags);
+
 	obj = find_next_zero_bit(alloc->table, alloc->max, alloc->last);
 	if (obj >= alloc->max) {
 		alloc->top = (alloc->top + alloc->max) & alloc->mask;
@@ -56,19 +58,24 @@ u32 mthca_alloc(struct mthca_alloc *allo
 	} else
 		obj = -1;
 
-	spin_unlock(&alloc->lock);
+	spin_unlock_irqrestore(&alloc->lock, flags);
 
 	return obj;
 }
 
 void mthca_free(struct mthca_alloc *alloc, u32 obj)
 {
+	unsigned long flags;
+
 	obj &= alloc->max - 1;
-	spin_lock(&alloc->lock);
+
+	spin_lock_irqsave(&alloc->lock, flags);
+
 	clear_bit(obj, alloc->table);
 	alloc->last = min(alloc->last, obj);
 	alloc->top = (alloc->top + alloc->max) & alloc->mask;
-	spin_unlock(&alloc->lock);
+
+	spin_unlock_irqrestore(&alloc->lock, flags);
 }
 
 int mthca_alloc_init(struct mthca_alloc *alloc, u32 num, u32 mask,




Re: [openib-general] Srp question

2006-08-31 Thread Makia Minich



On 8/31/06 3:34 PM, "Roland Dreier" <[EMAIL PROTECTED]> wrote:

> Makia> We are attempting to do some performance testing of the SRP
> Makia> driver (with a DDN target) and are seeing some poor results:
> 
> Makia> ~120MB/s per lun with 1 sgp_dd
> Makia> ~80MB/s per lun with 4 sgp_dd
> 
> Makia> Previously we had attempted the same tests with IBGold and got the
> Makia> following:
> 
> Makia> ~150MB/s per lun with 1 sgp_dd
> Makia> ~600MB/s per lun with 4 sgp_dd
> 
> Were these tests with the same kernels otherwise?  If not, there may
> be unrelated changes to the SCSI stack that affect synthetic
> benchmarks like this.  (I seem to remember a change in the not too
> distant past that affected the largest IO it is possible to submit
> through the SG interface).

The kernels in question were 2.6.9.22.0.2 and 2.6.9-34.EL.  I'll have to
find some changelogs to see if there were changes to the SCSI stack.

> Makia> To achieve the results in IBGold, we were able to set the
> Makia> srp module option "max_xfer_sectors_per_io=4096", but can't
> Makia> seem to find an equivalent option in the OFED SRP drivers.
> 
> When connecting to the target (the echo to the add_target file), you
> can add ",max_sect=4096" to the string you pass in.

We did add this (pardon my typo of max_sects below).  I did find that by
appending the max_sect= at the end of the line we were seeing strange
behaviour (it seemed that the parser added a newline for no reason) and the
only fix was to put it at the beginning of the line.
 
> Makia> By default, we found (via stats from the DDN) that we were
> Makia> only seeing reads and writes in the 0-32Kbyte range.
> Makia> Comparing IBGold and OFED, we found that the
> Makia> srp_sg_tablesize defaulted to 256, but in OFED it defaulted
> Makia> to 12.  So, changing this (via modprobe.conf) to 256 in
> Makia> OFED, we were able to see reads and writes in the 128Kbyte
> Makia> range (which is what ultimately got us to the performance
> Makia> above).  I also noticed that there is a max_sects option
> Makia> you can pass to add_target (in the SRP /sys entries) which
> Makia> seemed to be the same idea as srp_sg_tablesize, but this
> Makia> didn't seem to affect anything.
> 
> It is "max_sect" not "max_sects" (no final 's').  Anyway, what do you
> mean that it didn't affect anything?  max_sect=4096 should
> theoretically get you up to 512 KB IOs.

Sorry again about the typo (I should never attempt to work off of memory).
With the max_sect=4096, and the srp_sg_tablesize to 256, we are now seeing
512KB IOs.  The new question is: is there a way to get this to 1M IOs?
 
>  - R.

-- 
Makia Minich <[EMAIL PROTECTED]>
National Center for Computation Science
Oak Ridge National Laboratory





Re: [openib-general] [Openib-windows] File transfer performance options

2006-08-31 Thread Tzachi Dar
There is one thing missing from your mail: whether you want to see the
Windows machine as a file server (for example Samba, NFS, SRP), or
whether you are ready to accept it as a normal server. The big
difference is that with the second option the server can run in user
mode (for example an FTP server).

When the server application runs in user mode, SDP can be used as a
socket provider.  This means that, theoretically, every socket
application should run and enjoy the speed of InfiniBand. Currently
there are two SDP projects under development, one for Linux and one for
Windows, so SDP can be used to connect machines of both types. The
performance we have measured on the Windows platform, using DDR cards,
was more than 1200 MB/sec (of course, this data came from host memory,
not from disks).

So, if all you need to do is pass files from one side to the other, I
would recommend that you check this option.

One note about your experiments: using RAM disks probably means that
there is one more copy, from the RAM disk to the application buffer. A
real disk has its own DMA engine, while a RAM disk doesn't. Another copy
is probably not a problem when you are talking about 100 MB/sec, but it
would become a problem once you use SDP (I hope).

Thanks
Tzachi





We've been testing an application that archives large quantities of data
from a Linux system onto a Windows-based server (64bit server 2003 R2).

As part of the investigation into relatively modest transfer speeds in the
win-linux configuration, we configured a Linux-Linux transfer via IPoIB with
NFS layered on top (with ram disks to avoid physical disk issues)

[Whilst for a real Linux-Linux configuration I would look for the RDMA over
NFS solution, this wouldn't translate to our eventual win-linux
inter-operable system.]

I was surprised that even on linux-linux I hit a wall of 100MB/s (test notes
below). Are others doing better? I was hoping for 150MB/s - 200MB/s

Does anyone have any hints on tweaking of an IPoIB/NFS solution to get
better throughput for large files (not so concerned about latency).

Are there any other inter-operable windows-linux solutions now?
(cross-platform NFS over RDMA or SRP initiator/target?)

Paul Baxter







Re: [openib-general] [PATCH] cma: protect against adding device during destruction

2006-08-31 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: [PATCH] cma: protect against adding device during destruction
> 
> Can you see if this patch helps any?
> 
> This closes a window where address resolution can attach an rdma_cm_id
> to a device during destruction of the rdma_cm_id.  This can result in
> the rdma_cm_id remaining in the device list after its memory has been
> freed.
> 
> Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
 

I'll test some, but the problem hasn't reappeared since.
The patch looks right, I'd say push it for 2.6.18.

-- 
MST




Re: [openib-general] [PATCH] 2.6.19 cma: fix typo

2006-08-31 Thread Roland Dreier
This was already fixed by the iWARP merge patches (which I'll push out
shortly).  So I'll drop this patch...




Re: [openib-general] Srp question

2006-08-31 Thread Roland Dreier
Makia> We are attempting to do some performance testing of the SRP
Makia> driver (with a DDN target) and are seeing some poor results:

Makia> ~120MB/s per lun with 1 sgp_dd
Makia> ~80MB/s per lun with 4 sgp_dd

Makia> Previously we had attempted the same tests with IBGold and got the
Makia> following:

Makia> ~150MB/s per lun with 1 sgp_dd
Makia> ~600MB/s per lun with 4 sgp_dd

Were these tests with the same kernels otherwise?  If not, there may
be unrelated changes to the SCSI stack that affect synthetic
benchmarks like this.  (I seem to remember a change in the not too
distant past that affected the largest IO it is possible to submit
through the SG interface).

Makia> To achieve the results in IBGold, we were able to set the
Makia> srp module option "max_xfer_sectors_per_io=4096", but can't
Makia> seem to find an equivalent option in the OFED SRP drivers.

When connecting to the target (the echo to the add_target file), you
can add ",max_sect=4096" to the string you pass in.

Makia> By default, we found (via stats from the DDN) that we were
Makia> only seeing reads and writes in the 0-32Kbyte range.
Makia> Comparing IBGold and OFED, we found that the
Makia> srp_sg_tablesize defaulted to 256, but in OFED it defaulted
Makia> to 12.  So, changing this (via modprobe.conf) to 256 in
Makia> OFED, we were able to see reads and writes in the 128Kbyte
Makia> range (which is what ultimately got us to the performance
Makia> above).  I also noticed that there is a max_sects option
Makia> you can pass to add_target (in the SRP /sys entries) which
Makia> seemed to be the same idea as srp_sg_tablesize, but this
Makia> didn't seem to affect anything.

It is "max_sect" not "max_sects" (no final 's').  Anyway, what do you
mean that it didn't affect anything?  max_sect=4096 should
theoretically get you up to 512 KB IOs.

 - R.




[openib-general] Srp question

2006-08-31 Thread Makia Minich
We are attempting to do some performance testing of the SRP driver (with a
DDN target) and are seeing some poor results:

~120MB/s per lun with 1 sgp_dd
~80MB/s per lun with 4 sgp_dd

Previously we had attempted the same tests with IBGold and got the
following:

~150MB/s per lun with 1 sgp_dd
~600MB/s per lun with 4 sgp_dd

To achieve the results in IBGold, we were able to set the srp module option
"max_xfer_sectors_per_io=4096", but can't seem to find an equivalent option
in the OFED SRP drivers.

By default, we found (via stats from the DDN) that we were only seeing reads
and writes in the 0-32Kbyte range.  Comparing IBGold and OFED, we found that
in IBGold the srp_sg_tablesize defaulted to 256, but in OFED it defaulted to
12.  So,
changing this (via modprobe.conf) to 256 in OFED, we were able to see reads
and writes in the 128Kbyte range (which is what ultimately got us to the
performance above).  I also noticed that there is a max_sects option you can
pass to add_target (in the SRP /sys entries) which seemed to be the same
idea as srp_sg_tablesize, but this didn't seem to affect anything.

So, my question is, what is the right magic to get SRP up to speed?

Thanks...

-- 
Makia Minich <[EMAIL PROTECTED]>
National Center for Computation Science
Oak Ridge National Laboratory





[openib-general] [PATCH] cma: protect against adding device during destruction

2006-08-31 Thread Sean Hefty
Can you see if this patch helps any?

This closes a window where address resolution can attach an rdma_cm_id
to a device during destruction of the rdma_cm_id.  This can result in
the rdma_cm_id remaining in the device list after its memory has been
freed.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
Index: cma.c
===================================================================
--- cma.c	(revision 9192)
+++ cma.c	(working copy)
@@ -283,7 +284,6 @@ static int cma_acquire_ib_dev(struct rdm
 
 	ib_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid);
 
-	mutex_lock(&lock);
 	list_for_each_entry(cma_dev, &dev_list, list) {
 		ret = ib_find_cached_gid(cma_dev->device, &gid,
 					 &id_priv->id.port_num, NULL);
@@ -292,7 +292,6 @@ static int cma_acquire_ib_dev(struct rdm
 			break;
 		}
 	}
-	mutex_unlock(&lock);
 	return ret;
 }
 
@@ -781,7 +780,9 @@ void rdma_destroy_id(struct rdma_cm_id *
 	state = cma_exch(id_priv, CMA_DESTROYING);
 	cma_cancel_operation(id_priv, state);
 
+	mutex_lock(&lock);
 	if (id_priv->cma_dev) {
+		mutex_unlock(&lock);
 		switch (rdma_node_get_transport(id->device->node_type)) {
 		case RDMA_TRANSPORT_IB:
 			if (id_priv->cm_id.ib && !IS_ERR(id_priv->cm_id.ib))
@@ -793,8 +794,8 @@ void rdma_destroy_id(struct rdma_cm_id *
 		cma_leave_mc_groups(id_priv);
 		mutex_lock(&lock);
 		cma_detach_from_dev(id_priv);
-		mutex_unlock(&lock);
 	}
+	mutex_unlock(&lock);
 
 	cma_release_port(id_priv);
 	cma_deref_id(id_priv);
@@ -1511,16 +1512,26 @@ static void addr_handler(int status, str
 	enum rdma_cm_event_type event;
 
 	atomic_inc(&id_priv->dev_remove);
-	if (!id_priv->cma_dev && !status)
+
+	/*
+	 * Grab mutex to block rdma_destroy_id() from removing the device while
+	 * we're trying to acquire it.
+	 */
+	mutex_lock(&lock);
+	if (!cma_comp_exch(id_priv, CMA_ADDR_QUERY, CMA_ADDR_RESOLVED)) {
+		mutex_unlock(&lock);
+		goto out;
+	}
+
+	if (!status && !id_priv->cma_dev)
 		status = cma_acquire_dev(id_priv);
+	mutex_unlock(&lock);
 
 	if (status) {
-		if (!cma_comp_exch(id_priv, CMA_ADDR_QUERY, CMA_ADDR_BOUND))
+		if (!cma_comp_exch(id_priv, CMA_ADDR_RESOLVED, CMA_ADDR_BOUND))
 			goto out;
 		event = RDMA_CM_EVENT_ADDR_ERROR;
 	} else {
-		if (!cma_comp_exch(id_priv, CMA_ADDR_QUERY, CMA_ADDR_RESOLVED))
-			goto out;
 		memcpy(&id_priv->id.route.addr.src_addr, src_addr,
 		       ip_addr_size(src_addr));
 		event = RDMA_CM_EVENT_ADDR_RESOLVED;
@@ -1747,8 +1758,11 @@ int rdma_bind_addr(struct rdma_cm_id *id
 
 	if (!cma_any_addr(addr)) {
 		ret = rdma_translate_ip(addr, &id->route.addr.dev_addr);
-		if (!ret)
+		if (!ret) {
+			mutex_lock(&lock);
 			ret = cma_acquire_dev(id_priv);
+			mutex_unlock(&lock);
+		}
 		if (ret)
 			goto err;
 	}





[openib-general] [PATCH] 2.6.19 cma: fix typo

2006-08-31 Thread Sean Hefty
Comma should be semi-colon

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
Please queue for 2.6.19

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d6f99d5..bf20410 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -265,7 +265,7 @@ static int cma_acquire_ib_dev(struct rdm
 	union ib_gid gid;
 	int ret = -ENODEV;
 
-	ib_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid),
+	ib_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid);
 
 	mutex_lock(&lock);
 	list_for_each_entry(cma_dev, &dev_list, list) {





[openib-general] OFED 1.1-rc3 is ready

2006-08-31 Thread Tziporet Koren
Hi,

OFED 1.1-RC3 is available on 
https://openib.org/svn/gen2/branches/1.1/ofed/releases/
File: OFED-1.1-rc3.tgz
Please report any issues in bugzilla http://openib.org/bugzilla/

Schedule reminder:
==================
Next milestones:
RC4 is planned for 7-Sep. It should include critical bug fixes only.
Final release will be on 11 or 12 Sep.

Owners - please update release notes for RC4.

Tziporet & Vlad
----------------------------------------------------------------------

Release details:

Build_id:
OFED-1.1-rc3

openib-1.1 (REV=9203)
# User space
https://openib.org/svn/gen2/branches/1.1/src/userspace
Git:
ref: refs/heads/ofed_1_1
commit 338e942a4ae10d62f2632e6292f85bb1b15d154c

# MPI
mpi_osu-0.9.7-mlx2.2.0.tgz
openmpi-1.1.1-1.src.rpm
mpitests-2.0-0.src.rpm


OS support:
===========
Novell:
- SLES 9.0 SP3
- SLES10
Redhat:
- Redhat EL4 up3
- Redhat EL4 up4
kernel.org:
- Kernel 2.6.17

Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped from the list.
We keep the backport patches for these OSes and make sure OFED compiles and
loads properly, but we will not run a full QA cycle on them.

Systems:

* x86_64
* x86
* ia64
* ppc64

Main changes from OFED-1.1-rc2:
===============================
1. Added ehca (IBM) driver. This driver can be compiled on kernel 2.6.18 only
3. Open MPI version update to openmpi-1.1.1-1
4. Core: Huge pages registration is supported
5. IPoIB high availability script supports multicast groups
6. RHEL4 up4 is now supported
7. SDP: fixed connection refused problem; get peer name working
8. libsdp: several bug fixes

Limitations and known issues:
=============================
1. SDP: For Mellanox Sinai HCAs one must use latest FW version (1.1.000).
2. SDP: Scalability issue when many connections are opened
3. SDP: If RTU packet is lost Accept call blocks even if client connected.
4. ipath driver is not supported on SLES9 SP3
5. Compilation on kernel 2.6.18-rc5 is failing - to be fixed in RC4
 

Missing features that should be completed for RC4:
==================================================
None






[openib-general] [PATCH] IB/srp: destroy and re-create QP and CQ on reconnect

2006-08-31 Thread Michael S. Tsirkin
Hello, Roland!
Please consider the following for 2.6.19.

---

From: Ishai Rabinovitz <[EMAIL PROTECTED]>

For some reason (could be a firmware problem) I got a CQ overrun in SRP.
Because of that there was a QP FATAL. Since srp_reconnect_target does not
destroy the QP, the QP FATAL persists after the reconnect.
To be able to recover from such a situation, I suggest we
destroy the CQ and the QP on every reconnect.

This also corrects a minor spec non-compliance: when srp_reconnect_target
is called, srp destroys the CM ID and resets the QP, so the new connection
is retried with the same QPN, which could theoretically lead to
stale packets (for strict spec compliance I think a QPN should not be reused
until all stale packets are flushed out of the network).

---

IB/srp: destroy/re-create QP and CQ on each reconnect.
This makes SRP more robust in presence of hardware errors
and is closer to behaviour suggested by IB spec,
reducing chance of stale packets.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===================================================================
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:23:52.000000000 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:30:48.000000000 +0300
@@ -495,10 +495,10 @@
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct ib_cm_id *new_cm_id;
-	struct ib_qp_attr qp_attr;
 	struct srp_request *req, *tmp;
-	struct ib_wc wc;
 	int ret;
+	struct ib_cq *old_cq;
+	struct ib_qp *old_qp;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	if (target->state != SRP_TARGET_LIVE) {
@@ -522,17 +522,17 @@
 	ib_destroy_cm_id(target->cm_id);
 	target->cm_id = new_cm_id;
 
-	qp_attr.qp_state = IB_QPS_RESET;
-	ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
-	if (ret)
-		goto err;
-
-	ret = srp_init_qp(target, target->qp);
-	if (ret)
+	old_qp = target->qp;
+	old_cq = target->cq;
+	ret = srp_create_target_ib(target);
+	if (ret) {
+		target->qp = old_qp;
+		target->cq = old_cq;
 		goto err;
+	}
 
-	while (ib_poll_cq(target->cq, 1, &wc) > 0)
-		; /* nothing */
+	ib_destroy_qp(old_qp);
+	ib_destroy_cq(old_cq);
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)

-- 
MST




Re: [openib-general] single rkey

2006-08-31 Thread Dotan Barak
yipee wrote:
> Hi,
>
> Is it possible for several memory registrations (using ibv_reg_mr) to have a
> single rkey?
> Can I add memory registrations to a previous rkey?
>
>
> thanks,
> y
>   
I believe that the answer is no.

Dotan




[openib-general] single rkey

2006-08-31 Thread yipee
Hi,

Is it possible for several memory registrations (using ibv_reg_mr) to have a
single rkey?
Can I add memory registrations to a previous rkey?


thanks,
y







[openib-general] [PATCH] IB/cm: do not track remote QPN in timewait state

2006-08-31 Thread Michael S. Tsirkin
Roland, please queue for 2.6.19.

---

IB/cm: fix spurious rejects with bogus stale connection syndrome.
CM should not track remote QPN in TimeWait, since QP is not connected.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>
Acked-by: Sean Hefty <[EMAIL PROTECTED]>

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index f85c97f..e270311 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -679,6 +679,8 @@ static void cm_enter_timewait(struct cm_
 {
int wait_time;
 
+   cm_cleanup_timewait(cm_id_priv->timewait_info);
+
/*
 * The cm_id could be destroyed by the user before we exit timewait.
 * To protect against this, we search for the cm_id after exiting


-- 
MST




Re: [openib-general] [PATCH] IB/perftest: Fix get_median, size of delta, usage(), worst latency

2006-08-31 Thread Michael S. Tsirkin
Quoting r. CH Ganapathi <[EMAIL PROTECTED]>:
> o Fix get_median.
> o Change usage() in write_bw.c to match the actual default of exchanges.
> o Fix worst latency in read_lat.c.
> o Allocate only the necessary (iters - 1) elements for delta.
> 
> Signed-off-by: Ganapathi CH <[EMAIL PROTECTED]>

Thanks, applied.

-- 
MST




Re: [openib-general] [PATCH] perftest: enhancement to rdma_lat to allow use of RDMA CM

2006-08-31 Thread Michael S. Tsirkin
Quoting r. Pradipta Kumar Banerjee <[EMAIL PROTECTED]>:
> Subject: [PATCH] perftest: enhancement to rdma_lat to allow use of RDMA CM
> 
> Hi Michael,
> This patch contains changes to the rdma_lat.c to allow use of RDMA CM.
> This has been successfully tested with Ammasso iWARP cards, IBM eHCA and 
> mthca IB
> cards.
> 
> Summary of changes
> 
> # Added an option (-c|--cma) to enable use of RDMA CM
> # Added a new structure (struct pp_data) containing the user parameters as 
> well
>   as other data required by most of the routines. This makes it convenient to
>   pass the parameters between various routines.
> # Outputs to stdout/stderr are prefixed with the process-id. This helps to
>   sort the output when multiple servers/clients are run from the same machine
> 
> Signed-off-by: Pradipta Kumar Banerjee <[EMAIL PROTECTED]>

Thanks, applied.

-- 
MST




Re: [openib-general] [PATCH] IB/perftest: Fix get_median, size of delta, usage(), worst latency

2006-08-31 Thread Michael S. Tsirkin
3) Worst latency is delta[iters - 2] in read_lat.c, not delta[iters - 3].

>> could you explain this last bit please?

-- 
MST




[openib-general] [PATCH] IB/perftest: Fix get_median, size of delta, usage(), worst latency

2006-08-31 Thread CH Ganapathi
Hi,

1) When iters (exchanges) is even, delta has an odd number of elements, and
when iters is odd, delta has an even number of elements. Hence, when
(iters - 1) is passed, get_median() uses incorrect indexes to find the median.
For example:
  When iters = 2, get_median returns median = (delta[0] + delta[-1])/2 when it
 should have been median = delta[0].
  When iters = 3, get_median returns median = delta[1] when it should
 have been median = (delta[0] + delta[1])/2.

2) The array delta requires only (iters - 1) size to be allocated.
3) Worst latency is delta[iters - 2] in read_lat.c, not delta[iters - 3].
4) usage() in write_bw.c incorrectly states default exchanges as 1000.
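
Points 1) and 3) can be checked with a small standalone sketch. It assumes
that delta holds iters - 1 samples already sorted in ascending order, so the
largest one sits at index iters - 2. The cycles_t typedef and the
worst_latency() helper are assumptions made for this illustration (perftest
has its own cycles_t definition), and the parity test is written as
n % 2 == 0, which is equivalent to the patch's (n - 1) % 2 for n >= 1.

```c
#include <assert.h>

/* Assumption for this sketch: perftest defines its own cycles_t. */
typedef unsigned long long cycles_t;

/* Median of n samples that are already sorted in ascending order. */
static cycles_t get_median(int n, const cycles_t delta[])
{
	assert(n > 0);
	if (n % 2 == 0)
		/* Even count: average the two middle samples. */
		return (delta[n / 2] + delta[n / 2 - 1]) / 2;
	/* Odd count: the single middle sample. */
	return delta[n / 2];
}

/* Hypothetical helper: largest of the (iters - 1) sorted samples. */
static cycles_t worst_latency(int iters, const cycles_t delta[])
{
	assert(iters >= 2);
	return delta[iters - 2];
}
```

With iters = 2 there is one sample and the median is delta[0]; with
iters = 3 there are two samples and the median is their average, which
matches the two cases listed in 1).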

Thanks,
Ganapathi
Novell Inc.

The following patch includes:

o Fix get_median.
o Change usage() in write_bw.c to match the actual default of exchanges.
o Fix worst latency in read_lat.c.
o Allocate only the necessary (iters - 1) elements for delta.

Signed-off-by: Ganapathi CH <[EMAIL PROTECTED]>

Index: userspace/perftest/read_lat.c
===
--- userspace/perftest/read_lat.c   (revision 9196)
+++ userspace/perftest/read_lat.c   (working copy)
@@ -568,7 +568,7 @@
  */
 static inline cycles_t get_median(int n, cycles_t delta[])
 {
-   if (n % 2)
+   if ((n - 1) % 2)
return(delta[n / 2] + delta[n / 2 - 1]) / 2;
else
return delta[n / 2];
@@ -591,7 +591,7 @@
cycles_t median;
unsigned int i;
const char* units;
-   cycles_t *delta = malloc(iters * sizeof *delta);
+   cycles_t *delta = malloc((iters - 1) * sizeof *delta);
 
if (!delta) {
perror("malloc");
@@ -627,7 +627,7 @@
median = get_median(iters - 1, delta);
printf("%7d%d%7.2f%7.2f  %7.2f\n",
   size,iters,delta[0] / cycles_to_units ,
-  delta[iters - 3] / cycles_to_units ,median / cycles_to_units );
+  delta[iters - 2] / cycles_to_units ,median / cycles_to_units );
 
free(delta);
 }
Index: userspace/perftest/write_bw.c
===
--- userspace/perftest/write_bw.c   (revision 9196)
+++ userspace/perftest/write_bw.c   (working copy)
@@ -509,7 +509,7 @@
printf("  -s, --size= size of message to exchange (default 65536)\n");
printf("  -a, --all Run sizes from 2 till 2^23\n");
printf("  -t, --tx-depth=  size of tx queue (default 100)\n");
-   printf("  -n, --iters=   number of exchanges (at least 2, default 1000)\n");
+   printf("  -n, --iters=   number of exchanges (at least 2, default 5000)\n");
printf("  -b, --bidirectional   measure bidirectional bandwidth (default unidirectional)\n");
printf("  -V, --version display version number\n");
 }
Index: userspace/perftest/rdma_lat.c
===
--- userspace/perftest/rdma_lat.c   (revision 9196)
+++ userspace/perftest/rdma_lat.c   (working copy)
@@ -516,7 +516,7 @@
  */
 static inline cycles_t get_median(int n, cycles_t delta[])
 {
-   if (n % 2)
+   if ((n - 1) % 2)
return (delta[n / 2] + delta[n / 2 - 1]) / 2;
else
return delta[n / 2];
@@ -538,7 +538,7 @@
cycles_t median;
unsigned int i;
const char* units;
-   cycles_t *delta = malloc(iters * sizeof *delta);
+   cycles_t *delta = malloc((iters - 1) * sizeof *delta);
 
if (!delta) {
perror("malloc");
Index: userspace/perftest/send_lat.c
===
--- userspace/perftest/send_lat.c   (revision 9196)
+++ userspace/perftest/send_lat.c   (working copy)
@@ -678,7 +678,7 @@
  */
 static inline cycles_t get_median(int n, cycles_t delta[])
 {
-   if (n % 2)
+   if ((n - 1) % 2)
return(delta[n / 2] + delta[n / 2 - 1]) / 2;
else
return delta[n / 2];
@@ -701,7 +701,7 @@
cycles_t median;
unsigned int i;
const char* units;
-   cycles_t *delta = malloc(iters * sizeof *delta);
+   cycles_t *delta = malloc((iters - 1) * sizeof *delta);
 
if (!delta) {
perror("malloc");
Index: userspace/perftest/write_lat.c
===
--- userspace/perftest/write_lat.c  (revision 9196)
+++ userspace/perftest/write_lat.c  (working copy)
@@ -579,7 +579,7 @@
  */
 static inline cycles_t get_median(int n, cycles_t delta[])
 {
-   if (n % 2)
+   if ((n - 1) % 2)
return(delta[n / 2] + delta[n / 2 - 1]) / 2;
else
return delta[n / 2];
@@ -602,7 +602,7 @@
cycles_t median;
unsigned int i;
const char* u

Re: [openib-general] [PATCH] libibcm: Need to include stddef.h in cm.c for SLES10 compilations

2006-08-31 Thread Dotan Barak
Sean Hefty wrote:
> Jack Morgenstein wrote:
>   
>> Fix compilation on SLES10:
>> cm.c uses offsetof, so it must include stddef.h
>> 
>
> Thanks - committed in 9150.
>
>   

I checked this libibcm with a multithreaded test (qp_test) and it is 
working with no problems.

thanks
Dotan




Re: [openib-general] ibv_poll_cq

2006-08-31 Thread Dotan Barak
Sunil Patil wrote:
> I am using socket based communication for exchanging initial 
> information such as lid, qpn, psn; in fact, more or less the same code 
> that is there in the examples. Is there any CM based example that I 
> can look at?
>  
> Regards,
> John
Some libibcm examples are available at:
https://openib.org/svn/gen2/trunk/src/userspace/libibcm/examples
Anyway, if you are using sockets, you should sync between the two sides 
before you use the QPs (sync between them after they are both in at 
least the RTR state).
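
One way such a sync could look over an already connected stream socket is
sketched below. This is only an assumption about how the exchange might be
done, not code from the examples: sync_with_peer() and the one-byte 'R'
token are hypothetical names for this illustration. Each side would call it
after its own QP has reached at least RTR, and start posting sends only
after it returns 0.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: exchange a one-byte "ready" token with the peer.
 * Returns 0 once the peer's token has arrived, -1 on any error.
 */
static int sync_with_peer(int fd)
{
	char token = 'R';
	char reply = 0;
	ssize_t n;

	assert(fd >= 0);

	/* Tell the peer that the local QP is ready; retry if interrupted. */
	do {
		n = send(fd, &token, 1, 0);
	} while (n < 0 && errno == EINTR);
	if (n != 1)
		return -1;

	/* Block until the peer reports the same. */
	do {
		n = recv(fd, &reply, 1, 0);
	} while (n < 0 && errno == EINTR);
	if (n != 1)	/* 0 means the peer closed the connection */
		return -1;

	return reply == token ? 0 : -1;
}
```

Both sides send before they receive, so neither blocks waiting for the
other to go first; a single byte always fits in the socket buffer.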

Dotan




Re: [openib-general] ibv_poll_cq

2006-08-31 Thread Sunil Patil
I am using socket based communication for exchanging initial information
such as lid, qpn, psn; in fact, more or less the same code that is there in
the examples. Is there any CM based example that I can look at?
 
Regards,
John 
On 8/31/06, Dotan Barak <[EMAIL PROTECTED]> wrote:
> Hi.
>
> john t wrote:
> > Hi Dotan
> >
> > Is there a way to know if the two QPs (local and remote) are in sync
> > or to wait for them to get in sync and then do the data transfer.
> >
> > I think in my case it is more like one QP is sending the message but
> > the other end (receiver) is not in RTR state at that time (since
> > sender and receiver are implemented as threads, maybe the receiver
> > thread on the other machine is getting scheduled very late).
> >
> > Is there a way where I can specify infinite retry_count/timeout or
> > find out if remote QP is in RTR state (or error state) and only then
> > do the actual data transfer.
>
> Sorry, but the answer is no: there isn't any way for a local QP to know
> the state of the remote QP.
> This is exactly the role of the CM: to sync between the two QPs and to
> move the various attributes between the two sides.
>
> How do you connect the two QPs?
> (Are you using the CM or a socket based communication?)
>
> Dotan


Re: [openib-general] ibv_poll_cq

2006-08-31 Thread Dotan Barak
Hi.

john t wrote:
> Hi Dotan
>  
> Is there a way to know if the two QPs (local and remote) are in sync 
> or to wait for them to get in sync and then do the data transfer.
>  
> I think in my case it is more like one QP is sending the message but 
> the other end (receiver) is not in RTR state at that time (since 
> sender and receiver are implemented as threads, maybe the receiver 
> thread on the other machine is getting scheduled very late).
>  
> Is there a way where I can specify infinite retry_count/timeout or 
> find out if remote QP is in RTR state (or error state) and only then 
> do the actual data transfer.
>  
Sorry, but the answer is no: there isn't any way for a local QP to know 
the state of the remote QP.
This is exactly the role of the CM: to sync between the two QPs and to 
move the various attributes between the two sides.

How do you connect the two QPs?
(Are you using the CM or a socket based communication?)

Dotan
