We have experimented with various configuration options and the build does
complete, depending on what config options are chosen. So, yes we do get
the kernel modules built.
Basic question: how does one get the corresponding user-level libraries and
scripts that go with OFED-3.2? Is that still in
Here are the steps that we followed and the errors encountered:
/home/user/compat-rdma # git clone git://git.openfabrics.org/compat-rdma/linux-3.2.git
/home/user/compat-rdma # git clone git://git.openfabrics.org/compat-rdma/compat.git
/home/user/compat-rdma # git clone
Some customers have seen strange behavior with the tcp_mem settings
in /sbin/ib_ipoib_sysctl. The problems are similar to what has been reported
at:
http://lists.linbit.com/pipermail/drbd-user/2009-September/012711.html
Essentially a setting of:
/sbin/sysctl -q -w net.ipv4.tcp_mem=16777216
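For context, the kernel measures net.ipv4.tcp_mem thresholds in pages, not bytes. The sketch below converts a page count to MiB so the size of such a setting is easier to judge. The helper name pages_to_mib and the 4096-byte default page size are assumptions for illustration; they are not part of ib_ipoib_sysctl.

```shell
#!/bin/sh
# Convert a tcp_mem threshold (given in pages) to MiB.
# pages_to_mib is a hypothetical helper; the 4096-byte default page size
# is an assumption and differs on some platforms (check getconf PAGESIZE).
pages_to_mib() {
    pages=$1
    page_size=${2:-4096}
    echo $(( $pages * $page_size / 1048576 ))
}

pages_to_mib 16777216   # prints 65536 (MiB, i.e. 64 GiB with 4 KiB pages)
```

On a live system the page size can be passed explicitly, for example `pages_to_mib 16777216 "$(getconf PAGESIZE)"`.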
uDAPL users confirmed that the fix is available and bug# 2026 is now
closed.
Thanks
Pradeep
prad...@us.ibm.com
Is it the intent that the RoCE implementation in OFED-1.5.1 corresponds to the
RoCE supplement to the IB spec dated April 6th 2010,
excepting bugs of course? Or are there known deviations from the spec?
Pradeep
___
ewg mailing list
Tom Ammon wrote:
A widely-supported method for collecting statistics from ethernet
switching devices is SNMP. As Woody said, you wouldn't manage your
ethernet network any differently than you do now. In addition many many
tools have been written to make use of SNMP counters, so you won't have
With IBoE/RoCEE, the traditional SM in IB clusters is not needed. Most of the
current IB tools rely on the SM and PM to get packet and error statistics and
so on. These won't be applicable with IBoE/RoCEE. netstat will have no value
since the kernel has been bypassed. So, how does one monitor
We are trying to run openMPI with OFED-1.5 on the 2.6.31-rt11-preempt-rt kernel
and see the following errors:
[[45393,1],8][../../../../../ompi/mca/btl/openib/btl_openib_component.c:2951:handle_wc]
from elm3b107 to: elm3b17 error polling HP CQ with status WORK REQUEST FLUSHED
ERROR status number 5
Steve Wise wrote:
This patch works. It also backports cleanly to ofed-1.5.1/RH5.3.
Acked-by: Steve Wise sw...@opengridcomputing.com
Steve.
Steve, Was this tested against both iWARP and IB?
Thanks
Pradeep
Steve Wise wrote:
Sean, can you try openmpi? It fails for me, and yet ucmatose succeeds.
I don't understand the difference yet...
Sean Hefty wrote:
On my OFED 1.4.1 RHEL4u6 systems, rdma_bind_addr() fails when attempting to
bind to 127.0.0.1 per the email I sent Friday:
Jeff Squyres wrote:
On Feb 8, 2010, at 7:30 PM, Pradeep Satyanarayana wrote:
elm3b199:/usr/lib # /usr/mpi/gcc/openmpi-1.4.1/bin/mpirun -np 2 --bynode --mca btl_openib_cpc_include rdmacm ring
--
mpirun was unable
ewg-boun...@lists.openfabrics.org wrote on 11/22/2009 02:36:32 AM:
Pradeep Satyanarayana wrote:
Roland Dreier wrote:
Thanks... in any case I applied all 9 of the patches in this series.
Thanks for pulling all this together.
Sean, Thanks a lot for pulling it all together. Can we
Roland Dreier wrote:
Thanks... in any case I applied all 9 of the patches in this series.
Thanks for pulling all this together.
Sean, Thanks a lot for pulling it all together. Can we consider including this
into OFED-1.5 too?
Pradeep
Tziporet Koren wrote:
Pradeep Satyanarayana wrote:
This crash was originally reported against Rhel5.4. However, one can
recreate this crash quite easily in OFED-1.5 too.
Can you open a bugzilla too?
I have opened bug# 1821
https://bugs.openfabrics.org/show_bug.cgi?id=1821
Pradeep
Shiri Franchi wrote:
Hi,
I tried to reproduce on RH5 up4 with ping and iperf and it did not
happen.
Are you sure you used modprobe -t ib_ipoib or maybe modprobe -r
bonding?
Thanks,
Shiri
Hi Shiri,
I used modprobe -r ib_ipoib. One of the pre-requisites to create
this crash is to do
Shiri Franchi wrote:
Hi,
I did it exactly as you described:
1. ifdown ib0
2. ifdown ib1
3. modprobe -r ib_ipoib
And it did not reproduce.
It is a race that you may not be recreating. I have tried
this across different HCAs and platforms (x86_64, ppc64).
Seems to recreate almost at
This crash was originally reported against Rhel5.4. However, one can recreate
this crash quite easily in OFED-1.5 too.
The steps to recreate the crash are as follows:
1. Run traffic (I used ping) on the IB interfaces through the bond master
2. ifdown ib0
3. ifdown ib1
4. modprobe -r ib_ipoib
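The recreation steps above can be strung together as a small script. This is only a sketch: ib0, ib1 and ib_ipoib come from the steps, while the PEER address and the RUN switch are assumptions made for illustration. By default it only prints the commands, since running them for real is expected to bring the machine down.

```shell
#!/bin/sh
# Sketch of the recreation steps. PEER and RUN are hypothetical names,
# not part of the original report.
PEER=${PEER:-192.0.2.1}   # assumed address reachable through the bond master
RUN=${RUN:-0}             # 0 = print the commands only, 1 = execute them

run() {
    if [ "$RUN" = 1 ]; then "$@"; else echo "+ $*"; fi
}

repro_steps() {
    # 1. Run traffic on the IB interfaces through the bond master
    if [ "$RUN" = 1 ]; then
        ping "$PEER" >/dev/null 2>&1 &
    else
        echo "+ ping $PEER &"
    fi
    run ifdown ib0            # 2. take the first slave down
    run ifdown ib1            # 3. take the second slave down
    run modprobe -r ib_ipoib  # 4. unload the IPoIB module
}

repro_steps
```

Since the failure is described as a race, a single pass may not trigger it; on a disposable test system the script could be invoked with RUN=1 and repeated.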
ewg-boun...@lists.openfabrics.org wrote on 09/21/2009 09:30:45 AM:
EWG/OFED the meeting minutes for Sep 21, 2009
Meeting summary:
* This was a short meeting on OFED 1.5 status
* We wish to have OFED 1.5 RC1 this week
* Must for RC1 is to resolve all compilation issues
Tziporet, Vlad,
Who will be able to help us with this? Need to include the correct level of
librdmacm.
Is it reasonable to expect that this will get done before the next beta
release?
Thanks!
Pradeep
prad...@us.ibm.com
ewg-boun...@lists.openfabrics.org wrote on 09/17/2009 08:54:42 AM:
On
I am still seeing the following problem trying to install today's OFED-1.5
build on Sles11 (ppc64):
gcc -m64
-Wp,-MD,/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/net/sunrpc/.svc.o.d
-nostdinc -isystem /usr/lib64/gcc/powerpc64-suse-linux/4.3/include
-D__KERNEL__ \
-D__OFED_BUILD__ \,
-include
ewg-boun...@lists.openfabrics.org wrote on 09/10/2009 01:23:08 AM:
On Wed, 9 Sep 2009 16:47:14 +0300
Tziporet Koren tzipo...@mellanox.co.il wrote:
Hi,
Hi,
I wish to update all that we plan to release OFED 1.5 beta tomorrow
I know it's a week later than what we planned but we waited
Tziporet Koren wrote:
Pradeep Satyanarayana wrote:
Since the RDMA_CM with support for IPv6 was dropped from OFED-1.4,
(and is now upstream) can one expect that it will be in OFED-1.5?
If it's in 2.6.30 then we already have it
If it's in 2.6.31 we will need to take the code
Can you let
As mentioned on the OFED conference call, I downloaded yesterday's build
and did confirm that the bug was fixed by running the Connectathon tests on
a couple of ppc64 machines.
The tests ran to completion without any problems.
Steve, Jon, Thanks for your help in resolving the issues.
Pradeep
Hello Mike,
Could this be a firmware issue? I presume you are using ConnectX; is that
correct?
We have not seen this with Rhel5.3/OFED-1.4.1 rc4 in our tests.
Pradeep
prad...@us.ibm.com
Mike
I downloaded a recent version of Roland's git tree and tried IPoIB bonding.
Fail over does not seem to be working at all. I have tried OFED 1.3.2 on a
Rhel5 derivative and that (fail over) worked as expected.
Is this a known issue? Given that OFED 1.4 will be in sync with main line
kernel, is
Eli Cohen wrote:
On Sun, 2008-02-17 at 11:21 +0200, Or Gerlitz wrote:
Thanks, this sheds more light on the solution but I still cannot
understand how the upstream code can live without the QPs getting
destroyed. Or does the bug exist there too? If yes, I would recommend to
reshape the
Roland Dreier wrote:
Hello Eli, I have already submitted this patch to mainline. I will follow
up with Roland to get this merged there.
I didn't see the submission... can you resend?
Roland,
Here is the link to the patch sent previously:
Roland Dreier wrote:
Here is the link to the patch sent previously:
http://lists.openfabrics.org/pipermail/general/2008-February/046463.html
OK, applied, although that link points to an HTML-mangled version of
the patch, and I also had to figure out why we needed that change and
write
of the patch
submitted yesterday and is split up as per Eli's request.
Signed-off-by: Pradeep Satyanarayana [EMAIL PROTECTED]
---
--- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:46:03.0 -0500
+++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02
.
Tested on ppc64 machines with ehca and mthca.
Signed-off-by: Pradeep Satyanarayana [EMAIL PROTECTED]
---
--- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-11 14:28:47.0 -0500
+++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:44:07.0
Or Gerlitz wrote:
Eli Cohen wrote:
could you send as distinct patches according to what they fix?
Pradeep Satyanarayana wrote:
2. Change retry counts to small values. This helps interoperability
between ehca and mthca.
Indeed, I sent a note on that now to the general list, lets discuss
Eli Cohen wrote:
Pradeep,
could you send as distinct patches according to what they fix?
Thanks.
Hello Eli,
Sure I will do that. And I will drop the change due to the UD split CQ.
Pradeep
Eli Cohen wrote:
This problem was seen on an ehca that supports SRQ.
Please reply with how many scatter entries ehca supports when working
in SRQ mode. Also any piece of info I might need to try and mimic ehca
behaviour on Mellanox devices. I would appreciate it if you could repeat the
exact
Tziporet Koren wrote:
Shirley Ma wrote:
Thanks Tziporet. We will test it right after it's out.
You can start using the latest build -
http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz
Tziporet
I have downloaded today's build mentioned above. I am still seeing
Pradeep Satyanarayana wrote:
Tziporet Koren wrote:
Shirley Ma wrote:
Thanks Tziporet. We will test it right after it's out.
You can start using the latest build -
http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz
Tziporet
I have downloaded today's build
Pradeep Satyanarayana wrote:
Eli Cohen wrote:
Pradeep,
Can you check if this is resolved?
On 2/4/08, Pradeep Satyanarayana [EMAIL PROTECTED] wrote:
I pulled today's (Feb 4th) OFED build and saw the following Oops while
touch testing
on ehca1 on a 2.6.24 kernel.
snip
NIP
Pradeep, Shir
We tried to apply this patch to OFED 1.3 and it breaks some of the
backports.
Please use the makedist script on the ofa server (there is an
explanation in the developers' Wiki) and fix this so we can try to
apply it.
Vlad will help you later today too.
Thanks,
Tziporet
or
pointers to resolve this issue would be appreciated. Thanks!
Pradeep
---BeginMessage---
Pradeep Satyanarayana wrote:
Some HCAs like ehca do not natively support srq. This patch would enable IPoIB
CM for such HCAs. This patch has been accepted into Roland's for-2.6.25 git
tree for about 3
Some HCAs like ehca do not natively support srq. This patch would enable IPoIB
CM for such HCAs. This patch has been accepted into Roland's for-2.6.25 git
tree for about 3 months now.
Please consider including this patch into OFED 1.3.
Signed-off-by: Pradeep Satyanarayana [EMAIL PROTECTED
Some HCAs like ehca do not natively support srq. In order to enable IPoIB CM
for such HCAs, I have developed a nonsrq patch. This patch has been accepted
into Roland's for-2.6.25 git tree for about 3 months now.
I am working on porting that to OFED 1.3 and it will take me at least several
days
Pradeep Satyanarayana wrote:
Some HCAs like ehca2 support fewer than 16 SG entries. Currently IPoIB/CM
implicitly assumes all HCAs will support 16 SG entries of 4K pages for 64K
MTUs. This patch removes that restriction.
This patch continues to use order 0 allocations and enables
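The 16-entry assumption mentioned above follows from simple arithmetic: a 64K buffer built from 4K pages needs 16 scatter/gather entries. Below is a small sketch of that calculation. The helper name sg_entries_needed is made up for illustration and is not taken from the patch.

```shell
#!/bin/sh
# Number of page-sized SG entries needed to hold one buffer of a given size.
# sg_entries_needed is a hypothetical helper, not code from the patch.
sg_entries_needed() {
    buf_bytes=$1
    page_bytes=$2
    # Round up so a partially filled last page still gets an entry.
    echo $(( ($buf_bytes + $page_bytes - 1) / $page_bytes ))
}

sg_entries_needed 65536 4096   # prints 16
```

An HCA that advertises fewer SG entries than this result cannot receive a full 64K buffer in one work request, which is the restriction the message describes.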
I saw some specific known issues and limitations wrt ConnectX in OFED
1.2c. Is ConnectX officially supported in OFED 1.2c, or will that be OFED
1.3?
Pradeep
[EMAIL PROTECTED]