[ewg] Which IPv6 multicast group to be used for mckey ?
Hi,

I am new to InfiniBand technology, so I do not have much exposure to it. I have installed OFED-1.5 on my machine. I was trying to run the mckey application with the following *two different multicast groups*:

mckey -M FF10:0:0:0:0:0:0:B -b 10.10.10.1 (receiver)
mckey -M FF10:0:0:0:0:0:0:C -b 10.10.10.2 -s (sender)

The two multicast groups are different, yet data sent by the sender is still received by the receiver on the other machine. Why does this happen? Is there any special format of IPv6 multicast groups for InfiniBand?

Thanks in advance,
Vivek.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Which IPv6 multicast group to be used for mckey ?
Hi,

I searched for this and found the following link:
http://www.mail-archive.com/linux-r...@vger.kernel.org/msg00953.html

According to that link, RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs.

1) Does this mean that mckey works only with multicast addresses starting with FF1x:A01B?

2) I did some testing and found that if I use the multicast address FF12:A01B:0:0:0:0:0:A with mckey, the multicast join fails with the following error:

#mckey -M FF12:A01B:0:0:0:0:0:A -b fe80::202:c903:0:d1e1
mckey: starting server
mckey: joining
mckey: event: RDMA_CM_EVENT_MULTICAST_ERROR, error: -22
test complete
return status 0

mckey fails if the x digit in FF1x:A01B: is 2. For any value of x other than 2, mckey works fine. Can anyone please tell me the reason for this?

Thanks in advance,
Vivek

On Thu, Mar 11, 2010 at 1:30 PM, Vivek Satpute vivekonlin...@gmail.com wrote:
> Hi,
>
> I am new to InfiniBand technology, so I do not have much exposure to it. I have installed OFED-1.5 on my machine. I was trying to run the mckey application with the following *two different multicast groups*:
>
> mckey -M FF10:0:0:0:0:0:0:B -b 10.10.10.1 (receiver)
> mckey -M FF10:0:0:0:0:0:0:C -b 10.10.10.2 -s (sender)
>
> The two multicast groups are different, yet data sent by the sender is still received by the receiver on the other machine. Why does this happen? Is there any special format of IPv6 multicast groups for InfiniBand?
>
> Thanks in advance,
> Vivek.
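The rule quoted from the linked message can be made concrete with a small check. The sketch below is an illustration only: it assumes "FF1x:A01B::/32" means that the first 32 bits match FF1x:A01B for any scope digit x, and the function name and return labels are hypothetical. It does not model what RDMA CM does with addresses outside that prefix, nor why the join fails for x = 2.

```python
import ipaddress

# Assumption: the scope digit 'x' in FF1x is ignored when matching the prefix.
MGID_MASK = 0xFFF0FFFF
MGID_PREFIX = 0xFF10A01B


def classify_mcast_addr(text):
    """Classify an IPv6 address string according to the quoted rule.

    Returns "zero" for the all-zero address, "mgid" for addresses whose
    first 32 bits match FF1x:A01B, and "other" for everything else.
    """
    value = int(ipaddress.IPv6Address(text))
    if value == 0:
        return "zero"
    top32 = value >> 96  # first 32 bits of the 128-bit address
    if (top32 & MGID_MASK) == MGID_PREFIX:
        return "mgid"
    # How these addresses are handled is not described in this thread.
    return "other"


if __name__ == "__main__":
    for addr in ("::", "FF12:A01B:0:0:0:0:0:A", "FF10:0:0:0:0:0:0:B"):
        print(addr, "->", classify_mcast_addr(addr))
```

Under this reading, neither FF10:0:0:0:0:0:0:B nor FF10:0:0:0:0:0:0:C from the first message carries the A01B prefix, so both fall into the "other" class.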
Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch
I am handling this.

On Thu, Mar 11, 2010 at 09:54:34AM +0200, Tziporet Koren wrote:
> On 3/10/2010 9:05 PM, Jason Gunthorpe wrote:
>> On Wed, Mar 10, 2010 at 08:57:17PM +0200, Eli Cohen wrote:
>>> On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
>>>> I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde, alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.
>>>
>>> Can you elaborate on this?
>>
>> +
>> ++ /* Work around broken ip_ib_mc_map */
>> ++ if (mclist->dmi_addrlen == INFINIBAND_ALEN) {
>> ++ mclist->dmi_addr[5] = 0x10 | (dev->broadcast[5] & 0xF);
>> ++ mclist->dmi_addr[8] = dev->broadcast[8];
>> ++ mclist->dmi_addr[9] = dev->broadcast[9];
>> ++ }
>>
>> 5 in the dmi_addr is the scope byte. The old patch:
>>
>> -+ /* Add in the P_Key */
>> -+ mgid.raw[4] = (priv->pkey >> 8) & 0xff;
>> -+ mgid.raw[5] = priv->pkey & 0xff;
>> -+
>>
>> Only includes the dmi_addr bytes 8 and 9. This is also a small bug. The above should read something like:
>>
>> mgid.raw[1] = 0x10 | (dev->broadcast[5] & 0xF);
>> mgid.raw[4] = dev->broadcast[8];
>> mgid.raw[5] = dev->broadcast[9];
>>
>> Jason
>
> Eli
> Can you take care of it now, or do you need the complete patch from Jason?
>
> Vlad
> Please revert the patch that is causing the problem.
>
> Tziporet
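The byte offsets in this exchange line up if one assumes the usual 20-byte IPoIB hardware address layout (4 leading bytes followed by the 16-byte GID), so that hardware-address byte n corresponds to MGID byte n - 4. A minimal sketch of the fixup Jason describes, written in Python for illustration only; the function name and the example addresses are hypothetical and not taken from the patch:

```python
INFINIBAND_ALEN = 20  # length of an IPoIB hardware address in bytes
GID_LEN = 16          # length of an MGID in bytes


def fixup_mgid(mgid, broadcast):
    """Return a copy of `mgid` with scope and P_Key taken from `broadcast`.

    mgid:      16-byte multicast GID
    broadcast: 20-byte IPoIB broadcast hardware address
    """
    if len(mgid) != GID_LEN or len(broadcast) != INFINIBAND_ALEN:
        raise ValueError("unexpected address length")
    fixed = bytearray(mgid)
    # Hardware-address byte 5 is MGID byte 1: flags nibble 1, scope from broadcast.
    fixed[1] = 0x10 | (broadcast[5] & 0x0F)
    # Hardware-address bytes 8 and 9 are MGID bytes 4 and 5: the P_Key.
    fixed[4] = broadcast[8]
    fixed[5] = broadcast[9]
    return bytes(fixed)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    bcast = bytes.fromhex("00ffffffff12401bffff000000000000ffffffff")
    mgid = bytes.fromhex("ff10401b" + "00" * 12)
    print(fixup_mgid(mgid, bcast).hex())
```

Copying the scope nibble as well as the P_Key bytes is the point of the exchange: the old hunk only set MGID bytes 4 and 5.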
[ewg] ofa_1_5_kernel 20100311-0200 daily build status
This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.18-186.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-89.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
[ewg] [ANNOUNCE] OFED 1.5.1 rc4 release is available
OFED 1.5.1-rc4 is available

Notes:
The tarball is available on:
http://www.openfabrics.org/downloads/OFED/ofed-1.5.1/OFED-1.5.1-rc4.tgz
To get BUILD_ID run ofed_info
Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 1.5.1

Vladimir & Tziporet

Supported Platforms and Operating Systems
-----------------------------------------
  o CPU architectures:
    - x86_64
    - x86
    - ppc64
    - ia64
  o Linux Operating Systems:
    - RedHat EL4 up7    2.6.9-78.ELsmp
    - RedHat EL4 up8    2.6.9-89.ELsmp
    - RedHat EL5 up3    2.6.18-128.el5
    - RedHat EL5 up4    2.6.18-164.el5
    - SLES10 SP2        2.6.16.60-0.21-smp
    - SLES10 SP3        2.6.16.60-0.54-smp
    - SLES11            2.6.27.19-5-default
    - OEL 4 up7         2.6.9-78.ELsmp
    - OEL 4 up8         2.6.9-89.ELsmp
    - CentOS5.3         2.6.18-128.el5
    - CentOS5.4         2.6.18-164.el5
    - Fedora Core12     2.6.31.5-127.fc12 *
    - OpenSuSE 11.2     2.6.31.5-0.1-default *
    - kernel.org        2.6.29, 2.6.30, 2.6.31 and 2.6.32 *

    * Minimal QA for these versions

Main changes from 1.5.1-rc3:
============================
1. Updated packages:
   - ibutils: ibutils-1.5.4
   - libmlx4: libmlx4-1.0-0.6.g72e73dc
       Bug fix in mlx4_create_ah
   - install.pl:
       Add '--builddir' parameter
       NFSoRDMA will not support SLES10 SPx
   - NFSoRDMA is not supported under SLES10SPx

2. Bug fixes

commit 3e2e26b64187f3d5292653b4df761dfcd1e353ea
Merge: f52992f eada57c
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date: Thu Mar 11 13:01:40 2010 +0200

    Merge remote branch 'vu/ofed_kernel_1_5' into ofed_kernel_1_5

commit eada57cc6dfcd0f779c5254a1fc354702ac41247
Author: Vu Pham (Mellanox) v...@lists.openfabrics.org
Date: Thu Mar 11 02:30:28 2010 -0800

    srp: fixing panic bug happened during manual unload ib_srp module

    Signed-off-by: Vu Pham v...@mellanox.com

commit f52992f15c05d70d17412ca92e14e6f1bf2c1ac7
Author: Yevgeny Petrilin yevge...@mellanox.co.il
Date: Wed Mar 10 18:46:55 2010 +0200

    mlx4_en: reconfigure mac address

    When the other port removes a mac address that is the same that the
    current port has, the table should be reconfigured.
    fixes bugzilla #1965

    Signed-off-by: Yevgeny Petrilin yevge...@mellanox.co.il

commit 7a1bbb340356e0489e50678f7e6d563f2f0be268
Author: Eli Cohen e...@mellanox.co.il
Date: Thu Mar 11 10:33:52 2010 +0200

    IPoIB: Fix multicast handling

    After reverting c124815 it was necessary to modify
    ipoib_mcast_addr_is_valid() so it will not filter out valid ipoib
    multicast addresses.

    Signed-off-by: Eli Cohen e...@mellanox.co.il

commit 5daec886e9a7d7baea848180c3c8dbbc7b249e79
Author: Eli Cohen e...@mellanox.co.il
Date: Thu Mar 11 09:17:21 2010 +0200

    Revert ipoib/mcast: Fix IPoIB multicast backport

    The reverted commit changes the multicast address that the kernel
    created causing resource leaks and other problems.

    This reverts commit c12481586c4ba09cb88dc2090c67fdce7c856cde.

commit efe60c7da58f9bf235eef0381aa4a93c014805aa
Merge: 3df6ee7 0ff7a6e
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date: Wed Mar 10 08:12:42 2010 +0200

    Merge branch 'ofed_kernel_1_5' of git://git.openfabrics.org/~ralphc/linux-2.6/ into ofed_kernel_1_5

commit 3df6ee73e2364080d7ac179d1ecd8c4aaf9a9e43
Merge: 7a227ee 4c29a30
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date: Wed Mar 10 08:11:37 2010 +0200

    Merge branch 'ofed_kernel_1_5' of ssh://sofa.openfabrics.org/home/ctung/scm/ofed-1.5 into ofed_kernel_1_5

commit 7a227ee270783b7f1d773939e82343fdc3e69fb4
Merge: a679ae9 8721f26
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date: Wed Mar 10 08:06:29 2010 +0200

    Merge branch 'ofed_1_5' of ssh://sofa.openfabrics.org/~swise/scm/ofed_kernel into ofed_kernel_1_5

commit 0ff7a6e94e7e98a8f10d1c01e70c6f26d776c4ee
Author: Ralph Campbell (QLogic) ral...@lists.openfabrics.org
Date: Tue Mar 9 16:09:47 2010 -0800

    IB/qib: clear symbol error counters on link UP

    Clear symbol error counters on link UP.

    Signed-off-by: Ralph Campbell ralph.campb...@qlogic.com

commit 4c29a3078cee40ee9800ade03f6f91c69202f368
Author: Chien Tung chien.tin.t...@intel.com
Date: Tue Mar 9 15:59:02 2010 -0600

    RDMA/nes: make nesadapter->phy_lock usage consistent

    nes_{read,write}_1G_phy_reg() are using phy_lock while
    nes_{read,write}_10G_phy_reg() leave that to the caller.
    Remove phy_lock from 1G routines and leave the locking to the caller.
    Add additional phy_lock calls around 1G read/write.

    Signed-off-by: Chien Tung
Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue
good debugging, applied thanks. I do worry (as Moni mentioned) that this doesn't explain why you would get send failures in this case, but the patch itself is well-explained and looks obviously correct so I think we should apply it. -- Roland Dreier rola...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html
Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue
On Thu, 2010-03-11 at 13:38 -0800, Roland Dreier wrote:
> good debugging, applied thanks. I do worry (as Moni mentioned) that this doesn't explain why you would get send failures in this case, but the patch itself is well-explained and looks obviously correct so I think we should apply it.

Well, after more testing it seems there may still be a problem. I haven't isolated it yet though. I could definitely use help reviewing the code changes.
Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue
Sorry, I was referring to my patch not Eli's.

On Thu, 2010-03-11 at 13:41 -0800, Ralph Campbell wrote:
> On Thu, 2010-03-11 at 13:38 -0800, Roland Dreier wrote:
>> good debugging, applied thanks. I do worry (as Moni mentioned) that this doesn't explain why you would get send failures in this case, but the patch itself is well-explained and looks obviously correct so I think we should apply it.
>
> Well, after more testing it seems there may still be a problem. I haven't isolated it yet though. I could definitely use help reviewing the code changes.
Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue
> Sorry, I was referring to my patch not Eli's.

Heh, I never would have said anything about your patch was obvious. I skimmed yours once but I do want to read it more carefully. Did you ever say what test case you are using to provoke the problem you're fixing?
--
Roland Dreier rola...@cisco.com
[ewg] RC4 build failure on FC12
Has anyone seen this?

Install rds-tools RPM:
Running rpm -iv /root/OFED-1.5.1-rc4/RPMS/fedora-release-11-1.noarch/x86_64/rds-tools-1.5-1.x86_64.rpm
Build ibutils RPM
Running rpmbuild --rebuild --define '_topdir /var/tmp//OFED_topdir' --define 'dist %{nil}' --target x86_64 --define '_prefix /usr' --define '_exec_prefix /usr' --define '_sysconfdir /etc' --define '_usr /usr' --define 'build_ibmgtsim 1' --define '__arch_install_post %{nil}' --define 'configure_options --with-osm=/usr ' /root/OFED-1.5.1-rc4/SRPMS/ibutils-1.5.4-1.src.rpm
Failed to build ibutils RPM
See /tmp/OFED.18913.logs/ibutils.rpmbuild.log

[r...@shuttle1 OFED-1.5.1-rc4]# tail -50 /tmp/OFED.18913.logs/ibutils.rpmbuild.log
...
ibmssh_wrap.cpp:40796: warning: deprecated conversion from string constant to 'char*'
ibmssh_wrap.cpp:40796: warning: deprecated conversion from string constant to 'char*'
if g++ -DHAVE_CONFIG_H -I. -I. -I.. -I/var/tmp/OFED_topdir/BUILD/ibutils-1.5.4/ibdm/ibdm -I/usr/include -I-I/var/tmp/OFED_topdir/BUILD/ibutils-1.5.4/ibdm/ibdm -I/usr/include -I./../../ibdm/ibdm -I/usr/include/infiniband -I/usr/include -DOSM_VENDOR_INTF_OPENIB -DOSM_BUILD_OPENIB -D_XOPEN_SOURCE=600 -D_BSD_SOURCE=1 -O2 -Wall -I/usr/include/infiniband -I/usr/include -DOSM_VENDOR_INTF_OPENIB -DOSM_BUILD_OPENIB -D_XOPEN_SOURCE=600 -D_BSD_SOURCE=1 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT sma.o -MD -MP -MF .deps/sma.Tpo -c -o sma.o sma.cpp; \
then mv -f .deps/sma.Tpo .deps/sma.Po; else rm -f .deps/sma.Tpo; exit 1; fi
sma.cpp: In static member function 'static void* SMATimer::timerRun(void*)':
sma.cpp:134: warning: no return statement in function returning non-void
sma.cpp: In member function 'int IBMSSma::nodeDescMad(ibms_mad_msg_t)':
sma.cpp:511: error: invalid conversion from 'const char*' to 'char*'
sma.cpp: In member function 'int IBMSSma::setPortInfoSwExtPort(ibms_mad_msg_t, ibms_mad_msg_t, uint8_t, ib_port_info_t, int)':
sma.cpp:1278: warning: suggest parentheses around arithmetic in operand of '|'
make[3]: *** [sma.o] Error 1
make[3]: Leaving directory `/var/tmp/OFED_topdir/BUILD/ibutils-1.5.4/ibmgtsim/src'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/OFED_topdir/BUILD/ibutils-1.5.4/ibmgtsim'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/var/tmp/OFED_topdir/BUILD/ibutils-1.5.4/ibmgtsim'
make: *** [all-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.bRz5D4 (%build)

RPM build errors:
user vlad does not exist - using root
group vlad does not exist - using root
user vlad does not exist - using root
group vlad does not exist - using root
Bad exit status from /var/tmp/rpm-tmp.bRz5D4 (%build)
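In a log like this one, the single fatal diagnostic (sma.cpp:511) is buried among warnings and make output. A small helper along the following lines can pull out just the compiler errors; it is a sketch that assumes the `file:line: error: message` shape seen above, and the function name is hypothetical.

```python
import re

# Matches compiler diagnostics of the form "file:line: error: message".
ERROR_RE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+): error: (?P<msg>.*)$")


def gcc_errors(log_text):
    """Return (file, line, message) tuples for each compiler error line."""
    found = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            found.append((match.group("file"), int(match.group("line")), match.group("msg")))
    return found


if __name__ == "__main__":
    import sys

    # Usage: python gcc_errors.py /tmp/OFED.18913.logs/ibutils.rpmbuild.log
    with open(sys.argv[1], errors="replace") as handle:
        for name, lineno, message in gcc_errors(handle.read()):
            print(f"{name}:{lineno}: {message}")
```

Lines such as `make[3]: *** [sma.o] Error 1` or `error: Bad exit status ...` are deliberately not matched, since they report the consequence rather than the cause.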
Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue
On Thu, 2010-03-11 at 13:52 -0800, Roland Dreier wrote:
>> Sorry, I was referring to my patch not Eli's.
>
> Heh, I never would have said anything about your patch was obvious. I skimmed yours once but I do want to read it more carefully. Did you ever say what test case you are using to provoke the problem you're fixing?

I think I did but it is just UDP stress tests in general. Throwing in some link failures and switching between connected and datagram modes helps too. netperf, qperf, etc. should work. Anything which causes the connected mode QP to fail should exercise the fix too.
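For readers without netperf or qperf at hand, a UDP burst of the kind described can be approximated with a few lines of Python. This is only a rough stand-in under stated assumptions, not the test Ralph ran: the function name, packet count and payload size are arbitrary, and the destination would need to be an address reachable over the IPoIB interface under test.

```python
import socket


def udp_burst(dest, count=1000, size=1024):
    """Send `count` UDP datagrams of `size` bytes to `dest` (host, port).

    Returns (sent, failed); failed sends are counted rather than raised,
    since send failures are the symptom discussed in this thread.
    """
    payload = b"\x00" * size
    sent = 0
    failed = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            try:
                sock.sendto(payload, dest)
                sent += 1
            except OSError:
                failed += 1
    return sent, failed


if __name__ == "__main__":
    # Placeholder destination; replace with a peer on the IPoIB subnet.
    print(udp_burst(("127.0.0.1", 9), count=100, size=512))
```

Running the burst in a loop while the link is bounced or the interface mode is switched would follow the recipe in the message above; how to trigger those events is left to the test setup.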