Re: [openib-general] IB/mthca: question about HCA profile module parameters
Hi Mini and thanks for the quick response.

Moni Shoua wrote:

OK. So I ran more tests on my setup, which now includes:
- Dual x86_64 processor (Intel Xeon)
- 1GB RAM
- 25204 HCA
- fw_ver=1.1.0

In the range of 16K to 256K for num_qp I got no errors. For lower and higher values I got errors from INIT_HCA and (not always, and just for very low values) a machine hang.

Do you have the Oops saved somewhere? Can you put it here please?

Sorry, but I don't have a dump of the kernel oops, but I have a strong belief that we saw the same kernel oops ... If it is needed, I will try to reproduce it one more time.

Did you verify the HCA profile module parameter feature?

As I mentioned earlier, I verified that non-default values can be assigned and that the HCA works for some selected values. I also noticed that illegal values cause the driver to write a message to the kernel log. However, I didn't test the exact behaviour of all possible values for each profile variable. I guess that this is something that needs to be done.

I will add this to our regression in the future.

Is there any known limitation on the values that should be used? (for example: only values which are powers of two)

I guess it is clear that there are hardware limitations that don't allow setting just any value. Unfortunately, even after looking for them in the PRM, I couldn't figure out what they are. The software limits the value to be a power of 2 and corrects users if they try to set a wrong value (to the nearest power of 2). In that case a warning message is written to the kernel log.

As far as I know, the minimum amount of any resource (for example, QPs) is the number of resources that the HCA reports as reserved.

I will open a bug in Bugzilla, so we will know that there are problems in this feature.

thanks
Dotan
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
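The correction rule mentioned in this message (a value that is not a power of 2 is moved to the nearest power of 2) can be sketched in C roughly as follows. This is only an illustration: `is_pow2` and `nearest_pow2` are hypothetical helper names, not functions from the mthca driver, and the thread does not say how ties or zero are handled, so the sketch assumes ties round up, 0 maps to 1, and results are capped at 2^31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a non-zero power of two. */
static bool is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical helper (not from mthca): round v to the nearest power of two.
 * Assumptions: 0 maps to 1, ties round up, results are capped at 2^31.
 */
static uint32_t nearest_pow2(uint32_t v)
{
    uint32_t lower = 1, upper, result;

    if (v <= 1)
        return 1;
    if (v >= 0x80000000u)
        return 0x80000000u;

    /* Largest power of two that does not exceed v. */
    while (lower <= v / 2)
        lower <<= 1;

    upper = lower << 1;
    /* Pick whichever neighbour is closer; a tie goes to the upper one. */
    result = (v - lower < upper - v) ? lower : upper;
    assert(is_pow2(result));
    return result;
}
```

Under these assumptions a request for 100000 QPs would become 131072, and a request for 5 would become 4. Whether the driver itself rounds this way is something to check against the mthca source.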
[openib-general] ofa_1_2_kernel 20070205-0200 daily build status
This email was generated automatically, please do not reply

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.13
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.17
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.16
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16

Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function 'register_netevent_notifier'
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function 'addr_cleanup':
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function 'unregister_netevent_notifier'
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
[openib-general] MVAPICH2 rpmbuild issue
Hi Shaun,

Please check the following issue:

Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.84872
+ umask 022
+ cd /var/tmp/OFEDRPM/BUILD
+ cd mvapich2-0.9.8
+ export OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
+ OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
+ '[' -d /var/tmp/OFED/usr/local/ofed/lib ']'
+ '[' -d /var/tmp/OFED/usr/local/ofed/lib64 ']'
+ export PREFIX=/var/tmp/OFED/usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1
+ PREFIX=/var/tmp/OFED/usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1
+ export CC=gcc CXX=g++ F77=gfortran
+ CC=gcc
+ CXX=g++
+ F77=gfortran
+ export ROMIO=yes
+ ROMIO=yes
+ export SHARED_LIBS=yes
+ SHARED_LIBS=yes
+ ./make.mvapich2.gen2
Could not find the OPEN_IB_HOME/lib64 or OPEN_IB_HOME/lib directory. Exiting.
error: Bad exit status from /var/tmp/rpm-tmp.84872 (%install)

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.84872 (%install)
ERROR: Failed executing rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich2_gcc' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1' --define 'build_root /var/tmp/OFED' --define 'open_ib_home /usr/local/ofed' --define 'ofed_build_root /var/tmp/OFED' --define 'comp_env CC=gcc CXX=g++ F77=gfortran' --define 'iwarp 0' --define 'romio 1' --define 'shared_libs 1' --define 'auto_req 1' /mswg2/work/vlad/ofed/test/OFED-1.2-alpha1/SRPMS/mvapich2-0.9.8-1.src.rpm

--
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.
[openib-general] [Bug 334] Problems with build OFED-1.1.1-ib_local_sa
https://bugs.openfabrics.org/show_bug.cgi?id=334 [EMAIL PROTECTED] changed: What|Removed |Added CC||[EMAIL PROTECTED] --- Comment #17 from [EMAIL PROTECTED] 2007-02-05 03:52 --- (In reply to comment #16) I don't agree with your patch. It assumes that SLES 10 may be corrupted. OFED should not try to support this. If you want to use this patch for your own purposes, just apply it (manually) before running OFED build scripts. OFED's backport patches mechanism is not suitable for such patches. I don't agree with you because my patch do not any changes in system files. It only search version of SUSE, but if you think that OFED should not try to support this I think that many Intel people who will install OFED on SLES10 platform will be unhappy. Thanks a lot for you help. -- Dmitry. Note that /etc/issue belongs to a SLES package: rpm thyme:~ # rpm -qf /etc/issue sles-release-10-15.2 Deleting it means that you corrupt your system. One can also delete /etc/SuSE-release and expect that OFED will work. If you decide to delete /etc/issue (or any other file that comes with SLES 10), you'll need to change OFED scripts for your special needs. Anyway, I maintain iSER in OFED. You may want to ask Vlad ([EMAIL PROTECTED]) what he thinks about it. He maintains OFED's build scripts. -- Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [Bug 334] Problems with build OFED-1.1.1-ib_local_sa
https://bugs.openfabrics.org/show_bug.cgi?id=334 --- Comment #18 from [EMAIL PROTECTED] 2007-02-05 04:02 --- Note that /etc/issue belongs to a SLES package: rpm thyme:~ # rpm -qf /etc/issue sles-release-10-15.2 Deleting it means that you corrupt your system. One can also delete /etc/SuSE-release and expect that OFED will work. If you decide to delete /etc/issue (or any other file that comes with SLES 10), you'll need to change OFED scripts for your special needs. Anyway, I maintain iSER in OFED. You may want to ask Vlad ([EMAIL PROTECTED]) what he thinks about it. He maintains OFED's build scripts. Thank you. I do not delete /etc/issue file. I have had it file, but it contain next information: : cat /etc/issue Use of this system by unauthorized persons or in an unauthorized manner is strictly prohibited That is all. -- Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] QoS in opensm will not be part of OFED 1.2
Hi Hal,

I had an AI to check the QoS status with OSM. The conclusion is that QoS support in OpenSM will not be part of OFED 1.2 (I updated the plan on the Wiki).

The reasons for this are:
1. Code not ready at code freeze.
2. There are technical discussions on the list regarding some implementation details (e.g. XML or text syntax).
3. SPEC is not published by IBTA yet.

Hal, Yevgeny - please work on a plan that will enable QoS to be merged into the main trunk once it's ready.

Tziporet
[openib-general] OSM QoS policy file
Hi Hal.

I added the file osm/doc/qos-policy.txt with a description of the QoS policy file and an example of such a file (with more comments inside). I'm sure you'll have questions and corrections regarding this file, so for now, to make our work easier, I'm not sending it as a patch, but just as text. Please review the file.

Thanks
-- Yevgeny

QoS Policy File
===============

The QoS policy file is divided into 4 subsections:

- Port Group: a set of CAs, Routers or Switches that share the same settings. A port group might be a partition defined by the partition manager policy in terms of GUIDs. Future implementations might provide support for NodeDescription based definition of port groups.

- Fabric Setup: defines how the SL2VL and VLArb tables should be set up. This policy definition assumes the computation of target behavior should be performed outside of OpenSM.

- QoS-Levels Definition: this section defines the possible sets of parameters for QoS that a client might be mapped to. Each set holds: SL and optionally: Max MTU, Max Rate, Packet Lifetime and QoS Class.

- Matching Rules: a list of rules that match an incoming PathRecord request to a QoS-Level. The rules are processed in order, so that the first match is applied. Each rule is built out of a set of match expressions, which should all match for the rule to apply. The matching expressions are defined for the following fields:
  - SRC and DST to lists of port groups
  - Service-ID to a list of Service-ID or Service-ID ranges
  - QoS Class to a list of QoS Class values or ranges

Example of the QoS policy file
==============================

<?xml version="1.0" encoding="ISO-8859-1"?>
<qos-policy>

    <!-- Port Groups define sets of ports to be used later in the settings -->
    <port-groups>

        <!-- using port GUIDs -->
        <port-group>
            <name>Storage</name>
            <!-- use is just a description that is used for logging.
                 Other than that, it is just a commentary -->
            <use>our SRP storage targets</use>
            <port-guid>0x1001</port-guid>
            <port-guid>0x1002</port-guid>
        </port-group>

        <port-group>
            <name>Virtual Servers</name>
            <use>node desc and IB port #</use>
            <!-- The syntax of the port name is as follows: hostname/CA-num/Pnum.
                 hostname and CA-num are compared to the first 2 words of
                 NodeDescription, and Pnum is a port number on that node. -->
            <port-name>vs1/HCA-1/P1</port-name>
            <port-name>vs3/HCA-1/P1</port-name>
            <port-name>vs3/HCA-2/P1</port-name>
        </port-group>

        <!-- using partitions defined in the partition policy -->
        <port-group>
            <name>Partition 1</name>
            <use>default settings</use>
            <partition>Part1</partition>
        </port-group>

        <!-- using node types CA|ROUTER|SWITCH -->
        <port-group>
            <name>Routers</name>
            <use>all routers</use>
            <node-type>ROUTER</node-type>
        </port-group>

    </port-groups>

    <qos-setup>

        <sl2vl-tables>

            <!-- scope defines the exact devices and in/out ports the tables
                 apply to; if the same port matches several rules, the last
                 one applies -->
            <sl2vl-scope>
                <group>Part1</group>
                <!-- *see explanation below the policy file example* -->
                <from>*</from>
                <!-- *see explanation below the policy file example* -->
                <to>*</to>
                <!-- SL2VL table has to have exactly 16 values (one for each SL) -->
                <sl2vl-table>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7</sl2vl-table>
            </sl2vl-scope>

            <sl2vl-scope>
                <!-- *see explanation below the policy file example* -->
                <across-from>Storage1</across-from>
                <!-- *see explanation below the policy file example* -->
                <across-to>Storage2</across-to>
                <sl2vl-table>0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0</sl2vl-table>
            </sl2vl-scope>

        </sl2vl-tables>

        <!-- define all types of VLArb tables. The length of the tables
             should match the physically supported tables by their target
             ports -->
        <vlarb-tables>

            <!-- scope defines the exact ports the VLArb tables apply to -->
            <vlarb-scope>
                <!-- defining VLArb tables on all the ports that belong to
                     port group 'Storage', and on all the ports that are
                     connected to ports of port group 'Storage' -->
                <group>Storage</group>
                <!-- across means all the ports that are connected to ports
                     that belong to the specified port group -->
                <across>Storage</across>
                <!-- VLArb table holds VL and weight pairs -->
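
The policy example in this message states that an SL2VL table has to have exactly 16 values, one per SL. A check of that rule could look roughly like the C sketch below. It is not OpenSM code: `parse_sl2vl_table` is a hypothetical helper, and the 0..15 range check is an added assumption based on VL being a 4-bit field, which the policy text itself does not spell out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SL2VL_ENTRIES 16 /* one entry per SL, as the policy text requires */

/*
 * Hypothetical helper (not from OpenSM): parse a comma-separated SL2VL
 * table such as "0,1,2,...,7" into out[]. Returns 0 on success and -1 if
 * the text is malformed, has a value outside 0..15 (assumed VL range),
 * or does not contain exactly 16 values.
 */
static int parse_sl2vl_table(const char *text, unsigned char out[SL2VL_ENTRIES])
{
    const char *p = text;
    int count = 0;

    assert(text != NULL && out != NULL);

    for (;;) {
        char *end;
        long v;

        errno = 0;
        v = strtol(p, &end, 10);
        if (end == p || errno != 0 || v < 0 || v > 15)
            return -1;          /* not a number, or out of range */
        if (count == SL2VL_ENTRIES)
            return -1;          /* more than 16 values */
        out[count++] = (unsigned char)v;

        if (*end == '\0')
            break;              /* end of the table text */
        if (*end != ',')
            return -1;          /* unexpected separator */
        p = end + 1;
    }
    return count == SL2VL_ENTRIES ? 0 : -1;
}
```

Fed the first table from the example, "0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7", this accepts the input and stores 7 as the VL for SL 15. A shorter or longer list is rejected.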
Re: [openib-general] idea for ofed 1 2 kernel file structure
> I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike:
> It is hard to see changes that are specific to OFED since we have whole
> kernel history mixed in.

I'm not sure how you have your branches set up, but if you have something like a linus branch that tracks the upstream kernel, it's easy to do stuff like

    git log linus..

or

    git diff linus.. drivers/infiniband

and see the differences that way.

Using git that way (which is what it's designed for, after all) seems better than some scripts to munge together two trees.

- R.
Re: [openib-general] QoS in opensm will not be part of OFED 1.2
Hi Tziporet, On Mon, 2007-02-05 at 07:04, Tziporet Koren wrote: Hi Hal, I had an AI to check the QoS status with OSM. Conclusions are that QoS support in OpenSM will not be part of OFED 1.2 (I updated the plan on the Wiki) The reasons for this are: 1. Code not ready at code freeze. 2. There are technical discussion in the list regarding some implementation details (e.g. XML or text syntax). 3. SPEC is not published by IBTA yet. I think this last reason also applies to the end client QoS changes as well. -- Hal Hal Yevgeny - please work on a plan that will enable QoS to be merged on the main trunk once its ready. Tziporet ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH] enable IPoIB only if broadcast join finish
Hi Roland,

Please review this patch. According to IPoIB RFC 4391, section 5, once the IPoIB broadcast group has been joined, the interface should be ready for data transfer. In the current IPoIB implementation, the interface is UP and RUNNING only when all default multicast joins have succeeded. We hit a problem where the broadcast join finished successfully but the all-hosts multicast join failed.

Here is the patch. If possible, please give your input ASAP; we have an urgent customer issue that needs to be resolved:

diff -urpN ipoib/ipoib_multicast.c ipoib-multicast/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2006-11-29 13:57:37.0 -0800
+++ ipoib-multicast/ipoib_multicast.c 2007-02-04 22:34:16.0 -0800
@@ -402,6 +402,11 @@ static void ipoib_mcast_join_complete(in
 	queue_work(ipoib_workqueue, &priv->mcast_task);
 	mutex_unlock(&mcast_mutex);
 	complete(&mcast->done);
+	/*
+	 * broadcast join finished, enable carrier
+	 */
+	if (mcast == priv->broadcast)
+		netif_carrier_on(dev);
 	return;
 }
@@ -599,7 +604,6 @@ void ipoib_mcast_join_task(void *dev_ptr
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
 	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
-	netif_carrier_on(dev);
 }
 int ipoib_mcast_start_thread(struct net_device *dev)

(See attached file: ipoib-multicast.patch)

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
[openib-general] Unknown SMP Recv
Hi, I have change the driver (smi) a little and have written a tool like a router or a bridge. It receives directed route smp's on one port and sends it to another port. I use 3 nodes (sender on node 1, the router on node 2, normal node on 3) and send a subnGet SMP with [0][1][1] as initial path. And it works fine, but on way back the router also receives a second subnGetResp packet with no data. The header is almost the same as the real subnGetResp packet, just the DrSLID,DrDLID, initial path, return path are 0. Are there any ideas where this packet come from? Ack? Thanks Michael ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] idea for ofed 1 2 kernel file structure
Quoting Roland Dreier [EMAIL PROTECTED]: Subject: Re: [openib-general] idea for ofed 1 2 kernel file structure I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike: It is hard to see changes that are specific to OFED since we have whole kernel history mixed in. I'm not sure how you have your branches set up, but if you have something like a linus branch that tracks the upstream kernel, it's easy to do stuff like git log linus.. or git diff linus.. drivers/infiniband and see the differences that way. limit to drivers/infiniband is no longer sufficient as we have components under drivers/net etc. Another problem is that history-rewriting tools such as git rebase seem to easily get confused by the complicated linux history. Using git that way (which is what it's designed for, after all) seems better than some scripts to munge together two trees. Problem is, OFED kernel code actually consists of 2 parts: upstream kernel developed separately at lkml and out of kernel components, developed separately. OFED does not really track linux all the time: we only update at -RC time. Mixing such 2 projects together does not seem to be what git was designed for. For example, when a patch is applied upstream we need to remove it from fixes. So after I do git pull from upstream I get a broken tree that won't even build. Not good. Another problem I'm trying to address is the confusion around what gets applied as patch and what directly. This way, a bad patch won't even apply. -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Unknown SMP Recv
On Mon, 2007-02-05 at 10:18, Michael Arndt wrote:
> Hi,
> I have change the driver (smi) a little and have written a tool like a
> router or a bridge. It receives directed route smp's on one port and sends
> it to another port. I use 3 nodes (sender on node 1, the router on node 2,
> normal node on 3) and send a subnGet SMP with [0][1][1] as initial path.
> And it works fine, but on way back the router also receives a second
> subnGetResp packet with no data. The header is almost the same as the real
> subnGetResp packet, just the DrSLID,DrDLID, initial path, return path are 0.
> Are there any ideas where this packet come from? Ack?

A router should not allow a SMP to cross a subnet boundary. SMPs are restricted to the local subnet.

-- Hal

> Thanks
> Michael
Re: [openib-general] QoS in opensm will not be part of OFED 1.2
I had an AI to check the QoS status with OSM. Conclusions are that QoS support in OpenSM will not be part of OFED 1.2 (I updated the plan on the Wiki) The reasons for this are: 1. Code not ready at code freeze. 2. There are technical discussion in the list regarding some implementation details (e.g. XML or text syntax). 3. SPEC is not published by IBTA yet. I think this last reason also applies to the end client QoS changes as well. Yes. But the other 2 don't. -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] Immediate data question
Roland: If I only want to send/recv 4 bytes with immediate data: On sender side: opcode = IBV_WR_SEND_WITH_IMM; imm_data = my_4_bytes_data; Do I still need to specify sg_list and num_sge ? On receiver side, because the immediate data is inside the completion structure, do I need to post a receive for above message ? If I need to post a receive, do I need to specify sg_list and num_sge for the receive ? I looked the spec but did not find useful information. The reason I ask is that at some point, I can not(or hard) to provide registered memory only for 4 bytes data. Thank you. --CQ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roland Dreier Sent: Monday, February 05, 2007 8:20 AM To: Michael S. Tsirkin Cc: openib-general@openib.org Subject: Re: [openib-general] idea for ofed 1 2 kernel file structure I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike: It is hard to see changes that are specific to OFED since we have whole kernel history mixed in. I'm not sure how you have your branches set up, but if you have something like a linus branch that tracks the upstream kernel, it's easy to do stuff like git log linus.. or git diff linus.. drivers/infiniband and see the differences that way. Using git that way (which is what it's designed for, after all) seems better than some scripts to munge together two trees. - R. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] QoS in opensm will not be part of OFED 1.2
I had an AI to check the QoS status with OSM. Conclusions are that QoS support in OpenSM will not be part of OFED 1.2 (I updated the plan on the Wiki) The reasons for this are: 1. Code not ready at code freeze. 2. There are technical discussion in the list regarding some implementation details (e.g. XML or text syntax). 3. SPEC is not published by IBTA yet. I think this last reason also applies to the end client QoS changes as well. Yes. But the other 2 don't. Right but I think that precludes it from being included in OFED right now. Since the code is already included in OFED, moving it out would violate the feature freeze rules, unless there's an actual bug this would fix. -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] QoS in opensm will not be part of OFED 1.2
On Mon, 2007-02-05 at 10:38, Michael S. Tsirkin wrote: I had an AI to check the QoS status with OSM. Conclusions are that QoS support in OpenSM will not be part of OFED 1.2 (I updated the plan on the Wiki) The reasons for this are: 1. Code not ready at code freeze. 2. There are technical discussion in the list regarding some implementation details (e.g. XML or text syntax). 3. SPEC is not published by IBTA yet. I think this last reason also applies to the end client QoS changes as well. Yes. But the other 2 don't. Right but I think that precludes it from being included in OFED right now. -- Hal ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [libmthca] deadlock while trying to destroy QP
Hi Roland,

I am running a proprietary test over ofed1.1 (userspace). I have one context where I poll my CQ and another (signal handler context) where I try to destroy my QP. It looks like mthca_destroy_qp is trying to take a lock that mthca_poll_cq is holding. The deadlock occurs at the end of the test run, when there are no more completions, so the test never exits.

Here is a core dump:

#0 0x003a6ce09172 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#1 0x002a959cf449 in mthca_cq_clean (cq=0x607240, qpn=3277830, srq=0x0) at src/cq.c:554
#2 0x002a959d28b9 in mthca_destroy_qp (qp=0x607400) at src/mthca.h:246
#3 0x0040117b in client_sig_handler ()
#4 signal handler called
#5 0x003a6ce09165 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#6 0x002a959cec91 in mthca_poll_cq (ibcq=0x607240, ne=1, wc=0x7fb590) at src/cq.c:467
#7 0x002a9557bf73 in ibv_poll_cq (cq=0x607240, num_entries=1, wc=0x7fb590) at /usr/local/ofed/include/infiniband/verbs.h:824

Does destroy_qp need to be dependent on the CQ? Do you have any suggestions?

Thanks,
Guy
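In the backtrace in this message, the signal handler interrupts mthca_poll_cq and then calls mthca_destroy_qp in the same thread, which ends up in pthread_spin_lock again. One common way to sidestep this class of problem is to let the handler only set a flag and to do the teardown from the polling loop. The C sketch below illustrates that pattern under stated assumptions: `poll_cq_once` and `destroy_qp_once` are hypothetical stand-ins for the real ibv_poll_cq() and ibv_destroy_qp() calls, and the choice of SIGTERM is arbitrary.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Set from the signal handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t stop_requested = 0;

/* Hypothetical stand-ins for ibv_poll_cq() and ibv_destroy_qp(). */
static int polls_done = 0;
static bool qp_destroyed = false;

static void poll_cq_once(void)    { polls_done++; }
static void destroy_qp_once(void) { qp_destroyed = true; }

/* The handler only records the request; it takes no locks. */
static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the handler (SIGTERM is an arbitrary choice for this sketch). */
static void install_stop_handler(void)
{
    void (*prev)(int) = signal(SIGTERM, on_signal);

    assert(prev != SIG_ERR);
    (void)prev;
}

/* Poll until a stop is requested, then tear down in normal context. */
static void run_client(void)
{
    while (!stop_requested)
        poll_cq_once();
    destroy_qp_once();
}
```

With this structure the QP is destroyed only after the polling call has returned, so the teardown never runs on top of an interrupted poll. Whether this fits the test in question depends on details the thread does not give.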
Re: [openib-general] [PATCH] RE: regression in ofed 1.2
The name is ib_mcast_wq which is too long for older kernels. Did we loose a backport patch? Not sure what happened here. Sean, could you rename ib_mcast_wq to ib_mcast please? I renamed the workqueue for what I requested to pull upstream, and I added a patch to my pull request to rename a couple of other workqueues. Didn't you already apply a rename patch to the ofed code? - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
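To illustrate why the shorter name helps, here is a small C check. The 10-character cap is an assumption made for this sketch (the thread only says the name is too long for older kernels, without giving the limit), and `wq_name_fits` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed cap for older kernels; not stated in the thread, so verify it. */
#define ASSUMED_WQ_NAME_MAX 10

/* Hypothetical helper: true if the workqueue name fits the assumed cap. */
static bool wq_name_fits(const char *name)
{
    assert(name != NULL);
    return strlen(name) <= ASSUMED_WQ_NAME_MAX;
}
```

Under that assumption, "ib_mcast_wq" (11 characters) does not fit, while the proposed "ib_mcast" (8 characters) does.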
Re: [openib-general] Unknown SMP Recv
On Mon, 2007-02-05 at 11:56, Michael Arndt wrote: Hi, A router should not allow a SMP to cross a subnet boundary. SMPs are restricted to the local subnet. I work on a discovering mechanism for switchless InfiniBand Architectures like Rings, Tori or maybe Hyper-Cubes. There is just one single subnet, no switches or routers. Please ignore the background and focus to the problem about the second packet. Maybe you have some ideas even you are not involved in the hole project. That would be nice. Guess you don't mean IB router when you say router in your description. I also have no theories without more information: Is the sender a normal node ? Is normal node mean standard OpenIB without changes ? How was the SMI changed ? On which nodes ? Only the intermediate one ? Aside from the initial path being [0][1][1], what are the hop count and hop pointer ? What are DrDLID and DrSLID as well as the LIDs in the LRH of the SMP ? -- Hal Thanks Michael ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] cxgb3.git tree merged to 2.6.20
All, I've updated my tree git://staging.openfabrics.org/~swise/cxgb3.git to linux-2.6.20. Branches: cxgb3 - my development branch with commits that were used to review the rdma driver (large patch series) + the T3 Ethernet driver. for-roland - branch where roland can pull the latest rdma driver (the same code that is in OFED 1.2) for-ofed_1_2 - branch used to deliver the original ethernet and rdma driver code to the ofed_1_2 tree. It is up to date with the ofed_1_2 tree wrt the drivers. Steve. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] patches to 2.6.19.1 kernel for switch Operation
Hal: We are upgrading to 2.6.19.1 kernel and I finally ported the changes required for Switch operation from my current kernel (2.6.12) version. I have tested these changes for a switch with different SM(s). But I need the community's help to test the changes on different HCAs to make sure I have not broken anything. Please see if the changes look OK. Thanks, Suri smi.c.ptch Description: Binary data agent.c.ptch Description: Binary data mad.c.ptch Description: Binary data ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] idea for ofed 1 2 kernel file structure
On Sun, 4 Feb 2007, Michael S. Tsirkin wrote: Hi! I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike: It is hard to see changes that are specific to OFED since we have whole kernel history mixed in. I agree. It would easy to split OFED specific files In separate directory and have OFED scripts combine that with upstream kernel. All out of tree modules we distribute would go there too. What do others think about this? I like that idea very much. -- Arthur ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Immediate data question
On 2/5/07, Tang, Changqing [EMAIL PROTECTED] wrote:

> On sender side:
> opcode = IBV_WR_SEND_WITH_IMM;
> imm_data = my_4_bytes_data;
>
> Do I still need to specify sg_list and num_sge ?

At the sender side I think you can do well with:

    opcode = IBV_WR_SEND
    send_flags |= IBV_SEND_INLINE
    sge.addr = pointer to the 4 bytes
    sge.length = 4
    sge.lkey = don't care

Since the 4 bytes are --copied-- by the IB library from sge.addr during the execution of ibv_post_send(), the ownership of sge.addr is yours once the call returns.

> On receiver side, because the immediate data is inside the completion structure, do I need to post a receive for above message ?

Yes, I don't see how you can get away from posting a receive WR.

> The reason I ask is that at some point, it is hard (or impossible) for me to provide registered memory only for 4 bytes of data.

What about the MPI implementation's header? Do you have a case where only 4 bytes need to be passed to the other side?

Or.
Re: [openib-general] [PATCH] RE: regression in ofed 1.2
Quoting Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [openib-general] [PATCH] RE: regression in ofed 1.2

> > > The name is ib_mcast_wq which is too long for older kernels.
> > > Did we lose a backport patch?
> >
> > Not sure what happened here.
> > Sean, could you rename ib_mcast_wq to ib_mcast please?
>
> I renamed the workqueue for what I requested to pull upstream, and I added a patch to my pull request to rename a couple of other workqueues. Didn't you already apply a rename patch to the ofed code?

Yes, but I assumed it's in your branch so I threw it out when I took your latest code.

--
MST
Re: [openib-general] idea for ofed 1 2 kernel file structure
Quoting [EMAIL PROTECTED] [EMAIL PROTECTED]:
Subject: Re: [openib-general] idea for ofed 1 2 kernel file structure

> On Sun, 4 Feb 2007, Michael S. Tsirkin wrote:
>
> > Hi!
> > I looked at the current ofed 1.2 kernel tree and there is 1 thing I dislike: it is hard to see changes that are specific to OFED since we have the whole kernel history mixed in.
>
> I agree.
>
> > It would be easy to split OFED specific files into a separate directory and have OFED scripts combine that with the upstream kernel. All out of tree modules we distribute would go there too.
> >
> > What do others think about this?
>
> I like that idea very much.

Could you address Roland's proposal as well?

> --
> Arthur

--
MST
[openib-general] Fwd: bug in mthca_qp.c (GEN 2)
Roland, what do you think? Looks pretty severe actually.

----- Forwarded message from Jack Morgenstein [EMAIL PROTECTED] -----

Subject: bug in mthca_qp.c (GEN 2)
Date: Mon, 5 Feb 2007 12:44:11 +0200
From: Jack Morgenstein [EMAIL PROTECTED]

static void to_ib_ah_attr(struct mthca_dev *dev, struct ib_ah_attr *ib_ah_attr,
                          struct mthca_qp_path *path)
{
        memset(ib_ah_attr, 0, sizeof *path);

SHOULD BE:
        memset(ib_ah_attr, 0, sizeof *ib_ah_attr);

----- End forwarded message -----

--
MST
[openib-general] [PATCH] ofed_1_2 - iw_cxgb3 - Add standard GPL header to tcb.h
Add standard GPL header to tcb.h

From: Steve Wise [EMAIL PROTECTED]

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/tcb.h |   33 +++++++++++++++++++++++++++++++--
 1 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/tcb.h b/drivers/infiniband/hw/cxgb3/tcb.h
index f287a7c..c702dc1 100644
--- a/drivers/infiniband/hw/cxgb3/tcb.h
+++ b/drivers/infiniband/hw/cxgb3/tcb.h
@@ -1,5 +1,34 @@
-/* This file is automatically generated --- do not edit */
-
+/*
+ * Copyright (c) 2007 Chelsio, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
 #ifndef _TCB_DEFS_H
 #define _TCB_DEFS_H
Re: [openib-general] MVAPICH2 SRPM and install file patches
Vladimir Sokolovsky wrote:
> On Wed, 2007-01-31 at 20:32 -0500, Shaun Rowland wrote:
> > I've placed the MVAPICH2 SRPM on the OFA server in ~rowland/ofed_1_2, and it is linked to here:
> >
> > http://www.openfabrics.org/~rowland/ofed_1_2/
>
> Hi Shaun,
> Please change mvapich2.spec to avoid using the %build macro. It removes RPM_BUILD_ROOT on SuSE distros:
>
> Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.9418
> + umask 022
> + cd /var/tmp/OFEDRPM/BUILD
> + /bin/rm -rf /var/tmp/OFED
> ++ dirname /var/tmp/OFED
> + /bin/mkdir -p /var/tmp
> + /bin/mkdir /var/tmp/OFED
> + cd mvapich2-0.9.8
> + export OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
> + OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed

Thank you for pointing out this issue on SuSE. I've made the change and placed a new SRPM in my directory (mvapich2-0.9.8-2.src.rpm) and updated my latest.txt file.

--
Shaun Rowland [EMAIL PROTECTED]
http://www.cse.ohio-state.edu/~rowland/
Re: [openib-general] idea for ofed 1 2 kernel file structure
On Mon, 2007-02-05 at 06:20 -0800, Roland Dreier wrote:
> > I looked at the current ofed 1.2 kernel tree and there is 1 thing I dislike: it is hard to see changes that are specific to OFED since we have the whole kernel history mixed in.
>
> I'm not sure how you have your branches set up, but if you have something like a linus branch that tracks the upstream kernel, it's easy to do stuff like
>
>     git log linus..
>
> or
>
>     git diff linus.. drivers/infiniband
>
> and see the differences that way. Using git that way (which is what it's designed for, after all) seems better than some scripts to munge together two trees.

So git log linus.. would show commits in the current branch that are not in the linus branch, correct?

That would work. Two branches: one with the main kernel git tree, and one based on that plus the ofed-specific changes.
Re: [openib-general] MVAPICH2 SRPM and install file patches
Shaun,

Thanks for doing this. I see things like romio and shlibs configurable in the patch; what about other MVAPICH2 features like fault tolerance, multi rail, threads, and MPD? How can I configure them when I use install.sh to compile and install OFED?

I also didn't quite understand the ib-vs-iwarp configuration; I thought OFED 1.2 would support both.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shaun Rowland
Sent: Wednesday, January 31, 2007 5:33 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: [openib-general] MVAPICH2 SRPM and install file patches

I've placed the MVAPICH2 SRPM on the OFA server in ~rowland/ofed_1_2, and it is linked to here:

http://www.openfabrics.org/~rowland/ofed_1_2/

Additionally, I am including a patch in this email that updates the ofed_1_2_scripts files from the GIT repository we were given to handle the MVAPICH2 SRPM file. Basically, installing MVAPICH2 is similar to the other MPI packages, except that I have added a choice option to build with iWARP support or not. The default is IB only. If the user has selected the librdmacm packages and the mvapich2 package, this choice is presented. This is also saved in the ofed.conf file using an MVAPICH2_IMPL variable, and the librdmacm packages are added as dependencies if the iWARP version of MVAPICH2 is desired and they are not already in the ofed.conf file, which seems like standard behavior in the scripts. The resulting binary RPM uses the name convention mvapich2_compiler as normal in either case.

There are various ways this could be implemented, perhaps in a better manner. This is what I was able to come up with by today. Since the installation scripts given were very similar to the original OFED 1.1 scripts, I was able to test the installation procedure using OFED 1.1 files. Everything worked for me, including building the mpitests package against the mvapich2 package. There are some comments about this in what I have done. I hope that it is helpful in getting our SRPM integrated into the installation scripts.

Additionally, I put a README file in my ofed_1_2 directory that contains information about the macros that can be used with our SRPM file. The SRPM can be used to install against an existing OFED installation, and those macros control various aspects of the result. There is one special macro I use for when the SRPM is being built along with the OFED source, and its use should be clear in the patched build.sh script and associated comment.

--
Shaun Rowland [EMAIL PROTECTED]
http://www.cse.ohio-state.edu/~rowland/
Re: [openib-general] Fwd: bug in mthca_qp.c (GEN 2)
> Roland, what do you think? Looks pretty severe actually.
>
> static void to_ib_ah_attr(struct mthca_dev *dev, struct ib_ah_attr *ib_ah_attr,
>                           struct mthca_qp_path *path)
> {
>         memset(ib_ah_attr, 0, sizeof *path);

It's definitely a bug but I don't think it's very severe -- the only calls to to_ib_ah_attr are in mthca_query_qp, where the function is used to fill in fields embedded in a struct ib_qp_attr, and even though the memset overruns the ib_ah_attr slightly, it only zeros out fields that are set later in the function anyway. So with current code at least the bug is harmless.

Anyway, I queued the patch below for 2.6.21:

IB/mthca: Use correct structure size in call to memset()

When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof *ib_ah_attr instead of sizeof *path.

Pointed out by Jack Morgenstein [EMAIL PROTECTED].

Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/hw/mthca/mthca_qp.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c
index 5f5214c..224c93d 100644
--- a/drivers/infiniband/hw/mthca/mthca_qp.c
+++ b/drivers/infiniband/hw/mthca/mthca_qp.c
@@ -399,7 +399,7 @@ static int to_ib_qp_access_flags(int mthca_flags)
 static void to_ib_ah_attr(struct mthca_dev *dev, struct ib_ah_attr *ib_ah_attr,
 				struct mthca_qp_path *path)
 {
-	memset(ib_ah_attr, 0, sizeof *path);
+	memset(ib_ah_attr, 0, sizeof *ib_ah_attr);
 	ib_ah_attr->port_num = (be32_to_cpu(path->port_pkey) >> 24) & 0x3;
 
 	if (ib_ah_attr->port_num == 0 || ib_ah_attr->port_num > dev->limits.num_ports)
--
1.4.4.1
Re: [openib-general] Immediate data question
> If I only want to send/recv 4 bytes with immediate data:

I assume you mean that you only want to send the 4 bytes of immediate data, and nothing else.

> On sender side:
> opcode = IBV_WR_SEND_WITH_IMM;
> imm_data = my_4_bytes_data;
>
> Do I still need to specify sg_list and num_sge ?

Well, you should be able to specify num_sge = 0. But to be honest I'm not positive that 0-length sends are allowed; I know that 0-length RDMA WRITE operations are allowed.

> On receiver side, because the immediate data is inside the completion structure, do I need to post a receive for above message ?

Yes, otherwise how would you get the immediate data?

> If I need to post a receive, do I need to specify sg_list and num_sge for the receive ?

I believe that a 0-length receive with num_sge = 0 should be fine, at least to handle an RDMA write with immediate data. But again I'm not positive.

 - R.
[openib-general] OFED-1.2 first release
Hi,

OFED-1.2-20070205-1823.tgz can be downloaded from http://www.openfabrics.org/builds/ofed-1.2/

The first OFED package includes:

    ofa_kernel-1.2-alpha1.src.rpm
    ofa_user-1.2-alpha1.src.rpm
    mvapich-0.9.9-971.src.rpm
    mvapich2-0.9.8-1.src.rpm
    openmpi-1.2b4ofedr13470-1ofed.src.rpm
    mpitests-2.0-698.src.rpm
    open-iscsi-generic-2.0-742.src.rpm
    ib-bonding-0.9.0-1.src.rpm
    ofed-docs-1.2-0.src.rpm
    ofed-scripts-1.2-0.src.rpm

Known issues:

    srptools - compilation fails
    openib_diags - compilation fails
    ibutils - not included yet

To build OFED RPMs:

    cd OFED-1.2-20070205-1823
    ./build.sh

Created RPMs will be stored under the OFED-1.2-20070205-1823/RPMS/ directory.

To install OFED RPMs:

    cd OFED-1.2-20070205-1823
    ./install.sh

For a detailed installation guide, see OFED-1.2-xxx/docs/OFED_Installation_Guide.txt

--
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.
[openib-general] Web site needs update
The web site lists the svn repo, which is mostly empty now, and the README says the web site lists the various git repos for accessing the source code, but there are no git repos listed on the web site. Could we please have the authoritative git repos for the different components being worked on listed on the web site for easy reference?

--
Doug Ledford [EMAIL PROTECTED]
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
Re: [openib-general] MVAPICH2 SRPM and install file patches
Scott Weitzenkamp (sweitzen) wrote:
> Shaun,
>
> Thanks for doing this. I see things like romio and shlibs configurable in the patch; what about other MVAPICH2 features like fault tolerance, multi rail, threads, and MPD? How can I configure them when I use install.sh to compile and install OFED?

Hi Scott. I had thought about this a little when I was testing with the install/build scripts Vlad gave us. I would appreciate his input if I get anything wrong here as well. From the perspective of the user running the install.sh script, the MPI packages are essentially built one way. You do get to pick the compiler(s) to use, but as for other options, you would have to edit the build.sh function associated with the desired package. I created a hack for the iwarp vs ib configuration for MVAPICH2 because I needed to distinguish between the two (for reasons I will outline at the end of this message).

Theoretically, you should be able to export the proper variables from our make.mvapich2.* scripts before running the install.sh script, and the features would be enabled. For instance, you could do:

    export MULTI_THREAD=yes
    ./install.sh

This is not a good solution for installing OFED, but should work because it does not conflict with anything else, at least as far as I am aware. I see that I need to update the make.mvapich2.iwarp script to have the multithreading option as well, so it would not quite work 100% right now.

As for each feature you asked about:

* fault tolerance - this is controlled during the build process with $ENABLE_CKPT and requires $BLCR_HOME pointing to a BLCR installation. This only works for single threaded builds without rdmacm support (the ib case only, essentially).

* multi rail - this is controlled by runtime environment variables after installation.

* threads - this is controlled by $MULTI_THREAD during the build process. As noted above, there's a restriction with fault tolerance.

* MPD - MPD is used by MVAPICH2 as it is based on MPICH2. There are actually a number of options that could be chosen.

I believe from our side, it will be good for me to go ahead and put these in our SRPM now. Our SRPM can be used outside of the OFED installation system of course, and these should really be there. There are even other devices, like uDAPL. I did the SRPM in the install/build script patches the way I did because that seemed like a good set of options for how the OFED installation system works. There's no framework or examples of asking about features to build in an MPI package. I just quickly tacked on the iwarp question and made up a new configuration variable for the ofed.conf file, but it's not necessarily a good way to do it.

One possibility would be to create a shell function that sets various build options for MPI packages. Variables could be set in this function using some name convention, in our case perhaps MVAPICH2_OPT_whatever. In such a function (probably one for each package, that seems to be the convention), it would be easier to code all the exceptions for features, if there are any. There are some in our case, as I've mentioned. This configuration function could be called when the user is choosing to install MVAPICH2.

This leads to a number of problems. Can the user select different options for each of the compiler versions of the MPI package? I think clearly the answer should be no. Even as implemented now, you cannot install the iwarp and ib versions of MVAPICH2 at the same time during the install process. You must choose one or the other. Being able to do both would require one of two changes:

1. Having another level of installer system configuration where I could select the devices desired, and options for each device (by device here, I mean uDAPL, IB, iWARP).

- or -

2. Making multiple RPM packages to fit into how the installer currently interacts with SRPMs, prompts, etc.

I've only had limited time to investigate this, so what I have done so far mostly fits with how the OFED install system does things with the other packages, except for my iwarp vs ib question prompt. I think there's potential for a lot of complication here. A configuration function for each package would be one possible way to contain that; however, I'd have to go back and check how things work again to see how something like that would fit in.

So, I will add these new feature options to our SRPM because they could be used outside of the OFED installation system anyway, and we would like that to be possible and give the ability to set these options. However, I cannot say what would be best for the OFED installation system. It might be better to just go with what we have now, more mainstream builds, and let users do their own build if they want to customize heavily. Otherwise, I've given one possible idea from the perspective of someone who is new to the install system.

Vlad, do you have any opinion here? Do you see where I am coming from
Re: [openib-general] Immediate data question
Thank you. Other than using immediate data to send a notification from one end of a QP to the other, is there any other way to do this? For example, can I modify the QP state from RTS to another state on one end, so that the other end gets some notification when I query the QP?

--CQ
Re: [openib-general] Immediate data question
    Changqing> Thank you. Other than using immediate data to send
    Changqing> notification from one end to the other of a QP, is
    Changqing> there any other way to do this ? For example, can I
    Changqing> modify QP state from RTS to other state on one end, and
    Changqing> then the other end gets some notification when I query
    Changqing> the QP ?

Not that I know of. You would need to do something that triggers something to be sent on the wire, and I don't know of any way to do that other than posting a work request.

 - R.
Re: [openib-general] MVAPICH2 SRPM and install file patches
> > I also didn't quite understand the ib-vs-iwarp configuration, I thought OFED 1.2 would support both.
>
> There are 2 reasons our SRPM has to be told whether it is being built for iWARP or IB:
>
> 1. We need to use -DRDMA_CM_RNIC during the build for iWARP (this is actually done by invoking our make.mvapich2.iwarp script in the RPM build).

I believe the iWARP build will work over IB too. The difference, I think, is that the iWARP build uses the RDMA-CM and the IB build uses the IB-CM. Shaun, is this correct? If so, I suggest you define these options differently. Perhaps IBCM vs RDMACM? Right now it implies that you cannot run the same mvapich build over both transports.

My 2 cents.

Steve.
Re: [openib-general] OFED-1.2 first release
BTW: The README.txt still talks about OFED-1.1 and the October 2006 release.

Steve.

On Tue, 2007-02-06 at 00:25 +0200, Vladimir Sokolovsky wrote:
> Hi,
>
> OFED-1.2-20070205-1823.tgz can be downloaded from
> http://www.openfabrics.org/builds/ofed-1.2/
Re: [openib-general] idea for ofed 1 2 kernel file structure
On Mon, 5 Feb 2007, Michael S. Tsirkin wrote:

> Could you address Roland's proposal as well?

Regarding the use of git to track the differences in OFED/kernel.org trees? I had to go (re)learn some git stuff, but now I think that this will work fine.

--
Arthur
Re: [openib-general] OFED-1.2 first release
I think there might be some dependency problem. I selected libibverbs, libcxgb3, librdmacm, perftest, mvapich2/IWARP and mpitests. For some reason it pulled in libibumad as a prereq, but not libibcommon...

Also, I think mvapich2/IWARP links with libibumad or libibcommon and it doesn't need to when using librdmacm.

[EMAIL PROTECTED] redhat-release-4AS-5.5]# rpm -U *
error: Failed dependencies:
        libibcommon.so.1()(64bit) is needed by libibumad-1.0.2-0.x86_64
        libibcommon.so.1(IBCOMMON_1.0)(64bit) is needed by libibumad-1.0.2-0.x86_64
    Suggested resolutions:
        libibcommon-1.0-1.x86_64.rpm

On Tue, 2007-02-06 at 00:25 +0200, Vladimir Sokolovsky wrote:
> Hi,
>
> OFED-1.2-20070205-1823.tgz can be downloaded from
> http://www.openfabrics.org/builds/ofed-1.2/
Re: [openib-general] idea for ofed 1 2 kernel file structure
Quoting Steve Wise [EMAIL PROTECTED]:
Subject: Re: [openib-general] idea for ofed 1 2 kernel file structure

> On Mon, 2007-02-05 at 06:20 -0800, Roland Dreier wrote:
> > > I looked at the current ofed 1.2 kernel tree and there is 1 thing I dislike: it is hard to see changes that are specific to OFED since we have the whole kernel history mixed in.
> >
> > I'm not sure how you have your branches set up, but if you have something like a linus branch that tracks the upstream kernel, it's easy to do stuff like
> >
> >     git log linus..
> >
> > or
> >
> >     git diff linus.. drivers/infiniband
> >
> > and see the differences that way. Using git that way (which is what it's designed for, after all) seems better than some scripts to munge together two trees.
>
> So git log linus.. would show commits in the current branch that are not in the linus branch, correct?
>
> That would work. Two branches: one with the main kernel git tree, and one based on that plus the ofed-specific changes.

Well, that's what we have now. The master branch tracks the upstream kernel.

--
MST
Re: [openib-general] Web site needs update
Quoting Doug Ledford [EMAIL PROTECTED]:
Subject: Web site needs update

> The web site lists the svn repo, which is mostly empty now, and the README says the web site lists the various git repos for accessing the source code, but there are no git repos listed on the web site. Could we please have the authoritative git repos for the different components being worked on listed on the web site for easy reference?

I think the thing to do now is to finally move openfabrics.org and openib.org to point to the new server. Then we'll be able to fix this.

--
MST
Re: [openib-general] OFED-1.2 first release
Vlad and Tziporet,

It might help if you elaborated on what you meant by first release. You have been saying code freeze, but really this is feature freeze, right?

This announcement is quite a bit different from previous OFED announcements, where you detailed what features were available and what OSes were supported. The daily build email mentions compiling against kernels, but I haven't seen what distros were actually tested. Are we starting from scratch on compiling and testing with distros like RHEL4? Do you anticipate we will just go day by day with builds trying to stabilize things initially?

In any case, here's what I see when I try to compile with install.sh on RHEL4 U3 x86_64:

...
/tmp/OFED-1.2-20070205-1823/build.sh: line 802: kernel-ib: command not found
Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-docs-1.2-0.src.rpm
Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-docs-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1
Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-scripts-1.2-0.src.rpm
Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-scripts-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1
Running rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ib-bonding-0.9.0-1.src.rpm
Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1
ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1
See log file: /tmp/OFED.10899.log

# tail -10 /tmp/OFED.10899.log
Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/ib-bonding-0.9.0-root
Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1-rh-x86_64.rpm
Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-debuginfo-0.9.0-1-rh-x86_64.rpm
Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.98615
+ umask 022
+ cd /var/tmp/OFEDRPM/BUILD
+ rm -rf ib-bonding-0.9.0
+ exit 0
/bin/mv: cannot stat `/var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm': No such file or directory
ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1

Scott
Re: [openib-general] OFED-1.2 first release
Moving on, I set ib_bonding=n in ofed.conf and try install.sh again, and now get this:

...
Building MVAPICH RPM. Please wait...
Using gcc compiler
Running rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm
ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm
See log file: /tmp/OFED.6120.log

# tail /tmp/OFED.6120.log
+ LANG=C
+ export LANG
+ unset DISPLAY
/var/tmp/rpm-tmp.870: line 33: syntax error near unexpected token `)'
error: Bad exit status from /var/tmp/rpm-tmp.870 (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.870 (%install)
ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm

Scott

From: Scott Weitzenkamp (sweitzen)
Sent: Monday, February 05, 2007 9:27 PM
To: Vladimir Sokolovsky; [EMAIL PROTECTED]; Tziporet Koren; Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: RE: [openib-general] OFED-1.2 first release

Vlad and Tziporet,

It might help if you elaborated on what you meant by first release, you have been saying code freeze but really this is feature freeze, right?
Re: [openib-general] OFED-1.2 first release
Vlad,

> # tail -10 /tmp/OFED.10899.log
> Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1-rh-x86_64.rpm
> Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-debuginfo-0.9.0-1-rh-x86_64.rpm
> Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.98615
> + umask 022
> + cd /var/tmp/OFEDRPM/BUILD
> + rm -rf ib-bonding-0.9.0
> + exit 0
> /bin/mv: cannot stat `/var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm': No such file or directory
> ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1

I see that there is a small difference in the expected RPM name: rpmbuild wrote ib-bonding-0.9.0-1-rh-x86_64.rpm, but the script tries to move ib-bonding-0.9.0-1.x86_64.rpm. Can you fix that in the script, or should we change the name of the RPM?

-- Moni
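For illustration only, here is a minimal Python sketch of one way a build script could avoid this kind of mismatch: instead of assuming the exact file name, read the "Wrote:" lines that rpmbuild printed and pick the package by pattern. This is not how build.sh works, and the function names written_rpms and find_built_rpm are hypothetical.

```python
import fnmatch


def written_rpms(log_text):
    """Collect the paths that rpmbuild reported on its 'Wrote:' lines."""
    paths = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Wrote: "):
            paths.append(line[len("Wrote: "):])
    return paths


def find_built_rpm(paths, name, version, release, arch):
    """Return the paths whose file name matches name-version-release*arch.rpm.

    The '*' tolerates extra text between release and arch, such as the
    '-rh-' seen in the log above. Matching is on the file name only.
    """
    pattern = f"{name}-{version}-{release}*{arch}.rpm"
    return [
        p for p in paths
        if fnmatch.fnmatchcase(p.rsplit("/", 1)[-1], pattern)
    ]
```

Fed the two "Wrote:" lines from the log, find_built_rpm(paths, "ib-bonding", "0.9.0", "1", "x86_64") would select the -rh- package and skip the debuginfo one. Whether the script or the RPM name should change is still the open question above.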
Re: [openib-general] Immediate data question
Hi CQ.

Tang, Changqing wrote:
> Roland:
>
> If I only want to send/recv 4 bytes with immediate data:
>
> On sender side:
>     opcode = IBV_WR_SEND_WITH_IMM;
>     imm_data = my_4_bytes_data;
>
> Do I still need to specify sg_list and num_sge ?

If the only data being sent is the immediate data, no MR needs to be registered on this side. The SR will look like this:

    sr.opcode = IBV_WR_SEND_WITH_IMM;
    sr.imm_data = my_4_bytes_data;
    sr.num_sge = 0;

> On receiver side, because the immediate data is inside the completion
> structure, do I need to post a receive for above message ? If I need to
> post a receive, do I need to specify sg_list and num_sge for the receive ?

On the receiver side you must post an RR (because the SEND opcode consumes an RR). If you are using a UD QP, you must add an s/g list with 40 bytes (of registered memory). If you are not using a UD QP, the s/g list on this side can be empty (num_sge = 0), and the data that was sent will be provided to you in wc.imm_data.

> I looked the spec but did not find useful information. The reason I ask
> is that at some point, I can not(or hard) to provide registered memory
> only for 4 bytes data.

I think that you can avoid registering those 4 bytes ...

Hope this helped you
Dotan
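The buffer arithmetic in the answer above can be sketched as a small Python model. It does not call libibverbs; the names pack_immediate, unpack_immediate and recv_sge_bytes are hypothetical helpers. It assumes the 40 bytes required on a UD QP are the room left for the GRH, and that the application chooses big-endian order when turning its 4 bytes into the 32-bit immediate value.

```python
import struct

# Assumption: the 40 bytes a UD receive must provide are the GRH area.
GRH_BYTES = 40


def pack_immediate(payload: bytes) -> int:
    """Turn exactly 4 payload bytes into one 32-bit immediate value."""
    if len(payload) != 4:
        raise ValueError("immediate data carries exactly 4 bytes")
    (value,) = struct.unpack("!I", payload)  # "!I": big-endian unsigned 32-bit
    return value


def unpack_immediate(value: int) -> bytes:
    """Recover the 4 payload bytes from a received immediate value."""
    return struct.pack("!I", value)


def recv_sge_bytes(qp_type: str, payload_len: int = 0) -> int:
    """Minimum bytes the receive s/g list must cover for one incoming SEND.

    With immediate-only traffic payload_len is 0, so a UD QP still needs
    GRH_BYTES of registered memory while other QP types need none.
    """
    if qp_type == "UD":
        return GRH_BYTES + payload_len
    return payload_len
```

Under these assumptions, recv_sge_bytes("RC") is zero, which matches the num_sge = 0 receive described above, while recv_sge_bytes("UD") stays at 40 even though the 4 bytes of interest arrive in wc.imm_data.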