Fwd: [ewg] Re: [PATCH v3] mlx4_ib: Optimize hugetlab pages support
Vlad,

Can you please replace mlx4_1070-optimize-huge_tlb.patch with Yossi's patch? It fixes Eli's patch.

Thanks
Olga

-- Forwarded message --
From: Yossi Etigin yos...@voltaire.com
Date: Mon, Mar 30, 2009 at 6:49 PM
Subject: [ewg] Re: [PATCH v3] mlx4_ib: Optimize hugetlab pages support
To: Eli Cohen e...@mellanox.co.il
Cc: Roland Dreier rdre...@cisco.com, ewg ewg@lists.openfabrics.org, general-list gene...@lists.openfabrics.org

Eli Cohen wrote:
> Since Linux may not merge adjacent pages into a single scatter entry through
> calls to dma_map_sg(), we check the special case of hugetlb pages which are
> likely to be mapped to contiguous dma addresses and if they are, take advantage
> of this. This will result in a significantly lower number of MTT segments used
> for registering hugetlb memory regions.

How about the one below - it fixes bugzilla #1569 (fix mapping for size that is not on page boundary):

---
Since Linux may not merge adjacent pages into a single scatter entry through
calls to dma_map_sg(), we check the special case of hugetlb pages which are
likely to be mapped to contiguous dma addresses and if they are, take advantage
of this. This will result in a significantly lower number of MTT segments used
for registering hugetlb memory regions.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/hw/mlx4/mr.c | 81 ++
 1 files changed, 72 insertions(+), 9 deletions(-)

Index: b/drivers/infiniband/hw/mlx4/mr.c
===
--- a/drivers/infiniband/hw/mlx4/mr.c 2008-11-19 21:32:15.0 +0200
+++ b/drivers/infiniband/hw/mlx4/mr.c 2009-03-30 18:29:55.0 +0300
@@ -119,6 +119,70 @@ out:
 	return err;
 }
 
+static int handle_hugetlb_user_mr(struct ib_pd *pd, struct mlx4_ib_mr *mr,
+				  u64 start, u64 virt_addr, int access_flags)
+{
+#if defined(CONFIG_HUGETLB_PAGE) && !defined(__powerpc__) && !defined(__ia64__)
+	struct mlx4_ib_dev *dev = to_mdev(pd->device);
+	struct ib_umem_chunk *chunk;
+	unsigned dsize;
+	dma_addr_t daddr;
+	unsigned cur_size = 0;
+	dma_addr_t uninitialized_var(cur_addr);
+	int n;
+	struct ib_umem *umem = mr->umem;
+	u64 *arr;
+	int err = 0;
+	int i;
+	int j = 0;
+	int off = start & (HPAGE_SIZE - 1);
+
+	n = DIV_ROUND_UP(off + umem->length, HPAGE_SIZE);
+	arr = kmalloc(n * sizeof *arr, GFP_KERNEL);
+	if (!arr)
+		return -ENOMEM;
+
+	list_for_each_entry(chunk, &umem->chunk_list, list)
+		for (i = 0; i < chunk->nmap; ++i) {
+			daddr = sg_dma_address(&chunk->page_list[i]);
+			dsize = sg_dma_len(&chunk->page_list[i]);
+			if (!cur_size) {
+				cur_addr = daddr;
+				cur_size = dsize;
+			} else if (cur_addr + cur_size != daddr) {
+				err = -EINVAL;
+				goto out;
+			} else
+				cur_size += dsize;
+
+			if (cur_size > HPAGE_SIZE) {
+				err = -EINVAL;
+				goto out;
+			} else if (cur_size == HPAGE_SIZE) {
+				cur_size = 0;
+				arr[j++] = cur_addr;
+			}
+		}
+
+	if (cur_size) {
+		arr[j++] = cur_addr;
+	}
+
+	err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, umem->length,
+			    convert_access(access_flags), n, HPAGE_SHIFT, &mr->mmr);
+	if (err)
+		goto out;
+
+	err = mlx4_write_mtt(dev->dev, &mr->mmr.mtt, 0, n, arr);
+
+out:
+	kfree(arr);
+	return err;
+#else
+	return -ENOSYS;
+#endif
+}
+
 struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 				  u64 virt_addr, int access_flags,
 				  struct ib_udata *udata)
@@ -140,17 +204,20 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct
 		goto err_free;
 	}
 
-	n = ib_umem_page_count(mr->umem);
-	shift = ilog2(mr->umem->page_size);
-
-	err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
-			    convert_access(access_flags), n, shift, &mr->mmr);
-	if (err)
-		goto err_umem;
-
-	err = mlx4_ib_umem_write_mtt(dev, &mr->mmr.mtt, mr->umem);
-	if (err)
-		goto err_mr;
+	if (!mr->umem->hugetlb ||
+	    handle_hugetlb_user_mr(pd, mr, start, virt_addr, access_flags)) {
+		n = ib_umem_page_count(mr->umem);
+		shift = ilog2(mr->umem->page_size);
+
+		err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
+
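The coalescing idea in the patch above can be tried outside the kernel. The following is only a minimal userspace sketch, assuming a hypothetical `struct seg` in place of the DMA-mapped scatter entries and a hypothetical helper name `coalesce_hugepages`; it is not the driver code, just the same walk: accumulate contiguous segments, emit one base address per full huge page, and reject anything that is not contiguous.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter entry. */
struct seg {
	uint64_t addr;	/* bus address of the segment */
	uint32_t len;	/* length in bytes */
};

/*
 * Collapse small contiguous segments into one base address per huge page.
 * Returns the number of entries written to out, or -1 if the segments are
 * not contiguous, a run overshoots hpage_size, or out is too small.
 */
static int coalesce_hugepages(const struct seg *segs, size_t nsegs,
			      uint64_t hpage_size,
			      uint64_t *out, size_t out_cap)
{
	uint64_t cur_addr = 0, cur_size = 0;
	size_t j = 0;

	for (size_t i = 0; i < nsegs; ++i) {
		if (cur_size == 0) {
			/* start a new run */
			cur_addr = segs[i].addr;
			cur_size = segs[i].len;
		} else if (cur_addr + cur_size != segs[i].addr) {
			return -1;	/* gap: not contiguous */
		} else {
			cur_size += segs[i].len;
		}

		if (cur_size > hpage_size)
			return -1;	/* run crosses a huge page boundary */
		if (cur_size == hpage_size) {
			if (j == out_cap)
				return -1;
			out[j++] = cur_addr;
			cur_size = 0;
		}
	}

	/* a trailing partial huge page still needs one entry */
	if (cur_size) {
		if (j == out_cap)
			return -1;
		out[j++] = cur_addr;
	}
	return (int)j;
}
```

A caller that gets -1 would fall back to the per-page path, which mirrors how the patch falls back to the regular registration when the hugetlb handler returns an error.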
***SPAM*** Re: [ewg] RE: Delaying next Monday OFED meeting
Both dates are OK with us

On Thu, Mar 5, 2009 at 4:02 PM, John Russo john.ru...@qlogic.com wrote:
> Let's go for the 12th.
>
> From: ewg-boun...@lists.openfabrics.org [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
> Sent: Thursday, March 05, 2009 7:23 AM
> To: ewg@lists.openfabrics.org
> Subject: [ewg] Delaying next Monday OFED meeting
>
> Hello,
> Due to the Purim holiday in Israel I wish to delay the next Monday OFED meeting.
> We can do it next week on Thursday (12 March) 9am PST or delay to a week after on Monday (March 16) 9am PST
> Can you reply with your availability?
> Sorry for the inconvenience.
> Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
***SPAM*** Re: [ewg] OFED (EWG) meeting agenda for tomorrow (Jan 26)
> 3. OFED 1.5 schedule
> Betsy from Qlogic suggested an earlier release. On the other hand, Olga from Voltaire asked to stay with the July time frame.
> Based on the decisions in 1 & 2 we should decide on the release schedule.

We should decide whether we want to have one or two OFED releases per year. If we decide to go for one OFED release per year, I think we should postpone the OFED 1.5 release to October and have a dot release in the middle.
***SPAM*** Re: [ewg] OFED Jan 5, 2009 meeting minutes on OFED plans
> - Kernel base will be 2.6.29

Hi,

The kernel 2.6.29 window will be closed very soon, which means that we cannot have any new features in this kernel, and therefore no new features in OFED 1.5. I think we should be based on 2.6.30.

I also agree with Tziporet regarding the OFED 1.5 schedule: there is no need to rush. OFED is mature enough, so there is no need to have a release every half year.

Olga
***SPAM*** Re: [ewg] OFED Nov 24, 2008 meeting minutes
> OFED 1.4 release: RC6 on Nov 28, GA on Dec 8

Hi,

Are you going to build RC6 today/tomorrow? I see that there are still a lot of major bugs. Maybe we should wait?

Olga
***SPAM*** Re: [ewg] RE: Do we have an EWG meeting today?
I got an email from Jeff:

Friendly reminder: the OFED teleconference is today (24 November 2008).

1. Noon US Eastern / 9am US Pacific / 7pm Israel
   Monday, November 24, code 210020028 (*** TODAY ***)
2. Noon US Eastern / 9am US Pacific / 7pm Israel
   Monday, December 1, code 210020028

US/Canada: +1.866.432.9903
India: +91.80.4103.3979
Israel: +972.9.892.7026
Others: http://cisco.com/en/US/about/doing_business/conferencing/

On Mon, Nov 24, 2008 at 6:57 PM, Woodruff, Robert J [EMAIL PROTECTED] wrote:
> I can set up a bridge number if we want to meet.
>
> woody
>
> -Original Message-
> From: Tziporet Koren [mailto:[EMAIL PROTECTED]
> Sent: Monday, November 24, 2008 5:10 AM
> To: Woodruff, Robert J; Betsy Zeller; Olga Shern
> Subject: Do we have an EWG meeting today?
>
> I thought we decided to have one but I don't see such a meeting in my calendar
>
> Tziporet
***SPAM*** Re: [ewg] OFED 1.4 - delay the GA to Dec 4
> 1370 blo [EMAIL PROTECTED] Ping over IPoIB I/F fails after ifconfig down and up

Yossi has sent a patch that fixes this.

> 1198 cri [EMAIL PROTECTED] hang during ipoib create_child/ifdown

We sent a patch to Roland some time ago. But it was decided in the EWG meeting that we will not add the patch to OFED 1.4, because:
1. It is rare that a user will run such a test
2. This is an old bug that wasn't introduced in OFED 1.4
If you think this is another bug, we should open a new one.

> 1289 maj [EMAIL PROTECTED] Ib and ipoib doesnt respond while running multiple tests ...

It seems that this was already fixed. We only need to retest it and verify that it is indeed fixed.
***SPAM*** Re: [ewg] OFED 1.4 bugs status and OFED meetings
Hi Vlad,

Was this bug fixed?
1349 maj [EMAIL PROTECTED] Kernel panic on sdp

Olga
Re: [ewg] rhel4.6 testing
I assume you mean OFED 1.4. We have run regression tests on it. Do you see any problem?

On Thu, Nov 6, 2008 at 9:12 PM, Steve Wise [EMAIL PROTECTED] wrote:
> Has anyone tested the core rdma stuff on rhel4.6?
>
> Thanks,
> Steve.
***SPAM*** Re: [ewg] OFED October 27 2008 meeting summary on OFED 1.4 status
> 2. We had a discussion on NFS-RDMA since both RHEL 5.1 and SLES10 SP2 backports are not working well
> We had a debate - do we take it out of OFED since it is not working on the distros
> Leave it in: We can have bug fixes for 1.4.1, and give customers a platform to play with
> Take it out: If someone tries it on the distros, the experience can be problematic
> Decision: We will leave it for 2.6.27 kernel only. All testing should be done on this kernel, mainly to see that basic functionality is working

We have tested NFSoRDMA on 2.6.27 and didn't see any of the issues that we see on the distros. So basic functionality is working.
***SPAM*** Re: [ewg] ***SPAM*** NFS-RDMA compilation problem
Hi Amar,

I suggest you open a bug in the OpenFabrics bugzilla: https://bugs.openfabrics.org/.

Thanks
Olga

On Thu, Oct 23, 2008 at 4:50 PM, Amar Mudrankit [EMAIL PROTECTED] wrote:
> While I was trying to install OFED-1.4-rc3 over SLES 10 SP 2 with NFS-RDMA selected for installation, I got the following error message:
>
> nfs-utils-1.1.1 rpm is required to build kernel-ib
>
> I have downloaded and successfully installed the nfs-utils-1.1.4 **source .tgz** from http://www.kernel.org/pub/linux/utils/nfs, still I was hit with the same error message.
>
> I was not able to find an nfs-utils rpm that would install over SLES 10 SP 2. Can anybody please point me to the location of the rpm?
>
> Why is the OFED installation unable to detect the latest installation of nfs utils compiled from source, and why is it fully dependent upon the rpm installation?
>
> Regards,
> Amar
[ewg] ***SPAM*** Re: [ofa-general] OFED-1.4-rc3 is available
> - 27 bugs fixed (see attached for details)

Hi Vlad,

I don't see the attached file.

Olga
Re: [ewg] Re: Continue of defer skb_orphan() until irqs enabled
We ran regression tests and they were OK. We will continue the testing and update if we see any issues.

Olga

On Sun, Sep 28, 2008 at 2:40 PM, Olga Shern (Voltaire) [EMAIL PROTECTED] wrote:
> Hi Eli,
>
> We also want to run regression tests with this patch. Please let me know when the OFED daily build will include it.
>
> Thanks
> Olga
>
> On Sun, Sep 28, 2008 at 2:39 PM, Eli Cohen [EMAIL PROTECTED] wrote:
>> On Fri, Sep 26, 2008 at 01:19:00PM -0700, Roland Dreier wrote:
>>> How about this? Instead of trying to rely on some complicated and fragile reasoning about when some race might occur, let's just do what we want to do anyway and get rid of LLTX. We change from priv->tx_lock (taken with IRQ disabling) to netif_tx_lock (taken on with BH-disabling). And then we can keep the skb_orphan in the place it is, since our xmit routine runs with IRQs enabled.
>>
>> We'll integrate this into ofed 1.4 and monitor this through our regression system.
Re: [ewg] OFED build hangs when trying to build sdpnetstat
Hi,

Please see bugzilla: https://bugs.openfabrics.org/show_bug.cgi?id=1238

Olga

On Mon, Sep 29, 2008 at 10:17 PM, Woodruff, Robert J [EMAIL PROTECTED] wrote:
> Has anyone else seen a problem with the OFED install in today's daily build hanging while trying to build sdpnetstat?
>
> Here are the last few lines in the log file after I did a ctrl-c. The hang seems to happen both on EL 5.2 (2.6.18-92.el5) and EL 5.1 (2.6.18-53.el5).
>
> + unset DISPLAY
> + make netstat
> Configuring the Linux net-tools (NET-3 Base Utilities)...
> *
> *
> * Internationalization
> *
> * The net-tools package has currently been translated to French,
> * German and Brazilian Portugese. Other translations are, of
> * course, welcome. Answer `n' here if you have no support for
> * internationalization on your system.
> *
> Does your system support GNU gettext? (I18N) [n]
> *
> *
> * Protocol Families.
> *
> UNIX protocol family (HAVE_AFUNIX) [y]
> INET (TCP/IP) protocol family (HAVE_AFINET) [y]
> INET6 (IPv6) protocol family (HAVE_AFINET6) [n]
> make: *** Deleting file `config.h'
> make: *** wait: No child processes. Stop.
> make: *** Waiting for unfinished jobs
> make: *** wait: No child processes. Stop.
Re: [ewg] Re: Continue of defer skb_orphan() until irqs enabled
Hi Eli,

We also want to run regression tests with this patch. Please let me know when the OFED daily build will include it.

Thanks
Olga

On Sun, Sep 28, 2008 at 2:39 PM, Eli Cohen [EMAIL PROTECTED] wrote:
> On Fri, Sep 26, 2008 at 01:19:00PM -0700, Roland Dreier wrote:
>> How about this? Instead of trying to rely on some complicated and fragile reasoning about when some race might occur, let's just do what we want to do anyway and get rid of LLTX. We change from priv->tx_lock (taken with IRQ disabling) to netif_tx_lock (taken on with BH-disabling). And then we can keep the skb_orphan in the place it is, since our xmit routine runs with IRQs enabled.
>
> We'll integrate this into ofed 1.4 and monitor this through our regression system.
***SPAM*** Re: [ewg] ***SPAM*** Regarding: bonding issue.
Hi Gnana,

First, I would recommend using OFED 1.3.1.
How did you configure bonding? Please check whether your configuration follows the instructions in /usr/share/doc/packages/ib-bonding-0.9.0/ib-bonding.txt

Best Regards,
Olga
[ewg] ***SPAM*** OFED installation on SLES 10
Hi Vlad,

I tested the OFED 1.4 beta installation on a SLES 10 minimal installation, and all dependency checks were OK except the kernel sources check. I think that for SLES we should add a check for whether the kernel-source rpm is installed.
I attached a patch that should fix it.

Thanks
Olga

install.diff
Description: Binary data
***SPAM*** Re: [ewg] ***SPAM*** OFED installation on SLES 10
Thanks Vlad,

It works :)
***SPAM*** Re: [ewg] OFED installation on RH5 UP2
Hi Vlad,

I found another issue with openmpi rpm removal on RH5 UP2. See the attached patch.

Thanks
Olga

uninstall.patch
Description: Binary data
[ewg] Fwd: [ofa-general] [PATCH v3] ib/core: fix for send multicast group send leave retry
Hi Vlad,

Please add this patch to OFED 1.4. It is in Roland's tree for 2.6.28.

Thanks
Olga

-- Forwarded message --
From: Yossi Etigin [EMAIL PROTECTED]
Date: Aug 11, 2008 7:35 PM
Subject: [ofa-general] [PATCH v3] ib/core: fix for send multicast group send leave retry
To: Roland Dreier [EMAIL PROTECTED]
Cc: Olga Shern [EMAIL PROTECTED], general list [EMAIL PROTECTED], Ron Livne [EMAIL PROTECTED]

Until now, there was a retry mechanism only if joining a multicast group failed. This patch adds a mechanism that retries leaving a multicast group before giving up.

Changes from v1:
- Save the leave state because it's overridden
- use 'else'
Changes from v2:
- Call mcast_work_handler() when send_leave() fails

Signed-off-by: Ron Livne [EMAIL PROTECTED]
Signed-off-by: Yossi Etigin [EMAIL PROTECTED]

Index: b/drivers/infiniband/core/multicast.c
===
--- a/drivers/infiniband/core/multicast.c 2008-08-11 19:13:26.0 +0300
+++ b/drivers/infiniband/core/multicast.c 2008-08-11 19:34:21.0 +0300
@@ -106,6 +106,8 @@ struct mcast_group {
 	struct ib_sa_query *query;
 	int query_id;
 	u16 pkey_index;
+	u8 leave_state;
+	int retries;
 };
 
 struct mcast_member {
@@ -350,6 +352,7 @@ static int send_leave(struct mcast_group
 
 	rec = group->rec;
 	rec.join_state = leave_state;
+	group->leave_state = leave_state;
 
 	ret = ib_sa_mcmember_rec_query(&sa_client, port->dev->device,
 				       port->port_num, IB_SA_METHOD_DELETE, &rec,
@@ -542,7 +545,11 @@ static void leave_handler(int status, st
 {
 	struct mcast_group *group = context;
 
-	mcast_work_handler(&group->work);
+	if (status && (group->retries > 0) &&
+	    !send_leave(group, group->leave_state))
+		group->retries--;
+	else
+		mcast_work_handler(&group->work);
 }
 
 static struct mcast_group *acquire_group(struct mcast_port *port,
@@ -565,6 +572,7 @@ static struct mcast_group *acquire_group
 	if (!group)
 		return NULL;
 
+	group->retries = 3;
 	group->port = port;
 	group->rec.mgid = *mgid;
 	group->pkey_index = MCAST_INVALID_PKEY_INDEX;
--
--Yossi
___
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
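The retry logic in the leave handler above can be exercised in isolation. This is only a rough userspace sketch: `struct group`, the always-succeeding `send_leave()`, and `work_handler()` are simplified, hypothetical stand-ins for the kernel structures and the SA query, and the `sends`/`done` counters exist purely so the behaviour can be observed.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for struct mcast_group. */
struct group {
	int retries;			/* remaining leave retries */
	unsigned char leave_state;	/* saved so a retry can resend it */
	int sends;			/* leave requests issued so far */
	bool done;			/* set once normal processing resumes */
};

/* Stand-in for the SA query: records the request and reports success. */
static int send_leave(struct group *g, unsigned char leave_state)
{
	g->leave_state = leave_state;
	g->sends++;
	return 0;
}

/* Stand-in for mcast_work_handler(): continue with the group's work. */
static void work_handler(struct group *g)
{
	g->done = true;
}

/*
 * Completion callback for a leave request. A non-zero status means the
 * leave failed; resend it while retries remain, otherwise move on.
 */
static void leave_handler(int status, struct group *g)
{
	if (status && g->retries > 0 && !send_leave(g, g->leave_state))
		g->retries--;
	else
		work_handler(g);
}
```

With `retries` initialised to 3, three failed completions each trigger a resend, and the fourth failure (or any success) falls through to the work handler, so a persistent failure cannot loop forever.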
Re: [ewg] Ofed1.3 bonding problem
On 7/15/08, Acero Fernandez Alicia [EMAIL PROTECTED] wrote:
> Hi everybody,
>
> I have tried what is explained in the ib-bonding.txt file, but it doesn't work.
> What are the lines in the /etc/modprobe.conf for Redhat Enterprise linux 4 up 5? Or perhaps there is some more information needed.

There is no need to add anything to /etc/modprobe.conf if your OS is RH4 UP5.

> Could anyone help me?

Can you please send your network scripts for bonding, the ifconfig output, and dmesg?

Thanks
Olga
Re: [ewg] Ofed1.3 bonding problem
Hi Alicia,

You are right: the bonding package inside OFED replaces the bonding package that is installed on your OS. (ib-bonding is installed under /lib/modules/`uname -r`/updates, so if you remove it, you will have native bonding working.)

I assume that your OS is RH 4. Indeed, in OFED 1.3 there was a bug where Ethernet bonding didn't work on RH4, but there is a workaround in OFED 1.3.1.
The ib-bonding rpm includes an ib-bonding.txt file with instructions on how to configure Ethernet bonding (/usr/share/doc/packages/ib-bonding-0.9.0/ib-bonding.txt):

3.3 Configuring Ethernet slaves
---
It is not possible to have a mix of Ethernet slaves and IPoIB slaves under the same bonding master. It is possible however that a bonding master of Ethernet slaves and a bonding master of IPoIB slaves will co-exist in one machine.
To configure Ethernet slaves under a bonding master use the same instructions as for IPoIB slaves (according to the OS) with one exception. When working under Redhat-AS4 do the following when configuring a bonding master with Ethernet slaves
- In the master configuration file add the line SLAVEDEV=1
- In the slave configuration file leave the line TYPE=InfiniBand

This bug will be fixed in OFED 1.4. Please let me know if it helps.

Best Regards
Olga

On 7/10/08, Acero Fernandez Alicia [EMAIL PROTECTED] wrote:
> Hi everybody,
>
> I am trying to install ofed1.3 in my cluster. I have done ethernet bonding in some network interfaces and I would like to install ofed1.3, but when I try it the ethernet network connection is lost.
>
> I have been looking for a solution, and I have found that the module name for ethernet bonding and for infiniband bonding is the same. Perhaps that is the reason, is it true? In that case, how could I solve it? It doesn't seem to be solved in ofed1.3.1 because I have tried to install it and the same happens.
>
> Could you help me, please?
>
> Regards
> Alicia
Re: [ewg] Agenda for the OFED meeting today (May 5)
On 5/19/08, Tziporet Koren [EMAIL PROTECTED] wrote:
> Hi,
> This is the agenda for the OFED meeting today:
>
> 1. OFED 1.3.1:
> 1.1 Schedule:
>     rc1 - done on May 6
>     rc2 - May 22 <== I propose to delay to Thursday since there are few IPOIB bugs on work
>     GA - May 29
> 1.2 OS support:
>     SLES10 SP2 backports were done (thanks to Moshe from Voltaire)
>     There is a request for RHEL 5.2 - who has this OS and can help with the backports?
> 1.3 Bugs status
>     Please set release version 1.3.1 for all bugs that should be resolved in 1.3.1
>     In the way the bugs are assigned today it is very hard to extract the relevant bugs for the release.
>     This is the list of bugs that should be resolved to my best knowledge (please add more):

There is also bug number 1004:
1004 https://bugs.openfabrics.org/show_bug.cgi?id=1004 maj P2 RHEL [EMAIL PROTECTED] IPoIB failed on stress testing

>     1024 normal [EMAIL PROTECTED] Bonding-Ping not recovery after reconnect the non active interface
>     1027 normal [EMAIL PROTECTED] kernel panic in mad.c handle_outgoing_dr_smp with RESULT_CONSUMED
>     1031 normal [EMAIL PROTECTED] OpenSM fat tree routing thinks fat tree isn't
>     1032 critical [EMAIL PROTECTED] RHEL 5.1 and OFED 1.3 cannot write IO blocks greater than 1024.
>     1038 normal [EMAIL PROTECTED] Kernel panic while running tcp/ip ltp tests
>     1040 normal [EMAIL PROTECTED] Kernel Oops during port up/down test
>     1041 normal [EMAIL PROTECTED] Install Failed with memtrack flag in the conf file
>     1042 normal [EMAIL PROTECTED] ofed-1.3.1 install fails
>
> 2. OFED 1.4:
>     - Kernel rebase status: we have prepared the new tree, make-dist pass but compilation still fails. Any help to resolve compilation issues is welcome.
>       URL: git://git.openfabrics.org/ofed_1_4/linux-2.6.git ofed_kernel
>     - Update from the participants (mainly on new components/features):
>       - NFSoRDMA - Jeff
>       - Management - Sasha
>       - Multiple EQs to best fit multi-core systems - we try to define it with Roland
>       - RDMA CM to support IPv6 - Woody any news on this?
>       - IB BMME and iWARP equivalent memory extensions - under progress on the general list
>
> 3. Open discussion
>     - Upgrade memory in the OFA server: This request raised long time ago and we had a promise to do it after 1.3 release. What is the status?
>     - Other topics ...
>
> Tziporet
Re: [ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index
On 5/14/08, Vladimir Sokolovsky [EMAIL PROTECTED] wrote:
> Olga Shern (Voltaire) wrote:
>>> Hello Olga,
>>> This patch can't be applied as is to the ofed-1.3.1 git tree:
>>>
>>> patching file drivers/infiniband/ulp/ipoib/ipoib_cm.c
>>> Hunk #1 succeeded at 847 (offset -160 lines).
>>> patching file drivers/infiniband/ulp/ipoib/ipoib_ib.c
>>> Hunk #1 succeeded at 488 (offset -106 lines).
>>> Hunk #2 FAILED at 729.
>>> 1 out of 2 hunks FAILED -- saving rejects to file drivers/infiniband/ulp/ipoib/ipoib_ib.c.rej
>>>
>>> Can you recreate this patch against git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel?
>>>
>>> Regards,
>>> Vladimir
>>
>> It was applied without issues on OFED 1.3, therefore I sent it as is.
>> I will recreate it against OFED 1.3.1
>>
>> Olga
>
> I added this patch (as is) to the ofed-1.3.1 kernel git tree as
> kernel_patches/fixes/ipoib_0360_Handle_case_when_P_Key_is_deleted.patch.
>
> Thanks,
> Regards,
> Vladimir

Great, Thanks
Olga
Re: [ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index
> Hello Olga,
> This patch can't be applied as is to the ofed-1.3.1 git tree:
>
> patching file drivers/infiniband/ulp/ipoib/ipoib_cm.c
> Hunk #1 succeeded at 847 (offset -160 lines).
> patching file drivers/infiniband/ulp/ipoib/ipoib_ib.c
> Hunk #1 succeeded at 488 (offset -106 lines).
> Hunk #2 FAILED at 729.
> 1 out of 2 hunks FAILED -- saving rejects to file drivers/infiniband/ulp/ipoib/ipoib_ib.c.rej
>
> Can you recreate this patch against git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel?
>
> Regards,
> Vladimir

It was applied without issues on OFED 1.3, therefore I sent it as is.
I will recreate it against OFED 1.3.1

Olga
Re: [ewg] RE: [ofa-general] OFED May 5 meeting summary
On 5/12/08, Tziporet Koren [EMAIL PROTECTED] wrote:
> Moshe Kazir wrote:
>> I have checked OFED-1.3.1-rc1 on SLES10 SP 2 Beta3.
>> ib-bonding compile failed. Everything else compiled OK.
>> Attached: ib-bonding error log.
>> I'll take the backport of ib-bonding to sles10 sp 2 on me (if needed, I'll get Moni's help).
>
> Thanks
> Please update when done.
> Any need for a change in the install script?

It seems that there is no need for changes in the install script. I will update you.

> Tziporet
[ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index
Hi Vlad,

Please add this patch to OFED 1.3.1.
In addition to its main purpose, this patch also fixes issues we saw with partitioning and SM failover, because of:

*Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey() everywhere in IPoIB, since none of the places that look for P_Keys are in a fast path or in non-sleeping context, and in general we want to kill off the whole caching infrastructure eventually. This also fixes consistency problems caused because some IPoIB queries were cached and some were uncached during the window where the cache was not updated.*

Thanks
Olga

-- Forwarded message --
From: Roland Dreier [EMAIL PROTECTED]
Date: Apr 15, 2008 8:55 AM
Subject: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index
To: [EMAIL PROTECTED]

If a P_Key is deleted and then re-added at the same index, then IPoIB gets confused because __ipoib_ib_dev_flush() only checks whether the index is the same without checking whether the P_Key was present, so the interface is stopped when the P_Key is deleted, but the event when the P_Key is re-added gets ignored and the interface never gets restarted.

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey() everywhere in IPoIB, since none of the places that look for P_Keys are in a fast path or in non-sleeping context, and in general we want to kill off the whole caching infrastructure eventually. This also fixes consistency problems caused because some IPoIB queries were cached and some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam [EMAIL PROTECTED] for debugging this problem and testing this fix.

Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |    4 ++--
 drivers/infiniband/ulp/ipoib/ipoib_ib.c |   10 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 9d411f2..9db7b0b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1007,9 +1007,9 @@ static int ipoib_cm_modify_tx_init(struct net_device *dev,
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ib_qp_attr qp_attr;
 	int qp_attr_mask, ret;
-	ret = ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &qp_attr.pkey_index);
+	ret = ib_find_pkey(priv->ca, priv->port, priv->pkey, &qp_attr.pkey_index);
 	if (ret) {
-		ipoib_warn(priv, "pkey 0x%x not in cache: %d\n", priv->pkey, ret);
+		ipoib_warn(priv, "pkey 0x%x not found: %d\n", priv->pkey, ret);
 		return ret;
 	}
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 8b4ff69..0205eb7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -594,7 +594,7 @@ static void ipoib_pkey_dev_check_presence(struct net_device *dev)
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	u16 pkey_index = 0;
 
-	if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index))
+	if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &pkey_index))
 		clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
 	else
 		set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
@@ -835,13 +835,13 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv, int pkey_event)
 			clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
 			ipoib_ib_dev_down(dev, 0);
 			ipoib_ib_dev_stop(dev, 0);
-			ipoib_pkey_dev_delay_open(dev);
-			return;
+			if (ipoib_pkey_dev_delay_open(dev))
+				return;
 		}
-		set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
 
 		/* restart QP only if P_Key index is changed */
-		if (new_index == priv->pkey_index) {
+		if (test_and_set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags) &&
+		    new_index == priv->pkey_index) {
 			ipoib_dbg(priv, "Not flushing - P_Key index not changed.\n");
 			return;
 		}
--
1.5.5
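The effect of the test_and_set_bit() change can be seen in a small model. The sketch below is a deliberately simplified assumption, not the driver: `struct iface` and `pkey_flush()` are hypothetical names, the P_Key table lookup is reduced to a `found` flag plus an index, and the delayed-open path is left out. It only illustrates why checking "was the P_Key assigned before?" together with "did the index change?" lets a re-add at the same index restart the interface.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the per-interface P_Key state. */
struct iface {
	bool pkey_assigned;	/* models the IPOIB_PKEY_ASSIGNED flag */
	int pkey_index;		/* index the interface is currently using */
	bool running;
	int restarts;		/* counts restarts, for observation only */
};

/*
 * Handle a P_Key change event. found == false models the P_Key being
 * absent from the table; new_index is its position when present.
 */
static void pkey_flush(struct iface *p, bool found, int new_index)
{
	if (!found) {
		/* P_Key deleted: mark it unassigned and stop the interface */
		p->pkey_assigned = false;
		p->running = false;
		return;
	}

	/* analogue of test_and_set_bit(): remember old value, then set */
	bool was_assigned = p->pkey_assigned;
	p->pkey_assigned = true;

	/* skip the restart only if the key was present all along */
	if (was_assigned && new_index == p->pkey_index)
		return;

	p->pkey_index = new_index;
	p->running = true;
	p->restarts++;
}
```

With an index-only comparison, the third step of "present at index 2, deleted, re-added at index 2" would be treated as "nothing changed" and the interface would stay down; tracking the assigned flag is what distinguishes the two cases.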
Re: [ewg] OFED May 5 meeting summary
On 5/6/08, Tziporet Koren [EMAIL PROTECTED] wrote:
> May 5 OFED meeting summary:
> ===
> 1. OFED 1.3.1:
> 1.1 Status of changes:
>     IB-bonding - on work
>     SRP failover - done (need more testing)
>     SDP crashes - on work (not clear if we will have something on time)
>     RDS fixes for RDMA API - done
>     librdmacm 1.0.7 - done
>     uDAPL updates - done
>     Open MPI 1.2.6 - done
>     MVAPICH 1.0.1 - done
>     MVAPICH2 1.0.3 - done
>     IPoIB - 2 bugs fixed. There are still two issues that should be resolved.
>     Low level drivers: Changes that already committed: nes mlx4 cxgb3 ehca
> 1.2 Schedule:
>     rc1 - was released today
>     rc2 - May 20
>     GA - May 29
> 1.3 Discussion:
>     - ipath driver is going to be updated
>     - There is an issue of bonding and Ethernet drivers on RHEL4 - under debug
>     - We wish to add support for SLES10 SP2. Already got an approval from Novell
>       Any volunteer to provide the new backport patches?

Tziporet, we will do it. We have already started on it; it seems that everything compiles, and only bonding still needs to be backported.

Olga

> 2. OFED 1.4:
>     Updated that the new tree will be ready next week - based on 2.6.26-rc
>
> 3. Update on OpenSuSE build system - Yiftah updated on the work that is done and problems:
>     - The system requires clean RPMs only (no use of install script) - they work to resolve
>     - We target this system toward releases (and not to replace the daily build system).
>     - we may try now with OFED 1.3.1
>
> Tziporet