[ewg] Re: Possible process deadlock in RMPP flow
Eli Cohen wrote:
> On Wed, Sep 23, 2009 at 09:08:28AM -0700, Sean Hefty wrote:
>> What kernel does 1.4.2 map to?
>
> I think OFED 1.4.2 is based on kernel 2.6.27 but they're using RHEL 5.3

Yes, the usual mess: OFED X is based on kernel Y1, but with some additions from kernel Y2 plus plenty of unreviewed and non-merged patches. Distro Z picks OFED X, and the result is 99% unsupportable, as Roland said. Somehow this OFED creature is still hanging around, working on the next damage it's going to bring into this world (code name 1.5).

Eli, here's a little tip for you: I had the displeasure of resolving a bunch of support cases originating from the fact that the two-year-old commit below missed some OFED version (sorry, I forgot the number...). Maybe it will help you as well?

Under a normal setting, if this commit actually solved a bug being hit by many customers, someone would have opened a distro bugzilla case saying "please pick this commit for your kernel", and the customers would have either waited for the next distro update or used an intermediate distro kernel. Currently, I understand that distros are picking OFED versions and that's it.

Or.

commit b61d92d8ae6aa13b17d1c31e69d123879cec2ee2
Author: Sean Hefty sean.he...@intel.com
Date: Fri Nov 30 17:30:18 2007 -0800

    IB/mad: Fix incorrect access to items on local_list

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: Possible process deadlock in RMPP flow
On Thu, Sep 24, 2009 at 09:38:43AM +0300, Or Gerlitz wrote:
> commit b61d92d8ae6aa13b17d1c31e69d123879cec2ee2
> Author: Sean Hefty sean.he...@intel.com
> Date: Fri Nov 30 17:30:18 2007 -0800
>
>     IB/mad: Fix incorrect access to items on local_list

Thanks Or. This one is already in OFED 1.4.2, but apparently this is a different problem. Once I know whether the patch Roland posted fixed it, I will update the list.
[ewg] Re: [PATCH] libehca supported kernel versions
Alexander Schmidt wrote:
> Hi Vlad,
>
> please apply the following patch for install.pl.
>
> Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com
>
> Index: OFED-1.5-20090915-0844/install.pl
> ===
> --- OFED-1.5-20090915-0844.orig/install.pl
> +++ OFED-1.5-20090915-0844/install.pl
> @@ -1646,10 +1646,8 @@ sub set_availability
>      set_compilers();
>
>      # Ehca
> -#    if ($arch =~ m/ppc64|powerpc/ and
> -#            $kernel =~ m/2.6.1[6-9]|2.6.2[0-9]/) {
>      if ($arch =~ m/ppc64|powerpc/ and
> -            $kernel =~ m/2.6.30/) {
> +            $kernel =~ m/2.6.1[6-9]|2.6.2[0-9]|2.6.30/) {
>              $kernel_modules_info{'ehca'}{'available'} = 1;
>              $packages_info{'libehca'}{'available'} = 1;
>              $packages_info{'libehca-devel-static'}{'available'} = 1;

Applied,

Regards,
Vladimir
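Editorial note: the patched condition accepts any kernel string that contains a match for 2.6.16 through 2.6.30. The sketch below evaluates the same alternation in C with POSIX regcomp()/regexec(), only to illustrate which release strings pass. The helper name ehca_kernel_supported() and the constant EHCA_KERNEL_PATTERN are hypothetical and do not appear in install.pl.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Same alternation as the patched install.pl line, as a POSIX extended regex. */
static const char *EHCA_KERNEL_PATTERN = "2.6.1[6-9]|2.6.2[0-9]|2.6.30";

/*
 * Hypothetical helper: returns true if the kernel release string contains a
 * match for the pattern. Like the Perl m// operator, the search is unanchored.
 */
bool ehca_kernel_supported(const char *kernel)
{
    regex_t re;
    bool matched;

    if (regcomp(&re, EHCA_KERNEL_PATTERN, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                 /* pattern failed to compile */
    matched = (regexec(&re, kernel, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With this sketch, "2.6.18-128.el5" and "2.6.27.19-5-default" are accepted and "2.6.31" is rejected. Because the dots are unescaped and the pattern is unanchored, a string such as "2.6.300" would also be accepted; escaping the dots and anchoring the pattern would tighten the check.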
Re: [ewg] [PATCH] kernel_fixes: import a patch to fix bugzilla 1664
Moni Shoua wrote:
> Add commit 5e47596bee12597824a3b5b21e20f80b61e58a35 to kernel fixes.
> This will fix https://bugs.openfabrics.org/show_bug.cgi?id=1664.
>
> Signed-off-by: Moni Shoua mo...@voltaire.com
> ---
>  kernel_patches/fixes/ipoib_0550_check_multicast_address_format.patch | 51 ++
>  1 file changed, 51 insertions(+)

Applied,

Regards,
Vladimir
[ewg] [PATCH] ofed-docs: A comment about ib-bonding and newer kernels
Add comment for ib-bonding and distros that use new kernels (e.g. SLES11)

Signed-off-by: Moni Shoua mo...@voltaire.com
---
 ipoib_release_notes.txt |    5 +
 1 file changed, 5 insertions(+)

diff --git a/ipoib_release_notes.txt b/ipoib_release_notes.txt
index adf1304..3c3d70f 100644
--- a/ipoib_release_notes.txt
+++ b/ipoib_release_notes.txt
@@ -271,6 +271,11 @@ Notes:
 * Using /etc/infiniband/openib.conf to create a persistent configuration is
   no longer supported
 * On RHEL4_U7, cannot set a slave interface as primary.
+* ib-bonding will not be compiled and installed with OFED on OS with kernel
+  that is >= 2.6.27. The bonding driver that comes with those kernels already
+  supports enslaving of IPoIB interfaces. However, there still might be a issue
+  of OS configuration tools (like sysconfig or initscripts) that needs a fix but
+  such issues were not observed yet.

 ===
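Editorial note: the release note turns on whether the running kernel is 2.6.27 or newer. A minimal sketch of such a comparison in C follows. The function name kernel_has_ipoib_bonding() is hypothetical, and the choice to treat an unparsable string as an old kernel is an assumption; the OFED installer's actual check is not shown in this thread.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: true if a release string such as "2.6.27.19-5-default"
 * denotes kernel 2.6.27 or newer. Only the first three numeric fields are read.
 */
bool kernel_has_ipoib_bonding(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
        return false;   /* assumption: unparsable strings count as "old" */

    if (major != 2)
        return major > 2;
    if (minor != 6)
        return minor > 6;
    return patch >= 27;
}
```

Under this sketch a SLES11-style release such as "2.6.27.19-5-default" returns true (the in-kernel bonding driver is used), while a RHEL 5 style release such as "2.6.18-128.el5" returns false (ib-bonding would still be built).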
[ewg] [PATCH] ehca: backports for 2.6.27
Hi Vlad,

please apply the following ehca backports for 2.6.27.

Thanks!

Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com

Index: ofa_kernel-1.5/kernel_patches/backport/2.6.27/ehca-010-undo_cpumask.patch
===
--- /dev/null
+++ ofa_kernel-1.5/kernel_patches/backport/2.6.27/ehca-010-undo_cpumask.patch
@@ -0,0 +1,42 @@
+---
+ drivers/infiniband/hw/ehca/ehca_irq.c |   14 --
+ 1 file changed, 8 insertions(+), 6 deletions(-)
+
+Index: ofa_kernel-1.5/drivers/infiniband/hw/ehca/ehca_irq.c
+===
+--- ofa_kernel-1.5.orig/drivers/infiniband/hw/ehca/ehca_irq.c	2009-07-27 08:20:08.0 -0400
++++ ofa_kernel-1.5/drivers/infiniband/hw/ehca/ehca_irq.c	2009-07-27 08:26:31.0 -0400
+@@ -659,12 +659,12 @@
+ 
+ 	WARN_ON_ONCE(!in_interrupt());
+ 	if (ehca_debug_level >= 3)
+-		ehca_dmp(cpu_online_mask, cpumask_size(), "");
++		ehca_dmp(&cpu_online_map, sizeof(cpumask_t), "");
+ 
+ 	spin_lock_irqsave(&pool->last_cpu_lock, flags);
+-	cpu = cpumask_next(pool->last_cpu, cpu_online_mask);
++	cpu = next_cpu_nr(pool->last_cpu, cpu_online_map);
+ 	if (cpu >= nr_cpu_ids)
+-		cpu = cpumask_first(cpu_online_mask);
++		cpu = first_cpu(cpu_online_map);
+ 	pool->last_cpu = cpu;
+ 	spin_unlock_irqrestore(&pool->last_cpu_lock, flags);
+ 
+@@ -855,7 +855,7 @@
+ 	case CPU_UP_CANCELED_FROZEN:
+ 		ehca_gen_dbg("CPU: %x (CPU_CANCELED)", cpu);
+ 		cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+-		kthread_bind(cct->task, cpumask_any(cpu_online_mask));
++		kthread_bind(cct->task, any_online_cpu(cpu_online_map));
+ 		destroy_comp_task(pool, cpu);
+ 		break;
+ 	case CPU_ONLINE:
+@@ -902,7 +902,7 @@
+ 		return -ENOMEM;
+ 
+ 	spin_lock_init(&pool->last_cpu_lock);
+-	pool->last_cpu = cpumask_any(cpu_online_mask);
++	pool->last_cpu = any_online_cpu(cpu_online_map);
+ 
+ 	pool->cpu_comp_tasks = alloc_percpu(struct ehca_cpu_comp_task);
+ 	if (pool->cpu_comp_tasks == NULL) {
Index: ofa_kernel-1.5/kernel_patches/backport/2.6.27/ehca-020-undo_unsigned_long.patch
===
--- /dev/null
+++ ofa_kernel-1.5/kernel_patches/backport/2.6.27/ehca-020-undo_unsigned_long.patch
@@ -0,0 +1,1005 @@
+Index: ofa_kernel-1.5/drivers/infiniband/hw/ehca/ehca_cq.c
+===
+--- ofa_kernel-1.5.orig/drivers/infiniband/hw/ehca/ehca_cq.c	2009-07-26 09:08:48.0 -0400
++++ ofa_kernel-1.5/drivers/infiniband/hw/ehca/ehca_cq.c	2009-07-27 08:59:04.0 -0400
+@@ -196,7 +196,7 @@
+ 
+ 	if (h_ret != H_SUCCESS) {
+ 		ehca_err(device, "hipz_h_alloc_resource_cq() failed "
+-			 "h_ret=%lli device=%p", h_ret, device);
++			 "h_ret=%li device=%p", h_ret, device);
+ 		cq = ERR_PTR(ehca2ib_return_code(h_ret));
+ 		goto create_cq_exit2;
+ 	}
+@@ -232,7 +232,7 @@
+ 
+ 		if (h_ret < H_SUCCESS) {
+ 			ehca_err(device, "hipz_h_register_rpage_cq() failed "
+-				 "ehca_cq=%p cq_num=%x h_ret=%lli counter=%i "
++				 "ehca_cq=%p cq_num=%x h_ret=%li counter=%i "
+ 				 "act_pages=%i", my_cq, my_cq->cq_number,
+ 				 h_ret, counter, param.act_pages);
+ 			cq = ERR_PTR(-EINVAL);
+@@ -244,7 +244,7 @@
+ 			if ((h_ret != H_SUCCESS) || vpage) {
+ 				ehca_err(device, "Registration of pages not "
+ 					 "complete ehca_cq=%p cq_num=%x "
+-					 "h_ret=%lli", my_cq, my_cq->cq_number,
++					 "h_ret=%li", my_cq, my_cq->cq_number,
+ 					 h_ret);
+ 				cq = ERR_PTR(-EAGAIN);
+ 				goto create_cq_exit4;
+@@ -252,7 +252,7 @@
+ 		} else {
+ 			if (h_ret != H_PAGE_REGISTERED) {
+ 				ehca_err(device, "Registration of page failed "
+-					 "ehca_cq=%p cq_num=%x h_ret=%lli "
++					 "ehca_cq=%p cq_num=%x h_ret=%li "
+ 					 "counter=%i act_pages=%i",
+ 					 my_cq, my_cq->cq_number,
+ 					 h_ret, counter, param.act_pages);
+@@ -266,7 +266,7 @@
+ 
+ 	gal = my_cq->galpas.kernel;
+ 	cqx_fec = hipz_galpa_load(gal, CQTEMM_OFFSET(cqx_fec));
+-	ehca_dbg(device, "ehca_cq=%p cq_num=%x CQX_FEC=%llx",
++	ehca_dbg(device, "ehca_cq=%p cq_num=%x CQX_FEC=%lx",
+ 		 my_cq,
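Editorial note: the first backport in this message (ehca-010-undo_cpumask.patch) only swaps the newer cpumask accessors for their 2.6.27 equivalents; the logic stays a round-robin walk over the online CPUs that wraps back to the first online CPU. The user-space sketch below models that walk with a 64-bit bitmap in place of the kernel's cpumask. All names (NR_CPU_IDS, mask_next, next_online_cpu_rr) are hypothetical, and the spinlock that protects last_cpu in the driver is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 64   /* assumption: at most 64 CPUs, one bit per CPU */

/* Lowest set bit strictly above prev, or NR_CPU_IDS if there is none. */
static int mask_next(int prev, uint64_t online)
{
    for (int cpu = prev + 1; cpu < NR_CPU_IDS; cpu++)
        if (online & (UINT64_C(1) << cpu))
            return cpu;
    return NR_CPU_IDS;
}

/* Round-robin choice: next online CPU after *last_cpu, wrapping to the first. */
int next_online_cpu_rr(int *last_cpu, uint64_t online)
{
    int cpu;

    assert(online != 0);                  /* at least one CPU must be online */
    cpu = mask_next(*last_cpu, online);
    if (cpu >= NR_CPU_IDS)
        cpu = mask_next(-1, online);      /* wrap around to the first online CPU */
    *last_cpu = cpu;
    return cpu;
}
```

Starting from last_cpu = 0 with CPUs 0, 1 and 3 online (bitmap 0xB), successive calls return 1, 3, 0, 1. The offline CPU 2 is skipped and the walk wraps after the highest online CPU, which is the behaviour both the original and the backported accessors are meant to provide.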
[ewg] Re: [PATCH] ehca: backports for 2.6.27
Alexander Schmidt wrote:
> Hi Vlad,
>
> please apply the following ehca backports for 2.6.27.
>
> Thanks!
>
> Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com

Applied,

Regards,
Vladimir
[ewg] RE: Possible process deadlock in RMPP flow
> Thanks Or. This one is already in OFED 1.4.2 but apparently this is a different
> problem. Once I have information whether the patch Roland posted fixed it I will
> update the list.

If ibnetdiscover doesn't use RMPP as Hal indicated, I don't think Roland's patch will help.