Re: [ewg] OFA server fs is full
I've been extremely busy these past few weeks and haven't had a chance to do much related to SA work on the server. I'll look into this tonight, but any migration might have to wait until the weekend.

On Thu, Jul 23, 2009 at 12:24 PM, Jeff Becker wrote:
> Hi Tziporet. I believe Ido is working on moving us to the new server.
>
> -jeff
>
> Tziporet Koren wrote:
>> Sasha Khapyorsky wrote:
>>
>>> Now there is:
>>>
>>> Filesystem           1K-blocks      Used Available Use% Mounted on
>>> /dev/sda1            151873632 139556036   4602784  97% /
>>>
>>> We will have a next "overflow" just in few days.
>>>
>> Jeff
>> When will we move to the new server?
>> Can Ido help with this?
>>
>> Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] RE: Compile error on 7/14 daily build of OFED-1.5
This compile error still exists in today's daily build. I did a little investigation and it looks like it is an include file search order problem. It looks like the include file was added to bitops.h for ia64 for a backport for addr.c, however with this, it causes the compile problem in kobject_backport.c. I was able to work around the problem with the following changes to bitops.h, but I am not sure this is the best way to fix it.

#ifndef BACKPORT_ASM_BITOPS_H
#define BACKPORT_ASM_BITOPS_H

#include_next

#if defined(__ia64__)
/* #include */            <- causes a compile problem in kobject_backport.c
#define mb() ia64_mf()    <--- if I only add the defines that are needed from system.h instead
#ifdef CONFIG_SMP              everything seems to compile and run OK.
#define smp_mb() mb()
#else
#define smp_mb() barrier()
#endif
#endif

static inline void clear_bit_unlock(unsigned long nr, volatile unsigned long *addr)
{
	smp_mb__before_clear_bit();
	clear_bit(nr, addr);
}

#endif

-Original Message-
From: Tziporet Koren [mailto:tzipo...@dev.mellanox.co.il]
Sent: Thursday, July 16, 2009 7:24 AM
To: tzipo...@dev.mellanox.co.il
Cc: Woodruff, Robert J; EWG
Subject: Re: Compile error on 7/14 daily build of OFED-1.5

Tziporet Koren wrote:
> Woodruff, Robert J wrote:
>> I am seeing this build error when trying to compile
>> the 7/14 daily build on EL 5.3 on IA64. Not sure who the maintainer
>> is of kobject_backport.c,
>> but it looks to be the culprit.
>>
>> woody
>>
>> ckport/2.6.18-EL5.3/include/linux/slab.h:1,
>> from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/drivers/infiniband/core/kobject_backport.c:1:
>> include/linux/bitops.h: At top level:
>> include/linux/bitops.h:57: error: conflicting types for 'fls_long'
>> /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/kernel_addons/backport/2.6.18-EL5.3/include/linux/log2.h:64: error: previous implicit declaration of 'fls_long' was here
>> make[4]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/drivers/infiniband/core/kobject_backport.o] Error 1
>>
> There is no specific owner to the backports
> If you have any fix please send it.
> Otherwise we will try to look into it next week
>
> Tziporet
>
I see Jack just fixed it

Tziporet
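For readers of the archive: the two error lines quoted above ("previous implicit declaration" in log2.h, then "conflicting types" in bitops.h) are what a C compiler typically reports when a function is called before any declaration of it is visible and is later defined with a non-int return type. The following is only a minimal userspace sketch of the ordering that avoids this, not the backport code; `demo_fls_long` and `demo_roundup_pow_of_two` are hypothetical stand-ins for the kernel's `fls_long()` and a log2.h-style caller.

```c
/* Hypothetical stand-in for fls_long(): returns the 1-based position of the
 * highest set bit, or 0 when no bit is set. It is defined BEFORE any caller,
 * so the compiler never has to guess its type. */
static inline unsigned demo_fls_long(unsigned long l)
{
	unsigned pos = 0;

	while (l) {
		++pos;
		l >>= 1;
	}
	return pos;
}

/* Hypothetical stand-in for a log2.h-style helper that calls fls_long().
 * If this function were parsed first (for example because of header search
 * order), a C89 compiler would implicitly declare "int demo_fls_long()",
 * and the later "unsigned" definition would then conflict with it. */
static inline unsigned long demo_roundup_pow_of_two(unsigned long n)
{
	return 1UL << demo_fls_long(n - 1);
}
```

With the definition visible first, `demo_roundup_pow_of_two(5)` evaluates to 8. Reversing the order of the two functions is what would reproduce the class of error in the log.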
Re: [ofa-general] Re: [ewg] [Patch mthca backport] Don't use kmalloc > 128k
> This will fix the 2^20 bits limit on our bitmaps once and for all.

Not really... since getting > 128KB of contiguous memory is likely to fail anyway. And I don't think the upstream kernel has that limit on kmalloc size either (at least with SLUB, not sure about SLAB).

Really the long-term fix is to handle non-contiguous memory in the bitmap allocator, maybe using vmalloc(), although I always hate big allocations with vmalloc too.
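To illustrate what "handle non-contiguous memory in the bitmap allocator" could look like, here is a minimal userspace sketch. It is not the vmalloc() approach mentioned above and not driver code: it assumes a 4096-byte chunk size, uses `calloc()` in place of any kernel allocator, and every name in it (`chunked_bitmap`, `chunked_set_bit`, and so on) is hypothetical. The point is only that no single allocation has to be larger than one chunk.

```c
#include <limits.h>
#include <stdlib.h>

#define CHUNK_BYTES 4096UL                    /* assumption: one page per chunk */
#define CHUNK_BITS  (CHUNK_BYTES * CHAR_BIT)  /* bits stored in one chunk */

/* Hypothetical bitmap backed by a table of separately allocated chunks. */
struct chunked_bitmap {
	unsigned long nbits;
	unsigned long nchunks;
	unsigned char **chunk;
};

static void chunked_bitmap_free(struct chunked_bitmap *bm)
{
	unsigned long i;

	if (!bm->chunk)
		return;
	for (i = 0; i < bm->nchunks; ++i)
		free(bm->chunk[i]);   /* free(NULL) is a no-op */
	free(bm->chunk);
	bm->chunk = NULL;
}

/* Returns 0 on success, -1 on allocation failure. All bits start cleared. */
static int chunked_bitmap_init(struct chunked_bitmap *bm, unsigned long nbits)
{
	unsigned long i;

	bm->nbits = nbits;
	bm->nchunks = (nbits + CHUNK_BITS - 1) / CHUNK_BITS;
	bm->chunk = calloc(bm->nchunks ? bm->nchunks : 1, sizeof(*bm->chunk));
	if (!bm->chunk)
		return -1;
	for (i = 0; i < bm->nchunks; ++i) {
		bm->chunk[i] = calloc(1, CHUNK_BYTES);
		if (!bm->chunk[i]) {
			chunked_bitmap_free(bm);
			return -1;
		}
	}
	return 0;
}

/* Caller must pass nr < bm->nbits; no bounds check in this sketch. */
static void chunked_set_bit(struct chunked_bitmap *bm, unsigned long nr)
{
	unsigned long off = nr % CHUNK_BITS;

	bm->chunk[nr / CHUNK_BITS][off / CHAR_BIT] |=
		(unsigned char)(1u << (off % CHAR_BIT));
}

static int chunked_test_bit(const struct chunked_bitmap *bm, unsigned long nr)
{
	unsigned long off = nr % CHUNK_BITS;

	return (bm->chunk[nr / CHUNK_BITS][off / CHAR_BIT] >> (off % CHAR_BIT)) & 1;
}
```

A 2^21-bit bitmap, twice the 2^20 limit discussed in this thread, needs 64 chunks of 4096 bytes here. The trade-off is an extra pointer lookup per bit operation, and helpers that scan for runs of free bits would have to cross chunk boundaries.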
[ewg] OFED release status
Hi All,

I am going on a two-week vacation and wish to update you on the release status:

OFED 1.4.2 - everything is ready besides the fix for bug 1678. After the fix is approved by Jon, Vlad will release it next week.

OFED 1.5 - the alpha release was done today. Several module backports are not completed yet, and we should now move to the new library scheme we agreed on at Sonoma.

Betsy will replace me in the next EWG meeting on Monday, 27 July, and Jack will represent Mellanox.

Regards,
Tziporet
[ewg] OFED 1.5 alpha release is available
OFED 1.5-alpha4 is available

Notes:
The tarball is available on:
http://www.openfabrics.org/downloads/OFED/ofed-1.5/OFED-1.5-alpha4.tgz

To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 1.5

Vladimir & Tziporet

Release information:
--
Linux Operating Systems:
  - RedHat EL4 up6: 2.6.9-67.ELsmp
  - RedHat EL4 up7: 2.6.9-78.ELsmp
  - RedHat EL4 up8: 2.6.9-89.ELsmp
  - RedHat EL5 up2: 2.6.18-92.el5
  - RedHat EL5 up3: 2.6.18-128.el5
  - SLES10 SP2: 2.6.16.60-0.21-smp
  - SLES11: 2.6.27.19-5-default
  - OpenSuSE 10.3: 2.6.22.5-31 *
  - OEL 4 up7: 2.6.9-78.ELsmp
  - OEL 5 up2: 2.6.18-92.el5
  - CentOS5.2: 2.6.18-92.el5
  - CentOS5.3: 2.6.18-128.el5
  - kernel.org: 2.6.29 and 2.6.30

* Minimal QA for these versions

Systems:
  * x86_64
  * x86
  * ia64
  * ppc64

Changes from OFED-1.4.1
1 General changes
  o Kernel code based on 2.6.30
2 SDP
  o Performance improvements
3 uDAPL
  o New library
4 Management
  o OpenSM
    - Mesh Analysis for LASH routing algorithm.
    - Reloadable OpenSM configuration (preliminary implemented)
    - Routing paths sorted balancing (for UpDown and MinHops)
    - Weighted Lid Matrices calculation (for UpDown, MinHop and DOR).
    - I/O nodes connectivity (for FatTree).
5 MPI:
  - For now same versions as in OFED 1.4.1

Tasks that should be completed for the beta
===
1. Complete Backports for all kernel modules
2. Move to new libraries package scheme (as we agreed in Sonoma)
3. SDP Zero Copy
4. Stability, stability, stability ...
Re: [ewg] OFA server fs is full
Hi Tziporet. I believe Ido is working on moving us to the new server.

-jeff

Tziporet Koren wrote:
> Sasha Khapyorsky wrote:
>
>> Now there is:
>>
>> Filesystem           1K-blocks      Used Available Use% Mounted on
>> /dev/sda1            151873632 139556036   4602784  97% /
>>
>> We will have a next "overflow" just in few days.
>>
> Jeff
> When will we move to the new server?
> Can Ido help with this?
>
> Tziporet
Re: [ewg] [Patch mthca backport] Don't use kmalloc > 128k
On Jul 23, 2009, at 4:20 AM, Jack Morgenstein wrote:
> On Thursday 16 July 2009 21:08, Doug Ledford wrote:
>> On rhel4 and rhel5 machines, the kmalloc implementation does not
>> automatically forward kmalloc requests > 128kb to __get_free_pages.
>> Please include this patch in all rhel4 and rhel5 backport directories
>> so that we do the right thing in the mthca driver on rhel in regards
>> to kmalloc requests larger than 128k (at least in this code path,
>> there may be others lurking too, I'll forward additional patches if I
>> find they are needed).
>>
>> commit a7f18a776785aecb5eb9967aef6f0f603b698ba0
>> Author: Doug Ledford
>> Date: Thu Jul 16 12:47:55 2009 -0400
>>
>> [mthca] Fix attempts to use kmalloc on overly large allocations
>>
>> Signed-off-by: Doug Ledford

This needs a correct signed-off-by: line. Mine got added when I put it in my local git tree, but the original patch came from Red Hat's bugzilla, bug #508902, author David Jeffery.

> Roland,
> I think that this patch should be taken into the mainstream kernel,
> rather than just as a backport patch for RHEL. (We can have a similar
> patch for mlx4). I notice that __get_free_pages(), free_pages(), and
> get_order() are all in the mainstream kernel.
>
> This will fix the 2^20 bits limit on our bitmaps once and for all.
> If you agree, I will post this patch and one for mlx4 on the general list.
>
> Doug posted this patch on the EWG list. Thanks Doug!

diff --git a/drivers/infiniband/hw/mthca/mthca_mr.c b/drivers/infiniband/hw/mthca/mthca_mr.c
index d606edf..312e18d 100644
--- a/drivers/infiniband/hw/mthca/mthca_mr.c
+++ b/drivers/infiniband/hw/mthca/mthca_mr.c
@@ -152,8 +152,11 @@ static int mthca_buddy_init(struct mthca_buddy *buddy, int max_order)
 		goto err_out;
 
 	for (i = 0; i <= buddy->max_order; ++i) {
-		s = BITS_TO_LONGS(1 << (buddy->max_order - i));
-		buddy->bits[i] = kmalloc(s * sizeof (long), GFP_KERNEL);
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			buddy->bits[i] = (unsigned long *)__get_free_pages(GFP_KERNEL, get_order(s));
+		else
+			buddy->bits[i] = kmalloc(s, GFP_KERNEL);
 		if (!buddy->bits[i])
 			goto err_out_free;
 		bitmap_zero(buddy->bits[i],
@@ -166,9 +169,13 @@ static int mthca_buddy_init(struct mthca_buddy *buddy, int max_order)
 	return 0;
 
 err_out_free:
-	for (i = 0; i <= buddy->max_order; ++i)
-		kfree(buddy->bits[i]);
-
+	for (i = 0; i <= buddy->max_order; ++i){
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			free_pages((unsigned long)buddy->bits[i], get_order(s));
+		else
+			kfree(buddy->bits[i]);
+	}
 err_out:
 	kfree(buddy->bits);
 	kfree(buddy->num_free);
@@ -178,10 +185,15 @@ err_out:
 
 static void mthca_buddy_cleanup(struct mthca_buddy *buddy)
 {
-	int i;
+	int i, s;
 
-	for (i = 0; i <= buddy->max_order; ++i)
-		kfree(buddy->bits[i]);
+	for (i = 0; i <= buddy->max_order; ++i){
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			free_pages((unsigned long)buddy->bits[i], get_order(s));
+		else
+			kfree(buddy->bits[i]);
+	}
 
 	kfree(buddy->bits);
 	kfree(buddy->num_free);

--
Doug Ledford
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

InfiniBand Specific RPMS
http://people.redhat.com/dledford/Infiniband
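As a reading aid for the patch above, the size arithmetic and the branch it takes can be modelled in userspace. This is a hedged sketch, not kernel code: it assumes 4 KiB pages, and `demo_bitmap_bytes`, `demo_get_order` and `demo_uses_page_allocator` are hypothetical helpers that only mirror `BITS_TO_LONGS(n) * sizeof(long)`, `get_order()` and the `s > PAGE_SIZE` test from the diff.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT    12                      /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE     (1UL << DEMO_PAGE_SHIFT)
#define DEMO_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits bits, rounded up to whole longs.
 * Mirrors BITS_TO_LONGS(nbits) * sizeof(long) in the patch. */
static size_t demo_bitmap_bytes(unsigned long nbits)
{
	return ((nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG) * sizeof(long);
}

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size.
 * Userspace model of get_order(); size must be non-zero. */
static int demo_get_order(size_t size)
{
	int order = 0;

	while ((DEMO_PAGE_SIZE << order) < size)
		++order;
	return order;
}

/* Returns 1 when the patch would take the __get_free_pages() branch for
 * the given buddy level, 0 when it would stay with kmalloc(). */
static int demo_uses_page_allocator(int max_order, int level)
{
	return demo_bitmap_bytes(1UL << (max_order - level)) > DEMO_PAGE_SIZE;
}
```

Under these assumptions a 2^20-bit bitmap occupies 131072 bytes (128 KiB, the limit discussed in the thread) and maps to order 5. Because the patch keeps no record of which allocator was used, the free paths must recompute the same size for each level to choose between `free_pages()` and `kfree()`, which is why the `s = ...` line appears in all three hunks.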
Re: [ewg] OFA server fs is full
Sasha Khapyorsky wrote:
> Now there is:
>
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1            151873632 139556036   4602784  97% /
>
> We will have a next "overflow" just in few days.

Jeff
When will we move to the new server?
Can Ido help with this?

Tziporet
Re: [ewg] [Patch mthca backport] Don't use kmalloc > 128k
On Thursday 16 July 2009 21:08, Doug Ledford wrote:
> On rhel4 and rhel5 machines, the kmalloc implementation does not
> automatically forward kmalloc requests > 128kb to __get_free_pages.
> Please include this patch in all rhel4 and rhel5 backport directories
> so that we do the right thing in the mthca driver on rhel in regards
> to kmalloc requests larger than 128k (at least in this code path,
> there may be others lurking too, I'll forward additional patches if I
> find they are needed).
>
> commit a7f18a776785aecb5eb9967aef6f0f603b698ba0
> Author: Doug Ledford
> Date: Thu Jul 16 12:47:55 2009 -0400
>
> [mthca] Fix attempts to use kmalloc on overly large allocations
>
> Signed-off-by: Doug Ledford

Roland,
I think that this patch should be taken into the mainstream kernel, rather than just as a backport patch for RHEL. (We can have a similar patch for mlx4). I notice that __get_free_pages(), free_pages(), and get_order() are all in the mainstream kernel.

This will fix the 2^20 bits limit on our bitmaps once and for all. If you agree, I will post this patch and one for mlx4 on the general list.

Doug posted this patch on the EWG list. Thanks Doug!

diff --git a/drivers/infiniband/hw/mthca/mthca_mr.c b/drivers/infiniband/hw/mthca/mthca_mr.c
index d606edf..312e18d 100644
--- a/drivers/infiniband/hw/mthca/mthca_mr.c
+++ b/drivers/infiniband/hw/mthca/mthca_mr.c
@@ -152,8 +152,11 @@ static int mthca_buddy_init(struct mthca_buddy *buddy, int max_order)
 		goto err_out;
 
 	for (i = 0; i <= buddy->max_order; ++i) {
-		s = BITS_TO_LONGS(1 << (buddy->max_order - i));
-		buddy->bits[i] = kmalloc(s * sizeof (long), GFP_KERNEL);
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			buddy->bits[i] = (unsigned long *)__get_free_pages(GFP_KERNEL, get_order(s));
+		else
+			buddy->bits[i] = kmalloc(s, GFP_KERNEL);
 		if (!buddy->bits[i])
 			goto err_out_free;
 		bitmap_zero(buddy->bits[i],
@@ -166,9 +169,13 @@ static int mthca_buddy_init(struct mthca_buddy *buddy, int max_order)
 	return 0;
 
 err_out_free:
-	for (i = 0; i <= buddy->max_order; ++i)
-		kfree(buddy->bits[i]);
-
+	for (i = 0; i <= buddy->max_order; ++i){
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			free_pages((unsigned long)buddy->bits[i], get_order(s));
+		else
+			kfree(buddy->bits[i]);
+	}
 err_out:
 	kfree(buddy->bits);
 	kfree(buddy->num_free);
@@ -178,10 +185,15 @@ err_out:
 
 static void mthca_buddy_cleanup(struct mthca_buddy *buddy)
 {
-	int i;
+	int i, s;
 
-	for (i = 0; i <= buddy->max_order; ++i)
-		kfree(buddy->bits[i]);
+	for (i = 0; i <= buddy->max_order; ++i){
+		s = BITS_TO_LONGS(1 << (buddy->max_order - i)) * sizeof(long);
+		if(s > PAGE_SIZE)
+			free_pages((unsigned long)buddy->bits[i], get_order(s));
+		else
+			kfree(buddy->bits[i]);
+	}
 
 	kfree(buddy->bits);
 	kfree(buddy->num_free);