Re: [PATCH V2] mm/page_alloc: Ensure that HUGETLB_PAGE_ORDER is less than MAX_ORDER

2021-04-19 Thread Christoph Lameter
n't we end up setting the buddy order to > > something > MAX_ORDER -1 on that path? > > Agreed. We would need to return the supersized block to the huge page pool and not to the buddy allocator. There is a special callback in the compound page so that you can call an alternate free

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-06-02 Thread Christoph Lameter
On Thu, 1 Jun 2017, Hugh Dickins wrote: > SLUB versus SLAB, cpu versus memory? Since someone has taken the > trouble to write it with ctors in the past, I didn't feel on firm > enough ground to recommend such a change. But it may be obvious > to someone else that your suggestion would be better

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-06-02 Thread Christoph Lameter
On Thu, 1 Jun 2017, Hugh Dickins wrote: > Thanks a lot for working that out. Makes sense, fully understood now, > nothing to worry about (though makes one wonder whether it's efficient > to use ctors on high-alignment caches; or whether an internal "zero-me" > ctor would be useful). Use kzalloc

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-06-01 Thread Christoph Lameter
On Thu, 1 Jun 2017, Hugh Dickins wrote: > CONFIG_SLUB_DEBUG_ON=y. My SLAB|SLUB config options are > > CONFIG_SLUB_DEBUG=y > # CONFIG_SLUB_MEMCG_SYSFS_ON is not set > # CONFIG_SLAB is not set > CONFIG_SLUB=y > # CONFIG_SLAB_FREELIST_RANDOM is not set > CONFIG_SLUB_CPU_PARTIAL=y > CONFIG_SLABINFO=y

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-06-01 Thread Christoph Lameter
> > I am curious as to what is going on there. Do you have the output from > > these failed allocations? > > I thought the relevant output was in my mail. I did skip the Mem-Info > dump, since that just seemed noise in this case: we know memory can get > fragmented. What more output are you loo

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-05-31 Thread Christoph Lameter
On Wed, 31 May 2017, Michael Ellerman wrote: > > SLUB: Unable to allocate memory on node -1, gfp=0x14000c0(GFP_KERNEL) > > cache: pgtable-2^12, object size: 32768, buffer size: 65536, default > > order: 4, min order: 4 > > pgtable-2^12 debugging increased min order, use slub_debug=O to disabl

Re: 4.12-rc ppc64 4k-page needs costly allocations

2017-05-31 Thread Christoph Lameter
On Tue, 30 May 2017, Hugh Dickins wrote: > I wanted to try removing CONFIG_SLUB_DEBUG, but didn't succeed in that: > it seemed to be a hard requirement for something, but I didn't find what. CONFIG_SLUB_DEBUG does not enable debugging. It only includes the code to be able to enable it at runtime.

Re: [PATCH] percpu: improve generic percpu modify-return implementation

2016-09-21 Thread Christoph Lameter
On Wed, 21 Sep 2016, Tejun Heo wrote: > Hello, Nick. > > How have you been? :) > He is baack. Are we getting SL!B? ;-)

Re: [kernel-hardening] Re: [PATCH 9/9] mm: SLUB hardened usercopy support

2016-07-08 Thread Christoph Lameter
On Fri, 8 Jul 2016, Kees Cook wrote: > Is check_valid_pointer() making sure the pointer is within the usable > size? It seemed like it was checking that it was within the slub > object (checks against s->size, wants it above base after moving > pointer to include redzone, etc). check_valid_pointe

Re: [kernel-hardening] Re: [PATCH 9/9] mm: SLUB hardened usercopy support

2016-07-08 Thread Christoph Lameter
On Fri, 8 Jul 2016, Michael Ellerman wrote: > > I wonder if this code should be using size_from_object() instead of s->size? > > Hmm, not sure. Who's SLUB maintainer? :) Me. s->size is the size of the whole object including debugging info etc. ksize() gives you the actual usable size of an objec

Re: [PATCH v3 1/3] mm: rename alloc_pages_exact_node to __alloc_pages_node

2015-07-30 Thread Christoph Lameter
> > the patch. See the code in slub.c that is similar. > > Doh, somehow I convinced myself that there's #else and alloc_pages() is only > used for !CONFIG_NUMA so it doesn't matter. Here's a fixed version. Acked-by: Christoph Lameter

Re: [PATCH v3 1/3] mm: rename alloc_pages_exact_node to __alloc_pages_node

2015-07-30 Thread Christoph Lameter
On Thu, 30 Jul 2015, Vlastimil Babka wrote: > --- a/mm/slob.c > +++ b/mm/slob.c > void *page; > > -#ifdef CONFIG_NUMA > - if (node != NUMA_NO_NODE) > - page = alloc_pages_exact_node(node, gfp, order); > - else > -#endif > - page = alloc_pages(gfp, order); > +

Re: [PATCH v3 2/3] mm: unify checks in alloc_pages_node() and __alloc_pages_node()

2015-07-30 Thread Christoph Lameter
Acked-by: Christoph Lameter ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/3] mm: use numa_mem_id() in alloc_pages_node()

2015-07-30 Thread Christoph Lameter
On Thu, 30 Jul 2015, Vlastimil Babka wrote: > numa_mem_id() is able to handle allocation from CPUs on memory-less nodes, > so it's a more robust fallback than the currently used numa_node_id(). > > Suggested-by: Christoph Lameter > Signed-off-by: Vlastimil Babka > A

Re: [PATCH] mm: rename and document alloc_pages_exact_node

2015-07-23 Thread Christoph Lameter
On Wed, 22 Jul 2015, David Rientjes wrote: > Eek, yeah, that does look bad. I'm not even sure the > > if (nid < 0) > nid = numa_node_id(); > > is correct; I think this should be comparing to NUMA_NO_NODE rather than > all negative numbers, otherwise we silently ignore overflow

Re: [PATCH] mm: rename and document alloc_pages_exact_node

2015-07-21 Thread Christoph Lameter
On Tue, 21 Jul 2015, Vlastimil Babka wrote: > The function alloc_pages_exact_node() was introduced in 6484eb3e2a81 ("page > allocator: do not check NUMA node ID when the caller knows the node is valid") > as an optimized variant of alloc_pages_node(), that doesn't allow the node id > to be -1. Unf

Re: powerpc: Replace __get_cpu_var uses

2014-10-29 Thread Christoph Lameter
On Wed, 29 Oct 2014, Michael Ellerman wrote: > > #define __ARCH_IRQ_STAT > > > > -#define local_softirq_pending() > > __get_cpu_var(irq_stat).__softirq_pending > > +#define local_softirq_pending() > > __this_cpu_read(irq_stat.__softirq_pending) > > +#define set_softirq_pending(x) __this_c

Re: powerpc: Replace __get_cpu_var uses

2014-10-27 Thread Christoph Lameter
On Tue, 28 Oct 2014, Michael Ellerman wrote: > I'm happy to put it in a topic branch for 3.19, or move the definition or > whatever, your choice Christoph. Get the patch merged please.

Re: powerpc: Replace __get_cpu_var uses

2014-10-27 Thread Christoph Lameter
Ping? We are planning to remove support for __get_cpu_var in the 3.19 merge period. I can move the definition for __get_cpu_var into the powerpc per cpu definition instead if we cannot get this merged? On Tue, 21 Oct 2014, Christoph Lameter wrote: > > This still has not been merged a

powerpc: Replace __get_cpu_var uses

2014-10-21 Thread Christoph Lameter
write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt CC: Paul Mackerras Signed-off-by: Christoph Lameter --- arch/powerpc/include/asm/hardirq.h | 4 +++
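The increment conversion quoted above can be modelled in userspace. This is a minimal sketch, not the kernel implementation: the per-cpu variable is modelled as a plain array indexed by a fake "current CPU" number, and every `sketch_` name is hypothetical. It only illustrates the change in shape, from "obtain this CPU's instance, then operate on it" to "name the operation and let the implementation pick the instance".

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

/* Models DEFINE_PER_CPU(int, y): one instance of y per CPU. */
static int sketch_y[SKETCH_NR_CPUS];

/* Models "the CPU we are running on"; set by the caller in this sketch. */
static int sketch_cpu;

/* Old style: yields an lvalue for this CPU's instance, the caller operates on it. */
#define sketch_get_cpu_var(var)   ((var)[sketch_cpu])

/* New style: the operation itself is named, no lvalue escapes to the caller. */
#define sketch_this_cpu_inc(var)  ((void)((var)[sketch_cpu]++))
#define sketch_this_cpu_read(var) ((var)[sketch_cpu])

/* Runs the old and the new form once each on the given CPU and returns the result. */
static int sketch_demo(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    sketch_cpu = cpu;
    sketch_get_cpu_var(sketch_y)++;   /* __get_cpu_var(y)++ in the old code */
    sketch_this_cpu_inc(sketch_y);    /* __this_cpu_inc(y) after conversion */
    return sketch_this_cpu_read(sketch_y);
}
```

In the sketch both forms touch the same array slot, so the result is identical; the motivation given in the thread for the real conversion (removal of __get_cpu_var in the 3.19 merge period) is not something this model demonstrates.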

Re: [RFC PATCH v3 1/4] topology: add support for node_to_mem_node() to determine the fallback node

2014-08-14 Thread Christoph Lameter
On Wed, 13 Aug 2014, Nishanth Aravamudan wrote: > +++ b/include/linux/topology.h > @@ -119,11 +119,20 @@ static inline int numa_node_id(void) > * Use the accessor functions set_numa_mem(), numa_mem_id() and cpu_to_mem(). > */ > DECLARE_PER_CPU(int, _numa_mem_); > +extern int _node_numa_mem_[M

RE: Kernel build issues after yesterdays merge by Linus

2014-06-12 Thread Christoph Lameter
Gobbledygook due to missing MIME header. On Thu, 12 Jun 2014, David Laight wrote: > RnJvbTogQW50b24gQmxhbmNoYXJkDQouLi4NCj4gZGlmZiAtLWdpdCBhL2FyY2gvcG93ZXJwYy9i > b290L2luc3RhbGwuc2ggYi9hcmNoL3Bvd2VycGMvYm9vdC9pbnN0YWxsLnNoDQo+IGluZGV4IGI2 > YTI1NmIuLmUwOTZlNWEgMTAwNjQ0DQo+IC0tLSBhL2FyY2gvcG93ZXJ

power and percpu: Could we move the paca into the percpu area?

2014-06-11 Thread Christoph Lameter
Looking at arch/powerpc/include/asm/percpu.h I see that the per cpu offset comes from a local_paca field and local_paca is in r13. That means that for all percpu operations we first have to determine the address through a memory access. Would it be possible to put the paca at the beginning of the

Kernel build issues after yesterdays merge by Linus

2014-06-11 Thread Christoph Lameter
This is under Ubuntu Utopic Unicorn on a Power 8 system while simply trying to build with the Ubuntu standard kernel config. It could be that these issues come about because we do not have an rc1 yet but I wanted to give some early notice. Also this is a new arch to me so I may not be aware of how

Re: Node 0 not necessary for powerpc?

2014-05-21 Thread Christoph Lameter
On Mon, 19 May 2014, Nishanth Aravamudan wrote: > I'm seeing a panic at boot with this change on an LPAR which actually > has no Node 0. Here's what I think is happening: > > start_kernel > ... > -> setup_per_cpu_areas > -> pcpu_embed_first_chunk > -> pcpu_fc_alloc >

Re: Bug in reclaim logic with exhausted nodes?

2014-04-03 Thread Christoph Lameter
On Mon, 31 Mar 2014, Nishanth Aravamudan wrote: > Yep. The node exists, it's just fully exhausted at boot (due to the > presence of 16GB pages reserved at boot-time). Well if you want us to support that then I guess you need to propose patches to address this issue. > I'd appreciate a bit more g

Re: Bug in reclaim logic with exhausted nodes?

2014-03-28 Thread Christoph Lameter
On Thu, 27 Mar 2014, Nishanth Aravamudan wrote: > > That looks to be the correct way to handle things. Maybe mark the node as > > offline or somehow not present so that the kernel ignores it. > > This is a SLUB condition: > > mm/slub.c::early_kmem_cache_node_alloc(): > ... > page = new_sla

Re: Bug in reclaim logic with exhausted nodes?

2014-03-25 Thread Christoph Lameter
On Tue, 25 Mar 2014, Nishanth Aravamudan wrote: > On power, very early, we find the 16G pages (gpages in the powerpc arch > code) in the device-tree: > > early_setup -> > early_init_mmu -> > htab_initialize -> > htab_init_page_sizes -> >

Re: Bug in reclaim logic with exhausted nodes?

2014-03-25 Thread Christoph Lameter
On Tue, 25 Mar 2014, Nishanth Aravamudan wrote: > On 25.03.2014 [11:17:57 -0500], Christoph Lameter wrote: > > On Mon, 24 Mar 2014, Nishanth Aravamudan wrote: > > > > > Anyone have any ideas here? > > > > Don't do that? Check on boot to not allow exhausting a no

Re: Bug in reclaim logic with exhausted nodes?

2014-03-25 Thread Christoph Lameter
On Mon, 24 Mar 2014, Nishanth Aravamudan wrote: > Anyone have any ideas here? Don't do that? Check on boot to not allow exhausting a node with huge pages?

Re: Node 0 not necessary for powerpc?

2014-03-12 Thread Christoph Lameter
On Tue, 11 Mar 2014, Nishanth Aravamudan wrote: > I have a P7 system that has no node0, but a node0 shows up in numactl > --hardware, which has no cpus and no memory (and no PCI devices): Well as you see from the code there has been so far the assumption that node 0 has memory. I have never run a

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-24 Thread Christoph Lameter
On Mon, 24 Feb 2014, Joonsoo Kim wrote: > > It will not commonly get there because of the tracking. Instead a per cpu > > object will be used. > > > get_partial_node() always fails even if there are some partial slab on > > > memoryless node's nearest node. > > > > Correct and that leads to a page

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-24 Thread Christoph Lameter
On Fri, 21 Feb 2014, Nishanth Aravamudan wrote: > I added two calls to local_memory_node(), I *think* both are necessary, > but am willing to be corrected. > > One is in map_cpu_to_node() and one is in start_secondary(). The > start_secondary() path is fine, AFAICT, as we are up & running at that

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-20 Thread Christoph Lameter
On Wed, 19 Feb 2014, Nishanth Aravamudan wrote: > We can call local_memory_node() before the zonelists are setup. In that > case, first_zones_zonelist() will not set zone and the reference to > zone->node will Oops. Catch this case, and, since we are presumably running > very early, just return that a

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-20 Thread Christoph Lameter
On Wed, 19 Feb 2014, David Rientjes wrote: > On Tue, 18 Feb 2014, Christoph Lameter wrote: > > > It's an optimization to avoid calling the page allocator to figure out if > > there is memory available on a particular node. > Thus this patch breaks with memory hot-add for a

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-19 Thread Christoph Lameter
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote: > the performance impact of the underlying NUMA configuration. I guess we > could special-case memoryless/cpuless configurations somewhat, but I > don't think there's any reason to do that if we can make memoryless-node > support work in-kernel? Wel

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-18 Thread Christoph Lameter
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote: > We use the topology provided by the hypervisor, it does actually reflect > where CPUs and memory are, and their corresponding performance/NUMA > characteristics. And so there are actually nodes without memory that have processors? Can the hypervis

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-18 Thread Christoph Lameter
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote: > > Well, on powerpc, with the hypervisor providing the resources and the > topology, you can have cpuless and memoryless nodes. I'm not sure how > "fake" the NUMA is -- as I think since the resources are virtualized to > be one system, it's logicall

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-18 Thread Christoph Lameter
On Mon, 17 Feb 2014, Joonsoo Kim wrote: > On Wed, Feb 12, 2014 at 10:51:37PM -0800, Nishanth Aravamudan wrote: > > Hi Joonsoo, > > Also, given that only ia64 and (hopefully soon) ppc64 can set > > CONFIG_HAVE_MEMORYLESS_NODES, does that mean x86_64 can't have > > memoryless nodes present? Even with

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-18 Thread Christoph Lameter
On Mon, 17 Feb 2014, Joonsoo Kim wrote: > On Wed, Feb 12, 2014 at 04:16:11PM -0600, Christoph Lameter wrote: > > Here is another patch with some fixes. The additional logic is only > > compiled in if CONFIG_HAVE_MEMORYLESS_NODES is set. > > > > Subject: sl

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-12 Thread Christoph Lameter
current available per cpu objects and if that is not available will create a new slab using the page allocator to fallback from the memoryless node to some other node. Signed-off-by: Christoph Lameter Index: linux/mm/slub.c

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-11 Thread Christoph Lameter
On Mon, 10 Feb 2014, Joonsoo Kim wrote: > On Fri, Feb 07, 2014 at 12:51:07PM -0600, Christoph Lameter wrote: > > Here is a draft of a patch to make this work with memoryless nodes. > > > > The first thing is that we modify node_match to also match if we hit an > >

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-07 Thread Christoph Lameter
Here is a draft of a patch to make this work with memoryless nodes. The first thing is that we modify node_match to also match if we hit an empty node. In that case we simply take the current slab if it is there. If there is no current slab then a regular allocation occurs with the memoryless node.
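The node_match change described in this message can be sketched as a small predicate. This is a userspace illustration under stated assumptions, not the patch itself: the `sketch_` names, the boolean "has memory" table and the explicit node count are all hypothetical stand-ins for kernel state.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NO_NODE (-1)   /* stands in for NUMA_NO_NODE */

/*
 * Decide whether the current cpu slab (living on slab_node) may serve a
 * request for `requested`. A request aimed at a memoryless node is treated
 * as a match, so the current slab is used instead of being thrown away.
 */
static bool sketch_node_match(int slab_node, int requested,
                              const bool *node_has_memory, int nr_nodes)
{
    if (requested == SKETCH_NO_NODE)
        return true;                    /* caller has no preference */

    assert(requested >= 0 && requested < nr_nodes);

    if (!node_has_memory[requested])
        return true;                    /* empty node: take the current slab */

    return slab_node == requested;      /* normal case: nodes must agree */
}
```

The case the message mentions last, where there is no current slab and a regular allocation runs against the memoryless node, is outside this predicate and is not modelled here.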

Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

2014-02-07 Thread Christoph Lameter
On Fri, 7 Feb 2014, Joonsoo Kim wrote: > > > > It seems like a better approach would be to do this when a node is brought > > online and determine the fallback node based not on the zonelists as you > > do here but rather on locality (such as through a SLIT if provided, see > > node_distance()). >

Re: [RFC PATCH 3/3] slub: fallback to get_numa_mem() node if we want to allocate on memoryless node

2014-02-07 Thread Christoph Lameter
On Fri, 7 Feb 2014, Joonsoo Kim wrote: > > This check would need to be something that checks for other contingencies > > in the page allocator as well. A simple solution would be to actually run > > a GFP_THIS_NODE alloc to see if you can grab a page from the proper node. > > If that fails then fa

Re: [RFC PATCH 3/3] slub: fallback to get_numa_mem() node if we want to allocate on memoryless node

2014-02-06 Thread Christoph Lameter
On Thu, 6 Feb 2014, Joonsoo Kim wrote: > diff --git a/mm/slub.c b/mm/slub.c > index cc1f995..c851f82 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1700,6 +1700,14 @@ static void *get_partial(struct kmem_cache *s, gfp_t > flags, int node, > void *object; > int searchnode = (node ==

Re: [RFC PATCH 1/3] slub: search partial list on numa_mem_id(), instead of numa_node_id()

2014-02-06 Thread Christoph Lameter
On Thu, 6 Feb 2014, David Rientjes wrote: > I think you'll need to send these to Andrew since he appears to be picking > up slub patches these days. I can start managing merges again if Pekka no longer has the time.

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-02-06 Thread Christoph Lameter
On Wed, 5 Feb 2014, Nishanth Aravamudan wrote: > > Right so if we are ignoring the node then the simplest thing to do is to > > not deactivate the current cpu slab but to take an object from it. > > Ok, that's what Anton's patch does, I believe. Are you ok with that > patch as it is? No. Again hi

Re: [RFC PATCH 1/3] slub: search partial list on numa_mem_id(), instead of numa_node_id()

2014-02-06 Thread Christoph Lameter
e no partial slab on that node. > > On that node, page allocation always fallback to numa_mem_id() first. So > searching a partial slab on numa_node_id() in that case is proper solution > for memoryless node case. Acked-by: Christoph Lameter

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-02-05 Thread Christoph Lameter
On Tue, 4 Feb 2014, Nishanth Aravamudan wrote: > > If the target node allocation fails (for whatever reason) then I would > > recommend for simplicity's sake to change the target node to > > NUMA_NO_NODE and just take whatever is in the current cpu slab. A more > > complex solution would be to loo

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-02-04 Thread Christoph Lameter
On Mon, 3 Feb 2014, Nishanth Aravamudan wrote: > Yes, sorry for my lack of clarity. I meant Joonsoo's latest patch for > the $SUBJECT issue. Hmmm... I am not sure that this is a general solution. The fallback to other nodes can not only occur because a node has no memory as his patch assumes. If

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-02-03 Thread Christoph Lameter
On Mon, 3 Feb 2014, Nishanth Aravamudan wrote: > So what's the status of this patch? Christoph, do you think this is fine > as it is? Certainly enabling CONFIG_MEMORYLESS_NODES is the right thing to do and I already acked the patch.

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-30 Thread Christoph Lameter
On Wed, 29 Jan 2014, Nishanth Aravamudan wrote: > exactly what the caller intends. > > int searchnode = node; > if (node == NUMA_NO_NODE) > searchnode = numa_mem_id(); > if (!node_present_pages(node)) > searchnode = local_memory_node(node); > > The difference in semantics from the prev
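The searchnode selection quoted above can be written out as a self-contained userspace sketch. Everything here is an assumption made for illustration: the topology struct and the `sketch_` names are hypothetical, with `current_mem_node` standing in for what numa_mem_id() would return and `nearest_mem_node[]` for what local_memory_node() would return. The sketch also returns early for the no-node case so that the page-count lookup never sees -1; whether the quoted snippet intends the same guard cannot be told from the truncated text.

```c
#include <assert.h>

#define SKETCH_NO_NODE   (-1)   /* stands in for NUMA_NO_NODE */
#define SKETCH_MAX_NODES 4

/* Hypothetical userspace model of the NUMA state the snippet consults. */
struct sketch_topology {
    unsigned long present_pages[SKETCH_MAX_NODES];  /* 0 means memoryless */
    int nearest_mem_node[SKETCH_MAX_NODES];         /* fallback node that has memory */
    int current_mem_node;                           /* nearest memory node of this CPU */
};

/* Pick the node whose partial list should be searched for `node`. */
static int sketch_pick_searchnode(const struct sketch_topology *t, int node)
{
    if (node == SKETCH_NO_NODE)
        return t->current_mem_node;         /* no preference: use the local memory node */

    assert(node >= 0 && node < SKETCH_MAX_NODES);

    if (t->present_pages[node] == 0)
        return t->nearest_mem_node[node];   /* memoryless: search its fallback node */

    return node;                            /* node has memory: search it directly */
}
```

On the two-node machine discussed elsewhere in these threads (node 0 empty, node 1 populated), every request resolves to node 1 in this model.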

Re: [PATCH] powerpc: enable CONFIG_HAVE_MEMORYLESS_NODES

2014-01-29 Thread Christoph Lameter
On Tue, 28 Jan 2014, Nishanth Aravamudan wrote: > Anton Blanchard found an issue with an LPAR that had no memory in Node > 0. Christoph Lameter recommended, as one possible solution, to use > numa_mem_id() for locality of the nearest memory node-wise. However, > numa_mem_id() [a

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-29 Thread Christoph Lameter
On Tue, 28 Jan 2014, Nishanth Aravamudan wrote: > This helps about the same as David's patch -- but I found the reason > why! ppc64 doesn't set CONFIG_HAVE_MEMORYLESS_NODES :) Expect a patch > shortly for that and one other case I found. Oww...

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-27 Thread Christoph Lameter
On Fri, 24 Jan 2014, Nishanth Aravamudan wrote: > What I find odd is that there are only 2 nodes on this system, node 0 > (empty) and node 1. So won't numa_mem_id() always be 1? And every page > should be coming from node 1 (thus node_match() should always be true?) Well yes that occurs if you sp

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-27 Thread Christoph Lameter
On Fri, 24 Jan 2014, Nishanth Aravamudan wrote: > As to cpu_to_node() being passed to kmalloc_node(), I think an > appropriate fix is to change that to cpu_to_mem()? Yup. > > Yeah, the default policy should be to fallback to local memory if the node > > passed is memoryless. > > Thanks! I would

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-27 Thread Christoph Lameter
On Fri, 24 Jan 2014, David Rientjes wrote: > kmalloc_node(nid) and kmem_cache_alloc_node(nid) should fallback to nodes > other than nid when memory can't be allocated, these functions only > indicate a preference. The nid passed indicated a preference unless __GFP_THIS_NODE is specified. Then the
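The distinction drawn here, a node id as a preference versus a strict requirement, can be sketched in userspace. Assumptions are labelled: the flag the message calls __GFP_THIS_NODE is spelled __GFP_THISNODE in the kernel source and is modelled by a plain boolean; the truncated sentence is assumed to continue along the lines of "the allocation fails instead of falling back"; and the fallback order below is simple index order, whereas the kernel walks a zonelist. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NODES 3
#define SKETCH_FAIL  (-1)

/*
 * Try to take one page from node `nid`. Returns the node that satisfied the
 * request, or SKETCH_FAIL. With this_node_only set, no other node is tried.
 */
static int sketch_alloc_node(unsigned long free_pages[SKETCH_NODES],
                             int nid, bool this_node_only)
{
    assert(nid >= 0 && nid < SKETCH_NODES);

    if (free_pages[nid] > 0) {
        free_pages[nid]--;
        return nid;                 /* preferred node had memory */
    }

    if (this_node_only)
        return SKETCH_FAIL;         /* strict request: no fallback */

    /* Preference only: fall back to any node that still has pages. */
    for (int i = 0; i < SKETCH_NODES; i++) {
        if (free_pages[i] > 0) {
            free_pages[i]--;
            return i;
        }
    }
    return SKETCH_FAIL;             /* nothing left anywhere */
}
```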

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-24 Thread Christoph Lameter
On Fri, 24 Jan 2014, Wanpeng Li wrote: > > > >diff --git a/mm/slub.c b/mm/slub.c > >index 545a170..a1c6040 100644 > >--- a/mm/slub.c > >+++ b/mm/slub.c > >@@ -1700,6 +1700,9 @@ static void *get_partial(struct kmem_cache *s, gfp_t > >flags, int node, > > void *object; > > int searchnode =

Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-20 Thread Christoph Lameter
On Mon, 20 Jan 2014, Wanpeng Li wrote: > >+ enum zone_type high_zoneidx = gfp_zone(flags); > > > >+ if (!node_present_pages(searchnode)) { > >+ zonelist = node_zonelist(searchnode, flags); > >+ for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) { > >+

Re: mm/slab: ppc: ubi: kmalloc_slab WARNING / PPC + UBI driver

2013-08-06 Thread Christoph Lameter
On Tue, 6 Aug 2013, Wladislav Wiebe wrote: > ok, just saw in slab/for-linus branch that that stuff is reverted again.. No that was only for the 3.11 merge by Linus. The 3.12 patches have not been put into Pekka's tree.

Re: mm/slab: ppc: ubi: kmalloc_slab WARNING / PPC + UBI driver

2013-07-31 Thread Christoph Lameter
On Wed, 31 Jul 2013, Wladislav Wiebe wrote: > Thanks for the point, do you plan to make kmalloc_large available for extern > access in a separate mainline patch? > Since kmalloc_large is statically defined in slub_def.h and when including it > to seq_file.c > we have a lot of conflicting types:

Re: mm/slab: ppc: ubi: kmalloc_slab WARNING / PPC + UBI driver

2013-07-31 Thread Christoph Lameter
allocation. Use kmalloc_large(). This fixes the warning about large allocs but it will still cause large contiguous allocs that could fail because of memory fragmentation. Signed-off-by: Christoph Lameter Index: linux/fs/seq_file.c

Re: mm/slab: ppc: ubi: kmalloc_slab WARNING / PPC + UBI driver

2013-07-31 Thread Christoph Lameter
buffers for proc fs? Signed-off-by: Christoph Lameter Index: linux/fs/seq_file.c === --- linux.orig/fs/seq_file.c 2013-07-10 14:03:15.367134544 -0500 +++ linux/fs/seq_file.c 2013-07-31 10:11:42.671736131 -0500 @@ -96,7 +96,7 @@ static

Re: mm/slab: ppc: ubi: kmalloc_slab WARNING / PPC + UBI driver

2013-07-31 Thread Christoph Lameter
On Wed, 31 Jul 2013, Wladislav Wiebe wrote: > on a PPC 32-Bit board with a Linux Kernel v3.10.0 I see trouble with > kmalloc_slab. > Basically at system startup, something request a size of 8388608 b, > but KMALLOC_MAX_SIZE has 4194304 b in our case. It points a WARNING at: > .. > NIP [c0099fec]
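The two sizes in this report differ by exactly a factor of two, which is one page order. The arithmetic can be checked with a short userspace sketch. The 4 KiB page size is an assumption (the report does not state it), the limit is the value quoted in the message rather than a general constant, and the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT       12          /* assumption: 4 KiB pages */
#define SKETCH_KMALLOC_MAX_SIZE 4194304UL   /* limit quoted in the report */

/* Smallest order such that (page size << order) covers `size` bytes. */
static unsigned int sketch_get_order(size_t size)
{
    unsigned int order = 0;
    size_t span = (size_t)1 << SKETCH_PAGE_SHIFT;

    assert(size > 0);
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Nonzero if a request of `size` bytes stays within the quoted limit. */
static int sketch_fits_kmalloc(size_t size)
{
    return size <= SKETCH_KMALLOC_MAX_SIZE;
}
```

Under these assumptions the 8388608-byte request needs an order-11 block while the quoted limit corresponds to order 10, so the request is one order too large.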

Re: [PATCH v5 04/14] memory-hotplug: remove /sys/firmware/memmap/X sysfs

2013-01-02 Thread Christoph Lameter
On Thu, 27 Dec 2012, Tang Chen wrote: > On 12/26/2012 11:30 AM, Kamezawa Hiroyuki wrote: > >> @@ -41,6 +42,7 @@ struct firmware_map_entry { > >> const char *type; /* type of the memory range */ > >> struct list_head list; /* entry for the linked list */

Re: [RFC PATCH v3 0/13] memory-hotplug : hot-remove physical memory

2012-07-09 Thread Christoph Lameter
On Mon, 9 Jul 2012, Yasuaki Ishimatsu wrote: > Even if you apply these patches, you cannot remove the physical memory > completely since these patches are still under development. I want you to > cooperate to improve the physical memory hot-remove. So please review these > patches and give your c

Re: [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB

2012-07-09 Thread Christoph Lameter
> I was pointed by Glauber to the slab common code patches. I need some > more time to read the patches. Now I think the slab/slot changes in this > v3 are not needed, and can be ignored. That may take some kernel cycles. You have a current issue here that needs to be fixed. > > down_write(&

Re: [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB

2012-07-06 Thread Christoph Lameter
itself. Signed-off-by: Christoph Lameter --- mm/slub.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) Index: linux-2.6/mm/slub.c === --- linux-2.6.orig/mm/slub.c 2012-06-11 08:49

Re: [PATCH powerpc 2/2] kfree the cache name of pgtable cache if SLUB is used

2012-07-03 Thread Christoph Lameter
alias processing is done using the copy of the string and not the string itself. Signed-off-by: Christoph Lameter --- mm/slub.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) Index: linux-2.6/mm/slub.c

Re: [PATCH powerpc 2/2] kfree the cache name of pgtable cache if SLUB is used

2012-07-03 Thread Christoph Lameter
On Mon, 25 Jun 2012, Li Zhong wrote: > This patch tries to kfree the cache name of pgtables cache if SLUB is > used, as SLUB duplicates the cache name, and the original one is leaked. SLAB also does not free the name. Why would you have an #ifdef in there?

Re: [PATCH] slub: fix kernel BUG at mm/slub.c:1950!

2011-06-13 Thread Christoph Lameter
On Mon, 13 Jun 2011, Pekka Enberg wrote: > > Hmmm.. The allocpercpu in alloc_kmem_cache_cpus should take care of the > > alignment. Uhh.. I see that a patch that removes the #ifdef CMPXCHG_LOCAL > > was not applied? Pekka? > > This patch? > > http://git.kernel.org/?p=linux/kernel/git/penberg/slab-

Re: [PATCH] slub: fix kernel BUG at mm/slub.c:1950!

2011-06-13 Thread Christoph Lameter
On Sun, 12 Jun 2011, Hugh Dickins wrote: > 3.0-rc won't boot with SLUB on my PowerPC G5: kernel BUG at mm/slub.c:1950! > Bisected to 1759415e630e "slub: Remove CONFIG_CMPXCHG_LOCAL ifdeffery". > > After giving myself a medal for finding the BUG on line 1950 of mm/slub.c > (it's actually the >

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-27 Thread Christoph Lameter
On Fri, 24 Sep 2010, Alan Cox wrote: > Whether you add new syscalls or do the fd passing using flags and hide > the ugly bits in glibc is another question. Use device specific ioctls instead of syscalls?

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-27 Thread Christoph Lameter
On Thu, 23 Sep 2010, john stultz wrote: > > > 3) Further, the PTP hardware counter can be simply set to a new offset > > > to put it in line with the network time. This could cause trouble with > > > timekeeping much like unsynced TSCs do. > > > > You can do the same for system time. > > Settimeof

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-27 Thread Christoph Lameter
On Thu, 23 Sep 2010, Christian Riesch wrote: > > > It implies clock tuning in userspace for a potential sub microsecond > > > accurate clock. The clock accuracy will be limited by user space > > > latencies and noise. You won't be able to discipline the system clock > > > accurately. > > > > Noise

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, john stultz wrote: > > The HPET or pit timesource are also quite slow these days. You only need > > access periodically to essentially tune the TSC ratio. > > If we're using the TSC, then we're not using the PTP clock as you > suggest. Further the HPET and PIT aren't used to s

Re: [PATCH 6/8] ptp: Added a clock that uses the eTSEC found on the MPC85xx.

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, Alan Cox wrote: > > Please do not introduce useless additional layers for clock sync. Load > > these ptp clocks like the other regular clock modules and make them sync > > system time like any other clock. > > I don't think you understand PTP. PTP has masters, a system can nee

Re: [PATCH 6/8] ptp: Added a clock that uses the eTSEC found on the MPC85xx.

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, Richard Cochran wrote: > +* Gianfar PTP clock nodes > + > +General Properties: > + > + - compatible Should be "fsl,etsec-ptp" > + - reg Offset and length of the register set for the device > + - interrupts There should be at least two interrupts. Some devices >

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, john stultz wrote: > This was my initial gut reaction as well, but in the end, I agree with > Richard that in the case of one or multiple PTP hardware clocks, we > really can't abstract over the different time domains. My (arguably still superficial) review of the source does

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, Jacob Keller wrote: > > There is a reason for not being able to shift posix clocks: The system has > > one time base. The various clocks are contributing to maintaining that > > system-wide time. > > > > Adjusting clocks is absolutely essential for proper functioning of the PTP

Re: [PATCH v6 0/8] ptp: IEEE 1588 hardware clock support

2010-09-23 Thread Christoph Lameter
On Thu, 23 Sep 2010, Richard Cochran wrote: > Support for obtaining timestamps from a PHC already exists via the > SO_TIMESTAMPING socket option, integrated in kernel version 2.6.30. > This patch set completes the picture by allow user space programs to > adjust the PHC and to control its

Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim

2010-03-01 Thread Christoph Lameter
On Mon, 1 Mar 2010, Mel Gorman wrote: > Christoph, how feasible would it be to allow parallel reclaimers in > __zone_reclaim() that back off at a rate depending on the number of > reclaimers? Not too hard. Zone locking is there but there may be a lot of bouncing cachelines if you run it concurren

Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim

2010-02-24 Thread Christoph Lameter
On Tue, 23 Feb 2010, Anton Blanchard wrote: > zone_reclaim_mode.txt > Now we set zone_reclaim_mode = 1. On each iteration we continue to improve, > but even after 10 runs of stream we have > 10% remote node memory usage. The intent of zone reclaim was never to allocate all memory from one node. Yo

Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim

2010-02-19 Thread Christoph Lameter
On Fri, 19 Feb 2010, Balbir Singh wrote: > >> zone_reclaim. The others back off and try the next zone in the zonelist > >> instead. I'm not sure what the original intention was but most likely it > >> was to prevent too many parallel reclaimers in the same zone potentially > >> dumping out way mor

Re: [PATCH] powerpc: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim

2010-02-19 Thread Christoph Lameter
On Fri, 19 Feb 2010, Mel Gorman wrote: > > > The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables > > > zone reclaim. > > > > I've no problem with the patch anyway. Nor do I. > > - We seem to end up racing between zone_watermark_ok, zone_reclaim and > > buffered_rmqueue.

Re: [PATCH 2/2][v2] powerpc: Make the CMM memory hotplug aware

2009-10-16 Thread Christoph Lameter
On Thu, 15 Oct 2009, Gerald Schaefer wrote: > > The pages allocated as __GFP_MOVABLE are used to store the list of pages > > allocated by the balloon. They reference virtual addresses and it would > > be fine for the kernel to migrate the physical pages for those, the > > balloon would not notice

Re: [PATCH 6/6] Add support for __read_mostly to linux/cache.h

2009-05-01 Thread Christoph Lameter
On Fri, 1 May 2009, Sam Ravnborg wrote: > Are there any specific reason why we do not support read_mostly on all > architectures? Not that I know of. > read_mostly is about grouping rarely written data together > so what is needed is to introduce this section in the remaining > architectures. > >

Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-30 Thread Christoph Lameter
Mel Gorman wrote: > With Erics patch and libhugetlbfs, we can automatically back text/data[1], > malloc[2] and stacks without source modification. Fairly soon, libhugetlbfs > will also be able to override shmget() to add SHM_HUGETLB. That should cover > a lot of the memory-intensive apps without s

Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Pekka Enberg wrote: > Looking at the code, it's triggerable in 2.6.24.3 at least. Why we don't have > a report yet, probably because (1) the default allocator is SLUB which doesn't > suffer from this and (2) you need a big honkin' NUMA box that causes fallback > allocations to

Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()

2008-03-04 Thread Christoph Lameter
I think this is the correct fix. The NUMA fallback logic should be passing local_flags to kmem_get_pages() and not simply the flags. Maybe a stable candidate since we are now simply passing on flags to the page allocator on the fallback path. Signed-off-by: Christoph Lameter <[EMAIL PROTEC

Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Pekka J Enberg wrote: > On Tue, 4 Mar 2008, Christoph Lameter wrote: > > Slab allocations should never be passed these flags since the slabs do > > their own thing there. > > > > The following patch would clear these in slub: > > Here

Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Pekka Enberg wrote: > > > >> [c9edf5f0] [c00b56e4] > > .__alloc_pages_internal+0xf8/0x470 > > > >> [c9edf6e0] [c00e0458] .kmem_getpages+0x8c/0x194 > > > >> [c9edf770] [c00e1050] .fallback_alloc+0x194/0x254 > > > >> [c

Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Pekka Enberg wrote: > > > I suspect the WARN_ON() is bogus although I really don't know that part > > > of the code all too well. Mel? > > > > > > > The warn-on is valid. A situation should not exist that allows both flags > > to > > be set. I suspect if remove-set_migra

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Nishanth Aravamudan wrote: > Right, so it might have functioned before, but the correctness was > wobbly at best... Certainly the memoryless patch series has tightened > that up, but we missed these SLAB issues. > > I see that your patch fixed Olaf's machine, Pekka. Nice work

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Pekka Enberg wrote: > I think Mel said that their configuration did work with 2.6.23 > although I also wonder how that's possible. AFAIK there has been some > changes in the page allocator that might explain this. That is, if > kmem_getpages() returned pages for memoryless nod

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Pekka J Enberg wrote: > Fine. But, why are we hitting fallback_alloc() in the first place? It's > definitely not because of missing ->nodelists as we do: > > cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE]; > > before attempting to set up kmalloc caches.

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Mel Gorman wrote: > This patch adds the necessary checks to make sure a kmem_list3 exists for > the preferred node used when growing the cache. If the preferred node has > no nodelist then the currently running node is used instead. This > problem only affects the SLAB allocat

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Pekka J Enberg wrote: > Furthermore, don't let kmem_getpages() call alloc_pages_node() if nodeid > passed > to it is -1 as the latter will always translate that to numa_node_id() which > might not have ->nodelist that caused the invocation of fallback_alloc() in > the > firs

Re: [PATCH] Fix boot problem in situations where the boot CPU is running on a memoryless node

2008-01-23 Thread Christoph Lameter
On Wed, 23 Jan 2008, Pekka J Enberg wrote: > I still think Christoph's kmem_getpages() patch is correct (to fix > cache_grow() oops) but I overlooked the fact that none the callers of > cache_alloc_node() deal with bootstrapping (with the exception of > __cache_alloc_node() that even has a
