Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, [EMAIL PROTECTED] wrote:

> On Fri, 01 Jun 2007 16:00:30 PDT, Linus Torvalds said:
> 
> > #define BADPTR ((void *)16)
> 
> > I bet you'd find *more* problems that way than by returning NULL, and 
> > you'd also avoid the whole problem with "if (!ptr) return -ENOMEM".
> 
> Hmm.. this looks like a good contender for "first usage of #ifndef 
> CONFIG_STABLE"

The warning? Sure but the BADPTR (or ZERO_SIZE_PTR now) will stay. 
It is definitely an error to deference an object that has no size.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Valdis . Kletnieks
On Fri, 01 Jun 2007 16:00:30 PDT, Linus Torvalds said:

>   #define BADPTR ((void *)16)

> I bet you'd find *more* problems that way than by returning NULL, and 
> you'd also avoid the whole problem with "if (!ptr) return -ENOMEM".

Hmm.. this looks like a good contender for "first usage of #ifndef 
CONFIG_STABLE"

:)


pgp5hRgrE3k4B.pgp
Description: PGP signature


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
>
> On Fri, 1 Jun 2007, Andrew Morton wrote:
> 
> > I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,
> 
> The trouble with the WARN_ON is that it triggers even for code that is 
> okay like noted by Jeremy.

Yes. Sometimes it's just more natural to have

ptr = kmalloc(size);
.. use it ..
free(ptr);

and if a *degenerate* case of size=0 happens, who cares? It should just 
work, as long as we (obviously) don't actually try to access the pointer.

So I don't much like the WARN_ON(size == 0). I think it potentially just 
causes people to write around it, and quite possibly causes the callers to 
write code that is not at all more readable or maintainable!

That's why I'd much rather return BADPTR instead: we'll get an oops for 
buggy code, but we don't penalize the "natural" and good code! So once you 
return BADPTR, there really isn't any good reason for the WARN_ON.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Andrew Morton wrote:
> In some cases, sure, and we should remove the warning at some stage.
>
> But it has exposed a couple of bugs and a few weird things.
>   

We just need to scatter better bug powder into the corners of the
allocations.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 01 Jun 2007 17:43:46 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> 
wrote:

> Andrew Morton wrote:
> > As I said, it's specific to the kmalloc(0) problem, and we're fixing that
> > by other means anyway.
> >   
> 
> I guess I'm not getting much traction in my campaign for equal rights
> for zero-sized allocations. They're perfectly reasonable things to have,
> if that's the way the code falls out. The warning is just prejudice.
> 

In some cases, sure, and we should remove the warning at some stage.

But it has exposed a couple of bugs and a few weird things.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Linus Torvalds wrote:
> So for *both* of the above reasons, it's actually stupid to return NULL 
> for a zero-sized allocation. It would be much better to return another 
> pointer that will trap on access. A good candidate might be to return
>
>   #define BADPTR ((void *)16)
>   

I think this is a good idea in principle, but I wonder if there's any
code which assumes that kmalloc(x) != kmalloc(x) for all non-NULL
returns from kmalloc.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Andrew Morton wrote:
> As I said, it's specific to the kmalloc(0) problem, and we're fixing that
> by other means anyway.
>   

I guess I'm not getting much traction in my campaign for equal rights
for zero-sized allocations. They're perfectly reasonable things to have,
if that's the way the code falls out. The warning is just prejudice.

J

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Christoph Lameter wrote:
> Well there are architectural problems. We determine the power of two slab 
> at compile time. The object size information is currently not available in 
> the binary :=).
>   

That only applies to allocations with constant sizes. One presumes
nobody is explicitly doing kmalloc(0), so we can use a separate
runtime-computed-size path to do poisoning. (Which is probably 90% of
the problem, since people who kmalloc(sizeof(struct foo)) will generally
stay within bounds without too much effort.)

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

> I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,

The trouble with the WARN_ON is that it triggers even for code that is 
okay like noted by Jeremy. My initial intend with NULL was to allow the 
allocation of a zero sized pointer without extra checks. It should only
trigger a failure if something bad was done.

NULL had the problem of confusion with no memory available. I think BADPTR 
is good. If the coding warts are causing trouble then BADPTR will result 
in a failure. Otherwise if the code is doing a kmalloc(0) and not 
dereferencing the pointer (like Paul's fixed code) then we should be fine 
and not issue a warning.

The false positives may be upsetting some people.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 16:57:20 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> Andrew, want to take this patch to -mm to see if it triggers anything?

spose so.

I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,
because it is exposing some coding warts.  But we should turn it off for
2.6.22 and make it conditional on CONFIG_DEVEL_KERNEL (or whatever it will
be called) later.

The BADPTR thing is a little worrying because it will make
previously-working-by-luck code go oops.  I guess we can live with that.

So we end up with the BADPTR code enabled even in production kernels, in
which case your ((unsigned long)x <= 16) trick is worth doing.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

> > A too large alloc is >32MB or MAX_ORDER << PAGE_SIZE. A BUG_ON in 
> > kmalloc_slab() will trigger.
> 
> Did we use to BUG_ON()? I think that's wrong. There are ways for users to 
> potentially ask the kernel to do big allocations, and the correct response 
> is to say "no can do", not to crash!

There is no way to distinguish that from out of memory. Failing on large 
allocs is what we have always done for kmalloc(). Before 2.6.22 we used to 
fail for allocs > 256k which was a big nuisance for NUMAs large allocs.

The patches in 2.6.22 allow us for the first time to allocate arbitrary 
sized objects up to MAX_ORDER. So we no longer have troubles with large 
NUMA objects.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
> 
> A too large alloc is >32MB or MAX_ORDER << PAGE_SIZE. A BUG_ON in 
> kmalloc_slab() will trigger.

Did we use to BUG_ON()? I think that's wrong. There are ways for users to 
potentially ask the kernel to do big allocations, and the correct response 
is to say "no can do", not to crash!

> Here is the updated patch. It works fine here:
> 
> SLUB: Return BADPTR instead of warning for kmalloc(0)

Looks fine to me. My only comment is that

> - if (!x)
> + if (!x || x == BADPTR)
>   return;

This could be micro-optimized (again, non-standard, but it should be 
"practically portable") to have just a single test using something like

if ((unsigned long)x <= 16)
return;

but I guess it doesn't really matter much.

I think this is better than what we have now, but I also suspect it's 
*not* something we should try this late in the -rc sequence ;)

Andrew, want to take this patch to -mm to see if it triggers anything?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

> I'm not seeing the path on the freeing side that silently accepts BADPTR.
> 
> It is not ok to derefence BADPTR, but it's obviously ok to _free_ it.

Ok.

> Also, if I read the patch correctly, you _also_ return BADPTR for slabs 
> that are too large. No?  That would be wrong - those need to return NULL 
> for "out of memory".

A too large alloc is >32MB or MAX_ORDER << PAGE_SIZE. A BUG_ON in 
kmalloc_slab() will trigger.

Here is the updated patch. It works fine here:


SLUB: Return BADPTR instead of warning for kmalloc(0)

Remove the WARN_ON_ONCE and simply return BADPTR.

BADPTR can be used legitimately as long as it is not dereferenced.
Can even be freed.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/slub_def.h |   18 --
 mm/slub.c|   10 +-
 2 files changed, 13 insertions(+), 15 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 16:19:36.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:43:57.0 -0700
@@ -12,6 +12,8 @@
 #include 
 #include 
 
+#define BADPTR ((void *)16)
+
 struct kmem_cache_node {
spinlock_t list_lock;   /* Protect partial list and nr_partial */
unsigned long nr_partial;
@@ -74,13 +76,9 @@ extern struct kmem_cache kmalloc_caches[
  */
 static inline int kmalloc_index(size_t size)
 {
-   /*
-* We should return 0 if size == 0 (which would result in the
-* kmalloc caller to get NULL) but we use the smallest object
-* here for legacy reasons. Just issue a warning so that
-* we can discover locations where we do 0 sized allocations.
-*/
-   WARN_ON_ONCE(size == 0);
+
+   if (!size)
+   return 0;
 
if (size > KMALLOC_MAX_SIZE)
return -1;
@@ -133,7 +131,7 @@ static inline void *kmalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc(s, flags);
} else
@@ -146,7 +144,7 @@ static inline void *kzalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_zalloc(s, flags);
} else
@@ -162,7 +160,7 @@ static inline void *kmalloc_node(size_t 
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc_node(s, flags, node);
} else
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 16:21:00.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:43:27.0 -0700
@@ -2286,7 +2286,7 @@ void *__kmalloc(size_t size, gfp_t flags
 
if (s)
return slab_alloc(s, flags, -1, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2297,7 +2297,7 @@ void *__kmalloc_node(size_t size, gfp_t 
 
if (s)
return slab_alloc(s, flags, node, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2338,7 +2338,7 @@ void kfree(const void *x)
struct kmem_cache *s;
struct page *page;
 
-   if (!x)
+   if (!x || x == BADPTR)
return;
 
page = virt_to_head_page(x);
@@ -2707,7 +2707,7 @@ void *__kmalloc_track_caller(size_t size
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, -1, caller);
 }
@@ -2718,7 +2718,7 @@ void *__kmalloc_node_track_caller(size_t
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, node, caller);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
> 
> Something like this? (Not tested yet just for review):

I'm not seeing the path on the freeing side that silently accepts BADPTR.

It is not ok to derefence BADPTR, but it's obviously ok to _free_ it.

Also, if I read the patch correctly, you _also_ return BADPTR for slabs 
that are too large. No?  That would be wrong - those need to return NULL 
for "out of memory".

(But I only looked at the diff, not at the end result, so it may be that 
all the cases where you changed "return NULL" into "return BADPTR" were 
really just the size-zero ones - it just looked suspicious).

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

> No, I don't think you can do it this way.

Ultimately not. But its worth to see if this works.

> At a minimum, you'd need to test that the result is word-aligned. 
> Preferably 8-byte aligned. We literally have stuff that knows about these 
> things and uses the low bits in the pointer to keep extra data.

kmalloc allocations are guaranteed to be aligned to KMALLOC_MINALIGN. I 
can bring that in but it will make the patch less readable.

> Of course, there migth be other (even more subtle) cases where we just 
> assume certain alignment, and depend on the fact that we just _happen_ to 
> get it. Who knows..

I tried to get rid of those cased and I hope that work is complete in 
2.6.22.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
> 
> Hmmm... We are going rapidly here. This is a patch that I am testing right 
> now. It right adjust the object and the patch is manageable:

No, I don't think you can do it this way.

At a minimum, you'd need to test that the result is word-aligned. 
Preferably 8-byte aligned. We literally have stuff that knows about these 
things and uses the low bits in the pointer to keep extra data.

(That said, I didn't check whether we actually kmalloc() the data, but I 
think we do - things like "struct key" etc).

Now, maybe those things always have a 8-byte-aligned size, and it works 
out, but I'd be worried.

Of course, there migth be other (even more subtle) cases where we just 
assume certain alignment, and depend on the fact that we just _happen_ to 
get it. Who knows..

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

> So for *both* of the above reasons, it's actually stupid to return NULL 
> for a zero-sized allocation. It would be much better to return another 
> pointer that will trap on access. A good candidate might be to return
> 
>   #define BADPTR ((void *)16)

Something like this? (Not tested yet just for review):


SLUB: Return BADPTR instead of warning for kmalloc(0)

Remove the WARN_ON_ONCE and simply return BADPTR.

BADPTR can be used legitimately as long as it is not dereferenced.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/slub_def.h |   18 --
 mm/slub.c|8 
 2 files changed, 12 insertions(+), 14 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 16:19:36.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:24:54.0 -0700
@@ -12,6 +12,8 @@
 #include 
 #include 
 
+#define BADPTR ((void *)16)
+
 struct kmem_cache_node {
spinlock_t list_lock;   /* Protect partial list and nr_partial */
unsigned long nr_partial;
@@ -74,13 +76,9 @@ extern struct kmem_cache kmalloc_caches[
  */
 static inline int kmalloc_index(size_t size)
 {
-   /*
-* We should return 0 if size == 0 (which would result in the
-* kmalloc caller to get NULL) but we use the smallest object
-* here for legacy reasons. Just issue a warning so that
-* we can discover locations where we do 0 sized allocations.
-*/
-   WARN_ON_ONCE(size == 0);
+
+   if (!size)
+   return 0;
 
if (size > KMALLOC_MAX_SIZE)
return -1;
@@ -133,7 +131,7 @@ static inline void *kmalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc(s, flags);
} else
@@ -146,7 +144,7 @@ static inline void *kzalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_zalloc(s, flags);
} else
@@ -162,7 +160,7 @@ static inline void *kmalloc_node(size_t 
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc_node(s, flags, node);
} else
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 16:21:00.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:27:12.0 -0700
@@ -2286,7 +2286,7 @@ void *__kmalloc(size_t size, gfp_t flags
 
if (s)
return slab_alloc(s, flags, -1, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2297,7 +2297,7 @@ void *__kmalloc_node(size_t size, gfp_t 
 
if (s)
return slab_alloc(s, flags, node, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2707,7 +2707,7 @@ void *__kmalloc_track_caller(size_t size
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, -1, caller);
 }
@@ -2718,7 +2718,7 @@ void *__kmalloc_node_track_caller(size_t
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, node, caller);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Andrew Morton wrote:
> 
> We could store the size of the allocation in the allocated object?  Just
> add four bytes to the user's request, then pick the appropriate cache based
> on that, then put the user's `size' at the tail of the resulting allocation?

It should be easy enough to do it for _most_ allocations by just doing it 
when there is already "enough slack" to do it (which is likely true most 
of the time).

IOW, if you ask for a 42-byte allocation, and we allocate from a 64-byte 
slab, you get the slab allocation at address X, you don't actually have to 
return "X" at all. Just return "X+8", and then you do:

 - at 32-bit word at X+0 you put the "real length"
 - at 32-bit word at X+4 you put some good redzone marker
 - at 32-bit word at "X + reallen + 8" you put the endzone marker.

And then you say: if the real length was within 12 bytes of the allocation 
length, we just don't do this.

So you wouldn't get any redzoning for those allocations that are exactly 
sized (or close enough) to fit in an allocation block, but I bet *most* 
allocations would get this for free.

And then, if you actually turn on redzoning, you just always add the 12 
byte to the allocation size (assuming the alignment rules allow you to).

The nice thing about this is that the freeing path already knows where the 
object is *supposed* to start (because it sees the allocation size in the 
slub/slab data structures), so the kfree() path can actually figure out on 
its own whether it is given a "X" or an "X+8" kind of address.

So you don't actually need any extra information. You literally just need 
enough slop in the allocation that you can do this in the first place, so 
there is no cost (except for the cost of checking itself, of course).

Hmm?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Christoph Lameter wrote:

> Hmmm... We are going rapidly here. This is a patch that I am testing right 
> now. It right adjust the object and the patch is manageable:

Does not boot sigh.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
> So a kmalloc(62) would get upped to 66, so we allocate from size-128
> and put the number 62 at bytes 124-127 and we poison bytes 62-123?

Hmmm... We are going rapidly here. This is a patch that I am testing right 
now. It right adjust the object and the patch is manageable:



SLUB mm-only: Right align kmalloc objects to trigger overwrite detection

Right align kmalloc objects if they are less than the full kmalloc slab size.
This will move the object to be flush with the end of the object in order
to allow the easy detection of writes / reads after the end of the kmalloc
object.

Without slub_debug overwrites will destroy the free pointer of the next object
or the next object. Read will yield garbage that is likely zero.

With slub_debug redzone checks will be triggered. Reads will read redzone
poison.

This patch is only for checking things out. There are issues:

1. Alignment of kmalloc objects may now be different. In particular
   objects whose size is not a multiple of wordsize may be not word alignmed.

2. __kmalloc and kfree need to touch an additional cacheline in
   struct kmem_cache thereby reducing performance.

3. An object allocated via kmalloc may no longer be freed via kmem_cache_free.

So we need to figure out some may to make this configurable. Preferably
runtime configurable.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/slub_def.h |   22 +++---
 mm/slub.c|   11 ---
 2 files changed, 27 insertions(+), 6 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 15:56:42.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:00:03.0 -0700
@@ -120,6 +120,19 @@ static inline struct kmem_cache *kmalloc
return _caches[index];
 }
 
+static inline unsigned long kmalloc_size(size_t size)
+{
+   int index = kmalloc_index(size);
+
+   if (index >= KMALLOC_SHIFT_LOW)
+   return 1 << index;
+
+   if (index == 2)
+   return 192;
+   return 96;
+}
+
+
 #ifdef CONFIG_ZONE_DMA
 #define SLUB_DMA __GFP_DMA
 #else
@@ -135,7 +148,8 @@ static inline void *kmalloc(size_t size,
if (!s)
return NULL;
 
-   return kmem_cache_alloc(s, flags);
+   return kmem_cache_alloc(s, flags)
+   + kmalloc_size(size) - size;
} else
return __kmalloc(size, flags);
 }
@@ -148,7 +162,8 @@ static inline void *kzalloc(size_t size,
if (!s)
return NULL;
 
-   return kmem_cache_zalloc(s, flags);
+   return kmem_cache_zalloc(s, flags)
+   + kmalloc_size(size) - size;
} else
return __kzalloc(size, flags);
 }
@@ -159,7 +174,8 @@ extern void *__kmalloc_node(size_t size,
 static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
if (__builtin_constant_p(size) && !(flags & SLUB_DMA)) {
-   struct kmem_cache *s = kmalloc_slab(size);
+   struct kmem_cache *s = kmalloc_slab(size) +
+   kmalloc_size(size) - size;
 
if (!s)
return NULL;
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 15:51:05.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:15:21.0 -0700
@@ -2283,9 +2283,10 @@ static struct kmem_cache *get_slab(size_
 void *__kmalloc(size_t size, gfp_t flags)
 {
struct kmem_cache *s = get_slab(size, flags);
+   int offset = size - s->size;
 
if (s)
-   return slab_alloc(s, flags, -1, __builtin_return_address(0));
+   return slab_alloc(s, flags, -1, __builtin_return_address(0)) + 
offset;
return NULL;
 }
 EXPORT_SYMBOL(__kmalloc);
@@ -2294,9 +2295,10 @@ EXPORT_SYMBOL(__kmalloc);
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
struct kmem_cache *s = get_slab(size, flags);
+   int offset = size - s->size;
 
if (s)
-   return slab_alloc(s, flags, node, __builtin_return_address(0));
+   return slab_alloc(s, flags, node, __builtin_return_address(0)) 
+ offset;
return NULL;
 }
 EXPORT_SYMBOL(__kmalloc_node);
@@ -2337,6 +2339,7 @@ void kfree(const void *x)
 {
struct kmem_cache *s;
struct page *page;
+   unsigned long addr = (unsigned long) x;
 
if (!x)
return;
@@ -2344,7 +2347,9 @@ void kfree(const void *x)
page = virt_to_head_page(x);
s = page->slab;
 
-   slab_free(s, page, (void *)x, __builtin_return_address(0));
+   addr &= ~((unsigned long)s->size - 1);
+
+   slab_free(s, page, (void *)addr, __builtin_return_address(0));
 }
 EXPORT_SYMBOL(kfree);
 
-
To unsubscribe from this 

Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 15:41:48 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 1 Jun 2007, Andrew Morton wrote:
> 
> > > I should make SLUB put poisoning values in unused areas of a kmalloced 
> > > object?
> > 
> > hm, I hadn't thought of it that way actually.  I was thinking it was
> > specific to kmalloc(0) but as you point out, the situation is
> > generalisable.
> 
> Right it could catch a lot of other bugs as well.
> 
> > Yes, if someone does kmalloc(42) and we satisfy the allocation from the
> > size-64 slab, we should poison and then check the allegedly-unused 22
> > bytes.
> > 
> > Please ;)
> > 
> > (vaguely stunned that we didn't think of doing this years ago).
> 
> Well there are architectural problems. We determine the power of two slab 
> at compile time. The object size information is currently not available in 
> the binary :=).
>  
> > It'll be a large patch, I expect?
> 
> Ummm... Yes. We need to switch off the compile time power of two slab 
> calculation. Then I need to have some way of storing the object size in 
> the metainformation of each object. Changes a lot of function calls.

Oh well.  Don't lose any sleep over it ;)



We could store the size of the allocation in the allocated object?  Just
add four bytes to the user's request, then pick the appropriate cache based
on that, then put the user's `size' at the tail of the resulting allocation?

So a kmalloc(62) would get upped to 66, so we allocate from size-128
and put the number 62 at bytes 124-127 and we poison bytes 62-123?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
> 
> Right it could catch a lot of other bugs as well.

I'd actually prefer "malloc(0)" to _not_ return NULL, but some known 
(non-NULL) bogus pointer.

Why?

Because it's quite sane to have simple logic like

ptr = malloc(size);
if (!ptr)
return -ENOMEM;

and writing it as

if (size && !ptr)
return -ENOMEM;

is just annoying.

Also, NULL is _special_. There are absolutely tons of code in the kernel 
(and elsewhere) that just does something *different* from NULL pointers, 
and that totally breaks the whole notion of "you can allocate a zero-sized 
allocation, you just must not dereference it". If people special-case 
NULL as something else, they won't even go through the normal code-path.

So for *both* of the above reasons, it's actually stupid to return NULL 
for a zero-sized allocation. It would be much better to return another 
pointer that will trap on access. A good candidate might be to return

#define BADPTR ((void *)16)

which is a portable-enough (where "portable-enough" is "against strict 
ANSI C rules, but works in practice on all architectures") way to return 
something that will cause the same page fault behaviour as NULL, but will 
_not_ trigger the "NULL is special" code.

(Of course, you then need to teach "kfree()" to accept it as another 
pointer to be ignored, that's fine).

I bet you'd find *more* problems that way than by returning NULL, and 
you'd also avoid the whole problem with "if (!ptr) return -ENOMEM".

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

> > I should make SLUB put poisoning values in unused areas of a kmalloced 
> > object?
> 
> hm, I hadn't thought of it that way actually.  I was thinking it was
> specific to kmalloc(0) but as you point out, the situation is
> generalisable.

Right it could catch a lot of other bugs as well.

> Yes, if someone does kmalloc(42) and we satisfy the allocation from the
> size-64 slab, we should poison and then check the allegedly-unused 22
> bytes.
> 
> Please ;)
> 
> (vaguely stunned that we didn't think of doing this years ago).

Well there are architectural problems. We determine the power of two slab 
at compile time. The object size information is currently not available in 
the binary :=).
 
> It'll be a large patch, I expect?

Ummm... Yes. We need to switch off the compile time power of two slab 
calculation. Then I need to have some way of storing the object size in 
the metainformation of each object. Changes a lot of function calls.

> Actually, I have this vague memory that slab would take that kmalloc(42)
> and would then return kmalloc(64)+22, so the returned memory is
> "right-aligned".  This way the existing overrun-detection is applicable to
> all kmallocs.  Maybe I dreamed it.

Yes, that would be possible by simply adding a compile time generated 
offset to the compile time generated call to kmem_cache_alloc. But then 
kfree() will have a difficult time figuring out which object to free. 
Hmmm... But I can get to the slab via the page struct which allows me to 
figure out the power of two size. This would mean that kfree would work 
with an arbitrary pointer anywhere into the object.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 15:20:05 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 1 Jun 2007, Andrew Morton wrote:
> 
> > If slab was smart enough, it would have poisoned those 8 bytes to some
> > known pattern, and then checked that they still had that pattern when the
> > memory got freed again.
> 
> So this is new feature request?
> 
> > But it isn't smart enough, so the bug went undetected.
> 
> I should make SLUB put poisoning values in unused areas of a kmalloced 
> object?

hm, I hadn't thought of it that way actually.  I was thinking it was
specific to kmalloc(0) but as you point out, the situation is
generalisable.

Yes, if someone does kmalloc(42) and we satisfy the allocation from the
size-64 slab, we should poison and then check the allegedly-unused 22
bytes.

Please ;)

(vaguely stunned that we didn't think of doing this years ago).

It'll be a large patch, I expect?




Actually, I have this vague memory that slab would take that kmalloc(42)
and would then return kmalloc(64)+22, so the returned memory is
"right-aligned".  This way the existing overrun-detection is applicable to
all kmallocs.  Maybe I dreamed it.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

> If slab was smart enough, it would have poisoned those 8 bytes to some
> known pattern, and then checked that they still had that pattern when the
> memory got freed again.

So this is new feature request?

> But it isn't smart enough, so the bug went undetected.

I should make SLUB put poisoning values in unused areas of a kmalloced 
object?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 14:45:27 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 1 Jun 2007, Andrew Morton wrote:
> 
> > Poisoning and redzoning could have caught that.
> 
> Redzoning would not have caught it. This was a kmalloc allocation and 
> SLAB always gave them 32 bytes to play with. Only writes more than 32 
> bytes behind would have been caught.
> 
> Poisoning is only applicable to unallocated objects and these were 
> allocated.

Nope and nope.

This is a special case where the user asked for zero bytes and the kernel
gave him 8 (or 32) bytes instead.

If slab was smart enough, it would have poisoned those 8 bytes to some
known pattern, and then checked that they still had that pattern when the
memory got freed again.

But it isn't smart enough, so the bug went undetected.

As I said, it's specific to the kmalloc(0) problem, and we're fixing that
by other means anyway.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

> Poisoning and redzoning could have caught that.

Redzoning would not have caught it. This was a kmalloc allocation and 
SLAB always gave them 32 bytes to play with. Only writes more than 32 
bytes behind would have been caught.

Poisoning is only applicable to unallocated objects and these were 
allocated.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 13:47:23 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:
> 
> > I think this is a good example of why having to special-case kmalloc(0)
> > is a bad idea.  The original code was straightforward and, barring
> > silliness, should be completely correct with npids==0.  This new code
> > does nothing other than make things more complex.
> 
> Hehe we got you. The code is indexing the pidarray allocated with 
> kmalloc(0). So it uncovered a latent bug. It only worked because SLAB gave 
> him 32 bytes and it now only works because SLUB give him 8. That is enough 
> to illegally index the first array element.
> 

Poisoning and redzoning could have caught that.

But I guess it doesn't matter now, as this shortcoming is specific to
the zero-length allocations, and we're weeding those out anyway.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Christoph Lameter wrote:
> On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:
>
>   
>> I think this is a good example of why having to special-case kmalloc(0)
>> is a bad idea.  The original code was straightforward and, barring
>> silliness, should be completely correct with npids==0.  This new code
>> does nothing other than make things more complex.
>> 
>
> Hehe we got you. 

Bugger ;) Well, my get-out clause was "barring silliness".

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
Christoph wrote:
> Then you are deferencing an element in the pidarray that you did not 
> allocate! This is a bug in cpuset code.

Absolutely - as I described in more detail in a reply Jeremy a
few minutes ago.  Thanks for smoking it out.

If, as I suggested in my previous message to which you are
responding:

> Perhaps if you moved the "if (unlikely(n == npids))" test before the
> "pidarray[n++] = p->pid" assignment, it would be safe.

then it looks safe to me, without having to add the bogus "+1" to
the allocated size.  That is, I suspect the following patch fixes
this long standing cpuset bug (warning - white space mangled):

--- kernel/cpuset.c 2006-12-10 12:27:37.0 -0800
+++ kernel/cpuset.c.new 2007-06-01 13:53:00.271010074 -0700
@@ -1661,9 +1661,9 @@ static int pid_array_load(pid_t *pidarra

do_each_thread(g, p) {
if (p->cpuset == cs) {
-   pidarray[n++] = p->pid;
if (unlikely(n == npids))
goto array_full;
+   pidarray[n++] = p->pid;
}
} while_each_thread(g, p);


-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:

> I think this is a good example of why having to special-case kmalloc(0)
> is a bad idea.  The original code was straightforward and, barring
> silliness, should be completely correct with npids==0.  This new code
> does nothing other than make things more complex.

Hehe we got you. The code is indexing the pidarray allocated with 
kmalloc(0). So it uncovered a latent bug. It only worked because SLAB gave 
him 32 bytes and it now only works because SLUB give him 8. That is enough 
to illegally index the first array element.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

> > There are no checks necessary. Your function worked fine so far for 
> > the case of zero objects with the pointer returned by kmalloc. If the 
> > code is correct then it will not dereference the pointer to the zero 
> > sized array. If not then we may find a bug and fix it.
> 
> I suspect you got lucky.  The check for a full pidarray[] in the routine
> pid_array_load() occurs -after- a pid is put in the array.  If a task
> showed up in this cpuset at the wrong time, we would fall over and die
> in the code:

Then you are deferencing an element in the pidarray that you did not 
allocate! This is a bug in cpuset code. So we would need to allocate at 
mininum one array element? Or would we need to allocate npids + 1 to be 
safe???


---
 kernel/cpuset.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/cpuset.c
===
--- linux-2.6.orig/kernel/cpuset.c  2007-06-01 13:41:24.0 -0700
+++ linux-2.6/kernel/cpuset.c   2007-06-01 13:42:08.0 -0700
@@ -1741,7 +1741,7 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(>count);
-   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
+   pidarray = kmalloc(max(1, npids) * sizeof(pid_t), GFP_KERNEL);
if (!pidarray)
goto err1;
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
> I think this is a good example of why having to special-case kmalloc(0)
> is a bad idea.  The original code was straightforward and, barring
> silliness, should be completely correct with npids==0.  This new code
> does nothing other than make things more complex.

Perhaps so.  Perhaps not.

Actually, the original cpuset pid_array_load() code was not correct,
and Christoph's work is smoking this out.

The original code assumed that it could access the first element of the
kmalloc'd array, before it checked if it could go even further.

But with the kmalloc call of:

kmalloc(npids * sizeof(pid_t), GFP_KERNEL);

it is impossible for kmalloc to know how much to allocate, to guarantee
such behaviour.  If npids is zero, all kmalloc sees, in essence, is the
call:

kmalloc(0, GFP_KERNEL);

There is no way it can guess that I might still want to access the
first "size(pid_t)" bytes of whatever it returned.

If we had a different sort of kmalloc API, such as would be called with:

kmalloc(npids, sizeof(pid_t), GFP_KERNEL);

rather like the libc calloc() routine, having separate count and size
fields, then in theory, kmalloc could treat calls of the form:

kmalloc(0, sz, GFP_KERNEL); /* for sz > 0 */

as if they were actually:

kmalloc(1, sz, GFP_KERNEL);

and then my cpuset code might not have been broken.  But, since
some of us can never remember whether it is the size or the count
that comes first in such an API, we would still have had such
bugs -- just fewer, for those who got those arguments backwards,
and finally tripped over this anyway.

That doesn't address the other likely broken code, elsewhere in
the kernel somewhere that happened to try to access the 2nd or
3rd element of such an empty array before noticing.

So ... while I started this reply agreeing with you, I end up still
agreeing with Christoph's changes here.

Thanks, Christoph, for finding a bug in the cpuset code - good work!

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Srinivasa Ds wrote:
> ---
>  kernel/cpuset.c |   10 ++
>  1 file changed, 10 insertions(+)
>
> Index: linux-2.6.22-rc3/kernel/cpuset.c
> ===
> --- linux-2.6.22-rc3.orig/kernel/cpuset.c
> +++ linux-2.6.22-rc3/kernel/cpuset.c
> @@ -1741,6 +1741,13 @@ static int cpuset_tasks_open(struct inod
>* show up until sometime later on.
>*/
>   npids = atomic_read(>count);
> + if (!npids) {
> + ctr->buf = NULL;
> + ctr->bufsz = 0;
> + file->private_data = ctr;
> + return 0;
> + }
> +
>   

I think this is a good example of why having to special-case kmalloc(0)
is a bad idea.  The original code was straightforward and, barring
silliness, should be completely correct with npids==0.  This new code
does nothing other than make things more complex.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
> There are no checks necessary. Your function worked fine so far for 
> the case of zero objects with the pointer returned by kmalloc. If the 
> code is correct then it will not dereference the pointer to the zero 
> sized array. If not then we may find a bug and fix it.

I suspect you got lucky.  The check for a full pidarray[] in the routine
pid_array_load() occurs -after- a pid is put in the array.  If a task
showed up in this cpuset at the wrong time, we would fall over and die
in the code:

static int pid_array_load(pid_t *pidarray, int npids, struct cpuset *cs)
{
int n = 0;
struct task_struct *g, *p;

read_lock(_lock);

do_each_thread(g, p) {
if (p->cpuset == cs) {
pidarray[n++] = p->pid; /* Death if pidarray == 
NULL */
if (unlikely(n == npids))
goto array_full;
}

} while_each_thread(g, p);

Perhaps if you moved the "if (unlikely(n == npids))" test before the
"pidarray[n++] = p->pid" assignment, it would be safe.

And does the next line of code, the call to sort() after the call of
pid_array_load(), work with pidarray == NULL and npids == 0:

npids = pid_array_load(pidarray, npids, cs);
sort(pidarray, npids, sizeof(pid_t), cmppid, NULL); /* <== ?? */

Off hand, I didn't know.  I guess it must, or you would have already
tripped over it.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

> > We must have a NULL pointer exactly because it cannot be dereferenced.
> 
> Well, both patch versions had NULL pointers - either pidarray or ctr->buf.
> 
> But now I can make more sense of what you say -- you -want- the pidarray
> pointer, in particular, to be NULL, so that we don't accidentally use it.
> 
> Does that still mean that your patch suggestion was incomplete, in that
> it lacked the additional checks to avoid using a NULL pidarray?

There are no checks necessary. Your function worked fine so far for 
the case of zero objects with the pointer returned by kmalloc. If the 
code is correct then it will not dereference the pointer to the zero 
sized array. If not then we may find a bug and fix it.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
> We must have a NULL pointer exactly because it cannot be dereferenced.

Well, both patch versions had NULL pointers - either pidarray or ctr->buf.

But now I can make more sense of what you say -- you -want- the pidarray
pointer, in particular, to be NULL, so that we don't accidentally use it.

Does that still mean that your patch suggestion was incomplete, in that
it lacked the additional checks to avoid using a NULL pidarray?

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

> > The allocated array has 0 bytes. Any dereference is an error.
> 
> Well, then, that doesn't work either.
> 
> We -agree- we can't have this failing. You suggested (if I understood
> correctly, which I doubt) that we set the pointer to NULL, and noted
> that derefencing a NULL pointer would fail.  I agree such derefences
> would fail, and tried to point out that this would be bad.  We can't
> have pointer dereferences failing in the kernel ... duh.
> 
> Your reply seems like a non sequiter to me, pointing out that having
> a pointer to an array of 0 bytes fails as well.  Ok - that's bad too.

Think about it some more...
 
> So we cannot have a NULL pointer, used unchecked, and we cannot have
> a non-NULL pointer to a zero byte array, used unchecked.

We must have a NULL pointer exactly because it cannot be dereferenced. It 
is a safety precaution that the following code will not deference the 
pointer to the array of zero size. Because attempting to access an object 
in an array with zero object in them is an error.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
> The allocated array has 0 bytes. Any dereference is an error.

Well, then, that doesn't work either.

We -agree- we can't have this failing. You suggested (if I understood
correctly, which I doubt) that we set the pointer to NULL, and noted
that derefencing a NULL pointer would fail.  I agree such derefences
would fail, and tried to point out that this would be bad.  We can't
have pointer dereferences failing in the kernel ... duh.

Your reply seems like a non sequiter to me, pointing out that having
a pointer to an array of 0 bytes fails as well.  Ok - that's bad too.

So we cannot have a NULL pointer, used unchecked, and we cannot have
a non-NULL pointer to a zero byte array, used unchecked.

As I said, and as I thought Srinivasa coded, with these patch lines:

@@ -1772,6 +1779,9 @@ static ssize_t cpuset_tasks_read(struct 
 {
struct ctr_struct *ctr = file->private_data;
 
+   if (!ctr->buf) /* No tasks in this cpuset */
+   return 0;
+
return simple_read_from_buffer(buf, nbytes, ppos, ctr->buf, ctr->bufsz);
 }

we actually have to check the NULL-ness (or emptiness) of this thing,
down where it would be used, to avoid using NULL or empty things.

Summary - Srinivasa's patch still looks ok to me.  This followup
discussion between you and I is just confusing me - sorry.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

> What do you mean, Christoph, by "then any dereferences will fail" ?
> That sounds bad to me -- we don't want pointer dereferences failing
> in the kernel, do we?  Wouldn't you have to add some more lines of
> code, a bit further down, where pidarray is used, to avoid trying to
> use it if its NULL?

The allocated array has 0 bytes. Any dereference is an error.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
Christoph wrote:
> H Isnt it easier to set the pidarray to NULL in the case it
> has no objects? Then any deferences will fail. Paul?

>From what I can tell, in 30 seconds of reading, either of these
approaches, Christoph's or Srinivasa's, based on Vatsa's suggestions
-could- be made to work.

I don't understand how Christoph's approach works, as stated, however.

What do you mean, Christoph, by "then any dereferences will fail" ?
That sounds bad to me -- we don't want pointer dereferences failing
in the kernel, do we?  Wouldn't you have to add some more lines of
code, a bit further down, where pidarray is used, to avoid trying to
use it if its NULL?

I think that Srinivasa's point is that it is legitimate for this array
to be empty, and that we have to code for that case, one way or another.
Now that we can't simply kmalloc zero bytes and get away with it, that
requires explicit tests in the code for the empty case, one way or
another.  This sounds like a valid point to me.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Srinivasa Ds wrote:

> I have modified the patch little bit, Thanks Vatsa for input and hence 
> resending it. Please let me know your comments on this.

H Isnt it easier to set the pidarray to NULL in the case it
has no objects? Then any deferences will fail. Paul?


---
 kernel/cpuset.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/kernel/cpuset.c
===
--- linux-2.6.orig/kernel/cpuset.c  2007-06-01 11:09:39.0 -0700
+++ linux-2.6/kernel/cpuset.c   2007-06-01 11:11:59.0 -0700
@@ -1723,7 +1723,7 @@ static int cpuset_tasks_open(struct inod
 {
struct cpuset *cs = __d_cs(file->f_path.dentry->d_parent);
struct ctr_struct *ctr;
-   pid_t *pidarray;
+   pid_t *pidarray = NULL;
int npids;
char c;
 
@@ -1741,9 +1741,11 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(>count);
-   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
-   if (!pidarray)
-   goto err1;
+   if (npids)
+   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
+   if (!pidarray)
+   goto err1;
+   }
 
npids = pid_array_load(pidarray, npids, cs);
sort(pidarray, npids, sizeof(pid_t), cmppid, NULL);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Srinivasa Ds
On Friday 01 June 2007 12:57:53 Srinivasa Ds wrote:

I have modified the patch little bit, Thanks Vatsa for input and hence 
resending it. Please let me know your comments on this.

Signed-off-by: Srinivasa DS <[EMAIL PROTECTED]>

---
 kernel/cpuset.c |   10 ++
 1 file changed, 10 insertions(+)

Index: linux-2.6.22-rc3/kernel/cpuset.c
===
--- linux-2.6.22-rc3.orig/kernel/cpuset.c
+++ linux-2.6.22-rc3/kernel/cpuset.c
@@ -1741,6 +1741,13 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(>count);
+   if (!npids) {
+   ctr->buf = NULL;
+   ctr->bufsz = 0;
+   file->private_data = ctr;
+   return 0;
+   }
+
pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
if (!pidarray)
goto err1;
@@ -1772,6 +1779,9 @@ static ssize_t cpuset_tasks_read(struct 
 {
struct ctr_struct *ctr = file->private_data;
 
+   if (!ctr->buf) /* No tasks in this cpuset */
+   return 0;
+
return simple_read_from_buffer(buf, nbytes, ppos, ctr->buf, ctr->bufsz);
 }
 


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Srinivasa Ds
On Friday 01 June 2007 12:57:53 Srinivasa Ds wrote:

I have modified the patch little bit, Thanks Vatsa for input and hence 
resending it. Please let me know your comments on this.

Signed-off-by: Srinivasa DS [EMAIL PROTECTED]

---
 kernel/cpuset.c |   10 ++
 1 file changed, 10 insertions(+)

Index: linux-2.6.22-rc3/kernel/cpuset.c
===
--- linux-2.6.22-rc3.orig/kernel/cpuset.c
+++ linux-2.6.22-rc3/kernel/cpuset.c
@@ -1741,6 +1741,13 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(cs-count);
+   if (!npids) {
+   ctr-buf = NULL;
+   ctr-bufsz = 0;
+   file-private_data = ctr;
+   return 0;
+   }
+
pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
if (!pidarray)
goto err1;
@@ -1772,6 +1779,9 @@ static ssize_t cpuset_tasks_read(struct 
 {
struct ctr_struct *ctr = file-private_data;
 
+   if (!ctr-buf) /* No tasks in this cpuset */
+   return 0;
+
return simple_read_from_buffer(buf, nbytes, ppos, ctr-buf, ctr-bufsz);
 }
 


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Srinivasa Ds wrote:

 I have modified the patch little bit, Thanks Vatsa for input and hence 
 resending it. Please let me know your comments on this.

H Isnt it easier to set the pidarray to NULL in the case it
has no objects? Then any deferences will fail. Paul?


---
 kernel/cpuset.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/kernel/cpuset.c
===
--- linux-2.6.orig/kernel/cpuset.c  2007-06-01 11:09:39.0 -0700
+++ linux-2.6/kernel/cpuset.c   2007-06-01 11:11:59.0 -0700
@@ -1723,7 +1723,7 @@ static int cpuset_tasks_open(struct inod
 {
struct cpuset *cs = __d_cs(file-f_path.dentry-d_parent);
struct ctr_struct *ctr;
-   pid_t *pidarray;
+   pid_t *pidarray = NULL;
int npids;
char c;
 
@@ -1741,9 +1741,11 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(cs-count);
-   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
-   if (!pidarray)
-   goto err1;
+   if (npids)
+   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
+   if (!pidarray)
+   goto err1;
+   }
 
npids = pid_array_load(pidarray, npids, cs);
sort(pidarray, npids, sizeof(pid_t), cmppid, NULL);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
Christoph wrote:
 H Isnt it easier to set the pidarray to NULL in the case it
 has no objects? Then any deferences will fail. Paul?

From what I can tell, in 30 seconds of reading, either of these
approaches, Christoph's or Srinivasa's, based on Vatsa's suggestions
-could- be made to work.

I don't understand how Christoph's approach works, as stated, however.

What do you mean, Christoph, by then any dereferences will fail ?
That sounds bad to me -- we don't want pointer dereferences failing
in the kernel, do we?  Wouldn't you have to add some more lines of
code, a bit further down, where pidarray is used, to avoid trying to
use it if its NULL?

I think that Srinivasa's point is that it is legitimate for this array
to be empty, and that we have to code for that case, one way or another.
Now that we can't simply kmalloc zero bytes and get away with it, that
requires explicit tests in the code for the empty case, one way or
another.  This sounds like a valid point to me.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

 What do you mean, Christoph, by then any dereferences will fail ?
 That sounds bad to me -- we don't want pointer dereferences failing
 in the kernel, do we?  Wouldn't you have to add some more lines of
 code, a bit further down, where pidarray is used, to avoid trying to
 use it if its NULL?

The allocated array has 0 bytes. Any dereference is an error.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
 The allocated array has 0 bytes. Any dereference is an error.

Well, then, that doesn't work either.

We -agree- we can't have this failing. You suggested (if I understood
correctly, which I doubt) that we set the pointer to NULL, and noted
that derefencing a NULL pointer would fail.  I agree such derefences
would fail, and tried to point out that this would be bad.  We can't
have pointer dereferences failing in the kernel ... duh.

Your reply seems like a non sequiter to me, pointing out that having
a pointer to an array of 0 bytes fails as well.  Ok - that's bad too.

So we cannot have a NULL pointer, used unchecked, and we cannot have
a non-NULL pointer to a zero byte array, used unchecked.

As I said, and as I thought Srinivasa coded, with these patch lines:

@@ -1772,6 +1779,9 @@ static ssize_t cpuset_tasks_read(struct 
 {
struct ctr_struct *ctr = file-private_data;
 
+   if (!ctr-buf) /* No tasks in this cpuset */
+   return 0;
+
return simple_read_from_buffer(buf, nbytes, ppos, ctr-buf, ctr-bufsz);
 }

we actually have to check the NULL-ness (or emptiness) of this thing,
down where it would be used, to avoid using NULL or empty things.

Summary - Srinivasa's patch still looks ok to me.  This followup
discussion between you and I is just confusing me - sorry.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

  The allocated array has 0 bytes. Any dereference is an error.
 
 Well, then, that doesn't work either.
 
 We -agree- we can't have this failing. You suggested (if I understood
 correctly, which I doubt) that we set the pointer to NULL, and noted
 that derefencing a NULL pointer would fail.  I agree such derefences
 would fail, and tried to point out that this would be bad.  We can't
 have pointer dereferences failing in the kernel ... duh.
 
 Your reply seems like a non sequiter to me, pointing out that having
 a pointer to an array of 0 bytes fails as well.  Ok - that's bad too.

Think about it some more...
 
 So we cannot have a NULL pointer, used unchecked, and we cannot have
 a non-NULL pointer to a zero byte array, used unchecked.

We must have a NULL pointer exactly because it cannot be dereferenced. It 
is a safety precaution that the following code will not deference the 
pointer to the array of zero size. Because attempting to access an object 
in an array with zero object in them is an error.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
 We must have a NULL pointer exactly because it cannot be dereferenced.

Well, both patch versions had NULL pointers - either pidarray or ctr-buf.

But now I can make more sense of what you say -- you -want- the pidarray
pointer, in particular, to be NULL, so that we don't accidentally use it.

Does that still mean that your patch suggestion was incomplete, in that
it lacked the additional checks to avoid using a NULL pidarray?

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

  We must have a NULL pointer exactly because it cannot be dereferenced.
 
 Well, both patch versions had NULL pointers - either pidarray or ctr-buf.
 
 But now I can make more sense of what you say -- you -want- the pidarray
 pointer, in particular, to be NULL, so that we don't accidentally use it.
 
 Does that still mean that your patch suggestion was incomplete, in that
 it lacked the additional checks to avoid using a NULL pidarray?

There are no checks necessary. Your function worked fine so far for 
the case of zero objects with the pointer returned by kmalloc. If the 
code is correct then it will not dereference the pointer to the zero 
sized array. If not then we may find a bug and fix it.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
 There are no checks necessary. Your function worked fine so far for 
 the case of zero objects with the pointer returned by kmalloc. If the 
 code is correct then it will not dereference the pointer to the zero 
 sized array. If not then we may find a bug and fix it.

I suspect you got lucky.  The check for a full pidarray[] in the routine
pid_array_load() occurs -after- a pid is put in the array.  If a task
showed up in this cpuset at the wrong time, we would fall over and die
in the code:

static int pid_array_load(pid_t *pidarray, int npids, struct cpuset *cs)
{
int n = 0;
struct task_struct *g, *p;

read_lock(tasklist_lock);

do_each_thread(g, p) {
if (p-cpuset == cs) {
pidarray[n++] = p-pid; /* Death if pidarray == 
NULL */
if (unlikely(n == npids))
goto array_full;
}

} while_each_thread(g, p);

Perhaps if you moved the if (unlikely(n == npids)) test before the
pidarray[n++] = p-pid assignment, it would be safe.

And does the next line of code, the call to sort() after the call of
pid_array_load(), work with pidarray == NULL and npids == 0:

npids = pid_array_load(pidarray, npids, cs);
sort(pidarray, npids, sizeof(pid_t), cmppid, NULL); /* == ?? */

Off hand, I didn't know.  I guess it must, or you would have already
tripped over it.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Srinivasa Ds wrote:
 ---
  kernel/cpuset.c |   10 ++
  1 file changed, 10 insertions(+)

 Index: linux-2.6.22-rc3/kernel/cpuset.c
 ===
 --- linux-2.6.22-rc3.orig/kernel/cpuset.c
 +++ linux-2.6.22-rc3/kernel/cpuset.c
 @@ -1741,6 +1741,13 @@ static int cpuset_tasks_open(struct inod
* show up until sometime later on.
*/
   npids = atomic_read(cs-count);
 + if (!npids) {
 + ctr-buf = NULL;
 + ctr-bufsz = 0;
 + file-private_data = ctr;
 + return 0;
 + }
 +
   

I think this is a good example of why having to special-case kmalloc(0)
is a bad idea.  The original code was straightforward and, barring
silliness, should be completely correct with npids==0.  This new code
does nothing other than make things more complex.

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Paul Jackson wrote:

  There are no checks necessary. Your function worked fine so far for 
  the case of zero objects with the pointer returned by kmalloc. If the 
  code is correct then it will not dereference the pointer to the zero 
  sized array. If not then we may find a bug and fix it.
 
 I suspect you got lucky.  The check for a full pidarray[] in the routine
 pid_array_load() occurs -after- a pid is put in the array.  If a task
 showed up in this cpuset at the wrong time, we would fall over and die
 in the code:

Then you are deferencing an element in the pidarray that you did not 
allocate! This is a bug in cpuset code. So we would need to allocate at 
mininum one array element? Or would we need to allocate npids + 1 to be 
safe???


---
 kernel/cpuset.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/cpuset.c
===
--- linux-2.6.orig/kernel/cpuset.c  2007-06-01 13:41:24.0 -0700
+++ linux-2.6/kernel/cpuset.c   2007-06-01 13:42:08.0 -0700
@@ -1741,7 +1741,7 @@ static int cpuset_tasks_open(struct inod
 * show up until sometime later on.
 */
npids = atomic_read(cs-count);
-   pidarray = kmalloc(npids * sizeof(pid_t), GFP_KERNEL);
+   pidarray = kmalloc(max(1, npids) * sizeof(pid_t), GFP_KERNEL);
if (!pidarray)
goto err1;
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
 I think this is a good example of why having to special-case kmalloc(0)
 is a bad idea.  The original code was straightforward and, barring
 silliness, should be completely correct with npids==0.  This new code
 does nothing other than make things more complex.

Perhaps so.  Perhaps not.

Actually, the original cpuset pid_array_load() code was not correct,
and Christoph's work is smoking this out.

The original code assumed that it could access the first element of the
kmalloc'd array, before it checked if it could go even further.

But with the kmalloc call of:

kmalloc(npids * sizeof(pid_t), GFP_KERNEL);

it is impossible for kmalloc to know how much to allocate, to guarantee
such behaviour.  If npids is zero, all kmalloc sees, in essence, is the
call:

kmalloc(0, GFP_KERNEL);

There is no way it can guess that I might still want to access the
first size(pid_t) bytes of whatever it returned.

If we had a different sort of kmalloc API, such as would be called with:

kmalloc(npids, sizeof(pid_t), GFP_KERNEL);

rather like the libc calloc() routine, having separate count and size
fields, then in theory, kmalloc could treat calls of the form:

kmalloc(0, sz, GFP_KERNEL); /* for sz  0 */

as if they were actually:

kmalloc(1, sz, GFP_KERNEL);

and then my cpuset code might not have been broken.  But, since
some of us can never remember whether it is the size or the count
that comes first in such an API, we would still have had such
bugs -- just fewer, for those who got those arguments backwards,
and finally tripped over this anyway.

That doesn't address the other likely broken code, elsewhere in
the kernel somewhere that happened to try to access the 2nd or
3rd element of such an empty array before noticing.

So ... while I started this reply agreeing with you, I end up still
agreeing with Christoph's changes here.

Thanks, Christoph, for finding a bug in the cpuset code - good work!

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:

 I think this is a good example of why having to special-case kmalloc(0)
 is a bad idea.  The original code was straightforward and, barring
 silliness, should be completely correct with npids==0.  This new code
 does nothing other than make things more complex.

Hehe we got you. The code is indexing the pidarray allocated with 
kmalloc(0). So it uncovered a latent bug. It only worked because SLAB gave 
him 32 bytes and it now only works because SLUB give him 8. That is enough 
to illegally index the first array element.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Paul Jackson
Christoph wrote:
 Then you are deferencing an element in the pidarray that you did not 
 allocate! This is a bug in cpuset code.

Absolutely - as I described in more detail in a reply Jeremy a
few minutes ago.  Thanks for smoking it out.

If, as I suggested in my previous message to which you are
responding:

 Perhaps if you moved the if (unlikely(n == npids)) test before the
 pidarray[n++] = p-pid assignment, it would be safe.

then it looks safe to me, without having to add the bogus +1 to
the allocated size.  That is, I suspect the following patch fixes
this long standing cpuset bug (warning - white space mangled):

--- kernel/cpuset.c 2006-12-10 12:27:37.0 -0800
+++ kernel/cpuset.c.new 2007-06-01 13:53:00.271010074 -0700
@@ -1661,9 +1661,9 @@ static int pid_array_load(pid_t *pidarra

do_each_thread(g, p) {
if (p-cpuset == cs) {
-   pidarray[n++] = p-pid;
if (unlikely(n == npids))
goto array_full;
+   pidarray[n++] = p-pid;
}
} while_each_thread(g, p);


-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.925.600.0401
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Christoph Lameter wrote:
 On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:

   
 I think this is a good example of why having to special-case kmalloc(0)
 is a bad idea.  The original code was straightforward and, barring
 silliness, should be completely correct with npids==0.  This new code
 does nothing other than make things more complex.
 

 Hehe we got you. 

Bugger ;) Well, my get-out clause was barring silliness.

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 13:47:23 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 On Fri, 1 Jun 2007, Jeremy Fitzhardinge wrote:
 
  I think this is a good example of why having to special-case kmalloc(0)
  is a bad idea.  The original code was straightforward and, barring
  silliness, should be completely correct with npids==0.  This new code
  does nothing other than make things more complex.
 
 Hehe we got you. The code is indexing the pidarray allocated with 
 kmalloc(0). So it uncovered a latent bug. It only worked because SLAB gave 
 him 32 bytes and it now only works because SLUB give him 8. That is enough 
 to illegally index the first array element.
 

Poisoning and redzoning could have caught that.

But I guess it doesn't matter now, as this shortcoming is specific to
the zero-length allocations, and we're weeding those out anyway.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

 Poisoning and redzoning could have caught that.

Redzoning would not have caught it. This was a kmalloc allocation and 
SLAB always gave them 32 bytes to play with. Only writes more than 32 
bytes behind would have been caught.

Poisoning is only applicable to unallocated objects and these were 
allocated.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 14:45:27 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 On Fri, 1 Jun 2007, Andrew Morton wrote:
 
  Poisoning and redzoning could have caught that.
 
 Redzoning would not have caught it. This was a kmalloc allocation and 
 SLAB always gave them 32 bytes to play with. Only writes more than 32 
 bytes behind would have been caught.
 
 Poisoning is only applicable to unallocated objects and these were 
 allocated.

Nope and nope.

This is a special case where the user asked for zero bytes and the kernel
gave him 8 (or 32) bytes instead.

If slab was smart enough, it would have poisoned those 8 bytes to some
known pattern, and then checked that they still had that pattern when the
memory got freed again.

But it isn't smart enough, so the bug went undetected.

As I said, it's specific to the kmalloc(0) problem, and we're fixing that
by other means anyway.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

 If slab was smart enough, it would have poisoned those 8 bytes to some
 known pattern, and then checked that they still had that pattern when the
 memory got freed again.

So this is new feature request?

 But it isn't smart enough, so the bug went undetected.

I should make SLUB put poisoning values in unused areas of a kmalloced 
object?

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 15:20:05 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 On Fri, 1 Jun 2007, Andrew Morton wrote:
 
  If slab was smart enough, it would have poisoned those 8 bytes to some
  known pattern, and then checked that they still had that pattern when the
  memory got freed again.
 
 So this is new feature request?
 
  But it isn't smart enough, so the bug went undetected.
 
 I should make SLUB put poisoning values in unused areas of a kmalloced 
 object?

hm, I hadn't thought of it that way actually.  I was thinking it was
specific to kmalloc(0) but as you point out, the situation is
generalisable.

Yes, if someone does kmalloc(42) and we satisfy the allocation from the
size-64 slab, we should poison and then check the allegedly-unused 22
bytes.

Please ;)

(vaguely stunned that we didn't think of doing this years ago).

It'll be a large patch, I expect?




Actually, I have this vague memory that slab would take that kmalloc(42)
and would then return kmalloc(64)+22, so the returned memory is
right-aligned.  This way the existing overrun-detection is applicable to
all kmallocs.  Maybe I dreamed it.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

  I should make SLUB put poisoning values in unused areas of a kmalloced 
  object?
 
 hm, I hadn't thought of it that way actually.  I was thinking it was
 specific to kmalloc(0) but as you point out, the situation is
 generalisable.

Right it could catch a lot of other bugs as well.

 Yes, if someone does kmalloc(42) and we satisfy the allocation from the
 size-64 slab, we should poison and then check the allegedly-unused 22
 bytes.
 
 Please ;)
 
 (vaguely stunned that we didn't think of doing this years ago).

Well there are architectural problems. We determine the power of two slab 
at compile time. The object size information is currently not available in 
the binary :=).
 
 It'll be a large patch, I expect?

Ummm... Yes. We need to switch off the compile time power of two slab 
calculation. Then I need to have some way of storing the object size in 
the metainformation of each object. Changes a lot of function calls.

 Actually, I have this vague memory that slab would take that kmalloc(42)
 and would then return kmalloc(64)+22, so the returned memory is
 right-aligned.  This way the existing overrun-detection is applicable to
 all kmallocs.  Maybe I dreamed it.

Yes, that would be possible by simply adding a compile time generated 
offset to the compile time generated call to kmem_cache_alloc. But then 
kfree() will have a difficult time figuring out which object to free. 
Hmmm... But I can get to the slab via the page struct which allows me to 
figure out the power of two size. This would mean that kfree would work 
with an arbitrary pointer anywhere into the object.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
 
 Right it could catch a lot of other bugs as well.

I'd actually prefer malloc(0) to _not_ return NULL, but some known 
(non-NULL) bogus pointer.

Why?

Because it's quite sane to have simple logic like

ptr = malloc(size);
if (!ptr)
return -ENOMEM;

and writing it as

if (size  !ptr)
return -ENOMEM;

is just annoying.

Also, NULL is _special_. There are absolutely tons of code in the kernel 
(and elsewhere) that just does something *different* from NULL pointers, 
and that totally breaks the whole notion of you can allocate a zero-sized 
allocation, you just must not dereference it. If people special-case 
NULL as something else, they won't even go through the normal code-path.

So for *both* of the above reasons, it's actually stupid to return NULL 
for a zero-sized allocation. It would be much better to return another 
pointer that will trap on access. A good candidate might be to return

#define BADPTR ((void *)16)

which is a portable-enough (where portable-enough is against strict 
ANSI C rules, but works in practice on all architectures) way to return 
something that will cause the same page fault behaviour as NULL, but will 
_not_ trigger the NULL is special code.

(Of course, you then need to teach kfree() to accept it as another 
pointer to be ignored, that's fine).

I bet you'd find *more* problems that way than by returning NULL, and 
you'd also avoid the whole problem with if (!ptr) return -ENOMEM.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 15:41:48 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 On Fri, 1 Jun 2007, Andrew Morton wrote:
 
   I should make SLUB put poisoning values in unused areas of a kmalloced 
   object?
  
  hm, I hadn't thought of it that way actually.  I was thinking it was
  specific to kmalloc(0) but as you point out, the situation is
  generalisable.
 
 Right it could catch a lot of other bugs as well.
 
  Yes, if someone does kmalloc(42) and we satisfy the allocation from the
  size-64 slab, we should poison and then check the allegedly-unused 22
  bytes.
  
  Please ;)
  
  (vaguely stunned that we didn't think of doing this years ago).
 
 Well there are architectural problems. We determine the power of two slab 
 at compile time. The object size information is currently not available in 
 the binary :=).
  
  It'll be a large patch, I expect?
 
 Ummm... Yes. We need to switch off the compile time power of two slab 
 calculation. Then I need to have some way of storing the object size in 
 the metainformation of each object. Changes a lot of function calls.

Oh well.  Don't lose any sleep over it ;)

loses sleep

We could store the size of the allocation in the allocated object?  Just
add four bytes to the user's request, then pick the appropriate cache based
on that, then put the user's `size' at the tail of the resulting allocation?

So a kmalloc(62) would get upped to 66, so we allocate from size-128
and put the number 62 at bytes 124-127 and we poison bytes 62-123?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
 So a kmalloc(62) would get upped to 66, so we allocate from size-128
 and put the number 62 at bytes 124-127 and we poison bytes 62-123?

Hmmm... We are going rapidly here. This is a patch that I am testing right 
now. It right adjust the object and the patch is manageable:



SLUB mm-only: Right align kmalloc objects to trigger overwrite detection

Right align kmalloc objects if they are less than the full kmalloc slab size.
This will move the object to be flush with the end of the object in order
to allow the easy detection of writes / reads after the end of the kmalloc
object.

Without slub_debug overwrites will destroy the free pointer of the next object
or the next object. Read will yield garbage that is likely zero.

With slub_debug redzone checks will be triggered. Reads will read redzone
poison.

This patch is only for checking things out. There are issues:

1. Alignment of kmalloc objects may now be different. In particular
   objects whose size is not a multiple of wordsize may be not word alignmed.

2. __kmalloc and kfree need to touch an additional cacheline in
   struct kmem_cache thereby reducing performance.

3. An object allocated via kmalloc may no longer be freed via kmem_cache_free.

So we need to figure out some may to make this configurable. Preferably
runtime configurable.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/slub_def.h |   22 +++---
 mm/slub.c|   11 ---
 2 files changed, 27 insertions(+), 6 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 15:56:42.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:00:03.0 -0700
@@ -120,6 +120,19 @@ static inline struct kmem_cache *kmalloc
return kmalloc_caches[index];
 }
 
+static inline unsigned long kmalloc_size(size_t size)
+{
+   int index = kmalloc_index(size);
+
+   if (index = KMALLOC_SHIFT_LOW)
+   return 1  index;
+
+   if (index == 2)
+   return 192;
+   return 96;
+}
+
+
 #ifdef CONFIG_ZONE_DMA
 #define SLUB_DMA __GFP_DMA
 #else
@@ -135,7 +148,8 @@ static inline void *kmalloc(size_t size,
if (!s)
return NULL;
 
-   return kmem_cache_alloc(s, flags);
+   return kmem_cache_alloc(s, flags)
+   + kmalloc_size(size) - size;
} else
return __kmalloc(size, flags);
 }
@@ -148,7 +162,8 @@ static inline void *kzalloc(size_t size,
if (!s)
return NULL;
 
-   return kmem_cache_zalloc(s, flags);
+   return kmem_cache_zalloc(s, flags)
+   + kmalloc_size(size) - size;
} else
return __kzalloc(size, flags);
 }
@@ -159,7 +174,8 @@ extern void *__kmalloc_node(size_t size,
 static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
if (__builtin_constant_p(size)  !(flags  SLUB_DMA)) {
-   struct kmem_cache *s = kmalloc_slab(size);
+   struct kmem_cache *s = kmalloc_slab(size) +
+   kmalloc_size(size) - size;
 
if (!s)
return NULL;
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 15:51:05.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:15:21.0 -0700
@@ -2283,9 +2283,10 @@ static struct kmem_cache *get_slab(size_
 void *__kmalloc(size_t size, gfp_t flags)
 {
struct kmem_cache *s = get_slab(size, flags);
+   int offset = size - s-size;
 
if (s)
-   return slab_alloc(s, flags, -1, __builtin_return_address(0));
+   return slab_alloc(s, flags, -1, __builtin_return_address(0)) + 
offset;
return NULL;
 }
 EXPORT_SYMBOL(__kmalloc);
@@ -2294,9 +2295,10 @@ EXPORT_SYMBOL(__kmalloc);
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
struct kmem_cache *s = get_slab(size, flags);
+   int offset = size - s-size;
 
if (s)
-   return slab_alloc(s, flags, node, __builtin_return_address(0));
+   return slab_alloc(s, flags, node, __builtin_return_address(0)) 
+ offset;
return NULL;
 }
 EXPORT_SYMBOL(__kmalloc_node);
@@ -2337,6 +2339,7 @@ void kfree(const void *x)
 {
struct kmem_cache *s;
struct page *page;
+   unsigned long addr = (unsigned long) x;
 
if (!x)
return;
@@ -2344,7 +2347,9 @@ void kfree(const void *x)
page = virt_to_head_page(x);
s = page-slab;
 
-   slab_free(s, page, (void *)x, __builtin_return_address(0));
+   addr = ~((unsigned long)s-size - 1);
+
+   slab_free(s, page, (void *)addr, __builtin_return_address(0));
 }
 EXPORT_SYMBOL(kfree);
 
-
To unsubscribe from this list: send 

Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Christoph Lameter wrote:

 Hmmm... We are going rapidly here. This is a patch that I am testing right 
 now. It right adjust the object and the patch is manageable:

Does not boot sigh.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Andrew Morton wrote:
 
 We could store the size of the allocation in the allocated object?  Just
 add four bytes to the user's request, then pick the appropriate cache based
 on that, then put the user's `size' at the tail of the resulting allocation?

It should be easy enough to do it for _most_ allocations by just doing it 
when there is already enough slack to do it (which is likely true most 
of the time).

IOW, if you ask for a 42-byte allocation, and we allocate from a 64-byte 
slab, you get the slab allocation at address X, you don't actually have to 
return X at all. Just return X+8, and then you do:

 - at 32-bit word at X+0 you put the real length
 - at 32-bit word at X+4 you put some good redzone marker
 - at 32-bit word at X + reallen + 8 you put the endzone marker.

And then you say: if the real length was within 12 bytes of the allocation 
length, we just don't do this.

So you wouldn't get any redzoning for those allocations that are exactly 
sized (or close enough) to fit in an allocation block, but I bet *most* 
allocations would get this for free.

And then, if you actually turn on redzoning, you just always add the 12 
byte to the allocation size (assuming the alignment rules allow you to).

The nice thing about this is that the freeing path already knows where the 
object is *supposed* to start (because it sees the allocation size in the 
slub/slab data structures), so the kfree() path can actually figure out on 
its own whether it is given a X or an X+8 kind of address.

So you don't actually need any extra information. You literally just need 
enough slop in the allocation that you can do this in the first place, so 
there is no cost (except for the cost of checking itself, of course).

Hmm?

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

 So for *both* of the above reasons, it's actually stupid to return NULL 
 for a zero-sized allocation. It would be much better to return another 
 pointer that will trap on access. A good candidate might be to return
 
   #define BADPTR ((void *)16)

Something like this? (Not tested yet just for review):


SLUB: Return BADPTR instead of warning for kmalloc(0)

Remove the WARN_ON_ONCE and simply return BADPTR.

BADPTR can be used legitimately as long as it is not dereferenced.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/slub_def.h |   18 --
 mm/slub.c|8 
 2 files changed, 12 insertions(+), 14 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 16:19:36.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:24:54.0 -0700
@@ -12,6 +12,8 @@
 #include linux/kobject.h
 #include linux/log2.h
 
+#define BADPTR ((void *)16)
+
 struct kmem_cache_node {
spinlock_t list_lock;   /* Protect partial list and nr_partial */
unsigned long nr_partial;
@@ -74,13 +76,9 @@ extern struct kmem_cache kmalloc_caches[
  */
 static inline int kmalloc_index(size_t size)
 {
-   /*
-* We should return 0 if size == 0 (which would result in the
-* kmalloc caller to get NULL) but we use the smallest object
-* here for legacy reasons. Just issue a warning so that
-* we can discover locations where we do 0 sized allocations.
-*/
-   WARN_ON_ONCE(size == 0);
+
+   if (!size)
+   return 0;
 
if (size  KMALLOC_MAX_SIZE)
return -1;
@@ -133,7 +131,7 @@ static inline void *kmalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc(s, flags);
} else
@@ -146,7 +144,7 @@ static inline void *kzalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_zalloc(s, flags);
} else
@@ -162,7 +160,7 @@ static inline void *kmalloc_node(size_t 
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc_node(s, flags, node);
} else
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 16:21:00.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:27:12.0 -0700
@@ -2286,7 +2286,7 @@ void *__kmalloc(size_t size, gfp_t flags
 
if (s)
return slab_alloc(s, flags, -1, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2297,7 +2297,7 @@ void *__kmalloc_node(size_t size, gfp_t 
 
if (s)
return slab_alloc(s, flags, node, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2707,7 +2707,7 @@ void *__kmalloc_track_caller(size_t size
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, -1, caller);
 }
@@ -2718,7 +2718,7 @@ void *__kmalloc_node_track_caller(size_t
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, node, caller);
 }
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
 
 Hmmm... We are going rapidly here. This is a patch that I am testing right 
 now. It right adjust the object and the patch is manageable:

No, I don't think you can do it this way.

At a minimum, you'd need to test that the result is word-aligned. 
Preferably 8-byte aligned. We literally have stuff that knows about these 
things and uses the low bits in the pointer to keep extra data.

(That said, I didn't check whether we actually kmalloc() the data, but I 
think we do - things like struct key etc).

Now, maybe those things always have a 8-byte-aligned size, and it works 
out, but I'd be worried.

Of course, there migth be other (even more subtle) cases where we just 
assume certain alignment, and depend on the fact that we just _happen_ to 
get it. Who knows..

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

 No, I don't think you can do it this way.

Ultimately not. But its worth to see if this works.

 At a minimum, you'd need to test that the result is word-aligned. 
 Preferably 8-byte aligned. We literally have stuff that knows about these 
 things and uses the low bits in the pointer to keep extra data.

kmalloc allocations are guaranteed to be aligned to KMALLOC_MINALIGN. I 
can bring that in but it will make the patch less readable.

 Of course, there migth be other (even more subtle) cases where we just 
 assume certain alignment, and depend on the fact that we just _happen_ to 
 get it. Who knows..

I tried to get rid of those cased and I hope that work is complete in 
2.6.22.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
 
 Something like this? (Not tested yet just for review):

I'm not seeing the path on the freeing side that silently accepts BADPTR.

It is not ok to derefence BADPTR, but it's obviously ok to _free_ it.

Also, if I read the patch correctly, you _also_ return BADPTR for slabs 
that are too large. No?  That would be wrong - those need to return NULL 
for out of memory.

(But I only looked at the diff, not at the end result, so it may be that 
all the cases where you changed return NULL into return BADPTR were 
really just the size-zero ones - it just looked suspicious).

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

 I'm not seeing the path on the freeing side that silently accepts BADPTR.
 
 It is not ok to derefence BADPTR, but it's obviously ok to _free_ it.

Ok.

 Also, if I read the patch correctly, you _also_ return BADPTR for slabs 
 that are too large. No?  That would be wrong - those need to return NULL 
 for out of memory.

A too large alloc is 32MB or MAX_ORDER  PAGE_SIZE. A BUG_ON in 
kmalloc_slab() will trigger.

Here is the updated patch. It works fine here:


SLUB: Return BADPTR instead of warning for kmalloc(0)

Remove the WARN_ON_ONCE and simply return BADPTR.

BADPTR can be used legitimately as long as it is not dereferenced.
Can even be freed.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/slub_def.h |   18 --
 mm/slub.c|   10 +-
 2 files changed, 13 insertions(+), 15 deletions(-)

Index: slub/include/linux/slub_def.h
===
--- slub.orig/include/linux/slub_def.h  2007-06-01 16:19:36.0 -0700
+++ slub/include/linux/slub_def.h   2007-06-01 16:43:57.0 -0700
@@ -12,6 +12,8 @@
 #include linux/kobject.h
 #include linux/log2.h
 
+#define BADPTR ((void *)16)
+
 struct kmem_cache_node {
spinlock_t list_lock;   /* Protect partial list and nr_partial */
unsigned long nr_partial;
@@ -74,13 +76,9 @@ extern struct kmem_cache kmalloc_caches[
  */
 static inline int kmalloc_index(size_t size)
 {
-   /*
-* We should return 0 if size == 0 (which would result in the
-* kmalloc caller to get NULL) but we use the smallest object
-* here for legacy reasons. Just issue a warning so that
-* we can discover locations where we do 0 sized allocations.
-*/
-   WARN_ON_ONCE(size == 0);
+
+   if (!size)
+   return 0;
 
if (size  KMALLOC_MAX_SIZE)
return -1;
@@ -133,7 +131,7 @@ static inline void *kmalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc(s, flags);
} else
@@ -146,7 +144,7 @@ static inline void *kzalloc(size_t size,
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_zalloc(s, flags);
} else
@@ -162,7 +160,7 @@ static inline void *kmalloc_node(size_t 
struct kmem_cache *s = kmalloc_slab(size);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return kmem_cache_alloc_node(s, flags, node);
} else
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-06-01 16:21:00.0 -0700
+++ slub/mm/slub.c  2007-06-01 16:43:27.0 -0700
@@ -2286,7 +2286,7 @@ void *__kmalloc(size_t size, gfp_t flags
 
if (s)
return slab_alloc(s, flags, -1, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2297,7 +2297,7 @@ void *__kmalloc_node(size_t size, gfp_t 
 
if (s)
return slab_alloc(s, flags, node, __builtin_return_address(0));
-   return NULL;
+   return BADPTR;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2338,7 +2338,7 @@ void kfree(const void *x)
struct kmem_cache *s;
struct page *page;
 
-   if (!x)
+   if (!x || x == BADPTR)
return;
 
page = virt_to_head_page(x);
@@ -2707,7 +2707,7 @@ void *__kmalloc_track_caller(size_t size
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, -1, caller);
 }
@@ -2718,7 +2718,7 @@ void *__kmalloc_node_track_caller(size_t
struct kmem_cache *s = get_slab(size, gfpflags);
 
if (!s)
-   return NULL;
+   return BADPTR;
 
return slab_alloc(s, gfpflags, node, caller);
 }
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:
 
 A too large alloc is 32MB or MAX_ORDER  PAGE_SIZE. A BUG_ON in 
 kmalloc_slab() will trigger.

Did we use to BUG_ON()? I think that's wrong. There are ways for users to 
potentially ask the kernel to do big allocations, and the correct response 
is to say no can do, not to crash!

 Here is the updated patch. It works fine here:
 
 SLUB: Return BADPTR instead of warning for kmalloc(0)

Looks fine to me. My only comment is that

 - if (!x)
 + if (!x || x == BADPTR)
   return;

This could be micro-optimized (again, non-standard, but it should be 
practically portable) to have just a single test using something like

if ((unsigned long)x = 16)
return;

but I guess it doesn't really matter much.

I think this is better than what we have now, but I also suspect it's 
*not* something we should try this late in the -rc sequence ;)

Andrew, want to take this patch to -mm to see if it triggers anything?

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Linus Torvalds wrote:

  A too large alloc is 32MB or MAX_ORDER  PAGE_SIZE. A BUG_ON in 
  kmalloc_slab() will trigger.
 
 Did we use to BUG_ON()? I think that's wrong. There are ways for users to 
 potentially ask the kernel to do big allocations, and the correct response 
 is to say no can do, not to crash!

There is no way to distinguish that from out of memory. Failing on large 
allocs is what we have always done for kmalloc(). Before 2.6.22 we used to 
fail for allocs  256k which was a big nuisance for NUMAs large allocs.

The patches in 2.6.22 allow us for the first time to allocate arbitrary 
sized objects up to MAX_ORDER. So we no longer have troubles with large 
NUMA objects.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 1 Jun 2007 16:57:20 -0700 (PDT)
Linus Torvalds [EMAIL PROTECTED] wrote:

 Andrew, want to take this patch to -mm to see if it triggers anything?

spose so.

I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,
because it is exposing some coding warts.  But we should turn it off for
2.6.22 and make it conditional on CONFIG_DEVEL_KERNEL (or whatever it will
be called) later.

The BADPTR thing is a little worrying because it will make
previously-working-by-luck code go oops.  I guess we can live with that.

So we end up with the BADPTR code enabled even in production kernels, in
which case your ((unsigned long)x = 16) trick is worth doing.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, Andrew Morton wrote:

 I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,

The trouble with the WARN_ON is that it triggers even for code that is 
okay like noted by Jeremy. My initial intend with NULL was to allow the 
allocation of a zero sized pointer without extra checks. It should only
trigger a failure if something bad was done.

NULL had the problem of confusion with no memory available. I think BADPTR 
is good. If the coding warts are causing trouble then BADPTR will result 
in a failure. Otherwise if the code is doing a kmalloc(0) and not 
dereferencing the pointer (like Paul's fixed code) then we should be fine 
and not issue a warning.

The false positives may be upsetting some people.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Christoph Lameter wrote:
 Well there are architectural problems. We determine the power of two slab 
 at compile time. The object size information is currently not available in 
 the binary :=).
   

That only applies to allocations with constant sizes. One presumes
nobody is explicitly doing kmalloc(0), so we can use a separate
runtime-computed-size path to do poisoning. (Which is probably 90% of
the problem, since people who kmalloc(sizeof(struct foo)) will generally
stay within bounds without too much effort.)

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Andrew Morton wrote:
 As I said, it's specific to the kmalloc(0) problem, and we're fixing that
 by other means anyway.
   

I guess I'm not getting much traction in my campaign for equal rights
for zero-sized allocations. They're perfectly reasonable things to have,
if that's the way the code falls out. The warning is just prejudice.

J

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Linus Torvalds wrote:
 So for *both* of the above reasons, it's actually stupid to return NULL 
 for a zero-sized allocation. It would be much better to return another 
 pointer that will trap on access. A good candidate might be to return

   #define BADPTR ((void *)16)
   

I think this is a good idea in principle, but I wonder if there's any
code which assumes that kmalloc(x) != kmalloc(x) for all non-NULL
returns from kmalloc.

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Andrew Morton
On Fri, 01 Jun 2007 17:43:46 -0700 Jeremy Fitzhardinge [EMAIL PROTECTED] 
wrote:

 Andrew Morton wrote:
  As I said, it's specific to the kmalloc(0) problem, and we're fixing that
  by other means anyway.

 
 I guess I'm not getting much traction in my campaign for equal rights
 for zero-sized allocations. They're perfectly reasonable things to have,
 if that's the way the code falls out. The warning is just prejudice.
 

In some cases, sure, and we should remove the warning at some stage.

But it has exposed a couple of bugs and a few weird things.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Jeremy Fitzhardinge
Andrew Morton wrote:
 In some cases, sure, and we should remove the warning at some stage.

 But it has exposed a couple of bugs and a few weird things.
   

We just need to scatter better bug powder into the corners of the
allocations.

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Linus Torvalds


On Fri, 1 Jun 2007, Christoph Lameter wrote:

 On Fri, 1 Jun 2007, Andrew Morton wrote:
 
  I think it'd be better if we kept the WARN_ON_ONCE(size == 0) in there,
 
 The trouble with the WARN_ON is that it triggers even for code that is 
 okay like noted by Jeremy.

Yes. Sometimes it's just more natural to have

ptr = kmalloc(size);
.. use it ..
free(ptr);

and if a *degenerate* case of size=0 happens, who cares? It should just 
work, as long as we (obviously) don't actually try to access the pointer.

So I don't much like the WARN_ON(size == 0). I think it potentially just 
causes people to write around it, and quite possibly causes the callers to 
write code that is not at all more readable or maintainable!

That's why I'd much rather return BADPTR instead: we'll get an oops for 
buggy code, but we don't penalize the natural and good code! So once you 
return BADPTR, there really isn't any good reason for the WARN_ON.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Valdis . Kletnieks
On Fri, 01 Jun 2007 16:00:30 PDT, Linus Torvalds said:

   #define BADPTR ((void *)16)

 I bet you'd find *more* problems that way than by returning NULL, and 
 you'd also avoid the whole problem with if (!ptr) return -ENOMEM.

Hmm.. this looks like a good contender for first usage of #ifndef 
CONFIG_STABLE

:)


pgp5hRgrE3k4B.pgp
Description: PGP signature


Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777 warning

2007-06-01 Thread Christoph Lameter
On Fri, 1 Jun 2007, [EMAIL PROTECTED] wrote:

 On Fri, 01 Jun 2007 16:00:30 PDT, Linus Torvalds said:
 
  #define BADPTR ((void *)16)
 
  I bet you'd find *more* problems that way than by returning NULL, and 
  you'd also avoid the whole problem with if (!ptr) return -ENOMEM.
 
 Hmm.. this looks like a good contender for first usage of #ifndef 
 CONFIG_STABLE

The warning? Sure but the BADPTR (or ZERO_SIZE_PTR now) will stay. 
It is definitely an error to deference an object that has no size.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/