Re: kmalloc zero size changes break i386

2007-07-20 Thread Pekka J Enberg
On Fri, 20 Jul 2007, Pekka J Enberg wrote:
> There's some heavy-duty function inlining going on in __kmalloc so could 
> you please work out the exact location of the oops as described in 
> Documentation/BUG-HUNTING (look for the "use GDB to translate" part).

And, of course, please check if a5c96d8a1c67f31ef48935a78da2d2076513842b 
fixes it.

Pekka
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kmalloc zero size changes break i386

2007-07-20 Thread Pekka J Enberg
Hi Roland,

On Thu, 19 Jul 2007, Roland Dreier wrote:
> [ 1350.668590] Unable to handle kernel NULL pointer dereference at 
> 0028 RIP:
> [ 1350.674068]  [8027b373] __kmalloc+0x51/0xaf

There's some heavy-duty function inlining going on in __kmalloc so could 
you please work out the exact location of the oops as described in 
Documentation/BUG-HUNTING (look for the "use GDB to translate" part).

And to double-check, this is SLAB with CONFIG_DEBUG_SLAB enabled? Do you 
see this with SLUB?

Pekka


Re: kmalloc zero size changes break i386

2007-07-19 Thread Andi Kleen
On Thursday 19 July 2007 21:19:29 Linus Torvalds wrote:
> 
> On Thu, 19 Jul 2007, Linus Torvalds wrote:
> > 
> > Does something like this fix it?
> > 
> > Christoph, please go over this and see if there are other cases like that.
> 
> Actually, here's a better version, I think.
> 
> Andi, does this patch fix your problem?

No, unfortunately not.

e.g. I see it in a git checkout (with head 
589f1e81bde732dd0b1bc5d01b6bddd4bcb4527b),
but not in plain -git12, but when I re-add my x86 patchkit to -git12 it happens 
again. I also switched to slub in the config and I also still see it. 
Trying to bisect that right now.

-Andi


Re: kmalloc zero size changes break i386

2007-07-19 Thread Linus Torvalds


On Thu, 19 Jul 2007, Linus Torvalds wrote:
> 
> Does something like this fix it?
> 
> Christoph, please go over this and see if there are other cases like that.

Actually, here's a better version, I think.

Andi, does this patch fix your problem?

Linus
---
 mm/slab.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 88bc633..c3feeaa 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3690,8 +3690,8 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 	 * functions.
 	 */
 	cachep = __find_general_cachep(size, flags);
-	if (unlikely(cachep == NULL))
-		return NULL;
+	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
+		return cachep;
 	return __cache_alloc(cachep, flags, caller);
 }
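
The patch above leans on a single check that covers both NULL and ZERO_SIZE_PTR. A minimal user-space sketch of how such a range check can work follows. It is an illustration under assumptions, not the mm/slab.c code: the sentinel value 16 and the <= comparison are assumed here, and model_kfree is a hypothetical helper that exists only in this sketch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel for zero-length allocations: a small non-NULL address
 * that is never a valid object and faults if dereferenced. */
#define ZERO_SIZE_PTR ((void *)16)

/* One comparison catches both NULL (0) and the sentinel (16). */
#define ZERO_OR_NULL_PTR(x) ((uintptr_t)(x) <= (uintptr_t)ZERO_SIZE_PTR)

/* Hypothetical free routine: skips NULL and the sentinel, frees the rest.
 * Returns 1 if memory was actually released, 0 otherwise. */
static int model_kfree(void *p)
{
    if (ZERO_OR_NULL_PTR(p))
        return 0; /* nothing was allocated, so nothing to free */
    free(p);
    return 1;
}
```

With this shape, a caller can pass the result of a zero-length allocation to the free routine without special-casing it, while any attempt to read or write through the sentinel still faults.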
 


Re: kmalloc zero size changes break i386

2007-07-19 Thread Pekka Enberg

Linus Torvalds wrote:
> Ok, I think I see it: I think the mm/slab.c conversion of kmalloc(0) is
> totally broken.
>
> The problem? It returns ZERO_SIZE_PTR from __find_general_cachep(), not
> from __kmalloc(). So anything that uses __find_general_cachep() will get
> an invalid cachep pointer, which was not the point.
>
> Does something like this fix it?

I wondered about that too but I didn't spot any callers that would 
actually break. Andi? Roland?

> Christoph, please go over this and see if there are other cases like that.

__do_kmalloc_node probably.


Pekka


Re: kmalloc zero size changes break i386

2007-07-19 Thread Linus Torvalds


On Thu, 19 Jul 2007, Roland Dreier wrote:
>
> I think the oops below is related -- Michael reports that avoiding
> kmalloc(0) in the mlx4_ib driver makes it go away.

Ok, I think I see it: I think the mm/slab.c conversion of kmalloc(0) is
totally broken.

The problem? It returns ZERO_SIZE_PTR from __find_general_cachep(), not 
from __kmalloc(). So anything that uses __find_general_cachep() will get 
an invalid cachep pointer, which was not the point.

We're deprecating SLAB, and a lot of people are already using SLUB, 
which hid this. 

Does something like this fix it?

Christoph, please go over this and see if there are other cases like that.

Linus

---
diff --git a/mm/slab.c b/mm/slab.c
index 88bc633..4bc4bc0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -775,8 +775,6 @@ static inline struct kmem_cache *__find_general_cachep(size_t size,
 	 */
 	BUG_ON(malloc_sizes[INDEX_AC].cs_cachep == NULL);
 #endif
-	if (!size)
-		return ZERO_SIZE_PTR;
 
 	while (size > csizep->cs_size)
 		csizep++;
@@ -3684,6 +3682,9 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 {
 	struct kmem_cache *cachep;
 
+	if (!size)
+		return ZERO_SIZE_PTR;
+
 	/* If you want to save a few bytes .text space: replace
 	 * __ with kmem_.
 * Then kmalloc uses the uninlined functions instead of the inline
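
The layering issue described in this message can be modelled in user space. The sketch below is hypothetical throughout: struct model_cache, model_find_cache and model_kmalloc are stand-ins invented for illustration, and the sentinel value 16 is an assumption. It only shows the idea of the patch, namely that the cache lookup returns a real cache or NULL, and the zero-size sentinel is produced by the allocator entry point instead.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed sentinel for zero-length allocations. */
#define ZERO_SIZE_PTR ((void *)16)

/* Hypothetical stand-in for struct kmem_cache. */
struct model_cache {
    size_t object_size;
};

static struct model_cache model_caches[] = { {32}, {64}, {128}, {256} };
#define MODEL_NR_CACHES (sizeof(model_caches) / sizeof(model_caches[0]))

/* Cache lookup: returns a usable cache or NULL, never the sentinel,
 * so every caller can safely dereference a non-NULL result. */
static struct model_cache *model_find_cache(size_t size)
{
    for (size_t i = 0; i < MODEL_NR_CACHES; i++)
        if (size <= model_caches[i].object_size)
            return &model_caches[i];
    return NULL; /* request is larger than anything in this toy table */
}

/* Allocator entry point: the zero-size case is handled here, before the
 * lookup, so the sentinel only ever appears as an allocation result. */
static void *model_kmalloc(size_t size)
{
    struct model_cache *cachep;

    if (!size)
        return ZERO_SIZE_PTR;

    cachep = model_find_cache(size);
    if (cachep == NULL)
        return NULL;
    return malloc(cachep->object_size);
}
```

In this model a second caller of the lookup function needs no knowledge of the sentinel, which is the property the original placement of the check had lost.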


Re: kmalloc zero size changes break i386

2007-07-19 Thread Andi Kleen
On Thursday 19 July 2007 16:08:34 Pekka Enberg wrote:
> Hi Andi,
> 
> On 7/19/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
> > qemu testing and booting test machines with i386 kernels wasn't very successful
> > with recent git kernels. I got either BUGs because of failing sysfs initialization
> > or oopses in kmalloc, but no user land.
> >
> > I bisected it down to this commit.
> >
> > To reproduce: try to boot a 386 defconfig kernel, compiled with gcc 4.1, in qemu
> 
> [snip]
> 
> > 6cb8f91320d3e720351c21741da795fed580b21b is first bad commit
> > commit 6cb8f91320d3e720351c21741da795fed580b21b
> > Author: Christoph Lameter <[EMAIL PROTECTED]>
> > Date:   Tue Jul 17 04:03:22 2007 -0700
> >
> > Slab allocators: consistent ZERO_SIZE_PTR support and NULL result semantics
> 
> I have i386 defconfig kernel + qemu + busybox userland image + GCC
> 4.1.2 booting ok here. Did you manage to capture the oops? 

The sysfs crashes are all over; the first I saw is the BUG_ON() in
kernel_param_sysfs_setup(); but it was changing during the bisect
and in other similar sysfs BUG_ON()s too.

During one bisect state I also had a crash in kmalloc itself. I don't
have it right now; do you want me to restart the bisect to
recreate it?

x86-64 kernels BTW work just fine; just something seems to be broken
with i386.

Unfortunately newsetup seems to have broken argument passing to qemu
so it's a bit difficult to get more out of it.

> Is the 
> userland image available somewhere?

Userland is not reached yet. You can just use a dummy dd if=/dev/zero of= 
bs=1M count=1
file. 

-Andi


Re: kmalloc zero size changes break i386

2007-07-19 Thread Linus Torvalds


On Thu, 19 Jul 2007, Andi Kleen wrote:
> 
> qemu testing and booting test machines with i386 kernels wasn't very successful
> with recent git kernels. I got either BUGs because of failing sysfs initialization
> or oopses in kmalloc, but no user land.

Can you send in the oopses and BUGs? The bug is almost certainly not the 
commit you point to, but some bad kernel code that that commit just shows.

Linus


Re: kmalloc zero size changes break i386

2007-07-19 Thread Roland Dreier
I think the oops below is related -- Michael reports that avoiding
kmalloc(0) in the mlx4_ib driver makes it go away.

From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Subject: oops on mlx4 modprobe
To: [EMAIL PROTECTED], Roland Dreier <[EMAIL PROTECTED]>
Date: Thu, 19 Jul 2007 11:47:51 +0300
Reply-To: "Michael S. Tsirkin" <[EMAIL PROTECTED]>

I got the following when loading mlx4_ib on git
589f1e81bde732dd0b1bc5d01b6bddd4bcb4527b


[ 1350.668590] Unable to handle kernel NULL pointer dereference at 
0028 RIP:
[ 1350.674068]  [8027b373] __kmalloc+0x51/0xaf
[ 1350.682159] PGD 0
[ 1350.684378] Oops:  [1] SMP
[ 1350.687735] CPU 3
[ 1350.689950] Modules linked in: ib_ipoib ib_cm ib_sa ib_uverbs ib_umad 
mlx4_ib mlx4_core ib_mthca ib_mad ib_core piix ata_piix
[ 1350.701777] Pid: 5391, comm: ipoib Not tainted 2.6.22-x86_64-git #119
[ 1350.708400] RIP: 0010:[8027b373]  [8027b373] 
__kmalloc+0x51/0xaf
[ 1350.716536] RSP: 0018:81007c655ba0  EFLAGS: 00010046
[ 1350.722034] RAX: 0003 RBX: 0246 RCX: 0040
[ 1350.729352] RDX: 81007ed15000 RSI: 00d0 RDI: 
[ 1350.736669] RBP: 81007c655bc0 R08: fff0 R09: 810075779d80
[ 1350.743985] R10: 0001 R11: 05b8d800 R12: 00d0
[ 1350.751302] R13: 0010 R14: 81007ed7cc78 R15: 81007dbad800
[ 1350.758620] FS:  () GS:81007ff2b340() 
knlGS:
[ 1350.767089] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[ 1350.773021] CR2: 0028 CR3: 75ca6000 CR4: 06e0
[ 1350.780338] Process ipoib (pid: 5391, threadinfo 81007c654000, task 
81007c5d8040)
[ 1350.788895] Stack:  81007ed7cc00  81007ed7cc00 
81007ed7cd20
[ 1350.797331]  81007c655c40 88063cb6 81006ae20b80 
6ae20c30
[ 1350.805151]  81007c655df0 81007e3ba380 00d0 
81007ffa7c80
[ 1350.812587] Call Trace:
[ 1350.815619]  [88063cb6] :mlx4_ib:create_qp_common+0x558/0x736
[ 1350.822421]  [88064c2e] :mlx4_ib:mlx4_ib_create_qp+0x62/0x11f
[ 1350.829223]  [880999d2] :ib_ipoib:ipoib_cm_tx_completion+0x0/0x2bb
[ 1350.836461]  [8800eca9] :ib_core:ib_create_qp+0x18/0x94
[ 1350.842743]  [8809a281] :ib_ipoib:ipoib_cm_tx_start+0x216/0x651
[ 1350.849714]  [80244382] queue_work+0x3f/0x4a
[ 1350.855043]  [88080e63] :ib_sa:ib_sa_join_multicast+0x292/0x2df
[ 1350.862030]  [8809a06b] :ib_ipoib:ipoib_cm_tx_start+0x0/0x651
[ 1350.868829]  [80243cd4] run_workqueue+0x85/0x10f
[ 1350.874501]  [80244695] worker_thread+0x0/0xe7
[ 1350.88]  [80244771] worker_thread+0xdc/0xe7
[ 1350.885585]  [80247747] autoremove_wake_function+0x0/0x38
[ 1350.892036]  [80247622] kthread+0x49/0x77
[ 1350.897102]  [8020caa8] child_rip+0xa/0x12
[ 1350.902254]  [802475d9] kthread+0x0/0x77
[ 1350.907231]  [8020ca9e] child_rip+0x0/0x12
[ 1350.912384]
[ 1350.914068]
[ 1350.914068] Code: 49 8b 54 c5 00 83 3a 00 74 16 8b 02 c7 42 0c 01 00 00 00 ff
[ 1350.923599] RIP  [8027b373] __kmalloc+0x51/0xaf
[ 1350.929195]  RSP 81007c655ba0
[ 1350.932873] CR2: 0028
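
The fault address in the oops above can be cross-checked against the register dump. This is a sketch under several assumptions, none of which the report states: the archive appears to have stripped leading digits from the register values, the Code: bytes are assumed to begin at the faulting instruction, and the first five bytes (49 8b 54 c5 00) are decoded by hand as mov 0x0(%r13,%rax,8),%rdx. The names model_effective_address and model_oops_address_matches are made up for illustration.

```c
#include <stdint.h>

/* Values read from the oops; the archive seems to have dropped leading
 * digits, so only these low-order values are assumed to be meaningful. */
#define OOPS_R13 UINT64_C(0x10) /* base register of the faulting load */
#define OOPS_RAX UINT64_C(0x3)  /* index register; matches "CPU 3" */
#define OOPS_CR2 UINT64_C(0x28) /* faulting address reported in CR2 */

/* Effective address of an x86 memory operand: base + index * scale + disp. */
static uint64_t model_effective_address(uint64_t base, uint64_t index,
                                        uint64_t scale, uint64_t disp)
{
    return base + index * scale + disp;
}

/* Returns 1 when the hand-decoded operand reproduces the fault address. */
static int model_oops_address_matches(void)
{
    return model_effective_address(OOPS_R13, OOPS_RAX, 8, 0) == OOPS_CR2;
}
```

If the decoding holds, the base register contains 16, which is assumed here to be the value of ZERO_SIZE_PTR, and the load looks like a per-CPU array lookup indexed by CPU number. That would be consistent with a kmalloc(0) call handing ZERO_SIZE_PTR to code that expects a real cache pointer.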


Re: kmalloc zero size changes break i386

2007-07-19 Thread Pekka Enberg

Hi Andi,

On 7/19/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
> qemu testing and booting test machines with i386 kernels wasn't very successful
> with recent git kernels. I got either BUGs because of failing sysfs initialization
> or oopses in kmalloc, but no user land.
>
> I bisected it down to this commit.
>
> To reproduce: try to boot a 386 defconfig kernel, compiled with gcc 4.1, in qemu

[snip]

> 6cb8f91320d3e720351c21741da795fed580b21b is first bad commit
> commit 6cb8f91320d3e720351c21741da795fed580b21b
> Author: Christoph Lameter <[EMAIL PROTECTED]>
> Date:   Tue Jul 17 04:03:22 2007 -0700
>
> Slab allocators: consistent ZERO_SIZE_PTR support and NULL result semantics


I have i386 defconfig kernel + qemu + busybox userland image + GCC
4.1.2 booting ok here. Did you manage to capture the oops? Is the
userland image available somewhere?

Pekka

