Re: x86-64 preemption fix from IRQ and BKL in 2.6.12-rc1-mm2

2005-03-28 Thread Andi Kleen
On Sun, Mar 27, 2005 at 08:05:13PM +0200, Christophe Saout wrote:
> On Sunday, 27.03.2005 at 19:26 +0200, Andi Kleen wrote:
> 
> > > preempt_schedule_irq is not an i386 specific function and seems to take
> > > special care of BKL preemption and since reiserfs does use the BKL to do
> > > certain things I think this actually might be the problem...?
> > 
> > Hmm, preempt_schedule_irq is not in mainline as far as I can see.
> > My patches are always for mainline; I don't do a special
> > patch kit for -mm*
> 
> PREEMPT_BKL has been in mainline since 2.6.11-rc1,  preempt_schedule_irq
> made it in 2.6.11-rc3. Please look here:
> http://linux.bkbits.net:8080/linux-2.6/search/?expr=preempt_schedule_irq&search=ChangeSet+comments

Hmm, true. I must have missed it with the last merge.

Looking at the changeset, your simple patch is probably OK.


> 
> Now that I have looked into it, I think it's obviously the correct
> solution.

Agreed.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: x86-64 preemption fix from IRQ and BKL in 2.6.12-rc1-mm2

2005-03-27 Thread Christophe Saout
On Sunday, 27.03.2005 at 19:26 +0200, Andi Kleen wrote:

> > preempt_schedule_irq is not an i386 specific function and seems to take
> > special care of BKL preemption and since reiserfs does use the BKL to do
> > certain things I think this actually might be the problem...?
> 
> Hmm, preempt_schedule_irq is not in mainline as far as I can see.
> My patches are always for mainline; I don't do a special
> patch kit for -mm*

PREEMPT_BKL has been in mainline since 2.6.11-rc1,  preempt_schedule_irq
made it in 2.6.11-rc3. Please look here:
http://linux.bkbits.net:8080/linux-2.6/search/?expr=preempt_schedule_irq&search=ChangeSet+comments

For i386 the first change was to switch to preempt_schedule in this code
path: http://linux.bkbits.net:8080/linux-2.6/[EMAIL PROTECTED]

preempt_schedule takes care of setting PREEMPT_ACTIVE and resetting it
afterwards, so I removed that from the assembler code.

Then preempt_schedule_irq was introduced to move the sti/cli back
around the call to schedule:
http://linux.bkbits.net:8080/linux-2.6/[EMAIL PROTECTED]

So in the end, the only thing the patch I proposed does is to
*additionally* handle the PREEMPT_BKL case, so that schedule doesn't
accidentally release the BKL semaphore when it shouldn't: we are
preempting, and nobody explicitly called schedule.

Several other archs have done the same. No bug showed up until the
recent -mm kernel, where the execution of this code path actually became
possible (the "jc -> jnc" fix some lines above).
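
The flag check can be sketched in C as follows. This is a reading of the
assembler quoted in this thread, assuming that bit 9 of the saved EFLAGS is the
interrupt flag (IF), that `bt` copies it into the carry flag, and that the
earlier code used `jc` where the fixed code uses `jnc`. The function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of EFLAGS is the interrupt-enable flag (IF). */
#define EFLAGS_IF_BIT 9

/* Equivalent of "bt $9, saved_eflags": report whether IF was set in the
 * interrupted context. */
bool irqs_were_enabled(uint64_t saved_eflags)
{
    return (saved_eflags >> EFLAGS_IF_BIT) & 1;
}

/* Fixed check ("jnc retint_restore_args"): skip preemption only when the
 * interrupted code had interrupts off. */
bool fixed_check_allows_preempt(uint64_t saved_eflags)
{
    return irqs_were_enabled(saved_eflags);
}

/* Earlier check (assumed "jc"): inverted, so a context interrupted with
 * interrupts enabled never reached the preemption path. */
bool buggy_check_allows_preempt(uint64_t saved_eflags)
{
    return !irqs_were_enabled(saved_eflags);
}
```

For example, the EFLAGS value 00010292 from the oops in this thread has bit 9
set, so the fixed check allows preemption and the inverted one does not.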

> It looks like an unfortunate interaction with some other patches
> in mm. Andrew, can you disable CONFIG_PREEMPT on x86-64 in
> mm for now?

These things are in 2.6.11 (except that they never got called because of
the wrong interrupt flag check in the IRQ handler).

> > Unfortunately I don't have an amd64 machine to play with, so can somebody
> > please check this?
> 
> How did you generate the crash dumps above then?

Well, nobody minds if I play with a webserver in the middle of the
night, as long as it works during the day. Shoot me. :)

Both servers have been running fine since I applied my patch last night.

Now that I have looked into it, I think it's obviously the correct
solution.





Re: x86-64 preemption fix from IRQ and BKL in 2.6.12-rc1-mm2

2005-03-27 Thread Andi Kleen
On Fri, Mar 25, 2005 at 08:26:25PM +0100, Christophe Saout wrote:
> Fortunately the kernel locked up and there was no data corruption.
> 
> I've got PREEMPT and PREEMPT_BKL enabled under UP.
> 
> I just took a look at the change and found this:
> 
> x86-64 does this (in entry.S):
> 
> bt   $9,EFLAGS-ARGOFFSET(%rsp)  /* interrupts off? */
> jnc   retint_restore_args
> movl $PREEMPT_ACTIVE,threadinfo_preempt_count(%rcx)
> sti
> call schedule
> cli
> GET_THREAD_INFO(%rcx)
> movl $0,threadinfo_preempt_count(%rcx)
> jmp exit_intr
> 
> while i386 does this:
> 
> testl $IF_MASK,EFLAGS(%esp) # interrupts off (exception path) ?
> jz restore_all
> call preempt_schedule_irq
> jmp need_resched
> 
> preempt_schedule_irq is not an i386 specific function and seems to take
> special care of BKL preemption and since reiserfs does use the BKL to do
> certain things I think this actually might be the problem...?

Hmm, preempt_schedule_irq is not in mainline as far as I can see.
My patches are always for mainline; I don't do a special
patch kit for -mm*

It looks like an unfortunate interaction with some other patches
in mm. Andrew, can you disable CONFIG_PREEMPT on x86-64 in
mm for now?

Just calling preempt_schedule_irq may also work,
but the patch that introduces that function most likely needs a
careful read to check whether it requires more support from architectures.
BTW I suspect it will break other archs too...

> Unfortunately I don't have an amd64 machine to play with, so can somebody
> please check this?

How did you generate the crash dumps above then?

-Andi


x86-64 preemption fix from IRQ and BKL in 2.6.12-rc1-mm2

2005-03-25 Thread Christophe Saout
Hi,

> +x86_64-fix-config_preempt.patch
>
> x86_64-fix-config_preempt.patch
>   x86_64: Fix CONFIG_PREEMPT

Has this one been stress-tested?

I've got the impression that things have become a lot worse.

I've been seeing things like these:

Mar 25 01:00:48 websrv2 REISERFS: panic (device dm-1): clm-6000: do_balance, fs generation has changed
Mar 25 01:00:48 websrv2
Mar 25 01:00:48 websrv2 --- [cut here ] - [please bite here ] -
Mar 25 01:00:48 websrv2 Kernel BUG at prints:362
Mar 25 01:00:48 websrv2 invalid operand:  [1] PREEMPT
Mar 25 01:00:48 websrv2 CPU 0
Mar 25 01:00:48 websrv2 Modules linked in: iptable_nat ipt_MARK iptable_mangle ipt_LOG ipt_multiport ipt_owner ipt_mark ipt_state ipt_REJECT iptable_filter ip_tables twofish serpent blowfish ext3 jbd reiser4 sha256 aes dm_crypt ip_conntrack_irc ip_conntrack_ftp ip_conntrack via_rhine 8139too crc32
Mar 25 01:00:48 websrv2 Pid: 25172, comm: rm Not tainted 2.6.12-rc1-cs1
Mar 25 01:00:48 websrv2 RIP: 0010:[801cfe13] 801cfe13{reiserfs_panic+211}
Mar 25 01:00:48 websrv2 RSP: 0018:81001efe37b8  EFLAGS: 00010292
Mar 25 01:00:48 websrv2 RAX: 0059 RBX: 803fbcac RCX: c100
Mar 25 01:00:48 websrv2 RDX:  RSI: 81007d0b31f0 RDI:
Mar 25 01:00:48 websrv2 RBP: 81004f960060 R08: 81001efe2000 R09: 0002
Mar 25 01:00:48 websrv2 R10:  R11: 80340ef0 R12: 81007f850230
Mar 25 01:00:48 websrv2 R13: 81007f85 R14:  R15: 81004f9565d0
Mar 25 01:00:48 websrv2 FS:  2aabaae0() GS:805be800() knlGS:55563dc0
Mar 25 01:00:48 websrv2 CS:  0010 DS:  ES:  CR0: 8005003b
Mar 25 01:00:48 websrv2 CR2: 2aaff008 CR3: 1ebbd000 CR4: 06e0
Mar 25 01:00:48 websrv2 Process rm (pid: 25172, threadinfo 81001efe2000, task 81007d0b31f0)
Mar 25 01:00:48 websrv2 Stack: 00300010 81001efe38a8 81001efe37d8 81001c041530
Mar 25 01:00:48 websrv2 81001efe39d8 801d4e42 81007e659a00 0063
Mar 25 01:00:48 websrv2 0063
Mar 25 01:00:48 websrv2 Call Trace:801d4e42{pathrelse_and_restore+66} 8010efe6{retint_kernel+46}
Mar 25 01:00:48 websrv2 801bb847{do_balance+39} 801bd315{do_balance+6901}
Mar 25 01:00:48 websrv2 801cbd90{unfix_nodes+128} 801be15b{do_balance+10555}
Mar 25 01:00:48 websrv2 801d7bf9{reiserfs_cut_from_item+1673} 801bfcfa{reiserfs_unlink+362}
Mar 25 01:00:48 websrv2 801873ae{vfs_unlink+462} 801874f9{sys_unlink+233}
Mar 25 01:00:48 websrv2 8018a268{sys_getdents+232} 8010f221{error_exit+0}
Mar 25 01:00:48 websrv2 8010e906{system_call+126}
Mar 25 01:00:48 websrv2
Mar 25 01:00:48 websrv2 Code: 0f 0b b8 c1 3f 80 ff ff ff ff 6a 01 4d 85 ed 48 c7 c2 40 ba
Mar 25 01:00:48 websrv2 RIP 801cfe13{reiserfs_panic+211} RSP 81001efe37b8

or

Mar 25 16:39:21 websrv2 VFS: brelse: Trying to free free buffer
Mar 25 16:39:21 websrv2 Badness in __brelse at fs/buffer.c:1295
Mar 25 16:39:21 websrv2
Mar 25 16:39:21 websrv2 Call Trace:8017787f{__find_get_block+479} 8017a175{__getblk+37}
Mar 25 16:39:21 websrv2 801de3d5{do_journal_end+2181} 80147d70{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2 801cbf50{reiserfs_sync_fs+64} 8017c0b3{sync_supers+211}
Mar 25 16:39:21 websrv2 8015a22a{wb_kupdate+42} 8015ae8f{pdflush+399}
Mar 25 16:39:21 websrv2 8015a200{wb_kupdate+0} 80147d70{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2 8015ad00{pdflush+0} 80147d2d{kthread+205}
Mar 25 16:39:21 websrv2 8010f3d7{child_rip+8} 80147d70{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2 80147c60{kthread+0} 8010f3cf{child_rip+0}

Fortunately the kernel locked up and there was no data corruption.

I've got PREEMPT and PREEMPT_BKL enabled under UP.

I just took a look at the change and found this:

x86-64 does this (in entry.S):

bt   $9,EFLAGS-ARGOFFSET(%rsp)  /* interrupts off? */
jnc   retint_restore_args
movl $PREEMPT_ACTIVE,threadinfo_preempt_count(%rcx)
sti
call schedule
cli
GET_THREAD_INFO(%rcx)
movl $0,threadinfo_preempt_count(%rcx)
jmp exit_intr

while i386 does this:

testl $IF_MASK,EFLAGS(%esp) # interrupts off (exception path) ?
jz restore_all
call preempt_schedule_irq
jmp need_resched

preempt_schedule_irq is not an i386 specific function and seems to take
special care of BKL preemption and since reiserfs does use the BKL to do
certain things I think this actually might be the problem...?

I'm not saying that this fix is wrong (it is obviously the right fix)
but it causes another problem to show up.

Unfortunately I don't have an amd64 machine to play with, so can somebody
please check this?




