Re: Linux 2.6.21-rc5
On 27.03.2007 08:17, Andrew Morton wrote:
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
[...]
> Maintainers are cc'ed.  Please promptly ack, nack or otherwise quack, else
> I'll be making my own decisions ;)

[CC list trimmed]

It's not on that list, but would you mind slipping
drivers-isdn-gigaset-mark-some-static-data-as-const-v2.patch
into 2.6.21 too? It's largely trivial but I'd like to get it out
of the door.

Thanks,
Tilman

--
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: Linux 2.6.21-rc5
On Wed, 28 Mar 2007, Andi Kleen wrote:
>
> Can you test this patch please?

This patch is totally broken.

> i386/x86-64: Convert nmi reservation to be global
>
> It doesn't make much sense to have this per CPU, because all
> the services using NMIs run on all CPUs. So make it global.

NO!

If you do this, then you must make all *callers* be global too.

But they aren't. Right now all callers do per-CPU setup!

See for example enable_lapic_nmi_watchdog():

	on_each_cpu(setup_apic_nmi_watchdog, NULL, 0, 1);

where "setup_apic_nmi_watchdog()" will call "setup_k7_watchdog()", which
in turn will do a per-CPU reservation of the perfctr for the watchdog.

So I agree in that it probably doesn't make sense to have NMI/perfctr
reservation per-CPU, but you can't just change the reservation and ignore
all the *users* of that reservation that assumed that it was per-CPU.

Is that code insane? Probably. But it probably also works. After your
patch, one CPU will be able to reserve the NMI/perfctr thing (fine so far)
but then all the other CPUs that try to do it will fail.

		Linus
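The failure mode described in this reply can be sketched in plain userspace C.
This is only a toy model under stated assumptions, not kernel code: the names
MODEL_NR_CPUS, MODEL_COUNTER_BITS, model_test_and_set_bit(), reserve_percpu(),
reserve_global() and setup_watchdog_everywhere() are hypothetical, and the bit
operation is a non-atomic stand-in for the kernel's test_and_set_bit(). The
loop plays the role of on_each_cpu(), where every CPU tries to reserve the
same counter.

```c
#include <assert.h>

#define MODEL_NR_CPUS      4	/* hypothetical CPU count for the model */
#define MODEL_COUNTER_BITS 8	/* hypothetical number of counters */

/* One owner bitmap per CPU: the scheme before the proposed patch. */
static unsigned long percpu_owner[MODEL_NR_CPUS];
/* One owner bitmap shared by all CPUs: the scheme in the proposed patch. */
static unsigned long global_owner;

/*
 * Non-atomic userspace stand-in for the kernel's test_and_set_bit().
 * Sets bit nr and returns its previous value (0 or 1).
 */
int model_test_and_set_bit(unsigned int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << nr;
	int old = (*addr & mask) != 0;

	*addr |= mask;
	return old;
}

/* Returns 1 if the counter was free in this CPU's bitmap, else 0. */
int reserve_percpu(int cpu, unsigned int counter)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(counter < MODEL_COUNTER_BITS);
	return !model_test_and_set_bit(counter, &percpu_owner[cpu]);
}

/* Returns 1 if the counter was free in the shared bitmap, else 0. */
int reserve_global(unsigned int counter)
{
	assert(counter < MODEL_COUNTER_BITS);
	return !model_test_and_set_bit(counter, &global_owner);
}

/*
 * Models a caller that does per-CPU setup: each CPU reserves the same
 * counter. Returns how many CPUs got their reservation.
 */
int setup_watchdog_everywhere(int use_global, unsigned int counter)
{
	int cpu, ok = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		ok += use_global ? reserve_global(counter)
				 : reserve_percpu(cpu, counter);
	return ok;
}
```

In this model, setup_watchdog_everywhere(0, 0) succeeds on all four CPUs,
while setup_watchdog_everywhere(1, 0) succeeds only on the first one, which
is the mismatch between a global reservation and per-CPU callers.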
Re: Linux 2.6.21-rc5
On 28/03/07, Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Wed, 28 Mar 2007, Michal Piotrowski wrote:
>
> > BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
> > is fixed, thanks.
> > but I still get this
> > [ 208.523901] =
> > [ 208.529739] [ INFO: inconsistent lock state ]
> > [ 208.534087] 2.6.21-rc5-g28defbea-dirty #131
> > [ 208.538260] -
> > [ 208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
> > [ 208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>
> Perhaps something like the one below?

Problem seems to be fixed. Thanks!

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: Linux 2.6.21-rc5
On Wed, 28 Mar 2007, Michal Piotrowski wrote:

> BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
> is fixed, thanks.
> but I still get this
> [ 208.523901] =
> [ 208.529739] [ INFO: inconsistent lock state ]
> [ 208.534087] 2.6.21-rc5-g28defbea-dirty #131
> [ 208.538260] -
> [ 208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
> [ 208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:

Perhaps something like the one below?

From: Jiri Kosina <[EMAIL PROTECTED]>

oprofile: fix potential deadlock on oprofilefs_lock

nmi_cpu_setup() is called from hardirq context and acquires oprofilefs_lock.
alloc_event_buffer() and oprofilefs_ulong_from_user() acquire this lock
without disabling irqs, which could deadlock.

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

 drivers/oprofile/event_buffer.c |5 +++--
 drivers/oprofile/oprofilefs.c |5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 00e937e..e7fbac5 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -70,11 +70,12 @@ void wake_up_buffer_waiter(void)
 int alloc_event_buffer(void)
 {
 	int err = -ENOMEM;
+	unsigned long flags;

-	spin_lock(&oprofilefs_lock);
+	spin_lock_irqsave(&oprofilefs_lock, flags);
 	buffer_size = fs_buffer_size;
 	buffer_watershed = fs_buffer_watershed;
-	spin_unlock(&oprofilefs_lock);
+	spin_unlock_irqrestore(&oprofilefs_lock, flags);

 	if (buffer_watershed >= buffer_size)
 		return -EINVAL;
diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c
index 6e67b42..8543cb2 100644
--- a/drivers/oprofile/oprofilefs.c
+++ b/drivers/oprofile/oprofilefs.c
@@ -65,6 +65,7 @@ ssize_t oprofilefs_ulong_to_user(unsigned long val, char __user * buf, size_t co
 int oprofilefs_ulong_from_user(unsigned long * val, char const __user * buf, size_t count)
 {
 	char tmpbuf[TMPBUFSIZE];
+	unsigned long flags;

 	if (!count)
 		return 0;
@@ -77,9 +78,9 @@ int oprofilefs_ulong_from_user(unsigned long * val, char const __user * buf, siz
 	if (copy_from_user(tmpbuf, buf, count))
 		return -EFAULT;

-	spin_lock(&oprofilefs_lock);
+	spin_lock_irqsave(&oprofilefs_lock, flags);
 	*val = simple_strtoul(tmpbuf, NULL, 0);
-	spin_unlock(&oprofilefs_lock);
+	spin_unlock_irqrestore(&oprofilefs_lock, flags);
 	return 0;
 }
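Why the lock validator complains about this lock, and why the irqsave variant
silences it, can be illustrated with a small userspace model. This is a
sketch under loud assumptions: it is not how lockdep is implemented, and
struct model_lock, the USED_* bits and model_acquire() are hypothetical names
invented for this example. The model only records in which contexts a lock
has ever been taken and flags the combination reported in the
"{hardirq-on-W} -> {in-hardirq-W}" message.

```c
#include <assert.h>

/* Hypothetical usage bits, loosely named after the reported states. */
enum {
	USED_IN_HARDIRQ = 1 << 0,	/* taken from hardirq context */
	USED_HARDIRQ_ON = 1 << 1,	/* taken with hardirqs enabled */
};

/* Toy lock: only the history of acquisition contexts is tracked. */
struct model_lock {
	unsigned int usage;
};

/*
 * Record one acquisition of the lock.
 *
 * in_hardirq:       nonzero if the caller runs in hardirq context
 * hardirqs_enabled: nonzero if hardirqs are enabled while the lock is held
 *
 * Returns 1 if the lock has now been seen both in hardirq context and
 * with hardirqs enabled (an interrupt could then arrive while the lock is
 * held and spin on it forever on the same CPU), otherwise 0.
 */
int model_acquire(struct model_lock *lock, int in_hardirq,
		  int hardirqs_enabled)
{
	if (in_hardirq)
		lock->usage |= USED_IN_HARDIRQ;
	else if (hardirqs_enabled)
		lock->usage |= USED_HARDIRQ_ON;

	return (lock->usage & USED_IN_HARDIRQ) &&
	       (lock->usage & USED_HARDIRQ_ON);
}
```

Taking the lock once from process context with interrupts enabled (the plain
spin_lock() case) and once from hardirq context makes model_acquire() return
1. If the process-context acquisition happens with interrupts disabled, as
with spin_lock_irqsave(), the same sequence never reports an inconsistency.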
Re: Linux 2.6.21-rc5
Andi Kleen wrote:
> On Tuesday 27 March 2007 20:53, Michal Piotrowski wrote:
>> Linus Torvalds wrote:
>>> There's various fixes here, ranging from some architecture updates (ia64,
>>> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>>>
>>> And random one-liners.
>>>
>> I found this in mm snapshot
>> http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
>> it's in mainline too.
>>
>> Andi, any progress with this bug?
>
> Can you test this patch please?
>

BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
is fixed, thanks.

but I still get this

[ 208.523901] =
[ 208.529739] [ INFO: inconsistent lock state ]
[ 208.534087] 2.6.21-rc5-g28defbea-dirty #131
[ 208.538260] -
[ 208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
[ 208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 208.553553] (oprofilefs_lock){+-..}, at: [] nmi_cpu_setup+0x15/0x4f [oprofile]
[ 208.561800] {hardirq-on-W} state was registered at:
[ 208.55] [] __lock_acquire+0x442/0xba1
[ 208.571765] [] lock_acquire+0x68/0x82
[ 208.576519] [] _spin_lock+0x35/0x42
[ 208.581102] [] oprofilefs_ulong_from_user+0x4e/0x74 [oprofile]
[ 208.588026] [] ulong_write_file+0x2a/0x38 [oprofile]
[ 208.594084] [] vfs_write+0xaf/0x138
[ 208.598658] [] sys_write+0x3d/0x61
[ 208.603171] [] syscall_call+0x7/0xb
[ 208.607751] [] 0x
[ 208.611478] irq event stamp: 575782
[ 208.614960] hardirqs last enabled at (575781): [] default_idle+0x3e/0x59
[ 208.622645] hardirqs last disabled at (575782): [] call_function_interrupt+0x29/0x38
[ 208.631281] softirqs last enabled at (575768): [] __do_softirq+0xe4/0xea
[ 208.638965] softirqs last disabled at (575759): [] do_softirq+0x64/0xd1
[ 208.646478]
[ 208.646479] other info that might help us debug this:
[ 208.653003] no locks held by swapper/0.
[ 208.656832]
[ 208.656833] stack backtrace:
[ 208.661199] [] show_trace_log_lvl+0x1a/0x2f
[ 208.666350] [] show_trace+0x12/0x14
[ 208.670811] [] dump_stack+0x16/0x18
[ 208.675272] [] print_usage_bug+0x140/0x14a
[ 208.680336] [] mark_lock+0xa1/0x40b
[ 208.684796] [] __lock_acquire+0x3b3/0xba1
[ 208.689775] [] lock_acquire+0x68/0x82
[ 208.694410] [] _spin_lock+0x35/0x42
[ 208.698869] [] nmi_cpu_setup+0x15/0x4f [oprofile]
[ 208.704540] [] smp_call_function_interrupt+0x3a/0x56
[ 208.710470] [] call_function_interrupt+0x33/0x38
[ 208.716053] [] cpu_idle+0xb6/0xeb
[ 208.720342] [] start_secondary+0x333/0x33b
[ 208.725407] [<>] 0x0
[ 208.728397] ===

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: Linux 2.6.21-rc5
On Tuesday 27 March 2007 20:53, Michal Piotrowski wrote:
> Linus Torvalds wrote:
> > There's various fixes here, ranging from some architecture updates (ia64,
> > ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
> >
> > And random one-liners.
> >
>
> I found this in mm snapshot
> http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
> it's in mainline too.
>
> Andi, any progress with this bug?

Can you test this patch please?

-Andi

i386/x86-64: Convert nmi reservation to be global

It doesn't make much sense to have this per CPU, because all
the services using NMIs run on all CPUs. So make it global.

This also fixes a warning about unprotected use of smp_processor_id
on preemptible kernels.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/i386/kernel/nmi.c
===
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -41,8 +41,8 @@ int nmi_watchdog_enabled;
  * different subsystems this reservation system just tries to coordinate
  * things a little
  */
-static DEFINE_PER_CPU(unsigned long, perfctr_nmi_owner);
-static DEFINE_PER_CPU(unsigned long, evntsel_nmi_owner[3]);
+static unsigned long perfctr_nmi_owner;
+static unsigned long evntsel_nmi_owner[3];

 static cpumask_t backtrace_mask = CPU_MASK_NONE;

@@ -124,7 +124,7 @@ int avail_to_resrv_perfctr_nmi_bit(unsig
 {
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }

 /* checks the an msr for availability */
@@ -135,7 +135,7 @@ int avail_to_resrv_perfctr_nmi(unsigned
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }

 int reserve_perfctr_nmi(unsigned int msr)
@@ -145,7 +145,7 @@ int reserve_perfctr_nmi(unsigned int msr
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	if (!test_and_set_bit(counter, &__get_cpu_var(perfctr_nmi_owner)))
+	if (!test_and_set_bit(counter, &perfctr_nmi_owner))
 		return 1;
 	return 0;
 }
@@ -157,7 +157,7 @@ void release_perfctr_nmi(unsigned int ms
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	clear_bit(counter, &__get_cpu_var(perfctr_nmi_owner));
+	clear_bit(counter, &perfctr_nmi_owner);
 }

 int reserve_evntsel_nmi(unsigned int msr)
@@ -167,7 +167,7 @@ int reserve_evntsel_nmi(unsigned int msr
 	counter = nmi_evntsel_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	if (!test_and_set_bit(counter, &__get_cpu_var(evntsel_nmi_owner)[0]))
+	if (!test_and_set_bit(counter, &evntsel_nmi_owner[0]))
 		return 1;
 	return 0;
 }
@@ -179,7 +179,7 @@ void release_evntsel_nmi(unsigned int ms
 	counter = nmi_evntsel_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	clear_bit(counter, &__get_cpu_var(evntsel_nmi_owner)[0]);
+	clear_bit(counter, &evntsel_nmi_owner[0]);
 }

 static __cpuinit inline int nmi_known_cpu(void)
Index: linux/arch/x86_64/kernel/nmi.c
===
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -39,8 +39,8 @@ int panic_on_unrecovered_nmi;
  * different subsystems this reservation system just tries to coordinate
  * things a little
  */
-static DEFINE_PER_CPU(unsigned, perfctr_nmi_owner);
-static DEFINE_PER_CPU(unsigned, evntsel_nmi_owner[2]);
+static unsigned perfctr_nmi_owner;
+static unsigned evntsel_nmi_owner[2];

 static cpumask_t backtrace_mask = CPU_MASK_NONE;

@@ -110,7 +110,7 @@ int avail_to_resrv_perfctr_nmi_bit(unsig
 {
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }

 /* checks the an msr for availability */
@@ -121,7 +121,7 @@ int avail_to_resrv_perfctr_nmi(unsigned
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }

 int reserve_perfctr_nmi(unsigned int msr)
@@ -131,7 +131,7 @@ int reserve_perfctr_nmi(unsigned int msr
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);

-	if (!test_and_set_bit(counter, &__get_cpu_var(perfctr_nmi_owner)))
+	if (!test_and_set_bit(counter, &perfctr_nmi_owner))
 		return 1;
 	return 0;
 }
@@ -143,7 +143,7 @@ void release_perfctr_nmi(unsigned int ms
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_
Re: ATA ACPI (was Re: Linux 2.6.21-rc5)
Pavel Machek wrote:
> Hi!
>
>>>> So if you have reported a regression in the 2.6.21-rc series, please
>>>> check 2.6.21-rc5, and update your report as appropriate (whether fixed
>>>> or "still problems with xyzzy").
>>> [just got back from vacation, or would have sent this
>>> earlier]
>>>
>>> FWIW, I'm still leaning towards disabling libata ACPI
>>> support by default for 2.6.21.
>>>
>>> Upstream has Alan's fix for the worst PATA problems,
>>> but for different reasons, I think PATA ACPI and SATA
>>> ACPI support in libata does not feel quite ready for
>>> prime time in 2.6.21.
>>>
>>> Scream now, or hold your peace until 2.6.22... :)
>> I second disabling ACPI for 2.6.21.
>
> Ugh.. does that mean we'll have 'regression reports' as in 'it worked
> ok in -rc5, broken in final?
>
> Well, suspend is currently so broken that we'll be flooded by reports,
> anyway, but could we get at least define in code so that we can
> tell users to flip it?

Only the default value of libata.noacpi is changed to 1, so users can
easily re-enable ACPI support by passing a boot or module parameter.

--
tejun
Re: ATA ACPI (was Re: Linux 2.6.21-rc5)
Hi!

> >>So if you have reported a regression in the 2.6.21-rc
> >>series, please check 2.6.21-rc5, and update your
> >>report as appropriate (whether fixed or "still
> >>problems with xyzzy").
> >
> >[just got back from vacation, or would have sent this
> >earlier]
> >
> >FWIW, I'm still leaning towards disabling libata ACPI
> >support by default for 2.6.21.
> >
> >Upstream has Alan's fix for the worst PATA problems,
> >but for different reasons, I think PATA ACPI and SATA
> >ACPI support in libata does not feel quite ready for
> >prime time in 2.6.21.
> >
> >Scream now, or hold your peace until 2.6.22... :)
>
> I second disabling ACPI for 2.6.21.

Ugh.. does that mean we'll have 'regression reports' as in 'it worked
ok in -rc5, broken in final'?

Well, suspend is currently so broken that we'll be flooded by reports
anyway, but could we at least get a define in the code so that we can
tell users to flip it?

Or maybe it is enough to make libata dependent on EXPERIMENTAL?

...making it dependent on BROKEN should definitely be enough...

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux 2.6.21-rc5
Pavel Machek wrote:
> Hi!
>
>>> There's various fixes here, ranging from some architecture updates (ia64,
>>> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>>>
>> Suspend to disk doesn't work for me with this patch. It hangs after
>> PM: Preparing devices for restore.
>> Suspending console(s)
>> during resuming.
>>
>> a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
>> commit a504e64ab42bcc27074ea37405d06833ed6e0820
>> Author: Stephen Hemminger <[EMAIL PROTECTED]>
>> Date: Fri Feb 2 08:22:53 2007 -0800
>>
>>    skge: WOL support
>>
>>    Add WOL support for Yukon chipsets in skge device.
>>
>>    Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>>    Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>>
>> :04 04 d3d4335e6cba330b7880b4787fbe48733e69f8fc
>> 5845b004228d811de912a55da6a7843b72f23f81 M drivers
>>
>> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2
>
> Do you use skge as your network device?

Yes, I have a Marvell based onboard NIC.

02:05.0 Ethernet controller: 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev 12)
	Subsystem: ASUSTeK Computer Inc. A7V600/P4P800/K8V motherboard
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-

> Pavel

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: Linux 2.6.21-rc5
Hi!

> >There's various fixes here, ranging from some architecture updates (ia64,
> >ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
> >
>
> Suspend to disk doesn't work for me with this patch. It hangs after
> PM: Preparing devices for restore.
> Suspending console(s)
> during resuming.
>
> a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
> commit a504e64ab42bcc27074ea37405d06833ed6e0820
> Author: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Fri Feb 2 08:22:53 2007 -0800
>
>    skge: WOL support
>
>    Add WOL support for Yukon chipsets in skge device.
>
>    Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>    Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>
> :04 04 d3d4335e6cba330b7880b4787fbe48733e69f8fc
> 5845b004228d811de912a55da6a7843b72f23f81 M drivers
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2

Do you use skge as your network device?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux 2.6.21-rc5
Linus Torvalds wrote:
> There's various fixes here, ranging from some architecture updates (ia64,
> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>
> And random one-liners.
>

I found this in mm snapshot
http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
it's in mainline too.

Andi, any progress with this bug?

BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
SELinux: initialized (dev oprofilefs, type oprofilefs), uses genfs_contexts

=
[ INFO: inconsistent lock state ]
2.6.21-rc5-gd4590940-dirty #128
-
inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
firefox-bin/3542 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (oprofilefs_lock){+-..}, at: [] nmi_cpu_setup+0x15/0x4f [oprofile]
{hardirq-on-W} state was registered at:
 [] __lock_acquire+0x442/0xba1
 [] lock_acquire+0x68/0x82
 [] _spin_lock+0x35/0x42
 [] oprofilefs_ulong_from_user+0x4e/0x74 [oprofile]
 [] ulong_write_file+0x2a/0x38 [oprofile]
 [] vfs_write+0xaf/0x138
 [] sys_write+0x3d/0x61
 [] syscall_call+0x7/0xb
 [] 0x
irq event stamp: 50464270
hardirqs last enabled at (50464269): [] syscall_exit_work+0x11/0x26
hardirqs last disabled at (50464270): [] call_function_interrupt+0x29/0x38
softirqs last enabled at (50462522): [] __do_softirq+0xe4/0xea
softirqs last disabled at (50462515): [] do_softirq+0x64/0xd1

other info that might help us debug this:
no locks held by firefox-bin/3542.

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] print_usage_bug+0x140/0x14a
 [] mark_lock+0xa1/0x40b
 [] __lock_acquire+0x3b3/0xba1
 [] lock_acquire+0x68/0x82
 [] _spin_lock+0x35/0x42
 [] nmi_cpu_setup+0x15/0x4f [oprofile]
 [] smp_call_function_interrupt+0x3a/0x56
 [] call_function_interrupt+0x33/0x38
 ===

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-dmesg

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: ATA ACPI (was Re: Linux 2.6.21-rc5)
Linus Torvalds wrote:
> On Tue, 27 Mar 2007, Jeff Garzik wrote:
>> FWIW, I'm still leaning towards disabling libata ACPI support by default
>> for 2.6.21.
>
> Hey, I'm not going to argue against anything that says "disable ACPI".
>
> Of *course* it should be disabled if there aren't thousands of machines
> that are in user hands that actually need it (and none that regress).

It's required to access data at all (BIOS-supplied password [un]locks
disk), in a small minority of configurations.

It's strongly suggested for reliable suspend/resume, particularly on
laptops, where libata ACPI support fixes some suspend/resume problems.

Some BIOSen also want to apply drive+board-specific errata workarounds.
That's OK, but ideally we should know about those in the kernel.

"none that regress" is the problem though. Buggy tables, unexercised
ACPI code paths, and in a few cases unexpected post-ACPI
drive/controller behavior expose regressions.

> Anybody want to send me a patch?

Since everybody is OK with my plan, I'll send one today along with the
rest of the post-vacation 2.6.21-rc bug fixes.

	Jeff
Re: Linux 2.6.21-rc5
Hi,

On 26/03/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> There's various fixes here, ranging from some architecture updates (ia64,
> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.

Suspend to disk doesn't work for me with this patch. It hangs after
PM: Preparing devices for restore.
Suspending console(s)
during resuming.

a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
commit a504e64ab42bcc27074ea37405d06833ed6e0820
Author: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri Feb 2 08:22:53 2007 -0800

    skge: WOL support

    Add WOL support for Yukon chipsets in skge device.

    Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
    Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

:04 04 d3d4335e6cba330b7880b4787fbe48733e69f8fc
5845b004228d811de912a55da6a7843b72f23f81 M drivers

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: ATA ACPI (was Re: Linux 2.6.21-rc5)
On Tue, 27 Mar 2007, Jeff Garzik wrote:
>
> FWIW, I'm still leaning towards disabling libata ACPI support by default for
> 2.6.21.

Hey, I'm not going to argue against anything that says "disable ACPI".

Of *course* it should be disabled if there aren't thousands of machines
that are in user hands that actually need it (and none that regress).

Anybody want to send me a patch?

		Linus
Re: Linux 2.6.21-rc5
On Monday, March 26, 2007 11:20 pm Greg KH wrote:
> Already in Linus's tree.
>
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch
>
> I'd prefer to wait until 2.6.22 for this one, I've had too many odd
> reports of problems in this area, and since no one has reported this
> issue, it's not a real rush at all.

Yeah, I don't think this one is critical. These files aren't in heavy
use yet, so fixing this in 2.6.22 should be ok. I've only heard one
complaint about this bug so far, and that was caused by some code still
in development.

Jesse
Re: Linux 2.6.21-rc5
On Tue, 27 Mar 2007 14:25:07 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Tuesday 27 March 2007 08:17, Andrew Morton wrote:
> >
> > I have a few fixes here which belong to subsystem trees, which were missed
> > by the maintainers and which we probably want to get into 2.6.21.
> >
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch
>
> This is already fixed in a different way.
>

oh yeah.  Did that fix make it into 2.6.20.x?

I think we decided that make-aout-executables-work-again.patch might still
be a desirable thing to have, but I don't recall the reasoning for that.

Anyway, if it doesn't fix a bug it is nowhere near a high-priority patch
for that seething bugfest which we like to call a kernel, so I'll drop it.
Re: Linux 2.6.21-rc5
On 3/27/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sudden-warps-in-mousedev.patch

A slightly different fix is already in the input tree. I'd prefer waiting
for 2.6.22, as this only affects touchpads when not using the synaptics X
driver (which most distributions use by default).

--
Dmitry
Re: Linux 2.6.21-rc5
On Tuesday 27 March 2007 08:17, Andrew Morton wrote:
>
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
>
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch

This is already fixed in a different way.

-Andi
Re: Linux 2.6.21-rc5
At Mon, 26 Mar 2007 22:17:31 -0800,
Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/revert-ac97-fix-microphone-and-line_in-selection-logic.patch

The better fix is already in rc5, so please drop this one from your tree.

c26a8de23a4417f556250c4c099b048b26c430be
[ALSA] ac97 - fix AD shared shared jack control logic

thanks,

Takashi
Re: Linux 2.6.21-rc5
On Mon, Mar 26, 2007 at 10:17:31PM -0800, Andrew Morton wrote:
>
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
>
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/pci-set-pci=bfsort-for-poweredge-r900.patch

Already in Linus's tree.

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch

I'd prefer to wait until 2.6.22 for this one, I've had too many odd
reports of problems in this area, and since no one has reported this
issue, it's not a real rush at all.

thanks,

greg k-h
Re: Linux 2.6.21-rc5
I have a few fixes here which belong to subsystem trees, which were missed by the maintainers and which we probably want to get into 2.6.21. ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/revert-ac97-fix-microphone-and-line_in-selection-logic.patch ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/drivers-mfd-sm501c-fix-an-off-by-one.patch ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/pci-set-pci=bfsort-for-poweredge-r900.patch ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sudden-warps-in-mousedev.patch ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch Maintainers are cc'ed. Please promptly ack, nack or otherwise quack, else I'll be making my own decisions ;)
Re: ATA ACPI (was Re: Linux 2.6.21-rc5)
Jeff Garzik wrote: > Linus Torvalds wrote: >> There's various fixes here, ranging from some architecture updates (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers. And random one-liners. >> >> But probably more important, and likely much more visible to most people is the fixes for the fallout from the hrtimers and no-HZ changes, and some of the ACPI regressions. >> >> Those timer changes ended up much more painful than anybody wished for, but big thanks to Thomas Gleixner for being on it like a weasel on a dead rat, and the regression list has kept shrinking. >> >> So if you have reported a regression in the 2.6.21-rc series, please check 2.6.21-rc5, and update your report as appropriate (whether fixed or "still problems with xyzzy"). >> >> Linus > > [just got back from vacation, or would have sent this earlier] > > FWIW, I'm still leaning towards disabling libata ACPI support by default for 2.6.21. > > Upstream has Alan's fix for the worst PATA problems, but for different reasons, I think PATA ACPI and SATA ACPI support in libata does not feel quite ready for prime time in 2.6.21. > > Scream now, or hold your peace until 2.6.22... :) I second disabling ACPI for 2.6.21. -- tejun
ATA ACPI (was Re: Linux 2.6.21-rc5)
Linus Torvalds wrote: > There's various fixes here, ranging from some architecture updates (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers. And random one-liners. > > But probably more important, and likely much more visible to most people is the fixes for the fallout from the hrtimers and no-HZ changes, and some of the ACPI regressions. > > Those timer changes ended up much more painful than anybody wished for, but big thanks to Thomas Gleixner for being on it like a weasel on a dead rat, and the regression list has kept shrinking. > > So if you have reported a regression in the 2.6.21-rc series, please check 2.6.21-rc5, and update your report as appropriate (whether fixed or "still problems with xyzzy"). > > Linus [just got back from vacation, or would have sent this earlier] FWIW, I'm still leaning towards disabling libata ACPI support by default for 2.6.21. Upstream has Alan's fix for the worst PATA problems, but for different reasons, I think PATA ACPI and SATA ACPI support in libata does not feel quite ready for prime time in 2.6.21. Scream now, or hold your peace until 2.6.22... :) Jeff
Re: Linux 2.6.21-rc5
This issue might be resolved with the patch provided in the following bug report: http://bugzilla.kernel.org/show_bug.cgi?id=8058 Please try out the patch in the bug report without your patch and see if the issue reproduces. Ayaz Ingo Molnar wrote: > here's a new v2.6.20 -> v2.6.21 forcedeth.c regression: in the last week or so i've been seeing sporadic under-load forcedeth.c crashes [...]
Re: Linux 2.6.21-rc5
On Mon, 2007-03-26 at 07:25 -0500, Bob Tracy wrote: > Thomas Gleixner wrote: > > This fix from John Stultz is still missing: > > > > http://lkml.org/lkml/2007/3/22/287 > > > > It's in Andrews queue already and waits to be sent to you. > > In summary, that fix is a workaround to allow the acpi_pm clocksource > to be selected instead of the pit clocksource, thereby allowing my > Dell laptop with the PIIX4 bug to boot. Other apic, clocksource, etc. > patches that were included in -rc5 fixed the problem that caused the > boot process to hang when the pit clocksource was selected, as I > suspected would be the case :-). Ah. Ok > Per John's message in the above URL, while the fix is no longer needed > for allowing the laptop to boot, it's probably still "a good thing" to > allow a better clocksource to be selected. Yes. The read three times pmtimer is faster and more reliable than the PIT. tglx
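The "read three times" pmtimer variant mentioned above can be sketched in user-space C. This is only a minimal illustration, assuming the workaround means repeating a triple read until the three samples look consistent. The names `read_verified`, `struct replay` and `replay_read` are hypothetical and exist only for this sketch; the replayed values stand in for a real hardware counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook for one raw, possibly unreliable, counter read. */
typedef uint32_t (*raw_read_fn)(void *ctx);

/* Hypothetical test source: replays a fixed list of samples in order. */
struct replay {
    const uint32_t *vals;
    size_t n;
    size_t pos;
};

static uint32_t replay_read(void *ctx)
{
    struct replay *r = ctx;
    return r->vals[r->pos++ % r->n];
}

/*
 * Read the counter three times and retry while the samples are out of
 * order in a way that a single counter wrap cannot explain. Accepted
 * orders are v1 <= v2 <= v3 and its rotations; the middle sample is
 * returned.
 */
static uint32_t read_verified(raw_read_fn raw, void *ctx)
{
    uint32_t v1, v2, v3;

    assert(raw != NULL);
    do {
        v1 = raw(ctx);
        v2 = raw(ctx);
        v3 = raw(ctx);
    } while ((v1 > v2 && v1 < v3) ||
             (v2 > v3 && v2 < v1) ||
             (v3 > v1 && v3 < v2));

    return v2;
}
```

With the replayed samples 10, 5000, 12, 13, 14, 15 the first triple is rejected because of the 5000 glitch, and the second triple is accepted. The cost is at least three reads per call, which is why a plain single read is preferable on hardware that does not need the workaround.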
Re: Linux 2.6.21-rc5
Thomas Gleixner wrote: > This fix from John Stultz is still missing: > > http://lkml.org/lkml/2007/3/22/287 > > It's in Andrews queue already and waits to be sent to you. In summary, that fix is a workaround to allow the acpi_pm clocksource to be selected instead of the pit clocksource, thereby allowing my Dell laptop with the PIIX4 bug to boot. Other apic, clocksource, etc. patches that were included in -rc5 fixed the problem that caused the boot process to hang when the pit clocksource was selected, as I suspected would be the case :-). Per John's message in the above URL, while the fix is no longer needed for allowing the laptop to boot, it's probably still "a good thing" to allow a better clocksource to be selected. -- --- Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org [EMAIL PROTECTED] ---
Re: Linux 2.6.21-rc5
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > There's various fixes here, ranging from some architecture updates > (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers. here's a new v2.6.20 -> v2.6.21 forcedeth.c regression: in the last week or so i've been seeing sporadic under-load forcedeth.c crashes (see the full oops further below): eth1: too many iterations (6) in nv_nic_irq. Unable to handle kernel NULL pointer dereference at 0088 RIP: [] nv_tx_done+0xf4/0x1cf this is line 1906 of drivers/net/forcedeth.c: np->stats.tx_bytes += np->get_tx_ctx->skb->len; struct sk_buff's len field is at offset 88, so np->get_tx_ctx->skb is NULL. That is an 'impossible' scenario for tx descriptors here - the tx ring descriptors are always set up with a valid skb (and a valid dma address), and their completion is serialized via np->lock. these crashes are almost instant on the .21-rc5-rt kernel, but extremely sporadic on the upstream kernel and needed very high networking loads to trigger. Today i found a good way to trigger it almost instantly on upstream kernels too: apply the debug patch attached further below and do: echo 100 > /proc/sys/kernel/panic that will inject 100 artificial 'too many iterations' failures and provokes a TX timeout - which TX timeout will crash. (i've used a dual-core Athlon64 system in this test) my first quick guess was to extend np->priv locking to the whole of nv_start_xmit/nv_start_xmit_optimized - while that appeared to make the crash a bit less likely, it did not prevent it. So there must be some other, more fundamental problem be left as well. At first glance the SMP locking looks OK, so maybe the ring indices are messed up somehow and we got into a 'ring head bites the tail' scenario? i can provide more info if needed. Ingo --> eth1: too many iterations (6) in nv_nic_irq. 
Unable to handle kernel NULL pointer dereference at 0088 RIP: [] nv_tx_done+0xf4/0x1cf PGD 34d03067 PUD 34d02067 PMD 0 Oops: [1] PREEMPT SMP CPU 1 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.21-rc5 #8 RIP: 0010:[] [] nv_tx_done+0xf4/0x1cf RSP: 0018:81003ff6be40 EFLAGS: 00010206 RAX: RBX: 810002e26700 RCX: 0001 RDX: 0042 RSI: 3ef00cbe RDI: 81003fbeb070 RBP: 81003ff6be60 R08: 810002e26a00 R09: 0003 R10: 81003ff4e100 R11: 810001e283f8 R12: 3ef00cbe R13: 810002e26000 R14: 810002e28fc0 R15: FS: 2b6cb57f1db0() GS:81003ff4ad40() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 0088 CR3: 34c87000 CR4: 06e0 Process swapper (pid: 0, threadinfo 81003ff64000, task 81003ff4e100) Stack: 810002e26700 0032 c201a000 810002e26000 81003ff6bea0 80406dae 810002e26700 810002e26700 810002e26000 00ff c201a000 80749080 Call Trace: [] nv_nic_irq+0x76/0x261 [] nv_do_nic_poll+0x200/0x284 [] nv_do_nic_poll+0x0/0x284 [] run_timer_softirq+0x167/0x1dd [] __do_softirq+0x5b/0xc9 [] call_softirq+0x1c/0x28 [] do_softirq+0x31/0x84 [] irq_exit+0x3f/0x50 [] smp_apic_timer_interrupt+0x49/0x5b [] default_idle+0x0/0x44 [] apic_timer_interrupt+0x66/0x70 [] default_idle+0x2f/0x44 [] enter_idle+0x22/0x24 [] cpu_idle+0x91/0xd4 [] start_secondary+0x2e3/0x2f5 --- drivers/net/forcedeth.c | 20 1 file changed, 20 insertions(+) Index: linux/drivers/net/forcedeth.c === --- linux.orig/drivers/net/forcedeth.c +++ linux/drivers/net/forcedeth.c @@ -2908,6 +2908,10 @@ static irqreturn_t nv_nic_irq(int foo, v spin_unlock(&np->lock); break; } + if (panic_timeout > 0) { + panic_timeout--; + i = max_interrupt_work+1; + } if (unlikely(i > max_interrupt_work)) { spin_lock(&np->lock); /* disable interrupts on the nic */ @@ -3026,6 +3030,10 @@ static irqreturn_t nv_nic_irq_optimized( break; } + if (panic_timeout > 0) { + panic_timeout--; + i = max_interrupt_work+1; + } if (unlikely(i > max_interrupt_work)) { spin_lock(&np->lock); /* disable interrupts on the nic */ @@ -3076,6 +3084,10 @@ static irqreturn_t 
nv_nic_irq_tx(int foo dprintk(KERN_DEBUG "%s: received irq with events 0x%x. Probably TX fail.\n", dev->name, events); } + if (panic_timeout > 0) { + panic_timeo
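The reasoning in the report, that a fault address of 0088 together with a `len` field at that offset implies a NULL `skb`, can be illustrated as follows. This is a sketch only: `struct fake_skb` is a hypothetical stand-in whose padding is chosen to put `len` at 0x88, assuming the address in the oops is hexadecimal. The real `struct sk_buff` layout depends on kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the offset of len matters. */
struct fake_skb {
    unsigned char pad[0x88];   /* everything that precedes len */
    unsigned int len;
};

/* Compile-time check that the stand-in really places len at 0x88. */
static_assert(offsetof(struct fake_skb, len) == 0x88,
              "len must sit at offset 0x88 in this sketch");

/*
 * Address the CPU touches when evaluating base->len: the base pointer
 * plus the field offset. No memory is dereferenced here.
 */
static uintptr_t len_field_address(uintptr_t base)
{
    return base + offsetof(struct fake_skb, len);
}
```

With a base of 0 the computed address is 0x88, which matches the faulting address in the oops. Any other small fault address would point at a different field of the same NULL structure.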
Re: Linux 2.6.21-rc5
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > my first quick guess was to extend np->priv locking to the whole of > nv_start_xmit/nv_start_xmit_optimized - while that appeared to make > the crash a bit less likely, it did not prevent it. So there must be > some other, more fundamental problem be left as well. At first glance > the SMP locking looks OK, so maybe the ring indices are messed up > somehow and we got into a 'ring head bites the tail' scenario? to be specific, the patch below is what i tried - but it didnt completely fix the crash. Ingo --- drivers/net/forcedeth.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) Index: linux/drivers/net/forcedeth.c === --- linux.orig/drivers/net/forcedeth.c +++ linux/drivers/net/forcedeth.c @@ -1650,9 +1650,10 @@ static int nv_start_xmit(struct sk_buff ((skb_shinfo(skb)->frags[i].size & (NV_TX2_TSO_MAX_SIZE-1)) ? 1 : 0); } + spin_lock_irq(&np->lock); + empty_slots = nv_get_empty_tx_slots(np); if (unlikely(empty_slots <= entries)) { - spin_lock_irq(&np->lock); netif_stop_queue(dev); np->tx_stop = 1; spin_unlock_irq(&np->lock); @@ -1718,8 +1719,6 @@ static int nv_start_xmit(struct sk_buff tx_flags_extra = skb->ip_summed == CHECKSUM_PARTIAL ? NV_TX2_CHECKSUM_L3 | NV_TX2_CHECKSUM_L4 : 0; - spin_lock_irq(&np->lock); - /* set tx flags */ start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra); np->put_tx.orig = put_tx; @@ -1766,9 +1765,10 @@ static int nv_start_xmit_optimized(struc ((skb_shinfo(skb)->frags[i].size & (NV_TX2_TSO_MAX_SIZE-1)) ? 
1 : 0); } + spin_lock_irq(&np->lock); + empty_slots = nv_get_empty_tx_slots(np); if (unlikely(empty_slots <= entries)) { - spin_lock_irq(&np->lock); netif_stop_queue(dev); np->tx_stop = 1; spin_unlock_irq(&np->lock); @@ -1846,8 +1846,6 @@ static int nv_start_xmit_optimized(struc start_tx->txvlan = 0; } - spin_lock_irq(&np->lock); - /* set tx flags */ start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra); np->put_tx.ex = put_tx; @@ -3484,6 +3482,7 @@ static void nv_do_nic_poll(unsigned long struct net_device *dev = (struct net_device *) data; struct fe_priv *np = netdev_priv(dev); u8 __iomem *base = get_hwbase(dev); + unsigned long flags; u32 mask = 0; /* @@ -3519,7 +3518,7 @@ static void nv_do_nic_poll(unsigned long printk(KERN_INFO "forcedeth: MAC in recoverable error state\n"); if (netif_running(dev)) { netif_tx_lock_bh(dev); - spin_lock(&np->lock); + spin_lock_irqsave(&np->lock, flags); /* stop engines */ nv_stop_rx(dev); nv_stop_tx(dev); @@ -3545,7 +3544,7 @@ static void nv_do_nic_poll(unsigned long /* restart rx engine */ nv_start_rx(dev); nv_start_tx(dev); - spin_unlock(&np->lock); + spin_unlock_irqrestore(&np->lock, flags); netif_tx_unlock_bh(dev); } }
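The idea behind the attempted patch, taking the lock before the free-slot check so that the check and the reservation form one critical section, can be sketched in user-space C. This is not the forcedeth code: `struct tx_ring`, `ring_reserve` and `ring_complete` are hypothetical names, a pthread mutex stands in for `np->lock`, and the ring size is an arbitrary assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define RING_SIZE 8u   /* assumed size; a power of two so counters may wrap */

/* Hypothetical TX ring with free-running producer/consumer counters. */
struct tx_ring {
    pthread_mutex_t lock;
    unsigned int put;     /* slots handed to the hardware so far */
    unsigned int get;     /* slots completed so far */
    bool stopped;         /* queue stopped because the ring was full */
};

/* Caller must hold r->lock (or be single-threaded). */
static unsigned int ring_free_slots(const struct tx_ring *r)
{
    return RING_SIZE - (r->put - r->get);
}

/*
 * Check for room and reserve the slots under one lock hold, so a
 * concurrent completion or second sender cannot change the ring
 * between the check and the update. Mirrors the "<= entries" test
 * from the patch: one slot is always left unused.
 */
static bool ring_reserve(struct tx_ring *r, unsigned int entries)
{
    bool ok;

    pthread_mutex_lock(&r->lock);
    if (ring_free_slots(r) <= entries) {
        r->stopped = true;
        ok = false;
    } else {
        r->put += entries;
        ok = true;
    }
    pthread_mutex_unlock(&r->lock);
    return ok;
}

/* Retire completed slots and restart the queue, again under the lock. */
static void ring_complete(struct tx_ring *r, unsigned int entries)
{
    pthread_mutex_lock(&r->lock);
    assert(entries <= r->put - r->get);
    r->get += entries;
    r->stopped = false;
    pthread_mutex_unlock(&r->lock);
}
```

As the message notes, widening the locked region did not completely fix the crash, so a sketch like this only shows what the patch was trying to rule out, not the root cause.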
Re: Linux 2.6.21-rc5
On Sun, 2007-03-25 at 16:08 -0700, Linus Torvalds wrote: > Those timer changes ended up much more painful than anybody wished for, > but big thanks to Thomas Gleixner for being on it like a weasel on a dead > rat, and the regression list has kept shrinking. Why certainly ! I caused them, so I have to fix them. There are still a few to hunt down and I want to sort them out before 2.6.21 final. > So if you have reported a regression in the 2.6.21-rc series, please check > 2.6.21-rc5, and update your report as appropriate (whether fixed or "still > problems with xyzzy"). This fix from John Stultz is still missing: http://lkml.org/lkml/2007/3/22/287 It's in Andrews queue already and waits to be sent to you. tglx