Re: [BUG?] APM is hidden in menuconfig
On Thu, Feb 21, 2008 at 07:43:46AM +, Jarek Poplawski wrote: ... > ...Or at least to mention APM in SUSPEND title and description. > Actually, this is really strange: both SUSPEND and PM_SLEEP have > default = y. So it seems they are intended to be more "advertised" > than they are? Now I see it could be even stranger: it *is* visible by default(!) after make defconfig. make menuconfig does something else: it seems to "help me" and sees some other config (current kernel is in other directory) even after deleting .config and make mrproper. So, maybe this started because of this and the problem is with changing this back. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG?] APM is hidden in menuconfig
On Thu, Feb 21, 2008 at 07:43:46AM +, Jarek Poplawski wrote: ... ...Or at least to mention APM in SUSPEND title and description. Actually, this is really strange: both SUSPEND and PM_SLEEP have default = y. So it seems they are intended to be more advertised than they are? Now I see it could be even stranger: it *is* visible by default(!) after make defconfig. make menuconfig does something else: it seems to help me and sees some other config (current kernel is in other directory) even after deleting .config and make mrproper. So, maybe this started because of this and the problem is with changing this back. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG?] APM is hidden in menuconfig
On Thu, Feb 21, 2008 at 02:21:52AM +0100, Rafael J. Wysocki wrote: > On Thursday, 21 of February 2008, Frans Pop wrote: > > Rafael J. Wysocki wrote: > > > On Wednesday, 20 of February 2008, Jarek Poplawski wrote: > > >> So, has it to be so hard? It seems not - at least in good old times... > > > > > > Something in APM uses some code from drivers/base/power/main.c that > > > depends on PM_SLEEP. > > > > Sure, but that does not make Jarek's point invalid. From a user PoV APM is a > > high-level configuration option. That is what he wants. > > It should thus be easily accessible and not be buried beneath a lot of > > other, much more technical, options. Probably Frans is right: this should be my point... But I'm not so "greedy", and I would be happy if it were at least more visible. It simply seems to me quite not obvious to even think about turning SUSPEND on when I have problems with a basic acpi function. Even more interesting question is why this APM or PM_SLEEP dependency on SUSPEND (or HIBERNATION) isn't visible with "/" searching: PM_SLEEP looks like some "hidden" option - that's why I tried first to find some comment in arch/ instead of simply reading Kconfig. > > Could this maybe be solved by making APM automatically 'select' some options > > instead of 'depending' on them? > > That, unfortunately, doesn't work. > > IMO the solution might be to separate the APM suspend code from the rest of > the > APM code and make it depend on (PM_SUSPEND && APM). ...Or at least to mention APM in SUSPEND title and description. Actually, this is really strange: both SUSPEND and PM_SLEEP have default = y. So it seems they are intended to be more "advertised" than they are? Thanks & regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG?] APM is hidden in menuconfig
Hi, I needed APM to have poweroff on old box. So, in 2.6.24.2 menuconfig: 1) Power management options --> No APM. 2) [*] Power Management support No APM. I can see ACPI... 3) I try searching with "/" + "APM" APM [=n] Depends on: !X86_VOYAGER && X86_32 && PM_SLEEP && !X86_VISWS 4) I check above: all correct except: Symbol: PM_SLEEP [=n] No more "depends on". 5) Exit menuconfig - enter grep: 6) Some time wasting in arch/. 7) less kernel/power/Kconfig: config PM_SLEEP bool depends on SUSPEND || HIBERNATION default y ... config SUSPEND bool "Suspend to RAM and standby" depends on PM depends on SUSPEND_UP_POSSIBLE || SUSPEND_SMP_POSSIBLE default y ---help--- Allow the system to enter sleep states in which main memory is powered and thus its contents are preserved, such as the suspend-to-RAM state (i.e. the ACPI S3 state). 8) Back to menuconfig: [*] Suspend to RAM and standby Bingo! APM revealed! (And not even "DEPRECATED"!) So, has it to be so hard? It seems not - at least in good old times... Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG?] APM is hidden in menuconfig
Hi, I needed APM to have poweroff on old box. So, in 2.6.24.2 menuconfig: 1) Power management options -- No APM. 2) [*] Power Management support No APM. I can see ACPI... 3) I try searching with / + APM APM [=n] Depends on: !X86_VOYAGER X86_32 PM_SLEEP !X86_VISWS 4) I check above: all correct except: Symbol: PM_SLEEP [=n] No more depends on. 5) Exit menuconfig - enter grep: 6) Some time wasting in arch/. 7) less kernel/power/Kconfig: config PM_SLEEP bool depends on SUSPEND || HIBERNATION default y ... config SUSPEND bool Suspend to RAM and standby depends on PM depends on SUSPEND_UP_POSSIBLE || SUSPEND_SMP_POSSIBLE default y ---help--- Allow the system to enter sleep states in which main memory is powered and thus its contents are preserved, such as the suspend-to-RAM state (i.e. the ACPI S3 state). 8) Back to menuconfig: [*] Suspend to RAM and standby Bingo! APM revealed! (And not even DEPRECATED!) So, has it to be so hard? It seems not - at least in good old times... Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG?] APM is hidden in menuconfig
On Thu, Feb 21, 2008 at 02:21:52AM +0100, Rafael J. Wysocki wrote: On Thursday, 21 of February 2008, Frans Pop wrote: Rafael J. Wysocki wrote: On Wednesday, 20 of February 2008, Jarek Poplawski wrote: So, has it to be so hard? It seems not - at least in good old times... Something in APM uses some code from drivers/base/power/main.c that depends on PM_SLEEP. Sure, but that does not make Jarek's point invalid. From a user PoV APM is a high-level configuration option. That is what he wants. It should thus be easily accessible and not be buried beneath a lot of other, much more technical, options. Probably Frans is right: this should be my point... But I'm not so greedy, and I would be happy if it were at least more visible. It simply seems to me quite not obvious to even think about turning SUSPEND on when I have problems with a basic acpi function. Even more interesting question is why this APM or PM_SLEEP dependency on SUSPEND (or HIBERNATION) isn't visible with / searching: PM_SLEEP looks like some hidden option - that's why I tried first to find some comment in arch/ instead of simply reading Kconfig. Could this maybe be solved by making APM automatically 'select' some options instead of 'depending' on them? That, unfortunately, doesn't work. IMO the solution might be to separate the APM suspend code from the rest of the APM code and make it depend on (PM_SUSPEND APM). ...Or at least to mention APM in SUSPEND title and description. Actually, this is really strange: both SUSPEND and PM_SLEEP have default = y. So it seems they are intended to be more advertised than they are? Thanks regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.25 patch] fix broken error handling in ieee80211_sta_process_addba_request()
On 19-02-2008 23:58, Adrian Bunk wrote: ... > --- a/net/mac80211/ieee80211_sta.c > +++ b/net/mac80211/ieee80211_sta.c > @@ -1116,9 +1116,10 @@ static void ieee80211_sta_process_addba_request(struct > net_device *dev, ... > + printk(KERN_ERR "can not allocate reordering buffer " + printk(KERN_ERR "cannot allocate reordering buffer " Probably this can be fixed during the commit. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.25 patch] fix broken error handling in ieee80211_sta_process_addba_request()
On 19-02-2008 23:58, Adrian Bunk wrote: ... --- a/net/mac80211/ieee80211_sta.c +++ b/net/mac80211/ieee80211_sta.c @@ -1116,9 +1116,10 @@ static void ieee80211_sta_process_addba_request(struct net_device *dev, ... + printk(KERN_ERR can not allocate reordering buffer + printk(KERN_ERR cannot allocate reordering buffer Probably this can be fixed during the commit. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies
On Mon, Feb 18, 2008 at 02:45:56AM +0300, Oleg Nesterov wrote: > On 02/17, Jarek Poplawski wrote: ... > > 1) ... workqueue_cpu_callback(...) ... > Yes, but this is harmless. cpu-hotplug callbacks are not time-critical, > and cpu_down/cpu_up happens not often, and LIST_HEAD(workqueues) is not > very long, so ... > > > 2) ... __create_workqueue_key(...) ... > > Shouldn't this list_add() be done after all these inits below? > This doesn't matter. Please note that get_online_cpus() blocks > cpu_up/cpu_down, they take cpu_hotplug_begin(). You are completely right. It looks like this was only about ..."look". (But adding only "default:" in 1) would make it look nicer to me...) Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies
Hi Oleg, This patch looks OK to me. But while reading this I got some doubts in nearby places, so BTW 2 small questions: 1) ... workqueue_cpu_callback(...) { ... list_for_each_entry(wq, , list) { cwq = per_cpu_ptr(wq->cpu_wq, cpu); switch (action) { case CPU_UP_PREPARE: ... It looks like not all CPU_ cases are served here: shouldn't list_for_each_entry() be omitted for them? 2) ... __create_workqueue_key(...) { ... if (singlethread) { ... } else { get_online_cpus(); spin_lock(_lock); list_add(>list, ); Shouldn't this list_add() be done after all these inits below? spin_unlock(_lock); for_each_possible_cpu(cpu) { cwq = init_cpu_workqueue(wq, cpu); ... } ... Thanks, Jarek P. On Sat, Feb 16, 2008 at 08:22:59PM +0300, Oleg Nesterov wrote: > When cpu_populated_map was introduced, it was supposed that cwq->thread can > survive after CPU_DEAD, that is why we never shrink cpu_populated_map. > > This is not very nice, we can safely remove the already dead CPU from the map. > The only required change is that destroy_workqueue() must hold the hotplug > lock > until it destroys all cwq->thread's, to protect the cpu_populated_map. We > could > make the local copy of cpu mask and drop the lock, but sizeof(cpumask_t) may > be > very large. > > Also, fix the comment near queue_work(). Unless _cpu_down() happens we do > guarantee the cpu-affinity of the work_struct, and we have users which rely on > this. > > Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies
On Mon, Feb 18, 2008 at 02:45:56AM +0300, Oleg Nesterov wrote: On 02/17, Jarek Poplawski wrote: ... 1) ... workqueue_cpu_callback(...) ... Yes, but this is harmless. cpu-hotplug callbacks are not time-critical, and cpu_down/cpu_up happens not often, and LIST_HEAD(workqueues) is not very long, so ... 2) ... __create_workqueue_key(...) ... Shouldn't this list_add() be done after all these inits below? This doesn't matter. Please note that get_online_cpus() blocks cpu_up/cpu_down, they take cpu_hotplug_begin(). You are completely right. It looks like this was only about ...look. (But adding only default: in 1) would make it look nicer to me...) Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: BUG/ spinlock lockup, 2.6.24
Jarek Poplawski wrote, On 02/15/2008 10:03 PM: ... > ...On the other hand this: > >> Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, >> ksoftirqd/1/7, f0551180 > > seems to point just at spinlock lockup, so it's more about the full report. > I wonder if this patch to prink could help here: > > authorIngo Molnar > Fri, 25 Jan 2008 20:07:58 + (21:07 +0100) > printk: make printk more robust by not allowing recursion > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e ...or maybe a patch like this attached here? Jarek P. diff --git a/lib/spinlock_debug.c b/lib/spinlock_debug.c index 9c4b025..21c8aaa 100644 --- a/lib/spinlock_debug.c +++ b/lib/spinlock_debug.c @@ -111,8 +111,7 @@ static void __spin_lock_debug(spinlock_t *lock) __delay(1); } /* lockup suspected: */ - if (print_once) { - print_once = 0; + if (print_once == 1) { printk(KERN_EMERG "BUG: spinlock lockup on CPU#%d, " "%s/%d, %p\n", raw_smp_processor_id(), current->comm, @@ -122,7 +121,14 @@ static void __spin_lock_debug(spinlock_t *lock) trigger_all_cpu_backtrace(); #endif } + if (print_once++ > 1000) + goto out; } + return; +out: + panic("spinlock lockup panic #%llu\n", i); + // or: + // BUG(); } void _raw_spin_lock(spinlock_t *lock)
Re: BUG/ spinlock lockup, 2.6.24
Jarek Poplawski wrote, On 02/15/2008 09:21 PM: > Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM: > ... > >> I have similar crashes on completely different hardware with same job (QOS), >> so i think it is actually some nasty bug in networking. > > Maybe you could try with some other debugging options? E.g. since lockdep > doesn't help - turn this off. Instead try some others, like these: ...On the other hand this: > Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, > ksoftirqd/1/7, f0551180 seems to point just at spinlock lockup, so it's more about the full report. I wonder if this patch to prink could help here: author Ingo Molnar Fri, 25 Jan 2008 20:07:58 + (21:07 +0100) printk: make printk more robust by not allowing recursion http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: BUG/ spinlock lockup, 2.6.24
Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM: ... > I have similar crashes on completely different hardware with same job (QOS), > so i think it is actually some nasty bug in networking. Maybe you could try with some other debugging options? E.g. since lockdep doesn't help - turn this off. Instead try some others, like these: > # CONFIG_DEBUG_SPINLOCK_SLEEP is not set > # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set > # CONFIG_DEBUG_KOBJECT is not set > # CONFIG_DEBUG_HIGHMEM is not set > # CONFIG_DEBUG_VM is not set > # CONFIG_DEBUG_LIST is not set > # CONFIG_DEBUG_SG is not set > # CONFIG_BOOT_PRINTK_DELAY is not set > # CONFIG_DEBUG_STACKOVERFLOW is not set > # CONFIG_DEBUG_STACK_USAGE is not set > # CONFIG_DEBUG_RODATA is not set Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: BUG/ spinlock lockup, 2.6.24
Jarek Poplawski wrote, On 02/15/2008 09:21 PM: Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM: ... I have similar crashes on completely different hardware with same job (QOS), so i think it is actually some nasty bug in networking. Maybe you could try with some other debugging options? E.g. since lockdep doesn't help - turn this off. Instead try some others, like these: ...On the other hand this: Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, ksoftirqd/1/7, f0551180 seems to point just at spinlock lockup, so it's more about the full report. I wonder if this patch to prink could help here: author Ingo Molnar mingo at elte.hu Fri, 25 Jan 2008 20:07:58 + (21:07 +0100) printk: make printk more robust by not allowing recursion http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: BUG/ spinlock lockup, 2.6.24
Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM: ... I have similar crashes on completely different hardware with same job (QOS), so i think it is actually some nasty bug in networking. Maybe you could try with some other debugging options? E.g. since lockdep doesn't help - turn this off. Instead try some others, like these: # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_HIGHMEM is not set # CONFIG_DEBUG_VM is not set # CONFIG_DEBUG_LIST is not set # CONFIG_DEBUG_SG is not set # CONFIG_BOOT_PRINTK_DELAY is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUG_RODATA is not set Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: BUG/ spinlock lockup, 2.6.24
Jarek Poplawski wrote, On 02/15/2008 10:03 PM: ... ...On the other hand this: Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, ksoftirqd/1/7, f0551180 seems to point just at spinlock lockup, so it's more about the full report. I wonder if this patch to prink could help here: authorIngo Molnar mingo at elte.hu Fri, 25 Jan 2008 20:07:58 + (21:07 +0100) printk: make printk more robust by not allowing recursion http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e ...or maybe a patch like this attached here? Jarek P. diff --git a/lib/spinlock_debug.c b/lib/spinlock_debug.c index 9c4b025..21c8aaa 100644 --- a/lib/spinlock_debug.c +++ b/lib/spinlock_debug.c @@ -111,8 +111,7 @@ static void __spin_lock_debug(spinlock_t *lock) __delay(1); } /* lockup suspected: */ - if (print_once) { - print_once = 0; + if (print_once == 1) { printk(KERN_EMERG BUG: spinlock lockup on CPU#%d, %s/%d, %p\n, raw_smp_processor_id(), current-comm, @@ -122,7 +121,14 @@ static void __spin_lock_debug(spinlock_t *lock) trigger_all_cpu_backtrace(); #endif } + if (print_once++ 1000) + goto out; } + return; +out: + panic(spinlock lockup panic #%llu\n, i); + // or: + // BUG(); } void _raw_spin_lock(spinlock_t *lock)
Re: [PATCH] fib_trie: rcu_assign_pointer warning fix
On Tue, Feb 12, 2008 at 08:32:18PM +0100, Jarek Poplawski wrote: ... > It seems the above version of this macro uses the barrier for 0, but > if I miss something, or for these other: documenting reasons, ...or __builtin_constants could be used for indexing (?!), > then of > course you are right. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fib_trie: rcu_assign_pointer warning fix
On Tue, Feb 12, 2008 at 08:07:29AM -0800, Paul E. McKenney wrote: ... > "All programmers are blind, especially me." Hmm... I got it my way: you - superheroes - sometimes seem to be just like us - common people... (Probably early in the morning, before dressing your funny costumes?) > You are right, Jarek. I ran this through gcc, and the following > comes close: > > #define rcu_assign_pointer(p, v) \ > ({ \ > if (!__builtin_constant_p(v) || (v)) \ > smp_wmb(); \ > (p) = (v); \ > }) > > But I am concerned about the following case: > > rcu_assign_pointer(global_index, 0); > > . . . > > x = global_array[rcu_dereference(global_index)]; > > Since arrays have a zero-th element, we would really want a memory > barrier in this case. It seems the above version of this macro uses the barrier for 0, but if I miss something, or for these other: documenting reasons, then of course you are right. > > So how about leaving the index-unfriendly version of rcu_assign_pointer() > and adding an rcu_assign_index() as follows? > > Thanx, Paul > > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]> > --- > > rcupdate.h | 18 ++ > 1 file changed, 18 insertions(+) > > diff -urpNa -X dontdiff linux-2.6.24/include/linux/rcupdate.h > linux-2.6.24-rai/include/linux/rcupdate.h > --- linux-2.6.24/include/linux/rcupdate.h 2008-01-24 14:58:37.0 > -0800 > +++ linux-2.6.24-rai/include/linux/rcupdate.h 2008-02-12 08:04:59.0 > -0800 > @@ -278,6 +278,24 @@ extern struct lockdep_map rcu_lock_map; > }) > > /** > + * rcu_assign_index - assign (publicize) a index of a newly > + * initialized array elementg that will be dereferenced by RCU ---^ + * initialized array element that will be dereferenced by RCU Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fib_trie: rcu_assign_pointer warning fix
On 12-02-2008 02:16, David Miller wrote: > From: Stephen Hemminger <[EMAIL PROTECTED]> > Date: Mon, 11 Feb 2008 16:59:54 -0800 > > linux-kernel added to CC:, any change to generic kernel infrastructure > should be posted there > >> Eliminate warnings when rcu_assign_pointer is used with unsigned long. >> It is reasonable to use RCU with non-pointer values so allow it for general >> use. Add a comment to explain the if test. >> >> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> >> --- >> include/linux/rcupdate.h | 13 +++-- >> 1 files changed, 7 insertions(+), 6 deletions(-) >> >> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h >> index 37a642c..c44ac87 100644 >> --- a/include/linux/rcupdate.h >> +++ b/include/linux/rcupdate.h >> @@ -172,14 +172,15 @@ struct rcu_head { >> * structure after the pointer assignment. More importantly, this >> * call documents which pointers will be dereferenced by RCU read-side >> * code. >> + * >> + * If value is the NULL (constant 0), then no barrier is needed. >> */ >> >> -#define rcu_assign_pointer(p, v) \ >> -({ \ >> -if (!__builtin_constant_p(v) || \ >> -((v) != NULL)) \ >> -smp_wmb(); \ >> -(p) = (v); \ >> +#define rcu_assign_pointer(p, v)\ >> +({ \ >> +if (!(__builtin_constant_p(v) && v))\ ...But, "If value is the NULL (constant 0)" we have: if (!(1 && NULL != 0)) ==> if (!(0)) and the barrier is used?! >> +smp_wmb(); \ >> +(p) = (v); \ >> }) >> >> /** Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fib_trie: rcu_assign_pointer warning fix
On 12-02-2008 02:16, David Miller wrote: From: Stephen Hemminger [EMAIL PROTECTED] Date: Mon, 11 Feb 2008 16:59:54 -0800 linux-kernel added to CC:, any change to generic kernel infrastructure should be posted there Eliminate warnings when rcu_assign_pointer is used with unsigned long. It is reasonable to use RCU with non-pointer values so allow it for general use. Add a comment to explain the if test. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- include/linux/rcupdate.h | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 37a642c..c44ac87 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -172,14 +172,15 @@ struct rcu_head { * structure after the pointer assignment. More importantly, this * call documents which pointers will be dereferenced by RCU read-side * code. + * + * If value is the NULL (constant 0), then no barrier is needed. */ -#define rcu_assign_pointer(p, v) \ -({ \ -if (!__builtin_constant_p(v) || \ -((v) != NULL)) \ -smp_wmb(); \ -(p) = (v); \ +#define rcu_assign_pointer(p, v)\ +({ \ +if (!(__builtin_constant_p(v) v))\ ...But, If value is the NULL (constant 0) we have: if (!(1 NULL != 0)) == if (!(0)) and the barrier is used?! +smp_wmb(); \ +(p) = (v); \ }) /** Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fib_trie: rcu_assign_pointer warning fix
On Tue, Feb 12, 2008 at 08:32:18PM +0100, Jarek Poplawski wrote: ... It seems the above version of this macro uses the barrier for 0, but if I miss something, or for these other: documenting reasons, ...or __builtin_constants could be used for indexing (?!), then of course you are right. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On 22-01-2008 01:55, Dave Young wrote: ... > Hi, thanks your effort. Now I think we should stop this thread and > waiting the class_device going away :) Sure! But, if you change your mind I'm interested in this subject. Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
Dave Young wrote, On 01/21/2008 09:44 AM: ... > I applied it in my kernel, built and run without warnings, but it need > more testing. > I will be very glad to see the test result about this if you could, thanks. Bad news. (Alas I won't be able to check this today.) = [ INFO: possible recursive locking detected ] 2.6.24-rc8 #5 - modprobe/536 is trying to acquire lock: (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322 but task is already holding lock: (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322 other info that might help us debug this: 2 locks held by modprobe/536: #0: (>scan_mutex){--..}, at: [] __scsi_add_device+0x60/0xc9 [scsi_mod] #1: (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322 stack backtrace: Pid: 536, comm: modprobe Not tainted 2.6.24-rc8 #5 [] __lock_acquire+0x962/0x10d4 [] _spin_unlock_irqrestore+0x34/0x39 [] kfree+0x87/0xad [] trace_hardirqs_on+0xba/0x15b [] lock_acquire+0x71/0x8b [] class_device_add+0x2c6/0x322 [] mutex_lock_nested+0x92/0x2bc [] class_device_add+0x2c6/0x322 [] class_device_add+0x2c6/0x322 [] class_device_add+0x2c6/0x322 [] kobject_get+0xf/0x13 [] kobject_init+0x29/0x38 [] class_device_create+0x7d/0x9e [] sg_add+0x144/0x39f [sg] [] class_device_add+0x2f4/0x322 [] scsi_sysfs_add_sdev+0x64/0x1ac [scsi_mod] [] ata_scsi_dev_config+0x14/0x91 [libata] [] scsi_probe_and_add_lun+0x9c0/0x9e0 [scsi_mod] [] __scsi_add_device+0x60/0xc9 [scsi_mod] [] __scsi_add_device+0xc7/0xc9 [scsi_mod] [] ata_scsi_scan_host+0xd3/0x26a [libata] [] ata_host_register+0x205/0x280 [libata] [] ata_interrupt+0x0/0x200 [libata] [] ata_host_activate+0x8b/0xf2 [libata] [] svia_init_one+0x2e2/0x511 [sata_via] [] pci_match_device+0xa5/0xb6 [] svia_init_one+0x0/0x511 [sata_via] [] pci_device_probe+0x40/0x5f [] driver_probe_device+0x7c/0x175 [] __driver_attach+0xa2/0xa4 [] bus_for_each_dev+0x3c/0x5a [] driver_attach+0x16/0x1a [] __driver_attach+0x0/0xa4 [] bus_add_driver+0x72/0x1c0 [] __pci_register_driver+0x56/0x89 [] sys_init_module+0xf8/0x1891 [] ata_port_start+0x0/0x65 [libata] [] trace_hardirqs_on+0xba/0x15b [] syscall_call+0x7/0xb === scsi 2:0:0:0: Attached scsi generic sg1 type 0 Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Mon, Jan 21, 2008 at 04:44:36PM +0800, Dave Young wrote: ... > I applied it in my kernel, built and run without warnings, but it need > more testing. > I will be very glad to see the test result about this if you could, thanks. I'll try this of course, but alas I don't have anything such more sophisticated to verify this possibility of deeper nesting. Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Mon, Jan 21, 2008 at 09:43:35AM +0800, Dave Young wrote: ... > Convert the class semaphore to mutex. > class_interface_register/unregister use class_device_* functions, so > SINGLE_DEPTH_NESTING added for lockdep please in these functions. Looks fine to me now, but... I think you forgot again to make it clear that there were no lockdep warnings observed after this patch. And it seems there should be used something more complex than usb mouse or modem to verify this. Then IMHO it would be very nice if Andrew dares to merge this to -mm... Regards, Jarek P. PS: I miss a lot function names in this patch... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Mon, Jan 21, 2008 at 04:44:36PM +0800, Dave Young wrote: ... I applied it in my kernel, built and run without warnings, but it need more testing. I will be very glad to see the test result about this if you could, thanks. I'll try this of course, but alas I don't have anything such more sophisticated to verify this possibility of deeper nesting. Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Mon, Jan 21, 2008 at 09:43:35AM +0800, Dave Young wrote: ... Convert the class semaphore to mutex. class_interface_register/unregister use class_device_* functions, so SINGLE_DEPTH_NESTING added for lockdep please in these functions. Looks fine to me now, but... I think you forgot again to make it clear that there were no lockdep warnings observed after this patch. And it seems there should be used something more complex than usb mouse or modem to verify this. Then IMHO it would be very nice if Andrew dares to merge this to -mm... Regards, Jarek P. PS: I miss a lot function names in this patch... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
Dave Young wrote, On 01/21/2008 09:44 AM: ... I applied it in my kernel, built and run without warnings, but it need more testing. I will be very glad to see the test result about this if you could, thanks. Bad news. (Alas I won't be able to check this today.) = [ INFO: possible recursive locking detected ] 2.6.24-rc8 #5 - modprobe/536 is trying to acquire lock: (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322 but task is already holding lock: (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322 other info that might help us debug this: 2 locks held by modprobe/536: #0: (shost-scan_mutex){--..}, at: [ce82795b] __scsi_add_device+0x60/0xc9 [scsi_mod] #1: (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322 stack backtrace: Pid: 536, comm: modprobe Not tainted 2.6.24-rc8 #5 [c0138edf] __lock_acquire+0x962/0x10d4 [c028be37] _spin_unlock_irqrestore+0x34/0x39 [c015e693] kfree+0x87/0xad [c01380cb] trace_hardirqs_on+0xba/0x15b [c01396c2] lock_acquire+0x71/0x8b [c0217e43] class_device_add+0x2c6/0x322 [c028a87f] mutex_lock_nested+0x92/0x2bc [c0217e43] class_device_add+0x2c6/0x322 [c0217e43] class_device_add+0x2c6/0x322 [c0217e43] class_device_add+0x2c6/0x322 [c01afd42] kobject_get+0xf/0x13 [c01afe68] kobject_init+0x29/0x38 [c0217f2c] class_device_create+0x7d/0x9e [cebfe5b4] sg_add+0x144/0x39f [sg] [c0217e71] class_device_add+0x2f4/0x322 [ce8289f7] scsi_sysfs_add_sdev+0x64/0x1ac [scsi_mod] [ce840ece] ata_scsi_dev_config+0x14/0x91 [libata] [ce826b5c] scsi_probe_and_add_lun+0x9c0/0x9e0 [scsi_mod] [ce82795b] __scsi_add_device+0x60/0xc9 [scsi_mod] [ce8279c2] __scsi_add_device+0xc7/0xc9 [scsi_mod] [ce841401] ata_scsi_scan_host+0xd3/0x26a [libata] [ce83e7e6] ata_host_register+0x205/0x280 [libata] [ce83f430] ata_interrupt+0x0/0x200 [libata] [ce83e8ec] ata_host_activate+0x8b/0xf2 [libata] [ceb95504] svia_init_one+0x2e2/0x511 [sata_via] [c01ca42c] pci_match_device+0xa5/0xb6 [ceb95222] svia_init_one+0x0/0x511 [sata_via] [c01ca4e9] pci_device_probe+0x40/0x5f [c02173ae] driver_probe_device+0x7c/0x175 [c02175ed] __driver_attach+0xa2/0xa4 [c02168dd] bus_for_each_dev+0x3c/0x5a [c0217263] driver_attach+0x16/0x1a [c021754b] __driver_attach+0x0/0xa4 [c0216bea] bus_add_driver+0x72/0x1c0 [c01ca642] __pci_register_driver+0x56/0x89 [c013f11d] sys_init_module+0xf8/0x1891 [ce83901f] ata_port_start+0x0/0x65 [libata] [c01380cb] trace_hardirqs_on+0xba/0x15b [c0102936] syscall_call+0x7/0xb === scsi 2:0:0:0: Attached scsi generic sg1 type 0 Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On 22-01-2008 01:55, Dave Young wrote: ... Hi, thanks your effort. Now I think we should stop this thread and waiting the class_device going away :) Sure! But, if you change your mind I'm interested in this subject. Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
Dave Young wrote, On 01/18/2008 10:07 AM: > On Jan 18, 2008 4:23 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: >> On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote: ... >>> 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough. >>> or >>> 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other >>> class_device functions because it is the only possible nest-lock place >>> as I know. Dave, after looking a bit at this it seems you could be "mostly" right with this 2). Maybe I've missed something (I didn't verify this yet), but it looks like +1 level (SINGLE_LEVEL_NESTING) could be needed in: class_device_add() (as you did), but probably also class_device_del() and class_device_destroy(). ...But, there seems to be "little" problem, if there is used this recursion with: class_intf->add()/remove() in class_device_add()/del()?! Then Kay is right about possibility of deeper nesting. If this path is really used, and any of these class_device_* functions with locking are called, then this patch couldn't work like this. So, there is a question: how deep nesting is currently used here? Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
Dave Young wrote, On 01/18/2008 10:07 AM: On Jan 18, 2008 4:23 PM, Jarek Poplawski [EMAIL PROTECTED] wrote: On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote: ... 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough. or 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other class_device functions because it is the only possible nest-lock place as I know. Dave, after looking a bit at this it seems you could be mostly right with this 2). Maybe I've missed something (I didn't verify this yet), but it looks like +1 level (SINGLE_LEVEL_NESTING) could be needed in: class_device_add() (as you did), but probably also class_device_del() and class_device_destroy(). ...But, there seems to be little problem, if there is used this recursion with: class_intf-add()/remove() in class_device_add()/del()?! Then Kay is right about possibility of deeper nesting. If this path is really used, and any of these class_device_* functions with locking are called, then this patch couldn't work like this. So, there is a question: how deep nesting is currently used here? Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 11:45:12AM +0100, Kay Sievers wrote: > On Fri, 2008-01-18 at 08:38 +0100, Jarek Poplawski wrote: > > On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote: > > > On Jan 18, 2008 11:18 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > > ... > > > > Yeah, might be better to wait until class_device is gone, otherwise you > > > > may need to fix stuff that is just going to be removed. Your change to > > > > have iterators for the class devices look like a nice preparation for > > > > future changes though. > > > > > > > > Our rough plan is: > > > > 2.6.25: > > > > - get the ~100 patches in Greg's tree (in -mm) merged :) > > > > 2.6.26: > > > > ??? - remove the 20 char limit in "struct device" > > > > - get rid of "struct class_device" > > > > > > Fine, thanks. > > > > > > Let's wait for other people's comment. > > > > Dave, I doubt you'll ever manage to do this if you're going to wait: > > probably there will be always some new changes like this around... > > Well there are not "changes" in that sense, the class_device stuff will > be entirely ripped out, and I doubt we will want to change anything > there, just shortly before it's deleted. So, 2.6.26 means shortly... And this all needs some time for testing, debugging or maybe some change of concept, so this would take a while... Well, it's not my problem, but since this stuff will go away, shouldn't we care more about the staff that will stay? > Also your assumptions about device nesting are not really true, there is > no limit, even when there are no current users nesting deeper, and > "struct device" can be any nesting depth, and that's where it gets > interesting. I'm just trying to figure this out. It seems this is a real problem while freezing, but not necessarily here (but I can miss something). Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 09:00:34AM +0100, Jarek Poplawski wrote: > On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote: > ... > > After digging the class usage code again, I found that the only > > possible double lock place is the class_interface_register/unregister > > in which the class_device api could be called. > > OK, but currently after using mostly: > mutex_lock(_class) > > and once: > mutex_lock_nested(_class, SINGLE_DEPTH_NESTING) > > lockdep mostly thinks these parent classes are 2 different objects, > with only 2 possible levels of nesting, so this parent_class has > to have wrong name (2 parents can't be locked from the same thread, > so maybe it's class_grandparent sometimes?). ...Hmm... I was probably wrong: this could be right if there are only two levels of nesting used and class locks it's parent only! Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote: > On Jan 18, 2008 3:38 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: ... > > IMHO, it would be nice to get the real state of current lockdep > > problems here to figure out if there is any chance to do this right & > > without any warnings with current lockdep. If I got it right from > > earlier threads it might be impossible with USB, at least. > > I don't think so, usb doesn't be affected by struct class mutex, they > only use the lock of struct device. As I replied before, the lockdep > issue exist only between class_interface and class_device. OK, but I've meant possibility of changing their own semaphores later. > > So, since I think these nesting levels seem to be wrong in 7/7 patch, > > maybe it's better to exclude it from this patchset, and to try this as > > testing for some time. > > I may file the updated patch with more nesting changes and test it of > course. Actually I should have done it, thanks. ... > 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough. > or > 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other > class_device functions because it is the only possible nest-lock place > as I know. If SINGLE_LEVEL_NESTING is enough? (means 2 levels total) I think you should more care about real (logical) relations here, than what's enough to get rid of lockdep warnings. Since there are not so much of these changes, you can try both variants. I'll be glad to look at this - maybe I'll mangage to figure out BTW, what it's all about... Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 09:00:34AM +0100, Jarek Poplawski wrote: On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote: ... After digging the class usage code again, I found that the only possible double lock place is the class_interface_register/unregister in which the class_device api could be called. OK, but currently after using mostly: mutex_lock(parent_class) and once: mutex_lock_nested(parent_class, SINGLE_DEPTH_NESTING) lockdep mostly thinks these parent classes are 2 different objects, with only 2 possible levels of nesting, so this parent_class has to have wrong name (2 parents can't be locked from the same thread, so maybe it's class_grandparent sometimes?). ...Hmm... I was probably wrong: this could be right if there are only two levels of nesting used and class locks it's parent only! Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 11:45:12AM +0100, Kay Sievers wrote: On Fri, 2008-01-18 at 08:38 +0100, Jarek Poplawski wrote: On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote: On Jan 18, 2008 11:18 AM, Kay Sievers [EMAIL PROTECTED] wrote: ... Yeah, might be better to wait until class_device is gone, otherwise you may need to fix stuff that is just going to be removed. Your change to have iterators for the class devices look like a nice preparation for future changes though. Our rough plan is: 2.6.25: - get the ~100 patches in Greg's tree (in -mm) merged :) 2.6.26: ??? - remove the 20 char limit in struct device - get rid of struct class_device Fine, thanks. Let's wait for other people's comment. Dave, I doubt you'll ever manage to do this if you're going to wait: probably there will be always some new changes like this around... Well there are not changes in that sense, the class_device stuff will be entirely ripped out, and I doubt we will want to change anything there, just shortly before it's deleted. So, 2.6.26 means shortly... And this all needs some time for testing, debugging or maybe some change of concept, so this would take a while... Well, it's not my problem, but since this stuff will go away, shouldn't we care more about the staff that will stay? Also your assumptions about device nesting are not really true, there is no limit, even when there are no current users nesting deeper, and struct device can be any nesting depth, and that's where it gets interesting. I'm just trying to figure this out. It seems this is a real problem while freezing, but not necessarily here (but I can miss something). Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote: ... > After digging the class usage code again, I found that the only > possible double lock place is the class_interface_register/unregister > in which the class_device api could be called. OK, but currently after using mostly: mutex_lock(_class) and once: mutex_lock_nested(_class, SINGLE_DEPTH_NESTING) lockdep mostly thinks these parent classes are 2 different objects, with only 2 possible levels of nesting, so this parent_class has to have wrong name (2 parents can't be locked from the same thread, so maybe it's class_grandparent sometimes?). Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote: > On Jan 18, 2008 11:18 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: ... > > Yeah, might be better to wait until class_device is gone, otherwise you > > may need to fix stuff that is just going to be removed. Your change to > > have iterators for the class devices look like a nice preparation for > > future changes though. > > > > Our rough plan is: > > 2.6.25: > > - get the ~100 patches in Greg's tree (in -mm) merged :) > > 2.6.26: > > ??? - remove the 20 char limit in "struct device" > > - get rid of "struct class_device" > > Fine, thanks. > > Let's wait for other people's comment. Dave, I doubt you'll ever manage to do this if you're going to wait: probably there will be always some new changes like this around... IMHO, it would be nice to get the real state of current lockdep problems here to figure out if there is any chance to do this right & without any warnings with current lockdep. If I got it right from earlier threads it might be impossible with USB, at least. So, since I think these nesting levels seem to be wrong in 7/7 patch, maybe it's better to exclude it from this patchset, and to try this as testing for some time. My proposal is to do the annotations with mutex_lock_nested(), everywhere in this patch, according to 'real' relations between these classes from thread POV, so e.g.: mutex_lock_nested(_class->mutex, CLASS_PARENT); mutex_lock_nested(_class->mutex, CLASS_CLASS); ...or more if needed: mutex_lock_nested(_class->mutex, CLASS_CHILD); mutex_lock_nested(_class->mutex, CLASS_ROOT); ...etc. using adequate names for these enums, and only after this check what lockdep thinks about it. Then, if there are no obvious mistakes, but lockdep doesn't like it, send the patch and report without trying to 'silence' lockdep, so we could see what's going on, and if there are any chances to make it right. Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote: > On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: > > On Thu, 17 Jan 2008, Jarek Poplawski wrote: > > > > > On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: > > > > On Thu, 17 Jan 2008, Dave Young wrote: > > > > > > > > > > Your meaning isn't clear. Do you mean that your patch doesn't > > > > > > generate > > > > > > any lockdep warnings at all? Or do you mean that it generates a > > > > > > single > > > > > > lockdep warning at boot time and then no more warnings afterward? > > > > > > > > > > I means the latter one. > > > > > > > > That's very bad. > > > > > > > > For each type of violation, lockdep only gives one error message. So > > > > the fact that you get one message at boot time and then no more doesn't > > > > mean the code is almost right -- it probably means the code has lots of > > > > errors and you're seeing only the first one. > > > > > > I hope it's better than this: lockdep really stops checking after first > > > warning, but I've understood from David's description that after fixing > > > this one place lockdep seems to be pleased. > > > > That isn't what Dave said above; he said that lockdep produces a single > > warning at bootup. If he did mention anything about one place being > > fixed up or lockdep being pleased, it was a while back and I've lost > > track of it. > > > > If I recall correctly the nature of the warning was that a method > > routine for one class (called with the class's mutex held) was creating > > a second class and locking that class's mutex. In principle this is > > perfectly legal and should be allowed for arbitrary depths of nesting, > > even though it is the sort of thing lockdep is currently unable to > > handle. > > You are definitely right! After first reading Dave's description I got > it the same way, but after re-reading I probably was misled with this > "thus"! Only now I've had a look at this warning and there is really > mutex_lock_nested(). Sorry Alan! But, on the other hand, mutex_lock() is really mutex_lock_nested(), and after second checking this lockdep warning from Jan. 3, it seems impossible it was get after this patch... Dave, could you please answer with full sentence if there is any lockdep warning possible after applying these 1-7/7 patches, and if so, attach current warning? Otherwise, I'll have apologized for this everybody from the list soon! Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote: > On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: ... > > If I recall correctly the nature of the warning was that a method > > routine for one class (called with the class's mutex held) was creating > > a second class and locking that class's mutex. In principle this is > > perfectly legal and should be allowed for arbitrary depths of nesting, > > even though it is the sort of thing lockdep is currently unable to > > handle. BTW, Dave, if it's only about one such "second class" here, then it shouldn't be so hard to try this one more level of nesting. I think, the real problem for lockdep starts when there are more such "second classes", but it's probably in another place. You could also have a look at e.g. enum_inode_i_mutex_lock_class in include/linux/fs.h and fh_lock_nested() in include/linux/nfsd/nfsfh.h, and maybe define similar enum for this. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 01:11:01PM -0800, Greg KH wrote: ... > I've known Greg to make lots of mistakes :) Right! Above is one example... > I don't remember ever saying that the "code is correct with the lockdep > warnings", I think I said, "Make sure there are no lockdep warnings with > any conversion you do." I've meant your discussion with David Miller, where you doubted the need to change properly working code. But, as a matter of fact, I really can't see any reason to doubt this, and care so much about lockdep now, if there are no known problems with lockups, and this change has been done done "one for one". Sorry again for any possible misleading, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: > On Thu, 17 Jan 2008, Jarek Poplawski wrote: > > > On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: > > > On Thu, 17 Jan 2008, Dave Young wrote: > > > > > > > > Your meaning isn't clear. Do you mean that your patch doesn't > > > > > generate > > > > > any lockdep warnings at all? Or do you mean that it generates a > > > > > single > > > > > lockdep warning at boot time and then no more warnings afterward? > > > > > > > > I means the latter one. > > > > > > That's very bad. > > > > > > For each type of violation, lockdep only gives one error message. So > > > the fact that you get one message at boot time and then no more doesn't > > > mean the code is almost right -- it probably means the code has lots of > > > errors and you're seeing only the first one. > > > > I hope it's better than this: lockdep really stops checking after first > > warning, but I've understood from David's description that after fixing > > this one place lockdep seems to be pleased. > > That isn't what Dave said above; he said that lockdep produces a single > warning at bootup. If he did mention anything about one place being > fixed up or lockdep being pleased, it was a while back and I've lost > track of it. > > If I recall correctly the nature of the warning was that a method > routine for one class (called with the class's mutex held) was creating > a second class and locking that class's mutex. In principle this is > perfectly legal and should be allowed for arbitrary depths of nesting, > even though it is the sort of thing lockdep is currently unable to > handle. You are definitely right! After first reading Dave's description I got it the same way, but after re-reading I probably was misled with this "thus"! Only now I've had a look at this warning and there is really mutex_lock_nested(). Sorry Alan! David, I don't think a patch which causes such a warning can be merged even to -mm, because, as I wrote earlier it would automatically turn off lockdep for everybody. So, every such warning needs to be fixed or, if it's impossible because of some lockdep deficiency, it should be considered if it's better to wait for lockdep changes, or do the change with lockdep turned off locally for each lock (IMHO, it's better, because with sems there is no such control as well, and some other aspects could be tested in the meantime). > > On the other hand, according to Greg the code is OK, so if there are any > > such warnings they simply have to be false! (...Unless you trust lockdep > > more?!) > > It's not a matter of trust or of false warnings. People shouldn't > tolerate any lockdep warnings at all; otherwise they will start to > ignore the valid ones. > > Alan Stern > > P.S.: Just because Greg says the code is okay doesn't mean it will > please lockdep -- it doesn't even mean the code really is okay! I've > known Greg to make an occasional mistake. Alan, you are 200% right! I apologize for my bad jokes too! Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: > On Thu, 17 Jan 2008, Dave Young wrote: > > > > Your meaning isn't clear. Do you mean that your patch doesn't generate > > > any lockdep warnings at all? Or do you mean that it generates a single > > > lockdep warning at boot time and then no more warnings afterward? > > > > I means the latter one. > > That's very bad. > > For each type of violation, lockdep only gives one error message. So > the fact that you get one message at boot time and then no more doesn't > mean the code is almost right -- it probably means the code has lots of > errors and you're seeing only the first one. I hope it's better than this: lockdep really stops checking after first warning, but I've understood from David's description that after fixing this one place lockdep seems to be pleased. On the other hand, according to Greg the code is OK, so if there are any such warnings they simply have to be false! (...Unless you trust lockdep more?!) Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On 17-01-2008 02:17, Dave Young wrote: > On Jan 16, 2008 4:34 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: >> On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote: >> ... >>> The lockdep warining was posted in the below thread, actually, I have >>> built and run this patced kernel for several days, there's no more >>> warnings. >>> http://lkml.org/lkml/2008/1/3/2 >> Right... But, with something like this: >> >> ... have_some_fun(... cls) >> { >> mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING); >> have_other_fun(cls); >> mutex_unlock(>mutex); >> >> } >> >> ... have_more_fun(...) >> { >> ... >> >> mutex_init(>mutex); >> >> mutex_lock(>mutex); >> have_some_fun(cls); >> mutex_unlock(>mutex); >> } >> >> probably you wouldn't get any lockdep warning too... > > Sorry for late reply. > Actually, I don't know much about lockdep. Could you tell how to use > it properly in this scenario? As you have noticed while working on this patch, if two different instances of the same structure containig some lock are created in the same place, lockdep will treat this one (the same) lock. It looks strange, but actually it's a feature which enables to track dependencies between different locks on 'class' level instead of instance 'level'. The downside is that lockdep is very often too sensitive by default, so you have to 'annotate' when instancess are actually on different level (e.g. parents and children here) and could be locked at the same time or in some order. You can use e.g. mutex_lock_nested() or lockdep_set_class*() for this. Then lockdep simply trusts you, and starts to think they are different locks. If you do it wrong there will be simply no more warnings, but undercover lockups still possible (and diagnosing a bit harder then). So, since in your patch there are two levels of locking, and you started to annotate lockdep about a child taking parent's class lock with: mutex_lock_nested(_class->mutex, SINGLE_DEPTH_NESTING); you should do the same everywhere in a situation like this. lockdep will treat this simply as lock B vs. A (mutex_lock()) dependencies. Regards, Jarek P. PS: BTW, it seems after this patch 1/1 the locking was changed a bit, so these previous tests could be not enough. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On 17-01-2008 02:17, Dave Young wrote: On Jan 16, 2008 4:34 PM, Jarek Poplawski [EMAIL PROTECTED] wrote: On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote: ... The lockdep warining was posted in the below thread, actually, I have built and run this patced kernel for several days, there's no more warnings. http://lkml.org/lkml/2008/1/3/2 Right... But, with something like this: ... have_some_fun(... cls) { mutex_lock_nested(cls-mutex, SINGLE_DEPTH_NESTING); have_other_fun(cls); mutex_unlock(cls-mutex); } ... have_more_fun(...) { ... mutex_init(cls-mutex); mutex_lock(cls-mutex); have_some_fun(cls); mutex_unlock(cls-mutex); } probably you wouldn't get any lockdep warning too... Sorry for late reply. Actually, I don't know much about lockdep. Could you tell how to use it properly in this scenario? As you have noticed while working on this patch, if two different instances of the same structure containig some lock are created in the same place, lockdep will treat this one (the same) lock. It looks strange, but actually it's a feature which enables to track dependencies between different locks on 'class' level instead of instance 'level'. The downside is that lockdep is very often too sensitive by default, so you have to 'annotate' when instancess are actually on different level (e.g. parents and children here) and could be locked at the same time or in some order. You can use e.g. mutex_lock_nested() or lockdep_set_class*() for this. Then lockdep simply trusts you, and starts to think they are different locks. If you do it wrong there will be simply no more warnings, but undercover lockups still possible (and diagnosing a bit harder then). So, since in your patch there are two levels of locking, and you started to annotate lockdep about a child taking parent's class lock with: mutex_lock_nested(parent_class-mutex, SINGLE_DEPTH_NESTING); you should do the same everywhere in a situation like this. lockdep will treat this simply as lock B vs. A (mutex_lock(cls)) dependencies. Regards, Jarek P. PS: BTW, it seems after this patch 1/1 the locking was changed a bit, so these previous tests could be not enough. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: On Thu, 17 Jan 2008, Dave Young wrote: Your meaning isn't clear. Do you mean that your patch doesn't generate any lockdep warnings at all? Or do you mean that it generates a single lockdep warning at boot time and then no more warnings afterward? I means the latter one. That's very bad. For each type of violation, lockdep only gives one error message. So the fact that you get one message at boot time and then no more doesn't mean the code is almost right -- it probably means the code has lots of errors and you're seeing only the first one. I hope it's better than this: lockdep really stops checking after first warning, but I've understood from David's description that after fixing this one place lockdep seems to be pleased. On the other hand, according to Greg the code is OK, so if there are any such warnings they simply have to be false! (...Unless you trust lockdep more?!) Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: On Thu, 17 Jan 2008, Jarek Poplawski wrote: On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: On Thu, 17 Jan 2008, Dave Young wrote: Your meaning isn't clear. Do you mean that your patch doesn't generate any lockdep warnings at all? Or do you mean that it generates a single lockdep warning at boot time and then no more warnings afterward? I means the latter one. That's very bad. For each type of violation, lockdep only gives one error message. So the fact that you get one message at boot time and then no more doesn't mean the code is almost right -- it probably means the code has lots of errors and you're seeing only the first one. I hope it's better than this: lockdep really stops checking after first warning, but I've understood from David's description that after fixing this one place lockdep seems to be pleased. That isn't what Dave said above; he said that lockdep produces a single warning at bootup. If he did mention anything about one place being fixed up or lockdep being pleased, it was a while back and I've lost track of it. If I recall correctly the nature of the warning was that a method routine for one class (called with the class's mutex held) was creating a second class and locking that class's mutex. In principle this is perfectly legal and should be allowed for arbitrary depths of nesting, even though it is the sort of thing lockdep is currently unable to handle. You are definitely right! After first reading Dave's description I got it the same way, but after re-reading I probably was misled with this thus! Only now I've had a look at this warning and there is really mutex_lock_nested(). Sorry Alan! David, I don't think a patch which causes such a warning can be merged even to -mm, because, as I wrote earlier it would automatically turn off lockdep for everybody. So, every such warning needs to be fixed or, if it's impossible because of some lockdep deficiency, it should be considered if it's better to wait for lockdep changes, or do the change with lockdep turned off locally for each lock (IMHO, it's better, because with sems there is no such control as well, and some other aspects could be tested in the meantime). On the other hand, according to Greg the code is OK, so if there are any such warnings they simply have to be false! (...Unless you trust lockdep more?!) It's not a matter of trust or of false warnings. People shouldn't tolerate any lockdep warnings at all; otherwise they will start to ignore the valid ones. Alan Stern P.S.: Just because Greg says the code is okay doesn't mean it will please lockdep -- it doesn't even mean the code really is okay! I've known Greg to make an occasional mistake. Alan, you are 200% right! I apologize for my bad jokes too! Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote: On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: ... If I recall correctly the nature of the warning was that a method routine for one class (called with the class's mutex held) was creating a second class and locking that class's mutex. In principle this is perfectly legal and should be allowed for arbitrary depths of nesting, even though it is the sort of thing lockdep is currently unable to handle. BTW, Dave, if it's only about one such second class here, then it shouldn't be so hard to try this one more level of nesting. I think, the real problem for lockdep starts when there are more such second classes, but it's probably in another place. You could also have a look at e.g. enum_inode_i_mutex_lock_class in include/linux/fs.h and fh_lock_nested() in include/linux/nfsd/nfsfh.h, and maybe define similar enum for this. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 01:11:01PM -0800, Greg KH wrote: ... I've known Greg to make lots of mistakes :) Right! Above is one example... I don't remember ever saying that the code is correct with the lockdep warnings, I think I said, Make sure there are no lockdep warnings with any conversion you do. I've meant your discussion with David Miller, where you doubted the need to change properly working code. But, as a matter of fact, I really can't see any reason to doubt this, and care so much about lockdep now, if there are no known problems with lockups, and this change has been done done one for one. Sorry again for any possible misleading, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote: On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote: On Thu, 17 Jan 2008, Jarek Poplawski wrote: On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote: On Thu, 17 Jan 2008, Dave Young wrote: Your meaning isn't clear. Do you mean that your patch doesn't generate any lockdep warnings at all? Or do you mean that it generates a single lockdep warning at boot time and then no more warnings afterward? I means the latter one. That's very bad. For each type of violation, lockdep only gives one error message. So the fact that you get one message at boot time and then no more doesn't mean the code is almost right -- it probably means the code has lots of errors and you're seeing only the first one. I hope it's better than this: lockdep really stops checking after first warning, but I've understood from David's description that after fixing this one place lockdep seems to be pleased. That isn't what Dave said above; he said that lockdep produces a single warning at bootup. If he did mention anything about one place being fixed up or lockdep being pleased, it was a while back and I've lost track of it. If I recall correctly the nature of the warning was that a method routine for one class (called with the class's mutex held) was creating a second class and locking that class's mutex. In principle this is perfectly legal and should be allowed for arbitrary depths of nesting, even though it is the sort of thing lockdep is currently unable to handle. You are definitely right! After first reading Dave's description I got it the same way, but after re-reading I probably was misled with this thus! Only now I've had a look at this warning and there is really mutex_lock_nested(). Sorry Alan! But, on the other hand, mutex_lock() is really mutex_lock_nested(), and after second checking this lockdep warning from Jan. 3, it seems impossible it was get after this patch... Dave, could you please answer with full sentence if there is any lockdep warning possible after applying these 1-7/7 patches, and if so, attach current warning? Otherwise, I'll have apologized for this everybody from the list soon! Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote: On Jan 18, 2008 11:18 AM, Kay Sievers [EMAIL PROTECTED] wrote: ... Yeah, might be better to wait until class_device is gone, otherwise you may need to fix stuff that is just going to be removed. Your change to have iterators for the class devices look like a nice preparation for future changes though. Our rough plan is: 2.6.25: - get the ~100 patches in Greg's tree (in -mm) merged :) 2.6.26: ??? - remove the 20 char limit in struct device - get rid of struct class_device Fine, thanks. Let's wait for other people's comment. Dave, I doubt you'll ever manage to do this if you're going to wait: probably there will be always some new changes like this around... IMHO, it would be nice to get the real state of current lockdep problems here to figure out if there is any chance to do this right without any warnings with current lockdep. If I got it right from earlier threads it might be impossible with USB, at least. So, since I think these nesting levels seem to be wrong in 7/7 patch, maybe it's better to exclude it from this patchset, and to try this as testing for some time. My proposal is to do the annotations with mutex_lock_nested(), everywhere in this patch, according to 'real' relations between these classes from thread POV, so e.g.: mutex_lock_nested(some_class-mutex, CLASS_PARENT); mutex_lock_nested(some_class-mutex, CLASS_CLASS); ...or more if needed: mutex_lock_nested(some_class-mutex, CLASS_CHILD); mutex_lock_nested(some_class-mutex, CLASS_ROOT); ...etc. using adequate names for these enums, and only after this check what lockdep thinks about it. Then, if there are no obvious mistakes, but lockdep doesn't like it, send the patch and report without trying to 'silence' lockdep, so we could see what's going on, and if there are any chances to make it right. Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote: ... After digging the class usage code again, I found that the only possible double lock place is the class_interface_register/unregister in which the class_device api could be called. OK, but currently after using mostly: mutex_lock(parent_class) and once: mutex_lock_nested(parent_class, SINGLE_DEPTH_NESTING) lockdep mostly thinks these parent classes are 2 different objects, with only 2 possible levels of nesting, so this parent_class has to have wrong name (2 parents can't be locked from the same thread, so maybe it's class_grandparent sometimes?). Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Wed, Jan 16, 2008 at 10:27:54AM -0500, Alan Stern wrote: > On Wed, 16 Jan 2008, Dave Young wrote: > > > The lockdep warining was posted in the below thread, actually, I have > > built and run this patced kernel for several days, there's no more > > warnings. > > http://lkml.org/lkml/2008/1/3/2 > > Your meaning isn't clear. Do you mean that your patch doesn't generate > any lockdep warnings at all? Or do you mean that it generates a single > lockdep warning at boot time and then no more warnings afterward? So it wasn't only my sleeping problem! This: "One lockdep warning detected as following, thus use mutex_lock_nested with SINGLE_DEPTH_NESTING in class_device_add" seems to say that mutex_lock_nested is used in the patch just to remove this one possible warning... Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Wed, Jan 16, 2008 at 09:04:58PM +0100, Willy Tarreau wrote: ... > you can work with latest release provided that you always have a fallback > to an earlier one. That way, you don't bet too much on something you don't > completely control. If it works, it tells you you'll be able to completely > exploit its new possibilities in next product release, and if it breaks, > it's easy to issue a fix to get back to earlier, well-tested version. Of course, this way looks preferable, but sometimes maybe too costly, especially with some complex systems. Actually, I don't even think this have to be fully (production ready) implemented or workable. Probably there would be even enough to maintain some simplified kind of test checking how current kernel changes could affect such a system, and how new versions of this system are better, BTW. Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote: ... > The lockdep warining was posted in the below thread, actually, I have > built and run this patced kernel for several days, there's no more > warnings. > http://lkml.org/lkml/2008/1/3/2 Right... But, with something like this: ... have_some_fun(... cls) { mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING); have_other_fun(cls); mutex_unlock(>mutex); } ... have_more_fun(...) { ... mutex_init(>mutex); mutex_lock(>mutex); have_some_fun(cls); mutex_unlock(>mutex); } probably you wouldn't get any lockdep warning too... Of course, if we know all the locking is right such proper lockdep annotating shouldn't matter too much. (And of course this could be improved later.) Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote: ... The lockdep warining was posted in the below thread, actually, I have built and run this patced kernel for several days, there's no more warnings. http://lkml.org/lkml/2008/1/3/2 Right... But, with something like this: ... have_some_fun(... cls) { mutex_lock_nested(cls-mutex, SINGLE_DEPTH_NESTING); have_other_fun(cls); mutex_unlock(cls-mutex); } ... have_more_fun(...) { ... mutex_init(cls-mutex); mutex_lock(cls-mutex); have_some_fun(cls); mutex_unlock(cls-mutex); } probably you wouldn't get any lockdep warning too... Of course, if we know all the locking is right such proper lockdep annotating shouldn't matter too much. (And of course this could be improved later.) Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Wed, Jan 16, 2008 at 09:04:58PM +0100, Willy Tarreau wrote: ... you can work with latest release provided that you always have a fallback to an earlier one. That way, you don't bet too much on something you don't completely control. If it works, it tells you you'll be able to completely exploit its new possibilities in next product release, and if it breaks, it's easy to issue a fix to get back to earlier, well-tested version. Of course, this way looks preferable, but sometimes maybe too costly, especially with some complex systems. Actually, I don't even think this have to be fully (production ready) implemented or workable. Probably there would be even enough to maintain some simplified kind of test checking how current kernel changes could affect such a system, and how new versions of this system are better, BTW. Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Wed, Jan 16, 2008 at 11:17:08AM +1100, Herbert Xu wrote: ... > Well people are always going to operate on this model for commercial > reasons. FWIW I used to work for a company that stuck to a specific > version of the Linux kernel, and I suppose I still do even now :) > > But the important thing is that if you're going to do that, then the > cost that comes with it should be borne by the company and not the > community. Sure. But the most sad thing is there seems to be not so much savings in this (unless a company isn't sure of its near future). Trying to upgrade and test current products with current kernels, even if not necessary, should be always useful and make developing of new products faster and better fit (and of course, BTW, make the kernel better on time). Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Tue, Jan 15, 2008 at 08:47:07AM -0600, Chris Friesen wrote: > Jarek Poplawski wrote: > >> IMHO, checking this with a current stable, which probably you are going >> to do some day, anyway, should be 100% acceptable: giving some input to >> netdev, while still working for yourself. > > While I would love to do this, it's not that simple. ...Hmm... As a matter of fact, I expected you'd treat my point less literally... Of course, I know it could be sometimes very hard to get something working even after upgrading one version, let alone several at once. So, it was more a rhetorical trick (sorry!) to suggest, that such a business model of being always late with kernels might be quite practical and reasonable for many companies, but looks like the worst possible development model for Linux. On the other hand, it seems there are not so much, nor expensive changes needed (a bit more perspective thinking?) to make everybody happy... Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Tue, Jan 15, 2008 at 05:15:27PM +0800, Dave Young wrote: > Convert the class semaphore to mutex. > > Signed-off-by: Dave Young <[EMAIL PROTECTED]> > > --- > drivers/base/class.c | 38 +++--- > drivers/base/core.c| 18 -- > include/linux/device.h |3 ++- > 3 files changed, 29 insertions(+), 30 deletions(-) > > diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c > --- linux/drivers/base/class.c2008-01-15 14:04:26.0 +0800 > +++ linux.new/drivers/base/class.c2008-01-15 14:04:26.0 +0800 > @@ -144,7 +144,7 @@ int class_register(struct class * cls) > INIT_LIST_HEAD(>devices); > INIT_LIST_HEAD(>interfaces); > kset_init(>class_dirs); > - init_MUTEX(>sem); > + mutex_init(>mutex); > error = kobject_set_name(>subsys.kobj, "%s", cls->name); > if (error) > return error; > @@ -617,13 +617,13 @@ int class_device_add(struct class_device > kobject_uevent(_dev->kobj, KOBJ_ADD); > > /* notify any interfaces this device is now here */ > - down(_class->sem); > + mutex_lock_nested(_class->mutex, SINGLE_DEPTH_NESTING); > list_add_tail(_dev->node, _class->children); > list_for_each_entry(class_intf, _class->interfaces, node) { > if (class_intf->add) > class_intf->add(class_dev, class_intf); > } > - up(_class->sem); > + mutex_unlock(_class->mutex); > > goto out1; > > @@ -725,12 +725,12 @@ void class_device_del(struct class_devic > struct class_interface *class_intf; > > if (parent_class) { > - down(_class->sem); > + mutex_lock(_class->mutex); I hope I'm wrong with this (I don't know this code at all...), and of course I should've noticed this earlier after all, but I wonder about this _NESTING corretness here. So, if these variables names are right, and say about real nesting dependency, then it seems mutex_lock_nested() should be used consistently even if (currently?) not forced by lockdep warnings; otherwise this could possibly cover some other warnings. Alas, if accidentally I'm right, it seems a bit of new testing would be necessary... Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class
On Tue, Jan 15, 2008 at 05:15:27PM +0800, Dave Young wrote: Convert the class semaphore to mutex. Signed-off-by: Dave Young [EMAIL PROTECTED] --- drivers/base/class.c | 38 +++--- drivers/base/core.c| 18 -- include/linux/device.h |3 ++- 3 files changed, 29 insertions(+), 30 deletions(-) diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c --- linux/drivers/base/class.c2008-01-15 14:04:26.0 +0800 +++ linux.new/drivers/base/class.c2008-01-15 14:04:26.0 +0800 @@ -144,7 +144,7 @@ int class_register(struct class * cls) INIT_LIST_HEAD(cls-devices); INIT_LIST_HEAD(cls-interfaces); kset_init(cls-class_dirs); - init_MUTEX(cls-sem); + mutex_init(cls-mutex); error = kobject_set_name(cls-subsys.kobj, %s, cls-name); if (error) return error; @@ -617,13 +617,13 @@ int class_device_add(struct class_device kobject_uevent(class_dev-kobj, KOBJ_ADD); /* notify any interfaces this device is now here */ - down(parent_class-sem); + mutex_lock_nested(parent_class-mutex, SINGLE_DEPTH_NESTING); list_add_tail(class_dev-node, parent_class-children); list_for_each_entry(class_intf, parent_class-interfaces, node) { if (class_intf-add) class_intf-add(class_dev, class_intf); } - up(parent_class-sem); + mutex_unlock(parent_class-mutex); goto out1; @@ -725,12 +725,12 @@ void class_device_del(struct class_devic struct class_interface *class_intf; if (parent_class) { - down(parent_class-sem); + mutex_lock(parent_class-mutex); I hope I'm wrong with this (I don't know this code at all...), and of course I should've noticed this earlier after all, but I wonder about this _NESTING corretness here. So, if these variables names are right, and say about real nesting dependency, then it seems mutex_lock_nested() should be used consistently even if (currently?) not forced by lockdep warnings; otherwise this could possibly cover some other warnings. Alas, if accidentally I'm right, it seems a bit of new testing would be necessary... Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Tue, Jan 15, 2008 at 08:47:07AM -0600, Chris Friesen wrote: Jarek Poplawski wrote: IMHO, checking this with a current stable, which probably you are going to do some day, anyway, should be 100% acceptable: giving some input to netdev, while still working for yourself. While I would love to do this, it's not that simple. ...Hmm... As a matter of fact, I expected you'd treat my point less literally... Of course, I know it could be sometimes very hard to get something working even after upgrading one version, let alone several at once. So, it was more a rhetorical trick (sorry!) to suggest, that such a business model of being always late with kernels might be quite practical and reasonable for many companies, but looks like the worst possible development model for Linux. On the other hand, it seems there are not so much, nor expensive changes needed (a bit more perspective thinking?) to make everybody happy... Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On Wed, Jan 16, 2008 at 11:17:08AM +1100, Herbert Xu wrote: ... Well people are always going to operate on this model for commercial reasons. FWIW I used to work for a company that stuck to a specific version of the Linux kernel, and I suppose I still do even now :) But the important thing is that if you're going to do that, then the cost that comes with it should be borne by the company and not the community. Sure. But the most sad thing is there seems to be not so much savings in this (unless a company isn't sure of its near future). Trying to upgrade and test current products with current kernels, even if not necessary, should be always useful and make developing of new products faster and better fit (and of course, BTW, make the kernel better on time). Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On 14-01-2008 16:58, Chris Friesen wrote: ... > How close to bleeding edge do we need to be for it to be considered > acceptable to ask questions on netdev? > > Given that the embedded space tends to be perpetually stuck on older > kernels (our "current" release is based on 2.6.14) do you have any > suggestion on how we can obtain information that would be available on > netdev if we were using newer kernels? IMHO, checking this with a current stable, which probably you are going to do some day, anyway, should be 100% acceptable: giving some input to netdev, while still working for yourself. Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: questions on NAPI processing latency and dropped network packets
On 14-01-2008 16:58, Chris Friesen wrote: ... How close to bleeding edge do we need to be for it to be considered acceptable to ask questions on netdev? Given that the embedded space tends to be perpetually stuck on older kernels (our current release is based on 2.6.14) do you have any suggestion on how we can obtain information that would be available on netdev if we were using newer kernels? IMHO, checking this with a current stable, which probably you are going to do some day, anyway, should be 100% acceptable: giving some input to netdev, while still working for yourself. Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/7] driver-core : add class iteration api
On Mon, Jan 14, 2008 at 09:36:04AM +0800, Dave Young wrote: > On Jan 13, 2008 4:11 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: ... > > Probably some tiny oversight, but I see this comment to struct class > > doesn't mention devices list, so maybe this needs to be updated BTW?: > > > > (from include/linux/device.h) > > "struct semaphoresem;/* locks both the children and > > interfaces lists */" > > Sorry for my lazy, I think so too. > IMHO, it should be updated after the comments. As a matter of fact, only later I've found this question is 'fixed' in 7/7. But, actually, I was more concerned if this patch changed the 'official' policy of this sem (then it would be nice to mention about this in a changelog) or this comment was simply incomplete. Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/7] driver-core : add class iteration api
On Mon, Jan 14, 2008 at 09:36:04AM +0800, Dave Young wrote: On Jan 13, 2008 4:11 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: ... Probably some tiny oversight, but I see this comment to struct class doesn't mention devices list, so maybe this needs to be updated BTW?: (from include/linux/device.h) struct semaphoresem;/* locks both the children and interfaces lists */ Sorry for my lazy, I think so too. IMHO, it should be updated after the comments. As a matter of fact, only later I've found this question is 'fixed' in 7/7. But, actually, I was more concerned if this patch changed the 'official' policy of this sem (then it would be nice to mention about this in a changelog) or this comment was simply incomplete. Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/7] driver-core : add class iteration api
On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote: > Add the following class iteration functions for driver use: > class_for_each_device > class_find_device > class_for_each_child > class_find_child > > Signed-off-by: Dave Young <[EMAIL PROTECTED]> > > --- > drivers/base/class.c | 159 > + > include/linux/device.h |8 ++ > 2 files changed, 167 insertions(+) > > diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c > --- linux/drivers/base/class.c2008-01-12 14:42:24.0 +0800 > +++ linux.new/drivers/base/class.c2008-01-12 14:42:24.0 +0800 > @@ -798,6 +798,165 @@ void class_device_put(struct class_devic > kobject_put(_dev->kobj); > } > > +/** > + * class_for_each_device - device iterator > + * @class: the class we're iterating > + * @data: data for the callback > + * @fn: function to be called for each device > + * > + * Iterate over @class's list of devices, and call @fn for each, > + * passing it @data. > + * > + * We check the return of @fn each time. If it returns anything > + * other than 0, we break out and return that value. > + */ > +int class_for_each_device(struct class *class, void *data, > +int (*fn)(struct device *, void *)) > +{ > + struct device *dev; > + int error = 0; > + > + if (!class) > + return -EINVAL; > + down(>sem); > + list_for_each_entry(dev, >devices, node) { Probably some tiny oversight, but I see this comment to struct class doesn't mention devices list, so maybe this needs to be updated BTW?: (from include/linux/device.h) "struct semaphoresem;/* locks both the children and interfaces lists */" Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/7] driver-core : add class iteration api
On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote: Add the following class iteration functions for driver use: class_for_each_device class_find_device class_for_each_child class_find_child Signed-off-by: Dave Young [EMAIL PROTECTED] --- drivers/base/class.c | 159 + include/linux/device.h |8 ++ 2 files changed, 167 insertions(+) diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c --- linux/drivers/base/class.c2008-01-12 14:42:24.0 +0800 +++ linux.new/drivers/base/class.c2008-01-12 14:42:24.0 +0800 @@ -798,6 +798,165 @@ void class_device_put(struct class_devic kobject_put(class_dev-kobj); } +/** + * class_for_each_device - device iterator + * @class: the class we're iterating + * @data: data for the callback + * @fn: function to be called for each device + * + * Iterate over @class's list of devices, and call @fn for each, + * passing it @data. + * + * We check the return of @fn each time. If it returns anything + * other than 0, we break out and return that value. + */ +int class_for_each_device(struct class *class, void *data, +int (*fn)(struct device *, void *)) +{ + struct device *dev; + int error = 0; + + if (!class) + return -EINVAL; + down(class-sem); + list_for_each_entry(dev, class-devices, node) { Probably some tiny oversight, but I see this comment to struct class doesn't mention devices list, so maybe this needs to be updated BTW?: (from include/linux/device.h) struct semaphoresem;/* locks both the children and interfaces lists */ Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote: ... > diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c > new file mode 100644 > index 000..495575a > --- /dev/null > +++ b/lib/iommu-helper.c > @@ -0,0 +1,80 @@ > +/* > + * IOMMU helper functions for the free area management > + */ > + > +#include > +#include > + > +static unsigned long find_next_zero_area(unsigned long *map, > + unsigned long size, > + unsigned long start, > + unsigned int nr, > + unsigned long align_mask) > +{ > + unsigned long index, end, i; > +again: > + index = find_next_zero_bit(map, size, start); > + > + /* Align allocation */ > + index = (index + align_mask) & ~align_mask; > + > + end = index + nr; > + if (end >= size) > + return -1; This '>=' looks doubtful to me, e.g.: map points to 0s only, size = 64, nr = 64, we get: index = 0; end = 64; and: return -1 ?! Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote: ... diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c new file mode 100644 index 000..495575a --- /dev/null +++ b/lib/iommu-helper.c @@ -0,0 +1,80 @@ +/* + * IOMMU helper functions for the free area management + */ + +#include linux/module.h +#include linux/bitops.h + +static unsigned long find_next_zero_area(unsigned long *map, + unsigned long size, + unsigned long start, + unsigned int nr, + unsigned long align_mask) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; + + end = index + nr; + if (end = size) + return -1; This '=' looks doubtful to me, e.g.: map points to 0s only, size = 64, nr = 64, we get: index = 0; end = 64; and: return -1 ?! Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Top 10 kernel oopses for the week ending January 5th, 2008
On 08-01-2008 06:59, Al Viro wrote: > On Mon, Jan 07, 2008 at 07:26:12PM -0800, Linus Torvalds wrote: > >> I usually just compile a small program like >> >> const char array[]="\xnn\xnn\xnn..."; >> >> int main(int argc, char **argv) >> { >> printf("%p\n", array); >> *(int *)0=0; >> } > Heh. I prefer > char main[] = {.}; > for the same thing, with gdb a.out and no running at all. ... IMHO, it would be really wasteful if Arian havn't published these and maybe a few more such advices and links on this site, not necessarily waiting for html-ized or howto-ized versions! Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Mon, Jan 07, 2008 at 02:23:33PM +0100, Stefan Richter wrote: > David Brownell wrote: > > On Monday 07 January 2008, Greg KH wrote: > >> Most of the non-driver core code should be converted to not use the > >> lock in the class at all. They should use a local lock instead. > > > > Or better yet, that yet-to-be-written class_for_each_instance() > > iterator ... :) > > By far most of the usages of class.semaphore or class.mutex in drivers > are to protect the class.devices list. The only? right thing to do > there is to keep using the class.{semaphore,mutex}. The more elegant > variation of this would be David's class_for_each_instance() iterator > which would allow us to hide the locking details from the drivers. > > --- > ?) Well, another correct thing to do would be to not take any locks or > mutexes in the driver core but instead let the drivers do the necessary > serialization between calls like class_device_add() and the likes and > list iterations. But this would complicate the API because of the > additional locking requirements, and hence would invariably result in > buggy usages of the API. > -- I hope I'm wrong, but IMHO it should be safer not to mix such changes, so do the mutexes first or delay them until the end. Otherwise some false blaming seems almost inevitable. Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Mon, Jan 07, 2008 at 02:23:33PM +0100, Stefan Richter wrote: David Brownell wrote: On Monday 07 January 2008, Greg KH wrote: Most of the non-driver core code should be converted to not use the lock in the class at all. They should use a local lock instead. Or better yet, that yet-to-be-written class_for_each_instance() iterator ... :) By far most of the usages of class.semaphore or class.mutex in drivers are to protect the class.devices list. The only? right thing to do there is to keep using the class.{semaphore,mutex}. The more elegant variation of this would be David's class_for_each_instance() iterator which would allow us to hide the locking details from the drivers. --- ?) Well, another correct thing to do would be to not take any locks or mutexes in the driver core but instead let the drivers do the necessary serialization between calls like class_device_add() and the likes and list iterations. But this would complicate the API because of the additional locking requirements, and hence would invariably result in buggy usages of the API. -- I hope I'm wrong, but IMHO it should be safer not to mix such changes, so do the mutexes first or delay them until the end. Otherwise some false blaming seems almost inevitable. Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Top 10 kernel oopses for the week ending January 5th, 2008
On 08-01-2008 06:59, Al Viro wrote: On Mon, Jan 07, 2008 at 07:26:12PM -0800, Linus Torvalds wrote: I usually just compile a small program like const char array[]=\xnn\xnn\xnn...; int main(int argc, char **argv) { printf(%p\n, array); *(int *)0=0; } Heh. I prefer char main[] = {.}; for the same thing, with gdb a.out and no running at all. ... IMHO, it would be really wasteful if Arian havn't published these and maybe a few more such advices and links on this site, not necessarily waiting for html-ized or howto-ized versions! Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, Jan 06, 2008 at 11:30:48AM +0100, Torsten Kaiser wrote: ... > I think this bug is highly timing dependent. Its not always the same > package that dies and as this is a SMP system I would guess two CPUs > using the same data will trigger this. > And using the poison-option will definitily slow the system down and > mess up the timings. Of course it looks like using the same data, but it seems there is no reason to think it needs the same time: e.g. some timer or workqueue could retrigger after it's supposed to be killed. Any additional debugging/poisonning might help to see it earlier, so this should be safer for your system, but, most probably this would show data from the damaged side, so not necessarily very helpful. > What also speaks against the 'safer' offsets is, that after adding my > notfreed-byte to skbuff the bug still triggered in the same way. We are not even sure skbuffs were directly affected by this or they were incorrectly freed because of other structures beeing damaged? IMHO, e.g. starting your system with limited memory should cause faster memory reclaiming, and thus more often triggering of these bugs, but of course I can be wrong. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, Jan 05, 2008 at 03:52:32PM +0100, Torsten Kaiser wrote: ... > So my personal conclusion would be, that someone is writing to memory > that he no longer owns. Most probably 0-bytes. (the complete_routine > got NULLed and the warning about dst->__refcnt being 0). > > Use-after-free or something else? I agree: your conclusion seems to be the most probable explanation for this. Then it could be really hard to solve this without bisection or something similar. But there is some probabability this something could try kfree later too, but simply this list debugging triggers earlier. > > > If you think some other slub_debug might catch it, I would try this... You can try to add "U" to these other slub_debug options. As a matter of fact, if your above diagnose is right, it seems you risk to damage your system or even the box with these tests, so if you want to continue, you should probably turn any possible debugging on (not in mm only). BTW, you've written that some debugging options seem to delay the bug. Since they often change sizes of some structures than such wrong writes could have some 'safer' offsets. So, this could really delay e.g. these list's bugs, but maybe this could also let to stay 'alive' to such wrong kfree? Cheers, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, Jan 05, 2008 at 03:52:32PM +0100, Torsten Kaiser wrote: ... So my personal conclusion would be, that someone is writing to memory that he no longer owns. Most probably 0-bytes. (the complete_routine got NULLed and the warning about dst-__refcnt being 0). Use-after-free or something else? I agree: your conclusion seems to be the most probable explanation for this. Then it could be really hard to solve this without bisection or something similar. But there is some probabability this something could try kfree later too, but simply this list debugging triggers earlier. If you think some other slub_debug might catch it, I would try this... You can try to add U to these other slub_debug options. As a matter of fact, if your above diagnose is right, it seems you risk to damage your system or even the box with these tests, so if you want to continue, you should probably turn any possible debugging on (not in mm only). BTW, you've written that some debugging options seem to delay the bug. Since they often change sizes of some structures than such wrong writes could have some 'safer' offsets. So, this could really delay e.g. these list's bugs, but maybe this could also let to stay 'alive' to such wrong kfree? Cheers, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, Jan 06, 2008 at 11:30:48AM +0100, Torsten Kaiser wrote: ... I think this bug is highly timing dependent. Its not always the same package that dies and as this is a SMP system I would guess two CPUs using the same data will trigger this. And using the poison-option will definitily slow the system down and mess up the timings. Of course it looks like using the same data, but it seems there is no reason to think it needs the same time: e.g. some timer or workqueue could retrigger after it's supposed to be killed. Any additional debugging/poisonning might help to see it earlier, so this should be safer for your system, but, most probably this would show data from the damaged side, so not necessarily very helpful. What also speaks against the 'safer' offsets is, that after adding my notfreed-byte to skbuff the bug still triggered in the same way. We are not even sure skbuffs were directly affected by this or they were incorrectly freed because of other structures beeing damaged? IMHO, e.g. starting your system with limited memory should cause faster memory reclaiming, and thus more often triggering of these bugs, but of course I can be wrong. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote: > On Jan 5, 2008 1:07 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: > > On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote: > > > On Jan 4, 2008 2:30 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: > > > The only thing that is sadly not practical is bisecting the borkenout > > > mm-patches, as triggering this error is to unreliable / > > > time-consuming. > > > > Right, but it seems there are these 2 main suspects here... > > > > > > - is it still vanilla -rc6-mm1; I've seen on kernel list you tried > > > > some fixes around raid? > > > > > > Yes, without these fixes I can't boot. > > > But they should only be run during starting the arrays, so I doubt > > > that this is that cause. > > > (Also -rc3-mm2 did not need this fix) > > > > You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes? > > vanilla -rc6 is fine without these fixes. > The raid-bugs from -rc6-mm1 are probably introduced by > md-allow-devices-to-be-shared-between-md-arrays.patch and that patch > is new in this mm-release. > > > I think it would be easier just to start with this working -rc6 and > > simply check if we have 'right' suspects, so: git-net.patch and > > git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope, > > can compile - otherwise you could try the other way: add the whole -mm > > and revert these two). Using current gits could complicate this > > "investigation". > > OK, I will try this... It seems that this last report gives the third one: ieee1394 to the pack, so probably, you can hold on a "minute" - this all needs some rethinking. (But, if you've begun with this already, let it be clear at last too.) > > I didn't read all this thread, so probably I miss many points, but are > > you sure there are no problems with filesystem corruption around these > > packets or where you compile(?) them (e.g. after these raid problems)? > > For my setup: It's a gentoo system, so compiling packages is the > normal way of installing something. > The compile itself is done on a tmpfs so a filesystem corruption there > should be rather impossible. ;) > (The system has 4Gb RAM, so it doesn't even need to swap) > The sources are taken from a nfsv4 share that is served from a > different system. Also gentoo checksums all sources it will use. Yes, since this was no problem with vanilla 2.6.24-rc6, I've probably gone astray... > If you think some other slub_debug might catch it, I would try this... OK! But, in the meantime could you send your current .config? I wonder e.g. if there could be used this new ieee1394 code from init_ohci1394_dma.c? You are really helpful, thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote: On Jan 5, 2008 1:07 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote: On Jan 4, 2008 2:30 PM, Jarek Poplawski [EMAIL PROTECTED] wrote: The only thing that is sadly not practical is bisecting the borkenout mm-patches, as triggering this error is to unreliable / time-consuming. Right, but it seems there are these 2 main suspects here... - is it still vanilla -rc6-mm1; I've seen on kernel list you tried some fixes around raid? Yes, without these fixes I can't boot. But they should only be run during starting the arrays, so I doubt that this is that cause. (Also -rc3-mm2 did not need this fix) You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes? vanilla -rc6 is fine without these fixes. The raid-bugs from -rc6-mm1 are probably introduced by md-allow-devices-to-be-shared-between-md-arrays.patch and that patch is new in this mm-release. I think it would be easier just to start with this working -rc6 and simply check if we have 'right' suspects, so: git-net.patch and git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope, can compile - otherwise you could try the other way: add the whole -mm and revert these two). Using current gits could complicate this investigation. OK, I will try this... It seems that this last report gives the third one: ieee1394 to the pack, so probably, you can hold on a minute - this all needs some rethinking. (But, if you've begun with this already, let it be clear at last too.) I didn't read all this thread, so probably I miss many points, but are you sure there are no problems with filesystem corruption around these packets or where you compile(?) them (e.g. after these raid problems)? For my setup: It's a gentoo system, so compiling packages is the normal way of installing something. The compile itself is done on a tmpfs so a filesystem corruption there should be rather impossible. ;) (The system has 4Gb RAM, so it doesn't even need to swap) The sources are taken from a nfsv4 share that is served from a different system. Also gentoo checksums all sources it will use. Yes, since this was no problem with vanilla 2.6.24-rc6, I've probably gone astray... If you think some other slub_debug might catch it, I would try this... OK! But, in the meantime could you send your current .config? I wonder e.g. if there could be used this new ieee1394 code from init_ohci1394_dma.c? You are really helpful, thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote: > On Jan 4, 2008 2:30 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: ... > I'm open for any suggestions and will try to answer any questions. I'm very glad, thanks! > The only thing that is sadly not practical is bisecting the borkenout > mm-patches, as triggering this error is to unreliable / > time-consuming. Right, but it seems there are these 2 main suspects here... > > - is it still vanilla -rc6-mm1; I've seen on kernel list you tried > > some fixes around raid? > > Yes, without these fixes I can't boot. > But they should only be run during starting the arrays, so I doubt > that this is that cause. > (Also -rc3-mm2 did not need this fix) You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes? I think it would be easier just to start with this working -rc6 and simply check if we have 'right' suspects, so: git-net.patch and git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope, can compile - otherwise you could try the other way: add the whole -mm and revert these two). Using current gits could complicate this "investigation". > My skbuff-double-free-detector is still in there, but was never triggered. > > > - could you remind this lockdep warning; is it always and the same, > > always before crash, or no rules? > > ??? > I see no lockdep warning before the crashes. > I have seen a warning about the dst->__refcnt in dst_release and > different warnings about list operations. > > I think I have always posted everything I have seen before the > crashes. (captured via serial console) So, you mean there are no more of these?: "looked into the log in question and the only other warning was a circular locking dependency that lockdep detected around 1.5 hour before this warning." ... "[ 7620.845168] INFO: lockdep is turned off." > (If you mean the lockdep-problem in -rc6: That is more or less a > missing annotation during early bootup. The only problem with that is, > that it will causes lockdep to be turned off and so it can not be used > to find any real problem. A fix for that is in -mm so I do have > lockdep on the mm-kernels) > > > - I've seen you looked after double freeing, but this last debug list > > warning could suggest locking problems during list modification too. > > Yes, but Herbert mentioned double freeing a skb explicit and so I > tried to catch this. > I do not know enough about the network core to verify the locking of > the involved lists. Right, the list corruption could be because of use after freeing too. > > - above git-nfsd and git-net tests should be probably repeated with > > -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches > > only, and if bug triggers, with one reversed; btw., since in previous > > message you mentioned that 50 packages could be not enough to trigger > > this, these 54 above could make too little margin yet. > > Yes, I think I really need to redo the git-nfsd-test. > With IOMMU_DEBUG enabled rc6-mm1worked for 52 packages, only a secound > run of kde-packages triggered it after only 5 packages. > I don't know what this bug hates about kdeartwork-wallpaper (triggered > it this time) or kdeartwork-styles. I didn't read all this thread, so probably I miss many points, but are you sure there are no problems with filesystem corruption around these packets or where you compile(?) them (e.g. after these raid problems)? > Output from the crash with IOMMU_DEBUG (lockdep was enabled, but did > not trigger): > [15593.236374] Unable to handle kernel NULL pointer > dereference<3>list_add corruption. prev->next should be next Fine! I'll try to look at this. BTW, I guess/hope DEBUG_SLAB etc. are also on... Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On 04-01-2008 11:23, Torsten Kaiser wrote: > On Jan 2, 2008 10:51 PM, Herbert Xu <[EMAIL PROTECTED]> wrote: >> On Wed, Jan 02, 2008 at 07:29:59PM +0100, Torsten Kaiser wrote: >>> Vanilla 2.6.24-rc6 seems stable. I did not see any crash or warnings. >> OK that's great. The next step would be to try excluding specific git >> trees from mm to see if they make a difference. >> >> The two specific trees of interest would be git-nfsd and git-net. > > git-nfsd from git://git.linux-nfs.org/projects/bfields/linux.git#for-mm > -> compiling and installing 54 packages worked without crashes. > > git-net from > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git > -> compiling and installing 95 packages worked without crashes. ... > I will enable CONFIG_IOMMU_DEBUG in -rc6-mm1 and see, as otherwise I > have no clue where to look... Hi, A few questions/suggestions: - is it still vanilla -rc6-mm1; I've seen on kernel list you tried some fixes around raid? - could you remind this lockdep warning; is it always and the same, always before crash, or no rules? - I've seen you looked after double freeing, but this last debug list warning could suggest locking problems during list modification too. - above git-nfsd and git-net tests should be probably repeated with -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches only, and if bug triggers, with one reversed; btw., since in previous message you mentioned that 50 packages could be not enough to trigger this, these 54 above could make too little margin yet. Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On 04-01-2008 11:23, Torsten Kaiser wrote: On Jan 2, 2008 10:51 PM, Herbert Xu [EMAIL PROTECTED] wrote: On Wed, Jan 02, 2008 at 07:29:59PM +0100, Torsten Kaiser wrote: Vanilla 2.6.24-rc6 seems stable. I did not see any crash or warnings. OK that's great. The next step would be to try excluding specific git trees from mm to see if they make a difference. The two specific trees of interest would be git-nfsd and git-net. git-nfsd from git://git.linux-nfs.org/projects/bfields/linux.git#for-mm - compiling and installing 54 packages worked without crashes. git-net from git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git - compiling and installing 95 packages worked without crashes. ... I will enable CONFIG_IOMMU_DEBUG in -rc6-mm1 and see, as otherwise I have no clue where to look... Hi, A few questions/suggestions: - is it still vanilla -rc6-mm1; I've seen on kernel list you tried some fixes around raid? - could you remind this lockdep warning; is it always and the same, always before crash, or no rules? - I've seen you looked after double freeing, but this last debug list warning could suggest locking problems during list modification too. - above git-nfsd and git-net tests should be probably repeated with -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches only, and if bug triggers, with one reversed; btw., since in previous message you mentioned that 50 packages could be not enough to trigger this, these 54 above could make too little margin yet. Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote: On Jan 4, 2008 2:30 PM, Jarek Poplawski [EMAIL PROTECTED] wrote: ... I'm open for any suggestions and will try to answer any questions. I'm very glad, thanks! The only thing that is sadly not practical is bisecting the borkenout mm-patches, as triggering this error is to unreliable / time-consuming. Right, but it seems there are these 2 main suspects here... - is it still vanilla -rc6-mm1; I've seen on kernel list you tried some fixes around raid? Yes, without these fixes I can't boot. But they should only be run during starting the arrays, so I doubt that this is that cause. (Also -rc3-mm2 did not need this fix) You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes? I think it would be easier just to start with this working -rc6 and simply check if we have 'right' suspects, so: git-net.patch and git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope, can compile - otherwise you could try the other way: add the whole -mm and revert these two). Using current gits could complicate this investigation. My skbuff-double-free-detector is still in there, but was never triggered. - could you remind this lockdep warning; is it always and the same, always before crash, or no rules? ??? I see no lockdep warning before the crashes. I have seen a warning about the dst-__refcnt in dst_release and different warnings about list operations. I think I have always posted everything I have seen before the crashes. (captured via serial console) So, you mean there are no more of these?: looked into the log in question and the only other warning was a circular locking dependency that lockdep detected around 1.5 hour before this warning. ... [ 7620.845168] INFO: lockdep is turned off. (If you mean the lockdep-problem in -rc6: That is more or less a missing annotation during early bootup. The only problem with that is, that it will causes lockdep to be turned off and so it can not be used to find any real problem. A fix for that is in -mm so I do have lockdep on the mm-kernels) - I've seen you looked after double freeing, but this last debug list warning could suggest locking problems during list modification too. Yes, but Herbert mentioned double freeing a skb explicit and so I tried to catch this. I do not know enough about the network core to verify the locking of the involved lists. Right, the list corruption could be because of use after freeing too. - above git-nfsd and git-net tests should be probably repeated with -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches only, and if bug triggers, with one reversed; btw., since in previous message you mentioned that 50 packages could be not enough to trigger this, these 54 above could make too little margin yet. Yes, I think I really need to redo the git-nfsd-test. With IOMMU_DEBUG enabled rc6-mm1worked for 52 packages, only a secound run of kde-packages triggered it after only 5 packages. I don't know what this bug hates about kdeartwork-wallpaper (triggered it this time) or kdeartwork-styles. I didn't read all this thread, so probably I miss many points, but are you sure there are no problems with filesystem corruption around these packets or where you compile(?) them (e.g. after these raid problems)? Output from the crash with IOMMU_DEBUG (lockdep was enabled, but did not trigger): [15593.236374] Unable to handle kernel NULL pointer dereference3list_add corruption. prev-next should be next Fine! I'll try to look at this. BTW, I guess/hope DEBUG_SLAB etc. are also on... Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 03:21:36PM +0800, Dave Young wrote: ... > I don't know if there's other possible warning places with this mutex > or not, if you have any ideas about this, please tell me. I think lockdep is just to tell such things. So, the question is, how much it was tested already, because if there are many warnings reported e.g. after merging to -mm, then this could be better to re-do it this other way... But, I hope this will not be necessary. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 08:06:09AM +0100, Jarek Poplawski wrote: > On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote: > > Convert semaphore to mutex in struct class. > ... > > One lockdep warning detected as following, thus use mutex_lock_nested with > > SINGLE_DEPTH_NESTING in class_device_add > > > > Jan 3 10:45:15 darkstar kernel: > > = > > Jan 3 10:45:15 darkstar kernel: [ INFO: possible recursive locking > > detected ] > > Jan 3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1 > > Jan 3 10:45:15 darkstar kernel: > > - > ... > > If there's anything missed please help to point out, thanks. > > Dave, IMHO it's not 'the right' way to do it: [...] OOPS! (I was sleeping...) Unless it has turned out it's not so hard here, and you are quite sure there should be no more warnings after this one nesting annotation - then of course, this is the right way! Sorry (?) Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote: > Convert semaphore to mutex in struct class. ... > One lockdep warning detected as following, thus use mutex_lock_nested with > SINGLE_DEPTH_NESTING in class_device_add > > Jan 3 10:45:15 darkstar kernel: = > Jan 3 10:45:15 darkstar kernel: [ INFO: possible recursive locking detected ] > Jan 3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1 > Jan 3 10:45:15 darkstar kernel: - ... > If there's anything missed please help to point out, thanks. Dave, IMHO it's not 'the right' way to do it: from this and earlier discussions it seems there could be many more warnings like this one; lockdep simply always turns itself off after first one. So, merging your patches like this would effectively turn off lockdep for all other places as well, maybe for a long time. I'd suggest to try first to do it with some wrappers around mutexes, which simply omit lockdep verification, and later try to replace them one by one, after checking and testing there are no such warnings anymore (which would often need some additional annotations about nesting and probably some changes in lockdep too). Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/12] Use mutex instead of semaphore in driver core
On Wed, Jan 02, 2008 at 01:39:38PM +0100, Jarek Poplawski wrote: > On 02-01-2008 08:00, Greg KH wrote: > ... > > If no one has noticed any issues in this area, [...] BTW, if 'we' are sure there are no issues, and only lockdep is not clever enough yet, why not do such a change partially, e.g. with lockdep_on/off, at least to -mm, for testing? Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/12] Use mutex instead of semaphore in driver core
On 02-01-2008 08:00, Greg KH wrote: ... > If no one has noticed any issues in this area, [...] ...Could also mean there are hidden issues, so it doesn't look like very convincing argument. ...Unless after the change there will be found no hidden issues, then, of course, it looks like convincing enough. Regards, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/12] Use mutex instead of semaphore in driver core
On 02-01-2008 08:00, Greg KH wrote: ... If no one has noticed any issues in this area, [...] ...Could also mean there are hidden issues, so it doesn't look like very convincing argument. ...Unless after the change there will be found no hidden issues, then, of course, it looks like convincing enough. Regards, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/12] Use mutex instead of semaphore in driver core
On Wed, Jan 02, 2008 at 01:39:38PM +0100, Jarek Poplawski wrote: On 02-01-2008 08:00, Greg KH wrote: ... If no one has noticed any issues in this area, [...] BTW, if 'we' are sure there are no issues, and only lockdep is not clever enough yet, why not do such a change partially, e.g. with lockdep_on/off, at least to -mm, for testing? Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote: Convert semaphore to mutex in struct class. ... One lockdep warning detected as following, thus use mutex_lock_nested with SINGLE_DEPTH_NESTING in class_device_add Jan 3 10:45:15 darkstar kernel: = Jan 3 10:45:15 darkstar kernel: [ INFO: possible recursive locking detected ] Jan 3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1 Jan 3 10:45:15 darkstar kernel: - ... If there's anything missed please help to point out, thanks. Dave, IMHO it's not 'the right' way to do it: from this and earlier discussions it seems there could be many more warnings like this one; lockdep simply always turns itself off after first one. So, merging your patches like this would effectively turn off lockdep for all other places as well, maybe for a long time. I'd suggest to try first to do it with some wrappers around mutexes, which simply omit lockdep verification, and later try to replace them one by one, after checking and testing there are no such warnings anymore (which would often need some additional annotations about nesting and probably some changes in lockdep too). Thanks, Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 08:06:09AM +0100, Jarek Poplawski wrote: On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote: Convert semaphore to mutex in struct class. ... One lockdep warning detected as following, thus use mutex_lock_nested with SINGLE_DEPTH_NESTING in class_device_add Jan 3 10:45:15 darkstar kernel: = Jan 3 10:45:15 darkstar kernel: [ INFO: possible recursive locking detected ] Jan 3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1 Jan 3 10:45:15 darkstar kernel: - ... If there's anything missed please help to point out, thanks. Dave, IMHO it's not 'the right' way to do it: [...] OOPS! (I was sleeping...) Unless it has turned out it's not so hard here, and you are quite sure there should be no more warnings after this one nesting annotation - then of course, this is the right way! Sorry (?) Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Thu, Jan 03, 2008 at 03:21:36PM +0800, Dave Young wrote: ... I don't know if there's other possible warning places with this mutex or not, if you have any ideas about this, please tell me. I think lockdep is just to tell such things. So, the question is, how much it was tested already, because if there are many warnings reported e.g. after merging to -mm, then this could be better to re-do it this other way... But, I hope this will not be necessary. Jarek P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] make I/O schedulers non-modular
Jarek Poplawski wrote, On 11/27/2007 11:15 PM: > Adrian Bunk wrote, On 11/27/2007 05:47 PM: ... >> There is nothing like a "right of choice". (very late) PS: ...I was a bit confused with this, wondering: so, we've envied you (the West) this "thing" for so many years, and now it seems, you have no idea what's this all about?! Happily it was only my English: http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less "Freedom of choice" was the right term! Regards, Jarek P. PPS: But, of course, no need to discuss this more... unless we're interested in the next Nobel Prize. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] make I/O schedulers non-modular
Jarek Poplawski wrote, On 11/27/2007 11:15 PM: Adrian Bunk wrote, On 11/27/2007 05:47 PM: ... There is nothing like a right of choice. (very late) PS: ...I was a bit confused with this, wondering: so, we've envied you (the West) this thing for so many years, and now it seems, you have no idea what's this all about?! Happily it was only my English: http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less Freedom of choice was the right term! Regards, Jarek P. PPS: But, of course, no need to discuss this more... unless we're interested in the next Nobel Prize. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Strange Panic (Deadlock)
[EMAIL PROTECTED] wrote, On 12/24/2007 07:18 PM: > Hello again. > Its bug depend to > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4aae07025265151e3f7041dfbf0f529e122de1d8 > ? Hello Vyacheslav! I wonder why do you think there is such a dependency, and why do you report timer.c bug to netdev after all? I added some CCs here, but IMHO you would better open a new bug at bugzilla.kernel.org, adding some more details like .config, and reply back to this thread with the bug's number. BTW, if it's patched by Gentoo or otherwise, you should try and report on 'vanilla' one only. Regards, Jarek P. > Its very critical bug to us. This PC must be HA. Server place so far > for me to go and reboot server. I go to it 1-3 times in week =( > Please help to fix it =) > >> Hello all. Some time machine freeze. No information on monitor. No >> rebooting on sysctl "kernel.panic". >> Any idea? >> >> Catched by netconsole: >> [91922.085864] [ cut here ] >> [91922.085975] kernel BUG at kernel/timer.c:606! >> [91922.086058] invalid opcode: [#1] >> [91922.086127] SMP >> [91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb >> xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core >> [91922.086386] CPU:1 >> [91922.086387] EIP:0060:[]Not tainted VLI >> [91922.086389] EFLAGS: 00010087 (2.6.23-gentoo-r4-fw #4) >> [91922.086600] EIP is at cascade+0x34/0x4f >> [91922.086669] eax: c0452200 ebx: f450408c ecx: 0022 edx: f3c6e08c >> [91922.086740] esi: 0022 edi: c21ce000 ebp: 0001 esp: c21c3ef8 >> [91922.086815] ds: 007b es: 007b fs: 00d8 gs: ss: 0068 >> [91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000 >> task.ti=c21c2000) >> [91922.086954] Stack: f3c6e08c c21bfb74 c21ce000 000a >> c012767a c21af000 0001 >> [91922.087119]c21c3f18 c0106963 c21c3f68 0001 0021 >> c03c0b08 000a c0124556 >> [91922.087285]0046 c21c2008 c01245ec >> c2015120 c0114a11 0046 >> [91922.087451] Call Trace: >> [91922.087586] [] run_timer_softirq+0x51/0x154 >> [91922.087669] [] profile_pc+0x21/0x46 >> [91922.087752] [] __do_softirq+0x5d/0xc1 >> [91922.087833] [] do_softirq+0x32/0x36 >> [91922.087915] [] smp_apic_timer_interrupt+0x74/0x80 >> [91922.087997] [] apic_timer_interrupt+0x28/0x30 >> [91922.088076] [] mwait_idle_with_hints+0x3b/0x3f >> [91922.088162] [] mwait_idle+0x0/0xa >> [91922.088237] [] cpu_idle+0x91/0xaa >> [91922.088319] === >> [91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22 >> 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8 >> 74 04 <0f> 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89 >> [91922.088864] EIP: [] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to [EMAIL PROTECTED] >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > > > > > This message was sent using IMP, the Internet Messaging Program. > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Strange Panic (Deadlock)
[EMAIL PROTECTED] wrote, On 12/24/2007 07:18 PM: Hello again. Its bug depend to http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4aae07025265151e3f7041dfbf0f529e122de1d8 ? Hello Vyacheslav! I wonder why do you think there is such a dependency, and why do you report timer.c bug to netdev after all? I added some CCs here, but IMHO you would better open a new bug at bugzilla.kernel.org, adding some more details like .config, and reply back to this thread with the bug's number. BTW, if it's patched by Gentoo or otherwise, you should try and report on 'vanilla' one only. Regards, Jarek P. Its very critical bug to us. This PC must be HA. Server place so far for me to go and reboot server. I go to it 1-3 times in week =( Please help to fix it =) Hello all. Some time machine freeze. No information on monitor. No rebooting on sysctl kernel.panic. Any idea? Catched by netconsole: [91922.085864] [ cut here ] [91922.085975] kernel BUG at kernel/timer.c:606! [91922.086058] invalid opcode: [#1] [91922.086127] SMP [91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core [91922.086386] CPU:1 [91922.086387] EIP:0060:[c0127387]Not tainted VLI [91922.086389] EFLAGS: 00010087 (2.6.23-gentoo-r4-fw #4) [91922.086600] EIP is at cascade+0x34/0x4f [91922.086669] eax: c0452200 ebx: f450408c ecx: 0022 edx: f3c6e08c [91922.086740] esi: 0022 edi: c21ce000 ebp: 0001 esp: c21c3ef8 [91922.086815] ds: 007b es: 007b fs: 00d8 gs: ss: 0068 [91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000 task.ti=c21c2000) [91922.086954] Stack: f3c6e08c c21bfb74 c21ce000 000a c012767a c21af000 0001 [91922.087119]c21c3f18 c0106963 c21c3f68 0001 0021 c03c0b08 000a c0124556 [91922.087285]0046 c21c2008 c01245ec c2015120 c0114a11 0046 [91922.087451] Call Trace: [91922.087586] [c012767a] run_timer_softirq+0x51/0x154 [91922.087669] [c0106963] profile_pc+0x21/0x46 [91922.087752] [c0124556] __do_softirq+0x5d/0xc1 [91922.087833] [c01245ec] do_softirq+0x32/0x36 [91922.087915] [c0114a11] smp_apic_timer_interrupt+0x74/0x80 [91922.087997] [c010484c] apic_timer_interrupt+0x28/0x30 [91922.088076] [c0102255] mwait_idle_with_hints+0x3b/0x3f [91922.088162] [c0102259] mwait_idle+0x0/0xa [91922.088237] [c0102398] cpu_idle+0x91/0xaa [91922.088319] === [91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8 74 04 0f 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89 [91922.088864] EIP: [c0127387] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html This message was sent using IMP, the Internet Messaging Program. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/