from:"Jarek Poplawski"

Re: [BUG?] APM is hidden in menuconfig

2008-02-21 Thread Jarek Poplawski

On Thu, Feb 21, 2008 at 07:43:46AM +, Jarek Poplawski wrote:
...
> ...Or at least to mention APM in SUSPEND title and description.
> Actually, this is really strange: both SUSPEND and PM_SLEEP have
> default = y. So it seems they are intended to be more "advertised"
> than they are?

Now I see it could be even stranger: it *is* visible by default(!)
after make defconfig. make menuconfig does something else: it seems
to "help me" and sees some other config (current kernel is in other
directory) even after deleting .config and make mrproper. So, maybe
this started because of this and the problem is with changing this
back.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG?] APM is hidden in menuconfig

2008-02-21 Thread Jarek Poplawski

On Thu, Feb 21, 2008 at 07:43:46AM +, Jarek Poplawski wrote:
...
 ...Or at least to mention APM in SUSPEND title and description.
 Actually, this is really strange: both SUSPEND and PM_SLEEP have
 default = y. So it seems they are intended to be more advertised
 than they are?

Now I see it could be even stranger: it *is* visible by default(!)
after make defconfig. make menuconfig does something else: it seems
to help me and sees some other config (current kernel is in other
directory) even after deleting .config and make mrproper. So, maybe
this started because of this and the problem is with changing this
back.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG?] APM is hidden in menuconfig

2008-02-20 Thread Jarek Poplawski

On Thu, Feb 21, 2008 at 02:21:52AM +0100, Rafael J. Wysocki wrote:
> On Thursday, 21 of February 2008, Frans Pop wrote:
> > Rafael J. Wysocki wrote:
> > > On Wednesday, 20 of February 2008, Jarek Poplawski wrote:
> > >> So, has it to be so hard? It seems not - at least in good old times...
> > > 
> > > Something in APM uses some code from drivers/base/power/main.c that
> > > depends on PM_SLEEP.
> > 
> > Sure, but that does not make Jarek's point invalid. From a user PoV APM is a
> > high-level configuration option. That is what he wants.
> > It should thus be easily accessible and not be buried beneath a lot of
> > other, much more technical, options.

Probably Frans is right: this should be my point... But I'm not so
"greedy", and I would be happy if it were at least more visible. It
simply seems to me quite not obvious to even think about turning
SUSPEND on when I have problems with a basic acpi function.

Even more interesting question is why this APM or PM_SLEEP dependency
on SUSPEND (or HIBERNATION) isn't visible with "/" searching: PM_SLEEP
looks like some "hidden" option - that's why I tried first to find some
comment in arch/ instead of simply reading Kconfig.

> > Could this maybe be solved by making APM automatically 'select' some options
> > instead of 'depending' on them?
> 
> That, unfortunately, doesn't work.
> 
> IMO the solution might be to separate the APM suspend code from the rest of 
> the
> APM code and make it depend on (PM_SUSPEND && APM).

...Or at least to mention APM in SUSPEND title and description.
Actually, this is really strange: both SUSPEND and PM_SLEEP have
default = y. So it seems they are intended to be more "advertised"
than they are?

Thanks & regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[BUG?] APM is hidden in menuconfig

2008-02-20 Thread Jarek Poplawski

Hi,

I needed APM to have poweroff on old box. So, in 2.6.24.2 menuconfig:

1) Power management options -->
   No APM.
2) [*] Power Management support
   No APM. I can see ACPI...
3) I try searching with "/" + "APM"
   APM [=n]
   Depends on: !X86_VOYAGER && X86_32 && PM_SLEEP && !X86_VISWS
4) I check above: all correct except:
   Symbol: PM_SLEEP [=n]
   No more "depends on".
5) Exit menuconfig - enter grep:
6) Some time wasting in arch/.
7) less kernel/power/Kconfig:
   config PM_SLEEP
bool
depends on SUSPEND || HIBERNATION
default y
   ... 
   config SUSPEND
bool "Suspend to RAM and standby"
depends on PM
depends on SUSPEND_UP_POSSIBLE || SUSPEND_SMP_POSSIBLE
default y
---help---
  Allow the system to enter sleep states in which main memory is
  powered and thus its contents are preserved, such as the
  suspend-to-RAM state (i.e. the ACPI S3 state).

8) Back to menuconfig:
   [*] Suspend to RAM and standby

   Bingo! APM revealed! (And not even "DEPRECATED"!)

So, has it to be so hard? It seems not - at least in good old times...

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[BUG?] APM is hidden in menuconfig

2008-02-20 Thread Jarek Poplawski

Hi,

I needed APM to have poweroff on old box. So, in 2.6.24.2 menuconfig:

1) Power management options --
   No APM.
2) [*] Power Management support
   No APM. I can see ACPI...
3) I try searching with / + APM
   APM [=n]
   Depends on: !X86_VOYAGER  X86_32  PM_SLEEP  !X86_VISWS
4) I check above: all correct except:
   Symbol: PM_SLEEP [=n]
   No more depends on.
5) Exit menuconfig - enter grep:
6) Some time wasting in arch/.
7) less kernel/power/Kconfig:
   config PM_SLEEP
bool
depends on SUSPEND || HIBERNATION
default y
   ... 
   config SUSPEND
bool Suspend to RAM and standby
depends on PM
depends on SUSPEND_UP_POSSIBLE || SUSPEND_SMP_POSSIBLE
default y
---help---
  Allow the system to enter sleep states in which main memory is
  powered and thus its contents are preserved, such as the
  suspend-to-RAM state (i.e. the ACPI S3 state).

8) Back to menuconfig:
   [*] Suspend to RAM and standby

   Bingo! APM revealed! (And not even DEPRECATED!)

So, has it to be so hard? It seems not - at least in good old times...

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG?] APM is hidden in menuconfig

2008-02-20 Thread Jarek Poplawski

On Thu, Feb 21, 2008 at 02:21:52AM +0100, Rafael J. Wysocki wrote:
 On Thursday, 21 of February 2008, Frans Pop wrote:
  Rafael J. Wysocki wrote:
   On Wednesday, 20 of February 2008, Jarek Poplawski wrote:
   So, has it to be so hard? It seems not - at least in good old times...
   
   Something in APM uses some code from drivers/base/power/main.c that
   depends on PM_SLEEP.
  
  Sure, but that does not make Jarek's point invalid. From a user PoV APM is a
  high-level configuration option. That is what he wants.
  It should thus be easily accessible and not be buried beneath a lot of
  other, much more technical, options.

Probably Frans is right: this should be my point... But I'm not so
greedy, and I would be happy if it were at least more visible. It
simply seems to me quite not obvious to even think about turning
SUSPEND on when I have problems with a basic acpi function.

Even more interesting question is why this APM or PM_SLEEP dependency
on SUSPEND (or HIBERNATION) isn't visible with / searching: PM_SLEEP
looks like some hidden option - that's why I tried first to find some
comment in arch/ instead of simply reading Kconfig.

  Could this maybe be solved by making APM automatically 'select' some options
  instead of 'depending' on them?
 
 That, unfortunately, doesn't work.
 
 IMO the solution might be to separate the APM suspend code from the rest of 
 the
 APM code and make it depend on (PM_SUSPEND  APM).

...Or at least to mention APM in SUSPEND title and description.
Actually, this is really strange: both SUSPEND and PM_SLEEP have
default = y. So it seems they are intended to be more advertised
than they are?

Thanks  regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.25 patch] fix broken error handling in ieee80211_sta_process_addba_request()

2008-02-19 Thread Jarek Poplawski

On 19-02-2008 23:58, Adrian Bunk wrote:
...
> --- a/net/mac80211/ieee80211_sta.c
> +++ b/net/mac80211/ieee80211_sta.c
> @@ -1116,9 +1116,10 @@ static void ieee80211_sta_process_addba_request(struct 
> net_device *dev,
...
> + printk(KERN_ERR "can not allocate reordering buffer "

  + printk(KERN_ERR "cannot allocate reordering buffer "

Probably this can be fixed during the commit.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.25 patch] fix broken error handling in ieee80211_sta_process_addba_request()

2008-02-19 Thread Jarek Poplawski

On 19-02-2008 23:58, Adrian Bunk wrote:
...
 --- a/net/mac80211/ieee80211_sta.c
 +++ b/net/mac80211/ieee80211_sta.c
 @@ -1116,9 +1116,10 @@ static void ieee80211_sta_process_addba_request(struct 
 net_device *dev,
...
 + printk(KERN_ERR can not allocate reordering buffer 

  + printk(KERN_ERR cannot allocate reordering buffer 

Probably this can be fixed during the commit.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies

2008-02-17 Thread Jarek Poplawski

On Mon, Feb 18, 2008 at 02:45:56AM +0300, Oleg Nesterov wrote:
> On 02/17, Jarek Poplawski wrote:
...
> > 1) ... workqueue_cpu_callback(...)
...
> Yes, but this is harmless. cpu-hotplug callbacks are not time-critical,
> and cpu_down/cpu_up happens not often, and LIST_HEAD(workqueues) is not
> very long, so ...
> 
> > 2) ... __create_workqueue_key(...)
...
> > Shouldn't this list_add() be done after all these inits below?
> This doesn't matter. Please note that get_online_cpus() blocks
> cpu_up/cpu_down, they take cpu_hotplug_begin().

You are completely right. It looks like this was only about ..."look".
(But adding only "default:" in 1) would make it look nicer to me...)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies

2008-02-17 Thread Jarek Poplawski

Hi Oleg,

This patch looks OK to me. But while reading this I got some doubts
in nearby places, so BTW 2 small questions:

1) ... workqueue_cpu_callback(...)
{
...
list_for_each_entry(wq, , list) {
cwq = per_cpu_ptr(wq->cpu_wq, cpu);

switch (action) {
case CPU_UP_PREPARE:
...

It looks like not all CPU_ cases are served here: shouldn't
list_for_each_entry() be omitted for them?

2) ... __create_workqueue_key(...)
{
...
if (singlethread) {
...
} else {
get_online_cpus();
spin_lock(_lock);
list_add(>list, );

Shouldn't this list_add() be done after all these inits below?

spin_unlock(_lock);

for_each_possible_cpu(cpu) {
cwq = init_cpu_workqueue(wq, cpu);
...
}
...
Thanks,
Jarek P.
 

On Sat, Feb 16, 2008 at 08:22:59PM +0300, Oleg Nesterov wrote:
> When cpu_populated_map was introduced, it was supposed that cwq->thread can
> survive after CPU_DEAD, that is why we never shrink cpu_populated_map.
> 
> This is not very nice, we can safely remove the already dead CPU from the map.
> The only required change is that destroy_workqueue() must hold the hotplug 
> lock
> until it destroys all cwq->thread's, to protect the cpu_populated_map. We 
> could
> make the local copy of cpu mask and drop the lock, but sizeof(cpumask_t) may 
> be
> very large.
> 
> Also, fix the comment near queue_work(). Unless _cpu_down() happens we do
> guarantee the cpu-affinity of the work_struct, and we have users which rely on
> this.
> 
> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] workqueues: shrink cpu_populated_map when CPU dies

2008-02-17 Thread Jarek Poplawski

On Mon, Feb 18, 2008 at 02:45:56AM +0300, Oleg Nesterov wrote:
 On 02/17, Jarek Poplawski wrote:
...
  1) ... workqueue_cpu_callback(...)
...
 Yes, but this is harmless. cpu-hotplug callbacks are not time-critical,
 and cpu_down/cpu_up happens not often, and LIST_HEAD(workqueues) is not
 very long, so ...
 
  2) ... __create_workqueue_key(...)
...
  Shouldn't this list_add() be done after all these inits below?
 This doesn't matter. Please note that get_online_cpus() blocks
 cpu_up/cpu_down, they take cpu_hotplug_begin().

You are completely right. It looks like this was only about ...look.
(But adding only default: in 1) would make it look nicer to me...)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Jarek Poplawski wrote, On 02/15/2008 10:03 PM:
...

> ...On the other hand this:
> 
>> Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, 
>> ksoftirqd/1/7, f0551180
> 
> seems to point just at spinlock lockup, so it's more about the full report.
> I wonder if this patch to prink could help here:
> 
> authorIngo Molnar 
>   Fri, 25 Jan 2008 20:07:58 + (21:07 +0100)
> printk: make printk more robust by not allowing recursion
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e


...or maybe a patch like this attached here?

Jarek P.
diff --git a/lib/spinlock_debug.c b/lib/spinlock_debug.c
index 9c4b025..21c8aaa 100644
--- a/lib/spinlock_debug.c
+++ b/lib/spinlock_debug.c
@@ -111,8 +111,7 @@ static void __spin_lock_debug(spinlock_t *lock)
 			__delay(1);
 		}
 		/* lockup suspected: */
-		if (print_once) {
-			print_once = 0;
+		if (print_once == 1) {
 			printk(KERN_EMERG "BUG: spinlock lockup on CPU#%d, "
 	"%s/%d, %p\n",
 raw_smp_processor_id(), current->comm,
@@ -122,7 +121,14 @@ static void __spin_lock_debug(spinlock_t *lock)
 			trigger_all_cpu_backtrace();
 #endif
 		}
+		if (print_once++ > 1000)
+			goto out;
 	}
+	return;
+out:
+	panic("spinlock lockup panic #%llu\n", i);
+	// or:
+	// BUG();
 }
 
 void _raw_spin_lock(spinlock_t *lock)

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Jarek Poplawski wrote, On 02/15/2008 09:21 PM:

> Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM:
> ...
> 
>> I have similar crashes on completely different hardware with same job (QOS), 
>> so i think it is actually some nasty bug in networking.
> 
> Maybe you could try with some other debugging options? E.g. since lockdep
> doesn't help - turn this off. Instead try some others, like these:

...On the other hand this:

> Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, 
> ksoftirqd/1/7, f0551180

seems to point just at spinlock lockup, so it's more about the full report.
I wonder if this patch to prink could help here:

author  Ingo Molnar 
Fri, 25 Jan 2008 20:07:58 + (21:07 +0100)
printk: make printk more robust by not allowing recursion

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM:
...

> I have similar crashes on completely different hardware with same job (QOS), 
> so i think it is actually some nasty bug in networking.

Maybe you could try with some other debugging options? E.g. since lockdep
doesn't help - turn this off. Instead try some others, like these:

> # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
> # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
> # CONFIG_DEBUG_KOBJECT is not set
> # CONFIG_DEBUG_HIGHMEM is not set
> # CONFIG_DEBUG_VM is not set
> # CONFIG_DEBUG_LIST is not set
> # CONFIG_DEBUG_SG is not set
> # CONFIG_BOOT_PRINTK_DELAY is not set
> # CONFIG_DEBUG_STACKOVERFLOW is not set
> # CONFIG_DEBUG_STACK_USAGE is not set
> # CONFIG_DEBUG_RODATA is not set

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Jarek Poplawski wrote, On 02/15/2008 09:21 PM:

 Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM:
 ...
 
 I have similar crashes on completely different hardware with same job (QOS), 
 so i think it is actually some nasty bug in networking.
 
 Maybe you could try with some other debugging options? E.g. since lockdep
 doesn't help - turn this off. Instead try some others, like these:

...On the other hand this:

 Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, 
 ksoftirqd/1/7, f0551180

seems to point just at spinlock lockup, so it's more about the full report.
I wonder if this patch to prink could help here:

author  Ingo Molnar mingo at elte.hu
Fri, 25 Jan 2008 20:07:58 + (21:07 +0100)
printk: make printk more robust by not allowing recursion

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Denys Fedoryshchenko wrote, On 02/15/2008 08:42 PM:
...

 I have similar crashes on completely different hardware with same job (QOS), 
 so i think it is actually some nasty bug in networking.

Maybe you could try with some other debugging options? E.g. since lockdep
doesn't help - turn this off. Instead try some others, like these:

 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_HIGHMEM is not set
 # CONFIG_DEBUG_VM is not set
 # CONFIG_DEBUG_LIST is not set
 # CONFIG_DEBUG_SG is not set
 # CONFIG_BOOT_PRINTK_DELAY is not set
 # CONFIG_DEBUG_STACKOVERFLOW is not set
 # CONFIG_DEBUG_STACK_USAGE is not set
 # CONFIG_DEBUG_RODATA is not set

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Jarek Poplawski

Jarek Poplawski wrote, On 02/15/2008 10:03 PM:
...

 ...On the other hand this:
 
 Feb 15 15:50:17 217.151.X.X [1521315.068984] BUG: spinlock lockup on CPU#1, 
 ksoftirqd/1/7, f0551180
 
 seems to point just at spinlock lockup, so it's more about the full report.
 I wonder if this patch to prink could help here:
 
 authorIngo Molnar mingo at elte.hu
   Fri, 25 Jan 2008 20:07:58 + (21:07 +0100)
 printk: make printk more robust by not allowing recursion
 
 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=32a76006683f7b28ae3cc491da37716e002f198e


...or maybe a patch like this attached here?

Jarek P.
diff --git a/lib/spinlock_debug.c b/lib/spinlock_debug.c
index 9c4b025..21c8aaa 100644
--- a/lib/spinlock_debug.c
+++ b/lib/spinlock_debug.c
@@ -111,8 +111,7 @@ static void __spin_lock_debug(spinlock_t *lock)
 			__delay(1);
 		}
 		/* lockup suspected: */
-		if (print_once) {
-			print_once = 0;
+		if (print_once == 1) {
 			printk(KERN_EMERG BUG: spinlock lockup on CPU#%d, 
 	%s/%d, %p\n,
 raw_smp_processor_id(), current-comm,
@@ -122,7 +121,14 @@ static void __spin_lock_debug(spinlock_t *lock)
 			trigger_all_cpu_backtrace();
 #endif
 		}
+		if (print_once++  1000)
+			goto out;
 	}
+	return;
+out:
+	panic(spinlock lockup panic #%llu\n, i);
+	// or:
+	// BUG();
 }
 
 void _raw_spin_lock(spinlock_t *lock)

Re: [PATCH] fib_trie: rcu_assign_pointer warning fix

2008-02-12 Thread Jarek Poplawski

On Tue, Feb 12, 2008 at 08:32:18PM +0100, Jarek Poplawski wrote:
...
> It seems the above version of this macro uses the barrier for 0, but
> if I miss something, or for these other: documenting reasons,

...or __builtin_constants could be used for indexing (?!),

> then of
> course you are right.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fib_trie: rcu_assign_pointer warning fix

2008-02-12 Thread Jarek Poplawski

On Tue, Feb 12, 2008 at 08:07:29AM -0800, Paul E. McKenney wrote:
...
> "All programmers are blind, especially me."

Hmm... I got it my way: you - superheroes - sometimes seem to be just
like us - common people... (Probably early in the morning, before
dressing your funny costumes?) 

> You are right, Jarek.  I ran this through gcc, and the following
> comes close:
> 
> #define rcu_assign_pointer(p, v) \
>   ({ \
>   if (!__builtin_constant_p(v) || (v)) \
>   smp_wmb(); \
>   (p) = (v); \
>   })
> 
> But I am concerned about the following case:
> 
>   rcu_assign_pointer(global_index, 0);
> 
>   . . .
> 
>   x = global_array[rcu_dereference(global_index)];
> 
> Since arrays have a zero-th element, we would really want a memory
> barrier in this case.

It seems the above version of this macro uses the barrier for 0, but
if I miss something, or for these other: documenting reasons, then of
course you are right.

> 
> So how about leaving the index-unfriendly version of rcu_assign_pointer()
> and adding an rcu_assign_index() as follows?
> 
>   Thanx, Paul
> 
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> ---
> 
>  rcupdate.h |   18 ++
>  1 file changed, 18 insertions(+)
> 
> diff -urpNa -X dontdiff linux-2.6.24/include/linux/rcupdate.h 
> linux-2.6.24-rai/include/linux/rcupdate.h
> --- linux-2.6.24/include/linux/rcupdate.h 2008-01-24 14:58:37.0 
> -0800
> +++ linux-2.6.24-rai/include/linux/rcupdate.h 2008-02-12 08:04:59.0 
> -0800
> @@ -278,6 +278,24 @@ extern struct lockdep_map rcu_lock_map;
>   })
>  
>  /**
> + * rcu_assign_index - assign (publicize) a index of a newly
> + * initialized array elementg that will be dereferenced by RCU

---^
 + * initialized array element that will be dereferenced by RCU

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fib_trie: rcu_assign_pointer warning fix

2008-02-12 Thread Jarek Poplawski

On 12-02-2008 02:16, David Miller wrote:
> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Mon, 11 Feb 2008 16:59:54 -0800
> 
> linux-kernel added to CC:, any change to generic kernel infrastructure
> should be posted there
> 
>> Eliminate warnings when rcu_assign_pointer is used with unsigned long.
>> It is reasonable to use RCU with non-pointer values so allow it for general
>> use.  Add a comment to explain the if test.
>>
>> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>> ---
>>  include/linux/rcupdate.h |   13 +++--
>>  1 files changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
>> index 37a642c..c44ac87 100644
>> --- a/include/linux/rcupdate.h
>> +++ b/include/linux/rcupdate.h
>> @@ -172,14 +172,15 @@ struct rcu_head {
>>   * structure after the pointer assignment.  More importantly, this
>>   * call documents which pointers will be dereferenced by RCU read-side
>>   * code.
>> + *
>> + * If value is the NULL (constant 0), then no barrier is needed.
>>   */
>>  
>> -#define rcu_assign_pointer(p, v) \
>> -({ \
>> -if (!__builtin_constant_p(v) || \
>> -((v) != NULL)) \
>> -smp_wmb(); \
>> -(p) = (v); \
>> +#define rcu_assign_pointer(p, v)\
>> +({  \
>> +if (!(__builtin_constant_p(v) && v))\

...But, "If value is the NULL (constant 0)" we have:

if (!(1 && NULL != 0)) ==> if (!(0)) and the barrier is used?!

>> +smp_wmb();  \
>> +(p) = (v);  \
>>  })
>>  
>>  /**


Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fib_trie: rcu_assign_pointer warning fix

2008-02-12 Thread Jarek Poplawski

On 12-02-2008 02:16, David Miller wrote:
 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Mon, 11 Feb 2008 16:59:54 -0800
 
 linux-kernel added to CC:, any change to generic kernel infrastructure
 should be posted there
 
 Eliminate warnings when rcu_assign_pointer is used with unsigned long.
 It is reasonable to use RCU with non-pointer values so allow it for general
 use.  Add a comment to explain the if test.

 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
 ---
  include/linux/rcupdate.h |   13 +++--
  1 files changed, 7 insertions(+), 6 deletions(-)

 diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
 index 37a642c..c44ac87 100644
 --- a/include/linux/rcupdate.h
 +++ b/include/linux/rcupdate.h
 @@ -172,14 +172,15 @@ struct rcu_head {
   * structure after the pointer assignment.  More importantly, this
   * call documents which pointers will be dereferenced by RCU read-side
   * code.
 + *
 + * If value is the NULL (constant 0), then no barrier is needed.
   */
  
 -#define rcu_assign_pointer(p, v) \
 -({ \
 -if (!__builtin_constant_p(v) || \
 -((v) != NULL)) \
 -smp_wmb(); \
 -(p) = (v); \
 +#define rcu_assign_pointer(p, v)\
 +({  \
 +if (!(__builtin_constant_p(v)  v))\

...But, If value is the NULL (constant 0) we have:

if (!(1  NULL != 0)) == if (!(0)) and the barrier is used?!

 +smp_wmb();  \
 +(p) = (v);  \
  })
  
  /**


Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fib_trie: rcu_assign_pointer warning fix

2008-02-12 Thread Jarek Poplawski

On Tue, Feb 12, 2008 at 08:32:18PM +0100, Jarek Poplawski wrote:
...
 It seems the above version of this macro uses the barrier for 0, but
 if I miss something, or for these other: documenting reasons,

...or __builtin_constants could be used for indexing (?!),

 then of
 course you are right.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On 22-01-2008 01:55, Dave Young wrote:
...
> Hi, thanks your effort. Now I think we should stop this thread and
> waiting the class_device going away :)

Sure! But, if you change your mind I'm interested in this subject.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

Dave Young wrote, On 01/21/2008 09:44 AM:
...
> I applied it in my kernel, built and run without warnings, but it need
> more testing.
> I will be very glad to see the test result about this if you could, thanks.

Bad news. (Alas I won't be able to check this today.)

=
[ INFO: possible recursive locking detected ]
2.6.24-rc8 #5
-
modprobe/536 is trying to acquire lock:
 (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322

but task is already holding lock:
 (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322

other info that might help us debug this:
2 locks held by modprobe/536:
 #0:  (>scan_mutex){--..}, at: [] __scsi_add_device+0x60/0xc9 
[scsi_mod]
 #1:  (>mutex/1){--..}, at: [] class_device_add+0x2c6/0x322

stack backtrace:
Pid: 536, comm: modprobe Not tainted 2.6.24-rc8 #5
 [] __lock_acquire+0x962/0x10d4
 [] _spin_unlock_irqrestore+0x34/0x39
 [] kfree+0x87/0xad
 [] trace_hardirqs_on+0xba/0x15b
 [] lock_acquire+0x71/0x8b
 [] class_device_add+0x2c6/0x322
 [] mutex_lock_nested+0x92/0x2bc
 [] class_device_add+0x2c6/0x322
 [] class_device_add+0x2c6/0x322
 [] class_device_add+0x2c6/0x322
 [] kobject_get+0xf/0x13
 [] kobject_init+0x29/0x38
 [] class_device_create+0x7d/0x9e
 [] sg_add+0x144/0x39f [sg]
 [] class_device_add+0x2f4/0x322
 [] scsi_sysfs_add_sdev+0x64/0x1ac [scsi_mod]
 [] ata_scsi_dev_config+0x14/0x91 [libata]
 [] scsi_probe_and_add_lun+0x9c0/0x9e0 [scsi_mod]
 [] __scsi_add_device+0x60/0xc9 [scsi_mod]
 [] __scsi_add_device+0xc7/0xc9 [scsi_mod]
 [] ata_scsi_scan_host+0xd3/0x26a [libata]
 [] ata_host_register+0x205/0x280 [libata]
 [] ata_interrupt+0x0/0x200 [libata]
 [] ata_host_activate+0x8b/0xf2 [libata]
 [] svia_init_one+0x2e2/0x511 [sata_via]
 [] pci_match_device+0xa5/0xb6
 [] svia_init_one+0x0/0x511 [sata_via]
 [] pci_device_probe+0x40/0x5f
 [] driver_probe_device+0x7c/0x175
 [] __driver_attach+0xa2/0xa4
 [] bus_for_each_dev+0x3c/0x5a
 [] driver_attach+0x16/0x1a
 [] __driver_attach+0x0/0xa4
 [] bus_add_driver+0x72/0x1c0
 [] __pci_register_driver+0x56/0x89
 [] sys_init_module+0xf8/0x1891
 [] ata_port_start+0x0/0x65 [libata]
 [] trace_hardirqs_on+0xba/0x15b
 [] syscall_call+0x7/0xb
 ===
scsi 2:0:0:0: Attached scsi generic sg1 type 0

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On Mon, Jan 21, 2008 at 04:44:36PM +0800, Dave Young wrote:
...
> I applied it in my kernel, built and run without warnings, but it need
> more testing.
> I will be very glad to see the test result about this if you could, thanks.

I'll try this of course, but alas I don't have anything such more
sophisticated to verify this possibility of deeper nesting.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On Mon, Jan 21, 2008 at 09:43:35AM +0800, Dave Young wrote:
...
> Convert the class semaphore to mutex.
> class_interface_register/unregister use class_device_* functions, so 
> SINGLE_DEPTH_NESTING added for lockdep please in these functions. 

Looks fine to me now, but... I think you forgot again to make it
clear that there were no lockdep warnings observed after this patch.
And it seems there should be used something more complex than usb
mouse or modem to verify this. Then IMHO it would be very nice if
Andrew dares to merge this to -mm...

Regards,
Jarek P.

PS: I miss a lot function names in this patch...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On Mon, Jan 21, 2008 at 04:44:36PM +0800, Dave Young wrote:
...
 I applied it in my kernel, built and run without warnings, but it need
 more testing.
 I will be very glad to see the test result about this if you could, thanks.

I'll try this of course, but alas I don't have anything such more
sophisticated to verify this possibility of deeper nesting.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On Mon, Jan 21, 2008 at 09:43:35AM +0800, Dave Young wrote:
...
 Convert the class semaphore to mutex.
 class_interface_register/unregister use class_device_* functions, so 
 SINGLE_DEPTH_NESTING added for lockdep please in these functions. 

Looks fine to me now, but... I think you forgot again to make it
clear that there were no lockdep warnings observed after this patch.
And it seems there should be used something more complex than usb
mouse or modem to verify this. Then IMHO it would be very nice if
Andrew dares to merge this to -mm...

Regards,
Jarek P.

PS: I miss a lot function names in this patch...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

Dave Young wrote, On 01/21/2008 09:44 AM:
...
 I applied it in my kernel, built and run without warnings, but it need
 more testing.
 I will be very glad to see the test result about this if you could, thanks.

Bad news. (Alas I won't be able to check this today.)

=
[ INFO: possible recursive locking detected ]
2.6.24-rc8 #5
-
modprobe/536 is trying to acquire lock:
 (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322

but task is already holding lock:
 (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322

other info that might help us debug this:
2 locks held by modprobe/536:
 #0:  (shost-scan_mutex){--..}, at: [ce82795b] __scsi_add_device+0x60/0xc9 
[scsi_mod]
 #1:  (cls-mutex/1){--..}, at: [c0217e43] class_device_add+0x2c6/0x322

stack backtrace:
Pid: 536, comm: modprobe Not tainted 2.6.24-rc8 #5
 [c0138edf] __lock_acquire+0x962/0x10d4
 [c028be37] _spin_unlock_irqrestore+0x34/0x39
 [c015e693] kfree+0x87/0xad
 [c01380cb] trace_hardirqs_on+0xba/0x15b
 [c01396c2] lock_acquire+0x71/0x8b
 [c0217e43] class_device_add+0x2c6/0x322
 [c028a87f] mutex_lock_nested+0x92/0x2bc
 [c0217e43] class_device_add+0x2c6/0x322
 [c0217e43] class_device_add+0x2c6/0x322
 [c0217e43] class_device_add+0x2c6/0x322
 [c01afd42] kobject_get+0xf/0x13
 [c01afe68] kobject_init+0x29/0x38
 [c0217f2c] class_device_create+0x7d/0x9e
 [cebfe5b4] sg_add+0x144/0x39f [sg]
 [c0217e71] class_device_add+0x2f4/0x322
 [ce8289f7] scsi_sysfs_add_sdev+0x64/0x1ac [scsi_mod]
 [ce840ece] ata_scsi_dev_config+0x14/0x91 [libata]
 [ce826b5c] scsi_probe_and_add_lun+0x9c0/0x9e0 [scsi_mod]
 [ce82795b] __scsi_add_device+0x60/0xc9 [scsi_mod]
 [ce8279c2] __scsi_add_device+0xc7/0xc9 [scsi_mod]
 [ce841401] ata_scsi_scan_host+0xd3/0x26a [libata]
 [ce83e7e6] ata_host_register+0x205/0x280 [libata]
 [ce83f430] ata_interrupt+0x0/0x200 [libata]
 [ce83e8ec] ata_host_activate+0x8b/0xf2 [libata]
 [ceb95504] svia_init_one+0x2e2/0x511 [sata_via]
 [c01ca42c] pci_match_device+0xa5/0xb6
 [ceb95222] svia_init_one+0x0/0x511 [sata_via]
 [c01ca4e9] pci_device_probe+0x40/0x5f
 [c02173ae] driver_probe_device+0x7c/0x175
 [c02175ed] __driver_attach+0xa2/0xa4
 [c02168dd] bus_for_each_dev+0x3c/0x5a
 [c0217263] driver_attach+0x16/0x1a
 [c021754b] __driver_attach+0x0/0xa4
 [c0216bea] bus_add_driver+0x72/0x1c0
 [c01ca642] __pci_register_driver+0x56/0x89
 [c013f11d] sys_init_module+0xf8/0x1891
 [ce83901f] ata_port_start+0x0/0x65 [libata]
 [c01380cb] trace_hardirqs_on+0xba/0x15b
 [c0102936] syscall_call+0x7/0xb
 ===
scsi 2:0:0:0: Attached scsi generic sg1 type 0

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-21 Thread Jarek Poplawski

On 22-01-2008 01:55, Dave Young wrote:
...
 Hi, thanks your effort. Now I think we should stop this thread and
 waiting the class_device going away :)

Sure! But, if you change your mind I'm interested in this subject.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-19 Thread Jarek Poplawski

Dave Young wrote, On 01/18/2008 10:07 AM:

> On Jan 18, 2008 4:23 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:

>> On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote:

...

>>> 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough.
>>> or
>>> 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other
>>> class_device functions because it is the only possible nest-lock place
>>> as I know.

Dave, after looking a bit at this it seems you could be "mostly" right
with this 2). Maybe I've missed something (I didn't verify this yet), but
it looks like +1 level (SINGLE_LEVEL_NESTING) could be needed in:
class_device_add() (as you did), but probably also class_device_del() and
class_device_destroy().

...But, there seems to be "little" problem, if there is used this recursion
with: class_intf->add()/remove() in class_device_add()/del()?! Then Kay
is right about possibility of deeper nesting. If this path is really used,
and any of these class_device_* functions with locking are called, then
this patch couldn't work like this. So, there is a question: how deep
nesting is currently used here?

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-19 Thread Jarek Poplawski

Dave Young wrote, On 01/18/2008 10:07 AM:

 On Jan 18, 2008 4:23 PM, Jarek Poplawski [EMAIL PROTECTED] wrote:

 On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote:

...

 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough.
 or
 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other
 class_device functions because it is the only possible nest-lock place
 as I know.


Dave, after looking a bit at this it seems you could be mostly right
with this 2). Maybe I've missed something (I didn't verify this yet), but
it looks like +1 level (SINGLE_LEVEL_NESTING) could be needed in:
class_device_add() (as you did), but probably also class_device_del() and
class_device_destroy().

...But, there seems to be little problem, if there is used this recursion
with: class_intf-add()/remove() in class_device_add()/del()?! Then Kay
is right about possibility of deeper nesting. If this path is really used,
and any of these class_device_* functions with locking are called, then
this patch couldn't work like this. So, there is a question: how deep
nesting is currently used here?

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-18 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 11:45:12AM +0100, Kay Sievers wrote:
> On Fri, 2008-01-18 at 08:38 +0100, Jarek Poplawski wrote:
> > On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote:
> > > On Jan 18, 2008 11:18 AM, Kay Sievers <[EMAIL PROTECTED]> wrote:
> > ...
> > > > Yeah, might be better to wait until class_device is gone, otherwise you
> > > > may need to fix stuff that is just going to be removed. Your change to
> > > > have iterators for the class devices look like a nice preparation for
> > > > future changes though.
> > > >
> > > > Our rough plan is:
> > > >  2.6.25:
> > > >   - get the ~100 patches in Greg's tree (in -mm) merged :)
> > > >  2.6.26:
> > > > ???  - remove the 20 char limit in "struct device"
> > > >   - get rid of "struct class_device"
> > > 
> > > Fine, thanks.
> > > 
> > > Let's wait for other people's comment.
> > 
> > Dave, I doubt you'll ever manage to do this if you're going to wait:
> > probably there will be always some new changes like this around...
> 
> Well there are not "changes" in that sense, the class_device stuff will
> be entirely ripped out, and I doubt we will want to change anything
> there, just shortly before it's deleted.

So, 2.6.26 means shortly... And this all needs some time for testing,
debugging or maybe some change of concept, so this would take a while...
Well, it's not my problem, but since this stuff will go away, shouldn't
we care more about the staff that will stay?

> Also your assumptions about device nesting are not really true, there is
> no limit, even when there are no current users nesting deeper, and
> "struct device" can be any nesting depth, and that's where it gets
> interesting.

I'm just trying to figure this out. It seems this is a real problem
while freezing, but not necessarily here (but I can miss something).
 
Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-18 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 09:00:34AM +0100, Jarek Poplawski wrote:
> On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote:
> ...
> > After digging the class usage code again, I found that the only
> > possible double lock place is the class_interface_register/unregister
> > in which the class_device api could be called.
> 
> OK, but currently after using mostly:
> mutex_lock(_class)
> 
> and once:
> mutex_lock_nested(_class, SINGLE_DEPTH_NESTING)
> 
> lockdep mostly thinks these parent classes are 2 different objects,
> with only 2 possible levels of nesting, so this parent_class has
> to have wrong name (2 parents can't be locked from the same thread,
> so maybe it's class_grandparent sometimes?).

...Hmm... I was probably wrong: this could be right if there are only
two levels of nesting used and class locks it's parent only!

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-18 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 03:48:02PM +0800, Dave Young wrote:
> On Jan 18, 2008 3:38 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
...
> > IMHO, it would be nice to get the real state of current lockdep
> > problems here to figure out if there is any chance to do this right &
> > without any warnings with current lockdep. If I got it right from
> > earlier threads it might be impossible with USB, at least.
> 
> I don't think so, usb doesn't be affected by struct class mutex, they
> only use the lock of struct device. As I replied before, the lockdep
> issue exist only between class_interface and class_device.

OK, but I've meant possibility of changing their own semaphores later.

> > So, since I think these nesting levels seem to be wrong in 7/7 patch,
> > maybe it's better to exclude it from this patchset, and to try this as
> > testing for some time.
> 
> I may file the updated patch with more nesting changes and test it of
> course. Actually I should have done it, thanks.
...
> 1) Using CLASS_NORMAL/CLASS_PARENT/CLASS_CHILD will be enough.
> or
> 2) Simply add SINGLE_LEVEL_NESTING in class_device_add and other
> class_device functions because it is the only possible nest-lock place
> as I know.

If SINGLE_LEVEL_NESTING is enough? (means 2 levels total)

I think you should more care about real (logical) relations here, than
what's enough to get rid of lockdep warnings.

Since there are not so much of these changes, you can try both
variants. I'll be glad to look at this - maybe I'll mangage to figure
out BTW, what it's all about...

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-18 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 09:00:34AM +0100, Jarek Poplawski wrote:
 On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote:
 ...
  After digging the class usage code again, I found that the only
  possible double lock place is the class_interface_register/unregister
  in which the class_device api could be called.
 
 OK, but currently after using mostly:
 mutex_lock(parent_class)
 
 and once:
 mutex_lock_nested(parent_class, SINGLE_DEPTH_NESTING)
 
 lockdep mostly thinks these parent classes are 2 different objects,
 with only 2 possible levels of nesting, so this parent_class has
 to have wrong name (2 parents can't be locked from the same thread,
 so maybe it's class_grandparent sometimes?).

...Hmm... I was probably wrong: this could be right if there are only
two levels of nesting used and class locks it's parent only!

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-18 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 11:45:12AM +0100, Kay Sievers wrote:
 On Fri, 2008-01-18 at 08:38 +0100, Jarek Poplawski wrote:
  On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote:
   On Jan 18, 2008 11:18 AM, Kay Sievers [EMAIL PROTECTED] wrote:
  ...
Yeah, might be better to wait until class_device is gone, otherwise you
may need to fix stuff that is just going to be removed. Your change to
have iterators for the class devices look like a nice preparation for
future changes though.
   
Our rough plan is:
 2.6.25:
  - get the ~100 patches in Greg's tree (in -mm) merged :)
 2.6.26:
???  - remove the 20 char limit in struct device
  - get rid of struct class_device
   
   Fine, thanks.
   
   Let's wait for other people's comment.
  
  Dave, I doubt you'll ever manage to do this if you're going to wait:
  probably there will be always some new changes like this around...
 
 Well there are not changes in that sense, the class_device stuff will
 be entirely ripped out, and I doubt we will want to change anything
 there, just shortly before it's deleted.

So, 2.6.26 means shortly... And this all needs some time for testing,
debugging or maybe some change of concept, so this would take a while...
Well, it's not my problem, but since this stuff will go away, shouldn't
we care more about the staff that will stay?

 Also your assumptions about device nesting are not really true, there is
 no limit, even when there are no current users nesting deeper, and
 struct device can be any nesting depth, and that's where it gets
 interesting.

I'm just trying to figure this out. It seems this is a real problem
while freezing, but not necessarily here (but I can miss something).
 
Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote:
...
> After digging the class usage code again, I found that the only
> possible double lock place is the class_interface_register/unregister
> in which the class_device api could be called.

OK, but currently after using mostly:
mutex_lock(_class)

and once:
mutex_lock_nested(_class, SINGLE_DEPTH_NESTING)

lockdep mostly thinks these parent classes are 2 different objects,
with only 2 possible levels of nesting, so this parent_class has
to have wrong name (2 parents can't be locked from the same thread,
so maybe it's class_grandparent sometimes?).

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote:
> On Jan 18, 2008 11:18 AM, Kay Sievers <[EMAIL PROTECTED]> wrote:
...
> > Yeah, might be better to wait until class_device is gone, otherwise you
> > may need to fix stuff that is just going to be removed. Your change to
> > have iterators for the class devices look like a nice preparation for
> > future changes though.
> >
> > Our rough plan is:
> >  2.6.25:
> >   - get the ~100 patches in Greg's tree (in -mm) merged :)
> >  2.6.26:
> > ???  - remove the 20 char limit in "struct device"
> >   - get rid of "struct class_device"
> 
> Fine, thanks.
> 
> Let's wait for other people's comment.

Dave, I doubt you'll ever manage to do this if you're going to wait:
probably there will be always some new changes like this around...

IMHO, it would be nice to get the real state of current lockdep
problems here to figure out if there is any chance to do this right &
without any warnings with current lockdep. If I got it right from
earlier threads it might be impossible with USB, at least.

So, since I think these nesting levels seem to be wrong in 7/7 patch,
maybe it's better to exclude it from this patchset, and to try this as
testing for some time.

My proposal is to do the annotations with mutex_lock_nested(),
everywhere in this patch, according to 'real' relations between these
classes from thread POV, so e.g.:

mutex_lock_nested(_class->mutex, CLASS_PARENT);
mutex_lock_nested(_class->mutex, CLASS_CLASS);
...or more if needed:
mutex_lock_nested(_class->mutex, CLASS_CHILD);
mutex_lock_nested(_class->mutex, CLASS_ROOT);
...etc.

using adequate names for these enums, and only after this check what
lockdep thinks about it. Then, if there are no obvious mistakes, but
lockdep doesn't like it, send the patch and report without trying to
'silence' lockdep, so we could see what's going on, and if there are
any chances to make it right.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote:
> On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
> > On Thu, 17 Jan 2008, Jarek Poplawski wrote:
> > 
> > > On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
> > > > On Thu, 17 Jan 2008, Dave Young wrote:
> > > > 
> > > > > > Your meaning isn't clear.  Do you mean that your patch doesn't 
> > > > > > generate
> > > > > > any lockdep warnings at all?  Or do you mean that it generates a 
> > > > > > single
> > > > > > lockdep warning at boot time and then no more warnings afterward?
> > > > > 
> > > > > I means the latter one.
> > > > 
> > > > That's very bad.
> > > > 
> > > > For each type of violation, lockdep only gives one error message.  So
> > > > the fact that you get one message at boot time and then no more doesn't
> > > > mean the code is almost right -- it probably means the code has lots of
> > > > errors and you're seeing only the first one.
> > > 
> > > I hope it's better than this: lockdep really stops checking after first
> > > warning, but I've understood from David's description that after fixing
> > > this one place lockdep seems to be pleased.
> > 
> > That isn't what Dave said above; he said that lockdep produces a single
> > warning at bootup.  If he did mention anything about one place being
> > fixed up or lockdep being pleased, it was a while back and I've lost 
> > track of it.
> > 
> > If I recall correctly the nature of the warning was that a method
> > routine for one class (called with the class's mutex held) was creating
> > a second class and locking that class's mutex.  In principle this is
> > perfectly legal and should be allowed for arbitrary depths of nesting,
> > even though it is the sort of thing lockdep is currently unable to
> > handle.
> 
> You are definitely right! After first reading Dave's description I got
> it the same way, but after re-reading I probably was misled with this
> "thus"! Only now I've had a look at this warning and there is really
> mutex_lock_nested(). Sorry Alan!

But, on the other hand, mutex_lock() is really mutex_lock_nested(), and
after second checking this lockdep warning from Jan. 3, it seems
impossible it was get after this patch...

Dave, could you please answer with full sentence if there is any lockdep
warning possible after applying these 1-7/7 patches, and if so, attach
current warning? Otherwise, I'll have apologized for this everybody from
the list soon!

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote:
> On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
...
> > If I recall correctly the nature of the warning was that a method
> > routine for one class (called with the class's mutex held) was creating
> > a second class and locking that class's mutex.  In principle this is
> > perfectly legal and should be allowed for arbitrary depths of nesting,
> > even though it is the sort of thing lockdep is currently unable to
> > handle.

BTW, Dave, if it's only about one such "second class" here, then it
shouldn't be so hard to try this one more level of nesting. I think,
the real problem for lockdep starts when there are more such "second
classes", but it's probably in another place.

You could also have a look at e.g. enum_inode_i_mutex_lock_class in
include/linux/fs.h and fh_lock_nested() in include/linux/nfsd/nfsfh.h,
and maybe define similar enum for this.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 01:11:01PM -0800, Greg KH wrote:
...
> I've known Greg to make lots of mistakes :)

Right! Above is one example...

> I don't remember ever saying that the "code is correct with the lockdep
> warnings", I think I said, "Make sure there are no lockdep warnings with
> any conversion you do."

I've meant your discussion with David Miller, where you doubted the need
to change properly working code. But, as a matter of fact, I really can't
see any reason to doubt this, and care so much about lockdep now, if there
are no known problems with lockups, and this change has been done done
"one for one".

Sorry again for any possible misleading,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
> On Thu, 17 Jan 2008, Jarek Poplawski wrote:
> 
> > On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
> > > On Thu, 17 Jan 2008, Dave Young wrote:
> > > 
> > > > > Your meaning isn't clear.  Do you mean that your patch doesn't 
> > > > > generate
> > > > > any lockdep warnings at all?  Or do you mean that it generates a 
> > > > > single
> > > > > lockdep warning at boot time and then no more warnings afterward?
> > > > 
> > > > I means the latter one.
> > > 
> > > That's very bad.
> > > 
> > > For each type of violation, lockdep only gives one error message.  So
> > > the fact that you get one message at boot time and then no more doesn't
> > > mean the code is almost right -- it probably means the code has lots of
> > > errors and you're seeing only the first one.
> > 
> > I hope it's better than this: lockdep really stops checking after first
> > warning, but I've understood from David's description that after fixing
> > this one place lockdep seems to be pleased.
> 
> That isn't what Dave said above; he said that lockdep produces a single
> warning at bootup.  If he did mention anything about one place being
> fixed up or lockdep being pleased, it was a while back and I've lost 
> track of it.
> 
> If I recall correctly the nature of the warning was that a method
> routine for one class (called with the class's mutex held) was creating
> a second class and locking that class's mutex.  In principle this is
> perfectly legal and should be allowed for arbitrary depths of nesting,
> even though it is the sort of thing lockdep is currently unable to
> handle.

You are definitely right! After first reading Dave's description I got
it the same way, but after re-reading I probably was misled with this
"thus"! Only now I've had a look at this warning and there is really
mutex_lock_nested(). Sorry Alan!

David, I don't think a patch which causes such a warning can be merged
even to -mm, because, as I wrote earlier it would automatically turn
off lockdep for everybody. So, every such warning needs to be fixed or,
if it's impossible because of some lockdep deficiency, it should be
considered if it's better to wait for lockdep changes, or do the change
with lockdep turned off locally for each lock (IMHO, it's better,
because with sems there is no such control as well, and some other
aspects could be tested in the meantime).

> > On the other hand, according to Greg the code is OK, so if there are any
> > such warnings they simply have to be false! (...Unless you trust lockdep
> > more?!)
> 
> It's not a matter of trust or of false warnings.  People shouldn't 
> tolerate any lockdep warnings at all; otherwise they will start to 
> ignore the valid ones.
> 
> Alan Stern
> 
> P.S.: Just because Greg says the code is okay doesn't mean it will 
> please lockdep -- it doesn't even mean the code really is okay!  I've 
> known Greg to make an occasional mistake.

Alan, you are 200% right! I apologize for my bad jokes too!

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
> On Thu, 17 Jan 2008, Dave Young wrote:
> 
> > > Your meaning isn't clear.  Do you mean that your patch doesn't generate
> > > any lockdep warnings at all?  Or do you mean that it generates a single
> > > lockdep warning at boot time and then no more warnings afterward?
> > 
> > I means the latter one.
> 
> That's very bad.
> 
> For each type of violation, lockdep only gives one error message.  So
> the fact that you get one message at boot time and then no more doesn't
> mean the code is almost right -- it probably means the code has lots of
> errors and you're seeing only the first one.

I hope it's better than this: lockdep really stops checking after first
warning, but I've understood from David's description that after fixing
this one place lockdep seems to be pleased.

On the other hand, according to Greg the code is OK, so if there are any
such warnings they simply have to be false! (...Unless you trust lockdep
more?!)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On 17-01-2008 02:17, Dave Young wrote:
> On Jan 16, 2008 4:34 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
>> On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote:
>> ...
>>> The lockdep warining was posted in the below thread, actually, I have
>>> built and run this patced kernel for several days, there's no more
>>> warnings.
>>> http://lkml.org/lkml/2008/1/3/2
>> Right... But, with something like this:
>>
>> ... have_some_fun(... cls)
>> {
>> mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING);
>> have_other_fun(cls);
>> mutex_unlock(>mutex);
>>
>> }
>>
>> ... have_more_fun(...)
>> {
>> ...
>>
>> mutex_init(>mutex);
>>
>> mutex_lock(>mutex);
>> have_some_fun(cls);
>> mutex_unlock(>mutex);
>> }
>>
>> probably you wouldn't get any lockdep warning too...
> 
> Sorry for late reply.
> Actually, I don't know much about lockdep. Could you tell how to use
> it properly in this scenario?

As you have noticed while working on this patch, if two different
instances of the same structure containig some lock are created in
the same place, lockdep will treat this one (the same) lock. It looks
strange, but actually it's a feature which enables to track
dependencies between different locks on 'class' level instead of
instance 'level'. The downside is that lockdep is very often too
sensitive by default, so you have to 'annotate' when instancess are
actually on different level (e.g. parents and children here) and
could be locked at the same time or in some order.

You can use e.g. mutex_lock_nested() or lockdep_set_class*() for this.
Then lockdep simply trusts you, and starts to think they are different
locks. If you do it wrong there will be simply no more warnings, but
undercover lockups still possible (and diagnosing a bit harder then).
So, since in your patch there are two levels of locking, and you
started to annotate lockdep about a child taking parent's class lock
with: 
mutex_lock_nested(_class->mutex, SINGLE_DEPTH_NESTING);

you should do the same everywhere in a situation like this.
lockdep will treat this simply as lock B vs. A (mutex_lock())
dependencies.

Regards,
Jarek P.

PS: BTW, it seems after this patch 1/1 the locking was changed a bit,
so these previous tests could be not enough.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On 17-01-2008 02:17, Dave Young wrote:
 On Jan 16, 2008 4:34 PM, Jarek Poplawski [EMAIL PROTECTED] wrote:
 On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote:
 ...
 The lockdep warining was posted in the below thread, actually, I have
 built and run this patced kernel for several days, there's no more
 warnings.
 http://lkml.org/lkml/2008/1/3/2
 Right... But, with something like this:

 ... have_some_fun(... cls)
 {
 mutex_lock_nested(cls-mutex, SINGLE_DEPTH_NESTING);
 have_other_fun(cls);
 mutex_unlock(cls-mutex);

 }

 ... have_more_fun(...)
 {
 ...

 mutex_init(cls-mutex);

 mutex_lock(cls-mutex);
 have_some_fun(cls);
 mutex_unlock(cls-mutex);
 }

 probably you wouldn't get any lockdep warning too...
 
 Sorry for late reply.
 Actually, I don't know much about lockdep. Could you tell how to use
 it properly in this scenario?

As you have noticed while working on this patch, if two different
instances of the same structure containig some lock are created in
the same place, lockdep will treat this one (the same) lock. It looks
strange, but actually it's a feature which enables to track
dependencies between different locks on 'class' level instead of
instance 'level'. The downside is that lockdep is very often too
sensitive by default, so you have to 'annotate' when instancess are
actually on different level (e.g. parents and children here) and
could be locked at the same time or in some order.

You can use e.g. mutex_lock_nested() or lockdep_set_class*() for this.
Then lockdep simply trusts you, and starts to think they are different
locks. If you do it wrong there will be simply no more warnings, but
undercover lockups still possible (and diagnosing a bit harder then).
So, since in your patch there are two levels of locking, and you
started to annotate lockdep about a child taking parent's class lock
with: 
mutex_lock_nested(parent_class-mutex, SINGLE_DEPTH_NESTING);

you should do the same everywhere in a situation like this.
lockdep will treat this simply as lock B vs. A (mutex_lock(cls))
dependencies.

Regards,
Jarek P.

PS: BTW, it seems after this patch 1/1 the locking was changed a bit,
so these previous tests could be not enough.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
 On Thu, 17 Jan 2008, Dave Young wrote:
 
   Your meaning isn't clear.  Do you mean that your patch doesn't generate
   any lockdep warnings at all?  Or do you mean that it generates a single
   lockdep warning at boot time and then no more warnings afterward?
  
  I means the latter one.
 
 That's very bad.
 
 For each type of violation, lockdep only gives one error message.  So
 the fact that you get one message at boot time and then no more doesn't
 mean the code is almost right -- it probably means the code has lots of
 errors and you're seeing only the first one.

I hope it's better than this: lockdep really stops checking after first
warning, but I've understood from David's description that after fixing
this one place lockdep seems to be pleased.

On the other hand, according to Greg the code is OK, so if there are any
such warnings they simply have to be false! (...Unless you trust lockdep
more?!)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
 On Thu, 17 Jan 2008, Jarek Poplawski wrote:
 
  On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
   On Thu, 17 Jan 2008, Dave Young wrote:
   
 Your meaning isn't clear.  Do you mean that your patch doesn't 
 generate
 any lockdep warnings at all?  Or do you mean that it generates a 
 single
 lockdep warning at boot time and then no more warnings afterward?

I means the latter one.
   
   That's very bad.
   
   For each type of violation, lockdep only gives one error message.  So
   the fact that you get one message at boot time and then no more doesn't
   mean the code is almost right -- it probably means the code has lots of
   errors and you're seeing only the first one.
  
  I hope it's better than this: lockdep really stops checking after first
  warning, but I've understood from David's description that after fixing
  this one place lockdep seems to be pleased.
 
 That isn't what Dave said above; he said that lockdep produces a single
 warning at bootup.  If he did mention anything about one place being
 fixed up or lockdep being pleased, it was a while back and I've lost 
 track of it.
 
 If I recall correctly the nature of the warning was that a method
 routine for one class (called with the class's mutex held) was creating
 a second class and locking that class's mutex.  In principle this is
 perfectly legal and should be allowed for arbitrary depths of nesting,
 even though it is the sort of thing lockdep is currently unable to
 handle.

You are definitely right! After first reading Dave's description I got
it the same way, but after re-reading I probably was misled with this
thus! Only now I've had a look at this warning and there is really
mutex_lock_nested(). Sorry Alan!

David, I don't think a patch which causes such a warning can be merged
even to -mm, because, as I wrote earlier it would automatically turn
off lockdep for everybody. So, every such warning needs to be fixed or,
if it's impossible because of some lockdep deficiency, it should be
considered if it's better to wait for lockdep changes, or do the change
with lockdep turned off locally for each lock (IMHO, it's better,
because with sems there is no such control as well, and some other
aspects could be tested in the meantime).

  On the other hand, according to Greg the code is OK, so if there are any
  such warnings they simply have to be false! (...Unless you trust lockdep
  more?!)
 
 It's not a matter of trust or of false warnings.  People shouldn't 
 tolerate any lockdep warnings at all; otherwise they will start to 
 ignore the valid ones.
 
 Alan Stern
 
 P.S.: Just because Greg says the code is okay doesn't mean it will 
 please lockdep -- it doesn't even mean the code really is okay!  I've 
 known Greg to make an occasional mistake.

Alan, you are 200% right! I apologize for my bad jokes too!

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote:
 On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
...
  If I recall correctly the nature of the warning was that a method
  routine for one class (called with the class's mutex held) was creating
  a second class and locking that class's mutex.  In principle this is
  perfectly legal and should be allowed for arbitrary depths of nesting,
  even though it is the sort of thing lockdep is currently unable to
  handle.

BTW, Dave, if it's only about one such second class here, then it
shouldn't be so hard to try this one more level of nesting. I think,
the real problem for lockdep starts when there are more such second
classes, but it's probably in another place.

You could also have a look at e.g. enum_inode_i_mutex_lock_class in
include/linux/fs.h and fh_lock_nested() in include/linux/nfsd/nfsfh.h,
and maybe define similar enum for this.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 01:11:01PM -0800, Greg KH wrote:
...
 I've known Greg to make lots of mistakes :)

Right! Above is one example...

 I don't remember ever saying that the code is correct with the lockdep
 warnings, I think I said, Make sure there are no lockdep warnings with
 any conversion you do.

I've meant your discussion with David Miller, where you doubted the need
to change properly working code. But, as a matter of fact, I really can't
see any reason to doubt this, and care so much about lockdep now, if there
are no known problems with lockups, and this change has been done done
one for one.

Sorry again for any possible misleading,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Thu, Jan 17, 2008 at 09:31:55PM +0100, Jarek Poplawski wrote:
 On Thu, Jan 17, 2008 at 02:57:36PM -0500, Alan Stern wrote:
  On Thu, 17 Jan 2008, Jarek Poplawski wrote:
  
   On Thu, Jan 17, 2008 at 10:16:30AM -0500, Alan Stern wrote:
On Thu, 17 Jan 2008, Dave Young wrote:

  Your meaning isn't clear.  Do you mean that your patch doesn't 
  generate
  any lockdep warnings at all?  Or do you mean that it generates a 
  single
  lockdep warning at boot time and then no more warnings afterward?
 
 I means the latter one.

That's very bad.

For each type of violation, lockdep only gives one error message.  So
the fact that you get one message at boot time and then no more doesn't
mean the code is almost right -- it probably means the code has lots of
errors and you're seeing only the first one.
   
   I hope it's better than this: lockdep really stops checking after first
   warning, but I've understood from David's description that after fixing
   this one place lockdep seems to be pleased.
  
  That isn't what Dave said above; he said that lockdep produces a single
  warning at bootup.  If he did mention anything about one place being
  fixed up or lockdep being pleased, it was a while back and I've lost 
  track of it.
  
  If I recall correctly the nature of the warning was that a method
  routine for one class (called with the class's mutex held) was creating
  a second class and locking that class's mutex.  In principle this is
  perfectly legal and should be allowed for arbitrary depths of nesting,
  even though it is the sort of thing lockdep is currently unable to
  handle.
 
 You are definitely right! After first reading Dave's description I got
 it the same way, but after re-reading I probably was misled with this
 thus! Only now I've had a look at this warning and there is really
 mutex_lock_nested(). Sorry Alan!

But, on the other hand, mutex_lock() is really mutex_lock_nested(), and
after second checking this lockdep warning from Jan. 3, it seems
impossible it was get after this patch...

Dave, could you please answer with full sentence if there is any lockdep
warning possible after applying these 1-7/7 patches, and if so, attach
current warning? Otherwise, I'll have apologized for this everybody from
the list soon!

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 01:31:17PM +0800, Dave Young wrote:
 On Jan 18, 2008 11:18 AM, Kay Sievers [EMAIL PROTECTED] wrote:
...
  Yeah, might be better to wait until class_device is gone, otherwise you
  may need to fix stuff that is just going to be removed. Your change to
  have iterators for the class devices look like a nice preparation for
  future changes though.
 
  Our rough plan is:
   2.6.25:
- get the ~100 patches in Greg's tree (in -mm) merged :)
   2.6.26:
  ???  - remove the 20 char limit in struct device
- get rid of struct class_device
 
 Fine, thanks.
 
 Let's wait for other people's comment.

Dave, I doubt you'll ever manage to do this if you're going to wait:
probably there will be always some new changes like this around...

IMHO, it would be nice to get the real state of current lockdep
problems here to figure out if there is any chance to do this right 
without any warnings with current lockdep. If I got it right from
earlier threads it might be impossible with USB, at least.

So, since I think these nesting levels seem to be wrong in 7/7 patch,
maybe it's better to exclude it from this patchset, and to try this as
testing for some time.

My proposal is to do the annotations with mutex_lock_nested(),
everywhere in this patch, according to 'real' relations between these
classes from thread POV, so e.g.:

mutex_lock_nested(some_class-mutex, CLASS_PARENT);
mutex_lock_nested(some_class-mutex, CLASS_CLASS);
...or more if needed:
mutex_lock_nested(some_class-mutex, CLASS_CHILD);
mutex_lock_nested(some_class-mutex, CLASS_ROOT);
...etc.

using adequate names for these enums, and only after this check what
lockdep thinks about it. Then, if there are no obvious mistakes, but
lockdep doesn't like it, send the patch and report without trying to
'silence' lockdep, so we could see what's going on, and if there are
any chances to make it right.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-17 Thread Jarek Poplawski

On Fri, Jan 18, 2008 at 09:42:25AM +0800, Dave Young wrote:
...
 After digging the class usage code again, I found that the only
 possible double lock place is the class_interface_register/unregister
 in which the class_device api could be called.

OK, but currently after using mostly:
mutex_lock(parent_class)

and once:
mutex_lock_nested(parent_class, SINGLE_DEPTH_NESTING)

lockdep mostly thinks these parent classes are 2 different objects,
with only 2 possible levels of nesting, so this parent_class has
to have wrong name (2 parents can't be locked from the same thread,
so maybe it's class_grandparent sometimes?).

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-16 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 10:27:54AM -0500, Alan Stern wrote:
> On Wed, 16 Jan 2008, Dave Young wrote:
> 
> > The lockdep warining was posted in the below thread, actually, I have
> > built and run this patced kernel for several days, there's no more
> > warnings.
> > http://lkml.org/lkml/2008/1/3/2
> 
> Your meaning isn't clear.  Do you mean that your patch doesn't generate
> any lockdep warnings at all?  Or do you mean that it generates a single
> lockdep warning at boot time and then no more warnings afterward?

So it wasn't only my sleeping problem! This:

"One lockdep warning detected as following, thus use mutex_lock_nested
 with SINGLE_DEPTH_NESTING in class_device_add"

seems to say that mutex_lock_nested is used in the patch just to remove
this one possible warning...

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-16 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 09:04:58PM +0100, Willy Tarreau wrote:
...
> you can work with latest release provided that you always have a fallback
> to an earlier one. That way, you don't bet too much on something you don't
> completely control. If it works, it tells you you'll be able to completely
> exploit its new possibilities in next product release, and if it breaks,
> it's easy to issue a fix to get back to earlier, well-tested version.

Of course, this way looks preferable, but sometimes maybe too costly,
especially with some complex systems. Actually, I don't even think
this have to be fully (production ready) implemented or workable.
Probably there would be even enough to maintain some simplified kind
of test checking how current kernel changes could affect such a system,
and how new versions of this system are better, BTW.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-16 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote:
...
> The lockdep warining was posted in the below thread, actually, I have
> built and run this patced kernel for several days, there's no more
> warnings.
> http://lkml.org/lkml/2008/1/3/2

Right... But, with something like this:

... have_some_fun(... cls)
{
mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING);
have_other_fun(cls);
mutex_unlock(>mutex);

}

... have_more_fun(...)
{
...

mutex_init(>mutex);

mutex_lock(>mutex);
have_some_fun(cls);
mutex_unlock(>mutex);
}

probably you wouldn't get any lockdep warning too...

Of course, if we know all the locking is right such proper lockdep
annotating shouldn't matter too much. (And of course this could be
improved later.)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-16 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 09:03:03AM +0800, Dave Young wrote:
...
 The lockdep warining was posted in the below thread, actually, I have
 built and run this patced kernel for several days, there's no more
 warnings.
 http://lkml.org/lkml/2008/1/3/2

Right... But, with something like this:

... have_some_fun(... cls)
{
mutex_lock_nested(cls-mutex, SINGLE_DEPTH_NESTING);
have_other_fun(cls);
mutex_unlock(cls-mutex);

}

... have_more_fun(...)
{
...

mutex_init(cls-mutex);

mutex_lock(cls-mutex);
have_some_fun(cls);
mutex_unlock(cls-mutex);
}

probably you wouldn't get any lockdep warning too...

Of course, if we know all the locking is right such proper lockdep
annotating shouldn't matter too much. (And of course this could be
improved later.)

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-16 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 09:04:58PM +0100, Willy Tarreau wrote:
...
 you can work with latest release provided that you always have a fallback
 to an earlier one. That way, you don't bet too much on something you don't
 completely control. If it works, it tells you you'll be able to completely
 exploit its new possibilities in next product release, and if it breaks,
 it's easy to issue a fix to get back to earlier, well-tested version.

Of course, this way looks preferable, but sometimes maybe too costly,
especially with some complex systems. Actually, I don't even think
this have to be fully (production ready) implemented or workable.
Probably there would be even enough to maintain some simplified kind
of test checking how current kernel changes could affect such a system,
and how new versions of this system are better, BTW.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-15 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 11:17:08AM +1100, Herbert Xu wrote:
...
> Well people are always going to operate on this model for commercial
> reasons.  FWIW I used to work for a company that stuck to a specific
> version of the Linux kernel, and I suppose I still do even now :)
> 
> But the important thing is that if you're going to do that, then the
> cost that comes with it should be borne by the company and not the
> community.

Sure. But the most sad thing is there seems to be not so much savings
in this (unless a company isn't sure of its near future). Trying to
upgrade and test current products with current kernels, even if not
necessary, should be always useful and make developing of new products
faster and better fit (and of course, BTW, make the kernel better on
time).

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-15 Thread Jarek Poplawski

On Tue, Jan 15, 2008 at 08:47:07AM -0600, Chris Friesen wrote:
> Jarek Poplawski wrote:
>
>> IMHO, checking this with a current stable, which probably you are going
>> to do some day, anyway, should be 100% acceptable: giving some input to
>> netdev, while still working for yourself.
>
> While I would love to do this, it's not that simple.

...Hmm... As a matter of fact, I expected you'd treat my point less
literally... Of course, I know it could be sometimes very hard to get
something working even after upgrading one version, let alone several
at once.

So, it was more a rhetorical trick (sorry!) to suggest, that such a
business model of being always late with kernels might be quite
practical and reasonable for many companies, but looks like the
worst possible development model for Linux.

On the other hand, it seems there are not so much, nor expensive
changes needed (a bit more perspective thinking?) to make everybody
happy...

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-15 Thread Jarek Poplawski

On Tue, Jan 15, 2008 at 05:15:27PM +0800, Dave Young wrote:
> Convert the class semaphore to mutex.
> 
> Signed-off-by: Dave Young <[EMAIL PROTECTED]> 
> 
> ---
>  drivers/base/class.c   |   38 +++---
>  drivers/base/core.c|   18 --
>  include/linux/device.h |3 ++-
>  3 files changed, 29 insertions(+), 30 deletions(-)
> 
> diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c
> --- linux/drivers/base/class.c2008-01-15 14:04:26.0 +0800
> +++ linux.new/drivers/base/class.c2008-01-15 14:04:26.0 +0800
> @@ -144,7 +144,7 @@ int class_register(struct class * cls)
>   INIT_LIST_HEAD(>devices);
>   INIT_LIST_HEAD(>interfaces);
>   kset_init(>class_dirs);
> - init_MUTEX(>sem);
> + mutex_init(>mutex);
>   error = kobject_set_name(>subsys.kobj, "%s", cls->name);
>   if (error)
>   return error;
> @@ -617,13 +617,13 @@ int class_device_add(struct class_device
>   kobject_uevent(_dev->kobj, KOBJ_ADD);
>  
>   /* notify any interfaces this device is now here */
> - down(_class->sem);
> + mutex_lock_nested(_class->mutex, SINGLE_DEPTH_NESTING);
>   list_add_tail(_dev->node, _class->children);
>   list_for_each_entry(class_intf, _class->interfaces, node) {
>   if (class_intf->add)
>   class_intf->add(class_dev, class_intf);
>   }
> - up(_class->sem);
> + mutex_unlock(_class->mutex);
>  
>   goto out1;
>  
> @@ -725,12 +725,12 @@ void class_device_del(struct class_devic
>   struct class_interface *class_intf;
>  
>   if (parent_class) {
> - down(_class->sem);
> + mutex_lock(_class->mutex);

I hope I'm wrong with this (I don't know this code at all...), and
of course I should've noticed this earlier after all, but I wonder
about this _NESTING corretness here. So, if these variables names
are right, and say about real nesting dependency, then it seems
mutex_lock_nested() should be used consistently even if (currently?)
not forced by lockdep warnings; otherwise this could possibly cover
some other warnings. Alas, if accidentally I'm right, it seems a
bit of new testing would be necessary...

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 7/7] driver-core : convert semaphore to mutex in struct class

2008-01-15 Thread Jarek Poplawski

On Tue, Jan 15, 2008 at 05:15:27PM +0800, Dave Young wrote:
 Convert the class semaphore to mutex.
 
 Signed-off-by: Dave Young [EMAIL PROTECTED] 
 
 ---
  drivers/base/class.c   |   38 +++---
  drivers/base/core.c|   18 --
  include/linux/device.h |3 ++-
  3 files changed, 29 insertions(+), 30 deletions(-)
 
 diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c
 --- linux/drivers/base/class.c2008-01-15 14:04:26.0 +0800
 +++ linux.new/drivers/base/class.c2008-01-15 14:04:26.0 +0800
 @@ -144,7 +144,7 @@ int class_register(struct class * cls)
   INIT_LIST_HEAD(cls-devices);
   INIT_LIST_HEAD(cls-interfaces);
   kset_init(cls-class_dirs);
 - init_MUTEX(cls-sem);
 + mutex_init(cls-mutex);
   error = kobject_set_name(cls-subsys.kobj, %s, cls-name);
   if (error)
   return error;
 @@ -617,13 +617,13 @@ int class_device_add(struct class_device
   kobject_uevent(class_dev-kobj, KOBJ_ADD);
  
   /* notify any interfaces this device is now here */
 - down(parent_class-sem);
 + mutex_lock_nested(parent_class-mutex, SINGLE_DEPTH_NESTING);
   list_add_tail(class_dev-node, parent_class-children);
   list_for_each_entry(class_intf, parent_class-interfaces, node) {
   if (class_intf-add)
   class_intf-add(class_dev, class_intf);
   }
 - up(parent_class-sem);
 + mutex_unlock(parent_class-mutex);
  
   goto out1;
  
 @@ -725,12 +725,12 @@ void class_device_del(struct class_devic
   struct class_interface *class_intf;
  
   if (parent_class) {
 - down(parent_class-sem);
 + mutex_lock(parent_class-mutex);

I hope I'm wrong with this (I don't know this code at all...), and
of course I should've noticed this earlier after all, but I wonder
about this _NESTING corretness here. So, if these variables names
are right, and say about real nesting dependency, then it seems
mutex_lock_nested() should be used consistently even if (currently?)
not forced by lockdep warnings; otherwise this could possibly cover
some other warnings. Alas, if accidentally I'm right, it seems a
bit of new testing would be necessary...

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-15 Thread Jarek Poplawski

On Tue, Jan 15, 2008 at 08:47:07AM -0600, Chris Friesen wrote:
 Jarek Poplawski wrote:

 IMHO, checking this with a current stable, which probably you are going
 to do some day, anyway, should be 100% acceptable: giving some input to
 netdev, while still working for yourself.

 While I would love to do this, it's not that simple.

...Hmm... As a matter of fact, I expected you'd treat my point less
literally... Of course, I know it could be sometimes very hard to get
something working even after upgrading one version, let alone several
at once.

So, it was more a rhetorical trick (sorry!) to suggest, that such a
business model of being always late with kernels might be quite
practical and reasonable for many companies, but looks like the
worst possible development model for Linux.

On the other hand, it seems there are not so much, nor expensive
changes needed (a bit more perspective thinking?) to make everybody
happy...

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-15 Thread Jarek Poplawski

On Wed, Jan 16, 2008 at 11:17:08AM +1100, Herbert Xu wrote:
...
 Well people are always going to operate on this model for commercial
 reasons.  FWIW I used to work for a company that stuck to a specific
 version of the Linux kernel, and I suppose I still do even now :)
 
 But the important thing is that if you're going to do that, then the
 cost that comes with it should be borne by the company and not the
 community.

Sure. But the most sad thing is there seems to be not so much savings
in this (unless a company isn't sure of its near future). Trying to
upgrade and test current products with current kernels, even if not
necessary, should be always useful and make developing of new products
faster and better fit (and of course, BTW, make the kernel better on
time).

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-14 Thread Jarek Poplawski

On 14-01-2008 16:58, Chris Friesen wrote:
...
> How close to bleeding edge do we need to be for it to be considered 
> acceptable to ask questions on netdev?
> 
> Given that the embedded space tends to be perpetually stuck on older 
> kernels (our "current" release is based on 2.6.14) do you have any 
> suggestion on how we can obtain information that would be available on 
> netdev if we were using newer kernels?

IMHO, checking this with a current stable, which probably you are going
to do some day, anyway, should be 100% acceptable: giving some input to
netdev, while still working for yourself.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: questions on NAPI processing latency and dropped network packets

2008-01-14 Thread Jarek Poplawski

On 14-01-2008 16:58, Chris Friesen wrote:
...
 How close to bleeding edge do we need to be for it to be considered 
 acceptable to ask questions on netdev?
 
 Given that the embedded space tends to be perpetually stuck on older 
 kernels (our current release is based on 2.6.14) do you have any 
 suggestion on how we can obtain information that would be available on 
 netdev if we were using newer kernels?

IMHO, checking this with a current stable, which probably you are going
to do some day, anyway, should be 100% acceptable: giving some input to
netdev, while still working for yourself.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-13 Thread Jarek Poplawski

On Mon, Jan 14, 2008 at 09:36:04AM +0800, Dave Young wrote:
> On Jan 13, 2008 4:11 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
...
> > Probably some tiny oversight, but I see this comment to struct class
> > doesn't mention devices list, so maybe this needs to be updated BTW?:
> >
> > (from include/linux/device.h)
> > "struct semaphoresem;/* locks both the children and 
> > interfaces lists */"
> 
> Sorry for my lazy, I think so too.
> IMHO, it should be updated after the comments.

As a matter of fact, only later I've found this question is 'fixed' in
7/7. But, actually, I was more concerned if this patch changed the
'official' policy of this sem (then it would be nice to mention about
this in a changelog) or this comment was simply incomplete.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-13 Thread Jarek Poplawski

On Mon, Jan 14, 2008 at 09:36:04AM +0800, Dave Young wrote:
 On Jan 13, 2008 4:11 AM, Jarek Poplawski [EMAIL PROTECTED] wrote:
...
  Probably some tiny oversight, but I see this comment to struct class
  doesn't mention devices list, so maybe this needs to be updated BTW?:
 
  (from include/linux/device.h)
  struct semaphoresem;/* locks both the children and 
  interfaces lists */
 
 Sorry for my lazy, I think so too.
 IMHO, it should be updated after the comments.

As a matter of fact, only later I've found this question is 'fixed' in
7/7. But, actually, I was more concerned if this patch changed the
'official' policy of this sem (then it would be nice to mention about
this in a changelog) or this comment was simply incomplete.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-12 Thread Jarek Poplawski

On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote:
> Add the following class iteration functions for driver use:
> class_for_each_device
> class_find_device
> class_for_each_child
> class_find_child
> 
> Signed-off-by: Dave Young <[EMAIL PROTECTED]> 
> 
> ---
>  drivers/base/class.c   |  159 
> +
>  include/linux/device.h |8 ++
>  2 files changed, 167 insertions(+)
> 
> diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c
> --- linux/drivers/base/class.c2008-01-12 14:42:24.0 +0800
> +++ linux.new/drivers/base/class.c2008-01-12 14:42:24.0 +0800
> @@ -798,6 +798,165 @@ void class_device_put(struct class_devic
>   kobject_put(_dev->kobj);
>  }
>  
> +/**
> + *   class_for_each_device - device iterator
> + *   @class: the class we're iterating
> + *   @data: data for the callback
> + *   @fn: function to be called for each device
> + *
> + *   Iterate over @class's list of devices, and call @fn for each,
> + *   passing it @data.
> + *
> + *   We check the return of @fn each time. If it returns anything
> + *   other than 0, we break out and return that value.
> + */
> +int class_for_each_device(struct class *class, void *data,
> +int (*fn)(struct device *, void *))
> +{
> + struct device *dev;
> + int error = 0;
> +
> + if (!class)
> + return -EINVAL;
> + down(>sem);
> + list_for_each_entry(dev, >devices, node) {

Probably some tiny oversight, but I see this comment to struct class
doesn't mention devices list, so maybe this needs to be updated BTW?:

(from include/linux/device.h)
"struct semaphoresem;/* locks both the children and 
interfaces lists */"

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-12 Thread Jarek Poplawski

On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote:
 Add the following class iteration functions for driver use:
 class_for_each_device
 class_find_device
 class_for_each_child
 class_find_child
 
 Signed-off-by: Dave Young [EMAIL PROTECTED] 
 
 ---
  drivers/base/class.c   |  159 
 +
  include/linux/device.h |8 ++
  2 files changed, 167 insertions(+)
 
 diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c
 --- linux/drivers/base/class.c2008-01-12 14:42:24.0 +0800
 +++ linux.new/drivers/base/class.c2008-01-12 14:42:24.0 +0800
 @@ -798,6 +798,165 @@ void class_device_put(struct class_devic
   kobject_put(class_dev-kobj);
  }
  
 +/**
 + *   class_for_each_device - device iterator
 + *   @class: the class we're iterating
 + *   @data: data for the callback
 + *   @fn: function to be called for each device
 + *
 + *   Iterate over @class's list of devices, and call @fn for each,
 + *   passing it @data.
 + *
 + *   We check the return of @fn each time. If it returns anything
 + *   other than 0, we break out and return that value.
 + */
 +int class_for_each_device(struct class *class, void *data,
 +int (*fn)(struct device *, void *))
 +{
 + struct device *dev;
 + int error = 0;
 +
 + if (!class)
 + return -EINVAL;
 + down(class-sem);
 + list_for_each_entry(dev, class-devices, node) {

Probably some tiny oversight, but I see this comment to struct class
doesn't mention devices list, so maybe this needs to be updated BTW?:

(from include/linux/device.h)
struct semaphoresem;/* locks both the children and 
interfaces lists */

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-09 Thread Jarek Poplawski

On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote:
...
> diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
> new file mode 100644
> index 000..495575a
> --- /dev/null
> +++ b/lib/iommu-helper.c
> @@ -0,0 +1,80 @@
> +/*
> + * IOMMU helper functions for the free area management
> + */
> +
> +#include 
> +#include 
> +
> +static unsigned long find_next_zero_area(unsigned long *map,
> +  unsigned long size,
> +  unsigned long start,
> +  unsigned int nr,
> +  unsigned long align_mask)
> +{
> + unsigned long index, end, i;
> +again:
> + index = find_next_zero_bit(map, size, start);
> +
> + /* Align allocation */
> + index = (index + align_mask) & ~align_mask;
> +
> + end = index + nr;
> + if (end >= size)
> + return -1;

This '>=' looks doubtful to me, e.g.:
map points to 0s only,  size = 64, nr = 64,
we get: index = 0; end = 64;
and: return -1 ?!

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-09 Thread Jarek Poplawski

On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote:
...
 diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
 new file mode 100644
 index 000..495575a
 --- /dev/null
 +++ b/lib/iommu-helper.c
 @@ -0,0 +1,80 @@
 +/*
 + * IOMMU helper functions for the free area management
 + */
 +
 +#include linux/module.h
 +#include linux/bitops.h
 +
 +static unsigned long find_next_zero_area(unsigned long *map,
 +  unsigned long size,
 +  unsigned long start,
 +  unsigned int nr,
 +  unsigned long align_mask)
 +{
 + unsigned long index, end, i;
 +again:
 + index = find_next_zero_bit(map, size, start);
 +
 + /* Align allocation */
 + index = (index + align_mask)  ~align_mask;
 +
 + end = index + nr;
 + if (end = size)
 + return -1;

This '=' looks doubtful to me, e.g.:
map points to 0s only,  size = 64, nr = 64,
we get: index = 0; end = 64;
and: return -1 ?!

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Top 10 kernel oopses for the week ending January 5th, 2008

2008-01-07 Thread Jarek Poplawski

On 08-01-2008 06:59, Al Viro wrote:
> On Mon, Jan 07, 2008 at 07:26:12PM -0800, Linus Torvalds wrote:
>  
>> I usually just compile a small program like
>>
>>  const char array[]="\xnn\xnn\xnn...";
>>
>>  int main(int argc, char **argv)
>>  {
>>  printf("%p\n", array);
>>  *(int *)0=0;
>>  }
> Heh.  I prefer
> char main[] = {.};
> for the same thing, with gdb a.out and no running at all.
...

IMHO, it would be really wasteful if Arian havn't published these
and maybe a few more such advices and links on this site, not
necessarily waiting for html-ized or howto-ized versions!

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-07 Thread Jarek Poplawski

On Mon, Jan 07, 2008 at 02:23:33PM +0100, Stefan Richter wrote:
> David Brownell wrote:
> > On Monday 07 January 2008, Greg KH wrote:
> >> Most of the non-driver core code should be converted to not use the
> >> lock in the class at all.  They should use a local lock instead.
> > 
> > Or better yet, that yet-to-be-written class_for_each_instance()
> > iterator ... :)
> 
> By far most of the usages of class.semaphore or class.mutex in drivers
> are to protect the class.devices list.  The only? right thing to do
> there is to keep using the class.{semaphore,mutex}.  The more elegant
> variation of this would be David's class_for_each_instance() iterator
> which would allow us to hide the locking details from the drivers.
> 
> ---
> ?)  Well, another correct thing to do would be to not take any locks or
> mutexes in the driver core but instead let the drivers do the necessary
> serialization between calls like class_device_add() and the likes and
> list iterations.  But this would complicate the API because of the
> additional locking requirements, and hence would invariably result in
> buggy usages of the API.
> -- 

I hope I'm wrong, but IMHO it should be safer not to mix such changes,
so do the mutexes first or delay them until the end. Otherwise some
false blaming seems almost inevitable.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-07 Thread Jarek Poplawski

On Mon, Jan 07, 2008 at 02:23:33PM +0100, Stefan Richter wrote:
 David Brownell wrote:
  On Monday 07 January 2008, Greg KH wrote:
  Most of the non-driver core code should be converted to not use the
  lock in the class at all.  They should use a local lock instead.
  
  Or better yet, that yet-to-be-written class_for_each_instance()
  iterator ... :)
 
 By far most of the usages of class.semaphore or class.mutex in drivers
 are to protect the class.devices list.  The only? right thing to do
 there is to keep using the class.{semaphore,mutex}.  The more elegant
 variation of this would be David's class_for_each_instance() iterator
 which would allow us to hide the locking details from the drivers.
 
 ---
 ?)  Well, another correct thing to do would be to not take any locks or
 mutexes in the driver core but instead let the drivers do the necessary
 serialization between calls like class_device_add() and the likes and
 list iterations.  But this would complicate the API because of the
 additional locking requirements, and hence would invariably result in
 buggy usages of the API.
 -- 

I hope I'm wrong, but IMHO it should be safer not to mix such changes,
so do the mutexes first or delay them until the end. Otherwise some
false blaming seems almost inevitable.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Top 10 kernel oopses for the week ending January 5th, 2008

2008-01-07 Thread Jarek Poplawski

On 08-01-2008 06:59, Al Viro wrote:
 On Mon, Jan 07, 2008 at 07:26:12PM -0800, Linus Torvalds wrote:
  
 I usually just compile a small program like

  const char array[]=\xnn\xnn\xnn...;

  int main(int argc, char **argv)
  {
  printf(%p\n, array);
  *(int *)0=0;
  }
 Heh.  I prefer
 char main[] = {.};
 for the same thing, with gdb a.out and no running at all.
...

IMHO, it would be really wasteful if Arian havn't published these
and maybe a few more such advices and links on this site, not
necessarily waiting for html-ized or howto-ized versions!

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread Jarek Poplawski

On Sun, Jan 06, 2008 at 11:30:48AM +0100, Torsten Kaiser wrote:
...
> I think this bug is highly timing dependent. Its not always the same
> package that dies and as this is a SMP system I would guess two CPUs
> using the same data will trigger this.
> And using the poison-option will definitily slow the system down and
> mess up the timings.

Of course it looks like using the same data, but it seems there is no
reason to think it needs the same time: e.g. some timer or workqueue
could retrigger after it's supposed to be killed. Any additional
debugging/poisonning might help to see it earlier, so this should be
safer for your system, but, most probably this would show data from
the damaged side, so not necessarily very helpful.

> What also speaks against the 'safer' offsets is, that after adding my
> notfreed-byte to skbuff the bug still triggered in the same way.

We are not even sure skbuffs were directly affected by this or they
were incorrectly freed because of other structures beeing damaged?

IMHO, e.g. starting your system with limited memory should cause
faster memory reclaiming, and thus more often triggering of these bugs,
but of course I can be wrong.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread Jarek Poplawski

On Sat, Jan 05, 2008 at 03:52:32PM +0100, Torsten Kaiser wrote:
...
> So my personal conclusion would be, that someone is writing to memory
> that he no longer owns. Most probably 0-bytes. (the complete_routine
> got NULLed and the warning about dst->__refcnt being 0).
> 
> Use-after-free or something else?

I agree: your conclusion seems to be the most probable explanation for
this. Then it could be really hard to solve this without bisection or
something similar. But there is some probabability this something could
try kfree later too, but simply this list debugging triggers earlier.

> > > If you think some other slub_debug might catch it, I would try this...

You can try to add "U" to these other slub_debug options. As a matter
of fact, if your above diagnose is right, it seems you risk to damage
your system or even the box with these tests, so if you want to
continue, you should probably turn any possible debugging on (not in
mm only).

BTW, you've written that some debugging options seem to delay the bug.
Since they often change sizes of some structures than such wrong
writes could have some 'safer' offsets. So, this could really delay
e.g. these list's bugs, but maybe this could also let to stay 'alive'
to such wrong kfree?

Cheers,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread Jarek Poplawski

On Sat, Jan 05, 2008 at 03:52:32PM +0100, Torsten Kaiser wrote:
...
 So my personal conclusion would be, that someone is writing to memory
 that he no longer owns. Most probably 0-bytes. (the complete_routine
 got NULLed and the warning about dst-__refcnt being 0).
 
 Use-after-free or something else?

I agree: your conclusion seems to be the most probable explanation for
this. Then it could be really hard to solve this without bisection or
something similar. But there is some probabability this something could
try kfree later too, but simply this list debugging triggers earlier.

   If you think some other slub_debug might catch it, I would try this...

You can try to add U to these other slub_debug options. As a matter
of fact, if your above diagnose is right, it seems you risk to damage
your system or even the box with these tests, so if you want to
continue, you should probably turn any possible debugging on (not in
mm only).

BTW, you've written that some debugging options seem to delay the bug.
Since they often change sizes of some structures than such wrong
writes could have some 'safer' offsets. So, this could really delay
e.g. these list's bugs, but maybe this could also let to stay 'alive'
to such wrong kfree?

Cheers,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread Jarek Poplawski

On Sun, Jan 06, 2008 at 11:30:48AM +0100, Torsten Kaiser wrote:
...
 I think this bug is highly timing dependent. Its not always the same
 package that dies and as this is a SMP system I would guess two CPUs
 using the same data will trigger this.
 And using the poison-option will definitily slow the system down and
 mess up the timings.

Of course it looks like using the same data, but it seems there is no
reason to think it needs the same time: e.g. some timer or workqueue
could retrigger after it's supposed to be killed. Any additional
debugging/poisonning might help to see it earlier, so this should be
safer for your system, but, most probably this would show data from
the damaged side, so not necessarily very helpful.

 What also speaks against the 'safer' offsets is, that after adding my
 notfreed-byte to skbuff the bug still triggered in the same way.

We are not even sure skbuffs were directly affected by this or they
were incorrectly freed because of other structures beeing damaged?

IMHO, e.g. starting your system with limited memory should cause
faster memory reclaiming, and thus more often triggering of these bugs,
but of course I can be wrong.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-05 Thread Jarek Poplawski

On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote:
> On Jan 5, 2008 1:07 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> > On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote:
> > > On Jan 4, 2008 2:30 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> > > The only thing that is sadly not practical is bisecting the borkenout
> > > mm-patches, as triggering this error is to unreliable /
> > > time-consuming.
> >
> > Right, but it seems there are these 2 main suspects here...
> >
> > > > - is it still vanilla -rc6-mm1; I've seen on kernel list you tried
> > > > some fixes around raid?
> > >
> > > Yes, without these fixes I can't boot.
> > > But they should only be run during starting the arrays, so I doubt
> > > that this is that cause.
> > > (Also -rc3-mm2 did not need this fix)
> >
> > You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes?
> 
> vanilla -rc6 is fine without these fixes.
> The raid-bugs from -rc6-mm1 are probably introduced by
> md-allow-devices-to-be-shared-between-md-arrays.patch and that patch
> is new in this mm-release.
> 
> > I think it would be easier just to start with this working -rc6 and
> > simply check if we have 'right' suspects, so: git-net.patch and
> > git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope,
> > can compile - otherwise you could try the other way: add the whole -mm
> > and revert these two). Using current gits could complicate this
> > "investigation".
> 
> OK, I will try this...

It seems that this last report gives the third one: ieee1394 to the pack,
so probably, you can hold on a "minute" - this all needs some rethinking.
(But, if you've begun with this already, let it be clear at last too.)

> > I didn't read all this thread, so probably I miss many points, but are
> > you sure there are no problems with filesystem corruption around these
> > packets or where you compile(?) them (e.g. after these raid problems)?
> 
> For my setup: It's a gentoo system, so compiling packages is the
> normal way of installing something.
> The compile itself is done on a tmpfs so a filesystem corruption there
> should be rather impossible. ;)
> (The system has 4Gb RAM, so it doesn't even need to swap)
> The sources are taken from a nfsv4 share that is served from a
> different system. Also gentoo checksums all sources it will use.

Yes, since this was no problem with vanilla 2.6.24-rc6, I've probably
gone astray...

> If you think some other slub_debug might catch it, I would try this...

OK! But, in the meantime could you send your current .config? I wonder
e.g. if there could be used this new ieee1394 code from
init_ohci1394_dma.c?
 
You are really helpful, thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-05 Thread Jarek Poplawski

On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote:
 On Jan 5, 2008 1:07 AM, Jarek Poplawski [EMAIL PROTECTED] wrote:
  On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote:
   On Jan 4, 2008 2:30 PM, Jarek Poplawski [EMAIL PROTECTED] wrote:
   The only thing that is sadly not practical is bisecting the borkenout
   mm-patches, as triggering this error is to unreliable /
   time-consuming.
 
  Right, but it seems there are these 2 main suspects here...
 
- is it still vanilla -rc6-mm1; I've seen on kernel list you tried
some fixes around raid?
  
   Yes, without these fixes I can't boot.
   But they should only be run during starting the arrays, so I doubt
   that this is that cause.
   (Also -rc3-mm2 did not need this fix)
 
  You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes?
 
 vanilla -rc6 is fine without these fixes.
 The raid-bugs from -rc6-mm1 are probably introduced by
 md-allow-devices-to-be-shared-between-md-arrays.patch and that patch
 is new in this mm-release.
 
  I think it would be easier just to start with this working -rc6 and
  simply check if we have 'right' suspects, so: git-net.patch and
  git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope,
  can compile - otherwise you could try the other way: add the whole -mm
  and revert these two). Using current gits could complicate this
  investigation.
 
 OK, I will try this...

It seems that this last report gives the third one: ieee1394 to the pack,
so probably, you can hold on a minute - this all needs some rethinking.
(But, if you've begun with this already, let it be clear at last too.)

  I didn't read all this thread, so probably I miss many points, but are
  you sure there are no problems with filesystem corruption around these
  packets or where you compile(?) them (e.g. after these raid problems)?
 
 For my setup: It's a gentoo system, so compiling packages is the
 normal way of installing something.
 The compile itself is done on a tmpfs so a filesystem corruption there
 should be rather impossible. ;)
 (The system has 4Gb RAM, so it doesn't even need to swap)
 The sources are taken from a nfsv4 share that is served from a
 different system. Also gentoo checksums all sources it will use.

Yes, since this was no problem with vanilla 2.6.24-rc6, I've probably
gone astray...

 If you think some other slub_debug might catch it, I would try this...

OK! But, in the meantime could you send your current .config? I wonder
e.g. if there could be used this new ieee1394 code from
init_ohci1394_dma.c?
 
You are really helpful, thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-04 Thread Jarek Poplawski

On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote:
> On Jan 4, 2008 2:30 PM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
...
> I'm open for any suggestions and will try to answer any questions.

I'm very glad, thanks!

> The only thing that is sadly not practical is bisecting the borkenout
> mm-patches, as triggering this error is to unreliable /
> time-consuming.

Right, but it seems there are these 2 main suspects here...

> > - is it still vanilla -rc6-mm1; I've seen on kernel list you tried
> > some fixes around raid?
> 
> Yes, without these fixes I can't boot.
> But they should only be run during starting the arrays, so I doubt
> that this is that cause.
> (Also -rc3-mm2 did not need this fix)

You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes?
I think it would be easier just to start with this working -rc6 and
simply check if we have 'right' suspects, so: git-net.patch and
git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope,
can compile - otherwise you could try the other way: add the whole -mm
and revert these two). Using current gits could complicate this
"investigation". 

> My skbuff-double-free-detector is still in there, but was never triggered.
> 
> > - could you remind this lockdep warning; is it always and the same,
> > always before crash, or no rules?
> 
> ???
> I see no lockdep warning before the crashes.
> I have seen a warning about the dst->__refcnt in dst_release and
> different warnings about list operations.
> 
> I think I have always posted everything I have seen before the
> crashes. (captured via serial console)

So, you mean there are no more of these?:
 
"looked into the log in question and the only other warning was a
 circular locking dependency that lockdep detected around 1.5 hour
 before this warning."
...
"[ 7620.845168] INFO: lockdep is turned off."

> (If you mean the lockdep-problem in -rc6: That is more or less a
> missing annotation during early bootup. The only problem with that is,
> that it will causes lockdep to be turned off and so it can not be used
> to find any real problem. A fix for that is in -mm so I do have
> lockdep on the mm-kernels)
> 
> > - I've seen you looked after double freeing, but this last debug list
> > warning could suggest locking problems during list modification too.
> 
> Yes, but Herbert mentioned double freeing a skb explicit and so I
> tried to catch this.
> I do not know enough about the network core to verify the locking of
> the involved lists.

Right, the list corruption could be because of use after freeing too.

> > - above git-nfsd and git-net tests should be probably repeated with
> > -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches
> > only, and if bug triggers, with one reversed; btw., since in previous
> > message you mentioned that 50 packages could be not enough to trigger
> > this, these 54 above could make too little margin yet.
> 
> Yes, I think I really need to redo the git-nfsd-test.
> With IOMMU_DEBUG enabled rc6-mm1worked for 52 packages, only a secound
> run of kde-packages triggered it after only 5 packages.
> I don't know what this bug hates about kdeartwork-wallpaper (triggered
> it this time) or kdeartwork-styles.

I didn't read all this thread, so probably I miss many points, but are
you sure there are no problems with filesystem corruption around these
packets or where you compile(?) them (e.g. after these raid problems)?

> Output from the crash with IOMMU_DEBUG (lockdep was enabled, but did
> not trigger):
> [15593.236374] Unable to handle kernel NULL pointer
> dereference<3>list_add corruption. prev->next should be next

Fine! I'll try to look at this. BTW, I guess/hope DEBUG_SLAB etc. are
also on...

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-04 Thread Jarek Poplawski

On 04-01-2008 11:23, Torsten Kaiser wrote:
> On Jan 2, 2008 10:51 PM, Herbert Xu <[EMAIL PROTECTED]> wrote:
>> On Wed, Jan 02, 2008 at 07:29:59PM +0100, Torsten Kaiser wrote:
>>> Vanilla 2.6.24-rc6 seems stable. I did not see any crash or warnings.
>> OK that's great.  The next step would be to try excluding specific git
>> trees from mm to see if they make a difference.
>>
>> The two specific trees of interest would be git-nfsd and git-net.
> 
> git-nfsd from git://git.linux-nfs.org/projects/bfields/linux.git#for-mm
> -> compiling and installing 54 packages worked without crashes.
> 
> git-net from 
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git
> -> compiling and installing 95 packages worked without crashes.
...
> I will enable CONFIG_IOMMU_DEBUG in -rc6-mm1 and see, as otherwise I
> have no clue where to look...

Hi,

A few questions/suggestions:

- is it still vanilla -rc6-mm1; I've seen on kernel list you tried
some fixes around raid?

- could you remind this lockdep warning; is it always and the same,
always before crash, or no rules?

- I've seen you looked after double freeing, but this last debug list
warning could suggest locking problems during list modification too.

- above git-nfsd and git-net tests should be probably repeated with
-rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches
only, and if bug triggers, with one reversed; btw., since in previous
message you mentioned that 50 packages could be not enough to trigger
this, these 54 above could make too little margin yet.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-04 Thread Jarek Poplawski

On 04-01-2008 11:23, Torsten Kaiser wrote:
 On Jan 2, 2008 10:51 PM, Herbert Xu [EMAIL PROTECTED] wrote:
 On Wed, Jan 02, 2008 at 07:29:59PM +0100, Torsten Kaiser wrote:
 Vanilla 2.6.24-rc6 seems stable. I did not see any crash or warnings.
 OK that's great.  The next step would be to try excluding specific git
 trees from mm to see if they make a difference.

 The two specific trees of interest would be git-nfsd and git-net.
 
 git-nfsd from git://git.linux-nfs.org/projects/bfields/linux.git#for-mm
 - compiling and installing 54 packages worked without crashes.
 
 git-net from 
 git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25.git
 - compiling and installing 95 packages worked without crashes.
...
 I will enable CONFIG_IOMMU_DEBUG in -rc6-mm1 and see, as otherwise I
 have no clue where to look...

Hi,

A few questions/suggestions:

- is it still vanilla -rc6-mm1; I've seen on kernel list you tried
some fixes around raid?

- could you remind this lockdep warning; is it always and the same,
always before crash, or no rules?

- I've seen you looked after double freeing, but this last debug list
warning could suggest locking problems during list modification too.

- above git-nfsd and git-net tests should be probably repeated with
-rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches
only, and if bug triggers, with one reversed; btw., since in previous
message you mentioned that 50 packages could be not enough to trigger
this, these 54 above could make too little margin yet.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-04 Thread Jarek Poplawski

On Fri, Jan 04, 2008 at 04:21:26PM +0100, Torsten Kaiser wrote:
 On Jan 4, 2008 2:30 PM, Jarek Poplawski [EMAIL PROTECTED] wrote:
...
 I'm open for any suggestions and will try to answer any questions.

I'm very glad, thanks!

 The only thing that is sadly not practical is bisecting the borkenout
 mm-patches, as triggering this error is to unreliable /
 time-consuming.

Right, but it seems there are these 2 main suspects here...

  - is it still vanilla -rc6-mm1; I've seen on kernel list you tried
  some fixes around raid?
 
 Yes, without these fixes I can't boot.
 But they should only be run during starting the arrays, so I doubt
 that this is that cause.
 (Also -rc3-mm2 did not need this fix)

You've written vanilla -rc6 is OK. Does it mean -rc6 with these fixes?
I think it would be easier just to start with this working -rc6 and
simply check if we have 'right' suspects, so: git-net.patch and
git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope,
can compile - otherwise you could try the other way: add the whole -mm
and revert these two). Using current gits could complicate this
investigation. 

 My skbuff-double-free-detector is still in there, but was never triggered.
 
  - could you remind this lockdep warning; is it always and the same,
  always before crash, or no rules?
 
 ???
 I see no lockdep warning before the crashes.
 I have seen a warning about the dst-__refcnt in dst_release and
 different warnings about list operations.
 
 I think I have always posted everything I have seen before the
 crashes. (captured via serial console)

So, you mean there are no more of these?:
 
looked into the log in question and the only other warning was a
 circular locking dependency that lockdep detected around 1.5 hour
 before this warning.
...
[ 7620.845168] INFO: lockdep is turned off.

 (If you mean the lockdep-problem in -rc6: That is more or less a
 missing annotation during early bootup. The only problem with that is,
 that it will causes lockdep to be turned off and so it can not be used
 to find any real problem. A fix for that is in -mm so I do have
 lockdep on the mm-kernels)
 
  - I've seen you looked after double freeing, but this last debug list
  warning could suggest locking problems during list modification too.
 
 Yes, but Herbert mentioned double freeing a skb explicit and so I
 tried to catch this.
 I do not know enough about the network core to verify the locking of
 the involved lists.

Right, the list corruption could be because of use after freeing too.

  - above git-nfsd and git-net tests should be probably repeated with
  -rc6-mm1 git versions: so vanilla rc6 plus both these -mm patches
  only, and if bug triggers, with one reversed; btw., since in previous
  message you mentioned that 50 packages could be not enough to trigger
  this, these 54 above could make too little margin yet.
 
 Yes, I think I really need to redo the git-nfsd-test.
 With IOMMU_DEBUG enabled rc6-mm1worked for 52 packages, only a secound
 run of kde-packages triggered it after only 5 packages.
 I don't know what this bug hates about kdeartwork-wallpaper (triggered
 it this time) or kdeartwork-styles.

I didn't read all this thread, so probably I miss many points, but are
you sure there are no problems with filesystem corruption around these
packets or where you compile(?) them (e.g. after these raid problems)?

 Output from the crash with IOMMU_DEBUG (lockdep was enabled, but did
 not trigger):
 [15593.236374] Unable to handle kernel NULL pointer
 dereference3list_add corruption. prev-next should be next

Fine! I'll try to look at this. BTW, I guess/hope DEBUG_SLAB etc. are
also on...

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 03:21:36PM +0800, Dave Young wrote:
...
> I don't know if there's other possible warning places with this mutex
> or not,  if you have any ideas about this, please tell me.

I think lockdep is just to tell such things. So, the question is, how
much it was tested already, because if there are many warnings
reported e.g. after merging to -mm, then this could be better to re-do
it this other way... But, I hope this will not be necessary.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 08:06:09AM +0100, Jarek Poplawski wrote:
> On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote:
> > Convert semaphore to mutex in struct class.
> ...
> > One lockdep warning detected as following, thus use mutex_lock_nested with 
> > SINGLE_DEPTH_NESTING in class_device_add
> > 
> > Jan  3 10:45:15 darkstar kernel: 
> > =
> > Jan  3 10:45:15 darkstar kernel: [ INFO: possible recursive locking 
> > detected ]
> > Jan  3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1
> > Jan  3 10:45:15 darkstar kernel: 
> > -
> ...
> > If there's anything missed please help to point out, thanks.
> 
> Dave, IMHO it's not 'the right' way to do it: [...]

OOPS! (I was sleeping...) Unless it has turned out it's not so hard
here, and you are quite sure there should be no more warnings after
this one nesting annotation - then of course, this is the right way!

Sorry (?)
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote:
> Convert semaphore to mutex in struct class.
...
> One lockdep warning detected as following, thus use mutex_lock_nested with 
> SINGLE_DEPTH_NESTING in class_device_add
> 
> Jan  3 10:45:15 darkstar kernel: =
> Jan  3 10:45:15 darkstar kernel: [ INFO: possible recursive locking detected ]
> Jan  3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1
> Jan  3 10:45:15 darkstar kernel: -
...
> If there's anything missed please help to point out, thanks.

Dave, IMHO it's not 'the right' way to do it: from this and earlier
discussions it seems there could be many more warnings like this one;
lockdep simply always turns itself off after first one. So, merging
your patches like this would effectively turn off lockdep for all
other places as well, maybe for a long time.

I'd suggest to try first to do it with some wrappers around mutexes,
which simply omit lockdep verification, and later try to replace them
one by one, after checking and testing there are no such warnings
anymore (which would often need some additional annotations about
nesting and probably some changes in lockdep too).

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 01/12] Use mutex instead of semaphore in driver core

2008-01-02 Thread Jarek Poplawski

On Wed, Jan 02, 2008 at 01:39:38PM +0100, Jarek Poplawski wrote:
> On 02-01-2008 08:00, Greg KH wrote:
> ...
> > If no one has noticed any issues in this area, [...]

BTW, if 'we' are sure there are no issues, and only lockdep is not
clever enough yet, why not do such a change partially, e.g. with
lockdep_on/off, at least to -mm, for testing?

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 01/12] Use mutex instead of semaphore in driver core

2008-01-02 Thread Jarek Poplawski

On 02-01-2008 08:00, Greg KH wrote:
...
> If no one has noticed any issues in this area, [...]

...Could also mean there are hidden issues, so it doesn't look like
very convincing argument.

...Unless after the change there will be found no hidden issues,
then, of course, it looks like convincing enough.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 01/12] Use mutex instead of semaphore in driver core

2008-01-02 Thread Jarek Poplawski

On 02-01-2008 08:00, Greg KH wrote:
...
 If no one has noticed any issues in this area, [...]

...Could also mean there are hidden issues, so it doesn't look like
very convincing argument.

...Unless after the change there will be found no hidden issues,
then, of course, it looks like convincing enough.

Regards,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 01/12] Use mutex instead of semaphore in driver core

2008-01-02 Thread Jarek Poplawski

On Wed, Jan 02, 2008 at 01:39:38PM +0100, Jarek Poplawski wrote:
 On 02-01-2008 08:00, Greg KH wrote:
 ...
  If no one has noticed any issues in this area, [...]

BTW, if 'we' are sure there are no issues, and only lockdep is not
clever enough yet, why not do such a change partially, e.g. with
lockdep_on/off, at least to -mm, for testing?

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote:
 Convert semaphore to mutex in struct class.
...
 One lockdep warning detected as following, thus use mutex_lock_nested with 
 SINGLE_DEPTH_NESTING in class_device_add
 
 Jan  3 10:45:15 darkstar kernel: =
 Jan  3 10:45:15 darkstar kernel: [ INFO: possible recursive locking detected ]
 Jan  3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1
 Jan  3 10:45:15 darkstar kernel: -
...
 If there's anything missed please help to point out, thanks.

Dave, IMHO it's not 'the right' way to do it: from this and earlier
discussions it seems there could be many more warnings like this one;
lockdep simply always turns itself off after first one. So, merging
your patches like this would effectively turn off lockdep for all
other places as well, maybe for a long time.

I'd suggest to try first to do it with some wrappers around mutexes,
which simply omit lockdep verification, and later try to replace them
one by one, after checking and testing there are no such warnings
anymore (which would often need some additional annotations about
nesting and probably some changes in lockdep too).

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 08:06:09AM +0100, Jarek Poplawski wrote:
 On Thu, Jan 03, 2008 at 01:50:20PM +0800, Dave Young wrote:
  Convert semaphore to mutex in struct class.
 ...
  One lockdep warning detected as following, thus use mutex_lock_nested with 
  SINGLE_DEPTH_NESTING in class_device_add
  
  Jan  3 10:45:15 darkstar kernel: 
  =
  Jan  3 10:45:15 darkstar kernel: [ INFO: possible recursive locking 
  detected ]
  Jan  3 10:45:15 darkstar kernel: 2.6.24-rc6-mm1-mutex #1
  Jan  3 10:45:15 darkstar kernel: 
  -
 ...
  If there's anything missed please help to point out, thanks.
 
 Dave, IMHO it's not 'the right' way to do it: [...]

OOPS! (I was sleeping...) Unless it has turned out it's not so hard
here, and you are quite sure there should be no more warnings after
this one nesting annotation - then of course, this is the right way!

Sorry (?)
Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-02 Thread Jarek Poplawski

On Thu, Jan 03, 2008 at 03:21:36PM +0800, Dave Young wrote:
...
 I don't know if there's other possible warning places with this mutex
 or not,  if you have any ideas about this, please tell me.

I think lockdep is just to tell such things. So, the question is, how
much it was tested already, because if there are many warnings
reported e.g. after merging to -mm, then this could be better to re-do
it this other way... But, I hope this will not be necessary.

Jarek P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6 patch] make I/O schedulers non-modular

2007-12-30 Thread Jarek Poplawski

Jarek Poplawski wrote, On 11/27/2007 11:15 PM:

> Adrian Bunk wrote, On 11/27/2007 05:47 PM:

...

>> There is nothing like a "right of choice".


(very late) PS: 

...I was a bit confused with this, wondering: so, we've envied you
(the West) this "thing" for so many years, and now it seems, you have
no idea what's this all about?! Happily it was only my English:

http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less


"Freedom of choice" was the right term!

Regards,
Jarek P.


PPS: But, of course, no need to discuss this more... unless we're
interested in the next Nobel Prize.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6 patch] make I/O schedulers non-modular

2007-12-30 Thread Jarek Poplawski

Jarek Poplawski wrote, On 11/27/2007 11:15 PM:

 Adrian Bunk wrote, On 11/27/2007 05:47 PM:

...

 There is nothing like a right of choice.


(very late) PS: 

...I was a bit confused with this, wondering: so, we've envied you
(the West) this thing for so many years, and now it seems, you have
no idea what's this all about?! Happily it was only my English:

http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less


Freedom of choice was the right term!

Regards,
Jarek P.


PPS: But, of course, no need to discuss this more... unless we're
interested in the next Nobel Prize.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange Panic (Deadlock)

2007-12-24 Thread Jarek Poplawski

[EMAIL PROTECTED] wrote, On 12/24/2007 07:18 PM:

> Hello again.
> Its bug depend to
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4aae07025265151e3f7041dfbf0f529e122de1d8
> ?


Hello Vyacheslav!

I wonder why do you think there is such a dependency, and why do you report
timer.c bug to netdev after all? I added some CCs here, but IMHO you would
better open a new bug at bugzilla.kernel.org, adding some more details like
.config, and reply back to this thread with the bug's number. BTW, if it's
patched by Gentoo or otherwise, you should try and report on 'vanilla' one
only.

Regards,
Jarek P.

> Its very critical bug to us. This PC must be HA. Server place so far  
> for me to go and reboot server. I go to it 1-3 times in week =(
> Please help to fix it =)
> 
>> Hello all. Some time machine freeze. No information on monitor. No
>> rebooting on sysctl "kernel.panic".
>> Any idea?
>>
>> Catched by netconsole:
>> [91922.085864] [ cut here ]
>> [91922.085975] kernel BUG at kernel/timer.c:606!
>> [91922.086058] invalid opcode:  [#1]
>> [91922.086127] SMP
>> [91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb
>> xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core
>> [91922.086386] CPU:1
>> [91922.086387] EIP:0060:[]Not tainted VLI
>> [91922.086389] EFLAGS: 00010087   (2.6.23-gentoo-r4-fw #4)
>> [91922.086600] EIP is at cascade+0x34/0x4f
>> [91922.086669] eax: c0452200   ebx: f450408c   ecx: 0022   edx: f3c6e08c
>> [91922.086740] esi: 0022   edi: c21ce000   ebp: 0001   esp: c21c3ef8
>> [91922.086815] ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
>> [91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000
>> task.ti=c21c2000)
>> [91922.086954] Stack: f3c6e08c c21bfb74  c21ce000 000a
>> c012767a c21af000 0001
>> [91922.087119]c21c3f18 c0106963 c21c3f68 0001 0021
>> c03c0b08 000a c0124556
>> [91922.087285]0046  c21c2008  c01245ec
>> c2015120 c0114a11 0046
>> [91922.087451] Call Trace:
>> [91922.087586]  [] run_timer_softirq+0x51/0x154
>> [91922.087669]  [] profile_pc+0x21/0x46
>> [91922.087752]  [] __do_softirq+0x5d/0xc1
>> [91922.087833]  [] do_softirq+0x32/0x36
>> [91922.087915]  [] smp_apic_timer_interrupt+0x74/0x80
>> [91922.087997]  [] apic_timer_interrupt+0x28/0x30
>> [91922.088076]  [] mwait_idle_with_hints+0x3b/0x3f
>> [91922.088162]  [] mwait_idle+0x0/0xa
>> [91922.088237]  [] cpu_idle+0x91/0xaa
>> [91922.088319]  ===
>> [91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22
>> 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8
>> 74 04 <0f> 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89
>> [91922.088864] EIP: [] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to [EMAIL PROTECTED]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> 
> 
> 
> 
> This message was sent using IMP, the Internet Messaging Program.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange Panic (Deadlock)

2007-12-24 Thread Jarek Poplawski

[EMAIL PROTECTED] wrote, On 12/24/2007 07:18 PM:

 Hello again.
 Its bug depend to
 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4aae07025265151e3f7041dfbf0f529e122de1d8
 ?


Hello Vyacheslav!

I wonder why do you think there is such a dependency, and why do you report
timer.c bug to netdev after all? I added some CCs here, but IMHO you would
better open a new bug at bugzilla.kernel.org, adding some more details like
.config, and reply back to this thread with the bug's number. BTW, if it's
patched by Gentoo or otherwise, you should try and report on 'vanilla' one
only.

Regards,
Jarek P.

 Its very critical bug to us. This PC must be HA. Server place so far  
 for me to go and reboot server. I go to it 1-3 times in week =(
 Please help to fix it =)
 
 Hello all. Some time machine freeze. No information on monitor. No
 rebooting on sysctl kernel.panic.
 Any idea?

 Catched by netconsole:
 [91922.085864] [ cut here ]
 [91922.085975] kernel BUG at kernel/timer.c:606!
 [91922.086058] invalid opcode:  [#1]
 [91922.086127] SMP
 [91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb
 xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core
 [91922.086386] CPU:1
 [91922.086387] EIP:0060:[c0127387]Not tainted VLI
 [91922.086389] EFLAGS: 00010087   (2.6.23-gentoo-r4-fw #4)
 [91922.086600] EIP is at cascade+0x34/0x4f
 [91922.086669] eax: c0452200   ebx: f450408c   ecx: 0022   edx: f3c6e08c
 [91922.086740] esi: 0022   edi: c21ce000   ebp: 0001   esp: c21c3ef8
 [91922.086815] ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
 [91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000
 task.ti=c21c2000)
 [91922.086954] Stack: f3c6e08c c21bfb74  c21ce000 000a
 c012767a c21af000 0001
 [91922.087119]c21c3f18 c0106963 c21c3f68 0001 0021
 c03c0b08 000a c0124556
 [91922.087285]0046  c21c2008  c01245ec
 c2015120 c0114a11 0046
 [91922.087451] Call Trace:
 [91922.087586]  [c012767a] run_timer_softirq+0x51/0x154
 [91922.087669]  [c0106963] profile_pc+0x21/0x46
 [91922.087752]  [c0124556] __do_softirq+0x5d/0xc1
 [91922.087833]  [c01245ec] do_softirq+0x32/0x36
 [91922.087915]  [c0114a11] smp_apic_timer_interrupt+0x74/0x80
 [91922.087997]  [c010484c] apic_timer_interrupt+0x28/0x30
 [91922.088076]  [c0102255] mwait_idle_with_hints+0x3b/0x3f
 [91922.088162]  [c0102259] mwait_idle+0x0/0xa
 [91922.088237]  [c0102398] cpu_idle+0x91/0xaa
 [91922.088319]  ===
 [91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22
 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8
 74 04 0f 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89
 [91922.088864] EIP: [c0127387] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8

 --
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


 
 
 
 
 This message was sent using IMP, the Internet Messaging Program.
 
 --
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 5 6 7 8 9 >

1 - 100 of 821 matches

Mail list logo