Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-17 Thread Jarek Poplawski
On Fri, Apr 06, 2007 at 07:19:25PM +0100, Christian Kujau wrote: On Wed, 4 Apr 2007, Christian Kujau wrote: Maybe it's a real locking problem. Here are some more suggestions for testing (if you don't find anything better): - try without SMP, so: 'acpi=off lapic nosmp' We were able to have

Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6

2007-04-18 Thread Jarek Poplawski
Hi, I didn't analyse this bug report but probably it is nearly connected with one of the bugs visible in a log from this submit: http://bugzilla.kernel.org/show_bug.cgi?id=8132 On 15-04-2007 02:50, Paul Mackerras wrote: David Miller writes: Here is Patrick McHardy's patch: So this

[PATCH -mm] workqueue: debug possible lockups in flush_workqueue

2007-04-18 Thread Jarek Poplawski
Hi, Here is my patch proposal for detecting possible lockups, when flush_workqueue caller holds a lock (e.g. rtnl_lock) also used in work functions. Regards, Jarek P. Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc6-mm1-/kernel/workqueue.c 2.6.21-rc6-mm1/kernel

[PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-19 Thread Jarek Poplawski
with very short, equal times; before flush_workqueue ends, their timers are already fired, so cancel_delayed_work has nothing to do. Maybe this patch could check, if I'm not dreaming... PS: of course the counter value below is a question of taste Signed-off-by: Jarek Poplawski [EMAIL PROTECTED

Re: [PATCH -mm] workqueue: debug possible lockups in flush_workqueue

2007-04-19 Thread Jarek Poplawski
On Thu, Apr 19, 2007 at 08:14:16AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: Here is my patch proposal for detecting possible lockups, when flush_workqueue caller holds a lock (e.g. rtnl_lock) also used in work functions. looks good in principle - did you

Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-19 Thread Jarek Poplawski
On Thu, Apr 19, 2007 at 09:32:22AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: + int i = 1000; - while (!cancel_delayed_work(dwork)) + while (!cancel_delayed_work(dwork)) { flush_workqueue(wq

[PATCH] lockdep: lookup_chain_cache comment errata

2007-04-19 Thread Jarek Poplawski
[PATCH] lockdep: lookup_chain_cache comment errata Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc7-/kernel/lockdep.c 2.6.21-rc7/kernel/lockdep.c --- 2.6.21-rc7-/kernel/lockdep.c2007-04-18 10:14:06.0 +0200 +++ 2.6.21-rc7/kernel/lockdep.c 2007-04-19 11

[PATCH] lockdep: removed unused ip argument in mark_lock mark_held_locks

2007-04-19 Thread Jarek Poplawski
It looks like a remainder from designing... (or I miss something?) PS: patched on 2.6.21-rc7 + today's lookup_chain_cache comment errata Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc7-/kernel/lockdep.c 2.6.21-rc7/kernel/lockdep.c --- 2.6.21-rc7-/kernel/lockdep.c

Re: [PATCH] Set a separate lockdep class for neighbour table's proxy_queue

2007-04-19 Thread Jarek Poplawski
On 17-04-2007 21:46, David Miller wrote: From: Pavel Emelianov [EMAIL PROTECTED] Date: Mon, 16 Apr 2007 16:08:25 +0400 Otherwise the following calltrace will lead to a wrong lockdep warning: neigh_proxy_process() `- lock(neigh_table->proxy_queue.lock); arp_redo /* via

Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-20 Thread Jarek Poplawski
On Thu, Apr 19, 2007 at 01:07:11PM -0400, Chuck Ebbert wrote: Jarek Poplawski wrote: Hi, IMHO cancel_rearming_delayed_work is dangerous place: - it assumes a work function always rearms (with no exception), which probably isn't explained enough now (but anyway should be checked

Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-20 Thread Jarek Poplawski
On Fri, Apr 20, 2007 at 12:46:18AM +1000, David Chinner wrote: On Thu, Apr 19, 2007 at 08:54:04AM +0200, Jarek Poplawski wrote: Hi, IMHO cancel_rearming_delayed_work is dangerous place: Agreed - I spent a couple of hours today learning why it can only be used on work functions

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-20 Thread Jarek Poplawski
On Thu, Apr 19, 2007 at 02:21:22PM +0400, Oleg Nesterov wrote: On 04/19, Andrew Morton wrote: Begin forwarded message: Date: Thu, 19 Apr 2007 08:54:04 +0200 From: Jarek Poplawski [EMAIL PROTECTED] To: linux-kernel@vger.kernel.org Cc: Ingo Molnar [EMAIL PROTECTED] Subject: [PATCH

Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-20 Thread Jarek Poplawski
On Fri, Apr 20, 2007 at 06:53:54PM +1000, David Chinner wrote: ... Yes, after spending another two hours working out why my fix was then hanging in cancel_rearming_delayed_work() I was a little bit annoyed at the now obviously misleading comment. Five minutes later I agree with your feelings,

[PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-20 Thread Jarek Poplawski
, if I'm not dreaming... PS: of course the counter value below is a question of taste Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc6-mm1-/kernel/workqueue.c 2.6.21-rc6-mm1/kernel/workqueue.c --- 2.6.21-rc6-mm1-/kernel/workqueue.c 2007-04-18 20:07:45.0 +0200

[PATCH] workqueue: cancel_rearming_delayed_work/workqueue usage warning

2007-04-20 Thread Jarek Poplawski
Here is my proposal to make things clearer: (this time on 2.6.21-rc7) CC: David Chinner [EMAIL PROTECTED] CC: Oleg Nesterov [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc7-/kernel/workqueue.c 2.6.21-rc7/kernel/workqueue.c --- 2.6.21-rc7-/kernel

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-23 Thread Jarek Poplawski
On Fri, Apr 20, 2007 at 09:08:36PM +0400, Oleg Nesterov wrote: On 04/20, Jarek Poplawski wrote: On Thu, Apr 19, 2007 at 02:21:22PM +0400, Oleg Nesterov wrote: ... Yes. It would be better to use cancel_work_sync() instead of flush_workqueue() to make this less possible (because

Re: [PATCH] workqueue: cancel_rearming_delayed_work/workqueue usage warning

2007-04-23 Thread Jarek Poplawski
On Fri, Apr 20, 2007 at 09:23:48PM +0400, Oleg Nesterov wrote: On 04/20, Jarek Poplawski wrote: Here is my proposal to make things clearer: (this time on 2.6.21-rc7) CC: David Chinner [EMAIL PROTECTED] CC: Oleg Nesterov [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-24 Thread Jarek Poplawski
On Mon, Apr 23, 2007 at 08:33:12PM +0400, Oleg Nesterov wrote: On 04/23, Jarek Poplawski wrote: On Fri, Apr 20, 2007 at 09:08:36PM +0400, Oleg Nesterov wrote: First, this flag should be cleared after return from cancel_rearming_delayed_work(). I think this flag, if at all

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-25 Thread Jarek Poplawski
On Tue, Apr 24, 2007 at 10:55:37PM +0400, Oleg Nesterov wrote: On 04/24, Jarek Poplawski wrote: This looks fine. Of course, it requires to remove some debugging currently done with _PENDING flag For example? Sorry!!! I don't know where I've seen those flags - maybe it's something

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-25 Thread Jarek Poplawski
2 cents more... On Tue, Apr 24, 2007 at 10:55:37PM +0400, Oleg Nesterov wrote: ... --- OLD/kernel/workqueue.c~1_CRDW 2007-04-13 17:43:23.0 +0400 +++ OLD/kernel/workqueue.c2007-04-24 22:41:15.0 +0400 @@ -242,11 +242,11 @@ static void run_workqueue(struct cpu_wor ...

Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

2007-04-25 Thread Jarek Poplawski
On Wed, Apr 25, 2007 at 02:20:38PM +0200, Jarek Poplawski wrote: 2 cents more... ... On Tue, Apr 24, 2007 at 10:55:37PM +0400, Oleg Nesterov wrote: + do { + retry = 1; Of course this'll be shorter: retry = 0; + spin_lock_irq(cwq->lock

Re: [PATCH] cancel_delayed_work: use del_timer() instead of del_timer_sync()

2007-04-25 Thread Jarek Poplawski
On Wed, Apr 25, 2007 at 01:50:34AM +0400, Oleg Nesterov wrote: del_timer_sync() buys nothing for cancel_delayed_work(), but it is less efficient since it locks the timer unconditionally, and may wait for the completion of the delayed_work_timer_fn(). I'm not sure what is the main aim of this

Re: [PATCH] Use more gcc extensions in the Linux headers

2007-03-09 Thread Jarek Poplawski
On 09-03-2007 08:52, Christoph Hellwig wrote: On Fri, Mar 09, 2007 at 04:56:32PM +1100, Rusty Russell wrote: ... +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) \ ++ sizeof(typeof(int[1 - 2*!!__builtin_types_compatible_p(typeof(arr), \ +
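The quoted macro is cut off, so what follows is only a minimal sketch of the general idea it appears to use: a negative array size forces a build error when ARRAY_SIZE() is handed a pointer instead of an array. The helper name MUST_BE_ARRAY and the sample array are illustrative assumptions, not taken from the patch, and the sketch relies on the GCC/Clang extensions __typeof__ and __builtin_types_compatible_p.

```c
#include <stddef.h>

/*
 * Hypothetical helper (the name is illustrative, not from the patch):
 * evaluates to 0 when `a` is a real array. When `a` is a pointer, the
 * two types are compatible, the array size becomes -1, and the build
 * fails.
 */
#define MUST_BE_ARRAY(a) \
	(sizeof(char[1 - 2 * !!__builtin_types_compatible_p( \
		__typeof__(a), __typeof__(&(a)[0]))]) - 1)

/* Element count of a true array, with the compile-time check folded in. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]) + MUST_BE_ARRAY(a))

/* Sample data, assumed for the example only. */
static const int primes[] = { 2, 3, 5, 7, 11 };

/* Counts the elements of the sample array via the checked macro. */
size_t primes_count(void)
{
	return ARRAY_SIZE(primes);
}
```

Passing an `int *` to this ARRAY_SIZE() should stop the build with a negative-array-size error, which is the point of the extra term.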

Re: [PATCH 1/2] NET: Multiple queue network device support

2007-03-08 Thread Jarek Poplawski
On 07-03-2007 23:42, David Miller wrote: I didn't say to use skb->priority, I said to shrink skb->priority down to a u16 and then make another u16 which will store your queue mapping value. Peter is right: this is fully used by schedulers (prio, CBQ, HTB, HFSC...) and would break users' scripts,

Re: [PATCH 1/2] NET: Multiple queue network device support

2007-03-12 Thread Jarek Poplawski
On 09-03-2007 14:40, Thomas Graf wrote: * Kok, Auke [EMAIL PROTECTED] 2007-02-08 16:09 diff --git a/net/core/dev.c b/net/core/dev.c index 455d589..42b635c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1477,6 +1477,49 @@ gso: skb->tc_verd = SET_TC_AT(skb->tc_verd,AT_EGRESS);

Re: [PATCH] [REVISED] net/ipv4/multipath_wrandom.c: check kmalloc() return value.

2007-03-12 Thread Jarek Poplawski
On 09-03-2007 08:29, David Miller wrote: From: Amit Choudhary [EMAIL PROTECTED] Date: Thu, 8 Mar 2007 23:22:15 -0800 Description: Check the return value of kmalloc() in function wrandom_set_nhinfo(), in file net/ipv4/multipath_wrandom.c. Signed-off-by: Amit Choudhary [EMAIL PROTECTED]

Re: [PATCH] [REVISED] net/ipv4/multipath_wrandom.c: check kmalloc() return value.

2007-03-12 Thread Jarek Poplawski
On Mon, Mar 12, 2007 at 02:36:46PM +0200, Pekka Enberg wrote: On 3/12/07, Jarek Poplawski [EMAIL PROTECTED] wrote: So, maybe it's less evil to check those NULLs where possible and add some WARN_ONs here and there... No, it's much better to oops rather than paper over a bug. I'm not sure I

Re: Removal of multipath cached (was Re: [PATCH] [REVISED] net/ipv4/multipath_wrandom.c: check kmalloc() return value.)

2007-03-13 Thread Jarek Poplawski
On Mon, Mar 12, 2007 at 10:22:36PM -0800, Andrew Morton wrote: On Mon, 12 Mar 2007 13:53:11 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: ... And there is absolutely no negotiations about this, I've held back on this for nearly 2 years, and nothing has happened, this code is not

[PATCH] Re: Fwd: oprofile lockdep warning on rc1

2007-03-16 Thread Jarek Poplawski
()) and from hardirq (nmi_cpu_setup()), so the lockup is possible. Reported-by: Richard Hughes [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp linux-2.6.21-rc1-/drivers/oprofile/oprofilefs.c linux-2.6.21-rc1/drivers/oprofile/oprofilefs.c --- linux-2.6.21-rc1

Re: [2.6.20] BUG: workqueue leaked lock

2007-03-16 Thread Jarek Poplawski
On 15-03-2007 20:17, Folkert van Heusden wrote: On Tue, 13 Mar 2007 17:50:14 +0100 Folkert van Heusden [EMAIL PROTECTED] wrote: ... [ 1756.728209] BUG: workqueue leaked lock or atomic: nfsd4/0x/3577 ... [ 1846.684023] [c1003bdb] kernel_thread_helper+0x7/0x10 Oleg, that's a fairly

[PATCH] lockdep: debug_locks check after check_chain_key

2007-02-12 Thread Jarek Poplawski
In __lock_acquire check_chain_key can turn off debug_locks, so check is needed to assure proper return code. Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp linux-2.6.20-git7-/kernel/lockdep.c linux-2.6.20-git7/kernel/lockdep.c --- linux-2.6.20-git7-/kernel/lockdep.c 2007-02-12

Re: [PATCH] sata_via: fix resource-managed iomap conversion

2007-02-15 Thread Jarek Poplawski
On 12-02-2007 21:10, Tejun Heo wrote: Markus Trippelsdorf wrote: On Mon, Feb 12, 2007 at 10:51:44AM -0800, Tejun Heo wrote: Conversion to resource-managed iomap was buggy causing init failures on both vt6420 and 6421 - BAR5 wasn't mapped for both controllers while on vt6420 sata_via tried to

Re: [PATCH 1/3] net/bridge/br_if.c: don't use _WORK_NAR

2007-02-19 Thread Jarek Poplawski
On Mon, Feb 19, 2007 at 12:43:59AM +0300, Oleg Nesterov wrote: Afaics, noautorel work_struct buys nothing for struct net_bridge_port. If del_nbp()->cancel_delayed_work(p->carrier_check) fails, port_carrier_check may be called later anyway. So the reading of *work in port_carrier_check() is

Re: [PATCH 1/3] net/bridge/br_if.c: don't use _WORK_NAR

2007-02-19 Thread Jarek Poplawski
On Mon, Feb 19, 2007 at 03:03:53PM +0300, Oleg Nesterov wrote: On 02/19, Jarek Poplawski wrote: ... kfree() doesn't check WORK_STRUCT_PENDING, it makes no difference if it is set or not when work->func() runs. It looks like it's to be checked before kfree. So, even if this functionality

[PATCH] Re: [2.6.20] BUG: workqueue leaked lock

2007-03-20 Thread Jarek Poplawski
(for testing) to take this into consideration. Reported-by: Folkert van Heusden [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc4-git4-/kernel/workqueue.c 2.6.21-rc4-git4/kernel/workqueue.c --- 2.6.21-rc4-git4-/kernel/workqueue.c 2007-02-04 19:44

dquot.c: possible circular locking Re: [2.6.20] BUG: workqueue leaked lock

2007-03-20 Thread Jarek Poplawski
On 15-03-2007 20:17, Folkert van Heusden wrote: On Tue, 13 Mar 2007 17:50:14 +0100 Folkert van Heusden [EMAIL PROTECTED] wrote: ... Haha ok :-) Good, since I run 2.6.20 with these debugging switches switched on, I get occasionally errors like these. I get ALWAYS the following error when

Re: dquot.c: possible circular locking Re: [2.6.20] BUG: workqueue leaked lock

2007-03-20 Thread Jarek Poplawski
On Tue, Mar 20, 2007 at 12:17:01PM +0100, Jarek Poplawski wrote: ... IMHO lockdep found that two locks are taken in different order: - #1: 1) tty_mutex in con_console() 2) dqptr_sem (somewhere later) - #0: 1) dqptr_sem 2) tty_console in dquot_alloc_space() with print_warning() Should
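The inverted ordering described in this message can be shown with a small user-space sketch. It is not how lockdep is implemented (lockdep tracks lock classes and whole dependency chains); it only records direct "B taken while holding A" pairs and flags the reverse pair. The function names are hypothetical, and the sketch assumes the "tty_console" in chain #0 means the same lock as tty_mutex in chain #1.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* seen[a][b] is true once lock b has been taken while lock a was held. */
static bool seen[MAX_LOCKS][MAX_LOCKS];

/*
 * Record "b acquired while holding a". Returns false when the reverse
 * order (a acquired while holding b) was recorded earlier, i.e. the two
 * orders taken together could deadlock.
 */
bool record_order(int a, int b)
{
	seen[a][b] = true;
	return !seen[b][a];
}

/* Lock identifiers for the two locks named in the report. */
enum { TTY_MUTEX, DQPTR_SEM };

/* Replays the two chains from the report; returns true if they conflict. */
bool report_conflicts(void)
{
	bool first_ok = record_order(TTY_MUTEX, DQPTR_SEM);   /* chain #1 */
	bool second_ok = record_order(DQPTR_SEM, TTY_MUTEX);  /* chain #0 */

	return first_ok && !second_ok;
}
```

The first chain is accepted because nothing contradicts it yet; the second is flagged because it reverses an order already on record, which is the situation the warning describes.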

Re: dquot.c: possible circular locking Re: [2.6.20] BUG: workqueue leaked lock

2007-03-20 Thread Jarek Poplawski
On Tue, Mar 20, 2007 at 01:19:09PM +0100, Jan Kara wrote: On Tue 20-03-07 12:31:51, Jarek Poplawski wrote: On Tue, Mar 20, 2007 at 12:22:53PM +0100, Jarek Poplawski wrote: On Tue, Mar 20, 2007 at 12:17:01PM +0100, Jarek Poplawski wrote: ... IMHO lockdep found that two locks are taken

Re: [PATCH] Re: [2.6.20] BUG: workqueue leaked lock

2007-03-21 Thread Jarek Poplawski
On Tue, Mar 20, 2007 at 07:07:59PM +0300, Oleg Nesterov wrote: On 03/20, Jarek Poplawski wrote: ... On Thu, 2007-03-15 at 11:06 -0800, Andrew Morton wrote: On Tue, 13 Mar 2007 17:50:14 +0100 Folkert van Heusden [EMAIL PROTECTED] wrote: ... [ 1756.728209] BUG: workqueue leaked lock

Re: [PATCH] slab: deal with NULL pointers passed to kmem_cache_free

2007-03-21 Thread Jarek Poplawski
On 20-03-2007 08:47, Pekka J Enberg wrote: On 3/19/07, Andrew Morton [EMAIL PROTECTED] wrote: ... On Tue, 20 Mar 2007, Eric Dumazet wrote: CPU: AMD64 processors, speed 1992.52 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit

Re: [PATCH] slab: deal with NULL pointers passed to kmem_cache_free

2007-03-21 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 02:13:52PM +0200, Pekka Enberg wrote: On 3/21/07, Jarek Poplawski [EMAIL PROTECTED] wrote: I think Pekka was right (it looks he changed his mind now) something should be done here. I think something like this should be a minimum: BUG_ON(!objp || virt_to_cache(objp

Re: [PATCH] slab: deal with NULL pointers passed to kmem_cache_free

2007-03-21 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 03:36:34PM +0200, Pekka J Enberg wrote: On Wed, 21 Mar 2007, Jarek Poplawski wrote: With __kmem_cache_free you would set #1 I hope, but if nobody would use this - debugging time wouldn't change. I think you got it backwards. I suggested making the _current_

Re: [PATCH] Re: [2.6.20] BUG: workqueue leaked lock

2007-03-21 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 05:46:20PM +0300, Oleg Nesterov wrote: On 03/21, Jarek Poplawski wrote: On Tue, Mar 20, 2007 at 07:07:59PM +0300, Oleg Nesterov wrote: On 03/20, Jarek Poplawski wrote: ... On Thu, 2007-03-15 at 11:06 -0800, Andrew Morton wrote: On Tue, 13 Mar 2007 17:50

[PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-21 Thread Jarek Poplawski
Here is some joke: [PATCH] lockdep: lockdep_depth vs. debug_locks lockdep really shouldn't be used when debug_locks == 0! Reported-by: Folkert van Heusden [EMAIL PROTECTED] Inspired-by: Oleg Nesterov [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21

[PATCH] lockdep: debug_show_all_locks debug_show_held_locks vs. debug_locks

2007-03-21 Thread Jarek Poplawski
-by: Folkert van Heusden [EMAIL PROTECTED] Inspired-by: Oleg Nesterov [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.21-rc4-git4-/kernel/lockdep.c 2.6.21-rc4-git4/kernel/lockdep.c --- 2.6.21-rc4-git4-/kernel/lockdep.c 2007-03-21 22:46:26.0 +0100 +++ 2.6.21

Re: [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-22 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 10:28:02PM -0800, Andrew Morton wrote: On Thu, 22 Mar 2007 07:11:19 +0100 Jarek Poplawski [EMAIL PROTECTED] wrote: Here is some joke: [PATCH] lockdep: lockdep_depth vs. debug_locks lockdep really shouldn't be used when debug_locks == 0! This isn't

Re: [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-22 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 10:28:02PM -0800, Andrew Morton wrote: ... I assume that some codepath is incrementing ->lockdep_depth even when debug_locks==0. Isn't that wrong of it? lockdep simply stops updating lockdep_depth just after (during) a bug or a WARN. Jarek P.

Re: [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-22 Thread Jarek Poplawski
On Thu, Mar 22, 2007 at 08:06:44AM +0100, Jarek Poplawski wrote: ... This should definitely solve this problem - as it was said a few times before lockdep stops registering locks after a bug, so even the lock which caused the warning isn't reported. Here lockdep found a bug in a workqueue

Re: [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-22 Thread Jarek Poplawski
On Wed, Mar 21, 2007 at 10:28:02PM -0800, Andrew Morton wrote: On Thu, 22 Mar 2007 07:11:19 +0100 Jarek Poplawski [EMAIL PROTECTED] wrote: Here is some joke: [PATCH] lockdep: lockdep_depth vs. debug_locks lockdep really shouldn't be used when debug_locks == 0! This isn't

Re: [PATCH 1/3] net/bridge/br_if.c: don't use _WORK_NAR

2007-02-20 Thread Jarek Poplawski
On Mon, Feb 19, 2007 at 06:04:45PM +0300, Oleg Nesterov wrote: On 02/19, Jarek Poplawski wrote: On Mon, Feb 19, 2007 at 03:03:53PM +0300, Oleg Nesterov wrote: On 02/19, Jarek Poplawski wrote: ... kfree() doesn't check WORK_STRUCT_PENDING, it makes no difference if it is set

Re: [PATCH] net/bridge/br_if.c: fix possible use-after-free in port_carrier_check()

2007-02-21 Thread Jarek Poplawski
On Tue, Feb 20, 2007 at 04:24:34PM -0800, Stephen Hemminger wrote: On Wed, 21 Feb 2007 01:19:41 +0300 Oleg Nesterov [EMAIL PROTECTED] wrote: If del_nbp()->cancel_delayed_work(carrier_check) fails, port_carrier_check() may run later and access an already freed container (struct

Re: [PATCH] net/bridge/br_if.c: fix possible use-after-free in port_carrier_check()

2007-02-21 Thread Jarek Poplawski
On Wed, Feb 21, 2007 at 09:23:45AM +0100, Jarek Poplawski wrote: ... I have known issues with RCU, but dare to disagree here. It's done during call_rcu, so anything RCU friendly shouldn't see this at the moment at all. It could be needed for those with refcounting - than it should be checked

Re: [RFT] bridge: eliminate port_check workqueue

2007-02-22 Thread Jarek Poplawski
On Wed, Feb 21, 2007 at 10:55:55AM -0800, Stephen Hemminger wrote: This is what I was suggesting by getting rid of the work queue completely. ... --- bridge.orig/net/bridge/br_if.c2007-02-21 10:22:46.0 -0800 +++ bridge/net/bridge/br_if.c 2007-02-21 10:53:25.0 -0800 @@

Re: [BUG][2.6.21] af_key: kernel BUG at net/core/skbuff.c:93

2007-02-27 Thread Jarek Poplawski
On 26-02-2007 23:08, Luca Tettamanti wrote: Hello, I'm running 2.6.21 (current git, at 9654640d0af). kernel blows up at startup, when running setkey. Kernel 2.6.20 runs fine. A couple of words ... [ cut here ] kernel BUG at

Re: [PATCH 2.6.20] kobject net ifindex + rename

2007-02-28 Thread Jarek Poplawski
On 28-02-2007 02:27, Jean Tourrilhes wrote: Hi all, ... Patch for 2.6.20 is attached. The patch was tested on a system running the hotplug scripts, and on another system running udev. Have fun... Jean Signed-off-by: Jean Tourrilhes [EMAIL PROTECTED]

Re: [PATCH 2.6.20] kobject net ifindex + rename

2007-02-28 Thread Jarek Poplawski
On Wed, Feb 28, 2007 at 10:34:37AM +0100, Jarek Poplawski wrote: On 28-02-2007 02:27, Jean Tourrilhes wrote: ... + /* This function is only used for network interface. +* Some hotplug package track interfaces by their name and +* therefore want to know when the name is changed

Re: [PATCH 2.6.20] kobject net ifindex + rename

2007-02-28 Thread Jarek Poplawski
On Wed, Feb 28, 2007 at 10:45:41AM -0800, Jean Tourrilhes wrote: On Wed, Feb 28, 2007 at 10:34:37AM +0100, Jarek Poplawski wrote: On 28-02-2007 02:27, Jean Tourrilhes wrote: Hi all, ... Patch for 2.6.20 is attached. The patch was tested on a system running the hotplug scripts

Re: [PATCH 2.6.20] kobject net ifindex + rename

2007-03-02 Thread Jarek Poplawski
On Thu, Mar 01, 2007 at 11:27:34AM -0800, Jean Tourrilhes wrote: On Thu, Mar 01, 2007 at 08:42:09AM +0100, Jarek Poplawski wrote: On Wed, Feb 28, 2007 at 10:45:41AM -0800, Jean Tourrilhes wrote: + + if ((size <= 0) || (i >= num_envp)) Btw.: 1. if size == 10 and snprintf

[PATCH][SCTP] Re: lockdep: inconsistent lock state ipv6_add_addr/sctp_v6_copy_addrlist (2.6.21-rc1)

2007-03-07 Thread Jarek Poplawski
in ipv6_add_addr is also taken in sctp_v6_copy_addrlist with softirqs enabled, so lockup is possible. Noticed-by: Simon Arlott [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp linux-2.6.21-rc2-mm2-/net/sctp/ipv6.c linux-2.6.21-rc2-mm2/net/sctp/ipv6.c --- linux

Re: [PATCH][SCTP] Re: lockdep: inconsistent lock state ipv6_add_addr/sctp_v6_copy_addrlist (2.6.21-rc1)

2007-03-08 Thread Jarek Poplawski
On Thu, Mar 08, 2007 at 10:02:30PM +1100, Herbert Xu wrote: On Thu, Mar 08, 2007 at 10:00:23PM +1100, Herbert Xu wrote: Who's calling ipv6_add_addr from softirq context? That's got to be wrong because ipv6_add_addr requires the RTNL. Nevermind, I was thinking of ipv6_add_dev. Anyway -

Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-03 Thread Jarek Poplawski
On 02-04-2007 21:41, Christian Kujau wrote: Hi there, we have serious problems with 2 of our servers: both shiny new amd64 dual core, with both 2GB RAM, 32bit kernel+userland (Debian/testing). Both servers have 2 NICs, RTL8139 (eth0, irq10) and RTL8169s (eth1, irq11). Hi, Did you try

Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-04 Thread Jarek Poplawski
On Tue, Apr 03, 2007 at 04:19:46PM +0100, Christian Kujau wrote: On Tue, 3 Apr 2007, Jarek Poplawski wrote: Did you try with 8139cp instead of 8139too? Tried that, 8139cp could not be loaded :( Sorry for misleading! (Maybe even try some other card to narrow the problem?) You could also

Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-05 Thread Jarek Poplawski
On Wed, Apr 04, 2007 at 02:20:23PM +0100, Christian Kujau wrote: On Wed, 4 Apr 2007, Jarek Poplawski wrote: So, it's a lot sooner than before. (BTW, isn't there anything in debug log?) No, nothing. I've set up remote-syslgging to the other node (node1 logging to node2 and vice versa

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-07 Thread Jarek Poplawski
On Mon, Jan 07, 2008 at 02:23:33PM +0100, Stefan Richter wrote: David Brownell wrote: On Monday 07 January 2008, Greg KH wrote: Most of the non-driver core code should be converted to not use the lock in the class at all. They should use a local lock instead. Or better yet, that

Re: Top 10 kernel oopses for the week ending January 5th, 2008

2008-01-07 Thread Jarek Poplawski
On 08-01-2008 06:59, Al Viro wrote: On Mon, Jan 07, 2008 at 07:26:12PM -0800, Linus Torvalds wrote: I usually just compile a small program like const char array[]="\xnn\xnn\xnn..."; int main(int argc, char **argv) { printf("%p\n", array); *(int

Re: 2.6.24-rc6-mm1

2008-01-09 Thread Jarek Poplawski
On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote: ... diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c new file mode 100644 index 000..495575a --- /dev/null +++ b/lib/iommu-helper.c @@ -0,0 +1,80 @@ +/* + * IOMMU helper functions for the free area management + */ +

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-12 Thread Jarek Poplawski
On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote: Add the following class iteration functions for driver use: class_for_each_device class_find_device class_for_each_child class_find_child Signed-off-by: Dave Young [EMAIL PROTECTED] --- drivers/base/class.c | 159

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-13 Thread Jarek Poplawski
On Mon, Jan 14, 2008 at 09:36:04AM +0800, Dave Young wrote: On Jan 13, 2008 4:11 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: ... Probably some tiny oversight, but I see this comment to struct class doesn't mention devices list, so maybe this needs to be updated BTW?: (from include/linux

Re: questions on NAPI processing latency and dropped network packets

2008-01-14 Thread Jarek Poplawski
On 14-01-2008 16:58, Chris Friesen wrote: ... How close to bleeding edge do we need to be for it to be considered acceptable to ask questions on netdev? Given that the embedded space tends to be perpetually stuck on older kernels (our current release is based on 2.6.14) do you have any

Re: PCI Interrupt

2007-09-19 Thread Jarek Poplawski
On 18-09-2007 16:42, Rafael J. Wysocki wrote: ... Hm, edge-triggered interrupts cannot be shared, AFAIK. Let's agree it's only a superstition... http://en.wikipedia.org/wiki/Edge_triggered_interrupt Regards, Jarek P.

Re: [PATCH] pci: Fix e100 interrupt quirk

2007-09-19 Thread Jarek Poplawski
On 18-09-2007 13:17, Valentine Barshak wrote: PCI memory space may have a 64-bit offset on some architectures (for example, PowerPC 440) and the actual PCI memory address has to be fixed up (an offset to PCI mem space should be added) before remapping. So, pci_iomap should be used instead of

Re: PCI Interrupt

2007-09-19 Thread Jarek Poplawski
On Wed, Sep 19, 2007 at 02:32:12PM +0200, Rafael J. Wysocki wrote: On Wednesday, 19 September 2007 12:14, Jarek Poplawski wrote: On 18-09-2007 16:42, Rafael J. Wysocki wrote: ... Hm, edge-triggered interrupts cannot be shared, AFAIK. Let's agree it's only a superstition

Re: PCI Interrupt

2007-09-19 Thread Jarek Poplawski
On 19-09-2007 13:21, Frantisek Rysanek wrote: On 19 Sep 2007 at 12:14, Jarek Poplawski wrote: On 18-09-2007 16:42, Rafael J. Wysocki wrote: Hm, edge-triggered interrupts cannot be shared, AFAIK. Let's agree it's only a superstition... http://en.wikipedia.org/wiki/Edge_triggered_interrupt

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-19 Thread Jarek Poplawski
On 18-09-2007 16:55, Nadia Derbey wrote: ... Well, reviewing the code I found another place where the rcu_read_unlock() was missing. I'm so sorry for the inconvenience. It's true that I should have tested with CONFIG_PREEMPT=y :-( Now, the ltp tests pass even with this option set... In

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-20 Thread Jarek Poplawski
On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote: Jarek Poplawski wrote: On 18-09-2007 16:55, Nadia Derbey wrote: ... Well, reviewing the code I found another place where the rcu_read_unlock() was missing. I'm so sorry for the inconvenience. It's true that I should have tested

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-20 Thread Jarek Poplawski
On Thu, Sep 20, 2007 at 09:28:21AM +0200, Jarek Poplawski wrote: On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote: ... Before Calling msg_unlock() they call ipc_rcu_getref() that increments a refcount in the rcu header for the msg structure. This guarantees

Re: PROBLEM: System Freeze on Particular workload with kernel 2.6.22.6

2007-09-20 Thread Jarek Poplawski
On 19-09-2007 21:25, Ahmed S. Darwish wrote: Hi Low, On Wed, Sep 19, 2007 at 12:16:39PM -0400, Low Yucheng wrote: There are no additional console messages. Not sure what this is: * no relevant Cc (memory management added) Relevant CCs means CCing maintainers or subsystem mailing lists

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-20 Thread Jarek Poplawski
On Thu, Sep 20, 2007 at 10:52:43AM +0200, Nadia Derbey wrote: Jarek Poplawski wrote: ... which seems to suggest out is an RCU protected pointer, so, I thought these refcounts were for something else. But, after looking at how it's used it turns out to be ~90% wrong: probably 9 out of 10

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-20 Thread Jarek Poplawski
On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote: ... So, here is the ipc_lock_by_ptr() status: 1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() call it inside a refcounting. == no rcu read section needed. 2) *_exit_ns(), ipc_findkey() and

Re: top lies ?

2007-11-12 Thread Jarek Poplawski
Tomasz Kłoczko wrote, On 11/12/2007 06:57 PM: Some data shown by the top command looks completely trashed. Fragment from top output: Mem: 2075784k total, 2053352k used, 22432k free, 19260k buffers Swap: 2096472k total, 136k used, 2096336k free, 1335080k cached PID

Re: [BUG] New Kernel Bugs

2007-11-13 Thread Jarek Poplawski
On 13-11-2007 12:15, Andrew Morton wrote: ... Zero responses from developers ... No response from developers ... Andreas did some work, seemed to lose interest. ... Rafael poked Thomas a week ago, to no effect. Thomas has been travelling. Looks like very reproducible! Maybe you should add

Re: tg3: strange errors and non-working-ness

2007-11-15 Thread Jarek Poplawski
On 13-11-2007 19:57, Jon Nelson wrote: I'm not sure if this is the right place, Me too. Looks more like acpi or pci problem. Did you try to experiment with something like: pci=noacpi or acpi=off boot parameters? Probably some point to your .config and dmesg should be useful too, so taking it to

Re: tg3: strange errors and non-working-ness

2007-11-15 Thread Jarek Poplawski
Jon Nelson wrote, On 11/15/2007 09:21 PM: ... NOTE: to avoid list noise, I can make a bug out of this on bugzilla.kernel.org and we can proceed from there if that is preferred. Why avoid list noise? These lists are made just for this. But, since this case needs a lot of space for your

Re: [PATCH] net/ipv4/arp.c: Fix arp reply when sender ip 0

2007-11-17 Thread Jarek Poplawski
Bill Fink wrote, On 11/16/2007 08:26 PM: ... Regarding the Target IP, RFC 826 says: The target protocol address is necessary in the request form of the packet so that a machine can determine whether or not to enter the sender information in a table or to send a reply.

Re: Race between generic_forget_inode() and sync_sb_inodes()?

2007-11-30 Thread Jarek Poplawski
On 30-11-2007 00:03, Neil Brown wrote: On Friday November 30, [EMAIL PROTECTED] wrote: ... Or have I just not had enough coffee this morning? :-) And I cannot even blame the lack of coffee as I don't drink it. Looks like a logical error... (Or I haven't had enough coffee this morning

Re: [PATCH] Documentation/Changes - Documentation/Requirements (resend without truncated comment text)

2007-11-29 Thread Jarek Poplawski
On 30-11-2007 04:32, H. Peter Anvin wrote: ... As far as I can tell, Documentation/Changes is the only thing we have that even attempts to document the basic requirements. This attempts to formalize that fact. Documentation/Changes | 396

Re: Need lockdep help

2007-12-03 Thread Jarek Poplawski
On 02-12-2007 20:45, Alan Stern wrote: Ingo: I ran into a lockdep reporting issue just now with some new code under development. I think it's a false positive; the question is how best to deal with it. Here's the situation. The new code runs during a system sleep (i.e., suspend or

Re: Need lockdep help

2007-12-03 Thread Jarek Poplawski
Alan Stern wrote, On 12/03/2007 04:08 PM: On Mon, 3 Dec 2007, Jarek Poplawski wrote: System sleep start: down_read(notifier-chain rwsem); call the notifier routine down_write(system_sleep_in_progress_rwsem); up_read(notifier-chain

Re: Need lockdep help

2007-12-04 Thread Jarek Poplawski
Alan Stern wrote, On 12/04/2007 04:17 PM: ... Furthermore, in this case deadlock isn't really impossible -- it could occur if there were a bug somewhere else in the kernel. So lockdep was correct to warn that deadlock might occur. Alan, if the scenario was like you described at the

Re: Need lockdep help

2007-12-04 Thread Jarek Poplawski
Alan Stern wrote, On 12/04/2007 08:28 PM: On Tue, 4 Dec 2007, Jarek Poplawski wrote: ... But you have to consider hypothetical kernel bugs. That's exactly what lockdep is for -- to warn you about possible deadlocks that could be caused by bugs. As a simple example, if thread #1 does

Re: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

2007-12-04 Thread Jarek Poplawski
Joonwoo Park wrote, On 12/04/2007 10:48 AM: Hi, dev_set_rx_mode calls __dev_set_rx_mode with softirq disabled (by netif_tx_lock_bh) therefore __dev_set_promiscuity can be called with softirq disabled. It will cause in_interrupt() to return true and ASSERT_RTNL warning. Is there a good

Re: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

2007-12-04 Thread Jarek Poplawski
On 04-12-2007 23:26, Jarek Poplawski wrote: ... But, IMHO, blowing ASSERT_RTNL up in a few places shouldn't be much worse. After all, how long should such debugging code be kept? It seems that, at least sometimes, we should be a bit more confident of how it's called. I see this won't be done

Re: Scheduler behaviour

2007-12-06 Thread Jarek Poplawski
Arjan van de Ven wrote, On 12/05/2007 10:26 PM: On Wed, 05 Dec 2007 21:15:30 +0100 Holger Wolf [EMAIL PROTECTED] wrote: ... a 2.6.23 kernel. We saw a throughput degradation from 7.2 to 23.4 this is good news! dbench rewards unfair behavior... so higher dbench usually means a worse

gitweb: kernel versions in the history (feature request, probably)

2007-11-20 Thread Jarek Poplawski
Hi, I see gitweb is much more usable (faster) than a few months ago, but there is one thing a bit problematic: in the history of patches I'm very often interested in which kernel version of Linus' tree the patch appeared for the first time. If it's not some big problem, and maybe somebody else

Re: gitweb: kernel versions in the history (feature request, probably)

2007-11-20 Thread Jarek Poplawski
Petr Baudis wrote, On 11/20/2007 10:59 PM: Hi, On Tue, Nov 20, 2007 at 03:20:42PM +0100, Jarek Poplawski wrote: I see gitweb is much more usable (faster) than a few months ago, but there is one thing a bit problematic: in the history of patches I'm very often interested in which kernel

Re: gitweb: kernel versions in the history (feature request, probably)

2007-11-20 Thread Jarek Poplawski
On Tue, Nov 20, 2007 at 10:20:09PM -0500, J. Bruce Fields wrote: On Wed, Nov 21, 2007 at 12:30:23AM +0100, Jarek Poplawski wrote: I don't know git, but it seems, at least if done for web only, this shouldn't be so 'heavy'. It could be a 'simple' translation of commit date by querying

Re: gitweb: kernel versions in the history (feature request, probably)

2007-11-21 Thread Jarek Poplawski
On Wed, Nov 21, 2007 at 08:52:17AM +0100, Jarek Poplawski wrote: ... Of course, you are right, and I probably miss something, but to be sure we think about the same thing let's look at some example: so, I open a page with current Linus' tree, go to something titled: /pub/scm / linux/kernel/git

Re: gitweb: kernel versions in the history (feature request, probably)

2007-11-21 Thread Jarek Poplawski
Kay Sievers wrote, On 11/21/2007 05:06 PM: On Nov 21, 2007 8:52 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: On Tue, Nov 20, 2007 at 10:20:09PM -0500, J. Bruce Fields wrote: On Wed, Nov 21, 2007 at 12:30:23AM +0100, Jarek Poplawski wrote: I don't know git, but it seems, at least if done

Re: gitweb: kernel versions in the history (feature request, probably)

2007-11-21 Thread Jarek Poplawski
Petr Baudis wrote, On 11/21/2007 04:18 PM: On Wed, Nov 21, 2007 at 08:52:17AM +0100, Jarek Poplawski wrote: ... tags 4 days ago v2.6.24-rc3 Linux 2.6.24-rc3 2 weeks ago v2.6.24-rc2 Linux 2.6.24-rc2 4 weeks ago v2.6.24-rc1 Linux 2.6.24-rc1 6 weeks ago v2.6.23 Linux
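
The date-to-version translation suggested earlier in this thread could look roughly like the sketch below. It is only an illustration, not gitweb code: the tag names come from the listing quoted above, the calendar dates are approximations worked out from the relative ages shown there ("4 days ago" and so on, counted back from 2007-11-21), and the function name first_tag_on_or_after is made up for this example.

```python
from bisect import bisect_left
from datetime import date

# (tag, approximate tag date), oldest first.
# ASSUMPTION: dates are estimated from the relative ages in the quoted
# gitweb listing, counted back from 2007-11-21; they are not exact.
TAGS = [
    ("v2.6.23", date(2007, 10, 10)),      # "6 weeks ago"
    ("v2.6.24-rc1", date(2007, 10, 24)),  # "4 weeks ago"
    ("v2.6.24-rc2", date(2007, 11, 7)),   # "2 weeks ago"
    ("v2.6.24-rc3", date(2007, 11, 17)),  # "4 days ago"
]


def first_tag_on_or_after(commit_date, tags=TAGS):
    """Return the oldest tag dated on or after commit_date, or None.

    Hypothetical helper: it compares dates only and knows nothing
    about the actual commit history.
    """
    tag_dates = [tag_date for _, tag_date in tags]
    index = bisect_left(tag_dates, commit_date)
    if index == len(tags):
        return None  # commit is newer than every known tag
    return tags[index][0]


if __name__ == "__main__":
    print(first_tag_on_or_after(date(2007, 11, 1)))
```

A date-only lookup like this can give the wrong answer when a patch was committed in another tree long before it was merged, so it should be read as a rough hint. On the command line, git describe --contains answers the same question from the history itself.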

Re: [2.6 patch] make I/O schedulers non-modular

2007-11-26 Thread Jarek Poplawski
On 25-11-2007 18:22, Jens Axboe wrote: On Sun, Nov 25 2007, Adrian Bunk wrote: ... Is there any technical reason why we need 4 different schedulers at all? Until we have the perfect scheduler :-) IMHO this is not enough yet. There is something called the right of choice, and, it seems,
