Re: 2.6.17-mm3

2006-07-09 Thread Herbert Xu
Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> 
> It was moved, sorry.

I fail to spot any relevant backtraces for skge or indeed any part
of the networking stack.  Ingo/Arjan, perhaps you guys can figure
out what's wrong here.

In future perhaps you should consider posting the dmesg to the list
directly.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.17-mm3

2006-07-08 Thread Michal Piotrowski
Stephen Hemminger wrote:
> On Tue, 27 Jun 2006 16:12:42 +0200
> "Michal Piotrowski" <[EMAIL PROTECTED]> wrote:
> 
>> Hi,
>>
>> On 27/06/06, Andrew Morton <[EMAIL PROTECTED]> wrote:
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm3/
>>>
>>>
>> It looks like a skge bug
>>
>> =
>> [ INFO: possible irq lock inversion dependency detected ]
>> -
>> swapper/0 just changed the state of lock:
>>  (tasklist_lock){..-?}, at: [] send_group_sig_info+0x16/0x34
>> but this lock took another, soft-irq-unsafe lock in the past:
>>   (&sig->stats_lock){--..}
>>
>> and interrupts could create inverse lock ordering between them.
>>
>>
>> other info that might help us debug this:
>> no locks held by swapper/0.
>>
>> the first lock's dependencies:
>> -> (tasklist_lock){..-?} ops: 13763 {
>>   initial-use  at:
>>    [] lock_acquire+0x60/0x80
>>    [] _write_lock_irq+0x29/0x38
>>    [] copy_process+0xea7/0x13c0
>>    [] do_fork+0x8d/0x18f
>>    [] kernel_thread+0x6c/0x74
>>    [] rest_init+0x14/0x3c
>>    [] start_kernel+0x388/0x390
>>    [] 0xc0100210
>>   in-softirq-R at:
>> [..]
>>
>> Here is a dmesg log 
>> http://www.stardust.webpages.pl/files/mm/2.6.17-mm3/mm-dmesg
> 
> 
> That file no longer exists...

It was moved, sorry.

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)


Re: 2.6.17-mm3

2006-07-07 Thread Stephen Hemminger
On Tue, 27 Jun 2006 16:12:42 +0200
"Michal Piotrowski" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> On 27/06/06, Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm3/
> >
> >
> 
> It looks like a skge bug
> 
> =
> [ INFO: possible irq lock inversion dependency detected ]
> -
> swapper/0 just changed the state of lock:
>  (tasklist_lock){..-?}, at: [] send_group_sig_info+0x16/0x34
> but this lock took another, soft-irq-unsafe lock in the past:
>   (&sig->stats_lock){--..}
> 
> and interrupts could create inverse lock ordering between them.
> 
> 
> other info that might help us debug this:
> no locks held by swapper/0.
> 
> the first lock's dependencies:
> -> (tasklist_lock){..-?} ops: 13763 {
>   initial-use  at:
>    [] lock_acquire+0x60/0x80
>    [] _write_lock_irq+0x29/0x38
>    [] copy_process+0xea7/0x13c0
>    [] do_fork+0x8d/0x18f
>    [] kernel_thread+0x6c/0x74
>    [] rest_init+0x14/0x3c
>    [] start_kernel+0x388/0x390
>    [] 0xc0100210
>   in-softirq-R at:
> [..]
> 
> Here is a dmesg log 
> http://www.stardust.webpages.pl/files/mm/2.6.17-mm3/mm-dmesg


That file no longer exists...

> Here is a config file
> http://www.stardust.webpages.pl/files/mm/2.6.17-mm3/mm-config
> 
> Regards,
> Michal
> 


-- 
If one would give me six lines written by the hand of the most honest
man, I would find something in them to have him hanged. -- Cardinal Richelieu


Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage.

2006-06-29 Thread Herbert Xu
Andrew Morton <[EMAIL PROTECTED]> wrote:
>
>> > inet_bind()
>> > ->sk_dst_get
>> >   ->read_lock(&sk->sk_dst_lock)

We are still holding the sock lock when doing sk_dst_get.

>> > > 1 lock held by java_vm/4418:
>> > >  #0:  (af_family_keys + (sk)->sk_family#4){-+..}, at: []
>> > > tcp_v6_rcv+0x308/0x7b7 [ipv6]
>> > 
>> > softirq
>> > ->ip6_dst_lookup
>> >   ->sk_dst_check
>> > ->sk_dst_reset
>> >   ->write_lock(&sk->sk_dst_lock);

The sock lock prevents this path from being entered.  Instead the
received TCP packet is queued and replayed when the sock lock is
released.
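
As a rough illustration of the deferral described above, here is a
minimal userspace sketch in C. It is not kernel code: struct sock_model,
model_lock_sock(), model_softirq_rcv(), model_release_sock() and the
fixed-size arrays are all made up for this example, and integer packet
ids stand in for skbs. It only shows the shape of the idea, assuming the
receive path checks an "owned by user" flag before doing any work on the
socket.

```c
#include <stdbool.h>
#include <stddef.h>

#define BACKLOG_MAX 16	/* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct sock_model {
	bool owned_by_user;		/* process context holds the sock lock */
	int backlog[BACKLOG_MAX];	/* packets deferred while owned */
	size_t backlog_len;
	int processed[BACKLOG_MAX];	/* packets actually handled, in order */
	size_t processed_len;
};

/* Stands in for the receive path that might end up resetting the dst. */
void model_process(struct sock_model *sk, int pkt)
{
	if (sk->processed_len < BACKLOG_MAX)
		sk->processed[sk->processed_len++] = pkt;
}

/* Process context takes ownership of the socket. */
void model_lock_sock(struct sock_model *sk)
{
	sk->owned_by_user = true;
}

/*
 * Softirq receive. Returns true if the packet was handled immediately,
 * false if it was deferred because process context owns the socket.
 * A full backlog silently drops the packet in this sketch.
 */
bool model_softirq_rcv(struct sock_model *sk, int pkt)
{
	if (sk->owned_by_user) {
		if (sk->backlog_len < BACKLOG_MAX)
			sk->backlog[sk->backlog_len++] = pkt;
		return false;
	}
	model_process(sk, pkt);
	return true;
}

/* Releasing ownership replays everything queued in the meantime. */
void model_release_sock(struct sock_model *sk)
{
	for (size_t i = 0; i < sk->backlog_len; i++)
		model_process(sk, sk->backlog[i]);
	sk->backlog_len = 0;
	sk->owned_by_user = false;
}
```

In this model a packet that arrives between model_lock_sock() and
model_release_sock() never reaches model_process() until the release,
which is why the write_lock() path cannot run underneath the holder of
the sock lock.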

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage.

2006-06-29 Thread Andrew Morton
On Thu, 29 Jun 2006 21:42:34 +0200
Arjan van de Ven <[EMAIL PROTECTED]> wrote:

> On Thu, 2006-06-29 at 12:26 -0700, Andrew Morton wrote:
> > On Thu, 29 Jun 2006 12:01:06 -0700
> > "Miles Lane" <[EMAIL PROTECTED]> wrote:
> > 
> > > [ BUG: illegal lock usage! ]
> > > 
> > 
> > This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner.
> > 
> > > illegal {softirq-on-W} -> {in-softirq-R} usage.
> > 
> > It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but
> > someone else takes read_lock(sk_dst_lock) inside softirqs.
> > 
> > > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes:
> > >  (&sk->sk_dst_lock){---?}, at: [] sk_dst_check+0x1b/0xe6
> > > {softirq-on-W} state was registered at:
> > >   [] lock_acquire+0x60/0x80
> > >   [] _write_lock+0x23/0x32
> > >   [] inet_bind+0x16c/0x1cc
> > >   [] sys_bind+0x61/0x80
> > >   [] sys_socketcall+0x7d/0x186
> > >   [] sysenter_past_esp+0x56/0x8d
> > 
> > inet_bind()
> > ->sk_dst_get
> >   ->read_lock(&sk->sk_dst_lock)
> 
> actually write_lock() not read_lock()
> 

static inline struct dst_entry *
sk_dst_get(struct sock *sk)
{
	struct dst_entry *dst;

	read_lock(&sk->sk_dst_lock);
	[..]

> > 
> > > irq event stamp: 11052
> > > hardirqs last  enabled at (11052): [] kmem_cache_alloc+0x89/0xa6
> > > hardirqs last disabled at (11051): [] kmem_cache_alloc+0x3a/0xa6
> > > softirqs last  enabled at (11040): [] dev_queue_xmit+0x224/0x24b
> > > softirqs last disabled at (11041): [] do_softirq+0x58/0xbd
> > > 
> > > other info that might help us debug this:
> > > 1 lock held by java_vm/4418:
> > >  #0:  (af_family_keys + (sk)->sk_family#4){-+..}, at: []
> > > tcp_v6_rcv+0x308/0x7b7 [ipv6]
> > 
> > softirq
> > ->ip6_dst_lookup
> >   ->sk_dst_check
> > ->sk_dst_reset
> >   ->write_lock(&sk->sk_dst_lock);
> 
> write_lock.. or read_lock() ? 

static inline void
sk_dst_reset(struct sock *sk)
{
	write_lock(&sk->sk_dst_lock);
	[..]

> > 
> > > stack backtrace:
> > >  [] show_trace_log_lvl+0x54/0xfd
> > >  [] show_trace+0xd/0x10
> > >  [] dump_stack+0x19/0x1b
> > >  [] print_usage_bug+0x1cc/0x1d9
> > >  [] mark_lock+0x193/0x360
> > >  [] __lock_acquire+0x3b7/0x970
> > >  [] lock_acquire+0x60/0x80
> > >  [] _read_lock+0x23/0x32
> 
> backtrace says read lock to me ...
> > >  [] sk_dst_check+0x1b/0xe6
> > >  [] ip6_dst_lookup+0x31/0x172 [ipv6]
> > >  [] tcp_v6_send_synack+0x10f/0x238 [ipv6]
> > >  [] tcp_v6_conn_request+0x281/0x2c7 [ipv6]
> > >  [] tcp_rcv_state_process+0x5d/0xbde
> 
> 
> > So the allegation is that if a softirq runs sk_dst_reset() while
> > process-context code is running sk_dst_set(), we'll do write_lock() while
> > holding read_lock().  
> 
> hmm or...
> 
> we're doing a write_lock(), then an interrupt can happen that triggers
> the softirq that triggers the read_lock(), which will deadlock because
> we interrupted the writer...
> 

either way ain't good ;)



Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage.

2006-06-29 Thread Arjan van de Ven
On Thu, 2006-06-29 at 12:26 -0700, Andrew Morton wrote:
> On Thu, 29 Jun 2006 12:01:06 -0700
> "Miles Lane" <[EMAIL PROTECTED]> wrote:
> 
> > [ BUG: illegal lock usage! ]
> > 
> 
> This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner.
> 
> > illegal {softirq-on-W} -> {in-softirq-R} usage.
> 
> It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but
> someone else takes read_lock(sk_dst_lock) inside softirqs.
> 
> > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes:
> >  (&sk->sk_dst_lock){---?}, at: [] sk_dst_check+0x1b/0xe6
> > {softirq-on-W} state was registered at:
> >   [] lock_acquire+0x60/0x80
> >   [] _write_lock+0x23/0x32
> >   [] inet_bind+0x16c/0x1cc
> >   [] sys_bind+0x61/0x80
> >   [] sys_socketcall+0x7d/0x186
> >   [] sysenter_past_esp+0x56/0x8d
> 
>   inet_bind()
>   ->sk_dst_get
> ->read_lock(&sk->sk_dst_lock)

actually write_lock() not read_lock()


> 
> > irq event stamp: 11052
> > hardirqs last  enabled at (11052): [] kmem_cache_alloc+0x89/0xa6
> > hardirqs last disabled at (11051): [] kmem_cache_alloc+0x3a/0xa6
> > softirqs last  enabled at (11040): [] dev_queue_xmit+0x224/0x24b
> > softirqs last disabled at (11041): [] do_softirq+0x58/0xbd
> > 
> > other info that might help us debug this:
> > 1 lock held by java_vm/4418:
> >  #0:  (af_family_keys + (sk)->sk_family#4){-+..}, at: []
> > tcp_v6_rcv+0x308/0x7b7 [ipv6]
> 
>   softirq
>   ->ip6_dst_lookup
> ->sk_dst_check
>   ->sk_dst_reset
> ->write_lock(&sk->sk_dst_lock);

write_lock.. or read_lock() ? 

> 
> > stack backtrace:
> >  [] show_trace_log_lvl+0x54/0xfd
> >  [] show_trace+0xd/0x10
> >  [] dump_stack+0x19/0x1b
> >  [] print_usage_bug+0x1cc/0x1d9
> >  [] mark_lock+0x193/0x360
> >  [] __lock_acquire+0x3b7/0x970
> >  [] lock_acquire+0x60/0x80
> >  [] _read_lock+0x23/0x32

backtrace says read lock to me ...
> >  [] sk_dst_check+0x1b/0xe6
> >  [] ip6_dst_lookup+0x31/0x172 [ipv6]
> >  [] tcp_v6_send_synack+0x10f/0x238 [ipv6]
> >  [] tcp_v6_conn_request+0x281/0x2c7 [ipv6]
> >  [] tcp_rcv_state_process+0x5d/0xbde


> So the allegation is that if a softirq runs sk_dst_reset() while
> process-context code is running sk_dst_set(), we'll do write_lock() while
> holding read_lock().  

hmm or...

we're doing a write_lock(), then an interrupt can happen that triggers
the softirq that triggers the read_lock(), which will deadlock because
we interrupted the writer...
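
The rule lockdep is applying in this report can be modelled in a few
lines of userspace C. This is only a sketch: struct lock_model,
model_acquire(), enum ctx and the four usage bits are invented for
illustration and are not the kernel's lockdep API. The conflict table
assumes rwlock semantics in which readers may nest but a writer excludes
everyone, so a softirq reader that interrupts a writer on the same CPU
spins forever.

```c
#include <stdbool.h>

/* Context an acquisition happens in (hypothetical names). */
enum ctx {
	CTX_PROCESS_BH_ON,	/* process context, softirqs enabled */
	CTX_PROCESS_BH_OFF,	/* process context, softirqs disabled */
	CTX_SOFTIRQ		/* running inside a softirq */
};

/* Usage bits remembered for a lock, loosely named after the report. */
enum usage_bit {
	IN_SOFTIRQ_W = 1u << 0,	/* write-locked from a softirq */
	IN_SOFTIRQ_R = 1u << 1,	/* read-locked from a softirq */
	SOFTIRQ_ON_W = 1u << 2,	/* write-locked with softirqs enabled */
	SOFTIRQ_ON_R = 1u << 3	/* read-locked with softirqs enabled */
};

struct lock_model {
	unsigned int usage;	/* OR of every usage bit seen so far */
};

/* Which earlier usages make a new usage bit unsafe. */
unsigned int conflicts_with(unsigned int bit)
{
	switch (bit) {
	case IN_SOFTIRQ_W: return SOFTIRQ_ON_W | SOFTIRQ_ON_R;
	case IN_SOFTIRQ_R: return SOFTIRQ_ON_W;
	case SOFTIRQ_ON_W: return IN_SOFTIRQ_W | IN_SOFTIRQ_R;
	case SOFTIRQ_ON_R: return IN_SOFTIRQ_W;
	default:           return 0;
	}
}

/*
 * Record one acquisition. Returns true if it is compatible with all
 * usage seen before, false if the combination could deadlock.
 * The new usage bit is remembered either way.
 */
bool model_acquire(struct lock_model *lk, bool write, enum ctx ctx)
{
	unsigned int bit = 0;

	if (ctx == CTX_SOFTIRQ)
		bit = write ? IN_SOFTIRQ_W : IN_SOFTIRQ_R;
	else if (ctx == CTX_PROCESS_BH_ON)
		bit = write ? SOFTIRQ_ON_W : SOFTIRQ_ON_R;
	/* CTX_PROCESS_BH_OFF leaves no softirq-related usage bit. */

	bool ok = (lk->usage & conflicts_with(bit)) == 0;

	lk->usage |= bit;
	return ok;
}
```

Starting from a zeroed lock_model, a write acquisition in
CTX_PROCESS_BH_ON followed by a read acquisition in CTX_SOFTIRQ makes
the second call return false, which corresponds to the
{softirq-on-W} -> {in-softirq-R} complaint. If the write is taken in
CTX_PROCESS_BH_OFF instead, the later softirq read is accepted.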





Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage.

2006-06-29 Thread Andrew Morton
On Thu, 29 Jun 2006 12:01:06 -0700
"Miles Lane" <[EMAIL PROTECTED]> wrote:

> [ BUG: illegal lock usage! ]
> 

This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner.

> illegal {softirq-on-W} -> {in-softirq-R} usage.

It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but
someone else takes read_lock(sk_dst_lock) inside softirqs.

> java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (&sk->sk_dst_lock){---?}, at: [] sk_dst_check+0x1b/0xe6
> {softirq-on-W} state was registered at:
>   [] lock_acquire+0x60/0x80
>   [] _write_lock+0x23/0x32
>   [] inet_bind+0x16c/0x1cc
>   [] sys_bind+0x61/0x80
>   [] sys_socketcall+0x7d/0x186
>   [] sysenter_past_esp+0x56/0x8d

inet_bind()
->sk_dst_get
  ->read_lock(&sk->sk_dst_lock)

> irq event stamp: 11052
> hardirqs last  enabled at (11052): [] kmem_cache_alloc+0x89/0xa6
> hardirqs last disabled at (11051): [] kmem_cache_alloc+0x3a/0xa6
> softirqs last  enabled at (11040): [] dev_queue_xmit+0x224/0x24b
> softirqs last disabled at (11041): [] do_softirq+0x58/0xbd
> 
> other info that might help us debug this:
> 1 lock held by java_vm/4418:
>  #0:  (af_family_keys + (sk)->sk_family#4){-+..}, at: []
> tcp_v6_rcv+0x308/0x7b7 [ipv6]

softirq
->ip6_dst_lookup
  ->sk_dst_check
->sk_dst_reset
  ->write_lock(&sk->sk_dst_lock);

> stack backtrace:
>  [] show_trace_log_lvl+0x54/0xfd
>  [] show_trace+0xd/0x10
>  [] dump_stack+0x19/0x1b
>  [] print_usage_bug+0x1cc/0x1d9
>  [] mark_lock+0x193/0x360
>  [] __lock_acquire+0x3b7/0x970
>  [] lock_acquire+0x60/0x80
>  [] _read_lock+0x23/0x32
>  [] sk_dst_check+0x1b/0xe6
>  [] ip6_dst_lookup+0x31/0x172 [ipv6]
>  [] tcp_v6_send_synack+0x10f/0x238 [ipv6]
>  [] tcp_v6_conn_request+0x281/0x2c7 [ipv6]
>  [] tcp_rcv_state_process+0x5d/0xbde
>  [] tcp_v6_do_rcv+0x26d/0x384 [ipv6]
>  [] tcp_v6_rcv+0x75d/0x7b7 [ipv6]
>  [] ip6_input+0x201/0x2d1 [ipv6]
>  [] ipv6_rcv+0x190/0x1bf [ipv6]
>  [] netif_receive_skb+0x2e6/0x37f
>  [] process_backlog+0x80/0x112
>  [] net_rx_action+0x8b/0x1e8
>  [] __do_softirq+0x55/0xb0
>  [] do_softirq+0x58/0xbd
>  [] local_bh_enable+0xd0/0x107
>  [] dev_queue_xmit+0x224/0x24b
>  [] neigh_resolve_output+0x1e2/0x20e
>  [] ip6_output2+0x1de/0x1fc [ipv6]
>  [] ip6_output+0x69c/0x6c6 [ipv6]
>  [] ip6_xmit+0x22b/0x295 [ipv6]
>  [] inet6_csk_xmit+0x200/0x20e [ipv6]
>  [] tcp_transmit_skb+0x5de/0x60c
>  [] tcp_connect+0x2bb/0x31a
>  [] tcp_v6_connect+0x520/0x655 [ipv6]
>  [] inet_stream_connect+0x83/0x20f
>  [] sys_connect+0x67/0x84
>  [] sys_socketcall+0x8c/0x186
>  [] sysenter_past_esp+0x56/0x8d

So the allegation is that if a softirq runs sk_dst_reset() while
process-context code is running sk_dst_set(), we'll do write_lock() while
holding read_lock().  


Re: 2.6.17-mm3 -- NULL pointer dereference at virtual address 00000020 / EIP is at prism2_registers_proc_read+0x22/0x2ff [hostap_cs]

2006-06-28 Thread Andrew Morton
"Miles Lane" <[EMAIL PROTECTED]> wrote:
>
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000020
>  printing eip:
> f8d21f6e
> *pde = 
> Oops:  [#1]
> 4K_STACKS PREEMPT
> last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
> Modules linked in: sg sd_mod usb_storage libusual pcnet_cs 8390
> aha152x_cs scsi_transport_spi ohci_hcd hostap_cs hostap binfmt_misc
> i915 drm ipv6 speedstep_centrino cpufreq_powersave cpufreq_performance
> cpufreq_conservative video thermal button nls_ascii nls_cp437 vfat fat
> nls_utf8 ntfs nls_base md_mod sr_mod sbp2 scsi_mod parport_pc lp
> parport snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss
> snd_mixer_oss snd_pcm snd_timer ehci_hcd pcspkr evdev iTCO_wdt sdhci
> mmc_core uhci_hcd usbcore psmouse snd ipw2200 rtc intel_agp agpgart
> ohci1394 ieee1394 soundcore snd_page_alloc 8139too
> CPU:0
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00210246   (2.6.17-mm3miles #15)
> EIP is at prism2_registers_proc_read+0x22/0x2ff [hostap_cs]
> eax:    ebx: f8d21f4c   ecx:    edx: e884ef64
> esi: d5bd04e4   edi: db8ce000   ebp: e884ef38   esp: e884ef2c
> ds: 007b   es: 007b   ss: 0068
> Process cat (pid: 2219, ti=e884e000 task=cdd3e870 task.ti=e884e000)
> Stack: f8d21f4c 0400 db8ce000 e884ef78 c1090c8b 0400 e884ef68 d5bd04e4
>0400 0806c000 cac1123c  0400 f7b6b838  
>dc8ff844 c1090b8c 0806c000 e884ef94 c105fe38 e884efa0 0400 dc8ff844
> Call Trace:
>  [] proc_file_read+0xff/0x218
>  [] vfs_read+0xa9/0x158
>  [] sys_read+0x3b/0x60
>  [] sysenter_past_esp+0x56/0x8d
> Code: c8 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 57 56 53 89 c7 8b 75 10 85
> c9 74 10 8b 45 0c c7 00 01 00 00 00 31 c0 e9 d8 02 00 00 8b 46 14 <8b>
> 50 20 66 ed 0f b7 c0 50 68 03 4a d2 f8 57 e8 42 46 3d c8 8d
> EIP: [] prism2_registers_proc_read+0x22/0x2ff [hostap_cs]
> SS:ESP 0068:e884ef2c

local_info_t.dev is NULL in prism2_registers_proc_read().

Can you please provide a step-by-step means by which others can reproduce
this?



Re: 2.6.17-mm3

2006-06-27 Thread Michal Piotrowski

Hi,

On 27/06/06, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm3/
>

It looks like a skge bug

=
[ INFO: possible irq lock inversion dependency detected ]
-
swapper/0 just changed the state of lock:
(tasklist_lock){..-?}, at: [] send_group_sig_info+0x16/0x34
but this lock took another, soft-irq-unsafe lock in the past:
 (&sig->stats_lock){--..}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
no locks held by swapper/0.

the first lock's dependencies:
-> (tasklist_lock){..-?} ops: 13763 {
  initial-use  at:
   [] lock_acquire+0x60/0x80
   [] _write_lock_irq+0x29/0x38
   [] copy_process+0xea7/0x13c0
   [] do_fork+0x8d/0x18f
   [] kernel_thread+0x6c/0x74
   [] rest_init+0x14/0x3c
   [] start_kernel+0x388/0x390
   [] 0xc0100210
  in-softirq-R at:
[..]

Here is a dmesg log http://www.stardust.webpages.pl/files/mm/2.6.17-mm3/mm-dmesg

Here is a config file
http://www.stardust.webpages.pl/files/mm/2.6.17-mm3/mm-config

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)