Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi!

> > > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > > cycles... so it is probably unrelated to the previous crash.
> > >
> > > can you try to reproduce this with 2.6.20-rc2 as well.
> >
> > Yep, here it is, reproduced on 6-th-or-so suspend.
> >
> > bluetooth may need to be actively used in order for this to trigger;
> > connecting to the net over my cellphone seems to work okay.
> >
> > (Full logs in attachment).
>
> Is this issue also present in 2.6.19 or is it a regression?

Not sure... but I know there were some bluetooth & suspend problems before.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
On Mon, Dec 25, 2006 at 12:36:47AM +0100, Pavel Machek wrote:
> On Sun 2006-12-24 15:39:23, Marcel Holtmann wrote:
> > Hi Pavel,
> >
> > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > related; it also might be something with bluetooth; I already know it
> > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > error path?
> > >
> > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > cycles... so it is probably unrelated to the previous crash.
> >
> > can you try to reproduce this with 2.6.20-rc2 as well.
>
> Yep, here it is, reproduced on 6-th-or-so suspend.
>
> bluetooth may need to be actively used in order for this to trigger;
> connecting to the net over my cellphone seems to work okay.
>
> (Full logs in attachment).

Is this issue also present in 2.6.19 or is it a regression?

> Pavel

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi Pavel,

> > > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > > related; it also might be something with bluetooth; I already know it
> > > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > > error path?
> > > >
> > > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > > cycles... so it is probably unrelated to the previous crash.
> > >
> > > can you try to reproduce this with 2.6.20-rc2 as well.
> >
> > (reproduced in another mail).
> >
> > 	_urb_queue_tail(__pending_q(husb, _urb->type), _urb);
> > 	err = usb_submit_urb(urb, GFP_ATOMIC);
> > 	if (err) {
> > 		BT_ERR("%s tx submit failed urb %p type %d err %d",
> > 				husb->hdev->name, urb, _urb->type, err);
> > 		_urb_unlink(_urb);
> >
> > 		~~
> > Do we need to remove urb from pending_q here?
> >
> > 		_urb_queue_tail(__completed_q(husb, _urb->type), _urb);
> > 	} else
> > 		atomic_inc(__pending_tx(husb, _urb->type));
> >
>
> Any news? Should I convert above idea to a patch? Or should I make
> bluetooth suspend() routine return error so corruption is impossible
> to hit?

to be honest, I have no idea. This code is way too ugly anyway.

Regards

Marcel
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi!

> > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > related; it also might be something with bluetooth; I already know it
> > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > error path?
> > >
> > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > cycles... so it is probably unrelated to the previous crash.
> >
> > can you try to reproduce this with 2.6.20-rc2 as well.
>
> (reproduced in another mail).
>
> 	_urb_queue_tail(__pending_q(husb, _urb->type), _urb);
> 	err = usb_submit_urb(urb, GFP_ATOMIC);
> 	if (err) {
> 		BT_ERR("%s tx submit failed urb %p type %d err %d",
> 				husb->hdev->name, urb, _urb->type, err);
> 		_urb_unlink(_urb);
>
> 		~~
> Do we need to remove urb from pending_q here?
>
> 		_urb_queue_tail(__completed_q(husb, _urb->type), _urb);
> 	} else
> 		atomic_inc(__pending_tx(husb, _urb->type));
>

Any news? Should I convert above idea to a patch? Or should I make
bluetooth suspend() routine return error so corruption is impossible
to hit?

Pavel
-- 
Thanks for all the (sleeping) penguins.
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi!

> > > I got this nasty oops while playing with debugger. Not sure if that is
> > > related; it also might be something with bluetooth; I already know it
> > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > error path?
> >
> > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > cycles... so it is probably unrelated to the previous crash.
>
> can you try to reproduce this with 2.6.20-rc2 as well.

(reproduced in another mail).

	_urb_queue_tail(__pending_q(husb, _urb->type), _urb);
	err = usb_submit_urb(urb, GFP_ATOMIC);
	if (err) {
		BT_ERR("%s tx submit failed urb %p type %d err %d",
				husb->hdev->name, urb, _urb->type, err);
		_urb_unlink(_urb);

		~~
Do we need to remove urb from pending_q here?

		_urb_queue_tail(__completed_q(husb, _urb->type), _urb);
	} else
		atomic_inc(__pending_tx(husb, _urb->type));

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
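The bookkeeping in the quoted error path can be modelled in userspace to check where a URB ends up when the submit fails. This is only a sketch under stated assumptions: `model_queue`, `model_urb`, `model_dev` and the `model_*` helpers are hypothetical stand-ins, not the real hci_usb structures, and `submit_err` merely simulates the return value of usb_submit_urb(). The -19 reported in the posted log corresponds to -ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's queues; not the real hci_usb types. */
struct model_queue {
	int len;			/* number of URBs currently queued */
};

struct model_urb {
	struct model_queue *queue;	/* queue holding this URB, or NULL */
};

struct model_dev {
	struct model_queue pending;
	struct model_queue completed;
	int pending_tx;			/* mirrors the pending-TX counter */
};

static void model_queue_tail(struct model_queue *q, struct model_urb *u)
{
	assert(u->queue == NULL);	/* a URB may sit on only one queue */
	u->queue = q;
	q->len++;
}

static void model_unlink(struct model_urb *u)
{
	if (u->queue) {
		u->queue->len--;
		u->queue = NULL;
	}
}

/* Same ordering as the quoted snippet: queue as pending, submit, then
 * either move to the completed queue (failure) or bump the counter. */
static int model_tx_submit(struct model_dev *d, struct model_urb *u,
			   int submit_err)
{
	model_queue_tail(&d->pending, u);
	if (submit_err) {
		model_unlink(u);
		model_queue_tail(&d->completed, u);
	} else {
		d->pending_tx++;
	}
	return submit_err;
}
```

In this model a failed submit leaves the URB on the completed queue only, with the pending queue and counter untouched. Whether the real _urb_unlink() gives the same guarantee under the driver's locking is exactly the open question above; the model does not answer it.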
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
On Sun 2006-12-24 15:39:23, Marcel Holtmann wrote:
> Hi Pavel,
>
> > > I got this nasty oops while playing with debugger. Not sure if that is
> > > related; it also might be something with bluetooth; I already know it
> > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > error path?
> >
> > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > cycles... so it is probably unrelated to the previous crash.
>
> can you try to reproduce this with 2.6.20-rc2 as well.

Yep, here it is, reproduced on 6-th-or-so suspend.

bluetooth may need to be actively used in order for this to trigger;
connecting to the net over my cellphone seems to work okay.

(Full logs in attachment).

Pavel

Linux version 2.6.20-rc2 ([EMAIL PROTECTED]) (gcc version 4.0.4 20060507 (prerelease) (Debian 4.0.3-3)) #383 SMP Fri Dec 22 11:30:05 CET 2006
...
system 00:00: resuming
pnp 00:01: resuming
system 00:02: resuming
pnp 00:03: resuming
pnp 00:04: resuming
pnp 00:05: resuming
pnp 00:06: resuming
pnp 00:07: resuming
i8042 kbd 00:08: resuming
pnp: Device 00:08 does not support activation.
i8042 aux 00:09: resuming
pnp: Device 00:09 does not support activation.
pnp 00:0a: resuming
pnp 00:0b: resuming
platform bluetooth: resuming
pcspkr pcspkr: resuming
vesafb vesafb.0: resuming
serial8250 serial8250: resuming
usb usb1: resuming
usb usb3: resuming
ata2: SATA link down (SStatus 0 SControl 0)
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
hub 1-0:1.0: resuming
hub 3-0:1.0: resuming
i8042 i8042: resuming
atkbd serio0: resuming
psmouse serio1: resuming
usb usb4: resuming
usb usb5: resuming
hub 4-0:1.0: resuming
hub 5-0:1.0: resuming
usb usb2: resuming
hub 2-0:1.0: resuming
mmcblk mmc0:cc53: resuming
sd 0:0:0:0: resuming
usb 3-2: resuming
usbdev3.8_ep00: PM: resume from 0, parent 3-2 still 2
usb 3-2:1.0: PM: resume from 2, parent 3-2 still 2
usb 3-2:1.0: resuming
usbdev3.8_ep81: PM: resume from 0, parent 3-2:1.0 still 2
usbdev3.8_ep02: PM: resume from 0, parent 3-2:1.0 still 2
usbdev3.8_ep83: PM: resume from 0, parent 3-2:1.0 still 2
usb 3-1: resuming
usbdev3.9_ep00: PM: resume from 0, parent 3-1 still 2
hci_usb 3-1:1.0: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.0: resuming
hci0: PM: resume from 0, parent 3-1:1.0 still 2
usbdev3.9_ep81: PM: resume from 0, parent 3-1:1.0 still 2
usbdev3.9_ep82: PM: resume from 0, parent 3-1:1.0 still 2
usbdev3.9_ep02: PM: resume from 0, parent 3-1:1.0 still 2
hci_usb 3-1:1.1: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.1: resuming
usbdev3.9_ep83: PM: resume from 0, parent 3-1:1.1 still 2
usbdev3.9_ep03: PM: resume from 0, parent 3-1:1.1 still 2
usb 3-1:1.2: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.2: resuming
usbdev3.9_ep84: PM: resume from 0, parent 3-1:1.2 still 2
usbdev3.9_ep04: PM: resume from 0, parent 3-1:1.2 still 2
usb 3-1:1.3: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.3: resuming
Restarting tasks ... <3>__tx_submit: hci0 tx submit failed urb f765d1bc type 2 err -19
usb 3-1: USB disconnect, address 9
PM: Removing info for No Bus:usbdev3.9_ep81
PM: Removing info for No Bus:usbdev3.9_ep82
PM: Removing info for No Bus:usbdev3.9_ep02
slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
 [<c016a298>] cache_free_debugcheck+0x128/0x1d0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c016b610>] kfree+0x50/0xa0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c04b09ae>] hci_usb_disconnect+0x2e/0x90
 [<c044fed3>] usb_disable_interface+0x53/0x70
 [<c04526a8>] usb_unbind_interface+0x38/0x80
 [<c032a8b8>] __device_release_driver+0x68/0xb0
 [<c032abee>] device_release_driver+0x1e/0x40
 [<c032a18b>] bus_remove_device+0x8b/0xa0
 [<c0328b79>] device_del+0x159/0x1c0
 [<c045095d>] usb_disable_device+0x4d/0x100
 [<c044ae3a>] usb_disconnect+0x9a/0x110
 [<c044d3b5>] hub_thread+0x355/0xbd0
 [<c060f53e>] schedule+0x2de/0x8f0
 [<c013c680>] autoremove_wake_function+0x0/0x50
 [<c044d060>] hub_thread+0x0/0xbd0
 [<c013c5cc>] kthread+0xec/0xf0
 [<c013c4e0>] kthread+0x0/0xf0
 [<c0103be7>] kernel_thread_helper+0x7/0x10
 ===
f70a2720: redzone 1:0x5a5a5a5a, redzone 2:0xc0545e9e.
[ cut here ]
kernel BUG at mm/slab.c:2878!
invalid opcode: [#1]
SMP
Modules linked in:
CPU:    0
EIP:    0060:[<c016a322>]    Not tainted VLI
EFLAGS: 00010012   (2.6.20-rc2 #383)
EIP is at cache_free_debugcheck+0x1b2/0x1d0
eax: f70a271c   ebx: f70a20f8   ecx: 00052c00   edx: 020c
esi: c20df680   edi: f70a2720   ebp: 5a5a5a5a   esp: c2313e30
ds: 007b   es: 007b   ss: 0068
Process khubd (pid: 304, ti=c2312000 task=c2257030 task.ti=c2312000)
Stack: c06aedf0 f70a2720 5a5a5a5a c0545e9e c04b08d3 f70a20c0 c20df680 c20d9164
       f70a2724 0286 c016b610 f653e8d8 f653e8c4 c2134ba0 000c c04b08d3
       c2134b5c c2134b8c f62e0a54 c2134ad0 0001 c2134ad0 f62e0a54 c07dbee0
Call Trace:
 [] sock_alloc_send_skb+0x16e/0x1c0
 [] hci_usb_close+0xf3/0x160
 [] kfree+0x50/0xa0
 [] hci_usb_close+0xf3/0x160
 [] hci_usb_disconnect+0x2e/0x90
 []
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi!

> > PM: Removing info for No Bus:usbdev3.15_ep81
> > PM: Removing info for No Bus:usbdev3.15_ep82
> > PM: Removing info for No Bus:usbdev3.15_ep02
> > slab error in verify_redzone_free(): cache `size-512': memory outside
> > object was overwritten
> >  [<c016a1b8>] cache_free_debugcheck+0x128/0x1d0
> >  [<c04b58e3>] hci_usb_close+0xf3/0x160
> >  [<c016b530>] kfree+0x50/0xa0
> >  [<c04b58e3>] hci_usb_close+0xf3/0x160
> >  [<c04b59be>] hci_usb_disconnect+0x2e/0x90
> >  [<c0454f23>] usb_disable_interface+0x53/0x70
> >  [<c04576f8>] usb_unbind_interface+0x38/0x80
> >  [<c032f908>] __device_release_driver+0x68/0xb0
> >  [<c032fc3e>] device_release_driver+0x1e/0x40
> >  [<c032f1db>] bus_remove_device+0x8b/0xa0
> >  [<c032dbc9>] device_del+0x159/0x1c0
> >  [<c04559ad>] usb_disable_device+0x4d/0x100
> >  [<c044fe8a>] usb_disconnect+0x9a/0x110
> >  [<c0452405>] hub_thread+0x355/0xbd0
> >  [<c061426e>] schedule+0x2de/0x8f0
> >  [<c013c640>] autoremove_wake_function+0x0/0x50
> >  [<c04520b0>] hub_thread+0x0/0xbd0
> >  [<c013c58c>] kthread+0xec/0xf0
> >  [<c013c4a0>] kthread+0x0/0xf0
> >  [<c0103be7>] kernel_thread_helper+0x7/0x10
> >  ===
>
> yes, this one looks like memory scribblage in bluetooth. The
> buffer.c assertion failure should now be fixed, please verify.

I can confirm buffer.c assertion to be fixed (yes, I was using gdb at
that time).

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi Pavel,

> > I got this nasty oops while playing with debugger. Not sure if that is
> > related; it also might be something with bluetooth; I already know it
> > corrupts memory during suspend, perhaps it corrupts memory in some
> > error path?
>
> Okay, I spoke too soon. bluetooth & suspend memory corruption was
> _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> cycles... so it is probably unrelated to the previous crash.

can you try to reproduce this with 2.6.20-rc2 as well.

Regards

Marcel
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
On Sun, 24 Dec 2006 00:55:01 +0100 Pavel Machek <[EMAIL PROTECTED]> wrote:

> PM: Removing info for No Bus:usbdev3.15_ep81
> PM: Removing info for No Bus:usbdev3.15_ep82
> PM: Removing info for No Bus:usbdev3.15_ep02
> slab error in verify_redzone_free(): cache `size-512': memory outside object
> was overwritten
>  [<c016a1b8>] cache_free_debugcheck+0x128/0x1d0
>  [<c04b58e3>] hci_usb_close+0xf3/0x160
>  [<c016b530>] kfree+0x50/0xa0
>  [<c04b58e3>] hci_usb_close+0xf3/0x160
>  [<c04b59be>] hci_usb_disconnect+0x2e/0x90
>  [<c0454f23>] usb_disable_interface+0x53/0x70
>  [<c04576f8>] usb_unbind_interface+0x38/0x80
>  [<c032f908>] __device_release_driver+0x68/0xb0
>  [<c032fc3e>] device_release_driver+0x1e/0x40
>  [<c032f1db>] bus_remove_device+0x8b/0xa0
>  [<c032dbc9>] device_del+0x159/0x1c0
>  [<c04559ad>] usb_disable_device+0x4d/0x100
>  [<c044fe8a>] usb_disconnect+0x9a/0x110
>  [<c0452405>] hub_thread+0x355/0xbd0
>  [<c061426e>] schedule+0x2de/0x8f0
>  [<c013c640>] autoremove_wake_function+0x0/0x50
>  [<c04520b0>] hub_thread+0x0/0xbd0
>  [<c013c58c>] kthread+0xec/0xf0
>  [<c013c4a0>] kthread+0x0/0xf0
>  [<c0103be7>] kernel_thread_helper+0x7/0x10
>  ===

yes, this one looks like memory scribblage in bluetooth. The buffer.c
assertion failure should now be fixed, please verify.
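For readers unfamiliar with the "memory outside object was overwritten" report: the general idea behind a red zone is to place known guard words on both sides of an allocation and verify them when it is freed. The following is a userspace sketch of that idea only, not the kernel's slab code; `MODEL_REDZONE`, `model_alloc()` and `model_free()` are hypothetical names, and the guard value is made up rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_REDZONE 0xfeedfaceu	/* hypothetical guard value, not the kernel's */

/* Layout: [guard word][payload of `size` bytes][guard word] */
static void *model_alloc(size_t size)
{
	unsigned char *raw = malloc(size + 2 * sizeof(uint32_t));
	uint32_t guard = MODEL_REDZONE;

	if (!raw)
		return NULL;
	memcpy(raw, &guard, sizeof(guard));
	memcpy(raw + sizeof(guard) + size, &guard, sizeof(guard));
	return raw + sizeof(guard);
}

/* Returns 0 if both guard words are intact, -1 if either was overwritten.
 * The block is released in both cases. */
static int model_free(void *obj, size_t size)
{
	unsigned char *raw;
	uint32_t front, back;
	int intact;

	assert(obj != NULL);
	raw = (unsigned char *)obj - sizeof(uint32_t);
	memcpy(&front, raw, sizeof(front));
	memcpy(&back, raw + sizeof(front) + size, sizeof(back));
	intact = (front == MODEL_REDZONE && back == MODEL_REDZONE);
	free(raw);
	return intact ? 0 : -1;
}
```

Writing even one byte past the payload changes the trailing guard word, so the overrun is caught at free time rather than when it happens. That would be consistent with the report above only showing up once hci_usb_close() calls kfree().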
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
Hi!

> > I got this nasty oops while playing with debugger. Not sure if that is
> > related; it also might be something with bluetooth; I already know it
> > corrupts memory during suspend, perhaps it corrupts memory in some
> > error path?
>
> Okay, I spoke too soon. bluetooth & suspend memory corruption was
> _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> cycles... so it is probably unrelated to the previous crash.
>
> I was getting pretty regular crashes with bluetooth & gdb, but I was
> not using bluetooth at the time of ext3-related crash.

And for completeness, here's bluetooth + gdb oops.

Ok, I'm not _sure_ it is bluetooth related. I'll try it without
bluetooth in a while.

Pavel

PM: Adding info for No Bus:vcsa8
coda_read_super: Bad mount data
coda_read_super: device index: 0
coda_read_super: rootfid is (01234567..080519b0.)
PM: Removing info for No Bus:vcs10
PM: Removing info for No Bus:vcsa10
coda_upcall: Venus dead on (op,un) (7.2) flags 10
Failure of coda_cnode_make for root: error -19
hci_cmd_task: hci0 command tx timeout
PM: Adding info for No Bus:rfcomm1
PM: Adding info for bluetooth:acl00803715A329
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet:
hci0 ACL packet for unknown connection handle 12 l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected 
continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata: Unexpected continuation frame (len 0) l2cap_recv_acldata:
Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
On Sun, 24 Dec 2006 00:55:01 +0100 Pavel Machek [EMAIL PROTECTED] wrote:

> PM: Removing info for No Bus:usbdev3.15_ep81
> PM: Removing info for No Bus:usbdev3.15_ep82
> PM: Removing info for No Bus:usbdev3.15_ep02
> slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
>  [c016a1b8] cache_free_debugcheck+0x128/0x1d0
>  [c04b58e3] hci_usb_close+0xf3/0x160
>  [c016b530] kfree+0x50/0xa0
>  [c04b58e3] hci_usb_close+0xf3/0x160
>  [c04b59be] hci_usb_disconnect+0x2e/0x90
>  [c0454f23] usb_disable_interface+0x53/0x70
>  [c04576f8] usb_unbind_interface+0x38/0x80
>  [c032f908] __device_release_driver+0x68/0xb0
>  [c032fc3e] device_release_driver+0x1e/0x40
>  [c032f1db] bus_remove_device+0x8b/0xa0
>  [c032dbc9] device_del+0x159/0x1c0
>  [c04559ad] usb_disable_device+0x4d/0x100
>  [c044fe8a] usb_disconnect+0x9a/0x110
>  [c0452405] hub_thread+0x355/0xbd0
>  [c061426e] schedule+0x2de/0x8f0
>  [c013c640] autoremove_wake_function+0x0/0x50
>  [c04520b0] hub_thread+0x0/0xbd0
>  [c013c58c] kthread+0xec/0xf0
>  [c013c4a0] kthread+0x0/0xf0
>  [c0103be7] kernel_thread_helper+0x7/0x10
>  ===

yes, this one looks like memory scribblage in bluetooth.

The buffer.c assertion failure should now be fixed, please verify.