Re: ahci panics when detaching...
John Baldwin wrote this message on Tue, Jun 24, 2014 at 09:51 -0400:
> On Monday, June 23, 2014 9:06:26 pm John-Mark Gurney wrote:
> > John Baldwin wrote this message on Mon, Jun 23, 2014 at 10:49 -0400:
> > > On Monday, June 23, 2014 9:44:08 am John-Mark Gurney wrote:
> > > > So, when I try to eject an ESATA card, the machine panics... I am able
> > > > to successfully eject other cards, an ethernet (re) and a serial card
> > > > (uart), and both handle the removal of their device w/o issue and
> > > > without crashes...
> > > >
> > > > When I try w/ ahci, I get a panic... The panic backtrace is:
> > > > #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> > > > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> > > >     at ../../../kern/subr_rman.c:979
> > > > #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
> > > >     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
> > > >     at ../../../kern/subr_bus.c:3419
> > > > #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
> > > >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> > > > ---Type to continue, or q to quit---
> > > > #12 0x80929708 in device_detach (dev=0xf80006b6d700)
> > > >     at bus_if.h:181
> > > > #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
> > > >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
> > > >
> > > > In frame 9:
> > > > (kgdb) fr 9
> > > > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> > > >     at ../../../kern/subr_rman.c:979
> > > > 979 return (r->__r_i->r_rid);
> > > > (kgdb) print r
> > > > $1 = (struct resource *) 0xf800064c9380
> > > > (kgdb) print/x *r
> > > > $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
> > > >   r_bushandle = 0xdeadc0dedeadc0de}
> > > >
> > > > So, looks like something has corrupted the resource data...
> > >
> > > This is the malloc junking on free. However, I wonder if the
> > > problem is that the resource was freed without being properly
> > > cleared from the resource_list in the PCI ivars. Is this with local
> > > patches that you have?
> >
> > Yes, but I didn't patch any of the pci code, or the resource code, so
> > this bug is in the original code... My patches only affect the attach
> > case, don't touch the detach case...
>
> What did you change in attach? :) If the resource list isn't set up the same
> then that could cause this. In particular, the PCI bus pre-reserves resources
> for BARs so that they are allocated even if a driver hasn't allocated them.

What I mean by that is that I set up a few things in pci_attach_common,
like if the device has a slot that can hotplug, I attach an interrupt,
enable interrupts and a couple of bookkeeping items... But that code
shouldn't change anything for ahci..

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ahci panics when detaching...
On Monday, June 23, 2014 9:06:26 pm John-Mark Gurney wrote:
> John Baldwin wrote this message on Mon, Jun 23, 2014 at 10:49 -0400:
> > On Monday, June 23, 2014 9:44:08 am John-Mark Gurney wrote:
> > > So, when I try to eject an ESATA card, the machine panics... I am able
> > > to successfully eject other cards, an ethernet (re) and a serial card
> > > (uart), and both handle the removal of their device w/o issue and
> > > without crashes...
> > >
> > > When I try w/ ahci, I get a panic... The panic backtrace is:
> > > #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> > > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> > >     at ../../../kern/subr_rman.c:979
> > > #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
> > >     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
> > >     at ../../../kern/subr_bus.c:3419
> > > #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
> > >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> > > ---Type to continue, or q to quit---
> > > #12 0x80929708 in device_detach (dev=0xf80006b6d700)
> > >     at bus_if.h:181
> > > #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
> > >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
> > >
> > > In frame 9:
> > > (kgdb) fr 9
> > > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> > >     at ../../../kern/subr_rman.c:979
> > > 979 return (r->__r_i->r_rid);
> > > (kgdb) print r
> > > $1 = (struct resource *) 0xf800064c9380
> > > (kgdb) print/x *r
> > > $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
> > >   r_bushandle = 0xdeadc0dedeadc0de}
> > >
> > > So, looks like something has corrupted the resource data...
> >
> > This is the malloc junking on free. However, I wonder if the
> > problem is that the resource was freed without being properly
> > cleared from the resource_list in the PCI ivars. Is this with local
> > patches that you have?
>
> Yes, but I didn't patch any of the pci code, or the resource code, so
> this bug is in the original code... My patches only affect the attach
> case, don't touch the detach case...

What did you change in attach? :) If the resource list isn't set up the same
then that could cause this. In particular, the PCI bus pre-reserves resources
for BARs so that they are allocated even if a driver hasn't allocated them.

-- 
John Baldwin
Re: ahci panics when detaching...
John Baldwin wrote this message on Mon, Jun 23, 2014 at 10:49 -0400:
> On Monday, June 23, 2014 9:44:08 am John-Mark Gurney wrote:
> > So, when I try to eject an ESATA card, the machine panics... I am able
> > to successfully eject other cards, an ethernet (re) and a serial card
> > (uart), and both handle the removal of their device w/o issue and
> > without crashes...
> >
> > When I try w/ ahci, I get a panic... The panic backtrace is:
> > #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> >     at ../../../kern/subr_rman.c:979
> > #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
> >     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
> >     at ../../../kern/subr_bus.c:3419
> > #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
> >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> > ---Type to continue, or q to quit---
> > #12 0x80929708 in device_detach (dev=0xf80006b6d700)
> >     at bus_if.h:181
> > #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
> >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
> >
> > In frame 9:
> > (kgdb) fr 9
> > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> >     at ../../../kern/subr_rman.c:979
> > 979 return (r->__r_i->r_rid);
> > (kgdb) print r
> > $1 = (struct resource *) 0xf800064c9380
> > (kgdb) print/x *r
> > $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
> >   r_bushandle = 0xdeadc0dedeadc0de}
> >
> > So, looks like something has corrupted the resource data...
>
> This is the malloc junking on free. However, I wonder if the
> problem is that the resource was freed without being properly
> cleared from the resource_list in the PCI ivars. Is this with local
> patches that you have?

Yes, but I didn't patch any of the pci code, or the resource code, so
this bug is in the original code... My patches only affect the attach
case, don't touch the detach case...

I was hoping someone who knows the code was like, yeh, I do remember
that place in the code where we free something, but don't properly NULL
out the pointer, etc...

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
Re: ahci panics when detaching...
On Monday, June 23, 2014 9:44:08 am John-Mark Gurney wrote:
> So, when I try to eject an ESATA card, the machine panics... I am able
> to successfully eject other cards, an ethernet (re) and a serial card
> (uart), and both handle the removal of their device w/o issue and
> without crashes...
>
> When I try w/ ahci, I get a panic... The panic backtrace is:
> #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
>     at ../../../kern/subr_rman.c:979
> #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
>     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
>     at ../../../kern/subr_bus.c:3419
> #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
>     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> ---Type to continue, or q to quit---
> #12 0x80929708 in device_detach (dev=0xf80006b6d700)
>     at bus_if.h:181
> #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
>     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
>
> In frame 9:
> (kgdb) fr 9
> #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
>     at ../../../kern/subr_rman.c:979
> 979 return (r->__r_i->r_rid);
> (kgdb) print r
> $1 = (struct resource *) 0xf800064c9380
> (kgdb) print/x *r
> $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
>   r_bushandle = 0xdeadc0dedeadc0de}
>
> So, looks like something has corrupted the resource data...

This is the malloc junking on free. However, I wonder if the
problem is that the resource was freed without being properly
cleared from the resource_list in the PCI ivars. Is this with local
patches that you have?

-- 
John Baldwin
Re: ahci panics when detaching...
Eric van Gyzen wrote this message on Mon, Jun 23, 2014 at 08:57 -0500:
> On 06/23/2014 08:44, John-Mark Gurney wrote:
> > So, when I try to eject an ESATA card, the machine panics... I am able
> > to successfully eject other cards, an ethernet (re) and a serial card
> > (uart), and both handle the removal of their device w/o issue and
> > without crashes...
> >
> > When I try w/ ahci, I get a panic... The panic backtrace is:
> > #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> >     at ../../../kern/subr_rman.c:979
> > #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
> >     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
> >     at ../../../kern/subr_bus.c:3419
> > #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
> >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> > ---Type to continue, or q to quit---
> > #12 0x80929708 in device_detach (dev=0xf80006b6d700)
> >     at bus_if.h:181
> > #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
> >     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
> >
> > In frame 9:
> > (kgdb) fr 9
> > #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
> >     at ../../../kern/subr_rman.c:979
> > 979 return (r->__r_i->r_rid);
> > (kgdb) print r
> > $1 = (struct resource *) 0xf800064c9380
> > (kgdb) print/x *r
> > $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
> >   r_bushandle = 0xdeadc0dedeadc0de}
> >
> > So, looks like something has corrupted the resource data...
>
> The resource data has been freed.

Well, that is a type of corruption.. :) If we free it, why wasn't it
removed from the list? Or properly NULL'd out?

> > Attach dmesg:
> > atapci0: at device 0.0 on pci2
> > ahci1: at channel -1 on atapci0
> > ahci1: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
> > ahci1: quirks=0x1
> > ahcich6: at channel 0 on ahci1
> > ahcich7: at channel 1 on ahci1
> > ata2: at channel 0 on atapci0
> > [eject card]
> > ahcich6: stopping AHCI engine failed
> > ahcich6: stopping AHCI FR engine failed
> > ahcich6: detached
> > ahcich7: stopping AHCI engine failed
> > ahcich7: stopping AHCI FR engine failed
> > ahcich7: detached
> > ahci1: detached
> > ata2: detached
> > atapci0: detached
> >
> >
> > Fatal trap 9: general protection fault while in kernel mode
> >
> > Also, has anyone thought about adding a case in your trap
> > handler that when we hit the deadc0de address, to print up a
> > special message or something? At least flag it, or do we not get
> > the faulting address?
> >
> > This is HEAD as of r266429.
> >
> > Let me know if there is anything else you need to know.
>
> The full stack trace might be useful.

I could give it to you, but it contains code I can't release (at least
not yet)... It's basically an interrupt that calls pci_delete_child, so
there isn't any more useful information there..

I'm just puzzled why uart and re don't have this same problem..

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
Re: ahci panics when detaching...
On 06/23/2014 08:44, John-Mark Gurney wrote:
> So, when I try to eject an ESATA card, the machine panics... I am able
> to successfully eject other cards, an ethernet (re) and a serial card
> (uart), and both handle the removal of their device w/o issue and
> without crashes...
>
> When I try w/ ahci, I get a panic... The panic backtrace is:
> #8 0x80ced4e2 in calltrap () at ../../../amd64/amd64/exception.S:231
> #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
>     at ../../../kern/subr_rman.c:979
> #10 0x8092b888 in resource_list_release_active (rl=0xf80006d39c08,
>     bus=0xf80002cd9000, child=0xf80006b6d700, type=3)
>     at ../../../kern/subr_bus.c:3419
> #11 0x8065d7a1 in pci_child_detached (dev=0xf80002cd9000,
>     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4133
> ---Type to continue, or q to quit---
> #12 0x80929708 in device_detach (dev=0xf80006b6d700)
>     at bus_if.h:181
> #13 0x8065f9f7 in pci_delete_child (dev=0xf80002cd9000,
>     child=0xf80006b6d700) at ../../../dev/pci/pci.c:4710
>
> In frame 9:
> (kgdb) fr 9
> #9 0x8093d037 in rman_get_rid (r=0xf800064c9380)
>     at ../../../kern/subr_rman.c:979
> 979 return (r->__r_i->r_rid);
> (kgdb) print r
> $1 = (struct resource *) 0xf800064c9380
> (kgdb) print/x *r
> $4 = {__r_i = 0xdeadc0dedeadc0de, r_bustag = 0xdeadc0dedeadc0de,
>   r_bushandle = 0xdeadc0dedeadc0de}
>
> So, looks like something has corrupted the resource data...

The resource data has been freed.

> Attach dmesg:
> atapci0: at device 0.0 on pci2
> ahci1: at channel -1 on atapci0
> ahci1: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
> ahci1: quirks=0x1
> ahcich6: at channel 0 on ahci1
> ahcich7: at channel 1 on ahci1
> ata2: at channel 0 on atapci0
> [eject card]
> ahcich6: stopping AHCI engine failed
> ahcich6: stopping AHCI FR engine failed
> ahcich6: detached
> ahcich7: stopping AHCI engine failed
> ahcich7: stopping AHCI FR engine failed
> ahcich7: detached
> ahci1: detached
> ata2: detached
> atapci0: detached
>
>
> Fatal trap 9: general protection fault while in kernel mode
>
> Also, has anyone thought about adding a case in your trap
> handler that when we hit the deadc0de address, to print up a
> special message or something? At least flag it, or do we not get
> the faulting address?
>
> This is HEAD as of r266429.
>
> Let me know if there is anything else you need to know.

The full stack trace might be useful.

Eric