Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/17/2012 04:27 PM, Alan Stern wrote: On Fri, 17 Aug 2012, Alan Stern wrote: On Fri, 17 Aug 2012, Hans de Goede wrote: No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. I did search for dev_get_drvdata() calls in drivers/ide; there were no other calls that retrieved an ide_drive_t value. But I didn't check anywhere else in the kernel. In fact, I'm not sure where else to look. After a little more checking: Among all the .c files in the kernel which match the pattern '[^a-z]ide_', the only occurrences of dev_get_drvdata() which retrieve an ide_drive_t value are the two in ide-pm.c. Thanks, good work! That means we should move forward with submitting Miklos' patch to the ide subsystem maintainer asap. Also if someone feels like it we still could do the 2 follow up / clean up patches: 1) Removin the set_drvdata call from hwif_register_devices() in ide-prove.c 2) Fixup ide-generic.c ide_remove to not dereference a NULL dev->driver Regards, Hans -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Fri, 17 Aug 2012, Alan Stern wrote: > On Fri, 17 Aug 2012, Hans de Goede wrote: > > > No my patch was a hack to undo the results of the commit causing > > the regression in the IDE case. But Alan's approach clearly is > > much better! Once we are sure drvdata is not used anywhere the > > dev_set_drvdata call could be removed in the place where my > > hack added a second call to it. Note that there are likely > > actual ide drivers using it, without setting it themselves since > > the ide core was setting it. So removing it will require even more > > auditing / checking. > > I did search for dev_get_drvdata() calls in drivers/ide; there were no > other calls that retrieved an ide_drive_t value. But I didn't check > anywhere else in the kernel. In fact, I'm not sure where else to look. After a little more checking: Among all the .c files in the kernel which match the pattern '[^a-z]ide_', the only occurrences of dev_get_drvdata() which retrieve an ide_drive_t value are the two in ide-pm.c. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Fri, 17 Aug 2012, Hans de Goede wrote: > No my patch was a hack to undo the results of the commit causing > the regression in the IDE case. But Alan's approach clearly is > much better! Once we are sure drvdata is not used anywhere the > dev_set_drvdata call could be removed in the place where my > hack added a second call to it. Note that there are likely > actual ide drivers using it, without setting it themselves since > the ide core was setting it. So removing it will require even more > auditing / checking. I did search for dev_get_drvdata() calls in drivers/ide; there were no other calls that retrieved an ide_drive_t value. But I didn't check anywhere else in the kernel. In fact, I'm not sure where else to look. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi all, On 08/16/2012 10:02 PM, Rafael J. Wysocki wrote: On Thursday, August 16, 2012, Alan Stern wrote: On Thu, 16 Aug 2012, Miklos Szeredi wrote: Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos OK Miklos, can you please send that to Dave with a proper changelog and sign-off (please add my ACK too)? Please make it clear that this is a regression fix and which commit has introduced the regression. I think that Alan's suggest fix, as implemented by Milos is great!, but before moving forward with this someone should audit all the other (generic) ide code for other places using the drvdata, a simple grep for dev_get_drvdata should show these. And now you can get rid of the useless dev_set_drvdata() call. That was in the other patch I think. No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. A 3th thing which should be audited is the generic ide code assuming that dev->driver != NULL, this is at least true for generic_ide_remove, where: if (drv->remove) Should be changed to: if (dev->driver && drv->remove) Following the way things are done in generic_ide_shutdown, a similar change could be needed for generic_ide_probe(), although I guess that may never get called with dev->driver == NULL ? Unfortunately I'm very busy with other stuff atm, so I cannot help here other then pointing out that such audits should be done :| Regards, Hans -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi all, On 08/16/2012 10:02 PM, Rafael J. Wysocki wrote: On Thursday, August 16, 2012, Alan Stern wrote: On Thu, 16 Aug 2012, Miklos Szeredi wrote: Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos OK Miklos, can you please send that to Dave with a proper changelog and sign-off (please add my ACK too)? Please make it clear that this is a regression fix and which commit has introduced the regression. I think that Alan's suggest fix, as implemented by Milos is great!, but before moving forward with this someone should audit all the other (generic) ide code for other places using the drvdata, a simple grep for dev_get_drvdata should show these. snip And now you can get rid of the useless dev_set_drvdata() call. That was in the other patch I think. No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. A 3th thing which should be audited is the generic ide code assuming that dev-driver != NULL, this is at least true for generic_ide_remove, where: if (drv-remove) Should be changed to: if (dev-driver drv-remove) Following the way things are done in generic_ide_shutdown, a similar change could be needed for generic_ide_probe(), although I guess that may never get called with dev-driver == NULL ? Unfortunately I'm very busy with other stuff atm, so I cannot help here other then pointing out that such audits should be done :| Regards, Hans -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Fri, 17 Aug 2012, Hans de Goede wrote: No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. I did search for dev_get_drvdata() calls in drivers/ide; there were no other calls that retrieved an ide_drive_t value. But I didn't check anywhere else in the kernel. In fact, I'm not sure where else to look. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Fri, 17 Aug 2012, Alan Stern wrote: On Fri, 17 Aug 2012, Hans de Goede wrote: No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. I did search for dev_get_drvdata() calls in drivers/ide; there were no other calls that retrieved an ide_drive_t value. But I didn't check anywhere else in the kernel. In fact, I'm not sure where else to look. After a little more checking: Among all the .c files in the kernel which match the pattern '[^a-z]ide_', the only occurrences of dev_get_drvdata() which retrieve an ide_drive_t value are the two in ide-pm.c. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/17/2012 04:27 PM, Alan Stern wrote: On Fri, 17 Aug 2012, Alan Stern wrote: On Fri, 17 Aug 2012, Hans de Goede wrote: No my patch was a hack to undo the results of the commit causing the regression in the IDE case. But Alan's approach clearly is much better! Once we are sure drvdata is not used anywhere the dev_set_drvdata call could be removed in the place where my hack added a second call to it. Note that there are likely actual ide drivers using it, without setting it themselves since the ide core was setting it. So removing it will require even more auditing / checking. I did search for dev_get_drvdata() calls in drivers/ide; there were no other calls that retrieved an ide_drive_t value. But I didn't check anywhere else in the kernel. In fact, I'm not sure where else to look. After a little more checking: Among all the .c files in the kernel which match the pattern '[^a-z]ide_', the only occurrences of dev_get_drvdata() which retrieve an ide_drive_t value are the two in ide-pm.c. Thanks, good work! That means we should move forward with submitting Miklos' patch to the ide subsystem maintainer asap. Also if someone feels like it we still could do the 2 follow up / clean up patches: 1) Removin the set_drvdata call from hwif_register_devices() in ide-prove.c 2) Fixup ide-generic.c ide_remove to not dereference a NULL dev-driver Regards, Hans -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thursday, August 16, 2012, Alan Stern wrote: > On Thu, 16 Aug 2012, Miklos Szeredi wrote: > > > Yes, this appears to work. Following patch fixes the suspend oops. > > > > Thanks, > > Miklos OK Miklos, can you please send that to Dave with a proper changelog and sign-off (please add my ACK too)? Please make it clear that this is a regression fix and which commit has introduced the regression. > > --- > > drivers/ide/ide-pm.c |4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c > > index 9240609..8d1e32d 100644 > > --- a/drivers/ide/ide-pm.c > > +++ b/drivers/ide/ide-pm.c > > @@ -4,7 +4,7 @@ > > > > int generic_ide_suspend(struct device *dev, pm_message_t mesg) > > { > > - ide_drive_t *drive = dev_get_drvdata(dev); > > + ide_drive_t *drive = to_ide_device(dev); > > ide_drive_t *pair = ide_get_pair_dev(drive); > > ide_hwif_t *hwif = drive->hwif; > > struct request *rq; > > @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t > > mesg) > > > > int generic_ide_resume(struct device *dev) > > { > > - ide_drive_t *drive = dev_get_drvdata(dev); > > + ide_drive_t *drive = to_ide_device(dev); > > ide_drive_t *pair = ide_get_pair_dev(drive); > > ide_hwif_t *hwif = drive->hwif; > > struct request *rq; > > And now you can get rid of the useless dev_set_drvdata() call. That was in the other patch I think. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thu, 16 Aug 2012, Miklos Szeredi wrote: > Yes, this appears to work. Following patch fixes the suspend oops. > > Thanks, > Miklos > > --- > drivers/ide/ide-pm.c |4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c > index 9240609..8d1e32d 100644 > --- a/drivers/ide/ide-pm.c > +++ b/drivers/ide/ide-pm.c > @@ -4,7 +4,7 @@ > > int generic_ide_suspend(struct device *dev, pm_message_t mesg) > { > - ide_drive_t *drive = dev_get_drvdata(dev); > + ide_drive_t *drive = to_ide_device(dev); > ide_drive_t *pair = ide_get_pair_dev(drive); > ide_hwif_t *hwif = drive->hwif; > struct request *rq; > @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t > mesg) > > int generic_ide_resume(struct device *dev) > { > - ide_drive_t *drive = dev_get_drvdata(dev); > + ide_drive_t *drive = to_ide_device(dev); > ide_drive_t *pair = ide_get_pair_dev(drive); > ide_hwif_t *hwif = drive->hwif; > struct request *rq; And now you can get rid of the useless dev_set_drvdata() call. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Alan Stern writes: > On Thu, 16 Aug 2012, Hans de Goede wrote: > >> Ah right, these are bus_driver operations. That explains some things, so I've >> done some more research asking myself: "Why does generic_ide_suspend(), which >> is a *bus* op, call dev_get_drvdata?", the answer to that seems to be that >> the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver >> data. Which I believe is not how drvdata is intended to be used. >> >> With that said, the above knowledge has allowed me to write an (ugly) fix for >> the regression Miklos is seeing. Miklos can you give the attached patch a >> try please? >> >> > It clearly should check if drive is not NULL before using that pointer. >> >> I assume you mean drive*r*, yes I agree that generic_ide_remove should >> check for that. So who is going to write a patch for that? > > The existing code could certainly be improved. Your patch does: > >> + /* >> +* device_register() will have cleared drvdata on >> +* device_attach failure, but we use drvdata to store per >> +* device bus info, rather then for driver info, so restore >> it. >> +*/ >> + dev_set_drvdata(dev, drive); > > But at this point, dev is defined by: > > struct device *dev = >gendev; > > So why bother setting anything? It seems to me that > generic_ide_suspend() and generic_ide_resume() could easily replace > > ide_drive_t *drive = dev_get_drvdata(dev); > > with > > ide_drive_t *drive = dev_to_ide_drive(dev); > > where dev_to_ide_drive is defined as "container_of(dev, ide_drive_t, > gendev)" (if this isn't defined already). Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos --- drivers/ide/ide-pm.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c index 9240609..8d1e32d 100644 --- a/drivers/ide/ide-pm.c +++ b/drivers/ide/ide-pm.c @@ -4,7 +4,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive->hwif; struct request *rq; @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) int generic_ide_resume(struct device *dev) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive->hwif; struct request *rq; -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thu, 16 Aug 2012, Hans de Goede wrote: > Ah right, these are bus_driver operations. That explains some things, so I've > done some more research asking myself: "Why does generic_ide_suspend(), which > is a *bus* op, call dev_get_drvdata?", the answer to that seems to be that > the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver > data. Which I believe is not how drvdata is intended to be used. > > With that said, the above knowledge has allowed me to write an (ugly) fix for > the regression Miklos is seeing. Miklos can you give the attached patch a > try please? > > > It clearly should check if drive is not NULL before using that pointer. > > I assume you mean drive*r*, yes I agree that generic_ide_remove should > check for that. So who is going to write a patch for that? The existing code could certainly be improved. Your patch does: > + /* > +* device_register() will have cleared drvdata on > +* device_attach failure, but we use drvdata to store per > +* device bus info, rather then for driver info, so restore > it. > +*/ > + dev_set_drvdata(dev, drive); But at this point, dev is defined by: struct device *dev = >gendev; So why bother setting anything? It seems to me that generic_ide_suspend() and generic_ide_resume() could easily replace ide_drive_t *drive = dev_get_drvdata(dev); with ide_drive_t *drive = dev_to_ide_drive(dev); where dev_to_ide_drive is defined as "container_of(dev, ide_drive_t, gendev)" (if this isn't defined already). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/15/2012 09:59 PM, Rafael J. Wysocki wrote: On Wednesday, August 15, 2012, Hans de Goede wrote: Hi, On 08/15/2012 07:13 AM, Miklos Szeredi wrote: Suspend oopses in generic_ide_suspend() because dev_get_drvdata() returns NULL (dev->p->driver_data == NULL) and this function is not prepared for this. I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no driver is bound). Reverting it fixes suspend. First of all, thanks for reporting and bisecting this. With that said, I must say that this is very weird. The patch in question: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 Only makes dev-drvdata NULL in 2 cases: 1) The probe method of the driver fails 2) The driver has been detached from the device by calling one of: device_release_driver() or driver_detach() Note that in both code paths dev->driver also gets set to NULL, and other generic ide driver callbacks very much depend on that not being NULL, ie: static int generic_ide_remove(struct device *dev) { ide_drive_t *drive = to_ide_device(dev); struct ide_driver *drv = to_ide_driver(dev->driver); if (drv->remove) drv->remove(drive); return 0; } Also how can a drivers suspend callback get called if dev->driver is NULL, since that callback would normally be "reached" through dev->driver, so something weird is going on here ... No, it wouldn't, because it is a bus type callback and it is invoked for all devices whose bus type is ide_bus_type, regardless of whether or not their driver field is NULL. Ah right, these are bus_driver operations. That explains some things, so I've done some more research asking myself: "Why does generic_ide_suspend(), which is a *bus* op, call dev_get_drvdata?", the answer to that seems to be that the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver data. Which I believe is not how drvdata is intended to be used. With that said, the above knowledge has allowed me to write an (ugly) fix for the regression Miklos is seeing. Miklos can you give the attached patch a try please? It clearly should check if drive is not NULL before using that pointer. I assume you mean drive*r*, yes I agree that generic_ide_remove should check for that. So who is going to write a patch for that? Regards, Hans >From 00f700ef4dd5fa335f725361aa683388c9b8ec4f Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Thu, 16 Aug 2012 13:23:30 +0200 Subject: [PATCH] ide: Restore drvdata after device_attach failure Since commit 0998d063: "device-core: Ensure drvdata = NULL when no driver is bound", device_attach will clear a device's drvdata if the driver failed to bind to it. In the ide subsystem however drvdata is not used to store driver data, but rather to store per device bus_driver data. So for now restore drvdata after calling device_register(), so that drvdata will still point to the per device bus_driver data after a driver probe failure. In the long run the ide subsystem should probably be fixed to not abuse drvdata in this way, as this clearly is not how drvdata is intended to be used. Signed-off-by: Hans de Goede --- drivers/ide/ide-probe.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c index 068cef0..be1981b 100644 --- a/drivers/ide/ide-probe.c +++ b/drivers/ide/ide-probe.c @@ -1015,6 +1015,12 @@ static void hwif_register_devices(ide_hwif_t *hwif) if (ret < 0) printk(KERN_WARNING "IDE: %s: device_register error: " "%d\n", __func__, ret); + /* + * device_register() will have cleared drvdata on + * device_attach failure, but we use drvdata to store per + * device bus info, rather then for driver info, so restore it. + */ + dev_set_drvdata(dev, drive); } } -- 1.7.11.4
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/15/2012 09:59 PM, Rafael J. Wysocki wrote: On Wednesday, August 15, 2012, Hans de Goede wrote: Hi, On 08/15/2012 07:13 AM, Miklos Szeredi wrote: Suspend oopses in generic_ide_suspend() because dev_get_drvdata() returns NULL (dev-p-driver_data == NULL) and this function is not prepared for this. I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no driver is bound). Reverting it fixes suspend. First of all, thanks for reporting and bisecting this. With that said, I must say that this is very weird. The patch in question: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 Only makes dev-drvdata NULL in 2 cases: 1) The probe method of the driver fails 2) The driver has been detached from the device by calling one of: device_release_driver() or driver_detach() Note that in both code paths dev-driver also gets set to NULL, and other generic ide driver callbacks very much depend on that not being NULL, ie: static int generic_ide_remove(struct device *dev) { ide_drive_t *drive = to_ide_device(dev); struct ide_driver *drv = to_ide_driver(dev-driver); if (drv-remove) drv-remove(drive); return 0; } Also how can a drivers suspend callback get called if dev-driver is NULL, since that callback would normally be reached through dev-driver, so something weird is going on here ... No, it wouldn't, because it is a bus type callback and it is invoked for all devices whose bus type is ide_bus_type, regardless of whether or not their driver field is NULL. Ah right, these are bus_driver operations. That explains some things, so I've done some more research asking myself: Why does generic_ide_suspend(), which is a *bus* op, call dev_get_drvdata?, the answer to that seems to be that the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver data. Which I believe is not how drvdata is intended to be used. With that said, the above knowledge has allowed me to write an (ugly) fix for the regression Miklos is seeing. Miklos can you give the attached patch a try please? It clearly should check if drive is not NULL before using that pointer. I assume you mean drive*r*, yes I agree that generic_ide_remove should check for that. So who is going to write a patch for that? Regards, Hans From 00f700ef4dd5fa335f725361aa683388c9b8ec4f Mon Sep 17 00:00:00 2001 From: Hans de Goede hdego...@redhat.com Date: Thu, 16 Aug 2012 13:23:30 +0200 Subject: [PATCH] ide: Restore drvdata after device_attach failure Since commit 0998d063: device-core: Ensure drvdata = NULL when no driver is bound, device_attach will clear a device's drvdata if the driver failed to bind to it. In the ide subsystem however drvdata is not used to store driver data, but rather to store per device bus_driver data. So for now restore drvdata after calling device_register(), so that drvdata will still point to the per device bus_driver data after a driver probe failure. In the long run the ide subsystem should probably be fixed to not abuse drvdata in this way, as this clearly is not how drvdata is intended to be used. Signed-off-by: Hans de Goede hdego...@redhat.com --- drivers/ide/ide-probe.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c index 068cef0..be1981b 100644 --- a/drivers/ide/ide-probe.c +++ b/drivers/ide/ide-probe.c @@ -1015,6 +1015,12 @@ static void hwif_register_devices(ide_hwif_t *hwif) if (ret 0) printk(KERN_WARNING IDE: %s: device_register error: %d\n, __func__, ret); + /* + * device_register() will have cleared drvdata on + * device_attach failure, but we use drvdata to store per + * device bus info, rather then for driver info, so restore it. + */ + dev_set_drvdata(dev, drive); } } -- 1.7.11.4
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thu, 16 Aug 2012, Hans de Goede wrote: Ah right, these are bus_driver operations. That explains some things, so I've done some more research asking myself: Why does generic_ide_suspend(), which is a *bus* op, call dev_get_drvdata?, the answer to that seems to be that the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver data. Which I believe is not how drvdata is intended to be used. With that said, the above knowledge has allowed me to write an (ugly) fix for the regression Miklos is seeing. Miklos can you give the attached patch a try please? It clearly should check if drive is not NULL before using that pointer. I assume you mean drive*r*, yes I agree that generic_ide_remove should check for that. So who is going to write a patch for that? The existing code could certainly be improved. Your patch does: + /* +* device_register() will have cleared drvdata on +* device_attach failure, but we use drvdata to store per +* device bus info, rather then for driver info, so restore it. +*/ + dev_set_drvdata(dev, drive); But at this point, dev is defined by: struct device *dev = drive-gendev; So why bother setting anything? It seems to me that generic_ide_suspend() and generic_ide_resume() could easily replace ide_drive_t *drive = dev_get_drvdata(dev); with ide_drive_t *drive = dev_to_ide_drive(dev); where dev_to_ide_drive is defined as container_of(dev, ide_drive_t, gendev) (if this isn't defined already). Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Alan Stern st...@rowland.harvard.edu writes: On Thu, 16 Aug 2012, Hans de Goede wrote: Ah right, these are bus_driver operations. That explains some things, so I've done some more research asking myself: Why does generic_ide_suspend(), which is a *bus* op, call dev_get_drvdata?, the answer to that seems to be that the ide subsystem is abusing (IMHO) drvdata to store per device bus_driver data. Which I believe is not how drvdata is intended to be used. With that said, the above knowledge has allowed me to write an (ugly) fix for the regression Miklos is seeing. Miklos can you give the attached patch a try please? It clearly should check if drive is not NULL before using that pointer. I assume you mean drive*r*, yes I agree that generic_ide_remove should check for that. So who is going to write a patch for that? The existing code could certainly be improved. Your patch does: + /* +* device_register() will have cleared drvdata on +* device_attach failure, but we use drvdata to store per +* device bus info, rather then for driver info, so restore it. +*/ + dev_set_drvdata(dev, drive); But at this point, dev is defined by: struct device *dev = drive-gendev; So why bother setting anything? It seems to me that generic_ide_suspend() and generic_ide_resume() could easily replace ide_drive_t *drive = dev_get_drvdata(dev); with ide_drive_t *drive = dev_to_ide_drive(dev); where dev_to_ide_drive is defined as container_of(dev, ide_drive_t, gendev) (if this isn't defined already). Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos --- drivers/ide/ide-pm.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c index 9240609..8d1e32d 100644 --- a/drivers/ide/ide-pm.c +++ b/drivers/ide/ide-pm.c @@ -4,7 +4,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) int generic_ide_resume(struct device *dev) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thu, 16 Aug 2012, Miklos Szeredi wrote: Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos --- drivers/ide/ide-pm.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c index 9240609..8d1e32d 100644 --- a/drivers/ide/ide-pm.c +++ b/drivers/ide/ide-pm.c @@ -4,7 +4,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) int generic_ide_resume(struct device *dev) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; And now you can get rid of the useless dev_set_drvdata() call. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Thursday, August 16, 2012, Alan Stern wrote: On Thu, 16 Aug 2012, Miklos Szeredi wrote: Yes, this appears to work. Following patch fixes the suspend oops. Thanks, Miklos OK Miklos, can you please send that to Dave with a proper changelog and sign-off (please add my ACK too)? Please make it clear that this is a regression fix and which commit has introduced the regression. --- drivers/ide/ide-pm.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c index 9240609..8d1e32d 100644 --- a/drivers/ide/ide-pm.c +++ b/drivers/ide/ide-pm.c @@ -4,7 +4,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; @@ -40,7 +40,7 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg) int generic_ide_resume(struct device *dev) { - ide_drive_t *drive = dev_get_drvdata(dev); + ide_drive_t *drive = to_ide_device(dev); ide_drive_t *pair = ide_get_pair_dev(drive); ide_hwif_t *hwif = drive-hwif; struct request *rq; And now you can get rid of the useless dev_set_drvdata() call. That was in the other patch I think. Thanks, Rafael -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Wednesday, August 15, 2012, Hans de Goede wrote: > Hi, > > On 08/15/2012 07:13 AM, Miklos Szeredi wrote: > > Suspend oopses in generic_ide_suspend() because dev_get_drvdata() > > returns NULL (dev->p->driver_data == NULL) and this function is not > > prepared for this. > > > > I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no > > driver is bound). Reverting it fixes suspend. > > > > First of all, thanks for reporting and bisecting this. With that said, > I must say that this is very weird. The patch in question: > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 > > Only makes dev-drvdata NULL in 2 cases: > 1) The probe method of the driver fails > 2) The driver has been detached from the device by calling one of: > device_release_driver() or driver_detach() > > Note that in both code paths dev->driver also gets set to NULL, and > other generic ide driver callbacks very much depend on that not being > NULL, ie: > > static int generic_ide_remove(struct device *dev) > { > ide_drive_t *drive = to_ide_device(dev); > struct ide_driver *drv = to_ide_driver(dev->driver); > > if (drv->remove) > drv->remove(drive); > > return 0; > } > > Also how can a drivers suspend callback get called if dev->driver is NULL, > since that callback would normally be "reached" through dev->driver, so > something weird is going on here ... No, it wouldn't, because it is a bus type callback and it is invoked for all devices whose bus type is ide_bus_type, regardless of whether or not their driver field is NULL. It clearly should check if drive is not NULL before using that pointer. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/15/2012 07:13 AM, Miklos Szeredi wrote: Suspend oopses in generic_ide_suspend() because dev_get_drvdata() returns NULL (dev->p->driver_data == NULL) and this function is not prepared for this. I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no driver is bound). Reverting it fixes suspend. First of all, thanks for reporting and bisecting this. With that said, I must say that this is very weird. The patch in question: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 Only makes dev-drvdata NULL in 2 cases: 1) The probe method of the driver fails 2) The driver has been detached from the device by calling one of: device_release_driver() or driver_detach() Note that in both code paths dev->driver also gets set to NULL, and other generic ide driver callbacks very much depend on that not being NULL, ie: static int generic_ide_remove(struct device *dev) { ide_drive_t *drive = to_ide_device(dev); struct ide_driver *drv = to_ide_driver(dev->driver); if (drv->remove) drv->remove(drive); return 0; } Also how can a drivers suspend callback get called if dev->driver is NULL, since that callback would normally be "reached" through dev->driver, so something weird is going on here ... I hope one of the ide guys can shed some light on this. Regards, Hans -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi, On 08/15/2012 07:13 AM, Miklos Szeredi wrote: Suspend oopses in generic_ide_suspend() because dev_get_drvdata() returns NULL (dev-p-driver_data == NULL) and this function is not prepared for this. I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no driver is bound). Reverting it fixes suspend. First of all, thanks for reporting and bisecting this. With that said, I must say that this is very weird. The patch in question: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 Only makes dev-drvdata NULL in 2 cases: 1) The probe method of the driver fails 2) The driver has been detached from the device by calling one of: device_release_driver() or driver_detach() Note that in both code paths dev-driver also gets set to NULL, and other generic ide driver callbacks very much depend on that not being NULL, ie: static int generic_ide_remove(struct device *dev) { ide_drive_t *drive = to_ide_device(dev); struct ide_driver *drv = to_ide_driver(dev-driver); if (drv-remove) drv-remove(drive); return 0; } Also how can a drivers suspend callback get called if dev-driver is NULL, since that callback would normally be reached through dev-driver, so something weird is going on here ... I hope one of the ide guys can shed some light on this. Regards, Hans -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
On Wednesday, August 15, 2012, Hans de Goede wrote: Hi, On 08/15/2012 07:13 AM, Miklos Szeredi wrote: Suspend oopses in generic_ide_suspend() because dev_get_drvdata() returns NULL (dev-p-driver_data == NULL) and this function is not prepared for this. I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no driver is bound). Reverting it fixes suspend. First of all, thanks for reporting and bisecting this. With that said, I must say that this is very weird. The patch in question: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063 Only makes dev-drvdata NULL in 2 cases: 1) The probe method of the driver fails 2) The driver has been detached from the device by calling one of: device_release_driver() or driver_detach() Note that in both code paths dev-driver also gets set to NULL, and other generic ide driver callbacks very much depend on that not being NULL, ie: static int generic_ide_remove(struct device *dev) { ide_drive_t *drive = to_ide_device(dev); struct ide_driver *drv = to_ide_driver(dev-driver); if (drv-remove) drv-remove(drive); return 0; } Also how can a drivers suspend callback get called if dev-driver is NULL, since that callback would normally be reached through dev-driver, so something weird is going on here ... No, it wouldn't, because it is a bus type callback and it is invoked for all devices whose bus type is ide_bus_type, regardless of whether or not their driver field is NULL. It clearly should check if drive is not NULL before using that pointer. Thanks, Rafael -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/