Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, 13 Oct 2011, Grant Likely wrote: For the deferred case; here is an example of the additional constraint. Consider the following hierarchy: -A +-B | +-C | +-D | +-E +-F +-G dpm_list could be ordered in at least the following ways (depending on exactly when devices get registered). There are many permutation, but children are always be listed after its direct parent. 1) ABECDFG (breadth first) 2) AEBFGCD (breadth first) 3) ABCDEFG (depth first) 4) AEFGBCD (depth first) Now, assume that device B depends on device F, and also assume that there is no way either to express that in the hierarchy or even for the constraint to be known at device registration time (the is exactly the situation we're dealing with on embedded platforms). Only the driver for B knows that it needs a resource provided by F's driver. So, the situation becomes that the ordering of dpm_list must now also be sorted so that non-tree dependencies are also accounted for. Of the list above, only sort order 4 meets the new constraint. The question then becomes, how can the dpm_list get resorted dynamically at runtime to ensure that the new constraints are always met without breaking old ones. Here are some options I can think of: 1) When a deferred probe succeeds, move the deferred device to the end of the dpm_list. Then the new sort order might be one of: 1) AECDFGB 2) AEFGCDB 3) ACDEFGB 4) AEFGCDB However, that approach breaks the guarantee that children appear after their parents (C D appear before B in all cases above) How can a device acquire children before it has a driver? 2a) When a deferred probe succeeds, move the deferred device and it's entire child hierarchy to the end of the list to become one of: This is the same as the above, under the reasonable assumption that devices without drivers can't have any children. Or to be more carefully precise, that a device with children won't need to defer a probe. We can check that easily enough: Fail a deferral if the device already has children. 2b) alternately, when *any* probe succeeds, move the deferred device and its children to the end of the list. This continues to maintain the parent-child relationship, and it ensures that the dpm_list is always also sorted in probe-order (devices bound to drivers will collect at the end of the list). We do not want to move entire hierarchies. And in fact, even in 1 or 2a, we would have to do the move from _within_ the deferred probe routine, not afterward, to make sure that it occurs before any children are added (and after all the prerequisite devices have been registered). 3) Add devices to dpm_list after successful probe instead of at device_add time. Potential problem: this means that only devices with drivers actually get suspend/resume calls. I don't know nearly enough about the PM subsystem, but I suspect that this is a problem. Yes, this is no good. 4) ignore parent-child relationships entirely and just move devices to the end of the list on successful probe. If it probed successfully, then it can be successfully suspended regardless of whether it has any children. That breaks the parent-child ordering, but guarantees the probe order ordering. Any devices without drivers will end up collecting at the beginning of the list, and will be suspended after all the devices with drivers. Problem: Same as with 3, I suspect that even devices without drivers need to process PM suspend, which makes this option unworkable. ... For my money, I think that option 2b shows the most promise despite the potential O(N^2) complexity. There may be a better algorithm for doing the runtime reordering that isn't O(N^2) that I haven't thought of. Having a list that is strongly ordered both on hierarchy *and* probe order I think is the right thing to do, even without deferred probe support. The answer is 1, but do the move at the appropriate time within the probe routine. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, 13 Oct 2011, Grant Likely wrote: However it's worth pointing out right at the start that we already have device_pm_move_before(), device_pm_move_after(), and device_pm_move_last(). They are intended specifically to provide drivers with a way of making sure that dpm_list is in the right order -- people have been aware of these issues for some time. I saw those. I also notice that they are only used by device_move() when reparenting a device (which is another wrinkle I hadn't though about). Reparenting a device becomes problematic if the probe order is also represented in the list. If device_move() gets called, then any implicit probe-order sorting for that device would be lost. I also notice that device_move disregards any children when moving a device, which could also be a problem. Although it looks like the only users of device_move are: drivers/media/video/pvrusb2/pvrusb2-v4l2.c drivers/s390/cio/device.c net/bluetooth/hci_sysfs.c net/bluetooth/rfcomm/tty.c So it may not be significant to adapt. I trust that very little will be needed. I haven't checked the existing callers, but it's reasonable to require that a device being moved not have any children. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 14, 2011 at 9:37 AM, Alan Stern st...@rowland.harvard.edu wrote: On Thu, 13 Oct 2011, Grant Likely wrote: For the deferred case; here is an example of the additional constraint. Consider the following hierarchy: -A +-B | +-C | +-D | +-E +-F +-G dpm_list could be ordered in at least the following ways (depending on exactly when devices get registered). There are many permutation, but children are always be listed after its direct parent. 1) ABECDFG (breadth first) 2) AEBFGCD (breadth first) 3) ABCDEFG (depth first) 4) AEFGBCD (depth first) Now, assume that device B depends on device F, and also assume that there is no way either to express that in the hierarchy or even for the constraint to be known at device registration time (the is exactly the situation we're dealing with on embedded platforms). Only the driver for B knows that it needs a resource provided by F's driver. So, the situation becomes that the ordering of dpm_list must now also be sorted so that non-tree dependencies are also accounted for. Of the list above, only sort order 4 meets the new constraint. The question then becomes, how can the dpm_list get resorted dynamically at runtime to ensure that the new constraints are always met without breaking old ones. Here are some options I can think of: 1) When a deferred probe succeeds, move the deferred device to the end of the dpm_list. Then the new sort order might be one of: 1) AECDFGB 2) AEFGCDB 3) ACDEFGB 4) AEFGCDB However, that approach breaks the guarantee that children appear after their parents (C D appear before B in all cases above) How can a device acquire children before it has a driver? There are a few potential situations in embedded systems (or at least nothing currently prevents it) where platform setup code constructs a device hierarchy without the aid of device drivers, and it is still possible for a parent device to be attached to a driver. IIUC, SPARC creates an entire hierarchy of platform_devices from all the nodes in the OpenFirmware device tree, and any of those devices can be bound to a driver. I don't like that approach, but at the very least it needs to be guarded against. On Fri, Oct 14, 2011 at 9:39 AM, Alan Stern st...@rowland.harvard.edu wrote: On Thu, 13 Oct 2011, Grant Likely wrote: I saw those. I also notice that they are only used by device_move() when reparenting a device (which is another wrinkle I hadn't though about). Reparenting a device becomes problematic if the probe order is also represented in the list. If device_move() gets called, then any implicit probe-order sorting for that device would be lost. I also notice that device_move disregards any children when moving a device, which could also be a problem. Although it looks like the only users of device_move are: drivers/media/video/pvrusb2/pvrusb2-v4l2.c drivers/s390/cio/device.c net/bluetooth/hci_sysfs.c net/bluetooth/rfcomm/tty.c So it may not be significant to adapt. I trust that very little will be needed. I haven't checked the existing callers, but it's reasonable to require that a device being moved not have any children. Yes, that does indeed simplify the issue considerably. Let's do that. So, for this patch, there will be two bits added. First, don't allow deferral on devices with children, and second a successful probe (deferred or otherwise) should always move a device to the end of the dap_list if it doesn't have children to guarantee that the list order satisfies both the hierarchical and probe order. Manjunath, do you think you can prototype this? g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, 14 Oct 2011, Grant Likely wrote: How can a device acquire children before it has a driver? There are a few potential situations in embedded systems (or at least nothing currently prevents it) where platform setup code constructs a device hierarchy without the aid of device drivers, and it is still possible for a parent device to be attached to a driver. IIUC, SPARC creates an entire hierarchy of platform_devices from all the nodes in the OpenFirmware device tree, and any of those devices can be bound to a driver. I don't like that approach, but at the very least it needs to be guarded against. Do these devices ever require deferred probes? On Fri, Oct 14, 2011 at 9:39 AM, Alan Stern st...@rowland.harvard.edu wrote: On Thu, 13 Oct 2011, Grant Likely wrote: I saw those. �I also notice that they are only used by device_move() when reparenting a device (which is another wrinkle I hadn't though about). �Reparenting a device becomes problematic if the probe order is also represented in the list. �If device_move() gets called, then any implicit probe-order sorting for that device would be lost. I also notice that device_move disregards any children when moving a device, which could also be a problem. Although it looks like the only users of device_move are: drivers/media/video/pvrusb2/pvrusb2-v4l2.c drivers/s390/cio/device.c net/bluetooth/hci_sysfs.c net/bluetooth/rfcomm/tty.c So it may not be significant to adapt. I trust that very little will be needed. �I haven't checked the existing callers, but it's reasonable to require that a device being moved not have any children. Yes, that does indeed simplify the issue considerably. Let's do that. So, for this patch, there will be two bits added. First, don't allow deferral on devices with children, and second a successful probe (deferred or otherwise) should always move a device to the end of the dap_list if it doesn't have children to guarantee that the list order satisfies both the hierarchical and probe order. Manjunath, do you think you can prototype this? I don't think the second part needs to be quite so invasive. Certainly drivers that never defer probes shouldn't require anything to be moved. A deferred probe _should_ move the device to the end of the list. But this needs to happen in the probe routine itself (after it has verified that all the other required devices are present and before it has registered any children) to prevent races. It can't be done in a central location. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 14, 2011 at 10:33 AM, Alan Stern st...@rowland.harvard.edu wrote: On Fri, 14 Oct 2011, Grant Likely wrote: How can a device acquire children before it has a driver? There are a few potential situations in embedded systems (or at least nothing currently prevents it) where platform setup code constructs a device hierarchy without the aid of device drivers, and it is still possible for a parent device to be attached to a driver. IIUC, SPARC creates an entire hierarchy of platform_devices from all the nodes in the OpenFirmware device tree, and any of those devices can be bound to a driver. I don't like that approach, but at the very least it needs to be guarded against. Do these devices ever require deferred probes? Yes, they very well might. However, I'm happy with the limitation that only leaf devices can take advantage of probe deferral. On Fri, Oct 14, 2011 at 9:39 AM, Alan Stern st...@rowland.harvard.edu wrote: On Thu, 13 Oct 2011, Grant Likely wrote: Yes, that does indeed simplify the issue considerably. Let's do that. So, for this patch, there will be two bits added. First, don't allow deferral on devices with children, and second a successful probe (deferred or otherwise) should always move a device to the end of the dap_list if it doesn't have children to guarantee that the list order satisfies both the hierarchical and probe order. Manjunath, do you think you can prototype this? I don't think the second part needs to be quite so invasive. Certainly drivers that never defer probes shouldn't require anything to be moved. Think about that a minute. Consider a dpm_list of devices: abcDefGh Now assume that D has an implicit dependency on G. If D gets probed first, then it will be deferred until G gets probed and then D will get retried and moved to the end of the list resulting in: abcefGhD Everything is good now for the order that things need to be suspended in. Now lets assume that the drivers get linked into the kernel in a different order (or the modules get loaded in a different order) and G gets probed first, followed by D. No deferral occurred and so no reordering occurs, resulting in: abcDefGh (unchanged) But now suspend is broken because D depends on G, but G will be suspended before D. This looks and smells like a bug to me. In fact, even without using probe deferral it looks like a bug because the dap_list isn't taking into account the fact that there are very likely to be implicit dependencies between devices that are not represented in the device hierarchy (maybe not common in PCs, but certainly is the case for embedded). But, it is also easy to solve by ensuring the dap_list is also probe-order sorted. A deferred probe _should_ move the device to the end of the list. But this needs to happen in the probe routine itself (after it has verified that all the other required devices are present and before it has registered any children) to prevent races. It can't be done in a central location. I'm really concerned about drivers having to implement this and not getting it correct; particularly when moving a device to the end of the list is cheap, and it shouldn't be a problem to do the move unconditionally after a driver has been matches, but before probe() is called. We may be able to keep watch to make sure that drivers using probe deferral are also reordering themselves, but that does nothing for the cases described above where the link order of the drivers determines probe order, not the dap_list order. g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, 14 Oct 2011, Grant Likely wrote: I don't think the second part needs to be quite so invasive. Certainly drivers that never defer probes shouldn't require anything to be moved. Think about that a minute. Consider a dpm_list of devices: abcDefGh Now assume that D has an implicit dependency on G. If D gets probed first, then it will be deferred until G gets probed and then D will get retried and moved to the end of the list resulting in: abcefGhD Everything is good now for the order that things need to be suspended in. Now lets assume that the drivers get linked into the kernel in a different order (or the modules get loaded in a different order) and G gets probed first, followed by D. No deferral occurred and so no reordering occurs, resulting in: abcDefGh (unchanged) But now suspend is broken because D depends on G, but G will be suspended before D. However D sometimes does defer probes. Therefore the device always needs to be moved, even though this particular probe wasn't deferred. This looks and smells like a bug to me. In fact, even without using probe deferral it looks like a bug because the dap_list isn't taking into account the fact that there are very likely to be implicit dependencies between devices that are not represented in the device hierarchy (maybe not common in PCs, but certainly is the case for embedded). But, it is also easy to solve by ensuring the dap_list is also probe-order sorted. A deferred probe _should_ move the device to the end of the list. �But this needs to happen in the probe routine itself (after it has verified that all the other required devices are present and before it has registered any children) to prevent races. �It can't be done in a central location. I'm really concerned about drivers having to implement this and not getting it correct; particularly when moving a device to the end of the list is cheap, and it shouldn't be a problem to do the move unconditionally after a driver has been matches, but before probe() is called. But that's too early. What if D gets moved to the end of the list, then G gets added to the list and probed, and then D's probe succeeds? And after the probe returns is too late, because by then there may well be child devices. We may be able to keep watch to make sure that drivers using probe deferral are also reordering themselves, but that does nothing for the cases described above where the link order of the drivers determines probe order, not the dap_list order. Devices need to be moved whenever they have any external dependencies, regardless of the particular order they get probed in. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 14, 2011 at 11:33 AM, Alan Stern st...@rowland.harvard.edu wrote: On Fri, 14 Oct 2011, Grant Likely wrote: I don't think the second part needs to be quite so invasive. Certainly drivers that never defer probes shouldn't require anything to be moved. Think about that a minute. Consider a dpm_list of devices: abcDefGh Now assume that D has an implicit dependency on G. If D gets probed first, then it will be deferred until G gets probed and then D will get retried and moved to the end of the list resulting in: abcefGhD Everything is good now for the order that things need to be suspended in. Now lets assume that the drivers get linked into the kernel in a different order (or the modules get loaded in a different order) and G gets probed first, followed by D. No deferral occurred and so no reordering occurs, resulting in: abcDefGh (unchanged) But now suspend is broken because D depends on G, but G will be suspended before D. However D sometimes does defer probes. Therefore the device always needs to be moved, even though this particular probe wasn't deferred. Yes, that's part of my point. This looks and smells like a bug to me. In fact, even without using probe deferral it looks like a bug because the dap_list isn't taking into account the fact that there are very likely to be implicit dependencies between devices that are not represented in the device hierarchy (maybe not common in PCs, but certainly is the case for embedded). But, it is also easy to solve by ensuring the dap_list is also probe-order sorted. A deferred probe _should_ move the device to the end of the list. But this needs to happen in the probe routine itself (after it has verified that all the other required devices are present and before it has registered any children) to prevent races. It can't be done in a central location. I'm really concerned about drivers having to implement this and not getting it correct; particularly when moving a device to the end of the list is cheap, and it shouldn't be a problem to do the move unconditionally after a driver has been matches, but before probe() is called. But that's too early. What if D gets moved to the end of the list, then G gets added to the list and probed, and then D's probe succeeds? So the argument is that if really_probe() called dpm_move_last() immediately before the dev-bus-probe()/drv-probe() call then there is a race condition that G could get both registered and probed before D's probe() starts using G's resources. And so, the list would still have G after D which is in the wrong order. Do I understand correctly? And after the probe returns is too late, because by then there may well be child devices. Agreed, after probe returns is definitely too late. We may be able to keep watch to make sure that drivers using probe deferral are also reordering themselves, but that does nothing for the cases described above where the link order of the drivers determines probe order, not the dap_list order. Devices need to be moved whenever they have any external dependencies, regardless of the particular order they get probed in. I suspect this gets messy in a hurry, but perhaps it is worth trying. So, any device that makes use of a PHY, GPIO line, codec, etc will need to call dpm_move_last() before adding child devices, correct? I'd be much happier to find a way to do this in core code though. And there is still a potential race condition here. For example, if G is in the middle of it's probe routine, and D gets probed between G registering GPIOs and calling dpm_move_last(), then we're in the same boat again. I think the window for this race can be considered to be of the same magnitude as the moved to early race described above. I need to think about this more... g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, 14 Oct 2011, Grant Likely wrote: However D sometimes does defer probes. �Therefore the device always needs to be moved, even though this particular probe wasn't deferred. Yes, that's part of my point. Okay, then we agree. :-) So the argument is that if really_probe() called dpm_move_last() immediately before the dev-bus-probe()/drv-probe() call then there is a race condition that G could get both registered and probed before D's probe() starts using G's resources. And so, the list would still have G after D which is in the wrong order. Do I understand correctly? Exactly so. Devices need to be moved whenever they have any external dependencies, regardless of the particular order they get probed in. I suspect this gets messy in a hurry, but perhaps it is worth trying. So, any device that makes use of a PHY, GPIO line, codec, etc will need to call dpm_move_last() before adding child devices, correct? Pretty much, yes. Unless the driver somehow knows that it will become sufficiently idle at an early suspend stage (like the prepare callback) that it doesn't need to use the PHY, GPIO, codec, etc. I'd be much happier to find a way to do this in core code though. And there is still a potential race condition here. For example, if G is in the middle of it's probe routine, and D gets probed between G registering GPIOs and calling dpm_move_last(), then we're in the same boat again. Of course, this means that G must call dpm_move_last() _before_ registering its GPIOs. So the overall flow of a probe routine is simple enough: 1. Check that all the resources you need are available. 2. If not, defer your probe. If yes, call dpm_move_last(). 3. Finish the probe, including registration of resources that will be available to other drivers (such as child devices). I think the window for this race can be considered to be of the same magnitude as the moved to early race described above. I need to think about this more... Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On 10/14/2011 10:20 AM, Grant Likely wrote: On Fri, Oct 14, 2011 at 10:33 AM, Alan Sternst...@rowland.harvard.edu wrote: On Fri, 14 Oct 2011, Grant Likely wrote: How can a device acquire children before it has a driver? There are a few potential situations in embedded systems (or at least nothing currently prevents it) where platform setup code constructs a device hierarchy without the aid of device drivers, and it is still possible for a parent device to be attached to a driver. IIUC, SPARC creates an entire hierarchy of platform_devices from all the nodes in the OpenFirmware device tree, and any of those devices can be bound to a driver. I don't like that approach, but at the very least it needs to be guarded against. Do these devices ever require deferred probes? Yes, they very well might. However, I'm happy with the limitation that only leaf devices can take advantage of probe deferral. I have: I2C-Bus-A +--Mux-I2C (controlled by parent I2C-Bus-A) +---I2C-Bus-1 | +--GPIO-Expander-A | +---I2C-Bus-2 +--GPIO-Expander-B These all have a parent/child relationship so no deferral is needed, so far so good. Then this: MDIO-Bus-A +---Mux-MDIO (controlled by GPIO-Expander-A) +---MDIO-Bus-1 | +---MDIO-Bus-2 +---PHY-1 | +---PHY-2 In this case the driver for Mux-MDIO needs to be deferred until *both* MDIO-Bus-A's driver *and* GPIO-Expander-B's driver are loaded. A perfect use case for the patch. Would you consider Mux-MDIO to be a 'leaf device'? If not, then I have real problems with 'the limitation that only leaf devices can take advantage of probe deferral' David Daney -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
CC Rafael and linux-pm On Thu, Oct 13, 2011 at 12:09 PM, Grant Likely grant.lik...@secretlab.ca wrote: On Tue, Oct 11, 2011 at 08:29:18PM +0800, Ming Lei wrote: On Tue, Oct 11, 2011 at 1:37 AM, Andrei Warkentin awarken...@vmware.com wrote: Hi, - Original Message - From: Greg KH g...@kroah.com To: Josh Triplett j...@joshtriplett.org Cc: G, Manjunath Kondaiah manj...@ti.com, linux-arm-ker...@lists.infradead.org, Grant Likely grant.lik...@secretlab.ca, linux-o...@vger.kernel.org, linux-mmc@vger.kernel.org, linux-ker...@vger.kernel.org, Dilan Lee di...@nvidia.com, Mark Brown broo...@opensource.wolfsonmicro.com, manjun...@jasper.es Sent: Saturday, October 8, 2011 11:55:02 AM Subject: Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism I'm a bit of a fly on the wall here, but I'm curious how this impacts suspend/resume. device_initialize-device_pm_init are called from device_register, so certainly this patch doesn't also ensure that the PM ordering matches probe ordering, which is bound to break suspend, right? Was this ever tested with the OMAP target? Shouldn't the Inside device_add(), device_pm_add is called before bus_probe_device, so the patch can't change the device order in pm list, and just change the driver probe order. That's the way it works now, but can it be reworked? It would be IMO, it depends on what shape you plan to rework. Currently, the deferred probe may found a resource dependency, but I am not sure that pm dependency is same with the resource dependency found during probe. possible to adjust the list order after successful probe. However, I'm not clear on the ordering rules for the dpm_list. Right now it is explicitly ordered to have parents before children, but as already expressed, that doesn't accurately represent ordering constraints for multiple device dependancies. Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? So, reordering the list would probably require maintaining the existing parent-child ordering constraint, but to also shift devices (and any possible children?) to be after drivers that are already probed. That alone will be difficult to implement and get right, but maybe the constraints can be simplified. It needs some further thought. g. thanks, -- Ming Lei -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, 13 Oct 2011, Ming Lei wrote: Inside device_add(), device_pm_add is called before bus_probe_device, so the patch can't change the device order in pm list, and just change the driver probe order. That's the way it works now, but can it be reworked? �It would be IMO, it depends on what shape you plan to rework. Currently, the deferred probe may found a resource dependency, but I am not sure that pm dependency is same with the resource dependency found during probe. possible to adjust the list order after successful probe. �However, I'm not clear on the ordering rules for the dpm_list. �Right now it is explicitly ordered to have parents before children, but as already expressed, that doesn't accurately represent ordering constraints for multiple device dependancies. Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? The requirement is that devices must be capable of resuming in the order given by dpm_list, and they must be capable of suspending in the reverse order. Therefore if device A can't work unless device B is functional, then B must come before A in dpm_list. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
Hi, On Thu, Oct 13, 2011 at 10:31 PM, Alan Stern st...@rowland.harvard.edu Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? The requirement is that devices must be capable of resuming in the order given by dpm_list, and they must be capable of suspending in the reverse order. Therefore if device A can't work unless device B is functional, then B must come before A in dpm_list. If all devices can support async suspend and resume correctly, looks like the device order given by dpm_list is not needed any longer, doesn't it? thanks, -- Ming Lei -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, 13 Oct 2011, Ming Lei wrote: Hi, On Thu, Oct 13, 2011 at 10:31 PM, Alan Stern st...@rowland.harvard.edu Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? The requirement is that devices must be capable of resuming in the order given by dpm_list, and they must be capable of suspending in the reverse order. Therefore if device A can't work unless device B is functional, then B must come before A in dpm_list. If all devices can support async suspend and resume correctly, looks like the device order given by dpm_list is not needed any longer, doesn't it? It _is_ needed, because the user can disable async suspend/resume via /sys/power/pm_async. Also, not all devices do support async suspend/resume. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, Oct 13, 2011 at 10:31:45AM -0400, Alan Stern wrote: On Thu, 13 Oct 2011, Ming Lei wrote: Inside device_add(), device_pm_add is called before bus_probe_device, so the patch can't change the device order in pm list, and just change the driver probe order. That's the way it works now, but can it be reworked? ?It would be IMO, it depends on what shape you plan to rework. Currently, the deferred probe may found a resource dependency, but I am not sure that pm dependency is same with the resource dependency found during probe. possible to adjust the list order after successful probe. ?However, I'm not clear on the ordering rules for the dpm_list. ?Right now it is explicitly ordered to have parents before children, but as already expressed, that doesn't accurately represent ordering constraints for multiple device dependancies. Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? The requirement is that devices must be capable of resuming in the order given by dpm_list, and they must be capable of suspending in the reverse order. Therefore if device A can't work unless device B is functional, then B must come before A in dpm_list. For the deferred case; here is an example of the additional constraint. Consider the following hierarchy: -A +-B | +-C | +-D | +-E +-F +-G dpm_list could be ordered in at least the following ways (depending on exactly when devices get registered). There are many permutation, but children are always be listed after its direct parent. 1) ABECDFG (breadth first) 2) AEBFGCD (breadth first) 3) ABCDEFG (depth first) 4) AEFGBCD (depth first) Now, assume that device B depends on device F, and also assume that there is no way either to express that in the hierarchy or even for the constraint to be known at device registration time (the is exactly the situation we're dealing with on embedded platforms). Only the driver for B knows that it needs a resource provided by F's driver. So, the situation becomes that the ordering of dpm_list must now also be sorted so that non-tree dependencies are also accounted for. Of the list above, only sort order 4 meets the new constraint. The question then becomes, how can the dpm_list get resorted dynamically at runtime to ensure that the new constraints are always met without breaking old ones. Here are some options I can think of: 1) When a deferred probe succeeds, move the deferred device to the end of the dpm_list. Then the new sort order might be one of: 1) AECDFGB 2) AEFGCDB 3) ACDEFGB 4) AEFGCDB However, that approach breaks the guarantee that children appear after their parents (C D appear before B in all cases above) 2a) When a deferred probe succeeds, move the deferred device and it's entire child hierarchy to the end of the list to become one of: 1) AEFGBCD 2) AEFGBCD 3) AEFGBCD 4) AEFGBCD (hey! they're all the same now, but there are other orderings possible that aren't) :-) Problem: Complexity increases. This requires iterating through all the children and performing a reorder for each. Fortunately, this should be too expensive since I believe each individual move is an O(1) operation. I don't think the code will need to walk the list for each device. Potential problem: This may result in unnecessary sorting in a lot of cases. It may be that the newly probed device is already after the device it depends on, but the driver just hadn't been probed early enough to avoid deferral. Potential problem: It may end up exposing implicit dependencies between drivers that weren't observed before. Potential problem: This completely breaks if a parent gets probed *after* its child which might happen if something other than the parent driver creates the child devices. I don't think this is a real problem, but I think the kernel would first need to ensure that none of the children are bound to drivers, and complain loudly if they are. Otherwise the dpm_list won't reflect probe order. 2b) alternately, when *any* probe succeeds, move the deferred device and its children to the end of the list. This continues to maintain the parent-child relationship, and it ensures that the dpm_list is always also sorted in probe-order (devices bound to drivers will collect at the end of the list). Potential problem: On a big device hierarchy, this will mean a lot of moves. It should still only be O(1) for each move, but O(N^2) over probing the entire hierarchy. Devices will end up being moved once for itself, and once for each parent and grandparent bound to a driver. It could become slow. This option also has the potential problem when a parent gets probed after its child as discussed in 2a. 3) Add devices to dpm_list after successful probe instead of at
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, 13 Oct 2011, Grant Likely wrote: For the deferred case; here is an example of the additional constraint. Consider the following hierarchy: -A +-B | +-C | +-D | +-E +-F +-G dpm_list could be ordered in at least the following ways (depending on exactly when devices get registered). There are many permutation, but children are always be listed after its direct parent. 1) ABECDFG (breadth first) 2) AEBFGCD (breadth first) 3) ABCDEFG (depth first) 4) AEFGBCD (depth first) Now, assume that device B depends on device F, and also assume that there is no way either to express that in the hierarchy or even for the constraint to be known at device registration time (the is exactly the situation we're dealing with on embedded platforms). Only the driver for B knows that it needs a resource provided by F's driver. So, the situation becomes that the ordering of dpm_list must now also be sorted so that non-tree dependencies are also accounted for. Of the list above, only sort order 4 meets the new constraint. The question then becomes, how can the dpm_list get resorted dynamically at runtime to ensure that the new constraints are always met without breaking old ones. Here are some options I can think of: This was a long message and I haven't absorbed the whole thing. However it's worth pointing out right at the start that we already have device_pm_move_before(), device_pm_move_after(), and device_pm_move_last(). They are intended specifically to provide drivers with a way of making sure that dpm_list is in the right order -- people have been aware of these issues for some time. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, Oct 13, 2011 at 02:16:42PM -0400, Alan Stern wrote: On Thu, 13 Oct 2011, Grant Likely wrote: For the deferred case; here is an example of the additional constraint. Consider the following hierarchy: -A +-B | +-C | +-D | +-E +-F +-G dpm_list could be ordered in at least the following ways (depending on exactly when devices get registered). There are many permutation, but children are always be listed after its direct parent. 1) ABECDFG (breadth first) 2) AEBFGCD (breadth first) 3) ABCDEFG (depth first) 4) AEFGBCD (depth first) Now, assume that device B depends on device F, and also assume that there is no way either to express that in the hierarchy or even for the constraint to be known at device registration time (the is exactly the situation we're dealing with on embedded platforms). Only the driver for B knows that it needs a resource provided by F's driver. So, the situation becomes that the ordering of dpm_list must now also be sorted so that non-tree dependencies are also accounted for. Of the list above, only sort order 4 meets the new constraint. The question then becomes, how can the dpm_list get resorted dynamically at runtime to ensure that the new constraints are always met without breaking old ones. Here are some options I can think of: This was a long message and I haven't absorbed the whole thing. heh; I did get rather verbose there. However it's worth pointing out right at the start that we already have device_pm_move_before(), device_pm_move_after(), and device_pm_move_last(). They are intended specifically to provide drivers with a way of making sure that dpm_list is in the right order -- people have been aware of these issues for some time. I saw those. I also notice that they are only used by device_move() when reparenting a device (which is another wrinkle I hadn't though about). Reparenting a device becomes problematic if the probe order is also represented in the list. If device_move() gets called, then any implicit probe-order sorting for that device would be lost. I also notice that device_move disregards any children when moving a device, which could also be a problem. Although it looks like the only users of device_move are: drivers/media/video/pvrusb2/pvrusb2-v4l2.c drivers/s390/cio/device.c net/bluetooth/hci_sysfs.c net/bluetooth/rfcomm/tty.c So it may not be significant to adapt. g. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 14, 2011 at 12:04 AM, Alan Stern st...@rowland.harvard.edu wrote: On Thu, 13 Oct 2011, Ming Lei wrote: Hi, On Thu, Oct 13, 2011 at 10:31 PM, Alan Stern st...@rowland.harvard.edu Maybe we should understand the correct model of the ordering constraints for the multiple device dependancies first, could you give a description or some examples about it? The requirement is that devices must be capable of resuming in the order given by dpm_list, and they must be capable of suspending in the reverse order. Therefore if device A can't work unless device B is functional, then B must come before A in dpm_list. If all devices can support async suspend and resume correctly, looks like the device order given by dpm_list is not needed any longer, doesn't it? It _is_ needed, because the user can disable async suspend/resume via /sys/power/pm_async. Also, not all devices do support async suspend/resume. Basically, the devices which don't support async suspend/resume should have out of tree PM dependency. If out of tree PM dependency issue is solved, all devices can support async suspend/resume. thanks, -- Ming Lei -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Mon, Oct 10, 2011 at 10:37:22AM -0700, Andrei Warkentin wrote: Hi, - Original Message - From: Greg KH g...@kroah.com To: Josh Triplett j...@joshtriplett.org Cc: G, Manjunath Kondaiah manj...@ti.com, linux-arm-ker...@lists.infradead.org, Grant Likely grant.lik...@secretlab.ca, linux-o...@vger.kernel.org, linux-mmc@vger.kernel.org, linux-ker...@vger.kernel.org, Dilan Lee di...@nvidia.com, Mark Brown broo...@opensource.wolfsonmicro.com, manjun...@jasper.es Sent: Saturday, October 8, 2011 11:55:02 AM Subject: Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism I'm a bit of a fly on the wall here, but I'm curious how this impacts suspend/resume. device_initialize-device_pm_init are called from device_register, so certainly this patch doesn't also ensure that the PM ordering matches probe ordering, which is bound to break suspend, right? Was this ever tested with the OMAP target? Shouldn't the PM change be also part of this patch set? I don't see why you would want to have this in without the PM changes. suspend/resume handling is already in TODO list: http://permalink.gmane.org/gmane.linux.ports.arm.kernel/135461 -M Maybe I have it all wrong though :-). A -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Tue, Oct 11, 2011 at 08:29:18PM +0800, Ming Lei wrote: On Tue, Oct 11, 2011 at 1:37 AM, Andrei Warkentin awarken...@vmware.com wrote: Hi, - Original Message - From: Greg KH g...@kroah.com To: Josh Triplett j...@joshtriplett.org Cc: G, Manjunath Kondaiah manj...@ti.com, linux-arm-ker...@lists.infradead.org, Grant Likely grant.lik...@secretlab.ca, linux-o...@vger.kernel.org, linux-mmc@vger.kernel.org, linux-ker...@vger.kernel.org, Dilan Lee di...@nvidia.com, Mark Brown broo...@opensource.wolfsonmicro.com, manjun...@jasper.es Sent: Saturday, October 8, 2011 11:55:02 AM Subject: Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism I'm a bit of a fly on the wall here, but I'm curious how this impacts suspend/resume. device_initialize-device_pm_init are called from device_register, so certainly this patch doesn't also ensure that the PM ordering matches probe ordering, which is bound to break suspend, right? Was this ever tested with the OMAP target? Shouldn't the Inside device_add(), device_pm_add is called before bus_probe_device, so the patch can't change the device order in pm list, and just change the driver probe order. That's the way it works now, but can it be reworked? It would be possible to adjust the list order after successful probe. However, I'm not clear on the ordering rules for the dpm_list. Right now it is explicitly ordered to have parents before children, but as already expressed, that doesn't accurately represent ordering constraints for multiple device dependancies. So, reordering the list would probably require maintaining the existing parent-child ordering constraint, but to also shift devices (and any possible children?) to be after drivers that are already probed. That alone will be difficult to implement and get right, but maybe the constraints can be simplified. It needs some further thought. g. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Tue, Oct 11, 2011 at 1:37 AM, Andrei Warkentin awarken...@vmware.com wrote: Hi, - Original Message - From: Greg KH g...@kroah.com To: Josh Triplett j...@joshtriplett.org Cc: G, Manjunath Kondaiah manj...@ti.com, linux-arm-ker...@lists.infradead.org, Grant Likely grant.lik...@secretlab.ca, linux-o...@vger.kernel.org, linux-mmc@vger.kernel.org, linux-ker...@vger.kernel.org, Dilan Lee di...@nvidia.com, Mark Brown broo...@opensource.wolfsonmicro.com, manjun...@jasper.es Sent: Saturday, October 8, 2011 11:55:02 AM Subject: Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism I'm a bit of a fly on the wall here, but I'm curious how this impacts suspend/resume. device_initialize-device_pm_init are called from device_register, so certainly this patch doesn't also ensure that the PM ordering matches probe ordering, which is bound to break suspend, right? Was this ever tested with the OMAP target? Shouldn't the Inside device_add(), device_pm_add is called before bus_probe_device, so the patch can't change the device order in pm list, and just change the driver probe order. PM change be also part of this patch set? I don't see why you would want to have this in without the PM changes. thanks, -- Ming Lei -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
Hi, - Original Message - From: Greg KH g...@kroah.com To: Josh Triplett j...@joshtriplett.org Cc: G, Manjunath Kondaiah manj...@ti.com, linux-arm-ker...@lists.infradead.org, Grant Likely grant.lik...@secretlab.ca, linux-o...@vger.kernel.org, linux-mmc@vger.kernel.org, linux-ker...@vger.kernel.org, Dilan Lee di...@nvidia.com, Mark Brown broo...@opensource.wolfsonmicro.com, manjun...@jasper.es Sent: Saturday, October 8, 2011 11:55:02 AM Subject: Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism I'm a bit of a fly on the wall here, but I'm curious how this impacts suspend/resume. device_initialize-device_pm_init are called from device_register, so certainly this patch doesn't also ensure that the PM ordering matches probe ordering, which is bound to break suspend, right? Was this ever tested with the OMAP target? Shouldn't the PM change be also part of this patch set? I don't see why you would want to have this in without the PM changes. Maybe I have it all wrong though :-). A -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 07, 2011 at 09:03:51PM -0700, Josh Triplett wrote: On Fri, Oct 07, 2011 at 02:23:26PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 01:57:15PM -0700, Josh Triplett wrote: On Thu, Oct 06, 2011 at 11:49:28PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. Given that the drivers which use this mechanism will not necessarily get built into the kernel, I'd suggest that it should remain optional and default to n. Those drivers can then add a dependency on PROBE_DEFER. Let's try to avoid adding more infrastructure to the kernel that takes up space even when unused; certainly embedded will appreciate not having this feature unless a driver needs it. How much extra space is this feature really? Just checked: 776 bytes, 640 of text and 136 of data. We have kconfig options for comparable amounts. I don't see it being anything larger than the amount of memory increase that just happened as I typed this email as part of the ongoing memory density changes. I don't know about the changes you mean Moore's law. Really, 776 bytes, just always enable it, it's not worth it. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Sat, Oct 08, 2011 at 08:55:02AM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 09:03:51PM -0700, Josh Triplett wrote: On Fri, Oct 07, 2011 at 02:23:26PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 01:57:15PM -0700, Josh Triplett wrote: On Thu, Oct 06, 2011 at 11:49:28PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. Given that the drivers which use this mechanism will not necessarily get built into the kernel, I'd suggest that it should remain optional and default to n. Those drivers can then add a dependency on PROBE_DEFER. Let's try to avoid adding more infrastructure to the kernel that takes up space even when unused; certainly embedded will appreciate not having this feature unless a driver needs it. How much extra space is this feature really? Just checked: 776 bytes, 640 of text and 136 of data. We have kconfig options for comparable amounts. I don't see it being anything larger than the amount of memory increase that just happened as I typed this email as part of the ongoing memory density changes. I don't know about the changes you mean Moore's law. Ah, I see. For new systems, sure; for systems or mechanisms with a pre-existing size constraint, that doesn't help. Really, 776 bytes, just always enable it, it's not worth it. 776 bytes alone, no; 776 bytes times the next (or previous) thousand features, yes. - Josh Triplett -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: From: Grant Likely grant.lik...@secretlab.ca Allow drivers to report at probe time that they cannot get all the resources required by the device, and should be retried at a later time. This should completely solve the problem of getting devices initialized in the right order. Right now this is mostly handled by mucking about with initcall ordering which is a complete hack, and doesn't even remotely handle the case where device drivers are in modules. This approach completely sidesteps the issues by allowing driver registration to occur in any order, and any driver can request to be retried after a few more other drivers get probed. Original patch posted by Grant Likely grant.lik...@secretlab.ca at: http://lwn.net/Articles/460522/ Enhancements to original patch by G, Manjunath Kondaiah manj...@ti.com - checkpatch warning fixes - added Kconfig symbol CONFIG_PROBE_DEFER - replacing normal workqueue with singlethread_workqueue - handling -EPROBE_DEFER error Signed-off-by: Grant Likely grant.lik...@secretlab.ca Signed-off-by: G, Manjunath Kondaiah manj...@ti.com --- Cc: linux-o...@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: Grant Likely grant.lik...@secretlab.ca Cc: Greg Kroah-Hartman g...@kroah.com Cc: Dilan Lee di...@nvidia.com Cc: Mark Brown broo...@opensource.wolfsonmicro.com Cc: Manjunath GKondaiah manjunath.gkonda...@linaro.org Cc: Arnd Bergmann a...@arndb.de drivers/base/Kconfig | 11 drivers/base/base.h|3 + drivers/base/core.c|6 ++ drivers/base/dd.c | 145 include/linux/device.h |7 ++ 5 files changed, 172 insertions(+), 0 deletions(-) diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 21cf46f..b412a71 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -172,6 +172,17 @@ config SYS_HYPERVISOR bool default n +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. It also cleans up this diff a lot, as you really don't want #ifdef in .c files. --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -28,6 +28,136 @@ #include base.h #include power/power.h +#if defined CONFIG_PROBE_DEFER +/* + * Deferred Probe infrastructure. Why not use kerneldoc? + * + * Sometimes driver probe order matters, but the kernel doesn't always have + * dependency information which means some drivers will get probed before a + * resource it depends on is available. For example, an SDHCI driver may + * first need a GPIO line from an i2c GPIO controller before it can be + * initialized. If a required resource is not available yet, a driver can + * request probing to be deferred by returning -EPROBE_DEFER from its probe hook + * + * Deferred probe maintains two lists of devices, a pending list and an active + * list. A driver returning -EPROBE_DEFER causes the device to be added to the + * pending list. + * + * The deferred_probe_mutex *must* be held any time the deferred_probe_*_list + * of the (struct device*)-deferred_probe pointers are manipulated + */ +static DEFINE_MUTEX(deferred_probe_mutex); +static LIST_HEAD(deferred_probe_pending_list); +static LIST_HEAD(deferred_probe_active_list); +static struct workqueue_struct *deferred_wq; + +/** + * deferred_probe_work_func() - Retry probing devices in the active list. + */ +static void deferred_probe_work_func(struct work_struct *work) +{ + struct device *dev; + /* Extra blank line please. + * This bit is tricky. We want to process every device in the + * deferred list, but devices can be removed from the list at any + * time while inside this for-each loop. There are two things that + * need to be protected against: That's what the klist structure is supposed to make easier, why not use that here? + * - if the device is removed from the deferred_probe_list, then we + * loose our place in the loop. Since any device can be removed + * asynchronously, list_for_each_entry_safe() wouldn't make things + * much better. Simplest solution is to restart walking the list + * whenever the
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Thu, Oct 06, 2011 at 11:49:28PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. Given that the drivers which use this mechanism will not necessarily get built into the kernel, I'd suggest that it should remain optional and default to n. Those drivers can then add a dependency on PROBE_DEFER. Let's try to avoid adding more infrastructure to the kernel that takes up space even when unused; certainly embedded will appreciate not having this feature unless a driver needs it. (That said, it still feels to me like an explicit dependency mechanism would make more sense than this try again later mechanism, but nonetheless try again later seems like an improvement over what we have now.) It also cleans up this diff a lot, as you really don't want #ifdef in .c files. Ideally the entire .c file could become conditional on PROBE_DEFER via kbuild, with the usual compatibility inlines in a .h file for the !PROBE_DEFER case. - Josh Triplett -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 07, 2011 at 01:57:15PM -0700, Josh Triplett wrote: On Thu, Oct 06, 2011 at 11:49:28PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. Given that the drivers which use this mechanism will not necessarily get built into the kernel, I'd suggest that it should remain optional and default to n. Those drivers can then add a dependency on PROBE_DEFER. Let's try to avoid adding more infrastructure to the kernel that takes up space even when unused; certainly embedded will appreciate not having this feature unless a driver needs it. How much extra space is this feature really? I don't see it being anything larger than the amount of memory increase that just happened as I typed this email as part of the ongoing memory density changes. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 7, 2011 at 12:49 AM, Greg KH g...@kroah.com wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. It also cleans up this diff a lot, as you really don't want #ifdef in .c files. Okay, we'll drop the kconfig + * This bit is tricky. We want to process every device in the + * deferred list, but devices can be removed from the list at any + * time while inside this for-each loop. There are two things that + * need to be protected against: That's what the klist structure is supposed to make easier, why not use that here? + * - if the device is removed from the deferred_probe_list, then we + * loose our place in the loop. Since any device can be removed + * asynchronously, list_for_each_entry_safe() wouldn't make things + * much better. Simplest solution is to restart walking the list + * whenever the current device gets removed. Not the most efficient, + * but is simple to implement and easy to audit for correctness. + * - if the device is unregistered, and freed, then there is a risk + * of a null pointer dereference. This code uses get/put_device() + * to ensure the device cannot disappear from under our feet. Ick, use a klist, it was created to handle this exact problem in the driver core, so to not use it would be wrong, right? This comment block is stale. I reworked the code before I handed it off to Manjunath, but I never rewrote the comment. The current code uses a pair of list to keep deferred devices separate from devices currently being probed, and the problem described above no longer exists. However, Manjunath and I will look into switching from the two list design to using klist. Thanks for the feedback. g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] drivercore: Add driver probe deferral mechanism
On Fri, Oct 07, 2011 at 02:23:26PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 01:57:15PM -0700, Josh Triplett wrote: On Thu, Oct 06, 2011 at 11:49:28PM -0700, Greg KH wrote: On Fri, Oct 07, 2011 at 10:33:07AM +0500, G, Manjunath Kondaiah wrote: +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. Why is this even an option? Why would you ever want it disabled? Why does it need to be selected? If you are going to default something to 'y' then just make it so it can't be turned off any other way by just not making it an option at all. Given that the drivers which use this mechanism will not necessarily get built into the kernel, I'd suggest that it should remain optional and default to n. Those drivers can then add a dependency on PROBE_DEFER. Let's try to avoid adding more infrastructure to the kernel that takes up space even when unused; certainly embedded will appreciate not having this feature unless a driver needs it. How much extra space is this feature really? Just checked: 776 bytes, 640 of text and 136 of data. We have kconfig options for comparable amounts. I don't see it being anything larger than the amount of memory increase that just happened as I typed this email as part of the ongoing memory density changes. I don't know about the changes you mean, but in any case I'd like to prevent mandatory size increases wherever possible. I'd love to see the size of allnoconfig getting *smaller* over time, not larger. - Josh Triplett -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] drivercore: Add driver probe deferral mechanism
From: Grant Likely grant.lik...@secretlab.ca Allow drivers to report at probe time that they cannot get all the resources required by the device, and should be retried at a later time. This should completely solve the problem of getting devices initialized in the right order. Right now this is mostly handled by mucking about with initcall ordering which is a complete hack, and doesn't even remotely handle the case where device drivers are in modules. This approach completely sidesteps the issues by allowing driver registration to occur in any order, and any driver can request to be retried after a few more other drivers get probed. Original patch posted by Grant Likely grant.lik...@secretlab.ca at: http://lwn.net/Articles/460522/ Enhancements to original patch by G, Manjunath Kondaiah manj...@ti.com - checkpatch warning fixes - added Kconfig symbol CONFIG_PROBE_DEFER - replacing normal workqueue with singlethread_workqueue - handling -EPROBE_DEFER error Signed-off-by: Grant Likely grant.lik...@secretlab.ca Signed-off-by: G, Manjunath Kondaiah manj...@ti.com --- Cc: linux-o...@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: Grant Likely grant.lik...@secretlab.ca Cc: Greg Kroah-Hartman g...@kroah.com Cc: Dilan Lee di...@nvidia.com Cc: Mark Brown broo...@opensource.wolfsonmicro.com Cc: Manjunath GKondaiah manjunath.gkonda...@linaro.org Cc: Arnd Bergmann a...@arndb.de drivers/base/Kconfig | 11 drivers/base/base.h|3 + drivers/base/core.c|6 ++ drivers/base/dd.c | 145 include/linux/device.h |7 ++ 5 files changed, 172 insertions(+), 0 deletions(-) diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 21cf46f..b412a71 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -172,6 +172,17 @@ config SYS_HYPERVISOR bool default n +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. + source drivers/base/regmap/Kconfig endmenu diff --git a/drivers/base/base.h b/drivers/base/base.h index a34dca0..95afe27 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -105,6 +105,9 @@ extern void bus_remove_driver(struct device_driver *drv); extern void driver_detach(struct device_driver *drv); extern int driver_probe_device(struct device_driver *drv, struct device *dev); +#ifdef CONFIG_PROBE_DEFER +extern void driver_deferred_probe_del(struct device *dev); +#endif static inline int driver_match_device(struct device_driver *drv, struct device *dev) { diff --git a/drivers/base/core.c b/drivers/base/core.c index bc8729d..4ba8653 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -588,6 +588,9 @@ void device_initialize(struct device *dev) { dev-kobj.kset = devices_kset; kobject_init(dev-kobj, device_ktype); +#ifdef CONFIG_PROBE_DEFER + INIT_LIST_HEAD(dev-deferred_probe); +#endif INIT_LIST_HEAD(dev-dma_pools); mutex_init(dev-mutex); lockdep_set_novalidate_class(dev-mutex); @@ -1119,6 +1122,9 @@ void device_del(struct device *dev) device_remove_file(dev, uevent_attr); device_remove_attrs(dev); bus_remove_device(dev); +#if defined CONFIG_PROBE_DEFER + driver_deferred_probe_del(dev); +#endif /* * Some platform devices are driven without driver attached diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 6658da7..f637da6 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -28,6 +28,136 @@ #include base.h #include power/power.h +#if defined CONFIG_PROBE_DEFER +/* + * Deferred Probe infrastructure. + * + * Sometimes driver probe order matters, but the kernel doesn't always have + * dependency information which means some drivers will get probed before a + * resource it depends on is available. For example, an SDHCI driver may + * first need a GPIO line from an i2c GPIO controller before it can be + * initialized. If a required resource is not available yet, a driver can + * request probing to be deferred by returning -EPROBE_DEFER from its probe hook + * + * Deferred probe maintains two lists of devices, a pending list and an active + * list. A driver returning -EPROBE_DEFER causes the device to be added to the + * pending list. + * + * The deferred_probe_mutex *must* be held any time the deferred_probe_*_list + * of the (struct device*)-deferred_probe pointers are manipulated + */ +static DEFINE_MUTEX(deferred_probe_mutex); +static
[PATCH 2/5] drivercore: Add driver probe deferral mechanism
From: Grant Likely grant.lik...@secretlab.ca Allow drivers to report at probe time that they cannot get all the resources required by the device, and should be retried at a later time. This should completely solve the problem of getting devices initialized in the right order. Right now this is mostly handled by mucking about with initcall ordering which is a complete hack, and doesn't even remotely handle the case where device drivers are in modules. This approach completely sidesteps the issues by allowing driver registration to occur in any order, and any driver can request to be retried after a few more other drivers get probed. Original patch posted by Grant Likely grant.lik...@secretlab.ca at: http://lwn.net/Articles/460522/ Enhancements to original patch by G, Manjunath Kondaiah manj...@ti.com - checkpatch warning fixes - added Kconfig symbol CONFIG_PROBE_DEFER - replacing normal workqueue with singlethread_workqueue - handling -EPROBE_DEFER error Signed-off-by: Grant Likely grant.lik...@secretlab.ca Signed-off-by: G, Manjunath Kondaiah manj...@ti.com --- Cc: linux-o...@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: Grant Likely grant.lik...@secretlab.ca Cc: Greg Kroah-Hartman g...@kroah.com Cc: Dilan Lee di...@nvidia.com Cc: Mark Brown broo...@opensource.wolfsonmicro.com Cc: Manjunath GKondaiah manjunath.gkonda...@linaro.org Cc: Arnd Bergmann a...@arndb.de drivers/base/Kconfig | 11 drivers/base/base.h|3 + drivers/base/core.c|6 ++ drivers/base/dd.c | 145 include/linux/device.h |7 ++ 5 files changed, 172 insertions(+), 0 deletions(-) diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 21cf46f..b412a71 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -172,6 +172,17 @@ config SYS_HYPERVISOR bool default n +config PROBE_DEFER + bool Deferred Driver Probe + default y + help + This option provides deferring driver probe if it has dependency on + other driver. Without this feature, initcall ordering should be done + manually to resolve driver dependencies. This feature completely side + steps the issues by allowing driver registration to occur in any + order, and any driver can request to be retried after a few more other + drivers get probed. + source drivers/base/regmap/Kconfig endmenu diff --git a/drivers/base/base.h b/drivers/base/base.h index a34dca0..95afe27 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -105,6 +105,9 @@ extern void bus_remove_driver(struct device_driver *drv); extern void driver_detach(struct device_driver *drv); extern int driver_probe_device(struct device_driver *drv, struct device *dev); +#ifdef CONFIG_PROBE_DEFER +extern void driver_deferred_probe_del(struct device *dev); +#endif static inline int driver_match_device(struct device_driver *drv, struct device *dev) { diff --git a/drivers/base/core.c b/drivers/base/core.c index bc8729d..4ba8653 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -588,6 +588,9 @@ void device_initialize(struct device *dev) { dev-kobj.kset = devices_kset; kobject_init(dev-kobj, device_ktype); +#ifdef CONFIG_PROBE_DEFER + INIT_LIST_HEAD(dev-deferred_probe); +#endif INIT_LIST_HEAD(dev-dma_pools); mutex_init(dev-mutex); lockdep_set_novalidate_class(dev-mutex); @@ -1119,6 +1122,9 @@ void device_del(struct device *dev) device_remove_file(dev, uevent_attr); device_remove_attrs(dev); bus_remove_device(dev); +#if defined CONFIG_PROBE_DEFER + driver_deferred_probe_del(dev); +#endif /* * Some platform devices are driven without driver attached diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 6658da7..f637da6 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -28,6 +28,136 @@ #include base.h #include power/power.h +#if defined CONFIG_PROBE_DEFER +/* + * Deferred Probe infrastructure. + * + * Sometimes driver probe order matters, but the kernel doesn't always have + * dependency information which means some drivers will get probed before a + * resource it depends on is available. For example, an SDHCI driver may + * first need a GPIO line from an i2c GPIO controller before it can be + * initialized. If a required resource is not available yet, a driver can + * request probing to be deferred by returning -EPROBE_DEFER from its probe hook + * + * Deferred probe maintains two lists of devices, a pending list and an active + * list. A driver returning -EPROBE_DEFER causes the device to be added to the + * pending list. + * + * The deferred_probe_mutex *must* be held any time the deferred_probe_*_list + * of the (struct device*)-deferred_probe pointers are manipulated + */ +static DEFINE_MUTEX(deferred_probe_mutex); +static