Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > > > With the USB subsystem I have followed the approach taken by the PM > > > core, which is that tasks are frozen. But one can -- and Linus has on > > > at least one occasion -- make a good case that tasks should be left > > > running while only I/O is frozen. This would require the subsystem to > > > distinguish between a selective device suspend and a system-wide > > > suspend-to-RAM, so that selective resume could be enabled on demand in > > > one case but not the other. > > > > > > It's quite doable in principle -- it's just not the technique I used. > > > > I guess we need to do that. Random user should not be able to prevent > > machine from sleeping. > > Just to be clear about this, let's agree that we're talking about > suspend-to-RAM here, not hibernation. Yes. > It boils down to whether we want to freeze user tasks. As I recall, > Linus said that he didn't have any big objection to freezing user > threads; he was much more concerned about freezing kernel threads. > Thanks to Raphael's new notifier chains this will no longer be an > issue, since kernel threads will be able to stop themselves when they > receive a suspend notification. ... > The alternative is to have drivers take over the burden. I don't like > this at all. The most obvious disadvantage is that the necessary > checks would have to be duplicated many many times and spread out over > lots of drivers. I like freezer better :-). > It's also harder to handle these things at the driver level. Suppose a > driver gets an I/O request while a suspend is underway. What should it > do? Return an error? Block until the suspend is over? Both > approaches have their difficulties: > > Returning an error would mean that suspend is no longer transparent. > Even an error like -EAGAIN. No, -EAGAIN is not nice. > Waiting until the suspend is over is likely to be impractical. At a > minimum it would involve adding code to drop a lock or mutex, enter the > freezer (or its equivalent), and then restart the I/O operation. And > then, what if the driver was invoked with O_NONBLOCK? Blocking would be possible option. I agree it is tricky to implement... it may also be useful for a harddrive: "I'm riding a horse at 40kph now, so you'll kill the harddrive if you access it; just freeze everyone until we are at the other end of meadow". ...hmm, but this seems to be blockdevice specific, and I can't think of a network or char driver where similar behaviour would be useful. > I think it is much better overall to stop I/O requests from being > generated at the source, either by freezing userspace or preventing it > from making system calls. It's hard to imagine that anybody would > miss the small amount of CPU time they'd be giving up by not allowing > user threads to run during the time that a suspend is underway! Agreed. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 4 Jun 2007, Pavel Machek wrote: > > With the USB subsystem I have followed the approach taken by the PM > > core, which is that tasks are frozen. But one can -- and Linus has on > > at least one occasion -- make a good case that tasks should be left > > running while only I/O is frozen. This would require the subsystem to > > distinguish between a selective device suspend and a system-wide > > suspend-to-RAM, so that selective resume could be enabled on demand in > > one case but not the other. > > > > It's quite doable in principle -- it's just not the technique I used. > > I guess we need to do that. Random user should not be able to prevent > machine from sleeping. Just to be clear about this, let's agree that we're talking about suspend-to-RAM here, not hibernation. It boils down to whether we want to freeze user tasks. As I recall, Linus said that he didn't have any big objection to freezing user threads; he was much more concerned about freezing kernel threads. Thanks to Raphael's new notifier chains this will no longer be an issue, since kernel threads will be able to stop themselves when they receive a suspend notification. There may remain some obscure difficulties in discerning whether a particular thread should be classified as user or kernel, but let's ignore them. Even if we don't actively freeze user threads, approximately the same effect can be achieved in the following way: Change the main kernel entry points so that any thread performing a system call during a suspend will get frozen until the suspend is over. Threads that run entirely in userspace will continue doing useful work as before, and kernel threads won't be affected at all. (Not that I think it's necessary to do this; it's just a way to avoid freezing user tasks until they need it.) One way or another, freezing user tasks should not be a big deal. After all, once the suspend is complete eveything will effectively be frozen anyway. I suppose there might be issues involving tasks which need to run in order to complete the suspend -- IMO any such issues should be handled by carrying out the necessary actions before the point where we now start up the freezer. The alternative is to have drivers take over the burden. I don't like this at all. The most obvious disadvantage is that the necessary checks would have to be duplicated many many times and spread out over lots of drivers. It's also harder to handle these things at the driver level. Suppose a driver gets an I/O request while a suspend is underway. What should it do? Return an error? Block until the suspend is over? Both approaches have their difficulties: Returning an error would mean that suspend is no longer transparent. Even an error like -EAGAIN. Waiting until the suspend is over is likely to be impractical. At a minimum it would involve adding code to drop a lock or mutex, enter the freezer (or its equivalent), and then restart the I/O operation. And then, what if the driver was invoked with O_NONBLOCK? I think it is much better overall to stop I/O requests from being generated at the source, either by freezing userspace or preventing it from making system calls. It's hard to imagine that anybody would miss the small amount of CPU time they'd be giving up by not allowing user threads to run during the time that a suspend is underway! Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > > > The theoretical answer is that it behaves the way we want. The kernel > > > thread does selective resumes in response to device requests. If such > > > a request comes in while the system is asleep it will awaken the > > > system; so it's only logical that a request coming in while the system > > > is in the process of going to sleep should abort the suspend. > > > > I'd say that it shows ppc being broken. User wanted to suspend the > > system, and now unrelated task did lsusb... and system will not sleep. > > > > AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root > > will not be able to suspend the system. > > This is a matter of one's philosophy. In suspend-to-RAM, should tasks > be frozen or should I/O queues be frozen? > > With the USB subsystem I have followed the approach taken by the PM > core, which is that tasks are frozen. But one can -- and Linus has on > at least one occasion -- make a good case that tasks should be left > running while only I/O is frozen. This would require the subsystem to > distinguish between a selective device suspend and a system-wide > suspend-to-RAM, so that selective resume could be enabled on demand in > one case but not the other. > > It's quite doable in principle -- it's just not the technique I used. I guess we need to do that. Random user should not be able to prevent machine from sleeping. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > > >> Userspace should handle the async request just fine when it comes > > >> back > > >> running, regardless of the time it was submitted. > > > > > > Okay, so the solution is to convert the drivers to use > > > request_firmware_nowait() instead of request_firmware() in > > > their .resume() > > > routines. > > > > [added Rob Landley to CC] > > > > I think asynchronous loading should be made the default behaviour. > > However, we need to think and document how to do firmware loading. > > I'd love to write up documentation on this if anybody can tell me what it > should say. :) Well, chapter on firmware & suspend/hibernation should say something about either loading the firmware during early suspend, or just keeping firmware in ram from boot... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday 28 May 2007 4:48 am, Michael-Luke Jones wrote: > On 28 May 2007, at 08:43, Rafael J. Wysocki wrote: > > >> Seems, that's just the broken synchronous firmware loading interface > >> with the useless timeout handling. The nowait version of the same > >> loader > >> doesn't time out, and should not have that problem. The sync version > >> should be removed from the kernel, it just causes all sorts of > >> problems > >> since it exists. > >> > >> Userspace should handle the async request just fine when it comes > >> back > >> running, regardless of the time it was submitted. > > > > Okay, so the solution is to convert the drivers to use > > request_firmware_nowait() instead of request_firmware() in > > their .resume() > > routines. > > [added Rob Landley to CC] > > I think asynchronous loading should be made the default behaviour. > However, we need to think and document how to do firmware loading. I'd love to write up documentation on this if anybody can tell me what it should say. :) > Firmware loading can be done at two different times: > Device Driver Load > First use of Device Driver > > For a network device, this correlates to when the driver is first > loaded into memory or at 'ifconfig up' respectively. And for block devices ala: https://bugs.launchpad.net/ubuntu/dapper/+source/initramfs-tools/+bug/74004 https://issues.rpath.com/browse/RPL-1350 The first use could be "mount" or open of the block device under /dev. So now you have the mount and open syscalls returning -ENOFIRMWARE. This is not a small change. > At device driver load, firmware loading must be asynchronous. This is > because device driver init can occur before userspace runs and > registers a hotplug/uevent event handler. Userspace doesn't have to register a hotplug handler: /sys/hotplug is the default and if there is one in initramfs it should get called. You shouldn't have to wait for init to run for this to be the case. I believe the initramfs cpio archive used to get extracted just before the attempt to exec /init out of it, and that it was moved to much earlier in the boot process so that firmware loading could be done out of it for statically linked drivers. Look at do_basic_setup() in init/main.c: notice how usermodehelper_init() gets called right before driver_init(). Ask yourself: "why did it do that"? Notice how rest_init() forks the first kernel thread (PID 1) to run kernel_init(), and then becomes PID 0 (the idle task). So from kernel_init() it's ok to spawn all the new tasks we want because they can't take any reserved PIDs. > If device use is attempted > before firmware loads, a -ENOFIRMWARE call would be great so that > userspace and thus the user can be clearly informed what the problem is. Because of course every userspace utility in the world will immediately be rewritten to provide clear and informative error messages for this race condition corner case. Somehow, I can't bring myself to believe this. > However, at 'first use' firmware loading, the synchronous interface > should remain. 'ifconfig up' either works or it doesn't, and as I see > it, has to just hang around until firmware turns up. Why would you make two different code paths for module load and first use? > Documentation for how hotplug/uevent handlers should cope with these > types of firmware loading is also *strongly* requested, in order for > lightweight but fully functional implementations to be made. Happy to. I'm just trying to figure out what the correct behavior is so I can document it. Rob - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday 28 May 2007 5:06 am, Kay Sievers wrote: > Well, 10 seconds are just to short for userspace to react on some > setups, from tiny boxes which are busy, to 512 CPU boxes enumerating > thousands of devices, all had problems here. Any timeout for a > firmware-request is just a broken concept, the request should wait > forever, to be fulfilled or canceled from userspace when it's ready. If it's spawning a new usermode helper process, then figuring out when to give up and exit is that process's job. If it _can't_ exec usermode helper, then that should fail immediately. Where does the timeout come in? Rob - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Tue, 29 May 2007, David Brownell wrote: > On Monday 28 May 2007, Alan Stern wrote: > > > This is a matter of one's philosophy. In suspend-to-RAM, should tasks > > be frozen or should I/O queues be frozen? > > > > With the USB subsystem I have followed the approach taken by the PM > > core, which is that tasks are frozen. But one can -- and Linus has on > > at least one occasion -- make a good case that tasks should be left > > running while only I/O is frozen. > > In fact that makes a heck of a lot more sense to me from the > conceptual point of view. From the hardware perspective, the > task of preparing to enter true suspend states (STR, or suspend > for ACPI; embedded systems have more options) focusses on what > I/O signals are disabled. > > Once the relevant I/O signals are first idled, then disabled, > the CPU can do whatever it likes. Whether it runs or not is > purely a workload decision... > > Remember too that not all systems suffer from the constraints > that ACPI decrees. In particular, it's not uncommon that some > parts of the system be active in certain suspend states. The > whole point is to turn off as much of the system as possible, > especially the high power portions, while letting work proceed. > > Turning off some clocks and peripherals doesn't need to imply > turning them all off, or disabling DMA ... and should not need > to be triggered by a user (or userspace tool) explicitly saying > "go into STR". One can think of suspend-to-RAM as nothing more than a super-duper selective suspend of all devices (including the CPU!). This illustrates the relationship between suspend-to-RAM and runtime PM. However there is one important distinction, the DoS issue that Pavel raised. With true suspend-to-RAM, autoresume on demand is not enabled -- only on remote wakeup. With runtime PM, both are enabled. Elegant though it is, the "freeze I/O" approach suffers from implementation awkwardness. Ideally individual drivers shouldn't have to worry about it, but the subsystems certainly will. Consider as an example the class of char device drivers. The only subsystem they have in common is VFS. It would then be necessary for every entry point into VFS to check whether a suspend-to-RAM is in progress, and if it is, block until the suspend is over. Each one of those entry points is on a hot path, so adding these checks is the sort of thing one would like to avoid. By freezing tasks we can eliminate (most) I/O requests at the source with a single, relatively small amount of code (i.e., the refrigerator). Freezing I/O, on the other hand, would involve many checks dispersed throughout a large number of source files -- checks that would have to execute even when a suspend wasn't in progress. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sunday 27 May 2007 4:45 pm, Michael-Luke Jones wrote: > (Slightly OT: A particularly nasty race is when an initramfs > userspace is present, but firmware loading cannot occur because init > has not run, so proc hasn't been mounted, so a hotplug event handler > cannot be registered, despite the fact that the firmware is sitting > on the ramdisk mounted correctly...) I believe that the current functionality is that /sbin/hotplug is the default. I think that if there is one in initramfs, it'll get called as a usermode helper, even before init has run. (If it needs /proc /sys or /dev, setting it up properly is its problem, but not an insurmountable one.) I remember discussion of this a while back, and that it was indeed intentional. I dunno if anybody's actually tried it. Rob - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday 28 May 2007, Alan Stern wrote: > This is a matter of one's philosophy. In suspend-to-RAM, should tasks > be frozen or should I/O queues be frozen? > > With the USB subsystem I have followed the approach taken by the PM > core, which is that tasks are frozen. But one can -- and Linus has on > at least one occasion -- make a good case that tasks should be left > running while only I/O is frozen. In fact that makes a heck of a lot more sense to me from the conceptual point of view. From the hardware perspective, the task of preparing to enter true suspend states (STR, or suspend for ACPI; embedded systems have more options) focusses on what I/O signals are disabled. Once the relevant I/O signals are first idled, then disabled, the CPU can do whatever it likes. Whether it runs or not is purely a workload decision... Remember too that not all systems suffer from the constraints that ACPI decrees. In particular, it's not uncommon that some parts of the system be active in certain suspend states. The whole point is to turn off as much of the system as possible, especially the high power portions, while letting work proceed. Turning off some clocks and peripherals doesn't need to imply turning them all off, or disabling DMA ... and should not need to be triggered by a user (or userspace tool) explicitly saying "go into STR". > This would require the subsystem to > distinguish between a selective device suspend and a system-wide > suspend-to-RAM, so that selective resume could be enabled on demand in > one case but not the other. Exactly. "Selective suspend" of parts of the system is a far more general model. It fits well with runtime power management, degrades smoothly to states where memory goes into self-refresh (maybe the system idle loop when NO_HZ is being effective) or even hibernation (as discussed elsewhere). - Dave > It's quite doable in principle -- it's just not the technique I used. > > Alan Stern > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 28 May 2007, Pavel Machek wrote: > > The theoretical answer is that it behaves the way we want. The kernel > > thread does selective resumes in response to device requests. If such > > a request comes in while the system is asleep it will awaken the > > system; so it's only logical that a request coming in while the system > > is in the process of going to sleep should abort the suspend. > > I'd say that it shows ppc being broken. User wanted to suspend the > system, and now unrelated task did lsusb... and system will not sleep. > > AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root > will not be able to suspend the system. This is a matter of one's philosophy. In suspend-to-RAM, should tasks be frozen or should I/O queues be frozen? With the USB subsystem I have followed the approach taken by the PM core, which is that tasks are frozen. But one can -- and Linus has on at least one occasion -- make a good case that tasks should be left running while only I/O is frozen. This would require the subsystem to distinguish between a selective device suspend and a system-wide suspend-to-RAM, so that selective resume could be enabled on demand in one case but not the other. It's quite doable in principle -- it's just not the technique I used. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > >In fact, I'd like drivers to use notifiers to actually > >load the firmware into > >memory before hibernation/suspend. Namely, if there's > >PM_PRE_FREEZE, the > >driver calls request_firmware() from within the > >notifier and saves the firmware > >in memory for future use, if need be. Later, when > >PM_POST_THAW comes, the > >memory holding the firmware is released. > > > >Unfortunately there are drivers that call > >request_firmware() directly from > >.resume() which blocks until timeout expires and fails > >anyway. I just wanted > >this to fail immediately, without waiting. > > Stupid question time. Wouldn't it just be easier to have > request_firmware() keep a copy of the firmware once it's > been loaded? > We're not talking about a lot of memory that would be > wasted, and that > way no drivers have to be changed. Actually, I like this idea. Firmware problems magically disappear. ...unless someone uses x86 emulator in userland to POST graphics card. You can't 'cache' that. But that's separate problem. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > > > I can't speak for the second example, but there's a good reason the > > > first example works this way. It's not a matter of races; the problem > > > is that the kernel thread's job is to selectively suspend and resume > > > devices. We don't want it doing this while a system sleep is in > > > progress; it would (and in fact has, before the thread was made > > > freezable) cause the sleep transition to abort. > > > > How does this work on PPC or APM systems? > > For hibernation it behaves the same as on other types of systems. > > For STR it generally works okay. There was one report of suspends > aborting, and it looked like this was caused by selective resumes > originating from userspace. This seemed to be unrelated to the kernel > threads; apparently some program was running while the STR was in > progress, and causing the problem. For example, the lsusb program will > do a selective resume on every USB device as it scans through them all. > However that's just a guess, we haven't fully resolved that bug report. > > The theoretical answer is that it behaves the way we want. The kernel > thread does selective resumes in response to device requests. If such > a request comes in while the system is asleep it will awaken the > system; so it's only logical that a request coming in while the system > is in the process of going to sleep should abort the suspend. I'd say that it shows ppc being broken. User wanted to suspend the system, and now unrelated task did lsusb... and system will not sleep. AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root will not be able to suspend the system. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 5/28/07, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote: On Monday, 28 May 2007 17:55, Alan Stern wrote: > You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this > and the userspace helper change. Is it your intention that drivers > should continue to request these services but encounter an error if the > request occurs at the wrong time? Or do you expect drivers to use the > notifier chains to know when they shouldn't make any requests? In fact, I'd like drivers to use notifiers to actually load the firmware into memory before hibernation/suspend. Namely, if there's PM_PRE_FREEZE, the driver calls request_firmware() from within the notifier and saves the firmware in memory for future use, if need be. Later, when PM_POST_THAW comes, the memory holding the firmware is released. Unfortunately there are drivers that call request_firmware() directly from .resume() which blocks until timeout expires and fails anyway. I just wanted this to fail immediately, without waiting. Stupid question time. Wouldn't it just be easier to have request_firmware() keep a copy of the firmware once it's been loaded? We're not talking about a lot of memory that would be wasted, and that way no drivers have to be changed. Ray - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 28 May 2007, Matthew Garrett wrote: > On Mon, May 28, 2007 at 12:43:36PM -0400, Alan Stern wrote: > > On Mon, 28 May 2007, Matthew Garrett wrote: > > > On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote: > > > > > > > I can't speak for the second example, but there's a good reason the > > > > first example works this way. It's not a matter of races; the problem > > > > is that the kernel thread's job is to selectively suspend and resume > > > > devices. We don't want it doing this while a system sleep is in > > > > progress; it would (and in fact has, before the thread was made > > > > freezable) cause the sleep transition to abort. > > > > > > How does this work on PPC or APM systems? > > > > For hibernation it behaves the same as on other types of systems. > > > > For STR it generally works okay. There was one report of suspends > > aborting, and it looked like this was caused by selective resumes > > originating from userspace. This seemed to be unrelated to the kernel > > threads; apparently some program was running while the STR was in > > progress, and causing the problem. For example, the lsusb program will > > do a selective resume on every USB device as it scans through them all. > > However that's just a guess, we haven't fully resolved that bug report. > > > > The theoretical answer is that it behaves the way we want. The kernel > > thread does selective resumes in response to device requests. If such > > a request comes in while the system is asleep it will awaken the > > system; so it's only logical that a request coming in while the system > > is in the process of going to sleep should abort the suspend. > > Ok, I guess I'm still not clear on this :) If it doesn't cause major > problems on Powermac or APM systems, why is freezing the thread > beneficial on ACPI systems? Isn't it true, even on PPC and APM systems, that tasks are frozen for hibernation? The difference is that they aren't frozen for STR. Hence the answer to your question: Not freezing the thread would work okay on ACPI systems for STR. (In fact, not freezing anything would probably work okay for STR.) But it doesn't work with hibernation. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday, 28 May 2007 17:55, Alan Stern wrote: > On Sun, 27 May 2007, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > Use a hibernation and suspend notifier to disable the firmware requesting > > mechanism before a hibernation/suspend and enable it after the operation. > > You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this > and the userspace helper change. Is it your intention that drivers > should continue to request these services but encounter an error if the > request occurs at the wrong time? Or do you expect drivers to use the > notifier chains to know when they shouldn't make any requests? In fact, I'd like drivers to use notifiers to actually load the firmware into memory before hibernation/suspend. Namely, if there's PM_PRE_FREEZE, the driver calls request_firmware() from within the notifier and saves the firmware in memory for future use, if need be. Later, when PM_POST_THAW comes, the memory holding the firmware is released. Unfortunately there are drivers that call request_firmware() directly from .resume() which blocks until timeout expires and fails anyway. I just wanted this to fail immediately, without waiting. > In the second case you may have a problem, because there's no > specification about the order in which the notifiers are sent. The > service may get disabled before the driver learns it isn't available, > or the driver may think the service is once again available before it > gets enabled. Yes. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, May 28, 2007 at 12:43:36PM -0400, Alan Stern wrote: > On Mon, 28 May 2007, Matthew Garrett wrote: > > On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote: > > > > > I can't speak for the second example, but there's a good reason the > > > first example works this way. It's not a matter of races; the problem > > > is that the kernel thread's job is to selectively suspend and resume > > > devices. We don't want it doing this while a system sleep is in > > > progress; it would (and in fact has, before the thread was made > > > freezable) cause the sleep transition to abort. > > > > How does this work on PPC or APM systems? > > For hibernation it behaves the same as on other types of systems. > > For STR it generally works okay. There was one report of suspends > aborting, and it looked like this was caused by selective resumes > originating from userspace. This seemed to be unrelated to the kernel > threads; apparently some program was running while the STR was in > progress, and causing the problem. For example, the lsusb program will > do a selective resume on every USB device as it scans through them all. > However that's just a guess, we haven't fully resolved that bug report. > > The theoretical answer is that it behaves the way we want. The kernel > thread does selective resumes in response to device requests. If such > a request comes in while the system is asleep it will awaken the > system; so it's only logical that a request coming in while the system > is in the process of going to sleep should abort the suspend. Ok, I guess I'm still not clear on this :) If it doesn't cause major problems on Powermac or APM systems, why is freezing the thread beneficial on ACPI systems? -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 28 May 2007, Matthew Garrett wrote: > On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote: > > > I can't speak for the second example, but there's a good reason the > > first example works this way. It's not a matter of races; the problem > > is that the kernel thread's job is to selectively suspend and resume > > devices. We don't want it doing this while a system sleep is in > > progress; it would (and in fact has, before the thread was made > > freezable) cause the sleep transition to abort. > > How does this work on PPC or APM systems? For hibernation it behaves the same as on other types of systems. For STR it generally works okay. There was one report of suspends aborting, and it looked like this was caused by selective resumes originating from userspace. This seemed to be unrelated to the kernel threads; apparently some program was running while the STR was in progress, and causing the problem. For example, the lsusb program will do a selective resume on every USB device as it scans through them all. However that's just a guess, we haven't fully resolved that bug report. The theoretical answer is that it behaves the way we want. The kernel thread does selective resumes in response to device requests. If such a request comes in while the system is asleep it will awaken the system; so it's only logical that a request coming in while the system is in the process of going to sleep should abort the suspend. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote: > I can't speak for the second example, but there's a good reason the > first example works this way. It's not a matter of races; the problem > is that the kernel thread's job is to selectively suspend and resume > devices. We don't want it doing this while a system sleep is in > progress; it would (and in fact has, before the thread was made > freezable) cause the sleep transition to abort. How does this work on PPC or APM systems? -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, 27 May 2007, Matthew Garrett wrote: > > BTW, I know of two subsystems that want their kernel threads to be frozen > > for > > synchronization purposes. Please see these messages: > > > > 1) > > https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html > > (plus follow up) > > > > 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2 > > I'm not entirely sold on this. The issue is that there's the possibility > of races during suspend/resume? It sounds like that should be > implemented in the driver, rather than depending on a side-effect of > process freezing. Otherwise there's no way of selectively suspending > that device. I can't speak for the second example, but there's a good reason the first example works this way. It's not a matter of races; the problem is that the kernel thread's job is to selectively suspend and resume devices. We don't want it doing this while a system sleep is in progress; it would (and in fact has, before the thread was made freezable) cause the sleep transition to abort. Handling it entirely from within the drivers is possible in theory. Unfortunately the design of the PM core has not leant itself to such an approach. Using separate callbacks for hibernation vs. STR will help, as will Raphael's notifiers. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, 27 May 2007, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Use a hibernation and suspend notifier to disable the firmware requesting > mechanism before a hibernation/suspend and enable it after the operation. You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this and the userspace helper change. Is it your intention that drivers should continue to request these services but encounter an error if the request occurs at the wrong time? Or do you expect drivers to use the notifier chains to know when they shouldn't make any requests? In the second case you may have a problem, because there's no specification about the order in which the notifiers are sent. The service may get disabled before the driver learns it isn't available, or the driver may think the service is once again available before it gets enabled. Alan Stern - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 14:00, Pavel Machek wrote: I'm too lazy to figure out how it currently works, but I'll just state it is not my fault Figuring out how a userspace/kernelspace interface works should not rely on having to read kernelspace code. Unfortunately, in the case of hotplug / uevents, there is no such documentation. Thus, what kernelspace / userspace interactions actually are and what they should be is unclear, leading to confusion over corner cases, such as this one. Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Use a hibernation and suspend notifier to disable the firmware requesting > mechanism before a hibernation/suspend and enable it after the operation. > > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > > drivers/base/firmware_class.c | 36 > 1 file changed, 36 insertions(+) > > Index: linux-2.6.22-rc3/drivers/base/firmware_class.c > === > --- linux-2.6.22-rc3.orig/drivers/base/firmware_class.c 2007-05-27 > 19:43:03.0 +0200 > +++ linux-2.6.22-rc3/drivers/base/firmware_class.c2007-05-27 > 19:44:06.0 +0200 > @@ -17,6 +17,8 @@ > #include > #include > #include > +#include > +#include > > #include > #include "base.h" > @@ -35,6 +37,9 @@ enum { > > static int loading_timeout = 60; /* In seconds */ > > +/* If set, _request_firmware() will exit immediately returning -EBUSY */ > +static int requesting_firmware_disabled; > + > /* fw_lock could be moved to 'struct firmware_priv' but since it is just > * guarding for corner cases a global lock should be OK */ > static DEFINE_MUTEX(fw_lock); > @@ -397,6 +402,9 @@ _request_firmware(const struct firmware > struct firmware *firmware; > int retval; > > + if (requesting_firmware_disabled) > + return -EBUSY; WARN_ON()? This should not happen, and if it does, we'll end up with non-working device. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > >From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > >Use a hibernation and suspend notifier to disable the firmware > >requesting > >mechanism before a hibernation/suspend and enable it after the > >operation. > > > >Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > drivers/base/firmware_class.c | 36 ++ > >++ > > 1 file changed, 36 insertions(+) > > I don't like this approach, as I feel that the firmware loading > interface should be able to detect if a firmware load request is not > being handled, due to absence of userspace / hotplug handler presence. I don't think that's possible. If hotplug handler needs /dev/foo, but /dev/foo is not available, it will just block waiting there. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > *Grumble ahead* > Unfortunately, I don't properly understand the uevent interface, so > some of the above may be inaccurate. This is *not* my fault as there > is no decent documentation (and btw, telling me to write the Translated: "I don't know what I'm talking about, but I still like to flame people. I'm too lazy to figure out how it currently works, but I'll just state it is not my fault". Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 2007-05-28 at 13:26 +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 12:51, Kay Sievers wrote: > > > Either the whole idea of userspace firmware-loading should be > > considered > > as a problem impossible to do right because of its unsolvable side > > effects. Or at least releasing loaded firmware should be the exception > > for drivers which can be sure, that the firmware is not needed in > > situations we try to work around here. > > I don't believe that it is impossible. What is needed is some > thought, and maybe a loose spec. Hehe, let's see. :) > We need to think about safe ways of solving a simple problem: what to > do when userspace is unable to respond to a kernel uevent request. We > now have a timeout, whether it be handled synchronously or in the > background. I don't think either solution is good enough. > > [RFC] Possible Plan: > > 1) request_firmware() should stick around unmodified but be marked > __deprecated. > > 2) request_firmware_nowait() as it stands should probably be removed. > After all, it only has 2 in-kernel users at the moment. We should probably leave the whole firmware class alone, add something new, and convert the current users. > 3) uevent interface should be notified when userspace handlers are / > are not available so that events can be queued to be handled when > userspace does turn up and re-register a hotplug event handler. Uevents are queued by the netlink socket buffer. If userspace can't receice them (udevd) while it's frozen, they will just be handled (in the right order) when userspace runs again. We can queue up thousands of events, and it should already work that way for a long time now. > 4) request_firmware_async() introduced: asynchronous firmware loading > interface built on basis of 3, such that problems at early boot and > suspend / resume are avoided. Also request_firmware_async() taught to > retain firmware in memory by default such that subsequent calls do > not require a call to userspace if userspace is unavailable. Something like this makes sense, yes. > 6) Documentation on correct firmware loading behaviour for driver > authors written, together with migration notes for existing users of > request_firmware(). > > 7) Existing uses of request_firmware() converted. > > *Grumble ahead* > Unfortunately, I don't properly understand the uevent interface, so > some of the above may be inaccurate. The event emission in the kernel and the userspace side is probably fine. Two years ago, benh already run into exactly the same suspend problems, and I think I addressed the problems on the uevent side. > This is *not* my fault as there > is no decent documentation (and btw, telling me to write the > documentation is not the answer to that problem). Sure, we always need at least one working example before we can even tell what's the right way to do it. :) Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 12:51, Kay Sievers wrote: Either the whole idea of userspace firmware-loading should be considered as a problem impossible to do right because of its unsolvable side effects. Or at least releasing loaded firmware should be the exception for drivers which can be sure, that the firmware is not needed in situations we try to work around here. I don't believe that it is impossible. What is needed is some thought, and maybe a loose spec. We need to think about safe ways of solving a simple problem: what to do when userspace is unable to respond to a kernel uevent request. We now have a timeout, whether it be handled synchronously or in the background. I don't think either solution is good enough. [RFC] Possible Plan: 1) request_firmware() should stick around unmodified but be marked __deprecated. 2) request_firmware_nowait() as it stands should probably be removed. After all, it only has 2 in-kernel users at the moment. 3) uevent interface should be notified when userspace handlers are / are not available so that events can be queued to be handled when userspace does turn up and re-register a hotplug event handler. 4) request_firmware_async() introduced: asynchronous firmware loading interface built on basis of 3, such that problems at early boot and suspend / resume are avoided. Also request_firmware_async() taught to retain firmware in memory by default such that subsequent calls do not require a call to userspace if userspace is unavailable. 6) Documentation on correct firmware loading behaviour for driver authors written, together with migration notes for existing users of request_firmware(). 7) Existing uses of request_firmware() converted. *Grumble ahead* Unfortunately, I don't properly understand the uevent interface, so some of the above may be inaccurate. This is *not* my fault as there is no decent documentation (and btw, telling me to write the documentation is not the answer to that problem). Michael-Luke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 13:01, Kay Sievers wrote: In our development model, users of an interface are expected to improve it, because nobody else really knows what's needed. That unfortunately didn't happen so far. Thanks for responding :) The fact that no-one is told to use the new right way (tm) in any available documentation does not help matters improve. One issue is that the 'asynchronous' interface seems to rely on the same 'dodgy' timeout mechanism as the original request_firmware() call... Looks like the lack of a maintainer should be the opportunity for a rework of this API. Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 2007-05-28 at 11:26 +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 10:06, Kay Sievers wrote: > > > The underlying issue are the driver authors, that's not so easy to > > fix. :) > > Sorry, I know this maybe be unintentional, but comments like this > make me somewhat angry. It was the response to "pushing a bodge upstream", while we just try to make things working. :) We are fighting that broken firmware-loader model for years, but every driver author seems to have its own idea how that should work in theory, but they are not even willing to switch to the existing version that is expected to work better. > If there is no decent documentation as to how to do it the right way > (tm), how do you expect people to do it the right way (tm)? > > > Any timeout for a > > firmware-request is just a broken concept, the request should wait > > forever, to be fulfilled or canceled from userspace when it's ready. > > What I wrote above is especially true when the in-kernel APIs > themselves do things the wrong way (tm) themselves. Thus, even more > thought is required to work around this imperfect behaviour in a sane > manner. And without documentation, every single device driver author > has to go through this thought process themselves. Unsurprisingly, > they often get it wrong. But as there's no decent documentation to do > it right, it's *not* their fault. I'd suggest it's more the fault of > the core kernel devs who fail to fix up a questionable firmware > loading interface, then fail to document how to work around its > shortcomings. > > Again, apologies if this sounds angry, I don't want to start an > argument. But as someone just starting out here, this kind of thing > can be very frustrating, as even with the best will in the world, > achieving the right way (tm) is basically impossible if those in the > know about what the right way (tm) is fail to share this information. In our development model, users of an interface are expected to improve it, because nobody else really knows what's needed. That unfortunately didn't happen so far. We get this kind of conversation every few months since a few years now. I wonder, if we will see code, or at least come up with a better idea this time. :) Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 2007-05-28 at 12:38 +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 12:24, Kay Sievers wrote: > > > A driver for a bootup-critical device like this should just never > > release the firmware after the first load. There is absolutely no > > point > > in doing that. > > Bogus argument: is a USB-Ethernet device which needs firmware loading > boot-up critical? Not on the surface, but if the device loads root > over this device, it suddenly is. Releasing loaded firmware for anything that needs to handle situations like this just doesn't make sense. > This functionality should also be written into the firmware-class > (and this fact *is* acknowledged in the sparse documentation). Yeah, but these are just words, nothing people could use. Unfortunately the author of this document died a few years ago. Either the whole idea of userspace firmware-loading should be considered as a problem impossible to do right because of its unsolvable side effects. Or at least releasing loaded firmware should be the exception for drivers which can be sure, that the firmware is not needed in situations we try to work around here. Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
> This functionality should also be written into the firmware-class > (and this fact *is* acknowledged in the sparse documentation). Feel free to submit documentation improvements. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 12:24, Kay Sievers wrote: A driver for a bootup-critical device like this should just never release the firmware after the first load. There is absolutely no point in doing that. Bogus argument: is a USB-Ethernet device which needs firmware loading boot-up critical? Not on the surface, but if the device loads root over this device, it suddenly is. This functionality should also be written into the firmware-class (and this fact *is* acknowledged in the sparse documentation). Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon 2007-05-28 13:24:53, Kay Sievers wrote: > On Mon, 2007-05-28 at 13:15 +0200, Pavel Machek wrote: > > Hi! > > > > > > > > What exactly is the problem we see here? The timeout of the > > > > > > firmware loader? > > > > > > What goes wrong with frozen userspace, usually there is only a > > > > > > netlink > > > > > > message sent from the kernel, which should be received and handled > > > > > > just fine when userspace is running again. > > > > > > > > > > Driver calls request_firmware in the resume method. The userspace > > > > > helper > > > > > can't be run because it's been frozen, so the firmware never gets > > > > > loaded > > > > > and the call times out. The driver then fails to resume. While all > > > > > this > > > > > is happening, the rest of the kernel is blocking on that resume > > > > > method. > > > > > The firmware can be loaded once userspace has been started again, but > > > > > by > > > > > that time the driver has given up. > > > > > > > > Seems, that's just the broken synchronous firmware loading interface > > > > with the useless timeout handling. The nowait version of the same loader > > > > doesn't time out, and should not have that problem. The sync version > > > > should be removed from the kernel, it just causes all sorts of problems > > > > since it exists. > > > > > > > > Userspace should handle the async request just fine when it comes back > > > > running, regardless of the time it was submitted. > > > > > > Okay, so the solution is to convert the drivers to use > > > request_firmware_nowait() instead of request_firmware() in their .resume() > > > routines. > > > > You'll just get deadlock at different level (and more rare). > > > > Imagine disk with its firmware on NFS and NFS with its firmware on > > disk. > > > > (Or maybe firmware loader doing find /, including both disk and > > NFS). Just don't call request_firmware_* from .resume(). > > A driver for a bootup-critical device like this should just never > release the firmware after the first load. There is absolutely no point > in doing that. It does not have to be _bootup-critical_ device. Problem is any device that might be used by userspace firmware loader. And that is _any_ device. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 2007-05-28 at 13:15 +0200, Pavel Machek wrote: > Hi! > > > > > > What exactly is the problem we see here? The timeout of the firmware > > > > > loader? > > > > > What goes wrong with frozen userspace, usually there is only a netlink > > > > > message sent from the kernel, which should be received and handled > > > > > just fine when userspace is running again. > > > > > > > > Driver calls request_firmware in the resume method. The userspace > > > > helper > > > > can't be run because it's been frozen, so the firmware never gets > > > > loaded > > > > and the call times out. The driver then fails to resume. While all this > > > > is happening, the rest of the kernel is blocking on that resume method. > > > > The firmware can be loaded once userspace has been started again, but > > > > by > > > > that time the driver has given up. > > > > > > Seems, that's just the broken synchronous firmware loading interface > > > with the useless timeout handling. The nowait version of the same loader > > > doesn't time out, and should not have that problem. The sync version > > > should be removed from the kernel, it just causes all sorts of problems > > > since it exists. > > > > > > Userspace should handle the async request just fine when it comes back > > > running, regardless of the time it was submitted. > > > > Okay, so the solution is to convert the drivers to use > > request_firmware_nowait() instead of request_firmware() in their .resume() > > routines. > > You'll just get deadlock at different level (and more rare). > > Imagine disk with its firmware on NFS and NFS with its firmware on > disk. > > (Or maybe firmware loader doing find /, including both disk and > NFS). Just don't call request_firmware_* from .resume(). A driver for a bootup-critical device like this should just never release the firmware after the first load. There is absolutely no point in doing that. Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday, 28 May 2007 13:15, Pavel Machek wrote: > Hi! > > > > > > What exactly is the problem we see here? The timeout of the firmware > > > > > loader? > > > > > What goes wrong with frozen userspace, usually there is only a netlink > > > > > message sent from the kernel, which should be received and handled > > > > > just fine when userspace is running again. > > > > > > > > Driver calls request_firmware in the resume method. The userspace > > > > helper > > > > can't be run because it's been frozen, so the firmware never gets > > > > loaded > > > > and the call times out. The driver then fails to resume. While all this > > > > is happening, the rest of the kernel is blocking on that resume method. > > > > The firmware can be loaded once userspace has been started again, but > > > > by > > > > that time the driver has given up. > > > > > > Seems, that's just the broken synchronous firmware loading interface > > > with the useless timeout handling. The nowait version of the same loader > > > doesn't time out, and should not have that problem. The sync version > > > should be removed from the kernel, it just causes all sorts of problems > > > since it exists. > > > > > > Userspace should handle the async request just fine when it comes back > > > running, regardless of the time it was submitted. > > > > Okay, so the solution is to convert the drivers to use > > request_firmware_nowait() instead of request_firmware() in their .resume() > > routines. > > You'll just get deadlock at different level (and more rare). > > Imagine disk with its firmware on NFS and NFS with its firmware on > disk. Yeah, I've just given the very same example in another message. :-) I believe the only _solution_ would be to load the firmware into memory before the suspend and use the in-RAM copy to upload it into the device in .resume(). A PM_PRE_FREEZE notifier could be used for that just fine, BTW. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday, 28 May 2007 10:30, Nigel Cunningham wrote: > Hi. > > On Sun, 2007-05-27 at 23:45 +0200, Rafael J. Wysocki wrote: > > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote: > > > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > > > > > Use a hibernation and suspend notifier to disable the firmware > > > > requesting > > > > mechanism before a hibernation/suspend and enable it after the > > > > operation. > > > > > > This avoids the problem of .resume methods calling userspace while > > > userspace is frozen and a resulting hang, but does it actually result in > > > the drivers beginning to work again? > > > > Well, this was acutally invented before you've decided to remove the > > freezing > > of tasks from the suspend code path (which I think is a mistake, but that's > > only my personal opinion, so it doesn't matter very much ;-)) and I regard > > it > > as a workaround. > > Suspend-to-ram code paths shouldn't assume userspace is unfrozen anyway. > Doesn't [u]swsusp have a code path like Suspend2 where we can suspend to > ram after writing the hibernation image? In that case, it will still be > possible that we seek to enter and leave S3 with processes frozen. That's correct. > Apologies if anyone has already mentioned this - I'm just starting to > play catchup. No one has and that's a valid point, I think. :-) Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi! > > > > What exactly is the problem we see here? The timeout of the firmware > > > > loader? > > > > What goes wrong with frozen userspace, usually there is only a netlink > > > > message sent from the kernel, which should be received and handled > > > > just fine when userspace is running again. > > > > > > Driver calls request_firmware in the resume method. The userspace helper > > > can't be run because it's been frozen, so the firmware never gets loaded > > > and the call times out. The driver then fails to resume. While all this > > > is happening, the rest of the kernel is blocking on that resume method. > > > The firmware can be loaded once userspace has been started again, but by > > > that time the driver has given up. > > > > Seems, that's just the broken synchronous firmware loading interface > > with the useless timeout handling. The nowait version of the same loader > > doesn't time out, and should not have that problem. The sync version > > should be removed from the kernel, it just causes all sorts of problems > > since it exists. > > > > Userspace should handle the async request just fine when it comes back > > running, regardless of the time it was submitted. > > Okay, so the solution is to convert the drivers to use > request_firmware_nowait() instead of request_firmware() in their .resume() > routines. You'll just get deadlock at different level (and more rare). Imagine disk with its firmware on NFS and NFS with its firmware on disk. (Or maybe firmware loader doing find /, including both disk and NFS). Just don't call request_firmware_* from .resume(). -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Now a technical rather than emotional response... On 28 May 2007, at 10:06, Kay Sievers wrote: At device driver load, firmware loading must be asynchronous. This is because device driver init can occur before userspace runs and registers a hotplug/uevent event handler. If device use is attempted before firmware loads, a -ENOFIRMWARE call would be great so that userspace and thus the user can be clearly informed what the problem is. Why would a driver create an interface before it has the needed firmware loaded? A valid point. But there should be some kind of error notification if firmware loading hasn't happened correctly rather than a permanent asynchronous wait in which the interface fails to turn up. Possibly a kernel information printk or something, which does not exist at the moment. However, at 'first use' firmware loading, the synchronous interface should remain. 'ifconfig up' either works or it doesn't, and as I see it, has to just hang around until firmware turns up. What kind of network driver does create an interface for a non-functioning device? That sounds like a bug on its own. Unclear. My point was that when ifconfig up exits, the interface should be up, not asynchronously waiting for firmware to be loaded, then taken up in the background. Thus, firmware loading in this case should be kept synchronous, in my opinion. If a driver binds to a device, it should just have the firmware already loaded, and not wait until its used. What's the reason for such a behavior, to let a driver pretend it can handle a device, but it doesn't even know if all the needed pieces are available on the system? Basically, you have a device which can carry out different functions depending on the firmware loaded into it. Driver A is specific to this device, and loads the firmware. Driver B uses functions exported by Driver A to carry out one particular function of the device. Driver C uses the same functions to carry out a totally different function on the same device, but with different firmware loaded. Add in multiple devices handled by Driver A, all with different functionality, and sometimes with combinations of functionality that can coexist, and you see that when Driver A loads it cannot possibly know which firmware to load, but must wait for other Drivers to turn up and be put into use. Thus it 'pretends' to handle all the devices until it's forced to make a choice. Yes, this is hellishly complicated. Blame Intel :) The underlying issue are the driver authors, that's not so easy to fix. :) Addressed in previous email. Well, 10 seconds are just to short for userspace to react on some setups, from tiny boxes which are busy, to 512 CPU boxes enumerating thousands of devices, all had problems here. Any timeout for a firmware-request is just a broken concept, the request should wait forever, to be fulfilled or canceled from userspace when it's ready. Absolutely. I said this in an earlier email and suggested rejecting this patchset on the grounds that it was another bodge covering over a fundamentally broken area of the kernel :) Kay Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 10:06, Kay Sievers wrote: The underlying issue are the driver authors, that's not so easy to fix. :) Sorry, I know this maybe be unintentional, but comments like this make me somewhat angry. If there is no decent documentation as to how to do it the right way (tm), how do you expect people to do it the right way (tm)? Any timeout for a firmware-request is just a broken concept, the request should wait forever, to be fulfilled or canceled from userspace when it's ready. What I wrote above is especially true when the in-kernel APIs themselves do things the wrong way (tm) themselves. Thus, even more thought is required to work around this imperfect behaviour in a sane manner. And without documentation, every single device driver author has to go through this thought process themselves. Unsurprisingly, they often get it wrong. But as there's no decent documentation to do it right, it's *not* their fault. I'd suggest it's more the fault of the core kernel devs who fail to fix up a questionable firmware loading interface, then fail to document how to work around its shortcomings. Again, apologies if this sounds angry, I don't want to start an argument. But as someone just starting out here, this kind of thing can be very frustrating, as even with the best will in the world, achieving the right way (tm) is basically impossible if those in the know about what the right way (tm) is fail to share this information. Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Mon, 2007-05-28 at 09:48 +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 08:43, Rafael J. Wysocki wrote: > > >> Seems, that's just the broken synchronous firmware loading interface > >> with the useless timeout handling. The nowait version of the same > >> loader > >> doesn't time out, and should not have that problem. The sync version > >> should be removed from the kernel, it just causes all sorts of > >> problems > >> since it exists. > >> > >> Userspace should handle the async request just fine when it comes > >> back > >> running, regardless of the time it was submitted. > > > > Okay, so the solution is to convert the drivers to use > > request_firmware_nowait() instead of request_firmware() in > > their .resume() > > routines. > > [added Rob Landley to CC] > > I think asynchronous loading should be made the default behaviour. > However, we need to think and document how to do firmware loading. > > Firmware loading can be done at two different times: > Device Driver Load > First use of Device Driver > > For a network device, this correlates to when the driver is first > loaded into memory or at 'ifconfig up' respectively. > > At device driver load, firmware loading must be asynchronous. This is > because device driver init can occur before userspace runs and > registers a hotplug/uevent event handler. If device use is attempted > before firmware loads, a -ENOFIRMWARE call would be great so that > userspace and thus the user can be clearly informed what the problem is. Why would a driver create an interface before it has the needed firmware loaded? > However, at 'first use' firmware loading, the synchronous interface > should remain. 'ifconfig up' either works or it doesn't, and as I see > it, has to just hang around until firmware turns up. What kind of network driver does create an interface for a non-functioning device? That sounds like a bug on its own. If a driver binds to a device, it should just have the firmware already loaded, and not wait until its used. What's the reason for such a behavior, to let a driver pretend it can handle a device, but it doesn't even know if all the needed pieces are available on the system? > One more thing, it seems that the asynchronous firmware loading > thread just spawns a _request_firmware() call which then times out at > 60 seconds. I think, if the first request fails it spawns another. 60 > seconds is *far* too long for this type of thing, and this was set at > 10 seconds before the last two kernel releases (which is even a bit > slow itself). Unfortunately, this appears to a case of quite senior > kernel developers pushing a bodge upstream rather than fixing the > underlying issue :( [1] [2] The underlying issue are the driver authors, that's not so easy to fix. :) Well, 10 seconds are just to short for userspace to react on some setups, from tiny boxes which are busy, to 512 CPU boxes enumerating thousands of devices, all had problems here. Any timeout for a firmware-request is just a broken concept, the request should wait forever, to be fulfilled or canceled from userspace when it's ready. Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 28 May 2007, at 08:43, Rafael J. Wysocki wrote: Seems, that's just the broken synchronous firmware loading interface with the useless timeout handling. The nowait version of the same loader doesn't time out, and should not have that problem. The sync version should be removed from the kernel, it just causes all sorts of problems since it exists. Userspace should handle the async request just fine when it comes back running, regardless of the time it was submitted. Okay, so the solution is to convert the drivers to use request_firmware_nowait() instead of request_firmware() in their .resume() routines. [added Rob Landley to CC] I think asynchronous loading should be made the default behaviour. However, we need to think and document how to do firmware loading. Firmware loading can be done at two different times: Device Driver Load First use of Device Driver For a network device, this correlates to when the driver is first loaded into memory or at 'ifconfig up' respectively. At device driver load, firmware loading must be asynchronous. This is because device driver init can occur before userspace runs and registers a hotplug/uevent event handler. If device use is attempted before firmware loads, a -ENOFIRMWARE call would be great so that userspace and thus the user can be clearly informed what the problem is. However, at 'first use' firmware loading, the synchronous interface should remain. 'ifconfig up' either works or it doesn't, and as I see it, has to just hang around until firmware turns up. One more thing, it seems that the asynchronous firmware loading thread just spawns a _request_firmware() call which then times out at 60 seconds. I think, if the first request fails it spawns another. 60 seconds is *far* too long for this type of thing, and this was set at 10 seconds before the last two kernel releases (which is even a bit slow itself). Unfortunately, this appears to a case of quite senior kernel developers pushing a bodge upstream rather than fixing the underlying issue :( [1] [2] Documentation for how hotplug/uevent handlers should cope with these types of firmware loading is also *strongly* requested, in order for lightweight but fully functional implementations to be made. Documentation > Reference Implementation :) Michael-Luke [1] http://tinyurl.com/2colng (git.kernel.org) [2] http://tinyurl.com/224h54 (redhat bugzilla) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
Hi. On Sun, 2007-05-27 at 23:45 +0200, Rafael J. Wysocki wrote: > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote: > > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > > > Use a hibernation and suspend notifier to disable the firmware requesting > > > mechanism before a hibernation/suspend and enable it after the operation. > > > > This avoids the problem of .resume methods calling userspace while > > userspace is frozen and a resulting hang, but does it actually result in > > the drivers beginning to work again? > > Well, this was acutally invented before you've decided to remove the freezing > of tasks from the suspend code path (which I think is a mistake, but that's > only my personal opinion, so it doesn't matter very much ;-)) and I regard it > as a workaround. Suspend-to-ram code paths shouldn't assume userspace is unfrozen anyway. Doesn't [u]swsusp have a code path like Suspend2 where we can suspend to ram after writing the hibernation image? In that case, it will still be possible that we seek to enter and leave S3 with processes frozen. Apologies if anyone has already mentioned this - I'm just starting to play catchup. Regards, Nigel signature.asc Description: This is a digitally signed message part
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday, 28 May 2007 00:16, Kay Sievers wrote: > On Sun, 2007-05-27 at 23:04 +0100, Matthew Garrett wrote: > > On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote: > > > > > What exactly is the problem we see here? The timeout of the firmware > > > loader? > > > What goes wrong with frozen userspace, usually there is only a netlink > > > message sent from the kernel, which should be received and handled > > > just fine when userspace is running again. > > > > Driver calls request_firmware in the resume method. The userspace helper > > can't be run because it's been frozen, so the firmware never gets loaded > > and the call times out. The driver then fails to resume. While all this > > is happening, the rest of the kernel is blocking on that resume method. > > The firmware can be loaded once userspace has been started again, but by > > that time the driver has given up. > > Seems, that's just the broken synchronous firmware loading interface > with the useless timeout handling. The nowait version of the same loader > doesn't time out, and should not have that problem. The sync version > should be removed from the kernel, it just causes all sorts of problems > since it exists. > > Userspace should handle the async request just fine when it comes back > running, regardless of the time it was submitted. Okay, so the solution is to convert the drivers to use request_firmware_nowait() instead of request_firmware() in their .resume() routines. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Monday, 28 May 2007 00:01, Matthew Garrett wrote: > On Sun, May 27, 2007 at 11:45:03PM +0200, Rafael J. Wysocki wrote: > > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote: > > > If we remove process freezing in STR, this should just work[1] without the > > > need to complicate things. > > > > Under the (optimistic, IMO) assumption that the relevant user space task > > won't > > block on I/O with a suspended device involved or something like this. > > Yeah, I forgot the footnote that was going to mention that. Clearly > there are issues if this is on some block device that hasn't been > resumed yet. Or worse. Suppose, for example, that the device which needs the firmware uploaded is your network adapter and the firmware file is on NFS. I don't think there's a solution to the "requesting firmware from .resume()" problem other than copying the firmware into memory _before_ the suspend and using this copy to upload the firmware in .resume(). > > BTW, I know of two subsystems that want their kernel threads to be frozen > > for > > synchronization purposes. Please see these messages: > > > > 1) > > https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html > > (plus follow up) > > > > 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2 > > I'm not entirely sold on this. The issue is that there's the possibility > of races during suspend/resume? Yes. > It sounds like that should be implemented in the driver, rather than > depending on a side-effect of process freezing. Otherwise there's no way > of selectively suspending that device. That's true, but we've been using the freezer for so long that at least some drivers started to rely on it being used. That's the reason, IMO, why we can't just drop the freezer right now. It can be _replaced_ by some other synchronization mechanisms, but not just dropped. Moreover, I think that to implement such mechanisms within device drivers we should first stop using the same .suspend() and .resume() routines for both hibernation and suspend. This is the plan right now (ie. to stop using .suspend() and .resume() for hibernation) and when we're done with that (yes, it'll take some time), then we can think of dropping the freezer entirely from the suspend code path. > > Besides, there's the hibernation that needs to freeze tasks for another > > reason, > > so it needs some way to ensure that drivers won't request firmware while > > the user land is frozen. > > I agree that that's an issue right now, but I think we should see if > this is repairable without just breaking those drivers. Well, I don't think we're breaking the drivers. > One option would be to defer resuming them until userspace is alive again - > that would be no worse than the suspend to RAM case without process freezing. Please consider the example with a firmware on NFS. :-) I think the _solution_ would be to fix the drivers. The other approaches are just workarounds, including the $subject patch. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, 2007-05-27 at 23:04 +0100, Matthew Garrett wrote: > On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote: > > > What exactly is the problem we see here? The timeout of the firmware loader? > > What goes wrong with frozen userspace, usually there is only a netlink > > message sent from the kernel, which should be received and handled > > just fine when userspace is running again. > > Driver calls request_firmware in the resume method. The userspace helper > can't be run because it's been frozen, so the firmware never gets loaded > and the call times out. The driver then fails to resume. While all this > is happening, the rest of the kernel is blocking on that resume method. > The firmware can be loaded once userspace has been started again, but by > that time the driver has given up. Seems, that's just the broken synchronous firmware loading interface with the useless timeout handling. The nowait version of the same loader doesn't time out, and should not have that problem. The sync version should be removed from the kernel, it just causes all sorts of problems since it exists. Userspace should handle the async request just fine when it comes back running, regardless of the time it was submitted. Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote: > What exactly is the problem we see here? The timeout of the firmware loader? > What goes wrong with frozen userspace, usually there is only a netlink > message sent from the kernel, which should be received and handled > just fine when userspace is running again. Driver calls request_firmware in the resume method. The userspace helper can't be run because it's been frozen, so the firmware never gets loaded and the call times out. The driver then fails to resume. While all this is happening, the rest of the kernel is blocking on that resume method. The firmware can be loaded once userspace has been started again, but by that time the driver has given up. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, May 27, 2007 at 11:45:03PM +0200, Rafael J. Wysocki wrote: > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote: > > If we remove process freezing in STR, this should just work[1] without the > > need to complicate things. > > Under the (optimistic, IMO) assumption that the relevant user space task won't > block on I/O with a suspended device involved or something like this. Yeah, I forgot the footnote that was going to mention that. Clearly there are issues if this is on some block device that hasn't been resumed yet. > BTW, I know of two subsystems that want their kernel threads to be frozen for > synchronization purposes. Please see these messages: > > 1) https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html > (plus follow up) > > 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2 I'm not entirely sold on this. The issue is that there's the possibility of races during suspend/resume? It sounds like that should be implemented in the driver, rather than depending on a side-effect of process freezing. Otherwise there's no way of selectively suspending that device. > Besides, there's the hibernation that needs to freeze tasks for another > reason, > so it needs some way to ensure that drivers won't request firmware while > the user land is frozen. I agree that that's an issue right now, but I think we should see if this is repairable without just breaking those drivers. One option would be to defer resuming them until userspace is alive again - that would be no worse than the suspend to RAM case without process freezing. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sunday, 27 May 2007 23:49, Kay Sievers wrote: > On 5/27/07, Matthew Garrett <[EMAIL PROTECTED]> wrote: > > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > > > Use a hibernation and suspend notifier to disable the firmware requesting > > > mechanism before a hibernation/suspend and enable it after the operation. > > > > This avoids the problem of .resume methods calling userspace while > > userspace is frozen and a resulting hang, but does it actually result in > > the drivers beginning to work again? If we remove process freezing in > > STR, this should just work[1] without the need to complicate things. On > > the other hand, if we don't want to support these functions in the > > suspend and resume methods we could just audit the kernel and remove > > them all. > > What exactly is the problem we see here? The timeout of the firmware loader? > What goes wrong with frozen userspace, usually there is only a netlink > message sent from the kernel, which should be received and handled > just fine when userspace is running again. Users report the timeout as a problem and it's not that straightforward to figure out what happens. Still, I agree it's much better if drivers don't use request_firmware() in their .resume() routines at all, because they shouldn't rely upon user land at this point. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 5/27/07, Matthew Garrett <[EMAIL PROTECTED]> wrote: On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Use a hibernation and suspend notifier to disable the firmware requesting > mechanism before a hibernation/suspend and enable it after the operation. This avoids the problem of .resume methods calling userspace while userspace is frozen and a resulting hang, but does it actually result in the drivers beginning to work again? If we remove process freezing in STR, this should just work[1] without the need to complicate things. On the other hand, if we don't want to support these functions in the suspend and resume methods we could just audit the kernel and remove them all. What exactly is the problem we see here? The timeout of the firmware loader? What goes wrong with frozen userspace, usually there is only a netlink message sent from the kernel, which should be received and handled just fine when userspace is running again. Thanks, Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sunday, 27 May 2007 22:45, Michael-Luke Jones wrote: > On 27 May 2007, at 21:31, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > Use a hibernation and suspend notifier to disable the firmware > > requesting > > mechanism before a hibernation/suspend and enable it after the > > operation. > > > > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > drivers/base/firmware_class.c | 36 ++ > > ++ > > 1 file changed, 36 insertions(+) > > I don't like this approach, as I feel that the firmware loading > interface should be able to detect if a firmware load request is not > being handled, due to absence of userspace / hotplug handler presence. In principle, I agree. In practice, though, I don't know how to make this happen. > Other circumstances in which this can be a problem is during bootup > when request_firmware() calls can be made before userspace is up and > init has run (even in the presence of an initramfs). > > (Slightly OT: A particularly nasty race is when an initramfs > userspace is present, but firmware loading cannot occur because init > has not run, so proc hasn't been mounted, so a hotplug event handler > cannot be registered, despite the fact that the firmware is sitting > on the ramdisk mounted correctly...) > > In short, a more general solution would be preferred, and preferably > one which allows firmware loading to *actually* occur once userspace > has actually turned up and registered a handler :) I agree, but well ... ;-) Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sunday, 27 May 2007 22:49, Matthew Garrett wrote: > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > > Use a hibernation and suspend notifier to disable the firmware requesting > > mechanism before a hibernation/suspend and enable it after the operation. > > This avoids the problem of .resume methods calling userspace while > userspace is frozen and a resulting hang, but does it actually result in > the drivers beginning to work again? Well, this was acutally invented before you've decided to remove the freezing of tasks from the suspend code path (which I think is a mistake, but that's only my personal opinion, so it doesn't matter very much ;-)) and I regard it as a workaround. > If we remove process freezing in STR, this should just work[1] without the > need to complicate things. Under the (optimistic, IMO) assumption that the relevant user space task won't block on I/O with a suspended device involved or something like this. BTW, I know of two subsystems that want their kernel threads to be frozen for synchronization purposes. Please see these messages: 1) https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html (plus follow up) 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2 Your patch breaks them and I suspect there are more cases like these. Besides, there's the hibernation that needs to freeze tasks for another reason, so it needs some way to ensure that drivers won't request firmware while the user land is frozen. > On the other hand, if we don't want to support these functions in the > suspend and resume methods we could just audit the kernel and remove > them all. Yes, I think we should do this. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Use a hibernation and suspend notifier to disable the firmware requesting > mechanism before a hibernation/suspend and enable it after the operation. This avoids the problem of .resume methods calling userspace while userspace is frozen and a resulting hang, but does it actually result in the drivers beginning to work again? If we remove process freezing in STR, this should just work[1] without the need to complicate things. On the other hand, if we don't want to support these functions in the suspend and resume methods we could just audit the kernel and remove them all. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend
On 27 May 2007, at 21:31, Rafael J. Wysocki wrote: From: Rafael J. Wysocki <[EMAIL PROTECTED]> Use a hibernation and suspend notifier to disable the firmware requesting mechanism before a hibernation/suspend and enable it after the operation. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> drivers/base/firmware_class.c | 36 ++ ++ 1 file changed, 36 insertions(+) I don't like this approach, as I feel that the firmware loading interface should be able to detect if a firmware load request is not being handled, due to absence of userspace / hotplug handler presence. Other circumstances in which this can be a problem is during bootup when request_firmware() calls can be made before userspace is up and init has run (even in the presence of an initramfs). (Slightly OT: A particularly nasty race is when an initramfs userspace is present, but firmware loading cannot occur because init has not run, so proc hasn't been mounted, so a hotplug event handler cannot be registered, despite the fact that the firmware is sitting on the ramdisk mounted correctly...) In short, a more general solution would be preferred, and preferably one which allows firmware loading to *actually* occur once userspace has actually turned up and registered a handler :) Michael-Luke Jones - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/