subject:"Re\: \[RFC\]\[PATCH \-mm 3\/3\] PM\: Disable _request

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-06-05 Thread Pavel Machek

Hi!

> > > With the USB subsystem I have followed the approach taken by the PM
> > > core, which is that tasks are frozen.  But one can -- and Linus has on
> > > at least one occasion -- make a good case that tasks should be left
> > > running while only I/O is frozen.  This would require the subsystem to
> > > distinguish between a selective device suspend and a system-wide
> > > suspend-to-RAM, so that selective resume could be enabled on demand in
> > > one case but not the other.
> > > 
> > > It's quite doable in principle -- it's just not the technique I used.
> > 
> > I guess we need to do that. Random user should not be able to prevent
> > machine from sleeping.
> 
> Just to be clear about this, let's agree that we're talking about 
> suspend-to-RAM here, not hibernation.

Yes.

> It boils down to whether we want to freeze user tasks.  As I recall,
> Linus said that he didn't have any big objection to freezing user
> threads; he was much more concerned about freezing kernel threads.  
> Thanks to Raphael's new notifier chains this will no longer be an
> issue, since kernel threads will be able to stop themselves when they
> receive a suspend notification.
...
> The alternative is to have drivers take over the burden.  I don't like
> this at all.  The most obvious disadvantage is that the necessary
> checks would have to be duplicated many many times and spread out over
> lots of drivers.

I like freezer better :-).

> It's also harder to handle these things at the driver level.  Suppose a
> driver gets an I/O request while a suspend is underway.  What should it
> do?  Return an error?  Block until the suspend is over?  Both
> approaches have their difficulties:
> 
> Returning an error would mean that suspend is no longer transparent.  
> Even an error like -EAGAIN.

No, -EAGAIN is not nice.

> Waiting until the suspend is over is likely to be impractical.  At a 
> minimum it would involve adding code to drop a lock or mutex, enter the 
> freezer (or its equivalent), and then restart the I/O operation.  And 
> then, what if the driver was invoked with O_NONBLOCK?

Blocking would be possible option. I agree it is tricky to
implement... it may also be useful for a harddrive:

"I'm riding a horse at 40kph now, so you'll kill the harddrive if you
access it; just freeze everyone until we are at the other end of
meadow".

...hmm, but this seems to be blockdevice specific, and I can't think
of a network or char driver where similar behaviour would be useful.

> I think it is much better overall to stop I/O requests from being
> generated at the source, either by freezing userspace or preventing it
> from making system calls.  It's hard to imagine that anybody would
> miss the small amount of CPU time they'd be giving up by not allowing
> user threads to run during the time that a suspend is underway!

Agreed.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-06-05 Thread Alan Stern

On Mon, 4 Jun 2007, Pavel Machek wrote:

> > With the USB subsystem I have followed the approach taken by the PM
> > core, which is that tasks are frozen.  But one can -- and Linus has on
> > at least one occasion -- make a good case that tasks should be left
> > running while only I/O is frozen.  This would require the subsystem to
> > distinguish between a selective device suspend and a system-wide
> > suspend-to-RAM, so that selective resume could be enabled on demand in
> > one case but not the other.
> > 
> > It's quite doable in principle -- it's just not the technique I used.
> 
> I guess we need to do that. Random user should not be able to prevent
> machine from sleeping.

Just to be clear about this, let's agree that we're talking about 
suspend-to-RAM here, not hibernation.

It boils down to whether we want to freeze user tasks.  As I recall,
Linus said that he didn't have any big objection to freezing user
threads; he was much more concerned about freezing kernel threads.  
Thanks to Raphael's new notifier chains this will no longer be an
issue, since kernel threads will be able to stop themselves when they
receive a suspend notification.

There may remain some obscure difficulties in discerning whether a 
particular thread should be classified as user or kernel, but let's 
ignore them.

Even if we don't actively freeze user threads, approximately the same
effect can be achieved in the following way: Change the main kernel
entry points so that any thread performing a system call during a
suspend will get frozen until the suspend is over.  Threads that run
entirely in userspace will continue doing useful work as before, and
kernel threads won't be affected at all.  (Not that I think it's 
necessary to do this; it's just a way to avoid freezing user tasks
until they need it.)

One way or another, freezing user tasks should not be a big deal.  
After all, once the suspend is complete eveything will effectively be 
frozen anyway.  I suppose there might be issues involving tasks which 
need to run in order to complete the suspend -- IMO any such issues 
should be handled by carrying out the necessary actions before the 
point where we now start up the freezer.

The alternative is to have drivers take over the burden.  I don't like
this at all.  The most obvious disadvantage is that the necessary
checks would have to be duplicated many many times and spread out over
lots of drivers.

It's also harder to handle these things at the driver level.  Suppose a
driver gets an I/O request while a suspend is underway.  What should it
do?  Return an error?  Block until the suspend is over?  Both
approaches have their difficulties:

Returning an error would mean that suspend is no longer transparent.  
Even an error like -EAGAIN.

Waiting until the suspend is over is likely to be impractical.  At a 
minimum it would involve adding code to drop a lock or mutex, enter the 
freezer (or its equivalent), and then restart the I/O operation.  And 
then, what if the driver was invoked with O_NONBLOCK?

I think it is much better overall to stop I/O requests from being
generated at the source, either by freezing userspace or preventing it
from making system calls.  It's hard to imagine that anybody would
miss the small amount of CPU time they'd be giving up by not allowing
user threads to run during the time that a suspend is underway!

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-06-04 Thread Pavel Machek

Hi!

> > > The theoretical answer is that it behaves the way we want.  The kernel 
> > > thread does selective resumes in response to device requests.  If such 
> > > a request comes in while the system is asleep it will awaken the 
> > > system; so it's only logical that a request coming in while the system 
> > > is in the process of going to sleep should abort the suspend.
> > 
> > I'd say that it shows ppc being broken. User wanted to suspend the
> > system, and now unrelated task did lsusb... and system will not sleep.
> > 
> > AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root
> > will not be able to suspend the system.
> 
> This is a matter of one's philosophy.  In suspend-to-RAM, should tasks
> be frozen or should I/O queues be frozen?
> 
> With the USB subsystem I have followed the approach taken by the PM
> core, which is that tasks are frozen.  But one can -- and Linus has on
> at least one occasion -- make a good case that tasks should be left
> running while only I/O is frozen.  This would require the subsystem to
> distinguish between a selective device suspend and a system-wide
> suspend-to-RAM, so that selective resume could be enabled on demand in
> one case but not the other.
> 
> It's quite doable in principle -- it's just not the technique I used.

I guess we need to do that. Random user should not be able to prevent
machine from sleeping.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-30 Thread Pavel Machek

Hi!

> > >> Userspace should handle the async request just fine when it comes  
> > >> back
> > >> running, regardless of the time it was submitted.
> > >
> > > Okay, so the solution is to convert the drivers to use
> > > request_firmware_nowait() instead of request_firmware() in  
> > > their .resume()
> > > routines.
> > 
> > [added Rob Landley to CC]
> > 
> > I think asynchronous loading should be made the default behaviour.  
> > However, we need to think and document how to do firmware loading.
> 
> I'd love to write up documentation on this if anybody can tell me what it 
> should say. :)

Well, chapter on firmware & suspend/hibernation should say something
about either loading the firmware during early suspend, or just
keeping firmware in ram from boot...
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-29 Thread Rob Landley

On Monday 28 May 2007 4:48 am, Michael-Luke Jones wrote:
> On 28 May 2007, at 08:43, Rafael J. Wysocki wrote:
> 
> >> Seems, that's just the broken synchronous firmware loading interface
> >> with the useless timeout handling. The nowait version of the same  
> >> loader
> >> doesn't time out, and should not have that problem. The sync version
> >> should be removed from the kernel, it just causes all sorts of  
> >> problems
> >> since it exists.
> >>
> >> Userspace should handle the async request just fine when it comes  
> >> back
> >> running, regardless of the time it was submitted.
> >
> > Okay, so the solution is to convert the drivers to use
> > request_firmware_nowait() instead of request_firmware() in  
> > their .resume()
> > routines.
> 
> [added Rob Landley to CC]
> 
> I think asynchronous loading should be made the default behaviour.  
> However, we need to think and document how to do firmware loading.

I'd love to write up documentation on this if anybody can tell me what it 
should say. :)

> Firmware loading can be done at two different times:
> Device Driver Load
> First use of Device Driver
> 
> For a network device, this correlates to when the driver is first  
> loaded into memory or at 'ifconfig up' respectively.

And for block devices ala:
https://bugs.launchpad.net/ubuntu/dapper/+source/initramfs-tools/+bug/74004
https://issues.rpath.com/browse/RPL-1350

The first use could be "mount" or open of the block device under /dev.  So now 
you have the mount and open syscalls returning -ENOFIRMWARE.  This is not a 
small change.

> At device driver load, firmware loading must be asynchronous. This is  
> because device driver init can occur before userspace runs and  
> registers a hotplug/uevent event handler.

Userspace doesn't have to register a hotplug handler: /sys/hotplug is the 
default and if there is one in initramfs it should get called.  You shouldn't 
have to wait for init to run for this to be the case.

I believe the initramfs cpio archive used to get extracted just before the 
attempt to exec /init out of it, and that it was moved to much earlier in the 
boot process so that firmware loading could be done out of it for statically 
linked drivers.

Look at do_basic_setup() in init/main.c: notice how usermodehelper_init() gets 
called right before driver_init().  Ask yourself: "why did it do that"?  
Notice how rest_init() forks the first kernel thread (PID 1) to run 
kernel_init(), and then becomes PID 0 (the idle task).  So from kernel_init() 
it's ok to spawn all the new tasks we want because they can't take any 
reserved PIDs.

> If device use is attempted   
> before firmware loads, a -ENOFIRMWARE call would be great so that  
> userspace and thus the user can be clearly informed what the problem is.

Because of course every userspace utility in the world will immediately be 
rewritten to provide clear and informative error messages for this race 
condition corner case.

Somehow, I can't bring myself to believe this.

> However, at 'first use' firmware loading, the synchronous interface  
> should remain. 'ifconfig up' either works or it doesn't, and as I see  
> it, has to just hang around until firmware turns up.

Why would you make two different code paths for module load and first use?

> Documentation for how hotplug/uevent handlers should cope with these  
> types of firmware loading is also *strongly* requested, in order for  
> lightweight but fully functional implementations to be made.

Happy to.  I'm just trying to figure out what the correct behavior is so I can 
document it.

Rob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-29 Thread Rob Landley

On Monday 28 May 2007 5:06 am, Kay Sievers wrote:
> Well, 10 seconds are just to short for userspace to react on some
> setups, from tiny boxes which are busy, to 512 CPU boxes enumerating
> thousands of devices, all had problems here. Any timeout for a
> firmware-request is just a broken concept, the request should wait
> forever, to be fulfilled or canceled from userspace when it's ready.

If it's spawning a new usermode helper process, then figuring out when to give 
up and exit is that process's job.  If it _can't_ exec usermode helper, then 
that should fail immediately.

Where does the timeout come in?

Rob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-29 Thread Alan Stern

On Tue, 29 May 2007, David Brownell wrote:

> On Monday 28 May 2007, Alan Stern wrote:
> 
> > This is a matter of one's philosophy.  In suspend-to-RAM, should tasks
> > be frozen or should I/O queues be frozen?
> > 
> > With the USB subsystem I have followed the approach taken by the PM
> > core, which is that tasks are frozen.  But one can -- and Linus has on
> > at least one occasion -- make a good case that tasks should be left
> > running while only I/O is frozen. 
> 
> In fact that makes a heck of a lot more sense to me from the
> conceptual point of view.  From the hardware perspective, the
> task of preparing to enter true suspend states (STR, or suspend
> for ACPI; embedded systems have more options) focusses on what
> I/O signals are disabled.
> 
> Once the relevant I/O signals are first idled, then disabled,
> the CPU can do whatever it likes.  Whether it runs or not is
> purely a workload decision...
> 
> Remember too that not all systems suffer from the constraints
> that ACPI decrees.  In particular, it's not uncommon that some
> parts of the system be active in certain suspend states.  The
> whole point is to turn off as much of the system as possible,
> especially the high power portions, while letting work proceed.
> 
> Turning off some clocks and peripherals doesn't need to imply
> turning them all off, or disabling DMA ... and should not need
> to be triggered by a user (or userspace tool) explicitly saying
> "go into STR".

One can think of suspend-to-RAM as nothing more than a super-duper
selective suspend of all devices (including the CPU!).  This
illustrates the relationship between suspend-to-RAM and runtime PM.  
However there is one important distinction, the DoS issue that Pavel
raised.  With true suspend-to-RAM, autoresume on demand is not enabled
-- only on remote wakeup.  With runtime PM, both are enabled.

Elegant though it is, the "freeze I/O" approach suffers from 
implementation awkwardness.  Ideally individual drivers shouldn't have 
to worry about it, but the subsystems certainly will.

Consider as an example the class of char device drivers.  The only 
subsystem they have in common is VFS.  It would then be necessary for 
every entry point into VFS to check whether a suspend-to-RAM is in 
progress, and if it is, block until the suspend is over.  Each one of 
those entry points is on a hot path, so adding these checks is the sort 
of thing one would like to avoid.

By freezing tasks we can eliminate (most) I/O requests at the source 
with a single, relatively small amount of code (i.e., the 
refrigerator).  Freezing I/O, on the other hand, would involve many 
checks dispersed throughout a large number of source files -- checks 
that would have to execute even when a suspend wasn't in progress.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-29 Thread Rob Landley

On Sunday 27 May 2007 4:45 pm, Michael-Luke Jones wrote:
> (Slightly OT: A particularly nasty race is when an initramfs  
> userspace is present, but firmware loading cannot occur because init  
> has not run, so proc hasn't been mounted, so a hotplug event handler  
> cannot be registered, despite the fact that the firmware is sitting  
> on the ramdisk mounted correctly...)

I believe that the current functionality is that /sbin/hotplug is the default.  
I think that if there is one in initramfs, it'll get called as a usermode 
helper, even before init has run.  (If it needs /proc /sys or /dev, setting 
it up properly is its problem, but not an insurmountable one.)

I remember discussion of this a while back, and that it was indeed 
intentional.  I dunno if anybody's actually tried it.

Rob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-29 Thread David Brownell

On Monday 28 May 2007, Alan Stern wrote:

> This is a matter of one's philosophy.  In suspend-to-RAM, should tasks
> be frozen or should I/O queues be frozen?
> 
> With the USB subsystem I have followed the approach taken by the PM
> core, which is that tasks are frozen.  But one can -- and Linus has on
> at least one occasion -- make a good case that tasks should be left
> running while only I/O is frozen. 

In fact that makes a heck of a lot more sense to me from the
conceptual point of view.  From the hardware perspective, the
task of preparing to enter true suspend states (STR, or suspend
for ACPI; embedded systems have more options) focusses on what
I/O signals are disabled.

Once the relevant I/O signals are first idled, then disabled,
the CPU can do whatever it likes.  Whether it runs or not is
purely a workload decision...

Remember too that not all systems suffer from the constraints
that ACPI decrees.  In particular, it's not uncommon that some
parts of the system be active in certain suspend states.  The
whole point is to turn off as much of the system as possible,
especially the high power portions, while letting work proceed.

Turning off some clocks and peripherals doesn't need to imply
turning them all off, or disabling DMA ... and should not need
to be triggered by a user (or userspace tool) explicitly saying
"go into STR".

> This would require the subsystem to 
> distinguish between a selective device suspend and a system-wide
> suspend-to-RAM, so that selective resume could be enabled on demand in
> one case but not the other.

Exactly.  "Selective suspend" of parts of the system is a far
more general model.  It fits well with runtime power management,
degrades smoothly to states where memory goes into self-refresh
(maybe the system idle loop when NO_HZ is being effective) or
even hibernation (as discussed elsewhere).

- Dave

> It's quite doable in principle -- it's just not the technique I used.
> 
> Alan Stern
> 
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Alan Stern

On Mon, 28 May 2007, Pavel Machek wrote:

> > The theoretical answer is that it behaves the way we want.  The kernel 
> > thread does selective resumes in response to device requests.  If such 
> > a request comes in while the system is asleep it will awaken the 
> > system; so it's only logical that a request coming in while the system 
> > is in the process of going to sleep should abort the suspend.
> 
> I'd say that it shows ppc being broken. User wanted to suspend the
> system, and now unrelated task did lsusb... and system will not sleep.
> 
> AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root
> will not be able to suspend the system.

This is a matter of one's philosophy.  In suspend-to-RAM, should tasks
be frozen or should I/O queues be frozen?

With the USB subsystem I have followed the approach taken by the PM
core, which is that tasks are frozen.  But one can -- and Linus has on
at least one occasion -- make a good case that tasks should be left
running while only I/O is frozen.  This would require the subsystem to
distinguish between a selective device suspend and a system-wide
suspend-to-RAM, so that selective resume could be enabled on demand in
one case but not the other.

It's quite doable in principle -- it's just not the technique I used.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> >In fact, I'd like drivers to use notifiers to actually 
> >load the firmware into
> >memory before hibernation/suspend.  Namely, if there's 
> >PM_PRE_FREEZE, the
> >driver calls request_firmware() from within the 
> >notifier and saves the firmware
> >in memory for future use, if need be.  Later, when 
> >PM_POST_THAW comes, the
> >memory holding the firmware is released.
> >
> >Unfortunately there are drivers that call 
> >request_firmware() directly from
> >.resume() which blocks until timeout expires and fails 
> >anyway.  I just wanted
> >this to fail immediately, without waiting.
> 
> Stupid question time. Wouldn't it just be easier to have
> request_firmware() keep a copy of the firmware once it's 
> been loaded?
> We're not talking about a lot of memory that would be 
> wasted, and that
> way no drivers have to be changed.

Actually, I like this idea. Firmware problems magically disappear.

...unless someone uses x86 emulator in userland to POST graphics card.
You can't 'cache' that. But that's separate problem.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> > > I can't speak for the second example, but there's a good reason the
> > > first example works this way.  It's not a matter of races; the problem
> > > is that the kernel thread's job is to selectively suspend and resume
> > > devices.  We don't want it doing this while a system sleep is in 
> > > progress; it would (and in fact has, before the thread was made 
> > > freezable) cause the sleep transition to abort.
> > 
> > How does this work on PPC or APM systems?
> 
> For hibernation it behaves the same as on other types of systems.
> 
> For STR it generally works okay.  There was one report of suspends
> aborting, and it looked like this was caused by selective resumes
> originating from userspace.  This seemed to be unrelated to the kernel
> threads; apparently some program was running while the STR was in
> progress, and causing the problem.  For example, the lsusb program will
> do a selective resume on every USB device as it scans through them all.
> However that's just a guess, we haven't fully resolved that bug report.
> 
> The theoretical answer is that it behaves the way we want.  The kernel 
> thread does selective resumes in response to device requests.  If such 
> a request comes in while the system is asleep it will awaken the 
> system; so it's only logical that a request coming in while the system 
> is in the process of going to sleep should abort the suspend.

I'd say that it shows ppc being broken. User wanted to suspend the
system, and now unrelated task did lsusb... and system will not sleep.

AFAICT it is DoS issue -- if one of your users keeps doing lsusb, root
will not be able to suspend the system.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Ray Lee

On 5/28/07, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:

On Monday, 28 May 2007 17:55, Alan Stern wrote:
> You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this
> and the userspace helper change.  Is it your intention that drivers
> should continue to request these services but encounter an error if the
> request occurs at the wrong time?  Or do you expect drivers to use the
> notifier chains to know when they shouldn't make any requests?

In fact, I'd like drivers to use notifiers to actually load the firmware into
memory before hibernation/suspend.  Namely, if there's PM_PRE_FREEZE, the
driver calls request_firmware() from within the notifier and saves the firmware
in memory for future use, if need be.  Later, when PM_POST_THAW comes, the
memory holding the firmware is released.

Unfortunately there are drivers that call request_firmware() directly from
.resume() which blocks until timeout expires and fails anyway.  I just wanted
this to fail immediately, without waiting.

Stupid question time. Wouldn't it just be easier to have
request_firmware() keep a copy of the firmware once it's been loaded?
We're not talking about a lot of memory that would be wasted, and that
way no drivers have to be changed.

Ray
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Alan Stern

On Mon, 28 May 2007, Matthew Garrett wrote:

> On Mon, May 28, 2007 at 12:43:36PM -0400, Alan Stern wrote:
> > On Mon, 28 May 2007, Matthew Garrett wrote:
> > > On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote:
> > > 
> > > > I can't speak for the second example, but there's a good reason the
> > > > first example works this way.  It's not a matter of races; the problem
> > > > is that the kernel thread's job is to selectively suspend and resume
> > > > devices.  We don't want it doing this while a system sleep is in 
> > > > progress; it would (and in fact has, before the thread was made 
> > > > freezable) cause the sleep transition to abort.
> > > 
> > > How does this work on PPC or APM systems?
> > 
> > For hibernation it behaves the same as on other types of systems.
> > 
> > For STR it generally works okay.  There was one report of suspends
> > aborting, and it looked like this was caused by selective resumes
> > originating from userspace.  This seemed to be unrelated to the kernel
> > threads; apparently some program was running while the STR was in
> > progress, and causing the problem.  For example, the lsusb program will
> > do a selective resume on every USB device as it scans through them all.
> > However that's just a guess, we haven't fully resolved that bug report.
> > 
> > The theoretical answer is that it behaves the way we want.  The kernel 
> > thread does selective resumes in response to device requests.  If such 
> > a request comes in while the system is asleep it will awaken the 
> > system; so it's only logical that a request coming in while the system 
> > is in the process of going to sleep should abort the suspend.
> 
> Ok, I guess I'm still not clear on this :) If it doesn't cause major 
> problems on Powermac or APM systems, why is freezing the thread 
> beneficial on ACPI systems?

Isn't it true, even on PPC and APM systems, that tasks are frozen for
hibernation?  The difference is that they aren't frozen for STR.

Hence the answer to your question: Not freezing the thread would work 
okay on ACPI systems for STR.  (In fact, not freezing anything would 
probably work okay for STR.)  But it doesn't work with hibernation.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Rafael J. Wysocki

On Monday, 28 May 2007 17:55, Alan Stern wrote:
> On Sun, 27 May 2007, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > 
> > Use a hibernation and suspend notifier to disable the firmware requesting
> > mechanism before a hibernation/suspend and enable it after the operation.
> 
> You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this 
> and the userspace helper change.  Is it your intention that drivers 
> should continue to request these services but encounter an error if the 
> request occurs at the wrong time?  Or do you expect drivers to use the 
> notifier chains to know when they shouldn't make any requests?

In fact, I'd like drivers to use notifiers to actually load the firmware into
memory before hibernation/suspend.  Namely, if there's PM_PRE_FREEZE, the
driver calls request_firmware() from within the notifier and saves the firmware
in memory for future use, if need be.  Later, when PM_POST_THAW comes, the
memory holding the firmware is released.

Unfortunately there are drivers that call request_firmware() directly from
.resume() which blocks until timeout expires and fails anyway.  I just wanted
this to fail immediately, without waiting.

> In the second case you may have a problem, because there's no 
> specification about the order in which the notifiers are sent.  The 
> service may get disabled before the driver learns it isn't available, 
> or the driver may think the service is once again available before it 
> gets enabled.

Yes.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Matthew Garrett

On Mon, May 28, 2007 at 12:43:36PM -0400, Alan Stern wrote:
> On Mon, 28 May 2007, Matthew Garrett wrote:
> > On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote:
> > 
> > > I can't speak for the second example, but there's a good reason the
> > > first example works this way.  It's not a matter of races; the problem
> > > is that the kernel thread's job is to selectively suspend and resume
> > > devices.  We don't want it doing this while a system sleep is in 
> > > progress; it would (and in fact has, before the thread was made 
> > > freezable) cause the sleep transition to abort.
> > 
> > How does this work on PPC or APM systems?
> 
> For hibernation it behaves the same as on other types of systems.
> 
> For STR it generally works okay.  There was one report of suspends
> aborting, and it looked like this was caused by selective resumes
> originating from userspace.  This seemed to be unrelated to the kernel
> threads; apparently some program was running while the STR was in
> progress, and causing the problem.  For example, the lsusb program will
> do a selective resume on every USB device as it scans through them all.
> However that's just a guess, we haven't fully resolved that bug report.
> 
> The theoretical answer is that it behaves the way we want.  The kernel 
> thread does selective resumes in response to device requests.  If such 
> a request comes in while the system is asleep it will awaken the 
> system; so it's only logical that a request coming in while the system 
> is in the process of going to sleep should abort the suspend.

Ok, I guess I'm still not clear on this :) If it doesn't cause major 
problems on Powermac or APM systems, why is freezing the thread 
beneficial on ACPI systems?

-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Alan Stern

On Mon, 28 May 2007, Matthew Garrett wrote:

> On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote:
> 
> > I can't speak for the second example, but there's a good reason the
> > first example works this way.  It's not a matter of races; the problem
> > is that the kernel thread's job is to selectively suspend and resume
> > devices.  We don't want it doing this while a system sleep is in 
> > progress; it would (and in fact has, before the thread was made 
> > freezable) cause the sleep transition to abort.
> 
> How does this work on PPC or APM systems?

For hibernation it behaves the same as on other types of systems.

For STR it generally works okay.  There was one report of suspends
aborting, and it looked like this was caused by selective resumes
originating from userspace.  This seemed to be unrelated to the kernel
threads; apparently some program was running while the STR was in
progress, and causing the problem.  For example, the lsusb program will
do a selective resume on every USB device as it scans through them all.
However that's just a guess, we haven't fully resolved that bug report.

The theoretical answer is that it behaves the way we want.  The kernel 
thread does selective resumes in response to device requests.  If such 
a request comes in while the system is asleep it will awaken the 
system; so it's only logical that a request coming in while the system 
is in the process of going to sleep should abort the suspend.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Matthew Garrett

On Mon, May 28, 2007 at 12:09:30PM -0400, Alan Stern wrote:

> I can't speak for the second example, but there's a good reason the
> first example works this way.  It's not a matter of races; the problem
> is that the kernel thread's job is to selectively suspend and resume
> devices.  We don't want it doing this while a system sleep is in 
> progress; it would (and in fact has, before the thread was made 
> freezable) cause the sleep transition to abort.

How does this work on PPC or APM systems?
-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Alan Stern

On Sun, 27 May 2007, Matthew Garrett wrote:

> > BTW, I know of two subsystems that want their kernel threads to be frozen 
> > for
> > synchronization purposes.  Please see these messages:
> > 
> > 1) 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html
> > (plus follow up)
> > 
> > 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2
> 
> I'm not entirely sold on this. The issue is that there's the possibility 
> of races during suspend/resume? It sounds like that should be 
> implemented in the driver, rather than depending on a side-effect of 
> process freezing. Otherwise there's no way of selectively suspending 
> that device.

I can't speak for the second example, but there's a good reason the
first example works this way.  It's not a matter of races; the problem
is that the kernel thread's job is to selectively suspend and resume
devices.  We don't want it doing this while a system sleep is in 
progress; it would (and in fact has, before the thread was made 
freezable) cause the sleep transition to abort.

Handling it entirely from within the drivers is possible in theory.  
Unfortunately the design of the PM core has not leant itself to such an 
approach.  Using separate callbacks for hibernation vs. STR will help, 
as will Raphael's notifiers.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Alan Stern

On Sun, 27 May 2007, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> Use a hibernation and suspend notifier to disable the firmware requesting
> mechanism before a hibernation/suspend and enable it after the operation.

You're using the PM_PRE_FREEZE and PM_POST_THAW notifiers for both this 
and the userspace helper change.  Is it your intention that drivers 
should continue to request these services but encounter an error if the 
request occurs at the wrong time?  Or do you expect drivers to use the 
notifier chains to know when they shouldn't make any requests?

In the second case you may have a problem, because there's no 
specification about the order in which the notifiers are sent.  The 
service may get disabled before the driver learns it isn't available, 
or the driver may think the service is once again available before it 
gets enabled.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 14:00, Pavel Machek wrote:


I'm too lazy to figure out how it currently works, but
I'll just state it is not my fault


Figuring out how a userspace/kernelspace interface works should not  
rely on having to read kernelspace code. Unfortunately, in the case  
of hotplug / uevents, there is no such documentation. Thus, what  
kernelspace / userspace interactions actually are and what they  
should be is unclear, leading to confusion over corner cases, such as  
this one.


Michael-Luke Jones
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> Use a hibernation and suspend notifier to disable the firmware requesting
> mechanism before a hibernation/suspend and enable it after the operation.
> 
> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
>  drivers/base/firmware_class.c |   36 
>  1 file changed, 36 insertions(+)
> 
> Index: linux-2.6.22-rc3/drivers/base/firmware_class.c
> ===
> --- linux-2.6.22-rc3.orig/drivers/base/firmware_class.c   2007-05-27 
> 19:43:03.0 +0200
> +++ linux-2.6.22-rc3/drivers/base/firmware_class.c2007-05-27 
> 19:44:06.0 +0200
> @@ -17,6 +17,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include 
>  #include "base.h"
> @@ -35,6 +37,9 @@ enum {
>  
>  static int loading_timeout = 60; /* In seconds */
>  
> +/* If set, _request_firmware() will exit immediately returning -EBUSY */
> +static int requesting_firmware_disabled;
> +
>  /* fw_lock could be moved to 'struct firmware_priv' but since it is just
>   * guarding for corner cases a global lock should be OK */
>  static DEFINE_MUTEX(fw_lock);
> @@ -397,6 +402,9 @@ _request_firmware(const struct firmware 
>   struct firmware *firmware;
>   int retval;
>  
> + if (requesting_firmware_disabled)
> + return -EBUSY;

WARN_ON()? This should not happen, and if it does, we'll end up with
non-working device.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> >From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >
> >Use a hibernation and suspend notifier to disable the firmware  
> >requesting
> >mechanism before a hibernation/suspend and enable it after the  
> >operation.
> >
> >Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >
> > drivers/base/firmware_class.c |   36 ++ 
> >++
> > 1 file changed, 36 insertions(+)
> 
> I don't like this approach, as I feel that the firmware loading  
> interface should be able to detect if a firmware load request is not  
> being handled, due to absence of userspace / hotplug handler presence.

I don't think that's possible. If hotplug handler needs /dev/foo, but
/dev/foo is not available, it will just block waiting there.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> *Grumble ahead*
> Unfortunately, I don't properly understand the uevent interface, so  
> some of the above may be inaccurate. This is *not* my fault as there  
> is no decent documentation (and btw, telling me to write the  

Translated: "I don't know what I'm talking about, but I still like to
flame people. I'm too lazy to figure out how it currently works, but
I'll just state it is not my fault".
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Kay Sievers

On Mon, 2007-05-28 at 13:26 +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 12:51, Kay Sievers wrote:
> 
> > Either the whole idea of userspace firmware-loading should be  
> > considered
> > as a problem impossible to do right because of its unsolvable side
> > effects. Or at least releasing loaded firmware should be the exception
> > for drivers which can be sure, that the firmware is not needed in
> > situations we try to work around here.
> 
> I don't believe that it is impossible. What is needed is some  
> thought, and maybe a loose spec.

Hehe, let's see. :)

> We need to think about safe ways of solving a simple problem: what to  
> do when userspace is unable to respond to a kernel uevent request. We  
> now have a timeout, whether it be handled synchronously or in the  
> background. I don't think either solution is good enough.
> 
> [RFC] Possible Plan:
> 
> 1) request_firmware() should stick around unmodified but be marked  
> __deprecated.
> 
> 2) request_firmware_nowait() as it stands should probably be removed.  
> After all, it only has 2 in-kernel users at the moment.

We should probably leave the whole firmware class alone, add something
new, and convert the current users. 

> 3) uevent interface should be notified when userspace handlers are /  
> are not available so that events can be queued to be handled when  
> userspace does turn up and re-register a hotplug event handler.

Uevents are queued by the netlink socket buffer. If userspace can't
receice them (udevd) while it's frozen, they will just be handled (in
the right order) when userspace runs again. We can queue up thousands of
events, and it should already work that way for a long time now.

> 4) request_firmware_async() introduced: asynchronous firmware loading  
> interface built on basis of 3, such that problems at early boot and  
> suspend / resume are avoided. Also request_firmware_async() taught to  
> retain firmware in memory by default such that subsequent calls do  
> not require a call to userspace if userspace is unavailable.

Something like this makes sense, yes.

> 6) Documentation on correct firmware loading behaviour for driver  
> authors written, together with migration notes for existing users of  
> request_firmware().
> 
> 7) Existing uses of request_firmware() converted.
> 
> *Grumble ahead*
> Unfortunately, I don't properly understand the uevent interface, so  
> some of the above may be inaccurate.

The event emission in the kernel and the userspace side is probably
fine. Two years ago, benh already run into exactly the same suspend
problems, and I think I addressed the problems on the uevent side.

> This is *not* my fault as there  
> is no decent documentation (and btw, telling me to write the  
> documentation is not the answer to that problem).

Sure, we always need at least one working example before we can even
tell what's the right way to do it. :)

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 12:51, Kay Sievers wrote:

Either the whole idea of userspace firmware-loading should be  
considered

as a problem impossible to do right because of its unsolvable side
effects. Or at least releasing loaded firmware should be the exception
for drivers which can be sure, that the firmware is not needed in
situations we try to work around here.


I don't believe that it is impossible. What is needed is some  
thought, and maybe a loose spec.


We need to think about safe ways of solving a simple problem: what to  
do when userspace is unable to respond to a kernel uevent request. We  
now have a timeout, whether it be handled synchronously or in the  
background. I don't think either solution is good enough.


[RFC] Possible Plan:

1) request_firmware() should stick around unmodified but be marked  
__deprecated.


2) request_firmware_nowait() as it stands should probably be removed.  
After all, it only has 2 in-kernel users at the moment.


3) uevent interface should be notified when userspace handlers are /  
are not available so that events can be queued to be handled when  
userspace does turn up and re-register a hotplug event handler.


4) request_firmware_async() introduced: asynchronous firmware loading  
interface built on basis of 3, such that problems at early boot and  
suspend / resume are avoided. Also request_firmware_async() taught to  
retain firmware in memory by default such that subsequent calls do  
not require a call to userspace if userspace is unavailable.


6) Documentation on correct firmware loading behaviour for driver  
authors written, together with migration notes for existing users of  
request_firmware().


7) Existing uses of request_firmware() converted.

*Grumble ahead*
Unfortunately, I don't properly understand the uevent interface, so  
some of the above may be inaccurate. This is *not* my fault as there  
is no decent documentation (and btw, telling me to write the  
documentation is not the answer to that problem).


Michael-Luke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 13:01, Kay Sievers wrote:

In our development model, users of an interface are expected to  
improve

it, because nobody else really knows what's needed. That unfortunately
didn't happen so far.


Thanks for responding :)

The fact that no-one is told to use the new right way (tm) in any  
available documentation does not help matters improve. One issue is  
that the 'asynchronous' interface seems to rely on the same 'dodgy'  
timeout mechanism as the original request_firmware() call... Looks  
like the lack of a maintainer should be the opportunity for a rework  
of this API.


Michael-Luke Jones
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Kay Sievers

On Mon, 2007-05-28 at 11:26 +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 10:06, Kay Sievers wrote:
> 
> > The underlying issue are the driver authors, that's not so easy to
> > fix. :)
> 
> Sorry, I know this maybe be unintentional, but comments like this  
> make me somewhat angry.

It was the response to "pushing a bodge upstream", while we just try to
make things working. :)
We are fighting that broken firmware-loader model for years, but every
driver author seems to have its own idea how that should work in theory,
but they are not even willing to switch to the existing version that is
expected to work better.

> If there is no decent documentation as to how to do it the right way  
> (tm), how do you expect people to do it the right way (tm)?
> 
> > Any timeout for a
> > firmware-request is just a broken concept, the request should wait
> > forever, to be fulfilled or canceled from userspace when it's ready.
> 
> What I wrote above is especially true when the in-kernel APIs  
> themselves do things the wrong way (tm) themselves. Thus, even more  
> thought is required to work around this imperfect behaviour in a sane  
> manner. And without documentation, every single device driver author  
> has to go through this thought process themselves. Unsurprisingly,  
> they often get it wrong. But as there's no decent documentation to do  
> it right, it's *not* their fault. I'd suggest it's more the fault of  
> the core kernel devs who fail to fix up a questionable firmware  
> loading interface, then fail to document how to work around its  
> shortcomings.
> 
> Again, apologies if this sounds angry, I don't want to start an  
> argument. But as someone just starting out here, this kind of thing  
> can be very frustrating, as even with the best will in the world,  
> achieving the right way (tm) is basically impossible if those in the  
> know about what the right way (tm) is fail to share this information.

In our development model, users of an interface are expected to improve
it, because nobody else really knows what's needed. That unfortunately
didn't happen so far.

We get this kind of conversation every few months since a few years now.
I wonder, if we will see code, or at least come up with a better idea
this time. :)

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Kay Sievers

On Mon, 2007-05-28 at 12:38 +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 12:24, Kay Sievers wrote:
> 
> > A driver for a bootup-critical device like this should just never
> > release the firmware after the first load. There is absolutely no  
> > point
> > in doing that.
> 
> Bogus argument: is a USB-Ethernet device which needs firmware loading  
> boot-up critical? Not on the surface, but if the device loads root  
> over this device, it suddenly is.

Releasing loaded firmware for anything that needs to handle situations
like this just doesn't make sense.

> This functionality should also be written into the firmware-class  
> (and this fact *is* acknowledged in the sparse documentation).

Yeah, but these are just words, nothing people could use. Unfortunately
the author of this document died a few years ago.

Either the whole idea of userspace firmware-loading should be considered
as a problem impossible to do right because of its unsolvable side
effects. Or at least releasing loaded firmware should be the exception
for drivers which can be sure, that the firmware is not needed in
situations we try to work around here.

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [linux-pm] Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

> This functionality should also be written into the firmware-class  
> (and this fact *is* acknowledged in the sparse documentation).

 Feel free to submit documentation improvements.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 12:24, Kay Sievers wrote:


A driver for a bootup-critical device like this should just never
release the firmware after the first load. There is absolutely no  
point

in doing that.


Bogus argument: is a USB-Ethernet device which needs firmware loading  
boot-up critical? Not on the surface, but if the device loads root  
over this device, it suddenly is.


This functionality should also be written into the firmware-class  
(and this fact *is* acknowledged in the sparse documentation).


Michael-Luke Jones
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

On Mon 2007-05-28 13:24:53, Kay Sievers wrote:
> On Mon, 2007-05-28 at 13:15 +0200, Pavel Machek wrote:
> > Hi!
> > 
> > > > > > What exactly is the problem we see here? The timeout of the 
> > > > > > firmware loader?
> > > > > > What goes wrong with frozen userspace, usually there is only a 
> > > > > > netlink
> > > > > > message sent from the kernel, which should be received and handled
> > > > > > just fine when userspace is running again.
> > > > > 
> > > > > Driver calls request_firmware in the resume method. The userspace 
> > > > > helper 
> > > > > can't be run because it's been frozen, so the firmware never gets 
> > > > > loaded 
> > > > > and the call times out. The driver then fails to resume. While all 
> > > > > this 
> > > > > is happening, the rest of the kernel is blocking on that resume 
> > > > > method. 
> > > > > The firmware can be loaded once userspace has been started again, but 
> > > > > by 
> > > > > that time the driver has given up.
> > > > 
> > > > Seems, that's just the broken synchronous firmware loading interface
> > > > with the useless timeout handling. The nowait version of the same loader
> > > > doesn't time out, and should not have that problem. The sync version
> > > > should be removed from the kernel, it just causes all sorts of problems
> > > > since it exists.
> > > > 
> > > > Userspace should handle the async request just fine when it comes back
> > > > running, regardless of the time it was submitted.
> > > 
> > > Okay, so the solution is to convert the drivers to use
> > > request_firmware_nowait() instead of request_firmware() in their .resume()
> > > routines.
> > 
> > You'll just get deadlock at different level (and more rare).
> > 
> > Imagine disk with its firmware on NFS and NFS with its firmware on
> > disk.
> > 
> > (Or maybe firmware loader doing find /, including both disk and
> > NFS). Just don't call request_firmware_* from .resume().
> 
> A driver for a bootup-critical device like this should just never
> release the firmware after the first load. There is absolutely no point
> in doing that.

It does not have to be _bootup-critical_ device. Problem is any device
that might be used by userspace firmware loader. And that is _any_
device.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Kay Sievers

On Mon, 2007-05-28 at 13:15 +0200, Pavel Machek wrote:
> Hi!
> 
> > > > > What exactly is the problem we see here? The timeout of the firmware 
> > > > > loader?
> > > > > What goes wrong with frozen userspace, usually there is only a netlink
> > > > > message sent from the kernel, which should be received and handled
> > > > > just fine when userspace is running again.
> > > > 
> > > > Driver calls request_firmware in the resume method. The userspace 
> > > > helper 
> > > > can't be run because it's been frozen, so the firmware never gets 
> > > > loaded 
> > > > and the call times out. The driver then fails to resume. While all this 
> > > > is happening, the rest of the kernel is blocking on that resume method. 
> > > > The firmware can be loaded once userspace has been started again, but 
> > > > by 
> > > > that time the driver has given up.
> > > 
> > > Seems, that's just the broken synchronous firmware loading interface
> > > with the useless timeout handling. The nowait version of the same loader
> > > doesn't time out, and should not have that problem. The sync version
> > > should be removed from the kernel, it just causes all sorts of problems
> > > since it exists.
> > > 
> > > Userspace should handle the async request just fine when it comes back
> > > running, regardless of the time it was submitted.
> > 
> > Okay, so the solution is to convert the drivers to use
> > request_firmware_nowait() instead of request_firmware() in their .resume()
> > routines.
> 
> You'll just get deadlock at different level (and more rare).
> 
> Imagine disk with its firmware on NFS and NFS with its firmware on
> disk.
> 
> (Or maybe firmware loader doing find /, including both disk and
> NFS). Just don't call request_firmware_* from .resume().

A driver for a bootup-critical device like this should just never
release the firmware after the first load. There is absolutely no point
in doing that.

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Rafael J. Wysocki

On Monday, 28 May 2007 13:15, Pavel Machek wrote:
> Hi!
> 
> > > > > What exactly is the problem we see here? The timeout of the firmware 
> > > > > loader?
> > > > > What goes wrong with frozen userspace, usually there is only a netlink
> > > > > message sent from the kernel, which should be received and handled
> > > > > just fine when userspace is running again.
> > > > 
> > > > Driver calls request_firmware in the resume method. The userspace 
> > > > helper 
> > > > can't be run because it's been frozen, so the firmware never gets 
> > > > loaded 
> > > > and the call times out. The driver then fails to resume. While all this 
> > > > is happening, the rest of the kernel is blocking on that resume method. 
> > > > The firmware can be loaded once userspace has been started again, but 
> > > > by 
> > > > that time the driver has given up.
> > > 
> > > Seems, that's just the broken synchronous firmware loading interface
> > > with the useless timeout handling. The nowait version of the same loader
> > > doesn't time out, and should not have that problem. The sync version
> > > should be removed from the kernel, it just causes all sorts of problems
> > > since it exists.
> > > 
> > > Userspace should handle the async request just fine when it comes back
> > > running, regardless of the time it was submitted.
> > 
> > Okay, so the solution is to convert the drivers to use
> > request_firmware_nowait() instead of request_firmware() in their .resume()
> > routines.
> 
> You'll just get deadlock at different level (and more rare).
> 
> Imagine disk with its firmware on NFS and NFS with its firmware on
> disk.

Yeah, I've just given the very same example in another message. :-)

I believe the only _solution_ would be to load the firmware into memory
before the suspend and use the in-RAM copy to upload it into the device
in .resume().

A PM_PRE_FREEZE notifier could be used for that just fine, BTW.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Rafael J. Wysocki

On Monday, 28 May 2007 10:30, Nigel Cunningham wrote:
> Hi.
> 
> On Sun, 2007-05-27 at 23:45 +0200, Rafael J. Wysocki wrote:
> > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote:
> > > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > > > 
> > > > Use a hibernation and suspend notifier to disable the firmware 
> > > > requesting
> > > > mechanism before a hibernation/suspend and enable it after the 
> > > > operation.
> > > 
> > > This avoids the problem of .resume methods calling userspace while 
> > > userspace is frozen and a resulting hang, but does it actually result in 
> > > the drivers beginning to work again?
> > 
> > Well, this was acutally invented before you've decided to remove the 
> > freezing
> > of tasks from the suspend code path (which I think is a mistake, but that's
> > only my personal opinion, so it doesn't matter very much ;-)) and I regard 
> > it
> > as a workaround.
> 
> Suspend-to-ram code paths shouldn't assume userspace is unfrozen anyway.
> Doesn't [u]swsusp have a code path like Suspend2 where we can suspend to
> ram after writing the hibernation image? In that case, it will still be
> possible that we seek to enter and leave S3 with processes frozen.

That's correct.

> Apologies if anyone has already mentioned this - I'm just starting to
> play catchup.

No one has and that's a valid point, I think. :-)

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Pavel Machek

Hi!

> > > > What exactly is the problem we see here? The timeout of the firmware 
> > > > loader?
> > > > What goes wrong with frozen userspace, usually there is only a netlink
> > > > message sent from the kernel, which should be received and handled
> > > > just fine when userspace is running again.
> > > 
> > > Driver calls request_firmware in the resume method. The userspace helper 
> > > can't be run because it's been frozen, so the firmware never gets loaded 
> > > and the call times out. The driver then fails to resume. While all this 
> > > is happening, the rest of the kernel is blocking on that resume method. 
> > > The firmware can be loaded once userspace has been started again, but by 
> > > that time the driver has given up.
> > 
> > Seems, that's just the broken synchronous firmware loading interface
> > with the useless timeout handling. The nowait version of the same loader
> > doesn't time out, and should not have that problem. The sync version
> > should be removed from the kernel, it just causes all sorts of problems
> > since it exists.
> > 
> > Userspace should handle the async request just fine when it comes back
> > running, regardless of the time it was submitted.
> 
> Okay, so the solution is to convert the drivers to use
> request_firmware_nowait() instead of request_firmware() in their .resume()
> routines.

You'll just get deadlock at different level (and more rare).

Imagine disk with its firmware on NFS and NFS with its firmware on
disk.

(Or maybe firmware loader doing find /, including both disk and
NFS). Just don't call request_firmware_* from .resume().


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


Now a technical rather than emotional response...

On 28 May 2007, at 10:06, Kay Sievers wrote:


At device driver load, firmware loading must be asynchronous. This is
because device driver init can occur before userspace runs and
registers a hotplug/uevent event handler. If device use is attempted
before firmware loads, a -ENOFIRMWARE call would be great so that
userspace and thus the user can be clearly informed what the  
problem is.


Why would a driver create an interface before it has the needed
firmware loaded?


A valid point. But there should be some kind of error notification if  
firmware loading hasn't happened correctly rather than a permanent  
asynchronous wait in which the interface fails to turn up. Possibly a  
kernel information printk or something, which does not exist at the  
moment.



However, at 'first use' firmware loading, the synchronous interface
should remain. 'ifconfig up' either works or it doesn't, and as I see
it, has to just hang around until firmware turns up.


What kind of network driver does create an interface for a
non-functioning device? That sounds like a bug on its own.


Unclear. My point was that when ifconfig up exits, the interface  
should be up, not asynchronously waiting for firmware to be loaded,  
then taken up in the background. Thus, firmware loading in this case  
should be kept synchronous, in my opinion.


If a driver binds to a device, it should just have the firmware  
already

loaded, and not wait until its used. What's the reason for such a
behavior, to let a driver pretend it can handle a device, but it  
doesn't

even know if all the needed pieces are available on the system?


 Basically, you have a device which can carry out different  
functions depending on the firmware loaded into it. Driver A is  
specific to this device, and loads the firmware. Driver B uses  
functions exported by Driver A to carry out one particular function  
of the device. Driver C uses the same functions to carry out a  
totally different function on the same device, but with different  
firmware loaded.


Add in multiple devices handled by Driver A, all with different  
functionality, and sometimes with combinations of functionality that  
can coexist, and you see that when Driver A loads it cannot possibly  
know which firmware to load, but must wait for other Drivers to turn  
up and be put into use. Thus it 'pretends' to handle all the devices  
until it's forced to make a choice.


Yes, this is hellishly complicated. Blame Intel :)


The underlying issue are the driver authors, that's not so easy to
fix. :)


Addressed in previous email.


Well, 10 seconds are just to short for userspace to react on some
setups, from tiny boxes which are busy, to 512 CPU boxes enumerating
thousands of devices, all had problems here. Any timeout for a
firmware-request is just a broken concept, the request should wait
forever, to be fulfilled or canceled from userspace when it's ready.


Absolutely. I said this in an earlier email and suggested rejecting  
this patchset on the grounds that it was another bodge covering over  
a fundamentally broken area of the kernel :)



Kay


Michael-Luke Jones

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 10:06, Kay Sievers wrote:


The underlying issue are the driver authors, that's not so easy to
fix. :)


Sorry, I know this maybe be unintentional, but comments like this  
make me somewhat angry.


If there is no decent documentation as to how to do it the right way  
(tm), how do you expect people to do it the right way (tm)?



Any timeout for a
firmware-request is just a broken concept, the request should wait
forever, to be fulfilled or canceled from userspace when it's ready.


What I wrote above is especially true when the in-kernel APIs  
themselves do things the wrong way (tm) themselves. Thus, even more  
thought is required to work around this imperfect behaviour in a sane  
manner. And without documentation, every single device driver author  
has to go through this thought process themselves. Unsurprisingly,  
they often get it wrong. But as there's no decent documentation to do  
it right, it's *not* their fault. I'd suggest it's more the fault of  
the core kernel devs who fail to fix up a questionable firmware  
loading interface, then fail to document how to work around its  
shortcomings.


Again, apologies if this sounds angry, I don't want to start an  
argument. But as someone just starting out here, this kind of thing  
can be very frustrating, as even with the best will in the world,  
achieving the right way (tm) is basically impossible if those in the  
know about what the right way (tm) is fail to share this information.


Michael-Luke Jones

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Kay Sievers

On Mon, 2007-05-28 at 09:48 +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 08:43, Rafael J. Wysocki wrote:
> 
> >> Seems, that's just the broken synchronous firmware loading interface
> >> with the useless timeout handling. The nowait version of the same  
> >> loader
> >> doesn't time out, and should not have that problem. The sync version
> >> should be removed from the kernel, it just causes all sorts of  
> >> problems
> >> since it exists.
> >>
> >> Userspace should handle the async request just fine when it comes  
> >> back
> >> running, regardless of the time it was submitted.
> >
> > Okay, so the solution is to convert the drivers to use
> > request_firmware_nowait() instead of request_firmware() in  
> > their .resume()
> > routines.
> 
> [added Rob Landley to CC]
> 
> I think asynchronous loading should be made the default behaviour.  
> However, we need to think and document how to do firmware loading.
> 
> Firmware loading can be done at two different times:
> Device Driver Load
> First use of Device Driver
> 
> For a network device, this correlates to when the driver is first  
> loaded into memory or at 'ifconfig up' respectively.
> 
> At device driver load, firmware loading must be asynchronous. This is  
> because device driver init can occur before userspace runs and  
> registers a hotplug/uevent event handler. If device use is attempted  
> before firmware loads, a -ENOFIRMWARE call would be great so that  
> userspace and thus the user can be clearly informed what the problem is.

Why would a driver create an interface before it has the needed
firmware loaded?

> However, at 'first use' firmware loading, the synchronous interface  
> should remain. 'ifconfig up' either works or it doesn't, and as I see  
> it, has to just hang around until firmware turns up.

What kind of network driver does create an interface for a
non-functioning device? That sounds like a bug on its own.

If a driver binds to a device, it should just have the firmware already
loaded, and not wait until its used. What's the reason for such a
behavior, to let a driver pretend it can handle a device, but it doesn't
even know if all the needed pieces are available on the system?

> One more thing, it seems that the asynchronous firmware loading  
> thread just spawns a _request_firmware() call which then times out at  
> 60 seconds. I think, if the first request fails it spawns another. 60  
> seconds is *far* too long for this type of thing, and this was set at  
> 10 seconds before the last two kernel releases (which is even a bit  
> slow itself). Unfortunately, this appears to a case of quite senior  
> kernel developers pushing a bodge upstream rather than fixing the  
> underlying issue :( [1] [2]

The underlying issue are the driver authors, that's not so easy to
fix. :)

Well, 10 seconds are just to short for userspace to react on some
setups, from tiny boxes which are busy, to 512 CPU boxes enumerating
thousands of devices, all had problems here. Any timeout for a
firmware-request is just a broken concept, the request should wait
forever, to be fulfilled or canceled from userspace when it's ready.

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Michael-Luke Jones


On 28 May 2007, at 08:43, Rafael J. Wysocki wrote:


Seems, that's just the broken synchronous firmware loading interface
with the useless timeout handling. The nowait version of the same  
loader

doesn't time out, and should not have that problem. The sync version
should be removed from the kernel, it just causes all sorts of  
problems

since it exists.

Userspace should handle the async request just fine when it comes  
back

running, regardless of the time it was submitted.


Okay, so the solution is to convert the drivers to use
request_firmware_nowait() instead of request_firmware() in  
their .resume()

routines.


[added Rob Landley to CC]

I think asynchronous loading should be made the default behaviour.  
However, we need to think and document how to do firmware loading.


Firmware loading can be done at two different times:
Device Driver Load
First use of Device Driver

For a network device, this correlates to when the driver is first  
loaded into memory or at 'ifconfig up' respectively.


At device driver load, firmware loading must be asynchronous. This is  
because device driver init can occur before userspace runs and  
registers a hotplug/uevent event handler. If device use is attempted  
before firmware loads, a -ENOFIRMWARE call would be great so that  
userspace and thus the user can be clearly informed what the problem is.


However, at 'first use' firmware loading, the synchronous interface  
should remain. 'ifconfig up' either works or it doesn't, and as I see  
it, has to just hang around until firmware turns up.


One more thing, it seems that the asynchronous firmware loading  
thread just spawns a _request_firmware() call which then times out at  
60 seconds. I think, if the first request fails it spawns another. 60  
seconds is *far* too long for this type of thing, and this was set at  
10 seconds before the last two kernel releases (which is even a bit  
slow itself). Unfortunately, this appears to a case of quite senior  
kernel developers pushing a bodge upstream rather than fixing the  
underlying issue :( [1] [2]


Documentation for how hotplug/uevent handlers should cope with these  
types of firmware loading is also *strongly* requested, in order for  
lightweight but fully functional implementations to be made.


Documentation > Reference Implementation :)

Michael-Luke

[1] http://tinyurl.com/2colng (git.kernel.org)
[2] http://tinyurl.com/224h54 (redhat bugzilla)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Nigel Cunningham

Hi.

On Sun, 2007-05-27 at 23:45 +0200, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 22:49, Matthew Garrett wrote:
> > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > > 
> > > Use a hibernation and suspend notifier to disable the firmware requesting
> > > mechanism before a hibernation/suspend and enable it after the operation.
> > 
> > This avoids the problem of .resume methods calling userspace while 
> > userspace is frozen and a resulting hang, but does it actually result in 
> > the drivers beginning to work again?
> 
> Well, this was acutally invented before you've decided to remove the freezing
> of tasks from the suspend code path (which I think is a mistake, but that's
> only my personal opinion, so it doesn't matter very much ;-)) and I regard it
> as a workaround.

Suspend-to-ram code paths shouldn't assume userspace is unfrozen anyway.
Doesn't [u]swsusp have a code path like Suspend2 where we can suspend to
ram after writing the hibernation image? In that case, it will still be
possible that we seek to enter and leave S3 with processes frozen.

Apologies if anyone has already mentioned this - I'm just starting to
play catchup.

Regards,

Nigel



signature.asc
Description: This is a digitally signed message part

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Rafael J. Wysocki

On Monday, 28 May 2007 00:16, Kay Sievers wrote:
> On Sun, 2007-05-27 at 23:04 +0100, Matthew Garrett wrote:
> > On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote:
> > 
> > > What exactly is the problem we see here? The timeout of the firmware 
> > > loader?
> > > What goes wrong with frozen userspace, usually there is only a netlink
> > > message sent from the kernel, which should be received and handled
> > > just fine when userspace is running again.
> > 
> > Driver calls request_firmware in the resume method. The userspace helper 
> > can't be run because it's been frozen, so the firmware never gets loaded 
> > and the call times out. The driver then fails to resume. While all this 
> > is happening, the rest of the kernel is blocking on that resume method. 
> > The firmware can be loaded once userspace has been started again, but by 
> > that time the driver has given up.
> 
> Seems, that's just the broken synchronous firmware loading interface
> with the useless timeout handling. The nowait version of the same loader
> doesn't time out, and should not have that problem. The sync version
> should be removed from the kernel, it just causes all sorts of problems
> since it exists.
> 
> Userspace should handle the async request just fine when it comes back
> running, regardless of the time it was submitted.

Okay, so the solution is to convert the drivers to use
request_firmware_nowait() instead of request_firmware() in their .resume()
routines.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-28 Thread Rafael J. Wysocki

On Monday, 28 May 2007 00:01, Matthew Garrett wrote:
> On Sun, May 27, 2007 at 11:45:03PM +0200, Rafael J. Wysocki wrote:
> > On Sunday, 27 May 2007 22:49, Matthew Garrett wrote:
> > > If we remove process freezing in STR, this should just work[1] without the
> > > need to complicate things. 
> > 
> > Under the (optimistic, IMO) assumption that the relevant user space task 
> > won't
> > block on I/O with a suspended device involved or something like this.
> 
> Yeah, I forgot the footnote that was going to mention that. Clearly 
> there are issues if this is on some block device that hasn't been 
> resumed yet.

Or worse.  Suppose, for example, that the device which needs the firmware
uploaded is your network adapter and the firmware file is on NFS.

I don't think there's a solution to the "requesting firmware from .resume()"
problem other than copying the firmware into memory _before_ the suspend
and using this copy to upload the firmware in .resume().

> > BTW, I know of two subsystems that want their kernel threads to be frozen 
> > for
> > synchronization purposes.  Please see these messages:
> > 
> > 1) 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html
> > (plus follow up)
> > 
> > 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2
> 
> I'm not entirely sold on this. The issue is that there's the possibility 
> of races during suspend/resume?

Yes.

> It sounds like that should be implemented in the driver, rather than
> depending on a side-effect of process freezing. Otherwise there's no way 
> of selectively suspending that device.

That's true, but we've been using the freezer for so long that at least some
drivers started to rely on it being used.  That's the reason, IMO, why we can't
just drop the freezer right now.

It can be _replaced_ by some other synchronization mechanisms, but not just
dropped.  Moreover, I think that to implement such mechanisms within device
drivers we should first stop using the same .suspend() and .resume() routines
for both hibernation and suspend.

This is the plan right now (ie. to stop using .suspend() and .resume() for
hibernation) and when we're done with that (yes, it'll take some time), then we
can think of dropping the freezer entirely from the suspend code path.

> > Besides, there's the hibernation that needs to freeze tasks for another 
> > reason,
> > so it needs some way to ensure that drivers won't request firmware while
> > the user land is frozen.
> 
> I agree that that's an issue right now, but I think we should see if 
> this is repairable without just breaking those drivers.

Well, I don't think we're breaking the drivers.

> One option would be to defer resuming them until userspace is alive again -
> that would be no worse than the suspend to RAM case without process freezing.

Please consider the example with a firmware on NFS. :-)

I think the _solution_ would be to fix the drivers.  The other approaches are
just workarounds, including the $subject patch.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Kay Sievers

On Sun, 2007-05-27 at 23:04 +0100, Matthew Garrett wrote:
> On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote:
> 
> > What exactly is the problem we see here? The timeout of the firmware loader?
> > What goes wrong with frozen userspace, usually there is only a netlink
> > message sent from the kernel, which should be received and handled
> > just fine when userspace is running again.
> 
> Driver calls request_firmware in the resume method. The userspace helper 
> can't be run because it's been frozen, so the firmware never gets loaded 
> and the call times out. The driver then fails to resume. While all this 
> is happening, the rest of the kernel is blocking on that resume method. 
> The firmware can be loaded once userspace has been started again, but by 
> that time the driver has given up.

Seems, that's just the broken synchronous firmware loading interface
with the useless timeout handling. The nowait version of the same loader
doesn't time out, and should not have that problem. The sync version
should be removed from the kernel, it just causes all sorts of problems
since it exists.

Userspace should handle the async request just fine when it comes back
running, regardless of the time it was submitted.

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Matthew Garrett

On Sun, May 27, 2007 at 11:49:30PM +0200, Kay Sievers wrote:

> What exactly is the problem we see here? The timeout of the firmware loader?
> What goes wrong with frozen userspace, usually there is only a netlink
> message sent from the kernel, which should be received and handled
> just fine when userspace is running again.

Driver calls request_firmware in the resume method. The userspace helper 
can't be run because it's been frozen, so the firmware never gets loaded 
and the call times out. The driver then fails to resume. While all this 
is happening, the rest of the kernel is blocking on that resume method. 
The firmware can be loaded once userspace has been started again, but by 
that time the driver has given up.

-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Matthew Garrett

On Sun, May 27, 2007 at 11:45:03PM +0200, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 22:49, Matthew Garrett wrote:
> > If we remove process freezing in STR, this should just work[1] without the
> > need to complicate things. 
> 
> Under the (optimistic, IMO) assumption that the relevant user space task won't
> block on I/O with a suspended device involved or something like this.

Yeah, I forgot the footnote that was going to mention that. Clearly 
there are issues if this is on some block device that hasn't been 
resumed yet.

> BTW, I know of two subsystems that want their kernel threads to be frozen for
> synchronization purposes.  Please see these messages:
> 
> 1) https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html
> (plus follow up)
> 
> 2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2

I'm not entirely sold on this. The issue is that there's the possibility 
of races during suspend/resume? It sounds like that should be 
implemented in the driver, rather than depending on a side-effect of 
process freezing. Otherwise there's no way of selectively suspending 
that device.

> Besides, there's the hibernation that needs to freeze tasks for another 
> reason,
> so it needs some way to ensure that drivers won't request firmware while
> the user land is frozen.

I agree that that's an issue right now, but I think we should see if 
this is repairable without just breaking those drivers. One option would 
be to defer resuming them until userspace is alive again - that would be 
no worse than the suspend to RAM case without process freezing.

-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Rafael J. Wysocki

On Sunday, 27 May 2007 23:49, Kay Sievers wrote:
> On 5/27/07, Matthew Garrett <[EMAIL PROTECTED]> wrote:
> > On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > >
> > > Use a hibernation and suspend notifier to disable the firmware requesting
> > > mechanism before a hibernation/suspend and enable it after the operation.
> >
> > This avoids the problem of .resume methods calling userspace while
> > userspace is frozen and a resulting hang, but does it actually result in
> > the drivers beginning to work again? If we remove process freezing in
> > STR, this should just work[1] without the need to complicate things. On
> > the other hand, if we don't want to support these functions in the
> > suspend and resume methods we could just audit the kernel and remove
> > them all.
> 
> What exactly is the problem we see here? The timeout of the firmware loader?
> What goes wrong with frozen userspace, usually there is only a netlink
> message sent from the kernel, which should be received and handled
> just fine when userspace is running again.

Users report the timeout as a problem and it's not that straightforward to
figure out what happens.

Still, I agree it's much better if drivers don't use request_firmware() in
their .resume() routines at all, because they shouldn't rely upon user land at
this point.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Kay Sievers

On 5/27/07, Matthew Garrett <[EMAIL PROTECTED]> wrote:

On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
>
> Use a hibernation and suspend notifier to disable the firmware requesting
> mechanism before a hibernation/suspend and enable it after the operation.

This avoids the problem of .resume methods calling userspace while
userspace is frozen and a resulting hang, but does it actually result in
the drivers beginning to work again? If we remove process freezing in
STR, this should just work[1] without the need to complicate things. On
the other hand, if we don't want to support these functions in the
suspend and resume methods we could just audit the kernel and remove
them all.

What exactly is the problem we see here? The timeout of the firmware loader?
What goes wrong with frozen userspace, usually there is only a netlink
message sent from the kernel, which should be received and handled
just fine when userspace is running again.

Thanks,
Kay
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Rafael J. Wysocki

On Sunday, 27 May 2007 22:45, Michael-Luke Jones wrote:
> On 27 May 2007, at 21:31, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >
> > Use a hibernation and suspend notifier to disable the firmware  
> > requesting
> > mechanism before a hibernation/suspend and enable it after the  
> > operation.
> >
> > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >
> >  drivers/base/firmware_class.c |   36 ++ 
> > ++
> >  1 file changed, 36 insertions(+)
> 
> I don't like this approach, as I feel that the firmware loading  
> interface should be able to detect if a firmware load request is not  
> being handled, due to absence of userspace / hotplug handler presence.

In principle, I agree.  In practice, though, I don't know how to make this
happen.

> Other circumstances in which this can be a problem is during bootup  
> when request_firmware() calls can be made before userspace is up and  
> init has run (even in the presence of an initramfs).
> 
> (Slightly OT: A particularly nasty race is when an initramfs  
> userspace is present, but firmware loading cannot occur because init  
> has not run, so proc hasn't been mounted, so a hotplug event handler  
> cannot be registered, despite the fact that the firmware is sitting  
> on the ramdisk mounted correctly...)
> 
> In short, a more general solution would be preferred, and preferably  
> one which allows firmware loading to *actually* occur once userspace  
> has actually turned up and registered a handler :)

I agree, but well ... ;-)

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Rafael J. Wysocki

On Sunday, 27 May 2007 22:49, Matthew Garrett wrote:
> On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > 
> > Use a hibernation and suspend notifier to disable the firmware requesting
> > mechanism before a hibernation/suspend and enable it after the operation.
> 
> This avoids the problem of .resume methods calling userspace while 
> userspace is frozen and a resulting hang, but does it actually result in 
> the drivers beginning to work again?

Well, this was acutally invented before you've decided to remove the freezing
of tasks from the suspend code path (which I think is a mistake, but that's
only my personal opinion, so it doesn't matter very much ;-)) and I regard it
as a workaround.

> If we remove process freezing in STR, this should just work[1] without the
> need to complicate things. 

Under the (optimistic, IMO) assumption that the relevant user space task won't
block on I/O with a suspended device involved or something like this.

BTW, I know of two subsystems that want their kernel threads to be frozen for
synchronization purposes.  Please see these messages:

1) https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012592.html
(plus follow up)

2) http://marc.info/?l=linux-kernel&m=117919066830575&w=2

Your patch breaks them and I suspect there are more cases like these.

Besides, there's the hibernation that needs to freeze tasks for another reason,
so it needs some way to ensure that drivers won't request firmware while
the user land is frozen.

> On the other hand, if we don't want to support these functions in the 
> suspend and resume methods we could just audit the kernel and remove 
> them all.

Yes, I think we should do this.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Matthew Garrett

On Sun, May 27, 2007 at 10:31:53PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> Use a hibernation and suspend notifier to disable the firmware requesting
> mechanism before a hibernation/suspend and enable it after the operation.

This avoids the problem of .resume methods calling userspace while 
userspace is frozen and a resulting hang, but does it actually result in 
the drivers beginning to work again? If we remove process freezing in 
STR, this should just work[1] without the need to complicate things. On 
the other hand, if we don't want to support these functions in the 
suspend and resume methods we could just audit the kernel and remove 
them all.

-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH -mm 3/3] PM: Disable _request_firmware before hibernation/suspend

2007-05-27 Thread Michael-Luke Jones


On 27 May 2007, at 21:31, Rafael J. Wysocki wrote:


From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Use a hibernation and suspend notifier to disable the firmware  
requesting
mechanism before a hibernation/suspend and enable it after the  
operation.


Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>

 drivers/base/firmware_class.c |   36 ++ 
++

 1 file changed, 36 insertions(+)


I don't like this approach, as I feel that the firmware loading  
interface should be able to detect if a firmware load request is not  
being handled, due to absence of userspace / hotplug handler presence.


Other circumstances in which this can be a problem is during bootup  
when request_firmware() calls can be made before userspace is up and  
init has run (even in the presence of an initramfs).


(Slightly OT: A particularly nasty race is when an initramfs  
userspace is present, but firmware loading cannot occur because init  
has not run, so proc hasn't been mounted, so a hotplug event handler  
cannot be registered, despite the fact that the firmware is sitting  
on the ramdisk mounted correctly...)


In short, a more general solution would be preferred, and preferably  
one which allows firmware loading to *actually* occur once userspace  
has actually turned up and registered a handler :)


Michael-Luke Jones

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

52 matches

Mail list logo