Re: suspend blockers Android integration

2010-06-03 Thread Linus Torvalds


On Fri, 4 Jun 2010, Ingo Molnar wrote:
 
 What you say is absolutely true, hence this would be driven via sched_tick() 
 + TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
 be done with no overhead to the regular fastpaths.
 
 The TIF notifier would be the one scheduling to idle - and would thus do it 
 only to user-mode tasks.

The thing is, unless there is some _really_ deep other reason to do 
something like this, I still think it's total overdesign to push any 
knowledge/choices like this into the scheduler. I'd rather keep things way 
more independent, less tied to each other and to deep kernel subsystems.

IOW, my personal opinion is that something like a suspend (blocker or not)
decision simply shouldn't be important enough to be tied into the 
scheduler. Especially not if it could just be its own layer.

That said, as far as I know, the Android people have mostly been looking 
at the suspend angle from a single-core standpoint. And I'm not at all 
convinced that they should hijack the existing /sys/power/state thing 
which is what I think they do now.

And those two things go together. The /sys/power/state thing is a global 
suspend - which I don't think is appropriate for an opportunistic thing in
the first place, especially for multi-core.

A well-designed opportunistic suspend should be a two-phase thing: an 
opportunistic CPU hotunplug (shutting down cores one by one as the system
is idle), and not a global event in the first place. And only when 
you've reached single-core state should you then say do I suspend the 
system too.
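
As a rough illustration of the two-phase idea (not code from the Google
patches), the policy can be modelled as a tiny state machine. The System
class, opportunistic_step() and the returned strings are hypothetical
names used only for this sketch:

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """Toy model of a machine; all fields are assumptions for this sketch."""
    online_cpus: int
    suspend_blockers: set = field(default_factory=set)
    suspended: bool = False


def opportunistic_step(system: System, idle: bool) -> str:
    """Run one policy step and report what was done."""
    if not idle:
        return "busy"
    # Phase 1: take cores down one at a time while the system stays idle.
    if system.online_cpus > 1:
        system.online_cpus -= 1
        return "offlined-cpu"
    # Phase 2: only a single-core system considers a full suspend,
    # and only when no suspend blocker is held.
    if system.suspend_blockers:
        return "blocked"
    system.suspended = True
    return "suspended"
```

Calling opportunistic_step() repeatedly on an idle two-core system first
offlines one core, and only suspends once no blocker remains.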

So I've tried to look a bit at the patches, and my admittedly rough 
comments so far are

 - I really do prefer the off to the side approach that the current 
   google opportunistic suspend patches have. As mentioned, I don't think 
   this should be deep in the scheduler. Not at all.

 - I do think there are possibly races and CPU idle issues there, but I 
   think they are mainly for the multi-core thing. And I think that's a 
   totally separate issue. Or it _should_ be.

 - once you're single-core (whether because you never had more cores to 
   begin with, or because the opportunistic CPU offlining has taken down 
   the other cores), I think the suspend-blocker is fine as a concept, and 
   certainly shouldn't need any deep scheduler hooks.

so I'd like to see the opportunistic suspend thing think about CPU
offlining, and I'd like to see it disconnect from the existing
/sys/power/state. And I'd really not like to involve deep internal kernel
hooks into it.

But I'll also admit that maybe I'm not seeing some problems. I've frankly 
tried to avoid the whole discussion until Andrew pulled me in yesterday.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers Android integration

2010-06-03 Thread Linus Torvalds


On Thu, 3 Jun 2010, Linus Torvalds wrote:
 
 so I'd like to see the opportunistc suspend thing think about CPU 
 offlining

Side note: one reason for me being somewhat interested in the CPU 
offlining is that I think the Android kind of opportunistic suspend is 
_not_ likely something I'd like to see on a desktop. But an
opportunistic CPU offliner? That might _well_ be useful even outside of 
any other suspend activity.

If the system is idle (or almost idle) for long times, I would heartily 
recommend actively shutting down unused cores. Some CPU's are hopefully 
smart enough to not even need that kind of software management, but I 
suspect even the really smart ones might be able to take advantage of the 
kernel saying: I'm shutting you down, you don't have to worry about 
latency AT ALL, because I'm keeping another CPU active to do any real 
work.

I'd also be interested to see if it could even improve single-thread 
performance if we end up doing the whole SMP-UP lock prefix rewriting 
when the system is idle enough that we'd be better off running just a 
single core. I dunno - just throwing that out there.

Anyway, the only reason I think this is related is literally because I 
think that if we know there is only a single CPU active, I think the 
actual real opportunistic suspend is easier. Suddenly you don't have to 
worry about what happens on other run-queues etc, and whether another CPU 
is just about to create a suspend block etc.

So I think they tie together, although it's mostly tangential. And as 
mentioned, I think an opportunistic CPU suspend part is more relevant
outside of Android, and thus perhaps more widely interesting.

Linus


RE: Future of resource framework?

2010-06-03 Thread Gadiyar, Anand
Kevin Hilman wrote:
 Mike Chan m...@android.com writes:
 
  On Fri, May 21, 2010 at 9:47 AM, Kevin Hilman
  khil...@deeprootsystems.com wrote:
  Mike Chan m...@android.com writes:
 
  I'm not sure if this has been discussed yet, but since it seems that
  the resource framework will not be making it upstream, I am curious
  what are the replacements under consideration. I am starting to see
  similar issues on other platforms (msm / tegra) so more generic
  (non-omap) solution might be something to consider.
 
  Hi Mike,
 
  Which parts of the SRF do you currently use and find useful?  It would
  be helpful for us to understand the parts you see as useful and
  potentially helpful to generalize.
 
 
  Off the top of my head, for Droid specifically, OPP values are useful,
  although in theory if you changed OPP requests to cpu throughput that
  might give the equivalent functionality.
 
  Memory bus speeds / bandwidth, although it's tied to CPU, which
  ultimately ends up in a cpu speed bump.
 
  Although most of the usage I've seen are just hacks, ie: the driver
  knows it needs 550mhz from the cpu so it will request some bogus
  value.
 
 
  As you know, the current implementation has several layers
  and attempts to manage several things: OPPs, latencies etc.
 
  Our current plans are essentially to break up the one framework to
  rule them all philosophy and design of SRF and manage the various
  pieces by extending other layers such as the new OPP layer and voltage
  layers.  Latencies are being managed by the omap_device layer and we
  will hopefully have some discussions with the broader linux-pm
  community about generalizing that more into the generic driver model
  over this year.
 
 
  Bus speed is a common resource I see for omap / msm / tegra. Clocks
  for devices also.
 
  ie: If I'm doing heavy mem operation and need max memory bus, I might
  need to request higher performance. (which might mean 600mhz on
  omap3430, on msm it might mean AXI clock running at 128mhz, and
  something else on tegra).
 
  Or if I'm doing graphics, I may need to up the gfx clock rate, or
  switch which pll it's sourcing etc.. etc..
 
  It doesn't look like pm qos has bus support, or even clock support,
  and this gets tricky if you want something semi-general.
 
 What we're hoping to work towards (and has come up in the suspend
 blocker related discussions) is moving towards a way to track
 per-device (or per-subsystem) constraints like latency and throughput
 in real world terms (usecs, bytes/sec, etc.) in a way that would be
 general.
 
 These constraints would then be visible to the bus- or
 platform-specific code that could make intelligent decisions with them
 (i.e whether or not to raise/lower OPP or bus speed, or whether or not
 to power down a powerdomain etc.)
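
To make the quoted idea concrete, here is a minimal sketch of per-device
constraints expressed in real-world units and consumed by platform code.
It is not the SRF, pm_qos or omap_device API; Constraint, aggregate(),
pick_bus_rate() and may_power_down() are hypothetical names, and summing
throughput requests is an assumption of this sketch:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    device: str
    max_latency_us: float      # longest wakeup latency the device tolerates
    min_throughput_bps: float  # bytes/sec the device needs from the bus


def aggregate(constraints):
    """Combine requests: the tightest latency wins, throughput adds up."""
    latency = min((c.max_latency_us for c in constraints),
                  default=float("inf"))
    throughput = sum(c.min_throughput_bps for c in constraints)
    return latency, throughput


def pick_bus_rate(rates_bps, required_bps):
    """Lowest available rate that meets demand, else the highest one."""
    for rate in sorted(rates_bps):
        if rate >= required_bps:
            return rate
    return max(rates_bps)


def may_power_down(domain_wakeup_latency_us, tightest_latency_us):
    """A domain may be turned off only if it can wake up in time."""
    return domain_wakeup_latency_us <= tightest_latency_us
```

In this model a driver never names an OPP or a clock rate directly; the
platform-specific code translates the aggregate into whatever the
hardware offers.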
 


What if a driver knows that it cannot afford to let the PM layer
turn off the power domain at certain points of time (maybe as long
as a USB cable is connected). How can this be specified in terms
of a latency or throughput constraint?

Just curious, since I don't understand current OMAP3 PM code
as well as I would like to.

- Anand


Re: suspend blockers Android integration

2010-06-03 Thread Arjan van de Ven
On Thu, 3 Jun 2010 19:26:50 -0700 (PDT)
Linus Torvalds torva...@linux-foundation.org wrote:

 
 If the system is idle (or almost idle) for long times, I would
 heartily recommend actively shutting down unused cores. Some CPU's
 are hopefully smart enough to not even need that kind of software
 management, but I suspect even the really smart ones might be able to
 take advantage of the kernel saying: I'm shutting you down, you
 don't have to worry about latency AT ALL, because I'm keeping another
 CPU active to do any real work.

sadly the reality is that offline is actually the same as deepest C
state. At best.

As far as I can see, this is at least true for all Intel and AMD cpus.

And because there's then no power saving (but a performance cost), it's
actually a negative for battery life/total energy.

(lots of experiments inside Intel seem to confirm that, it's not just
theory)





-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


Re: suspend blockers Android integration

2010-06-03 Thread Arve Hjønnevåg
On Thu, Jun 3, 2010 at 7:16 PM, Linus Torvalds
torva...@linux-foundation.org wrote:


 On Fri, 4 Jun 2010, Ingo Molnar wrote:

 What you say is absolutely true, hence this would be driven via sched_tick() 
 + TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
 be done with no overhead to the regular fastpaths.

 The TIF notifier would be the one scheduling to idle - and would thus do it
 only to user-mode tasks.

 The thing is, unless there is some _really_ deep other reason to do
 something like this, I still think it's total overdesign to push any
 knowledge/choices like this into the scheduler. I'd rather keep things way
 more independent, less tied to each other and to deep kernel subsystems.

 IOW, my personal opinion is that something like a suspend (blocker or not)
 decision simply shouldn't be important enough to be tied into the
 scheduler. Especially not if it could just be its own layer.

 That said, as far as I know, the Android people have mostly been looking
 at the suspend angle from a single-core standpoint. And I'm not at all
 convinced that they should hijack the existing /sys/power/state thing
 which is what I think they do now.


While it is true that we have not used this code on a multi core
system yet, I'm not sure why multiple cores would affect it. We
annotate that work needs to be done before it is safe to suspend, but
we don't care which core does the work (or if multiple cores do pieces
of it).

 And those two things go together. The /sys/power/state thing is a global
 suspend - which I don't think is appropriate for an opportunistic thing in
 the first place, especially for multi-core.

 A well-designed opportunistic suspend should be a two-phase thing: an
 opportunistic CPU hotunplug (shutting down cores one by one as the system
 is idle), and not a global event in the first place. And only when
 you've reached single-core state should you then say do I suspend the
 system too.


This seems to fit better into the cpuidle and/or frequency scaling framework.

 So I've tried to look a bit at the patches, and my admittedly rough
 comments so far are

  - I really do prefer the off to the side approach that the current
   google opportunistic suspend patches have. As mentioned, I don't think
   this should be deep in the scheduler. Not at all.

  - I do think there are possibly races and CPU idle issues there, but I
   think they are mainly for the multi-core thing. And I think that's a
   totally separate issue. Or it _should_ be.


I'm not aware of any races with multi-core systems unless there are
existing problems in suspend. We check if any suspend blockers are
active after disable_nonboot_cpus() has returned.

  - once you're single-core (whether because you never had more cores to
   begin with, or because the opportunistic CPU offlining has taken down
   the other cores), I think the suspend-blocker is fine as a concept, and
   certainly shouldn't need any deep scheduler hooks.

 so I'd like to see the opportunistic suspend thing think about CPU
 offlining,

I see this as a separate problem. We ignore a single busy CPU for
opportunistic suspend, so why should the number of online CPUs matter?

 and I'd like to see it disconnect from the existing
 /sys/power/state.

The entry point is not important to us. The current interface is what
Rafael wanted instead of the /sys/power/request-state interface which
is what we changed it to last year.

 And I'd really not like to involve deep internal kernel
 hooks into it.

 But I'll also admit that maybe I'm not seeing some problems. I've frankly
 tried to avoid the whole discussion until Andrew pulled me in yesterday.

                        Linus




-- 
Arve Hjønnevåg


Re: suspend blockers Android integration

2010-06-03 Thread Neil Brown
On Fri, 4 Jun 2010 01:23:02 +0200
Ingo Molnar mi...@elte.hu wrote:

 Btw., i'd like to summarize the scheduler based suspend scheme proposed by 
 Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in 
 the big thread, and there are also new elements of the proposal:

Hi
 I would like to summarise the alternate proposal that I and others have
 suggested in a variety of different forms.

 It starts from the premise that
 1/ Android developers actually like the big hammer aspect of suspend.
   Initiating suspend powers down some devices, puts others in low power
   states, freezes all processes and generally puts the device to sleep
   with a well defined and easily controlled (at the whole-of-system level)
   set of events that will wake from suspend.  This is a big part of the
   Android approach to power-saving and I'm guessing they are not keen to
   depart from it.

 2/ The main problem with using suspend as-is is that it is racy.
   The purpose of suspend is to put the device to sleep until a wake-event
   occurs.  When that wake-event occurs at much the same time that suspend is
   requested races can occur.  We want a wake-event to not only wake the
   device, but to keep the device awake while the wake-event is being handled,
   and to cancel any suspend that was initiated before the wake event
   completed.
   We need to understand wake event in a holistic sense.  If a key press is
   expected to brighten the screen and make a glyph appear, and if that key
   press is considered to be a wake-event, then the glyph appearing must also
   be a part of the wake event.  For such a holistic wake-event to fully
   block/cancel a suspend there must be some mechanism for hand-over of
   wake-events from kernel-space to user-space.

  Given those premises, google's suspend-blocker approach was to allow a
  kernel thread to initiate suspend whenever nothing was stopping it, and to
  allow both drivers and user processes to block that suspend while handling
  a wake event (or anything else that needed to keep the device awake).
  In this case the hand-over is fairly straight forward as the kernel thread
  has full knowledge and can easily wait for all sorts of things.

  The alternate proposal is simply to have user-space initiate a suspend (as
  is already possible), user-space processes can then trivially block that
  suspend through any of a number of IPC approaches, and kernel space drivers
  can block/abort suspend by explicitly requesting a block.

  The variety of alternate proposals comes from a variety of ways to modify
  the semantics of ask for a suspend in such a way that userspace can
  discover when there are kernel-space blocks, and can wait for them to be
  released without spinning.

  A sample modification (which I think is different to all the ones
  mentioned so far, and hopefully pulls out the best of them all) is
  to allow userspace to write e.g. mem_safe rather than mem to
  /sys/power/state.  The 'safe' implies it is safe from races.

  When this is written, the process sleeps in an interruptible state until
  all in-kernel suspend blocks have been dropped.  If any such suspend blocks
  were found, or if a signal is received, the request aborts.  Only if there
  were no suspend blocks and no pending signals does the suspend progress.

  wake-events in the kernel then need to be tracked all the way to user-space,
  and the in-kernel lock is only dropped when the event is consumed by
  user-space.  User-space must take some sort of lock to ensure no new
  suspend is requested before consuming any wake-events from the kernel.
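
  The mem_safe semantics described above can be sketched in user-space
  terms as follows. This is only a model under stated assumptions:
  SuspendGate, block(), unblock() and write_mem_safe() are hypothetical
  names, signal handling is left out, and nothing here is an existing
  kernel interface.

```python
import threading


class SuspendGate:
    """Toy model of the proposed 'mem_safe' write; not a kernel API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._blocks = 0          # in-kernel suspend blocks currently held
        self.suspend_count = 0    # how many suspends actually went ahead

    def block(self):
        """A driver takes a suspend block when a wake-event arrives."""
        with self._cond:
            self._blocks += 1

    def unblock(self):
        """Dropped once user-space has consumed the wake-event."""
        with self._cond:
            self._blocks -= 1
            if self._blocks == 0:
                self._cond.notify_all()

    def write_mem_safe(self, timeout=None):
        """Model of writing 'mem_safe' to /sys/power/state."""
        with self._cond:
            if self._blocks > 0:
                # Blocks were found: sleep until they are dropped (no
                # spinning), then abort so the caller can re-check state.
                self._cond.wait_for(lambda: self._blocks == 0, timeout)
                return "aborted"
            # No blocks were held at all: the suspend proceeds.
            self.suspend_count += 1
            return "suspended"
```

  A user-space power manager would loop on write_mem_safe(), handling any
  pending wake-events each time the write comes back aborted.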

  I believe this is very close to what android has today, only with a much
  smaller change to the user-space interface, which I believe to be the thing
  that has been found most objectionable.
  It does still require a degree of event-tracking within the kernel which
  might still be objectionable - I'm not so sure about different people's
  positions on that.

Thanks,
NeilBrown


Re: suspend blockers Android integration

2010-06-03 Thread Linus Torvalds


On Thu, 3 Jun 2010, Arjan van de Ven wrote:
 
 And because there's then no power saving (but a performance cost), it's
 actually a negative for battery life/total energy.

Including the UP optimizations we do (ie lock prefix removal)? It's 
possible that I'm just biased by benchmarks, and it's true that Intel has 
been getting lots better, but the locking costs are very noticeable 
performance-wise on some benchmarks.

And several CPU's have been held back from going into deepest sleep states 
by stupid firmware and/or platform bugs.

But hey, if it's not going to help, and people have tried it, I guess I'll 
have to believe it.

Linus



Re: suspend blockers Android integration

2010-06-03 Thread Arve Hjønnevåg
On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar mi...@elte.hu wrote:
...
  - Controlled auto-suspend: drivers (such as input) could on wakeup
   automatically set the 'minimum wakeup latency' value of wakee tasks to a
   lower value. This automatically prevents another auto-suspend in the near
   future: up to the point the wakee task increases its latency (via the
   scheduler syscall) again and allows suspend again.


How do you clear the latency value in a safe way? If another wakeup
event happens right after your wakee task is done processing the last
event and decides to increase its latency, auto suspend will be
allowed even though you have an unprocessed wakeup event. Also how do
you know which task will read the event if it is not already waiting
for it?
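
The ordering I am worried about can be shown with a small deterministic
model. It is only my reading of the quoted scheme, not code from any
patch; Task, wakeup_event(), finish_processing() and NO_LIMIT are
hypothetical names:

```python
NO_LIMIT = float("inf")  # hypothetical "no latency requirement" value


class Task:
    def __init__(self):
        self.latency = NO_LIMIT   # per-task 'minimum wakeup latency'
        self.pending_events = 0   # wakeup events not yet processed


def wakeup_event(task):
    """Driver side: queue an event and lower the wakee's latency."""
    task.pending_events += 1
    task.latency = 0.0


def finish_processing(task, handled):
    """Task side: done with 'handled' events, raise latency again."""
    task.pending_events -= handled
    task.latency = NO_LIMIT


def auto_suspend_allowed(task):
    return task.latency == NO_LIMIT


task = Task()
wakeup_event(task)                   # event 1 arrives, latency lowered
# ... the task processes event 1 ...
wakeup_event(task)                   # event 2 arrives right before the raise
finish_processing(task, handled=1)   # the task only knows about event 1

# Auto-suspend is allowed again although event 2 is still unprocessed.
print(auto_suspend_allowed(task), task.pending_events)
```

The single latency value cannot tell "raised after event 1" apart from
"event 2 still pending", which is the problem.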


   This means there will be no surprise suspends for a task that may take a
   bit longer than usual to finish its work. [ Detail: this would only be done
   for tasks that have a non-default (non-infinity) task-latency value - to
   prevent the input driver from lowering latency values (and preventing
   future suspends) just because some unaware apps are running and using input
   drivers. ]

Don't you need two infinity values for this?

-- 
Arve Hjønnevåg

