Re: suspend blockers Android integration
On Fri, 4 Jun 2010, Ingo Molnar wrote:
> What you say is absolutely true, hence this would be driven via sched_tick() + TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can be done with no overhead to the regular fastpaths. The TIF notifier would be the one scheduling to idle - and would thus do it only to user-mode tasks.

The thing is, unless there is some _really_ deep other reason to do something like this, I still think it's total overdesign to push any knowledge/choices like this into the scheduler. I'd rather keep things way more independent, less tied to each other and to deep kernel subsystems.

IOW, my personal opinion is that something like a suspend (blocker or not) decision simply shouldn't be important enough to be tied into the scheduler. Especially not if it could just be its own layer.

That said, as far as I know, the Android people have mostly been looking at the suspend angle from a single-core standpoint. And I'm not at all convinced that they should hijack the existing /sys/power/state thing, which is what I think they do now.

And those two things go together. The /sys/power/state thing is a global suspend - which I don't think is appropriate for an opportunistic thing in the first place, especially for multi-core.

A well-designed opportunistic suspend should be a two-phase thing: an opportunistic CPU hotunplug (shutting down cores one by one as the system is idle), and not a global event in the first place. And only when you've reached single-core state should you then say "do I suspend the system too?".

So I've tried to look a bit at the patches, and my admittedly rough comments so far are:

 - I really do prefer the "off to the side" approach that the current google opportunistic suspend patches have. As mentioned, I don't think this should be deep in the scheduler. Not at all.

 - I do think there are possibly races and CPU idle issues there, but I think they are mainly for the multi-core thing. And I think that's a totally separate issue. Or it _should_ be.

 - once you're single-core (whether because you never had more cores to begin with, or because the opportunistic CPU offlining has taken down the other cores), I think the suspend-blocker is fine as a concept, and certainly shouldn't need any deep scheduler hooks.

So I'd like to see the opportunistic suspend thing think about CPU offlining, and I'd like to see it disconnect from the existing /sys/power/state. And I'd really not like to involve deep internal kernel hooks in it.

But I'll also admit that maybe I'm not seeing some problems. I've frankly tried to avoid the whole discussion until Andrew pulled me in yesterday.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
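
The two-phase scheme described in the message above (offline idle cores one by one, and only consider a system suspend once a single core is left) can be sketched as a small user-space simulation. This is only a minimal sketch under stated assumptions: `System`, `opportunistic_step` and the representation of suspend blockers as a set of names are hypothetical and do not come from any posted patch.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    # One busy flag per online CPU; index 0 is the boot CPU and is never offlined.
    cpu_busy: list
    # Names of active suspend blockers (hypothetical representation).
    blockers: set = field(default_factory=set)


def opportunistic_step(system):
    """Take at most one power-saving step and report what was done."""
    if len(system.cpu_busy) > 1:
        # Phase 1: offline one idle non-boot CPU, highest index first.
        for cpu in range(len(system.cpu_busy) - 1, 0, -1):
            if not system.cpu_busy[cpu]:
                del system.cpu_busy[cpu]
                return "offlined cpu%d" % cpu
        return "stay awake"
    # Phase 2: single-core state reached, so a system suspend may be considered.
    if not system.cpu_busy[0] and not system.blockers:
        return "suspend"
    return "stay awake"
```

Calling `opportunistic_step` repeatedly on an idle three-core `System` offlines two cores before the suspend decision is even looked at, which is the ordering the message argues for.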
Re: suspend blockers Android integration
On Thu, 3 Jun 2010, Linus Torvalds wrote:
> so I'd like to see the opportunistic suspend thing think about CPU offlining

Side note: one reason for me being somewhat interested in the CPU offlining is that I think the Android kind of opportunistic suspend is _not_ likely something I'd like to see on a desktop.

But an opportunistic CPU offliner? That might _well_ be useful even outside of any other suspend activity. If the system is idle (or almost idle) for long times, I would heartily recommend actively shutting down unused cores. Some CPUs are hopefully smart enough to not even need that kind of software management, but I suspect even the really smart ones might be able to take advantage of the kernel saying: "I'm shutting you down, you don't have to worry about latency AT ALL, because I'm keeping another CPU active to do any real work".

I'd also be interested to see if it could even improve single-thread performance if we end up doing the whole SMP->UP lock prefix rewriting when the system is idle enough that we'd be better off running just a single core. I dunno - just throwing that out there.

Anyway, the only reason I think this is related is literally because I think that if we know there is only a single CPU active, the actual real opportunistic suspend is easier. Suddenly you don't have to worry about what happens on other run-queues etc, and whether another CPU is just about to create a suspend block etc.

So I think they tie together, although it's mostly tangential. And as mentioned, I think an opportunistic CPU suspend part is more relevant outside of Android, and thus perhaps more widely interesting.

Linus
RE: Future of resource framework?
Kevin Hilman wrote:
> Mike Chan m...@android.com writes:
>> On Fri, May 21, 2010 at 9:47 AM, Kevin Hilman khil...@deeprootsystems.com wrote:
>>> Mike Chan m...@android.com writes:
>>>> I'm not sure if this has been discussed yet, but since it seems that the resource framework will not be making it upstream, I am curious what replacements are under consideration. I am starting to see similar issues on other platforms (msm / tegra), so a more generic (non-omap) solution might be something to consider.
>>>
>>> Hi Mike,
>>>
>>> Which parts of the SRF do you currently use and find useful? It would be helpful for us to understand the parts you see as useful and potentially helpful to generalize.
>>
>> Off the top of my head, for Droid specifically, OPP values are useful, although in theory if you changed OPP requests to cpu throughput that might give the equivalent functionality. Memory bus speeds / bandwidth, although it's tied to CPU, which ultimately ends up in a cpu speed bump. Although most of the usage I've seen are just hacks, ie: the driver knows it needs 550MHz from the cpu so it will request some bogus value.
>>
>>> As you know, the current implementation has several layers and attempts to manage several things: OPPs, latencies etc. Our current plans are essentially to break up the "one framework to rule them all" philosophy and design of SRF and manage the various pieces by extending other layers such as the new OPP layer and voltage layers. Latencies are being managed by the omap_device layer and we will hopefully have some discussions with the broader linux-pm community about generalizing that more into the generic driver model over this year.
>>
>> Bus speed is a common resource I see for omap / msm / tegra. Clocks for devices also. ie: If I'm doing heavy mem operation and need max memory bus, I might need to request higher performance (which might mean 600MHz on omap3430, on msm it might mean AXI clock running at 128MHz, and something else on tegra). Or if I'm doing graphics, I may need to up the gfx clock rate, or switch which PLL it's sourcing etc.
>>
>> It doesn't look like pm qos has bus support, or even clock support, and this gets tricky if you want something semi-general.
>
> What we're hoping to work towards (and has come up in the suspend blocker related discussions) is moving towards a way to track per-device (or per-subsystem) constraints like latency and throughput in real world terms (usecs, bytes/sec, etc.) in a general way. These constraints would then be visible to the bus- or platform-specific code that could make intelligent decisions with them (i.e. whether or not to raise/lower OPP or bus speed, or whether or not to power down a powerdomain etc.)

What if a driver knows that it cannot afford to let the PM layer turn off the power domain at certain points of time (maybe as long as a USB cable is connected)? How can this be specified in terms of a latency or throughput constraint?

Just curious, since I don't understand current OMAP3 PM code as well as I would like to.

- Anand
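
The idea quoted above (per-device constraints in real-world units, read by platform code that then picks an OPP or decides whether a power domain may be shut down) can be illustrated with a short simulation. Everything here is an assumption made for illustration: the `Constraint` type, summing throughput requests, taking the tightest latency, and the OPP table are hypothetical and are not the OMAP, SRF or pm_qos interfaces. The last helper shows one conceivable way to express "do not power this domain down" as a latency bound; the thread itself does not confirm that encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    device: str
    max_latency_us: float = float("inf")   # tolerated wakeup latency
    min_throughput_bps: float = 0.0        # required bytes per second


def aggregate(constraints):
    """Combine requests: tightest latency wins, throughput needs add up."""
    latency = min((c.max_latency_us for c in constraints), default=float("inf"))
    throughput = sum(c.min_throughput_bps for c in constraints)
    return latency, throughput


def pick_opp(opps, throughput_needed):
    """opps is a list of (name, bytes_per_sec) sorted by rising capacity.

    Return the lowest OPP that satisfies the request, else the highest one.
    """
    for name, capacity in opps:
        if capacity >= throughput_needed:
            return name
    return opps[-1][0]


def may_power_down(domain_wakeup_latency_us, latency_limit_us):
    """A domain may go off only if waking it fits the latency limit."""
    return domain_wakeup_latency_us <= latency_limit_us
```

Under these assumptions, a USB driver that requests a latency bound below the domain's off-mode wakeup latency while a cable is attached would keep `may_power_down` false for that domain, without the driver naming the domain directly.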
Re: suspend blockers Android integration
On Thu, 3 Jun 2010 19:26:50 -0700 (PDT) Linus Torvalds torva...@linux-foundation.org wrote:
> If the system is idle (or almost idle) for long times, I would heartily recommend actively shutting down unused cores. Some CPUs are hopefully smart enough to not even need that kind of software management, but I suspect even the really smart ones might be able to take advantage of the kernel saying: "I'm shutting you down, you don't have to worry about latency AT ALL, because I'm keeping another CPU active to do any real work".

Sadly the reality is that "offline" is actually the same as "deepest C state". At best.

As far as I can see, this is at least true for all Intel and AMD CPUs.

And because there's then no power saving (but a performance cost), it's actually a negative for battery life/total energy. (Lots of experiments inside Intel seem to confirm that, it's not just theory.)

--
Arjan van de Ven
Intel Open Source Technology Centre
For development, discussion and tips for power savings, visit http://www.lesswatts.org
Re: suspend blockers Android integration
On Thu, Jun 3, 2010 at 7:16 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> On Fri, 4 Jun 2010, Ingo Molnar wrote:
>> What you say is absolutely true, hence this would be driven via sched_tick() + TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can be done with no overhead to the regular fastpaths. The TIF notifier would be the one scheduling to idle - and would thus do it only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do something like this, I still think it's total overdesign to push any knowledge/choices like this into the scheduler. I'd rather keep things way more independent, less tied to each other and to deep kernel subsystems.
>
> IOW, my personal opinion is that something like a suspend (blocker or not) decision simply shouldn't be important enough to be tied into the scheduler. Especially not if it could just be its own layer.
>
> That said, as far as I know, the Android people have mostly been looking at the suspend angle from a single-core standpoint. And I'm not at all convinced that they should hijack the existing /sys/power/state thing, which is what I think they do now.

While it is true that we have not used this code on a multi-core system yet, I'm not sure why multiple cores would affect it. We annotate that work needs to be done before it is safe to suspend, but we don't care which core does the work (or if multiple cores do pieces of it).

> And those two things go together. The /sys/power/state thing is a global suspend - which I don't think is appropriate for an opportunistic thing in the first place, especially for multi-core.
>
> A well-designed opportunistic suspend should be a two-phase thing: an opportunistic CPU hotunplug (shutting down cores one by one as the system is idle), and not a global event in the first place. And only when you've reached single-core state should you then say "do I suspend the system too?".

This seems to fit better into the cpuidle and/or frequency scaling framework.

> So I've tried to look a bit at the patches, and my admittedly rough comments so far are:
>
>  - I really do prefer the "off to the side" approach that the current google opportunistic suspend patches have. As mentioned, I don't think this should be deep in the scheduler. Not at all.
>
>  - I do think there are possibly races and CPU idle issues there, but I think they are mainly for the multi-core thing. And I think that's a totally separate issue. Or it _should_ be.

I'm not aware of any races with multi-core systems unless there are existing problems in suspend. We check if any suspend blockers are active after disable_nonboot_cpus() has returned.

>  - once you're single-core (whether because you never had more cores to begin with, or because the opportunistic CPU offlining has taken down the other cores), I think the suspend-blocker is fine as a concept, and certainly shouldn't need any deep scheduler hooks.
>
> so I'd like to see the opportunistic suspend thing think about CPU offlining,

I see this as a separate problem. We ignore a single busy CPU for opportunistic suspend, so why should the number of online CPUs matter?

> and I'd like to see it disconnect from the existing /sys/power/state.

The entry point is not important to us. The current interface is what Rafael wanted instead of the /sys/power/request-state interface, which is what we changed it to last year.

> And I'd really not like to involve deep internal kernel hooks in it.
>
> But I'll also admit that maybe I'm not seeing some problems. I've frankly tried to avoid the whole discussion until Andrew pulled me in yesterday.
>
> Linus

--
Arve Hjønnevåg
Re: suspend blockers Android integration
On Fri, 4 Jun 2010 01:23:02 +0200 Ingo Molnar mi...@elte.hu wrote:
> Btw., i'd like to summarize the scheduler based suspend scheme proposed by Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in the big thread, and there are also new elements of the proposal:

Hi,

I would like to summarise the alternate proposal that I and others have suggested in a variety of different forms.

It starts from the premise that

1/ Android developers actually like the "big hammer" aspect of suspend. Initiating suspend powers down some devices, puts others in low power states, freezes all processes and generally puts the device to sleep with a well defined and easily controlled (at the whole-of-system level) set of events that will wake from suspend. This is a big part of the Android approach to power-saving and I'm guessing they are not keen to depart from it.

2/ The main problem with using suspend as-is is that it is racy. The purpose of suspend is to put the device to sleep until a wake-event occurs. When that wake-event occurs at much the same time that suspend is requested, races can occur. We want a wake-event to not only wake the device, but to keep the device awake while the wake-event is being handled, and to cancel any suspend that was initiated before the wake event completed.

   We need to understand "wake event" in a holistic sense. If a key press is expected to brighten the screen and make a glyph appear, and if that key press is considered to be a wake-event, then the glyph appearing must also be a part of the wake event. For such a holistic wake-event to fully block/cancel a suspend there must be some mechanism for hand-over of wake-events from kernel-space to user-space.

Given those premises, google's suspend-blocker approach was to allow a kernel thread to initiate suspend whenever nothing was stopping it, and to allow both drivers and user processes to block that suspend while handling a wake event (or anything else that needed to keep the device awake). In this case the hand-over is fairly straightforward as the kernel thread has full knowledge and can easily wait for all sorts of things.

The alternate proposal is simply to have user-space initiate a suspend (as is already possible); user-space processes can then trivially block that suspend through any of a number of IPC approaches, and kernel-space drivers can block/abort suspend by explicitly requesting a block.

The variety of alternate proposals comes from a variety of ways to modify the semantics of "ask for a suspend" in such a way that userspace can discover when there are kernel-space blocks, and can wait for them to be released without spinning.

A sample modification (which I think is different to all the ones mentioned so far, and hopefully pulls out the best of them all) is to allow userspace to write e.g. "mem_safe" rather than "mem" to /sys/power/state. The 'safe' implies it is safe from races. When this is written, the process sleeps in an interruptible state until all in-kernel suspend blocks have been dropped. If any such suspend blocks were found, or if a signal is received, the request aborts. Only if there were no suspend blocks and no pending signals does the suspend progress.

Wake-events in the kernel then need to be tracked all the way to user-space, and the in-kernel lock is only dropped when the event is consumed by user-space. User-space must take some sort of lock to ensure no new suspend is requested before consuming any wake-events from the kernel.

I believe this is very close to what android has today, only with a much smaller change to the user-space interface, which I believe to be the thing that has been found most objectionable. It does still require a degree of event-tracking within the kernel which might still be objectionable - I'm not so sure about different people's positions on that.

Thanks,
NeilBrown
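
The "mem_safe" semantics proposed in the message above can be modelled in user space to make the abort rule concrete. This is a rough sketch, not kernel code: the `SuspendGate` class and its method names are hypothetical, a `threading.Condition` stands in for the in-kernel wait, and the `timeout` argument stands in for a signal interrupting the sleep.

```python
import threading


class SuspendGate:
    """Toy model of writing "mem_safe" to /sys/power/state."""

    def __init__(self):
        self._cond = threading.Condition()
        self._blocks = set()        # names of in-kernel suspend blocks
        self.suspend_count = 0      # how many suspends actually went ahead

    def block(self, name):
        with self._cond:
            self._blocks.add(name)

    def unblock(self, name):
        with self._cond:
            self._blocks.discard(name)
            self._cond.notify_all()

    def write_mem_safe(self, timeout=None):
        with self._cond:
            if self._blocks:
                # Blocks were found: sleep until they are dropped (or the
                # wait is interrupted), then abort instead of suspending so
                # that user space can consume the pending wake-events first.
                self._cond.wait_for(lambda: not self._blocks, timeout)
                return "aborted"
            # No blocks at the time of the write: the suspend proceeds.
            self.suspend_count += 1
            return "suspended"
```

In this model user space simply retries the write after an "aborted" result; it never spins, because the write itself sleeps until the blocks are gone.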
Re: suspend blockers Android integration
On Thu, 3 Jun 2010, Arjan van de Ven wrote:
> And because there's then no power saving (but a performance cost), it's actually a negative for battery life/total energy.

Including the UP optimizations we do (ie lock prefix removal)?

It's possible that I'm just biased by benchmarks, and it's true that Intel has been getting lots better, but the locking costs are very noticeable performance-wise on some benchmarks.

And several CPUs have been held back from going into deepest sleep states by stupid firmware and/or platform bugs.

But hey, if it's not going to help, and people have tried it, I guess I'll have to believe it.

Linus
Re: suspend blockers Android integration
On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar mi...@elte.hu wrote:
> ...
> - Controlled auto-suspend: drivers (such as input) could on wakeup automatically set the 'minimum wakeup latency' value of wakee tasks to a lower value. This automatically prevents another auto-suspend in the near future: up to the point the wakee task increases its latency (via the scheduler syscall) again and allows suspend again.

How do you clear the latency value in a safe way? If another wakeup event happens right after your wakee task is done processing the last event and decides to increase its latency, auto suspend will be allowed even though you have an unprocessed wakeup event. Also how do you know which task will read the event if it is not already waiting for it?

> This means there will be no surprise suspends for a task that may take a bit longer than usual to finish its work.
>
> [ Detail: this would only be done for tasks that have a non-default (non-infinity) task-latency value - to prevent the input driver from lowering latency values (and preventing future suspends) just because some unaware apps are running and using input drivers. ]

Don't you need two infinity values for this?

--
Arve Hjønnevåg
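
The race raised in the first reply above can be replayed step by step with a toy model. This is only an illustration under assumptions: `LatencyModel` and its methods are hypothetical stand-ins for the proposed per-task latency value and scheduler syscall, and a single boolean replaces the actual latency number.

```python
class LatencyModel:
    """Toy model: a lowered task latency blocks auto-suspend."""

    def __init__(self):
        self.latency_low = False    # True means auto-suspend is held off
        self.pending_events = 0     # wakeup events not yet read by the task

    def driver_wakeup(self):
        # The driver queues an event and lowers the wakee's latency.
        self.pending_events += 1
        self.latency_low = True

    def task_read_event(self):
        self.pending_events -= 1

    def task_raise_latency(self):
        # The task believes it is done and allows suspend again.
        self.latency_low = False

    def may_suspend(self):
        return not self.latency_low


def replay_race():
    m = LatencyModel()
    m.driver_wakeup()        # event A arrives
    m.task_read_event()      # task finishes processing event A
    m.driver_wakeup()        # event B arrives right after
    m.task_raise_latency()   # task acts on its stale "I am done" decision
    return m
```

After `replay_race()` the model permits suspend while one event is still unread, which is the situation the question describes.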