Re: Doubts about GPeriodic
[ Reply abbreviated to a couple of topics where I had firmer answers ]

On Sat, 2010-10-23 at 17:42 -0400, Havoc Pennington wrote:
> On Sat, Oct 23, 2010 at 3:37 PM, Owen Taylor wrote:
> > - We should not start painting the next frame until we are notified
> > the last frame is complete.
>
> Does frame-complete arrive when "we just did the vsync" i.e. last
> frame is just now on the screen?
>
> We can dispatch "other stuff" while we wait for this, right? Does the
> time between sending off the buffer swap, and getting the frame
> complete back, count as time spent doing other stuff? I guess that
> would roughly mean "if paint finishes earlier than it had to, get
> ahead on other stuff in the meantime" - the wait-for-frame-complete is
> a way to take advantage of any time in the 50% designated for painting
> that we didn't need.
>
> I mean, presumably while waiting for frame-complete the main loop is
> going to run, the question is just whether that time gap factors into
> any of the time calculations.

The time between when we finish the frame and the time we receive frame-complete is some unknowable mix of:

- CPU time in the kernel validating render buffers (or in an indirect-rendering X server, I suppose)
- Time waiting for the GPU to finish
- Time until vblank occurs

Only the first is conceivably something we want to balance with "other stuff", and even that is likely running on another core these days, and more so in the future. So, what we are trying to balance here is:

A) The time in event processing, animation update, layout, and paint up to the point we call glXSwapBuffers() (or XCopyArea() in the X case)

B) The time we spend processing other stuff from the point where we called glXSwapBuffers()

[...]

> > So, there's some appeal to actually base it on measured frame times.
> > Using just the last frame time is not a reliable measure, since frame
> > painting times (using "painting" to include event processing and relayout)
> > are very spiky.
> > Something like:
>
> I had awful luck with this. (I did try the averaging over a window of
> a few frames. I didn't try minimum.)
>
> It's just really lumpy. Say you're transitioning from one screen to
> another, on the first frame maybe you're laying out the new screen and
> uploading some textures for it, and then you have potentially very
> different repaint times for the original screen alone, both screens at
> once during transition, and the final screen. And especially on crappy
> hardware, maybe you only get a few frames in the whole animation to
> begin with. Minimum might be more stable than average. Another issue
> with animation is you don't know the average until you're well into
> the animation.

For the purposes of balancing, it doesn't matter if we have an accurate estimate. If the animation has more complexity and takes longer to paint than the pre-animation state, that means we're just balancing a bit more toward animation than in the pre-animation state until we get new statistics. (I don't think animations typically have *reduced* complexity, because the animation has a mixture of pre-animation GUI elements and post-animation GUI elements.)

What we don't want is to be thrown way off - if the first animation frame takes 100ms to lay out because we have a bunch of new text to measure, we don't want to eat another 100ms doing background processing. This is why I suggested a minimum over several frames. (Or detecting "first frames" by looking for keystrokes and button presses.)

- Owen

___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: Doubts about GPeriodic
On Fri, 22 Oct 2010 at 16:48:34 -0400, Owen Taylor wrote:
> I suppose you could tag D-Bus methods in some fashion, and have the
> D-Bus client library reorder the messages. But again, that depends on
> the user recognizing where compression is useful.

Some D-Bus APIs rely on method replies and signals arriving in the order they were sent; various bits of Telepathy want this, for instance. Otherwise, you can end up with situations where a recipient has to process a message without having received the necessary information to process it correctly. (Obviously, if the process or client library or whatever has opted in to re-ordering, that's fine.)

In Telepathy, our usual approach to this sort of compression is to aggregate things at the D-Bus level - for instance, methods that get information about contacts act on a *list* of contacts (on the Nokia 770 this apparently reduced the join time for a pathologically busy IRC channel like #ubuntu from minutes to seconds!). This also reduces process wakeups, context switches and other per-message overhead.

S
Re: Doubts about GPeriodic
Hi,

On Sat, Oct 23, 2010 at 3:37 PM, Owen Taylor wrote:
> - We should not start painting the next frame until we are notified
> the last frame is complete.

Does frame-complete arrive when "we just did the vsync" i.e. last frame is just now on the screen?

We can dispatch "other stuff" while we wait for this, right? Does the time between sending off the buffer swap, and getting the frame complete back, count as time spent doing other stuff? I guess that would roughly mean "if paint finishes earlier than it had to, get ahead on other stuff in the meantime" - the wait-for-frame-complete is a way to take advantage of any time in the 50% designated for painting that we didn't need.

I mean, presumably while waiting for frame-complete the main loop is going to run, the question is just whether that time gap factors into any of the time calculations.

> paint time  other time  fps  work fraction
> ==========  ==========  ===  =============
>  1ms        15ms        60   94%
>  8ms         8ms        60   50%
> 10ms        22ms        30   68%
> 17ms        15ms        30   47%
> 20ms        12ms        30   38%
> 24ms         8ms        30   33%
> 40ms        10ms        20   20%
> 55ms        11ms        15   20%
> 90ms        10ms        10   10%
>
> But what this does mean is that there is a cliff across different
> systems here that's even worse than it looks from above. Take a very
> non-extreme example - if I'm testing my app on my laptop, maybe painting
> is taking 20ms, and I'm getting a reasonable 30fps. I give it to someone
> with a netbook where CPU and GPU are half the speed and painting takes
> 40ms. The framerate drops only to 20fps but the time for a background
> operation to finish increases by 3.8x. The netbook user has half the CPU
> and we're using only half that half to do the background work.

Good breakdown; one detail: at some point animations are just unusable. I think you need a pretty smooth / reliable 15-20fps-ish for an animation to be even worth doing, really.
If you average a rate like that but have some big hang at the start or end, or if you average a really bad rate like 5-10fps, my experience is that people say "this is slow and sucks" and also the animations don't usually achieve the UI feedback goal because people don't perceive motion, they perceive an annoying artifact. I don't know exactly how to deal with that. It doesn't really matter for the tree view thing, that's not an animation. But I do wonder if we could automatically make an animation drop all frames, if it's only going to have 4 frames anyway, or whatever. I guess this is a side issue / additional elaboration.

> So, there's some appeal to actually base it on measured frame times.
> Using just the last frame time is not a reliable measure, since frame
> painting times (using "painting" to include event processing and relayout)
> are very spiky. Something like:

I had awful luck with this. (I did try the averaging over a window of a few frames. I didn't try minimum.)

It's just really lumpy. Say you're transitioning from one screen to another, on the first frame maybe you're laying out the new screen and uploading some textures for it, and then you have potentially very different repaint times for the original screen alone, both screens at once during transition, and the final screen. And especially on crappy hardware, maybe you only get a few frames in the whole animation to begin with. Minimum might be more stable than average. Another issue with animation is you don't know the average until you're well into the animation.

Now in the tree view case, rather than an animation case, I'd expect very uniform paint times. Here's a direction of thought: do things differently when there's an animation active vs. when we're doing a "regular" paint. litl shell actually does track this. The problem with it (maybe obviously) is if you have any continuous/persistent animations.
> - Average time over last three frames
> - Minimum time over last three frames
> - Average time over last three frames where only motion events were
> delivered

I wonder if some kind of synthetic benchmark would end up working more predictably. Sort of like that "performance score" Windows will compute for you. Maybe they even store that somewhere and base stuff on it, who knows. Though as long as we're only using this for a *max* time to do "other stuff," rather than setting a timeout or simply blocking, it isn't exactly the end of the world if we get this wrong.

> Basically yes. I can't see us ever doing video in gnome-shell - we just
> have to be able to smoothly composite video someone else is playing.

(to be clear, that's what we do also.)

> But I'm not really thinking about gnome-shell here, I'm really thinking
> more about a "standard" application perhaps Evolution or Rhythmbox.
> Whether written in GTK+ or in Clutter or some hybrid.

Agreed
Re: Doubts about GPeriodic
On Fri, 2010-10-22 at 19:30 -0400, Havoc Pennington wrote:
> Hi,
>
> On Fri, Oct 22, 2010 at 4:48 PM, Owen Taylor wrote:
> > I think we're largely agreeing on the big picture here - that priorities
> > don't work so there has to be arbitration between painting and certain
> > types of processing.
>
> Right, good. The rest is really just details - there are various ways
> it could work.
>
> As I wrote this email I realized I'm not 100% clear how you propose
> the 50/50 would work, so maybe it's something to spell out more
> explicitly. There's no way to know how long painting will take, right?
> So it's a rule for the "other stuff" half? Do you just mean an
> alternative way to compute the max time on non-painting tasks (half of
> frame length, instead of 5ms or "until frame-complete comes back")?

I hadn't really worked it out to the point of an algorithm, but let me see if I can take a stab at that. My starting point is that:

- We should not start painting the next frame until we are notified the last frame is complete.
- Once we are notified the last frame is complete, if the "other stuff" queue is empty, we should start painting the next frame immediately - we shouldn't hang around waiting just in case something shows up.

So the question is how long after frame completion we should keep on processing "other stuff" before we start painting the frame. The target here is the 50% rule - that we want to roughly balance the time to paint the frame with the time that we spend processing everything else before painting the frame.

The simplest technique we could take is to say that when we have contention, processing "other stuff" is limited to 0.5 / (refresh rate) seconds (roughly 8ms for the standard 60Hz refresh). This works out pretty well until the paint time gets "big".
Picking a bunch of arbitrary data points:

paint time  other time  fps  work fraction
==========  ==========  ===  =============
 1ms        15ms        60   94%
 8ms         8ms        60   50%
10ms        22ms        30   68%
17ms        15ms        30   47%
20ms        12ms        30   38%
24ms         8ms        30   33%
40ms        10ms        20   20%
55ms        11ms        15   20%
90ms        10ms        10   10%

But what this does mean is that there is a cliff across different systems here that's even worse than it looks from above. Take a very non-extreme example - if I'm testing my app on my laptop, maybe painting is taking 20ms, and I'm getting a reasonable 30fps. I give it to someone with a netbook where CPU and GPU are half the speed and painting takes 40ms. The framerate drops only to 20fps but the time for a background operation to finish increases by 3.8x. The netbook user has half the CPU and we're using only half that half to do the background work.

(This type of thing can happen not just because of a slow system, but because of other different conditions - the user has more data, has a bigger screen, etc. The less predictable the situation, the more we need to make sure that things degrade gracefully. "GTK+ application running on a user's system" is a pretty unpredictable situation.)

So, there's some appeal to actually base it on measured frame times. Using just the last frame time is not a reliable measure, since frame painting times (using "painting" to include event processing and relayout) are very spiky. Something like:

- Average time over last three frames
- Minimum time over last three frames
- Average time over last three frames where only motion events were delivered

probably works better. Once you have a historical frame time estimate, then you limit the total "other stuff" time (before and after frame completion) to that time.
paint time  other time  fps  work fraction
==========  ==========  ===  =============
 1ms        15ms        60   94%
 8ms         8ms        60   50%
10ms        22ms        30   68%
17ms        33ms        20   65%  (30  47%)
20ms        30ms        20   60%  (30  38%)
24ms        26ms        20   52%  (30  33%)
40ms        60ms        10   60%  (20  20%)
55ms        61ms        8.6  52%  (15  20%)
90ms        93ms        5.5  51%  (10  10%)

> > But pathological or not, I think it's also common. This is where my
> > suggestion of a 50% rule comes in. It's a compromise. If we're lucky
> > repainting is cheap, we're hitting the full frame rate, and we're also
> > using 75% of the cpu to make progress. But when drawing takes more time,
> > when there is real competition going on, then we don't do worse than
> > halve the frame rate.
> >
> > (This continues to hold in the extreme - if redrawing is *really* slow -
> > if redrawing takes 1s, then certainly we don't want to redraw for 1s, do
> > 5ms of work, redraw for another 1s, and so forth. Better to slow down
> > from 1 fps to 0.5fps than to turn a 1s computation into a 3 minute
> > computation.)
Re: Doubts about GPeriodic
Hi,

On Fri, Oct 22, 2010 at 9:56 PM, Paul Davis wrote:
> you guys are working out an incredibly complex and potentially baroque
> solution when the elegant and arguably "correct" one has already been
> implemented several times in different contexts. what's the point?

There's a lot of text in this thread but I think the resulting code at issue is not large, dev time measured in days. Well, what we're arguing about is just a small patch once there's a paint clock. The paint clock itself is somewhat larger but still hopefully days. (Developer days, not calendar days.) We have significant prior art (clutter master clock, litl shell, gnome shell, etc.) so it isn't from scratch; that's part of why people have stuff to say about it. Heck, I'm sure Ryan has already finished the thing while we're discussing it here.

Changing over to having threaded rendering and GL-composited layers is comparatively huge, by 10x or 100x I would think, and hasn't even been prototyped out by anyone. I could be wrong, as I said I haven't tried to work it through other than idly thinking about it a little. Maybe there is a simple version.

An important problem is more or less addressed by just the paint clock, which is to be able to sync to hardware refresh and have a tween timestamp related to that syncing. It's possible to get smooth animation by just adding the paint clock.

As a practical matter what I'm going for is to get GTK to be sensible when in-process with Clutter. The other stuff I listed at http://log.ometer.com/2010-10.html#18 is part of that too. I just feel like it sucks to continue the "to use Clutter you have to reinvent all the GTK wheels" situation until GTK 4 and that it might not be such a huge task to make GTK behave itself.

It may be that a possible approach to a render thread is to have Clutter in one thread and GTK in another thread, where layers are clutter actors that GTK renders to... just idly thinking again ;-) honestly I have no idea how all this should work.
Another question that keeps popping up for me is why each process should have its own compositor and then there's also an X compositing manager.

Havoc
Re: Doubts about GPeriodic
On Fri, Oct 22, 2010 at 5:16 PM, Havoc Pennington wrote:
> Right - Miguel pointed out to me (unrelated to this thread) that Mac
> and Windows both work this way these days,

wayland too, as far as I understand

> and I'd already been
> noticing that Chrome, new Pepper plugin API, lightspark, and probably
> other stuff all seem to be moving to a "build a stack of
> software-rendered layers then use GL to squish them" model. So
> building a bunch of layers (ideally in threads) then GL-compositing
> the layers (with optional transformations and shaders) seems to be
> kind of the thing to do these days. You can animate the layers and
> shaders totally in the GL thread without mixing up with the app code.
>
> It seems like a big hairy change though. I definitely haven't taken
> the time to start working it out.

to me, it seems *really* pointless to go halfway here. its abundantly clear what the right design is, so why put a bunch of effort into cooking up a half-cooked solution that will still need to be discarded for the real future? you guys are working out an incredibly complex and potentially baroque solution when the elegant and arguably "correct" one has already been implemented several times in different contexts. what's the point?
Re: Doubts about GPeriodic
Owen Taylor writes:
> I guess the basic question here is whether "most stuff" is updating the
> text of a label or whether most stuff is adding 10,000 elements to a
> GtkTreeView. If most things that applications are doing in response to
> timeouts/IO completions/IPC are cheap and take almost no time, then we
> gain compression and efficiency by doing them before painting and we can
> ask the applications to code specially for the rest.

This is why I suggested a cooperative thread system at one point. Callbacks that would add 10,000 elements to a tree view could simply call yield() in between each item. Then the main loop could prevent it from running again if it wanted to start painting. There are some potentially nasty reentrancy issues though, and of course it would require a bit of assembly for each platform it would run on.

Soren
Re: Doubts about GPeriodic
Hi,

On Fri, Oct 22, 2010 at 4:48 PM, Owen Taylor wrote:
> I think we're largely agreeing on the big picture here - that priorities
> don't work so there has to be arbitration between painting and certain
> types of processing.

Right, good. The rest is really just details - there are various ways it could work.

As I wrote this email I realized I'm not 100% clear how you propose the 50/50 would work, so maybe it's something to spell out more explicitly. There's no way to know how long painting will take, right? So it's a rule for the "other stuff" half? Do you just mean an alternative way to compute the max time on non-painting tasks (half of frame length, instead of 5ms or "until frame-complete comes back")?

> But pathological or not, I think it's also common. This is where my
> suggestion of a 50% rule comes in. It's a compromise. If we're lucky
> repainting is cheap, we're hitting the full frame rate, and we're also
> using 75% of the cpu to make progress. But when drawing takes more time,
> when there is real competition going on, then we don't do worse than
> halve the frame rate.
>
> (This continues to hold in the extreme - if redrawing is *really* slow -
> if redrawing takes 1s, then certainly we don't want to redraw for 1s, do
> 5ms of work, redraw for another 1s, and so forth. Better to slow down
> from 1 fps to 0.5fps than to turn a 1s computation into a 3 minute
> computation.)

Let me think about this in terms of litl shell, which is the real-world example I'm thinking of, and maybe we can see how gnome-shell or some other GL-using apps differ, which could be instructive. Just to go over what's in there: you have your compositor; the most intensive painting it does would usually be repainting video. But it also just has various GL eye candy animations and it has a lot of clutter actors on some screens, so that takes time. Since lots of UI is in the shell itself, compositing is not always what we're doing; sometimes it's just a regular Clutter app really.
Then crammed into this process, for better or for worse:

- "UI chrome" for out-of-process browser (toolbar in-process, actual web page out)
- global UI chrome (settings, switching between apps, etc.)
- video chat stuff mixed in to the overall UI (the video chat engine and video/audio playback is out of process, but has a lot of chit-chat with the shell, it isn't an "app" but mixed into the global UI)
- photos app that downloads photos from web then does GL-ish stuff to display them (this is in process for the bad reason that drivers are broken and out-of-process GL is/was fail)
- playing audio bleeps and bings in response to UI interaction
- chatter with a sqlite thread
- chatter with the litl servers (over dbus, via the browser process)
- chatter with gio threads
- chatter over dbus about NetworkManager and whatnot
- chatter over dbus to GTK widgets that are out of process (don't ask but guess why I'm interested in certain GTK work ;-))
- "misc"

I guess gnome-shell is similar except less "stuff".

As you say in the followup mail, at some point multiple processes/threads exist for a reason. Agreed, but in litl shell there's only one thing I think is in-process that shouldn't be, which is the photo app, and one thing is out of process that shouldn't be (GTK widgets). It's just a complex app talking to a lot of other processes. All the main shell really does is coordinate processes and paint an assemblage of the other processes.

I don't know, I would think it's basically the same deal as the "main" Chrome process with every tab out of process, or as the main Eclipse process where Eclipse has lots of threads, or whatever. The main shell doesn't do blocking IO or long computations. It does have loads of IPC queues to talk to all the threads and processes. I almost feel like the threads and processes are the whole reason we have queue-based GSource.
It almost seems like this is my prototypical case, where there *isn't* any computation in the main thread, just lots of queues to dispatch, and the case you're worried about most is where there *is* ... ? On Radeon hardware with indirect rendering, litl shell paint takes in the 7-9ms area. So for 60fps (16.6ms per frame) you have about 5ms per frame leftover with a little headroom. On 50fps then you have more headroom. I'm not sure exactly what you're suggesting on the 50% rule; if it strictly said 8ms instead of 5ms for the non-paint half, then that sort of moves the practical numbers from "60fps with a bit to spare" to "dropping frames" right? Most likely it isn't genuinely that big of a deal because most frames don't hit the 5ms max, and even fewer would hit the 8ms max, and we can start painting once there's nothing to do. But there is sort of a qualitative difference between 5 and 8, which is whether the painting still fits in the frame or finishes too late. Ignoring the specifics, takeaways could be: * there's a cliff in the chosen time to spend not-painting where we make ourselves miss
Re: Doubts about GPeriodic
On Fri, 2010-10-22 at 16:20 -0400, Havoc Pennington wrote:
> Imagine two processes that are both following the rules and have 10
> streams open to each other and they are both processing all 10 at a
> superfast rate just tossing messages back and forth. What's the
> latency between occasions where these processes have 0 sources to
> dispatch? That drives your framerate. While 10 streams between two
> apps sounds contrived, I don't think one big complex app with say a
> sound server and some downloads in the background and some other
> random main loop tasks is all that different.

At some point we do have to realize that preemptive multitasking was invented for a reason. We can play around the edges, but we can't make a single thread able to smoothly do 5 things at once.

That may sound weird coming from me - considering that the gnome-shell approach is to put everything in a single process and write it in Javascript. But as I see gnome-shell it is limited in scope - it's the compositor, it handles selecting and switching tasks. But it isn't playing movies, it isn't loading web pages. It isn't doing your taxes. If it does start doing any of those things, we'll have to answer the question of how to get those activities into a different thread, into a different process.

This is why as web browsers become more and more application containers, you are seeing a move to isolate the pages from each other - separate threads, separate garbage collection, even separate processes.

- Owen
Re: Doubts about GPeriodic
Hi,

On Fri, Oct 22, 2010 at 5:06 PM, Paul Davis wrote:
> starting from scratch, and thinking about the parallels with a
> pull-model realtime audio design, it seems to me that if you were
> designing this entirely from scratch you wouldn't serialize painting
> and other source handling. you'd double buffer everything and run the
> paint/expose/vblank cycle in a different thread. whenever the
> non-paint/expose/vblank threads were done with refilling a new buffer,
> it would be pushed (preferably lock free) into a place where it could
> be used by a compositor and blitted to the h/w during the
> paint/expose/vblank iteration.

Right - Miguel pointed out to me (unrelated to this thread) that Mac and Windows both work this way these days, and I'd already been noticing that Chrome, new Pepper plugin API, lightspark, and probably other stuff all seem to be moving to a "build a stack of software-rendered layers then use GL to squish them" model. So building a bunch of layers (ideally in threads) then GL-compositing the layers (with optional transformations and shaders) seems to be kind of the thing to do these days. You can animate the layers and shaders totally in the GL thread without mixing up with the app code.

It seems like a big hairy change though. I definitely haven't taken the time to start working it out.

Havoc
Re: Doubts about GPeriodic
On Fri, Oct 22, 2010 at 4:20 PM, Havoc Pennington wrote:
> If I have a spinner animation and that paints flat-out as fast as
> possible without a "yield gap" to let other sources run, then that
> _also_ starves a superfast main loop source that just sits there
> grabbing messages and then throwing them elsewhere, just as much as it
> starves a super slow main loop source. Speed is not the problem, it's
> whether the source runs at all.

[ ... ]

starting from scratch, and thinking about the parallels with a pull-model realtime audio design, it seems to me that if you were designing this entirely from scratch you wouldn't serialize painting and other source handling. you'd double buffer everything and run the paint/expose/vblank cycle in a different thread. whenever the non-paint/expose/vblank threads were done with refilling a new buffer, it would be pushed (preferably lock free) into a place where it could be used by a compositor and blitted to the h/w during the paint/expose/vblank iteration.

with this design, the fast painters always get fast redraws because their new buffers are always ready at each iteration of the paint/expose/vblank cycle. the slow painters or the ones that get blocked by slow sources or whatever simply get their natural frame rate reduced because they haven't pushed a new buffer by each iteration.

any similarity to JACK is entirely coincidental :)
Re: Doubts about GPeriodic
I think we're largely agreeing on the big picture here - that priorities don't work so there has to be arbitration between painting and certain types of processing. I think the points where we aren't entirely aligned are: what is a suitable method of arbitration, and whether the arbitration is something that happens normally, or you have to opt into it as a "background task".

About the method of arbitration: if we look at the idea of reserving a fixed 5ms, or processing during the "waiting for completion" gap, that works well if the computation and painting are essentially unrelated - if we are painting a GtkExpander expanding smoothly while we are filling a treeview. It doesn't work so well if the painting is being triggered *by* the computation. If we are using 12ms of CPU to relayout and repaint and only filling the treeview in the intermediate 4ms, then we've increased the total time to complete by a factor of 4.

Of course, having each chunk of a large computation trigger the same amount of compressible paint work is pathological - ideally we'd be in a situation like incrementally laying out a GtkTextView - only the first chunk triggers a full repaint, subsequent chunks only cause the scrollbar to redraw and the scrollbar redraw is cheap.

But pathological or not, I think it's also common. This is where my suggestion of a 50% rule comes in. It's a compromise. If we're lucky repainting is cheap, we're hitting the full frame rate, and we're also using 75% of the cpu to make progress. But when drawing takes more time, when there is real competition going on, then we don't do worse than halve the frame rate.

(This continues to hold in the extreme - if redrawing is *really* slow - if redrawing takes 1s, then certainly we don't want to redraw for 1s, do 5ms of work, redraw for another 1s, and so forth. Better to slow down from 1 fps to 0.5fps than to turn a 1s computation into a 3 minute computation.)

[...]

> > Are IO completions and IPC well behaved?
> > Well that's really up to the
> > application however, they have to be *somewhat* well behaved in any
> > case.
>
> What's hard I think is to make them well behaved in the aggregate and
> on every single frame.
>
> i.e. it's hard to avoid just randomly having "too much" to dispatch
> from time to time, then you drop 3 frames, it just looks bad. But as
> long as you're OK *on average* this can be solved by spreading the
> dispatch of everything else across more than one frame, instead of
> insisting on doing it all at once.

I think this is a very good point, especially when we are trying to keep a video from stuttering or similar cases where the redrawing is unrelated to the processing. Unfortunately, however, we can't spread things out if the work occurs at the layout stage - not an uncommon circumstance.

(There may be a slight overestimation going on about how bad it is to drop frames - early versions of Clutter and of the Clutter/Tweener integration just didn't handle the computations correctly, so dropped frames were causing velocity stutters.)

[...]

> > While a two-part system like this sounds like a huge pain for
> > application writers - it does have the big advantage that everybody gets
> > a say. If we just cut things off after a fixed time and started
> > painting, then we could end up in a situation where we were just filling
> > the treeview and painting, and never processing D-Bus events at all.
>
> If painting is a higher than default priority you could still add
> sources at an even higher priority, or you could hook into the paint
> clock in the same place events, resize, etc. hook in to force some
> queue to be drained before allowing paint to proceed.

Yes, the X event queue could be force-drained before painting (with considerable adaptation to the current interfaces for doing things like GTK+/Clutter integration.)
> Also if you have a handler that just does your whole queue of whatever
> at once, it effectively does run on every frame and compress it all,
> even if it's an idle - since the main loop can't interrupt a dispatch
> in progress, and the gap means that we'll probably run the dispatch
> handler once on all nonpaint sources in a typical frame.

[...]

> dbus works like the GDK source (because it copied it). One message per
> dispatch at default priority. I'm not sure how gdbus works.
>
> I think what dbus does works well, as long as painting is 1) above
> default priority and 2) not ready for dispatch for at least some time
> during each frame length.
>
> The thing is that as long as "everything but painting" is basically
> sane, then the "up to 5ms" or "while waiting for vsync" gap is going
> to be enough to dispatch everything. If you get a flood of dbus
> messages or whatever though, then you start spreading those over
> frames (but still making progress) instead of losing frames
> indefinitely until you make it through the queue.
>
> It's just less bad to spread dispatching stuff out over a few frames
> if you get a flood.
Re: Doubts about GPeriodic
Hi,

On Fri, Oct 22, 2010 at 10:28 AM, David Zeuthen wrote:
> If you believe that the GUI thread should never perform blocking IO
> (such as reading from disk or IPC) or never perform CPU-intensive
> tasks (such as image- or video-decoding) then... then all that your
> code in the GUI thread does, is to receive data from one thread and
> maybe signal another thread.

This point only affects the "constant factor" though, right? Not the basic character of what's going on.

If I have a spinner animation and it paints flat-out as fast as possible without a "yield gap" to let other sources run, then that _also_ starves a superfast main loop source that just sits there grabbing messages and throwing them elsewhere, just as much as it starves a super slow main loop source. Speed is not the problem; it's whether the source runs at all.

If you go the other way and want to starve painting while we're doing our fast message throwing, then you're relying on always processing messages faster than the other side can send them. Being "fast" is not enough; it has to be "faster," i.e. it has to catch up and empty the queue. In fact *all* active main loop sources have to happen to be caught up *on the same iteration*. So if you're emptying three queues you have to stay ahead of all three senders at once and get them all to zero at the same moment, and only then can you paint. There's no special guarantee you can do that. You're relying on it happening to work out.

Imagine two processes that are both following the rules and have 10 streams open to each other, and they are both processing all 10 at a superfast rate, just tossing messages back and forth. What's the latency between occasions where these processes have 0 sources to dispatch? That drives your framerate. While 10 streams between two apps sounds contrived, I don't think one big complex app with, say, a sound server and some downloads in the background and some other random main loop tasks is all that different.
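The "fast is not enough, it has to be faster" point can be made concrete with a toy main loop (plain Python, a sketch - not GLib code): a message source that dispatches one item per iteration starves an idle-priority paint for as long as the sender merely keeps pace.

```python
from collections import deque

def frames_painted(produce_per_dispatch, iterations):
    """Mini main loop: the message source outranks paint, so paint only
    runs on iterations where the queue happens to be empty."""
    queue = deque([object()])            # one message already pending
    painted = 0
    for _ in range(iterations):
        if queue:
            queue.popleft()              # superfast handler, one per dispatch
            for _ in range(produce_per_dispatch):
                queue.append(object())   # sender replies immediately
        else:
            painted += 1                 # idle-priority paint finally runs
    return painted

print(frames_painted(1, 100))  # 0  -- sender merely keeps up: paint never runs
print(frames_painted(0, 100))  # 99 -- sender stops: paint runs thereafter
```

The handler here is as fast as a handler can be, and paint still never runs in the first case - which is exactly the "whether the source runs at all" problem.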
> Anyway, my point really was something like this: If you want to talk
> about smooth animations and synchronizing to vblank, it would
> probably be good for GPeriodic, Clutter and GTK+ to specify what
> people should and shouldn't be doing in the "GUI Thread". I know a
> lot of it is obvious to people on this list but it never hurts to
> write it down anyway.

With my "finite-length yield gap in the paint loop" approach, I think you can do anything in the GUI thread as long as each single dispatch is fast (and bounded in time). So you can't block, but you can do anything that is _always_ pretty fast (you can't do things that are only _usually_ pretty fast, such as drain an unbounded queue, or blocking IO that sometimes decides to hang for a bit). That's how I'd document it.

The one thing that still breaks is if all your fast-dispatching GSources add up to not-fast, since we have to dispatch all or nothing at each priority. Solving that could be overkill though. File under "someday."

With a yield gap, you need threads to block or to do an indivisible long computation, but you don't need threads just because you have a queue (or a divisible computation, which is basically a queue).

In the more classic approach (paint priority stays constant regardless of time elapsed), I think you can do anything in the GUI thread as long as:

- each individual dispatch is fast, AND
- either:
  - an animation is allowed to starve whatever you are doing indefinitely (so you can be below paint priority), OR
  - the sum of all dispatches that could occur in a single frame timespan is fast and bounded - no queues, and long computations are no good even if split up across dispatches (so you can be above paint priority).

It isn't clear that all long, but divisible, computations can be tossed out of the UI thread. "filling a Tree/TextView" is a good example of how "UI" vs.
"not UI" is kinda fuzzy. To avoid queues in the main UI thread, you need both source and sink to be threadsafe, and real-world unthreadsafe sources include, say, libdbus, while real-world unthreadsafe sinks include, say, GtkWidget.

One more thing: it doesn't matter whether paint priority is default or idle. The issue is created when it's fixed with respect to "other stuff" instead of changing priority dynamically.

Anyway, yes, this only matters for a large complex app doing lots of stuff in the same process as painting, while also trying to avoid dropping frames. For any simple case, things just muddle through.

Havoc
___
gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: Doubts about GPeriodic
On Fri, Oct 22, 2010 at 10:28 AM, David Zeuthen wrote:
> Anyway, my point really was something like this: If you want to talk
> about smooth animations and synchronizing to vblank, it would
> probably be good for GPeriodic, Clutter and GTK+ to specify what
> people should and shouldn't be doing in the "GUI Thread". I know a
> lot of it is obvious to people on this list but it never hurts to
> write it down anyway.

just a footnote: this is all starting to sound quite a lot like JACK:

* regularly scheduled updates, driven by h/w or system timer: check
* clients that might take too long to do their updates: check
* clients with other tasks than just the updates, which must also complete: check
* important to not lock up whole system when update cycle takes too long: check
* specific set of things that should not be done from within an update: check, probably

i don't know that JACK's architecture has anything in particular to offer here, other than the somewhat pleasing convergence of design ideas. it always seemed to me that X's network transparency's worst (and possibly only bad) side effect was to decouple basic conceptions about the internals of a GUI toolkit from the reality of display hardware.
Re: Doubts about GPeriodic
Hi,

On Fri, Oct 22, 2010 at 10:24 AM, Owen Taylor wrote:
> Is painting well behaved? Inherently - no. We can easily get in
> situations where we can spend all our time painting and no time doing
> anything else.

That's the point of the up-to-5ms-of-dispatch thing (or the wait-for-frame-complete thing), though, right? We can force painting to be well behaved by having a period during each frame where painting is not ready for dispatch or has a lower priority. During that period anything else can run - even idles can run. (Or, if we like, only default idles but not low idles, or only default priority and not idles - any of those could be implemented; it just has to be defined.)

I don't see the need to choose between painting-starves-everything and everything-starves-painting. The "dispatch everything but painting" gap can have a max length, not a fixed length - if you do get everything else done, you can proceed to paint. (Well, that's what we do with the 5ms. I guess it might be harder to do with a pure GMainLoop solution, but you could do it, by having the paint source dynamically change its priority, or by having a super-low-priority paint source that's always ready plus a high-priority one that is only ready when frame-complete or after an interval.)

The effect of a max rather than fixed length "everything else" gap is that if painting truly needs the whole frame interval (rather than interval minus gap), painting may still keep up on those frames where there's little else to do. The idea is there's a period in which painting will be well-behaved by yielding - it isn't allowed to defer everything else indefinitely / 100% of the time.

This changes the max time you theoretically have to paint from 1/fps seconds to (1/fps - gap) seconds. However, even if you can't paint in (1/fps - gap), you will only lose a frame due to the gap when there's really stuff to do inside the gap. And if you can't afford a reasonable gap you probably really are hosed anyhow.
There is already a fixed max on paint time in order to look good, so that doesn't change; the max is just tweaked (and can flex up to the absolute max, 1/fps, whenever possible).

The idea is that if your average paint-cpu and everything-else-cpu add up to less than 100% cpu, we shouldn't need to drop frames. However, with a starvation model you drop frames even if you can keep up on average, due to lumpiness, right? So that's the idea of only yielding for a gap - if you get a lump in your everything-else, such as a flood of messages, try to spread it over a few frames.

In practice I think that's what we're talking about. Say I have a spinner animating smoothly, and I get some flood of IO or dbus or whatever it is. Do I handle that flood in 30ms and my spinner hiccups, or do I spread the flood in chunks over several frames, and my spinner stays pretty? As an app developer I think what I want is the latter. But I also don't want to wait for the spinner to go away before handling this flood. I want to smooth the flood over multiple frames, but not defer it indefinitely - I want to make progress on every frame. I don't think that can be expressed by just having a fixed priority on paint and a fixed priority on my handler; there has to be some way in which the priorities "flip" during the frame.

> Are IO completions and IPC well behaved? Well that's really up to the
> application however, they have to be *somewhat* well behaved in any
> case.

What's hard I think is to make them well behaved in the aggregate and on every single frame.

i.e. it's hard to avoid just randomly having "too much" to dispatch from time to time, then you drop 3 frames, it just looks bad. But as long as you're OK *on average* this can be solved by spreading the dispatch of everything else across more than one frame, instead of insisting on doing it all at once. If you aren't OK on average, that's a problem the app is just going to have to solve.
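The "spread a lump over a few frames instead of dropping frames" behavior can be modeled directly. A sketch (the per-frame gap budget here is a count of dispatches rather than 5ms of wall time, purely to keep the model deterministic):

```python
from collections import deque

def items_handled_per_frame(pending, per_frame_budget, frames):
    """Each frame: dispatch 'other stuff' only up to the gap budget,
    then paint unconditionally. A lump of work is smoothed over
    several frames, but no frame is ever skipped."""
    queue = deque(pending)
    handled = []
    for _ in range(frames):
        n = 0
        while queue and n < per_frame_budget:
            queue.popleft()
            n += 1
        handled.append(n)   # ...and then this frame is painted
    return handled

# A lump of 10 messages with a 4-per-frame gap: progress on every frame,
# the backlog clears in 3 frames, and painting never stops.
print(items_handled_per_frame(range(10), 4, 4))  # [4, 4, 2, 0]
```

The spinner stays pretty (a paint per frame), and the flood still completes within a few frames rather than being deferred indefinitely.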
> If I have a GIO async callback that fills a treeview, there is one
> pathology where my callback gets called so frequently that we never
> get to repaint. But what may happen instead is that I get so much data
> in a *single* callback that I block the main loop for an unacceptably
> long period of time. So we always will have the requirement that
> callbacks from the main loop must be *individually* short.

Yes. I think things can be made to work pretty well with just this requirement.

> Making IO
> completions and IPC highest priority makes this requirement a bit more
> stringent - it means that callbacks from the main loop must be *in
> aggregate* short. That callbacks from the mainloop aren't allowed to do
> expensive stuff, but instead must queue it up for an idle at lower
> priority.

A priority lower than paint is the right priority for most stuff, IMO, as long as we're going to dispatch that lower priority at least for some gap per frame. If the presence of an animation means that we aren't going to dispatch lower-than-paint sources unti
Re: Doubts about GPeriodic
Hi,

On Thu, Oct 21, 2010 at 6:30 PM, Havoc Pennington wrote:
> Hi,
>
> On Thu, Oct 21, 2010 at 5:47 PM, David Zeuthen wrote:
>> Note that with GDBus the resulting GDBusMessage is actually being
>> built in a separate (and private) thread - so in practice there is
>> zero overhead in the GUI thread - in particular it doesn't depend on
>> what kind of message it is or how big the message is. The same is true
>> for most of libgio's file and networking primitives (e.g.
>> g_file_load_contents_async() will cause work to happen in a worker
>> thread etc.).
>
> I don't think this matters as long as there's effectively a queue in
> the main thread (i.e. each message or IO chunk has a handler in the
> main thread).
>
> If you dispatch once per dbus message from an X-priority main loop
> source, then lower-than-X-priority handlers will not run until there
> are no dbus messages available. So if you are getting flooded with
> messages, for example if a stream of data is being dbus-transmitted,
> you would need to chew through that whole stream - _or_ it would have
> to just happen that on one iteration of the main loop the message
> processing had caught up with the message sending and the queue was
> empty. Moreover that iteration would have to happen to have no other
> X-priority sources ready. In that case a lower-than-X-priority
> dispatch could occur. As the number of X-priority sources goes up
> (think big complex app with plugins) the odds of getting no X-priority
> sources on a given iteration would presumably drop.
>
> As long as the main loop is seeing a stream of items with no
> guaranteed pauses in the stream, in theory that main loop source runs
> unbounded.
> (yes, in practice it usually doesn't run unbounded, but I
> bet trying to hold 60fps will make the in-practice glitches more
> visible)

If you believe that the GUI thread should never perform blocking IO (such as reading from disk or IPC) or never perform CPU-intensive tasks (such as image- or video-decoding) then... then all that your code in the GUI thread does is to receive data from one thread and maybe signal another thread.

Additionally, if you arrange for things like download tasks (which might return a storm of events if you have a fat pipe) to happen in separate threads... then you end up not doing a lot of stuff in your GUI thread - basically only reacting to external stimuli (e.g. X11 events and maybe some D-Bus IPC).

There are a couple of assumptions in the above paragraph about what the GUI thread should and should not be doing. There's also a built-in assumption that threading Is A Good Thing(tm) and that it's easy to use threading (which I believe it now is with libgio - I don't know about video- or image-decoding). Those are assumptions that I personally think are sound for the Linux desktop... but I won't be surprised if some people don't agree.

Anyway, my point really was something like this: If you want to talk about smooth animations and synchronizing to vblank, it would probably be good for GPeriodic, Clutter and GTK+ to specify what people should and shouldn't be doing in the "GUI Thread". I know a lot of it is obvious to people on this list but it never hurts to write it down anyway.

David
Re: Doubts about GPeriodic
If we say that "painting should have a higher priority than IO completions and IPC" or "IO completions and IPC should have a higher priority than painting", then we are talking about a hard priority system. And the fundamental rule of hard priority systems is that the stuff with higher priority has to be well behaved. If PulseAudio uses real time priorities to get itself scheduled ahead of everything else, then it must make sure that it's not eating all the CPU.

Is painting well behaved? Inherently - no. We can easily get in situations where we can spend all our time painting and no time doing anything else. Once we add synchronization to an external clock, painting becomes *better behaved*. If we are able to paint at 80fps or 40fps, then that will be throttled to 60fps or 30fps and there will be some time remaining. But maybe we can inherently paint at only 61fps? If we make painting highest priority, we have to make provisions for other stuff to progress.

Are IO completions and IPC well behaved? Well, that's really up to the application; however, they have to be *somewhat* well behaved in any case. If I have a GIO async callback that fills a treeview, there is one pathology where my callback gets called so frequently that we never get to repaint. But what may happen instead is that I get so much data in a *single* callback that I block the main loop for an unacceptably long period of time. So we always will have the requirement that callbacks from the main loop must be *individually* short. Making IO completions and IPC highest priority makes this requirement a bit more stringent - it means that callbacks from the main loop must be *in aggregate* short. That is, callbacks from the mainloop aren't allowed to do expensive stuff, but instead must queue it up for an idle at lower priority.

While a two-part system like this sounds like a huge pain for application writers - it does have the big advantage that everybody gets a say.
If we just cut things off after a fixed time and started painting, then we could end up in a situation where we were just filling the treeview and painting, and never processing D-Bus events at all. Well, sort of - the main loop algorithms are designed to protect against this: *all* sources at the current priority are collected and dispatched once before we check again for higher priority sources. But since sources have different granularities and policies, the effect would be somewhat unpredictable. The behavior of the GDK event source is to unqueue and dispatch one X event per pass of the main loop; D-Bus and GIO probably do different things.

Right now event compression in Clutter counts on events getting unqueued from the X socket at the default priority and then stored in an internal queue for compression and dispatching before painting. Going to a system where painting was higher priority than normal stuff would actually require 3 priorities: event queueing, then painting, then everything else. [*]

But can we say for sure that nothing coming in over D-Bus should be treated like an event? Generally, anything where the bulk of the work is compressible is better to handle before painting. An example: say we have a change notification coming over D-Bus which is compressible - it's cheap other than a triggered repaint. Say, updating the text of a label. Combine that with our GtkTreeView filler, and we might have:

  Fill chunk of tree view
  Change notification
  Repaint tree view and label
  Fill chunk of tree view
  Change notification
  Repaint tree view and label
  Fill chunk of tree view
  Change notification
  Repaint tree view and label

Instead of:

  Queue stuff up for filling
  Change notification
  Change notification
  Change notification
  Fill chunk of tree view
  Repaint tree view and label
  Fill chunk of tree view
  Repaint tree view
  Fill chunk of tree view
  Repaint tree view

On Thu, 2010-10-21 at 16:25 -0400, Havoc Pennington wrote: [...]
> Re: frame-complete, it of course assumes working drivers... If you > don't have the async frame completed signal you may be back to the 5ms > thing, no? I guess with direct rendering you are just hosed in that > case... with indirect you can use XCB to avoid blocking and then just > dispatch for 5ms, which is what we do, but with direct rendering you > might just have to block. Unless you're using fglrx which vsyncs but > does not block to do so (at least with indirect, not totally sure on > direct). Sigh. The driver workarounds rapidly proliferate. Maybe > clutter team already debugged them all and the workarounds are in > COGL. :-P I think the basic assumption here is working drivers. If we know what we want, define what we want precisely, implement what we want in the free drivers, the proprietary drivers will eventually catch up. Of course, for GTK+, it doesn't matter, since it's not directly vblank swapping. [...] > You were talking about handling incoming IPC at higher priority than > repaint... it sort of depends on what t
Re: Doubts about GPeriodic
Hi,

On Thu, Oct 21, 2010 at 5:47 PM, David Zeuthen wrote:
> Note that with GDBus the resulting GDBusMessage is actually being
> built in a separate (and private) thread - so in practice there is
> zero overhead in the GUI thread - in particular it doesn't depend on
> what kind of message it is or how big the message is. The same is true
> for most of libgio's file and networking primitives (e.g.
> g_file_load_contents_async() will cause work to happen in a worker
> thread etc.).

I don't think this matters as long as there's effectively a queue in the main thread (i.e. each message or IO chunk has a handler in the main thread).

If you dispatch once per dbus message from an X-priority main loop source, then lower-than-X-priority handlers will not run until there are no dbus messages available. So if you are getting flooded with messages, for example if a stream of data is being dbus-transmitted, you would need to chew through that whole stream - _or_ it would have to just happen that on one iteration of the main loop the message processing had caught up with the message sending and the queue was empty. Moreover, that iteration would have to happen to have no other X-priority sources ready. In that case a lower-than-X-priority dispatch could occur. As the number of X-priority sources goes up (think big complex app with plugins) the odds of getting no X-priority sources on a given iteration would presumably drop.

As long as the main loop is seeing a stream of items with no guaranteed pauses in the stream, in theory that main loop source runs unbounded. (yes, in practice it usually doesn't run unbounded, but I bet trying to hold 60fps will make the in-practice glitches more visible)

Havoc
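The "odds drop as X-priority sources multiply" point is just a product of probabilities. Assuming, purely for illustration, that each source is independently ready with some fixed probability on a given iteration:

```python
def all_quiet_probability(p_ready, n_sources):
    """Probability that *no* higher-priority source is ready on a given
    main loop iteration - the only moment a lower-priority dispatch
    (e.g. paint sitting below message priority) gets to run."""
    return (1 - p_ready) ** n_sources

# One source busy 20% of the time leaves 80% of iterations free...
print(round(all_quiet_probability(0.2, 1), 3))   # 0.8
# ...but ten such sources leave barely 1 iteration in 10.
print(round(all_quiet_probability(0.2, 10), 3))  # 0.107
```

The independence assumption is optimistic, too - correlated bursts (a dbus-transmitted stream) make the quiet moments rarer still.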
Re: Doubts about GPeriodic
Hi,

On Thu, Oct 21, 2010 at 4:25 PM, Havoc Pennington wrote:
> If you think
> about something like the dbus main loop source, the dbus library
> doesn't know what the heck is going to be coming in, and you can't
> tune the main loop source depending on what kind of message it is.

Note that with GDBus the resulting GDBusMessage is actually being built in a separate (and private) thread - so in practice there is zero overhead in the GUI thread - in particular it doesn't depend on what kind of message it is or how big the message is. The same is true for most of libgio's file and networking primitives (e.g. g_file_load_contents_async() will cause work to happen in a worker thread etc.).

David
Re: Doubts about GPeriodic
Hi,

I guess a frame-complete signal (or timer) acts like the 5ms gap, creating a window for other event sources to run? So painting should not starve other stuff; the mainloop could dispatch other stuff while the frame is being completed. Given a gap waiting for frame-completed, you don't need a hardcoded 5ms (makes sense to me).

Re: frame-complete, it of course assumes working drivers... If you don't have the async frame-completed signal you may be back to the 5ms thing, no? I guess with direct rendering you are just hosed in that case... with indirect you can use XCB to avoid blocking and then just dispatch for 5ms, which is what we do, but with direct rendering you might just have to block. Unless you're using fglrx, which vsyncs but does not block to do so (at least with indirect, not totally sure on direct). Sigh. The driver workarounds rapidly proliferate. Maybe the clutter team already debugged them all and the workarounds are in COGL. :-P

I guess COGL or whatever could in theory just send frame-completed after a fixed 5ms as a workaround on platforms that need it. There's also ARM and whatever platforms with no X to consider.

Re: priorities, I would think once the frame-complete comes back (or 5ms expires, absent frame-complete) it's appropriate to drop everything else (unless it's explicitly asked to be super high priority) and paint. "Paint" includes processing the entire event queue and relayout, so that should have the UI sufficiently updated.

You were talking about handling incoming IPC at higher priority than repaint... it sort of depends on what the IPC is about. For example, we have some that is UI-related, similar to events, and other that is basically IO. If you have a flood of IO coming in (say downloading a big file) then I don't think it's acceptable to wait for that queue to drain before painting - it could be minutes, not seconds.
If you think about something like the dbus main loop source, the dbus library doesn't know what the heck is going to be coming in, and you can't tune the main loop source depending on what kind of message it is. Anything with a queue has no real bound on how long its GSource will stay ready. Threads only help if you can squish the queue in the thread... otherwise the unboundedness ends up in the main thread anyway. For example, if you're reading a file, then if you can parse it and convert it to a small object in the thread, there's no potential paint-starvation problem; but if you need to feed the whole unbounded dataset over into a TextView/TreeView, then there is (as you mention).

I feel like most stuff should be below paint priority, not above, and then each frame should have a window (either "while waiting for frame-completed" or "fixed time like 5ms" or whatever) in which things below paint priority are going to run. That way things more or less can't break, as long as each individual dispatch() is reasonably fast/bounded.

If most stuff is below paint priority (in order to ensure we keep up the frame rate), that could be implemented either by making most stuff an idle, or by making paint priority above default. "Most stuff should be an idle" is weird to me - it seems to make default priority kind of meaningless and render the g_*_add() etc. APIs useless. Why not make paint priority greater than the default priority, so that most things should be default, and idle is reserved for things that it's acceptable to starve?

Conceptually, events+paint _should_ be highest priority - without those we are hiccuping and breaking interactivity - the only thing is, they can't run continuously; each frame needs a slice of doing "other stuff", and that could be a fixed interval, or given decent drivers, the time during which the GPU is chewing on the frame / waiting on vsync.
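Keeping each individual dispatch "reasonably fast/bounded" in the presence of a queue usually means giving the callback a time budget and requeueing the leftovers for a later idle. A sketch (the helper and its signature are mine for illustration, not a GLib API):

```python
import time

def process_some(items, budget_s, process_one):
    """Process items until this dispatch's time budget expires; return
    the leftovers so the caller can re-add an idle to finish later."""
    deadline = time.monotonic() + budget_s
    remaining = list(items)
    while remaining and time.monotonic() < deadline:
        process_one(remaining.pop(0))
    return remaining

# With a generous budget everything is handled in one dispatch;
# with a zero budget everything is deferred to the next one.
print(process_some([1, 2, 3], 1.0, lambda item: None))  # []
print(process_some([1, 2, 3], 0.0, lambda item: None))  # [1, 2, 3]
```

This keeps the unboundedness out of any single dispatch; the queue as a whole still drains only as fast as the per-frame gaps allow, which is the desired spreading-over-frames behavior.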
Havoc
Re: Doubts about GPeriodic
On Thu, 2010-10-21 at 08:17 -0400, Havoc Pennington wrote:
> Hi,
>
> On Thu, Oct 21, 2010 at 5:46 AM, Ryan Lortie wrote:
> >
> > What about non-input events, though? Like, if some download is
> > happening and packets are coming in and causing dispatches from the
> > mainloop that we do not have control over.
>
> I brought this up a bit in the earlier thread.
>
> My takeaway is that for I/O type stuff you usually want what we ended
> up with at litl, which is to limit it to some length of time per
> frame. Unfortunately GMainLoop has no native way to do that. I
> described our solution a bit in the old paint clock thread.
>
> There's a danger both of some random download starving animation and
> of your download spinner starving the download.

I think to start off we have to realize that a GTK+ application is significantly different from a compositor like Mutter or the litl shell in a number of ways:

* GTK+ is quite efficient when just a small amount of stuff is changing. Even if the entire toplevel takes a long time to paint, a cheesy animation somewhere in the frame isn't going to cause all the time to be spent painting.

* GTK+ is not going to be using a blocking glSwapBuffers(); GTK+ will be timing frames either based on a straight-up timer, or by getting frame-complete signals back from the compositor.

* It's not the compositor - if painting blocks, it's not the end of the world.

Once we move beyond that, then I'm skeptical about lumping everything that's not events/animation/relayout/repaint into the same bucket. "Everything else" includes a number of different things:

* UI updates being done in response to asynchronous IO finishing

  In this case, I think usually you just want to do the updates immediately; for most UI updates the real expense is the relayout/repaint, so there's no advantage to trickling them in... if you get such a bunch of updates that you block for a couple hundred ms, then you just accept a small stutter.
If that might be a couple of seconds, then I think it's up to the app author to figure out how to fix the situation - if updates can be batched and batching reduces the work that needs to be done, then an easy-to-use "before relayout" API is handy.

* Computations being done in idle chunks because "threads are evil"

  If the computations don't affect the GUI, then in my mind they should just happen in whatever time isn't needed to draw whatever is going on. We have no way of knowing whether whatever is going on is a spinner or is a video playing. In other words, progress displays need to be self-limiting to eat only a small amount of CPU. After all, it's pretty bad if my computation is going on at *half*-speed because of the progress spinner!

* Servicing incoming IPC calls

  Assuming incoming calls queue up, I think it's fine to just handle them at higher priority than the repaint. The pathological case here is that Totem is playing a movie which is maxing out the frame rate, and somebody in another process does sync calls:

    for (movie in allMovies)
        movie.length = totemProxy.getPlayingTime(movie.id);

  And Totem handles one call, then paints a frame, then handles another call, and the whole thing takes forever. This is clearly bad, but I don't think the solution is for Totem to reserve 5ms for every frame of every movie just because someone might start using the D-Bus API it exports. Solutions here are general solutions:

  - Don't put "service" APIs in the GUI thread of GUI applications
  - Use async calls - if the above was done by making a bunch of async calls in parallel, it would be completed in one frame.

* Expensive GUI work done incrementally (adding thousands of items to a GtkTreeView, say)

  Threads are not useful here because GTK+ isn't thread-safe. This one is slightly harder because each update can actually trigger a relayout/repaint, which might be expensive.
So if this is being done at idle priority, you may be in the situation of doing one chunk, which takes 0.1ms, repainting for 20ms, doing another chunk, and so forth. This is the case where something like your proposal of reserving time per frame starts making sense to me. But rather than just doing a blanket reservation of 5ms per frame, it seems better to actually let the master clock know what's going on - to have an API where you add a function and the master clock balances calling it with relayout. That a) avoids wasting time waiting for nothing to happen, and b) allows better handling of the case where the relayout takes 100ms, not 20ms, so you don't work for 5ms, relayout for 100ms, repeat.

- Owen
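The sync-vs-async point about the getPlayingTime() loop is easy to put into numbers. If the service paints a frame between servicing calls, each synchronous round trip costs a whole frame interval, while parallel async calls all queue up and are serviced before the next paint (a back-of-envelope model, not measured behavior):

```python
FRAME_S = 1 / 60.0  # Totem maxing out a 60fps display

def sync_loop_seconds(n_calls):
    # one round trip serviced per painted frame
    return n_calls * FRAME_S

def async_parallel_seconds(n_calls):
    # all requests queue up and are serviced within a single frame
    return FRAME_S

print(round(sync_loop_seconds(100), 2))       # 1.67 seconds for 100 movies
print(round(async_parallel_seconds(100), 4))  # 0.0167 seconds
```

A hundred-fold difference, without the service reserving anything per frame - which is the argument for fixing this on the caller's side.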
Re: Doubts about GPeriodic
On Thu, 2010-10-21 at 11:46 +0200, Ryan Lortie wrote:
> Hi Owen,
>
> A few questions, if you don't mind me picking your brain:
>
> On Wed, 2010-10-20 at 14:58 -0400, Owen Taylor wrote:
> > The real problem is that the phases of the repaint cycle matter. We
> > don't just have a bunch of stuff we need to do every frame, we need to
> > do things in the order:
> >
> > * Process events
>
> The only benefit I see to processing input events at the start of a
> frame cycle is that we possibly get to merge them. We probably do want
> that.

Compressing events to the frame isn't some sort of "oh, that sounds a bit cool" thing. It's vastly better than any other approach I know of, and it fixes problems that have been in GTK+ for a long time. For example, the explicit calls to gdk_window_process_updates() you'll find in a lot of GTK+ scrolling widgets are workarounds for lack of proper event compression.

> What about non-input events, though? Like, if some download is
> happening and packets are coming in and causing dispatches from the
> mainloop that we do not have control over.
>
> Do we try to ensure that those people treat this 'external stimulus' in
> the same way that we treat events (ie: possibly merge multiple incoming
> packets so that they update their UI state only once in response to it)
> or do we want them to just call the GTK API on the spot and risk
> duplicated work because that's quite a lot easier to understand?
>
> Maybe we should have some mechanism that allows them to choose.

We shouldn't generalize when there is one case that is 90% of the problem and that has a better solution than you can provide generally. Mouse events are special because:

- The user can very easily see things lagging behind the mouse pointer
- Mouse events flood in and *do* cause lag in many programs
- Pointer motion events are inherently compressible in almost every circumstance...
it simply doesn't matter if the mouse moves more than once per painted frame (the only exception is a paint program, where you may want some sort of history API to get compressed events.) Yes, you can have compressible network events or whatever; and so there does need to be an easy way to schedule things to happen once before relayout/repaint. Once you have a paint clock, g_idle_add() doesn't work for compression. See MetaLaterAdd: http://git.gnome.org/browse/mutter/tree/src/core/util.c#n801 for the API we are using in gnome-shell. (Maybe you could do it by adding a "tick" function, but the important thing is being able to do it during event processing and happen *this* frame. And that it's convenient.) > > * Update animations > > "tick" Note that this has to happen *after* event processing. > > * Update layout > > Clutter and GDK want to do two different things here, it seems. > Presently (and almost by chance) GPeriodic is emitting a "tick" signal > after running all of the tick functions and Emmanuele is using this to > run "stage updates" on all of the Clutter stages. This is a little bit > like Gtk managing its geometry but not exactly. > > In Gtk's case, we have a chance to do this a bit more programmatically - > only run layout adjustments on widgets/windows that have been marked as > requiring some resize (ie: toplevels and containers with > GTK_RESIZE_QUEUE themselves). That could be handled from a 'tick' > handler, or we could add some more hooks to GPeriodic. I don't think it works to just talk about handling stuff from *a* tick handler. Order matters, you can't just add a bunch of tick handlers and hope they get run in the right order. > A reason that I think it makes sense to do layout updates separate from > repainting is that layout updates can result in us wanting to change the > size of our toplevels (eg: someone opened GtkExpander). Sure, that's a reason that GTK+ doesn't look like Clutter exactly. 
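[Editor's note: the "schedule once before relayout/repaint" pattern that MetaLaterAdd provides can be sketched as below. This is a simplified illustration with invented names, not mutter's actual implementation.]

```c
#include <assert.h>
#include <stdbool.h>

/* A pending "run once before this frame's relayout/repaint" request. */
typedef struct {
    bool queued;
    int  runs;   /* how many times the handler actually ran */
} BeforeRepaint;

/* May be called many times per frame (e.g. once per mouse motion
 * event or network packet); the work is only queued once. */
static void
queue_update (BeforeRepaint *br)
{
    br->queued = true;
}

/* Called by the paint clock just before relayout/repaint.  However
 * many times queue_update() was called this frame, the handler runs
 * exactly once -- this is the compression g_idle_add() can't give you,
 * because an idle has no ordering relative to the paint phases. */
static void
dispatch_before_repaint (BeforeRepaint *br)
{
    if (br->queued) {
        br->queued = false;
        br->runs++;  /* the actual UI update would happen here */
    }
}
```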
> This is a *really* tough problem because if that happens, we can't just > paint. We have to wait until the window manager has actually given us > our new size. I did some benchmarking, and that tends not to happen > until about 1ms later (a bit less for metacity, a bit more for compiz > and mutter). Have you looked at how it works in GtkWindow currently? GTK+ computes the new size, and requests a new size from the window manager, but doesn't actually allocate children and *freezes updates* on the toplevel (gdk_window_freeze_toplevel_updates_libgtk_only) until the configure response is received. > So do we block the mainloop for ~1-2ms and desperately suck from the X > socket until we receive ConfigureNotify (at least until some timeout)? No, you cannot block synchronously for the window manager to handle your ConfigureRequest. Don't go there. > Do we skip drawing the window and wait until next frame if we have a > pending ConfigureNotify? You draw the next frame *when* you get the ConfigureNotify. > Is there some way we can abuse > _NET_WM_SYNC_REQUEST to make this problem easier?
Re: Doubts about GPeriodic
Hi, On Thu, Oct 21, 2010 at 5:46 AM, Ryan Lortie wrote: > > What about non-input events, though? Like, if some download is > happening and packets are coming in and causing dispatches from the > mainloop that we do not have control over. I brought this up a bit in the earlier thread. My takeaway is that for I/O type stuff you usually want what we ended up with at litl, which is to limit it to some length of time per frame. Unfortunately GMainLoop has no native way to do that. I described our solution a bit in the old paint clock thread. There's a danger both of some random download starving animation and of your download spinner starving the download. Havoc
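[Editor's note: the litl-style per-frame I/O cap Havoc describes might look roughly like the sketch below. GMainLoop offers no such facility natively, so everything here -- the function, the fixed per-dispatch cost -- is a simplifying assumption for illustration.]

```c
#include <assert.h>

/* Dispatch pending I/O sources only until a per-frame time budget is
 * exhausted, so a flood of download packets can't starve animation
 * (and the animation can't starve the download completely: leftover
 * sources get dispatched on later frames).
 *
 * For simplicity each dispatch is assumed to cost cost_ms; a real
 * implementation would measure elapsed monotonic time instead.
 * Returns the number of sources dispatched this frame. */
static int
dispatch_with_budget (int pending, double cost_ms, double budget_ms)
{
    double used = 0.0;
    int dispatched = 0;

    while (pending > 0 && used + cost_ms <= budget_ms) {
        used += cost_ms;   /* in reality: re-check the clock here */
        pending--;
        dispatched++;
    }
    return dispatched;
}
```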
Re: Doubts about GPeriodic
Hi, On Thu, Oct 21, 2010 at 4:26 AM, Emmanuele Bassi wrote: > no, the GL context should *not* be per window. multiple GL contexts > usually pose more problems than they solve, in synchronization and > context switching, on basically all drivers - except maybe on nvidia[0]. Fair enough, I didn't realize it was actually possible to do one global context for all windows. I would still argue for per-window (so we can shut down unmapped/hidden windows and so widgets look at parent, not a global) Havoc
Re: Doubts about GPeriodic
Hi Owen, A few questions, if you don't mind me picking your brain: On Wed, 2010-10-20 at 14:58 -0400, Owen Taylor wrote: > The real problem is that the phases of the repaint cycle matter. We > don't just have a bunch of stuff we need to do every frame, we need to > do things in the order: > > * Process events The only benefit I see to processing input events at the start of a frame cycle is that we possibly get to merge them. We probably do want that. What about non-input events, though? Like, if some download is happening and packets are coming in and causing dispatches from the mainloop that we do not have control over. Do we try to ensure that those people treat this 'external stimulus' in the same way that we treat events (ie: possibly merge multiple incoming packets so that they update their UI state only once in response to it) or do we want them to just call the GTK API on the spot and risk duplicated work because that's quite a lot easier to understand? Maybe we should have some mechanism that allows them to choose. If we elect not to have that mechanism, then the input problem is actually quite easy, by virtue of the fact that there is only ever one person managing input from the X server. > * Update animations "tick" > * Update layout Clutter and GDK want to do two different things here, it seems. Presently (and almost by chance) GPeriodic is emitting a "tick" signal after running all of the tick functions and Emmanuele is using this to run "stage updates" on all of the Clutter stages. This is a little bit like Gtk managing its geometry but not exactly. In Gtk's case, we have a chance to do this a bit more programmatically - only run layout adjustments on widgets/windows that have been marked as requiring some resize (ie: toplevels and containers with GTK_RESIZE_QUEUE themselves). That could be handled from a 'tick' handler, or we could add some more hooks to GPeriodic. 
A reason that I think it makes sense to do layout updates separate from repainting is that layout updates can result in us wanting to change the size of our toplevels (eg: someone opened GtkExpander). This is a *really* tough problem because if that happens, we can't just paint. We have to wait until the window manager has actually given us our new size. I did some benchmarking, and that tends not to happen until about 1ms later (a bit less for metacity, a bit more for compiz and mutter). So do we block the mainloop for ~1-2ms and desperately suck from the X socket until we receive ConfigureNotify (at least until some timeout)? Do we skip drawing the window and wait until next frame if we have a pending ConfigureNotify? Is there some way we can abuse _NET_WM_SYNC_REQUEST to make this problem easier? On the Gtk experiment branch, layout updates are actually done pretty much "on the spot" right now (ie: you make some changes to the layout which will queue an idle that will run pretty much immediately). There have been no changes to this part yet. > * Repaint This part is what I originally intended the damage/repair mechanism to be used for. > If GTK+ and Clutter are working together in the same process, then we > still need to go through those phases in the same order and do > everything for each phase. > > It looks like GPeriodic has two phases: > > - Tick > - Repair > > Which I guess are meant to be "update animations" and "relayout and > repaint". I can sort of see how I can squeeze the right behavior out of > it given various assumptions. In particular, you need to only ever have > one repair function that does all the work of relayout then repaint > - you can't have separate repair functions for relayout and repaint. Or > for clutter and for GTK+. GPeriodic is probably going to need to gain some more phases, indeed. I don't plan to have relayout and repaint shoved into the same stage for the reasons listed above, but also for reasons of sanity. 
> But does an abstraction make sense at that point? If we need to > explicitly glue GTK+ into clutter or clutter into GTK+ using hooks > provided in GTK+ and Clutter, then all that GPeriodic is doing is saving > a bit of code reuse. Right. Another way we could do this is by having some hooks in Gtk: - do this - do that - do the other thing and have those clocked internally by Gdk in the Gtk-runs-itself case and by Clutter in the Clutter-runs-Gtk case. That certainly could make sense for the "set tasks" like layout, drawing, etc. In fact, all of these things could be driven by one-big-handler on the "tick" signal that GPeriodic currently emits. For timeline advancement, however (ie: the stuff that the user wants to do) I think an abstraction like GPeriodic is quite useful. It gives a common place that users can register their animation hooks that works the same way for both Clutter and Gtk. It prevents us from having some timeline system within Gtk that is clocked by Clutter telling Gtk "run all your timelines now!". Cody is cr
Re: Doubts about GPeriodic
hi Havoc, First I want to note that GPeriodic was only an attempt to get the timing right. It's just a clock. It is not in any way meant to provide policy about how a paint cycle is actually run. That said, I did make a Gtk branch with some crackful experimentation (currently shoving GPeriodic into gdkthreads in a global way). This is not meant to be "the way" -- it was just a convenient place to stick it for now so that we could experiment with getting some widgets animating using it. Of course, we're discovering that resize handling and stuff is quite difficult (another mail for all that stuff). It's worth noting, though, that Emmanuele was able to get Clutter's paint cycle working on top of it without modification, so there is something here... Anyway. GPeriodic is just a clock, so let's talk timing. On Thu, 2010-10-21 at 03:09 -0400, Havoc Pennington wrote: > Another thought, in the patch > periodic->last_run = now; > I think this will look a little rocky - the frames are going to > display at actual-screen-hz intervals, no matter what time it is when > you record last_run and no matter what time it is when you call your > drawing APIs. So things look better if you keep the "tween timestamp" > on hz intervals. The last_run time probably has very little to do with > when frames hit the display. Animations should go ahead and paint > assuming they are hitting the display at a fixed fps. This is something that gave me a great deal of pause, and is a rather interesting question. I'll attach the full code fragment in its current form, for context:

  /* Update the last_run.
   *
   * In the normal case we want to add exactly 1 tick to it.  This
   * ensures that the clock runs at the proper rate in the normal case
   * (ie: the dispatch overhead time is not lost).
   *
   * If more than one tick has elapsed, we set it equal to the current
   * time.  This has two purposes:
   *
   *  - sets last_run to something reasonable if the clock is running
   *    for the first time or after a long period of inactivity
   *
   *  - resets our stride if we missed a frame
   */
  now = g_periodic_get_microticks (periodic);
  elapsed_ticks = (now - periodic->last_run) / 100;
  g_assert (elapsed_ticks > 0);

  if G_LIKELY (elapsed_ticks == 1)
    periodic->last_run += 100;
  else
    periodic->last_run = now;

[[and yes, I'm using G_LIKELY strictly for documentation purposes]] In the usual case, it's true that the ticker (which is expressed in microframes, by the way) is advanced exactly one frame when dispatching from the main context (ie: free-running clock with no external synch information from vblank). The only place I disagree with you is on what to do when we want to skip a frame.

> In the litl shell fwiw the pseudocode for the tween time on each frame is:
>
> int frame_time = 1000 / fps;
> int actual_time = now - current_ticker_time;
> int frames_late = (actual_time / frame_time) - 1;
> current_ticker_time += frame_time;
> if (frames_late > 0) {
>   current_ticker_time += (frame_time * (frames_late + 1));
> }
>
> The idea of this is: decide to drop frames based on floor(frames_late) > and then skip ahead by ceil(frames_late). The point of that is to bias > against dropping a frame until we're a full frame behind, but then be > sure we drop enough frames to get ahead a bit when we do drop them, > and always stay on a multiple of the refresh rate.

This is an interesting proposal. The problem when the clock is free-running is that we don't know exactly what side of the vblank we're on. That's the point of resetting the stride (ie: assuming that the new top-of-the-frame time is now). That might end up being right, or it might be wrong. In the event that we are late for only one frame, it's a cointoss. 
On the other hand, if we are consistently dropping frames -and- are unaware of the vblank, I think I could mount a statistical argument that my approach is more likely to result in a more smooth/accurate animation (say, RMS of correct position vs. actual position for each frame that actually hits the monitor). Far more interesting, I think is in unblock(): periodic->last_run = g_periodic_get_microticks (periodic); In the event that we *do* have vblank information, we set the counter to exactly the wallclock time at some semi-random interval (namely: whenever our process bothered to notice the notification from the X server). I'm actively unhappy with that. I was talking with Emmanuele about the information that's in the vblank notification we get from the server. There is timestamp information there. I'd be quite a lot happier if we had a method to inject that information into GPeriodic (ie: a timestamp parameter on the unblock API). > Due to this and also the desire to not explode when the computer's > clock is set, I would define the ticker to be a monotonic value that > is in time units but is not a wall clock time. i.e. if I change my > computer's clock back an hour, the ticker should keep marching > forward, and the ticker is allowed to be fudged to make animations > pretty.
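[Editor's note: for reference, the last_run stride logic quoted from GPeriodic above can be restated as a self-contained function. update_last_run is an invented name for illustration; as noted in the mail, the ticker counts 100 microticks per frame.]

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_FRAME 100  /* GPeriodic's "microframe" granularity */

/* Advance the clock's notion of the last frame time.  In the common
 * case (exactly one frame elapsed) stride forward by exactly one
 * frame, so dispatch overhead is not lost; if we missed a frame (or
 * are starting after inactivity), reset the stride to "now". */
static int64_t
update_last_run (int64_t last_run, int64_t now)
{
    int64_t elapsed_ticks = (now - last_run) / TICKS_PER_FRAME;

    if (elapsed_ticks == 1)
        return last_run + TICKS_PER_FRAME;
    return now;
}
```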
Re: Doubts about GPeriodic
On Thu, 2010-10-21 at 09:26 +0100, Emmanuele Bassi wrote: > the g_source_get_current_time() function will use the monotonic clock on > Linux assuming you link against gthread; since gobject does that, we can > safely assume that it will be using a monotonic clock. sorry, I'm confusing myself with a patch from Maemo: https://bugzilla.gnome.org/show_bug.cgi?id=540545 that I thought we already applied... for monotonic sources, g_thread_gettime() is the monotonic clock API. nothing to see here, move along... :-) ciao, Emmanuele. -- W: http://www.emmanuelebassi.name B: http://blogs.gnome.org/ebassi
Re: Doubts about GPeriodic
On Thu, 2010-10-21 at 03:09 -0400, Havoc Pennington wrote: > Another issue, seems like the ticker needs to be per-native-window: > > * the GL context is per-window so the vsync mechanism also is no, the GL context should *not* be per window. multiple GL contexts usually pose more problems than they solve, in synchronization and context switching, on basically all drivers - except maybe on nvidia[0]. this has nothing to do with whether GPeriodic should be per-window or a global singleton - it's just that multiple contexts in a single process are generally a no-no. > Due to this and also the desire to not explode when the computer's > clock is set, I would define the ticker to be a monotonic value that > is in time units but is not a wall clock time. i.e. if I change my > computer's clock back an hour, the ticker should keep marching > forward, and the ticker is allowed to be fudged to make animations > pretty. the g_source_get_current_time() function will use the monotonic clock on Linux assuming you link against gthread; since gobject does that, we can safely assume that it will be using a monotonic clock. ciao, Emmanuele. [0] the first implementation of Clutter used multiple contexts, one for each top-level, and shared the texture list to allow sharing resources across top-levels; it did not have nice results - ironically, mostly on fglrx, so I couldn't even look at a solution. -- W: http://www.emmanuelebassi.name B: http://blogs.gnome.org/ebassi
Re: Doubts about GPeriodic
Another issue, seems like the ticker needs to be per-native-window: * the GL context is per-window so the vsync mechanism also is * we ought to shut down the ticker on windows that aren't visible * each screen has its own vsync and the window is the normal convention to imply a screen * the general principle that widgets should be getting context and state from parent widgets, in most cases ultimately from the toplevel - but by chaining through parents. Rather than from global singletons or state. attempted to explain in http://log.ometer.com/2010-09.html#19 (so any gtk widget that's a child of a clutter stage for example, would want to be asking that clutter stage for paint clock) Native windows would be either toplevels or embedded clutter/glarea widgets, generally. But maybe just saying "any native window can have its own clock" is right. There probably shouldn't even be a global API because using it would be broken, right? When not actually using GL or vsync, then I think all native windows could just inherit a single global ticker that would just be a timeout, but that's more of an implementation detail than an API thing. Another thought, in the patch periodic->last_run = now; I think this will look a little rocky - the frames are going to display at actual-screen-hz intervals, no matter what time it is when you record last_run and no matter what time it is when you call your drawing APIs. So things look better if you keep the "tween timestamp" on hz intervals. The last_run time probably has very little to do with when frames hit the display. Animations should go ahead and paint assuming they are hitting the display at a fixed fps. 
In the litl shell fwiw the pseudocode for the tween time on each frame is:

  int frame_time = 1000 / fps;
  int actual_time = now - current_ticker_time;
  int frames_late = (actual_time / frame_time) - 1;

  current_ticker_time += frame_time;
  if (frames_late > 0) {
    current_ticker_time += (frame_time * (frames_late + 1));
  }

The idea of this is: decide to drop frames based on floor(frames_late) and then skip ahead by ceil(frames_late). The point of that is to bias against dropping a frame until we're a full frame behind, but then be sure we drop enough frames to get ahead a bit when we do drop them, and always stay on a multiple of the refresh rate. Due to this and also the desire to not explode when the computer's clock is set, I would define the ticker to be a monotonic value that is in time units but is not a wall clock time. i.e. if I change my computer's clock back an hour, the ticker should keep marching forward, and the ticker is allowed to be fudged to make animations pretty. Havoc
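[Editor's note: Havoc's tween-time pseudocode above, restated as a runnable C function for clarity. The assumption here is that actual_time means the gap between the current monotonic time ("now") and the ticker; the function name and parameters are invented for illustration.]

```c
#include <assert.h>

/* Advance the tween ticker by one frame, and if we have fallen a
 * full frame or more behind, skip ahead far enough to get a little
 * ahead again -- always staying on a multiple of the frame time. */
static int
advance_ticker (int current_ticker_time, int now, int fps)
{
    int frame_time = 1000 / fps;
    int actual_time = now - current_ticker_time;        /* assumed meaning */
    int frames_late = (actual_time / frame_time) - 1;

    current_ticker_time += frame_time;
    if (frames_late > 0)
        current_ticker_time += frame_time * (frames_late + 1);

    return current_ticker_time;
}
```

At 50 fps (frame_time = 20ms): arriving exactly on time advances the ticker by one frame; arriving a fraction of a frame late still advances by one frame (the bias against dropping); arriving a full frame late skips ahead by two extra frames to get ahead again.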
Doubts about GPeriodic
A new GPeriodic class has popped up in GIO that's supposed to be the basis of a unified master clock implementation between Clutter and GTK+. I'm skeptical that any abstraction like GPeriodic can provide useful integration between Clutter and GTK+. The real problem is that the phases of the repaint cycle matter. We don't just have a bunch of stuff we need to do every frame, we need to do things in the order: * Process events * Update animations * Update layout * Repaint If GTK+ and Clutter are working together in the same process, then we still need to go through those phases in the same order and do everything for each phase. It looks like GPeriodic has two phases: - Tick - Repair Which I guess are meant to be "update animations" and "relayout and repaint". I can sort of see how I can squeeze the right behavior out of it given various assumptions. In particular, you need to only ever have one repair function that does all the work of relayout then repaint - you can't have separate repair functions for relayout and repaint. Or for clutter and for GTK+. But does an abstraction make sense at that point? If we need to explicitly glue GTK+ into clutter or clutter into GTK+ using hooks provided in GTK+ and Clutter, then all that GPeriodic is doing is saving a bit of code reuse. - Owen