Grr, I didn't realize the sysprof traces got so big. Sorry for that. Resending to both lists with external links.
---------- Forwarded message ----------
From: Kalle Vahlman <[EMAIL PROTECTED]>
Date: 8.10.2006 23:41
Subject: Re: Gtk performance issues from a user's point of view
To: [EMAIL PROTECTED], [email protected]

(adding cairo list as the first issue is highly relevant there too)
(second issue is GTK+, about half email down)
(goodness, this got a bit more verbose than I intended :)

2006/9/29, Federico Mena Quintero <[EMAIL PROTECTED]>:
> On Thu, 2006-09-28 at 20:13 +0200, Adalbert Dawid wrote:
>
> > 1. GtkTreeView's repaints are slow. This gets especially obvious, when you
> > perform one of the actions below:
> >   * Resize a column (i.e. change its width). You will notice that the
> >     header is badly lagging behind the mouse pointer, which gets worse as
> >     one enlarges the viewport.
> >   * Drag an icon from the Desktop over a big nautilus window with many
> >     files and directories in the list view mode. When you keep moving the
> >     icon over nautilus, the CPU goes up to 100% and the icon leaves an ugly
> >     white trail.
>
> What theme engine are you running?

This can indeed be a major factor when considering performance. For
example, here are some figures from a tester of mine with many buttons
and a big widget (which expands when resizing). Here is a "plain" run
with the builtin theme:

  _gtk_marshal_BOOLEAN__BOXED    0,03   43,98
  scw_view_expose                0,00   21,59
  gtk_label_expose               0,00    6,77
  meta_frames_expose_event       0,00    5,99
  gtk_container_expose           0,08    2,92
  gtk_button_expose              0,05    2,45
  ...

So that looks about right I guess: ScwView renders a TreeModel with
lots of text and expands both horizontally and vertically, whereas the
buttons only expand their width. So it's only natural for it to take
considerably more effort. The resizing was lagging a bit, but that's
also not news on my low-end laptop.

But, what happens with clearlooks? This:

  gtk_marshal_BOOLEAN__BOXED     0,03   46,28
  gtk_button_expose              0,03   24,88
  scw_view_expose                0,00    8,96
  ...

Ok, the buttons are sweet, but not _that_ sweet :) The resizing lags
noticeably. Looking at what eats the most percentages, I found the new
cairo tessellator there:

  _cairo_bentley_ottmann_tessellate_polygon                   0,58   20,68
  _cairo_bo_event_queue_insert_if_intersect_below_current_y   0,56   14,81
  _cairo_int128_divrem                                        0,19   13,22
  _cairo_uint128_divrem                                       4,79   11,75
  _cairo_uint128_rsl                                          1,92    1,92
  _cairo_uint128_lt                                           1,73    1,73
  _cairo_uint128_lsl                                          1,64    1,64
  _cairo_uint128_eq                                           0,86    0,86
  _cairo_uint128_add                                          0,39    0,39
  _cairo_uint128_sub                                          0,36    0,36
  ...

It mattered very little even when I commented out the fancy shadows
and only drew a rounded rectangle (lines & arcs): the button expose
maintained its position as the major (application-side) CPU eater.

The new tessellator was supposed to be up to four times faster than the
old one, but running the same test with the old one yields a different
result:

  _gtk_marshal_BOOLEAN__BOXED    0,05   33,43
  gtk_button_expose              0,00   10,91
  scw_view_expose                0,02   10,13
  gtk_label_expose               0,00    3,54
  meta_frames_expose_event       0,00    3,16

Also, the path to tessellation seems very different, but that's
probably to be expected...

So, now I toss the ball to Carl's corner: am I misinterpreting, or is
it a regression in the tessellator? Do the traces look plausible?

The actual code that Clearlooks uses to draw is at

  http://cvs.gnome.org/viewcvs/gtk-engines/engines/clearlooks/src/clearlooks_draw.c?view=markup

(clearlooks_draw_button), but as I said, it seemed to matter only
little what was drawn.

The checkouts from the new-tesselator2 and master branches for the
tests should be up-to-date as far as I know.

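As a rough cross-check of the two Clearlooks runs quoted above, here is a
minimal Python sketch that parses the rows (which use a decimal comma) and
expresses gtk_button_expose as a share of the marshaller row. It assumes
that the two numeric columns are self and cumulative percentages, and that
the marshaller row is the parent of the expose handlers; neither is stated
in the traces. The helper names (parse_rows, share_of) are made up for
illustration.

```python
def parse_rows(text):
    """Parse 'name self total' rows that use a decimal comma."""
    rows = {}
    for line in text.strip().splitlines():
        name, self_pct, total_pct = line.split()
        # Strip leading underscores so both spellings of the marshaller match.
        rows[name.lstrip("_")] = (
            float(self_pct.replace(",", ".")),
            float(total_pct.replace(",", ".")),
        )
    return rows


def share_of(rows, child, parent):
    """Cumulative percentage of `child` divided by that of `parent`."""
    return rows[child][1] / rows[parent][1]


# Rows copied from the two Clearlooks profiles in this mail.
NEW_TESSELLATOR = """
gtk_marshal_BOOLEAN__BOXED 0,03 46,28
gtk_button_expose 0,03 24,88
scw_view_expose 0,00 8,96
"""

OLD_TESSELLATOR = """
_gtk_marshal_BOOLEAN__BOXED 0,05 33,43
gtk_button_expose 0,00 10,91
scw_view_expose 0,02 10,13
"""

new_rows = parse_rows(NEW_TESSELLATOR)
old_rows = parse_rows(OLD_TESSELLATOR)

new_share = share_of(new_rows, "gtk_button_expose", "gtk_marshal_BOOLEAN__BOXED")
old_share = share_of(old_rows, "gtk_button_expose", "gtk_marshal_BOOLEAN__BOXED")

print(f"button expose share, new tessellator: {new_share:.1%}")
print(f"button expose share, old tessellator: {old_share:.1%}")
```

Note that these are shares of samples within each profile, not absolute
times, so they only say where the time went in each run, not how long the
run took.
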
Files:

http://kalle.vahlman.googlepages.com/scw-row-test.png
 * A screenshot of the test app (available from the sources of Scw)

http://kalle.vahlman.googlepages.com/scw-row-test-clearlooks-profile.xml
 * Profile of resizing the window with clearlooks and new tessellator

http://kalle.vahlman.googlepages.com/scw-row-test-clearlooks-oldtesselator-profile.xml
 * Profile of resizing the window with clearlooks and old tessellator

[...]

> With my setup, an empty window resizes quite quicly. It is a bit slower
> when the window gets close to full-screen (1400x1050), but I can't get
> it to lag or anything.
>
> It's interesting to note that if I add a single button to the window,
> then it gets noticeably slower (but it doesn't lag, either). Then,
> Sysprof says that 69% of the time is spent in libfb/libxaa in the X
> server, not GTK+ itself.

My setup is slow enough to get real lagging ;) but the fbComposite
stuff is a big portion of my X percentage too. I'm just curious whether
that is actually expected (and present with Qt etc). Can anyone explain
what those functions actually do, and whether they are supposed to be
doing it a lot?-)

I found an issue with GTK+ and the expose events generated in response
to the configure event when I wrote a simple svg-loading app, disabled
double-buffering and loaded a complex-enough svg. It struck me as odd
how many times the svg was actually rendered in response to expose
events (ok, it could be rendered to a backbuffer, but that's hardly the
point with resizing).

This is what happens with the two signals when I stretch the window
vertically in one swift go (the original size is 400x300):

  configure(400x301+5+24)
  expose(400x1+0+300)
  expose(400x301+0+0)
  configure(400x571+5+24)
  expose(400x270+0+301)
  expose(400x571+0+0)

The first expose is 400x1 pixels at y=300, i.e. the first new row. I
guess it means that GTK+ reacts first and the
configure-event-collapsing kicks in only after that. But the second
expose pair shows something which feels to me like a mistake.

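The rectangles in the log above can be checked with a little arithmetic.
The following Python sketch assumes the geometry strings read as
width x height + x + y (which matches the "400x1 pixels at 300" reading)
and computes how much of the second expose pair covers the same pixels.
The helper names (parse_rect, overlap_area) are hypothetical and not part
of GTK+.

```python
import re


def parse_rect(text):
    """Parse 'WxH+X+Y' into an (x, y, width, height) tuple."""
    match = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", text)
    if match is None:
        raise ValueError(f"not a geometry string: {text!r}")
    width, height, x, y = (int(group) for group in match.groups())
    return x, y, width, height


def area(rect):
    """Number of pixels covered by the rectangle."""
    return rect[2] * rect[3]


def overlap_area(a, b):
    """Number of pixels covered by both rectangles (0 if disjoint)."""
    left = max(a[0], b[0])
    top = max(a[1], b[1])
    right = min(a[0] + a[2], b[0] + b[2])
    bottom = min(a[1] + a[3], b[1] + b[3])
    return max(0, right - left) * max(0, bottom - top)


# The second expose pair from the log: new strip first, then full window.
new_strip = parse_rect("400x270+0+301")
full_window = parse_rect("400x571+0+0")

twice = overlap_area(new_strip, full_window)
print(f"new strip:   {area(new_strip)} px")
print(f"full window: {area(full_window)} px")
print(f"requested twice: {twice} px "
      f"({twice / area(full_window):.1%} of the window)")
```

Under that reading, the new strip lies entirely inside the full-window
rectangle, so every pixel of the first expose is requested again by the
second one.
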
The first expose is for the new area of the window. This causes a full
redraw in my test case anyway, even though the clipping mask is set.
The second is for the full window area, which of course means we'll
draw it again, making the first draw totally unnecessary.

I'm guessing this is so (assuming it's intentional) in order to make
the empty space fill up faster and only after that fill the rest. But
then it would make sense to send the later expose only for the part
that wasn't already updated, not for the whole widget. And if you think
about it, how many widgets are actually capable of drawing like that
without drawing the whole thing twice?

FWIW, the test case I did was naturally considerably faster in resizing
after I rendered only in response to full exposes (but this is of
course not a solution, as you need to refresh partly obscured areas of
the widgets).

IMO sending only a full expose would be the way to go here, but first I
need to find the code that is causing this and test without the "extra"
expose to see if it really makes a difference in real apps... Anyone
know it already?

--
Kalle Vahlman, [EMAIL PROTECTED]
Powered by http://movial.fi
Interesting stuff at http://syslog.movial.fi
_______________________________________________
Performance-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/performance-list
