On Tue, Nov 29, 2011 at 11:29 PM, Christoph Bartoschek <bartosc...@or.uni-bonn.de> wrote:
> On 29.11.2011 23:19, Maarten Maathuis wrote:
>>
>> On Tue, Nov 29, 2011 at 2:33 PM, Christoph Bartoschek
>> <bartosc...@or.uni-bonn.de> wrote:
>>>
>>> Hi,
>>>
>>> I am moving the thread "EXA performance problem" from xorg to
>>> xorg-devel and hope to get some help here.
>>>
>>> To sum up the problem: We use an application that displays vector
>>> pictures. We use it mostly to display pictures with millions of
>>> rectangles. Using our old X11 thin clients (XFree86) the performance
>>> was acceptable. The speed was about 1 million rectangles per second.
>>> After upgrading to newer thin clients (Xorg) the performance dropped
>>> significantly.
>>>
>>> I have a test case where displaying the picture now takes 90 seconds.
>>> It was below one second on the older thin clients.
>>>
>>> The profiler says that 95% of the runtime is spent in pixman region
>>> operations.
>>>
>>> The application draws polyRectangle most of the time. And I see that
>>> nearly 100% of the time is spent in damagePolyRectangle and the
>>> functions below it.
>>>
>>> 33% of the time in damagePolyRectangle is spent in the while loop
>>> that constructs the damage region. The algorithm runs in O(n^2)
>>> because it adds one rectangle at a time. This can be fixed by
>>> constructing the damage region in one step. The attached patch does
>>> this.
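
The quadratic cost of adding one rectangle at a time can be illustrated with a small standalone sketch. It is not the attached patch and uses no pixman or X server code: the "region" is only an array of boxes kept sorted by (y, x), and toy_box, toy_build_incrementally and toy_build_at_once are hypothetical names made up for this example. Inserting boxes one by one may shift the whole array on every insertion, while collecting all boxes first and sorting once does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical box type for this sketch; not the pixman or X server one. */
typedef struct { int x, y, w, h; } toy_box;

/* Order boxes by y, then x: a stand-in for a region's sorted layout. */
static int toy_box_cmp(const void *pa, const void *pb)
{
    const toy_box *a = pa, *b = pb;
    if (a->y != b->y)
        return (a->y > b->y) - (a->y < b->y);
    return (a->x > b->x) - (a->x < b->x);
}

/* Incremental build: insert each box into the already sorted output.
 * Every insertion may shift the whole tail, so n boxes can cost up to
 * n*(n-1)/2 element moves. Returns the number of moves performed.
 * out must have room for n boxes. */
size_t toy_build_incrementally(const toy_box *in, size_t n, toy_box *out)
{
    size_t moves = 0;
    assert(n == 0 || (in != NULL && out != NULL));
    for (size_t count = 0; count < n; count++) {
        size_t pos = 0;
        /* Find the insertion point among the boxes placed so far. */
        while (pos < count && toy_box_cmp(&out[pos], &in[count]) <= 0)
            pos++;
        /* Shift the tail up by one slot to make room. */
        memmove(&out[pos + 1], &out[pos], (count - pos) * sizeof *out);
        moves += count - pos;
        out[pos] = in[count];
    }
    return moves;
}

/* One-step build: copy everything, then sort once (O(n log n) compares). */
void toy_build_at_once(const toy_box *in, size_t n, toy_box *out)
{
    assert(n == 0 || (in != NULL && out != NULL));
    memcpy(out, in, n * sizeof *out);
    qsort(out, n, sizeof *out, toy_box_cmp);
}
```

Both functions produce the same sorted array. With 1000 boxes supplied in descending y order, the incremental version performs 1000*999/2 element moves, which is the kind of growth the profile above points at.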
>>>
>>> However after fixing this most of the time is spent in
>>> ExaCheckPolylines which is called by this chain:
>>>
>>> damagePolyRectangle -> miPolyRectangle -> exaPolylines ->
>>> ExaCheckPolylines
>>>
>>> I've measured the runtime of the steps in ExaCheckPolylines:
>>>
>>> void
>>> ExaCheckPolylines (DrawablePtr pDrawable, GCPtr pGC,
>>>                    int mode, int npt, DDXPointPtr ppt)
>>> {
>>>     EXA_PRE_FALLBACK_GC(pGC);
>>>     EXA_FALLBACK(("to %p (%c), width %d, mode %d, count %d\n",
>>>                   pDrawable, exaDrawableLocation(pDrawable),
>>>                   pGC->lineWidth, mode, npt));
>>>
>>>     exaPrepareAccess (pDrawable, EXA_PREPARE_DEST);        // Step1: 55 s
>>>     exaPrepareAccessGC (pGC);                              // Step2: 2.4 s
>>>     pGC->ops->Polylines (pDrawable, pGC, mode, npt, ppt);  // Step3: 2.4 s
>>>     exaFinishAccessGC (pGC);                               // Step4: 2.2 s
>>>     exaFinishAccess (pDrawable, EXA_PREPARE_DEST);         // Step5: 2.2 s
>>>     EXA_POST_FALLBACK_GC(pGC);
>>> }
>>>
>>> We see that exaPrepareAccess needs most of the time. Is that expected?
>>
>> I don't know which driver this is (and which type of EXA), but worst
>> case scenario the destination is a tiled frontbuffer that gets copied
>> back and forth for every operation (you want to see the framebuffer,
>> so you can't wait). If it's done using a hardware copy the software
>> needs to wait for the copy to be finished. The other way around can be
>> faster (and relatively non-blocking) depending on how it's
>> implemented. I think the interfaces inside the xserver are the main
>> reason it's done this way. The truth is that the whole thing was never
>> designed for modern hardware, so EXA can only do so much. You could
>> define new interfaces inside the xserver, but if your app does a call
>> for each rectangle, then that won't help much. At some point it
>> becomes easier to change the app if you can (rendering to a pixmap
>> instead of the frontbuffer should help a lot already if you are
>> bottlenecked by frontbuffer copies).
>>
>>>
>>> Inside there are several operations on the damage region. This makes
>>> damagePolyRectangle a quadratic algorithm.
>>>
>>> For N rectangles the damage region has O(N) rectangles. And for each
>>> rectangle there are operations on the damage region. The result is
>>> O(N^2).
>>>
>>> Is it necessary to call exaPrepareAccess for each of the rectangles?
>>
>> No, but unless the app gives you all rectangles at once I don't see
>> any other way.
>
> I do not know whether it gives all rectangles at once. But I see that
> damagePolyRectangle is called with chunks of 2044 rectangles.

Then consider making a multiPolylines or multiPolyRectangle interface
or something like that, then you can override the mi implementation in
exa.

>
> It is miPolyRectangle that iterates over all rectangles.
>
> Christoph
>
>

-- 
Far away from the primal instinct, the song seems to fade away, the
river get wider between your thoughts and the things we do and say.
_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel
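
To make the batching suggestion concrete, here is a minimal standalone sketch of why a multi-rectangle entry point could help. It is not EXA code: toy_drawable, toy_prepare_access, toy_finish_access and both toy_poly_rectangle_* functions are hypothetical stand-ins, and the expensive prepare/finish step is only modelled as a counter. The assumption is that the cost is dominated by how often access is prepared, not by the drawing itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle and drawable types for this sketch only. */
typedef struct { int x, y, w, h; } toy_rect;

typedef struct {
    unsigned prepare_calls;  /* how often the expensive prepare step ran */
    unsigned finish_calls;   /* how often the matching finish step ran */
    unsigned rects_drawn;    /* rectangles handed to the software path */
} toy_drawable;

/* Stand-ins for an expensive prepare/finish pair around software access. */
void toy_prepare_access(toy_drawable *d) { d->prepare_calls++; }
void toy_finish_access(toy_drawable *d)  { d->finish_calls++; }

/* Stand-in for the software drawing of one rectangle outline. */
void toy_draw_outline(toy_drawable *d, const toy_rect *r)
{
    (void)r;  /* the geometry does not matter for the call counts */
    d->rects_drawn++;
}

/* Per-rectangle pattern: generic code iterates over the rectangles and
 * each one reaches the fallback on its own, so prepare/finish run n times. */
void toy_poly_rectangle_each(toy_drawable *d, const toy_rect *rects, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_prepare_access(d);
        toy_draw_outline(d, &rects[i]);
        toy_finish_access(d);
    }
}

/* Batched pattern: one entry point receives the whole chunk, so the
 * prepare/finish pair brackets all n rectangles exactly once. */
void toy_poly_rectangle_batched(toy_drawable *d, const toy_rect *rects, size_t n)
{
    if (n == 0)
        return;
    toy_prepare_access(d);
    for (size_t i = 0; i < n; i++)
        toy_draw_outline(d, &rects[i]);
    toy_finish_access(d);
}
```

For a chunk of 2044 rectangles, as reported in the thread, the per-rectangle variant runs the prepare step 2044 times and the batched variant runs it once, while both draw the same number of rectangles.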