Carsten Haitzler (The Rasterman) wrote:
> aaaaah shared framebuffer. ok. then i can understand some of the "hurt" :)
>
>>What values do you get on your hardware? (Unscaled only, the rest is
>>entirely depending on the CPU)
>
>
> for the benchmarker:
> *** ROUND 1 ***
> ---------------------------------------------------------------
> Test: Test Xrender doing non-scaled Over blends
> Time: 12.445 sec.
> ---------------------------------------------------------------
> Test: Test Xrender (offscreen) doing non-scaled Over blends
> Time: 10.056 sec.
> ---------------------------------------------------------------
> Test: Test Imlib2 doing non-scaled Over blends
> Time: 0.332 sec.

That is strange. Without acceleration, I get

9.7
1.7
2.3

Seems imlib uses the video RAM.

> (xrender doesnt have accel turned on there. if i turn it on it bats imlib 2 by
> 3-4 times. i cant get to that box right now... thanks to my isp being screwed)
> :)

That's about the same factor as I get here.

> i dont have the old code working anymore for the gl engine i had - i just
> remember getting full screen 1600x1200 composities and scales going from
> somewhere like 2 to 50+ fps once opengl got slid under the bonnet.

No GL for newer SiS chips, therefore I need to use the 2D engine.

>>I need to copy the texture to video RAM once, unless somebody tells me
>>it already is there (I could use the 2D accelerator for this, too, then.
>>In this case I just wonder why the mga driver doesn't do it this way.).
>
>
> is this per composite? or just the first time it (the pixmap lets say) is
> created?

Frankly, I don't know. I haven't looked into the composite function yet.

>>Since the accelerator does not sync after initiating the command, using
>>the provided memory area is unsecure. The app might reuse it for
>>something else before the command is actually executed. Syncing after
>>the command is insane (because it could take forever, depending on the
>>amount of commands already in the queue - and this queue is BIG)
>
>
> hmmm do you have a way of knowing where the accelerator is up to?

Yes, I can check the queue location anytime. But doing this before every
accelerator command slows down the whole stuff dramatically.

> ie interrupts
> etc?

No interrupts.

>>It's a fast CPU with fast RAM, and a slow GPU with memory shared with
>>the CPU. More can't be expected, I guess.
>
> 2.3 seconds for a blend doesnt smell like a fast cpu :) my 1.7ghz athlon gets
> the 1:1 blends done in 0.3 secs or so. so technically my desktop cpu (ram/bus
> etc.) is still double the speed of your sis gfx chips :)

imlib probably handles this in video RAM. This is a 2.0Ghz P4.
As of now, I still consider this quite a fast one...

>>Text drawing (x11perf -aa24text) went from 25000 to 105000, which is
>>more than factor 4. I am satisfied. (Now, if I just could find out why
>>the accelerator functions are not being called on my 4.3 system...)
>
> thats good :) though i still like to compare x performance against external code
> (ie like imlib2 - or use gdk-pixbuf, or anything else) and always try to at
> least equal your "software rivals" :) i really want to see x accelerating where
> hardware can and beating the PANTS off any software code :)

Up to a certain degree, yes. However, if we want to keep people from
screaming "X is bloated", we need to have some generic functions.
imlib'n stuff might contain assembler/MMX routines for special
situations, which will beat the generic X routines, of course. That will
not change. (Not speaking about portability here.)

Furthermore, using the accelerator for small tasks (eg blitting a glyph)
won't be much faster than doing it by the CPU, since the engine setup is
about the same amount of code as blitting eg. 64 bits into the
framebuffer by the CPU.

>>Hm, the background never looks like that gfx during render tests. I just
>>see a rainbow-like gradient from top/left to bottom/right... (no matter
>>whether with or without the accleration)

I'll send you a screen shot (per private mail)

>>That could be a problem. So far, I haven't found a suitable hook for
>>this (and replacing the entire composite function seems a bit far
>>fetched at the moment)
>
> hmmm but likely you would be better wrapping it. special case the 1:1 (as
> currently) on a per call basis (do it within the call) and detect certain
> transforms (ie non rotation/skew ones) since scaling blitters often only do
> simple pixel scaling - not full matrix transforms.
> (tho my knowledge of this may
> be waaay out of date by now), and pump them through the acclerator :)

Since the composite uses as good as no internal hooks, it's either all
or nothing. And this function does MUCH. I'll have a closer look on the
weekend.

Thomas

--
Thomas Winischhofer
Vienna/Austria
thomas AT winischhofer DOT net *** http://www.winischhofer.net/
twini AT xfree86 DOT org
_______________________________________________
Devel mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/devel