Carsten Haitzler (The Rasterman) wrote:
> aaaaah shared framebuffer. ok. then i can understand some of the "hurt" :)
> 
> 
>>What values do you get on your hardware? (Unscaled only, the rest is
>>entirely depending on the CPU)
> 
> 
> for the benchmarker:
> *** ROUND 1 ***
> ---------------------------------------------------------------
> Test: Test Xrender doing non-scaled Over blends
> Time: 12.445 sec.
> ---------------------------------------------------------------
> Test: Test Xrender (offscreen) doing non-scaled Over blends
> Time: 10.056 sec.
> ---------------------------------------------------------------
> Test: Test Imlib2 doing non-scaled Over blends
> Time: 0.332 sec.

That is strange. Without acceleration, I get (in seconds, same order as
your three tests):

Xrender:              9.7
Xrender (offscreen):  1.7
Imlib2:               2.3

Seems imlib uses the video RAM.

> (xrender doesnt have accel turned on there. if i turn it on it beats imlib2 by
> 3-4 times. i cant get to that box right now... thanks to my isp being screwed)
> :)

That's about the same factor as I get here.

> i dont have the old code working anymore for the gl engine i had - i just
> remember getting full screen 1600x1200 composites and scales going from
> somewhere like 2 to 50+ fps once opengl got slid under the bonnet.

No GL for newer SiS chips, therefore I need to use the 2D engine.

>>I need to copy the texture to video RAM once, unless somebody tells me
>>it already is there (I could use the 2D accelerator for this, too, then.
>>In this case I just wonder why the mga driver doesn't do it this way.).
> 
> 
> is this per composite? or just the first time it (the pixmap lets say) is
> created?

Frankly, I don't know. I haven't looked into the composite function yet.

>>Since the accelerator does not sync after initiating the command, using
>>the provided memory area is unsafe. The app might reuse it for
>>something else before the command is actually executed. Syncing after
>>the command is insane (because it could take forever, depending on the
>>amount of commands already in the queue - and this queue is BIG)
> 
> 
> hmmm do you have a way of knowing where the accelerator is up to?

Yes, I can check the queue location at any time. But doing this before every
accelerator command slows everything down dramatically.
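
One way around the cost might be to check the queue position only when a
memory area is about to be reused, not before every command. A rough
sketch in C of that idea follows. All names (sketch_buffer, sketch_submit,
hw_read_pos and so on) are made up for illustration and are not the actual
driver code, and the hardware read pointer is simulated by a plain
variable where a real driver would read a register:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, NOT the real driver code. */

/* Simulated engine state; a real driver would read this from the chip. */
uint32_t hw_read_pos;   /* running count of queue bytes the engine has consumed */
uint32_t sw_write_pos;  /* running count of queue bytes the driver has queued */

typedef struct {
    uint32_t fence;   /* value of sw_write_pos right after the last command
                         that references this memory area */
    int      in_use;  /* nonzero while a queued command may still read it */
} sketch_buffer;

/* Queue a command of cmd_bytes that reads from buf. No sync happens here:
 * we only remember how far the engine must get before buf is free again. */
void sketch_submit(sketch_buffer *buf, uint32_t cmd_bytes)
{
    assert(buf != NULL);
    sw_write_pos += cmd_bytes;
    buf->fence = sw_write_pos;
    buf->in_use = 1;
}

/* Called only when the memory area is about to be reused.
 * Returns 1 if the engine has passed the recorded fence, 0 otherwise. */
int sketch_buffer_idle(sketch_buffer *buf)
{
    assert(buf != NULL);
    if (!buf->in_use)
        return 1;
    /* The signed difference keeps the comparison valid when the 32-bit
     * counters wrap (relies on the usual two's complement conversion). */
    if ((int32_t)(hw_read_pos - buf->fence) >= 0) {
        buf->in_use = 0;
        return 1;
    }
    return 0;
}
```

With this, the per-command path only does an addition and a store; the
(expensive) position check is paid once per reuse of a memory area.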

> ie interrupts
> etc? 

No interrupts.

>>It's a fast CPU with fast RAM, and a slow GPU with memory shared with
>>the CPU. More can't be expected, I guess.
> 
> 2.3 seconds for a blend doesnt smell like a fast cpu :) my 1.7ghz athlon gets
> the 1:1 blends done in 0.3 secs or so. so technically my desktop cpu (ram/bus
> etc.) is still double the speed of your sis gfx chips :)

imlib probably handles this in video RAM. This is a 2.0 GHz P4. As of
now, I still consider this quite a fast one...

>>Text drawing (x11perf -aa24text) went from 25000 to 105000, which is
>>more than factor 4. I am satisfied. (Now, if I just could find out why
>>the accelerator functions are not being called on my 4.3 system...)
> 
> thats good :) though i still like to compare x performance against external code
> (ie like imlib2 - or use gdk-pixbuf, or anything else) and always try to at
> least equal your "software rivals" :) i really want to see x accelerating where
> hardware can and beating the PANTS off any software code :)

Up to a certain degree, yes. However, if we want to keep people from
screaming "X is bloated", we need to have some generic functions.
imlib and the like might contain assembler/MMX routines for special
situations, which will beat the generic X routines, of course. That will
not change. (Not speaking about portability here.)

Furthermore, using the accelerator for small tasks (e.g. blitting a glyph)
won't be much faster than doing it by the CPU, since the engine setup is
about the same amount of code as blitting e.g. 64 bits into the
framebuffer by the CPU.
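
To illustrate that trade-off, here is a rough sketch of a size cutoff.
The threshold value and all names are made up for illustration; a real
cutoff would have to be measured per chip, and nothing here is taken from
an existing driver:

```c
#include <assert.h>

/* Hypothetical cutoff in pixels; an assumption, not a measured value. */
#define SKETCH_MIN_ACCEL_PIXELS 64

enum sketch_path { SKETCH_PATH_CPU, SKETCH_PATH_ACCEL };

/* Pick the CPU for operations so small that programming the engine
 * would cost about as much as simply writing the pixels directly. */
enum sketch_path sketch_choose_path(int width, int height)
{
    assert(width >= 0 && height >= 0);
    long pixels = (long)width * (long)height;
    return pixels < SKETCH_MIN_ACCEL_PIXELS ? SKETCH_PATH_CPU
                                            : SKETCH_PATH_ACCEL;
}
```

A small glyph of, say, 8x4 pixels would take the CPU path here, while a
640x480 blit would go to the engine.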

>>Hm, the background never looks like that gfx during render tests. I just
>>see a rainbow-like gradient from top/left to bottom/right... (no matter
>>whether with or without the acceleration)

I'll send you a screenshot (by private mail).

>>That could be a problem. So far, I haven't found a suitable hook for
>>this (and replacing the entire composite function seems a bit far
>>fetched at the moment)
> 
> hmmm but likely you would be better wrapping it. special case the 1:1 (as
> currently) on a per call basis (do it within the call) and detect certain
> transforms (ie non rotation/skew ones) since scaling blitters often only do
> simple pixel scaling - not full matrix transforms. (tho my knowledge of this may
> be waaay out of date by now), and pump them through the accelerator :)

Since the composite function has next to no internal hooks, it's either all
or nothing. And this function does MUCH. I'll have a closer look over the
weekend.
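
For the transform detection itself (telling plain scaling apart from
rotation/skew), something along these lines might do. This is only a
sketch: it assumes a 3x3 matrix of 16.16 fixed-point values, and the type
and function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; assumed layout is a 3x3 matrix in 16.16 fixed point. */
typedef int32_t sketch_fixed;
#define SKETCH_FIXED_ONE ((sketch_fixed)0x10000)

typedef struct {
    sketch_fixed m[3][3];
} sketch_transform;

/* Returns 1 if the transform only scales and translates, i.e. it has no
 * rotation, skew, mirroring or projective part; 0 otherwise. */
int sketch_is_scale_only(const sketch_transform *t)
{
    assert(t != NULL);
    return t->m[0][1] == 0 && t->m[1][0] == 0 &&   /* no rotation/skew */
           t->m[2][0] == 0 && t->m[2][1] == 0 &&   /* no projective part */
           t->m[2][2] == SKETCH_FIXED_ONE &&
           t->m[0][0] > 0 && t->m[1][1] > 0;       /* no mirroring */
}

/* Returns 1 for the unscaled, untranslated 1:1 case. */
int sketch_is_identity(const sketch_transform *t)
{
    return sketch_is_scale_only(t) &&
           t->m[0][0] == SKETCH_FIXED_ONE && t->m[1][1] == SKETCH_FIXED_ONE &&
           t->m[0][2] == 0 && t->m[1][2] == 0;
}
```

A wrapper could then send the identity and scale-only cases to the
accelerator and hand everything else to the generic software path.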

Thomas


-- 
Thomas Winischhofer
Vienna/Austria
thomas AT winischhofer DOT net          *** http://www.winischhofer.net/
twini AT xfree86 DOT org



