> On Oct 20, 2015, at 4:23 PM, Jose Fonseca <jfons...@vmware.com> wrote:
> 
> I tried it on my i7-5500U, but I run into two issues:
> 
> - OpenSWR seems to only use 2 threads (even though my system supports 4 
> threads)
> 
> - and even when I compensate llvmpipe to only use 2 rasterizer threads, I 
> still only get half the framerate of llvmpipe with the "gloss" Mesa demo (a 
> very simple texturing demo):
> 
> $ ./gloss
> SWR create screen!
> This processor supports AVX2.
> 720 frames in 5.004 seconds = 143.885 FPS
> 737 frames in 5.005 seconds = 147.253 FPS
> 729 frames in 5.004 seconds = 145.683 FPS
> 732 frames in 5.002 seconds = 146.341 FPS
> 735 frames in 5.001 seconds = 146.971 FPS
> [...]
> $ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=2 ./gloss
> 1539 frames in 5.002 seconds = 307.677 FPS
> 1719 frames in 5 seconds = 343.8 FPS
> 1780 frames in 5.002 seconds = 355.858 FPS
> 1497 frames in 5.002 seconds = 299.28 FPS
> 1548 frames in 5.001 seconds = 309.538 FPS
> [..]
> 
> I see a similar ratio with a more complex workload, using the trace from:
> 
>  http://people.freedesktop.org/~jrfonseca/traces/furmark-1.8.2-svga.trace
> 
> (you'll need to download https://github.com/apitrace/apitrace and build)
> 
> My questions are:
> 
> - Is this the expected performance when texturing is used? Or is there 
> something wrong with my setup?
> 

Two things are happening here to cause the behavior you’re seeing.  First, 
OpenSWR only creates as many threads as there are physical cores.  On our 
workloads, going beyond that and using hyperthreads gave minimal gains or 
even hurt performance.  Second, one thread is reserved for the API thread, which 
does not participate in either frontend (geometry) or backend (fragment) work.  
Thus on your two-core 5500U, OpenSWR had only one raster thread versus 
llvmpipe’s two, giving half the performance.  If you want OpenSWR to use 
hyperthreads, set the environment variable KNOB_MAX_THREADS_PER_CORE=0.
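
As a rough illustration of that thread accounting, here is a small Python 
sketch.  It is only a simplified model of the rules described above, not 
OpenSWR code: the function name is hypothetical, and the default of one 
thread per core is an assumption inferred from "threads equal to the number 
of physical cores".

```python
# Simplified model of the thread accounting described in this mail.
# Not taken from the OpenSWR source; names and defaults are assumptions.

def swr_worker_threads(physical_cores, hw_threads_per_core,
                       max_threads_per_core=1):
    """Estimate how many threads are left for frontend/backend work.

    max_threads_per_core mirrors KNOB_MAX_THREADS_PER_CORE, where 0 is
    taken to mean "no limit" (use every hardware thread on each core).
    """
    if max_threads_per_core == 0:
        per_core = hw_threads_per_core
    else:
        per_core = min(max_threads_per_core, hw_threads_per_core)
    total = physical_cores * per_core
    # One thread is reserved for the API thread and does no raster work.
    return total - 1


# Two-core, four-thread part such as the i7-5500U, default settings.
print(swr_worker_threads(2, 2))
# Same part with hyperthreads enabled (KNOB_MAX_THREADS_PER_CORE=0).
print(swr_worker_threads(2, 2, max_threads_per_core=0))
```

Under this model the 5500U gets one worker thread by default and three with 
hyperthreads enabled, while llvmpipe with LP_NUM_THREADS=2 uses two rasterizer 
threads.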

> I understand that OpenSWR actually leverages llvmpipe's (well, gallivm's) code 
> for texture sampling, so I was expecting a smaller gap.

Yes, we use gallivm’s texture sampler so our performance should be similar on 
texture-limited workloads.  I tried a quick test of openarena on a 4-core 
machine and the performance delta was about 6% (default N-1 OpenSWR worker 
threads).

> - What exactly was the benchmark used for SWR_Sept15.pdf's figures? Was 
> there any texture sampling used in it, or was it just simple lighting?

I don’t have the apitrace in front of me, but I believe the turbulence data was 
two-sided lit, with a textured plane.

Tim

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
