On Sep 11, 2014, at 2:51 PM, Stastny, Bret <[email protected]> wrote:

>  
> It looks like you are using 25 threads? We found that the default bin 
> count of the TileCache is 32, and if your threads spend a majority of 
> their time reading texture data, 32 is not enough and the threads end up 
> blocking each other. We typically use 128 bins and have found this 
> improved performance in our application; that has been successful for up 
> to 60 concurrent threads.


That's a lot of threads!

Yeah, I set the default cache bin count to about twice the number of threads I 
expected anybody to be using for the next couple of years (where I work, our 
biggest machines in production at the moment are 12 physical cores, and they 
usually aren't all rendering the same frame), anticipating the need to raise 
the bin count when 20+ cores become more common. Remember that for most 
apps/renderers, not all threads are inside the TextureSystem simultaneously -- 
there are other things to do, like trace rays and run the non-texture portions 
of shaders. So I figured 32 was pretty safe for the time being, for ensuring 
that threads are not usually blocking. But whenever people think it's time for 
more, I'm happy to raise the default.
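For a rough sense of how bin count and concurrent-thread count interact, here's a birthday-style back-of-envelope calculation (my illustration, not anything from OIIO itself): assuming threads are touching distinct tiles that hash uniformly, it estimates the probability that at least two of T simultaneously-active threads land in the same of B bins.

```cpp
// Back-of-envelope estimate: probability that at least two of 'threads'
// concurrently-active threads hash into the same of 'bins' bins, assuming
// each thread's tile hashes uniformly and independently.  Only meaningful
// for threads <= bins.
double collision_probability (int threads, int bins)
{
    double p_none = 1.0;   // probability that no two threads share a bin
    for (int i = 0; i < threads; ++i)
        p_none *= double(bins - i) / bins;  // thread i avoids the i occupied bins
    return 1.0 - p_none;
}
```

With 32 bins, even a dozen active threads have a substantial chance that *some* pair shares a bin (though each individual pairwise collision is still only 1/32), while 128 bins pushes that down considerably, which matches Bret's observation above.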

Note, however, an interesting edge case: the way the concurrent file and tile 
caches work is to divide the cache (by hash value) into NBINS individually 
locked sub-regions. So two threads accessing *different* files/tiles will, on 
average, have only a 1/NBINS chance of needing the same subcache, and therefore 
a very good chance of acquiring the lock immediately without blocking on the 
other thread. HOWEVER, if both threads are accessing the SAME files/tiles, they 
will necessarily be contending for the same subcache and the same lock. The 
scheme scales well with large numbers of threads only if there are also large 
numbers of textures in flight at once. If you only have a few textures (fewer 
than threads, say), and the different threads tend to need the same tiles at 
roughly the same times, it won't have very good highly-threaded performance. 
Hard to get around that one.
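To make the sharding scheme concrete, here's a minimal sketch (not OIIO's actual implementation; the type names and the toy string-to-int payload are mine) of a cache split into NBINS independently locked sub-caches selected by hash:

```cpp
#include <array>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

// Minimal sketch of a hash-sharded cache: NBINS sub-caches, each with its
// own mutex, so threads touching different keys rarely contend.
constexpr size_t NBINS = 128;

struct ShardedCache {
    struct Bin {
        std::mutex lock;                           // guards only this bin
        std::unordered_map<std::string,int> map;   // toy payload
    };
    std::array<Bin, NBINS> bins;

    Bin& bin_for (const std::string& key) {
        // Different keys collide on a bin with probability ~1/NBINS;
        // the SAME key always maps to the same bin, hence the same lock.
        return bins[std::hash<std::string>{}(key) % NBINS];
    }

    void insert (const std::string& key, int value) {
        Bin& b = bin_for (key);
        std::lock_guard<std::mutex> guard (b.lock);
        b.map[key] = value;
    }

    bool find (const std::string& key, int& value) {
        Bin& b = bin_for (key);
        std::lock_guard<std::mutex> guard (b.lock);
        auto it = b.map.find (key);
        if (it == b.map.end())
            return false;
        value = it->second;
        return true;
    }
};
```

The bin_for() line is the whole story: sharding only spreads the load when the *keys* are spread, which is why many threads hammering the same few tiles still serialize on the same bin lock no matter how large NBINS is.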

--
Larry Gritz
[email protected]



_______________________________________________
Oiio-dev mailing list
[email protected]
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org