I think just one TextureSystem overall should be fine. I don't think there is
any advantage to having it be per-thread, and you *really* wouldn't want to
have any accident where a per-thread TS inadvertently ended up with a separate
ImageCache per thread.
A bunch of suggestions, in no particular order, because I don't know how many
you are already doing:
Be sure you are preprocessing all your textures with maketx so that they are
tiled and MIP-mapped. That's definitely better than forcing it to emulate
tiling/mipmapping, which will happen if you use untiled, un-mipped textures.
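For reference, a typical maketx invocation looks like this (the output of maketx is tiled and MIP-mapped by default; the explicit tile size here is just an example):

```
maketx --tile 64 64 foo.exr -o foo.tx
```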
Note that there are two varieties of each call, for example,
bool texture (ustring filename, TextureOpt &options, ...)
and
bool texture (TextureHandle *texture_handle, Perthread *thread_info,
TextureOpt &options, ...)
You can reduce the per-call overhead somewhat if you use the latter call
-- that is, if each thread already knows its thread_info (which you can
retrieve ONCE per thread with get_thread_info()), and also if you pass the
handle rather than the filename (which you can retrieve ONCE per filename,
using get_texture_handle()).
And if you have to use the first variety of the call, where you look up by
filename and without knowing the per-thread info already, then at least ensure
that you are creating the ustring ONCE and passing it repeatedly, and not
inadvertently constructing a ustring every time.
In other words, this is the most wasteful thing to do:
texturesys->texture (ustring("foo.exr"), /* construct ustring every time */
options, s, t, ...);
and this is the most efficient thing to do:
// ONCE per thread: my_thread_info = texturesys->get_thread_info();
// ONCE per texture: handle = texturesys->get_texture_handle(filename);
// for each texture lookup:
texturesys->texture (handle, my_thread_info, options, ...);
Are your derivatives reasonable? If they are 0 or very small, you'll always be
sampling from the finest level of the MIP-map, which is probably not kind to
caches; that finest level will also tend to use bicubic sampling unless you
force bilinear everywhere (somewhat more math). If you are using correct derivs
and your textures are sized well to handle all your views (without forcing the
highest-res level), then you should be in good shape: as long as you're not
"magnifying"/blurring/on the top level, "SmartBicubic" will actually give you
bilinear most of the time.
Another difference you may be seeing is from our anisotropic texturing,
compared to your old engine. If you don't require the anisotropy, then you may
want to set options.mipmode to MipModeTrilinear rather than MipModeAniso (which
is the default).
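Assuming the current OIIO headers, that would look something like:

```
OIIO::TextureOpt options;
options.mipmode = OIIO::TextureOpt::MipModeTrilinear;  // skip anisotropic filtering
```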
What kind of hardware are you compiling for? Are you using appropriate USE_SIMD
flags? Because that can speed up the texture system quite a bit.
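For example, when configuring the OIIO build with CMake (the right flag values depend on your target hardware):

```
cmake -DUSE_SIMD=avx2,f16c ..
```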
I'm not sure how you are benchmarking, but make sure your benchmark run is long
enough (in time) that you are measuring the steady state, and not having it
dominated by initial texture read time. For example, if your prior system was
reading whole textures in one shot, and the new one is reading tiles on demand
(and reading multiple MIP levels as well), the total read time may be a bit
higher. That won't matter at all for a 1 hour render, but the increase in disk
read may show up as significant for a 15 second benchmark.
Assuming you're doing all this... well, you may just be seeing the overhead of
all the flexibility of TextureSystem. Remember that in some sense, it is NOT
designed to be the fastest possible texture implementation for texture sets
that fit in memory. Rather, it's supposed to be acceptable speed and degrade
gracefully as the texture set grows. In production, we routinely render frames
that reference many thousands of textures totalling many hundreds of GB (well
into the TB range), using a memory cache of perhaps only 2 or 4GB, and it
performs very, very well. Texture sets much larger than available memory are
the case where it really shines.
-- lg
> On May 11, 2017, at 7:26 AM, Stefan Werner <[email protected]> wrote:
>
> Hi,
>
> I’m in the middle of integrating OIIO’s TextureSys into a path tracer.
> Previously, textures were just loaded into memory in full, and lookups
> would always happen at the full resolution, without mip maps. When replacing
> that with TextureSys, I’m noticing a significant performance drop, up to the
> point where texture lookups (sample_bilinear() for example, sample_bicubic()
> even more) occupy 30% or more of the render time. This is with good cache hit
> rates, the cache size exceeds the size of all textures and the OIIO stats
> report a cache miss rate of < 0.01% (in addition, I tried hardcoding
> dsdx/dsdy/dtdx/dtdy to 0.01, just to be sure).
>
> I did expect some performance drop compared to the previous naive strategy,
> but this is a bit steeper than I expected. I am wondering if I am doing
> something wrong on my side and if there are some best practices on how to
> integrate OIIO into a path tracer. (I had it running in a REYES renderer
> years ago and don’t remember it being that slow.)
>
> I am creating one TextureSys instance per CPU thread, with a shared
> ImageCache - are separate caches per thread any better? I cache perthread
> data and do lookups using TextureHandle, not texture name. Do people
> generally use smartbicubic for path tracing or do you not see enough of a
> difference and stay with bilinear (as pbrt does)? For any diffuse/sss/smooth
> glossy/etc bounces, I use MipModeNoMIP/InterpClosest. I am observing this on
> macOS, Windows and Ubuntu, OIIO built with whatever compiler flags CMake
> picks for a Release build. Is it worth it forcing more aggressive
> optimisation (-O3 -lto -ffast-math…)?
>
> Thanks,
> Stefan
> _______________________________________________
> Oiio-dev mailing list
> [email protected]
> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org
--
Larry Gritz
[email protected]