Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

Peter Pearson Fri, 15 Jan 2016 13:09:46 -0800

Replies inline...

On 16 January 2016 at 07:52, Larry Gritz <[email protected]> wrote:


> I'm sorry for the long delay here, I got sidetracked for quite a while
> trying to unravel a site-specific problem -- in the process of trying to
> benchmark different OpenEXR versions, I found out that I was getting vastly
> different speeds even on the same exr version depending on whether I built
> libIlmImf myself or used the system libraries. It seems to have boiled down
> to compiler releases (gcc 4.4 vs gcc 4.8 vs clang -- the latter two make
> much faster code for some reason), so it's important when doing these kinds
> of benchmarks to be certain that you used the same toolchain for each option
> you're benchmarking.
>


I was using GCC 4.7.2 for both tests, and building everything (IlmImf as
well as OpenEXR) in both cases, not using system libs (the system hasn't
got them installed). I've just done a *very* rough single-threaded test of
just opening a tiled image once, and the speeds between 1.7 and 2.0 are
close to identical. So maybe the discrepancy I can see (I tested it again
within the renderer) is to do with the usage profile there, with multiple
threads interacting with other renderer stuff...



> Anyway, the long and short of it is that I'm unable to replicate Peter's
> results. For me, OpenEXR 2.2 is not any slower than 1.7 in my benchmarks.
> If anything, 2.2 is slightly faster. The identical benchmark using tiled,
> MIP-mapped TIFF files is still about 15% faster than OpenEXR, even when I
> use the compiler versions that give the best exr results.
>
> So I'm still very eager to get suggestions for what to try next, and if
> anybody more familiar with OpenEXR internals is interested in taking
> a deeper look at why performance may not be what we hope.
>


I've got no evidence this *is* the issue for EXR reading, but in terms of
performance, I've long suspected that the use of IOStreams within OpenEXR
might account for some performance penalty compared to raw fread()s.
Streams in C++ are generally slower, and getting the buffering right for
high-performance stuff is tricky, especially across platforms.
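
A minimal sketch of how one might measure this in isolation, outside of
OpenEXR: read the same file in fixed-size chunks through std::ifstream and
through fread(), and time each. The function names, the chunk size and the
timing helper are all hypothetical, and nothing here touches OpenEXR's own
stream classes, so it only hints at the raw stream overhead.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Read the whole file in `chunk`-byte pieces via std::ifstream.
// Returns the number of bytes read (0 if the file cannot be opened).
std::size_t read_with_ifstream(const std::string& path, std::size_t chunk) {
    assert(chunk > 0);  // a zero-sized chunk would never reach EOF
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(chunk);
    std::size_t total = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Same access pattern, but through C stdio.
std::size_t read_with_fread(const std::string& path, std::size_t chunk) {
    assert(chunk > 0);
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) return 0;
    std::vector<char> buf(chunk);
    std::size_t total = 0, n = 0;
    while ((n = std::fread(buf.data(), 1, buf.size(), f)) > 0) total += n;
    std::fclose(f);
    return total;
}

// Wall-clock time of a callable, in seconds.
template <typename F>
double seconds(F&& fn) {
    const auto t0 = std::chrono::steady_clock::now();
    fn();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Calling seconds([&]{ read_with_ifstream(path, 64 * 1024); }) and the fread
equivalent on the same file, several times each so the OS disk cache is in
the same state for both, should give a rough idea of the gap on a given
platform and standard library.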

Also, reading and writing of values in OpenEXR goes through ImfXdr.h's
conversion routines, which do bit-shifting for (I assume) endianness
conversion? I guess the x86 port of OpenEXR had to convert this, whereas
the SGI versions didn't, and we're stuck with it now?
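
To make the bit-shifting point concrete, here is a rough sketch of the two
ways of pulling a 32-bit value out of a byte buffer. The function names are
hypothetical and this is not the ImfXdr.h code itself, just the general
pattern. As far as I know the EXR file layout stores multi-byte values
little-endian, so on x86 the shift-based read should give the same result
as a plain copy, and the open question is only whether the compiler
collapses the shifts into a single load.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assemble a 32-bit value from four little-endian bytes using shifts.
// Gives the same answer regardless of the host's endianness.
inline std::uint32_t load_le32_shift(const unsigned char* b) {
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Plain copy in host byte order. Only matches the function above when the
// host itself is little-endian.
inline std::uint32_t load_host32(const unsigned char* b) {
    std::uint32_t v;
    std::memcpy(&v, b, sizeof v);
    return v;
}

// Runtime check of the host byte order.
inline bool host_is_little_endian() {
    const std::uint32_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

Timing a loop over a large buffer with each loader, built with the same
compiler and flags, would show whether the shift version costs anything
measurable on a particular toolchain.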

On top of that, in the multi-threading scenario, while using a LUT for
half->float conversion is faster than not using it, it causes absolute
havoc in terms of L1/L2 cache thrashing. I've sometimes found reading full
float EXRs from disk faster than reading half EXRs because of this, but
that's probably only when the OS disk cache already has them. So in general
it's not a huge issue, given the IO saving half gives in most real-world
usage for big facilities...
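
For a sense of scale on the LUT, a sketch of the two approaches: a
bit-manipulation conversion of one 16-bit half value, and a full table
built from it. The names are hypothetical and this is not the actual half
class code. The point is just that a table covering every possible half
value has 65536 float entries, i.e. 256 KiB, which is far bigger than a
typical 32 KiB L1 data cache.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Convert one IEEE 754 half (as raw bits) to a float without any table.
inline float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp  = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x3FFu;
    std::uint32_t bits;
    if (exp == 0 && mant == 0) {
        bits = sign;                                   // signed zero
    } else if (exp == 0) {
        int shift = 0;                                 // subnormal: renormalise
        while ((mant & 0x400u) == 0) { mant <<= 1; ++shift; }
        mant &= 0x3FFu;
        exp = static_cast<std::uint32_t>(127 - 15 - shift + 1);
        bits = sign | (exp << 23) | (mant << 13);
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (mant << 13);      // infinity or NaN
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Build a lookup table indexed by the raw half bits.
inline std::vector<float> build_half_lut() {
    std::vector<float> lut(std::size_t(1) << 16);
    for (std::size_t i = 0; i < lut.size(); ++i)
        lut[i] = half_bits_to_float(static_cast<std::uint16_t>(i));
    return lut;
}
```

With several threads each converting pixel data that is effectively random
with respect to table index, lookups can land anywhere in those 256 KiB, so
running both variants over the same buffer with one thread and then with
many would be one way to see whether the table is really what hurts.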

Cheers,
Peter

_______________________________________________
Oiio-dev mailing list
[email protected]
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org
