Hi!
Thanks for taking the time to explain this. The arguments you make
related to string/char * function parameters and return types in the
OIIO interfaces are quite good.
However, I am still not convinced about one thing: have string
operations in/related to OIIO ever shown up as a significant portion of
the program runtime in a profiler? If not, one can easily argue that the
string handling in OIIO is at least partly a case of premature optimization.
I'm not opposed to any changes you make in that regard. I'm trying to
take an opposing view in the discussion to get a better understanding.
Gregor
On 01/07/2014 07:22 PM, Larry Gritz wrote:
OIIO (parts of it anyway) tends to be used in fairly performance-critical ways,
so I don't think there's a strong argument to be made for intentionally having
bad string performance.
Perhaps you meant, "isn't OIIO mostly image processing, which is all float math, so
why do strings come up much at all?"
The answer is that we still use strings all the time. Two spots that come to
mind are (1) names of texture and other image files; (2) the image headers
themselves are chock full of strings (metadata names, and often the metadata
values). We are constantly rummaging through metadata, searching for things by
name, making and destroying lists of such things, or specifying texture lookups
by name.
But even if this never became significant in the runtime profile, there are also
(non-performance) issues of APIs -- both convenience and safety. Consider a
mundane utility function that extracts the file extension from an image filename.
What kind of parameter should extension() take? A const char*? A const
std::string&?
If it takes a const char* and the app has a const char*, all is good (though,
if it's a string literal, extension() will probably needlessly call strlen()
internally to find its length, which is weird because it started life as a
constant with known length). But if the app is trying to pass a std::string, it
will have to pass as mystring.c_str(), and then extension() will again
needlessly call strlen(), which is especially galling because the length was
already known when it was a std::string.
If, on the other hand, extension() takes a const std::string&, then that's great for
an app that already has the data as a std::string, but for a different app that has
it as a C string (or a string literal), passing it to extension() will pointlessly do
a malloc to create a std::string, a strlen to know how big it is, pass the string
(which is just a temporary!), and then have to free it after the call. Ick.
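
To make the two options concrete, here is a minimal sketch of both signatures. The names extension_cstr and extension_str are hypothetical and are not OIIO functions, and the choice to return the extension without its leading dot is an assumption made for illustration.

```cpp
#include <cstring>
#include <string>

// Hypothetical variant 1: takes a C string. The length is unknown here,
// so strlen() has to run even if the caller already knew it.
std::string extension_cstr(const char* filename) {
    const char* end = filename + std::strlen(filename);
    for (const char* p = end; p != filename; --p) {
        if (p[-1] == '.')
            return std::string(p, end);  // text after the last '.'
        if (p[-1] == '/')
            break;  // reached a directory separator: no extension
    }
    return std::string();
}

// Hypothetical variant 2: takes a std::string. The length is known, but a
// caller holding a char* or a literal pays for a temporary std::string.
std::string extension_str(const std::string& filename) {
    std::string::size_type pos = filename.find_last_of("./");
    if (pos == std::string::npos || filename[pos] != '.')
        return std::string();
    return filename.substr(pos + 1);
}
```

A caller holding a std::string has to write extension_cstr(name.c_str()), throwing away the known length, while extension_str("foo.exr") builds and destroys a temporary std::string just to make the call.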
Should we have both varieties of extension()? Have two copies of every
string-handling function, one for char* and one for std::string? What about a
function that takes 3 strings as arguments... do we need 8 versions of the
function, to handle the full cross product of every one of those parameters being
either a char* or std::string&?
Or, consider the return type. If you have a function that returns a
std::string, it must allocate and copy that string, and the caller must
eventually free it. But for extension() and many other functions, the returned
string will by definition ALWAYS be a substring of the string that was passed as
the input parameter. However, that observation does nothing to prevent a new
malloc/strcpy/free from happening. These little things add up.
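
The return-type point can be sketched as follows. Both helper names are hypothetical (not OIIO API). The result is fully described by an offset into the input, yet returning std::string still copies those characters into a separate object. Modern standard libraries may keep very short strings out of the heap, but the copy itself still happens.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: offset of the extension within filename,
// or filename.size() when there is no extension.
std::size_t extension_offset(const std::string& filename) {
    std::string::size_type pos = filename.find_last_of("./");
    if (pos == std::string::npos || filename[pos] != '.')
        return filename.size();
    return pos + 1;
}

// Hypothetical helper: the same answer returned as a std::string.
// substr() copies the characters into a new string owned by the caller.
std::string extension_copy(const std::string& filename) {
    return filename.substr(extension_offset(filename));
}
```

For "texture.exr" the offset is 8, and the string returned by extension_copy holds its own copy of "exr" rather than pointing at the input's characters.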
The neat thing is that string_ref cuts through all of this mess, and presents a
SINGLE interface, through which it's fine (and efficient!) for the app to pass
a std::string, a char*, or even one of our ustrings. It NEVER does a
malloc/free in order to pass a string, and although it can occasionally trigger
an unneeded strlen, it probably saves more strlen calls than it creates, because
any function that internally needs the length will already have it. Also, in
cases where a return string value is guaranteed to be one of (or a substring
of) the inputs, the function can also return a string_ref, again completely
avoiding any allocation or copying.
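
A rough sketch of the idea, using std::string_view from C++17 as a stand-in. This is an assumption for illustration only: it is not OIIO's string_ref class, and the function names below are hypothetical. It shows one signature accepting a std::string, a char*, or a literal, and a return value that points into the caller's data instead of copying it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in sketch: std::string_view plays the role of string_ref here.
// One signature serves std::string, const char*, and string literals.
std::string_view extension(std::string_view filename) {
    std::size_t pos = filename.find_last_of("./");
    if (pos == std::string_view::npos || filename[pos] != '.')
        return std::string_view();
    // substr() on a view returns another view: no allocation, no copy.
    return filename.substr(pos + 1);
}

// Hypothetical three-string function: a single signature instead of
// eight overloads covering every char* / std::string combination.
bool has_any_extension(std::string_view filename,
                       std::string_view ext_a,
                       std::string_view ext_b) {
    std::string_view e = extension(filename);
    return e == ext_a || e == ext_b;
}
```

The trade-off is lifetime: the returned view is only valid while the string it was computed from is still alive, so a caller who needs to keep the result must copy it into a std::string explicitly.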
It's a very slick idea. (Not originally mine.)
On Jan 7, 2014, at 1:56 AM, Gregor Mückl wrote:
Just to get a better understanding of this discussion and OIIO in general: why
do you place such an emphasis on optimizing string operations in OIIO? Why is
decent performance in this area so important?
Gregor
On 12/31/2013 12:57 AM, Larry Gritz wrote:
Here's my stab at it: https://github.com/OpenImageIO/oiio/pull/772
Feedback appreciated. Now that I've done it, I rather like it.
_______________________________________________
Oiio-dev mailing list
[email protected]
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org