Hi!

Thanks for taking the time to explain this. The arguments you make related to string/char * function parameters and return types in the OIIO interfaces are quite good.

However, I am still not convinced about one thing: have string operations in/related to OIIO ever shown up as a significant portion of the program runtime in a profiler? If not, one can easily argue that the string handling in OIIO is at least partly a case of premature optimization.

I'm not opposed to any changes you make in that regard. I'm trying to take an opposing view in the discussion to get a better understanding.

Gregor

On 01/07/2014 07:22 PM, Larry Gritz wrote:
OIIO (parts of it anyway) tends to be used in fairly performance-critical ways, 
so I don't think there's a strong argument to be made for intentionally having 
bad string performance.

Perhaps you meant, "isn't OIIO mostly image processing, which is all float math, so 
why do strings come up much at all?"

The answer is that we still use strings all the time. Two spots that come to 
mind are (1) names of texture and other image files; (2) the image headers 
themselves are chock full of strings (metadata names, and often the metadata 
values).  We are constantly rummaging through metadata, searching for things by 
name, making and destroying lists of such things, or specifying texture lookups 
by name.

But even if this never became significant in the runtime profile, there are also 
(non-performance) issues of APIs -- both convenience and safety.  Consider a 
mundane utility function that extracts the file extension from an image filename.  
What kind of parameter should extension() take? A const char*? A const 
std::string&?

If it takes a const char* and the app has a const char*, all is good (though, 
if it's a string literal, extension() will probably needlessly call strlen() 
internally to find its length, which is weird because it started life as a 
constant with known length). But if the app is trying to pass a std::string, it 
will have to pass as mystring.c_str(), and then extension() will again 
needlessly call strlen(), which is especially galling because the length was 
already known when it was a std::string.

If, on the other hand, extension() takes a const string&, then that's great for 
an app that already has the data as a std::string, but for a different app that has 
it as a C string or a string literal), passing it to extension will pointlessly do 
a malloc to create a std::string, a strlen to know how big it is, pass the string 
(which is just a temporary!), and then have to free it after the call. Ick.

Should we have both varieties of extension()? Have two copies of every 
string-handling function, one for char* and one for std::string? What about a 
function that takes 3 strings as arguments... do we need 8 versions of the 
function, to handle the full cross product of every one of those parameters being 
either a char* or std::string&?

Or, consider the return type. If you have a function that returns a 
std::string, it must allocate and copy that string, and the caller must 
eventually free it. But for extension() and many other functions, the returned 
string will by definition ALWAYS be a subset of the string that was passed as 
the input parameter. However, that observation does nothing to prevent a new 
malloc/strcpy/free from happening. These little things add up.

The neat thing is that string_ref cuts through all of this mess, and presents a 
SINGLE interface, through which it's fine (and efficient!) for the app to pass 
a std::string, a char*, or even one of our ustring's. It NEVER does a 
malloc/free in order to pass a string, and although it can occasionally trigger 
an unneeded strlen, it probably saves more strlen's than it creates, because 
any function that internally needs the length will already have it.  Also, in 
cases where a return string value is guaranteed to be one of (or a substring 
of) the inputs, the function can also return a string_ref, again completely 
avoiding any allocation or copying.

It's a very slick idea. (Not originally mine.)



On Jan 7, 2014, at 1:56 AM, Gregor Mückl <[email protected]> wrote:

Just to get a better understanding of this discussion and OIIO in general: why 
do you place such an emphasis on optimizing string operations in OIIO? Why is 
decent performance in this area so important?

Gregor

On 12/31/2013 12:57 AM, Larry Gritz wrote:
Here's my stab at it: https://github.com/OpenImageIO/oiio/pull/772

Feedback appreciated. Now that I've done it, I rather like it.



_______________________________________________
Oiio-dev mailing list
[email protected]
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org

Reply via email to