Just to get a better understanding of this discussion and OIIO in general: why do you place such an emphasis on optimizing string operations in OIIO? Why is decent performance in this area so important?

Gregor

On 12/31/2013 12:57 AM, Larry Gritz wrote:
Here's my stab at it: https://github.com/OpenImageIO/oiio/pull/772

Feedback appreciated. Now that I've done it, I rather like it.


On Dec 26, 2013, at 12:35 PM, Larry Gritz wrote:

Here's the C++ draft document: 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3442.html
It's being considered for C++14.

Boost 1.55 has an implementation, but not everybody (including my employers) is 
on such a recent Boost, so I don't think we can depend on that.  It's a simple 
class and I don't mind rolling our own; in fact, if we do, we can also make it a 
bit more optimized in dealing with ustring.

The basic gist is that, say you have a library function foo(x), where x is a 
character sequence. What type should x be?  Let's assume that foo needs to read 
the characters but doesn't wish to assume ownership or modify them in place. 
And also let's assume that the library function may be used by apps that like 
char*, and other apps that like std::string.

If x is const char*, then (a) it's technically unsafe because no length is 
passed, so foo has to assume that the characters are valid and 0-terminated, and 
(b) an app that is dealing with std::string needs to call it as 
foo(s.c_str()), which is a little clunky and may rub some people the wrong way.

On the other hand, if x is a const std::string&, that works really nicely for an 
app already using std::string, but if the caller just has a char array, it'll 
end up constructing a temporary std::string to pass (which involves allocating 
memory, copying the chars, then deallocating after the call; very wasteful).

So the idea is that you'd declare a non-owning (char* + length) object and pass 
that around with implicit conversions.  You can pass either a char* or a 
std::string, it works fine both ways, and in neither case will it allocate, copy, 
or deallocate. For the char* case, if the app already knows the length of the 
string, it can pass that length explicitly, which is safer and also avoids 
having to scan the string.
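
To make the idea concrete, here is a rough sketch of what such a class might 
look like. It is only an illustration under my own assumptions, not the 
proposed OIIO class or the Boost one: the sketch namespace and the count_char 
function (standing in for "foo") are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace sketch {  // hypothetical namespace, for illustration only

// Minimal non-owning reference to a character sequence: pointer + length.
// It never allocates, copies, or frees the characters it refers to.
class string_ref {
public:
    string_ref() : m_chars(NULL), m_len(0) {}
    // From a 0-terminated C string: one scan for the length, no allocation.
    string_ref(const char* chars)
        : m_chars(chars), m_len(chars ? std::strlen(chars) : 0) {}
    // From a pointer plus a known length: no scan needed.
    string_ref(const char* chars, std::size_t len)
        : m_chars(chars), m_len(len) {}
    // From a std::string: borrows the string's existing buffer.
    string_ref(const std::string& s) : m_chars(s.data()), m_len(s.size()) {}

    const char* data() const { return m_chars; }
    std::size_t size() const { return m_len; }
    bool empty() const { return m_len == 0; }
    char operator[](std::size_t i) const { return m_chars[i]; }

private:
    const char* m_chars;
    std::size_t m_len;
};

// Stand-in for the library function "foo": counts occurrences of c in x.
// Callers may pass a char*, a std::string, or an explicit (pointer, length).
inline std::size_t count_char(string_ref x, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i] == c)
            ++n;
    return n;
}

}  // namespace sketch
```

With this, count_char("banana", 'a'), count_char(some_std_string, 'a'), and 
count_char(sketch::string_ref(ptr, len), 'a') all go through the same 
function without creating a temporary std::string.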

Also, there are functions that currently return a string where the 
string will necessarily be a contiguous subregion of another existing string 
that was passed as an argument (example: substr). Instead of allocating 
std::strings, they can return string_refs.  The allocation, if ever needed, 
happens only when the caller assigns the result to a std::string after the return.
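
As a sketch of the substr case (again with made-up names, and a deliberately 
stripped-down string_ref rather than any real implementation), the returned 
reference just points into the caller's existing characters, and a copy is 
made only if the caller asks for an owning std::string:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, for illustration only

// Stripped-down non-owning (pointer + length) reference.
struct string_ref {
    const char* chars;
    std::size_t len;
    string_ref(const char* c, std::size_t n) : chars(c), len(n) {}
    string_ref(const std::string& s) : chars(s.data()), len(s.size()) {}
    // Copy the characters into an owning std::string.
    // This is the only step that allocates.
    std::string str() const { return std::string(chars, len); }
};

// Returns a reference to at most n characters of s starting at pos.
// Out-of-range requests are clamped. No characters are copied.
inline string_ref substr(string_ref s, std::size_t pos, std::size_t n) {
    if (pos > s.len)
        pos = s.len;
    if (n > s.len - pos)
        n = s.len - pos;
    return string_ref(s.chars + pos, n);
}

}  // namespace sketch
```

One caveat that comes with any non-owning reference: the result is only valid 
for as long as the original characters are alive, so a caller that needs to 
keep it longer should copy it out with str().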

To answer your question, you can have an empty string_ref, yes.



On Dec 26, 2013, at 3:02 PM, Mark Visser wrote:

I'm really only familiar with the oiiotool part of the codebase, but I noticed 
there are a few places that pass const char* instead of std::string because the 
latter can't represent NULL. Does string_ref allow the null string?

Is this the same thing? 
http://www.boost.org/doc/libs/1_55_0/libs/utility/doc/html/string_ref.html

cheers

On Dec 26, 2013, at 1:10 PM, Larry Gritz wrote:

Just floating the idea to see what people think.

Many packages have some version of a "string reference" class that is a 
non-owning (and non-allocating) reference to character data that auto-converts to char* 
or std::string when assigned.  C++14 is drafting it into its standard library.

I've been toying with the idea of doing an implementation of this in OIIO, and 
changing many of the utility and other functions that take strings -- which are 
currently a bit of a mish-mash of std::string, char*, and ustring -- to 
accept a string_ref; internally, each function can then do what it wishes. For 
many uses that just need to look at the characters, this can eliminate many 
transitory std::string allocations and copies.  Since we can't count on C++14 
(hah, we can't even count on C++11), I'd have to write my own (it's very simple), 
and so it could also be aware of ustring.

Does anybody have strong opinions?  Including:

1. Don't change anything ever, we hate string_ref.

2. A nice idea to use it when it's available from C++14, but wait until then.

3. Yes, do it and convert anyplace where it's appropriate. Making it 
ustring-aware is great, even if it means deviating from what C++14 will 
eventually offer as std::string_ref.

4. Do it, but only if it's exactly like what C++14 std::string_ref will end up 
being.

5. Something else.

If you don't care one way or the other, as long as it won't really affect what 
the source code looks like on the app side, you don't need to chime in; that's 
the default answer.

Similarly, there are a lot of places where we pass arrays as just a T*.  It 
sometimes bugs me how unsafe this is: there are always assumptions about how many 
T's it points to.  We could make an array_ref<T> template, which is just a 
bundling of the pointer and a length, and use that in many places instead of raw 
pointers. There's no allocation issue here, but forcing callers to pass an 
explicit length (and giving the called function that explicit length) could be 
useful in improving code safety.
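
A rough sketch of what an array_ref<T> along these lines might look like 
follows. The sketch namespace, the member names, and the sum function are all 
made up for illustration; this is not an existing OIIO class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, for illustration only

// Non-owning bundle of a pointer and an element count.
template <typename T>
class array_ref {
public:
    array_ref() : m_data(NULL), m_size(0) {}
    // From a raw pointer: the caller must state the length explicitly.
    array_ref(const T* data, std::size_t size) : m_data(data), m_size(size) {}
    // From a C array: the length is deduced from the array type.
    template <std::size_t N>
    array_ref(const T (&arr)[N]) : m_data(arr), m_size(N) {}
    // From a std::vector: borrows the vector's storage.
    array_ref(const std::vector<T>& v)
        : m_data(v.empty() ? NULL : &v[0]), m_size(v.size()) {}

    const T* data() const { return m_data; }
    std::size_t size() const { return m_size; }
    // Element access, bounds-checked in debug builds.
    const T& operator[](std::size_t i) const {
        assert(i < m_size);
        return m_data[i];
    }

private:
    const T* m_data;
    std::size_t m_size;
};

// Example callee: it is handed the length instead of assuming one.
inline float sum(array_ref<float> values) {
    float total = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i)
        total += values[i];
    return total;
}

}  // namespace sketch
```

A caller with a bare pointer has to write sum(sketch::array_ref<float>(ptr, n)), 
so the length assumption is spelled out at the call site, while C arrays and 
std::vectors convert implicitly and carry their own length.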

--
Larry Gritz



_______________________________________________
Oiio-dev mailing list
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org
