On 03.12.2013 10:35, Herbert Duerr wrote:
On 03.12.2013 09:13, Andre Fischer wrote:
A developer who apparently wants to remain anonymous has added the
function isEmpty() to the rtl::OUString class.  See
main/sal/inc/rtl/ustring.hxx for not much more information.

Sorry for being too short. The full semantics of isEmpty() are:

"The method isEmpty() returns true if the string is empty. If the length of the string is one or two or three or any number bigger than zero then isEmpty() returns false."

In addition to this almost correct statement, one could mention that isEmpty() is preferred over getLength()==0, and why.

Can you tell me what happens when an OUString is created from "\0"? Is that handled as end-of-string or as just one additional character?
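
The two possible behaviors behind this question can be illustrated with std::string as a stand-in. This is only a sketch: it says nothing definitive about which OUString constructor behaves which way, and the helper names fromCString() and fromBuffer() are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// std::string is only a stand-in here, not rtl::OUString.

// Construction from a plain C string: the length is found by scanning for
// the terminating NUL, so a leading "\0" yields an empty string.
inline std::string fromCString(const char* text)
{
    return std::string(text);
}

// Construction with an explicit length: the NUL is kept as an ordinary
// character and counts towards the length.
inline std::string fromBuffer(const char* text, std::size_t length)
{
    return std::string(text, length);
}

inline void demoEmbeddedNul()
{
    assert(fromCString("\0").empty());
    assert(fromBuffer("\0", 1).size() == 1);
}
```

Which of the two applies depends on whether the constructor is given a length or has to find the terminator itself.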


I added isEmpty() to make it possible to cleanly express the check for an empty string. In our codebase there were quite a few constructs such as
    if( aString) {}
which were intended to mean
    if( aString.isEmpty()) {}
What's funny is that the old construct compiled but it did the wrong thing: The string was implicitly converted to a pointer to its elements and that pointer was then compared against NULL. For our OUString that pointer was always non-NULL though.
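
The pitfall can be reproduced with a toy class. This is a minimal sketch, assuming a hypothetical LegacyString with an implicit conversion to a pointer to its elements; it is not the real rtl::OUString.

```cpp
#include <cassert>

// Toy stand-in for a string class with an implicit conversion to a pointer
// to its elements. Hypothetical; not the real rtl::OUString.
class LegacyString
{
public:
    explicit LegacyString(const char* text) : mpText(text) {}

    bool isEmpty() const { return *mpText == '\0'; }

    // The dangerous part: lets the object be used wherever a pointer fits.
    operator const char*() const { return mpText; }

private:
    const char* mpText;   // assumed to be never null in this sketch
};

// What a construct like "if( aString)" really tests: pointer != NULL.
inline bool oldStyleCheck(const LegacyString& rString)
{
    if (rString)
        return true;
    return false;
}

inline void demoPointerPitfall()
{
    const LegacyString aEmpty("");
    assert(aEmpty.isEmpty());        // the string has no characters ...
    assert(oldStyleCheck(aEmpty));   // ... yet the pointer check still passes
}
```

The check compiles without any warning and gives the same answer for empty and non-empty strings.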

Please see issue 123068 for further problems caused by the implicit conversion of the OUString to a pointer to its elements. This dangerous conversion is now disabled. By making the conversion operator private, all such problems will be found and prevented by the compiler. When we're confident that all of them have been found, the operator can be removed completely.
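
A rough sketch of the private-operator technique, again with a hypothetical toy class (GuardedString is not the real OUString): the operator is declared private and never defined, so any code outside the class that relies on the conversion fails to compile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical toy class; only illustrates the technique.
class GuardedString
{
public:
    explicit GuardedString(const char* text) : mpText(text) {}

    bool isEmpty() const { return *mpText == '\0'; }
    const char* getStr() const { return mpText; }   // explicit access only

private:
    // Declared private and never defined: "if( aString)" or passing the
    // object where a pointer is expected is now rejected by the compiler.
    operator const char*() const;

    const char* mpText;   // assumed to be never null in this sketch
};

// Code outside the class can no longer convert implicitly.
static_assert(!std::is_convertible<GuardedString, const char*>::value,
              "implicit conversion to a pointer must be inaccessible");

inline void demoGuardedString()
{
    const GuardedString aEmpty("");
    assert(aEmpty.isEmpty());            // the intent is spelled out
    assert(aEmpty.getStr()[0] == '\0');  // pointer access is explicit
}
```

With C++11 the operator could be marked "= delete" instead; the private declaration also works with older compilers.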

This in itself may not yet be very exciting but I hope that it is the
first of several improvements to one of our most frequently used
classes.  Sadly, we missed the opportunity to make some more substantial
but incompatible changes for the 4.0 release. However, some changes that
make OUString more accessible to new (and old) developers might include:

- Make construction from string literal more straightforward. At the
moment you have to write
     ::rtl::OUString("text", sizeof("text") - 1, RTL_TEXTENCODING_ASCII_US)
   or slightly shorter and safer
     ::rtl::OUString::createFromAscii("text")

Allocating heap space, transcoding a literal string into this memory and deallocating it later when the string is deleted are quite wasteful operations, especially considering that the literal string is already there. It would be great if constructs such as
    OUString( L"hello")
used the pointer to the UTF-16 literal directly instead of copying its contents around. The same applies to OString. The 'L' prefix only gives UTF-16 on Windows, where wchar_t is 16 bits wide, but C++11 offers more possibilities with its support for Unicode string literals.
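
As a sketch of the idea, a C++11 u"..." literal already is UTF-16 data with a length known at compile time, so a string object could point at it directly. The class LiteralView below is hypothetical and only illustrates the principle; it is not a proposal for the actual OUString layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view of a UTF-16 literal; not an existing class.
class LiteralView
{
public:
    // The length is deduced from the array type, so there is no scan for
    // the terminator and no heap allocation. N includes the terminating NUL.
    template <std::size_t N>
    constexpr LiteralView(const char16_t (&literal)[N])
        : mpData(literal), mnLength(N - 1) {}

    constexpr const char16_t* data() const { return mpData; }
    constexpr std::size_t length() const { return mnLength; }
    constexpr bool isEmpty() const { return mnLength == 0; }

private:
    const char16_t* mpData;
    std::size_t mnLength;
};

inline void demoLiteralView()
{
    static const char16_t aHello[] = u"hello";
    const LiteralView aView(aHello);
    assert(aView.length() == 5);
    assert(aView.data() == aHello);   // same storage, nothing was copied
}
```

A real implementation would still have to distinguish such literal-backed strings from reference-counted heap buffers.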

Also, we shouldn't burden our main string classes with non-Unicode support. External tooling for converting from/to other encodings is still needed, though.

Should we drop our support for ASCII?


Looking over our string processing, I'm confident that we could get along great with UTF-8 strings. A conversion to UTF-16 would only be needed when interfacing with other APIs.

And if we were using UTF-8 byte strings we could base them directly on the standard std::string.
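
To make this concrete, here is a minimal sketch of what such a boundary conversion could look like if strings were kept as UTF-8 in std::string. The function utf8ToUtf16() is a hypothetical helper written for this mail, and it assumes well-formed UTF-8 input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts UTF-8 to UTF-16 at an API boundary.
// Assumes well-formed UTF-8 input; malformed bytes are not diagnosed.
inline std::u16string utf8ToUtf16(const std::string& rUtf8)
{
    std::u16string aResult;
    std::size_t i = 0;
    while (i < rUtf8.size())
    {
        const unsigned char c = static_cast<unsigned char>(rUtf8[i]);
        std::uint32_t nCodePoint = 0;
        std::size_t nExtra = 0;   // number of continuation bytes to read
        if (c < 0x80)      { nCodePoint = c;        nExtra = 0; }
        else if (c < 0xE0) { nCodePoint = c & 0x1F; nExtra = 1; }
        else if (c < 0xF0) { nCodePoint = c & 0x0F; nExtra = 2; }
        else               { nCodePoint = c & 0x07; nExtra = 3; }
        ++i;
        // Each continuation byte contributes its low six bits.
        for (; nExtra > 0 && i < rUtf8.size(); --nExtra, ++i)
            nCodePoint = (nCodePoint << 6)
                | (static_cast<unsigned char>(rUtf8[i]) & 0x3F);

        if (nCodePoint < 0x10000)
        {
            aResult.push_back(static_cast<char16_t>(nCodePoint));
        }
        else
        {
            // Outside the BMP: encode as a surrogate pair.
            nCodePoint -= 0x10000;
            aResult.push_back(static_cast<char16_t>(0xD800 + (nCodePoint >> 10)));
            aResult.push_back(static_cast<char16_t>(0xDC00 + (nCodePoint & 0x3FF)));
        }
    }
    return aResult;
}

inline void demoUtf8ToUtf16()
{
    assert(utf8ToUtf16("abc") == u"abc");   // ASCII maps one to one
}
```

All internal processing would stay on the std::string side; only calls into UTF-16 APIs would pay for the conversion.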

- Conversion back to char* is not much better
     ::rtl::OUStringToOString(sOUStringVariable, RTL_TEXTENCODING_ASCII_US).getStr()

This awful construct could be made much simpler if our strings were always Unicode (UTF-8/UTF-16/UTF-32).

I thought that OUString is UTF-16 and that this was the cause of the conversion problems, not the solution.

-Andre


Do you have more ideas?

We could use ideas from languages such as Python/Perl/Java for convenient and powerful string processing to replace the awkward string handling that is too often seen in our codebase. E.g. having regexp-enabled match() or search() methods would be a great start.
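
A rough idea of what such helpers could look like, sketched on top of C++11 std::regex with std::string standing in for our own classes. The free functions match() and search() below are hypothetical and their signatures are just one possible choice.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical convenience functions; std::string stands in for our classes.

// Returns true if the whole string matches the pattern.
inline bool match(const std::string& rText, const std::string& rPattern)
{
    return std::regex_match(rText, std::regex(rPattern));
}

// Returns the first substring that matches the pattern, or an empty string.
inline std::string search(const std::string& rText, const std::string& rPattern)
{
    std::smatch aMatch;
    if (std::regex_search(rText, aMatch, std::regex(rPattern)))
        return aMatch.str(0);
    return std::string();
}

inline void demoRegexHelpers()
{
    assert(match("issue 123068", "issue [0-9]+"));
    assert(search("see issue 123068 for details", "[0-9]+") == "123068");
}
```

Compiling the pattern on every call is wasteful; a real API would probably let callers keep a compiled pattern around.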

Herbert


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


