With ICU installed, we now have fairly complete support for Unicode
string manipulation (at the byte and codepoint levels).
Still to do: string_bitwise_{or,and,xor}.
What should happen if charsets or encodings don't match, or if the
encoding is utf8 or utf16/ucs2, ...?
I think there are basically two options:
1) throw exceptions *)
(which combinations are valid?)
2) just do it and mark the resulting bit mess as binary
*) Any allowed operation would still produce a binary string (except
maybe latin1 <op> latin1 -> latin1).
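
To make the two options concrete, here is a rough sketch in Python (not
Parrot code). Everything in it is hypothetical: the EncodedString type,
the charset labels, the function names, the rule that only identical
charsets are "valid" for option 1, and the zero-padding of the shorter
operand are all assumptions made for illustration, not decisions.

```python
from dataclasses import dataclass
from operator import and_, or_, xor


@dataclass(frozen=True)
class EncodedString:
    """Hypothetical stand-in for a string plus its charset label."""
    data: bytes
    charset: str  # illustrative labels: "latin1", "utf8", "binary", ...


def _bytewise(op, a: bytes, b: bytes) -> bytes:
    # Assumption: pad the shorter operand with zero bytes so that both
    # operands have the same length before combining them.
    n = max(len(a), len(b))
    a = a.ljust(n, b"\0")
    b = b.ljust(n, b"\0")
    return bytes(op(x, y) for x, y in zip(a, b))


def bitwise_strict(op, s: EncodedString, t: EncodedString) -> EncodedString:
    """Option 1: throw on combinations that are not allowed.

    Assumption: only identical charsets are allowed. The result is
    binary, except latin1 <op> latin1, which stays latin1.
    """
    if s.charset != t.charset:
        raise ValueError(f"charset mismatch: {s.charset} vs {t.charset}")
    result = "latin1" if s.charset == "latin1" else "binary"
    return EncodedString(_bytewise(op, s.data, t.data), result)


def bitwise_lenient(op, s: EncodedString, t: EncodedString) -> EncodedString:
    """Option 2: just do it on the raw bytes and mark the result binary."""
    return EncodedString(_bytewise(op, s.data, t.data), "binary")
```

With this sketch, bitwise_strict(or_, ...) on two latin1 strings gives a
latin1 result, while mixing utf8 and latin1 raises under option 1 and
yields a binary string under option 2.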
Any thoughts?
leo