Hi Ariel,

Thanks for your hints. It seems that the class OUString has the methods I need, but I will need some time to test it.

Kind regards
Regina

Ariel Constenla-Haile wrote:
Hello Regina,

On Wed, Apr 08, 2015 at 09:02:06PM +0200, Regina Henschel wrote:
Hi all,

I'm going to improve the MathML type detection. Currently there are files
that could be opened or imported fine if the type detection allowed it.
https://bz.apache.org/ooo/show_bug.cgi?id=126230

I have attached a C++ file to show what I want to do.
The problem is that MathML does not have to be encoded in UTF-8 but can
use any other encoding. For example, the MS Windows "Math Input Control"
exports formulas in UTF-16.

So my question is: which kind of string can I use that is able to
detect and handle UTF-16 and has methods similar to the C++ string methods
find, rfind, insert, substring, clear, erase? Does AOO have such a string
class?

You can use OpenOffice's rtl string and string buffer classes, together
with the lower-level text conversion from
https://www.openoffice.org/api/docs/cpp/ref/names/o-textcvt.h.html

It is possible to get the encoding from the MathML file, or to fall back
to UTF-8 as the default, in case that information is needed to instantiate
a string object.
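
As a rough illustration of "get the encoding from the MathML file or set
default utf-8", the sketch below reads the encoding name from an XML
declaration and falls back to "utf-8" when none is given. It is plain
standard C++, not AOO code: declaredEncoding is a hypothetical helper name,
and it assumes the first bytes of the file are in an ASCII-compatible
encoding (so it would not work as-is on UTF-16 input).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of AOO): returns the lowercased value of the
// encoding pseudo-attribute in the XML declaration, or "utf-8" if the
// declaration or the attribute is missing.
// Assumption: `head` holds the first bytes of the file and those bytes are
// ASCII-compatible (UTF-8, ISO-8859-x, ...).
std::string declaredEncoding(const std::string& head)
{
    const std::string fallback = "utf-8";

    // Skip a UTF-8 byte order mark if one is present.
    const std::string::size_type start =
        (head.compare(0, 3, "\xEF\xBB\xBF") == 0) ? 3 : 0;

    // An XML declaration, if present, must be the very first thing.
    if (head.compare(start, 5, "<?xml") != 0)
        return fallback;
    const std::string::size_type declEnd = head.find("?>", start);
    if (declEnd == std::string::npos)
        return fallback;
    const std::string decl = head.substr(start, declEnd - start);

    // Locate encoding="..." (single or double quotes are both allowed).
    std::string::size_type pos = decl.find("encoding");
    if (pos == std::string::npos)
        return fallback;
    pos = decl.find_first_of("\"'", pos);
    if (pos == std::string::npos)
        return fallback;
    const char quote = decl[pos];
    const std::string::size_type end = decl.find(quote, pos + 1);
    if (end == std::string::npos)
        return fallback;

    std::string name = decl.substr(pos + 1, end - pos - 1);
    for (char& c : name)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return name.empty() ? fallback : name;
}
```

The returned name would still have to be mapped to whatever encoding
identifier the string class expects; that mapping is left out here.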

If the file has no information about its encoding, you will have to
perform some kind of encoding detection, see Writer's ASCII filter for
example:

bool SwIoSystem::IsDetectableText
main/sw/source/filter/basflt/iodetect.cxx

used in sal_uLong SwASCIIParser::ReadChars()
main/sw/source/filter/ascii/parasc.cxx
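
For comparison, a much simpler kind of detection can be sketched in plain
standard C++: look for a byte order mark, and failing that, check whether
the first character looks like a '<' stored as a 16-bit unit. This is not
what SwIoSystem::IsDetectableText does; sniffEncoding and SniffedEncoding
are hypothetical names, and the sketch ignores UTF-32 and any content that
does not start with '<'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for the sketch below.
enum class SniffedEncoding { Utf8, Utf16LE, Utf16BE, Unknown };

// Hypothetical helper (not AOO code): guesses the encoding from the first
// bytes of an XML/MathML file.
SniffedEncoding sniffEncoding(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(bytes.data());

    // 1. Byte order marks.
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return SniffedEncoding::Utf8;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return SniffedEncoding::Utf16LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return SniffedEncoding::Utf16BE;

    // 2. No BOM: a leading '<' stored as one 16-bit code unit.
    if (n >= 2 && p[0] == '<' && p[1] == 0x00)
        return SniffedEncoding::Utf16LE;
    if (n >= 2 && p[0] == 0x00 && p[1] == '<')
        return SniffedEncoding::Utf16BE;

    // 3. Nothing conclusive; the caller has to decide (for example by
    //    reading the XML declaration or defaulting to UTF-8).
    return SniffedEncoding::Unknown;
}
```

When the result is Unknown, the caller would fall back to the declared
encoding or to a UTF-8 default.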

Searching for rtl_convertTextToUnicode in OpenGrok might give other useful
hints.


Regards


