Hello!
I have a strange problem. I am trying to parse XML files and extract some information from them. I use Jonathan M Davis's dxml library for this. The problem is that I have multiple XML files made by different people around the world. Some of these files were created with a byte order mark (BOM), and some without. dxml expects no BOM at the start of the string. At first I tried to read the file with std.file.readText. It looks like it doesn't decode the file in any way and doesn't remove the BOM, so dxml fails to parse it. This seems strange to me, because I would expect a "text" function to decode the data to UTF-8. Then I read that this behavior is at least documented:
"""
...However, no width or endian conversions are performed. So, if the width or endianness of the characters in the given file differ from the width or endianness of the element type of S, then validation will fail.
"""
So that's OK. But I realized that "readText" is not useful for me, so I tried plain "read", which returns "void[]". The problem is that I still don't understand which function I should use to convert this to a string with proper UTF-8 decoding, BOM removal, and so on.
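To make the question concrete, here is a minimal sketch of what I think I need for the UTF-8 case. The helper name utf8WithoutBom is my own invention (it is not part of Phobos or dxml). It assumes the data is already UTF-8, with or without a BOM, and it does not handle UTF-16 or UTF-32 files, which would need real transcoding. It only relies on std.utf.validate and a manual check for the bytes EF BB BF.

```d
import std.utf : validate;

/// Hypothetical helper, not part of Phobos or dxml.
/// Assumes `raw` holds UTF-8 text, optionally preceded by a UTF-8 BOM.
string utf8WithoutBom(const(void)[] raw)
{
    static immutable ubyte[] utf8Bom = [0xEF, 0xBB, 0xBF];

    auto bytes = cast(const(ubyte)[]) raw;

    // Drop the three BOM bytes if the data starts with them.
    if (bytes.length >= utf8Bom.length && bytes[0 .. utf8Bom.length] == utf8Bom)
        bytes = bytes[utf8Bom.length .. $];

    // Copy into an immutable buffer so the result is a proper string.
    auto text = cast(string) bytes.idup;

    // Throws std.utf.UTFException if the bytes are not valid UTF-8.
    validate(text);
    return text;
}

void main()
{
    // "\uFEFF" is stored as the bytes EF BB BF in a UTF-8 string literal.
    assert(utf8WithoutBom("\uFEFF<root/>") == "<root/>");
    assert(utf8WithoutBom("<root/>") == "<root/>");
}
```

In real use the bytes would come from std.file.read instead of a literal. Is there something in Phobos that already does this, ideally for other encodings too?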
Could you please help me clear this up?
P.S. The readText function looks odd in std.file, because you cannot specify an encoding to decode the file with, and the logic of how it decodes is unclear...
