Hi all,

I have a question about how encoding is handled in Xerces-C. Let's say I have an XML document whose header (the XML declaration) is <?xml version="1.0" encoding="UTF-8"?>, and assume the document body is correctly encoded in UTF-8. I create a SAX2XMLReader and pass it a LocalFileInputSource(myDoc) to do the parsing, so that I receive a series of SAX events. If I understand correctly, the Xerces parser gets the document encoding from the header, which is UTF-8 in this case. But different XML documents may use different encodings. So here are my questions:

(1) For the value of a simple-type element or attribute in the SAX events I receive, what encoding should I assume?
(2) And for the tag names?
(3) Is there a way to get the encoding information from the parser (for example, from the SAX2XMLReader I created)?

I need the encoding information because my application, which uses Xerces-C to parse XML files, can be configured to run in different code pages, for example UTF-8 or WINDOWS-1252. So after an input XML document is parsed by Xerces, I first need to convert the attribute/element values from their original code page into my application's code page.

Thanks in advance for your help.

Frank
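
P.S. To make the last step more concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes, purely for illustration, that a value arrives as UTF-8 bytes (which is exactly the part I am unsure about) and that the application code page is WINDOWS-1252. The function name utf8ToWindows1252 is hypothetical and not part of Xerces-C, and the mapping is deliberately simplified: only ASCII and U+00A0..U+00FF are converted, everything else becomes '?'.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not a Xerces-C API.
// Converts UTF-8 bytes to WINDOWS-1252 under a simplifying assumption:
// only U+0000..U+007F and U+00A0..U+00FF are mapped (these code points have
// the same byte value in WINDOWS-1252); any other character becomes '?'.
// Overlong UTF-8 forms are not rejected in this sketch.
std::string utf8ToWindows1252(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;

        // Determine the sequence length from the lead byte.
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else { out += '?'; ++i; continue; }  // invalid lead byte

        if (i + len > in.size()) { out += '?'; break; }  // truncated input

        // Fold in the continuation bytes (each must look like 10xxxxxx).
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out += '?'; ++i; continue; }
        i += len;

        // Emit the byte if it falls in the directly mappable ranges.
        if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)) {
            out += static_cast<char>(static_cast<unsigned char>(cp));
        } else {
            out += '?';
        }
    }
    return out;
}
```

For example, I would call utf8ToWindows1252(value) on each attribute or element value before handing it to the rest of the application. What I do not know is which source encoding to convert from, hence the questions above.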