Hi,
I am new to the mailing list. I am trying to parse XML
document files with the DOMParser. I am using release 1.03.
My code looks like this:
    InputSource source = new InputSource(in);
    DOMParser parser = new DOMParser();
    parser.parse(source);
    doc = parser.getDocument();

"in" is an InputStream.
My file includes Swedish characters.
When I try to read and parse my XML files with the
preceding code, I get the following error:
sun.io.ByteToCharUTF-16
    at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1155)
    at lm.fsd.XercesImplementation.stream2DOM(XercesImplementation.java:195)
    at lm.fsd.DOMinatorTest.main(DOMinatorTest.java:149)
    at symantec.tools.debug.MainThread.run(Agent.java:48)

Without the Swedish characters, the parsing works well.
How can I set the encoding (UTF-16 for example) even if my stream is a
character stream? It seems that the DOMParser doesn't extract, by itself,
the encoding which is written in the XML document...
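
To make the question concrete, here is a rough sketch of one approach I am considering: decoding the bytes myself with an InputStreamReader and handing the parser a character stream, so that it no longer has to work out the encoding. This is only a sketch under assumptions: it uses the standard JAXP DocumentBuilder rather than the Xerces DOMParser so that it is self-contained, the names EncodingSketch and parseWithEncoding are made up for illustration, and I do not know whether it behaves the same way with release 1.03.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodingSketch {

    // Hypothetical helper: decode the byte stream with an explicit encoding
    // and give the parser a Reader (a character stream) instead of raw bytes.
    static Document parseWithEncoding(InputStream in, String encoding) throws Exception {
        Reader reader = new InputStreamReader(in, encoding);
        InputSource source = new InputSource(reader);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Sample document containing a Swedish character (\u00f6 is "o" with diaeresis).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><stad>Malm\u00f6</stad>";
        byte[] bytes = xml.getBytes("UTF-16"); // Java writes a byte order mark first

        Document doc = parseWithEncoding(new ByteArrayInputStream(bytes), "UTF-16");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The drawback of this approach is that the caller has to know the encoding in advance, instead of the parser reading it from the XML declaration.
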
I have tried InputSource.setEncoding() but it doesn't change anything.
I suppose that there is a way to handle this type of encoding
problem. I tried to find the solution on the mailing list, but
I haven't found any answers corresponding completely to my
issue. Maybe I missed it, or maybe you have an idea and can help
me.
If so, any advice is welcome.
Thank you in advance
Jean-Guillaume LALANNE
Application developer - LARGEMEDIUM AB