Hi All: Does anyone have expertise with BOMInputStream?
I know that some XML parsers (like the one shipped with the Oracle JRE) do not detect UTF-32 BOMs (UTF-8 and UTF-16 BOMs are OK), but using BOMInputStream is supposed to fix the issue.

These tests, which I added and marked @Ignore, fail:
- org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Be()
- org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Le()

More basic tests do work:
- org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Be()
- org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Le()

When I look at the Oracle JRE (which uses a copy of Xerces), I see code to deal with UCS-4, which is the precursor to UTF-32, much as UCS-2 is the precursor to UTF-16. But as the tests show, Xerces fails to parse a UTF-32 document.

Any thoughts?

Thank you,
Gary

--
E-Mail: [email protected] | [email protected]
JUnit in Action, 2nd Ed: http://bit.ly/ECvg0
Spring Batch in Action: <http://s.apache.org/HOq> http://bit.ly/bqpbCK
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
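
For anyone who wants to reproduce the BOM handling without Commons IO on the classpath, here is a minimal, JDK-only sketch of the idea. It is not how BOMInputStream is implemented and not what the ignored tests do. The class and method names (Utf32BomSketch, detectBom, decode, rootName) are hypothetical, and it assumes the JRE provides the UTF-32BE and UTF-32LE charsets, which the Java specification does not guarantee. It sniffs the BOM, decodes the bytes itself, and hands the parser characters so the parser never has to auto-detect UTF-32.

```java
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class Utf32BomSketch {

    // Returns the charset name implied by a leading BOM, or null if none is found.
    // UTF-32 is checked before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
    // starts with the UTF-16LE BOM (FF FE).
    static String detectBom(byte[] data) {
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    // Strips the BOM (if any) and decodes the remaining bytes.
    // Falls back to UTF-8 when no BOM is present (an assumption of this sketch).
    static String decode(byte[] data) {
        String name = detectBom(data);
        if (name == null) {
            return new String(data, StandardCharsets.UTF_8);
        }
        int skip = name.startsWith("UTF-32") ? 4 : name.equals("UTF-8") ? 3 : 2;
        return new String(data, skip, data.length - skip, Charset.forName(name));
    }

    // Parses the decoded text and returns the root element name.
    // The parser receives characters, so it performs no encoding detection.
    static String rootName(byte[] data) throws Exception {
        InputSource source = new InputSource(new StringReader(decode(data)));
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source)
                .getDocumentElement()
                .getTagName();
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Build a UTF-32BE document by hand: BOM followed by the encoded text.
        byte[] body = "<root/>".getBytes(Charset.forName("UTF-32BE"));
        byte[] doc = new byte[4 + body.length];
        doc[2] = (byte) 0xFE;
        doc[3] = (byte) 0xFF;
        System.arraycopy(body, 0, doc, 4, body.length);

        System.out.println(detectBom(doc));
        System.out.println(rootName(doc));
    }
}
```

This sidesteps the parser's encoding detection entirely, so it only demonstrates that the bytes themselves are fine. It does not explain why Xerces rejects the same bytes when given them as a stream.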
