The bottom line in all of this is that no Xerces version seems to handle UTF-32, even though I see a UCSReader class in there.
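
If the parser will not sniff a UTF-32 BOM itself, one possible workaround is to detect the BOM by hand and decode the bytes before the parser sees them. The following is only a minimal sketch using the JDK alone: the class `Utf32BomSketch` and its methods are hypothetical names made up for illustration, not part of Commons IO or Xerces, and it assumes the whole document fits in a byte array and that the JRE provides the UTF-32BE and UTF-32LE charsets.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Commons IO or Xerces. */
public final class Utf32BomSketch {

    // UTF-32 byte order marks: U+FEFF encoded big-endian and little-endian.
    private static final byte[] BOM_UTF32_BE = {0x00, 0x00, (byte) 0xFE, (byte) 0xFF};
    private static final byte[] BOM_UTF32_LE = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00};

    private Utf32BomSketch() {
    }

    /** Returns the UTF-32 charset named by a leading BOM, or null if there is none. */
    public static Charset detectUtf32(byte[] data) {
        if (data.length < 4) {
            return null;
        }
        // Compare all four bytes: the UTF-32LE BOM starts with the same two
        // bytes (FF FE) as the UTF-16LE BOM, so a two-byte check is not enough.
        byte[] head = Arrays.copyOf(data, 4);
        if (Arrays.equals(head, BOM_UTF32_BE)) {
            return Charset.forName("UTF-32BE");
        }
        if (Arrays.equals(head, BOM_UTF32_LE)) {
            return Charset.forName("UTF-32LE");
        }
        return null;
    }

    /** Decodes data, skipping a UTF-32 BOM if present; otherwise uses the fallback charset. */
    public static String decode(byte[] data, Charset fallback) {
        Charset charset = detectUtf32(data);
        if (charset == null) {
            return new String(data, fallback);
        }
        // Skip the four BOM bytes so U+FEFF does not end up in the text.
        return new String(data, 4, data.length - 4, charset);
    }
}
```

The decoded text could then be wrapped in a `java.io.StringReader` and handed to the parser through `org.xml.sax.InputSource`, which accepts a character stream, so the parser would not need to work out the byte encoding on its own. I have not tried this against the @Ignore'd tests.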

Gary

On Aug 11, 2012, at 7:40, sebb <seb...@gmail.com> wrote:

> On 10 August 2012 18:44, Gary Gregory <garydgreg...@gmail.com> wrote:
>> Hi All:
>>
>> Does anyone have expertise with BOMInputStream?
>>
>> I know that some XML parsers (like the one shipped with the Oracle JRE) do
>> not detect UTF-32 BOMs (UTF-8 and UTF-16 BOMs are OK) but using
>> BOMInputStream is supposed to fix the issue.
>>
>> These tests I added and @Ignore'd fail:
>>
>> - org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Be()
>> - org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Le()
>>
>> More basic tests do work:
>>
>> - org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Be()
>> - org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Le()
>>
>> When I look at the Oracle JRE (which uses a copy of Xerces) I see code to
>
> OT to this thread, but note that the Oracle version of Xerces was
> forked from Apache Xerces a long time ago, and is very different from
> the current Xerces code.
> It's also in a different package name space: com.sun.org.apache.xerces.*
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org