The bottom line in all of this is that no Xerces version seems to
handle UTF-32, even though I see a UCSReader class in there.
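
For anyone following along, here is a minimal sketch of the kind of
workaround this points to: sniff the UTF-32 BOM yourself, strip it, and
decode the bytes before the parser ever sees them. This is plain JDK code,
not the BOMInputStream implementation, and the class and method names
(Utf32BomSniffer, detectUtf32, decode) are made up for illustration. It
assumes the JRE provides the UTF-32BE and UTF-32LE charsets.

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only; not part of Commons IO or Xerces.
final class Utf32BomSniffer {

    private Utf32BomSniffer() {
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns "UTF-32BE", "UTF-32LE", or null when no UTF-32 BOM is present.
    // Caveat: FF FE 00 00 could also be a UTF-16LE BOM followed by U+0000;
    // this sketch always treats it as UTF-32LE.
    static String detectUtf32(byte[] data) {
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) {
            return "UTF-32BE";
        }
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) {
            return "UTF-32LE";
        }
        return null;
    }

    // Decodes data, skipping a 4-byte UTF-32 BOM if there is one.
    // Without a UTF-32 BOM the bytes are decoded unchanged with the fallback.
    static String decode(byte[] data, Charset fallback) {
        String name = detectUtf32(data);
        if (name == null) {
            return new String(data, fallback);
        }
        return new String(data, 4, data.length - 4, Charset.forName(name));
    }
}
```

The decoded String could then be wrapped in a java.io.StringReader and
handed to the parser through org.xml.sax.InputSource, so that the parser
works from characters and does no encoding detection of its own. I have
not tried this against the failing tests, so treat it as a sketch.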

Gary

On Aug 11, 2012, at 7:40, sebb <seb...@gmail.com> wrote:

> On 10 August 2012 18:44, Gary Gregory <garydgreg...@gmail.com> wrote:
>> Hi All:
>>
>> Does anyone have expertise with BOMInputStream?
>>
>> I know that some XML parsers (like the one shipped with the Oracle JRE) do
>> not detect UTF-32 BOMs (UTF-8 and UTF-16 BOMs are OK) but using
>> BOMInputStream is supposed to fix the issue.
>>
>> These tests I added and @Ignore'd fail:
>>
>>   - org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Be()
>>   - org.apache.commons.io.input.BOMInputStreamTest.testReadXmlWithBOMUtf32Le()
>>
>> More basic tests do work:
>>
>>   - org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Be()
>>   - org.apache.commons.io.input.BOMInputStreamTest.testReadWithBOMUtf32Le()
>>
>> When I look at the Oracle JRE (which uses a copy of Xerces) I see code to
>
> OT to this thread, but note that the Oracle version of Xerces was
> forked from Apache Xerces a long time ago, and is very different from
> the current Xerces code.
> It's also in a different package name space: com.sun.org.apache.xerces.*
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>