Hello,
It looks like XmlStreamReader is not correctly handling several encodings
in Commons IO 2.14.0 that previously worked in version 2.13.0.
Here's a self-contained snippet (Kotlin) that demonstrates the problem:
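import java.nio.charset.Charset
import org.apache.commons.io.input.XmlStreamReader
import io.kotest.matchers.shouldBe  // Kotest matcher used for the assertion

// Encode the document as code page 437 and let XmlStreamReader detect it from the prolog.
val xml = "<?xml version='1.0' encoding='437'?><root>Ç</root>"
val stream = xml.byteInputStream(Charset.forName("437"))
val reader = XmlStreamReader.builder()
    .setInputStream(stream)
    .setLenient(false)
    .get()
reader.readText() shouldBe xml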
With 2.13.0 this code works fine, but in 2.14.0 the "Ç" (C-cedilla) becomes
a "�" (Unicode replacement character).
We're seeing similar issues with all of the other code page encodings we've
tried (850, 852, 855, 857, 860, 861, 862, 863, 865, and 866).
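
To help separate the JVM's own charset support from XmlStreamReader's behaviour, here is a rough baseline check that uses only the JDK and no Commons IO. The helper name roundTripsWithJdk is made up for this sketch, and it assumes the code page names above resolve as charset aliases on the JVM in use. It is only meant as a sanity check, not as a diagnosis of the regression:

```kotlin
import java.nio.charset.Charset

// Code page names taken from the report above.
val codePages = listOf("437", "850", "852", "855", "857", "860", "861", "862", "863", "865", "866")

// Hypothetical helper: encode and decode a sample with the JDK only.
// Returns false if the charset name is unknown to this JVM or the text does not survive.
fun roundTripsWithJdk(charsetName: String, sample: String): Boolean {
    if (!Charset.isSupported(charsetName)) return false
    val charset = Charset.forName(charsetName)
    val decoded = sample.toByteArray(charset).inputStream().reader(charset).readText()
    return decoded == sample
}

fun main() {
    // "Ç" is part of code page 437; other code pages may need a different sample character.
    println("437 round trip: " + roundTripsWithJdk("437", "Ç"))
    // Report which of the listed names this JVM can resolve at all.
    for (name in codePages) {
        println("$name supported: ${Charset.isSupported(name)}")
    }
}
```

If this baseline round-trips correctly on the same JVM where the snippet above fails, that would suggest the problem lies in how the reader picks or applies the encoding rather than in the JDK charset itself.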