On Wed, 5 Jan 2022 at 14:05, Lewis John McGibbney <[email protected]> wrote:
>
> Hi Peter,
>
> On 2022/01/04 23:38:41 Peter Ansell wrote:
>
> > Tracing the error in a debugger shows that the correct result,
> > "ISO-8859-1" is found by the meta tag detection method, but it is then
> > overridden with "windows-1252" because a carriage return character is
> > detected by the following code because Git on WSL2 is checking the
> > code out using Windows line-endings:
> >
> > https://github.com/apache/any23/blob/any23-2.6/encoding/src/main/java/org/apache/any23/encoding/EncodingUtils.java#L62-L69
>
> Very interesting. Do you think a JIRA issue and patch is required here or is
> that what you would expect? It sounds like the latter to me but as I have not
> touched Windows for quite some time your confirmation would be appreciated.
> Thank you

I don't think it should block a release, but I can't validate it right now,
as I only have Windows and WSL2 Ubuntu on the machine I have access to and
it is affecting both.

I have created an issue to track fixing the use of `\r` to heuristically
override a specifically labelled encoding:

https://issues.apache.org/jira/browse/ANY23-554

Thanks,

Peter
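
For anyone following the thread without the source open, here is a rough sketch of the behaviour described above. It is not the Any23 code: the class and method names are hypothetical, and it only assumes what the thread states, namely that a declared "ISO-8859-1" label gets replaced by "windows-1252" once a carriage return is seen in the content. The second method shows one possible alternative in which an explicit label is left alone.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Any23.
final class CharsetOverrideSketch {

    // True if the raw bytes contain a carriage return (0x0D), e.g. from CRLF line endings.
    static boolean containsCarriageReturn(byte[] content) {
        for (byte b : content) {
            if (b == '\r') {
                return true;
            }
        }
        return false;
    }

    // Assumed shape of the heuristic discussed in the thread: a declared
    // ISO-8859-1 label is swapped for windows-1252 whenever a '\r' is present.
    static Charset resolveWithCrHeuristic(Charset declared, byte[] content) {
        if (StandardCharsets.ISO_8859_1.equals(declared) && containsCarriageReturn(content)) {
            return Charset.forName("windows-1252");
        }
        return declared;
    }

    // One possible alternative (an assumption, not a committed fix): an explicit
    // label always wins, and the '\r' heuristic applies only when nothing was declared.
    static Charset resolvePreferringLabel(Charset declared, byte[] content) {
        if (declared != null) {
            return declared;
        }
        return containsCarriageReturn(content)
                ? Charset.forName("windows-1252")
                : StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        // The same document checked out with Windows (CRLF) and Unix (LF) line endings.
        byte[] crlf = "<meta charset=\"ISO-8859-1\">\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] lf = "<meta charset=\"ISO-8859-1\">\n".getBytes(StandardCharsets.US_ASCII);

        System.out.println(resolveWithCrHeuristic(StandardCharsets.ISO_8859_1, crlf));
        System.out.println(resolveWithCrHeuristic(StandardCharsets.ISO_8859_1, lf));
        System.out.println(resolvePreferringLabel(StandardCharsets.ISO_8859_1, crlf));
    }
}
```

With the first method the result depends on how Git checked the file out, which would explain a test passing on one platform and failing on another; with the second it does not.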
