On Wed, 5 Jan 2022 at 14:05, Lewis John McGibbney <[email protected]> wrote:
>
> Hi Peter,
>
> On 2022/01/04 23:38:41 Peter Ansell wrote:
>
> > Tracing the error in a debugger shows that the correct result,
> > "ISO-8859-1" is found by the meta tag detection method, but it is then
> > overridden with "windows-1252". Git on WSL2 is checking the code out
> > with Windows line endings, so the following code detects a carriage
> > return character:
> >
> > https://github.com/apache/any23/blob/any23-2.6/encoding/src/main/java/org/apache/any23/encoding/EncodingUtils.java#L62-L69
>
> Very interesting. Do you think a JIRA issue and patch is required here or is
> that what you would expect? It sounds like the latter to me but as I have not
> touched Windows for quite some time your confirmation would be appreciated.
> Thank you

I don't think it should block a release, but I can't validate it right
now as I only have Windows and WSL2 Ubuntu on the machine I have
access to and it is affecting both.

I have created an issue to track fixing the use of `\r` to
heuristically override a specifically labelled encoding:

https://issues.apache.org/jira/browse/ANY23-554
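
As a rough sketch of what such a fix could look like: an explicitly
declared encoding wins, and the `\r` heuristic only applies when nothing
was declared. This is not the Any23 code. The class `EncodingSketch` and
the methods `resolve` and `containsCarriageReturn` are hypothetical names
made up for illustration, and the real detection logic may be structured
quite differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch, not the actual Any23 EncodingUtils implementation.
public class EncodingSketch {

    static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Heuristic input: does the raw content contain a carriage return byte?
    static boolean containsCarriageReturn(byte[] content) {
        for (byte b : content) {
            if (b == '\r') {
                return true;
            }
        }
        return false;
    }

    // declared: encoding explicitly labelled in the document (e.g. a meta tag), if any.
    // guessed:  encoding produced by statistical detection.
    static Charset resolve(Optional<Charset> declared, Charset guessed, byte[] content) {
        // An explicit label is never overridden by the line-ending heuristic.
        if (declared.isPresent()) {
            return declared.get();
        }
        // Without a label, a carriage return nudges ISO-8859-1 towards windows-1252.
        if (StandardCharsets.ISO_8859_1.equals(guessed) && containsCarriageReturn(content)) {
            return WINDOWS_1252;
        }
        return guessed;
    }
}
```

With this ordering, a test resource checked out with CRLF line endings
would still resolve to its declared "ISO-8859-1", while unlabelled
content keeps the existing heuristic.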

Thanks,

Peter
