Control: retitle -1 uchardet: misdetects UTF-8 text with a few multi-byte chars
Control: tags -1 fixed-upstream

Hi,

I can confirm this bug still exists in 0.0.8, and running
"uchardet < broken" also gives the wrong encoding.

I think upstream has fixed it in this commit:
https://gitlab.freedesktop.org/uchardet/uchardet/-/commit/bed459c6e75e8a5be59ccd9bc80ac76c0bb8dbeb

I'm a little hesitant to just apply it, though, because there have been
a lot of changes upstream related to UTF-8 detection and it might not
work well without the other commits.

James
