Control: retitle -1 uchardet: misdetects UTF-8 text with a few multi-byte chars
Control: tags -1 fixed-upstream

Hi,

I can confirm this bug still exists in 0.0.8, and running "uchardet < broken" also gives the wrong encoding.

I think upstream has fixed it in this commit:
https://gitlab.freedesktop.org/uchardet/uchardet/-/commit/bed459c6e75e8a5be59ccd9bc80ac76c0bb8dbeb

I'm a little hesitant to just apply it, though, because there have been a lot of changes upstream related to UTF-8 detection, and the fix might not work correctly without the other commits.

James