On Wed, 18 Jul 2018 16:52:54 +0200 Geert Uytterhoeven <[email protected]> 
wrote:

> As Perl uses its own internal character encoding, always calling
> encode("utf8", ...) on the author name may cause corruption, leading to
> an author signoff mismatch.
> 
> This happens in the following cases:
>   - If a patch is in ISO-8859, and contains a non-ASCII author name in
>     the From: line, it is converted to UTF-8, while the Signed-off-by
>     line will still be in ISO-8859.
>   - If a patch is in UTF-8, and contains a non-ASCII author name in the
>     body (not header) From: line, it is assumed to be encoded in Perl's
>     internal character encoding, and converted to UTF-8 incorrectly,
>     while the Signed-off-by line will be in real UTF-8.
> 
> Fix this by only doing the encode step if the From: line used UTF-8
> quoted printable encoding.

Works for me, thanks.


Relatedly, would it be worth adding a checkpatch warning if a patch
contains anything other than ASCII or UTF-8?

I added this to my little local patch-checking script.

        # Quote $p so that paths containing spaces are handled correctly.
        if ! file "$p" | grep -q -E "ASCII text|Unicode text"
        then
                echo "$p: weird charset"
        fi
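
The check above relies on file(1)'s content heuristics. A stricter
variant, as a rough sketch only, could ask iconv to decode each patch as
UTF-8 and flag anything it rejects. The function name check_charset is
made up for illustration and is not part of checkpatch. Pure ASCII is a
subset of UTF-8, so it passes too.

```shell
#!/bin/sh
# Sketch only: check_charset is a hypothetical helper, not part of checkpatch.
# iconv exits non-zero when its input is not valid UTF-8, so any patch
# it rejects is neither ASCII nor UTF-8.
check_charset() {
        for p in "$@"
        do
                if ! iconv -f UTF-8 -t UTF-8 "$p" > /dev/null 2>&1
                then
                        echo "$p: weird charset"
                fi
        done
}
```

It would be called as "check_charset *.patch". Unlike the file(1)
version it does not care what else file(1) thinks the input looks like,
only whether the bytes decode as UTF-8.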
