Hi!

On Tue, 2024-01-09 at 15:39:33 +0100, Peter Krefting wrote:
> Package: dpkg-dev
> Version: 1.21.22
> Severity: wishlist
> Tags: l10n

> With PERL5OPTS=-Mutf8 and PERL_UNICODE=SDL set in environment [1], output from
> dpkg-buildpackage (and others) is garbled ("double" UTF-8 encoding):
> 
>   $ dpkg-buildpackage --version
>   Debian dpkg-buildpackage version 1.21.22.
> 
>   Detta program är fri programvara. Se GNU General Public License version 2
>   eller senare för kopieringsvillkor. Det finns INGEN garanti.
> 
> Unsetting PERL5OPTS fixes it:
> 
>   $ bash -c "unset PERL_UNICODE; dpkg-buildpackage --version"
>   Debian dpkg-buildpackage version 1.21.22.
> 
>   Detta program är fri programvara. Se GNU General Public License version 2
>   eller senare för kopieringsvillkor. Det finns INGEN garanti.

Right, dpkg does not currently set its streams to be UTF-8. But I
agree it probably should.

> [1] As per https://stackoverflow.com/a/6163129

Ah, yes, that page is great, I've had it in my bookmarks for a long
time. :D

In any case I started looking into this the other day, and the first
blocker is that adding «use open qw(:encoding(UTF-8) :std);» is not
enough: the gettext code also needs to be switched to its
object-oriented methods, which handle encoding according to the locale
automatically, otherwise we still get doubly encoded output. I've got
some of this in a branch but…

…my concern is whether those two changes alone will be enough, or
if we'll get botched input/output, like we did around 2008, when a
change similar in spirit was made for dpkg-genchanges, dpkg-gencontrol
and dpkg-source. So I'll need to check all this thoroughly, add new
test cases, etc.
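
To illustrate the double-encoding problem described above, here is a
minimal sketch using only core Perl. It is not the dpkg code:
fake_gettext_octets() is a hypothetical stand-in for a gettext call that
returns already-encoded UTF-8 octets, and decode() stands in for what a
character-aware gettext interface would be expected to do. The
in-memory filehandle plays the role of STDOUT with an :encoding(UTF-8)
layer applied.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a gettext call that returns UTF-8 *octets*
# ("för", with ö already encoded as the two bytes C3 B6).
sub fake_gettext_octets { return encode('UTF-8', "f\x{f6}r") }

# Write $text through an :encoding(UTF-8) layer (as «use open ... :std»
# would set up for STDOUT) and return the bytes that came out.
sub emit {
    my ($text) = @_;
    my $buf = '';
    open my $fh, '>:encoding(UTF-8)', \$buf or die "open: $!";
    print {$fh} $text;
    close $fh or die "close: $!";
    return $buf;
}

# Octets go in and the layer encodes them again: double encoding.
my $double = emit(fake_gettext_octets());

# Characters go in and the layer encodes them exactly once.
my $single = emit(decode('UTF-8', fake_gettext_octets()));

printf "octets in:     %vX\n", $double;   # 66.C3.83.C2.B6.72
printf "characters in: %vX\n", $single;   # 66.C3.B6.72
```

The first line shows the bytes C3 and B6 each being re-encoded by the
output layer, which is the same kind of garbling as in the report; the
second shows the intended single encoding.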

Thanks,
Guillem
