Hi!

On Tue, 2024-01-09 at 15:39:33 +0100, Peter Krefting wrote:
> Package: dpkg-dev
> Version: 1.21.22
> Severity: wishlist
> Tags: l10n
>
> With PERL5OPTS=-Mutf8 and PERL_UNICODE=SDL set in environment [1], output from
> dpkg-buildpackage (and others) is garbled ("double" UTF-8 encoding):
>
> $ dpkg-buildpackage --version
> Debian dpkg-buildpackage version 1.21.22.
>
> Detta program är fri programvara. Se GNU General Public License version 2
> eller senare för kopieringsvillkor. Det finns INGEN garanti.
>
> Unsetting PERL5OPTS fixes it:
>
> $ bash -c "unset PERL_UNICODE; dpkg-buildpackage --version"
> Debian dpkg-buildpackage version 1.21.22.
>
> Detta program är fri programvara. Se GNU General Public License version 2
> eller senare för kopieringsvillkor. Det finns INGEN garanti.

Right, dpkg does not currently set its streams to be UTF-8. But I agree
it probably should.

> [1] As per https://stackoverflow.com/a/6163129

Ah, yes, that page is great, I've had it in my bookmarks for a long
time. :D

In any case, I started looking into this the other day, and the first
blocker is that adding «use open qw(:encoding(UTF-8) :std);» is not
enough, as the gettext code needs to be switched to its object-oriented
methods, which handle encoding according to the locale automatically;
otherwise we also get doubly encoded output. I've got some of this in
a branch but…

…my concern is whether just those two things will be enough, or if
we'll get botched input/output, like we did around 2008, when a change
similar in spirit was made for dpkg-genchanges, dpkg-gencontrol and
dpkg-source. So I'll need to check all this thoroughly, and add new
test cases, etc.

Thanks,
Guillem
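
PS: To illustrate the double encoding discussed above, here is a minimal sketch using only core Perl. It is not dpkg code. The `$raw` string is a hypothetical stand-in for what a procedural gettext call hands back in a UTF-8 locale (undecoded bytes), and `emit_utf8` is an illustrative helper that mimics a stream with a UTF-8 encoding layer by writing to an in-memory buffer.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a translated message returned as raw UTF-8
# bytes (the "\xc3\xa4" pair is the UTF-8 encoding of "ä").
my $raw = "Detta program \xc3\xa4r fri programvara.";

# Illustrative helper: write a string through a UTF-8 encoding layer
# into an in-memory buffer and return the bytes that were produced.
sub emit_utf8 {
    my ($text) = @_;
    my $buf = '';
    open my $fh, '>:encoding(UTF-8)', \$buf
        or die "cannot open in-memory buffer: $!";
    print {$fh} $text;
    close $fh or die "cannot close in-memory buffer: $!";
    return $buf;
}

# Undecoded bytes pushed through the layer are encoded a second time.
my $double = emit_utf8($raw);

# Decoding to a Perl character string first gives correct output.
my $single = emit_utf8(decode('UTF-8', $raw));

printf "raw: %d bytes, double: %d bytes, single: %d bytes\n",
    length $raw, length $double, length $single;
print "single matches raw: ", ($single eq $raw ? "yes" : "no"), "\n";
```

The decode step is what the switch to the object-oriented gettext interface is meant to take care of automatically, so that the strings reaching an encoding-aware stream are already character strings.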