UTF-8 should be the right format for doc-base files, according to https://lintian.debian.org/tags/doc-base-file-uses-obsolete-national-encoding.html

I also don't know Ruby, but from my research, setting Encoding.default_external is considered the "wrong" thing to do; the "right" way is to pass "-E UTF-8" to ruby on the command line or via the RUBYOPT environment variable. I had to explicitly silence a warning because of this. See
http://docs.ruby-lang.org/en/2.1.0/Encoding.html#method-c-default_external-3D

However, neither of those "right" ways to set the encoding works well when a Ruby file is run directly as a script. (Is Ruby not intended to be used in scripts?!)

The Ruby docs say the problem arises if code runs before the encoding is changed. That's avoidable, and I believe I avoided it in my patch by placing the encoding change before any require statements.

An alternative is to explicitly set the encoding to UTF-8 each time a file is opened. If someone feels that's a better way, I'm willing to do that and create a new patch. But like I said, I don't know Ruby, so I can't guarantee correctness beyond trying it and seeing that it works.

- Dan

On Sun, Dec 7, 2014 at 2:06 PM, gregor herrmann <gre...@debian.org> wrote:

> Control: tag -1 - moreinfo
> Control: tag -1 + confirmed
>
> On Sat, 06 Dec 2014 01:33:58 -0100, Daniel Getz wrote:
>
> I can reproduce the problem with
> LC_ALL=C LANG=C /etc/cron.weekly/dhelp
>
> > Attached is a diff with a change to dhelp_parse.rb which sets
> > Encoding.default_external explicitly, so that even if LANG=C, it uses UTF-8
> > instead of US-ASCII as the default for opening files. By my (limited)
> > understanding of Encoding.default_external, this should have the same
> > effect on opening files as replacing LANG=C with LANG=xx_XX.UTF-8 would.
> >
> > On my machine, without the patch, I see the same errors with LANG=C as the
> > others here. With the patch, I do not.
>
> Works for me as well.
>
> Since I don't speak any ruby I'm a bit hesitant to upload; maybe some
> ruby speaker knowing Encoding.default_external can confirm that's the
> correct way forwards?
>
> (And: Are we sure all doc-base files are us-ascii or utf-8 encoded?
> At least on my machine they are, so maybe that's a non-concern.)
>
> Cheers,
> gregor
>
> --
>  .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
>  : :' : Debian GNU/Linux user, admin, and developer - http://www.debian.org/
>  `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
>    `-   NP: Treibhaus: Yellowman Jamaica
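
P.S. To make the two options from my reply concrete, here is a minimal sketch. It is not the actual patch to dhelp_parse.rb. The helper names (set_default_external_quietly, read_as_utf8) and the sample file contents are made up for illustration; the only real Ruby pieces are Encoding.default_external, $VERBOSE, and the encoding option of File.read. As far as I can tell, Ruby only prints the warning for the assignment in verbose mode, which is why the sketch lowers $VERBOSE around it.

```ruby
require "tempfile"

# Sketch only: helper names are hypothetical, not taken from dhelp_parse.rb.

# Approach 1: change the process-wide default once, as early as possible
# (ideally before any other `require`), so later file opens pick it up.
# Ruby may warn about this assignment when $VERBOSE is true, so verbosity
# is lowered for the duration of the call and restored afterwards.
def set_default_external_quietly(encoding)
  saved_verbose = $VERBOSE
  $VERBOSE = nil
  Encoding.default_external = encoding
ensure
  $VERBOSE = saved_verbose
end

# Approach 2: leave the default alone and name the encoding on every read.
def read_as_utf8(path)
  File.read(path, encoding: "UTF-8")
end

if __FILE__ == $PROGRAM_NAME
  Tempfile.create("doc-base-sample") do |f|
    # Write raw UTF-8 bytes containing one non-ASCII character.
    File.binwrite(f.path, "Title: caf\u00e9\n")

    # Simulate LANG=C, where the default external encoding is US-ASCII.
    set_default_external_quietly(Encoding::US_ASCII)
    puts "default US-ASCII, plain read valid?    #{File.read(f.path).valid_encoding?}"
    puts "default US-ASCII, explicit read valid? #{read_as_utf8(f.path).valid_encoding?}"

    # Apply the process-wide change and read again without naming an encoding.
    set_default_external_quietly(Encoding::UTF_8)
    puts "default UTF-8, plain read valid?       #{File.read(f.path).valid_encoding?}"
  end
end
```

The first approach is a one-line change but affects every file the process opens afterwards; the second touches each call site but has no global side effects.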