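Hi Jason,

thanks for your test cases. However, I don't think that binmode provides an acceptable solution, at least not on its own. While it ensures that the output is valid UTF-8, it re-encodes log data that is already valid UTF-8: each byte of a multi-byte character is treated as a character of its own and encoded again, so every valid non-ASCII UTF-8 character, such as "ü", turns into two or more "garbage" characters. Try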
$ ./utf8_test.pl testlog

(see attached files)

I'm not really sure what a proper solution is. But I'm actually not yet fully convinced that there is a problem logwatch should solve. I will ask Debian's security team for advice.

WM

On 2016-12-30 at 20:26, Jason Pyeron wrote:
> A very rudimentary test:
>
> /projects/logwatch
> $ perl -e 'for ($i=0; $i<256; ++$i) {print chr($i);}' | hexdump.exe -C
> 00000000 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |................|
> 00000010 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f |................|
> 00000020 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f | !"#$%&'()*+,-./|
> 00000030 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f |0123456789:;<=>?|
> 00000040 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f |@ABCDEFGHIJKLMNO|
> 00000050 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f |PQRSTUVWXYZ[\]^_|
> 00000060 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f |`abcdefghijklmno|
> 00000070 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f |pqrstuvwxyz{|}~.|
> 00000080 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f |................|
> 00000090 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f |................|
> 000000a0 a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af |................|
> 000000b0 b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf |................|
> 000000c0 c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf |................|
> 000000d0 d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df |................|
> 000000e0 e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef |................|
> 000000f0 f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff |................|
> 00000100
>
> /projects/logwatch
> $ perl -e 'binmode(STDOUT, ":utf8"); for ($i=0; $i<256; ++$i) {print STDOUT chr($i);}' | hexdump.exe -C
> 00000000 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |................|
> 00000010 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f |................|
> 00000020 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f | !"#$%&'()*+,-./|
> 00000030 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f |0123456789:;<=>?|
> 00000040 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f |@ABCDEFGHIJKLMNO|
> 00000050 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f |PQRSTUVWXYZ[\]^_|
> 00000060 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f |`abcdefghijklmno|
> 00000070 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f |pqrstuvwxyz{|}~.|
> 00000080 c2 80 c2 81 c2 82 c2 83 c2 84 c2 85 c2 86 c2 87 |................|
> 00000090 c2 88 c2 89 c2 8a c2 8b c2 8c c2 8d c2 8e c2 8f |................|
> 000000a0 c2 90 c2 91 c2 92 c2 93 c2 94 c2 95 c2 96 c2 97 |................|
> 000000b0 c2 98 c2 99 c2 9a c2 9b c2 9c c2 9d c2 9e c2 9f |................|
> 000000c0 c2 a0 c2 a1 c2 a2 c2 a3 c2 a4 c2 a5 c2 a6 c2 a7 |................|
> 000000d0 c2 a8 c2 a9 c2 aa c2 ab c2 ac c2 ad c2 ae c2 af |................|
> 000000e0 c2 b0 c2 b1 c2 b2 c2 b3 c2 b4 c2 b5 c2 b6 c2 b7 |................|
> 000000f0 c2 b8 c2 b9 c2 ba c2 bb c2 bc c2 bd c2 be c2 bf |................|
> 00000100 c3 80 c3 81 c3 82 c3 83 c3 84 c3 85 c3 86 c3 87 |................|
> 00000110 c3 88 c3 89 c3 8a c3 8b c3 8c c3 8d c3 8e c3 8f |................|
> 00000120 c3 90 c3 91 c3 92 c3 93 c3 94 c3 95 c3 96 c3 97 |................|
> 00000130 c3 98 c3 99 c3 9a c3 9b c3 9c c3 9d c3 9e c3 9f |................|
> 00000140 c3 a0 c3 a1 c3 a2 c3 a3 c3 a4 c3 a5 c3 a6 c3 a7 |................|
> 00000150 c3 a8 c3 a9 c3 aa c3 ab c3 ac c3 ad c3 ae c3 af |................|
> 00000160 c3 b0 c3 b1 c3 b2 c3 b3 c3 b4 c3 b5 c3 b6 c3 b7 |................|
> 00000170 c3 b8 c3 b9 c3 ba c3 bb c3 bc c3 bd c3 be c3 bf |................|
> 00000180
>
> This confirms that binmode utf8 is needed to print out the full ASCII range.
>
>> -----Original Message-----
>> From: Jason Pyeron [mailto:jpye...@pdinc.us]
>> Sent: Friday, December 30, 2016 14:03
>> To: Jason Pyeron; 'Willi Mann'; logwatch-de...@lists.sourceforge.net
>> Cc: 849...@bugs.debian.org; 849531-forwar...@bugs.debian.org; 'Klaus Ethgen'
>> Subject: RE: [Logwatch-devel] Bug#849531: Possible security problem,new logwatch sends mails with charset UTF-8
>>
>> I have opened https://sourceforge.net/p/logwatch/bugs/56/ .
>>
>> I am working a test case for this right now.
>>
>> As I see it, there are 3 paths to test.
>>
>> Output as STDOUT, file, and email. In each case does an 8bit value (0x00..0xff unsigned) result in a valid UTF-8 character.
>>
>> Is binmode(STDOUT, ":utf8") needed? Does it fix the issue if it was needed?
>>
>>>> -----Original Message-----
>>>> From: Willi Mann
>>>> Sent: Friday, December 30, 2016 12:18
>>>> To: logwatch-devel
>>>> Cc: 849...@bugs.debian.org; 849531-forwar...@bugs.debian.org; Klaus Ethgen
>>>> What would be your suggested fix?
>>
>>
>> $ git show f9db5949c58321175bda66310156f43ae607109f
>> commit f9db5949c58321175bda66310156f43ae607109f
>> Author: bjorn <bjo...@users.sourceforge.net>
>> Date: Sat Oct 15 17:38:40 2016 -0700
>>
>> Changed encoding to UTF-8, as suggested by Goran Uddeborg.
>>
>> diff --git a/scripts/logwatch.pl b/scripts/logwatch.pl
>> index 0f863dc..0167755 100755
>> --- a/scripts/logwatch.pl
>> +++ b/scripts/logwatch.pl
>> @@ -1162,9 +1162,9 @@ sub initprint {
>> }
>> #Config{output} html
>> if ( $Config{'format'} eq "html" ) {
>> - $out_mime .= "Content-Type: text/html; charset=\"iso-8859-1\"\n\n";
>> + $out_mime .= "Content-Type: text/html; charset=\"UTF-8\"\n\n";
>> } else {
>> - $out_mime .= "Content-Type: text/plain; charset=\"iso-8859-1\"\n\n";
>> + $out_mime .= "Content-Type: text/plain; charset=\"UTF-8\"\n\n";
>> }
>>
>> if ($Config{'hostformat'} eq "split") { #8.0 check hostlimit also? or ne none?
>>
>>
übersät
utf8_test.pl
Description: Perl program