[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

Sandro Knauß changed:

           What    |Removed |Added
----------------------------------------------------------------
         Resolution|---     |FIXED
             Status|ASSIGNED|RESOLVED
      Latest Commit|        |http://commits.kde.org/messagelib/04334e2f8390b967fc5b1c4ecde8caacf4787238
   Version Fixed In|        |5.4.0

--- Comment #14 from Sandro Knauß ---
Git commit 04334e2f8390b967fc5b1c4ecde8caacf4787238 by Sandro Knauß.
Committed on 18/07/2016 at 07:49.
Pushed by knauss into branch 'Applications/16.08'.

Fix: Message with wrong charset

MUAs sometimes fail to set the correct character encoding. If they set us-ascii, we can help a little bit by setting it to utf-8. Because utf-8 is a superset of us-ascii, we do not break anything.

FIXED-IN: 5.4.0

A  +34 -0  mimetreeparser/autotests/data/openpgp-inline-wrong-charset-encrypted.mbox
A  +47 -0  mimetreeparser/autotests/data/openpgp-inline-wrong-charset-encrypted.mbox.html
A  +4  -0  mimetreeparser/autotests/data/openpgp-inline-wrong-charset-encrypted.mbox.tree
M  +8  -1  mimetreeparser/src/viewer/nodehelper.cpp

http://commits.kde.org/messagelib/04334e2f8390b967fc5b1c4ecde8caacf4787238

-- 
You are receiving this mail because:
You are watching all bug changes.
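The fallback described in the commit message can be sketched roughly as follows. This is only an illustration in Python, not the actual change to nodehelper.cpp (which is C++ and is not shown in this thread); the helper name `effective_charset` is hypothetical.

```python
import codecs


def effective_charset(declared: str) -> str:
    """Return the charset to decode with, upgrading US-ASCII to UTF-8.

    Hypothetical helper: it mirrors the idea from the commit message,
    not the real nodehelper.cpp code.
    """
    try:
        # Normalise aliases such as "US-ASCII" or "us_ascii" to "ascii".
        canonical = codecs.lookup(declared).name
    except LookupError:
        # Unknown charset label: leave the decision to the caller.
        return declared
    # Every valid US-ASCII byte sequence decodes identically as UTF-8,
    # so the upgrade cannot change correctly labelled mail.
    return "utf-8" if canonical == "ascii" else declared


if __name__ == "__main__":
    # UTF-8 body that the sending MUA mislabelled as us-ascii.
    body = "test äöü".encode("utf-8")
    print(body.decode(effective_charset("us-ascii")))
```

Any other declared charset is passed through unchanged, so the sketch only covers the one case the commit message names.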
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #13 from Andre Heinecke ---
Btw. I've asked about armor headers as part of another issue regarding gpgme_data_identify, and the maintainer of gnupg also says that they should not be used and are not used by gnupg:

https://bugs.gnupg.org/gnupg/issue2314
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #12 from Thorsten Glaser ---
(In reply to Sandro Knauß from comment #9)
> Make the experiment - change the charset of you konsole/ and use a text
> document with a different encoding and encrypt it and look at the output in
> your normal console ( utf-8). You will see that this is broken. This all
> works for you because you have a consistent utf8 environment. But for mails

Possibly, but ISTR that OpenPGP still stores the encoding of the message, so I’d have a way to know what charset to pass to iconv(1) to be able to read it, and I’m not talking about the ASCII armour pseudo-header either. I’ll search for it when I have more time.

> > > GnuPG / GPGME itself does not do any reencoding it just decrypts the
> > > "bytes" of the message.
> >
> > It does *record* the charset of the message.
>
> But maybe all are wrong and you are right - give me the link to the
> documentation or a script/snippset, how It detect the correct charset of the
> decrypted mail i'll fix this instantly in kmail.

OK.
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #11 from Sandro Knauß ---
Just to make it clear: my console also defaults to utf-8. luit is a program that translates from/to the specified encoding, so within the command everything behaves as if input and output were ISO-8859-15.
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #10 from Sandro Knauß ---
Created attachment 99676
  --> https://bugs.kde.org/attachment.cgi?id=99676&action=edit
An encrypted ISO-8859-15 text
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #9 from Sandro Knauß ---
(In reply to Thorsten Glaser from comment #8)
> (In reply to Andre Heinecke from comment #7)
> > > PGP Inline is perfectly fine standardised: the display agent has to use
> > > the charset indicated by the PGP message, and discard any
> > > charset/encoding information of the surrounding message.
> >
> > No it's not. Especially the Encoding handling is very problematic and not
> > standardised. See: https://debian-administration.org/users/dkg/weblog/108 (
>
> It is, and especially the encoding is trivial. It’s just often misunderstood
> or implemented wrong.
> Citing someone who doesn’t fully understand it doesn’t help (I knew that
> posting).

dkg and Andre know what they are talking about. Search the internet for references to them and to what they do inside the OpenPGP project; you will find plenty.

> Inline PGP is easy: the MIME-level encoding is valid for the “outer” part of
> the message; for example, if MIME says quoted-printable then those ‘=’ in the
> ASCII armour of the PGP message are encoded as “=3D”.

Your comment often mixes up different encodings. In the mail context we have two:

- Content-Transfer-Encoding: how text that is not 7-bit ASCII is transformed to be 7-bit safe. This is quoted-printable, base64 or plain. There is no question that we have to decode this first, before looking at the content. This is the "=3D" -> "=" step.

- The encoding of the text itself, which is more problematic :) There is one field where we can set the encoding of a MIME part: the charset parameter of its Content-Type header:

  Content-Type: text/plain; charset="UTF-8"

The problem is that you are arguing that gnupg has a defined input/output charset, so that we should ignore the charset setting of the MIME part after piping the content through gnupg. But this is not true: gnupg only processes a byte stream and does no charset handling at all.
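The two layers described in this comment can be illustrated with a small Python sketch. It is not KMail code; `decode_part` is a hypothetical helper, and the sample bytes are made up for the example. The point is the order: the transfer encoding is undone first, and only then are the resulting bytes interpreted with a charset.

```python
import base64
import quopri


def decode_part(raw: bytes, transfer_encoding: str, charset: str) -> str:
    """Undo the Content-Transfer-Encoding, then apply the charset.

    Hypothetical helper for illustration only.
    """
    encoding = transfer_encoding.lower()
    if encoding == "quoted-printable":
        raw = quopri.decodestring(raw)  # e.g. b"=3D" -> b"="
    elif encoding == "base64":
        raw = base64.b64decode(raw)
    # Anything else (7bit, 8bit, binary) is passed through unchanged.
    return raw.decode(charset)


if __name__ == "__main__":
    # Layer 1 turns "=3D" into "=" and "=C3=BC" into two raw bytes;
    # layer 2 interprets those bytes as UTF-8.
    print(decode_part(b"a=3Db gr=C3=BC=C3=9Fe", "quoted-printable", "utf-8"))
```

The second layer is the one under dispute in this bug: the bytes that come out of decryption carry no charset label of their own, so the `charset` argument has to come from somewhere else.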
The only thing is that gnupg suggests that you SHOULD use utf-8, but it does not enforce this. It only works for you because alpine is a command-line MUA that writes its output to your console, and your console uses utf-8 encoding; if you switched to something else, you could not read the text successfully.

> The “inner” part of the message, i.e. the output of pgp/gpg decrypting it,
> is *completely* independent of the MIME message surrounding it, and for
> displaying it, *only* the rules that the command-line utilities use are
> valid; this means, that the OpenPGP-level encoding is used (which is always
> 8bit not quoted-printable or base64, and in absence of an explicit charset
> selection is UTF-8).

Well, the problem is that there is no "OpenPGP-level encoding". There is no API to ask gnupg about the encoding (if there were such an API, Andre would know, because he is one of the authors of the gnupg APIs :) ).

> The reason for this is easy: Inline PGP works, basically (i.e. without
> explicit MUA support), by someone writing a plaintext file, throwing that
> through pgp or gpg, and copy/pasting that into their MUA’s composer.
> Anything an MUA does to integrate Inline PGP support *must* behave *exactly
> the same*.

Do the experiment: change the charset of your konsole, take a text document with a different encoding, encrypt it, and look at the output in your normal console (utf-8). You will see that it is broken. This all works for you because you have a consistent utf-8 environment. But for mails we can't know the sender's encoding; we can only guess.

> > GnuPG / GPGME itself does not do any reencoding it just decrypts the "bytes"
> > of the message.
>
> It does *record* the charset of the message.

But maybe everyone else is wrong and you are right: give me a link to the documentation, or a script/snippet, showing how to detect the correct charset of the decrypted mail, and I'll fix this in kmail right away.
Okay, here is my console test:

% LANG=C luit -encoding ISO-8859-15 gpg --encrypt -a -o test.enc
You did not specify a user ID. (you may use "-r")

Current recipients:

Enter the user ID. End with an empty line: 0x36FD5E35D1D8EFD2
gpg: 0x36FD5E35D1D8EFD2: There is no assurance this key belongs to the named user

pub 1024R/0x36FD5E35D1D8EFD2 2014-08-18 Test for Mozilla bug#1054187
Primary key fingerprint: 8D15 3316 76F4 6081 1A99 DB56 36FD 5E35 D1D8 EFD2

It is NOT certain that the key belongs to the person named
in the user ID. If you *really* know what you are doing,
you may answer the next question with yes.

Use this key anyway? (y/N) y

Current recipients:
1024R/0x36FD5E35D1D8EFD2 2014-08-18 "Test for Mozilla bug#1054187"

Enter the user ID. End with an empty line:
test äöü test
% LANG=C luit -encoding ISO-8859-15 gpg -d test.enc

You need a passphrase to unlock the secret key for
user: "Test for Mozilla bug#1054187"
1024-bit RSA key, ID 0x36FD5E35D1D8EFD2, created
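The effect this experiment is meant to show can be reproduced without gpg, under the assumption stated in this thread that encryption and decryption hand back exactly the bytes that went in. The Python sketch below is an illustration only; `try_decode` is a hypothetical helper and the sample string is modelled on the test input above.

```python
from typing import Optional

TEXT = "test äöü"

# Bytes as an ISO-8859-15 terminal and a UTF-8 terminal would feed them to gpg.
latin9_bytes = TEXT.encode("iso-8859-15")
utf8_bytes = TEXT.encode("utf-8")


def try_decode(data: bytes, charset: str) -> Optional[str]:
    """Decode data with charset; return None if the bytes are invalid for it."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # Matching charset: the text survives.
    print(try_decode(latin9_bytes, "iso-8859-15"))
    # ISO-8859-15 bytes read as UTF-8: not valid UTF-8, decoding fails.
    print(try_decode(latin9_bytes, "utf-8"))
    # UTF-8 bytes read as ISO-8859-15: decodes, but to garbled text.
    print(try_decode(utf8_bytes, "iso-8859-15"))
```

Nothing in the byte sequence itself says which interpretation is intended, which is why the receiving side has to rely on a label or a guess.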
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

--- Comment #8 from Thorsten Glaser ---
(In reply to Andre Heinecke from comment #7)
> > PGP Inline is perfectly fine standardised: the display agent has to use the
> > charset indicated by the PGP message, and discard any charset/encoding
> > information of the surrounding message.
>
> No it's not. Especially the Encoding handling is very problematic and not
> standardised. See: https://debian-administration.org/users/dkg/weblog/108 (

It is, and especially the encoding is trivial. It’s just often misunderstood or implemented wrong. Citing someone who doesn’t fully understand it doesn’t help (I knew that posting).

Inline PGP is easy: the MIME-level encoding is valid for the “outer” part of the message; for example, if MIME says quoted-printable then those ‘=’ in the ASCII armour of the PGP message are encoded as “=3D”.

The “inner” part of the message, i.e. the output of pgp/gpg decrypting it, is *completely* independent of the MIME message surrounding it, and for displaying it, *only* the rules that the command-line utilities use are valid; this means that the OpenPGP-level encoding is used (which is always 8bit, not quoted-printable or base64, and in absence of an explicit charset selection is UTF-8).

The reason for this is easy: Inline PGP works, basically (i.e. without explicit MUA support), by someone writing a plaintext file, throwing that through pgp or gpg, and copy/pasting that into their MUA’s composer. Anything an MUA does to integrate Inline PGP support *must* behave *exactly the same*.

> Basically your Mail says that it's ASCII Encoded but then actually has UTF-8
> encoding in the content after decryption. I would argue that this is not a

See above, “after decryption” when Inline PGP is used means you *have* to *forget* anything from the previous container. Yes, this is different than what PGP/MIME requires. Yes, both are right, for their respective scopes.

> KMail bug but that your Mail is broken. For proper encoding Handling you
> need to use PGP/MIME. One of the Advantages of PGP/MIME is proper encoding

This sounds half like a sales pitch, half like “KMail doesn’t handle encoding in Inline PGP correctly” – which is *exactly my point*.

> GnuPG / GPGME itself does not do any reencoding it just decrypts the "bytes"
> of the message.

It does *record* the charset of the message.

> As a "workaround" / to improve compatibility with broken MUA's I like
> Sandro's idea to treat PGP Messages as UTF-8 if the specified Charset is
> 7Bit ASCII. I think that would be a good solution to fix your bug.

That would help in the specific case, but still leave KMail a broken MUA claiming to support Inline PGP and not doing it correctly. However, as a first step, it’s okay; please do so. Actually, why haven’t you done so yet…

> Although I would suggest to use a proper MUA with PGP/MIME support.

No, PGP/MIME often breaks, interestingly enough, with encoding-related issues, and with mailing lists. Its interoperability is also limited to MUAs supporting it, whereas interoperability of Inline PGP is maximal.
[kmail2] [Bug 248058] Message preview pane character encoding issue (utf-8, unicode)
https://bugs.kde.org/show_bug.cgi?id=248058

Andre Heinecke changed:

           What    |Removed |Added
----------------------------------------------------------------
                 CC|        |aheine...@intevation.de

--- Comment #7 from Andre Heinecke ---
> PGP Inline is perfectly fine standardised: the display agent has to use the
> charset indicated by the PGP message, and discard any charset/encoding
> information of the surrounding message.

No, it's not. Especially the encoding handling is very problematic and not standardised. See: https://debian-administration.org/users/dkg/weblog/108 ( https://dkg.fifthhorseman.net/notes/inline-pgp-harmful/ )

Basically your mail says that it is ASCII encoded, but after decryption the content is actually UTF-8 encoded. I would argue that this is not a KMail bug but that your mail is broken. For proper encoding handling you need to use PGP/MIME; one of the advantages of PGP/MIME is proper encoding handling.

KMail uses the Content-Type charset of the PGP message, which would be correct. GnuPG / GPGME itself does not do any reencoding; it just decrypts the "bytes" of the message.

The Armor Header from RFC2440 is afaik not used in practice. Since changing the encoding can change the meaning, and the armor headers themselves are not signed or encrypted, it offers little advantage over the Content-Type. On top of that, the implementation would become even more fragile, because you would have to handle mixed encodings in a message with multiple PGP message parts. And you would have to treat PGP clearsigned messages differently...

As a "workaround" / to improve compatibility with broken MUAs, I like Sandro's idea to treat PGP messages as UTF-8 if the specified charset is 7-bit ASCII. I think that would be a good solution to fix your bug.

Although I would suggest using a proper MUA with PGP/MIME support.