Dear Markus,
thanks for the explanation.
From this I understand that the bug is in the way Mime4j is called from K-9
(and Google's original Email client). Mime4j is meant for parsing header fields
as they arrive, that is, following the appropriate RFC for MIME. Mime4j is not
intended for validation of header fields as they are presented to (or in my case
entered by) the user.
Is there a method in Mime4j to encode UTF-8 to the 'encoded word' =?...?=? (I
guess there is not.) Such a method would have to correctly handle *lists* of
'decoded' addresses and not create e.g.
=?ISO-8859-1?Q?Hans_=3Chans=40acme.org=3E,_Hans_M=FCller?=
<[email protected]>
from
Hans <[email protected]>, Hans Müller <[email protected]>
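A minimal sketch of such a list-aware encoder, in plain Java and not using
Mime4j at all. The class EncodedWordSketch, the Mailbox helper type and the
example.org addresses are hypothetical names chosen for illustration. The
sketch assumes the list has already been split into mailboxes, encodes only
the display name of each one (RFC 2047 "Q" encoding over UTF-8), and leaves
the <...> part and the separating commas untouched. It does not fold encoded
words longer than 75 characters and does not quote ASCII names that contain
special characters such as commas.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class EncodedWordSketch {

    /** Hypothetical helper type: one mailbox split into its two parts. */
    static final class Mailbox {
        final String displayName; // may be null or empty
        final String addrSpec;    // assumed to be plain ASCII

        Mailbox(String displayName, String addrSpec) {
            this.displayName = displayName;
            this.addrSpec = addrSpec;
        }
    }

    /** Returns true if every character is printable US-ASCII. */
    static boolean isPlainAscii(String s) {
        return s.chars().allMatch(c -> c >= 32 && c < 127);
    }

    /** Q-encodes a display name as a single UTF-8 encoded word if needed. */
    static String encodeDisplayName(String name) {
        if (isPlainAscii(name)) {
            return name; // nothing to encode (quoting of specials is not handled here)
        }
        StringBuilder sb = new StringBuilder("=?UTF-8?Q?");
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9');
            if (c == ' ') {
                sb.append('_');                 // space is written as underscore
            } else if (safe) {
                sb.append((char) c);            // letters and digits pass through
            } else {
                sb.append('=').append(String.format("%02X", c)); // =XX hex escape
            }
        }
        return sb.append("?=").toString();
    }

    /** Formats a list of mailboxes, encoding each display name separately. */
    static String formatList(List<Mailbox> mailboxes) {
        List<String> parts = new ArrayList<>();
        for (Mailbox m : mailboxes) {
            if (m.displayName == null || m.displayName.isEmpty()) {
                parts.add(m.addrSpec);
            } else {
                parts.add(encodeDisplayName(m.displayName) + " <" + m.addrSpec + ">");
            }
        }
        return String.join(", ", parts);
    }
}
```

The point of the design is that encoding is applied per display name, after
the list structure is known, so the angle brackets, the "@" and the commas are
never swallowed into an encoded word.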
Thanks, Ondrej.
P.S. for myself or K-9 developers:
com/android/email/EmailAddressValidator.java:16
should not call com.android.email.mail.Address.parse
(or com.android.email.mail.Address.parse should first encode UTF-8 prior to
passing it to Mime4j)
Markus Wiederkehr wrote:
E-mail header fields may contain us-ascii characters only. To overcome
this restriction the "name" part of an e-mail address is usually
encoded by a mechanism called an "encoded word". Your mail client then
knows how to interpret these encoded words and is able to display the
original name.
Just look into the source code of your e-mails to see what I mean.
You'll occasionally see e-mail addresses such as
"=?ISO-8859-1?Q?Hans_M=FCller?= <[email protected]>" which is
equivalent to "Hans Müller <[email protected]>".
Mime4j should be capable of decoding them too.
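To illustrate the equivalence, here is a minimal sketch of decoding a
"Q"-encoded word in plain Java. It is not Mime4j's implementation; the class
name EncodedWordDecodeSketch and the example.org address are hypothetical.
It handles only the Q encoding (not Base64), assumes the charset name is one
the JVM knows, and will throw on malformed =XX escapes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodedWordDecodeSketch {

    // =?charset?Q?encoded-text?=  (Q encoding only in this sketch)
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Qq]\\?([^?]*)\\?=");

    /** Replaces every Q-encoded word in the input with its decoded text. */
    static String decode(String input) {
        Matcher m = ENCODED_WORD.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String decoded = decodeQ(m.group(2), Charset.forName(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Turns Q-encoded text back into bytes, then into a string. */
    static String decodeQ(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                bytes.write(' ');                       // underscore means space
            } else if (c == '=' && i + 2 < text.length() + 0 && i + 2 <= text.length() - 1) {
                // =XX is one byte written as two hex digits
                bytes.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);                         // literal ASCII character
            }
        }
        return new String(bytes.toByteArray(), charset);
    }
}
```

Called on a header value such as "=?ISO-8859-1?Q?Hans_M=FCller?= <hans@example.org>",
decode leaves the address part alone and rewrites only the encoded word.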
Markus
On Thu, Mar 26, 2009 at 10:45 PM, Ondrej Bojar <[email protected]> wrote:
Dear Mime4J developers,
I use Android, and both the built-in Email client and the K-9 replacement
delete e-mail addresses containing accented characters. (If, say, "Pétér
<[email protected]>" sends me an e-mail and I hit 'Reply', the 'To' field
becomes blank.)
I can barely read Java, but I understood from K-9 source they use your
Mime4J for e-mail address parsing (and thus validation).
I was not able to compile the code downloaded from your site. (I know nothing
about Maven; I installed it, but running 'mvn test' tried to download
something and failed.)
I compiled K-9 (the source of which includes a version of mime4j) and I
guess this exception is exactly the reason why they remove addresses with
accented characters:
22:39 vaio classes$java org.apache.james.mime4j.field.address.AddressList
Pétér <[email protected]>
Pétér <[email protected]>
org.apache.james.mime4j.field.address.parser.ParseException: Lexical error
at line 1, column 2. Encountered: "\u00e9" (233), after : ""
at org.apache.james.mime4j.field.address.parser.AddressListParser.parse(AddressListParser.java:42)
at org.apache.james.mime4j.field.address.AddressList.parse(AddressList.java:116)
at org.apache.james.mime4j.field.address.AddressList.main(AddressList.java:132)
I've read your remark somewhere that you're deliberately not handling Base64
or Quoted-Printable, but this is plain UTF-8 so that shouldn't pose a
problem.
My question is simple: who should I blame ;-)
With apologies for a question from a non-Javist,
Ondrej Bojar.
--
Ondrej Bojar (mailto:[email protected] / [email protected])
http://www.cuni.cz/~obo