In article 
<ofc0fea11b.05dda05c-onc125826a.0038eb98-c125826a.0038f...@notes.na.collabserv.com>
 you write:
>
>Hello folks
>
>I've been tasked with finding out what the general consensus is on the 
>support in email headers for international characters such as UTF-8 
>characters, including accented characters as well as Asian and Cyrillic 
>characters.
>
>I know there's an RFC from 2012, but my Product Dev people are interested 
>in knowing how wide-spread the actual adoption is. 

Funny you should ask.  I'm doing some work for the UASG group to document how
internationalized email (known as EAI) works.

Everything except the actual addresses can already carry UTF-8: MIME
body parts handle the message text, and encoded-words handle the mail
headers.  Those have been around for at least a decade and should work
everywhere.
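
To make the encoded-word mechanism concrete, here is a minimal sketch using
Python's standard email.header module.  The helper names and the sample
subject are made up for illustration; the point is only that the header
travels as plain ASCII and decodes back to the original UTF-8 text.

```python
from email.header import Header, decode_header, make_header


def encode_subject(text: str) -> str:
    """Return an ASCII-only encoded-word form of a header value."""
    # Header picks base64 or quoted-printable for the chosen charset.
    return Header(text, charset="utf-8").encode()


def decode_subject(raw: str) -> str:
    """Turn encoded-words back into a normal Unicode string."""
    return str(make_header(decode_header(raw)))


subject = "Café Zürich"          # illustrative value only
wire = encode_subject(subject)   # what actually goes on the wire
print(wire)
print(decode_subject(wire))
```

Note that this covers header text such as Subject or display names, not the
addresses themselves, which is the part SMTPUTF8 addresses.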

RFCs 6530-6533 defined an SMTP extension called SMTPUTF8 which, to
oversimplify a little, allows UTF-8 anywhere you can have ASCII,
including in both the local part and the domain part of the addresses.
This modifies both the messages themselves and the address in the
SMTP dialog MAIL FROM and RCPT TO.
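
As a rough sketch of what that means for the SMTP dialog: a client only
needs the extension when an address is not pure ASCII, and it may only use
it if the server advertised SMTPUTF8 in its EHLO response.  The function
names below are hypothetical, and the sample address is invented; this is
not code from any particular MTA.

```python
def needs_smtputf8(address: str) -> bool:
    """True if the address contains anything outside ASCII."""
    return not address.isascii()


def mail_from_command(sender: str, ehlo_keywords: set[str]) -> str:
    """Build the MAIL FROM line, adding the SMTPUTF8 parameter if needed.

    ehlo_keywords is assumed to be the set of extension keywords the
    server listed in its EHLO response.
    """
    advertised = {k.upper() for k in ehlo_keywords}
    if needs_smtputf8(sender):
        if "SMTPUTF8" not in advertised:
            # Without the extension the message cannot be sent as-is.
            raise ValueError("server does not advertise SMTPUTF8")
        return f"MAIL FROM:<{sender}> SMTPUTF8"
    return f"MAIL FROM:<{sender}>"


print(mail_from_command("用户@例子.广告", {"8BITMIME", "SMTPUTF8"}))
print(mail_from_command("user@example.com", {"8BITMIME"}))
```

In practice a library does this for you; Python's smtplib, for instance,
accepts "SMTPUTF8" in the mail_options argument of sendmail().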

Uptake has been slow, but Gmail quietly added support last year, and
Hotmail/Outlook/Live added support about a month ago.  Some of the
large Chinese services like Coremail support it as do some Indian
services like Xgenplus.  Yahoo/AOL/Oath have as far as I can tell no
plans to support it.

The Gmail and Hotmail support handles other people's UTF-8 addresses
in mail but they still don't provide UTF-8 addresses on their own
systems.  It is my impression that the main interest is currently in
India since some bits of the government are planning to hand out
e-mail addresses to go with the biometric IDs, and a lot of Indians
are literate in their own languages, which are written in their own
scripts, but not English.

Having recently written EAI support into my own qmail system I can say
that the basic address handling was a lot easier than I expected,
since most mail code these days is already 8-bit clean sort of by
accident.  The hardest part, which I haven't done yet, is generalizing
the address mapping that MTAs do on incoming mail.  Converting between
upper and lower case is remarkably language-specific, even in
languages written in Latin characters.  Add things like all the ways
Unicode can represent accented characters, the meaning of
o-with-umlaut, which is short for "oe" in German but not in
Scandinavia, and Chinese traditional and simplified characters, and
it's a challenge to make addresses work in ways that seem natural in
whatever language the address is written in.
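
To illustrate which parts of that are mechanical and which are not, here is
a small sketch using Python's unicodedata module.  It is an assumption about
one plausible canonicalization (NFC plus Unicode case folding), not the
mapping used in any real MTA, and the function name is made up.

```python
import unicodedata


def canonical_local_part(local: str) -> str:
    """Language-neutral canonical form: NFC, case fold, NFC again."""
    composed = unicodedata.normalize("NFC", local)
    # casefold() is Unicode's locale-independent caseless form.
    return unicodedata.normalize("NFC", composed.casefold())


# The two Unicode spellings of e-acute compare equal after normalization.
print(canonical_local_part("Jose\u0301") == canonical_local_part("Jos\u00e9"))

# Case folding is not simple lowercasing: sharp s becomes "ss".
print(canonical_local_part("Straße"))

# Nothing here knows that German readers treat "ö" and "oe" as the same.
print(canonical_local_part("Jörg") == canonical_local_part("Joerg"))
```

The first two cases are handled by generic Unicode rules; the third is the
language-specific kind of mapping that generic rules do not cover.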

R's,
John

PS: Youtube blurb about EAI we recorded in San Juan a few weeks ago
here: https://youtu.be/REDeEhvHwsU

The Microsoft guy announces support in Hotmail/Outlook.
