Re: Unicode and Security

Elliotte Rusty Harold Thu, 07 Feb 2002 10:00:24 -0800

I've been thinking about security issues in Unicode, and I've come up 
with one that's quite scary and worse than any I've heard before. It 
uses only plaintext, no fonts involved, doesn't require buggy 
software, and works over e-mail instead of the Web. All it requires 
added to the existing infrastructure is internationalized domain 
names. So in the hope that this becomes a self-defeating prophecy, 
here's the scenario:


I, as a reporter or industrial spy or detective working on a divorce 
case, have learned the identities and internal e-mail addresses of 
two people, call them Alice and Bob, at Microsoft (or just about any 
other large company). I've somehow communicated with these people 
personally, for instance on an e-mail list completely unrelated to 
work but for which they use their work e-mail so I'm familiar with 
their style and signature files. Or perhaps I've communicated with 
them on work-related matters before. In any case, it's not hard to 
get two people who know each other at a large company to send you 
e-mail. Of course, they would presumably be careful not to give me 
secret company information since they know they're talking to an 
outsider.

For the sake of argument, let's call the company they work at 
Microsoft, but this attack could hit most companies with a .com 
address. Let's say I register microsoft.com, only the fifth letter 
isn't a lower-case Latin o. It's actually a lower-case Greek omicron. 
I then forge a believable letter from [EMAIL PROTECTED] to 
[EMAIL PROTECTED] saying "Can you please update me on your budget?" 
Bob, noticing that the e-mail appears to come from Alice, whom he 
knows and trusts, fires off a reply with his confidential 
information. Only it doesn't go to Alice. It goes to me. I can then 
reply to Bob, asking for clarification or more details. I can ask him 
to attach the latest build of his software. I can carry on a 
conversation in which Bob believes me to be Alice and spills his 
guts. This is very, very bad.
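To make the substitution concrete, here is a minimal Python sketch. The spoofed string and the helper name describe_differences are purely illustrative, and the last step assumes an ASCII-compatible encoding of the kind Python's built-in "idna" codec provides; it is not a statement about how any particular registry or mailer handles such names.

```python
import unicodedata

LATIN = "microsoft.com"
# Hypothetical lookalike: the fifth character is U+03BF GREEK SMALL LETTER OMICRON.
SPOOF = "micr\u03bfsoft.com"


def describe_differences(a, b):
    """Return (index, name_in_a, name_in_b) for each position where a and b differ."""
    return [
        (i, unicodedata.name(x), unicodedata.name(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


# The two strings have the same length and may render alike, but they are not equal.
print(LATIN == SPOOF)
print(describe_differences(LATIN, SPOOF))

# An ASCII-compatible form makes the difference visible: the pure-Latin name is
# unchanged, while the lookalike label gains an "xn--" prefix.
print(LATIN.encode("idna"))
print(SPOOF.encode("idna"))
```

Comparing code points, or showing the ASCII-compatible form, exposes a difference that the rendered text may hide completely.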

E-mail forgery has been a problem for a long time, but it's always 
been one-way. You couldn't trick somebody into sending you a reply 
because doing so required using a different e-mail address than the 
one they expected, thus revealing the message as forged. With a 
Unicode-enabled mailer, that's no longer true. If the fonts Bob (not 
me, but Bob) chooses for his e-mail program do not make a clear 
distinction between an o and an omicron, this works. And there are 
plenty of variations on this attack: the Cyrillic and Greek alphabets 
provide lots of options for replacing single letters in Latin domain names.
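One check a mailer or registry could run is to flag any domain label that mixes scripts. The sketch below is only a rough heuristic under a stated assumption: Python's standard library has no script property, so it takes the first word of each character's Unicode name (LATIN, GREEK, CYRILLIC) as a stand-in for the script. The function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label):
    """Approximate the set of scripts in a label from Unicode character names.

    Assumption: the first word of a letter's name ("LATIN", "GREEK", ...) is
    used as a proxy for its script. Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed_script(domain):
    """Return True if any dot-separated label mixes letters from several scripts."""
    return any(len(scripts_in_label(label)) > 1 for label in domain.split("."))


print(looks_mixed_script("microsoft.com"))        # all Latin
print(looks_mixed_script("micr\u03bfsoft.com"))   # Latin plus a Greek omicron
```

A check like this would catch the single-letter substitution described above, but not a label written entirely in one lookalike script, so it is at best a partial defense.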

I'm not sure whether the internationalized domain names 
working group has fully grokked this. Like Unicode, they seem 
to be trying to pass the buck. In particular, they state 
<http://www.ietf.org/internet-drafts/draft-ietf-idn-requirements-09.txt>:

Specifying requirements for internationalized domain names does not 
itself raise any new security issues. However, any change to the DNS 
MAY affect the security of any protocol that relies on the DNS or on 
DNS names. A thorough evaluation of those protocols for security
concerns will be needed when they are developed. In particular, IDNs 
MUST be compatible with DNSSEC and, if multiple charsets or 
representation forms are permitted, the implications of this 
name-spoof MUST be throughly understood.

In other words, it's not our fault. Blame the client software. Sounds 
distressingly like the Unicode Consortium's approach to these issues. 
Interestingly, my attack works with a single character representation 
(Unicode). It is not dependent on multiple charsets. I don't know if 
the IDN working group has thought of this problem. I hope they have, 
and consider it their responsibility to prevent. I also hope the 
Unicode Consortium and vendors of client software think about these 
problems. But I don't think we can count on client software getting 
this right. (Hell, Microsoft can't even stop e-mail from running 
scripts.)  The problem needs to be fixed closer to the source.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+
