Hi Ingo,

Ingo Schwarze writes:
> While the article is old, the essence of what Schneier said here
> still stands, and it is not likely to fall in the future:
> 
>   https://www.schneier.com/crypto-gram-0007.html#9

The most interesting sentence here is:

"Unicode is just too complex to ever be secure."

There is something to this, and it's why the only sane way to handle
UTF-8 is to ignore the complexities and escape methods he alluded to.
Code points should be represented with the shortest possible sequence.
Surrogate pairs should not be encoded in UTF-8. Byte order marks should
not exist in UTF-8. UTF-8 parsers should all handle encoding errors in
the same well-defined way: abort decoding on an invalid sequence and
retry starting with its second byte.
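
As a rough sketch of those rules (Python, purely illustrative): the
names decode_one and decode_all are made up for this example, and
emitting U+FFFD for each rejected byte is my assumption, since the rule
above only says where decoding resumes. A byte order mark gets no
special treatment here; it simply comes out as the code point U+FEFF.

```python
REPLACEMENT = 0xFFFD  # assumption: emit U+FFFD once per rejected byte


def decode_one(buf, i):
    """Decode one code point from bytes `buf` starting at index i.

    Returns (codepoint, length) on success, or (None, 1) on any error,
    so the caller retries at the byte right after the bad lead byte.
    """
    b0 = buf[i]
    if b0 < 0x80:
        return b0, 1
    if 0xC2 <= b0 <= 0xDF:
        need, cp, lowest = 1, b0 & 0x1F, 0x80
    elif 0xE0 <= b0 <= 0xEF:
        need, cp, lowest = 2, b0 & 0x0F, 0x800
    elif 0xF0 <= b0 <= 0xF4:
        need, cp, lowest = 3, b0 & 0x07, 0x10000
    else:
        # Stray continuation byte, 0xC0/0xC1 (always overlong), or 0xF5..0xFF.
        return None, 1
    if i + need >= len(buf):
        return None, 1  # truncated sequence
    for k in range(1, need + 1):
        c = buf[i + k]
        if (c & 0xC0) != 0x80:
            return None, 1  # not a continuation byte
        cp = (cp << 6) | (c & 0x3F)
    if cp < lowest:
        return None, 1  # not the shortest possible sequence
    if 0xD800 <= cp <= 0xDFFF:
        return None, 1  # surrogates must not appear in UTF-8
    if cp > 0x10FFFF:
        return None, 1  # beyond the Unicode range
    return cp, need + 1


def decode_all(buf):
    """Decode a whole byte string into a list of code points."""
    out = []
    i = 0
    while i < len(buf):
        cp, n = decode_one(buf, i)
        out.append(REPLACEMENT if cp is None else cp)
        i += n
    return out
```

With this, the overlong encoding of "/" (bytes C0 AF) never decodes to
0x2F, and a lead byte followed by plain ASCII loses only the lead byte.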

I like how Plan 9 handled Unicode. Aside from inventing UTF-8--an
encoding scheme that actually makes sense with C strings, unlike the
disastrous designs-by-committee that were UCS-2 and UTF-16--they
basically used it as just a way to have more than 256 characters.
Most parts of Unicode proper, like collation or canonical equivalence,
were simply dropped. Noncompliant? Sure, but it made things dramatically
simpler.

In other words, divorce UTF-8 the encoding from Unicode the standard.
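
To make concrete what dropping canonical equivalence means, here is a
small Python illustration. The helper names same_bytes and
same_canonical are made up for this example, and it reflects my reading
of the "just more characters" approach rather than any Plan 9 code: two
strings are the same only if their UTF-8 bytes are the same.

```python
import unicodedata

# Two spellings of "e with acute accent" that Unicode treats as
# canonically equivalent.
precomposed = "\u00e9"   # single code point U+00E9
decomposed = "e\u0301"   # "e" followed by combining acute U+0301


def same_bytes(a, b):
    """Compare as plain UTF-8 byte strings; no Unicode semantics."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_canonical(a, b):
    """Compare after NFC normalization, as Unicode proper would."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Here same_bytes(precomposed, decomposed) is False while
same_canonical(precomposed, decomposed) is True. The byte comparison
needs no tables and no knowledge of Unicode at all, which is the
simplification being traded for compliance.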

Homograph attacks are a real concern with any large character set. But:

1) I've been tricked by... well, not attacks, but simply badly written
   filenames with plain old ASCII: e instead of a, spaces instead of
   underscores, 0/O or l/I/1. It's easy to fool the human mind by
   feeding it something that sort of looks like what's expected.

2) Given that filenames can contain literally anything except / and \0,
   there are so many other attacks that enforcing valid UTF-8 in
   filenames would be only a hypothetical improvement (not that I'm
   necessarily advocating doing that in OpenBSD). Spaces are bad enough.
   How many shell scripts handle *newlines* correctly? What about VT100
   escape sequences? This whole thing is a security nightmare already.
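
For illustration, a short Python sketch of the kind of check point 2 is
about. The function name filename_problems and the particular set of
things it flags are my own choices for this example, not anything
OpenBSD does.

```python
def filename_problems(name):
    """Return a list of reasons a filename (as bytes) may be hazardous."""
    problems = []
    try:
        # Python's strict UTF-8 codec rejects overlong forms and surrogates.
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("invalid UTF-8")
        text = name.decode("utf-8", errors="replace")
    if "\n" in text:
        problems.append("newline")
    if "\x1b" in text:
        problems.append("escape character")
    # Any remaining C0 control character or DEL.
    if any(
        ch not in "\n\x1b" and (ord(ch) < 0x20 or ord(ch) == 0x7F)
        for ch in text
    ):
        problems.append("other control character")
    if " " in text:
        problems.append("space")
    return problems
```

Note that a name like b"\x1b[31mred" is perfectly valid UTF-8 (it is
plain ASCII), so checking the encoding alone would let it through;
only the separate control-character checks catch it.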

I happily use UTF-8 filenames on OpenBSD, and have done so for years.

-- 
Anthony J. Bentley
