Again, looking at this as a script problem is much more appropriate. After all, neither "com" nor any ccTLD nor most of the gTLDs are associated with any single language.
Nor is "com" associated with a single script.
I.e. I think you've taken a step in the right direction, namely:
language -> script
But I believe it would be a good idea to consider taking another step in the same direction:
language -> script -> character set
StringPrep and NamePrep are great, but I wonder if it might be good to take their ideas one step further to solve the IDN spoofing problem, i.e. to "normalize" the homographs by mapping similar-looking characters to "base" characters.
For example, Cyrillic small 'a' would be mapped to Latin small 'a', since they are virtually identical (and, I hear, are even given the same glyph index in some fonts).
Of course, this idea is likely to be extremely controversial, since a pair of characters that looks similar to one person may look different to another.
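To make the mapping idea a bit more concrete, here is a rough sketch in Python. The table, the function names (fold_homographs, collide) and the choice of characters are all my own assumptions for illustration. They are not taken from StringPrep, NamePrep or any other spec, and a real table would need far more entries and a lot of debate.

```python
# Hypothetical, hand-picked table: a few Cyrillic small letters mapped to
# the Latin small letters they closely resemble. Illustration only; this is
# not taken from any standard.
HOMOGRAPH_BASE = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
}


def fold_homographs(label: str) -> str:
    """Replace every character that has a table entry with its base character."""
    return "".join(HOMOGRAPH_BASE.get(ch, ch) for ch in label)


def collide(label_a: str, label_b: str) -> bool:
    """Return True if the two labels are identical after folding."""
    return fold_homographs(label_a) == fold_homographs(label_b)


spoof = "ex\u0430mple"  # Latin letters except for one Cyrillic small a
print(collide(spoof, "example"))  # True
```

A registry could, for instance, refuse a new label if it collides with an existing one after folding. Deciding which pairs belong in the table is exactly the controversial part.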
Erik
