I'm by no means a linguist, but I would assume that a plethora of good and useful script mixtures exist in daily life. Passing this problem (which all of us have been aware of for years now) back to the policy arena won't help anyone, since I doubt that any working group (now or in the future) can come up with a good rationale for all scripts and languages without restricting "good" and useful mixtures.
By design, IDNA processing happens inside the application, and therefore, to my thinking, applications are the right place for any security measures as well. Talking about security measures, we have to think about what exactly we want to prevent. Do we want to battle fraud in general? Then we would have to look at typo domain names like pajpal.com as well. Or is the goal to enable the user to better understand exactly what URL he is using? In that case we are no longer talking about security but about awareness. If a user is aware that a URL he wants to use is a mixture of scripts, he can decide for himself whether to trust it or not. I guess that's fair enough; after all, all users are responsible for their own behavior and the risks that come with it.

Best,
tom

On 15.02.2005, Michel Suignard wrote:
> No languages used in the former soviet union should require a mix of latin
> and cyrillic in a single dns label.
> Unicode contains many latin homographs in the Cyrillic block exactly for that
> reason, to avoid mixing the two scripts in a single word. It is unfortunate
> that the exact visual match is now haunting us. However it should not be used
> as a rationale to accept registration of mixed Cyrillic/Latin labels by tld
> registries.
>
> To answer another message in this thread, there is no definitive answer about
> which Unicode characters are allowed for a given languages. But in all
> languages that have a reasonable concept of 'words', you should never need to
> allow mixed script in a word, at least in the context of IDN label. There are
> exceptions to these rules, like in South and East Asia (Japanese comes to
> mind), but these languages can be detected reasonably using the Unicode
> script property.
>
> Michel
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kane, Pat
>
> VeriSign does prevent domains with the Russian language tag from commingling
> A-Z with the Cyrillic characters. It does permit 0-9 and the dash to be
> used. This filter also applies to other Cyrillic based languages such as
> Belarusian, Ukrainian, Serbian, Macedonian and Bulgarian.
>
> There are other languages that are listed within ISO 639-2 that today use a
> combination of Latin and Cyrillic as they were originally Latin based (Tajik
> was Arabic prior to being Latin based), migrated to Cyrillic during the
> Soviet era and today are migrating back to Latin. It is common to use Latin
> and Cyrillic characters in Tajik, from what I understand not being a native
> speaker. Granted there are not a lot of registrations in com net that are
> Tajik, but this is just the point of an IDN.
>
> Pat Kane

Regards,
tom

A cow is not entirely full of milk, some of it is hamburger!
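
P.S. To make the awareness idea a bit more concrete, here is a rough sketch of how an application might notice that a label mixes scripts before showing it to the user. It is only an illustration under some assumptions: Python's standard library does not expose the Unicode script property, so the sketch approximates it with the first word of each character's Unicode name, and the helper names (char_script, label_scripts, is_mixed_script) are made up for this mail. Digits and the hyphen are treated as neutral, like the 0-9 and dash allowance Pat describes. It expects the Unicode form of the label, not the ACE (xn--) form.

```python
import unicodedata

# Digits and the hyphen are treated as script-neutral (an assumption that
# mirrors the "0-9 and the dash" allowance in the quoted message).
NEUTRAL = set("0123456789-")


def char_script(ch: str) -> str:
    """Approximate the script of one character.

    Uses the first word of the Unicode character name (e.g. "LATIN",
    "CYRILLIC") as a stand-in for the real Unicode script property.
    """
    if ch in NEUTRAL:
        return "COMMON"
    name = unicodedata.name(ch, "")
    return name.split(" ", 1)[0] if name else "UNKNOWN"


def label_scripts(label: str) -> set:
    """Return the set of approximate scripts used in a label."""
    return {char_script(ch) for ch in label} - {"COMMON"}


def is_mixed_script(label: str) -> bool:
    """True if the label draws on more than one approximate script."""
    return len(label_scripts(label)) > 1


if __name__ == "__main__":
    # "p\u0430ypal" contains CYRILLIC SMALL LETTER A instead of Latin "a".
    for label in ("paypal", "p\u0430ypal"):
        print(ascii(label), sorted(label_scripts(label)), is_mixed_script(label))
```

An application could use a check like this to warn the user, or to fall back to showing the ACE form, and leave the trust decision to him. Note that this naive version would also flag ordinary Japanese labels (kana plus kanji), so a real implementation would need the kind of exceptions Michel mentions, and it does nothing about same-script typo domains like pajpal.com.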
