On 5 Jun 2014, at 19:24, Jeff Senn <s...@maya.com> wrote:

> On Jun 5, 2014, at 12:41 PM, Hans Aberg <haber...@telia.com> wrote:
> 
>> On 5 Jun 2014, at 17:46, Jeff Senn <s...@maya.com> wrote:
>> 
>>> That is: are identifiers merely sequences of characters or intended to be 
>>> comparable as “Unicode strings” (under some sort of compatibility rule)?
>> 
>> In computer languages, identifiers are normally compared only for equality, 
>> as it reduces lookup time complexity.
> 
> Well in this case we are talking about parsing a source file and generating 
> internal symbols, so the complexity of the comparison operation is a red 
> herring.
> 
> The real question is how does the source identifier get mapped into a 
> (compiled) symbol.  (e.g. in C++ this is not an obvious operation)
> 
> If your implication is that there should be no canonicalization (the string 
> from the source is used as a sequence of characters only directly mapped to a 
> symbol), then I predict sticky problems in the future.  The most obvious of 
> which is that in some cases I will be able to change the semantics of the 
> compiled program by (accidentally) canonicalizing the source text (an 
> operation, I will point out, that is invisible to the user in many (most?) 
> Unicode aware editors).

It is not difficult to mangle any byte sequence into a C/C++ identifier, but 
Swift compiles directly to LLVM, so perhaps mangling is not needed. Xcode is 
very aggressive at combining characters, so it is hard to enter non-normalized 
text from it. The manual says that combining characters are allowed after the 
first character, but it does not seem to mention normalization. Still, it 
seems the compiler only needs to compare byte sequences for equality, which is 
the traditional approach.
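
To make the two points concrete, here is a rough Python sketch (not Swift, and 
not what any real compiler does). It shows that two canonically equivalent 
spellings of one identifier differ as byte sequences and only compare equal 
after normalization. The names same_bytes, same_after_nfc and mangle are 
made up for illustration, and the mangling scheme is just one possible 
assumption, not the one used by C++ or Swift.

```python
import unicodedata

# Two spellings of the same visible identifier "café".
nfc_name = "caf\u00e9"    # precomposed U+00E9 (e with acute)
nfd_name = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_bytes(a, b):
    """Traditional comparison: equality of the raw UTF-8 byte sequences."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfc(a, b):
    """Comparison after canonicalizing both names to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def mangle(name):
    """Hypothetical mangling of any UTF-8 byte sequence into a C identifier.

    ASCII letters and digits are kept; every other byte (including "_")
    becomes "_XX" in hex, so distinct byte sequences stay distinct.
    """
    out = ["_U"]
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append("_%02X" % byte)
    return "".join(out)


print(same_bytes(nfc_name, nfd_name))      # byte-wise comparison
print(same_after_nfc(nfc_name, nfd_name))  # comparison after NFC
print(mangle(nfc_name), mangle(nfd_name))  # two different symbols
```

With byte-wise comparison the two spellings yield two different symbols, which 
is exactly the case Jeff describes: an editor that silently normalizes the 
source would change which symbol a name refers to.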



_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
