On 24.06.2016 0:57, João Pinheiro via swift-evolution wrote:
Indeed, the case shown in Josh's example was the motivation for this thread
and will be solved by the proposal.

The current discussion has been around whether it should be solved by
ignoring invisible characters or prohibiting them and explicitly
highlighting them as an error. I originally proposed prohibiting them and
was later persuaded that ignoring them would suffice. Upon further
reading of the Unicode normalisation and security documents, I agree that
prohibiting them outside of the situations described in UAX #31 is the best
and safest choice.
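
To make the difference between the two policies concrete, here is a minimal
Python sketch (not the Swift compiler's implementation). The function names
and the INVISIBLE set are hypothetical and cover only a handful of invisible
characters chosen for illustration; the UAX #31 exceptions for ZWJ and ZWNJ
are not modelled.

```python
import unicodedata

# Assumption: a small illustrative subset of invisible characters,
# not the actual contents of any UAX #31 table.
INVISIBLE = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def canonical_by_ignoring(identifier):
    """'Ignore' policy: silently drop invisible characters, then NFC-normalize."""
    stripped = "".join(ch for ch in identifier if ch not in INVISIBLE)
    return unicodedata.normalize("NFC", stripped)


def canonical_by_prohibiting(identifier):
    """'Prohibit' policy: reject any invisible character, then NFC-normalize."""
    for index, ch in enumerate(identifier):
        if ch in INVISIBLE:
            raise ValueError(
                f"invisible character U+{ord(ch):04X} at index {index}"
            )
    return unicodedata.normalize("NFC", identifier)
```

Under the ignore policy, "var\u200b1" and "var1" silently become the same
identifier; under the prohibit policy, the first is rejected with an error
that names the offending code point.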

I do believe the *safest* variant should be chosen. After all, do we actually see many sources with Unicode identifiers? I believe they make up a very small percentage of real code. IMO, we should first protect Swift from the problems that Unicode identifiers bring, and only after that support as much Unicode as we can. (Personally, I really don't understand why we need anything other than ASCII for identifiers. That would solve all the problems with invisible spaces, left-to-right flags, complicated rules, graphemes, etc. But someone needs to be able to use a dog emoji as an identifier... well... OK.)


Sincerely,
João Pinheiro


On 23 Jun 2016, at 21:45, Xiaodi Wu via swift-evolution
<swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Let me correct myself: if Josh's example is what I think it is, it would be
fixed whether we prohibit or ignore invisible characters. However, since no
one can see the invisible characters he used, I can't say for sure.

If he found a clever way to reorder or change spacing between letters
(e.g. superimpose two characters so that "var11" looks like "var1"), then
the problem can only be fixed by prohibition.

On Thu, Jun 23, 2016 at 15:36 James Hillhouse <jdhillhou...@icloud.com
<mailto:jdhillhou...@icloud.com>> wrote:

    Thanks Xiaodi. That’s a relief to know.


    On Jun 23, 2016, at 3:32 PM, Xiaodi Wu <xiaodi...@gmail.com
    <mailto:xiaodi...@gmail.com>> wrote:

    FWIW, Josh's example would be fixed whether we prohibit or ignore
    invisible characters, but there are other potential strings for
    which prohibition would be more secure.

    On Thu, Jun 23, 2016 at 15:27 James Hillhouse
    <jdhillhou...@icloud.com <mailto:jdhillhou...@icloud.com>> wrote:

        +1 on this. Josh Wisenbaker’s example says enough. Yikes!

        On Jun 23, 2016, at 3:18 PM, David Sweeris via swift-evolution
        <swift-evolution@swift.org <mailto:swift-evolution@swift.org>>
        wrote:

        +1
        I didn't even know there were any invisible characters until
        this thread came up.

        - Dave Sweeris

        On Jun 23, 2016, at 15:13, Xiaodi Wu via swift-evolution
        <swift-evolution@swift.org <mailto:swift-evolution@swift.org>>
        wrote:

        On Thu, Jun 23, 2016 at 2:54 PM, João Pinheiro
        <j...@joaopinheiro.org <mailto:j...@joaopinheiro.org>> wrote:


            > On 23 Jun 2016, at 20:43, Xiaodi Wu <xiaodi...@gmail.com
            > <mailto:xiaodi...@gmail.com>> wrote:
            > That's cool, although my preferred solution would be more
            > closely aligned with UAX #31: overtly disallow the glyphs in
            > Table 4 (instead of ignoring them) except in the specific
            > scenarios for ZWJ and ZWNJ identified in UAX #31, then
            > afterwards internally represent the identifier as its
            > NFC-normalized string.

            Explicitly disallowing them was my initial idea, but I
            think it would end up being a confusing error for users to
            encounter. Ignoring the invisible characters and leaving
            it up to a linter to remove them is less likely to cause
            confusion for users.

            I'll be sure to describe the alternative of explicitly
            prohibiting them in the proposal though.


        I would strongly urge you to propose explicitly prohibiting
        them just as UAX #31 recommends. Their reasoning is that these
        characters, which include those that reverse text direction or
        control joining, can cause one identifier to be maliciously
        changed to look like another. If you ignore these characters
        instead of prohibiting them, an identifier that visually
        appears as one string could in fact be a different one to the
        compiler.

        Moreover, a compiler error can be made helpful by saying that
        the offending character is potentially invisible and it can
        come with a fix-it to remove the offending character. I don't
        think that would confuse the user at all. It would be more
        confusing if invisible characters that caused one thing to
        look identical to another were silently permitted.
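
The helpful-error-plus-fix-it idea above can be sketched roughly as follows,
in Python rather than in the Swift compiler itself. The Diagnostic type and
the diagnose function are hypothetical names, and treating every character of
Unicode general category "Cf" (format) as offending is a simplifying
assumption; it does not model the ZWJ/ZWNJ exceptions in UAX #31.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Diagnostic:
    index: int    # position of the offending character in the identifier
    message: str  # human-readable error that names the invisible character
    fixed: str    # fix-it: the identifier with that one character removed


def diagnose(identifier):
    """Return one diagnostic per format (Cf) character found in the identifier."""
    diagnostics = []
    for index, ch in enumerate(identifier):
        # Assumption: "Cf" stands in for the set of prohibited characters.
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "unnamed format character")
            message = (
                f"potentially invisible character U+{ord(ch):04X} ({name}) "
                f"is not allowed in an identifier"
            )
            fixed = identifier[:index] + identifier[index + 1:]
            diagnostics.append(Diagnostic(index, message, fixed))
    return diagnostics
```

For an input such as "va\u200br1", this reports a single diagnostic at index
2 that names ZERO WIDTH SPACE and offers "var1" as the fix-it, so the user is
told about a character they cannot see instead of having it silently
accepted.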


            Sincerely,
            João Pinheiro


        _______________________________________________
        swift-evolution mailing list
        swift-evolution@swift.org <mailto:swift-evolution@swift.org>
        https://lists.swift.org/mailman/listinfo/swift-evolution