Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
Real Swift code uses very few “unicode” operators, so I would heavily tilt the division towards making most characters identifiers. While I don’t want to talk about specific characters, I often wish I could name variables `∇f` or `∂u∂v`, while no sane API designer would ever use `∇` or `∂` as operators, even though they are considered “mathematical”.

I think the bar for making a character an operator should be higher: no character should be classified as an operator if it can appear in language as part of an identifier.

On Tue, Aug 8, 2017 at 2:10 PM, Nevin Brackett-Rozinsky via swift-evolution wrote:

> Is this proposal still on track, or are there other plans to address the
> issue of operator and identifier characters in Swift?
>
> Nevin
>
> On Fri, Feb 17, 2017 at 12:50 AM, Xiaodi Wu via swift-evolution <
> swift-evolution@swift.org> wrote:
>
>> As Stage 2 of Swift 4 evolution starts now, I'd like to share a revised
>> proposal in draft form.
>>
>> It proposes a source-breaking change for *rationalizing* which
>> characters are permitted in identifiers and which in operators. It's
>> justified for this phase of Swift 4 because:
>>
>> - Existing grammar, in permitting invisible characters without
>>   security-minded restrictions, can be *actively harmful.*
>> - A rationalized approach is *superior* to the current approach: by
>>   referencing Unicode standards, Swift should be able to evolve in a
>>   backwards-compatible way alongside Unicode, and will benefit from the
>>   significant expertise of others outside the Swift community with respect to
>>   Unicode best practices.
>> - The vast majority of existing code (including all of the standard
>>   library) should *require no migration* work at all
>>
>> *What's changed* since the last time:
>>
>> - In an earlier draft, we proposed some radical changes to align with
>>   available Unicode standards; in particular, since emoji represent a
>>   difficult issue, and no recommendations about "operator identifiers" have
>>   surfaced from Unicode, we proposed temporarily stripping them out. This was
>>   *very poorly received*. This revision uses Unicode categories to identify
>>   nearly all emoji and classify them as identifier characters (while
>>   excluding those that depict operators such as !), and it uses Unicode
>>   categories to identify over 900 operators that nearly all pass the
>>   subjective test of "operator-likeness."
>>
>> What this proposal *does not attempt* to do:
>>
>> - This document *does not* seek to stake out new ground as to what
>>   characters should be *added* to the set of valid identifiers and
>>   operators. Such additions to the grammar are properly separate discussions.
>>   This proposal is only an attempt at systemization and rationalization. Only
>>   one character is incidentally added to the list of valid characters (`\`),
>>   and it is on the basis of an explicit table in Unicode Technical Report 25
>>   regarding ASCII characters that are "mathematical."
>>
>> What feedback would be *most helpful*:
>>
>> - "Hey, this approach is so much more *clumsy* than my superior, more
>>   elegant category-based approach to identifying [operators/emoji], which is
>>   [insert here]."
>> - "Hey, I disagree with the detailed design because it's got a *major
>>   security hole*, which is [insert here]."
>> - "Hey, your proposal would break my *real-world* Swift code, which
>>   requires that character [X] be an [identifier/operator]."
>>
>> What would be *less helpful*:
>>
>> - "Hey, let's talk about how [specific character] should be an
>>   [identifier/operator]. We should add that character to the list of
>>   [identifiers/operators]. In fact, let's discuss [list] characters one by
>>   one."
>>
>> Acknowledgments:
>> Thanks to co-authors of the previous take for their support for
>> resurrecting this issue. Any brilliant ideas are undoubtedly theirs, and
>> any botched efforts are certainly mine. Thanks also to Nevin
>> Brackett-Rozinsky for helpful feedback.
>>
>> Link:
>> https://gist.github.com/xwu/d2c2bb7097b0b5a4e9985aae737a2651

___
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
As I said before, I am happy with this proposal overall. I just had a strange thought that I should share before this goes through.

If we make ‘π’ an operator instead of an identifier, then we would be able to write things like 3π directly. For those of us with rational types, we could write (3/4)π.

Another option is that we could have it be a literal with an associated ExpressibleBy… protocol.

Just a thought I wanted to share...
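A rough sketch of what the `3π` notation could approximate under today's grammar. As far as I can tell, `π` is currently an identifier character, so it cannot be declared as a postfix operator; this sketch therefore uses a computed property instead. The property name `π` on `Double` is an assumption made for illustration and is not part of the proposal.

```swift
// Hypothetical illustration: `π` as a computed property on Double, because
// under the current grammar `π` is an identifier character, not an operator.
extension Double {
    /// The receiver multiplied by pi, so that `3.0.π` reads a little like "3π".
    var π: Double { return self * Double.pi }
}

let threePi = 3.0.π
let threeQuartersPi = (3.0 / 4.0).π

print(threePi)
print(threeQuartersPi)
```

If `π` were reclassified as an operator character, the same effect could presumably be written as a postfix operator, at the cost of no longer being able to write `let π = Double.pi`.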
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
On Mon, Feb 27, 2017 at 10:07 PM, Nevin Brackett-Rozinsky via swift-evolution wrote:

> I think the most important goal is to end up with the right set of
> operator and identifier characters for *Swift*. The Unicode guidelines are
> a useful tool for that purpose, and get us a long way toward where we want
> to be. However at the end of the day we should weigh our success by how
> well we have done for Swift, not by how rigidly we adhere to Unicode
> recommendations.
>
> Our treatment of emoji is a great example: the right thing for Swift is
> different from the right thing for Unicode, so we choose to do what works
> best for Swift. This proposal captures that very well.

In fact, I'm greatly dissatisfied with how this proposal captures emoji. Having come up with that scheme, I suspect that it is deficient in subtle or obvious ways that are not yet apparent to me. This is why I have asked for feedback along those lines. Note that for emoji, too, I have deliberately resisted the one-by-one inclusion of certain characters that are excluded by Unicode categories, of which there are a (small) handful. My very strong personal preference, though soundly rejected, would have been to remove the security and forward compatibility headache of supporting emoji altogether. It does not in my opinion hold its own weight.

> Matching what Unicode does should be a means for us, not an end. A stepping
> stone we can use when it helps. Unicode’s categorizations should inform
> and guide our decisions, not constrain them.

Well, now we are talking about overarching principles. The aim of this proposal is in fact to assert that Swift's identifiers and operators should be rationalized in a way that is constrained by Unicode recommendations. Just as Swift aims to provide full support for correct Unicode handling in strings by default, this proposal aims to align the valid characters to current and future Unicode recommendations as tightly as possible. It is anticipated that it should break a very small amount of actual code (if any). It permits Swift to evolve with new developments in Unicode in the future essentially "for free." In exchange we accept imperfections in Unicode as imperfections in Swift. I argue that we should do so because our own imperfections in understanding international character sets will necessarily be greater than that of Unicode experts working systematically.

> With regard to the fact that reclassifying the infinity and empty set
> symbols would be a breaking change, that is all the more reason to do it
> now, for Swift 4, before it is too late. Those two characters have come up
> in every iteration of this discussion on Swift Evolution that I can recall,
> and I have not heard anyone argue that they ought to be operators. I think
> it is safe to consider them low-hanging fruit.

Disagree. As mentioned in the proposal, no attempt is made to expand the set of valid identifier characters to include non-emoji pictographs or symbols. If we adopt your approach, infinity and empty set would be the only non-emoji non-"human language" symbols deliberately allowed in identifiers, an approach no more consistent than the previous proposal to include only the cow and dog emoji. The alternative is to go through a vast swath of symbols character-by-character to determine which is sufficiently "noun-like" to be an identifier, as Unicode does not and (as far as I can tell) will never expand UAX#31 to include such symbols among identifiers.

As I mentioned, it would also be inconsistent to consider excluding only these two characters and not related characters, such as variations on the infinity symbol, from the set of valid operators. Very quickly, the necessity of doing a character-by-character debate balloons to encompass the entire character set. I continue to believe that this is absolutely the wrong approach.

> Nevin
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
I think the most important goal is to end up with the right set of operator and identifier characters for *Swift*. The Unicode guidelines are a useful tool for that purpose, and get us a long way toward where we want to be. However at the end of the day we should weigh our success by how well we have done for Swift, not by how rigidly we adhere to Unicode recommendations.

Our treatment of emoji is a great example: the right thing for Swift is different from the right thing for Unicode, so we choose to do what works best for Swift. This proposal captures that very well.

Matching what Unicode does should be a means for us, not an end. A stepping stone we can use when it helps. Unicode’s categorizations should inform and guide our decisions, not constrain them.

With regard to the fact that reclassifying the infinity and empty set symbols would be a breaking change, that is all the more reason to do it now, for Swift 4, before it is too late. Those two characters have come up in every iteration of this discussion on Swift Evolution that I can recall, and I have not heard anyone argue that they ought to be operators. I think it is safe to consider them low-hanging fruit.

Nevin
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
On Sun, Feb 26, 2017 at 11:50 AM, Nevin Brackett-Rozinsky via swift-evolution wrote:

> This looks very good Xiaodi, and I have a few thoughts about it.
>
> First, is the intent that Swift will follow future changes to Unicode
> operator recommendations, or that Swift will choose a “frozen in time” set
> of Unicode recommendations to adopt? If the former, then we will likely see
> source-breaking changes as Unicode evolves. And if the latter, then Swift’s
> choices are apt to diverge even more from Unicode’s over time.

Great question. I guess the text leaves the mechanics of forward compatibility unsaid. The answer is: both. With respect to Unicode identifiers, UAX#31 guarantees future compatibility for ID_Start and ID_Continue. That is, anything that is currently valid in ID_Start will be valid in ID_Start for all time. It is reasonable to expect that the same experts will adopt that approach for their operator recommendations in the future. Indeed, they have set themselves up fairly well for this already: UAX#31 also guarantees that Pattern_Syntax characters will never be moved into ID_Start or ID_Continue. Therefore, we also have a guarantee that the approach for Swift's operators proposed here will *never* overlap with Swift's identifier characters even as Unicode evolves.

> Second, it is well-established that programming operators do not have to be
> mathematical. For example, Swift uses the punctuation marks ‘!’, ‘?’, and
> ‘&’ as operators in its standard library. The approach described in your
> proposal does an excellent job at covering the core mathematical operator
> characters in Unicode, however it does not appear to make such an effort
> toward non-mathematical operators.
>
> Of particular note, given that ‘?’, ‘¿’, and ‘‽’ are operator characters,
> it seems inconsistent to omit ‘⸘’. Similarly, with ‘&’ an operator, one
> would expect ‘⅋’ to be as well. I see that “expanding the set of operator
> characters” is listed as a non-goal, however that does not make it an
> anti-goal, and the proposal indeed expands the set by adding ‘\’. Likewise
> “rectifying Unicode shortcomings” is listed as a non-goal, although the
> proposal incorporates some 16 characters for Swift 3 compatibility.

Expanding the set of valid operator characters by adding `\` is not a goal for this proposal. However, it so happens that UTR#25 explicitly mentions `\` as an operator. In fact, UTR#25 lists every one of Swift's ASCII operators as mathematical operators not classified as [:Math:], minus `?` but plus `\`. Therefore, if we agree that the alignment of Swift to Unicode recommendations as closely as possible is a desirable goal, the most intellectually honest set of ASCII operators would include `\`. Now, if Swift-specific implementation concerns preclude its inclusion, then I personally wouldn't fight it.

The proposal makes no attempt to define a "non-mathematical operator" because, again, Unicode has no such definition--yet. There is no approach of which I'm aware to achieving consensus on that topic, short of either (a) waiting for more expert hands over at the Unicode Consortium; or (b) a character-by-character survey of all symbols in Unicode by non-experts (I count myself here) on this list, which is an explicit anti-goal of this proposal.

In anticipation of Unicode completing its work, this proposal advances a design that (as I write above) makes possible the adoption of future Unicode recommendations in a source-compatible way. The chief mechanism by which this is guaranteed is by not assigning non-[:Math:] Pattern_Syntax characters (emoji excepted) to either identifiers or operators. It addresses the most common concern of those responding to an earlier version of this proposal, who argued against restricting operators in the interim to only ASCII characters (which would also be a source-compatible approach that makes room for future Unicode recommendations) because there is a set of non-ASCII characters that have unambiguously the characteristics of "operatorlikeness" useful to enable a more math-like syntax.

The proposal here makes no effort to expand our understanding of what an operator is beyond what's required for the Swift standard library plus Unicode's somewhat imperfect classification of mathematical symbols. Indeed, the proposal makes explicit the expectation that Unicode experts will undertake that task.

The 20 characters included for Swift 3 compatibility have as their objective only the preservation of Swift 3 source compatibility. They represent an educated guess (based on public code samples and messages to this list) as to what symbols are most likely to be used in real, shipping Swift code, absent arguments against inclusion on other grounds. They are not intended to represent any attempt at rationalization in alignment with some Unicode-recommended criterion. As I mentioned, I'm eager to hear feedback to the effect that some real, shipping code would be broken by the proposal. I'm s
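The mechanism described in the message above (identifiers drawn from ID_Start/ID_Continue, operators from [:Math:] Pattern_Syntax characters, and the remaining Pattern_Syntax characters left unassigned) can be sketched with the standard library's `Unicode.Scalar.Properties`. This is a simplified illustration only: the `SymbolClass` enum and the `classify` and `overlapExists` functions are hypothetical names, and the sketch deliberately ignores emoji, the ASCII operators, and the Swift 3 compatibility list, all of which the proposal treats separately.

```swift
// Rough buckets suggested by the discussion; the case names are hypothetical.
enum SymbolClass {
    case identifierStart    // has the ID_Start property
    case operatorCandidate  // Pattern_Syntax and Math
    case reserved           // Pattern_Syntax but not Math: left unassigned
    case other              // everything else (digits, combining marks, ...)
}

// Simplified classification: ignores emoji, ASCII operators, and the
// Swift 3 compatibility characters.
func classify(_ scalar: Unicode.Scalar) -> SymbolClass {
    let p = scalar.properties
    if p.isIDStart { return .identifierStart }
    if p.isPatternSyntax {
        return p.isMath ? .operatorCandidate : .reserved
    }
    return .other
}

// Checks the disjointness that the message attributes to UAX#31:
// no Pattern_Syntax scalar is also ID_Start or ID_Continue.
func overlapExists(in range: ClosedRange<UInt32>) -> Bool {
    for value in range {
        // Surrogate code points are not Unicode scalars; skip them.
        guard let scalar = Unicode.Scalar(value) else { continue }
        let p = scalar.properties
        if p.isPatternSyntax && (p.isIDStart || p.isIDContinue) { return true }
    }
    return false
}

print(classify("π"))
print(classify("∞"))
print(classify("⸘"))
print(overlapExists(in: 0...0xFFFF))
```

Because the two property sets do not overlap, a character that is left in the reserved bucket today could later be assigned to operators without colliding with any identifier character.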
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
This looks very good Xiaodi, and I have a few thoughts about it.

First, is the intent that Swift will follow future changes to Unicode operator recommendations, or that Swift will choose a “frozen in time” set of Unicode recommendations to adopt? If the former, then we will likely see source-breaking changes as Unicode evolves. And if the latter, then Swift’s choices are apt to diverge even more from Unicode’s over time.

Second, it is well-established that programming operators do not have to be mathematical. For example, Swift uses the punctuation marks ‘!’, ‘?’, and ‘&’ as operators in its standard library. The approach described in your proposal does an excellent job at covering the core mathematical operator characters in Unicode, however it does not appear to make such an effort toward non-mathematical operators.

Of particular note, given that ‘?’, ‘¿’, and ‘‽’ are operator characters, it seems inconsistent to omit ‘⸘’. Similarly, with ‘&’ an operator, one would expect ‘⅋’ to be as well. I see that “expanding the set of operator characters” is listed as a non-goal, however that does not make it an anti-goal, and the proposal indeed expands the set by adding ‘\’. Likewise “rectifying Unicode shortcomings” is listed as a non-goal, although the proposal incorporates some 16 characters for Swift 3 compatibility.

Another point that may be worth considering is the two specific characters ‘∅’ and ‘∞’, which, although strongly mathematical, are definitely not operators. They are names for things (objects, quantities), and thus by the principle of least surprise they should be available for use in identifier names. Just as one might write “let π = Double.pi” at the top of a file, so too might one write “let ∞ = Double.infinity” or “let ∅ = Set()” for use later on:

    let y = sin(π * x)
    if tan(θ) == ∞ { … }
    var s = ∅

Thus, for the purpose of consistency, I think it makes sense to classify ‘∅’ and ‘∞’ as identifiers, as well as ‘⸘’ and ‘⅋’ as operators. Alternatively, ‘∞’ could be a floating-point literal, in which case it still would not be an operator.

I understand that you described this type of feedback (on particular characters) as “less helpful”, however it appears that the “most helpful” types of feedback are unnecessary: the proposal is well thought out, with a strong core approach. It is only in the fine details that a few improvements can be made, “lesser” though they may be.

Nevin
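For comparison, here is a minimal sketch of how the snippet in the message above can be written under the current grammar. Greek letters such as `π` and `θ` are accepted in identifiers, while `∞` and `∅` are, as far as I can tell, operator characters today and so cannot be used as names. The spelled-out names `infinity` and `emptySet`, the element type `Int`, and the sample values are assumptions chosen for this illustration.

```swift
import Foundation

// `π` and `θ` are Greek letters, which the grammar accepts in identifiers.
let π = Double.pi
let θ = π / 4
let y = sin(π * 0.5)

// `∞` and `∅` are operator characters in the current grammar, so spelled-out
// names (hypothetical, chosen for this sketch) stand in for them here.
let infinity = Double.infinity
let emptySet = Set<Int>()

if tan(θ) == infinity {
    print("tan(θ) is unbounded")
} else {
    print("tan(θ) is finite")
}

var s = emptySet
s.insert(1)
print(y, s.count)
```

Under the reclassification suggested in the message, the two spelled-out constants could instead be named `∞` and `∅` directly.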
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
> On Feb 16, 2017, at 9:50 PM, Xiaodi Wu via swift-evolution wrote:
>
> As Stage 2 of Swift 4 evolution starts now, I'd like to share a revised
> proposal in draft form.
>
> It proposes a source-breaking change for rationalizing which characters are
> permitted in identifiers and which in operators. It's justified for this
> phase of Swift 4 because:
>
> - Existing grammar, in permitting invisible characters without
>   security-minded restrictions, can be actively harmful.
> - A rationalized approach is superior to the current approach: by referencing
>   Unicode standards, Swift should be able to evolve in a backwards-compatible
>   way alongside Unicode, and will benefit from the significant expertise of
>   others outside the Swift community with respect to Unicode best practices.
> - The vast majority of existing code (including all of the standard library)
>   should require no migration work at all
>
> What's changed since the last time:
>
> - In an earlier draft, we proposed some radical changes to align with
>   available Unicode standards; in particular, since emoji represent a difficult
>   issue, and no recommendations about "operator identifiers" have surfaced from
>   Unicode, we proposed temporarily stripping them out. This was very poorly
>   received. This revision uses Unicode categories to identify nearly all emoji
>   and classify them as identifier characters (while excluding those that depict
>   operators such as !), and it uses Unicode categories to identify over 900
>   operators that nearly all pass the subjective test of "operator-likeness."

I was one of the people leading the charge for preserving Emoji support and I really like where this proposal landed. Thank you to all the authors for the hard work!

+1

Russ
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
On Mon, Feb 20, 2017 at 12:29 PM, Alex Blewitt wrote:

> On 17 Feb 2017, at 05:50, Xiaodi Wu via swift-evolution <
> swift-evolution@swift.org> wrote:
>
>> As Stage 2 of Swift 4 evolution starts now, I'd like to share a revised
>> proposal in draft form.
>>
>> It proposes a source-breaking change for *rationalizing* which characters
>> are permitted in identifiers and which in operators.
>>
>> What feedback would be *most helpful*:
>>
>> - "Hey, this approach is so much more *clumsy* than my superior, more
>>   elegant category-based approach to identifying [operators/emoji], which is
>>   [insert here]."
>> - "Hey, I disagree with the detailed design because it's got a *major
>>   security hole*, which is [insert here]."
>> - "Hey, your proposal would break my *real-world* Swift code, which
>>   requires that character [X] be an [identifier/operator]."
>
> I like the approach taken here, and it is a much better way of concluding
> the characters. I don't disagree with the design and don't have any example
> code that will be affected, but I do have some (minor) observations about
> the proposal.

Thanks Alex! I've updated the document accordingly. Here's the link:
https://github.com/xwu/swift-evolution/blob/d1643c5c451232a277fe77b22fb891cdae90dcb4/proposals/-refining-identifier-and-operator-symbology.md

> * The 'Dots' treatment feels like a special case in an otherwise good
> write-up of Unicode, seemingly to lean towards Dart's method chaining
> and/or cleanliness of implementation. It might be clearer to pull that out
> to its own proposal, either independent of or building upon the general
> Unicode changes?

Excellent point. I've removed mentions of method cascades. The rationale for revising the "dots rule" is clarified in the context of alignment to Unicode (or more accurately here, skating to where Unicode will be).

> * The grammar changes for the operator head contain a number of (what seems
> like) hand-picked unicode symbols for increased compatibility with Swift 3
> (e.g. dagger and friends). Maybe these could be pulled out into their own
> group e.g. operator-head -> operator-head-swift3, to call out the reason
> for their hand-picked nature (and for later cleanup, should that be
> required).

Done.

> * With the proposed solution tables (shall be an identifier/is an identifier),
> it wasn't clear to me at first what the rows and columns were. Maybe calling
> these out as a bulleted list would be better:
>
> - Identifiers under Swift 3 and this proposal: 120,617 code points
> - Identifiers that would be added under this proposal: 699 emoji
> - Identifiers under Swift 3 that would no longer be an identifier:
>   unassigned code points and 4,929 other code points
>
> Similarly, for operators:
>
> - Operators under Swift 3 and this proposal: 986 code points
> - Operators that would be added under this proposal: \
> - Operators under Swift 3 that would no longer be an operator:
>   unassigned code points and 2,024 other code points
>
> You could summarise that as a pseudo-diff --stat
>
> Identifiers
> + 699 emoji
>   120,617 code points
> - 4,929 code points and unassigned code points
>
> Operators
> + 1 code point \
>   986 code points
> - 2,024 code points
>
> Alternatively you could change the 'Is an identifier/operator' to 'Is a
> Swift 3 identifier' to make it clear that it's the Swift 3 header, but the
> tabular form is still not that clear to me.

I've converted this to bulleted lists like you suggest.

> Another stat that would be worth calling out: of the 2,024 code points
> that are no longer operators, what the overlap is with the 699 emoji that
> are added to the identifiers? If they were all of them then it would only
> be 1,325 operators that were no longer valid.

The answer to that is 98; the other 601 are emoji sequences that weren't permitted previously. I've incorporated this information into the text.

> To conclude: I like the look of the proposal from the block set
> definition, which will be better than hand-picking the character set as the
> grammar currently stands.
>
> Alex
Re: [swift-evolution] [Draft] Refining identifier and operator symbology (take 2)
> On 17 Feb 2017, at 05:50, Xiaodi Wu via swift-evolution wrote:
>
> As Stage 2 of Swift 4 evolution starts now, I'd like to share a revised
> proposal in draft form.
>
> It proposes a source-breaking change for rationalizing which characters are
> permitted in identifiers and which in operators.
>
> What feedback would be most helpful:
>
> - "Hey, this approach is so much more clumsy than my superior, more elegant
>   category-based approach to identifying [operators/emoji], which is [insert
>   here]."
> - "Hey, I disagree with the detailed design because it's got a major security
>   hole, which is [insert here]."
> - "Hey, your proposal would break my real-world Swift code, which requires
>   that character [X] be an [identifier/operator]."

I like the approach taken here, and it is a much better way of concluding the characters. I don't disagree with the design and don't have any example code that will be affected, but I do have some (minor) observations about the proposal.

* The 'Dots' treatment feels like a special case in an otherwise good write-up of Unicode, seemingly to lean towards Dart's method chaining and/or cleanliness of implementation. It might be clearer to pull that out to its own proposal, either independent of or building upon the general Unicode changes?

* The grammar changes for the operator head contain a number of (what seems like) hand-picked unicode symbols for increased compatibility with Swift 3 (e.g. dagger and friends). Maybe these could be pulled out into their own group e.g. operator-head -> operator-head-swift3, to call out the reason for their hand-picked nature (and for later cleanup, should that be required).

* With the proposed solution tables (shall be an identifier/is an identifier), it wasn't clear to me at first what the rows and columns were. Maybe calling these out as a bulleted list would be better:

- Identifiers under Swift 3 and this proposal: 120,617 code points
- Identifiers that would be added under this proposal: 699 emoji
- Identifiers under Swift 3 that would no longer be an identifier: unassigned code points and 4,929 other code points

Similarly, for operators:

- Operators under Swift 3 and this proposal: 986 code points
- Operators that would be added under this proposal: \
- Operators under Swift 3 that would no longer be an operator: unassigned code points and 2,024 other code points

You could summarise that as a pseudo-diff --stat

Identifiers
+ 699 emoji
  120,617 code points
- 4,929 code points and unassigned code points

Operators
+ 1 code point \
  986 code points
- 2,024 code points

Alternatively you could change the 'Is an identifier/operator' to 'Is a Swift 3 identifier' to make it clear that it's the Swift 3 header, but the tabular form is still not that clear to me.

Another stat that would be worth calling out: of the 2,024 code points that are no longer operators, what the overlap is with the 699 emoji that are added to the identifiers? If they were all of them then it would only be 1,325 operators that were no longer valid.

To conclude: I like the look of the proposal from the block set definition, which will be better than hand-picking the character set as the grammar currently stands.

Alex