I’m not sure what the current year of the Gregorian calendar has to do with strings. :P
l8r
Sean

> On Aug 17, 2016, at 2:20 PM, Shawn Erickson via swift-evolution
> <swift-evolution@swift.org> wrote:
>
> As stated earlier it is 2016, I think the baseline should be robust Unicode
> support and what we have in Swift is actually a fairly good way of dealing
> with it IMHO. I think new to development folks should have this as their
> baseline as well... not that we shouldn't make it as easy to work with as
> possible.
>
> -Shawn
>
> On Wed, Aug 17, 2016 at 12:15 PM Kenny Leung via swift-evolution
> <swift-evolution@swift.org> wrote:
> It seems to me that UTF-8 is the best choice to encode strings in English and
> English-like character sets for storage, but it’s not clear that it is the
> most useful or performant internal representation for working with strings.
> In my opinion, conflating the preferred storage format and the best internal
> representation is not the proper thing to do. Picking the right internal
> storage format should be evaluated based on its own criteria. Even as an
> experienced programmer, I assert that the most useful indexing system is
> glyph based.
>
> In Félix’s case, I would expect to have to ask for a mail-friendly
> representation of his name, just like you have to ask for a
> filesystem-friendly representation of a filename regardless of what the
> internal representation is. Just because you are using UTF-8 as the internal
> format, it does not mean that universal support is guaranteed.
>
> In response to this statement: “Optimizing developer experience for beginning
> developers is just going to lead to software that screws…”, the current
> system trips up not only beginning developers, but is different from pretty
> much every programming language in my experience.
>
> -Kenny
>
>
> > On Aug 17, 2016, at 11:48 AM, Zach Waldowski via swift-evolution
> > <swift-evolution@swift.org> wrote:
> >
> > It's 2016, "the thing people would most commonly expect" is
> > impossible-to-screw-up Unicode support that's performant.
> > Optimizing developer experience for beginning developers is just going to
> > lead to software that screws up in situations the developer doesn't
> > anticipate, as Félix notes above.
> >
> > Zachary
> >
> > On Wed, Aug 17, 2016, at 09:40 AM, Kenny Leung via swift-evolution
> > wrote:
> >> I understand that the most friendly approach may not be the most
> >> efficient, but that’s not what I’m pushing for. I’m pushing for "does the
> >> thing people would most commonly expect". Take a first-time programmer
> >> who reads any (human) language, and that is what they would expect.
> >>
> >> Why couldn’t String’s internal storage format be glyph-based? If I were,
> >> say, writing a text editor, it would certainly be the easiest and most
> >> efficient format to work in.
> >>
> >> -Kenny
> >>
> >>
> >>> On Aug 15, 2016, at 9:20 PM, Félix Cloutier <felix...@yahoo.ca> wrote:
> >>>
> >>> The major problem with this approach is that visual glyphs themselves
> >>> have one level of variable-length encoding, and they sit on top of
> >>> another variable-length encoding used to represent the Unicode characters
> >>> (Swift-native Strings are currently encoded as UTF-8). For instance, the
> >>> visual glyph 🇺🇸 is the result of putting side-by-side the Unicode
> >>> characters 🇺 and 🇸 ("REGIONAL INDICATOR SYMBOL LETTER U" and "REGIONAL
> >>> INDICATOR SYMBOL LETTER S"), which are themselves encoded as UTF-8 using
> >>> 4 bytes each. A design in which you can "just write" string[4544] hides
> >>> the fact that indexing is a linear-time operation that needs to recompose
> >>> UTF-8 characters and then recompose visual glyphs on top of that.
> >>>
> >>> Generally speaking, I *think* that I agree that human-geared "long
> >>> string" on which you probably won't need random access, and
> >>> machine-geared smaller strings that encode a command, could benefit from
> >>> not being considered the same fundamental thing.
> >>> However, I'm also afraid
> >>> that this will end with more applications and websites that think that
> >>> first names only contain 7-bit-clean characters in the A-Z range. (I live
> >>> in the US and I can attest that this is still very common.)
> >>>
> >>> You could make a point too that better facilities to parse strings would
> >>> probably address this issue.
> >>>
> >>> Félix
> >>>
> >>>> On Aug 15, 2016, at 10:52:02, Kenny Leung via swift-evolution
> >>>> <swift-evolution@swift.org> wrote:
> >>>>
> >>>> I agree with both points of view. I think we need to bring back
> >>>> subscripting on strings which does the thing people would most commonly
> >>>> expect.
> >>>>
> >>>> I would say that the subscript indexes should correspond to a visual
> >>>> glyph. This seems reasonable to me for most character sets like Roman,
> >>>> Cyrillic, Chinese. There is some doubt in my mind for things like
> >>>> subscripted Japanese or connected (ligatured?) languages like Arabic,
> >>>> Hindi or Thai.
> >>>>
> >>>> -Kenny
> >>>>
> >>>>
> >>>>> On Aug 15, 2016, at 10:42 AM, Xiaodi Wu via swift-evolution
> >>>>> <swift-evolution@swift.org> wrote:
> >>>>>
> >>>>> On Sun, Aug 14, 2016 at 5:41 PM, Michael Savich via swift-evolution
> >>>>> <swift-evolution@swift.org> wrote:
> >>>>> Back in Swift 1.0, subscripting a String was easy: you could just use
> >>>>> subscripting in a very Python-like way. But now, things are a bit more
> >>>>> complicated. I recognize why we need syntax like
> >>>>> str.startIndex.advancedBy(x) but it has its downsides. Namely, it makes
> >>>>> things hard on beginners. If one of Swift's goals is to make it a great
> >>>>> first language, this syntax fights that. Imagine having to explain
> >>>>> Unicode and character size to an 8-year-old. This is doubly problematic
> >>>>> because String manipulation is one of the first things new coders might
> >>>>> want to do.
> >>>>>
> >>>>> What about having an InternalString subclass that only supports one
> >>>>> encoding, allowing it to be subscripted with Ints? The idea is that an
> >>>>> InternalString is for Strings that are more or less hard coded into the
> >>>>> app. Dictionary keys, enum raw values, that kind of stuff. This also
> >>>>> has the added benefit of forcing the programmer to think about what the
> >>>>> String is being used for. Is it user facing? Or is it just for internal
> >>>>> use? And of course, it makes code dealing with String manipulation much
> >>>>> more concise and readable.
> >>>>>
> >>>>> It follows that something like this would need to be entered as a
> >>>>> literal to make it as easy as using String. One way would be to make
> >>>>> all String literals InternalStrings, but that sounds far too drastic.
> >>>>> Maybe appending an exclamation point like "this"! Or even just wrapping
> >>>>> the whole thing in exclamation marks like !"this"! Of course, we could
> >>>>> go old school and write it like @"this" …That last one is a joke.
> >>>>>
> >>>>> I'll be the first to admit I'm way in over my head here, so I'm very
> >>>>> open to suggestions and criticism. Thanks!
> >>>>>
> >>>>> I can sympathize, but this is tricky.
> >>>>>
> >>>>> Fundamentally, if it's going to be a learning and teaching issue, then
> >>>>> this "easy" string should be the default. That is to say, if I write
> >>>>> `var a = "Hello, world!"`, then `a` should be inferred to be of type
> >>>>> InternalString or EasyString, whatever you want to call it.
> >>>>>
> >>>>> But, we also want Swift to support Unicode by default, and we want that
> >>>>> support to do things The Right Way(TM) by default. In other words, a
> >>>>> user should not have to reach for a special type in order to handle
> >>>>> arbitrary strings correctly, and I should be able to reassign
> >>>>> `a = "你好"` and have things work as expected. So, we also can't have the
> >>>>> "easy" string type be the default...
> >>>>>
> >>>>> I can't think of a way to square that circle.
> >>>>>
> >>>>>
> >>>>> Sent from my iPad
> >>>>>
> >>>>> _______________________________________________
> >>>>> swift-evolution mailing list
> >>>>> swift-evolution@swift.org
> >>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
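
For reference, here is a minimal sketch of the point Félix makes in this thread about glyphs sitting on top of two layers of variable-length encoding. It assumes a current Swift toolchain (Swift 4 or later, where `String` has `count`); the variable names are illustrative and do not come from the thread. It counts the same flag string at three levels, then reaches a character by walking from `startIndex` with `index(_:offsetBy:)`, which is the Swift 3 and later spelling of the `str.startIndex.advancedBy(x)` call mentioned in the thread.

```swift
// The flag glyph, written as its two regional indicator scalars (U and S).
let flag = "\u{1F1FA}\u{1F1F8}"

// One user-visible glyph (a grapheme cluster, Swift's Character)...
let glyphCount = flag.count
// ...built from two Unicode scalars...
let scalarCount = flag.unicodeScalars.count
// ...each of which takes 4 bytes in UTF-8.
let byteCount = flag.utf8.count

print(glyphCount, scalarCount, byteCount)  // prints "1 2 8"

// String has no Int subscript. Reaching the Character at a given offset
// means walking from startIndex, which takes time proportional to the offset.
let greeting = "Hello, world!"
let position = greeting.index(greeting.startIndex, offsetBy: 7)
print(greeting[position])                  // prints "w"
```

The three counts differ for the same string, which is why a plain `string[4544]` subscript would have to pick one level and could not be constant-time at the glyph level.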