It's 2016. "The thing people would most commonly expect" is impossible-to-screw-up Unicode support that's performant. Optimizing developer experience for beginning developers is just going to lead to software that screws up in situations the developer doesn't anticipate, as Félix notes above.
Zachary

On Wed, Aug 17, 2016, at 09:40 AM, Kenny Leung via swift-evolution wrote:
> I understand that the most friendly approach may not be the most
> efficient, but that’s not what I’m pushing for. I’m pushing for "does the
> thing people would most commonly expect". Take a first-time programmer
> who reads any (human) language, and that is what they would expect.
>
> Why couldn’t String’s internal storage format be glyph-based? If I were,
> say, writing a text editor, it would certainly be the easiest and most
> efficient format to work in.
>
> -Kenny
>
>
> > On Aug 15, 2016, at 9:20 PM, Félix Cloutier <felix...@yahoo.ca> wrote:
> >
> > The major problem with this approach is that visual glyphs themselves have
> > one level of variable-length encoding, and they sit on top of another
> > variable-length encoding used to represent the Unicode characters
> > (Swift-native Strings are currently encoded as UTF-8). For instance, the
> > visual glyph 🇺🇸 is the result of putting side-by-side the Unicode
> > characters 🇺 and 🇸 ("REGIONAL INDICATOR SYMBOL LETTER U" and "REGIONAL
> > INDICATOR SYMBOL LETTER S"), which are themselves encoded as UTF-8 using 4
> > bytes each. A design in which you can "just write" string[4544] hides the
> > fact that indexing is a linear-time operation that needs to recompose UTF-8
> > characters and then recompose visual glyphs on top of that.
> >
> > Generally speaking, I *think* that I agree that human-geared "long strings"
> > on which you probably won't need random access, and machine-geared smaller
> > strings that encode a command, could benefit from not being considered the
> > same fundamental thing. However, I'm also afraid that this will end with
> > more applications and websites that think that first names only contain
> > 7-bit-clean characters in the A-Z range. (I live in the US and I can attest
> > that this is still very common.)
> >
> > You could make a point too that better facilities to parse strings would
> > probably address this issue.
> >
> > Félix
> >
> >> On Aug 15, 2016, at 10:52:02, Kenny Leung via swift-evolution
> >> <swift-evolution@swift.org> wrote:
> >>
> >> I agree with both points of view. I think we need to bring back
> >> subscripting on strings which does the thing people would most commonly
> >> expect.
> >>
> >> I would say that the subscript's indexes should correspond to a visual
> >> glyph. This seems reasonable to me for most character sets like Roman,
> >> Cyrillic, Chinese. There is some doubt in my mind for things like
> >> subscripted Japanese or connected (ligatured?) languages like Arabic,
> >> Hindi or Thai.
> >>
> >> -Kenny
> >>
> >>
> >>> On Aug 15, 2016, at 10:42 AM, Xiaodi Wu via swift-evolution
> >>> <swift-evolution@swift.org> wrote:
> >>>
> >>> On Sun, Aug 14, 2016 at 5:41 PM, Michael Savich via swift-evolution
> >>> <swift-evolution@swift.org> wrote:
> >>> Back in Swift 1.0, subscripting a String was easy, you could just use
> >>> subscripting in a very Python-like way. But now, things are a bit more
> >>> complicated. I recognize why we need syntax like
> >>> str.startIndex.advancedBy(x) but it has its downsides. Namely, it makes
> >>> things hard on beginners. If one of Swift's goals is to make it a great
> >>> first language, this syntax fights that. Imagine having to explain
> >>> Unicode and character size to an 8 year old. This is doubly problematic
> >>> because String manipulation is one of the first things new coders might
> >>> want to do.
> >>>
> >>> What about having an InternalString subclass that only supports one
> >>> encoding, allowing it to be subscripted with Ints? The idea is that an
> >>> InternalString is for Strings that are more or less hard coded into the
> >>> app. Dictionary keys, enum raw values, that kind of stuff.
> >>> This also has the added benefit of forcing the programmer to think about
> >>> what the String is being used for. Is it user facing? Or is it just for
> >>> internal use? And of course, it makes code dealing with String
> >>> manipulation much more concise and readable.
> >>>
> >>> It follows that something like this would need to be entered as a literal
> >>> to make it as easy as using String. One way would be to make all String
> >>> literals InternalStrings, but that sounds far too drastic. Maybe
> >>> appending an exclamation point like "this"! Or even just wrapping the
> >>> whole thing in exclamation marks like !"this"! Of course, we could go old
> >>> school and write it like @"this" …That last one is a joke.
> >>>
> >>> I'll be the first to admit I'm way in over my head here, so I'm very open
> >>> to suggestions and criticism. Thanks!
> >>>
> >>> I can sympathize, but this is tricky.
> >>>
> >>> Fundamentally, if it's going to be a learning and teaching issue, then
> >>> this "easy" string should be the default. That is to say, if I write
> >>> `var a = "Hello, world!"`, then `a` should be inferred to be of type
> >>> InternalString or EasyString, whatever you want to call it.
> >>>
> >>> But, we also want Swift to support Unicode by default, and we want that
> >>> support to do things The Right Way(TM) by default. In other words, a user
> >>> should not have to reach for a special type in order to handle arbitrary
> >>> strings correctly, and I should be able to reassign `a = "你好"` and have
> >>> things work as expected. So, we also can't have the "easy" string type be
> >>> the default...
> >>>
> >>> I can't think of a way to square that circle.
> >>>
> >>> Sent from my iPad
> >>>
> >>> _______________________________________________
> >>> swift-evolution mailing list
> >>> swift-evolution@swift.org
> >>> https://lists.swift.org/mailman/listinfo/swift-evolution
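
For anyone reading this thread later: Félix's 🇺🇸 example can be checked directly. The snippet below is only a sketch, and it assumes a current Swift toolchain (Swift 5 or later), not the 2016 one under discussion, so spellings such as `count` and `index(_:offsetBy:)` differ from the `advancedBy` syntax quoted in the thread. The variable names are made up for illustration. It shows the same string measured at three levels (glyphs, Unicode scalars, UTF-8 bytes) and the explicit index walk that stands in for an Int subscript.

```swift
// One visible glyph, built from two regional indicator scalars.
let flag = "🇺🇸"

// Three views of the same string, from coarsest to finest.
let glyphs = flag.count                  // extended grapheme clusters (Character)
let scalars = flag.unicodeScalars.count  // Unicode scalars
let bytes = flag.utf8.count              // UTF-8 code units

print(glyphs, scalars, bytes)  // prints "1 2 8"

// String has no Int subscript. Reaching the glyph at offset 3 means asking
// for an index explicitly, which walks the string from startIndex.
let greeting = "Hi 🇺🇸!"
let i = greeting.index(greeting.startIndex, offsetBy: 3)
print(greeting[i])  // prints "🇺🇸"
```

The explicit `index(_:offsetBy:)` call is the design choice being argued about above: it keeps the cost of the walk visible at the call site, where something like `greeting[3]` would hide it.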