The major problem with this approach is that visual glyphs themselves have one level of variable-length encoding, and they sit on top of another variable-length encoding used to represent the Unicode characters (Swift-native Strings are currently encoded as UTF-8). For instance, the visual glyph 🇺🇸 is the result of putting the Unicode characters 🇺 and 🇸 ("REGIONAL INDICATOR SYMBOL LETTER U" and "REGIONAL INDICATOR SYMBOL LETTER S") side by side, and each of those is itself encoded as four bytes of UTF-8. A design in which you can "just write" string[4544] hides the fact that indexing is a linear-time operation: it has to decode UTF-8 byte sequences into Unicode characters and then compose visual glyphs on top of them.
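To make the layering concrete, here's a quick sketch (written against current Swift API names; treat it as illustrative, not gospel):

```swift
let flag = "🇺🇸"

// One visual glyph (what Swift calls an extended grapheme cluster)...
print(flag.count)                 // 1

// ...composed of two Unicode scalars...
print(flag.unicodeScalars.count)  // 2 (U+1F1FA, U+1F1F8)

// ...stored as 8 UTF-8 bytes (4 per scalar).
print(flag.utf8.count)            // 8

// There is no O(1) string[4544]; reaching position 6 means walking the
// string from the start, decoding bytes and grouping glyphs along the way.
let text = "flag: 🇺🇸!"
let i = text.index(text.startIndex, offsetBy: 6)  // linear-time traversal
print(text[i])                    // 🇺🇸
```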
Generally speaking, I *think* I agree that human-geared "long strings", on which you probably won't need random access, and machine-geared smaller strings that encode a command could benefit from not being considered the same fundamental thing. However, I'm also afraid that this will end with more applications and websites assuming that first names only contain 7-bit-clean characters in the A-Z range. (I live in the US, and I can attest that this is still very common.) You could also make the point that better facilities for parsing strings would address this issue. (A rough sketch of what such a restricted string type might look like follows the quoted messages below.)

Félix

> On Aug 15, 2016, at 10:52:02, Kenny Leung via swift-evolution
> <swift-evolution@swift.org> wrote:
>
> I agree with both points of view. I think we need to bring back subscripting
> on strings which does the thing people would most commonly expect.
>
> I would say that the subscript indexes should correspond to a visual glyph.
> This seems reasonable to me for most scripts, like Roman, Cyrillic, or
> Chinese. There is some doubt in my mind for things like subscripted Japanese
> or connected (ligatured?) languages like Arabic, Hindi, or Thai.
>
> -Kenny
>
>> On Aug 15, 2016, at 10:42 AM, Xiaodi Wu via swift-evolution
>> <swift-evolution@swift.org> wrote:
>>
>> On Sun, Aug 14, 2016 at 5:41 PM, Michael Savich via swift-evolution
>> <swift-evolution@swift.org> wrote:
>>
>> Back in Swift 1.0, subscripting a String was easy: you could just use
>> subscripting in a very Python-like way. But now, things are a bit more
>> complicated. I recognize why we need syntax like
>> str.startIndex.advancedBy(x), but it has its downsides. Namely, it makes
>> things hard on beginners. If one of Swift's goals is to make it a great
>> first language, this syntax fights that. Imagine having to explain Unicode
>> and character size to an 8-year-old. This is doubly problematic because
>> String manipulation is one of the first things new coders might want to do.
>>
>> What about having an InternalString subclass that only supports one
>> encoding, allowing it to be subscripted with Ints? The idea is that an
>> InternalString is for Strings that are more or less hard-coded into the
>> app: dictionary keys, enum raw values, that kind of stuff. This also has
>> the added benefit of forcing the programmer to think about what the String
>> is being used for. Is it user-facing? Or is it just for internal use? And
>> of course, it makes code dealing with String manipulation much more
>> concise and readable.
>>
>> It follows that something like this would need to be entered as a literal
>> to make it as easy as using String. One way would be to make all String
>> literals InternalStrings, but that sounds far too drastic. Maybe appending
>> an exclamation point, like "this"! Or even just wrapping the whole thing
>> in exclamation marks, like !"this"! Of course, we could go old school and
>> write it like @"this" … That last one is a joke.
>>
>> I'll be the first to admit I'm way in over my head here, so I'm very open
>> to suggestions and criticism. Thanks!
>>
>> I can sympathize, but this is tricky.
>>
>> Fundamentally, if it's going to be a learning and teaching issue, then this
>> "easy" string should be the default. That is to say, if I write `var a =
>> "Hello, world!"`, then `a` should be inferred to be of type InternalString
>> or EasyString, whatever you want to call it.
>>
>> But, we also want Swift to support Unicode by default, and we want that
>> support to do things The Right Way(TM) by default.
>> In other words, a user should not have to reach for a special type in
>> order to handle arbitrary strings correctly, and I should be able to
>> reassign `a = "你好"` and have things work as expected. So, we also can't
>> have the "easy" string type be the default...
>>
>> I can't think of a way to square that circle.
>>
>> Sent from my iPad
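For what it's worth, here is roughly what the restricted type mentioned above could look like. The name `ASCIIString` and every detail here are mine, purely illustrative, not a concrete proposal:

```swift
// Hypothetical sketch of an ASCII-only, Int-subscriptable string.
struct ASCIIString {
    private let bytes: [UInt8]

    // Fails if the input contains anything outside the 7-bit ASCII range.
    init?(_ string: String) {
        guard string.unicodeScalars.allSatisfy({ $0.isASCII }) else { return nil }
        self.bytes = Array(string.utf8)
    }

    // Fixed-width elements are what make an Int subscript honest O(1).
    subscript(i: Int) -> Character {
        return Character(UnicodeScalar(bytes[i]))
    }

    var count: Int { return bytes.count }
}

// Usage: machine-geared command strings, not user-facing text.
if let command = ASCIIString("GET /index.html") {
    print(command[0], command.count)  // G 15
}
```

Note that this sidesteps rather than solves the default-type problem Xiaodi raises: the failable initializer is exactly the "special type" a user would still have to reach for.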