Sorry that my term "offset indexable" was unclear. What I mean by "offset indexable" is providing "collection-element-level offset indexing". This kind of indexing could be provided for any collection, because it is a basic concept of collections. The Unicode (encoded) offset is a different thing, and it is important to String. So I just want the team to consider providing collections with collection-element-level offset indexing as an assistant to String.Index (which is Unicode-level offset indexing).
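As a rough sketch of the distinction (this is not the code from the linked repo): the extension below counts elements with `index(_:offsetBy:)` and `distance(from:to:)`, which is linear for String. The subscript labels `offset:` and `offsets:` and the helper `offset(of:)` are hypothetical names chosen for illustration, not standard-library API. `firstIndex(of:)` is the current name of the `index(of:)` method used elsewhere in this thread.

```swift
extension Collection {
    /// Element at `offset` positions past `startIndex`, or nil when out of bounds.
    /// Hypothetical API; O(n) for collections that are not random-access.
    subscript(offset offset: Int) -> Element? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i != endIndex
        else { return nil }
        return self[i]
    }

    /// Sub-sequence selected by element offsets (hypothetical API).
    /// Traps if the range runs past the end, like index(_:offsetBy:) does.
    subscript(offsets range: Range<Int>) -> SubSequence {
        let lower = index(startIndex, offsetBy: range.lowerBound)
        let upper = index(lower, offsetBy: range.count)
        return self[lower..<upper]
    }

    /// Element-level offset of an existing index (hypothetical helper).
    func offset(of index: Index) -> Int {
        return distance(from: startIndex, to: index)
    }
}

let digits = "01234"
let third = digits[offset: 2]                 // Optional("2")
let pastEnd = digits[offset: 5]               // nil
let middle = String(digits[offsets: 1..<3])   // "12"

// "e" followed by a combining accent is one Character but two UTF-16 units,
// so the element offset and the encoded offset disagree.
let accented = "e\u{301}x"
let elementCount = accented.count             // 2
let utf16Count = accented.utf16.count         // 3
let second = accented[offset: 1]              // Optional("x")

// Going the other way: from a String.Index back to an element offset.
let offsetOfThree = digits.firstIndex(of: "3").map { digits.offset(of: $0) }  // Optional(3)
```

The labeled subscripts keep the linear cost visible at the call site, so an unlabeled `s[i]` keeps its usual meaning.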
> On Dec 15, 2017, at 9:34 AM, Michael Ilseman <milse...@apple.com> wrote:
>
> Yes, I was trying to highlight that they are different and should be treated
> differently. This was because it seemed you were conflating the two in your
> argument. You claim that people expect it, and I’m pointing out that what
> people actually expect (assuming they’re coming from C or languages with a
> similar model) already exists as those models deal in encoded offsets.
>
> More important than expectations surrounding what to provide to a subscript
> are expectations surrounding algorithmic complexity. This has security
> implications. The expectation of subscript is that it is “constant-ish”, for
> a fuzzy hand-wavy definition of “constant-ish” which includes amortized
> constant or logarithmic.
>
> Now, I agree with the overall sentiment that `index(offsetBy:)` is unwieldy.
> I am interested in approaches to improve this. But, we cannot throw linear
> complexity into subscript without extreme justification.
>
>
>> On Dec 14, 2017, at 5:25 PM, Cao, Jiannan <frog...@163.com> wrote:
>>
>> This offset is the Unicode offset, not the offset of an element.
>> For example, for index(startIndex, offsetBy:1) the encodedOffset may be 4
>> or 8, not 1.
>>
>> Offset indexing is based on counting elements/indices: it gives the same
>> result as s.index(s.startIndex, offsetBy:i).
>> The encodedOffset is the underlying offset into the Unicode string, which
>> is not the same concept as the offset index of a collection.
>>
>> Offset indexing is meaningful in terms of the elements and indices of the
>> collection (the i-th element of the collection); it is not related to the
>> Unicode offset (which is the underlying data offset, meaningful to the
>> UTF-16 String).
>>
>> These two offsets are totally different.
>>
>> Best,
>> Jiannan
>>
>>> On Dec 15, 2017, at 9:17 AM, Michael Ilseman <milse...@apple.com> wrote:
>>>
>>>
>>>
>>>> On Dec 14, 2017, at 4:49 PM, Cao, Jiannan via swift-dev
>>>> <swift-...@swift.org> wrote:
>>>>
>>>> People are used to the offset index system rather than String.Index.
>>>> Using offset indices to name and count the elements is normal and
>>>> natural.
>>>>
>>>
>>> The offset system that you’re referring to is totally available in String
>>> today, if you’re willing for it to be the offset into the encoding. That’s
>>> the offset “people” you’re referring to are likely used to and consider
>>> normal and natural. On String.Index, there is the following:
>>>
>>> init(encodedOffset offset: Int)
>>>
>>> and
>>>
>>> var encodedOffset: Int { get }
>>>
>>>
>>> [1] https://developer.apple.com/documentation/swift/string.index
>>>
>>>
>>>> This offset index system has a long history and a real meaning to the
>>>> collection. The subscript s[i] has a fixed meaning of "getting the i-th
>>>> element in this collection", which is normal and direct. Getting a range
>>>> with offset indices is also direct: it means the substring runs from the
>>>> i-th character up to the j-th character of the original string.
>>>>
>>>> People are used to working with subscripts and ranges in terms of offset
>>>> indices. Using string[string.index(i, offsetBy: 5)] is not as direct or
>>>> easy as string[i + 5]. Likewise, Range<String.Index> is not as direct as
>>>> Range<Offset>. Developers need to convert the Range<String.Index> result
>>>> of string.range(of:) to Range<OffsetIndex> to know the exact range of the
>>>> substring.
>>>> Range<String.Index> has a real meaning to the machine and the
>>>> underlying data location of the substring, but Range<OffsetIndex> also
>>>> carries direct location information for human beings, and represents the
>>>> abstract location concept of the collection (this is the most
>>>> UNIMPEACHABLE REASON I can provide).
>>>>
>>>> The offset index system is based on the nature of collections. Each
>>>> element of a collection can be located by offset, which is a direct and
>>>> simple concept for any collection. Right? Even String with String.Index
>>>> has some offset index properties within it. For example, the `count` of
>>>> the String is the offset index of endIndex. enumerated() generates a
>>>> sequence whose elements contain the same offsets that the offset index
>>>> system would provide. And when we apply Array(string), the string is
>>>> divided into characters and the offset indices become available on the
>>>> new array.
>>>>
>>>> The offset index system is just an assistant for collections, not a
>>>> replacement for String.Index. We use String.Index to represent the normal
>>>> underlying position in the String. We could also use offset indices to
>>>> represent the nature of the Collection with its elements. Providing the
>>>> offset index as a second way to access elements in collections is not
>>>> only for the String struct; it is for all collections, since it is the
>>>> nature of the collection concept, and developers can choose whether to
>>>> use it.
>>>>
>>>> We don't make String.Index O(1), but translate the offset indices to
>>>> the underlying String.Index.
>>>> Each time a subscript is used with an offset index, we just need to
>>>> translate between offset indices and underlying indices using
>>>> c.index(c.startIndex, offsetBy:i) and c.distance(from: c.startIndex, to:i).
>>>>
>>>> We can make the offset indices available through an extension to
>>>> Collection (as in my GitHub repo demo:
>>>> https://github.com/frogcjn/OffsetIndexableCollection-String-Int-Indexable-).
>>>>
>>>> Or we could do it at compile time.
>>>> For example:
>>>>
>>>> c[1...]
>>>> compiles to
>>>> c[c.index(c.startIndex, offsetBy:1)...]
>>>>
>>>> let index: Int = s.index(of: "a")
>>>> compiles to
>>>> let index: Int = s.distance(from: s.startIndex, to: s.index(of:"a"))
>>>>
>>>> let index = 1 // if used in s only
>>>> s[index..<index+2]
>>>> compiles to
>>>> let index = s.index(s.startIndex, offsetBy: 1)
>>>> s[index..<s.index(index, offsetBy: 2)]
>>>>
>>>> let index = 1 // if used both in s1, s2
>>>> s1[index..<index+2]
>>>> s2[index..<index+2]
>>>> compiles to
>>>> let index = 1
>>>> let index1 = s1.index(s1.startIndex, offsetBy: index)
>>>> let index2 = s2.index(s2.startIndex, offsetBy: index)
>>>> s1[index1..<s1.index(index1, offsetBy: 2)]
>>>> s2[index2..<s2.index(index2, offsetBy: 2)]
>>>>
>>>> I really want the team to consider providing the offset index system as
>>>> an assistant to collections. It is a necessary, basic concept of
>>>> Collection.
>>>>
>>>> Thanks!
>>>> Jiannan
>>>>
>>>>> On Dec 15, 2017, at 2:13 AM, Jordan Rose <jordan_r...@apple.com> wrote:
>>>>>
>>>>> We really don't want to make subscripting a non-O(1) operation. That just
>>>>> provides false convenience and encourages people to do the wrong thing
>>>>> with Strings anyway.
>>>>>
>>>>> I'm always interested in why people want this kind of ability.
>>>>> Yes, it's nice for teaching programming to be able to split strings on
>>>>> character boundaries indexed by integers, but where does it come up in
>>>>> real life? The most common cases I see are trying to strip off the first
>>>>> or last character, or a known prefix or suffix, and I feel like we should
>>>>> have better answers for those than "use integer indexes" anyway.
>>>>>
>>>>> Jordan
>>>>>
>>>>>
>>>>>> On Dec 13, 2017, at 22:30, Cao, Jiannan via swift-dev
>>>>>> <swift-...@swift.org> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I would like to discuss the String.Index problem within Swift. I know
>>>>>> the current design of String.Index is based on the nature of the
>>>>>> underlying data structure of the string.
>>>>>>
>>>>>> But could we just make String.Index contain offset information? Or make
>>>>>> an offset index subscript available for accessing characters in a String?
>>>>>>
>>>>>> For example:
>>>>>> let a = "01234"
>>>>>> print(a[0]) // 0
>>>>>> print(a[0...4]) // 01234
>>>>>> print(a[...]) // 01234
>>>>>> print(a[..<2]) // 01
>>>>>> print(a[...2]) // 012
>>>>>> print(a[2...]) // 234
>>>>>> print(a[2...3]) // 23
>>>>>> print(a[2...2]) // 2
>>>>>> if let number = a.index(of: "1") {
>>>>>>     print(number) // 1
>>>>>>     print(a[number...]) // 1234
>>>>>> }
>>>>>>
>>>>>>
>>>>>> 0 is equivalent to the Collection.Index collection.index(startIndex, offsetBy: 0)
>>>>>> 1 is equivalent to the Collection.Index collection.index(startIndex, offsetBy: 1)
>>>>>> ...
>>>>>> We keep String.Index, but allow another kind of index, called
>>>>>> "offsetIndex", to access the String.Index and the characters in the
>>>>>> string.
>>>>>> Any Collection could use the offset index to access its elements,
>>>>>> regardless of its real index.
>>>>>>
>>>>>> I have made the Collection OffsetIndexable protocol available here, and
>>>>>> made it more accessible for StringProtocol, covering all APIs related
>>>>>> to the index.
>>>>>>
>>>>>> https://github.com/frogcjn/OffsetIndexableCollection-String-Int-Indexable-
>>>>>>
>>>>>> If someone wants to make the offset index/range available for any
>>>>>> collection, they just need to extend the collection:
>>>>>> extension String : OffsetIndexableCollection {
>>>>>> }
>>>>>>
>>>>>> extension Substring : OffsetIndexableCollection {
>>>>>> }
>>>>>>
>>>>>>
>>>>>> I hope the Swift core team will consider bringing the offset index to
>>>>>> String, or making it available to other collections, thus letting
>>>>>> developers decide whether their collection can use offset indices as an
>>>>>> assistant to the real index of the collection.
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Jiannan
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> swift-dev mailing list
>>>>>> swift-...@swift.org
>>>>>> https://lists.swift.org/mailman/listinfo/swift-dev
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution