Re: [swift-evolution] [swift-dev] Make offset index available for String

2018-01-02 Thread Karl Wagner via swift-evolution
And really, this is more an issue for swift-evolution, since what you’re 
talking about (self-incrementing indexes) would be a new language feature.

- Karl

> On 3. Jan 2018, at 01:19, Karl Wagner wrote:
> 
> Swift used to do this, but we switched it around so indexes couldn’t 
> self-increment.
> 
> One of the problems was that strings are value-types. So you would get an 
> index, then append stuff to the string, but when you tried to advance the 
> index again it would blow up. The index retained the backing, which means the 
> “append” caused a copy, and the index was suddenly pointing to a different 
> String backing.
> 
> Basically, self-incrementing indexes require that the Collection has 
> reference semantics. Otherwise there simply is no concept of an independent 
> “owning” Collection which your Index can hold a reference to.
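
The value-semantics problem described above can be reproduced with a small sketch. `RetainingIndex` is a hypothetical type made up for illustration (it is not how Swift's early indexes were implemented); the only assumption is that the index stores its own copy of the string.

```swift
// Hypothetical index that retains the string it came from (illustration only).
struct RetainingIndex {
    let owner: String          // a value copy, not a reference
    let position: String.Index
}

var s = "abc"
let i = RetainingIndex(owner: s, position: s.startIndex)

// Mutating `s` leaves the copy held by the index unchanged.
s.append("def")

// Advancing through the retained owner only sees the old contents...
let viaOwner = i.owner.index(i.position, offsetBy: 5, limitedBy: i.owner.endIndex)
// ...while advancing through the live string succeeds.
let viaString = s.index(s.startIndex, offsetBy: 5, limitedBy: s.endIndex)

print(i.owner)            // abc
print(s)                  // abcdef
print(viaOwner == nil)    // true
print(viaString != nil)   // true
```

The index and the string have silently diverged, which is why an index that "owns" its collection only makes sense when the collection has reference semantics.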
> 
> Anyway, that doesn’t mean you’re wrong. Collection-slicing syntax is still 
> way too ugly. We need to keep it safe, and communicative, but it should also 
> be obvious and not tiring.
> 
> Currently, you have to write:
> 
> <collection>[<collection>.index(<collection>.<index>, offsetBy: <distance>)]
> 
> And an example...
> 
> results[results.index(results.startIndex, offsetBy: 3)]
> 
> Which is safe, and communicative, and obvious, but also really, really 
> tiring. There are ways we can make it less tiring without sacrificing the 
> good parts:
> 
> 1) Add a version of index(_:offsetBy:) which takes a KeyPath<Self, Self.Index> 
> as its first argument. That’s a minor convenience you can add today in your 
> own projects. It removes one repetition of <collection> in many common cases.
> 
> extension Collection {
>   func index(_ i: KeyPath<Self, Index>, offsetBy n: IndexDistance) -> Index {
>     return index(self[keyPath: i], offsetBy: n)
>   }
>   func index(_ i: KeyPath<Self, Index>, offsetBy n: IndexDistance, limitedBy: Index) -> Index? {
>     return index(self[keyPath: i], offsetBy: n, limitedBy: limitedBy)
>   }
> }
> 
> results[results.index(\.startIndex, offsetBy: 3)]
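
For anyone who wants to try the KeyPath convenience in a playground, here is a self-contained sketch. It assumes a current Swift toolchain, where the distance type is plain `Int`; the parameter name `limit` and the sample values are made up for the example.

```swift
extension Collection {
    // Convenience overloads: resolve the key path on `self`, then offset as usual.
    func index(_ i: KeyPath<Self, Index>, offsetBy n: Int) -> Index {
        return index(self[keyPath: i], offsetBy: n)
    }
    func index(_ i: KeyPath<Self, Index>, offsetBy n: Int, limitedBy limit: Index) -> Index? {
        return index(self[keyPath: i], offsetBy: n, limitedBy: limit)
    }
}

let results = [10, 20, 30, 40, 50]
let word = "hello"

let fourth = results[results.index(\.startIndex, offsetBy: 3)]
let last = results[results.index(\.endIndex, offsetBy: -1)]
let second = word[word.index(\.startIndex, offsetBy: 1)]
// Offsetting past the limit yields nil instead of trapping.
let outOfRange = results.index(\.startIndex, offsetBy: 9, limitedBy: results.endIndex)

print(fourth)             // 40
print(last)               // 50
print(second)             // e
print(outOfRange == nil)  // true
```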
> 
> Seriously, man, KeyPaths are just the business. I love them.
> 
> 2) Bind <collection> to something like an anonymous closure argument within 
> the subscript. Or just allow “.” syntax, as for static members. That removes 
> another <collection>.
> 
> results[.index(\.startIndex, offsetBy: 3)]
> 
> or
> 
> results[$.index(\.startIndex, offsetBy: 3)]
> 
> If anybody’s interested, I was playing around with an “IndexExpression” type 
> for this kind of thing. The language lets you get pretty far, but it doesn’t 
> work and I can’t figure out why. It looks like a simple-enough generic 
> struct, but it fails with a cyclic metadata dependency.
> 
> https://gist.github.com/karwa/04cc43431951f24ae9334ba8a25e6a31 
> 
> 
> - Karl
> 
>> On 19. Dec 2017, at 08:38, Cao, Jiannan via swift-dev wrote:
>> 
>> I implemented the second approach: SuperIndex
>> 
>> https://github.com/frogcjn/SuperStringIndex/ 
>> 
>> 
>> SuperString is a special version of String. Its SuperIndex keeps a reference 
>> to the string, which lets the index calculate the offset itself.
>> 
>> 
>> struct SuperIndex : Comparable, Strideable, CustomStringConvertible {
>> 
>>     var owner: Substring
>>     var wrapped: String.Index
>> 
>>     ...
>> 
>>     // Offset
>>     var offset: Int {
>>         return owner.distance(from: owner.startIndex, to: wrapped)
>>     }
>> 
>>     // Strideable
>>     func advanced(by n: SuperIndex.Stride) -> SuperIndex {
>>         return SuperIndex(owner.index(wrapped, offsetBy: n), owner)
>>     }
>> 
>>     static func +(lhs: SuperIndex, rhs: SuperIndex.Stride) -> SuperIndex {
>>         return lhs.advanced(by: rhs)
>>     }
>> }
>> 
>> let a: SuperString = "01234"
>> let o = a.startIndex
>> let o1 = o + 4
>> print(a[o]) // 0
>> print(a[...]) // 01234
>> print(a[..<(o+2)]) // 01
>> print(a[...(o+2)]) // 012
>> print(a[(o+2)...]) // 234
>> print(a[o+2..<…
>> print(a[o1-2...o1-1]) // 23
>> 
>> if let number = a.index(of: "1") {
>> print(number) // 1
>> print(a[number...]) // 1234
>> }
>> 
>> if let number = a.index(where: { $0 > "1" }) {
>> print(number) // 2
>> }
>> 
>> let b = a[(o+1)...]
>> let z = b.startIndex
>> let z1 = z + 4
>> print(b[z]) // 1
>> print(b[...]) // 1234
>> print(b[..<(z+2)]) // 12
>> print(b[...(z+2)]) // 123
>> print(b[(z+2)...]) // 34
>> print(b[z+2...z+3]) // 34
>> print(b[z1-2...z1-2]) // 3
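
The owner-carrying index idea can be tried without the linked repository. The sketch below is a stand-in, not the SuperStringIndex code: `OwningIndex`, its initializer and the sample text are hypothetical names chosen for the example.

```swift
// Hypothetical stand-in for an index that carries its owning collection.
struct OwningIndex: Comparable {
    let owner: Substring          // the collection this index belongs to
    let wrapped: String.Index     // the underlying String.Index

    // Element-level offset from the start of the owner (a linear walk).
    var offset: Int {
        return owner.distance(from: owner.startIndex, to: wrapped)
    }

    static func == (lhs: OwningIndex, rhs: OwningIndex) -> Bool {
        return lhs.wrapped == rhs.wrapped
    }
    static func < (lhs: OwningIndex, rhs: OwningIndex) -> Bool {
        return lhs.wrapped < rhs.wrapped
    }

    // The index can advance itself because it knows its owner.
    static func + (lhs: OwningIndex, rhs: Int) -> OwningIndex {
        return OwningIndex(owner: lhs.owner,
                           wrapped: lhs.owner.index(lhs.wrapped, offsetBy: rhs))
    }
}

let text: Substring = "01234"
let start = OwningIndex(owner: text, wrapped: text.startIndex)
let third = start + 2

print(text[third.wrapped])  // 2
print(third.offset)         // 2
print(start < third)        // true
```

Note that every index carries a full copy of the owner value, so the index is only valid for the contents it was created from.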
>> 
>> 
>>> On Dec 18, 2017, at 4:53 PM, Cao, Jiannan wrote:
>>> 
>>> Or we can copy the design of std::vector::iterator in C++. The index could 
>>> keep a reference to the collection.
>>> When the index is offset with the + operator, it could ask its owner to 
>>> perform the offset, since it keeps a reference to the owning collection.
>>> 
>>> let startIndex = s.startIndex
>>> s[startIndex+1]
>>> 
>>> public struct MyIndex<T: Collection> : Comparable where T.Index == MyIndex<T> {
>>> public let owner: T
>>> ...
>>> public static func + (lhs: MyIndex, 

Re: [swift-evolution] [swift-dev] Make offset index available for String

2017-12-14 Thread Cao, Jiannan via swift-evolution
Sorry that my "offset indexable" was unclear.
What I mean by "offset indexable" is collection-element-level offset 
indexing. Any collection could offer this kind of indexing, because it is a 
basic concept of collections.
The Unicode offset is a different thing, and it is important to String. So I 
just want the team to consider giving collections element-level offset 
indexing as a complement to String.Index (which is Unicode-level offset 
indexing).
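
A small sketch of the difference between the two offsets. It assumes Swift 5, where `encodedOffset` is deprecated in favor of `utf16Offset(in:)`; the sample string is made up, with a flag emoji written as escapes so that it occupies four UTF-16 code units.

```swift
// A flag (two scalars, four UTF-16 units) followed by "x".
let s = "\u{1F1E8}\u{1F1E6}x"

// Element-level offset 1: one Character past the start.
let i = s.index(s.startIndex, offsetBy: 1)

print(s.count)               // 2
print(s[i])                  // x
print(i.utf16Offset(in: s))  // 4
print(s.utf16.count)         // 5
```

The element at offset 1 sits at encoded offset 4, which is the mismatch between the two notions of "offset".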

> On Dec 15, 2017, at 9:34 AM, Michael Ilseman wrote:
> 
> Yes, I was trying to highlight that they are different and should be treated 
> differently. This was because it seemed you were conflating the two in your 
> argument. You claim that people expect it, and I’m pointing out that what 
> people actually expect (assuming they’re coming from C or languages with a 
> similar model) already exists as those models deal in encoded offsets.
> 
> More important than expectations surrounding what to provide to a subscript 
> are expectations surrounding algorithmic complexity. This has security 
> implications. The expectation of subscript is that it is “constant-ish”, for 
> a fuzzy hand-wavy definition of “constant-ish” which includes amortized 
> constant or logarithmic.
> 
> Now, I agree with the overall sentiment that `index(offsetBy:)` is unwieldy. 
> I am interested in approaches to improve this. But, we cannot throw linear 
> complexity into subscript without extreme justification.
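
To make the complexity concern concrete, here is a sketch of what an offset subscript on String would have to do. The `subscript(offset:)` label is hypothetical and not a standard library API.

```swift
extension String {
    // Hypothetical offset subscript (not in the standard library).
    // Each call walks from startIndex, so it is O(n), not constant time.
    subscript(offset n: Int) -> Character {
        return self[index(startIndex, offsetBy: n)]
    }
}

let text = "hello"
var reversed = ""
// Innocent-looking loop: n subscript calls, each O(n), so O(n^2) overall.
for k in stride(from: text.count - 1, through: 0, by: -1) {
    reversed.append(text[offset: k])
}
print(reversed) // olleh
```

With an explicit index(_:offsetBy:) call the linear walk is at least visible at the call site; behind a bare subscript it is not.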
> 
> 
>> On Dec 14, 2017, at 5:25 PM, Cao, Jiannan wrote:
>> 
>> This offset is the Unicode (encoded) offset, not the offset of an element. 
>> For example, index(startIndex, offsetBy: 1) can have encodedOffset 4 or 8, not 1.
>> 
>> Offset indexing is based on counting elements/indices; it gives the same 
>> result as s.index(s.startIndex, offsetBy: i).
>> The encodedOffset is the underlying offset into the Unicode string, which is 
>> not the same concept as the offset index of a collection.
>> 
>> Offset indexing is meaningful for the elements and indices of a collection 
>> (the i-th element of the collection) and is unrelated to the Unicode offset 
>> (which is the underlying data offset, meaningful for the UTF-16 string).
>> 
>> These two offsets are totally different.
>> 
>> Best,
>> Jiannan
>> 
>>> On Dec 15, 2017, at 9:17 AM, Michael Ilseman wrote:
>>> 
>>> 
>>> 
>>>> On Dec 14, 2017, at 4:49 PM, Cao, Jiannan via swift-dev wrote:
>>>> 
>>>> People are used to the offset index system rather than String.Index. Using 
>>>> offset indices to name and count the elements is normal and natural.
>>>> 
>>> 
>>> The offset system that you’re referring to is totally available in String 
>>> today, if you’re willing for it to be the offset into the encoding. That’s 
>>> the offset that the “people” you’re referring to are likely used to and 
>>> consider normal and natural. On String.Index [1], there is the following:
>>> 
>>> init(encodedOffset offset: Int)
>>> 
>>> and 
>>> 
>>> var encodedOffset: Int { get }
>>> 
>>> 
>>> [1] https://developer.apple.com/documentation/swift/string.index 
>>> 
>>> 
>>> 
>>>> The offset index system has a long history and a real meaning for 
>>>> collections. The subscript s[i] has the fixed meaning of "getting the i-th 
>>>> element of this collection", which is normal and direct. Getting a range 
>>>> with offset indices is also direct: it means the substring runs from the 
>>>> i-th character up to the j-th character of the original string.
>>>> 
>>>> People are used to using subscripts and ranges with offset indices. 
>>>> string[string.index(i, offsetBy: 5)] is not as direct or as easy as 
>>>> string[i + 5]. Likewise, Range<String.Index> is not as direct as 
>>>> Range<Int>. Developers need to convert the Range<String.Index> result 
>>>> of string.range(of:) to a Range<Int> to know the exact range of the 
>>>> substring. Range<String.Index> has a real meaning to the machine, as the 
>>>> underlying data location of the substring, but Range<Int> also 
>>>> carries direct location information for human beings, and represents the 
>>>> abstract concept of location within the collection (this is the most 
>>>> UNIMPEACHABLE REASON I could provide).
>>>> 
>>>> The offset index system is based on the nature of collections. Each element 
>>>> of a collection can be located by its offset, which is a direct and simple 
>>>> concept for any collection. Right? Even String, with String.Index, has 
>>>> some offset-index properties. For example, the `count` of a 
>>>> String is the offset index of its endIndex. enumerated() generates a 
>>>> sequence whose elements carry the same offsets that an offset index system 
>>>> would provide. And when we apply Array(string), the string is divided into 
>>>> characters, which makes the offset indices available f