Re: [swift-evolution] [Review] SE-0065 A New Model for Collections and Indices

Brent Royal-Gordon via swift-evolution Tue, 12 Apr 2016 04:28:52 -0700

>> (On the other hand, it might be that I'm conceiving of the purpose of 
>> `limitedBy` differently from you—I think of it as a safety measure, but you 
>> may be thinking of it specifically as an automatic truncation mechanism.)
> 
> Hi Brent,
> 
> Could you explain what kind of safety do you have in mind?  Swift will
> guarantee memory safety even if you attempt to advance an index past
> endIndex using the non-limiting overload.


By "safety" here, I mean what I will call "index safety": not accidentally 
using an index which would violate the preconditions of the methods or 
properties you are planning to use it with. I think it's too easy to 
accidentally overrun the permitted range of indices, and the API should help 
you avoid doing that.

For instance, suppose I'm porting XCTest to Swift, and I decide to rewrite its 
`demangleSimpleClass` function, which extracts the identifiers from a mangled 
Swift symbol name. Specifically, I'm implementing `scanIdentifier`, which reads 
one particular identifier out of the middle of a string. (For those unfamiliar: 
an identifier in a mangled symbol name consists of one or more digits to 
represent a length, followed by that many characters.) I will assume that the 
mangled symbol name is in a Swift.String.

Here's a direct port:

        func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String) {
                let chars = partialMangled.characters
                var lengthRange = chars.startIndex ..< chars.startIndex

                // Extend the range over the leading run of digits.
                while chars[lengthRange.endIndex].isDigit {
                        lengthRange.endIndex = chars.successor(of: lengthRange.endIndex)
                }

                let lengthString = String(chars[lengthRange])
                let length = Int(lengthString)!

                let identifierRange = lengthRange.endIndex ..< chars.index(length, stepsFrom: lengthRange.endIndex)
                let remainderRange = chars.suffix(from: identifierRange.endIndex)

                return (String(chars[identifierRange]), String(remainderRange))
        }

This works (note: probably, I haven't actually tested it), but it fails a 
precondition if the mangled symbol is invalid. Suppose we want to detect this 
condition so that our parent function can throw a nice error instead:

        func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String)? {
                let chars = partialMangled.characters
                var lengthRange = chars.startIndex ..< chars.startIndex

                // Extend the range over the leading run of digits.
                while chars[lengthRange.endIndex].isDigit {
                        lengthRange.endIndex = chars.successor(of: lengthRange.endIndex)
                        if lengthRange.endIndex == chars.endIndex {
                                return nil
                        }
                }

                let lengthString = String(chars[lengthRange])
                guard let length = Int(lengthString) else {
                        return nil
                }

                let identifierRange = lengthRange.endIndex ..< chars.index(length, stepsFrom: lengthRange.endIndex)
                if identifierRange.endIndex > chars.endIndex {
                        return nil
                }

                let remainderRange = chars.suffix(from: identifierRange.endIndex)

                return (String(chars[identifierRange]), String(remainderRange))
        }

That's really not the greatest. To tell the truth, I've actually guessed what 
bounds-checking is needed here; I'm not 100% sure I caught all the cases. And, 
um, I'm not really sure that `index(length, stepsFrom: lengthRange.endIndex)` 
is guaranteed to return anything valid if `length` is too large. Even 
`limitedBy:` wouldn't help me here—I would end up silently accepting and 
truncating an invalid string instead of detecting the error.
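
To illustrate the truncation problem, here is a tiny sketch in present-day Swift. It assumes a `limitedBy:` that clamps to the limit, as described above; the `index(_:offsetBy:limitedBy:)` in today's standard library returns `nil` instead, so the sketch emulates clamping with `??`. The input string is made up for the example.

```swift
// "9foo" promises a 9-character identifier but supplies only 3.
let mangled = "9foo"
let afterLength = mangled.index(after: mangled.startIndex)

// Emulate a truncating `limitedBy:`: fall back to endIndex instead of failing.
let clampedEnd = mangled.index(afterLength, offsetBy: 9, limitedBy: mangled.endIndex)
        ?? mangled.endIndex

// The malformed input is silently accepted as the identifier "foo".
let identifier = String(mangled[afterLength ..< clampedEnd])
```

Nothing in the result tells the caller that the declared length was wrong.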

Now, imagine if `successor(of:)` and `index(_:stepsFrom:)` instead had variants 
which performed range checks on their results and returned `nil` if they failed:

        func scanIdentifier(partialMangled: String) -> (identifier: String, remainder: String)? {
                let chars = partialMangled.characters
                var lengthRange = chars.startIndex ..< chars.startIndex

                // Extend the range over the leading run of digits.
                while chars[lengthRange.endIndex].isDigit {
                        guard let nextIndex = chars.successor(of: lengthRange.endIndex, permittingEnd: false) else {
                                return nil
                        }
                        lengthRange.endIndex = nextIndex
                }

                let lengthString = String(chars[lengthRange])
                guard let length = Int(lengthString) else {
                        return nil
                }

                guard let identifierEndIndex = chars.index(length, stepsFrom: lengthRange.endIndex, permittingEnd: true) else {
                        return nil
                }

                let identifierRange = lengthRange.endIndex ..< identifierEndIndex
                let remainderRange = chars.suffix(from: identifierRange.endIndex)

                return (String(chars[identifierRange]), String(remainderRange))
        }

By using these variants of the index-manipulation operations, the Collection 
API itself tells me where I need to handle bounds-check violations. Just like 
the failable `Int(_: String)` initializer, if I forget to check bounds after 
manipulating an index, the code will not type-check. That's a nice victory for 
correct semantics.
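
For anyone who wants to experiment, here is a rough, self-contained sketch of the idea in present-day Swift. The `successor(of:permittingEnd:)` and `index(_:stepsFrom:permittingEnd:)` methods are hypothetical and are not part of the standard library; I've built them on the existing `index(after:)` and `index(_:offsetBy:limitedBy:)`. As simplifications, negative step counts are rejected and only ASCII digits count as digits.

```swift
extension Collection {
    /// Hypothetical: the index after `i`, or nil if the result is out of bounds.
    func successor(of i: Index, permittingEnd: Bool) -> Index? {
        guard i < endIndex else { return nil }  // cannot advance past endIndex
        let next = index(after: i)
        if next == endIndex && !permittingEnd { return nil }
        return next
    }

    /// Hypothetical: the index `n` steps after `i`, or nil if out of bounds.
    func index(_ n: Int, stepsFrom i: Index, permittingEnd: Bool) -> Index? {
        guard n >= 0 else { return nil }  // simplification: forward steps only
        guard let result = index(i, offsetBy: n, limitedBy: endIndex) else {
            return nil
        }
        if result == endIndex && !permittingEnd { return nil }
        return result
    }
}

func scanIdentifier(_ partialMangled: String) -> (identifier: String, remainder: String)? {
    guard !partialMangled.isEmpty else { return nil }
    let digits: ClosedRange<Character> = "0"..."9"

    // Walk over the leading digits; a string that ends mid-length is invalid.
    var lengthEnd = partialMangled.startIndex
    while digits.contains(partialMangled[lengthEnd]) {
        guard let next = partialMangled.successor(of: lengthEnd, permittingEnd: false) else {
            return nil
        }
        lengthEnd = next
    }

    // Fails if there were no digits or the number does not fit in an Int.
    guard let length = Int(partialMangled[partialMangled.startIndex ..< lengthEnd]) else {
        return nil
    }

    // The identifier may run right up to the end of the string.
    guard let identifierEnd = partialMangled.index(length, stepsFrom: lengthEnd, permittingEnd: true) else {
        return nil
    }

    return (String(partialMangled[lengthEnd ..< identifierEnd]),
            String(partialMangled[identifierEnd...]))
}
```

With this sketch, `scanIdentifier("3foo4barz")` yields the identifier "foo" and the remainder "4barz", while inputs such as "12" or "5foo" return `nil` rather than trapping.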

* * *

Incidentally, rather than having Valid<Index>, an alternative would be to have 
Unchecked<Index>. This would mark an index which had *not* been checked. You 
could use its `uncheckedIndex` property to access the index directly, or you 
could pass it to `Collection.check(_: Unchecked<Index>) -> Index?` to perform 
the check.

This would not serve to eliminate redundant checks; it would merely get the 
type system to help you catch index-checking mistakes. You could, of course, 
perform the check and then invalidate the index with a mutation, but that's 
just as true today. I believe that, with aggressive enough optimization, this 
could be costless at runtime. *And* it would offer a way to provide the 
so-called "safe indexing" many people ask for: you could offer a subscript 
which took an Unchecked<Index> and returned an Optional<Element>.
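
A minimal sketch of what that might look like in present-day Swift follows. The `Unchecked` type, `check(_:)`, and the optional-returning subscript are all hypothetical names taken from the description above, not existing API. The check here is a plain bounds comparison, which I'm assuming is sufficient; some collections may need more than that to validate an index.

```swift
/// Hypothetical wrapper marking an index that has not been bounds-checked.
struct Unchecked<Index: Comparable> {
    let uncheckedIndex: Index
    init(_ index: Index) { uncheckedIndex = index }
}

extension Collection {
    /// Returns the wrapped index if it is a valid subscript position, else nil.
    func check(_ candidate: Unchecked<Index>) -> Index? {
        let i = candidate.uncheckedIndex
        return (startIndex <= i && i < endIndex) ? i : nil
    }

    /// "Safe indexing": nil instead of a trap for an out-of-bounds index.
    subscript(candidate: Unchecked<Index>) -> Element? {
        guard let i = check(candidate) else { return nil }
        return self[i]
    }
}

let values = [10, 20, 30]
let inBounds = values[Unchecked(1)]     // Optional(20)
let outOfBounds = values[Unchecked(3)]  // nil
```

Reaching the ordinary, trapping subscript from an `Unchecked` value then requires either an explicit `check(_:)` or a deliberate use of `uncheckedIndex`.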

-- 
Brent Royal-Gordon
Architechies

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
