> On 13 Mar 2017, at 20:18, Sean Heber <s...@fifthace.com> wrote:
> 
> Is there any reason the API couldn’t be expressed as something along these 
> lines?
> 
> func hash() -> [Hashable] {
>  return [x, y]
> }

From a pure performance standpoint, it forces the creation of an array. But if 
you really want something like that, you can define:

extension Hasher {
    mutating func write(_ visitables: [HashVisitable]) {
        for visitable in visitables {
            visitable.hash(&self)
        }
    }
}

So you can do:

func hash<H: Hasher>(_ hasher: inout H) {
    hasher.write([x, y])
}

> l8r
> Sean
> 
> 
>> On Mar 13, 2017, at 2:15 PM, David Hart <da...@hartbit.com> wrote:
>> 
>>> 
>>> On 13 Mar 2017, at 18:54, Sean Heber via swift-evolution 
>>> <swift-evolution@swift.org> wrote:
>>> 
>>> I’m dumb when it comes to proper hashing, but it’s such a tediously common 
>>> thing in my experience to need to add Hashable to some kind of a struct so 
>>> I can stash it in a set or use it as a dictionary key. Is there really no 
>>> way to make this all more automatic? I have to be honest - this proposal 
>>> seems *harder* to understand than the way it works now.
>> 
>> It's not really harder: just call hash on each of your type's significant 
>> values:
>> 
>> x.hash(&hasher)
>> y.hash(&hasher)
>> 
>> How would you implement hashValue in a simpler way, remembering that 'x ^ y' 
>> is an incorrect implementation?
>> 
>>> Of course the easiest would be if the language could just do this “good 
>>> enough" for me using reflection or whatever and if I really did run into a 
>>> problem where I wanted to do this myself, I could override something.
>>> 
>>> Perfect is the enemy of good.
>>> 
>>> l8r
>>> Sean
>>> 
>>> 
>>>> On Mar 13, 2017, at 10:38 AM, Vincent Esche via swift-evolution 
>>>> <swift-evolution@swift.org> wrote:
>>>> 
>>>> Hi there,
>>>> 
>>>> I've written up a proposal for an overhaul/replacement of the Hashable 
>>>> protocol and would love to hear your feedback on it!
>>>> 
>>>> Rendered | Blog Post
>>>> 
>>>> Cheers,
>>>> Vincent
>>>> 
>>>> Ps: I'd like to thank David Hart (@hartbit) for his great editorial 
>>>> feedback on this proposal. 👍
>>>> 
>>>> HashVisitable
>>>> 
>>>>  • Proposal: SE-NNNN
>>>>  • Authors: Vincent Esche
>>>>  • Review Manager: TBD
>>>>  • Status: Awaiting review
>>>> Introduction
>>>> 
>>>> Replace the Hashable protocol by two new procotols (Hasher and 
>>>> HashVisitable) to improve safety, versatility and learnability.
>>>> 
>>>> Motivation
>>>> 
>>>> Implementing Hashable is difficult and the consequences if not done well 
>>>> have dire performance and safety repercussions.
>>>> 
>>>> The documentation of Hashable lists a sample implementation of var 
>>>> hashValue:
>>>> 
>>>> /// A point in an x-y coordinate system.
>>>> struct GridPoint {
>>>>  var x: Int
>>>>  var y: Int
>>>> }
>>>> 
>>>> extension GridPoint: Hashable {
>>>>  var hashValue: Int {
>>>>      return x.hashValue ^ y.hashValue
>>>>  }
>>>> 
>>>>  static func == (lhs: GridPoint, rhs: GridPoint) -> Bool {
>>>>      return lhs.x == rhs.x && lhs.y == rhs.y
>>>>  }
>>>> }
>>>> 
>>>> Calculating the hashes of all GridPoints (given the above implementation) 
>>>> on a 1000 × 1000 grid …
>>>> 
>>>> let (width, height) = (1000, 1000)
>>>> let total = width * height
>>>> var hashes = Set<Int>()
>>>> for x in 0..<width {
>>>>  for y in 0..<height {
>>>>      hashes.insert(GridPoint(x: x, y: y).hashValue)
>>>>  }
>>>> }
>>>> print("\(hashes.count) unique hashes out of a total of \(total).")
>>>> 
>>>> … results in just 1024 unique hash values for 1_000_000 unique values.
>>>> 
>>>> In other words: The recommended implementation causes 99.9% of values to 
>>>> trigger a hash collision.
>>>> 
>>>> Out of those 1_000_000 values the median collision count was 976 with min 
>>>> and max being 976 and 1000respectively.
>>>> 
>>>> The collision rate will have negative impact in algorithms which heavily 
>>>> use hashValue like the ones in Dictionaryand Set. Furthermore, it 
>>>> increases the vulnerability to DDOS attacks when exposed to the web.
>>>> 
>>>> If even the official Swift documentation gets the implementation of 
>>>> hashValue wrong, then who is to expect the average Swift programmer to do 
>>>> any better?
>>>> 
>>>> In contrast, running the same snippet using HashVisitable and the 
>>>> semi-secure Fnv1aHash (see below) results in zero collisions!
>>>> 
>>>> Finally, the design of the Hashable protocol forces the use of one 
>>>> implementation without the possibility of switching between multiple 
>>>> hashing algorithms.
>>>> 
>>>> Proposed solution
>>>> 
>>>> Instead of coupling the hashing algorithm with each and every Swift type, 
>>>> we should provide a hashing API based on the visitor-pattern. By freeing 
>>>> application developers from the burden of having to implement hashing 
>>>> algorithms, the Standard Library can provide default ones which fulfill 
>>>> collision, performance and security goals. Furthermore, it would allow 
>>>> developers to swap to different algorithms based on the use case.
>>>> 
>>>> Detailed design
>>>> 
>>>> The proposal deprecates the Hashable protocol and introduces the following 
>>>> two:
>>>> 
>>>> protocol Hasher
>>>> {
>>>> 
>>>> mutating func finish() -> Int
>>>> 
>>>> 
>>>> mutating func write(bytes
>>>> : UnsafeRawBufferPointer)
>>>> }
>>>> 
>>>> 
>>>> protocol HashVisitable
>>>> {
>>>> 
>>>> func hash<H: Hasher>(_ hasher: inout
>>>> H)
>>>> }
>>>> 
>>>> Hasher is the protocol which represents a hashing algorithm, and 
>>>> HashVisitable replaces Hashable. For types entirely represented by their 
>>>> memory layout, the following protocol would provide a default 
>>>> implementation:
>>>> 
>>>> protocol ContiguouslyHashable: HashVisitable {}
>>>> 
>>>> extension ContiguouslyHashable {
>>>>  func hash<H: Hasher>(_ hasher: inout H) {
>>>>      var mutableSelf = self
>>>>      try! Swift.withUnsafeBytes(of: &mutableSelf) {
>>>>          hasher.write(bytes: $0)
>>>>      }
>>>>  }
>>>> }
>>>> 
>>>> extension Bool : ContiguouslyHashable {}
>>>> extension UInt8 : ContiguouslyHashable {}
>>>> extension UInt16 : ContiguouslyHashable {}
>>>> extension UInt32 : ContiguouslyHashable {}
>>>> extension UInt64 : ContiguouslyHashable {}
>>>> extension UInt : ContiguouslyHashable {}
>>>> extension Int8 : ContiguouslyHashable {}
>>>> extension Int16 : ContiguouslyHashable {}
>>>> extension Int32 : ContiguouslyHashable {}
>>>> extension Int64 : ContiguouslyHashable {}
>>>> extension Int : ContiguouslyHashable {}
>>>> 
>>>> The Standard-Library would then provide a set of hashing implementations 
>>>> specific to each purpose. A possible choice for hashing algorithms would 
>>>> be the reasonably fast SipHash-2-4, and the reasonably secure SipHash-4-8.
>>>> 
>>>> FNV-1A is another popular semi-secure but blazingly fast hash algorithm, 
>>>> which – for the sake of demonstration – could be implemented as follows:
>>>> 
>>>> struct Fnv1aHash
>>>> {
>>>> 
>>>> fileprivate var state: UInt
>>>> 
>>>> 
>>>> init(seed: UInt
>>>> ) {
>>>> 
>>>> self.state = seed &+ 14695981039346656037
>>>> 
>>>>  }
>>>> }
>>>> 
>>>> 
>>>> extension Fnv1aHash: Hasher 
>>>> {
>>>> 
>>>> mutating func write(bytes
>>>> : UnsafeRawBufferPointer) {
>>>> 
>>>> for byte in
>>>> bytes {
>>>> 
>>>> self.state = (self.state ^ UInt(byte)) &* 1099511628211
>>>> 
>>>>      }
>>>>  }
>>>> 
>>>> mutating func finish() -> Int
>>>> {
>>>> 
>>>> return unsafeBitCast(self.state, to: Int.self
>>>> )
>>>>  }
>>>> }
>>>> 
>>>> Coming back to the sample code present in the Hashable documentation, the 
>>>> new implementation would look like:
>>>> 
>>>> extension GridPoint: HashVisitable 
>>>> {
>>>> 
>>>> func hash<H: Hasher>(_ hasher: inout
>>>> H) {
>>>> 
>>>> self.x.hash(&
>>>> hasher)
>>>> 
>>>> self.y.hash(&
>>>> hasher)
>>>>  }
>>>> }
>>>> 
>>>> Source compatibility
>>>> 
>>>> Making use of "extending protocols to conform to protocols":
>>>> 
>>>> extension Hashable: HashVisitable 
>>>> {
>>>> 
>>>> func hash<H: Hasher>(_ hasher: inout
>>>> H) {
>>>> 
>>>> self.hashValue.hash(&
>>>> hasher)
>>>>  }
>>>> }
>>>> 
>>>> Effect on ABI stability
>>>> 
>>>> n/a
>>>> 
>>>> Effect on API resilience
>>>> 
>>>> This feature should be possible to add/remove without breaking ABI.
>>>> 
>>>> Alternatives considered
>>>> 
>>>> n/a
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> swift-evolution@swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>> 
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution@swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
> 

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Reply via email to