Re: [Chicken-users] Hash table equality pitfall
Hallo, Sending back to list... On Fri, Mar 2, 2012 at 10:43 AM, Pierpaolo Bernardi wrote: > On Fri, Mar 2, 2012 at 10:32, Alex Queiroz wrote: >> On Thu, Mar 1, 2012 at 7:01 PM, Peter Bex wrote: >>> >>> If they do it "correctly", how do they deal with differences in initial >>> bucket size, and what do they do with hash tables having identical >>> key/values but different hashing or different comparison procedures? >>> >> >> If I were to write a `equal?` procedure for hash-tables, my test for >> "correctness" would be: >> >> (equal? hash-table1 hash-table2) iff (equal? (hash-table->alist >> hash-table1) (hash-table->alist hash-table2)) >> >> provided that `hash-table->alist` does the obvious thing. > > I'd expect a procedure named hash-table->alist to return an alist in > no particular order. > > I guess you meant to insert a couple of sorts in there. ;^) > You are right, the "obvious" hash-table->alist is not so obvious then. :) -- -alex http://www.artisancoder.com/ ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
On Thu, Mar 1, 2012 at 7:01 PM, Peter Bex wrote: > > If they do it "correctly", how do they deal with differences in initial > bucket size, and what do they do with hash tables having identical > key/values but different hashing or different comparison procedures? > If I were to write a `equal?` procedure for hash-tables, my test for "correctness" would be: (equal? hash-table1 hash-table2) iff (equal? (hash-table->alist hash-table1) (hash-table->alist hash-table2)) provided that `hash-table->alist` does the obvious thing. -- -alex http://www.artisancoder.com/ ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
Peter Bex scripsit: > That's rather odd, considering there were quite a few people discussing > it, judging by the SRFI discussion archive (not as much as some other > SRFIs, but still a few). It's also a very useful thing to have. > I guess that's just a Chickeneer's way of looking at it :) Chicken didn't have a pre-existing hash table implementation to support. As I've said before, most applications don't need to be portable across multiple Schemes, since almost all are highly portable themselves. So if you are writing in MIT Scheme or Racket, you can use the hash tables provided by MIT Scheme or Racket without a problem. It's library authors who benefit most from standardization. > Why is it not drop-in? Maybe I'm missing something, but from a quick > glance the implementation doesn't really look too different from > Chicken's implementation, conceptually. For one thing, the identity hash function is the same as the main hash function, which only works on standard R5RS types. There may be other issues. -- John Cowanco...@ccil.orghttp://ccil.org/~cowan There was an old manSaid with a laugh, "I From Peru, whose lim'ricks all Cut them in half, the pay is Look'd like haiku. He Much better for two." --Emmet O'Brien ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
Alex Shinn scripsit: > > What does this mean in practice? Do they do a "dumb" comparison like > > Chicken does (ie, producing different results depending on the insertion > > order) or do they check whether the hash tables have exactly the same > > keys, each with identical corresponding values? > > Chibi does a dumb comparison like Chicken does. I'd guess that Kawa delegates to Java's java.util.AbstractMap#equals method, which returns true if the keys and their values are equal; it does not care whether the maps are of the same types. -- Si hoc legere scis, nimium eruditionis habes. ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
On Fri, Mar 2, 2012 at 3:01 AM, Peter Bex wrote: > On Thu, Mar 01, 2012 at 12:03:25PM -0500, John Cowan wrote: >> In addition, the following Schemes support SRFI-69 with `equal?` >> descending into hash tables: Kawa, Chibi. > > What does this mean in practice? Do they do a "dumb" comparison like > Chicken does (ie, producing different results depending on the insertion > order) or do they check whether the hash tables have exactly the same > keys, each with identical corresponding values? Chibi does a dumb comparison like Chicken does. If you use the equal? from R7RS (scheme base) it also handles cycles. -- Alex ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
On Thu, Mar 01, 2012 at 12:45:57PM -1000, Derrell Piper wrote: > Um, your 5th line resets 'a to 2, which is why they're not comparing > equal?. Removing that yields the expected results on 4.7.0.5-st: Whoops, thanks for pointing that out, Derrel. I overlooked it (that's what you get while quickly writing up such things on the sly at work!). The fact is that it sizes the hash table only in terms of the next value in hash-table-prime-lengths, which means it'll be 307 items big. However, this does not invalidate my main point. If you insert either more than 307 items in different order or manage to hash two objects to the same value you get this problem: #;1> (use srfi-69) ; loading library srfi-69 ... #;2> (define x (make-hash-table)) #;3> (hash-table-set! x "a" 1) #;4> (hash-table-set! x "\x00a" 1) #;5> (define y (make-hash-table)) #;6> (hash-table-set! y "\x00a" 1) #;7> (hash-table-set! y "a" 1) #;8> (equal? x y) #f while in 4.5.0, #;1> (use srfi-69) ; loading library srfi-69 ... #;2> (define x (make-hash-table)) #;3> (hash-table-set! x "a" 1) #;4> (hash-table-set! x "\x00a" 1) #;5> (define y (make-hash-table)) #;6> (hash-table-set! y "a" 1) #;7> (hash-table-set! y "\x00a" 1) #;8> (equal? x y) #t Cheers, Peter -- http://sjamaan.ath.cx -- "The process of preparing programs for a digital computer is especially attractive, not only because it can be economically and scientifically rewarding, but also because it can be an aesthetic experience much like composing poetry or music." -- Donald Knuth ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
On Thu, Mar 01, 2012 at 12:03:25PM -0500, John Cowan wrote: > In addition, the following Schemes support SRFI-69 with `equal?` > descending into hash tables: Kawa, Chibi. What does this mean in practice? Do they do a "dumb" comparison like Chicken does (ie, producing different results depending on the insertion order) or do they check whether the hash tables have exactly the same keys, each with identical corresponding values? If they do it "correctly", how do they deal with differences in initial bucket size, and what do they do with hash tables having identical key/values but different hashing or different comparison procedures? > I think the main reason SRFI 69 doesn't have much support is its fairly > recent date (2005). It would probably be straightforward to layer it > over the native support in most Schemes, but people haven't bothered to > do so. That's rather odd, considering there were quite a few people discussing it, judging by the SRFI discussion archive (not as much as some other SRFIs, but still a few). It's also a very useful thing to have. I guess that's just a Chickeneer's way of looking at it :) > In addition, the reference implementation is more of a proof of > concept than a drop-in implementation like those provided with SRFI-1, > SRFI-2, SRFI-43, etc. Why is it not drop-in? Maybe I'm missing something, but from a quick glance the implementation doesn't really look too different from Chicken's implementation, conceptually. > > Luckily, with the new version the mistake of thinking that equal? > > is defined for hash-tables is a lot harder to make since the > > randomization factor will ensure they are almost always different. > > I think that's good. The behavior of `equal?` is primarily historical > anyway; people can and should write their own equivalence predicates to > solve their own problems. You could easily argue either way, but it's good that it doesn't seem to be correct when you first try it. Cheers, Peter -- http://sjamaan.ath.cx -- "The process of preparing programs for a digital computer is especially attractive, not only because it can be economically and scientifically rewarding, but also because it can be an aesthetic experience much like composing poetry or music." -- Donald Knuth ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Hash table equality pitfall
Peter Bex scripsit: > Of course, from the end-user perspective this is *very* tricky > behaviour, so I've added a note to the manual about this. I checked > the SRFI, but it doesn't mention anything about hash tables having to > compare equal to eachother so Chicken's behaviour is strictly not in > violation of any standard :) The R5RS definition of `equal?` says it is recursive on pairs, vectors, and strings, going on to say that `equal?` defers to `eqv?` "on other objects such as numbers and symbols". It is not clear whether this means all other objects (in which case `equal?` on hashtables must be the same as `eqv?`), or only some other objects, including numbers and symbols but allowing implementation-defined behavior on implementation-defined objects. Chicken evidently assumes the latter interpretation. In R6RS, `equal?` definitely defers to `eqv?` for all objects except pairs, vectors, strings, and bytevectors. R7RS currently uses the same language as R5RS, with the same ambiguity, but requires descent into bytevectors and records. > The only other implementation I was able to test with was Racket (why > is SRFI-69 so uncommon in Schemes?) and there hash tables also don't > compare with equal? I maintain a table of what Schemes implement which SRFIs at http://tinyurl.com/scheme-s5 (Spreadsheet of Scheme Systems Supporting SRFIs). There is a portable R6RS implementation of SRFI-69 on top of R6RS hashtables. These are naturally not `equal?` in accordance with the R6RS rule given above. In addition, the following Schemes support SRFI-69 with `equal?` descending into hash tables: Kawa, Chibi. The following Schemes also support it, with `equal?` the same as `eqv?`: Guile, STklos. I think the main reason SRFI 69 doesn't have much support is its fairly recent date (2005). It would probably be straightforward to layer it over the native support in most Schemes, but people haven't bothered to do so. In addition, the reference implementation is more of a proof of concept than a drop-in implementation like those provided with SRFI-1, SRFI-2, SRFI-43, etc. > Luckily, with the new version the mistake of thinking that equal? > is defined for hash-tables is a lot harder to make since the > randomization factor will ensure they are almost always different. I think that's good. The behavior of `equal?` is primarily historical anyway; people can and should write their own equivalence predicates to solve their own problems. -- John Cowanhttp://ccil.org/~cowan co...@ccil.org The Penguin shall hunt and devour all that is crufty, gnarly and bogacious; all code which wriggles like spaghetti, or is infested with blighting creatures, or is bound by grave and perilous Licences shall it capture. And in capturing shall it replicate, and in replicating shall it document, and in documentation shall it bring freedom, serenity and most cool froodiness to the earth and all who code therein. --Gospel of Tux ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
[Chicken-users] Hash table equality pitfall
Hi all! I ran into some very tricky behavior with hash tables that I'd like to document for posterity. Turns out that (equal? (make-hash-table) (make-hash-table)) no longer always returns #t in the current master, but it did in Chicken 4.7.0. This is because of obscure internal details: the hash table's internal hashing function is a closure which contains the randomization fixnum, which usually differs between tables. In Chicken 4.7.0 and earlier, equal? only *appears* to work correctly for hash tables. In fact, it works if the table is filled such that buckets contain at most one value, or if they were filled in the same order: #;1> (use srfi-69) ; loading library srfi-69 ... #;2> (define x (make-hash-table size: 2)) #;3> (define y (make-hash-table size: 2)) #;4> (equal? x y) #t #;5> (hash-table-set! x 'a 1) #;6> (hash-table-set! x 'b 2) #;7> (hash-table-set! x 'c 3) #;8> (hash-table-set! y 'a 1) #;9> (hash-table-set! y 'b 2) #;10> (hash-table-set! y 'c 3) #;11> (equal? x y) #t However, if we change the fill order, it no longer compares equal: #;1> (use srfi-69) ; loading library srfi-69 ... #;2> (define x (make-hash-table size: 2)) #;3> (define y (make-hash-table size: 2)) #;4> (hash-table-set! x 'a 1) #;5> (hash-table-set! x 'a 2) #;6> (hash-table-set! x 'b 2) #;7> (hash-table-set! x 'c 3) #;8> (hash-table-set! y 'c 3) #;9> (hash-table-set! y 'b 2) #;10> (hash-table-set! y 'a 1) #;11> (equal? x y) #f The reason for this is that equal? is a very generic procedure, which knows about core datatypes and compares record type values field by field. A hash table is just a record type, and it contains some internal slots with the configuration details as well as a vector which holds the hash table's buckets. The bucket contents themselves are alists, which are filled simply using alist-cons when you use hash-table-set!. That's why the ordering differs. Of course, from the end-user perspective this is *very* tricky behaviour, so I've added a note to the manual about this. I checked the SRFI, but it doesn't mention anything about hash tables having to compare equal to eachother so Chicken's behaviour is strictly not in violation of any standard :) The only other implementation I was able to test with was Racket (why is SRFI-69 so uncommon in Schemes?) and there hash tables also don't compare with equal? Luckily, with the new version the mistake of thinking that equal? is defined for hash-tables is a lot harder to make since the randomization factor will ensure they are almost always different. Cheers, Peter -- http://sjamaan.ath.cx -- "The process of preparing programs for a digital computer is especially attractive, not only because it can be economically and scientifically rewarding, but also because it can be an aesthetic experience much like composing poetry or music." -- Donald Knuth ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users