On Sun, May 11, 2014 at 2:06 PM, Mikera <mike.r.anderson...@gmail.com>wrote:

> OK..... this thread is a bit worrying. If I understand correctly, it means
> that we've now got inconsistent hash and equals functions. I suspect this
> hasn't bitten many people yet simply because it is unusual to mix Java and
> Clojure collections as keys in the same structure, but it seems like a
> really nasty issue.
>

Reports of real problems in real codebases would change my own opinion of
the severity of this issue.


> Hash and equality functions generally have a very simple rule: if two
> things are equal they must have the same hashcode. But now we've got:
>
> (= (java.util.ArrayList. [1 2]) [1 2])
> => true
>
> (hash (java.util.ArrayList. [1 2]))
> => 994
>
> (hash [1 2])
> => 156247261
>
> I consider this a major defect. If we want to define our own equals and
> hashcode that's fine, but then these functions absolutely should be
> consistent with each other.
>
> If it is really true that non-Clojure collections aren't considered as
> values and aren't expected to participate in hash/= comparisons then I'd
> expect an exception, not an erroneous value. Silent potential corruption of
> data is *much* worse than a noisy failure. But I think even that is a bad
> direction to go: easy Java ecosystem interop is definitely one of the main
> reasons why I'm using Clojure, and I'd like to see this maintained.
>

Please be more specific about where an exception could be thrown (equals vs
hash vs collection usage).

I can think of only two sensible ways to fix this inconsistency:
> a) Roll back the hash changes. Correctness is much more important than a
> performance edge case.
>

We're not doing that.


> (I suspect the hash changes have hurt average case performance anyway...)
>

We have made hash calculation slightly slower to generate better hashes
across the board. Timing info is at
http://dev.clojure.org/display/design/Better+hashing but in general, the
time increases per hash are on the order of nanoseconds for simple values
and sub-millisecond for more complicated strings and collections (where
hashes are cached and times are likely to be dwarfed by other things). The
better hashes avoid catastrophically bad (and relatively easy to find)
cases.

It is quite challenging to predict the performance impact of the hashing
changes in Strings, keywords, and symbols due to the effects of usage
patterns, interning, caching, etc. I am aware of several people that have
reported better performance overall in 1.6 and I don't think I've heard any
reports of significantly slower performance in 1.6. I'm calling FUD on this
unless you have data to report.


> b) Make (= (java.util.ArrayList. [1 2]) [1 2]) and similar comparisons
> evaluate to false (on the grounds that ArrayLists aren't Clojure
> collections, so should not be considered equal).
>

Equality comparison between Clojure and Java collections is something that
seems relatively common. For example, calling a Java API that returns a
List and comparing the returned value with a vector. I don't want to lose
that ability.

c) Fail on hasheq of Java collections

Adding this check would decrease the performance of hasheq for many other
types. Given what I know right now, I'm not inclined to make that tradeoff.

d) Fail when conj'ing Java collections into Clojure map keys or sets.

This would scope the check more but still add a performance hit for this
case on all other types. Note that Clojure maps or sets of only Java
collections currently work so this would take away functionality that might
be useful even if in a gray area.





>
> On Saturday, 10 May 2014 15:51:24 UTC+1, John Hume wrote:
>>
>> Thanks for the ticket pointer. I didn't find it in my initial search
>> because it "Affects Version/s: None" :-/, and I knew this behavior was new
>> to 1.6.
>>
>> For those not up for the medium-length comment-thread read:
>> non-Clojure-aware Java Collections Framework interface instances are not
>> considered values,* and 'Rich says "all bets should be off for
>> hasheq/equiv of non-values."' That's a bit down in the details, but I
>> think the upshot is that they're not currently supported as set members or
>> hash keys. Unfortunately they used to be supported (unintentionally, I
>> guess) due to Clojure following Java's lead, so I have some interop-heavy
>> code to review and test very carefully. Though the above seems to frame
>> non-support as based on principle, it seems the driver is keeping the new
>> hasheq stuff highly performant. I would hope a performant implementation
>> that supported the Collections interfaces would be accepted for some future
>> version.
>>
>> *If there is one, I'd like to read a clear definition of what "values"
>> means in Clojure. The headings on http://clojure.org/data_structures are
>> almost an enumeration of them, except the "Collections" section talks a
>> little about Java Collections classes, hashCode, and hasheq without making
>> it clear that non-Clojure collections are not values. Unfortunately the
>> current definition seems to admit only the Java primitives and
>> near-primitives (String and primitive wrappers) and clojure-aware types.
>>
>> What "not supported" looks like via maps:
>> user> *clojure-version*
>> {:major 1, :minor 6, :incremental 0, :qualifier nil}
>> user> (hash-map [1 2] "v" (java.util.ArrayList. [1 2]) "a")
>> {[1 2] "a", [1 2] "v"}
>> user> {[1 2] "v" (java.util.ArrayList. [1 2]) "a"}
>> IllegalArgumentException Duplicate key: [1 2]  clojure.lang.
>> PersistentArrayMap.createWithCheck (PersistentArrayMap.java:70)
>> user> (-> {} (assoc [1 2] "v") (assoc (java.util.ArrayList. [1 2]) "a"))
>> {[1 2] "a"}
>>
>>
>> On Friday, May 9, 2014 6:34:26 PM UTC-5, Andy Fingerhut wrote:
>>>
>>> Mike is correct.  This change in behavior is due to the hashing changes
>>> in Clojure 1.6.  See the comments on ticket http://dev.clojure.org/jira/
>>> browse/CLJ-1372<http://www.google.com/url?q=http%3A%2F%2Fdev.clojure.org%2Fjira%2Fbrowse%2FCLJ-1372&sa=D&sntz=1&usg=AFQjCNG6I-rrRfAMsUZUomYEbGOJkJqWGA>for
>>>  some discussion of whether this is considered a bug.  It appears that
>>> perhaps the hash consistency is not promised for mutable objects, only
>>> immutable ones.
>>>
>>> Below are a couple more REPL transcripts that illustrate the change in
>>> behavior:
>>>
>>> user=> *clojure-version*
>>> {:major 1, :minor 5, :incremental 1, :qualifier nil}
>>> user=> (= [1 2] (java.util.ArrayList. [1 2]))
>>> true
>>> user=> (map hash [ [1 2] (java.util.ArrayList. [1 2]) ])
>>> (994 994)
>>> user=> (hash-map [1 2] "vec" (java.util.ArrayList. [1 2]) "alist")
>>> {[1 2] "alist"}
>>>
>>>
>>>
>>> user=> *clojure-version*
>>> {:major 1, :minor 6, :incremental 0, :qualifier nil}
>>> user=> (= [1 2] (java.util.ArrayList. [1 2]))
>>> true
>>> user=> (map hash [ [1 2] (java.util.ArrayList. [1 2]) ])
>>> (156247261 994)
>>> user=> (hash-map [1 2] "vec" (java.util.ArrayList. [1 2]) "alist")
>>> {[1 2] "alist", [1 2] "vec"}
>>>
>>>  --
> You received this message because you are subscribed to the Google Groups
> "Clojure Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure-dev+unsubscr...@googlegroups.com.
> To post to this group, send email to clojure-...@googlegroups.com.
> Visit this group at http://groups.google.com/group/clojure-dev.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to