Re: hash-map based on identity

2009-03-09 Thread Chouser

On Sun, Mar 8, 2009 at 9:28 PM, Rich Hickey richhic...@gmail.com wrote:

 On Mar 8, 6:17 am, David Powell djpow...@djpowell.net wrote:
  Identity is tested first in equality, if identical, equal, full stop.

 That's what I'd assumed (it's what the JDK collections do), but
 looking at the code, to say, APersistentVectory, I can't see where the
 identity test is done?  Am I looking in the wrong place?

 Clojure code routes through clojure.lang.Util.equiv()/equals() which
 do the check, but each collection should as well, for use outside of
 Clojure.

 Issue/patch welcome.

I've created an issue, just so we don't lose track of this, but I'm
not writing a patch at the moment.  This is probably some nice
low-hanging fruit for someone who has sent in their CA and is
interested in starting to get patches into Clojure itself:

http://code.google.com/p/clojure/issues/detail?id=92

--Chouser

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re[2]: hash-map based on identity

2009-03-08 Thread David Powell


 Identity is tested first in equality, if identical, equal, full stop.

That's what I'd assumed (it's what the JDK collections do), but
looking at the code, to say, APersistentVectory, I can't see where the
identity test is done?  Am I looking in the wrong place?

-- 
Dave


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-08 Thread Rich Hickey



On Mar 8, 6:17 am, David Powell djpow...@djpowell.net wrote:
  Identity is tested first in equality, if identical, equal, full stop.

 That's what I'd assumed (it's what the JDK collections do), but
 looking at the code, to say, APersistentVectory, I can't see where the
 identity test is done?  Am I looking in the wrong place?


Clojure code routes through clojure.lang.Util.equiv()/equals() which
do the check, but each collection should as well, for use outside of
Clojure.

Issue/patch welcome.

Rich

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Jeffrey Straszheim
java.util.IdentityHashMap

On Sat, Mar 7, 2009 at 1:49 AM, Mark Engelberg mark.engelb...@gmail.comwrote:


 Is there a variation of hash-map which supports comparison of keys
 using identical? rather than = ?  Ditto with sets.

 


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Mark Engelberg

Here's why it would be useful to have the option of hash maps that use
identity...

Let's say my keys are rather long lists.  Equality comparison will be
slow.  But if I know that the keys are always the exact same long
lists (perhaps because I traversed the key sequence to find a key with
a certain property, and then am updating with that precise key as one
example), then I can save lots of time by using an identity
comparison.

FWIW, the Scheme I usually use (PLT Scheme) supports both eq? and
equal? based hash tables, and I have found both to be useful (although
I prefer equal? based hash tables as the default).

On Sat, Mar 7, 2009 at 7:57 AM, David Powell djpow...@djpowell.net wrote:
 What objects are you trying to store? The ideal in the Clojure world,
 is that most objects are immutable.

  Immutable objects will implement value equality in .equals() ('=' in
  Clojure), and you shouldn't care about whether two objects are
  identical? because it won't affect anything.

It's an efficiency issue.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread David Nolen
Depending on what you are doing perhaps you could use (hash x). This might
not be very helpful if you need to update keys frequently.  However if
you're mostly doing lookups, then this would help quite a bit.
On Sat, Mar 7, 2009 at 3:44 PM, Mark Engelberg mark.engelb...@gmail.comwrote:


 Here's why it would be useful to have the option of hash maps that use
 identity...

 Let's say my keys are rather long lists.  Equality comparison will be
 slow.  But if I know that the keys are always the exact same long
 lists (perhaps because I traversed the key sequence to find a key with
 a certain property, and then am updating with that precise key as one
 example), then I can save lots of time by using an identity
 comparison.

 FWIW, the Scheme I usually use (PLT Scheme) supports both eq? and
 equal? based hash tables, and I have found both to be useful (although
 I prefer equal? based hash tables as the default).

 On Sat, Mar 7, 2009 at 7:57 AM, David Powell djpow...@djpowell.net
 wrote:
  What objects are you trying to store? The ideal in the Clojure world,
  is that most objects are immutable.
 
   Immutable objects will implement value equality in .equals() ('=' in
   Clojure), and you shouldn't care about whether two objects are
   identical? because it won't affect anything.

 It's an efficiency issue.

 


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Rich Hickey



On Mar 7, 3:44 pm, Mark Engelberg mark.engelb...@gmail.com wrote:
 Here's why it would be useful to have the option of hash maps that use
 identity...

 Let's say my keys are rather long lists.  Equality comparison will be
 slow.  But if I know that the keys are always the exact same long
 lists (perhaps because I traversed the key sequence to find a key with
 a certain property, and then am updating with that precise key as one
 example), then I can save lots of time by using an identity
 comparison.


Identity is tested first in equality, if identical, equal, full stop.
So if you are using identical and unique collections as keys you'll
find them without a value-by-value comparison. If they are not
present, it's unlikely you'll get a matching hashCode and a matching
count and a matching long prefix from a mismatch.

 It's an efficiency issue.

I don't think so. Identity keys are most often needed e.g. when you
are trying to save/restore object graphs and need to know that you've
already seen a particular object perhaps equal to another non-
identical object in the same graph, or otherwise tracking object
identity for synchronization with an external system. Using
IdentityHashMap as an implementation detail in these situations is
perfectly fine. I don't see a compelling argument for clojure to offer
identity-based collections.

Rich

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Mark Engelberg

On Sat, Mar 7, 2009 at 2:57 PM, Rich Hickey richhic...@gmail.com wrote:
 Identity is tested first in equality, if identical, equal, full stop.
 So if you are using identical and unique collections as keys you'll
 find them without a value-by-value comparison. If they are not
 present, it's unlikely you'll get a matching hashCode and a matching
 count and a matching long prefix from a mismatch.

But if the key is a long list, wouldn't it have to traverse the list
just to come up with the hash code for that list?  And doesn't it need
that hash code just to know which bin to look in for matching keys?
If so, then the performance hit occurs before you even get to
comparing against other keys.

My understanding is that eq?-based associative collections use a
completely different hashing scheme, hashing on the address or some
other unique/stable value associated with each object, and then use
eq? (like identical?) to do the actual comparison when they get to the
right bin.

Another use case:  Consider sorting a collection of lengthy lists, and
you want to sort based on the length of the lists.  You want to cache
the length of those lists.  Using an identity-based hash-map would be
the right kind of cache for such a thing, because you're looking up
the exact same lists and want to be efficient about it (assuming my
comments above about hash performance is correct).

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Rich Hickey


On Mar 7, 2009, at 8:30 PM, Mark Engelberg wrote:


 On Sat, Mar 7, 2009 at 2:57 PM, Rich Hickey richhic...@gmail.com  
 wrote:
 Identity is tested first in equality, if identical, equal, full stop.
 So if you are using identical and unique collections as keys you'll
 find them without a value-by-value comparison. If they are not
 present, it's unlikely you'll get a matching hashCode and a matching
 count and a matching long prefix from a mismatch.

 But if the key is a long list, wouldn't it have to traverse the list
 just to come up with the hash code for that list?  And doesn't it need
 that hash code just to know which bin to look in for matching keys?
 If so, then the performance hit occurs before you even get to
 comparing against other keys.

The core collection classes cache their hash codes. There is an issue  
to have them create them incrementally, amortizing the cost over all  
insertions, which will work for all but lists, which have a front-to- 
back hash algorithm but are built back-to-front.

http://code.google.com/p/clojure/issues/detail?id=11


 My understanding is that eq?-based associative collections use a
 completely different hashing scheme, hashing on the address or some
 other unique/stable value associated with each object, and then use
 eq? (like identical?) to do the actual comparison when they get to the
 right bin.


Yes, so? It's still not an argument for another collection family,  
which will only engender confusion as people make similar incorrect  
assumptions about performance and equality.

 Another use case:  Consider sorting a collection of lengthy lists, and
 you want to sort based on the length of the lists.  You want to cache
 the length of those lists.  Using an identity-based hash-map would be
 the right kind of cache for such a thing, because you're looking up
 the exact same lists and want to be efficient about it (assuming my
 comments above about hash performance is correct).


This argument only applies if you never want to calculate the hash  
code by a walk, even once. I think that's a false economy, and again,  
not persuasive. With incremental hashcode calculation, even less so.

Clojure has other data structures that address some of the  
deficiencies of lists, but even PersistentList is Counted, i.e.  
provides its count in constant time, as do maps, sets and vectors,  
array, vector and string seqs etc.

Rich


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: hash-map based on identity

2009-03-07 Thread Stephen C. Gilardi


On Mar 7, 2009, at 8:30 PM, Mark Engelberg wrote:


But if the key is a long list, wouldn't it have to traverse the list
just to come up with the hash code for that list?


Yes, but only once. Since Clojure lists are immutable, their hash code  
is calculated once and cached and the cache will never become invalid.


--Steve



smime.p7s
Description: S/MIME cryptographic signature


hash-map based on identity

2009-03-06 Thread Mark Engelberg

Is there a variation of hash-map which supports comparison of keys
using identical? rather than = ?  Ditto with sets.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Clojure group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---