I am using the collections library from CVS and did not have a problem with the minor re-factoring. After a few quick cosmetic changes, everything works fine as far as I can tell.

I have placed a couple of UML diagrams on my web site with the code to help explain what I am doing. You can find the files at: http://bblfish.net/java/collections/

The collections.jpg picture is a quick sketch of the current collections.map class hierarchy. As you can see AbstractHashedMap contains a collection of HashEntry objects. Each of these have a key and a value. Various subclasses of AbstractHashedMap implement different specialisations of hashes, some of which require specialisations of HashedEntry (eg IdentityMap requires an IdentityEntry).
This diagram does not show the inner class relationships. For example HashedEntry is an inner class of AbstractHashedMap.


What I am proposing is really only a very small change. (Even if the next diagram looks very different)

If you looks at collections-altered.jpg you will see that the only novelty is that HashEntry is now an interface, with most of the code that it used to contain now in DefaultHashEntry. I have shown the packages nested inside one another to make clear the static inner class/interface relationships. As you see DefaultHashEntry has three fields, but ReflexiveHashEntry only has two, since it is a data structure that is used for a set where they map.key = map.value.

The only change needed is the interface change - making HashEntry an interface.
This allows us to bypass the default implementation of HashEntry.


I just added the HashedSet as an example of how it could come in useful.

On 10 Feb 2004, at 18:57, Stephen Colebourne wrote:

Unfortunately, as version 3 of [collections] has now been released with
HashEntry as a class, that cannot be changed to an interface.

I would however, support the addition of a similar AbstractHashedSet class
to [collections], preferably with a default HashedSet implementation.


Stephen


----- Original Message ----- From: "Henry Story" <[EMAIL PROTECTED]>


Hi,

A little background to this contribution is in order. I was looking at
an interesting framework for Value Objects written by Dirk Riehle,
available under the lgpl at http://www.jvalue.org/, when I found a
small bug in his code. It was obvious that what was needed was a
WeakHashedSet - a HashSet that would wrap its contents in
WeakReference's. Given the difficulty of extending java.util classes I
opted to look at the apache collections package instead.

I first decided to create a HashedSet class, an equivalent of
java.net.HashSet. The best way to do this I realised, would be to
extend the AbstractHashedMap. The JavaDoc for this class says:

This class implements all the features necessary for a subclass
hash-based map.
Key-value entries are stored in instances of the HashEntry class,
which can be
overridden and replaced. The iterators can similarly be replaced,
without the
need to replace the KeySet, EntrySet and Values view classes.

But in fact it is not quite as extensible as it could be because the HashEntry is currently a class, not an interface. Let me explain by taking the simpler example of rewriting the java.util.HashSet class.

HashSet
     map -> HashMap
              entries -> HashEntries[]
                      key -> Object
                           value -> Object
                      next -> HashEntry

This picture show that HashSet contains a pointer to a HashMap, which
contains a pointer to an Array of HashEntries. Each HashEntry contains
a key pointer, a value pointer and a pointer to the next HashEntry.

A HashSet uses the HashMap because the keys in a HashMap are unique. As
a result the value pointer in the HashEntry has really no role to play.
The way AbstractHashedMap is written in the apache collection means
that to simply follow the above pattern would waste one pointer for
every HashEntry, that is 4 bytes wasted for every entry.


The solution is simple.
1. I turned the HashEntry inner class into an inner interface of
AbstractHashedMap.
2. I added a few methods to get/set the key and the value. These
methods should be inlined by all modern compilers, so I don't think
there is any efficiency problem here.
3. I then wrote a HashedSet which has an inner class that extends
AbstractHashedMap, but which takes a modified HashEntry class where the
key and the value are the same object. The resulting
net.bblfish.util.HashedSet is thus more memory efficient than
java.util.HashSet.


HashedSet
     map -> SpecialisedHashMap
              entries -> HashEntries[]
                      key -> Object
                           next -> HashEntry


Perhaps it would be worth adding these changes to the library.


If the change to
org.apache.commons.collections.map.AbstractHashedMap.HashEntry is
accepted then it will make it relatively easy to write a WeakHashedSet.
It will be just a matter of extending WeakReference by implementing the
HashEntry methods.


WeakHashedSet
     map -> SpeacialWeakHashMap
              entries -> WeakHashEntries[] extends WeakReference
                                           implements HashEntry
                      key -> Object
                           next -> HashEntry


Henry


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to