Hi,
On 05/05/2016 02:54 PM, Stephen Colebourne wrote:
On 5 May 2016 at 02:30, Stuart Marks <stuart.ma...@oracle.com> wrote:
I disagree with altering the iteration order. Guava's ImmutableSet and
ImmutableMap have reliable iteration specified to match creation
order. This aspect of the design is very useful.
I think this is a reasonable thing to want, but it would be a different API.
The current Set.of() and Map.of() APIs have unspecified iteration order, and
I want to "enforce" that via randomizing the iteration order. Having
unspecified iteration order, but with the implementation giving a stable
order, is the worst of both worlds. It lets clients bake in inadvertent
order dependencies, and then when we do change it, it breaks them. (This
seems to happen every couple years with HashMap.) Randomizing iteration
order helps preserve implementation freedom.
That said, it's a fine thing to have a different API that gives a Set or Map
with a stable iteration order. Several people have asked for this. I've
filed JDK-8156070 to track this.
To be clear, I believe that the spec of Set.of() and Map.of() should
require iteration order matching insertion order (because the order is
very clearly defined in this case and anything else will be
surprising).
Stephen
So what if the API was specified to have iteration order matching
construction order. Are you afraid that would impact implementations so
that they would be less compact or less performant? At least for small
number of elements, there is a possible implementation that keeps
construction order and is even more compact and still has O(1) lookup
performance. For example an implementation for up to 255 elements:
public class Set0xFF<E> extends AbstractSet<E> implements Serializable {
/**
* A "salt" value used for randomizing iteration order. This is
initialized once
* and stays constant for the lifetime of the JVM. It need not be
truly random, but
* it needs to vary sufficiently from one run to the next so that
iteration order
* will vary between JVM runs.
*/
static final int SALT;
static {
SALT = new Random().nextInt();
}
/**
* The reciprocal of load factor. Given a number of elements
* to store, multiply by this factor to get the table size.
*/
static final double EXPAND_FACTOR = 2.0;
final E[] elements;
final byte[] index;
@SafeVarargs
@SuppressWarnings("unchecked")
Set0xFF(E... input) {
if (input.length > 255) {
throw new IllegalArgumentException("To many elements for
this implementation");
}
elements = input.clone();
index = new byte[(int) Math.ceil(EXPAND_FACTOR * elements.length)];
for (int i = 0; i < elements.length; i++) {
E e = Objects.requireNonNull(elements[i]);
int idx = probe(e);
if (idx >= 0) {
throw new IllegalArgumentException("duplicate element:
" + e);
} else {
index[-(idx + 1)] = (byte) (i + 1);
}
}
}
@Override
public int size() {
return elements.length;
}
@Override
@SuppressWarnings("unchecked")
public boolean contains(Object o) {
Objects.requireNonNull(o);
return probe((E) o) >= 0;
}
@Override
public Iterator<E> iterator() {
return new Iterator<E>() {
int idx = 0;
@Override
public boolean hasNext() {
return idx < elements.length;
}
@Override
public E next() {
int i = idx;
if (i >= elements.length) {
throw new NoSuchElementException();
}
idx = i + 1;
return elements[i];
}
};
}
// returns index into index array at which element index + 1 is
present; or if absent,
// (-i - 1) where i is location where element index should be inserted
private int probe(E pe) {
int idx = Math.floorMod(pe.hashCode() ^ SALT, index.length);
while (true) {
int i = (int) index[idx] & 0xFF;
if (i == 0) {
return -idx - 1;
} else if (elements[i - 1].equals(pe)) {
return idx;
} else if (++idx == index.length) {
idx = 0;
}
}
}
}
Your SetN occupies (without considering the elements themselves) the
following number of bytes:
8 * size() (on 32 bit/compressed OOPS) or 16 * size() (on 64bit
OOPS) (+ 2 object headers and alignments)
Above Set0xFF occupies:
6 * size() (on 32 bit/compressed OOPS) or 10 * size() (on 64bit
OOPS) (+ 3 object headers and alignments)
An analogue Set0xFFFF for up to 65535 elements would occupy:
8 * size() (on 32 bit/compressed OOPS) or 12 * size() (on 64bit
OOPS) (+ 3 object headers and alignments)
An analogue Set0xFFFFFFFF for up to 2^31 - 1 elements that still keeps
iteration order would occupy:
12 * size() (on 32 bit/compressed OOPS) or 16 * size() (on 64bit
OOPS) (+ 3 object headers and alignments)
So only on 32 bit/compressed OOPS for # of elements in range 2^16 ...
2^31 - 1 there is 1.5 x footprint increase.
Worth considering?
Regards, Peter