On 07/09/2014 06:36 PM, Martin Buchholz wrote:
OK, I think we're down to the smallest possible bug:
I wouldn't call it a bug, since it's not present.
if size == MAX_CAPACITY - 1 and we putAll a concurrent map with
greater size, but the concurrent map shrinks while we're adding it,
and no actual elements get added to the IHM (but each element put
takes ~ 2**28 probes), then an ISE is thrown when it was not strictly
necessary.
We won't be adding any elements if resize() throws "capacity exceeded",
so this is just hypothetical thinking what would happen if we didn't
call resize() and just started adding elements from the argument map. We
might hit the ceiling or we might not.
Meanwhile, I'm actually wishing we could throw at something like 80%
full...
I created a little simulation:
public class IMHSimulation {
static final int MAXIMUM_CAPACITY = 1 << 29;
static final int MASK = MAXIMUM_CAPACITY - 1;
static int hash(Object x) {
int h = System.identityHashCode(x);
// Multiply by -127 to use least bit as part of hash
return (h - (h << 7)) & MASK;
}
static int add(BitSet bs, Object o) {
int i = hash(o);
int probes = 0;
while (bs.get(i)) {
i = (i + 1) & MASK;
probes++;
}
bs.set(i);
return probes;
}
public static void main(String[] args) {
BitSet bs = new BitSet(MAXIMUM_CAPACITY);
long totalProbes = 0;
int maxProbes = 0;
int percent = 10;
int nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY / 100);
for (int i = 0; i < MAXIMUM_CAPACITY; i++) {
int probes = add(bs, new Object());
totalProbes += probes;
maxProbes = Math.max(maxProbes, probes);
if (i == nextReportAt) {
System.out.printf(
"%3d%% full, avg. %4.1f, max. %6d probes\n",
percent, (double) totalProbes / i, maxProbes
);
if (percent < 60)
percent += 10;
else if (percent < 90)
percent += 5;
else
percent++;
nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY
/ 100);
}
}
}
Which produces the following:
10% full, avg. 0.1, max. 9 probes
20% full, avg. 0.1, max. 15 probes
30% full, avg. 0.2, max. 25 probes
40% full, avg. 0.3, max. 38 probes
50% full, avg. 0.5, max. 64 probes
60% full, avg. 0.7, max. 93 probes
65% full, avg. 0.9, max. 147 probes
70% full, avg. 1.2, max. 224 probes
75% full, avg. 1.5, max. 354 probes
80% full, avg. 2.0, max. 441 probes
85% full, avg. 2.8, max. 789 probes
90% full, avg. 4.5, max. 1869 probes
91% full, avg. 5.1, max. 2377 probes
92% full, avg. 5.7, max. 3409 probes
93% full, avg. 6.6, max. 3804 probes
94% full, avg. 7.8, max. 5824 probes
95% full, avg. 9.5, max. 7021 probes
96% full, avg. 12.0, max. 12607 probes
97% full, avg. 16.2, max. 17643 probes
98% full, avg. 24.5, max. 34561 probes
99% full, avg. 49.6, max. 131371 probes
IMH resizing is arranged so that the table is always 33% ... 66% full
(if nothing is ever removed from it) except when capacity reaches 2**29,
then it can be filled up to the top.
avg. # of probes can be taken as a rough estimation of average
slow-down, max. # of probes can be taken as a rough estimation of
worst-case-operation slow-down.
So where to draw the line?
Regards, Peter
Relatedly: It occurs to me that it's traditional in java.util to throw
OOME instead of ISE for capacity exceeded.
On Wed, Jul 9, 2014 at 12:33 AM, Peter Levart <peter.lev...@gmail.com
<mailto:peter.lev...@gmail.com>> wrote:
On 07/09/2014 09:23 AM, Peter Levart wrote:
On 07/09/2014 02:46 AM, Martin Buchholz wrote:
Let me understand - you're worried that when size is
MAX_CAPACITY - 1, then a call to putAll that does not actually
add any elements might throw?
This is not possible, because resize() is only called from
putAll(map) if argument map.size() > this.size. At least one
element will be added to the map and it's correct to throw if
current size == MAX_CAPACITY - 1.
...at least if the argument map obeys the rules of Map contract.
Even if it's a HashMap or another IdentityHashMap, it should not
contain the same INSTANCE of the key in two or more mappings,
should not "overshoot" reporting it's size() and should be stable
during the whole putAll() operation... So calling IHM.addAll()
with a live changing ConcurrentHashMap is not advisable. Even
then, addAll() could only throw, because at some point the
argument's size indicated that IHM could reach it's max. capacity.
Peter
Well, I'm having a hard time considering that corner case a
bug, given how unusable the map is at this point.
It seems even this corner case is not present.
Your suggested fix of returning immediately in case of no resize
would cause put to succeed and reach size == MAX_CAPACITY, which
you were trying to prevent.
That's not possible either, since resize() is always called from
put() with current table.length, which makes newLength == 2 *
oldLength, therefore (oldLength >= newLength) will never succeed
in this case.
Peter
On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov
<ivan.gerasi...@oracle.com <mailto:ivan.gerasi...@oracle.com>>
wrote:
On 09.07.2014 3:20, Martin Buchholz wrote:
I agree with Peter that we do not need to increment
modCount on resize.
My latest webrev is again "done".
Ivan, feel free to take over.
Yes, thanks! The fix is much much better now.
I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and
we're merging in another map, which has more elements with
putAll(), resize() will be called and it will throw, even
though all the elements could fit into the map due to the
key duplicates.
To avoid this the following check should be done first in
resize():
469 if (oldLength >= newLength)
470 return false;
Sincerely yours,
Ivan