Re: RFR: 8281631: HashMap copy constructor and putAll can over-allocate table [v27]

Stuart Marks Wed, 09 Mar 2022 17:15:09 -0800

On Sat, 5 Mar 2022 19:06:03 GMT, XenoAmess <d...@openjdk.java.net> wrote:

>> 8281631: HashMap copy constructor and putAll can over-allocate table
>
> XenoAmess has updated the pull request incrementally with one additional
> commit since the last revision:
>
> refactor tests

Sorry, the test changes look like they're heading in the wrong direction. I
tried to provide some hints for what I was looking for in my previous comments.
At this point, I felt it would have been too time-consuming to provide a bunch
of review comments to try to get your proposed tests closer to what I was
looking for, so instead I've extended/rewritten the existing
`WhiteBoxResizeTest` file:

https://github.com/stuart-marks/jdk/blob/JDK-8281631-smarks/test/jdk/java/util/HashMap/WhiteBoxResizeTest.java

I hope you're ok with this.

Here are some issues I have with the multiple-file approach:

* The separate classes tends to obscure things, as the separate files are
actually tightly coupled; infrastructure (reflection stuff) and test case setup
(Util, TestSuite) are separated from the test, making it hard to see what's
going on.
* Each test method still does too much work. In general, test methods should do
one thing in three phases: setup, perform action, assert results -- the
[GivenWhenThen](https://martinfowler.com/bliki/GivenWhenThen.html) pattern.
(Yes, I know a lot of tests currently in the regression suite violate this, but
I think it's reasonable for new fine-grained combo tests like this one to be
held to this standard.) Several of the test methods here do setup, perform an
op, make assertions, perform another op, make assertions, perform another op,
make more assertions, etc. The reason this is bad is that it's hard to separate
out special cases. Consider `WhiteBoxHashMapInitialCapacityTest`. Since this
test method tests both lazy allocation and initial capacity allocation, it
can't handle the different behavior of `WeakHashMap`.
* The need for separate classes is possibly driven by JUnit 4, where the test
parameterization mechanism seems to only be available on the granularity of an
entire class. (I'm not a JUnit expert, so I might have missed something.) This
makes it difficult to structure parameterized tests on a per-method basis, as
can be done in TestNG.

The existing `WhiteBoxResizeTest` is TestNG, so it seems sensible to extend
that, and add several parameterized test methods. I believe the new
`WhiteBoxResizeTest` covers all the existing cases as well as the new ones in
your tests, plus some additional ones that we've uncovered during this
discussion. It covers a reasonable set of cases in the following areas:

* combo of various constructors plus populating the map in different ways all
get the same expected table size
* lazy allocation of table
* correct default allocation of table
* verification of fix for `WeakHashMap` premature resizing
* verification of fix using double instead of float calculation for certain
large sizes

Here are some notes about what I did. Unfortunately all that's here is the end
result. I tried a bunch of experiments and different alternatives, which aren't
present here, but I'll try to document some of the processes I went through to
get to this result.

* Rearranged the infrastructure to accommodate `WeakHashMap`. The `table()` and
`capacity()` methods now work for all map classes under test. Really, there are
only three classes, and two (`HashMap` and `LinkedHashMap`) have the table in
the `HashMap` class, so a simple conditional will work. If additional classes
are added, the conditional might turn into a case expression.
* Simplified VarHandle setup. The old code had to load some private classes in
order to get the exact types to provide to `findVarHandle()`. It's easier to
find the field using reflection and then to "unreflect" it into a VarHandle.
* Rearranged test methods into pairs of data provider and
data-provider-consuming test methods, possibly with some helper methods,
demarcated into different sections in the test class. Using TestNG data
providers is a bit clunky with `Object[][]` or `Iterator<Object[]>` but it ends
up working reasonably well.
* The `tableSizeFor()` test is also now a data provider. It probably would have
been ok to leave it as a single method with a bunch of asserts, but using a
data provider enables using a tabular form for the test data, which I think
makes it easier to manage the test cases.
* Most test data is tabular. It's fairly straightforward but for some providers
it's bit a repetitive to add more test cases. However, I didn't go the next
step of compressing out the repetition, because doing so would have made it
difficult to exclude special cases. For example, the `WeakHashMap` growth
policy for `putAll` differs from the others.
* Of note is that some of the larger sizes/capacities are tested using a "fake"
map, which reports some size but whose `entrySet` iterator returns only one
entry. This works for some of the implementations, so only cases for those are
included.
* I used helper methods called from the data provider in order to establish a
target type for lambdas. This reduces the clutter in the table data.
* It's much easier to parameterize constructors using `Supplier` lambdas as
method arguments instead of as fields in a helper class. Using a `record` might
be helpful here though.
* If a parameterized test fails, TestNG will print out the arguments obtained
from the data provider that caused the test method to fail. The string form of
a lambda isn't useful. Providing a string label is a technique (hack) to
identify the failing test case. It's a bit cryptic but it's quite effective.

There is a certain amount of brittleness in these tests such that they're prone
to failure if the internals of one of the implementations changes. These are
whitebox tests after all. But the new organization should facilitate
rearrangement of the test cases and test methods if one of the implementations
diverges. In addition, if changing one of the implementations causes test
failures here, it will point to other implementations that might also need to
be changed.

The new test is comprehensive enough that we might be able to remove the
`Enum/ConstantDirectoryOptimalCapacity` test and the related test that uses
those utilities to verify map table sizes. However, that evaluation should be
done separately.

I suggest you merge this file into this PR in place of the other test files.
Once you get this done I can run everything through our internal test system
and sponsor this PR for integration.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7431

Re: RFR: 8281631: HashMap copy constructor and putAll can over-allocate table [v27]

Reply via email to