Re: RFR: 8281631: HashMap copy constructor and putAll can over-allocate table [v27]

XenoAmess Thu, 10 Mar 2022 08:01:56 -0800

On Thu, 10 Mar 2022 01:11:03 GMT, Stuart Marks <sma...@openjdk.org> wrote:


> Sorry, the test changes look like they're heading in the wrong direction. I 
> tried to provide some hints for what I was looking for in my previous 
> comments. At this point, I felt it would have been too time-consuming to 
> provide a bunch of review comments to try to get your proposed tests closer 
> to what I was looking for, so instead I've extended/rewritten the existing 
> `WhiteBoxResizeTest` file:
> 
> https://github.com/stuart-marks/jdk/blob/JDK-8281631-smarks/test/jdk/java/util/HashMap/WhiteBoxResizeTest.java
> 
> I hope you're ok with this.
> 
> Here are some issues I have with the multiple-file approach:
> 
> * The separate classes tends to obscure things, as the separate files are 
> actually tightly coupled; infrastructure (reflection stuff) and test case 
> setup (Util, TestSuite) are separated from the test, making it hard to see 
> what's going on.
> * Each test method still does too much work. In general, test methods should 
> do one thing in three phases: setup, perform action, assert results -- the 
> [GivenWhenThen](https://martinfowler.com/bliki/GivenWhenThen.html) pattern. 
> (Yes, I know a lot of tests currently in the regression suite violate this, 
> but I think it's reasonable for new fine-grained combo tests like this one to 
> be held to this standard.) Several of the test methods here do setup, perform 
> an op, make assertions, perform another op, make assertions, perform another 
> op, make more assertions, etc. The reason this is bad is that it's hard to 
> separate out special cases. Consider `WhiteBoxHashMapInitialCapacityTest`. 
> Since this test method tests both lazy allocation and initial capacity 
> allocation, it can't handle the different behavior of `WeakHashMap`.
> * The need for separate classes is possibly driven by JUnit 4, where the test 
> parameterization mechanism seems to only be available on the granularity of 
> an entire class. (I'm not a JUnit expert, so I might have missed something.) 
> This makes it difficult to structure parameterized tests on a per-method 
> basis, as can be done in TestNG.
> 
> The existing `WhiteBoxResizeTest` is TestNG, so it seems sensible to extend 
> that, and add several parameterized test methods. I believe the new 
> `WhiteBoxResizeTest` covers all the existing cases as well as the new ones in 
> your tests, plus some additional ones that we've uncovered during this 
> discussion. It covers a reasonable set of cases in the following areas:
> 
> * combo of various constructors plus populating the map in different ways all 
> get the same expected table size
> * lazy allocation of table
> * correct default allocation of table
> * verification of fix for `WeakHashMap` premature resizing
> * verification of fix using double instead of float calculation for certain 
> large sizes
> 
> Here are some notes about what I did. Unfortunately all that's here is the 
> end result. I tried a bunch of experiments and different alternatives, which 
> aren't present here, but I'll try to document some of the processes I went 
> through to get to this result.
> 
> * Rearranged the infrastructure to accommodate `WeakHashMap`. The `table()` 
> and `capacity()` methods now work for all map classes under test. Really, 
> there are only three classes, and two (`HashMap` and `LinkedHashMap`) have 
> the table in the `HashMap` class, so a simple conditional will work. If 
> additional classes are added, the conditional might turn into a case 
> expression.
> * Simplified VarHandle setup. The old code had to load some private classes 
> in order to get the exact types to provide to `findVarHandle()`. It's easier 
> to find the field using reflection and then to "unreflect" it into a 
> VarHandle.
> * Rearranged test methods into pairs of data provider and 
> data-provider-consuming test methods, possibly with some helper methods, 
> demarcated into different sections in the test class. Using TestNG data 
> providers is a bit clunky with `Object[][]` or `Iterator<Object[]>` but it 
> ends up working reasonably well.
> * The `tableSizeFor()` test is also now a data provider. It probably would 
> have been ok to leave it as a single method with a bunch of asserts, but 
> using a data provider enables using a tabular form for the test data, which I 
> think makes it easier to manage the test cases.
> * Most test data is tabular. It's fairly straightforward but for some 
> providers it's bit a repetitive to add more test cases. However, I didn't go 
> the next step of compressing out the repetition, because doing so would have 
> made it difficult to exclude special cases. For example, the `WeakHashMap` 
> growth policy for `putAll` differs from the others.
> * Of note is that some of the larger sizes/capacities are tested using a 
> "fake" map, which reports some size but whose `entrySet` iterator returns 
> only one entry. This works for some of the implementations, so only cases for 
> those are included.
> * I used helper methods called from the data provider in order to establish a 
> target type for lambdas. This reduces the clutter in the table data.
> * It's much easier to parameterize constructors using `Supplier` lambdas as 
> method arguments instead of as fields in a helper class. Using a `record` 
> might be helpful here though.
> * If a parameterized test fails, TestNG will print out the arguments obtained 
> from the data provider that caused the test method to fail. The string form 
> of a lambda isn't useful. Providing a string label is a technique (hack) to 
> identify the failing test case. It's a bit cryptic but it's quite effective.
> 
> There is a certain amount of brittleness in these tests such that they're 
> prone to failure if the internals of one of the implementations changes. 
> These are whitebox tests after all. But the new organization should 
> facilitate rearrangement of the test cases and test methods if one of the 
> implementations diverges. In addition, if changing one of the implementations 
> causes test failures here, it will point to other implementations that might 
> also need to be changed.
> 
> The new test is comprehensive enough that we might be able to remove the 
> `Enum/ConstantDirectoryOptimalCapacity` test and the related test that uses 
> those utilities to verify map table sizes. However, that evaluation should be 
> done separately.
> 
> I suggest you merge this file into this PR in place of the other test files. 
> Once you get this done I can run everything through our internal test system 
> and sponsor this PR for integration.

Thanks for your help.

A quick glance: that is really big differences between the designs, but I do 
accept that sometimes there be no absolute right and wrong in designs, and 
consistency to the group is important.

So I will have a look at your version of this test and it functionally correct 
and can cover our newly found cases, I'm glad to change the tests to your 
version.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7431

Re: RFR: 8281631: HashMap copy constructor and putAll can over-allocate table [v27]

Reply via email to