On 2016/1/29 20:53, Paul Sandoz wrote:
On 29 Jan 2016, at 13:43, Hamlin Li <huaming...@oracle.com> wrote:

Hi Paul,

Sorry for the delayed response, I have been occupied by other higher-priority tasks.
Thanks for your review; I agree that your second approach is better.
New webrev: http://cr.openjdk.java.net/~mli/8076458/webrev.01/

The changes to the data providers look ok.

Would you mind splitting out the tests between StreamTestData<Integer> and 
StreamTestData<Integer>.small as outlined in 2) below? That way, for the non-explosive 
stuff, we can still crunch on larger data without much of a slowdown.
Hi Paul,

Yes, you're right, it does not slow down too much: it took 15.553 seconds after the first revision (webrev.01) and 16.064 seconds after the second revision (webrev.02).
Please check the webrev: http://cr.openjdk.java.net/~mli/8076458/webrev.02/
Below are the times (in seconds) for the different ops:
  total              : 169.996
  testOps only       : 108.988
  testIntOps only    :  23.865
  testLongOps only   :  22.326
  testDoubleOps only :  16.944
so I built small data providers for each of them.
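
For reference, here is a minimal TestNG sketch of roughly what a ".small" provider could look like; the class name, sizes, and the plain (name, array) payload are illustrative assumptions only, not what webrev.02 actually adds (the real providers in StreamTestDataProvider hand the tests the harness's TestData.OfRef and primitive TestData types):

import java.util.stream.IntStream;

import org.testng.annotations.DataProvider;

public class SmallStreamTestDataSketch {
    // Illustrative sizes only: the point is to stay well below 1000 elements so
    // that flatMap tests which expand an element of value m into ~m elements
    // do not produce huge amounts of data and garbage.
    private static final int[] SMALL_SIZES = {0, 1, 10, 100};

    @DataProvider(name = "StreamTestData<Integer>.small")
    public static Object[][] makeSmallStreamTestData() {
        Object[][] data = new Object[SMALL_SIZES.length][];
        for (int i = 0; i < SMALL_SIZES.length; i++) {
            int size = SMALL_SIZES[i];
            // Each data set is a (name, payload) pair; a boxed array stands in
            // here for the harness's wrapped test data.
            Integer[] array = IntStream.range(0, size).boxed().toArray(Integer[]::new);
            data[i] = new Object[] {"array:" + size, array};
        }
        return data;
    }
}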

Ok, and i suspect/hope it drops by at least an order of magnitude with your 
changes applied.
Yes, it took 15.553 seconds after the first revision (webrev.01).
Out of curiosity i wonder what the times would be if using parallel GC rather 
than G1.
With different GC options after the second revision (webrev.02):
  -XX:-UseParallelGC    : elapsed time (seconds): 16.047
  -XX:+UseParallelGC    : elapsed time (seconds): 13.263
  -XX:-UseG1GC          : elapsed time (seconds): 16.612
  -XX:+UseG1GC          : elapsed time (seconds): 16.998
  -XX:-UseParallelOldGC : elapsed time (seconds): 16.039
  -XX:+UseParallelOldGC : elapsed time (seconds): 14.297

Thank you
-Hamlin

Paul.

Thank you
-Hamlin

On 2016/1/26 21:18, Paul Sandoz wrote:
Hi Hamlin,

Conservatively I would prefer not to remove data sets if at all possible. It 
will affect all tests, and leaf tasks for parallel streams should have enough 
data to crunch on.

I suspect the problem with the flatMap test is not necessarily due to the source 
sizes being 1000 elements, but that there are tests which replace an element whose 
value is m with the elements from 0..m, which can explode things and generate lots 
of garbage.
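
To put a rough number on that: with a 1000-element source holding the values 0..999, a flatMap that expands each element m into the m elements 0..m-1 (in the spirit of the integerRangeMapper tests) produces about half a million output elements per exercised pipeline. A standalone sketch of the arithmetic, assuming that 1000-element shape:

import java.util.stream.IntStream;

public class FlatMapExplosionSketch {
    public static void main(String[] args) {
        int n = 1000;
        // Each element m is replaced by the m elements 0..m-1, so the total
        // output size is 0 + 1 + ... + (n - 1) = n * (n - 1) / 2.
        long produced = IntStream.range(0, n)
                .boxed()
                .flatMap(m -> IntStream.range(0, m).boxed())
                .count();
        System.out.println(produced); // 499500 for n = 1000
    }
}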

Have you tried executing those kinds of tests when the data size is < 1000?

My bet is the FlatMapOpTest will run significantly faster and you will not need 
to split it out.

There are two ways we could consider doing this:

1) Check the size in the test method:

if (data.size() < 1000) {
    exerciseOps(data, s -> s.flatMap(mfLt));
    exerciseOps(data, s -> s.flatMap(integerRangeMapper));
    exerciseOps(data, s -> s.flatMap((Integer e) -> IntStream.range(0, e).boxed().limit(10)));
}

2) Include a new data provider for smaller data sets

@Test(dataProvider = "StreamTestData<Integer>", dataProviderClass = 
StreamTestDataProvider.class)
public void testOps(String name, TestData.OfRef<Integer> data) {
     Collection<Integer> result = exerciseOps(data, s -> s.flatMap(mfId));
     assertEquals(data.size(), result.size());

     result = exerciseOps(data, s -> s.flatMap(mfNull));
     assertEquals(0, result.size());

     result = exerciseOps(data, s-> s.flatMap(e -> Stream.empty()));
     assertEquals(0, result.size());
}

@Test(dataProvider = "StreamTestData<Integer>.small", dataProviderClass = 
StreamTestDataProvider.class)
public void testOpsX(String name, TestData.OfRef<Integer> data) {
     exerciseOps(data, s -> s.flatMap(mfLt));
     exerciseOps(data, s -> s.flatMap(integerRangeMapper));
     exerciseOps(data, s -> s.flatMap((Integer e) -> IntStream.range(0, 
e).boxed().limit(10)));
}

I prefer the latter approach (applied to the ref and primitive data sets). It's 
more work, but I think it's the right direction.

Paul.
