EdwinIngJ opened a new issue, #18651:
URL: https://github.com/apache/druid/issues/18651

   I was able to reproduce similar errors while running the test 
`org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle`
 as the errors mentioned in this previous PR for 
`org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd`
 here: https://github.com/apache/druid/pull/17264.
   
   Digging deeper I found the root cause to be line 
[227](https://github.com/apache/druid/blob/aa3d659f6db50340a2572b0afc710c409fe386a2/extensions-contrib/moving-average-query/src/main/java/org/apache/druid/query/movingaverage/MovingAverageIterable.java#L227)
 in `MovingAverageIterable.java` where the `averagerKey.iterator()` is used to 
traverse the `Set<Map<String,Object>> averagerKeys`. The iteration order of the 
Set is not guaranteed to have any specific ordering. For example if I modify 
lines 225 like such: 
   
   ```java
   -            Set<Map<String, Object>> averagerKeys = new 
HashSet<>(averagers.keySet());
   +            Set<Map<String, Object>> averagerKeys = new HashSet<>(101); // 
Set initial capacity
   +            averagerKeys.addAll(averagers.keySet());
                averagerKeys.removeAll(seenKeys);
                averagersKeysIter = averagerKeys.iterator();
                cacheIter = null;
   ```
   Initializing the HashSet to some initial capacity (101), both the 
`extensions-contrib/moving-average-query,org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle`
 and 
`org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd`fail
 consistently with the following errors:
   
   ```
   [INFO] Running org.apache.druid.query.movingaverage.MovingAverageIterableTest
   [ERROR] Tests run: 4, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 
0.811 s <<< FAILURE! -- in 
org.apache.druid.query.movingaverage.MovingAverageIterableTest
   [ERROR] 
org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle
 -- Time elapsed: 0.734 s <<< FAILURE!
   org.junit.ComparisonFailure: expected:<[u]> but was:<[f]>
        at org.junit.Assert.assertEquals(Assert.java:117)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at 
org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtMiddle(MovingAverageIterableTest.java:521)
        at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
   ```
   
   and 
   
   ```
   [INFO] Running org.apache.druid.query.movingaverage.MovingAverageIterableTest
   [ERROR] Tests run: 4, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 
0.763 s <<< FAILURE! -- in 
org.apache.druid.query.movingaverage.MovingAverageIterableTest
   [ERROR] 
org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd
 -- Time elapsed: 0.691 s <<< FAILURE!
   org.junit.ComparisonFailure: expected:<[u]> but was:<[f]>
        at org.junit.Assert.assertEquals(Assert.java:117)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at 
org.apache.druid.query.movingaverage.MovingAverageIterableTest.testMissingDataAtTheEnd(MovingAverageIterableTest.java:441)
        at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
   ```
   , respectively.
   
   Should the iteration order change both this test and the test mentioned in 
the PR above would fail.
   
   According to the observations above, I propose solution to make these tests 
more robust by using a set comparison for the lines where the test asserts for 
the presence of the injections such as 
   ```
   Set<String> expectedGenders = new HashSet<>(Arrays.asList("u", "f"));
   Set<String> actualGenders = new HashSet<>();
   
   Assert.assertTrue(iter.hasNext());
   result = iter.next();
   actualGenders.add((result.getDimension("gender")).get(0));
   Assert.assertEquals(JAN_2, (result.getTimestamp()));
   
   Assert.assertTrue(iter.hasNext());
   result = iter.next();
   actualGenders.add((result.getDimension("gender")).get(0));
   Assert.assertEquals(JAN_2, (result.getTimestamp()));
   
   Assert.assertEquals(expectedGenders, actualGenders);
   ``` 
   This would not reduce the coverage of these tests.
   
   @kfaraz Would you like me to open a PR for this? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to