[ https://issues.apache.org/jira/browse/GEODE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirk Lund reassigned GEODE-10323: --------------------------------- Assignee: Kirk Lund > OffHeapStorageJUnitTest testCreateOffHeapStorage fails with AssertionError: > expected:<100> but was:<1048576> > ------------------------------------------------------------------------------------------------------------ > > Key: GEODE-10323 > URL: https://issues.apache.org/jira/browse/GEODE-10323 > Project: Geode > Issue Type: Improvement > Components: offheap > Affects Versions: 1.16.0 > Reporter: Kirk Lund > Assignee: Kirk Lund > Priority: Major > Labels: needsTriage, pull-request-available > > [Please see 1st comment on this ticket below the description for > recommendation on how to fix.] > {{OffHeapStorageJUnitTest testCreateOffHeapStorage}} started failing > intermittently due to commit a350ed2 which went in as > "[GEODE-10087|https://issues.apache.org/jira/browse/GEODE-10087]: Enhance > off-heap fragmentation visibility. > ([#7407|https://github.com/apache/geode/pull/7407/commits/1640effc5c760afa8d9ec4c2743950bb1de0ef8f])". > The failure stack looks like: > {noformat} > OffHeapStorageJUnitTest > testCreateOffHeapStorage FAILED > java.lang.AssertionError: expected:<100> but was:<1048576> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at org.junit.Assert.assertEquals(Assert.java:633) > at > org.apache.geode.internal.offheap.OffHeapStorageJUnitTest.testCreateOffHeapStorage(OffHeapStorageJUnitTest.java:220) > {noformat} > The cause is in {{MemoryAllocatorImpl}}. A new scheduled executor > {{updateNonRealTimeStatsExecutor}} was added near the bottom of the > constructor which immediately gets scheduled to invoke > {{freeList::updateNonRealTimeStats}} repeatedly at the specified frequency of > {{updateOffHeapStatsFrequencyMs}} whenever an instance of > {{MemoryAllocatorImpl}} is created. > {{freeList::updateNonRealTimeStats}} then updates {{setLargestFragment}} and > {{setFreedChunks}}. > Scheduling or starting any sort of threads from a constructor is considered a > bad practice and should only be done via some sort of activation method such > as {{start()}} which can be invoked from the static factory method for > {{MemoryAllocatorImpl}}. The unit tests would then continue to construct > instances via a constructor that does NOT invoke {{start()}}. > {{OffHeapStorageJUnitTest testCreateOffHeapStorage}}, and possibly other > tests, is written with the assumption that no other thread or component can > or will be updating the {{OffHeapStats}}. Hence it is then safe for the test > to setLargestFragment and then assert that it has the value passed in: > {noformat} > 219: stats.setLargestFragment(100); > 220: assertEquals(100, stats.getLargestFragment()); > {noformat} > Line 220 is the source of the assertion failure because > {{updateNonRealTimeStatsExecutor}} is racing with the JUnit thread and > changes the value of {{setLargestFragment}} immediately after the test sets > the value to 100, but before the test invokes the assertion. > Looking back at the [PR test > results|https://cio.hdb.gemfire-ci.info/commits/pr/7407/junit?days=90] we can > see that stress-new-test failed 5 times with this exact failure before being > merged to develop: > * > http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646140652/ > * > http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646154134/ > (x2) > * > http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646156997/ > * > http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646170346/ -- This message was sent by Atlassian Jira (v8.20.7#820007)