Jason Huynh created GEODE-1588:
----------------------------------
Summary: Starting and stopping wan sender can cause OOME
Key: GEODE-1588
URL: https://issues.apache.org/jira/browse/GEODE-1588
Project: Geode
Issue Type: Bug
Components: wan
Reporter: Jason Huynh
The following test is likely to cause an OutOfMemoryError (OOME) because of a
timing issue in stopping the gateway sender.
{noformat}
public void closingSenderWhileBatchOperationsAreProcessingShouldNotHaveMultipleThreadsReadFromSameStream()
    throws Exception {
  Integer lnPort = (Integer) vm0.invoke(() ->
      WANTestBase.createFirstLocatorWithDSId(1));
  Integer nyPort = (Integer) vm1.invoke(() ->
      WANTestBase.createFirstRemoteLocator(2, lnPort));

  createCacheInVMs(nyPort, vm2);
  createReceiverInVMs(vm2);
  createCacheInVMs(lnPort, vm4);

  // keep the maxQueueMemory low enough to trigger eviction
  vm4.invoke(() -> WANTestBase.createConcurrentSender("ln", 2,
      false, 100, 101, false, false, null, true, 3, OrderPolicy.KEY));
  vm2.invoke(() -> WANTestBase.createPartitionedRegion(
      getTestMethodName() + "_RR", null, 0, 10, isOffHeap()));
  // vm2.invoke(() -> WANTestBase.createPartitionedRegion(
  //     getTestMethodName() + "_RR", null, 0, 10, isOffHeap()));

  startSenderInVMs("ln", vm4);
  vm2.invoke(() -> addListenerToSleepAfterCreateEvent(10,
      getTestMethodName() + "_RR"));
  // vm4.invoke(() -> WANTestBase.createPartitionedRegion(
  //     getTestMethodName() + "_RR", null, 0, 10, isOffHeap()));
  vm4.invoke(() -> WANTestBase.createReplicatedRegion(
      getTestMethodName() + "_RR", "ln", isOffHeap()));
  vm4.invoke(() -> addListenerToSleepAfterCreateEvent(1,
      getTestMethodName() + "_RR"));
  vm4.invokeAsync(() -> WANTestBase.doPutsAfter300(
      getTestMethodName() + "_RR", 1000000));

  Thread.sleep(5000);
  stopSenderInVMsAsync("ln", vm4);
  Thread.sleep(10000);

  // repeatedly restart and stop the sender while puts are still in flight
  for (int i = 0; i < 100; i++) {
    startSenderInVMs("ln", vm4);
    Thread.sleep(10000);
    stopSenderInVMs("ln", vm4);
    Thread.sleep(5000);
  }
  //
  // stopSenderInVMsAsync("ln", vm4);
  // Thread.sleep(1000);
  // startSenderInVMs("ln", vm4);
  // Thread.sleep(1000);

  vm2.invoke(() -> WANTestBase.validateRegionSize(
      getTestMethodName() + "_RR", 10000, 240000));
}
{noformat}
Because of the way this test is written, I wouldn't necessarily want it checked
in: it is very timing-dependent and possibly flaky. It will run into the OOME
eventually, but whether and when it does depends entirely on timing.
The issue is that the ack reader thread reads from the same socket as the
thread that is closing the gateway sender, which corrupts the stream.
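
To illustrate how a second reader on the same stream can plausibly lead to
an OOME, here is a minimal standalone sketch. It is not Geode code and makes
several assumptions: the wire format is taken to be a 4-byte length prefix
followed by the payload, and SharedStreamSketch, frame,
lengthSeenAfterInterleavedRead and MAX_FRAME_BYTES are hypothetical names
invented for this example. The second thread is simulated deterministically
by skipping bytes, so the result does not depend on timing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from the Geode code base.
class SharedStreamSketch {

  // Hypothetical sanity limit a reader could apply before allocating a buffer.
  static final int MAX_FRAME_BYTES = 1 << 20;

  // Encodes a payload as [4-byte big-endian length][payload bytes].
  static byte[] frame(byte[] payload) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(payload.length);
      out.write(payload);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the length prefix the first reader sees after another reader has
  // already consumed stolenBytes bytes from the same stream.
  static int lengthSeenAfterInterleavedRead(byte[] payload, int stolenBytes) {
    try {
      DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(frame(payload)));
      in.skipBytes(stolenBytes); // stands in for the second thread's read
      return in.readInt();       // first reader now parses a "length"
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // A reader that allocates new byte[length] without this check may try to
  // allocate gigabytes when the stream is misaligned.
  static boolean isPlausibleLength(int length) {
    return length >= 0 && length <= MAX_FRAME_BYTES;
  }

  public static void main(String[] args) {
    byte[] payload = {0x7f, 0x7f, 0x7f, 0x7f, 0x01, 0x02};

    int aligned = lengthSeenAfterInterleavedRead(payload, 0);
    int misaligned = lengthSeenAfterInterleavedRead(payload, 4);

    System.out.println("aligned length:    " + aligned
        + " plausible=" + isPlausibleLength(aligned));
    System.out.println("misaligned length: " + misaligned
        + " plausible=" + isPlausibleLength(misaligned));
  }
}
```

With the stream aligned, the reader sees the real length (6). Once four bytes
have been consumed elsewhere, payload bytes are parsed as the length, and an
unchecked allocation of that size would be enough to exhaust the heap. Whether
this is the exact path to the OOME here is an assumption on my part; the sketch
only shows why two readers on one stream are dangerous.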
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)