Jason Huynh created GEODE-1588:
----------------------------------

             Summary: Starting and stopping wan sender can cause OOME 
                 Key: GEODE-1588
                 URL: https://issues.apache.org/jira/browse/GEODE-1588
             Project: Geode
          Issue Type: Bug
          Components: wan
            Reporter: Jason Huynh


The following test will most likely cause an OOME because of a timing issue 
when stopping the gateway sender.
{noformat}
public void closingSenderWhileBatchOperationsAreProcessingShouldNotHaveMultipleThreadsReadFromSameStream()
    throws Exception {

    Integer lnPort = (Integer)vm0.invoke(() ->
      WANTestBase.createFirstLocatorWithDSId( 1 ));
    Integer nyPort = (Integer)vm1.invoke(() ->
      WANTestBase.createFirstRemoteLocator( 2, lnPort ));

    createCacheInVMs(nyPort, vm2);
    createReceiverInVMs(vm2);

    createCacheInVMs(lnPort, vm4);

    //keep the maxQueueMemory low enough to trigger eviction
    vm4.invoke(() -> WANTestBase.createConcurrentSender( "ln", 2,
      false, 100, 101, false, false, null, true, 3, OrderPolicy.KEY ));

    vm2.invoke(() -> WANTestBase.createPartitionedRegion(
      getTestMethodName() + "_RR", null, 0, 10, isOffHeap() ));
    //    vm2.invoke(() -> WANTestBase.createPartitionedRegion(
    //      getTestMethodName() + "_RR", null, 0, 10, isOffHeap() ));

    startSenderInVMs("ln", vm4);
    vm2.invoke(() -> addListenerToSleepAfterCreateEvent(10,
      getTestMethodName() + "_RR"));

    //    vm4.invoke(() -> WANTestBase.createPartitionedRegion(
    //      getTestMethodName() + "_RR", null, 0, 10, isOffHeap() ));
    vm4.invoke(() -> WANTestBase.createReplicatedRegion(
      getTestMethodName() + "_RR", "ln", isOffHeap() ));
    vm4.invoke(() -> addListenerToSleepAfterCreateEvent(1,
      getTestMethodName() + "_RR"));


    vm4.invokeAsync(() -> WANTestBase.doPutsAfter300(
      getTestMethodName() + "_RR", 1000000 ));

    Thread.sleep(5000);
    stopSenderInVMsAsync("ln", vm4);


    Thread.sleep(10000);
    for (int i = 0; i < 100; i++) {
      startSenderInVMs("ln", vm4);
      Thread.sleep(10000);
      stopSenderInVMs("ln", vm4);
      Thread.sleep(5000);
    }
    //
    //    stopSenderInVMsAsync("ln", vm4);
    //    Thread.sleep(1000);

    //    startSenderInVMs("ln", vm4);
    //    Thread.sleep(1000);
    vm2.invoke(() -> WANTestBase.validateRegionSize(
      getTestMethodName() + "_RR", 10000, 240000));
  }
{noformat}

Because of the way this test is written, I wouldn't want it checked in as is: 
it is heavily timing-dependent and possibly flaky.  It will run into the OOME 
eventually, but when that happens depends entirely on timing.

The issue is that the ack reader thread and the thread closing the gateway 
sender both read from the same socket, which corrupts the stream.
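
To illustrate why a second reader on the same stream can end in an OOME, here 
is a minimal sketch. It does not use Geode's actual wire protocol: the 
length-prefixed framing, the class SharedStreamSketch, and the helpers frame, 
nextLength and isPlausible are all hypothetical. The second thread is 
simulated deterministically by skipping bytes, and the link between the 
corrupted stream and the OOME (a garbage length used to size a buffer) is an 
assumption, not something confirmed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class SharedStreamSketch {
    // Hypothetical upper bound on a sane frame size.
    static final int MAX_FRAME_BYTES = 1024;

    // Encode a payload as [4-byte length][payload] (assumed framing).
    static byte[] frame(byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(payload.length);
        out.write(payload);
        return bytes.toByteArray();
    }

    // Read the next length field after another reader has already
    // consumed stolenBytes bytes from the same stream.
    static int nextLength(byte[] wire, int stolenBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        in.skipBytes(stolenBytes); // stands in for the second thread's read
        return in.readInt();
    }

    // Guard a reader could apply before allocating a buffer of this size.
    static boolean isPlausible(int length) {
        return length >= 0 && length <= MAX_FRAME_BYTES;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[8];
        Arrays.fill(payload, (byte) 'z');
        byte[] wire = frame(payload);

        int aligned = nextLength(wire, 0);    // reads the real header
        int misaligned = nextLength(wire, 4); // reads payload bytes as a header

        System.out.println("aligned length:    " + aligned);
        System.out.println("misaligned length: " + misaligned);
        System.out.println("misaligned plausible? " + isPlausible(misaligned));
    }
}
```

In this sketch the misaligned read interprets four payload bytes as a length 
of roughly 2 GB. A reader that allocated a buffer of that size without a 
sanity check would likely exhaust the heap, which is one plausible way the 
corruption described above could surface as an OOME.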



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
