[ 
https://issues.apache.org/jira/browse/GEODE-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768480#comment-15768480
 ] 

Kirk Lund commented on GEODE-2238:
----------------------------------

I closed GEODE-2244 as a duplicate of GEODE-2238. The code below shows where 
cluster config is initialized asynchronously during locator startup. I believe 
incoming requests for cluster config should first check whether cluster config 
is enabled and, if so, wait on a Future until its initialization completes.

The method in InternalLocator is startCache(DistributedSystem):
{code:java}
  private void startCache(DistributedSystem ds) {

    GemFireCacheImpl gfc = GemFireCacheImpl.getInstance();
    if (gfc == null) {
      logger.info("Creating cache for locator.");
      this.myCache = new CacheFactory(ds.getProperties()).create();
      gfc = (GemFireCacheImpl) this.myCache;
    } else {
      logger.info("Using existing cache for locator.");
      ((InternalDistributedSystem) ds).handleResourceEvent(ResourceEvent.LOCATOR_START, this);
    }
    startJmxManagerLocationService(gfc);

    startSharedConfigurationService(gfc);
  }
{code}
The method startSharedConfigurationService hands off to a thread that loads 
cluster config and uses the distributed lock service to become the primary 
cluster config source. This was probably made async to keep startup responsive, 
because it uses the distributed lock service. Unfortunately, it opens a race 
window: if a request comes in before cluster config is ready, the locator 
replies that it doesn't have cluster config. I think some of our Flaky tests 
are hitting this race condition.
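
As a rough sketch of the approach proposed above (not the actual Geode code), 
request handling could wait on a future that the async initialization 
completes. ClusterConfigGate, its loader argument, and the String payload are 
hypothetical names used only for illustration; the real service and response 
types would differ.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical stand-in for the locator's cluster config service; not Geode API. */
final class ClusterConfigGate {
  private final boolean enabled;
  // Completed by the background initialization task.
  private final CompletableFuture<String> initialized;

  ClusterConfigGate(boolean enabled, Supplier<String> loader) {
    this.enabled = enabled;
    // Initialization stays asynchronous so locator startup is not blocked.
    this.initialized = enabled
        ? CompletableFuture.supplyAsync(loader)
        : CompletableFuture.completedFuture(null);
  }

  /**
   * Returns the config, waiting up to the timeout for async initialization.
   * Returns empty only when cluster config is disabled.
   */
  Optional<String> handleRequest(long timeout, TimeUnit unit) {
    if (!enabled) {
      return Optional.empty();
    }
    try {
      // Block the request instead of replying "no cluster config" too early.
      return Optional.ofNullable(initialized.get(timeout, unit));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for cluster config", e);
    } catch (ExecutionException | TimeoutException e) {
      throw new IllegalStateException("Cluster config did not initialize", e);
    }
  }
}
```

With this shape, a request that arrives before initialization finishes blocks 
(up to the timeout) rather than getting a reply that cluster config is missing, 
and locator startup itself still does not wait on the lock service.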


> Member may fail to receive cluster configuration from locator
> -------------------------------------------------------------
>
>                 Key: GEODE-2238
>                 URL: https://issues.apache.org/jira/browse/GEODE-2238
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>    Affects Versions: 1.0.0-incubating
>            Reporter: Kirk Lund
>            Assignee: Dan Smith
>              Labels: Flaky
>
> LuceneClusterConfigurationDUnitTest.indexWithAnalyzerGetsCreatedUsingClusterConfiguration
>  is failing frequently in precheckin. I'm going to mark it as FlakyTest. 
> Below is the stack trace:
> {noformat}
> :geode-lucene:distributedTest
> org.apache.geode.cache.lucene.internal.configuration.LuceneClusterConfigurationDUnitTest > indexWithAnalyzerGetsCreatedUsingClusterConfiguration FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache.lucene.internal.configuration.LuceneClusterConfigurationDUnitTest$$Lambda$29/613305101.run in VM 2 running on Host 3fb23bc375ef with 4 VMs
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:344)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:314)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:259)
>         at org.apache.geode.test.dunit.rules.Member.invoke(Member.java:60)
>         at org.apache.geode.cache.lucene.internal.configuration.LuceneClusterConfigurationDUnitTest.indexWithAnalyzerGetsCreatedUsingClusterConfiguration(LuceneClusterConfigurationDUnitTest.java:102)
>         Caused by:
>         java.lang.AssertionError
>             at org.junit.Assert.fail(Assert.java:86)
>             at org.junit.Assert.assertTrue(Assert.java:41)
>             at org.junit.Assert.assertNotNull(Assert.java:712)
>             at org.junit.Assert.assertNotNull(Assert.java:722)
>             at org.apache.geode.cache.lucene.internal.configuration.LuceneClusterConfigurationDUnitTest.lambda$indexWithAnalyzerGetsCreatedUsingClusterConfiguration$bb17a952$1(LuceneClusterConfigurationDUnitTest.java:105)
> 94 tests completed, 1 failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
