[jira] [Resolved] (GEODE-9729) CI Failure: PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs FAILED

2021-10-25 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-9729.
--
Fix Version/s: 1.15.0
 Assignee: Xiaojian Zhou
   Resolution: Fixed

> CI Failure: 
> PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs FAILED
> -
>
> Key: GEODE-9729
> URL: https://issues.apache.org/jira/browse/GEODE-9729
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Eric Shu
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> org.apache.geode.cache.client.AllConnectionsInUseException
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:304)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:137)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:120)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:805)
>   at 
> org.apache.geode.cache.client.internal.GetClientPRMetaDataOp.execute(GetClientPRMetaDataOp.java:53)
>   at 
> org.apache.geode.cache.client.internal.ClientMetadataService.getClientPRMetadata(ClientMetadataService.java:574)
>   at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:972)
>   at 
> org.awaitility.core.AssertionCondition.lambda$new$0(AssertionCondition.java:53)
>   at 
> org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:234)
>   at 
> org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:221)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.lang.Thread.run(Thread.java:829)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-10-30 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-9747:


Assignee: Xiaojian Zhou

> CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of 
> exception
> ---
>
> Key: GEODE-9747
> URL: https://issues.apache.org/jira/browse/GEODE-9747
> Project: Geode
>  Issue Type: Bug
>  Components: core, tests
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> May be the same issue as GEODE-7030 but it's hard to tell since that other 
> ticket is short on details.
> {noformat}
> PersistentPartitionedRegionDistributedTest > 
> cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest$$Lambda$331/778323733.run
>  in VM 0 running on Host 
> heavy-lifter-2597c5be-686f-56ce-ab29-4c643f8174ba.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown(PersistentPartitionedRegionDistributedTest.java:1129)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> Expecting value to be true but was false
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.lambda$cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown$bb17a952$4(PersistentPartitionedRegionDistributedTest.java:1136)
> {noformat}





[jira] [Commented] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-11-08 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440837#comment-17440837
 ] 

Xiaojian Zhou commented on GEODE-9747:
--

This is caused by my fix for GEODE-9705, which saves the exceptions that were 
originally ignored in cleanupFailedInitialization(). Only the first exception 
is kept; later exceptions are ignored. 

cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown() purposely closes 
the cache, which makes PR creation fail with a CacheClosedException or a disk 
recovery exception. 

Before my code changes in GEODE-9705, all of these exceptions in 
cleanupFailedInitialization() were ignored, so execution went further and 
failed in later steps. 

This time, it failed in cleanupFailedInitialization() itself:


{code:java}
[vm0] [warn 2021/10/16 04:50:40.769 UTC   
tid=0x21] PartitionedRegion#cleanupFailedInitialization(): Failed to clean the 
PartionRegion data store
[vm0] org.apache.geode.distributed.DistributedSystemDisconnectedException: This 
connection to a distributed system has been disconnected.
[vm0]   at 
org.apache.geode.distributed.internal.InternalDistributedSystem.checkConnected(InternalDistributedSystem.java:957)
[vm0]   at 
org.apache.geode.distributed.internal.InternalDistributedSystem.getDistributionManager(InternalDistributedSystem.java:1658)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.getDistributionManager(ReplyProcessor21.java:366)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.postWait(ReplyProcessor21.java:592)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:818)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:773)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:859)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.attemptToSendDestroyRegionMessage(PartitionedRegion.java:7592)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.sendDestroyRegionMessage(PartitionedRegion.java:7553)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.cleanupFailedInitialization(PartitionedRegion.java:5577)
{code}

Depending on timing, the test usually does not fail here and still throws the 
expected CacheClosedException. Occasionally, though, it fails here with an 
exception that the test does not expect. 

I could change the product code to save the last exception instead of the first 
one to fix the bug. But after thinking it over, it is better to fix the test 
code and add this exception to the expected ones, since we know what is going on. 
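
The "keep the first exception" behavior and the test-side fix described above can be sketched as follows. This is only an illustration under assumptions: FirstFailureHolder, record() and isExpected() are hypothetical names, not the GEODE-9705 code, and in the real test the accepted types would be Geode exceptions such as CacheClosedException and DistributedSystemDisconnectedException.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, not Geode code: remembers only the first failure.
final class FirstFailureHolder {
  private final AtomicReference<Throwable> first = new AtomicReference<>();

  // Stores the failure only if nothing has been stored yet; later ones are dropped.
  void record(Throwable failure) {
    first.compareAndSet(null, failure);
  }

  Throwable get() {
    return first.get();
  }

  // Test-side check: true if the failure is one of the types the scenario can produce.
  static boolean isExpected(Throwable failure, Class<?>... expectedTypes) {
    for (Class<?> type : expectedTypes) {
      if (type.isInstance(failure)) {
        return true;
      }
    }
    return false;
  }
}
```

Because the first failure wins, which exception surfaces depends on timing, so a test written this way would list every legitimate type instead of asserting on a single one.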


> CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of 
> exception
> ---
>
> Key: GEODE-9747
> URL: https://issues.apache.org/jira/browse/GEODE-9747
> Project: Geode
>  Issue Type: Bug
>  Components: core, tests
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> May be the same issue as GEODE-7030 but it's hard to tell since that other 
> ticket is short on details.
> {noformat}
> PersistentPartitionedRegionDistributedTest > 
> cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest$$Lambda$331/778323733.run
>  in VM 0 running on Host 
> heavy-lifter-2597c5be-686f-56ce-ab29-4c643f8174ba.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown(PersistentPartitionedRegionDistributedTest.java:1129)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> Expecting value to be true but was false
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.lambda$cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown$bb17a952$4(PersistentPartitionedRegionDistributedTest.java:1136)
> {noformat}




[jira] [Resolved] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-11-09 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-9747.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of 
> exception
> ---
>
> Key: GEODE-9747
> URL: https://issues.apache.org/jira/browse/GEODE-9747
> Project: Geode
>  Issue Type: Bug
>  Components: core, tests
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> May be the same issue as GEODE-7030 but it's hard to tell since that other 
> ticket is short on details.
> {noformat}
> PersistentPartitionedRegionDistributedTest > 
> cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest$$Lambda$331/778323733.run
>  in VM 0 running on Host 
> heavy-lifter-2597c5be-686f-56ce-ab29-4c643f8174ba.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown(PersistentPartitionedRegionDistributedTest.java:1129)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> Expecting value to be true but was false
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.lambda$cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown$bb17a952$4(PersistentPartitionedRegionDistributedTest.java:1136)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9428) CI Failure: NativeRedisAcceptanceTest fails with CLUSTERDOWN error

2021-11-16 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444708#comment-17444708
 ] 

Xiaojian Zhou commented on GEODE-9428:
--

Reproduced in https://hydradb.hdb.gemfire-ci.info/hdb/testresult/12258055


> CI Failure: NativeRedisAcceptanceTest fails with CLUSTERDOWN error
> --
>
> Key: GEODE-9428
> URL: https://issues.apache.org/jira/browse/GEODE-9428
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Reporter: Hale Bales
>Priority: Major
>
> *This ticket tracks failures seen in NativeRedisAcceptanceTests due to 
> non-Geode code. It is closed because no work will be done in the Geode 
> project to fix this issue. If the issue becomes unbearable, a bug should be 
> opened with Redis: 
> [https://github.com/redis/redis/issues|https://github.com/redis/redis/issues]*
> CI run is here: 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk11/builds/82#L60e11384:311]
> {code:java}
> org.apache.geode.redis.internal.executor.string.PSetEXNativeRedisAcceptanceTest
>  > testPSetEX FAILED
> redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The 
> cluster is down
> at redis.clients.jedis.Protocol.processError(Protocol.java:125)
> at redis.clients.jedis.Protocol.process(Protocol.java:169)
> at redis.clients.jedis.Protocol.read(Protocol.java:223)
> at 
> redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
> at 
> redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:270)
> at redis.clients.jedis.Jedis.psetex(Jedis.java:3616)
> at redis.clients.jedis.JedisCluster$30.execute(JedisCluster.java:572)
> at redis.clients.jedis.JedisCluster$30.execute(JedisCluster.java:569)
> at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
> at 
> redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
> at redis.clients.jedis.JedisCluster.psetex(JedisCluster.java:574)
> at 
> org.apache.geode.redis.internal.executor.string.AbstractPSetEXIntegrationTest.testPSetEX(AbstractPSetEXIntegrationTest.java:54)
> at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:566)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at 
> org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:120)
> at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
> at 
> org.gradle.api

[jira] [Created] (GEODE-9810) CI: NativeRedisClusterTest testEachProxyReturnsExposedPorts failed

2021-11-16 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-9810:


 Summary: CI: NativeRedisClusterTest 
testEachProxyReturnsExposedPorts failed
 Key: GEODE-9810
 URL: https://issues.apache.org/jira/browse/GEODE-9810
 Project: Geode
  Issue Type: Bug
  Components: redis
Reporter: Xiaojian Zhou


https://hydradb.hdb.gemfire-ci.info/hdb/testresult/12258442

{code:java}
> Task :geode-for-redis:acceptanceTest

NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED
java.lang.AssertionError: 
Expecting actual:
  [44073, 45679, 36065, 40077, 42137]
to contain exactly in any order:
  [40077, 45679, 33425, 36065, 42137, 44073]
but could not find the following elements:
  [33425]
at 
org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48)

1385 tests completed, 1 failed, 2 skipped

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Test report artifacts from this job are available at:

http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz
{code}






[jira] [Assigned] (GEODE-9838) Log key info for deserilation issue while index update

2021-11-22 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-9838:


Assignee: Xiaojian Zhou

> Log key info for deserilation issue while index update 
> ---
>
> Key: GEODE-9838
> URL: https://issues.apache.org/jira/browse/GEODE-9838
> Project: Geode
>  Issue Type: Improvement
>  Components: querying
>Affects Versions: 1.15.0
>Reporter: Anilkumar Gingade
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI
>
> When there is an issue during index update (maintenance), the index is marked 
> as invalid and a warning is logged: 
> [warn 2021/11/11 07:39:28.215 CST pazrslsrv004  Processor 963> tid=0x124ecf] Updating the Index patientMemberIdentifier 
> failed. The index is corrupted and marked as invalid.
> org.apache.geode.cache.query.internal.index.IMQException
> Adding "key" information to the log helps in diagnosing the failure and in 
> adding or removing the entry in question. 
> Code path IndexManager.java:
> void addIndexMapping(RegionEntry entry, IndexProtocol index) {
>   try {
>     index.addIndexMapping(entry);
>   } catch (Exception exception) {
>     index.markValid(false);
>     setPRIndexAsInvalid((AbstractIndex) index);
>     logger.warn(String.format(
>         "Updating the Index %s failed. The index is corrupted and marked as invalid.",
>         ((AbstractIndex) index).indexName), exception);
>   }
> }
> void removeIndexMapping(RegionEntry entry, IndexProtocol index, int opCode) {
>   try {
>     index.removeIndexMapping(entry, opCode);
>   } catch (Exception exception) {
>     index.markValid(false);
>     setPRIndexAsInvalid((AbstractIndex) index);
>     logger.warn(String.format(
>         "Updating the Index %s failed. The index is corrupted and marked as invalid.",
>         ((AbstractIndex) index).indexName), exception);
>   }
> }
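
A minimal sketch of what the requested log message could look like, assuming the key of the failing entry is available as a plain Object at the point where the warning is written. IndexFailureMessage and format() are hypothetical names used only for illustration; this is not the change that was committed for this ticket.

```java
// Hypothetical helper, not Geode code: builds the warning text with the key included.
final class IndexFailureMessage {
  // Same wording as the existing warning, plus the key of the entry being indexed.
  static String format(String indexName, Object key) {
    return String.format(
        "Updating the Index %s failed for key %s. The index is corrupted and marked as invalid.",
        indexName, key);
  }
}
```

With the key in the message, an operator can find the entry in question and add or remove it without scanning the region.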





[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-22 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447573#comment-17447573
 ] 

Xiaojian Zhou commented on GEODE-8644:
--

All of the reproduced failures are caused by the locator being disconnected:

[locator] [info 2021/11/13 08:43:12.331 UTC   
tid=0x46] Failed to connect to localhost/127.0.0.1:0

[locator] [warn 2021/11/13 08:43:12.331 UTC   
tid=0x46] Locator discovery task for locator 
heavy-lifter-ca6688de-b95d-5db6-9ac5-57db242f6302.c.apachegeode-ci.internal[34223]
 could not exchange locator information with localhost[0] after 45 retry 
attempts. Retrying in 1 ms.


> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Assigned] (GEODE-9850) flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper

2021-11-23 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-9850:


Assignee: Xiaojian Zhou

> flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper
> --
>
> Key: GEODE-9850
> URL: https://issues.apache.org/jira/browse/GEODE-9850
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.13.5
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI
>
> First saw this failure in PR pipeline on support/1.13 here: 
> [https://concourse.apachegeode-ci.info/builds/3912569]
> {code:java}
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest > 
> testGetOldestTombstoneTimeForReplicateTombstoneSweeper FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest$$Lambda$42/2046302475.run
>  in VM 0 running on Host 9a305b2d7db7 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.testGetOldestTombstoneTimeForReplicateTombstoneSweeper(TombstoneDUnitTest.java:228)
> Caused by:
> java.lang.AssertionError: 
> Expecting:
>  <-1637701703343L>
> to be greater than:
>  <0L> 
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.lambda$testGetOldestTombstoneTimeForReplicateTombstoneSweeper$bb17a952$3(TombstoneDUnitTest.java:237)
>  {code}
> I believe the fix is to wrap this assertion in an awaitility call.





[jira] [Updated] (GEODE-9850) flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper

2021-11-23 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou updated GEODE-9850:
-
Labels: GeodeOperationAPI  (was: )

> flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper
> --
>
> Key: GEODE-9850
> URL: https://issues.apache.org/jira/browse/GEODE-9850
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.13.5
>Reporter: Bill Burcham
>Priority: Major
>  Labels: GeodeOperationAPI
>
> First saw this failure in PR pipeline on support/1.13 here: 
> [https://concourse.apachegeode-ci.info/builds/3912569]
> {code:java}
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest > 
> testGetOldestTombstoneTimeForReplicateTombstoneSweeper FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest$$Lambda$42/2046302475.run
>  in VM 0 running on Host 9a305b2d7db7 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.testGetOldestTombstoneTimeForReplicateTombstoneSweeper(TombstoneDUnitTest.java:228)
> Caused by:
> java.lang.AssertionError: 
> Expecting:
>  <-1637701703343L>
> to be greater than:
>  <0L> 
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.lambda$testGetOldestTombstoneTimeForReplicateTombstoneSweeper$bb17a952$3(TombstoneDUnitTest.java:237)
>  {code}
> I believe the fix is to wrap this assertion in an awaitility call.





[jira] [Resolved] (GEODE-9838) Log key info for deserilation issue while index update

2021-11-23 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-9838.
--
Fix Version/s: 1.12.6
   1.13.5
   1.14.1
   1.15.0
   Resolution: Fixed

> Log key info for deserilation issue while index update 
> ---
>
> Key: GEODE-9838
> URL: https://issues.apache.org/jira/browse/GEODE-9838
> Project: Geode
>  Issue Type: Improvement
>  Components: querying
>Affects Versions: 1.15.0
>Reporter: Anilkumar Gingade
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.12.6, 1.13.5, 1.14.1, 1.15.0
>
>
> When there is an issue during index update (maintenance), the index is marked 
> as invalid and a warning is logged: 
> [warn 2021/11/11 07:39:28.215 CST pazrslsrv004  Processor 963> tid=0x124ecf] Updating the Index patientMemberIdentifier 
> failed. The index is corrupted and marked as invalid.
> org.apache.geode.cache.query.internal.index.IMQException
> Adding "key" information to the log helps in diagnosing the failure and in 
> adding or removing the entry in question. 
> Code path IndexManager.java:
> void addIndexMapping(RegionEntry entry, IndexProtocol index) {
>   try {
>     index.addIndexMapping(entry);
>   } catch (Exception exception) {
>     index.markValid(false);
>     setPRIndexAsInvalid((AbstractIndex) index);
>     logger.warn(String.format(
>         "Updating the Index %s failed. The index is corrupted and marked as invalid.",
>         ((AbstractIndex) index).indexName), exception);
>   }
> }
> void removeIndexMapping(RegionEntry entry, IndexProtocol index, int opCode) {
>   try {
>     index.removeIndexMapping(entry, opCode);
>   } catch (Exception exception) {
>     index.markValid(false);
>     setPRIndexAsInvalid((AbstractIndex) index);
>     logger.warn(String.format(
>         "Updating the Index %s failed. The index is corrupted and marked as invalid.",
>         ((AbstractIndex) index).indexName), exception);
>   }
> }





[jira] [Commented] (GEODE-9850) flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper

2021-11-23 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448310#comment-17448310
 ] 

Xiaojian Zhou commented on GEODE-9850:
--

It's flaky because the destroy(key) might not have created the tombstone yet, 
which causes getOldestTombstoneTime() to return 0. 
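
The idea of waiting for the tombstone instead of asserting immediately can be sketched in plain Java. This is a stand-in for an Awaitility call, not the actual test change: PollUntilPositive and await() are hypothetical names, and the supplier is assumed to wrap something like getOldestTombstoneTime().

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for an Awaitility-style wait; names are illustrative only.
final class PollUntilPositive {
  // Polls the supplier until it returns a value > 0 or the timeout elapses.
  // Returns the last value read, so the caller can assert on it afterwards.
  static long await(LongSupplier supplier, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    long value = supplier.getAsLong();
    while (value <= 0 && System.nanoTime() - deadline < 0) {
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
        break;
      }
      value = supplier.getAsLong();
    }
    return value;
  }
}
```

The test would then assert that the returned value is greater than 0, which only fails if the tombstone never shows up within the timeout.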

> flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper
> --
>
> Key: GEODE-9850
> URL: https://issues.apache.org/jira/browse/GEODE-9850
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.13.5
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI
>
> First saw this failure in PR pipeline on support/1.13 here: 
> [https://concourse.apachegeode-ci.info/builds/3912569]
> {code:java}
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest > 
> testGetOldestTombstoneTimeForReplicateTombstoneSweeper FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest$$Lambda$42/2046302475.run
>  in VM 0 running on Host 9a305b2d7db7 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.testGetOldestTombstoneTimeForReplicateTombstoneSweeper(TombstoneDUnitTest.java:228)
> Caused by:
> java.lang.AssertionError: 
> Expecting:
>  <-1637701703343L>
> to be greater than:
>  <0L> 
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.lambda$testGetOldestTombstoneTimeForReplicateTombstoneSweeper$bb17a952$3(TombstoneDUnitTest.java:237)
>  {code}
> I believe the fix is to wrap this assertion in an awaitility call.





[jira] [Resolved] (GEODE-9850) flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper

2021-11-24 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-9850.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> flaky test: testGetOldestTombstoneTimeForReplicateTombstoneSweeper
> --
>
> Key: GEODE-9850
> URL: https://issues.apache.org/jira/browse/GEODE-9850
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.13.5
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> First saw this failure in PR pipeline on support/1.13 here: 
> [https://concourse.apachegeode-ci.info/builds/3912569]
> {code:java}
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest > 
> testGetOldestTombstoneTimeForReplicateTombstoneSweeper FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest$$Lambda$42/2046302475.run
>  in VM 0 running on Host 9a305b2d7db7 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.testGetOldestTombstoneTimeForReplicateTombstoneSweeper(TombstoneDUnitTest.java:228)
> Caused by:
> java.lang.AssertionError: 
> Expecting:
>  <-1637701703343L>
> to be greater than:
>  <0L> 
> at 
> org.apache.geode.internal.cache.versions.TombstoneDUnitTest.lambda$testGetOldestTombstoneTimeForReplicateTombstoneSweeper$bb17a952$3(TombstoneDUnitTest.java:237)
>  {code}
> I believe the fix is to wrap this assertion in an awaitility call.





[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-12-07 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454924#comment-17454924
 ] 

Xiaojian Zhou commented on GEODE-8644:
--

What caught my attention:

1) All failed tests contain these messages:

{code:java}
[vm5] [info 2021/12/04 03:37:28.229 UTC   tid=0x69] Socket receive buffer size is 212992 instead of the requested 524288.
[vm5] [info 2021/12/04 03:37:28.230 UTC   tid=0x69] Socket send buffer size is 212992 instead of the requested 524288.
[vm4] [info 2021/12/04 03:37:32.891 UTC   tid=0x6c] Socket receive buffer size is 212992 instead of the requested 524288.
[vm4] [info 2021/12/04 03:37:32.891 UTC   tid=0x6c] Socket send buffer size is 212992 instead of the requested 524288.
{code}

The passing runs contain no such messages.

2) There are many more locator error messages:

{code:java}
[locator] [info 2021/12/04 03:37:42.421 UTC   tid=0x46] Failed to connect to localhost/127.0.0.1:0
{code}

Passing runs contain fewer than about 5 of these messages, but failed runs 
contain 15-20 of them.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.
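
A minimal sketch of the change proposed in the description: instead of
sleeping a fixed 2 seconds, poll until the queue is empty and fail only after
a long timeout. This is not the test's real code; DrainWaitSketch,
waitUntilEmpty and startSlowDrainer are hypothetical names, and a plain
ConcurrentLinkedQueue stands in for the gateway sender queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

final class DrainWaitSketch {

    /** Polls until the queue is empty; returns false if the timeout elapses first. */
    static boolean waitUntilEmpty(Queue<?> queue, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!queue.isEmpty()) {
            if (System.nanoTime() >= deadline) {
                return false; // the queue never fully drained
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Starts a daemon thread that removes one element, then pauses, until the queue is empty. */
    static Thread startSlowDrainer(Queue<?> queue, long delayMillis) {
        Thread drainer = new Thread(() -> {
            try {
                while (queue.poll() != null) {
                    Thread.sleep(delayMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        drainer.setDaemon(true);
        drainer.start();
        return drainer;
    }

    public static void main(String[] args) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 20; i++) {
            queue.add(i);
        }
        startSlowDrainer(queue, 5);
        // The long timeout only matters when the queue never drains.
        boolean drained = waitUntilEmpty(queue, 5, TimeUnit.MINUTES);
        System.out.println("drained = " + drained);
    }
}
```

A slow but healthy run returns as soon as the queue is empty, so the long
timeout costs nothing in the passing case.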





[jira] [Comment Edited] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-12-07 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454924#comment-17454924
 ] 

Xiaojian Zhou edited comment on GEODE-8644 at 12/8/21, 1:44 AM:


What caught my attention:

1) All failed test runs contain these messages:

{code:java}
[vm5] [info 2021/12/04 03:37:28.229 UTC   tid=0x69] Socket receive buffer size is 212992 instead of the requested 524288.
[vm5] [info 2021/12/04 03:37:28.230 UTC   tid=0x69] Socket send buffer size is 212992 instead of the requested 524288.
[vm4] [info 2021/12/04 03:37:32.891 UTC   tid=0x6c] Socket receive buffer size is 212992 instead of the requested 524288.
[vm4] [info 2021/12/04 03:37:32.891 UTC   tid=0x6c] Socket send buffer size is 212992 instead of the requested 524288.
{code}

The passing runs contain no such messages.

BTW, this could happen after Bill's fix in GEODE-9825.

2) There are many more locator error messages:

{code:java}
[locator] [info 2021/12/04 03:37:42.421 UTC   tid=0x46] Failed to connect to localhost/127.0.0.1:0
{code}

Passing runs contain fewer than about 5 of these messages, but failed runs 
contain 15-20 of them.



> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-12-09 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456592#comment-17456592
 ] 

Xiaojian Zhou commented on GEODE-8644:
--

The "Failed to connect to localhost/127.0.0.1:0" error message was introduced 
in GEODE-7751, but the introduction of this message is not itself the root 
cause.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Assigned] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-12-15 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-8644:


Assignee: Xiaojian Zhou  (was: Mark Hanson)

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-12-16 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460939#comment-17460939
 ] 

Xiaojian Zhou commented on GEODE-8644:
--

The root cause is: 

When a CME happens, notifyTimestampsToGateways() is called in 
AbstractRegionMap, and a gateway event with the UPDATE_VERSION operation is 
enqueued. 

On the server that holds the secondary queue, this event is ignored and 
handleSecondaryEvent() is not called. But on the primary queue holder, the 
event is still queued and adds an unprocessedToken. Since no corresponding 
event will arrive at the secondary queue to trigger removal of the token, the 
token is leaked every time this scenario happens. 

This is very old code and behavior, dating back to 8.2. We did not find the 
problem earlier for two reasons: 1) The race rarely happens. 2) We had no test 
that deliberately checked unprocessedToken draining until GEODE-7643 
introduced one. 

There are several ways to fix it. 
One alternative is not to enqueue this kind of event into the primary queue, 
as we already do for the secondary queue. But that alternative changes the 
current logic and assumptions, so it is risky. 

So I chose only to skip adding this kind of event to unprocessedTokens. This 
fix is very conservative. 
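
The leak and the conservative fix can be illustrated with a toy model. This is
not Geode's code: TokenLeakSketch, TokenTracker, onPrimaryEnqueue and
onSecondaryDelivery are hypothetical names, and the model folds both queue
holders into one object purely to show why a token that is added but never
matched stays in the map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TokenLeakSketch {

    enum Op { CREATE, UPDATE, UPDATE_VERSION }

    static final class Event {
        final long id;
        final Op op;

        Event(long id, Op op) {
            this.id = id;
            this.op = op;
        }
    }

    static final class TokenTracker {
        private final Map<Long, Op> unprocessedTokens = new ConcurrentHashMap<>();
        private final boolean skipVersionUpdates;

        TokenTracker(boolean skipVersionUpdates) {
            this.skipVersionUpdates = skipVersionUpdates;
        }

        /** Primary side: the event is queued and, unless skipped, leaves a token behind. */
        void onPrimaryEnqueue(Event e) {
            if (skipVersionUpdates && e.op == Op.UPDATE_VERSION) {
                return; // the conservative fix: no token for this kind of event
            }
            unprocessedTokens.put(e.id, e.op);
        }

        /** Secondary side: UPDATE_VERSION events are ignored, so they never clear a token. */
        void onSecondaryDelivery(Event e) {
            if (e.op == Op.UPDATE_VERSION) {
                return;
            }
            unprocessedTokens.remove(e.id);
        }

        int tokenCount() {
            return unprocessedTokens.size();
        }
    }

    /** Runs three events through the model and reports how many tokens remain. */
    static int leakedTokens(boolean skipVersionUpdates) {
        TokenTracker tracker = new TokenTracker(skipVersionUpdates);
        Event[] events = {
            new Event(1, Op.CREATE), new Event(2, Op.UPDATE_VERSION), new Event(3, Op.UPDATE)
        };
        for (Event e : events) {
            tracker.onPrimaryEnqueue(e);
            tracker.onSecondaryDelivery(e);
        }
        return tracker.tokenCount();
    }

    public static void main(String[] args) {
        System.out.println("tokens left without the fix: " + leakedTokens(false));
        System.out.println("tokens left with the fix:    " + leakedTokens(true));
    }
}
```

Queueing is untouched in the model; only the token bookkeeping changes, which
is what keeps the fix narrow.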


> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Reopened] (GEODE-9060) checkMyStateOnMemebers is better not to change the replicates

2022-01-03 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reopened GEODE-9060:
--

Need to backport it to 1.12, 1.13, 1.14 according to the requirements. 

> checkMyStateOnMemebers is better not to change the replicates
> -
>
> Key: GEODE-9060
> URL: https://issues.apache.org/jira/browse/GEODE-9060
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> This is an enhancement to the previous GEODE-9003. 
> The previous fix removes members from replicates if they cannot recognize the 
> current member, so the removed members will not act as GII provider 
> candidates. The fix is correct. 
> However, Dan suggested keeping the replicates as is, to be more conservative. 
> These members will still act as GII provider candidates even if they cannot 
> recognize the current member due to the split-brain. Conceptually these members 
> should be the same as other online members. 





[jira] [Resolved] (GEODE-9060) checkMyStateOnMemebers is better not to change the replicates

2022-01-03 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-9060.
--
Fix Version/s: 1.12.8
   1.13.7
   1.14.3
   Resolution: Fixed

> checkMyStateOnMemebers is better not to change the replicates
> -
>
> Key: GEODE-9060
> URL: https://issues.apache.org/jira/browse/GEODE-9060
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.12.8, 1.13.7, 1.14.3, 1.15.0
>
>
> This is an enhancement to the previous GEODE-9003. 
> The previous fix removes members from replicates if they cannot recognize the 
> current member, so the removed members will not act as GII provider 
> candidates. The fix is correct. 
> However, Dan suggested keeping the replicates as is, to be more conservative. 
> These members will still act as GII provider candidates even if they cannot 
> recognize the current member due to the split-brain. Conceptually these members 
> should be the same as other online members. 





[jira] [Resolved] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2022-01-04 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-8644.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Created] (GEODE-9989) add a few info level logs in PersistenceAdvisorImpl to identify splitbrain issue

2022-01-25 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-9989:


 Summary: add a few info level logs in PersistenceAdvisorImpl to 
identify splitbrain issue
 Key: GEODE-9989
 URL: https://issues.apache.org/jira/browse/GEODE-9989
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


In a scenario like:


{code:java}
03:33:03.644 dataStoregemfire4_4494 recovered from disk
03:33:03.732 dataStoregemfire4_4494 closing
03:33:03.735 dataStoregemfire4_4494 Initialization of region replicate_5 
completed, send newId(let’s name it 4494) to gemfire2
03:33:03.754 dataStoregemfire2_4493 recovered from disk
03:33:03.770 dataStoregemfire2_4493 closing
03:33:03.792 dataStoregemfire2_4493 Initialization of region replicate_5 
completed, send newId (let’s name it 4493) to gemfire4, but gemfire4 is offline. 
So gemfire4 does not know gemfire2’s newId 4493.


03:34:11.247 gemfire4_9779 restarted, it does not know 4493
03:34:11.269 gemfire2_9856 restarted, it sends oldId=4493, newId=9856 to 
gemfire4, but gemfire4 does not know either of gemfire2’s oldId and newId

When gemfire2_9856 asked gemfire4_9779 for its state, gemfire4_9779 replied "I 
don't know you", and gemfire2_9856's startup then failed with 
ConflictingPersistentDataException.
{code}

We need more logging to identify the issue. 
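
As a rough illustration of the kind of info-level logging meant here, the toy
model below replays the ID exchange from the timeline. It is not
PersistenceAdvisorImpl code: PeerIdSketch, Member, receiveNewId and recognizes
are hypothetical names, and java.util.logging stands in for Geode's logger.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

final class PeerIdSketch {

    private static final Logger LOG = Logger.getLogger(PeerIdSketch.class.getName());

    static final class Member {
        final String name;
        boolean online = true;
        final Set<Integer> knownPeerIds = new HashSet<>();

        Member(String name) {
            this.name = name;
        }

        /** Records a peer's new ID, but only while this member is online. */
        void receiveNewId(int newId) {
            if (!online) {
                LOG.info(name + " is offline and misses peer id " + newId);
                return;
            }
            knownPeerIds.add(newId);
            LOG.info(name + " learned peer id " + newId);
        }

        /** Answers a state check from a peer that presents its old and new IDs. */
        boolean recognizes(int oldId, int newId) {
            boolean known = knownPeerIds.contains(oldId) || knownPeerIds.contains(newId);
            LOG.info(name + " asked about oldId=" + oldId + ", newId=" + newId
                + ", known ids=" + knownPeerIds + ", recognized=" + known);
            return known;
        }
    }

    /** Replays the timeline: gemfire4 is offline when gemfire2 announces id 4493. */
    static boolean replayScenario() {
        Member gemfire4 = new Member("gemfire4");
        gemfire4.online = false;      // gemfire4 closed before gemfire2 finished initializing
        gemfire4.receiveNewId(4493);  // the announcement is lost
        gemfire4.online = true;       // gemfire4 restarts
        return gemfire4.recognizes(4493, 9856);
    }

    public static void main(String[] args) {
        System.out.println("gemfire4 recognizes gemfire2: " + replayScenario());
    }
}
```

With the IDs written to the log at each step, the missed announcement is
visible directly instead of having to be inferred from the final exception.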





[jira] [Reopened] (GEODE-9998) Update jedis library to the current latest (>= 4.1.0)

2022-02-23 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reopened GEODE-9998:
--

It caused NoClassDefFoundError in some tests. 

> Update jedis library to the current latest (>= 4.1.0)
> -
>
> Key: GEODE-9998
> URL: https://issues.apache.org/jira/browse/GEODE-9998
> Project: Geode
>  Issue Type: Test
>  Components: redis
>Affects Versions: 1.16.0
>Reporter: Jens Deppe
>Assignee: Eric Zoerner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The 4.x version has been out for a while and we're still on 3.6.x (3.8 is the 
> last in the 3.x line).
> This is not a trivial change as various APIs have changed which will affect a 
> lot of test code.





[jira] [Commented] (GEODE-974) CI Failure: PersistentPartitionedRegionDUnitTest.testRevokeBeforeStartup failed with AssertionError

2022-03-08 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503084#comment-17503084
 ] 

Xiaojian Zhou commented on GEODE-974:
-

I found it reproduced in 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/distributed-test-openjdk8/builds/194


> CI Failure: PersistentPartitionedRegionDUnitTest.testRevokeBeforeStartup 
> failed with AssertionError
> ---
>
> Key: GEODE-974
> URL: https://issues.apache.org/jira/browse/GEODE-974
> Project: Geode
>  Issue Type: Bug
>  Components: persistence
>Reporter: Barrett Oglesby
>Assignee: Kirk Lund
>Priority: Major
>  Labels: CI, Flaky, pull-request-available
> Fix For: 1.7.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Geode_develop_DistributedTests
> Private Build #1602 (Feb 13, 2016 9:09:53 AM)
> Revision: 781277f31f37388f7247cbdf05025c12de825d2a
> Error Message
> {noformat}
> java.lang.AssertionError: Suspicious strings were written to the log during this run.
> Fix the strings or use DistributedTestCase.addExpectedException to ignore.
> ---
> Found suspect string in log4j at line 2919
> [fatal 2016/02/13 11:30:42.638 PST  tid=0x580] Uncaught exception processing  Alert "Error processing request class com.gemstone.gemfire.internal.admin.remote.PrepareRevokePersistentIDRequest." level ERROR
> java.lang.NullPointerException
>   at com.gemstone.gemfire.internal.admin.remote.RemoteGfManagerAgent.getApplicationById(RemoteGfManagerAgent.java:606)
>   at com.gemstone.gemfire.internal.admin.remote.RemoteGfManagerAgent.getMemberById(RemoteGfManagerAgent.java:592)
>   at com.gemstone.gemfire.internal.admin.remote.AlertListenerMessage.process(AlertListenerMessage.java:83)
>   at com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:380)
>   at com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:451)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:656)
>   at com.gemstone.gemfire.distributed.internal.DistributionManager$4$1.run(DistributionManager.java:930)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Stacktrace
> {noformat}
> com.gemstone.gemfire.test.dunit.RMIException: While invoking com.gemstone.gemfire.management.internal.cli.commands.CliCommandTestBase$1.call in VM 0 running on Host cc4-rh6.gemstone.com with 4 VMs
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:372)
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:315)
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:281)
>   at com.gemstone.gemfire.management.internal.cli.commands.CliCommandTestBase.createDefaultSetup(CliCommandTestBase.java:105)
>   at com.gemstone.gemfire.management.internal.cli.commands.CreateAlterDestroyRegionCommandsDUnitTest.testCreateRegion46391(CreateAlterDestroyRegionCommandsDUnitTest.java:290)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at junit.framework.TestResult.runProtected(TestResult.java:142)
>   at junit.framework.TestResult.run(TestResult.java:125)
>   at junit.framework.TestCase.run(TestCase.java:129)
>   at junit.framework.TestSuite.runTest(TestSuite.java:252)
>   at junit.framework.TestSuite.run(TestSuite.java:247)
>   at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:86)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestCla

[jira] [Created] (GEODE-10113) CI: AutoConnectionSourceImplTest.queryLocatorsTriesNextLocatorOnSSLExceptions() FAILED

2022-03-08 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10113:
-

 Summary: CI: 
AutoConnectionSourceImplTest.queryLocatorsTriesNextLocatorOnSSLExceptions() 
FAILED
 Key: GEODE-10113
 URL: https://issues.apache.org/jira/browse/GEODE-10113
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


> Task :geode-core:test

AutoConnectionSourceImplTest > queryLocatorsTriesNextLocatorOnSSLExceptions() FAILED
java.lang.AssertionError: 
Expecting actual:
  Mock for ServerLocationResponse, hashCode: 186159903
and actual:
  null
to refer to the same object
at org.apache.geode.cache.client.internal.AutoConnectionSourceImplTest.queryLocatorsTriesNextLocatorOnSSLExceptions(AutoConnectionSourceImplTest.java:91)

7320 tests completed, 1 failed, 11 skipped

It's found in 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/windows-unit-test-openjdk11/builds/187

It's said to be similar to GEODE-10066





[jira] [Created] (GEODE-10118) PartitionedRegionClearWithConcurrentOperationsDUnitTest will hang in clearShouldFailWhenCoordinatorMemberIsBounced

2022-03-10 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10118:
-

 Summary: PartitionedRegionClearWithConcurrentOperationsDUnitTest 
will hang in clearShouldFailWhenCoordinatorMemberIsBounced 
 Key: GEODE-10118
 URL: https://issues.apache.org/jira/browse/GEODE-10118
 Project: Geode
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Xiaojian Zhou


In the PR Clear feature branch, after a rebase merged in GEODE-9522, 
clusterDistributionManager's setRootCause runs before TCPConduit.stop. This can 
cause the server restart to hang after a forced disconnect. 





[jira] [Updated] (GEODE-10118) PartitionedRegionClearWithConcurrentOperationsDUnitTest will hang in clearShouldFailWhenCoordinatorMemberIsBounced

2022-03-10 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou updated GEODE-10118:
--
Parent: GEODE-7665
Issue Type: Sub-task  (was: Bug)

> PartitionedRegionClearWithConcurrentOperationsDUnitTest will hang in 
> clearShouldFailWhenCoordinatorMemberIsBounced 
> ---
>
> Key: GEODE-10118
> URL: https://issues.apache.org/jira/browse/GEODE-10118
> Project: Geode
>  Issue Type: Sub-task
>Affects Versions: 1.16.0
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> In the PR Clear feature branch, after a rebase merged in GEODE-9522, 
> clusterDistributionManager's setRootCause runs before TCPConduit.stop. This 
> can cause the server restart to hang after a forced disconnect. 





[jira] [Commented] (GEODE-10118) PartitionedRegionClearWithConcurrentOperationsDUnitTest will hang in clearShouldFailWhenCoordinatorMemberIsBounced

2022-03-10 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504724#comment-17504724
 ] 

Xiaojian Zhou commented on GEODE-10118:
---

This applies to the PR Clear feature branch only, not to any release yet. 

> PartitionedRegionClearWithConcurrentOperationsDUnitTest will hang in 
> clearShouldFailWhenCoordinatorMemberIsBounced 
> ---
>
> Key: GEODE-10118
> URL: https://issues.apache.org/jira/browse/GEODE-10118
> Project: Geode
>  Issue Type: Sub-task
>Affects Versions: 1.16.0
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> In the PR Clear feature branch, after a rebase merged in GEODE-9522, 
> clusterDistributionManager's setRootCause runs before TCPConduit.stop. This 
> can cause the server restart to hang after a forced disconnect. 





[jira] [Created] (GEODE-10229) Saving received RVV's caused exception should be filled.

2022-04-10 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10229:
-

 Summary: Saving received RVV's caused exception should be filled.
 Key: GEODE-10229
 URL: https://issues.apache.org/jira/browse/GEODE-10229
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


The exception caused by saving the received RVV should be filled while 
processing the GII image.

There's a race in GII: 
A distributed operation (RemoveAll makes it more obvious) arrives while the 
member is requesting GII and before it saves the received RVV. 
saveReceivedRVV() then causes the newly arrived operation to become an 
exception. 

In the normal case, the exception is filled in processChunk(). But in the 
above case, processChunk() skips the entry because the local entry is not 
recovered from disk. 

Thus the exception remains after GII. 

To fix it without slowing down performance, we need to check whether such an 
exception exists and call recordVersion() for this entry. 
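
The idea of the fix can be shown on a deliberately simplified version vector.
This is not Geode's RegionVersionVector: RvvExceptionSketch, ToyVersionVector
and processEntry are hypothetical names, the vector tracks one member only,
and an "exception" here simply means a version missing below the highest one
recorded.

```java
import java.util.TreeSet;

final class RvvExceptionSketch {

    /** Toy version vector for one member: the highest version seen plus the gaps below it. */
    static final class ToyVersionVector {
        private long highest;
        private final TreeSet<Long> exceptions = new TreeSet<>();

        /** Records a version; versions skipped on the way up become exceptions (gaps). */
        void recordVersion(long version) {
            if (version > highest) {
                for (long missing = highest + 1; missing < version; missing++) {
                    exceptions.add(missing);
                }
                highest = version;
            } else {
                exceptions.remove(version); // fills the gap if there was one
            }
        }

        boolean hasException(long version) {
            return exceptions.contains(version);
        }

        int exceptionCount() {
            return exceptions.size();
        }
    }

    /** Handles one image entry; with the fix, a skipped entry still fills its exception. */
    static void processEntry(ToyVersionVector rvv, long version, boolean skipEntry,
                             boolean applyFix) {
        if (!skipEntry) {
            rvv.recordVersion(version); // normal path: applying the entry records its version
        } else if (applyFix && rvv.hasException(version)) {
            rvv.recordVersion(version); // cheap lookup first, record only when needed
        }
    }

    /** Builds a vector with a gap at version 4, then processes a skipped entry for it. */
    static int exceptionsAfterImage(boolean applyFix) {
        ToyVersionVector rvv = new ToyVersionVector();
        rvv.recordVersion(1);
        rvv.recordVersion(2);
        rvv.recordVersion(3);
        rvv.recordVersion(5); // version 4 is now an exception
        processEntry(rvv, 4, true, applyFix);
        return rvv.exceptionCount();
    }

    public static void main(String[] args) {
        System.out.println("exceptions left without the fix: " + exceptionsAfterImage(false));
        System.out.println("exceptions left with the fix:    " + exceptionsAfterImage(true));
    }
}
```

The extra work on the skip path is a single set lookup, which matches the goal
of not slowing down image processing.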





[jira] [Assigned] (GEODE-10229) Saving received RVV's caused exception should be filled.

2022-04-10 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-10229:
-

Assignee: Xiaojian Zhou

> Saving received RVV's caused exception should be filled.
> 
>
> Key: GEODE-10229
> URL: https://issues.apache.org/jira/browse/GEODE-10229
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> The exception caused by saving the received RVV should be filled while 
> processing the GII image.
> There's a race in GII: 
> A distributed operation (RemoveAll makes it more obvious) arrives while the 
> member is requesting GII and before it saves the received RVV. 
> saveReceivedRVV() then causes the newly arrived operation to become an 
> exception. 
> In the normal case, the exception is filled in processChunk(). But in the 
> above case, processChunk() skips the entry because the local entry is not 
> recovered from disk. 
> Thus the exception remains after GII. 
> To fix it without slowing down performance, we need to check whether such an 
> exception exists and call recordVersion() for this entry. 





[jira] [Created] (GEODE-10247) CI: spring-compatibility-test compile failure

2022-04-19 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10247:
-

 Summary: CI: spring-compatibility-test compile failure
 Key: GEODE-10247
 URL: https://issues.apache.org/jira/browse/GEODE-10247
 Project: Geode
  Issue Type: Bug
  Components: redis
Reporter: Xiaojian Zhou


It's found in 
https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14671910
https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14643293
https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14643323

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (java-compile) on project spring-data-geode: Compilation failure: Compilation failure: 
[ERROR] /home/geode/spring-data-geode/spring-data-geode/src/main/java/org/springframework/data/gemfire/GemFireProperties.java:[103,43] error: cannot find symbol
[ERROR]   symbol:   variable GEODE_FOR_REDIS_BIND_ADDRESS
[ERROR]   location: interface ConfigurationProperties
[ERROR] /home/geode/spring-data-geode/spring-data-geode/src/main/java/org/springframework/data/gemfire/GemFireProperties.java:[104,38] error: cannot find symbol
[ERROR]   symbol:   variable GEODE_FOR_REDIS_ENABLED
[ERROR]   location: interface ConfigurationProperties
[ERROR] /home/geode/spring-data-geode/spring-data-geode/src/main/java/org/springframework/data/gemfire/GemFireProperties.java:[105,35] error: cannot find symbol
[ERROR]   symbol:   variable GEODE_FOR_REDIS_PORT
[ERROR]   location: interface ConfigurationProperties
[ERROR] /home/geode/spring-data-geode/spring-data-geode/src/main/java/org/springframework/data/gemfire/GemFireProperties.java:[106,47] error: cannot find symbol
[ERROR]   symbol:   variable GEODE_FOR_REDIS_REDUNDANT_COPIES
[ERROR]   location: interface ConfigurationProperties
[ERROR] /home/geode/spring-data-geode/spring-data-geode/src/main/java/org/springframework/data/gemfire/GemFireProperties.java:[107,39] error: cannot find symbol
[ERROR]   symbol:   variable GEODE_FOR_REDIS_USERNAME
[ERROR]   location: interface ConfigurationProperties



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string

2022-04-19 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10248:
-

 Summary: CI: DeployToMultiGroupDUnitTest encountered suspect string
 Key: GEODE-10248
 URL: https://issues.apache.org/jira/browse/GEODE-10248
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


Found in https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14643293

> Task :geode-assembly:distributedTest

DeployToMultiGroupDUnitTest > executionError FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm0.log' at line 571


http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build./test-results/distributedTest/1650107916/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Test report artifacts from this job are available at:

http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build./test-artifacts/1650107916/distributedtestfiles-openjdk8-1.15.0-build..tgz





[jira] [Resolved] (GEODE-10229) Saving received RVV's caused exception should be filled.

2022-04-20 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou resolved GEODE-10229.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> Saving received RVV's caused exception should be filled.
> 
>
> Key: GEODE-10229
> URL: https://issues.apache.org/jira/browse/GEODE-10229
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> The exception caused by saving the received RVV should be filled while 
> processing the GII image.
> There's a race in GII: 
> A distributed operation (RemoveAll makes it more obvious) arrives while the 
> member is requesting GII and before it saves the received RVV. 
> saveReceivedRVV() then causes the newly arrived operation to become an 
> exception. 
> In the normal case, the exception is filled in processChunk(). But in the 
> above case, processChunk() skips the entry because the local entry is not 
> recovered from disk. 
> Thus the exception remains after GII. 
> To fix it without slowing down performance, we need to check whether such an 
> exception exists and call recordVersion() for this entry. 





[jira] [Commented] (GEODE-10229) Saving received RVV's caused exception should be filled.

2022-04-22 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526242#comment-17526242
 ] 

Xiaojian Zhou commented on GEODE-10229:
---

Backported to 1.14, 1.13, and 1.12. 

> Saving received RVV's caused exception should be filled.
> 
>
> Key: GEODE-10229
> URL: https://issues.apache.org/jira/browse/GEODE-10229
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage, pull-request-available
> Fix For: 1.12.10, 1.13.9, 1.14.5, 1.15.0
>
>
> The exception created by saving the received RVV should be filled while 
> processing the GII image.
> There is a race in GII: 
> A distributed operation (RemoveAll makes it most obvious) arrives while 
> the member is requesting GII but before it saves the received RVV. 
> saveReceivedRVV() then causes the newly arrived operation to become an exception. 
> Normally the exception is filled in processChunk(). In the case above, 
> however, processChunk() skips the entry because the local entry is not 
> recovered from disk. 
> The exception therefore remains after GII. 
> To fix this without slowing things down, we need to check whether such an 
> exception exists and, if so, call recordVersion() for the entry. 





[jira] [Updated] (GEODE-10229) Saving received RVV's caused exception should be filled.

2022-04-22 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou updated GEODE-10229:
--
Fix Version/s: 1.12.10
   1.13.9
   1.14.5

> Saving received RVV's caused exception should be filled.
> 
>
> Key: GEODE-10229
> URL: https://issues.apache.org/jira/browse/GEODE-10229
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage, pull-request-available
> Fix For: 1.12.10, 1.13.9, 1.14.5, 1.15.0
>
>
> The exception created by saving the received RVV should be filled while 
> processing the GII image.
> There is a race in GII: 
> A distributed operation (RemoveAll makes it most obvious) arrives while 
> the member is requesting GII but before it saves the received RVV. 
> saveReceivedRVV() then causes the newly arrived operation to become an exception. 
> Normally the exception is filled in processChunk(). In the case above, 
> however, processChunk() skips the entry because the local entry is not 
> recovered from disk. 
> The exception therefore remains after GII. 
> To fix this without slowing things down, we need to check whether such an 
> exception exists and, if so, call recordVersion() for the entry. 





[jira] [Reopened] (GEODE-7138) CI failure: ClientServerTransactionFailoverWithMixedVersionServersDistributedTest > clientTransactionOperationsAreNotLostIfTransactionIsOnRolledServer

2022-04-22 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reopened GEODE-7138:
--

Reproduced in https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14761766


> CI failure: 
> ClientServerTransactionFailoverWithMixedVersionServersDistributedTest > 
> clientTransactionOperationsAreNotLostIfTransactionIsOnRolledServer
> --
>
> Key: GEODE-7138
> URL: https://issues.apache.org/jira/browse/GEODE-7138
> Project: Geode
>  Issue Type: Bug
>  Components: transactions
>Affects Versions: 1.11.0
>Reporter: Anilkumar Gingade
>Assignee: Eric Shu
>Priority: Major
>  Labels: GeodeCommons, flaky
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> DistributedTestOpenJDK8 #1035
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest
>  > clientTransactionOperationsAreNotLostIfTransactionIsOnRolledServer FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest$$Lambda$47/1742885319.run
>  in VM 0 running on Host 13889d5ebaf9 with 6 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:579)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:406)
> at 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.clientTransactionOperationsAreNotLostIfTransactionIsOnRolledServer(ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.java:137)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest
>  that uses org.apache.geode.cache.Region, org.apache.geode.cache.Regionint 
> expected:<[144]> but was:<[37]> within 300 seconds.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:902)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.verifyTransactionResult(ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.java:361)
> at 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.lambda$clientTransactionOperationsAreNotLostIfTransactionIsOnRolledServer$2967fbd$2(ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.java:137)
> Caused by:
> org.junit.ComparisonFailure: expected:<[144]> but was:<[37]>
> at 
> sun.reflect.GeneratedConstructorAccessor38.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.lambda$verifyTransactionResult$2(ClientServerTransactionFailoverWithMixedVersionServersDistributedTest.java:361)





[jira] [Created] (GEODE-10258) CI: ClearDuringNetSearchOplogRegressionTest > testQueryGetWithClear FAILED

2022-04-25 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10258:
-

 Summary: CI: ClearDuringNetSearchOplogRegressionTest > 
testQueryGetWithClear FAILED
 Key: GEODE-10258
 URL: https://issues.apache.org/jira/browse/GEODE-10258
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


Found in https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14769373:

> Task :geode-core:distributedTest

ClearDuringNetSearchOplogRegressionTest > testQueryGetWithClear FAILED
org.awaitility.core.ConditionTimeoutException: Assertion condition defined 
as a lambda expression in 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest 
cacheObserver.afterSettingDiskRef();
Wanted 1 time:
-> at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.lambda$doConcurrentNetSearchGetAndClear$0(ClearDuringNetSearchOplogRegressionTest.java:161)
But was 2 times:
-> at 
org.apache.geode.internal.cache.DiskRegion.setClearCountReference(DiskRegion.java:622)
-> at 
org.apache.geode.internal.cache.DiskRegion.setClearCountReference(DiskRegion.java:622)

 within 5 minutes.
at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
at 
org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.doConcurrentNetSearchGetAndClear(ClearDuringNetSearchOplogRegressionTest.java:161)
at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.concurrentNetSearchGetAndClear(ClearDuringNetSearchOplogRegressionTest.java:145)
at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.testQueryGetWithClear(ClearDuringNetSearchOplogRegressionTest.java:105)

Caused by:
org.mockito.exceptions.verification.TooManyActualInvocations: 
cacheObserver.afterSettingDiskRef();
Wanted 1 time:
-> at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.lambda$doConcurrentNetSearchGetAndClear$0(ClearDuringNetSearchOplogRegressionTest.java:161)
But was 2 times:
-> at 
org.apache.geode.internal.cache.DiskRegion.setClearCountReference(DiskRegion.java:622)
-> at 
org.apache.geode.internal.cache.DiskRegion.setClearCountReference(DiskRegion.java:622)
at 
org.apache.geode.internal.cache.ClearDuringNetSearchOplogRegressionTest.lambda$doConcurrentNetSearchGetAndClear$0(ClearDuringNetSearchOplogRegressionTest.java:161)







[jira] [Commented] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size

2022-04-25 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527645#comment-17527645
 ] 

Xiaojian Zhou commented on GEODE-9617:
--

Reproduced in https://hydradb.hdb.gemfire-ci.info/hdb/testresult/14769422

> CI Failure: PartitionedRegionSingleHopDUnitTest fails with 
> ConditionTimeoutException waiting for server to bucket map size
> --
>
> Key: GEODE-9617
> URL: https://issues.apache.org/jira/browse/GEODE-9617
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testClientMetadataForPersistentPrs FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService, 
> org.apache.geode.cache.client.internal.ClientMetadataServiceorg.apache.geode.cache.Region
>  
> Expecting actual not to be null within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971)
> Caused by:
> java.lang.AssertionError: 
> Expecting actual not to be null
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976)
> {noformat}
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testMetadataServiceCallAccuracy_FromGetOp FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e> within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394)
> Caused by:
> org.junit.ComparisonFailure: 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e>
> at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395)
> {noformat}





[jira] [Assigned] (GEODE-10279) Need to lock RVV and flush before backup

2022-05-04 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-10279:
-

Assignee: Xiaojian Zhou

> Need to lock RVV and flush before backup
> 
>
> Key: GEODE-10279
> URL: https://issues.apache.org/jira/browse/GEODE-10279
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> When the async disk writer is used, the in-memory RVV already contains all the 
> operations in the async queue, but the items in the queue might not have been 
> completely flushed to disk. The RVV therefore does not match the entries' status.
> After a restore, no GII is triggered because the RVVs are the same. 
> Thus the data differs between members.
> To fix this, introduce a step that locks the RVVs for all the regions of all the 
> disk stores that will be backed up.





[jira] [Created] (GEODE-10279) Need to lock RVV and flush before backup

2022-05-04 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10279:
-

 Summary: Need to lock RVV and flush before backup
 Key: GEODE-10279
 URL: https://issues.apache.org/jira/browse/GEODE-10279
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


When the async disk writer is used, the in-memory RVV already contains all the 
operations in the async queue, but the items in the queue might not have been 
completely flushed to disk. The RVV therefore does not match the entries' status.

After a restore, no GII is triggered because the RVVs are the same. Thus 
the data differs between members.

To fix this, introduce a step that locks the RVVs for all the regions of all the 
disk stores that will be backed up.
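
A minimal sketch of the proposed step, assuming a single region whose RVV is reduced to one counter. BackupSketch, Snapshot, rvvLock and the in-memory "disk" map are hypothetical names chosen for illustration; they are not Geode's backup code.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

final class BackupSketch {
  /** What a backup captures: the RVV value and the entries that are actually on disk. */
  static final class Snapshot {
    final long rvvVersion;
    final Map<String, String> entries;

    Snapshot(long rvvVersion, Map<String, String> entries) {
      this.rvvVersion = rvvVersion;
      this.entries = entries;
    }
  }

  private final Object rvvLock = new Object();
  private long rvvVersion; // in-memory RVV, reduced to a single counter
  private final Queue<Map.Entry<String, String>> asyncQueue = new ArrayDeque<>();
  private final Map<String, String> disk = new HashMap<>();

  /** The RVV advances immediately; the write only reaches "disk" on a later flush. */
  void put(String key, String value) {
    synchronized (rvvLock) {
      rvvVersion++;
      asyncQueue.add(new SimpleImmutableEntry<>(key, value));
    }
  }

  /** Locks the RVV, drains the async queue, then snapshots both together. */
  Snapshot backup() {
    synchronized (rvvLock) {
      Map.Entry<String, String> queued;
      while ((queued = asyncQueue.poll()) != null) {
        disk.put(queued.getKey(), queued.getValue());
      }
      return new Snapshot(rvvVersion, new HashMap<>(disk));
    }
  }
}
```

Because put() takes the same lock, no operation can advance the RVV between the flush and the snapshot, so the backed-up RVV never claims an operation that is missing from the backed-up entries.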





[jira] [Created] (GEODE-10290) GII requester should remove departed members

2022-05-09 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-10290:
-

 Summary: GII requester should remove departed members
 Key: GEODE-10290
 URL: https://issues.apache.org/jira/browse/GEODE-10290
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


In the non-persistent case with concurrency checks enabled, departed members are 
marked. They should be removed from the RVV during GII to prevent memberToVersion 
in the RVV from growing bigger and bigger. 

However, we only remove them on the GII provider, not on the GII requester. A good 
opportunity to remove them on the GII requester is when calculating unfinished 
operations. 
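
The cleanup on the requester side can be sketched as follows. This is an illustration only: the map stands in for memberToVersion, member ids are plain strings, and removeDeparted() is a hypothetical helper rather than a Geode method.

```java
import java.util.Map;
import java.util.Set;

final class DepartedMemberSketch {
  /** Drops entries for members no longer in the view; returns how many were removed. */
  static int removeDeparted(Map<String, Long> memberToVersion, Set<String> currentMembers) {
    int before = memberToVersion.size();
    // keySet() is a live view, so retainAll() removes the entries from the map itself.
    memberToVersion.keySet().retainAll(currentMembers);
    return before - memberToVersion.size();
  }
}
```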





[jira] [Assigned] (GEODE-10290) GII requester should remove departed members

2022-05-09 Thread Xiaojian Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou reassigned GEODE-10290:
-

Assignee: Xiaojian Zhou

> GII requester should remove departed members
> 
>
> Key: GEODE-10290
> URL: https://issues.apache.org/jira/browse/GEODE-10290
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> In the non-persistent case with concurrency checks enabled, departed members are 
> marked. They should be removed from the RVV during GII to prevent memberToVersion 
> in the RVV from growing bigger and bigger. 
> However, we only remove them on the GII provider, not on the GII requester. A 
> good opportunity to remove them on the GII requester is when calculating 
> unfinished operations. 





[jira] [Created] (GEODE-4302) ClientServerMiscBCDUnitTest.testSubscriptionWithMixedServersAndNewPeerFeed failed

2018-01-17 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4302:


 Summary: 
ClientServerMiscBCDUnitTest.testSubscriptionWithMixedServersAndNewPeerFeed 
failed
 Key: GEODE-4302
 URL: https://issues.apache.org/jira/browse/GEODE-4302
 Project: Geode
  Issue Type: Bug
  Components: ci, client/server
Reporter: xiaojian zhou


GemFireDistributedTest #321

 
org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest > 
testSubscriptionWithMixedServersAndNewPeerFeed[1] FAILED
 org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.test.dunit.NamedRunnable.run in VM 0 running on Host 
1de33499ce69 with 5 VMs with version 120
 at org.apache.geode.test.dunit.VM.invoke(VM.java:393)
 at org.apache.geode.test.dunit.VM.invoke(VM.java:363)
 at org.apache.geode.test.dunit.VM.invoke(VM.java:296)
 at 
org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest.doTestSubscriptionWithMixedServersAndPeerFeed(ClientServerMiscBCDUnitTest.java:204)
 at 
org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest.testSubscriptionWithMixedServersAndNewPeerFeed(ClientServerMiscBCDUnitTest.java:112)
 
 Caused by:
 java.lang.AssertionError: expected:<4> but was:<3>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:834)
 at org.junit.Assert.assertEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:631)
 at 
org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest.lambda$doTestSubscriptionWithMixedServersAndPeerFeed$60ce2e92$8(ClientServerMiscBCDUnitTest.java:212)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GEODE-3967) if put hits concurrent modification exception should still notify serial gateway sender

2018-01-22 Thread xiaojian zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334986#comment-16334986
 ] 

xiaojian zhou commented on GEODE-3967:
--

{noformat}

The following race conditions together contribute to the bug:

1) When a ConcurrentModificationException (CME) happens:
 * if the tag's timestamp is higher, notifyGatewaySender is called with 
UPDATE_VERSION_STAMP, which creates a new event id.
 * if the tag's timestamp is lower, nothing is done. 

Either way, the event with the original event id is not sent to the gateway queue. 
If this member happens to host the primary queue, the event in the secondary queue 
never gets a chance to be cleaned up. We should notify the gateway sender for 
events with CME.

2) UPDATE_VERSION_STAMP has a new event id. If this member happens to host a 
secondary queue, the event stays there for a long time or forever. We should 
check whether the queue is primary and only enqueue the UPDATE_VERSION_STAMP 
event when it is.

3) AUO.doPutOrCreate() calls basicUpdate() 3 times (3 tries): create, 
update, create.

There is a race between the update (2nd try) and the create (3rd try): another 
thread deletes the entry, which makes the 2nd try fail; then, before the 3rd try, 
a create jumps in and makes the 3rd try fail silently (no log, no exception, and 
no call to notifyGatewaySender). 

As the last try, it should use ifNew=false, ifOld=false to apply the put() 
anyway. 

{noformat}
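
The retry order proposed in point 3 can be sketched on a plain ConcurrentMap. This is a simplified stand-in, not the real basicUpdate(): putIfAbsent() plays the role of ifNew=true, replace() the role of ifOld=true, and put() the unconditional last try. PutOrCreateSketch and its return strings are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PutOrCreateSketch {
  /** Returns which attempt applied the value: "create", "update", or "unconditional". */
  static <K, V> String putOrCreate(ConcurrentMap<K, V> map, K key, V value) {
    if (map.putIfAbsent(key, value) == null) {
      return "create"; // try 1: succeeds only if the key was absent (ifNew=true)
    }
    if (map.replace(key, value) != null) {
      return "update"; // try 2: succeeds only if the key is still present (ifOld=true)
    }
    // Try 3: another thread removed the entry after try 1. Apply the value
    // unconditionally (ifNew=false, ifOld=false) so the put cannot fail silently.
    map.put(key, value);
    return "unconditional";
  }

  public static void main(String[] args) {
    ConcurrentMap<String, String> region = new ConcurrentHashMap<>();
    System.out.println(putOrCreate(region, "k", "v1"));
    System.out.println(putOrCreate(region, "k", "v2"));
  }
}
```

With an unconditional last try there is always an applied operation to report, so the gateway sender can be notified in every case.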

> if put hits concurrent modification exception should still notify serial 
> gateway sender
> ---
>
> Key: GEODE-3967
> URL: https://issues.apache.org/jira/browse/GEODE-3967
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
>
> In a serial gateway sender, an event that arrives at the secondary is put into 
> unprocessedMap, where it waits for the event from the primary queue to be 
> distributed over; it is then removed from unprocessedMap.
> If the put at the primary member (the member with the primary queue) fails with 
> a CME, the event in unprocessedMap is never removed. 





[jira] [Created] (GEODE-4434) Need an API to confirm recovery is finished

2018-01-29 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4434:


 Summary: Need an API to confirm recovery is finished
 Key: GEODE-4434
 URL: https://issues.apache.org/jira/browse/GEODE-4434
 Project: Geode
  Issue Type: New Feature
  Components: persistence
Reporter: xiaojian zhou


This feature is needed especially when we have to decide whether it is safe to 
shut down a member. So far we have been using the following workarounds instead:

 
1) wait until the cache server is listening on its port (to make sure all replicated 
regions have finished recovery)
2) call rebalance() to make sure the defined redundancy is satisfied (for PR)
 
We need to introduce a boolean parameter to rebalance() so that it only checks 
redundancy and does not do a real rebalance.
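
As a rough sketch of what a check-only mode would have to decide, assuming the number of hosted copies per bucket is already known. RedundancyCheckSketch, isRedundancySatisfied() and the copiesPerBucket map are hypothetical and not part of Geode's rebalance API.

```java
import java.util.Map;

final class RedundancyCheckSketch {
  /**
   * Returns true when every bucket has one primary plus the configured number of
   * redundant copies. Nothing is moved; this only inspects the current state.
   */
  static boolean isRedundancySatisfied(Map<Integer, Integer> copiesPerBucket, int redundantCopies) {
    for (int copies : copiesPerBucket.values()) {
      if (copies < redundantCopies + 1) {
        return false;
      }
    }
    return true;
  }
}
```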
 





[jira] [Assigned] (GEODE-4631) Travis builds failing due to issue in TypeUtilsJUnitTest

2018-02-07 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4631:


Assignee: xiaojian zhou

> Travis builds failing due to issue in TypeUtilsJUnitTest
> 
>
> Key: GEODE-4631
> URL: https://issues.apache.org/jira/browse/GEODE-4631
> Project: Geode
>  Issue Type: Bug
>  Components: querying
>Reporter: Nick Reich
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> Seeing the following failure in Travis builds from GitHub:
> {noformat}
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest > 
> getRegionEntryTypeShouldReturnTheProperTypeImplementation FAILED
>     org.mockito.exceptions.misusing.InvalidUseOfMatchersException:
>     Misplaced or misused argument matcher detected here:
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$7(TypeUtilsJUnitTest.java:441)
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$8(TypeUtilsJUnitTest.java:444)
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$9(TypeUtilsJUnitTest.java:447)
>     You cannot use argument matchers outside of verification or stubbing.
>     Examples of correct usage of argument matchers:
>         when(mock.get(anyInt())).thenReturn(null);
>         doThrow(new 
> RuntimeException()).when(mock).someVoidMethod(anyObject());
>         verify(mock).someMethod(contains("foo"))
>     This message may appear after an NullPointerException if the last matcher 
> is returning an object
>     like any() but the stubbed method signature expect a primitive argument, 
> in this case,
>     use primitive alternatives.
>         when(mock.get(any())); // bad use, will raise NPE
>         when(mock.get(anyInt())); // correct usage use
>     Also, this error might show up because you use argument matchers with 
> methods that cannot be mocked.
>     Following methods *cannot* be stubbed/verified: 
> final/private/equals()/hashCode().
>     Mocking methods declared on non-public parent classes is not supported.
>         at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.getRegionEntryTypeShouldReturnTheProperTypeImplementation(TypeUtilsJUnitTest.java:429){noformat}
> The issue does not reproduce locally, however.





[jira] [Resolved] (GEODE-4631) Travis builds failing due to issue in TypeUtilsJUnitTest

2018-02-08 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4631.
--
   Resolution: Fixed
Fix Version/s: 1.5.0

> Travis builds failing due to issue in TypeUtilsJUnitTest
> 
>
> Key: GEODE-4631
> URL: https://issues.apache.org/jira/browse/GEODE-4631
> Project: Geode
>  Issue Type: Bug
>  Components: querying
>Reporter: Nick Reich
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> Seeing the following failure in Travis builds from GitHub:
> {noformat}
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest > 
> getRegionEntryTypeShouldReturnTheProperTypeImplementation FAILED
>     org.mockito.exceptions.misusing.InvalidUseOfMatchersException:
>     Misplaced or misused argument matcher detected here:
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$7(TypeUtilsJUnitTest.java:441)
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$8(TypeUtilsJUnitTest.java:444)
>     -> at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.lambda$booleanCompareShouldThrowExceptionIfValuesAreNotInstancesOfBoolean$9(TypeUtilsJUnitTest.java:447)
>     You cannot use argument matchers outside of verification or stubbing.
>     Examples of correct usage of argument matchers:
>         when(mock.get(anyInt())).thenReturn(null);
>         doThrow(new 
> RuntimeException()).when(mock).someVoidMethod(anyObject());
>         verify(mock).someMethod(contains("foo"))
>     This message may appear after an NullPointerException if the last matcher 
> is returning an object
>     like any() but the stubbed method signature expect a primitive argument, 
> in this case,
>     use primitive alternatives.
>         when(mock.get(any())); // bad use, will raise NPE
>         when(mock.get(anyInt())); // correct usage use
>     Also, this error might show up because you use argument matchers with 
> methods that cannot be mocked.
>     Following methods *cannot* be stubbed/verified: 
> final/private/equals()/hashCode().
>     Mocking methods declared on non-public parent classes is not supported.
>         at 
> org.apache.geode.cache.query.internal.types.TypeUtilsJUnitTest.getRegionEntryTypeShouldReturnTheProperTypeImplementation(TypeUtilsJUnitTest.java:429){noformat}
> The issue does not reproduce locally, however.





[jira] [Updated] (GEODE-4659) AbstractGatewaySenderEventProcessor put loop of filter in wrong place

2018-02-13 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4659:
-
Issue Type: Bug  (was: New Feature)

> AbstractGatewaySenderEventProcessor put loop of filter in wrong place
> -
>
> Key: GEODE-4659
> URL: https://issues.apache.org/jira/browse/GEODE-4659
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Priority: Major
>
> {noformat}
> When fixing GEODE-3967, I found that the filter loop is in the wrong place. 
>  
> The processing that ignores UPDATE_VERSION_STAMP events and events with CME 
> should have nothing to do with filters. But if no filter is defined, the code 
> does not ignore UPDATE_VERSION_STAMP events and events with CME.
>  
> However, once this problem is fixed, GEODE-3967 has more race conditions to 
> be fixed (I have fixed several of them). It looks like this bug hid other 
> race conditions from surfacing. 
>  
> Given the time constraint, I will not fix the filter issue in GEODE-3967 and 
> am logging this bug for future reference. 
>  
> Here is the diff to fix this bug:
> diff --git 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
>  
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> index 8739a8f72..a3a89fbd0 100644
> --- 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> +++ 
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> @@ -81,40 +81,8 @@ public class RemoteParallelGatewaySenderEventProcessor 
> extends ParallelGatewaySe
>     * @param disp
>     * @return true if remote site Gemfire Version is >= 7.0.1
>     */
> -  private boolean shouldSendVersionEvents(GatewaySenderEventDispatcher disp)
> -      throws GatewaySenderException {
> -    try {
> -      GatewaySenderEventRemoteDispatcher remoteDispatcher =
> -          (GatewaySenderEventRemoteDispatcher) disp;
> -      // This will create a new connection if no batch has been sent till
> -      // now.
> -      Connection conn = remoteDispatcher.getConnection(false);
> -      if (conn != null) {
> -        short remoteSiteVersion = conn.getWanSiteVersion();
> -        if (Version.GFE_701.compareTo(remoteSiteVersion) <= 0) {
> -          return true;
> -        }
> -      }
> -    } catch (GatewaySenderException e) {
> -      Throwable cause = e.getCause();
> -      if (cause instanceof IOException || e instanceof 
> GatewaySenderConfigurationException
> -          || cause instanceof ConnectionDestroyedException) {
> -        try {
> -          int sleepInterval = GatewaySender.CONNECTION_RETRY_INTERVAL;
> -          if (logger.isDebugEnabled()) {
> -            logger.debug("Sleeping for {} milliseconds", sleepInterval);
> -          }
> -          Thread.sleep(sleepInterval);
> -        } catch (InterruptedException ie) {
> -          // log the exception
> -          if (logger.isDebugEnabled()) {
> -            logger.debug(ie.getMessage(), ie);
> -          }
> -        }
> -      }
> -      throw e;
> -    }
> -    return false;
> +  protected boolean shouldSendVersionEvents(GatewaySenderEventDispatcher 
> disp) {
> +    return true;
>    }
> }
> diff --git 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
>  
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> index 69005e02b..da5d1baee 100644
> --- 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> +++ 
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> @@ -19,6 +19,7 @@ import org.apache.logging.log4j.Logger;
> import org.apache.geode.cache.wan.GatewaySender;
> import org.apache.geode.internal.cache.wan.AbstractGatewaySender;
> import 
> org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher;
> +import org.apache.geode.internal.cache.wan.GatewaySenderEventDispatcher;
> import org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher;
> import org.apache.geode.internal.logging.LogService;
> @@ -44,4 +45,14 @@ public class RemoteSerialGatewaySenderEventProcessor 
> extends SerialGatewaySender
>      }
>    }
> +  /**
> +   * Returns if corresponding receiver WAN site of this GatewaySender has 
> GemfireVersion > 7.0.1
> +   *
> +   * @param disp
> +   * @return true if remote site Gemfire Version is >= 7.0.1
> +   */
> +  protected boolean shouldSendVersionEve

[jira] [Created] (GEODE-4659) AbstractGatewaySenderEventProcessor put loop of filter in wrong place

2018-02-13 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4659:


 Summary: AbstractGatewaySenderEventProcessor put loop of filter in 
wrong place
 Key: GEODE-4659
 URL: https://issues.apache.org/jira/browse/GEODE-4659
 Project: Geode
  Issue Type: New Feature
  Components: wan
Reporter: xiaojian zhou


{noformat}
While fixing GEODE-3967, I found that the filter loop is in the wrong place.

Ignoring UPDATE_VERSION_STAMP events and events with CME should have nothing 
to do with filters. But when no filter is defined, the code does not ignore 
UPDATE_VERSION_STAMP events or events with CME.

However, once this problem is fixed, GEODE-3967 has more race conditions to 
fix (I have already fixed several of them). It looks like this bug was hiding 
other race conditions.

Given the time constraint, I will not fix the filter issue in GEODE-3967; I am 
logging this bug for future reference.

Here is the diff to fix this bug:
diff --git 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
index 8739a8f72..a3a89fbd0 100644
--- 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
+++ 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
@@ -81,40 +81,8 @@ public class RemoteParallelGatewaySenderEventProcessor 
extends ParallelGatewaySe
    * @param disp
    * @return true if remote site Gemfire Version is >= 7.0.1
    */
-  private boolean shouldSendVersionEvents(GatewaySenderEventDispatcher disp)
-      throws GatewaySenderException {
-    try {
-      GatewaySenderEventRemoteDispatcher remoteDispatcher =
-          (GatewaySenderEventRemoteDispatcher) disp;
-      // This will create a new connection if no batch has been sent till
-      // now.
-      Connection conn = remoteDispatcher.getConnection(false);
-      if (conn != null) {
-        short remoteSiteVersion = conn.getWanSiteVersion();
-        if (Version.GFE_701.compareTo(remoteSiteVersion) <= 0) {
-          return true;
-        }
-      }
-    } catch (GatewaySenderException e) {
-      Throwable cause = e.getCause();
-      if (cause instanceof IOException || e instanceof 
GatewaySenderConfigurationException
-          || cause instanceof ConnectionDestroyedException) {
-        try {
-          int sleepInterval = GatewaySender.CONNECTION_RETRY_INTERVAL;
-          if (logger.isDebugEnabled()) {
-            logger.debug("Sleeping for {} milliseconds", sleepInterval);
-          }
-          Thread.sleep(sleepInterval);
-        } catch (InterruptedException ie) {
-          // log the exception
-          if (logger.isDebugEnabled()) {
-            logger.debug(ie.getMessage(), ie);
-          }
-        }
-      }
-      throw e;
-    }
-    return false;
+  protected boolean shouldSendVersionEvents(GatewaySenderEventDispatcher disp) 
{
+    return true;
   }
}
diff --git 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
index 69005e02b..da5d1baee 100644
--- 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
+++ 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
@@ -19,6 +19,7 @@ import org.apache.logging.log4j.Logger;
import org.apache.geode.cache.wan.GatewaySender;
import org.apache.geode.internal.cache.wan.AbstractGatewaySender;
import org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher;
+import org.apache.geode.internal.cache.wan.GatewaySenderEventDispatcher;
import org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher;
import org.apache.geode.internal.logging.LogService;

@@ -44,4 +45,14 @@ public class RemoteSerialGatewaySenderEventProcessor extends 
SerialGatewaySender
     }
   }

+  /**
+   * Returns if corresponding receiver WAN site of this GatewaySender has 
GemfireVersion > 7.0.1
+   *
+   * @param disp
+   * @return true if remote site Gemfire Version is >= 7.0.1
+   */
+  protected boolean shouldSendVersionEvents(GatewaySenderEventDispatcher disp) 
{
+    return true;
+  }
+
}
diff --git 
a/geode-core/src/main/java/org/apache/geode/internal/cache/wan/AbstractGatewaySenderEventProcessor.java
 
b/geode-core/src/main/java/org/apache/geode/internal/cache/wan/AbstractGatewaySenderEventProcessor.java
index 7e67e9bfb..439394382 100644
--- 
a/geode-core/src/main/java/org/apache/geode/internal/cache/wan/AbstractGatewaySenderEventProcessor.java
+

[jira] [Assigned] (GEODE-4659) AbstractGatewaySenderEventProcessor put loop of filter in wrong place

2018-02-13 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4659:


Assignee: xiaojian zhou

> AbstractGatewaySenderEventProcessor put loop of filter in wrong place
> -
>
> Key: GEODE-4659
> URL: https://issues.apache.org/jira/browse/GEODE-4659
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> {noformat}
> While fixing GEODE-3967, I found that the filter loop is in the wrong place.
>
> Ignoring UPDATE_VERSION_STAMP events and events with CME should have nothing 
> to do with filters. But when no filter is defined, the code does not ignore 
> UPDATE_VERSION_STAMP events or events with CME.
>
> However, once this problem is fixed, GEODE-3967 has more race conditions to 
> fix (I have already fixed several of them). It looks like this bug was 
> hiding other race conditions.
>
> Given the time constraint, I will not fix the filter issue in GEODE-3967; I 
> am logging this bug for future reference.
>
> Here is the diff to fix this bug:
> diff --git 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
>  
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> index 8739a8f72..a3a89fbd0 100644
> --- 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> +++ 
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/RemoteParallelGatewaySenderEventProcessor.java
> @@ -81,40 +81,8 @@ public class RemoteParallelGatewaySenderEventProcessor 
> extends ParallelGatewaySe
>     * @param disp
>     * @return true if remote site Gemfire Version is >= 7.0.1
>     */
> -  private boolean shouldSendVersionEvents(GatewaySenderEventDispatcher disp)
> -      throws GatewaySenderException {
> -    try {
> -      GatewaySenderEventRemoteDispatcher remoteDispatcher =
> -          (GatewaySenderEventRemoteDispatcher) disp;
> -      // This will create a new connection if no batch has been sent till
> -      // now.
> -      Connection conn = remoteDispatcher.getConnection(false);
> -      if (conn != null) {
> -        short remoteSiteVersion = conn.getWanSiteVersion();
> -        if (Version.GFE_701.compareTo(remoteSiteVersion) <= 0) {
> -          return true;
> -        }
> -      }
> -    } catch (GatewaySenderException e) {
> -      Throwable cause = e.getCause();
> -      if (cause instanceof IOException || e instanceof 
> GatewaySenderConfigurationException
> -          || cause instanceof ConnectionDestroyedException) {
> -        try {
> -          int sleepInterval = GatewaySender.CONNECTION_RETRY_INTERVAL;
> -          if (logger.isDebugEnabled()) {
> -            logger.debug("Sleeping for {} milliseconds", sleepInterval);
> -          }
> -          Thread.sleep(sleepInterval);
> -        } catch (InterruptedException ie) {
> -          // log the exception
> -          if (logger.isDebugEnabled()) {
> -            logger.debug(ie.getMessage(), ie);
> -          }
> -        }
> -      }
> -      throw e;
> -    }
> -    return false;
> +  protected boolean shouldSendVersionEvents(GatewaySenderEventDispatcher 
> disp) {
> +    return true;
>    }
> }
> diff --git 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
>  
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> index 69005e02b..da5d1baee 100644
> --- 
> a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> +++ 
> b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/serial/RemoteSerialGatewaySenderEventProcessor.java
> @@ -19,6 +19,7 @@ import org.apache.logging.log4j.Logger;
> import org.apache.geode.cache.wan.GatewaySender;
> import org.apache.geode.internal.cache.wan.AbstractGatewaySender;
> import 
> org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher;
> +import org.apache.geode.internal.cache.wan.GatewaySenderEventDispatcher;
> import org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher;
> import org.apache.geode.internal.logging.LogService;
> @@ -44,4 +45,14 @@ public class RemoteSerialGatewaySenderEventProcessor 
> extends SerialGatewaySender
>      }
>    }
> +  /**
> +   * Returns if corresponding receiver WAN site of this GatewaySender has 
> GemfireVersion > 7.0.1
> +   *
> +   * @param disp
> +   * @return true if remote site Gemfire Version is >= 7.0.1
> +   */
> +  protec

[jira] [Resolved] (GEODE-3967) if put hits concurrent modification exception should still notify serial gateway sender

2018-02-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-3967.
--
Resolution: Fixed

> if put hits concurrent modification exception should still notify serial 
> gateway sender
> ---
>
> Key: GEODE-3967
> URL: https://issues.apache.org/jira/browse/GEODE-3967
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a serial gateway sender, an event that arrives at a secondary is put into 
> unprocessedMap, where it waits until the corresponding event from the primary 
> queue is distributed; it is then removed from unprocessedMap.
> If the put at the primary member (the member with the primary queue) fails 
> with a CME, the event in unprocessedMap is never removed.
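
The leak described above can be sketched in a few lines of Java. This is only a minimal model, not Geode's implementation: the class and method names (SecondarySenderSketch, onSecondaryEvent, onPrimaryDistributed, primaryPut) and the notifyOnCme flag are hypothetical, and a plain concurrent map stands in for unprocessedMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a secondary serial gateway sender; not Geode's actual class. */
class SecondarySenderSketch {
  // Stand-in for unprocessedMap: events seen on the secondary but not yet
  // confirmed as distributed by the primary.
  private final Map<String, String> unprocessedEvents = new ConcurrentHashMap<>();

  /** An event arrives at the secondary and is parked until the primary confirms it. */
  void onSecondaryEvent(String eventId, String value) {
    unprocessedEvents.put(eventId, value);
  }

  /** The primary reports that the event was handled; the parked entry is dropped. */
  void onPrimaryDistributed(String eventId) {
    unprocessedEvents.remove(eventId);
  }

  int unprocessedCount() {
    return unprocessedEvents.size();
  }

  /**
   * Models the put on the primary member. When the put hits a CME and
   * notifyOnCme is false, the secondary is never told, so its entry stays.
   */
  static void primaryPut(SecondarySenderSketch secondary, String eventId,
      boolean hitsCme, boolean notifyOnCme) {
    if (hitsCme && !notifyOnCme) {
      return; // assumed pre-fix behavior: no notification reaches the secondary
    }
    secondary.onPrimaryDistributed(eventId);
  }
}
```

With notifyOnCme set to false the parked entry is never removed, which is the symptom in the description; setting it to true models the behavior the issue title asks for, where the sender is still notified after a CME.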



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (GEODE-4647) Add a new stat for AsyncEventQueue/GatewaySender to track secondaryEventsQueueSize

2018-02-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4647:


Assignee: xiaojian zhou

> Add a new stat for AsyncEventQueue/GatewaySender to track 
> secondaryEventsQueueSize
> -
>
> Key: GEODE-4647
> URL: https://issues.apache.org/jira/browse/GEODE-4647
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Reporter: Jason Huynh
>Assignee: xiaojian zhou
>Priority: Major
>
> Currently we have eventsQueueSize which tells us how big the queue is based 
> on how many primary events are in the queue.
> It would be nice to have the same type of stat for how many secondary events 
> are in the queue.
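
As a rough illustration of the request above, the two gauges could sit side by side. This is a sketch under stated assumptions: QueueStatsSketch and its methods are hypothetical and do not use Geode's statistics classes; only the stat names eventsQueueSize and secondaryEventsQueueSize come from the issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stats holder; Geode's real statistics classes are not used here. */
class QueueStatsSketch {
  // Existing stat named in the issue: number of primary events in the queue.
  private final AtomicInteger eventsQueueSize = new AtomicInteger();
  // Requested stat: number of secondary events in the queue.
  private final AtomicInteger secondaryEventsQueueSize = new AtomicInteger();

  /** Records that an event was added to the queue as primary or secondary. */
  void eventQueued(boolean primary) {
    (primary ? eventsQueueSize : secondaryEventsQueueSize).incrementAndGet();
  }

  /** Records that an event left the queue. */
  void eventRemoved(boolean primary) {
    (primary ? eventsQueueSize : secondaryEventsQueueSize).decrementAndGet();
  }

  int getEventsQueueSize() {
    return eventsQueueSize.get();
  }

  int getSecondaryEventsQueueSize() {
    return secondaryEventsQueueSize.get();
  }
}
```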





[jira] [Assigned] (GEODE-4624) Add a new stat for AsyncEventQueue/GatewaySender to track the processing of queueRemovals

2018-02-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4624:


Assignee: xiaojian zhou

> Add a new stat for AsyncEventQueue/GatewaySender to track the processing of 
> queueRemovals
> 
>
> Key: GEODE-4624
> URL: https://issues.apache.org/jira/browse/GEODE-4624
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Affects Versions: 1.5.0
>Reporter: Shelley Lynn Hughes-Godfrey
>Assignee: xiaojian zhou
>Priority: Major
>
> We currently track the number of events queued, the queue size, and 
> eventsDistributed, but we don't track the number of events removed via 
> queue removal. 





[jira] [Updated] (GEODE-3967) if put hits concurrent modification exception should still notify serial gateway sender

2018-02-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-3967:
-
Fix Version/s: 1.5.0

> if put hits concurrent modification exception should still notify serial 
> gateway sender
> ---
>
> Key: GEODE-3967
> URL: https://issues.apache.org/jira/browse/GEODE-3967
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a serial gateway sender, an event that arrives at a secondary is put into 
> unprocessedMap, where it waits until the corresponding event from the primary 
> queue is distributed; it is then removed from unprocessedMap.
> If the put at the primary member (the member with the primary queue) fails 
> with a CME, the event in unprocessedMap is never removed.





[jira] [Commented] (GEODE-4675) CI failure (suspect strings): DistributedSystemDisconnectedException: This connection to a distributed system has been disconnected reported as fatal log message during

2018-02-26 Thread xiaojian zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377475#comment-16377475
 ] 

xiaojian zhou commented on GEODE-4675:
--

{noformat}
Found in CI develop DistributedTest #162

org.apache.geode.internal.cache.wan.concurrent.ConcurrentParallelGatewaySenderOffHeapDUnitTest > testParallelPropagationWithUnEqualBucketDivision FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in log4j at line 9083
[fatal 2018/02/24 06:27:04.231 UTC  tid=1086] Unexpected exception:
org.apache.geode.distributed.DistributedSystemDisconnectedException: This connection to a distributed system has been disconnected.
at org.apache.geode.distributed.internal.InternalDistributedSystem.checkConnected(InternalDistributedSystem.java:911)
at org.apache.geode.distributed.internal.InternalDistributedSystem.getDistributionManager(InternalDistributedSystem.java:1493)
at org.apache.geode.internal.cache.AbstractRegion.getDistributionManager(AbstractRegion.java:1757)
at org.apache.geode.distributed.internal.DistributionAdvisor.getDistributionManager(DistributionAdvisor.java:380)
at org.apache.geode.distributed.internal.DistributionAdvisor.notifyListenersMemberRemoved(DistributionAdvisor.java:1225)
at org.apache.geode.distributed.internal.DistributionAdvisor.basicRemoveId(DistributionAdvisor.java:897)
at org.apache.geode.distributed.internal.DistributionAdvisor.doRemoveId(DistributionAdvisor.java:964)
at org.apache.geode.distributed.internal.DistributionAdvisor.removeId(DistributionAdvisor.java:926)
at org.apache.geode.internal.cache.CacheDistributionAdvisor.removeId(CacheDistributionAdvisor.java:1183)
at org.apache.geode.internal.cache.partitioned.RegionAdvisor.removeId(RegionAdvisor.java:391)
at org.apache.geode.distributed.internal.DistributionAdvisor$1.memberDeparted(DistributionAdvisor.java:232)
at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberDepartedEvent.handleEvent(ClusterDistributionManager.java:4198)
at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:4127)
at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:4116)
at org.apache.geode.distributed.internal.ClusterDistributionManager.handleMemberEvent(ClusterDistributionManager.java:2218)
at org.apache.geode.distributed.internal.ClusterDistributionManager.access$900(ClusterDistributionManager.java:109)
at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEventInvoker.run(ClusterDistributionManager.java:2250)
at java.lang.Thread.run(Thread.java:748)
{noformat}
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/162#L5a6ba88b:628

> CI failure (suspect strings): DistributedSystemDisconnectedException: This 
> connection to a distributed system has been disconnected reported as fatal 
> log message during shutdown
> -
>
> Key: GEODE-4675
> URL: https://issues.apache.org/jira/browse/GEODE-4675
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.5.0
>Reporter: Shelley Lynn Hughes-Godfrey
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This failure occurred during CI on geode:
> https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/140
> {noformat}
> org.apache.geode.internal.cache.wan.concurrent.ConcurrentParallelGatewaySenderOffHeapDUnitTest
>  > testPartitionedParallelPropagationHA FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 9339
> [fatal 2018/02/13 21:12:48.099 UTC  tid=891] 
> Unexpected exception:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: This 
> connection to a distributed system has been disconnected.
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.checkConnected(InternalDistributedSystem.java:911)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.getDis

[jira] [Created] (GEODE-4788) parReg.execute.FunctionServiceTest.getSomeKeys failed with util.TestException: Test Issue - Got the exception

2018-03-06 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4788:


 Summary: parReg.execute.FunctionServiceTest.getSomeKeys failed 
with util.TestException: Test Issue - Got the exception
 Key: GEODE-4788
 URL: https://issues.apache.org/jira/browse/GEODE-4788
 Project: Geode
  Issue Type: New Feature
Reporter: xiaojian zhou


{noformat}
CLIENT 
vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
TASK[0] parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
ERROR util.TestException: Test Issue - Got the exception 

util.TestException: Test Issue - Got the exception 
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
at 
parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
at 
parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hydra.MethExecutor.execute(MethExecutor.java:181)
at hydra.MethExecutor.execute(MethExecutor.java:149)
at hydra.TestTask.execute(TestTask.java:192)
at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
Caused by: java.lang.RuntimeException: 
org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse got 
remote CacheClosedException; forcing reattempt.
at 
org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
... 10 more
Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
FetchKeysResponse got remote CacheClosedException; forcing reattempt.
at 
org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
at 
org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
... 11 more
Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is closed: 
rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
at 
org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
at 
org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
at 
org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
at java.lang.Thread.run(Thread.java:748)

CLIENT 
vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
TASK[0] parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
ERROR util.TestException: Test Issue - Got the exception 

util.TestException: Test Issue - Got the exception 
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
at 
parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
at 
parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hydra.MethExecutor.execute(MethExecutor.java:{noformat}





[jira] [Assigned] (GEODE-4788) parReg.execute.FunctionServiceTest.getSomeKeys failed with util.TestException: Test Issue - Got the exception

2018-03-06 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4788:


Assignee: xiaojian zhou

> parReg.execute.FunctionServiceTest.getSomeKeys failed with 
> util.TestException: Test Issue - Got the exception
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  Issue Type: New Feature
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> {noformat}
> CLIENT 
> vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:181)
> at hydra.MethExecutor.execute(MethExecutor.java:149)
> at hydra.TestTask.execute(TestTask.java:192)
> at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
> Caused by: java.lang.RuntimeException: 
> org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse 
> got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
> ... 10 more
> Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
> FetchKeysResponse got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
> ... 11 more
> Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is 
> closed: 
> rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
> at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
> at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
> at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
> at java.lang.Thread.run(Thread.java:748)
> 
> CLIENT 
> vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:{noforma

[jira] [Commented] (GEODE-4788) parReg.execute.FunctionServiceTest.getSomeKeys failed with util.TestException: Test Issue - Got the exception

2018-03-06 Thread xiaojian zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388448#comment-16388448
 ] 

xiaojian zhou commented on GEODE-4788:
--

Root cause:

The refactoring in GEODE-2673 changed the behavior of getSomeKeys() as used by 
the test code.

The old code caught ForceReattemptException or PRLocallyDestroyedException, 
logged it, and continued.

The new code throws a RuntimeException when it catches either of these 
exceptions.
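
The behavior change described above can be illustrated with a small, self-contained Java sketch. It does not reproduce Geode's code: RetryableFetchException is a hypothetical stand-in for ForceReattemptException and PRLocallyDestroyedException, and KeySource, getSomeKeysOld and getSomeKeysNew are invented names. The loop over several key sources is also an assumption made only to show what "continue" means.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for ForceReattemptException / PRLocallyDestroyedException (hypothetical). */
class RetryableFetchException extends Exception {
  RetryableFetchException(String message) {
    super(message);
  }
}

class GetSomeKeysSketch {
  /** Hypothetical source of keys; a fetch may fail with a retryable exception. */
  interface KeySource {
    List<String> fetch() throws RetryableFetchException;
  }

  /** Behavior described for the old code: log the exception and continue. */
  static List<String> getSomeKeysOld(List<KeySource> sources, List<String> log) {
    List<String> keys = new ArrayList<>();
    for (KeySource source : sources) {
      try {
        keys.addAll(source.fetch());
      } catch (RetryableFetchException e) {
        log.add("ignored: " + e.getMessage());
      }
    }
    return keys;
  }

  /** Behavior described for the new code: rethrow as a RuntimeException. */
  static List<String> getSomeKeysNew(List<KeySource> sources) {
    List<String> keys = new ArrayList<>();
    for (KeySource source : sources) {
      try {
        keys.addAll(source.fetch());
      } catch (RetryableFetchException e) {
        throw new RuntimeException(e);
      }
    }
    return keys;
  }
}
```

A caller written against the old behavior, such as the test's getSomeKeys(), has no handler for the RuntimeException, which matches the "Caused by: java.lang.RuntimeException" frame in the reported stack trace.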

> parReg.execute.FunctionServiceTest.getSomeKeys failed with 
> util.TestException: Test Issue - Got the exception
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  Issue Type: New Feature
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> {noformat}
> CLIENT 
> vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:181)
> at hydra.MethExecutor.execute(MethExecutor.java:149)
> at hydra.TestTask.execute(TestTask.java:192)
> at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
> Caused by: java.lang.RuntimeException: 
> org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse 
> got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
> ... 10 more
> Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
> FetchKeysResponse got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
> ... 11 more
> Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is 
> closed: 
> rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
> at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
> at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
> at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
> at java.lang.Thread.run(Thread.java:748)
> 
> CLIENT 
> vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(N

[jira] [Resolved] (GEODE-4788) parReg.execute.FunctionServiceTest.getSomeKeys failed with util.TestException: Test Issue - Got the exception

2018-03-06 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4788.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

> parReg.execute.FunctionServiceTest.getSomeKeys failed with 
> util.TestException: Test Issue - Got the exception
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  Issue Type: New Feature
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> CLIENT 
> vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:181)
> at hydra.MethExecutor.execute(MethExecutor.java:149)
> at hydra.TestTask.execute(TestTask.java:192)
> at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
> Caused by: java.lang.RuntimeException: 
> org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse 
> got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
> ... 10 more
> Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
> FetchKeysResponse got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
> ... 11 more
> Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is 
> closed: 
> rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
> at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
> at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
> at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
> at java.lang.Thread.run(Thread.java:748)
> 
> CLIENT 
> vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(Del

[jira] [Commented] (GEODE-4788) parReg.execute.FunctionServiceTest.getSomeKeys failed with util.TestException: Test Issue - Got the exception

2018-03-06 Thread xiaojian zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389157#comment-16389157
 ] 

xiaojian zhou commented on GEODE-4788:
--

fixed in revision 

378d97aed9f23c237b9616c78e17bdb1fea2c21f

> parReg.execute.FunctionServiceTest.getSomeKeys failed with 
> util.TestException: Test Issue - Got the exception
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  Issue Type: New Feature
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> CLIENT 
> vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:181)
> at hydra.MethExecutor.execute(MethExecutor.java:149)
> at hydra.TestTask.execute(TestTask.java:192)
> at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
> Caused by: java.lang.RuntimeException: 
> org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse 
> got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
> ... 10 more
> Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
> FetchKeysResponse got remote CacheClosedException; forcing reattempt.
> at 
> org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
> at 
> org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
> ... 11 more
> Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is 
> closed: 
> rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
> at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
> at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
> at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
> at java.lang.Thread.run(Thread.java:748)
> 
> CLIENT 
> vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
> TASK[0] 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
> ERROR util.TestException: Test Issue - Got the exception 
> util.TestException: Test Issue - Got the exception 
> at 
> parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
> at 
> parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
> at 
> parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j

[jira] [Updated] (GEODE-4788) change back the behavior of test code of getSomeKeys to ignore exceptions

2018-03-07 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4788:
-
Description: 
In GEODE-2673, the behavior of getSomeKeys was changed to throw an exception 
when it detects certain exceptions. 

Because this is test code, we need to change it back to ignore those exceptions. 

  was:
{noformat}
CLIENT 
vm_12_thr_15_accessor1_rs-FullRegression-2018-03-03-05-01-17-client-8_4804
TASK[0] parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
ERROR util.TestException: Test Issue - Got the exception 

util.TestException: Test Issue - Got the exception 
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
at 
parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
at 
parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hydra.MethExecutor.execute(MethExecutor.java:181)
at hydra.MethExecutor.execute(MethExecutor.java:149)
at hydra.TestTask.execute(TestTask.java:192)
at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
Caused by: java.lang.RuntimeException: 
org.apache.geode.internal.cache.ForceReattemptException: FetchKeysResponse got 
remote CacheClosedException; forcing reattempt.
at 
org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:74)
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:502)
... 10 more
Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
FetchKeysResponse got remote CacheClosedException; forcing reattempt.
at 
org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:546)
at 
org.apache.geode.internal.cache.PartitionedRegionGetSomeKeys.getSomeKeys(PartitionedRegionGetSomeKeys.java:66)
... 11 more
Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is closed: 
rs-FullRegression-2018-03-03-05-01-17-client-8(dataStoregemfire5_rs-FullRegression-2018-03-03-05-01-17-client-8_4776:4776):1031
at 
org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1588)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1576)
at 
org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:304)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:382)
at 
org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1118)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:943)
at java.lang.Thread.run(Thread.java:748)

CLIENT 
vm_15_thr_29_accessor4_rs-FullRegression-2018-03-03-05-01-17-client-8_4918
TASK[0] parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions
ERROR util.TestException: Test Issue - Got the exception 

util.TestException: Test Issue - Got the exception 
at parReg.execute.FunctionServiceTest.getSomeKeys(FunctionServiceTest.java:508)
at 
parReg.execute.FunctionServiceTest.doRandomFunctionExecutions(FunctionServiceTest.java:473)
at 
parReg.execute.FunctionServiceTest.HydraTask_doRandomFunctionExecutions(FunctionServiceTest.java:436)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hydra.MethExecutor.execute(MethExecutor.java:{noformat}

Summary: change back the behavior of test code of getSomeKeys to ignore 
exceptions  (was: parReg.execute.FunctionServiceTest.getSomeKeys failed with 
util.TestException: Test Issue - Got the exception)

> change back the behavior of test code of getSomeKeys to ignore exceptions
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  

[jira] [Updated] (GEODE-4788) change back the behavior of test code of getSomeKeys to ignore exceptions

2018-03-07 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4788:
-
Fix Version/s: 1.5.0

> change back the behavior of test code of getSomeKeys to ignore exceptions
> -
>
> Key: GEODE-4788
> URL: https://issues.apache.org/jira/browse/GEODE-4788
> Project: Geode
>  Issue Type: New Feature
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.0, 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In GEODE-2673, the behavior of getSomeKeys was changed to throw an exception 
> when it detects certain exceptions. 
> Because this is test code, we need to change it back to ignore those exceptions. 
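
The "ignore the exceptions" behavior described above can be sketched roughly as follows. This is only an illustration, not the actual FunctionServiceTest change: the class IgnoringKeyFetcher and the method getSomeKeysIgnoringErrors are hypothetical names, and the fetch is passed in as a plain Callable rather than calling PartitionedRegionGetSomeKeys.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical test helper; names are made up for illustration only.
class IgnoringKeyFetcher {

  /** Runs the fetch and returns an empty list if it throws, instead of rethrowing. */
  static <K> List<K> getSomeKeysIgnoringErrors(Callable<List<K>> fetch) {
    try {
      List<K> keys = fetch.call();
      return keys == null ? Collections.<K>emptyList() : keys;
    } catch (Exception e) {
      // For example, a ForceReattemptException caused by a remote CacheClosedException.
      System.out.println("Ignoring exception from getSomeKeys: " + e);
      return Collections.emptyList();
    }
  }

  public static void main(String[] args) {
    List<String> ok = getSomeKeysIgnoringErrors(() -> Arrays.asList("k1", "k2"));
    List<String> failed = getSomeKeysIgnoringErrors(() -> {
      throw new IllegalStateException("remote cache is closed");
    });
    System.out.println(ok.size() + " keys, then " + failed.size() + " keys");
  }
}
```

With a helper like this, a task that picks random keys simply gets no keys when a data store is shutting down, and the test task continues instead of failing with util.TestException.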



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (GEODE-4834) Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl

2018-03-13 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4834:


 Summary: Remove the newly added isConcurrencyConflict from 
GatewaySenderEventImpl
 Key: GEODE-4834
 URL: https://issues.apache.org/jira/browse/GEODE-4834
 Project: Geode
  Issue Type: New Feature
  Components: wan
Reporter: xiaojian zhou


This boolean field was introduced in GEODE-3967 to resolve a rare race 
condition. 

However, it caused an issue during rolling upgrade. 

 

We decided to revert this part of the fix. 
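
The ticket does not say what exactly went wrong during the rolling upgrade. As a general illustration only, here is one common way an added field can hurt mixed-version clusters: a new member writes an extra boolean that an old member's reader does not expect. The sketch uses a toy wire format, not Geode's serialization code; RollingUpgradeSketch, OLD_VERSION and NEW_VERSION are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy wire format for illustration; this is not how Geode serializes events.
class RollingUpgradeSketch {
  static final int OLD_VERSION = 1;
  static final int NEW_VERSION = 2; // hypothetical version that adds one boolean flag

  /** Writes an event; the flag is written only when the receiver understands it. */
  static byte[] write(long eventId, boolean flag, int receiverVersion) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeLong(eventId);
      if (receiverVersion >= NEW_VERSION) {
        out.writeBoolean(flag);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** An old member's reader: it expects exactly one long and nothing else. */
  static long readAsOldMember(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      long eventId = in.readLong();
      if (in.available() > 0) {
        throw new IllegalStateException("unexpected trailing bytes: " + in.available());
      }
      return eventId;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Gated on the receiver's version, the old member can still read the event.
    System.out.println(readAsOldMember(write(7L, true, OLD_VERSION)));
    // Without the gate, the extra byte reaches the old reader and it rejects the data.
    try {
      readAsOldMember(write(7L, true, NEW_VERSION));
    } catch (IllegalStateException e) {
      System.out.println("old member failed: " + e.getMessage());
    }
  }
}
```

Removing the field, as this ticket does, avoids the need for any such version gate.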





[jira] [Assigned] (GEODE-4834) Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl

2018-03-13 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4834:


Assignee: xiaojian zhou

> Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl
> 
>
> Key: GEODE-4834
> URL: https://issues.apache.org/jira/browse/GEODE-4834
> Project: Geode
>  Issue Type: New Feature
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> This boolean field was introduced in GEODE-3967 to resolve a rare race 
> condition. 
> However, it caused an issue during rolling upgrade. 
>  
> We decided to revert this part of the fix. 





[jira] [Resolved] (GEODE-4834) Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl

2018-03-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4834.
--
Resolution: Fixed

> Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl
> 
>
> Key: GEODE-4834
> URL: https://issues.apache.org/jira/browse/GEODE-4834
> Project: Geode
>  Issue Type: New Feature
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This boolean field was introduced in GEODE-3967 to resolve a rare race 
> condition. 
> However, it caused an issue during rolling upgrade. 
>  
> We decided to revert this part of the fix. 





[jira] [Updated] (GEODE-4834) Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl

2018-03-14 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4834:
-
Fix Version/s: 1.6.0
   1.5.0

> Remove the newly added isConcurrencyConflict from GatewaySenderEventImpl
> 
>
> Key: GEODE-4834
> URL: https://issues.apache.org/jira/browse/GEODE-4834
> Project: Geode
>  Issue Type: New Feature
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.0, 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This boolean field was introduced in GEODE-3967 to resolve a rare race 
> condition. 
> However, it caused an issue during rolling upgrade. 
>  
> We decided to revert this part of the fix. 





[jira] [Assigned] (GEODE-4868) when member deposed primary buckets, it did not decrease the queue size

2018-03-16 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4868:


Assignee: xiaojian zhou

> when member deposed primary buckets, it did not decrease the queue size
> ---
>
> Key: GEODE-4868
> URL: https://issues.apache.org/jira/browse/GEODE-4868
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> {noformat}
> The following test code can be used to reproduce the issue:
> diff --git 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
>  
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> index 545d0cac4..fbc0dc015 100644
> --- 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> +++ 
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> @@ -717,6 +717,10 @@ public class AsyncEventQueueTestBase extends 
> JUnit4DistributedTestCase {
>        }
>      }
>      final AsyncEventQueueStats statistics = ((AsyncEventQueueImpl) 
> queue).getStatistics();
> +    Awaitility.await().atMost(60, TimeUnit.SECONDS)
> +    .until(() -> assertEquals("Expected queue entries: " + queueSize
> +        + " but actual entries: " + statistics.getEventQueueSize(), 
> queueSize,
> +        statistics.getEventQueueSize()));
>      assertEquals(queueSize, statistics.getEventQueueSize());
>      assertEquals(eventsReceived, statistics.getEventsReceived());
> diff --git 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
>  
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> index 465f35a87..058bf19cc 100644
> --- 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> +++ 
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> @@ -1519,6 +1519,11 @@ public class AsyncEventListenerDUnitTest extends 
> AsyncEventQueueTestBase {
>          () -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + 
> "_PR"));
>  
>      LogWriterUtils.getLogWriter().info("Primary buckets on vm2: " + 
> primaryBucketsvm2);
> +    
> +    // before shutdown vm2, both vm1 and vm2 should have 40 events in 
> primary queue
> +    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40,
> 80, 80, 0));
> +    vm2.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 80, 80, 0));
> +    
>      //  Kill vm2 --
>      vm2.invoke(() -> AsyncEventQueueTestBase.killSender());
>      // 
> @@ -1527,15 +1532,26 @@ public class AsyncEventListenerDUnitTest extends 
> AsyncEventQueueTestBase {
>      vm3.invoke(createCacheRunnable(lnPort));
>      vm3.invoke(() -> 
> AsyncEventQueueTestBase.createAsyncEventQueueWithListener2("ln", true, 100, 5,
>          false, null));
> +    // vm3 will move some primary buckets from vm1, but vm1's primary queue 
> size did not reduce
> +    vm3.invoke(pauseAsyncEventQueueRunnable());
>      vm3.invoke(() -> 
> AsyncEventQueueTestBase.createPRWithRedundantCopyWithAsyncEventQueue(
>          getTestMethodName() + "_PR", "ln", isOffHeap()));
> -
> +    
>      // --
>      String regionName = getTestMethodName() + "_PR";
>      Set primaryBucketsvm3 = (Set) vm3
>          .invoke(() -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(regionName));
> +    LogWriterUtils.getLogWriter().info("Primary buckets on vm3: " + 
> primaryBucketsvm3);
> +    Set primaryBucketsvm1 = (Set) vm1.invoke(
> +            () -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + 
> "_PR"));
> +    LogWriterUtils.getLogWriter().info("After shutdown vm2, started vm3, 
> Primary buckets on vm1: " + primaryBucketsvm1);
>  
> +//    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 80, 80, 80, 0));
> +    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 80, 80, 0));
> +    vm3.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 0, 0, 0));
> +
> +    vm3.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
>      vm1.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
>  
>      vm1.invoke(() -> 
> AsyncEventQueueTestBase.waitForAsyncQueueToGetEmpty("ln"))
> The root cause is:
> when depose primary, it only check if bucke

[jira] [Created] (GEODE-4868) when member deposed primary buckets, it did not decrease the queue size

2018-03-16 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4868:


 Summary: when member deposed primary buckets, it did not decrease 
the queue size
 Key: GEODE-4868
 URL: https://issues.apache.org/jira/browse/GEODE-4868
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: xiaojian zhou


{noformat}
The following test code can be used to reproduce the issue:

diff --git a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
index 545d0cac4..fbc0dc015 100644
--- a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
+++ b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
@@ -717,6 +717,10 @@ public class AsyncEventQueueTestBase extends JUnit4DistributedTestCase {
       }
     }
     final AsyncEventQueueStats statistics = ((AsyncEventQueueImpl) queue).getStatistics();
+    Awaitility.await().atMost(60, TimeUnit.SECONDS)
+    .until(() -> assertEquals("Expected queue entries: " + queueSize
+        + " but actual entries: " + statistics.getEventQueueSize(), queueSize,
+        statistics.getEventQueueSize()));
     assertEquals(queueSize, statistics.getEventQueueSize());
     assertEquals(eventsReceived, statistics.getEventsReceived());
diff --git a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
index 465f35a87..058bf19cc 100644
--- a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
+++ b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
@@ -1519,6 +1519,11 @@ public class AsyncEventListenerDUnitTest extends AsyncEventQueueTestBase {
         () -> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + "_PR"));
 
     LogWriterUtils.getLogWriter().info("Primary buckets on vm2: " + primaryBucketsvm2);
+    
+    // before shutdown vm2, both vm1 and vm2 should have 40 events in primary queue
+    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 40,
80, 80, 0));
+    vm2.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 40, 80, 80, 0));
+    
     //  Kill vm2 --
     vm2.invoke(() -> AsyncEventQueueTestBase.killSender());
     // 
@@ -1527,15 +1532,26 @@ public class AsyncEventListenerDUnitTest extends AsyncEventQueueTestBase {
     vm3.invoke(createCacheRunnable(lnPort));
     vm3.invoke(() -> AsyncEventQueueTestBase.createAsyncEventQueueWithListener2("ln", true, 100, 5,
         false, null));
+    // vm3 will move some primary buckets from vm1, but vm1's primary queue size did not reduce
+    vm3.invoke(pauseAsyncEventQueueRunnable());
     vm3.invoke(() -> AsyncEventQueueTestBase.createPRWithRedundantCopyWithAsyncEventQueue(
         getTestMethodName() + "_PR", "ln", isOffHeap()));
-
+    
     // --
     String regionName = getTestMethodName() + "_PR";
     Set primaryBucketsvm3 = (Set) vm3
         .invoke(() -> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(regionName));
+    LogWriterUtils.getLogWriter().info("Primary buckets on vm3: " + primaryBucketsvm3);
+    Set primaryBucketsvm1 = (Set) vm1.invoke(
+            () -> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + "_PR"));
+    LogWriterUtils.getLogWriter().info("After shutdown vm2, started vm3, Primary buckets on vm1: " + primaryBucketsvm1);
 
+//    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 80, 80, 80, 0));
+    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 40, 80, 80, 0));
+    vm3.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 40, 0, 0, 0));
+
+    vm3.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
     vm1.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
 
     vm1.invoke(() -> AsyncEventQueueTestBase.waitForAsyncQueueToGetEmpty("ln"))


The root cause is:
when a primary is deposed, the code only checks whether the bucket is a brq for the data region. 
{noformat}
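
The expected accounting can be sketched with a toy model. This is not Geode code: PrimaryQueueSizeModel and its methods are hypothetical, and the figures (4 buckets with 20 events each) are chosen only to mirror the 80 -> 40 change that the test diff above expects on vm1 once vm3 takes over some primary buckets.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of one member's queue-size stat; all names are hypothetical.
class PrimaryQueueSizeModel {
  private final Map<Integer, Integer> eventsPerBucket = new HashMap<>();
  private final Set<Integer> primaryBuckets = new HashSet<>();
  private int queueSize; // events in buckets for which this member is primary

  void becomePrimary(int bucketId) {
    if (primaryBuckets.add(bucketId)) {
      queueSize += eventsPerBucket.getOrDefault(bucketId, 0);
    }
  }

  /** Giving up a primary must also subtract that bucket's events from the stat. */
  void deposePrimary(int bucketId) {
    if (primaryBuckets.remove(bucketId)) {
      queueSize -= eventsPerBucket.getOrDefault(bucketId, 0);
    }
  }

  void enqueue(int bucketId) {
    eventsPerBucket.merge(bucketId, 1, Integer::sum);
    if (primaryBuckets.contains(bucketId)) {
      queueSize++;
    }
  }

  int getQueueSize() {
    return queueSize;
  }

  public static void main(String[] args) {
    PrimaryQueueSizeModel vm1 = new PrimaryQueueSizeModel();
    for (int bucket = 0; bucket < 4; bucket++) {
      vm1.becomePrimary(bucket);
      for (int i = 0; i < 20; i++) {
        vm1.enqueue(bucket);
      }
    }
    int before = vm1.getQueueSize();
    // Another member takes over two of the four primaries.
    vm1.deposePrimary(2);
    vm1.deposePrimary(3);
    System.out.println(before + " -> " + vm1.getQueueSize()); // prints "80 -> 40"
  }
}
```

If the subtraction in deposePrimary is skipped, the model stays at 80, which is the symptom reported in this issue.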





[jira] [Resolved] (GEODE-4868) when member deposed primary buckets, it did not decrease the queue size

2018-03-16 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4868.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

> when member deposed primary buckets, it did not decrease the queue size
> ---
>
> Key: GEODE-4868
> URL: https://issues.apache.org/jira/browse/GEODE-4868
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> The following test code can be used to reproduce the issue:
> diff --git 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
>  
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> index 545d0cac4..fbc0dc015 100644
> --- 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> +++ 
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/AsyncEventQueueTestBase.java
> @@ -717,6 +717,10 @@ public class AsyncEventQueueTestBase extends 
> JUnit4DistributedTestCase {
>        }
>      }
>      final AsyncEventQueueStats statistics = ((AsyncEventQueueImpl) 
> queue).getStatistics();
> +    Awaitility.await().atMost(60, TimeUnit.SECONDS)
> +    .until(() -> assertEquals("Expected queue entries: " + queueSize
> +        + " but actual entries: " + statistics.getEventQueueSize(), 
> queueSize,
> +        statistics.getEventQueueSize()));
>      assertEquals(queueSize, statistics.getEventQueueSize());
>      assertEquals(eventsReceived, statistics.getEventsReceived());
> diff --git 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
>  
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> index 465f35a87..058bf19cc 100644
> --- 
> a/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> +++ 
> b/geode-core/src/test/java/org/apache/geode/internal/cache/wan/asyncqueue/AsyncEventListenerDUnitTest.java
> @@ -1519,6 +1519,11 @@ public class AsyncEventListenerDUnitTest extends 
> AsyncEventQueueTestBase {
>          () -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + 
> "_PR"));
>  
>      LogWriterUtils.getLogWriter().info("Primary buckets on vm2: " + 
> primaryBucketsvm2);
> +    
> +    // before shutdown vm2, both vm1 and vm2 should have 40 events in 
> primary queue
> +    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40,
> 80, 80, 0));
> +    vm2.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 80, 80, 0));
> +    
>      //  Kill vm2 --
>      vm2.invoke(() -> AsyncEventQueueTestBase.killSender());
>      // 
> @@ -1527,15 +1532,26 @@ public class AsyncEventListenerDUnitTest extends 
> AsyncEventQueueTestBase {
>      vm3.invoke(createCacheRunnable(lnPort));
>      vm3.invoke(() -> 
> AsyncEventQueueTestBase.createAsyncEventQueueWithListener2("ln", true, 100, 5,
>          false, null));
> +    // vm3 will move some primary buckets from vm1, but vm1's primary queue 
> size did not reduce
> +    vm3.invoke(pauseAsyncEventQueueRunnable());
>      vm3.invoke(() -> 
> AsyncEventQueueTestBase.createPRWithRedundantCopyWithAsyncEventQueue(
>          getTestMethodName() + "_PR", "ln", isOffHeap()));
> -
> +    
>      // --
>      String regionName = getTestMethodName() + "_PR";
>      Set primaryBucketsvm3 = (Set) vm3
>          .invoke(() -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(regionName));
> +    LogWriterUtils.getLogWriter().info("Primary buckets on vm3: " + 
> primaryBucketsvm3);
> +    Set primaryBucketsvm1 = (Set) vm1.invoke(
> +            () -> 
> AsyncEventQueueTestBase.getAllPrimaryBucketsOnTheNode(getTestMethodName() + 
> "_PR"));
> +    LogWriterUtils.getLogWriter().info("After shutdown vm2, started vm3, 
> Primary buckets on vm1: " + primaryBucketsvm1);
>  
> +//    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 80, 80, 80, 0));
> +    vm1.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 80, 80, 0));
> +    vm3.invoke(()->AsyncEventQueueTestBase.checkAsyncEventQueueStats("ln", 
> 40, 0, 0, 0));
> +
> +    vm3.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
>      vm1.invoke(() -> AsyncEventQueueTestBase.resumeAsyncEventQueue("ln"));
>  
>    

[jira] [Created] (GEODE-4942) Secondary Gateway Sender queue after GII might leave some event un-drained

2018-03-26 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-4942:


 Summary: Secondary Gateway Sender queue after GII might leave some 
event un-drained
 Key: GEODE-4942
 URL: https://issues.apache.org/jira/browse/GEODE-4942
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: xiaojian zhou


We found this problem in #49196. It has been fixed several times. I just 
reproduced another scenario via 
ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2.

The test did not check that the secondary queue drains. With that check added, 
it sometimes fails because some events stay in the secondary queue forever. 
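
The missing drain check can be illustrated with a minimal, self-contained sketch. It does not use the Geode test helpers; DrainCheckSketch and awaitDrained are hypothetical names, and the queue is a plain in-memory collection standing in for a secondary queue. The idea is only to poll a size supplier until it reports zero or a timeout expires.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.IntSupplier;

// Hypothetical sketch: not part of the Geode test code.
public class DrainCheckSketch {

  /**
   * Polls sizeSupplier until it returns 0 or timeoutMillis elapses.
   * Returns true if the queue drained in time, false otherwise.
   */
  public static boolean awaitDrained(IntSupplier sizeSupplier, long timeoutMillis,
      long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (sizeSupplier.getAsInt() == 0) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and report the queue as not drained.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for a secondary queue that still holds one event.
    ConcurrentLinkedQueue<Long> secondary = new ConcurrentLinkedQueue<>();
    secondary.add(1357L);
    System.out.println("drained while stuck: " + awaitDrained(secondary::size, 200, 20));
    secondary.clear();
    System.out.println("drained after clear: " + awaitDrained(secondary::size, 200, 20));
  }
}
```

A check of this shape fails only after the timeout, so an event that stays in the secondary queue forever shows up as a test failure instead of going unnoticed.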



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (GEODE-4942) Secondary Gateway Sender queue after GII might leave some event un-drained

2018-03-26 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4942:


Assignee: xiaojian zhou

> Secondary Gateway Sender queue after GII might leave some event un-drained
> --
>
> Key: GEODE-4942
> URL: https://issues.apache.org/jira/browse/GEODE-4942
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> We found this problem in #49196. It has been fixed several times. I just 
> reproduced another scenario via 
> ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2.
> The test did not check that the secondary queue drains. With that check 
> added, it sometimes fails because some events stay in the secondary queue 
> forever. 





[jira] [Resolved] (GEODE-4942) Secondary Gateway Sender queue after GII might leave some event un-drained

2018-04-10 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4942.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

> Secondary Gateway Sender queue after GII might leave some event un-drained
> --
>
> Key: GEODE-4942
> URL: https://issues.apache.org/jira/browse/GEODE-4942
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We found this problem in #49196. It has been fixed several times. I just 
> reproduced another scenario via 
> ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2.
> The test did not check that the secondary queue drains. With that check 
> added, it sometimes fails because some events stay in the secondary queue 
> forever. 





[jira] [Assigned] (GEODE-5056) ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2 intermittently fail

2018-04-12 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-5056:


Assignee: xiaojian zhou

> ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2
>  intermittently fail 
> ---
>
> Key: GEODE-5056
> URL: https://issues.apache.org/jira/browse/GEODE-5056
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> After fixing GEODE-4942, I found at least one race condition that is not 
> covered. 
>  
> [vm6] [debug 2018/04/11 16:47:35.189 PDT  Processor2> tid=110] WAN: On primary bucket 57, setting the seq number as 1357
>  
> [vm7] [info 2018/04/11 16:47:35.150 PDT  
> tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}
>  
> [vm7] [debug 2018/04/11 16:47:35.189 PDT  10.118.19.25(27489):32781 shared ordered uid=7 port=59148> tid=95] WAN: 
> On secondary bucket 57, setting the seq number as 1357
> [vm7] [debug 2018/04/11 16:47:35.190 PDT  10.118.19.25(27489):32781 shared ordered uid=7 port=59148> tid=95] Key : 
> > 1357
> [vm6] [debug 2018/04/11 16:47:35.190 PDT  Processor2> tid=110] register dropped event for primary queue. BucketId is 
> 57, shadowKey is 1357, prQ is /ln_PARALLEL_GATEWAY_SENDER_QUEUE
>  
> - Note: vm6's sender is restarted and cleans up the map before the
> QueueRemovalMessage is sent out for the map.
> [vm6] [info 2018/04/11 16:47:35.249 PDT  
> tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}
> [vm6] [debug 2018/04/11 16:47:35.437 PDT  GatewaySender_ln_0> tid=118] BatchRemovalThread about to query the batch 
> removal map \{/ln_PARALLEL_GATEWAY_SENDER_QUEUE={96=[1396], 2=[1402], 
> 83=[1383], 6=[1406], 71=[1371], 87=[1387], 73=[1373], 90=[1390], 77=[1377], 
> 94=[1394]}}
> [vm6] [debug 2018/04/11 16:47:35.753 PDT  GatewaySender_ln_0> tid=118] BatchRemovalThread about to query the batch 
> removal map {/ln_PARALLEL_GATEWAY_SENDER_QUEUE={49=[1449], 65=[1465], 
> 83=[1483], 53=[1453], 71=[1471], 87=[1487], *57=[1457]*, 73=[1473], 
> 77=[1477], 62=[1462]}}
>  shadowKey 1457 was created after the sender is restarted
>  
> [vm6] [debug 2018/04/11 16:47:35.438 PDT  GatewaySender_ln_0> tid=118] Sending (ParallelQueueRemovalMessage@2344969b 
> processorId=0 sender=10.118.19.25(27489):32781) to 3 peers 
> ([10.118.19.25(27492):32783@4(GEODE 1.6.0), 
> 10.118.19.25(27485):32779@1(GEODE 1.6.0), 
> 10.118.19.25(27482):32778@2(GEODE 1.6.0)]) via tcp/ip
> [vm7] [debug 2018/04/11 16:47:35.439 PDT  10.118.19.25(27489):32781 shared unordered uid=4 port=59119> tid=52] 
> Received message 'ParallelQueueRemovalMessage@11583f5b processorId=0 
> sender=10.118.19.25(27489):32781' from <10.118.19.25(27489):32781>
>  
> That is, the dropped key was in the map, but the sender was closed and 
> cleared the map before a QueueRemovalMessage was sent. 





[jira] [Created] (GEODE-5056) ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2 intermittently fail

2018-04-12 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-5056:


 Summary: 
ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2
 intermittently fail 
 Key: GEODE-5056
 URL: https://issues.apache.org/jira/browse/GEODE-5056
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: xiaojian zhou


After fixing GEODE-4942, I found at least one race condition that is not 
covered. 

 

[vm6] [debug 2018/04/11 16:47:35.189 PDT  
tid=110] WAN: On primary bucket 57, setting the seq number as 1357

 

[vm7] [info 2018/04/11 16:47:35.150 PDT  
tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}

 

[vm7] [debug 2018/04/11 16:47:35.189 PDT :32781 shared ordered uid=7 port=59148> tid=95] WAN: On 
secondary bucket 57, setting the seq number as 1357

[vm7] [debug 2018/04/11 16:47:35.190 PDT :32781 shared ordered uid=7 port=59148> tid=95] Key : 
> 1357

[vm6] [debug 2018/04/11 16:47:35.190 PDT  
tid=110] register dropped event for primary queue. BucketId is 57, shadowKey is 
1357, prQ is /ln_PARALLEL_GATEWAY_SENDER_QUEUE

 

- Note: vm6's sender is restarted and cleans up the map before the
QueueRemovalMessage is sent out for the map.

[vm6] [info 2018/04/11 16:47:35.249 PDT  
tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}

[vm6] [debug 2018/04/11 16:47:35.437 PDT  tid=118] BatchRemovalThread about to query the batch 
removal map \{/ln_PARALLEL_GATEWAY_SENDER_QUEUE={96=[1396], 2=[1402], 
83=[1383], 6=[1406], 71=[1371], 87=[1387], 73=[1373], 90=[1390], 77=[1377], 
94=[1394]}}

[vm6] [debug 2018/04/11 16:47:35.753 PDT  tid=118] BatchRemovalThread about to query the batch 
removal map {/ln_PARALLEL_GATEWAY_SENDER_QUEUE={49=[1449], 65=[1465], 
83=[1483], 53=[1453], 71=[1471], 87=[1487], *57=[1457]*, 73=[1473], 77=[1477], 
62=[1462]}}

 shadowKey 1457 was created after the sender is restarted

 

[vm6] [debug 2018/04/11 16:47:35.438 PDT  tid=118] Sending (ParallelQueueRemovalMessage@2344969b 
processorId=0 sender=10.118.19.25(27489):32781) to 3 peers 
([10.118.19.25(27492):32783@4(GEODE 1.6.0), 
10.118.19.25(27485):32779@1(GEODE 1.6.0), 
10.118.19.25(27482):32778@2(GEODE 1.6.0)]) via tcp/ip

[vm7] [debug 2018/04/11 16:47:35.439 PDT :32781 shared unordered uid=4 port=59119> tid=52] 
Received message 'ParallelQueueRemovalMessage@11583f5b processorId=0 
sender=10.118.19.25(27489):32781' from <10.118.19.25(27489):32781>

 

That is, the dropped key was in the map, but the sender was closed and 
cleared the map before a QueueRemovalMessage was sent. 
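
The interleaving in the logs above can be replayed in a small, self-contained model. This is only a sketch under assumptions: DroppedEventRaceSketch and its methods are hypothetical names, the maps are plain collections standing in for the batch removal map and the secondary queue, and no Geode classes are involved. It shows why clearing the map on restart before the removal message goes out leaves the key in the secondary queue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical single-threaded model of the race; not Geode code.
public class DroppedEventRaceSketch {
  // bucketId -> shadow keys waiting for a removal message (primary side)
  private final Map<Integer, Set<Long>> removalMap = new HashMap<>();
  // shadow keys currently held by the secondary queue
  private final Set<Long> secondaryQueue = new HashSet<>();

  void secondaryEnqueue(long shadowKey) {
    secondaryQueue.add(shadowKey);
  }

  void registerDropped(int bucketId, long shadowKey) {
    removalMap.computeIfAbsent(bucketId, b -> new HashSet<>()).add(shadowKey);
  }

  // Restarting the sender clears the pending removals.
  void restartSender() {
    removalMap.clear();
  }

  // The removal message tells the secondary to drop every pending key.
  void sendRemovalMessage() {
    for (Set<Long> keys : removalMap.values()) {
      secondaryQueue.removeAll(keys);
    }
    removalMap.clear();
  }

  /** Returns how many keys remain in the secondary queue after one run. */
  public static int leftover(boolean restartBeforeSend) {
    DroppedEventRaceSketch model = new DroppedEventRaceSketch();
    model.secondaryEnqueue(1357L); // secondary bucket 57 stores the key
    model.registerDropped(57, 1357L); // primary drops the event and records it
    if (restartBeforeSend) {
      model.restartSender(); // the map is cleared before the message is sent
    }
    model.sendRemovalMessage();
    return model.secondaryQueue.size();
  }

  public static void main(String[] args) {
    System.out.println("restart before send: " + leftover(true));
    System.out.println("no restart:          " + leftover(false));
  }
}
```

In this model the key is removed from the secondary only when the removal message is sent while the key is still in the map, which matches the observation that shadowKey 1357 never appears in a later batch removal map.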





[jira] [Resolved] (GEODE-5056) ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2 intermittently fail

2018-04-13 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-5056.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

> ParallelGatewaySenderOperationsDUnitTest.testParallelPropagationSenderStartAfterStop_Scenario2
>  intermittently fail 
> ---
>
> Key: GEODE-5056
> URL: https://issues.apache.org/jira/browse/GEODE-5056
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After fixing GEODE-4942, I found at least one race condition that is not 
> covered. 
>  
> [vm6] [debug 2018/04/11 16:47:35.189 PDT  Processor2> tid=110] WAN: On primary bucket 57, setting the seq number as 1357
>  
> [vm7] [info 2018/04/11 16:47:35.150 PDT  
> tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}
>  
> [vm7] [debug 2018/04/11 16:47:35.189 PDT  10.118.19.25(27489):32781 shared ordered uid=7 port=59148> tid=95] WAN: 
> On secondary bucket 57, setting the seq number as 1357
> [vm7] [debug 2018/04/11 16:47:35.190 PDT  10.118.19.25(27489):32781 shared ordered uid=7 port=59148> tid=95] Key : 
> > 1357
> [vm6] [debug 2018/04/11 16:47:35.190 PDT  Processor2> tid=110] register dropped event for primary queue. BucketId is 
> 57, shadowKey is 1357, prQ is /ln_PARALLEL_GATEWAY_SENDER_QUEUE
>  
> - Note: vm6's sender is restarted and cleans up the map before the
> QueueRemovalMessage is sent out for the map.
> [vm6] [info 2018/04/11 16:47:35.249 PDT  
> tid=19] Started  ParallelGatewaySender\{id=ln,remoteDsId=2,isRunning =true}
> [vm6] [debug 2018/04/11 16:47:35.437 PDT  GatewaySender_ln_0> tid=118] BatchRemovalThread about to query the batch 
> removal map \{/ln_PARALLEL_GATEWAY_SENDER_QUEUE={96=[1396], 2=[1402], 
> 83=[1383], 6=[1406], 71=[1371], 87=[1387], 73=[1373], 90=[1390], 77=[1377], 
> 94=[1394]}}
> [vm6] [debug 2018/04/11 16:47:35.753 PDT  GatewaySender_ln_0> tid=118] BatchRemovalThread about to query the batch 
> removal map {/ln_PARALLEL_GATEWAY_SENDER_QUEUE={49=[1449], 65=[1465], 
> 83=[1483], 53=[1453], 71=[1471], 87=[1487], *57=[1457]*, 73=[1473], 
> 77=[1477], 62=[1462]}}
>  shadowKey 1457 was created after the sender is restarted
>  
> [vm6] [debug 2018/04/11 16:47:35.438 PDT  GatewaySender_ln_0> tid=118] Sending (ParallelQueueRemovalMessage@2344969b 
> processorId=0 sender=10.118.19.25(27489):32781) to 3 peers 
> ([10.118.19.25(27492):32783@4(GEODE 1.6.0), 
> 10.118.19.25(27485):32779@1(GEODE 1.6.0), 
> 10.118.19.25(27482):32778@2(GEODE 1.6.0)]) via tcp/ip
> [vm7] [debug 2018/04/11 16:47:35.439 PDT  10.118.19.25(27489):32781 shared unordered uid=4 port=59119> tid=52] 
> Received message 'ParallelQueueRemovalMessage@11583f5b processorId=0 
> sender=10.118.19.25(27489):32781' from <10.118.19.25(27489):32781>
>  
> That is, the dropped key was in the map, but the sender was closed and 
> cleared the map before a QueueRemovalMessage was sent. 





[jira] [Resolved] (GEODE-4647) Add a new stat for AyncEventQueue/GatewaySender to track secondaryEventsQueueSize

2018-04-16 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4647.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

> Add a new stat for AyncEventQueue/GatewaySender to track 
> secondaryEventsQueueSize
> -
>
> Key: GEODE-4647
> URL: https://issues.apache.org/jira/browse/GEODE-4647
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Reporter: Jason Huynh
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Currently we have eventsQueueSize which tells us how big the queue is based 
> on how many primary events are in the queue.
> It would be nice to have the same type of stat for how many secondary events 
> are in the queue.
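
As a rough illustration of the requested stat, the sketch below keeps one counter for primary events and one for secondary events and moves counts between them when a bucket changes role. It is an assumption-laden model, not the Geode statistics implementation: QueueStatsSketch and all of its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of paired primary/secondary queue-size counters.
public class QueueStatsSketch {
  private final AtomicInteger eventQueueSize = new AtomicInteger();
  private final AtomicInteger secondaryEventQueueSize = new AtomicInteger();

  public void primaryEnqueued() {
    eventQueueSize.incrementAndGet();
  }

  public void secondaryEnqueued() {
    secondaryEventQueueSize.incrementAndGet();
  }

  public void primaryRemoved() {
    eventQueueSize.decrementAndGet();
  }

  public void secondaryRemoved() {
    secondaryEventQueueSize.decrementAndGet();
  }

  /** A bucket holding the given number of events changes from primary to secondary. */
  public void bucketDeposed(int events) {
    eventQueueSize.addAndGet(-events);
    secondaryEventQueueSize.addAndGet(events);
  }

  /** A bucket holding the given number of events changes from secondary to primary. */
  public void bucketPromoted(int events) {
    secondaryEventQueueSize.addAndGet(-events);
    eventQueueSize.addAndGet(events);
  }

  public int getEventQueueSize() {
    return eventQueueSize.get();
  }

  public int getSecondaryEventQueueSize() {
    return secondaryEventQueueSize.get();
  }
}
```

Keeping the two counters separate means the sum stays constant across a role change, so a primary count that does not drop after buckets move (as in GEODE-4868) is visible as an inconsistency.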





[jira] [Resolved] (GEODE-4624) Add a new stat for AyncEventQueue/GatewaySender to track the processing of queueRemovals

2018-04-16 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-4624.
--
   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in GEODE-4647, revision 69815f90ad22c10b0a9b7236c02f9bf04ec28223.

> Add a new stat for AyncEventQueue/GatewaySender to track the processing of 
> queueRemovals
> 
>
> Key: GEODE-4624
> URL: https://issues.apache.org/jira/browse/GEODE-4624
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Affects Versions: 1.5.0
>Reporter: Shelley Lynn Hughes-Godfrey
>Assignee: xiaojian zhou
>Priority: Major
> Fix For: 1.6.0
>
>
> We currently track the number of events queued, the queue size, and 
> eventsDistributed, but we don't track the number of events removed via 
> queue removal. 





[jira] [Created] (GEODE-5087) SerialGatewaySenderOperationsDUnitTest.testRestartSerialGatewaySendersWhilePutting intermittently fail

2018-04-16 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-5087:


 Summary: 
SerialGatewaySenderOperationsDUnitTest.testRestartSerialGatewaySendersWhilePutting
 intermittently fail 
 Key: GEODE-5087
 URL: https://issues.apache.org/jira/browse/GEODE-5087
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: xiaojian zhou


I introduced this dunit test for GEODE-4942. While GEODE-4942 is for the 
parallel gateway sender, this test is for the serial gateway sender. I did 
reproduce the issue: an event is dropped at the primary sender (because it is 
not running yet), but the event has already been put into unprocessedEventsMap 
in the secondary sender's queue and will stay there forever.  





[jira] [Assigned] (GEODE-5087) SerialGatewaySenderOperationsDUnitTest.testRestartSerialGatewaySendersWhilePutting intermittently fail

2018-04-16 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-5087:


Assignee: xiaojian zhou

> SerialGatewaySenderOperationsDUnitTest.testRestartSerialGatewaySendersWhilePutting
>  intermittently fail 
> ---
>
> Key: GEODE-5087
> URL: https://issues.apache.org/jira/browse/GEODE-5087
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> I introduced this dunit test for GEODE-4942. While GEODE-4942 is for the 
> parallel gateway sender, this test is for the serial gateway sender. I did 
> reproduce the issue: an event is dropped at the primary sender (because it 
> is not running yet), but the event has already been put into 
> unprocessedEventsMap in the secondary sender's queue and will stay there 
> forever.  





[jira] [Updated] (GEODE-4624) Add a new stat for AyncEventQueue/GatewaySender to track the processing of queueRemovals

2018-04-17 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4624:
-
Fix Version/s: (was: 1.6.0)
   1.7.0

> Add a new stat for AyncEventQueue/GatewaySender to track the processing of 
> queueRemovals
> 
>
> Key: GEODE-4624
> URL: https://issues.apache.org/jira/browse/GEODE-4624
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Affects Versions: 1.5.0
>Reporter: Shelley Lynn Hughes-Godfrey
>Assignee: xiaojian zhou
>Priority: Major
> Fix For: 1.7.0
>
>
> We currently track the number of events queued, the queue size, and 
> eventsDistributed, but we don't track the number of events removed via 
> queue removal. 





[jira] [Updated] (GEODE-4647) Add a new stat for AyncEventQueue/GatewaySender to track secondaryEventsQueueSize

2018-04-17 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-4647:
-
Fix Version/s: (was: 1.6.0)
   1.7.0

> Add a new stat for AyncEventQueue/GatewaySender to track 
> secondaryEventsQueueSize
> -
>
> Key: GEODE-4647
> URL: https://issues.apache.org/jira/browse/GEODE-4647
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Reporter: Jason Huynh
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Currently we have eventsQueueSize which tells us how big the queue is based 
> on how many primary events are in the queue.
> It would be nice to have the same type of stat for how many secondary events 
> are in the queue.





[jira] [Assigned] (GEODE-4624) Add a new stat for AyncEventQueue/GatewaySender to track the processing of queueRemovals

2018-04-17 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-4624:


Assignee: xiaojian zhou  (was: Karen Smoler Miller)

> Add a new stat for AyncEventQueue/GatewaySender to track the processing of 
> queueRemovals
> 
>
> Key: GEODE-4624
> URL: https://issues.apache.org/jira/browse/GEODE-4624
> Project: Geode
>  Issue Type: Bug
>  Components: docs, wan
>Affects Versions: 1.5.0
>Reporter: Shelley Lynn Hughes-Godfrey
>Assignee: xiaojian zhou
>Priority: Major
> Fix For: 1.7.0
>
>
> We currently track the number of events queued, the queue size, and 
> eventsDistributed, but we don't track the number of events removed via 
> queue removal. 





[jira] [Assigned] (GEODE-5106) CI Failure: ParallelWANConflationOffHeapDUnitTest.testParallelPropagationBatchConflation failed with AssertionError

2018-04-18 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-5106:


Assignee: xiaojian zhou

> CI Failure: 
> ParallelWANConflationOffHeapDUnitTest.testParallelPropagationBatchConflation 
> failed with AssertionError
> ---
>
> Key: GEODE-5106
> URL: https://issues.apache.org/jira/browse/GEODE-5106
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Barry Oglesby
>Assignee: xiaojian zhou
>Priority: Major
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/282]
> {noformat}
> org.apache.geode.internal.cache.wan.offheap.ParallelWANConflationOffHeapDUnitTest
>  > testParallelPropagationBatchConflation FAILED
> java.lang.AssertionError: Event in secondary queue should be 0 after 
> dispatched expected:<0> but was:<11>
> {noformat}





[jira] [Resolved] (GEODE-5106) CI Failure: ParallelWANConflationOffHeapDUnitTest.testParallelPropagationBatchConflation failed with AssertionError

2018-04-18 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-5106.
--
   Resolution: Fixed
Fix Version/s: 1.7.0

> CI Failure: 
> ParallelWANConflationOffHeapDUnitTest.testParallelPropagationBatchConflation 
> failed with AssertionError
> ---
>
> Key: GEODE-5106
> URL: https://issues.apache.org/jira/browse/GEODE-5106
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Barry Oglesby
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/282]
> {noformat}
> org.apache.geode.internal.cache.wan.offheap.ParallelWANConflationOffHeapDUnitTest
>  > testParallelPropagationBatchConflation FAILED
> java.lang.AssertionError: Event in secondary queue should be 0 after 
> dispatched expected:<0> but was:<11>
> {noformat}





[jira] [Created] (GEODE-5123) CI Falure: org.apache.geode.session.tests.Tomcat6ClientServerTest: No space left on device for /tmp

2018-04-23 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-5123:


 Summary: CI Falure: 
org.apache.geode.session.tests.Tomcat6ClientServerTest: No space left on device 
for /tmp
 Key: GEODE-5123
 URL: https://issues.apache.org/jira/browse/GEODE-5123
 Project: Geode
  Issue Type: Bug
  Components: http session
Reporter: xiaojian zhou


{noformat}
Found in 
http://concourse.gemfire.pivotal.io/teams/main/pipelines/gemfire-9.5/jobs/DistributedTest/builds/11
To download the test artifacts from this job, execute the following command 
after the job has completed:

 aws s3 cp 
s3://gemfire-build-artifacts/9.5/9.5.0-build.7/1524352838/distributedtestfiles-9.5.0-build.7.tgz
 .

java.nio.file.FileSystemException: 
/tmp/build/ae3c03f4/built-gemfire/test/gemfire/open/geode-assembly/build/install/apache-geode/lib/fastutil-8.1.1.jar
 -> 
/tmp/cargo_containers/Tomcat6ClientServerTest/apache-tomcat-6.0.37/apache-tomcat-6.0.37/lib/fastutil-8.1.1.jar:
 No space left on device
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:253)
at sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:581)
at 
sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
at java.nio.file.Files.copy(Files.java:1274)
at 
org.apache.geode.session.tests.TomcatInstall.copyTomcatGeodeReqFiles(TomcatInstall.java:309)
at 
org.apache.geode.session.tests.TomcatInstall.<init>(TomcatInstall.java:148)
at 
org.apache.geode.session.tests.TomcatInstall.<init>(TomcatInstall.java:127)
at 
org.apache.geode.session.tests.Tomcat6ClientServerTest.setupTomcatInstall(Tomcat6ClientServerTest.java:32)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:147)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:129)

[jira] [Updated] (GEODE-5123) CI Falure: org.apache.geode.session.tests.Jetty9CachingClientServerTest: containersShouldShareDataRemovals

2018-04-23 Thread xiaojian zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-5123:
-
Description: 
{noformat}
java.io.IOException: Http request to localhost[22563] failed. HTTP/1.1 503 
Service Unavailable
at org.apache.geode.session.tests.Client.doRequest(Client.java:229)
at org.apache.geode.session.tests.Client.get(Client.java:95)
at org.apache.geode.session.tests.Client.get(Client.java:84)
at 
org.apache.geode.session.tests.CargoTestBase.getKeyValueDataOnAllClients(CargoTestBase.java:86)
at 
org.apache.geode.session.tests.CargoTestBase.containersShouldShareDataRemovals(CargoTestBase.java:294)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:147)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:129)
at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at 
org.gradle.i

[jira] [Commented] (GEODE-5123) CI Falure: org.apache.geode.session.tests.Jetty9CachingClientServerTest: containersShouldShareDataRemovals

2018-04-23 Thread xiaojian zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448657#comment-16448657
 ] 

xiaojian zhou commented on GEODE-5123:
--

We suspect this is due to running out of memory. GEODE-5121 might have the 
same root cause. 

> CI Falure: org.apache.geode.session.tests.Jetty9CachingClientServerTest: 
> containersShouldShareDataRemovals
> --
>
> Key: GEODE-5123
> URL: https://issues.apache.org/jira/browse/GEODE-5123
> Project: Geode
>  Issue Type: Bug
>  Components: http session
>Reporter: xiaojian zhou
>Priority: Major
>
> {noformat}
> java.io.IOException: Http request to localhost[22563] failed. HTTP/1.1 503 
> Service Unavailable
>   at org.apache.geode.session.tests.Client.doRequest(Client.java:229)
>   at org.apache.geode.session.tests.Client.get(Client.java:95)
>   at org.apache.geode.session.tests.Client.get(Client.java:84)
>   at 
> org.apache.geode.session.tests.CargoTestBase.getKeyValueDataOnAllClients(CargoTestBase.java:86)
>   at 
> org.apache.geode.session.tests.CargoTestBase.containersShouldShareDataRemovals(CargoTestBase.java:294)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>

[jira] [Created] (GEODE-5124) org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest

2018-04-23 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-5124:


 Summary: 
org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest
 Key: GEODE-5124
 URL: https://issues.apache.org/jira/browse/GEODE-5124
 Project: Geode
  Issue Type: Bug
  Components: http session
Reporter: xiaojian zhou


{noformat}
http://concourse.gemfire.pivotal.io/teams/main/pipelines/gemfire-9.5/jobs/DistributedTest/builds/10

Unfortunately, the oldest run whose log we can hijack is #11. Here are the 
error messages shown in the console:

:geode-assembly:distributedTest

org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest > 
tomcat7079WithOldModulesMixedWithCurrentCanDoPutFromCurrentModule[0] FAILED
org.junit.ComparisonFailure: [
No longer connected to 34d0ceedc76e[1099].
.
No longer connected to 34d0ceedc76e[1099].
...The Cache Server process terminated unexpectedly with exit 
status 1. Please refer to the log file in /tmp/junit8988160276633658413/server 
for full details.

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in 
[jar:file:/tmp/cargo_containers/Tomcat7079AndCurrentModules/apache-tomcat-7.0.79/apache-tomcat-7.0.79/lib/slf4j-jdk14-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in 
[jar:file:/tmp/build/ae3c03f4/built-gemfire/test/geode/geode-assembly/build/install/apache-geode/lib/log4j-slf4j-impl-2.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]

Exception in thread "main" org.apache.geode.GemFireConfigException: Unable 
to join the distributed system.  Operation either timed out, was stopped or 
Locator does not exist.

at 
org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:661)

at 
org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:747)

at 
org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:191)

at 
org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:106)

at 
org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1027)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1061)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:554)

at 
org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:763)

at 
org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:355)

at 
org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:343)

at 
org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:335)

at 
org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:211)

at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:219)

at 
org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)

at 
org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:844)

at 
org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:762)

at 
org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:692)

at 
org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:225)



] expected:<[OK]> but was:<[ERROR]>
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at 
org.apache.geode.test.junit.assertions.CommandResultAssert.statusIsSuccess(CommandResultAssert.java:102)
at 
org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest.startServer(TomcatSessionBackwardsCompatibilityTest.java:104)
at 
org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest.startClusterWithTomcat(TomcatSessionBackwardsCompatibilityTest.java:160)
at 
org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTest.tomcat7079WithOldModulesMixedWithCurrentCanDoPutFromCurrentModule(TomcatSessionBackward

[jira] [Updated] (GEODE-6277) CI failure: DistributedNoAckRegionDUnitTest.testNBRegionDestructionDuringGetInitialImage

2019-06-17 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-6277:
-
Labels: GeodeCommons pull-request-available  (was: pull-request-available)

> CI failure: 
> DistributedNoAckRegionDUnitTest.testNBRegionDestructionDuringGetInitialImage
> 
>
> Key: GEODE-6277
> URL: https://issues.apache.org/jira/browse/GEODE-6277
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.9.0
>Reporter: Ryan McMahon
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: GeodeCommons, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  
> {code:java}
> org.apache.geode.cache30.DistributedNoAckRegionDUnitTest > 
> testNBRegionDestructionDuringGetInitialImage FAILED
>  java.lang.AssertionError: asyncGII failed
> Caused by:
>  org.junit.ComparisonFailure: expected:<[tru]e> but was:<[fals]e>
> {code}
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/288|http://example.com/]
> This appears to be part of a larger flakey failure ticket, but this 
> particular failure was not addressed.  Original flakey failure ticket:
> [https://issues.apache.org/jira/browse/GEODE-5412?jql=text%20~%20%22testNBRegionDestructionDuringGetInitialImage%22|http://example.com/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (GEODE-6899) retried client should set last try's version tag if found

2019-06-21 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-6899:


 Summary: retried client should set last try's version tag if found
 Key: GEODE-6899
 URL: https://issues.apache.org/jira/browse/GEODE-6899
 Project: Geode
  Issue Type: Bug
Reporter: xiaojian zhou


A client does a put to serverA on a replicated region, and serverA distributes 
the operation to B and C. A is killed before the distribution arrives at C, so 
the client may retry against C. C notices that this is a retry and searches 
for the previous try's version tag. 

The tag that is found should be set into the current event. Interestingly, 
other operations such as PutIfAbsent and create already do that, but replace 
(i.e. put) does not. 

Another issue: GEODE-6802 introduced synchronizeIfNotScheduled(), but there 
can be a race in which the membershipListener is also scheduling. The fix is 
to pause 1 second before calling the newly introduced 
synchronizeIfNotScheduled(). 
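
The retry handling described above (reuse the previous try's version tag when 
one is found) can be sketched roughly as follows. This is only an 
illustration under stated assumptions: SketchEvent, SketchServer, the string 
version tags and the event-id lookup are all hypothetical stand-ins, not 
Geode's real classes or its actual fix.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a client event; NOT Geode's real event class.
final class SketchEvent {
  final String eventId;            // identifies the client operation across retries
  final boolean possibleDuplicate; // true when the client marks the operation as a retry
  String versionTag;               // null until a tag is assigned or recovered

  SketchEvent(String eventId, boolean possibleDuplicate) {
    this.eventId = eventId;
    this.possibleDuplicate = possibleDuplicate;
  }
}

// Hypothetical stand-in for a server such as "C" in the description.
final class SketchServer {
  // Version tags of operations this server has already applied, keyed by event id.
  private final Map<String, String> appliedTags = new HashMap<>();
  private int nextVersion = 1;

  // Records an operation that reached this server through distribution.
  void applyDistributed(String eventId, String versionTag) {
    appliedTags.put(eventId, versionTag);
  }

  // Handles a put sent directly by a client, which may be a retry.
  String put(SketchEvent event) {
    if (event.possibleDuplicate) {
      String previous = appliedTags.get(event.eventId);
      if (previous != null) {
        // The step the ticket is about: set the found tag into the current event.
        event.versionTag = previous;
        return previous;
      }
    }
    // Not a retry, or no earlier tag was found: assign a fresh tag.
    event.versionTag = "v" + nextVersion++;
    appliedTags.put(event.eventId, event.versionTag);
    return event.versionTag;
  }
}
```

In this toy model a retried operation whose earlier attempt already reached 
the server keeps its original tag, while an operation that was never seen 
gets a new one.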





[jira] [Created] (GEODE-6908) Some retry operation from client did not set setPossibleDuplicate in event

2019-06-25 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-6908:


 Summary: Some retry operation from client did not set 
setPossibleDuplicate in event
 Key: GEODE-6908
 URL: https://issues.apache.org/jira/browse/GEODE-6908
 Project: Geode
  Issue Type: Bug
Reporter: xiaojian zhou


I fixed bug GEODE-6899, where a retried UPDATE did not set the version tag 
found from the previous try into the event. 

I searched the rest of the code and found one more place: basicBridgeRemove() 
should also add the following lines, as in basicBridgeDestroy:

  // if this is a replayed operation we may already have a version tag
  event.setVersionTag(clientEvent.getVersionTag());
  event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());

BTW, basicBridgeDestroy(), basicBridgeUpdateVersionStamp(), and 
basicBridgeInvalidate() call 
event.setVersionTag(clientEvent.getVersionTag()); but they do not call 
"event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());"

I think it's better to keep all of this code consistent. 
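
One way to keep these call sites consistent would be to copy both fields in a 
single shared helper. The sketch below is an assumption, not what the Geode 
code base does: ReplayEvent is a hypothetical minimal event type, and 
ReplayState.copy is a hypothetical helper; only the two setter calls are 
taken from the lines quoted above.

```java
// Hypothetical minimal event type; Geode's real event classes are far richer.
final class ReplayEvent {
  private Object versionTag;
  private boolean possibleDuplicate;

  Object getVersionTag() {
    return versionTag;
  }

  void setVersionTag(Object tag) {
    this.versionTag = tag;
  }

  boolean isPossibleDuplicate() {
    return possibleDuplicate;
  }

  void setPossibleDuplicate(boolean value) {
    this.possibleDuplicate = value;
  }
}

// Hypothetical helper that every basicBridge* style method could call.
final class ReplayState {
  private ReplayState() {}

  // Copies both replay-related fields so a caller cannot set one and forget the other.
  static void copy(ReplayEvent clientEvent, ReplayEvent event) {
    // if this is a replayed operation we may already have a version tag
    event.setVersionTag(clientEvent.getVersionTag());
    event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());
  }
}
```

With a helper of this shape, each operation would make one call instead of 
repeating the two setters by hand.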





[jira] [Updated] (GEODE-6908) Some retry operation from client did not set setPossibleDuplicate in event

2019-06-25 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-6908:
-
Labels: GeodeCommons  (was: )

> Some retry operation from client did not set setPossibleDuplicate in event
> --
>
> Key: GEODE-6908
> URL: https://issues.apache.org/jira/browse/GEODE-6908
> Project: Geode
>  Issue Type: Bug
>Reporter: xiaojian zhou
>Priority: Major
>  Labels: GeodeCommons
>
> I fixed bug GEODE-6899, where a retried UPDATE did not set the version tag 
> found from the previous try into the event. 
> I searched the rest of the code and found one more place: 
> basicBridgeRemove() should also add the following lines, as in 
> basicBridgeDestroy:
>   // if this is a replayed operation we may already have a version tag
>   event.setVersionTag(clientEvent.getVersionTag());
>   event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());
> BTW, basicBridgeDestroy(), basicBridgeUpdateVersionStamp(), and 
> basicBridgeInvalidate() call 
> event.setVersionTag(clientEvent.getVersionTag()); but they do not call 
> "event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());"
> I think it's better to keep all of this code consistent. 





[jira] [Assigned] (GEODE-6908) Some retry operation from client did not set setPossibleDuplicate in event

2019-06-25 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-6908:


Assignee: xiaojian zhou

> Some retry operation from client did not set setPossibleDuplicate in event
> --
>
> Key: GEODE-6908
> URL: https://issues.apache.org/jira/browse/GEODE-6908
> Project: Geode
>  Issue Type: Bug
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: GeodeCommons
>
> I fixed bug GEODE-6899, where a retried UPDATE did not set the version tag 
> found from the previous try into the event. 
> I searched the rest of the code and found one more place: 
> basicBridgeRemove() should also add the following lines, as in 
> basicBridgeDestroy:
>   // if this is a replayed operation we may already have a version tag
>   event.setVersionTag(clientEvent.getVersionTag());
>   event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());
> BTW, basicBridgeDestroy(), basicBridgeUpdateVersionStamp(), and 
> basicBridgeInvalidate() call 
> event.setVersionTag(clientEvent.getVersionTag()); but they do not call 
> "event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());"
> I think it's better to keep all of this code consistent. 





[jira] [Resolved] (GEODE-6899) retried client should set last try's version tag if found

2019-06-27 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-6899.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

> retried client should set last try's version tag if found
> -
>
> Key: GEODE-6899
> URL: https://issues.apache.org/jira/browse/GEODE-6899
> Project: Geode
>  Issue Type: Bug
>Reporter: xiaojian zhou
>Priority: Major
> Fix For: 1.10.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> A client does a put to serverA on a replicated region, and serverA 
> distributes the operation to B and C. A is killed before the distribution 
> arrives at C, so the client may retry against C. C notices that this is a 
> retry and searches for the previous try's version tag. 
> The tag that is found should be set into the current event. Interestingly, 
> other operations such as PutIfAbsent and create already do that, but replace 
> (i.e. put) does not. 
> Another issue: GEODE-6802 introduced synchronizeIfNotScheduled(), but there 
> can be a race in which the membershipListener is also scheduling. The fix is 
> to pause 1 second before calling the newly introduced 
> synchronizeIfNotScheduled(). 





[jira] [Created] (GEODE-6925) Added DUnit test to verify behavior of repeated rebalance operations

2019-06-27 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-6925:


 Summary: Added DUnit test to verify behavior of repeated rebalance 
operations
 Key: GEODE-6925
 URL: https://issues.apache.org/jira/browse/GEODE-6925
 Project: Geode
  Issue Type: Bug
  Components: gfsh
Reporter: xiaojian zhou


There have been reports that rebalance is often run multiple times in rapid 
succession to assure the desired result is achieved, but an immediate second 
rebalance shouldn't produce a different result on a region if rerun before 
additional region operations are performed. Since we don't have any artifacts 
on a reproduction of this issue, we were unable to reproduce it ourselves but 
this test contributes to the coverage of rebalance and may help us diagnose 
this issue in the future if it appears.
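
The property such a test checks (an immediate second rebalance changes 
nothing) can be illustrated with a toy model. The sketch below is not Geode's 
rebalance algorithm and not the DUnit test itself: RebalanceSketch and the 
per-member bucket-count array are hypothetical, chosen only to show the 
"second run moves nothing" assertion.

```java
// Toy model: each array slot is the number of buckets hosted by one member.
final class RebalanceSketch {
  private RebalanceSketch() {}

  // Evens out bucket counts in place and returns how many buckets were moved.
  static int rebalance(int[] bucketsPerMember) {
    if (bucketsPerMember.length == 0) {
      return 0;
    }
    int moved = 0;
    while (true) {
      int min = 0;
      int max = 0;
      for (int i = 1; i < bucketsPerMember.length; i++) {
        if (bucketsPerMember[i] < bucketsPerMember[min]) {
          min = i;
        }
        if (bucketsPerMember[i] > bucketsPerMember[max]) {
          max = i;
        }
      }
      // Balanced once the fullest and emptiest members differ by at most one bucket.
      if (bucketsPerMember[max] - bucketsPerMember[min] <= 1) {
        return moved;
      }
      // Move one bucket from the fullest member to the emptiest member.
      bucketsPerMember[max]--;
      bucketsPerMember[min]++;
      moved++;
    }
  }
}
```

In this model, calling rebalance twice in a row with no other operations in 
between makes the second call return 0 and leave the array untouched, which 
is the behavior the description says a repeated rebalance should have.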






[jira] [Resolved] (GEODE-6925) Added DUnit test to verify behavior of repeated rebalance operations

2019-06-27 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-6925.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

Fixed in 0736e1509601488d2fa042ad3e4bb91c572004cc

> Added DUnit test to verify behavior of repeated rebalance operations
> 
>
> Key: GEODE-6925
> URL: https://issues.apache.org/jira/browse/GEODE-6925
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: xiaojian zhou
>Priority: Major
> Fix For: 1.10.0
>
>
> There have been reports that rebalance is often run multiple times in rapid 
> succession to assure the desired result is achieved, but an immediate second 
> rebalance shouldn't produce a different result on a region if rerun before 
> additional region operations are performed. Since we don't have any artifacts 
> on a reproduction of this issue, we were unable to reproduce it ourselves but 
> this test contributes to the coverage of rebalance and may help us diagnose 
> this issue in the future if it appears.





[jira] [Resolved] (GEODE-6908) Some retry operation from client did not set setPossibleDuplicate in event

2019-06-28 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou resolved GEODE-6908.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

> Some retry operation from client did not set setPossibleDuplicate in event
> --
>
> Key: GEODE-6908
> URL: https://issues.apache.org/jira/browse/GEODE-6908
> Project: Geode
>  Issue Type: Bug
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: GeodeCommons
> Fix For: 1.10.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I fixed bug GEODE-6899, where a retried UPDATE did not set the version tag 
> found from the previous try into the event. 
> I searched the rest of the code and found one more place: 
> basicBridgeRemove() should also add the following lines, as in 
> basicBridgeDestroy:
>   // if this is a replayed operation we may already have a version tag
>   event.setVersionTag(clientEvent.getVersionTag());
>   event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());
> BTW, basicBridgeDestroy(), basicBridgeUpdateVersionStamp(), and 
> basicBridgeInvalidate() call 
> event.setVersionTag(clientEvent.getVersionTag()); but they do not call 
> "event.setPossibleDuplicate(clientEvent.isPossibleDuplicate());"
> I think it's better to keep all of this code consistent. 





[jira] [Created] (GEODE-6930) Lucene Functions specified using Internal Function's required permission, will be rejected by PCC

2019-06-28 Thread xiaojian zhou (JIRA)
xiaojian zhou created GEODE-6930:


 Summary: Lucene Functions specified using Internal Function's 
required permission, will be rejected by PCC
 Key: GEODE-6930
 URL: https://issues.apache.org/jira/browse/GEODE-6930
 Project: Geode
  Issue Type: Bug
  Components: lucene
Reporter: xiaojian zhou


When running a Lucene app in PCC, I noticed that the query is rejected by PCC 
with the following error message:
2019-06-14T10:24:29.83-0700 [APP/PROC/WEB/0] OUT Caused by: 
org.apache.geode.security.NotAuthorizedException: 
developer_jNnlmXMEdwsrmaDayfNKg not authorized for *

This is because all the Lucene functions implement Internal Function but do 
not override its getRequiredPermissions method, so each of them requires 
ResourcePermissions.ALL to execute. 

There are the following 9 Lucene functions:
WaitUntilFlushedFunction (Need READ)
LuceneQueryFunction (Need READ)
IndexingInProgressFunction (Need READ)
LuceneCreateIndexFunction (used by gfsh only, no need to change)
LuceneDestroyIndexFunction (used by gfsh only, no need to change)
LuceneDescribeIndexFunction (used by gfsh only, no need to change)
LuceneSearchIndexFunction (used by gfsh only, no need to change)
LuceneListIndexFunction (used by gfsh only, no need to change)
LuceneGetPageFunction (Need READ)

Five of them are used only by gfsh and are the real "internal functions". 
The other four are called by client applications, so they should specify 
ResourcePermissions.READ. 
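
The effect of overriding getRequiredPermissions can be sketched with a small 
self-contained model. Everything here is a hypothetical stand-in rather than 
Geode's real API: SketchPermission, SketchInternalFunction, the two function 
classes and SketchAuthorizer only mimic the behavior described above, where 
the inherited default demands ALL and a client-facing function narrows it to 
READ.

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical model of the permission check; not Geode's real types.
enum SketchPermission { ALL, READ }

interface SketchInternalFunction {
  // Default mirrors the behavior described in the ticket: internal functions demand ALL.
  default Collection<SketchPermission> getRequiredPermissions(String regionName) {
    return Collections.singletonList(SketchPermission.ALL);
  }
}

// gfsh-only function: keeps the inherited ALL requirement.
final class SketchCreateIndexFunction implements SketchInternalFunction {}

// Client-facing function: narrows the requirement to READ.
final class SketchQueryFunction implements SketchInternalFunction {
  @Override
  public Collection<SketchPermission> getRequiredPermissions(String regionName) {
    return Collections.singletonList(SketchPermission.READ);
  }
}

final class SketchAuthorizer {
  private SketchAuthorizer() {}

  // A caller is authorized if it holds ALL, or holds every permission the function requires.
  static boolean isAuthorized(Collection<SketchPermission> held,
      SketchInternalFunction function, String regionName) {
    return held.contains(SketchPermission.ALL)
        || held.containsAll(function.getRequiredPermissions(regionName));
  }
}
```

In this model a developer who holds only READ can run the query function but 
is still rejected by the gfsh-only function, which matches the split between 
the four client-facing functions and the five gfsh-only ones.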





[jira] [Updated] (GEODE-6930) Lucene Functions specified using Internal Function's required permission, will be rejected by PCC

2019-06-28 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou updated GEODE-6930:
-
Labels: GeodeCommons  (was: )

> Lucene Functions specified using Internal Function's required permission, 
> will be rejected by PCC
> -
>
> Key: GEODE-6930
> URL: https://issues.apache.org/jira/browse/GEODE-6930
> Project: Geode
>  Issue Type: Bug
>  Components: lucene
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>  Labels: GeodeCommons
>
> When running a Lucene app in PCC, I noticed that the query is rejected by PCC 
> with the following error message:
> 2019-06-14T10:24:29.83-0700 [APP/PROC/WEB/0] OUT Caused by: 
> org.apache.geode.security.NotAuthorizedException: 
> developer_jNnlmXMEdwsrmaDayfNKg not authorized for *
> This is because all the Lucene functions implement Internal Function but do 
> not override its getRequiredPermissions method, so each of them requires 
> ResourcePermissions.ALL to execute. 
> There are the following 9 Lucene functions:
> WaitUntilFlushedFunction (Need READ)
> LuceneQueryFunction (Need READ)
> IndexingInProgressFunction (Need READ)
> LuceneCreateIndexFunction (used by gfsh only, no need to change)
> LuceneDestroyIndexFunction (used by gfsh only, no need to change)
> LuceneDescribeIndexFunction (used by gfsh only, no need to change)
> LuceneSearchIndexFunction (used by gfsh only, no need to change)
> LuceneListIndexFunction (used by gfsh only, no need to change)
> LuceneGetPageFunction (Need READ)
> Five of them are used only by gfsh and are the real "internal functions". 
> The other four are called by client applications, so they should specify 
> ResourcePermissions.READ. 





[jira] [Assigned] (GEODE-6930) Lucene Functions specified using Internal Function's required permission, will be rejected by PCC

2019-06-28 Thread xiaojian zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/GEODE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojian zhou reassigned GEODE-6930:


Assignee: xiaojian zhou

> Lucene Functions specified using Internal Function's required permission, 
> will be rejected by PCC
> -
>
> Key: GEODE-6930
> URL: https://issues.apache.org/jira/browse/GEODE-6930
> Project: Geode
>  Issue Type: Bug
>  Components: lucene
>Reporter: xiaojian zhou
>Assignee: xiaojian zhou
>Priority: Major
>
> When running a Lucene app in PCC, I noticed that the query is rejected by PCC 
> with the following error message:
> 2019-06-14T10:24:29.83-0700 [APP/PROC/WEB/0] OUT Caused by: 
> org.apache.geode.security.NotAuthorizedException: 
> developer_jNnlmXMEdwsrmaDayfNKg not authorized for *
> This is because all the Lucene functions implement Internal Function but do 
> not override its getRequiredPermissions method, so each of them requires 
> ResourcePermissions.ALL to execute. 
> There are the following 9 Lucene functions:
> WaitUntilFlushedFunction (Need READ)
> LuceneQueryFunction (Need READ)
> IndexingInProgressFunction (Need READ)
> LuceneCreateIndexFunction (used by gfsh only, no need to change)
> LuceneDestroyIndexFunction (used by gfsh only, no need to change)
> LuceneDescribeIndexFunction (used by gfsh only, no need to change)
> LuceneSearchIndexFunction (used by gfsh only, no need to change)
> LuceneListIndexFunction (used by gfsh only, no need to change)
> LuceneGetPageFunction (Need READ)
> Five of them are used only by gfsh and are the real "internal functions". 
> The other four are called by client applications, so they should specify 
> ResourcePermissions.READ. 




