[jira] [Updated] (GEODE-9805) Debug logging of Radish AUTH command in ExecutionHandlerContext.executeCommand() reveals sensitive information

2021-11-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9805:
--
Labels: blocks-1.15.0 pull-request-available  (was: blocks-1.15.0)

> Debug logging of Radish AUTH command in 
> ExecutionHandlerContext.executeCommand() reveals sensitive information
> --
>
> Key: GEODE-9805
> URL: https://issues.apache.org/jira/browse/GEODE-9805
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>  Labels: blocks-1.15.0, pull-request-available
>
> With debug logging enabled, the ExecutionHandlerContext.executeCommand() 
> method logs every command executed along with its arguments. In the case of 
> the AUTH command, this results in un-redacted userId and/or password being 
> logged, which represents a serious security issue.
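
One way to avoid the leak described above is to mask the arguments of AUTH before the command is handed to the debug logger. The following is a minimal sketch under stated assumptions: CommandLogRedactor and toLoggableString are hypothetical names invented for illustration, not Geode classes, and this is not necessarily how the ticket was fixed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Geode: builds the string that would be
// passed to the debug logger instead of the raw command and arguments.
final class CommandLogRedactor {
  private static final String REDACTED = "********";

  private CommandLogRedactor() {}

  /** Returns a loggable form of a command; every argument of AUTH is masked. */
  static String toLoggableString(String commandName, List<String> arguments) {
    // Command names are matched case-insensitively here (an assumption).
    boolean sensitive = "AUTH".equals(commandName.toUpperCase(Locale.ROOT));
    List<String> shown = new ArrayList<>();
    for (String argument : arguments) {
      shown.add(sensitive ? REDACTED : argument);
    }
    return commandName + " " + shown;
  }
}
```

Masking every argument rather than only the last one covers both the one-argument (password) and two-argument (userId and password) forms mentioned in the description.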



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-11-08 Thread Xiaojian Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440837#comment-17440837
 ] 

Xiaojian Zhou commented on GEODE-9747:
--

This is caused by my fix for GEODE-9705, which saves the exceptions that were 
originally ignored in cleanupFailedInitialization(). I keep only the first 
exception and ignore later ones. 

cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown() purposely closes 
the cache, which causes PR creation to fail with a CacheClosedException or a 
disk recovery exception. 

Before my code changes in GEODE-9705, all of these exceptions in 
cleanupFailedInitialization() were ignored, and execution went on to fail in 
later steps. 

This time, it failed here:


{code:java}
[vm0] [warn 2021/10/16 04:50:40.769 UTC   
tid=0x21] PartitionedRegion#cleanupFailedInitialization(): Failed to clean the 
PartionRegion data store
[vm0] org.apache.geode.distributed.DistributedSystemDisconnectedException: This 
connection to a distributed system has been disconnected.
[vm0]   at 
org.apache.geode.distributed.internal.InternalDistributedSystem.checkConnected(InternalDistributedSystem.java:957)
[vm0]   at 
org.apache.geode.distributed.internal.InternalDistributedSystem.getDistributionManager(InternalDistributedSystem.java:1658)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.getDistributionManager(ReplyProcessor21.java:366)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.postWait(ReplyProcessor21.java:592)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:818)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:773)
[vm0]   at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:859)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.attemptToSendDestroyRegionMessage(PartitionedRegion.java:7592)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.sendDestroyRegionMessage(PartitionedRegion.java:7553)
[vm0]   at 
org.apache.geode.internal.cache.PartitionedRegion.cleanupFailedInitialization(PartitionedRegion.java:5577)
{code}

Depending on timing, the test usually does not fail here and still throws the 
expected CacheClosedException. Occasionally, though, it fails here with an 
exception that the test does not expect. 

I could fix the bug by changing the product code to save the last exception 
instead of the first. On reflection, though, it is better to fix the test code 
and add this exception as an expected one, since we know what is going on. 
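
The two options discussed above can be sketched as follows. This is an illustration only: FailureRecorder is a hypothetical class, not the code in PartitionedRegion, and the JDK exception types used in the usage example below are stand-ins for CacheClosedException and DistributedSystemDisconnectedException.

```java
// Hypothetical sketch, not Geode code: records failures seen during cleanup
// and lets a test accept more than one exception type.
final class FailureRecorder {
  private Throwable first;
  private Throwable last;

  /** Remembers both the first and the most recent failure. */
  void record(Throwable failure) {
    if (first == null) {
      first = failure;
    }
    last = failure;
  }

  Throwable first() {
    return first;
  }

  Throwable last() {
    return last;
  }

  /** Test-side check: true if the thrown exception is any of the expected types. */
  static boolean isExpected(Throwable thrown, Class<?>... expectedTypes) {
    for (Class<?> type : expectedTypes) {
      if (type.isInstance(thrown)) {
        return true;
      }
    }
    return false;
  }
}
```

Keeping the first exception means that whichever failure wins the race is the one reported, so a test that lists every known outcome as expected, for example FailureRecorder.isExpected(thrown, IllegalStateException.class, UnsupportedOperationException.class), stays stable regardless of timing.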


> CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of 
> exception
> ---
>
> Key: GEODE-9747
> URL: https://issues.apache.org/jira/browse/GEODE-9747
> Project: Geode
>  Issue Type: Bug
>  Components: core, tests
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> May be the same issue as GEODE-7030 but it's hard to tell since that other 
> ticket is short on details.
> {noformat}
> PersistentPartitionedRegionDistributedTest > 
> cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest$$Lambda$331/778323733.run
>  in VM 0 running on Host 
> heavy-lifter-2597c5be-686f-56ce-ab29-4c643f8174ba.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown(PersistentPartitionedRegionDistributedTest.java:1129)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> Expecting value to be true but was false
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest.lambda$cacheIsClosedWhenConflictingPersistentDataExceptionIsThrown$bb17a952$4(PersistentPartitionedRegionDistributedTest.java:1136)
> {noformat}




[jira] [Updated] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-11-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9747:
--
Labels: GeodeOperationAPI needsTriage pull-request-available  (was: 
GeodeOperationAPI needsTriage)






[jira] [Commented] (GEODE-9747) CI failure: PersistentPartitionedRegionDistributedTest sees wrong kind of exception

2021-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440835#comment-17440835
 ] 

ASF subversion and git services commented on GEODE-9747:


Commit 1f49a587d0b22ee18342b4840739806a8e5db73c in geode's branch 
refs/heads/feature/GEODE-9747 from zhouxh
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=1f49a58 ]

GEODE-9747: add expected DistributedSystemDisconnectedException to the test







[jira] [Commented] (GEODE-9693) Remove deprecated elements from ListIndexCommandDistributedTestBase

2021-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440832#comment-17440832
 ] 

ASF subversion and git services commented on GEODE-9693:


Commit 276a81998662d604890e6e96df46b4e6d10e6649 in geode's branch 
refs/heads/develop from Nabarun Nag
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=276a819 ]

GEODE-9693:Removed deprecated elements from ListIndexCommandDistributedTestBase 
(#6958)

* Remove deprecated elements
* Rename DUnit to DistributedTest

> Remove deprecated elements from ListIndexCommandDistributedTestBase
> ---
>
> Key: GEODE-9693
> URL: https://issues.apache.org/jira/browse/GEODE-9693
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Nabarun Nag
>Priority: Major
>  Labels: needsTriage, pull-request-available
>






[jira] [Commented] (GEODE-9402) Automatic Reconnect Failure: Address already in use

2021-11-08 Thread Bill Burcham (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440828#comment-17440828
 ] 

Bill Burcham commented on GEODE-9402:
-

h2. Summary

In each of the attached logs, we see the member that logged the BindException 
eventually joining the view (in 8 and 11 seconds respectively).

My suspicion is that what we see here is nondeterminism in the time it takes 
for a port to become available after it is unbound.

Since the members in question do re-join the cluster successfully I don't think 
this is a bug. What do you think [~jjramos] ?
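
If the delay really is the time a port takes to become available again, a caller can tolerate it by retrying the bind for a bounded period. The sketch below uses only the JDK; BindWithRetry and its parameters are hypothetical names chosen for illustration and do not describe how Geode's reconnect logic is implemented.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not Geode code: retries a bind until it succeeds or
// the timeout elapses.
final class BindWithRetry {
  private BindWithRetry() {}

  static ServerSocket bind(int port, long timeoutMillis, long retryIntervalMillis)
      throws IOException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      ServerSocket socket = new ServerSocket();
      try {
        socket.setReuseAddress(true);
        socket.bind(new InetSocketAddress(port));
        return socket;
      } catch (IOException e) {
        socket.close();
        // Only "address already in use" style failures are worth retrying.
        if (!(e instanceof BindException) || System.nanoTime() >= deadline) {
          throw e;
        }
        Thread.sleep(retryIntervalMillis);
      }
    }
  }
}
```

A call such as BindWithRetry.bind(40404, 15_000, 250) would cover the 8 and 11 second delays seen in the attached logs; the timeout and interval values are arbitrary choices for the example.
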
h2. Detailed Analysis of cluster_logs_gke_latest_54

Looking at cluster_logs_gke_latest_54, we see quorum loss happen:

[Entry id=4208, date=2021/06/23 15:55:48.119 GMT, level=fatal, thread=tid=0x92, 
emitter=Geode Membership View Creator, message=Possible loss of quorum due to 
the loss of 5 cache processes: 
[gemfire-cluster-server-3(gemfire-cluster-server-3:1):41000, 
gemfire-cluster-server-1(gemfire-cluster-server-1:1):41000, 
gemfire-cluster-locator-1(gemfire-cluster-locator-1:1:locator):41000, 
gemfire-cluster-server-2(gemfire-cluster-server-2:1):41000, 
gemfire-cluster-locator-0(gemfire-cluster-locator-0:1:locator):41000]
, Host=gemfire-cluster-server-0 , 
mergedFile=/Users/bburcham/Downloads/cluster_logs_gke_latest_54/gemfire-cluster-server-0/gemfire-cluster-server-0-01-01.log]

It takes about two minutes for the network partition to be healed and for a 
coordinator to be designated. It is TBD what part of that two minutes was due 
to the test delaying the healing of the partition, vs what part of that time 
was spent re-forming a cluster after the network partition was healed. Here's 
the coordinator thread starting:

[Entry id=4925, date=2021/06/23 15:57:57.671 GMT, level=info, thread=tid=0x87, 
emitter=ReconnectThread, message=This member is becoming the membership 
coordinator with address 
gemfire-cluster-locator-0(gemfire-cluster-locator-0:1:locator):41000
, Host=gemfire-cluster-locator-0 , 
mergedFile=/Users/bburcham/Downloads/cluster_logs_gke_latest_54/gemfire-cluster-locator-0/gemfire-cluster-locator-0.log]

That point in time corresponds to view 21 (the pre-partition view sequence 
ended at view 5):

[Entry id=4960, date=2021/06/23 15:57:58.009 GMT, level=info, thread=tid=0xad, 
emitter=Geode Membership View Creator, message=sending new view 
View[gemfire-cluster-locator-0(gemfire-cluster-locator-0:1:locator):41000|21]
 members: 
[gemfire-cluster-locator-0(gemfire-cluster-locator-0:1:locator):41000, 
gemfire-cluster-server-0(gemfire-cluster-server-0:1):41000\{lead}, 
gemfire-cluster-server-1(gemfire-cluster-server-1:1):41000, 
gemfire-cluster-server-3(gemfire-cluster-server-3:1):41000, 
gemfire-cluster-server-2(gemfire-cluster-server-2:1):41000, 
gemfire-cluster-locator-1(gemfire-cluster-locator-1:1:locator):41000]  
crashed: 
[gemfire-cluster-locator-1(gemfire-cluster-locator-1:1:locator):41000, 
gemfire-cluster-server-2(gemfire-cluster-server-2:1):41000, 
gemfire-cluster-locator-0(gemfire-cluster-locator-0:1:locator):41000]
, Host=gemfire-cluster-locator-0 , 
mergedFile=/Users/bburcham/Downloads/cluster_logs_gke_latest_54/gemfire-cluster-locator-0/gemfire-cluster-locator-0.log]

About a minute later server-0 logs the BindException while reconnecting:

[Entry id=5536, date=2021/06/23 16:00:31.491 GMT, level=error, thread=tid=0x94, 
emitter=ReconnectThread, message=Cache initialization for GemFireCache[id = 
1795575589; isClosing = false; isShutDownAll = false; created = Wed Jun 23 
15:58:29 GMT 2021; server = false; copyOnRead = false; lockLease = 120; 
lockTimeout = 60] failed because:
org.apache.geode.GemFireIOException: While starting cache server CacheServer on 
port=40404 client subscription config policy=none client subscription config 
capacity=1 client subscription config overflow directory=.
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:800)
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:599)
    at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4207)
    at 
org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:199)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1497)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1449)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at 
org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2668)
    at 
org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistribut

[jira] [Updated] (GEODE-9805) Debug logging of Radish AUTH command in ExecutionHandlerContext.executeCommand() reveals sensitive information

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-9805:
---
Labels: blocks-1.15.0  (was: needsTriage)

> Debug logging of Radish AUTH command in 
> ExecutionHandlerContext.executeCommand() reveals sensitive information
> --
>
> Key: GEODE-9805
> URL: https://issues.apache.org/jira/browse/GEODE-9805
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>  Labels: blocks-1.15.0
>
> With debug logging enabled, the ExecutionHandlerContext.executeCommand() 
> method logs every command executed along with its arguments. In the case of 
> the AUTH command, this results in un-redacted userId and/or password being 
> logged, which represents a serious security issue.





[jira] [Created] (GEODE-9805) Debug logging of Radish AUTH command in ExecutionHandlerContext.executeCommand() reveals sensitive information

2021-11-08 Thread Donal Evans (Jira)
Donal Evans created GEODE-9805:
--

 Summary: Debug logging of Radish AUTH command in 
ExecutionHandlerContext.executeCommand() reveals sensitive information
 Key: GEODE-9805
 URL: https://issues.apache.org/jira/browse/GEODE-9805
 Project: Geode
  Issue Type: Bug
  Components: redis
Affects Versions: 1.15.0
Reporter: Donal Evans


With debug logging enabled, the ExecutionHandlerContext.executeCommand() method 
logs every command executed along with its arguments. In the case of the AUTH 
command, this results in un-redacted userId and/or password being logged, which 
represents a serious security issue.





[jira] [Assigned] (GEODE-9805) Debug logging of Radish AUTH command in ExecutionHandlerContext.executeCommand() reveals sensitive information

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans reassigned GEODE-9805:
--

Assignee: Donal Evans

> Debug logging of Radish AUTH command in 
> ExecutionHandlerContext.executeCommand() reveals sensitive information
> --
>
> Key: GEODE-9805
> URL: https://issues.apache.org/jira/browse/GEODE-9805
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>  Labels: needsTriage
>
> With debug logging enabled, the ExecutionHandlerContext.executeCommand() 
> method logs every command executed along with its arguments. In the case of 
> the AUTH command, this results in un-redacted userId and/or password being 
> logged, which represents a serious security issue.





[jira] [Updated] (GEODE-9805) Debug logging of Radish AUTH command in ExecutionHandlerContext.executeCommand() reveals sensitive information

2021-11-08 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9805:
-
Labels: needsTriage  (was: )

> Debug logging of Radish AUTH command in 
> ExecutionHandlerContext.executeCommand() reveals sensitive information
> --
>
> Key: GEODE-9805
> URL: https://issues.apache.org/jira/browse/GEODE-9805
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: needsTriage
>
> With debug logging enabled, the ExecutionHandlerContext.executeCommand() 
> method logs every command executed along with its arguments. In the case of 
> the AUTH command, this results in un-redacted userId and/or password being 
> logged, which represents a serious security issue.





[jira] [Updated] (GEODE-9792) Client in some cases would send in AuthenticationRequest multiple times even when they share the same connection

2021-11-08 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao updated GEODE-9792:
---
Fix Version/s: 1.15.0

> Client in some cases would send in AuthenticationRequest multiple times even 
> when they share the same connection
> 
>
> Key: GEODE-9792
> URL: https://issues.apache.org/jira/browse/GEODE-9792
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> AuthenticationRequest has been observed arriving multiple times on the same 
> connection from different threads.





[jira] [Resolved] (GEODE-9792) Client in some cases would send in AuthenticationRequest multiple times even when they share the same connection

2021-11-08 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao resolved GEODE-9792.

Resolution: Fixed

> Client in some cases would send in AuthenticationRequest multiple times even 
> when they share the same connection
> 
>
> Key: GEODE-9792
> URL: https://issues.apache.org/jira/browse/GEODE-9792
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> AuthenticationRequest has been observed arriving multiple times on the same 
> connection from different threads.





[jira] [Updated] (GEODE-9804) Both registerAllKeys and registerRegex always fetch initial entries.

2021-11-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9804:
--
Labels: needsTriage pull-request-available  (was: needsTriage)

> Both registerAllKeys and registerRegex always fetch initial entries.
> 
>
> Key: GEODE-9804
> URL: https://issues.apache.org/jira/browse/GEODE-9804
> Project: Geode
>  Issue Type: Bug
>Reporter: Jacob Barrett
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> An inconsistency and bug in how the two regex interest methods configure the 
> initial interest policy results in both registerAllKeys and registerRegex 
> fetching all initial entries regardless of the boolean parameter that 
> controls whether to get initial entries. Furthermore, the misconfiguration 
> results in none of the entries actually getting sent to the cache listener. 
> The result is unnecessarily long registration times, network traffic, and 
> load on the servers. On servers with, say, millions of keys, this can cause 
> long GC pauses from unintentionally iterating over all those keys.
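
The intended behavior can be sketched as a mapping from the boolean parameter to a result policy, shared by both registration methods. This is a hypothetical illustration written in Java for consistency with the rest of this thread: ResultPolicy and InterestPolicySketch are invented names, and geode-native's actual InterestResultPolicy type and its values may differ.

```java
// Hypothetical sketch, not the geode-native API: the policy values and the
// mapping below are assumptions made for illustration.
enum ResultPolicy {
  NONE, // register interest only; fetch nothing up front
  KEYS_VALUES // fetch the initial keys and values
}

final class InterestPolicySketch {
  private InterestPolicySketch() {}

  /** Both registerAllKeys and registerRegex would derive the policy the same way. */
  static ResultPolicy policyFor(boolean getInitialValues) {
    return getInitialValues ? ResultPolicy.KEYS_VALUES : ResultPolicy.NONE;
  }
}
```

With a single mapping, passing false requests no initial entries, which avoids the unintended iteration over all keys on the server.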





[jira] [Commented] (GEODE-9804) Both registerAllKeys and registerRegex always fetch initial entries.

2021-11-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440746#comment-17440746
 ] 

ASF GitHub Bot commented on GEODE-9804:
---

pivotal-jbarrett opened a new pull request #892:
URL: https://github.com/apache/geode-native/pull/892


   * Makes InterestResultPolicy a proper enum.
   * Adds integration benchmark for register interest.
   * Cleans up unused method arguments.




> Both registerAllKeys and registerRegex always fetch initial entries.
> 
>
> Key: GEODE-9804
> URL: https://issues.apache.org/jira/browse/GEODE-9804
> Project: Geode
>  Issue Type: Bug
>Reporter: Jacob Barrett
>Priority: Major
>  Labels: needsTriage
>
> An inconsistency and bug in how the two regex interest methods configure the 
> initial interest policy results in both registerAllKeys and registerRegex 
> fetching all initial entries regardless of the boolean parameter that 
> controls whether to get initial entries. Furthermore, the misconfiguration 
> results in none of the entries actually getting sent to the cache listener. 
> The result is unnecessarily long registration times, network traffic, and 
> load on the servers. On servers with, say, millions of keys, this can cause 
> long GC pauses from unintentionally iterating over all those keys.





[jira] [Updated] (GEODE-9803) CI Failure: AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault fails after user is expired

2021-11-08 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade updated GEODE-9803:
-
Labels: GeodeOperationAPI flaky needsTriage  (was: flaky needsTriage)

> CI Failure: AuthExpirationDUnitTest > 
> registeredInterest_slowReAuth_policyDefault fails after user is expired
> -
>
> Key: GEODE-9803
> URL: https://issues.apache.org/jira/browse/GEODE-9803
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: GeodeOperationAPI, flaky, needsTriage
>
> Originally seen in the distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.security.AuthExpirationDUnitTest$$Lambda$580/141775835.run 
> in VM 0 running on Host 
> heavy-lifter-2a24cff8-0d64-55e0-9585-2d6391f92533.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:96)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyDefault(AuthExpirationDUnitTest.java:156)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.security.AuthExpirationDUnitTest that uses 
> org.apache.geode.cache.Region 
> Expected size: 100 but was: 0 in:
> [] within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$registeredInterest_slowReAuth_policyDefault$bb17a952$2(AuthExpirationDUnitTest.java:158)
> Caused by:
> java.lang.AssertionError: 
> Expected size: 100 but was: 0 in:
> []
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$null$3(AuthExpirationDUnitTest.java:159)
> 8334 tests completed, 1 failed, 414 skipped
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636236082/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636236082/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz
> {noformat}





[jira] [Assigned] (GEODE-9767) bump netty to recommended version

2021-11-08 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith reassigned GEODE-9767:


Assignee: Dan Smith

> bump netty to recommended version
> -
>
> Key: GEODE-9767
> URL: https://issues.apache.org/jira/browse/GEODE-9767
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.12.5, 1.13.4, 1.14.0, 1.15.0
>Reporter: Owen Nichols
>Assignee: Dan Smith
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> latest is 4.1.69, we should be using 4.1.68 or 4.1.69 on all branches if 
> possible





[jira] [Commented] (GEODE-9767) bump netty to recommended version

2021-11-08 Thread Owen Nichols (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440726#comment-17440726
 ] 

Owen Nichols commented on GEODE-9767:
-

fixed on develop by https://issues.apache.org/jira/browse/GEODE-9778

leaving ticket open until backports are also completed...1.14 is a little 
trickier, see https://github.com/apache/geode/pull/7044

> bump netty to recommended version
> -
>
> Key: GEODE-9767
> URL: https://issues.apache.org/jira/browse/GEODE-9767
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.12.5, 1.13.4, 1.14.0, 1.15.0
>Reporter: Owen Nichols
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> latest is 4.1.69, we should be using 4.1.68 or 4.1.69 on all branches if 
> possible





[jira] [Updated] (GEODE-9767) bump netty to recommended version

2021-11-08 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9767:

Fix Version/s: 1.15.0

> bump netty to recommended version
> -
>
> Key: GEODE-9767
> URL: https://issues.apache.org/jira/browse/GEODE-9767
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.12.5, 1.13.4, 1.14.0, 1.15.0
>Reporter: Owen Nichols
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> latest is 4.1.69, we should be using 4.1.68 or 4.1.69 on all branches if 
> possible





[jira] [Updated] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9802:
--
Labels: flaky pull-request-available  (was: flaky)

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>  Labels: flaky, pull-request-available
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0.0.107[46103]
> at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
> at 
> org.apache.geode.internal.cache.

[jira] [Updated] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-9802:
---
Description: 
Seen originally in distributed mass test run:
{noformat}
> Task :geode-core:distributedTest

LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
 in VM -1 running on Host 
heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
with 4 VMs
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
at 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)

Caused by:
org.apache.geode.distributed.DistributedSystemDisconnectedException: 
Reconnect attempts terminated due to exception, caused by 
org.apache.geode.GemFireIOException: While starting cache server CacheServer on 
port=46103 client subscription config policy=none client subscription config 
capacity=1 client subscription config overflow directory=.
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
at 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
... 2 more

Caused by:
org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=46103 client subscription config policy=none client 
subscription config capacity=1 client subscription config overflow directory=.
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
at java.lang.Thread.run(Thread.java:748)

Caused by:
java.net.BindException: Failed to create server socket on 
10.0.0.107[46103]
at 
org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
at 
org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
at 
org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorBuilder.create(AcceptorBuilder.java:291)
at 
org.apache.geode.internal.cache.CacheServerImpl.createAcceptor(CacheServerImpl.java:420)
at 
org.apache.geode.internal.cache.CacheServerImpl.start(CacheServerImpl.java:377)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2769)
... 7 more

Caused by:
java.net.BindException: Address already in use (Bind failed)
at java.net.PlainSocketImpl.socketBind(Native Method)
at 
java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImp

[jira] [Assigned] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans reassigned GEODE-9802:
--

Assignee: Donal Evans

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>  Labels: flaky
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0.0.107[46103]
> at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
> at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
> 
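
The innermost cause in the trace above (java.net.BindException: Address 
already in use) can be reproduced outside Geode. The sketch below is not the 
test's code and not the fix from the pull request. It assumes only the JDK's 
java.net.ServerSocket, and the names PortBindingSketch, bindToFreePort and 
secondBindFails are made up for illustration. It shows that binding to port 0 
lets the OS pick a free port, and that a second bind to a port that is still 
held is rejected with BindException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical illustration class; not part of Geode.
class PortBindingSketch {

  // Port 0 asks the OS for any free port; the socket holds it until closed.
  static ServerSocket bindToFreePort() throws IOException {
    return new ServerSocket(0);
  }

  // Returns true when binding a second listener to the given port is
  // rejected because another socket already holds it.
  static boolean secondBindFails(int port) throws IOException {
    try (ServerSocket second = new ServerSocket(port)) {
      return false;
    } catch (BindException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    try (ServerSocket first = bindToFreePort()) {
      int port = first.getLocalPort();
      System.out.println("OS-assigned port: " + port);
      System.out.println("second bind rejected: " + secondBindFails(port));
    }
  }
}
```

If the reconnect in the test brings the cache server back up on the port 
number it used before (an inference from the ticket title and the repeated 
port=46103 in the trace, not something the trace states), any socket that took 
that number in the meantime would trigger the same exception.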

[jira] [Commented] (GEODE-6222) CI Failure: GemFireDeadlockDetectorDUnitTest

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440669#comment-17440669
 ] 

Geode Integration commented on GEODE-6222:
--

Seen in [distributed-test-openjdk8 
#2594|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2594]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636242338/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636242338/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> CI Failure: GemFireDeadlockDetectorDUnitTest
> 
>
> Key: GEODE-6222
> URL: https://issues.apache.org/jira/browse/GEODE-6222
> Project: Geode
>  Issue Type: Bug
>  Components: distributed lock service
>Affects Versions: 1.9.0, 1.14.0, 1.15.0
>Reporter: Ken Howe
>Priority: Major
>  Labels: flaky
>
> Flaky test failure in 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/247]
> {code:java}
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest
>  > testDistributedDeadlockWithDLock FAILED
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest.testDistributedDeadlockWithDLock(GemFireDeadlockDetectorDUnitTest.java:199)
> {code}





[jira] [Updated] (GEODE-6222) CI Failure: GemFireDeadlockDetectorDUnitTest

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-6222:
---
Affects Version/s: 1.14.0
   1.15.0

> CI Failure: GemFireDeadlockDetectorDUnitTest
> 
>
> Key: GEODE-6222
> URL: https://issues.apache.org/jira/browse/GEODE-6222
> Project: Geode
>  Issue Type: Bug
>  Components: distributed lock service
>Affects Versions: 1.9.0, 1.14.0, 1.15.0
>Reporter: Ken Howe
>Priority: Major
>  Labels: flaky
>
> Flaky test failure in 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/247]
> {code:java}
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest
>  > testDistributedDeadlockWithDLock FAILED
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest.testDistributedDeadlockWithDLock(GemFireDeadlockDetectorDUnitTest.java:199)
> {code}





[jira] [Updated] (GEODE-6489) CI Failures with testDistributedDeadlock

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-6489:
---
Affects Version/s: 1.14.0
   1.15.0

> CI Failures with testDistributedDeadlock
> 
>
> Key: GEODE-6489
> URL: https://issues.apache.org/jira/browse/GEODE-6489
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Affects Versions: 1.10.0, 1.14.0, 1.15.0
>Reporter: Lynn Hughes-Godfrey
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: flaky
>
> In a single CI run, we see 3 failures all related to testDistributedDeadlock:
> {noformat}
> org.apache.geode.management.internal.cli.commands.ShowDeadlockOverHttpDUnitTest
>  > testDistributedDeadlockWithFunction FAILED
> org.apache.geode.management.internal.cli.commands.ShowDeadlockOverHttpDUnitTest
>  > testNoDeadlock FAILED
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest
>  > testDistributedDeadlockWithDLock FAILED
> {noformat}
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/469
> {noformat}
> org.apache.geode.management.internal.cli.commands.ShowDeadlockOverHttpDUnitTest
>  > testDistributedDeadlockWithFunction FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.management.internal.cli.commands.ShowDeadlockDistributedTestBase$$Lambda$68/829260532.run
>  in VM 1 running on Host ceb4d948b5be with 4 VMs
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Condition with 
> org.apache.geode.management.internal.cli.commands.ShowDeadlockDistributedTestBase
>  was not fulfilled within 300 seconds.
> org.apache.geode.management.internal.cli.commands.ShowDeadlockOverHttpDUnitTest
>  > testNoDeadlock FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.management.internal.cli.commands.ShowDeadlockDistributedTestBase$$Lambda$68/829260532.run
>  in VM 1 running on Host ceb4d948b5be with 4 VMs
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Condition with 
> org.apache.geode.management.internal.cli.commands.ShowDeadlockDistributedTestBase
>  was not fulfilled within 300 seconds.
> 137 tests completed, 2 failed
> > Task :geode-web:distributedTest FAILED
> > Task :geode-core:distributedTest
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest
>  > testDistributedDeadlockWithDLock FAILED
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest.testDistributedDeadlockWithDLock(GemFireDeadlockDetectorDUnitTest.java:201)
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0019/test-results/distributedTest/1551833386/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0019/test-artifacts/1551833386/distributedtestfiles-OpenJDK8-1.10.0-SNAPSHOT.0019.tgz





[jira] [Created] (GEODE-9804) Both registerAllKeys and registerRegex always fetch initial entries.

2021-11-08 Thread Jacob Barrett (Jira)
Jacob Barrett created GEODE-9804:


 Summary: Both registerAllKeys and registerRegex always fetch 
initial entries.
 Key: GEODE-9804
 URL: https://issues.apache.org/jira/browse/GEODE-9804
 Project: Geode
  Issue Type: Bug
Reporter: Jacob Barrett


An inconsistency and bug in how the two regex interest methods configure the 
initial interest policy results in both registerAllKeys and registerRegex 
fetching all initial entries regardless of the boolean parameter that controls 
fetching initial entries. Furthermore, the misconfiguration results in none of 
the entries actually getting sent to the cache listener. The result is 
unnecessarily long registration times, network traffic, and load on the 
servers. On servers with, say, millions of keys, this can result in long GC 
pauses from unintentionally iterating over all those keys.
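
A minimal sketch of the behavior the description expects, under stated 
assumptions: this is not Geode or client library code, and InterestPolicySketch, 
InitialPolicy, policyFor and initialEntries are hypothetical names chosen for 
illustration. The point is only that the caller's boolean should select the 
initial policy, and that the server-side scan over keys should be skipped when 
no initial entries are requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Geode.
class InterestPolicySketch {

  // Stand-in for an initial interest result policy.
  enum InitialPolicy { NONE, KEYS_VALUES }

  // The caller's boolean decides the policy sent with the registration.
  static InitialPolicy policyFor(boolean getInitialEntries) {
    return getInitialEntries ? InitialPolicy.KEYS_VALUES : InitialPolicy.NONE;
  }

  // Simulated server side: keys are only scanned when the policy asks for
  // initial entries, so a NONE registration costs no iteration.
  static List<String> initialEntries(
      Map<String, String> data, String regex, InitialPolicy policy) {
    List<String> matched = new ArrayList<>();
    if (policy == InitialPolicy.NONE) {
      return matched;
    }
    Pattern pattern = Pattern.compile(regex);
    for (String key : data.keySet()) {
      if (pattern.matcher(key).matches()) {
        matched.add(key);
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    Map<String, String> data = new TreeMap<>();
    data.put("a1", "x");
    data.put("a2", "y");
    data.put("b1", "z");
    System.out.println(initialEntries(data, "a.*", policyFor(false)));
    System.out.println(initialEntries(data, "a.*", policyFor(true)));
  }
}
```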





[jira] [Updated] (GEODE-9804) Both registerAllKeys and registerRegex always fetch initial entries.

2021-11-08 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9804:
-
Labels: needsTriage  (was: )

> Both registerAllKeys and registerRegex always fetch initial entries.
> 
>
> Key: GEODE-9804
> URL: https://issues.apache.org/jira/browse/GEODE-9804
> Project: Geode
>  Issue Type: Bug
>Reporter: Jacob Barrett
>Priority: Major
>  Labels: needsTriage
>
> An inconsistency and bug in how the two regex interest methods configure the 
> initial interest policy results in both registerAllKeys and registerRegex 
> fetching all initial entries regardless of the boolean parameter that 
> controls fetching initial entries. Furthermore, the misconfiguration results 
> in none of the entries actually getting sent to the cache listener. The 
> result is unnecessarily long registration times, network traffic, and load on 
> the servers. On servers with, say, millions of keys, this can result in long 
> GC pauses from unintentionally iterating over all those keys.





[jira] [Commented] (GEODE-9803) CI Failure: AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault fails after user is expired

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440657#comment-17440657
 ] 

Geode Integration commented on GEODE-9803:
--

Seen in [distributed-test-openjdk8 
#2585|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2585]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636236082/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636236082/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> CI Failure: AuthExpirationDUnitTest > 
> registeredInterest_slowReAuth_policyDefault fails after user is expired
> -
>
> Key: GEODE-9803
> URL: https://issues.apache.org/jira/browse/GEODE-9803
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: flaky, needsTriage
>
> Originally seen in the distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.security.AuthExpirationDUnitTest$$Lambda$580/141775835.run 
> in VM 0 running on Host 
> heavy-lifter-2a24cff8-0d64-55e0-9585-2d6391f92533.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:96)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyDefault(AuthExpirationDUnitTest.java:156)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.security.AuthExpirationDUnitTest that uses 
> org.apache.geode.cache.Region 
> Expected size: 100 but was: 0 in:
> [] within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$registeredInterest_slowReAuth_policyDefault$bb17a952$2(AuthExpirationDUnitTest.java:158)
> Caused by:
> java.lang.AssertionError: 
> Expected size: 100 but was: 0 in:
> []
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$null$3(AuthExpirationDUnitTest.java:159)
> 8334 tests completed, 1 failed, 414 skipped
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636236082/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636236082/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz
> {noformat}





[jira] [Updated] (GEODE-9803) CI Failure: AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault fails after user is expired

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-9803:
---
Labels: flaky needsTriage  (was: needsTriage)

> CI Failure: AuthExpirationDUnitTest > 
> registeredInterest_slowReAuth_policyDefault fails after user is expired
> -
>
> Key: GEODE-9803
> URL: https://issues.apache.org/jira/browse/GEODE-9803
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: flaky, needsTriage
>





[jira] [Updated] (GEODE-9803) CI Failure: AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault fails after user is expired

2021-11-08 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9803:
-
Labels: needsTriage  (was: )

> CI Failure: AuthExpirationDUnitTest > 
> registeredInterest_slowReAuth_policyDefault fails after user is expired
> -
>
> Key: GEODE-9803
> URL: https://issues.apache.org/jira/browse/GEODE-9803
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: needsTriage
>





[jira] [Created] (GEODE-9803) CI Failure: AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault fails after user is expired

2021-11-08 Thread Donal Evans (Jira)
Donal Evans created GEODE-9803:
--

 Summary: CI Failure: AuthExpirationDUnitTest > 
registeredInterest_slowReAuth_policyDefault fails after user is expired
 Key: GEODE-9803
 URL: https://issues.apache.org/jira/browse/GEODE-9803
 Project: Geode
  Issue Type: Bug
Affects Versions: 1.15.0
Reporter: Donal Evans


Originally seen in the distributed mass test run:

{noformat}
> Task :geode-core:distributedTest

AuthExpirationDUnitTest > registeredInterest_slowReAuth_policyDefault FAILED
org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.security.AuthExpirationDUnitTest$$Lambda$580/141775835.run in 
VM 0 running on Host 
heavy-lifter-2a24cff8-0d64-55e0-9585-2d6391f92533.c.apachegeode-ci.internal 
with 4 VMs
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
at 
org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:96)
at 
org.apache.geode.security.AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyDefault(AuthExpirationDUnitTest.java:156)

Caused by:
org.awaitility.core.ConditionTimeoutException: Assertion condition 
defined as a lambda expression in 
org.apache.geode.security.AuthExpirationDUnitTest that uses 
org.apache.geode.cache.Region 
Expected size: 100 but was: 0 in:
[] within 5 minutes.
at 
org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
at 
org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
at 
org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
at 
org.apache.geode.security.AuthExpirationDUnitTest.lambda$registeredInterest_slowReAuth_policyDefault$bb17a952$2(AuthExpirationDUnitTest.java:158)

Caused by:
java.lang.AssertionError: 
Expected size: 100 but was: 0 in:
[]
at 
org.apache.geode.security.AuthExpirationDUnitTest.lambda$null$3(AuthExpirationDUnitTest.java:159)

8334 tests completed, 1 failed, 414 skipped

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636236082/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Test report artifacts from this job are available at:

http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636236082/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz
{noformat}





[jira] [Commented] (GEODE-9050) Redis test fails with Netty 4.1.60 and later

2021-11-08 Thread Dan Smith (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440655#comment-17440655
 ] 

Dan Smith commented on GEODE-9050:
--

I tracked this down in 1.14 so we can upgrade netty there. This bug exists in 
geode 1.14 but not in the latest geode 1.15 develop. In 1.14, we are changing 
the event loop group for a netty channel while threads may be writing to the 
channel in ExecutionHandlerContext.changeChannelEventLoopGroup. This leads to 
the assertion failure below with netty 4.1.68 and above. It is unknown what 
sort of problems this might cause with earlier versions of netty that lack 
the assertion.

This exception occurs when running 
PubSubIntegrationTest.ensureOrderingOfPublishedMessages after upgrading to 
netty 4.1.68 on support/1.14.


{noformat}
[warn 2021/10/27 22:34:47.657 GMT   tid=0x3d4] 
Failed to execute publish function java.lang.AssertionError
org.apache.geode.cache.execute.FunctionException: java.lang.AssertionError
at 
org.apache.geode.internal.cache.execute.LocalResultCollectorImpl.setException(LocalResultCollectorImpl.java:205)
at 
org.apache.geode.internal.cache.execute.MemberFunctionResultSender.setException(MemberFunctionResultSender.java:233)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.handleException(AbstractExecution.java:504)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:353)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionOnLocalNode(AbstractExecution.java:307)
at 
org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:136)
at 
org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:191)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.execute(AbstractExecution.java:376)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.execute(AbstractExecution.java:359)
at 
org.apache.geode.redis.internal.pubsub.PubSubImpl.publish(PubSubImpl.java:76)
at 
org.apache.geode.redis.internal.executor.pubsub.PublishExecutor.executeCommand(PublishExecutor.java:35)
at 
org.apache.geode.redis.internal.RedisCommandType.executeCommand(RedisCommandType.java:335)
at 
org.apache.geode.redis.internal.netty.Command.execute(Command.java:188)
at 
org.apache.geode.redis.internal.netty.ExecutionHandlerContext.executeCommand(ExecutionHandlerContext.java:315)
at 
org.apache.geode.redis.internal.netty.ExecutionHandlerContext.processCommandQueue(ExecutionHandlerContext.java:150)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.AssertionError
at 
io.netty.handler.timeout.WriteTimeoutHandler.addWriteTimeoutTask(WriteTimeoutHandler.java:144)
at 
io.netty.handler.timeout.WriteTimeoutHandler.scheduleTimeout(WriteTimeoutHandler.java:136)
at 
io.netty.handler.timeout.WriteTimeoutHandler.write(WriteTimeoutHandler.java:110)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:764)
at 
io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1071)
at 
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
... 1 more
{noformat}

Here is the full sequence of events with geode 1.14. 



1. A subscription is created and marked ready to publish.
2. In another thread, a publish message comes in and starts writing to the 
channel of the subscriber.
3. Netty uses the executor for the channel to perform the write (executor A).
4. The subscription thread changes the executor of the channel in 
changeChannelEventLoopGroup.
5. The write eventually hits the assertion that the executor of the write 
matches the current executor of the channel. Because we changed the executor, 
it no longer matches.
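
The sequence above can be replayed in a small sketch that does not use Netty at all. FakeChannel, executorA, executorB and writeSeesMatchingExecutor are hypothetical names invented for illustration; the ownership check only stands in for the kind of assertion seen in the WriteTimeoutHandler frames of the trace and is not Netty's actual code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a channel whose executor can be swapped; not Netty's API. */
public class ExecutorSwapSketch {

  /** Minimal model of a channel: it only remembers which executor currently owns it. */
  static final class FakeChannel {
    final AtomicReference<ExecutorService> executor = new AtomicReference<>();
  }

  /**
   * Replays steps 2-5 deterministically: capture the executor for a write, optionally
   * swap the channel's executor, then run the write and report whether the check holds.
   */
  static boolean writeSeesMatchingExecutor(boolean swapBeforeWriteRuns) throws Exception {
    ExecutorService executorA = Executors.newSingleThreadExecutor();
    ExecutorService executorB = Executors.newSingleThreadExecutor();
    try {
      FakeChannel channel = new FakeChannel();
      channel.executor.set(executorA);

      // Step 3: the write is bound to the executor the channel had when publish started.
      ExecutorService captured = channel.executor.get();

      // Step 4: the subscription thread swaps the executor before the write has run.
      if (swapBeforeWriteRuns) {
        channel.executor.set(executorB);
      }

      // Step 5: the write runs on the captured executor and checks it still owns the channel.
      return captured.submit(() -> channel.executor.get() == captured).get();
    } finally {
      executorA.shutdown();
      executorB.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("no swap, check holds: " + writeSeesMatchingExecutor(false));
    System.out.println("swap, check holds: " + writeSeesMatchingExecutor(true));
  }
}
```

In this model the check holds when the executor is left alone and fails once it is swapped between capture and execution, which is the window described in steps 3 to 5.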

Since this is a hard

[jira] [Updated] (GEODE-9050) Redis test fails with Netty 4.1.60 and later

2021-11-08 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-9050:
-
Fix Version/s: 1.15.0

> Redis test fails with Netty 4.1.60 and later
> 
>
> Key: GEODE-9050
> URL: https://issues.apache.org/jira/browse/GEODE-9050
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Owen Nichols
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> {{PubSubIntegrationTest > ensureOrderingOfPublishedMessages}} 
> [fails|http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-6153/test-results/integrationTest/1616031328/index.html]
>  reliably, on both Linux and Windows, if I [bump 
> Netty|https://github.com/apache/geode/pull/6153/commits/03b81f93b011377a5021a4b87acecacfa02b93a4]
>  from 4.1.59.Final to 4.1.60.Final.  It's important to keep up to date with 
> the latest versions of our 3rd-party dependencies, but I am breaking this 
> out separately so someone with redis knowledge can tackle it.





[jira] [Updated] (GEODE-9050) Redis test fails with Netty 4.1.60 and later

2021-11-08 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-9050:
-
Affects Version/s: 1.14.0

> Redis test fails with Netty 4.1.60 and later
> 
>
> Key: GEODE-9050
> URL: https://issues.apache.org/jira/browse/GEODE-9050
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Owen Nichols
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
>
> {{PubSubIntegrationTest > ensureOrderingOfPublishedMessages}} 
> [fails|http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-6153/test-results/integrationTest/1616031328/index.html]
>  reliably, on both Linux and Windows, if I [bump 
> Netty|https://github.com/apache/geode/pull/6153/commits/03b81f93b011377a5021a4b87acecacfa02b93a4]
>  from 4.1.59.Final to 4.1.60.Final.  It's important to keep up to date with 
> the latest versions of our 3rd-party dependencies, but I am breaking this 
> out separately so someone with redis knowledge can tackle it.





[jira] [Commented] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2021-11-08 Thread Kirk Lund (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440654#comment-17440654
 ] 

Kirk Lund commented on GEODE-9704:
--

Client cache should be cleared before sending ready for events.
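
A minimal sketch of why the order matters, assuming the worst-case interleaving in which every queued event arrives between the ready signal and the clear. RecoveryOrderSketch, recover and deliver are hypothetical names for illustration only; none of them are Geode APIs, and the client cache is modeled as a plain map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of durable-client recovery ordering; not Geode code. */
public class RecoveryOrderSketch {

  /**
   * Returns the client cache contents after recovery. queuedKeys models the events
   * the server delivers as soon as the client signals that it is ready.
   */
  static Map<String, String> recover(List<String> queuedKeys, boolean clearBeforeReady) {
    Map<String, String> clientCache = new HashMap<>();
    clientCache.put("stale", "old-value"); // entry left over from before the disconnect

    if (clearBeforeReady) {
      clientCache.clear();              // clear keys of interest first
      deliver(queuedKeys, clientCache); // then signal ready; queued events arrive
    } else {
      deliver(queuedKeys, clientCache); // ready first: queued events arrive right away
      clientCache.clear();              // the later clear wipes them out
    }
    return clientCache;
  }

  /** Applies each queued event to the cache, standing in for server-to-client delivery. */
  private static void deliver(List<String> keys, Map<String, String> cache) {
    for (String key : keys) {
      cache.put(key, "value-" + key);
    }
  }

  public static void main(String[] args) {
    List<String> queued = Arrays.asList("k1", "k2");
    System.out.println("clear then ready: " + recover(queued, true).keySet());
    System.out.println("ready then clear: " + recover(queued, false).keySet());
  }
}
```

With the clear first, the queued keys survive recovery; with the ready signal first, the model ends with an empty cache, which matches the symptom reported in the issue.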

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Kirk Lund
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.0​
>
> This is the old Geode behavior, but it may or may not be the correct behavior.
> When a durable client recovers, a queueTimer thread runs the 
> `QueueManagerImp.recoverPrimary` method, which:
>  * makes a new connection to the server
>  - sends readyForEvents (which causes the server to start sending the 
> queued events)
>  - recovers interest
>    - clears the region of keys of interest
>    - re-registers interest
> It sends readyForEvents before it clears the region of keys of interest. If 
> the server sends events for those keys in between, the client clears them, 
> so it appears to the user that the client region doesn't have those keys. 
>  
> To reproduce, run the geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient()
> with the InterestResultPolicy changed to NONE; the test fails occasionally. 
> Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` makes the test fail more 
> consistently.





[jira] [Commented] (GEODE-8056) CI: ReplicateEntryIdleExpirationDistributedTest.readsInOtherMemberShouldPreventExpirationWhenEvictionEnabled FAILED

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440652#comment-17440652
 ] 

Geode Integration commented on GEODE-8056:
--

Seen in [distributed-test-openjdk8 
#2574|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2574]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636227061/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636227061/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> CI: 
> ReplicateEntryIdleExpirationDistributedTest.readsInOtherMemberShouldPreventExpirationWhenEvictionEnabled
>  FAILED
> ---
>
> Key: GEODE-8056
> URL: https://issues.apache.org/jira/browse/GEODE-8056
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.12.0
>Reporter: Jinmei Liao
>Priority: Major
>
> Happened once: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/111#A
> org.apache.geode.internal.cache.ReplicateEntryIdleExpirationDistributedTest > 
> readsInOtherMemberShouldPreventExpirationWhenEvictionEnabled FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.ReplicateEntryIdleExpirationDistributedTest$$Lambda$55/10096754.run
>  in VM 0 running on Host debd6ade8357 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.internal.cache.ReplicateEntryIdleExpirationDistributedTest.readsInOtherMemberShouldPreventExpirationWhenEvictionEnabled(ReplicateEntryIdleExpirationDistributedTest.java:197)
> Caused by:
> org.junit.ComparisonFailure: expected:<[tru]e> but was:<[fals]e>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.ReplicateEntryIdleExpirationDistributedTest.lambda$readsInOtherMemberShouldPreventExpirationWhenEvictionEnabled$a3e1e98a$5(ReplicateEntryIdleExpirationDistributedTest.java:207)





[jira] [Assigned] (GEODE-9770) CI Failure: ConflictingPersistentDataException in PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict

2021-11-08 Thread Kirk Lund (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk Lund reassigned GEODE-9770:


Assignee: Kirk Lund  (was: Joris Melchior)

> CI Failure: ConflictingPersistentDataException in 
> PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict
> -
>
> Key: GEODE-9770
> URL: https://issues.apache.org/jira/browse/GEODE-9770
> Project: Geode
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 1.15.0
>Reporter: Nabarun Nag
>Assignee: Kirk Lund
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> This ConflictingPersistentDataException has popped up multiple times.
> GEODE-6975
> GEODE-7898
>  
> {noformat}
> PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest$$Lambda$477/1255368072.run
>  in VM 0 running on Host 
> heavy-lifter-7860ae84-3be2-5775-9a40-47a7abc4e64d.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.testRecoverAfterConflict(PersistentRecoveryOrderDUnitTest.java:1328)
> Caused by:
> org.apache.geode.cache.CacheClosedException: Region 
> /PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion remote 
> member heavy-lifter-7860ae84-3be2-5775-9a40-47a7abc4e64d(585689):51002 
> with persistent data 
> /10.0.0.50:/tmp/junit4736556655757609006/rootDir-testRecoverAfterConflict/vm-1
>  created at timestamp 1635009815552 version 0 diskStoreId 
> bf4774f44f2e4dcd-aa6c79424132a2e4 name  was not part of the same distributed 
> system as the local data from 
> /10.0.0.50:/tmp/junit4736556655757609006/rootDir-testRecoverAfterConflict/vm-0
>  created at timestamp 1635009814986 version 0 diskStoreId 
> cc4c64d81e9d4119-9e7320b29f540199 name , caused by 
> org.apache.geode.cache.persistence.ConflictingPersistentDataException: Region 
> /PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion remote 
> member heavy-lifter-7860ae84-3be2-5775-9a40-47a7abc4e64d(585689):51002 
> with persistent data 
> /10.0.0.50:/tmp/junit4736556655757609006/rootDir-testRecoverAfterConflict/vm-1
>  created at timestamp 1635009815552 version 0 diskStoreId 
> bf4774f44f2e4dcd-aa6c79424132a2e4 name  was not part of the same distributed 
> system as the local data from 
> /10.0.0.50:/tmp/junit4736556655757609006/rootDir-testRecoverAfterConflict/vm-0
>  created at timestamp 1635009814986 version 0 diskStoreId 
> cc4c64d81e9d4119-9e7320b29f540199 name 
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl$Stopper.generateCancelledException(GemFireCacheImpl.java:5223)
> at 
> org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getInternalResourceManager(GemFireCacheImpl.java:4259)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getInternalResourceManager(GemFireCacheImpl.java:4253)
> at 
> org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1175)
> at 
> org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1095)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3108)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:3002)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2986)
> at 
> org.apache.geode.cache.RegionFactory.create(RegionFactory.java:773)
> at 
> org.apache.geode.internal.cache.InternalRegionFactory.create(InternalRegionFactory.java:75)
> at 
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.createReplicateRegion(PersistentRecoveryOrderDUnitTest.java:1358)
> at 
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.createReplicateRegion(PersistentRecoveryOrderDUnitTest.java:1362)
> at 
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.lambda$testRecoverAfterConflict$bb17a952$5(PersistentRecoveryOrderDUnitTest.java:1330)
> Caused by:
> 
> org.apache.geode.cache.persistence.ConflictingPersistentDataException: Region 
> /PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion re

[jira] [Assigned] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2021-11-08 Thread Kirk Lund (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk Lund reassigned GEODE-9704:


Assignee: Kirk Lund

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Kirk Lund
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.0​
>
> This is the old Geode behavior, but it may or may not be the correct behavior.
> When a durable client recovers, a queueTimer thread runs the 
> `QueueManagerImp.recoverPrimary` method, which:
>  * makes a new connection to the server
>  - sends readyForEvents (which causes the server to start sending the 
> queued events)
>  - recovers interest
>    - clears the region of keys of interest
>    - re-registers interest
> It sends readyForEvents before it clears the region of keys of interest. If 
> the server sends events for those keys in between, the client clears them, 
> so it appears to the user that the client region doesn't have those keys. 
>  
> To reproduce, run the geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient()
> with the InterestResultPolicy changed to NONE; the test fails occasionally. 
> Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` makes the test fail more 
> consistently.





[jira] [Commented] (GEODE-1537) DurableRegistrationDUnitTest.testDurableClientWithRegistrationHA

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440651#comment-17440651
 ] 

Geode Integration commented on GEODE-1537:
--

Seen in [distributed-test-openjdk8 
#2550|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2550]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636203886/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636203886/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> DurableRegistrationDUnitTest.testDurableClientWithRegistrationHA
> 
>
> Key: GEODE-1537
> URL: https://issues.apache.org/jira/browse/GEODE-1537
> Project: Geode
>  Issue Type: Bug
>  Components: client queues
>Reporter: Jinmei Liao
>Priority: Major
>  Labels: CI
>
> Geode_develop_DistributedTests/2883
> Error Message
> com.gemstone.gemfire.test.dunit.RMIException: While invoking 
> com.gemstone.gemfire.internal.cache.tier.sockets.DurableRegistrationDUnitTest$$Lambda$373/449639279.run
>  in VM 1 running on Host timor.gemstone.com with 4 VMs
> Stacktrace
> com.gemstone.gemfire.test.dunit.RMIException: While invoking 
> com.gemstone.gemfire.internal.cache.tier.sockets.DurableRegistrationDUnitTest$$Lambda$373/449639279.run
>  in VM 1 running on Host timor.gemstone.com with 4 VMs
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:389)
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:355)
>   at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:293)
>   at 
> com.gemstone.gemfire.internal.cache.tier.sockets.DurableRegistrationDUnitTest.testDurableClientWithRegistrationHA(DurableRegistrationDUnitTest.java:421)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:112)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor426.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.g

[jira] [Commented] (GEODE-9381) CI Failure: WanAutoDiscoveryDUnitTest.test_LN_Sender_recognises_ALL_NY_Locators FAILED

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440648#comment-17440648
 ] 

Geode Integration commented on GEODE-9381:
--

Seen in [distributed-test-openjdk8 
#2533|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2533]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636194789/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636194789/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> CI Failure: 
> WanAutoDiscoveryDUnitTest.test_LN_Sender_recognises_ALL_NY_Locators FAILED
> --
>
> Key: GEODE-9381
> URL: https://issues.apache.org/jira/browse/GEODE-9381
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>
> {noformat}
> 15:59:33org.apache.geode.internal.cache.wan.misc.WanAutoDiscoveryDUnitTest > 
> test_LN_Sender_recognises_ALL_NY_Locators FAILED
> 15:59:33org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.wan.misc.WanAutoDiscoveryDUnitTest$$Lambda$313/2101673563.run
>  in VM 2 running on Host 7038f08c5687 with 8 VMs
> 15:59:33at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> 15:59:33at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> 15:59:33at 
> org.apache.geode.internal.cache.wan.misc.WanAutoDiscoveryDUnitTest.test_LN_Sender_recognises_ALL_NY_Locators(WanAutoDiscoveryDUnitTest.java:328)
> 15:59:33
> 15:59:33Caused by:
> 15:59:33java.lang.AssertionError: Waited 1 for 
> localhost/127.0.0.1:23752 to be discovered on client. List is now: []
> 15:59:33at org.junit.Assert.fail(Assert.java:89)
> 15:59:33at org.junit.Assert.assertTrue(Assert.java:42)
> 15:59:33at 
> org.apache.geode.internal.cache.wan.WANTestBase.checkLocatorsinSender(WANTestBase.java:3249)
> 15:59:33at 
> org.apache.geode.internal.cache.wan.misc.WanAutoDiscoveryDUnitTest.lambda$test_LN_Sender_recognises_ALL_NY_Locators$d07c$1(WanAutoDiscoveryDUnitTest.java:328)
>  {noformat}
>  
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0317/test-results/distributedTest/1623456621/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0317/test-artifacts/1623456621/distributedtestfiles-openjdk8-1.15.0-build.0317.tgz
>  {noformat}
>  
>  
>  





[jira] [Updated] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-9802:
---
Labels: flaky  (was: flaky-test)
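
The failure mode named in the summary can be illustrated with plain java.net sockets. This is only a sketch under assumptions: PortReuseSketch and secondBindFails are hypothetical names, and the second listener stands in for whatever other process happened to take the port before the server tried to bind it again. It is not the test's code.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical illustration of a bind failing on a port that is already in use. */
public class PortReuseSketch {

  /** Returns true if binding a second listener to an already-bound port is rejected. */
  static boolean secondBindFails() throws IOException {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    // Port 0 asks the operating system for any currently free port.
    try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
      int port = first.getLocalPort();
      // A second listener on the same port models a later bind attempt
      // after something else has already claimed the port.
      try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
        return false;
      } catch (BindException e) {
        return true;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("second bind rejected: " + secondBindFails());
  }
}
```

A port number that was free when first chosen carries no guarantee of being free on a later bind, which is why reusing a fixed number taken from the ephemeral range can fail intermittently.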

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: flaky
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0.0.107[46103]
> at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
> at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
> at 
> org.a

[jira] [Updated] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-9802:
---
Labels: flaky-test  (was: needsTriage)

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: flaky-test
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0.0.107[46103]
> at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
> at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
> 

[jira] [Commented] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440645#comment-17440645
 ] 

Geode Integration commented on GEODE-9802:
--

Seen in [distributed-test-openjdk8 
#2530|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2530]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636187130/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636187130/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: needsTriage
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0

[jira] [Updated] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9802:
-
Labels: needsTriage  (was: )

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers, 
> leading to occasional failures with java.net.BindException: Address already 
> in use
> -
>
> Key: GEODE-9802
> URL: https://issues.apache.org/jira/browse/GEODE-9802
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: needsTriage
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
>  in VM -1 running on Host 
> heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Reconnect attempts terminated due to exception, caused by 
> org.apache.geode.GemFireIOException: While starting cache server CacheServer 
> on port=46103 client subscription config policy=none client subscription 
> config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
> at 
> org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> at 
> org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
> ... 2 more
> Caused by:
> org.apache.geode.GemFireIOException: While starting cache server 
> CacheServer on port=46103 client subscription config policy=none client 
> subscription config capacity=1 client subscription config overflow directory=.
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> 10.0.0.107[46103]
> at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
> at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
>  

[jira] [Created] (GEODE-9802) LoggingWithReconnectDistributedTest uses ephemeral port to create servers, leading to occasional failures with java.net.BindException: Address already in use

2021-11-08 Thread Donal Evans (Jira)
Donal Evans created GEODE-9802:
--

 Summary: LoggingWithReconnectDistributedTest uses ephemeral port 
to create servers, leading to occasional failures with java.net.BindException: 
Address already in use
 Key: GEODE-9802
 URL: https://issues.apache.org/jira/browse/GEODE-9802
 Project: Geode
  Issue Type: Bug
Affects Versions: 1.15.0
Reporter: Donal Evans


Seen originally in distributed mass test run:
{noformat}
> Task :geode-core:distributedTest

LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run
 in VM -1 running on Host 
heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal 
with 4 VMs
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
at 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)

Caused by:
org.apache.geode.distributed.DistributedSystemDisconnectedException: 
Reconnect attempts terminated due to exception, caused by 
org.apache.geode.GemFireIOException: While starting cache server CacheServer on 
port=46103 client subscription config policy=none client subscription config 
capacity=1 client subscription config overflow directory=.
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
at 
org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
... 2 more

Caused by:
org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=46103 client subscription config policy=none client 
subscription config capacity=1 client subscription config overflow directory=.
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
at java.lang.Thread.run(Thread.java:748)

Caused by:
java.net.BindException: Failed to create server socket on 
10.0.0.107[46103]
at 
org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
at 
org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
at 
org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorBuilder.create(AcceptorBuilder.java:291)
at 
org.apache.geode.internal.cache.CacheServerImpl.createAcceptor(CacheServerImpl.java:420)
at 
org.apache.geode.internal.cache.CacheServerImpl.start(CacheServerImpl.java:377)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2769)
... 7 more
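
The trace above ends in a java.net.BindException while the cache server is being re-created on port 46103 during reconnect. A minimal sketch of how that can happen, assuming the port was originally handed out by the OS and then claimed by something else before the server tried to bind it again. The class and method names are hypothetical and this is not the test's or Geode's actual code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

class EphemeralPortReuseSketch {

  /**
   * Returns true if re-binding a remembered port fails with BindException
   * because another socket claimed the port in the meantime.
   */
  static boolean rebindFailsWhenPortWasTaken() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try {
      int rememberedPort;
      // Port 0 asks the OS to pick any free port.
      try (ServerSocket original = new ServerSocket(0, 50, loopback)) {
        rememberedPort = original.getLocalPort();
      } // closed here: stands in for the server going away during a disconnect

      // Stand-in for an unrelated process that grabs the now-free port.
      try (ServerSocket intruder = new ServerSocket(rememberedPort, 50, loopback)) {
        // Stand-in for the reconnect trying to restart on the remembered port.
        try (ServerSocket restarted = new ServerSocket(rememberedPort, 50, loopback)) {
          return false; // bind unexpectedly succeeded
        } catch (BindException expected) {
          return true;
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("rebind failed: " + rebindFailsWhenPortWasTaken());
  }
}
```

The point of the sketch is only that a port number obtained once is not reserved afterwards; whether that is the exact sequence in this CI run is not confirmed by the trace.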


[jira] [Commented] (GEODE-9454) The client, if multiple operations are in flight, should not flood the authentication server with re-authentication requests.

2021-11-08 Thread Jinmei Liao (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440634#comment-17440634
 ] 

Jinmei Liao commented on GEODE-9454:


Work done in this PR solves this issue.

> The client, if multiple operations are in flight, should not flood the 
> authentication server with re-authentication requests.
> -
>
> Key: GEODE-9454
> URL: https://issues.apache.org/jira/browse/GEODE-9454
> Project: Geode
>  Issue Type: Sub-task
>  Components: core, security
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI
> Fix For: 1.15.0
>
>
> The blocking should apply to one client only.
> Have a test with multiple clients, each with multiple threads doing cache 
> operations.
> The test should also cover the multi-user authentication mode.
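
One way to keep a client with many in-flight operations from flooding the authentication server is to let a single thread perform the re-authentication while the others wait for its result. The sketch below illustrates that idea only; the class name, the Supplier standing in for the server call, and the token strings are all hypothetical and not taken from the PR:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class SingleFlightReauthSketch {
  // The re-authentication currently in progress for THIS client, if any.
  private final AtomicReference<CompletableFuture<String>> inFlight = new AtomicReference<>();
  private final Supplier<String> authServerCall; // hypothetical stand-in for the real request

  SingleFlightReauthSketch(Supplier<String> authServerCall) {
    this.authServerCall = authServerCall;
  }

  String reauthenticate() {
    while (true) {
      CompletableFuture<String> current = inFlight.get();
      if (current != null) {
        return current.join(); // piggyback on the request already in flight
      }
      CompletableFuture<String> mine = new CompletableFuture<>();
      if (inFlight.compareAndSet(null, mine)) {
        try {
          String token = authServerCall.get();
          mine.complete(token);
          return token;
        } catch (RuntimeException e) {
          mine.completeExceptionally(e);
          throw e;
        } finally {
          inFlight.set(null); // a later expiration starts a fresh request
        }
      }
    }
  }

  /** Runs the given number of threads against one client and counts server calls. */
  static int authCallsFor(int threads) {
    AtomicInteger calls = new AtomicInteger();
    SingleFlightReauthSketch client = new SingleFlightReauthSketch(() -> {
      int n = calls.incrementAndGet();
      try {
        Thread.sleep(300); // simulate a slow authentication server
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "token-" + n;
    });
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await();
          client.reauthenticate();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      workers[i].start();
    }
    start.countDown();
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return calls.get();
  }

  public static void main(String[] args) {
    System.out.println("auth server calls for 8 threads: " + authCallsFor(8));
  }
}
```

Because the in-flight state is an instance field, waiting is scoped to one client object; a second client would re-authenticate independently.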





[jira] [Comment Edited] (GEODE-9739) CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed with MemberDisconnectedException

2021-11-08 Thread Kamilla Aslami (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440625#comment-17440625
 ] 

Kamilla Aslami edited comment on GEODE-9739 at 11/8/21, 5:36 PM:
-

Removing membership component because this issue seems to be caused by the 
"create gateway-sender" command and is not membership-related. Here's my 
analysis:

This test creates two WAN sites: one with three 1.12 members (vm0, vm1, and 
vm2) and another with three 1.15 members (vm4, vm5, and vm6).
It then upgrades vm0 and vm2 to 1.15 and attempts to create a gateway sender:
{noformat}
[vm0] [info 2021/10/13 23:34:21.093 UTC   
tid=0x62] Executing command: create gateway-sender --id=toSite2 
--remote-distributed-system-id=1{noformat}
This command is supposed to fail, and I see the expected error message in the 
logs ~40 seconds later (usually it fails without delay):
{noformat}
[vm4] [info 2021/10/13 23:35:00.906 UTC   tid=0x37d] Shutting 
down DistributionManager 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919. 
At least one Exception occurred.

Command result for : 
Gateway Sender cannot be created until all members are the current 
version{noformat}
The "create gateway-sender" command appears to have hung: there were no logs 
for at least 15 seconds after it was executed.

26 seconds after the command was executed, vm4, vm5, and vm6 started having 
problems connecting to each other:
{noformat}
[vm6] [info 2021/10/13 23:34:47.076 UTC   
tid=0x35f] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919: 
Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.083 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.134 UTC   
tid=0x360] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920: Member 
isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.787 UTC   
tid=0x35f] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.809 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.844 UTC   
tid=0x360] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] All other members are suspect at this point
...{noformat}
Most of the availability checks passed, but one member got kicked out, and the 
quorum was lost:
{noformat}
[vm4] [info 2021/10/13 23:34:54.355 UTC   
tid=0x35a] View Creator is processing 1 requests for the next membership view 
([RemoveMemberMessage(heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920;
 reason=Member isn't responding to heartbeat requests)])
[vm4] [info 2021/10/13 23:34:54.486 UTC   
tid=0x35a]   heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
had a weight of 15
[vm4] [warn 2021/10/13 23:34:54.517 UTC   
tid=0x35a] total weight lost in this view change is 15 of 28.  Quorum has been 
lost!
[vm4] [fatal 2021/10/13 23:34:54.989 UTC   
tid=0x35a] Possible loss of quorum due to the loss of 1 cache processes: 
[heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]{noformat}
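
The "15 of 28" figure in the log above can be checked with a little arithmetic. The sketch assumes that quorum counts as lost once more than half of the previous view's total weight is gone; that threshold is an assumption chosen to match the log message, not a quote of Geode's code, and the class and method names are hypothetical:

```java
class QuorumWeightSketch {

  /** Assumed rule: quorum is lost when more than half of the total weight is gone. */
  static boolean quorumLost(int lostWeight, int totalWeight) {
    return 2 * lostWeight > totalWeight;
  }

  /** Share of the total weight that was lost, as a percentage. */
  static double lostPercent(int lostWeight, int totalWeight) {
    return 100.0 * lostWeight / totalWeight;
  }

  public static void main(String[] args) {
    int lost = 15;  // weight of the removed member, from the log
    int total = 28; // total weight of the previous view, from the log
    System.out.println("lost share (%): " + lostPercent(lost, total));
    System.out.println("quorum lost: " + quorumLost(lost, total));
  }
}
```

Under that assumption, removing the single member with weight 15 is enough to cross the threshold, since the two remaining members together hold only 13 of 28.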
vm4, vm5, and vm6 also reported that heartbeat-generation thread overslept by 
more than a full period:
{noformat}
[vm6] [warn 2021/10/13 23:34:46.402 UTC   tid=0x343] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 7,274,351,195 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm4] [warn 2021/10/13 23:34:49.469 UTC   tid=0x358] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 19,549,608,016 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm5] [warn 2021/10/13 23:34:52.864 UTC   tid=0x338] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 5,271,931,427 nanoseconds. Period: 2,500,000,000 
nanoseconds.{noformat}
The "heartbeat-generation thread overslept" message implies that there was a 
resource issue. In this case, it might've been caused by the create 
gateway-sender command as it took 40 seconds to complete. I don't see 
membership-related issues in the logs; the membership component operated 
correctly based upon the analysis.
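
To put the three "overslept" warnings in proportion, the sketch below compares each reported asleep time with the 2,500,000,000 ns period. It assumes the warning fires when the thread sleeps more than one full period beyond its intended period; that condition is inferred from the wording of the log message, and the class and method names are hypothetical:

```java
class HeartbeatOversleepSketch {

  /** Nanoseconds slept beyond the intended period. */
  static long oversleptNanos(long asleepNanos, long periodNanos) {
    return asleepNanos - periodNanos;
  }

  /** Assumed rule: warn when the oversleep exceeds one full period. */
  static boolean oversleptByMoreThanAPeriod(long asleepNanos, long periodNanos) {
    return oversleptNanos(asleepNanos, periodNanos) > periodNanos;
  }

  public static void main(String[] args) {
    long period = 2_500_000_000L; // from the log
    long[] asleep = {
      7_274_351_195L,  // vm6
      19_549_608_016L, // vm4
      5_271_931_427L   // vm5
    };
    for (long a : asleep) {
      System.out.println("asleep=" + a + " ns, overslept=" + oversleptNanos(a, period)
          + " ns, warn=" + oversleptByMoreThanAPeriod(a, period));
    }
  }
}
```

All three values are more than twice the period, and the vm4 pause of roughly 19.5 seconds spans several heartbeat periods, which is consistent with the resource-starvation reading in the comment.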

 

I ran this test 200 times but couldn't reproduce the issue.


was (Author: kaslami):
Removing membership component because this issue seems to be caused by the 
"create gateway-sender" command and is not membership-related. Here's my 
analys

[jira] [Comment Edited] (GEODE-9739) CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed with MemberDisconnectedException

2021-11-08 Thread Kamilla Aslami (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440625#comment-17440625
 ] 

Kamilla Aslami edited comment on GEODE-9739 at 11/8/21, 5:33 PM:
-

Removing membership component because this issue seems to be caused by the 
"create gateway-sender" command and is not membership-related. Here's my 
analysis:

This test creates two WAN sites: one with three 1.12 members (vm0, vm1, and 
vm2) and another with three 1.15 members (vm4, vm5, and vm6).
It then upgrades vm0 and vm2 to 1.15 and attempts to create a gateway sender:
{noformat}
[vm0] [info 2021/10/13 23:34:21.093 UTC   
tid=0x62] Executing command: create gateway-sender --id=toSite2 
--remote-distributed-system-id=1{noformat}
This command is supposed to fail, and I see the expected error message in the 
logs ~40 seconds later (usually it fails without delay):
{noformat}
[vm4] [info 2021/10/13 23:35:00.906 UTC   tid=0x37d] Shutting 
down DistributionManager 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919. 
At least one Exception occurred.

Command result for : 
Gateway Sender cannot be created until all members are the current 
version{noformat}
The "create gateway-sender" command appears to have hung: there were no logs 
for at least 15 seconds after it was executed. About 25 seconds after the 
command was executed, vm4, vm5, and vm6 started having problems connecting to 
each other:
{noformat}
[vm6] [info 2021/10/13 23:34:47.076 UTC   
tid=0x35f] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919: 
Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.083 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.134 UTC   
tid=0x360] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920: Member 
isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.787 UTC   
tid=0x35f] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.809 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.844 UTC   
tid=0x360] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] All other members are suspect at this point
...{noformat}
Most of the availability checks passed, but one member got kicked out, and the 
quorum was lost:
{noformat}
[vm4] [info 2021/10/13 23:34:54.355 UTC   
tid=0x35a] View Creator is processing 1 requests for the next membership view 
([RemoveMemberMessage(heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920;
 reason=Member isn't responding to heartbeat requests)])
[vm4] [info 2021/10/13 23:34:54.486 UTC   
tid=0x35a]   heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
had a weight of 15
[vm4] [warn 2021/10/13 23:34:54.517 UTC   
tid=0x35a] total weight lost in this view change is 15 of 28.  Quorum has been 
lost!
[vm4] [fatal 2021/10/13 23:34:54.989 UTC   
tid=0x35a] Possible loss of quorum due to the loss of 1 cache processes: 
[heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]{noformat}
vm4, vm5, and vm6 also reported that heartbeat-generation thread overslept by 
more than a full period:
{noformat}
[vm6] [warn 2021/10/13 23:34:46.402 UTC   tid=0x343] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 7,274,351,195 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm4] [warn 2021/10/13 23:34:49.469 UTC   tid=0x358] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 19,549,608,016 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm5] [warn 2021/10/13 23:34:52.864 UTC   tid=0x338] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 5,271,931,427 nanoseconds. Period: 2,500,000,000 
nanoseconds.{noformat}
The "heartbeat-generation thread overslept" message implies that there was a 
resource issue. In this case, it might've been caused by the create 
gateway-sender command as it took 40 seconds to complete. I don't see 
membership-related issues in the logs; the membership component operated 
correctly based upon the analysis.

 

I ran this test 200 times but couldn't reproduce the issue.


was (Author: kaslami):
Removing membership component because this issue seems to be caused by the 
"create gateway-sender" command and is not membership-related. Here's my 
analy

[jira] [Updated] (GEODE-9739) CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed with MemberDisconnectedException

2021-11-08 Thread Kamilla Aslami (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamilla Aslami updated GEODE-9739:
--
Component/s: wan
 (was: membership)

> CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed 
> with MemberDisconnectedException
> --
>
> Key: GEODE-9739
> URL: https://issues.apache.org/jira/browse/GEODE-9739
> Project: Geode
> Issue Type: Bug
> Components: wan
> Affects Versions: 1.15.0
> Reporter: Kamilla Aslami
> Priority: Major
> Labels: needsTriage
>
> {noformat}
> WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo > 
> CreateGatewaySenderMixedSiteOneCurrentSiteTwo[from_v1.12.2] FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm6.log' at line 481[fatal 
> 2021/10/13 23:34:55.115 UTC  receiver,heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3-37210> tid=830] 
> Membership service failure: Membership coordinator 
> heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919
>  has declared that a network partition has occurred
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Membership coordinator 
> heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919
>  has declared that a network partition has occurred
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:1807)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1122)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1466)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1367)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1303)
> at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
> at org.jgroups.JChannel.up(JChannel.java:741)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
> at 
> org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
> at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
> at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
> at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
> at org.jgroups.protocols.TP.receive(TP.java:1714)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:160)
> at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
> at java.base/java.lang.Thread.run(Thread.java:829)
> ---
> Found suspect string in 'dunit_suspect-vm4.log' at line 549[fatal 
> 2021/10/13 23:34:54.989 UTC  tid=858] Possible 
> loss of quorum due to the loss of 1 cache processes: 
> [heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]
> ---
> Found suspect string in 'dunit_suspect-vm4.log' at line 551[fatal 
> 2021/10/13 23:34:56.179 UTC  tid=858] 
> Membership service failure: Exiting due to possible network partition event 
> due to loss of 1 cache processes: 
> [heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Exiting due to possible network partition event due to loss of 1 cache 
> processes: 
> [heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMem

[jira] [Commented] (GEODE-9739) CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed with MemberDisconnectedException

2021-11-08 Thread Kamilla Aslami (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440625#comment-17440625
 ] 

Kamilla Aslami commented on GEODE-9739:
---

Removing membership component because this issue seems to be caused by the 
"create gateway-sender" command and is not membership-related. Here's my 
analysis:


This test creates two WAN sites: one with three 1.12 members (vm0, vm1, and 
vm2) and another with three 1.15 members (vm4, vm5, and vm6).
It then upgrades vm0 and vm2 to 1.15 and attempts to create a gateway sender:
{noformat}
[vm0] [info 2021/10/13 23:34:21.093 UTC   
tid=0x62] Executing command: create gateway-sender --id=toSite2 
--remote-distributed-system-id=1{noformat}
This command is supposed to fail, and I see the expected error message in the 
logs ~40 seconds later (usually it fails without delay):
{noformat}
[vm4] [info 2021/10/13 23:35:00.906 UTC   tid=0x37d] Shutting 
down DistributionManager 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919. 
At least one Exception occurred.

Command result for : 
Gateway Sender cannot be created until all members are the current 
version{noformat}
The "create gateway-sender" command appears to have hung: there were no log 
entries for at least 15 seconds after it was executed. About 25 seconds after 
the command was executed, vm4, vm5, and vm6 started having problems connecting 
to each other:
{noformat}
[vm6] [info 2021/10/13 23:34:47.076 UTC   
tid=0x35f] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919: 
Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.083 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.134 UTC   
tid=0x360] received suspect message from myself for 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920: Member 
isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.787 UTC   
tid=0x35f] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(39325:locator):53919 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.809 UTC   
tid=0x35f] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.844 UTC   
tid=0x360] All other members are suspect at this point
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] Performing availability check for suspect member 
heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
reason=Member isn't responding to heartbeat requests
[vm6] [info 2021/10/13 23:34:47.866 UTC   
tid=0x362] All other members are suspect at this point
...{noformat}
Most of the availability checks passed, but one member got kicked out, and the 
quorum was lost:
{noformat}
[vm4] [info 2021/10/13 23:34:54.355 UTC   
tid=0x35a] View Creator is processing 1 requests for the next membership view 
([RemoveMemberMessage(heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920;
 reason=Member isn't responding to heartbeat requests)])
[vm4] [info 2021/10/13 23:34:54.486 UTC   
tid=0x35a]   heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920 
had a weight of 15
[vm4] [warn 2021/10/13 23:34:54.517 UTC   
tid=0x35a] total weight lost in this view change is 15 of 28.  Quorum has been 
lost!
[vm4] [fatal 2021/10/13 23:34:54.989 UTC   
tid=0x35a] Possible loss of quorum due to the loss of 1 cache processes: 
[heavy-lifter-f2dde05e-a413-5672-b192-2396306457b3(41092):53920]{noformat}
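The "15 of 28" arithmetic in the log above can be reproduced with a small sketch. It assumes the member weights described in the Geode network-partitioning documentation (10 per cache server, 5 extra for the lead member, 3 per locator) and assumes that quorum counts as lost once the crashed weight exceeds roughly 51% of the previous view's total weight. The class and method names are hypothetical, and this is an illustration rather than the membership code itself.

```java
/** Hypothetical sketch of a weight-based quorum check; not Geode's implementation. */
final class QuorumSketch {
  // Assumed weights: cache server 10, lead-member bonus 5, locator 3.
  static final int SERVER_WEIGHT = 10;
  static final int LEAD_BONUS = 5;
  static final int LOCATOR_WEIGHT = 3;

  /** Assumed rule: quorum is lost when the lost weight exceeds 51% of the old total. */
  static boolean quorumLost(int lostWeight, int totalWeight) {
    int failurePoint = (int) (Math.round(51.0 * totalWeight) / 100.0);
    return lostWeight > failurePoint;
  }

  public static void main(String[] args) {
    // One lead server, one other server, and one locator give the logged total.
    int total = (SERVER_WEIGHT + LEAD_BONUS) + SERVER_WEIGHT + LOCATOR_WEIGHT;
    // The removed member had a weight of 15, which matches an assumed lead server.
    int lost = SERVER_WEIGHT + LEAD_BONUS;
    System.out.println("lost " + lost + " of " + total
        + ", quorumLost=" + quorumLost(lost, total));
  }
}
```

Under these assumptions, losing the single member with weight 15 is enough to cross the threshold, whereas losing a weight-10 server or the weight-3 locator alone would not be.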
vm4, vm5, and vm6 also reported that the heartbeat-generation thread overslept 
by more than a full period:
{noformat}
[vm6] [warn 2021/10/13 23:34:46.402 UTC   tid=0x343] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 7,274,351,195 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm4] [warn 2021/10/13 23:34:49.469 UTC   tid=0x358] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 19,549,608,016 nanoseconds. Period: 2,500,000,000 
nanoseconds.
[vm5] [warn 2021/10/13 23:34:52.864 UTC   tid=0x338] 
Failure detection heartbeat-generation thread overslept by more than a full 
period. Asleep time: 5,271,931,427 nanoseconds. Period: 2,500,000,000 
nanoseconds.{noformat}
The "heartbeat-generation thread overslept" message implies that there was a 
resource issue. In this case, it might've been caused by the create 
gateway-sender command as it took 40 seconds to complete. I don't see 
membership-related issues in the logs; the membership component operated 
correctly based upon the analysis.
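
To put the three "overslept" warnings in proportion, the reported sleep times can be compared with the 2,500,000,000 ns period. The sketch below uses only the numbers quoted in the log lines above; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for reading the "overslept" warnings; not Geode code. */
final class OversleepSketch {
  static final long PERIOD_NANOS = 2_500_000_000L;

  /** Number of whole heartbeat periods that elapsed during one sleep. */
  static long periodsElapsed(long asleepNanos) {
    return asleepNanos / PERIOD_NANOS;
  }

  /** Time slept beyond the single intended period, in milliseconds. */
  static long overshootMillis(long asleepNanos) {
    return (asleepNanos - PERIOD_NANOS) / 1_000_000L;
  }

  public static void main(String[] args) {
    // Sleep times copied from the three warnings in the log excerpt.
    Map<String, Long> asleep = new LinkedHashMap<>();
    asleep.put("vm6", 7_274_351_195L);
    asleep.put("vm4", 19_549_608_016L);
    asleep.put("vm5", 5_271_931_427L);
    asleep.forEach((vm, nanos) -> System.out.println(
        vm + ": " + periodsElapsed(nanos) + " whole periods, overshoot "
            + overshootMillis(nanos) + " ms"));
  }
}
```

On vm4 the thread was asleep for about 19.5 seconds, i.e. seven whole periods, which fits the reading that the JVMs were starved of resources during this window.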

> CI: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo failed 
> with MemberDisconnectedException
> --
>
> Key: GEODE-9739
> URL: https://issue

[jira] [Updated] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-8644:
---
Labels: needsTriage pull-request-available  (was: pull-request-available)

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
> Issue Type: Bug
> Affects Versions: 1.15.0
> Reporter: Benjamin P Ross
> Assignee: Benjamin P Ross
> Priority: Major
> Labels: needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2-second delay to allow the queues to finish draining after the 
> put operation completes. If the queues take longer than 2 seconds to drain, 
> the test will fail. We should change the test to wait for the queues to be 
> empty, with a long timeout in case the queues never fully drain.
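
The change proposed in the description, waiting for the queues to drain rather than sleeping for a fixed interval, could look roughly like the polling helper below. It uses only the JDK so that it stays self-contained; in a real test a library such as Awaitility would normally do this job. The `waitUntil` helper and the in-memory queue standing in for the gateway sender queue are assumptions made for illustration.

```java
import java.time.Duration;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: wait for a condition instead of sleeping a fixed time. */
final class DrainWaitSketch {

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition held in time, false on timeout or interrupt.
   */
  static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out: the caller should fail the test
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-in for a gateway sender queue; a real test would query the sender.
    Queue<Integer> queue = new ConcurrentLinkedQueue<>();
    for (int i = 0; i < 50; i++) {
      queue.add(i);
    }
    // Background thread that empties the queue.
    Thread drainer = new Thread(() -> {
      while (queue.poll() != null) {
        // keep draining until the queue is empty
      }
    });
    drainer.start();

    boolean drained = waitUntil(queue::isEmpty, Duration.ofMinutes(5), Duration.ofMillis(50));
    System.out.println("drained=" + drained);
  }
}
```

The long timeout only costs time when the queue never drains; in the normal case the helper returns as soon as the queue is observed to be empty.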



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-08 Thread Donal Evans (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440622#comment-17440622
 ] 

Donal Evans commented on GEODE-8644:


Given the continued appearance of this failure in mass test runs, despite the 
changes in [GitHub Pull Request 
#5796|https://github.com/apache/geode/pull/5796] increasing the timeout to 5 
minutes, it seems that the original analysis that the failure was caused by a 
too-short timeout is incorrect and that further investigation is needed.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
> Issue Type: Bug
> Affects Versions: 1.15.0
> Reporter: Benjamin P Ross
> Assignee: Benjamin P Ross
> Priority: Major
> Labels: pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2-second delay to allow the queues to finish draining after the 
> put operation completes. If the queues take longer than 2 seconds to drain, 
> the test will fail. We should change the test to wait for the queues to be 
> empty, with a long timeout in case the queues never fully drain.





[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440620#comment-17440620
 ] 

Geode Integration commented on GEODE-8644:
--

Seen in [distributed-test-openjdk8 
#2523|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2523]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636186509/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636186509/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].






[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440619#comment-17440619
 ] 

Geode Integration commented on GEODE-8644:
--

Seen in [distributed-test-openjdk8 
#2522|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2522]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636186347/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636186347/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].






[jira] [Commented] (GEODE-7739) JMX managers may fail to federate mbeans for other members

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440605#comment-17440605
 ] 

Geode Integration commented on GEODE-7739:
--

Seen in [distributed-test-openjdk8 
#2517|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2517]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636178995/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636178995/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> JMX managers may fail to federate mbeans for other members
> --
>
> Key: GEODE-7739
> URL: https://issues.apache.org/jira/browse/GEODE-7739
> Project: Geode
> Issue Type: Bug
> Components: jmx
> Reporter: Kirk Lund
> Assignee: Kirk Lund
> Priority: Major
> Labels: GeodeOperationAPI, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> JMX Manager may fail to federate one or more MXBeans for other members 
> because of a race condition during startup. When ManagementCacheListener is 
> first constructed, it is in a state that will ignore all callbacks because 
> the field readyForEvents is false.
> 
> Debugging with JMXMBeanReconnectDUnitTest revealed this bug.
> The test starts two locators with a JMX manager configured and started. 
> Locator1 always has all of locator2's mbeans, but locator2 is intermittently 
> missing the personal mbeans of locator1. 
> I think this is caused by some sort of race condition in the code that 
> creates the monitoring regions for other members in locator2.
> It's possible that the JMX manager that hits this bug might also fail to have 
> mbeans for servers as well as other locators, but I haven't seen a test case 
> for this scenario.
> The exposure of this bug means that a user running more than one locator 
> might have a locator that is missing one or more mbeans for the cluster.
> 
> Studying the JMX code also reveals the existence of *GEODE-8012*.
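
The startup race described above can be illustrated with a deliberately simplified listener. This is not the ManagementCacheListener code: the class names, the `markReady()` method, and the buffering variant are assumptions made for illustration only. The first listener drops any callback that arrives while `readyForEvents` is false; the second queues such callbacks and replays them once it is marked ready.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a listener that ignores early callbacks. */
final class ListenerRaceSketch {

  /** Drops callbacks that arrive before markReady(), as the description reports. */
  static class DroppingListener {
    private boolean readyForEvents = false;
    final List<String> federated = new ArrayList<>();

    synchronized void markReady() {
      readyForEvents = true;
    }

    synchronized void afterCreate(String mbeanName) {
      if (!readyForEvents) {
        return; // an early event is silently lost
      }
      federated.add(mbeanName);
    }
  }

  /** Assumed alternative: buffer early callbacks and replay them on markReady(). */
  static class BufferingListener {
    private boolean readyForEvents = false;
    private final List<String> pending = new ArrayList<>();
    final List<String> federated = new ArrayList<>();

    synchronized void markReady() {
      readyForEvents = true;
      federated.addAll(pending); // replay what arrived too early
      pending.clear();
    }

    synchronized void afterCreate(String mbeanName) {
      if (readyForEvents) {
        federated.add(mbeanName);
      } else {
        pending.add(mbeanName);
      }
    }
  }

  public static void main(String[] args) {
    DroppingListener dropping = new DroppingListener();
    dropping.afterCreate("locator1-mbean"); // arrives before the listener is ready
    dropping.markReady();
    dropping.afterCreate("server1-mbean");
    System.out.println("dropping listener sees: " + dropping.federated);

    BufferingListener buffering = new BufferingListener();
    buffering.afterCreate("locator1-mbean");
    buffering.markReady();
    buffering.afterCreate("server1-mbean");
    System.out.println("buffering listener sees: " + buffering.federated);
  }
}
```

Whether buffering is the right remedy for the real listener is not established by the ticket; the sketch only shows why an event delivered before the ready flag is set never shows up later.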





[jira] [Commented] (GEODE-7739) JMX managers may fail to federate mbeans for other members

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440606#comment-17440606
 ] 

Geode Integration commented on GEODE-7739:
--

Seen in [distributed-test-openjdk8 
#2564|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2564]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636218797/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636218797/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].






[jira] [Commented] (GEODE-9437) Redis session dunit tests are flaky

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440604#comment-17440604
 ] 

Geode Integration commented on GEODE-9437:
--

Seen in [distributed-test-openjdk8 
#2513|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2513]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636178365/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636178365/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> Redis session dunit tests are flaky
> ---
>
> Key: GEODE-9437
> URL: https://issues.apache.org/jira/browse/GEODE-9437
> Project: Geode
> Issue Type: Test
> Components: redis
> Reporter: Jens Deppe
> Priority: Major
>
> The Redis session-related DUnit tests will sometimes fail with errors such as:
> {noformat}
> org.apache.geode.redis.session.RedisSessionDUnitTest > should_storeSession 
> FAILED
> 
> org.springframework.web.client.HttpServerErrorException$InternalServerError: 
> 500 Server Error: [no body]
> at 
> org.springframework.web.client.HttpServerErrorException.create(HttpServerErrorException.java:100)
> at 
> org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:188)
> at 
> org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:125)
> at 
> org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
> at 
> org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:819)
> at 
> org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:777)
> at 
> org.springframework.web.client.RestTemplate.execute(RestTemplate.java:711)
> at 
> org.springframework.web.client.RestTemplate.postForEntity(RestTemplate.java:468)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.createNewSessionWithNote0(SessionDUnitTest.java:207)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.lambda$createNewSessionWithNote$1(SessionDUnitTest.java:201)
> at 
> io.github.resilience4j.retry.Retry.lambda$decorateCallable$5(Retry.java:306)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.createNewSessionWithNote(SessionDUnitTest.java:201)
> at 
> org.apache.geode.redis.session.RedisSessionDUnitTest.should_storeSession(RedisSessionDUnitTest.java:88)
> org.apache.geode.redis.session.RedisSessionDUnitTest > 
> should_propagateSession_toOtherServers FAILED
> 
> org.springframework.web.client.HttpServerErrorException$InternalServerError: 
> 500 Server Error: 
> [{"timestamp":"2021-07-19T15:38:49.855+00:00","status":500,"error":"Internal 
> Server Error","path":"/addSessionNote"}]
> at 
> org.springframework.web.client.HttpServerErrorException.create(HttpServerErrorException.java:100)
> at 
> org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:188)
> at 
> org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:125)
> at 
> org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
> at 
> org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:819)
> at 
> org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:777)
> at 
> org.springframework.web.client.RestTemplate.execute(RestTemplate.java:711)
> at 
> org.springframework.web.client.RestTemplate.postForEntity(RestTemplate.java:468)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.createNewSessionWithNote0(SessionDUnitTest.java:207)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.lambda$createNewSessionWithNote$1(SessionDUnitTest.java:201)
> at 
> io.github.resilience4j.retry.Retry.lambda$decorateCallable$5(Retry.java:306)
> at 
> org.apache.geode.redis.session.SessionDUnitTest.createNewSessionWithNote(SessionDUnitTest.java:201)
> at 
> org.apache.geode.redis.session.RedisSessionDUnitTest.should_propagateSession_toOtherServers(RedisSessionDUnitTest.java:97)
> {noformat}
> It's unclear exactly what is causing the problem, but it seems to be related 
> to Lettuce when servers stop/restart and Lettuce tries to resubmit commands.





[jira] [Commented] (GEODE-8191) MemberMXBeanDistributedTest.testBucketCount fails intermittently

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440603#comment-17440603
 ] 

Geode Integration commented on GEODE-8191:
--

Seen in [distributed-test-openjdk8 
#2504|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2504]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636170293/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636170293/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz].

> MemberMXBeanDistributedTest.testBucketCount fails intermittently
> 
>
> Key: GEODE-8191
> URL: https://issues.apache.org/jira/browse/GEODE-8191
> Project: Geode
> Issue Type: Bug
> Components: jmx, tests
> Affects Versions: 1.14.0, 1.15.0
> Reporter: Kirk Lund
> Assignee: Mario Ivanac
> Priority: Major
> Labels: flaky, pull-request-available
>
> This appears to be a flaky test related to GEODE-7963, which was resolved by 
> Mario Ivanac, so I've assigned the ticket to him.
> {noformat}
> org.apache.geode.management.MemberMXBeanDistributedTest > testBucketCount 
> FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.management.MemberMXBeanDistributedTest Expected bucket count 
> is 4000, and actual count is 3750 expected:<3750> but was:<4000> within 5 
> minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.testBucketCount(MemberMXBeanDistributedTest.java:102)
> Caused by:
> java.lang.AssertionError: Expected bucket count is 4000, and actual 
> count is 3750 expected:<3750> but was:<4000>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:834)
> at org.junit.Assert.assertEquals(Assert.java:645)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.lambda$testBucketCount$1(MemberMXBeanDistributedTest.java:107)
> {noformat}





[jira] [Updated] (GEODE-8191) MemberMXBeanDistributedTest.testBucketCount fails intermittently

2021-11-08 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-8191:
---
Fix Version/s: (was: 1.15.0)






[jira] [Commented] (GEODE-9428) CI Failure: NativeRedisAcceptanceTest fails with CLUSTERDOWN error

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440602#comment-17440602
 ] 

Geode Integration commented on GEODE-9428:
--

Seen in [acceptance-test-openjdk11 
#332|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk11/builds/332]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0646/test-results/acceptanceTest/1636142589/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0646/test-artifacts/1636142589/acceptancetestfiles-openjdk11-1.15.0-build.0646.tgz].

> CI Failure: NativeRedisAcceptanceTest fails with CLUSTERDOWN error
> --
>
> Key: GEODE-9428
> URL: https://issues.apache.org/jira/browse/GEODE-9428
> Project: Geode
> Issue Type: Bug
> Components: redis
> Reporter: Hale Bales
> Priority: Major
>
> *This ticket tracks failures seen in NativeRedisAcceptanceTests due to 
> non-Geode code. It is closed because no work will be done in the Geode 
> project to fix this issue. If the issue becomes unbearable, a bug should be 
> opened with Redis: 
> [https://github.com/redis/redis/issues]*
> CI run is here: 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk11/builds/82#L60e11384:311]
> {code:java}
> org.apache.geode.redis.internal.executor.string.PSetEXNativeRedisAcceptanceTest > testPSetEX FAILED
> redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
>     at redis.clients.jedis.Protocol.processError(Protocol.java:125)
>     at redis.clients.jedis.Protocol.process(Protocol.java:169)
>     at redis.clients.jedis.Protocol.read(Protocol.java:223)
>     at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
>     at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:270)
>     at redis.clients.jedis.Jedis.psetex(Jedis.java:3616)
>     at redis.clients.jedis.JedisCluster$30.execute(JedisCluster.java:572)
>     at redis.clients.jedis.JedisCluster$30.execute(JedisCluster.java:569)
>     at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
>     at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
>     at redis.clients.jedis.JedisCluster.psetex(JedisCluster.java:574)
>     at org.apache.geode.redis.internal.executor.string.AbstractPSetEXIntegrationTest.testPSetEX(AbstractPSetEXIntegrationTest.java:54)
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:566)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:120)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> {code}

[jira] [Commented] (GEODE-8411) CI Failure: Jetty9CachingClientServerTest. containersShouldShareDataRemovals() fails with comparison failure

2021-11-08 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440595#comment-17440595
 ] 

Geode Integration commented on GEODE-8411:
--

Seen on support/1.13 in [distributed-test-openjdk8 #69|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-13-main/jobs/distributed-test-openjdk8/builds/69]
... see [test results|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0614/test-results/distributedTest/1636159901/]
or download [artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0614/test-artifacts/1636159901/distributedtestfiles-openjdk8-1.13.5-build.0614.tgz].

> CI Failure: Jetty9CachingClientServerTest. 
> containersShouldShareDataRemovals() fails with comparison failure
> 
>
> Key: GEODE-8411
> URL: https://issues.apache.org/jira/browse/GEODE-8411
> Project: Geode
>  Issue Type: Bug
>Reporter: Benjamin P Ross
>Assignee: Benjamin P Ross
>Priority: Major
>
> We saw Jetty9CachingClientServerTest fail with a comparison failure in a CI run.
> {code:java}
> org.apache.geode.session.tests.Jetty9CachingClientServerTest > containersShouldShareDataRemovals FAILED
> org.junit.ComparisonFailure: expected:<"[]"> but was:<"[Foo]">
> {code}


