[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9815:
--
Labels: GeodeOperationAPI needsTriage pull-request-available  (was: 
GeodeOperationAPI needsTriage)

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> --
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
> member A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones
> Steps
> 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
> and B1
> 2. Shutdown A1
> 3. Rebalance -  this will create Bucket 0 on C1
> 4. Startup A1 - this will recreate bucket 0
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different from preventing 
> removal of the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this (a rough sketch follows this list):
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
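
For illustration, here is a minimal, self-contained Java sketch of that direction. It is not the real PartitionRegionLoadModel API; the data shapes and helper names are hypothetical, and a bucket is modeled only by the redundancy zones hosting its copies.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed behavior, not Geode's actual load model.
class OverRedundancySketch {

  // True if any redundancy zone appears more than once among a bucket's copies.
  static boolean hasSameZoneDuplicate(List<String> copyZones) {
    Set<String> seen = new HashSet<>();
    for (String zone : copyZones) {
      if (!seen.add(zone)) {
        return true;
      }
    }
    return false;
  }

  // Step 1: report buckets that are actually over redundancy OR that have two
  // copies in the same zone (redundancy = number of extra copies required).
  static List<Integer> getOverRedundancyBuckets(Map<Integer, List<String>> copyZonesByBucket,
      int redundancy) {
    List<Integer> candidates = new ArrayList<>();
    for (Map.Entry<Integer, List<String>> entry : copyZonesByBucket.entrySet()) {
      List<String> copyZones = entry.getValue();
      if (copyZones.size() > redundancy + 1 || hasSameZoneDuplicate(copyZones)) {
        candidates.add(entry.getKey());
      }
    }
    return candidates;
  }

  // Step 2: prefer removing a copy whose zone already holds another copy of the
  // bucket; otherwise fall back to any copy (step 3 backs out the GEODE-9554
  // "never remove the last copy in a zone" restriction).
  static int findBestRemove(List<String> copyZones) {
    for (int i = 0; i < copyZones.size(); i++) {
      if (Collections.frequency(copyZones, copyZones.get(i)) > 1) {
        return i;
      }
    }
    return copyZones.size() - 1;
  }
}
{code}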



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-17 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445643#comment-17445643
 ] 

Mark Hanson commented on GEODE-8644:


 I have rerun this on a variety of cloud instances trying to reproduce it, and 
I have not been successful.

I think we may need to add more logging to the code so that when it does fail 
we have more detail.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2-second delay to allow the queues to finish draining after the put 
> operation completes. If the queues take longer than 2 seconds to drain, the 
> test fails. We should change the test to wait for the queues to be empty, with 
> a long timeout in case the queues never fully drain.
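
A possible shape for that change, using Awaitility (already used elsewhere in Geode's tests); the getSenderQueueSize() helper below is a hypothetical stand-in for however the test reads the serial gateway sender queue size:

{code:java}
import static org.assertj.core.api.Assertions.assertThat;
import static org.awaitility.Awaitility.await;

import java.util.concurrent.TimeUnit;

// Sketch only: replace the fixed 2-second sleep with a poll that waits until the
// queue is empty, failing only after a generous timeout.
class DrainAssertionSketch {

  static void waitForQueueToDrain() {
    await().atMost(5, TimeUnit.MINUTES)
        .untilAsserted(() -> assertThat(getSenderQueueSize()).isZero());
  }

  // Hypothetical helper; the real test would query the SerialGatewaySenderQueue.
  static int getSenderQueueSize() {
    return 0;
  }
}
{code}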



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9816) Implement Radish CLIENT command

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9816:
--
Labels: pull-request-available  (was: )

> Implement Radish CLIENT command
> ---
>
> Key: GEODE-9816
> URL: https://issues.apache.org/jira/browse/GEODE-9816
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: Jens Deppe
>Assignee: Kristen
>Priority: Major
>  Labels: pull-request-available
>
> It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
> command to exist. Using {{JedisCluster}} as:
> {noformat}
> JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
> redisServerPort),
> REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
> "client",
> new GenericObjectPoolConfig<>()); {noformat}
> Results in connection errors:
> {noformat}
> Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
> command `CLIENT`, with args beginning with: `setname`, `client`, 
>  {noformat}
> We will need to decide which subcommands to implement. At a minimum:
> - {{SETNAME}} (https://redis.io/commands/client-setname)
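
For reference, this sketch shows what the SETNAME subcommand looks like from a plain Jedis connection (host and port are placeholders); JedisCluster issues the same CLIENT SETNAME internally when a client name is supplied:

{code:java}
import redis.clients.jedis.Jedis;

// Sketch: exercising CLIENT SETNAME / CLIENT GETNAME with a plain Jedis client.
public class ClientSetnameExample {
  public static void main(String[] args) {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
      jedis.clientSetname("client");             // sends CLIENT SETNAME client
      System.out.println(jedis.clientGetname()); // sends CLIENT GETNAME
    }
  }
}
{code}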



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (GEODE-9449) remove 'b' prefix from constants

2021-11-17 Thread Kristen (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristen resolved GEODE-9449.

Fix Version/s: 1.15.0
   Resolution: Fixed

> remove 'b' prefix from constants
> 
>
> Key: GEODE-9449
> URL: https://issues.apache.org/jira/browse/GEODE-9449
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Darrel Schneider
>Assignee: Kristen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> A number of constants in the redis packages have a 'b' prefix on their names. 
> This might have been a use of Hungarian notation, but that is not clear. The 
> convention for constant names in Geode is all upper case with underscores 
> between words, so the 'b' prefix should be removed. See StringBytesGlossary 
> for the location of many of these constants. Most of the constants in 
> StringBytesGlossary contain one or more bytes, but a few of the 
> constants in it are actually String instances. Consider renaming them to have 
> a _STRING suffix or moving them to another class like StringGlossary. 
> The byte array constants in this class are marked with the MakeImmutable 
> annotation. 
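
A hypothetical before/after illustrating the rename convention (invented names, not the actual StringBytesGlossary entries):

{code:java}
// Invented example names; only the naming convention is the point here.
class RenameExample {
  // Before: 'b' prefix, and the name does not say whether the value is bytes or a String.
  //   static final byte[] bERROR_PREFIX = {'-', 'E', 'R', 'R'};
  //   static final String bERROR_MESSAGE = "ERR unknown command";

  // After: plain upper-case name for byte arrays, _STRING suffix for String constants.
  static final byte[] ERROR_PREFIX = {'-', 'E', 'R', 'R'};
  static final String ERROR_MESSAGE_STRING = "ERR unknown command";
}
{code}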



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9449) remove 'b' prefix from constants

2021-11-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445561#comment-17445561
 ] 

ASF subversion and git services commented on GEODE-9449:


Commit e832de0fddc8b3b9b4d8d9a6ad24eba87373dc1e in geode's branch 
refs/heads/develop from Kris10
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=e832de0 ]

GEODE-9449: Remove 'b' prefix from constants (#7118)

Removed 'b' prefix on all byte arrays and added "_STRING" to string constants.

Co-authored-by: Kristen Oduca 

> remove 'b' prefix from constants
> 
>
> Key: GEODE-9449
> URL: https://issues.apache.org/jira/browse/GEODE-9449
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Darrel Schneider
>Assignee: Kristen
>Priority: Minor
>  Labels: pull-request-available
>
> A number of constants in the redis packages have a 'b' prefix on their names. 
> This might have been a use of Hungarian notation, but that is not clear. The 
> convention for constant names in Geode is all upper case with underscores 
> between words, so the 'b' prefix should be removed. See StringBytesGlossary 
> for the location of many of these constants. Most of the constants in 
> StringBytesGlossary contain one or more bytes, but a few of the 
> constants in it are actually String instances. Consider renaming them to have 
> a _STRING suffix or moving them to another class like StringGlossary. 
> The byte array constants in this class are marked with the MakeImmutable 
> annotation. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-7389) CI failure: BENCHMARK FAILED: PartitionedFunctionExecutionWithFiltersBenchmark average latency is 5% worse than baseline

2021-11-17 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445560#comment-17445560
 ] 

Geode Integration commented on GEODE-7389:
--

Seen in [benchmark-base 
#13|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/13].

> CI failure: BENCHMARK FAILED: 
> PartitionedFunctionExecutionWithFiltersBenchmark average latency is 5% worse 
> than baseline
> 
>
> Key: GEODE-7389
> URL: https://issues.apache.org/jira/browse/GEODE-7389
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Reporter: Barrett Oglesby
>Priority: Major
>
> The PartitionedFunctionExecutionWithFiltersBenchmark in Benchmark build 643 
> failed:
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/Benchmark/builds/643]
> {noformat}
> org.apache.geode.benchmark.tests.PartitionedFunctionExecutionWithFiltersBenchmark
>   average ops/second             Baseline:   350652.64  Test:   318470.59  Difference:   -9.2%
>   ops/second standard error      Baseline:      399.41  Test:      408.97  Difference:   +2.4%
>   ops/second standard deviation  Baseline:     6906.46  Test:     7071.73  Difference:   +2.4%
>   YS 99th percentile latency     Baseline:     3006.00  Test:    20005.00  Difference: +565.5%
>   median latency                 Baseline:  1202175.00  Test:  1327103.00  Difference:  +10.4%
>   90th percentile latency        Baseline:  2510847.00  Test:  2826239.00  Difference:  +12.6%
>   99th percentile latency        Baseline: 11788287.00  Test: 12476415.00  Difference:   +5.8%
>   99.9th percentile latency      Baseline: 42631167.00  Test: 46989311.00  Difference:  +10.2%
>   average latency                Baseline:  1640650.66  Test:  1807133.87  Difference:  +10.1%
>   latency standard deviation     Baseline:  2881426.77  Test:  3148034.63  Difference:   +9.3%
>   latency standard error         Baseline:      281.02  Test:      322.21  Difference:  +14.7%
> BENCHMARK FAILED: org.apache.geode.benchmark.tests.PartitionedFunctionExecutionWithFiltersBenchmark average latency is 5% worse than baseline.
> {noformat}
> Please drop a link to any additional CI runs that have this failure and 
> restart the benchmarks.
>   



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9820:
--
Labels: GeodeOperationAPI pull-request-available  (was: GeodeOperationAPI)

> stopCQ does not trigger re-authentication
> -
>
> Key: GEODE-9820
> URL: https://issues.apache.org/jira/browse/GEODE-9820
> Project: Geode
>  Issue Type: Sub-task
>  Components: cq
>Affects Versions: 1.14.0
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
>
> After the credential expires, when a user executes a `stopCQ` operation, 
> re-authentication does not get triggered.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication

2021-11-17 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao updated GEODE-9820:
---
Labels: GeodeOperationAPI  (was: )

> stopCQ does not trigger re-authentication
> -
>
> Key: GEODE-9820
> URL: https://issues.apache.org/jira/browse/GEODE-9820
> Project: Geode
>  Issue Type: Sub-task
>  Components: cq
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI
>
> After the credential expires, when a user executes a `stopCQ` operation, 
> re-authentication does not get triggered.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication

2021-11-17 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao updated GEODE-9820:
---
Affects Version/s: 1.14.0

> stopCQ does not trigger re-authentication
> -
>
> Key: GEODE-9820
> URL: https://issues.apache.org/jira/browse/GEODE-9820
> Project: Geode
>  Issue Type: Sub-task
>  Components: cq
>Affects Versions: 1.14.0
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI
>
> After the credential expires, when a user executes a `stopCQ` operation, 
> re-authentication does not get triggered.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (GEODE-9820) stopCQ does not trigger re-authentication

2021-11-17 Thread Jinmei Liao (Jira)
Jinmei Liao created GEODE-9820:
--

 Summary: stopCQ does not trigger re-authentication
 Key: GEODE-9820
 URL: https://issues.apache.org/jira/browse/GEODE-9820
 Project: Geode
  Issue Type: Sub-task
  Components: cq
Reporter: Jinmei Liao


After the credential expires, when a user executes a `stopCQ` operation, 
re-authentication does not get triggered.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9820) stopCQ does not trigger re-authentication

2021-11-17 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao reassigned GEODE-9820:
--

Assignee: Jinmei Liao

> stopCQ does not trigger re-authentication
> -
>
> Key: GEODE-9820
> URL: https://issues.apache.org/jira/browse/GEODE-9820
> Project: Geode
>  Issue Type: Sub-task
>  Components: cq
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>
> After the credential expires, when a user executes a `stopCQ` operation, 
> re-authentication does not get triggered.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Reopened] (GEODE-9451) On demand authentication expiration and re-authentication

2021-11-17 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao reopened GEODE-9451:

  Assignee: Jinmei Liao

Re-opened to add more subtasks to it.

> On demand authentication expiration and re-authentication
> -
>
> Key: GEODE-9451
> URL: https://issues.apache.org/jira/browse/GEODE-9451
> Project: Geode
>  Issue Type: New Feature
>  Components: core, security
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> This is to implement the feature proposed in this RFC
> https://cwiki.apache.org/confluence/display/GEODE/On+Demand+Geode+Authentication+Expiration+and+Re-authentication



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9819) Client socket leak in CacheClientNotifier.registerClientInternal when error conditions occur for the durable client

2021-11-17 Thread Leon Finker (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leon Finker updated GEODE-9819:
---
Priority: Critical  (was: Major)

> Client socket leak in CacheClientNotifier.registerClientInternal when error 
> conditions occur for the durable client
> ---
>
> Key: GEODE-9819
> URL: https://issues.apache.org/jira/browse/GEODE-9819
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, core
>Affects Versions: 1.14.0
>Reporter: Leon Finker
>Priority: Critical
>
> In CacheClientNotifier.registerClientInternal, the client socket can be left half 
> open and not properly closed when error conditions occur, such as in this case:
> {code:java}
> } else {
>   // The existing proxy is already running (which means that another
>   // client is already using this durable id.
>   unsuccessfulMsg =
>   String.format(
>   "The requested durable client has the same identifier ( %s ) as an 
> existing durable client ( %s ). Duplicate durable clients are not allowed.",
>   clientProxyMembershipID.getDurableId(), cacheClientProxy);
>   logger.warn(unsuccessfulMsg);
>   // Set the unsuccessful response byte.
>   responseByte = Handshake.REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT;
> } {code}
> It considers the current client connect attempt to have failed. It writes 
> this response back to the client: REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT. This 
> will cause the client to throw ServerRefusedConnectionException. What seems 
> wrong about this method is that even though it sets "unsuccessfulMsg" and 
> correctly sends back a handshake saying the client is rejected, it does not 
> throw an exception and it does not close "socket". I think right before it 
> calls performPostAuthorization it should do the following:
> if (unsuccessfulMsg != null) {
>   try {
>     socket.close();
>   } catch (IOException ignore) {
>   }
> } else {
>   performPostAuthorization(...)
> }
> Full discussion details can be found at 
> https://markmail.org/thread/2gqmbq2m57pz7pxu
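
As a self-contained illustration of that proposed guard (the wiring and the performPostAuthorization stand-in below are hypothetical, not the actual CacheClientNotifier signatures):

{code:java}
import java.io.IOException;
import java.net.Socket;

// Sketch of the proposed guard: if registration was refused, close the client
// socket instead of leaking it; otherwise continue with post-authorization.
class RegisterClientSketch {

  static void finishRegistration(String unsuccessfulMsg, Socket socket) {
    if (unsuccessfulMsg != null) {
      try {
        socket.close();
      } catch (IOException ignore) {
        // Nothing useful to do if the close itself fails.
      }
    } else {
      performPostAuthorization();
    }
  }

  // Hypothetical stand-in for CacheClientNotifier.performPostAuthorization(...).
  static void performPostAuthorization() {}
}
{code}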



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (GEODE-9819) Client socket leak in CacheClientNotifier.registerClientInternal when error conditions occur for the durable client

2021-11-17 Thread Leon Finker (Jira)
Leon Finker created GEODE-9819:
--

 Summary: Client socket leak in 
CacheClientNotifier.registerClientInternal when error conditions occur for the 
durable client
 Key: GEODE-9819
 URL: https://issues.apache.org/jira/browse/GEODE-9819
 Project: Geode
  Issue Type: Bug
  Components: client/server, core
Affects Versions: 1.14.0
Reporter: Leon Finker


In CacheClientNotifier.registerClientInternal, the client socket can be left half 
open and not properly closed when error conditions occur, such as in this case:
{code:java}
} else {
  // The existing proxy is already running (which means that another
  // client is already using this durable id.
  unsuccessfulMsg =
  String.format(
  "The requested durable client has the same identifier ( %s ) as an 
existing durable client ( %s ). Duplicate durable clients are not allowed.",
  clientProxyMembershipID.getDurableId(), cacheClientProxy);
  logger.warn(unsuccessfulMsg);
  // Set the unsuccessful response byte.
  responseByte = Handshake.REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT;
} {code}

It considers the current client connect attempt to have failed. It writes this 
response back to the client: REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT. This will 
cause the client to throw ServerRefusedConnectionException. What seems wrong 
about this method is that even though it sets "unsuccessfulMsg" and correctly 
sends back a handshake saying the client is rejected, it does not throw an 
exception and it does not close "socket". I think right before it calls 
performPostAuthorization it should do the following:
if (unsuccessfulMsg != null) {
  try {
    socket.close();
  } catch (IOException ignore) {
  }
} else {
  performPostAuthorization(...)
}

Full discussion details can be found at 
https://markmail.org/thread/2gqmbq2m57pz7pxu




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException

2021-11-17 Thread Kamilla Aslami (Jira)
Kamilla Aslami created GEODE-9818:
-

 Summary: CI failure: 
RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed 
with RMIException
 Key: GEODE-9818
 URL: https://issues.apache.org/jira/browse/GEODE-9818
 Project: Geode
  Issue Type: Bug
  Components: client/server
Affects Versions: 1.13.5
Reporter: Kamilla Aslami


{noformat}
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > 
testRedundancySpecifiedNonPrimaryEPFails FAILED
    org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run
 in VM 2 running on Host 17763e768fb6 with 4 VMs
        at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
        at 
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258)
        Caused by:
        org.awaitility.core.ConditionTimeoutException: Assertion condition 
defined as a lambda expression in 
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest that 
uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier 
        Expecting:
         <0>
        to be greater than:
         <0>  within 5 minutes.
            at 
org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
            at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
            at 
org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
            at 
org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
            at 
org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
            at 
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504)
            Caused by:
            java.lang.AssertionError: 
            Expecting:
             <0>
            to be greater than:
             <0> 
                at 
org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505)
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException

2021-11-17 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445464#comment-17445464
 ] 

Geode Integration commented on GEODE-9818:
--

Seen on support/1.13 in [distributed-test-openjdk8 
#3|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-13-main/jobs/distributed-test-openjdk8/builds/3]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0618/test-results/distributedTest/1637149771/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0618/test-artifacts/1637149771/distributedtestfiles-openjdk8-1.13.5-build.0618.tgz].

> CI failure: 
> RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed 
> with RMIException
> ---
>
> Key: GEODE-9818
> URL: https://issues.apache.org/jira/browse/GEODE-9818
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.13.5
>Reporter: Kamilla Aslami
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > 
> testRedundancySpecifiedNonPrimaryEPFails FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run
>  in VM 2 running on Host 17763e768fb6 with 4 VMs
>         at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
>         at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258)
>         Caused by:
>         org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest 
> that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier 
>         Expecting:
>          <0>
>         to be greater than:
>          <0>  within 5 minutes.
>             at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
>             at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
>             at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
>             at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
>             at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
>             at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504)
>             Caused by:
>             java.lang.AssertionError: 
>             Expecting:
>              <0>
>             to be greater than:
>              <0> 
>                 at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException

2021-11-17 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9818:
-
Labels: needsTriage  (was: )

> CI failure: 
> RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed 
> with RMIException
> ---
>
> Key: GEODE-9818
> URL: https://issues.apache.org/jira/browse/GEODE-9818
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.13.5
>Reporter: Kamilla Aslami
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > 
> testRedundancySpecifiedNonPrimaryEPFails FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run
>  in VM 2 running on Host 17763e768fb6 with 4 VMs
>         at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
>         at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258)
>         Caused by:
>         org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest 
> that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier 
>         Expecting:
>          <0>
>         to be greater than:
>          <0>  within 5 minutes.
>             at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
>             at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
>             at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
>             at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
>             at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
>             at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504)
>             Caused by:
>             java.lang.AssertionError: 
>             Expecting:
>              <0>
>             to be greater than:
>              <0> 
>                 at 
> org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9816) Implement Radish CLIENT command

2021-11-17 Thread Kristen (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristen reassigned GEODE-9816:
--

Assignee: Kristen

> Implement Radish CLIENT command
> ---
>
> Key: GEODE-9816
> URL: https://issues.apache.org/jira/browse/GEODE-9816
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: Jens Deppe
>Assignee: Kristen
>Priority: Major
>
> It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
> command to exist. Using {{JedisCluster}} as:
> {noformat}
> JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
> redisServerPort),
> REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
> "client",
> new GenericObjectPoolConfig<>()); {noformat}
> Results in connection errors:
> {noformat}
> Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
> command `CLIENT`, with args beginning with: `setname`, `client`, 
>  {noformat}
> We will need to decide which subcommands to implement. At a minimum:
> - {{SETNAME}} (https://redis.io/commands/client-setname)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9817:
--
Labels: pull-request-available  (was: )

> Allow analyze serializables tests to provide custom source set paths to 
> ClassAnalysisRule
> -
>
> Key: GEODE-9817
> URL: https://issues.apache.org/jira/browse/GEODE-9817
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> In order to make SanctionedSerializablesService and the related tests 
> more pluggable by external modules, I need to make changes to allow analyze 
> serializables tests to provide custom source set paths to ClassAnalysisRule.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2021-11-17 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-9815:
-
Summary: Recovering persistent members can result in extra copies of a 
bucket or two copies in the same redundancy zone  (was: Recovering persistent 
members can result in extra copies of a bucket or two copies int the same 
redundancy zone)

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> --
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
> member A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones
> Steps
> 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
> and B1
> 2. Shutdown A1
> 3. Rebalance -  this will create Bucket 0 on C1
> 4. Startup A1 - this will recreate bucket 0
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different from preventing 
> removal of the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9638) CI failure: DeployedJarTest getDeployedFileName failed on Windows intermittently

2021-11-17 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade reassigned GEODE-9638:


Assignee: Dale Emery

> CI failure: DeployedJarTest  getDeployedFileName failed on Windows 
> intermittently
> -
>
> Key: GEODE-9638
> URL: https://issues.apache.org/jira/browse/GEODE-9638
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Darrel Schneider
>Assignee: Dale Emery
>Priority: Major
>  Labels: GeodeOperationAPI, flaky, needsTriage
>
> org.apache.geode.deployment.internal.DeployedJarTest > getDeployedFileName 
> FAILED
> java.nio.file.DirectoryNotEmptyException: 
> C:\Users\geode\AppData\Local\Temp\javaCompiler2976436474406314797\classes
> at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:266)
> at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> at java.nio.file.Files.delete(Files.java:1126)
> at org.apache.commons.io.FileUtils.delete(FileUtils.java:1175)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1194)
> at 
> org.apache.geode.test.compiler.JavaCompiler.compile(JavaCompiler.java:91)
> at 
> org.apache.geode.test.compiler.JarBuilder.buildJarFromClassNames(JarBuilder.java:83)
> at 
> org.apache.geode.deployment.internal.DeployedJarTest.createJarFile(DeployedJarTest.java:82)
> at 
> org.apache.geode.deployment.internal.DeployedJarTest.getDeployedFileName(DeployedJarTest.java:65)
> see: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/windows-unit-test-openjdk8/builds/206



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone

2021-11-17 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9815:
--

Assignee: Mark Hanson

> Recovering persistent members can result in extra copies of a bucket or two 
> copies int the same redundancy zone
> ---
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
> member A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones
> Steps
> 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
> and B1
> 2. Shutdown A1
> 3. Rebalance -  this will create Bucket 0 on C1
> 4. Startup A1 - this will recreate bucket 0
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different from preventing 
> removal of the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9810) CI: NativeRedisClusterTest testEachProxyReturnsExposedPorts failed

2021-11-17 Thread Wayne (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wayne updated GEODE-9810:
-
Description: 
 
{code:java}
> Task :geode-for-redis:acceptanceTest

NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED
java.lang.AssertionError: 
Expecting actual:
  [44073, 45679, 36065, 40077, 42137]
to contain exactly in any order:
  [40077, 45679, 33425, 36065, 42137, 44073]
but could not find the following elements:
  [33425]
at 
org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48)

1385 tests completed, 1 failed, 2 skipped

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Test report artifacts from this job are available at:

http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz
{code}

  was:
https://hydradb.hdb.gemfire-ci.info/hdb/testresult/12258442

{code:java}
> Task :geode-for-redis:acceptanceTest

NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED
java.lang.AssertionError: 
Expecting actual:
  [44073, 45679, 36065, 40077, 42137]
to contain exactly in any order:
  [40077, 45679, 33425, 36065, 42137, 44073]
but could not find the following elements:
  [33425]
at 
org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48)

1385 tests completed, 1 failed, 2 skipped

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Test report artifacts from this job are available at:

http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz
{code}



> CI: NativeRedisClusterTest testEachProxyReturnsExposedPorts failed
> --
>
> Key: GEODE-9810
> URL: https://issues.apache.org/jira/browse/GEODE-9810
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
>  
> {code:java}
> > Task :geode-for-redis:acceptanceTest
> NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED
> java.lang.AssertionError: 
> Expecting actual:
>   [44073, 45679, 36065, 40077, 42137]
> to contain exactly in any order:
>   [40077, 45679, 33425, 36065, 42137, 44073]
> but could not find the following elements:
>   [33425]
> at 
> org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48)
> 1385 tests completed, 1 failed, 2 skipped
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule

2021-11-17 Thread Kirk Lund (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk Lund reassigned GEODE-9817:


Assignee: Kirk Lund

> Allow analyze serializables tests to provide custom source set paths to 
> ClassAnalysisRule
> -
>
> Key: GEODE-9817
> URL: https://issues.apache.org/jira/browse/GEODE-9817
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>
> In order to make SanctionedSerializablesService and the related tests 
> more pluggable by external modules, I need to make changes to allow analyze 
> serializables tests to provide custom source set paths to ClassAnalysisRule.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule

2021-11-17 Thread Kirk Lund (Jira)
Kirk Lund created GEODE-9817:


 Summary: Allow analyze serializables tests to provide custom 
source set paths to ClassAnalysisRule
 Key: GEODE-9817
 URL: https://issues.apache.org/jira/browse/GEODE-9817
 Project: Geode
  Issue Type: Wish
  Components: tests
Reporter: Kirk Lund


In order to make SanctionedSerializablesService and the related tests 
more pluggable by external modules, I need to make changes to allow analyze 
serializables tests to provide custom source set paths to ClassAnalysisRule.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone

2021-11-17 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-9815:
-
Description: 
The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
new issue when removing buckets that are over redundancy.

GEODE-9554 and these new issues are all related to using redundancy zones and 
having persistent members.

With persistence, when we start up a member with persisted buckets, we always 
recover the persisted buckets on startup, regardless of whether redundancy is 
already met or what zone the existing buckets are on. This is necessary to 
ensure that we can recover all colocated buckets that might be persisted on the 
member.

Because recovering these persistent buckets may cause us to go over redundancy, 
after we recover from disk, we run a "restore redundancy" task that actually 
removes copies of buckets that are over redundancy.

GEODE-9554 addressed one case where we end up removing the last copy of a 
bucket from one redundancy zone while leaving two copies in another redundancy 
zone. It did so by disallowing the removal of a bucket if it is the last copy 
in a redundancy zone.

There are a couple of issues with this approach.

*Problem 1:* We may end up with two copies of the bucket in one zone in some 
cases

With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
getting out of the situation where we have two copies of a bucket in the same 
zone.

Steps:
1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
member A1 and B1.
2. Shutdown member A1.
3. Rebalance - this will create bucket 0 on A2.
4. Shutdown B1. Revoke its disk store and delete the data.
5. Startup A1 - it will recover bucket 0.
6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
situation.

*Problem 2:* We may never delete extra copies of a bucket
The fix for GEODE-9554 introduces a new problem if we have more than 2 
redundancy zones

Steps
1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
and B1
2. Shutdown A1
3. Rebalance -  this will create Bucket 0 on C1
4. Startup A1 - this will recreate bucket 0
5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.

I think the overall fix is probably to do something different from preventing 
removal of the last copy of a bucket from a redundancy zone. Instead, I think we 
should do something like this:
1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
buckets that have two copies in the same zone, as well as any buckets that are 
actually over redundancy.
2. Change PartitionRegionLoadModel.findBestRemove to always remove extra copies 
of a bucket in the same zone first
3. Back out the changes for GEODE-9554 and let the last copy be deleted from a 
zone.

  was:
The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
new issue when removing buckets that are over redundancy.

GEODE-9554 and these new issues are all related to using redundancy zones and 
having persistent members.

With persistence, when we start up a member with persisted buckets, we always 
recover the persisted buckets on startup, regardless of whether redundancy is 
already met or what zone the existing buckets are on. This is necessary to 
ensure that we can recover all colocated buckets that might be persisted on the 
member.

Because recovering these persistent buckets may cause us to go over redundancy, 
after we recover from disk, we run a "restore redundancy" task that actually 
removes copies of buckets that are over redundancy.

GEODE-9554 addressed one case where we end up removing the last copy of a 
bucket from one redundancy zone while leaving two copies in another redundancy 
zone. It did so by disallowing the removal of a bucket if it is the last copy 
in a redundancy zone.

There are a couple of issues with this approach. 

*Problem 1:* We may end up with two copies of the bucket in one zone in some 
cases

With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
getting out of the situation where we have two copies of a bucket in the same 
zone.

Steps:
1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
member A1 and B1.
2. Shutdown member A1.
3. Rebalance - this will create bucket 0 on A2.
4. Shutdown B1. Revoke its disk store and delete the data.
5. Startup A1 - it will recover bucket 0.
6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
situation.

*Problem 2:* We may never delete extra copies of a bucket
The fix for GEODE-9554 introduces a new problem if we have more than 2 
redundancy zones

Steps
1. Start three redundancy zones A,B,C with two members each. Bucket 0 is on A1 
and B1>
2. Shutdown A1
3. Rebalance -  this will create Bucket 0 on C1
4. Startup A1 - this will recreate bucket 0
5. Now we have bucket 0 on A1, B1, 

[jira] [Updated] (GEODE-9816) Implement Radish CLIENT command

2021-11-17 Thread Jens Deppe (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jens Deppe updated GEODE-9816:
--
Description: 
It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
command to exist. Using {{JedisCluster}} as:
{noformat}
JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
redisServerPort),
REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
"client",
new GenericObjectPoolConfig<>()); {noformat}
Results in connection errors:
{noformat}
Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
command `CLIENT`, with args beginning with: `setname`, `client`, 
 {noformat}
We will need to decide which subcommands to implement. At a minimum:

- {{SETNAME}} (https://redis.io/commands/client-setname)

  was:
It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
command to exist. Using {{JedisCluster}} as:
{noformat}
JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
redisServerPort),
REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
"client",
new GenericObjectPoolConfig<>()); {noformat}
Results in connection errors:
{noformat}
Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
command `CLIENT`, with args beginning with: `setname`, `client`, 
 {noformat}


> Implement Radish CLIENT command
> ---
>
> Key: GEODE-9816
> URL: https://issues.apache.org/jira/browse/GEODE-9816
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>
> It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
> command to exist. Using {{JedisCluster}} as:
> {noformat}
> JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
> redisServerPort),
> REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
> "client",
> new GenericObjectPoolConfig<>()); {noformat}
> Results in connection errors:
> {noformat}
> Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
> command `CLIENT`, with args beginning with: `setname`, `client`, 
>  {noformat}
> We will need to decide which subcommands to implement. At a minimum:
> - {{SETNAME}} (https://redis.io/commands/client-setname)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (GEODE-9449) remove 'b' prefix from constants

2021-11-17 Thread Kristen (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristen reassigned GEODE-9449:
--

Assignee: Kristen

> remove 'b' prefix from constants
> 
>
> Key: GEODE-9449
> URL: https://issues.apache.org/jira/browse/GEODE-9449
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Darrel Schneider
>Assignee: Kristen
>Priority: Minor
>  Labels: pull-request-available
>
> A number of constants in the redis packages have a 'b' prefix on their names. 
> This might have been a use of Hungarian notation, but that is not clear. The 
> convention for constant names in Geode is all upper case with underscores 
> between words, so the 'b' prefix should be removed. See StringBytesGlossary 
> for the location of many of these constants. Most of the constants in 
> StringBytesGlossary contain one or more bytes, but a few of the 
> constants in it are actually String instances. Consider renaming them to have 
> a _STRING suffix or moving them to another class like StringGlossary. 
> The byte array constants in this class are marked with the MakeImmutable 
> annotation. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (GEODE-9816) Implement Radish CLIENT command

2021-11-17 Thread Jens Deppe (Jira)
Jens Deppe created GEODE-9816:
-

 Summary: Implement Radish CLIENT command
 Key: GEODE-9816
 URL: https://issues.apache.org/jira/browse/GEODE-9816
 Project: Geode
  Issue Type: New Feature
  Components: redis
Reporter: Jens Deppe


It appears that using {{JedisCluster}} with security requires the {{CLIENT}} 
command to exist. Using {{JedisCluster}} as:
{noformat}
JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, 
redisServerPort),
REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", 
"client",
new GenericObjectPoolConfig<>()); {noformat}
Results in connection errors:
{noformat}
Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown 
command `CLIENT`, with args beginning with: `setname`, `client`, 
 {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9653) ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > serverVersioningTest[version=1.9.1] FAILED

2021-11-17 Thread Anilkumar Gingade (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445302#comment-17445302
 ] 

Anilkumar Gingade commented on GEODE-9653:
--

This is a test-related issue with the test Rules; it should not be a blocker for 
1.15. Removing the needsTriage label.
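
If the suspect string is in fact expected for the old-version client handshake, one possible fix is to register it as an ignored exception in the test. A minimal sketch, assuming a plain JUnit test (the class name and test wiring here are illustrative, not the actual DUnit test):
{noformat}
// Illustrative only; not the actual ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest.
import org.apache.geode.test.dunit.IgnoredException;
import org.junit.Test;

public class SuspectStringExample {
  @Test
  public void oldVersionHandshakeNoiseIsIgnored() {
    // Register the pattern so the suspect-string check at the end of the run
    // does not fail when the locator logs this expected exception.
    IgnoredException.addIgnoredException("Improperly configured client detected");
    // ... exercise the version=1.9.1 client/server scenario here ...
  }
}
{noformat}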

> ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > 
> serverVersioningTest[version=1.9.1] FAILED
> ---
>
> Key: GEODE-9653
> URL: https://issues.apache.org/jira/browse/GEODE-9653
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.15.0
>Reporter: Kamilla Aslami
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> org.apache.geode.test.dunit.rules.tests.ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest
>  > serverVersioningTest[version=1.9.1] FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 350[fatal 
> 2021/09/28 00:08:38.644 UTC  tid=47] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 357[fatal 
> 2021/09/28 00:08:38.657 UTC  tid=47] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 364[fatal 
> 2021/09/28 00:08:38.657 UTC  tid=48] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> at org.junit.Assert.fail(Assert.java:89)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at 

[jira] [Updated] (GEODE-9653) ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > serverVersioningTest[version=1.9.1] FAILED

2021-11-17 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade updated GEODE-9653:
-
Labels:   (was: needsTriage)

> ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > 
> serverVersioningTest[version=1.9.1] FAILED
> ---
>
> Key: GEODE-9653
> URL: https://issues.apache.org/jira/browse/GEODE-9653
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.15.0
>Reporter: Kamilla Aslami
>Priority: Major
>
> {noformat}
> org.apache.geode.test.dunit.rules.tests.ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest
>  > serverVersioningTest[version=1.9.1] FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 350[fatal 
> 2021/09/28 00:08:38.644 UTC  tid=47] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 357[fatal 
> 2021/09/28 00:08:38.657 UTC  tid=47] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 364[fatal 
> 2021/09/28 00:08:38.657 UTC  tid=48] Exception in 
> processing request from 10.0.0.30
> java.lang.Exception: Improperly configured client detected - use 
> addPoolLocator to configure its locators instead of addPoolServer.
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> at org.junit.Assert.fail(Assert.java:89)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at org.junit.runners.Suite.runChild(Suite.java:128)
> at 

[jira] [Updated] (GEODE-9638) CI failure: DeployedJarTest getDeployedFileName failed on Windows intermittently

2021-11-17 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade updated GEODE-9638:
-
Labels: GeodeOperationAPI flaky needsTriage  (was: flaky needsTriage)

> CI failure: DeployedJarTest getDeployedFileName failed on Windows 
> intermittently
> -
>
> Key: GEODE-9638
> URL: https://issues.apache.org/jira/browse/GEODE-9638
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, flaky, needsTriage
>
> org.apache.geode.deployment.internal.DeployedJarTest > getDeployedFileName 
> FAILED
> java.nio.file.DirectoryNotEmptyException: 
> C:\Users\geode\AppData\Local\Temp\javaCompiler2976436474406314797\classes
> at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:266)
> at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> at java.nio.file.Files.delete(Files.java:1126)
> at org.apache.commons.io.FileUtils.delete(FileUtils.java:1175)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1194)
> at 
> org.apache.geode.test.compiler.JavaCompiler.compile(JavaCompiler.java:91)
> at 
> org.apache.geode.test.compiler.JarBuilder.buildJarFromClassNames(JarBuilder.java:83)
> at 
> org.apache.geode.deployment.internal.DeployedJarTest.createJarFile(DeployedJarTest.java:82)
> at 
> org.apache.geode.deployment.internal.DeployedJarTest.getDeployedFileName(DeployedJarTest.java:65)
> see: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/windows-unit-test-openjdk8/builds/206
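
One hedged possibility, if the failure is Windows briefly holding handles to the compiled class files, is to retry the temp-directory cleanup. A rough sketch under that assumption, not the actual JavaCompiler change:
{noformat}
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;

final class RetryingCleanup {
  // Retries FileUtils.deleteDirectory a few times, since Windows can report
  // DirectoryNotEmptyException while file handles are still being released.
  static void deleteDirectoryWithRetries(File dir, int attempts) throws IOException {
    IOException last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        FileUtils.deleteDirectory(dir);
        return;
      } catch (IOException e) {
        last = e;
        try {
          Thread.sleep(100); // give the OS a moment to release handles
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    if (last != null) {
      throw last;
    }
  }
}
{noformat}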



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2021-11-17 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade updated GEODE-9815:
-
Labels: GeodeOperationAPI needsTriage  (was: needsTriage)

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> ---
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
> member A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, C with two members each. Bucket 0 is on 
> A1 and B1.
> 2. Shutdown A1.
> 3. Rebalance - this will create bucket 0 on C1.
> 4. Startup A1 - this will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different than prevent 
> removing the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy (see the sketch after this list).
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first.
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
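
A rough, self-contained sketch of the zone-aware check proposed in item 1; the simple maps here (bucket id to member-to-zone assignments) stand in for the load model's internal structures and are purely illustrative:
{noformat}
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ZoneRedundancyCheck {
  // Returns bucket ids that either have more copies than desired or have two
  // or more copies placed in the same redundancy zone.
  static Set<Integer> bucketsNeedingRemoval(
      Map<Integer, Map<String, String>> bucketToMemberZone, int desiredCopies) {
    Set<Integer> result = new HashSet<>();
    for (Map.Entry<Integer, Map<String, String>> entry : bucketToMemberZone.entrySet()) {
      Map<String, String> memberToZone = entry.getValue();
      boolean overRedundancy = memberToZone.size() > desiredCopies;
      boolean duplicateZone =
          new HashSet<>(memberToZone.values()).size() < memberToZone.size();
      if (overRedundancy || duplicateZone) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
{noformat}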



--
This message was sent by Atlassian Jira
(v8.20.1#820001)