[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9815: -- Labels: GeodeOperationAPI needsTriage pull-request-available (was: GeodeOperationAPI needsTriage) > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. Start two redundancy zones A and B with two members each. Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke its disk store and delete the data. > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones. > Steps: > 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 > and B1. > 2. Shutdown A1. > 3. Rebalance - this will create Bucket 0 on C1. > 4. Startup A1 - this will recreate bucket 0. > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first. > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone. -- This message was sent by Atlassian Jira (v8.20.1#820001)
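The detection rule proposed in step 1 above can be sketched in plain Java: flag a bucket when its copy count exceeds redundancy + 1, or when any single redundancy zone hosts more than one copy. This is a minimal illustration under assumed data shapes, not the actual PartitionRegionLoadModel API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the proposed getOverRedundancyBuckets rule.
class OverRedundancyCheck {
  // memberZones maps each member hosting a copy of the bucket to its
  // redundancy zone, e.g. {"A1" -> "A", "A2" -> "A"}.
  static boolean needsRemoval(Map<String, String> memberZones, int redundancy) {
    int copies = memberZones.size();
    if (copies > redundancy + 1) {
      return true; // bucket is actually over redundancy
    }
    // Count copies per zone; two copies in one zone also qualifies,
    // even when the total copy count looks correct.
    Map<String, Integer> perZone = new HashMap<>();
    for (String zone : memberZones.values()) {
      if (perZone.merge(zone, 1, Integer::sum) > 1) {
        return true;
      }
    }
    return false;
  }
}
```

With redundancy 1, the Problem 1 end state (bucket 0 on A1 and A2, both in zone A) would be flagged even though only two copies exist.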
[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly
[ https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445643#comment-17445643 ] Mark Hanson commented on GEODE-8644: I have rerun this test on a variety of cloud instances trying to reproduce the failure, and I have not been successful. I think we may need to add more logging to the code so that when it does fail we have more detail. > SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() > intermittently fails when queues drain too slowly > --- > > Key: GEODE-8644 > URL: https://issues.apache.org/jira/browse/GEODE-8644 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Benjamin P Ross >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > Currently the test > SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() > relies on a 2 second delay to allow the queues to finish draining after > the put operation completes. If the queues take longer than 2 seconds to drain, > the test fails. We should change the test to wait for the queues to be > empty with a long timeout in case the queues never fully drain. -- This message was sent by Atlassian Jira (v8.20.1#820001)
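The suggested fix (wait for the queues to empty under a generous timeout, rather than a fixed 2-second sleep) can be sketched as a bounded poll loop. The real test would more likely use Awaitility's await().untilAsserted(...); the helper below, its poll interval, and the queue-size supplier are illustrative assumptions.

```java
import java.util.function.IntSupplier;

// Sketch of waiting for a queue to drain with a deadline instead of a
// fixed sleep: returns true as soon as the queue is observed empty,
// false only if the timeout elapses first.
class DrainWait {
  static boolean awaitEmpty(IntSupplier queueSize, long timeoutMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (queueSize.getAsInt() != 0) {
      if (System.currentTimeMillis() >= deadline) {
        return false; // queue never fully drained within the timeout
      }
      try {
        Thread.sleep(50); // poll interval
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

A slow-but-successful drain then passes as soon as the queue empties, while a hung drain still fails deterministically at the timeout.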
[jira] [Updated] (GEODE-9816) Implement Radish CLIENT command
[ https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9816: -- Labels: pull-request-available (was: ) > Implement Radish CLIENT command > --- > > Key: GEODE-9816 > URL: https://issues.apache.org/jira/browse/GEODE-9816 > Project: Geode > Issue Type: New Feature > Components: redis >Reporter: Jens Deppe >Assignee: Kristen >Priority: Major > Labels: pull-request-available > > It appears that using {{JedisCluster}} with security requires the {{CLIENT}} > command to exist. Using {{JedisCluster}} as: > {noformat} > JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, > redisServerPort), > REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", > "client", > new GenericObjectPoolConfig<>()); {noformat} > Results in connection errors: > {noformat} > Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown > command `CLIENT`, with args beginning with: `setname`, `client`, > {noformat} > We will need to decide which subcommands to implement. At a minimum: > - {{SETNAME}} (https://redis.io/commands/client-setname) -- This message was sent by Atlassian Jira (v8.20.1#820001)
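A minimal sketch of what dispatching the CLIENT subcommands might look like. Only SETNAME and GETNAME are modeled, and the class and method names are hypothetical, not Geode's actual Radish command-executor API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-connection handler for a subset of CLIENT subcommands.
// Jedis sends CLIENT SETNAME during connect, which is why an unknown-command
// error breaks JedisCluster with security enabled.
class ClientCommand {
  private final Map<String, String> state = new HashMap<>();

  String execute(String subcommand, String... args) {
    switch (subcommand.toUpperCase()) {
      case "SETNAME":
        state.put("name", args[0]); // associate a name with this connection
        return "OK";
      case "GETNAME":
        return state.getOrDefault("name", "");
      default:
        return "ERR Unknown CLIENT subcommand: " + subcommand;
    }
  }
}
```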
[jira] [Resolved] (GEODE-9449) remove 'b' prefix from constants
[ https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristen resolved GEODE-9449. Fix Version/s: 1.15.0 Resolution: Fixed > remove 'b' prefix from constants > > > Key: GEODE-9449 > URL: https://issues.apache.org/jira/browse/GEODE-9449 > Project: Geode > Issue Type: Improvement > Components: redis >Affects Versions: 1.15.0 >Reporter: Darrel Schneider >Assignee: Kristen >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > > A number of constants in the redis packages have a 'b' prefix on their name. > This might have been a use of Hungarian notation, but that is not clear. The > convention for constant names in Geode is all upper case with underscores > between words, so the 'b' prefix should be removed. See StringBytesGlossary > for the location of many of these constants. Most of the constants in > StringBytesGlossary contain one or more bytes, but a few of the > constants in it are actually String instances. Consider renaming them to have > a _STRING suffix or moving them to another class like StringGlossary. > The byte array constants in this class are marked with the MakeImmutable > annotation. -- This message was sent by Atlassian Jira (v8.20.1#820001)
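The naming convention described above, applied to hypothetical constants (the names below are illustrative examples, not actual StringBytesGlossary fields):

```java
import java.nio.charset.StandardCharsets;

// Before: private static final byte[] bError = ...;  (Hungarian-style 'b' prefix)
// After: all upper case with underscores, no prefix; String constants get a
// _STRING suffix to distinguish them from the byte-array constants.
class GlossaryExample {
  static final String ERROR_STRING = "ERR";
  static final byte[] ERROR_BYTES = ERROR_STRING.getBytes(StandardCharsets.UTF_8);
}
```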
[jira] [Commented] (GEODE-9449) remove 'b' prefix from constants
[ https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445561#comment-17445561 ] ASF subversion and git services commented on GEODE-9449: Commit e832de0fddc8b3b9b4d8d9a6ad24eba87373dc1e in geode's branch refs/heads/develop from Kris10 [ https://gitbox.apache.org/repos/asf?p=geode.git;h=e832de0 ] GEODE-9449: Remove 'b' prefix from constants (#7118) Removed 'b' prefix on all byte arrays and added "_STRING" to string constants. Co-authored-by: Kristen Oduca > remove 'b' prefix from constants > > > Key: GEODE-9449 > URL: https://issues.apache.org/jira/browse/GEODE-9449 > Project: Geode > Issue Type: Improvement > Components: redis >Affects Versions: 1.15.0 >Reporter: Darrel Schneider >Assignee: Kristen >Priority: Minor > Labels: pull-request-available > > A number of constants in the redis packages have a 'b' prefix on their name. > This might have been a use of Hungarian notation, but that is not clear. The > convention for constant names in Geode is all upper case with underscores > between words, so the 'b' prefix should be removed. See StringBytesGlossary > for the location of many of these constants. Most of the constants in > StringBytesGlossary contain one or more bytes, but a few of the > constants in it are actually String instances. Consider renaming them to have > a _STRING suffix or moving them to another class like StringGlossary. > The byte array constants in this class are marked with the MakeImmutable > annotation. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-7389) CI failure: BENCHMARK FAILED: PartitionedFunctionExecutionWithFiltersBenchmark average latency is 5% worse than baseline
[ https://issues.apache.org/jira/browse/GEODE-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445560#comment-17445560 ] Geode Integration commented on GEODE-7389: -- Seen in [benchmark-base #13|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/13]. > CI failure: BENCHMARK FAILED: > PartitionedFunctionExecutionWithFiltersBenchmark average latency is 5% worse > than baseline > > > Key: GEODE-7389 > URL: https://issues.apache.org/jira/browse/GEODE-7389 > Project: Geode > Issue Type: Bug > Components: benchmarks >Reporter: Barrett Oglesby >Priority: Major > > The PartitionedFunctionExecutionWithFiltersBenchmark in Benchmark build 643 > failed: > [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/Benchmark/builds/643] > {noformat} > org.apache.geode.benchmark.tests.PartitionedFunctionExecutionWithFiltersBenchmark > average ops/second Baseline:350652.64 Test:318470.59 > Difference: -9.2% >ops/second standard error Baseline: 399.41 Test: 408.97 > Difference: +2.4% >ops/second standard deviation Baseline: 6906.46 Test: 7071.73 > Difference: +2.4% > YS 99th percentile latency Baseline: 3006.00 Test: 20005.00 > Difference: +565.5% > median latency Baseline: 1202175.00 Test: 1327103.00 > Difference: +10.4% > 90th percentile latency Baseline: 2510847.00 Test: 2826239.00 > Difference: +12.6% > 99th percentile latency Baseline: 11788287.00 Test: 12476415.00 > Difference: +5.8% >99.9th percentile latency Baseline: 42631167.00 Test: 46989311.00 > Difference: +10.2% > average latency Baseline: 1640650.66 Test: 1807133.87 > Difference: +10.1% > latency standard deviation Baseline: 2881426.77 Test: 3148034.63 > Difference: +9.3% > latency standard error Baseline: 281.02 Test: 322.21 > Difference: +14.7% > BENCHMARK FAILED: > org.apache.geode.benchmark.tests.PartitionedFunctionExecutionWithFiltersBenchmark > average latency is 5% worse than baseline. 
> {noformat} > Please drop a link to any additional CI runs that have this failure and > restart the benchmarks. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication
[ https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9820: -- Labels: GeodeOperationAPI pull-request-available (was: GeodeOperationAPI) > stopCQ does not trigger re-authentication > - > > Key: GEODE-9820 > URL: https://issues.apache.org/jira/browse/GEODE-9820 > Project: Geode > Issue Type: Sub-task > Components: cq >Affects Versions: 1.14.0 >Reporter: Jinmei Liao >Assignee: Jinmei Liao >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > After the credential expires, when the user executes a `stopCQ` operation, > re-authentication does not get triggered. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication
[ https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinmei Liao updated GEODE-9820: --- Labels: GeodeOperationAPI (was: ) > stopCQ does not trigger re-authentication > - > > Key: GEODE-9820 > URL: https://issues.apache.org/jira/browse/GEODE-9820 > Project: Geode > Issue Type: Sub-task > Components: cq >Reporter: Jinmei Liao >Assignee: Jinmei Liao >Priority: Major > Labels: GeodeOperationAPI > > After the credential expires, when the user executes a `stopCQ` operation, > re-authentication does not get triggered. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9820) stopCQ does not trigger re-authentication
[ https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinmei Liao updated GEODE-9820: --- Affects Version/s: 1.14.0 > stopCQ does not trigger re-authentication > - > > Key: GEODE-9820 > URL: https://issues.apache.org/jira/browse/GEODE-9820 > Project: Geode > Issue Type: Sub-task > Components: cq >Affects Versions: 1.14.0 >Reporter: Jinmei Liao >Assignee: Jinmei Liao >Priority: Major > Labels: GeodeOperationAPI > > After the credential expires, when the user executes a `stopCQ` operation, > re-authentication does not get triggered. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-9820) stopCQ does not trigger re-authentication
Jinmei Liao created GEODE-9820: -- Summary: stopCQ does not trigger re-authentication Key: GEODE-9820 URL: https://issues.apache.org/jira/browse/GEODE-9820 Project: Geode Issue Type: Sub-task Components: cq Reporter: Jinmei Liao After the credential expires, when the user executes a `stopCQ` operation, re-authentication does not get triggered. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9820) stopCQ does not trigger re-authentication
[ https://issues.apache.org/jira/browse/GEODE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinmei Liao reassigned GEODE-9820: -- Assignee: Jinmei Liao > stopCQ does not trigger re-authentication > - > > Key: GEODE-9820 > URL: https://issues.apache.org/jira/browse/GEODE-9820 > Project: Geode > Issue Type: Sub-task > Components: cq >Reporter: Jinmei Liao >Assignee: Jinmei Liao >Priority: Major > > After the credential expires, when the user executes a `stopCQ` operation, > re-authentication does not get triggered. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Reopened] (GEODE-9451) On demand authentication expiration and re-authentication
[ https://issues.apache.org/jira/browse/GEODE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinmei Liao reopened GEODE-9451: Assignee: Jinmei Liao re-open to add more subtask to it > On demand authentication expiration and re-authentication > - > > Key: GEODE-9451 > URL: https://issues.apache.org/jira/browse/GEODE-9451 > Project: Geode > Issue Type: New Feature > Components: core, security >Reporter: Jinmei Liao >Assignee: Jinmei Liao >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > Fix For: 1.15.0 > > > This is to implement the feature proposed in this RFC > https://cwiki.apache.org/confluence/display/GEODE/On+Demand+Geode+Authentication+Expiration+and+Re-authentication -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9819) Client socket leak in CacheClientNotifier.registerClientInternal when error conditions occur for the durable client
[ https://issues.apache.org/jira/browse/GEODE-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leon Finker updated GEODE-9819: --- Priority: Critical (was: Major) > Client socket leak in CacheClientNotifier.registerClientInternal when error > conditions occur for the durable client > --- > > Key: GEODE-9819 > URL: https://issues.apache.org/jira/browse/GEODE-9819 > Project: Geode > Issue Type: Bug > Components: client/server, core >Affects Versions: 1.14.0 >Reporter: Leon Finker >Priority: Critical > > In CacheClientNotifier.registerClientInternal the client socket can be left half > open and not properly closed when error conditions occur, such as in this case: > {code:java} > } else { > // The existing proxy is already running (which means that another > // client is already using this durable id. > unsuccessfulMsg = > String.format( > "The requested durable client has the same identifier ( %s ) as an > existing durable client ( %s ). Duplicate durable clients are not allowed.", > clientProxyMembershipID.getDurableId(), cacheClientProxy); > logger.warn(unsuccessfulMsg); > // Set the unsuccessful response byte. > responseByte = Handshake.REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT; > } {code} > It considers the current client connect attempt to have failed. It writes > this response back to client: REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT. This > will cause the client to throw ServerRefusedConnectionException. What seems > wrong about this method is that even though it sets "unsuccessfulMsg" and > correctly sends back a handshake saying the client is rejected, it does not > throw an exception and it does not close "socket". I think right before it > calls performPostAuthorization it should do the following: > if (unsuccessfulMsg != null) { > try { > socket.close(); > } catch (IOException ignore) { > } > } else { > performPostAuthorization(...) 
> } > Full discussion details can be found at > https://markmail.org/thread/2gqmbq2m57pz7pxu -- This message was sent by Atlassian Jira (v8.20.1#820001)
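The cleanup suggested above can be sketched as follows, assuming the socket should be closed whenever unsuccessfulMsg is set after the rejection handshake has been written. The method name and parameters are illustrative, not CacheClientNotifier's actual signature.

```java
import java.io.IOException;
import java.net.Socket;

// Sketch of the proposed fix: on a failed registration, close the client
// socket instead of leaving it half open; on success, proceed to the
// post-authorization step (represented here by the return value).
class RegistrationCleanup {
  static boolean finishRegistration(String unsuccessfulMsg, Socket socket) {
    if (unsuccessfulMsg != null) {
      try {
        socket.close(); // release the rejected client's connection
      } catch (IOException ignore) {
        // closing a rejected connection is best-effort
      }
      return false; // registration was refused
    }
    // performPostAuthorization(...) would run here in the real method
    return true;
  }
}
```

With this shape, the duplicate-durable-client path still sends REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT, but the server-side socket no longer leaks.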
[jira] [Created] (GEODE-9819) Client socket leak in CacheClientNotifier.registerClientInternal when error conditions occur for the durable client
Leon Finker created GEODE-9819: -- Summary: Client socket leak in CacheClientNotifier.registerClientInternal when error conditions occur for the durable client Key: GEODE-9819 URL: https://issues.apache.org/jira/browse/GEODE-9819 Project: Geode Issue Type: Bug Components: client/server, core Affects Versions: 1.14.0 Reporter: Leon Finker In CacheClientNotifier.registerClientInternal the client socket can be left half open and not properly closed when error conditions occur, such as in this case: {code:java} } else { // The existing proxy is already running (which means that another // client is already using this durable id. unsuccessfulMsg = String.format( "The requested durable client has the same identifier ( %s ) as an existing durable client ( %s ). Duplicate durable clients are not allowed.", clientProxyMembershipID.getDurableId(), cacheClientProxy); logger.warn(unsuccessfulMsg); // Set the unsuccessful response byte. responseByte = Handshake.REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT; } {code} It considers the current client connect attempt to have failed. It writes this response back to client: REPLY_EXCEPTION_DUPLICATE_DURABLE_CLIENT. This will cause the client to throw ServerRefusedConnectionException. What seems wrong about this method is that even though it sets "unsuccessfulMsg" and correctly sends back a handshake saying the client is rejected, it does not throw an exception and it does not close "socket". I think right before it calls performPostAuthorization it should do the following: if (unsuccessfulMsg != null) { try { socket.close(); } catch (IOException ignore) { } } else { performPostAuthorization(...) } Full discussion details can be found at https://markmail.org/thread/2gqmbq2m57pz7pxu -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException
Kamilla Aslami created GEODE-9818: - Summary: CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException Key: GEODE-9818 URL: https://issues.apache.org/jira/browse/GEODE-9818 Project: Geode Issue Type: Bug Components: client/server Affects Versions: 1.13.5 Reporter: Kamilla Aslami {noformat} org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > testRedundancySpecifiedNonPrimaryEPFails FAILED org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run in VM 2 running on Host 17763e768fb6 with 4 VMs at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) at org.apache.geode.test.dunit.VM.invoke(VM.java:437) at org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258) Caused by: org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier Expecting: <0> to be greater than: <0> within 5 minutes. 
at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165) at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895) at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679) at org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504) Caused by: java.lang.AssertionError: Expecting: <0> to be greater than: <0> at org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException
[ https://issues.apache.org/jira/browse/GEODE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445464#comment-17445464 ] Geode Integration commented on GEODE-9818: -- Seen on support/1.13 in [distributed-test-openjdk8 #3|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-13-main/jobs/distributed-test-openjdk8/builds/3] ... see [test results|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0618/test-results/distributedTest/1637149771/] or download [artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-13-main/1.13.5-build.0618/test-artifacts/1637149771/distributedtestfiles-openjdk8-1.13.5-build.0618.tgz]. > CI failure: > RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed > with RMIException > --- > > Key: GEODE-9818 > URL: https://issues.apache.org/jira/browse/GEODE-9818 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.13.5 >Reporter: Kamilla Aslami >Priority: Major > Labels: needsTriage > > {noformat} > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > > testRedundancySpecifiedNonPrimaryEPFails FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run > in VM 2 running on Host 17763e768fb6 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258) > Caused by: > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier > Expecting: > <0> > to 
be greater than: > <0> within 5 minutes. > at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504) > Caused by: > java.lang.AssertionError: > Expecting: > <0> > to be greater than: > <0> > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException
[ https://issues.apache.org/jira/browse/GEODE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Murmann updated GEODE-9818: - Labels: needsTriage (was: ) > CI failure: > RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed > with RMIException > --- > > Key: GEODE-9818 > URL: https://issues.apache.org/jira/browse/GEODE-9818 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.13.5 >Reporter: Kamilla Aslami >Priority: Major > Labels: needsTriage > > {noformat} > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > > testRedundancySpecifiedNonPrimaryEPFails FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run > in VM 2 running on Host 17763e768fb6 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258) > Caused by: > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier > Expecting: > <0> > to be greater than: > <0> within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504) > Caused by: > java.lang.AssertionError: > Expecting: > <0> > to be greater than: > <0> > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9816) Implement Radish CLIENT command
[ https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristen reassigned GEODE-9816: -- Assignee: Kristen > Implement Radish CLIENT command > --- > > Key: GEODE-9816 > URL: https://issues.apache.org/jira/browse/GEODE-9816 > Project: Geode > Issue Type: New Feature > Components: redis >Reporter: Jens Deppe >Assignee: Kristen >Priority: Major > > It appears that using {{JedisCluster}} with security requires the {{CLIENT}} > command to exist. Using {{JedisCluster}} as: > {noformat} > JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, > redisServerPort), > REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", > "client", > new GenericObjectPoolConfig<>()); {noformat} > Results in connection errors: > {noformat} > Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown > command `CLIENT`, with args beginning with: `setname`, `client`, > {noformat} > We will need to decide which subcommands to implement. At a minimum: > - {{SETNAME}} (https://redis.io/commands/client-setname) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule
[ https://issues.apache.org/jira/browse/GEODE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9817: -- Labels: pull-request-available (was: ) > Allow analyze serializables tests to provide custom source set paths to > ClassAnalysisRule > - > > Key: GEODE-9817 > URL: https://issues.apache.org/jira/browse/GEODE-9817 > Project: Geode > Issue Type: Wish > Components: tests >Reporter: Kirk Lund >Assignee: Kirk Lund >Priority: Major > Labels: pull-request-available > > In order to make SanctionedSerializablesService and the related tests to be > more pluggable by external modules, I need to make changes to allow analyze > serializables tests to provide custom source set paths to ClassAnalysisRule. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith updated GEODE-9815: - Summary: Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone (was: Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone) > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. Start two redundancy zones A and B with two members each. Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke its disk store and delete the data. > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones. > Steps: > 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 > and B1. > 2. Shutdown A1. > 3. Rebalance - this will create Bucket 0 on C1. > 4. Startup A1 - this will recreate bucket 0. > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first. > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9638) CI failure: DeployedJarTest getDeployedFileName failed on Windows intermittently
[ https://issues.apache.org/jira/browse/GEODE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Gingade reassigned GEODE-9638: Assignee: Dale Emery > CI failure: DeployedJarTest getDeployedFileName failed on Windows > intermittently > - > > Key: GEODE-9638 > URL: https://issues.apache.org/jira/browse/GEODE-9638 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Darrel Schneider >Assignee: Dale Emery >Priority: Major > Labels: GeodeOperationAPI, flaky, needsTriage > > org.apache.geode.deployment.internal.DeployedJarTest > getDeployedFileName > FAILED > java.nio.file.DirectoryNotEmptyException: > C:\Users\geode\AppData\Local\Temp\javaCompiler2976436474406314797\classes > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:266) > at > sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) > at java.nio.file.Files.delete(Files.java:1126) > at org.apache.commons.io.FileUtils.delete(FileUtils.java:1175) > at > org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1194) > at > org.apache.geode.test.compiler.JavaCompiler.compile(JavaCompiler.java:91) > at > org.apache.geode.test.compiler.JarBuilder.buildJarFromClassNames(JarBuilder.java:83) > at > org.apache.geode.deployment.internal.DeployedJarTest.createJarFile(DeployedJarTest.java:82) > at > org.apache.geode.deployment.internal.DeployedJarTest.getDeployedFileName(DeployedJarTest.java:65) > see: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/windows-unit-test-openjdk8/builds/206 -- This message was sent by Atlassian Jira (v8.20.1#820001)
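The DirectoryNotEmptyException above is a common Windows race: another process (often a virus scanner) briefly holds a handle to a child entry, so the recursive delete removes the files but fails on the directory. A typical mitigation, sketched here under the assumption that a short retry suffices (this is not the actual JavaCompiler fix), is to retry the recursive delete with backoff:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of a retrying recursive delete. On Windows, deletes can fail with
// DirectoryNotEmptyException when another process still holds a handle to a
// child entry; retrying after a short pause usually succeeds. Illustrative
// only - not the actual fix applied to
// org.apache.geode.test.compiler.JavaCompiler.
public final class RetryingDelete {
  public static void deleteRecursively(Path dir, int attempts)
      throws IOException, InterruptedException {
    for (int attempt = 1; ; attempt++) {
      try {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
          // Deepest paths first, so children are deleted before their parents.
          paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
          Files.deleteIfExists(p);
        }
        return;
      } catch (IOException e) {
        if (attempt >= attempts) {
          throw e; // give up after the last attempt
        }
        Thread.sleep(100L * attempt); // back off before retrying
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("retrying-delete-demo");
    Files.createFile(dir.resolve("a.txt"));
    deleteRecursively(dir, 3);
    System.out.println(Files.exists(dir)); // prints false
  }
}
```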
[jira] [Assigned] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9815: -- Assignee: Mark Hanson > Recovering persistent members can result in extra copies of a bucket or two > copies int the same redundancy zone > --- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. > *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. Start two redundancy zones A and B with two members each. 
Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke its disk store and delete the data > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones > Steps > 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 > and B1 > 2. Shutdown A1 > 3. Rebalance - this will create Bucket 0 on C1 > 4. Startup A1 - this will recreate bucket 0 > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9810) CI: NativeRedisClusterTest testEachProxyReturnsExposedPorts failed
[ https://issues.apache.org/jira/browse/GEODE-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne updated GEODE-9810: - Description: {code:java} > Task :geode-for-redis:acceptanceTest NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED java.lang.AssertionError: Expecting actual: [44073, 45679, 36065, 40077, 42137] to contain exactly in any order: [40077, 45679, 33425, 36065, 42137, 44073] but could not find the following elements: [33425] at org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48) 1385 tests completed, 1 failed, 2 skipped =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/ =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test report artifacts from this job are available at: http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz {code} was: https://hydradb.hdb.gemfire-ci.info/hdb/testresult/12258442 {code:java} > Task :geode-for-redis:acceptanceTest NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED java.lang.AssertionError: Expecting actual: [44073, 45679, 36065, 40077, 42137] to contain exactly in any order: [40077, 45679, 33425, 36065, 42137, 44073] but could not find the following elements: [33425] at org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48) 1385 tests completed, 1 failed, 2 skipped =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/ =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test report artifacts from this job 
are available at: http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz {code} > CI: NativeRedisClusterTest testEachProxyReturnsExposedPorts failed > -- > > Key: GEODE-9810 > URL: https://issues.apache.org/jira/browse/GEODE-9810 > Project: Geode > Issue Type: Bug > Components: redis >Reporter: Xiaojian Zhou >Priority: Major > Labels: needsTriage > > > {code:java} > > Task :geode-for-redis:acceptanceTest > NativeRedisClusterTest > testEachProxyReturnsExposedPorts FAILED > java.lang.AssertionError: > Expecting actual: > [44073, 45679, 36065, 40077, 42137] > to contain exactly in any order: > [40077, 45679, 33425, 36065, 42137, 44073] > but could not find the following elements: > [33425] > at > org.apache.geode.redis.NativeRedisClusterTest.testEachProxyReturnsExposedPorts(NativeRedisClusterTest.java:48) > 1385 tests completed, 1 failed, 2 skipped > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-results/acceptanceTest/1637046056/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0662/test-artifacts/1637046056/acceptancetestfiles-openjdk8-1.15.0-build.0662.tgz > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule
[ https://issues.apache.org/jira/browse/GEODE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk Lund reassigned GEODE-9817: Assignee: Kirk Lund > Allow analyze serializables tests to provide custom source set paths to > ClassAnalysisRule > - > > Key: GEODE-9817 > URL: https://issues.apache.org/jira/browse/GEODE-9817 > Project: Geode > Issue Type: Wish > Components: tests >Reporter: Kirk Lund >Assignee: Kirk Lund >Priority: Major > > In order to make SanctionedSerializablesService and the related tests > more pluggable by external modules, I need to make changes to allow analyze > serializables tests to provide custom source set paths to ClassAnalysisRule. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-9817) Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule
Kirk Lund created GEODE-9817: Summary: Allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule Key: GEODE-9817 URL: https://issues.apache.org/jira/browse/GEODE-9817 Project: Geode Issue Type: Wish Components: tests Reporter: Kirk Lund In order to make SanctionedSerializablesService and the related tests more pluggable by external modules, I need to make changes to allow analyze serializables tests to provide custom source set paths to ClassAnalysisRule. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith updated GEODE-9815: - Description: The fix in GEODE-9554 is incomplete for some cases, and it also introduces a new issue when removing buckets that are over redundancy. GEODE-9554 and these new issues are all related to using redundancy zones and having persistent members. With persistence, when we start up a member with persisted buckets, we always recover the persisted buckets on startup, regardless of whether redundancy is already met or what zone the existing buckets are on. This is necessary to ensure that we can recover all colocated buckets that might be persisted on the member. Because recovering these persistent buckets may cause us to go over redundancy, after we recover from disk, we run a "restore redundancy" task that actually removes copies of buckets that are over redundancy. GEODE-9554 addressed one case where we end up removing the last copy of a bucket from one redundancy zone while leaving two copies in another redundancy zone. It did so by disallowing the removal of a bucket if it is the last copy in a redundancy zone. There are a couple of issues with this approach. *Problem 1:* We may end up with two copies of the bucket in one zone in some cases With a slight tweak to the scenario fixed with GEODE-9554 we can end up never getting out of the situation where we have two copies of a bucket in the same zone. Steps: 1. Start two redundancy zones A and B with two members each. Bucket 0 is on member A1 and B1. 2. Shutdown member A1. 3. Rebalance - this will create bucket 0 on A2. 4. Shutdown B1. Revoke its disk store and delete the data 5. Startup A1 - it will recover bucket 0. 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that situation. *Problem 2:* We may never delete extra copies of a bucket The fix for GEODE-9554 introduces a new problem if we have more than 2 redundancy zones Steps 1. 
Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 and B1 2. Shutdown A1 3. Rebalance - this will create Bucket 0 on C1 4. Startup A1 - this will recreate bucket 0 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. I think the overall fix is probably to do something different than prevent removing the last copy of a bucket from a redundancy zone. Instead, I think we should do something like this: 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* buckets that have two copies in the same zone, as well as any buckets that are actually over redundancy. 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra copies of a bucket in the same zone first 3. Back out the changes for GEODE-9554 and let the last copy be deleted from a zone. was: The fix in GEODE-9554 is incomplete for some cases, and it also introduces a new issue when removing buckets that are over redundancy. GEODE-9554 and these new issues are all related to using redundancy zones and having persistent members. With persistence, when we start up a member with persisted buckets, we always recover the persisted buckets on startup, regardless of whether redundancy is already met or what zone the existing buckets are on. This is necessary to ensure that we can recover all colocated buckets that might be persisted on the member. Because recovering these persistent buckets may cause us to go over redundancy, after we recover from disk, we run a "restore redundancy" task that actually removes copies of buckets that are over redundancy. GEODE-9554 addressed one case where we end up removing the last copy of a bucket from one redundancy zone while leaving two copies in another redundancy zone. It did so by disallowing the removal of a bucket if it is the last copy in a redundancy zone. There are a couple of issues with this approach. 
*Problem 1:* We may end up with two copies of the bucket in one zone in some cases With a slight tweak to the scenario fixed with GEODE-9554 we can end up never getting out of the situation where we have two copies of a bucket in the same zone. Steps: 1. Start two redundancy zones A and B with two members each. Bucket 0 is on member A1 and B1. 2. Shutdown member A1. 3. Rebalance - this will create bucket 0 on A2. 4. Shutdown B1. Revoke its disk store and delete the data 5. Startup A1 - it will recover bucket 0. 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that situation. *Problem 2:* We may never delete extra copies of a bucket The fix for GEODE-9554 introduces a new problem if we have more than 2 redundancy zones Steps 1. Start three redundancy zones A,B,C with two members each. Bucket 0 is on A1 and B1> 2. Shutdown A1 3. Rebalance - this will create Bucket 0 on C1 4. Startup A1 - this will recreate bucket 0 5. Now we have bucket 0 on A1, B1,
[jira] [Updated] (GEODE-9816) Implement Radish CLIENT command
[ https://issues.apache.org/jira/browse/GEODE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe updated GEODE-9816: -- Description: It appears that using {{JedisCluster}} with security requires the {{CLIENT}} command to exist. Using {{JedisCluster}} as: {noformat} JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, redisServerPort), REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", "client", new GenericObjectPoolConfig<>()); {noformat} Results in connection errors: {noformat} Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `CLIENT`, with args beginning with: `setname`, `client`, {noformat} We will need to decide which subcommands to implement. At a minimum: - {{SETNAME}} (https://redis.io/commands/client-setname) was: It appears that using {{JedisCluster}} with security requires the {{CLIENT}} command to exist. Using {{JedisCluster}} as: {noformat} JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, redisServerPort), REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", "client", new GenericObjectPoolConfig<>()); {noformat} Results in connection errors: {noformat} Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `CLIENT`, with args beginning with: `setname`, `client`, {noformat} > Implement Radish CLIENT command > --- > > Key: GEODE-9816 > URL: https://issues.apache.org/jira/browse/GEODE-9816 > Project: Geode > Issue Type: New Feature > Components: redis >Reporter: Jens Deppe >Priority: Major > > It appears that using {{JedisCluster}} with security requires the {{CLIENT}} > command to exist. 
Using {{JedisCluster}} as: > {noformat} > JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, > redisServerPort), > REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", > "client", > new GenericObjectPoolConfig<>()); {noformat} > Results in connection errors: > {noformat} > Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown > command `CLIENT`, with args beginning with: `setname`, `client`, > {noformat} > We will need to decide which subcommands to implement. At a minimum: > - {{SETNAME}} (https://redis.io/commands/client-setname) -- This message was sent by Atlassian Jira (v8.20.1#820001)
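For reference, CLIENT is a subcommand-style command, and the `CLIENT setname` in the error above comes from Jedis issuing CLIENT SETNAME during connection setup when a client name (the "client" argument in the constructor) is supplied. A minimal, hypothetical dispatch for the SETNAME and GETNAME subcommands might look like the following; it follows the Redis rule that connection names may not contain spaces or newlines, and it is not geode-for-redis internals:

```java
// Minimal, illustrative handler for the CLIENT subcommands Jedis needs at
// connection setup. Per the Redis spec, a connection name may not contain
// spaces or newlines; a valid name is stored per connection and the server
// replies +OK. This is a sketch, not geode-for-redis internals.
public class ClientCommandSketch {
  private String connectionName = "";

  public String execute(String subcommand, String... args) {
    switch (subcommand.toUpperCase()) {
      case "SETNAME":
        if (args.length != 1) {
          return "-ERR wrong number of arguments for 'client|setname' command";
        }
        for (char c : args[0].toCharArray()) {
          if (c == ' ' || c == '\n' || c == '\r') {
            return "-ERR Client names cannot contain spaces, newlines or special characters.";
          }
        }
        connectionName = args[0];
        return "+OK";
      case "GETNAME":
        return connectionName; // a bulk-string reply in the real protocol
      default:
        return "-ERR Unknown subcommand '" + subcommand + "'";
    }
  }

  public static void main(String[] args) {
    ClientCommandSketch client = new ClientCommandSketch();
    System.out.println(client.execute("SETNAME", "client")); // prints +OK
    System.out.println(client.execute("GETNAME")); // prints client
  }
}
```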
[jira] [Assigned] (GEODE-9449) remove 'b' prefix from constants
[ https://issues.apache.org/jira/browse/GEODE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristen reassigned GEODE-9449: -- Assignee: Kristen > remove 'b' prefix from constants > > > Key: GEODE-9449 > URL: https://issues.apache.org/jira/browse/GEODE-9449 > Project: Geode > Issue Type: Improvement > Components: redis >Affects Versions: 1.15.0 >Reporter: Darrel Schneider >Assignee: Kristen >Priority: Minor > Labels: pull-request-available > > A number of constants in the redis packages have a 'b' prefix on their name. > This might have been a use of Hungarian notation but that is not clear. The > convention for constant names in geode is all upper case with underscore > between words. So the 'b' prefix should be removed. See StringBytesGlossary > for the location of many of these constants. Most of the constants in > StringBytesGlossary contain bytes, one or more, but a few of the > constants in it are actually String instances. Consider renaming them to have > a _STRING suffix or moving them to another class like StringGlossary. > The byte array constants in this class are marked with the MakeImmutable > annotation. -- This message was sent by Atlassian Jira (v8.20.1#820001)
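A before/after illustration of the renaming convention described above (the constant names here are hypothetical examples, not actual StringBytesGlossary members):

```java
import java.nio.charset.StandardCharsets;

// Before/after illustration of the renaming convention described above.
// The names below are hypothetical examples, not actual members of the
// geode-for-redis StringBytesGlossary class.
public class GlossaryExample {
  // Before: Hungarian-style 'b' prefix on a byte-array constant
  // public static final byte[] bErr = "ERR".getBytes(StandardCharsets.UTF_8);

  // After: the Geode convention - upper case with underscores, no prefix
  public static final byte[] ERR = "ERR".getBytes(StandardCharsets.UTF_8);

  // String-valued constants get a _STRING suffix (or could move to a
  // separate class such as StringGlossary)
  public static final String ERR_STRING = "ERR";
}
```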
[jira] [Created] (GEODE-9816) Implement Radish CLIENT command
Jens Deppe created GEODE-9816: - Summary: Implement Radish CLIENT command Key: GEODE-9816 URL: https://issues.apache.org/jira/browse/GEODE-9816 Project: Geode Issue Type: New Feature Components: redis Reporter: Jens Deppe It appears that using {{JedisCluster}} with security requires the {{CLIENT}} command to exist. Using {{JedisCluster}} as: {noformat} JedisCluster jedis = new JedisCluster(new HostAndPort(BIND_ADDRESS, redisServerPort), REDIS_CLIENT_TIMEOUT, SO_TIMEOUT, DEFAULT_MAX_ATTEMPTS, "user", "user", "client", new GenericObjectPoolConfig<>()); {noformat} Results in connection errors: {noformat} Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `CLIENT`, with args beginning with: `setname`, `client`, {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9653) ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > serverVersioningTest[version=1.9.1] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445302#comment-17445302 ] Anilkumar Gingade commented on GEODE-9653: -- This is a test-related issue with the test Rules; it should not be a blocker for 1.15. Removing the needsTriage label. > ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > > serverVersioningTest[version=1.9.1] FAILED > --- > > Key: GEODE-9653 > URL: https://issues.apache.org/jira/browse/GEODE-9653 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.15.0 >Reporter: Kamilla Aslami >Priority: Major > Labels: needsTriage > > {noformat} > org.apache.geode.test.dunit.rules.tests.ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > > serverVersioningTest[version=1.9.1] FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 350[fatal > 2021/09/28 00:08:38.644 UTC tid=47] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. > at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 357[fatal > 2021/09/28 00:08:38.657 UTC tid=47] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. 
> at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 364[fatal > 2021/09/28 00:08:38.657 UTC tid=48] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. > at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at
[jira] [Updated] (GEODE-9653) ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > serverVersioningTest[version=1.9.1] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Gingade updated GEODE-9653: - Labels: (was: needsTriage) > ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > > serverVersioningTest[version=1.9.1] FAILED > --- > > Key: GEODE-9653 > URL: https://issues.apache.org/jira/browse/GEODE-9653 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.15.0 >Reporter: Kamilla Aslami >Priority: Major > > {noformat} > org.apache.geode.test.dunit.rules.tests.ClusterStartupRuleCanSpecifyOlderVersionsDUnitTest > > serverVersioningTest[version=1.9.1] FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 350[fatal > 2021/09/28 00:08:38.644 UTC tid=47] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. > at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 357[fatal > 2021/09/28 00:08:38.657 UTC tid=47] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. 
> at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 364[fatal > 2021/09/28 00:08:38.657 UTC tid=48] Exception in > processing request from 10.0.0.30 > java.lang.Exception: Improperly configured client detected - use > addPoolLocator to configure its locators instead of addPoolServer. > at > org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:374) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChild(Suite.java:128) > at
[jira] [Updated] (GEODE-9638) CI failure: DeployedJarTest getDeployedFileName failed on Windows intermittently
[ https://issues.apache.org/jira/browse/GEODE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Gingade updated GEODE-9638: - Labels: GeodeOperationAPI flaky needsTriage (was: flaky needsTriage) > CI failure: DeployedJarTest getDeployedFileName failed on Windows > intermittently > - > > Key: GEODE-9638 > URL: https://issues.apache.org/jira/browse/GEODE-9638 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Darrel Schneider >Priority: Major > Labels: GeodeOperationAPI, flaky, needsTriage > > org.apache.geode.deployment.internal.DeployedJarTest > getDeployedFileName > FAILED > java.nio.file.DirectoryNotEmptyException: > C:\Users\geode\AppData\Local\Temp\javaCompiler2976436474406314797\classes > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:266) > at > sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) > at java.nio.file.Files.delete(Files.java:1126) > at org.apache.commons.io.FileUtils.delete(FileUtils.java:1175) > at > org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1194) > at > org.apache.geode.test.compiler.JavaCompiler.compile(JavaCompiler.java:91) > at > org.apache.geode.test.compiler.JarBuilder.buildJarFromClassNames(JarBuilder.java:83) > at > org.apache.geode.deployment.internal.DeployedJarTest.createJarFile(DeployedJarTest.java:82) > at > org.apache.geode.deployment.internal.DeployedJarTest.getDeployedFileName(DeployedJarTest.java:65) > see: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/windows-unit-test-openjdk8/builds/206 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies int the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Gingade updated GEODE-9815: - Labels: GeodeOperationAPI needsTriage (was: needsTriage) > Recovering persistent members can result in extra copies of a bucket or two > copies int the same redundancy zone > --- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Priority: Major > Labels: GeodeOperationAPI, needsTriage > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. > *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. 
Start two redundancy zones A and B with two members each. Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke its disk store and delete the data > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones > Steps > 1. Start three redundancy zones A,B,C with two members each. Bucket 0 is on > A1 and B1> > 2. Shutdown A1 > 3. Rebalance - this will create Bucket 0 on C1 > 4. Startup A1 - this will recreate bucket 0 > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first> > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone. -- This message was sent by Atlassian Jira (v8.20.1#820001)