[jira] [Created] (GEODE-9955) Implement RPUSHX
Wayne created GEODE-9955: Summary: Implement RPUSHX Key: GEODE-9955 URL: https://issues.apache.org/jira/browse/GEODE-9955 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [RPUSHX|https://redis.io/commands/rpushx] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis. -- This message was sent by Atlassian Jira (v8.20.1#820001)
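As a semantics reference for this ticket, here is a minimal sketch of RPUSHX's behavior per the Redis documentation. This is illustrative only, not Geode's implementation; the class and method names are hypothetical, and a plain in-memory map stands in for the region.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of RPUSHX semantics per redis.io (illustrative only; this is
// not Geode's implementation, and the class/method names are hypothetical).
// RPUSHX appends to the tail only when the list already exists; for a missing
// key it is a no-op and returns 0, while RPUSH would create the list.
public class RpushxSketch {
  public static int rpushx(Map<String, Deque<String>> store, String key, String... values) {
    Deque<String> list = store.get(key);
    if (list == null) {
      return 0; // key absent: nothing is pushed and no list is created
    }
    for (String v : values) {
      list.addLast(v); // append at the tail, like RPUSH
    }
    return list.size(); // RPUSHX returns the new length of the list
  }
}
```

The no-op-on-missing-key behavior is the main thing that distinguishes RPUSHX from RPUSH when verifying against native redis.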
[jira] [Created] (GEODE-9954) Implement RPUSH
Wayne created GEODE-9954: Summary: Implement RPUSH Key: GEODE-9954 URL: https://issues.apache.org/jira/browse/GEODE-9954 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [RPUSH|https://redis.io/commands/rpush] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9953) Implement LTRIM
Wayne created GEODE-9953: Summary: Implement LTRIM Key: GEODE-9953 URL: https://issues.apache.org/jira/browse/GEODE-9953 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LTRIM|https://redis.io/commands/ltrim] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
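LTRIM's index handling is the subtle part of this ticket, so a short sketch of it may help. This is illustrative only, not Geode code, and the names are hypothetical; it models the inclusive, negative-index-aware range described on redis.io.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of LTRIM's index handling per redis.io (illustrative only, not Geode
// code; names are hypothetical). Both start and stop are inclusive, negative
// indices count back from the tail (-1 is the last element), and a range that
// selects nothing leaves an empty list.
public class LtrimSketch {
  public static List<String> ltrim(List<String> list, int start, int stop) {
    int size = list.size();
    if (start < 0) start = Math.max(size + start, 0); // clamp negative start at head
    if (stop < 0) stop = size + stop;                 // -1 means the last element
    stop = Math.min(stop, size - 1);                  // clamp stop at the tail
    if (start > stop || start >= size) {
      return new ArrayList<>(); // nothing survives the trim
    }
    return new ArrayList<>(list.subList(start, stop + 1)); // inclusive range
  }
}
```

For example, trimming ("a","b","c","d") with start=1, stop=-1 keeps ("b","c","d"), matching native redis.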
[jira] [Created] (GEODE-9952) Implement LSET
Wayne created GEODE-9952: Summary: Implement LSET Key: GEODE-9952 URL: https://issues.apache.org/jira/browse/GEODE-9952 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LSET|https://redis.io/commands/lset] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9951) Implement RPOP
Wayne created GEODE-9951: Summary: Implement RPOP Key: GEODE-9951 URL: https://issues.apache.org/jira/browse/GEODE-9951 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [RPOP|https://redis.io/commands/rpop] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9950) Implement LRANGE
Wayne created GEODE-9950: Summary: Implement LRANGE Key: GEODE-9950 URL: https://issues.apache.org/jira/browse/GEODE-9950 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LRANGE|https://redis.io/commands/lrange] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9949) Implement LPUSHX
Wayne created GEODE-9949: Summary: Implement LPUSHX Key: GEODE-9949 URL: https://issues.apache.org/jira/browse/GEODE-9949 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LPUSHX|https://redis.io/commands/lpushx] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9948) Implement LINSERT
Wayne created GEODE-9948: Summary: Implement LINSERT Key: GEODE-9948 URL: https://issues.apache.org/jira/browse/GEODE-9948 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LINSERT|https://redis.io/commands/linsert] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
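A short sketch of LINSERT's pivot semantics per redis.io, for reference while implementing and testing this ticket. Illustrative only, not Geode code; the names are hypothetical.

```java
import java.util.List;

// Sketch of LINSERT semantics per redis.io (illustrative only, not Geode code;
// names are hypothetical). The value is inserted before or after the first
// occurrence of the pivot; if the pivot is absent the list is unchanged and
// -1 is returned, otherwise the new length is returned.
public class LinsertSketch {
  public static int linsert(List<String> list, boolean before, String pivot, String value) {
    int i = list.indexOf(pivot);
    if (i < 0) {
      return -1; // pivot not found: list unchanged
    }
    list.add(before ? i : i + 1, value);
    return list.size();
  }
}
```

Note that only the first occurrence of the pivot matters, which is worth covering explicitly in the unit tests.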
[jira] [Created] (GEODE-9947) Implement LINDEX
Wayne created GEODE-9947: Summary: Implement LINDEX Key: GEODE-9947 URL: https://issues.apache.org/jira/browse/GEODE-9947 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LINDEX|https://redis.io/commands/lindex] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
[jira] [Created] (GEODE-9946) Implement LREM Command
Wayne created GEODE-9946: Summary: Implement LREM Command Key: GEODE-9946 URL: https://issues.apache.org/jira/browse/GEODE-9946 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Implement the [LREM|https://redis.io/commands/lrem] command. +Acceptance Criteria+ The command has been implemented along with appropriate unit and system tests. The command has been tested using the redis-cli tool and verified against native redis.
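LREM's count argument has three distinct behaviors, so a semantics sketch may be useful alongside this ticket. Illustrative only, not Geode code; the names are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Sketch of LREM's count semantics per redis.io (illustrative only, not Geode
// code; names are hypothetical): count > 0 removes matches from head to tail,
// count < 0 removes from tail to head, and count == 0 removes all matches.
// Returns the number of elements removed.
public class LremSketch {
  public static int lrem(List<String> list, int count, String value) {
    int limit = (count == 0) ? Integer.MAX_VALUE : Math.abs(count);
    if (count < 0) {
      Collections.reverse(list); // scan from the tail by reversing first
    }
    int removed = 0;
    Iterator<String> it = list.iterator();
    while (it.hasNext() && removed < limit) {
      if (it.next().equals(value)) {
        it.remove();
        removed++;
      }
    }
    if (count < 0) {
      Collections.reverse(list); // restore the original order
    }
    return removed;
  }
}
```

The head-to-tail versus tail-to-head distinction only shows up when the matches are interleaved with other values, which makes it a good redis-cli comparison case.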
[jira] [Created] (GEODE-9945) Override Equality Checking for Redis Lists
Wayne created GEODE-9945: Summary: Override Equality Checking for Redis Lists Key: GEODE-9945 URL: https://issues.apache.org/jira/browse/GEODE-9945 Project: Geode Issue Type: New Feature Components: redis Reporter: Wayne Override methods in LinkedList that perform equality checking to account for our usage of byte[] content. +Acceptance Criteria+ At a minimum, the remove and indexOf methods are overridden. Appropriate unit testing is added.
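The problem this ticket describes can be illustrated with a small sketch (hypothetical class name, not Geode's actual code): LinkedList compares elements with Object.equals, and byte[] arrays are only equal by reference, so content-equal arrays are never found unless the lookup methods are overridden.

```java
import java.util.Arrays;
import java.util.LinkedList;

// Illustrative sketch of the fix this ticket asks for (hypothetical class
// name, not Geode's code): LinkedList.indexOf and LinkedList.remove compare
// elements with Object.equals, and two byte[] arrays with identical content
// are not equals()-equal. Overriding those methods to use Arrays.equals makes
// lookups compare array content instead of references.
public class ByteArrayList extends LinkedList<byte[]> {
  @Override
  public int indexOf(Object o) {
    if (!(o instanceof byte[])) {
      return -1;
    }
    byte[] target = (byte[]) o;
    int i = 0;
    for (byte[] element : this) {
      if (Arrays.equals(element, target)) {
        return i; // found a content-equal array
      }
      i++;
    }
    return -1;
  }

  @Override
  public boolean remove(Object o) {
    int i = indexOf(o); // reuse the content-aware lookup
    if (i < 0) {
      return false;
    }
    remove(i); // LinkedList.remove(int) removes by position
    return true;
  }
}
```

With a plain LinkedList<byte[]>, indexOf on a content-equal array returns -1; with the override it finds the element, which is exactly the behavior LREM and similar commands need.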
[jira] [Commented] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473212#comment-17473212 ] ASF subversion and git services commented on GEODE-9815: Commit 83ce65f08db5d3d3e0f109dc8e532a998778ab21 in geode's branch refs/heads/develop from mhansonp [ https://gitbox.apache.org/repos/asf?p=geode.git;h=83ce65f ] GEODE-9815: Prefer to remove a redundant copy in the same zone (#7124) -Fix primaryship in tests -Fix comments -Remove log statements -Remove unnecessary exceptions -Extracting a method to make things more readable -Changes per comments > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.0, needsTriage, > pull-request-available > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. 
It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. > *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. Start two redundancy zones A and B with two members each. Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke its disk store and delete the data. > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones > Steps > 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 > and B1 > 2. Shutdown A1 > 3. Rebalance - this will create Bucket 0 on C1 > 4. Startup A1 - this will recreate bucket 0 > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone.
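The selection rule proposed in the comment above can be sketched as follows. This is a hypothetical illustration, not Geode's PartitionRegionLoadModel; the names and shapes are invented for clarity: a bucket becomes a removal candidate when it has more copies than redundancy requires, or when any two of its copies sit in the same redundancy zone.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the proposed selection rule (hypothetical names, not
// Geode's PartitionRegionLoadModel): treat a bucket as a removal candidate
// when it has more copies than redundancy requires, or when any two of its
// copies sit in the same redundancy zone.
public class OverRedundancySketch {
  // copies: members hosting the bucket; zoneOfMember: member -> redundancy zone
  public static boolean needsRemoval(List<String> copies,
      Map<String, String> zoneOfMember, int desiredCopies) {
    if (copies.size() > desiredCopies) {
      return true; // plainly over redundancy
    }
    Set<String> seenZones = new HashSet<>();
    for (String member : copies) {
      if (!seenZones.add(zoneOfMember.get(member))) {
        return true; // two copies share a zone
      }
    }
    return false;
  }
}
```

Under this rule, the Problem 1 state (bucket 0 on A1 and A2, both in zone A) is flagged even though it is not numerically over redundancy, and the Problem 2 state (three copies across three zones) is flagged by the count check.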
[jira] [Commented] (GEODE-9942) Stress test tasks neglect non-public JUnit 5 test classes
[ https://issues.apache.org/jira/browse/GEODE-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473202#comment-17473202 ] ASF subversion and git services commented on GEODE-9942: Commit b2417ca82b096beeb174e536edd29c37f360f8fd in geode's branch refs/heads/develop from Dale Emery [ https://gitbox.apache.org/repos/asf?p=geode.git;h=b2417ca ] GEODE-9942: Include JUnit 5 tests in stress tests (#7256) PROBLEM JUnit 5 test classes need not be public. Indeed, IntelliJ's default inspections discourage making JUnit 5 classes public. `StressTestHelper` uses a `ClassGraph` to gather information about test classes. By default, the `ClassGraph` scans only public classes. So by default, the `ClassGraph` does not gather information about JUnit 5 classes with non-public visibility. As a result, our stress test scripts do not run JUnit 5 tests. SOLUTION Call `ignoreClassVisibility()` to configure the `ClassGraph` to scan all classes, not just public ones. A few poorly-controlled, unsophisticated experiments (on a 2016 MacBook) show that this increases the scan duration from 3 seconds to 3.4 seconds. > Stress test tasks neglect non-public JUnit 5 test classes > - > > Key: GEODE-9942 > URL: https://issues.apache.org/jira/browse/GEODE-9942 > Project: Geode > Issue Type: Test > Components: tests >Affects Versions: 1.15.0 >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > JUnit 5 test classes need not be public. Indeed, IntelliJ's default > inspections discourage making JUnit 5 classes public. > {{StressTestHelper}} uses a {{ClassGraph}} to gather information about test > classes. By default, {{ClassGraph}} scans only public classes. So by default, > {{ClassGraph}} does not gather information about JUnit 5 classes with > non-public visibility. As a result, our stress test scripts do not run JUnit > 5 tests. 
> Solution: Call {{ignoreClassVisibility()}} to configure {{ClassGraph}} to > scan all classes, not just public ones.
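The fix described above can be sketched as follows, assuming the io.github.classgraph dependency is on the classpath; the package filter and the annotation lookup shown here are illustrative, not the exact StressTestHelper code.

```java
import io.github.classgraph.ClassGraph;
import io.github.classgraph.ScanResult;

// Sketch of the fix described above (assumes the io.github.classgraph
// dependency; the package filter and annotation lookup are illustrative, not
// the actual StressTestHelper code). Without ignoreClassVisibility(),
// ClassGraph reports only public classes, so package-private JUnit 5 test
// classes are skipped.
public class TestClassScan {
  public static void main(String[] args) {
    try (ScanResult result = new ClassGraph()
        .enableClassInfo()
        .enableMethodInfo()
        .enableAnnotationInfo()
        .ignoreClassVisibility() // the fix: scan non-public classes too
        .acceptPackages("org.apache.geode") // hypothetical package filter
        .scan()) {
      result.getClassesWithMethodAnnotation("org.junit.jupiter.api.Test")
          .forEach(ci -> System.out.println(ci.getName()));
    }
  }
}
```

The try-with-resources block closes the ScanResult when the scan data is no longer needed.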
[jira] [Resolved] (GEODE-9943) ReplicatedIndexedQueryBenchmark performance degradation
[ https://issues.apache.org/jira/browse/GEODE-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bala Tripura Sundari Kaza Venkata resolved GEODE-9943. -- Resolution: Invalid This is due to test instability and not a real benchmark issue. > ReplicatedIndexedQueryBenchmark performance degradation > -- > > Key: GEODE-9943 > URL: https://issues.apache.org/jira/browse/GEODE-9943 > Project: Geode > Issue Type: Bug >Reporter: Bala Tripura Sundari Kaza Venkata >Priority: Major > Labels: needsTriage > > There is performance degradation on the ReplicatedIndexedQuery benchmark. > Below is the output: > {code:java} > org.apache.geode.benchmark.tests.ReplicatedIndexedQueryBenchmark > average ops/second Baseline: 31678.93 Test: 28420.90 > Difference: -10.3% >ops/second standard error Baseline:48.69 Test:50.69 > Difference: +4.1% >ops/second standard deviation Baseline: 840.57 Test: 874.99 > Difference: +4.1% > YS 99th percentile latency Baseline: 20095.61 Test: 20094.75 > Difference: -0.0% > median latency Baseline: 7467007.00 Test: 7217151.00 > Difference: -3.3% > 90th percentile latency Baseline: 54657023.00 Test: 81788927.00 > Difference: +49.6% > 99th percentile latency Baseline: 111345663.00 Test: 134217727.00 > Difference: +20.5% >99.9th percentile latency Baseline: 226623487.00 Test: 202899455.00 > Difference: -10.5% > average latency Baseline: 18182876.80 Test: 20330177.60 > Difference: +11.8% > latency standard deviation Baseline: 28044862.49 Test: 33920338.87 > Difference: +21.0% > latency standard error Baseline: 9109.64 Test: 11652.11 > Difference: +27.9% > average ops/second Baseline: 31594.76 Test: 28257.85 > Difference: -10.6% > {code} > Failure seen in this CI run: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/83
[jira] [Updated] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
[ https://issues.apache.org/jira/browse/GEODE-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9944: -- Labels: GeodeOperationAPI needsTriage pull-request-available (was: GeodeOperationAPI needsTriage) > NPE could occur if HARegion is being created again but not fully initialized > - > > Key: GEODE-9944 > URL: https://issues.apache.org/jira/browse/GEODE-9944 > Project: Geode > Issue Type: Bug > Components: client queues >Affects Versions: 1.15.0 >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > The stack trace for the NPE is: > {noformat} > fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 Processor 7> tid=0x148] Uncaught exception processing > QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b > processorId=4029 > sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 > java.lang.NullPointerException > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) > at > 
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This could occur when a client is not able to re-auth in time, and the server > removes the HARegionQueue for the client. When the client is able to re-auth > later, the new HARegionQueue is created again. There is a race in which, when the > server processes the above message, the HARegionQueue is not fully initialized, > causing this NPE.
[jira] [Updated] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
[ https://issues.apache.org/jira/browse/GEODE-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Shu updated GEODE-9944: Labels: GeodeOperationAPI needsTriage (was: needsTriage) > NPE could occur if HARegion is being created again but not fully initialized > - > > Key: GEODE-9944 > URL: https://issues.apache.org/jira/browse/GEODE-9944 > Project: Geode > Issue Type: Bug > Components: client queues >Affects Versions: 1.15.0 >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: GeodeOperationAPI, needsTriage > > The stack trace for the NPE is: > {noformat} > fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 Processor 7> tid=0x148] Uncaught exception processing > QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b > processorId=4029 > sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 > java.lang.NullPointerException > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) > at > 
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This could occur when a client is not able to re-auth in time, and the server > removes the HARegionQueue for the client. When the client is able to re-auth > later, the new HARegionQueue is created again. There is a race in which, when the > server processes the above message, the HARegionQueue is not fully initialized, > causing this NPE.
[jira] [Assigned] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
[ https://issues.apache.org/jira/browse/GEODE-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Shu reassigned GEODE-9944: --- Assignee: Eric Shu > NPE could occur if HARegion is being created again but not fully initialized > - > > Key: GEODE-9944 > URL: https://issues.apache.org/jira/browse/GEODE-9944 > Project: Geode > Issue Type: Bug > Components: client queues >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: needsTriage > > The stack trace for the NPE is: > {noformat} > fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 Processor 7> tid=0x148] Uncaught exception processing > QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b > processorId=4029 > sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 > java.lang.NullPointerException > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This could 
occur when a client is not able to re-auth in time, and the server > removes the HARegionQueue for the client. When the client is able to re-auth > later, the new HARegionQueue is created again. There is a race in which, when the > server processes the above message, the HARegionQueue is not fully initialized, > causing this NPE.
[jira] [Updated] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
[ https://issues.apache.org/jira/browse/GEODE-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Shu updated GEODE-9944: Affects Version/s: 1.15.0 > NPE could occur if HARegion is being created again but not fully initialized > - > > Key: GEODE-9944 > URL: https://issues.apache.org/jira/browse/GEODE-9944 > Project: Geode > Issue Type: Bug > Components: client queues >Affects Versions: 1.15.0 >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: needsTriage > > The stack trace for the NPE is: > {noformat} > fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 Processor 7> tid=0x148] Uncaught exception processing > QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b > processorId=4029 > sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 > java.lang.NullPointerException > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > at java.lang.Thread.run(Thread.java:748) > 
{noformat} > This could occur when a client is not able to re-auth in time, and the server > removes the HARegionQueue for the client. When the client is able to re-auth > later, the new HARegionQueue is created again. There is a race in which, when the > server processes the above message, the HARegionQueue is not fully initialized, > causing this NPE.
[jira] [Updated] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
[ https://issues.apache.org/jira/browse/GEODE-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Murmann updated GEODE-9944: - Labels: needsTriage (was: ) > NPE could occur if HARegion is being created again but not fully initialized > - > > Key: GEODE-9944 > URL: https://issues.apache.org/jira/browse/GEODE-9944 > Project: Geode > Issue Type: Bug > Components: client queues >Reporter: Eric Shu >Priority: Major > Labels: needsTriage > > The stack trace for the NPE is: > {noformat} > fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 Processor 7> tid=0x148] Uncaught exception processing > QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b > processorId=4029 > sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 > java.lang.NullPointerException > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) > at > org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This could occur when 
a client is not able to re-auth in time, and the server > removes the HARegionQueue for the client. When the client is able to re-auth > later, the new HARegionQueue is created again. There is a race in which, when the > server processes the above message, the HARegionQueue is not fully initialized, > causing this NPE.
[jira] [Created] (GEODE-9944) NPE could occur if HARegion is being created again but not fully initialized
Eric Shu created GEODE-9944: --- Summary: NPE could occur if HARegion is being created again but not fully initialized Key: GEODE-9944 URL: https://issues.apache.org/jira/browse/GEODE-9944 Project: Geode Issue Type: Bug Components: client queues Reporter: Eric Shu The stack trace for the NPE is: {noformat} fatal 2022/01/08 12:45:33.175 PST bridgegemfire7_host1_25026 tid=0x148] Uncaught exception processing QueueSynchronizationProcessor$QueueSynchronizationMessage@340d586b processorId=4029 sender=rs-FullRegression15142100a3i3large-hydra-client-18(bridgegemfire6_host1_25009:25009):41006 java.lang.NullPointerException at org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.getDispatchedEvents(QueueSynchronizationProcessor.java:160) at org.apache.geode.internal.cache.ha.QueueSynchronizationProcessor$QueueSynchronizationMessage.process(QueueSynchronizationProcessor.java:127) at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444) at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:391) at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) at java.lang.Thread.run(Thread.java:748) {noformat} This could occur when a client is not able to re-auth in time, and the server removes the HARegionQueue to the client. When the client is able to re-auth later, the new HARegionQueue is created again. 
There is a race in which, when the server processes the above message, the HARegionQueue is not fully initialized, causing this NPE.
[jira] [Updated] (GEODE-9942) Stress test tasks neglect non-public JUnit 5 test classes
[ https://issues.apache.org/jira/browse/GEODE-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dale Emery updated GEODE-9942: -- Labels: GeodeOperationAPI pull-request-available (was: pull-request-available) > Stress test tasks neglect non-public JUnit 5 test classes > - > > Key: GEODE-9942 > URL: https://issues.apache.org/jira/browse/GEODE-9942 > Project: Geode > Issue Type: Test > Components: tests >Affects Versions: 1.15.0 >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > JUnit 5 test classes need not be public. Indeed, IntelliJ's default > inspections discourage making JUnit 5 classes public. > {{StressTestHelper}} uses a {{ClassGraph}} to gather information about test > classes. By default, {{ClassGraph}} scans only public classes. So by default, > {{ClassGraph}} does not gather information about JUnit 5 classes with > non-public visibility. As a result, our stress test scripts do not run JUnit > 5 tests. > Solution: Call {{ignoreClassVisibility()}} to configure {{ClassGraph}} to > scan all classes, not just public ones.
[jira] [Updated] (GEODE-9704) When a durable client recovers, it sends the "ready for event" signal before registering for interest; this might cause problems for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9704: -- Labels: GeodeOperationAPI blocks-1.15.1 pull-request-available (was: GeodeOperationAPI blocks-1.15.1) > When a durable client recovers, it sends the "ready for event" signal before > registering for interest; this might cause problems for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available > > This is the old Geode behavior, but may or may not be the correct behavior. > When a durable client recovers, a queueTimer thread runs the > `QueueManagerImp.recoverPrimary` method, which: > - makes a new connection to the server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears the region of keys of interest; if the > server sends events for some of those keys in between, the clear will remove them, so > it appears to the user that the client region doesn't have those keys. > > Run the geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, and you will see the test fail > occasionally. Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` makes the test fail more > consistently.
[jira] [Updated] (GEODE-9943) ReplicatedIndexedQueryBenchmark performance degradation
[ https://issues.apache.org/jira/browse/GEODE-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bala Tripura Sundari Kaza Venkata updated GEODE-9943: - Description: There is performance degradation on the ReplicatedIndexedQuery benchmark. Below is the output:
{code:java}
org.apache.geode.benchmark.tests.ReplicatedIndexedQueryBenchmark
 average ops/second            Baseline: 31678.93      Test: 28420.90      Difference: -10.3%
 ops/second standard error     Baseline: 48.69         Test: 50.69         Difference: +4.1%
 ops/second standard deviation Baseline: 840.57        Test: 874.99        Difference: +4.1%
 YS 99th percentile latency    Baseline: 20095.61      Test: 20094.75      Difference: -0.0%
 median latency                Baseline: 7467007.00    Test: 7217151.00    Difference: -3.3%
 90th percentile latency       Baseline: 54657023.00   Test: 81788927.00   Difference: +49.6%
 99th percentile latency       Baseline: 111345663.00  Test: 134217727.00  Difference: +20.5%
 99.9th percentile latency     Baseline: 226623487.00  Test: 202899455.00  Difference: -10.5%
 average latency               Baseline: 18182876.80   Test: 20330177.60   Difference: +11.8%
 latency standard deviation    Baseline: 28044862.49   Test: 33920338.87   Difference: +21.0%
 latency standard error        Baseline: 9109.64       Test: 11652.11      Difference: +27.9%
 average ops/second            Baseline: 31594.76      Test: 28257.85      Difference: -10.6%
{code}
Failure seen in this CI run: https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/83 (was: the same description without the CI run link)
> ReplicatedIndexedQueryBenchmark performance degradation
> Key: GEODE-9943
> URL: https://issues.apache.org/jira/browse/GEODE-9943
> Project: Geode
> Issue Type: Bug
> Reporter: Bala Tripura Sundari Kaza Venkata
> Priority: Major
> Labels: needsTriage
[jira] [Commented] (GEODE-9943) ReplicatedIndexedQueryBenchmark performance degradation
[ https://issues.apache.org/jira/browse/GEODE-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473161#comment-17473161 ] Geode Integration commented on GEODE-9943: -- Seen in [benchmark-base #83|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/83]. > ReplicatedIndexedQueryBenchmark performance degradation > Key: GEODE-9943 > URL: https://issues.apache.org/jira/browse/GEODE-9943 > Project: Geode > Issue Type: Bug > Reporter: Bala Tripura Sundari Kaza Venkata > Priority: Major > Labels: needsTriage > (full benchmark output is quoted in the issue description above)
[jira] [Created] (GEODE-9943) ReplicatedIndexedQueryBenchmark performance degradation
Bala Tripura Sundari Kaza Venkata created GEODE-9943: Summary: ReplicatedIndexedQueryBenchmark performance degradation Key: GEODE-9943 URL: https://issues.apache.org/jira/browse/GEODE-9943 Project: Geode Issue Type: Bug Reporter: Bala Tripura Sundari Kaza Venkata There is performance degradation on the ReplicatedIndexedQuery benchmark. (The benchmark output is quoted in full in the description update above.)
[jira] [Updated] (GEODE-9943) ReplicatedIndexedQueryBenchmark performance degradation
[ https://issues.apache.org/jira/browse/GEODE-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Murmann updated GEODE-9943: - Labels: needsTriage (was: ) > ReplicatedIndexedQueryBenchmark performance degradation > Key: GEODE-9943 > URL: https://issues.apache.org/jira/browse/GEODE-9943 > Project: Geode > Issue Type: Bug > Reporter: Bala Tripura Sundari Kaza Venkata > Priority: Major > Labels: needsTriage > (full benchmark output is quoted in the issue description above)
[jira] [Commented] (GEODE-7860) Unmarshalling exception in Benchmark test with EOFException as cause
[ https://issues.apache.org/jira/browse/GEODE-7860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473146#comment-17473146 ] Geode Integration commented on GEODE-7860: -- Seen in [benchmark-base #84|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/84]. > Unmarshalling exception in Benchmark test with EOFException as cause > > > Key: GEODE-7860 > URL: https://issues.apache.org/jira/browse/GEODE-7860 > Project: Geode > Issue Type: Bug > Components: benchmarks, querying >Reporter: Donal Evans >Priority: Major > > {noformat} > org.apache.geode.benchmark.tests.ReplicatedIndexedQueryBenchmark > run() > FAILED > java.util.concurrent.CompletionException: java.lang.RuntimeException: > java.rmi.UnmarshalException: Error unmarshaling return header; nested > exception is: > java.io.EOFException > Caused by: > java.lang.RuntimeException: java.rmi.UnmarshalException: Error > unmarshaling return header; nested exception is: > java.io.EOFException > Caused by: > java.rmi.UnmarshalException: Error unmarshaling return header; > nested exception is: > java.io.EOFException > Caused by: > java.io.EOFException > 16 tests completed, 1 failed > org.apache.geode.benchmark.tests.PartitionedIndexedQueryBenchmark > run() > FAILED > java.util.concurrent.CompletionException: java.lang.RuntimeException: > java.rmi.UnmarshalException: Error unmarshaling return header; nested > exception is: > java.io.EOFException > Caused by: > java.lang.RuntimeException: java.rmi.UnmarshalException: Error > unmarshaling return header; nested exception is: > java.io.EOFException > Caused by: > java.rmi.UnmarshalException: Error unmarshaling return header; > nested exception is: > java.io.EOFException > Caused by: > java.io.EOFException > 16 tests completed, 1 failed > {noformat} > Failure seen in this run: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/Benchmark_base/builds/146 > but was passing 
in the following run. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9942) Stress test tasks neglect non-public JUnit 5 test classes
[ https://issues.apache.org/jira/browse/GEODE-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9942: -- Labels: pull-request-available (was: ) > Stress test tasks neglect non-public JUnit 5 test classes > - > > Key: GEODE-9942 > URL: https://issues.apache.org/jira/browse/GEODE-9942 > Project: Geode > Issue Type: Test > Components: tests >Affects Versions: 1.15.0 >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > Labels: pull-request-available > > JUnit 5 test classes need not be public. Indeed, IntelliJ's default > inspections discourage making JUnit 5 classes public. > {{StressTestHelper}} uses a {{ClassGraph}} to gather information about test > classes. By default, {{ClassGraph}} scans only public classes. So by default, > {{ClassGraph}} does not gather information about JUnit 5 classes with > non-public visibility. As a result, our stress test scripts do not run JUnit > 5 tests. > Solution: Call {{ignoreClassVisibility()}} to configure {{ClassGraph}} to > scan all classes, not just public ones. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9942) Stress test tasks neglect non-public JUnit 5 test classes
[ https://issues.apache.org/jira/browse/GEODE-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dale Emery reassigned GEODE-9942: - Assignee: Dale Emery > Stress test tasks neglect non-public JUnit 5 test classes > - > > Key: GEODE-9942 > URL: https://issues.apache.org/jira/browse/GEODE-9942 > Project: Geode > Issue Type: Test > Components: tests >Affects Versions: 1.15.0 >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > > JUnit 5 test classes need not be public. Indeed, IntelliJ's default > inspections discourage making JUnit 5 classes public. > {{StressTestHelper}} uses a {{ClassGraph}} to gather information about test > classes. By default, {{ClassGraph}} scans only public classes. So by default, > {{ClassGraph}} does not gather information about JUnit 5 classes with > non-public visibility. As a result, our stress test scripts do not run JUnit > 5 tests. > Solution: Call {{ignoreClassVisibility()}} to configure {{ClassGraph}} to > scan all classes, not just public ones. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9940) ci improvements
[ https://issues.apache.org/jira/browse/GEODE-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473133#comment-17473133 ] ASF subversion and git services commented on GEODE-9940: Commit 233533bde74aaf35ca6c0ac502f2214fb61f1d33 in geode's branch refs/heads/develop from Owen Nichols [ https://gitbox.apache.org/repos/asf?p=geode.git;h=233533b ] GEODE-9940: ci improvements (#7255) * pin the image too for all runs in the same mass test run * fix typo in time format from #7254 > ci improvements > --- > > Key: GEODE-9940 > URL: https://issues.apache.org/jira/browse/GEODE-9940 > Project: Geode > Issue Type: Improvement > Components: ci, release >Reporter: Owen Nichols >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > mass test report hangs, mass test may not trigger, RC may trigger > prematurely/twice -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-9942) Stress test tasks neglect non-public JUnit 5 test classes
Dale Emery created GEODE-9942: - Summary: Stress test tasks neglect non-public JUnit 5 test classes Key: GEODE-9942 URL: https://issues.apache.org/jira/browse/GEODE-9942 Project: Geode Issue Type: Test Components: tests Affects Versions: 1.15.0 Reporter: Dale Emery JUnit 5 test classes need not be public. Indeed, IntelliJ's default inspections discourage making JUnit 5 classes public. {{StressTestHelper}} uses a {{ClassGraph}} to gather information about test classes. By default, {{ClassGraph}} scans only public classes. So by default, {{ClassGraph}} does not gather information about JUnit 5 classes with non-public visibility. As a result, our stress test scripts do not run JUnit 5 tests. Solution: Call {{ignoreClassVisibility()}} to configure {{ClassGraph}} to scan all classes, not just public ones. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9636) CI failure: NoClassDefFoundError in lucene examples
[ https://issues.apache.org/jira/browse/GEODE-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473117#comment-17473117 ] ASF GitHub Bot commented on GEODE-9636: --- gesterzhou closed pull request #109: URL: https://github.com/apache/geode-examples/pull/109 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@geode.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > CI failure: NoClassDefFoundError in lucene examples > --- > > Key: GEODE-9636 > URL: https://issues.apache.org/jira/browse/GEODE-9636 > Project: Geode > Issue Type: Bug > Components: lucene >Reporter: Darrel Schneider >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > The lucene examples have started failing (3 runs in a row) with the following > exceptions: > org.apache.geode_examples.luceneSpatial.TrainStopSerializerTest > > serializerReturnsSingleDocument FAILED > java.lang.NoClassDefFoundError at TrainStopSerializerTest.java:30 > Caused by: java.lang.ClassNotFoundException at > TrainStopSerializerTest.java:30 > org.apache.geode_examples.luceneSpatial.SpatialHelperTest > > queryFindsADocumentThatWasAdded FAILED > java.lang.NoClassDefFoundError at SpatialHelperTest.java:45 > The first failed run was: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-examples/jobs/test-examples/builds/243 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9636) CI failure: NoClassDefFoundError in lucene examples
[ https://issues.apache.org/jira/browse/GEODE-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473116#comment-17473116 ] ASF GitHub Bot commented on GEODE-9636: --- gesterzhou commented on pull request #109: URL: https://github.com/apache/geode-examples/pull/109#issuecomment-1010297465 Not to implement until another fix is merged. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@geode.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > CI failure: NoClassDefFoundError in lucene examples > --- > > Key: GEODE-9636 > URL: https://issues.apache.org/jira/browse/GEODE-9636 > Project: Geode > Issue Type: Bug > Components: lucene >Reporter: Darrel Schneider >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > The lucene examples have started failing (3 runs in a row) with the following > exceptions: > org.apache.geode_examples.luceneSpatial.TrainStopSerializerTest > > serializerReturnsSingleDocument FAILED > java.lang.NoClassDefFoundError at TrainStopSerializerTest.java:30 > Caused by: java.lang.ClassNotFoundException at > TrainStopSerializerTest.java:30 > org.apache.geode_examples.luceneSpatial.SpatialHelperTest > > queryFindsADocumentThatWasAdded FAILED > java.lang.NoClassDefFoundError at SpatialHelperTest.java:45 > The first failed run was: > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-examples/jobs/test-examples/builds/243 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9704) When a durable client recovers, it sends the "ready for events" signal before registering interest, which can cause problems for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9704: --- Description: This is the old Geode behavior, but may or may not be the correct behavior. When a durable client recovers, a queueTimer thread runs the `QueueManagerImpl.recoverPrimary` method, which:
* makes a new connection to the server
- sends readyForEvents (which causes the server to start sending the queued events)
- recovers interest
- clears the region of the keys of interest
- re-registers interest
It sends readyForEvents before it clears the region of the keys of interest. If the server sends events for those keys in between, the clear removes them, so to the user the client region appears to be missing those keys.
To reproduce: run the geode-core distributedTest AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient() with the InterestResultPolicy changed to NONE; the test fails occasionally. Adding sleep code in QueueManagerImpl.recoverPrimary between `createNewPrimary` and `recoverInterest` makes the test fail more consistently. (was: the same description, referring to the test as registeredInterest_slowReAuth_policyKey_durableClient())
> When a durable client recovers, it sends the "ready for events" signal before registering interest, which can cause problems for caching_proxy regions
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
> Issue Type: Bug
> Components: regions
> Affects Versions: 1.15.0
> Reporter: Jinmei Liao
> Assignee: Mark Hanson
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.15.1
[jira] [Assigned] (GEODE-9704) When a durable client recovers, it sends the "ready for events" signal before registering interest, which can cause problems for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9704: -- Assignee: Mark Hanson (was: Kirk Lund) > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1 > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Closed] (GEODE-9924) Make repeat tests log each test class instance separately
[ https://issues.apache.org/jira/browse/GEODE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dale Emery closed GEODE-9924. - > Make repeat tests log each test class instance separately > - > > Key: GEODE-9924 > URL: https://issues.apache.org/jira/browse/GEODE-9924 > Project: Geode > Issue Type: Test > Components: tests >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > Currently, our repeat test tasks merge the output from all executions of a > given test class, making it very difficult to diagnose failures in repeat > tests. > CAUSE: > In order to run tests repeatedly, our repeat test tasks override Gradle code > to allow a test class to execute more than once. > Gradle directs the output from each test to a log associated with the test > class name. > SOLUTION: > Change Gradle to distinguish separate executions of a test class, and to log > the output from each execution separately. This can be done by using a custom > "test result processor" that appends an iteration counter to the end of the > test class name before processing the result. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (GEODE-9924) Make repeat tests log each test class instance separately
[ https://issues.apache.org/jira/browse/GEODE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dale Emery resolved GEODE-9924. --- Resolution: Fixed > Make repeat tests log each test class instance separately > - > > Key: GEODE-9924 > URL: https://issues.apache.org/jira/browse/GEODE-9924 > Project: Geode > Issue Type: Test > Components: tests >Reporter: Dale Emery >Assignee: Dale Emery >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > > Currently, our repeat test tasks merge the output from all executions of a > given test class, making it very difficult to diagnose failures in repeat > tests. > CAUSE: > In order to run tests repeatedly, our repeat test tasks override Gradle code > to allow a test class to execute more than once. > Gradle directs the output from each test to a log associated with the test > class name. > SOLUTION: > Change Gradle to distinguish separate executions of a test class, and to log > the output from each execution separately. This can be done by using a custom > "test result processor" that appends an iteration counter to the end of the > test class name before processing the result. -- This message was sent by Atlassian Jira (v8.20.1#820001)
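The renaming trick GEODE-9924 describes, appending an iteration counter to the test class name so repeat executions get separate logs, can be sketched in plain Java. This is a hypothetical stand-alone illustration; Gradle's real test result processor API has a different shape:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the iteration-counter rename: each time the same
// test class starts, produce a distinct display name so per-class output
// is not merged across repeat executions.
public class RepeatTestNamer {
    private final Map<String, Integer> iterations = new HashMap<>();

    // Returns e.g. "FooTest iteration 1", then "FooTest iteration 2", ...
    String displayNameFor(String testClassName) {
        int n = iterations.merge(testClassName, 1, Integer::sum);
        return testClassName + " iteration " + n;
    }

    public static void main(String[] args) {
        RepeatTestNamer namer = new RepeatTestNamer();
        System.out.println(namer.displayNameFor("FooTest")); // FooTest iteration 1
        System.out.println(namer.displayNameFor("FooTest")); // FooTest iteration 2
        System.out.println(namer.displayNameFor("BarTest")); // BarTest iteration 1
    }
}
```

Applying a rename like this before results are processed gives each execution its own log, which is the behavior the ticket's solution describes.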
[jira] [Updated] (GEODE-9818) CI failure: RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed with RMIException
[ https://issues.apache.org/jira/browse/GEODE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Baker updated GEODE-9818: - Labels: (was: needsTriage) > CI failure: > RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails failed > with RMIException > --- > > Key: GEODE-9818 > URL: https://issues.apache.org/jira/browse/GEODE-9818 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.13.5 >Reporter: Kamilla Aslami >Priority: Major > > {noformat} > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > > testRedundancySpecifiedNonPrimaryEPFails FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest$$Lambda$315/1371457741.run > in VM 2 running on Host 17763e768fb6 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.testRedundancySpecifiedNonPrimaryEPFails(RedundancyLevelPart1DUnitTest.java:258) > Caused by: > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest > that uses org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier > Expecting: > <0> > to be greater than: > <0> within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679) > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.verifyInterestRegistration(RedundancyLevelPart1DUnitTest.java:504) > Caused by: > java.lang.AssertionError: > Expecting: > <0> > to be greater than: > <0> > at > org.apache.geode.internal.cache.tier.sockets.RedundancyLevelPart1DUnitTest.lambda$verifyInterestRegistration$19(RedundancyLevelPart1DUnitTest.java:505) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9919) CI Failure: RegionReliabilityDistNoAckDUnitTest > testLimitedAccess
[ https://issues.apache.org/jira/browse/GEODE-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Baker updated GEODE-9919: - Labels: CI (was: CI needsTriage) > CI Failure: RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > --- > > Key: GEODE-9919 > URL: https://issues.apache.org/jira/browse/GEODE-9919 > Project: Geode > Issue Type: Bug > Components: membership, regions, tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Priority: Major > Labels: CI > > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with > suspicious string with member not responding to heartbeats. > {code:java} > org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > > testLimitedAccess FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on > Host 07d663f91562 with 4 VMs > Caused by: > org.apache.geode.distributed.DistributedSystemDisconnectedException: > This connection to a distributed system has been disconnected., caused by > org.apache.geode.ForcedDisconnectException: Member isn't responding to > heartbeat requests > Caused by: > org.apache.geode.ForcedDisconnectException: Member isn't > responding to heartbeat requests > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. 
> --- > Found suspect string in log4j at line 1125 > [fatal 2022/01/04 01:04:33.305 GMT > tid=100] Membership service failure: Member isn't responding to heartbeat > requests > > org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: > Member isn't responding to heartbeat requests > at > org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264) > at org.jgroups.JChannel.invokeCallback(JChannel.java:816) > at org.jgroups.JChannel.up(JChannel.java:741) > at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030) > at org.jgroups.protocols.FRAG2.up(FRAG2.java:165) > at org.jgroups.protocols.FlowControl.up(FlowControl.java:390) > at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077) > at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792) > at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433) > at > org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72) > at > org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70) > at org.jgroups.protocols.TP.passMessageUp(TP.java:1658) > at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876) > at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10) > at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789) > at org.jgroups.protocols.TP.receive(TP.java:1714) > at > 
org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:160) > at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701) > at java.lang.Thread.run(Thread.java:748) > --- > Found suspect string in log4j at line 1191 > [error 2022/01/04 01:04:34.715 GMT > tid=33] Cache initialization for GemFireCache[id = 1852143676; isClosing = > false; isShutDownAll = false; created = Tue Jan 04 01:04:20 GMT 2022; server > = false; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed > because: org.apache.geode.distributed.DistributedSystemDisconnectedException: > This connection to a distributed system has been disconnected., caused by > org.apache.geode.ForcedDisconnectException: Member isn't responding to > heartbeat requests > --- > Found suspect string in log4j at line 1195 > [error 2022/01/04 01:04:34.739 GMT > tid=33]
[jira] [Updated] (GEODE-9744) bug like CVE-2020-8908
[ https://issues.apache.org/jira/browse/GEODE-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-9744: -- Labels: needsTriage pull-request-available (was: needsTriage)
> bug like CVE-2020-8908
> --
> Key: GEODE-9744
> URL: https://issues.apache.org/jira/browse/GEODE-9744
> Project: Geode
> Issue Type: Bug
> Reporter: lujie
> Priority: Major
> Labels: needsTriage, pull-request-available
>
> See [https://www.cvedetails.com/cve/CVE-2020-8908/]
> A temp directory creation vulnerability exists in all versions of Guava, allowing an attacker with access to the machine to potentially access data in a temporary directory created by the Guava API com.google.common.io.Files.createTempDir(). By default, on unix-like systems, the created directory is world-readable (readable by an attacker with access to the system). The method in question has been marked @deprecated in versions 30.0 and later and should not be used. For Android developers, we recommend choosing a temporary directory API provided by Android, such as context.getCacheDir(). For other Java developers, we recommend migrating to the Java 7 API java.nio.file.Files.createTempDirectory() which explicitly configures permissions of 700, or configuring the Java runtime's java.io.tmpdir system property to point to a location whose permissions are appropriately configured.
--
This message was sent by Atlassian Jira (v8.20.1#820001)
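The migration the advisory recommends can be sketched in a few lines (an illustrative sketch; the class name and prefix are mine): java.nio.file.Files.createTempDirectory() creates the directory with owner-only (700) permissions on POSIX systems, unlike Guava's deprecated createTempDir().

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

public class TempDirMigration {
    public static void main(String[] args) throws IOException {
        // Instead of the deprecated com.google.common.io.Files.createTempDir(),
        // use the JDK API, which applies owner-only permissions on POSIX systems.
        Path tempDir = Files.createTempDirectory("geode-");
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(tempDir);
        // On POSIX the set should contain only OWNER_* permissions (mode 700).
        System.out.println(perms);
        Files.delete(tempDir);
    }
}
```

Checking the returned permission set is a quick way to verify that group and world bits are absent, which is exactly the exposure the CVE describes.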
[jira] [Updated] (GEODE-9860) NativeRedisRenameRedirectionsDUnitTest. initializationError
[ https://issues.apache.org/jira/browse/GEODE-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe updated GEODE-9860: -- Labels: (was: needsTriage) > NativeRedisRenameRedirectionsDUnitTest. initializationError > --- > > Key: GEODE-9860 > URL: https://issues.apache.org/jira/browse/GEODE-9860 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Jens Deppe >Priority: Major > > {noformat} > NativeRedisRenameRedirectionsDUnitTest > initializationError FAILED > java.lang.RuntimeException: java.lang.NullPointerException > at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:68) > at > org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60) > at > org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:53) > at > org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:285) > at > java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1597) > at > org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:265) > at > org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:179) > at > org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:84) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > at > 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at
[jira] [Assigned] (GEODE-9860) NativeRedisRenameRedirectionsDUnitTest. initializationError
[ https://issues.apache.org/jira/browse/GEODE-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe reassigned GEODE-9860: - Assignee: Jens Deppe > NativeRedisRenameRedirectionsDUnitTest. initializationError > --- > > Key: GEODE-9860 > URL: https://issues.apache.org/jira/browse/GEODE-9860 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Jens Deppe >Priority: Major > Labels: needsTriage > > {noformat} > NativeRedisRenameRedirectionsDUnitTest > initializationError FAILED > java.lang.RuntimeException: java.lang.NullPointerException > at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:68) > at > org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60) > at > org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:53) > at > org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:285) > at > java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1597) > at > org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:265) > at > org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:179) > at > org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:84) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > at > 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
[jira] [Resolved] (GEODE-9735) Avoid wan-copy region command copying entries updated after it started
[ https://issues.apache.org/jira/browse/GEODE-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alberto Gomez resolved GEODE-9735. -- Fix Version/s: 1.15.0 Resolution: Fixed
> Avoid wan-copy region command copying entries updated after it started
> --
> Key: GEODE-9735
> URL: https://issues.apache.org/jira/browse/GEODE-9735
> Project: Geode
> Issue Type: Improvement
> Components: wan
> Reporter: Alberto Gomez
> Assignee: Alberto Gomez
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
> The wan-copy region command must not copy entries that were created or updated after the command started copying entries.
> There are two reasons for this:
> * Efficiency: entries modified after the command has started will be replicated by the gateway sender anyway, so copying them with the command wastes processing resources and delivers duplicate events to the remote site.
> * Problematic reordering of events on the receiving side: if an entry is modified twice within the same millisecond on the source site and the wan-copy region command copies that entry, the command may read the first version of the entry and send it to the remote site. The gateway sender will also send two events to the remote site, one with the first version of the entry and one with the second. If the wan-copy event carrying the first version arrives at the remote site after the second event sent by the gateway sender, it overwrites the second version, causing an inconsistency between the two sites. The root cause is that event timestamps have millisecond granularity, so the conflict resolver on the receiving side cannot detect that the event sent by the command precedes the one received from the gateway sender.
> If entries updated while the command is running are not copied by the command, this problem is avoided.
--
This message was sent by Atlassian Jira (v8.20.1#820001)
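The millisecond-granularity problem can be illustrated with a small sketch (this is not Geode's conflict-resolver API; the last-write-wins rule and all names here are assumptions for illustration): two updates made within the same millisecond carry equal timestamps, so a timestamp-based resolver has no way to tell which one is older.

```java
// Illustrative sketch, not Geode code: a hypothetical last-write-wins
// resolver keyed on millisecond timestamps.
public class MillisecondConflictSketch {
    static final class Event {
        final String value;
        final long timestampMillis; // millisecond granularity, as in the issue
        Event(String value, long timestampMillis) {
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    // Apply an incoming event only if it is strictly newer than what we have.
    static boolean isNewer(Event incoming, Event applied) {
        return incoming.timestampMillis > applied.timestampMillis;
    }

    public static void main(String[] args) {
        long sameMilli = 1_641_254_673_000L;
        Event first = new Event("v1", sameMilli);  // version read by the wan-copy command
        Event second = new Event("v2", sameMilli); // later update, same millisecond
        // Neither event is "newer" than the other: a stale v1 arriving after
        // v2 is indistinguishable from a tie, so ordering information is lost.
        System.out.println(isNewer(first, second));  // false
        System.out.println(isNewer(second, first));  // false
    }
}
```

Because both comparisons return false, any tie-breaking policy (accept or reject) can pick the wrong version; excluding entries updated after the command starts sidesteps the tie entirely.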
[jira] [Resolved] (GEODE-9881) Fully recovered Oplogs object indicating unrecoveredRegionCount>0 preventing compaction
[ https://issues.apache.org/jira/browse/GEODE-9881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakov Varenina resolved GEODE-9881. --- Fix Version/s: 1.15.0 Resolution: Fixed
> Fully recovered Oplogs object indicating unrecoveredRegionCount>0 preventing compaction
> --
> Key: GEODE-9881
> URL: https://issues.apache.org/jira/browse/GEODE-9881
> Project: Geode
> Issue Type: Bug
> Components: persistence
> Reporter: Jakov Varenina
> Assignee: Jakov Varenina
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
> We have found a problem in the case where a region is closed with Region.close() and then recreated to start recovery. Inspecting this code in the close() function shows that it does not make sense:
> {code:java}
> void close(DiskRegion dr) {
>   // while a krf is being created can not close a region
>   lockCompactor();
>   try {
>     if (!isDrfOnly()) {
>       DiskRegionInfo dri = getDRI(dr);
>       if (dri != null) {
>         long clearCount = dri.clear(null);
>         if (clearCount != 0) {
>           totalLiveCount.addAndGet(-clearCount);
>           // no need to call handleNoLiveValues because we now have an
>           // unrecovered region.
>         }
>         regionMap.get().remove(dr.getId(), dri);
>       }
>       addUnrecoveredRegion(dr.getId());
>     }
>   } finally {
>     unlockCompactor();
>   }
> }
> {code}
> Note that addUnrecoveredRegion() marks the DiskRegionInfo object as unrecovered and increments the unrecoveredRegionCount counter. That DiskRegionInfo object lives in the regionMap structure. Immediately afterwards, however, the same DiskRegionInfo object that was just marked as unrecovered is removed from the regionMap. This makes no sense: the object is updated and then removed from the map to be garbage collected. As described below, this causes issues when the region is recovered.
> Now consider this code at recovery:
> {code:java}
> /**
>  * For each dri that this oplog has that is currently unrecoverable check to see if a DiskRegion
>  * that is recoverable now exists.
>  */
> void checkForRecoverableRegion(DiskRegionView dr) {
>   if (unrecoveredRegionCount.get() > 0) {
>     DiskRegionInfo dri = getDRI(dr);
>     if (dri != null) {
>       if (dri.testAndSetRecovered(dr)) {
>         unrecoveredRegionCount.decrementAndGet();
>       }
>     }
>   }
> }
> {code}
> The problem is that Geode never clears the unrecoveredRegionCount counter in Oplog objects after recovery completes. checkForRecoverableRegion checks the counter and calls testAndSetRecovered, but testAndSetRecovered always returns false because none of the DiskRegionInfo objects in the region map have the unrecovered flag set to true (all objects marked as unrecovered were deleted by close() and then recreated during recovery; see the note below). As a result, all Oplogs end up fully recovered while the counter incorrectly indicates unrecoveredRegionCount>0. This later prevents compaction of the recovered Oplogs (the files with .crf, .drf and .krf extensions) when they reach the compaction threshold.
> Note: during recovery the regionMap is recreated from the Oplog files. Since all DiskRegionInfo objects were deleted from the regionMap during close(), they are recreated via initRecoveredEntry during recovery, and every DiskRegionInfo is created with the unrecovered flag set to false.
--
This message was sent by Atlassian Jira (v8.20.1#820001)
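The counter leak described in the issue can be reproduced in miniature (an illustrative model with made-up names, not Geode's actual Oplog/DiskRegionInfo classes): close() increments the counter but discards the flagged entry, so recovery recreates an unflagged entry and the decrement never fires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class UnrecoveredCounterSketch {
    // id -> "unrecovered" flag; stands in for regionMap's DiskRegionInfo entries
    static final Map<Long, Boolean> regionMap = new HashMap<>();
    static final AtomicInteger unrecoveredRegionCount = new AtomicInteger();

    static void close(long id) {
        unrecoveredRegionCount.incrementAndGet(); // addUnrecoveredRegion(...)
        regionMap.remove(id);                     // the flagged entry is discarded
    }

    static void recover(long id) {
        regionMap.put(id, false); // initRecoveredEntry: fresh entry, unrecovered == false
        // checkForRecoverableRegion: only decrements when the existing entry
        // was flagged unrecovered -- which a freshly recreated entry never is.
        if (unrecoveredRegionCount.get() > 0 && Boolean.TRUE.equals(regionMap.get(id))) {
            unrecoveredRegionCount.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        regionMap.put(1L, false);
        close(1L);
        recover(1L);
        // Recovery finished, yet the counter still reports an unrecovered
        // region -- the state that blocks Oplog compaction in the issue.
        System.out.println(unrecoveredRegionCount.get()); // 1
    }
}
```

The fix direction implied by the issue is to either keep the flagged entry in the map until recovery consumes it, or reset the counter once recovery completes.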
[jira] [Commented] (GEODE-9881) Fully recovered Oplogs object indicating unrecoveredRegionCount>0 preventing compaction
[ https://issues.apache.org/jira/browse/GEODE-9881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472666#comment-17472666 ] ASF subversion and git services commented on GEODE-9881: Commit c0fbe309ded8e1b53b048ff80a1892eb6a1285ff in geode's branch refs/heads/develop from Jakov Varenina [ https://gitbox.apache.org/repos/asf?p=geode.git;h=c0fbe30 ] GEODE-9881: Oplog not compacted after recovery (#7193) * GEODE-9881: Oplog not compacted after recovery
> Fully recovered Oplogs object indicating unrecoveredRegionCount>0 preventing compaction
> --
> Key: GEODE-9881
> URL: https://issues.apache.org/jira/browse/GEODE-9881
> Project: Geode
> Issue Type: Bug
> Components: persistence
> Reporter: Jakov Varenina
> Assignee: Jakov Varenina
> Priority: Major
> Labels: pull-request-available
--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9941) Coredump during PdxSerializable object deserialization
[ https://issues.apache.org/jira/browse/GEODE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mario Salazar de Torres reassigned GEODE-9941: -- Assignee: Mario Salazar de Torres
> Coredump during PdxSerializable object deserialization
> --
> Key: GEODE-9941
> URL: https://issues.apache.org/jira/browse/GEODE-9941
> Project: Geode
> Issue Type: Bug
> Components: native client
> Reporter: Mario Salazar de Torres
> Assignee: Mario Salazar de Torres
> Priority: Major
>
> *GIVEN* a cluster with a single server and a single locator, with a PdxSerializable-like class implementation named Order
> *AND* a geode-native client with a matching PdxSerializable class implementation named Order
> *AND* on-client-disconnect-clear-pdxType-Ids=true in the client configuration
> *WHEN* deserialization of an Order object is attempted
> *WHILE* the cluster is being restarted
> *THEN* a coredump occurs because PdxType=nullptr
> —
> {*}Additional information{*}: as seen in early troubleshooting, the coredump happens because the PdxType is fetched from the PdxTypeRegistry by its class name, but the PdxTypeRegistry has been cleaned up during serialization because subscription redundancy was lost.
--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9941) Coredump during PdxSerializable object deserialization
[ https://issues.apache.org/jira/browse/GEODE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mario Salazar de Torres updated GEODE-9941: --- Description:
*GIVEN* a cluster with a single server and a single locator, with a PdxSerializable-like class implementation named Order
*AND* a geode-native client with a matching PdxSerializable class implementation named Order
*AND* on-client-disconnect-clear-pdxType-Ids=true in the client configuration
*WHEN* deserialization of an Order object is attempted
*WHILE* the cluster is being restarted
*THEN* a coredump occurs because PdxType=nullptr
—
{*}Additional information{*}: as seen in early troubleshooting, the coredump happens because the PdxType is fetched from the PdxTypeRegistry by its class name, but the PdxTypeRegistry has been cleaned up during serialization because subscription redundancy was lost.
was:
*GIVEN* a cluster with a single server and a single locator, with a PdxSerializable-like class implementation named Order
*AND* a geode-native client with a matching PdxSerializable class implementation named Order
*WHEN* deserialization of an Order object is attempted
*WHILE* the cluster is being restarted
*THEN* a coredump occurs because PdxType=nullptr
> Coredump during PdxSerializable object deserialization
> --
> Key: GEODE-9941
> URL: https://issues.apache.org/jira/browse/GEODE-9941
> Project: Geode
> Issue Type: Bug
> Components: native client
> Reporter: Mario Salazar de Torres
> Priority: Major
--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-9941) Coredump during PdxSerializable object deserialization
Mario Salazar de Torres created GEODE-9941: -- Summary: Coredump during PdxSerializable object deserialization Key: GEODE-9941 URL: https://issues.apache.org/jira/browse/GEODE-9941 Project: Geode Issue Type: Bug Components: native client Reporter: Mario Salazar de Torres
*GIVEN* a cluster with a single server and a single locator, with a PdxSerializable-like class implementation named Order
*AND* a geode-native client with a matching PdxSerializable class implementation named Order
*WHEN* deserialization of an Order object is attempted
*WHILE* the cluster is being restarted
*THEN* a coredump occurs because PdxType=nullptr
--
This message was sent by Atlassian Jira (v8.20.1#820001)