[jira] [Commented] (GEODE-7921) NullPointerExceptions logged during auto-reconnect

ASF subversion and git services (Jira) Wed, 01 Apr 2020 15:29:40 -0700


    [ https://issues.apache.org/jira/browse/GEODE-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073220#comment-17073220 ]


ASF subversion and git services commented on GEODE-7921:
--------------------------------------------------------

Commit 0400983ae2dd81898c2a416d41d57e89148ad8e2 in geode's branch 
refs/heads/feature/GEODE-7921 from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0400983 ]

GEODE-7921: NullPointerExceptions logged during auto-reconnect

Don't deliver cache-level messages that were queued while disconnected
for auto-reconnect.  During auto-reconnect there is a QuorumChecker that
receives messages on the jgroups channel and tries to establish
communications with a quorum of the old membership view.  It may also
get JoinRequest messages and other membership-level messages, but I
observed one case where it also queued cache-level messages when the
property disable-tcp was set to true (funnelling all communication
through jgroups).
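
As a rough illustration of the idea above (not the actual Geode change),
the queue drained after a reconnect can be filtered so that only
membership-level messages are handed on.  Every name in this sketch
(QueuedMessageFilter, Level, QueuedMessage, deliverable) is hypothetical
and does not correspond to a Geode class or method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Geode classes.
class QueuedMessageFilter {

  // Assumed two-way classification of messages seen while disconnected.
  enum Level { MEMBERSHIP, CACHE }

  static final class QueuedMessage {
    final Level level;
    final String description;

    QueuedMessage(Level level, String description) {
      this.level = level;
      this.description = description;
    }
  }

  // Returns only the membership-level messages, preserving queue order.
  // Cache-level messages are dropped because no cache exists yet to
  // process them against.
  static List<QueuedMessage> deliverable(List<QueuedMessage> queued) {
    List<QueuedMessage> result = new ArrayList<>();
    for (QueuedMessage message : queued) {
      if (message.level == Level.MEMBERSHIP) {
        result.add(message);
      }
    }
    return result;
  }
}
```

Filtering when the queue is drained, rather than when messages arrive, is
only one possible design; the commit itself may do this differently.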

I also added some null checks to the LatestLastAccessTimeMessage and a
small test for that.
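
The kind of null check meant here can be sketched as follows.  This is
not the code from the commit: the cache is modelled as a plain nested
map, the class and method names are made up for illustration, and
returning 0 when something is missing is an assumption.

```java
import java.util.Map;

// Hypothetical sketch: a cache is modelled as region name -> (key -> time).
class NullSafeLastAccessLookup {

  // Returns the recorded last-access time, or 0 (an assumed default) when
  // the cache, the region, or the entry is missing.  Each lookup is checked
  // before it is dereferenced, so a message that arrives before the cache
  // has been rebuilt cannot trigger a NullPointerException.
  static long latestLastAccessTime(
      Map<String, Map<String, Long>> cache, String regionName, String key) {
    if (cache == null) {
      return 0L; // no cache yet, e.g. still reconnecting
    }
    Map<String, Long> region = cache.get(regionName);
    if (region == null) {
      return 0L; // region not (re)created yet
    }
    Long lastAccess = region.get(key);
    return lastAccess == null ? 0L : lastAccess;
  }
}
```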


> NullPointerExceptions logged during auto-reconnect
> --------------------------------------------------
>
>                 Key: GEODE-7921
>                 URL: https://issues.apache.org/jira/browse/GEODE-7921
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> In a test with disable-tcp=true I found a huge number of NPEs logged when a 
> server auto-reconnected.
> {noformat}
> [fatal 2020/03/25 15:26:57.136 PDT <Pooled Message Processor 2> tid=0x17f] Uncaught exception processing LatestLastAccessTimeMessage@546d3459 processorId=0 sender=rs-FullRegression25213428a3i32xlarge-hydra-client-22(gemfire3_host1_32677:32677)<ec><v5>:41009
> java.lang.NullPointerException
>         at org.apache.geode.internal.cache.LatestLastAccessTimeMessage.process(LatestLastAccessTimeMessage.java:66)
>         at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
>         at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442)
>         at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:389)
>         at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> These came from a queue of messages built up by the QuorumChecker that were 
> processed before a new cache was built.  The QuorumChecker shouldn't be 
> queueing cache operations.  It only needs to queue membership messages.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
