[ https://issues.apache.org/jira/browse/GEODE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kamilla Aslami resolved GEODE-9350.
-----------------------------------
    Fix Version/s: 1.15.0
                   1.14.0
       Resolution: Fixed

> MemberJoinedEvent should be triggered after new view is installed
> -----------------------------------------------------------------
>
>                 Key: GEODE-9350
>                 URL: https://issues.apache.org/jira/browse/GEODE-9350
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Kamilla Aslami
>            Assignee: Kamilla Aslami
>            Priority: Major
>              Labels: pull-request-available, release-blocker
>             Fix For: 1.14.0, 1.15.0
>
>
> While investigating GEODE-9070, we noticed a problem where a server tries to 
> join a cluster and, soon after, membership fails with a ShunnedMemberException:
> {noformat}
> org.apache.geode.distributed.internal.direct.ShunnedMemberException: Member is being shunned: ccf730fb2b62(161)<v2>:41002
>  at org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:469)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:283)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:190)
>  at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:550)
>  at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:354)
>  at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:296)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2068)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1983)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2028)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1085)
>  at org.apache.geode.internal.cache.execute.StreamingFunctionOperation.getFunctionResultFrom(StreamingFunctionOperation.java:113)
>  at org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:149)
>  at org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:191)
>  at org.apache.geode.internal.cache.execute.AbstractExecution.execute(AbstractExecution.java:397)
>  at org.apache.geode.internal.cache.execute.AbstractExecution.execute(AbstractExecution.java:402)
>  at org.apache.geode.modules.util.BootstrappingFunction.bootstrapMember(BootstrappingFunction.java:170)
>  at org.apache.geode.modules.util.BootstrappingFunction.memberJoined(BootstrappingFunction.java:240)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberJoinedEvent.handleEvent(ClusterDistributionManager.java:2498)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2451)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2440)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.handleMemberEvent(ClusterDistributionManager.java:1406)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager.access$200(ClusterDistributionManager.java:109)
>  at org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEventInvoker.run(ClusterDistributionManager.java:1438)
>  at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> Further analysis showed that ShunnedMemberException is thrown because 
> GMSMembership.memberExists() returns false, which means that the member 
> ccf730fb2b62(161)<v2>:41002 was not in the view. The stack trace shows that 
> BootstrappingFunction.bootstrapMember() is executed on MemberJoinedEvent, 
> which is triggered by MembershipListener.newMemberConnected(). 
> GMSMembership.processView() calls newMemberConnected() before the new view 
> is installed, so the failure likely happens because BootstrappingFunction 
> receives the event before the view has actually been updated. A possible 
> solution is to change GMSMembership.processView() to call 
> MembershipListener.newMemberConnected() only after the new view is installed.
> This issue was introduced by the fix for GEODE-7245, which removed the 
> latestView lock from GMSMembership.memberExists(). Before GEODE-7245, this 
> method waited until GMSMembership.processView() released the lock, so the 
> problem described above could not happen. GEODE-7245 was back-ported to 1.14.
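
The ordering problem described above can be illustrated with a minimal, single-threaded sketch. This is not Geode's code: ViewOrderingSketch, its fields, and its methods are hypothetical stand-ins for GMSMembership.processView(), GMSMembership.memberExists(), and the MembershipListener callback. In Geode the event is handled on a separate thread, so the real failure is a race; here the listener is called synchronously so that the effect is deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical, simplified model of a membership view; not Geode's actual classes.
final class ViewOrderingSketch {
  // Members of the currently installed view.
  private volatile Set<String> installedView = new HashSet<>();
  // Callback invoked once for every member that joined in a new view.
  private final Consumer<String> joinListener;

  ViewOrderingSketch(Consumer<String> joinListener) {
    this.joinListener = joinListener;
  }

  // Lock-free lookup against the installed view (assumed analogue of memberExists()).
  boolean memberExists(String member) {
    return installedView.contains(member);
  }

  // Ordering reported as problematic: listeners are notified before the view is installed.
  void processViewNotifyFirst(Set<String> newView) {
    for (String member : joiners(newView)) {
      joinListener.accept(member);
    }
    installedView = new HashSet<>(newView);
  }

  // Ordering proposed in the issue: install the view, then notify listeners.
  void processViewInstallFirst(Set<String> newView) {
    List<String> joined = joiners(newView);
    installedView = new HashSet<>(newView);
    for (String member : joined) {
      joinListener.accept(member);
    }
  }

  // Members present in newView but absent from the installed view.
  private List<String> joiners(Set<String> newView) {
    List<String> joined = new ArrayList<>();
    for (String member : newView) {
      if (!installedView.contains(member)) {
        joined.add(member);
      }
    }
    return joined;
  }

  // Returns what memberExists() reports from inside the join callback.
  static boolean observedDuringEvent(boolean installFirst) {
    boolean[] seen = new boolean[1];
    ViewOrderingSketch[] holder = new ViewOrderingSketch[1];
    holder[0] = new ViewOrderingSketch(member -> seen[0] = holder[0].memberExists(member));
    Set<String> view = new HashSet<>();
    view.add("server-1");
    if (installFirst) {
      holder[0].processViewInstallFirst(view);
    } else {
      holder[0].processViewNotifyFirst(view);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.out.println("notify first:  " + observedDuringEvent(false));
    System.out.println("install first: " + observedDuringEvent(true));
  }
}
```

With the notify-first ordering the callback sees the joining member as absent from the view, which corresponds to the condition that leads to ShunnedMemberException in the report. With the install-first ordering the member is already visible when the callback runs.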



