[ 
https://issues.apache.org/jira/browse/HELIX-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323117#comment-14323117
 ] 

Zhen Zhang commented on HELIX-548:
----------------------------------

There is a timing issue in ZkHelixParticipant#setupMsgHandler(). The ZK 
callback is hooked up (line 347 in 
https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/manager/zk/ZkHelixParticipant.java)
before all message handler registrations are done (line 354 in 
https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/manager/zk/ZkHelixParticipant.java),
so messages can arrive while some handler factories are still unregistered. 
The fix is to move adding the ZK callback to the end, after all registrations. 
I will add a test case that can reliably reproduce this issue.
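
To illustrate the ordering problem, here is a minimal, single-threaded sketch. 
It is not Helix code: Registry, MessageSource, TYPE_A and TYPE_B are 
hypothetical stand-ins, and the missing handler is reported with an 
IllegalStateException rather than the NullPointerException seen in 
HelixTaskExecutor.createMessageHandler. It only models the window between 
attaching the callback and finishing the registrations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration only; none of these are Helix classes.
final class HandlerOrderingSketch {

    /** Maps a message type to its handler, loosely analogous to a handler factory map. */
    static final class Registry {
        private final Map<String, Consumer<String>> handlers = new ConcurrentHashMap<>();

        void register(String type, Consumer<String> handler) {
            handlers.put(type, handler);
        }

        /** Dispatches one message; fails if no handler is registered for its type. */
        void dispatch(String type, String payload) {
            Consumer<String> handler = handlers.get(type);
            if (handler == null) {
                throw new IllegalStateException("no handler registered for " + type);
            }
            handler.accept(payload);
        }
    }

    /** Delivers all pending messages as soon as a callback is attached. */
    static final class MessageSource {
        private final List<String[]> pending = new ArrayList<>();

        void enqueue(String type, String payload) {
            pending.add(new String[] {type, payload});
        }

        void subscribe(Registry registry) {
            for (String[] message : pending) {
                registry.dispatch(message[0], message[1]);
            }
            pending.clear();
        }
    }

    /** Problematic ordering: the callback is attached before TYPE_B is registered. */
    static boolean setupCallbackFirst(MessageSource source, Registry registry, List<String> log) {
        registry.register("TYPE_A", log::add);
        try {
            source.subscribe(registry); // pending messages are delivered here
        } catch (IllegalStateException e) {
            return false; // a TYPE_B message arrived before its handler existed
        }
        registry.register("TYPE_B", log::add);
        return true;
    }

    /** Proposed ordering: register every handler first, attach the callback last. */
    static boolean setupCallbackLast(MessageSource source, Registry registry, List<String> log) {
        registry.register("TYPE_A", log::add);
        registry.register("TYPE_B", log::add);
        source.subscribe(registry);
        return true;
    }
}
```

With a TYPE_B message already pending, setupCallbackFirst fails at dispatch 
time, while setupCallbackLast delivers the message to its handler.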

> race condition in Helix message handler factory registration
> ------------------------------------------------------------
>
>                 Key: HELIX-548
>                 URL: https://issues.apache.org/jira/browse/HELIX-548
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Zhen Zhang
>
> -- copied from email --
> I have a small issue when running a Helix cluster in Docker (all on the same 
> machine, very small RTT):
> I register two message handler factories, similarly to the way it is done in 
> BootstrapProcess.java example. The manager is not yet connected. My 
> understanding is that this places the handlers in a hashmap from which they 
> will be read and registered later, upon connect.
> Upon connect, I see that the DefaultMessagingService.onConnected function is 
> called and Helix starts registering the message handlers. However, messages 
> start arriving already and HelixTaskExecutor.createMessageHandler is called, 
> before all handlers have been registered. The result is that I get 
> NullPointerExceptions for the missing handlers in the following line of 
> HelixTaskExecutor.createMessageHandler:
> MessageHandlerFactory handlerFactory = item.factory();
> due to the fact that there is no item (MessageHandlerFactoryRegistryItem) for 
> that message type registered yet.
> I came upon this bug after updating the git code from an old snapshot of 
> 0.7.1. I saw that the relevant code changes are from around the summer.
> Regards,
> Vlad



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
