[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054621#comment-15054621
 ] 

Powell Molleti commented on ZOOKEEPER-1045:
-------------------------------------------

All APIs are async and return immediately. The module tries to maintain the 
invariant that the current vote is sent to peers in all cases (adding peers, 
reconnecting peers, message transmit errors, etc.). It will come with unit 
tests to verify correctness. I will certainly write a design doc and publish 
it soon. The goal is to address ZOOKEEPER-901 and its related issues. Perhaps 
we should move this discussion there.

Two points of divergence from the current implementation w.r.t. FLE:
1. FLE no longer has to put the Vote into a per-peer queue via 
manager.toSend(); it only has to call broadcast(vote) whenever it likes. QCM 
will take care of sending the new Vote, as FLE asked, if it knows this is a 
new Vote for the peer(s) it is managing. There is no outgoing queue to each 
peer in QCM either; when it connects to a peer it just sends the current Vote 
it has.

2. Unlike now, FLE will call getVotesBlockingQueue() (instead of 
manager.pollRecvQueue()). Through it, FLE will get the current votes of all 
peers that QCM knows about now, plus, since that first call, any future vote 
received from a peer that differs from the last one for that peer.
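
To make the send-side change in point 1 concrete, here is a minimal sketch, 
not the actual patch. Only broadcast(vote) comes from the description above; 
VoteBroadcaster, PeerLink, onPeerConnected() and the use of String for a vote 
are hypothetical names chosen for illustration. The class is not thread safe 
and assumes all calls are made from one thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: only broadcast() is named in the proposal.
final class VoteBroadcaster {
    // Stand-in for a network channel to one peer; records what was "sent".
    static final class PeerLink {
        final List<String> sent = new ArrayList<>();
        String lastSent; // last vote delivered on this connection, null if none
    }

    private final Map<Long, PeerLink> peers = new HashMap<>();
    private String currentVote; // only the latest vote is kept, no per-peer queue

    // Called by the election logic whenever it has a vote to announce.
    void broadcast(String vote) {
        currentVote = vote;
        for (PeerLink link : peers.values()) {
            sendIfNew(link);
        }
    }

    // Called when a peer connects or reconnects: it gets the current vote.
    PeerLink onPeerConnected(long sid) {
        PeerLink link = peers.computeIfAbsent(sid, k -> new PeerLink());
        link.lastSent = null; // a fresh connection has seen nothing yet
        sendIfNew(link);
        return link;
    }

    // Send only when the current vote differs from what this peer last got.
    private void sendIfNew(PeerLink link) {
        if (currentVote != null && !Objects.equals(currentVote, link.lastSent)) {
            link.sent.add(currentVote);
            link.lastSent = currentVote;
        }
    }
}
```

In this sketch, repeating broadcast() with the same vote sends nothing new, 
while a reconnect always re-sends the current vote, which is the invariant 
described above.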

From my understanding this will eliminate unnecessary transitions of FLE. 
Currently, when it calls pollRecvQueue(), it has to digest all the messages 
received since the last round, because QCM's Rx/Tx is always alive until 
QuorumPeer shuts it down, which happens only when QuorumPeer itself is shut 
down. 

There is also an API call, getVotes(), which simply returns the current vote 
view that QCM knows, but to keep FLE simple and the same as the current 
implementation I will add getVotesBlockingQueue() to mimic pollRecvQueue().
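
The two receive-side calls could behave roughly as in the sketch below. This 
is an assumption about the semantics described above, not the real code: 
getVotes() and getVotesBlockingQueue() are the names from this comment, while 
VoteView, PeerVote, onVoteReceived() and String votes are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the receive side.
final class VoteView {
    static final class PeerVote {
        final long sid;
        final String vote;
        PeerVote(long sid, String vote) {
            this.sid = sid;
            this.vote = vote;
        }
    }

    private final Map<Long, String> latest = new HashMap<>();
    private BlockingQueue<PeerVote> queue; // created on the first queue request

    // Snapshot of the latest known vote per peer.
    synchronized Map<Long, String> getVotes() {
        return new HashMap<>(latest);
    }

    // First call seeds the queue with the current view; later changes follow.
    synchronized BlockingQueue<PeerVote> getVotesBlockingQueue() {
        if (queue == null) {
            queue = new LinkedBlockingQueue<>();
            for (Map.Entry<Long, String> e : latest.entrySet()) {
                queue.add(new PeerVote(e.getKey(), e.getValue()));
            }
        }
        return queue;
    }

    // Called for every vote read from the network (hypothetical hook).
    synchronized void onVoteReceived(long sid, String vote) {
        String previous = latest.put(sid, vote);
        // Enqueue only when the vote changed, so stale repeats are dropped.
        if (queue != null && !vote.equals(previous)) {
            queue.add(new PeerVote(sid, vote));
        }
    }
}
```

With this shape, votes that arrive before FLE asks for the queue collapse to 
one entry per peer, which is the reduction in digested messages the comment 
is aiming for.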

A few points regarding implementation:
1. QCM will use a single-thread executor. Hopefully this will address the 
concern of starting and stopping multiple threads for each peer.

2. This thread is shared between Netty, which uses it to handle all TCP 
channels, and QCM tasks that perform operations (such as new channel/write 
and queue management).

3. Netty handlers are written to be thread safe, so one could pass more 
threads, but I think one thread should be enough to handle a handful of 
QuorumPeers talking part time.

4. I will try to keep FLE changes to a minimum, only touching the interfaces 
to the manager.
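
The single-thread model in points 1 and 2 can be illustrated with plain 
java.util.concurrent, as a rough sketch under assumptions. 
SingleThreadManager and its methods are hypothetical, and a list of strings 
stands in for real channel operations. With Netty, the same effect would 
presumably come from submitting these tasks to the event loop that also 
serves the channels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: every operation runs on one shared thread.
final class SingleThreadManager {
    private final ExecutorService loop = Executors.newSingleThreadExecutor();
    // Touched only from the loop thread, so it needs no locking.
    private final List<String> log = new ArrayList<>();

    // Async: queues the work and returns immediately.
    void connect(long sid) {
        loop.execute(() -> log.add("connect " + sid));
    }

    // Async: queues the work and returns immediately.
    void write(long sid, String vote) {
        loop.execute(() -> log.add("write " + sid + " " + vote));
    }

    // Runs on the loop thread too, so it sees all earlier operations.
    List<String> snapshot() {
        Callable<List<String>> copy = () -> new ArrayList<>(log);
        try {
            return loop.submit(copy).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        loop.shutdown();
    }
}
```

Because a single-thread executor runs tasks sequentially in submission order, 
state such as the per-peer bookkeeping never needs its own lock, and no 
threads are started or stopped per peer.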

Please let me know if I assumed something wrong. All feedback/comments are 
welcome. 

> Quorum Peer mutual authentication
> ---------------------------------
>
>                 Key: ZOOKEEPER-1045
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: server
>            Reporter: Eugene Koontz
>            Assignee: Rakesh R
>         Attachments: ZOOKEEPER-1045-00.patch
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
