[
https://issues.apache.org/jira/browse/ZOOKEEPER-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900748#comment-15900748
]
Powell Molleti commented on ZOOKEEPER-236:
------------------------------------------
Hi Abe,
{quote}
bq. if I understand correctly, both the operations of managing the certs
(add/remove of certs) and reconfig() API to change members of quorum have to
be fault-tolerant.
Would you mind clarifying what you mean by "fault-tolerant" here? Can you give
an example of how a fault would break my patch?
{quote}
Whether it is CA(s) with CRLs or a list of self-signed certs, my point is that
the way an admin manages this information should also be fault-tolerant. Not
only should it be fault-tolerant, it should also work easily with the most
probable next thing an admin would do, i.e. issue a reconfig() command to add,
remove, or modify quorum peer configuration.
It would be nice to provide a way to manage reconfiguration of quorum peers
when SSL is enabled under the same weak assumptions that reconfig() needs in
order to work when SSL is not enabled.
Providing a truststore and asking admins to manage it on their own for the
entire quorum means that this operation is not fault-tolerant, i.e. we are
expecting them to first bring all members of the quorum to a consistent SSL
config state and only then issue the reconfig() command.
The set of quorum IP addresses dictates which connections the current
configuration allows, and it has to be managed properly to ensure safety.
Extending this idea, the set of SSL certs (self-signed or CA-signed) also
dictates which connections are allowed. Hence, if one treats the
Pair<IP set, SSL set> as the config and provides that to the reconfig() API, it
should work. That is what my patch does for self-signed certs, and we
could/should provide similar functionality for the CA cert case.
Hence there is no new problem to solve here: we piggyback on the reconfig() API
and provide a single API to manage this, and we get, for free, the
fault-tolerance and safety that reconfig() already provides.
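To make the Pair<IP set, SSL set> idea concrete, here is a minimal Java sketch.
It is only an illustration under assumptions: the names QuorumSslConfigSketch,
QuorumConfig, applyReconfig and isConnectionAllowed are hypothetical and are
not ZooKeeper APIs or code from either patch, certs are represented as plain
fingerprint strings, and an in-memory swap stands in for a committed
reconfig().

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration only; not ZooKeeper's API or any patch's code. */
class QuorumSslConfigSketch {

    /** Immutable Pair<IP set, SSL set>: peer addresses plus trusted cert fingerprints. */
    static final class QuorumConfig {
        final Set<String> peerAddresses;
        final Set<String> certFingerprints;

        QuorumConfig(Set<String> peerAddresses, Set<String> certFingerprints) {
            // Defensive copies so a published config can never change underneath readers.
            this.peerAddresses = Collections.unmodifiableSet(new HashSet<>(peerAddresses));
            this.certFingerprints = Collections.unmodifiableSet(new HashSet<>(certFingerprints));
        }
    }

    private final AtomicReference<QuorumConfig> current;

    QuorumSslConfigSketch(QuorumConfig initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Stand-in for a committed reconfig: both sets are replaced in one step. */
    void applyReconfig(QuorumConfig next) {
        current.set(next);
    }

    /** A peer is accepted only if its address AND its cert are in the same config. */
    boolean isConnectionAllowed(String peerAddress, String certFingerprint) {
        QuorumConfig cfg = current.get(); // read one consistent snapshot
        return cfg.peerAddresses.contains(peerAddress)
                && cfg.certFingerprints.contains(certFingerprint);
    }

    public static void main(String[] args) {
        Set<String> ips = new HashSet<>();
        ips.add("10.0.0.1");
        Set<String> certs = new HashSet<>();
        certs.add("fp-a");
        QuorumSslConfigSketch quorum =
                new QuorumSslConfigSketch(new QuorumConfig(ips, certs));

        // Adding a peer: its address and its cert arrive together in one config.
        ips.add("10.0.0.2");
        certs.add("fp-b");
        quorum.applyReconfig(new QuorumConfig(ips, certs));

        System.out.println(quorum.isConnectionAllowed("10.0.0.2", "fp-b"));
        System.out.println(quorum.isConnectionAllowed("10.0.0.2", "fp-unknown"));
    }
}
```

Because the address set and the cert set travel as one value, a peer never
sees a state where a new member's address is accepted but its cert is not yet
trusted, or the other way round.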
Please consider the above comments and let me know what you think. I was not
saying that your patch breaks fault-tolerance; my point was that we should
provide fault-tolerance and safety for reconfiguration of the SSL
configuration, be it self-signed or CA based. There are use cases where a CA
cert based cluster deployment might not be possible, so it would be nice to see
ZooKeeper support both options while keeping the ease of use and the same
guarantees that reconfig() provides.
{quote}
This is how I feel as well. I'm sure we can pretty quickly come up with a list
of deficiencies in the current design but I don't think there is anything
severe enough at this moment to give us cause to rewrite right now.
{quote}
There are bugs like ZOOKEEPER-2164 and ZOOKEEPER-1678 to consider, along with
ZOOKEEPER-901. Either Netty or NIO will work, but with SSL in the picture Netty
will make this easier to implement.
Doing this in phases is better; getting the SSL socket to work with reconfig()
support is a great first step. The Netty patch I have also provides this
support only for FLE and not for ZAB. I found it was not so easy to abstract
away the calls to socket(s) from the ZAB code.
Cheers
Powell.
> SSL Support for Atomic Broadcast protocol
> -----------------------------------------
>
> Key: ZOOKEEPER-236
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-236
> Project: ZooKeeper
> Issue Type: New Feature
> Components: quorum, server
> Reporter: Benjamin Reed
> Assignee: Abraham Fine
> Priority: Minor
>
> We should have the ability to use SSL to authenticate and encrypt the traffic
> between ZooKeeper servers. For the most part this is a very easy change. We
> would probably only want to support this for TCP based leader elections.