Hi everyone,

Since this looks very much like a bug, I have now filed an issue in ZooKeeper's
bug tracker: https://issues.apache.org/jira/browse/ZOOKEEPER-4403
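
To make the mismatch concrete, here is a minimal sketch of what I believe is 
going on. It is not ZooKeeper's code: `matches_san` and `is_ip_literal` are 
hypothetical helpers, `2001:db8::750` is a documentation address standing in 
for the real peer address, and wildcard SANs are ignored for brevity. It only 
shows that a DNS-type SAN can match the configured hostname but never an 
IP literal.

```python
import ipaddress
from typing import Iterable


def is_ip_literal(peer: str) -> bool:
    """Return True if `peer` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(peer)
        return True
    except ValueError:
        return False


def matches_san(peer: str, dns_sans: Iterable[str]) -> bool:
    """Case-insensitive exact match of `peer` against DNS-type SAN entries.

    Assumption: no wildcard handling, and IP literals are compared as plain
    strings, so they can never match a DNS name.
    """
    return peer.lower() in {name.lower() for name in dns_sans}


# DNS SAN list as reported in the log for the zookeeper3 certificate.
sans = ["zookeeper3.ourdomain.cloud"]

# Verifying against the name from the `server.n` line would succeed ...
by_name = matches_san("zookeeper3.ourdomain.cloud", sans)

# ... while verifying against the socket's remote address cannot.
# (hypothetical placeholder address, not the real one)
by_addr = matches_san("2001:db8::750", sans)

print(by_name, by_addr)
```

With a certificate that only carries DNS SANs, the second check can only 
succeed if the verifier maps the address back to a hostname first.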

Best regards,
Marc

> On 25. Oct 2021, at 13:10, Marc Richter <[email protected]> wrote:
> 
> Hi Andor,
> 
> # How LE-certs were requested:
> 1. Using https://github.com/acmesh-official/acme.sh
> 2. Requesting the cert from Let's Encrypt using:
>    `./acme.sh --issue --dns dns_nsupdate -d zookeeper1.ourdomain.cloud`
>    for each system.
> 3. Merging the fullchain and certificate files into a single PKCS12 file using:
>    `openssl pkcs12 -export -in <certfile> -inkey <keyfile> -out <pkcs12_file> -name zookeeper1.ourdomain.cloud \
>     -CAfile <fullchainfile> -password <JKS_password>`
> 4. Adding the resulting PKCS12 file to the quorum keystore:
>    `keytool -importkeystore -deststorepass <JKS_password> -destkeypass <JKS_password> -deststoretype pkcs12 \
>     -srckeystore <pkcs12_file> -srcstoretype PKCS12 -srcstorepass <JKS_password> -destkeystore <quorum_jks> \
>     -alias zookeeper1.ourdomain.cloud`
> 
> # Is this an IPv6-only environment? Do those hostnames resolve only to IPv6 addresses?
> The machines also have an IPv4 address assigned, but only to comply with our
> infrastructure standards; it is not used in this scenario. The service
> DNS record is IPv6-only, and the quorum config uses only that DNS record, as
> shown in the `server.n` lines before.
> 
> I think the log lines cited before clearly show:
> 1. Zookeeper is picking up the correct certificate from the quorum keystore,
>    since it states that the request does not match any SAN and lists only
>    "zookeeper3.ourdomain.cloud", which it can only know from the certificate itself.
> 2. Zookeeper is validating the wrong thing here: even though the config
>    clearly specifies a DNS name, the certificate's SANs are validated against
>    the IPv6 address that record resolves to instead of the configured DNS name
>    (ERROR Failed to verify host address: 2a01:--CUT--:750).
> 
> As I understand it, this is a bug in Zookeeper's quorum certificate handling,
> and how the certificates were obtained should not matter much in this
> context. It might matter if Zookeeper reported that it could not find any
> certificate, or similar, without naming the SAN as written in the
> certificate. But this is a different case.
> 
> Best regards,
> Marc
> 
> 
>> On 22. Oct 2021, at 01:20, Andor Molnar <[email protected]> wrote:
>> 
>> 
>> Hi Marc,
>> 
>> I need to take a closer look.
>> Would you please share how you requested the certificates from
>> Let's Encrypt?
>> Is this an IPv6-only environment? Do those hostnames resolve only to
>> IPv6 addresses?
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>> On Wed, 2021-10-13 at 15:47 +0200, Marc Richter wrote:
>>> Hi everyone,
>>> 
>>> For some days now, I have been trying to wrap my head around TLS
>>> encryption for the quorum traffic. The hosts running Zookeeper have
>>> publicly available DNS names, and I am using those to obtain SSL
>>> certificates from Let's Encrypt.
>>> This seems to work, but Zookeeper appears to validate the SSL
>>> certificates against the IP(v6) addresses of the connecting nodes
>>> instead of their hostnames.
>>> 
>>> In the `zookeeper.properties` of all my 3 nodes, I have set the
>>> servers by their DNS names like this:
>>> 
>>> ```
>>> server.1=zookeeper1.ourdomain.cloud:2888:3888
>>> server.2=zookeeper2.ourdomain.cloud:2888:3888
>>> server.3=zookeeper3.ourdomain.cloud:2888:3888
>>> ```
>>> 
>>> I requested SSL certificates from Let's Encrypt for these DNS names
>>> and added the certificate/key pairs to the Keystores of the nodes.
>>> 
>>> In the logs of the `zookeeper2` node, I now see something like this
>>> when the `zookeeper3` node tries to connect:
>>> 
>>> ```
>>> [2021-10-13 15:13:49,960] INFO Received connection request from /2a01:--CUT--:750:47566 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
>>> [2021-10-13 15:13:50,094] ERROR Failed to verify host address: 2a01:--CUT--:750 (org.apache.zookeeper.common.ZKTrustManager)
>>> javax.net.ssl.SSLPeerUnverifiedException: Certificate for <2a01:--CUT--:750> doesn't match any of the subject alternative names: [zookeeper3.ourdomain.cloud]
>>> ```
>>> 
>>> Zookeeper seems to ignore the hostnames and complains that the
>>> IPv6 address is not listed in the SANs of the presented certificate.
>>> Since most public CAs do not sign IP addresses (Let's Encrypt does not
>>> do that at all, ZeroSSL only for http auth, etc.), this behaviour
>>> forces me to run an internal CA and work with self-signed
>>> certificates, with all the drawbacks and extra effort that come
>>> with that.
>>> 
>>> How can I make Zookeeper resolve this correctly?
>>> 
>>> Best regards,
>>> Marc
>> 
>> 
> 
