Calvin Hartwell created SOLR-11362:
--------------------------------------

             Summary: Solr Cloud SSL handshake_failure keystore issue
                 Key: SOLR-11362
                 URL: https://issues.apache.org/jira/browse/SOLR-11362
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
    Affects Versions: 6.6.0
         Environment: CentOS 7.3, Virtual Machines, AWS. 
            Reporter: Calvin Hartwell
            Priority: Minor


Hey all,

I recently ran into a strange scenario that I thought I'd share. It was very 
frustrating, and I only discovered the fix on a whim. Let's imagine I have three 
nodes which form a SolrCloud cluster: 

- node0.someaddress.com
- node1.someaddress.com
- node2.someaddress.com 

Each of these machines has its own SSL key and a CSR, and the resulting 
certificate is signed by a CA. The truststore contains the public certificate of 
the CA (defined as per the manual using SOLR_SSL_TRUST_STORE). 

The keystore (SOLR_SSL_KEY_STORE) contains three entries: one for the CA public 
cert and two for the server itself, with different aliases (one has the alias 
set to the FQDN, the other is set to localhost). 

All parameters for SSL/TLS are configured correctly as per the Solr manuals. 
Strictly speaking, the keystore (SOLR_SSL_KEY_STORE) only needs the single 
cert/private key for the server with no other entries, but this same setup works 
100% with Kafka using the three entries. 

Here is an example:

keytool -list -keystore solrkeystore.jks 
localhost ..(omitted)
node0.someaddress.com ...(omitted)
cacert ..(omitted)

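As a quick check of how many private-key entries a keystore like the one above holds, the listing can be filtered by entry type. This is a minimal sketch: `count_private_keys` is a hypothetical helper name, and it relies on `keytool -list` labelling each entry as either `PrivateKeyEntry` or `trustedCertEntry`.

```shell
# Hypothetical helper: reads `keytool -list` output on stdin and prints
# how many entries are private keys (as opposed to trusted certificates).
count_private_keys() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'PrivateKeyEntry' || true
}

# Usage (keytool prompts for the keystore password):
#   keytool -list -keystore solrkeystore.jks | count_private_keys
```

For the three-entry keystore described above this should report 2 (the localhost and FQDN aliases), since the CA certificate is a trusted-certificate entry rather than a key entry.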
Here is the interesting part: with this setup, when the nodes are started, only 
one of the three nodes starts correctly (in my case, node1.someaddress.com). The 
other two nodes (node0.someaddress.com, node2.someaddress.com) fail with a 
handshake_failure error. Running solr status against the two broken nodes 
doesn't work, but the same command works fine for the working node. 

I enabled the most detailed level of logging and monitored the handshake but 
couldn't see anything really amiss; all the configuration properties were set 
correctly. 

What I noticed was this: when running keytool to list the entries in each 
keystore, the certificates were displayed in a different order on each node, as 
if the keytool CLI were sorting them in some way. This gave me the idea to 
delete the other entries from each node's keystore so that each keystore held 
only a single entry for that node's FQDN. 

So the keystores now looked like this: 
keytool -list -keystore solrkeystore.jks ...
node0.someaddress.com ...(omitted)
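The pruning step described above can be sketched as follows. This is only an illustration: `print_prune_commands` is a hypothetical helper that prints the `keytool -delete` commands for review rather than running them, and the alias names are the ones from the node0 example. Back up the keystore before running any of the printed commands.

```shell
# Hypothetical helper: print (do not run) the keytool commands that would
# remove every alias except the one to keep.
# Arguments: <keystore> <alias-to-keep> <alias>...
print_prune_commands() {
  keystore=$1
  keep=$2
  shift 2
  for entry in "$@"; do
    # Skip the alias that should stay in the keystore.
    [ "$entry" = "$keep" ] && continue
    printf 'keytool -delete -alias %s -keystore %s\n' "$entry" "$keystore"
  done
}

# Example for node0: keep only the FQDN alias.
print_prune_commands solrkeystore.jks node0.someaddress.com \
  localhost node0.someaddress.com cacert
```

Printing the commands first makes it easy to confirm that only the localhost and cacert aliases would be removed before touching the real keystore.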

After I did this and restarted, the Solr nodes started working fine again, so 
here are the questions: 

1) Why does this setup work with Kafka and not Solr, if the Java classes used 
should be very similar? 

2) Why can't I use multiple keys/certs in the keystore? Is this expected 
functionality? 



