Hi,

Glad to see you're discussing key distribution and rotation for Fernet tokens. 
Hans and I built a prototype for multi-site identity service management and ran 
into a similar issue.

The use case: a user should be able, through a single authentication point, to 
manage virtual resources spread over multiple OpenStack regions 
(https://etherpad.opnfv.org/p/multisite_identity_management).

We prototyped Fernet tokens across multiple Keystone clusters serving multiple 
OpenStack instances installed in multiple sites. "Write" operations are only 
allowed in the master Keystone cluster; the slave Keystone clusters are 
read-only (https://github.com/hafe/dockers). Note that the slave Galera cluster 
should be configured with replicate_do_db=keystone rather than 
binlog_do_db=keystone; Hans may not have updated the script yet. The prototype 
covers candidate solution 2.
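For reference, the slave-side replication filter from the prototype could look 
roughly like this in my.cnf (a sketch; the database name keystone is an 
assumption about the deployment):

```ini
# my.cnf sketch for a slave Keystone node (assumed database name: keystone).
# replicate_do_db filters on the slave side, so the master's binlog stays
# complete; binlog_do_db would filter at the master and affect every
# downstream replica.
[mysqld]
replicate_do_db = keystone
read_only = ON   # slaves only serve token validation
```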

From the prototype, we found that Fernet token validation can be done 
successfully by the local Keystone server against the async-replicated 
database. This means that if we have many sites with OpenStack installed, we 
can deploy a fully distributed Keystone service in each site and perform token 
validation locally, achieving both high performance and high availability.

After the prototype, I think candidate solution 3 would be the better solution 
for multi-site identity service management.

"Candidate solution 3": distributed Keystone service with Fernet tokens + async 
replication (star topology).
One master Keystone cluster with Fernet tokens spans two sites (for site-level 
high availability); every other site gets at least two slave nodes, each 
configured with DB async replication from a master cluster member: one slave's 
master node is in site 1, the other slave's master node is in site 2.

Only the master cluster nodes are allowed to write; all other slave nodes 
receive replication from a master cluster member (with very little delay).

Pros.
1) Why a cluster in the master sites? With multiple master nodes in the 
cluster, more slaves can replicate asynchronously in parallel.
2) Why two sites for the master cluster? To provide higher (site-level) 
reliability for write requests.
3) Why multiple slaves in the other sites? A slave has no knowledge of other 
slaves, so multiple slaves in one site are easier to manage than a cluster, and 
they work independently while still providing multi-instance redundancy (like a 
cluster, but independent).

Cons. Key distribution and rotation still have to be managed across all sites.

------------------------------------

I very much appreciate the newly introduced Fernet token for addressing 
multi-site cloud identity management, but it brings a new challenge: how to 
handle key distribution and rotation in a multi-site cloud. Should key 
distribution/rotation management be the responsibility of a new service or of 
Keystone itself? It is hard to depend on scripts to manage many sites (lots of 
sites, not only 3 or 5).

Best Regards
Chaoyi Huang ( Joe Huang )

From: Dolph Mathews [mailto:dolph.math...@gmail.com]
Sent: Tuesday, July 28, 2015 3:31 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Keystone][Fernet] HA SQL backend for Fernet keys



On Mon, Jul 27, 2015 at 2:03 PM, Clint Byrum <cl...@fewbar.com> wrote:
Excerpts from Dolph Mathews's message of 2015-07-27 11:48:12 -0700:
> On Mon, Jul 27, 2015 at 1:31 PM, Clint Byrum <cl...@fewbar.com> wrote:
>
> > Excerpts from Alexander Makarov's message of 2015-07-27 10:01:34 -0700:
> > > Greetings!
> > >
> > > I'd like to discuss pro's and contra's of having Fernet encryption keys
> > > stored in a database backend.
> > > The idea itself emerged during discussion about synchronizing rotated
> > keys
> > > in HA environment.
> > > Now Fernet keys are stored in the filesystem that has some availability
> > > issues in unstable cluster.
> > > OTOH, making SQL highly available is considered easier than that for a
> > > filesystem.
> > >
> >
> > I don't think HA is the root of the problem here. The problem is
> > synchronization. If I have 3 keystone servers (n+1), and I rotate keys on
> > them, I must very carefully restart them all at the exact right time to
> > make sure one of them doesn't issue a token which will not be validated
> > on another. This is quite a real possibility because the validation
> > will not come from the user, but from the service, so it's not like we
> > can use simple persistence rules. One would need a layer 7 capable load
> > balancer that can find the token ID and make sure it goes back to the
> > server that issued it.
> >
>
> This is not true (or if it is, I'd love see a bug report). keystone-manage
> fernet_rotate uses a three phase rotation strategy (staged -> primary ->
> secondary) that allows you to distribute a staged key (used only for token
> validation) throughout your cluster before it becomes a primary key (used
> for token creation and validation) anywhere. Secondary keys are only used
> for token validation.
>
> All you have to do is atomically replace the fernet key directory with a
> new key set.
>
> You also don't have to restart keystone for it to pickup new keys dropped
> onto the filesystem beneath it.
>
That's great news! Is this documented anywhere? I dug through the
operators guides, security guide, install guide, etc. Nothing described
this dance, which is impressive and should be written down!

(BTW, your original assumption would normally have been an accurate one!)

I don't believe it's documented in any of those places, yet. The best in-tree 
explanation of the three phases that I'm aware of is probably this (which isn't 
particularly accessible...):

  
https://github.com/openstack/keystone/blob/6a6fcc2/keystone/cmd/cli.py#L208-L223

Lance Bragstad and I also gave a small presentation at the Vancouver summit on 
the behavior and he mentions the same on one of his blog posts:

  https://www.youtube.com/watch?v=duRBlm9RtCw&feature=youtu.be
  http://lbragstad.com/?p=133


I even tried to discern how it worked from the code, but on casual 
investigation it actually looks like it does not work the way you describe.

I don't blame you! I'll work to improve the user-facing docs on the topic.


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

