tl;dr - I think Trove's successor has a future, but there are two conflicting ideas presented and Trove should pick one or the other.
Excerpts from Amrith Kumar's message of 2017-06-18 07:35:49 -0400:
>
> We have learned a lot from v1, and the hope is that we can address that
> in v2. Some of the more significant things that I have learned are:
>
> - We should adopt a versioned front-end API from the very beginning;
> making the REST API versioned is not a ‘v2 feature’

+1

> - A guest agent running on a tenant instance, with connectivity to a
> shared management message bus is a security loophole; encrypting
> traffic, per-tenant-passwords, and any other scheme is merely lipstick
> on a security hole

This is a broad statement, and I'm not sure I understand the actual risk
you're presenting here as "a security loophole". How else would you
administer a database server than through some kind of agent? Whether
that agent is a Python daemon of our making, sshd, or whatever
Kubernetes component lets you change things, they're all administrative
pieces that sit next to the resource.

> - Reliance on Nova for compute resources is fine, but dependence on
> Nova VM specific capabilities (like instance rebuild) is not; it makes
> things like containers or bare-metal second class citizens

I wholeheartedly agree that rebuild is a poor choice for database
servers. In fact, I believe it is a completely non-scalable feature that
should not even exist in Nova. But that only tells us what we shouldn't
be doing. What should we be running database clusters on?

> - A fair portion of what Trove does is resource orchestration; don’t
> reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as
> far along when Trove got started but that’s not the case today and we
> have an opportunity to fix that now

Yeah. You can do that. I'm not really sure what it gets you at this
level. There was an effort a few years ago to use Heat for Trove and
some other pieces, but it fell short at the point where they had to ask
Heat for a few features like, oddly enough, rebuild confirmation after
test.
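On the versioned-API point above: a minimal sketch of what baking a
version into every route from day one can look like. This is a toy
dispatcher with hypothetical names, not the actual Trove (or any
OpenStack) API; the point is only that a v2 can later coexist with v1
without breaking clients.

```python
# Toy version-prefixed routing; all names are hypothetical, not Trove's
# real API. Every route is registered under an explicit version, so
# adding ("v2", "/instances") later never disturbs v1 clients.

ROUTES = {}

def route(version, path):
    """Register a handler under an explicit API version prefix."""
    def register(handler):
        ROUTES[(version, path)] = handler
        return handler
    return register

@route("v1", "/instances")
def list_instances_v1():
    return {"instances": [], "api_version": "v1"}

def dispatch(raw_path):
    """Resolve e.g. '/v1/instances' to a registered handler."""
    _, version, rest = raw_path.split("/", 2)
    handler = ROUTES.get((version, "/" + rest))
    if handler is None:
        raise LookupError("unknown version or path: %s" % raw_path)
    return handler()
```

Requests to an unregistered version fail loudly instead of silently
falling through to whatever the unversioned handler happens to be.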
Also, it increases friction for your project if it requires Heat in a
cloud. That's a whole new service that one would have to choose whether
to expose to users, and manage, just for Trove. That's a massive
dependency, and it should come with something significant in return. I
don't see what it actually gets you when you already have to keep track
of your resources for cluster membership purposes anyway.

> - A similarly significant portion of what Trove does is to implement a
> state-machine that will perform specific workflows involved in
> implementing database specific operations. This makes the Trove
> taskmanager a stateful entity. Some of the operations could take a fair
> amount of time. This is a serious architectural flaw

A state driven workflow is unavoidable if you're going to do cluster
manipulation. You can defer this off to Mistral or some other workflow
engine, but I don't think it's an architectural flaw _that Trove does
it_. Clusters have states. They have to be tracked. Do that well and
your users will be happy.

> - Tenants should not ever be able to directly interact with the
> underlying storage and compute used by database instances; that should
> be the default configuration, not an untested deployment alternative

Agreed. There's no point in having an "inside the cloud" service if
you're just going to hand users the keys to the VMs and volumes anyway.
The point of something like Trove is to retain control at the operator
level, and give users only the interface you promised, optimized
without the limitations of the cloud.

> - The CI should test all databases that are considered to be
> ‘supported’ without excessive use of resources in the gate; better code
> modularization will help determine the tests which can safely be
> skipped in testing changes

Take the same approach as the other driver-hosting projects: if it's
in-tree, it has to have a gate test.
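To illustrate the state-tracking point above, a toy sketch of explicit
cluster state transitions. The state names and the transition table are
illustrative assumptions, not Trove's actual taskmanager model; the
point is that legal transitions are enumerated and everything else is
rejected, so a long-running workflow can always be resumed from a known
state.

```python
# Toy cluster state machine; states and transitions are assumptions for
# illustration, not Trove's real model.

VALID_TRANSITIONS = {
    "NEW": {"BUILDING"},
    "BUILDING": {"ACTIVE", "ERROR"},
    "ACTIVE": {"GROWING", "SHRINKING", "ERROR"},
    "GROWING": {"ACTIVE", "ERROR"},
    "SHRINKING": {"ACTIVE", "ERROR"},
    "ERROR": set(),
}

class Cluster:
    def __init__(self, name):
        self.name = name
        self.state = "NEW"
        self.history = ["NEW"]  # audit trail of every state reached

    def transition(self, new_state):
        """Move to new_state, refusing anything not in the table."""
        if new_state not in VALID_TRANSITIONS[self.state]:
            raise ValueError(
                "illegal transition %s -> %s" % (self.state, new_state))
        self.state = new_state
        self.history.append(new_state)
```

Whether this table lives in Trove's taskmanager or in an external
workflow engine like Mistral, something has to own it; that's the
argument above.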
> - Clusters should be first class citizens not an afterthought, single
> instance databases may be the ‘special case’, not the other way around

+1

> - The project must provide guest images (or at least complete tooling
> for deployers to build these); while the project can’t distribute
> operating systems and database software, the current deployment model
> merely impedes adoption

IIRC the project provides dib elements and a basic command line to
build images for your cloud, yes? Has that not worked out?

> - Clusters spanning OpenStack deployments are a real thing that must
> be supported

This is the most problematic thing you asserted. There are two basic
desires I see that drive a Trove adoption:

1) I need database clusters and I don't know how to do them right.

2) I need _high performance/availability/capacity_ databases and my
cloud's standard VM flavors/hosts/networks/disks/etc. stand in the way
of that.

For the OpenStack-spanning cluster thing, (1) is fine. But (1) can and
probably should be handled by things like Helm, Juju, Ansible, Habitat,
Docker Compose, etc.

(2) is much more likely to draw people into an official "inside the
cloud" Trove deployment. Let the operators install Ironic, wire up some
bare metal with huge disks or powerful RAID controllers or an
infiniband mesh, and build their own images with tuned kernels and
tightly controlled builds of MySQL/MariaDB/Postgres/MongoDB/etc. Don't
let the users know anything about the computers their database cluster
runs on. They get cluster access details, and knobs that are workload
specific. Not all the knobs, just the knobs that an operator can't
possibly know. And in return you give them highly capable databases.

But (2) is directly counter to (1). I would say pick one, and focus on
that for Trove. To me, (2) is the more interesting story. (1) is a
place to let 1000 flowers bloom (in many cases they already have, and
just need porting from AWS/GCE/Azure/DigitalOcean to OpenStack).
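On the dib elements mentioned above: a hedged sketch of composing a
diskimage-builder invocation. `disk-image-create` is the real dib entry
point, but the element names below are assumptions that vary by Trove
release and ELEMENTS_PATH, so check your deployment's docs before
running anything like this.

```python
# Sketch only: compose a disk-image-create command line. The element
# names (ubuntu-minimal, vm, trove-guestagent) are assumptions; check
# the elements shipped with your Trove release.

def build_image_cmd(output, elements, arch="amd64"):
    """Return the disk-image-create invocation as an argv list."""
    return ["disk-image-create", "-a", arch, "-o", output] + list(elements)

cmd = build_image_cmd("trove-mysql-guest",
                      ["ubuntu-minimal", "vm", "trove-guestagent"])
# On a real build host you would execute it, e.g.:
#   subprocess.check_call(cmd)
print(" ".join(cmd))
```

In the operator-focused story (2), this is exactly where tuned kernels
and pinned database builds would be baked in, out of the tenant's sight.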
If you want to run cross-cloud, you are accepting the limitations of
multi-cloud, and should likely be building cloud-native apps that don't
rely on a beefy database cluster.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev