Thank you Kevin. Lots of container (specific?) goodness here.

-amrith

--
Amrith Kumar
Phone: +1-978-563-9590

On Mon, Jun 19, 2017 at 2:34 PM, Fox, Kevin M <[email protected]> wrote:

> Thanks for starting this difficult discussion.
>
> I think I agree with all the lessons learned except the nova one. While
> you can treat containers and VMs the same, after years of using both, I
> really don't think it's a good idea to treat them equally. Containers can't
> work properly if used as a VM. (really, really.)
>
> I agree wholeheartedly with your statement that it's mostly an
> orchestration problem and should reuse stuff now that there are options.
>
> I would propose the following that I think meets your goals and could
> widen your contributor base substantially:
>
> Look at the Kubernetes (k8s) concept of Operator ->
> https://coreos.com/blog/introducing-operators.html
>
> They allow application-specific logic to be added to Kubernetes while
> reusing the rest of k8s to do what it's good at: container orchestration.
> etcd is just a clustered database and if the operator concept works for it,
> it should also work for other databases such as Galera.
>
> Where I think the containers/vm thing is incompatible is the thing I think
> will make Trove's life easier. You can think of a member of the database as
> a few different components, such as:
> * main database process
> * metrics gatherer (such as https://github.com/prometheus/mysqld_exporter)
> * trove_guest_agent
>
> With the current approach, all are mixed into the same vm image, making it
> very difficult to update the trove_guest_agent without touching the main
> database process (needed when you upgrade the trove controllers). With the
> k8s sidecar concept, each would be a separate container loaded into the
> same pod.
>
> So rather than needing to maintain a trove image for every possible
> combination of db version, trove version, etc, you can reuse upstream
> database containers along with trove-provided guest agents.
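To make the sidecar layout above concrete, here is a minimal Python sketch of a pod manifest with the three components as separate containers. It is an illustration only, not code from Trove or from this thread. The function names, the "trove-db" label and the example.org guest agent image are hypothetical; the container is named "guest-agent" because Kubernetes container names must be DNS labels, which rules out the underscore.

```python
import copy


def build_db_member_pod(name, db_image, agent_image,
                        exporter_image="prom/mysqld-exporter"):
    """Return a Pod manifest (plain dict) for one database member.

    Each component from the list above gets its own container in the
    same pod. All names and labels here are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "trove-db"}},
        "spec": {
            "containers": [
                # Upstream database image, reused unmodified.
                {"name": "db", "image": db_image},
                # Metrics gatherer sidecar.
                {"name": "metrics", "image": exporter_image},
                # Guest agent sidecar (hypothetical image).
                {"name": "guest-agent", "image": agent_image},
            ]
        },
    }


def with_new_agent(pod, agent_image):
    """Return a copy of the manifest where only the guest agent image changes."""
    updated = copy.deepcopy(pod)
    for container in updated["spec"]["containers"]:
        if container["name"] == "guest-agent":
            container["image"] = agent_image
    return updated


if __name__ == "__main__":
    pod = build_db_member_pod(
        "db-0", "mysql:5.7", "example.org/trove/guest-agent:1.0")
    upgraded = with_new_agent(pod, "example.org/trove/guest-agent:1.1")
    for container in upgraded["spec"]["containers"]:
        print(container["name"], container["image"])
```

The point of the sketch is that the two manifests differ in one container entry only; the database and metrics containers are untouched when the agent version moves.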
>
> There's a secure channel between kube-apiserver and kubelet so you can
> reuse it for secure communications. No need to add anything for secure
> communication. trove engine -> kubectl exec xxxxx-db -c guest_agent some
> command.
>
> There is k8s federation, so if the operator was started at the federation
> level, it can cross multiple OpenStack regions.
>
> There is another big feature that hasn't been mentioned yet that I think is
> critical. In our performance tests, databases in VMs have never performed
> particularly well. Using k8s as a base, bare metal nodes could be pulled in
> easily, with dedicated disk or SSDs that the pods land on that are very,
> very close to the database. This should give native performance.
>
> So, my suggestion would be to strongly consider basing Trove v2 on
> Kubernetes. It can provide a huge bang for the buck, simplifying the Trove
> architecture substantially while gaining the new features you list as
> being important. The Trove v2 OpenStack API can be exposed as a very thin
> wrapper over k8s Third Party Resources (TPR) and would make Trove entirely
> stateless. k8s maintains all state for everything in etcd.
>
> Please consider this architecture.
>
> Thanks,
> Kevin
>
> ------------------------------
> *From:* Amrith Kumar [[email protected]]
> *Sent:* Sunday, June 18, 2017 4:35 AM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* [openstack-dev] [trove][all][tc] A proposal to rearchitect
> Trove
>
> Trove has evolved rapidly over the past several years, since integration
> in Icehouse when it only supported single instances of a few databases.
> Today it supports a dozen databases including clusters and replication.
>
> The user survey [1] indicates that while there is strong interest in the
> project, there are few large production deployments that are known of (by
> the development team).
>
> Recent changes in the OpenStack community at large (company realignments,
> acquisitions, layoffs) and the Trove community in particular, coupled with
> a mounting burden of technical debt have prompted me to make this proposal
> to re-architect Trove.
>
> This email summarizes several of the issues that face the project, both
> structurally and architecturally. This email does not claim to include a
> detailed specification for what the new Trove would look like, merely the
> recommendation that the community should come together and develop one so
> that the project can be sustainable and useful to those who wish to use it
> in the future.
>
> TL;DR
>
> Trove, with support for a dozen or so databases today, finds itself in a
> bind because there are few developers, and a code-base with a significant
> amount of technical debt.
>
> Some architectural choices which the team made over the years have
> consequences which make the project less than ideal for deployers.
>
> Given that there are no major production deployments of Trove at present,
> this provides us an opportunity to reset the project, learn from our v1 and
> come up with a strong v2.
>
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning, and coding) involved in migrating existing
> Trove v1 deployments to the proposed Trove v2. Effectively, with work
> beginning on Trove v2 as proposed here, Trove v1 as released with Pike will
> be marked as deprecated and users will have to migrate to Trove v2 when it
> becomes available.
>
> While I would very much like to continue to support the users on Trove v1
> through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are no
> production deployments of Trove at this time, it seems pointless to build
> that upgrade path from Trove v1 to Trove v2; it would be the proverbial
> bridge from nowhere.
>
> This (previous) statement is, I realize, contentious. There are those who
> have told me that an upgrade path must be provided, and there are those who
> have told me of unnamed deployments of Trove that would suffer. To this,
> all I can say is that if an upgrade path is of value to you, then please
> commit the development resources to participate in the community to make
> that possible. But equally, preventing a v2 of Trove or delaying it will
> only make the v1 that we have today less valuable.
>
> We have learned a lot from v1, and the hope is that we can address that in
> v2. Some of the more significant things that I have learned are:
>
> - We should adopt a versioned front-end API from the very beginning;
>   making the REST API versioned is not a ‘v2 feature’
>
> - A guest agent running on a tenant instance, with connectivity to a
>   shared management message bus is a security loophole; encrypting traffic,
>   per-tenant-passwords, and any other scheme is merely lipstick on a
>   security hole
>
> - Reliance on Nova for compute resources is fine, but dependence on Nova
>   VM-specific capabilities (like instance rebuild) is not; it makes things
>   like containers or bare-metal second-class citizens
>
> - A fair portion of what Trove does is resource orchestration; don’t
>   reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far
>   along when Trove got started but that’s not the case today and we have an
>   opportunity to fix that now
>
> - A similarly significant portion of what Trove does is to implement a
>   state-machine that will perform specific workflows involved in
>   implementing database-specific operations. This makes the Trove
>   taskmanager a stateful entity. Some of the operations could take a fair
>   amount of time. This is a serious architectural flaw
>
> - Tenants should not ever be able to directly interact with the underlying
>   storage and compute used by database instances; that should be the default
>   configuration, not an untested deployment alternative
>
> - The CI should test all databases that are considered to be ‘supported’
>   without excessive use of resources in the gate; better code modularization
>   will help determine the tests which can safely be skipped in testing
>   changes
>
> - Clusters should be first-class citizens, not an afterthought;
>   single-instance databases may be the ‘special case’, not the other way
>   around
>
> - The project must provide guest images (or at least complete tooling for
>   deployers to build these); while the project can’t distribute operating
>   systems and database software, the current deployment model merely impedes
>   adoption
>
> - Clusters spanning OpenStack deployments are a real thing that must be
>   supported
>
> This might sound harsh; that isn’t the intent. Each of these is the
> consequence of one or more perfectly rational decisions. Some of those
> decisions have had unintended consequences, and others were made knowing
> that we would be incurring some technical debt; debt we have not had the
> time or resources to address. Fixing all these is not impossible, it just
> takes the dedication of resources by the community.
>
> I do not have a complete design for what the new Trove would look like.
> For example, I don’t know how we will interact with other projects (like
> Heat). Many questions remain to be explored and answered.
>
> Would it suffice to just use the existing Heat resources and build
> templates around those, or will it be better to implement custom Trove
> resources and then orchestrate things based on those resources?
>
> Would Trove implement the workflows required for multi-stage database
> operations by itself, or would it rely on some other project (say Mistral)
> for this? Is Mistral really a workflow service, or just cron on steroids? I
> don’t know the answer but I would like to find out.
>
> While we don’t have the answers to these questions, I think this is a
> conversation that we must have, one that we must decide on, and then as a
> community commit the resources required to make a Trove v2 which delivers
> on the mission of the project; “To provide scalable and reliable Cloud
> Database as a Service provisioning functionality for both relational and
> non-relational database engines, and to continue to improve its
> fully-featured and extensible open source framework.”[2]
>
> Thanks,
>
> -amrith
>
>
> [1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
> [2] https://wiki.openstack.org/wiki/Trove#Mission_Statement
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
