Kevin,

In the interests of 'keeping it simple', I'm going to try to prioritize the use-cases and pick implementation strategies which target the higher priority ones without needlessly excluding other (lower priority) ones.

Thanks,

-amrith

--
Amrith Kumar

P.S. Verizon is hiring OpenStack engineers nationwide. If you are interested, please contact me or visit https://t.co/gGoUzYvqbE

On Wed, Jul 12, 2017 at 5:46 PM, Fox, Kevin M <[email protected]> wrote:

> There is a use case where some sites have folks buy whole bricks of
> compute nodes that get added to the overarching cloud, but using AZ's or
> HostAggregates/Flavors to dedicate the hardware to the users.
>
> You might want to land the db vm on the hardware for that project and one
> would expect the normal quota would be dinged for it rather than a special
> trove quota. Otherwise they may have more quota than the hosts can actually
> handle.
>
> Thanks,
> Kevin
> ________________________________________
> From: Doug Hellmann [[email protected]]
> Sent: Wednesday, July 12, 2017 6:57 AM
> To: openstack-dev
> Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect
> Trove
>
> Excerpts from Amrith Kumar's message of 2017-07-12 06:14:28 -0500:
> > All:
> >
> > First, let me thank all of you who responded and provided feedback
> > on what I wrote. I've summarized what I heard below and am posting
> > it as one consolidated response rather than responding to each
> > of your messages and making this thread even deeper.
> >
> > As I say at the end of this email, I will be setting up a session at
> > the Denver PTG to specifically continue this conversation and hope
> > you will all be able to attend. As soon as time slots for PTG are
> > announced, I will try and pick this slot and request that you please
> > attend.
> >
> > ----
> >
> > Thierry: naming issue; call it Hoard if it does not have a migration
> > path.
> >
> > ----
> >
> > Kevin: use a container approach with k8s as the orchestration
> > mechanism, addresses multiple issues including performance. Trove to
> > provide containers for multiple components which cooperate to provide
> > a single instance of a database or cluster.
> > Don't put all components (agent, monitoring, database) in a single
> > VM, decoupling makes migration and upgrades easier and allows trove
> > to reuse database vendor supplied containers. Performance of
> > databases in VM's is poor compared to databases on bare-metal.
> >
> > ----
> >
> > Doug Hellmann:
> >
> > > Does "service VM" need to be a first-class thing? Akanda creates
> > > them, using a service user. The VMs are tied to a "router" which is
> > > the billable resource that the user understands and interacts with
> > > through the API.
> >
> > Amrith: Doug, yes because we're looking not just for service VM's but all
> > resources provisioned by a service. So, to Matt's comment about a
> > blackbox DBaaS, the VM's, storage, snapshots, ... they should all be
> > owned by the service, charged to a user's quota but not visible to the
> > user directly.
>
> I still don't understand. If you have entities that represent the
> DBaaS "host" or "database" or "database backup" or whatever, then
> you put a quota on those entities and you bill for them. If the
> database actually runs in a VM or the backup is a snapshot, those
> are implementation details. You don't want to have to rewrite your
> quota management or billing integration if those details change.
>
> Doug
>
> >
> > ----
> >
> > Jay:
> >
> > > Frankly, I believe all of these types of services should be built
> > > as applications that run on OpenStack (or other)
> > > infrastructure. In other words, they should not be part of the
> > > infrastructure itself.
> > >
> > > There's really no need for a user of a DBaaS to have access to the
> > > host or hosts the DB is running on. If the user really wanted
> > > that, they would just spin up a VM/baremetal server and install
> > > the thing themselves.
> >
> > and subsequently in follow-up with Zane:
> >
> > > Think only in terms of what a user of a DBaaS really wants.
> > > At the end of the day, all they want is an address in the cloud
> > > where they can point their application to write and read data from.
> > > ...
> > > At the end of the day, I think Trove is best implemented as a hosted
> > > application that exposes an API to its users that is entirely
> > > separate from the underlying infrastructure APIs like
> > > Cinder/Nova/Neutron.
> >
> > Amrith: Yes, I agree, +1000
> >
> > ----
> >
> > Clint (in response to Jay's proposal regarding the service making all
> > resources multi-tenant) raised a concern about having multi-tenant
> > shared resources. The issue is with ensuring separation between
> > tenants (don't want to use the word isolation because this is database
> > related).
> >
> > Amrith: yes, definitely a concern and one that we don't have today
> > because each DB is a VM of its own. Personally, I'd rather stick with
> > that construct, one DB per VM/container/baremetal and leave that be
> > the separation boundary.
> >
> > ----
> >
> > Zane: Discomfort over throwing out working code, grass is greener on
> > the other side, is there anything to salvage?
> >
> > Amrith: Yes, there is certainly a 'grass is greener with a rewrite'
> > fallacy. But, there is stuff that can be salvaged. The elements are
> > still good, they are separable and can be used with the new
> > project. Much of the controller logic however will fall by the
> > wayside.
> >
> > In a similar vein, Clint asks about the elements that Trove provides,
> > "how has that worked out".
> >
> > Amrith: Honestly, not well. Trove only provided reference elements
> > suitable for development use. Never really production hardened
> > ones. For example, the image elements trove provides don't bake the
> > guest agent in; they assume that at VM launch, the guest agent code
> > will be slurped (technical term) from the controller and
> > launched. Great for debugging, not great for production. That is
> > something that should change.
> > But, equally, I've heard disagreements saying that slurping the
> > guest agent at runtime is clever and good in production.
> >
> > ----
> >
> > Zane: consider using Mistral for workflow.
> >
> > > The disadvantage, obviously, is that it requires the cloud to offer
> > > Mistral as-a-Service, which currently doesn't include nearly as many
> > > clouds as I'd like.
> >
> > Amrith: Yes, as we discussed, we are in agreement with both parts of
> > this recommendation.
> >
> > Zane, Jay and Dims: a subtle distinction between Tessmaster and Magnum
> > (I want a database, figure out the lower layers, vs. I want a k8s
> > cluster).
> >
> > ----
> >
> > Zane: Fun fact: Trove started out as a *complete fork* of Nova(!).
> >
> > Amrith: Not fun at all :) Never, ever, ever, ever f5g do that
> > again. Yeah, sure, if you can have i18n, and k8s, I can have f5g :)
> >
> > ----
> >
> > Thierry:
> >
> > > We generally need to be very careful about creating dependencies
> > > between OpenStack projects.
> > > ...
> > > I understand it's a hard trade-off: you want to reuse functionality
> > > rather than reinvent it in every project... we just need to
> > > recognize the cost of doing that.
> >
> > Amrith: Yes, this is part of my concern re: Mistral, and earlier in
> > trove's life on depending on Manila for Oracle RAC. Clint raised a
> > similar concern about the dependency on Heat.
> >
> > In response, Kevin:
> >
> > > That view of dependencies is why Kubernetes development is outpacing
> > > OpenStack's and some users are leaving IMO. Not trying to be mean
> > > here but trying to shine some light on this issue.
> >
> > I disagree, but that's a topic for another email thread and maybe not
> > even an email thread but an in-person conversation with suitable
> > beverages. It is a religious discussion which is best handled in a
> > different forum; such as the emacs-vi forum.
> >
> > ----
> >
> > I wrote:
> >
> > > - A guest agent running on a tenant instance, with connectivity to a
> > > shared management message bus is a security loophole; encrypting
> > > traffic, per-tenant-passwords, and any other scheme is merely
> > > lipstick on a security hole
> >
> > Clint asks:
> >
> > This is a broad statement, and I'm not sure I understand the actual
> > risk you're presenting here as "a security loophole".
> >
> > How else would you administer a database server than through some
> > kind of agent? Whether that agent is a python daemon of our making,
> > sshd, or whatever kubernetes component lets you change things,
> > they're all administrative pieces that sit next to the resource.
> >
> > Amrith: The issue is that the guest agent (currently) running in a
> > tenant's context needs to establish a connection to a shared rabbitmq
> > server running in the service (control plane) context. I am fine with
> > a guest agent running in the control plane establishing a connection
> > into a guest VM if required, not the other way around.
> >
> > ----
> >
> > Clint makes a distinction between a database cluster within an
> > OpenStack deployment and an uber database cluster spanning clouds,
> > recommending that the latter is best left to a tertiary
> > orchestrator. Further, these are two distinct things, pick one and do
> > it well.
> >
> > Amrith: A valid approach and one that will allow Trove to focus on the
> > high value single OpenStack deployment of a db cluster (and to Jay's
> > point, do it well).
> >
> > ----
> >
> > Consensus:
> >
> > Trove should expose (what Matt Fischer calls) BlackBox DB, not storage +
> > compute.
> >
> > Address rabbitmq security concerns differently; move guest agent off
> > instance.
> >
> > Don't reinvent the orchestration piece.
> >
> > Fewer DB's, better support
> >
> > Clusters are first class citizens, not an afterthought
> >
> > Clusters spanning regions and openstack deployments
> >
> > Restart the service VM's discussion:
> > https://review.openstack.org/#/c/438134/
> >
> > ----
> >
> > Several people emailed me privately and said they (or their companies)
> > would like to invest resources in Trove. Some indicated that they (or
> > their companies) would like to invest resources in Trove if the
> > commitment was towards a certain direction or technology choice.
> > Others have offered resources if the direction would be to provide
> > an AWS compatible API.
> >
> > To anyone who wants to contribute resources to a project, please do
> > it. Big companies considering contributing one or two people to a
> > project and making it seem like a big decision is really an indication
> > of a lack of seriousness. If the project is really valuable to you,
> > you'd have put people on it already. The fact that you haven't speaks
> > volumes.
> >
> > To those who want to place pre-conditions on technology choice, I have
> > no (good) words for you.
> >
> > Thanks to all who participated, I appreciate all the input. I will be
> > setting up a session at the Denver PTG to specifically continue this
> > conversation and hope you will all be able to attend. As soon as time
> > slots for PTG are announced, I will try and pick this slot and request
> > that you please attend.
> >
> > Thanks,
> >
> > -amrith
> >
> >
> > On Sun, Jun 18, 2017 at 6:35 AM, Amrith Kumar <[email protected]>
> > wrote:
> >
> > > Trove has evolved rapidly over the past several years, since
> > > integration in IceHouse when it only supported single instances of a
> > > few databases. Today it supports a dozen databases including clusters
> > > and replication.
> > >
> > > The user survey [1] indicates that while there is strong interest in
> > > the project, there are few large production deployments that are
> > > known of (by the development team).
> > >
> > > Recent changes in the OpenStack community at large (company
> > > realignments, acquisitions, layoffs) and the Trove community in
> > > particular, coupled with a mounting burden of technical debt have
> > > prompted me to make this proposal to re-architect Trove.
> > >
> > > This email summarizes several of the issues that face the project, both
> > > structurally and architecturally. This email does not claim to include
> > > a detailed specification for what the new Trove would look like, merely
> > > the recommendation that the community should come together and develop
> > > one so that the project can be sustainable and useful to those who wish
> > > to use it in the future.
> > >
> > > TL;DR
> > >
> > > Trove, with support for a dozen or so databases today, finds itself in
> > > a bind because there are few developers, and a code-base with a
> > > significant amount of technical debt.
> > >
> > > Some architectural choices which the team made over the years have
> > > consequences which make the project less than ideal for deployers.
> > >
> > > Given that there are no major production deployments of Trove at
> > > present, this provides us an opportunity to reset the project, learn
> > > from our v1 and come up with a strong v2.
> > >
> > > An important aspect of making this proposal work is that we seek to
> > > eliminate the effort (planning, and coding) involved in migrating
> > > existing Trove v1 deployments to the proposed Trove v2. Effectively,
> > > with work beginning on Trove v2 as proposed here, Trove v1 as released
> > > with Pike will be marked as deprecated and users will have to migrate
> > > to Trove v2 when it becomes available.
> > >
> > > While I would very much like to continue to support the users on Trove
> > > v1 through this transition, the simple fact is that absent community
> > > participation this will be impossible. Furthermore, given that there
> > > are no production deployments of Trove at this time, it seems pointless
> > > to build that upgrade path from Trove v1 to Trove v2; it would be the
> > > proverbial bridge from nowhere.
> > >
> > > This (previous) statement is, I realize, contentious. There are those
> > > who have told me that an upgrade path must be provided, and there are
> > > those who have told me of unnamed deployments of Trove that would
> > > suffer. To this, all I can say is that if an upgrade path is of value
> > > to you, then please commit the development resources to participate in
> > > the community to make that possible. But equally, preventing a v2 of
> > > Trove or delaying it will only make the v1 that we have today less
> > > valuable.
> > >
> > > We have learned a lot from v1, and the hope is that we can address
> > > that in v2. Some of the more significant things that I have learned
> > > are:
> > >
> > > - We should adopt a versioned front-end API from the very beginning;
> > > making the REST API versioned is not a ‘v2 feature’
> > >
> > > - A guest agent running on a tenant instance, with connectivity to a
> > > shared management message bus is a security loophole; encrypting
> > > traffic, per-tenant-passwords, and any other scheme is merely lipstick
> > > on a security hole
> > >
> > > - Reliance on Nova for compute resources is fine, but dependence on
> > > Nova VM specific capabilities (like instance rebuild) is not; it makes
> > > things like containers or bare-metal second class citizens
> > >
> > > - A fair portion of what Trove does is resource orchestration; don’t
> > > reinvent the wheel, there’s Heat for that.
> > > Admittedly, Heat wasn’t as far along when Trove got started but
> > > that’s not the case today and we have an opportunity to fix that now
> > >
> > > - A similarly significant portion of what Trove does is to implement a
> > > state-machine that will perform specific workflows involved in
> > > implementing database specific operations. This makes the Trove
> > > taskmanager a stateful entity. Some of the operations could take a
> > > fair amount of time. This is a serious architectural flaw
> > >
> > > - Tenants should not ever be able to directly interact with the
> > > underlying storage and compute used by database instances; that should
> > > be the default configuration, not an untested deployment alternative
> > >
> > > - The CI should test all databases that are considered to be
> > > ‘supported’ without excessive use of resources in the gate; better
> > > code modularization will help determine the tests which can safely be
> > > skipped in testing changes
> > >
> > > - Clusters should be first class citizens not an afterthought, single
> > > instance databases may be the ‘special case’, not the other way around
> > >
> > > - The project must provide guest images (or at least complete tooling
> > > for deployers to build these); while the project can’t distribute
> > > operating systems and database software, the current deployment model
> > > merely impedes adoption
> > >
> > > - Clusters spanning OpenStack deployments are a real thing that must be
> > > supported
> > >
> > > This might sound harsh, that isn’t the intent. Each of these is the
> > > consequence of one or more perfectly rational decisions. Some of those
> > > decisions have had unintended consequences, and others were made
> > > knowing that we would be incurring some technical debt; debt we have
> > > not had the time or resources to address. Fixing all these is not
> > > impossible, it just takes the dedication of resources by the community.
> > >
> > > I do not have a complete design for what the new Trove would look like.
> > > For example, I don’t know how we will interact with other projects
> > > (like Heat). Many questions remain to be explored and answered.
> > >
> > > Would it suffice to just use the existing Heat resources and build
> > > templates around those, or will it be better to implement custom Trove
> > > resources and then orchestrate things based on those resources?
> > >
> > > Would Trove implement the workflows required for multi-stage database
> > > operations by itself, or would it rely on some other project (say
> > > Mistral) for this? Is Mistral really a workflow service, or just cron
> > > on steroids? I don’t know the answer but I would like to find out.
> > >
> > > While we don’t have the answers to these questions, I think this is a
> > > conversation that we must have, one that we must decide on, and then
> > > as a community commit the resources required to make a Trove v2 which
> > > delivers on the mission of the project; “To provide scalable and
> > > reliable Cloud Database as a Service provisioning functionality for
> > > both relational and non-relational database engines, and to continue
> > > to improve its fully-featured and extensible open source
> > > framework.”[2]
> > >
> > > Thanks,
> > >
> > > -amrith
> > >
> > >
> > > [1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
> > > [2] https://wiki.openstack.org/wiki/Trove#Mission_Statement
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
