Excerpts from Andrew Laski's message of 2016-05-03 14:46:08 -0700:
>
> On Mon, May 2, 2016, at 01:13 PM, Edward Leafe wrote:
> > On May 2, 2016, at 10:51 AM, Mike Bayer <mba...@redhat.com> wrote:
> >
> > >> Concretely, we think that there are three possible approaches:
> > >> 1) We can use the SQLAlchemy API as the common denominator between a
> > >> relational and non-relational implementation of the db.api component.
> > >> These two implementation could continue to converge by sharing a large
> > >> amount of code.
> > >> 2) We create a new non-relational implementation (from scratch) of
> > >> the db.api component. It would require probably more work.
> > >> 3) We are also studying a last alternative: writing a SQLAlchemy
> > >> engine that targets NewSQL databases (scalability + ACID):
> > >> - https://github.com/cockroachdb/cockroach
> > >> - https://github.com/pingcap/tidb
> > >
> > > Going with a NewSQL backend is by far the best approach here. That way,
> > > very little needs to be reinvented and the application's approach to data
> > > doesn't need to dramatically change.
> >
> > I’m glad that Matthieu responded, but I did want to emphasize one thing:
> > of *course* this isn’t an ideal approach, but it *is* a practical one.
> > The biggest problem in any change like this isn’t getting it to work, or
> > to perform better, or anything else except being able to make the change
> > while disrupting as little of the existing code as possible. Taking an
> > approach that would be more efficient would be a non-starter since it
> > wouldn’t provide a clean upgrade path for existing deployments.
>
> I would like to point out that this same logic applies to the current
> cellsv2 effort. It is a very practical set of changes which allows Nova
> to move forward with only minor effort on the part of deployers. And it
> moves towards a model that is already used and well understood by large
> deployers of Nova while also learning from the shortcomings of the
> previous architecture. In short, much of this is already battle tested
> and proven.
>
> If we started Nova from scratch, I hear golang is lovely for this sort
> of thing, would we do things differently? Probably. However that's not
> the position we're in. And we're able to make measurable progress with
> cellsv2 at the moment and have a pretty clear idea of the end state. I
> can recall conversations about NoSQL as far back as the San Diego
> summit, which was my first so I can't say they didn't happen previously,
> and this is the first time I've seen any measurable progress on moving
> forward with it. But where it would go is not at all clear.
>

I beg to differ about "pretty clear idea of the end state".

* There's no clear answer about scheduling. It's a high level "we'll
  give it a scheduler/resource tracker database of its own". But that's
  a massive amount of work just to design the migrations and solidify
  the API. I understand some of that work is ongoing and unrelated to
  cells v2, but it's not done or clear yet.

* This also doesn't address the fact that for cellsv1 users a move like
  that will _regress_ scheduler scalability since now we can only have
  one scheduler and resource tracker instead of many. For those of us
  just now ramping up, it leaves us with no way to get high throughput
  on our scheduler.

* Further, if there's a central scheduler, that means all of the sort
  of clever scheduling hacks that people have achieved with cells v1 (a
  cell of baremetal, a cell of SSD, etc) will need to be done via other
  means, which is more design work that needs to happen.

* There's no clear way to efficiently list and sort results from lots
  of cells. The discussion came up with a few experiments to try, but
  the problem is _fundamental_ to sharding, and the cells v1 answer was
  a duplication of data which obviously cells v2 wants to avoid, and I
  would assume with good reason.

I have a huge amount of respect for what has been achieved with cells
v1, and I totally understand the hesitance to promote the way it works
given what cells v1 has taught its users. However, the design of v2 is
quite a bit different than v1, enough so that I think it should be
treated as an experiment until someone has a solid design of the whole
thing and can assert that it actually addresses scale without
regressing things significantly.

Meanwhile, there are other things deployers can do to address scale
that will likely cause less churn in Nova, and may even help other
projects scale to a similar size. I intend to return to my pursuit of
actual experiment results for these things now that I understand the
state of cells v2.
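
To make the list-and-sort concern above concrete, here is a minimal
sketch of the naive approach when no data is duplicated centrally: ask
every cell for its own sorted page, then merge the pages. Nothing here
is Nova code. The function name, the cell names, and the record layout
are all hypothetical, and the in-memory sort stands in for a per-cell
"ORDER BY ... LIMIT" query.

```python
import heapq
from itertools import islice


def list_sorted_across_cells(cells, sort_key, limit):
    """Return the first `limit` records in global sort order.

    `cells` maps a (hypothetical) cell name to the records held there.
    """
    per_cell_pages = []
    for _name, records in cells.items():
        # Stand-in for a per-cell query sorted by sort_key with a LIMIT.
        # Every cell must return up to `limit` rows, because any one
        # cell could hold all of the globally smallest records.
        page = sorted(records, key=sort_key)[:limit]
        per_cell_pages.append(page)

    # k-way merge of the already-sorted pages, cut off at `limit`.
    merged = heapq.merge(*per_cell_pages, key=sort_key)
    return list(islice(merged, limit))


# Made-up example data: three cells, sorted by creation time.
cells = {
    "cell-a": [{"name": "vm3", "created": 30}, {"name": "vm1", "created": 10}],
    "cell-b": [{"name": "vm2", "created": 20}, {"name": "vm5", "created": 50}],
    "cell-c": [{"name": "vm4", "created": 40}],
}
first_three = list_sorted_across_cells(cells, lambda r: r["created"], 3)
print([r["name"] for r in first_three])
```

In this sketch, returning `limit` rows can mean fetching up to `limit`
rows from every cell, so the work grows with the number of cells, and
paging deeper into the results only makes that worse. That is the cost
the cells v1 duplication of data avoided.
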
I hope others will consider this path as well, so we can collaborate on
things like 0mq and better database connection handling.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev