On 05/02/2016 01:48 PM, Clint Byrum wrote:


> FWIW, I agree with you. If you're going to use SQLAlchemy, use it to
> take advantage of the relational model.
>
> However, how is what you describe a win? Whether you use SELECT .. FOR
> UPDATE, or a stored procedure, the lock is not distributed, and thus, will
> still suffer rollback failures in Galera. For single DB server setups, you
> don't have to worry about that, and SELECT .. FOR UPDATE will work fine.

Well, it's a "win" versus the lesser approach that was under consideration, which also did not include a distributed locking system like ZooKeeper. It is also a win even with a ZooKeeper-like system in place, because it allows a SQL query to be much smarter about selecting data that involves IP numbers and CIDRs, without the need to pull that data into memory and process it there. That is the most common mistake in SQL programming: not taking advantage of SQL's set-based nature and instead pulling data into memory unnecessarily.
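
To make that concrete, here is roughly the shape of the query I'm talking about, as a sketch using SQLAlchemy Core; the "subnets" table layout and the "cidr_overlaps" database function here are hypothetical stand-ins for the real schema and the real function under review:

    # sketch only: table layout and function name are illustrative
    from sqlalchemy import (
        create_engine, MetaData, Table, Column, Integer, String, select, func)

    metadata = MetaData()
    subnets = Table(
        "subnets", metadata,
        Column("id", Integer, primary_key=True),
        Column("cidr", String(43)),
    )

    engine = create_engine("mysql+pymysql://user:pass@host/neutron")
    requested = "10.0.1.0/24"

    # the overlap test runs in the database, over the whole set at once,
    # rather than fetching every row and re-testing each CIDR in Python
    stmt = select([subnets.c.id, subnets.c.cidr]).where(
        func.cidr_overlaps(subnets.c.cidr, requested))

    with engine.connect() as conn:
        conflicting = conn.execute(stmt).fetchall()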

Also, the "federated MySQL" approach of Cells V2 would still be OK with pessimistic locking, since the lock is not "distributed" across the entire dataspace. Only the usual Galera caveats apply, e.g. point to only one Galera "master" at a time and/or wait for Galera to support SELECT .. FOR UPDATE across the cluster.
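
And for completeness, the pessimistic-lock side of it is just the ordinary SELECT .. FOR UPDATE pattern. A minimal sketch, assuming a single writable node; the Subnet model and session setup here are placeholders, not Neutron's actual code:

    # sketch only: placeholder model and session, single-writer assumption
    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Subnet(Base):
        __tablename__ = "subnets"
        id = Column(Integer, primary_key=True)
        network_id = Column(String(36))
        cidr = Column(String(43))

    engine = create_engine("mysql+pymysql://user:pass@host/neutron")
    Session = sessionmaker(bind=engine)
    session = Session()

    # with_for_update() renders SELECT ... FOR UPDATE; the matched rows
    # stay locked until this transaction commits or rolls back
    locked = (session.query(Subnet)
              .filter(Subnet.network_id == "some-network-uuid")
              .with_for_update()
              .all())

    # ... perform the allocation against the locked rows ...
    session.commit()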



> Furthermore, any logic that happens inside the database server is extra
> load on a much much much harder resource to scale, using code that is
> much more complicated to update.

So I was careful to use the term "stored function" and not "stored procedure". Ironic as it is for me to defend both the ORM business-logic-in-the-application-not-the-database position *and* the let-the-database-do-things-not-the-application position at the same time, using database functions to provide new kinds of math and comparison operations over sets is entirely reasonable. It should not be confused with the old-school, big-business approach of building the entire business logic layer as a huge wall of stored procedures; this is nothing like that.

The PostgreSQL database has native INET and CIDR types which include the same overlap logic we are implementing here as a MySQL stored function, so the addition of math functions like these shouldn't be controversial. The "load" of such a function is completely negligible (though I would be glad to assist in load testing it to confirm), especially compared to pulling the same data across the wire, processing it in Python, and then sending just a tiny portion of it back again after we've extracted the needle from the haystack.
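
To give a sense of what "math function" means here, a rough sketch of such a function follows. It is IPv4-only, skips input validation, and is not the exact function proposed; it's just an illustration of the scale of logic involved. (On PostgreSQL, the inet type's && overlap operator does this natively, if I recall correctly.)

    # sketch only: not the actual function under review; IPv4, no validation
    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://user:pass@host/neutron")

    cidr_overlaps = text("""
    CREATE FUNCTION cidr_overlaps(a VARCHAR(43), b VARCHAR(43))
    RETURNS TINYINT DETERMINISTIC
    BEGIN
      DECLARE a_lo, a_hi, b_lo, b_hi BIGINT UNSIGNED;
      -- network start = address ANDed with the prefix mask;
      -- network end = start plus the size of the host range
      SET a_lo = INET_ATON(SUBSTRING_INDEX(a, '/', 1))
                 & (0xFFFFFFFF << (32 - SUBSTRING_INDEX(a, '/', -1)));
      SET a_hi = a_lo + (1 << (32 - SUBSTRING_INDEX(a, '/', -1))) - 1;
      SET b_lo = INET_ATON(SUBSTRING_INDEX(b, '/', 1))
                 & (0xFFFFFFFF << (32 - SUBSTRING_INDEX(b, '/', -1)));
      SET b_hi = b_lo + (1 << (32 - SUBSTRING_INDEX(b, '/', -1))) - 1;
      -- two ranges overlap if each one starts before the other ends
      RETURN a_lo <= b_hi AND b_lo <= a_hi;
    END
    """)

    with engine.connect() as conn:
        conn.execute(cidr_overlaps)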

In pretty much every kind of load testing scenario we do with OpenStack, the actual "load" on the database barely pushes anything. The only database "resource" issue we have is OpenStack using far more idle connections than it should; that one is on me to address with improvements to the connection pooling system, which does not scale well across OpenStack's tons-of-processes model.
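
For reference, the knobs in question are the pool parameters on create_engine(); the values below are made up for illustration, since the right numbers are per-deployment tuning, and every worker process holds its own pool:

    # illustrative values only; each OpenStack worker process gets its own pool
    from sqlalchemy import create_engine

    engine = create_engine(
        "mysql+pymysql://user:pass@host/nova",
        pool_size=5,        # connections held open per process
        max_overflow=10,    # extra connections permitted under burst
        pool_timeout=30,    # seconds to wait for a free connection
        pool_recycle=3600,  # refresh connections before they go stale
    )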



> To be clear, it's not the amount of data, but the size of the failure
> domain. We're more worried about what will happen to those 40,000 open
> connections from our 4000 servers when we do have to violently move them.

That's a really big number, and I will admit I would need to dig into this particular problem domain more deeply to understand the rationale for that kind of scale. But it does seem like, if you were using SQL databases and the 4000-server system were in fact grouped into hundreds of "silos" that each deal only with a strict segment of the total dataspace, a federated approach would be exactly what you'd want to go with.



> That particular problem isn't as scary if you have a large
> Cassandra/MongoDB/Riak/ROME cluster, as the client libraries are
> generally connecting to all or most of the nodes already, and will
> simply use a different connection if the initial one fails. However,
> these other systems also bring a whole host of new problems which the
> simpler SQL approach doesn't have.

Regarding ROME, I only seek to make the point that if you're going to switch to NoSQL, you have to switch to NoSQL. Bolting SQLAlchemy on top of Redis without a mature and widely-proven relational layer in between, down to the level of replicating the actual tables that were built within a relational schema, is a denial of the reality of the problem to be solved.




> So it's worth doing an actual analysis of the failure handling before
> jumping to the conclusion that a pile of cells/sharding code or a rewrite
> to use a distributed database would be of benefit.
