On 05/23/2017 07:23 AM, Chris Dent wrote:
That "higher dev cost" is one of my objections to the 'active'
approach but it is another implication that worries me more. If we
limit deployer architecture choices at the persistence layer then it
seems very likely that we will be tempted to build more and more
power and control into the persistence layer rather than in the
so-called "business" layer. In my experience this is a recipe for
ossification. The persistence layer needs to be dumb and
replaceable.

Err, in my experience, having a *completely* dumb persistence layer -- i.e. one that tries to paper over the differences between, say, relational and non-relational stores -- is a recipe for disaster. The developer just ends up writing join constructs in the business layer instead of using a relational data store the way it is intended to be used. The same goes for aggregate operations. [1]

Now, if what you're referring to is "don't use vendor-specific extensions in your persistence layer", then yes, I agree with you.

Best,
-jay

[1] Witness the join constructs in the Kubernetes Golang code that work around etcd not being a relational data store:

https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/deployment_controller.go#L528-L556

Instead of a single SQL statement:

SELECT p.* FROM pods AS p
JOIN deployments AS d
ON p.deployment_id = d.id
WHERE d.name = $name;

the deployment controller code has to read every Pod message from etcd, loop through each one, and build the list of Pods that match the Deployment being searched for.
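
For illustration only -- the types and helper below are hypothetical simplifications I'm making up, not the actual controller code -- the "join in the business layer" ends up shaped roughly like this:

package main

import "fmt"

// Hypothetical, simplified stand-ins for the real API objects.
type Deployment struct {
    Name     string
    Selector map[string]string // label selector
}

type Pod struct {
    Name   string
    Labels map[string]string
}

// labelsMatch reports whether every key/value pair in the selector is
// present in the Pod's labels.
func labelsMatch(selector, labels map[string]string) bool {
    for k, v := range selector {
        if labels[k] != v {
            return false
        }
    }
    return true
}

// podsForDeployment is the client-side equivalent of the SQL join above:
// read *every* Pod back from the store and keep only the ones whose
// labels match the Deployment's selector.
func podsForDeployment(d Deployment, allPods []Pod) []Pod {
    var matched []Pod
    for _, p := range allPods {
        if labelsMatch(d.Selector, p.Labels) {
            matched = append(matched, p)
        }
    }
    return matched
}

func main() {
    d := Deployment{Name: "web", Selector: map[string]string{"app": "web"}}
    pods := []Pod{
        {Name: "web-1", Labels: map[string]string{"app": "web"}},
        {Name: "db-1", Labels: map[string]string{"app": "db"}},
    }
    fmt.Println(podsForDeployment(d, pods)) // prints the single matching Pod
}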

Similarly, the Kubernetes API does not support any aggregate (SUM, GROUP BY, etc.) functionality. Instead, clients are required to perform these kinds of calculations in memory. This is because etcd, being an (awesome) key/value store, is not designed for aggregate operations (just as Cassandra, for example, only supports a limited set of aggregate operations).
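
Again, a hypothetical sketch (the Pod type here is a made-up stand-in, not the real API object) of what a GROUP BY looks like when the client has to do it in memory:

package main

import "fmt"

// Hypothetical, simplified Pod; not the real API object.
type Pod struct {
    Name     string
    NodeName string
}

// countPodsPerNode is the in-memory equivalent of
//   SELECT node_name, COUNT(*) FROM pods GROUP BY node_name;
// every Pod record has to be read back to the client and counted there.
func countPodsPerNode(allPods []Pod) map[string]int {
    counts := make(map[string]int)
    for _, p := range allPods {
        counts[p.NodeName]++
    }
    return counts
}

func main() {
    pods := []Pod{
        {Name: "web-1", NodeName: "node-a"},
        {Name: "web-2", NodeName: "node-a"},
        {Name: "db-1", NodeName: "node-b"},
    }
    fmt.Println(countPodsPerNode(pods)) // map[node-a:2 node-b:1]
}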

My point here is not to denigrate Kubernetes. Far from it. They (to date) have a relatively shallow relational schema, and doing join and index maintenance [2] operations in client-side code has so far been a cost the project has been OK carrying. The point I'm trying to make is that the choice of data store semantics (relational or not, columnar or not, eventually consistent or not, etc.) *does make a difference* to the architecture of a project, its deployment, and the amount of code the project needs to maintain to properly handle its data schema. There's no way -- in my experience -- to make a "persistence layer" that papers over these differences and still ends up being useful.

[2] In Kubernetes, all services are required to keep all relevant data in memory:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/principles.md

This means that code that maintains a bunch of in-memory indexes of various data objects ends up being placed into every component. Here's an example of this in the kubelet's (the equivalent-ish of the nova-compute daemon) pod manager, which keeps an index of pods and mirror pods in memory:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L104-L114

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L159-L181
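
To give a rough idea of the shape of that bookkeeping -- this is a hypothetical, much-simplified sketch, not the real pod manager code -- every component that wants lookups by more than one key ends up writing something like:

package main

import (
    "fmt"
    "sync"
)

// Hypothetical, much-simplified Pod; not the real API object.
type Pod struct {
    UID      string
    FullName string // "namespace/name"
}

// podIndex keeps two in-memory indexes over the same Pods, guarded by a
// lock, so callers can look a Pod up by either key.
type podIndex struct {
    mu            sync.RWMutex
    podByUID      map[string]*Pod
    podByFullName map[string]*Pod
}

func newPodIndex() *podIndex {
    return &podIndex{
        podByUID:      make(map[string]*Pod),
        podByFullName: make(map[string]*Pod),
    }
}

// Add has to keep both maps consistent with each other; every component
// that needs this kind of lookup carries (and tests) code like it.
func (i *podIndex) Add(p *Pod) {
    i.mu.Lock()
    defer i.mu.Unlock()
    i.podByUID[p.UID] = p
    i.podByFullName[p.FullName] = p
}

func (i *podIndex) GetByFullName(name string) (*Pod, bool) {
    i.mu.RLock()
    defer i.mu.RUnlock()
    p, ok := i.podByFullName[name]
    return p, ok
}

func main() {
    idx := newPodIndex()
    idx.Add(&Pod{UID: "1234", FullName: "default/web-1"})
    if p, ok := idx.GetByFullName("default/web-1"); ok {
        fmt.Println(p.UID) // 1234
    }
}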
