On 4 July 2014 11:24, William Reade <william.re...@canonical.com> wrote:
> My expectation is that:
>
> 1) We certainly need the environment UUID as a separate field for the shard
> key.
> 2) We *also* need the environment UUID as an _id prefix to keep our watchers
> sane.
> 2a) If we had separate collections per environment, we wouldn't; but AIUI,
> scaling mongo by adding collections tends to end badly (I don't have direct
> experience here myself; but it does indeed seem that we'd start consuming
> namespaces at a pretty terrifying rate, and I'm inclined to trust those who
> have done this and failed.)
> 2b) I'd ordinarily dislike the duplication across the _id and uuid fields,
> but there's a clear reason for doing so here, so I'm not going to complain.
> I *will* continue to complain about documents that duplicate info across
> fields in order to save a few runtime microseconds here and there ;).
>
> If someone with direct experience can chip in reassuringly I *might* be
> prepared to back off on the N-collections-per-environment thing, but I'm
> certainly not willing to take it so far as to separate the txn logs and thus
> discard consistency across environments: I think there will certainly be
> references between individual hosted environments and the initial
> environment.

It can be a great advantage when scaling to be able to partition the
transactions across different parts of the database. If we want this to
be able to scale, I think we *have* to make it work without requiring
transactions across environments. There is no way that we can scale
as far as we'd like to by using a single mongo replica set for all environments.

This talk is about mysql, not mongo, but I believe some of the lessons
are relevant to us. https://www.youtube.com/watch?v=qATTTSg6zXk

By my calculations, with a maximum-sized namespace file, a single
mongo should be able to support over 90000 environments
using a separate collection-set for each environment.
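
For anyone who wants to redo that arithmetic, here is a rough sketch in Go.
The inputs are assumptions on my part, not measurements: about 628 bytes per
namespace entry and a 2047MB ceiling on the namespace file (the figures I
recall from the mongo docs for the current storage engine), plus a guessed
number of namespaces (collections plus their indexes) per environment. The
names maxNamespaces and maxEnvironments are made up for illustration.

```go
package main

import "fmt"

const (
	// Assumed size of one entry in the .ns file, in bytes.
	nsEntryBytes = 628
	// Assumed maximum size of the .ns file: 2047MB.
	maxNsFileBytes = 2047 * 1024 * 1024
)

// maxNamespaces returns how many namespace entries fit in a
// namespace file of the given size.
func maxNamespaces(nsFileBytes int64) int64 {
	return nsFileBytes / nsEntryBytes
}

// maxEnvironments returns how many environments fit if each one
// consumes namespacesPerEnv namespaces (collections plus indexes).
func maxEnvironments(nsFileBytes, namespacesPerEnv int64) int64 {
	return maxNamespaces(nsFileBytes) / namespacesPerEnv
}

func main() {
	// The per-environment counts are guesses; substitute the real
	// number of collections and indexes juju creates.
	for _, perEnv := range []int64{20, 30, 37} {
		fmt.Printf("%d namespaces/env -> %d environments\n",
			perEnv, maxEnvironments(maxNsFileBytes, perEnv))
	}
}
```

Under those assumptions the limit stays above 90000 as long as an
environment needs no more than 37 namespaces. It says nothing about how
mongo behaves with that many collections, only that the namespace file is
not the first wall we would hit.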

From my recollection of juju performance, we will be lucky to scale
a single mongo up to 1000 environments, let alone 90000, so I suspect we'd never
get remotely that far. Perhaps there are other disadvantages
to having many collections though.

It would be nice if we could make this crucial architectural decision in
the light of some actual measurements. We may all have some kind
of gut feeling for how this might perform, but without actually measuring,
we just don't know.

As usual, my first reaction is KISS.

  cheers,
    rog.

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev
