These are great points, Liran, and they are closely related to one
another.

I agree that the SB DB could be entirely in memory - though for high
availability it should of course be replicated. As a bonus, replication of
an in-memory data structure is easier than of a durable data structure -
atomic broadcast could be used. There are questions of what happens in the
extremely rare eventuality that the entire SB cluster goes down and its
state is lost - I'm not sure how realistic that is. There are also
questions about how to do an upgrade of the SB cluster. I'm assuming that
the NB2SB translations could be regenerated, but there may also be data in
the SB DB that comes from (is written by) the ovn-controller agents, or
some other agents, like ovn-controller-vtep. It may be possible to "replay"
that as well, in case of a disconnect. In other words, northd (the
translator) is the authoritative owner of the entries it writes into the SB
DB. There may be other processes that write into the SB DB, such as the
ovn-controller-vtep, and we could consider such processes as authoritative
for that state, so that they could replay it into the SB DB.
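
To make the replay idea concrete, here is a rough Python sketch. It is only
an illustration under stated assumptions: the InMemorySB and Writer classes
are hypothetical and do not exist in OVN or OVSDB, and the table names and
row contents are used purely as examples. Each writer keeps its own copy of
the rows it is authoritative for and pushes them back in after the store
loses its state.

```python
# Hypothetical sketch: InMemorySB and Writer are not part of OVN or OVSDB.


class InMemorySB:
    """Toy in-memory southbound store; each row is tagged with its writer."""

    def __init__(self):
        self.rows = {}  # (table, key) -> (owner, value)

    def write(self, owner, table, key, value):
        self.rows[(table, key)] = (owner, value)

    def lose_state(self):
        # Models the whole cluster going down with nothing on disk.
        self.rows.clear()


class Writer:
    """A process that is authoritative for the rows it writes."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # (table, key) -> value, the writer's own copy

    def put(self, db, table, key, value):
        self.state[(table, key)] = value
        db.write(self.name, table, key, value)

    def replay(self, db):
        # After a disconnect, push everything this writer owns back in.
        for (table, key), value in self.state.items():
            db.write(self.name, table, key, value)


db = InMemorySB()
northd = Writer("northd")
vtep = Writer("ovn-controller-vtep")
northd.put(db, "Logical_Flow", "lf1", {"match": "ip4"})  # example row only
vtep.put(db, "Chassis", "vtep1", {"encap": "vxlan"})     # example row only

before = dict(db.rows)
db.lose_state()
for writer in (northd, vtep):
    writer.replay(db)
restored = db.rows == before
```

The sketch ignores ordering between writers and any rows whose owner is
gone for good; those are exactly the open questions above.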

Ben indicated that the size of the data isn't too large, but I agree that
sending updates of everything to thousands of clients could become
problematic, particularly if the churn in the system is high, such as in a
container environment. A pub/sub-like mechanism could definitely work here
- it wouldn't even need to be too fancy, just per-table interest, perhaps.
Another thing to add to the protocol would be per-table versioning, so that
when a client gets disconnected, if it happens to reconnect to another
server in the cluster, it can exchange table versions and resync, coming up
to speed by getting changes from a changelog, without necessarily
downloading a full snapshot. Perhaps OVSDB already works like this and I'm
demonstrating my ignorance. :)
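
A minimal Python sketch of per-table interest plus per-table versioning
might look like the following. It is not the OVSDB wire protocol: Server,
Client, broadcast and the method names are all hypothetical, and it assumes
every replica sees the same writes in the same order (which is what atomic
broadcast would provide), so that version numbers agree across servers.

```python
# Hypothetical sketch; this is not how OVSDB is known to work.


class Server:
    def __init__(self):
        self.version = {}    # table -> latest version number
        self.changelog = {}  # table -> list of (version, key, value)

    def apply(self, table, key, value):
        v = self.version.get(table, 0) + 1
        self.version[table] = v
        self.changelog.setdefault(table, []).append((v, key, value))

    def changes_since(self, table, client_version):
        # Only entries newer than what the client has already applied.
        log = self.changelog.get(table, [])
        return [entry for entry in log if entry[0] > client_version]


class Client:
    def __init__(self, interests):
        self.interests = list(interests)         # per-table interest
        self.seen = {t: 0 for t in interests}    # table -> last version applied
        self.data = {t: {} for t in interests}   # local copy of each table

    def resync(self, server):
        # Exchange table versions and pull only the missing changes.
        received = 0
        for table in self.interests:
            for version, key, value in server.changes_since(table, self.seen[table]):
                self.data[table][key] = value
                self.seen[table] = version
                received += 1
        return received


a, b = Server(), Server()


def broadcast(table, key, value):
    # Stand-in for replication: every server applies writes in the same order.
    for server in (a, b):
        server.apply(table, key, value)


client = Client(["Port_Binding"])          # example table name only
broadcast("Port_Binding", "p1", "hv1")
broadcast("Logical_Flow", "lf1", "flow")   # client has no interest in this table
first = client.resync(a)                   # connected to server a
broadcast("Port_Binding", "p2", "hv2")
second = client.resync(b)                  # reconnects to server b instead
```

A real changelog would have to be trimmed at some point, so a client that
is too far behind would still need to fall back to a full snapshot.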

On Thu, Mar 10, 2016 at 11:15 PM, Liran Schour <lir...@il.ibm.com> wrote:

> I'd like to raise the following issues for discussion:
>
> 1. That the client side is abstracted from the specific choice of
> server-side database by using a db-abstraction layer on the client side.
> We already have some kind of an abstraction layer in the code: ovsdb-idl.
> Maybe we can start from there.
> 2. I think if clients receive all updates this will pose a scalability
> concern. I'd like to propose adding a pub/sub-like subsystem to be used
> for keeping clients up-to-date about updates in the DB. This can serve as
> a mechanism for table tracking and could also enable clients to receive
> updates only on a small subset of changes instead of all changes. This
> would greatly improve scalability in number of clients.
> 3. Do we really need an on-disk DB for the southbound DB? I think an
> in-memory DB for the Southbound is worth discussing.
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>