These are great points, Liran, and they are closely related to one another.

I agree that the SB DB could be entirely in memory, though of course it should be replicated for high availability. As a bonus, replication of an in-memory data structure is easier than replication of a durable one - atomic broadcast could be used.

There are questions of what happens in the extremely rare event that the entire SB cluster goes down and its state is lost - I'm not sure how realistic that is. There are also questions about how to do an upgrade of the SB cluster. I'm assuming that the NB2SB translations could be regenerated, but there may also be data in the SB DB that is written by the ovn-controller agents, or by other agents such as ovn-controller-vtep. It may be possible to "replay" that as well, in case of a disconnect. In other words, northd (the translator) is the authoritative owner of the entries it writes into the SB DB. Other processes that write into the SB DB, such as ovn-controller-vtep, could be considered authoritative for their own state, so that they could replay it into the SB DB.

Ben indicated that the size of the data isn't too large, but I agree that sending updates of everything to thousands of clients could become problematic, particularly if the churn in the system is high, such as in a container environment. A pub/sub-like mechanism could definitely work here - it wouldn't even need to be too fancy, just per-table interest, perhaps.

Another thing to add to the protocol would be per-table versioning, so that when a client gets disconnected and happens to reconnect to another server in the cluster, it can exchange table versions and resync, catching up by getting changes from a changelog without necessarily downloading a full snapshot. Perhaps OVSDB already works like this and I'm demonstrating my ignorance. :)

On Thu, Mar 10, 2016 at 11:15 PM, Liran Schour <lir...@il.ibm.com> wrote:
> I'd like to raise the following issues for discussion:
>
> 1. That the client side is abstracted from the specific choice of
> server-side database by using a db-abstraction layer on the client side.
> We already have some kind of an abstraction layer in the code: ovsdb-idl.
> Maybe we can start from there.
> 2. I think if clients receive all updates this will pose a scalability
> concern. I'd like to propose adding a pub/sub-like subsystem to be used
> for keeping clients up-to-date about updates in the DB. This can serve as
> a mechanism for table tracking and could also enable clients to receive
> updates only on a small subset of changes instead of all changes. This
> would greatly improve scalability in number of clients.
> 3. Do we really need an on-disk DB for the southbound DB? I think an
> in-memory DB for the Southbound is worth discussing.
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
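
To make the per-table interest and per-table versioning idea above a bit more concrete, here is a rough Python sketch. It is not how OVSDB or ovsdb-idl works today; every class and method name in it (TableReplica, Server, Client, resync, sync) is hypothetical and made up for illustration. It assumes that all servers in the cluster apply the same updates in the same order (for example via atomic broadcast), so that a table version means the same thing on every server, and that each server keeps only a bounded changelog per table.

```python
from collections import deque


class TableReplica:
    """One server's in-memory copy of a single table plus a bounded changelog."""

    def __init__(self, max_log=1000):
        self.rows = {}                    # row key -> value
        self.version = 0                  # bumped once per applied update
        self.log = deque(maxlen=max_log)  # entries: (version, key, value or None)

    def apply(self, key, value):
        """Apply one update; value=None deletes the row."""
        self.version += 1
        if value is None:
            self.rows.pop(key, None)
        else:
            self.rows[key] = value
        self.log.append((self.version, key, value))

    def changes_since(self, client_version):
        """Return ("delta", changes) if the changelog still covers the gap,
        otherwise ("snapshot", copy of all rows)."""
        if client_version == self.version:
            return "delta", []
        oldest = self.log[0][0] if self.log else self.version + 1
        if client_version > self.version or client_version + 1 < oldest:
            return "snapshot", dict(self.rows)
        return "delta", [c for c in self.log if c[0] > client_version]


class Server:
    """Hypothetical cluster member holding one TableReplica per table."""

    def __init__(self, max_log=1000):
        self.max_log = max_log
        self.tables = {}

    def apply(self, table, key, value):
        if table not in self.tables:
            self.tables[table] = TableReplica(self.max_log)
        self.tables[table].apply(key, value)

    def resync(self, client_versions):
        """client_versions maps table name -> last version the client saw.
        Only the tables listed there are returned (per-table interest)."""
        reply = {}
        for table, seen in client_versions.items():
            replica = self.tables.get(table)
            if replica is not None:
                kind, payload = replica.changes_since(seen)
                reply[table] = (kind, replica.version, payload)
        return reply


class Client:
    """Hypothetical client that tracks only the tables it cares about."""

    def __init__(self, interests):
        self.versions = {t: 0 for t in interests}
        self.rows = {t: {} for t in interests}

    def sync(self, server):
        """Exchange table versions with any server and catch up."""
        for table, (kind, version, payload) in server.resync(self.versions).items():
            if kind == "snapshot":
                self.rows[table] = dict(payload)
            else:
                for _, key, value in payload:
                    if value is None:
                        self.rows[table].pop(key, None)
                    else:
                        self.rows[table][key] = value
            self.versions[table] = version
```

With two Server objects fed the same updates in the same order, a Client created with interest in, say, only "Port_Binding" can call sync() against one server, then later against the other, and receive just the missed changes for that table. It falls back to a full snapshot only when it is further behind than the changelog reaches. Table names here are plain strings; nothing in the sketch depends on the real SB schema.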