On Wed, Jun 29, 2016 at 6:38 PM, Andy Zhou <az...@ovn.org> wrote:

> On Wed, Jun 22, 2016 at 10:43 AM, Ben Pfaff <b...@ovn.org> wrote:
>
>> On Wed, Jun 22, 2016 at 01:56:17AM -0700, Andy Zhou wrote:
>> > > 3. How should the OVN databases be arranged within etcd? There are
>> > >    multiple possibilities:
>> > >
>> > >    - Define OVSDB bindings to etcd and implement those bindings in the
>> > >      OVSDB client libraries (C and Python).
>> > >
>> > >    - Define OVSDB bindings to etcd and implement those bindings in the
>> > >      OVSDB server (so that ovsdb-server uses etcd as a storage layer).
>> > >
>> > >    - Define a native etcd schema for OVN SB (and probably NB) database
>> > >      and make ovn-controller and ovn-northd use it natively.
>> >
>> > It would be nice to be able to reuse the current schema definition.
>> > Option #3 makes this not a hard requirement, but having a schema makes
>> > it much easier to maintain changes across releases -- for example,
>> > upgrades due to schema version changes.
>> >
>> > Both options #1 and #2 above require us to figure out how DB, TABLE
>> > and COLUMNS logically map to a key-value store. Just for discussion
>> > purposes, let's say the keys are in the format
>> > db/table/<row-uuid>/column.
>> >
>> > OVSDB supports complex value types such as sets and maps. Those can
>> > also be supported with the following format:
>> > db/table/<row-uuid>/column/set-key (with a fixed value, say, "set")
>> > or db/table/<row-uuid>/column/map-key.
>> >
>> > To optimize certain key range queries (i.e. the benefits that can be
>> > realized by conditional monitoring), we can declare certain columns
>> > to be a prefix of the <row-uuid>. One possible way is to enhance the
>> > current schema definition to add a "priority" field for each column.
>> > "Normal" columns have the lowest priority by default. When C1 has a
>> > higher priority than C2, and both have non-default priority, the etcd
>> > key layout can be:
>> > db/table/c1<value>/c2<value>/<row-uuid>/columns.
>> >
>> > With this key layout, rows that match a particular c1 value (or
>> > c1 && c2) can be "watched". This is not as general as conditional
>> > monitoring, but may be sufficient for OVN SB's current use cases.
>> >
>> > Enforcing constraints expressed in the schema can be tricky for #1.
>> > Some of the possible solutions are:
>> >
>> > The value constraints expressed by the schema are not going to be
>> > enforced by etcd. One possible solution here is to have all clients
>> > that issue transactions enforce constraints before issuing.
>> >
>> > Referential integrity can also be enforced by the client. Logically,
>> > we can have a dedicated client that enforces referential integrity.
>> > (It can be combined into one of the clients in practice.) Ideally we
>> > would like both the original transaction and the referential
>> > integrity changes to appear as one transaction to the client (at
>> > least the clients of the idl layer). This may need additional logic
>> > that OVN needs to build and that is not currently provided by etcd
>> > -- I don't know if this is a deal breaker.
>> >
>> > To me, #2 seems to make the overall system more complex and less
>> > efficient than #1.
>>
>> Thanks for all the thoughts! I agree with all of these ideas, at least
>> at first glance. They are very close to what I was thinking too. It's
>> good that we're on the same page.
>
> This is one possible way to implement referential integrity with etcd:
>
> * DB-wide versioning.
>
>   Assign a key db/version that stores a DB-wide transaction id. Assume
>   the id starts at 0. Any client-issued transaction on the DB should
>   also include this key; a transaction will increase its value by 1.
>   Any etcd client transaction will always bring this version number
>   from an even to an odd number.
>
>   No further transaction can be issued until the value of "db/version"
>   becomes even.
>
> * A dedicated client enforces referential integrity.
>
>   There is a dedicated etcd client whose job is to enforce referential
>   integrity. It starts to run when the version number is odd, and
>   commits the next transaction that "fixes" etcd. The version number
>   is increased even if there is nothing to fix.
>
>   In the HA setup, referential integrity checking clients should run
>   on the same machines that run etcd. Only the etcd client that runs
>   on the same machine as the etcd leader will actively enforce
>   referential integrity. Other clients will run in standby mode, and
>   only become active when their local etcd server becomes the leader.
>
> Will this work?
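
To make the even/odd scheme quoted above concrete, here is a rough Python
sketch. It is only a toy: a dict stands in for etcd, and ToyStore,
client_txn, integrity_txn, row_key and drop_dangling_refs are names made
up for illustration, not part of etcd, OVSDB or OVN. The "fix" rule
(silently dropping a reference whose target row is gone) is likewise just
an assumption to have something to run. A real implementation would
presumably use an etcd transaction that compares db/version before
writing.

```python
# Toy in-memory stand-in for etcd. All names here are hypothetical.
VERSION_KEY = "db/version"


class VersionGateError(Exception):
    """Raised when a transaction is attempted in the wrong phase."""


def row_key(db, table, row_uuid, column):
    # Key format floated in the thread: db/table/<row-uuid>/column
    return f"{db}/{table}/{row_uuid}/{column}"


class ToyStore:
    def __init__(self):
        self.kv = {VERSION_KEY: 0}  # the id is assumed to start at 0

    def client_txn(self, puts=None, deletes=()):
        # Ordinary clients may commit only while the version is even.
        # Committing moves it to odd, which blocks further client writes.
        if self.kv[VERSION_KEY] % 2 != 0:
            raise VersionGateError("integrity pass pending")
        for key in deletes:
            self.kv.pop(key, None)
        self.kv.update(puts or {})
        self.kv[VERSION_KEY] += 1

    def integrity_txn(self, fix):
        # The dedicated client runs only while the version is odd. It
        # applies whatever fix() decides, then bumps the version back to
        # even, even when there was nothing to fix.
        if self.kv[VERSION_KEY] % 2 != 1:
            raise VersionGateError("no client transaction to check")
        fix(self.kv)
        self.kv[VERSION_KEY] += 1


def drop_dangling_refs(kv, ref_column="datapath"):
    # Assumed rule: the value stored under ".../<ref_column>" names a row
    # uuid; if no key belongs to that uuid any more, drop the reference.
    live = {k.split("/")[2] for k in kv if k.count("/") == 3}
    for key in [k for k in kv if k.endswith("/" + ref_column)]:
        if kv[key] not in live:
            del kv[key]


store = ToyStore()
dp_key = row_key("sb", "Datapath_Binding", "dp1", "tunnel_key")
pb_ref = row_key("sb", "Port_Binding", "pb1", "datapath")
store.client_txn({dp_key: "1", pb_ref: "dp1"})   # version 0 -> 1
store.integrity_txn(drop_dangling_refs)          # nothing to fix, 1 -> 2
store.client_txn(deletes=[dp_key])               # version 2 -> 3
store.integrity_txn(drop_dangling_refs)          # drops pb_ref, 3 -> 4
```

Note that every client_txn has to be followed by one integrity_txn before
any other client can write, which is the serialization point I am asking
about below.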

A single, dedicated etcd client handling every other transaction sounds
like it could be a scale bottleneck. What do you think?

--
Russell Bryant
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss