Hi Mark,

Thanks for the patch. We had a brief discussion during last OVN meeting.
Let me put my points inlined.

On Fri, Oct 18, 2019 at 1:43 PM Mark Michelson <mmich...@redhat.com> wrote:
>
> This proposes a set of patches to move pinctrl operations out of the
> ovn-controller process and into its own.
>
> The main reasons for doing this are the following:
> 1) Separating pinctrl makes it so that receiving a packet-in can't wake
> up ovn-controller.

To avoid waking up ovn-controller, it doesn't have to be in a separate
process. A thread with its own OVSDB IDL to SB DB can achieve the same, as
what this old patch did:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332887.html

However, the problem of a separate SB connection introduced the concern for
scalability. There were discussions and thoughts for a separate thread
without introducing new SB connection, but once the two threads share same
SB connection, there has to be some synchronization between the threads
that ends up waking up or blocking each other whenever there is a pinctrl
processing that requires read/write SB data. The current multi-thread
implementation from Numan is a trade off that avoids new SB connection but
syncing with the main thread when SB data is needed. It is perfect for
pinctrl handling that doesn't require SB data, and then wakes up
ovn-controller for updating SB data.

Today (2.12) there were improvements on both ovn-controller and OVSDB
server, that alleviated the scale problems on both side.
- For ovn-controller, with incremental processing, when there is no input
change, it doesn't trigger flow recomputing, even when pinctrl wakes up the
main thread. The major concern may be when main thread does need a
recompute, it could block pinctrl processing for messages that requires SB
data accessing, such as ARP handling.
- For SB DB
  - Active-active cluster alleviates the burden of a single server and
spread to 3 or 5. However, RAFT is not designed for scale. Write always
happen on the leader node, and the cost of cluster sync between leader and
follower becomes higher when number of nodes increases.
  - The fast-resync feature (requiring active-active clustered mode) avoids
the slowness of data resync to all clients after DB restart/failover.
However, it doesn't help if ovsdb-server is overloaded for regular updates
and notifications during normal operations, given that it is single
threaded. Also, there are corner cases that fast-resync doesn't help, e.g.
when DB restart/failover happened just after a compress, when all the
transaction history is lost.

I'd suggest to reconsider these scalability concerns, the pros and cons for
a dedicated SB connection for pinctrl, before moving forward to this
approach.

> 2) Separating pinctrl allows for manipulating the southbound database
> directly while handling packets in, thus minimizing the need for storing
> local copies of data

This is true, but similar as point 1), it doesn't necessarily need a
separate process. The point is whether pinctrl (thread or process) should
use a dedicated SB connection.

> 3) This lays the groundwork for an easier eventual conversion of
> ovn-controller to DDlog, since the DDlog code would need to only handle
> flow creation, not packet in handling.
>

Agree with this point. This is probably the most important benefit of
separating pinctrl as a process. Although it is still possible to have
pinctrl as a thread sharing SB connection while converting the flow
processing part with DDlog, a separate process does make the conversion
cleaner.

In addition, a separate process introduces some operational costs, although
not a big concern. The tooling like ovn-ctl and packaging also needs to be
updated.

Thanks,
Han
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to