Hi Bhanu,

Bhanuprakash Bodireddy <bhanuprakash.bodire...@intel.com> writes:

> This patch is aimed at achieving Fastpath Service Assurance in
> OVS-DPDK deployments. This commit adds support for monitoring the packet
> processing cores (pmd thread cores) by dispatching heartbeats at regular
> intervals. In case of a heartbeat miss, the failure shall be detected and
> reported to higher-level fault management systems/frameworks.
>
> The implementation uses a POSIX shared memory object for storing the
> events that will be read by the monitoring framework. The keepalive
> feature can be enabled through the OVSDB settings below.

I've been thinking about this design, and I'm concerned: shared memory
is inflexible, and it allows multiple actors to mess with the
information.  Is there a reason to use shared memory?  I am not sure
what advantage this form of reporting has over simply using a
message-passing interface.  With messages there is a clear abstraction,
and the projects / processes are truly separated.  Shared memory leads
to a situation where two (or more) processes are inextricably coupled.
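
To sketch the kind of message passing I have in mind (this is not any
existing OVS interface, just an illustration of the shape): the producer
serializes its status as self-describing text and the consumer parses
only the keys it understands, so neither side depends on the other's
struct layout or compiler padding.

/* Minimal sketch only; the field names and text format are made up. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return 1;
    }

    /* "OVS" side: report one PMD core as a self-describing text record. */
    char msg[128];
    int len = snprintf(msg, sizeof msg, "pmd-core=3 state=alive ts=%ld\n",
                       (long) time(NULL));
    if (write(fds[0], msg, len) != len) {
        return 1;
    }

    /* "Monitor" side: parse only the keys it knows about. */
    char buf[128];
    ssize_t n = read(fds[1], buf, sizeof buf - 1);
    if (n > 0) {
        buf[n] = '\0';
        int core;
        char state[16];
        if (sscanf(buf, "pmd-core=%d state=%15s", &core, state) == 2) {
            printf("core %d is %s\n", core, state);
        }
    }

    close(fds[0]);
    close(fds[1]);
    return 0;
}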

As an example of that coupling: if a constant changes, or a new
statistic needs to be tracked, the consumer that wants to use this data
needs to be recompiled, and needs to have the *exact* correct version.
If the pad bits from the compiler change, if anything on the OvS side
shifts the alignment, if OvS wants to redefine the struct, if OvS uses
any data from that region as the right-hand side of an assignment... the
list of scenarios where this interface can fail goes on, and the
failures are quite catastrophic.
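
To make that concrete, here is a tiny, purely hypothetical example (the
struct, constant, and names are mine, not from this series).  Both sides
of the shared memory region have to carry a bit-identical copy of this
definition; change the constant or insert a field and every offset
behind it silently shifts, so an old consumer reads garbage with no
error at all:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define KA_MAX_CORES 64          /* If this constant changes...        */

struct ka_shared_state {         /* ...or a field is inserted here...  */
    uint8_t core_state[KA_MAX_CORES];
    uint64_t last_seen[KA_MAX_CORES];
};

int
main(void)
{
    /* ...every consumer built against the old layout looks at the wrong
     * bytes, because e.g. last_seen[] moves: */
    printf("sizeof=%zu offsetof(last_seen)=%zu\n",
           sizeof(struct ka_shared_state),
           offsetof(struct ka_shared_state, last_seen));
    return 0;
}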

I think maybe a design doc for this interface would be good to read
through, as it would explain why this design was chosen.  It might also
allow for better feedback and point toward a more generic solution (for
example, we might want to monitor any new threads that OvS spawns as
well, and that would be good to report).  Do you agree?

>     keepalive=true
>        - Keepalive feature is disabled by default
>
>     keepalive-interval="50"
>        - Timer interval in milliseconds for monitoring the packet
>          processing cores.
>
>     keepalive-shm-name="/dpdk_keepalive_shm_name"
>        - Shared memory block name where the events shall be updated.
>
> When KA is enabled, an 'ovs-keepalive' thread is spawned that wakes
> up at regular intervals to update the timestamp and status of the pmd
> cores in the shared memory region.
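
For my own understanding, is the ovs-keepalive thread doing roughly the
following?  This is only my reading of the description above, with
made-up names, a made-up record layout, and a hard-coded 50 ms interval
standing in for keepalive-interval:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define KA_SHM_NAME "/dpdk_keepalive_shm_name"
#define KA_MAX_CORES 64

struct ka_core_status {
    uint8_t alive;               /* Hypothetical per-core state flag.   */
    uint64_t last_seen;          /* Seconds since the epoch.            */
};

int
main(void)
{
    int fd = shm_open(KA_SHM_NAME, O_CREAT | O_RDWR, 0644);
    if (fd < 0
        || ftruncate(fd, sizeof(struct ka_core_status) * KA_MAX_CORES)) {
        perror("shm setup");
        return 1;
    }

    struct ka_core_status *status =
        mmap(NULL, sizeof *status * KA_MAX_CORES,
             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (status == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    for (;;) {
        /* In the real feature this would reflect the PMD threads; here
         * we just stamp core 0 once per keepalive-interval. */
        status[0].alive = 1;
        status[0].last_seen = (uint64_t) time(NULL);
        usleep(50 * 1000);
    }
}
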
>
> An external monitoring framework like collectd (with DPDK plugin
> support) can read the status updates from shared memory. On a missed
> heartbeat, collectd relays the status to the Ceilometer service running
> on the controller. Below is a high-level overview of the deployment
> model.
>
>         Compute Node                   Controller
>
>          Collectd  <-----------------> Ceilometer
>
>          OVS DPDK
>
>    +-----+
>    | VM  |
>    +--+--+
>   \---+---/
>       |
>    +--+---+       +------------+----------+     +------+-------+
>    | OVS  |-----> |    dpdkevents plugin  | --> |   collectd   |
>    +--+---+       +------------+----------+     +------+-------+
>
>  +------+-----+     +---------------+------------+
>  | Ceilometer | <-- | collectd ceilometer plugin |  <----
>  +------+-----+     +---------------+------------+
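
And the consumer side (e.g. the dpdkevents plugin) would presumably do
something like the sketch below, again with names and a record layout
that are mine, not the patch's.  Note that it only works if both
binaries agree bit-for-bit on the record definition, which is exactly
the coupling I'm worried about:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define KA_SHM_NAME "/dpdk_keepalive_shm_name"
#define KA_MAX_CORES 64
#define KA_TIMEOUT_SEC 1         /* A few missed 50 ms heartbeats.      */

struct ka_core_status {          /* Must match the producer exactly.    */
    uint8_t alive;
    uint64_t last_seen;
};

int
main(void)
{
    int fd = shm_open(KA_SHM_NAME, O_RDONLY, 0);
    if (fd < 0) {
        perror("shm_open");
        return 1;
    }

    const struct ka_core_status *status =
        mmap(NULL, sizeof *status * KA_MAX_CORES, PROT_READ, MAP_SHARED,
             fd, 0);
    if (status == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    uint64_t now = (uint64_t) time(NULL);
    for (int core = 0; core < KA_MAX_CORES; core++) {
        if (status[core].alive
            && now - status[core].last_seen > KA_TIMEOUT_SEC) {
            printf("core %d missed its heartbeat\n", core);
        }
    }
    return 0;
}
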
>
> v1->v2:
> - Sort the patches in an order that leaves no dead code behind.
> - Remove the xusleep() implementation and use usleep() instead.
> - Replace '_WIN32' with '__linux__' in get_process_status().
> - Add comments for different Keepalive states.
> - Remove semaphore and all the logic associated with it.
> - Fix the documentation as suggested.
> - Fix and add a few appropriate comments to the KA helper functions.
> - Add latency stats details in the commit log for future reference.
>
> Bhanuprakash Bodireddy (6):
>   dpdk: Add helper functions for DPDK keepalive.
>   process: Retrieve process status.
>   dpif-netdev: Register packet processing cores for keepalive.
>   netdev-dpdk: Add support for keepalive functionality.
>   vswitch.xml: Add keepalive support.
>   Documentation: Update DPDK doc with Keepalive feature.