Hi Bhanu, I've tested v6 and LGTM, works as expected. To check the behavior I blocked one or more of the PMD with the trick of attaching cgdb. So with
utilities/ovs-appctl keepalive/pmd-health-show I could see for example 2 of the 3 PMDs - those I intentionally blocked - correctly reported as DEAD first, then GONE like Keepalive status keepalive status : Enabled keepalive interval : 1000 ms keepalive init time: 15 Dec 2017 13:20:54 PMD threads : 3 PMD CORE STATE LAST SEEN TIMESTAMP(UTC) pmd62 5 GONE 15 Dec 2017 13:22:39 pmd65 7 ALIVE 15 Dec 2017 13:22:50 pmd63 4 GONE 15 Dec 2017 13:22:14 Thanks, Antonio > -----Original Message----- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Bhanuprakash Bodireddy > Sent: Friday, December 8, 2017 12:04 PM > To: d...@openvswitch.org > Subject: [ovs-dev] [PATCH v6 0/8] Add OVS DPDK keep-alive functionality. > > Keepalive feature is aimed at achieving Fastpath Service Assurance > in OVS-DPDK deployments. It adds support for monitoring the packet > processing threads by dispatching heartbeats at regular intervals. > > keepalive feature can be enabled through below OVSDB settings. > > enable-keepalive=true > - Keepalive feature is disabled by default and should be enabled > at startup before ovs-vswitchd daemon is started. > > keepalive-interval="5000" > - Timer interval in milliseconds for monitoring the packet > processing cores. > > TESTING: > The testing of keepalive is done using stress cmd (simulating the > stalls). > - pmd-cpu-mask=0xf [MQ enabled on DPDK ports] > - stress -c 1 & [tid is usually the __tid + 1 of the > output] > - chrt -r -p 99 <tid> [set realtime priority for stress thread] > - taskset -p 0x8 <tid> [Pin the stress thread to the core PMD is > running] > - PMD thread will be descheduled due to its normal priority and > yields > core to stress thread. > > - ovs-appctl keepalive/pmd-health-show [Display that the thread > is GONE] > - ./ovsdb/ovsdb-client monitor Open_vSwitch [Should update the > status] > > - taskset -p 0x10 <tid> [This brings back pmd thread to life as > stress thread > is moved to idle core] > > (watch out for stress threads, and carefully pin them to core not > to hang your DUTs > during tesing). > > v5 -> v6 > * Remove 2 patches from series > - xnanosleep was applied to master as part of high resolution > timeout support. > - Extend get_process_info() API was also applied to master earlier. > * Remove KA_STATE_DOZING as it was initially meant to handle Core C > states, not needed > for now. > * Fixed ka_destroy(), to fix unit test cases 536, 537. > * A minor performance degradation(0.5%) is observed with Keepalive > enabled. > [Tested with loopback case using 1000 IXIA streams/64 byte udp pkts > and > 1 PMD thread(9.239 vs 9.177Mpps) at 10ms ka-interval timeout] > * Verified with sparse, MSVC compilers(appveyor). > > v4 -> v5 > * Add 3 more patches to the series > - xnanosleep() > - Documentation > - Update to NEWS > * Remove all references to core_id and instead implemented thread > based tracking. > * Addressed most of the comments in v4. > > v3 -> v4 > * Split the functionality in to 2 parts. This patch series only > updates > PMD status to OVSDB. The incremental patch series to handle false > positives, > negatives and more checking and stats. > * Remove code from netdev layer and dependency on rte_keepalive lib. > * Merged few patches and simplified the patch series. > * Timestamp in human readable form. > > v2 -> v3 > * Rebase. > * Verified with dpdk-stable-17.05.1 release. > * Fixed build issues with MSVC and cross checked with appveyor. > > v1 -> v2 > * Rebase > * Drop 01/20 Patch "Consolidate process related APIs" of V1 as it > is already applied as separate patch. > > RFCv3 -> v1 > * Made changes to fix failures in some unit test cases. > * some more code cleanup w.r.t process related APIs. > > RFCv2 -> RFCv3 > * Remove POSIX shared memory block implementation (suggested by > Aaron). > * Rework the logic to register and track threads instead of cores. > This way > in the future any thread can be registered to KA framework. For now > only PMD > threads are tracked (suggested by Aaron). > * Refactor few APIs and further clean up the code. > > RFCv1 -> RFCv2 > * Merged the xml and schema commits to later commit where the actual > implementation is done(suggested by Ben). > * Fix ovs-appctl keepalive/* hang issue when KA disabled. > * Fixed memory leaks with appctl commands for keepalive/pmd-health- > show, > pmd-xstats-show. > * Refactor code and fixed APIs dealing with PMD health monitoring. > > > Bhanuprakash Bodireddy (8): > Keepalive: Add initial keepalive configuration. > dpif-netdev: Register packet processing cores to KA framework. > dpif-netdev: Enable heartbeats for DPDK datapath. > keepalive: Retrieve PMD status periodically. > bridge: Update keepalive status in OVSDB. > keepalive: Add support to query keepalive status and statistics. > Documentation: Update DPDK doc with Keepalive feature. > NEWS: Add keepalive support information in NEWS. > > Documentation/howto/dpdk.rst | 112 +++++++++ > NEWS | 2 + > lib/automake.mk | 2 + > lib/dpif-netdev.c | 92 ++++++++ > lib/keepalive.c | 552 > +++++++++++++++++++++++++++++++++++++++++++ > lib/keepalive.h | 109 +++++++++ > lib/ovs-thread.c | 6 + > lib/ovs-thread.h | 1 + > lib/util.c | 22 ++ > lib/util.h | 1 + > vswitchd/bridge.c | 29 +++ > vswitchd/vswitch.ovsschema | 8 +- > vswitchd/vswitch.xml | 49 ++++ > 13 files changed, 983 insertions(+), 2 deletions(-) > create mode 100644 lib/keepalive.c > create mode 100644 lib/keepalive.h > > -- > 2.4.11 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev