"Bodireddy, Bhanuprakash" <bhanuprakash.bodire...@intel.com> writes:

> Hi Aaron,
>
>>Quick comment before I do an in-depth review.
>>
>>One thing that is missing in this series is some form of documentation
>>explaining why this feature should exist (for instance, why can't the
>>standard POSIX process accounting information suffice?) and what the
>>high-level concepts are (you have the states being used, but I don't see a
>>definition of them, which will be needed when reading a keep-alive report).
>
> I am planning to write a cookbook with instructions on setting up Keepalive
> in OvS, installing & configuring collectd, and setting up the ceilometer
> service to read the events. The definitions of the KA states and how to
> interpret them would be explained in that document. A minimal step-by-step
> guide would also be added to the OvS-DPDK Advanced guide, with links to the
> cookbook.

Please add that documentation as you go, in the patches themselves.  It will
make review easier, too.

> Regarding your other question on why POSIX process accounting isn't enough,
> please see the test case and details below.
>
>>
>>I think there could be a reason to provide this, but I think it's important to
>>explain why collectd will need to use the ovsdb interface, rather than calling
>>e.g. times()[1] or parsing /proc/<tid>/stat for the runtime (and watching
>>accumulation).
>
> 1) On collectd reading OvSDB rather than directly monitoring the threads.
>
>       Collectd is certainly a popular daemon for collecting and monitoring
>       system-wide statistics. However, if we move the OvS thread monitoring
>       functionality into collectd, we are *forcing* users to use collectd to
>       monitor OvS health. This may not be a problem for someone using
>       collectd + OpenStack.

It's important to note that collectd monitoring threads has nothing to do
with this feature.  If collectd could monitor threads from arbitrary
processes and report on them, it would become much more powerful, no?  Let's
keep this focused on Open vSwitch.

>       Think of a customer using OvS together with their own proprietary
>       monitoring application and OpenStack, or worse, their own orchestrator.
>       In that case it is easy for them to monitor OvS by querying OvSDB with
>       minimal code changes to their app.
>
>       It is also easy for any monitoring application to query or subscribe to
>       OvSDB to learn the OvS configuration and health.

I don't really like using the idea of proprietary monitors as justification
for this.

OTOH, I think there's a good justification when it comes to multi-node Open
vSwitch tracking.  There, it may not be possible to aggregate the statistics
on each individual node (possibly due to some kind of administration policy),
so I agree having something like this exposed through ovsdb could be useful.
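
Just to make that concrete, below is a rough, illustrative sketch (not
something this series provides) of how an external monitor could subscribe to
the database directly over the OVSDB JSON-RPC unix socket, using the RFC 7047
"monitor" method.  The socket path and the monitored column are assumptions
on my part; where the keepalive data actually lands depends on the schema
changes in this series.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    /* Assumptions for illustration only: the default db.sock location and
     * the existing "statistics" column as a stand-in for wherever the
     * keepalive data ends up in the schema for this series. */
    const char *path = "/var/run/openvswitch/db.sock";
    const char *req =
        "{\"id\":0,\"method\":\"monitor\",\"params\":"
        "[\"Open_vSwitch\",null,"
        "{\"Open_vSwitch\":{\"columns\":[\"statistics\"]}}]}";
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    char buf[4096];
    ssize_t n;
    int fd;

    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0 || connect(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }
    if (write(fd, req, strlen(req)) < 0) {
        perror("write");
        return 1;
    }

    /* The first reply carries the current contents of the monitored
     * columns; later "update" notifications arrive whenever they change.
     * A real client would parse the JSON instead of just dumping it. */
    while ((n = read(fd, buf, sizeof buf - 1)) > 0) {
        buf[n] = '\0';
        fputs(buf, stdout);
    }
    close(fd);
    return 0;
}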

> 2) On /proc/[pid]/stat:
>
> - I do read the 'stat' file in patch 01/7 to get the thread name and the
>   'core id' the thread was last scheduled on.
> - The other time-related fields in the stat file can't be fully relied upon,
>   because of the test case below.
> 
> This test scenario was meant to simulate & identify PMD stalls when a
> higher-priority thread (kernel or other I/O thread) gets scheduled on the
> same core.
>
> Test scenario:
> - OvS with single/multiple PMD threads.
> - Start a worker thread spinning continuously on a core (stress -c 1 &).
> - Change the worker thread's scheduling attributes to RT (chrt -r -p 99 <tid>).
> - Pin the worker thread to the same core as one of the PMDs (taskset -p
>   <core> <tid>).
>
> Now the PMD stalls, as the worker thread has higher priority and is favored &
> scheduled by the Linux scheduler, preempting the PMD thread.
> However, /proc/[pid]/stat shows that the PMD thread is still in:
>          *Running (R)* state               ->  field 3        (see the output below)
>          utime, stime were incrementing    ->  fields 14, 15  (-do-)
>
> All the other time-related fields were '0', as they do not apply here.
> For field descriptions, see: http://man7.org/linux/man-pages/man5/proc.5.html
>
> ------------------------------------------- sample output -------------------------------------------
> $ cat /proc/12506/stat
> 12506 (pmd61) R 1 12436 12436 0 -1 4210752 101244 0 0 0 389393 3099 0 0 20 0
> 35 0 226680879 4798472192 4363 18446744073709551615 4194304 9786556
> 140737290674320 140344414947088 4467454 0 0 4096 24579 0 0 0 -1 3 0 0 0 0 0
> 11883752 12196256 48676864 140737290679638 140737290679790 140737290679790
> 140737290682316 0
> --------------------------------------------------------------------------------------------------------
>
> But with the KA framework, the PMD stall is detected immediately and
> reported.

Please make sure this kind of test and detection is detailed in the
documentation.  It really does help in understanding the justification.

That said, I suspect that when those times were increasing, the pmd thread
really was getting some system / user time.  Was your worker thread
occasionally making system calls (or even calling sched_yield())?  In those
cases, it was likely letting the pmd thread move on.
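
For reference, the sketch below is roughly what I meant by parsing
/proc/<tid>/stat and watching accumulation.  It is only illustrative: the tid
(taken from your sample output) and the one-second sampling interval are
placeholders, and the field positions (state, utime, stime = fields 3, 14, 15)
come from proc(5).

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Parse state (field 3), utime (14) and stime (15) from a /proc/<tid>/stat
 * line.  The thread name may contain spaces, so scan from the last ')'. */
static int read_stat(const char *path, char *state,
                     unsigned long *utime, unsigned long *stime)
{
    char line[1024], *p;
    FILE *f = fopen(path, "r");

    if (!f) {
        return -1;
    }
    p = fgets(line, sizeof line, f) ? strrchr(line, ')') : NULL;
    fclose(f);

    /* After ") " come: state ppid pgrp session tty_nr tpgid flags minflt
     * cminflt majflt cmajflt utime stime ... */
    if (!p || sscanf(p + 2,
                     "%c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
                     state, utime, stime) != 3) {
        return -1;
    }
    return 0;
}

int main(void)
{
    const char *path = "/proc/12506/stat";  /* placeholder tid */
    unsigned long ut0, st0, ut1, st1;
    char state;

    if (read_stat(path, &state, &ut0, &st0)) {
        return 1;
    }
    sleep(1);                               /* placeholder interval */
    if (read_stat(path, &state, &ut1, &st1)) {
        return 1;
    }

    if (state == 'R' && ut1 == ut0 && st1 == st0) {
        printf("runnable but accrued no CPU ticks: possible starvation\n");
    } else {
        printf("accrued %lu user + %lu system ticks (state %c)\n",
               ut1 - ut0, st1 - st0, state);
    }
    return 0;
}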

I agree other tests would be needed to find various types of performance
impacts.  Make sure to document them all.  I don't have a magic spyglass
that lets me peek into your thought process - I only have code ;)  Even
if the tests aren't currently implemented, it would be good to write
them up.

> IMHO, we can use the /proc interface or the other mechanisms you suggested,
> but those should be part of additional health checks. I do check
> /proc/[pid]/stat to read the thread states as part of a larger health-check
> mechanism in V3.
>
> Hope I have answered all your questions here. Let me know your comments when
> you review this series in depth.
>
> - Bhanuprakash.
