Re: [opnfv-tech-discuss] [barometer] Runtime analysis of the monitoring agents

Rao, Sridhar Wed, 24 May 2017 09:54:45 -0700

Thanks Maryam,

Those are really helpful inputs. I think I need to change my strategy and 
work through things one by one, in detail.


Over the past week I tried different things. Here are some points and 
queries.

  1.  What should the ‘metrics/parameters’ be for comparing the runtime 
behavior of these agents? Are the values from /proc/<pid>/stat sufficient, 
where <pid> is the process ID of the agent being analyzed? Currently, I’m 
only considering the values in /proc/<pid>/stat.
  2.  The metrics you mention below are the ones the ‘agent’ should be run to 
monitor, right?
     *   So far, I was only running CPU, Processes, Disk, Memory, Libvirt, 
IPMI, and network interfaces.
     *   I’ll add more and rerun.
  3.  Is there a preferred initial set of agents for studying the runtime 
behavior?
     *   I have started with collectd, telegraf and snap.
  4.  For the workload, I’m using stress-ng and running different kinds of 
stress for 5 minutes. Is that OK?
  5.  I had not considered the ‘application’ aspect at all. I need to decide 
which app to run.
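
For point 1, here is a minimal sketch of how the /proc/<pid>/stat values could 
be sampled. It is an illustration only, not something from the thread: the 
function names are made up, the field positions follow the Linux proc(5) 
layout (utime is field 14, stime 15, num_threads 20, vsize 23, rss 24), and 
only a handful of fields are picked out.

```python
def parse_proc_stat(text):
    """Parse the contents of /proc/<pid>/stat into a small dict.

    Field numbers are the 1-based positions documented in proc(5).
    """
    # comm (field 2) is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' instead of splitting on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    rest = tail.split()  # rest[0] is field 3 (state)

    def field(n):
        return rest[n - 3]

    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": field(3),
        "utime_ticks": int(field(14)),   # user-mode CPU time, in clock ticks
        "stime_ticks": int(field(15)),   # kernel-mode CPU time, in clock ticks
        "num_threads": int(field(20)),
        "vsize_bytes": int(field(23)),   # virtual memory size
        "rss_pages": int(field(24)),     # resident set size, in pages
    }


def cpu_seconds(sample, ticks_per_sec):
    """Total CPU time (user + system) consumed so far, in seconds."""
    return (sample["utime_ticks"] + sample["stime_ticks"]) / ticks_per_sec


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())
```

On a Linux host, ticks_per_sec would normally come from 
os.sysconf("SC_CLK_TCK") and the page size from os.sysconf("SC_PAGE_SIZE"). 
Taking two samples of the agent some seconds apart and dividing the 
difference in cpu_seconds by the elapsed wall-clock time gives the agent's 
CPU share over that window.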

Regards,
Sridhar K. N. Rao (Ph. D)
Solutions Architect
+91-9900088064

From: Tahhan, Maryam [mailto:maryam.tah...@intel.com]
Sent: Tuesday, May 23, 2017 8:49 PM
To: Rao, Sridhar <sridhar....@spirent.com>; 'MORTON, ALFRED C (AL)' 
<acmor...@att.com>; Aaron Smith <aasm...@redhat.com>
Cc: Mcmahon, Tony B <tony.b.mcma...@intel.com>; Power, Damien 
<damien.po...@intel.com>; TECH-DISCUSS OPNFV 
<opnfv-tech-discuss@lists.opnfv.org>
Subject: [barometer] Runtime analysis of the monitoring agents

Hi Sridhar

To make the comparison fair I would really advise the following:

  *   The publishing mode should really be writing somewhere off the 
system, ideally to some sort of time-series DB. You want to minimize the 
impact of noise on the system.
  *   Isolate and pin cores appropriately.
  *   Footprint measurement process:
     *   Measure idle system resource usage
     *   Run the plugin or combination of plugins and measure system resource 
usage
     *   Repeat the tests on a busy system, i.e. one running a workload.
     *   Report results
     *   Repeat with a busy system.
  *   Metrics to collect:
     *   Sysstat metrics
        *   CPU     %user     %nice   %system   %iowait    %steal
        *   Memory usage: kbmemfree kbmemused  %memused kbbuffers  kbcached 
kbswpfree
        *   Cache thrashing if any
        *   IO
           *   tps – Transactions per second (this includes both read and write)
           *   rtps – Read transactions per second
           *   wtps – Write transactions per second
           *   bread/s – Blocks read per second (sar counts 512-byte blocks, 
not bytes)
           *   bwrtn/s – Blocks written per second (same 512-byte blocks)
     *   collectd/any other collector specific process stats if possible.
     *   Application stats for the application you are running – to determine 
the impact of collectd/other collectors on the workload.
        *   You might pick a use case with some network traffic, to see the 
impact on it, if any.
  *   Intervals: you might want to try 1 second, 10 seconds and 60 seconds. If 
possible, you might also go below one second.
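
The footprint comparison above (idle baseline versus the same system with a 
collector running) can be sketched as follows. This is only an approximation 
of what sysstat reports, under stated assumptions: the function names are 
hypothetical, the input is the aggregate "cpu" line of /proc/stat sampled at 
two points in time, and percentages are taken over the first eight counters 
(so guest time is not separated out the way sar can).

```python
# Counters on the aggregate "cpu" line of /proc/stat, in order (see proc(5)).
# Assumes a kernel recent enough to expose the steal counter.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line of /proc/stat into a dict of ticks."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {k: 100.0 * d / total for k, d in deltas.items()}


def agent_overhead(baseline, with_agent):
    """Difference in percentage points: run with the agent minus the baseline."""
    return {k: with_agent[k] - baseline[k] for k in FIELDS}
```

The idea would be to compute cpu_percentages once for the idle (or busy) 
system without a collector and once with the collector running at a given 
interval, then feed both results to agent_overhead. Repeating that for 
1 second, 10 seconds and 60 seconds gives one overhead figure per interval.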

I hope this helps, please let me know if you have any questions/comments.

BR
Maryam

From: Rao, Sridhar [mailto:sridhar....@spirent.com]
Sent: Friday, May 12, 2017 2:38 PM
To: Mcmahon, Tony B <tony.b.mcma...@intel.com>; Tahhan, Maryam 
<maryam.tah...@intel.com>; Power, Damien <damien.po...@intel.com>
Subject: RE: [barometer] Weekly Call

No Tony. Here you go:
--------------
This is the template I’ll be using for the runtime analysis of the monitoring 
agents. By runtime analysis I mean comparing how much CPU and memory these 
agents consume while they are ‘monitoring’. Please let me know if I have missed 
anything, or if there is anything else I should consider to make the comparison 
more meaningful.

Metrics monitored by the agents: CPU, Processes, Memory, Interfaces, Libvirt, 
IPMI, Disk Status

Publishing Mode: Writing to the file

Frequency of Reading the values: 1 Sec.

Other metrics that may apply only to few agents: OVS, DPDK, PCM, RAS (MceLog)

System Configuration:
1. Intel Xeon Server with at least 3 ethernet interfaces.
2. KVM/Qemu
3. At least 2 VMs Running.


Q. Why these metrics?
A. The choice of metrics started with this link - 
https://wiki.opnfv.org/display/fastpath/Collectd+Metrics+and+Events . From this 
list, these metrics were chosen as they are supported by all the agents.

Q. Why ‘writing to the file’ as the publishing mode?
A. This mode is, again, supported by all agents (well, almost). It also makes 
the comparison fairer!

Q. Why a frequency of 1 sec?
A. Frankly, I wasn’t sure. This is based on the input I received during the 
previous Barometer weekly call.

Q. What about the other metrics from the link?
A. These will be considered and studied only for the agents that support them, 
as they are relevant to NFV.

Regards,
Sridhar K. N. Rao (Ph. D)
Solutions Architect
+91-9900088064





