Hi Lydia,

I used sar for monitoring (sar -u -n DEV -p -d -r 1) and plotted the average over multiple nodes.

1) For each node, collect the sar output. You will obtain, for example:

    Linux 3.2.0-4-amd64 (parasilo-4.rennes.grid5000.fr)  2016-01-27  _x86_64_  (16 CPU)

    12:54:09  CPU  %user  %nice  %system  %iowait  %steal  %idle
    12:54:10  all   4.63   0.00     3.25     0.13    0.00  91.99

    12:54:09  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbcommit  %commit  kbactive  kbinact
    12:54:10  129538812    2525308      1.91       1292     85876   3662636     2.69   2111652    55132

    12:54:09  DEV    tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await  svctm  %util
    12:54:10  sda  28.71   2708.91     87.13     97.38      0.03   1.10   0.97   2.77

    12:54:09  IFACE  rxpck/s  txpck/s   rxkB/s  txkB/s  rxcmp/s  txcmp/s  rxmcst/s
    12:54:10  eth0    632.67   587.13  3173.60   58.47     0.00     0.00      0.00

2) Calculate the average over your nodes (make sure their clocks are synchronized) to obtain a final output over which you can run your plot scripts:

    LINE  DATE      FILENAME  CPU_user  CPU_SYS  KBMEMFREE  KBMEMUSED  MEMUSED  DISK_UTIL  DISK_RKBs  DISK_WKBs  _IO_RSTs  _IO_WSTs
    1     12:54:10  res1Avg       6.12     1.25  129554704    2509412     1.90       6.00    4253.63      87.04   3944.00     88.00
    2     12:54:11  res1Avg       3.41     0.28  129523432    2540690     1.92       4.00    2335.82      51.62   2692.00      0.00
    3     12:54:12  res1Avg       0.06     0.03  129522000    2542120     1.92       1.60       0.16       0.59   2048.00     32.00
    4     12:54:13  res1Avg       0.09     0.06  129520936    2543182     1.92       0.60       0.19       0.59   2048.00      0.00
    5     12:54:14  res1Avg       0.06     0.06  129518448    2545670     1.93       6.80       4.31     169.47   4044.00     16.00

For other metrics specific to Flink's execution, you may need to rely on the metrics that Flink currently exposes.

Best,
Ovidiu

> On 21 Dec 2016, at 19:55, Lydia Ickler <ickle...@googlemail.com> wrote:
>
> Hi all,
>
> I have a question regarding the Monitoring REST API;
>
> I want to analyze the behavior of my program with regards to I/O MiB/s,
> Network MiB/s and CPU % as the authors of this paper did.
> (https://hal.inria.fr/hal-01347638v2/document)
> From the JSON file at http://master:8081/jobs/jobid/ I get a summary including
> the information of read/write records and read/write bytes.
> Unfortunately the entries of Network or CPU are either (unknown) or 0.0. I am
> running my program on a cluster with up to 32 nodes.
>
> Where can I find the values for e.g. CPU or Network?
>
> Thanks in advance!
> Lydia
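
P.S. In case it helps, the averaging in step 2 can be sketched in Python roughly as below. This is only a minimal sketch, not the script I actually used: the function names (parse_cpu_rows, average_over_nodes) are made up for illustration, it handles only the CPU section (%user and %system), and it assumes sar prints 24-hour timestamps as in the sample above and that the node clocks are synchronized, so rows with the same timestamp belong together.

```python
from collections import defaultdict
from statistics import mean


def parse_cpu_rows(sar_text):
    """Return {timestamp: (%user, %system)} from the 'all' CPU rows of sar text output.

    Assumes rows shaped like: 12:54:10 all 4.63 0.00 3.25 0.13 0.00 91.99
    """
    rows = {}
    for line in sar_text.splitlines():
        fields = line.split()
        # Keep only CPU data rows: 8 fields, second field "all", and a first
        # field that starts with a digit (this skips the "Average:" summary).
        if len(fields) == 8 and fields[1] == "all" and fields[0][0].isdigit():
            rows[fields[0]] = (float(fields[2]), float(fields[4]))
    return rows


def average_over_nodes(per_node_rows):
    """Average (%user, %system) per timestamp over all nodes reporting that timestamp."""
    samples = defaultdict(list)
    for rows in per_node_rows:
        for timestamp, values in rows.items():
            samples[timestamp].append(values)
    return {
        timestamp: tuple(round(mean(column), 2) for column in zip(*values))
        for timestamp, values in sorted(samples.items())
    }
```

You would call parse_cpu_rows once per node on the collected sar text and pass the list of results to average_over_nodes. A timestamp missing on some node is averaged over the nodes that do report it, which is one possible choice; dropping incomplete timestamps would be another. The memory, disk and network sections can be handled the same way with their own column positions.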