Hello
I want to mention a general case, in which we want to support "group by"
queries for different attributes and resources.
Regarding the problem mentioned by François, suppose we want to calculate the
*per-CPU utilization of each process* (select the CPU usage for each CPU
separately, grouped by process).
> Process #1 : 25% total
> -> CPU0 : 20%
> -> CPU1 : 5%
>
> Process #2 : 10% total
> -> CPU0 : 10%
> -> CPU1 : 0%
At the same time, suppose we are also interested in the reverse
statistic: *per-process CPU utilization for each CPU* (select the CPU
usage of each process separately, grouped by CPU).
> CPU0 : 30% total
> -> Process #1 : 20%
> -> Process #2 : 10%
>
> CPU1 : 5% total
> -> Process #1 : 5%
> -> Process #2 : 0%
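Both breakdowns above are views of the same four values. A minimal Python sketch, using the numbers from the example and a hypothetical flat list of (process, CPU, usage) tuples rather than the actual state system layout, shows that either grouping can be derived from a single copy of the data:

```python
from collections import defaultdict

# Hypothetical flat samples: (process, cpu, CPU usage in percent).
# The numbers are the ones from the example above.
SAMPLES = [
    ("Process #1", "CPU0", 20),
    ("Process #1", "CPU1", 5),
    ("Process #2", "CPU0", 10),
    ("Process #2", "CPU1", 0),
]

def group_by(samples, key_index):
    """Sum usage per resource; key_index 0 groups by process, 1 by CPU."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[key_index]] += sample[2]
    return dict(totals)

per_process = group_by(SAMPLES, 0)
per_cpu = group_by(SAMPLES, 1)
print(per_process)
print(per_cpu)
```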
As another example, suppose we want to calculate the IO throughput of
processes and files, grouped by each one separately:
Process #1 : 25% total
-> File0 : 10% (quark: 1)
-> File1 : 5% (quark: 2)
Process #2 : 15% total
-> File0 : 5% (quark: 5)
-> File1 : 10% (quark: 6)
and
File0 : 12% total
-> Process #1 : 8% (quark: 10)
-> Process #2 : 4% (quark: 11)
File1 : 20% total
-> Process #1 : 10% (quark: 15)
-> Process #2 : 10% (quark: 16)
With the current organization of the attribute tree, we may need to
duplicate the data and store it twice in the history tree, with a separate
value for each ordering of an attribute pair (e.g. cpu1 --> process1 and
process1 --> cpu1 have different quark values, so their identical
statistics have to be stored separately in different places of the
history tree).
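To make the duplication concrete, here is a small sketch. The `AttributeTree` class is a hypothetical stand-in that hands out one quark per distinct path; it is not the real state system API. Keeping both orderings costs two quarks, and two stored copies, for every (process, CPU) pair:

```python
from itertools import product

class AttributeTree:
    """Hypothetical path-based tree: one integer quark per distinct path."""

    def __init__(self):
        self._quarks = {}

    def get_quark(self, *path):
        # Return the existing quark for this path, or allocate the next one.
        return self._quarks.setdefault(path, len(self._quarks))

processes = ["process1", "process2"]
cpus = ["cpu0", "cpu1"]

tree = AttributeTree()
# Ordering 1: Processes/<process>/<cpu>
by_process = {(p, c): tree.get_quark("Processes", p, c)
              for p, c in product(processes, cpus)}
# Ordering 2: CPUs/<cpu>/<process>
by_cpu = {(p, c): tree.get_quark("CPUs", c, p)
          for p, c in product(processes, cpus)}

# Each pair now has two different quarks that must hold the same value.
duplicated = [pair for pair in by_process if by_process[pair] != by_cpu[pair]]
print(len(by_process) + len(by_cpu), "quarks for", len(duplicated), "distinct values")
```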
*It may therefore be useful to relax the definition of
the attribute tree and let different applications define their own
organization of the attributes.*
For instance, I suggest a new organization for managing the statistics:
1- First, create a separate hierarchy for each type of resource.
Processes
-> Process #1
-> Process #2
CPUs
-> CPU #1
-> CPU #2
Files
-> File #1
-> File #2
2- Then, define metric nodes between the different resources and
assign each one its own quark value. For example, we define a "CPU usage"
metric node between each process and each CPU:
-> Process #2
---> CPU usage (quark: 1)
-> CPU #1
or an "IO" metric node between each file and each process:
-> Process #1
---> IO (quark: 2)
-> File #3
This organization avoids duplication in the history tree: for each tuple
(e.g. a process and a CPU), only one value is stored.
Furthermore, because each metric node is attached to both resources, the
same value can serve "group by" queries from either side, as well as
aggregation functions over them.
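As a rough sketch of this organization (all class and method names here are hypothetical, and a plain dictionary stands in for the history tree), each metric node links two resources and owns a single quark, so one stored value can be reached from either side:

```python
from collections import defaultdict

class MetricGraph:
    """Hypothetical model: resources plus metric nodes that link two of them."""

    def __init__(self):
        self._quark_of = {}               # (metric, res_a, res_b) -> quark
        self._links = defaultdict(list)   # resource -> [(metric, other, quark)]

    def link(self, metric, res_a, res_b):
        """Create (or fetch) the metric node between two resources."""
        key = (metric, res_a, res_b)
        if key not in self._quark_of:
            quark = len(self._quark_of)
            self._quark_of[key] = quark
            # Register the same quark under both resources.
            self._links[res_a].append((metric, res_b, quark))
            self._links[res_b].append((metric, res_a, quark))
        return self._quark_of[key]

    def breakdown(self, resource, metric, values):
        """Group one metric by `resource`: value per linked resource."""
        return {other: values[quark]
                for m, other, quark in self._links[resource] if m == metric}

graph = MetricGraph()
values = {}  # stand-in for the history tree: quark -> current value

# CPU usage numbers from the example earlier in this message.
for proc, cpu, usage in [("Process #1", "CPU0", 20), ("Process #1", "CPU1", 5),
                         ("Process #2", "CPU0", 10), ("Process #2", "CPU1", 0)]:
    values[graph.link("CPU usage", proc, cpu)] = usage

by_process = graph.breakdown("Process #1", "CPU usage", values)
by_cpu = graph.breakdown("CPU0", "CPU usage", values)
print(by_process, sum(by_process.values()))
print(by_cpu, sum(by_cpu.values()))
```

Only four values are stored for the four (process, CPU) pairs, yet both the per-process and the per-CPU breakdown can be read from them.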
Thanks,
Naser