First: sorry for my slow response.

Mark Kanda <mark.ka...@oracle.com> writes:

> Thank you Markus.
>
> On 3/11/2022 7:06 AM, Markus Armbruster wrote:
>> Mark Kanda <mark.ka...@oracle.com> writes:
>>
>>> Introduce QMP support for querying stats. Provide a framework for adding new
>>> stats and support for the following commands:
>>>
>>> - query-stats
>>> Returns a list of all stats per target type (only VM and vCPU to start), 
>>> with
>>> additional options for specifying stat names, vCPU qom paths, and providers.
>>>
>>> - query-stats-schemas
>>> Returns a list of stats included in each target type, with an option for
>>> specifying the provider.
>>>
>>> The framework provides a method to register callbacks for these QMP 
>>> commands.
>>>
>>> The first use-case will be for fd-based KVM stats (in an upcoming patch).
>>>
>>> Examples (with fd-based KVM stats):
>>>
>>> - Query all VM stats:
>>>
>>> { "execute": "query-stats", "arguments" : { "target": "vm" } }
>>>
>>> { "return": {
>>>    "vm": [
>>>       { "provider": "kvm",
>>>         "stats": [
>>>            { "name": "max_mmu_page_hash_collisions", "value": 0 },
>>>            { "name": "max_mmu_rmap_size", "value": 0 },
>>>            { "name": "nx_lpage_splits", "value": 148 },
>>>            ...
>>>       { "provider": "xyz",
>>>         "stats": [ ...
>>>       ...
>>> ] } }
>>>
>>> - Query all vCPU stats:
>>>
>>> { "execute": "query-stats", "arguments" : { "target": "vcpu" } }
>>>
>>> { "return": {
>>>      "vcpus": [
>>>        { "path": "/machine/unattached/device[0]"
>>>          "providers": [
>>>            { "provider": "kvm",
>>>              "stats": [
>>>                { "name": "guest_mode", "value": 0 },
>>>                { "name": "directed_yield_successful", "value": 0 },
>>>                { "name": "directed_yield_attempted", "value": 106 },
>>>                ...
>>>            { "provider": "xyz",
>>>              "stats": [ ...
>>>             ...
>>>        { "path": "/machine/unattached/device[1]"
>>>          "providers": [
>>>            { "provider": "kvm",
>>>              "stats": [...
>>>            ...
>>> } ] } }
>>>
>>> - Query 'exits' and 'l1d_flush' KVM stats, and 'somestat' from provider 
>>> 'xyz'
>>> for vCPUs '/machine/unattached/device[2]' and 
>>> '/machine/unattached/device[4]':
>>>
>>> { "execute": "query-stats",
>>>    "arguments": {
>>>      "target": "vcpu",
>>>      "vcpus": [ "/machine/unattached/device[2]",
>>>                 "/machine/unattached/device[4]" ],
>>>      "filters": [
>>>        { "provider": "kvm",
>>>          "fields": [ "l1d_flush", "exits" ] },
>>>        { "provider": "xyz",
>>>          "fields": [ "somestat" ] } ] } }
>> Are the stats bulky enough to justfify the extra complexity of
>> filtering?
>
> If this was only for KVM, the complexity probably isn't worth it. However, 
> the 
> framework is intended to support future stats with new providers and targets 
> (there has also been mention of moving existing stats to this framework). 
> Without some sort of filtering, I think the payload could become unmanageable.

I'm deeply wary of "may need $complexity in the future" when $complexity
could be added when we actually need it :)

>>> { "return": {
>>>      "vcpus": [
>>>        { "path": "/machine/unattached/device[2]"
>>>          "providers": [
>>>            { "provider": "kvm",
>>>              "stats": [ { "name": "l1d_flush", "value": 41213 },
>>>                         { "name": "exits", "value": 74291 } ] },
>>>            { "provider": "xyz",
>>>              "stats": [ ... ] } ] },
>>>        { "path": "/machine/unattached/device[4]"
>>>          "providers": [
>>>            { "provider": "kvm",
>>>              "stats": [ { "name": "l1d_flush", "value": 16132 },
>>>                         { "name": "exits", "value": 57922 } ] },
>>>            { "provider": "xyz",
>>>              "stats": [ ... ] } ] } ] } }
>>>
>>> - Query stats schemas:
>>>
>>> { "execute": "query-stats-schemas" }
>>>
>>> { "return": {
>>>      "vcpu": [
>>>        { "provider": "kvm",
>>>          "stats": [
>>>             { "name": "guest_mode",
>>>               "unit": "none",
>>>               "base": 10,
>>>               "exponent": 0,
>>>               "type": "instant" },
>>>            { "name": "directed_yield_successful",
>>>               "unit": "none",
>>>               "base": 10,
>>>               "exponent": 0,
>>>               "type": "cumulative" },
>>>               ...
>>>        { "provider": "xyz",
>>>          ...
>>>     "vm": [
>>>        { "provider": "kvm",
>>>          "stats": [
>>>             { "name": "max_mmu_page_hash_collisions",
>>>               "unit": "none",
>>>               "base": 10,
>>>               "exponent": 0,
>>>               "type": "peak" },
>>>        { "provider": "xyz",
>>>        ...
>> Can you give a use case for query-stats-schemas?
>
> 'query-stats-schemas' provide the the type details about each stat; such as 
> the 
> unit, base, etc. These details are not reported by 'query-stats' (only the 
> stat 
> name and raw values are returned).

Yes, but what is going to use these type details, and for what purpose?

>>> Signed-off-by: Mark Kanda <mark.ka...@oracle.com>
>>> ---
>>>   include/monitor/stats.h |  51 ++++++++
>>>   monitor/qmp-cmds.c      | 219 +++++++++++++++++++++++++++++++++
>>>   qapi/meson.build        |   1 +
>>>   qapi/qapi-schema.json   |   1 +
>>>   qapi/stats.json         | 259 ++++++++++++++++++++++++++++++++++++++++
>>
>> That's a lot of schema code.
>>
>> How much of it is for query-stats, and how much for query-stats-schemas?
>
> It's roughly 60% query-stats, 40% query-stats-schemas.
>
>> How much of the query-stats part is for filtering?
>
> I think filtering is about 40% of query-stats.

Have you considered splitting this up into three parts: unfiltered
query-stats, filtering, and query-stats-schemas?

[...]
>>>   5 files changed, 531 insertions(+)
>>>   create mode 100644 include/monitor/stats.h
>>>   create mode 100644 qapi/stats.json
>> [...]
>>
>>> diff --git a/qapi/stats.json b/qapi/stats.json
>>> new file mode 100644
>>> index 0000000000..ae5dc3ee2c
>>> --- /dev/null
>>> +++ b/qapi/stats.json

[...]

>>> +##
>>> +# @StatsValue:
>>> +#
>>> +# @scalar: single uint64.
>>> +# @list: list of uint64.
>>> +#
>>> +# Since: 7.0
>>> +##
>>> +{ 'alternate': 'StatsValue',
>>> +  'data': { 'scalar': 'uint64',
>>> +            'list': 'StatsValueArray' } }
>>
>> Any particular reason for wrapping the array in a struct?
>
> Due to the limitation in the QAPI framework, I hit:
> ../qapi/stats.json:139: 'data' member 'list' cannot be an array
>
> I can look at adding support...

That would be nice.  Could you use help to get started with it?

We could perhaps merge with the current schema, then clean it up on top,
both in 7.1, if that's easier for you.


Reply via email to