Hi all,

Currently, in QEMU, we query block stats with “info blockstats”.
For each disk, the command calls the “bdrv_query_stats” function from
“qmp_query_blockstats” to collect that disk's stats.
Since commit 13344f3a, Stefan has taken a mutex around each query to
ensure that the dataplane IOThread and the main loop's monitor code do
not race. This is very useful to us.
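
For reference, the protected per-disk part of that code looks roughly
like the sketch below (assuming the lock in question is the per-device
AioContext lock; this is paraphrased rather than copied verbatim from
block/qapi.c, and details may differ between QEMU versions):

     AioContext *ctx = blk_get_aio_context(blk);

     aio_context_acquire(ctx);   /* serialize with the dataplane IOThread */
     /* ... bdrv_query_stats() collects this disk's stats here ... */
     aio_context_release(ctx);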

Recently, we found that when a guest in QEMU has a lot of disks, its
I/O performance can degrade while we use the command above.

So we would like to discuss whether the following optimizations could be carried out:

1. Add an optional parameter that identifies a block. If this parameter
is not empty, only the stats of the identified block are acquired, for
example:

 BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
                                      bool query_nodes,
+                                     BlockBackend *query_blk,
                                      Error **errp)
 {
     BlockStatsList *head = NULL, **p_next = &head;
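
To make the effect of this parameter concrete, the per-disk loop inside
“qmp_query_blockstats” could then skip every BlockBackend except the
requested one, roughly like the untested sketch below (query_blk is the
hypothetical parameter from the diff above, and the surrounding loop is
paraphrased, not copied verbatim):

     for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
         /* Hypothetical filter: if a specific block was requested,
          * skip all other BlockBackends. */
         if (query_blk && blk != query_blk) {
             continue;
         }
         /* ... collect this disk's stats exactly as today ... */
     }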

2. Only take the mutex when the 'x-data-plane=on' feature is set.
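
As a rough illustration (not a tested patch, and whether it is safe to
skip the lock outside of dataplane would need review), the loop might
only take the lock when the disk's AioContext is not the main loop's
context:

     AioContext *ctx = blk_get_aio_context(blk);
     bool need_lock = (ctx != qemu_get_aio_context());

     if (need_lock) {
         /* only when dataplane moved the disk to its own IOThread */
         aio_context_acquire(ctx);
     }
     /* ... collect this disk's stats as before ... */
     if (need_lock) {
         aio_context_release(ctx);
     }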


Our reasoning is as follows:

1. Background:

We have some guests with more than 450 disks, and we use
“qmp_query_blockstats” to monitor their disk stats in real time
(2 times/s).

We found that, while monitoring, the guests' I/O performance can
degrade. We measured this with “dd”:

 disk number | degradation |
     10      |    1.4%     |
     100     |    3.4%     |
     200     |    5.5%     |
     500     |   13.6%     |


2. For the 500-disk case above, we compared the effect of the two
proposals:

  2.1. With the block-identifying parameter added: 2.1% degradation

  2.2. With the lock protection turned off (for testing only): 1.6% degradation

3. Our guess at the cause

We took timestamps before and after “qmp_query_blockstats” (illustrated
by the sketch after this list) and found that, on our host:

3.1 Getting the stats of one disk takes about 4637ns (not counting the
time spent walking the linked list).
3.2 Getting the stats of 500 disks takes about 901583ns.
3.3 Without the mutex, getting the stats of one disk takes about 2302ns.
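
(The timing was gathered by simply reading a clock before and after the
call; the snippet below only illustrates that kind of instrumentation
and is not our exact patch:)

     Error *err = NULL;
     int64_t t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     BlockStatsList *stats = qmp_query_blockstats(false, false, &err);
     int64_t t1 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);

     /* Illustrative only: report how long one whole query took */
     fprintf(stderr, "query_blockstats: %" PRId64 " ns\n", t1 - t0);
     qapi_free_BlockStatsList(stats);
     error_free(err);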

The query takes resources away from I/O processing, and this becomes
more apparent as the number of disks increases.

We think some performance degradation is inevitable, but if users can
configure these parameters and balance the amount of disk stats they
collect against I/O performance, that may still be a good thing.

We also searched the mailing-list archive and did not find a relevant
discussion. We are not sure whether anyone has experienced this problem
or mentioned it before.

So we boldly put this forward and hope to get everyone's advice.

Thanks,

 Liyang.


