Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-12 Thread Richard W.M. Jones
On Tue, Aug 05, 2014 at 01:36:02PM +0800, Li Wei wrote:
 Hi Richard,
 
 Thanks for your comment!
 
 On 08/04/2014 04:39 PM, Richard W.M. Jones wrote:
  On Mon, Aug 04, 2014 at 11:38:41AM +0800, Li Wei wrote:
  Hi,
 
  On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:
 
  Did anything come of this discussion, and/or is someone working on this?
 
  I am working on an API to query block stats in a bulk style and proposed an
  API as follow:
 
  virDomainBlockStatsBulkFlags(virDomainPtr dom,
  virTypedParameterPtr params,
  int nparams,
  int ndisks,
  unsigned int flags)
 
  @dom: pointer to domain object
  @params: an array of typed param to be populated with block stats
  @nparams: how many params used for each block device
  @ndisks: how many block devices to query
  @flags: flags to filter block devices (not used for now)
 
  Returns -1 in case of error, 0 in case of success.
  with params == NULL, nparams == -1, ndisks == 1, return number of params 
  for each block device.
  with params == NULL, nparams == -1, ndisks == -1, return number of disks 
  in the domain.
 
  A typical usage of this API should be:
  nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
  ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);
 
  params = VIR_ALLOC_N(params, nparams * ndisks);
 
  ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);
 
  ... do something with params
 
  VIR_FREE(params);
 
  With this bulk API, virt-top can updates in a short interval for a domain 
  with a lot of disks.
  Any comments?
  
  I think this works OK for the case where you have 1 domains with
  lots of disks.
  
  However if you have a large number of domains each with 1 or 2
  disks I think you would have the same problem as currently.
 
 Yes, it is.
 
  
  Is it possible to design an API that can work across all domains
  in a single call?
 
 How about the following API:
 
 int virConnectGetAllBlockStats(virConnectPtr conn,
   virDomainPtr domain,
   virDomainBlockBulkStatsPtr *stats,
   unsigned int flags);
 @conn: pointer to libvirt connection
 @domain: pointer to the domain to be queried, NULL for all domains
 @stats: array of virDomainBlockBulkStats struct(see below) to be populated
 @flags: filter flags
 Return the number of virDomainBlockBulkStats populated.
 
 where virDomainBlockBulkStats defined as:
 
 struct _virDomainBlockBulkStats {
 virDomainPtr domain;   /* domain the block stats belongs to */
 virTypedParameterPtr params; /* params to store block stats */
 unsigned int nparams;  /* how many params used for each block stats */
 unsigned int ndisks;   /* how many block stats in this domain */
 };

Works for me.

Please CC me on any patches so I can review them more easily for you.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-12 Thread Francesco Romani
- Original Message -
 From: Richard W.M. Jones rjo...@redhat.com
 To: Li Wei l...@cn.fujitsu.com
 Cc: Francesco Romani from...@redhat.com, libvir-list@redhat.com
 Sent: Tuesday, August 12, 2014 11:04:05 AM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats
 

[...]
   Is it possible to design an API that can work across all domains
   in a single call?
  
  How about the following API:
  
  int virConnectGetAllBlockStats(virConnectPtr conn,
  virDomainPtr domain,
  virDomainBlockBulkStatsPtr *stats,
  unsigned int flags);
  @conn: pointer to libvirt connection
  @domain: pointer to the domain to be queried, NULL for all domains
  @stats: array of virDomainBlockBulkStats struct(see below) to be populated
  @flags: filter flags
  Return the number of virDomainBlockBulkStats populated.
  
  where virDomainBlockBulkStats defined as:
  
  struct _virDomainBlockBulkStats {
  virDomainPtr domain; /* domain the block stats belongs to */
  virTypedParameterPtr params; /* params to store block stats */
  unsigned int nparams;/* how many params used for each block stats */
  unsigned int ndisks; /* how many block stats in this domain */
  };
 
 Works for me.

Same here.

oVirt, more specifically VDSM, needs to check all the stats of all
the domains on a given host at once, so this API should fit the task.

Since VDSM takes ownership (read: keep track and control) of all the VMs,
the filtering capability of this new API should be good enough.

+++

It would be nice, but less important, to be able to somehow reuse the 
`stats' argument.

What I'm looking for here is a way to avoid allocating and deallocating all
the needed structures before and after every call.

I'm saying this because it is a pretty common scenario for a VM (at least in
the cases I'm aware of) to have the same number of disks during its whole life.

But I believe this is an optimization which can be added later.
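
A minimal sketch of what reusing the `stats` storage could look like, assuming a caller-owned buffer that only grows when the disk count rises. The names `demo_disk_stats`, `demo_stats_buf`, `demo_buf_reserve` and `demo_buf_release` are hypothetical and not part of libvirt or of the proposal above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-disk record; the proposal itself uses typed parameters. */
typedef struct {
    long long rd_bytes;
    long long wr_bytes;
} demo_disk_stats;

/* Caller-owned buffer that survives across sampling cycles. */
typedef struct {
    demo_disk_stats *items;
    size_t capacity;        /* entries allocated */
    size_t used;            /* entries valid after the last refresh */
    unsigned int reallocs;  /* how often the buffer grew, for illustration */
} demo_stats_buf;

/* Make room for ndisks entries, growing only when the disk count rises.
 * Returns 0 on success, -1 if memory could not be obtained. */
static int demo_buf_reserve(demo_stats_buf *buf, size_t ndisks)
{
    assert(buf != NULL);
    if (ndisks > buf->capacity) {
        demo_disk_stats *grown = realloc(buf->items, ndisks * sizeof(*grown));
        if (grown == NULL)
            return -1;
        buf->items = grown;
        buf->capacity = ndisks;
        buf->reallocs++;
    }
    buf->used = ndisks;
    return 0;
}

/* Release the buffer once the VM is gone. */
static void demo_buf_release(demo_stats_buf *buf)
{
    assert(buf != NULL);
    free(buf->items);
    buf->items = NULL;
    buf->capacity = 0;
    buf->used = 0;
}
```

With a stable disk count, only the first sampling cycle allocates; later cycles reuse the same memory.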

Thanks,

-- 
Francesco Romani
RedHat Engineering Virtualization R&D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-11 Thread Li Wei
ping ...

On 08/05/2014 01:36 PM, Li Wei wrote:
 Hi Richard,
 
 Thanks for your comment!
 
 On 08/04/2014 04:39 PM, Richard W.M. Jones wrote:
 On Mon, Aug 04, 2014 at 11:38:41AM +0800, Li Wei wrote:
 Hi,

 On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:

 Did anything come of this discussion, and/or is someone working on this?

 I am working on an API to query block stats in a bulk style and proposed an
 API as follow:

 virDomainBlockStatsBulkFlags(virDomainPtr dom,
  virTypedParameterPtr params,
  int nparams,
  int ndisks,
  unsigned int flags)

 @dom: pointer to domain object
 @params: an array of typed param to be populated with block stats
 @nparams: how many params used for each block device
 @ndisks: how many block devices to query
 @flags: flags to filter block devices (not used for now)

 Returns -1 in case of error, 0 in case of success.
 with params == NULL, nparams == -1, ndisks == 1, return number of params 
 for each block device.
 with params == NULL, nparams == -1, ndisks == -1, return number of disks in 
 the domain.

 A typical usage of this API should be:
 nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
 ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);

 params = VIR_ALLOC_N(params, nparams * ndisks);

 ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);

 ... do something with params

 VIR_FREE(params);

 With this bulk API, virt-top can updates in a short interval for a domain 
 with a lot of disks.
 Any comments?

 I think this works OK for the case where you have 1 domains with
 lots of disks.

 However if you have a large number of domains each with 1 or 2
 disks I think you would have the same problem as currently.
 
 Yes, it is.
 

 Is it possible to design an API that can work across all domains
 in a single call?
 
 How about the following API:
 
 int virConnectGetAllBlockStats(virConnectPtr conn,
   virDomainPtr domain,
   virDomainBlockBulkStatsPtr *stats,
   unsigned int flags);
 @conn: pointer to libvirt connection
 @domain: pointer to the domain to be queried, NULL for all domains
 @stats: array of virDomainBlockBulkStats struct(see below) to be populated
 @flags: filter flags
 Return the number of virDomainBlockBulkStats populated.
 
 where virDomainBlockBulkStats defined as:
 
 struct _virDomainBlockBulkStats {
 virDomainPtr domain;   /* domain the block stats belongs to */
 virTypedParameterPtr params; /* params to store block stats */
 unsigned int nparams;  /* how many params used for each block stats */
 unsigned int ndisks;   /* how many block stats in this domain */
 };
 
 Note:
 1. because the API allocate memory to store stats, the caller need to free it 
 after use.
 2. to distinguish each block stats in a domain, we need use a param to store 
 block device name.
 

 PS:
 It seems we need a bunch of bulk APIs to query stats, I wonder if I can 
 submit a patchset for each
 bulk API or must supply all the bulk APIs in one patchset?

 Whichever is easiest to review.  I suspect that smaller patches, each
 containing a single new API, will be simpler to review, but that's
 just my opinion.
 
 I prefer this way also.
 
 Thanks,
 Li Wei
 

 Rich.

 
 



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-05 Thread Eric Blake
On 08/04/2014 11:46 PM, Li Wei wrote:


 How about the following API:

 int virConnectGetAllBlockStats(virConnectPtr conn,
  virDomainPtr domain,
  virDomainBlockBulkStatsPtr *stats,
  unsigned int flags);
 @conn: pointer to libvirt connection
 @domain: pointer to the domain to be queried, NULL for all domains
 @stats: array of virDomainBlockBulkStats struct(see below) to be populated
 @flags: filter flags
 
 Because block stats only valid for active domains, may be this filter flag
 can be remove.

No, keep the flag. It is still useful to filter on at least transient
vs. persistent.
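
A minimal sketch of how such a filter flag could behave, assuming a bitmask where zero means "no filtering". The flag names `DEMO_FILTER_PERSISTENT` and `DEMO_FILTER_TRANSIENT` and the `demo_domain` type are hypothetical stand-ins, not the flags the final API would define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter bits; the real API would define its own names. */
enum {
    DEMO_FILTER_PERSISTENT = 1 << 0,
    DEMO_FILTER_TRANSIENT  = 1 << 1
};

/* Hypothetical stand-in for a domain handle. */
typedef struct {
    const char *name;
    bool persistent;
} demo_domain;

/* A domain matches when no filter bit is set, or when the bit for its
 * kind (persistent or transient) is set. */
static bool demo_domain_matches(const demo_domain *dom, unsigned int flags)
{
    assert(dom != NULL);
    if ((flags & (DEMO_FILTER_PERSISTENT | DEMO_FILTER_TRANSIENT)) == 0)
        return true;
    if (dom->persistent)
        return (flags & DEMO_FILTER_PERSISTENT) != 0;
    return (flags & DEMO_FILTER_TRANSIENT) != 0;
}

/* Count the domains a bulk query would report for the given flags. */
static size_t demo_count_matching(const demo_domain *doms, size_t n,
                                  unsigned int flags)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (demo_domain_matches(&doms[i], flags))
            hits++;
    return hits;
}
```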

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-04 Thread Richard W.M. Jones
On Mon, Aug 04, 2014 at 11:38:41AM +0800, Li Wei wrote:
 Hi,
 
 On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:
  
  Did anything come of this discussion, and/or is someone working on this?
 
 I am working on an API to query block stats in a bulk style and proposed an
 API as follow:
 
 virDomainBlockStatsBulkFlags(virDomainPtr dom,
virTypedParameterPtr params,
int nparams,
int ndisks,
unsigned int flags)
 
 @dom: pointer to domain object
 @params: an array of typed param to be populated with block stats
 @nparams: how many params used for each block device
 @ndisks: how many block devices to query
 @flags: flags to filter block devices (not used for now)
 
 Returns -1 in case of error, 0 in case of success.
 with params == NULL, nparams == -1, ndisks == 1, return number of params for 
 each block device.
 with params == NULL, nparams == -1, ndisks == -1, return number of disks in 
 the domain.
 
 A typical usage of this API should be:
 nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
 ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);
 
 params = VIR_ALLOC_N(params, nparams * ndisks);
 
 ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);
 
 ... do something with params
 
 VIR_FREE(params);
 
 With this bulk API, virt-top can updates in a short interval for a domain 
 with a lot of disks.
 Any comments?

I think this works OK for the case where you have 1 domain with
lots of disks.

However, if you have a large number of domains each with 1 or 2
disks, I think you would have the same problem as we have now.

Is it possible to design an API that can work across all domains
in a single call?

 PS:
 It seems we need a bunch of bulk APIs to query stats, I wonder if I can 
 submit a patchset for each
 bulk API or must supply all the bulk APIs in one patchset?

Whichever is easiest to review.  I suspect that smaller patches, each
containing a single new API, will be simpler to review, but that's
just my opinion.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-04 Thread Li Wei
Hi Richard,

Thanks for your comment!

On 08/04/2014 04:39 PM, Richard W.M. Jones wrote:
 On Mon, Aug 04, 2014 at 11:38:41AM +0800, Li Wei wrote:
 Hi,

 On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:

 Did anything come of this discussion, and/or is someone working on this?

 I am working on an API to query block stats in a bulk style and proposed an
 API as follow:

 virDomainBlockStatsBulkFlags(virDomainPtr dom,
   virTypedParameterPtr params,
   int nparams,
   int ndisks,
   unsigned int flags)

 @dom: pointer to domain object
 @params: an array of typed param to be populated with block stats
 @nparams: how many params used for each block device
 @ndisks: how many block devices to query
 @flags: flags to filter block devices (not used for now)

 Returns -1 in case of error, 0 in case of success.
 with params == NULL, nparams == -1, ndisks == 1, return number of params for 
 each block device.
 with params == NULL, nparams == -1, ndisks == -1, return number of disks in 
 the domain.

 A typical usage of this API should be:
 nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
 ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);

 params = VIR_ALLOC_N(params, nparams * ndisks);

 ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);

 ... do something with params

 VIR_FREE(params);

 With this bulk API, virt-top can updates in a short interval for a domain 
 with a lot of disks.
 Any comments?
 
 I think this works OK for the case where you have 1 domains with
 lots of disks.
 
 However if you have a large number of domains each with 1 or 2
 disks I think you would have the same problem as currently.

Yes, it is.

 
 Is it possible to design an API that can work across all domains
 in a single call?

How about the following API:

int virConnectGetAllBlockStats(virConnectPtr conn,
                               virDomainPtr domain,
                               virDomainBlockBulkStatsPtr *stats,
                               unsigned int flags);
@conn: pointer to libvirt connection
@domain: pointer to the domain to be queried, NULL for all domains
@stats: array of virDomainBlockBulkStats structs (see below) to be populated
@flags: filter flags
Returns the number of virDomainBlockBulkStats populated.

where virDomainBlockBulkStats is defined as:

struct _virDomainBlockBulkStats {
    virDomainPtr domain;         /* domain the block stats belong to */
    virTypedParameterPtr params; /* params to store block stats */
    unsigned int nparams;        /* how many params are used for each block stats entry */
    unsigned int ndisks;         /* how many block stats entries in this domain */
};

Note:
1. Because the API allocates the memory that stores the stats, the caller needs
to free it after use.
2. To distinguish the block stats of each device in a domain, we need to use a
param to store the block device name.
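
A minimal sketch of how a caller might walk and free the returned array, assuming the params for each domain are laid out disk by disk and that one param per disk carries the device name (note 2). All `demo_*` names are hypothetical; `demo_get_all()` is a mock producer standing in for the proposed call, and a plain string replaces `virDomainPtr`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for virTypedParameter. */
typedef struct {
    char field[32];     /* e.g. "name", "rd_bytes" */
    char str[32];       /* payload when the param is a string */
    long long value;    /* payload when the param is a counter */
} demo_param;

/* Mirrors the proposed struct, with a name instead of virDomainPtr. */
typedef struct {
    char domain[32];
    demo_param *params;     /* nparams * ndisks entries, disk by disk */
    unsigned int nparams;
    unsigned int ndisks;
} demo_bulk_stats;

/* Note 1: the API allocates, so the caller frees. */
static void demo_bulk_stats_free(demo_bulk_stats *stats, int count)
{
    if (stats == NULL)
        return;
    for (int i = 0; i < count; i++)
        free(stats[i].params);
    free(stats);
}

/* Mock producer: two domains, two disks each, two params per disk. */
static demo_bulk_stats *demo_get_all(int *count)
{
    static const char *disks[] = { "vda", "vdb" };
    const int ndomains = 2;
    demo_bulk_stats *stats = calloc((size_t)ndomains, sizeof(*stats));

    assert(count != NULL);
    if (stats == NULL)
        return NULL;
    for (int i = 0; i < ndomains; i++) {
        snprintf(stats[i].domain, sizeof(stats[i].domain), "vm%d", i);
        stats[i].nparams = 2;
        stats[i].ndisks = 2;
        stats[i].params = calloc(4, sizeof(demo_param));
        if (stats[i].params == NULL) {
            demo_bulk_stats_free(stats, i);
            return NULL;
        }
        for (unsigned int d = 0; d < 2; d++) {
            demo_param *name = &stats[i].params[d * 2];
            demo_param *rd = &stats[i].params[d * 2 + 1];
            strcpy(name->field, "name");      /* note 2: device name param */
            strcpy(name->str, disks[d]);
            strcpy(rd->field, "rd_bytes");
            rd->value = 100LL * (i + 1) + d;  /* fake counter */
        }
    }
    *count = ndomains;
    return stats;
}

/* Sum one counter over every disk of every domain. */
static long long demo_sum_field(const demo_bulk_stats *stats, int count,
                                const char *field)
{
    long long total = 0;
    for (int i = 0; i < count; i++)
        for (unsigned int d = 0; d < stats[i].ndisks; d++)
            for (unsigned int p = 0; p < stats[i].nparams; p++) {
                const demo_param *cur =
                    &stats[i].params[d * stats[i].nparams + p];
                if (strcmp(cur->field, field) == 0)
                    total += cur->value;
            }
    return total;
}
```

A caller would run `demo_get_all()`, read what it needs (for example `demo_sum_field(stats, count, "rd_bytes")`), then hand the whole array to `demo_bulk_stats_free()`.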

 
 PS:
 It seems we need a bunch of bulk APIs to query stats, I wonder if I can 
 submit a patchset for each
 bulk API or must supply all the bulk APIs in one patchset?
 
 Whichever is easiest to review.  I suspect that smaller patches, each
 containing a single new API, will be simpler to review, but that's
 just my opinion.

I prefer this way also.

Thanks,
Li Wei

 
 Rich.
 



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-04 Thread Li Wei


On 08/05/2014 01:36 PM, Li Wei wrote:
 Hi Richard,
 
 Thanks for your comment!
 
 On 08/04/2014 04:39 PM, Richard W.M. Jones wrote:
 On Mon, Aug 04, 2014 at 11:38:41AM +0800, Li Wei wrote:
 Hi,

 On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:

 Did anything come of this discussion, and/or is someone working on this?

 I am working on an API to query block stats in a bulk style and proposed an
 API as follow:

 virDomainBlockStatsBulkFlags(virDomainPtr dom,
  virTypedParameterPtr params,
  int nparams,
  int ndisks,
  unsigned int flags)

 @dom: pointer to domain object
 @params: an array of typed param to be populated with block stats
 @nparams: how many params used for each block device
 @ndisks: how many block devices to query
 @flags: flags to filter block devices (not used for now)

 Returns -1 in case of error, 0 in case of success.
 with params == NULL, nparams == -1, ndisks == 1, return number of params 
 for each block device.
 with params == NULL, nparams == -1, ndisks == -1, return number of disks in 
 the domain.

 A typical usage of this API should be:
 nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
 ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);

 params = VIR_ALLOC_N(params, nparams * ndisks);

 ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);

 ... do something with params

 VIR_FREE(params);

 With this bulk API, virt-top can updates in a short interval for a domain 
 with a lot of disks.
 Any comments?

 I think this works OK for the case where you have 1 domains with
 lots of disks.

 However if you have a large number of domains each with 1 or 2
 disks I think you would have the same problem as currently.
 
 Yes, it is.
 

 Is it possible to design an API that can work across all domains
 in a single call?
 
 How about the following API:
 
 int virConnectGetAllBlockStats(virConnectPtr conn,
   virDomainPtr domain,
   virDomainBlockBulkStatsPtr *stats,
   unsigned int flags);
 @conn: pointer to libvirt connection
 @domain: pointer to the domain to be queried, NULL for all domains
 @stats: array of virDomainBlockBulkStats struct(see below) to be populated
 @flags: filter flags

Because block stats are only valid for active domains, maybe this filter flag
can be removed.

Thanks.

 Return the number of virDomainBlockBulkStats populated.
 
 where virDomainBlockBulkStats defined as:
 
 struct _virDomainBlockBulkStats {
 virDomainPtr domain;   /* domain the block stats belongs to */
 virTypedParameterPtr params; /* params to store block stats */
 unsigned int nparams;  /* how many params used for each block stats */
 unsigned int ndisks;   /* how many block stats in this domain */
 };
 
 Note:
 1. because the API allocate memory to store stats, the caller need to free it 
 after use.
 2. to distinguish each block stats in a domain, we need use a param to store 
 block device name.
 

 PS:
 It seems we need a bunch of bulk APIs to query stats, I wonder if I can 
 submit a patchset for each
 bulk API or must supply all the bulk APIs in one patchset?

 Whichever is easiest to review.  I suspect that smaller patches, each
 containing a single new API, will be simpler to review, but that's
 just my opinion.
 
 I prefer this way also.
 
 Thanks,
 Li Wei
 

 Rich.

 
 



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-08-03 Thread Li Wei
Hi,

On 07/22/2014 03:25 PM, Richard W.M. Jones wrote:
 
 Did anything come of this discussion, and/or is someone working on this?

I am working on an API to query block stats in bulk, and propose the
following API:

int virDomainBlockStatsBulkFlags(virDomainPtr dom,
                                 virTypedParameterPtr params,
                                 int nparams,
                                 int ndisks,
                                 unsigned int flags)

@dom: pointer to domain object
@params: an array of typed param to be populated with block stats
@nparams: how many params used for each block device
@ndisks: how many block devices to query
@flags: flags to filter block devices (not used for now)

Returns -1 in case of error, 0 in case of success.
With params == NULL, nparams == -1, ndisks == 1, returns the number of params
for each block device.
With params == NULL, nparams == -1, ndisks == -1, returns the number of disks
in the domain.

A typical usage of this API should be:
nparams = virDomainBlockStatsBulkFlags(dom, NULL, -1, 1, 0);
ndisks = virDomainBlockStatsBulkFlags(dom, NULL, -1, -1, 0);

/* VIR_ALLOC_N() fills in the pointer it is given and returns a status code */
if (VIR_ALLOC_N(params, nparams * ndisks) < 0)
    goto error;

ret = virDomainBlockStatsBulkFlags(dom, params, nparams, ndisks, 0);

... do something with params

VIR_FREE(params);

With this bulk API, virt-top can update at short intervals for a domain with
a lot of disks.
Any comments?
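
A minimal sketch of the calling convention described above: query the two sizes, allocate once, then fetch everything in a single call. `demo_block_stats_bulk()` is a mock with a hypothetical name that only imitates the proposed semantics (the domain and flags arguments are dropped), and `demo_param` is a simplified stand-in for `virTypedParameter`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for virTypedParameter: one named counter. */
typedef struct {
    char field[32];
    long long value;
} demo_param;

#define DEMO_NPARAMS 2   /* stats reported per disk in this mock */
#define DEMO_NDISKS  3   /* disks attached to the mock domain */

/*
 * Mock of the proposed convention (not a real libvirt function):
 *   params == NULL, nparams == -1, ndisks ==  1 -> number of params per disk
 *   params == NULL, nparams == -1, ndisks == -1 -> number of disks
 *   otherwise fill params and return 0, or -1 on error.
 */
static int demo_block_stats_bulk(demo_param *params, int nparams, int ndisks)
{
    static const char *names[DEMO_NPARAMS] = { "rd_bytes", "wr_bytes" };

    if (params == NULL && nparams == -1)
        return ndisks == -1 ? DEMO_NDISKS : DEMO_NPARAMS;
    if (params == NULL || nparams != DEMO_NPARAMS ||
        ndisks < 0 || ndisks > DEMO_NDISKS)
        return -1;

    for (int d = 0; d < ndisks; d++) {
        for (int p = 0; p < nparams; p++) {
            demo_param *slot = &params[d * nparams + p];
            strncpy(slot->field, names[p], sizeof(slot->field) - 1);
            slot->field[sizeof(slot->field) - 1] = '\0';
            slot->value = 1000LL * (d + 1) + p;   /* fake counter */
        }
    }
    return 0;
}

/* Query both sizes, allocate once, fetch everything in one call.
 * Returns the array (caller frees it) or NULL on failure. */
static demo_param *demo_fetch_all(int *nparams, int *ndisks)
{
    demo_param *params;

    assert(nparams != NULL && ndisks != NULL);
    *nparams = demo_block_stats_bulk(NULL, -1, 1);
    *ndisks = demo_block_stats_bulk(NULL, -1, -1);
    if (*nparams <= 0 || *ndisks <= 0)
        return NULL;

    params = calloc((size_t)*nparams * (size_t)*ndisks, sizeof(*params));
    if (params == NULL)
        return NULL;
    if (demo_block_stats_bulk(params, *nparams, *ndisks) < 0) {
        free(params);
        return NULL;
    }
    return params;
}
```

The stats of disk `d` start at index `d * nparams`, so the whole domain costs three calls regardless of how many disks it has.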

PS:
It seems we need a bunch of bulk APIs to query stats. Can I submit a separate
patchset for each bulk API, or must I supply all the bulk APIs in one patchset?

Thanks,
Li Wei


 
 Rich.
 



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-22 Thread Richard W.M. Jones

Did anything come of this discussion, and/or is someone working on this?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-09 Thread Francesco Romani


- Original Message -
 From: Francesco Romani from...@redhat.com
 To: libvir-list@redhat.com
 Sent: Friday, July 4, 2014 6:44:07 PM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats

   However, a question here about bulk APIs.
   One cornerstone of oVirt is shared storage (NFS, iSCSI...); another is
   qemu/kvm, and COW images are supported (probably even the default, need
   to check).
   
   When storage became unavailable because of a network outage, it happened
   that virDomainGetBlockInfo blocked beyond recovery.
   
   In such scenarios, how will a bulk API behave? Will there be a timeout or
   something else?
  
  It depends on the storage and the way it is configured. If NFS is mounted
  with 'hard' + 'nointr' any call libvirt makes to dead storage will get
  stuck in an uninterruptible sleep in kernel space. There's no way for
  libvirt to time out since by the very definition of 'hard' mount option
  it does not time out. If you mount with 'soft' then the calls libvirt
  makes will time out.
 
 My bad, I worded my question poorly.
 
 What I mean is: on top of what the kernel or QEMU (libnfs, libiscsi) does,
 are there plans for any additional mechanism/safeguard?
 (I guess no, I'm asking just to be sure).

Hi,

maybe borderline offtopic, but still about blocking calls:

We (VDSM/oVirt developers) are reviewing our usage of libvirt in sampling.
After a (quick) inspection of the code, I believe the following calls cannot
block due to FS/storage issues, as they do not need storage in any way.

I'm quite confident about these:
* virDomainGetCPUStats: uses cgroups only (no FS/storage access)
* virDomainInterfaceStats: uses /proc/net/dev (no FS/storage access)
* virDomainGetVcpus: uses /proc and a syscall for PCPU affinity (no
  FS/storage access)
* virDomainGetSchedulerParameters: uses cgroups (no FS/storage access)

I'm not sure about these, but it looks to me like they don't need to access
FS/storage either:
* virDomainGetVcpusFlags
* virDomainGetMetadata


Can anyone please confirm or deny?

Thanks and best regards

-- 
Francesco Romani
RedHat Engineering Virtualization R&D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-09 Thread Daniel P. Berrange
On Wed, Jul 09, 2014 at 06:14:12AM -0400, Francesco Romani wrote:
 
 
 - Original Message -
  From: Francesco Romani from...@redhat.com
  To: libvir-list@redhat.com
  Sent: Friday, July 4, 2014 6:44:07 PM
  Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats
 
However, a question here about bulk APIs.
One cornerstone of oVirt is shared storage (NFS, ISCSI...); another is
qemu/kvm,
and COW images are supported (probably even the default, need to check).

Due to storage being unavailable because a network outage, it happened
that
virDomainGetBlockInfo blocked beyond recover.

On such scenarios, how will a bulk API behave? There will be a timeout 
or
something else?
   
   It depends on the storage and the way it is configured. If NFS is mounted
   with 'hard' + 'nointr' any call libvirt makes to dead storage will get
   stuck in an uninterruptable sleep in kernel space. There's no way for
   libvirt to time out since by the very definition of 'hard' mount option
   it does not time out. If you mount with 'soft' then the calls libvirt
   makes will time out.
  
  My bad, I worded poorly my question.
  
  What I mean is: on top of what the kernel or QEMU (libnfs, libiscsi) does,
  there are plans for any additional mechanism/safeguard?
  (I guess no, I'm asking just to be sure).
 
 Hi,
 
 maybe borderline offtopic, but still about blocking calls:
 
 We (VDSM/oVirt developers) are reviewing our usage of libvirt in sampling.
 Afer a (quick) inspection of the code, I believe the following calls cannot
 block due to FS/storage issues, as they do not need it in any way
 
 I'm quite confident about these
 * virDomainGetCPUStats: uses cgroups only (no FS/storage access)
 * virDomainInterfaceStats: uses /proc/net/dev  (no FS/storage access)
 * virDomainGetVcpus: uses uses /proc and syscall for PCPU affinity (no 
 FS/storage access)
 * virDomainSchedulerParameters: which uses cgroups (no FS/storage access)
 
 Not sure about this, but it looks to me they don't need to access FS/storage 
 either:
 * virDomainGetVcpusFlags
 * virDomainGetMetadata
 
 
 Can please anyone confirm or deny?

If there is a prior call to libvirt that involves that guest domain
and has blocked on storage, then this can prevent subsequent calls
from completing, since the prior call may hold a lock.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Daniel P. Berrange
On Thu, Jul 03, 2014 at 01:49:41PM -0600, Eric Blake wrote:
 On 07/01/2014 03:33 AM, Daniel P. Berrange wrote:
 
   1. Time to write() the RPC call to the socket
   2. Time for libvirtd to process the RPC call
   3. Time to recv() the RPC reply from the socket
   ...and so on..
  
  If the time for item 2 dominates over the time for items 1 and 3 (which
  it should really) then the client thread is going to be sleeping in a
  poll() for the bulk of the duration of the libvirt API call. If we had
  an async API mechanism, then the VDSM time would essentially be consumed
  with
  
   1. Time to write() the RPC call to the socket
   2. Time to write() the RPC call to the socket
   3. Time to write() the RPC call to the socket
   4. Time to write() the RPC call to the socket
   5. Time to write() the RPC call to the socket
   6. Time to write() the RPC call to the socket
   7. wait for replies to start arriving
   8. Time to recv() the RPC reply from the socket
   9. Time to recv() the RPC reply from the socket
   10. Time to recv() the RPC reply from the socket
   11. Time to recv() the RPC reply from the socket
   12. Time to recv() the RPC reply from the socket
   13. Time to recv() the RPC reply from the socket
   14. Time to recv() the RPC reply from the socket
 
 This assumes you are still calling one async call per domain query.
 
 With regards to a bulk API, are you thinking synchronous?
 
 1. Time to write() the RPC call - one bulk request
 2. wait for reply - oh, and we'd better increase our on-wire size limits
 3. Time to recv() the RPC reply - one bulk response
 
 or asynchronous?
 
 1. Time to write() the RPC call - one bulk request
 2. wait for replies to start arriving
 3. Time to recv() an RPC async reply - first domain
 4. Time to recv() an RPC async reply - second domain
 ...
 n. Time to recv() final RPC async reply
 
 The asynchronous approach works nicely in that we don't have to size up our
 max RPC on-wire limits, but it implies that you still need a callback invoked
 once per reply received, instead of getting all data back in one giant
 memory blob.

I was thinking the former actually, but the latter is another possibility
to consider I guess.
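
A minimal sketch contrasting the two bulk shapes discussed above, assuming a mock that fabricates one reply per domain. It only illustrates the API surface: the "streamed" variant invokes a callback once per reply, while the "single" variant needs a buffer big enough for everything, which is the analogue of the on-wire size limit. All `demo_*` names are hypothetical, and nothing here is actually asynchronous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-domain reply. */
typedef struct {
    int domain_id;
    long long rd_bytes;
} demo_reply;

typedef void (*demo_reply_cb)(const demo_reply *reply, void *opaque);

/* Streamed shape: one callback per domain reply.
 * Runs synchronously here; a real client would get these from the event loop. */
static int demo_bulk_streamed(int ndomains, demo_reply_cb cb, void *opaque)
{
    assert(cb != NULL);
    for (int i = 0; i < ndomains; i++) {
        demo_reply reply = { i, 10LL * (i + 1) };
        cb(&reply, opaque);
    }
    return ndomains;
}

/* Single-response shape: everything lands in one array.
 * Fails when the array is too small, much like exceeding a message limit. */
static int demo_bulk_single(int ndomains, demo_reply *out, size_t max)
{
    if (out == NULL || ndomains < 0 || (size_t)ndomains > max)
        return -1;
    for (int i = 0; i < ndomains; i++) {
        out[i].domain_id = i;
        out[i].rd_bytes = 10LL * (i + 1);
    }
    return ndomains;
}

/* Example callback: accumulate a running total without storing replies. */
static void demo_sum_cb(const demo_reply *reply, void *opaque)
{
    long long *total = opaque;
    *total += reply->rd_bytes;
}
```

The streamed shape lets the caller process replies without holding them all in memory; the single shape is simpler to call but scales its buffer with the number of domains.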

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Richard W.M. Jones
On Tue, Jul 01, 2014 at 03:09:13AM -0400, Francesco Romani wrote:
 I'd like to discuss possible APIs and plans for new query APIs in libvirt.
 
 I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for
 VDSM; VDSM is the node management daemon, which is in charge, among many
 other things, of gathering the host statistics and the per-Domain/VM
 statistics.
 
 Right now we aim for a number of VMs per node in the (few) hundreds, but we
 have big plans to scale much more, and to possibly reach thousands in the
 not so distant future.
 At the moment, we use one thread per VM to gather the VM stats (CPU, network,
 disk), and of course this scales poorly.

I'll just note here that a bug has been opened for virt-top, which
is similar to this.

If a domain has a large number of disks (256 virtio-scsi disks in the
customer's case), then virt-top spends so long fetching the data for
each separate disk, it can take 30-40 seconds between updates.

The same thing would happen if you had lots of domains, each with a
few disks, but with the total adding up to hundreds of disks.

The same thing would happen if you substitute network interfaces for disks.

What would help for us:

 - A way to get information for multiple objects in a single domain

 - A way to get information for multiple objects across multiple domains

in as few API round trips as possible.
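
As a rough worked figure, 30-40 seconds for 256 disks is about 0.12-0.16 seconds per disk, assuming the time is dominated by the per-disk calls. A minimal sketch of how the round-trip count changes with the API granularity follows; the `demo_*` names are hypothetical and the model ignores everything except the number of calls.

```c
#include <assert.h>
#include <stddef.h>

/* Granularity of the stats call. */
typedef enum {
    DEMO_PER_OBJECT,      /* one call per disk (current situation) */
    DEMO_PER_DOMAIN,      /* one bulk call per domain */
    DEMO_PER_CONNECTION   /* one bulk call for all domains */
} demo_api_style;

/* Number of API round trips needed to sample every disk once. */
static size_t demo_round_trips(const size_t *disks_per_domain, size_t ndomains,
                               demo_api_style style)
{
    size_t total_disks = 0;

    assert(disks_per_domain != NULL || ndomains == 0);
    for (size_t i = 0; i < ndomains; i++)
        total_disks += disks_per_domain[i];

    switch (style) {
    case DEMO_PER_OBJECT:
        return total_disks;
    case DEMO_PER_DOMAIN:
        return ndomains;
    case DEMO_PER_CONNECTION:
        return ndomains > 0 ? 1 : 0;
    }
    return 0;
}
```

One domain with 256 disks drops from 256 round trips to 1 with a per-domain call, but 100 domains with 2 disks each still need 100 round trips unless the call spans the whole connection.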

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Richard W.M. Jones

On Tue, Jul 01, 2014 at 09:35:21AM +0100, Daniel P. Berrange wrote:
 For the async API design, I could see two potential designs
 
 1. A custom callback to run per API
 
   typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
                                              bool isError,
                                              virDomainBlockInfoPtr info,
                                              void *opaque);
 
   int virDomainGetBlockInfoAsync(virDomainPtr dom,
                                  const char *disk,
                                  virDomainBlockInfoCallback cb,
                                  void *opaque,
                                  unsigned int flags);
 
 
 2. A standard callback and a pair of APIs
 
   typedef void *virDomainAsyncResult;
   typedef void (*virDomainAsyncCallback)(virDomainPtr dom,
                                          virDomainAsyncResult res);
 
   void virDomainGetBlockInfoAsync(virDomainPtr dom,
                                   const char *disk,
                                   virDomainAsyncCallback cb,
                                   void *opaque,
                                   unsigned int flags);
   int virDomainGetBlockInfoFinish(virDomainPtr dom,
                                   virDomainAsyncResult res,
                                   virDomainBlockInfoPtr info);

Could we consider an API which worked across all active domains?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Francesco Romani
- Original Message -
 From: Richard W.M. Jones rjo...@redhat.com
 To: Francesco Romani from...@redhat.com
 Cc: libvir-list@redhat.com
 Sent: Friday, July 4, 2014 1:11:54 PM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats

  Right now we aim for a number of VM per node in the (few) hundreds, but we
  have big plans
  to scale much more, and to possibly reach thousands in a not so distant
  future.
  At the moment, we use one thread per VM to gather the VM stats (CPU,
  network, disk),
  and of course this obviously scales poorly.
 
 I'll just note here that a bug has been opened for virt-top, which
 is similar to this.
 
 If a domain has a large number of disks (256 virtio-scsi disks in the
 customer's case), then virt-top spends so long fetching the data for
 each separate disk, it can take 30-40 seconds between updates.
 
 The same thing would happen if you had lots of domains, each with a
 few disks, but with the total adding up to hundreds of disks.
 
 The same thing would happen if you substitute network interfaces for disks.
 
 What would help for us:
 
  - A way to get information for multiple objects in a single domain
 
  - A way to get information for multiple objects across multiple domains
 
 in as few API round trips as possible.

I concur. Actually you also expressed our (VDSM) need better than I did.
I think we are in the same boat.

Bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Daniel P. Berrange
On Fri, Jul 04, 2014 at 12:14:06PM +0100, Richard W.M. Jones wrote:
 
 On Tue, Jul 01, 2014 at 09:35:21AM +0100, Daniel P. Berrange wrote:
  For the async API design, I could see two potential designs
  
  1. A custom callback to run per API
  
    typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
                                               bool isError,
                                               virDomainBlockInfoPtr info,
                                               void *opaque);
  
    int virDomainGetBlockInfoAsync(virDomainPtr dom,
                                   const char *disk,
                                   virDomainBlockInfoCallback cb,
                                   void *opaque,
                                   unsigned int flags);
  
  
  2. A standard callback and a pair of APIs
  
    typedef void *virDomainAsyncResult;
    typedef void (*virDomainAsyncCallback)(virDomainPtr dom,
                                           virDomainAsyncResult res);
  
    void virDomainGetBlockInfoAsync(virDomainPtr dom,
                                    const char *disk,
                                    virDomainAsyncCallback cb,
                                    void *opaque,
                                    unsigned int flags);
    int virDomainGetBlockInfoFinish(virDomainPtr dom,
                                    virDomainAsyncResult res,
                                    virDomainBlockInfoPtr info);
 
 Could we consider an API which worked across all active domains?

Of course. I was intentionally ignoring the bulk API side of the
request in this example, to just focus on the illustration of some
general patterns for providing an async API design.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Daniel P. Berrange
On Fri, Jul 04, 2014 at 12:11:54PM +0100, Richard W.M. Jones wrote:
 On Tue, Jul 01, 2014 at 03:09:13AM -0400, Francesco Romani wrote:
  I'd like to discuss possible APIs and plans for new query APIs in libvirt.
  
  I'm one of the oVirt (http://www.ovirt.org) developers, and I write code 
  for VDSM;
  VDSM is the node management daemon, which is in charge, among many other 
  things, to
  gather the host and statistics per Domain/VM.
  
  Right now we aim for a number of VM per node in the (few) hundreds, but we 
  have big plans
  to scale much more, and to possibly reach thousands in a not so distant 
  future.
  At the moment, we use one thread per VM to gather the VM stats (CPU, 
  network, disk),
  and of course this obviously scales poorly.
 
 I'll just note here that a bug has been opened for virt-top, which
 is similar to this.
 
 If a domain has a large number of disks (256 virtio-scsi disks in the
 customer's case), then virt-top spends so long fetching the data for
 each separate disk, it can take 30-40 seconds between updates.
 
 The same thing would happen if you had lots of domains, each with a
 few disks, but with the total adding up to hundreds of disks.
 
 The same thing would happen if you substitute network interfaces for disks.
 
 What would help for us:
 
  - A way to get information for multiple objects in a single domain
 
  - A way to get information for multiple objects across multiple domains

I'd say that we want something similar to the virDomainListAllDomains()
API for stats. ie we shouldn't try to pass in the full list of domains
or paths we want info for. We should just list all domains, optionally
using flags to filter based on some characteristic, eg exclude inactive.
Similarly always list stats for all disks.
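
As a rough sketch of that calling convention (no domain list passed in, only
filter flags, and every disk always included), something along these lines
could work. All type, flag and function names below are hypothetical
stand-ins operating on an in-memory table; none of them are libvirt API.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical filter flag: only report running domains. */
enum { STATS_FILTER_ACTIVE = 1 << 0 };

struct disk_stats { const char *name; long long rd_bytes, wr_bytes; };

struct domain_rec {
    const char *name;
    int active;                        /* non-zero if the domain is running */
    size_t ndisks;
    const struct disk_stats *disks;    /* stats for all disks, never a subset */
};

/*
 * Collect pointers to every domain matching flags into a newly allocated
 * array.  Returns the number of records, or -1 on allocation failure.
 * The caller frees *out.
 */
int list_all_domain_stats(const struct domain_rec *all, size_t nall,
                          const struct domain_rec ***out, unsigned int flags)
{
    const struct domain_rec **res;
    size_t i, n = 0;

    res = calloc(nall ? nall : 1, sizeof(*res));
    if (!res)
        return -1;

    for (i = 0; i < nall; i++) {
        if ((flags & STATS_FILTER_ACTIVE) && !all[i].active)
            continue;                  /* filtered out: inactive domain */
        res[n++] = &all[i];
    }
    *out = res;
    return (int)n;
}
```

Passing 0 as flags returns every domain; passing STATS_FILTER_ACTIVE skips
the inactive ones. The caller never has to enumerate domains or disk paths
up front.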

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Richard W.M. Jones
On Fri, Jul 04, 2014 at 12:33:27PM +0100, Daniel P. Berrange wrote:
 On Fri, Jul 04, 2014 at 12:11:54PM +0100, Richard W.M. Jones wrote:
  On Tue, Jul 01, 2014 at 03:09:13AM -0400, Francesco Romani wrote:
   I'd like to discuss possible APIs and plans for new query APIs in libvirt.
   
   I'm one of the oVirt (http://www.ovirt.org) developers, and I write code 
   for VDSM;
   VDSM is the node management daemon, which is in charge, among many other 
   things, to
   gather the host and statistics per Domain/VM.
   
   Right now we aim for a number of VM per node in the (few) hundreds, but 
   we have big plans
   to scale much more, and to possibly reach thousands in a not so distant 
   future.
   At the moment, we use one thread per VM to gather the VM stats (CPU, 
   network, disk),
   and of course this obviously scales poorly.
  
  I'll just note here that a bug has been opened for virt-top, which
  is similar to this.
  
  If a domain has a large number of disks (256 virtio-scsi disks in the
  customer's case), then virt-top spends so long fetching the data for
  each separate disk, it can take 30-40 seconds between updates.
  
  The same thing would happen if you had lots of domains, each with a
  few disks, but with the total adding up to hundreds of disks.
  
  The same thing would happen if you substitute network interfaces for disks.
  
  What would help for us:
  
   - A way to get information for multiple objects in a single domain
  
   - A way to get information for multiple objects across multiple domains
 
 I'd say that we want something similar to the virDomainListAllDomains()
 API for stats. ie we shouldn't try to pass in the full list of domains
 or paths we want info for. We should just list all domains, optionally
 using flags to filter based on some characteristic, eg exclude inactive.
 Similarly always list stats for all disks.

FYI for virt-top we only care about stats of all active domains, and
we only care about all disks & all network interfaces for domains
(ie. never any subset).

We also collect CPU time and memory usage per domain.

Of course this only applies to virt-top, not to other clients.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Francesco Romani
- Original Message -
 From: Richard W.M. Jones rjo...@redhat.com
 To: Daniel P. Berrange berra...@redhat.com
 Cc: libvir-list@redhat.com, Francesco Romani from...@redhat.com
 Sent: Friday, July 4, 2014 1:39:57 PM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats

   What would help for us:
   
- A way to get information for multiple objects in a single domain
   
- A way to get information for multiple objects across multiple domains
  
  I'd say that we want something similar to the virDomainListAllDomains()
  API for stats. ie we shouldn't try to pass in the full list of domains
  or paths we want info for. We should just list all domains, optionally
  using flags to filter based on some characteristic, eg exclude inactive.
  Similarly always list stats for all disks.
 
 FYI for virt-top we only care about stats of all active domains, and
 we only care about all disks & all network interfaces for domains
 (ie. never any subset).
 
 We also collect CPU time and memory usage per domain.

The same goes for VDSM. VDSM takes ownership of all the domains on a host,
so it never does any kind of filtering or considers subsets of any kind.

However, a question here about bulk APIs.
One cornerstone of oVirt is shared storage (NFS, iSCSI...); another is qemu/kvm,
and COW images are supported (probably even the default, need to check).

When storage became unavailable because of a network outage, it happened that
virDomainGetBlockInfo blocked beyond recovery.

In such scenarios, how will a bulk API behave? Will there be a timeout or
something else?

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Daniel P. Berrange
On Fri, Jul 04, 2014 at 12:13:32PM -0400, Francesco Romani wrote:
 - Original Message -
  From: Richard W.M. Jones rjo...@redhat.com
  To: Daniel P. Berrange berra...@redhat.com
  Cc: libvir-list@redhat.com, Francesco Romani from...@redhat.com
  Sent: Friday, July 4, 2014 1:39:57 PM
  Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats
 
What would help for us:

 - A way to get information for multiple objects in a single domain

 - A way to get information for multiple objects across multiple domains
   
   I'd say that we want something similar to the virDomainListAllDomains()
   API for stats. ie we shouldn't try to pass in the full list of domains
   or paths we want info for. We should just list all domains, optionally
   using flags to filter based on some characteristic, eg exclude inactive.
   Similarly always list stats for all disks.
  
  FYI for virt-top we only care about stats of all active domains, and
  we only care about all disks & all network interfaces for domains
  (ie. never any subset).
  
  We also collect CPU time and memory usage per domain.
 
 Is the same for VDSM. VDSM takes ownership of all the domain on an host,
 so all it never does any kind of filtering or consider subsets of any kind.
 
 However, a question here about bulk APIs.
 One cornerstone of oVirt is shared storage (NFS, ISCSI...); another is 
 qemu/kvm,
 and COW images are supported (probably even the default, need to check).
 
 Due to storage being unavailable because a network outage, it happened that
 virDomainGetBlockInfo blocked beyond recover.
 
 On such scenarios, how will a bulk API behave? There will be a timeout or
 something else?

It depends on the storage and the way it is configured. If NFS is mounted
with 'hard' + 'nointr' any call libvirt makes to dead storage will get
stuck in an uninterruptible sleep in kernel space. There's no way for
libvirt to time out since by the very definition of 'hard' mount option
it does not time out. If you mount with 'soft' then the calls libvirt
makes will time out.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-04 Thread Francesco Romani
- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: Francesco Romani from...@redhat.com
 Cc: libvir-list@redhat.com, Richard W.M. Jones rjo...@redhat.com
 Sent: Friday, July 4, 2014 6:21:30 PM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats

  However, a question here about bulk APIs.
  One cornerstone of oVirt is shared storage (NFS, ISCSI...); another is
  qemu/kvm,
  and COW images are supported (probably even the default, need to check).
  
  Due to storage being unavailable because a network outage, it happened that
  virDomainGetBlockInfo blocked beyond recover.
  
  On such scenarios, how will a bulk API behave? There will be a timeout or
  something else?
 
 It depends on the storage and the way it is configured. If NFS is mounted
 with 'hard' + 'nointr' any call libvirt makes to dead storage will get
 stuck in an uninterruptible sleep in kernel space. There's no way for
 libvirt to time out since by the very definition of 'hard' mount option
 it does not time out. If you mount with 'soft' then the calls libvirt
 makes will time out.

My bad, I worded my question poorly.

What I mean is: on top of what the kernel or QEMU (libnfs, libiscsi) does,
are there plans for any additional mechanism/safeguard?
(I guess not; I'm asking just to be sure.)

VDSM already uses soft mounts for NFS (need to check what we do for iSCSI and
the other supported storage).

Thanks and bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-03 Thread Eric Blake
On 07/01/2014 02:35 AM, Daniel P. Berrange wrote:

 1. A custom callback to run per API
 
   typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
                                              bool isError,
                                              virDomainBlockInfoPtr info,
                                              void *opaque);
 

It might be nice to require the callback to return an int; 0 to keep
going, non-zero to stop immediately.

   int virDomainGetBlockInfoAsync(virDomainPtr dom,
                                  const char *disk,
                                  virDomainBlockInfoCallback cb,
                                  void *opaque,
                                  unsigned int flags);

What should this function return on success, 0 or the number of times
the callback was reached?  However, even if we add a callback return
value (non-zero to quit immediately), I don't think feeding it directly
to the return value is nice; we still want to reserve negative values
for errors (couldn't even invoke callbacks, perhaps because dom was a
bad pointer).  Besides, a user can always use opaque to collect counts
of how many times the callback was invoked, and/or a specific return
value on early exit.
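
A minimal sketch of that convention, under the assumption that the callback
returns 0 to continue and non-zero to stop, and that the driving function
reserves -1 for "could not invoke callbacks at all". The types and function
names are hypothetical stand-ins working on in-memory arrays, not libvirt
API.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-disk result. */
struct block_info { unsigned long long capacity, allocation; };

/* Return 0 to keep going, non-zero to stop immediately. */
typedef int (*block_info_cb)(const char *disk, int is_error,
                             const struct block_info *info, void *opaque);

/*
 * Invoke cb once per disk.  Returns -1 if the callbacks could not be
 * invoked at all (bad arguments), 0 otherwise, whether or not the
 * callback asked to stop early.
 */
int for_each_block_info(const char *const *disks,
                        const struct block_info *infos, size_t ndisks,
                        block_info_cb cb, void *opaque)
{
    size_t i;

    if (!cb || (ndisks && (!disks || !infos)))
        return -1;

    for (i = 0; i < ndisks; i++) {
        if (cb(disks[i], 0, &infos[i], opaque) != 0)
            break;                     /* callback requested an early stop */
    }
    return 0;
}

/* Caller-side state carried through opaque. */
struct walk_state { size_t calls; size_t limit; };

/* Counts invocations and requests a stop once 'limit' calls were made. */
int count_and_stop(const char *disk, int is_error,
                   const struct block_info *info, void *opaque)
{
    struct walk_state *st = opaque;

    (void)disk; (void)is_error; (void)info;
    st->calls++;
    return st->calls >= st->limit;
}
```

The invocation count lives entirely in the caller's walk_state, so the return
value of for_each_block_info() stays free to signal errors only.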


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-03 Thread Eric Blake
On 07/01/2014 03:33 AM, Daniel P. Berrange wrote:

  1. Time to write() the RPC call to the socket
  2. Time for libvirtd to process the RPC call
  3. Time to recv() the RPC reply from the socket
  ...and so on..
 
 If the time for item 2 dominates over the time for items 1 & 3 (which
 it should really) then the client thread is going to be sleeping in a
 poll() for the bulk of the duration of the libvirt API call. If we had
 an async API mechanism, then the VDSM time would essentially be consumed
 with
 
  1. Time to write() the RPC call to the socket
  2. Time to write() the RPC call to the socket
  3. Time to write() the RPC call to the socket
  4. Time to write() the RPC call to the socket
  5. Time to write() the RPC call to the socket
  6. Time to write() the RPC call to the socket
  7. wait for replies to start arriving
  8. Time to recv() the RPC reply from the socket
  9. Time to recv() the RPC reply from the socket
  10. Time to recv() the RPC reply from the socket
  11. Time to recv() the RPC reply from the socket
  12. Time to recv() the RPC reply from the socket
  13. Time to recv() the RPC reply from the socket
  14. Time to recv() the RPC reply from the socket

This assumes you are still calling one async call per domain query.

With regards to a bulk API, are you thinking synchronous?

1. Time to write() the RPC call - one bulk request
2. wait for reply - oh, and we'd better increase our on-wire size limits
3. Time to recv() the RPC reply - one bulk response

or asynchronous?

1. Time to write() the RPC call - one bulk request
2. wait for replies to start arriving
3. Time to recv() an RPC async reply - first domain
4. Time to recv() an RPC async reply - second domain
...
n. Time to recv() final RPC async reply

The asynchronous variant works nicely in that we don't have to size up our max
RPC on-wire limits, but implies that you still need a callback invoked
once per reply received, instead of getting all data back in one giant
memory blob.
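
For comparison, here is a rough timing model of the variants discussed above.
It is only a sketch: the struct and function names are invented, the cost
figures in the usage note are illustrative, and the model assumes the daemon
processes pipelined requests concurrently and that a bulk reply takes about
as long to receive as the individual replies combined.

```c
/* Per-call costs in microseconds. */
struct rpc_cost { unsigned long write_us, process_us, recv_us; };

/* N synchronous calls: each pays write + process + recv in turn. */
unsigned long sync_total_us(struct rpc_cost c, unsigned long n)
{
    return n * (c.write_us + c.process_us + c.recv_us);
}

/*
 * N pipelined async calls: all writes go out first, then the replies are
 * read.  The client waits for roughly one processing period.
 */
unsigned long async_total_us(struct rpc_cost c, unsigned long n)
{
    if (n == 0)
        return 0;
    return n * c.write_us + c.process_us + n * c.recv_us;
}

/* One bulk call covering N domains: one write, one (larger) reply. */
unsigned long bulk_total_us(struct rpc_cost c, unsigned long n)
{
    if (n == 0)
        return 0;
    return c.write_us + c.process_us + n * c.recv_us;
}
```

With assumed costs of 10 us per write, 1000 us of processing and 10 us per
reply, six domains cost 6120 us synchronously, 1120 us pipelined and 1070 us
in bulk under this model. The bulk and pipelined figures are close; the
difference between them is mostly about reply size limits and how the caller
consumes the data.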

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-02 Thread Francesco Romani
- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: Francesco Romani from...@redhat.com
 Cc: libvir-list@redhat.com
 Sent: Tuesday, July 1, 2014 10:35:21 AM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats
 

[...] 
  We [in VDSM] currently use these APIs for our sampling:
virDomainBlockInfo
virDomainGetInfo
virDomainGetCPUStats
virDomainBlockStats
virDomainBlockStatsFlags
virDomainInterfaceStats
virDomainGetVcpusFlags
virDomainGetMetadata
 
 Why do you need to call virDomainGetMetadata so often? That merely contains
 an opaque data blob that can only have come from VDSM itself, so I'm surprised
 you need to call that at all frequently.

We store some QoS info in the domain metadata. Actually we can drop this API call
from the list and fix our code to make smarter use of it.

Please note that we are much more concerned about thread reduction than
about performance numbers. We had reports of the thread count becoming a
real problem, while performance so far is not yet a concern
(https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
  
  * bulk APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113116)
would be really welcome as well. It is quite independent from the
previous bullet point
and would help us greatly with scale.
 
 If we did the first bullet point, we'd be adding another ~10 APIs for
 async variants. If we then did the second bullet point we'd be adding
 another ~10 APIs for bulk querying. So while you're right that they
 are independent, it would be desirable to address them both at the
 same time, so we only need to add 10 new APIs in total, not 20.

I'm fine with this approach.


 For the async API design, I could see two potential designs
 
 1. A custom callback to run per API
 
   typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
                                              bool isError,
                                              virDomainBlockInfoPtr info,
                                              void *opaque);
 
   int virDomainGetBlockInfoAsync(virDomainPtr dom,
                                  const char *disk,
                                  virDomainBlockInfoCallback cb,
                                  void *opaque,
                                  unsigned int flags);
 
 
 2. A standard callback and a pair of APIs
 
   typedef void *virDomainAsyncResult;
   typedef void (*virDomainAsyncCallback)(virDomainPtr dom,
                                          virDomainAsyncResult res);
 
   void virDomainGetBlockInfoAsync(virDomainPtr dom,
                                   const char *disk,
                                   virDomainAsyncCallback cb,
                                   void *opaque,
                                   unsigned int flags);
   int virDomainGetBlockInfoFinish(virDomainPtr dom,
                                   virDomainAsyncResult res,
                                   virDomainBlockInfoPtr info);
 
 This second approach is the way GIO works (see example in this page
 https://developer.gnome.org/gio/stable/GAsyncResult.html ). The main
 difference between them really is probably the way you get error
 reporting from the APIs. In the first example, libvirt would raise
 an error before it invoked the callback, with isError set to True.
 In the second example, the Finish() func would raise the error and
 return -1.

I need to check in deeper detail and sync up with other VDSM developers,
but I have a feeling that the first approach is a bit easier for VDSM to consume.
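
To make the difference in error reporting concrete, here is a small mock of
the second (callback plus Finish) pattern. Everything in it is hypothetical:
the demo_ names are invented, the capacity value is made up, and the mock
invokes the callback synchronously, whereas a real implementation would
return immediately and call back later from an event loop.

```c
#include <stddef.h>

struct demo_block_info { unsigned long long capacity; };

struct demo_async_result {
    int failed;                     /* non-zero if the operation failed */
    struct demo_block_info info;    /* valid only when failed == 0 */
    void *opaque;                   /* caller data handed back to the callback */
};

typedef void (*demo_async_cb)(struct demo_async_result *res);

/* Start the query; this mock completes it at once and calls cb directly. */
void demo_get_block_info_async(const char *disk, demo_async_cb cb, void *opaque)
{
    struct demo_async_result res = { 0, { 0 }, opaque };

    if (disk == NULL)
        res.failed = 1;
    else
        res.info.capacity = 4096;   /* made-up value */
    cb(&res);
}

/* Collect the outcome: 0 with *info filled on success, -1 on error. */
int demo_get_block_info_finish(struct demo_async_result *res,
                               struct demo_block_info *info)
{
    if (res->failed)
        return -1;
    *info = res->info;
    return 0;
}

/* Example callback: stores the Finish() return code where opaque points. */
void demo_store_rc(struct demo_async_result *res)
{
    struct demo_block_info info;

    *(int *)res->opaque = demo_get_block_info_finish(res, &info);
}
```

In this pattern the callback itself carries no error flag; the error only
surfaces when Finish() is called. In the first pattern the callback would
receive isError directly and no second call is needed.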

Bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-02 Thread Francesco Romani
- Original Message -
 From: Michal Privoznik mpriv...@redhat.com
 To: Francesco Romani from...@redhat.com, libvir-list@redhat.com
 Sent: Tuesday, July 1, 2014 11:19:04 AM
 Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats

  Right now we aim for a number of VM per node in the (few) hundreds, but we
  have big plans
  to scale much more, and to possibly reach thousands in a not so distant
  future.
  At the moment, we use one thread per VM to gather the VM stats (CPU,
  network, disk),
  and of course this obviously scales poorly.
 
 I think this is your main problem. Why not have only one thread that
 would manage list of domains to query and issue the APIs periodically
 instead of having one thread per domain?

Indeed it is. I'm actually personally addressing this problem in VDSM.
It is mostly a legacy of past times, when this wasn't yet a big problem.
We are moving toward a thread pool of fixed size to handle the sampling.
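
The single-thread variant suggested above can be sketched as a list of
per-domain deadlines that one timer loop walks periodically. This is only an
illustration in C with invented names; it is not how VDSM is implemented.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for one sampled domain. */
struct sample_slot {
    const char *domain;
    unsigned long interval_s;   /* sampling period */
    unsigned long next_due_s;   /* next time this domain should be sampled */
};

typedef void (*sample_fn)(const char *domain, void *opaque);

/*
 * Sample every domain whose deadline has passed and reschedule it.
 * Meant to be called from one thread's timer loop.  Returns how many
 * domains were sampled on this pass.
 */
size_t sample_due(struct sample_slot *slots, size_t n, unsigned long now_s,
                  sample_fn fn, void *opaque)
{
    size_t i, done = 0;

    for (i = 0; i < n; i++) {
        if (slots[i].next_due_s > now_s)
            continue;                               /* not due yet */
        fn(slots[i].domain, opaque);
        slots[i].next_due_s = now_s + slots[i].interval_s;
        done++;
    }
    return done;
}

/* Trivial sampler used for illustration: counts invocations. */
void count_sample(const char *domain, void *opaque)
{
    (void)domain;
    ++*(size_t *)opaque;
}
```

A fixed-size pool would run the same loop with the due domains handed to a
few workers instead of being sampled inline, so the thread count no longer
grows with the number of VMs.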

  This is made only worse by the fact that VDSM is a python 2.7 application,
  and notoriously
  python 2.x behaves very badly with threads. We are already working to
  improve our code,
  but I'd like to bring the discussion here and see if and when the querying
  API can be improved.
 
  We currently use these APIs for our sampling:
 virDomainBlockInfo
 virDomainGetInfo
 virDomainGetCPUStats
 virDomainBlockStats
 virDomainBlockStatsFlags
 virDomainInterfaceStats
 virDomainGetVcpusFlags
 virDomainGetMetadata
 
  What we'd like to have is
 
  * asynchronous APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
 This would be just awesome. Either a single callback or a different one
 per call is fine
 (let's discuss this!).
 please note that we are much more concerned about thread reduction than
 about performance
 numbers. We had report of thread number becoming a real harm, while
 performance so far
 is not yet a concern
 (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
 
 I'm not a big fan of this approach. I mean, IIRC python has this Big
 Python Lock, which effectively prevents two threads from running concurrently.

It has the GIL, yes. Only one thread can run python code at any given time.
However, this is not true for extension modules written in C, which, if carefully
designed (read: coded to properly release the GIL), can run concurrently.
This is one of the reasons why threading in python is tolerated for I/O,
even though never recommended.

AFAIK/IIRC the code of the libvirt module for python allows this, so we should
be good to go.

 So while in C this would make perfect sense, it doesn't do so in python.
 The callbacks would be called from the event loop, which given how
 frequently you dump the info will block other threads. Therefore I'm
 afraid the approach would not bring any speed up, rather slow down.

I'm not sure about this; I think quite the opposite, that performance-wise
we can gain something, even though yes, all the callbacks will pile up in the
event loop. Surely this will greatly reduce the GIL battle

http://dabeaz.blogspot.it/2010/01/python-gil-visualized.html

- which is improved in python >= 3.2, but we are on 2.7 for the foreseeable future,
and will improve our thread proliferation, which is an immediate and real
concern of ours.

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-02 Thread Daniel P. Berrange
On Wed, Jul 02, 2014 at 11:56:23AM -0400, Francesco Romani wrote:
   This is made only worse by the fact that VDSM is a python 2.7 application,
   and notoriously
   python 2.x behaves very badly with threads. We are already working to
   improve our code,
   but I'd like to bring the discussion here and see if and when the querying
   API can be improved.
  
   We currently use these APIs for our sampling:
  virDomainBlockInfo
  virDomainGetInfo
  virDomainGetCPUStats
  virDomainBlockStats
  virDomainBlockStatsFlags
  virDomainInterfaceStats
  virDomainGetVcpusFlags
  virDomainGetMetadata
  
   What we'd like to have is
  
   * asynchronous APIs for querying domain stats
   (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
  This would be just awesome. Either a single callback or a different one
  per call is fine
  (let's discuss this!).
  please note that we are much more concerned about thread reduction than
  about performance
  numbers. We had report of thread number becoming a real harm, while
  performance so far
  is not yet a concern
  (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
  
  I'm not a big fan of this approach. I mean, IIRC python has this Big
  Python Lock, which effectively prevents two threads run concurrently.
 
 It has the GIL, yes. Only one thread can run python code at any given time.
 This however it is not true for extensions modules written in C which if 
 carefully
 designed (read: coded to properly release the GIL) can run concurrently.
 This is one of the reasons why threading in python is tolerated for I/O,
 even though never recommended.
 
 AFAIK/IIRC the code the libvirt module for python allows this, so we should
 be good to go.

For the sake of completeness I'll point out that there's another theoretical
option. The libvirt-gobject binding to libvirt provides async APIs to libvirt
APIs. It does this by using threads internally. Since these are C level threads
though, if VDSM were to use libvirt-gobject it could get async APIs and the
benefits of real threads, while remaining single threaded at the python layer.

That all said, I'm not sure whether libvirt-gobject has sufficient API
coverage for all the APIs VDSM needs. It primarily just has bindings for
the APIs used by GNOME Boxes & libvirt-sandbox so far. Also not sure if
it is a widely deployed enough dep for VDSM to mandate.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[libvirt] [RFC][scale] new API for querying domains stats

2014-07-01 Thread Francesco Romani
Hi everyone,

I'd like to discuss possible APIs and plans for new query APIs in libvirt.

I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for VDSM;
VDSM is the node management daemon, which is in charge, among many other things,
of gathering the host and per-Domain/VM statistics.

Right now we aim for a number of VMs per node in the (few) hundreds, but we have
big plans to scale much more, and to possibly reach thousands in the not so
distant future.
At the moment, we use one thread per VM to gather the VM stats (CPU, network, disk),
and of course this scales poorly.

This is made only worse by the fact that VDSM is a python 2.7 application, and
python 2.x notoriously behaves very badly with threads. We are already working to
improve our code, but I'd like to bring the discussion here and see if and when the
querying API can be improved.

We currently use these APIs for our sampling:
  virDomainBlockInfo
  virDomainGetInfo
  virDomainGetCPUStats
  virDomainBlockStats
  virDomainBlockStatsFlags
  virDomainInterfaceStats
  virDomainGetVcpusFlags
  virDomainGetMetadata

What we'd like to have is

* asynchronous APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
  This would be just awesome. Either a single callback or a different one per
  call is fine (let's discuss this!).
  Please note that we are much more concerned about thread reduction than about
  performance numbers. We had reports of the thread count becoming a real problem,
  while performance so far is not yet a concern
  (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)

* bulk APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113116)
  would be really welcome as well. It is quite independent from the previous
  bullet point and would help us greatly with scale.

So, I'd like to discuss if these additions are (or can be) in the project roadmap,
and, if so, what the API could look like and what the possible timeframe could be.
Of course I'd be happy to provide any further information about VDSM and its
workings.

Thoughts very welcome!

Thanks and best regards,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-01 Thread Daniel P. Berrange
On Tue, Jul 01, 2014 at 03:09:13AM -0400, Francesco Romani wrote:
 Hi everyone,
 
 I'd like to discuss possible APIs and plans for new query APIs in libvirt.
 
 I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for
 VDSM; VDSM is the node management daemon, which is in charge, among many
 other things, of gathering the host and per-Domain/VM statistics.
 
 Right now we aim for a number of VMs per node in the (few) hundreds, but we
 have big plans to scale much more, and to possibly reach thousands in the
 not so distant future.
 At the moment, we use one thread per VM to gather the VM stats (CPU, network,
 disk), and of course this scales poorly.
 
 This is made only worse by the fact that VDSM is a Python 2.7 application,
 and Python 2.x notoriously behaves very badly with threads. We are already
 working to improve our code, but I'd like to bring the discussion here and
 see if and when the querying API can be improved.
 
 We currently use these APIs for our sampling:
   virDomainBlockInfo
   virDomainGetInfo
   virDomainGetCPUStats
   virDomainBlockStats
   virDomainBlockStatsFlags
   virDomainInterfaceStats
   virDomainGetVcpusFlags
   virDomainGetMetadata

Why do you need to call virDomainGetMetadata so often? That merely contains an
opaque data blob that can only have come from VDSM itself, so I'm surprised you
need to call it frequently at all.

 What we'd like to have is
 
 * asynchronous APIs for querying domain stats
   (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
   This would be just awesome. Either a single callback or a different one per
   call is fine (let's discuss this!).
   Please note that we are much more concerned about thread reduction than
   about performance numbers. We have had reports of the thread count becoming
   a real problem, while performance so far is not yet a concern
   (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
 
 * bulk APIs for querying domain stats
   (https://bugzilla.redhat.com/show_bug.cgi?id=1113116)
   would be really welcome as well. It is quite independent from the previous
   bullet point and would help us greatly with scale.

If we did the first bullet point, we'd be adding another ~10 APIs for
async variants. If we then did the second bullet point we'd be adding
another ~10 APIs for bulk querying. So while you're right that they
are independent, it would be desirable to address them both at the
same time, so we only need to add 10 new APIs in total, not 20.

For the async API design, I could see two potential designs

1. A custom callback to run per API

    typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
                                               bool isError,
                                               virDomainBlockInfoPtr info,
                                               void *opaque);

    int virDomainGetBlockInfoAsync(virDomainPtr dom,
                                   const char *disk,
                                   virDomainBlockInfoCallback cb,
                                   void *opaque,
                                   unsigned int flags);


2. A standard callback and a pair of APIs

    typedef void *virDomainAsyncResult;
    typedef void (*virDomainAsyncCallback)(virDomainPtr dom,
                                           virDomainAsyncResult res,
                                           void *opaque);

    void virDomainGetBlockInfoAsync(virDomainPtr dom,
                                    const char *disk,
                                    virDomainAsyncCallback cb,
                                    void *opaque,
                                    unsigned int flags);
    int virDomainGetBlockInfoFinish(virDomainPtr dom,
                                    virDomainAsyncResult res,
                                    virDomainBlockInfoPtr info);

This second approach is the way GIO works (see the example on this page:
https://developer.gnome.org/gio/stable/GAsyncResult.html ). The main
difference between the two is probably the way errors are reported by
the APIs. In the first design, libvirt would raise an error before it
invoked the callback, with isError set to true. In the second design,
the Finish() function would raise the error and return -1.
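The difference in error reporting between the two designs can be sketched in Python, the language VDSM is written in. This is only an illustration under stated assumptions: `FakeError`, `fake_query`, `get_block_info_async_v1`, `get_block_info_async_v2` and `get_block_info_finish` are hypothetical names, not libvirt or libvirt-python APIs, and the callbacks are invoked immediately instead of from an event loop.

```python
class FakeError(Exception):
    """Hypothetical stand-in for a libvirt error."""


def fake_query(disk):
    """Hypothetical stand-in for the real work; fails for unknown disks."""
    if disk != "vda":
        raise FakeError("no such disk: " + disk)
    return {"capacity": 1024, "allocation": 512}


# Design 1: a per-API callback that receives an explicit error flag.
def get_block_info_async_v1(disk, cb, opaque):
    try:
        info = fake_query(disk)
    except FakeError:
        cb(True, None, opaque)
    else:
        cb(False, info, opaque)


# Design 2: a generic callback receives an opaque result; the error only
# surfaces when the caller unpacks it with the matching finish function.
def get_block_info_async_v2(disk, cb, opaque):
    try:
        res = ("ok", fake_query(disk))
    except FakeError as exc:
        res = ("err", exc)
    cb(res, opaque)


def get_block_info_finish(res):
    status, payload = res
    if status == "err":
        raise payload
    return payload


def on_done_v1(is_error, info, opaque):
    opaque.append(("v1", is_error, info))


def on_done_v2(res, opaque):
    try:
        opaque.append(("v2", False, get_block_info_finish(res)))
    except FakeError:
        opaque.append(("v2", True, None))


log = []
get_block_info_async_v1("vda", on_done_v1, log)
get_block_info_async_v1("vdz", on_done_v1, log)
get_block_info_async_v2("vda", on_done_v2, log)
get_block_info_async_v2("vdz", on_done_v2, log)
for entry in log:
    print(entry)
```

In the first style every callback has to check the flag; in the second the callback signature stays generic and the typed result and the error both come from the finish call.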

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-01 Thread Michal Privoznik

On 01.07.2014 09:09, Francesco Romani wrote:

Hi everyone,

I'd like to discuss possible APIs and plans for new query APIs in libvirt.

I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for
VDSM; VDSM is the node management daemon, which is in charge, among many
other things, of gathering the host and per-Domain/VM statistics.

Right now we aim for a number of VMs per node in the (few) hundreds, but we
have big plans to scale much more, and to possibly reach thousands in the
not so distant future.
At the moment, we use one thread per VM to gather the VM stats (CPU, network,
disk), and of course this scales poorly.


I think this is your main problem. Why not have only one thread that
would manage the list of domains to query and issue the API calls
periodically, instead of having one thread per domain?
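A minimal sketch of such a single-thread sampler, assuming a blocking per-domain query function. The names `poll_once`, `run_sampler` and `fake_stats` are hypothetical, and the injectable `sleep` and `clock` parameters exist only to keep the example testable.

```python
import time


def poll_once(domains, query):
    """Query every domain in turn from the calling thread."""
    return {name: query(name) for name in domains}


def run_sampler(domains, query, period_s, rounds,
                sleep=time.sleep, clock=time.monotonic):
    """Run `rounds` sampling passes, one per `period_s` seconds."""
    samples = []
    for _ in range(rounds):
        started = clock()
        samples.append(poll_once(domains, query))
        # Sleep only for whatever is left of the period after querying.
        remaining = period_s - (clock() - started)
        if remaining > 0:
            sleep(remaining)
    return samples


def fake_stats(name):
    """Hypothetical stand-in for a blocking libvirt stats call."""
    return {"domain": name, "cpu_time": 0}


# period_s=0.0 so the demonstration does not actually wait.
result = run_sampler(["vm1", "vm2"], fake_stats, period_s=0.0, rounds=2)
print(len(result), sorted(result[0]))
```

Because every query blocks the one thread, a pass takes the sum of all call latencies, so this layout only works while that sum stays below the sampling period.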




This is made only worse by the fact that VDSM is a Python 2.7 application,
and Python 2.x notoriously behaves very badly with threads. We are already
working to improve our code, but I'd like to bring the discussion here and
see if and when the querying API can be improved.

We currently use these APIs for our sampling:
   virDomainBlockInfo
   virDomainGetInfo
   virDomainGetCPUStats
   virDomainBlockStats
   virDomainBlockStatsFlags
   virDomainInterfaceStats
   virDomainGetVcpusFlags
   virDomainGetMetadata

What we'd like to have is

* asynchronous APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
  This would be just awesome. Either a single callback or a different one per
  call is fine (let's discuss this!).
  Please note that we are much more concerned about thread reduction than
  about performance numbers. We have had reports of the thread count becoming
  a real problem, while performance so far is not yet a concern
  (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)


I'm not a big fan of this approach. I mean, IIRC Python has the Global
Interpreter Lock (GIL), which effectively prevents two threads from running
concurrently. So while in C this would make perfect sense, it doesn't in
Python. The callbacks would be called from the event loop, which, given how
frequently you dump the info, will block other threads. Therefore I'm
afraid the approach would not bring any speedup, but rather a slowdown.




* bulk APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113116)
  would be really welcome as well. It is quite independent from the previous
  bullet point and would help us greatly with scale.


I think this one looks better. Especially if you consider my suggestion 
of having only one thread to serve all domains.




So, I'd like to discuss whether these additions are (or can be) on the project
roadmap, and, if so, what the API could look like and what the possible
timeframe could be.
Of course I'd be happy to provide any further information about VDSM and its
workings.

Thoughts very welcome!

Thanks and best regards,



Michal



Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-01 Thread Daniel P. Berrange
On Tue, Jul 01, 2014 at 11:19:04AM +0200, Michal Privoznik wrote:
 On 01.07.2014 09:09, Francesco Romani wrote:
 Hi everyone,
 
 I'd like to discuss possible APIs and plans for new query APIs in libvirt.
 
 I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for
 VDSM; VDSM is the node management daemon, which is in charge, among many
 other things, of gathering the host and per-Domain/VM statistics.
 
 Right now we aim for a number of VMs per node in the (few) hundreds, but we
 have big plans to scale much more, and to possibly reach thousands in the
 not so distant future.
 At the moment, we use one thread per VM to gather the VM stats (CPU,
 network, disk), and of course this scales poorly.
 
 I think this is your main problem. Why not have only one thread that would
 manage the list of domains to query and issue the API calls periodically,
 instead of having one thread per domain?

You suffer from round trip time on every API call if you serialize it all
in a single thread. E.g. if every API call takes 50ms and you want to check
once per second, you can monitor at most 20 VMs before you take more time
than you have available. This really sucks when the majority of that 50ms is
a sleep in poll() waiting for the RPC response.
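The ceiling on the number of VMs follows from dividing the sampling period by the call latency. A quick check in Python, where the 50 ms latency and the one-second period come from the text and the assumption that all eight listed APIs are called for every VM in every period is only illustrative:

```python
# Figures from the text above.
call_latency_ms = 50   # round trip per blocking API call
period_ms = 1000       # one sampling pass per second

# One call per VM per period.
max_vms = period_ms // call_latency_ms
print(max_vms)

# Assumption: all 8 sampling APIs listed in the thread are called for
# every VM in every period, each costing one full round trip.
apis_per_vm = 8
max_vms_all_apis = period_ms // (call_latency_ms * apis_per_vm)
print(max_vms_all_apis)
```

With one call per VM the single thread saturates at 20 VMs; with eight calls per VM it saturates at 2.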

 This is made only worse by the fact that VDSM is a Python 2.7 application,
 and Python 2.x notoriously behaves very badly with threads. We are already
 working to improve our code, but I'd like to bring the discussion here and
 see if and when the querying API can be improved.
 
 We currently use these APIs for our sampling:
   virDomainBlockInfo
   virDomainGetInfo
   virDomainGetCPUStats
   virDomainBlockStats
   virDomainBlockStatsFlags
   virDomainInterfaceStats
   virDomainGetVcpusFlags
   virDomainGetMetadata
 
 What we'd like to have is
 
 * asynchronous APIs for querying domain stats
   (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
   This would be just awesome. Either a single callback or a different one
   per call is fine (let's discuss this!).
   Please note that we are much more concerned about thread reduction than
   about performance numbers. We have had reports of the thread count becoming
   a real problem, while performance so far is not yet a concern
   (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
 
 I'm not a big fan of this approach. I mean, IIRC Python has the Global
 Interpreter Lock (GIL), which effectively prevents two threads from running
 concurrently. So while in C this would make perfect sense, it doesn't in
 Python. The callbacks would be called from the event loop, which, given how
 frequently you dump the info, will block other threads. Therefore I'm afraid
 the approach would not bring any speedup, but rather a slowdown.

I'm not sure I agree with your assessment here. If we consider a single
API call, the time this takes to complete is made up of a number of parts

 1. Time to write() the RPC call to the socket
 2. Time for libvirtd to process the RPC call
 3. Time to recv() the RPC reply from the socket

 1. Time to write() the RPC call to the socket
 2. Time for libvirtd to process the RPC call
 3. Time to recv() the RPC reply from the socket

 1. Time to write() the RPC call to the socket
 2. Time for libvirtd to process the RPC call
 3. Time to recv() the RPC reply from the socket
 ...and so on..

If the time for item 2 dominates over the time for items 1 and 3 (which
it really should), then the client thread is going to be sleeping in a
poll() for the bulk of the duration of the libvirt API call. If we had
an async API mechanism, then the VDSM time would essentially be consumed
with:

 1. Time to write() the RPC call to the socket
 2. Time to write() the RPC call to the socket
 3. Time to write() the RPC call to the socket
 4. Time to write() the RPC call to the socket
 5. Time to write() the RPC call to the socket
 6. Time to write() the RPC call to the socket
 7. wait for replies to start arriving
 8. Time to recv() the RPC reply from the socket
 9. Time to recv() the RPC reply from the socket
 10. Time to recv() the RPC reply from the socket
 11. Time to recv() the RPC reply from the socket
 12. Time to recv() the RPC reply from the socket
 13. Time to recv() the RPC reply from the socket
 14. Time to recv() the RPC reply from the socket

Of course there's a limit to how many outstanding async calls you can
make before the event loop gets 100% busy processing the responses,
but I don't think that makes async calls worthless. Even if we had the
bulk list API calls, async calling would be useful, because it would
let VDSM fire off requests for disk, net, cpu, mem stats in parallel
from a single thread.
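The two timelines above can be compared with a small timing model. This is a best-case sketch: it assumes libvirtd processes all outstanding requests concurrently, and the split of the 50 ms call into 1 ms write, 48 ms processing and 1 ms recv is made up for illustration. The function names are hypothetical.

```python
def serial_time_ms(n, write_ms, process_ms, recv_ms):
    """n blocking calls: each one pays write + processing + recv in turn."""
    return n * (write_ms + process_ms + recv_ms)


def pipelined_time_ms(n, write_ms, process_ms, recv_ms):
    """n async calls: all writes first, one overlapped wait, then all recvs.

    Assumes the daemon works on all n requests at the same time.
    """
    return n * write_ms + process_ms + n * recv_ms


n_calls = 6  # matches the six write() steps listed above
serial = serial_time_ms(n_calls, 1, 48, 1)
pipelined = pipelined_time_ms(n_calls, 1, 48, 1)
print(serial, pipelined)
```

Under these assumptions the six serialized calls cost 300 ms while the pipelined ones cost 60 ms, because the client sleeps through the processing time only once.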

Regards,
Daniel

Re: [libvirt] [RFC][scale] new API for querying domains stats

2014-07-01 Thread Michal Privoznik

On 01.07.2014 11:33, Daniel P. Berrange wrote:

On Tue, Jul 01, 2014 at 11:19:04AM +0200, Michal Privoznik wrote:

On 01.07.2014 09:09, Francesco Romani wrote:

Hi everyone,

I'd like to discuss possible APIs and plans for new query APIs in libvirt.

I'm one of the oVirt (http://www.ovirt.org) developers, and I write code for
VDSM; VDSM is the node management daemon, which is in charge, among many
other things, of gathering the host and per-Domain/VM statistics.

Right now we aim for a number of VMs per node in the (few) hundreds, but we
have big plans to scale much more, and to possibly reach thousands in the
not so distant future.
At the moment, we use one thread per VM to gather the VM stats (CPU, network,
disk), and of course this scales poorly.


I think this is your main problem. Why not have only one thread that would
manage the list of domains to query and issue the API calls periodically,
instead of having one thread per domain?


You suffer from round trip time on every API call if you serialize it all
in a single thread. E.g. if every API call takes 50ms and you want to check
once per second, you can monitor at most 20 VMs before you take more time
than you have available. This really sucks when the majority of that 50ms is
a sleep in poll() waiting for the RPC response.


Unless you have the bulk query API which will take the RTT only once ;)
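A rough count of round trips shows why. The sketch below assumes one bulk call per kind of statistic, four kinds of stats, and a bulk call that costs the same 50 ms round trip as a single-domain call (optimistic, since the reply is larger). The function names are hypothetical.

```python
def round_trips_per_pass(n_domains, n_stat_kinds, bulk):
    """Blocking round trips needed for one sampling pass."""
    if bulk:
        return n_stat_kinds            # one call covers every domain
    return n_domains * n_stat_kinds    # one call per domain and stat kind


def pass_time_ms(n_domains, n_stat_kinds, rtt_ms, bulk):
    """Time for one serialized sampling pass in a single thread."""
    return round_trips_per_pass(n_domains, n_stat_kinds, bulk) * rtt_ms


per_domain = pass_time_ms(100, 4, 50, bulk=False)
with_bulk = pass_time_ms(100, 4, 50, bulk=True)
print(per_domain, with_bulk)
```

For 100 domains this gives 20000 ms per pass with per-domain calls against 200 ms with bulk calls, so a single thread can keep up with a one-second period only in the bulk case.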




This is made only worse by the fact that VDSM is a Python 2.7 application,
and Python 2.x notoriously behaves very badly with threads. We are already
working to improve our code, but I'd like to bring the discussion here and
see if and when the querying API can be improved.

We currently use these APIs for our sampling:
   virDomainBlockInfo
   virDomainGetInfo
   virDomainGetCPUStats
   virDomainBlockStats
   virDomainBlockStatsFlags
   virDomainInterfaceStats
   virDomainGetVcpusFlags
   virDomainGetMetadata

What we'd like to have is

* asynchronous APIs for querying domain stats
  (https://bugzilla.redhat.com/show_bug.cgi?id=1113106)
  This would be just awesome. Either a single callback or a different one per
  call is fine (let's discuss this!).
  Please note that we are much more concerned about thread reduction than
  about performance numbers. We have had reports of the thread count becoming
  a real problem, while performance so far is not yet a concern
  (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)


I'm not a big fan of this approach. I mean, IIRC Python has the Global
Interpreter Lock (GIL), which effectively prevents two threads from running
concurrently. So while in C this would make perfect sense, it doesn't in
Python. The callbacks would be called from the event loop, which, given how
frequently you dump the info, will block other threads. Therefore I'm afraid
the approach would not bring any speedup, but rather a slowdown.


I'm not sure I agree with your assessment here. If we consider a single
API call, the time this takes to complete is made up of a number of parts

  1. Time to write() the RPC call to the socket
  2. Time for libvirtd to process the RPC call
  3. Time to recv() the RPC reply from the socket

  1. Time to write() the RPC call to the socket
  2. Time for libvirtd to process the RPC call
  3. Time to recv() the RPC reply from the socket

  1. Time to write() the RPC call to the socket
  2. Time for libvirtd to process the RPC call
  3. Time to recv() the RPC reply from the socket
  ...and so on..

If the time for item 2 dominates over the time for items 1 and 3 (which
it really should), then the client thread is going to be sleeping in a
poll() for the bulk of the duration of the libvirt API call. If we had
an async API mechanism, then the VDSM time would essentially be consumed
with:

  1. Time to write() the RPC call to the socket
  2. Time to write() the RPC call to the socket
  3. Time to write() the RPC call to the socket
  4. Time to write() the RPC call to the socket
  5. Time to write() the RPC call to the socket
  6. Time to write() the RPC call to the socket
  7. wait for replies to start arriving
  8. Time to recv() the RPC reply from the socket
  9. Time to recv() the RPC reply from the socket
  10. Time to recv() the RPC reply from the socket
  11. Time to recv() the RPC reply from the socket
  12. Time to recv() the RPC reply from the socket
  13. Time to recv() the RPC reply from the socket
  14. Time to recv() the RPC reply from the socket



Well, in the async form you also need to account for the time spent in the
callbacks:


1. write(serial=1, ...)
2. write(serial=2, ...)
...
7. wait for replies
8. recv(serial=x1, ...)   // there's no guarantee on order of replies
9. callback(serial=x1, ...)
10. recv(serial=x2, ...)
11. callback(serial=x2, ...)

And it's the callback times I'm worried about. I'm not saying we should
not add the callback APIs. What I'm really saying is that I doubt they
will help Python apps. They will definitely help with scaling C
applications, though.
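The effect of callback time can be added to a simple timing model. This sketch assumes the daemon overlaps all requests (best case for async), that callbacks run one at a time in the client, and a made-up split of 1 ms write, 48 ms processing and 1 ms recv per call. The function names and the callback costs are hypothetical.

```python
def serial_pass_ms(n, write_ms, process_ms, recv_ms, callback_ms):
    """n blocking calls, each followed by handling its result."""
    return n * (write_ms + process_ms + recv_ms + callback_ms)


def async_pass_ms(n, write_ms, process_ms, recv_ms, callback_ms):
    """n async calls: overlapped processing, but serialized callbacks."""
    return n * write_ms + process_ms + n * (recv_ms + callback_ms)


rows = []
for callback_ms in (0, 5, 50):
    rows.append((callback_ms,
                 serial_pass_ms(100, 1, 48, 1, callback_ms),
                 async_pass_ms(100, 1, 48, 1, callback_ms)))
for row in rows:
    print(row)
```

With free callbacks the async pass is about twenty times faster (248 ms against 5000 ms); with 50 ms callbacks it is less than twice as fast (5248 ms against 10000 ms), which is the regime a slow Python callback could fall into.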



Of course there's a limit to how many outstanding async