Re: [vdsm] RFC: New Storage API
- Original Message - > From: "Shu Ming" > To: "Deepak C Shetty" > Cc: "Saggi Mizrahi" , "engine-devel" > , "VDSM Project Development" > > Sent: Friday, December 7, 2012 1:37:20 AM > Subject: Re: [vdsm] RFC: New Storage API > > 于 2012-12-7 13:23, Deepak C Shetty: > > On 12/06/2012 10:22 PM, Saggi Mizrahi wrote: > >> > >> - Original Message - > >>> From: "Shu Ming" > >>> To: "Saggi Mizrahi" > >>> Cc: "VDSM Project Development" > >>> , > >>> "engine-devel" > >>> Sent: Thursday, December 6, 2012 11:02:02 AM > >>> Subject: Re: [vdsm] RFC: New Storage API > >>> > >>> Saggi, > >>> > >>> Thanks for sharing your thought and I get some comments below. > >>> > >>> > >>> Saggi Mizrahi: > I've been throwing a lot of bits out about the new storage API > and > I think it's time to talk a bit. > I will purposefully try and keep implementation details away and > concentrate about how the API looks and how you use it. > > First major change is in terminology, there is no long a storage > domain but a storage repository. > This change is done because so many things are already called > domain in the system and this will make things less confusing > for > new-commers with a libvirt background. > > One other changes is that repositories no longer have a UUID. > The UUID was only used in the pool members manifest and is no > longer needed. > > > connectStorageRepository(repoId, repoFormat, > connectionParameters={}): > repoId - is a transient name that will be used to refer to the > connected domain, it is not persisted and doesn't have to be the > same across the cluster. > repoFormat - Similar to what used to be type (eg. localfs-1.0, > nfs-3.4, clvm-1.2). > connectionParameters - This is format specific and will used to > tell VDSM how to connect to the repo. > >>> > >>> Where does repoID come from? I think repoID doesn't exist before > >>> connectStorageRepository() return. Isn't repoID a return value > >>> of > >>> connectStorageRepository()? > >> No, repoIDs are no longer part of the domain, they are just a > >> transient handle. > >> The user can put whatever it wants there as long as it isn't > >> already > >> taken by another currently connected domain. > > > > So what happens when user mistakenly gives a repoID that is in use > > before.. there should be something in the return value that > > specifies > > the error and/or reason for error so that user can try with a > > new/diff > > repoID ? > > I think let the user to give the repoID is meaningless and > error-prune. The repo ID is meaningless, it's just a handle to the instance. It's never persisted to disk and doesn't have to be unique across the cluster. > Developer must maintain a a unique ID list for every storage > repository > connected. Why? You could just use repoId = ___ "ovirt_example.com_hosting_3" as an example If you agree with all other users of the same VDSM instance that you are going to use this scheme you can cooperate. You can use whatever scheme you want to make sure you don't hit anyone else's. The point is, VDSM doesn't care how you provision repoIDs and how unique they are across the cluster. It's the user's choice how and if to persist this information. > > > > disconnectStorageRepository(self, repoId) > > > In the new API there are only images, some images are mutable > and > some are not. > mutable images are also called VirtualDisks > immutable images are also called Snapshots > > There are no explicit templates, you can create as many images > as > you want from any snapshot. 
> There are 4 major image operations:
>
> createVirtualDisk(targetRepoId, size, baseSnapshotId=None, userData={}, options={}):
> targetRepoId - ID of a connected repo where the disk will be created
> size - the size of the image you wish to create
> baseSnapshotId - the ID of the snapshot you want to base the new virtual disk on
> userData - optional data that will be attached to the new VD; could be anything that the user desires.
> options - options to modify VDSM's default behavior
> >
> > IIUC, I can use options to do storage offloads? For example, I can create a LUN that represents this VD on my storage array based on the 'options' parameter? Is this the intended way to use 'options'?
>
> returns the ID of the new VD
> >>> I think we will also need a function to check if a VirtualDisk is based on a specific snapshot.
> >>> Like: isSnapshotOf(virtualDiskId, baseSnapshotID):
> >> No, the design is that volume dependencies are an implementation detail.
> >> There is no reason for you to know that an image is physically a snapshot of another.
> >> Logical snapshots, template information, and any other information can be set by the user by using the userData field available for every image.
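To make the calling convention above concrete, here is a minimal sketch of a caller using the proposed names. It is an illustration only: the FakeVdsm class is a hypothetical stand-in for a real client binding, the connection parameters are made up, and the repoId naming scheme simply mirrors the "ovirt_example.com_hosting_3" example and the connect-fails-if-taken behaviour described in this thread.

import uuid

class FakeVdsm:
    """Hypothetical stand-in for a VDSM client binding; only the call
    shapes come from the RFC, the transport does not exist here."""
    def __init__(self):
        self._repos = {}
        self._disks = {}

    def connectStorageRepository(self, repoId, repoFormat, connectionParameters={}):
        # repoId is a transient, caller-chosen handle; connecting fails
        # if that handle is already in use on this VDSM instance.
        if repoId in self._repos:
            raise RuntimeError("repoId %r already in use" % repoId)
        self._repos[repoId] = (repoFormat, connectionParameters)

    def createVirtualDisk(self, targetRepoId, size, baseSnapshotId=None,
                          userData={}, options={}):
        vd_id = str(uuid.uuid4())
        self._disks[vd_id] = dict(repo=targetRepoId, size=size,
                                  base=baseSnapshotId, userData=userData)
        return vd_id  # the RFC says the new VD's ID is returned

vdsm = FakeVdsm()
# Caller-chosen naming scheme, e.g. "<manager>_<host>_<index>":
repo_id = "ovirt_example.com_hosting_3"
vdsm.connectStorageRepository(repo_id, "nfs-3.4",
                              {"server": "nas.example.com", "export": "/vol/repo"})
disk_id = vdsm.createVirtualDisk(repo_id, size=20 * 1024**3,
                                 userData={"purpose": "demo"})
print(disk_id)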
Re: [vdsm] RFC: New Storage API
- Original Message - > From: "Deepak C Shetty" > To: "Saggi Mizrahi" > Cc: "Shu Ming" , "engine-devel" > , "VDSM Project Development" > , "Deepak C Shetty" > > Sent: Friday, December 7, 2012 12:23:15 AM > Subject: Re: [vdsm] RFC: New Storage API > > On 12/06/2012 10:22 PM, Saggi Mizrahi wrote: > > > > - Original Message - > >> From: "Shu Ming" > >> To: "Saggi Mizrahi" > >> Cc: "VDSM Project Development" > >> , "engine-devel" > >> > >> Sent: Thursday, December 6, 2012 11:02:02 AM > >> Subject: Re: [vdsm] RFC: New Storage API > >> > >> Saggi, > >> > >> Thanks for sharing your thought and I get some comments below. > >> > >> > >> Saggi Mizrahi: > >>> I've been throwing a lot of bits out about the new storage API > >>> and > >>> I think it's time to talk a bit. > >>> I will purposefully try and keep implementation details away and > >>> concentrate about how the API looks and how you use it. > >>> > >>> First major change is in terminology, there is no long a storage > >>> domain but a storage repository. > >>> This change is done because so many things are already called > >>> domain in the system and this will make things less confusing for > >>> new-commers with a libvirt background. > >>> > >>> One other changes is that repositories no longer have a UUID. > >>> The UUID was only used in the pool members manifest and is no > >>> longer needed. > >>> > >>> > >>> connectStorageRepository(repoId, repoFormat, > >>> connectionParameters={}): > >>> repoId - is a transient name that will be used to refer to the > >>> connected domain, it is not persisted and doesn't have to be the > >>> same across the cluster. > >>> repoFormat - Similar to what used to be type (eg. localfs-1.0, > >>> nfs-3.4, clvm-1.2). > >>> connectionParameters - This is format specific and will used to > >>> tell VDSM how to connect to the repo. > >> > >> Where does repoID come from? I think repoID doesn't exist before > >> connectStorageRepository() return. Isn't repoID a return value of > >> connectStorageRepository()? > > No, repoIDs are no longer part of the domain, they are just a > > transient handle. > > The user can put whatever it wants there as long as it isn't > > already taken by another currently connected domain. > > So what happens when user mistakenly gives a repoID that is in use > before.. there should be something in the return value that specifies > the error and/or reason for error so that user can try with a > new/diff > repoID ? Asi I said, connect fails if the repoId is in use ATM. > > >>> disconnectStorageRepository(self, repoId) > >>> > >>> > >>> In the new API there are only images, some images are mutable and > >>> some are not. > >>> mutable images are also called VirtualDisks > >>> immutable images are also called Snapshots > >>> > >>> There are no explicit templates, you can create as many images as > >>> you want from any snapshot. > >>> > >>> There are 4 major image operations: > >>> > >>> > >>> createVirtualDisk(targetRepoId, size, baseSnapshotId=None, > >>> userData={}, options={}): > >>> > >>> targetRepoId - ID of a connected repo where the disk will be > >>> created > >>> size - The size of the image you wish to create > >>> baseSnapshotId - the ID of the snapshot you want the base the new > >>> virtual disk on > >>> userData - optional data that will be attached to the new VD, > >>> could > >>> be anything that the user desires. > >>> options - options to modify VDSMs default behavior > > IIUC, i can use options to do storage offloads ? For eg. 
> I can create a LUN that represents this VD on my storage array based on the 'options' parameter? Is this the intended way to use 'options'?

No, this has nothing to do with offloads.
If by "offloads" you mean having other VDSM hosts do the heavy lifting, then that is what the autoFix=False option and the fix mechanism are for.
If you are talking about advanced SCSI features (e.g. WRITE SAME), they will be used automatically whenever possible.
In any case, how we manage LUNs (if they are even used) is an implementation detail.

> >>>
> >>> returns the ID of the new VD
> >> I think we will also need a function to check if a VirtualDisk is based on a specific snapshot.
> >> Like: isSnapshotOf(virtualDiskId, baseSnapshotID):
> > No, the design is that volume dependencies are an implementation detail.
> > There is no reason for you to know that an image is physically a snapshot of another.
> > Logical snapshots, template information, and any other information can be set by the user by using the userData field available for every image.
> >>> createSnapshot(targetRepoId, baseVirtualDiskId, userData={}, options={}):
> >>> targetRepoId - the ID of a connected repo where the new snapshot will be created and where the original image exists as well.
> >>> size - the size of the image you wish to create
> >>> baseVirtualDisk - the ID of a mutable image (Virtual Disk)
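A minimal sketch of the userData idea discussed above: since the API deliberately exposes no isSnapshotOf(), the caller records logical lineage itself in the data it owns. The key names used here ("role", "logicalBase") are invented for the example and are not part of any VDSM schema.

def make_user_data(role, logical_base=None, extra=None):
    """Build a userData dict recording the *logical* lineage of an image."""
    data = {"role": role, "logicalBase": logical_base}
    if extra:
        data.update(extra)
    return data

# Creating a snapshot of disk "vd-1" and remembering, on the caller's side,
# that it is logically based on that disk:
snap_user_data = make_user_data(role="snapshot", logical_base="vd-1")

# A later "is it based on X?" check is then a lookup in data the caller owns,
# not a VDSM call:
def is_logically_based_on(user_data, image_id):
    return user_data.get("logicalBase") == image_id

assert is_logically_based_on(snap_user_data, "vd-1")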
Re: [vdsm] moving the collection of statistics to external process
On 12/07/2012 12:39 PM, Mark Wu wrote:
On 12/06/2012 11:29 PM, Adam Litke wrote:
On Thu, Dec 06, 2012 at 11:19:34PM +0800, Shu Ming wrote:
On 2012-12-6 4:51, Itamar Heim wrote:
On 12/05/2012 10:33 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote:
On 12/05/2012 10:16 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
On 12/05/2012 08:57 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
On 12/05/2012 04:42 PM, Adam Litke wrote:

I wanted to know what you think about it and if you have a better solution to avoid initiating so many threads. And is splitting vdsm a good idea here? At first look, my opinion is that it can help, and it would be nice to have a vmStatisticService that runs and writes the VMs' status to a separate log.

Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use.

Isn't this what collectd (and its libvirt plugin) or pcp are already doing?

Lots of things collect statistics, but as of right now we're using MOM and we're not yet using collectd on the host, right?

I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect.

How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt?

My understanding is that collectd is installed on the host and uses collectd's libvirt plugin to collect guest statistics?

Yes, but some statistics can only be collected by making a call to the oVirt guest agent (e.g. guest memory statistics). The logical next step would be to write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to the guest agents and probably does not want to multiplex those connections for many reasons (security being the main one).

And some will come from qemu-ga, which libvirt will support? Maybe a collectd vdsm plugin for the guest agent stats?

I am thinking of having collectd as a stand-alone service to collect the statistics from both ovirt-guest-agent and qemu-ga. Then collectd can export the information to the host proc file system in a layered architecture. Then MOM or other vdsm services can get the information from the proc file system like other OS statistics exported on the host.

You wouldn't use the host /proc filesystem for this purpose. /proc is an interface between userspace and the kernel. It is not for direct application use. The problem I see with hooking collectd up to ovirt-ga is that vdsm still needs a connection to ovirt-ga for things like shutdown and desktopLogin. Today vdsm owns the connection to the guest agent, and there is not a nice way to multiplex that connection for use by multiple clients simultaneously.

Actually, I don't like collecting statistics from the guest agent. Now libvirt can provide the statistics for vcpu, block and network interfaces. So I think we should reconsider enabling the guest memory report in the virtio balloon driver. I am not sure if async events are supported in QMP now.
What do you think of it?

In vdsm and mom we don't just collect statistics; we also need to perform appropriate actions on them. So we probably still need an output plugin for collectd to make the data available to vdsm and mom, and to generate an event to vdsm or mom when the data reaches a given threshold. Just an idea; I am not sure how easy it is to implement.

Should be easy for such stats; the question is what other items are reported by the current guest agent (say, the list of installed applications).
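The "output plugin for collectd" idea above can be sketched with collectd's Python plugin interface. This is only an illustration: the Unix socket path, the assumption that vdsm or MOM listen on it, and the threshold policy are all made up for the example.

# Runs inside collectd's Python plugin; not vdsm or MOM code.
import json
import socket

import collectd  # provided by collectd's python plugin, not installable via pip

THRESHOLDS = {"memory": 0.90}            # placeholder policy, not a real limit
SOCK_PATH = "/var/run/vdsm/stats.sock"   # hypothetical consumer socket

def _send(event):
    s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        s.sendto(json.dumps(event).encode("utf-8"), SOCK_PATH)
    except socket.error:
        collectd.warning("stats consumer not reachable")
    finally:
        s.close()

def write_cb(vl, data=None):
    # Forward every sample so vdsm/MOM can read it...
    sample = {"host": vl.host, "plugin": vl.plugin,
              "type": vl.type, "values": list(vl.values)}
    _send({"kind": "sample", "data": sample})
    # ...and raise an event when a watched metric crosses its threshold.
    limit = THRESHOLDS.get(vl.type)
    if limit is not None and vl.values and vl.values[0] > limit:
        _send({"kind": "threshold", "data": sample, "limit": limit})

collectd.register_write(write_cb)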
Re: [vdsm] moving the collection of statistics to external process
On 12/06/2012 11:29 PM, Adam Litke wrote:
On Thu, Dec 06, 2012 at 11:19:34PM +0800, Shu Ming wrote:
On 2012-12-6 4:51, Itamar Heim wrote:
On 12/05/2012 10:33 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote:
On 12/05/2012 10:16 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
On 12/05/2012 08:57 PM, Adam Litke wrote:
On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
On 12/05/2012 04:42 PM, Adam Litke wrote:

I wanted to know what you think about it and if you have a better solution to avoid initiating so many threads. And is splitting vdsm a good idea here? At first look, my opinion is that it can help, and it would be nice to have a vmStatisticService that runs and writes the VMs' status to a separate log.

Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use.

Isn't this what collectd (and its libvirt plugin) or pcp are already doing?

Lots of things collect statistics, but as of right now we're using MOM and we're not yet using collectd on the host, right?

I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect.

How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt?

My understanding is that collectd is installed on the host and uses collectd's libvirt plugin to collect guest statistics?

Yes, but some statistics can only be collected by making a call to the oVirt guest agent (e.g. guest memory statistics). The logical next step would be to write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to the guest agents and probably does not want to multiplex those connections for many reasons (security being the main one).

And some will come from qemu-ga, which libvirt will support? Maybe a collectd vdsm plugin for the guest agent stats?

I am thinking of having collectd as a stand-alone service to collect the statistics from both ovirt-guest-agent and qemu-ga. Then collectd can export the information to the host proc file system in a layered architecture. Then MOM or other vdsm services can get the information from the proc file system like other OS statistics exported on the host.

You wouldn't use the host /proc filesystem for this purpose. /proc is an interface between userspace and the kernel. It is not for direct application use. The problem I see with hooking collectd up to ovirt-ga is that vdsm still needs a connection to ovirt-ga for things like shutdown and desktopLogin. Today vdsm owns the connection to the guest agent, and there is not a nice way to multiplex that connection for use by multiple clients simultaneously.

Actually, I don't like collecting statistics from the guest agent. Now libvirt can provide the statistics for vcpu, block and network interfaces. So I think we should reconsider enabling the guest memory report in the virtio balloon driver. I am not sure if async events are supported in QMP now. What do you think of it?
In vdsm and mom we don't just collect statistics; we also need to perform appropriate actions on them. So we probably still need an output plugin for collectd to make the data available to vdsm and mom, and to generate an event to vdsm or mom when the data reaches a given threshold. Just an idea; I am not sure how easy it is to implement.
Re: [vdsm] moving the collection of statistics to external process
On 12/05/2012 10:23 PM, ybronhei wrote:

As part of an issue where, if you press start for 200 VMs at the same time, it takes hours because of an undefined issue, we thought about moving the collection of statistics outside vdsm. It can help because stat collection runs in internal vdsm threads that can take up quite a bit of time.

I'm not sure if it would help with the issue of starting many VMs simultaneously, but it might improve vdsm response.

Currently we start a thread for each VM and then collect stats on them at constant intervals, and it must affect vdsm if we have 200 threads like this that can take some time. For example, if we have connection errors to storage and we can't receive its response, all 200 threads can get stuck and lock other threads (GIL issue).

As far as I know, the design of oop is to try to resolve the problem you state. However, I don't understand how the GIL can cause this problem. Python should release the GIL before executing any instruction involving I/O. I did some tests before and found that other threads can continue to run while one thread gets stuck on I/O.

I wanted to know what you think about it and if you have a better solution to avoid initiating so many threads. And is splitting vdsm a good idea here? At first look, my opinion is that it can help, and it would be nice to have a vmStatisticService that runs and writes the VMs' status to a separate log.

The problem with this solution is that if those interval functions need to communicate with internal parts of vdsm to set values or start internal processes when something has changed, it depends on the stat function... and I'm not sure that the stat function should control internal flows. Today, to recognize connectivity errors we count on this method, but we can add polling mechanisms for those issues (which can raise the same problems we are trying to deal with...).

I would like to hear your ideas and comments. Thanks.
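One way to avoid one sampling thread per VM is sketched below: a single scheduler round over a bounded worker pool, with a per-round timeout so a stuck storage or agent call marks that VM's stats as stale instead of pinning a thread. This is illustrative only and not vdsm code; the interval, pool size and function names are made up.

import concurrent.futures
import time

ROUND_TIMEOUT = 5   # give up on a sampling round after this many seconds
POOL_SIZE = 8       # bounded, instead of 200 threads for 200 VMs

def sample_vm(vm_id):
    """Placeholder for the real per-VM stats query (libvirt, guest agent, ...)."""
    return {"vm": vm_id, "ts": time.time()}

def sampling_round(vm_ids, pool):
    futures = {pool.submit(sample_vm, vm): vm for vm in vm_ids}
    done, not_done = concurrent.futures.wait(futures, timeout=ROUND_TIMEOUT)
    results = {futures[f]: f.result() for f in done if f.exception() is None}
    for f in not_done:
        results[futures[f]] = None  # stale: storage/agent did not answer in time
    return results

if __name__ == "__main__":
    vms = ["vm-%d" % i for i in range(200)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        print(len(sampling_round(vms, pool)))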
Re: [vdsm] Fedora, udev and nic renaming
On Thu, Dec 06, 2012 at 01:12:52PM +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2012 at 03:51:39PM +0200, Dan Kenigsberg wrote:
> > On Tue, Dec 04, 2012 at 05:25:48AM -0500, Alon Bar-Lev wrote:
> > >
> > > Thanks for this verbose description.
> > >
> > > I don't think using libguestfs is the solution for this.
> >
> > Yeah, it seems like a hack that would be quite hard to maintain for all supported guest operating systems.
> >
> > > Fixing qemu to accept a BIOS interface name at the -net parameter is preferable. I don't think we should expose the interface as a PCI device, as it will have some drawbacks, but attempt to use the onboard convention.
> >
> > I don't see a real use case for setting the BIOS name explicitly. After all, libvirt/vdsm/Engine is going to allocate them according to their relative order. I'd be content with qemu providing a sane, reproducible biosdevname for each nic.
> >
> > Michael, would it be difficult to have?
>
> This is not a qemu issue. This is a biosdevname/VMware issue.
> biosdevname has this code:
>
> /*
>   Algorithm suggested by:
>   http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
> */
> static int
> running_in_virtual_machine (void)
> {
>     u_int32_t eax=1U, ecx=0U;
>
>     ecx = cpuid (eax, ecx);
>     if (ecx & 0x80000000U)
>         return 1;
>     return 0;
> }
>
> So it just looks for a hypervisor.
>
> It should look at the hypervisor leaf and either blacklist vmware specifically or whitelist kvm.
>
> Please open a (preferably urgent prio) bugzilla for the biosdevname component so we can fix it in F18, and cc me.
> I can write you a patch but the maintainer needs to apply it.

Thanks for the analysis, Michael. Fedora bug opened:
Bug 884990 - non deterministic bios dev naming in KVM guests