Re: [Engine-devel] [vdsm] is gerrit.ovirt.org down?

2012-09-12 Thread Adam Litke
So the fix is to just regularly restart gerrit?  Do we have any idea about the
real, underlying problem?

On Wed, Sep 12, 2012 at 11:56:44AM -0400, Eyal Edri wrote:
> 
> 
> - Original Message -
> > From: "Itamar Heim" 
> > To: "Asaf Shakarchi" 
> > Cc: "Alon Bar-Lev" , "Shireesh Anjal" 
> > , engine-devel@ovirt.org, "VDSM Project
> > Development" , "Shu Ming" 
> > , "Eyal Edri"
> > 
> > Sent: Wednesday, September 12, 2012 6:34:56 PM
> > Subject: Re: [Engine-devel] is gerrit.ovirt.org down? 
> > 
> > On 09/12/2012 06:23 PM, Asaf Shakarchi wrote:
> > > It happens from time to time; a restart is required, and only Itamar can do it.
> > 
> > restarted.
> > eyal - can we make progress on the jenkins job with permission for more
> > people to restart gerrit?
> 
> the job is ready 
> http://jenkins.ovirt.org/view/system-monitoring/job/restart_gerrit_service
> but i need to have jenkins user access to gerrit server + sudo access to run 
> 'service' restart... 
> 
> it has access to www.ovirt.org but not to gerrit.ovirt.org. 
> 
> > others - please email infra on gerrit issues (well, emailing me personally
> > always helps as well)
> > 
> > >
> > > - Original Message -
> > >>
> > >> Yes, I am experiencing this too...
> > >>
> > >> Itamar?
> > >>
> > >> - Original Message -
> > >>> From: "Shu Ming" 
> > >>> To: "Alon Bar-Lev" 
> > >>> Cc: "Shireesh Anjal" , engine-devel@ovirt.org,
> > >>> "VDSM Project Development"
> > >>> 
> > >>> Sent: Wednesday, September 12, 2012 5:50:14 PM
> > >>> Subject: Re: [Engine-devel] is gerrit.ovirt.org down? 
> > >>>
> > >>> It seems gerrit has gone down several times recently. Is there
> > >>> any particular reason?
> > >>> On 2012-9-12 22:45, Alon Bar-Lev wrote:
> > >>>> yes.
> > >>>>
> > >>>> - Original Message -
> > >>>>> From: "Shireesh Anjal" 
> > >>>>> To: engine-devel@ovirt.org
> > >>>>> Sent: Wednesday, September 12, 2012 5:43:35 PM
> > >>>>> Subject: [Engine-devel] is gerrit.ovirt.org down? 
> > >>>>>
> > >>>>>
> > >>>>> ___
> > >>>>> Engine-devel mailing list
> > >>>>> Engine-devel@ovirt.org
> > >>>>> http://lists.ovirt.org/mailman/listinfo/engine-devel
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> ---
> > >>> 舒明 Shu Ming
> > >>> Open Virtualization Engineering; CSTL, IBM Corp.
> > >>> Tel: 86-10-82451626  Tieline: 9051626 E-mail: shum...@cn.ibm.com
> > >>> or
> > >>> shum...@linux.vnet.ibm.com
> > >>> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> > >>> District, Beijing 100193, PRC
> > >>>
> > >>>
> > >>>
> > >>
> > 
> > 
> >
> ___
> vdsm-devel mailing list
> vdsm-de...@lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Mom Balloon policy issue

2012-10-09 Thread Adam Litke
Thanks for writing this.  Some thoughts inline, below.  Also, cc'ing some lists
in case other folks want to participate in the discussion.

On Tue, Oct 09, 2012 at 01:12:30PM -0400, Noam Slomianko wrote:
> Greetings,
> 
> I've fiddled around with ballooning and wanted to raise a question for debate.
> 
> Currently, as long as the host is under memory pressure, MOM will try to
> reclaim memory from all guests with more free memory than a given
> threshold.
> 
> Main issue: Guest allocated memory is not the same as the resident (physical)
> memory used by qemu.
> This means that when memory is reclaimed (the balloon is inflated) we
> might not get back as much memory as planned (or none at all).
> 
>  * Example 1: no memory is reclaimed
>    name | allocated memory | used by the vm | resident memory used in the host by qemu
>    Vm1  | 4G               | 4G             | 4G
>    Vm2  | 4G               | 1G             | 1G
>  - MOM will inflate the balloon in vm2 (as vm1 has no free memory) and will
>    gain no memory

One thing to keep in mind is that VMs having less host RSS than their memory
allocation is a temporary condition.  All VMs will eventually consume their full
allocation if allowed to run.  I'd be curious to know how long this process
takes in general.

We might be able to handle this case by refusing to inflate the balloon if:
(VM free memory - planned balloon inflation) > host RSS
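
A minimal Python sketch of that check, applied to the numbers from the examples. The function name, the KiB units and the 0.5G planned inflation are assumptions made for illustration; this is not MOM's policy code.

```python
GIB = 1024 * 1024  # KiB per GiB (units are an assumption for this sketch)


def should_skip_inflation(vm_free_kib, planned_inflation_kib, host_rss_kib):
    """Return True when the balloon inflation should be refused.

    Implements the rule stated above: refuse to inflate if
    (VM free memory - planned balloon inflation) > host RSS.
    """
    return (vm_free_kib - planned_inflation_kib) > host_rss_kib


# Vm2: 4G allocated, 1G used (3G free in the guest), 1G resident on the host.
print(should_skip_inflation(3 * GIB, GIB // 2, 1 * GIB))  # True: skip Vm2
# Vm3: 4G allocated, 1G used (3G free in the guest), 4G resident on the host.
print(should_skip_inflation(3 * GIB, GIB // 2, 4 * GIB))  # False: inflate Vm3
```

With these inputs the check leaves Vm2 alone and only inflates the balloon in Vm3, which is the guest that can actually give memory back to the host.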


>  * Example 2: memory is reclaimed partially
>    name | allocated memory | used by the vm | resident memory used in the host by qemu
>    Vm1  | 4G               | 4G             | 4G
>    Vm2  | 4G               | 1G             | 1G
>    Vm3  | 4G               | 1G             | 4G
>  - MOM will inflate the balloon in vm2 and vm3, slowly gaining memory only from vm3

The above rule extension may help here too.

> This behaviour might cause us to:
>  * spend time reclaiming memory from many guests when we can reclaim only
>    from a subgroup
>  * be under the impression that we have more potential memory to reclaim than
>    we do
>  * bring inactive VMs dangerously low as they are constantly reclaimed (I've
>    had guests crash from kernel out-of-memory)
> 
> 
> To address this I suggest that we collect guest memory stats from libvirt as 
> well, so we have the option to use them in our calculations.
> This can be achieved with the command "virsh dommemstat " which 
> returns
> actual 3915372 (allocated)
> rss 2141580 (resident memory used by qemu)

I would suggest adding these two fields to the VmStats that are collected by
vdsm.  Then, to try it out, add the fields to the GuestMemory Collector.  (Note:
MOM does have a collector that gathers RSS for VMs.  It's called GuestQemuProc).
You can then extend the Balloon policy to add a snippet to check if the proposed
balloon adjustment should be carried out.  You could add the logic to the
change_big_enough function.
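
As a starting point for experimenting, the two fields quoted above can be pulled out of dommemstat-style text with a few lines of Python. This is only a sketch: parse_dommemstat is a hypothetical helper (not part of vdsm or MOM), it works on captured text rather than calling virsh, and the values are assumed to be in KiB.

```python
def parse_dommemstat(output):
    """Parse 'name value' lines into a dict of integers.

    Lines that do not look like 'name <integer>' are ignored.
    """
    stats = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = int(fields[1])
        except ValueError:
            continue
    return stats


# Sample values taken from the message above.
sample = "actual 3915372\nrss 2141580\n"

stats = parse_dommemstat(sample)
# Part of the allocation that is not currently backed by host memory.
not_resident = stats["actual"] - stats["rss"]
print(stats, not_resident)
```

The difference between 'actual' and 'rss' is the amount a balloon inflation could claim inside the guest without freeing anything on the host.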

> additional topic:
>  * should we include per-guest config (for example a hard minimum memory cap:
>    this vm cannot run effectively with less than 1G memory)?

Yes.  This is probably something we want to do.  There is a whole topic around
VM tagging that we should consider.  In the future we will want to be able to do
many different things in policy based on a VM's tag.  For example, some VMs may
be completely exempt from ballooning.  Others may have a minimum limit.

I want to avoid passing in the raw guest configuration because MOM needs to work
with direct libvirt vms and with ovirt/vdsm vms.  Therefore, we want to think
carefully about the abstractions we use when presenting VM properties to MOM.

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] Strange input from oVirt-engine for create (VM) API

2012-10-22 Thread Adam Litke
Hi all,

Today I was watching the vdsm log as ovirt-engine started a VM and I saw
something peculiar with how VM device addresses were specified.  Here is a
sample of the python dictionary for the VM from vdsm.log (I reformatted it with
pprint for readability):

 {'address': {' bus': '1',
              ' controller': '0',
              ' target': '0',
              ' type': 'drive',
              'unit': '0'},

Notice the whitespace in the 'controller', 'target', and 'type' keys.  Could
someone explain why this is happening?  Is it deliberate or a bug?
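
For anyone who wants to check their own logs, here is a small Python sketch that flags dictionary keys with stray whitespace and shows what a normalized copy would look like. The helper names are made up for this example and the dictionary is just the excerpt above; it does not explain where the whitespace comes from.

```python
def find_padded_keys(d, path=()):
    """Yield (path, key) for every string key with leading/trailing whitespace."""
    for key, value in d.items():
        if isinstance(key, str) and key != key.strip():
            yield path, key
        if isinstance(value, dict):
            yield from find_padded_keys(value, path + (key,))


def strip_keys(d):
    """Return a copy of d with whitespace stripped from all string keys."""
    return {
        (k.strip() if isinstance(k, str) else k):
            (strip_keys(v) if isinstance(v, dict) else v)
        for k, v in d.items()
    }


device = {'address': {' bus': '1',
                      ' controller': '0',
                      ' target': '0',
                      ' type': 'drive',
                      'unit': '0'}}

padded = sorted(key for _, key in find_padded_keys(device))
print(padded)
print(strip_keys(device))
```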

Thanks!

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] New component 'mom' added to Bugzilla

2012-10-25 Thread Adam Litke
Hi all,

MOM is becoming a bigger part of oVirt and unfortunately it may have bugs at
some point :(  Thanks to Yaniv we have a new 'mom' component in oVirt's bugzilla
where you can report these.

To file a new bug against MOM: 
https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt;component=mom

Thanks!

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

2012-11-28 Thread Adam Litke
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
> 
> 
> - Original Message -
> > From: "Dan Kenigsberg" 
> > To: "Alon Bar-Lev" 
> > Cc: "VDSM Project Development" , 
> > "engine-devel" , "users"
> > 
> > Sent: Wednesday, November 28, 2012 10:39:42 PM
> > Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
> > 
> > On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
> > > 
> > > > > No... we need it as compatibility with older engines...
> > > > > We keep minimum changes there for legacy, until end-of-life.
> > > > 
> > > > Is there an EoL statement for oVirt-3.1?
> > > > We can make sure that oVirt-3.2's vdsm installs properly with
> > > > ovirt-3.1's vdsm-bootstrap, or even require that Engine must be
> > > > upgraded
> > > > to ovirt-3.2 before upgrading any of the hosts. Is it too harsh
> > > > to
> > > > our
> > > > vast install base?  us...@ovirt.org, please chime in!
> > > >
> > > 
> > > I tried to find such, but the more I dig I find that we need to
> > > support old legacy.
> > 
> > Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an
> > unupgradable F16). Should we be any better than our (currently
> > single)
> > platform?
> 
> We should start to detach from distro-specific procedures.
> 
> > 
> > > > > > > 
> > > > > > >  * legacy-removed: change machine-wide core file
> > > > > > >   # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
> > > > > > 
> > > > > > Yeah, qemu-kvm and libvirtd are much more stable than in the
> > > > > > old
> > > > > > days,
> > > > > > but wouldn't we want to keep a means to collect the corpses
> > > > > > of
> > > > > > dead
> > > > > > processes from hypervisors? It has helped us nail down nasty
> > > > > > bugs,
> > > > > > even
> > > > > > in Python.
> > > > > 
> > > > > It does not mean it should be at /var/lib/vdsm ... :)
> > > > 
> > > > I don't get the joke :-(. If you mind the location, we can think
> > > > of
> > > > somewhere else to put the core dumps. Would it be hard to
> > > > reinstate a
> > > > parallel feature in otopi?
> > > 
> > > I usually do not make any jokes...
> > > A global system setting should not go into a package-specific
> > > location.
> > > Usually core dumps are off by default. I like this approach, as
> > > an unattended system may quickly consume all disk space because of
> > > dumps.
> > 
> > If a host fills up with dumps so quickly, it's a sign that it should
> > not
> > be used for production, and that someone should look into the cores.
> > (P.S. we have a logrotate rule for them in vdsm)
> 
> There should be a vdsm-debug-aids (or similar) to perform such changes.
> Again, I don't think vdsm should (by default) modify any system-wide
> parameter such as this.
> But I will be happy to hear more views.

I agree with your statement above that a single package should not override a
global system setting.  We should really work to remove as many of these from
vdsm as we possibly can.  It will help to make vdsm a much safer/well-behaved
package.
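
One way to audit such an override without changing anything is to read the setting back and see whether it points into a package's directory. The Python sketch below does only that; the helper names are hypothetical, and the demonstration reads a temporary file instead of /proc so that it runs anywhere.

```python
import os
import tempfile

CORE_PATTERN_PATH = "/proc/sys/kernel/core_pattern"


def read_core_pattern(path=CORE_PATTERN_PATH):
    """Return the current core_pattern value without modifying it."""
    with open(path) as f:
        return f.read().strip()


def points_into(pattern, directory):
    """Return True if pattern is an absolute path located under directory."""
    if not pattern.startswith("/"):
        # Relative file names and '|program' pipe handlers are not paths
        # under any directory.
        return False
    prefix = directory.rstrip("/") + "/"
    return os.path.normpath(pattern).startswith(prefix)


# Demonstrate with a temporary file holding the value from the quoted command.
with tempfile.NamedTemporaryFile("w", suffix=".core_pattern", delete=False) as tmp:
    tmp.write("/var/lib/vdsm/core\n")
try:
    pattern = read_core_pattern(tmp.name)
    print(pattern, points_into(pattern, "/var/lib/vdsm"))
finally:
    os.unlink(tmp.name)
```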

> 
> > 
> > > If sysadmin manually enables dumps, he may do this at a location of
> > > his own choice.
> > 
> > Note that we've just swapped hats: you're arguing for letting a local
> > admin log in and mess with system configuration, and I'm for keeping
> > a
> > centralized feature for storing and collecting core dumps.
> 
> Problems like crashes are investigated case by case, per reproduction scenario.
> But again, I may be wrong and we should have a VDSM API command to start/stop
> storing dumps and manage this via its master...

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 10:00:12AM +0200, Dan Kenigsberg wrote:
> On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote:
> > On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
> > > 
> > > 
> > > - Original Message -
> > > > From: "Dan Kenigsberg" 
> > > > To: "Alon Bar-Lev" 
> > > > Cc: "VDSM Project Development" , 
> > > > "engine-devel" , "users"
> > > > 
> > > > Sent: Wednesday, November 28, 2012 10:39:42 PM
> > > > Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
> > > > 
> > > > On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
> > > > > 
> > > > > > > No... we need it as compatibility with older engines...
> > > > > > > We keep minimum changes there for legacy, until end-of-life.
> > > > > > 
> > > > > > Is there an EoL statement for oVirt-3.1?
> > > > > > We can make sure that oVirt-3.2's vdsm installs properly with
> > > > > > ovirt-3.1's vdsm-bootstrap, or even require that Engine must be
> > > > > > upgraded
> > > > > > to ovirt-3.2 before upgrading any of the hosts. Is it too harsh
> > > > > > to
> > > > > > our
> > > > > > vast install base?  us...@ovirt.org, please chime in!
> > > > > >
> > > > > 
> > > > > I tried to find such, but the more I dig I find that we need to
> > > > > support old legacy.
> > > > 
> > > > Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an
> > > > unupgradable F16). Should we be any better than our (currently
> > > > single)
> > > > platform?
> > > 
> > > We should start to detach from distro-specific procedures.
> > > 
> > > > 
> > > > > > > > > 
> > > > > > > > >  * legacy-removed: change machine width core file
> > > > > > > > >   # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
> > > > > > > > 
> > > > > > > > Yeah, qemu-kvm and libvirtd are much more stable than in the
> > > > > > > > old
> > > > > > > > days,
> > > > > > > > but wouldn't we want to keep a means to collect the corpses
> > > > > > > > of
> > > > > > > > dead
> > > > > > > > processes from hypervisors? It has helped us nail down nasty
> > > > > > > > bugs,
> > > > > > > > even
> > > > > > > > in Python.
> > > > > > > 
> > > > > > > It does not mean it should be at /var/lib/vdsm ... :)
> > > > > > 
> > > > > > I don't get the joke :-(. If you mind the location, we can think
> > > > > > of
> > > > > > somewhere else to put the core dumps. Would it be hard to
> > > > > > reinstate a
> > > > > > parallel feature in otopi?
> > > > > 
> > > > > I usually do not make any jokes...
> > > > > A global system setting should not go into a package-specific
> > > > > location.
> > > > > Usually core dumps are off by default. I like this approach, as
> > > > > an unattended system may quickly consume all disk space because of
> > > > > dumps.
> > > > 
> > > > If a host fills up with dumps so quickly, it's a sign that it should
> > > > not
> > > > be used for production, and that someone should look into the cores.
> > > > (P.S. we have a logrotate rule for them in vdsm)
> > > 
> > > There should be a vdsm-debug-aids (or similar) to perform such changes.
> > > Again, I don't think vdsm should (by default) modify any system-wide
> > > parameter such as this.
> > > But I will be happy to hear more views.
> > 
> > I agree with your statement above that a single package should not override
> > a global system setting.  We should really work to remove as many of these
> > from vdsm as we possibly can.  It will help to make vdsm a much
> > safer/well-behaved package.
> 
> I'm fine with dropping these from vdsm, but I think they are good for
> ovirt - we would like to (be able to) enforce policy on our nodes.
> 
> If configuring core dumps is removed from vdsm, it should go somewhere
> else, or our log-collector users would miss their beloved dumps.

Yes, I agree.  From my point of view the plan was to do the following:

1. Remove unnecessary system configuration changes.  This includes things like
Royce's supervdsm startup process patch (and accompanying sudo->supervdsm
conversions) which allows us to remove some of the sudo configuration.

2. Isolate the remaining tweaks into vdsm-tool.

3. Provide a service/program that can be run to configure a system to work in an
ovirt-engine controlled cluster.

Doing this allows vdsm to be safely installed on any system as a basic
prerequisite for other software.

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
Today in vdsm, every object (StoragePool, StorageDomain, VM, Volume, etc) is
identified by a single UUID.  On the surface, it seems like this is enough info
to properly identify a resource but in practice it's not.  For example, when you
look at the APIs dealing with Volumes, almost all of them require an sdUUID,
spUUID, and imgUUID in order to provide proper context for the operation.

Needing to provide these extra UUIDs is a burden on the API user because knowing
which values to pass requires internal knowledge of the API.  For example, the
spUUID parameter is almost always just the connected storage pool.  Since we
know there can currently be only one connected pool, the value is known.

I would like to move away from needing to understand all of these relationships
from the end user perspective by encapsulating the extra context into new object
identifier types as follows:

StoragePoolIdentifier:
{ 'storagepoolID': 'UUID' }
StorageDomainIdentifier:
{ 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID' }
ImageIdentifier:
{ 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID', 'imageID': 'UUID' }
VolumeIdentifier:
{ 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
  'imageID': 'UUID', 'volumeID': 'UUID' }
TaskIdentifier:
{ 'taskID': 'UUID' }
VMIdentifier:
{ 'vmID': 'UUID' }

In the new API, anytime a reference to an object is required, one of the above
structures must be passed in place of today's single UUID.  In many cases, this
will allow us to reduce the number of parameters to the function since the
needed contextual parameters (spUUID, etc) will be part of the object's
identifier.  Similarly, any time the API returns an object reference it would
return a *Identifier instead of a bare UUID.
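
To make the shape of these types concrete, here is a rough Python sketch of two of them. It assumes the trailing '*' on 'storagepoolID*' marks an optional field, and the to_wire helper is hypothetical; nothing here is taken from the vdsm code base.

```python
from typing import NamedTuple, Optional


class StoragePoolIdentifier(NamedTuple):
    storagepoolID: str


class VolumeIdentifier(NamedTuple):
    storagedomainID: str
    imageID: str
    volumeID: str
    storagepoolID: Optional[str] = None  # assumed optional ('storagepoolID*')


def to_wire(identifier):
    """Serialize an identifier to a plain dict, dropping unset optional fields."""
    return {k: v for k, v in identifier._asdict().items() if v is not None}


# The UUID strings below are placeholders.
vol = VolumeIdentifier("sd-uuid", "img-uuid", "vol-uuid")
print(to_wire(vol))
print(to_wire(vol._replace(storagepoolID="sp-uuid")))
```

A caller would hand the whole structure back to the API and never pick it apart, which is what keeps the internal layout free to change.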

These identifier types are basically opaque blobs to the API users and are only
ever generated by vdsm itself.  Because of this, we can change the internal
structure of the identifier to require new information or (before freezing the
API) remove fields that no longer make sense.

I would greatly appreciate your comments on this proposal.  If it seems
reasonable, I will revamp the current schema to make the necessary changes and
provide the Bridge patch functions to convert between the current implementation
and the new schema.

--- sample schema patch ---

commit 48f6b0f0a111dd0b372d211a4e566ce87f375cee
Author: Adam Litke 
Date:   Tue Nov 27 14:14:06 2012 -0600

schema: Introduce class identifier types

When calling API methods that belong to a particular class, a class instance
must be indicated by passing a set of identifiers in the request.  The location
of these parameters within the request is: 'params' -> '__obj__'.  Since this
set of identifiers must be used together to correctly instantiate an object, it
makes sense to define these as proper types within the API.  Then, functions
that return an object (or list of objects) can refer to the correct type.

Signed-off-by: Adam Litke 

diff --git a/vdsm_api/vdsmapi-schema.json b/vdsm_api/vdsmapi-schema.json
index 0418e6e..7e2e851 100644
--- a/vdsm_api/vdsmapi-schema.json
+++ b/vdsm_api/vdsmapi-schema.json
@@ -937,7 +937,7 @@
 # Since: 4.10.0
 ##
 {'command': {'class': 'Host', 'name': 'getConnectedStoragePools'},
- 'returns': ['StoragePool']}
+ 'returns': ['StoragePoolIdentifier']}
 
 ##
 # @BlockDeviceType:
@@ -1572,7 +1572,7 @@
 {'command': {'class': 'Host', 'name': 'getStorageDomains'},
  'data': {'*storagepoolID': 'UUID', '*domainClass': 'StorageDomainImageClass',
   '*storageType': 'StorageDomainType', '*remotePath': 'str'},
- 'returns': ['StorageDomain']}
+ 'returns': ['StorageDomainIdentifier']}
 
 ##
 # @Host.getStorageRepoStats:
@@ -2406,7 +2406,7 @@
 ##
 {'command': {'class': 'Host', 'name': 'getVMList'},
  'data': {'*vmList': ['UUID']},
- 'returns': ['VM']}
+ 'returns': ['VMIdentifier']}
 
 ##
 # @Host.ping:
@@ -2744,10 +2744,11 @@
  'returns': 'ConnectionRefMap'}
 
 ## Category: @ISCSIConnection 
##
+
 ##
-# @ISCSIConnection:
+# @ISCSIConnectionIdentifier:
 #
-# ISCSIConnection API object.
+# Identifier for an ISCSIConnection object.
 #
 # @host:  A fully-qualified domain name (FQDN) or IP 

Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
> This is all only valid for the current storage API the new one doesn't have
> pools or volumes. Only domains and images.  Also, images and domains are more
> loosely coupled and make this method problematic.

I am looking for an incremental way to bridge the differences.  It's been 2
years and we still don't have the revamped storage API so I am planning on what
we have being around for a while :)  I think that defining object identifiers as
opaque structured types is also future proof.  In the future, for an Image-ng
object, we can drop 'storagepoolID' from the identifier and, if it makes sense,
remove the hard association with a storageDomain as well.  The point behind this
refactoring is to give us the option of coupling multiple UUIDs (or other data)
to form a single, opaque identifier.

> That being said, if we do choose to make the current storage API officially
> supported I do agree that it looks a bit simpler but for the price of forcing
> the user to construct these objects before sending the request. I know for a
> fact that the engine will just create these objects on the fly because they
> use their own objects to group things logically. This means adding more work
> instead of removing it.  Most clients will do that anyway as they will use
> their own DAL to store these relationships. 
> 

Thanks for bringing up some of these points.  All deserve attention so I will
address each one individually:

The current API does not yet make an official statement of support for anything.
I want to model the current storage API so that the node level API can have the
same level of functionality as is currently supported.  I am all for removing
deprecated functions and redesigning in-place for a reasonable amount of time
going forward.  In a perfect world, libvdsm-1.0 would release with no mention of
storage pools at all.

If properly designed, the end-user (including engine) would never be
constructing these objects itself.  Object identifiers are essentially opaque
structures.  In order to make this possible, we need to make sure that the API
provides all of the functions needed to look up objects.  So far these are:

StoragePool:    Host.getConnectedStoragePools
StorageDomain:  Host.getStorageDomains
Image:          StorageDomain.getImages
Volume:         StorageDomain.getVolumes / Image.getVolumes
VM:             Host.getVMList
Task:           Host.getAllTasks

All of the above would return object identifiers.
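
From the client side, the flow would look roughly like the Python sketch below: an identifier obtained from a lookup call is passed back untouched under 'params' -> '__obj__'. The JSON-RPC 2.0 envelope, the build_request helper and the method name "Volume.getInfo" are assumptions for illustration only.

```python
import itertools
import json

_next_id = itertools.count(1)


def build_request(method, obj_identifier, **arguments):
    """Build a request that carries the identifier in params['__obj__']."""
    params = dict(arguments)
    params["__obj__"] = obj_identifier  # passed back exactly as received
    return {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": method,
        "params": params,
    }


# Pretend this came back from a lookup call such as StorageDomain.getVolumes.
# The UUID strings are placeholders.
volume_ref = {
    "storagedomainID": "sd-uuid",
    "imageID": "img-uuid",
    "volumeID": "vol-uuid",
}

request = build_request("Volume.getInfo", volume_ref)  # hypothetical method name
print(json.dumps(request, indent=2, sort_keys=True))
```

The client never inspects or assembles the identifier; it only stores what a lookup returned and sends it back.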

The other case is for creation of new resources.  In that case, the create
method needs to move to the owning object.  The key example is VM.create which
should move to Host.createVM.  Functions such as Host.createVM could still
accept a vmUUID (because I assume engine does want the ability to set this
explicitly).  However, they should be changed to return either a TaskIdentifier
(if the creation is asynchronous) or a *Identifier (e.g. VMIdentifier) if the
object was created synchronously.
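
A client could tell the two cases apart along these lines. This Python sketch assumes a TaskIdentifier is a dict holding only a 'taskID' key and that anything else identifies the newly created object; the function name is hypothetical.

```python
def describe_create_result(result):
    """Classify the identifier returned by a create call.

    Assumption: a TaskIdentifier has exactly one key, 'taskID'; any other
    dict is the identifier of the object that was created.
    """
    if set(result) == {"taskID"}:
        return "asynchronous: poll task %s" % result["taskID"]
    return "synchronous: created %r" % (sorted(result.items()),)


print(describe_create_result({"taskID": "task-uuid"}))
print(describe_create_result({"vmID": "vm-uuid"}))
```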

All told, I think the net is less work for clients.  They will no longer need to
model the object associations and relationships because the API will take care
of that automatically.


Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> They are not future proof as the paradigm is completely different.  Storage
> domain IDs are not static any more (and are not guaranteed to be unique or the
> same across the cluster).  Image IDs represent the ID of the projected data and
> not the actual unique path.  Just as an example, to run a VM you give a list
> of domains that might contain the needed images in the chain and the image ID
> of the tip.  The paradigm has changed too, and most calls get a non synchronous
> number of images and domains.  Furthermore, the APIs themselves are
> completely different. So future proofing is not really an issue.

I don't understand this at all.  Perhaps we could all use some education on the
planned architectural changes.  If I can pass an arbitrary
list of domainIDs that _might_ contain the data, why wouldn't I just pass all
of them every time?  In that case, why are they even required since vdsm would
have to search anyway?

> As to making the current API a bit simpler. As I said, making them opaque is
> problematic as currently the engine is responsible for creating the IDs.

As I mentioned in my last post, engine still can specify the ID's when the
object is first created.  From that point forward the ID never changes so it can
be baked into the identifier.

> Furthermore, some calls require you to play with these (making a template
> instead of a snapshot).  Also, the full chain and topology needs to be
> completely visible to the engine.

Please provide a specific example of how you play with the IDs.  I can guess
where you are going, but I don't want to divert the thread.

> These things, as you said, are problematic. But this is the way things are
> today.

We are changing them.

> As for task IDs.  Currently task IDs are only used for storage and they get
> persisted to disk. This is WRONG and is not the case with the new storage API.
> Because we moved to an asynchronous message based protocol (json-rpc over
> TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc.
> json-rpc specifies that the IDs have to be unique for a client as long as the
> request is still active.  This is good enough as internally we can have a verb
> for a client to query its own running tasks and a verb to query other host
> tasks by mangling in the client before the ID.  Because the protocol is

So this would rely on the client keeping the connection open and as soon as it
disconnects it would lose the ability to query tasks from before the connection
went down?  I don't know if it's a good idea to conflate message ID's with task
ID's.  While the protocol can operate asynchronously, some calls have
synchronous semantics and others have asynchronous semantics.  I would expect
sync calls to return their data immediately and async calls to return
immediately with either: an error code, or an 'operation started' message and
associated ID for querying the status of the operation.

> asynchronous all calls are asynchronous by nature as well.  Tasks will no longer
> be persisted or expected to be persisted. It's the caller's responsibility to
> query the state and see if the operation succeeded or failed if the caller or
> VDSM died in the middle of the call. The current "cleanTask()" system can't be
> used when more than one client is using VDSM and will not be used for anything
> other than legacy storage.

I agree about not persisting tasks in the future.  Although I think finished
tasks should remain in memory for some time so they can be queried by a client
who must reconnect.
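
Keeping finished tasks around for a while could be as simple as the following Python sketch: an in-memory map with a time-to-live, nothing written to disk. The class name, the 300 second default and the injectable clock are all assumptions made for this example.

```python
import time


class FinishedTaskCache:
    """Keep results of finished tasks in memory for a limited time."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._tasks = {}  # task_id -> (finished_at, result)

    def record(self, task_id, result):
        """Remember the result of a task that has just finished."""
        self._tasks[task_id] = (self._clock(), result)

    def query(self, task_id):
        """Return the stored result, or None if unknown or expired."""
        self._expire()
        entry = self._tasks.get(task_id)
        return None if entry is None else entry[1]

    def _expire(self):
        now = self._clock()
        expired = [tid for tid, (finished_at, _) in self._tasks.items()
                   if now - finished_at > self._ttl]
        for tid in expired:
            del self._tasks[tid]


# A fake clock makes the expiry visible without sleeping.
now = [0.0]
cache = FinishedTaskCache(ttl_seconds=60.0, clock=lambda: now[0])
cache.record("task-1", "done")
now[0] = 30.0
print(cache.query("task-1"))  # still available to a reconnecting client
now[0] = 61.0
print(cache.query("task-1"))  # expired, so None
```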

> AFAIK Apart from storage all objects IDs are constructed with a single ID,
> name or alias. VMs, storageConnections, network interfaces. So it's not a real
> issue.  I agree that in the future we should keep the idiom of pass
> configuration once, name it, and keep using the name to reference the object.

Yes, storage is the major problem here.


Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke"
> > To: "Saggi Mizrahi"
> > Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg", "Federico Simoncelli",
> > "Ayal Baron", vdsm-de...@lists.fedorahosted.org
> > Sent: Thursday, November 29, 2012 5:22:43 PM
> > Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > They are not future proof as the paradigm is completely different.
> > > Storage domain IDs are not static any more (and are not guaranteed to be
> > > unique or the same across the cluster).  Image IDs represent the ID of the
> > > projected data and not the actual unique path.  Just as an example, to run
> > > a VM you give a list of domains that might contain the needed images in
> > > the chain and the image ID of the tip.  The paradigm has changed too, and
> > > most calls get a non synchronous number of images and domains.
> > > Furthermore, the APIs themselves are completely different. So future
> > > proofing is not really an issue.
> > 
> > I don't understand this at all.  Perhaps we could all use some education on
> > the planned architectural changes.  If I can pass an
> > arbitrary list of domainIDs that _might_ contain the data, why wouldn't I
> > just pass all of them every time?  In that case, why are they even required
> > since vdsm would have to search anyway?
> It's for optimization mostly, the engine usually has a good idea of where
> stuff is, and having it give hints to VDSM can speed up the search process.
> Also, the engine knows how transient some storage pieces are. If you have a
> domain that is only there for backup or "owned" by another manager sharing the
> host, you don't want your VMs using the disks that are on that storage,
> effectively preventing it from being removed (though we do have plans to have
> qemu switch base snapshots at runtime for just that).

This is not a clean design.  If the search is slow, then maybe we need to
improve caching internally.  Making a client cache a bunch of internal IDs to
pass around sounds like a complete layering violation to me.

> > 
> > > As to making the current API a bit simpler. As I said, making them opaque
> > > is problematic as currently the engine is responsible for creating the
> > > IDs.
> > 
> > As I mentioned in my last post, engine still can specify the ID's when the
> > object is first created.  From that point forward the ID never changes so it
> > can be baked into the identifier.
> Where will this identifier be persisted?
> > 
> > > Further more, some calls require you to play with these (making a template
> > > instead of a snapshot).  Also, the full chain and topology needs to be
> > > completely visible to the engine.
> > 
> > Please provide a specific example of how you play with the IDs.  I can guess
> > where you are going, but I don't want to divert the thread.
> The relationship between volumes and images is deceptive at the moment.  IMG
> is the chain and volume is a member, IMGUUID is only used for verification
> and to detect when we hit a template going up the chain.  When you do
> operations on images, assumptions are being guaranteed about the resulting IDs.
> When you copy an image, you assume to know all the new IDs as they remain the
> same.  With your method I can't tell what the new "opaque" result is going to
> be.  Preview mode (another abomination being deprecated) relies on the
> disconnect between imgUUID and volUUID.  Live migration currently moves a lot
> of the responsibility to the engine.

No client should need to know about all of these internal details.  I understand
that's the way it is today, and that's one of the main reasons that the API is a
complete pain to use.

> > 
> > > These things, as you said, are problematic. But this is the way things are
> > > today.
> > 
> > We are changing them.
> Any intermediary step is needlessly problematic for existing clients.  Work is
> already in progress for fixing the API properly, making some calls a bit nicer
> isn't an excuse to start making more compatibility code in the engine.

The engine won't need compatibility code.  This only would impact the jsonrpc
bindings which aren't used by engine yet.  When engine switches over, then yes
it would need to adapt.

> > 
> > > As for task IDs.  Currently task IDs are only used for storage and

Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Mon, Dec 03, 2012 at 03:57:42PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke"  To: "Saggi Mizrahi"
> >  Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg"
> > , "Federico Simoncelli" , "Ayal
> > Baron" , vdsm-de...@lists.fedorahosted.org Sent: Monday,
> > December 3, 2012 3:30:21 PM Subject: Re: RFD: API: Identifying vdsm objects
> > in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> > > 
> > > 
> > > - Original Message -
> > > > From: "Adam Litke"  To: "Saggi Mizrahi"
> > > >  Cc: engine-de...@linode01.ovirt.org, "Dan
> > > > Kenigsberg" , "Federico Simoncelli"
> > > > , "Ayal Baron" ,
> > > > vdsm-de...@lists.fedorahosted.org Sent: Thursday, November 29, 2012
> > > > 5:22:43 PM Subject: Re: RFD: API: Identifying vdsm objects in the
> > > > next-gen API
> > > > 
> > > > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > > > They are not future proof as the paradigm is completely different.
> > > > > Storage domain IDs are not static any more (and are not guaranteed to
> > > > > be unique or the same across the cluster.  Image IDs represent the ID
> > > > > of the projected data and not the actual unique path.  Just as an
> > > > > example, to run a VM you give a list of domains that might contain the
> > > > > needed images in the chain and the image ID of the tip.  The paradigm
> > > > > is changed to and most calls get non synchronous number of images and
> > > > > domains.  Further more, the APIs themselves are completely different.
> > > > > So future proofing is not really an issue.
> > > > 
> > > > I don't understand this at all.  Perhaps we could all use some education
> > > > on the architecture of the planned architectural changes.  If I can pass
> > > > an arbitrary list of domainIDs that _might_ contain the data, why
> > > > wouldn't I just pass all of them every time?  In that case, why are they
> > > > even required since vdsm would have to search anyway?
> > > It's for optimization mostly, the engine usually has a good idea of where
> > > stuff is, having it give hints to VDSM can speed up the search process.
> > > Also, the engine knows how transient some storage pieces are. If you
> > > have a domain that is only there for backup or "owned" by another manager
> > > sharing the host, you don't want your VMs using the disks that are on that
> > > storage, effectively preventing it from being removed (though we do have
> > > plans to have qemu switch base snapshots at runtime for just that).
> > 
> > This is not a clean design.  If the search is slow, then maybe we need to
> > improve caching internally.  Making a client cache a bunch of internal IDs
> > to pass around sounds like a complete layering violation to me.
> You can't cache this.  If the same template exists on 2 different NFS
> domains only the engine has enough information to know which you should use.
> We only have the engine give us this information when starting a VM or
> merging\copying an image that resides on multiple domains.  It is also
> completely optional. I didn't like it either.

Is it even valid for the same template (with identical uuids) to exist in two
places?  I thought uuids aren't supposed to collide.  I can envision some
scenario where a cached storagedomain/storagepool relationship becomes invalid
because another user detached the storagedomain.  In that case, the API just
returns the normal error about "sd XXX is not attached to sp XXX".  So I don't
see any problem here.

> > 
> > > > 
> > > > > As to making the current API a bit simpler. As I said, making them
> > > > > opaque is problematic as currently the engine is responsible for
> > > > > creating the IDs.
> > > > 
> > > > As I mentioned in my last post, engine still can specify the ID's when
> > > > the object is first created.  From that point forward the ID never
> > > > changes so it can be baked into the identifier.
> > > Where will this identifier be persisted?
> > > > 
> > > > > Further more, some calls require you to play with these (making a
> > > > > template instead of a snapshot).  Also, the full chai

Re: [Engine-devel] VDSM tasks, the future

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
> Because I started hinting about how VDSM tasks are going to look going forward
> I thought it's better I'll just write everything in an email so we can talk
> about it in context.  This is not set in stone and I'm still debating things
> myself but it's very close to being done.

Don't debate them yourself, debate them here!  Even better, propose your idea in
schema form to show how a command might work exactly.

> - Everything is asynchronous.  The nature of message based communication is
> that you can't have synchronous operations.  This is not really debatable
> because it's just how TCP\AMQP\ works.

Can you show how a traditionally synchronous command might work?  Let's take
Host.getVmList as an example.
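
As a rough illustration of the question above, here is a minimal Python sketch
of how a traditionally synchronous call such as Host.getVmList could look over
a message-based transport: the client sends one JSON-RPC 2.0 request carrying an
id it chose itself and matches the reply by that id.  Only the request/response
envelope follows JSON-RPC 2.0.  fake_vdsm_handle(), the VM names and the
"engine-0001" id are invented for this example and are not part of vdsm.

```python
import json
import uuid


def make_request(method, params=None, request_id=None):
    """Build a JSON-RPC 2.0 request; the caller picks the id."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or {},
        "id": request_id or str(uuid.uuid4()),
    }


def fake_vdsm_handle(raw_request):
    """Hypothetical stand-in for vdsm that answers with made-up data."""
    request = json.loads(raw_request)
    if request["method"] == "Host.getVmList":
        body = {"result": ["vm-a", "vm-b"]}
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    # The response echoes the caller's id so it can be matched to the request.
    return json.dumps(dict(body, jsonrpc="2.0", id=request["id"]))


def call(method, params=None, request_id=None):
    """A 'synchronous' call: send one message, then match the reply by id."""
    request = make_request(method, params, request_id)
    response = json.loads(fake_vdsm_handle(json.dumps(request)))
    if response["id"] != request["id"]:
        raise RuntimeError("response does not match request")
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    return response["result"]


vm_list = call("Host.getVmList", request_id="engine-0001")
print(vm_list)
```

In a real transport the reply would arrive later on the same connection, and
the client would keep a table of pending ids instead of calling the handler
directly.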

> - Task IDs will be decided by the caller.  This is how json-rpc works and also
> makes sense because now the engine can track the task without needing to have a
> stage where we give it the task ID back.  IDs are reusable as long as no one
> else is using them at the time so they can be used for synchronizing
> operations between clients (making sure a command is only executed once on a
> specific host without locking).
> 
> - Tasks are transient.  If VDSM restarts it forgets all the task information.
> There are 2 ways to have persistent tasks: 1. The task creates an object that
> you can continue work on in VDSM.  The new storage does that by the fact that
> copyImage() returns once the target volume has been created but before the data
> has been fully copied.  From that moment on the state of the copy can be
> queried from any host using getImageStatus() and the specific copy operation
> can be queried with getTaskStatus() on the host performing it.  After VDSM
> crashes, depending on policy, either VDSM will create a new task to continue
> the copy or someone else will send a command to continue the operation and
> that will be a new task.  2. VDSM tasks just start other operations track-able
> not through the task interface. For example Gluster.
> gluster.startVolumeRebalance() will return once it has been registered with
> Gluster.  gluster.getOperationStatuses() will return the state of the operation
> from any host.  Each call is a task in itself.

I worry about this approach because every command has a different semantic for
checking progress.  For migration, we have to check VM status on the src and
dest hosts.  For image copy we need to use a special status call on the dest
image.  It would be nice if there was a unified method for checking on an
operation.  Maybe that can be completion events.

Client:                      vdsm:
-------                      -----

Image.copy(...)      -->
                     <--     Operation Started
Wait for event       ...
                     <--     Event: Operation  done

For an early error:

Client:                      vdsm:
-------                      -----

Image.copy(...)      -->
                     <--     Error: 


> - No task tags.  They are silly and the caller can mangle whatever in the task
> ID if he really wants to tag tasks.

Yes.  Agreed.

> - No explicit recovery stage.  VDSM will be crash-only, there should be
> efforts to make everything crash-safe.  If that is problematic, in case of
> networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a
template?

> - No clean Task: Tasks can be started by any number of hosts this means that
> there is no way to own all tasks.  There could be cases where VDSM starts
> tasks on its own and thus they have no owner at all.  The caller needs to
> continually track the state of VDSM. We will have broadcast events to
> mitigate polling.

If a disconnected client might have missed a completion event, it will need to
check state.  This means each async operation that changes state must document a
procedure for checking progress of a potentially ongoing operation.  For
Image.copy, that process would be to look up the new image and check its state.
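
The flow above (wait for a completion event, fall back to checking state if the
event was missed) could be sketched roughly as follows.  This is only a sketch
under stated assumptions: FakeVdsm, image_copy(), finish_copy(), image_state()
and the state strings are hypothetical names made up for this example, not
vdsm API.

```python
import queue


class FakeVdsm:
    """Hypothetical stand-in for vdsm; none of these names are real vdsm API."""

    def __init__(self, drop_events=False):
        self.events = queue.Queue()
        self.images = {}
        self.drop_events = drop_events  # simulate a client that missed events

    def image_copy(self, src, dst):
        if src not in self.images:
            raise LookupError("no such image: %s" % src)  # early error
        self.images[dst] = "copying"
        return "started"

    def finish_copy(self, dst):
        # Simulate the background copy completing on the vdsm side.
        self.images[dst] = "ready"
        if not self.drop_events:
            self.events.put(("Image.copy", dst, "done"))

    def image_state(self, image_id):
        return self.images.get(image_id, "missing")


def wait_for_copy(vdsm, dst, timeout=0.1):
    """Prefer the completion event; if none arrives, look up the image state."""
    try:
        event = vdsm.events.get(timeout=timeout)
        if event == ("Image.copy", dst, "done"):
            return "ready"
    except queue.Empty:
        pass
    # Documented fallback for Image.copy: look up the new image, check its state.
    return vdsm.image_state(dst)


vdsm = FakeVdsm(drop_events=True)
vdsm.images["template"] = "ready"
vdsm.image_copy("template", "new-image")
vdsm.finish_copy("new-image")
print(wait_for_copy(vdsm, "new-image"))
```

The point of the sketch is that the fallback path differs per verb, so each
async verb would have to document its own state check.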

> - No revert Impossible to implement safely.

How do the engine folks feel about this?  I am ok with it :)

> - No SPM\HSM tasks SPM\SDM is no longer necessary for all domain types (only
> for type).  What used to be SPM tasks, or tasks that persist and can be
> restarted on other hosts is talked about in previous bullet points.
> 
A nice simplification.


-- 
Adam Litke 
IBM Linux Technology Center

___
Engine-devel mailing list
Engine-devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel


Re: [Engine-devel] [vdsm] RFC: New Storage API

2012-12-04 Thread Adam Litke
.
> For very large setups this might be problematic. To mitigate the problem you 
> have these options:
> participatingRepositories=[repoId, ...] which tells VDSM to narrow the search 
> to just these repositories
> and
> imageHints={imgId: repoId} which will force VDSM to look for those image IDs 
> just in those repositories and fail if it doesn't find them there.

I would like to have a better way of specifying these optional parameters
without burying them in an options structure.  I will think a little more about
this.  Strategy can just be two optional flags in a 'flags' argument.  For the
participatingRepositories and imageHints options, I think we need to use real
parameters.
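
A minimal Python sketch of what that could look like, assuming a 'flags'
argument for the strategy and real keyword parameters for the two search
options.  The names Strategy, plan_search, participating_repositories and
image_hints are invented for this example; the semantics (narrow the search,
or pin an image to one repository) follow the description quoted above.

```python
import enum


class Strategy(enum.Flag):
    """Hypothetical flag values; the proposal names 'space' and 'performance'."""
    SPACE = 1
    PERFORMANCE = 2


def plan_search(image_ids, connected_repos, flags=Strategy.SPACE,
                participating_repositories=None, image_hints=None):
    """Return (flags, {image_id: [repositories to search]})."""
    if (flags & Strategy.SPACE) and (flags & Strategy.PERFORMANCE):
        raise ValueError("space and performance are mutually exclusive")

    # By default every connected repository is searched.
    repos = list(connected_repos)
    if participating_repositories is not None:
        # Narrow the search to just these repositories.
        repos = [r for r in repos if r in participating_repositories]

    hints = image_hints or {}
    plan = {}
    for image_id in image_ids:
        if image_id in hints:
            # A hinted image is looked for only in the hinted repository.
            plan[image_id] = [hints[image_id]]
        else:
            plan[image_id] = list(repos)
    return flags, plan


print(plan_search(["img1", "img2"], ["repoA", "repoB", "repoC"],
                  participating_repositories=["repoA", "repoB"],
                  image_hints={"img2": "repoC"}))
```

With real parameters the schema can document and type-check each option, which
an opaque options dictionary does not allow.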

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
 progress
> > > is available for that operation.
> > > last_error - This will only be filled if the operation failed
> > > because of something other than IO or a VDSM crash for obvious
> > > reasons.
> > >   It will usually be set if the task was manually
> > >   stopped
> > >
> > > The user can either be satisfied with that information or ask the
> > > host specified in host ID if it is still working on that image by
> > > checking its running tasks.
> > 
> > So we need a function to know what tasks are running on the image
> getImageStatus()
> > >
> > > checkStorageRepository(self, repositoryId, options={}):
> > > A method to go over a storage repository and scan for any existing
> > > problems. This includes degraded\broken images and deleted images
> > > that have not yet been physically deleted\merged.
> > > It returns a list of Fix objects.
> > > Fix objects come in 4 types:
> > > clean - cleans data, run them to get more space.
> > > optimize - run them to optimize a degraded image
> > > merge - Merges two images together. Doing this sometimes
> > >  makes more images ready for optimizing or cleaning.
> > >  The reason it is different from optimize is that
> > >  unmerged images are considered optimized.
> > > mend - mends a broken image
> > >
> > > The user can read these types and prioritize fixes. Fixes also
> > > contain opaque FIX data and they should be sent as received to
> > > fixStorageRepository(self, repositoryId, fix, options={}):
> > >
> > > That will start a fix operation.
> > >
> > >
> > > All major operations automatically start the appropriate "Fix" to
> > > bring the created object to an optimize\degraded state (the one
> > > that is quicker) unless one of the options is
> > > AutoFix=False. This is only useful for repos that might not be able
> > > to create volumes on all hosts (SDM) but would like to have the
> > > actual IO distributed in the cluster.
> > >
> > > Other common options is the strategy option:
> > > It has currently 2 possible values
> > > space and performance - In case VDSM has 2 ways of completing the
> > > same operation it will tell it to value one over the other. For
> > > example, whether to copy all the data or just create a qcow based
> > > off a snapshot.
> > > The default is space.
> > >
> > > You might have also noticed that it is never explicitly specified
> > > where to look for existing images. This is done purposefully, VDSM
> > > will always look in all connected repositories for existing
> > > objects.
> > > For very large setups this might be problematic. To mitigate the
> > > problem you have these options:
> > > participatingRepositories=[repoId, ...] which tells VDSM to narrow
> > > the search to just these repositories
> > > and
> > > imageHints={imgId: repoId} which will force VDSM to look for those
> > > image IDs just in those repositories and fail if it doesn't find
> > > them there.
> > > ___
> > > vdsm-devel mailing list
> > > vdsm-de...@lists.fedorahosted.org
> > > https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> > 
> > 
> > --
> > ---
> > > 舒明 Shu Ming
> > Open Virtualization Engineerning; CSTL, IBM Corp.
> > Tel: 86-10-82451626  Tieline: 9051626 E-mail: shum...@cn.ibm.com or
> > shum...@linux.vnet.ibm.com
> > Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> > District, Beijing 100193, PRC
> > 
> > 
> > 

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:



> > 1) Can you provide more info on why there is a exception for 'lvm
> > based
> > block domain'. Its not coming out clearly.
> File based domains are responsible for syncing up object manipulation 
> (creation\deletion)
> The backend is responsible for making sure it all works either by having a 
> single writer (NFS) or having its own locking mechanism (gluster).
> In our LVM based domains VDSM is responsible for basic object manipulation.
> The current design uses an approach where there is a single host responsible 
> for object creation\deletion; it is the SRM\SDM\SPM\S?M.
> If we ever find a way to make it fully clustered without a big hit in 
> performance the S?M requirement will be removed from that type of domain.

I would like to see us maintain a LOCALFS domain as well.  For this, we would
also need SRM, correct?

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
On Mon, Dec 10, 2012 at 02:03:09PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke" 
> > To: "Saggi Mizrahi" 
> > Cc: "Deepak C Shetty" , "engine-devel" 
> > , "VDSM Project
> > Development" 
> > Sent: Monday, December 10, 2012 1:49:31 PM
> > Subject: Re: [vdsm] RFC: New Storage API
> > 
> > On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:
> > 
> > 
> > 
> > > > 1) Can you provide more info on why there is a exception for 'lvm
> > > > based
> > > > block domain'. Its not coming out clearly.
> > > File based domains are responsible for syncing up object
> > > manipulation (creation\deletion)
> > > The backend is responsible for making sure it all works either by
> > > having a single writer (NFS) or having its own locking mechanism
> > > (gluster).
> > > In our LVM based domains VDSM is responsible for basic object
> > > manipulation.
> > > The current design uses an approach where there is a single host
> > > responsible for object creation\deletion; it is the
> > > SRM\SDM\SPM\S?M.
> > > If we ever find a way to make it fully clustered without a big hit
> > > in performance the S?M requirement will be removed from that type
> > > of domain.
> > 
> > I would like to see us maintain a LOCALFS domain as well.  For this,
> > we would
> > also need SRM, correct?
> No, why?

Sorry, never mind.  I was thinking of a scenario with multiple clients talking to
a single vdsm and making sure they don't stomp on one another.  This is
probably not something we are going to care about though.

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
On Mon, Dec 10, 2012 at 03:36:23PM -0500, Saggi Mizrahi wrote:
> > Statements like this make me start to worry about your userData
> > concept.  It's a
> > sign of a bad API if the user needs to invent a custom metadata
> > scheme for
> > itself.  This reminds me of the abomination that is the 'custom'
> > property in the
> > vm definition today.
> In one sentence: If VDSM doesn't care about it, VDSM doesn't manage it.
> 
> userData being a "void*" is quite common and I don't understand why you would 
> think it's a sign of a bad API.
> Furthermore, giving the user choice about how to represent its own metadata 
> and what fields it wants to keep seems reasonable to me.
> Especially given the fact that VDSM never reads it.
> 
> The reason we are pulling away from the current system of VDSM understanding 
> the extra data is that it makes that data tied to VDSMs on disk format.
> VDSM on disk format has to be very stable because of clusters with multiple 
> VDSM versions.
> Further more, since this is actually manager data it has to be tied to the 
> manager backward compatibility lifetime as well.
> Having it be opaque to VDSM ties it to only one, simpler, support lifetime 
> instead of two.
> 
> I guess you are implying that it will make it problematic for multiple users 
> to read userData left by another user because the formats might not be 
> compatible.
> The solution is that all parties interested in using VDSM storage agree on 
> format, and common fields, and supportability, and all the other things that 
> choosing a supporting *something* entails.
> This is, however, out of the scope of VDSM. When the time comes I think how 
> the userData blob is actually parsed and what fields it keeps should be 
> discussed on ovirt-devel or engine-devel.
> 
> The crux of the issue is that VDSM manages only what it cares about and the 
> user can't modify directly.
> This is done because everything we expose we commit to.
> If you want any information persisted like:
> - Human readable name (in whatever encoding)
> - Is this a template or a snapshot
> - What user owns this image
> 
> You can just put it in the userData.
> VDSM is not going to impose what encoding you use.
> It's not going to decide if you represent your users as IDs or names or ldap 
> queries or Public Keys.
> It's not going to decide if you have explicit templates or not.
> It's not going to decide if you care what is the logical image chain.
> It's not going to decide anything that is out of its scope.
> No format is future proof, no selection of fields will be good for any 
> situation.
> I'd much rather it be someone else's problem when any of them need to be 
> changed.
> They have currently been VDSM's problem and it has been hell to maintain.

In general, I actually agree with most of this.  What I want to avoid is pushing
things that should actually be a part of the API into this userData blob.  We do
want to keep the API as simple as possible to give vdsm flexibility.  If, over
time, we find that users are always using userData to work around something
missing in the API, this could be a really good sign that the API needs
extension.
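
To make the userData discussion concrete, here is a minimal sketch of the
split being described: the manager owns the encoding of the blob, and the
storage side only stores and returns bytes.  Everything here is an assumption
made for illustration: FakeImageStore, set_user_data(), get_user_data(), the
JSON encoding and the field names are not vdsm API, and the choice of JSON with
a "format" version field is just one option a manager might pick.

```python
import json


class FakeImageStore:
    """Hypothetical storage side: keeps userData as bytes, never parses it."""

    def __init__(self):
        self._blobs = {}

    def set_user_data(self, image_id, blob):
        if not isinstance(blob, bytes):
            raise TypeError("userData must be bytes")
        self._blobs[image_id] = blob

    def get_user_data(self, image_id):
        return self._blobs[image_id]


def encode_user_data(name, is_template, owner):
    """Manager side: choose the fields and the encoding."""
    record = {"format": 1, "name": name,
              "is_template": is_template, "owner": owner}
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode_user_data(blob):
    """Manager side: refuse blobs written in a format we do not know."""
    record = json.loads(blob.decode("utf-8"))
    if record.get("format") != 1:
        raise ValueError("unknown userData format")
    return record


store = FakeImageStore()
store.set_user_data("img-1", encode_user_data("Fedora 18 base", True, "alice"))
print(decode_user_data(store.get_user_data("img-1"))["name"])
```

If several managers end up needing the same fields from such a blob, that is
the kind of signal mentioned above that the field may belong in the API itself.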

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] Managing async tasks

2012-12-17 Thread Adam Litke
On today's vdsm call we had a lively discussion around how asynchronous
operations should be handled in the future.  In an effort to include more people
in the discussion and to better capture the resulting conversation I would like
to continue that discussion here on the mailing list.

A lot of ideas were thrown around about how 'tasks' should be handled in the
future.  There are a lot of ways that it can be done.  To determine how we
should implement it, it's probably best if we start with a set of requirements.
If we can first agree on these, it should be easy to find a solution that meets
them.  I'll take a stab at identifying a first set of POSSIBLE requirements:

- Standardized method for determining the result of an operation

  This is a big one for me because it directly affects the consumability of the
  API.  If each verb has different semantics for discovering whether it has
  completed successfully, then the API will be nearly impossible to use easily.
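
As a sketch of what a standardized result could mean in practice, the snippet
below assumes a single status record that every verb would report in the same
shape, so a client needs only one pair of checks.  OperationStatus, State and
the field names are hypothetical and only illustrate the requirement; they are
not a proposal for the actual schema.

```python
import enum
from dataclasses import dataclass
from typing import Optional


class State(enum.Enum):
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"
    UNKNOWN = "unknown"  # assumption: the outcome could not be determined


@dataclass(frozen=True)
class OperationStatus:
    """One result shape shared by all verbs (hypothetical)."""
    operation_id: str
    verb: str
    state: State
    error: Optional[str] = None


def is_finished(status):
    return status.state in (State.DONE, State.FAILED)


def succeeded(status):
    return status.state is State.DONE


copy = OperationStatus("op-1", "Image.copy", State.DONE)
migrate = OperationStatus("op-2", "VM.migrate", State.FAILED,
                          error="destination unreachable")
for status in (copy, migrate):
    print(status.verb, is_finished(status), succeeded(status))
```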


Sorry.  That's my list :)  Hopefully others will be willing to add other
requirements for consideration.

From my understanding, task recovery (stop, abort, rollback, etc) will not be
generally supported and should not be a requirement.



-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Managing async tasks

2012-12-17 Thread Adam Litke
On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke"  To: vdsm-de...@lists.fedorahosted.org
> > Cc: "Dan Kenigsberg" , "Ayal Baron" ,
> > "Saggi Mizrahi" , "Federico Simoncelli"
> > , engine-devel@ovirt.org Sent: Monday, December 17,
> > 2012 12:00:49 PM Subject: Managing async tasks
> > 
> > On today's vdsm call we had a lively discussion around how asynchronous
> > operations should be handled in the future.  In an effort to include more
> > people in the discussion and to better capture the resulting conversation I
> > would like to continue that discussion here on the mailing list.
> > 
> > A lot of ideas were thrown around about how 'tasks' should be handled in the
> > future.  There are a lot of ways that it can be done.  To determine how we
> > should implement it, it's probably best if we start with a set of
> > requirements.  If we can first agree on these, it should be easy to find a
> > solution that meets them.  I'll take a stab at identifying a first set of
> > POSSIBLE requirements:
> > 
> > - Standardized method for determining the result of an operation
> > 
> >   This is a big one for me because it directly affects the consumability of
> >   the API.  If each verb has different semantics for discovering whether it
> >   has completed successfully, then the API will be nearly impossible to use
> >   easily.
> Since there is no way to be sure whether some tasks completed successfully or
> failed, especially around the murky waters of storage, I say this requirement
> should be removed.  At least not in the context of a task.

I don't agree.  Please feel free to convince me with some examples.  If we
cannot provide feedback to a user as to whether their request has been satisfied
or not, then we have some bigger problems to solve.

> > 
> > 
> > Sorry.  That's my list :)  Hopefully others will be willing to add other
> > requirements for consideration.
> > 
> > From my understanding, task recovery (stop, abort, rollback, etc) will not
> > be generally supported and should not be a requirement.
> > 

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Managing async tasks

2012-12-17 Thread Adam Litke
On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote:
> This is an addendum to my previous email.
> 
> - Original Message -
> > From: "Saggi Mizrahi" 
> > To: "Adam Litke" 
> > Cc: "Dan Kenigsberg" , "Ayal Baron" , 
> > "Federico Simoncelli"
> > , engine-devel@ovirt.org, 
> > vdsm-de...@lists.fedorahosted.org
> > Sent: Monday, December 17, 2012 2:52:06 PM
> > Subject: Re: Managing async tasks
> > 
> > 
> > 
> > - Original Message -
> > > From: "Adam Litke" 
> > > To: "Saggi Mizrahi" 
> > > Cc: "Dan Kenigsberg" , "Ayal Baron"
> > > , "Federico Simoncelli"
> > > , engine-devel@ovirt.org,
> > > vdsm-de...@lists.fedorahosted.org
> > > Sent: Monday, December 17, 2012 2:16:25 PM
> > > Subject: Re: Managing async tasks
> > > 
> > > On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
> > > > 
> > > > 
> > > > - Original Message -
> > > > > From: "Adam Litke"  To:
> > > > > vdsm-de...@lists.fedorahosted.org
> > > > > Cc: "Dan Kenigsberg" , "Ayal Baron"
> > > > > ,
> > > > > "Saggi Mizrahi" , "Federico Simoncelli"
> > > > > , engine-devel@ovirt.org Sent: Monday,
> > > > > December 17,
> > > > > 2012 12:00:49 PM Subject: Managing async tasks
> > > > > 
> > > > > On today's vdsm call we had a lively discussion around how
> > > > > asynchronous
> > > > > operations should be handled in the future.  In an effort to
> > > > > include more
> > > > > people in the discussion and to better capture the resulting
> > > > > conversation I
> > > > > would like to continue that discussion here on the mailing
> > > > > list.
> > > > > 
> > > > > A lot of ideas were thrown around about how 'tasks' should be
> > > > > handled in the
> > > > > future.  There are a lot of ways that it can be done.  To
> > > > > determine how we
> > > > > should implement it, it's probably best if we start with a set
> > > > > of
> > > > > requirements.  If we can first agree on these, it should be
> > > > > easy
> > > > > to find a
> > > > > solution that meets them.  I'll take a stab at identifying a
> > > > > first set of
> > > > > POSSIBLE requirements:
> > > > > 
> > > > > - Standardized method for determining the result of an
> > > > > operation
> > > > > 
> > > > >   This is a big one for me because it directly affects the
> > > > >   consumability of
> > > > >   the API.  If each verb has different semantics for
> > > > >   discovering
> > > > >   whether it
> > > > >   has completed successfully, then the API will be nearly
> > > > >   impossible to use
> > > > >   easily.
> > > > Since there is no way to be sure whether some tasks completed
> > > > successfully or
> > > > failed, especially around the murky waters of storage, I say this
> > > > requirement
> > > > should be removed.  At least not in the context of a task.
> > > 
> > > I don't agree.  Please feel free to convince me with some examples.
> > >  If we
> > > cannot provide feedback to a user as to whether their request has
> > > been satisfied
> > > or not, then we have some bigger problems to solve.
> > If VDSM sends a write command to a storage server, and the connection
> > hangs up before the ACK has returned.
> > The operation has been committed but VDSM has no way of knowing if
> > that happened as far as VDSM is concerned it got an ETIMEO or EIO.
> > This is the same problem that the engine has with VDSM.
> > If VDSM creates an image\VM\network\repo but the connection hangs up
> > before the response can be sent back as far as the engine is
> > concerned the operation times out.
> > This is an inherent issue with clustering.
> > This is why I want to move away from tasks being *the* trackable
> > objects.
> > Tasks should be short. As short as possible.
> > Run VM should just persist the VM information on the VDSM host and
> > return. The

Re: [Engine-devel] [vdsm] RFC: New Storage API

2013-01-22 Thread Adam Litke
On Tue, Jan 22, 2013 at 11:36:57PM +0800, Shu Ming wrote:
> 2013-1-15 5:34, Ayal Baron:
> >image and volume are overused everywhere and it would be extremely confusing 
> >to have multiple meanings to the same terms in the same system (we have 
> >image today which means virtual disk and volume which means a part of a 
> >virtual disk).
> >Personally I don't like the distinction between image and volume done in 
> >ec2/openstack/etc seeing as they're treated as different types of entities 
> >there while the only real difference is mutability (images are read-only, 
> >volumes are read-write).
> >To move to the industry terminology we would need to first change all 
> >references we have today to image and volume in the system (I would say also 
> >in ovirt-engine side) to align with the new meaning.
> >Despite my personal dislike of the terms, I definitely see the value in 
> >converging on the same terminology as the rest of the industry but to do so 
> >would be an arduous task which is out of scope of this discussion imo 
> >(patches welcome though ;)
> 
> Another distinction between Openstack and oVirt is how the
> Nova/ovirt-engine look upon storage systems. In Openstack, a stand
> alone storage service(Cinder) exports the raw storage block device
> to Nova. On the other hand, in oVirt, the storage system is tightly
> bound to the cluster scheduling system, which integrates the storage
> sub-system, VM dispatching sub-system, and ISO image sub-system. This
> combination integrates all of the sub-systems into a whole which
> is easy to deploy, but it makes the sub-systems more opaque and
> harder to reuse and maintain. This new storage API proposal gives us
> an opportunity to separate these sub-systems into new components which
> export better, loosely coupled APIs to VDSM.

A very good point and an important goal in my opinion.  I'd like to see
ovirt-engine become more of a GUI for configuring the storage component (like it
does for Gluster) rather than the centralized manager of storage.  The clustered
storage should be able to take care of itself as long as the peer hosts can
negotiate the SDM role.  

It would be cool if someone could actually dedicate a non-virtualization host
whose only job is to handle SDM operations.  Such a host could choose to
deploy only the standalone HSM service and not the complete vdsm package.

-- 
Adam Litke 
IBM Linux Technology Center

___
Engine-devel mailing list
Engine-devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel


Re: [Engine-devel] 3.2 features for release notes

2013-01-24 Thread Adam Litke
On Thu, Jan 24, 2013 at 07:30:07AM -0800, Itamar Heim wrote:
> doron/adam:
> not sure about status of vdsm-mom in 3.2?

mom is enabled by default for hosts in 3.2 and will control KSM only.  No
user-visible changes are expected as this is primarily an infrastructure change
to enable more advanced SLA in the next release.

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] Java Newbie: Renaming some functions to fix findbugs warnings

2013-11-22 Thread Adam Litke
Hello,

I am working on resolving some warnings produced by findbugs and am looking for 
some advice on how to properly resolve the problem.

The Frontend class has several pairs of methods where a capitalized version is 
a deprecated static form and the camelCase version is the instance method.

For example:

@Deprecated
public static void RunQuery(...)

- and -

public void runQuery(...)

In both cases the parameters are the same so simply renaming RunQuery to 
runQuery will result in a conflict.  Since I am new to Java and the 
ovirt-engine project I am looking for some advice on how to fix the function 
name without breaking the code or people's sense of aesthetics.  Since this is 
a deprecated function, would it be terrible to rename it to 'runQueryStatic' or 
'runQueryDeprecated'?  Since the language provides syntactic annotations for 
'static' and 'deprecated', both of these names feel dirty but I am not sure 
what would be better.  Thanks for helping out a newbie!
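
To make the naming conflict concrete, here is a minimal sketch.  The
class FrontendSketch, its getInstance() accessor and the String
signatures are hypothetical stand-ins, not the real Frontend code.  It
assumes one of the names floated above (runQueryStatic) and simply has
the deprecated static form forward to the instance method:

```java
// Hypothetical stand-in for the Frontend class; not the real ovirt-engine code.
final class FrontendSketch {
    private static final FrontendSketch INSTANCE = new FrontendSketch();

    // Assumed accessor for a shared instance (illustrative only).
    static FrontendSketch getInstance() {
        return INSTANCE;
    }

    // The preferred instance method keeps the camelCase name.
    String runQuery(String query) {
        return "ran:" + query;
    }

    // A static method named runQuery(String) would not compile next to the
    // instance method above, so the deprecated form gets a distinct name
    // and only forwards to the shared instance.
    @Deprecated
    static String runQueryStatic(String query) {
        return getInstance().runQuery(query);
    }

    public static void main(String[] args) {
        System.out.println(FrontendSketch.getInstance().runQuery("vms"));
    }
}
```

With a layout like this, existing callers of the static form only need
a mechanical rename, and the capitalized method name is gone.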

--Adam


Re: [Engine-devel] Java Newbie: Renaming some functions to fix findbugs warnings

2013-11-22 Thread Adam Litke
> Adam,
> 
> We are aware of this issue and we actually have a patch somewhat ready to
> solve the issue [1]. We made the RunQuery/RunAction/etc methods deprecated to
> encourage people to no longer use them. We have a patch ready to remove all
> current uses of RunQuery/RunAction/etc from the code base, but haven't gotten
> around to rebasing/merging the patch.
> 
> Alexander
> 
> [1] http://gerrit.ovirt.org/#/c/18413/

Thanks for the detail!  Looks like fixing this properly is far from a 
beginner's task.


[Engine-devel] UX: Display VM Downtime in the UI

2013-12-18 Thread Adam Litke

Hi UX developers,

My recent change: http://gerrit.ovirt.org/#/c/22429/ adds support for
tracking the time a VM was last stopped and presenting it in the REST
API.  I would also like to expose this information in the admin
portal.  This feature has been requested by end users and is useful
for managing lots of VMs which may not be used frequently.

My idea is to change the 'Uptime' column in the VMs tab to 'Uptime /
Downtime' or some equivalent and more compact phrasing.  If the VM is
Up, then last_start_time would be used to calculate uptime.  If the VM
is Down, then last_stop_time would be used to calculate downtime.
This helps to make efficient use of the column space.
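
As a rough illustration of the rule above, here is a minimal sketch in
plain Java.  The class VmTimeColumnSketch, the Status enum and the
"Nd Nh" text format are all assumptions made for the example; the real
column would live in the GWT frontend and use the engine's own types.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper; names and output format are illustrative only.
final class VmTimeColumnSketch {
    enum Status { UP, DOWN, OTHER }

    // Up: time since last start.  Down: time since last stop.
    // Any other state: leave the cell empty.
    static String render(Status status, Instant lastStart, Instant lastStop, Instant now) {
        switch (status) {
            case UP:
                return "Up " + format(Duration.between(lastStart, now));
            case DOWN:
                return "Down " + format(Duration.between(lastStop, now));
            default:
                return "";
        }
    }

    // Compact days/hours rendering, e.g. 51 hours becomes "2d 3h".
    static String format(Duration d) {
        long days = d.toDays();
        long hours = d.toHours() % 24;
        return days + "d " + hours + "h";
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(render(Status.UP, now.minus(Duration.ofHours(51)), null, now));
    }
}
```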

I am not sure how column sorting is being implemented, but if we
combine uptime and downtime into a single column we have an
opportunity to provide a really intuitive sort where the longest
uptime machines are at the top and the longest downtime machines are
at the bottom.  This could be accomplished by treating uptime as a
positive interval and downtime as a negative interval.
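
The signed-interval idea could look roughly like the sketch below.
The Row class and its fields are hypothetical; I have not checked how
the existing grid implements sorting, so this is plain Java rather
than frontend code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the signed-interval sort; not engine code.
final class UptimeSortSketch {
    static final class Row {
        final String name;
        final boolean up;
        final long secondsInState;

        Row(String name, boolean up, long secondsInState) {
            this.name = name;
            this.up = up;
            this.secondsInState = secondsInState;
        }

        // Uptime counts as a positive interval, downtime as a negative one.
        long sortKey() {
            return up ? secondsInState : -secondsInState;
        }
    }

    // Descending by key: longest uptime first, longest downtime last.
    static List<String> sortedNames(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort((a, b) -> Long.compare(b.sortKey(), a.sortKey()));
        List<String> names = new ArrayList<>();
        for (Row row : copy) {
            names.add(row.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", true, 100), new Row("b", false, 500),
                new Row("c", true, 900), new Row("d", false, 10));
        System.out.println(sortedNames(rows));
    }
}
```

A single numeric key means the column needs no special-case
comparator: recently stopped VMs land right below recently started
ones, in the middle of the list.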

Questions for you all:

- Do you support the idea of changing the Uptime column to include
 Downtime as well or would you prefer a new column instead?

- Is there a better heading for the new column or is 'Uptime /
 Downtime' good enough?

- How should we handle sorting?

Thanks for your input!


Re: [Engine-devel] UX: Display VM Downtime in the UI

2013-12-18 Thread Adam Litke

On 18/12/13 16:04 -0500, Malini Rao wrote:


- Original Message -

From: "Adam Litke" 
To: engine-devel@ovirt.org
Sent: Wednesday, December 18, 2013 9:42:59 AM
Subject: [Engine-devel] UX: Display VM Downtime in the UI

Hi UX developers,

My recent change: http://gerrit.ovirt.org/#/c/22429/ adds support for
tracking the time a VM was last stopped and presenting it in the REST
API.  I would also like to expose this information in the admin
portal.  This feature has been requested by end users and is useful
for managing lots of VMs which may not be used frequently.

My idea is to change the 'Uptime' column in the VMs tab to 'Uptime /
Downtime' or some equivalent and more compact phrasing.  If the VM is
Up, then last_start_time would be used to calculate uptime.  If the VM
is Down, then last_stop_time would be used to calculate downtime.
This helps to make efficient use of the column space.




Thanks for your comments!


MR: I like the idea in general but can we extend to other states as
well? Then we could have the col be called something like 'Time in


I would argue that 'Up' and 'Down' are the only persistent states
where a VM can linger for a user-controlled amount of time.  The
others (WaitForLaunch, PoweringDown, etc) are just transitions with
their own system defined timeouts.  Because of this, it really only
makes sense to denote uptime and downtime.  When the VM is in another
state, this column would be empty.


current state'. Also, I think since this col is so far from the first
column that has the status icon, we should have a tooltip on the
value that says ' Uptime' , 'down time' or ' time'.


Agree on the tooltip.



I am not sure how column sorting is being implemented, but if we
combine uptime and downtime into a single column we have an
opportunity to provide a really intuitive sort where the longest
uptime machines are at the top and the longest downtime machines
are at the bottom.  This could be accomplished by treating uptime
as a positive interval and downtime as a negative interval.


MR: That's an interesting idea. Not sure how that would translate if
we did all states and times. Then I would think you would do
descending order within each state but then we would have to fix a
sequence for the display of the various statuses based on the
statuses that matter most.


This is much simpler if you just work with Up and Down.



Questions for you all:

- Do you support the idea of changing the Uptime column to include
Downtime as well or would you prefer a new column instead?



MR: I do not like the idea of introducing new columns for this
purpose since at any given time, only one of the columns will be
populated. Another idea is to remove this column altogether and
include the time for the current status as a tooltip on the status
icon preceding the name.


What about adding the uptime/downtime to the status column itself?  I
don't necessarily think this will muddy the status much since there is
still an icon on the left.



Re: [Engine-devel] UX: Display VM Downtime in the UI

2013-12-19 Thread Adam Litke

On 19/12/13 03:08 -0500, Omer Frenkel wrote:



- Original Message -

From: "Adam Litke" 
To: "Malini Rao" 
Cc: engine-devel@ovirt.org
Sent: Wednesday, December 18, 2013 11:19:17 PM
Subject: Re: [Engine-devel] UX: Display VM Downtime in the UI

On 18/12/13 16:04 -0500, Malini Rao wrote:
>
>- Original Message -
>> From: "Adam Litke" 
>> To: engine-devel@ovirt.org
>> Sent: Wednesday, December 18, 2013 9:42:59 AM
>> Subject: [Engine-devel] UX: Display VM Downtime in the UI
>>
>> Hi UX developers,
>>
>> My recent change: http://gerrit.ovirt.org/#/c/22429/ adds support for
>> tracking the time a VM was last stopped and presenting it in the REST
>> API.  I would also like to expose this information in the admin
>> portal.  This feature has been requested by end users and is useful
>> for managing lots of VMs which may not be used frequently.
>>
>> My idea is to change the 'Uptime' column in the VMs tab to 'Uptime /
>> Downtime' or some equivalent and more compact phrasing.  If the VM is
>> Up, then last_start_time would be used to calculate uptime.  If the VM
>> is Down, then last_stop_time would be used to calculate downtime.
>> This helps to make efficient use of the column space.
>

Thanks for your comments!

>MR: I like the idea in general but can we extend to other states as
>well? Then we could have the col be called something like 'Time in

I would argue that 'Up' and 'Down' are the only persistent states
where a VM can linger for a user-controlled amount of time.  The
others (WaitForLaunch, PoweringDown, etc) are just transitions with
their own system defined timeouts.  Because of this, it really only
makes sense to denote uptime and downtime.  When the VM is in another
state, this column would be empty.



when do you think this would be empty?
the way i see it, if there is a qemu process running, we count 'up time' (as it
is today)
otherwise, it's down time (when vm is suspended/image locked it's down as well)
maybe only in 'unknown' state, when we don't have connection to the host,
and we don't know the state of the vm it can be empty.


Sure, makes sense and I agree.


>current state'. Also, I think since this col is so far from the first
>column that has the status icon, we should have a tooltip on the
>value that says ' Uptime' , 'down time' or ' time'.

Agree on the tooltip.

>>
>> I am not sure how column sorting is being implemented, but if we
>> combine uptime and downtime into a single column we have an
>> opportunity to provide a really intuitive sort where the longest
>> uptime machines are at the top and the longest downtime machines
>> are at the bottom.  This could be accomplished by treating uptime
>> as a positive interval and downtime as a negative interval.
>
>MR: That's an interesting idea. Not sure how that would translate if
>we did all states and times. Then I would think you would do
>descending order within each state but then we would have to fix a
>sequence for the display of the various statuses based on the
>statuses that matter most.

This is much simpler if you just work with Up and Down.

>>
>> Questions for you all:
>>
>> - Do you support the idea of changing the Uptime column to include
>> Downtime as well or would you prefer a new column instead?
>
>
>MR: I do not like the idea of introducing new columns for this
>purpose since at any given time, only one of the columns will be
>populated. Another idea is to remove this column altogether and
>include the time for the current status as a tooltip on the status
>icon preceding the name.

What about adding the uptime/downtime to the status column itself?  I
don't necessarily think this will muddy the status much since there is
still an icon on the left.



i like better the first option of one column with up/down time,
i think it's more clear to the user


Can you suggest a good concise column heading for it?



[Engine-devel] Engine on Fedora 20

2013-12-19 Thread Adam Litke

Has anyone had success running ovirt-engine on Fedora 20?  I upgraded
my system on Wednesday and thought everything was fine but then I
started getting the following error:

2013-12-19 14:53:31,447 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-5) Error in getting DB connection. The database is
inaccessible. Original exception is:
DataAccessResourceFailureException: Error retreiving database
metadata; nested exception is
org.springframework.jdbc.support.MetaDataAccessException: Could not
get Connection for extracting meta data; nested exception is
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not
get JDBC Connection; nested exception is java.sql.SQLException:
javax.resource.ResourceException: IJ000453: Unable to get managed
connection for java:/ENGINEDataSource

Has anyone encountered this recently?


Re: [Engine-devel] Engine on Fedora 20

2013-12-19 Thread Adam Litke

On 19/12/13 15:05 -0500, Adam Litke wrote:

Has anyone had success running ovirt-engine on Fedora 20?  I upgraded
my system on Wednesday and thought everything was fine but then I
started getting the following error:

2013-12-19 14:53:31,447 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-5) Error in getting DB connection. The database is
inaccessible. Original exception is:
DataAccessResourceFailureException: Error retreiving database
metadata; nested exception is
org.springframework.jdbc.support.MetaDataAccessException: Could not
get Connection for extracting meta data; nested exception is
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not
get JDBC Connection; nested exception is java.sql.SQLException:
javax.resource.ResourceException: IJ000453: Unable to get managed
connection for java:/ENGINEDataSource

Has anyone encountered this recently?


Thanks to alonb for his help on IRC.  As it turns out, I had a poorly
configured pg_hba.conf file that only started causing problems on F20.
To fix I replaced my contents with the following two lines:

host    engine  engine  0.0.0.0/0   md5
host    engine  engine  ::0/0       md5

Otherwise, it seems to be working fine.


Re: [Engine-devel] Engine on Fedora 20

2013-12-19 Thread Adam Litke

On 19/12/13 15:43 -0500, Itamar Heim wrote:

On 12/19/2013 03:22 PM, Adam Litke wrote:

On 19/12/13 15:05 -0500, Adam Litke wrote:

Has anyone had success running ovirt-engine on Fedora 20?  I upgraded
my system on Wednesday and thought everything was fine but then I
started getting the following error:

2013-12-19 14:53:31,447 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-5) Error in getting DB connection. The database is
inaccessible. Original exception is:
DataAccessResourceFailureException: Error retreiving database
metadata; nested exception is
org.springframework.jdbc.support.MetaDataAccessException: Could not
get Connection for extracting meta data; nested exception is
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not
get JDBC Connection; nested exception is java.sql.SQLException:
javax.resource.ResourceException: IJ000453: Unable to get managed
connection for java:/ENGINEDataSource

Has anyone encountered this recently?


Thanks to alonb for his help on IRC.  As it turns out, I had a poorly
configured pg_hba.conf file that only started causing problems on F20.
To fix I replaced my contents with the following two lines:

host    engine  engine  0.0.0.0/0   md5
host    engine  engine  ::0/0       md5

Otherwise, it seems to be working fine.


is this with the wildfly jboss version?


Actually it's with the manually installed jboss-as-7.1.1.Final.tar.gz
that I put in place using the older development environment setup
instructions.



Re: [Engine-devel] ovirt-engine build segfault on Fedora 20

2014-01-03 Thread Adam Litke

On 02/01/14 21:53 -0500, Greg Sheremeta wrote:

Caution on upgrading your dev machine to Fedora 20. GWT compilation of "safari" 
(for Chrome) causes a segfault during the build. Strangely, the build appears to work, so 
I'm not sure what the net effect of the segfault is.

If you only compile for gecko (Firefox) [the default], you won't see the 
segfault.

In other words,
make clean install-dev PREFIX=$HOME/ovirt-engine \
    DEV_EXTRA_BUILD_FLAGS_GWT_DEFAULTS="-Dgwt.userAgent=gecko1_8,safari"
causes the segfault

But
make install-dev PREFIX="$HOME/ovirt-engine"
works just fine.

I've duplicated this with both OpenJDK and Oracle JDK.


I can confirm this on my F20 system with OpenJDK as well.  So far I
have not observed any problems with the resulting build.



[Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

Hi all,

I am working with the latest ovirt-engine git and am finding some
strange behavior with the UI.  The list of VMs never populates and I
am stuck with the loading indicator.  All other tabs behave normally
(Hosts, Templates, Storage, etc).  Also, the list of VMs can be loaded
normally using the REST API.  Any ideas what may be causing this
strange behavior?


Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 11:19 -0500, Alexander Wels wrote:

Adam,

Is this just when you first login into the webadmin or whenever you go to the
VM tab? In other words if you login, then switch to the templates tab and back
again to the VM tab does it still not load? What about when you manually
refresh the grid?


Thanks for the quick response!  It doesn't load at all -- first time
or any other time when revisiting.  In some cases in the past I would
have luck by clicking the blue refresh icon but that doesn't help
either.  I have force refreshed the browser (Chrome) to no avail.  I
guess the next step is to completely restart the browser (hmm, no luck
there either).



Alexander

On Monday, January 06, 2014 11:02:02 AM Adam Litke wrote:

Hi all,

I am working with the latest ovirt-engine git and am finding some
strange behavior with the UI.  The list of VMs never populates and I
am stuck with the loading indicator.  All other tabs behave normally
(Hosts, Templates, Storage, etc).  Also, the list of VMs can be loaded
normally using the REST API.  Any ideas what may be causing this
strange behavior?





Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 11:41 -0500, Alexander Wels wrote:

On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote:

On 06/01/14 11:19 -0500, Alexander Wels wrote:
>Adam,
>
>Is this just when you first login into the webadmin or whenever you go to
>the VM tab? In other words if you login, then switch to the templates tab
>and back again to the VM tab does it still not load? What about when you
>manually refresh the grid?

Thanks for the quick response!  It doesn't load at all -- first time
or any other time when revisiting.  In some cases in the past I would
have luck by clicking the blue refresh icon but that doesn't help
either.  I have force refreshed the browser (Chrome) to no avail.  I
guess the next step is to completely restart the browser (hmm, no luck
there either).



Okay, then something else is going on, are there any errors in the server log?



From server.log there are no ERRORs but this message may be related:


2014-01-06 13:01:54,209 WARN
[org.jboss.resteasy.spi.ResteasyDeployment] (http--0.0.0.0-8080-4)
Application.getSingletons() returned unknown class type:
org.ovirt.engine.api.restapi.util.VmHelper




>Alexander
>
>On Monday, January 06, 2014 11:02:02 AM Adam Litke wrote:
>> Hi all,
>>
>> I am working with the latest ovirt-engine git and am finding some
>> strange behavior with the UI.  The list of VMs never populates and I
>> am stuck with the loading indicator.  All other tabs behave normally
>> (Hosts, Templates, Storage, etc).  Also, the list of VMs can be loaded
>> normally using the REST API.  Any ideas what may be causing this
>> strange behavior?





Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 11:44 -0500, Einav Cohen wrote:

- Original Message -
From: "Alexander Wels" 
Sent: Monday, January 6, 2014 11:41:38 AM

On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote:
> On 06/01/14 11:19 -0500, Alexander Wels wrote:
> >Adam,
> >
> >Is this just when you first login into the webadmin or whenever you go to
> >the VM tab? In other words if you login, then switch to the templates tab
> >and back again to the VM tab does it still not load? What about when you
> >manually refresh the grid?
>
> Thanks for the quick response!  It doesn't load at all -- first time
> or any other time when revisiting.  In some cases in the past I would
> have luck by clicking the blue refresh icon but that doesn't help
> either.  I have force refreshed the browser (Chrome) to no avail.  I
> guess the next step is to completely restart the browser (hmm, no luck
> there either).
>

Okay, then something else is going on, are there any errors in the server
log?


In addition to server logs: maybe also provide client "logs" (see instructions
in [1])?
thanks.

[1] http://lists.ovirt.org/pipermail/users/2013-December/018494.html


GET http://localhost:8080/ovirt-engine/webadmin/Reports.xml 404 (Not Found) 
4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16328
Mon Jan 06 13:05:00 GMT-500 2014 com.google.gwt.logging.client.LogConfiguration
SEVERE: (TypeError) 
stack: TypeError: Cannot call method 'kk' of null

   at LSj 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11622:58)
   at Object.JTl [as h_] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17112:15349)
   at l7j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15200:166)
   at Object.n7j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16142:328)
   at r2j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15199:140)
   at rjk 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:7040:19)
   at Object._jk [as qT] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17088:17294)
   at Object.r5j [as tV] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17085:15904)
   at hIj 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15873:85)
   at Object.kIj [as tV] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:510)
   at uKj 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11859:40)
   at Object.xKj [as tV] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:20018)
   at OJj 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15471:172)
   at Object.RJj [as Ch] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:19443)
   at Object.jAd [as ue] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17019:23272)
   at cR 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:14512:137)
   at Object.vR 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17019:13248)
   at XMLHttpRequest. 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11884:65)
   at _q 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:8351:29)
   at cr 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15114:57)
   at XMLHttpRequest. 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:12521:45):
 Cannot call method 'kk' of null
com.google.gwt.core.client.JavaScriptException: (TypeError) 
stack: TypeError: Cannot call method 'kk' of null

   at LSj 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11622:58)
   at Object.JTl [as h_] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17112:15349)
   at l7j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15200:166)
   at Object.n7j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16142:328)
   at r2j 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15199:140)
   at rjk 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:7040:19)
   at Object._jk [as qT] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17088:17294)
   at Obje

Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 13:12 -0500, Alexander Wels wrote:

On Monday, January 06, 2014 01:03:31 PM Adam Litke wrote:

On 06/01/14 11:41 -0500, Alexander Wels wrote:
>On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote:
>> On 06/01/14 11:19 -0500, Alexander Wels wrote:
>> >Adam,
>> >
>> >Is this just when you first login into the webadmin or whenever you go
>> >to
>> >the VM tab? In other words if you login, then switch to the templates
>> >tab
>> >and back again to the VM tab does it still not load? What about when you
>> >manually refresh the grid?
>>
>> Thanks for the quick response!  It doesn't load at all -- first time
>> or any other time when revisiting.  In some cases in the past I would
>> have luck by clicking the blue refresh icon but that doesn't help
>> either.  I have force refreshed the browser (Chrome) to no avail.  I
>> guess the next step is to completely restart the browser (hmm, no luck
>> there either).
>
>Okay, then something else is going on, are there any errors in the server
>log?
From server.log there are no ERRORs but this message may be related:

2014-01-06 13:01:54,209 WARN
[org.jboss.resteasy.spi.ResteasyDeployment] (http--0.0.0.0-8080-4)
Application.getSingletons() returned unknown class type:
org.ovirt.engine.api.restapi.util.VmHelper



Don't think that is related, as currently the web admin uses GWT RPC to
communicate with the engine, and not the REST interface.

So, ovirt-engine/var/log/ovirt-engine/server.log and
ovirt-engine/var/log/ovirt-engine/engine.log have nothing in them?



From engine.log:


2014-01-06 13:10:34,428 WARN
[org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil]
(org.ovirt.thread.pool-6-thread-50) Executing a command:
java.util.concurrent.FutureTask , but note that there are 0 tasks in
the queue.

This repeats quite regularly...  Other than that, nothing looks
relevant.






>> >Alexander
>> >
>> >On Monday, January 06, 2014 11:02:02 AM Adam Litke wrote:
>> >> Hi all,
>> >>
>> >> I am working with the latest ovirt-engine git and am finding some
>> >> strange behavior with the UI.  The list of VMs never populates and I
>> >> am stuck with the loading indicator.  All other tabs behave normally
>> >> (Hosts, Templates, Storage, etc).  Also, the list of VMs can be loaded
>> >> normally using the REST API.  Any ideas what may be causing this
>> >> strange behavior?





Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 13:30 -0500, Alexander Wels wrote:

Yes either compile in PRETTY mode or run in GWT debug mode. Depending on how
comfortable you are with doing either one.


Ok I think we're getting somewhere... When compiled in draft mode the client
errors look like this:

GET http://localhost:8080/ovirt-engine/webadmin/Reports.xml 404 (Not Found) 
C5287D41B71197763AB3125431813688.cache.html:44792
Mon Jan 06 14:08:21 GMT-500 2014 com.google.gwt.logging.client.LogConfiguration
SEVERE: (TypeError) 
stack: TypeError: Cannot call method 'get__Ljava_lang_Object_2Ljava_lang_Object_2' of null

   at 
org_ovirt_engine_ui_uicommonweb_dataprovider_AsyncDataProvider_getDisplayTypes__ILorg_ovirt_engine_core_compat_Version_2Ljava_util_List_2
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:180644:360)
   at 
org_ovirt_engine_ui_uicommonweb_dataprovider_AsyncDataProvider_hasSpiceSupport__ILorg_ovirt_engine_core_compat_Version_2Z
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:181442:10)
   at 
Object.org_ovirt_engine_ui_uicommonweb_models_vms_SpiceConsoleModel_canBeSelected__Z
 [as canBeSelected__Z] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:248240:199)
   at 
org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_$canSelectProtocol__Lorg_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_2Lorg_ovirt_engine_ui_uicommonweb_models_ConsoleProtocol_2Z
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187341:282)
   at 
org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_$setDefaultSelectedProtocol__Lorg_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187391:9)
   at 
Object.org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_VmConsolesImpl__Lorg_ovirt_engine_core_common_businessentities_VM_2Lorg_ovirt_engine_ui_uicommonweb_models_Model_2Lorg_ovirt_engine_ui_uicommonweb_ConsoleOptionsFrontendPersister$ConsoleContext_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187407:3)
   at 
org_ovirt_engine_ui_uicommonweb_models_ConsoleModelsCache_$updateCache__Lorg_ovirt_engine_ui_uicommonweb_models_ConsoleModelsCache_2Ljava_lang_Iterable_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:185252:1037)
   at 
org_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_$setItems__Lorg_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_2Ljava_lang_Iterable_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:194985:3)
   at 
Object.org_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_setItems__Ljava_lang_Iterable_2V
 [as setItems__Ljava_lang_Iterable_2V] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:195275:3)
   at 
Object.org_ovirt_engine_ui_uicommonweb_models_SearchableListModel$2_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V
 [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:186570:23)
   at 
org_ovirt_engine_ui_frontend_Frontend$1_$onSuccess__Lorg_ovirt_engine_ui_frontend_Frontend$1_2Lorg_ovirt_engine_ui_frontend_communication_VdcOperation_2Lorg_ovirt_engine_core_common_queries_VdcQueryReturnValue_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:168839:1451)
   at 
Object.org_ovirt_engine_ui_frontend_Frontend$1_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V
 [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:168871:3)
   at 
org_ovirt_engine_ui_frontend_communication_OperationProcessor$2_$onSuccess__Lorg_ovirt_engine_ui_frontend_communication_OperationProcessor$2_2Lorg_ovirt_engine_ui_frontend_communication_VdcOperation_2Ljava_lang_Object_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:173172:217)
   at 
Object.org_ovirt_engine_ui_frontend_communication_OperationProcessor$2_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V
 [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:173190:3)
   at 
org_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_$onSuccess__Lorg_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_2Ljava_util_ArrayList_2V
 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:172948:675)
   at 
Object.org_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_onSuccess__Ljava_lang_Object_2V
 [as onSuccess__Ljava_lang_Object_2V] 
(http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:172970:3

Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 14:32 -0500, Daniel Erez wrote:



- Original Message -

From: "Adam Litke" 
To: "Alexander Wels" 
Cc: engine-devel@ovirt.org
Sent: Monday, January 6, 2014 9:11:48 PM
Subject: Re: [Engine-devel] UI: VM list not populating



Might be an issue of a stale osinfo properties file,
'displayProtocols' has recently been introduced by [1]

Try overwriting osinfo-defaults.properties with the updated one from latest bits
/ovirt-engine/packaging/conf/osinfo-defaults.properties --> 
$HOME/ovirt-engine/share/ovirt-engine/conf

[1] 
http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties


Thanks for the suggestion but it did not seem to resolve the issue.
Also, my properties file has os.other.displayProtocols.value and 
os.other.spiceSupport.value.  This seems different from [1] above
which indicates that the spiceSupport key is removed entirely.

___
Engine-devel mailing list
Engine-devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel


Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 15:31 -0500, Daniel Erez wrote:



- Original Message -

From: "Adam Litke" 
To: "Daniel Erez" 
Cc: "Alexander Wels" , engine-devel@ovirt.org
Sent: Monday, January 6, 2014 9:51:57 PM
Subject: Re: [Engine-devel] UI: VM list not populating

On 06/01/14 14:32 -0500, Daniel Erez wrote:
>
>
>- Original Message -
>> From: "Adam Litke" 
>> To: "Alexander Wels" 
>> Cc: engine-devel@ovirt.org
>> Sent: Monday, January 6, 2014 9:11:48 PM
>> Subject: Re: [Engine-devel] UI: VM list not populating
>>
>
>Might be an issue of a stale osinfo properties file,
>'displayProtocols' has recently been introduced by [1]
>
>Try overwriting osinfo-defaults.properties with the updated one from latest
>bits
>/ovirt-engine/packaging/conf/osinfo-defaults.properties -->
>$HOME/ovirt-engine/share/ovirt-engine/conf
>
>[1]
>http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties

Thanks for the suggestion but it did not seem to resolve the issue.
Also, my properties file has os.other.displayProtocols.value and
os.other.spiceSupport.value.  This seems different from [1] above
which indicates that the spiceSupport key is removed entirely.


Actually spiceSupport key was added a bit later by:
http://gerrit.ovirt.org/#/c/18220/17/packaging/conf/osinfo-defaults.properties

Can you please check if VMs list is displayed correctly from the userportal?
(I just wonder if there's some race in 'initCache/initDisplayTypes' mechanism).


Does not work in the User Portal either.  I don't know if this is
related, but I have started to observe some new errors in server.log.
I wonder if I have done too much rebasing and schema upgrading on my
local DB:

2014-01-06 15:39:20,451 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] 
(DefaultQuartzScheduler_Worker-31) Failed to refresh VDS , vds = 
203848b8-1d84-4c01-a267-c11280d0ad0f : lager, error = 
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad 
SQL grammar [select * from  getinterface_viewbyvds_id(?, ?, ?)]; nested 
exception is org.postgresql.util.PSQLException: The column name qos_overridden 
was not found in this ResultSet., continuing.: 
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad 
SQL grammar [select * from  getinterface_viewbyvds_id(?, ?, ?)]; nested 
exception is org.postgresql.util.PSQLException: The column name qos_overridden 
was not found in this ResultSet.
at 
org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:98)
 [spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
 [spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
 [spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
 [spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:603) 
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637) 
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666) 
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:706) 
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:154)
 [dal.jar:]
at 
org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:120)
 [dal.jar:]
at 
org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181)
 [spring-jdbc.jar:3.1.1.RELEASE]
at 
org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:137)
 [dal.jar:]
at 
org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:103)
 [dal.jar:]
at 
org.ovirt.engine.core.dao.network.InterfaceDaoDbFacadeImpl.getAllInterfacesForVds(InterfaceDaoDbFacadeImpl.java:167)
 [dal.jar:]
at 
org.ovirt.engine.core.dao.network.InterfaceDaoDbFacadeImpl.getAllInterfacesForVds(InterfaceDaoDbFacadeImpl.java:150)
 [dal.jar:]
at 
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder.updateNetworkData(VdsBrokerObjectsBuilder.java:930)
 [vdsbroker.jar:]
at 
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder.updateVDSDynamicData(VdsBrokerObjectsBuilder.j

Re: [Engine-devel] UI: VM list not populating

2014-01-06 Thread Adam Litke

On 06/01/14 15:56 -0500, Daniel Erez wrote:



- Original Message -

From: "Adam Litke" 
To: "Daniel Erez" 
Cc: "Alexander Wels" , engine-devel@ovirt.org
Sent: Monday, January 6, 2014 10:42:08 PM
Subject: Re: [Engine-devel] UI: VM list not populating

On 06/01/14 15:31 -0500, Daniel Erez wrote:
>
>
>- Original Message -
>> From: "Adam Litke" 
>> To: "Daniel Erez" 
>> Cc: "Alexander Wels" , engine-devel@ovirt.org
>> Sent: Monday, January 6, 2014 9:51:57 PM
>> Subject: Re: [Engine-devel] UI: VM list not populating
>>
>> On 06/01/14 14:32 -0500, Daniel Erez wrote:
>> >
>> >
>> >- Original Message -
>> >> From: "Adam Litke" 
>> >> To: "Alexander Wels" 
>> >> Cc: engine-devel@ovirt.org
>> >> Sent: Monday, January 6, 2014 9:11:48 PM
>> >> Subject: Re: [Engine-devel] UI: VM list not populating
>> >>
>> >
>> >Might be an issue of a stale osinfo properties file,
>> >'displayProtocols' has recently been introduced by [1]
>> >
>> >Try overwriting osinfo-defaults.properties with the updated one from
>> >latest
>> >bits
>> >/ovirt-engine/packaging/conf/osinfo-defaults.properties -->
>> >$HOME/ovirt-engine/share/ovirt-engine/conf
>> >
>> >[1]
>> >http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties
>>
>> Thanks for the suggestion but it did not seem to resolve the issue.
>> Also, my properties file has os.other.displayProtocols.value and
>> os.other.spiceSupport.value.  This seems different from [1] above
>> which indicates that the spiceSupport key is removed entirely.
>
>Actually spiceSupport key was added a bit later by:
>http://gerrit.ovirt.org/#/c/18220/17/packaging/conf/osinfo-defaults.properties
>
>Can you please check if VMs list is displayed correctly from the userportal?
>(I just wonder if there's some race in 'initCache/initDisplayTypes'
>mechanism).

Does not work in the User Portal either.  I don't know if this is
related, but I have started to observe some new errors in server.log.
I wonder if I have done too much rebasing and schema upgrading on my
local DB:


Yeah, looks like the DB needs upgrading...
(if you don't have any important data you can just try creating a new one).
Regarding the user portal, I'm guessing you don't see any VMs as you have
to assign permissions to them first from the webadmin.
Can you try creating some new VMs from the user portal, to see if the list
is displayed correctly. Also, look whether you get a similar error in
the engine log file as the webadmin.


New VMs created in the admin portal and user portal do not show up in
the list.  I just see the animated boxes indicating that the data is
loading.  The same error appears in the engine.log.  I will try to
blow away the data and start over.





2014-01-06 15:39:20,451 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager]
(DefaultQuartzScheduler_Worker-31) Failed to refresh VDS , vds =
203848b8-1d84-4c01-a267-c11280d0ad0f : lager, error =
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback;
bad SQL grammar [select * from  getinterface_viewbyvds_id(?, ?, ?)]; nested
exception is org.postgresql.util.PSQLException: The column name
qos_overridden was not found in this ResultSet., continuing.:
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback;
bad SQL grammar [select * from  getinterface_viewbyvds_id(?, ?, ?)]; nested
exception is org.postgresql.util.PSQLException: The column name
qos_overridden was not found in this ResultSet.
at
org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:98)
[spring-jdbc.jar:3.1.1.RELEASE]
at
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
[spring-jdbc.jar:3.1.1.RELEASE]
at
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
[spring-jdbc.jar:3.1.1.RELEASE]
at
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:603)
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637)
[spring-jdbc.jar:3.1.1.RELEASE]
at 
org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666)
[spri

Re: [Engine-devel] oVirt 3.4.0 alpha repository closure failure

2014-01-10 Thread Adam Litke

On 10/01/14 10:01 +, Dan Kenigsberg wrote:

On Fri, Jan 10, 2014 at 08:48:52AM +0100, Sandro Bonazzola wrote:

Hi,
oVirt 3.4.0 alpha repository has been composed but alpha has not been announced 
due to repository closure failures:

on CentOS 6.5:

# repoclosure -r ovirt-3.4.0-alpha -l ovirt-3.3.2 -l base -l epel -l 
glusterfs-epel -l updates -l extra -l glusterfs-noarch-epel -l ovirt-stable -n
Reading in repository metadata - please wait
Checking Dependencies
Repos looked at: 8
   base
   epel
   glusterfs-epel
   glusterfs-noarch-epel
   ovirt-3.3.2
   ovirt-3.4.0-alpha
   ovirt-stable
   updates
Num Packages in Repos: 16581
package: mom-0.3.2-20140101.git2691f25.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
 procps-ng


Adam, this seems like a real bug in http://gerrit.ovirt.org/#/c/22087/ :
el6 still carries the older "procps" (which is, btw, provided by
procps-ng).


Done.
http://gerrit.ovirt.org/23137





package: vdsm-hook-vhostmd-4.14.0-1.git6fdd55f.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
 vhostmd


Douglas, could you add a with_vhostmd option to the spec, and have it
default to 0 on el*, and to 1 on fedoras?

Thanks,
Dan.



Re: [Engine-devel] Gerrit NEW Change Screen

2014-01-16 Thread Adam Litke

On 16/01/14 00:08 +0200, Itamar Heim wrote:

with gerrit 2.8, there is a new change screen.
it's not enabled by default (yet); please try it and see what you think.

to enable, go to settings (click the top-right arrow next to your 
name, and choose settings).

select preferences and set "Change View:" to "New Screen".


Thanks Itamar.  For me, this new screen is much better.  Things are
more compact and for most patches all of the important information
fits easily on one screen.  I also like the easy to find gitweb link
and the ability to edit the commit message right from the interface.


Re: [Engine-devel] Copy reviewer scores on trivial rebase/commit msg changes

2014-01-21 Thread Adam Litke

On 18/01/14 01:48 +0200, Itamar Heim wrote:

I'd like to enable these - comments welcome:

1. label.Label-Name.copyAllScoresOnTrivialRebase

If true, all scores for the label are copied forward when a new patch 
set is uploaded that is a trivial rebase. A new patch set is 
considered as trivial rebase if the commit message is the same as in 
the previous patch set and if it has the same code delta as the 
previous patch set. This is the case if the change was rebased onto a 
different parent. This can be used to enable sticky approvals, 
reducing turn-around for trivial rebases prior to submitting a change. 
Defaults to false.



2. label.Label-Name.copyAllScoresIfNoCodeChange

If true, all scores for the label are copied forward when a new patch 
set is uploaded that has the same parent commit as the previous patch 
set and the same code delta as the previous patch set. This means only 
the commit message is different. This can be used to enable sticky 
approvals on labels that only depend on the code, reducing turn-around 
if only the commit message is changed prior to submitting a change. 
Defaults to false.


I am a bit late to the party but +1 from me for trying both.  I expect
it will be quite rare for something bad to happen here; rare enough
that the time saved on all the other patches will far outweigh the
time lost fixing the corner cases.
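
To make the two copy conditions quoted above concrete, here is a minimal
Python sketch that classifies a new patch set against the previous one.
It is only an illustration of the quoted definitions, not Gerrit's
implementation: PatchSet, its fields, classify_update and the returned
strings are all hypothetical names, and the "NO_CHANGE" case is an
assumption that the quoted text does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchSet:
    """Hypothetical, simplified view of one patch set of a change."""
    parent: str      # SHA of the parent commit
    message: str     # full commit message
    code_delta: str  # any stable fingerprint of the diff against the parent


def classify_update(old: PatchSet, new: PatchSet) -> str:
    """Classify `new` relative to `old` using the two rules quoted above."""
    same_parent = old.parent == new.parent
    same_message = old.message == new.message
    same_delta = old.code_delta == new.code_delta

    if same_parent and same_message and same_delta:
        # Assumption: an identical re-upload; not described in the quoted text.
        return "NO_CHANGE"
    if same_delta and same_message and not same_parent:
        # Rule 1: same message, same delta, different parent.
        return "TRIVIAL_REBASE"
    if same_delta and same_parent and not same_message:
        # Rule 2: same parent, same delta, only the message differs.
        return "NO_CODE_CHANGE"
    # Anything else needs a fresh review; scores would not be copied.
    return "REWORK"
```

With both options enabled, scores would be carried forward for the
"TRIVIAL_REBASE" and "NO_CODE_CHANGE" results and dropped for "REWORK".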


[Engine-devel] mom RPMs for 3.4

2014-01-30 Thread Adam Litke

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?

[1] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/
[2] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=centos6-host/artifact/exported-artifacts/mom-0.4.0-1.el6.noarch.rpm
[3] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora19-host/artifact/exported-artifacts/mom-0.4.0-1.fc19.noarch.rpm
[4] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora20-host/artifact/exported-artifacts/mom-0.4.0-1.fc20.noarch.rpm


Re: [Engine-devel] mom RPMs for 3.4

2014-01-30 Thread Adam Litke

On 30/01/14 18:13 +, Dan Kenigsberg wrote:

On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?


Hey, we're during beta. I prefer making this requirement explicit now
over having users with supervdsmd.log rotate due to log spam.


In that case, Sandro, can you let me know when those RPMs hit the
ovirt repos (for master and 3.4) and then I will submit a patch to
vdsm to require the new version.


Re: [Engine-devel] mom RPMs for 3.4

2014-01-31 Thread Adam Litke

On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:

Il 30/01/2014 19:30, Adam Litke ha scritto:

On 30/01/14 18:13 +, Dan Kenigsberg wrote:

On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?


Hey, we're during beta. I prefer making this requirement explicit now
over having users with supervdsmd.log rotate due to log spam.


In that case, Sandro, can you let me know when those RPMs hit the
ovirt repos (for master and 3.4) and then I will submit a patch to
vdsm to require the new version.



mom 0.4.0 has been built in last night nightly job [1] and published to nightly 
by publisher job [2]
so it's already available on nightly [3]

For 3.4.0, it has been planned [4] a beta 2 release on 2014-02-06 so we'll 
include your builds in that release.


I presume the scripting for 3.4 release rpms will produce a version
without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit
unique to mom.  MOM is used by non-oVirt users who install it from the
main Fedora repository.  I think it's fine that we are producing our
own rpms in oVirt (that may have additional patches applied and may
resync to upstream mom code more frequently than would be desired for
the main Fedora repository).  Given this, I think it makes sense to
tag the oVirt RPMs with a special version suffix to indicate that
these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals?  One other thing
I'd like to have the option of doing is to make vdsm depend on an
ovirt distribution of mom so that the upstream Fedora version will not
satisfy the dependency for vdsm.

Thoughts?


Re: [Engine-devel] mom RPMs for 3.4

2014-02-03 Thread Adam Litke

On 01/02/14 22:48 +, Dan Kenigsberg wrote:

On Fri, Jan 31, 2014 at 04:56:12PM -0500, Adam Litke wrote:

On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:
>Il 30/01/2014 19:30, Adam Litke ha scritto:
>>On 30/01/14 18:13 +, Dan Kenigsberg wrote:
>>>On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:
>>>>Hi Sandro,
>>>>
>>>>After updating the MOM project's build system, I have used jenkins to
>>>>produce a set of RPMs that I would like to tag into the oVirt 3.4
>>>>release.  Please see the jenkins job [1] for the relevant artifacts
>>>>for EL6[2], F19[3], and F20[4].
>>>>
>>>>Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0?
>>>>I want to be careful to not break people's environments this late in
>>>>the 3.4 release cycle.  What is the best way to minimize that damage?
>>>
>>>Hey, we're during beta. I prefer making this requirement explicit now
>>>over having users with supervdsmd.log rotate due to log spam.
>>
>>In that case, Sandro, can you let me know when those RPMs hit the
>>ovirt repos (for master and 3.4) and then I will submit a patch to
>>vdsm to require the new version.
>
>
>mom 0.4.0 has been built in last night nightly job [1] and published to
>nightly by publisher job [2]
>so it's already available on nightly [3]
>
>For 3.4.0, it has been planned [4] a beta 2 release on 2014-02-06 so we'll
>include your builds in that release.

I presume the scripting for 3.4 release rpms will produce a version
without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit
unique to mom.  MOM is used by non-oVirt users who install it from the
main Fedora repository.  I think it's fine that we are producing our
own rpms in oVirt (that may have additional patches applied and may
resync to upstream mom code more frequently than would be desired for
the main Fedora repository).  Given this, I think it makes sense to
tag the oVirt RPMs with a special version suffix to indicate that
these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals?  One other thing
I'd like to have the option of doing is to make vdsm depend on an
ovirt distribution of mom so that the upstream Fedora version will not
satisfy the dependency for vdsm.


What is the motivation for this? You would not like to bother Fedora
users with updates that are required only for oVirt?


Yes, that was my thinking.  It seems that oVirt requires updates more
frequently than users who run MOM with libvirt directly, and the
Fedora update process is a bit heavier than oVirt's at the moment.


Vdsm itself is built, signed, and distributed via Fedora.
It is also copied into the ovirt repo, for completeness sake. Could MoM
do the same?


If vdsm is finding this to work well then surely I can do the same
with MOM.  The 0.4.0 build is in updates-testing right now and should
be able to be tagged stable in a day or two.



[Engine-devel] Asynchronous tasks for live merge

2014-02-28 Thread Adam Litke

Hi all,

As part of our plan to support live merging of VM disk snapshots it
seems we will need a new form of asynchronous task in ovirt-engine.  I
am aware of AsyncTaskManager but it seems to be limited to managing
SPM tasks.  For live merge, we are going to need something called
VmTasks since the async command can be run only on the host that
currently runs the VM.

The way I see this working from an engine perspective is:
1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is
  found to be up, we activate an alternative live merge flow.
2. We submit a LiveMerge VDS Command for each impacted disk.  This is
  an asynchronous command which we need to monitor for completion.
3. A VmJob is inserted into the DB so we'll remember to handle it.
4. The VDS Broker monitors the operation via an extension to the
  already collected VmStatistics data.  Vdsm will report active Block
  Jobs only.  Once the job stops (in error or success) it will cease
  to be reported by vdsm and engine will know to proceed.
5. When the job has completed, VDS Broker raises an event up to bll.
  Maybe this could be done via VmJobDAO on the stored VmJob?
6. Bll receives the event and issues a series of VDS commands to
  complete the operation:
  a) Verify the new image chain matches our expectations (the snap is
 no longer present in the chain).
  b) Delete the snapshot volume
  c) Remove the VmJob from the DB

Could you guys review this proposed flow for sanity?  The main
conceptual gaps I am left with concern #5 and #6.  What is the
appropriate way for VDSBroker to communicate with BLL?  Is there an
event mechanism I can explore or should I use the database?  I am
leaning toward the database because it is persistent and will ensure
#6 gets completed even if engine is restarted somewhere in the middle.
For #6, is there an existing polling / event loop in bll that I can
plug into?
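
Steps 3 to 6 above can be sketched roughly as follows, in Python for
brevity.  This is only a sketch under stated assumptions: VmJobTracker,
its methods and the on_finished callback are hypothetical names, and the
in-memory dict stands in for the VmJob table that the proposal would
keep in the database.

```python
from typing import Callable, Dict, Iterable, List


class VmJobTracker:
    """Hypothetical in-memory stand-in for the persisted VmJob records."""

    def __init__(self, on_finished: Callable[[str, str], None]) -> None:
        self._jobs: Dict[str, str] = {}  # job_id -> vm_id
        self._on_finished = on_finished  # stands in for the "event up to bll"

    def add(self, job_id: str, vm_id: str) -> None:
        """Step 3: remember a submitted job so it is handled later."""
        self._jobs[job_id] = vm_id

    def remove(self, job_id: str) -> None:
        """Step 6c: forget a job once the follow-up work has completed."""
        self._jobs.pop(job_id, None)

    def pending(self, vm_id: str) -> List[str]:
        """Return the ids of jobs still tracked for the given VM."""
        return [j for j, v in self._jobs.items() if v == vm_id]

    def reconcile(self, vm_id: str, reported: Iterable[str]) -> List[str]:
        """Steps 4 and 5: a tracked job missing from the stats has stopped."""
        active = set(reported)
        finished = [j for j in self.pending(vm_id) if j not in active]
        for job_id in finished:
            # The callback would verify the image chain, delete the
            # snapshot volume, and only then call remove().
            self._on_finished(vm_id, job_id)
        return finished
```

Because a job stays tracked until remove() is called, a restart between
steps 5 and 6 would simply cause the same job to be reported as finished
again on the next reconcile pass.  One caveat of this sketch is that a
job which has been recorded but has not yet shown up in the stats would
also look finished, so a real implementation would need to guard against
that window.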

Thanks in advance for taking the time to think about this flow and for
providing your insights!

--
Adam Litke


Re: [Engine-devel] Asynchronous tasks for live merge

2014-03-03 Thread Adam Litke

On 03/03/14 14:28 +, Dan Kenigsberg wrote:

On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote:

Hi all,

As part of our plan to support live merging of VM disk snapshots it
seems we will need a new form of asynchronous task in ovirt-engine.  I
am aware of AsyncTaskManager but it seems to be limited to managing
SPM tasks.  For live merge, we are going to need something called
VmTasks since the async command can be run only on the host that
currently runs the VM.

The way I see this working from an engine perspective is:
1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is
  found to be up, we activate an alternative live merge flow.
2. We submit a LiveMerge VDS Command for each impacted disk.  This is
  an asynchronous command which we need to monitor for completion.
3. A VmJob is inserted into the DB so we'll remember to handle it.
4. The VDS Broker monitors the operation via an extension to the
  already collected VmStatistics data.  Vdsm will report active Block
  Jobs only.  Once the job stops (in error or success) it will cease
  to be reported by vdsm and engine will know to proceed.


You describe a reasonable way for Vdsm to report whether an async
operation has finished. However, may we instead use the opportunity to
introduce generic "hsm" tasks?


Sure, I am happy to have that conversation :)  If I understand
correctly, HSM tasks, while ideal, might be too complex to get right
and would block the Live Merge feature for longer than we would like.
Has anyone looked into what it would take to implement a HSM Tasks
framework like this in vdsm?  Are there any WIP implementations?  If
the scope of this is not too big, it can be completed relatively
quickly, and the resulting implementation would cover all known use
cases, then this could be worth it.  It's important to support Live
Merge soon.

Regarding deprecation of the current tasks API:  Could your suggested
HSM Tasks framework be extended to cover SPM/SDM tasks as well?  I
would hope that it could.  In that case, we could look forward to a
unified async task architecture in vdsm.


I suggest to have something loosely modeled on posix fork/wait.

- Engine asks Vdsm to start an API verb asynchronously and supplies a
 uuid. This is unlike fork(2), where the system chooses the pid, but
 that's required so that Engine could tell if the command has reached
 Vdsm in case of a network error.

- Engine may monitor the task (a-la wait(WNOHANG))


Allon has communicated a desire to limit engine-side polling.  Perhaps
the active tasks could be added to the host stats?


- When the task is finished, Engine may collect its result (a-la wait).
 Until that happens, Vdsm must report the task forever; restart or
 upgrade are no excuses. On reboot, though, all tasks are forgotten, so
 Engine may stop monitoring tasks on a fenced host.


This could be a good compromise.  I hate the idea of requiring engine
to play janitor and clean up stale vdsm data, but there is not a much
better way to do it.  Allowing reboot to auto-clear tasks will at
least provide some backstop to how long tasks could pile up if
forgotten.
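
The bookkeeping side of the fork/wait-style proposal discussed above
could look roughly like the Python sketch below.  TaskTable and its
method names are hypothetical, not an existing vdsm API, and the task
body runs synchronously here only to keep the example short; a real
implementation would run it in the background and persist the table
across vdsm restarts.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class TaskTable:
    """Hypothetical host-side task table, loosely modeled on fork/wait."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Tuple[str, Any]] = {}  # id -> (state, result)

    def start(self, task_id: str, func: Callable[..., Any], *args: Any) -> bool:
        """Start func under a caller-supplied id.

        Returns False if the id is already known, so a caller that retries
        after a network error can tell that the first request arrived.
        """
        if task_id in self._tasks:
            return False
        self._tasks[task_id] = ("running", None)
        try:
            self._tasks[task_id] = ("done", func(*args))
        except Exception as exc:
            self._tasks[task_id] = ("failed", str(exc))
        return True

    def poll(self, task_id: str) -> Optional[str]:
        """Like wait(WNOHANG): report the state without consuming the task."""
        entry = self._tasks.get(task_id)
        return entry[0] if entry is not None else None

    def collect(self, task_id: str) -> Tuple[str, Any]:
        """Like wait: return (state, result) and forget the task."""
        state, result = self._tasks[task_id]
        if state == "running":
            raise RuntimeError("task %s is still running" % task_id)
        del self._tasks[task_id]
        return state, result

    def reboot(self) -> None:
        """On host reboot all tasks are forgotten."""
        self._tasks.clear()
```

The point of the caller-supplied id is that start() becomes safe to
repeat: the second call with the same id is a no-op that confirms the
task exists, and a finished task stays visible until collect() is called
or the host reboots.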


This may be overkill for your use case, but it would come in useful for
other cases. In particular, setupNetwork returns before it is completely
done, since dhcp address acquisition may take too much time. Engine may
poll getVdsCaps to see when it's done (or timeout), but it would be
nicer to have a generic mechanism that can serve us all.


If we were to consider this, I would want to vet the architecture
against all known use cases for tasks to make sure we don't need to
create a new framework in 3 months.
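
For comparison, the "poll until done or timeout" pattern mentioned above
for setupNetwork can be sketched in a few lines of Python.  poll_until
and its parameters are hypothetical; the check callable stands in for
whatever query the caller would repeat (for example, inspecting the
result of getVdsCaps), and the clock and sleep arguments exist only so
the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until(
    check: Callable[[], bool],
    timeout: float,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() repeatedly until it returns True or timeout elapses.

    Returns True if check() succeeded, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Every caller that needs this ends up owning its own loop, interval and
timeout, which is the duplication a generic task mechanism would remove.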


Note that I'm suggesting a completely new task framework, at least on
Vdsm side, as the current one (with its broken persistence, arcane
states and never-reliable rollback) is beyond redemption, imho.


Are we okay with abandoning vdsm-side rollback entirely as we move
forward?  Won't that be a regression for at least some error flows
(especially in the realm of SPM tasks)?


5. When the job has completed, VDS Broker raises an event up to bll.
  Maybe this could be done via VmJobDAO on the stored VmJob?
6. Bll receives the event and issues a series of VDS commands to
  complete the operation:
  a) Verify the new image chain matches our expectations (the snap is
 no longer present in the chain).
  b) Delete the snapshot volume
  c) Remove the VmJob from the DB

Could you guys review this proposed flow for sanity?  The main
conceptual gaps I am left with concern #5 and #6.  What is the
appropriate way for VDSBroker to communicate with BLL?  Is there an
event mechanism I can explore or should I use the database?  I am
leaning toward the database because it is persistent and will ensure
#6 gets completed even if engine is restarted somewhere in the middle.
For #6, is there an existing polling / event loop in bll that I can
plug into?

Thanks in advance for tak

Re: [Engine-devel] Asynchronous tasks for live merge

2014-03-03 Thread Adam Litke

On 03/03/14 16:36 +0200, Itamar Heim wrote:

On 03/03/2014 04:28 PM, Dan Kenigsberg wrote:

On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote:

Hi all,

As part of our plan to support live merging of VM disk snapshots it
seems we will need a new form of asynchronous task in ovirt-engine.  I
am aware of AsyncTaskManager but it seems to be limited to managing
SPM tasks.  For live merge, we are going to need something called
VmTasks since the async command can be run only on the host that
currently runs the VM.

The way I see this working from an engine perspective is:
1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is
  found to be up, we activate an alternative live merge flow.
2. We submit a LiveMerge VDS Command for each impacted disk.  This is
  an asynchronous command which we need to monitor for completion.
3. A VmJob is inserted into the DB so we'll remember to handle it.
4. The VDS Broker monitors the operation via an extension to the
  already collected VmStatistics data.  Vdsm will report active Block
  Jobs only.  Once the job stops (in error or success) it will cease
  to be reported by vdsm and engine will know to proceed.


You describe a reasonable way for Vdsm to report whether an async
operation has finished. However, may we instead use the opportunity to
introduce generic "hsm" tasks?

I suggest to have something loosely modeled on posix fork/wait.

- Engine asks Vdsm to start an API verb asynchronously and supplies a
  uuid. This is unlike fork(2), where the system chooses the pid, but
  that's required so that Engine could tell if the command has reached
  Vdsm in case of a network error.

- Engine may monitor the task (a-la wait(WNOHANG))

- When the task is finished, Engine may collect its result (a-la wait).
  Until that happens, Vdsm must report the task forever; restart or
  upgrade are no excuses. On reboot, though, all tasks are forgotten, so
  Engine may stop monitoring tasks on a fenced host.

This may be overkill for your use case, but it would come in useful for
other cases. In particular, setupNetwork returns before it is completely
done, since dhcp address acquisition may take too much time. Engine may
poll getVdsCaps to see when it's done (or timeout), but it would be
nicer to have a generic mechanism that can serve us all.

Note that I'm suggesting a completely new task framework, at least on
Vdsm side, as the current one (with its broken persistence, arcane
states and never-reliable rollback) is beyond redemption, imho.
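
To make the fork/wait analogy above concrete, here is a minimal sketch of what such a host-side task table could look like. The class and method names (TaskRegistry, start, poll, collect, reboot) are hypothetical and are not part of any Vdsm API; the task runs inline only to keep the example short, so it never shows an intermediate "running" state.

```python
import uuid


class TaskRegistry:
    """Toy host-side task table, loosely modeled on fork/wait.

    All names here are hypothetical; this is not Vdsm code.
    """

    def __init__(self):
        self._tasks = {}  # task id -> {"state": ..., "result": ...}

    def start(self, task_id, func, *args):
        # The caller supplies the id, so a retry after a network error
        # is harmless: a known id is not started a second time.
        if task_id in self._tasks:
            return False
        # Run inline to keep the sketch short; a real host would run
        # the verb in a worker thread or process.
        try:
            entry = {"state": "done", "result": func(*args)}
        except Exception as exc:
            entry = {"state": "failed", "result": str(exc)}
        self._tasks[task_id] = entry
        return True

    def poll(self, task_id):
        # Like wait(WNOHANG): report the state without consuming it.
        entry = self._tasks.get(task_id)
        return entry["state"] if entry else "unknown"

    def collect(self, task_id):
        # Like wait(): hand over the result and forget the task.
        return self._tasks.pop(task_id)["result"]

    def reboot(self):
        # On reboot all tasks are forgotten.
        self._tasks.clear()
```

A caller would generate the id itself, for example with str(uuid.uuid4()), call start(), poll until the state is no longer "running", and then collect() exactly once.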


5. When the job has completed, VDS Broker raises an event up to bll.
  Maybe this could be done via VmJobDAO on the stored VmJob?
6. Bll receives the event and issues a series of VDS commands to
  complete the operation:
  a) Verify the new image chain matches our expectations (the snap is
 no longer present in the chain).
  b) Delete the snapshot volume
  c) Remove the VmJob from the DB

Could you guys review this proposed flow for sanity?  The main
conceptual gaps I am left with concern #5 and #6.  What is the
appropriate way for VDSBroker to communicate with BLL?  Is there an
event mechanism I can explore or should I use the database?  I am
leaning toward the database because it is persistent and will ensure
#6 gets completed even if engine is restarted somewhere in the middle.
For #6, is there an existing polling / event loop in bll that I can
plug into?

Thanks in advance for taking the time to think about this flow and for
providing your insights!

___
Engine-devel mailing list
Engine-devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel



the way i read Adam's proposal, there is no "task" entity at vdsm side 
to monitor, rather the state of the object the operation is performed 
on (similar to CreateVM, where the engine monitors the state of the 
VM, rather than the CreateVM request).


Yeah, we use the term "job" in order to avoid assumptions and
implications (ie. rollback/cancel, persistence) that come with the
word "task".  "Job" essentially means "libvirt Block Job", but I am
trying to allow for extension in the future.  Vdsm would collect block
job information for devices it expects to have active block jobs and
report them all under a single structure in the VM statistics.  There
would be no persistence of information so when a libvirt block job
goes poof, vdsm will stop reporting it.
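
A minimal sketch of how the engine side might notice that a job has gone away, assuming it keeps the ids of its stored VmJobs and receives one statistics dict per VM. The function name and the 'vmJobs' key are hypothetical and chosen only for illustration.

```python
def finished_jobs(tracked_job_ids, vm_stats):
    """Return the tracked job ids that vdsm no longer reports.

    tracked_job_ids: ids of the VmJob rows stored in the engine DB.
    vm_stats: one VM's statistics dict; 'vmJobs' is a hypothetical
              key holding the currently active jobs, keyed by job id.
    """
    reported = set(vm_stats.get("vmJobs", {}))
    return sorted(set(tracked_job_ids) - reported)
```

Since a job that disappears says nothing about success or failure, the engine would still have to verify the image chain afterwards, as in step 6a of the proposed flow.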

--
Adam Litke


[Engine-devel] Schema upgrade failure on master

2014-03-03 Thread Adam Litke

Hi,

I've recently rebased to master and it looks like the
03_05_0050_event_notification_methods.sql script is failing on schema
upgrade.  Is this a bug or am I doing something wrong?  To upgrade I
did the normal procedure with my development installation:

make install-dev ...
~/ovirt/bin/engine-setup

Got this result in the log file:

psql:/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/upgrade/03_05_0050_event_notification_methods.sql:10: ERROR:  column "notification_method" contains null values
FATAL: Cannot execute sql command: --file=/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/upgrade/03_05_0050_event_notification_methods.sql

2014-03-03 17:20:34 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/setup/bin/../plugins/ovirt-**FILTERED**-setup/ovirt-**FILTERED**/db/schema.py", line 280, in _misc
    osetupcons.DBEnv.PGPASS_FILE
  File "/usr/lib/python2.7/site-packages/otopi/plugin.py", line 451, in execute
    command=args[0],
RuntimeError: Command '/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/schema.sh' failed to execute

--
Adam Litke


Re: [Engine-devel] Schema upgrade failure on master

2014-03-04 Thread Adam Litke

On 04/03/14 00:10 -0500, Martin Perina wrote:

Hi Adam,

I didn't notice any problem with this script. Was the database you tried to 
upgrade empty?
If not could you please send me contents of your event_subscriber table?


engine=> select * from event_subscriber ;
            subscriber_id             |        event_up_name        | method_id |      method_address      | tag_name
--------------------------------------+-----------------------------+-----------+--------------------------+----------
 fdfc627c-d875-11e0-90f0-83df133b58cc | VM_SET_TICKET               |         0 | ali...@brewer.alitke.net |
 fdfc627c-d875-11e0-90f0-83df133b58cc | VM_DOWN_ERROR               |         0 | ali...@brewer.alitke.net |
 fdfc627c-d875-11e0-90f0-83df133b58cc | VDS_INITIATED_RUN_VM_FAILED |         0 | ali...@brewer.alitke.net |
(3 rows)





Thanks

Martin Perina

--
Adam Litke


Re: [Engine-devel] Share Your Thoughts

2014-03-24 Thread Adam Litke

On 23/03/14 10:36 -0400, Gilad Chaplik wrote:

AuditLog gets recycled after 30 days. the reason i stopped my VM may
still be relevant.
I would not make fields complex/composite. they need to be easily
useable via the CLI for example.


I think we need multiple comments, so we need to think about the RESTful api 
anyhow.
I guess that next feature will be a reason for 'wipe after stop'/any other BE 
that needs reasoning.


What about a new DB table (maybe called Annotations) that takes a
business entity type, UUID, action type, timestamp, and reason string.
Then the shutdown reason could be entered as a new row in the DB.  It
can be kept as long as we want it and views can be adjusted to make
these fields searchable.
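
To illustrate the table idea, here is a minimal sketch using Python's sqlite3 module as a stand-in for the engine database. The table name, column names and helper functions are assumptions made up for this example, not an existing engine schema.

```python
import sqlite3
import time

# In-memory database as a stand-in for the real engine DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE annotations (
           entity_type TEXT NOT NULL,   -- e.g. 'VM'
           entity_id   TEXT NOT NULL,   -- UUID of the business entity
           action_type TEXT NOT NULL,   -- e.g. 'STOP'
           created_at  REAL NOT NULL,   -- timestamp
           reason      TEXT NOT NULL    -- free-form reason string
       )"""
)


def annotate(entity_type, entity_id, action_type, reason):
    """Record one annotation as a new row."""
    conn.execute(
        "INSERT INTO annotations VALUES (?, ?, ?, ?, ?)",
        (entity_type, entity_id, action_type, time.time(), reason),
    )


def reasons_for(entity_type, entity_id):
    """Return (action_type, reason) pairs for one entity, oldest first."""
    rows = conn.execute(
        "SELECT action_type, reason FROM annotations"
        " WHERE entity_type = ? AND entity_id = ?"
        " ORDER BY created_at, rowid",
        (entity_type, entity_id),
    )
    return rows.fetchall()
```

Because every annotation is its own row, an entity can carry several comments over time, and retention is independent of the AuditLog recycling.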

--
Adam Litke


Re: [Engine-devel] Share Your Thoughts

2014-03-24 Thread Adam Litke

On 24/03/14 08:43 -0400, Adam Litke wrote:

On 23/03/14 10:36 -0400, Gilad Chaplik wrote:

AuditLog gets recycled after 30 days. the reason i stopped my VM may
still be relevant.
I would not make fields complex/composite. they need to be easily
useable via the CLI for example.


I think we need multiple comments, so we need to think about the RESTful api 
anyhow.
I guess that next feature will be a reason for 'wipe after stop'/any other BE 
that needs reasoning.


What about a new DB table (maybe called Annotations) that takes a
business entity type, UUID, action type, timestamp, and reason string.
Then the shutdown reason could be entered as a new row in the DB.  It
can be kept as long as we want it and views can be adjusted to make
these fields searchable.


I forgot to mention that this idea would make it simple to annotate
the reason for moving a host into maintenance mode as well (or any
other state change annotations we would want to make in the future).

--
Adam Litke


Re: [Engine-devel] [RFC] New Connection Management API

2012-01-25 Thread Adam Litke
On Mon, Jan 23, 2012 at 04:54:10PM -0500, Saggi Mizrahi wrote:
> Nitty Gritty:

This seems like a good API but I have some suggestions with respect to API
naming:

> manageStorageServer
> ===

Could we name this manageStorageConnection or manageStorageServerConnection?
Manage storage server is confusing because it implies you are managing the
server itself (ie. server configuration, NFS exports, reboot, etc).

> Synopsis:
> manageStorageServer(uri, connectionID):
> 
> Parameters:
> uri - a uri pointing to a storage target (eg: nfs://server:export, 
> iscsi://host/iqn;portal=1)
> connectionID - string with any char except "/".
> 
> Description:
> Tells VDSM to start managing the connection. From this moment on VDSM will 
> try and have the connection available when needed. VDSM will monitor the 
> connection and will automatically reconnect on failure.
> Returns:
> Success code if VDSM was able to manage the connection.
> It usually just verifies that the arguments are sane and that the CID is not 
> already in use.
> This doesn't mean the host is connected.
> 
> unmanageStorageServer
> =

To match above: unmanageStorageConnection or unmanageStorageServerConnection

> Synopsis:
> unmanageStorageServer(connectionID):
> 
> Parameters:
> connectionID - string with any char except "/".
> 
> Descriptions:
> Tells VDSM to stop managing the connection. VDSM will try and disconnect from 
> the storage target if this is the last CID referencing the storage connection.
> 
> Returns:
> Success code if VDSM was able to unmanage the connection.
> It will return an error if the CID is not registered with VDSM. Disconnect 
> failures are not reported. Active unmanaged connections can be tracked with 
> getStorageServerList()
> 
> getStorageServerList
> 

getStorageConnectionList or getStorageServerConnectionList

> Synopsis:
> getStorageServerList()
> 
> Description:
> Will return list of all managed and unmanaged connections. Unmanaged 
> connections have temporary IDs and are not guaranteed to be consistent across 
> calls.
> 
> Results:
> A mapping between CIDs and the status.
> example return value (Actual key names may differ)
> 
> {'conA': {'connected': True, 'managed': True, 'lastError': 0,
>           'connectionInfo': {
>               'remotePath': 'server:/export',
>               'retrans': 3,
>               'version': 4}},
>  'iscsi_session_34': {'connected': False, 'managed': False, 'lastError': 339,
>                       'connectionInfo': {
>                           'hostname': 'dandylopn',
>                           'portal': 1}}
> }
> ___
> vdsm-devel mailing list
> vdsm-de...@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
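
To check my reading of the proposed semantics, here is a small in-memory model of the three verbs. It is only a sketch: the class and method names are made up, nothing here talks to real storage, and the 'connected' and 'lastError' values are placeholders.

```python
class ConnectionManager:
    """In-memory model of the proposed verbs; no real storage I/O."""

    def __init__(self):
        self._managed = {}  # connectionID -> uri

    def manage(self, uri, cid):
        # Only sanity checks happen here; success does not mean the
        # host is connected.
        if "/" in cid or not uri:
            raise ValueError("bad arguments")
        if cid in self._managed:
            raise KeyError("CID already in use: %s" % cid)
        self._managed[cid] = uri

    def unmanage(self, cid):
        # Raises KeyError if the CID is not registered. Returns True
        # when this was the last CID referencing the target, i.e. when
        # a disconnect would be attempted.
        uri = self._managed.pop(cid)
        return uri not in self._managed.values()

    def status(self):
        # 'connected' and 'lastError' are placeholders: this model
        # never connects to anything.
        return {cid: {"managed": True, "connected": False, "lastError": 0,
                      "connectionInfo": {"uri": uri}}
                for cid, uri in self._managed.items()}
```

Note that two CIDs may reference the same target, which is why unmanage only implies a disconnect for the last one.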

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] [RFC] New Connection Management API

2012-01-26 Thread Adam Litke
hard and will continue on in about 2 minutes.
> 
> > 
> > Connections that live until they die have a lifecycle that is hard to 
> > define and work with. Solving this problem is theoretically simple.
> > 
> > Have clients hold some sort of session token and force the client to update 
> > it at a specified interval. You could bind resources (like domains, VMs, 
> > connections) to that session token so when it expires VDSM auto cleans the 
> > resources.
> > 
> > This kind of mechanism is out of the scope of this API change. Furthermore, 
> > I think that this mechanism should sit in the engine since the session 
> > might actually contain resources from multiple hosts and resources that are 
> > not managed by VDSM.
> > 
> > In GUI flows specifically the user might do actions that don't even touch 
> > the engine and forcing it to refresh the engine token is simpler than 
> > having it refresh the VDSM token.
> > 
> > I understand that engine currently has no way of tracking a user session. 
> > This, as I said, is also true in the case of VDSM. We can start and argue 
> > about which project should implement the session semantics. But as I see it 
> > it's not relevant to the connection management API.
> 

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] [RFC] New Connection Management API

2012-01-26 Thread Adam Litke
> > > > There are 2 use cases being discussed
> > > 1. Wait until a connection is made, if it fails don't retry and
> > > automatically unmanage.
> > > > 2. If the caller of the API forgets or fails to unmanage a
> > > > connection.
> > > 
> > 
> > Actually I was not discussing #2 at all.
> > 
> > > Your suggestion as I understand it:
> > > Transient connections are:
> > >  - Connection that VDSM will only try to connect to once and
> > >  will not reconnect to in case of disconnect.
> > 
> > yes
> > 
> > > 
> > > My problem with this definition that it does not specify the "end
> > > of life" of the connection.
> > > Meaning it solves only use case 1.
> > 
> > since this is the only use case i had in mind, it is what i was
> > looking for.
> > 
> > > If all is well, and it usually is, VDSM will not invoke a
> > > disconnect.
> > > So the caller would have to call unmanage if the connection
> > > succeeded at the end of the flow.
> > 
> > agree.
> > 
> > > Now, if you are already calling unmanage if connection succeeded
> > > you can just call it anyway.
> > 
> > > not exactly, an example I gave earlier on the thread was that VDSM
> > > hangs or has some other error and the engine cannot initiate unmanage.
> > > Instead, let's assume the host is fenced (self-fence or external fence
> > > does not matter); in this scenario the engine will not issue unmanage.
> > 
> > > 
> > > instead of doing: (with your suggestion)
> > > 
> > > manage
> > > wait until succeeds or lastError has value
> > > try:
> > >   do stuff
> > > finally:
> > >   unmanage
> > > 
> > > do: (with the canonical flow)
> > > ---
> > > manage
> > > try:
> > >   wait until succeeds or lastError has value
> > >   do stuff
> > > finally:
> > >   unmanage
> > > 
> > > This is simpler to do than having another connection type.
> > 
> > You are assuming the engine can communicate with VDSM and there are
> > scenarios where it is not feasible.
> > 
> > > 
> > > Now that we got that out of the way lets talk about the 2nd use
> > > case.
> > 
> > Since I did not ask VDSM to clean after the (engine) user and you
> > don't
> > want to do it I am not sure we need to discuss this.
> > 
> > If you insist we can start the discussion on who should implement the
> > cleanup mechanism but I'm afraid I have no strong arguments for VDSM
> > to
> > do it, so I rather not go there ;)
> > 
> > 
> > You dropped from the discussion my request for supporting list of
> > connections for manage and unmanage verbs.
> > 
> > > API client died in the middle of the operation and unmanage was
> > > never called.
> > > 
> > > Your suggested definition means that unless there was a problem
> > > with the connection VDSM will still have this connection active.
> > > The engine will have to clean it anyway.
> > > 
> > > The problem is, VDSM has no way of knowing that a client died,
> > > forgot or is thinking really hard and will continue on in about 2
> > > minutes.
> > 
> > > 
> > > > Connections that live until they die have a lifecycle that is hard to
> > > > define and work with. Solving this problem is theoretically simple.
> > > 
> > > Have clients hold some sort of session token and force the client
> > > to update it at a specified interval. You could bind resources
> > > (like domains, VMs, connections) to that session token so when it
> > > expires VDSM auto cleans the resources.
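
A minimal sketch of the session-token idea described above, wherever it ends up living. All names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class SessionTable:
    """Leases with bound resources; expiry triggers cleanup."""

    def __init__(self, ttl, cleanup):
        self._ttl = ttl
        self._cleanup = cleanup      # called once per expired resource
        self._sessions = {}          # token -> [deadline, resources]

    def open(self, token, now):
        self._sessions[token] = [now + self._ttl, []]

    def bind(self, token, resource):
        # Tie a resource (domain, VM, connection) to the session.
        self._sessions[token][1].append(resource)

    def refresh(self, token, now):
        # The client must call this before the deadline passes.
        self._sessions[token][0] = now + self._ttl

    def reap(self, now):
        # Clean up every session whose lease has run out.
        expired = [t for t, s in self._sessions.items() if s[0] <= now]
        for token in expired:
            for resource in self._sessions.pop(token)[1]:
                self._cleanup(resource)
```

A client that dies simply stops refreshing, so its resources are reclaimed on the next reap() after the lease expires.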
> > > 
> > > > This kind of mechanism is out of the scope of this API change.
> > > > Furthermore, I think that this mechanism should sit in the engine
> > > > since the session might actually contain resources from multiple
> > > > hosts and resources that are not managed by VDSM.
> > > > 
> > > > In GUI flows specifically the user might do actions that don't even
> > > > touch the engine and forcing it to refresh the engine token is
> > > > simpler than having it refresh the VDSM token.
> > > 
> > > I understand that engine currently has no way of tracking a user
> > > session. This, as I said, is also true in the case of VDSM. We can
> > > start and argue about which project should implement the session
> > > semantics. But as I see it it's not relevant to the connection
> > > management API.
> > 
> > 

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] Eclipse IDE setup

2012-02-06 Thread Adam Litke
Hi all,

I am trying to set up an eclipse development environment for ovirt-engine and am
running into a stubborn problem with missing classes.  I have followed the
directions for importing the Maven projects as written here:
 http://ovirt.org/wiki/Building_Ovirt_Engine/IDE

The projects are able to be imported but I see lots of errors about missing
imports such as:

import org.ovirt.engine.api.model.*
import org.ovirt.engine.core.common.*

I should have a complete ovirt-engine source repository (I cloned the
ovirt-engine git repo).  Has anyone seen this problem before?  Can you offer any
suggestions to help me resolve it?  Thanks!

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Eclipse IDE setup

2012-02-06 Thread Adam Litke
On Mon, Feb 06, 2012 at 10:58:13AM -0500, Laszlo Hornyak wrote:
> Hi Adam!
> 
> Please check if workspace maven resolution is enabled, and run a maven build 
> with install.
> If it is still broken, then there must be a bad dependency in the 
> pom.xml-s... it happens :-(

Thanks for your suggestions.  Maven resolution is enabled.  Then I tried to
build on the command line using mvn directly but got the same errors as in
eclipse.  Next, I tried to check out the 3.0 branch (assuming that the build
should be more stable) and I got a different set of compilation errors.

This brings up a few questions:

1.) Which jdk should I use?  I am currently using OpenJDK

/usr/lib/jvm/java-1.6.0-openjdk/bin/java -version
java version "1.6.0_23"
OpenJDK Runtime Environment (IcedTea6 1.11pre) (6b23~pre11-0ubuntu1.11.10.1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

2.) Does this need a Fedora/RH system to compile?

3.) My guess is that others are able to compile oVirt even if there are bad
dependencies in the pom.xml files.  Otherwise they would already be fixed.  How
do others fix the dependencies on their local systems?

Thanks for the help!

> 
> Laszlo
> 

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Eclipse IDE setup

2012-02-06 Thread Adam Litke
On Mon, Feb 06, 2012 at 08:28:19PM +0200, Livnat Peer wrote:
> On 06/02/12 19:18, Adam Litke wrote:
> > On Mon, Feb 06, 2012 at 10:58:13AM -0500, Laszlo Hornyak wrote:
> >> Hi Adam!
> >>
> >> Please check if workspace maven resolution is enabled, and run a maven 
> >> build with install.
> >> If it is still broken, then there must be a bad dependency in the 
> >> pom.xml-s... it happens :-(
> > 
> > Thanks for your suggestions.  Maven resolution is enabled.  Then I tried to
> > build on the command line using mvn directly but got the same errors as in
> > eclipse.  Next, I tried to checkout out the 3.0 branch (assuming that the 
> > build
> > should be more stable) and I got a different set of compilation errors.
> > 
> 
> Hi Adam,
> 
> > This brings up a few questions:
> > 
> > 1.) Which jdk should I use?  I am currently using OpenJDK
> > 
> > /usr/lib/jvm/java-1.6.0-openjdk/bin/java -version
> > java version "1.6.0_23"
> > OpenJDK Runtime Environment (IcedTea6 1.11pre) (6b23~pre11-0ubuntu1.11.10.1)
> > OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
> >
> 
> you are using the right JDK.
> 
> > 2.) Does this need a Fedora/RH system to compile?
> 
> The engine works on Fedora, RHEL, Ubuntu, Gentoo and should work on any
> other Linux based operating system (Java is platform agnostic).
> 
> 
> > 
> > 3.) My guess is that others are able to compile oVirt even if there are bad
> > dependencies in the pom.xml files.  Otherwise they would already be fixed.  
> > How
> > do others fix the dependencies on their local systems.
> > 
> 
> There should not be any local issues, let's try to figure out what the
> issues are.
> 
> The errors are probably not related to eclipse because you have
> compilation errors from the command line as well.
> 
> I would start by compiling the engine and api with no tests and no UI:
> 
> Run from the command line -
> 
> 1. $ovirt_engine_home> mvn clean
> 2. $ovirt_engine_home> mvn install -DskipTests
> 
> What is the result of the above two?

Thanks Livnat!  The mvn clean was successful.  Here are the errors from the
install step:

[INFO] 
[INFO] Building Shared GWT code
[INFO]task-segment: [install]
[INFO] 
[INFO] [clean:clean {execution: auto-clean}]
[INFO] [dependency:unpack {execution: copy}]
[INFO] Configured Artifact: org.ovirt.engine.core:common:sources:3.0.0-0001:jar
[INFO] Configured Artifact: org.ovirt.engine.core:compat:sources:3.0.0-0001:jar
[INFO] Configured Artifact: 
org.ovirt.engine.core:searchbackend:sources:3.0.0-0001:jar
[INFO] Unpacking 
/home/aglitke/.m2/repository/org/ovirt/engine/core/common/3.0.0-0001/common-3.0.0-0001-sources.jar
 to
  
/home/aglitke/src/ovirt-engine/frontend/webadmin/modules/sharedgwt/src/main/java
   with includes null and excludes:null
[INFO] Unpacking 
/home/aglitke/.m2/repository/org/ovirt/engine/core/compat/3.0.0-0001/compat-3.0.0-0001-sources.jar
 to
  
/home/aglitke/src/ovirt-engine/frontend/webadmin/modules/sharedgwt/src/main/java
   with includes null and excludes:null
[INFO] Unpacking 
/home/aglitke/.m2/repository/org/ovirt/engine/core/searchbackend/3.0.0-0001/searchbackend-3.0.0-0001-sources.jar
 to
  
/home/aglitke/src/ovirt-engine/frontend/webadmin/modules/sharedgwt/src/main/java
   with includes null and excludes:null
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 3 resources
[INFO] skip non existing resourceDirectory 
/home/aglitke/src/ovirt-engine/frontend/webadmin/modules/sharedgwt/src/main/resources
[INFO] [gwt:resources {execution: default}]
[INFO] 750 source files copied from GWT module org.ovirt.engine.SharedGwt
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 749 source files to 
/home/aglitke/src/ovirt-engine/frontend/webadmin/modules/sharedgwt/target/classes
[INFO] 
[ERROR] BUILD FAILURE
[INFO] 
[INFO] Compilation failure

Quota.java:[52,52] cannot find symbol
symbol  : variable QUOTA_NAME_SIZE
location: class 
org.ovirt.engine.core.common.businessentities.BusinessEntitiesDefinitions

Quota.java:[58,52] cannot find symbol
symbol  : variable QUOTA_DESCRIPTION_SIZE
location: class 
org.ovirt.engine.core.common.businessentities.BusinessEntitiesDefinitions


[INFO] 
[INFO] For more information, run Maven with the -e switch
[INFO] ---

Re: [Engine-devel] Eclipse IDE setup

2012-02-07 Thread Adam Litke
On Mon, Feb 06, 2012 at 10:24:36PM +0200, Livnat Peer wrote:
> On 06/02/12 21:47, Adam Litke wrote:
> > On Mon, Feb 06, 2012 at 08:28:19PM +0200, Livnat Peer wrote:
> >> On 06/02/12 19:18, Adam Litke wrote:
> >>> On Mon, Feb 06, 2012 at 10:58:13AM -0500, Laszlo Hornyak wrote:
> >>>> Hi Adam!
> >>>>
> >>>> Please check if workspace maven resolution is enabled, and run a maven 
> >>>> build with install.
> >>>> If it is still broken, then there must be a bad dependency in the 
> >>>> pom.xml-s... it happens :-(
> >>>
> >>> Thanks for your suggestions.  Maven resolution is enabled.  Then I tried 
> >>> to
> >>> build on the command line using mvn directly but got the same errors as in
> >>> eclipse.  Next, I tried to checkout out the 3.0 branch (assuming that the 
> >>> build
> >>> should be more stable) and I got a different set of compilation errors.
> >>>
> >>
> >> Hi Adam,
> >>
> >>> This brings up a few questions:
> >>>
> >>> 1.) Which jdk should I use?  I am currently using OpenJDK
> >>>
> >>> /usr/lib/jvm/java-1.6.0-openjdk/bin/java -version
> >>> java version "1.6.0_23"
> >>> OpenJDK Runtime Environment (IcedTea6 1.11pre) 
> >>> (6b23~pre11-0ubuntu1.11.10.1)
> >>> OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
> >>>
> >>
> >> you are using the right JDK.
> >>
> >>> 2.) Does this need a Fedora/RH system to compile?
> >>
> >> The engine works on Fedora, RHEL, Ubuntu, Gentoo and should work on any
> >> other Linux based operating system (Java is platform agnostic).
> >>
> >>
> >>>
> >>> 3.) My guess is that others are able to compile oVirt even if there are 
> >>> bad
> >>> dependencies in the pom.xml files.  Otherwise they would already be 
> >>> fixed.  How
> >>> do others fix the dependencies on their local systems.
> >>>
> >>
> >> There should not be any local issues, let's try to figure out what the
> >> issues are.
> >>
> >> The errors are probably not related to eclipse because you have
> >> compilation errors from the command line as well.
> >>
> >> I would start by compiling the engine and api with no tests and no UI:
> >>
> >> Run from the command line -
> >>
> >> 1. $ovirt_engine_home> mvn clean
> >> 2. $ovirt_engine_home> mvn install -DskipTests
> >>
> >> What is the result of the above two?
> > 
> > Thanks Livnat!  The mvn clean was successful.  Here are the errors from the
> > install step:
> 
> 
> 1. do you have latest? when did you fetch last (I can fetch the same
> commit hash to make sure it compiles, I have latest and it compiles)

Ok.  I guess I had the 3.0 branch checked out when I was trying to fix the
compile.  By moving back to master, I was able to build from the command line
successfully.  However, I still get lots of errors in eclipse.  I will include a
few below:

Action cannot be resolved to a type
ActionResource.java 
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 34 Java 
Problem

Actions cannot be resolved to a type
ActionsBuilder.java 
/restapi-definition/src/main/java/org/ovirt/engine/api/model line 35 Java 
Problem

BaseDevice cannot be resolved to a type 
DeviceResource.java 
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 28 Java 
Problem

BaseDevices cannot be resolved to a type
DevicesResource.java
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 33 Java 
Problem

BaseResource cannot be resolved to a type
RemovableStorageDomainContentsResource.java 
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 28 Java 
Problem

BaseResources cannot be resolved to a type
RemovableStorageDomainContentsResource.java 
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 28 Java 
Problem

Bound mismatch: The type C is not a valid substitute for the bounded parameter
 of the type ReadOnlyDevicesResource
DevicesResource.java
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 34 Java 
Problem

Bound mismatch: The type D is not a valid substitute for the bounded parameter 
 of the type DeviceResource
DevicesResource.java
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 52 Java 
Problem

Bound mismatch: The type R is not a valid substitute for the bounded parameter 
 of the type StorageDomainContentResource
StorageDomainContentsResource.java  
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 36 Java 
Problem

Capabilities cannot be resolved to a type
CapabilitiesResource.java   
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 33 Java 
Problem

CdRom cannot be resolved to a type
TemplateResource.java   
/restapi-definition/src/main/java/org/ovirt/engine/api/resource line 51 Java 
Problem


-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] Eclipse IDE setup

2012-02-07 Thread Adam Litke
On Mon, Feb 06, 2012 at 10:24:36PM +0200, Livnat Peer wrote:
> BTW if you want online help I am on the ovirt IRC channel.

Thanks.  I might need to take you up on your offer.  Even after following the
suggestions in this thread I still have around 400 unsolved errors across
several different projects.  What is your IRC nick?  I wasn't able to recognize
you on #ovirt.

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] Backing up the DB

2012-04-02 Thread Adam Litke
Hi all,

I have a working development environment for ovirt-engine and I want to verify
someone else's gerrit change.  Is there a documented procedure for backing up
the engine db so that, after checking out the new code and updating the db
schema, I can revert to my old code and DB (schema and data)?  Thanks!

-- 
Adam Litke 
IBM Linux Technology Center



[Engine-devel] VDSM REST API

2012-04-03 Thread Adam Litke
Hi all,

At the oVirt Workshop in Beijing I learned how the ovirt-shell dynamically
discovers the collections, resource schemas, and allowed actions based on an
RSDL file and API xsd schema.  I am working on a REST API for vdsm and would
like to make my API compatible with the ovirt-engine api such that the same
ovirt-shell program could work with either the engine api or the vdsm api.

There are many differences between ovirt-engine and vdsm (namely that one is
implemented in Java and the other in Python).  I think the easiest way to test
whether this is possible is to try to create a new, minimalist REST service
with Python CherryPy.  Such a service would have a root URL with no collections
or actions.  From my understanding I will need to write the following files:

/api         - A basic XML representation of the API root resource
/api?schema  - An xsd that describes the simple API
/api?rsdl    - An rsdl (XML file) that describes the available links

For /api, I want to start with something dead-simple:



  

  Hello from vdsm!

  


Once I can use ovirt-shell to list messages and show messages I will be happy to
build on it.  Can anyone help me figure out the minimal xsd and rsdl that would
be needed for such an API to be consumable by ovirt-shell?  Thanks for your
help!
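
For reference, here is a minimal sketch of the URL dispatch described above.
It uses only the Python standard library instead of CherryPy so that it runs
without extra packages.  The document bodies are placeholders, not the real
ovirt-engine schema or RSDL, and the names DOCUMENTS and select_document are
made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder bodies only (assumption): the real xsd and rsdl content is
# exactly the open question in this mail.
DOCUMENTS = {
    None: "<!-- API root placeholder -->",   # served for /api
    "schema": "<!-- xsd placeholder -->",    # served for /api?schema
    "rsdl": "<!-- rsdl placeholder -->",     # served for /api?rsdl
}


def select_document(url):
    """Return the body to serve for /api, /api?schema or /api?rsdl."""
    parts = urlsplit(url)
    if parts.path != "/api":
        raise KeyError(url)
    # keep_blank_values=True so a bare "?schema" (no "=value") is kept.
    flags = parse_qs(parts.query, keep_blank_values=True)
    for name in ("schema", "rsdl"):
        if name in flags:
            return DOCUMENTS[name]
    return DOCUMENTS[None]
```

In a CherryPy service the same selection would presumably live in the handler
for the root resource; that wiring is left out here.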

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFD: NEW API getAllTasks

2012-05-08 Thread Adam Litke
On Tue, May 08, 2012 at 10:50:49AM +0300, Doron Fediuck wrote:
> On 07/05/12 21:33, Adam Litke wrote:
> > The current APIs for retrieving all task information do not actually return
> > all task information.  I would like to introduce a new API that corrects
> > this and other issues with the current API while preserving backwards
> > compatibility with ovirt-engine for as long as is necessary.
> > 
> > The current APIs:
> > 
> > getAllTasksInfo(spUUID=None, options = None):
> >  - Returns a dictionary that maps a task UUID to a task verb.
> >  - Despite having 'all' in the name, this API only returns tasks that have
> >    an 'spm' tag.
> >  - This call returns only one piece of information for each task.
> >  - The spUUID parameter is deprecated and ignored.
> > 
> > getAllTasksStatuses(spUUID=None, options = None):
> >  - Returns a dictionary of task status information.
> >  - Despite having 'all' in the name, this API only returns tasks that have
> >    an 'spm' tag.
> >  - The spUUID parameter is deprecated and ignored.
> > 
> > 
> > I propose the following new API:
> > 
> > getAllTasks(tag=None, options=None):
> >  - Returns a dictionary of task information.  The info from both of the
> >    above functions would be merged into a single result set.
> >  - If tag is None, all tasks are returned.  Otherwise, only tasks matching
> >    the tag are returned.
> >  - The spUUID parameter is dropped.  options is for future extension and is
> >    currently not used.
> > 
> > This new API includes all functionality that is available in the old calls.
> > In the future, ovirt-engine could switch to this API and preserve the
> > current semantics by passing tag='spm' to getAllTasks.  Meanwhile, API users
> > that really want all tasks (gluster and the REST API) can get what they need.
> > 
> > Thoughts on this idea?
> > 
> 
> (Adding engine-devel, as this relates to the engine API).
> 
> AFAIR, in the original design (when async tasks were introduced into vdsm),
> most (if not all) of the tasks were SPM tasks, and this is the reason for this
> behavior.
> 
> Improving the API is welcome. The suggested design should work.
> I'd like to verify:
> 
> - Backwards compatibility works, so running engines shouldn't need to be replaced.
> Dan: any news on this?

Yes, we should make sure that future versions of ovirt-engine that want to
adopt this new API will get the behavior they need.  For now, I plan
to keep the current/original APIs around until they can be removed as part of a
deprecation plan in a few years.
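
To make the compatibility argument concrete, here is a rough sketch of how the
legacy calls could be expressed on top of the proposed getAllTasks.  The
in-memory TASKS table, its field names, and the example verbs are assumptions
for illustration only; the real vdsm task objects and return formats are not
reproduced here.

```python
# Hypothetical task table; field names and verbs are assumptions.
TASKS = {
    "task-1": {"verb": "createVolume", "tag": "spm", "state": "running"},
    "task-2": {"verb": "rebalance", "tag": "gluster", "state": "finished"},
}


def getAllTasks(tag=None, options=None):
    """Return every task, or only the tasks whose tag matches."""
    # options is reserved for future extension and is ignored.
    return {
        uuid: dict(info)
        for uuid, info in TASKS.items()
        if tag is None or info["tag"] == tag
    }


def getAllTasksInfo(spUUID=None, options=None):
    """Legacy call: task UUID -> verb, for 'spm' tasks only."""
    # spUUID is deprecated and ignored, as in the current API.
    return {u: t["verb"] for u, t in getAllTasks(tag="spm").items()}


def getAllTasksStatuses(spUUID=None, options=None):
    """Legacy call: status information, for 'spm' tasks only."""
    return {u: {"state": t["state"]} for u, t in getAllTasks(tag="spm").items()}
```

With this shape, callers that want everything use getAllTasks() and the old
entry points keep their 'spm'-only semantics until they are retired.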

> - Going forward with potential changes in SPM concepts should be supported as
> well.
> Dan/Ayal/Livnat: do you think it works? i.e., is anything needed other than an
> alternate 'spm' tag?
> -- 
> 
> /d
> 
> "All computers wait at the same speed."
> 

-- 
Adam Litke 
IBM Linux Technology Center



Re: [Engine-devel] [vdsm] RFC: Writeup on VDSM-libstoragemgmt integration

2012-05-31 Thread Adam Litke
[...] policy connected with a
particular storage target.

> 5) oVirt Engine potential changes - as described by ayal :
> 
> - We will either need a new 'storage array' entity in engine to
> keep credentials, or, in case of storage array as storage domain,
> just keep this info as part of the domain at engine level.
> - Have a 'storage array' entity in oVirt Engine to support
> 'refresh capabilities' as a button/verb.
> - When the user, during storage provisioning, selects a LUN exported
> from a storage array (via LSM), the oVirt Engine would know from
> then onwards that this LUN is being served via LSM.
>   It would then be able to query the capabilities of the LUN
> and show them to the virt admin during storage consumption flow.
> 
> 6) Potential flows:
> - Create snapshot flow
> -- VDSM will check the snapshot offload capability in the
> domain metadata
> -- If available, and override is not configured, it will use
> LSM to offload LUN/File snapshot
> -- If override is configured or capability is not available,
> it will use its internal logic to create the
> snapshot (qcow2).
> 
> - Copy/Clone vmdisk flow
> -- VDSM will check the copy offload capability in the domain
> metadata
> -- If available, and override is not configured, it will use
> LSM to offload LUN/File copy
> -- If override is configured or capability is not available,
> it will use its internal logic to create the
> copy (e.g. dd cmd in case of LUN).
> 
> 7) LSM potential changes:
> 
> - list features/capabilities of the array. Eg: copy offload,
> thin prov. etc.
> - list containers (aka pools) (present in LSM today)
> - Ability to list different types of arrays being managed, their
> capabilities and used/free space
> - Ability to create/list/delete/resize volumes ( LUN or exports,
> available in LSM as of today)
> - Get monitoring info with object (LUN/snapshot/volume) as
> optional parameter for specific info. eg: container/pool free/used
> space, raid type etc.
> 
> Need to make sure above info is listed in a coherent way across
> arrays (number of LUNs, raid type used, free/total per
> container/pool, per LUN?). Also need I/O statistics wherever
> possible.
> 
> 
> ___
> vdsm-devel mailing list
> vdsm-de...@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
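
The create-snapshot and copy/clone flows in (6) share one decision, which can
be sketched roughly as below.  The function name, the metadata keys, and the
"lsm"/"internal" return values are all assumptions made for illustration; the
writeup does not define how the capability or the override is stored.

```python
def choose_method(domain_metadata, capability, override=False):
    """Decide between LSM offload and VDSM's internal logic.

    domain_metadata: dict of capability flags read from the storage
        domain metadata (hypothetical layout).
    capability: key to check, e.g. "snapshot_offload" or "copy_offload"
        (hypothetical names).
    override: True when the admin has configured an override.
    """
    offload_available = bool(domain_metadata.get(capability, False))
    if offload_available and not override:
        return "lsm"       # offload the LUN/file operation to the array
    return "internal"      # fall back to qcow2 snapshot, dd copy, etc.
```

For example, choose_method({"snapshot_offload": True}, "snapshot_offload")
selects the offload path, while passing override=True or a domain without the
capability falls back to the internal logic.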

-- 
Adam Litke 
IBM Linux Technology Center


Re: [Engine-devel] [vdsm] Getting rid of a...@ovirt.org?

2012-07-15 Thread Adam Litke
On Sun, Jul 15, 2012 at 04:53:04AM -0400, Ayal Baron wrote:
> Hi all,
> 
> Sorry for cross-posting, but in this case I think it's relevant.
> 
> The original idea was that every time we wish to discuss a new 
> cross-component feature we should do it over arch list. However, it would 
> appear that de-facto usually engine-devel and vdsm-devel are being used 
> (cross posted).
> Currently engine-devel has 211 subscribers, arch has 160 and vdsm-devel has 
> 128 so from this perspective again, arch seems less relevant.
> I propose we ditch arch and keep the other 2 mailing lists.
> I'm not sure whether new cross-component features should be discussed solely 
> on engine-devel or cross-posted (there are probably people who wouldn't care 
> about engine side but would still like to know about such changes).

+1 to ditching arch.  I would still prefer that cross-component features
cross-post to vdsm-devel and engine-devel.  My current focus is on vdsm and the
traffic level on that list is currently far more manageable than that of
engine-devel. 

-- 
Adam Litke 
IBM Linux Technology Center
