[vdsm] cpopen version inconsistencies

2014-04-06 Thread Saggi Mizrahi
Yaniv synced the github version with the code
that was released.

1.3 is now tagged.
https://github.com/ficoos/cpopen/tree/1.3.0


Re: [vdsm] Modeling graphics framebuffer device in VDSM

2014-04-01 Thread Saggi Mizrahi
I remember there was a discussion about this.

https://lists.fedorahosted.org/pipermail/vdsm-devel/2013-November/002727.html

I don't remember what came of it in the end though.

- Original Message -
 From: Frantisek Kobzik fkob...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Friday, March 28, 2014 3:06:17 PM
 Subject: [vdsm] Modeling graphics framebuffer device in VDSM
 
 Dear VDSM devels,
 
 I've been working on refactoring graphics devices in engine and VDSM for some
 time now and I'd like know your opinion of that.
 
 The aim of this refactoring is to model the graphics framebuffer (SPICE, VNC) as
 a device in the engine and VDSM. This is quite natural, since libvirt treats
 graphics as a device and we have some kind of device infrastructure in both
 projects. Another advantage (and actually the main reason for the refactoring)
 is simplified support for multiple graphics framebuffers on a single VM.
 
 Currently, passing information about graphics from engine to VDSM is done via
 'display' param in conf. In the other direction VDSM informs the engine about
 graphics parameters ('displayPort', 'displaySecurePort', 'displayIp' and
 'displayNetwork') in conf as well.
 
 What I'd like to achieve is to encapsulate all this information in specParams
 of the new graphics device and use specParams as the place for transferring
 data about the graphics device between engine and vdsm. What do you think?
 
 The draft patch is here:
 http://gerrit.ovirt.org/#/c/23555/ (it's currently marked with '-1', but it
 sheds some light on what the solution looks like, so feel free to take a look).
 
 Thanks,
 Franta.


Re: [vdsm] thread pool implementation

2014-03-25 Thread Saggi Mizrahi
The thing that worries me the most is stuck threads.
I hate them!

Could we move to a scheme with multiple libvirt connections, where if a call
takes too long we just close the connection? I know the call is still running
in libvirt, but then it's their problem and not my problem. That way the
thread pool doesn't need to handle this use case, making it much simpler.

Apart from the problem of libvirt calls getting stuck, all we need is a
run-of-the-mill thread pool solution.
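To make the idea concrete, here is a rough sketch of abandoning a stuck call
by closing its connection (my own illustration, assuming libvirt-python;
call_with_timeout and the error handling are hypothetical):

    import threading

    import libvirt

    def call_with_timeout(uri, call, timeout):
        # Run the libvirt call in a worker thread; if it doesn't finish
        # in time, close the connection and abandon the call. The worker
        # thread stays stuck inside libvirt, but the pool stays simple.
        conn = libvirt.open(uri)
        result = []

        def worker():
            result.append(call(conn))

        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()
        t.join(timeout)
        if t.is_alive():
            conn.close()  # give up; libvirt still owns the stuck call
            raise RuntimeError('libvirt call timed out')
        return result[0]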

- Original Message -
 From: Francesco Romani from...@redhat.com
 To: vdsm-devel vdsm-devel@lists.fedorahosted.org
 Cc: Saggi Mizrahi smizr...@redhat.com, Yaniv Bronheim 
 ybron...@redhat.com
 Sent: Tuesday, March 25, 2014 1:55:36 PM
 Subject: thread pool implementation
 
 Hello,
 
 in order to reduce the number of sampling threads, we'd like to move from
 one thread per VM to a thread pool.
 
 The strongest requirement we have is to be able to detect if a worker in the
 pool is not responding, and if so to detach it from the pool and kill it as
 soon as possible; then a new worker should be made available.
 
 This is because in sampling we are going to call libvirt and libvirt calls
 can block or, even worse,
 get stuck (I'm looking at you virDomainGetBlockInfo -
 http://libvirt.org/html/libvirt-libvirt.html#virDomainGetBlockInfo )
 
 So, we need a thread pool implementation :)
 What is the best way forward? I see a few options:
 
 * we have a thread pool already in storage. Should we move it outside storage
 to lib/ and extend it?
 * there is a thread pool hidden inside the multiprocessing module!
   (see
   
 http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.dummy)
   should we switch to this, at least for sampling?
 * Python 3.2+ has concurrent.futures which has a nice API and can use a
 thread pool executor.
   See
   
 http://docs.python.org/3.3/library/concurrent.futures.html#module-concurrent.futures
   There is a backport for python 2.6/2.7 also:
   https://pypi.python.org/pypi/futures
   Maybe this is the most forward compatible way?
 * Add an(other) thread pool?
 
 I don't really have any preference, provided the requirement above is
 satisfied.
 
 Thoughts? Especially Infra people's feedback would be appreciated.
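 For reference, option 3 might look roughly like this (my illustration;
 sample_vm is a stand-in for a real libvirt sampling call). Note that
 result() timing out only abandons the result: the worker thread itself
 stays stuck, which is exactly the hard requirement above:
 
     import time
     from concurrent.futures import ThreadPoolExecutor, TimeoutError
 
     def sample_vm(vm_id):
         # stand-in for a libvirt sampling call that may block forever
         time.sleep(60)
         return {'vm': vm_id, 'cpu': 0.0}
 
     executor = ThreadPoolExecutor(max_workers=8)
     future = executor.submit(sample_vm, 'vm-1')
     try:
         stats = future.result(timeout=5)
     except TimeoutError:
         # we can stop waiting, but the worker thread is still stuck
         stats = None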
 
 --
 Francesco Romani
 Red Hat Engineering Virtualization R & D
 Phone: 8261328
 IRC: fromani
 


Re: [vdsm] VDSM profiling results, round 1

2014-03-23 Thread Saggi Mizrahi
 pthread.py:129(wait)  1230.640   1377.992   +147.28 (BAD)
The thread pool workers simply block in wait() when there are no tasks,
since Queues use Conditions internally.

This might explain why the time spent in wait() is so long.
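To illustrate (a sketch of mine, not VDSM code): an idle pool worker sits
inside Condition.wait() via Queue.get(), so the profiler attributes that
idle time to pthread.py's wait() even when nothing is wrong:

    import Queue
    import threading

    q = Queue.Queue()

    def worker():
        while True:
            task = q.get()  # internally loops in not_empty.wait()
            task()

    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
    # with no tasks queued, all of this thread's time is accounted
    # to Condition.wait()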

- Original Message -
 From: Francesco Romani from...@redhat.com
 To: vdsm-devel vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, March 19, 2014 10:33:51 AM
 Subject: [vdsm] VDSM profiling results, round 1
 
 (sending again WITHOUT the attachments)
 
 Hi everyone
 
 I'd like to share the first round of profiling results for VDSM and my next
 steps.
 
 Summary:
 - experimented a couple of profiling approaches and found a good one
 - benchmarked http://gerrit.ovirt.org/#/c/25678/ : it is beneficial, was
 merged
 - found a few low-hanging fruits which seems quite safe to merge and
 beneficial to *all* flows
 - started engagement with infra (see other thread) to have common and
 polished performance
   tools
 - test roadmap is shaping up, wiki/ML will be updated in the coming days
 
 Please read through for a more detailed discussion. Every comment is welcome.
 
 Disclaimer:
 long mail, lot of content, please point out if something is missing or not
 clear enough
 or if deserves more discussion.
 
 +++
 
 == First round results ==
 
 First round of profiling was a follow-up of what I showed during the VDSM
 gathering.
 The results file contains a full profile ordered by descending time.
 In a nutshell: parallel start of 32 tiny VMs using engine REST API and a
 single hypervisor host.
 
 VMs are tiny just because I want to stuff as many VMs as I can into my
 mini-dell (16 GB ram, 4 core + HT CPUs).
 
 It is worth pointing out a few differences with respect to the *profile* (NOT
 the graphs) I showed during the gathering:
 
 - profile data is now collected using the profile decorator (see
 http://www.ovirt.org/Profiling_Vdsm)
   just around Vm._startUnderlyingVm. The gathering profile was obtained using
   the yappi application-wide
   profiler (see https://code.google.com/p/yappi/) and 40 VMs.
   * why yappi?
 I thought an application-wide profiler gathers more information and lets
 us have a better picture.
 I actually still think that, but I hit some yappi misbehaviour which I
 want to fix later; a function-level profile is so far easier to collect
 (just grab the data dumped to file).
   * why 40 VMs?
 I started with 64 but exhausted my storage backing store :)
 Will add more storage space in the next days; for the moment I stepped
 back to 32.
 
 It is worth noting that while the numbers change a bit (if you remember the
 old profile data and the scary 80secs wasted on namedtuple), the suspects
 are the same and their relative positions are roughly the same.
 So I believe our initial findings (namedtuple patch) and the plan are still
 valid.
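 As a side note, the per-function profile decorator mentioned above might
 look roughly like this (a hypothetical sketch, not the exact code from the
 wiki page):
 
     import cProfile
     import functools
 
     def profile(filename):
         # dump a cProfile stats file every time the wrapped callable runs
         def decorator(func):
             @functools.wraps(func)
             def wrapper(*args, **kwargs):
                 prof = cProfile.Profile()
                 try:
                     return prof.runcall(func, *args, **kwargs)
                 finally:
                     prof.dump_stats(filename)
             return wrapper
         return decorator
 
     # usage: decorate the method to measure, e.g. Vm._startUnderlyingVm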
 
 == how it was done ==
 
 I am still focusing just on the Monday morning scenario (mass start of many
 VMs at the same time).
 Each run consisted of a parallel start of 32 VMs as described in the result
 data.
 VDSM was restarted between one run and the next.
 Engine was *NOT* restarted between runs.
 Individual profiles were gathered after all the runs, and the overall
 profile was extracted from their aggregation.
 
 profile dumps are available to everyone, just drop me a note and I'll put the
 tarball somewhere.
 
 Please find attached the profile data in txt format. For easier consumption,
 it is also available on pastebin:
 
 baseline  : http://paste.fedoraproject.org/86318/
 namedtuple fix: http://paste.fedoraproject.org/86378/
 pickle fix: http://paste.fedoraproject.org/86600/ (see below)
 
 == hotspots ==
 
 the baseline profile data highlights five major areas and hotspots:
 
 1. internal concurrency (possible patch: http://gerrit.ovirt.org/#/c/25857/ -
 see below)
 2. libvirt
 3. XML processing (initial patch: http://gerrit.ovirt.org/#/c/17694/)
 4. namedtuple (patch: http://gerrit.ovirt.org/#/c/25678/ - fixed, merged)
 5. pickling (patch: http://gerrit.ovirt.org/#/c/25860/ - see below)
 
 #4 is beneficial in the iSCSI path and was already merged.
 #1 shows some potential, but it needs to be carefully evaluated to avoid
 performance regressions on different scenarios (e.g. bigger machines than
 mine :))
 #2 is basically outside of our control, but it needs to be watched.
 #3 and #5 are beneficial for all flows and scenarios and are safe to merge;
 #5 is almost a no-brainer IMO.
 
 == Note about the third profile ==
 
 When profiling the cPickle patch (http://paste.fedoraproject.org/86600/),
 the tests actually turned out *slower* with respect to the second profile,
 which had just the namedtuple patch.
 
 The hotspots seem to be around concurrency and libvirt:
 location              profile2(s)   profile3(s)   diff(s)
 pthread.py:129(wait)  1230.640      1377.992      +147.28 (BAD)
 

Re: [vdsm] Profiling and benchmarking VDSM

2014-03-18 Thread Saggi Mizrahi
Thank you for taking the initiative.
Just reminding you that the test framework is owned by infra, so don't forget
to put Yaniv and me in the CC for all future correspondence regarding this
feature, as I will be the one responsible for the final approval.

Ignore http://www.ovirt.org/Vdsm_Developers#Performance_and_scalability

Also, we don't want to do it per test: it's meaningless for most tests, since
they only run through the code once.

I started investigating how we want to solve this issue a while back, and
this is what I came up with.

What we need to do is create a decorator that wraps the test with cProfile.
We also want to create a generator that takes its configuration from nose:

def BenchmarkIter():
    start = time.time()
    i = 0
    while i < MIN_ITERATIONS or (time.time() - start) < MIN_TIME_RUNNING:
        yield i
        i += 1

So that writing a benchmark is just:

@benchmark([min_iter[, min_time_running]])
def testSomething(self):
    something()

That way we are sure we have a statistically significant sample for all tests.
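A minimal sketch of how @benchmark could tie cProfile to the iteration logic
above (my illustration; the nose configuration plumbing and the output format
are left out):

    import cProfile
    import functools
    import time

    MIN_ITERATIONS = 100
    MIN_TIME_RUNNING = 2.0  # seconds

    def benchmark(min_iter=MIN_ITERATIONS, min_time=MIN_TIME_RUNNING):
        def decorator(test):
            @functools.wraps(test)
            def wrapper(*args, **kwargs):
                prof = cProfile.Profile()
                start = time.time()
                i = 0
                # keep iterating until the sample is big enough
                while i < min_iter or (time.time() - start) < min_time:
                    prof.runcall(test, *args, **kwargs)
                    i += 1
                prof.dump_stats('%s.prof' % test.__name__)
            return wrapper
        return decorator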

There will need to be a plugin created for nose that skips @benchmark if
benchmarks are not turned on and that can generate output for the Jenkins
performance plugin[1]. That way we can run it every night, as the benchmarks
will be slow to run: they will intentionally take a few seconds each and
hammer the CPU\disk, so people would probably not run the entire suite
themselves.

[1] https://wiki.jenkins-ci.org/display/JENKINS/Performance+Plugin
- Original Message -
 From: ybronhei ybron...@redhat.com
 To: Francesco Romani from...@redhat.com, vdsm-devel 
 vdsm-devel@lists.fedorahosted.org
 Sent: Monday, March 17, 2014 1:57:34 PM
 Subject: Re: [vdsm] Profiling and benchmarking VDSM
 
 On 03/17/2014 01:03 PM, Francesco Romani wrote:
  - Original Message -
  From: Francesco Romani from...@redhat.com
  To: Antoni Segura Puimedon asegu...@redhat.com
  Cc: vdsm-devel vdsm-devel@lists.fedorahosted.org
  Sent: Monday, March 17, 2014 10:32:40 AM
  Subject: Re: [vdsm] Profiling and benchmarking VDSM
 
  next immediate steps will be
 
  - have a summary page to collect all performance/profiling/benchmarking
  page
 
  Links added at the bottom of the VDSM developer page:
  http://www.ovirt.org/Vdsm_Developers
  see item #15
 http://www.ovirt.org/Vdsm_Developers#Performance_and_scalability
 
 
  - document and detail the scenarios the way you described (which I like)
  the benchmark templates will be attached/documented on this page
 
  Started to sketch our Monday Morning test scenario here
  http://www.ovirt.org/VDSM_benchmarks
 
  (yes, looks quite ugly, no attached template yet. Will add).
 
  I'll wait a few hours to let things cool down a bit and see if something
  is missing, then start with the benchmarks using the new, proper
  definitions
  and a more structured approach like the one documented on the wiki.
 
  http://gerrit.ovirt.org/#/c/25678/ is the first in queue.
 
 Can we add the profiling decorator on each nose test function and share a
 results link with each push to gerrit?
 The issue is that it collects a profile only for one function in a file;
 we need somehow to integrate all the outputs.
 
 The nose tests might be a good way to check the profiling status; they
 should cover most of the flows (especially if we'll enforce adding unit
 tests for each new change).
 
 --
 Yaniv Bronhaim.


Re: [vdsm] Profiling and benchmarking VDSM

2014-03-18 Thread Saggi Mizrahi


- Original Message -
 From: Francesco Romani from...@redhat.com
 To: vdsm-devel vdsm-devel@lists.fedorahosted.org
 Cc: ybronhei ybron...@redhat.com, Saggi Mizrahi smizr...@redhat.com
 Sent: Tuesday, March 18, 2014 12:47:55 PM
 Subject: Re: [vdsm] Profiling and benchmarking VDSM
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Francesco Romani from...@redhat.com
  Cc: vdsm-devel vdsm-devel@lists.fedorahosted.org, ybronhei
  ybron...@redhat.com
  Sent: Tuesday, March 18, 2014 10:18:16 AM
  Subject: Re: [vdsm] Profiling and benchmarking VDSM
  
  Thank you for taking the initiative.
   Just reminding you that the test framework is owned by infra, so don't
   forget to put Yaniv and me in the CC for all future correspondence
   regarding this feature, as I will be the one responsible for the final
   approval.
 
 Yes, of course I will.
 At the moment I'm using unofficial/out-of-tree decorators and support code,
 just because I've only just started the exploration and the work.
 In the meantime, we can and should discuss the better/long-term/official
 approach to measuring performance and benchmarking things.
 
  Ignore http://www.ovirt.org/Vdsm_Developers#Performance_and_scalability
 
 Not sure I understood correctly. You mean I should drop my additions to the
 Vdsm_Developers page?
Don't drop it, just don't have it as a priority over actual work.
I'd much rather have benchmarks and no WIKI than the other way around. :)
 
   Also, we don't want to do it per test: it's meaningless for most tests,
   since they only run through the code once.
   
   I started investigating how we want to solve this issue a while back,
   and this is what I came up with.
  
   What we need to do is create a decorator that wraps the test with cProfile.
   We also want to create a generator that takes its configuration from nose:
  
   def BenchmarkIter():
       start = time.time()
       i = 0
       while i < MIN_ITERATIONS or (time.time() - start) < MIN_TIME_RUNNING:
           yield i
           i += 1
  
  So that writing a benchmark is just:
  
   @benchmark([min_iter[, min_time_running]])
   def testSomething(self):
       something()
  
  That way we are sure we have a statistically significant sample for all
  tests.
 
 Agreed
 
   There will need to be a plugin created for nose that skips @benchmark if
   benchmarks are not turned on and that can generate output for the Jenkins
   performance plugin[1]. That way we can run it every night, as the
   benchmarks will be slow to run: they will intentionally take a few
   seconds each and hammer the CPU\disk, so people would probably not run
   the entire suite themselves.
  
  [1] https://wiki.jenkins-ci.org/display/JENKINS/Performance+Plugin
 
 This looks very nice.
 
 Thanks and bests,
 
 --
 Francesco Romani
  Red Hat Engineering Virtualization R & D
 Phone: 8261328
 IRC: fromani
 


Re: [vdsm] The new GIL in python 3.2+

2014-03-16 Thread Saggi Mizrahi
It's a very interesting read; I ask everyone from infra to read it, and I
recommend others set aside a few minutes to read it as well.

To give the VDSM POV: even though it makes things faster, it still doesn't
solve the IO issues, since those are caused by a mix of kernel issues
(D state) and Python library implementations, namely not releasing the GIL,
whether by mistake, for optimization's sake, or because an underlying C
implementation just isn't thread safe.

It is an interesting read about how locking policy affects speed even with a
single Lock.

It's good to remember this was done without affecting how people write
Python code at all.

Thanks Francesco for sending it.

- Original Message -
 From: Francesco Romani from...@redhat.com
 To: vdsm-devel vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, March 13, 2014 10:33:34 AM
 Subject: [vdsm] The new GIL in python 3.2+
 
 Hi everyone
 
  Some time ago I found this very good presentation about the improvements
  made in python 3.2+ to the GIL (which unfortunately is still there... I
  think we need pypy to get rid of it):
 
 http://www.dabeaz.com/python/NewGIL.pdf
 
 
 --
 Francesco Romani
  Red Hat Engineering Virtualization R & D
 Phone: 8261328
 IRC: fromani


Re: [vdsm] suggested patch for python-pthreading

2014-02-04 Thread Saggi Mizrahi


- Original Message -
 From: Dan Kenigsberg dan...@redhat.com
 To: Yaniv Bronheim ybron...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Saggi 
 Mizrahi smizr...@redhat.com
 Sent: Tuesday, February 4, 2014 12:20:52 PM
 Subject: Re: suggested patch for python-pthreading
 
 On Tue, Feb 04, 2014 at 04:04:37AM -0500, Yaniv Bronheim wrote:
   According to coredumps we found in the scope of the bug [1], we opened
   [2], which suggests overriding python's implementation of
   thread.allocate_lock. In each coredump we saw a few threads stuck with
   this backtrace:
  
  #16 0x7fcb69288c93 in PyEval_CallObjectWithKeywords (func=0x2527820,
   arg=0x7fcb6972f050, kw=<value optimized out>) at Python/ceval.c:3663
  #17 0x7fcb692ba7ba in t_bootstrap (boot_raw=0x250a820) at
  Modules/threadmodule.c:428
  #18 0x7fcb68fa3851 in start_thread (arg=0x7fcb1bfff700) at
  pthread_create.c:301
  #19 0x7fcb6866694d in clone () at
  ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
  
  in pystack the threads were stuck in  /usr/lib64/python2.6/threading.py
  (513): __bootstrap_inner
  
  in bootstrap_inner we use thread.allocate_lock which python-pthreading does
  not override.
  
  we suggest the following commit:
  
  From 9d89e9be1a379b3d93b23dd54a381b9ca0973ebc Mon Sep 17 00:00:00 2001
  From: Yaniv Bronhaim ybron...@redhat.com
  Date: Mon, 3 Feb 2014 19:24:30 +0200
  Subject: [PATCH] Mocking thread.allocate_lock with Lock imp
  
  Signed-off-by: Yaniv Bronhaim ybron...@redhat.com
  ---
   pthreading.py | 4 
   1 file changed, 4 insertions(+)
  
  diff --git a/pthreading.py b/pthreading.py
  index 916ca7f..96df42c 100644
  --- a/pthreading.py
  +++ b/pthreading.py
  @@ -132,6 +132,10 @@ def monkey_patch():
   Thus, Queue and SocketServer can easily enjoy them.
   
  
  +import thread
  +
  +thread.allocate_lock = Lock
  +
   import threading
  
   threading.Condition = Condition
  --
  1.8.3.1
  
  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1022036
  [2] https://bugzilla.redhat.com/show_bug.cgi?id=1060749
 
  It makes sense to use pthreading.Lock for thread.allocate_lock instead of
  the standard threading.Lock CPU hog. However, I do not understand its
  relevance to the deadlock cited above: pthreading.Lock fixes performance
  issues, but not correctness issues, of threading.Lock.
 
 Would you explain, in the commit message of the pthreading patch, why
 you believe that the implementation of thread.allocate_lock() is buggy?
 Do you know if the bug is fixed in Python 3?
 
 Regards,
 Dan.
 

We actually don't have concrete proof, as we can't reproduce the bug, so we
can't test this. We are shooting in the dark hoping something hits.
We assume the bug is there since all of our coredumps have a thread stuck
acquiring the limbo lock. Since mixing lock implementations is probably a
bad idea, and overriding this lock is something we should do anyway, we
thought we'd give it a go. If VDSM gets stuck again, we will have
another coredump that we can compare to the others.
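For context, this is how a process opts in to the patched primitives (a
sketch based on pthreading's documented usage; with the patch above,
thread.allocate_lock would be overridden as well):

    # monkey_patch() must run before anything imports threading or Queue,
    # so the patched Condition/Lock/Event get picked up everywhere
    import pthreading
    pthreading.monkey_patch()

    import threading  # now backed by pthreading's primitives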


Re: [vdsm] API.py | gerrit.ovirt Code Review

2013-10-20 Thread Saggi Mizrahi

- Original Message -
 From: Doron Fediuck dfedi...@redhat.com
 To: Vinzenz Feenstra eviliss...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Friday, October 18, 2013 10:17:55 AM
 Subject: Re: [vdsm] API.py | gerrit.ovirt Code Review
 
 
 
 - Original Message -
  From: Vinzenz Feenstra eviliss...@redhat.com
  To: vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, October 17, 2013 1:22:48 PM
  Subject: Re: [vdsm] API.py | gerrit.ovirt Code Review
  
  On 10/17/2013 08:43 AM, Doron Fediuck wrote:
   http://gerrit.ovirt.org/#/c/20126/4/vdsm/API.py
  
   Dan,
   just a general design question.
  
   The above will report the HA score to the engine.
   I suspect that in the next versions we'll extend the
   HA integration for other operations, such as shutting down HA.
  
   So going forward I think we'll need something like vdsm/momIF.py
   to stabilize this integration.
  
   What do you think?
  
   I think if you already know that you'll be extending this, it'd be nicer
   to start adding this to a new module where you can keep everything
   related to it together, rather than extending bits all over the place
   and having these conditional imports everywhere.
In general we want to get rid of API.py in favor of subsystem-specific
classes, so removing things from API.py and moving them to other files is
the recommended course of action.
  
  
  
  --
  Regards,
  
  Vinzenz Feenstra | Senior Software Engineer
   Red Hat Engineering Virtualization R & D
  Phone: +420 532 294 625
  IRC: vfeenstr or evilissimo
  
  Better technology. Faster innovation. Powered by community collaboration.
  See how it works at redhat.com
  
 
 The idea is to keep what we now have, and when we need to extend it we'll
 replace the import with an interface, the same way mom has.
 The only question here is, design-wise, whether such an interface will
 make sense.


Re: [vdsm] vdsm sync meeting - October 7th 2013

2013-10-08 Thread Saggi Mizrahi


- Original Message -
 From: Oved Ourfalli ov...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Dan Kenigsberg dan...@redhat.com, dc...@redhat.com, VDSM Project 
 Development
 vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, October 8, 2013 11:42:23 AM
 Subject: Re: [vdsm] vdsm sync meeting - October 7th 2013
 
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Dan Kenigsberg dan...@redhat.com
  Cc: dc...@redhat.com, VDSM Project Development
  vdsm-devel@lists.fedorahosted.org
  Sent: Monday, October 7, 2013 5:42:54 PM
  Subject: Re: [vdsm] vdsm sync meeting - October 7th 2013
  
  
  
  - Original Message -
   From: Dan Kenigsberg dan...@redhat.com
   To: VDSM Project Development vdsm-devel@lists.fedorahosted.org,
   dc...@redhat.com
   Sent: Monday, October 7, 2013 5:25:22 PM
   Subject: [vdsm] vdsm sync meeting - October 7th 2013
   
    We had an unpleasant talk, hampered by static and disconnections on
    danken's side. Beyond the noise I've managed to recognize Yaniv, Toni,
    Douglas, Danken, Ayal, Timothy, Yeela and Mooli. We've managed to
    discuss:
   
    - vdsm-4.13.0 is tagged, with a known selinux issue on el6. Expect a new
  selinux-policy solving it any time soon.
   
   - All bugfixes should be backported to ovirt-3.3, so that we have a
 stable and comfortable vdsm in ovirt-3.3.1. Risky changes and new
 features should remain in master IMO.
   
    - We incorporated a glusterfs requirement breaking rpm installation for
  people. We should avoid that by posters notifying reviewers more
  prominently and by having
  http://jenkins.ovirt.org/job/vdsm_install_rpm_sanity_gerrit/
  run on every patch that touches vdsm.spec.in.
   
 David, could you make the adjustment to the job?
   
    - We discussed feature negotiation: Toni and Dan liked the idea of
  having vdsm expose feature flags, to make it easier on Engine to
  check if a certain feature is supported.
    
  Ayal argues that this is useful only for capabilities that depend on
  the existence of lower-level components, and sees little value in fine
  feature granularity on the vdsm side; versions are enough.
   
 
 Versions might not be enough here, as some features might be supported by
 VDSM version X, but not when it is installed under operating system Y.
 IMO, VDSM should reflect that when reporting the features.
 
 So the disputed question is only how many feature flags we should
 have, and when to set them: statically or based on negotiation with
 kernel/libvirt/gluster/what not.
   I already voiced my reservation over the entire concept
   of feature flags.
   I propose we instead move to specific introspective verbs
   maintained in each subsystem.
  
   Have vdsm.getAvailableStorageDomainTypes() -> ['gluster']
   
   instead of vdsm.getFeatures() -> ['storagetype/gluster']
  
   It allows for a much higher level of flexibility, as the aforementioned
   verb can also return other information about the domain type.
   For example, returning each domain type with parameter information:
  {'nfs': {'connect_params': [
  {'name': 'timeout',
   'type': 'int',
   'range': [0, 99],
   'desc': 'Sets the timeout',
  
  So even parameters can potentially be introspected.
  
 
  IMO it is great to have a verb per domain (e.g. network, storage, virt,
  etc.), as it allows getting deeper information about features.
  However, it does not conflict with having a single general getFeatures
  verb. Such a verb can be useful in cases where you don't really need more
  information, for example when establishing feature negotiation between
  the engine and VDSM.
No one is talking about feature negotiation. It's feature reporting.
And all I'm saying is that having a verb reporting unrelated things in
unrelated formats is usually a bad idea.

How would features be represented? Strings? FQDNs? Objects of different
types? If it's a string, how would the user know how features depend on each
other? How granular should this be? How do we change granularity in the
future?

We must have verbs with clear scope. Anyone can tell what
GetServerConnectionTypes() needs to return. We know what its granularity is.
We know how it relates to other things. We know what flows need to check it
and how it might affect them.

I have no idea what getFeatures() even means.
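To make the contrast concrete, here is a hypothetical sketch of the scoped,
introspective verb shape (my illustration; the names and fields are made up,
completing the elided example above):

    def getAvailableStorageDomainTypes():
        # each domain type carries its own introspectable parameters
        return {
            'nfs': {'connect_params': [
                {'name': 'timeout',
                 'type': 'int',
                 'range': [0, 99],
                 'desc': 'Sets the timeout'},
            ]},
            'gluster': {'connect_params': []},
        }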


 If you find out that a specific feature is supported, and
 you would like to get more details, such as parameter information, you would
 query specifically for that.
 
   
   - Unified network persistence patches are being merged into master
   
   - Timothy is working on fixing
 http://jenkins.ovirt.org/job/vdsm_verify_error_codes/lastBuild/console
 (hopefully by introducing the new error codes to Engine)
   
   I was dropped from the call, so please append with stuff that I've
   missed. Sorry for the noise!
   
   Dan.

Re: [vdsm] vdsm sync meeting - October 7th 2013

2013-10-08 Thread Saggi Mizrahi


- Original Message -
 From: Oved Ourfalli ov...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: dc...@redhat.com, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, October 8, 2013 4:15:22 PM
 Subject: Re: [vdsm] vdsm sync meeting - October 7th 2013
 
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Oved Ourfalli ov...@redhat.com
  Cc: dc...@redhat.com, VDSM Project Development
  vdsm-devel@lists.fedorahosted.org
  Sent: Tuesday, October 8, 2013 4:08:12 PM
  Subject: Re: [vdsm] vdsm sync meeting - October 7th 2013
  
  
  
  - Original Message -
   From: Oved Ourfalli ov...@redhat.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: Dan Kenigsberg dan...@redhat.com, dc...@redhat.com, VDSM Project
   Development
   vdsm-devel@lists.fedorahosted.org
   Sent: Tuesday, October 8, 2013 11:42:23 AM
   Subject: Re: [vdsm] vdsm sync meeting - October 7th 2013
   
   
   
    [...]

   
    IMO it is great to have a verb per domain (e.g. network, storage, virt,
    etc.), as it allows getting deeper information about features.
    However, it does not conflict with having a single general getFeatures
    verb.
    Such a verb can be useful in cases where you don't really need more
    information, for example when establishing feature negotiation between
    the engine and VDSM.
  No one is talking about feature negotiation. It's feature reporting.
 
 You're right. My bad. Feature reporting is the right terminology here.
 
 
  And all I'm saying is that having a verb reporting unrelated things in
  unrelated formats is usually a bad idea.
  
  How would features be represented? Strings? FQDNs? Objects of different
  types?
  If it's a string, how would the user know how features depend on each
  other?
  How granular should this be? How do we change granularity

Re: [vdsm] vdsm sync meeting - October 7th 2013

2013-10-07 Thread Saggi Mizrahi


- Original Message -
 From: Dan Kenigsberg dan...@redhat.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
 dc...@redhat.com
 Sent: Monday, October 7, 2013 5:25:22 PM
 Subject: [vdsm] vdsm sync meeting - October 7th 2013
 
 We had an unpleasant talk, hampered by static and disconnections on
 danken's side. Beyond the noise I've managed to recognize Yaniv, Toni,
 Douglas, Danken, Ayal, Timothy, Yeela and Mooli. We've managed to discuss:
 
 - vdsm-4.13.0 is tagged, with a known selinux issue on el6. Expect a new
   selinux-policy solving it any time soon.
 
 - All bugfixes should be backported to ovirt-3.3, so that we have a
   stable and comfortable vdsm in ovirt-3.3.1. Risky changes and new
   features should remain in master IMO.
 
 - We incorporated a glusterfs requirement breaking rpm installation for
   people. We should avoid that by posters notifying reviewers more
   prominently and by having
   http://jenkins.ovirt.org/job/vdsm_install_rpm_sanity_gerrit/
   run on every patch that touches vdsm.spec.in.
 
   David, could you make the adjustment to the job?
 
 - We discussed feature negotiation: Toni and Dan liked the idea of
   having vdsm expose feature flags, to make it easier on Engine to
   check if a certain feature is supported.
 
   Ayal argues that this is useful only for capabilities that depend on
   the existence of lower-level components, and sees little value in fine
   feature granularity on the vdsm side; versions are enough.
 
   So the disputed question is only how many feature flags we should
   have, and when to set them: statically or based on negotiation with
   kernel/libvirt/gluster/what not.
I already voiced my reservation over the entire concept
of feature flags.
I propose we instead move to specific introspective verbs
maintained in each subsystem.

Have vdsm.getAvailableStorageDomainTypes() -> ['gluster']

instead of vdsm.getFeatures() -> ['storagetype/gluster']

It allows for a much higher level of flexibility, as the aforementioned verb
can also return other information about the domain type.
For example, returning each domain type with parameter information:
{'nfs': {'connect_params': [
{'name': 'timeout',
 'type': 'int',
 'range': [0, 99],
 'desc': 'Sets the timeout',

So even parameters can potentially be introspected.

 
 - Unified network persistence patches are being merged into master
 
 - Timothy is working on fixing
   http://jenkins.ovirt.org/job/vdsm_verify_error_codes/lastBuild/console
   (hopefully by introducing the new error codes to Engine)
 
 I was dropped from the call, so please append with stuff that I've
 missed. Sorry for the noise!
 
 Dan.


Re: [vdsm] [Engine-devel] Adding vdsm_api support for gluster vdsm verbs

2013-04-17 Thread Saggi Mizrahi
This is very Gluster-specific, but I guess it's OK until I get some time to
make things a bit more generic over there.
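For what it's worth, the alternate approach below could look roughly like
this (a sketch; the directory path and function name are made up):

    import glob
    import os

    SCHEMA_DIR = '/usr/share/vdsm/schema'  # hypothetical plugin dir

    def load_schemas():
        # read the core schema plus whatever schema files plugins
        # (e.g. vdsm-gluster) dropped into the directory
        schemas = []
        for path in sorted(glob.glob(os.path.join(SCHEMA_DIR, '*.json'))):
            with open(path) as f:
                schemas.append(f.read())
        return schemas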

- Original Message -
 From: Aravinda avish...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org, Saggi 
 Mizrahi smizr...@redhat.com, a...@us.ibm.com
 Cc: Dan Kenigsberg dan...@redhat.com, Sahina Bose sab...@redhat.com
 Sent: Wednesday, April 17, 2013 3:49:13 PM
 Subject: Re: [Engine-devel] [vdsm] Adding vdsm_api support for gluster vdsm 
 verbs
 
 [Adding Saggi, Adam, Dan, Sahina]
 
 On 04/16/2013 02:13 PM, Aravinda wrote:
  [Adding engine-devel]
 
  On 04/16/2013 02:10 PM, Aravinda wrote:
 
   vdsm/gluster is a vdsm plugin for gluster-related functionality. This
   functionality is available only when the vdsm-gluster package is
   installed, so the schema JSON of vdsm-gluster cannot be added to the
   same file (vdsm_api/vdsmapi-schema.json).
  
   It looks like vdsm_api does not provide plugin support. This patch adds
   functionality to vdsm_api to read vdsmapi-gluster-schema.json if
   available, but with this approach we need to edit the core vdsmapi.py
   file.
 
  http://gerrit.ovirt.org/#/c/13921/
 
  Alternate approach:
  We can have vdsm_api/plugins or vdsm_api/schema directory inside
  vdsm_api, so that we can modify vdsmapi.py to read all schema files
  from that dir. When vdsm-gluster package installed, it copies
  vdsmapi-gluster-schema.json into schema directory.
 
 
  --
  regards
  Aravinda
 
 
  On 04/15/2013 04:14 PM, Aravinda wrote:
  Hi,
 
  We are trying to add json rpc support for vdsm gluster verbs. I
  submitted a patch to read gluster verbs schema from vdsm/gluster
  directory.
  http://gerrit.ovirt.org/#/c/13921/
 
  Let me know if the approach is fine.
 
  --
  regards
  Aravinda


Re: [vdsm] [Engine-devel] Proposal VDSM => Engine Data Statistics Retrieval Optimization

2013-03-13 Thread Saggi Mizrahi
I am completely against this.
It makes the return value differ according to the input, which
is a big no-no when talking about type-safe APIs.

The only reason we have this problem is because there is this
thing against making multiple calls.

Just split it up.
getVmRuntimeStats() -> transient things like mem and cpu%
getVmInformation() -> (semi)static things like disk\networking layout etc.
Each updated at different intervals.
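A rough sketch of the proposed split (my illustration; the verb names follow
the ones above, the payloads are abbreviated):

    def getVmRuntimeStats(vmId):
        # transient data, polled frequently
        return {'cpuUser': 1.34, 'cpuSys': 2.32, 'memUsage': 30}

    def getVmInformation(vmId):
        # (semi)static data, polled rarely
        return {'vmType': 'kvm', 'displayType': 'qxl',
                'displayPort': 5902, 'netIfaces': []}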

- Original Message -
 From: Vinzenz Feenstra vfeen...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org
 Sent: Thursday, March 7, 2013 6:25:54 AM
 Subject: [Engine-devel] Proposal VDSM => Engine Data Statistics Retrieval
 Optimization
 
 
 Please find the prettier version on the wiki:
 http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
 
 Proposal VDSM - Engine Data Statistics Retrieval
 VDSM => Engine data retrieval optimization
 Motivation:
 
 
 Currently the RHEVM engine is polling a lot of data from VDSM
 every 15 seconds. This should be optimized, and the amount of data
 requested should be more specific.
 
 For each VM the data currently contains much more information than
 actually needed, which blows up the size of the XML content. We could
 optimize this by splitting the getVmStats reply into sections based on
 the request of the engine. For this reason Omer Frenkel and I have
 split up the data into parts based on their usage.
 
 This data can and usually does change during the lifetime of the VM.
 Rarely Changed:
 
 
 This data changes infrequently, and it should be enough to
 update it only once in a while. Most commonly this data changes
 after changes made in the UI or after a migration of the VM to
 another Host.
 
 Status = Running
 acpiEnable = true
 vmType = kvm
 guestName = W864GUESTAGENTT
 displayType = qxl
 guestOs = Win 8
 kvmEnable = true # this should be constant and never changed
 pauseCode = NOERR
 monitorResponse = 0
 session = Locked # unused
 netIfaces = [{'name': 'Realtek RTL8139C+ Fast Ethernet NIC',
   'inet6': ['fe80::490c:92bb:bbcc:9f87'], 'inet': ['10.34.60.148'],
   'hw': '00:1a:4a:22:3c:db'}]
 appsList = ['RHEV-Tools 3.2.4', 'RHEV-Agent64 3.2.3', 'RHEV-Serial64 3.2.3',
   'RHEV-Network64 3.2.2', 'RHEV-Network64 3.2.3', 'RHEV-Block64 3.2.3',
   'RHEV-Balloon64 3.2.3', 'RHEV-Balloon64 3.2.2', 'RHEV-Agent64 3.2.2',
   'RHEV-USB 3.2.3', 'RHEV-Block64 3.2.2', 'RHEV-Serial64 3.2.2']
 pid = 11314
 guestIPs = 10.34.60.148 # duplicated info
 displayIp = 0
 displayPort = 5902
 displaySecurePort = 5903
 username = user@W864GUESTAGENTT
 clientIp =
 lastLogin = 1361976900.67
 
 Often Changed:
 
 
 This data changes quite often; however, it is not necessary to
 update it every 15 seconds. As this is cumulative data that
 reflects the current status, it does not need to be snapshotted
 every 15 seconds to retrieve statistics. The data can be retrieved
 in much more generous time slices (e.g. every 5 minutes).
 
 network = {'vnet1': {'macAddr': '00:1a:4a:22:3c:db', 'rxDropped': '0',
   'txDropped': '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0',
   'txErrors': '0', 'state': 'unknown', 'speed': '100', 'name': 'vnet1'}}
 disksUsage = [{'path': 'c:\\', 'total': '64055406592', 'fs': 'NTFS',
   'used': '19223846912'}, {'path': 'd:\\', 'total': '3490912256',
   'fs': 'UDF', 'used': '3490912256'}]
 timeOffset = 14422
 elapsedTime = 68591
 hash = 2335461227228498964
 statsAge = 0.09 # unused
 
 Often Changed but unused:
 
 
 This data does not seem to be used in the engine at all. It is not
 even used in the data warehouse.
 
 memoryStats = {'swap_out': '0', 'majflt': '0', 'mem_free': '1466884',
   'swap_in': '0', 'pageflt': '0', 'mem_total': '2096736',
   'mem_unused': '1466884'}
 balloonInfo = {'balloon_max': 2097152, 'balloon_cur': 2097152}
 disks = {'vda': {'readLatency': '0', 'apparentsize': '64424509440',
   'writeLatency': '1754496', 'imageID':
   '28abb923-7b89-4638-84f8-1700f0b76482', 'flushLatency': '156549',
   'readRate': '0.00', 'truesize': '18855059456', 'writeRate': '952.05'},
   'hdc': {'readLatency': '0', 'apparentsize': '0', 'writeLatency': '0',
   'flushLatency': '0', 'readRate': '0.00', 'truesize': '0',
   'writeRate': '0.00'}}
 
 Very frequent updates needed by the webadmin portal:
 
 
 This data is mostly needed for the webadmin portal and might be
 required to be updated quite often. An exception here is the
 statsAge field, which seems to be unused by the Engine. This data
 could be requested every 15 seconds to keep things as they are now.
 
 cpuSys = 2.32
 cpuUser = 1.34
 memUsage = 30
 
 Proposed Solution for VDSM & Engine:
 
 
 We will introduce new optional parameters to getVmStats,
 getAllVmStats and list to allow a finer-grained specification of
 the data which should be included.
 
 Parameter: statsType = string (getVmStats, getAllVmStats only)
 Allowed values:
 
 * full (default to keep backwards compatibility)
 * app-list (Just send the application list)
 * rare (include everything from rarely changed to 

Re: [vdsm] [Engine-devel] Proposal VDSM => Engine Data Statistics Retrieval Optimization

2013-03-13 Thread Saggi Mizrahi


- Original Message -
 From: Ayal Baron aba...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: engine-de...@ovirt.org, vdsm-devel@lists.fedorahosted.org, Vinzenz 
 Feenstra vfeen...@redhat.com
 Sent: Wednesday, March 13, 2013 5:39:24 PM
 Subject: Re: [vdsm] [Engine-devel] Proposal VDSM => Engine Data Statistics
 Retrieval Optimization
 
 
 
 - Original Message -
  I am completely against this.
  It makes the return value differ according to the input, which
  is a big no-no when talking about type-safe APIs.
  
  The only reason we have this problem is because there is this
  thing against making multiple calls.
  
  Just split it up.
  getVmRuntimeStats() -> transient things like mem and cpu%
  getVmInformation() -> (semi)static things like disk\networking layout
  etc.
  Each updated at different intervals.
 
 +1 on splitting the data up into 2 separate API calls.
 You could potentially add a checksum (md5, or any other way) of the
 static data to getVmRuntimeStats and not even bother with polling
 the VmInformation if it hasn't changed.  Then you could poll the stats
 as often as you'd like and immediately see whether you also need
 to retrieve VmInfo or not (you rarely would).
+1 to Ayal's suggestion,
except that instead of the engine hashing the data, VDSM sends the
key, which is opaque to the engine.
This can be a local timestamp or a generation number.

But we might want to consider that once we add events, polling
becomes (much) less frequent, so maybe it'll be overkill.
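A sketch of the generation-number variant (my illustration; the names are
made up and getVmInformation is stubbed for brevity):

    def getVmRuntimeStats(vmId):
        # 'infoGeneration' is opaque to the engine; VDSM bumps it
        # whenever the (semi)static info changes
        return {'cpuUser': 1.34, 'memUsage': 30, 'infoGeneration': 7}

    def getVmInformation(vmId):  # stub for the (semi)static verb
        return {'vmType': 'kvm'}

    last_seen = {}

    def poll(vmId):
        stats = getVmRuntimeStats(vmId)
        if last_seen.get(vmId) != stats['infoGeneration']:
            info = getVmInformation(vmId)  # rarely taken; use as needed
            last_seen[vmId] = stats['infoGeneration']
        return stats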

 
  
  - Original Message -
   From: Vinzenz Feenstra vfeen...@redhat.com
   To: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org
   Sent: Thursday, March 7, 2013 6:25:54 AM
    Subject: [Engine-devel] Proposal VDSM => Engine Data Statistics
    Retrieval Optimization
   
   
   Please find the prettier version on the wiki:
   http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
   
    Proposal VDSM - Engine Data Statistics Retrieval
    VDSM => Engine data retrieval optimization
    Motivation:
    
    
    Currently the RHEVM engine is polling a lot of data from VDSM
    every 15 seconds. This should be optimized, and the amount of data
    requested should be more specific.
   
    [...]

Re: [vdsm] VDSM Repository Reorganization

2013-02-19 Thread Saggi Mizrahi


- Original Message -
 From: Federico Simoncelli fsimo...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Dan 
 Kenigsberg dan...@redhat.com, Vinzenz
 Feenstra vfeen...@redhat.com, Ayal Baron aba...@redhat.com, Adam 
 Litke a...@us.ibm.com
 Sent: Tuesday, February 19, 2013 11:27:59 AM
 Subject: Re: VDSM Repository Reorganization
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Adam Litke a...@us.ibm.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org,
  Dan Kenigsberg dan...@redhat.com, Vinzenz
  Feenstra vfeen...@redhat.com, Ayal Baron aba...@redhat.com,
  Federico Simoncelli fsimo...@redhat.com
  Sent: Monday, February 18, 2013 8:50:30 PM
  Subject: Re: VDSM Repository Reorganization
  
  - Original Message -
   From: Adam Litke a...@us.ibm.com
   To: Federico Simoncelli fsimo...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org,
   Dan Kenigsberg dan...@redhat.com, Saggi
   Mizrahi smizr...@redhat.com, Vinzenz Feenstra
   vfeen...@redhat.com, Ayal Baron aba...@redhat.com
   Sent: Tuesday, February 12, 2013 3:08:09 PM
   Subject: Re: VDSM Repository Reorganization
   
   On Mon, Feb 11, 2013 at 12:17:39PM -0500, Federico Simoncelli
   wrote:
    For some time now we have been discussing an eventual repository
    reorganization for vdsm. In fact I'm sure that we all experienced
    at least once the discomfort of having several modules scattered
    around the tree.
The main goal of the reorganization would be to place the
modules
in their proper location so that they can be used (imported)
without
any special change (or hack) even when the code is executed
inside
the development repository (e.g. tests).

Recently there has been an initial proposal about moving some
of
these modules:

http://gerrit.ovirt.org/#/c/11858/

That spawned an interesting discussion that must involve the
entire
community; in fact before starting any work we should try to
converge
on a decision for the final repository structure in order to
minimize
the discomfort for the contributors that will be forced to
rebase
their pending gerrit patches. Even if the full reorganization
won't
happen in a short time I think we should plan the entire
structure
now and then eventually move only few relevant modules to their
final
location.

To start the discussion I'm attaching here a sketch of the vdsm
repository structure that I envision:

.
|-- client
|   |-- [...]
|   `-- vdsClient.py
|-- common
|   |-- [...]
|   |-- betterPopen
|   |   `-- [...]
|   `-- vdsm
|   |-- [...]
|   `-- config.py
|-- contrib
|   |-- [...]
|   |-- nfs-check.py
|   `-- sos
|-- daemon
|   |-- [...]
|   |-- supervdsm.py
|   `-- vdsmd
`-- tool
|-- [...]
`-- vdsm-tool
   
   The schema file vdsmapi-schema.json (as well as the python module
   to parse it)
   are needed by the server and clients.  Initially I'd think it
   should be
   installed in 'common', but a client does not need things like
   betterPopen.  Any
   recommendation on where the schema/API definition should live?
  
   Well, they both should have the file, but when installed each should
   have its own version of the file, matching the installed version of
   the client or the server. This is so that vdsm-cli doesn't depend on
   vdsm or vice versa. You can't have them share the file, since if one
   is installed with a version of the schema where the schema syntax
   changed, the client\server will fail to parse the schema.
 
  I'm not sure what the purpose is of having different versions of the
  client/server on the same machine. The software repository is one and
  it should provide both (as they're built from the same source).
  This is the standard way of delivering client/server applications in
  all the distributions. We can change that, but we must have a good
  reason.
There isn't really a reason. But, as I said, you don't want them to
depend on each other or to have the schema in its own rpm.
This means that you have to distribute them separately.

I also want to allow updating the client on a host without updating
the server. This is because you may want to have a script that works
across the cluster without updating all the hosts.

Now, even though you will use only old methods, the schema itself
might become unparsable by old implementations.
 
  As for development, I think the least bad solution is to put it in
  contrib
  with symlinks that have relative paths.
  
   |--daemon
   |  |-- [...]
   |  `-- vdsmapi-schema.json -> ../contrib/vdsmapi-schema.json
   |--client
   |  |-- [...]
   |  `-- vdsmapi-schema.json -> ../contrib/vdsmapi-schema.json
   |--contrib
   |  |-- [...]
   |  `-- vdsmapi

Re: [vdsm] VDSM Repository Reorganization

2013-02-18 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Federico Simoncelli fsimo...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Dan 
 Kenigsberg dan...@redhat.com, Saggi
 Mizrahi smizr...@redhat.com, Vinzenz Feenstra vfeen...@redhat.com, 
 Ayal Baron aba...@redhat.com
 Sent: Tuesday, February 12, 2013 3:08:09 PM
 Subject: Re: VDSM Repository Reorganization
 
 On Mon, Feb 11, 2013 at 12:17:39PM -0500, Federico Simoncelli wrote:
  For some time now we have been discussing an eventual repository
  reorganization for vdsm. In fact I'm sure that we all experienced
  at least once the discomfort of having several modules scattered
  around the tree.
  The main goal of the reorganization would be to place the modules
  in their proper location so that they can be used (imported)
  without
  any special change (or hack) even when the code is executed inside
  the development repository (e.g. tests).
  
  Recently there has been an initial proposal about moving some of
  these modules:
  
  http://gerrit.ovirt.org/#/c/11858/
  
  That spawned an interesting discussion that must involve the entire
  community; in fact before starting any work we should try to
  converge
  on a decision for the final repository structure in order to
  minimize
  the discomfort for the contributors that will be forced to rebase
  their pending gerrit patches. Even if the full reorganization won't
  happen in a short time I think we should plan the entire structure
  now and then eventually move only few relevant modules to their
  final
  location.
  
  To start the discussion I'm attaching here a sketch of the vdsm
  repository structure that I envision:
  
  .
  |-- client
  |   |-- [...]
  |   `-- vdsClient.py
  |-- common
  |   |-- [...]
  |   |-- betterPopen
  |   |   `-- [...]
  |   `-- vdsm
  |   |-- [...]
  |   `-- config.py
  |-- contrib
  |   |-- [...]
  |   |-- nfs-check.py
  |   `-- sos
  |-- daemon
  |   |-- [...]
  |   |-- supervdsm.py
  |   `-- vdsmd
  `-- tool
  |-- [...]
  `-- vdsm-tool
 
 The schema file vdsmapi-schema.json (as well as the python module to
 parse it) is needed by the server and clients.  Initially I'd think it
 should be installed in 'common', but a client does not need things like
 betterPopen.  Any recommendation on where the schema/API definition
 should live?
Well, they both should have the file, but when installed each should have
its own version of the file, matching the installed version of the
client or the server. This is so that vdsm-cli doesn't depend on vdsm
or vice versa. You can't have them share the file, since if one is installed
with a version of the schema where the schema syntax changed, the
client\server will fail to parse the schema.

As for development, I think the least bad solution is to put it in contrib
with symlinks that have relative paths.
|--daemon
|  |-- [...]
|  `-- vdsmapi-schema.json -> ../contrib/vdsmapi-schema.json
|--client
|  |-- [...]
|  `-- vdsmapi-schema.json -> ../contrib/vdsmapi-schema.json
|--contrib
|  |-- [...]
|  `-- vdsmapi-schema.json
:
.

Git knows how to handle symlinks and symlinks are relative to the location of
the symlink.

We could also just pick either the daemon or the client folder, put the real
file there and a symlink in the other, but I feel it's like choosing which
one of your children is the main user of a schema file.
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 


Re: [vdsm] [Engine-devel] RFC: New Storage API

2013-01-22 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Shu Ming shum...@linux.vnet.ibm.com
 Cc: engine-devel engine-de...@ovirt.org, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, January 22, 2013 2:20:19 PM
 Subject: Re: [vdsm] [Engine-devel]  RFC: New Storage API
 
 On Tue, Jan 22, 2013 at 11:36:57PM +0800, Shu Ming wrote:
  2013-1-15 5:34, Ayal Baron:
  image and volume are overused everywhere and it would be extremely
  confusing to have multiple meanings to the same terms in the same
  system (we have image today which means virtual disk and volume
  which means a part of a virtual disk).
  Personally I don't like the distinction between image and volume
  done in ec2/openstack/etc seeing as they're treated as different
  types of entities there while the only real difference is
  mutability (images are read-only, volumes are read-write).
  To move to the industry terminology we would need to first change
  all references we have today to image and volume in the system (I
  would say also in ovirt-engine side) to align with the new
  meaning.
  Despite my personal dislike of the terms, I definitely see the
  value in converging on the same terminology as the rest of the
  industry but to do so would be an arduous task which is out of
  scope of this discussion imo (patches welcome though ;)
  
  Another distinction between OpenStack and oVirt is how
  Nova/ovirt-engine look upon storage systems. In OpenStack, a
  standalone storage service (Cinder) exports the raw storage block
  device to Nova. In oVirt, on the other hand, the storage system is
  tightly bound to the cluster scheduling system, which integrates the
  storage sub-system, the VM dispatching sub-system and the ISO image
  sub-system. This combination makes all of the sub-systems integrate
  into a whole that is easy to deploy, but it makes each sub-system more
  opaque and harder to reuse and maintain. This new storage API proposal
  gives us an opportunity to separate these sub-systems into new
  components that export better, loosely-coupled APIs to VDSM.
 
 A very good point and an important goal in my opinion.  I'd like to
 see
 ovirt-engine become more of a GUI for configuring the storage
 component (like it
 does for Gluster) rather than the centralized manager of storage.
  The clustered
 storage should be able to take care of itself as long as the peer
 hosts can
 negotiate the SDM role.
 
 It would be cool if someone could actually dedicate a
 non-virtualization host
 where its only job is to handle SDM operations.  Such a host could
 choose to
 only deploy the standalone HSM service and not the complete vdsm
 package.

OpenStack and oVirt have different architectures and goals.

Even though they are both marketed as IaaS solutions they are designed for
different purposes.

OpenStack is designed around the idea of simplifying the *development* and
*integration* of IaaS subsystems through standardization of interfaces. If you
design a system that requires access to some type of infrastructural resource
you can develop against the OpenStack API for that specific resource and you
can consume different underlying implementations of the subsystem.
Alternatively, if you are creating a new subsystem implementation, one of your
exposed APIs can be the appropriate OpenStack API.

In short, they are a group of loosely coupled services meant to be
replicated and distributed across a cluster. Everyone can create their own
implementations of the APIs.

oVirt is designed around the idea of simplifying the *management* of said 
infrastructure.

The ovirt-engine is the cluster manager and VDSM is the host manager. Every
host in the cluster has a host manager installed on it (VDSM) and some
(currently only 1) might have the cluster manager (ovirt-engine); they are
the effective brain. oVirt ideally only has managing entities. VDSM's APIs
delegate tasks that are in their scope to other subsystems, and the
subsystems have their own APIs. For VMs you have libvirt; for networking you
have the Linux management tools and maybe netcf; for policy we now have MOM;
for iSCSI we have iscsiadm, etc. The only odd one out is the image
provisioning subsystem, which I will get into, don't worry.

This means, if you didn't already gather, that no host managed by ovirt can
exist without either VDSM or the ovirt-engine living on it. That being said, I
am a huge proponent of making all subsystems optional, meaning you can have a
VDSM that doesn't have the libvirt or networking glue bits and just has the
storage and Gluster bits.

To put it simply, no host without a *managing* entity on it.

But, as you all have pointed out, VDSM is redundant. There is no reason why the
engine can't just directly ask libvirt to do things. There is no reason why we
can't make a general iSCSI management API and expose it on its own,
independent from other services. VDSM is a frankensteinesque abomination of
misplaced BL and pass-through APIs.

This is why everyone are 

Re: [vdsm] remote serial console via HTTP streaming handler

2013-01-15 Thread Saggi Mizrahi
Good to see my suggestion didn't fall on deaf ears.

- Original Message -
 From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Adam 
 Litke a...@us.ibm.com, Ayal Baron
 aba...@redhat.com, Barak Azulay bazu...@redhat.com, Dan Kenigsberg 
 dan...@redhat.com
 Sent: Tuesday, January 15, 2013 4:30:03 AM
 Subject: Re: remote serial console via HTTP streaming handler
 
 
 on 01/08/2013 04:10, Saggi Mizrahi wrote:
  The solution is somewhat elegant (and only ~150 LOC).
  That being said I still have 2 major problems with it:
  The simpler one is that it uses HTTP in a very non-standard manner;
  this can be easily solved by using websockets[2]. This is very close
  to what the patch already does and will make it follow some sort of a
  standard. This will also enable the Web UI to expose a console on
  newer browsers.
 Using WebSocket is a good idea.  I had a look at its standard
 (http://tools.ietf.org/html/rfc6455). The framing and the security
 model are not trivial to implement (compared to the existing patch,
 which enables HTTP to forward the PTY octet stream in full duplex).
 Luckily there are some open-source WebSocket implementations.
  The second, and the real reason I didn't put it just as a comment on
  the patch, is that using HTTP and POST %PATH to have only one
  listening socket for all VMs is completely different from the way we
  do VNC or SPICE.
  This means it kind of bypasses ticketing and any other mechanism we
  want to put on VM interfaces.
 I think HTTP digest authentication may be implemented in the current
 PTY
 forwarding patch to enable ticketing.
 
  The thing is, I really like it. I was suggesting that we extend
  this idiom
  to use for SPICE and VNC and tunneling it through a single
  http\websocket
  listener. So instead of making this work with the current methods
  make this
  the way to go.
 
  Using headers like:
  GET /VM/VM_ID/control HTTP/1.1
  Host: server.example.com
  Upgrade: websocket
  Ticket: TICKET
  Connection: Upgrade
  Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
  Sec-WebSocket-Protocol: [pty, vnc, spice]
  Sec-WebSocket-Version: 13
  Origin:http://example.com
 In the Spice official site, I see a demo project spice-html5 uses a
 WebSocket-Spice gateway to get the data. The Spice is tunneled in
 WebSocket between the client and the gateway. This is good for
 javascript running in browsers. If VDSM supports tunneling the PTY, VNC
 and Spice in WebSocket, writing viewers in browsers may be easier.
 
 A WebSocket proxy can also be implemented to support migration with
 PTY.
 The PTY data stream is VDSM <=> proxy <=> client. When migrating, VDSM
 sends this event to the proxy via AMQP, then shuts down the current
 WebSocket connection. The proxy can keep the connection with the
 client. After migration, the other VDSM sends this event to the proxy
 via AMQP, then the proxy establishes the WebSocket connection with VDSM
 and continues the forwarding.
 
 We can also connect two guests' serial ports by forwarding the data
 stream via this proxy back and forth, with support for migration as
 explained above. Furthermore, the proxy can expose the data stream over
 various plug-in protocols such as SOCKS, HTTP, SSH or telnet to various
 clients. For example, if the proxy provides SOCKS support, we can use
 socat as a SOCKS client to connect to the guest serial port and pipe
 the data to FD 0 and 1 of a process running on the host.
Also, I don't think it's usually such a problem to have the client change
servers, even if websockets are involved. It just means that the client
needs to be aware of the possibility of an extra layer.
 
 --
 Thanks and best regards!
 
 Zhou Zheng Sheng / 周征晟
 E-mail:zhshz...@linux.vnet.ibm.com
 Telephone: 86-10-82454397
 



Re: [vdsm] RFC: New Storage API

2013-01-14 Thread Saggi Mizrahi


- Original Message -
 From: Itamar Heim ih...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
 engine-devel engine-de...@ovirt.org
 Sent: Monday, January 14, 2013 6:18:13 AM
 Subject: Re: [vdsm] RFC: New Storage API
 
 On 12/04/2012 11:52 PM, Saggi Mizrahi wrote:
  I've been throwing a lot of bits out about the new storage API and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate about how the API looks and how you use it.
 
  First major change is in terminology, there is no longer a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  newcomers with a libvirt background.
 
  One other changes is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
 
 
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will be used to
  tell VDSM how to connect to the repo.
 
  disconnectStorageRepository(self, repoId):
 
 
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
 
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
 
  There are 4 major image operations:
 
 
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
 userData={}, options={}):
 
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want to base the new
  virtual disk on
  userData - optional data that will be attached to the new VD, could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
 
  returns the id of the new VD
 
  createSnapshot(targetRepoId, baseVirtualDiskId,
  userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new snapshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want
  to snapshot
  userData - optional data that will be attached to the new Snapshot,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
 
  returns the id of the new Snapshot
 
  copyImage(targetRepoId, imageId, baseImageId=None, userData={},
  options={})
  targetRepoId - The ID of a connected repo where the new image will
  be created
  imageId - The image you wish to copy
  baseImageId - if specified, the new image will contain only the
  diff between imageId and baseImageId.
 If None, the new image will contain all the bits of
 imageId. This can be used to copy partial parts of
 images for export.
  userData - optional data that will be attached to the new image,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
 
  return the Id of the new image. In case of copying an immutable
  image the ID will be identical to the original image as they
  contain the same data. However the user should not assume that and
  always use the value returned from the method.
 
  removeImage(repositoryId, imageId, options={}):
  repositoryId - The ID of a connected repo where the image to delete
  resides
  imageId - The id of the image you wish to delete.
 
 
  
  getImageStatus(repositoryId, imageId)
  repositoryId - The ID of a connected repo where the image to check
  resides
  imageId - The id of the image you wish to check.
 
  All operations return once the operation has been committed to
  disk, NOT when the operation actually completes.
  This is done so that:
  - Operations come to a stable state as quickly as possible.
  - In cases where there is an SDM, only a small portion of the
  operation actually needs to be performed on the SDM host.
  - No matter how many times the operation fails and on how many
  hosts, you can always resume the operation and choose when to do
  it.
  - You can stop an operation at any time and remove the resulting
  object, making a distinction between stop because the host is
  overloaded and I don't want that image
 
  This means that after calling any operation that creates a new
  image the user must then call getImageStatus() to check what is
  the status of the image.
  The status of the image can be either optimized, degraded, or
  broken.
  Optimized means
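
To make the commit-then-poll idiom above concrete, here is a rough
client-side sketch (Python; the vdsm client object and the string form of
the statuses are assumptions, while the verbs and the three states come
from the proposal itself):

import time

def create_disk(vdsm, repo_id, size):
    # createVirtualDisk returns once the operation is committed to disk,
    # not when it actually completes.
    disk_id = vdsm.createVirtualDisk(repo_id, size)
    # Poll until the image settles into one of the documented states.
    while True:
        status = vdsm.getImageStatus(repo_id, disk_id)
        if status == 'broken':
            raise RuntimeError('image %s is broken' % disk_id)
        if status in ('optimized', 'degraded'):
            return disk_id, status
        time.sleep(2)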

Re: [vdsm] API Documentation Since tag

2013-01-14 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Vinzenz Feenstra vfeen...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Friday, January 11, 2013 9:03:19 AM
 Subject: Re: [vdsm] API Documentation  Since tag
 
 On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote:
  Hi everyone,
  
  We are currently documenting the API in vdsmapi-schema.json
  I noticed that we have there documented when a certain element
  newly
  is introduced using the 'Since' tag.
  However I also noticed that we are not documenting when a field was
  newly added, nor do we update the 'since' tag.
  
  We should start documenting in what version we've introduced a
  field.
  A suggestion by saggi was to add to the comment for example:
  @since: 4.10.3
  
  What is your point of view on this?
 
 I do think it's a good idea to add this information.  How about
 supporting
 multiple Since lines in the comment like the following made up
 example:
 
 ##
 # @FenceNodePowerStatus:
 #
 # Indicates the power state of a remote host.
 #
 # @on:The remote host is powered on
 #
 # @off:   The remote host is powered off
 #
 # @unknown:   The power status is not known
 #
 # @sentient:  The host is alive and powered by its own metabolism
 #
 # Since: 4.10.0 - @FenceNodePowerStatus
 # Since: 10.2.0 - @sentient
 ##
I don't like the fact that both lines don't point to the same type of token.
I also don't like that it's a repeat of the type names and field names.

I prefer Vinzenz's original suggestion (on IRC) of moving the Since token up
and then having it be a state.  It also makes it easier to discern what
entities you can use up to a certain version, if you make sure to keep them
sorted.

We can do this because the order of the fields and availability is undetermined 
(unlike real structs).

##
# @FenceNodePowerStatus:
#
# Indicates the power state of a remote host.
#
# Since: 4.10.0
#
# @on:The remote host is powered on
#
# @off:   The remote host is powered off
#
# @unknown:   The power status is not known
#
# Since: 10.2.0
#
# @sentient:  The host is alive and powered by its own metabolism
#
##

The problem, though, is that it makes Since a property of the fields and not of
the struct. This isn't that much of a problem, as we can assume the earliest
version is the time when the struct was introduced.
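
For illustration, a hedged sketch of how a parser (the sort of change
process-schema.py would need) could treat Since: as a state that applies to
every field documented after it, defaulting the struct itself to the
earliest version seen:

def parse_since(comment_lines):
    struct_since = None   # earliest Since == when the struct appeared
    current = None        # version currently in effect
    fields = {}           # field name -> version it was introduced in
    for line in comment_lines:
        text = line.strip('# ').rstrip()
        if text.startswith('Since:'):
            current = text.split(':', 1)[1].strip()
            if struct_since is None:
                struct_since = current
        elif text.startswith('@') and ':' in text and current:
            fields[text[1:text.index(':')]] = current
    return struct_since, fields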

 
 Remember that any patch to change the schema format will require
 changes to
 process-schema.py as well.
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 


[vdsm] remote serial console via HTTP streaming handler

2013-01-07 Thread Saggi Mizrahi
I remember that there was a discussion about it but I
don't remember it ever converging.
In any case there is a patch upstream [1] that merits
discussion outside the scope of the patch and reviewers.

The solution is somewhat elegant (and only ~150 LOC).
That being said I still have 2 major problems with it:
The simpler one is that it uses HTTP in a very non-standard manner; this
can be easily solved by using websockets[2]. This is very close
to what the patch already does and will make it follow some sort of a
standard. This will also enable the Web UI to expose a console on newer
browsers.

The second, and the real reason I didn't put it just as a comment on the
patch, is that using HTTP and POST %PATH to have only one listening
socket for all VMs is completely different from the way we do VNC or SPICE.
This means it kind of bypasses ticketing and any other mechanism we want
to put on VM interfaces.
The thing is, I really like it. I was suggesting that we extend this idiom
to SPICE and VNC, tunneling them through a single http\websocket
listener. So instead of making this work with the current methods, make this
the way to go.

Using headers like:
GET /VM/VM_ID/control HTTP/1.1
Host: server.example.com
Upgrade: websocket
Ticket: TICKET
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: [pty, vnc, spice]
Sec-WebSocket-Version: 13
Origin: http://example.com

I admit I have no idea if migrating SPICE would like being tunneled but I
guess there is no practical reason why that would be a problem.


[1] http://gerrit.ovirt.org/#/c/10381
[2] http://en.wikipedia.org/wiki/WebSocket
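
For illustration, the client side of such a handshake could look roughly
like this (a sketch only; the /VM/VM_ID/control path and the Ticket header
are the proposal above, not an implemented interface):

import base64, os, socket

def open_vm_channel(host, port, vm_id, ticket, proto='pty'):
    # Random nonce required by the RFC 6455 opening handshake.
    key = base64.b64encode(os.urandom(16)).decode('ascii')
    request = ('GET /VM/%s/control HTTP/1.1\r\n'
               'Host: %s\r\n'
               'Upgrade: websocket\r\n'
               'Connection: Upgrade\r\n'
               'Ticket: %s\r\n'
               'Sec-WebSocket-Key: %s\r\n'
               'Sec-WebSocket-Protocol: %s\r\n'
               'Sec-WebSocket-Version: 13\r\n'
               '\r\n' % (vm_id, host, ticket, key, proto))
    sock = socket.create_connection((host, port))
    sock.sendall(request.encode('ascii'))
    # The caller must validate the 101 response and the
    # Sec-WebSocket-Accept header, then speak RFC 6455 framing.
    return sock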


Re: [vdsm] Managing async tasks

2012-12-17 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: vdsm-devel@lists.fedorahosted.org
 Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, 
 Saggi Mizrahi smizr...@redhat.com,
 Federico Simoncelli fsimo...@redhat.com, engine-de...@ovirt.org
 Sent: Monday, December 17, 2012 12:00:49 PM
 Subject: Managing async tasks
 
 On today's vdsm call we had a lively discussion around how
 asynchronous
 operations should be handled in the future.  In an effort to include
 more people
 in the discussion and to better capture the resulting conversation I
 would like
 to continue that discussion here on the mailing list.
 
 A lot of ideas were thrown around about how 'tasks' should be handled
 in the
 future.  There are a lot of ways that it can be done.  To determine
 how we
 should implement it, it's probably best if we start with a set of
 requirements.
 If we can first agree on these, it should be easy to find a solution
 that meets
 them.  I'll take a stab at identifying a first set of POSSIBLE
 requirements:
 
 - Standardized method for determining the result of an operation
 
   This is a big one for me because it directly affects the
   consumability of the
   API.  If each verb has different semantics for discovering whether
   it has
   completed successfully, then the API will be nearly impossible to
   use easily.
Since there is no way to be sure whether some tasks completed successfully or
failed, especially around the murky waters of storage, I say this requirement
should be removed, at least in the context of a task.
 
 
 Sorry.  That's my list :)  Hopefully others will be willing to add
 other
 requirements for consideration.
 
 From my understanding, task recovery (stop, abort, rollback, etc)
 will not be
 generally supported and should not be a requirement.
 
 
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 


Re: [vdsm] Managing async tasks

2012-12-17 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, engine-de...@ovirt.org, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Monday, December 17, 2012 2:16:25 PM
 Subject: Re: Managing async tasks
 
 On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
  
  
  - Original Message -
   From: Adam Litke a...@us.ibm.com To:
   vdsm-devel@lists.fedorahosted.org
   Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron
   aba...@redhat.com,
   Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli
   fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday,
   December 17,
   2012 12:00:49 PM Subject: Managing async tasks
   
   On today's vdsm call we had a lively discussion around how
   asynchronous
   operations should be handled in the future.  In an effort to
   include more
   people in the discussion and to better capture the resulting
   conversation I
   would like to continue that discussion here on the mailing list.
   
   A lot of ideas were thrown around about how 'tasks' should be
   handled in the
   future.  There are a lot of ways that it can be done.  To
   determine how we
   should implement it, it's probably best if we start with a set of
   requirements.  If we can first agree on these, it should be easy
   to find a
   solution that meets them.  I'll take a stab at identifying a
   first set of
   POSSIBLE requirements:
   
   - Standardized method for determining the result of an operation
   
 This is a big one for me because it directly affects the
 consumability of
 the API.  If each verb has different semantics for discovering
 whether it
 has completed successfully, then the API will be nearly
 impossible to use
 easily.
   Since there is no way to be sure whether some tasks completed
   successfully or failed, especially around the murky waters of storage,
   I say this requirement should be removed, at least in the context of
   a task.
 
  I don't agree.  Please feel free to convince me with some examples.
  If we
 cannot provide feedback to a user as to whether their request has
 been satisfied
 or not, then we have some bigger problems to solve.
Suppose VDSM sends a write command to a storage server, and the connection
hangs up before the ACK has returned.
The operation has been committed, but VDSM has no way of knowing that it
happened; as far as VDSM is concerned it got an ETIMEDOUT or EIO.
This is the same problem that the engine has with VDSM.
If VDSM creates an image\VM\network\repo but the connection hangs up before the
response can be sent back, then as far as the engine is concerned the operation
times out.
This is an inherent issue with clustering.
This is why I want to move away from tasks being *the* trackable objects.
Tasks should be short. As short as possible.
Run VM should just persist the VM information on the VDSM host and return. The 
rest of the tracking should be done using the VM ID.
Create image should return once VDSM has persisted the information about the
request on the repository and created the metadata files.
Tracking should be done on the repo or the imageId.
 
   
   
   Sorry.  That's my list :)  Hopefully others will be willing to
   add other
   requirements for consideration.
   
   From my understanding, task recovery (stop, abort, rollback, etc)
   will not
   be generally supported and should not be a requirement.
   
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 


Re: [vdsm] Managing async tasks

2012-12-17 Thread Saggi Mizrahi
This is an addendum to my previous email.

- Original Message -
 From: Saggi Mizrahi smizr...@redhat.com
 To: Adam Litke a...@us.ibm.com
 Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, engine-de...@ovirt.org, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Monday, December 17, 2012 2:52:06 PM
 Subject: Re: Managing async tasks
 
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron
  aba...@redhat.com, Federico Simoncelli
  fsimo...@redhat.com, engine-de...@ovirt.org,
  vdsm-devel@lists.fedorahosted.org
  Sent: Monday, December 17, 2012 2:16:25 PM
  Subject: Re: Managing async tasks
  
  On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
   
   
   - Original Message -
From: Adam Litke a...@us.ibm.com To:
vdsm-devel@lists.fedorahosted.org
Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron
aba...@redhat.com,
Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli
fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday,
December 17,
2012 12:00:49 PM Subject: Managing async tasks

On today's vdsm call we had a lively discussion around how
asynchronous
operations should be handled in the future.  In an effort to
include more
people in the discussion and to better capture the resulting
conversation I
would like to continue that discussion here on the mailing
list.

A lot of ideas were thrown around about how 'tasks' should be
handled in the
future.  There are a lot of ways that it can be done.  To
determine how we
should implement it, it's probably best if we start with a set
of
requirements.  If we can first agree on these, it should be
easy
to find a
solution that meets them.  I'll take a stab at identifying a
first set of
POSSIBLE requirements:

- Standardized method for determining the result of an
operation

  This is a big one for me because it directly affects the
  consumability of
  the API.  If each verb has different semantics for
  discovering
  whether it
  has completed successfully, then the API will be nearly
  impossible to use
  easily.
    Since there is no way to be sure whether some tasks completed
    successfully or failed, especially around the murky waters of storage,
    I say this requirement should be removed, at least in the context of
    a task.
  
   I don't agree.  Please feel free to convince me with some examples.
   If we
  cannot provide feedback to a user as to whether their request has
  been satisfied
  or not, then we have some bigger problems to solve.
  Suppose VDSM sends a write command to a storage server, and the
  connection hangs up before the ACK has returned.
  The operation has been committed, but VDSM has no way of knowing that
  it happened; as far as VDSM is concerned it got an ETIMEDOUT or EIO.
  This is the same problem that the engine has with VDSM.
  If VDSM creates an image\VM\network\repo but the connection hangs up
  before the response can be sent back, then as far as the engine is
  concerned the operation times out.
 This is an inherent issue with clustering.
 This is why I want to move away from tasks being *the* trackable
 objects.
 Tasks should be short. As short as possible.
 Run VM should just persist the VM information on the VDSM host and
 return. The rest of the tracking should be done using the VM ID.
  Create image should return once VDSM has persisted the information about
 the request on the repository and created the metadata files.
 Tracking should be done on the repo or the imageId.

The thing is that I know how long a VM object should live (or an Image object),
so tracking it is straightforward. How long a task should live is very
problematic and quite context-specific: it depends on what the task is.
I think it's quite confusing from an API standpoint to have every task have a
different scope, ID requirement and life-cycle.

VDSM has two types of APIs:

CRUD objects - VM, Image, Repository, Bridge, Storage Connections
General transient methods - getBiosInfo(), getDeviceList()

The latter are quite simple to manage. They don't need any special handling. If
you lose a getBiosInfo() call you just send another one, no harm done.
The same is even true for things that change the host, like getDeviceList().

What we are really arguing about is fitting the CRUD objects to some generic
task-oriented scheme.
I'm saying it's a waste of time, as you can quite easily have flows to recover
from each operation (a minimal sketch follows the list):

Create - Check if the object exists
Read - Read again
Update - either update again or read and update if update didn't commit the 
first time
Delete - Check if object doesn't exist
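
A minimal sketch of the Create recovery flow, assuming a hypothetical
client object and an exists-check verb (neither is part of the current
API, and passing the image ID up front is also an assumption):

def ensure_created(vdsm, repo_id, image_id, size):
    # If the call times out we don't know whether it committed;
    # check for the object before retrying instead of tracking a task.
    try:
        # imageId passed up front is an assumption, so the check below
        # has a key to look for.
        vdsm.createVirtualDisk(repo_id, size, imageId=image_id)
    except TimeoutError:
        if not vdsm.imageExists(repo_id, image_id):
            raise  # nothing was committed, safe to retry from scratch
    return image_id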

Each of the objects we CRUD has different life-cycles and ownership semantics.

Danken raised the point that creation has

[vdsm] [Draft]Task Management API

2012-12-17 Thread Saggi Mizrahi
Dan rightly suggested that I be more specific about what the task system is
instead of what it isn't.

The problem is that I'm not completely sure how it's going to work.
It also depends on the events mechanism.
This is my current working draft:


TaskInfo:
id string
methodName string
kwargs json-object (string keys variant values) *filtered to remove sensitive
 information

getRunningTasks(filter string, filterType enum{glob, regexp})
Returns a list of TaskInfo for all tasks whose IDs match the filter


That's it, not even stopTask()

As explained, I would like to offload handling to the subsystems.
In order to make things easier for the clients, every subsystem can choose a
field of the object to be of type OperationInfo.
This is a generic structure that gives the user a uniform way to track tasks
across all subsystems through one reporting interface. The extraData field is
for subsystem-specific data. This is where the storage subsystem would put, for
example, imageState (broken, degraded, optimized) data.

OperationInfo:
operationDescription string - something out of an agreed enum of strings
  vaguely describing the operation at hand, for
  example Copying, Merging, Deleting,
  Configuring, Stopped, Paused, 
  They must be known to the client so it can in
  turn translate them in the UI. They also have to
  remain relatively vague as they are part of the
  interface, meaning that new values will break old
  clients, so they have to be reusable.
stageDescription - Similar to operationDescription in case you want more
   granularity; optional.
stage (int, int) - (5, 10) means 5 out of 10. 1 out of 1 tells the UI not to
   display stage widgets.
percentage - 0-100, -1 means unknown.
lastError - (code, message) the same errors that can return for regular calls
extraData - json-object
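
As a client might model the draft (a sketch only; the field names and
semantics are the ones listed above, the class itself is illustrative):

class OperationInfo(object):
    def __init__(self, operationDescription, stageDescription,
                 stage, percentage, lastError, extraData):
        self.operationDescription = operationDescription  # e.g. "Copying"
        self.stageDescription = stageDescription  # optional, finer detail
        self.stage = stage            # (current, total); (1, 1) hides stages
        self.percentage = percentage  # 0-100, -1 means unknown
        self.lastError = lastError    # (code, message)
        self.extraData = extraData    # free-form, subsystem-specific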


For example, createVM will return once the object is created in VDSM.
getVmInfo() would return, amongst other things, the operation info.
For the case of preparing for launch it will be:
  {Creating, configuring, (2, 4), 40, (0, ),
   {state=preparing for launch}}
In the case of VM paused on EIO:
  {Paused, Paused, (1, 1), -1, (123, Error writing to disks),
   {state=paused}}

Migration is a tricky one: it will be reported as a task while it's in progress,
but all the information is available on the image operationInfo.
In the case of Migration:
  {Migration, Configuring, (1, 3), -1, (0, ), {status=Migrating}}

For StorageConnection this is somewhat already the case, but in a simplified
version.

If you want to ask about any other operation I'd be more than happy to write my
suggestion for it.

Subsystems have complete freedom about how to set up the API.
For Storage you have Fixes() to start\stop operations.
Gluster is pretty autonomous once operations have been started.

Since operations return as soon as they are registered (persisted) or fail to
register, it makes synchronous programming a bit clunky.
vdsm.pauseVm(vmId) doesn't return when the VM is paused but when VDSM has
committed that it will try to pause it. This means you will have to poll in
order to see whether the operation finished. For Gluster, as an example, this
is the only way we can check that the operation finished.

For stuff we have a bit more control over, VDSM will fire events using json-rpc
notifications sent to the clients. They will be in the form of:
{method: alert, params: {
  alertName: subsystem(.objectType)?.object.(subobject., ...),
  operationInfo: OperationInfo}
}

The user can register to receive events using a glob or a regexp.
Registering to vdsm.VM.* fires every time any VM changes stage.
This means that whenever the task finishes, fails or makes significant
progress, and VDSM is there to track it, an event will be sent to the client.

This means that the general flow is:
# Register the operation
vmID = "best_vm"
host.vm.pauseVM(vmID)
while True:
    opInfo = None
    try:
        event = host.waitForEvent("vdsm.VM.best_vm", timeout=10)
        opInfo = event.opInfo
    except VdsmDisconnectionError:
        host.waitForReconnect()
        host.vm.getVmInfo(vmID)  # Double check that we didn't miss the event
        continue
    except Timeout:
        # This is a long operation; poll to see that we didn't miss any event
        # but, more commonly, update the percentage in the UI to show progress.
        vmInfo = host.vm.getVmInfo(vmID)
        opInfo = vmInfo.operationInfo

    if opInfo.stage.number != opInfo.stage.total:
        # Operation in progress
        updateUI(opInfo)
    else:
        # Operation completed
        # Check that the state is what we expected it to be.
        if opInfo.extraData.state == "paused":
            return SUCCESS
        else:
            return 

Re: [vdsm] blame and shame

2012-12-13 Thread Saggi Mizrahi
I kind of like the fact that I will not be blamed for all the stuff I broke. :(

- Original Message -
 From: Antoni Segura Puimedon asegu...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, December 13, 2012 10:34:52 AM
 Subject: [vdsm] blame and shame
 
 Hi list!
 
  Since I've lately been doing, and plan to continue doing, patches to
  improve pep8 compliance for the whole vdsm codebase, and a lot of that
  is E126, E127 and E128, which deal with whitespace, I have added to my
  ~/.gitconfig
 
 [alias]
 bl = blame -w
 
  Which ignores whitespace when blaming lines. This way, my name
  will not be shown next to code I don't know about ;-) Of course, it
  would be great if git blame were to be extended with pydiff so all the
  pep8 changes would be ignored for blaming purposes... But I'll leave
  that to someone else ;-)
 
 Best,
 
 Toni


Re: [vdsm] Request for consideration during the API revamp

2012-12-13 Thread Saggi Mizrahi
Since I assume vdsClient will use libvdsm, it should have all the constants
defined.

I do like Adam's suggestion about making vdsClient auto-generated as well.
vdsClient is currently very annoying to maintain.
I would also like to propose changing the name of the executable to vdsm_cli.
It would make it easier to distribute both tools, as vdsClient will still be
needed to communicate with old VDSMs.
Also, capital letters in executable names are not very Unixy.

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Vinzenz Feenstra vfeen...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, December 12, 2012 10:40:19 AM
 Subject: Re: [vdsm] Request for consideration during the API revamp
 
 On Wed, Dec 12, 2012 at 02:01:31PM +0100, Vinzenz Feenstra wrote:
  Hi,
  
  When there is the attempt to enhance/change the current API, I
  would
  ask you to consider to think also about the vdsClient use case.
  I haven't read anything regarding that so far and therefore I just
  want you to think about it as well.
  
  My expectation is that the vdsClient will continue to use the RPC
  interfaces, however since it is part of the VDSM project I think it
  would be a good idea if there is a way for both vdsmd and vdsClient
  to share constants used for the API.
  
  That in turn also should simplify the maintenance of vdsClient.
  Currently I see the constants used by both being defined on both
  sides and I am pretty sure that this could be improved.
  
  See this as just a thought on the whole redesign talk, but I would
  like to see this kind of use cases to be covered. :-)
 
 Yes, this is an excellent suggestion.  One thing I am thinking about
 doing is
 generating a new python file with the enums defined in the schema.
  This could
 be included by all server-side code and by clients such as vdsClient.
  If we
 decide to add constants to the schema file, we could also place these
 into the
 same generated python file.
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 


Re: [vdsm] Host bios information

2012-12-13 Thread Saggi Mizrahi


- Original Message -
 From: Ayal Baron aba...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Shu 
 Ming shum...@linux.vnet.ibm.com
 Sent: Thursday, December 13, 2012 11:30:56 AM
 Subject: Re: [vdsm] Host bios information
 
 
 
 - Original Message -
  I think that for the current XML-RPC API it's OK to add it to the
  getVdsCaps() verb.
  For the new API I suggest moving it to its own API. The smaller the
  APIs, the easier they are to deprecate and support.
  I quite doubt the fields in getBiosInfo() will change half as
  frequently as whatever getVdsCaps() returns.
  I also kind of want to throw away getVdsCaps() and split it into
  better-named, better-encapsulated methods.
 
 Ack.  I just don't understand why not start right now?
 Any new patch should improve things at least a little.
 We know getVdsCaps() is wrong so let's put the bios info (and
 anything in getVdsCaps that makes sense to put with it if relevant)
 in a separate call.  Adding a call in engine to this new method
 should be a no brainer, I don't think that is a good reason for not
 doing things properly in vdsm, even if we're talking about the
 current API.
Well, from what I know, the current overhead per call is too large to make a
lot of calls practical.
At least that is what I've been told. If that is not an issue, do it in the
XML-RPC API too.
 
  
  Also, in the json-rpc based model, calls are not only cheaper, you
  also have batch calls. This means you can send multiple requests as
  one message and have VDSM send you the responses as one message once
  all tasks have completed. This makes splitting aggregated methods into
  smaller methods painless, with minimal overhead.
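
  For illustration, a batch in the JSON-RPC 2.0 sense might look like
  this (the post-split method names are made up for the example):

  batch_request = [
      {'jsonrpc': '2.0', 'id': 1, 'method': 'Host.getBiosInfo'},
      {'jsonrpc': '2.0', 'id': 2, 'method': 'Host.getCpuInfo'},
      {'jsonrpc': '2.0', 'id': 3, 'method': 'Host.getNetworkCaps'},
  ]
  # Sent as a single message; the server replies with one message
  # holding all three responses once they have all completed.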
  
  - Original Message -
   From: Shu Ming shum...@linux.vnet.ibm.com
   To: ybronhei ybron...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org
   Sent: Thursday, December 13, 2012 11:04:09 AM
   Subject: Re: [vdsm] Host bios information
   
    After a quick review of the wiki page: it states that dmidecode
    gives too much information. Only five fields will be displayed in
    the hardware tab: Manufacturer, Version, Family, UUID and serial
    number. Family means the CPU core's family, and it confuses me a
    bit with the CPU name and CPU type fields in the general tab. I
    think we should choose the best one to characterize the CPU type.
   
   
   ybronhei:
Today in the API we display general information about the host that
vdsm exports via the getCapabilities API.

We decided to add BIOS information to the information that is
displayed in the UI under the host's general sub-tab.

To summarize the feature - we'll rename the General tab to Software
Information and add another tab for Hardware Information, which will
include all the BIOS data that we decide to gather from the host and
display.

Following this feature page:
http://www.ovirt.org/Features/Design/HostBiosInfo for more details.
All the parameters that can be displayed are mentioned in the wiki.

I would greatly appreciate your comments and questions.

Thanks.
   
   
   
   --
   ---
   舒明 Shu Ming
   Open Virtualization Engineerning; CSTL, IBM Corp.
   Tel: 86-10-82451626  Tieline: 9051626 E-mail: shum...@cn.ibm.com
   or
   shum...@linux.vnet.ibm.com
   Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
   District, Beijing 100193, PRC
   
   


Re: [vdsm] RFC: New Storage API

2012-12-10 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Deepak C Shetty deepa...@linux.vnet.ibm.com, engine-devel 
 engine-de...@ovirt.org, VDSM Project
 Development vdsm-devel@lists.fedorahosted.org
 Sent: Monday, December 10, 2012 1:49:31 PM
 Subject: Re: [vdsm] RFC: New Storage API
 
 On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:
 
 snip
 
   1) Can you provide more info on why there is an exception for 'lvm
   based block domain'. It's not coming out clearly.
  File based domains are responsible for syncing up object
  manipulation (creation\deletion).
  The backend is responsible for making sure it all works, either by
  having a single writer (NFS) or having its own locking mechanism
  (gluster).
  In our LVM based domains VDSM is responsible for basic object
  manipulation.
  The current design uses an approach where there is a single host
  responsible for object creation\deletion; it is the SRM\SDM\SPM\S?M.
  If we ever find a way to make it fully clustered without a big hit
  in performance, the S?M requirement will be removed from that type
  of domain.
 
 I would like to see us maintain a LOCALFS domain as well.  For this,
 we would
 also need SRM, correct?
No, why?
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 


Re: [vdsm] RFC: New Storage API

2012-12-10 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Shu Ming shum...@linux.vnet.ibm.com, engine-devel 
 engine-de...@ovirt.org, VDSM Project Development
 vdsm-devel@lists.fedorahosted.org
 Sent: Monday, December 10, 2012 4:47:46 PM
 Subject: Re: [vdsm] RFC: New Storage API
 
 On Mon, Dec 10, 2012 at 03:36:23PM -0500, Saggi Mizrahi wrote:
   Statements like this make me start to worry about your userData
   concept.  It's a
   sign of a bad API if the user needs to invent a custom metadata
   scheme for
   itself.  This reminds me of the abomination that is the 'custom'
   property in the
   vm definition today.
  In one sentence: If VDSM doesn't care about it, VDSM doesn't manage
  it.
  
  userData being a void* is quite common and I don't understand why
  you would think it's a sign of a bad API.
  Furthermore, giving the user a choice about how to represent its
  own metadata and what fields it wants to keep seems reasonable to
  me.
  Especially given the fact that VDSM never reads it.
  
  The reason we are pulling away from the current system of VDSM
  understanding the extra data is that it ties that data to VDSM's
  on-disk format.
  VDSM's on-disk format has to be very stable because of clusters with
  multiple VDSM versions.
  Furthermore, since this is actually manager data it has to be tied
  to the manager's backward compatibility lifetime as well.
  Having it be opaque to VDSM ties it to only one, simpler, support
  lifetime instead of two.
  
  I guess you are implying that it will make it problematic for
  multiple users to read userData left by another user because the
  formats might not be compatible.
  The solution is that all parties interested in using VDSM storage
  agree on a format, common fields, supportability, and all the
  other things that choosing and supporting *something* entails.
  This is, however, out of the scope of VDSM. When the time comes I
  think how the userData blob is actually parsed and what fields it
  keeps should be discussed on ovirt-devel or engine-devel.
  
  The crux of the issue is that VDSM manages only what it cares about,
  and the user can't modify that directly.
  This is done because everything we expose we commit to.
  If you want any information persisted like:
  - Human readable name (in whatever encoding)
  - Is this a template or a snapshot
  - What user owns this image
  
  You can just put it in the userData.
  VDSM is not going to impose what encoding you use.
  It's not going to decide if you represent your users as IDs or
  names or ldap queries or Public Keys.
  It's not going to decide if you have explicit templates or not.
  It's not going to decide if you care what the logical image chain
  is.
  It's not going to decide anything that is out of its scope.
  No format is future-proof; no selection of fields will be good for
  every situation.
  I'd much rather it be someone else's problem when any of them need
  to be changed.
  They have currently been VDSM's problem and it has been hell to
  maintain.
 
 In general, I actually agree with most of this.  What I want to avoid
 is pushing
 things that should actually be a part of the API into this userData
 blob.  We do
 want to keep the API as simple as possible to give vdsm flexibility.
  If, over
 time, we find that users are always using userData to work around
 something
 missing in the API, this could be a really good sign that the API
 needs
 extension.
I was actually contemplating this for quite a while.
If, while you create an image, the reply is lost or VDSM is unable to know
whether the operation was committed, the user will have no way of knowing what
the new image ID is.
To solve this it is recommended that the manager put some sort of task-related
information in the user data.
If the operation ever finishes in an ambiguous state, the user just reads the
userData from any images it doesn't recognize or whose state it is unsure
about.

This is a flow that every client will have to have.
So why not just add that to the API?
Because I don't want to impose how this information gets generated, what the
content of that data is, or how unique it has to be.
Since VDSM doesn't use it for anything, I don't feel like I need to figure this
out.
I am all for simplicity, but simplicity is kind of an abstract concept. Having
it be a blob is in some aspects the simplest thing you can do.
Just saying I have a field, put whatever in it is simple to convey, but it
does require more work on the user's side to figure out what to do with it.

All that being said, I do think that the format, fields and how to use them 
should be defined so different users can communicate and synchronize.
It's also important that you don't reinvent the wheel for every flow in every 
client.
I'm just saying that it's not in the scope of VDSM.
It should be done as a standard that all users of VDSM agree to conform to.
It's the same way that a file
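
A rough sketch of that recovery idiom, with hypothetical client-side verbs
(listImages and getImageUserData are stand-ins, not the proposed API):

import uuid

def create_with_token(vdsm, repo_id, size):
    token = str(uuid.uuid4())  # manager-generated correlation token
    try:
        return vdsm.createVirtualDisk(repo_id, size,
                                      userData={'createToken': token})
    except IOError:
        # Outcome unknown: scan unrecognized images for our token.
        for image_id in vdsm.listImages(repo_id):
            data = vdsm.getImageUserData(repo_id, image_id)
            if data.get('createToken') == token:
                return image_id  # the create did commit after all
        raise  # it never committed; safe to retry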

Re: [vdsm] RFC: New Storage API

2012-12-07 Thread Saggi Mizrahi


- Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Shu Ming shum...@linux.vnet.ibm.com, engine-devel 
 engine-de...@ovirt.org, VDSM Project Development
 vdsm-devel@lists.fedorahosted.org, Deepak C Shetty 
 deepa...@linux.vnet.ibm.com
 Sent: Friday, December 7, 2012 12:23:15 AM
 Subject: Re: [vdsm] RFC: New Storage API
 
 On 12/06/2012 10:22 PM, Saggi Mizrahi wrote:
 
  - Original Message -
  From: Shu Ming shum...@linux.vnet.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: VDSM Project Development
  vdsm-devel@lists.fedorahosted.org, engine-devel
  engine-de...@ovirt.org
  Sent: Thursday, December 6, 2012 11:02:02 AM
  Subject: Re: [vdsm] RFC: New Storage API
 
  Saggi,
 
  Thanks for sharing your thought and I get some comments below.
 
 
  Saggi Mizrahi:
  I've been throwing a lot of bits out about the new storage API
  and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate about how the API looks and how you use it.
 
  First major change is in terminology, there is no longer a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  newcomers with a libvirt background.
 
  One other changes is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
 
 
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will be used to
  tell VDSM how to connect to the repo.
 
  Where does repoID come from? I think repoID doesn't exist before
  connectStorageRepository() returns.  Isn't repoID a return value of
  connectStorageRepository()?
  No, repoIDs are no longer part of the domain, they are just a
  transient handle.
  The user can put whatever it wants there as long as it isn't
  already taken by another currently connected domain.
 
 So what happens when a user mistakenly gives a repoID that is in use
 already? There should be something in the return value that specifies
 the error and/or the reason for it, so that the user can try with a
 new/different repoID.
As I said, connect fails if the repoId is in use ATM.
 
  disconnectStorageRepository(self, repoId)
 
 
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
 
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
 
  There are 4 major image operations:
 
 
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
  userData={}, options={}):
 
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want to base the new
  virtual disk on
  userData - optional data that will be attached to the new VD,
  could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
 
 IIUC, i can use options to do storage offloads ? For eg. I can create
 a
 LUN that represents this VD on my storage array based on the
 'options'
 parameter ? Is this the intended way to use 'options' ?
No, this has nothing to do with offloads.
If by offloads you mean having other VDSM hosts do the heavy lifting, then
this is what the autoFix=False option and the fix mechanism are for.
If you are talking about advanced SCSI features (i.e. write-same) they will be
used automatically whenever possible.
In any case, how we manage LUNs (if they are even used) is an implementation 
detail.
 
 
  returns the id of the new VD
  I think we will also need a function to check if a VirtualDisk is
  based on a specific snapshot.
  Like: isSnapshotOf(virtualDiskId, baseSnapshotID):
  No, the design is that volume dependencies are an implementation
  detail.
  There is no reason for you to know that an image is physically a
  snapshot of another.
  Logical snapshots, template information, and any other information
  can be set by the user by using the userData field available for
  every image.
  createSnapshot(targetRepoId, baseVirtualDiskId,
   userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new snapshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you
  want
  to snapshot
  userData - optional data that will be attached to the new
  Snapshot,
  could be anything

Re: [vdsm] RFC: New Storage API

2012-12-06 Thread Saggi Mizrahi


- Original Message -
 From: Tony Asleson tasle...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, December 5, 2012 4:48:34 PM
 Subject: Re: [vdsm] RFC: New Storage API
 
 On 12/04/2012 03:52 PM, Saggi Mizrahi wrote:
  I've been throwing a lot of bits out about the new storage API and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate about how the API looks and how you use it.
  
  First major change is in terminology, there is no longer a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  newcomers with a libvirt background.
  
  One other changes is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
  
  
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will be used to
  tell VDSM how to connect to the repo.
  
  disconnectStorageRepository(self, repoId):
  
  
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
  
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
  
  There are 4 major image operations:
  
  
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
userData={}, options={}):
  
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want to base the new
  virtual disk on
  userData - optional data that will be attached to the new VD, could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
  
  returns the id of the new VD
 
 I'm guessing there will be a way to find out how much space is
 available
 for a specified repo before you try to create a virtual disk on it?
This is in the repo API, which is not really detailed here.
In any case, due to the nature of storage, you can never tell how much space an
image is actually going to take.
You have over-committing, thin provisioning, sparse volumes, native snapshots,
compression, de-dupe, soft raid (btrfs\zfs), check-summing, metadata backups,
per-operation metadata (btrfs), and more.
VDSM might also leave the image in degraded mode if there is no room to
complete the action.

If you want to create an image you should just give it a whirl; also, you
should always leave a certain percentage free.
 
  
  createSnapshot(targetRepoId, baseVirtualDiskId,
 userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new snapshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want
  to snapshot
  userData - optional data that will be attached to the new Snapshot,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
  
  returns the id of the new Snapshot
  
  copyImage(targetRepoId, imageId, baseImageId=None, userData={},
  options={})
  targetRepoId - The ID of a connected repo where the new image will
  be created
  imageId - The image you wish to copy
  baseImageId - if specified, the new image will contain only the
  diff between imageId and baseImageId.
If None the new image will contain all the bits of
imageId. This can be used to copy partial parts of
images for export.
  userData - optional data that will be attached to the new image,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
  
  return the Id of the new image. In case of copying an immutable
  image the ID will be identical to the original image as they
  contain the same data. However the user should not assume that and
  always use the value returned from the method.
 
 Can the target repo id be itself?  The case where a user wants to
 make a
 copy of a virtual disk in the same repo.  A caller could snapshot the
 virtual disk and then create a virtual disk from the snapshot, but if
 the target repo could be the same as source repo then they could use
 this call as long as the returned ID was different.
 
 Does imageId IO need to be quiesced before calling this or will that
 be
 handled in the implementation (eg. snapshot first)?
Copy of an image is possible to the same repo.
Copy of a snapshot to the same repo will not work; there is also no reason to do so.

Re: [vdsm] RFC: New Storage API

2012-12-06 Thread Saggi Mizrahi


- Original Message -
 From: Shu Ming shum...@linux.vnet.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
 engine-devel engine-de...@ovirt.org
 Sent: Thursday, December 6, 2012 11:02:02 AM
 Subject: Re: [vdsm] RFC: New Storage API
 
 Saggi,
 
 Thanks for sharing your thought and I get some comments below.
 
 
 Saggi Mizrahi:
  I've been throwing a lot of bits out about the new storage API and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate on how the API looks and how you use it.
 
  First major change is in terminology, there is no longer a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  newcomers with a libvirt background.
 
  One other change is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
 
 
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will be used to
  tell VDSM how to connect to the repo.
 
 
 Where does repoID come from? I think repoID doesn't exist before
 connectStorageRepository() returns.  Isn't repoID a return value of
 connectStorageRepository()?
No, repoIDs are no longer part of the domain, they are just a transient handle.
The user can put whatever it wants there as long as it isn't already taken by 
another currently connected domain.
 
 
  disconnectStorageRepository(self, repoId)
 
 
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
 
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
 
  There are 4 major image operations:
 
 
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
 userData={}, options={}):
 
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want to base the new
  virtual disk on
  userData - optional data that will be attached to the new VD, could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
 
  returns the id of the new VD
 
 I think we will also need a function to check if a VirtualDisk is
 based on a specific snapshot.
 Like: isSnapshotOf(virtualDiskId, baseSnapshotID):
No, the design is that volume dependencies are an implementation detail.
There is no reason for you to know that an image is physically a snapshot of 
another.
Logical snapshots, template information, and any other information can be set 
by the user by using the userData field available for every image.
 
 
  createSnapshot(targetRepoId, baseVirtualDiskId,
  userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new snapshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want
  to snapshot
  userData - optional data that will be attached to the new Snapshot,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
 
  returns the id of the new Snapshot
 
  copyImage(targetRepoId, imageId, baseImageId=None, userData={},
  options={})
  targetRepoId - The ID of a connected repo where the new image will
  be created
  imageId - The image you wish to copy
  baseImageId - if specified, the new image will contain only the
  diff between imageId and baseImageId.
 If None the new image will contain all the bits of
 imageId. This can be used to copy partial parts of
 images for export.
  userData - optional data that will be attached to the new image,
  could be anything that the user desires.
  options - options to modify VDSMs default behavior
 
 Does this function mean that we can copy the image from one
 repository
 to another repository? Does it cover the semantics of storage
 migration,
 storage backup, storage incremental backup?
Yes, the main purpose is copying to another repo, and you can even do 
incremental backups.
Also the following flow
1. Run a VM using imageA
2. write to disk
3. Stop VM
4. copy imageA to repoB
5. Run a VM using imageA again
6. Write to disk
7. Stop VM
8. Copy imageA again basing it off imageA_copy1 on repoB, creating a diff on 
repoB without snapshotting the original image.
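
A minimal sketch of that flow, assuming a hypothetical connected "vdsm" client
binding (the binding and all IDs here are illustrative, not an existing API):

    # Steps 1-4: full copy of imageA to repoB.
    copy1 = vdsm.copyImage('repoB', 'imageA',
                           userData={'purpose': 'backup'})
    # Steps 5-8: copy again, shipping only the diff against the copy
    # that already sits on repoB.
    copy2 = vdsm.copyImage('repoB', 'imageA', baseImageId=copy1,
                           userData={'purpose': 'incremental-backup'})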

 
 
  return the Id of the new image

Re: [vdsm] VDSM tasks, the future

2012-12-05 Thread Saggi Mizrahi
I'm sorry but your email client messed up the formatting and I can't figure out 
what your comments are.
Could you please use text-only emails.

- Original Message -
 From: ybronhei ybron...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Adam Litke a...@us.ibm.com, engine-devel engine-de...@ovirt.org, 
 VDSM Project Development
 vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, December 5, 2012 8:37:23 AM
 Subject: Re: [vdsm] VDSM tasks, the future
 
 
 On 12/05/2012 12:20 AM, Saggi Mizrahi wrote:
 
 
 As the only subsystem to use asynchronous tasks until now is the
 storage subsystem, I suggest going over how
 I would tackle task creation, task stop, task removal and task
 recovery.
 Other subsystem can create similar mechanisms depending on their
 needs.
 
 There is no way of avoiding it, different types of tasks need
 different ways of tracking\recovering from them.
 network should always auto-recover because it can't get a "please
 fix" command if the network is down.
 Storage on the other hand should never start operations on its own
 because it might take up valuable resources from the host.
 Tasks that need to be tracked on a single host, 2 hosts, or the
 entire cluster need to have their own APIs.
 VM configuration never persists across reboots, networking sometimes
 persists and storage always persists.
 This means that recovery procedures (from the manager's point of view)
 need to be vastly different.
 Add policy, resource allocation, and error flows, and you see that VDSM
 doesn't have nearly as much information to deal with the tasks.
 
 - Original Message -
 
 From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
 smizr...@redhat.com Cc: VDSM Project Development
 vdsm-devel@lists.fedorahosted.org , engine-devel
 engine-de...@ovirt.org , Ayal
 Baron aba...@redhat.com , Barak Azulay bazu...@redhat.com ,
 Shireesh Anjal san...@redhat.com Sent: Tuesday, December 4, 2012
 3:50:28 PM
 Subject: Re: VDSM tasks, the future
 
 On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
 
 Because I started hinting about how VDSM tasks are going to look
 going forward
 I thought it's better I'll just write everything in an email so we
 can talk
 about it in context.  This is not set in stone and I'm still
 debating things
 myself but it's very close to being done. Don't debate them yourself,
 debate them here!  Even better, propose
 your idea in
 schema form to show how a command might work exactly. I don't like
 throwing ideas in the air. It can be much easier to understand the
 flow of a task in vdsm and outside vdsm by a small schema, mainly
 for each task's states.
 To define the flow of a task you can separate between types of tasks
 (network, storage, vms, or else); we should have task states that
 clarify if the task can be recovered or not, can be canceled or not,
 etc.
 
 Canceling\Aborting\Reverting states should be clarified further, and not
 every state can lead to all types of states.
 I tried to figure out how task flow works today in vdsm, and this is what
 I've got - http://wiki.ovirt.org/Vdsm_tasks
 
 
 
 
 
 
 - Everything is asynchronous.  The nature of message based
 communication is
 that you can't have synchronous operations.  This is not really
 debatable
 because it's just how TCP\AMQP\messaging works. Can you show how a
 traditionally synchronous command might work?
  Let's take
 Host.getVmList as an example. The same as it works today, it's all a
 matter of how you wrap the transport layer.
 You will send a json-rpc request and wait for a response with the
 same id.
 
 As for the bindings, there are a lot of way we can tackle that.
 Always wait for the response and simulate synchronous behavior.
 Make every method return an object to track the task.
 task = host.getVmList()
 if not task.wait(1):
     task.cancel()
 else:
     res = task.result()
 It looks like a traditional timeout.. why not split blocking actions
 and non-blocking actions? A non-blocking action would supply a callback
 function to invoke if the task fails or succeeds. For example:
 
 createAsyncTask(host.getVmList, params, timeout=30,
 callbackGetVmList)
 
 Instead of using the dispatcher? Do you want to keep the dispatcher
 concept?
 
 
 
 Have it both ways (it's auto-generated anyway) and have
 list = host.getVmList()
 task = host.getVmList_async()
 
 Have high level and low level interfaces.
 host = Host()
 host.connect("tcp://host:3233")
 req = host.sendRequest(123213, "getVmList", [])
 if not req.wait(1):

 
 shost = SynchHost(host)
 shost.getVmList() # Actually wraps a request object
 ahost = AsyncHost(host)
 task = ahost.getVmList() # Actually wraps a request object
 
 
 
 - Task IDs will be decided by the caller.  This is how json-rpc
 works and also
 makes sense because now the engine can track the task without
 needing to have a
 stage where we give it the task ID back.  IDs are reusable as long
 as no one
 else is using them at the time so they can be used

[vdsm] VDSM tasks, the future

2012-12-04 Thread Saggi Mizrahi
Because I started hinting about how VDSM tasks are going to look going forward 
I thought it's better I'll just write everything in an email so we can talk 
about it in context.
This is not set in stone and I'm still debating things myself but it's very 
close to being done.

- Everything is asynchronous.
  The nature of message based communication is that you can't have synchronous 
operations.
  This is not really debatable because it's just how TCP\AMQP\messaging works.

- Task IDs will be decided by the caller.
  This is how json-rpc works and also makes sense because now the engine can 
track the task without needing to have a stage where we give it the task ID 
back.
  IDs are reusable as long as no one else is using them at the time, so they can 
be used for synchronizing operations between clients (making sure a command is 
only executed once on a specific host without locking); see the sketch after 
this list.

- Tasks are transient
  If VDSM restarts it forgets all the task information.
  There are 2 ways to have persistent tasks:
  1. The task creates an object that you can continue to work on in VDSM.
 The new storage does that by the fact that copyImage() returns once the 
target volume has been created but before the data has been fully copied.
 From that moment on the state of the copy can be queried from any host 
using getImageStatus() and the specific copy operation can be queried with 
getTaskStatus() on the host performing it.
 After VDSM crashes, depending on policy, either VDSM will create a new 
task to continue the copy or someone else will send a command to continue the 
operation and that will be a new task.
  2. VDSM tasks just start other operations that are tracked outside the task 
interface. For example Gluster.
 gluster.startVolumeRebalance() will return once it has been registered 
with Gluster.
 gluster.getOperationStatuses() will return the state of the operation from 
any host.
 Each call is a task in itself.
  
- No task tags.
  They are silly and the caller can mangle whatever it wants into the task ID if 
it really wants to tag tasks.

- No explicit recovery stage.
  VDSM will be crash-only, there should be efforts to make everything 
crash-safe.
  If that is problematic, in case of networking, VDSM will recover on start 
without having a task for it.

- No cleanTask:
  Tasks can be started by any number of hosts; this means that there is no way 
to own all tasks.
  There could be cases where VDSM starts tasks on its own and thus they have 
no owner at all.
  The caller needs to continually track the state of VDSM. We will have 
broadcast events to mitigate polling.

- No revert
  Impossible to implement safely.

- No SPM\HSM tasks
  SPM\SDM is no longer necessary for all domain types (only for some types).
  What used to be SPM tasks, or tasks that persist and can be restarted on 
other hosts, is covered in the previous bullet points.
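
A sketch of what a caller-decided ID looks like on the wire: a plain json-rpc
request over an already connected socket, sock (the method name and params
are illustrative, not the final schema):

    import json

    def send_request(sock, request_id, method, params):
        # The caller picks request_id; reusing an ID that is still active
        # would collide, which is exactly what makes IDs usable for
        # synchronizing operations between clients without locking.
        request = {'jsonrpc': '2.0',
                   'id': request_id,
                   'method': method,
                   'params': params}
        sock.sendall(json.dumps(request).encode('utf-8'))

    send_request(sock, 'copy-imageA-to-repoB', 'Image.copy',
                 {'targetRepoId': 'repoB', 'imageId': 'imageA'})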
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] RFC: New Storage API

2012-12-04 Thread Saggi Mizrahi
I've been throwing a lot of bits out about the new storage API and I think it's 
time to talk a bit.
I will purposefully try and keep implementation details away and concentrate 
on how the API looks and how you use it.

First major change is in terminology, there is no longer a storage domain but a 
storage repository.
This change is done because so many things are already called domain in the 
system and this will make things less confusing for newcomers with a libvirt 
background.

One other change is that repositories no longer have a UUID.
The UUID was only used in the pool members manifest and is no longer needed.


connectStorageRepository(repoId, repoFormat, connectionParameters={}):
repoId - a transient name that will be used to refer to the connected 
domain; it is not persisted and doesn't have to be the same across the cluster.
repoFormat - Similar to what used to be type (eg. localfs-1.0, nfs-3.4, 
clvm-1.2).
connectionParameters - This is format specific and will be used to tell VDSM how 
to connect to the repo.

disconnectStorageRepository(self, repoId):
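
A usage sketch, assuming a hypothetical "vdsm" client binding; the
connectionParameters keys are format specific and only illustrative here:

    vdsm.connectStorageRepository(
        repoId='backupNfs',            # caller-chosen transient handle
        repoFormat='nfs-3.4',
        connectionParameters={'server': 'filer.example.com',
                              'path': '/exports/backups'})
    try:
        pass  # image operations against 'backupNfs' go here
    finally:
        vdsm.disconnectStorageRepository('backupNfs')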


In the new API there are only images, some images are mutable and some are not.
Mutable images are also called VirtualDisks;
immutable images are also called Snapshots.

There are no explicit templates, you can create as many images as you want from 
any snapshot.

There are 4 major image operations:


createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
  userData={}, options={}):

targetRepoId - ID of a connected repo where the disk will be created
size - The size of the image you wish to create
baseSnapshotId - the ID of the snapshot you want to base the new virtual disk 
on
userData - optional data that will be attached to the new VD, could be anything 
that the user desires.
options - options to modify VDSMs default behavior

returns the id of the new VD

createSnapshot(targetRepoId, baseVirtualDiskId,
   userData={}, options={}):
targetRepoId - The ID of a connected repo where the new snapshot will be 
created and the original image exists as well.
size - The size of the image you wish to create
baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want to snapshot
userData - optional data that will be attached to the new Snapshot, could be 
anything that the user desires.
options - options to modify VDSMs default behavior

returns the id of the new Snapshot

copyImage(targetRepoId, imageId, baseImageId=None, userData={}, options={})
targetRepoId - The ID of a connected repo where the new image will be created
imageId - The image you wish to copy
baseImageId - if specified, the new image will contain only the diff between 
imageId and baseImageId.
  If None the new image will contain all the bits of imageId. This 
can be used to copy partial parts of images for export.
userData - optional data that will be attached to the new image, could be 
anything that the user desires.
options - options to modify VDSMs default behavior

return the Id of the new image. In case of copying an immutable image the ID 
will be identical to the original image as they contain the same data. However 
the user should not assume that and always use the value returned from the 
method.

removeImage(repositoryId, imageId, options={}):
repositoryId - The ID of a connected repo where the image to delete resides
imageId - The id of the image you wish to delete.



getImageStatus(repositoryId, imageId)
repositoryId - The ID of a connected repo where the image to check resides
imageId - The id of the image you wish to check.

All operations return once the operation has been committed to disk, NOT when 
the operation actually completes.
This is done so that:
- Operations come to a stable state as quickly as possible.
- In cases where there is an SDM, only a small portion of the operation actually 
needs to be performed on the SDM host.
- No matter how many times the operation fails and on how many hosts, you can 
always resume the operation and choose when to do it.
- You can stop an operation at any time and remove the resulting object, making 
a distinction between "stop because the host is overloaded" and "I don't want 
that image".

This means that after calling any operation that creates a new image the user 
must then call getImageStatus() to check the status of the image.
The status of the image can be either optimized, degraded, or broken.
Optimized means that the image is available and you can run VMs off it.
Degraded means that the image is available and will run VMs, but there might be 
a better way for VDSM to represent the underlying data.
Broken means that the image can't be used at the moment, probably because not 
all the data has been set up on the volume.

Apart from that VDSM will also return the last persisted status information, 
which will contain:
hostID - the last host to try and optimize or fix the image
stage - X/Y (eg. 1/10) the last persisted stage of the fix.
percent_complete - -1 or 0-100, the last persisted completion percentage.
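
A sketch of the resulting create-then-poll pattern, assuming a hypothetical
"vdsm" client binding and assuming the status is exposed as a 'state' field
(all names here are illustrative):

    import time

    disk_id = vdsm.createVirtualDisk('repoA', size=20 * 2**30,
                                     userData={'name': 'my-disk'})
    # The call returned once the operation was committed to disk, so the
    # image may still be degraded or broken while data is being laid out.
    while True:
        status = vdsm.getImageStatus('repoA', disk_id)
        if status['state'] == 'optimized':
            break                 # safe to run VMs off this image
        if status['state'] == 'broken':
            raise RuntimeError('image must be fixed or removed')
        time.sleep(5)             # degraded: usable, keep polling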

Re: [vdsm] RFC: New Storage API

2012-12-04 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
 engine-devel engine-de...@ovirt.org
 Sent: Tuesday, December 4, 2012 6:08:25 PM
 Subject: Re: [vdsm] RFC: New Storage API
 
 Thanks for sharing this.  It's nice to have something a little more
 concrete to
 think about.  Just a few comments and questions inline to get some
 discussion
 flowing.
 
 On Tue, Dec 04, 2012 at 04:52:40PM -0500, Saggi Mizrahi wrote:
  I've been throwing a lot of bits out about the new storage API and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate on how the API looks and how you use it.
  
  First major change is in terminology, there is no longer a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  newcomers with a libvirt background.
  
  One other change is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
  
  
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
 
 We should probably add an options/flags parameter for extension of
 all new
 APIs.
Usually I agree but connectionParameters is already generic enough :)
 
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will be used to
  tell VDSM how to connect to the repo.
  
  disconnectStorageRepository(self, repoId):
 
 I assume 'self' is a mistake here.  Just want to clarify given all of
 the recent
 talk about instances vs. namespaces.
Yea, it's just pasted from my code
 
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
 
 By mutable you mean writable right?  Or does the word mutable imply
 more than
 that?
It's a semantic distinction due to implementation details, in general terms, 
yes.
 
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
  
  There are 4 major image operations:
  
  
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
userData={}, options={}):
 
 Is userdata a 'StringMap'?
currently it's a json object. We could limit it to a string map and trust the 
client to parse types.
Or we can have it be a string\blob and trust the user to serialize the data.
It's a pass-through object either way.
 
 I will reopen the argument about an options dict vs a flags
 parameter.  I oppose
 the dict for expansion because I think it causes APIs to devolve into
 a mess
 where lots of arbitrary and not well thought out overrides are packed
 into the
 dict over time.  A flags argument (in json and python it can be an
 enum array)
 limits us to really switching flags on and off instead of passing
 arbitrary
 data.
We already have 'strategy', and we know we want to have several options.
Other stuff that has been suggested is being able to override the img format 
(qcow2\qed).

The way I envision it is having a class
opts = CommandOptions()
to which you add
opts.addStringOption(key, value)
opts.addIntOption(key, 3)
opts.addBoolOption(key, True)

I know you could just as well have
strategy_space_flag and strategy_performance_flag
and fail the operation if they both exist.
Since it is a matter of personal taste I think it should be decided by a vote.
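
A rough Python sketch of that CommandOptions idea (the class follows the
pseudocode above; it is not an existing VDSM API):

    class CommandOptions(object):
        def __init__(self):
            self._options = {}

        def _add(self, key, value, expected_type):
            # type() rather than isinstance() so booleans are not
            # silently accepted where ints are expected.
            if type(value) is not expected_type:
                raise TypeError('%s expects %s' %
                                (key, expected_type.__name__))
            self._options[key] = value

        def addStringOption(self, key, value):
            self._add(key, value, str)

        def addIntOption(self, key, value):
            self._add(key, value, int)

        def addBoolOption(self, key, value):
            self._add(key, value, bool)

    opts = CommandOptions()
    opts.addStringOption('strategy', 'space')
    opts.addBoolOption('sparse', True)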
 
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want the base the new
  virtual disk on
  userData - optional data that will be attached to the new VD, could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
  
  returns the id of the new VD
  
  createSnapshot(targetRepoId, baseVirtualDiskId,
 userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new sanpshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
 
 Why is this needed?  Doesn't the size of a snapshot have to be equal
 to its
 base image?
oops, another copy\paste error, you can see this arg doesn't exist in the 
method signature.
My proofreading does need more work.
 
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want
  to snapshot
 
 Can you snapshot a snapshot?  In that case, this parameter should be
 called
 baseImage.
You can't snapshot a snapshot; it makes no sense, as it can't change and you 
will get the same object.
 
  userData - optional data

Re: [vdsm] [RFC]about the implement of text-based console

2012-12-03 Thread Saggi Mizrahi
Sorry, it's probably the fact that I don't have enough time to go into the code 
but I still don't get what you are trying to do.
Having it in HTTP and XML-RPC is a bad idea but I imagine that the theoretical 
solution doesn't depend on any of them.

Could you just show some pseudo code of a client using the stream?

- Original Message -
 From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com, Adam Litke a...@us.ibm.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Friday, November 30, 2012 10:12:19 PM
 Subject: Re: [vdsm] [RFC]about the implement of text-based console
 
 
 Hi all, in this mail I further explain how solution 5 (console
 streaming
 API) works, and propose a virtual HTTP server living inside the existing
 XMLRPC server with a request router. You can have a look at
 
 http://gerrit.ovirt.org/9605
 
 
 on 11/28/2012 01:09, Adam Litke wrote:
  One issue that was raised is console buffering.  What happens if a
  client does
  not call getConsoleReadStream() fast enough?  Will characters be
  dropped?  This
  could create a reliability problem and would make scripting against
  this
  interface risky at best.
 on 11/28/2012 01:45, Saggi Mizrahi wrote:
  I don't really understand 5. What does those methods return the
  virtio dev path?
 As far as I know, HTTP supports persistent connections and data streaming,
 this
 is popular for AJAX applications and live video broadcasting servers.
 The client sends one GET request to server, and server returns a data
 stream, then the client reads the stream continuously.
 
 XMLRPC and REST calls rely on HTTP, so I was considering that
 getConsoleReadStream() could utilize the streaming feature in HTTP, and
 VDSM
 just forwards the console data when it is called. Unfortunately I can
 not find out how XMLRPC and REST supports data streaming, because XML
 and JSON do not support containing a continuous stream object. It
 seems
 that to get the continuous stream data, the client must call
 getConsoleReadStream() again and again. I think it's expensive to
 call
 getConsoleReadStream() very frequently to get the data, and it may
 cause
 a notable delay, which is not acceptable for an interactive console.
 
 I am thinking of providing console stream through HTTP(s) directly. A
 virtual server can forward the data from guest serial console by
 traditional HTTP streaming method (GET /consoleStream/vmid HTTP/1.0),
 and can forward the input data from the user by POST method as
 well(POST
 /consoleStream/vmid HTTP/1.0), or forward input and output stream at
 the
 same time in a POST request. This virtual server can be further
 extended
 to serve downloading guest crash core dump, and we can provide
 flexible
 authentication policies in this server. The auth for HTTP requests
 can
 be different from the XMLRPC request.
 
 The normal XMLRPC requests are always POST / HTTP/1.0 or POST
 /RPC2
 HTTP/1.0. So this virtual server can live inside the existing XMLRPC
 server, just with a request router. I read the code implementing the
 XMLRPC binding and find that implementing a request router is not
 very
 complex. We can multiplex the port 54321, and route the raw HTTP
 request
 to the virtual server while normal XMLRPC request still goes to
 XMLRPC
 handler.
 
 This means it can serve XMLRPC request as
 
 vdsClient -s localhost getVdsCaps
 
   at the same time it can serve a wget client as
 
wget --no-check-certificate \
--certificate=/etc/pki/vdsm/certs/vdsmcert.pem \
--private-key=/etc/pki/vdsm/keys/vdsmkey.pem \
--ca-certificate=/etc/pki/vdsm/certs/cacert.pem \
https://localhost:54321/console/vmid
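 
 A client-side sketch equivalent to the wget command above, using the
 python-requests library and assuming the same certificate paths:
 
     import requests
 
     resp = requests.get(
         'https://localhost:54321/console/vmid',
         cert=('/etc/pki/vdsm/certs/vdsmcert.pem',
               '/etc/pki/vdsm/keys/vdsmkey.pem'),
         verify='/etc/pki/vdsm/certs/cacert.pem',
         stream=True)                    # keep the connection open
     for data in resp.iter_content(chunk_size=1):
         print(data.decode('utf-8', 'replace'), end='')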
 
 I try to implement a simple request router at
 
 http://gerrit.ovirt.org/9605
 
 If interested, you can have a look at it. It can pass the recently added
 VDSM
 functional tests, and can serve wget requests at the same time. If we
 do
 not like this idea, I think only the solution of extending spice will
 fulfill our requirements. There are obvious problems in other
 solutions.
  - Original Message -
  From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
  To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
  Sent: Tuesday, November 27, 2012 4:22:20 AM
  Subject: Re: [vdsm] [RFC]about the implement of text-based console
 
  Hi all,
 
  For now in there is no agreement on the remote guest console
  solution,
  so I decide to do some investigation continue the discussion.
 
  Our goal
  VM serial console remote access in CLI mode. That means the
  client
  runs without X environment.
  Do you mean like running qemu with -curses?
 I mean like virsh console
 
 --
 Thanks and best regards!
 
 Zhou Zheng Sheng / 周征晟
 E-mail: zhshz...@linux.vnet.ibm.com
 Telephone: 86-10-82454397
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [VDSM][RFC] hsm service standalone

2012-12-03 Thread Saggi Mizrahi
HSM is not a package, it's an application. Currently it and the rest of VDSM 
share the same process, but they use RPC to communicate. This is done so that 
one day we can actually have them run as different processes.
HSM is not something you import, it's a daemon you communicate with.

- Original Message -
 From: Dan Kenigsberg dan...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Sheldon shao...@linux.vnet.ibm.com, a...@linux.vnet.ibm.com, 
 vdsm-devel@lists.fedorahosted.org, Zheng Sheng
 ZS Zhou zhshz...@cn.ibm.com
 Sent: Monday, December 3, 2012 12:01:28 PM
 Subject: Re: [vdsm] [VDSM][RFC] hsm service standalone
 
 On Mon, Dec 03, 2012 at 11:35:44AM -0500, Saggi Mizrahi wrote:
   There are a bunch of preconditions to having HSM pulled out.
   One simple thing is that someone needs to go through
  storage/misc.py and utils.py and move all the code out to logical
  packages.
  There also needs to be a bit of a rearrangement of the code files
  so they can import the reusable code properly.
  
   I am also still very much against putting core VDSM into
  site-packages.
 
 Would you elaborate on your position? Do you mind the wrong
 implications
 this may give about API stability?
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, Ayal Baron aba...@redhat.com, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Monday, December 3, 2012 3:30:21 PM
 Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
 
 On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
  
  
  - Original Message -
   From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
   smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan
   Kenigsberg
   dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com,
   Ayal
   Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org
   Sent:
   Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API:
   Identifying
   vdsm objects in the next-gen API
   
   On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
They are not future proof as the paradigm is completely
different.
Storage domain IDs are not static any more (and are not
guaranteed to be
unique or the same across the cluster).  Image IDs represent the
ID of the
projected data and not the actual unique path.  Just as an
example, to run
a VM you give a list of domains that might contain the needed
images in
the chain and the image ID of the tip.  The paradigm is changed
too, and
most calls get a varying number of images and domains.
Furthermore, the APIs themselves are completely different. So future
proofing is
not really an issue.
   
   I don't understand this at all.  Perhaps we could all use some
   education on
   the architecture of the planned architectural changes.  If I can
   pass an
   arbitrary list of domainIDs that _might_ contain the data, why
   wouldn't I
   just pass all of them every time?  In that case, why are they
   even required
   since vdsm would have to search anyway?
  It's for optimization mostly, the engine usually has a good idea of
  where
  stuff is, having it give hints to VDSM can speed up the search
  process.
  Also, the engine knows how transient some storage pieces are. If
  you have a
  domain that is only there for backup or owned by another manager
  sharing the
  host, you don't want your VMs using the disks that are on that
  storage
  effectively preventing it from being removed (though we do have
  plans to have
  qemu switch base snapshots at runtime for just that).
 
 This is not a clean design.  If the search is slow, then maybe we
 need to
 improve caching internally.  Making a client cache a bunch of
 internal IDs to
 pass around sounds like a complete layering violation to me.
You can't cache this, if the same template exists on 2 different NFS domains 
only the engine has enough information to know which you should use.
We only have the engine give us this information when starting a VM or 
merging\copying an image that resides on multiple domains.
It is also completely optional. I didn't like it either.
 
   
As to making the current API a bit simpler. As I said, making
them opaque
is problematic as currently the engine is responsible for
creating the
IDs.
   
   As I mentioned in my last post, engine still can specify the ID's
   when the
   object is first created.  From that point forward the ID never
   changes so it
   can be baked into the identifier.
  Where will this identifier be persisted?
   
Further more, some calls require you to play with these (making
a template
instead of a snapshot).  Also, the full chain and topology
needs to be
completely visible to the engine.
   
   Please provide a specific example of how you play with the IDs.
I can guess
   where you are going, but I don't want to divert the thread.
  The relationship between volumes and images is deceptive at the
  moment.  IMG
  is the chain and volume is a member, IMGUUID is only used for
  verification
  and to detect when we hit a template going up the chain.  When you
  do
  operations on images, assumptions are guaranteed about the
  resulting IDs.
  When you copy an image, you assume to know all the new IDs as they
  remain the
  same.  With your method I can't tell what the new opaque result
  is going to
  be.  Preview mode (another abomination being deprecated) relies on
  the
  disconnect between imgUUID and volUUID.  Live migration currently
  moves a lot
  of the responsibility to the engine.
 
 No client should need to know about all of these internal details.  I
 understand
 that's the way it is today, and that's one of the main reasons that
 the API is a
 complete pain to use.
You are correct, but this is how this API was designed; you can't get away from 
that.
 
   
These things, as you said, are problematic. But this is the way
things are
today.
   
   We are changing them.
  Any intermediary step is needlessly problematic

[vdsm] object instancing in the new VDSM API

2012-12-03 Thread Saggi Mizrahi
Currently the suggested scheme treats everything as instances and objects have 
methods.
This puts instancing as the responsibility of the API bindings.
I suggest changing it to the way json was designed with namespaces and methods.

For example instead for the api being:

vm = host.getVMsList()[0]
vm.getInfo()

the API should be:

vmID = host.getVMsList()[0]
api.VMsManager.getVMInfo(vmID)

And it should be up to the user to decide how to wrap everything in objects.
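
A sketch of how a binding could still offer instances on top of the flat
namespace API, with instancing left to the user ("host" and "api" stand for
connected client objects, as in the example above):

    class VM(object):
        # Thin by-reference wrapper: holds only the vmID, never state.
        def __init__(self, api, vm_id):
            self._api = api
            self.id = vm_id

        def getInfo(self):
            return self._api.VMsManager.getVMInfo(self.id)

    vm_id = host.getVMsList()[0]      # the raw API stays usable as-is
    vm = VM(api, vm_id)               # wrapping is the user's choice
    info = vm.getInfo()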


The problem with the API bindings controlling the instancing is that:
1) We have to *have* and *pass* an implicit api obj, which is problematic to 
   maintain.
   For example, you have to have the api object as a member of instance for the
   method calls to work.  This means that you can't recreate or pool API
   objects easily. You effectively need to add a move method to move the
   object to another API object to use it on a different host.
2) Because the objects are opaque it might be hard to know what fields of the
   instance to persist to get the same object.
3) It breaks the distinction between by-value and by-reference objects.
4) Any serious user will make its own instance classes that conform to its
   design and flow so they don't really add any convenience to anything apart
   from tests.
   You will create your own VM object, and because it's in the manager scope
   it will be the same instance across all hosts. Instead of being able to pass
   the same ID to any host (as the vmID remains the same) you will have to
   create an instance object to use, either before every call for simplicity
   or cached per host for performance benefits.
5) It makes us pass a weird __obj__ parameter to each call that symbolizes
   self and makes it hard for a user that chooses to use its own bindings to
   understand what it does.
6) It's syntactic sugar at best that adds needless limitations to how a user can
   play with the IDs and the API.

I personally think there is a reason why json-rpc defines name-spaces and
methods and forgoes instances. It's simpler (for the implementation), more
flexible, and it gives the user more choice. Trying to hack that in will just
cause needless complications IMHO. IDs are just strings; no need to 
complicate them.


By-Value objects should still be defined and instantiated by the bindings 
because unlike IDs we need to make sure all the fields exist and are in the 
correct type.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] object instancing in the new VDSM API

2012-12-03 Thread Saggi Mizrahi
So from what I gather the only thing that is bothering you is that storage 
operations require a lot of IDs.
I get that, I hate that too. It doesn't change the point that it was designed 
that way.
Even if you deem some use cases irrelevant it wouldn't change the fact that 
this is how people use it now.
And because we are going to throw it away as soon as we can there is no reason 
to shape our API around that.

So from what I gather we agree on instancing.

---

From this moment on I'm going to try my best to explain how VDSM storage 
currently works.
It is filled with misdirection and bad design. I hope that after that you will 
understand why you can't pack all the IDs together.

Let's start with the storage pool. Because it was simpler to have all 
metadata-changing operations run on the same host, someone needed to find a way 
to make cross domain operations work on the same host.
The solution was to bind them all to a single entity called the storage pool and 
have a single lock. The point was to have a host be able to connect to multiple 
pools at a time.
Due to bad code (that could easily have been not so bad) the multiple 
pools feature was never implemented. Because the single lock to rule them all 
doesn't really work when you want to secure a domain, we had to add more locks, 
making the pool concept obsolete.

This means that you can trust VDSM to only be connected to a single pool at a 
time, which means that if you want to change anything you can just remove 
the pool arg.

Let's go to volumes and images. Contrary to its name, imgUUID does not 
represent an image. It's actually a tag given to part of a chain. This is 
commonly used to differentiate between parts of the chain responsible for VM 
images and templates. Due to bad code a lot of the possible combinations are 
not supported but that is the intention.
imgUUID being a tag means that it serves 3 purposes depending on the verb that 
uses it.
1) In some verbs it is used as a useless sanity check to make sure the volume is 
tagged with this sdUUID.
   This I imagine was done because someone didn't fully comprehend how and why 
you do sanity checks.
   This means that in some verbs you can just remove it (if you are actually 
changing anything)
2) In some verbs it's meant to distinguish the volume from its original chain 
(creating a template). At that point it's actually now being invented by the 
caller.
3) In operations that act on the whole chain, if volUUID is there it is for the 
same useless sanity check and can be removed.

What you need to get out of this is that most of the time you can use fewer IDs 
just by removing useless imgUUID or volUUID args.
Furthermore, you need to understand that they are not hierarchical. imgUUID is 
a tag on the volume, similar to the owning user of a file.

As for domain IDs: the caller can choose to reuse imgUUIDs and volUUIDs on 
different domains, and some flows actually depend on that.
To make things simpler some verbs should be split up, so that how you specify 
the target volID doesn't affect the actual command.

This means that copyImage() and createTemplate() should be split to:
copyImage(dstDomain, srcDomain, imgUUID)
createTemplate(dstDomain, dstImgUUID, srcDomain, srcImgUUID)

That being said, I'm personally still against an intermediate storage API 
because of engine adoption problems.
But if you want to fix the current interface: packing up the IDs into a single 
ID wouldn't work and is logically wrong.
What you need to do is remove redundant arguments and split up verbs that do 
more than one thing.

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: vdsm-devel vdsm-de...@fedorahosted.org, Ayal Baron 
 aba...@redhat.com, Barak Azulay
 bazu...@redhat.com, ybronhei ybron...@redhat.com
 Sent: Monday, December 3, 2012 5:46:31 PM
 Subject: Re: object instancing in the new VDSM API
 
 On Mon, Dec 03, 2012 at 04:34:28PM -0500, Saggi Mizrahi wrote:
  Currently the suggested scheme treats everything as instances and
  object have methods.
  This puts instancing as the responsibility of the API bindings.
  I suggest changing it to the way json was designed with namespaces
  and methods.
  
  For example instead for the api being:
  
  vm = host.getVMsList()[0]
  vm.getInfo()
  
  the API should be:
  
  vmID = host.getVMsList()[0]
  api.VMsManager.getVMInfo(vmID)
  
  And it should be up to decide how to wrap everything in objects.
 
 For VMs, your example looks nice, but for today's Volumes it's not so
 nice.  To
 properly identify a Volume, we must pass the storage pool id, storage
 domain id,
 image id, and volume id.  If we are working with two Volumes, we
 would need 8
 parameters unless we optimize for context and assume that the storage
 pool uuid
 is the same for both volumes, etc.  The problem with that
 optimization is that
 we require clients to understand internal implementation details.
 
 How should the StorageDomain.getVolumes API

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi
This is all only valid for the current storage API;
the new one doesn't have pools or volumes, only domains and images.
Also, images and domains are more loosely coupled, which makes this method 
problematic.

That being said, if we do choose to make the current storage API officially 
supported I do agree that it looks a bit simpler but for the price of forcing 
the user to construct these objects before sending the request. I know for a 
fact that the engine will just create these objects on the fly because they use 
their own objects to group things logically. This means adding more work 
instead of removing it.
Most clients will do that anyway as they will use their own DAL to store these 
relationships. 

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: vdsm-devel@lists.fedorahosted.org
 Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, Saggi Mizrahi smizr...@redhat.com, Ayal Baron 
 aba...@redhat.com
 Sent: Thursday, November 29, 2012 12:19:06 PM
 Subject: RFD: API: Identifying vdsm objects in the next-gen API
 
 Today in vdsm, every object (StoragePool, StorageDomain, VM, Volume,
 etc) is
 identified by a single UUID.  On the surface, it seems like this is
 enough info
 to properly identify a resource but in practice it's not.  For
 example, when you
 look at the API's dealing with Volumes, almost all of them require an
 sdUUID,
 spUUID, and imgUUID in order to provide proper context for the
 operation.
 
 Needing to provide these extra UUIDs is a burden on the API user
 because knowing
 which values to pass requires internal knowledge of the API.  For
 example, the
 spUUID parameter is almost always just the connected storage pool.
  Since we
 know there can currently be only one connected pool, the value is
 known.
 
 I would like to move away from needing to understand all of these
 relationships
 from the end user perspective by encapsulating the extra context into
 new object
 identifier types as follows:
 
 StoragePoolIdentifier:
 { 'storagepoolID': 'UUID' }
 StorageDomainIdentifier:
 { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID' }
 ImageIdentifier:
 { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID', 'imageID':
 'UUID' }
 VolumeIdentifier:
 { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
   'imageID': 'UUID', 'volumeID': 'UUID' }
 TaskIdentifier:
 { 'taskID': 'UUID' }
 VMIdentifier:
 { 'vmID': 'UUID' }
 
 In the new API, anytime a reference to an object is required, one of
 the above
 structures must be passed in place of today's single UUID.  In many
 cases, this
 will allow us to reduce the number of parameters to the function
 since the
 needed contextual parameters (spUUID, etc) will be part of the
 object's
 identifier.  Similarly, any time the API returns an object reference
 it would
 return a *Identifier instead of a bare UUID.
 
 These identifier types are basically opaque blobs to the API users
 and are only
 ever generated by vdsm itself.  Because of this, we can change the
 internal
 structure of the identifier to require new information or (before
 freezing the
 API) remove fields that no longer make sense.
 
 I would greatly appreciate your comments on this proposal.  If it
 seems
 reasonable, I will revamp the current schema to make the necessary
 changes and
 provide the Bridge patch functions to convert between the current
 implementation
 and the new schema.
 
 --- sample schema patch ---
 
 commit 48f6b0f0a111dd0b372d211a4e566ce87f375cee
 Author: Adam Litke a...@us.ibm.com
 Date:   Tue Nov 27 14:14:06 2012 -0600
 
 schema: Introduce class identifier types
 
 When calling API methods that belong to a particular class, a
 class instance
 must be indicated by passing a set of identifiers in the request.
  The location
 of these parameters within the request is: 'params' - '__obj__'.
  Since this
 set of identifiers must be used together to correctly instantiate
 an object, it
 makes sense to define these as proper types within the API.
  Then, functions
 that return an object (or list of objects) can refer to the
 correct type.
 
 Signed-off-by: Adam Litke a...@us.ibm.com
 
 diff --git a/vdsm_api/vdsmapi-schema.json
 b/vdsm_api/vdsmapi-schema.json
 index 0418e6e..7e2e851 100644
 --- a/vdsm_api/vdsmapi-schema.json
 +++ b/vdsm_api/vdsmapi-schema.json
 @@ -937,7 +937,7 @@
  # Since: 4.10.0
  ##
  {'command': {'class': 'Host', 'name': 'getConnectedStoragePools'},
 - 'returns': ['StoragePool']}
 + 'returns': ['StoragePoolIdentifier']}
  
  ##
  # @BlockDeviceType:
 @@ -1572,7 +1572,7 @@
  {'command': {'class': 'Host', 'name': 'getStorageDomains'},
   'data': {'*storagepoolID': 'UUID', '*domainClass':
   'StorageDomainImageClass',
'*storageType': 'StorageDomainType', '*remotePath':
'str'},
 - 'returns': ['StorageDomain']}
 + 'returns': ['StorageDomainIdentifier

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi
They are not future proof as the paradigm is completely different.
Storage domain IDs are not static any more (and are not guaranteed to be unique 
or the same across the cluster).
Image IDs represent the ID of the projected data and not the actual unique path.
Just as an example, to run a VM you give a list of domains that might contain 
the needed images in the chain and the image ID of the tip.
The paradigm is changed too, and most calls get a varying number of images 
and domains.
Furthermore, the APIs themselves are completely different. So future proofing 
is not really an issue.

As to making the current API a bit simpler: as I said, making them opaque is 
problematic as currently the engine is responsible for creating the IDs.
Furthermore, some calls require you to play with these (making a template 
instead of a snapshot).
Also, the full chain and topology needs to be completely visible to the engine.

These things, as you said, are problematic. But this is the way things are 
today.

As for task IDs.
Currently task IDs are only used for storage and they get persisted to disk. 
This is WRONG and is not the case with the new storage API.
Because we moved to an asynchronous message based protocol (json-rpc over 
TCP\AMQP) there is no need to generate a task ID; it is built in to json-rpc.
json-rpc specifies that the IDs have to be unique for a client as long as the 
request is still active.
This is good enough as internally we can have a verb for a client to query its 
own running tasks and a verb to query other hosts' tasks by mangling in the 
client name before the ID.
Because the protocol is asynchronous, all calls are asynchronous by nature as 
well.
Tasks will no longer be persisted or expected to be persisted. It's the caller's 
responsibility to query the state and see if the operation succeeded or failed 
if the caller or VDSM died in the middle of the call. The current cleanTask() 
system can't be used when more than one client is using VDSM and will not be 
used for anything other than legacy storage.

AFAIK, apart from storage, all object IDs are constructed with a single ID, name 
or alias: VMs, storageConnections, network interfaces. So it's not a real issue.
I agree that in the future we should keep the idiom of passing configuration 
once, naming it, and keep using the name to reference the object.

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, Ayal Baron aba...@redhat.com, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, November 29, 2012 4:18:40 PM
 Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
 
 On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
  This is all only valid for the current storage API the new one
  doesn't have
  pools or volumes. Only domains and images.  Also, images and
  domains are more
  loosely coupled and make this method problematic.
 
 I am looking for an incremental way to bridge the differences.  It's
 been 2
 years and we still don't have the revamped storage API so I am
 planning on what
 we have being around for awhile :)  I think that defining object
 identifiers as
 opaque structured types is also future proof.  In the future an
 Image-ng object
 we can drop 'storagepoolID' from the identifier and, if it makes
 sense, remove
 the hard association with a storageDomain as well.  The point behind
 this
 refactoring is to give us the option of coupling multiple UUID's (or
 other data)
 to form a single, opaque identifier.
 
  That being said, if we do choose to make the current storage API
  officially
  supported I do agree that it looks a bit simpler but for the price
  of forcing
  the user to construct these objects before sending the request. I
  know for a
  fact that the engine will just create these objects on the fly
  because they
  use their own objects to group things logically. This means adding
  more work
  instead of removing it.  Most clients will do that anyway as they
  will use
  their own DAL to store these relationships.
  
 
 Thanks for bringing up some of these points.  All deserve attention
 so I will
 address each one individually:
 
 The current API does not yet make an official statement of support
 for anything.
 I want to model the current storage API so that the node level API
 can have the
 same level of functionality as is currently supported.  I am all for
 removing
 deprecated functions and redesigning in-place for a reasonable amount
 of time
 going forward.  In a perfect world, libvdsm-1.0 would release with no
 mention of
 storage pools at all.
 
 If properly designed, the end-user (including engine) would never be
 constructing these objects itself.  Object identifiers are
 essentially opaque
 structures.  In order to make this possible, we need to make sure
 that the API
 provides all of the functions needed to lookup

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, 
 Federico Simoncelli
 fsimo...@redhat.com, Ayal Baron aba...@redhat.com, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, November 29, 2012 5:22:43 PM
 Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
 
 On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
  They are not future proof as the paradigm is completely different.
   Storage
  domain IDs are not static any more (and are not guaranteed to be
  unique or the
  same across the cluster.  Image IDs represent the ID of the
  projected data and
  not the actual unique path.  Just as an example, to run a VM you
  give a list
  of domains that might contain the needed images in the chain and
  the image ID
  of the tip.  The paradigm is changed to and most calls get non
  synchronous
  number of images and domains.  Further more, the APIs themselves
  are
  completely different. So future proofing is not really an issue.
 
 I don't understand this at all.  Perhaps we could all use some
 education on the
 architecture of the planned architectural changes.  If I can pass an
 arbitrary
 list of domainIDs that _might_ contain the data, why wouldn't I just
 pass all
 of them every time?  In that case, why are they even required since
 vdsm would
 have to search anyway?
It's for optimization mostly, the engine usually has a good idea of where stuff 
is, having it give hints to VDSM can speed up the search process.
Also, the engine knows how transient some storage pieces are. If you have a 
domain that is only there for backup or owned by another manager sharing the 
host, you don't want your VMs using the disks that are on that storage, 
effectively preventing it from being removed (though we do have plans to have 
qemu switch base snapshots at runtime for just that).
 
  As to making the current API a bit simpler. As I said, making them
  opaque is
  problematic as currently the engine is responsible for creating the
  IDs.
 
 As I mentioned in my last post, engine still can specify the ID's
 when the
 object is first created.  From that point forward the ID never
 changes so it can
 be baked into the identifier.
Where will this identifier be persisted?
 
  Further more, some calls require you to play with these (making a
  template
  instead of a snapshot).  Also, the full chain and topology needs to
  be
  completely visible to the engine.
 
 Please provide a specific example of how you play with the IDs.  I
 can guess
 where you are going, but I don't want to divert the thread.
The relationship between volumes and images is deceptive at the moment.
IMG is the chain and volume is a member; IMGUUID is only used for 
verification and to detect when we hit a template going up the chain.
When you do operations on images, assumptions are guaranteed about the 
resulting IDs. When you copy an image, you assume to know all the new IDs as 
they remain the same.
With your method I can't tell what the new opaque result is going to be.
Preview mode (another abomination being deprecated) relies on the disconnect 
between imgUUID and volUUID.
Live migration currently moves a lot of the responsibility to the engine.
 
  These things, as you said, are problematic. But this is the way
  things are
  today.
 
 We are changing them.
Any intermediary step is needlessly problematic for existing clients.
Work is already in progress for fixing the API properly; making some calls a 
bit nicer isn't an excuse to start adding more compatibility code to the engine.
 
  As for task IDs.  Currently task IDs are only used for storage and they
  get persisted to disk. This is WRONG and is not the case with the new
  storage API.  Because we moved to an asynchronous message-based protocol
  (json-rpc over TCP/AMQP) there is no need to generate a task ID; it is
  built into json-rpc.  json-rpc specifies that the IDs have to be unique
  for a client as long as the request is still active.  This is good enough
  as internally we can have a verb for a client to query its own running
  tasks and a verb to query other host tasks by mangling in the client
  before the ID.  Because the protocol is
 So this would rely on the client keeping the connection open, and as soon
 as it disconnects it would lose the ability to query tasks from before the
 connection went down?  I don't know if it's a good idea to conflate
 message IDs with task IDs.  While the protocol can operate asynchronously,
 some calls have synchronous semantics and others have asynchronous
 semantics.  I would expect sync calls to return their data immediately and
 async calls to return immediately with either: an error code, or an
 'operation started' message and associated ID for querying the status of
 the operation.
Upon reflection I agree that having the request ID unique per client
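
To make the id-as-task-handle idea concrete, a minimal json-rpc exchange might 
look like this (a sketch; the verb name and fields are hypothetical, not the 
actual VDSM schema):

{"jsonrpc": "2.0", "method": "Image.copy", "params": {"imgUUID": "..."}, "id": "client1-42"}
{"jsonrpc": "2.0", "result": {"status": "done"}, "id": "client1-42"}

While the request is outstanding, "client1-42" is the handle a task-query verb 
would report for that operation; once the response arrives, the id may be reused.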

Re: [vdsm] MTU setting according to ifcfg files.

2012-11-28 Thread Saggi Mizrahi
I suggest we don't have a default. If you don't specify an MTU it will use 
whatever is already configured.
There is no way to go back to the defaults, only to set a new value. The 
engine can assume 1500 (in the case of Ethernet devices) is the recommended value.
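
A minimal sketch of that semantic, assuming direct sysfs access (this is not 
VDSM's actual configurator code):

def apply_mtu(dev, requested_mtu=None):
    path = '/sys/class/net/%s/mtu' % dev
    if requested_mtu is None:
        # nothing requested: keep whatever is already configured
        with open(path) as f:
            return int(f.read())
    # an explicit request overrides the current configuration
    with open(path, 'w') as f:
        f.write(str(requested_mtu))
    return requested_mtu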

- Original Message -
 From: Simon Grinberg si...@redhat.com
 To: Igor Lvovsky ilvov...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, November 28, 2012 9:53:48 AM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 
 
 - Original Message -
  From: Igor Lvovsky ilvov...@redhat.com
  To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
  Cc: Simon Grinberg si...@redhat.com
  Sent: Wednesday, November 28, 2012 2:58:52 PM
  Subject: [vdsm] MTU setting according to ifcfg files.
  
  Hi,
  
  I am working on one of the vdsm bugs that we have and I found that
  initscripts (initscripts-9.03.34-1.el6.x86_64)
  behaviour doesn't fit our needs.
  So, I would like to raise this issue on the list.
  
  The issue is MTU setting according to ifcfg files.
  I'll try to describe the flow below.
  
  1. I started with an ifcfg file for the interface without the MTU keyword
  at all, and the proper interface (let's say eth0) had the *default*
  MTU=1500 (according to /sys/class/net/eth0/mtu).
  2. I created a bridge with MTU=9000 on top of this interface.
  Everything went OK.
     After I wrote MTU=9000 to ifcfg-eth0 and ifdown/ifup it, eth0 got
     the proper MTU.
  3. Now, I removed the bridge and deleted the MTU keyword from the
  ifcfg-eth0.
     But after ifup/ifdown the actual MTU of the eth0 stayed 9000.

  The only way to change it back to 1500 (or something else) is to
  explicitly set the MTU in the ifcfg file.
  According to Bill Nottingham it is intentional behaviour.
  If so, we have a problem in vdsm, because we never set the MTU value
  unless the user asks for it explicitly.
 
 Actually you are,
 
 You were asked for MTU 9000 on the network.
 As an implementation specific, you had to do this all the way down the
 chain.
 Now it's only reasonable that when you cancel the 9000 request you'll do
 what is necessary to roll back the changes.
 It's a pity that ifcfg-files don't have the option to set MTU='default',
 but since you can read this default before you change it, please keep it
 somewhere and revert to that.
 
 
  It means that if we have an interface with MTU=9000 on it, just because
  there was once a bridge with such an MTU attached to it, and now we want
  to attach a regular bridge with the *default* MTU=1500, we have a problem.
  The only thing we can do to avoid this is to explicitly set MTU=1500 in
  the interface's ifcfg file.
  IMHO it's a bit ugly, but it looks like we have no choice.
  
  As usual, comments are more than welcome...
  
  Regards,
 Igor Lvovsky
  ___
  vdsm-devel mailing list
  vdsm-devel@lists.fedorahosted.org
  https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
  
 ___
 vdsm-devel mailing list
 vdsm-devel@lists.fedorahosted.org
 https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] MTU setting according to ifcfg files.

2012-11-28 Thread Saggi Mizrahi
I don't want to keep the last configured MTU. It's problematic. Having a stack 
is even worse.
VDSM should try not to persist anything if possible.

Also, reverting to the last MTU is racy and has weird corner cases. Best to 
just assume the default is 1500 (like all major OSes do).
But since it's not really a default I would call it a recommended setting.

- Original Message -
 From: Igor Lvovsky ilvov...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Simon 
 Grinberg si...@redhat.com
 Sent: Wednesday, November 28, 2012 11:10:27 AM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Simon Grinberg si...@redhat.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org,
  Igor Lvovsky ilvov...@redhat.com
  Sent: Wednesday, November 28, 2012 5:30:17 PM
  Subject: Re: [vdsm] MTU setting according to ifcfg files.
  
  I suggest we don't have a default. If you don't specify an MTU it
  will use whatever is already configured.
  There is no way to go back to the defaults only to set a new
  value.
  The engine can assume 1500 (in case of ethernet devices) is the
  recommended value.
  
 
 This is not related to the engine. You are right that the actual MTU
 will be the last configured one, but this is exactly the problem.
 As I already mentioned, if you add another bridge without a custom MTU,
 its users (VMs)
 can assume that the MTU is 1500.
 
  - Original Message -
   From: Simon Grinberg si...@redhat.com
   To: Igor Lvovsky ilvov...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org
   Sent: Wednesday, November 28, 2012 9:53:48 AM
   Subject: Re: [vdsm] MTU setting according to ifcfg files.
   
   
   
   - Original Message -
From: Igor Lvovsky ilvov...@redhat.com
To: VDSM Project Development
vdsm-devel@lists.fedorahosted.org
Cc: Simon Grinberg si...@redhat.com
Sent: Wednesday, November 28, 2012 2:58:52 PM
Subject: [vdsm] MTU setting according to ifcfg files.

Hi,

I am working on one of the vdsm bugs that we have and I found
that
initscripts (initscripts-9.03.34-1.el6.x86_64)
behaviour doesn't fit our needs.
So, I would like to raise this issue in the list.

The issue is MTU setting according to ifcfg files.
I'll try to describe the flow below.

1. I started with ifcfg file for the interface without MTU
keyword
at
all
and the proper interface (let's say eth0) had the *default*
MTU=1500
(according to /sys/class/net/eth0/mtu).
2. I created a bridge with MTU=9000 on top of this interface.
Everything went OK.
   After I wrote MTU=9000 on ifcfg-eth0 and ifdown/ifup it,
   eth0
   got
   the proper MTU.
3. Now, I removed the bridge and deleted MTU keyword from the
ifcfg-eth0.
   But after ifup/ifdown the actual MTU of the eth0 stayed
   9000.
  
The only way to change it back to 1500 (or something else) is to
explicitly set the MTU in the ifcfg file.
According to Bill Nottingham it is intentional behaviour.
If so, we have a problem in vdsm, because we never set the MTU
value
unless the user asks for it explicitly.
   
    Actually you are,
    
    You were asked for MTU 9000 on the network.
    As an implementation specific, you had to do this all the way down the
    chain.
    Now it's only reasonable that when you cancel the 9000 request
    then
    you'll do what is necessary to roll back the changes.
    It's a pity that ifcfg-files don't have the option to set
    MTU='default', but since you can read this default before you
    change it, please keep it somewhere and revert to that.
   
   
It means that if we have an interface with MTU=9000 on it, just
because
there was once a bridge with such an MTU attached to it,
and now we want to attach a regular bridge with the
*default* MTU=1500, we have a problem.
The only thing we can do to avoid this is to explicitly set
MTU=1500
in the interface's ifcfg file.
IMHO it's a bit ugly, but it looks like we have no choice.

As usual comments more than welcome...

Regards,
   Igor Lvovsky
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

   ___
   vdsm-devel mailing list
   vdsm-devel@lists.fedorahosted.org
   https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
   
  
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] MTU setting according to ifcfg files.

2012-11-28 Thread Saggi Mizrahi
OK, I think I need to explain myself better.
MTU sizes under 1500 are not interesting, as they are only really valid for slow 
networks which will not be able to support virt workloads anyway.
1500 is the internet MTU and is the recommended size when communicating with the 
outside world.

MTU is just a size that has to be agreed upon by all participants in the chain.
There is no inherent default MTU, but the default is technically 1500.

Reverting to the previous value makes no sense unless you are just testing 
something out.
For that case the engine can remember the current MTU and set it back.

To sum up, I suggest ignoring any previously set value, like we would ignore it 
if VDSM had set it.
It makes no sense to keep it, because the semantics of setting the MTU is to 
override the current configuration.

As a side note, having a verb to test the max MTU for a path might be a good 
idea, to give the engine/user a way to recommend a value.
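
Such a verb could, for example, probe the path with DF-bit pings; a rough 
sketch (this relies on Linux ping flags and is not an actual VDSM verb):

import os
import subprocess

def path_supports_mtu(host, mtu):
    # ICMP payload size = MTU - 20 (IP header) - 8 (ICMP header);
    # -M do sets the don't-fragment bit, so oversized packets fail
    size = mtu - 28
    cmd = ['ping', '-c', '1', '-M', 'do', '-s', str(size), host]
    with open(os.devnull, 'w') as devnull:
        return subprocess.call(cmd, stdout=devnull, stderr=devnull) == 0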

- Original Message -
 From: Saggi Mizrahi smizr...@redhat.com
 To: Igor Lvovsky ilvov...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Simon 
 Grinberg si...@redhat.com
 Sent: Wednesday, November 28, 2012 11:23:52 AM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 I don't want to keep the last configured MTU. It's problematic.
 Having a stack is even worse.
 VDSM should try not to persist anything if possible.
 
 Also, reverting to the last MTU is racy and has weird corner
 cases. Best to just assume the default is 1500 (like all major OSes do).
 But since it's not really a default I would call it a recommended
 setting.
 
 - Original Message -
  From: Igor Lvovsky ilvov...@redhat.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org,
  Simon Grinberg si...@redhat.com
  Sent: Wednesday, November 28, 2012 11:10:27 AM
  Subject: Re: [vdsm] MTU setting according to ifcfg files.
  
  
  
  - Original Message -
   From: Saggi Mizrahi smizr...@redhat.com
   To: Simon Grinberg si...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org,
   Igor Lvovsky ilvov...@redhat.com
   Sent: Wednesday, November 28, 2012 5:30:17 PM
   Subject: Re: [vdsm] MTU setting according to ifcfg files.
   
   I suggest we don't have a default. If you don't specify an MTU it
   will use whatever is already configured.
   There is no way to go back to the defaults only to set a new
   value.
   The engine can assume 1500 (in case of ethernet devices) is the
   recommended value.
   
  
  This is not related to the engine. You are right that the actual MTU
  will be the last configured one,
  but this is exactly the problem.
  As I already mentioned, if you add another bridge without a
  custom
  MTU, its users (VMs)
  can assume that the MTU is 1500.
  
   - Original Message -
From: Simon Grinberg si...@redhat.com
To: Igor Lvovsky ilvov...@redhat.com
Cc: VDSM Project Development
vdsm-devel@lists.fedorahosted.org
Sent: Wednesday, November 28, 2012 9:53:48 AM
Subject: Re: [vdsm] MTU setting according to ifcfg files.



- Original Message -
 From: Igor Lvovsky ilvov...@redhat.com
 To: VDSM Project Development
 vdsm-devel@lists.fedorahosted.org
 Cc: Simon Grinberg si...@redhat.com
 Sent: Wednesday, November 28, 2012 2:58:52 PM
 Subject: [vdsm] MTU setting according to ifcfg files.
 
 Hi,
 
 I am working on one of the vdsm bugs that we have and I found
 that
 initscripts (initscripts-9.03.34-1.el6.x86_64)
 behaviour doesn't fit our needs.
 So, I would like to raise this issue in the list.
 
 The issue is MTU setting according to ifcfg files.
 I'll try to describe the flow below.
 
 1. I started with ifcfg file for the interface without MTU
 keyword
 at
 all
 and the proper interface (let's say eth0) had the *default*
 MTU=1500
 (according to /sys/class/net/eth0/mtu).
 2. I created a bridge with MTU=9000 on top of this interface.
 Everything went OK.
After I wrote MTU=9000 on ifcfg-eth0 and ifdown/ifup it,
eth0
got
the proper MTU.
 3. Now, I removed the bridge and deleted MTU keyword from the
 ifcfg-eth0.
But after ifup/ifdown the actual MTU of the eth0 stayed
9000.
   
 The only way to change it back to 1500 (or something else) is to
 explicitly set the MTU in the ifcfg file.
 According to Bill Nottingham it is intentional behaviour.
 If so, we have a problem in vdsm, because we never set the MTU
 value
 unless the user asks for it explicitly.

Actually you are,

You were asked for MTU 9000 on the network.
As an implementation specific, you had to do this all the way down
the
chain.
Now it's only reasonable that when you cancel the 9000 request
then
you'll do what is necessary to roll back the changes.
It's a pity

Re: [vdsm] MTU setting according to ifcfg files.

2012-11-28 Thread Saggi Mizrahi


- Original Message -
 From: Simon Grinberg si...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Barak 
 Azulay bazu...@redhat.com, Igor
 Lvovsky ilvov...@redhat.com
 Sent: Wednesday, November 28, 2012 12:03:03 PM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Igor Lvovsky ilvov...@redhat.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org,
  Simon Grinberg si...@redhat.com, Barak
  Azulay bazu...@redhat.com
  Sent: Wednesday, November 28, 2012 6:49:22 PM
  Subject: Re: [vdsm] MTU setting according to ifcfg files.
  
  OK, I think I need to explain myself better,
  MTU sizes under 1500 are not interesting as they are only really
  valid for slow networks which will not be able to support virt
  workloads anyway.
  1500 is internet MTU and is the recommended size when
  communicating
  with the outside world.
  
  MTU is just a size that has to be agreed upon by all participants
  in
  the chain.
  There is no inherent default MTU but default is technically 1500.
  
  Reverting to previous value makes no sense unless you are just
  testing something out.
 
 Yes it does,
 There are networks out there that do use MTU > 1500, as weird as it
 sounds,
It's not weird at all; this is why MTU settings exist.
But setting a low MTU will not break the network; it will just cause some 
performance degradation.
 this usually the admin does initial settings on the
 management network and then when you set don't touch all works well.
 An example is when you have storage and management on the same
 network.
 
 Now consider the scenario that for some VMs the user wants to limit
 to the 'normal/recommended defaults' so in this case he will have to
 set in the logical network property to MTU=1500. when VDSM sets this
 chain it supposedly won't touch the interface MTU since it's already
 bigger (if it does it's a bug). Now the user has one more logical
 network of VMs with 9000 since he also have VMs using shared storage
 on this network.
 
 All works well till now.
 
 But what about when removing the 9000 network?
 Will VDSM 'remember' that it did not touch the interface MTU in the
 first place, or will it try to set it to this recommended MTU?.
It's a question of ownership. Because it's simpler, I suggest we assume 
ownership and always set the maximum needed (also lowering it if too high).
The engine can query the MTU and make decisions accordingly, like setting 
the current value as the default or as a saved value.
This flow obviously needs user input, so VDSM is not the place to put the 
decision making.
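
Under that ownership model the rule is easy to state in code; a sketch (the 
network objects here are hypothetical, not VDSM's data model):

def required_mtu(networks_on_nic, recommended=1500):
    # the NIC must carry the largest MTU any logical network on it needs;
    # networks that didn't ask for one count as the recommended 1500
    mtus = [net.get('mtu') or recommended for net in networks_on_nic]
    return max(mtus + [recommended])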
 
 I have no idea :)
 
 
  For that case the engine can remember the current MTU and set it
  back.
  
  To sum up, I suggest ignoring any previously set value like we
  would
  ignore it if VDSM had set it.
  It makes no sense to keep it because the semantic of setting the
  MTU
  is to override the current configuration.
  
  As a side note, having verb to test max MTU for a path might be a
  good idea to give the engine\user a way to recommend a value to the
  user.
 
 That is better but not perfect :)
 
 
  
  - Original Message -
   From: Saggi Mizrahi smizr...@redhat.com
   To: Igor Lvovsky ilvov...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org,
   Simon Grinberg si...@redhat.com
   Sent: Wednesday, November 28, 2012 11:23:52 AM
   Subject: Re: [vdsm] MTU setting according to ifcfg files.
   
   I don't want to keep the last configured MTU. It's problematic.
   Having a stack is even worse.
   VDSM should try not to persist anything if possible.
   
   Also, reverting to the last MTU is raceful and has weird corner
   cases. Best to just assume default it 1500 (Like all major OSs
   do).
   But since it's not really a default I would call it a recommended
   setting.
   
   - Original Message -
From: Igor Lvovsky ilvov...@redhat.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: VDSM Project Development
vdsm-devel@lists.fedorahosted.org,
Simon Grinberg si...@redhat.com
Sent: Wednesday, November 28, 2012 11:10:27 AM
Subject: Re: [vdsm] MTU setting according to ifcfg files.



- Original Message -
 From: Saggi Mizrahi smizr...@redhat.com
 To: Simon Grinberg si...@redhat.com
 Cc: VDSM Project Development
 vdsm-devel@lists.fedorahosted.org,
 Igor Lvovsky ilvov...@redhat.com
 Sent: Wednesday, November 28, 2012 5:30:17 PM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 I suggest we don't have a default. If you don't specify an
 MTU
 it
 will use whatever is already configured.
 There is no way to go back to the defaults only to set a
 new
 value.
 The engine can assume 1500 (in case of ethernet devices) is
 the
 recommended value

Re: [vdsm] MTU setting according to ifcfg files.

2012-11-28 Thread Saggi Mizrahi


- Original Message -
 From: Alon Bar-Lev alo...@redhat.com
 To: Simon Grinberg si...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Saggi 
 Mizrahi smizr...@redhat.com, lpeer 
 Livnat Peer lp...@redhat.com
 Sent: Wednesday, November 28, 2012 12:49:10 PM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 
 
 - Original Message -
  From: Simon Grinberg si...@redhat.com
  To: Saggi Mizrahi smizr...@redhat.com, lpeer  Livnat Peer
  lp...@redhat.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org
  Sent: Wednesday, November 28, 2012 7:37:48 PM
  Subject: Re: [vdsm] MTU setting according to ifcfg files.
  
  
  
  - Original Message -
   From: Saggi Mizrahi smizr...@redhat.com
   To: Simon Grinberg si...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org
   Sent: Wednesday, November 28, 2012 7:15:35 PM
   Subject: Re: [vdsm] MTU setting according to ifcfg files.
   
   
   
   - Original Message -
From: Simon Grinberg si...@redhat.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: VDSM Project Development
vdsm-devel@lists.fedorahosted.org,
Barak Azulay bazu...@redhat.com, Igor
Lvovsky ilvov...@redhat.com
Sent: Wednesday, November 28, 2012 12:03:03 PM
Subject: Re: [vdsm] MTU setting according to ifcfg files.



- Original Message -
 From: Saggi Mizrahi smizr...@redhat.com
 To: Igor Lvovsky ilvov...@redhat.com
 Cc: VDSM Project Development
 vdsm-devel@lists.fedorahosted.org,
 Simon Grinberg si...@redhat.com, Barak
 Azulay bazu...@redhat.com
 Sent: Wednesday, November 28, 2012 6:49:22 PM
 Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
 OK, I think I need to explain myself better,
 MTU sizes under 1500 are not interesting as they are only
 really
 valid for slow networks which will not be able to support
 virt
 workloads anyway.
 1500 is internet MTU and is the recommended size when
 communicating
 with the outside world.
 
 MTU is just a size that has to be agreed upon by all
 participants
 in
 the chain.
 There is no inherent default MTU but default is technically
 1500.
 
 Reverting to previous value makes no sense unless you are
 just
 testing something out.

Yes it does,
There are networks out there that do use MTU > 1500, as weird as
it
sounds,
   It's not weird at all; this is why MTU settings exist.
   But setting a low MTU will not break the network; it will just
   cause
   some performance degradation.
this usually the admin does initial settings on the
management network and then when you set don't touch all works
well.
An example is when you have storage and management on the same
network.

Now consider the scenario that for some VMs the user wants to
limit
to the 'normal/recommended defaults' so in this case he will
have
to
set in the logical network property to MTU=1500. when VDSM sets
this
chain it supposedly won't touch the interface MTU since it's
already
bigger (if it does it's a bug). Now the user has one more
logical
network of VMs with 9000 since he also have VMs using shared
storage
on this network.

All works well till now.

But what about when removing the 9000 network?
Will VDSM 'remember' that it did not touch the interface MTU in
the
first place, or will it try to set it to this recommended MTU?.
   It's a question of ownership. Because it's simpler, I suggest we
   assume ownership and always set the maximum needed (also lowering
   it
   if too high).
   The engine can query the MTU and make decisions accordingly,
   like
   setting the current value as the default or as a saved value.
   This flow obviously needs user input, so VDSM is not the place to
   put
   the decision making.
  
  I tend to agree, it's an ownership thing
  
  Engine should not allow mixed configuration of 'default vs
  override'
  on the same interface.
If the user wishes to start playing with MTUs he needs to use it
  carefully and across the board.
  
  VDSM should not bother with the issue at all, certainly not playing
  a
  guessing game.
  
Livnat, your 0.02$?
 
 This is exactly the reason why we should either define a completely
 stateless slave host, and apply configuration including what you
 call 'defaults'.
Completely stateless is problematic because if the engine is down or 
unavailable and VDSM happens to restart, you can't use any of your resources.

The way forward is currently to get rid of most of the configuration in 
vdsm.conf.
Only keep things that are necessary for communication with the engine (e.g. 
core dump on/off, management interface/port, SSL on/off).
Other VDSM configuration should have an API introduced to set them, and that 
will be persisted but only configurable

Re: [vdsm] [RFC]about the implement of text-based console

2012-11-27 Thread Saggi Mizrahi
The best solution would of course be 3 (or something similar that keeps the 
terminal state inside the VM memory so that migration works).
Tunnelling screen can do that, but it requires having screen (or something 
similar) installed on the guest, which is hard to do.

But I think the more practical solution is 2, as it has semantics similar to VNC.
Running a real ssh (i.e. 1) is problematic because we have less control over the 
daemon and there are more vectors the user can try to use to break out of the 
sandbox.
Furthermore, setting up sandboxes is a bit problematic ATM.

I don't really understand 5. Do those methods return the virtio dev path?

- Original Message -
 From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, November 27, 2012 4:22:20 AM
 Subject: Re: [vdsm] [RFC]about the implement of text-based console
 
 Hi all,
 
 For now there is no agreement on the remote guest console solution,
 so I decided to do some investigation to continue the discussion.
 
 Our goal
    VM serial console remote access in CLI mode. That means the client
 runs without an X environment.
Do you mean like running qemu with -curses?
 
 There are several proposals.
 
 1. Sandboxed sshd
VDSM runs a new host sshd instance in virtual machine/sandbox and
 redirects the virtio console to it.
 2. Third-party sshd
VDSM runs third-party sshd library/implementation and redirects
 virtio console to it.
 3. Spice
Extend spice to support console and implement a client to be run
 without GUI environment
 4. oVirt shell -> Engine -> libvirt
    The user connects to Engine via the oVirt CLI, then issues a
 serial-console command, then Engine locates the host and connects to
 the guest console. Currently there is a workaround: it invokes virsh -c
 qemu+tls://host/qemu console vmid from the Engine side.
 5. VDSM console streaming API
VDSM exposes getConsoleReadStream() and getConsoleWriteStream()
via
 XMLRPC binding. Then implement the related client in vdsClient and
 Engine
 
 
 Detailed discussion
 
 1. Sandboxes
 Solution 1 and 2 allow users connect to console using their favorite
 ssh
 client. The login name is vmid, the password is set by setVmTicket()
 call of VDSM. The connection will be lost during migration. This is
 similar to VNC in oVirt.
 
 I took a look at several sandbox technologies, including
 libvirt-sandbox, lxc and selinux.
 a) libvirt-sandbox boots a VM using the host kernel and initramfs, then
 passes through the host file system to the VM in read-only mode. We can
 also
 add extra bindings to the guest file system. It's very easy to use. To
 run a shell in a VM, one can just issue
 
 virt-sandbox -c qemu:///session /bin/sh
 
 Then the VM will be ready in several seconds.
 However, it will trigger some selinux violations. Currently there is
 no
 official support for selinux policy configuration from this project.
 On
 the project page this is put in the todo list.
 
 b) lxc utilizes Linux containers to run a process in a sandbox. It needs
 to
 be configured properly. I found that in the package lxc-templates there
 is an
 example configuration file for running sshd in lxc.
 
 c) The sandbox command in the package policycoreutils-python makes use of
 selinux to run a process in a sandbox, but there are no official or
 example
 policy files for sshd.
 
 In a word, for sandbox technologies, we have to configure the
 policies/file system bindings/network carefully and test the
 compatibility with popular sshd implementations (openssh-server).
 When
 those sshds are upgraded, the policies must be upgraded by us at the
 same time.
 Since the policies are not maintained by whoever implements sshd, this
 is a
 burden for us.
 
 Work to do
Write and maintain the policies.
Find ways for auth callback and redirecting data to
openssh-server.
 
 pros
    Re-use existing pieces and technologies (host sshd, sandbox).
    User friendly; they can use existing ssh clients.
 cons
    The connection is lost during migration. This is not a big problem
 because 1)
 the VNC connection shares the same problem, and 2) the user can reconnect
 manually.
    It's not easy to maintain the sandbox policies/file system
 bindings/network for compatibility with sshd.
 
 
 2. Third-party sshd implementations
 Almost the same as solution 1, but with better flexibility. VDSM can
 import a third-party sshd library and let that library deal with auth
 and transport. VDSM just has to implement the data forwarding. Many
 people consider this insecure, but I think the ticket solution for
 VNC
 is not even as secure as this. Currently most of us only trust
 openssh-server and think the quality of third-party sshds is low. I
 searched for a while and found twisted.conch from the popular twisted
 project. I'm not familiar with twisted.conch, but I still put it in
 this
 mail to collect opinions from potential twisted.conch experts.
 
 In a word, I prefer sandbox technologies to third-party sshd
 

[vdsm] When Zombies Attack

2012-10-30 Thread Saggi Mizrahi
I'm starting to see more and more flows run a process and leave a thread 
waiting for it just to prevent a zombie attack.

This is wasteful, even more so because this is usually done for processes that 
might get stuck on IO and take a while to come back.

To solve this I implemented this little tidbit.
http://gerrit.ovirt.org/#/c/8937/

And you can see it being used here:
http://gerrit.ovirt.org/#/c/8907/

specifically:
http://gerrit.ovirt.org/#/c/8907/5/vdsm/storage/remoteFileHandler.py

That being said, I also want to suggest adding autoreaping to 
AsyncProc.__del__() with a warning printed to the log notifying about the 
(maybe) unintentional process leak.
Comments and suggestions for improvement are most welcome.
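
For reference, a minimal sketch of reaping without a per-process thread (this 
is the general technique, not the code in the patch above): a SIGCHLD handler 
that collects already-exited children.

import os
import signal

def _reap_children(signum, frame):
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            return  # no children left at all
        if pid == 0:
            return  # remaining children are still running

signal.signal(signal.SIGCHLD, _reap_children)

The obvious caveat is that a global handler steals exit statuses from code that 
still wants to waitpid() on a specific child, so a real implementation has to 
track which processes it is allowed to reap.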
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] git-review

2012-10-22 Thread Saggi Mizrahi
I've recently encountered more and more people not using the git-review tool 
and manually pushing their changes to Gerrit using raw git commands.
Even though there is nothing wrong with doing things the hard way, I prefer not 
to use an overly complicated, error-prone way to interact with Gerrit.

Last I checked, the version of git-review in Fedora is broken, but I suggest 
using pip anyway as it is always synced with the master branch.

Also, please use topics. Either use a BZ# or a topic codename (e.g. 
live_migration, vdsm_api, nfs4_support) so people can skim the review list for 
topics they might want to review.
Be careful: it automatically uses the current branch name (unless you use -t), 
so if you give your branches funny names (I know I do) don't forget to 
manually specify a topic.
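
For example (assuming pip is available):

# install/upgrade straight from PyPI
pip install -U git-review
# push the current branch for review under an explicit topic
git review -t vdsm_api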

More information.

http://wiki.openstack.org/GerritWorkflow
https://github.com/openstack-ci/git-review
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] new API verb: getVersionInfo()

2012-10-18 Thread Saggi Mizrahi
Currently getVdsCaps() does a lot of unrelated things, most of which have no 
relation to capabilities.
This was done because of HTTP overhead: instead of calling multiple commands we 
call one that does everything.

I agree with the suggestion that getVdsCaps() should actually return the 
capabilities.
Capabilities being:
- storage core version and supported domain formats
- VM core version and supported host capabilities
- network core version and capabilities
- etc...

These should all be mostly static and set at boot.

As to the query API, I personally dislike the idea of a bag API. Now that we 
are moving away from HTTP, call overhead is no longer an issue, so we can have 
multiple verbs and call them sequentially. In actuality we already do; 
internally getVdsCaps() just aggregates other APIs.
This makes the return values of a method easier to handle, and means changing 
the results of an API call does not affect users that don't care about that 
change.
This also performs better: storage APIs tend to slow the response, and sending 
multiple commands means you can get the network stats even when the storage 
server is down.
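
As a sketch of what separate verbs buy a client (the verb names here are 
illustrative, not the actual schema), each call can fail independently:

def collect_stats(server):
    stats = {}
    for name, verb in (('network', server.getNetworkStats),
                       ('storage', server.getStorageStats)):
        try:
            stats[name] = verb()
        except Exception as err:
            # a hung or broken storage server no longer hides network stats
            stats[name] = {'error': str(err)}
    return stats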

- Original Message -
 From: Dan Kenigsberg dan...@redhat.com
 To: Adam Litke a...@us.ibm.com
 Cc: vdsm-devel@lists.fedorahosted.org, Michal Skrivanek 
 mskri...@redhat.com
 Sent: Thursday, October 18, 2012 4:38:16 AM
 Subject: Re: [vdsm] new API verb: getVersionInfo()
 
 On Wed, Oct 17, 2012 at 10:07:43AM -0500, Adam Litke wrote:
  Thanks for posting your idea on the list here.  I like the idea of
  a more
  fine-grained version query API.  getVdsCapabilities has become too
  much of a
  catch-all and I agree that something lighter is useful.  I do think
  vdsm will
  want to add a real capabilities mechanism and it could probably go
  here as well.
  
  As we work to make the vdsm API evolve in a stable, compatible
  manner,
  capabilities/feature-bits will come into play.  Since you're
  proposing a
  structure return value, we can easily add the capabilities field to
  it in a
  future release, but it might make sense to have it there now to
  reduce
  client-side complexity of figuring out if the return value has a
  capabilities
  item.
  
  To avoid the bloat that we have with the current getVdsCapabilities
  API, I
  propose a simple format for the new capabilities:
  
  {'enum': 'Capabilities',
   'data': ['StorageDomain_30', 'StorageDomain_22', 'Sanlock', ...]}
  
  and then add the following to the return type for your new API:
  
  'capabilities': ['Capabilities']
  
  This is essentially an expandable bitmask of features where a
  feature is present
by its presence in the 'capabilities' array.  This will be
  extensible by simply
  adding new capabilities to the enum as we find them to be
  necessary.
  
  Thoughts on this?  The reason I am bringing it up now is it would
  be nice to
  restrict the pain of migrating to this new version API to just one
  time.
 
 I fully agree - that's what I meant in my
 http://gerrit.ovirt.org/#/c/8431/4/vdsm_api/vdsmapi-schema.json
 comment
 on a bag of capability flags.
 
  
  
  On Wed, Oct 17, 2012 at 01:37:08PM +0200, Peter V. Saveliev wrote:
   …
   
   New verb proposal: getVersionInfo()
   
   
 Background
   
   Right now VDSM has only one possibility to discover the peer VDSM
   version — it is to call getVdsCapabilities(). All would be nice,
   but
   the verb does a lot of stuff, including disk I/O (rpm data query).
   It is a serious issue for highly loaded hosts, and can even
   trigger
   a call timeout.
   
   
 Rationale
   
   Working in an environment with multiple VDSM versions, it is
   inevitable to fall in a simple choice:
   
   * always operate with one API, described once and forever
   * use different protocol versions.
   
   It is a common practice to reserve something in a protocol, that
   will represent the protocol version. Any protocols w/o version
   info
   sooner or later face the need to guess a version, that is much
   worse.
   
   On the other hand, involving rpm queries and CPU topology
   calculation in the protocol version discovery is overkill.
   So
   the simplest way is to reserve a new verb for it.
   
   
 Usecases
   
   It can be used in the future in *any* VDSM communication that can
   expose version difference.
   
   
 Implementation
   
   Obviously, the usage of a new verb in the current release, e.g.
   RHEV-3.1 can be done only in try/catch way, 'cause RHEV-3.0 does
   not
   support it. But to be able to use it in RHEV-3.2, we should
   already
   have it in 3.1. Even if we will not use it yet, the future
   usecases
   are pretty straightforward.
   
   So pls comment it: http://gerrit.ovirt.org/#/c/8431/
   
   --
   Peter V. Saveliev
   
   ___
   vdsm-devel mailing list
   vdsm-devel@lists.fedorahosted.org
   https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
  
  --
  Adam Litke 

Re: [vdsm] new API verb: getVersionInfo()

2012-10-18 Thread Saggi Mizrahi
I don't see how pyinotify is even related to storage stats.
It doesn't work with NFS and is a bit flaky when it comes to VFSs like proc or 
dev.
Also, it doesn't check liveness or latency, so the events don't really give us 
anything useful.

The data is being taken from a cache. I assume there is a prepare call there 
that makes everything slower.
This will only be fixed with new-style domains that don't have a built-in 
sdUUID.

- Original Message -
 From: Vinzenz Feenstra vfeen...@redhat.com
 To: Itamar Heim ih...@redhat.com
 Cc: Saggi Mizrahi smizr...@redhat.com, Michal Skrivanek 
 mskri...@redhat.com,
 vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, October 18, 2012 3:15:47 PM
 Subject: Re: [vdsm] new API verb: getVersionInfo()
 
 On 10/18/2012 08:34 PM, Itamar Heim wrote:
  On 10/18/2012 06:03 PM, Saggi Mizrahi wrote:
  currently getVdsCaps() does a lot of unrelated things, most of which
  have no relation to capabilities.
  This was done because of HTTP overhead. Instead of calling
  multiple
  commands we will call one that does everything.
 
  I agree with the suggestion that getVdsCaps() will actually return
  the capabilities.
  Capabilities being:
  - storage core version supported domain formats
  - VM core version and supported host capabilities.
  - network core and capabilities.
  - etc...
 
  These all should be mostly static and set at boot.
 
  As to the query API. I personally dislike the idea of a bag API.
  Now that we are moving away from HTTP, call overhead is no longer
  an
  issue so we can have multiple verbs and call them sequentially. In
  actuality we already do. Internally getVdsCaps() just aggregates
  other APIs.
  This makes return values of the method easier to handle and makes
  changing the results of an API call not affect users that don't
  care
  about that change.
  This also has better performance as storage APIs tend to slow the
  response and sending multiple commands would mean that you can get
  the Network stats even though the storage server is down.
 
  i thought getVdsCaps return the storage results from cache, which
  is
  refreshed by another thread, to make sure getVdsCaps has no
  latency.
 Well, this is what it should do, but it still doesn't do it. At least
 from
 what I have seen so far.
 I am currently working on a PoC implementation for caching packages
 and
 having a pyinotify-based trigger for refreshing the cache.
 I plan to really cache everything and we'll have a background thread
 running for updating the cached data on changes.
 
 I will be sending the proposed solution for it to the list. So we can
 discuss it into more details.
 
 --
 Regards,
 
 Vinzenz Feenstra
 Senior Software Engineer
 IRC: vfeenstr or evilissimo
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] API: Supporting internal/testing interfaces

2012-10-03 Thread Saggi Mizrahi
Never expose such things through the API.
I know that it is currently impossible to test the mailbox / lvextend flow 
without a full-blown VDSM running because of bad design, but this doesn't imply 
we should expose a testing interface through the main public API.

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: vdsm-devel@lists.fedorahosted.org
 Cc: Dan Kenigsberg dan...@redhat.com, fsimo...@redhat.com, Saggi 
 Mizrahi smizr...@redhat.com
 Sent: Wednesday, October 3, 2012 3:09:48 PM
 Subject: API: Supporting internal/testing interfaces
 
 Hi,
 
 A recent patch: http://gerrit.ovirt.org/#/c/8286/1 has brought up an
 important
 issue regarding the vdsm API and I would like to open up a discussion
 about how
 we should expose testing/internal interfaces in the next-generation
 vdsm API.
 
 The above change exposes an internal HSM verb 'sendExtendMsg' via the
 xmlrpc
 interface.  There is no doubt that this is useful for testing and
 debugging the
 storage mailbox functionality.  Until now, all new APIs were required
 to be
 documented in the vdsm api schema so that they can be properly
 exported to end
 users.  But we don't really want end users to consume this particular
 API.
 
 How should we handle this?  I see a few options:
 
 1) Don't document the API and omit it from the schema.  This is the
 patch's
 current approach.  I do not favor this approach because eventually
 the xmlrpc
 server will be going away and then we will lose the ability to use
 this new
 debugging API.  We need to decide how to support debugging interfaces
 going
 forward.
 
 2) Expose it in the schema as a debugging API.  This can be done by
 extending
 the symbol's dictionary with {'debug': True}.  Initially, the API
 documentation
 and code generators can simply skip over these symbols.  Later on, we
 could
 generate an independent libvdsm-debug.so library that includes these
 debugging
 APIs.
 
 Thoughts?
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] API: Supporting internal/testing interfaces

2012-10-03 Thread Saggi Mizrahi
My personal preference is using the VDSM debug hook to inject code into a running 
VDSM and dynamically add whatever you want.
This means the code is part of the test and not VDSM.

We used to use it (before the code rotted away) to add to VDSM the 
startCoverage() and endCoverage() verbs for tests.

Another option is having the code in an optional RPM (similar to how the debug 
hook is loaded only if it's installed).

I might also accept unpythonic things like conditional compilation.

Asking people nicely not to use a method that might corrupt their data-center 
doesn't always work with good people, not to mention bad ones.

You could also just fix the design :)
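
As an illustration of the injection approach, once a test can execute code 
inside the running daemon, grafting a verb on and off is trivial (the class 
and attribute names below are made up for the sketch, not VDSM's real ones):

# executed inside the running VDSM process by the injected test code
def sendExtendMsg(self, spUUID, volDict, newSize):
    return self.irs.sendExtendMsg(spUUID, volDict, newSize)

APIServer.sendExtendMsg = sendExtendMsg  # graft the test-only verb on
# ... exercise the mailbox ...
del APIServer.sendExtendMsg              # and remove it when done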

- Original Message -
 From: Federico Simoncelli fsimo...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Dan Kenigsberg dan...@redhat.com, vdsm-devel@lists.fedorahosted.org, 
 Adam Litke a...@us.ibm.com
 Sent: Wednesday, October 3, 2012 9:39:44 PM
 Subject: Re: API: Supporting internal/testing interfaces
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Adam Litke a...@us.ibm.com
  Cc: Dan Kenigsberg dan...@redhat.com, fsimo...@redhat.com,
  vdsm-devel@lists.fedorahosted.org
  Sent: Wednesday, October 3, 2012 9:27:02 PM
  Subject: Re: API: Supporting internal/testing interfaces
  
  Never expose such things through the API.
  I know that it is currently impossible to test the mailbox \
  lvextend
  flow without a full blown VDSM running because of bad design but
  this doesn't imply we should expose testing interface through the
  main public API.
 
 Ok, given that in the future we'll have a proper design, what is the
 short term alternative to efficiently test the mailbox?
 
 You also completely dismissed Adam's proposal to ship these in a
 separate
 libvdsm-debug.so library.
 
 --
 Federico
 
  - Original Message -
   From: Adam Litke a...@us.ibm.com
   To: vdsm-devel@lists.fedorahosted.org
   Cc: Dan Kenigsberg dan...@redhat.com, fsimo...@redhat.com,
   Saggi Mizrahi smizr...@redhat.com
   Sent: Wednesday, October 3, 2012 3:09:48 PM
   Subject: API: Supporting internal/testing interfaces
   
   Hi,
   
   A recent patch: http://gerrit.ovirt.org/#/c/8286/1 has brought up
   an
   important
   issue regarding the vdsm API and I would like to open up a
   discussion
   about how
   we should expose testing/internal interfaces in the
   next-generation
   vdsm API.
   
   The above change exposes an internal HSM verb 'sendExtendMsg' via
   the
   xmlrpc
   interface.  There is no doubt that this is useful for testing and
   debugging the
   storage mailbox functionality.  Until now, all new APIs were
   required
   to be
   documented in the vdsm api schema so that they can be properly
   exported to end
   users.  But we don't really want end users to consume this
   particular
   API.
   
   How should we handle this?  I see a few options:
   
   1) Don't document the API and omit it from the schema.  This is
   the
   patch's
   current approach.  I do not favor this approach because
   eventually
   the xmlrpc
   server will be going away and then we will lose the ability to
   use
   this new
   debugging API.  We need to decide how to support debugging
   interfaces
   going
   forward.
   
   2) Expose it in the schema as a debugging API.  This can be done
   by
   extending
   the symbol's dictionary with {'debug': True}.  Initially, the API
   documentation
   and code generators can simply skip over these symbols.  Later
   on,
   we
   could
   generate an independent libvdsm-debug.so library that includes
   these
   debugging
   APIs.
   
   Thoughts?
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] sanlock issues

2012-09-23 Thread Saggi Mizrahi
If you are trying to run sanlock on Fedora and you get this error:

Sep 23 11:26:56 dhcp-XX-XX.tlv.redhat.com sanlock[7083]: 2012-09-23 
11:26:56+0200 37014 [7083]: wdmd connect failed for watchdog handling

You need to do this:

# unload softdog if it's running
rmmod softdog
# Check if there are residual watchdog files under /dev and remove them
rm /dev/watchdog*
# reload the softdog module
modprobe softdog
# make sure the file is named /dev/watchdog
mv /dev/watchdog? /dev/watchdog
# set the proper selinux context
restorecon /dev/watchdog
# restart wdmd
systemctl restart wdmd.service
# restart sanlock
systemctl restart sanlock.service
# Profit!
fortune
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] [RFC] Implied UUIDs in API

2012-08-30 Thread Saggi Mizrahi
Hi, a lot of the IDs that get passed around in the API are UUIDs.
The point is that as long as you are not the entity generating the UUIDs, the 
fact that these are UUIDs has no real significance to you.
I suggest removing the validation of UUIDs from the receiving end. There is no 
real reason to make sure these are real UUIDs.
It's another restriction we can remove from the interface, simplifying both the 
code and the interface.

Just to be clear, I'm not saying that we should stop using UUIDs.
For example, vdsm will keep generating task IDs as UUIDs. But the documentation 
will state that it could be *any* string value.
If for some reason we choose to change the format of task IDs, there will be no 
need to change the interface.

The same goes for VM IDs. Currently the engine uses UUIDs, but there is no 
reason for VDSM to enforce this and prevent the engine from ever changing it in 
the future and using other string values.
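
In code, the contract becomes "any non-empty string", while the generator is 
free to keep emitting UUIDs; a minimal sketch:

import uuid

def new_task_id():
    # VDSM happens to emit UUID strings today
    return str(uuid.uuid4())

def check_id(task_id):
    # the receiving end no longer parses UUIDs; any non-empty string is fine
    if not task_id:
        raise ValueError('id must be a non-empty string')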
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] About vdsm rest api

2012-08-20 Thread Saggi Mizrahi
Hi, the REST API is going to move to a 2nd-tier API very soon.
There is also a pretty big API restructuring going on so that we can supply a 
supported and documented API.

Is there a reason you are not using the ovirt-engine REST API?

- Original Message -
 From: Paolo Tonin paolo.to...@gmail.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, August 16, 2012 10:00:41 PM
 Subject: [vdsm] About vdsm rest api
 
 Hi all there!
 
 Is there any documentation about implemented http rest api in the
 current version of VDSM (4.10)
 I would to use vdsm without oVirt packages, actually i'm using
 
 # rpm -qa vdsm*
 vdsm-xmlrpc-4.10.0-0.42.12.el6.noarch
 vdsm-4.10.0-0.42.12.el6.x86_64
 vdsm-rest-4.10.0-0.42.12.el6.noarch
 vdsm-bootstrap-4.10.0-0.42.12.el6.noarch
 vdsm-python-4.10.0-0.42.12.el6.x86_64
 vdsm-reg-4.10.0-0.42.12.el6.noarch
 vdsm-cli-4.10.0-0.42.12.el6.noarch
 
 Thanks a lot
 ___
 vdsm-devel mailing list
 vdsm-devel@lists.fedorahosted.org
 https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] About vdsm rest api

2012-08-20 Thread Saggi Mizrahi
reposting to list
---
I know it might seem a bit of a cumbersome process, but installation is 
relatively simple nowadays and the API is a lot more stable and safe at the 
moment.

Because VDSM doesn't keep any information on the host, you will end up having to 
develop your own solutions to problems already solved by the ovirt-engine.

You only need to use the VDSM API if you write your own management solution. 
But seeing as the ovirt-engine is open source, there should be no reason not to 
just write the feature you need and push it upstream.


- Original Message -
 From: Paolo Tonin paolo.to...@gmail.com
 To: Saggi Mizrahi smizr...@redhat.com
 Sent: Monday, August 20, 2012 11:33:50 AM
 Subject: Re: [vdsm] About vdsm rest api
 
 Yes, because i don't want to install oVirt engine (and subsequely
 entire oVirt web interface and DB)
 
 2012/8/20 Saggi Mizrahi smizr...@redhat.com:
  Hi, the rest API is going to move to a 2nd tier API very soon.
  There is also a pretty big API restructuring going on so we could
  supply a supported and documented API.
 
  Is there a reason you are not using the ovirt-engine REST-API?
 
  - Original Message -
  From: Paolo Tonin paolo.to...@gmail.com
  To: vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, August 16, 2012 10:00:41 PM
  Subject: [vdsm] About vdsm rest api
 
  Hi all there!
 
  Is there any documentation about implemented http rest api in the
  current version of VDSM (4.10)
  I would to use vdsm without oVirt packages, actually i'm using
 
  # rpm -qa vdsm*
  vdsm-xmlrpc-4.10.0-0.42.12.el6.noarch
  vdsm-4.10.0-0.42.12.el6.x86_64
  vdsm-rest-4.10.0-0.42.12.el6.noarch
  vdsm-bootstrap-4.10.0-0.42.12.el6.noarch
  vdsm-python-4.10.0-0.42.12.el6.x86_64
  vdsm-reg-4.10.0-0.42.12.el6.noarch
  vdsm-cli-4.10.0-0.42.12.el6.noarch
 
  Thanks a lot
  ___
  vdsm-devel mailing list
  vdsm-devel@lists.fedorahosted.org
  https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] Please Review

2012-08-01 Thread Saggi Mizrahi
I have a bunch of patches going stale adding minor improvements.
I would like to get reviews so they get pushed in.
I know they contain code paths that are unused at the moment.
But adding a death signal to certain copy operations, or using the permutation 
feature for testing, could prove useful for other people while I'm working on my 
own patches.

Everything that isn't WIP is ready to get pushed in.
http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:repo_engine,n,z
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] How should we handle aborted tasks? via engine, vdsClient or both?

2012-07-18 Thread Saggi Mizrahi


- Original Message -
 From: Lee Yarwood lyarw...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, July 18, 2012 10:53:45 AM
 Subject: Re: [vdsm] How should we handle aborted tasks? via engine, vdsClient 
 or both?
 
 On 07/18/2012 03:13 PM, Saggi Mizrahi wrote:
  We purposefully removed the ability to stop an aborted task from
  outside VDSM.
  It is one of the many features VDSM had (and still has) that could
  corrupt your data center if abused.
 
 Understood; however, we also lack the ability to manually recover a
 task,
 so is it just a case of waiting for VDSM to forcibly remove the
 aborted
 tasks itself?
Yes. Task recovery isn't really that robust. We are working on a different 
approach for tasks that is more cluster-aware.
 
  On a related note, this is the first time that the 1st rule of VDSM
  didn't apply!
  This is one hell of a milestone!
 
 I would ask what the 1st rule of VDSM is but I fear I might wake up
 in a
 basement.
 
 Lee
 --
 
 Lee Yarwood
 Software Maintenance Engineer
 Red Hat UK Ltd
 200 Fowler Avenue IQ Farnborough, Farnborough, Hants GU14 7JP
 
 Registered in England and Wales under Company Registration No.
 03798903
 Directors: Michael Cunningham (USA), Brendan Lane (Ireland), Matt
 Parson(USA), Charlie Peters (USA)
 
 GPG fingerprint : A5D1 9385 88CB 7E5F BE64  6618 BCA6 6E33 F672 2D76
 
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] Verify the storage data integrity after some storage operations with test cases

2012-07-17 Thread Saggi Mizrahi
Actually, setting up ISOs and installing an OS is overkill IMHO.
Using libguestfs seems simpler as it has Python bindings.

What you could do is:
1. Use libguestfs to format a file system on an image
2. Put files on said file system with libguestfs
3. Snapshot
4. Run fsck with libguestfs
5. Rinse
6. Repeat

If you don't trust fsck to detect all issues, you can use libguestfs to get an 
md5sum of the raw drive and make sure that after a snapshot it stays the same.
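
A sketch of that flow with the libguestfs Python bindings (paths and sizes 
here are just examples):

import guestfs

with open('/tmp/test.img', 'wb') as f:
    f.truncate(100 * 1024 * 1024)        # 100 MiB raw scratch image

g = guestfs.GuestFS()
g.add_drive_opts('/tmp/test.img', format='raw')
g.launch()
g.mkfs('ext4', '/dev/sda')               # 1. format a file system
g.mount('/dev/sda', '/')
g.write('/data', 'known content')        # 2. put files on it
g.umount_all()
md5_before = g.checksum_device('md5', '/dev/sda')
# ... 3. snapshot the image via VDSM here ...
assert g.fsck('ext4', '/dev/sda') == 0   # 4. file system is still clean
assert g.checksum_device('md5', '/dev/sda') == md5_before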


- Original Message -
 From: Shu Ming shum...@linux.vnet.ibm.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Monday, July 16, 2012 10:28:25 PM
 Subject: [vdsm] Verify the storage data integrity after some storage 
 operations with test cases
 
 Hi,
 
   To verify the storage data integrity after some storage operations
 like snapshotting and merging by VDSM, here are the test cases I am
 pondering.
 I would like to know your feedback about these thoughts.
 
 1) A customized ISO image with the required agent is prepared for
 bringing up a VM in VDSM
 2) The test case will inform VDSM to create a VM from the customized
 ISO image
 3) The test case will install an IO application into the VM
 4) The test case communicates with VDSM to inform the IO application
 in the VM to write some data intentionally.
 5) The test case sends the commands to VDSM to do some storage operation
 like disk snapshot, volume merging, etc.
   Say a snapshot operation here for an example.
 6) VDSM then tells the test case the result of the operation, like the
 name of the snapshot.
 7) The test case can read the snapshot made, to verify the snapshot
 against the data written in 4).
   Note: currently, there is no tool to read the snapshot image
 directly.  We can restart the VM with the snapshot as
   the active disk and tell the IO application in the VM to read the
 data written before for the test case.  And the test case can compare
 the data read with the data it informed the application of in 4).
 8) If the two data sets match, the storage operation succeeded;
 otherwise it failed.
 
 In order to write such a test case, these VDSM features will be
 required.
 
 1) VDSM can create a VM from a specific ISO image  (Almost works)
 2) The test case can install an IO application into the VM via VDSM (by
 ovirt-agent?)
 3) The test case must have some protocol with the IO application in the
 VM for
 passing commands to the VM and returning the result from the VM
   to the test case (by ovirt-agent?).
 4) The IO application can be seen as a test agent.  We may extend an
 existing agent like ovirt-agent as the IO application.
 
 --
 Shu Ming shum...@linux.vnet.ibm.com
 IBM China Systems and Technology Laboratory
 
 
 ___
 vdsm-devel mailing list
 vdsm-devel@lists.fedorahosted.org
 https://fedorahosted.org/mailman/listinfo/vdsm-devel
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] A few notes about lists in Makefiles

2012-07-16 Thread Saggi Mizrahi
Hi, I would just like to push a few notes to people modifying autoconf/automake 
lists.

Please make sure the lists are sorted. Sorted lists are easier to skim and 
modify.
Also, unsorted lists are known to make Federico sad, and we all want to keep 
him happy because he is a pretty swell guy and the one that we actually have to 
thank for the amazing build system.

Also, please make sure to add the $(NULL) item so that when auto-sorting you 
don't need to check whether you need to add/remove a backslash.

VARIABLE = \
   A \
   B \
   C \
   $(NULL)

If you are using vim you can just mark all the lines and run :!sort to 
sort.

Also, when adding a file to the PEP_WHITELIST, check if you can just mark the 
entire directory instead of the individual file.

Remember, cleanliness is next to godliness.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] Repo Engine Initial code drop

2012-07-13 Thread Saggi Mizrahi
http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:repo_engine,n,z

Everything that isn't marked as WIP is ready for some hard-core review and 
commit action.

The WIP parts are a bit rough around the edges; expect:
* Typos
* Spelling errors
* Grammar errors
* Functions that are remnants of attempts which proved problematic
* Lack of code reuse
* Old docstrings that are no longer correct
* Dragons

A few notes:
* I call the new storage domains "image repositories" because I think it
creates less confusion and ambiguity.
* VirtualDisks are writable entities you can run a VM off; snapshots are
read-only entities you can make VirtualDisks from. "Images" is a name for
both disks and snapshots.
* Only localfs is somewhat supported
* The image manipulation code is working and you can create images and
snapshots to your heart's delight. It might even work!
* The check process detects all tree issues, but there are only fixes for
orphaned tags and volumes, meaning you will be able to clean the whole tree.
* The APIs are not final
* Documentation is sparse
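
To make the disk/snapshot/image vocabulary concrete, here is a purely illustrative sketch (connect, create_disk and friends are made-up names, not the actual repo engine verbs, which are not final anyway):

# Illustrative only: the writable-disk / read-only-snapshot relationship.
from repoengine import connect  # hypothetical binding

repo = connect('localfs:/srv/images')
disk = repo.create_disk(size_gb=10)  # a VirtualDisk: writable, a VM runs off it
snap = disk.snapshot()               # a snapshot: read-only, point in time
clone = snap.create_disk()           # a new writable disk made from the snapshot
# "image" refers to either disk or snap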

I'm trying to make all this code separate from the regular VDSM core so we can 
push it in even though it's not perfect and slowly build up from that.

The biggest problem with integration is not having the blockdev feature in qemu 
and libvirt.
This means that running more than one VM off the same snapshot might
corrupt the qcow file.
https://bugzilla.redhat.com/show_bug.cgi?id=750801
https://bugzilla.redhat.com/show_bug.cgi?id=760547

If anyone wants to help find me on #VDSM @ freenode and we'll coordinate 
efforts.

My current TODO list:
1. XML-RPC API integration
--- Could be pushed in from this point on, as an experimental API
2. nfs repo engine (will introduce sanlock to the mix)
3. clustered-lvm repo engine (Will introduce SRM)
4. Tasks
--- I expect to have most of the API finalized here
5. Fix operations (merging, conversions)
6. Live snapshot
7. Live Merge
8. Live Storage Migration
9. Profit?
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-11 Thread Saggi Mizrahi
I'm sorry, but I don't really understand the drawing

- Original Message -
 From: Shu Ming shum...@linux.vnet.ibm.com
 To: Adam Litke a...@us.ibm.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, July 11, 2012 10:24:49 AM
 Subject: Re: [vdsm] [RFC] An alternative way to provide a supported interface 
 -- libvdsm
 
 Adam,
 
 Maybe,  I don't fully understand your proposal.  Here is my
 understanding of libvdsm in the picture. Please check the following
 link
 for the picture.
 
 http://www.ovirt.org/wiki/File:Libvdsm.JPG
 
 
 On 2012-7-9 21:56, Adam Litke wrote:
  On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote:
  On 07/06/2012 01:15 AM, Robert Middleswarth wrote:
  On 07/05/2012 04:45 PM, Adam Litke wrote:
  On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
  - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, July 5, 2012 2:34:50 PM
  Subject: Re: [RFC] An alternative way to provide a supported
  interface -- libvdsm
 
  On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote:
  The idea of having a supported C API was something I was
  thinking
  about doing
  (But I'd rather use gobject introspection and not schema
  generation) But the
  problem is not having a C API is using the current XML RPC
  API as
  it's base
  I want to disect this a bit to find out exactly where there
  might be
  agreement
  and disagreement.
 
  C API is a good thing to implement - Agreed.
 
  I also want to use gobject introspection but I don't agree
  that using
  glib
  precludes the use of a formalized schema.  My proposal is that
  we
  write a schema
  definition and generate the glib C code from that schema.
 
  I agree that the _current_ xmlrpc API makes a pretty bad base
  from
  which to
  start a supportable API.  XMLRPC is a perfectly reasonable
  remote/wire protocol
  and I think we should continue using it as a base for the next
  generation API.
  Using a schema will ensure that the new API is
  well-structured.
  The major problems with XML-RPC (and to some extent with REST
  as
  well) are high call overhead and no two way communication (push
  events). Basing on XML-RPC means that we will never be able to
  solve
  these issues.
  I am not sure I am ready to conceed that XML-RPC is too slow for
  our
  needs.  Can
  you provide some more detail around this point and possibly
  suggest an
  alternative that has even lower overhead without sacrificing the
  ubiquity and
  usability of XML-RPC?  As far as the two-way communication
  point, what
  are the
  options besides AMQP/ZeroMQ?  Aren't these even worse from an
  overhead
  perspective than XML-RPC?  Regarding two-way communication: you
  can
  write AMQP
  brokers based on the C API and run one on each vdsm host.
   Assuming
  the C API
  supports events, what else would you need?
  I personally think that using something like AMQP for inter-node
  communication and engine -> node would be optimal.  With a rest
  interface
  that just send messages though something like AMQP.
  I would also not dismiss AMQP so soon
  we want a bug with more than a single listener at engine side
  (engine, history db, maybe event correlation service).
  collectd as a means for statistics already supports it as well.
  I'm for having REST as well, but not sure as main one for a
  consumer
  like ovirt engine.
  I agree that a message bus could be a very useful model of
  communication between
  ovirt-engine components and multiple vdsm instances.  But the
  complexities and
  dependencies of AMQP do not make it suitable for use as a low-level
  API.  AMQP
  will repel new adopters.  Why not establish a libvdsm that is more
  minimalist
  and can be easily used by everyone?  Then AMQP brokers can be built
  on top of
  the stable API with ease.  All AMQP should require of the low-level
  API are
  standard function calls and an events mechanism.
 
  Thanks
  Robert
  The current XML-RPC API contains a lot of deficiencies and
  inefficiencies and we
  would like to retire it as soon as we possibly can. Engine
  would
  like us to
  move to a message based API and 3rd parties want something
  simple
  like REST so
  it looks like no one actually wants to use XML-RPC. Not even
  us.
  I am proposing that AMQP brokers and REST APIs could be
  written
  against the
  public API.  In fact, they need not even live in the vdsm tree
  anymore if that
  is what we choose.  Core vdsm would only be responsible for
  providing
  libvdsm
  and whatever language bindings we want to support.
  If we take the libvdsm route, the only reason to even have a
  REST
  bridge is only to support OSes other then Linux which is
  something
  I'm not sure we care about at the moment.
  That might be true regarding the current in-tree

Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-09 Thread Saggi Mizrahi
I don't think AMQP is a good low-level supported protocol, as it's a very
complex protocol to set up and support.
Also, brokers are known to have their differences in standard implementation,
which means supporting them all is a mess.

It looks like the most accepted route is the libvirt route of having a C
library abstracting away client-server communication, and having more advanced
consumers build protocol-specific bridges that may have different support
standards.

On a more personal note, I think brokerless messaging is the way to go in
oVirt because, unlike traditional clustering, worker nodes are not
interchangeable, so direct communication makes sense, rendering brokers pretty
much useless.
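
For a rough picture of what brokerless, direct messaging can look like, here is a minimal sketch using pyzmq (the port, verb and payload are made up, and both ends live in one process purely for demonstration):

import zmq

ctx = zmq.Context()

# "Node" side: a REP socket answers requests directly -- no broker involved.
node = ctx.socket(zmq.REP)
node.bind('tcp://127.0.0.1:5555')        # arbitrary example port

# "Engine" side: connects straight to the specific worker node it cares about.
engine = ctx.socket(zmq.REQ)
engine.connect('tcp://127.0.0.1:5555')

engine.send_json({'verb': 'getVdsStats'})
request = node.recv_json()               # the node sees the request...
node.send_json({'status': 'OK', 'verb': request['verb']})
print(engine.recv_json())                # ...and the engine gets the reply

With REQ/REP every client talks straight to the node it cares about; broadcasting *some* events to connected clients would map to a PUB socket on the node instead.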

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Itamar Heim ih...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Monday, July 9, 2012 9:56:17 AM
 Subject: Re: [vdsm] [RFC] An alternative way to provide a supported interface 
 -- libvdsm
 
 On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote:
  On 07/06/2012 01:15 AM, Robert Middleswarth wrote:
  On 07/05/2012 04:45 PM, Adam Litke wrote:
  On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
  
  - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, July 5, 2012 2:34:50 PM
  Subject: Re: [RFC] An alternative way to provide a supported
  interface -- libvdsm
  
  On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote:
  The idea of having a supported C API was something I was
  thinking
  about doing
  (But I'd rather use gobject introspection and not schema
  generation) But the
  problem is not having a C API is using the current XML RPC API
  as
  it's base
  I want to disect this a bit to find out exactly where there
  might be
  agreement
  and disagreement.
  
  C API is a good thing to implement - Agreed.
  
  I also want to use gobject introspection but I don't agree that
  using
  glib
  precludes the use of a formalized schema.  My proposal is that
  we
  write a schema
  definition and generate the glib C code from that schema.
  
  I agree that the _current_ xmlrpc API makes a pretty bad base
  from
  which to
  start a supportable API.  XMLRPC is a perfectly reasonable
  remote/wire protocol
  and I think we should continue using it as a base for the next
  generation API.
  Using a schema will ensure that the new API is well-structured.
  The major problems with XML-RPC (and to some extent with REST
  as
  well) are high call overhead and no two way communication (push
  events). Basing on XML-RPC means that we will never be able to
  solve
  these issues.
  I am not sure I am ready to conceed that XML-RPC is too slow for
  our
  needs.  Can
  you provide some more detail around this point and possibly
  suggest an
  alternative that has even lower overhead without sacrificing the
  ubiquity and
  usability of XML-RPC?  As far as the two-way communication point,
  what
  are the
  options besides AMQP/ZeroMQ?  Aren't these even worse from an
  overhead
  perspective than XML-RPC?  Regarding two-way communication: you
  can
  write AMQP
  brokers based on the C API and run one on each vdsm host.
   Assuming
  the C API
  supports events, what else would you need?
  I personally think that using something like AMQP for inter-node
  communication and engine -> node would be optimal.  With a rest
  interface
  that just send messages though something like AMQP.
  
  I would also not dismiss AMQP so soon
  we want a bug with more than a single listener at engine side
  (engine, history db, maybe event correlation service).
  collectd as a means for statistics already supports it as well.
  I'm for having REST as well, but not sure as main one for a
  consumer
  like ovirt engine.
 
 I agree that a message bus could be a very useful model of
 communication between
 ovirt-engine components and multiple vdsm instances.  But the
 complexities and
 dependencies of AMQP do not make it suitable for use as a low-level
 API.  AMQP
 will repel new adopters.  Why not establish a libvdsm that is more
 minimalist
 and can be easily used by everyone?  Then AMQP brokers can be built
 on top of
 the stable API with ease.  All AMQP should require of the low-level
 API are
 standard function calls and an events mechanism.
 
  
  Thanks
  Robert
  The current XML-RPC API contains a lot of deficiencies and
  inefficiencies and we
  would like to retire it as soon as we possibly can. Engine
  would
  like us to
  move to a message based API and 3rd parties want something
  simple
  like REST so
  it looks like no one actually wants to use XML-RPC. Not even
  us.
  I am proposing that AMQP brokers and REST APIs could be written
  against the
  public API.  In fact, they need not even live in the vdsm tree
  anymore if that
  is what we choose

Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-09 Thread Saggi Mizrahi


- Original Message -
 From: Itamar Heim ih...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Adam Litke a...@us.ibm.com, vdsm-devel@lists.fedorahosted.org
 Sent: Monday, July 9, 2012 11:03:43 AM
 Subject: Re: [vdsm] [RFC] An alternative way to provide a supported interface 
 -- libvdsm
 
 On 07/09/2012 05:56 PM, Saggi Mizrahi wrote:
  I don't think AMQP is a good low level supported protocol as it's a
  very complex protocol to set up and support.
  Also brokers are known to have their differences in standard
  implementation which means supporting them all is a mess.
 
  It looks like the most accepted route is the libvirt route of
  having a c library abstracting away client server communication
  and having more advanced consumers build protocol specific bridges
  that may have different support standards.
 
  On a more personal note, I think brokerless messaging is the way to
  go in ovirt because, unlike traditional clustering, worker nodes
  are not interchangeable so direct communication is the way to go,
  rendering brokers pretty much useless.
 
 but brokerless doesn't let multiple consumers which a bus provides?
All consumers can connect to the host, and *some* events can be broadcast to
all connected clients.

The real question is whether you want to depend on AMQP's routing \ message
storing.
Also, whether you find it preferable to have a centralized host (a single
point of failure) that gets all events from all hosts, for the price of some
clients (I assume read-only clients) not needing to know the locations of all
worker nodes.
But IMHO we already have something like that, it's called the ovirt-engine,
and it could send aggregated events about the cluster (maybe with some extra
enginy data).

The question is whether mandating a broker gives us something that an AMQP
bridge wouldn't.
The only thing I can think of is that VDSM could assume unmoderated
vdsm-to-vdsm communication, bypassing the engine.
This means that VDSM can have some clustered behavior that requires no engine
intervention.
Furthermore, the engine can send a request and let the nodes decide among
themselves who performs the operation.

Essentially:

[  engine  ]       [  engine  ]
   |    |     VS        |
[vdsm][vdsm]       [  broker  ]
                      |    |
                   [vdsm][vdsm]

*All links are two-way links

This has dire consequences for API usability and supportability, so we need to
converge on that.

There needs to be a good reason why the aforementioned logic can't sit in
another oVirt-specific entity (let's call it ovirt-dynamo) that uses VDSM's
supported API but whose own APIs (or, more likely, messaging algorithms) are
unsupported.
  
      [       engine       ]
         |       |       |
         |   [ broker ]  |
         |     |    |    |
[vdsm]-[dynamo]  :  [dynamo]-[vdsm]
      Host A     :     Host B

*All links are two-way links
 
 
  - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Itamar Heim ih...@redhat.com
  Cc: vdsm-devel@lists.fedorahosted.org
  Sent: Monday, July 9, 2012 9:56:17 AM
  Subject: Re: [vdsm] [RFC] An alternative way to provide a
  supported interface -- libvdsm
 
  On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote:
  On 07/06/2012 01:15 AM, Robert Middleswarth wrote:
  On 07/05/2012 04:45 PM, Adam Litke wrote:
  On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
 
  - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, July 5, 2012 2:34:50 PM
  Subject: Re: [RFC] An alternative way to provide a supported
  interface -- libvdsm
 
  On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi
  wrote:
  The idea of having a supported C API was something I was
  thinking
  about doing
  (But I'd rather use gobject introspection and not schema
  generation) But the
  problem is not having a C API is using the current XML RPC
  API
  as
  it's base
  I want to disect this a bit to find out exactly where there
  might be
  agreement
  and disagreement.
 
  C API is a good thing to implement - Agreed.
 
  I also want to use gobject introspection but I don't agree
  that
  using
  glib
  precludes the use of a formalized schema.  My proposal is
  that
  we
  write a schema
  definition and generate the glib C code from that schema.
 
  I agree that the _current_ xmlrpc API makes a pretty bad base
  from
  which to
  start a supportable API.  XMLRPC is a perfectly reasonable
  remote/wire protocol
  and I think we should continue using it as a base for the
  next
  generation API.
  Using a schema will ensure that the new API is
  well-structured.
  The major problems with XML-RPC (and to some extent with
  REST
  as
  well) are high call overhead and no two way communication
  (push

Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-05 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, July 5, 2012 4:45:08 PM
 Subject: Re: [RFC] An alternative way to provide a supported interface -- 
 libvdsm
 
 On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
  
  
  - Original Message -
   From: Adam Litke a...@us.ibm.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
   Development vdsm-devel@lists.fedorahosted.org
   Sent: Thursday, July 5, 2012 2:34:50 PM
   Subject: Re: [RFC] An alternative way to provide a supported
   interface -- libvdsm
   
   On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote:
The idea of having a supported C API was something I was
thinking
about doing
(But I'd rather use gobject introspection and not schema
generation) But the
problem is not having a C API is using the current XML RPC API
as
it's base
   
   I want to disect this a bit to find out exactly where there might
   be
   agreement
   and disagreement.
   
   C API is a good thing to implement - Agreed.
   
   I also want to use gobject introspection but I don't agree that
   using
   glib
   precludes the use of a formalized schema.  My proposal is that we
   write a schema
   definition and generate the glib C code from that schema.
   
   I agree that the _current_ xmlrpc API makes a pretty bad base
   from
   which to
   start a supportable API.  XMLRPC is a perfectly reasonable
   remote/wire protocol
   and I think we should continue using it as a base for the next
   generation API.
   Using a schema will ensure that the new API is well-structured.
  The major problems with XML-RPC (and to some extent with REST as
  well) are high call overhead and no two way communication (push
  events). Basing on XML-RPC means that we will never be able to
  solve these issues.
 
 I am not sure I am ready to conceed that XML-RPC is too slow for our
 needs.  Can
 you provide some more detail around this point and possibly suggest
 an
 alternative that has even lower overhead without sacrificing the
 ubiquity and
 usability of XML-RPC?  As far as the two-way communication point,
 what are the
 options besides AMQP/ZeroMQ?  Aren't these even worse from an
 overhead
 perspective than XML-RPC?  Regarding two-way communication: you can
 write AMQP
 brokers based on the C API and run one on each vdsm host.  Assuming
 the C API
 supports events, what else would you need?
If we plan to go with the libvdsm route, the only transports I think are
appropriate are either raw sockets (like libvirt) or ZMQ (just to take
advantage of it managing connections and message encapsulation, but it might
be overkill). Other than that, ZMQ\AMQP\REST\XML-RPC bridges are not really a
priority for me, as the engine will not be using any of the bridges.
 
The current XML-RPC API contains a lot of deficiencies and
inefficiencies and we
would like to retire it as soon as we possibly can. Engine
would
like us to
move to a message based API and 3rd parties want something
simple
like REST so
it looks like no one actually wants to use XML-RPC. Not even
us.
   
   I am proposing that AMQP brokers and REST APIs could be written
   against the
   public API.  In fact, they need not even live in the vdsm tree
   anymore if that
   is what we choose.  Core vdsm would only be responsible for
   providing
   libvdsm
   and whatever language bindings we want to support.
  If we take the libvdsm route, the only reason to even have a REST
  bridge is only to support OSes other then Linux which is something
  I'm not sure we care about at the moment.
 
 That might be true regarding the current in-tree implementation.
  However, I can
 almost guarantee that someone wanting to write a web GUI on top of
 standalone
 vdsm would want a REST API to talk to.  But libvdsm makes this use
 case of no
 concern to the core vdsm developers.
 
I do think that having C supportability in our API is a good
idea,
but the
current API should not be used as the base.
   
   Let's _start_ with a schema document that describes today's API
   and
   then clean
   it up.  I think that will work better than starting from scratch.
Once my
   schema is written I will post it and we can 'patch' it as a
   community
   until we
   arrive at a 1.0 version we are all happy with.
  +1
 
 Ok.  Redoubling my efforts to get this done.  Describing the output
 of
 list(True) takes awhile :)
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-06-27 Thread Saggi Mizrahi
The idea of having a supported C API was something I was thinking about doing
(but I'd rather use gobject introspection and not schema generation).
The problem is not having a C API, it's using the current XML-RPC API as its
base.

The current XML-RPC API contains a lot of deficiencies and inefficiencies, and
we would like to retire it as soon as we possibly can. Engine would like us to
move to a message-based API, and third parties want something simple like
REST, so it looks like no one actually wants to use XML-RPC. Not even us.

I do think that having C supportability in our API is a good idea, but the 
current API should not be used as the base.

- Original Message -
 From: Anthony Liguori anth...@codemonkey.ws
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Cc: Adam Litke a...@us.ibm.com, Saggi Mizrahi smizr...@redhat.com
 Sent: Monday, June 25, 2012 10:18:33 AM
 Subject: [RFC] An alternative way to provide a supported interface -- libvdsm
 
 Hi,
 
 I've been reading through the API threads here and considering the
 options.  To
 be honest, I worry a lot about the scope of these discussions and
 that there's a
 tremendous amount of work before we have a useful end result.
 
 I wonder if we can solve this problem by adding another layer of
 abstraction...
 
 As Adam is currently building a schema for VDSM's XML-RPC, we could
 use the QAPI
 code generators to build a libvdsm that provided a programmatic C
 interface for
 the XML-RPC interface.
 
 It would take some tweaking, but this could be made a supportable C
 interface.
 The rules for having a supportable C interface are basically:
 
 1) Never change function signatures
 
 2) Never remove functions
 
 3) Always allocate structures in the library and/or pad
 
 4) Only add to structures, never remove or reorder
 
 5) Provide flags that default to zero to indicate that
 fields/features are not
 present.
 
 6) Always zero-initialize structures
 
 Having a libvdsm would allow the transport to change over time w/o
 affecting
 end-users.  There are lots of good tools for documenting C APIs and
 dealing with
 versioning of C APIs.
 
 While we can start out with a schema-generated API, over time, we can
 implement
 libvdsm in an open-coded fashion allowing old APIs to be
 reimplemented in terms
 of new APIs.
 
  From a compatibility perspective, libvdsm would be fully backwards
  compatible
 with old versions of VDSM (so it would keep XML-RPC support forever)
 but may
 require new versions of libvdsm to talk to new versions of VDSM.
  That would
 allow for APIs to be deprecated within VDSM without breaking old
 clients.
 
 I think this would be an incremental approach to building a
 supportable API
 today while still giving the flexibility to make changes in the long
 term.
 
 And it should be fairly easy to generate a JNI binding and also port
 ovirt-engine to use an interface like this (since it already uses the
 XML-RPC API).
 
 Regards,
 
 Anthony Liguori
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [virt-node] RFC: API Supportability

2012-06-21 Thread Saggi Mizrahi
I tried to sum everything up in the wiki page [1].
Please review the page and see if there is something I missed or that you
don't agree with.

- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Dan Kenigsberg dan...@redhat.com
 Cc: Saggi Mizrahi smizr...@redhat.com, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org, Daniel
 Veillard veill...@redhat.com, Anthony Liguori aligu...@redhat.com
 Sent: Thursday, June 21, 2012 10:41:36 AM
 Subject: Re: [vdsm] [virt-node] RFC: API Supportability
 
 On Thu, Jun 21, 2012 at 01:20:40PM +0300, Dan Kenigsberg wrote:
  On Wed, Jun 20, 2012 at 10:42:16AM -0500, Adam Litke wrote:
   On Tue, Jun 19, 2012 at 10:17:28AM -0400, Saggi Mizrahi wrote:
I've opened a wiki page [1] for the stable API and extracted
some of the TODO points so we don't forget. Everyone can
feel free to add more stuff.

[1] http://ovirt.org/wiki/VDSM_Stable_API_Plan

Rest of the comments inline
- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development
 vdsm-devel@lists.fedorahosted.org, Barak Azulay
 bazu...@redhat.com, Itamar
 Heim ih...@redhat.com, Ayal Baron aba...@redhat.com,
 Anthony Liguori aligu...@redhat.com
 Sent: Monday, June 18, 2012 12:23:10 PM
 Subject: Re: [virt-node] RFC: API Supportability
 
 On Mon, Jun 18, 2012 at 11:02:25AM -0400, Saggi Mizrahi
 wrote:
  The first thing we need to decide is API supportabiliy.
  I'll list the
  questions that need to be answered. The decision made here
  will have great
  effect on transport selection (espscially API change
  process and
  versioning) so try and think about this without going to
  specfic
  technicalities (eg. X can't be done on REST).
 
 Thanks for sending this out.  I will take a crack at these
 questions...
 
 I would like to pose an additional question to be answered:
 
 - Should API parameter and return value constraints be
 formally defined?  If
 so, how?
 
 Think of this as defining an API schema.  For example: When
 creating a VM,
 which parameters are required/optional?  What are the valid
 formats for
 specifying a VM disk?  What are all of the possible task
 states?
Has to be part of response to the call that retrieves the
state. This will
allow us to change the states in a BC manner.
   
   I am not sure I agree.  I think it should be a part of the schema
   but not
   transmitted along with each API response involving a task.  This
   would increase
   traffic and make responses unnecessarily verbose.
   
  Is there a maximum length for the storage domain
  description?
I totally agree, how depends on the transport of choice but in
any case I
think the definition should be done in a declarative manner
(XML\JSON) using
concrete types (important for binding with C\Java) and have
some *code to
enforce* that the input is correct. This will prevent clients
from not
adhering to the schema exploiting python's relative lax
approach to types. We
already had issues with the engine wrongly sending numbers as
strings and
having this break internally because of some change in the
python code made it
not handle the conversion very well.
   
   Our schema should fully define a set of simple types and complex
   types.  Each
   defined simple type will have an internal validation function to
   verify
   conformity of a given input.  Complex types consist of nested
   lists and dicts of
   simple types.  They are validated first by validating members as
   simple types
   and then checking for missing and/or extra data.
  
  When designing a dependable API, we should not desert our agility.
  ovirt-Engine has enjoyed the possibility of saying hey, we want
  another
  field reported in getVdsStats and presto, here it was.
  Complex types should be easily extendible (with a proper update of
  the
  API minor version, or a capabilities set).
 
 +1
 
 --
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [virt-node] RFC: API Supportability

2012-06-19 Thread Saggi Mizrahi
I've opened a wiki page [1] for the stable API and extracted some of the TODO 
points so we don't forget. Everyone can feel free to add more stuff.

[1] http://ovirt.org/wiki/VDSM_Stable_API_Plan

Rest of the comments inline
- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Barak 
 Azulay bazu...@redhat.com, Itamar
 Heim ih...@redhat.com, Ayal Baron aba...@redhat.com, Anthony Liguori 
 aligu...@redhat.com
 Sent: Monday, June 18, 2012 12:23:10 PM
 Subject: Re: [virt-node] RFC: API Supportability
 
 On Mon, Jun 18, 2012 at 11:02:25AM -0400, Saggi Mizrahi wrote:
  The first thing we need to decide is API supportabiliy. I'll list
  the questions
  that need to be answered. The decision made here will have great
  effect on
  transport selection (espscially API change process and versioning)
  so try and
  think about this without going to specfic technicalities (eg. X
  can't be done
  on REST).
 
 Thanks for sending this out.  I will take a crack at these
 questions...
 
 I would like to pose an additional question to be answered:
 
 - Should API parameter and return value constraints be formally
 defined?
   If so, how?
 
 Think of this as defining an API schema.  For example: When creating
 a VM, which
 parameters are required/optional?  What are the valid formats for
 specifying a
 VM disk?
  What are all of the possible task states?
Has to be part of the response to the call that retrieves the state. This will
allow us to change the states in a BC manner.
  Is there a
 maximum length
 for the storage domain description?
I totally agree; the "how" depends on the transport of choice, but in any case
I think the definition should be done in a declarative manner (XML\JSON),
using concrete types (important for binding with C\Java), and have some *code
to enforce* that the input is correct. This will prevent clients from not
adhering to the schema by exploiting Python's relatively lax approach to
types. We already had issues with the engine wrongly sending numbers as
strings, and this broke internally because a change in the Python code made it
no longer handle the conversion well.
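
As a rough illustration of what declarative definitions plus enforcement could look like (a minimal sketch; the type names and the VM_CREATE schema are invented for the example):

# Minimal sketch of declarative type definitions with enforcement.
SIMPLE_TYPES = {
    'int': lambda v: isinstance(v, int) and not isinstance(v, bool),
    'str': lambda v: isinstance(v, str),
}

# A complex type is just a dict of field name -> simple type name.
VM_CREATE = {'vmId': 'str', 'memSize': 'int'}

def validate(schema, params):
    missing = set(schema) - set(params)
    extra = set(params) - set(schema)
    if missing or extra:
        raise TypeError('missing=%s extra=%s' % (missing, extra))
    for field, typename in schema.items():
        if not SIMPLE_TYPES[typename](params[field]):
            raise TypeError('%s must be a %s' % (field, typename))

validate(VM_CREATE, {'vmId': 'abc', 'memSize': 4096})    # passes
validate(VM_CREATE, {'vmId': 'abc', 'memSize': '4096'})  # raises: str, not int

The last call is exactly the numbers-as-strings case above: the schema rejects it at the API boundary instead of letting it break deep inside the Python code.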
 
 
  New API acceptance process
  ==
   - What is the process to suggest new API calls?
 
 New APIs should be proposed on a mailing list.  This allows everyone
 to
 participate in the conversation and preserves the comments.  If the
 API is
 simple, a patch the provides a concrete example of implementation is
 recommended.  Once the API design is agreed upon, patches can be
 submitted via
 the standard method (gerrit) to implement a new experimental API
 based on the
 design.  The submitter of the patches should reply to the design
 discussion
 thread to notify participants of the available code.
+1
 
   - Who can ack such a change?
 
 API changes should be subject to wider approval than a simple change
 to an
 internal component.  I believe that the +1,-1 system works well here
 and we
 should seek approvals from all participants in the design discussion
 if
 possible.
I will add that you need a +1 from at least 2 maintainers for an API change.
Also someone has to test that the change did not break old clients.
 
   - Does someone have veto rights?
 
 Anyone can NACK an API design.  Same rules as for normal code.
+1
 
   - Are there experimental APIs?
 
 Yes!  Dave Allan has mentioned that from his experience in libvirt,
 it would be
 very nice to have experimental APIs that can be improved before being
 baked into
 the supportable API.  I definitely agree.  In fact, all new APIs
 should go
 through a period of being experimental.  Experimental API functions
 should
 begin with '_'.  Once deemed stable, the '_' can be removed.
I don't like this specific mangling scheme, but I do agree that we need
experimental calls.
I also think that you need a mechanism to turn them on, similar to `import
__future__` in Python, so that you make sure the API user knows it's
experimental.
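
Something along these lines, sketched purely for illustration (Connection, mergeSnapshots2 and the opt-in keyword are all hypothetical):

# Hypothetical opt-in for experimental verbs, in the spirit of
# Python's `from __future__ import ...`.
EXPERIMENTAL_VERBS = {'mergeSnapshots2'}   # made-up verb name

class Connection:
    def __init__(self, host, experimental=()):
        self.host = host
        self._experimental = frozenset(experimental)

    def call(self, verb, **params):
        if verb in EXPERIMENTAL_VERBS and verb not in self._experimental:
            raise RuntimeError(
                '%s is experimental; opt in at connect time to use it' % verb)
        # ... dispatch over the real transport here

conn = Connection('node01', experimental={'mergeSnapshots2'})
conn.call('mergeSnapshots2')   # allowed only because of the explicit opt-in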
 
 
  
  API deprecation process
  ===
   - Do we allow deprecation?
 
 I would like to allow deprecation because it grants us an avenue to
 clean up the
 API from time to time.  That being said, I am not aware of a clean
 way to do
 this without breaking old clients too badly.  At a minimum, an API
 would need to
 be deprecated for at least 2 years before it can be removed.  How
 will this
 decision influence the initial API design?  Are there features that
 we can build
 into an API that can ease the burden of deprecation on API consumers?
Deprecation is tricky. We also need a mechanism for a client to know that its
version of the API no longer exists, so it can check for that at host
connection time and fail if the client is too old.
To do that we could have API group versions and expose which versions are
supported in full.
We could also take the OpenGL route of querying per call. But this might

Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-19 Thread Saggi Mizrahi


- Original Message -
 From: Ryan Harper ry...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Ryan Harper ry...@us.ibm.com, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org, Anthony
 Liguori aligu...@redhat.com
 Sent: Tuesday, June 19, 2012 9:30:08 AM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
 
 * Saggi Mizrahi smizr...@redhat.com [2012-06-18 16:09]:
  Ryan, thanks for commenting.
  
  Sadly I feel that your points, though important, are a bit of a
  digression from the main discussion.  Internal architectural
  changes
  to VDSM are out of the scope as this should be done on a very tight
  schedual.
 
 I don't think I was suggesting internal architectural changes.  I may
 not yet be familiar enough with to code to understand that modifying
 the
 exist API will result in architectural changes.
 
 I do worry about what we expect to accomplish here if we have a tight
 schedule and also include the idea of general purpose virt host
 manager.  Maybe your opening was too wide for the specific purpose
 you
 were intending (your numbered list).
 
 If you're strictly focused on something around Fedora18 timeline
 wise, I
 would agree that there isn't much runway to make big changes.
 
 With that in mind, I'd say we need to add a topic to your list:
 
 5. API versioning and deprecation
This is part of the supportability discussion. Please join in if you have 
something to add. The supportability email was sent to the list as well.
 
 I believe you've got a number of questions in this space on your
 other
 thread so I'll move over there.  This is going to be a critical
 dicussion on how we move forward.
 
  
  Seeing as this is a pretty good list of things that need to be
  done\discussed in VDSM anyway. I took the liberty of putting them
  in a
  wiki page [1] so we don't forget and others can add\comment on the
  ideas.
 
 Thanks.
 
  
  In any case you can feel free to raise those issues on the list
  separately. Specifically, 3rd party plugins might be very topical
  with the undergoing gluster integration work.
  
  [1] http://www.ovirt.org/wiki/VDSM_Potential_Features
  
  - Original Message -
   From: Ryan Harper ry...@us.ibm.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: VDSM Project Development
   vdsm-devel@lists.fedorahosted.org, Anthony Liguori
   aligu...@redhat.com
   Sent: Monday, June 18, 2012 3:43:42 PM
   Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt
   host manager
   
   * Saggi Mizrahi smizr...@redhat.com [2012-06-18 10:05]:
I would like to put on to the table for descussion the growing
need
for a way
to more easily reuse of the functionality of VDSM in order to
service projects
other than Ovirt-Engine.

Originally VDSM was created as a proprietary agent for the sole
purpose of
serving the then proprietary version of what is known as
ovirt-engine. Red Hat,
after acquiring the technology, pressed on with it's commitment
to
open source
ideals and released the code. But just releasing code into the
wild
doesn't
build a community or makes a project successful. Further more
when
building
open source software you should aspire to build reusable
components
instead of
monolithic stacks.
   
   
   Saggi,
   
   Thanks for sending this out.  I've been trying to pull together
   some
   thoughts on what else is needed for vdsm as a community.  I know
   that
   for some time downstream has been the driving force for all of
   the
   work
   and now with a community there are challenges in finding our own
   way.
   
   While we certainly don't want to make downstream efforts harder,
   I
   think
   we need to develop and support our own vision for what vdsm can
   be
   come,
   some what independent of downstream and other exploiters.
   
   Revisiting the API is definitely a much needed endeavor and I
   think
   adding some use-cases or sample applications would be useful in
   demonstrating whether or not we're evolving the API into
   something
   easier to use for applications beyond engine.

We would like to expose a stable, documented, well supported
API.
This gives
us a chance to rethink the VDSM API from the ground up. There
is
already work
in progress of making the internal logic of VDSM separate
enough
from the API
layer so we could continue feature development and bug fixing
while
designing
the API of the future.

In order to achieve this though we need to do several things:
   1. Declare API supportability guidelines
   2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
   3. Make the API easily consumable (e.g. proper docs, example
   code, extending
  the API, etc)
   4. Implement the API itself
   
   I agree with the list, but I'd like to work on the redesign
   discussion so
   that we're not doing all

[vdsm] My VDSM delopment workflow

2012-06-19 Thread Saggi Mizrahi
People keep asking me about my code\test cycle, so I decided to just post a
small writeup on the list.

Always test on latest stable fedora and RHEL, put yum upgrade -y as a nightly 
cron command!

My development storage is a FreeNAS VM (But I will be moving to f17+lio when I 
find the time to configure everything and they implement CHAP auth)

For things that have to be tested with a full-blown VDSM install I have a
script* that pulls from my git HEAD and installs it on the host. I don't use
rsync because the timestamps confuse Make and might cause my local fedora
files to be packaged by mistake.
I also delete the RPMs and clean-install new ones each time. It takes longer
and invokes a libvirt reconfigure each time, but it catches a lot of elusive
errors that QE often miss.

Always use yum! The rpm command is much less robust.

This script is meant to work on EL\Fedora that just has git and multipath 
installed, and a cloned repo in which your development machine is the origin.

Also note I explicitly clean /usr/share/vdsm because locally changed files 
don't have the same MD5 hash and might not be removed\replaced by yum.

It takes quite a while but it just makes writing unit tests that much more 
appealing :)

Make sure you have sudo rights to the appropriate commands

I hope people find this helpful.

---

#!/bin/bash
# Get git root
PROJ_GIT_DIR=$(git rev-parse --git-dir | xargs dirname | xargs readlink -f)
pushd "$PROJ_GIT_DIR"

# Make sure autotools and other basic dependencies are installed
sudo yum install -y automake autoconf gcc rpm-build pyflakes

# Fetch remote head
git checkout HEAD~ > /dev/null  # detach so the testhead branch can be replaced
git fetch -f origin HEAD:testhead
git checkout testhead

# Remove old RPMs
rm -rvf ~/rpmbuild

# Build
./autogen.sh --system
./configure

# Install build requirements
grep BuildRequires vdsm.spec | awk '{print $2}' | \
xargs sudo yum install -y

make clean
make rpm || exit 1

# Stop VDSM
sudo /sbin/service vdsmd stop

# Clean RPMs
rpm -qa | grep vdsm | xargs sudo yum remove -y
# Clean any local edits
sudo rm -rf /usr/share/vdsm

# Install new RPMs
ls ~/rpmbuild/RPMS/*/*.rpm | grep -v faqemu | grep -v hook | grep -v reg | \
grep -v bootstrap | xargs sudo yum localinstall --nogpgcheck -y

popd

# Start VDSM
sudo /sbin/service vdsmd start
# VDSM has a long standing issues with reporting OK on start when it's actually
# down
sudo /sbin/service vdsmd status
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-19 Thread Saggi Mizrahi


- Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: Ryan Harper ry...@us.ibm.com
 Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori 
 aligu...@redhat.com, VDSM Project Development
 vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, June 19, 2012 10:58:47 AM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
 
 On 06/19/2012 01:13 AM, Ryan Harper wrote:
  * Saggi Mizrahismizr...@redhat.com  [2012-06-18 10:05]:
  I would like to put on to the table for descussion the growing
  need for a way
  to more easily reuse of the functionality of VDSM in order to
  service projects
  other than Ovirt-Engine.
 
  Originally VDSM was created as a proprietary agent for the sole
  purpose of
  serving the then proprietary version of what is known as
  ovirt-engine. Red Hat,
  after acquiring the technology, pressed on with it's commitment to
  open source
  ideals and released the code. But just releasing code into the
  wild doesn't
  build a community or makes a project successful. Further more when
  building
  open source software you should aspire to build reusable
  components instead of
  monolithic stacks.
 
  Saggi,
 
  Thanks for sending this out.  I've been trying to pull together
  some
  thoughts on what else is needed for vdsm as a community.  I know
  that
  for some time downstream has been the driving force for all of the
  work
  and now with a community there are challenges in finding our own
  way.
 
  While we certainly don't want to make downstream efforts harder, I
  think
  we need to develop and support our own vision for what vdsm can be
  come,
  some what independent of downstream and other exploiters.
 
  Revisiting the API is definitely a much needed endeavor and I think
  adding some use-cases or sample applications would be useful in
  demonstrating whether or not we're evolving the API into something
  easier to use for applications beyond engine.
 
  We would like to expose a stable, documented, well supported API.
  This gives
  us a chance to rethink the VDSM API from the ground up. There is
  already work
  in progress of making the internal logic of VDSM separate enough
  from the API
  layer so we could continue feature development and bug fixing
  while designing
  the API of the future.
 
  In order to achieve this though we need to do several things:
  1. Declare API supportability guidelines
  2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
  3. Make the API easily consumable (e.g. proper docs, example
  code, extending
 the API, etc)
  4. Implement the API itself
  I agree with the list, but I'd like to work on the redesign
  discussion so
  that we're not doing all of 1-4 around the existing API that's
  engine-focused.
 
  I'm over due for posting a feature page on vdsm standalone mode,
  and I
  have some other thoughts on various uses.
 
  Some other paths of thought for use-cases I've been mulling over:
 
   - Simplifying using QEMU/KVM
   - consuming qemu via command line
   - can we manage/support developers launching qemu
   directly
   - consuming qemu via libvirt
   - can we integrate with systems that are already using
   libvirt
 
   - Addressing issues with libvirt
   - are there kvm specific features we can exploit that
   libvirt
   doesn't?
 
   - Scale-up/fail-over
   - can we support a single vdsm node, but allow for
   building up
   clusters/groups without bringing in something like
   ovirt-engine
   - can we look at decentralized fail-over for reliability
   without
   a central mgmt server?
 
   - pluggability
   - can we support an API that allows for third-party
   plugins to
   support new features or changes in implementation?
 
 Pluggability feature would be nice. Even nicer would be the ability
 to
 introspect and figure whats supported by VDSM. For eg: It would be
 nice
 to query what plugins/capabilities are supported and accordingly the
 client can take a decision and/or call the appropriate APIs w/o
 worrying
 about ENOTSUPP kind of error.
 It does becomes blur when we talk about Repository Engines... that
 was
 also targetted to provide pluggaibility in managing Images.. how will
 that co-exist with API level pluggability ?
 
 IIUC, StorageProvisioning (via libstoragemgmt) can be one such
 optional
 support that can fit as a plug-in nicely, right ?
You will have an introspection verb to get the supported storage engines.
Without the engine, the hosts will not be able to log in to an image repo, but
it will not be an API-level error. You will get UnsupportedRepoFormatError or
something similar no matter which version of VDSM you use. The error is part
of the interface, and engines will expose their format and parameters in some
way.
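
A possible shape for that introspection, purely as a sketch (getRepoEngines and connectRepo are hypothetical names, not the actual API):

# Hypothetical capability query before touching a repo engine.
def choose_engine(vdsm, wanted='localfs'):
    # vdsm is a hypothetical connected client object.
    engines = vdsm.getRepoEngines()      # e.g. ['localfs', 'nfs']
    if wanted not in engines:
        raise RuntimeError('host lacks the %s repo engine' % wanted)
    return vdsm.connectRepo('%s:/srv/images' % wanted)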
 
   - kvm tool integration

[vdsm] [virt-node] RFC: API Supportability

2012-06-18 Thread Saggi Mizrahi
The first thing we need to decide is API supportability. I'll list the
questions that need to be answered. The decision made here will have a great
effect on transport selection (especially the API change process and
versioning), so try to think about this without going into specific
technicalities (e.g. "X can't be done on REST").

New API acceptance process
==
 - What is the process to suggest new API calls?
 - Who can ack such a change?
 - Does someone have veto rights?
 - Are there experimental APIs?

API deprecation process
===
 - Do we allow deprecation?
 - When can an API call be deprecated?
 - Who can ack such a change?
 - Does someone have veto rights?

API change process
==
 - Can calls be modified, or can no symbol ever repeat in a different form?
 - When can an API call be deprecated?
 - Who can ack such a change?
 - Does someone have veto rights?

API versioning
==
 - Is the API versioned as a whole, is it per subsystem (storage, networking,
   etc..) or is each call versioned by itself.
 - What happens when an old client connects to a newer server?
 - What happens when a new client connects to an older server?
 - How will versioning be expressed in the bindings?
 - Do we restrict newer clients from using old APIs when talking with a new
   server?
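
To make the versioning questions concrete, here is one possible connect-time negotiation, sketched purely for illustration (per-subsystem version ranges are just one of the options listed above):

# Hypothetical connect-time version negotiation sketch.
SERVER_SUPPORTED = {'storage': (2, 4), 'networking': (1, 3)}  # min, max per subsystem

def negotiate(client_versions):
    """client_versions: dict of subsystem -> version the client speaks."""
    agreed = {}
    for subsystem, wanted in client_versions.items():
        low, high = SERVER_SUPPORTED.get(subsystem, (None, None))
        if low is None or not (low <= wanted <= high):
            raise RuntimeError(
                'unsupported %s API version %s' % (subsystem, wanted))
        agreed[subsystem] = wanted
    return agreed

negotiate({'storage': 3})   # OK
negotiate({'storage': 1})   # raises: this client is too old for the server

A scheme like this answers the old-client/new-server questions explicitly at connect time instead of failing on some arbitrary later call.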
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-18 Thread Saggi Mizrahi
I would like to put on the table for discussion the growing need for a way
to more easily reuse the functionality of VDSM in order to service projects
other than ovirt-engine.

Originally VDSM was created as a proprietary agent for the sole purpose of
serving the then-proprietary version of what is known as ovirt-engine. Red
Hat, after acquiring the technology, pressed on with its commitment to open
source ideals and released the code. But just releasing code into the wild
doesn't build a community or make a project successful. Furthermore, when
building open source software you should aspire to build reusable components
instead of monolithic stacks.

We would like to expose a stable, documented, well supported API. This gives
us a chance to rethink the VDSM API from the ground up. There is already work
in progress of making the internal logic of VDSM separate enough from the API
layer so we could continue feature development and bug fixing while designing
the API of the future.

In order to achieve this though we need to do several things:
   1. Declare API supportability guidelines
   2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
   3. Make the API easily consumable (e.g. proper docs, example code, extending
  the API, etc)
   4. Implement the API itself

All of these are dependent on one another and the permutations are endless.
This is why I think we should try and work on each one separately. All
discussions will be done openly on the mailing list and until the final version
comes out nothing is set in stone.

If you think you have anything to contribute to this process, please do so
either by commenting on the discussions or by sending code/docs/whatever
patches. Once the API solidifies it will be quite difficult to change
fundamental things, so speak now or forever hold your peace. Note that this is
just an introductory email. There will be a quick follow up email to kick start
the discussions.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] vdsm vs selinux

2012-06-18 Thread Saggi Mizrahi
Do you have an AVC denial in the audit log? What does it say?
(Please run sealert -a FILE and post the resolved text along with the original
AVC denial.)
Are you using NFS\localfs\SAN?

What are the credentials and contexts of the files in question?
Have you recently turned selinux on\off?
Did you upgrade the OS or selinux policy?
What is the libvirt version?

- Original Message -
 From: Laszlo Hornyak lhorn...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Monday, June 18, 2012 11:13:37 AM
 Subject: [vdsm] vdsm vs selinux
 
 hi,
 
 I am running the latest VDSM (built from git repo) on rhel 6.2 and
 looks like it has some issues with selinux. setenforce 0 solves the
 problem, but is there a proper solution under way?
 
 Traceback (most recent call last):
   File /usr/share/vdsm/vm.py, line 570, in _startUnderlyingVm
 self._run()
   File /usr/share/vdsm/libvirtvm.py, line 1364, in _run
 self._connection.createXML(domxml, flags),
   File
   /usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py,
   line 82, in wrapper
 ret = f(*args, **kwargs)
   File /usr/lib64/python2.6/site-packages/libvirt.py, line 2490, in
   createXML
 if ret is None:raise libvirtError('virDomainCreateXML() failed',
 conn=self)
 libvirtError: internal error Process exited while reading console log
 output: char device redirected to /dev/pts/2
 qemu-kvm: -drive
 file=/rhev/data-center/8c369da4-b3a0-11e1-9db0-273609afe0b1/efef4a96-16b1-4f14-a252-f33c7a8ce52b/images/40d2cc3a-9e9c-4224-af6f-2450efc883ca/e84617c5-8073-46de-85bd-2497235a5ba2,if=none,id=drive-virtio-disk0,format=raw,serial=40d2cc3a-9e9c-4224-af6f-2450efc883ca,cache=none,werror=stop,rerror=stop,aio=threads:
 could not open disk image
 /rhev/data-center/8c369da4-b3a0-11e1-9db0-273609afe0b1/efef4a96-16b1-4f14-a252-f33c7a8ce52b/images/40d2cc3a-9e9c-4224-af6f-2450efc883ca/e84617c5-8073-46de-85bd-2497235a5ba2:
 Permission denied
 
 
 Thank you,
 Laszlo
 ___
 vdsm-devel mailing list
 vdsm-devel@lists.fedorahosted.org
 https://fedorahosted.org/mailman/listinfo/vdsm-devel
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-18 Thread Saggi Mizrahi
To the question of "What blocks us from using the current VDSM API?":

The main issue is supportability; this is also why it's the first point of
discussion.
The current API has no supportability guidelines, and there is no way we could
support it for the long run.

Furthermore, the current API, apart from being outdated, is highly
engine-specific. A lot of the decisions are related to VDSM having to slow
down development to accommodate the slower pace of movement of the giant that
is the ovirt-engine.
This means, for example, having confusing verbs and argument names (e.g.
destroy, and iSCSI portals), having redundant steps in the setup of things
(e.g. storage domain creation), arbitrary limitations (e.g. storage pools,
iso\export domains), etc.

In order to give a well supported API, we need to think about what we expose 
and how we expose it. Every verb should be thoroughly examined.

This was not the case when the original API was created because, as I already
noted, it was built for a setup where the supported API is at the engine level
and not the VDSM level. This made creating\removing verbs a lot quicker and
less expensive, accepting things we knew were not ideal with the knowledge
that we could change them in the next version. This cannot be the case with a
supported public API.

- Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Anthony 
 Liguori aligu...@redhat.com
 Sent: Monday, June 18, 2012 1:35:21 PM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
 
 On 06/18/2012 08:32 PM, Saggi Mizrahi wrote:
  I would like to put on to the table for descussion the growing need
  for a way
  to more easily reuse of the functionality of VDSM in order to
  service projects
  other than Ovirt-Engine.
 
  Originally VDSM was created as a proprietary agent for the sole
  purpose of
  serving the then proprietary version of what is known as
  ovirt-engine. Red Hat,
  after acquiring the technology, pressed on with it's commitment to
  open source
  ideals and released the code. But just releasing code into the wild
  doesn't
  build a community or makes a project successful. Further more when
  building
  open source software you should aspire to build reusable components
  instead of
  monolithic stacks.
 
 
 Can you list issues that block tools (other than ovirt-engine) in
 using
 VDSM ?
 That will help provide more clarity and scope of work described here.
 
 I understand the lack of REST API, which is where Adam's work comes
 in.
 With REST API support for vdsm, other tools can integrate upwardly
 with
 VDSM and exploit it. What else ? How does the current API layer
 design/implementation inhibit tools other than ovirt-engine to use
 VDSM  ?
 
  We would like to expose a stable, documented, well supported API.
  This gives
  us a chance to rethink the VDSM API from the ground up. There is
  already work
  in progress of making the internal logic of VDSM separate enough
  from the API
  layer so we could continue feature development and bug fixing while
  designing
  the API of the future.
 
  In order to achieve this though we need to do several things:
  1. Declare API supportability guidelines
  2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
  3. Make the API easily consumable (e.g. proper docs, example
  code, extending
 the API, etc)
  4. Implement the API itself
 
  All of these are dependent on one another and the permutations are
  endless.
  This is why I think we should try and work on each one separately.
  All
  discussions will be done openly on the mailing list and until the
  final version
  comes out nothing is set in stone.
 
  If you think you have anything to contribute to this process,
  please do so
  either by commenting on the discussions or by sending
  code/docs/whatever
  patches. Once the API solidifies it will be quite difficult to
  change
  fundamental things, so speak now or forever hold your peace. Note
  that this is
  just an introductory email. There will be a quick follow up email
  to kick start
  the discussions.
  ___
  vdsm-devel mailing list
  vdsm-devel@lists.fedorahosted.org
  https://fedorahosted.org/mailman/listinfo/vdsm-devel
 
 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [Engine-devel] RFC: Writeup on VDSM-libstoragemgmt integration

2012-06-18 Thread Saggi Mizrahi
First of all, I'd like to suggest not using the LSM acronym, as it can also
mean live storage migration and maybe other things.

Secondly I would like to avoid talking about what needs to be changed in VDSM 
before we figure out what exactly we want to accomplish.

Also, there is no mention on credentials in any part of the process.
How does VDSM or the host get access to actually modify the storage array?
Who holds the creds for that and how?
How does the user set this up?

In the array-as-domain case, how are the luns being mapped to initiators? What 
about setting discovery credentials?
In the array set-up case, how will the hosts be represented with regard to 
credentials?
How will the different schemes and capabilities with regard to authentication 
methods be expressed?

Rest of the comments inline

- Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Cc: libstoragemgmt-de...@lists.sourceforge.net, engine-de...@ovirt.org
 Sent: Wednesday, May 30, 2012 5:38:46 AM
 Subject: [Engine-devel] RFC: Writeup on VDSM-libstoragemgmt integration
 
 Hello All,
 
  I have a draft write-up on the VDSM-libstoragemgmt integration.
 I wanted to run this thru' the mailing list(s) to help tune and
 crystallize it, before putting it on the ovirt wiki.
 I have run this once thru Ayal and Tony, so have some of their
 comments
 incorporated.
 
 I still have few doubts/questions, which I have posted below with
 lines
 ending with '?'
 
 Comments / Suggestions are welcome  appreciated.
 
 thanx,
 deepak
 
 [Ccing engine-devel and libstoragemgmt lists as this stuff is
 relevant
 to them too]
 
 --
 
 1) Background:
 
 VDSM provides high level API for node virtualization management. It
 acts
 in response to the requests sent by oVirt Engine, which uses VDSM to
 do
 all node virtualization related tasks, including but not limited to
 storage management.
 
 libstoragemgmt aims to provide vendor agnostic API for managing
 external
 storage array. It should help system administrators utilizing open
 source solutions have a way to programmatically manage their storage
 hardware in a vendor neutral way. It also aims to facilitate
 management
 automation, ease of use and take advantage of storage vendor
 supported
 features which improve storage performance and space utilization.
 
 Home Page: http://sourceforge.net/apps/trac/libstoragemgmt/
 
 libstoragemgmt (LSM) today supports C and python plugins for talking
 to
 external storage array using SMI-S as well as native interfaces (eg:
 netapp plugin )
 Plan is to grow the SMI-S interface as needed over time and add more
 vendor specific plugins for exploiting features not possible via
 SMI-S
 or have better alternatives than using SMI-S.
 For eg: Many of the copy offload features require to use vendor
 specific
 commands, which justifies the need for a vendor specific plugin.
 
 
 2) Goals:
 
  2a) Ability to plugin external storage array into oVirt/VDSM
 virtualization stack, in a vendor neutral way.
 
  2b) Ability to list features/capabilities and other statistical
 info of the array
 
  2c) Ability to utilize the storage array offload capabilities
  from
 oVirt/VDSM.
 
 
 3) Details:
 
 LSM will sit as a new repository engine in VDSM.
 VDSM Repository Engine WIP @ http://gerrit.ovirt.org/#change,192
 
 Current plan is to have LSM co-exist with VDSM on the virtualization
 nodes.
 
 *Note : 'storage' used below is generic. It can be a file/nfs-export
 for
 NAS targets and LUN/logical-drive for SAN targets.
 
 VDSM can use LSM and do the following...
  - Provision storage
  - Consume storage
 
 3.1) Provisioning Storage using LSM
 
 Typically this will be done by a Storage administrator.
 
 oVirt/VDSM should provide storage admin the
  - ability to list the different storage arrays along with their
 types (NAS/SAN), capabilities, free/used space.
  - ability to provision storage using any of the array
  capabilities
 (eg: thin provisioned lun or new NFS export )
  - ability to manage the provisioned storage (eg: resize/delete
  storage)
 
 Once the storage is provisioned by the storage admin, VDSM will have
 to
 refresh the host(s) for them to be able to see the newly provisioned
 storage.
[SM] What about the clustered case? The management or the mailbox will have to 
be involved. Pros\Cons? Is there a capability for the storage to announce a 
change in topology? Can libstoragemgmt consume it? Does it even make sense?
 
 3.1.1) Potential flows:
 
 Mgmt -> vdsm -> lsm: create LUN + LUN Mapping / Zoning / whatever is
 needed to make LUN available to list of hosts passed by mgmt
 Mgmt -> vdsm: getDeviceList (refreshes host and gets list of devices)
   Repeat above for all relevant hosts (depending on list passed
   earlier,
 mostly relevant 

Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-18 Thread Saggi Mizrahi
The decision to declare the current API as supported or not, or to open 
ourselves to more than one API transport, is directly related to how we decide 
to handle deprecation (if any), API versioning and forward\backward 
compatibility.

If we discover a clear path to evolve the API or support multiple transports 
and adhere to the (soon to be) agreed upon supportability guidelines, we might 
choose the easy way of supporting the current API. This is why deciding how we 
are going to support things is the first step in the process.

As a side note, having the XML-RPC operational for a version or two until the 
engine starts to use the new API is a non issue IMHO.

- Original Message -
 From: Anthony Liguori anth...@codemonkey.ws
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Daniel 
 P. Berrange berra...@redhat.com,
 Daniel Veillard veill...@redhat.com
 Sent: Monday, June 18, 2012 4:14:15 PM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
 
 On 06/18/2012 10:02 AM, Saggi Mizrahi wrote:
  I would like to put on the table for discussion the growing need
  for a way
  to more easily reuse the functionality of VDSM in order to
  service projects
  other than Ovirt-Engine.
 
  Originally VDSM was created as a proprietary agent for the sole
  purpose of
  serving the then proprietary version of what is known as
  ovirt-engine. Red Hat,
  after acquiring the technology, pressed on with its commitment to
  open source
  ideals and released the code. But just releasing code into the wild
  doesn't
  build a community or make a project successful. Furthermore, when
  building
  open source software you should aspire to build reusable components
  instead of
  monolithic stacks.
 
  We would like to expose a stable, documented, well supported API.
  This gives
  us a chance to rethink the VDSM API from the ground up. There is
  already work
  in progress of making the internal logic of VDSM separate enough
  from the API
  layer so we could continue feature development and bug fixing while
  designing
  the API of the future.
 
  In order to achieve this though we need to do several things:
  1. Declare API supportability guidelines
 
 Adding danpb and DV as I think they can provide good advice here.
 
 Practically speaking, I think the most important thing to do is
 clearly declare
 what's supported and not supported in more detail than you probably
 want to.
 Realistically, you have to just support whatever you have.  I don't
 know that
 designing a supportable interface can be really successful unless
 you start
 with that tomorrow.
 
 So basically, unless you plan on removing the XML-RPC interface in
 the next
 release, you should plan on supporting it forever...
 
  2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
 
 We spent so much time trying to find the best transport in QEMU with
 the
 resulting being something I'm ultimately unhappy with.
 
 The best decision we've made recently on this front is to move to a
 schema-based
 RPC mechanism where the transport code is all autogenerated.  Python
 has an
 advantage in that it supports introspection although a disadvantage
 in that it's
 easy to end up with an ad-hoc interface by relying on passing around
 dictionaries.
 
  3. Make the API easily consumable (e.g. proper docs, example
  code, extending
 the API, etc)
 
 Documentation is by far the most important thing IMHO.  I actually
 think that
 simply taking the existing XML-RPC interface and adding documentation
 ought to
 be the first step even..
 
  4. Implement the API itself
 
 I think the biggest risk in an effort like this is letting perfect
 become the
 enemy of good.  If the goal is to open VDSM up to other applications,
 you can
 start today but just documenting what you have with plans to
 deprecate and
 improve later.
 
 Honestly, worrying about XML-RPC vs. REST vs. AMQP is likely going to
 result in
 a lot of bike shedding and grand plans.
 
 Regards,
 
 Anthony Liguori
 
  All of these are dependent on one another and the permutations are
  endless.
  This is why I think we should try and work on each one separately.
  All
  discussions will be done openly on the mailing list and until the
  final version
  comes out nothing is set in stone.
 
  If you think you have anything to contribute to this process,
  please do so
  either by commenting on the discussions or by sending
  code/docs/whatever
  patches. Once the API solidifies it will be quite difficult to
  change
  fundamental things, so speak now or forever hold your peace. Note
  that this is
  just an introductory email. There will be a quick follow up email
  to kick start
  the discussions.

Re: [vdsm] pep8 questions

2012-06-05 Thread Saggi Mizrahi
I think this is the correct formatting:

self.__putMetadata({"NONE": "#" * (sd.METASIZE - 10)}, metaid)

cls.log.warn("Could not get size for vol %s/%s "
             "using optimized methods",
             sdobj.sdUUID, volUUID, exc_info=True)
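
For the E701 at blockVolume.py:207 the fix is to give the statement after the
colon its own line. A generic illustration (this is not the actual blockVolume
code, the names are made up):

def check_parent(parent):
    # E701 would fire on:
    #     if parent is None: raise ValueError("no parent")
    # pep8 wants the statement after the colon on its own line:
    if parent is None:
        raise ValueError("no parent")
    return parent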

- Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, June 5, 2012 2:19:04 PM
 Subject: [vdsm] pep8 questions
 
 Hi,
  I was looking at resolving pep8 issues in
 vdsm/storage/blockVolume.py. Haven't been able to resolve the below..
 Pointers appreciated.
 
 vdsm/storage/blockVolume.py:99:55: E225 missing whitespace around
 operator
 vdsm/storage/blockVolume.py:148:28: E201 whitespace after '{'
 vdsm/storage/blockVolume.py:207:28: E701 multiple statements on one
 line
 (colon)
 
 
 line 99:  cls.log.warn("Could not get size for vol %s/%s using
 optimized
 Googling, I found some links indicating this pep8 warning is
 incorrect.
 
 line 148: cls.__putMetadata({ "NONE": "#" * (sd.METASIZE-10) },
 metaid)
 It gives some other error if I remove the whitespace after {
 
 line 206 & 207:
  raise se.VolumeCannotGetParent("blockVolume can't get
 parent %s for
volume %s: %s" % (srcVolUUID, volUUID, str(e)))
 I split this line to overcome the > 80 error, but unable to decipher
 what this error means ?
 
 thanx,
 deepak
 


Re: [vdsm] VDSM API/clientIF instance design issue

2012-05-30 Thread Saggi Mizrahi
If you don't want to add it as a parameter then you already suspect that you 
are doing something wrong. Using a singleton instead of passing a parameter 
doesn't make the dependency not there. It just obscures it.

I might not fully understand what you want to do but I think what you want is 
to have MOM expect a certain interface.
Then have an adapter class bridging the two interfaces.
Then pass the wrapped CIF to MOM.
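
A minimal sketch of that adapter idea (the interface and attribute names here 
are assumptions for illustration, not the actual MOM or clientIF API):

class MomHostInterface(object):
    # the interface MOM expects (hypothetical)
    def list_vm_ids(self):
        raise NotImplementedError

class ClientIFAdapter(MomHostInterface):
    # bridges the interface MOM expects to vdsm's clientIF
    def __init__(self, cif):
        self._cif = cif

    def list_vm_ids(self):
        # vmContainer is assumed here for illustration
        return list(self._cif.vmContainer.keys())

# MOM never sees clientIF directly:
#     mom = MOM(ClientIFAdapter(cif))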


- Original Message -
 From: Mark Wu wu...@linux.vnet.ibm.com
 To: vdsm-devel@lists.fedorahosted.org
 Cc: Dan Kenigsberg dan...@redhat.com, Saggi Mizrahi 
 smizr...@redhat.com, Adam Litke a...@us.ibm.com, Ryan
 Harper ry...@us.ibm.com
 Sent: Wednesday, May 30, 2012 10:49:29 AM
 Subject: VDSM API/clientIF instance design issue
 
 Hi Guys,
 
 Recently, I have been working on integrating MOM into VDSM.  MOM needs
 to use the VDSM API to interact with it.  But currently, using the
 vdsm API requires the instance of clientIF.  Passing clientIF to MOM
 is not a good choice since it's a vdsm internal object.  So I tried to
 remove the parameter 'cif' from the interface definition and changed
 it to access the globally unique clientIF instance in API.py.
 
 To get the instance of clientIF, I added a decorator to clientIF to
 change it into a singleton. Actually, clientIF has been working as a
 global single instance already.  We just don't have an interface to
 get it and so pass it as a parameter instead.  I think using a
 singleton to get the instance of clientIF is cleaner.
 
 Dan and Saggi already gave some comments in
 http://gerrit.ovirt.org/#change,4839  Thanks for reviewing!  But I
 think we need more discussion on it, so I am posting it here because
 gerrit is not the appropriate place to discuss a design issue.
 
 Thanks !
 Mark.
 
 


[vdsm] pep8 check in vim

2012-04-12 Thread Saggi Mizrahi
Now that we started moving to conform with pep8 you would probably like to be 
able to easily check your code.

If you use vim you could use this vim script
http://www.vim.org/scripts/script.php?script_id=2914

If you are not using vim, follow these 3 simple steps:
1. Switch to vim
2. Install the script
3. Profit


[vdsm] Test that broken upstream

2012-04-09 Thread Saggi Mizrahi
Upstream will not currently build because of the following patch:

commit 1d4f220616ca6fc014bbdfef7b826a16ed608ddf
Author: y kaplan ykap...@redhat.com
Date:   Thu Apr 5 18:02:40 2012 +0300

Added guestIFTests

Change-Id: I8b5138296c098826f149c26d38fd2bfce8794fe4
Reviewed-on: http://gerrit.ovirt.org/3379
Reviewed-by: Dan Kenigsberg dan...@redhat.com
Tested-by: Dan Kenigsberg dan...@redhat.com

It imports utils before setting up constants.
The reason it worked for the committer is that the host had VDSM installed, so 
it could import vdsm from site-packages.
Please keep this in mind when writing tests and try to test with a clean host.
We are in the process of having Jenkins plug in to Gerrit to do it for us 
before we actually commit and break the build. But until then please double 
check.
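
The general shape of the fix in a test module, as a sketch (the stub value is 
made up; the point is only the ordering):

import sys
import types

# stub out 'constants' *before* anything imports utils, so the tests
# do not depend on a vdsm installed in site-packages
constants = types.ModuleType('constants')
constants.EXT_DD = '/bin/dd'   # hypothetical value, for illustration
sys.modules['constants'] = constants

import utils   # now safe on a clean host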


Re: [vdsm] PEP8 in VDSM code

2012-03-26 Thread Saggi Mizrahi
The reason I wanted a gerrit hook is to avoid putting a -1 until VDSM is clean 
of errors.
It's supposed to be a transitional state.

- Original Message -
 From: Itamar Heim ih...@redhat.com
 To: Ewoud Kohl van Wijngaarden ew...@kohlvanwijngaarden.nl
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Monday, March 26, 2012 5:52:13 AM
 Subject: Re: [vdsm] PEP8 in VDSM code
 
 On 03/26/2012 11:26 AM, Ewoud Kohl van Wijngaarden wrote:
  On Mon, Mar 26, 2012 at 04:57:24AM -0400, Ayal Baron wrote:
  I'd rather avoid gerrit hooks if possible to use a jenkins job to
  validate this to keep the gerrit deployment as simple to
  maintain/upgrade as possible.
 
  But that's the wrong place to be doing it.
  Jenkins periodically polls for changes and then runs a job and
  posts
  the results somewhere (who would get the email?)
 
  Here the committer would immediately know that there is a problem
  with
  the patch and reviewers also immediately know not to accept it.
  I think what Itamar is getting at is that from gerrit you can
  trigger
  jenkins jobs which give a -1 if it fails. If jenkins checks for
  pep8
  you've solved the feedback issue without creating custom a gerrit
  hook.
  It will also be more scalable since you can add pyflakes / pylint /
  ...
  in the same check.
 
 true.
 per ayal's question - patch owner and reviewers will get the email,
 like
 any other review.
 we need to keep the gerrit as simple as possible wrt maintenance.


[vdsm] PEP8 in VDSM code

2012-03-22 Thread Saggi Mizrahi
I suggest making pep8 compliance a must for patch submission in VDSM.
http://www.python.org/dev/peps/pep-0008/

Currently there are a few people policing these rules in reviews but I suggest 
we make it automatic.

Unless someone objects I will put a gerrit hook that complains about pep8 
violations.
It will not mark -1s until all (or at least most) source code has been 
converted, because people might get complaints about code they did not modify 
in their patch.

If you're happy and you know it, +1!
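
For reference, the check such a hook would run is roughly the following (a 
sketch, not the actual hook):

import subprocess
import sys

def check_pep8(changed_files):
    # run the pep8 tool over the files touched by the patch
    ret = subprocess.call(['pep8'] + changed_files)
    # report-only for now; start returning ret (and voting -1)
    # once the whole tree has been converted
    return 0

if __name__ == '__main__':
    sys.exit(check_pep8(sys.argv[1:]))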


[vdsm] Following unix pipes

2012-03-15 Thread Saggi Mizrahi
I recently had several instances where I had to try and figure out who holds 
what end of a unix pipe.
To make this operation a bit more streamlined I created a small script to 
follow a pipe. I think it will be useful for other people debugging VDSM 
especially bugs related to out of process helpers not closing FDs properly.

To see all the ends of a pipe just input a known end of the pipe: stahlband 
PID FD

$ stahlband 5758 5
PID: 5758 FD: 5 KIND: r
PID: 5758 FD: 6 KIND: w
PID: 5770 FD: 5 KIND: r
PID: 5770 FD: 6 KIND: w

The code is available on github:
https://github.com/ficoos/stahlband
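
The idea behind it, as a rough sketch (the real script also reports the r/w 
KIND, which comes from /proc/PID/fdinfo):

import os

def pipe_ends(pid, fd):
    # both ends of a pipe point at the same 'pipe:[inode]' target
    target = os.readlink('/proc/%d/fd/%d' % (pid, fd))
    ends = []
    for proc in os.listdir('/proc'):
        if not proc.isdigit():
            continue
        fddir = '/proc/%s/fd' % proc
        try:
            for f in os.listdir(fddir):
                if os.readlink(os.path.join(fddir, f)) == target:
                    ends.append((int(proc), int(f)))
        except OSError:
            pass  # process exited or we lack permission
    return ends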


Re: [vdsm] Fedora Virtualization Test Day 2012-04-12

2012-02-28 Thread Saggi Mizrahi
Good thing you linked to that wiki page. I learned a lot.

I don't mind being there for the EST shift.

- Original Message -
 From: Ayal Baron aba...@redhat.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, February 28, 2012 4:39:04 PM
 Subject: [vdsm] Fedora Virtualization Test Day 2012-04-12
 
 Hi all,
 
 $subject is a month and a half away.
 Any volunteers to drive vdsm testing forward for that day?
 
 
 https://fedoraproject.org/wiki/Test_Day:2012-04-12_Virtualization_Test_Day
 
 Regards,
 Ayal


Re: [vdsm] flowID schema

2012-02-16 Thread Saggi Mizrahi
Pinning down specific moments where stuff went horribly wrong is usually quite 
simple. The only case I can think of where someone needs to track a problem is 
if there was a bug in the storage subsystem that caused a corruption that only 
became apparent later on. But this can't benefit from flowID.

I'm going to say this again. I want to hear someone give a case where this is 
useful. Don't say things like:
"It will allow me to do X"
or
"I know of a guy who spent 2 years doing X to VDSM\Engine logs"

I want an example like:

"The user complained about X. So I had to do Y to figure out what was wrong and 
it was a pain."

The only reason I can think of is someone trying to figure out what went wrong 
while knowing nothing about how VDSM\Engine works.
While I just jump from place to place because I know what is going on, other 
people would just want to go step by step to get the complete picture.

But even with that I don't see why you would want to keep going from the Engine 
down to VDSM and back out again. 

Other than that I just simply can't imagine a use case where I'd need to do the 
stuff you talk about.
But again, I usually get my bug reports from QA and QE, and they are more 
skilled at bug reporting than the general public.

Furthermore, WHAT IS A FLOW? When does it start? When does it end? Is 
createImage a flow? Is the connect to the domain included? Isn't it all just 
part of a big flow to create a VM? Does the engine even track it as a flow?

Flows depend on the debugger, the problem and the scope. Throwing another 
useless ID on the pile will give you nothing. What you want is to be able to 
map sophisticated connections between resources and operations inside every 
component and between them. For instance, in VDSM, to track resource locking 
issues you use the resourceID. It crosses flows. To debug connection issues 
you use the connection information (and soon, the reference ID). And it 
crosses flows as well.

I'm not saying that debugging Ovirt is easy. I'm just saying that this is not 
the solution IMHO. Good anchors to resources like taskIDs, connection reference 
IDs, domain IDs and resource IDs give you the ability to track whatever you 
want. You just need better tools to cross reference them across log files so 
that the log tricks I *know* are implemented in a way that everyone can use 
them.

This is when good tools come into play. Think about it like a DB. Should 
something be a table with an index or should it just be a view?

(more inline)

- Original Message -
 From: Simon Grinberg si...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org, Dan Yasny dya...@redhat.com, Ayal 
 Baron aba...@redhat.com
 Sent: Thursday, February 16, 2012 11:07:34 AM
 Subject: Re: [vdsm] flowID schema
 
 
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Simon Grinberg si...@redhat.com
  Cc: vdsm-devel@lists.fedorahosted.org, Dan Yasny
  dya...@redhat.com, Ayal Baron aba...@redhat.com
  Sent: Thursday, February 16, 2012 5:18:26 PM
  Subject: Re: [vdsm] flowID schema
  
   You could just cross check the flowID (that is printed in RHEV-M) with
   all the call IDs in that flow. (It's a simple enough tool to make.)
  
  Also if you look at the way VDSM is heading you will see that just
  grepping for the flow ID will gradually give you a smaller and
  smaller picture of the actual flow.
  
  - The new connection management means that the actual connect is
  not
  a direct part of the flow anymore.
  - We plan to have a similar mechanism for domains.
  - The new image code will do a lot of things out of band.
  - There will be more flows using multiple hosts because of SDM and
  the increased safety gained by sanlock and they will not share the
  flow ID in internal communication.
   - Engine actually polls all tasks in a single thread (with its own
  flow ID? No flowID?) so even the actual result for async tasks
  might
  have a different flowID in the VDSM log.
 
 OK so they may be a terminology mismatch here:
  A flow as an end user would see it is: everything that happened from
  the moment he clicked OK to the moment the operation failed or
  succeeded.
 Same goes for internal/auto triggered actions in response to an event
 (Storage failure etc).
 
  It may be composed, as you noted above, of several connections,
  multiple tasks etc.
  The flow ID as I see it is the linkage that makes sense of all of
  these. Everything that you've written above just convinces me that
  such an ID is a must.
I understood that. The point is that the system is too complex to be able to 
map certain operations to flows.
If you connect in flowX but use a resource provided by the connection in flowY, 
the connection breaks and you lose the resource.
As long as you somehow pinned the flowID value to the connection object it 
might be logged as flowX. The actual problem happened in flowY.
My point is that the logical leaps required to extract only the data relevant 
to a flow

Re: [vdsm] flowID schema

2012-02-09 Thread Saggi Mizrahi
-1

I agree that for a messaging environment having a Message ID is a must, because 
you sometimes don't have a particular target, so when you get a response you 
need to know what this node is actually responding to.

The message ID could be composed as FLOWID + MSGID so you can reuse the 
field.
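
(Something like this, where the separator is my own choice for illustration:)

def pack_message_id(flow_id, msg_id):
    return '%s:%s' % (flow_id, msg_id)

def unpack_message_id(message_id):
    flow_id, _, msg_id = message_id.partition(':')
    return flow_id, msg_id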

But that is all beside the point.

I understand that someone might find it fun to go on following the entire flow 
in the Engine and in VDSM. But I would like to hear an actual use case where 
someone would have actually benefited from this.
As I see it having VDSM return the task ID with every response (and not just 
for async tasks) is a lot more useful and correct.

A generic debugging scenario as I see it.

1. Something went wrong
2. You go looking in the ENGINE log trying to figure out what happened.
3. You see that ENGINE got SomeError.
4. Check to see if this error makes sense, imagining that VDSM is always right 
and is a black box.
5. You did your digging and now you think that VDSM is at fault.
6. Go look for the call that failed. (If we returned the taskID it's pretty 
simple to find that call).
7. Look around the call to check VDSM state.
8. Profit.

There is never a point where you want to follow a whole flow call by call going 
back and forth, and even if you did, the VDSM taskID is a better anchor than 
the flowID.

VDSM is built in a way that every call takes into account the current state 
only. Debugging it with an engine flow mindset is just wrong and distracting. I 
see it doing more harm than good by reinforcing bad debugging practices.

- Original Message -
 From: Keith Robertson krobe...@redhat.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, February 9, 2012 1:34:43 PM
 Subject: Re: [vdsm] flowID schema
 
 On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
 
  - Original Message -
  From: Ayal Baronaba...@redhat.com
  To: Dan Kenigsbergdan...@redhat.com
  Cc: VDSM Project Developmentvdsm-devel@lists.fedorahosted.org
  Sent: Monday, February 6, 2012 10:35:54 AM
  Subject: Re: [vdsm] flowID schema
 
 
 
  - Original Message -
  On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
   flowID makes no sense after the initial API call, because of things
   like caching\threadpools\sampling tasks\resources\asyncTasks, so
   following
   a flow like that will not give you the entire picture while
   debugging.
 
  Also adding it now will make everything even more ugly.
  You know what, just imagine I wrote one of my long rambles about
  why I don't agree with doing this.
   I cannot imagine you writing anything like that. Really. I do not
   understand why you object to logging the flowID at the API entry point.
  The question is, what problem is this really trying to solve and
  is
  there a simpler and less obtrusive solution to that problem?
  correlating logs between ovirt engine and potentially multiple vdsm
   nodes is a nightmare. It requires a lot of skill to follow a
  transaction through from the front end all the way to the node,
  and even multiple nodes (eg actions on spm, then actions on other
  node to run a vm).
  Having a way to correlate the logs and follow a single event/flow
  is vital.
 
 +1
 
 Knowing what command caused a sequence of events in VDSM would be
 really
 helpful particularly in a threaded environment.  Further, wouldn't
 such
 an ID be helpful in an asynchronous request/response model?  I'm not
 sure what the plans are for AMQP or even if there are plans, but I'd
 think that something like this would be crucial for an async
 response.
 So, if you implemented it you might be killing 2 birds with 1 stone.
 
 FYI: If you want to see examples of other systems that use similar
 concepts, take a look at the correlation ID in JMS.
 
 Cheers,
 Keith
 
 


Re: [vdsm] metadata

2012-02-02 Thread Saggi Mizrahi
Domains contain different metadata depending on their versions.
The important keys represent:
* Domain UUID
* SPM lease information
* Pool membership
* original device block size (So if someone moves the domain between devices 
with different block sizes we know that the domain broke)
* Domain human readable name
A master domain also contains pool information
* All members of the storage pool
* this version of the master MD
* pool human readable name

A lot of these keys will be deprecated in future domain versions, along with 
the entire concept of storage pools. So if you are planning on writing tools 
that read the domain MD please be aware that the format\keys will drastically 
change in the next months.
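
For a rough feel of the layout only — the key names below are illustrative, 
not the exact on-disk format, which as said will change:

DESCRIPTION=mydomain                      (human readable name)
SDUUID=<domain uuid>
VERSION=0
BLOCK_SIZE=512                            (original device block size)
POOL_UUID=<pool uuid>                     (pool membership)
LEASETIMESEC=60                           (SPM lease tuning)
MASTER_VERSION=1                          (master domain only)
POOL_DOMAINS=<uuid>:Active,<uuid>:Active  (master domain only)
POOL_DESCRIPTION=mypool                   (master domain only)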

- Original Message -
 From: wangxiaofan wangxiao...@opzoon.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, February 2, 2012 1:58:06 AM
 Subject: [vdsm] metadata
 
 What is the metadata of non-master data domain used for? And master
 data domain?


Re: [vdsm] flowID schema

2012-02-02 Thread Saggi Mizrahi
flowID makes no sense after the initial API call, because of things like 
caching\threadpools\sampling tasks\resources\asyncTasks, so following a flow 
like that will not give you the entire picture while debugging.

Also adding it now will make everything even more ugly.
You know what, just imagine I wrote one of my long rambles about why I don't 
agree with doing this.

As you plan on going ahead anyway, here is my suggestion on how to push this in.

XMLRPC doesn't support named parameters; this means that you can't just ad-hoc 
a new arg called flow-id onto all the API calls.
For simplicity's sake let's assume the last arg is treated as the flowID if it 
is a string that starts with __FLOWID__.
What you do then is, in the dispatcher, take the last arg and put it on the 
task object.
Have the logger print this value next to the threadID, even when the task is 
in prepare.

You will have to make the clientIF calls use *another* dispatcher (but the same 
task thread pool) to have this supported for the clientIF verbs as well, but I 
think that should have been done anyway.
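
In sketch form (names are mine, not actual vdsm code):

FLOW_PREFIX = '__FLOWID__'

def dispatch(task, func, args):
    # peel a trailing flowID marker off the XML-RPC args, if present
    if args and isinstance(args[-1], basestring) and \
            args[-1].startswith(FLOW_PREFIX):
        task.flowID = args[-1][len(FLOW_PREFIX):]
        args = args[:-1]
    # the logger can now print task.flowID next to the threadID
    return func(*args)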

- Original Message -
 From: Douglas Landgraf dougsl...@redhat.com
 To: VDSM Project Development vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, February 2, 2012 12:00:44 AM
 Subject: [vdsm] flowID schema
 
 Hello,
 
    flowID is a schema that we will be including in the vdsm API so oVirt
 Engine people can share the ID of an engine transaction with vdsm.
 With this in hand, we will add the ID of transactions to our log.
 
 I would like to know your opinion on how we could do that without
 breaking our
 API, like including new parameters in our calls.
 
 Should we add code at BindingXMLRPC.py - wrapper() to search for a
 'flowID' key in functions which take a dict as a parameter (like
 create)?
 [1] Maybe change it at another level inside BindingXMLRPC?
 
 Ideas/Thoughts?
 
 [1]
 http://gerrit.ovirt.org/#patch,sidebyside,1221,3,vdsm/BindingXMLRPC.py
 
 Thanks!
 
 --
 Cheers
 Douglas
 


[vdsm] Libstorage and repository engines

2012-01-31 Thread Saggi Mizrahi
I've been working on refactoring the storageDomain\images system in VDSM.
Apart from facilitating various features I've also been trying to make adding 
new SD types easier and to make the image manipulation bits consistent across 
domain implementations.

Currently in order to create a new domain type you have to create new 
StorageDomain, Image and Volume objects and implement all the logic to 
manipulate them. Apart from being cumbersome and redundant, it also makes mixed 
clusters very hard to do.

One of the big changes I put in is separating the image manipulation from the 
actual storage work.

Instead of each domain type implementing createImage and co you have one class 
responsible for all the image manipulation in the cluster.

All you have to do to facilitate a new storage type is to create a domain engine.

A domain engine is a python class that implements a minimal interface.
1. It has to be able to create, resize and delete a slab (a slab being a block 
of writable storage like a lun\lv\file)
2. It has to be able to create and delete tags (tags are pointers to slabs)

The above functions are very easy to implement and require very little 
complexity. All the heavy lifting (image manipulation, cleaning, transactions, 
atomic operations, etc.) is managed by the Image Manager, which just uses this 
unified interface to interact with the different storage types.

In cases where a domain might have special non-standard features I introduce 
the concept of capabilities. A domain engine can declare support for certain 
capabilities (eg. native snapshotting) and implement additional interfaces. If 
the image manager sees that the domain implements a capability it will use it; 
if not, it will use a default implementation built on the must-have verbs. 
This is similar to having just drawLine versus also having drawRect. This is 
done automatically and at runtime.

I like to compare this to how OpenGL will use software rendering if a certain 
standard feature is not implemented by the card so you might get a slower but 
still correct result.
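
To make the shape of this concrete, a sketch with my own naming (not the 
actual code):

class DomainEngine(object):
    # must-have verbs: slabs (lun\lv\file) and tags (pointers to slabs)
    capabilities = frozenset()

    def createSlab(self, name, size):
        raise NotImplementedError

    def resizeSlab(self, name, newSize):
        raise NotImplementedError

    def deleteSlab(self, name):
        raise NotImplementedError

    def createTag(self, tag, slabName):
        raise NotImplementedError

    def deleteTag(self, tag):
        raise NotImplementedError

def genericCopySlab(engine, slabName, snapName):
    # default path: build the result out of the must-have verbs
    # (create a new slab and copy the data over; elided here)
    pass

def snapshot(engine, slabName, snapName):
    # the image manager picks the native path only if the engine
    # declares the capability, otherwise it falls back to the
    # default implementation
    if 'nativeSnapshot' in engine.capabilities:
        engine.snapshotSlab(slabName, snapName)
    else:
        genericCopySlab(engine, slabName, snapName)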

Now, libstorage is another way to abstract interactions and capabilities for 
different storage types and have a unified API for accessing them.

Building a repo engine on top of libstorage is completely possible. But as you 
can see this creates a redundant layer of abstraction on the libstorage side.

As I see it, if you just want to have your storage supported by ovirt, creating 
a repo engine is simpler as you can use high level concepts, and I do plan to 
have engines run as their own processes so you could use whatever licence, 
language and storage server API you choose.

Also libstorage will have to keep its abstraction at a much lower level. This 
means exposing target specific flags and abilities. While this is good in 
concept it will mean that the repo engine wrapping libstorage will have to 
juggle all those flags and calls instead of having a distinct class for each 
storage type with its own specific hacks in place.

Just as a current example, we currently use the same engine for nfs3 and 
nfs4. This means that when we are running on nfs4 we are still doing all the 
hacks that are meant to circumvent issues with v3 being stateless. This is no 
longer relevant as v4 is stateful.
And what about SAMBA? Or gluster? You've got to have special hacks for both.

What I'm saying is that if, in the relatively simple world of NAS, where we 
have a proven abstraction (file access commands, POSIX), we can't find a way to 
create one class to rule them all, how can we expect to have a sane solution 
for the crazy world of SAN?

I'm not saying we shouldn't create an engine for libstorage, just that we 
should treat it like we treat sharefs: as a simple, generic, non 
bullet-proof\optimized implementation.

Let the flaming commence!


Re: [vdsm] about snapshot

2012-01-30 Thread Saggi Mizrahi
A snapshot might have multiple images based on it, in the case of templates and 
preview mode.
We also plan to remove preview mode and make it so that every snapshot can have 
multiple images based on it.
This design can only be possible when using different qcow2 files.

I might not be understanding Dan, as there are no plans to have libvirt do 
snapshotting (apart from live snapshots, and even in that case vdsm's storage 
backend will be the one creating the actual storage target for the image).

- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: wangxiaofan wangxiao...@opzoon.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Monday, January 30, 2012 9:10:38 AM
 Subject: Re: [vdsm] about snapshot
 
 On Mon, Jan 30, 2012 at 10:08:19PM +0800, wangxiaofan wrote:
  Hi there,
  Why does not vdsm use snapshot APIs of libvirt, or qemu-img
  snapshot -c ?
 
 The snapshot APIs are a fairly recent addition to libvirt, whose
 design was
 in fact influenced strongly by VDSM's requirements. IIUC, the intent
 is for
 VDSM to switch over to the libvirt APIs at some point in the not too
 distant
 future.
 
 Regards,
 Daniel
 --
 |: http://berrange.com  -o-
 |   http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o-
 |http://virt-manager.org :|
 |: http://autobuild.org   -o-
 |http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-
 |  http://live.gnome.org/gtk-vnc :|


Re: [vdsm] [Engine-devel] [RFC] New Connection Management API

2012-01-26 Thread Saggi Mizrahi
snip
Again trying to sum up and address all comments

Clear all:
==
My opinion is still not to implement it.
Even though it might generate a bit more traffic, premature optimization is bad 
and there are other ways we can improve VDSM command overhead without doing 
this.

In any case this argument is redundant because my intention (as Litke 
pointed out) is to have a lean API.
An API call is something you have to support across versions; this call 
implemented in the engine is something that no one has to support and can 
change\evolve easily.

As a rule, if an API call C can be implemented by doing A + B then C is 
redundant.

List of connections as args:

Sorry I forgot to respond about that. I'm not as strongly opposed to the idea 
as to the other things you suggested. It'll just make implementing the 
persistence logic in VDSM significantly more complicated, as I will have to 
commit multiple connections' information to disk in an all-or-nothing mode. I 
can create a small sqlitedb to do that or do some directory tricks and exploit 
FS rename atomicity, but I'd rather not.

The demands are not without base. I would like to keep the code simple under 
the hood at the price of a few more calls. You would like to make fewer calls 
and keep the code simpler on your side. There isn't a real way to settle this.
If anyone on the list has pros and cons for either way I'd be happy to hear 
them.
If no compelling arguments arise I will let Ayal call this one.

Transient connections:
==
The problem you are describing, as I understand it, is that VDSM did not 
respond, not that the API client did not respond.
Again, this can happen for a number of reasons, for most of which VDSM might 
not even be aware that there is actually a problem (network issues).

This relates to the EOL policy. I agree we have to find a good way to define an 
automatic EOL for resources. I have made my suggestion. Out of the scope of the 
API.

In the meantime cleaning stale connections is trivial and I have made it clear 
in a previous email how to go about it in a simple non intrusive way. Clean 
up on host connect, and on every poll if you find connections that you don't 
like. This should keep things squeaky clean.

- Original Message -
 From: Livnat Peer lp...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org
 Sent: Thursday, January 26, 2012 5:22:42 AM
 Subject: Re: [Engine-devel] [RFC] New Connection Management API
 
 On 25/01/12 23:35, Saggi Mizrahi wrote:
  SNIP
   This mail was getting way too long.
   
   About the clear all verb.
   No.
   Just loop, find the connections YOU OWN and clean them. Even though
   you don't want to support multiple clients to the VDSM API, that
   doesn't mean the engine shouldn't behave like a proper citizen.
   It's the same reason why VDSM tries not to mess with system resources
   it didn't initiate.
 
 
 There is a big difference, VDSM living in hybrid mode with other
 workload on the host is a valid use case, having more than one
 concurrent manager for VDSM is not.
 Generating a disconnect request for each connection does not seem like
 the right API to me; again, think of the simple flow of moving a host
 from one data center to another: the engine needs to disconnect all
 storage domains (each domain can have a couple of connections
 associated with it).
 
 I am giving examples from the engine use cases as it is the main user
 of VDSM ATM, but I am sure it will be relevant to any other user of
 VDSM.
 
  
  
  
  As I see it the only point of conflict is the so called
  non-peristed connections.
  I will call them transient connections from now on.
  
   There are 2 use cases being discussed
  1. Wait until a connection is made, if it fails don't retry and
  automatically unmanage.
   2. If the caller of the API forgets or fails to unmanage a
  connection.
  
 
 Actually I was not discussing #2 at all.
 
  Your suggestion as I understand it:
  Transient connections are:
    - Connections that VDSM will only try to connect to once and
    will not reconnect to in case of disconnect.
 
 yes
 
  
   My problem with this definition is that it does not specify the end
  of life of the connection.
  Meaning it solves only use case 1.
 
 since this is the only use case i had in mind, it is what i was
 looking for.
 
  If all is well, and it usually is, VDSM will not invoke a
  disconnect.
  So the caller would have to call unmanage if the connection
  succeeded at the end of the flow.
 
 agree.
 
   Now, if you are already calling unmanage if the connection succeeded
  you can just call it anyway.
 
  not exactly, an example I gave earlier on the thread was that VDSM
 hangs
 or have other error and the engine can not initiate unmanaged,
 instead
 let's assume the host is fenced (self-fence or external fence does
 not
 matter), in this scenario the engine will not issue unmanage

Re: [vdsm] [Engine-devel] [RFC] New Connection Management API

2012-01-26 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Livnat Peer lp...@redhat.com, engine-de...@ovirt.org, 
 vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, January 26, 2012 1:58:40 PM
 Subject: Re: [vdsm] [Engine-devel] [RFC] New Connection Management API
 
 On Thu, Jan 26, 2012 at 10:00:57AM -0500, Saggi Mizrahi wrote:
  snip
  Again trying to sum up and address all comments
  
  Clear all:
  ==
   My opinion is still not to implement it.
  Even though it might generate a bit more traffic premature
  optimization is bad and there are other reasons we can improve
  VDSM command overhead without doing this.
  
   In any case this argument is redundant because my intention (as
   Litke pointed out) is to have a lean API.
  and API call is something you have to support across versions, this
  call implemented in the engine is something that no one has to
  support and can change\evolve easily.
  
   As a rule, if an API call C can be implemented by doing A + B then
  C is redundant.
  
  List of connections as args:
  
  Sorry I forgot to respond about that. I'm not as strongly opposed
  to the idea as the other things you suggested. It'll just make
  implementing the persistence logic in VDSM significantly more
  complicated as I will have to commit multiple connection
  information to disk in an all or nothing mode. I can create a
  small sqlitedb to do that or do some directory tricks and exploit
  FS rename atomicity but I'd rather not.
 
 I would be strongly opposed to introducing a sqlite database into
 vdsm just to
 enable convenience mode for this API.  Does the operation really
 need to be
 atomic?  Why not just perform each connection sequentially and return
 a list of
 statuses? Is the only motivation for allowing a list of parameters
 to reduce
 the number of API calls between engine and vdsm)?  If so, the same
 argument
 Saggi makes above applies here.

I try and have VDSM expose APIs that are simple to predict: a command can 
either succeed or fail.
The problem is not actually validating the connections. The problem is that 
once I concluded that they are all OK I need to persist to disk the information 
that will allow me to reconnect if VDSM happens to crash. If I naively save 
them one by one I could get in a state where only some of the connections 
persisted before the operation failed. So I have to somehow put all this in a 
transaction.

I don't have to use sqlite. I could also put all the persistence information in 
a new dir for every call named "UUID.tmp". Once I have written everything down 
I rename the directory to just "UUID" and fsync it. This is guaranteed by POSIX 
to be atomic. For unmanage, I move all the persistence information from the 
directories they sit in to a new dir named "UUID", rename it to "UUID.tmp", 
fsync it and then remove it.

This all just looks like more trouble than it's worth to me.
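
For the record, the directory trick would look roughly like this (layout and 
names are mine, sketch only):

import os

def commit_connections(basedir, uuid, con_params):
    tmpdir = os.path.join(basedir, uuid + '.tmp')
    os.mkdir(tmpdir)
    for i, params in enumerate(con_params):
        f = open(os.path.join(tmpdir, str(i)), 'w')
        try:
            f.write(repr(params))
            f.flush()
            os.fsync(f.fileno())
        finally:
            f.close()
    # the atomic bit: rename the whole directory into place
    os.rename(tmpdir, os.path.join(basedir, uuid))
    # fsync the parent directory so the rename itself is persisted
    dirfd = os.open(basedir, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)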

 
  The demands are not without base. I would like to keep the code
  simple under the hood in the price of a few more calls. You would
  like to make less calls and keep the code simpler on your side.
  There isn't a real way to settle this.
  If anyone on the list as pros and cons for either way I'd be happy
  to hear them.
  If no compelling arguments arise I will let Ayal call this one.
  
  Transient connections:
  ==
  The problem you are describing as I understand it is that VDSM did
  not respond and not that the API client did not respond.
   Again, this can happen for a number of reasons, most of which VDSM
  might not be aware that there is actually a problem (network
  issues).
  
  This relates to the EOL policy. I agree we have to find a good way
  to define an automatic EOL for resources. I have made my
  suggestion. Out of the scope of the API.
  
  In the meantime cleaning stale connections is trivial and I have
   made it clear in a previous email how to go about it in a
   simple non intrusive way. Clean up on host connect, and on
  every poll if you find connections that you don't like. This
  should keep things squeaky clean.
  
  - Original Message -
   From: Livnat Peer lp...@redhat.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org
   Sent: Thursday, January 26, 2012 5:22:42 AM
   Subject: Re: [Engine-devel] [RFC] New Connection Management API
   
   On 25/01/12 23:35, Saggi Mizrahi wrote:
SNIP
    This mail was getting way too long.
    
    About the clear all verb.
    No.
    Just loop, find the connections YOU OWN and clean them. Even
    though
    you don't want to support multiple clients to the VDSM API, that
    doesn't mean the engine shouldn't behave like a proper citizen.
    It's the same reason why VDSM tries not to mess with system
    resources
    it didn't initiate.
   
   
   There is a big difference, VDSM living in hybrid mode with other
   workload on the host

Re: [vdsm] [Engine-devel] [RFC] New Connection Management API

2012-01-26 Thread Saggi Mizrahi


- Original Message -
 From: Livnat Peer lp...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: vdsm-devel@lists.fedorahosted.org, engine-de...@ovirt.org
 Sent: Thursday, January 26, 2012 3:03:39 PM
 Subject: Re: [Engine-devel] [RFC] New Connection Management API
 
 On 26/01/12 17:00, Saggi Mizrahi wrote:
  snip
  Again trying to sum up and address all comments
  
  Clear all:
  ==
   My opinion is still not to implement it.
  Even though it might generate a bit more traffic premature
  optimization is bad and there are other reasons we can improve
  VDSM command overhead without doing this.
  
   In any case this argument is redundant because my intention (as
   Litke pointed out) is to have a lean API.
  and API call is something you have to support across versions, this
  call implemented in the engine is something that no one has to
  support and can change\evolve easily.
  
   As a rule, if an API call C can be implemented by doing A + B then
  C is redundant.
 
 I disagree with the above statement, exposing a bulk of operations in
 a
 single API call is very common and not considered redundant.
I agree that APIs with those kinds of calls exist, but that doesn't mean they 
are not redundant.

re·dun·dant: adj. (of words or data) Able to be omitted without loss of meaning 
or function

This call can be omitted without loss of function.
API calls are a commitment for generations; wrapping this in the clients 
commits you to nothing.
To quote myself:
API call is something you have to support across versions, this
call implemented in the engine is something that no one has to
support and can change\evolve easily. ~ Saggi Mizrahi, a few lines above

This API set will one day be considered stupid, obsolete and annoying.
That's just how life is. We'll find better ways of solving these problems.
When that moment comes I want to have as little functionality as possible that 
I have to keep maintaining.
I doubt there is any way you can convince me otherwise.

Put yourself in my position and think if you would have made this sacrifice 
just to save someone a loop.

To sum up, I will not add any API calls I don't absolutely have to.

As to the amount of calls, this is not relevant to the clear all verb. This is 
addressed by the point right below this sentence.

 
  
  List of connections as args:
  
  Sorry I forgot to respond about that. I'm not as strongly opposed
  to the idea as the other things you suggested. It'll just make
  implementing the persistence logic in VDSM significantly more
   complicated, as I will have to commit multiple connections'
   information to disk in an all-or-nothing mode. I can create a
  small sqlitedb to do that or do some directory tricks and exploit
  FS rename atomicity but I'd rather not.
  
  The demands are not without base. I would like to keep the code
   simple under the hood at the price of a few more calls. You would
   like to make fewer calls and keep the code simpler on your side.
  There isn't a real way to settle this.
 
 It is not about keeping the code simple (writing a loop is simple as
 well), it is about redundant round trips.
As I said, I agree there is merit there.

I think that round trips are a general issue not specific to this call.
My opinion is that communication with VDSM should just use HTTP pipelining 
(http://en.wikipedia.org/wiki/HTTP_pipelining)
This will solve the problem globally instead of tacking it on to the API 
interface.

I generally prefer simplicity of the API and the implementation, and 
correctness over performance.

I laid out what the change entails, multiple ways of solving this, and my 
personal perspective.
Unless someone on the list objects to either solution, Ayal will have final say 
on this matter.
He is more of a pragmatist than I (and doing what he says usually correlates 
with me getting my paycheck).

 
   If anyone on the list has pros and cons for either way I'd be happy
  to hear them.
  If no compelling arguments arise I will let Ayal call this one.
  
  Transient connections:
  ==
  The problem you are describing as I understand it is that VDSM did
  not respond and not that the API client did not respond.
   Again, this can happen for a number of reasons, most of which VDSM
  might not be aware that there is actually a problem (network
  issues).
  
  This relates to the EOL policy. I agree we have to find a good way
  to define an automatic EOL for resources. I have made my
  suggestion. Out of the scope of the API.
  
  In the meantime cleaning stale connections is trivial and I have
   made it clear in a previous email how to go about it in a
   simple non intrusive way. Clean up on host connect, and on
  every poll if you find connections that you don't like. This
  should keep things squeaky clean.
 
 I have no additional input on this.
 
The only real legitimate reservation you still have with the API is transient 
connections. As I said, if you can find a way
