> On 7 Dec 2016, at 20:16, Nir Soffer <nsof...@redhat.com> wrote:
>
> On Wed, Dec 7, 2016 at 8:24 PM, Michal Skrivanek <mskri...@redhat.com> wrote:
>>
>> On 07 Dec 2016, at 09:17, Oved Ourfali <oourf...@redhat.com> wrote:
>>
>> On Tue, Dec 6, 2016 at 11:12 PM, Adam Litke <ali...@redhat.com> wrote:
>>>
>>> On 06/12/16 22:06 +0200, Arik Hadas wrote:
>>>>
>>>> Adam,
>>>
>>> :) You seem upset. Sorry if I touched a nerve...
>>>
>>>> Just out of curiosity: when you write "v2v has promised" - what exactly do you mean? The tool? Richard Jones (the maintainer of virt-v2v)? Shahar and I, who implemented the integration with virt-v2v? I'm not aware of such a promise by any of these options :)
>>>
>>> Some history...
>>>
>>> Earlier this year Nir, Francesco (added), Shahar, and I began discussing the similarities between what storage needed to do with external commands and what was designed specifically for v2v. I am not sure if you were involved in the project at that time. The plan was to create common infrastructure that could be extended to fit the unique needs of the verticals. The v2v code was going to be moved over to the new infrastructure (see [1]) and the only thing that stopped the initial patch was the lack of a VMware testing environment for verification.
>>>
>>> At that time storage refocused on developing verbs that used the new infrastructure and has been maintaining its suitability for general use. Conversion of v2v -> Host Jobs is obviously a lower-priority item and much more difficult now due to the early missed opportunity.
>>>
>>>> Anyway, let's say that you were given such a promise by someone and thus consider that mechanism to be deprecated - it doesn't really matter.
>>>
>>> I may be biased but I think my opinion does matter.
>>>
>>>> The current implementation doesn't fit this flow well (it requires a per-volume job, it creates leases that are not needed for a template's disks, ...) and with the "next-gen API" with proper support for virt flows not even being discussed with us (and IIUC also not with the infra team) yet, I don't understand what you suggest except for some strong, though irrelevant, statements.
>>>
>>> If you are willing to engage in a good-faith technical discussion I am sure I can help you to understand. These operations on storage demand some form of locking protection. If volume leases aren't appropriate then perhaps we should use the VM Leases / xleases that Nir is finishing off for 4.1 now.
>>>
>>>> I suggest, loud and clear, to reuse (not to add dependencies, not to enhance, ...) an existing mechanism for a very similar flow of virt-v2v that works well and is simple.
>>>
>>> I clearly remember discussions involving infra (hello Oved), virt (hola Michal), and storage where we decided that new APIs performing async operations involving external commands should use the HostJobs infrastructure instead of adding more information to Host Stats. These were the "famous" entity polling meetings.
>>>
>>> Of course plans can change but I have never been looped into any such discussions.
>>
>> Well, I think that when someone builds a good infrastructure he first needs to talk to all consumers and make sure it fits. In this case it seems like most of the work was done to fit the storage use case, and now you check whether it can fit others as well...
>>
>> IMO it makes much more sense to use events where possible (and you've promised to use those as well, but I don't see you doing that...). v2v should use events for sure, and they have promised to do that in the past, instead of using the v2v jobs.
>> The reason events weren't used originally with the v2v feature was that it was too risky and the events infrastructure was added too late in the game.
>>
>> Revisiting and refactoring code which is already in use is always a bit of a luxury we can rarely prioritize. So indeed v2v is not using events. The generalization work has been done to some extent, but there is no incentive to rewrite it completely.
>
> On the vdsm side we don't need to rewrite the interesting parts, just run the code using the current infrastructure.
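To make "run the code using the current infrastructure" concrete: it essentially means wrapping the existing conversion logic in a job object and handing it to a shared, bounded scheduler instead of spawning one thread per job. A minimal sketch follows; the names (Job, JobScheduler) are invented for illustration and are not vdsm's actual API.

```python
import queue
import threading
import uuid


class Job:
    """Minimal job object; a real framework also tracks progress and abort."""

    def __init__(self, run_fn):
        self.id = str(uuid.uuid4())
        self.status = "pending"
        self._run_fn = run_fn

    def run(self):
        self.status = "running"
        try:
            self._run_fn()
            self.status = "done"
        except Exception:
            self.status = "failed"


class JobScheduler:
    """Bounded worker pool: at most max_workers jobs run concurrently,
    instead of one unbounded thread per conversion."""

    def __init__(self, max_workers=2):
        self._queue = queue.Queue()
        for _ in range(max_workers):
            worker = threading.Thread(target=self._work, daemon=True)
            worker.start()

    def schedule(self, job):
        # The "one line to schedule a job" idea: callers just enqueue.
        self._queue.put(job)

    def _work(self):
        while True:
            job = self._queue.get()
            job.run()
            self._queue.task_done()
```

The point of the shape, not the details: the number of concurrent external commands on a host becomes a property of the scheduler, not of each caller.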
ugh, there are too many "infrastructure" words floating around. Would you please specify what exactly you mean?

>
> The first step is to use the jobs framework and delete the stale copy of the jobs framework in v2v.py.

by "stale" you mean the one which is currently used by the v2v code, right? So… that's not stale, that's in use, and it should change/be refactored, yes, but that doesn't make it stale.

>
> Then we can easily schedule the jobs in the task manager instead of starting a thread for each job - one line to schedule a job in hsm - see the sparsify patch.

it's a per-VM virt job. Why HSM?

>
> I think Shahar started to work on it, we need to rebase his patch and finish it.

I don't know what you are referring to.

Thanks,
michal

>
>> On the other hand we are now trying to add events to migration progress reporting and handover, since that area is being touched due to the post-copy enhancements. So, when there is a practical chance to improve functionality by utilizing events, it indeed should be the first choice.
>>
>>>> Do you "promise" to implement your "next-gen API" for 4.1 as an alternative?
>>>
>>> I guess we need the design first.
>>>
>>>> On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <ali...@redhat.com> wrote:
>>>>
>>>> On 05/12/16 11:17 +0200, Arik Hadas wrote:
>>>>
>>>> On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <nsof...@redhat.com> wrote:
>>>>
>>>> On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <smela...@redhat.com> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> I'm currently working on the integration of virt-sysprep into oVirt.
>>>>>
>>>>> Usually, if a user creates a template from a regular VM, and then creates new VMs from this template, these new VMs inherit all configuration of the original VM, including SSH keys, udev rules, MAC addresses, system ID, hostname etc.
>>>>> It is unfortunate, because you cannot have two network devices with the same MAC address in the same network, for example.
>>>>>
>>>>> To avoid this, the user must clean all machine-specific configuration from the original VM before creating a template from it. You can do this manually, but there is a virt-sysprep utility that does this automatically.
>>>>>
>>>>> Ideally, virt-sysprep should be seamlessly integrated into the template creation process. But the first step is to create a simple button: the user selects a VM, clicks the button and oVirt executes virt-sysprep on the VM.
>>>>>
>>>>> virt-sysprep works directly on the VM's filesystem. It accepts the list of all disks of the VM as parameters:
>>>>>
>>>>> virt-sysprep -a disk1.img -a disk2.img -a disk3.img
>>>>>
>>>>> The architecture is as follows: a command on the Engine side runs a job on the VDSM side and tracks its success/failure. The job on the VDSM side runs virt-sysprep.
>>>>>
>>>>> The question is how to implement the job correctly?
>>>>>
>>>>> I thought about using storage jobs, but they are designed to work only with a single volume, correct?
>>>>
>>>> New storage verbs are volume based. This makes it easy to manage them on the engine side, and will allow parallelizing volume operations on single or multiple hosts.
>>>>
>>>> A storage volume job uses a sanlock lease on the modified volume and a volume generation number. If a host running pending jobs becomes non-responsive and cannot be fenced, we can detect the state of the job, fence the job, and start the job on another host.
>>>>
>>>> With the SPM task, if a host becomes non-responsive and cannot be fenced, the whole setup is stuck; there is no way to perform any storage operation.
>>>>
>>>>> Is it possible to use them with an operation that is performed on multiple volumes?
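For reference, the multi-disk invocation described in this message is just a repeated `-a` option, which the VDSM-side job could assemble from the VM's disk list. A minimal sketch (the helper name is made up for illustration):

```python
def sysprep_command(disk_paths):
    """Build the virt-sysprep command line for a VM's disks,
    as in: virt-sysprep -a disk1.img -a disk2.img -a disk3.img"""
    cmd = ["virt-sysprep"]
    for path in disk_paths:
        # each disk is passed with its own -a (--add) option
        cmd.extend(["-a", path])
    return cmd
```

For example, `sysprep_command(["disk1.img", "disk2.img"])` yields `["virt-sysprep", "-a", "disk1.img", "-a", "disk2.img"]`, ready to hand to a subprocess runner.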
>>>>> Or, alternatively, is it possible to use some kind of 'VM jobs' - that work on the VM as a whole?
>>>>
>>>> We can do:
>>>>
>>>> 1. Add jobs with multiple volume leases - can make error handling very complex. How do you tell the job state if you have multiple leases? Which volume generation do you use?
>>>>
>>>> 2. Use a volume job using one of the volumes (the boot volume?). This does not protect the other volumes from modification but engine is responsible for this.
>>>>
>>>> 3. Use new "VM jobs", using a VM lease (should be available this week on master). This protects a VM during sysprep from being started. We still need a generation to detect the job state; I think we can use the sanlock lease generation for this.
>>>>
>>>> I like the last option since sysprep is much like running a VM.
>>>>
>>>>> How does v2v solve this problem?
>>>>
>>>> It does not.
>>>>
>>>> v2v predates storage volume jobs. It does not use volume leases and generations and does not have any way to recover if a host running v2v becomes non-responsive and cannot be fenced.
>>>>
>>>> It also does not use the jobs framework and does not use a thread pool for v2v jobs, so it has no limit on the number of storage operations on a host.
>>>>
>>>> Right, but let's be fair and present the benefits of v2v jobs as well:
>>>> 1. it is the simplest "infrastructure" in terms of LOC
>>>>
>>>> It is also deprecated. v2v has promised to adopt the richer Host Jobs API in the future.
>>>>
>>>> 2. it is the most efficient mechanism in terms of interactions between the engine and VDSM (it doesn't require new verbs/calls, the data is attached to VdsStats; probably the easiest mechanism to convert to events)
>>>>
>>>> Engine is already polling the host jobs API so I am not sure I agree with you here.
>>>> 3. it is the most efficient implementation in terms of interaction with the database (no data is persisted into the database, no polling is done)
>>>>
>>>> Again, we're already using the Host Jobs API. We'll gain efficiency by migrating away from the old v2v API and having a single, unified approach (Host Jobs).
>>>>
>>>> Currently we have 3 mechanisms to report jobs:
>>>> 1. VM jobs - currently used for live merge. This requires the VM entity to exist in VDSM, thus not suitable for virt-sysprep.
>>>>
>>>> Correct, not appropriate for this application.
>>>>
>>>> 2. storage jobs - complicated infrastructure, targeted at recovering from failures to maintain storage consistency. Many of the things this infrastructure knows how to handle are irrelevant for the virt-sysprep flow, and the fact that virt-sysprep is invoked on a VM rather than a particular disk makes it less suitable.
>>>>
>>>> These are more appropriately called Host Jobs and they have the following semantics:
>>>> - They represent an external process running on a single host
>>>> - They are not persisted. If the host or vdsm restarts, the job is aborted
>>>> - They operate on entities. Currently storage is the first adopter of the infrastructure but virt was going to adopt these for the next-gen API. Entities can be volumes, storage domains, VMs, network interfaces, etc.
>>>> - Job status and progress is reported by the Host Jobs API. If a job is not present, then the underlying entity (or entities) must be polled by engine to determine the actual state.
>>>>
>>>> 3. v2v jobs - no mechanism is provided to resume failed jobs, no leases, etc.
>>>>
>>>> This is the old infra upon which Host Jobs are built. v2v has promised to move to Host Jobs in the future so we should not add new dependencies to this code.
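To make those Host Jobs semantics concrete, here is a rough sketch of what a virt-sysprep job tied to a VM entity could look like: an external process on one host, not persisted, abortable, with a status that engine can poll. The class and method names are hypothetical illustrations, not vdsm's real host-job API; a real job would also acquire a lease and report progress.

```python
import subprocess
import threading


class SysprepHostJob:
    """Hypothetical host job wrapping an external virt-sysprep process,
    tied to a VM entity. It is deliberately not persisted: if vdsm
    restarts, the job is gone and engine must poll the underlying
    entity to learn the real state."""

    def __init__(self, vm_id, disk_paths):
        self.entity_id = vm_id          # the entity this job operates on
        self.status = "pending"
        self._disks = disk_paths
        self._proc = None
        self._lock = threading.Lock()

    def run(self):
        # virt-sysprep takes each disk via a repeated -a option
        cmd = ["virt-sysprep"]
        for path in self._disks:
            cmd.extend(["-a", path])
        with self._lock:
            self.status = "running"
            self._proc = subprocess.Popen(cmd)
        rc = self._proc.wait()
        self.status = "done" if rc == 0 else "failed"

    def abort(self):
        # abort logic: kill the external process; wait() in run()
        # then observes a non-zero return code and marks the job failed
        with self._lock:
            if self._proc is not None and self._proc.poll() is None:
                self._proc.kill()

    def info(self):
        # the kind of record a Host Jobs API could report to a polling engine
        return {"id": self.entity_id, "status": self.status}
```

If the job disappears from the reported set (host restart), engine falls back to inspecting the VM and its disks, which matches the "poll the underlying entity" rule above.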
>>>> I have some arguments for using v2v-like jobs [1]:
>>>> 1. creating a template from a VM is rarely done - if a host goes unresponsive or any other failure is detected we can just remove the template and report the error
>>>>
>>>> We can choose this error handling with Host Jobs as well.
>>>>
>>>> 2. the phase of virt-sysprep is, unlike a typical storage operation, short - reducing the risk of failures during the process
>>>>
>>>> Reduced risk of failures is never an excuse for lax error handling. The storage-flavored host jobs provide tons of utilities for making error handling standardized, easy to implement, and correct.
>>>>
>>>> 3. during the operation the VM is down - by locking the VM/template and its disks on the engine side, we render a leases-like mechanism redundant
>>>>
>>>> Eventually we want to protect all operations on storage with sanlock leases. This is safer and allows for a more distributed approach to management. Again, the correct use of leases in host jobs requires about 5 lines of code. The benefits of standardization far outweigh any perceived simplification resulting from omitting it.
>>>>
>>>> 4. in the worst case - the disk will not be corrupted (only some of the data might be removed).
>>>>
>>>> Again, the way engine chooses to handle job failures is independent of the mechanism. Let's separate that from this discussion.
>>>>
>>>> So I think that the mechanism for storage jobs is overkill for this case. We can keep it simple by generalizing the v2v job for other virt-tools jobs, like virt-sysprep.
>>>>
>>>> I think we ought to standardize on the Host Jobs framework where we can collaborate on unit tests, standardized locking and error handling, abort logic, etc.
>>>> When v2v moves to host jobs then we will have a unified method of handling ephemeral jobs that are tied to entities.
>>>>
>>>> --
>>>> Adam Litke
>>>
>>> --
>>> Adam Litke

_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/devel