On Dec 7, 2016 20:16, "Nir Soffer" <nsof...@redhat.com> wrote: > > On Wed, Dec 7, 2016 at 8:10 PM, Oved Ourfali <oourf...@redhat.com> wrote: > > On Dec 7, 2016 16:00, "Nir Soffer" <nsof...@redhat.com> wrote: > >> > >> On Wed, Dec 7, 2016 at 10:17 AM, Oved Ourfali <oourf...@redhat.com> wrote: > >> > > >> > > >> > On Tue, Dec 6, 2016 at 11:12 PM, Adam Litke <ali...@redhat.com> wrote: > >> >> > >> >> On 06/12/16 22:06 +0200, Arik Hadas wrote: > >> >>> > >> >>> Adam, > >> >> > >> >> > >> >> :) You seem upset. Sorry if I touched on a nerve... > >> >> > >> >>> Just out of curiosity: when you write "v2v has promised" - what > >> >>> exactly > >> >>> do you > >> >>> mean? the tool? Richard Jones (the maintainer of virt-v2v)? Shahar and > >> >>> I > >> >>> that > >> >>> implemented the integration with virt-v2v? I'm not aware of such a > >> >>> promise by > >> >>> any of these options :) > >> >> > >> >> > >> >> Some history... > >> >> > >> >> Earlier this year Nir, Francesco (added), Shahar, and I began > >> >> discussing the similarities between what storage needed to do with > >> >> external commands and what was designed specifically for v2v. I am > >> >> not sure if you were involved in the project at that time. The plan > >> >> was to create common infrastructure that could be extended to fit the > >> >> unique needs of the verticals. The v2v code was going to be moved > >> >> over to the new infrastructure (see [1]) and the only thing that > >> >> stopped the initial patch was lack of a VMWare testing environment for > >> >> verification. > >> >> > >> >> At that time storage refocused on developing verbs that used the new > >> >> infrastructure and have been maintaining its suitability for general > >> >> use. Conversion of v2v -> Host Jobs is obviously a lower priority > >> >> item and much more difficult now due to the early missed opportunity. > >> >> > >> >>> Anyway, let's say that you were given such a promise by someone and > >> >>> thus > >> >>> consider that mechanism to be deprecated - it doesn't really matter. > >> >> > >> >> > >> >> I may be biased but I think my opinion does matter. > >> >> > >> >>> The current implementation doesn't well fit to this flow (it requires > >> >>> per-volume job, it creates leases that are not needed for template's > >> >>> disks, > >> >>> ...) and with the "next-gen API" with proper support for virt flows > >> >>> not > >> >>> even > >> >>> being discussed with us (and iiuc also not with the infra team) yet, I > >> >>> don't > >> >>> understand what do you suggest except for some strong, though > >> >>> irrelevant, > >> >>> statements. > >> >> > >> >> > >> >> If you are willing to engage in a good-faith technical discussion I am > >> >> sure I can help you to understand. These operations to storage demand > >> >> some form of locking protection. If volume leases aren't appropriate > >> >> then > >> >> perhaps we should use the VM Leases / xleases that Nir is finishing > >> >> off for 4.1 now. > >> >> > >> >>> I suggest loud and clear to reuse (not to add dependencies, not to > >> >>> enhance, ..) > >> >>> an existing mechanism for a very similar flow of virt-v2v that works > >> >>> well > >> >>> and > >> >>> simple. > >> >> > >> >> > >> >> I clearly remember discussions involving infra (hello Oved), virt > >> >> (hola Michal), and storage where we decided that new APIs performing > >> >> async operations involving external commands should use the HostJobs > >> >> infrastructure instead of adding more information to Host Stats. > >> >> These were the "famous" entity polling meetings. > >> > >> We discussed these issues behind close doors, not in the public mailing > >> list, > >> so it is not surprising that people do not know about the agreements we > >> had. > >> > > > > The core team was there. So it is surprising. > > > >> >> > >> >> Of course plans can change but I have never been looped into any such > >> >> discussions. > >> >> > >> > > >> > Well, I think that when someone builds a good infrastructure he first > >> > needs > >> > to talk to all consumers and make sure it fits. > >> > In this case it seems like most work was done to fit the storage > >> > use-case, > >> > and now you check whether it can fit others as well.... > >> > >> The jobs framework is generic and can be used for any subsystem, > >> there is nothing related to storage about it. But modifying disks *is* > >> a storage operation, even if someone from the virt team worked on it. > >> > >> V2v is also storage operation - if we compare it with copying disks: > >> > >> - we create a new volume that nobody is using yet > >> - if the operation fails, the disk must be in illegal state > >> - if the operation fails we delete the disks > >> - if the operation succeeds the volume must be legal > >> - we need to limit the number of operations on a host > >> - we need to detect the job state if the host becomes non-responsive > >> - we may want to fence the job if the host becomes non-responsive > >> in volume jobs, we can increment the volume generation and run > >> the same job on another host. > >> - we want to take a lease on storage to ensure that other hosts cannot > >> access the same entity, or that the job will fail if someone else is > >> using > >> this entity > >> - we want to take a lease on storage, ensuring that a job cannot get > >> stuck for long time - sanlock kill the owner of a lease when storage > >> becomes inaccessible. > >> - we want to report progress > >> > >> sysprep is less risky because the operation is faster, but on storage even > >> fast operation can get stuck for minutes. > >> > >> We need to agree on a standard way to do such operations that is safe > >> enough > >> and can be managed on the engine side. > >> > >> > IMO it makes much more sense to use events where possible (and you've > >> > promised to use those as well, but I don't see you doing that...). v2v > >> > should use events for sure, and they have promised to do that in the > >> > past, > >> > instead of using the v2v jobs. The reason events weren't used originally > >> > with the v2v feature, was that it was too risky and the events > >> > infrastructure was added too late in the game. > >> > >> Events are not replacing the need for managing jobs in the vdsm side. > >> Engine must have a way to query the current jobs before subscribing > >> to events from these jobs, otherwise you will loose events and engine > >> will never notice a completed job after network errors. > >> > >> The jobs framework supports events, see > >> https://gerrit.ovirt.org/67118 > >> > >> We are waiting for review from the infra team, maybe you can > >> get someone to review this? > > > > It would have been great to review the design for this before it reaches to > > gerrit. > > Anyway, I get permissions error when opening. Any clue why? > > It is a recent bug in gerrit, or configuration issue, drafts are > private sometimes. > > I added you as reviewer, can you see this now? >
Yes. I see Piotr is already on it. I'll also be happy to hear how are you going to use events in your current design. Also, is there a design page for this work? Thanks, Oved > Nir > > > > >> > >> Nir > >> > >> > > >> > > >> >>> > >> >>> Do you "promise" to implement your "next gen API" for 4.1 as an > >> >>> alternative? > >> >> > >> >> > >> >> I guess we need the design first. > >> >> > >> >> > >> >>> On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <ali...@redhat.com> wrote: > >> >>> > >> >>> On 05/12/16 11:17 +0200, Arik Hadas wrote: > >> >>> > >> >>> > >> >>> > >> >>> On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer > >> >>> <nsof...@redhat.com> > >> >>> wrote: > >> >>> > >> >>> On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud > >> >>> <smela...@redhat.com> > >> >>> wrote: > >> >>> > > >> >>> > Hi! > >> >>> > > >> >>> > I'm currently working on integration of virt-sysprep into > >> >>> oVirt. > >> >>> > > >> >>> > Usually, if user creates a template from a regular VM, and > >> >>> then > >> >>> creates > >> >>> new VMs from this template, these new VMs inherit all > >> >>> configuration > >> >>> of the > >> >>> original VM, including SSH keys, UDEV rules, MAC addresses, > >> >>> system > >> >>> ID, > >> >>> hostname etc. It is unfortunate, because you cannot have two > >> >>> network > >> >>> devices with the same MAC address in the same network, for > >> >>> example. > >> >>> > > >> >>> > To avoid this, user must clean all machine-specific > >> >>> configuration > >> >>> from > >> >>> the original VM before creating a template from it. You can > >> >>> do > >> >>> this > >> >>> manually, but there is virt-sysprep utility that does this > >> >>> automatically. > >> >>> > > >> >>> > Ideally, virt-sysprep should be seamlessly integrated into > >> >>> template > >> >>> creation process. But the first step is to create a simple > >> >>> button: > >> >>> user > >> >>> selects a VM, clicks the button and oVirt executes > >> >>> virt-sysprep > >> >>> on > >> >>> the VM. > >> >>> > > >> >>> > virt-sysprep works directly on VM's filesystem. It accepts > >> >>> list of > >> >>> all > >> >>> disks of the VM as parameters: > >> >>> > > >> >>> > virt-sysprep -a disk1.img -a disk2.img -a disk3.img > >> >>> > > >> >>> > The architecture is as follows: command on the Engine side > >> >>> runs a > >> >>> job on > >> >>> VDSM side and tracks its success/failure. The job on VDSM > >> >>> side > >> >>> runs > >> >>> virt-sysprep. > >> >>> > > >> >>> > The question is how to implement the job correctly? > >> >>> > > >> >>> > I thought about using storage jobs, but they are designed > >> >>> to > >> >>> work > >> >>> only > >> >>> with a single volume, correct? > >> >>> > >> >>> New storage verbs are volume based. This make it easy to > >> >>> manage > >> >>> them on the engine side, and will allow parallelizing volume > >> >>> operations > >> >>> on single or multiple hosts. > >> >>> > >> >>> A storage volume job is using sanlock lease on the modified > >> >>> volume > >> >>> and volume generation number. If a host running pending jobs > >> >>> becomes > >> >>> non-responsive and cannot be fenced, we can detect the state > >> >>> of > >> >>> the job, fence the job, and start the job on another host. > >> >>> > >> >>> In the SPM task, if a host becomes non-responsive and cannot > >> >>> be > >> >>> fenced, the whole setup is stuck, there is no way to perform > >> >>> any > >> >>> storage operation. > >> >>> > Is is possible to use them with operation that is > >> >>> performed > >> >>> on > >> >>> multiple > >> >>> volumes? > >> >>> > Or, alternatively, is it possible to use some kind of 'VM > >> >>> jobs' - > >> >>> that > >> >>> work on VM at whole? > >> >>> > >> >>> We can do: > >> >>> > >> >>> 1. Add jobs with multiple volumes leases - can make error > >> >>> handling > >> >>> very > >> >>> complex. How do tell a job state if you have multiple > >> >>> leases? > >> >>> which > >> >>> volume generation you use? > >> >>> > >> >>> 2. Use volume job using one of the volumes (the boot > >> >>> volume?). > >> >>> This > >> >>> does > >> >>> not protect the other volumes from modification but > >> >>> engine > >> >>> is > >> >>> responsible > >> >>> for this. > >> >>> > >> >>> 3. Use new "vm jobs", using a vm lease (should be available > >> >>> this > >> >>> week > >> >>> on master). > >> >>> This protects a vm during sysprep from starting the vm. > >> >>> We still need a generation to detect the job state, I > >> >>> think > >> >>> we > >> >>> can > >> >>> use the sanlock > >> >>> lease generation for this. > >> >>> > >> >>> I like the last option since sysprep is much like running a > >> >>> vm. > >> >>> > How v2v solves this problem? > >> >>> > >> >>> It does not. > >> >>> > >> >>> v2v predates storage volume jobs. It does not use volume > >> >>> leases > >> >>> and > >> >>> generation > >> >>> and does have any way to recover if a host running v2v > >> >>> becomes > >> >>> non-responsive > >> >>> and cannot be fenced. > >> >>> > >> >>> It also does not use the jobs framework and does not use a > >> >>> thread > >> >>> pool for > >> >>> v2v jobs, so it has no limit on the number of storage > >> >>> operations on > >> >>> a host. > >> >>> > >> >>> > >> >>> Right, but let's be fair and present the benefits of v2v-jobs > >> >>> as > >> >>> well: > >> >>> 1. it is the simplest "infrastructure" in terms of LOC > >> >>> > >> >>> > >> >>> It is also deprecated. V2V has promised to adopt the richer Host > >> >>> Jobs > >> >>> API in the future. > >> >>> > >> >>> > >> >>> 2. it is the most efficient mechanism in terms of interactions > >> >>> between > >> >>> the > >> >>> engine and VDSM (it doesn't require new verbs/call, the data is > >> >>> attached to > >> >>> VdsStats; probably the easiest mechanism to convert to events) > >> >>> > >> >>> > >> >>> Engine is already polling the host jobs API so I am not sure I > >> >>> agree > >> >>> with you here. > >> >>> > >> >>> > >> >>> 3. it is the most efficient implementation in terms of > >> >>> interaction > >> >>> with > >> >>> the > >> >>> database (no date is persisted into the database, no polling is > >> >>> done) > >> >>> > >> >>> > >> >>> Again, we're already using the Host Jobs API. We'll gain > >> >>> efficiency > >> >>> by migrating away from the old v2v API and having a single, unified > >> >>> approach (Host Jobs). > >> >>> > >> >>> > >> >>> Currently we have 3 mechanisms to report jobs: > >> >>> 1. VM jobs - that is currently used for live-merge. This > >> >>> requires > >> >>> the > >> >>> VM entity > >> >>> to exist in VDSM, thus not suitable for virt-sysprep. > >> >>> > >> >>> > >> >>> Correct, not appropriate for this application. > >> >>> > >> >>> > >> >>> 2. storage jobs - complicated infrastructure, targeted for > >> >>> recovering > >> >>> from > >> >>> failures to maintain storage consistency. Many of the things > >> >>> this > >> >>> infrastructure knows to handle is irrelevant for virt-sysprep > >> >>> flow, and > >> >>> the > >> >>> fact that virt-sysprep is invoked on VM rather than particular > >> >>> disk > >> >>> makes it > >> >>> less suitable. > >> >>> > >> >>> > >> >>> These are more appropriately called HostJobs and the have the > >> >>> following semantics: > >> >>> - They represent an external process running on a single host > >> >>> - They are not persisted. If the host or vdsm restarts, the job is > >> >>> aborted > >> >>> - They operate on entities. Currently storage is the first adopter > >> >>> of the infrastructure but virt was going to adopt these for the > >> >>> next-gen API. Entities can be volumes, storage domains, vms, > >> >>> network interfaces, etc. > >> >>> - Job status and progress is reported by the Host Jobs API. If a > >> >>> job > >> >>> is not present, then the underlying entitie(s) must be polled by > >> >>> engine to determine the actual state. > >> >>> > >> >>> > >> >>> 3. V2V jobs - no mechanism is provided to resume failed jobs, > >> >>> no > >> >>> leases, etc > >> >>> > >> >>> > >> >>> This is the old infra upon which Host Jobs are built. v2v has > >> >>> promised to move to Host Jobs in the future so we should not add > >> >>> new > >> >>> dependencies to this code. > >> >>> > >> >>> > >> >>> I have some arguments for using V2V-like jobs [1]: > >> >>> 1. creating template from vm is rarely done - if host goes > >> >>> unresponsive > >> >>> or any > >> >>> other failure is detected we can just remove the template and > >> >>> report > >> >>> the error > >> >>> > >> >>> > >> >>> We can chose this error handling with Host Jobs as well. > >> >>> > >> >>> > >> >>> 2. the phase of virt-sysprep is, unlike typical storage > >> >>> operation, > >> >>> short - > >> >>> reducing the risk of failures during the process > >> >>> > >> >>> > >> >>> Reduced risk of failures is never an excuse to have lax error > >> >>> handling. The storage flavored host jobs provide tons of utilities > >> >>> for making error handling standardized, easy to implement, and > >> >>> correct. > >> >>> > >> >>> > >> >>> 3. during the operation the VM is down - by locking the > >> >>> VM/template and > >> >>> its > >> >>> disks on the engine side, we render leases-like mechanism > >> >>> redundant > >> >>> > >> >>> > >> >>> Eventually we want to protect all operations on storage with > >> >>> sanlock > >> >>> leases. This is safer and allows for a more distributed approach > >> >>> to > >> >>> management. Again, the use of leases correctly in host jobs > >> >>> requires > >> >>> about 5 lines of code. The benefits of standardization far > >> >>> outweigh > >> >>> any perceived simplification resulting from omitting it. > >> >>> > >> >>> > >> >>> 4. in the worst case - the disk will not be corrupted (only > >> >>> some > >> >>> of the > >> >>> data > >> >>> might be removed). > >> >>> > >> >>> > >> >>> Again, the way engine chooses to handle job failures is independent > >> >>> of > >> >>> the mechanism. Let's separate that from this discussion. > >> >>> > >> >>> > >> >>> So I think that the mechanism for storage jobs is an over-kill > >> >>> for > >> >>> this > >> >>> case. > >> >>> We can keep it simple by generalise the V2V-job for other > >> >>> virt-tools > >> >>> jobs, like > >> >>> virt-sysprep. > >> >>> > >> >>> > >> >>> I think we ought to standardize on the Host Jobs framework where we > >> >>> can collaborate on unit tests, standardized locking and error > >> >>> handling, abort logic, etc. When v2v moves to host jobs then we > >> >>> will > >> >>> have a unified method of handling ephemeral jobs that are tied to > >> >>> entities. > >> >>> > >> >>> -- > >> >>> Adam Litke > >> >>> > >> >>> > >> >> > >> >> -- > >> >> Adam Litke > >> > > >> >
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel