Adam,

Just out of curiosity: when you write "v2v has promised" - what exactly do
you mean? The tool? Richard Jones (the maintainer of virt-v2v)? Shahar and
me, who implemented the integration with virt-v2v? I'm not aware of such a
promise from any of the above :)
Anyway, let's say that you were given such a promise by someone and
therefore consider that mechanism deprecated - it doesn't really matter.
The current implementation doesn't fit this flow well (it requires a
per-volume job, it creates leases that are not needed for a template's
disks, ...), and with the "next-gen API" with proper support for virt
flows not yet even being discussed with us (and, iiuc, not with the infra
team either), I don't understand what you are suggesting, apart from some
strong, though irrelevant, statements.

I suggest, loud and clear, to reuse (not to add dependencies to, not to
enhance, ...) an existing mechanism that already works well and simply for
the very similar virt-v2v flow. Do you "promise" to implement your
"next-gen API" as an alternative for 4.1?

On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <ali...@redhat.com> wrote:

> On 05/12/16 11:17 +0200, Arik Hadas wrote:
>
>> On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <nsof...@redhat.com> wrote:
>>
>> On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <smela...@redhat.com>
>> wrote:
>> >
>> > Hi!
>> >
>> > I'm currently working on the integration of virt-sysprep into oVirt.
>> >
>> > Usually, if a user creates a template from a regular VM and then
>> > creates new VMs from this template, these new VMs inherit all the
>> > configuration of the original VM, including SSH keys, udev rules,
>> > MAC addresses, system ID, hostname, etc. This is unfortunate,
>> > because you cannot have, for example, two network devices with the
>> > same MAC address on the same network.
>> >
>> > To avoid this, the user must clean all machine-specific
>> > configuration from the original VM before creating a template from
>> > it. This can be done manually, but the virt-sysprep utility does it
>> > automatically.
>> >
>> > Ideally, virt-sysprep should be seamlessly integrated into the
>> > template creation process. But the first step is to create a simple
>> > button: the user selects a VM, clicks the button, and oVirt executes
>> > virt-sysprep on the VM.
>> >
>> > virt-sysprep works directly on the VM's filesystem. It accepts the
>> > list of all disks of the VM as parameters:
>> >
>> > virt-sysprep -a disk1.img -a disk2.img -a disk3.img
>> >
>> > The architecture is as follows: a command on the Engine side runs a
>> > job on the VDSM side and tracks its success/failure. The job on the
>> > VDSM side runs virt-sysprep.
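>> > Concretely, the job would boil down to something like the following
>> > sketch (run_sysprep is an illustrative name, not an existing VDSM
>> > verb):
>> >
>> >     import subprocess
>> >
>> >     def run_sysprep(disk_paths):
>> >         # Build: virt-sysprep -a disk1.img -a disk2.img ...
>> >         cmd = ['virt-sysprep']
>> >         for path in disk_paths:
>> >             cmd.extend(('-a', path))
>> >         # The VM must be down: virt-sysprep modifies the guest
>> >         # filesystems in place.
>> >         subprocess.check_call(cmd)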
>> > The question is how to implement the job correctly?
>> >
>> > I thought about using storage jobs, but they are designed to work
>> > only with a single volume, correct?
>>
>> New storage verbs are volume based. This makes it easy to manage them
>> on the engine side, and will allow parallelizing volume operations on
>> a single host or on multiple hosts.
>>
>> A storage volume job uses a sanlock lease on the modified volume and
>> the volume generation number. If a host running pending jobs becomes
>> non-responsive and cannot be fenced, we can detect the state of the
>> job, fence the job, and start the job on another host.
>>
>> With the SPM task, if a host becomes non-responsive and cannot be
>> fenced, the whole setup is stuck; there is no way to perform any
>> storage operation.
>>
>> > Is it possible to use them with an operation that is performed on
>> > multiple volumes?
>> > Or, alternatively, is it possible to use some kind of 'VM jobs'
>> > that work on the VM as a whole?
>>
>> We can do:
>>
>> 1. Add jobs with multiple volume leases - this can make error handling
>> very complex. How do you tell the job state if you have multiple
>> leases? Which volume generation do you use?
>>
>> 2. Use a volume job on one of the volumes (the boot volume?). This
>> does not protect the other volumes from modification, but engine is
>> responsible for that.
>>
>> 3. Use the new "VM jobs", using a VM lease (should be available this
>> week on master). This protects the VM from being started during
>> sysprep. We still need a generation to detect the job state; I think
>> we can use the sanlock lease generation for this.
>>
>> I like the last option, since sysprep is much like running a VM.
>>
>> > How does v2v solve this problem?
>>
>> It does not.
>>
>> v2v predates storage volume jobs. It does not use volume leases and
>> generations, and it has no way to recover if a host running v2v
>> becomes non-responsive and cannot be fenced.
>>
>> It also does not use the jobs framework and does not use a thread
>> pool for v2v jobs, so it has no limit on the number of storage
>> operations on a host.
>>
>> Right, but let's be fair and present the benefits of v2v jobs as well:
>> 1. It is the simplest "infrastructure" in terms of LOC.
>
> It is also deprecated. V2V has promised to adopt the richer Host Jobs
> API in the future.
>
>> 2. It is the most efficient mechanism in terms of interactions
>> between the engine and VDSM (it doesn't require new verbs/calls, the
>> data is attached to VdsStats; it is probably also the easiest
>> mechanism to convert to events).
>
> Engine is already polling the host jobs API, so I am not sure I agree
> with you here.
>
>> 3. It is the most efficient implementation in terms of interaction
>> with the database (no data is persisted into the database, no polling
>> is done).
>
> Again, we're already using the Host Jobs API. We'll gain efficiency
> by migrating away from the old v2v API and having a single, unified
> approach (Host Jobs).
>
>> Currently we have 3 mechanisms to report jobs:
>> 1. VM jobs - currently used for live merge. This requires the VM
>> entity to exist in VDSM, thus not suitable for virt-sysprep.
>
> Correct, not appropriate for this application.
>
>> 2. Storage jobs - a complicated infrastructure, targeted at
>> recovering from failures in order to maintain storage consistency.
>> Many of the things this infrastructure knows how to handle are
>> irrelevant to the virt-sysprep flow, and the fact that virt-sysprep
>> is invoked on a VM rather than on a particular disk makes it less
>> suitable.
>
> These are more appropriately called Host Jobs and they have the
> following semantics:
> - They represent an external process running on a single host.
> - They are not persisted. If the host or vdsm restarts, the job is
>   aborted.
> - They operate on entities. Currently storage is the first adopter of
>   the infrastructure, but virt was going to adopt these for the
>   next-gen API. Entities can be volumes, storage domains, VMs,
>   network interfaces, etc.
> - Job status and progress is reported by the Host Jobs API. If a job
>   is not present, then the underlying entity (or entities) must be
>   polled by engine to determine the actual state.
>
>> 3. V2V jobs - no mechanism is provided to resume failed jobs, no
>> leases, etc.
>
> This is the old infra upon which Host Jobs are built. v2v has
> promised to move to Host Jobs in the future, so we should not add new
> dependencies on this code.
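> To make those semantics concrete, a host job is roughly the following
> (a sketch with illustrative names, not the actual vdsm classes):
>
>     class HostJob(object):
>         # Ephemeral: lives only in this vdsm process. If vdsm or the
>         # host restarts, the job disappears and engine must poll the
>         # entity itself to learn the real state.
>
>         def __init__(self, job_id, entity_id):
>             self.id = job_id
>             self.entity_id = entity_id
>             self.status = 'pending'  # pending -> running -> done/failed
>             self.progress = 0
>
>         def run(self):
>             self.status = 'running'
>             try:
>                 self._execute()
>             except Exception:
>                 self.status = 'failed'
>                 raise
>             else:
>                 self.status = 'done'
>
>         def _execute(self):
>             # Subclasses do the actual work, e.g. spawn virt-sysprep.
>             raise NotImplementedError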
>> I have some arguments for using V2V-like jobs [1]:
>> 1. Creating a template from a VM is rarely done - if the host becomes
>> unresponsive or any other failure is detected, we can just remove the
>> template and report the error.
>
> We can choose this error handling with Host Jobs as well.
>
>> 2. The virt-sysprep phase is, unlike a typical storage operation,
>> short - reducing the risk of failures during the process.
>
> Reduced risk of failures is never an excuse to have lax error
> handling. The storage-flavored host jobs provide tons of utilities
> for making error handling standardized, easy to implement, and
> correct.
>
>> 3. During the operation the VM is down - by locking the VM/template
>> and its disks on the engine side, we render a leases-like mechanism
>> redundant.
>
> Eventually we want to protect all operations on storage with sanlock
> leases. This is safer and allows for a more distributed approach to
> management. Again, using leases correctly in a host job requires
> about 5 lines of code. The benefits of standardization far outweigh
> any perceived simplification resulting from omitting it.
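> Schematically, those few lines look something like this (acquire_lease
> is a hypothetical helper standing in for the real lease utilities in
> vdsm's storage code):
>
>     def run(self):
>         # Hold the entity's sanlock lease for the duration of the
>         # work; the context manager releases it even on failure.
>         with acquire_lease(self.entity_id):
>             self._execute()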
>> 4. In the worst case, the disk will not be corrupted (only some of
>> the data might be removed).
>
> Again, the way engine chooses to handle job failures is independent
> of the mechanism. Let's separate that from this discussion.
>
>> So I think that the storage jobs mechanism is overkill for this case.
>> We can keep it simple by generalizing the V2V job for other
>> virt-tools jobs, like virt-sysprep.
>
> I think we ought to standardize on the Host Jobs framework, where we
> can collaborate on unit tests, standardized locking and error
> handling, abort logic, etc. When v2v moves to host jobs we will have
> a unified method of handling ephemeral jobs that are tied to
> entities.
>
> --
> Adam Litke

_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel