On 12/21/18 3:23 AM, Nikolay Shirokovskiy wrote:
>
>
> On 20.12.2018 23:31, John Ferlan wrote:
>>
>>
>> On 12/13/18 3:03 AM, Nikolay Shirokovskiy wrote:
>>> In case of active persistent domain snapshot metadata is
>>> not complete. We save only active configuration and on
>>> restore use it both for active and inactive configuration.
>>> Let's fix it and save and restore both in this case.
>>>
>>> In case of active transient or inactive domain we have
>>> only one config and thus everything is good.
>>>
>>> By the way this patch fixes config memleak on error paths.
>>
>> use @config to make it clearer...
>>
>> I see how/why you combined things and although extracting out the
>> memleak is a bit difficult, I think it should be done.
>>
>> Perhaps in the cleanup code to *SaveConfig, you just set @config = NULL
>> "temporarily" since we know/assume virDomainObjAssignDef was successful
>> and we wouldn't want the subsequently added virDomainDefFree(config) to
>> free it on those paths, but we would want it to be free'd on error
>> paths. Then by using @newConfig in this patch, it's even more obvious
>> and of course the temporary config = NULL would be removed... Hope this
>> makes sense.
>
> Ok. I thought of a distinct patch too, but having in mind that adding/removing
> code in subsequent patches is not preferred I decided to just note this moment
> in the message.
>
I'm mostly 50/50 on this - leaning towards separation mainly because I
think it is possible, but also because when you mention adding a new
feature and fixing an existing memory leak in the commit message that
generally results in someone asking for separation. Certainly makes it
easier to bisect to a commit and not have to wonder did the new
functionality or the bugfix cause some issue...
>>
>>>
>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovs...@virtuozzo.com>
>>> ---
>>> src/conf/snapshot_conf.c | 1 +
>>> src/conf/snapshot_conf.h | 2 ++
>>> src/qemu/qemu_driver.c | 44 +++++++++++++++++++++++++++++++++++++++-----
>>> 3 files changed, 42 insertions(+), 5 deletions(-)
>>>
>>
>> Not my area of expertise, but rather than have it sit unreviewed...
>>
>>
>>> diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
>>> index 5a511b4..2e4a0b9 100644
>>> --- a/src/conf/snapshot_conf.c
>>> +++ b/src/conf/snapshot_conf.c
>>> @@ -102,6 +102,7 @@ void virDomainSnapshotDefFree(virDomainSnapshotDefPtr
>>> def)
>>> virDomainSnapshotDiskDefClear(&def->disks[i]);
>>> VIR_FREE(def->disks);
>>> virDomainDefFree(def->dom);
>>> + virDomainDefFree(def->persistDom);
>>> virObjectUnref(def->cookie);
>>> VIR_FREE(def);
>>> }
>>> diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h
>>> index 20a42bd..3da204a 100644
>>> --- a/src/conf/snapshot_conf.h
>>> +++ b/src/conf/snapshot_conf.h
>>> @@ -75,6 +75,8 @@ struct _virDomainSnapshotDef {
>>> virDomainSnapshotDiskDef *disks;
>>>
>>> virDomainDefPtr dom;
>>> + /* inactive domain config in case of active persistent domain */
>>> + virDomainDefPtr persistDom;
>>>
>>> virObjectPtr cookie;
>>>
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index f7c1d6f..7e0585d 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -15687,6 +15687,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
>>>
>>> VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATE)))
>>> goto endjob;
>>>
>>> + if (virDomainObjIsActive(vm) && vm->persistent &&
>>
>> As it's easy to get lost or forget... IIRC vm->persistent means that w> have
>> used the Define functions to provide a "config" at some point in
>
>> time. Then eventually when/if we modify the "active" object to make a
>> configuration specific change, then the @newDef gets populated with the
>> config change(s); otherwise, the changes are made to the "active" def
>> and eventually/possibly saved. All the magic is hidden away in
>> accessors that I don't think about all that often any more.
>
> If by otherwise you mean "if we make changes to active config" - then
> such changes are saved to status files but they can't be written
> directly to config file.
>
Like I said - some of the magic and details are lost in accessors. Yes,
the active is essentially saved to status... The whole def/newDef fetch,
store, and manipulation gets hidden behind those *Domain.*Persistent*
type helpers. The @newDef nomenclature has always struck me as a bit
odd, but using a longer more descriptive name may not help me either.
>>
>> So, should the vm->persistent be vm->newDef instead? Either way if the
>> following call is made and vm->newDef == NULL, then eventually in
>> virDomainDefFormatInternal we would core when @def was NULL.
>
> Well there is at least one case when vm->newDef != NULL but domain is not
> persistent
> - if we undefine domain vm->newDef is not released. This can be fixed but
> I'm not sure there are no other cases. So checking vm->persisent is more
> reliable.
So we have an active persistent domain with a newDef that eventually is
made non persistent... I'm sure there's a reason for it...
>
> I think this is currenly invariant that if virDomainObjIsActive(vm) &&
> vm->persistent
> then vm->newDef != NULL but we can add explicit check.
>
> The best option I think is to fix all cases when vm->newDef != NULL for non
> persistent
> active domain and replace check as you suggested. What do you think?
>
I think my main concern would be vm->persistent == true and vm->newDef
== NULL.
>
>>
>>
>>> + !(def->persistDom = qemuDomainDefCopy(driver, vm->newDef,
>>> + VIR_DOMAIN_XML_SECURE |
>>> +
>>> VIR_DOMAIN_XML_MIGRATABLE)))
>>
>> Do we need VIR_DOMAIN_XML_INACTIVE too since this is the "next"
>> persistent config def? A variant for parse is used later in patch5.
>
> This is one of the hardest questions - what combinations of flags one should
> use for one's purpuses :) I'm not kidding, there are no good definitions of
110% agree.
> VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_DEF_FORMAT_INACTIVE. So I use same
> combination migration uses to pass persistent config to destination -
> check qemuMigrationSrcRun.
and obviously I asked because I find the design of the flags is
confusing and what each means with respect to what post parse processing
or validation occurs - I just cannot keep it in memory.
>
> Hmm, virDomainDefFormatInternal has next hunk at the beginning:
>
> if (def->id == -1)
>
> flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
>
> and def->id == -1 for inactive config (@id is set to -1 on parsing
> inactive configs and on domain start @def is copied to @newDef before
> allocating @id - check qemuProcessInit). So we have this flag anyway :)
<sigh>... commit f24e67d24
>
> Also even the above fact does not have the case AFAIU having this flag for
> inactive config
> does not have much meaning - random hunks of code that I checked filter
> dumping info that gained by config on domain start. However I run across a
> kind of exception too - @id is not dumped at all if
> VIR_DOMAIN_DEF_FORMAT_INACTIVE
> is set however this is not make much difference.
>
The case you're chasing though would have an id != -1 since it's an
active guest, but we're saving the newDef/next config. Perhaps similar
to how various qemu_driver commands can manipulate both active and
config (attach, detach, change, etc). It's a "hypervisor dependent"
action as to what happens.
>>
>>> + goto endjob;
>>> +
>>
>> Since we are doing this as "best chance" and aren't requiring it, what
>> are your (and perhaps others) thoughts related to making this really
>> optional. As in if the qemuDomainDefCopy fails, then we just throw a
>> VIR_INFO, clear the last error, and continue on?
>>
>> I don't have a preference, but I'm just throwing it out there as an idea.
>
> I should say I'm not looking at this step as "best chance" - if we can
> not revert exact state I'd prefer to fail here.
>
>>
>>> if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
>>> align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
>>> align_match = false;
>>> @@ -16219,6 +16225,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> qemuDomainObjPrivatePtr priv;
>>> int rc;
>>> virDomainDefPtr config = NULL;
>>> + virDomainDefPtr persistConfig = NULL;
>>> virQEMUDriverConfigPtr cfg = NULL;
>>> virCapsPtr caps = NULL;
>>> bool was_stopped = false;
>>> @@ -16226,6 +16233,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> virCPUDefPtr origCPU = NULL;
>>> unsigned int start_flags = VIR_QEMU_PROCESS_START_GEN_VMID;
>>> qemuDomainAsyncJob jobType = QEMU_ASYNC_JOB_START;
>>> + virDomainDefPtr newConfig = NULL;
>>>
>>> virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING |
>>> VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED |
>>> @@ -16332,6 +16340,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> goto endjob;
>>> }
>>>
>>> + if (snap->def->persistDom &&
>>> + !(persistConfig = virDomainDefCopy(snap->def->persistDom, caps,
>>> + driver->xmlopt, NULL, true)))
>>> + goto endjob;
>>> +
>>
>> Based only on my previous comment, for this part I agree especially
>> since if we were able to save a config, then we'd want to be sure to
>> restore it.
>>
>>> cookie = (qemuDomainSaveCookiePtr) snap->def->cookie;
>>>
>>> switch ((virDomainState) snap->def->state) {
>>> @@ -16440,8 +16453,18 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> * failed loadvm attempt? */
>>> goto endjob;
>>> }
>>> - if (config) {
>>> - virDomainObjAssignDef(vm, config, false, NULL);
>>> +
>>> + /* Older versions do not save inactive config in metadata,
>>> instead
>>> + * they use active config for this purpose, so keep this
>>> behaviour
>>> + * for backward compat.
>>> + */
>>> + if (persistConfig)
>>> + VIR_STEAL_PTR(newConfig, persistConfig);
>>> + else
>>> + VIR_STEAL_PTR(newConfig, config);
>>> +
>>> + if (newConfig) {
>>> + virDomainObjAssignDef(vm, newConfig, false, NULL);
>>> virCPUDefFree(priv->origCPU);
>>> VIR_STEAL_PTR(priv->origCPU, origCPU);
>>> }
>>> @@ -16449,8 +16472,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> /* Transitions 2, 3 */
>>> load:
>>> was_stopped = true;
>>> - if (config)
>>> + if (config) {
>>> virDomainObjAssignDef(vm, config, false, NULL);
>>> + config = NULL;
>>
>> This "makes sense" except for the case where @persistConfig == NULL,
>> then we wouldn't virDomainSaveConfig below as we currently do. So maybe
>> this should be:
>>
>> if (!persistConfig)
>> VIR_STEAL_PTR(newConfig, config);
>> else
>> config = NULL;
>>
>> That way we won't virDomainDefFree @config below (which is what you're
>> trying to avoid), but we will save @config as we did before unless of
>> course @persistConfig is going to be used.
>>
>> Make sense?
>
> Good catch! I think I'll change to:
>
> if (config) {
> virDomainObjAssignDef(vm, config, false, NULL);
> VIR_STEAL_PTR(newConfig, config);
> }
>
> because if we have persistConfig but fail to assing it - means
> we are on error path and we won't save newConfig anyway.
>
OK - I was going explicit with the alternative to the code below for
persistConfig/newConfig.
>>
>>> + }
>>>
>>> /* No cookie means libvirt which saved the domain was too old
>>> to
>>> * mess up the CPU definitions.
>>> @@ -16471,6 +16496,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> detail);
>>> if (rc < 0)
>>> goto endjob;
>>> +
>>> + if (persistConfig) {
>>
>> Perhaps we should point out:
>>
>> /* This will save the @persistConfig in the vm->newDef where it
>> * was originally copied from. */
>>
>> John
>>
>>> + virDomainObjAssignDef(vm, persistConfig, false, NULL);
>>> + VIR_STEAL_PTR(newConfig, persistConfig);
>>> + }
>>> }
>
>
> This makes me wonder if virDomainObjAssignDef is too overcomplicated
> if we need such comments.
>
Nah, no way that's over complicated <eyeroll>. Touching it though means
you get to own making it better... job security and all that.
John
> Nikolay
>
>>>
>>> /* Touch up domain state. */
>>> @@ -16535,8 +16565,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> qemuDomainRemoveInactive(driver, vm);
>>> goto endjob;
>>> }
>>> - if (config)
>>> + if (config) {
>>> virDomainObjAssignDef(vm, config, false, NULL);
>>> + VIR_STEAL_PTR(newConfig, config);
>>> + }
>>>
>>> if (flags & (VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING |
>>> VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) {
>>> @@ -16600,7 +16632,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> } else if (snap) {
>>> snap->def->current = false;
>>> }
>>> - if (ret == 0 && config && vm->persistent &&
>>> + if (ret == 0 && newConfig && vm->persistent &&
>>> !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
>>> vm->newDef ? vm->newDef : vm->def))) {
>>> detail = VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT;
>>> @@ -16616,6 +16648,8 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr
>>> snapshot,
>>> virObjectUnref(cfg);
>>> virNWFilterUnlockFilterUpdates();
>>> virCPUDefFree(origCPU);
>>> + virDomainDefFree(config);
>>> + virDomainDefFree(persistConfig);
>>>
>>> return ret;
>>> }
>>>
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list