On Mon, Sep 22, 2025 at 4:28 PM Peter Krempa <[email protected]> wrote:

> On Mon, Sep 22, 2025 at 16:15:47 +0800, Yong Huang wrote:
> > On Mon, Sep 22, 2025 at 2:59 PM Peter Krempa <[email protected]> wrote:
> >
> > > On Mon, Sep 22, 2025 at 11:30:46 +0800, Yong Huang wrote:
> > > > On Fri, Sep 19, 2025 at 8:23 PM Peter Krempa <[email protected]>
> wrote:
> > > >
> > > > > On Fri, Sep 19, 2025 at 17:09:07 +0800, [email protected]
> wrote:
> > > > > > From: Hyman Huang <[email protected]>
>
> [...]
>
> > > > 3. Launch the migration and use "systemctl restart libvirt" to
> restart
> > > > Libvirtd
> > > > once after migration enters the perform phase.
> > >
> > > [...]
> > >
> > > Okay so my understanding from your description is that an (early
> > > startup) failure in virDomainObjListLoadAllConfigs() (and surrounding
> > > code) can result in the daemon shutting down before the threads
> handling
> > > the already loaded(? ... impossible to tell with the abbreviated log
> > > below) domains terminate? Right?
> > >
> >
> > Yes, in our productized Libvirt, an early failure in
> > virDomainObjListLoadAllConfigs() can result in the daemon shutting down.
> >
> > In upstream Libvirt, the daemon started up successfully but failed to
> > manage the VM (virDomainObjListLoadAllConfigs() returns an error because
> > the private data is missing from the status XML).
>
> So I assume a non-upstream version. What is it based on? What else did
> you change?
>
>
> > > Thus the other threads trigger a use-after-free on the driver object?
> > >
> > > Anyways I think it's clear now that just checking if the callbacks are
> > > present doesn't make sense.
> > >
> > > Additionally there's now an upstream issue
> > > https://gitlab.com/libvirt/libvirt/-/issues/814
> > > which seems to claim a use-after-free on a different code path but
> still
> > > triggered by the cleanup code freeing private data.
> > >
> > > Unfortunately I didn't get any logs or backtrace there either.
> > >
> > > I'll look into the shutdown code path and see if I can figure it out.
> > >
> > > >
> > > >
> > > > 4. Search the log message:
> > > >
> > > > $ cat /var/log/zbs/libvirtd.log |egrep "PrivateData formatter driver
> does
> > > > not exist|remoteDispatchDomainMigratePerform3Params"
> > > > 2025-09-22 03:06:12.517+0000: 1124258: debug : virThreadJobSet:94 :
> > > Thread
> > > > 1124258 (rpc-worker) is now running job
> > > > remoteDispatchDomainMigratePerform3Params
> > >
> >
> > This log indicates that thread 1124258 is now executing
> > remoteDispatchDomainMigratePerform3Params.
> >
> > > 2025-09-22 03:06:12.517+0000: 1124258: debug :
> > > > remoteDispatchDomainMigratePerform3ParamsHelper:8804 :
> > > > server=0x556317979660 client=0x55631799eff0 msg=0x55631799c010
> > > > rerr=0x7f08c688b9c0 args=0x7f08a800a820 ret=0x7f08a80053b0
> > > > 2025-09-22 03:06:21.959+0000: 1124258: warning :
> > > virDomainObjFormat:30190 :
> > > > PrivateData formatter driver does not exist
> > >
> >
> > While executing remoteDispatchDomainMigratePerform3Params, the thread
> > reaches the debug check below and logs the warning. This warning is never
> > logged during a successful migration:
> >
> > +    if (!xmlopt->privateData.format) {
> > +        VIR_WARN("PrivateData formatter driver does not exist");
> > +    }
> >
> > The following shows the backtrace of virDomainObjFormat in a
> > successful migration:
>
> Successful, meaning you didn't hit the bug?
>
> > #0  virDomainObjFormat (obj=obj@entry=0x7fa3342598e0,
> > xmlopt=0x7fa3341c54b0, flags=flags@entry=313) at
> > ../../src/conf/domain_conf.c:30166
> > #1  0x00007fa395ae8684 in virDomainObjSave (obj=obj@entry
> =0x7fa3342598e0,
> > xmlopt=<optimized out>, statusDir=0x7fa33412aec0 "/run/libvirt/qemu") at
> > ../../src/conf/domain_conf.c:30375
>
> [...]
>
>
> I asked for a backtrace of all threads as I want to see what the other
> threads are doing during the shutdown.
>
> > > > 2025-09-22 03:06:25.141+0000: 1124258: warning :
> > > virDomainObjFormat:30190 :
> > > > PrivateData formatter driver does not exist
> > > > 2025-09-22 03:06:25.141+0000: 1124258: warning :
> > > virDomainObjFormat:30190 :
> > > > PrivateData formatter driver does not exist
> > > > 2025-09-22 03:06:25.153+0000: 1124258: warning :
> > > virDomainObjFormat:30190 :
> > > > PrivateData formatter driver does not exist
> >
> > > 2025-09-22 03:06:25.153+0000: 1124258: debug : virThreadJobClear:119 :
> > > > Thread 1124258 (rpc-worker) finished job
> > > > remoteDispatchDomainMigratePerform3Params with ret=-1
> > >
> > > This log is so abbreviated that it's useless. Please post the full
> thing
> > > somewhere.
> > >
> >
> > :( Since we focus on the shutdown code of Libvirtd, getting the backtrace
> > is not easy, so I added
> > the debug patch.
> >
> >
> > >
> > > Additionally if you can reproduce this without the patch I'd be
> > > interested in that log as well.
> > >
> > Yes, I reproduced this with Libvirt 6.2.0. The latest upstream version
> > uses the same logic, so I assume it also has this issue, and reproducing
> > it is not that hard.
>
> So, can you hit this problem with current upstream code?
>
> While the logic for formatting data is the same, it's not actually the
> problem. The problem is in the shutdown logic and I do remember some
> changes in the code. Especially if you're claiming to use libvirt-6.2
> which is 5 years old at this point.
>
>
OK. To avoid chasing an issue that may not exist in the upstream code, I'll
try to reproduce this with current upstream and reply once I have the result.

Thanks for the reply.

-- 
Best regards
