On 16.02.2023 16:10, Tamas K Lengyel wrote: > On Thu, Feb 16, 2023 at 9:27 AM Jan Beulich <jbeul...@suse.com> wrote: >> >> On 16.02.2023 13:22, Tamas K Lengyel wrote: >>> On Thu, Feb 16, 2023 at 3:31 AM Jan Beulich <jbeul...@suse.com> wrote: >>>> >>>> On 15.02.2023 17:29, Tamas K Lengyel wrote: >>>>> On Wed, Feb 15, 2023 at 5:27 AM Jan Beulich <jbeul...@suse.com> wrote: >>>>>> Did you consider the alternative of adjusting the ASSERT() in > question >>>>>> instead? It could reasonably become >>>>>> >>>>>> #ifdef CONFIG_MEM_SHARING >>>>>> ASSERT(!p2m_is_hostp2m(p2m) || !remove_root || >>>>> !atomic_read(&d->shr_pages)); >>>>>> #endif >>>>>> >>>>>> now, I think. That would be less intrusive a change (helpful for >>>>>> backporting), but there may be other (so far unnamed) benefits of >>> pulling >>>>>> ahead the shared memory teardown. >>>>> >>>>> I have a hard time understanding this proposed ASSERT. >>>> >>>> It accounts for the various ways p2m_teardown() can (now) be called, >>>> limiting the actual check for no remaining shared pages to the last >>>> of all these invocations (on the host p2m with remove_root=true). >>>> >>>> Maybe >>>> >>>> /* Limit the check to the final invocation. */ >>>> if ( p2m_is_hostp2m(p2m) && remove_root ) >>>> ASSERT(!atomic_read(&d->shr_pages)); >>>> >>>> would make this easier to follow? Another option might be to move >>>> the assertion to paging_final_teardown(), ahead of that specific call >>>> to p2m_teardown(). >>> >>> AFAICT d->shr_pages would still be more then 0 when this is called > before >>> sharing is torn down so the rearrangement is necessary even if we do > this >>> assert only on the final invocation. I did a printk in place of this > assert >>> without the rearrangement and afaict it was always != 0. >> >> Was your printk() in an if() as above? paging_final_teardown() runs really >> late during cleanup (when we're about to free struct domain), so I would >> be somewhat concerned if by that time the count was still non-zero. > > Just replaced the assert with a printk. Without calling relinquish shared > pages I don't find it odd that the count is non-zero, it only gets > decremented when you do call relinquish. Once the order is corrected it is > zero.
I may be getting you wrong, but it feels like you're still talking about early invocations of p2m_teardown() (from underneath domain_kill()) when I'm talking about the final one. No matter what ordering inside domain_relinquish_resources() (called from domain_kill()), the freeing will have happened by the time that process completes. Which is before the (typically last) last domain ref would be put (near the end of domain_kill()), and hence before the domain freeing process might be invoked (which is where paging_final_teardown() gets involved; see {,arch_}domain_destroy()). Jan