On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote:
>
> On 24.07.2013, at 12:01, Gleb Natapov wrote:
>
> > Copying Andrea for him to verify that I am not talking nonsense :)
> >
> > On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
> >>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >>> index 1580dd4..5e8635b 100644
> >>> --- a/virt/kvm/kvm_main.c
> >>> +++ b/virt/kvm/kvm_main.c
> >>> @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
> >>>
> >>> bool kvm_is_mmio_pfn(pfn_t pfn)
> >>> {
> >>> +#ifdef CONFIG_MEMORY_HOTPLUG
> >>
> >> I'd feel safer if we narrow this down to e500.
> >>
> >>> + /*
> >>> + * Currently only in memory hot remove case we may still need
> >>> this.
> >>> + */
> >>> if (pfn_valid(pfn)) {
> >>
> >> We still have to check for pfn_valid, no? So the #ifdef should be down
> >> here.
> >>
> >>> int reserved;
> >>> struct page *tail = pfn_to_page(pfn);
> >>> @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
> >>> }
> >>> return PageReserved(tail);
> >>> }
> >>> +#endif
> >>>
> >>> return true;
> >>> }
> >>>
> >>> Before apply this change:
> >>>
> >>> real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5=
> >>> 1m21.376s
> >>> user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5=
> >>> 0m23.433s
> >>> sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5=
> >>> 0m50.349s
> >>>
> >>> After apply this change:
> >>>
> >>> real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5=
> >>> 1m20.667s
> >>> user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5=
> >>> 0m22.615s
> >>> sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5=
> >>> 0m49.496s
> >>>
> >>> So,
> >>>
> >>> real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
> >>> user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
> >>> sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
> >>
> >> Very nice, so there is a real world performance benefit to doing this.
> >> Then yes, I think it would make sense to change the global helper function
> >> to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead.
> >>
> >> Gleb, Paolo, any hard feelings?
> >>
> > I do not see how can we break the function in such a way and get
> > away with it. Not all valid pfns point to memory. Physical address can
> > be sparse (due to PCI hole, framebuffer or just because).
>
> But we don't check for sparseness today in here either. We merely check for
> incomplete huge pages.
>
That's not how I read the code. The code checks for reserved flag set.
It should be set on pfns that point to memory holes. As far as I
understand huge page tricks they are there to guaranty that THP does not
change flags under us, Andrea?
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html