On Sun, Nov 30, 2008 at 10:07:01PM +0200, Avi Kivity wrote:
> Right.  Allocated from the guest kernel's perspective.  This may be 
> different from the host kernel's perspective.
> 
> Linux will delay touching memory until the last moment, Windows will not 
> (likely it zeros pages on their own nodes, but who knows)?

The problem on Linux is that the first touch is clear_page(), and
that unfortunately happens through the direct mapping before the page
is mapped, so the "detect mapping" trick doesn't quite work (unless
it's a 32-bit highmem page).

OK, one could migrate it on mapping. While the data is still cache
hot that shouldn't be that expensive. Thinking about it again,
it might actually be a reasonable approach.
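
To make the idea concrete, here is a toy model of "place on first
touch, migrate on mapping". It is only a sketch: struct toy_page,
toy_clear_page() and toy_map_page() are made-up names for illustration,
not kernel or KVM interfaces, and the real cost of a migration is
reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_NODE (-1)

/* Hypothetical per-page state a host might track; not a kernel structure. */
struct toy_page {
    int node;             /* node currently backing the page, or TOY_NO_NODE */
    bool mapped;          /* mapped by the guest since the last clear? */
    unsigned migrations;  /* how often the page had to be moved */
};

static inline void toy_page_init(struct toy_page *p)
{
    p->node = TOY_NO_NODE;
    p->mapped = false;
    p->migrations = 0;
}

/*
 * First touch: the page is zeroed through the direct mapping, so the
 * backing memory lands on the node of whichever CPU did the clearing.
 */
static inline void toy_clear_page(struct toy_page *p, int cpu_node)
{
    assert(cpu_node >= 0);
    if (p->node == TOY_NO_NODE)
        p->node = cpu_node;
    p->mapped = false;
}

/*
 * Mapping: if the page sits on a different node than the CPU mapping
 * it, move it there. Returns true when a migration happened.
 */
static inline bool toy_map_page(struct toy_page *p, int cpu_node)
{
    bool moved = false;

    assert(cpu_node >= 0);
    if (p->node == TOY_NO_NODE) {
        p->node = cpu_node;          /* never touched: just place it */
    } else if (p->node != cpu_node) {
        p->node = cpu_node;          /* placed elsewhere: migrate */
        p->migrations++;
        moved = true;
    }
    p->mapped = true;
    return moved;
}
```

In this model a page cleared on node 0 and then mapped from node 1 is
migrated exactly once; mapping it again from node 1 costs nothing.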

> 
> The bigger problem is lifetime.  Inside a guest, 'allocation' happens 
> when a page is used for pagecache, or when a process is created and 
> starts using memory.  From the host perspective, it happens just once.

Yes, that's a problem. I discussed some ways to get around that
earlier. 

> 
> >>It's very different.  The kernel expects an application that touched 
> >>page X on node Y to continue using page X on node Y.  Because 
> >>applications know this, they are written to this assumption.  However, 
> >>    
> >
> >The vast majority of applications do not actually know where memory is. 
> >  
> 
> In our case, the application is the guest kernel, which does know.

It knows but it doesn't really care all that much.  The only thing
that counts is the end performance in this case.

[Some people also use NUMA policy to partition machines, but that's
ok in this case because that only needs the same fixed guest physical
addresses, which is guaranteed of course]

> The difference is, Linux (as a guest) will try to reuse freed pages from 
> an application or pagecache, knowing which node they belong to.
> 
> I agree that if all you do is HPC style computation (boot a kernel and 
> one app with one process per cpu), then the heuristics work well.

Or if there's a way to detect unmapping/remapping.

> >It is certainly not perfect and has holes (like any heuristics),
> >but it has the advantage of being fully dynamic. 
> >  
> 
> It also has the advantage of being already implemented (apart from fake 
> SRAT tables; and that isn't necessary for HPC apps).

What do you mean?

-Andi