On Thu, Nov 11, 2010 at 12:48:53AM +0900, Masao Uebayashi wrote:
> On Mon, Nov 08, 2010 at 08:53:12AM -0800, Matt Thomas wrote:
> >
> > On Nov 8, 2010, at 8:07 AM, Masao Uebayashi wrote:
> >
> > > On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote:
> > >> On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:
> > >>>
> > >>> I don't like "it's MD, period" attitude. That solves nothing.
> > >>
> > >> We've had pmaps which have tried to pretend they were pmaps for some
> > >> other architecture (that is, that some parts of the pmap weren't
> > >> best left MD). For example, we used to have a lot of pmaps in our
> > >> tree that sort of treated the whole world like a 68K MMU.
> > >>
> > >> Performance has not been so great. And besides, what -are- you going
> > >> to do, in an MI way, about synchronization against hardware lookup?
> > >
> > > Do you mean synchronization among processors?
> >
> > No. For instance, on PPC OEA processors the CPU will write back to
> > the reverse page table entries to update the REF/MOD bits. This
> > requires the pmap to use the PPC equivalent of LL/SC to update PTEs.
> >
> > For normal page tables with hardware lookup like ARM the MMU will
> > read the L1 page table to find the address of the L2 page tables
> > and then read the actual PTE. All of this happens without any sort
> > of locking so updates need to be done in a lockless manner to have
> > a coherent view of the page tables.
> >
> > On a TLB-based MMU, the TLB miss handler will run without locking
> > which requires an always coherent page lookup (typically page table)
> > where entries (either PTEs or page table pointers) are updated
> > using lockless primitives (CAS). This is even more critical as we
> > deal with more MP platforms where lookups on one CPU may be happening
> > in parallel with updates on another.
>
> So, in either design, we have to carefully update page tables by
> atomic operations.
>
> But even with it done so, the whole fault resolution can be done
                                                       ^^^ cannot
> in one shot in slow paths - like paging (I/O) or COW. There are
> consistencies between VAs sharing one PA, or CPUs sharing one VA.
> And we resolve these dirty works one by one. My concern is more
> about the order of those operations.
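
To make the "lockless primitives (CAS)" point from the quoted text concrete, here is a minimal sketch of a compare-and-swap loop that sets REF/MOD bits in a page table entry without taking a lock. It uses C11 <stdatomic.h> rather than any kernel API, and the PTE layout, the PTE_* bit values and the pte_set_bits() name are all hypothetical, made up for illustration; no real pmap does it exactly this way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit PTE; real layouts are MD and differ per MMU. */
typedef _Atomic uint32_t pte_t;

#define PTE_VALID	0x1u	/* assumed bit assignments, illustration only */
#define PTE_REF		0x2u
#define PTE_MOD		0x4u

/*
 * Atomically OR `bits` into *pte, but only while the entry is valid.
 * Returns false if the entry was, or concurrently became, invalid.
 */
static bool
pte_set_bits(pte_t *pte, uint32_t bits)
{
	uint32_t old, new;

	assert(pte != NULL);
	old = atomic_load(pte);
	do {
		if ((old & PTE_VALID) == 0)
			return false;	/* lost a race with an unmap */
		new = old | bits;
		/* On failure, `old` is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(pte, &old, new));
	return true;
}
```

A caller would do something like pte_set_bits(&pte, PTE_REF | PTE_MOD) and fall back to the slow path when it returns false. Note that this only keeps a single entry coherent; it says nothing about the ordering of the multi-step work (paging I/O, COW) that the paragraph above is concerned with.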