Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-10 Thread Masao Uebayashi
On Mon, Nov 08, 2010 at 08:53:12AM -0800, Matt Thomas wrote:
 
 On Nov 8, 2010, at 8:07 AM, Masao Uebayashi wrote:
 
  On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote:
  On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:
  
  I don't like the "it's MD, period" attitude.  That solves nothing.
  
  We've had pmaps which have tried to pretend they were pmaps for some
  other architecture (that is, that some parts of the pmap weren't
  best left MD).  For example, we used to have a lot of pmaps in our
  tree that sort of treated the whole world like a 68K MMU.
  
  Performance has not been so great.  And besides, what -are- you going
  to do, in an MI way, about synchronization against hardware lookup?
  
  Do you mean synchronization among processors?
 
 No.  For instance, on PPC OEA processors the CPU will write back to
 the reverse page table entries to update the REF/MOD bits.  This
 requires the pmap to use the PPC equivalent of LL/SC to update PTEs.
 
 For normal page tables with hardware lookup like ARM the MMU will 
 read the L1 page table to find the address of the L2 page tables 
 and then read the actual PTE.  All of this happens without any sort
 of locking so updates need to be done in a lockless manner to have
 a coherent view of the page tables.
 
 On a TLB-based MMU, the TLB miss handler will run without locking,
 which requires an always-coherent page lookup (typically a page table)
 where entries (either PTEs or page table pointers) are updated
 using lockless primitives (CAS).  This is even more critical as we
 deal with more MP platforms, where lookups on one CPU may be happening
 in parallel with updates on another.

So, in either design, we have to carefully update page tables by
atomic operations.
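
To make the point concrete, here is a minimal sketch of such a lockless
PTE update, written with portable C11 atomics.  The PTE layout, the bit
names and pte_write_protect() are made up for illustration; they are not
taken from any real pmap.  The loop retries when the hardware walker or
another CPU changes the entry in between, so a concurrently set REF/MOD
bit is not lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit PTE layout, for illustration only. */
typedef _Atomic uint32_t pte_t;
#define PTE_VALID 0x1u
#define PTE_REF   0x2u
#define PTE_MOD   0x4u
#define PTE_WRITE 0x8u

/*
 * Remove write permission from a PTE without taking a lock.
 * Returns false if the entry is (or becomes) invalid.
 */
bool
pte_write_protect(pte_t *ptep)
{
	assert(ptep != NULL);

	uint32_t old = atomic_load(ptep);

	do {
		if ((old & PTE_VALID) == 0)
			return false;	/* mapping went away; nothing to do */
		/* On CAS failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(ptep, &old, old & ~PTE_WRITE));

	return true;
}
```

A real pmap would also have to invalidate the TLB entry afterwards; that
part is MD and is left out here.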

But even with that done, the whole fault resolution can be done
in one shot in slow paths, like paging (I/O) or COW.  There is
consistency to maintain between VAs sharing one PA, or CPUs sharing
one VA.  And we resolve this dirty work one piece at a time.  My
concern is more about the order of those operations.

I think what's going wrong in fault handling is that UVM doesn't give
enough information to pmap during fault handling, and it calls
pmap_enter() with only a few clues.  Thus pmap has lots of problems
to solve at once.

I guess if UVM told pmap the right information at the right time, and
solved one thing at a time, pmap_enter() would become a pretty simple
operation: place the new PTE.  All the needed information is in
MI UVM structures.  Why not use them?

 
 This doesn't mean that the pmap can't be made more MI (for instance
 I have the mips and ppc85xx pmaps sharing a lot of code but still
 have MD bits to handle the various machine dependent bits).  But
 going completely MI is just not possible.
 


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Masao Uebayashi
On Fri, Nov 05, 2010 at 04:54:33PM +, Eduardo Horvath wrote:
 On Fri, 5 Nov 2010, Masao Uebayashi wrote:
 
  On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote:
   On Mon, 1 Nov 2010, Masao Uebayashi wrote:
   
I think pmap_extract(9) is a bad API.

After MD bootstrap code detects all physical memory, it gives
all the information to UVM, including available KVA.  At this
point UVM knows all the available resources of virtual/physical
addresses.  UVM is responsible for managing all of these.
   
   This is managed RAM.  What about I/O pages?
  
  To access MMIO device pages, you need a physical address.  Physical
  address space is a single, linear resource on all platforms.  I wonder
  why we can't manage it in an MI way.
 
 I suppose that depends on your definition of linear.  But that's beside 
 the point.
 
 I/O pages have no KVA until a mapping is done.  UVM knows nothing about 
 those mappings since they are managed solely by pmap.  I still don't see 
 how what you're proposing here will work.

UVM knows nothing about those mappings, since it is not told about them.

UVM knows about managed RAM pages, since it is told about them.

 
  
   
Calling pmap_extract(9) means that some kernel code asks pmap(9)
to look up a physical address.  pmap(9) is only responsible for
handling the CPU and MMU.  Using it as a lookup database is an abuse.
The only reasonable use of pmap_extract(9) is for debugging purposes.
I think that pmap_extract(9) should be changed to something like:

bool pmap_mapped_p(struct pmap *, vaddr_t);

and allow it to be used for KASSERT()s.

The only right way to retrieve P-V translation is to look it up from
vm_map (== the fault handler).  If we honour this principle, VM
and I/O code will be much more consistent.
   
   pmap(9) has always needed a database to keep track of V-P mappings(*) as 
   well as P-V mappings so pmap_page_protect() can be implemented.
  
  pmap_extract() accesses page table (per-space).  pmap_page_protect()
  accesses PV (per-page).  I think they're totally different...
 
 The purpose of pmap(9) is to manage MMU hardware.  Page tables are one 
 possible implementation of MMU hardware.  Not all machines have page 
 tables.  Some processors use reverse page tables.  Some just have TLBs.  
 And if you read section 5.13 of
 _The_Design_and_Implementation_of_the_4.4BSD_Operating_System_
 it says that pmap is allowed to forget any mappings that are not wired.  
 So, in theory, all you need to do is keep a linked list of wired mappings 
 to insert in the TLB on fault and forget everything else.  Of course, that 
 doesn't seem to work so well with UVM.

Ancient designs don't help me so far.

 
 Anyway, please keep in mind that not all machines are PCs.  I'd really 
 hate to see a repeat of the Linux VM subsystem which directly manipulated
 x86 page tables even on architectures that don't have page tables let
 alone something compatible with x86.  pmap(9) is an abstraction layer for
 good reason.

Huh?  When did I say I like x86?

I only talked about PV.  IIRC Linux didn't have PV before 2.6.


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Masao Uebayashi
On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote:
 On Fri, 5 Nov 2010, Masao Uebayashi wrote:
 
  On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:
 
   Indeed.  Also consider that pmaps are designed to have
   fast V-P translations, so using that instead of UVM makes a lot of
   sense.
  
  How does locking work?
  
  My understanding is page tables (per-process) are protected by
  struct vm_map (per-process).  (Or moving toward it.)
 
 No, once again this is MD.  For instance sparc64 uses compare and swap 
 instructions to manipulate page tables for lockless synchronization.

I don't like the "it's MD, period" attitude.  That solves nothing.


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Masao Uebayashi
On Fri, Nov 05, 2010 at 10:04:46AM -0700, Matt Thomas wrote:
 
 On Nov 5, 2010, at 4:59 AM, Masao Uebayashi wrote:
 
  On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:
  
  On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote:
  
  On Mon, 1 Nov 2010, Masao Uebayashi wrote:
  
  I think pmap_extract(9) is a bad API.
  
  After MD bootstrap code detects all physical memory, it gives
  all the information to UVM, including available KVA.  At this
  point UVM knows all the available resources of virtual/physical
  addresses.  UVM is responsible for managing all of these.
  
  This is managed RAM.  What about I/O pages?
  
  Indeed.  Also consider that pmaps are designed to have
  fast V-P translations, so using that instead of UVM makes a lot of
  sense.
  
  How does locking work?
  
  My understanding is page tables (per-process) are protected by
  struct vm_map (per-process).  (Or moving toward it.)
 
 Unfortunately, that doesn't completely solve the problem since
 lookups will be done either by exception handlers or hardware 
 bypassing any locks.  This means that the page tables must be
 updated in an MP-safe manner.

I spent some time thinking about this.  I'm pretty sure I have a good
understanding of pmap vs. MP now.

I'll reply after doing a little more research.


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Izumi Tsutsui
  No, once again this is MD.  For instance sparc64 uses compare and swap 
  instructions to manipulate page tables for lockless synchronization.
 
 I don't like the "it's MD, period" attitude.  That solves nothing.

What do you want to solve, as yamt asked you first?

He said pmap_extract() could be used to get the PA from a VA.
You just answered that pmap_extract() was a bad API.
What were you trying to solve?

If the existing API can solve it without bad side effects,
I don't think it's so bad for your purpose, and
its design should be a separate discussion.

---
Izumi Tsutsui


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Eduardo Horvath
On Mon, 8 Nov 2010, Masao Uebayashi wrote:

 On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote:
  On Fri, 5 Nov 2010, Masao Uebayashi wrote:
  
   On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:
  
Indeed.  Also consider that pmaps are designed to have
fast V-P translations, so using that instead of UVM makes a lot of
sense.
   
   How does locking work?
   
   My understanding is page tables (per-process) are protected by
   struct vm_map (per-process).  (Or moving toward it.)
  
  No, once again this is MD.  For instance sparc64 uses compare and swap 
  instructions to manipulate page tables for lockless synchronization.
 
 I don't like the "it's MD, period" attitude.  That solves nothing.

Yes it does.  If you have bleed through between the different abstraction 
layers it makes implementing a pmap for a new processor much more 
difficult and makes the code inefficient since you end up implementing a 
whole bunch of goo just to keep the side effects compatible.  You should not
be making any implicit assumptions beyond what is explicitly documented in
the interface descriptions, otherwise the code becomes unmaintainable
across the dozens of different processors and MMU architectures we're trying
to support.

Eduardo


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Thor Lancelot Simon
On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:
 
 I don't like the "it's MD, period" attitude.  That solves nothing.

We've had pmaps which have tried to pretend they were pmaps for some
other architecture (that is, that some parts of the pmap weren't
best left MD).  For example, we used to have a lot of pmaps in our
tree that sort of treated the whole world like a 68K MMU.

Performance has not been so great.  And besides, what -are- you going
to do, in an MI way, about synchronization against hardware lookup?

-- 
Thor Lancelot Simon                t...@rek.tjls.com

   If the World Wide Web were more than a pale shadow of what Usenet was,
   every single blog entry would be http://preview.tinyurl.com/34zahyx .


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Masao Uebayashi
On Mon, Nov 08, 2010 at 03:22:42PM +, Eduardo Horvath wrote:
 On Mon, 8 Nov 2010, Masao Uebayashi wrote:
 
  On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote:
   On Fri, 5 Nov 2010, Masao Uebayashi wrote:
   
On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:
   
 Indeed.  Also consider that pmaps are designed to have
 fast V-P translations, so using that instead of UVM makes a lot of
 sense.

How does locking work?

My understanding is page tables (per-process) are protected by
struct vm_map (per-process).  (Or moving toward it.)
   
   No, once again this is MD.  For instance sparc64 uses compare and swap 
   instructions to manipulate page tables for lockless synchronization.
  
  I don't like the "it's MD, period" attitude.  That solves nothing.
 
 Yes it does.  If you have bleed through between the different abstraction 
 layers it makes implementing a pmap for a new processor much more 
 difficult and makes the code inefficient since you end up implementing a 
 whole bunch of goo just to keep the side effects compatible.  You should not
 be making any implicit assumptions beyond what is explicitly documented in
 the interface descriptions, otherwise the code becomes unmaintainable
 across the dozens of different processors and MMU architectures we're trying
 to support.

Most pmaps are already almost unmaintainable IMO. ;)


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-08 Thread Masao Uebayashi
On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote:
 On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:
  
  I don't like the "it's MD, period" attitude.  That solves nothing.
 
 We've had pmaps which have tried to pretend they were pmaps for some
 other architecture (that is, that some parts of the pmap weren't
 best left MD).  For example, we used to have a lot of pmaps in our
 tree that sort of treated the whole world like a 68K MMU.
 
 Performance has not been so great.  And besides, what -are- you going
 to do, in an MI way, about synchronization against hardware lookup?

Do you mean synchronization among processors?


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-05 Thread Masao Uebayashi
On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote:
 On Mon, 1 Nov 2010, Masao Uebayashi wrote:
 
  I think pmap_extract(9) is a bad API.
  
  After MD bootstrap code detects all physical memory, it gives
  all the information to UVM, including available KVA.  At this
  point UVM knows all the available resources of virtual/physical
  addresses.  UVM is responsible for managing all of these.
 
 This is managed RAM.  What about I/O pages?

To access MMIO device pages, you need a physical address.  Physical
address space is a single, linear resource on all platforms.  I wonder
why we can't manage it in an MI way.

 
  Calling pmap_extract(9) means that some kernel code asks pmap(9)
  to look up a physical address.  pmap(9) is only responsible for
  handling the CPU and MMU.  Using it as a lookup database is an abuse.
  The only reasonable use of pmap_extract(9) is for debugging purposes.
  I think that pmap_extract(9) should be changed to something like:
  
  bool pmap_mapped_p(struct pmap *, vaddr_t);
  
  and allow it to be used for KASSERT()s.
  
  The only right way to retrieve P-V translation is to look it up from
  vm_map (== the fault handler).  If we honour this principle, VM
  and I/O code will be much more consistent.
 
 pmap(9) has always needed a database to keep track of V-P mappings(*) as 
 well as P-V mappings so pmap_page_protect() can be implemented.

pmap_extract() accesses page table (per-space).  pmap_page_protect()
accesses PV (per-page).  I think they're totally different...

 
 Are you planning on moving the responsibility of tracking P-V mappings to 
 UVM?
 
 * While you can claim that keeping track of P-V mappings is the primary 
 function of pmap(9) and a side effect of page tables, that posits the
 machine in question uses page tables.  In a machine with a software managed
 TLB you could implement pmap(9) by walking the UVM structures on a page
 fault and generating TLB entries from the vm_page structure.  This would
 reduce the amount of duplicate information maintained by the VM subsystems.
 However, UVM currently assumes pmap() remembers all forward and reverse 
 mappings.  If pmap() forgets them, bad things happen.
 
 Eduardo

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-05 Thread Masao Uebayashi
On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:
 
 On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote:
 
  On Mon, 1 Nov 2010, Masao Uebayashi wrote:
  
  I think pmap_extract(9) is a bad API.
  
  After MD bootstrap code detects all physical memory, it gives
  all the information to UVM, including available KVA.  At this
  point UVM knows all the available resources of virtual/physical
  addresses.  UVM is responsible for managing all of these.
  
  This is managed RAM.  What about I/O pages?
 
 Indeed.  Also consider that pmaps are designed to have
 fast V-P translations, so using that instead of UVM makes a lot of
 sense.

How does locking work?

My understanding is page tables (per-process) are protected by
struct vm_map (per-process).  (Or moving toward it.)

Masao

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-05 Thread Eduardo Horvath
On Fri, 5 Nov 2010, Masao Uebayashi wrote:

 On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote:
  On Mon, 1 Nov 2010, Masao Uebayashi wrote:
  
   I think pmap_extract(9) is a bad API.
   
   After MD bootstrap code detects all physical memory, it gives
   all the information to UVM, including available KVA.  At this
   point UVM knows all the available resources of virtual/physical
   addresses.  UVM is responsible for managing all of these.
  
  This is managed RAM.  What about I/O pages?
 
 To access MMIO device pages, you need a physical address.  Physical
 address space is a single, linear resource on all platforms.  I wonder
 why we can't manage it in an MI way.

I suppose that depends on your definition of linear.  But that's beside 
the point.

I/O pages have no KVA until a mapping is done.  UVM knows nothing about 
those mappings since they are managed solely by pmap.  I still don't see 
how what you're proposing here will work.

 
  
   Calling pmap_extract(9) means that some kernel code asks pmap(9)
   to look up a physical address.  pmap(9) is only responsible for
   handling the CPU and MMU.  Using it as a lookup database is an abuse.
   The only reasonable use of pmap_extract(9) is for debugging purposes.
   I think that pmap_extract(9) should be changed to something like:
   
 bool pmap_mapped_p(struct pmap *, vaddr_t);
   
   and allow it to be used for KASSERT()s.
   
   The only right way to retrieve P-V translation is to look it up from
   vm_map (== the fault handler).  If we honour this principle, VM
   and I/O code will be much more consistent.
  
  pmap(9) has always needed a database to keep track of V-P mappings(*) as 
  well as P-V mappings so pmap_page_protect() can be implemented.
 
 pmap_extract() accesses page table (per-space).  pmap_page_protect()
 accesses PV (per-page).  I think they're totally different...

The purpose of pmap(9) is to manage MMU hardware.  Page tables are one 
possible implementation of MMU hardware.  Not all machines have page 
tables.  Some processors use reverse page tables.  Some just have TLBs.  
And if you read section 5.13 of
_The_Design_and_Implementation_of_the_4.4BSD_Operating_System_
it says that pmap is allowed to forget any mappings that are not wired.  
So, in theory, all you need to do is keep a linked list of wired mappings 
to insert in the TLB on fault and forget everything else.  Of course, that 
doesn't seem to work so well with UVM.
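
As a rough illustration of that "wired list only" idea, a toy sketch
follows.  Everything in it (struct wired_mapping, struct toy_tlb_entry,
wired_tlb_refill(), the 4 KB page size) is hypothetical and exists only
for this example; it is not how any in-tree pmap is written.  On a TLB
miss the handler searches the wired list, and anything it does not find
would have to be re-entered by the MI VM layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uint64_t paddr_t;

#define TOY_PAGE_MASK ((vaddr_t)0xfff)	/* assumes 4 KB pages */

/* Hypothetical record of one wired mapping; not a NetBSD structure. */
struct wired_mapping {
	vaddr_t wm_va;			/* page-aligned virtual address */
	paddr_t wm_pa;			/* page-aligned physical address */
	struct wired_mapping *wm_next;
};

/* Hypothetical TLB entry that a miss handler would load. */
struct toy_tlb_entry {
	vaddr_t te_va;
	paddr_t te_pa;
};

/*
 * Search the wired list for the faulting page.  Returns true and fills
 * *te if the page is wired; returns false if the mapping was "forgotten".
 */
bool
wired_tlb_refill(const struct wired_mapping *head, vaddr_t fault_va,
    struct toy_tlb_entry *te)
{
	assert(te != NULL);

	vaddr_t page = fault_va & ~TOY_PAGE_MASK;

	for (const struct wired_mapping *wm = head; wm != NULL;
	    wm = wm->wm_next) {
		if (wm->wm_va == page) {
			te->te_va = wm->wm_va;
			te->te_pa = wm->wm_pa;
			return true;
		}
	}
	return false;
}
```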

Anyway, please keep in mind that not all machines are PCs.  I'd really 
hate to see a repeat of the Linux VM subsystem which directly manipulated
x86 page tables even on architectures that don't have page tables let
alone something compatible with x86.  pmap(9) is an abstraction layer for
good reason.

Eduardo


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-05 Thread Eduardo Horvath
On Fri, 5 Nov 2010, Masao Uebayashi wrote:

 On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:

  Indeed.  Also consider that pmaps are designed to have
  fast V-P translations, so using that instead of UVM makes a lot of
  sense.
 
 How does locking work?
 
 My understanding is page tables (per-process) are protected by
 struct vm_map (per-process).  (Or moving toward it.)

No, once again this is MD.  For instance sparc64 uses compare and swap 
instructions to manipulate page tables for lockless synchronization.

Eduardo


pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-01 Thread Masao Uebayashi
I think pmap_extract(9) is a bad API.

After MD bootstrap code detects all physical memory, it gives
all the information to UVM, including available KVA.  At this
point UVM knows all the available resources of virtual/physical
addresses.  UVM is responsible for managing all of these.

Calling pmap_extract(9) means that some kernel code asks pmap(9)
to look up a physical address.  pmap(9) is only responsible for
handling the CPU and MMU.  Using it as a lookup database is an abuse.
The only reasonable use of pmap_extract(9) is for debugging purposes.
I think that pmap_extract(9) should be changed to something like:

bool pmap_mapped_p(struct pmap *, vaddr_t);

and allow it to be used for KASSERT()s.
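
A minimal sketch of what I mean, runnable in userland.  The struct pmap
here is a toy flat table and this pmap_extract() is a stand-in with the
same general shape as the real one; both are assumptions made only so
the example is self-contained, as is toy_touch_mapped_page().  The point
is that callers get a yes/no answer for assertions and never see the
physical address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uint64_t paddr_t;

#define TOY_PAGE_SHIFT 12
#define TOY_NPAGES 16

/* Toy stand-in for an MD pmap: one slot per virtual page number. */
struct pmap {
	paddr_t pm_pa[TOY_NPAGES];
	bool pm_valid[TOY_NPAGES];
};

/* Stand-in lookup: returns true and stores the PA if va is mapped. */
bool
pmap_extract(struct pmap *pm, vaddr_t va, paddr_t *pap)
{
	size_t vpn = va >> TOY_PAGE_SHIFT;

	if (vpn >= TOY_NPAGES || !pm->pm_valid[vpn])
		return false;
	if (pap != NULL)
		*pap = pm->pm_pa[vpn] | (va & ((1u << TOY_PAGE_SHIFT) - 1));
	return true;
}

/* The proposed predicate: answers "is it mapped?" and nothing more. */
bool
pmap_mapped_p(struct pmap *pm, vaddr_t va)
{
	return pmap_extract(pm, va, NULL);
}

/* Hypothetical caller: the predicate is used only as a sanity check. */
void
toy_touch_mapped_page(struct pmap *pm, vaddr_t va)
{
	assert(pmap_mapped_p(pm, va));	/* KASSERT() in the kernel */
	/* ... access the page through va ... */
}
```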

The only right way to retrieve P-V translation is to look it up from
vm_map (== the fault handler).  If we honour this principle, VM
and I/O code will be much more consistent.

Masao

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-01 Thread der Mouse
 The only right way to retrieve P-V translation is to look it up from
 vm_map (== the fault handler).

What about setting up DMA on machines whose DMA uses physical
addresses?  Or does the DMA code get an exception to this rule?

I also suspect debugging may well be a non-ignorable use case, though I
could also be wrong about that.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML        mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-01 Thread Eduardo Horvath
On Mon, 1 Nov 2010, Masao Uebayashi wrote:

 I think pmap_extract(9) is a bad API.
 
 After MD bootstrap code detects all physical memory, it gives
 all the information to UVM, including available KVA.  At this
 point UVM knows all the available resources of virtual/physical
 addresses.  UVM is responsible for managing all of these.

This is managed RAM.  What about I/O pages?

 Calling pmap_extract(9) means that some kernel code asks pmap(9)
 to look up a physical address.  pmap(9) is only responsible for
 handling the CPU and MMU.  Using it as a lookup database is an abuse.
 The only reasonable use of pmap_extract(9) is for debugging purposes.
 I think that pmap_extract(9) should be changed to something like:
 
   bool pmap_mapped_p(struct pmap *, vaddr_t);
 
 and allow it to be used for KASSERT()s.
 
 The only right way to retrieve P-V translation is to look it up from
 vm_map (== the fault handler).  If we honour this principle, VM
 and I/O code will be much more consistent.

pmap(9) has always needed a database to keep track of V-P mappings(*) as 
well as P-V mappings so pmap_page_protect() can be implemented.

Are you planning on moving the responsibility of tracking P-V mappings to 
UVM?

* While you can claim that keeping track of P-V mappings is the primary 
function of pmap(9) and a side effect of page tables, that posits the
machine in question uses page tables.  In a machine with a software managed
TLB you could implement pmap(9) by walking the UVM structures on a page
fault and generating TLB entries from the vm_page structure.  This would
reduce the amount of duplicate information maintained by the VM subsystems.
However, UVM currently assumes pmap() remembers all forward and reverse 
mappings.  If pmap() forgets them, bad things happen.

Eduardo


Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))

2010-11-01 Thread Matt Thomas

On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote:

 On Mon, 1 Nov 2010, Masao Uebayashi wrote:
 
 I think pmap_extract(9) is a bad API.
 
 After MD bootstrap code detects all physical memory, it gives
 all the information to UVM, including available KVA.  At this
 point UVM knows all the available resources of virtual/physical
 addresses.  UVM is responsible for managing all of these.
 
 This is managed RAM.  What about I/O pages?

Indeed.  Also consider that pmaps are designed to have
fast V-P translations, so using that instead of UVM makes a lot of
sense.