Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 08, 2010 at 08:53:12AM -0800, Matt Thomas wrote: On Nov 8, 2010, at 8:07 AM, Masao Uebayashi wrote: On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote: On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:

I don't like the "it's MD, period" attitude. That solves nothing.

We've had pmaps which have tried to pretend they were pmaps for some other architecture (that is, that some parts of the pmap weren't best left MD). For example, we used to have a lot of pmaps in our tree that sort of treated the whole world like a 68K MMU. Performance has not been so great. And besides, what -are- you going to do, in an MI way, about synchronization against hardware lookup?

Do you mean synchronization among processors?

No. For instance, on PPC OEA processors the CPU will write back to the reverse page table entries to update the REF/MOD bits. This requires the pmap to use the PPC equivalent of LL/SC to update PTEs. For normal page tables with hardware lookup, like ARM, the MMU will read the L1 page table to find the address of the L2 page tables and then read the actual PTE. All of this happens without any sort of locking, so updates need to be done in a lockless manner to have a coherent view of the page tables. On a TLB-based MMU, the TLB miss handler will run without locking, which requires an always-coherent page lookup (typically a page table) where entries (either PTEs or page table pointers) are updated using lockless primitives (CAS). This is even more critical as we deal with more MP platforms, where lookups on one CPU may be happening in parallel with updates on another.

So, in either design, we have to carefully update page tables by atomic operations. But even with that done, the whole fault resolution can be done in one shot in slow paths - like paging (I/O) or COW. There are consistency requirements between VAs sharing one PA, or CPUs sharing one VA. And we resolve these dirty jobs one by one.
My concern is more about the order of those operations. I think what's going wrong in fault handling is that UVM doesn't pass enough information to pmap during fault handling; it calls pmap_enter() with only a few clues. Thus pmap has lots of problems to solve at once. I guess if UVM gave pmap the right information at the right time, solving one thing at a time, pmap_enter() would become a pretty simple operation - place the new PTE. All the needed information is in MI UVM structures. Why not use them?

This doesn't mean that the pmap can't be made more MI (for instance I have the mips and ppc85xx pmaps sharing a lot of code but still have MD bits to handle the various machine-dependent bits). But going completely MI is just not possible.
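The lockless PTE update described in this exchange - updating an entry the hardware may be writing REF/MOD bits into concurrently - can be sketched with a compare-and-swap loop. This is a minimal illustration using C11 atomics, with hypothetical types and names; real pmaps use MD atomic primitives and PTE layouts.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t pt_entry_t;	/* hypothetical PTE type */

/*
 * Sketch: clear/set bits in a PTE without losing REF/MOD bits that the
 * MMU may set concurrently.  The loop retries until the compare-and-swap
 * succeeds, i.e. until no hardware update raced with ours.
 */
static pt_entry_t
pte_update(_Atomic pt_entry_t *ptep, pt_entry_t clearbits, pt_entry_t setbits)
{
	pt_entry_t old, new;

	do {
		old = atomic_load(ptep);
		new = (old & ~clearbits) | setbits;
	} while (!atomic_compare_exchange_weak(ptep, &old, new));

	return old;	/* caller can inspect the REF/MOD bits it observed */
}
```

The same shape covers both cases Matt describes: hardware page-table walkers see either the old or the new entry, never a half-written one, and the software update never discards a hardware-set bit.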
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Fri, Nov 05, 2010 at 04:54:33PM +, Eduardo Horvath wrote: On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

To access MMIO device pages, you need a physical address. Physical address space is a single, linear resource on all platforms. I wonder why we can't manage it in an MI way.

I suppose that depends on your definition of linear. But that's beside the point. I/O pages have no KVA until a mapping is done. UVM knows nothing about those mappings since they are managed solely by pmap. I still don't see how what you're proposing here will work.

UVM knows nothing about those mappings, since they are not taught. UVM knows managed RAM pages, since they are taught.

Calling pmap_extract(9) means that some kernel code asks pmap(9) to look up a physical address. pmap(9) is only responsible for handling the CPU and MMU. Using it as a lookup database is an abuse. The only reasonable use of pmap_extract(9) is for debugging purposes. I think that pmap_extract(9) should be changed to something like: bool pmap_mapped_p(struct pmap *, vaddr_t); and allow it to be used for KASSERT()s. The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler). If we honour this principle, VM and I/O code will be much more consistent.

pmap(9) has always needed a database to keep track of V-P mappings(*) as well as P-V mappings so pmap_page_protect() can be implemented.

pmap_extract() accesses the page table (per-space). pmap_page_protect() accesses PV (per-page). I think they're totally different... The purpose of pmap(9) is to manage MMU hardware.
Page tables are one possible implementation of MMU hardware. Not all machines have page tables. Some processors use reverse page tables. Some just have TLBs. And if you read section 5.13 of _The_Design_and_Implementation_of_the_4.4BSD_Operating_System_ it says that pmap is allowed to forget any mappings that are not wired. So, in theory, all you need to do is keep a linked list of wired mappings to insert in the TLB on fault and forget everything else. Of course, that doesn't seem to work so well with UVM.

Ancient designs don't help me so far.

Anyway, please keep in mind that not all machines are PCs. I'd really hate to see a repeat of the Linux VM subsystem which directly manipulated x86 page tables even on architectures that don't have page tables, let alone something compatible with x86. pmap(9) is an abstraction layer for good reason.

Huh? When did I say I like x86? I said only PV. IIRC Linux didn't have PV before 2.6.
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote: On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

No, once again this is MD. For instance sparc64 uses compare and swap instructions to manipulate page tables for lockless synchronization.

I don't like the "it's MD, period" attitude. That solves nothing.
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Fri, Nov 05, 2010 at 10:04:46AM -0700, Matt Thomas wrote: On Nov 5, 2010, at 4:59 AM, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote: On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

Unfortunately, that doesn't completely solve the problem since lookups will be done either by exception handlers or hardware, bypassing any locks. This means that the page tables must be updated in an MP-safe manner.

I spent some time thinking about this. I'm pretty sure I have a good understanding of pmap vs. MP now. I'll reply after doing a little more research.
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
No, once again this is MD. For instance sparc64 uses compare and swap instructions to manipulate page tables for lockless synchronization.

I don't like the "it's MD, period" attitude. That solves nothing.

What do you want to solve, as yamt asked you first? He said pmap_extract() could be used to get a PA from a VA. You just answered that pmap_extract() was a bad API. What were you trying to solve? If the existing API can solve it without bad side effects, I don't think it's so bad for your purpose, and its design should be another discussion. --- Izumi Tsutsui
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, 8 Nov 2010, Masao Uebayashi wrote: On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote: On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

No, once again this is MD. For instance sparc64 uses compare and swap instructions to manipulate page tables for lockless synchronization.

I don't like the "it's MD, period" attitude. That solves nothing.

Yes it does. If you have bleed-through between the different abstraction layers it makes implementing a pmap for a new processor much more difficult and makes the code inefficient, since you end up implementing a whole bunch of goo just to keep the side effects compatible. You should not be making any implicit assumptions beyond what is explicitly documented in the interface descriptions, otherwise the code becomes unmaintainable across the dozens of different processors and MMU architectures we're trying to support. Eduardo
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:

I don't like the "it's MD, period" attitude. That solves nothing.

We've had pmaps which have tried to pretend they were pmaps for some other architecture (that is, that some parts of the pmap weren't best left MD). For example, we used to have a lot of pmaps in our tree that sort of treated the whole world like a 68K MMU. Performance has not been so great. And besides, what -are- you going to do, in an MI way, about synchronization against hardware lookup?

-- Thor Lancelot Simon t...@rek.tjls.com If the World Wide Web were more than a pale shadow of what Usenet was, every single blog entry would be http://preview.tinyurl.com/34zahyx .
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 08, 2010 at 03:22:42PM +, Eduardo Horvath wrote: On Mon, 8 Nov 2010, Masao Uebayashi wrote: On Fri, Nov 05, 2010 at 05:36:53PM +, Eduardo Horvath wrote: On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

No, once again this is MD. For instance sparc64 uses compare and swap instructions to manipulate page tables for lockless synchronization.

I don't like the "it's MD, period" attitude. That solves nothing.

Yes it does. If you have bleed-through between the different abstraction layers it makes implementing a pmap for a new processor much more difficult and makes the code inefficient, since you end up implementing a whole bunch of goo just to keep the side effects compatible. You should not be making any implicit assumptions beyond what is explicitly documented in the interface descriptions, otherwise the code becomes unmaintainable across the dozens of different processors and MMU architectures we're trying to support.

Most pmaps are already almost unmaintainable IMO. ;)
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote: On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:

I don't like the "it's MD, period" attitude. That solves nothing.

We've had pmaps which have tried to pretend they were pmaps for some other architecture (that is, that some parts of the pmap weren't best left MD). For example, we used to have a lot of pmaps in our tree that sort of treated the whole world like a 68K MMU. Performance has not been so great. And besides, what -are- you going to do, in an MI way, about synchronization against hardware lookup?

Do you mean synchronization among processors?
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

To access MMIO device pages, you need a physical address. Physical address space is a single, linear resource on all platforms. I wonder why we can't manage it in an MI way.

Calling pmap_extract(9) means that some kernel code asks pmap(9) to look up a physical address. pmap(9) is only responsible for handling the CPU and MMU. Using it as a lookup database is an abuse. The only reasonable use of pmap_extract(9) is for debugging purposes. I think that pmap_extract(9) should be changed to something like: bool pmap_mapped_p(struct pmap *, vaddr_t); and allow it to be used for KASSERT()s. The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler). If we honour this principle, VM and I/O code will be much more consistent.

pmap(9) has always needed a database to keep track of V-P mappings(*) as well as P-V mappings so pmap_page_protect() can be implemented.

pmap_extract() accesses the page table (per-space). pmap_page_protect() accesses PV (per-page). I think they're totally different...

Are you planning on moving the responsibility of tracking P-V mappings to UVM?

* While you can claim that keeping track of P-V mappings is the primary function of pmap(9) and a side effect of page tables, that posits the machine in question uses page tables. In a machine with a software-managed TLB you could implement pmap(9) by walking the UVM structures on a page fault and generating TLB entries from the vm_page structure. This would reduce the amount of duplicate information maintained by the VM subsystems.
However, UVM currently assumes pmap() remembers all forward and reverse mappings. If pmap() forgets them, bad things happen. Eduardo -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote: On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:55:01PM +, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

To access MMIO device pages, you need a physical address. Physical address space is a single, linear resource on all platforms. I wonder why we can't manage it in an MI way.

I suppose that depends on your definition of linear. But that's beside the point. I/O pages have no KVA until a mapping is done. UVM knows nothing about those mappings since they are managed solely by pmap. I still don't see how what you're proposing here will work.

Calling pmap_extract(9) means that some kernel code asks pmap(9) to look up a physical address. pmap(9) is only responsible for handling the CPU and MMU. Using it as a lookup database is an abuse. The only reasonable use of pmap_extract(9) is for debugging purposes. I think that pmap_extract(9) should be changed to something like: bool pmap_mapped_p(struct pmap *, vaddr_t); and allow it to be used for KASSERT()s. The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler). If we honour this principle, VM and I/O code will be much more consistent.

pmap(9) has always needed a database to keep track of V-P mappings(*) as well as P-V mappings so pmap_page_protect() can be implemented.

pmap_extract() accesses the page table (per-space). pmap_page_protect() accesses PV (per-page). I think they're totally different... The purpose of pmap(9) is to manage MMU hardware.

Page tables are one possible implementation of MMU hardware. Not all machines have page tables. Some processors use reverse page tables. Some just have TLBs.
And if you read section 5.13 of _The_Design_and_Implementation_of_the_4.4BSD_Operating_System_ it says that pmap is allowed to forget any mappings that are not wired. So, in theory, all you need to do is keep a linked list of wired mappings to insert in the TLB on fault and forget everything else. Of course, that doesn't seem to work so well with UVM.

Anyway, please keep in mind that not all machines are PCs. I'd really hate to see a repeat of the Linux VM subsystem which directly manipulated x86 page tables even on architectures that don't have page tables, let alone something compatible with x86. pmap(9) is an abstraction layer for good reason. Eduardo
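The minimal 4.4BSD-style pmap Eduardo describes - remember only the wired mappings, and feel free to forget everything else - could be sketched like this. The structures and names here are hypothetical illustrations, not NetBSD code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uintptr_t paddr_t;

/*
 * Hypothetical minimal pmap state: the only mappings that must never be
 * forgotten are the wired ones; anything else can be dropped and
 * re-resolved by the fault handler from the MI VM structures.
 */
struct wired_map {
	struct wired_map *next;
	vaddr_t	va;
	paddr_t	pa;
};

struct pmap {
	struct wired_map *wired;	/* list of wired mappings */
};

/*
 * On a TLB miss, consult the wired list first; a miss here means the
 * translation must come from walking the MI VM structures instead.
 */
static bool
pmap_wired_lookup(struct pmap *pm, vaddr_t va, paddr_t *pap)
{
	for (struct wired_map *w = pm->wired; w != NULL; w = w->next) {
		if (w->va == va) {
			*pap = w->pa;
			return true;
		}
	}
	return false;
}
```

As the message notes, this theoretical minimum does not fit UVM as it stands, since UVM assumes pmap remembers all mappings, wired or not.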
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Fri, 5 Nov 2010, Masao Uebayashi wrote: On Mon, Nov 01, 2010 at 03:52:11PM -0700, Matt Thomas wrote:

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.

How does locking work? My understanding is page tables (per-process) are protected by struct vm_map (per-process). (Or moving toward it.)

No, once again this is MD. For instance sparc64 uses compare and swap instructions to manipulate page tables for lockless synchronization. Eduardo
pmap_extract(9) (was Re: xmd(4) (Re: XIP))
I think pmap_extract(9) is a bad API.

After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

Calling pmap_extract(9) means that some kernel code asks pmap(9) to look up a physical address. pmap(9) is only responsible for handling the CPU and MMU. Using it as a lookup database is an abuse. The only reasonable use of pmap_extract(9) is for debugging purposes.

I think that pmap_extract(9) should be changed to something like: bool pmap_mapped_p(struct pmap *, vaddr_t); and allow it to be used for KASSERT()s. The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler). If we honour this principle, VM and I/O code will be much more consistent.

Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
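The proposed predicate could be sketched as a thin wrapper that answers only "is this VA mapped?" without handing the caller a physical address to misuse. This is an illustration only - pmap_mapped_p() does not exist in the tree, and the tiny fixed-table pmap here is a stand-in for a real implementation:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uintptr_t paddr_t;

/* Hypothetical stand-in for a real pmap: a tiny fixed table of mappings. */
struct pmap {
	vaddr_t	va[4];
	paddr_t	pa[4];
	size_t	n;
};

/* Stub of the existing pmap_extract(9)-style lookup, for illustration. */
static bool
pmap_extract(struct pmap *pm, vaddr_t va, paddr_t *pap)
{
	for (size_t i = 0; i < pm->n; i++) {
		if (pm->va[i] == va) {
			*pap = pm->pa[i];
			return true;
		}
	}
	return false;
}

/*
 * The proposed KASSERT-only predicate: reveals whether a VA is mapped,
 * but never the physical address, so callers cannot use pmap as a
 * P-V lookup database.
 */
static bool
pmap_mapped_p(struct pmap *pm, vaddr_t va)
{
	paddr_t pa;	/* discarded: callers get no PA */

	return pmap_extract(pm, va, &pa);
}
```

A caller would then write KASSERT(pmap_mapped_p(pmap_kernel(), va)) instead of extracting and ignoring the physical address.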
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler).

What about setting up DMA on machines whose DMA uses physical addresses? Or does the DMA code get an exception to this rule? I also suspect debugging may well be a non-ignorable use case, though I could also be wrong about that.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML          mo...@rodents-montreal.org
/ \ Email!                7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
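The DMA objection can be made concrete: a virtually contiguous buffer may span several physical pages, and the device needs the physical address of each one, which is exactly a per-page VA-to-PA lookup. The sketch below is hypothetical - the helper and the translation callback are invented for illustration; in NetBSD the translation role is played by pmap_extract(9) inside the bus_dma(9) machinery:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t vaddr_t;
typedef uintptr_t paddr_t;

#define PAGE_SIZE	4096
#define PAGE_MASK	(PAGE_SIZE - 1)

/* Assumed VA->PA translation hook (the role pmap_extract(9) plays). */
typedef bool (*xlate_fn)(vaddr_t, paddr_t *);

/* Example translation: a fictitious linear mapping, PA = VA + 1 MiB. */
static bool
linear_xlate(vaddr_t va, paddr_t *pap)
{
	*pap = va + 0x100000;
	return true;
}

/*
 * Fill segs[] with the physical addresses of the pages backing
 * [va, va+len).  Returns the number of segments produced; stops early
 * on an unmapped page or when segs[] is full.
 */
static size_t
dma_build_segs(vaddr_t va, size_t len, xlate_fn xlate,
    paddr_t *segs, size_t maxsegs)
{
	size_t n = 0;

	for (vaddr_t cur = va & ~(vaddr_t)PAGE_MASK; cur < va + len;
	    cur += PAGE_SIZE) {
		paddr_t pa;

		if (n == maxsegs || !xlate(cur, &pa))
			break;
		segs[n++] = pa;
	}
	return n;
}
```

This is why a pure "look it up from the vm_map" rule is awkward for DMA setup: the lookup happens per page, outside any fault, on a mapping that already exists.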
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

Calling pmap_extract(9) means that some kernel code asks pmap(9) to look up a physical address. pmap(9) is only responsible for handling the CPU and MMU. Using it as a lookup database is an abuse. The only reasonable use of pmap_extract(9) is for debugging purposes. I think that pmap_extract(9) should be changed to something like: bool pmap_mapped_p(struct pmap *, vaddr_t); and allow it to be used for KASSERT()s. The only right way to retrieve a P-V translation is to look it up from the vm_map (== the fault handler). If we honour this principle, VM and I/O code will be much more consistent.

pmap(9) has always needed a database to keep track of V-P mappings(*) as well as P-V mappings so pmap_page_protect() can be implemented. Are you planning on moving the responsibility of tracking P-V mappings to UVM?

* While you can claim that keeping track of P-V mappings is the primary function of pmap(9) and a side effect of page tables, that posits the machine in question uses page tables. In a machine with a software-managed TLB you could implement pmap(9) by walking the UVM structures on a page fault and generating TLB entries from the vm_page structure. This would reduce the amount of duplicate information maintained by the VM subsystems.

However, UVM currently assumes pmap remembers all forward and reverse mappings. If pmap forgets them, bad things happen. Eduardo
Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Nov 1, 2010, at 8:55 AM, Eduardo Horvath wrote: On Mon, 1 Nov 2010, Masao Uebayashi wrote:

I think pmap_extract(9) is a bad API. After MD bootstrap code detects all physical memory, it gives all the information to UVM, including available KVA. At this point UVM knows all the available resources of virtual/physical addresses. UVM is responsible for managing all of these.

This is managed RAM. What about I/O pages?

Indeed. Also consider that pmaps are designed to have fast V-P translations; using that instead of UVM makes a lot of sense.