Re: CVS commit: src/sys/uvm
y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote: > > Right, uvm_anon_locklaonpg() dance can happen only in O->A case. > > However, having uvm_anfree() able to release the lock by its interface > > definition is potentially defective. It is the main motivation why I > > want to slightly rework the code into uvm_anon_freelst() which would > > always drop the lock and move freeing of anons to the end point. > > Cleaner, less error prone. > > the committed change seems broken in case uvm_anon_dispose sets > PG_RELEASED. in that case, uvm_anon_freelst should leave the anon as it > will be freed by uvm_anon_release later. Right. Fix committed. Do we have a better regression test to trigger PG_RELEASED cases? -- Mindaugas
Re: CVS commit: src/sys/uvm
[ moving to tech-kern ] hi, > y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote: >> > >> > Here is the updated patch after your changes: >> > >> > http://www.netbsd.org/~rmind/uvm_anon_freelst2.diff >> > >> > As you noted, uvm_anfree() can temporarily release the amap lock - that >> > can happen in amap_copy(). Patch closes the race by moving uvm_anfree >> > () further, and changes the semantics of the function, now called >> > uvm_anon_freelst(), to return with amap lock released (plus free anons >> > without lock held). >> >> the temporary release of the amap lock is only for O->A loan >> which you disabled, isn't it? > > Right, uvm_anon_locklaonpg() dance can happen only in O->A case. However, > having uvm_anfree() able to release the lock by its interface definition > is potentially defective. It is the main motivation why I want to slightly > rework the code into uvm_anon_freelst() which would always drop the lock > and move freeing of anons to the end point. Cleaner, less error prone. the committed change seems broken in case uvm_anon_dispose sets PG_RELEASED. in that case, uvm_anon_freelst should leave the anon as it will be freed by uvm_anon_release later. YAMAMOTO Takashi > > -- > Mindaugas
Re: CVS commit: src/sys/uvm
hi, >> >> (Just in case, have you read my paper?) >> >> which paper? i guess no. > > http://uebayasi.dyndns.org/~uebayasi/tmp/xip.pdf i will put it on my todo list. thanks. YAMAMOTO Takashi
Re: CVS commit: src/sys/uvm
On Wed, Jan 05, 2011 at 05:58:34AM +, YAMAMOTO Takashi wrote: > hi, > > > I take silence as "no objection". > > the silence in this case means was-busy-for-other-things-and-forgot. > sorry. > > >> I have no real code for this big picture at this moment. Making > >> vm_physseg available as reference is the first step. This only > >> changes uvm_page_physload() to return a pointer: > >> > >>-void uvm_page_physload(); > >>+void *uvm_page_physload(); > >> > >> But this makes XIP pager MUCH cleaner. The reason has been explained > >> many times. > > because the separate uvm_page_physload_device is no longer necessary, > you mean? i have no problem with the step. > > >> Making fault handlers and pagers to use vm_physseg * + off_t is > >> the next step, and I don't intend to work on it now. I just want > >> to explain the big picture. > >> > >> > > >> > >> > >> > >> Keep vm_physseg * + off_t array on stack. If UVM objects uses > >> > >> vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > >> > >> + off_t *once* and cache it on stack. > >> > > >> > do you mean something like this? > >> > struct { > >> > vm_physseg *hoge; > >> > off_t fuga; > >> > } foo [16]; > >> > >> Yes. > >> > >> Or cache vm_page * with it, like: > >> > >>struct vm_id { > >>vm_physseg *seg; > >>off_t off; > >>vm_page *pg; > >>}; > >> > >>uvm_fault() > >>{ > >>vm_id pgs[]; > >>: > >>} > >> > >> Vnode pager (genfs_getpages) takes vm_page's by looking up > >> vnode::v_uobj's list, or uvn_findpages(). > >> > >> When it returns back to fault handler, we have to lookup vm_physseg > >> for each page. Then fill the "seg" slot above (assuming we'll > >> remove vm_page::phys_addr soon). > >> > >> Fault handler calls per-vm_page operations iff vm_page slot is filled. > >> XIP pages are not pageq'ed. > > pgo_get returns either seg+off or pg for each vm_id slots? vm_id[] is passed to pgo_get, each pgo_get (uvn_get, udv_get, ...) will fill seg + off and return the array back to uvm_fault. Pagers that use vm_page (e.g. vnode) will get pg from vnode, then lookup seg+off, fill them into the vm_id slot. Those that don't use vm_page (e.g. cdev) will get seg+off from underlying objects, fill them into the vm_id slot, and return. > > >> XIP pages don't need vm_page, but > >> cached because it's vnode. > > can you explain this sentence? Sorry, it was so unclear. XIP pages don't need vm_page (as a paging state variable), because the pages are read-only and never involve paging. XIP pages have to be mapped as cacheable at MMU level, because it's vnode. I'm allocating the full vm_page for XIP'able pages, only because chs@ wanted to support loaning, which needs vm_page::loan_count. > > >> (Just in case, have you read my paper?) > > which paper? i guess no. http://uebayasi.dyndns.org/~uebayasi/tmp/xip.pdf > > YAMAMOTO Takashi -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Re: CVS commit: src/sys/uvm
hi, > I take silence as "no objection". the silence in this case means was-busy-for-other-things-and-forgot. sorry. >> I have no real code for this big picture at this moment. Making >> vm_physseg available as reference is the first step. This only >> changes uvm_page_physload() to return a pointer: >> >> -void uvm_page_physload(); >> +void *uvm_page_physload(); >> >> But this makes XIP pager MUCH cleaner. The reason has been explained >> many times. because the separate uvm_page_physload_device is no longer necessary, you mean? i have no problem with the step. >> Making fault handlers and pagers to use vm_physseg * + off_t is >> the next step, and I don't intend to work on it now. I just want >> to explain the big picture. >> >> > >> > >> >> > >> Keep vm_physseg * + off_t array on stack. If UVM objects uses >> > >> vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * >> > >> + off_t *once* and cache it on stack. >> > >> > do you mean something like this? >> >struct { >> >vm_physseg *hoge; >> >off_t fuga; >> >} foo [16]; >> >> Yes. >> >> Or cache vm_page * with it, like: >> >> struct vm_id { >> vm_physseg *seg; >> off_t off; >> vm_page *pg; >> }; >> >> uvm_fault() >> { >> vm_id pgs[]; >> : >> } >> >> Vnode pager (genfs_getpages) takes vm_page's by looking up >> vnode::v_uobj's list, or uvn_findpages(). >> >> When it returns back to fault handler, we have to lookup vm_physseg >> for each page. Then fill the "seg" slot above (assuming we'll >> remove vm_page::phys_addr soon). >> >> Fault handler calls per-vm_page operations iff vm_page slot is filled. >> XIP pages are not pageq'ed. pgo_get returns either seg+off or pg for each vm_id slots? >> XIP pages don't need vm_page, but >> cached because it's vnode. can you explain this sentence? >> (Just in case, have you read my paper?) which paper? i guess no. YAMAMOTO Takashi
Re: CVS commit: src/sys/uvm
On Mon, Jan 03, 2011 at 10:28:17PM -0800, Matt Thomas wrote: > > On Jan 3, 2011, at 9:01 PM, Masao Uebayashi wrote: > > > I take silence as "no objection". > > Not so safe. > > > On Thu, Dec 23, 2010 at 12:48:04PM +0900, Masao Uebayashi wrote: > >> On Wed, Dec 22, 2010 at 05:37:57AM +, YAMAMOTO Takashi wrote: > >>> hi, > >>> > Could you ack this discussion? > >>> > >>> sorry for dropping a ball. > >>> > > On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: > > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > >> [ adding cc: tech-kern@ ] > >> > >> hi, > >> > >>> On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: > > On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > > > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: > >> hi, > >> > >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: > hi, > > > Hi, thanks for review. > > > > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi > > wrote: > >> hi, > >> > >> - what's VM_PHYSSEG_OP_PG? > > > > It's to lookup vm_physseg by "struct vm_page *", relying on that > > "struct vm_page *[]" is allocated linearly. It'll be used to > > remove > > vm_page::phys_addr as we talked some time ago. > > i'm not sure if commiting this unused uncommented code now helps > it. > some try-and-benchmark cycles might be necessary given that > vm_page <-> paddr conversion could be performace critical. > >>> > >>> If you really care performance, we can directly pass "struct > >>> vm_page > >>> *" to pmap_enter(). > >>> > >>> We're doing "struct vm_page *" -> "paddr_t" just before > >>> pmap_enter(), > >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in > >>> pmap_enter() to check if a given PA is managed. What is really > >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once > >>> and you'll know both paddr_t and managed or not. > >> > >> i agree that the current code is not ideal in that respect. > >> otoh, i'm not sure if passing vm_physseg around is a good idea. > > > > It's great you share the interest. > > > > I chose vm_physseg, because it was there. I'm open to alternatives, > > but I don't think you have many options... > > Passing vm_page * doesn't work if the page isn't managed since there > won't be a vm_page for the paddr_t. > > Now passing both paddr_t and vm_page * would work and if the pointer > to the vm_page, it would be an unmanaged mapping. This also gives > the > access to mdpg without another lookup. > >>> > >>> What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > >>> > >>> And don't forget that you're the one who first pointed out that > >>> allocating vm_pages for XIP is a pure waste of memory. ;) > >> > >> i guess matt meant "if the pointer to the vm_page is NULL,". > >> > >>> > >>> I'm allocating vm_pages, only because of phys_addr and loan_count. > >>> I believe vm_pages is unnecessary for read-only XIP segments. > >>> Because they're read-only, and stateless. > >>> > >>> I've already concluded that the current "managed or not" model > >>> doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > >>> model can explain everything. I'm rather waiting for a counter > >>> example how vm_physseg doesn't work... > >> > >> i guess your suggestion is too vague. > >> where do you want to use vm_physseg * + off_t instead of vm_page * ? > >> getpages, pmap_enter, and? how their function prototypes would be? > > > > The basic idea is straightforward; always allocate vm_physseg for > > memories/devices. If a vm_physseg is used as general purpose > > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > >>> > >>> can you provide function prototypes? > >> > >> I have no real code for this big picture at this moment. Making > >> vm_physseg available as reference is the first step. This only > >> changes uvm_page_physload() to return a pointer: > >> > >>-void uvm_page_physload(); > >>+void *uvm_page_physload(); > >> > >> But this makes XIP pager MUCH cleaner. The reason has been explained > >> many times. > >> > >> Making fault handlers and pagers to use vm_physseg * + off_t is > >> the next step, and I don't intend to work on it now. I ju
Re: CVS commit: src/sys/uvm
On Jan 3, 2011, at 9:01 PM, Masao Uebayashi wrote: > I take silence as "no objection". Not so safe. > On Thu, Dec 23, 2010 at 12:48:04PM +0900, Masao Uebayashi wrote: >> On Wed, Dec 22, 2010 at 05:37:57AM +, YAMAMOTO Takashi wrote: >>> hi, >>> Could you ack this discussion? >>> >>> sorry for dropping a ball. >>> On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: >> [ adding cc: tech-kern@ ] >> >> hi, >> >>> On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: >> hi, >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: hi, > Hi, thanks for review. > > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi wrote: >> hi, >> >> - what's VM_PHYSSEG_OP_PG? > > It's to lookup vm_physseg by "struct vm_page *", relying on that > "struct vm_page *[]" is allocated linearly. It'll be used to > remove > vm_page::phys_addr as we talked some time ago. i'm not sure if commiting this unused uncommented code now helps it. some try-and-benchmark cycles might be necessary given that vm_page <-> paddr conversion could be performace critical. >>> >>> If you really care performance, we can directly pass "struct vm_page >>> *" to pmap_enter(). >>> >>> We're doing "struct vm_page *" -> "paddr_t" just before >>> pmap_enter(), >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in >>> pmap_enter() to check if a given PA is managed. What is really >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once >>> and you'll know both paddr_t and managed or not. >> >> i agree that the current code is not ideal in that respect. >> otoh, i'm not sure if passing vm_physseg around is a good idea. > > It's great you share the interest. > > I chose vm_physseg, because it was there. I'm open to alternatives, > but I don't think you have many options... Passing vm_page * doesn't work if the page isn't managed since there won't be a vm_page for the paddr_t. Now passing both paddr_t and vm_page * would work and if the pointer to the vm_page, it would be an unmanaged mapping. This also gives the access to mdpg without another lookup. >>> >>> What if XIP'ed md(4), where physical pages are in .data (or .rodata)? >>> >>> And don't forget that you're the one who first pointed out that >>> allocating vm_pages for XIP is a pure waste of memory. ;) >> >> i guess matt meant "if the pointer to the vm_page is NULL,". >> >>> >>> I'm allocating vm_pages, only because of phys_addr and loan_count. >>> I believe vm_pages is unnecessary for read-only XIP segments. >>> Because they're read-only, and stateless. >>> >>> I've already concluded that the current "managed or not" model >>> doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t >>> model can explain everything. I'm rather waiting for a counter >>> example how vm_physseg doesn't work... >> >> i guess your suggestion is too vague. >> where do you want to use vm_physseg * + off_t instead of vm_page * ? >> getpages, pmap_enter, and? how their function prototypes would be? > > The basic idea is straightforward; always allocate vm_physseg for > memories/devices. If a vm_physseg is used as general purpose > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). >>> >>> can you provide function prototypes? >> >> I have no real code for this big picture at this moment. Making >> vm_physseg available as reference is the first step. This only >> changes uvm_page_physload() to return a pointer: >> >> -void uvm_page_physload(); >> +void *uvm_page_physload(); >> >> But this makes XIP pager MUCH cleaner. The reason has been explained >> many times. >> >> Making fault handlers and pagers to use vm_physseg * + off_t is >> the next step, and I don't intend to work on it now. I just want >> to explain the big picture. >> >>> > > Keep vm_physseg * + off_t array on stack. If UVM objects uses > vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > + off_t *once* and cache it on stack. >>> >>> do you mean something like this? >>> struct { >>> vm_physseg *hog
Re: CVS commit: src/sys/uvm
I take silence as "no objection". On Thu, Dec 23, 2010 at 12:48:04PM +0900, Masao Uebayashi wrote: > On Wed, Dec 22, 2010 at 05:37:57AM +, YAMAMOTO Takashi wrote: > > hi, > > > > > Could you ack this discussion? > > > > sorry for dropping a ball. > > > > > > > > On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: > > >> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > > >> > [ adding cc: tech-kern@ ] > > >> > > > >> > hi, > > >> > > > >> > > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: > > >> > >> > > >> > >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > > >> > >> > > >> > >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: > > >> > >> >> hi, > > >> > >> >> > > >> > >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi > > >> > >> >>> wrote: > > >> > >> hi, > > >> > >> > > >> > >> > Hi, thanks for review. > > >> > >> > > > >> > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi > > >> > >> > wrote: > > >> > >> >> hi, > > >> > >> >> > > >> > >> >> - what's VM_PHYSSEG_OP_PG? > > >> > >> > > > >> > >> > It's to lookup vm_physseg by "struct vm_page *", relying on > > >> > >> > that > > >> > >> > "struct vm_page *[]" is allocated linearly. It'll be used to > > >> > >> > remove > > >> > >> > vm_page::phys_addr as we talked some time ago. > > >> > >> > > >> > >> i'm not sure if commiting this unused uncommented code now > > >> > >> helps it. > > >> > >> some try-and-benchmark cycles might be necessary given that > > >> > >> vm_page <-> paddr conversion could be performace critical. > > >> > >> >>> > > >> > >> >>> If you really care performance, we can directly pass "struct > > >> > >> >>> vm_page > > >> > >> >>> *" to pmap_enter(). > > >> > >> >>> > > >> > >> >>> We're doing "struct vm_page *" -> "paddr_t" just before > > >> > >> >>> pmap_enter(), > > >> > >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in > > >> > >> >>> pmap_enter() to check if a given PA is managed. What is really > > >> > >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" > > >> > >> >>> once > > >> > >> >>> and you'll know both paddr_t and managed or not. > > >> > >> >> > > >> > >> >> i agree that the current code is not ideal in that respect. > > >> > >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. > > >> > >> > > > >> > >> > It's great you share the interest. > > >> > >> > > > >> > >> > I chose vm_physseg, because it was there. I'm open to > > >> > >> > alternatives, > > >> > >> > but I don't think you have many options... > > >> > >> > > >> > >> Passing vm_page * doesn't work if the page isn't managed since there > > >> > >> won't be a vm_page for the paddr_t. > > >> > >> > > >> > >> Now passing both paddr_t and vm_page * would work and if the pointer > > >> > >> to the vm_page, it would be an unmanaged mapping. This also gives > > >> > >> the > > >> > >> access to mdpg without another lookup. > > >> > > > > >> > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > > >> > > > > >> > > And don't forget that you're the one who first pointed out that > > >> > > allocating vm_pages for XIP is a pure waste of memory. ;) > > >> > > > >> > i guess matt meant "if the pointer to the vm_page is NULL,". > > >> > > > >> > > > > >> > > I'm allocating vm_pages, only because of phys_addr and loan_count. > > >> > > I believe vm_pages is unnecessary for read-only XIP segments. > > >> > > Because they're read-only, and stateless. > > >> > > > > >> > > I've already concluded that the current "managed or not" model > > >> > > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > > >> > > model can explain everything. I'm rather waiting for a counter > > >> > > example how vm_physseg doesn't work... > > >> > > > >> > i guess your suggestion is too vague. > > >> > where do you want to use vm_physseg * + off_t instead of vm_page * ? > > >> > getpages, pmap_enter, and? how their function prototypes would be? > > >> > > >> The basic idea is straightforward; always allocate vm_physseg for > > >> memories/devices. If a vm_physseg is used as general purpose > > >> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > > >> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > > > can you provide function prototypes? > > I have no real code for this big picture at this moment. Making > vm_physseg available as reference is the first step. This only > changes uvm_page_physload() to return a pointer: > > -void uvm_page_physload(); > +void *uvm_page_physload(); > > But this makes XIP pager MUCH cleaner. The reason has been explained > many times. > > Making fault handlers and pagers to use vm_physseg * + off_t is > the next step, and I don't intend to work on it now. I just want > to explain the big
Re: CVS commit: src/sys/uvm
Do you have any questions yet? On Wed, Dec 22, 2010 at 12:52:21PM +0900, Masao Uebayashi wrote: > On Tue, Dec 21, 2010 at 06:33:53PM -0800, Matt Thomas wrote: > > > > On Dec 21, 2010, at 6:13 PM, Masao Uebayashi wrote: > > > > > On Tue, Dec 21, 2010 at 11:29:01AM -0800, Matt Thomas wrote: > > >> > > >> On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: > > >> > > >>> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > > [ adding cc: tech-kern@ ] > > >>> > > >>> The basic idea is straightforward; always allocate vm_physseg for > > >>> memories/devices. If a vm_physseg is used as general purpose > > >>> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > > >>> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > >> > > >> Ewww. How p->v is managed needs to be kept out of the MI code. > > > > > > Could you elaborate the reason why so? > > > > PowerPC OEA uses an inverted page table. It's p->v works very differently. > > True. > > > > > > I've already proven that __HAVE_VM_PAGE_MD pmaps don't need struct > > > vm_page *. > > > > Well, pmap_page_* definitely need vm_page * since it's one of the arguments. > > Yes, and vm_page * is used as page identity, right? > > What I call "vm_page is used" means for example read/write'ing > pg->flags. In that sense, vm_page is not really used in any pmaps. > > > > > any valid paddr_t value will belong to exactly one vm_phsseg? > > >>> > > >>> That's the idea. This would clarify mem(4) backend too. > > >>> > > >>> Note that allocating vm_physseg for device segments is cheap. > > >> > > >> that's depends on how much more expensive finding physseg gets. > > > > > > "Finding physseg" == "(reverse) lookup of vm_page -> vm_physseg". > > > It is done only once (for each page) in pagers that use vm_page. > > > > > > Is the biggest concern lookup cost? Then I'd point out that > > > uvm_pageismanaged() in pmap_enter() should die. Cacheability is > > > decided by how VA is mapped. Those uvm_pageismanaged() calls are > > > both inefficient and wrong. > > > > Not all pmaps use that. Instead, we should have a PMAP_UNMANAGED > > flag passed to pmap_enter since the caller probably know the > > answer. > > s/probably know/know/ > > IMO cacheability is always explicitly specified by UVM. > > Cacheability is decided by how VA is mapped, i.e., shared or not. > Vnode are shared but read-only, so it's cacheable. Device registers > are shared between CPU and H/W, so it can't be cacheable. -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Re: CVS commit: src/sys/uvm
On Wed, Dec 22, 2010 at 05:37:57AM +, YAMAMOTO Takashi wrote: > hi, > > > Could you ack this discussion? > > sorry for dropping a ball. > > > > > On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: > >> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > >> > [ adding cc: tech-kern@ ] > >> > > >> > hi, > >> > > >> > > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: > >> > >> > >> > >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > >> > >> > >> > >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: > >> > >> >> hi, > >> > >> >> > >> > >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: > >> > >> hi, > >> > >> > >> > >> > Hi, thanks for review. > >> > >> > > >> > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi > >> > >> > wrote: > >> > >> >> hi, > >> > >> >> > >> > >> >> - what's VM_PHYSSEG_OP_PG? > >> > >> > > >> > >> > It's to lookup vm_physseg by "struct vm_page *", relying on that > >> > >> > "struct vm_page *[]" is allocated linearly. It'll be used to > >> > >> > remove > >> > >> > vm_page::phys_addr as we talked some time ago. > >> > >> > >> > >> i'm not sure if commiting this unused uncommented code now helps > >> > >> it. > >> > >> some try-and-benchmark cycles might be necessary given that > >> > >> vm_page <-> paddr conversion could be performace critical. > >> > >> >>> > >> > >> >>> If you really care performance, we can directly pass "struct > >> > >> >>> vm_page > >> > >> >>> *" to pmap_enter(). > >> > >> >>> > >> > >> >>> We're doing "struct vm_page *" -> "paddr_t" just before > >> > >> >>> pmap_enter(), > >> > >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in > >> > >> >>> pmap_enter() to check if a given PA is managed. What is really > >> > >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once > >> > >> >>> and you'll know both paddr_t and managed or not. > >> > >> >> > >> > >> >> i agree that the current code is not ideal in that respect. > >> > >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. > >> > >> > > >> > >> > It's great you share the interest. > >> > >> > > >> > >> > I chose vm_physseg, because it was there. I'm open to alternatives, > >> > >> > but I don't think you have many options... > >> > >> > >> > >> Passing vm_page * doesn't work if the page isn't managed since there > >> > >> won't be a vm_page for the paddr_t. > >> > >> > >> > >> Now passing both paddr_t and vm_page * would work and if the pointer > >> > >> to the vm_page, it would be an unmanaged mapping. This also gives the > >> > >> access to mdpg without another lookup. > >> > > > >> > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > >> > > > >> > > And don't forget that you're the one who first pointed out that > >> > > allocating vm_pages for XIP is a pure waste of memory. ;) > >> > > >> > i guess matt meant "if the pointer to the vm_page is NULL,". > >> > > >> > > > >> > > I'm allocating vm_pages, only because of phys_addr and loan_count. > >> > > I believe vm_pages is unnecessary for read-only XIP segments. > >> > > Because they're read-only, and stateless. > >> > > > >> > > I've already concluded that the current "managed or not" model > >> > > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > >> > > model can explain everything. I'm rather waiting for a counter > >> > > example how vm_physseg doesn't work... > >> > > >> > i guess your suggestion is too vague. > >> > where do you want to use vm_physseg * + off_t instead of vm_page * ? > >> > getpages, pmap_enter, and? how their function prototypes would be? > >> > >> The basic idea is straightforward; always allocate vm_physseg for > >> memories/devices. If a vm_physseg is used as general purpose > >> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > >> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > can you provide function prototypes? I have no real code for this big picture at this moment. Making vm_physseg available as reference is the first step. This only changes uvm_page_physload() to return a pointer: -void uvm_page_physload(); +void *uvm_page_physload(); But this makes XIP pager MUCH cleaner. The reason has been explained many times. Making fault handlers and pagers to use vm_physseg * + off_t is the next step, and I don't intend to work on it now. I just want to explain the big picture. > > >> > >> Keep vm_physseg * + off_t array on stack. If UVM objects uses > >> vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > >> + off_t *once* and cache it on stack. > > do you mean something like this? > struct { > vm_physseg *hoge; > off_t fuga; > } foo [16]; Yes. Or cache vm_page * with it, like: struct v
Re: CVS commit: src/sys/uvm
hi, > Could you ack this discussion? sorry for dropping a ball. > > On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: >> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: >> > [ adding cc: tech-kern@ ] >> > >> > hi, >> > >> > > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: >> > >> >> > >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: >> > >> >> > >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: >> > >> >> hi, >> > >> >> >> > >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: >> > >> hi, >> > >> >> > >> > Hi, thanks for review. >> > >> > >> > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi wrote: >> > >> >> hi, >> > >> >> >> > >> >> - what's VM_PHYSSEG_OP_PG? >> > >> > >> > >> > It's to lookup vm_physseg by "struct vm_page *", relying on that >> > >> > "struct vm_page *[]" is allocated linearly. It'll be used to >> > >> > remove >> > >> > vm_page::phys_addr as we talked some time ago. >> > >> >> > >> i'm not sure if commiting this unused uncommented code now helps >> > >> it. >> > >> some try-and-benchmark cycles might be necessary given that >> > >> vm_page <-> paddr conversion could be performace critical. >> > >> >>> >> > >> >>> If you really care performance, we can directly pass "struct vm_page >> > >> >>> *" to pmap_enter(). >> > >> >>> >> > >> >>> We're doing "struct vm_page *" -> "paddr_t" just before >> > >> >>> pmap_enter(), >> > >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in >> > >> >>> pmap_enter() to check if a given PA is managed. What is really >> > >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once >> > >> >>> and you'll know both paddr_t and managed or not. >> > >> >> >> > >> >> i agree that the current code is not ideal in that respect. >> > >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. >> > >> > >> > >> > It's great you share the interest. >> > >> > >> > >> > I chose vm_physseg, because it was there. I'm open to alternatives, >> > >> > but I don't think you have many options... >> > >> >> > >> Passing vm_page * doesn't work if the page isn't managed since there >> > >> won't be a vm_page for the paddr_t. >> > >> >> > >> Now passing both paddr_t and vm_page * would work and if the pointer >> > >> to the vm_page, it would be an unmanaged mapping. This also gives the >> > >> access to mdpg without another lookup. >> > > >> > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? >> > > >> > > And don't forget that you're the one who first pointed out that >> > > allocating vm_pages for XIP is a pure waste of memory. ;) >> > >> > i guess matt meant "if the pointer to the vm_page is NULL,". >> > >> > > >> > > I'm allocating vm_pages, only because of phys_addr and loan_count. >> > > I believe vm_pages is unnecessary for read-only XIP segments. >> > > Because they're read-only, and stateless. >> > > >> > > I've already concluded that the current "managed or not" model >> > > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t >> > > model can explain everything. I'm rather waiting for a counter >> > > example how vm_physseg doesn't work... >> > >> > i guess your suggestion is too vague. >> > where do you want to use vm_physseg * + off_t instead of vm_page * ? >> > getpages, pmap_enter, and? how their function prototypes would be? >> >> The basic idea is straightforward; always allocate vm_physseg for >> memories/devices. If a vm_physseg is used as general purpose >> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's >> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). can you provide function prototypes? >> >> Keep vm_physseg * + off_t array on stack. If UVM objects uses >> vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * >> + off_t *once* and cache it on stack. do you mean something like this? struct { vm_physseg *hoge; off_t fuga; } foo [16]; YAMAMOTO Takashi >> >> > any valid paddr_t value will belong to exactly one vm_phsseg? >> >> That's the idea. This would clarify mem(4) backend too. >> >> Note that allocating vm_physseg for device segments is cheap.
Re: CVS commit: src/sys/uvm
On Wed, Dec 22, 2010 at 11:13:52AM +0900, Masao Uebayashi wrote: > On Tue, Dec 21, 2010 at 11:29:01AM -0800, Matt Thomas wrote: > > > > On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: > > > > > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > > >> [ adding cc: tech-kern@ ] > > > > > > The basic idea is straightforward; always allocate vm_physseg for > > > memories/devices. If a vm_physseg is used as general purpose > > > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > > > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > > > Ewww. How p->v is managed needs to be kept out of the MI code. > > Could you elaborate the reason why so? > > I've already proven that __HAVE_VM_PAGE_MD pmaps don't need struct > vm_page *. > > > There can be a common implementation of it for use by MD code but > > that isn't the same as MI. > > > > > Keep vm_physseg * + off_t array on stack. If UVM objects uses > > > vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > > > + off_t *once* and cache it on stack. > > > > off_t is not the right type. psize_t would be more accurate. > > Probably. Well, I want to make this MI. One of motivations to use vm_physseg is to make page identity independent of machine-specific types (paddr_t / psize_t). So that kernel modules are shared between different sizeof(paddr_t). I see pmap as "MMU driver". paddr_t is an opaque type, which should be used only for H/Ws that touch the physical bus (mainbus).
Re: CVS commit: src/sys/uvm
On Tue, Dec 21, 2010 at 06:33:53PM -0800, Matt Thomas wrote: > > On Dec 21, 2010, at 6:13 PM, Masao Uebayashi wrote: > > > On Tue, Dec 21, 2010 at 11:29:01AM -0800, Matt Thomas wrote: > >> > >> On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: > >> > >>> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > [ adding cc: tech-kern@ ] > >>> > >>> The basic idea is straightforward; always allocate vm_physseg for > >>> memories/devices. If a vm_physseg is used as general purpose > >>> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > >>> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > >> > >> Ewww. How p->v is managed needs to be kept out of the MI code. > > > > Could you elaborate the reason why so? > > PowerPC OEA uses an inverted page table. It's p->v works very differently. True. > > > I've already proven that __HAVE_VM_PAGE_MD pmaps don't need struct > > vm_page *. > > Well, pmap_page_* definitely need vm_page * since it's one of the arguments. Yes, and vm_page * is used as page identity, right? What I call "vm_page is used" means for example read/write'ing pg->flags. In that sense, vm_page is not really used in any pmaps. > > any valid paddr_t value will belong to exactly one vm_phsseg? > >>> > >>> That's the idea. This would clarify mem(4) backend too. > >>> > >>> Note that allocating vm_physseg for device segments is cheap. > >> > >> that's depends on how much more expensive finding physseg gets. > > > > "Finding physseg" == "(reverse) lookup of vm_page -> vm_physseg". > > It is done only once (for each page) in pagers that use vm_page. > > > > Is the biggest concern lookup cost? Then I'd point out that > > uvm_pageismanaged() in pmap_enter() should die. Cacheability is > > decided by how VA is mapped. Those uvm_pageismanaged() calls are > > both inefficient and wrong. > > Not all pmaps use that. Instead, we should have a PMAP_UNMANAGED > flag passed to pmap_enter since the caller probably know the > answer. s/probably know/know/ IMO cacheability is always explicitly specified by UVM. Cacheability is decided by how VA is mapped, i.e., shared or not. Vnode are shared but read-only, so it's cacheable. Device registers are shared between CPU and H/W, so it can't be cacheable.
Re: CVS commit: src/sys/uvm
On Dec 21, 2010, at 6:13 PM, Masao Uebayashi wrote: > On Tue, Dec 21, 2010 at 11:29:01AM -0800, Matt Thomas wrote: >> >> On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: >> >>> On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: [ adding cc: tech-kern@ ] >>> >>> The basic idea is straightforward; always allocate vm_physseg for >>> memories/devices. If a vm_physseg is used as general purpose >>> memory, you allocate vm_page[] (as vm_physseg::pgs). If it's >>> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). >> >> Ewww. How p->v is managed needs to be kept out of the MI code. > > Could you elaborate the reason why so? PowerPC OEA uses an inverted page table. It's p->v works very differently. > I've already proven that __HAVE_VM_PAGE_MD pmaps don't need struct > vm_page *. Well, pmap_page_* definitely need vm_page * since it's one of the arguments. any valid paddr_t value will belong to exactly one vm_phsseg? >>> >>> That's the idea. This would clarify mem(4) backend too. >>> >>> Note that allocating vm_physseg for device segments is cheap. >> >> that's depends on how much more expensive finding physseg gets. > > "Finding physseg" == "(reverse) lookup of vm_page -> vm_physseg". > It is done only once (for each page) in pagers that use vm_page. > > Is the biggest concern lookup cost? Then I'd point out that > uvm_pageismanaged() in pmap_enter() should die. Cacheability is > decided by how VA is mapped. Those uvm_pageismanaged() calls are > both inefficient and wrong. Not all pmaps use that. Instead, we should have a PMAP_UNMANAGED flag passed to pmap_enter since the caller probably know the answer.
Re: CVS commit: src/sys/uvm
On Tue, Dec 21, 2010 at 11:29:01AM -0800, Matt Thomas wrote: > > On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: > > > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > >> [ adding cc: tech-kern@ ] > > > > The basic idea is straightforward; always allocate vm_physseg for > > memories/devices. If a vm_physseg is used as general purpose > > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > Ewww. How p->v is managed needs to be kept out of the MI code. Could you elaborate the reason why so? I've already proven that __HAVE_VM_PAGE_MD pmaps don't need struct vm_page *. > There can be a common implementation of it for use by MD code but > that isn't the same as MI. > > > Keep vm_physseg * + off_t array on stack. If UVM objects uses > > vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > > + off_t *once* and cache it on stack. > > off_t is not the right type. psize_t would be more accurate. Probably. > > >> any valid paddr_t value will belong to exactly one vm_phsseg? > > > > That's the idea. This would clarify mem(4) backend too. > > > > Note that allocating vm_physseg for device segments is cheap. > > that's depends on how much more expensive finding physseg gets. "Finding physseg" == "(reverse) lookup of vm_page -> vm_physseg". It is done only once (for each page) in pagers that use vm_page. Is the biggest concern lookup cost? Then I'd point out that uvm_pageismanaged() in pmap_enter() should die. Cacheability is decided by how VA is mapped. Those uvm_pageismanaged() calls are both inefficient and wrong.
Re: CVS commit: src/sys/uvm
On Dec 6, 2010, at 8:19 AM, Masao Uebayashi wrote: > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: >> [ adding cc: tech-kern@ ] > > The basic idea is straightforward; always allocate vm_physseg for > memories/devices. If a vm_physseg is used as general purpose > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). Ewww. How p->v is managed needs to be kept out of the MI code. There can be a common implementation of it for use by MD code but that isn't the same as MI. > Keep vm_physseg * + off_t array on stack. If UVM objects uses > vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > + off_t *once* and cache it on stack. off_t is not the right type. psize_t would be more accurate. >> any valid paddr_t value will belong to exactly one vm_phsseg? > > That's the idea. This would clarify mem(4) backend too. > > Note that allocating vm_physseg for device segments is cheap. that's depends on how much more expensive finding physseg gets.
Re: CVS commit: src/sys/uvm
Could you ack this discussion? On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote: > On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > > [ adding cc: tech-kern@ ] > > > > hi, > > > > > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: > > >> > > >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > > >> > > >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: > > >> >> hi, > > >> >> > > >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: > > >> hi, > > >> > > >> > Hi, thanks for review. > > >> > > > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi wrote: > > >> >> hi, > > >> >> > > >> >> - what's VM_PHYSSEG_OP_PG? > > >> > > > >> > It's to lookup vm_physseg by "struct vm_page *", relying on that > > >> > "struct vm_page *[]" is allocated linearly. It'll be used to > > >> > remove > > >> > vm_page::phys_addr as we talked some time ago. > > >> > > >> i'm not sure if commiting this unused uncommented code now helps it. > > >> some try-and-benchmark cycles might be necessary given that > > >> vm_page <-> paddr conversion could be performace critical. > > >> >>> > > >> >>> If you really care performance, we can directly pass "struct vm_page > > >> >>> *" to pmap_enter(). > > >> >>> > > >> >>> We're doing "struct vm_page *" -> "paddr_t" just before pmap_enter(), > > >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in > > >> >>> pmap_enter() to check if a given PA is managed. What is really > > >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once > > >> >>> and you'll know both paddr_t and managed or not. > > >> >> > > >> >> i agree that the current code is not ideal in that respect. > > >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. > > >> > > > >> > It's great you share the interest. > > >> > > > >> > I chose vm_physseg, because it was there. I'm open to alternatives, > > >> > but I don't think you have many options... > > >> > > >> Passing vm_page * doesn't work if the page isn't managed since there > > >> won't be a vm_page for the paddr_t. > > >> > > >> Now passing both paddr_t and vm_page * would work and if the pointer > > >> to the vm_page, it would be an unmanaged mapping. This also gives the > > >> access to mdpg without another lookup. > > > > > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > > > > > > And don't forget that you're the one who first pointed out that > > > allocating vm_pages for XIP is a pure waste of memory. ;) > > > > i guess matt meant "if the pointer to the vm_page is NULL,". > > > > > > > > I'm allocating vm_pages, only because of phys_addr and loan_count. > > > I believe vm_pages is unnecessary for read-only XIP segments. > > > Because they're read-only, and stateless. > > > > > > I've already concluded that the current "managed or not" model > > > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > > > model can explain everything. I'm rather waiting for a counter > > > example how vm_physseg doesn't work... > > > > i guess your suggestion is too vague. > > where do you want to use vm_physseg * + off_t instead of vm_page * ? > > getpages, pmap_enter, and? how their function prototypes would be? > > The basic idea is straightforward; always allocate vm_physseg for > memories/devices. If a vm_physseg is used as general purpose > memory, you allocate vm_page[] (as vm_physseg::pgs). If it's > potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). > > Keep vm_physseg * + off_t array on stack. If UVM objects uses > vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * > + off_t *once* and cache it on stack. > > > any valid paddr_t value will belong to exactly one vm_phsseg? > > That's the idea. This would clarify mem(4) backend too. > > Note that allocating vm_physseg for device segments is cheap.
Re: CVS commit: src/sys/uvm
On Thu, Nov 25, 2010 at 11:32:39PM +, YAMAMOTO Takashi wrote: > [ adding cc: tech-kern@ ] > > hi, > > > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: > >> > >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: > >> > >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: > >> >> hi, > >> >> > >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: > >> hi, > >> > >> > Hi, thanks for review. > >> > > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi wrote: > >> >> hi, > >> >> > >> >> - what's VM_PHYSSEG_OP_PG? > >> > > >> > It's to lookup vm_physseg by "struct vm_page *", relying on that > >> > "struct vm_page *[]" is allocated linearly. It'll be used to remove > >> > vm_page::phys_addr as we talked some time ago. > >> > >> i'm not sure if commiting this unused uncommented code now helps it. > >> some try-and-benchmark cycles might be necessary given that > >> vm_page <-> paddr conversion could be performace critical. > >> >>> > >> >>> If you really care performance, we can directly pass "struct vm_page > >> >>> *" to pmap_enter(). > >> >>> > >> >>> We're doing "struct vm_page *" -> "paddr_t" just before pmap_enter(), > >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in > >> >>> pmap_enter() to check if a given PA is managed. What is really > >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once > >> >>> and you'll know both paddr_t and managed or not. > >> >> > >> >> i agree that the current code is not ideal in that respect. > >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. > >> > > >> > It's great you share the interest. > >> > > >> > I chose vm_physseg, because it was there. I'm open to alternatives, > >> > but I don't think you have many options... > >> > >> Passing vm_page * doesn't work if the page isn't managed since there > >> won't be a vm_page for the paddr_t. > >> > >> Now passing both paddr_t and vm_page * would work and if the pointer > >> to the vm_page, it would be an unmanaged mapping. This also gives the > >> access to mdpg without another lookup. > > > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > > > > And don't forget that you're the one who first pointed out that > > allocating vm_pages for XIP is a pure waste of memory. ;) > > i guess matt meant "if the pointer to the vm_page is NULL,". > > > > > I'm allocating vm_pages, only because of phys_addr and loan_count. > > I believe vm_pages is unnecessary for read-only XIP segments. > > Because they're read-only, and stateless. > > > > I've already concluded that the current "managed or not" model > > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > > model can explain everything. I'm rather waiting for a counter > > example how vm_physseg doesn't work... > > i guess your suggestion is too vague. > where do you want to use vm_physseg * + off_t instead of vm_page * ? > getpages, pmap_enter, and? how their function prototypes would be? The basic idea is straightforward; always allocate vm_physseg for memories/devices. If a vm_physseg is used as general purpose memory, you allocate vm_page[] (as vm_physseg::pgs). If it's potentially mapped as cached, you allocate pvh (as vm_physseg:pvh). Keep vm_physseg * + off_t array on stack. If UVM objects uses vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg * + off_t *once* and cache it on stack. > any valid paddr_t value will belong to exactly one vm_phsseg? That's the idea. This would clarify mem(4) backend too. Note that allocating vm_physseg for device segments is cheap.
Re: CVS commit: src/sys/uvm
[ adding cc: tech-kern@ ] hi, > On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote: >> >> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote: >> >> > On Thu, Nov 25, 2010 at 05:44:21AM +, YAMAMOTO Takashi wrote: >> >> hi, >> >> >> >>> On Thu, Nov 25, 2010 at 04:18:25AM +, YAMAMOTO Takashi wrote: >> hi, >> >> > Hi, thanks for review. >> > >> > On Thu, Nov 25, 2010 at 01:58:04AM +, YAMAMOTO Takashi wrote: >> >> hi, >> >> >> >> - what's VM_PHYSSEG_OP_PG? >> > >> > It's to lookup vm_physseg by "struct vm_page *", relying on that >> > "struct vm_page *[]" is allocated linearly. It'll be used to remove >> > vm_page::phys_addr as we talked some time ago. >> >> i'm not sure if commiting this unused uncommented code now helps it. >> some try-and-benchmark cycles might be necessary given that >> vm_page <-> paddr conversion could be performace critical. >> >>> >> >>> If you really care performance, we can directly pass "struct vm_page >> >>> *" to pmap_enter(). >> >>> >> >>> We're doing "struct vm_page *" -> "paddr_t" just before pmap_enter(), >> >>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in >> >>> pmap_enter() to check if a given PA is managed. What is really >> >>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once >> >>> and you'll know both paddr_t and managed or not. >> >> >> >> i agree that the current code is not ideal in that respect. >> >> otoh, i'm not sure if passing vm_physseg around is a good idea. >> > >> > It's great you share the interest. >> > >> > I chose vm_physseg, because it was there. I'm open to alternatives, >> > but I don't think you have many options... >> >> Passing vm_page * doesn't work if the page isn't managed since there >> won't be a vm_page for the paddr_t. >> >> Now passing both paddr_t and vm_page * would work and if the pointer >> to the vm_page, it would be an unmanaged mapping. This also gives the >> access to mdpg without another lookup. > > What if XIP'ed md(4), where physical pages are in .data (or .rodata)? > > And don't forget that you're the one who first pointed out that > allocating vm_pages for XIP is a pure waste of memory. ;) i guess matt meant "if the pointer to the vm_page is NULL,". > > I'm allocating vm_pages, only because of phys_addr and loan_count. > I believe vm_pages is unnecessary for read-only XIP segments. > Because they're read-only, and stateless. > > I've already concluded that the current "managed or not" model > doesn't work for XIP. I'm pretty sure that my vm_physseg + off_t > model can explain everything. I'm rather waiting for a counter > example how vm_physseg doesn't work... i guess your suggestion is too vague. where do you want to use vm_physseg * + off_t instead of vm_page * ? getpages, pmap_enter, and? how their function prototypes would be? any valid paddr_t value will belong to exactly one vm_phsseg? YAMAMOTO Takashi
Re: sysctl node names (Re: CVS commit: src/sys/uvm)
On Fri, Feb 19, 2010 at 07:15:21PM +, Patrick Welche wrote: > > Which reminds me of sys/sysctl.h > > #define CTLTYPE_BOOL6 /* name describes a bool */ > > but bool doesn't seem to have i/o routines (I bet that all the above > are CTLTYPE_INT) And then what would be output for bool 1, t, true, vrai... Which has been fixed :-) date: 2010/04/11 01:52:10; author: mrg; state: Exp; lines: +66 -5 implement CTLTYPE_BOOL support. it was entirely missing. HI MATT!
Re: sysctl node names (Re: CVS commit: src/sys/uvm)
On Fri, Feb 19, 2010 at 06:42:03AM -0500, der Mouse wrote: > I'd say it's a question of whether you think of them as input to the > kernel, commands ("enable this"), or as output from the kernel, > reporting state ("this is enabled"). > > Of course, in most cases, they're actually both, so that doesn't help > much. But, as a native anglophone, it's what the difference between > "enable" and "enabled" here feels like to me. Well, the sysctl tree is a presentation of the state. Changing it is a command. So I think it should be "enabled". We could also avoid this problem by using "on" instead, but that's probably not a great idea. -- David A. Holland dholl...@netbsd.org
Re: sysctl node names (Re: CVS commit: src/sys/uvm)
On Fri, Feb 19, 2010 at 05:44:48AM +, YAMAMOTO Takashi wrote: > hi, > > > Module Name:src > > Committed By: drochner > > Date: Thu Feb 18 14:57:01 UTC 2010 > > > > Modified Files: > > src/sys/uvm: files.uvm uvm_map.c > > > > Log Message: > > Disable mapping of virtual address 0 by user programs per default. > > This blocks an easy exploit of kernel bugs leading to dereference > > of a NULL pointer on some architectures (eg i386). > > The check can be disabled in various ways: > > -by CPP definitions in machine/types.h (portmaster's choice) > > -by a kernel config option USER_VA0_DISABLED_DEFAULT=0 > > -at runtime by sysctl vm.user_va0_disabled (cannot be cleared > > at securelevel>0) > > it reminded me this... > > can we have some policy for future sysctl node names? > the current mixture of "enable" vs "enabled" seems ugly to me. > if my preference matters, i'd say "enable"/"disable". > > YAMAMOTO Takashi > > ushi% sysctl -a|grep enable > net.inet.tcp.sack.enable = 1 > net.inet.tcp.ecn.enable = 0 > net.inet.tcp.abc.enable = 1 > net.inet6.tcp6.sack.enable = 1 > net.inet6.tcp6.ecn.enable = 0 > net.inet6.tcp6.abc.enable = 1 > hw.fwohci.phydma_enable = 1 > security.pax.mprotect.enabled = 1 > security.pax.aslr.enabled = 1 > ushi% Which reminds me of sys/sysctl.h #define CTLTYPE_BOOL6 /* name describes a bool */ but bool doesn't seem to have i/o routines (I bet that all the above are CTLTYPE_INT) And then what would be output for bool 1, t, true, vrai... Cheers, Patrick
Re: sysctl node names (Re: CVS commit: src/sys/uvm)
>> the current mixture of "enable" vs "enabled" seems ugly to me. >> if my preference matters, i'd say "enable"/"disable". > For me, "enabled" sounds more logical because such a flag refers to a > state, not a process. I'd say it's a question of whether you think of them as input to the kernel, commands ("enable this"), or as output from the kernel, reporting state ("this is enabled"). Of course, in most cases, they're actually both, so that doesn't help much. But, as a native anglophone, it's what the difference between "enable" and "enabled" here feels like to me. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: sysctl node names (Re: CVS commit: src/sys/uvm)
y...@mwd.biglobe.ne.jp said: > the current mixture of "enable" vs "enabled" seems ugly to me. if my > preference matters, i'd say "enable"/"disable". For me, "enabled" sounds more logical because such a flag refers to a state, not a process. But I'm ready to sacrifice that for consistency. The "enable"s are the majority, and they are the older ones. In the "va0" case it is not too late -- if no native speaker convinces me otherwise I'll change it to _disable. best regards Matthias Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
sysctl node names (Re: CVS commit: src/sys/uvm)
hi, > Module Name: src > Committed By: drochner > Date: Thu Feb 18 14:57:01 UTC 2010 > > Modified Files: > src/sys/uvm: files.uvm uvm_map.c > > Log Message: > Disable mapping of virtual address 0 by user programs per default. > This blocks an easy exploit of kernel bugs leading to dereference > of a NULL pointer on some architectures (eg i386). > The check can be disabled in various ways: > -by CPP definitions in machine/types.h (portmaster's choice) > -by a kernel config option USER_VA0_DISABLED_DEFAULT=0 > -at runtime by sysctl vm.user_va0_disabled (cannot be cleared > at securelevel>0) it reminded me this... can we have some policy for future sysctl node names? the current mixture of "enable" vs "enabled" seems ugly to me. if my preference matters, i'd say "enable"/"disable". YAMAMOTO Takashi ushi% sysctl -a|grep enable net.inet.tcp.sack.enable = 1 net.inet.tcp.ecn.enable = 0 net.inet.tcp.abc.enable = 1 net.inet6.tcp6.sack.enable = 1 net.inet6.tcp6.ecn.enable = 0 net.inet6.tcp6.abc.enable = 1 hw.fwohci.phydma_enable = 1 security.pax.mprotect.enabled = 1 security.pax.aslr.enabled = 1 ushi% > > > To generate a diff of this commit: > cvs rdiff -u -r1.16 -r1.17 src/sys/uvm/files.uvm > cvs rdiff -u -r1.287 -r1.288 src/sys/uvm/uvm_map.c > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files.