Re: DRM memory manager on cards with hardware contexts

2006-10-05 Thread Thomas Hellström
Ben,

I've implemented a version of the drm_mm code that unmaps ptes using 
unmap_mapping_range, and remaps IO space using io_remap_pfn_range() for 
a single page in nopage. This has the side effect that I need to double 
check in nopage() after taking the object mutex that the pte in question 
hasn't been populated by a racing nopage, which means I have to include 
some page table walking code. I can see no obvious performance drops 
from populating one pte at a time.
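
(For illustration only: the kind of page table walk meant here, sketched against a 2.6.18-era kernel. The helper name is invented and error handling is kept minimal; it is assumed the caller holds the object mutex.)

#include <linux/mm.h>
#include <asm/pgtable.h>

/*
 * Illustrative helper: returns non-zero if the pte covering 'address'
 * in 'mm' is already populated, i.e. a racing nopage() got there first.
 */
static int drm_pte_populated(struct mm_struct *mm, unsigned long address)
{
	pgd_t *pgd;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;
	int ret;

	pgd = pgd_offset(mm, address);
	if (pgd_none(*pgd) || pgd_bad(*pgd))
		return 0;
	pud = pud_offset(pgd, address);
	if (pud_none(*pud) || pud_bad(*pud))
		return 0;
	pmd = pmd_offset(pud, address);
	if (pmd_none(*pmd) || pmd_bad(*pmd))
		return 0;
	pte = pte_offset_map(pmd, address);
	ret = !pte_none(*pte);
	pte_unmap(pte);
	return ret;
}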

This makes the need for an io-page aware nopage() or an 
io_remap_pfn_range() that doesn't BUG on populated page tables quite 
obvious.

What is the status of the NOPAGE_RETRY mechanism in 2.6.19?


/Thomas






Re: DRM memory manager on cards with hardware contexts

2006-10-05 Thread Benjamin Herrenschmidt
On Thu, 2006-10-05 at 17:52 +0200, Thomas Hellström wrote:
 Ben,
 
 I've implemented a version of the drm_mm code that unmaps ptes using 
 unmap_mapping_range, and remaps IO space using io_remap_pfn_range() for 
 a single page in nopage. This has the side effect that I need to double 
 check in nopage() after taking the object mutex that the pte in question 
 hasn't been populated by a racing nopage, which means I have to include 
 some page table walking code. I can see no obvious performance drops 
 from populating one pte at a time.

I think the correct thing is to add the double check in
remap_pte_range() instead of the current BUG_ON.
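
(Roughly, and sketched against the 2.6.18 mm/memory.c loop without testing, that change would amount to:)

	/* inside remap_pte_range(), replacing BUG_ON(!pte_none(*pte)) */
	do {
		if (pte_none(*pte))
			set_pte_at(mm, addr, pte, pfn_pte(pfn, prot));
		/* else: a racing fault already filled it in; skip quietly */
		pfn++;
	} while (pte++, addr += PAGE_SIZE, addr != end);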

 This makes the need for an io-page aware nopage() or an 
 io_remap_pfn_range() that doesn't BUG on populated page tables quite 
 obvious.
 
 What is the status of the NOPAGE_RETRY mechanism in 2.6.19?

I don't think it got in, but I need it for various things so I'll check
with Andrew if it can still be merged.

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-10-05 Thread Benjamin Herrenschmidt
On Thu, 2006-10-05 at 17:52 +0200, Thomas Hellström wrote:
 Ben,
 
 I've implemented a version of the drm_mm code that unmaps ptes using 
 unmap_mapping_range, and remaps IO space using io_remap_pfn_range() for 
 a single page in nopage. This has the side effect that I need to double 
 check in nopage() after taking the object mutex that the pte in question 
 hasn't been populated by a racing nopage, which means I have to include 
 some page table walking code. I can see no obvious performance drops 
 from populating one pte at a time.
 
 This makes the need for an io-page aware nopage() or an 
 io_remap_pfn_range() that doesn't BUG on populated page tables quite 
 obvious.
 
 What is the status of the NOPAGE_RETRY mechanism in 2.6.19?

Patch just got in -mm, the return code changed to NOPAGE_REFAULT. It
might still make it into 2.6.19.

Regarding the change to io_remap_pfn_range(), I'm tempted to just
provide a single routine that puts in a single PTE and also does the
double check that the PTE isn't already present. I'll hopefully cook
something up today, as I want to experiment with doing the exact same
thing for SPE mappings on cell.
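
(A purely illustrative sketch of how such a routine could be used from a driver nopage() handler. Only NOPAGE_REFAULT and the nopage() prototype are real; the buffer-object type and the two helpers are invented.)

static struct page *drm_bo_nopage(struct vm_area_struct *vma,
				  unsigned long address, int *type)
{
	struct drm_buffer_object *bo = vma->vm_private_data; /* hypothetical */
	unsigned long pfn;

	mutex_lock(&bo->mutex);
	if (signal_pending(current)) {
		mutex_unlock(&bo->mutex);
		return NOPAGE_REFAULT;	/* handle the signal, retry the access */
	}
	pfn = drm_bo_io_pfn(bo, address);	/* hypothetical lookup */
	drm_set_io_pte(vma, address, pfn);	/* the single-PTE routine Ben
						 * describes: re-checks pte_none()
						 * under the PTE lock, then does
						 * set_pte_at() plus flushes */
	mutex_unlock(&bo->mutex);
	if (type)
		*type = VM_FAULT_MINOR;
	return NOPAGE_REFAULT;	/* PTE is in place, no struct page to return */
}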

Cheers,
Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Thomas Hellström




Benjamin Herrenschmidt wrote:


  
  
OK. It seems like mmap locks are needed even for
unmap_mapping_range().

Well, I came to the opposite conclusion :) unmap_mapping_range() uses
the truncate count mechanism to guard against a racing no_page().

The idea is that:

no_page() itself internally takes the per-object lock/mutex mostly as a
synchronisation point before looking for the struct page and releases it
before returning the struct page to do_no_page().

unmap_mapping_range() is called with that mutex/lock held (and the copy
is done with that held too).

That should work without taking the mmap_sem.

OK. I was referring to another approach: copying _to_ VRAM / AGP:

lock_mmap_sems()
unmap_mapping_range() (or similar)
copy() / flip()
foreach_affected_vma{
 io_remap_pfn_range() /* Map vram / AGP space */
}
unlock_mmap_sem()

This works like a charm in the drm memory manager, but it requires
taking the mmap_sems of all affected processes, and the locking
order must be the same every time, otherwise deadlocks will occur.
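
(One way to meet the ordering constraint is to always take the semaphores in a fixed order, e.g. sorted by mm_struct address. Illustrative only; the helper and its bookkeeping are invented, and duplicate mm pointers would have to be locked only once.)

static void drm_lock_mmap_sems(struct mm_struct **mms, int count)
{
	struct mm_struct *tmp;
	int i, j;

	/* sort by pointer value so every caller locks in the same order */
	for (i = 1; i < count; i++)
		for (j = i; j > 0 && mms[j - 1] > mms[j]; j--) {
			tmp = mms[j - 1];
			mms[j - 1] = mms[j];
			mms[j] = tmp;
		}

	for (i = 0; i < count; i++)
		down_write(&mms[i]->mmap_sem);
}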


  
Now, of course, the real problem is that we don't have struct page for
vram... There are two ways out of this:

 - Enforce use of sparsemem and create struct page for vram. That will
probably make a few people jump out of their seats in x86 land but
that's what we do for cell and SPUs for now.

 - There's a proposal that I'm pushing to add a way for no_page() to
return a NOPAGE_RETRY error, which essentially causes it to go all the
way back to userland and re-do the access. I want that to be able to
handle signals while blocked inside no_page() but that could -also- be
used to have no_page() setup the PTE mappings itself and return
NOPAGE_RETRY, thus avoiding the need for a struct page. Now I do not
-ever- want to see drivers mucking around with PTEs directly, however,
we can provide something in mm/memory.c that a driver can call from
within no_page() to perform the set_pte() along with all the necessary
locking, flushes, etc... The base code for NOPAGE_RETRY should get in
2.6.19 soon (one of these days).

  

do_no_page() is smart enough to recheck the pte when it retakes the
page table spinlock, so if the pte has been populated by someone
while in the driver nopage(), the returned struct page will simply be
discarded.
io_remap_pfn_range() should do the job of setting up the new ptes, but
it needs the mmap_sem, so if that one is held while blocked in
nopage(), a deadlock will occur. Here, NOPAGE_RETRY will obviously
do the job. When io_remap_pfn_range() has finished setting up the ptes,
nopage() can simply return a bogus page if do_no_page() insists on
retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
the approach outlined above.


/Thomas






 










Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Benjamin Herrenschmidt

 OK. I was referring to another approach: copying _to_ VRAM /AGP:
 
 lock_mmap_sems()
 unmap_mapping_range() (or similar)
 copy() / flip()
 foreach_affected_vma{
io_remap_pfn_range() /* Map vram / AGP space */
 }
 unlock_mmap_sem()
 
 This works like a charm in the drm memory manager but it requires the
 lock of the mmap sems from all affected processes, and the locking
 order must be the same all the time otherwise deadlocks will occur.

Yes, and that's what I think we can fix using do_no_page() and
unmap_mapping_range(). That is, we don't use io_remap_pfn_range(), we just
fault in pages either from VRAM or from memory depending on where an
object sits at a given point in time, and we use
unmap_mapping_range() to invalidate current mappings of an object
when we move it. That can be done with the minimal approach I
described, with the only limitation (though a pretty big one today) that
you need struct page for VRAM for no_page() to be usable in those
conditions.

 do_no_page() is smart enough to recheck the pte when it retakes the
 page table spinlock(), so if the pte has been populated by someone
 while in the driver nopage(), the returned struct page will simply be
 discarded. 

Yup, indeed. It has to, to avoid races, since no_page() has to be called
without the PTE lock. The NOPAGE_RETRY approach would still be slightly
more efficient though.

 io_remap_pfn_range() should do the job of setting up the new ptes, but
 it needs the mmap_sem, so if that one is held while blocked in
 nopage(), a deadlock will occur. Here, the NOPAGE_RETRY will obviously
 do the job. When io_remap_pfn_range() has finished setting up the
 ptes, one can simply return a bogus page to nopage() if it insists on
 retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
 the approach outlined above.

It's not completely clear to me if we need the mmap_sem for writing to
call io_remap_pfn_range()... We can certainly populate PTEs with only
the read semaphore, and we happen to have it in no_page(), so that would
just work when called from within no_page().

So this approach would work today imho:

* objects have rwsem to protect migration.
* no_page() does:
   - takes that object read sem
   - if object is in vram or other non-memory location then do
io_remap_pfn_range() and get a dummy page struct pointer
   - else get the struct page of the object page in memory
   - release the object read sem and return whatever struct page we got
* migration does:
   - take that object write sem
   - copy the data to the new location
   - call unmap_mapping_ranges() for that object
   - release the object write sem
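
(A hedged sketch of the two paths just listed. The object layout and helper names are invented, and note Ben's follow-up below about io_remap_pfn_range() not re-checking the PTE when called with only the read semaphore.)

static struct page *bo_nopage(struct vm_area_struct *vma,
			      unsigned long address, int *type)
{
	struct drm_bo *bo = vma->vm_private_data;	/* hypothetical */
	struct page *page;

	down_read(&bo->sem);
	if (bo->in_vram) {
		/* map the object's VRAM pages, hand back a dummy page;
		 * error handling omitted */
		io_remap_pfn_range(vma, vma->vm_start, bo->vram_pfn,
				   bo->size, vma->vm_page_prot);
		page = bo->dummy_page;			/* hypothetical */
	} else {
		page = bo->pages[(address - vma->vm_start) >> PAGE_SHIFT];
	}
	get_page(page);
	up_read(&bo->sem);
	if (type)
		*type = VM_FAULT_MINOR;
	return page;
}

static void bo_migrate(struct drm_bo *bo)
{
	down_write(&bo->sem);
	/* copy the object contents to the new location here */
	unmap_mapping_range(bo->mapping, bo->map_offset, bo->size, 1);
	up_write(&bo->sem);
}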

With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
no_page() can be optimized for the case where it calls
io_remap_pfn_range() to not return a bogus page and have a faster return
path to userland. It's also possible to provide an io_remap_one_page()
that would be faster than having to call the whole 4-level
io_remap_pfn_range() for every page faulted in (though we might just
remap the entire object on the first fault, which might well work...)

Or do you think I missed something ?

Cheers,
Ben




Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Benjamin Herrenschmidt

 * objects have rwsem to protect migration.
 * no_page() does:
- takes that object read sem
- if object is in vram or other non-memory location then do
 io_remap_pfn_range() and get a dummy page struct pointer
- else get the struct page of the object page in memory
- release the object read sem and return whatever struct page we got
 * migration does:
- take that object write sem
- copy the data to the new location
- call unmap_mapping_ranges() for that object
- release the object write sem

Ok, there is one fault in my reasoning: io_remap_pfn_range() isn't
designed to be used in that context (it's really made to be called with
the mmap_sem held for writing) and thus doesn't check whether the PTE is
still empty after locking it, which is necessary if you are only holding
the read semaphore.

That means that it's still all possible, but not using
io_remap_pfn_range(). Best is to provide a specific new function, called
something like map_one_io_page() or something like that, which does
something along the lines of:

	pgd = pgd_offset(mm, address);
	pud = pud_alloc(mm, pgd, address);
	if (!pud)
		return VM_FAULT_OOM;
	pmd = pmd_alloc(mm, pud, address);
	if (!pmd)
		return VM_FAULT_OOM;
	/* allocate the pte page if needed and take the PTE lock */
	pte = pte_alloc_map_lock(mm, pmd, address, &ptl);
	if (!pte)
		return VM_FAULT_OOM;
	if (pte_none(*pte)) {
		flush_icache_page(vma, new_page);
		entry = mk_pte(new_page, vma->vm_page_prot);
		if (write_access)
			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
		set_pte_at(mm, address, pte, entry);
	} else {
		/* someone else populated the PTE under us; drop our page */
		page_cache_release(new_page);
		goto unlock;
	}
	update_mmu_cache(vma, address, entry);
	lazy_mmu_prot_update(entry);
unlock:
	pte_unmap_unlock(pte, ptl);

Note that it's clear that this is to be used exclusively for mapping
non-real pages, and it doesn't handle racing with truncate (concurrent
unmap_mapping_range()), which is fine in our case as we have the object
semaphore.

We're looking into doing something like that for Cell, to not require
sparsemem anymore and thus not create struct pages for SPE local stores
and registers, which is a real pain...

We should probably move that discussion to linux-mm and/or lkml tho :)

Cheers,
Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Thomas Hellström
Benjamin Herrenschmidt wrote:

OK. I was referring to another approach: copying _to_ VRAM /AGP:

lock_mmap_sems()
unmap_mapping_range() (or similar)
copy() / flip()
foreach_affected_vma{
   io_remap_pfn_range() /* Map vram / AGP space */
}
unlock_mmap_sem()

This works like a charm in the drm memory manager but it requires the
lock of the mmap sems from all affected processes, and the locking
order must be the same all the time otherwise deadlocks will occur.



Yes, and that's what I think we can fix using do_no_page() and
unmap_mapping_ranges(). That is, we don't io_remap_pfn_range(), we just
fault in pages either from VRAM or from memory depending on where an
object sits at a given point in time, and we use
unmap_mapping_range() to invalidate current mappings of an object
when we move it. That can be done with the minimal approach I
described with the only limitation (though a pretty big one today) that
you need struct page for VRAM for no_page() to be useable in those
conditions.

  

do_no_page() is smart enough to recheck the pte when it retakes the
page table spinlock(), so if the pte has been populated by someone
while in the driver nopage(), the returned struct page will simply be
discarded. 



Yup, indeed. It has to to avoid races since no_page() has to be called
without the PTE lock. The NOPAGE_RETRY approach would still be slightly
more efficient though.

  

io_remap_pfn_range() should do the job of setting up the new ptes, but
it needs the mmap_sem, so if that one is held while blocked in
nopage(), a deadlock will occur. Here, the NOPAGE_RETRY will obviously
do the job. When io_remap_pfn_range() has finished setting up the
ptes, one can simply return a bogus page to nopage() if it insists on
retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
the approach outlined above.



It's not completely clear to me if we need the mmap_sem for writing to
call io_remap_pfn_range()... We can certainly populate PTEs with only
the read semaphore and we happen to have it in no_page so that would
just work being called just within no_page().

So this approach would work today imho:

* objects have rwsem to protect migration.
* no_page() does:
   - takes that object read sem
   - if object is in vram or other non-memory location then do
io_remap_pfn_range() and get a dummy page struct pointer
   - else get the struct page of the object page in memory
   - release the object read sem and return whatever struct page we got
* migration does:
   - take that object write sem
   - copy the data to the new location
   - call unmap_mapping_ranges() for that object
   - release the object write sem

With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
no_page() can be optimized for the case where it calls
io_remap_pfn_range() to not return a bogus page and have a faster return
path to userland. It's also possible to provide a io_remap_one_page()
that would be faster than having to call the whole 4 level
io_remap_pfn_range() for every page faulted in (though we might just
remap the entire object on the first fault, might well work ...)

Or do you think I missed something ?

  

No, that's probably the safest approach we can use until NOPAGE_RETRY
arrives. Only I was not sure it'd be safe to call io_remap_pfn_range() from
within nopage(), in case it modifies some internal mm structs that the
kernel nopage() code expects to be untouched.

Once NOPAGE_RETRY arrives (hopefully with a schedule() call attached to
it), it's possible, however, that repopulating the whole vma using
io_remap_pfn_range() outside nopage(), just after doing the copying, is
more efficient. Although this means keeping track of vmas, the mmap_sems
can be taken and released one at a time, without any locking problems.

I agree the single-page approach looks nicer, though. It's somewhat ugly
to force one's way into another process' memory space.


/Thomas













Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Benjamin Herrenschmidt

 I'm finding this an interesting discussion.  If it shifts to lkml, for 
 instance, is there a way to follow *and post* on the thread without 
 either subscribing to lkml or requiring myself to be on the CC list?

I don't know if lkml allows non-subscriber posting; I think it does, though.
So you can follow from an archive, though that sucks. Or we can keep
both lists CCed.

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Ville Syrjälä
On Thu, Sep 21, 2006 at 07:18:07PM +1000, Benjamin Herrenschmidt wrote:
 
  I'm finding this an interesting discussion.  If it shifts to lkml, for 
  instance, is there a way to follow *and post* on the thread without 
  either subscribing to lkml or requiring myself to be on the CC list?
 
 I don't know if lkml allows non-subscriber posted, I think it does tho.

It does. And you can post via Gmane too.

 So you can follow from an archive, though that sucks.

nntp://news.gmane.org is quite nice.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Benjamin Herrenschmidt

 No, that's probably the safest approach we can use until NOPAGE_RETRY 
 arrives.
 Only I was not sure it'd be safe to call io_remap_pfn_range() from
 within nopage, in case it modifies some internal mm structs that the 
 kernel nopage() code
 expects to be untouched.

It does a couple of things that I don't like and lacks the re-test of
the PTE as I explained in my other email. I really want to provide a
function for doing that tho, one page at a time.

 Once NOPAGE_RETRY arrives, (hopefully with a schedule() call attached to 
 it),

What about schedule ? NOPAGE_RETRY returns all the way back to
userland... So things like signals can be handled etc... and the
faulting instruction is re-executed. If there is a need for
rescheduling, that will be handled by the kernel on the return path to
userland as with signals.

I want that for other things on cell which might also be useful for you,
for example, the no_page() handler for cell might wait a long time to be
able to schedule in a free HW SPE to service the fault. With
NOPAGE_RETRY, I can make that wait interruptible (test for signals) and
return to userland _without_ filling the PTE if a signal is pending,
thus causing signals to be serviced and the faulting instruction
re-executed.

 it's possible, however, that repopulating the whole vma using 
 io_remap_pfn_range() outside nopage, just after doing the copying is 
 more efficient.

Might well be when switching to vram but that means knowing about all
VMAs and thus all clients... possible but I was trying to avoid it.
 
  Although this means keeping track of vmas, the mmap_sems 
 can be taken and released one at a time, without any locking problems.

Yup.

 I agree the single-page approach looks nicer, though. It's somewhat ugly 
 to force one's way into another process' memory space.

It is but it works :) It's typically done by things like ptrace, or some
drivers DMA'ing to user memory; that's for example what get_user_pages()
allows you to do. So it should work as long as you are only faulting
pages, or invalidating mappings. We (ab)use that on cell too as SPEs run
in the host process address space. They have an MMU that we point to the
page tables of the process owning the context running on them. That
means we might have to take interrupts on the host to service faults for
an SPE which is running another mm :)

It might be more efficient performance-wise, however, to do the full remap
of the entire vram on the first no_page() to it when the object is in
vram. That can be done safely with a simple change to
io_remap_pfn_range() to make it safe against racing with itself,
basically by having remap_pte_range() return 0 instead of BUG()'ing if
the PTE has been populated after the lock. Should be pretty trivial and
we shouldn't break anything since that was a BUG() case. That would work
around what I'm explaining in another email: that it currently needs the
write mmap_sem because it doesn't handle races with another faulting
path (that is, with itself basically).

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Keith Whitwell
Ville Syrjälä wrote:
 On Thu, Sep 21, 2006 at 07:18:07PM +1000, Benjamin Herrenschmidt wrote:
 I'm finding this an interesting discussion.  If it shifts to lkml, for 
 instance, is there a way to follow *and post* on the thread without 
 either subscribing to lkml or requiring myself to be on the CC list?
 I don't know if lkml allows non-subscriber posted, I think it does tho.
 
 It does. And you can post via Gmane too.
 
 So you can follow from an archive, though that sucks.
 
 nntp://news.gmane.org is quite nice.
 

Gmane is working nicely for me now - thanks.

Keith



Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Thomas Hellström
Benjamin Herrenschmidt wrote:

No, that's probably the safest approach we can use until NOPAGE_RETRY 
arrives.
Only I was not sure it'd be safe to call io_remap_pfn_range() from
within nopage, in case it modifies some internal mm structs that the 
kernel nopage() code
expects to be untouched.



It does a couple of things that I don't like and lacks the re-test of
the PTE as I explained in my other email. I really want to provide a
function for doing that tho, one page at a time.

  

Once NOPAGE_RETRY arrives, (hopefully with a schedule() call attached to 
it),



What about schedule ? NOPAGE_RETRY returns all the way back to
userland... So things like signals can be handled etc... and the
faulting instruction is re-executed. If there is a need for
rescheduling, that will be handled by the kernel on the return path to
userland as with signals.

I want that for other things on cell which might also be useful for you,
for example, the no_page() handler for cell might wait a long time to be
able to schedule in a free HW SPE to service the fault. With
NOPAGE_RETRY, I can make that wait interruptible (test for signals) and
return to userland _without_ filling the PTE if a signal is pending,
thus causing signals to be serviced and the faulting instruction
re-executed.
  

OK. I was thinking about the case where NOPAGE_RETRY was used to release
the mmap semaphore for the process doing io_remap_pfn_range().
Having the process yield the CPU would make it more probable that the
semaphore wasn't re-grabbed by the process doing nopage().

  

it's possible, however, that repopulating the whole vma using 
io_remap_pfn_range() outside nopage, just after doing the copying is 
more efficient.



Might well be when switching to vram but that means knowing about all
VMAs and thus all clients... possible but I was trying to avoid it. 


 Although this means keeping track of vmas, the mmap_sems 
can be taken and released one at a time, without any locking problems.
  


Yup.

  

I agree the single-page approach looks nicer, though. It's somewhat ugly 
to force one's way into another process' memory space.



It is but it works :) It's typically done by things like ptrace, or some
drivers DMA'ing to user memory, that's for example what get_user_pages()
allows you to do. So it should work as long as you are only faulting
pages, or invalidating mappings. We (ab)use that on cell too as SPEs run
in the host process address space. They have an MMU that we point to the
page tables of the process owning the context running on them. That
means we might have take interrupts on the host to service faults for an
SPE which is running another mm :)

It might be more efficient performance wise however to do the full remap
of the entire vram on the first no_page() to it when the object is in
vram. That can be done safely with a simple change to
io_remap_pfn_range() to make it safe against racing with itself,
basically by having remap_pte_range() return 0 instead of BUG()'ing if
the PTE has been populated after the lock. Should be pretty trivial and
we shouldn't break anything since that was a BUG() case. That would work
around what I'm explaining in another email that it currently needs the
write mmap_sem because it doesn't handle races with another faulting
path (that is with itself basically). 

Ben.


  

Hmm, the comments to handle_pte_fault(), which calls do_no_page(),
give some insight:

 * Note the page_table_lock. It is to protect against kswapd removing
 * pages from under us. Note that kswapd only ever _removes_ pages, never
 * adds them. As such, once we have noticed that the page is not present,
 * we can drop the lock early.
 *
 * The adding of pages is protected by the MM semaphore (which we hold),
 * so we don't need to worry about a page being suddenly been added into
 * our VM.
 *

So basically when driver nopage is called we should _never_ have a valid
PTE.
This makes the extra check in do_no_page() after the call to driver
nopage() somewhat mysterious (but fortunate). Perhaps the intention is
for driver nopage() to be able to temporarily release the MM semaphore.
(Which would be even more fortunate.)

In any case, if the comments hold, we should never hit the BUG() 
statement in io_remap_pfn_range(), but it also seems clear that the code 
doesn't really expect ptes to be added.

Taking this to the lkml for some clarification might be a good idea.

On a totally different subject: the previous discussion we had about
having pages outside of the kernel virtual map (highmem pages, for
example) might be somewhat tricky with the current definition of
alloc_gatt_pages and free_gatt_pages, which both return kernel virtual
addresses. It would be nice to have them return struct page * instead.


/Thomas







Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Benjamin Herrenschmidt

 Hmm, the comments to handle_pte_fault(), which is calling do_nopage 
 gives some insight..
 
  * Note the page_table_lock. It is to protect against kswapd removing
  * pages from under us. Note that kswapd only ever _removes_ pages, never
  * adds them. As such, once we have noticed that the page is not present,
  * we can drop the lock early.
  *
  * The adding of pages is protected by the MM semaphore (which we hold),
  * so we don't need to worry about a page being suddenly been added into
  * our VM.
  *

This comment is a bit stale I think :) For example, the PTL is no longer
used for faulting in PTEs, in favor of a more fine-grained lock. Also,
faulting only takes the mmap_sem for reading, which can be taken multiple
times. It's only taken for writing (which excludes other writers and all
readers) when modifying the VMA list itself.
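
(Illustration only of the rule being described, as two self-contained stubs:)

static void example_fault_path(struct mm_struct *mm)
{
	down_read(&mm->mmap_sem);   /* faulting: shared, many readers at once */
	/* handle_mm_fault() / driver nopage() run under this */
	up_read(&mm->mmap_sem);
}

static void example_vma_change(struct mm_struct *mm)
{
	down_write(&mm->mmap_sem);  /* mmap/munmap: exclusive, blocks faulters */
	/* the vma list may be modified here */
	up_write(&mm->mmap_sem);
}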

 So basically when driver nopage is called we should _never_ have a valid 
 PTE.

No, we can have two no_page() calls racing.

 This makes the extra check in do_nopage() after the call to driver 
 nopage() somewhat mysterious,  (but fortunate). Perhaps the intention is 
 for driver nopage() to be able to temporarily release the MM semaphore. 
 (Which would be even more fortunate).

It's an rwsem, it can be taken multiple times for reading. Only once the
PTE lock has been taken (the former page_table_lock, now a PTE lock whose
actual granularity is arch dependent) do you know for sure that nobody
else will be mucking around with -this- specific PTE. Which is why you
need to re-check after taking the lock. The mmap_sem only protects
against the whole VMA being torn down or modified (though truncate
doesn't take it either, thus the truncate count trick which ends up in
a retry if we raced with it).

 In any case, if the comments hold, we should never hit the BUG() 
 statement in io_remap_pfn_range(), but it also seems clear that the code 
 doesn't really expect ptes to be added.

Unfortunately, the comment is misleading. I suppose I should submit a
patch changing or removing it one of these days...

 Taking this to the lkml for some clarification might be a good idea.
 
 On a totally different subject, the previous discussion we had about 
 having pages outside of the kernel virtual map (highmem pages) for 
 example, might be somewhat tricky with the current definition of 
 alloc_gatt_pages and free_gatt_pages, which both returns kernel virtual 
 addresses. Would be nice to have them return struct page* instead.

Yes. Currently, we can get to struct page with virt_to_page(), which is
what we do in drm_vm.h for platforms where the AGP aperture cannot be
accessed as MMIO and thus requires a no_page() for faulting the
individual pages in (which is what we do on ppc btw). But that will not
work with pages that aren't coming from the kernel virtual mapping. Thus
it might indeed be a good idea to change the AGP allocation to return
struct page.
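
(For illustration, today's detour; the wrapper name is invented, and it only works because alloc_gatt_pages() hands back lowmem that is covered by the kernel linear map.)

#include <linux/mm.h>
#include <asm/agp.h>

static struct page *drm_alloc_gatt_page(void)
{
	void *virt = alloc_gatt_pages(0);	/* order 0: a single page */

	if (!virt)
		return NULL;
	return virt_to_page(virt);	/* valid only for linear-map pages */
}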

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Thomas Hellström
Thomas Hellström wrote:

Benjamin Herrenschmidt wrote:

  


Hmm, the comments to handle_pte_fault(), which is calling do_nopage 
gives some insight..

 * Note the page_table_lock. It is to protect against kswapd removing
 * pages from under us. Note that kswapd only ever _removes_ pages, never
 * adds them. As such, once we have noticed that the page is not present,
 * we can drop the lock early.
 *
 * The adding of pages is protected by the MM semaphore (which we hold),
 * so we don't need to worry about a page being suddenly been added into
 * our VM.
 *

So basically when driver nopage is called we should _never_ have a valid 
PTE.
  

...Or perhaps that comment is a remnant from the time when the
mm semaphore wasn't an rw semaphore.

/Thomas




This makes the extra check in do_nopage() after the call to driver 
nopage() somewhat mysterious,  (but fortunate). Perhaps the intention is 
for driver nopage() to be able to temporarily release the MM semaphore. 
(Which would be even more fortunate).

In any case, if the comments hold, we should never hit the BUG() 
statement in io_remap_pfn_range(), but it also seems clear that the code 
doesn't really expect ptes to be added.

Taking this to the lkml for some clarification might be a good idea.

On a totally different subject, the previous discussion we had about 
having pages outside of the kernel virtual map (highmem pages) for 
example, might be somewhat tricky with the current definition of 
alloc_gatt_pages and free_gatt_pages, which both returns kernel virtual 
addresses. Would be nice to have them return struct page* instead.


/Thomas













Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Thomas Hellström




Benjamin Herrenschmidt wrote:

  On Tue, 2006-09-19 at 12:49 +0200, Thomas Hellström wrote:
  
  
Benjamin Herrenschmidt wrote: 


  On Tue, 2006-09-19 at 11:27 +0200, Thomas Hellström wrote:

  
  
  
But this should be the same problem encountered by the agpgart driver?
x86 and x86-64 calls change_page_attr() to take care of this.
On powerpc it is simply a noop. (asm/agp.h)


  
  Possibly. We sort-of ignore the issue for now on PowerPC and happen to
be lucky most of the time because 32 bits PowerPC aren't that agressive
at prefetching...

I haven't looked at change_page_attr() implementation on x86 but I know
they also map the linear mapping with large pages. I don't know what
happens if you start trying to change a single page attribute. x86 can
breakup that large page into 4k pages, so maybe that's what happens.

  
  

Yes, I think that's what happens. I know some Athlon chips had a big
issue with this some time ago.

I notice there are special functions in agp.h to alloc / free GATT
pages, so the general idea might be to have a pool of uncached pages
in the future for powerpc? Even better would perhaps be to have pages
that aren't mapped for the kernel. (like highmem pages on x86).

  
  
Yes, that's exactly what I'm thinking about doing. However, this is only
a problem for AGP.

  

Right.

  For objects that are in video memory but can also be moved back to main
memory (what I call "evicted") under pressure by the memory manager, one
thing I was wondering is, do we have to bother about cache settings at
all ?

  

I don't think so. 
We are not doing vram yet in the TTM code, but I think a general
"eviction" would consist of:

1) locking mmap_sems for all processes mapping the buffer.
2) zap the page table. Any attempt to access will be blocked by
mmap_sem in nopage().
3) Copy contents from vram to system using either PCI SG or
video-blit-AGP-flip-system.
4) Wait for completion.
5) release the mmap sem. The page table will be refilled using nopage().

A copy back might be more efficient since in some situations we don't
have to wait for completion (if the copy is done using the command
queue). Intel chips, for instance, have the possibility to flip cached
pages into AGP for use with the video blitter.
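
(The sequence above, written out as a hedged sketch; every helper and field name here is invented, and step numbers refer to the list above.)

static void drm_evict_buffer(struct drm_buffer *buf)
{
	loff_t start = (loff_t)buf->map_pgoff << PAGE_SHIFT;
	loff_t len = (loff_t)buf->num_pages << PAGE_SHIFT;

	drm_lock_all_mmap_sems(buf);	/* 1) all processes mapping the buffer */
	unmap_mapping_range(buf->mapping, start, len, 1);  /* 2) zap the ptes */
	drm_copy_vram_to_system(buf);	/* 3) PCI SG or blit via AGP           */
	drm_fence_wait(buf->fence);	/* 4) wait for completion              */
	drm_unlock_all_mmap_sems(buf);	/* 5) nopage() refills the page table  */
}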

  That is, have them mapped non-cacheable when in vram and cacheable when
in main memory. Is there any reason why there would be a problem with
userland having the same buffer being sometimes cacheable and
non-cacheable ? I don't think so as long as userland isn't using cache
tricks and whatever primitive is used to do the copy to/from vram
properly accounts for it.
  

I agree.

  

/Thomas





Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Benjamin Herrenschmidt

 I don't think so. 
 We are not doing vram yet in the TTM code, but I think a general
 eviction would consist of 
 
 1) locking mmap_sems for all processes mapping the buffer.
 2) zap the page table. Any attempt to access will be blocked by
 mmap_sem in nopage().
 3) Copy contents from vram to system using either PCI SG or
 video-blit-AGP-flip-system.
 4) Wait for completion.
 5) release the mmap sem. The page table will be refilled using
 nopage().

On Cell, for SPU mappings, we don't scan through all processes mapping
it, we use unmap_mapping_range(), which does it. However, after
double-checking, I have some doubts about the locking, so I'm trying to
clarify that and I'll come back to you on whether it's actually a viable
solution or not.

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Benjamin Herrenschmidt
  
 OK. It seems like mmap locks are needed even for
 unmap_mapping_range().

Well, I came to the opposite conclusion :) unmap_mapping_range() uses
the truncate count mechanism to guard against a racing no_page().

The idea is that:

no_page() itself internally takes the per-object lock/mutex mostly as a
synchronisation point before looking for the struct page and releases it
before returning the struct page to do_no_page().

unmap_mapping_range() is called with that mutex/lock held (and the copy
is done with that held too).

That should work without taking the mmap_sem.
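
(A very rough sketch of that scheme; all object fields and helpers below are invented.)

/* migration/copy side: no mmap_sem taken at all */
static void drm_object_move(struct drm_object *obj)
{
	mutex_lock(&obj->mutex);
	/* copy the contents to the new backing store while holding the mutex */
	unmap_mapping_range(obj->mapping,
			    (loff_t)obj->pgoff << PAGE_SHIFT,
			    (loff_t)obj->num_pages << PAGE_SHIFT, 1);
	mutex_unlock(&obj->mutex);
}

/* fault side */
static struct page *drm_object_nopage(struct vm_area_struct *vma,
				      unsigned long address, int *type)
{
	struct drm_object *obj = vma->vm_private_data;
	struct page *page;

	mutex_lock(&obj->mutex);	/* synchronisation point vs. migration */
	page = drm_object_page(obj, address);	/* hypothetical lookup */
	get_page(page);
	mutex_unlock(&obj->mutex);
	if (type)
		*type = VM_FAULT_MINOR;
	return page;
}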

Now, of course, the real problem is that we don't have struct page for
vram... There are two ways out of this:

 - Enforce use of sparsemem and create struct page for vram. That will
probably make a few people jump out of their seats in x86 land but
that's what we do for cell and SPUs for now.

 - There's a proposal that I'm pushing to add a way for no_page() to
return a NOPAGE_RETRY error, which essentially causes it to go all the
way back to userland and re-do the access. I want that to be able to
handle signals while blocked inside no_page() but that could -also- be
used to have no_page() setup the PTE mappings itself and return
NOPAGE_RETRY, thus avoiding the need for a struct page. Now I do not
-ever- want to see drivers mucking around with PTEs directly, however,
we can provide something in mm/memory.c that a driver can call from
within no_page() to perform the set_pte() along with all the necessary
locking, flushes, etc... The base code for NOPAGE_RETRY should get in
2.6.19 soon (one of these days).

Ben.




Re: DRM memory manager on cards with hardware contexts

2006-09-19 Thread Benjamin Herrenschmidt
On Tue, 2006-09-19 at 11:27 +0200, Thomas Hellström wrote:

 But this should be the same problem encountered by the agpgart driver?
 x86 and x86-64 calls change_page_attr() to take care of this.
 On powerpc it is simply a noop. (asm/agp.h)

Possibly. We sort-of ignore the issue for now on PowerPC and happen to
be lucky most of the time because 32-bit PowerPCs aren't that aggressive
at prefetching...

I haven't looked at the change_page_attr() implementation on x86 but I
know they also map the linear mapping with large pages. I don't know what
happens if you start trying to change a single page attribute. x86 can
break up that large page into 4k pages, so maybe that's what happens.
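
(For reference, the x86 asm/agp.h hooks being referred to boil down to change_page_attr() calls; this is paraphrased from 2.6.18-era headers, and the powerpc variants are empty stubs.)

/* roughly what include/asm-i386/agp.h does (paraphrased) */
#define map_page_into_agp(page)   change_page_attr(page, 1, PAGE_KERNEL_NOCACHE)
#define unmap_page_from_agp(page) change_page_attr(page, 1, PAGE_KERNEL)
#define flush_agp_mappings()      global_flush_tlb()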

 Currently we take the following approach when the GPU needs access to
 a buffer:
 
 0) Take the hardware lock.
 1) The buffer is validated, and if not present in the GATT, it's
 flipped in. At this point, idle buffers may be flipped out.
 2) The app submits a batch buffer (or in the general case a command
 sequence). All buffers that are referenced by this command sequence
 needs to have been validated, and the command sequence should be
 updated with their new GATT offset.
 3) A fence is emitted, and associated with all unfenced buffers.
 4) The hardware lock is released.
 5) When the fence has expired (The GPU is finished with the command
 sequence), the buffers associated with it may optionally be thrown
 out. 
 
 One problem is that buffers that need to be pinned (_always_ available
 to the GPU) cannot be thrown out and will thus fragment the aperture-
 or VRAM space.
 
 Buffers also carry usage- and mapping refcounts. They are not allowed
 to be validated when mapped, and (except under some circumstances) are
 not allowed to be mapped while validated. Buffer destruction occurs
 when the refcount goes to zero.

Yup. My idea was to allow the locking from userspace, to allow for chips
that support direct command submission from userspace. Basically, lock/pin
the buffer if it's still in the card, and if that fails (because it's
been evicted already), then fall back to a kernel call.

Ben.




Re: DRM memory manager on cards with hardware contexts

2006-09-19 Thread Thomas Hellström




Benjamin Herrenschmidt wrote:

  On Tue, 2006-09-19 at 11:27 +0200, Thomas Hellström wrote:

  
  
But this should be the same problem encountered by the agpgart driver?
x86 and x86-64 calls change_page_attr() to take care of this.
On powerpc it is simply a noop. (asm/agp.h)

  
  
Possibly. We sort-of ignore the issue for now on PowerPC and happen to
be lucky most of the time because 32 bits PowerPC aren't that agressive
at prefetching...

I haven't looked at change_page_attr() implementation on x86 but I know
they also map the linear mapping with large pages. I don't know what
happens if you start trying to change a single page attribute. x86 can
breakup that large page into 4k pages, so maybe that's what happens.

  

Yes, I think that's what happens. I know some Athlon chips had a big
issue with this some time ago.

I notice there are special functions in agp.h to alloc / free
GATT pages, so the general idea might be to have a pool of uncached
pages in the future for powerpc? Even better would perhaps be to have
pages that aren't mapped for the kernel (like highmem pages on x86).

/Thomas





Re: DRM memory manager on cards with hardware contexts

2006-09-19 Thread Stephane Marchesin
Thomas Hellström wrote:
 Benjamin Herrenschmidt wrote:
 On Tue, 2006-09-19 at 11:27 +0200, Thomas Hellström wrote:

   
 But this should be the same problem encountered by the agpgart driver?
 x86 and x86-64 calls change_page_attr() to take care of this.
 On powerpc it is simply a noop. (asm/agp.h)
 

 Possibly. We sort-of ignore the issue for now on PowerPC and happen to
 be lucky most of the time because 32 bits PowerPC aren't that agressive
 at prefetching...

 I haven't looked at change_page_attr() implementation on x86 but I know
 they also map the linear mapping with large pages. I don't know what
 happens if you start trying to change a single page attribute. x86 can
 breakup that large page into 4k pages, so maybe that's what happens.

   
 Yes, I think that's what happens. I know some Athlon chips had a big 
 issue with this some time ago.

 I notice there are special functions in agp.h to alloc / free GATT 
 pages, so the general idea might be to have a pool of uncached pages 
 in the future for powerpc? Even better would perhaps be to have pages 
 that aren't mapped for the kernel. (like highmem pages on x86).

As a side note, it's not always possible to map the whole video memory.
For example, on ATI chips you can't map VRAM above 128MB; on Nvidia you
can't above 256MB. Still, that memory has to be managed somehow. In that
case, such memory areas will be hidden from the application anyway.

Stephane




Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Keith Whitwell
Dave Airlie wrote:
 Obviously, we are interested in making use of the new DRM memory manager
 on that hardware. Now if I understand how it works correctly, this new
 memory manager allocates opaque handles which are not to be used as
 offset in memory, because they are not. Therefore, a translation from
 the handle into a proper memory adress has to be done before the
 commands are sent to the hardware. This is easy to add when the DRM
 validates all the commands.
 
 Also the multiple contexts means taking the drm lock is not something we 
 would want to be doing in on a regular basis... I've got some ideas 
 already discussed with Stephane but I'd like to see what other methods ppl 
 might have..

Yes, this is really a different hardware model than we're used to
dealing with for DRI drivers, however it's not a problem for the most
part - if you don't need to take the lock, don't.  But then you need
some other way of dealing with the other hacky stuff we get away with by
lock abuses, e.g. VT switching.

For the memory manager, I guess there are two choices:  1) make the 
driver use a command-buffer approach even though the hardware supports 
per-context ring buffers, or 2) extend the memory manager.

Extending the memory manager would involve adding an ability to lock and 
unlock surfaces to VRAM/AGP addresses - this would require kernel 
interaction I guess.  The driver would have to lock the surfaces then be 
free to submit commands to the ring, then explicitly unlock the 
surfaces.  This is actually a pretty nasty approach - it makes it very
hard to deal with low-memory situations - it's impossible to kick out
another process's allocations.

I wonder how NV deals with this...

Keith




Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Benjamin Herrenschmidt

 Yes, this is really a different hardware model than we're used to 
 dealing with for DRI drivers, however it's not a problem for the most 
 part - if you don't need to take the lock, don't.  But then you need 
 some other way of dealing with the other hacky stuff we get away with by 
lock abuses eg. VT switching.

We could abuse a RW lock model here where normal command submission
takes a read lock (can be shared) and big hammer like VT switch takes a
write lock.

 For the memory manager, I guess there are two choices:  1) make the 
 driver use a command-buffer approach even though the hardware supports 
 per-context ring buffers, or 2) extend the memory manager.
 
 Extending the memory manager would involve adding an ability to lock and 
 unlock surfaces to VRAM/AGP addresses - this would require kernel 
 interaction I guess.  The driver would have to lock the surfaces then be 
 free to submit commands to the ring, then explicitly unlock the 
 surfaces.  This is actually a pretty nasty approach - it makes it very 
 hard to deal with low-memory situations - it's impossible to kick out 
 another processes allocations.
 
 I wonder how NV deals with this...

I've heard some of the proprietary drivers play MMU tricks. We could do
something similar... when kicking a pixmap out, we remap the virtual
mapping for that pixmap to backup memory. But that means fundamentally
changing our model where we have a big mapping for the fb which we then
cut into objects, and instead mmap objects separately. As for kicking out
mappings behind somebody's back, it works fine :) We do that for SPE
local stores on the Cell processor. A no_page() function will refill as
needed from either the HW or the backing store, and
unmap_mapping_range() can kick that out behind any process's back. The
main problem I see with that approach is that you have to either map the
backup memory non-cacheable, which can be trouble on some platforms (*),
or cacheable, which means that the same bit of memory will either be
cacheable or non-cacheable depending on whether you get the HW, which
might be trouble for some apps unless you are careful.

(*) The kernel always keeps a cacheable linear mapping of all memory, and
the nasty prefetchers or speculative execution units might thus bring
something from your page (otherwise mapped non-cacheable into userspace)
into the cache that way. Some CPUs will choke badly if you access via a
non-cacheable mapping something that is present in the cache.

Ben.





Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Thomas Hellström




Benjamin Herrenschmidt wrote:

  
Yes, this is really a different hardware model than we're used to 
dealing with for DRI drivers, however it's not a problem for the most 
part - if you don't need to take the lock, don't.  But then you need 
some other way of dealing with the other hacky stuff we get away with by 
   lock abuses eg. VT switching.

  
  
We could abuse a RW lock model here where normal command submission
takes a read lock (can be shared) and big hammer like VT switch takes a
write lock.

  
  
For the memory manager, I guess there are two choices:  1) make the 
driver use a command-buffer approach even though the hardware supports 
per-context ring buffers, or 2) extend the memory manager.

Extending the memory manager would involve adding an ability to lock and 
unlock surfaces to VRAM/AGP addresses - this would require kernel 
interaction I guess.  The driver would have to lock the surfaces then be 
free to submit commands to the ring, then explicitly unlock the 
surfaces.  This is actually a pretty nasty approach - it makes it very 
hard to deal with low-memory situations - it's impossible to kick out 
another processes allocations.

I wonder how NV deals with this...

  
  
I've heard some of the proprietary drivers play MMU tricks. We could do
something similar... when kicking a pixmap out, we "remap" the virtual
mapping for that pixmap to backup memory. But that means fundamentally
changing our model where we have a big mapping for the fb which we then
cut into objects and instead mmap objects separately. As for kicking out
mappings behind somebody's back, it works fine :) We do that for SPEs
local stores on the Cell processor. A no_page() function will refill as
needed from either the HW or the backing store and use
unmap_mapping_range() can kick that out behind any process back. The
main problem I see with that approach is that you have to either map the
backup memory non-cacheable which can be trouble on some platforms (*)
or cacheable which means that the same bit of memory will either be
cacheable or non-cacheable depending on wether you get the HW, which
might be trouble to some apps unless you are careful.

(*) The kernel always keeps a cacheable linear mapping of all memory and
the nasty prefetchers or speculative execution units might thus bring
something from your page otherwise mapped non-cacheable into userspace
in the cache that way. Some CPUs will shoke badly if you access via a
non-cacheable mapping something that is present in the cache.

Ben.

  

Actually, the TTM memory manager already does this, 
but also changes the caching policy of the linear kernel map.

Unfortunately this leads to rather costly cache and TLB flushes.
Particularly on SMP.

I think Keith was referring to the drawbacks with buffers pinned in AGP
or VRAM space.

/Thomas.



  







Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Benjamin Herrenschmidt

 Actually, the TTM memory manager already does this, 
 but also changes the caching policy of the linear kernel map.

The latter is not portable, unfortunately, and can have other serious
performance impacts.

Typically, the kernel linear map is mapped using larger page sizes, or
in some cases even large TLB entries or separate translation registers
(like BATs). Thus you cannot change the caching policy of a single 4k
page. Also, on some processors, you can't just break a single large page
down into small pages either. For example, on desktop PowerPC, entire
segments of 256M can have only one page size. Even x86 might have some
interesting issues here...

 Unfortunately this leads to rather costly cache and TLB flushes.
 Particularly on SMP.

Yup.

 I think Keith was referring to the drawbacks with buffers pinned in
 AGP or VRAM space.
 
 /Thomas.
 
 
  
  -
  Using Tomcat but need to do more? Need to support web services, security?
  Get stuff done quickly with pre-integrated technology to make your job 
  easier
  Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
  http://sel.as-us.falkag.net/sel?cmd=lnkkid=120709bid=263057dat=121642
  --
  ___
  Dri-devel mailing list
  Dri-devel@lists.sourceforge.net
  https://lists.sourceforge.net/lists/listinfo/dri-devel

 


-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Ian Romanick

Keith Whitwell wrote:
 
 Extending the memory manager would involve adding an ability to lock and 
 unlock surfaces to VRAM/AGP addresses - this would require kernel 
 interaction I guess.  The driver would have to lock the surfaces then be 
 free to submit commands to the ring, then explicitly unlock the 
 surfaces.  This is actually a pretty nasty approach - it makes it very 
 hard to deal with low-memory situations - it's impossible to kick out 
 another process's allocations.

This was similar to how I had designed the memory manager that I never
finished.  In order to make this work, you have to set a fence when you
unlock the surface.  You can still kick out surfaces that were locked by
another process, but you have to wait until the fence has passed.
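
Roughly like this (hypothetical structures; wait_for_fence() stands in
for however the driver waits on its breadcrumb/sequence numbers):

#include <linux/types.h>
#include <linux/errno.h>

extern void wait_for_fence(u32 seqno);	/* hypothetical: wait until HW passes seqno */

struct surface {
	u32 unlock_fence;	/* last seqno that may still reference the surface */
	int locked;
};

static void surface_unlock(struct surface *s, u32 current_seqno)
{
	s->unlock_fence = current_seqno;
	s->locked = 0;
}

/* Called on behalf of another process that needs the VRAM/AGP space. */
static int surface_evict(struct surface *s)
{
	if (s->locked)
		return -EBUSY;		/* still locked to a ring */
	wait_for_fence(s->unlock_fence);
	/* now safe to copy out and reuse the space */
	return 0;
}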


Re: DRM memory manager on cards with hardware contexts

2006-09-18 Thread Benjamin Herrenschmidt
On Mon, 2006-09-18 at 16:46 +0200, Thomas Hellström wrote:

 Unfortunately this leads to rather costly cache and TLB flushes.
 Particularly on SMP.
 
 I think Keith was referring to the drawbacks with buffers pinned in
 AGP or VRAM space.

What about a futex-like approach:

A shared area mapped by both kernel and user has locks for the buffers.
When submitting a command involving a buffer, userland tries to lock it.
This is a simple atomic operation in user space. If that fails (the lock
for that buffer is held, possibly by the kernel, or the buffer is
swapped out), then it does an ioctl to the DRM to get access (which
involves sleeping until the buffer can be retrieved).

Once the operation is complete, the app can release the locks on the
buffers it holds. In fact, if there is a mapping from buffers to objects,
for cards like nVidia with objects and notifiers, the kernel could
auto-unlock objects when the completion interrupt for them occurs.
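
The userspace fast path would look something like this (the ioctl, the
shared-area layout and all the names are made up; GCC atomic builtins
stand in for whatever atomic op the platform provides):

#include <sys/ioctl.h>

#define BUF_UNLOCKED	0
#define BUF_LOCKED	1

struct buffer_lock {
	volatile int lock;	/* lives in the area mapped by both kernel and user */
};

/* hypothetical ioctl: sleep until the buffer is resident and locked for us */
#define DRM_IOCTL_WAIT_BUFFER	_IOW('d', 0x40, unsigned int)

static int buffer_lock(int drm_fd, struct buffer_lock *bl, unsigned int handle)
{
	/* fast path: uncontended and resident, one atomic op, no syscall */
	if (__sync_bool_compare_and_swap(&bl->lock, BUF_UNLOCKED, BUF_LOCKED))
		return 0;

	/* slow path: held (possibly by the kernel) or swapped out */
	return ioctl(drm_fd, DRM_IOCTL_WAIT_BUFFER, &handle);
}

static void buffer_unlock(struct buffer_lock *bl)
{
	/* the kernel could equally do this from the completion interrupt */
	__sync_lock_release(&bl->lock);
}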

Ben.


-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


DRM memory manager on cards with hardware contexts

2006-09-17 Thread Stephane Marchesin
Hello,

Before explaining the issue, let me first introduce the context a bit. 
We are working on hardware that supports multiple contexts. By 
allocating one context to each application, we can achieve full graphics 
command submission from user space (each context is actually simply a 
command fifo and its control registers, both of which are mapped into 
the application's address space).

Obviously, we are interested in making use of the new DRM memory manager 
on that hardware. Now, if I understand how it works correctly, this new 
memory manager allocates opaque handles which are not to be used as 
offsets in memory, because that is not what they are. Therefore, a 
translation from the handle into a proper memory address has to be done 
before the commands are sent to the hardware. This is easy to add when 
the DRM validates all the commands.

However, in our model the DRM cannot do so, since it does not 
validate each and every command that is sent. So we need to be able to 
translate the handles into real offsets from user space, and know for 
sure that these offsets will remain valid for as long as the hardware 
will need them. And that's where we're out of ideas. Do you guys think 
it's possible to work around that?
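
To make the requirement concrete, something like this purely
hypothetical interface is what we would need (none of these ioctls or
structures exist today):

#include <sys/ioctl.h>
#include <stdint.h>

struct drm_pin_buffer {
	uint32_t handle;	/* opaque handle from the memory manager */
	uint64_t offset;	/* out: VRAM/AGP offset, valid while pinned */
};

struct drm_unpin_buffer {
	uint32_t handle;
	uint32_t fence;		/* seqno after which the offset may be recycled */
};

#define DRM_IOCTL_PIN_BUFFER	_IOWR('d', 0x41, struct drm_pin_buffer)
#define DRM_IOCTL_UNPIN_BUFFER	_IOW('d', 0x42, struct drm_unpin_buffer)

/* From the user-space driver, before emitting to the context fifo. */
static uint64_t pin_and_translate(int drm_fd, uint32_t handle)
{
	struct drm_pin_buffer pin = { .handle = handle };

	if (ioctl(drm_fd, DRM_IOCTL_PIN_BUFFER, &pin) < 0)
		return 0;	/* error handling omitted in this sketch */
	return pin.offset;	/* stable until the matching unpin fence passes */
}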

Stephane

PS: before everyone jumps in and says that this model is insecure, let 
me add that the hardware can enforce memory access


-
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnkkid=120709bid=263057dat=121642
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: DRM memory manager on cards with hardware contexts

2006-09-17 Thread Dave Airlie

 Obviously, we are interested in making use of the new DRM memory manager
 on that hardware. Now, if I understand how it works correctly, this new
 memory manager allocates opaque handles which are not to be used as
 offsets in memory, because that is not what they are. Therefore, a
 translation from the handle into a proper memory address has to be done
 before the commands are sent to the hardware. This is easy to add when
 the DRM validates all the commands.

Also, the multiple contexts mean taking the DRM lock is not something we 
would want to be doing on a regular basis... I've got some ideas, 
already discussed with Stephane, but I'd like to see what other methods 
people might have...

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
Linux kernel - DRI, VAX / pam_smb / ILUG


-
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnkkid=120709bid=263057dat=121642
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel