Re: tracking P->V for unmanaged device pages

2015-03-26 Thread J. Hannken-Illjes

On 26 Mar 2015, at 13:13, Taylor R Campbell 
 wrote:

> Various DRM graphics drivers, including Intel, Radeon, and Nouveau,
> sometimes need to unmap all virtual mappings of certain physical
> pages for which there is no struct vm_page.  The issue is explained in
> detail here:
> 
> https://mail-index.netbsd.org/tech-kern/2014/07/23/msg017392.html
> 
> It's not desirable to simply add struct vm_pages on a freelist that
> uvm_pagealloc ignores, because struct vm_page is large (120 bytes on
> amd64, for example), most of it is unnecessary for P->V tracking, and
> the physical regions that need P->V tracking are large (hundreds of
> megabytes, or gigabytes).
> 
> The attached patch implements the following extension to pmap(9) on
> x86 and uses it in DRM[*].  The implementation uses a linear list of
> pv-tracked ranges, since it is expected to be short (one to three
> elements).  The list is managed with pserialize(9) so it adds no
> locking overhead to existing pmap operations that need to look up
> entries in it.

If you plan to use pserialize here it is time to change
pserialize_perform() to not kpause() every xc_broadcast.

Currently pserialize_perform() takes at least two ticks to complete.

It should be modified as:

i = 0;
do {
mutex_spin_exit(&psz_lock);
xc = xc_broadcast(XC_HIGHPRI, (xcfunc_t)nullop, NULL, NULL);
xc_wait(xc);
if (i++ > 1)
kpause("psrlz", false, 1, NULL);
mutex_spin_enter(&psz_lock);
} while (!kcpuset_iszero(psz->psz_target));

to become faster in the usual "two activity switches" case.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: tracking P->V for unmanaged device pages

2015-03-26 Thread Taylor R Campbell
   Date: Thu, 26 Mar 2015 15:30:01 +0100
   From: "J. Hannken-Illjes" 

   On 26 Mar 2015, at 13:13, Taylor R Campbell 
 wrote:

   > The attached patch implements the following extension to pmap(9) on
   > x86 and uses it in DRM[*].  The implementation uses a linear list of
   > pv-tracked ranges, since it is expected to be short (one to three
   > elements).  The list is managed with pserialize(9) so it adds no
   > locking overhead to existing pmap operations that need to look up
   > entries in it.

   If you plan to use pserialize here it is time to change
   pserialize_perform() to not kpause() every xc_broadcast.

   Currently pserialize_perform() takes at least two ticks to complete.

No objection to making pserialize(9) faster, but for this P->V
tracking code, pserialize_perform happens only when you are, e.g.,
detaching a device -- which is about the least performance-critical
path I can imagine.


Re: tracking P->V for unmanaged device pages

2015-04-07 Thread Masao Uebayashi
- struct vm_physseg is MI representation of a continuous physical
address region.  It is currently allocated only for main memories, but
apertures should have one.
- struct vm_page is MI representation of memory pages that have
backing stores; IOW, a state between memory page and its backing
store.
- PV is state that a physical page address is mapped to a virtual address.

Considering these, PV should be kept in struct vm_physseg, instead of
struct vm_page, ideally.  PV should be also an MI struct, like struct
vm_pv.

(ISTR yamt@ had an idea to link PVs from vm_map_entries for obvious reasons.)

If you once have such a right design, you no longer need to have a
separate API to track PV, because pmap_enter() is given sufficient
information to decide whether PV is tracked or not; P/V addresses and
flags telling how to map the new virtual address (cache enabled, PAT
used, etc.).

On Thu, Mar 26, 2015 at 9:13 PM, Taylor R Campbell
 wrote:
> Various DRM graphics drivers, including Intel, Radeon, and Nouveau,
> sometimes need to unmap all virtual mappings of certain physical
> pages for which there is no struct vm_page.  The issue is explained in
> detail here:
>
> https://mail-index.netbsd.org/tech-kern/2014/07/23/msg017392.html
>
> It's not desirable to simply add struct vm_pages on a freelist that
> uvm_pagealloc ignores, because struct vm_page is large (120 bytes on
> amd64, for example), most of it is unnecessary for P->V tracking, and
> the physical regions that need P->V tracking are large (hundreds of
> megabytes, or gigabytes).
>
> The attached patch implements the following extension to pmap(9) on
> x86 and uses it in DRM[*].  The implementation uses a linear list of
> pv-tracked ranges, since it is expected to be short (one to three
> elements).  The list is managed with pserialize(9) so it adds no
> locking overhead to existing pmap operations that need to look up
> entries in it.
>
> core@ discussed the problem and asked that this approach be marked as
> an interim solution, because not everyone was happy about it but
> nobody had an obviously better idea.
>
> Objections?
>
>
> voidpmap_pv_init(void);
>
>   Initialize the pmap_pv(9) subsystem.  Called by uvm_init.
>
> voidpmap_pv_track(paddr_t startpa, paddr_t size)
>
>   Do pv-tracking for the unmanaged pages in [startpa, startpa + size).
>   Called by a driver on initialization to register device pages.  Can
>   be done only once for any given [startpa, startpa + size) range,
>   with no overlapping allowed.  The range must be page-aligned.
>
> voidpmap_pv_untrack(paddr_t startpa, paddr_t size)
>
>   Stop doing pv-tracking for the pages in [startpa, startpa + size).
>
> voidpmap_pv_protect(paddr_t pa, vm_prot_t prot)
>
>   Reduce page protection of pv-tracked unmanaged page at pa to prot.
>   pa must be page-aligned.
>
>
> [*] This predictably fixes some rendering issues but seems to expose
> another bug causing the X server to crash more frequently, which is
> under investigation now.


Re: tracking P->V for unmanaged device pages

2015-04-08 Thread Taylor R Campbell
   Date: Wed, 8 Apr 2015 12:29:13 +0900
   From: Masao Uebayashi 

   Considering these, PV should be kept in struct vm_physseg, instead of
   struct vm_page, ideally.

The only difficulty with that is that you still need a fast way to go
from a struct vm_page to the PV tracking record.  Since vm_page has no
pointer to vm_physseg, you'd need to add one in struct vm_page_md.
But then you might as well just include the PV tracking record in the
struct vm_page_md anyway.

   PV should be also an MI struct, like struct vm_pv.

No objection here.  I think everyone would like to see MI code for PV
tracking instead of having copypasta in every pmap.  But someone has
to do the work.

   If you once have such a right design, you no longer need to have a
   separate API to track PV, because pmap_enter() is given sufficient
   information to decide whether PV is tracked or not; P/V addresses and
   flags telling how to map the new virtual address (cache enabled, PAT
   used, etc.).

In my pmap_pv API, there's no need to change pmap_enter.  All you need
to do is to call pmap_pv_track on driver attach to mark the pages as
device pages.  Then pmap can do the bookkeeping inside.

It sounds like what you're proposing is to replace pmap_pv_track by a
version of uvm_page_physload that will not allocate struct vm_page for
each page in the region, but only struct vm_pv.  I have no objection
to this either, but someone has to do the work.


Re: tracking P->V for unmanaged device pages

2015-04-08 Thread Taylor R Campbell
   Date: Thu, 9 Apr 2015 09:45:15 +0900
   From: Masao Uebayashi 

   On Wed, Apr 8, 2015 at 11:34 PM, Taylor R Campbell
wrote:
   >Date: Wed, 8 Apr 2015 12:29:13 +0900
   >From: Masao Uebayashi 
   >
   >Considering these, PV should be kept in struct vm_physseg, instead of
   >struct vm_page, ideally.
   >
   > The only difficulty with that is that you still need a fast way to go
   > from a struct vm_page to the PV tracking record.  Since vm_page has no
   > pointer to vm_physseg, you'd need to add one in struct vm_page_md.
   > But then you might as well just include the PV tracking record in the
   > struct vm_page_md anyway.

   If you know a) the base address of struct vm_physseg, and b) the
   offset within it, you can look up the matching PV entry (in vector or
   table or whatever) fast.

Yes -- that's what I meant by `Since vm_page has no pointer to
vm_physseg, you'd need to add one in struct vm_page_md.'  Of course,
it could instead be put in the MI part of struct vm_page, &c.

   > In my pmap_pv API, there's no need to change pmap_enter.  All you need
   > to do is to call pmap_pv_track on driver attach to mark the pages as
   > device pages.  Then pmap can do the bookkeeping inside.

   You don't need to change pmap_enter(), but you need to change fault
   handlers.  That didn't really work for XIP, which has to share the
   generic fault handler.

I don't understand.  Why do you need to change fault handlers?  I
think the only real reason the drm drivers need custom fault handlers
is that they need to hold a lock over pinning graphics buffers in GPU
VA and entering the corresponding aperture PAs into the pmap, and
neither the cdev_mmap nor pgo_getpages/putpages abstraction supports
this.

Is the uebayasi-xip branch relevant?  Are there changes there which
might be instructive?

   > It sounds like what you're proposing is to replace pmap_pv_track by a
   > version of uvm_page_physload that will not allocate struct vm_page for
   > each page in the region, but only struct vm_pv.  I have no objection
   > to this either, but someone has to do the work.

   It doesn't need to be done quickly.  I just want to know the
   direction.  The abstraction of physical address and memory in UVM is
   very poor ATM.  I want to change that, and want awareness of the
   problem.  (Future memory technologies like persistent memory is coming
   soon.)

I don't have enough background to make any statements about the
direction.


Re: tracking P->V for unmanaged device pages

2015-04-08 Thread Masao Uebayashi
On Wed, Apr 8, 2015 at 11:34 PM, Taylor R Campbell
 wrote:
>Date: Wed, 8 Apr 2015 12:29:13 +0900
>From: Masao Uebayashi 
>
>Considering these, PV should be kept in struct vm_physseg, instead of
>struct vm_page, ideally.
>
> The only difficulty with that is that you still need a fast way to go
> from a struct vm_page to the PV tracking record.  Since vm_page has no
> pointer to vm_physseg, you'd need to add one in struct vm_page_md.
> But then you might as well just include the PV tracking record in the
> struct vm_page_md anyway.

If you know a) the base address of struct vm_physseg, and b) the
offset within it, you can look up the matching PV entry (in vector or
table or whatever) fast.

>PV should be also an MI struct, like struct vm_pv.
>
> No objection here.  I think everyone would like to see MI code for PV
> tracking instead of having copypasta in every pmap.  But someone has
> to do the work.

Yes.

>If you once have such a right design, you no longer need to have a
>separate API to track PV, because pmap_enter() is given sufficient
>information to decide whether PV is tracked or not; P/V addresses and
>flags telling how to map the new virtual address (cache enabled, PAT
>used, etc.).
>
> In my pmap_pv API, there's no need to change pmap_enter.  All you need
> to do is to call pmap_pv_track on driver attach to mark the pages as
> device pages.  Then pmap can do the bookkeeping inside.

You don't need to change pmap_enter(), but you need to change fault
handlers.  That didn't really work for XIP, which has to share the
generic fault handler.

> It sounds like what you're proposing is to replace pmap_pv_track by a
> version of uvm_page_physload that will not allocate struct vm_page for
> each page in the region, but only struct vm_pv.  I have no objection
> to this either, but someone has to do the work.

It doesn't need to be done quickly.  I just want to know the
direction.  The abstraction of physical address and memory in UVM is
very poor ATM.  I want to change that, and want awareness of the
problem.  (Future memory technologies like persistent memory is coming
soon.)


Re: tracking P->V for unmanaged device pages

2015-04-08 Thread Masao Uebayashi
On Thu, Apr 9, 2015 at 11:46 AM, Taylor R Campbell
 wrote:
>> In my pmap_pv API, there's no need to change pmap_enter.  All you need
>> to do is to call pmap_pv_track on driver attach to mark the pages as
>> device pages.  Then pmap can do the bookkeeping inside.
>
>You don't need to change pmap_enter(), but you need to change fault
>handlers.  That didn't really work for XIP, which has to share the
>generic fault handler.
>
> I don't understand.  Why do you need to change fault handlers?  I
> think the only real reason the drm drivers need custom fault handlers
> is that they need to hold a lock over pinning graphics buffers in GPU
> VA and entering the corresponding aperture PAs into the pmap, and
> neither the cdev_mmap nor pgo_getpages/putpages abstraction supports
> this.

Sorry, I misunderstood...  pmap_pv_tracked() is called from
pmap_enter() so it's already doing what I wanted (the name
pmap_pv_tracked() is not great though).

> Is the uebayasi-xip branch relevant?  Are there changes there which
> might be instructive?

Not really.

>> It sounds like what you're proposing is to replace pmap_pv_track by a
>> version of uvm_page_physload that will not allocate struct vm_page for
>> each page in the region, but only struct vm_pv.  I have no objection
>> to this either, but someone has to do the work.
>
>It doesn't need to be done quickly.  I just want to know the
>direction.  The abstraction of physical address and memory in UVM is
>very poor ATM.  I want to change that, and want awareness of the
>problem.  (Future memory technologies like persistent memory is coming
>soon.)
>
> I don't have enough background to make any statements about the
> direction.


Re: tracking P->V for unmanaged device pages

2015-04-14 Thread Reinoud Zandijk
Hi Taylor,

On Thu, Mar 26, 2015 at 12:13:56PM +, Taylor R Campbell wrote:
> Various DRM graphics drivers, including Intel, Radeon, and Nouveau,
> sometimes need to unmap all virtual mappings of certain physical
> pages for which there is no struct vm_page.  The issue is explained in
> detail here:
...

> Index: sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c
> ===
> RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c,v
> retrieving revision 1.6
> diff -p -u -r1.6 ttm_bo.c
> --- sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c   18 Aug 2014 02:43:27 
> -  1.6
> +++ sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c   23 Mar 2015 13:13:57 
> -
> @@ -1611,11 +1611,16 @@ void ttm_bo_unmap_virtual_locked(struct 
>  
>  #ifdef __NetBSD__
>   if (bo->mem.bus.is_iomem) {
> - /*
> -  * XXX OOPS!  NetBSD doesn't have a way to enumerate
> -  * and remove the virtual mappings for device addresses
> -  * or of a uvm object.
> -  */
> + paddr_t start, end, pa;
> +
> + KASSERT((bo->mem.bus.base & (PAGE_SIZE - 1)) == 0);
> + KASSERT((bo->mem.bus.offset & (PAGE_SIZE - 1)) == 0);
> + start = bo->mem.bus.base + bo->mem.bus.offset;
> + KASSERT((bo->mem.bus.size & (PAGE_SIZE - 1)) == 0);
> + end = start + bo->mem.bus.size;
> +
> + for (pa = start; pa < end; pa += PAGE_SIZE)
> + pmap_pv_protect(pa, VM_PROT_NONE);
>   } else if (bo->ttm != NULL) {
>   unsigned i;
>  

I got a panic here:
Apr 14 21:18:10 diablo savecore: reboot after panic: panic: kernel diagnostic
assertion "(bo->mem.bus.base & (PAGE_SIZE - 1)) == 0" failed: file
"../../../../external/bsd/drm2/dist/drm/ttm/ttm_bo.c", line 1618 bo bus base
addr not page-aligned: fe833a3989b0

The trace i get:
#0  0x80684ae5 in cpu_reboot ()
#1  0x808c5cf4 in vpanic ()
#2  0x80a78443 in kern_assert ()
#3  0x8091ca9a in ttm_bo_unmap_virtual_locked ()
#4  0x8091d10d in ttm_bo_unmap_virtual ()
#5  0x807cc4aa in radeon_pm_set_clocks ()
#6  0x807cd65f in radeon_pm_compute_clocks ()
#7  0x802c8548 in drm_crtc_helper_set_mode ()
#8  0x807b44d7 in radeon_connector_set_property ()
#9  0x802c64dc in drm_mode_obj_set_property_ioctl ()
#10 0x802c66fa in drm_mode_connector_property_set_ioctl ()
#11 0x802ca9da in drm_ioctl ()
#12 0x808d8799 in sys_ioctl ()
#13 0x808e37f7 in syscall ()
#14 0x80100691 in Xsyscall ()

version 1.6 of ttm_bo.c (checkout kernel source date 20150402) works fine.

I hope this will shed some light on it, no clue yet as to why yet though.

(note to myself, crash 113)

With regards,
Reinoud



Re: tracking P->V for unmanaged device pages

2015-04-14 Thread Taylor R Campbell
   Date: Tue, 14 Apr 2015 21:57:03 +0200
   From: Reinoud Zandijk 

   I got a panic here:
   Apr 14 21:18:10 diablo savecore: reboot after panic: panic: kernel diagnostic
   assertion "(bo->mem.bus.base & (PAGE_SIZE - 1)) == 0" failed: file
   "../../../../external/bsd/drm2/dist/drm/ttm/ttm_bo.c", line 1618 bo bus base
   addr not page-aligned: fe833a3989b0

Known problem, not sure why we're getting some random kernel pointer
instead of a page-aligned bus address here.  Kasserts around the
initialization of these things suggest the objects are getting
corrupted somewhere.


Re: tracking P->V for unmanaged device pages

2015-04-14 Thread Masao Uebayashi
KERNHIST is your friend. :)