Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-08-03 Thread Michel Thierry

On 7/29/2015 5:23 PM, Michel Thierry wrote:

Michel Thierry (19):
   drm/i915: Remove unnecessary gen8_clamp_pd
   drm/i915/gen8: Make pdp allocation more dynamic
   drm/i915/gen8: Abstract PDP usage
   drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
   drm/i915/gen8: Add dynamic page trace events
   drm/i915/gen8: Add PML4 structure
   drm/i915/gen8: implement alloc/free for 4lvl
   drm/i915/gen8: Add 4 level switching infrastructure and lrc support
   drm/i915/gen8: Pass sg_iter through pte inserts
   drm/i915/gen8: Add 4 level support in insert_entries and clear_range
   drm/i915/gen8: Initialize PDPs and PML4
   drm/i915: Expand error state's address width to 64b
   drm/i915/gen8: Add ppgtt info and debug_dump
   drm/i915: object size needs to be u64
   drm/i915: batch_obj vm offset must be u64
   drm/i915/userptr: Kill user_size limit check
   drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
   drm/i915/gen8: Flip the 48b switch
   drm/i915: Save some page table setup on repeated binds

  drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
  drivers/gpu/drm/i915/i915_drv.h            |  11 +-
  drivers/gpu/drm/i915/i915_gem.c            |  30 +-
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
  drivers/gpu/drm/i915/i915_gem_gtt.c        | 665 -
  drivers/gpu/drm/i915/i915_gem_gtt.h        |  64 ++-
  drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
  drivers/gpu/drm/i915/i915_gpu_error.c      |  24 +-
  drivers/gpu/drm/i915/i915_params.c         |   2 +-
  drivers/gpu/drm/i915/i915_reg.h            |   1 +
  drivers/gpu/drm/i915/i915_trace.h          |  32 +-
  drivers/gpu/drm/i915/intel_lrc.c           |  60 ++-
  include/uapi/drm/i915_drm.h                |   3 +-
  13 files changed, 747 insertions(+), 180 deletions(-)

--
2.4.5



Hi Daniel,

Finally, all the patches have Akash's r-b.
Since there were still some small changes from him and Chris, I addressed
them individually (instead of resending the whole series one more time).


Below are the msg-ids of the latest version of each patch, in case there
is any doubt about which ones to merge.


Note that the last patch (drm/i915: Save some page table setup on repeated
binds) is an optimization Akash recommended, which is why he didn't review
it. Do you have someone in mind to check it? Or should I ask around for
volunteers?


Thanks,

-Michel

[01/19] drm/i915: Remove unnecessary gen8_clamp_pd
1438187043-34267-2-git-send-email-michel.thie...@intel.com

[02/19] drm/i915/gen8: Make pdp allocation more dynamic
1438187043-34267-3-git-send-email-michel.thie...@intel.com

[03/19] drm/i915/gen8: Abstract PDP usage
1438250523-22533-1-git-send-email-michel.thie...@intel.com

[04/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
1438250569-22618-1-git-send-email-michel.thie...@intel.com

[05/19] drm/i915/gen8: Add dynamic page trace events
1438187043-34267-6-git-send-email-michel.thie...@intel.com

[06/19] drm/i915/gen8: Add PML4 structure
1438591921-3087-1-git-send-email-michel.thie...@intel.com

[07/19] drm/i915/gen8: implement alloc/free for 4lvl
1438250729-22955-1-git-send-email-michel.thie...@intel.com

[08/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support
1438250783-23118-1-git-send-email-michel.thie...@intel.com

[09/19] drm/i915/gen8: Pass sg_iter through pte inserts
1438591967-3249-1-git-send-email-michel.thie...@intel.com

[10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range
1438592007-3354-1-git-send-email-michel.thie...@intel.com

[11/19] drm/i915/gen8: Initialize PDPs and PML4
1438187043-34267-12-git-send-email-michel.thie...@intel.com

[12/19] drm/i915: Expand error state's address width to 64b
1438187043-34267-13-git-send-email-michel.thie...@intel.com

[13/19] drm/i915/gen8: Add ppgtt info and debug_dump
1438187043-34267-14-git-send-email-michel.thie...@intel.com

[14/19] drm/i915: object size needs to be u64
1438187043-34267-15-git-send-email-michel.thie...@intel.com

[15/19] drm/i915: batch_obj vm offset must be u64
1438187043-34267-16-git-send-email-michel.thie...@intel.com

[16/19] drm/i915/userptr: Kill user_size limit check
1438187043-34267-17-git-send-email-michel.thie...@intel.com

[17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
1438187043-34267-18-git-send-email-michel.thie...@intel.com

[18/19] drm/i915/gen8: Flip the 48b switch
1438346110-18985-1-git-send-email-michel.thie...@intel.com

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-07-30 Thread Michel Thierry

On 7/30/2015 12:26 PM, Chris Wilson wrote:

On Wed, Jul 29, 2015 at 05:23:44PM +0100, Michel Thierry wrote:

This clean-up version delays the 48-bit work to later patches and includes
more review comments from Akash and Chris. The first 5 patches prepare the
dynamic page allocation code to handle independent pdps, but no specific
code for 48-bit mode is added before the 5th patch.

In order to expand the GPU address space, a 4th level translation is added,
the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
PML4[0-511], each pointing to a PDP. All the existing dynamic alloc
ppgtt functions are used, only adding the 4th level changes. I also
updated some remaining variables that were 32b only.

There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can
be allocated outside the first 4 PDPs; if not, the end range is forced to 4GB.
Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit function
that will flag these objects, while the existing drm_intel_bo_emit_reloc
clears it.

Finally, this feature is only available in BDW and Gen9, requires LRC
submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.

Also note that this expanded address space is only available for full
PPGTT; aliasing PPGTT and Global GTT remain 32-bit.

I'll resend the userland patches (libdrm/mesa) in a different patchset; there
haven't been changes to them, but they require a rebase. I will also expand the
ppgtt igt test per Chris's suggestions.


Just a heads-up, I haven't root-caused this yet, but with
i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
before this series...
-Chris



Sounds like I screwed up something in the first 4 patches or in the
Wa32bit one. The rest of the changes are confined to 48-bit code.


Have you found a way to reproduce it?

Thanks,

-Michel


Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-07-30 Thread Chris Wilson
On Thu, Jul 30, 2015 at 12:52:19PM +0100, Michel Thierry wrote:
 On 7/30/2015 12:26 PM, Chris Wilson wrote:
 Just a heads-up, I haven't root-caused this yet, but with
 i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
 before this series...
 
 Sounds like I screwed up something in the first 4 patches or in the
 Wa32bit one. The rest of the changes are confined to 48-bit code.

It's also likely to be BDW-specific, since I've been running the same
kernel on snb/ivb/hsw without issue. I just thought I would do a quick
comparison of ppgtt=3 against ppgtt=2 when the problems started.
 
 Have you found a way to reproduce it?

It was in the middle of the UE4 Reflections demo, though it had run
through a sample of other tests seemingly without issue.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-07-30 Thread Chris Wilson
On Thu, Jul 30, 2015 at 12:52:19PM +0100, Michel Thierry wrote:
 Sounds like I screwed up something in the first 4 patches or in the
 Wa32bit one. The rest of the changes are confined to 48-bit code.
 
 Have you found a way to reproduce it?

Seems like no. Whatever happened this morning, it hasn't happened since
prepping the tree for a bisect (recompiling and retesting last known
bad/good).

Panic over for the time being.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-07-30 Thread Chris Wilson
On Wed, Jul 29, 2015 at 05:23:44PM +0100, Michel Thierry wrote:
 This clean-up version delays the 48-bit work to later patches and includes
 more review comments from Akash and Chris. The first 5 patches prepare the
 dynamic page allocation code to handle independent pdps, but no specific
 code for 48-bit mode is added before the 5th patch.
 
 In order to expand the GPU address space, a 4th level translation is added,
 the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
 PML4[0-511], each pointing to a PDP. All the existing dynamic alloc
 ppgtt functions are used, only adding the 4th level changes. I also
 updated some remaining variables that were 32b only.
 
 There are 2 hardware workarounds needed to allow correct operation with
 48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
 A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can
 be allocated outside the first 4 PDPs; if not, the end range is forced to 4GB.
 Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
 compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit function
 that will flag these objects, while the existing drm_intel_bo_emit_reloc
 clears it.
 
 Finally, this feature is only available in BDW and Gen9, requires LRC
 submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.
 
 Also note that this expanded address space is only available for full
 PPGTT; aliasing PPGTT and Global GTT remain 32-bit.
 
 I'll resend the userland patches (libdrm/mesa) in a different patchset; there
 haven't been changes to them, but they require a rebase. I will also expand
 the ppgtt igt test per Chris's suggestions.

Just a heads-up, I haven't root-caused this yet, but with
i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
before this series...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

2015-07-29 Thread Michel Thierry
This clean-up version delays the 48-bit work to later patches and includes
more review comments from Akash and Chris. The first 5 patches prepare the
dynamic page allocation code to handle independent pdps, but no specific
code for 48-bit mode is added before the 5th patch.

In order to expand the GPU address space, a 4th level translation is added,
the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
PML4[0-511], each pointing to a PDP. All the existing dynamic alloc
ppgtt functions are used, only adding the 4th level changes. I also
updated some remaining variables that were 32b only.
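
To illustrate the 4-level walk described above, here is a minimal sketch (not
code from this series) of how a 48-bit GPU virtual address could be split into
its table indices. It assumes 4 KiB pages and 512 entries (9 bits) at every
level; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, 512 entries per table level. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_LEVEL_BITS  9
#define SKETCH_LEVEL_MASK  ((1u << SKETCH_LEVEL_BITS) - 1)

/* Hypothetical container for the four table indices plus the page offset. */
struct ppgtt_indices {
	unsigned int pml4e;  /* bits 47:39, selects one of 512 PDPs */
	unsigned int pdpe;   /* bits 38:30, selects a page directory */
	unsigned int pde;    /* bits 29:21, selects a page table */
	unsigned int pte;    /* bits 20:12, selects a 4 KiB page */
	unsigned int offset; /* bits 11:0, byte offset inside the page */
};

/* Split a 48-bit address into per-level indices (hypothetical helper). */
static struct ppgtt_indices split_48b_address(uint64_t addr)
{
	struct ppgtt_indices idx;

	/* Anything at or above 1 << 48 is outside the 48-bit space. */
	assert(addr < (1ULL << 48));

	idx.offset = (unsigned int)(addr & ((1u << SKETCH_PAGE_SHIFT) - 1));
	idx.pte    = (unsigned int)((addr >> 12) & SKETCH_LEVEL_MASK);
	idx.pde    = (unsigned int)((addr >> 21) & SKETCH_LEVEL_MASK);
	idx.pdpe   = (unsigned int)((addr >> 30) & SKETCH_LEVEL_MASK);
	idx.pml4e  = (unsigned int)((addr >> 39) & SKETCH_LEVEL_MASK);

	return idx;
}
```

With these assumptions, address 1ULL << 32 (4GB) yields pml4e 0 and pdpe 4,
i.e. the first PDP entry beyond the four that cover a 32-bit address space.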

There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can
be allocated outside the first 4 PDPs; if not, the end range is forced to 4GB.
Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit function
that will flag these objects, while the existing drm_intel_bo_emit_reloc
clears it.
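
The placement rule above can be sketched roughly as follows. This is only an
illustration of the "force the end range to 4GB" idea, not the code in the
patch: the helper name and the SKETCH_ flag bit are made up for the example,
and the real flag value should be taken from i915_drm.h.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bit for this sketch only; not taken from the uapi header. */
#define SKETCH_SUPPORTS_48B_ADDRESS (1u << 3)

#define SKETCH_4GB (1ULL << 32)

/*
 * Hypothetical helper: return the upper bound of the address range in
 * which an object may be placed. Objects not flagged as 48b-capable are
 * limited to the first 4GB; flagged objects may use the whole VM.
 */
static uint64_t sketch_placement_end(uint64_t vm_total, unsigned int exec_flags)
{
	assert(vm_total > 0);

	if (exec_flags & SKETCH_SUPPORTS_48B_ADDRESS)
		return vm_total;

	/* Never report more than the VM actually has. */
	return vm_total < SKETCH_4GB ? vm_total : SKETCH_4GB;
}
```

In this sketch an unflagged object in a 48-bit VM gets an end of 4GB, while
the same object in a smaller (for example 2GB) VM is bounded by the VM size.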

Finally, this feature is only available in BDW and Gen9, requires LRC
submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.

Also note that this expanded address space is only available for full
PPGTT; aliasing PPGTT and Global GTT remain 32-bit.

I'll resend the userland patches (libdrm/mesa) in a different patchset; there
haven't been changes to them, but they require a rebase. I will also expand the
ppgtt igt test per Chris's suggestions.

Michel Thierry (19):
  drm/i915: Remove unnecessary gen8_clamp_pd
  drm/i915/gen8: Make pdp allocation more dynamic
  drm/i915/gen8: Abstract PDP usage
  drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
  drm/i915/gen8: Add dynamic page trace events
  drm/i915/gen8: Add PML4 structure
  drm/i915/gen8: implement alloc/free for 4lvl
  drm/i915/gen8: Add 4 level switching infrastructure and lrc support
  drm/i915/gen8: Pass sg_iter through pte inserts
  drm/i915/gen8: Add 4 level support in insert_entries and clear_range
  drm/i915/gen8: Initialize PDPs and PML4
  drm/i915: Expand error state's address width to 64b
  drm/i915/gen8: Add ppgtt info and debug_dump
  drm/i915: object size needs to be u64
  drm/i915: batch_obj vm offset must be u64
  drm/i915/userptr: Kill user_size limit check
  drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  drm/i915/gen8: Flip the 48b switch
  drm/i915: Save some page table setup on repeated binds

 drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
 drivers/gpu/drm/i915/i915_drv.h            |  11 +-
 drivers/gpu/drm/i915/i915_gem.c            |  30 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 665 -
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  64 ++-
 drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
 drivers/gpu/drm/i915/i915_gpu_error.c      |  24 +-
 drivers/gpu/drm/i915/i915_params.c         |   2 +-
 drivers/gpu/drm/i915/i915_reg.h            |   1 +
 drivers/gpu/drm/i915/i915_trace.h          |  32 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  60 ++-
 include/uapi/drm/i915_drm.h                |   3 +-
 13 files changed, 747 insertions(+), 180 deletions(-)

-- 
2.4.5
