Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
On 7/29/2015 5:23 PM, Michel Thierry wrote:
> Michel Thierry (19):
>   drm/i915: Remove unnecessary gen8_clamp_pd
>   drm/i915/gen8: Make pdp allocation more dynamic
>   drm/i915/gen8: Abstract PDP usage
>   drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
>   drm/i915/gen8: Add dynamic page trace events
>   drm/i915/gen8: Add PML4 structure
>   drm/i915/gen8: implement alloc/free for 4lvl
>   drm/i915/gen8: Add 4 level switching infrastructure and lrc support
>   drm/i915/gen8: Pass sg_iter through pte inserts
>   drm/i915/gen8: Add 4 level support in insert_entries and clear_range
>   drm/i915/gen8: Initialize PDPs and PML4
>   drm/i915: Expand error state's address width to 64b
>   drm/i915/gen8: Add ppgtt info and debug_dump
>   drm/i915: object size needs to be u64
>   drm/i915: batch_obj vm offset must be u64
>   drm/i915/userptr: Kill user_size limit check
>   drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
>   drm/i915/gen8: Flip the 48b switch
>   drm/i915: Save some page table setup on repeated binds
>
>  drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  11 +-
>  drivers/gpu/drm/i915/i915_gem.c            |  30 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 665 -
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |  64 ++-
>  drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  24 +-
>  drivers/gpu/drm/i915/i915_params.c         |   2 +-
>  drivers/gpu/drm/i915/i915_reg.h            |   1 +
>  drivers/gpu/drm/i915/i915_trace.h          |  32 +-
>  drivers/gpu/drm/i915/intel_lrc.c           |  60 ++-
>  include/uapi/drm/i915_drm.h                |   3 +-
>  13 files changed, 747 insertions(+), 180 deletions(-)
>
> --
> 2.4.5

Hi Daniel,

Finally all the patches have Akash's r-b. Since there were still some
small changes by him and Chris, I addressed them individually (instead of
resending the whole series one more time). Below are the msg-ids of the
last version of each patch, in case there are any doubts about which
patches to merge.

Note, the last patch (drm/i915: Save some page table setup on repeated
binds) is an optimization Akash recommended. That's why he didn't review
it. Do you have someone in mind to check it? Or should I ask around for
volunteers?

Thanks,

-Michel

[01/19] drm/i915: Remove unnecessary gen8_clamp_pd
        1438187043-34267-2-git-send-email-michel.thie...@intel.com
[02/19] drm/i915/gen8: Make pdp allocation more dynamic
        1438187043-34267-3-git-send-email-michel.thie...@intel.com
[03/19] drm/i915/gen8: Abstract PDP usage
        1438250523-22533-1-git-send-email-michel.thie...@intel.com
[04/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
        1438250569-22618-1-git-send-email-michel.thie...@intel.com
[05/19] drm/i915/gen8: Add dynamic page trace events
        1438187043-34267-6-git-send-email-michel.thie...@intel.com
[06/19] drm/i915/gen8: Add PML4 structure
        1438591921-3087-1-git-send-email-michel.thie...@intel.com
[07/19] drm/i915/gen8: implement alloc/free for 4lvl
        1438250729-22955-1-git-send-email-michel.thie...@intel.com
[08/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support
        1438250783-23118-1-git-send-email-michel.thie...@intel.com
[09/19] drm/i915/gen8: Pass sg_iter through pte inserts
        1438591967-3249-1-git-send-email-michel.thie...@intel.com
[10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range
        1438592007-3354-1-git-send-email-michel.thie...@intel.com
[11/19] drm/i915/gen8: Initialize PDPs and PML4
        1438187043-34267-12-git-send-email-michel.thie...@intel.com
[12/19] drm/i915: Expand error state's address width to 64b
        1438187043-34267-13-git-send-email-michel.thie...@intel.com
[13/19] drm/i915/gen8: Add ppgtt info and debug_dump
        1438187043-34267-14-git-send-email-michel.thie...@intel.com
[14/19] drm/i915: object size needs to be u64
        1438187043-34267-15-git-send-email-michel.thie...@intel.com
[15/19] drm/i915: batch_obj vm offset must be u64
        1438187043-34267-16-git-send-email-michel.thie...@intel.com
[16/19] drm/i915/userptr: Kill user_size limit check
        1438187043-34267-17-git-send-email-michel.thie...@intel.com
[17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
        1438187043-34267-18-git-send-email-michel.thie...@intel.com
[18/19] drm/i915/gen8: Flip the 48b switch
        1438346110-18985-1-git-send-email-michel.thie...@intel.com
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
On Thu, Jul 30, 2015 at 12:52:19PM +0100, Michel Thierry wrote:
> Sounds like I screwed up something in the first 4 patches or in the
> Wa32bit one. The rest of the changes are contained to 48-bit code.
>
> Have you found a way to reproduce it?

Seems like no. Whatever happened this morning, it hasn't happened since
prepping the tree for a bisect (recompiling and retesting last known
bad/good). Panic over for the time being.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
On Thu, Jul 30, 2015 at 12:52:19PM +0100, Michel Thierry wrote:
> On 7/30/2015 12:26 PM, Chris Wilson wrote:
> >Just a heads-up, I haven't root caused this yet, but with
> >i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
> >before this series...
>
> Sounds like I screwed up something in the first 4 patches or in the
> Wa32bit one. The rest of the changes are contained to 48-bit code.

It's also likely to be bdw specific since I've been running the same
kernel on snb/ivb/hsw without issue. I just thought I would do a quick
compare of ppgtt=3 against ppgtt=2 when the problems started.

> Have you found a way to reproduce it?

It was in the middle of the ue4 Reflections demo, though it had run
through a sample of other tests seemingly without issue.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
On 7/30/2015 12:26 PM, Chris Wilson wrote:
> On Wed, Jul 29, 2015 at 05:23:44PM +0100, Michel Thierry wrote:
> > This clean-up version delays the 48-bit work to later patches and
> > includes more review comments from Akash and Chris. The first 5
> > patches prepare the dynamic page allocation code to handle
> > independent pdps, but no specific code for 48-bit mode is added
> > before the 5th patch.
> >
> > In order to expand the GPU address space, a 4th level translation is
> > added, the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries
> > (PML4E), PML4[0-511], each pointing to a PDP. All the existing
> > "dynamic alloc ppgtt" functions are used, only adding the 4th level
> > changes. I also updated some remaining variables that were 32b only.
> >
> > There are 2 hardware workarounds needed to allow correct operation
> > with 48b addresses (Wa32bitGeneralStateOffset &
> > Wa32bitInstructionBaseOffset). A flag
> > (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object
> > can be allocated outside the first 4 PDPs; if not, the end range is
> > forced to 4GB. Also, more objects now use the DRM_MM_CREATE_TOP
> > flag. To maintain compatibility, in libdrm I added a new
> > drm_intel_bo_emit_reloc_48bit function that will flag these objects,
> > while the existing drm_intel_bo_emit_reloc clears it.
> >
> > Finally, this feature is only available in BDW and Gen9, requires
> > LRC submission mode (execlists) and it can be detected by
> > i915.enable_ppgtt=3.
> >
> > Also note that this expanded address space is only available for
> > full PPGTT; aliasing PPGTT and Global GTT remain 32-bit.
> >
> > I'll resend the userland patches (libdrm/mesa) in a different
> > patchset, there haven't been changes on them, but they require a
> > rebase. I will also expand the ppgtt igt test per Chris's
> > suggestions.
>
> Just a heads-up, I haven't root caused this yet, but with
> i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
> before this series...
> -Chris

Sounds like I screwed up something in the first 4 patches or in the
Wa32bit one. The rest of the changes are contained to 48-bit code.

Have you found a way to reproduce it?

Thanks,

-Michel
Re: [Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
On Wed, Jul 29, 2015 at 05:23:44PM +0100, Michel Thierry wrote:
> This clean-up version delays the 48-bit work to later patches and includes
> more review comments from Akash and Chris. The first 5 patches prepare the
> dynamic page allocation code to handle independent pdps, but no specific
> code for 48-bit mode is added before the 5th patch.
>
> In order to expand the GPU address space, a 4th level translation is added,
> the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
> PML4[0-511], each pointing to a PDP. All the existing "dynamic alloc
> ppgtt" functions are used, only adding the 4th level changes. I also
> updated some remaining variables that were 32b only.
>
> There are 2 hardware workarounds needed to allow correct operation with
> 48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
> A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object
> can be allocated outside the first 4 PDPs; if not, the end range is forced
> to 4GB. Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
> compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit
> function that will flag these objects, while the existing
> drm_intel_bo_emit_reloc clears it.
>
> Finally, this feature is only available in BDW and Gen9, requires LRC
> submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.
>
> Also note that this expanded address space is only available for full
> PPGTT; aliasing PPGTT and Global GTT remain 32-bit.
>
> I'll resend the userland patches (libdrm/mesa) in a different patchset,
> there haven't been changes on them, but they require a rebase. I will also
> expand the ppgtt igt test per Chris's suggestions.

Just a heads-up, I haven't root caused this yet, but with
i915.enable_ppgtt=2 I started getting GPU hangs that didn't happen
before this series...
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT
This clean-up version delays the 48-bit work to later patches and includes
more review comments from Akash and Chris. The first 5 patches prepare the
dynamic page allocation code to handle independent pdps, but no specific
code for 48-bit mode is added before the 5th patch.

In order to expand the GPU address space, a 4th level translation is added,
the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
PML4[0-511], each pointing to a PDP. All the existing "dynamic alloc
ppgtt" functions are used, only adding the 4th level changes. I also
updated some remaining variables that were 32b only.

There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object
can be allocated outside the first 4 PDPs; if not, the end range is forced
to 4GB. Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit
function that will flag these objects, while the existing
drm_intel_bo_emit_reloc clears it.

Finally, this feature is only available in BDW and Gen9, requires LRC
submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.

Also note that this expanded address space is only available for full
PPGTT; aliasing PPGTT and Global GTT remain 32-bit.

I'll resend the userland patches (libdrm/mesa) in a different patchset,
there haven't been changes on them, but they require a rebase. I will also
expand the ppgtt igt test per Chris's suggestions.

Michel Thierry (19):
  drm/i915: Remove unnecessary gen8_clamp_pd
  drm/i915/gen8: Make pdp allocation more dynamic
  drm/i915/gen8: Abstract PDP usage
  drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
  drm/i915/gen8: Add dynamic page trace events
  drm/i915/gen8: Add PML4 structure
  drm/i915/gen8: implement alloc/free for 4lvl
  drm/i915/gen8: Add 4 level switching infrastructure and lrc support
  drm/i915/gen8: Pass sg_iter through pte inserts
  drm/i915/gen8: Add 4 level support in insert_entries and clear_range
  drm/i915/gen8: Initialize PDPs and PML4
  drm/i915: Expand error state's address width to 64b
  drm/i915/gen8: Add ppgtt info and debug_dump
  drm/i915: object size needs to be u64
  drm/i915: batch_obj vm offset must be u64
  drm/i915/userptr: Kill user_size limit check
  drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  drm/i915/gen8: Flip the 48b switch
  drm/i915: Save some page table setup on repeated binds

 drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
 drivers/gpu/drm/i915/i915_drv.h            |  11 +-
 drivers/gpu/drm/i915/i915_gem.c            |  30 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 665 -
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  64 ++-
 drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
 drivers/gpu/drm/i915/i915_gpu_error.c      |  24 +-
 drivers/gpu/drm/i915/i915_params.c         |   2 +-
 drivers/gpu/drm/i915/i915_reg.h            |   1 +
 drivers/gpu/drm/i915/i915_trace.h          |  32 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  60 ++-
 include/uapi/drm/i915_drm.h                |   3 +-
 13 files changed, 747 insertions(+), 180 deletions(-)

--
2.4.5
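
For readers following the thread, the 4-level layout described in the
cover letter can be sketched roughly as below. This is an illustration
only, not code from the series: every sketch_* name is hypothetical, and
it assumes 4 KiB pages with 9 index bits (512 entries) at each of the four
levels, which is what makes 512 PML4Es add up to a 48-bit space. It also
reads "the first 4 PDPs" as the first four 1 GiB PDP entries, i.e. the
low 4GB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 512 entries (9 bits) per level. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_LEVEL_BITS  9
#define SKETCH_LEVEL_MASK  ((1u << SKETCH_LEVEL_BITS) - 1)
#define SKETCH_PD_SHIFT    (SKETCH_PAGE_SHIFT + SKETCH_LEVEL_BITS)  /* 21 */
#define SKETCH_PDP_SHIFT   (SKETCH_PD_SHIFT + SKETCH_LEVEL_BITS)    /* 30 */
#define SKETCH_PML4_SHIFT  (SKETCH_PDP_SHIFT + SKETCH_LEVEL_BITS)   /* 39 */
#define SKETCH_ADDR_BITS   48

/* Hypothetical container for the per-level indices of one GPU address. */
struct sketch_ppgtt_index {
	unsigned int pml4e;  /* which PML4 entry, selects a PDP    */
	unsigned int pdpe;   /* which PDP entry, selects a PD      */
	unsigned int pde;    /* which PD entry, selects a PT       */
	unsigned int pte;    /* which PT entry, selects a page     */
	unsigned int offset; /* byte offset inside the 4 KiB page  */
};

/* Split a 48-bit GPU virtual address into its four table indices. */
static struct sketch_ppgtt_index sketch_split(uint64_t addr)
{
	struct sketch_ppgtt_index idx;

	assert(addr < (1ULL << SKETCH_ADDR_BITS));

	idx.pml4e  = (unsigned int)(addr >> SKETCH_PML4_SHIFT) & SKETCH_LEVEL_MASK;
	idx.pdpe   = (unsigned int)(addr >> SKETCH_PDP_SHIFT) & SKETCH_LEVEL_MASK;
	idx.pde    = (unsigned int)(addr >> SKETCH_PD_SHIFT) & SKETCH_LEVEL_MASK;
	idx.pte    = (unsigned int)(addr >> SKETCH_PAGE_SHIFT) & SKETCH_LEVEL_MASK;
	idx.offset = (unsigned int)(addr & ((1u << SKETCH_PAGE_SHIFT) - 1));
	return idx;
}

/*
 * Upper bound of the placement range for one object. An object that is
 * not flagged as 48b-capable is kept below 4GB, mirroring the
 * "end range is forced to 4GB" rule in the cover letter.
 */
static uint64_t sketch_placement_end(bool supports_48b)
{
	return supports_48b ? (1ULL << SKETCH_ADDR_BITS) : (1ULL << 32);
}
```

Under these assumptions, the first address at or above 4GB lands on PDP
entry 4 of PML4 entry 0, so an object without the 48b flag never needs
more than the first four PDP entries.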