[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
== Series Details ==

Series: series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
URL   : https://patchwork.freedesktop.org/series/31525/
State : failure

== Summary ==

Test kms_properties:
        Subgroup crtc-properties-legacy:
                pass       -> DMESG-WARN (shard-hsw)
Test kms_atomic_transition:
        Subgroup 4x-modeset-transitions-fencing:
                skip       -> INCOMPLETE (shard-hsw)
Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                fail       -> PASS       (shard-hsw) fdo#102723
Test kms_plane_multiple:
        Subgroup legacy-pipe-C-tiling-none:
                pass       -> SKIP       (shard-hsw)

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1292 dwarn:7   dfail:0   fail:8   skip:1090 time:9959s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5938/shards.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.IGT: warning for benchmark/gem_busy: Compare polling with syncobj_wait
== Series Details ==

Series: benchmark/gem_busy: Compare polling with syncobj_wait
URL   : https://patchwork.freedesktop.org/series/31507/
State : warning

== Summary ==

Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912
Test gem_eio:
        Subgroup in-flight-external:
                pass       -> DMESG-WARN (shard-hsw) fdo#102886 +1
Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                fail       -> PASS       (shard-hsw) fdo#102723
Test gem_render_tiled_blits:
        Subgroup basic:
                pass       -> DMESG-WARN (shard-hsw)

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1327 dwarn:7   dfail:0   fail:9   skip:1103 time:10028s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_306/shards.html
[Intel-gfx] ✓ Fi.CI.IGT: success for igt/gem_eio: Check hang/eio recovery during suspend
== Series Details ==

Series: igt/gem_eio: Check hang/eio recovery during suspend
URL   : https://patchwork.freedesktop.org/series/31485/
State : success

== Summary ==

Test gem_eio:
        Subgroup in-flight-contexts:
                dmesg-warn -> PASS       (shard-hsw) fdo#102886 +1
Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                fail       -> PASS       (shard-hsw) fdo#102723

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2447 pass:1329 dwarn:7   dfail:0   fail:8   skip:1103 time:10232s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_305/shards.html
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Cancel the hotplug work when unregistering the connector (rev2)
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector (rev2)
URL   : https://patchwork.freedesktop.org/series/31501/
State : success

== Summary ==

Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                fail       -> PASS       (shard-hsw) fdo#102723

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10131s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5937/shards.html
[Intel-gfx] ✓ Fi.CI.IGT: success for igt/gem_memfd: Exercise hugepages and memfd
== Series Details ==

Series: igt/gem_memfd: Exercise hugepages and memfd
URL   : https://patchwork.freedesktop.org/series/31460/
State : success

== Summary ==

Test gem_eio:
        Subgroup in-flight:
                dmesg-warn -> PASS       (shard-hsw) fdo#102886 +2
Test kms_setmode:
        Subgroup basic:
                fail       -> PASS       (shard-hsw) fdo#99912
Test prime_self_import:
        Subgroup reimport-vs-gem_close-race:
                pass       -> FAIL       (shard-hsw) fdo#102655

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#102655 https://bugs.freedesktop.org/show_bug.cgi?id=102655

shard-hsw        total:2400 pass:1303 dwarn:5   dfail:0   fail:9   skip:1083 time:9973s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_303/shards.html
[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Cancel the hotplug work when unregistering the connector
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector
URL   : https://patchwork.freedesktop.org/series/31501/
State : warning

== Summary ==

Test drv_module_reload:
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (shard-hsw) fdo#102707 +2
Test gem_eio:
        Subgroup wait:
                dmesg-warn -> PASS       (shard-hsw) fdo#102886
Test kms_cursor_legacy:
        Subgroup basic-flip-before-cursor-atomic:
                pass       -> SKIP       (shard-hsw)
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-primscrn-shrfb-plflip-blt:
                pass       -> SKIP       (shard-hsw)
Test kms_rotation_crc:
        Subgroup sprite-rotation-180:
                pass       -> SKIP       (shard-hsw)
Test pm_rpm:
        Subgroup gem-mmap-cpu:
                pass       -> SKIP       (shard-hsw)
Test kms_draw_crc:
        Subgroup draw-method-xrgb-render-xtiled:
                pass       -> SKIP       (shard-hsw)
Test kms_chv_cursor_fail:
        Subgroup pipe-A-64x64-left-edge:
                pass       -> SKIP       (shard-hsw)
Test kms_setmode:
        Subgroup basic:
                fail       -> PASS       (shard-hsw) fdo#99912
Test kms_atomic_transition:
        Subgroup plane-all-modeset-transition:
                pass       -> DMESG-WARN (shard-hsw)

fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707
fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2446 pass:1320 dwarn:9   dfail:0   fail:8   skip:1109 time:10039s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5936/shards.html
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
== Series Details ==

Series: series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
URL   : https://patchwork.freedesktop.org/series/31525/
State : success

== Summary ==

Series 31525v1 series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
https://patchwork.freedesktop.org/api/1.0/series/31525/revisions/1/mbox/

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:454s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:483s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:391s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:577s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:285s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:523s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:522s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:534s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:510s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:565s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:615s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:436s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:594s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:440s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:499s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:474s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:500s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:578s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:486s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:591s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:663s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:476s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:662s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:530s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:510s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:468s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:585s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:432s

aaf31e875e72b50f6a970c11f797b7f5b61a2681 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest
b93aebea22b8 drm/i915: enable platform support for 2M pages
4a0c64a37ae2 drm/i915: enable platform support for 64K pages
0af112e2f579 drm/i915: disable platform support for vGPU huge gtt pages
f0a1be1a3dd7 drm/i915/selftests: mix huge pages
00a7d5db2c00 drm/i915/selftests: huge page tests
93c70ed742be drm/i915/debugfs: include some gtt page size metrics
1f0a8b9c5966 drm/i915: accurate page size tracking for the ppgtt
1654205c7949 drm/i915: support 64K pages for the 48b PPGTT
111e38def2cc drm/i915: add support for 64K scratch page
c61736868242 drm/i915: support 2M pages for the 48b PPGTT
b18b3f752993 drm/i915: disable GTT cache for 2M pages
e956c4176e78 drm/i915: enable IPS bit for 64K pages
e86cd1858ba7 drm/i915: align 64K objects to 2M
a6190ddbeaa0 drm/i915: align the vma start to the largest gtt page size
0ad35eb31cee drm/i915: introduce vm set_pages/clear_pages
e9108b31f52e drm/i915: introduce page_size members
24b5f6444521 drm/i915: push set_pages down to the callers
0fe2db9775b3 drm/i915: introduce page_sizes field to dev_info
2a1dc2a89a9b drm/i915/gemfs: enable THP
2ff188382116 drm/i915: introduce simple gemfs
ff8befdbed20 mm/shmem: introduce shmem_file_setup_with_mnt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5938/
Re: [Intel-gfx] [PATCH] drm/i915/execlists: Add a comment for the extra MI_ARB_ENABLE
Quoting Michel Thierry (2017-10-05 20:41:40)
> On 10/5/2017 12:10 PM, Chris Wilson wrote:
> > Michel Thierry noticed that we were applying WaDisableCtxRestoreArbitration
> > even to gen9, which does not require the w/a. The rationale is that we
> > need to enable MI arbitration for execlists to work, and to be safe we
> > do that before every batch (in addition to every context switch into the
> > batch). Since this is not clear from the single line comment suggesting
> > the MI_ARB_ENABLE is solely for the w/a, add a little more detail.
> >
> > Signed-off-by: Chris Wilson
> > Cc: Michel Thierry
> > Cc: Joonas Lahtinen
> > Cc: Michał Winiarski
>
> It can't be clearer. Thanks!
>
> Reviewed-by: Michel Thierry

Thanks for asking, and checking what I wrote made sense!
-Chris
[Intel-gfx] ✗ Fi.CI.IGT: warning for series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)
== Series Details ==

Series: series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)
URL   : https://patchwork.freedesktop.org/series/31476/
State : warning

== Summary ==

Test gem_eio:
        Subgroup in-flight-contexts:
                dmesg-warn -> PASS       (shard-hsw) fdo#102886 +4
Test kms_cursor_crc:
        Subgroup cursor-64x64-sliding:
                pass       -> DMESG-WARN (shard-hsw)
Test prime_mmap:
        Subgroup test_userptr:
                dmesg-warn -> PASS       (shard-hsw) fdo#102939

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102939 https://bugs.freedesktop.org/show_bug.cgi?id=102939

shard-hsw        total:2446 pass:1333 dwarn:1   dfail:0   fail:9   skip:1103 time:10142s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5935/shards.html
[Intel-gfx] [CI 05/21] drm/i915: push set_pages down to the callers
From: Matthew Auld

Each backend is now responsible for calling __i915_gem_object_set_pages
upon successfully gathering its backing storage. This eliminates the
inconsistency between the async and sync paths, which stands out even
more when we start throwing around an sg_mask in a later patch.

Suggested-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Joonas Lahtinen
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-6-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem.c | 45 +---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 15 +---
 drivers/gpu/drm/i915/i915_gem_internal.c | 15
 drivers/gpu/drm/i915/i915_gem_object.h | 2 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c | 16 ++---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 12 +++
 drivers/gpu/drm/i915/selftests/huge_gem_object.c | 14
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 12 ---
 8 files changed, 77 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1da1f52d12cc..42f2ca1e136b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -162,8 +162,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 {
 	struct address_space *mapping = obj->base.filp->f_mapping;
 	drm_dma_handle_t *phys;
@@ -171,9 +170,10 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	struct scatterlist *sg;
 	char *vaddr;
 	int i;
+	int err;
 
 	if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 
 	/* Always aligning to the object size, allows a single allocation
 	 * to handle all possible callers, and given typical object sizes,
@@ -183,7 +183,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 			     roundup_pow_of_two(obj->base.size),
 			     roundup_pow_of_two(obj->base.size));
 	if (!phys)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 
 	vaddr = phys->vaddr;
 	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
@@ -192,7 +192,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 		page = shmem_read_mapping_page(mapping, i);
 		if (IS_ERR(page)) {
-			st = ERR_CAST(page);
+			err = PTR_ERR(page);
 			goto err_phys;
 		}
 
@@ -209,13 +209,13 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st) {
-		st = ERR_PTR(-ENOMEM);
+		err = -ENOMEM;
 		goto err_phys;
 	}
 
 	if (sg_alloc_table(st, 1, GFP_KERNEL)) {
 		kfree(st);
-		st = ERR_PTR(-ENOMEM);
+		err = -ENOMEM;
 		goto err_phys;
 	}
 
@@ -227,11 +227,15 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	sg_dma_len(sg) = obj->base.size;
 
 	obj->phys_handle = phys;
-	return st;
+
+	__i915_gem_object_set_pages(obj, st);
+
+	return 0;
 
 err_phys:
 	drm_pci_free(obj->base.dev, phys);
-	return st;
+
+	return err;
 }
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
@@ -2292,8 +2296,7 @@ static bool i915_sg_trim(struct sg_table *orig_st)
 	return true;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	const unsigned long page_count = obj->base.size / PAGE_SIZE;
@@ -2317,12 +2320,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 
 rebuild_st:
 	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 		kfree(st);
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 	}
 
 	/* Get the list of pages out of our struct file. They'll be pinned
@@ -2430,7 +2433,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj, st);
 
-	return st;
+	__i915_gem_object_set_pages(obj, st);
+
+	return 0;
 
 err_sg:
 	sg_
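The change of calling convention in this patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: `err_ptr`, `ptr_err` and `is_err` are stand-ins that mimic the kernel's ERR_PTR helpers, and `struct table`, `struct object`, `get_pages_old`, `get_pages_new` and `set_pages` are hypothetical names. It contrasts a backend that returns its table (or an encoded error) with one that publishes the table itself and returns an int.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (assumption:
 * errors live in the top 4095 values of the address space). */
#define SKETCH_MAX_ERRNO 4095

static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical, heavily simplified object and backing-store types. */
struct table { int nents; };
struct object { struct table *pages; int fail; };

/* Old convention: return the table, or an error encoded as a pointer.
 * The caller is expected to attach the table to the object. */
static struct table *get_pages_old(struct object *obj)
{
	struct table *st;

	if (obj->fail)
		return err_ptr(-ENOMEM);

	st = malloc(sizeof(*st));
	if (!st)
		return err_ptr(-ENOMEM);

	st->nents = 1;
	return st;
}

/* Stand-in for __i915_gem_object_set_pages(): publish the backing store. */
static void set_pages(struct object *obj, struct table *st)
{
	obj->pages = st;
}

/* New convention: the backend publishes the pages itself and returns
 * 0 or a negative errno. */
static int get_pages_new(struct object *obj)
{
	struct table *st;

	if (obj->fail)
		return -ENOMEM;

	st = malloc(sizeof(*st));
	if (!st)
		return -ENOMEM;

	st->nents = 1;
	set_pages(obj, st);
	return 0;
}
```

With the int-returning form, the synchronous and asynchronous paths can both end in the same set_pages call, which is the consistency the commit message refers to.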
[Intel-gfx] [CI 09/21] drm/i915: align 64K objects to 2M
From: Matthew Auld

We can't mix 64K and 4K pte's in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful but in reality shouldn't be too bad since this only
applies to the virtual address space of a 48b PPGTT.

v2: don't separate logically connected ops

Suggested-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_vma.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5d4164406b63..72e86b32ab41 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -503,10 +503,20 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		 */
 		if (upper_32_bits(end - 1) &&
 		    vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			/*
+			 * We can't mix 64K and 4K PTEs in the same page-table
+			 * (2M block), and so to avoid the ugliness and
+			 * complexity of coloring we opt for just aligning 64K
+			 * objects to 2M.
+			 */
 			u64 page_alignment =
-				rounddown_pow_of_two(vma->page_sizes.sg);
+				rounddown_pow_of_two(vma->page_sizes.sg |
+						     I915_GTT_PAGE_SIZE_2M);
 
 			alignment = max(alignment, page_alignment);
+
+			if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
+				size = round_up(size, I915_GTT_PAGE_SIZE_2M);
 		}
 
 		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
--
2.14.2
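The arithmetic in this hunk can be tried in isolation. The sketch below is a userspace approximation under stated assumptions: `place_vma`, `struct placement`, `rounddown_pow2` and `round_up_to` are hypothetical helpers, the SKETCH_SZ_* constants assume the usual 4K/64K/2M values, and the patch's `upper_32_bits(end - 1)` test for the 48b address space is left out.

```c
#include <stdint.h>

/* Assumed page-size values; the patch uses I915_GTT_PAGE_SIZE* names. */
#define SKETCH_SZ_4K  (1ull << 12)
#define SKETCH_SZ_64K (1ull << 16)
#define SKETCH_SZ_2M  (1ull << 21)

/* Largest power of two that is <= x; x must be non-zero. */
static uint64_t rounddown_pow2(uint64_t x)
{
	uint64_t p = 1;

	while (p <= x / 2)
		p <<= 1;
	return p;
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_to(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

struct placement {
	uint64_t size;
	uint64_t alignment;
};

/*
 * Mirror of the hunk's logic: once the sg mask contains anything larger
 * than 4K, align the start to at least 2M, and if 64K pages are present
 * also pad the size to 2M so that no page-table is shared with 4K PTEs.
 */
static struct placement place_vma(uint64_t size, uint64_t alignment,
				  uint64_t sg_mask)
{
	struct placement p = { size, alignment };

	if (sg_mask > SKETCH_SZ_4K) {
		uint64_t page_alignment = rounddown_pow2(sg_mask | SKETCH_SZ_2M);

		if (page_alignment > p.alignment)
			p.alignment = page_alignment;

		if (sg_mask & SKETCH_SZ_64K)
			p.size = round_up_to(p.size, SKETCH_SZ_2M);
	}

	return p;
}
```

For example, a 192K object backed by 64K pages ends up occupying a full 2M slot at a 2M-aligned offset, which is the waste the commit message accepts in exchange for avoiding coloring.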
[Intel-gfx] [CI 15/21] drm/i915: accurate page size tracking for the ppgtt
From: Matthew Auld

Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-16-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 11 +++
 drivers/gpu/drm/i915/i915_gem_object.h | 10 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 118aad90468f..4c605785e2b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1053,6 +1053,8 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 
 	gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
 				      cache_level);
+
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
@@ -1145,7 +1147,10 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 			vaddr = kmap_atomic_px(pd);
 			vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
 			kunmap_atomic(vaddr);
+			page_size = I915_GTT_PAGE_SIZE_64K;
 		}
+
+		vma->page_sizes.gtt |= page_size;
 	} while (iter->sg);
 }
 
@@ -1170,6 +1175,8 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 		while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
 						     &iter, &idx, cache_level))
 			GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+
+		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 	}
 }
 
@@ -1891,6 +1898,8 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 	} while (1);
 	kunmap_atomic(vaddr);
+
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -2598,6 +2607,8 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+
 	/*
 	 * Without aliasing PPGTT there's no difference between
 	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 110672952a1c..e4e6dd93889d 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -169,6 +169,7 @@ struct drm_i915_gem_object {
 		struct sg_table *pages;
 		void *mapping;
 
+		/* TODO: whack some of this into the error state */
 		struct i915_page_sizes {
 			/**
 			 * The sg mask of the pages sg_table. i.e the mask of
@@ -184,6 +185,15 @@ struct drm_i915_gem_object {
 			 * to use opportunistically.
 			 */
 			unsigned int sg;
+
+			/**
+			 * The actual gtt page size usage. Since we can have
+			 * multiple vma associated with this object we need to
+			 * prevent any trampling of state, hence a copy of this
+			 * struct also lives in each vma, therefore the gtt
+			 * value here should only be read/write through the vma.
+			 */
+			unsigned int gtt;
 		} page_sizes;
 
 		struct i915_gem_object_page_iter {
--
2.14.2
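The bookkeeping this patch adds comes down to two patterns: the 4K-only insert paths assign the mask outright, while the huge-entry path ORs in the size chosen for each chunk. A small userspace sketch of both, with hypothetical names (`struct sketch_page_sizes`, `insert_4k_only`, `insert_huge`) and assumed constant values:

```c
#include <stddef.h>

/* Assumed values for the page sizes named in the patch. */
#define SKETCH_SZ_4K  (1u << 12)
#define SKETCH_SZ_64K (1u << 16)
#define SKETCH_SZ_2M  (1u << 21)

/* Hypothetical per-vma copy of the tracking state. */
struct sketch_page_sizes {
	unsigned int sg;   /* page sizes present in the backing store */
	unsigned int gtt;  /* page sizes actually used for the PTEs */
};

/* 4K-only paths (3lvl, 4lvl, gen6, ggtt in the patch): plain assignment. */
static void insert_4k_only(struct sketch_page_sizes *ps)
{
	ps->gtt = SKETCH_SZ_4K;
}

/*
 * Huge-entry path: each chunk may be mapped with a different page size,
 * so the mask accumulates every size that was used. The chunk_sizes
 * array stands in for the per-iteration page_size decision.
 */
static void insert_huge(struct sketch_page_sizes *ps,
			const unsigned int *chunk_sizes, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		ps->gtt |= chunk_sizes[i];
}
```

Comparing `gtt` against `sg` after a bind then shows whether the mapping used the large pages the backing store offered, which is the debugging use the commit message has in mind.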
[Intel-gfx] [CI 03/21] drm/i915/gemfs: enable THP
From: Matthew Auld

Enable transparent-huge-pages through gemfs by mounting with
huge=within_size.

v2: sprinkle within_size comment

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-4-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gemfs.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
index 168d0bd98f60..e2993857df37 100644
--- a/drivers/gpu/drm/i915/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -24,6 +24,7 @@
 
 #include
 #include
+#include
 
 #include "i915_drv.h"
 #include "i915_gemfs.h"
@@ -41,6 +42,27 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 	if (IS_ERR(gemfs))
 		return PTR_ERR(gemfs);
 
+	/*
+	 * Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
+	 * likely 2M. Note that within_size may overallocate huge-pages, if say
+	 * we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
+	 * memory pressure shmem should split any huge-pages which can be
+	 * shrunk.
+	 */
+
+	if (has_transparent_hugepage()) {
+		struct super_block *sb = gemfs->mnt_sb;
+		char options[] = "huge=within_size";
+		int flags = 0;
+		int err;
+
+		err = sb->s_op->remount_fs(sb, &flags, options);
+		if (err) {
+			kern_unmount(gemfs);
+			return err;
+		}
+	}
+
 	i915->mm.gemfs = gemfs;
 
 	return 0;
--
2.14.2
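The overallocation described in the patch comment can be put into numbers. This is only an upper-bound sketch that follows the comment's own example, not shmem's actual allocation policy: `worst_case_backing` is a hypothetical helper, and the 4K base page and 2M HPAGE_PMD_SIZE are assumptions ("most likely 2M" in the comment).

```c
#include <stdint.h>

/* Assumed sizes; HPAGE_PMD_SIZE is "most likely 2M" per the patch comment. */
#define SKETCH_PAGE_SIZE      4096ull
#define SKETCH_HPAGE_PMD_SIZE (2ull << 20)

/* Round x up to a multiple of align. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/*
 * Worst-case backing store for an object of the given size when every
 * part of it, including the tail, is served by a huge page. Objects
 * smaller than one huge page are assumed to stay on base pages.
 */
static uint64_t worst_case_backing(uint64_t size)
{
	if (size >= SKETCH_HPAGE_PMD_SIZE)
		return round_up_u64(size, SKETCH_HPAGE_PMD_SIZE);

	return round_up_u64(size, SKETCH_PAGE_SIZE);
}
```

Under these assumptions a 2M + 4K object may be backed by 4M, matching the "2M + 2M" case in the comment, while a 64K object is unaffected.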
[Intel-gfx] [CI 17/21] drm/i915/selftests: huge page tests
From: Matthew Auld

v2: mock test page support configurations and add MI_STORE_DWORD test
v3: run all mockable huge page tests on all platforms via the mock_device
v4: add pin_update regression test
    various improvements suggested by Chris
v5: fix issues reported by kbuild
    test single sg spanning multiple page sizes
    don't explode when running the live-tests through the appgtt
v6: lots of improvements from Chris
v7: run on each engine for igt_write_huge
    add simple tmpfs fallback test
v8: size_t is bad
    don't break the i386 build

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem.c|1 +
 drivers/gpu/drm/i915/i915_gem_object.h |2 +
 drivers/gpu/drm/i915/selftests/huge_pages.c| 1715
 .../gpu/drm/i915/selftests/i915_live_selftests.h |1 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h |1 +
 5 files changed, 1720 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 34398696824c..f8c3ac1c8c67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5412,6 +5412,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 #include "selftests/scatterlist.c"
 #include "selftests/mock_gem_device.c"
 #include "selftests/huge_gem_object.c"
+#include "selftests/huge_pages.c"
 #include "selftests/i915_gem_object.c"
 #include "selftests/i915_gem_coherency.c"
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index e4e6dd93889d..956c911c2cbf 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -196,6 +196,8 @@ struct drm_i915_gem_object {
 			unsigned int gtt;
 		} page_sizes;
 
+		I915_SELFTEST_DECLARE(unsigned int page_mask);
+
 		struct i915_gem_object_page_iter {
 			struct scatterlist *sg_pos;
 			unsigned int sg_idx; /* in pages, but 32bit eek! */
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
new file mode 100644
index ..b8495882e5b0
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -0,0 +1,1715 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "../i915_selftest.h"
+
+#include
+
+#include "mock_drm.h"
+
+static const unsigned int page_sizes[] = {
+	I915_GTT_PAGE_SIZE_2M,
+	I915_GTT_PAGE_SIZE_64K,
+	I915_GTT_PAGE_SIZE_4K,
+};
+
+static unsigned int get_largest_page_size(struct drm_i915_private *i915,
+					  u64 rem)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+		unsigned int page_size = page_sizes[i];
+
+		if (HAS_PAGE_SIZES(i915, page_size) && rem >= page_size)
+			return page_size;
+	}
+
+	return 0;
+}
+
+static void huge_pages_free_pages(struct sg_table *st)
+{
+	struct scatterlist *sg;
+
+	for (sg = st->sgl; sg; sg = __sg_next(sg)) {
+		if (sg_page(sg))
+			__free_pages(sg_page(sg), get_order(sg->length));
+	}
+
+	sg_free_table(st);
+	kfree(st);
+}
+
+static int get_huge_pages(struct drm_i915_gem_object *obj)
+{
+#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
+	unsigned int page_mask = obj->mm.page_mask;
+	struct sg_table *st;
+	struct scatterlist *sg;
+	unsigned int sg_mask;
+	u64 rem;
+
+	st =
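The selection logic of `get_largest_page_size()` shown above can be exercised in userspace. In this sketch the device pointer is replaced by a plain `supported` bitmask (an assumption standing in for the HAS_PAGE_SIZES() check), the constants assume the usual 4K/64K/2M values, and `largest_page_size` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for the I915_GTT_PAGE_SIZE_* constants. */
#define SKETCH_SZ_4K  (1u << 12)
#define SKETCH_SZ_64K (1u << 16)
#define SKETCH_SZ_2M  (1u << 21)

/* Candidate sizes ordered from largest to smallest, as in the selftest. */
static const unsigned int sketch_page_sizes[] = {
	SKETCH_SZ_2M,
	SKETCH_SZ_64K,
	SKETCH_SZ_4K,
};

/*
 * Return the largest supported page size that still fits in the
 * remaining length, or 0 if nothing fits.
 */
static unsigned int largest_page_size(unsigned int supported, uint64_t rem)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_page_sizes) / sizeof(sketch_page_sizes[0]); ++i) {
		unsigned int page_size = sketch_page_sizes[i];

		if ((supported & page_size) && rem >= page_size)
			return page_size;
	}

	return 0;
}
```

Because the table is ordered largest first, the first match is the answer; a platform mask without 2M simply falls through to 64K.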
[Intel-gfx] [CI 01/21] mm/shmem: introduce shmem_file_setup_with_mnt
From: Matthew Auld

We are planning to use our own tmpfs mnt in i915 in place of the
shm_mnt, such that we can control the mount options, in particular
huge=, which we require to support huge-gtt-pages. So rather than roll
our own version of __shmem_file_setup, it would be preferred if we could
just give shmem our mnt, and let it do the rest.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Dave Hansen
Cc: Kirill A. Shutemov
Cc: Hugh Dickins
Cc: linux...@kvack.org
Acked-by: Andrew Morton
Acked-by: Kirill A. Shutemov
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-2-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 include/linux/shmem_fs.h | 2 ++
 mm/shmem.c | 30 ++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index b6c3540e07bc..0937d9a7d8fb 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -53,6 +53,8 @@ extern struct file *shmem_file_setup(const char *name,
 					loff_t size, unsigned long flags);
 extern struct file *shmem_kernel_file_setup(const char *name, loff_t size,
 					    unsigned long flags);
+extern struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt,
+		const char *name, loff_t size, unsigned long flags);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern unsigned long shmem_get_unmapped_area(struct file *, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 07a1d22807be..3229d27503ec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4183,7 +4183,7 @@ static const struct dentry_operations anon_ops = {
 	.d_dname = simple_dname
 };
 
-static struct file *__shmem_file_setup(const char *name, loff_t size,
+static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
 				       unsigned long flags, unsigned int i_flags)
 {
 	struct file *res;
@@ -4192,8 +4192,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
 	struct super_block *sb;
 	struct qstr this;
 
-	if (IS_ERR(shm_mnt))
-		return ERR_CAST(shm_mnt);
+	if (IS_ERR(mnt))
+		return ERR_CAST(mnt);
 
 	if (size < 0 || size > MAX_LFS_FILESIZE)
 		return ERR_PTR(-EINVAL);
@@ -4205,8 +4205,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
 	this.name = name;
 	this.len = strlen(name);
 	this.hash = 0; /* will go */
-	sb = shm_mnt->mnt_sb;
-	path.mnt = mntget(shm_mnt);
+	sb = mnt->mnt_sb;
+	path.mnt = mntget(mnt);
 	path.dentry = d_alloc_pseudo(sb, &this);
 	if (!path.dentry)
 		goto put_memory;
@@ -4251,7 +4251,7 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
  */
 struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-	return __shmem_file_setup(name, size, flags, S_PRIVATE);
+	return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);
 }
 
 /**
@@ -4262,10 +4262,24 @@ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned lon
  */
 struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-	return __shmem_file_setup(name, size, flags, 0);
+	return __shmem_file_setup(shm_mnt, name, size, flags, 0);
 }
 EXPORT_SYMBOL_GPL(shmem_file_setup);
 
+/**
+ * shmem_file_setup_with_mnt - get an unlinked file living in tmpfs
+ * @mnt: the tmpfs mount where the file will be created
+ * @name: name for dentry (to be seen in /proc/<pid>/maps
+ * @size: size to be set for the file
+ * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
+ */
+struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt, const char *name,
+				       loff_t size, unsigned long flags)
+{
+	return __shmem_file_setup(mnt, name, size, flags, 0);
+}
+EXPORT_SYMBOL_GPL(shmem_file_setup_with_mnt);
+
 /**
  * shmem_zero_setup - setup a shared anonymous mapping
  * @vma: the vma to be mmapped is prepared by do_mmap_pgoff
@@ -4281,7 +4295,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	 * accessible to the user through its mapping, use S_PRIVATE flag to
 	 * bypass file security, in the same way as shmem_kernel_file_setup().
 	 */
-	file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
+	file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
--
2.14.2
[Intel-gfx] [CI 19/21] drm/i915: disable platform support for vGPU huge gtt pages
From: Matthew Auld

Currently gvt gtt handling doesn't support huge page entries, so disable
for now.

v2: remove useless 48b PPGTT check

Suggested-by: Zhenyu Wang
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Zhenyu Wang
Reviewed-by: Zhenyu Wang
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-20-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f8c3ac1c8c67..82a10036fb38 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4822,6 +4822,15 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
+	/*
+	 * We need to fallback to 4K pages since gvt gtt handling doesn't
+	 * support huge page entries - we will need to check either hypervisor
+	 * mm can support huge guest page or just do emulation in gvt.
+	 */
+	if (intel_vgpu_active(dev_priv))
+		mkwrite_device_info(dev_priv)->page_sizes =
+			I915_GTT_PAGE_SIZE_4K;
+
 	dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
 
 	if (!i915_modparams.enable_execlists) {
-- 
2.14.2
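Editorial sketch, not part of the patch: the hunk above replaces the
advertised page-size mask with 4K alone whenever the driver runs as a vGPU
guest. The userspace model below illustrates only that masking step. The
struct and function names (sketch_device_info, sketch_apply_vgpu_fallback)
are hypothetical stand-ins, and the bit values are assumed to match the
I915_GTT_PAGE_SIZE_* defines introduced in patch 04/21.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the I915_GTT_PAGE_SIZE_* bit values from patch 04/21. */
#define GTT_PAGE_SIZE_4K  (1u << 12)
#define GTT_PAGE_SIZE_64K (1u << 16)
#define GTT_PAGE_SIZE_2M  (1u << 21)

/* Hypothetical stand-in for the writable device info structure. */
struct sketch_device_info {
	unsigned int page_sizes; /* bitmask of supported GTT page sizes */
};

/*
 * Model of the i915_gem_init() hunk: a vGPU guest collapses the mask to
 * 4K only, whatever the platform definition offered; a host keeps its mask.
 */
static void sketch_apply_vgpu_fallback(struct sketch_device_info *info,
				       bool vgpu_active)
{
	/* 4K is the baseline size and is expected to be present already. */
	assert(info->page_sizes & GTT_PAGE_SIZE_4K);

	if (vgpu_active)
		info->page_sizes = GTT_PAGE_SIZE_4K;
}
```

Because the mask is overwritten rather than AND-ed, any later check against
page_sizes sees a plain 4K-only device in the guest case.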
[Intel-gfx] [CI 21/21] drm/i915: enable platform support for 2M pages
From: Matthew Auld

For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-22-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_pci.c                  | 6 --
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8d349aec1902..bf467f30c99b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -376,7 +376,8 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
 	G75_FEATURES, \
 	BDW_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES, \
+	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
+		      I915_GTT_PAGE_SIZE_2M, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1, \
@@ -437,7 +438,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
-		      I915_GTT_PAGE_SIZE_64K
+		      I915_GTT_PAGE_SIZE_64K | \
+		      I915_GTT_PAGE_SIZE_2M
 
 #define GEN9_FEATURES \
 	GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 7a9735dac912..04eb9362f4f8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -176,7 +176,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->page_sizes =
 		I915_GTT_PAGE_SIZE_4K |
-		I915_GTT_PAGE_SIZE_64K;
+		I915_GTT_PAGE_SIZE_64K |
+		I915_GTT_PAGE_SIZE_2M;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.14.2
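Editorial sketch, not part of the patch: with this change the gen8 mask reads
4K | 2M and the gen9 mask 4K | 64K | 2M. The userspace model below shows how
a subset test in the style of the HAS_PAGE_SIZES() macro from patch 06/21
would answer against those masks. The SKETCH_* names and
sketch_has_page_sizes() are hypothetical; only the bit positions and the
subset logic are taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the I915_GTT_PAGE_SIZE_* bit values from patch 04/21. */
#define GTT_PAGE_SIZE_4K  (1u << 12)
#define GTT_PAGE_SIZE_64K (1u << 16)
#define GTT_PAGE_SIZE_2M  (1u << 21)

/* Platform masks as they read once this patch is applied. */
#define SKETCH_GEN8_PAGE_SIZES (GTT_PAGE_SIZE_4K | GTT_PAGE_SIZE_2M)
#define SKETCH_GEN9_PAGE_SIZES \
	(GTT_PAGE_SIZE_4K | GTT_PAGE_SIZE_64K | GTT_PAGE_SIZE_2M)

/*
 * True only if every requested size is present in the supported mask.
 * The assert stands in for the GEM_BUG_ON((sizes) == 0) in the macro.
 */
static bool sketch_has_page_sizes(unsigned int supported, unsigned int sizes)
{
	assert(sizes != 0);
	return (sizes & ~supported) == 0;
}
```

Under these masks a request for 64K is rejected on gen8 but accepted on gen9,
while 2M is accepted on both.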
[Intel-gfx] [CI 18/21] drm/i915/selftests: mix huge pages
From: Matthew Auld

Try to mix sg page sizes for 4K, 64K and 2M pages.

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/

Suggested-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-19-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/selftests/scatterlist.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c b/drivers/gpu/drm/i915/selftests/scatterlist.c
index 1cc5d2931753..cd6d2a16071f 100644
--- a/drivers/gpu/drm/i915/selftests/scatterlist.c
+++ b/drivers/gpu/drm/i915/selftests/scatterlist.c
@@ -189,6 +189,20 @@ static unsigned int random(unsigned long n,
 	return 1 + (prandom_u32_state(rnd) % 1024);
 }
 
+static unsigned int random_page_size_pages(unsigned long n,
+					   unsigned long count,
+					   struct rnd_state *rnd)
+{
+	/* 4K, 64K, 2M */
+	static unsigned int page_count[] = {
+		BIT(12) >> PAGE_SHIFT,
+		BIT(16) >> PAGE_SHIFT,
+		BIT(21) >> PAGE_SHIFT,
+	};
+
+	return page_count[(prandom_u32_state(rnd) % 3)];
+}
+
 static inline bool page_contiguous(struct page *first,
 				   struct page *last,
 				   unsigned long npages)
@@ -252,6 +266,7 @@ static const npages_fn_t npages_funcs[] = {
 	grow,
 	shrink,
 	random,
+	random_page_size_pages,
 	NULL,
 };
 
-- 
2.14.2
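Editorial sketch, not part of the patch: the table in
random_page_size_pages() converts each GTT page size into a count of CPU
pages via BIT(n) >> PAGE_SHIFT. The userspace model below works through that
arithmetic under the assumption of 4K CPU pages (PAGE_SHIFT of 12). The
SKETCH_* macros and sketch_pick_page_count() are hypothetical, and the
caller supplies the random value where the selftest uses prandom_u32_state().

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12            /* assumption: 4K CPU pages */
#define SKETCH_BIT(n)     (1ul << (n))

/* Same three entries as the patch: 4K, 64K, 2M expressed in CPU pages. */
static const unsigned int sketch_page_count[] = {
	SKETCH_BIT(12) >> SKETCH_PAGE_SHIFT, /* 4K  -> 1 page    */
	SKETCH_BIT(16) >> SKETCH_PAGE_SHIFT, /* 64K -> 16 pages  */
	SKETCH_BIT(21) >> SKETCH_PAGE_SHIFT, /* 2M  -> 512 pages */
};

/* Pick one of the three counts from a caller-supplied random value. */
static unsigned int sketch_pick_page_count(unsigned int random_value)
{
	unsigned int pages = sketch_page_count[random_value % 3];

	/* Would trip if SKETCH_PAGE_SHIFT were larger than 12. */
	assert(pages > 0);
	return pages;
}
```

The v2 note in the changelog is visible here: shifting by the CPU page shift
rather than a literal 12 keeps the counts in units of CPU pages.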
[Intel-gfx] [CI 02/21] drm/i915: introduce simple gemfs
From: Matthew Auld Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so moves us away from the shmemfs shm_mnt, and gives us the much needed flexibility to do things like set our own mount options, namely huge= which should allow us to enable the use of transparent-huge-pages for our shmem backed objects. v2: various improvements suggested by Joonas v3: move gemfs instance to i915.mm and simplify now that we have file_setup_with_mnt v4: fallback to tmpfs shm_mnt upon failure to setup gemfs v5: make tmpfs fallback kinder v5: better gemfs failure message flags variable Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Dave Hansen Cc: Kirill A. Shutemov Cc: Hugh Dickins Cc: linux...@kvack.org Reviewed-by: Joonas Lahtinen Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/Makefile| 1 + drivers/gpu/drm/i915/i915_drv.h | 5 +++ drivers/gpu/drm/i915/i915_gem.c | 33 ++- drivers/gpu/drm/i915/i915_gemfs.c| 52 drivers/gpu/drm/i915/i915_gemfs.h| 34 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 4 ++ 6 files changed, 128 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 51d0d2929a4b..66d23b619db1 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -47,6 +47,7 @@ i915-y += i915_cmd_parser.o \ i915_gem_tiling.o \ i915_gem_timeline.o \ i915_gem_userptr.o \ + i915_gemfs.o \ i915_trace_points.o \ i915_vma.o \ intel_breadcrumbs.o \ diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1fc7080bfa7b..ec6f320cc4f5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1511,6 +1511,11 @@ struct i915_gem_mm { /** Usable portion of the GTT for GEM */ dma_addr_t stolen_base; /* limited to low 
memory (32-bit) */ + /** +* tmpfs instance used for shmem backed objects +*/ + struct vfsmount *gemfs; + /** PPGTT used for aliasing the PPGTT with the GTT */ struct i915_hw_ppgtt *aliasing_ppgtt; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 50cc3c2cef06..1da1f52d12cc 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -35,6 +35,7 @@ #include "intel_drv.h" #include "intel_frontbuffer.h" #include "intel_mocs.h" +#include "i915_gemfs.h" #include #include #include @@ -4256,6 +4257,30 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = { .pwrite = i915_gem_object_pwrite_gtt, }; +static int i915_gem_object_create_shmem(struct drm_device *dev, + struct drm_gem_object *obj, + size_t size) +{ + struct drm_i915_private *i915 = to_i915(dev); + unsigned long flags = VM_NORESERVE; + struct file *filp; + + drm_gem_private_object_init(dev, obj, size); + + if (i915->mm.gemfs) + filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size, +flags); + else + filp = shmem_file_setup("i915", size, flags); + + if (IS_ERR(filp)) + return PTR_ERR(filp); + + obj->filp = filp; + + return 0; +} + struct drm_i915_gem_object * i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) { @@ -4280,7 +4305,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) if (obj == NULL) return ERR_PTR(-ENOMEM); - ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size); + ret = i915_gem_object_create_shmem(&dev_priv->drm, &obj->base, size); if (ret) goto fail; @@ -4919,6 +4944,10 @@ i915_gem_load_init(struct drm_i915_private *dev_priv) spin_lock_init(&dev_priv->fb_tracking.lock); + err = i915_gemfs_init(dev_priv); + if (err) + DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err); + return 0; err_priorities: @@ -4957,6 +4986,8 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv) /* And ensure that our DESTROY_BY_RCU slabs are 
truly destroyed */ rcu_barrier(); + + i915_gemfs_fini(dev_priv); } int i915_gem_freeze(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c new file mode 100644 index ..168d0bd98f60 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_gemfs.c @
[Intel-gfx] [CI 12/21] drm/i915: support 2M pages for the 48b PPGTT
From: Matthew Auld Support inserting 2M gtt pages into the 48b PPGTT. v2: sanity check sg->length against page_size v3: don't recalculate rem on each loop whitespace breakup Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-13-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 76 +++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + 2 files changed, 74 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 74fc9ac11cd5..79ba485c5d42 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1013,6 +1013,69 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm, cache_level); } +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, + struct i915_page_directory_pointer **pdps, + struct sgt_dma *iter, + enum i915_cache_level cache_level) +{ + const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level); + u64 start = vma->node.start; + dma_addr_t rem = iter->sg->length; + + do { + struct gen8_insert_pte idx = gen8_insert_pte(start); + struct i915_page_directory_pointer *pdp = pdps[idx.pml4e]; + struct i915_page_directory *pd = pdp->page_directory[idx.pdpe]; + unsigned int page_size; + gen8_pte_t encode = pte_encode; + gen8_pte_t *vaddr; + u16 index, max; + + if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M && + IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) && + rem >= I915_GTT_PAGE_SIZE_2M && !idx.pte) { + index = idx.pde; + max = I915_PDES; + page_size = I915_GTT_PAGE_SIZE_2M; + + encode |= GEN8_PDE_PS_2M; + + vaddr = kmap_atomic_px(pd); + } else { + struct i915_page_table *pt = pd->page_table[idx.pde]; + + index = idx.pte; + max = GEN8_PTES; + page_size = I915_GTT_PAGE_SIZE; + + vaddr = kmap_atomic_px(pt); + } + + do { + GEM_BUG_ON(iter->sg->length < page_size); + vaddr[index++] = encode | 
iter->dma; + + start += page_size; + iter->dma += page_size; + rem -= page_size; + if (iter->dma >= iter->max) { + iter->sg = __sg_next(iter->sg); + if (!iter->sg) + break; + + rem = iter->sg->length; + iter->dma = sg_dma_address(iter->sg); + iter->max = iter->dma + rem; + + if (unlikely(!IS_ALIGNED(iter->dma, page_size))) + break; + } + } while (rem >= page_size && index < max); + + kunmap_atomic(vaddr); + } while (iter->sg); +} + static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm, struct i915_vma *vma, enum i915_cache_level cache_level, @@ -1025,11 +1088,16 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm, .max = iter.dma + iter.sg->length, }; struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps; - struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start); - while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter, -&idx, cache_level)) - GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4); + if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) { + gen8_ppgtt_insert_huge_entries(vma, pdps, &iter, cache_level); + } else { + struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start); + + while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], +&iter, &idx, cache_level)) + GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4); + } } static void gen8_free_page_tables(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f22491b4e6dc..b9d7036c3665 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -154,6 +154,8 @@ t
[Intel-gfx] [CI 07/21] drm/i915: introduce vm set_pages/clear_pages
From: Matthew Auld Move the setting/clearing of the vma->pages to a vm operation. Doing so neatens things up a little, but more importantly gives us a sane place to also set/clear the vma->pages_sizes, which we introduce later in preparation for supporting huge-pages. v2: remove redundant vma->pages check v3: GEM_BUG_ON(vma->pages) following i915_vma_remove Suggested-by: Chris Wilson Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Reviewed-by: Joonas Lahtinen Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 70 +++ drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + drivers/gpu/drm/i915/i915_vma.c | 27 +++- drivers/gpu/drm/i915/selftests/mock_gtt.c | 11 ++--- 4 files changed, 66 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 4c82ceb8d318..c534b74eee32 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -205,8 +205,6 @@ static int ppgtt_bind_vma(struct i915_vma *vma, return ret; } - vma->pages = vma->obj->mm.pages; - /* Currently applicable only to VLV */ pte_flags = 0; if (vma->obj->gt_ro) @@ -222,6 +220,26 @@ static void ppgtt_unbind_vma(struct i915_vma *vma) vma->vm->clear_range(vma->vm, vma->node.start, vma->size); } +static int ppgtt_set_pages(struct i915_vma *vma) +{ + GEM_BUG_ON(vma->pages); + + vma->pages = vma->obj->mm.pages; + + return 0; +} + +static void clear_pages(struct i915_vma *vma) +{ + GEM_BUG_ON(!vma->pages); + + if (vma->pages != vma->obj->mm.pages) { + sg_free_table(vma->pages); + kfree(vma->pages); + } + vma->pages = NULL; +} + static gen8_pte_t gen8_pte_encode(dma_addr_t addr, enum i915_cache_level level) { @@ -1452,6 +1470,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.cleanup = gen8_ppgtt_cleanup; ppgtt->base.unbind_vma = ppgtt_unbind_vma; 
ppgtt->base.bind_vma = ppgtt_bind_vma; + ppgtt->base.set_pages = ppgtt_set_pages; + ppgtt->base.clear_pages = clear_pages; ppgtt->debug_dump = gen8_dump_ppgtt; return 0; @@ -1894,6 +1914,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; ppgtt->base.unbind_vma = ppgtt_unbind_vma; ppgtt->base.bind_vma = ppgtt_bind_vma; + ppgtt->base.set_pages = ppgtt_set_pages; + ppgtt->base.clear_pages = clear_pages; ppgtt->base.cleanup = gen6_ppgtt_cleanup; ppgtt->debug_dump = gen6_dump_ppgtt; @@ -2405,12 +2427,6 @@ static int ggtt_bind_vma(struct i915_vma *vma, struct drm_i915_gem_object *obj = vma->obj; u32 pte_flags; - if (unlikely(!vma->pages)) { - int ret = i915_get_ggtt_vma_pages(vma); - if (ret) - return ret; - } - /* Currently applicable only to VLV */ pte_flags = 0; if (obj->gt_ro) @@ -2447,12 +2463,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, u32 pte_flags; int ret; - if (unlikely(!vma->pages)) { - ret = i915_get_ggtt_vma_pages(vma); - if (ret) - return ret; - } - /* Currently applicable only to VLV */ pte_flags = 0; if (vma->obj->gt_ro) @@ -2467,7 +2477,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, vma->node.start, vma->size); if (ret) - goto err_pages; + return ret; } appgtt->base.insert_entries(&appgtt->base, vma, cache_level, @@ -2481,17 +2491,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, } return 0; - -err_pages: - if (!(vma->flags & (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND))) { - if (vma->pages != vma->obj->mm.pages) { - GEM_BUG_ON(!vma->pages); - sg_free_table(vma->pages); - kfree(vma->pages); - } - vma->pages = NULL; - } - return ret; } static void aliasing_gtt_unbind_vma(struct i915_vma *vma) @@ -2529,6 +2528,19 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj, dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL); } +static int ggtt_set_pages(struct i915_vma *vma) +{ + int ret; + + GEM_BUG_ON(vma->pages); + + ret = 
i915_get_ggtt_vma_page
[Intel-gfx] [CI 20/21] drm/i915: enable platform support for 64K pages
From: Matthew Auld

For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-21-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_pci.c                  | 3 ++-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 7938006cf03a..8d349aec1902 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -436,7 +436,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-	.page_sizes = I915_GTT_PAGE_SIZE_4K
+	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
+		      I915_GTT_PAGE_SIZE_64K
 
 #define GEN9_FEATURES \
 	GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f46c3a35d61a..7a9735dac912 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -175,7 +175,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->gen = -1;
 
 	mkwrite_device_info(i915)->page_sizes =
-		I915_GTT_PAGE_SIZE_4K;
+		I915_GTT_PAGE_SIZE_4K |
+		I915_GTT_PAGE_SIZE_64K;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.14.2
[Intel-gfx] [CI 16/21] drm/i915/debugfs: include some gtt page size metrics
From: Matthew Auld Good to know, mostly for debugging purposes. v2: some improvements from Chris Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-17-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 61 ++--- 1 file changed, 57 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 84ab77c02d3e..f7817c667958 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -119,6 +119,36 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj) return size; } +static const char * +stringify_page_sizes(unsigned int page_sizes, char *buf, size_t len) +{ + size_t x = 0; + + switch (page_sizes) { + case 0: + return ""; + case I915_GTT_PAGE_SIZE_4K: + return "4K"; + case I915_GTT_PAGE_SIZE_64K: + return "64K"; + case I915_GTT_PAGE_SIZE_2M: + return "2M"; + default: + if (!buf) + return "M"; + + if (page_sizes & I915_GTT_PAGE_SIZE_2M) + x += snprintf(buf + x, len - x, "2M, "); + if (page_sizes & I915_GTT_PAGE_SIZE_64K) + x += snprintf(buf + x, len - x, "64K, "); + if (page_sizes & I915_GTT_PAGE_SIZE_4K) + x += snprintf(buf + x, len - x, "4K, "); + buf[x-2] = '\0'; + + return buf; + } +} + static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { @@ -156,9 +186,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) if (!drm_mm_node_allocated(&vma->node)) continue; - seq_printf(m, " (%sgtt offset: %08llx, size: %08llx", + seq_printf(m, " (%sgtt offset: %08llx, size: %08llx, pages: %s", i915_vma_is_ggtt(vma) ? 
"g" : "pp", - vma->node.start, vma->node.size); + vma->node.start, vma->node.size, + stringify_page_sizes(vma->page_sizes.gtt, NULL, 0)); if (i915_vma_is_ggtt(vma)) { switch (vma->ggtt_view.type) { case I915_GGTT_VIEW_NORMAL: @@ -403,10 +434,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) struct drm_i915_private *dev_priv = node_to_i915(m->private); struct drm_device *dev = &dev_priv->drm; struct i915_ggtt *ggtt = &dev_priv->ggtt; - u32 count, mapped_count, purgeable_count, dpy_count; - u64 size, mapped_size, purgeable_size, dpy_size; + u32 count, mapped_count, purgeable_count, dpy_count, huge_count; + u64 size, mapped_size, purgeable_size, dpy_size, huge_size; struct drm_i915_gem_object *obj; + unsigned int page_sizes = 0; struct drm_file *file; + char buf[80]; int ret; ret = mutex_lock_interruptible(&dev->struct_mutex); @@ -420,6 +453,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data) size = count = 0; mapped_size = mapped_count = 0; purgeable_size = purgeable_count = 0; + huge_size = huge_count = 0; list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) { size += obj->base.size; ++count; @@ -433,6 +467,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) mapped_count++; mapped_size += obj->base.size; } + + if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) { + huge_count++; + huge_size += obj->base.size; + page_sizes |= obj->mm.page_sizes.sg; + } } seq_printf(m, "%u unbound objects, %llu bytes\n", count, size); @@ -455,6 +495,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) mapped_count++; mapped_size += obj->base.size; } + + if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) { + huge_count++; + huge_size += obj->base.size; + page_sizes |= obj->mm.page_sizes.sg; + } } seq_printf(m, "%u bound objects, %llu bytes\n", count, size); @@ -462,11 +508,18 @@ static int i915_gem_object_info(struct seq_file *m, void *data) purgeable_count, purgeable_size); seq_printf(m, "%u mapped objects, 
%llu bytes\n", mapped_count, mapped_size); + seq_printf(m, "%u huge-paged objects (%s) %llu bytes\n",
[Intel-gfx] [CI 04/21] drm/i915: introduce page_sizes field to dev_info
From: Matthew Auld In preparation for huge gtt pages expose page_sizes as part of the device info, to indicate the page sizes supported by the HW. Currently only 4K is supported. v2: s/page_size_mask/page_sizes/ v3: introduce I915_GTT_MAX_PAGE_SIZE Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Mika Kuoppala Cc: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_gtt.h | 8 +++- drivers/gpu/drm/i915/i915_pci.c | 18 ++ drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 +++ 4 files changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ec6f320cc4f5..3d4dee817381 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -869,6 +869,8 @@ struct intel_device_info { u8 num_sprites[I915_MAX_PIPES]; u8 num_scalers[I915_MAX_PIPES]; + unsigned int page_sizes; /* page sizes supported by the HW */ + #define DEFINE_FLAG(name) u8 name:1 DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG); #undef DEFINE_FLAG diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f62fb903dc24..50218c141c21 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -42,7 +42,13 @@ #include "i915_gem_request.h" #include "i915_selftest.h" -#define I915_GTT_PAGE_SIZE 4096UL +#define I915_GTT_PAGE_SIZE_4K BIT(12) +#define I915_GTT_PAGE_SIZE_64K BIT(16) +#define I915_GTT_PAGE_SIZE_2M BIT(21) + +#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K +#define I915_GTT_MAX_PAGE_SIZE I915_GTT_PAGE_SIZE_2M + #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE #define I915_FENCE_REG_NONE -1 diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 745b6a6e0188..7938006cf03a 100644 --- a/drivers/gpu/drm/i915/i915_pci.c 
+++ b/drivers/gpu/drm/i915/i915_pci.c @@ -58,6 +58,10 @@ .color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 } /* Keep in gen based order, and chronological order within a gen */ + +#define GEN_DEFAULT_PAGE_SIZES \ + .page_sizes = I915_GTT_PAGE_SIZE_4K + #define GEN2_FEATURES \ .gen = 2, .num_pipes = 1, \ .has_overlay = 1, .overlay_needs_physical = 1, \ @@ -67,6 +71,7 @@ .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i830_info __initconst = { @@ -100,6 +105,7 @@ static const struct intel_device_info intel_i865g_info __initconst = { .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i915g_info __initconst = { @@ -163,6 +169,7 @@ static const struct intel_device_info intel_pineview_info __initconst = { .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i965g_info __initconst = { @@ -205,6 +212,7 @@ static const struct intel_device_info intel_gm45_info __initconst = { .ring_mask = RENDER_RING | BSD_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_ironlake_d_info __initconst = { @@ -228,6 +236,7 @@ static const struct intel_device_info intel_ironlake_m_info __initconst = { .has_rc6p = 1, \ .has_aliasing_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS #define SNB_D_PLATFORM \ @@ -271,6 +280,7 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info __initconst = .has_aliasing_ppgtt = 1, \ .has_full_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ IVB_CURSOR_OFFSETS #define IVB_D_PLATFORM \ @@ -327,6 +337,7 @@ static const struct intel_device_info intel_valleyview_info 
__initconst = { .has_snoop = true, .ring_mask = RENDER_RING | BSD_RING | BLT_RING, .display_mmio_offset = VLV_DISPLAY_BASE, + GEN_DEFAULT_PAGE_SIZES, GEN_DEFAULT_PIPEOFFSETS, CURSOR_OFFSETS }; @@ -365,6 +376,7 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = { #define GEN8_FEATURES \ G75_FEATURES, \ BDW_COLORS, \ + GEN_DEFAULT_PAGE_SIZES, \ .has_logical_ring_contexts = 1, \ .has_full_48bit_ppgtt = 1, \ .has_64bit_reloc = 1, \ @@ -417,13 +429,18 @@ static const
[Intel-gfx] [CI 13/21] drm/i915: add support for 64K scratch page
From: Matthew Auld

Before we can fully enable 64K pages, we need to first support a 64K
scratch page if we intend to support the case where we have object sizes
< 2M, since any scratch PTE must also point to a 64K region. Without
this our 64K usage is limited to objects which completely fill the
page-table, and therefore don't need any scratch.

v2: add reminder about why 48b PPGTT

Reported-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-14-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 64 ++---
 drivers/gpu/drm/i915/i915_gem_gtt.h | 1 +
 2 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 79ba485c5d42..7eae6ab8c5fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -519,22 +519,63 @@ static void fill_page_dma_32(struct i915_address_space *vm,
 static int
 setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
 {
-	struct page *page;
+	struct page *page = NULL;
 	dma_addr_t addr;
+	int order;
 
-	page = alloc_page(gfp | __GFP_ZERO);
-	if (unlikely(!page))
-		return -ENOMEM;
+	/*
+	 * In order to utilize 64K pages for an object with a size < 2M, we will
+	 * need to support a 64K scratch page, given that every 16th entry for a
+	 * page-table operating in 64K mode must point to a properly aligned 64K
+	 * region, including any PTEs which happen to point to scratch.
+	 *
+	 * This is only relevant for the 48b PPGTT where we support
+	 * huge-gtt-pages, see also i915_vma_insert().
+	 *
+	 * TODO: we should really consider write-protecting the scratch-page and
+	 * sharing between ppgtt
+	 */
+	if (i915_vm_is_48bit(vm) &&
+	    HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K)) {
+		order = get_order(I915_GTT_PAGE_SIZE_64K);
+		page = alloc_pages(gfp | __GFP_ZERO, order);
+		if (page) {
+			addr = dma_map_page(vm->dma, page, 0,
+					    I915_GTT_PAGE_SIZE_64K,
+					    PCI_DMA_BIDIRECTIONAL);
+			if (unlikely(dma_mapping_error(vm->dma, addr))) {
+				__free_pages(page, order);
+				page = NULL;
+			}
 
-	addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
-			    PCI_DMA_BIDIRECTIONAL);
-	if (unlikely(dma_mapping_error(vm->dma, addr))) {
-		__free_page(page);
-		return -ENOMEM;
+			if (!IS_ALIGNED(addr, I915_GTT_PAGE_SIZE_64K)) {
+				dma_unmap_page(vm->dma, addr,
+					       I915_GTT_PAGE_SIZE_64K,
+					       PCI_DMA_BIDIRECTIONAL);
+				__free_pages(page, order);
+				page = NULL;
+			}
+		}
+	}
+
+	if (!page) {
+		order = 0;
+		page = alloc_page(gfp | __GFP_ZERO);
+		if (unlikely(!page))
+			return -ENOMEM;
+
+		addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
+				    PCI_DMA_BIDIRECTIONAL);
+		if (unlikely(dma_mapping_error(vm->dma, addr))) {
+			__free_page(page);
+			return -ENOMEM;
+		}
 	}
 
 	vm->scratch_page.page = page;
 	vm->scratch_page.daddr = addr;
+	vm->scratch_page.order = order;
+
 	return 0;
 }
 
@@ -542,8 +583,9 @@ static void cleanup_scratch_page(struct i915_address_space *vm)
 {
 	struct i915_page_dma *p = &vm->scratch_page;
 
-	dma_unmap_page(vm->dma, p->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	__free_page(p->page);
+	dma_unmap_page(vm->dma, p->daddr, BIT(p->order) << PAGE_SHIFT,
+		       PCI_DMA_BIDIRECTIONAL);
+	__free_pages(p->page, p->order);
 }
 
 static struct i915_page_table *alloc_pt(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b9d7036c3665..e9de3f05b0c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -215,6 +215,7 @@ struct i915_vma;
 
 struct i915_page_dma {
 	struct page *page;
+	int order;
 
 	union {
 		dma_addr_t daddr;
-- 
2.14.2
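Editorial sketch, not part of the patch: the scratch page now records an
allocation order, and cleanup_scratch_page() recovers the mapped size as
BIT(order) << PAGE_SHIFT. The userspace model below works through that
arithmetic, assuming 4K CPU pages. sketch_get_order() is a hypothetical
approximation of the kernel's get_order() helper written for illustration,
and sketch_scratch_bytes() is likewise a made-up name.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12                         /* assumption: 4K CPU pages */
#define SKETCH_PAGE_SIZE  (1ul << SKETCH_PAGE_SHIFT)

/*
 * Userspace approximation of get_order(): the smallest order such that
 * (PAGE_SIZE << order) covers the requested size.
 */
static int sketch_get_order(unsigned long size)
{
	int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Bytes unmapped at cleanup for a given order: BIT(order) << PAGE_SHIFT. */
static unsigned long sketch_scratch_bytes(int order)
{
	return (1ul << order) << SKETCH_PAGE_SHIFT;
}
```

With these assumptions a 64K scratch page is an order-4 allocation and maps
back to 65536 bytes at cleanup, while the 4K fallback path uses order 0 and
maps back to 4096 bytes, which is why a single order field is enough to make
setup and cleanup agree.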
[Intel-gfx] [CI 06/21] drm/i915: introduce page_size members
From: Matthew Auld In preparation for supporting huge gtt pages for the ppgtt, we introduce page size members for gem objects. We fill in the page sizes by scanning the sg table. v2: pass the sg_mask to set_pages v3: calculate the sg_mask inline with populating the sg_table where possible, and pass to set_pages along with the pages. v4: bunch of improvements from Joonas v5: fix num_pages blunder introduce i915_sg_page_sizes helper v6: prefer GEM_BUG_ON(sizes == 0) Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Daniel Vetter Reviewed-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 22 - drivers/gpu/drm/i915/i915_gem.c | 42 +--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 5 ++- drivers/gpu/drm/i915/i915_gem_internal.c | 5 ++- drivers/gpu/drm/i915/i915_gem_object.h | 17 ++ drivers/gpu/drm/i915/i915_gem_stolen.c | 2 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 5 ++- drivers/gpu/drm/i915/selftests/huge_gem_object.c | 2 +- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 5 ++- 9 files changed, 93 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3d4dee817381..799a90abd81f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2872,6 +2872,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg) (((__iter).curr += PAGE_SIZE) >= (__iter).max) ? 
\ (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0) +static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg) +{ + unsigned int page_sizes; + + page_sizes = 0; + while (sg) { + GEM_BUG_ON(sg->offset); + GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE)); + page_sizes |= sg->length; + sg = __sg_next(sg); + } + + return page_sizes; +} + static inline unsigned int i915_sg_segment_size(void) { unsigned int size = swiotlb_max_segment(); @@ -3101,6 +3116,10 @@ intel_info(const struct drm_i915_private *dev_priv) #define USES_PPGTT(dev_priv) (i915_modparams.enable_ppgtt) #define USES_FULL_PPGTT(dev_priv) (i915_modparams.enable_ppgtt >= 2) #define USES_FULL_48BIT_PPGTT(dev_priv)(i915_modparams.enable_ppgtt == 3) +#define HAS_PAGE_SIZES(dev_priv, sizes) ({ \ + GEM_BUG_ON((sizes) == 0); \ + ((sizes) & ~(dev_priv)->info.page_sizes) == 0; \ +}) #define HAS_OVERLAY(dev_priv) ((dev_priv)->info.has_overlay) #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \ @@ -3517,7 +3536,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, unsigned long n); void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, -struct sg_table *pages); +struct sg_table *pages, +unsigned int sg_mask); int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 42f2ca1e136b..34398696824c 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -228,7 +228,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) obj->phys_handle = phys; - __i915_gem_object_set_pages(obj, st); + __i915_gem_object_set_pages(obj, st, sg->length); return 0; @@ -2266,6 +2266,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, if (!IS_ERR(pages)) obj->ops->put_pages(obj, pages); + obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0; + unlock: mutex_unlock(&obj->mm.lock); } @@ -2308,6 +2310,7 @@ static int 
i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ unsigned int max_segment = i915_sg_segment_size(); + unsigned int sg_mask; gfp_t noreclaim; int ret; @@ -2339,6 +2342,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) sg = st->sgl; st->nents = 0; + sg_mask = 0; for (i = 0; i < page_count; i++) { const unsigned int shrink[] = { I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_PURGEABLE, @@ -2391,8 +2395,10 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) if (!i || sg->length >= max_segment || page_to_pfn(page) != last_pfn + 1) { - if (i) +
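As an illustration only, not part of the posted patch: i915_sg_page_sizes() ORs the segment lengths into a mask, and HAS_PAGE_SIZES() checks that no requested size bit falls outside the supported set. The sketch below models both in user space. The array of lengths stands in for the scatterlist, and the sketch_* names and size constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SZ_4K  (1u << 12)
#define SKETCH_SZ_64K (1u << 16)
#define SKETCH_SZ_2M  (1u << 21)

/* OR together the segment lengths, as i915_sg_page_sizes() does per sg entry. */
static unsigned int sketch_sg_page_sizes(const unsigned int *lengths, size_t n)
{
	unsigned int mask = 0;

	for (size_t i = 0; i < n; i++) {
		/* Same intent as GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE)). */
		assert(lengths[i] % SKETCH_SZ_4K == 0);
		mask |= lengths[i];
	}
	return mask;
}

/* Same test as HAS_PAGE_SIZES(): every requested size bit must be supported. */
static bool sketch_has_page_sizes(unsigned int supported, unsigned int sizes)
{
	assert(sizes != 0); /* mirrors GEM_BUG_ON((sizes) == 0) */
	return (sizes & ~supported) == 0;
}
```

Because each length is a multiple of the base page size, the resulting mask records which power-of-two chunk sizes could appear in the object, which is what the later patches in the series consult when picking a GTT page size.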
[Intel-gfx] [CI 14/21] drm/i915: support 64K pages for the 48b PPGTT
From: Matthew Auld

Support inserting 64K pages into the 48b PPGTT.

v2: check for 64K scratch
v3: we should only have to re-adjust maybe_64K at every sg interval

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-15-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 31 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  7 +++
 2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7eae6ab8c5fd..118aad90468f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1069,6 +1069,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 		struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
 		struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
 		unsigned int page_size;
+		bool maybe_64K = false;
 		gen8_pte_t encode = pte_encode;
 		gen8_pte_t *vaddr;
 		u16 index, max;
@@ -1090,6 +1091,13 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 			max = GEN8_PTES;
 			page_size = I915_GTT_PAGE_SIZE;
 
+			if (!index &&
+			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
+			    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+			    (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+			     rem >= (max - index) << PAGE_SHIFT))
+				maybe_64K = true;
+
 			vaddr = kmap_atomic_px(pt);
 		}
 
@@ -1109,12 +1117,35 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 				iter->dma = sg_dma_address(iter->sg);
 				iter->max = iter->dma + rem;
 
+				if (maybe_64K && index < max &&
+				    !(IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+				      (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+				       rem >= (max - index) << PAGE_SHIFT)))
+					maybe_64K = false;
+
 				if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
 					break;
 			}
 		} while (rem >= page_size && index < max);
 
 		kunmap_atomic(vaddr);
+
+		/*
+		 * Is it safe to mark the 2M block as 64K? -- Either we have
+		 * filled whole page-table with 64K entries, or filled part of
+		 * it and have reached the end of the sg table and we have
+		 * enough padding.
+		 */
+		if (maybe_64K &&
+		    (index == max ||
+		     (i915_vm_has_scratch_64K(vma->vm) &&
+		      !iter->sg && IS_ALIGNED(vma->node.start +
+					      vma->node.size,
+					      I915_GTT_PAGE_SIZE_2M)))) {
+			vaddr = kmap_atomic_px(pd);
+			vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
+			kunmap_atomic(vaddr);
+		}
 	} while (iter->sg);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9de3f05b0c9..93211a96fdad 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4))
 #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6))
 
+#define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
 struct sg_table;
@@ -352,6 +353,12 @@ i915_vm_is_48bit(const struct i915_address_space *vm)
 	return (vm->total - 1) >> 32;
 }
 
+static inline bool
+i915_vm_has_scratch_64K(struct i915_address_space *vm)
+{
+	return vm->scratch_page.order == get_order(I915_GTT_PAGE_SIZE_64K);
+}
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
-- 
2.14.2
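As an illustration only, not part of the posted patch: the first added hunk decides whether a page table is a candidate for 64K entries. The sketch below restates that predicate as a pure function so the individual conditions can be tried in isolation. It assumes 4 KiB base pages and 512 entries per page table (the value GEN8_PTES is expected to have); the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                   /* assumption: 4 KiB base pages */
#define SKETCH_SZ_64K     (UINT64_C(1) << 16)
#define SKETCH_PTES       512u                  /* assumption: entries per page table */

/* Power-of-two alignment test, equivalent in spirit to IS_ALIGNED(). */
static bool sketch_is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Candidate for 64K entries: we start at the first PTE, the sg mask contains
 * a 64K chunk, the DMA address is 64K aligned, and the remaining length is
 * either 64K aligned or large enough to fill the rest of the page table.
 */
static bool sketch_maybe_64k(unsigned int index, uint64_t sg_mask,
			     uint64_t dma, uint64_t rem)
{
	assert(index <= SKETCH_PTES);
	return index == 0 &&
	       (sg_mask & SKETCH_SZ_64K) != 0 &&
	       sketch_is_aligned(dma, SKETCH_SZ_64K) &&
	       (sketch_is_aligned(rem, SKETCH_SZ_64K) ||
		rem >= ((uint64_t)(SKETCH_PTES - index) << SKETCH_PAGE_SHIFT));
}
```

The last clause is the one that tolerates an unaligned remainder: if the segment covers the whole 2M page table anyway, the tail beyond it is handled by a later iteration.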
[Intel-gfx] [CI 11/21] drm/i915: disable GTT cache for 2M pages
From: Matthew Auld

When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!
v3: explicitly check that the system does support 2M/1G pages
v4: split WA and decision logic

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Mika Kuoppala
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-12-matthew.a...@intel.com
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/intel_pm.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 171b21f6c4ad..9d0ca2656a23 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8453,6 +8453,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
+	/* The GTT cache must be disabled if the system is using 2M pages. */
+	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
+						 I915_GTT_PAGE_SIZE_2M);
 	enum pipe pipe;
 
 	ilk_init_lp_watermarks(dev_priv);
@@ -8487,12 +8490,8 @@ static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 	/* WaProgramL3SqcReg1Default:bdw */
 	gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
-	/*
-	 * WaGttCachingOffByDefault:bdw
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
-	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	/* WaGttCachingOffByDefault:bdw */
+	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
 	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-- 
2.14.2
[Intel-gfx] [CI 08/21] drm/i915: align the vma start to the largest gtt page size
From: Matthew Auld For the 48b PPGTT try to align the vma start address to the required page size boundary to guarantee we use said page size in the gtt. If we are dealing with multiple page sizes, we can't guarantee anything and just align to the largest. For soft pinning and objects which need to be tightly packed into the lower 32bits we don't force any alignment. v2: various improvements suggested by Chris v3: use set_pages and better placement of page_sizes v4: prefer upper_32_bits() v5: assign vma->page_sizes = vma->obj->page_sizes directly prefer sizeof(vma->page_sizes) v6: fixup checking of end to exclude GGTT (which are assumed to be limited to 4G). Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Reviewed-by: Joonas Lahtinen Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++ drivers/gpu/drm/i915/i915_vma.c | 16 drivers/gpu/drm/i915/i915_vma.h | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index c534b74eee32..fb7ac66814ab 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -226,6 +226,8 @@ static int ppgtt_set_pages(struct i915_vma *vma) vma->pages = vma->obj->mm.pages; + vma->page_sizes = vma->obj->mm.page_sizes; + return 0; } @@ -238,6 +240,8 @@ static void clear_pages(struct i915_vma *vma) kfree(vma->pages); } vma->pages = NULL; + + memset(&vma->page_sizes, 0, sizeof(vma->page_sizes)); } static gen8_pte_t gen8_pte_encode(dma_addr_t addr, @@ -2538,6 +2542,8 @@ static int ggtt_set_pages(struct i915_vma *vma) if (ret) return ret; + vma->page_sizes = vma->obj->mm.page_sizes; + return 0; } diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 49bf49571e47..5d4164406b63 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ 
b/drivers/gpu/drm/i915/i915_vma.c @@ -493,6 +493,22 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (ret) goto err_clear; } else { + /* +* We only support huge gtt pages through the 48b PPGTT, +* however we also don't want to force any alignment for +* objects which need to be tightly packed into the low 32bits. +* +* Note that we assume that GGTT are limited to 4GiB for the +* forseeable future. See also i915_ggtt_offset(). +*/ + if (upper_32_bits(end - 1) && + vma->page_sizes.sg > I915_GTT_PAGE_SIZE) { + u64 page_alignment = + rounddown_pow_of_two(vma->page_sizes.sg); + + alignment = max(alignment, page_alignment); + } + ret = i915_gem_gtt_insert(vma->vm, &vma->node, size, alignment, obj->cache_level, start, end, flags); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index e811067c7724..c59ba76613a3 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -55,6 +55,7 @@ struct i915_vma { void __iomem *iomap; u64 size; u64 display_alignment; + struct i915_page_sizes page_sizes; u32 fence_size; u32 fence_alignment; -- 2.14.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
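As an illustration only, not part of the posted patch: the i915_vma_insert() hunk raises the requested alignment to the largest page size present in the sg mask, but only when the address range extends beyond the low 4 GiB. The user-space sketch below models that choice. The helpers are hand-written stand-ins for upper_32_bits() and rounddown_pow_of_two(), and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_4K UINT64_C(4096)   /* assumption: I915_GTT_PAGE_SIZE is 4 KiB */

/* Stand-in for upper_32_bits(). */
static uint32_t sketch_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Stand-in for rounddown_pow_of_two(): largest power of two <= v. */
static uint64_t sketch_rounddown_pow_of_two(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p *= 2;
	return p;
}

/* Pick the alignment the way the hunk does: max(alignment, largest sg size). */
static uint64_t sketch_vma_alignment(uint64_t alignment, uint64_t end,
				     uint64_t sg_mask)
{
	if (sketch_upper_32_bits(end - 1) && sg_mask > SKETCH_SZ_4K) {
		uint64_t page_alignment = sketch_rounddown_pow_of_two(sg_mask);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}
	return alignment;
}
```

With a mixed mask such as 2M | 64K | 4K the result is 2M, matching the commit message's "just align to the largest"; a range ending exactly at 4 GiB leaves the caller's alignment untouched.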
[Intel-gfx] [CI 10/21] drm/i915: enable IPS bit for 64K pages
From: Matthew Auld Before we can enable 64K pages through the IPS bit, we must first enable it through MMIO, otherwise the page-walker will simply ignore it. v2: add comment mentioning that 64K is BDW+ v3: move to more suitable home Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Mika Kuoppala Reviewed-by: Mika Kuoppala Reviewed-by: Joonas Lahtinen Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-11-matthew.a...@intel.com Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 17 + drivers/gpu/drm/i915/i915_reg.h | 3 +++ 2 files changed, 20 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index fb7ac66814ab..74fc9ac11cd5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1987,6 +1987,23 @@ static void gtt_write_workarounds(struct drm_i915_private *dev_priv) I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_SKL); else if (IS_GEN9_LP(dev_priv)) I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_BXT); + + /* +* To support 64K PTEs we need to first enable the use of the +* Intermediate-Page-Size(IPS) bit of the PDE field via some magical +* mmio, otherwise the page-walker will simply ignore the IPS bit. This +* shouldn't be needed after GEN10. +* +* 64K pages were first introduced from BDW+, although technically they +* only *work* from gen9+. For pre-BDW we instead have the option for +* 32K pages, but we don't currently have any support for it in our +* driver. 
+*/ + if (HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_64K) && + INTEL_GEN(dev_priv) <= 10) + I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA, + I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) | + GAMW_ECO_ENABLE_64K_IPS_FIELD); } int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index e7dba5539b11..50e65c98ca6c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2371,6 +2371,9 @@ enum i915_power_well_id { #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0) #define GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS (1<<18) +#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080) +#define GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF + #define GAMT_CHKN_BIT_REG _MMIO(0x4ab8) #define GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING (1<<28) #define GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT (1<<24) -- 2.14.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915/guc: group initialization of GuC objects
On 10/04/2017 06:58 AM, Michal Wajdeczko wrote:
> On Wed, 04 Oct 2017 00:57:00 +0200, Sujaritha Sundaresan wrote:
>
>> The previous patch has split up the initialization of some of the GuC
>> objects in 2 different functions, let's pull them back together.
>>
>> v3: Group initialization of GuC objects
>>
>> v2: Decoupling ADS together with logs (Daniele)
>> v3: Rebase
>> v4: Rebase
>> v5: Separated from previous patch
>>
>> Cc: Anusha Srivatsa
>> Cc: Daniele Ceraolo Spurio
>> Cc: Michal Wajdeczko
>> Cc: Oscar Mateo
>> Cc: Sagar Arun Kamble
>> Signed-off-by: Sujaritha Sundaresan
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c |  7 ++---
>>  drivers/gpu/drm/i915/intel_uc.c            | 41 +-
>>  drivers/gpu/drm/i915/intel_uc.h            |  4 +--
>>  3 files changed, 28 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index c456c55..a351339 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -920,9 +920,8 @@ void i915_guc_policies_init(struct guc_policies *policies)
>>   * Set up the memory resources to be shared with the GuC (via the GGTT)
>>   * at firmware loading time.
>>   */
>> -int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>> +int i915_guc_submission_shared_objects_init(struct intel_guc *guc)
>
> Hmm, is "stage_ids" also considered as "shared object" ?
> See ida_init(&guc->stage_ids); later in this function.
>
> also, since function starts with "i915" there is no reason to
> change parameter from dev_priv to guc

I was actually undecided if this change was worth doing. It seems to be
a better idea to keep the guc_submission stuff separate at a higher level.

Sujaritha

>>  {
>> -	struct intel_guc *guc = &dev_priv->guc;
>>  	struct i915_vma *vma;
>>  	void *vaddr;
>>
>> @@ -950,10 +949,8 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>>  	return 0;
>>  }
>>
>> -void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>> +void i915_guc_submission_shared_objects_fini(struct intel_guc *guc)
>>  {
>> -	struct intel_guc *guc = &dev_priv->guc;
>> -
>>  	ida_destroy(&guc->stage_ids);
>>  	i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
>>  	i915_vma_unpin_and_release(&guc->stage_desc_pool);
>> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
>> index 732f188..69239e4 100644
>> --- a/drivers/gpu/drm/i915/intel_uc.c
>> +++ b/drivers/gpu/drm/i915/intel_uc.c
>> @@ -423,13 +423,33 @@ static int guc_shared_objects_init(struct intel_guc *guc)
>>  	ret = guc_ads_create(guc);
>>  	if (ret < 0)
>> -		intel_guc_log_destroy(guc);
>> +		goto err_logs;
>> +
>> +	if (i915_modparams.enable_guc_submission) {
>> +		/*
>> +		 * This is stuff we need to have available at fw load time
>> +		 * if we are planning to enable submission later
>> +		 */
>> +		ret = i915_guc_submission_shared_objects_init(guc);
>> +		if (ret)
>> +			goto err_ads;
>> +	}
>> +
>> +	return 0;
>> +
>> +err_ads:
>> +	guc_ads_destroy(guc);
>> +err_logs:
>> +	intel_guc_log_destroy(guc);
>>  	return ret;
>>  }
>>
>>  static void guc_shared_objects_fini(struct intel_guc *guc)
>>  {
>> +	if (i915_modparams.enable_guc_submission)
>> +		i915_guc_submission_shared_objects_fini(guc);
>> +
>>  	guc_ads_destroy(guc);
>>  	intel_guc_log_destroy(guc);
>>  }
>> @@ -452,16 +472,6 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
>>  	if (ret)
>>  		goto err_guc;
>>
>> -	if (i915_modparams.enable_guc_submission) {
>> -		/*
>> -		 * This is stuff we need to have available at fw load time
>> -		 * if we are planning to enable submission later
>> -		 */
>> -		ret = i915_guc_submission_init(dev_priv);
>> -		if (ret)
>> -			goto err_shared;
>> -	}
>> -
>>  	/* init WOPCM */
>>  	I915_WRITE(GUC_WOPCM_SIZE, intel_guc_wopcm_size(dev_priv));
>>  	I915_WRITE(DMA_GUC_WOPCM_OFFSET,
>> @@ -481,7 +491,7 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
>>  	 */
>>  	ret = __intel_uc_reset_hw(dev_priv);
>>  	if (ret)
>> -		goto err_submission;
>> +		goto err_shared;
>>
>>  	intel_huc_init_hw(&dev_priv->huc);
>>  	ret = intel_guc_init_hw(&dev_priv->guc);
>> @@ -526,11 +536,8 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
>>  	gen9_disable_guc_interrupts(dev_priv);
>>  err_log_capture:
>>  	guc_capture_load_err_log(guc);
>> -err_submission:
>> -	if (i915_modparams.enable_guc_submission)
>> -		i915_guc_submission_fini(dev_priv);
>>  err_shared:
>> -	guc_shared_objects_fini(guc);
>> +	guc_shared_objects_fini(guc);
>
> ???
>
>>  err_guc:
>>  	i915_ggtt_disable_guc(dev_priv);
>>
>> @@ -567,7 +574,7 @@ void intel_uc_fini_hw(struct drm_i915_private *dev_priv)
>>  	if (i915_modparams.enable_guc_submission) {
>>  		gen9_disable_guc_interrupts(dev_priv);
>> -		i915_guc_submission_fini(dev_priv);
>> +		i915_guc_submission_shared_objects_fini(dev_priv);
>>  	}
>>
>>  	guc_shared_objects_fini(&dev_priv->guc);
>> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/
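As an illustration only, not code from the patch under review: the reworked guc_shared_objects_init() uses the usual kernel goto ladder, where each error label tears down everything created before the failing step, in reverse order. The user-space sketch below shows the same shape with three boolean "objects". The struct, the fail_step parameter and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the log, ADS and submission objects. */
struct sketch_guc {
	bool log;
	bool ads;
	bool submission;
};

/* Pretend allocation: fails on request, otherwise marks the object live. */
static int sketch_create(bool *obj, bool fail)
{
	if (fail)
		return -1;
	*obj = true;
	return 0;
}

/* fail_step selects which creation step (1..3) fails; 0 means none. */
static int sketch_shared_objects_init(struct sketch_guc *guc,
				      bool enable_submission, int fail_step)
{
	int ret;

	assert(guc);

	ret = sketch_create(&guc->log, fail_step == 1);
	if (ret)
		return ret;

	ret = sketch_create(&guc->ads, fail_step == 2);
	if (ret)
		goto err_logs;

	if (enable_submission) {
		ret = sketch_create(&guc->submission, fail_step == 3);
		if (ret)
			goto err_ads;
	}

	return 0;

err_ads:
	guc->ads = false;	/* undo step 2, then fall through */
err_logs:
	guc->log = false;	/* undo step 1 */
	return ret;
}

/* Helper for experiments: number of objects still live after init. */
static int sketch_live_after_init(bool enable_submission, int fail_step)
{
	struct sketch_guc guc = { false, false, false };

	(void)sketch_shared_objects_init(&guc, enable_submission, fail_step);
	return guc.log + guc.ads + guc.submission;
}
```

Whichever step fails, nothing stays allocated, and a step that is skipped (submission disabled) cannot fail. That is the property the err_ads/err_logs labels in the quoted hunk are meant to provide.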
Re: [Intel-gfx] [PATCH 08/21] drm/i915: align the vma start to the largest gtt page size
Quoting Matthew Auld (2017-10-06 15:50:28)
> For the 48b PPGTT try to align the vma start address to the required
> page size boundary to guarantee we use said page size in the gtt. If we
> are dealing with multiple page sizes, we can't guarantee anything and
> just align to the largest. For soft pinning and objects which need to be
> tightly packed into the lower 32bits we don't force any alignment.
>
> v2: various improvements suggested by Chris
>
> v3: use set_pages and better placement of page_sizes
>
> v4: prefer upper_32_bits()
>
> v5: assign vma->page_sizes = vma->obj->page_sizes directly
>     prefer sizeof(vma->page_sizes)
>
> Signed-off-by: Matthew Auld
> Cc: Joonas Lahtinen
> Cc: Chris Wilson
> Reviewed-by: Chris Wilson
> Reviewed-by: Joonas Lahtinen
> ---
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 49bf49571e47..5067eab27829 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -493,6 +493,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>  		if (ret)
>  			goto err_clear;
>  	} else {
> +		/*
> +		 * We only support huge gtt pages through the 48b PPGTT,
> +		 * however we also don't want to force any alignment for
> +		 * objects which need to be tightly packed into the low 32bits.
> +		 */
> +		if (upper_32_bits(end) &&

Bah, this assumed PIN_ZONE_4G behaviour and forgot about 4G GGTT. :|

Insert a sly !i915_vma_is_ggtt(vma) && here. Or use upper_32_bits(end-1).

Hmm. Atm we have the pervasive assumption that GGTT is capped at 4G, so
we could use end-1 with a comment. The theory about not wanting to waste
space in the low 4G is theory no more!
-Chris
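As an illustration only, not code from the thread: the difference between the two checks discussed above shows up exactly when end equals 4 GiB, since end is an exclusive upper bound. A minimal sketch, with hypothetical sketch_* names and a hand-written stand-in for upper_32_bits():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for upper_32_bits(). */
static uint32_t sketch_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Check as originally posted: treats end == 4 GiB as "beyond 32 bits". */
static bool sketch_beyond_4g_v5(uint64_t end)
{
	return sketch_upper_32_bits(end) != 0;
}

/* Check with end - 1: tests the last usable address rather than the bound. */
static bool sketch_beyond_4g_v6(uint64_t end)
{
	assert(end != 0); /* end - 1 would wrap for an empty range */
	return sketch_upper_32_bits(end - 1) != 0;
}
```

For an address space capped at exactly 4 GiB the first form reports true and the second false, which is why the end - 1 variant excludes a 4 GiB GGTT without needing an explicit i915_vma_is_ggtt() test.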
Re: [Intel-gfx] ✓ Fi.CI.IGT: success for huge gtt pages (rev13)
Quoting Patchwork (2017-10-06 22:29:20)
> == Series Details ==
>
> Series: huge gtt pages (rev13)
> URL   : https://patchwork.freedesktop.org/series/25118/
> State : success
>
> == Summary ==
>
> Test kms_setmode:
>         Subgroup basic:
>                 fail       -> PASS       (shard-hsw) fdo#99912
>
> fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
>
> shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10117s
>
> == Logs ==
>
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/shards.html

An oddity pops out, ENOSPC from gem_exec_schedule/wide-render. At a
guess, overallocation?
-Chris
[Intel-gfx] ✓ Fi.CI.IGT: success for huge gtt pages (rev13)
== Series Details ==

Series: huge gtt pages (rev13)
URL   : https://patchwork.freedesktop.org/series/25118/
State : success

== Summary ==

Test kms_setmode:
        Subgroup basic:
                fail       -> PASS       (shard-hsw) fdo#99912

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10117s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/shards.html
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
On Fri, Oct 06, 2017 at 06:19:12PM -0300, Paulo Zanoni wrote:
> On Fri, 2017-10-06 at 10:45 +, Patchwork wrote:
> > == Series Details ==
> >
> > Series: series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> > URL   : https://patchwork.freedesktop.org/series/31459/
> > State : warning
> >
> > == Summary ==
> >
> > Series 31459v1 series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> > https://patchwork.freedesktop.org/api/1.0/series/31459/revisions/1/mbox/
> >
> > Test chamelium:
> >         Subgroup dp-crc-fast:
> >                 pass       -> FAIL       (fi-kbl-7500u) fdo#102514
> > Test gem_ctx_switch:
> >         Subgroup basic-default:
> >                 pass       -> INCOMPLETE (fi-cnl-y) fdo#103027
> > Test gem_exec_suspend:
> >         Subgroup basic-s3:
> >                 pass       -> DMESG-WARN (fi-cfl-s) fdo#103026
> >         Subgroup basic-s4-devices:
> >                 pass       -> DMESG-WARN (fi-kbl-7500u)
>
> [  242.023771] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not
> signal timeout (has irq: 1)!
>
> I do not believe this is caused by my patches. This test on this
> machine is failing in many other recent patch series, but with
> different error messages. Looks very unstable.

Yes, this is the system where we have recently seen these messages due to
the LSPCON issue.

Manasi

> >
> > fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
> > fdo#103027 https://bugs.freedesktop.org/show_bug.cgi?id=103027
> > fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
> >
> > fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:461s
> > fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
> > fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:390s
> > fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:573s
> > fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
> > fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
> > fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:537s
> > fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:538s
> > fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:525s
> > fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:553s
> > fi-cnl-y         total:31   pass:21   dwarn:0   dfail:0   fail:0   skip:9
> > fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
> > fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:599s
> > fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:439s
> > fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:416s
> > fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:468s
> > fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
> > fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
> > fi-kbl-7500u     total:289  pass:262  dwarn:2   dfail:0   fail:1   skip:24  time:501s
> > fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:584s
> > fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:499s
> > fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:598s
> > fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:661s
> > fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
> > fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:663s
> > fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:534s
> > fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:521s
> > fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
> > fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:578s
> > fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:437s
> >
> > 97c9e99b242fe40bbda48ba2bcaed07c47fba085 drm-tip: 2017y-10m-06d-09h-07m-21s UTC integration manifest
> > d0674fac8e07 drm/i915: avoid division by zero on cnl_calc_wrpll_link
> > 7d8046f85adf drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> >
> > == Logs ==
> >
> > For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5919/
>
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
On Fri, 2017-10-06 at 10:45 +, Patchwork wrote:
> == Series Details ==
>
> Series: series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> URL   : https://patchwork.freedesktop.org/series/31459/
> State : warning
>
> == Summary ==
>
> Series 31459v1 series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> https://patchwork.freedesktop.org/api/1.0/series/31459/revisions/1/mbox/
>
> Test chamelium:
>         Subgroup dp-crc-fast:
>                 pass       -> FAIL       (fi-kbl-7500u) fdo#102514
> Test gem_ctx_switch:
>         Subgroup basic-default:
>                 pass       -> INCOMPLETE (fi-cnl-y) fdo#103027
> Test gem_exec_suspend:
>         Subgroup basic-s3:
>                 pass       -> DMESG-WARN (fi-cfl-s) fdo#103026
>         Subgroup basic-s4-devices:
>                 pass       -> DMESG-WARN (fi-kbl-7500u)

[  242.023771] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not signal timeout (has irq: 1)!

I do not believe this is caused by my patches. This test on this machine
is failing in many other recent patch series, but with different error
messages. Looks very unstable.

>
> fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
> fdo#103027 https://bugs.freedesktop.org/show_bug.cgi?id=103027
> fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
>
> fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:461s
> fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
> fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:390s
> fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:573s
> fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
> fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
> fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:537s
> fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:538s
> fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:525s
> fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:553s
> fi-cnl-y         total:31   pass:21   dwarn:0   dfail:0   fail:0   skip:9
> fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
> fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:599s
> fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:439s
> fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:416s
> fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:468s
> fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
> fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
> fi-kbl-7500u     total:289  pass:262  dwarn:2   dfail:0   fail:1   skip:24  time:501s
> fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:584s
> fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:499s
> fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:598s
> fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:661s
> fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
> fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:663s
> fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:534s
> fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:521s
> fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
> fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:578s
> fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:437s
>
> 97c9e99b242fe40bbda48ba2bcaed07c47fba085 drm-tip: 2017y-10m-06d-09h-07m-21s UTC integration manifest
> d0674fac8e07 drm/i915: avoid division by zero on cnl_calc_wrpll_link
> 7d8046f85adf drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
>
> == Logs ==
>
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5919/
Re: [Intel-gfx] [PATCH igt] igt/gem_fence_thresh: Use streaming reads for verify
Quoting Chris Wilson (2017-08-23 13:55:55)
> At the moment, the verify tests use an extremely brutal write-read of
> every dword, degrading performance to UC. If we break those up into
> cachelines, we can do a wcb write/read at a time instead, roughly 8x
> faster. We lose the accuracy of the forced wcb flushes around every dword,
> but we are retaining the overall behaviour of checking reads following
> writes instead. To compensate, we do check that a single dword write/read
> before using wcb aligned accesses. This fixes one of the APL timeouts...
> Signed-off-by: Chris Wilson
> ---
>  tests/gem_fence_thrash.c | 116 +--
>  1 file changed, 101 insertions(+), 15 deletions(-)
>
> diff --git a/tests/gem_fence_thrash.c b/tests/gem_fence_thrash.c
> index 52095f26..3e1edb73 100644
> --- a/tests/gem_fence_thrash.c
> +++ b/tests/gem_fence_thrash.c
> @@ -30,7 +30,6 @@
>  #include "config.h"
>  #endif
>
> -#include "igt.h"
>  #include
>  #include
>  #include
> @@ -43,6 +42,12 @@
>  #include
>  #include "drm.h"
>
> +#include "igt.h"
> +#include "igt_x86.h"
> +
> +#define PAGE_SIZE 4096
> +#define CACHELINE 64
> +
>  #define OBJECT_SIZE (128*1024) /* restricted to 1MiB alignment on i915 fences */
>
>  /* Before introduction of the LRU list for fences, allocation of a fence for a page
> @@ -104,15 +109,78 @@ bo_copy (void *_arg)
>  	return NULL;
>  }
>
> +#if defined(__x86_64__) && !defined(__clang__)
> +#define MOVNT 512
> +
> +#pragma GCC push_options
> +#pragma GCC target("sse4.1")
> +
> +#include
> +__attribute__((noinline))
> +static void copy_wc_page(void *dst, void *src)
> +{
> +	if (igt_x86_features() & SSE4_1) {
> +		__m128i *S = (__m128i *)src;
> +		__m128i *D = (__m128i *)dst;
> +
> +		for (int i = 0; i < PAGE_SIZE/CACHELINE; i++) {
> +			__m128i tmp[4];
> +
> +			tmp[0] = _mm_stream_load_si128(S++);
> +			tmp[1] = _mm_stream_load_si128(S++);
> +			tmp[2] = _mm_stream_load_si128(S++);
> +			tmp[3] = _mm_stream_load_si128(S++);
> +
> +			_mm_store_si128(D++, tmp[0]);
> +			_mm_store_si128(D++, tmp[1]);
> +			_mm_store_si128(D++, tmp[2]);
> +			_mm_store_si128(D++, tmp[3]);
> +		}
> +	} else
> +		memcpy(dst, src, PAGE_SIZE);
> +}
> +static void copy_wc_cacheline(void *dst, void *src)
> +{
> +	if (igt_x86_features() & SSE4_1) {
> +		__m128i *S = (__m128i *)src;
> +		__m128i *D = (__m128i *)dst;
> +		__m128i tmp[4];
> +
> +		tmp[0] = _mm_stream_load_si128(S++);
> +		tmp[1] = _mm_stream_load_si128(S++);
> +		tmp[2] = _mm_stream_load_si128(S++);
> +		tmp[3] = _mm_stream_load_si128(S++);
> +
> +		_mm_store_si128(D++, tmp[0]);
> +		_mm_store_si128(D++, tmp[1]);
> +		_mm_store_si128(D++, tmp[2]);
> +		_mm_store_si128(D++, tmp[3]);
> +	} else
> +		memcpy(dst, src, CACHELINE);
> +}
> +
> +#pragma GCC pop_options
> +
> +#else
> +static void copy_wc_page(void *dst, const void *src)
> +{
> +	memcpy(dst, src, PAGE_SIZE);
> +}
> +static void copy_wc_cacheline(void *dst, const void *src)
> +{
> +	memcpy(dst, src, CACHELINE);
> +}
> +#endif
> +
>  static void
>  _bo_write_verify(struct test *t)
>  {
>  	int fd = t->fd;
>  	int i, k;
>  	uint32_t **s;
> -	uint32_t v;
>  	unsigned int dwords = OBJECT_SIZE >> 2;
>  	const char *tile_str[] = { "none", "x", "y" };
> +	uint32_t tmp[PAGE_SIZE/sizeof(uint32_t)];
>
>  	igt_assert(t->tiling >= 0 && t->tiling <= I915_TILING_Y);
>  	igt_assert_lt(0, t->num_surfaces);
> @@ -124,21 +192,39 @@ _bo_write_verify(struct test *t)
>  		s[k] = bo_create(fd, t->tiling);
>
>  	for (k = 0; k < t->num_surfaces; k++) {
> -		volatile uint32_t *a = s[k];
> -
> -		for (i = 0; i < dwords; i++) {
> -			a[i] = i;
> -			v = a[i];
> -			igt_assert_f(v == i,
> -				     "tiling %s: write failed at %d (%x)\n",
> -				     tile_str[t->tiling], i, v);
> +		uint32_t *a = s[k];
> +
> +		a[0] = 0xdeadbeef;
> +		igt_assert_f(a[0] == 0xdeadbeef,
> +			     "tiling %s: write failed at start (%x)\n",
> +			     tile_str[t->tiling], a[0]);
> +
> +		a[dwords - 1] = 0xc0ffee;
> +		igt_assert_f(a[dwords - 1] == 0xc0ffee,
> +			     "tiling %s: write failed at end (%x)\n",
> +			     tile_str[t->tiling], a[dwords - 1]);
> +
> +		for (i = 0; i < dwords;
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Fix pointer-to-int conversion (rev2)
== Series Details ==

Series: drm/i915: Fix pointer-to-int conversion (rev2)
URL   : https://patchwork.freedesktop.org/series/31488/
State : success

== Summary ==

shard-hsw      total:2446 pass:1328 dwarn:6 dfail:0 fail:9 skip:1103 time:10066s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5932/shards.html
[Intel-gfx] ✓ Fi.CI.BAT: success for benchmark/gem_busy: Compare polling with syncobj_wait
== Series Details ==

Series: benchmark/gem_busy: Compare polling with syncobj_wait
URL   : https://patchwork.freedesktop.org/series/31507/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

No testlist changes.

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u   total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:458s
fi-bdw-gvtdvm  total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:477s
fi-blb-e6850   total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:394s
fi-bsw-n3050   total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:566s
fi-bwr-2160    total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:288s
fi-bxt-dsi     total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:531s
fi-bxt-j4205   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:532s
fi-byt-j1900   total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:546s
fi-byt-n2820   total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:523s
fi-cfl-s       total:289 pass:256 dwarn:1 dfail:0 fail:0 skip:32 time:563s
fi-cnl-y       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:647s
fi-elk-e7500   total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:441s
fi-glk-1       total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:608s
fi-hsw-4770    total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:441s
fi-hsw-4770r   total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:419s
fi-ivb-3520m   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:495s
fi-ivb-3770    total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:476s
fi-kbl-7500u   total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:505s
fi-kbl-7560u   total:289 pass:270 dwarn:0 dfail:0 fail:0 skip:19 time:591s
fi-kbl-7567u   total:289 pass:265 dwarn:4 dfail:0 fail:0 skip:20 time:488s
fi-kbl-r       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:599s
fi-pnv-d510    total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:658s
fi-skl-6260u   total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:472s
fi-skl-6700hq  total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:664s
fi-skl-6700k   total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:540s
fi-skl-6770hq  total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:524s
fi-skl-gvtdvm  total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:483s
fi-snb-2520m   total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:587s
fi-snb-2600    total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:441s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_306/
[Intel-gfx] [PATCH igt] benchmark/gem_busy: Compare polling with syncobj_wait
v2: Hook the syncobj array to the execbuf!

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
---
 benchmarks/gem_busy.c | 75 ++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/benchmarks/gem_busy.c b/benchmarks/gem_busy.c
index f050454b..ce631d56 100644
--- a/benchmarks/gem_busy.c
+++ b/benchmarks/gem_busy.c
@@ -58,6 +58,15 @@
 #define DMABUF 0x4
 #define WAIT 0x8
 #define SYNC 0x10
+#define SYNCOBJ 0x20
+
+#define LOCAL_I915_EXEC_FENCE_ARRAY (1 << 19)
+struct local_gem_exec_fence {
+	uint32_t handle;
+	uint32_t flags;
+#define LOCAL_EXEC_FENCE_WAIT (1 << 0)
+#define LOCAL_EXEC_FENCE_SIGNAL (1 << 1)
+};
 
 static void gem_busy(int fd, uint32_t handle)
 {
@@ -109,11 +118,54 @@ static int sync_merge(int fd1, int fd2)
 	return data.fence;
 }
 
+static uint32_t __syncobj_create(int fd)
+{
+	struct local_syncobj_create {
+		uint32_t handle, flags;
+	} arg;
+#define LOCAL_IOCTL_SYNCOBJ_CREATE DRM_IOWR(0xBF, struct local_syncobj_create)
+
+	memset(&arg, 0, sizeof(arg));
+	ioctl(fd, LOCAL_IOCTL_SYNCOBJ_CREATE, &arg);
+
+	return arg.handle;
+}
+
+static uint32_t syncobj_create(int fd)
+{
+	uint32_t ret;
+
+	igt_assert_neq((ret = __syncobj_create(fd)), 0);
+
+	return ret;
+}
+
+#define LOCAL_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define LOCAL_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct local_syncobj_wait {
+	__u64 handles;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+#define LOCAL_IOCTL_SYNCOBJ_WAIT DRM_IOWR(0xC3, struct local_syncobj_wait)
+static int __syncobj_wait(int fd, struct local_syncobj_wait *args)
+{
+	int err = 0;
+	if (drmIoctl(fd, LOCAL_IOCTL_SYNCOBJ_WAIT, args))
+		err = -errno;
+	return err;
+}
+
 static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry reloc[2];
+	struct local_gem_exec_fence syncobj;
 	unsigned engines[16];
 	unsigned nengine;
 	uint32_t *batch;
@@ -150,6 +202,15 @@ static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
 		return 77;
 	}
 
+	if (flags & SYNCOBJ) {
+		syncobj.handle = syncobj_create(fd);
+		syncobj.flags = LOCAL_EXEC_FENCE_SIGNAL;
+
+		execbuf.cliprects_ptr = (uintptr_t)&syncobj;
+		execbuf.num_cliprects = 1;
+		execbuf.flags |= LOCAL_I915_EXEC_FENCE_ARRAY;
+	}
+
 	if (ring == -1) {
 		nengine = 0;
 		for (ring = 1; ring < 16; ring++) {
@@ -235,6 +296,14 @@ static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
 			struct pollfd pfd = { .fd = dmabuf, .events = POLLOUT };
 			for (int inner = 0; inner < 1024; inner++)
 				poll(&pfd, 1, 0);
+		} else if (flags & SYNCOBJ) {
+			struct local_syncobj_wait arg = {
+				.handles = to_user_pointer(&syncobj.handle),
+				.count_handles = 1,
+			};
+
+			for (int inner = 0; inner < 1024; inner++)
+				__syncobj_wait(fd, &arg);
 		} else if (flags & SYNC) {
 			struct pollfd pfd = { .fd = fence, .events = POLLOUT };
 			for (int inner = 0; inner < 1024; inner++)
@@ -275,7 +344,7 @@ int main(int argc, char **argv)
 	int ncpus = 1;
 	int c;
 
-	while ((c = getopt (argc, argv, "e:r:dfswWI")) != -1) {
+	while ((c = getopt (argc, argv, "e:r:dfsSwWI")) != -1) {
 		switch (c) {
 		case 'e':
 			if (strcmp(optarg, "rcs") == 0)
@@ -314,6 +383,10 @@ int main(int argc, char **argv)
 			flags |= SYNC;
 			break;
 
+		case 'S':
+			flags |= SYNCOBJ;
+			break;
+
 		case 'W':
 			flags |= WRITE;
 			break;
-- 
2.14.2
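
A reading aid for the SYNCOBJ branch of the patch, not part of it: the wait argument is initialised with only `handles` and `count_handles`, so `timeout_nsec` stays zero. My understanding is that a zero absolute timeout makes the wait return without blocking, which would make the loop comparable to the `poll(&pfd, 1, 0)` branches; treat that as an assumption rather than something the patch states. The sketch below mirrors `struct local_syncobj_wait` with `<stdint.h>` types and builds the same argument without touching a DRM device. `struct syncobj_wait_args`, `ptr_to_u64()` and `make_poll_wait()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirror of struct local_syncobj_wait from the patch, using stdint types. */
struct syncobj_wait_args {
	uint64_t handles;	/* user pointer to an array of handles */
	int64_t timeout_nsec;	/* absolute timeout; 0 is assumed to mean "do not block" */
	uint32_t count_handles;
	uint32_t flags;
	uint32_t first_signaled;
	uint32_t pad;
};

/* DRM_IOWR() encodes sizeof() into the ioctl number, so the layout must be stable. */
_Static_assert(sizeof(struct syncobj_wait_args) == 32, "unexpected struct layout");

/* Hypothetical helper playing the role of igt's to_user_pointer(). */
static uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Hypothetical helper: build a wait on a single handle with a zero timeout. */
static struct syncobj_wait_args make_poll_wait(const uint32_t *handle)
{
	struct syncobj_wait_args arg;

	assert(handle != NULL);
	memset(&arg, 0, sizeof(arg));	/* timeout, flags and pad all start at 0 */
	arg.handles = ptr_to_u64(handle);
	arg.count_handles = 1;
	return arg;
}
```

In the benchmark itself the equivalent structure is passed to `__syncobj_wait()` 1024 times per iteration, so any per-call setup cost is kept outside the inner loop.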
[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone
== Series Details ==

Series: series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone
URL   : https://patchwork.freedesktop.org/series/31489/
State : failure

== Summary ==

Test kms_cursor_legacy:
        Subgroup cursor-vs-flip-atomic-transitions:
                pass       -> FAIL       (shard-hsw)
Test perf:
        Subgroup polling:
                pass       -> FAIL       (shard-hsw) fdo#102252

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-hsw      total:2446 pass:1326 dwarn:6 dfail:0 fail:11 skip:1103 time:10112s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5931/shards.html
[Intel-gfx] ✓ Fi.CI.BAT: success for igt/gem_eio: Check hang/eio recovery during suspend
== Series Details ==

Series: igt/gem_eio: Check hang/eio recovery during suspend
URL   : https://patchwork.freedesktop.org/series/31485/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

Testlist changes:
+igt@gem_eio@in-flight-suspend

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u   total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:457s
fi-bdw-gvtdvm  total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:479s
fi-blb-e6850   total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:393s
fi-bsw-n3050   total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:593s
fi-bwr-2160    total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:290s
fi-bxt-dsi     total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:535s
fi-bxt-j4205   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:539s
fi-byt-j1900   total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:549s
fi-byt-n2820   total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:543s
fi-cfl-s       total:289 pass:256 dwarn:1 dfail:0 fail:0 skip:32 time:571s
fi-cnl-y       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:642s
fi-elk-e7500   total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:440s
fi-glk-1       total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:607s
fi-hsw-4770    total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:442s
fi-hsw-4770r   total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:423s
fi-ivb-3520m   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:509s
fi-ivb-3770    total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:477s
fi-kbl-7500u   total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:506s
fi-kbl-7560u   total:289 pass:270 dwarn:0 dfail:0 fail:0 skip:19 time:590s
fi-kbl-7567u   total:289 pass:265 dwarn:4 dfail:0 fail:0 skip:20 time:494s
fi-kbl-r       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:601s
fi-pnv-d510    total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:657s
fi-skl-6260u   total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:479s
fi-skl-6700hq  total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:664s
fi-skl-6700k   total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:542s
fi-skl-6770hq  total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:526s
fi-skl-gvtdvm  total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:476s
fi-snb-2520m   total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:588s
fi-snb-2600    total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:437s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_305/
[Intel-gfx] ✗ Fi.CI.BAT: warning for igt/gem_exec_capture: Exercise readback of userptr
== Series Details ==

Series: igt/gem_exec_capture: Exercise readback of userptr
URL   : https://patchwork.freedesktop.org/series/31480/
State : warning

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

Testlist changes:
+igt@gem_exec_capture@userptr

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-skl-6770hq)

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u   total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:460s
fi-bdw-gvtdvm  total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:473s
fi-blb-e6850   total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:397s
fi-bsw-n3050   total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:579s
fi-bwr-2160    total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:288s
fi-bxt-dsi     total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:536s
fi-bxt-j4205   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:536s
fi-byt-j1900   total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:549s
fi-byt-n2820   total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:538s
fi-cfl-s       total:289 pass:256 dwarn:1 dfail:0 fail:0 skip:32 time:571s
fi-cnl-y       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:643s
fi-elk-e7500   total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:445s
fi-glk-1       total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:602s
fi-hsw-4770    total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:447s
fi-hsw-4770r   total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:419s
fi-ivb-3520m   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:504s
fi-ivb-3770    total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:479s
fi-kbl-7500u   total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:500s
fi-kbl-7560u   total:289 pass:270 dwarn:0 dfail:0 fail:0 skip:19 time:581s
fi-kbl-7567u   total:289 pass:265 dwarn:4 dfail:0 fail:0 skip:20 time:499s
fi-kbl-r       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:595s
fi-skl-6260u   total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:472s
fi-skl-6700hq  total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:662s
fi-skl-6700k   total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:540s
fi-skl-6770hq  total:289 pass:268 dwarn:1 dfail:0 fail:0 skip:20 time:566s
fi-skl-gvtdvm  total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:473s
fi-snb-2520m   total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:584s
fi-snb-2600    total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:444s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_304/
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Separate RC6, RPS, LLC ring Frequency management
== Series Details ==

Series: drm/i915: Separate RC6, RPS, LLC ring Frequency management
URL   : https://patchwork.freedesktop.org/series/31487/
State : success

== Summary ==

Test kms_setmode:
        Subgroup basic:
                fail       -> PASS       (shard-hsw) fdo#99912

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw      total:2446 pass:1329 dwarn:6 dfail:0 fail:8 skip:1103 time:10141s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5930/shards.html
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Cancel the hotplug work when unregistering the connector (rev2)
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector (rev2)
URL   : https://patchwork.freedesktop.org/series/31501/
State : success

== Summary ==

Series 31501v2 drm/i915: Cancel the hotplug work when unregistering the connector
https://patchwork.freedesktop.org/api/1.0/series/31501/revisions/2/mbox/

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u   total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:452s
fi-bdw-gvtdvm  total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:476s
fi-blb-e6850   total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:395s
fi-bsw-n3050   total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:561s
fi-bwr-2160    total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:288s
fi-bxt-dsi     total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:524s
fi-bxt-j4205   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:530s
fi-byt-j1900   total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:543s
fi-byt-n2820   total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:530s
fi-cfl-s       total:289 pass:256 dwarn:1 dfail:0 fail:0 skip:32 time:560s
fi-cnl-y       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:623s
fi-elk-e7500   total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:444s
fi-glk-1       total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:599s
fi-hsw-4770    total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:438s
fi-hsw-4770r   total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:419s
fi-ivb-3520m   total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:505s
fi-ivb-3770    total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:475s
fi-kbl-7500u   total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:502s
fi-kbl-7560u   total:289 pass:270 dwarn:0 dfail:0 fail:0 skip:19 time:580s
fi-kbl-7567u   total:289 pass:265 dwarn:4 dfail:0 fail:0 skip:20 time:495s
fi-kbl-r       total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:599s
fi-skl-6260u   total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:469s
fi-skl-6700hq  total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:661s
fi-skl-6700k   total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:530s
fi-skl-6770hq  total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:525s
fi-skl-gvtdvm  total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:473s
fi-snb-2520m   total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:580s
fi-snb-2600    total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:431s
fi-pnv-d510 failed to connect after reboot

aaf31e875e72b50f6a970c11f797b7f5b61a2681 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest
e16fa7a43e0a drm/i915: Cancel the hotplug work when unregistering the connector

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5937/
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged
Quoting Daniel Vetter (2017-10-06 15:20:09)
> On Fri, Oct 06, 2017 at 12:03:49PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2017-10-06 10:06:37)
> > > stop_machine is not really a locking primitive we should use, except
> > > when the hw folks tell us the hw is broken and that's the only way to
> > > work around it.
> > >
> > > This patch tries to address the locking abuse of stop_machine() from
> > >
> > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> > > Author: Chris Wilson
> > > Date: Tue Nov 22 14:41:21 2016 +
> > >
> > >     drm/i915: Stop the machine as we install the wedged submit_request handler
> > >
> > > Chris said parts of the reasons for going with stop_machine() was that
> > > it's no overhead for the fast-path. But these callbacks use irqsave
> > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
> > >
> > > To stay as close as possible to the stop_machine semantics we first
> > > update all the submit function pointers to the nop handler, then call
> > > synchronize_rcu() to make sure no new requests can be submitted. This
> > > should give us exactly the huge barrier we want.
> > >
> > > I pondered whether we should annotate engine->submit_request as __rcu
> > > and use rcu_assign_pointer and rcu_dereference on it. But the reason
> > > behind those is to make sure the compiler/cpu barriers are there for
> > > when you have an actual data structure you point at, to make sure all
> > > the writes are seen correctly on the read side. But we just have a
> > > function pointer, and .text isn't changed, so no need for these
> > > barriers and hence no need for annotations.
> > >
> > > This should fix the following lockdep splat:
> > >
> > > ==
> > > WARNING: possible circular locking dependency detected
> > > 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> > > --
> > > kworker/3:4/562 is trying to acquire lock:
> > >  (cpu_hotplug_lock.rw_sem){}, at: [] stop_machine+0x1c/0x40
> > >
> > > but task is already holding lock:
> > >  (&dev->struct_mutex){+.+.}, at: [] i915_reset_device+0x1e8/0x260 [i915]
> > >
> > > which lock already depends on the new lock.
> > >
> > > the existing dependency chain (in reverse order) is:
> > >
> > > -> #6 (&dev->struct_mutex){+.+.}:
> > >        __lock_acquire+0x1420/0x15e0
> > >        lock_acquire+0xb0/0x200
> > >        __mutex_lock+0x86/0x9b0
> > >        mutex_lock_interruptible_nested+0x1b/0x20
> > >        i915_mutex_lock_interruptible+0x51/0x130 [i915]
> > >        i915_gem_fault+0x209/0x650 [i915]
> > >        __do_fault+0x1e/0x80
> > >        __handle_mm_fault+0xa08/0xed0
> > >        handle_mm_fault+0x156/0x300
> > >        __do_page_fault+0x2c5/0x570
> > >        do_page_fault+0x28/0x250
> > >        page_fault+0x22/0x30
> > >
> > > -> #5 (&mm->mmap_sem){}:
> > >        __lock_acquire+0x1420/0x15e0
> > >        lock_acquire+0xb0/0x200
> > >        __might_fault+0x68/0x90
> > >        _copy_to_user+0x23/0x70
> > >        filldir+0xa5/0x120
> > >        dcache_readdir+0xf9/0x170
> > >        iterate_dir+0x69/0x1a0
> > >        SyS_getdents+0xa5/0x140
> > >        entry_SYSCALL_64_fastpath+0x1c/0xb1
> > >
> > > -> #4 (&sb->s_type->i_mutex_key#5){}:
> > >        down_write+0x3b/0x70
> > >        handle_create+0xcb/0x1e0
> > >        devtmpfsd+0x139/0x180
> > >        kthread+0x152/0x190
> > >        ret_from_fork+0x27/0x40
> > >
> > > -> #3 ((complete)&req.done){+.+.}:
> > >        __lock_acquire+0x1420/0x15e0
> > >        lock_acquire+0xb0/0x200
> > >        wait_for_common+0x58/0x210
> > >        wait_for_completion+0x1d/0x20
> > >        devtmpfs_create_node+0x13d/0x160
> > >        device_add+0x5eb/0x620
> > >        device_create_groups_vargs+0xe0/0xf0
> > >        device_create+0x3a/0x40
> > >        msr_device_create+0x2b/0x40
> > >        cpuhp_invoke_callback+0xc9/0xbf0
> > >        cpuhp_thread_fun+0x17b/0x240
> > >        smpboot_thread_fn+0x18a/0x280
> > >        kthread+0x152/0x190
> > >        ret_from_fork+0x27/0x40
> > >
> > > -> #2 (cpuhp_state-up){+.+.}:
> > >        __lock_acquire+0x1420/0x15e0
> > >        lock_acquire+0xb0/0x200
> > >        cpuhp_issue_call+0x133/0x1c0
> > >        __cpuhp_setup_state_cpuslocked+0x139/0x2a0
> > >        __cpuhp_setup_state+0x46/0x60
> > >        page_writeback_init+0x43/0x67
> > >        pagecache_init+0x3d/0x42
> > >        start_kernel+0x3a8/0x3fc
> > >        x86_64_start_reservations+0x2a/0x2c
> > >        x86_64_start_kernel+0x6d/0x70
> > >        verify_cpu+0x0/0xfb
> > >
> > > -> #1 (cpuhp_state_mutex){+.+.}:
> > >        __lock_acquire+0x1420/0x15e0
> > >        lock_acquire+0xb0/0x200
> > >        __mutex_lock+0x86/0x9b0
> > >        mutex_lock_nested+0x1b/0x20
> > >        __cpuhp_setup_state_cpuslocked+0x53/0x2a0
> > >        __cpuhp_setup_state+0x46/0x60
> > >        page_alloc_init+0x28/0x30
Re: [Intel-gfx] [PATCH v2] drm/i915: Order two completing nop_submit_request
Quoting Tvrtko Ursulin (2017-10-06 13:23:03)
>
> On 06/10/2017 12:56, Chris Wilson wrote:
> > If two nop's (requests in-flight following a wedged device) complete at
> > the same time, the global_seqno value written to the HWSP is undefined
> > as the two threads are not serialized.
> >
> > v2: Use irqsafe spinlock. We expect the callback may be called from
> > inside another irq spinlock, so we can't unconditionally restore irqs.
> >
> > Fixes: ce1135c7de64 ("drm/i915: Complete requests in nop_submit_request")
> > Signed-off-by: Chris Wilson
> > Cc: Tvrtko Ursulin
> > Reviewed-by: Tvrtko Ursulin #v1
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index ab8c6946fea4..6a6974ed8f74 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3014,10 +3014,15 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
> >
> >  static void nop_submit_request(struct drm_i915_gem_request *request)
> >  {
> > +	unsigned long flags;
> > +
> >  	GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error));
> >  	dma_fence_set_error(&request->fence, -EIO);
> > -	i915_gem_request_submit(request);
> > +
> > +	spin_lock_irqsave(&request->engine->timeline->lock, flags);
> > +	__i915_gem_request_submit(request);
> >  	intel_engine_init_global_seqno(request->engine, request->global_seqno);
> > +	spin_unlock_irqrestore(&request->engine->timeline->lock, flags);
> >  }
> >
> >  static void engine_set_wedged(struct intel_engine_cs *engine)
> >
>
> Ooops..
>
> Reviewed-by: Tvrtko Ursulin

Thanks for asking the question that led to the discovery of the race and
then reviewing the results! Pushed,
-Chris
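
The race discussed in this thread can be sketched in userspace. This is a minimal illustration under stated assumptions, not the i915 code: `struct fake_timeline`, `fake_nop_submit()` and `run_submitters()` are hypothetical names, and a pthread mutex stands in for the irqsave spinlock on `engine->timeline->lock`. The only point it makes is that assigning a seqno and publishing it happen inside one critical section, so the last value published is always the last one assigned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine timeline plus its HWSP seqno slot. */
struct fake_timeline {
	pthread_mutex_t lock;
	uint32_t last_assigned;	/* seqno handed out at submit time */
	uint32_t hwsp_seqno;	/* value most recently "written to the HWSP" */
};

/* Assign a seqno and publish it as a single critical section. */
static void fake_nop_submit(struct fake_timeline *tl)
{
	pthread_mutex_lock(&tl->lock);
	uint32_t seqno = ++tl->last_assigned;
	assert(seqno > tl->hwsp_seqno);	/* a stale value is never published */
	tl->hwsp_seqno = seqno;
	pthread_mutex_unlock(&tl->lock);
}

struct submitter {
	struct fake_timeline *tl;
	int count;
};

static void *submit_thread(void *arg)
{
	struct submitter *s = arg;

	for (int i = 0; i < s->count; i++)
		fake_nop_submit(s->tl);
	return NULL;
}

/* Run up to 16 submitter threads; return 1 if the published seqno is the last assigned. */
static int run_submitters(int nthreads, int per_thread)
{
	struct fake_timeline tl = { .last_assigned = 0, .hwsp_seqno = 0 };
	struct submitter s = { .tl = &tl, .count = per_thread };
	pthread_t threads[16];
	int ok;

	if (nthreads > 16)
		nthreads = 16;

	pthread_mutex_init(&tl.lock, NULL);
	for (int i = 0; i < nthreads; i++)
		pthread_create(&threads[i], NULL, submit_thread, &s);
	for (int i = 0; i < nthreads; i++)
		pthread_join(threads[i], NULL);

	ok = tl.hwsp_seqno == tl.last_assigned &&
	     tl.last_assigned == (uint32_t)(nthreads * per_thread);
	pthread_mutex_destroy(&tl.lock);
	return ok;
}
```

If the lock were dropped between the increment and the store, two threads could publish their values in the opposite order to the one in which they were assigned, which is the undefined HWSP value the commit message describes.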
[Intel-gfx] [PATCH v2] drm/i915: Cancel the hotplug work when unregistering the connector
When we unregister the connector, we may have pending hotplug work. This needs to be cancelled early during the teardown so that it does not fire after we have freed the connector. Otherwise we may see something like:

DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
[ cut here ]
WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60
Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel]
CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1
Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
task: 8803c827aa40 task.stack: c952
RIP: 0010:mutex_destroy+0x4e/0x60
RSP: 0018:c9523d58 EFLAGS: 00010292
RAX: 002a RBX: 88044fbef648 RCX:
RDX: 8001 RSI: 0001 RDI: 810f0cf0
RBP: c9523d60 R08: 0001 R09: 0001
R10: 0f21cb81 R11: R12: 88044f71efc8
R13: a02b3d20 R14: a02b3d90 R15: 880459b29308
FS: 7f5df4d6e8c0() GS:88045d30() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 55ec51f00a18 CR3: 000451782006 CR4: 003606e0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Call Trace:
 drm_fb_helper_fini+0xd9/0x130
 intel_fbdev_destroy+0x12/0x60 [i915]
 intel_fbdev_fini+0x28/0x30 [i915]
 intel_modeset_cleanup+0x45/0xa0 [i915]
 i915_driver_unload+0x92/0x180 [i915]
 i915_pci_remove+0x19/0x30 [i915]
 pci_device_remove+0x39/0xb0
 device_release_driver_internal+0x15d/0x220
 driver_detach+0x40/0x80
 bus_remove_driver+0x58/0xd0
 driver_unregister+0x2c/0x40
 pci_unregister_driver+0x36/0xb0
 i915_exit+0x1a/0x8b [i915]
 SyS_delete_module+0x18c/0x1e0
 entry_SYSCALL_64_fastpath+0x1c/0xb1
RIP: 0033:0x7f5df3286287
RSP: 002b:7fff8e107cc8 EFLAGS: 0246 ORIG_RAX: 00b0
RAX: ffda RBX: 81493a03 RCX: 7f5df3286287
RDX: 0001 RSI: 0800 RDI: 564c7be02e48
RBP: c9523f88 R08: R09: 0080
R10: 7f5df4d6e8c0 R11: 0246 R12:
R13: 7fff8e107eb0 R14: R15:
 ? __this_cpu_preempt_check+0x13/0x20
Code: 00 00 5b 5d c3 e8 93 b9 3a 00 85 c0 74 ec 8b 05 e1 53 c3 01 85 c0 75 e2 48 c7 c6 86 a6 c7 81 48 c7 c7 8b 8d c6 81 e8 03 ae 01 00 <0f> ff eb cb 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 b8
---[ end trace 08901ff1a77d30c6 ]---
[drm:wait_panel_status [i915]] mask b80f value status control 0060
[drm:wait_panel_status [i915]] Wait complete
[drm:edp_panel_vdd_on [i915]] PP_STATUS: 0x PP_CONTROL: 0x0068
[drm:edp_panel_vdd_on [i915]] eDP port A panel power wasn't enabled
[drm:drm_dp_read_desc] DP sink: OUI 00-1c-f8 dev-ID HW-rev 0.0 SW-rev 7.49 quirks 0x
[drm:drm_edid_to_eld] ELD: no CEA Extension found
[drm:drm_helper_probe_single_connector_modes] [CONNECTOR:48:eDP-1] probed modes :
[drm:drm_mode_debug_printmodeline] Modeline 49:"1920x1080" 60 138780 1920 1966 1996 2080 1080 1082 1086 1112 0x48 0xa
[drm:drm_mode_debug_printmodeline] Modeline 50:"1920x1080" 40 92520 1920 1966 1996 2080 1080 1082 1086 1112 0x40 0xa
general protection fault: [#1] PREEMPT SMP
Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel]
CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G U W 4.14.0-rc3-CI-CI_DRM_3186+ #1
Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
Workqueue: events intel_dp_modeset_retry_work_fn [i915]
task: 88045a5caa40 task.stack: c9378000
RIP: 0010:drm_setup_crtcs+0x143/0xbf0
RSP: 0018:c937bd20 EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: 0002 RCX: 0001
RDX: 0001 RSI: 0780 RDI:
RBP: c937bdb8 R08: 0001 R09: 0001
R10: 0780 R11: R12: 0002
R13: 88044fbef4e8 R14: 0780 R15: 0438
FS: () GS:88045d20() knlGS:
CS: 0010 DS: ES:
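
The ordering described in the commit message (cancel any pending work before the object it refers to is freed) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the "work queue" is a single pending flag, and `freed` stands in for the real free so the example is safe to run. It is not the i915 change itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel or i915 APIs. */
struct toy_work {
    void (*fn)(struct toy_work *work);
    bool pending;
};

struct toy_connector {
    struct toy_work retry_work; /* must stay the first member (see the cast below) */
    bool freed;                 /* stands in for the real free */
    int work_runs;              /* handler invocations in total */
    int use_after_free;         /* handler invocations after "free" */
};

/* Work handler: records whether it ran on a connector that is already gone. */
void toy_retry_fn(struct toy_work *work)
{
    /* retry_work is the first member, so this cast recovers the connector. */
    struct toy_connector *c = (struct toy_connector *)work;

    c->work_runs++;
    if (c->freed)
        c->use_after_free++;
}

void toy_connector_init(struct toy_connector *c)
{
    assert(c != NULL);
    c->retry_work.fn = toy_retry_fn;
    c->retry_work.pending = false;
    c->freed = false;
    c->work_runs = 0;
    c->use_after_free = 0;
}

void toy_schedule(struct toy_work *w) { w->pending = true; }

/* Drop a queued work item; returns true if something was actually pending. */
bool toy_cancel(struct toy_work *w)
{
    bool was_pending = w->pending;

    w->pending = false;
    return was_pending;
}

/* Models the worker thread getting around to the queue later on. */
void toy_run_pending(struct toy_work *w)
{
    if (!w->pending)
        return;
    w->pending = false;
    w->fn(w);
}

/* Teardown: cancel first, only then release the object. */
void toy_connector_unregister(struct toy_connector *c)
{
    toy_cancel(&c->retry_work);
    c->freed = true;
}
```

If `toy_connector_unregister()` skipped the `toy_cancel()` call, a later `toy_run_pending()` would run the handler on a connector already marked freed, which is the shape of the fault in the trace above.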
[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Order two completing nop_submit_request (rev2)
== Series Details ==

Series: drm/i915: Order two completing nop_submit_request (rev2)
URL   : https://patchwork.freedesktop.org/series/31486/
State : warning

== Summary ==

Test kms_plane:
        Subgroup plane-panning-bottom-right-suspend-pipe-B-planes:
                pass       -> SKIP       (shard-hsw)

shard-hsw        total:2446 pass:1327 dwarn:6   dfail:0   fail:9   skip:1104 time:10071s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5929/shards.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Silence compiler warning for hsw_power_well_enable()
Quoting Imre Deak (2017-10-02 12:42:24)
> On Mon, Oct 02, 2017 at 11:04:16AM +0100, Chris Wilson wrote:
> > Not all compilers are able to determine that pg is guarded by wait_fuses
> > and so may think that pg is used uninitialized.
> >
> > Reported-by: Geert Uytterhoeven
> > Fixes: b2891eb2531e ("drm/i915/hsw+: Add has_fuses power well attribute")
> > Signed-off-by: Chris Wilson
> > Cc: Imre Deak
> > Cc: Arkadiusz Hiler
>
> Reviewed-by: Imre Deak

Thanks for the review, applied so we should be off the nag list in the cycle.
-Chris
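
For readers unfamiliar with this class of warning, here is a minimal sketch of the pattern: a variable that is only assigned and only read under the same flag, which some compilers still report as possibly uninitialized. The struct, the `toy_*` names and the explicit initialiser are assumptions for illustration; the thread does not show how the actual patch silences the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a power-well descriptor; not the i915 struct. */
struct toy_well {
    bool has_fuses;
    int idx;
};

enum { TOY_PG_NONE = -1 };

/*
 * pg is written and read only when wait_fuses is true, but a compiler that
 * cannot correlate the two branches may warn that pg is used uninitialized.
 * Giving it an explicit initial value is one common way to silence that.
 */
int toy_power_gate_for(const struct toy_well *well)
{
    bool wait_fuses;
    int pg = TOY_PG_NONE; /* explicit initialiser (assumed fix, see lead-in) */

    assert(well != NULL);
    wait_fuses = well->has_fuses;

    if (wait_fuses)
        pg = well->idx + 1;

    /* ... the enable sequence would sit between the two checks ... */

    if (wait_fuses)
        return pg;

    return TOY_PG_NONE;
}
```

The initial value is never observed when `wait_fuses` is true, so behaviour is unchanged; the cost is that a genuinely missing assignment would no longer be flagged.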
[Intel-gfx] ✓ Fi.CI.BAT: success for igt/gem_memfd: Exercise hugepages and memfd
== Series Details ==

Series: igt/gem_memfd: Exercise hugepages and memfd
URL   : https://patchwork.freedesktop.org/series/31460/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3186
cb32cc2ad1c3 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest

Testlist changes:
+igt@gem_memfd@1G
+igt@gem_memfd@2M
+igt@gem_memfd@64k

Test chamelium:
        Subgroup hdmi-crc-fast:
                pass       -> DMESG-WARN (fi-skl-6700k) fdo#103019
Test drv_module_reload:
        Subgroup basic-reload-inject:
                incomplete -> PASS       (fi-cfl-s) fdo#103022

fdo#103019 https://bugs.freedesktop.org/show_bug.cgi?id=103019
fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:457s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:473s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:404s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:570s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:528s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:535s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:545s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:532s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:568s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:645s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:439s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:606s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:451s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:420s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:503s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:484s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:501s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:489s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:599s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:472s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:662s
fi-skl-6700k     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:537s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:524s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:479s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:589s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:435s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_303/
[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Cancel the hotplug work when unregistering the connector
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector
URL   : https://patchwork.freedesktop.org/series/31501/
State : warning

== Summary ==

Series 31501v1 drm/i915: Cancel the hotplug work when unregistering the connector
https://patchwork.freedesktop.org/api/1.0/series/31501/revisions/1/mbox/

Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-blb-e6850)
                pass       -> DMESG-WARN (fi-pnv-d510)
                pass       -> DMESG-WARN (fi-bwr-2160)
                pass       -> DMESG-WARN (fi-elk-e7500)
                pass       -> DMESG-WARN (fi-snb-2520m)
                pass       -> DMESG-WARN (fi-snb-2600)
                pass       -> DMESG-WARN (fi-ivb-3520m)
                pass       -> DMESG-WARN (fi-ivb-3770)
                pass       -> DMESG-WARN (fi-byt-j1900)
                pass       -> DMESG-WARN (fi-hsw-4770)
                pass       -> DMESG-WARN (fi-hsw-4770r)
                pass       -> DMESG-WARN (fi-bdw-5557u)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
                pass       -> DMESG-WARN (fi-skl-6260u)
                pass       -> DMESG-WARN (fi-skl-6700k)
                pass       -> DMESG-WARN (fi-skl-6770hq)
                pass       -> DMESG-WARN (fi-skl-gvtdvm)
                pass       -> DMESG-WARN (fi-bxt-dsi)
                pass       -> DMESG-WARN (fi-bxt-j4205)
                pass       -> DMESG-WARN (fi-kbl-7500u)
                pass       -> DMESG-WARN (fi-kbl-7560u)
                pass       -> DMESG-WARN (fi-kbl-r)
                pass       -> DMESG-WARN (fi-glk-1)
                pass       -> DMESG-WARN (fi-cfl-s)
                pass       -> DMESG-WARN (fi-cnl-y)
        Subgroup basic-no-display:
                pass       -> DMESG-WARN (fi-cfl-s) fdo#103022 +1

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:267  dwarn:1   dfail:0   fail:0   skip:21  time:462s
fi-bdw-gvtdvm    total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:480s
fi-blb-e6850     total:289  pass:222  dwarn:2   dfail:0   fail:0   skip:65  time:403s
fi-bsw-n3050     total:289  pass:242  dwarn:1   dfail:0   fail:0   skip:46  time:566s
fi-bwr-2160      total:289  pass:182  dwarn:1   dfail:0   fail:0   skip:106 time:286s
fi-bxt-dsi       total:289  pass:258  dwarn:1   dfail:0   fail:0   skip:30  time:524s
fi-bxt-j4205     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:531s
fi-byt-j1900     total:289  pass:252  dwarn:2   dfail:0   fail:0   skip:35  time:539s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:579s
fi-cnl-y         total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:649s
fi-elk-e7500     total:289  pass:228  dwarn:1   dfail:0   fail:0   skip:60  time:435s
fi-glk-1         total:289  pass:260  dwarn:1   dfail:0   fail:0   skip:28  time:597s
fi-hsw-4770      total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:438s
fi-hsw-4770r     total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:503s
fi-ivb-3770      total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:475s
fi-kbl-7500u     total:289  pass:263  dwarn:2   dfail:0   fail:0   skip:24  time:506s
fi-kbl-7560u     total:289  pass:269  dwarn:1   dfail:0   fail:0   skip:19  time:583s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:488s
fi-kbl-r         total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  time:591s
fi-pnv-d510      total:289  pass:221  dwarn:2   dfail:0   fail:0   skip:66  time:667s
fi-skl-6260u     total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  time:467s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:656s
fi-skl-6700k     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:570s
fi-skl-6770hq    total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  time:509s
fi-skl-gvtdvm    total:289  pass:265  dwarn:1   dfail:0   fail:0   skip:23  time:471s
fi-snb-2520m     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:581s
fi-snb-2600      total:289  pass:248  dwarn:1   dfail:0   fail:0   skip:40  time:433s
fi-byt-n2820 failed to connect after reboot

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest
9af48be91aae drm/i915: Cancel the hotplug work when unregistering the connector

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5936/
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)
== Series Details ==

Series: series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)
URL   : https://patchwork.freedesktop.org/series/31476/
State : success

== Summary ==

Series 31476v2 series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock
https://patchwork.freedesktop.org/api/1.0/series/31476/revisions/2/mbox/

Test drv_module_reload:
        Subgroup basic-reload-inject:
                incomplete -> PASS       (fi-cfl-s) fdo#103022

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:455s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:482s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:397s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:566s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:287s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:544s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:527s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:559s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:618s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:430s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:612s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:440s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:417s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:504s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:482s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:505s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:584s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:501s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:597s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:656s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:469s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:531s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:516s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:476s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:582s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:432s
fi-bxt-j4205 failed to connect after reboot

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest
999c4f026e85 drm/i915: Use rcu instead of stop_machine in set_wedged
cef3c4054a61 drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5935/
Re: [Intel-gfx] [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset
On 06/10/17 05:35, Michał Winiarski wrote:
> On Thu, Oct 05, 2017 at 05:02:39PM +, Daniele Ceraolo Spurio wrote:
> > On 05/10/17 02:33, Chris Wilson wrote:
> > > Quoting Michał Winiarski (2017-10-05 10:13:40)
> > > > We're using first page of kernel context state to share data with GuC, let's precompute the ggtt offset at GuC initialization time rather than everytime we're using GuC actions.
> > >
> > > So LRC_GUCSHR_PN is still 0. Plans for that to change?
> >
> > This is a requirement from the GuC side. GuC expects each context to have that extra page before the PPHWSP and it uses it to dump some per-lrc info, part of which is for internal use and part is info for the host (although we don't need/use it). On certain events (reset/preempt/suspend etc) GuC will dump extra info and this is done in the page provided in the H2G. I think we use the one of the default ctx just for simplicity, but it should be possible to use a different one, possibly not attached to any lrc if needed, but I'm not sure if this has ever been tested.
>
> Done that (allocating a separate object for GuC shared data), seems to work just fine on its own. Except if we try to remove the first page from contexts. It seems to make GuC upset even though we're not using actions.

Yep, as I mentioned above GuC dumps runtime info about each lrc it handles in that page (e.g. if an lrc has been submitted via proxy), so it is probably going to either page-fault or write in the wrong memory if that page is not allocated.

> We could still do that, though without removing the extra page we're just being more wasteful. But perhaps it's cleaner that way? Having separate managed in GuC code rather than reusing random places in context state? Thoughts?

This is similar to what we used to do by using the PPHWSP of the default ctx as the global HWSP. Personally I'd prefer to keep it separate as it feels cleaner and a single extra page shouldn't hurt us that much, but there was some push-back when I suggested the same for the HWSP.

Daniele

> -Michał
> > -Daniele
> > > Atm, we should be changing one pointer deref for another...
> > > -Chris
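
The idea in the quoted patch description (compute the shared-data offset once at initialization instead of on every action) can be sketched as below. This is a minimal sketch, not the driver code: the `toy_*` names, the 4096-byte page size and the two-dword action layout are all assumptions. The page index of 0 follows the remark in the thread that LRC_GUCSHR_PN is still 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size */
#define TOY_GUCSHR_PN 0u    /* page index; the thread notes LRC_GUCSHR_PN is 0 */

/* Hypothetical GuC state holding the cached offset. */
struct toy_guc {
    uint32_t shared_data_offset; /* computed once in toy_guc_init() */
};

/* Derive the shared page address from the context state base, once. */
void toy_guc_init(struct toy_guc *guc, uint32_t ctx_state_ggtt_base)
{
    assert(guc != NULL);
    guc->shared_data_offset =
        ctx_state_ggtt_base + TOY_GUCSHR_PN * TOY_PAGE_SIZE;
}

/* Fill a hypothetical two-dword action: opcode plus shared-data address. */
void toy_guc_build_action(const struct toy_guc *guc, uint32_t opcode,
                          uint32_t action[2])
{
    action[0] = opcode;
    action[1] = guc->shared_data_offset; /* cached value, no recomputation */
}
```

As Chris notes in the thread, with the page index at 0 this mostly trades one pointer dereference for another; the cached field matters more if the shared page later moves to a separately allocated object.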
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Prove an assert for when we expect forcewake to be held
s/Prove/Provide/

Quoting Chris Wilson (2017-10-06 15:54:59)
> Add assert_forcewakes_active() (the complementary function to
> assert_forcewakes_inactive) that documents the requirement of a
> function for its callers to be holding the forcewake ref (i.e. the
> function is part of a sequence over which RC6 must be prevented).
>
> One such example is during ringbuffer reset, where RC6 must be held
> across the whole reinitialisation sequence.
>
> Signed-off-by: Chris Wilson
> Cc: Tvrtko Ursulin
> Cc: Mika Kuoppala
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 11 ++-
>  drivers/gpu/drm/i915/intel_uncore.c | 12
>  drivers/gpu/drm/i915/intel_uncore.h | 2 ++
>  3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 05c08b0bc172..4285f09ff8b8 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -579,7 +579,16 @@ static int init_ring_common(struct intel_engine_cs
> *engine)
>  static void reset_ring_common(struct intel_engine_cs *engine,
>  struct drm_i915_gem_request *request)
>  {
> - /* Try to restore the logical GPU state to match the continuation
> + /*
> +* RC6 must be prevented until the reset is complete and the engine
> +* reinitialised. If it occurs in the middle of this sequence, the
> +* state written to/loaded from the power context is ill-defined (e.g.
> +* the PP_BASE_DIR may be lost).

PP_DIR_BASE
-Chris
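
The contract that the quoted commit message describes (a function asserting that its caller already holds a forcewake reference) can be modelled in user space as a reference count plus a check at function entry. This is a sketch under assumptions: the `toy_*` names and the counter are hypothetical, and the real helper's body is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a forcewake reference count. */
struct toy_uncore {
    unsigned int fw_count;
};

void toy_forcewake_get(struct toy_uncore *u) { u->fw_count++; }

void toy_forcewake_put(struct toy_uncore *u)
{
    assert(u->fw_count > 0); /* unbalanced put is a caller bug */
    u->fw_count--;
}

/* Returns true when a reference is held; warns and returns false otherwise. */
bool toy_assert_forcewakes_active(const struct toy_uncore *u)
{
    if (u->fw_count == 0) {
        fprintf(stderr, "warning: forcewake expected to be held\n");
        return false;
    }
    return true;
}

/* A reset step that documents its precondition at entry. */
bool toy_reset_ring(struct toy_uncore *u)
{
    if (!toy_assert_forcewakes_active(u))
        return false;

    /* ... the reinitialisation sequence would run here ... */
    return true;
}
```

The check does not take a reference itself; it only makes the caller's obligation visible, which is the documentation role the commit message gives to assert_forcewakes_active().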
[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Try harder to finish the idle-worker (rev2)
== Series Details ==

Series: drm/i915: Try harder to finish the idle-worker (rev2)
URL   : https://patchwork.freedesktop.org/series/29690/
State : warning

== Summary ==

Test kms_plane_multiple:
        Subgroup legacy-pipe-C-tiling-none:
                pass       -> SKIP       (shard-hsw)

shard-hsw        total:2446 pass:1327 dwarn:6   dfail:0   fail:9   skip:1104 time:10118s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5927/shards.html
[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests
== Series Details ==

Series: series starting with [1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests
URL   : https://patchwork.freedesktop.org/series/31498/
State : warning

== Summary ==

Series 31498v1 series starting with [1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests
https://patchwork.freedesktop.org/api/1.0/series/31498/revisions/1/mbox/

Test gem_busy:
        Subgroup basic-hang-default:
                pass       -> DMESG-WARN (fi-snb-2520m)
Test gem_ringfill:
        Subgroup basic-default-hang:
                pass       -> DMESG-WARN (fi-ivb-3520m)
Test drv_module_reload:
        Subgroup basic-reload-inject:
                incomplete -> PASS       (fi-cfl-s) fdo#103022

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:452s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:472s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:396s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:569s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:287s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:519s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:532s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:541s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:534s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:554s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:626s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:440s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:604s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:441s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:501s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:478s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:506s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:503s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:594s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:657s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:660s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:537s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:568s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:476s
fi-snb-2520m     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:577s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:440s

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest
7793b7e9953b drm/i915: Prove an assert for when we expect forcewake to be held
3e2ce344584e drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5934/
Re: [Intel-gfx] [PATCH] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock
On 06/10/2017 16:52, Daniel Vetter wrote: 4.14-rc1 gained the fancy new cross-release support in lockdep, which seems to have uncovered a few more rules about what is allowed and isn't. This one here seems to indicate that allocating a work-queue while holding mmap_sem is a no-go, so let's try to preallocate it. Of course another way to break this chain would be somewhere in the cpu hotplug code, since this isn't the only trace we're finding now which goes through msr_create_device. Full lockdep splat: == WARNING: possible circular locking dependency detected 4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U -- prime_mmap/1551 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){}, at: [] apply_workqueue_attrs+0x17/0x50 but task is already holding lock: (&dev_priv->mm_lock){+.+.}, at: [] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (&dev_priv->mm_lock){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] i915_gem_userptr_ioctl+0x222/0x2c0 [i915] drm_ioctl_kernel+0x69/0xb0 drm_ioctl+0x2f9/0x3d0 do_vfs_ioctl+0x94/0x670 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #5 (&mm->mmap_sem){}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __might_fault+0x68/0x90 _copy_to_user+0x23/0x70 filldir+0xa5/0x120 dcache_readdir+0xf9/0x170 iterate_dir+0x69/0x1a0 SyS_getdents+0xa5/0x140 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #4 (&sb->s_type->i_mutex_key#5){}: down_write+0x3b/0x70 handle_create+0xcb/0x1e0 devtmpfsd+0x139/0x180 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #3 ((complete)&req.done){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 wait_for_common+0x58/0x210 wait_for_completion+0x1d/0x20 devtmpfs_create_node+0x13d/0x160 device_add+0x5eb/0x620 device_create_groups_vargs+0xe0/0xf0 device_create+0x3a/0x40 
msr_device_create+0x2b/0x40 cpuhp_invoke_callback+0xa3/0x840 cpuhp_thread_fun+0x7a/0x150 smpboot_thread_fn+0x18a/0x280 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #2 (cpuhp_state){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpuhp_issue_call+0x10b/0x170 __cpuhp_setup_state_cpuslocked+0x134/0x2a0 __cpuhp_setup_state+0x46/0x60 page_writeback_init+0x43/0x67 pagecache_init+0x3d/0x42 start_kernel+0x3a8/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #1 (cpuhp_state_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 __cpuhp_setup_state_cpuslocked+0x52/0x2a0 __cpuhp_setup_state+0x46/0x60 page_alloc_init+0x28/0x30 start_kernel+0x145/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #0 (cpu_hotplug_lock.rw_sem){}: check_prev_add+0x430/0x840 __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpus_read_lock+0x3d/0xb0 apply_workqueue_attrs+0x17/0x50 __alloc_workqueue_key+0x1d8/0x4d9 i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915] i915_gem_userptr_ioctl+0x222/0x2c0 [i915] drm_ioctl_kernel+0x69/0xb0 drm_ioctl+0x2f9/0x3d0 do_vfs_ioctl+0x94/0x670 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock Possible unsafe locking scenario: CPU0CPU1 lock(&dev_priv->mm_lock); lock(&mm->mmap_sem); lock(&dev_priv->mm_lock); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** 2 locks held by prime_mmap/1551: #0: (&mm->mmap_sem){}, at: [] i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915] #1: (&dev_priv->mm_lock){+.+.}, at: [] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] stack backtrace: CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G U 4.14.0-rc1-CI-CI_DRM_3118+ #1 Hardware name: Dell Inc. 
XPS 8300 /0Y2MRG, BIOS A06 10/17/2011 Call Trace: dump_stack+0x68/0x9f print_circular_bug+0x235/0x3c0 ? lockdep_init_map_crosslock+0x20/0x20 check_prev_add+0x430/0x840 __lock_acquire+0x1420/
[Intel-gfx] [PATCH] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock
4.14-rc1 gained the fancy new cross-release support in lockdep, which seems to have uncovered a few more rules about what is allowed and isn't. This one here seems to indicate that allocating a work-queue while holding mmap_sem is a no-go, so let's try to preallocate it. Of course another way to break this chain would be somewhere in the cpu hotplug code, since this isn't the only trace we're finding now which goes through msr_create_device. Full lockdep splat: == WARNING: possible circular locking dependency detected 4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U -- prime_mmap/1551 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){}, at: [] apply_workqueue_attrs+0x17/0x50 but task is already holding lock: (&dev_priv->mm_lock){+.+.}, at: [] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (&dev_priv->mm_lock){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] i915_gem_userptr_ioctl+0x222/0x2c0 [i915] drm_ioctl_kernel+0x69/0xb0 drm_ioctl+0x2f9/0x3d0 do_vfs_ioctl+0x94/0x670 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #5 (&mm->mmap_sem){}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __might_fault+0x68/0x90 _copy_to_user+0x23/0x70 filldir+0xa5/0x120 dcache_readdir+0xf9/0x170 iterate_dir+0x69/0x1a0 SyS_getdents+0xa5/0x140 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #4 (&sb->s_type->i_mutex_key#5){}: down_write+0x3b/0x70 handle_create+0xcb/0x1e0 devtmpfsd+0x139/0x180 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #3 ((complete)&req.done){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 wait_for_common+0x58/0x210 wait_for_completion+0x1d/0x20 devtmpfs_create_node+0x13d/0x160 device_add+0x5eb/0x620 device_create_groups_vargs+0xe0/0xf0 device_create+0x3a/0x40 msr_device_create+0x2b/0x40 cpuhp_invoke_callback+0xa3/0x840 
cpuhp_thread_fun+0x7a/0x150 smpboot_thread_fn+0x18a/0x280 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #2 (cpuhp_state){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpuhp_issue_call+0x10b/0x170 __cpuhp_setup_state_cpuslocked+0x134/0x2a0 __cpuhp_setup_state+0x46/0x60 page_writeback_init+0x43/0x67 pagecache_init+0x3d/0x42 start_kernel+0x3a8/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #1 (cpuhp_state_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 __cpuhp_setup_state_cpuslocked+0x52/0x2a0 __cpuhp_setup_state+0x46/0x60 page_alloc_init+0x28/0x30 start_kernel+0x145/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #0 (cpu_hotplug_lock.rw_sem){}: check_prev_add+0x430/0x840 __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpus_read_lock+0x3d/0xb0 apply_workqueue_attrs+0x17/0x50 __alloc_workqueue_key+0x1d8/0x4d9 i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915] i915_gem_userptr_ioctl+0x222/0x2c0 [i915] drm_ioctl_kernel+0x69/0xb0 drm_ioctl+0x2f9/0x3d0 do_vfs_ioctl+0x94/0x670 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock Possible unsafe locking scenario: CPU0CPU1 lock(&dev_priv->mm_lock); lock(&mm->mmap_sem); lock(&dev_priv->mm_lock); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** 2 locks held by prime_mmap/1551: #0: (&mm->mmap_sem){}, at: [] i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915] #1: (&dev_priv->mm_lock){+.+.}, at: [] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915] stack backtrace: CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G U 4.14.0-rc1-CI-CI_DRM_3118+ #1 Hardware name: Dell Inc. XPS 8300 /0Y2MRG, BIOS A06 10/17/2011 Call Trace: dump_stack+0x68/0x9f print_circular_bug+0x235/0x3c0 ? 
lockdep_init_map_crosslock+0x20/0x20 check_prev_add+0x430/0x840 __lock_acquire+0x1420/0x15e0 ? __lock_acquire+0x1420/0x15e0 ? lockdep_init_map_crosslock+0x20/0x20 lock_acquire+0xb0/0x200 ? apply_workqueue_attrs+0x17/0x5
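The "preallocate it" idea in the description above can be sketched in userspace C. This is only an analogy under stated assumptions, not the i915 patch: a pthread mutex stands in for mm_lock, a plain heap allocation stands in for the workqueue, and the names notifier, notifier_get and installed are hypothetical. The point it illustrates is that every allocation happens before the lock is taken, so nothing reached from the allocator can nest inside that lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mmu notifier and its workqueue. */
struct notifier {
	int *wq;
};

static pthread_mutex_t mm_lock = PTHREAD_MUTEX_INITIALIZER;
static struct notifier *installed;

static struct notifier *notifier_get(void)
{
	struct notifier *fresh, *mn;

	/* Allocate everything up front, while no lock is held. */
	fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->wq = malloc(sizeof(*fresh->wq));
	if (!fresh->wq) {
		free(fresh);
		return NULL;
	}

	/* The locked section only publishes a pointer; it never allocates. */
	pthread_mutex_lock(&mm_lock);
	if (!installed) {
		installed = fresh;
		fresh = NULL;
	}
	mn = installed;
	pthread_mutex_unlock(&mm_lock);

	/* Someone else installed one first: drop the spare, again unlocked. */
	if (fresh) {
		free(fresh->wq);
		free(fresh);
	}

	return mn;
}
```

The cost of this shape is an occasional wasted allocation when two callers race, which is usually acceptable in exchange for a shorter lock dependency chain.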
[Intel-gfx] ✓ Fi.CI.BAT: success for huge gtt pages (rev13)
== Series Details ==

Series: huge gtt pages (rev13)
URL   : https://patchwork.freedesktop.org/series/25118/
State : success

== Summary ==

Series 25118v13 huge gtt pages
https://patchwork.freedesktop.org/api/1.0/series/25118/revisions/13/mbox/

Test gem_exec_suspend:
        Subgroup basic-s3:
                dmesg-warn -> PASS       (fi-cfl-s) fdo#103026
Test drv_module_reload:
        Subgroup basic-reload-inject:
                incomplete -> PASS       (fi-cfl-s) fdo#103022

fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:268  dwarn:0  dfail:0  fail:0  skip:21   time:453s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0  dfail:0  fail:0  skip:24   time:468s
fi-blb-e6850     total:289  pass:223  dwarn:1  dfail:0  fail:0  skip:65   time:389s
fi-bsw-n3050     total:289  pass:243  dwarn:0  dfail:0  fail:0  skip:46   time:562s
fi-bwr-2160      total:289  pass:183  dwarn:0  dfail:0  fail:0  skip:106  time:284s
fi-bxt-dsi       total:289  pass:259  dwarn:0  dfail:0  fail:0  skip:30   time:524s
fi-bxt-j4205     total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:521s
fi-byt-j1900     total:289  pass:253  dwarn:1  dfail:0  fail:0  skip:35   time:533s
fi-byt-n2820     total:289  pass:249  dwarn:1  dfail:0  fail:0  skip:39   time:523s
fi-cfl-s         total:289  pass:257  dwarn:0  dfail:0  fail:0  skip:32   time:574s
fi-cnl-y         total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:615s
fi-elk-e7500     total:289  pass:229  dwarn:0  dfail:0  fail:0  skip:60   time:431s
fi-glk-1         total:289  pass:261  dwarn:0  dfail:0  fail:0  skip:28   time:608s
fi-hsw-4770      total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:436s
fi-hsw-4770r     total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:416s
fi-ivb-3520m     total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:509s
fi-ivb-3770      total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:479s
fi-kbl-7500u     total:289  pass:264  dwarn:1  dfail:0  fail:0  skip:24   time:501s
fi-kbl-7560u     total:289  pass:270  dwarn:0  dfail:0  fail:0  skip:19   time:583s
fi-kbl-7567u     total:289  pass:265  dwarn:4  dfail:0  fail:0  skip:20   time:489s
fi-kbl-r         total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:589s
fi-pnv-d510      total:289  pass:222  dwarn:1  dfail:0  fail:0  skip:66   time:651s
fi-skl-6260u     total:289  pass:269  dwarn:0  dfail:0  fail:0  skip:20   time:486s
fi-skl-6700hq    total:289  pass:263  dwarn:0  dfail:0  fail:0  skip:26   time:652s
fi-skl-6700k     total:289  pass:265  dwarn:0  dfail:0  fail:0  skip:24   time:531s
fi-skl-6770hq    total:289  pass:269  dwarn:0  dfail:0  fail:0  skip:20   time:509s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0  dfail:0  fail:0  skip:23   time:465s
fi-snb-2520m     total:289  pass:250  dwarn:0  dfail:0  fail:0  skip:39   time:573s
fi-snb-2600      total:289  pass:249  dwarn:0  dfail:0  fail:0  skip:40   time:430s

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest
36ea69dd9d5b drm/i915: enable platform support for 2M pages
0f0029c50531 drm/i915: enable platform support for 64K pages
a3ceb490e30c drm/i915: disable platform support for vGPU huge gtt pages
bb78739d06a6 drm/i915/selftests: mix huge pages
db68d040aacb drm/i915/selftests: huge page tests
7ad2fcb4f3b9 drm/i915/debugfs: include some gtt page size metrics
a8ba9dc4b480 drm/i915: accurate page size tracking for the ppgtt
2bfd27e41b3e drm/i915: support 64K pages for the 48b PPGTT
77922800074a drm/i915: add support for 64K scratch page
1e0e1d625d57 drm/i915: support 2M pages for the 48b PPGTT
7afcdd9a842d drm/i915: disable GTT cache for 2M pages
4f6be3c66188 drm/i915: enable IPS bit for 64K pages
03330868e4af drm/i915: align 64K objects to 2M
3a6d462cc964 drm/i915: align the vma start to the largest gtt page size
33effb284bc0 drm/i915: introduce vm set_pages/clear_pages
ebf5d7e4e80e drm/i915: introduce page_size members
9e877d2dad5e drm/i915: push set_pages down to the callers
74be45fad762 drm/i915: introduce page_sizes field to dev_info
83936c1a8137 drm/i915/gemfs: enable THP
634bb031d367 drm/i915: introduce simple gemfs
7041b30f5891 mm/shmem: introduce shmem_file_setup_with_mnt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/
Re: [Intel-gfx] [PATCH v2] drm/i915: Fix pointer-to-int conversion
Quoting Chris Wilson (2017-10-06 14:16:55)
> Quoting Michal Wajdeczko (2017-10-06 14:08:44)
> > Commit faf654864b25 ("drm/i915: Unify uC variable types to avoid
> > flooding checkpatch.pl") breaks 32-bit kernel builds. Lets use
> > cast helper to make compiler happy.
> >
> > v2: introduce ptr_to_u64 (Chris)
> >
> > Signed-off-by: Michal Wajdeczko
> > Cc: Joonas Lahtinen
> > Cc: Chris Wilson
>
> Reviewed-by: Chris Wilson

Also applied to my queue, thanks for the quick fixup.
-Chris
Re: [Intel-gfx] [PATCH i-g-t 1/7] intel-gpu-overlay: Move local perf implementation to a library
On 29/09/2017 14:43, Petri Latvala wrote: On Fri, Sep 29, 2017 at 01:39:33PM +0100, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Idea is to avoid duplication across multiple users in upcoming patches. v2: Commit message and use a separate library instead of piggy- backing to libintel_tools. (Chris Wilson) Signed-off-by: Tvrtko Ursulin --- lib/Makefile.am | 6 +- overlay/perf.c => lib/igt_perf.c | 2 +- overlay/perf.h => lib/igt_perf.h | 2 ++ overlay/Makefile.am | 6 ++ overlay/gem-interrupts.c | 3 ++- overlay/gpu-freq.c | 3 ++- overlay/gpu-perf.c | 3 ++- overlay/gpu-top.c| 3 ++- overlay/power.c | 3 ++- overlay/rc6.c| 3 ++- 10 files changed, 22 insertions(+), 12 deletions(-) rename overlay/perf.c => lib/igt_perf.c (94%) rename overlay/perf.h => lib/igt_perf.h (99%) This one was more of a doozey to mesonize for a newbie. This is ugly but hopefully will make someone more knowledgeable point out better ways and practices for using build targets vs. just lib names around... (Now sent with X-Patchwork-Hint, hopefully patchwork doesn't get confused) diff --git a/benchmarks/meson.build b/benchmarks/meson.build index 9ab738f7..9f2672eb 100644 --- a/benchmarks/meson.build +++ b/benchmarks/meson.build @@ -31,6 +31,11 @@ endif foreach prog : benchmark_progs # FIXME meson doesn't like binaries with the same name # meanwhile just suffix with _bench + link = [] + if prog == 'gem_wsim' + link += lib_igt_perf + endif executable(prog + '_bench', prog + '.c', - dependencies : test_deps) + dependencies : test_deps, + link_with : link) endforeach diff --git a/lib/meson.build b/lib/meson.build index 203be520..2c33493d 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -178,4 +178,8 @@ lib_igt = declare_dependency(link_with : lib_igt_build, igt_deps = [ lib_igt ] + lib_deps +lib_igt_perf = static_library('igt_perf', +['igt_perf.c'] +) + subdir('tests') diff --git a/overlay/meson.build b/overlay/meson.build index a92ef895..ffc011cc 100644 --- a/overlay/meson.build +++ b/overlay/meson.build 
@@ -10,7 +10,6 @@ gpu_overlay_src = [ 'gpu-freq.c', 'igfx.c', 'overlay.c', - 'perf.c', 'power.c', 'rc6.c', ] @@ -56,5 +55,6 @@ if xrandr.found() and cairo.found() include_directories : inc, c_args : gpu_overlay_cflags, dependencies : gpu_overlay_deps, + link_with : lib_igt_perf, install : true) endif

Grumble, can we have a switch-over day where it all gets converted to meson by the people in the know, and until then not concern ourselves with a two-headed build system? At the moment it is just a distraction and a waste of time if everybody working on IGT has to test both build systems.

I know meson is great and all that, but I'd rather focus on the actual work than having to maintain parallel build systems. Especially since I am clueless on it, so it would be one more thing competing for limited brain resources.

Regards,
Tvrtko
Re: [Intel-gfx] [PATCH v2 11/11] drm/i915: Introduce separate status variable for RC6 and LLC ring frequency setup
On 10/6/2017 6:25 PM, Chris Wilson wrote: Quoting Sagar Arun Kamble (2017-10-06 13:13:40) Defined new struct intel_rc6 to hold RC6 specific state and intel_ring_pstate to hold ring specific state. v2: s/intel_ring_pstate/intel_llc_pstate and rebase. (Chris) Signed-off-by: Sagar Arun Kamble Cc: Imre Deak Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Radoslaw Szwichtenberg --- drivers/gpu/drm/i915/i915_drv.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 10 drivers/gpu/drm/i915/intel_pm.c | 57 +++-- 3 files changed, 54 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 470807c..154f231 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2502,7 +2502,7 @@ static int intel_runtime_suspend(struct device *kdev) struct drm_i915_private *dev_priv = to_i915(dev); int ret; - if (WARN_ON_ONCE(!(dev_priv->pm.rps.enabled && intel_rc6_enabled( + if (WARN_ON_ONCE(!(dev_priv->pm.rc6.enabled && intel_rc6_enabled( return -ENODEV; if (WARN_ON_ONCE(!HAS_RUNTIME_PM(dev_priv))) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 45944a8..a07aa71 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1363,8 +1363,18 @@ struct intel_rps { struct intel_rps_ei ei; }; +struct intel_rc6 { + bool enabled; +}; + +struct intel_llc_pstate { + bool configured; +}; + struct intel_gen6_power_mgmt { struct intel_rps rps; + struct intel_rc6 rc6; + struct intel_llc_pstate llc_pstate; struct delayed_work autoenable_work; /* diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 03264fe..df36a6f 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -7873,7 +7873,12 @@ static void intel_init_emon(struct drm_i915_private *dev_priv) static inline void intel_update_ring_freq(struct drm_i915_private *i915) { + if (READ_ONCE(i915->pm.llc_pstate.configured)) + return; Tell me about how you expect 
the locking around this function to be. The READ_ONCE() implies that we are doing a optimistic peek outside of a lock, but then we set configured without acquiring a lock, so I assume we are inside some lock. That looks true for all, we don't need READ_ONCE() anymore as we only inspect inside the mutex (and so READ_ONCE is giving the wrong impression). + gen6_update_ring_freq(i915); + + i915->pm.llc_pstate.configured = true; } void intel_disable_gt_powersave(struct drm_i915_private *dev_priv) { - if (!READ_ONCE(dev_priv->pm.rps.enabled)) - return; - mutex_lock(&dev_priv->pm.pcu_lock); intel_disable_rc6(dev_priv); intel_disable_rps(dev_priv); + if (HAS_LLC(dev_priv)) + dev_priv->pm.llc_pstate.configured = false; Always clear it? If no llc, it can never be configured. Hmm, better if we just made it symmetrical with s/intel_update_ring_freq/intel_enable_llc_pstate/ and intel_disable_llc_pstate here. Will update. - dev_priv->pm.rps.enabled = false; mutex_unlock(&dev_priv->pm.pcu_lock); } @@ -8080,7 +8103,10 @@ static void __intel_autoenable_gt_powersave(struct work_struct *work) struct intel_engine_cs *rcs; struct drm_i915_gem_request *req; - if (READ_ONCE(dev_priv->pm.rps.enabled)) + if (READ_ONCE(dev_priv->pm.rps.enabled) && + READ_ONCE(dev_priv->pm.rc6.enabled) && + !(HAS_LLC(dev_priv) ^ + READ_ONCE(dev_priv->pm.llc_pstate.configured))) goto out; This optimisation has lost its appeal :) Kill it, if we need something like it we can try again later. -Chris Sure. Understood that using READ_ONCE inside lock was unnecessary. Will remove this triple condition. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: failure for ] lib: Ask the kernel to quiesce the GPU
== Series Details ==

Series: ] lib: Ask the kernel to quiesce the GPU
URL   : https://patchwork.freedesktop.org/series/31448/
State : failure

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3185
7dacd1f2e70c drm-tip: 2017y-10m-06d-12h-29m-28s UTC integration manifest

No testlist changes.

Test chamelium:
        Subgroup dp-edid-read:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102672
Test gem_sync:
        Subgroup basic-all:
                pass       -> FAIL       (fi-pnv-d510)

fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672

fi-bdw-5557u     total:289  pass:268  dwarn:0  dfail:0  fail:0  skip:21   time:449s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0  dfail:0  fail:0  skip:24   time:467s
fi-bsw-n3050     total:289  pass:243  dwarn:0  dfail:0  fail:0  skip:46   time:563s
fi-bwr-2160      total:289  pass:183  dwarn:0  dfail:0  fail:0  skip:106  time:292s
fi-bxt-dsi       total:289  pass:259  dwarn:0  dfail:0  fail:0  skip:30   time:526s
fi-bxt-j4205     total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:543s
fi-byt-j1900     total:289  pass:253  dwarn:1  dfail:0  fail:0  skip:35   time:546s
fi-byt-n2820     total:289  pass:249  dwarn:1  dfail:0  fail:0  skip:39   time:518s
fi-cfl-s         total:289  pass:256  dwarn:1  dfail:0  fail:0  skip:32   time:564s
fi-cnl-y         total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:630s
fi-elk-e7500     total:289  pass:229  dwarn:0  dfail:0  fail:0  skip:60   time:437s
fi-glk-1         total:289  pass:261  dwarn:0  dfail:0  fail:0  skip:28   time:596s
fi-hsw-4770      total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:436s
fi-hsw-4770r     total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:419s
fi-ivb-3520m     total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:506s
fi-ivb-3770      total:289  pass:260  dwarn:0  dfail:0  fail:0  skip:29   time:470s
fi-kbl-7500u     total:289  pass:263  dwarn:1  dfail:0  fail:1  skip:24   time:495s
fi-kbl-7560u     total:289  pass:270  dwarn:0  dfail:0  fail:0  skip:19   time:589s
fi-kbl-7567u     total:289  pass:265  dwarn:4  dfail:0  fail:0  skip:20   time:483s
fi-kbl-r         total:289  pass:262  dwarn:0  dfail:0  fail:0  skip:27   time:588s
fi-pnv-d510      total:289  pass:221  dwarn:1  dfail:0  fail:1  skip:66   time:660s
fi-skl-6260u     total:289  pass:269  dwarn:0  dfail:0  fail:0  skip:20   time:483s
fi-skl-6700hq    total:289  pass:263  dwarn:0  dfail:0  fail:0  skip:26   time:654s
fi-skl-6700k     total:289  pass:265  dwarn:0  dfail:0  fail:0  skip:24   time:526s
fi-skl-6770hq    total:289  pass:269  dwarn:0  dfail:0  fail:0  skip:20   time:515s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0  dfail:0  fail:0  skip:23   time:466s
fi-snb-2520m     total:289  pass:250  dwarn:0  dfail:0  fail:0  skip:39   time:580s
fi-snb-2600      total:289  pass:249  dwarn:0  dfail:0  fail:0  skip:40   time:435s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_302/
Re: [Intel-gfx] [PATCH v2 10/11] drm/i915: Create generic functions to control RC6, RPS
On 10/6/2017 6:16 PM, Chris Wilson wrote: Quoting Sagar Arun Kamble (2017-10-06 13:13:39) Prepared generic functions intel_enable_rc6, intel_disable_rc6, intel_enable_rps and intel_disable_rps functions to setup RC6/RPS based on platforms. v2: Make intel_enable/disable_rc6/rps static. (Chris) Signed-off-by: Sagar Arun Kamble Cc: Imre Deak Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Radoslaw Szwichtenberg --- drivers/gpu/drm/i915/intel_pm.c | 97 ++--- 1 file changed, 62 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index ce2dc5b..03264fe 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -7972,75 +7972,102 @@ void intel_sanitize_gt_powersave(struct drm_i915_private *dev_priv) gen6_reset_rps_interrupts(dev_priv); } -void intel_disable_gt_powersave(struct drm_i915_private *dev_priv) +static void intel_disable_rc6(struct drm_i915_private *dev_priv) { - if (!READ_ONCE(dev_priv->pm.rps.enabled)) - return; - - mutex_lock(&dev_priv->pm.pcu_lock); lockdep_assert_held(dev_priv->pm.pcu_lock); ? We often skip it for statics, unless we know we are planning on adding an interface that may not take the lock. Sure will add this. Thanks. Reviewed-by: Chris Wilson -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/huc: Fix includes in intel_huc.c
Quoting Michal Wajdeczko (2017-10-06 10:02:09)
> Fix includes order and make sure we only include required headers.
> While here, make intel_huc.h header self-contained.
>
> Signed-off-by: Michal Wajdeczko
> Cc: Joonas Lahtinen
> Cc: Chris Wilson
> ---
> drivers/gpu/drm/i915/intel_huc.c | 6 --
> drivers/gpu/drm/i915/intel_huc.h | 2 ++
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_huc.c
> b/drivers/gpu/drm/i915/intel_huc.c
> index 3f796fe..4b4cf56 100644
> --- a/drivers/gpu/drm/i915/intel_huc.c
> +++ b/drivers/gpu/drm/i915/intel_huc.c
> @@ -21,9 +21,11 @@
> * IN THE SOFTWARE.
> *
> */
> -#include
> +
> +#include
> +
> +#include "intel_huc.h"
> #include "i915_drv.h"
> -#include "intel_uc.h"
>
> /**
> * DOC: HuC Firmware
> diff --git a/drivers/gpu/drm/i915/intel_huc.h
> b/drivers/gpu/drm/i915/intel_huc.h
> index d58422b..aaa38b9 100644
> --- a/drivers/gpu/drm/i915/intel_huc.h
> +++ b/drivers/gpu/drm/i915/intel_huc.h
> @@ -25,6 +25,8 @@
> #ifndef _INTEL_HUC_H_
> #define _INTEL_HUC_H_
>
> +#include "intel_uc_fw.h"
> +
> struct intel_huc {
> /* Generic uC firmware management */
> struct intel_uc_fw fw;

Reviewed-by: Chris Wilson

Applied to my queue, thanks.
-Chris
Re: [Intel-gfx] [PATCH v2 06/11] drm/i915: Name i915_runtime_pm structure in dev_priv as "rpm"
On 10/6/2017 6:10 PM, Chris Wilson wrote: Quoting Sagar Arun Kamble (2017-10-06 13:13:35) We were using dev_priv->pm for runtime power management related state. This patch renames it to "rpm" which looks more apt. Will be using pm for state containing RPS/RC6 state in the next patch. Signed-off-by: Sagar Arun Kamble Cc: Imre Deak Cc: Chris Wilson Cc: Joonas Lahtinen Reviewed-by: Radoslaw Szwichtenberg Reviewed-by: Chris Wilson Thinking about this again, rpm, pm are very close. How about if we used i915->runtime_pm and i915->gt_pm (or i915->gt.pm)? Imre, any thoughts? rps.hw_lock/pcu_lock is used by display too, so I just kept it pm. should we pull rps.hw_lock/pcu_lock out into drm_i915_private and then gt_pm would be good. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915: Provide an assert for when we expect forcewake to be held
Add assert_forcewakes_active() (the complementary function to assert_forcewakes_inactive) that documents the requirement of a function for its callers to be holding the forcewake ref (i.e. the function is part of a sequence over which RC6 must be prevented). One such example is during ringbuffer reset, where RC6 must be held across the whole reinitialisation sequence. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Mika Kuoppala --- drivers/gpu/drm/i915/intel_ringbuffer.c | 11 ++- drivers/gpu/drm/i915/intel_uncore.c | 12 drivers/gpu/drm/i915/intel_uncore.h | 2 ++ 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 05c08b0bc172..4285f09ff8b8 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -579,7 +579,16 @@ static int init_ring_common(struct intel_engine_cs *engine) static void reset_ring_common(struct intel_engine_cs *engine, struct drm_i915_gem_request *request) { - /* Try to restore the logical GPU state to match the continuation + /* +* RC6 must be prevented until the reset is complete and the engine +* reinitialised. If it occurs in the middle of this sequence, the +* state written to/loaded from the power context is ill-defined (e.g. +* the PP_BASE_DIR may be lost). +*/ + assert_forcewakes_active(engine->i915, FORCEWAKE_ALL); + + /* +* Try to restore the logical GPU state to match the continuation * of the request queue. 
If we skip the context/PD restore, then * the next request may try to execute assuming that its context * is valid and loaded on the GPU and so may try to access invalid diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index b3c3f94fc7e4..3d41667919dc 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -629,6 +629,18 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv) WARN_ON(dev_priv->uncore.fw_domains_active); } +void assert_forcewakes_active(struct drm_i915_private *dev_priv, + enum forcewake_domains fw_domains) +{ + if (!dev_priv->uncore.funcs.force_wake_get) + return; + + assert_rpm_wakelock_held(dev_priv); + + fw_domains &= dev_priv->uncore.fw_domains; + WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains_active); +} + /* We give fast paths for the really cool registers */ #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4) diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h index 66eae2ce2f29..582771251b57 100644 --- a/drivers/gpu/drm/i915/intel_uncore.h +++ b/drivers/gpu/drm/i915/intel_uncore.h @@ -137,6 +137,8 @@ void intel_uncore_resume_early(struct drm_i915_private *dev_priv); u64 intel_uncore_edram_size(struct drm_i915_private *dev_priv); void assert_forcewakes_inactive(struct drm_i915_private *dev_priv); +void assert_forcewakes_active(struct drm_i915_private *dev_priv, + enum forcewake_domains fw_domains); const char *intel_uncore_forcewake_domain_to_str(const enum forcewake_domain_id id); enum forcewake_domains -- 2.14.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
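The mask arithmetic inside assert_forcewakes_active() above can be tried out in isolation. The following is a minimal userspace sketch, not the driver code: the FW_* bit values and the helper name forcewakes_active are made up for illustration, and the function returns a bool where the patch uses WARN_ON.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical domain bits; the real enum lives in intel_uncore.h. */
enum fw_domains {
	FW_RENDER  = 1u << 0,
	FW_BLITTER = 1u << 1,
	FW_MEDIA   = 1u << 2,
	FW_ALL     = FW_RENDER | FW_BLITTER | FW_MEDIA,
};

/*
 * True when every requested domain that exists on this device is
 * currently active. Requested domains the device lacks are ignored.
 */
static bool forcewakes_active(unsigned int present, unsigned int active,
			      unsigned int wanted)
{
	wanted &= present;		/* drop domains the device does not have */
	return (wanted & ~active) == 0;	/* nothing wanted may be inactive */
}
```

Masking with the present domains first is what lets a caller pass an "all domains" value on hardware that only implements a subset of them.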
[Intel-gfx] [PATCH 1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests
Resetting the engine requires us to hold the forcewake wakeref to prevent RC6 trying to happen in the middle of the reset sequence. Normally, this is taken by i915_handle_error(), but as we are calling the lowlevel functions ourselves, we need to hold it. Wrap the entire live_hangcheck set of subtests in a single forcewake section for simplicity. This greatly improves the reliability of drv_selftest/live_hangcheck on Haswell, where it would exhibit an inability to restart a request because it lost its PD registers (PD_DIR_BASE reported as 0). Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c index 7e1bdd88eda3..a9e0de1f 100644 --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c @@ -878,9 +878,18 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915) SUBTEST(igt_reset_queue), SUBTEST(igt_handle_error), }; + int err; if (!intel_has_gpu_reset(i915)) return 0; - return i915_subtests(tests, i915); + intel_runtime_pm_get(i915); + intel_uncore_forcewake_get(i915, FORCEWAKE_ALL); + + err = i915_subtests(tests, i915); + + intel_uncore_forcewake_put(i915, FORCEWAKE_ALL); + intel_runtime_pm_put(i915); + + return err; } -- 2.14.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
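The bracket added around i915_subtests() above follows a common shape: take references in one order, run the work, release in reverse order, and return the work's error code unchanged. Here is a small userspace sketch of that shape, assuming plain counters in place of the real wakerefs. All names in it (fake_dev, rpm_get, fw_get, run_with_wakerefs, sample_subtest) are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the device: just two reference counts. */
struct fake_dev {
	int rpm_refs;
	int fw_refs;
};

static void rpm_get(struct fake_dev *d) { d->rpm_refs++; }
static void rpm_put(struct fake_dev *d) { assert(d->rpm_refs > 0); d->rpm_refs--; }

/* Forcewake is only meaningful while the device is runtime-awake. */
static void fw_get(struct fake_dev *d) { assert(d->rpm_refs > 0); d->fw_refs++; }
static void fw_put(struct fake_dev *d) { assert(d->fw_refs > 0); d->fw_refs--; }

/* Example subtest: succeeds only while both references are held. */
static int sample_subtest(struct fake_dev *d)
{
	return (d->rpm_refs > 0 && d->fw_refs > 0) ? 0 : -5;
}

/* Hold both references across the whole run, release in reverse order. */
static int run_with_wakerefs(struct fake_dev *d,
			     int (*subtests)(struct fake_dev *))
{
	int err;

	rpm_get(d);
	fw_get(d);

	err = subtests(d);

	fw_put(d);
	rpm_put(d);

	return err;
}
```

Storing the result in err before the puts is the same reason the patch introduces its own err variable: the early return would otherwise skip the release calls.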
[Intel-gfx] [ANNOUNCE] dim-tools mailing list for drm maintainer tools
The drm maintainer tools and documentation [1][2], the dim script in particular, have expanded in use and features and especially user base beyond at least my imagination.

It's time to move the maintainer tools patches and discussion away from the intel-gfx mailing list, but it seems best to not clutter dri-devel any more. Hence we're introducing a new dim-tools mailing list [3] for announcements, discussion, and development of drm maintainer tools and documentation.

Please subscribe to the list if you use dim, so we can reach out to all users with announcements. Some of you have been automatically subscribed; apologies if this was not what you wanted.

BR,
Jani.

[1] https://cgit.freedesktop.org/drm/drm-intel/log/?h=maintainer-tools
[2] https://01.org/linuxgraphics/gfx-docs/maintainer-tools/index.html
[3] https://lists.freedesktop.org/mailman/listinfo/dim-tools

--
Jani Nikula, Intel Open Source Technology Center
[Intel-gfx] [PATCH 15/21] drm/i915: accurate page size tracking for the ppgtt
Now that we support multiple page sizes for the ppgtt, it would be useful to track the real usage for debugging purposes. Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c| 11 +++ drivers/gpu/drm/i915/i915_gem_object.h | 10 ++ 2 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 118aad90468f..4c605785e2b3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1053,6 +1053,8 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm, gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx, cache_level); + + vma->page_sizes.gtt = I915_GTT_PAGE_SIZE; } static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, @@ -1145,7 +1147,10 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, vaddr = kmap_atomic_px(pd); vaddr[idx.pde] |= GEN8_PDE_IPS_64K; kunmap_atomic(vaddr); + page_size = I915_GTT_PAGE_SIZE_64K; } + + vma->page_sizes.gtt |= page_size; } while (iter->sg); } @@ -1170,6 +1175,8 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm, while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter, &idx, cache_level)) GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4); + + vma->page_sizes.gtt = I915_GTT_PAGE_SIZE; } } @@ -1891,6 +1898,8 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm, } } while (1); kunmap_atomic(vaddr); + + vma->page_sizes.gtt = I915_GTT_PAGE_SIZE; } static int gen6_alloc_va_range(struct i915_address_space *vm, @@ -2598,6 +2607,8 @@ static int ggtt_bind_vma(struct i915_vma *vma, vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags); intel_runtime_pm_put(i915); + vma->page_sizes.gtt = I915_GTT_PAGE_SIZE; + /* * Without aliasing PPGTT there's no difference between * GLOBAL/LOCAL_BIND, it's all the same ptes. 
Hence unconditionally diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h index 110672952a1c..e4e6dd93889d 100644 --- a/drivers/gpu/drm/i915/i915_gem_object.h +++ b/drivers/gpu/drm/i915/i915_gem_object.h @@ -169,6 +169,7 @@ struct drm_i915_gem_object { struct sg_table *pages; void *mapping; + /* TODO: whack some of this into the error state */ struct i915_page_sizes { /** * The sg mask of the pages sg_table. i.e the mask of @@ -184,6 +185,15 @@ struct drm_i915_gem_object { * to use opportunistically. */ unsigned int sg; + + /** +* The actual gtt page size usage. Since we can have +* multiple vma associated with this object we need to +* prevent any trampling of state, hence a copy of this +* struct also lives in each vma, therefore the gtt +* value here should only be read/write through the vma. +*/ + unsigned int gtt; } page_sizes; struct i915_gem_object_page_iter { -- 2.13.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 16/21] drm/i915/debugfs: include some gtt page size metrics
Good to know, mostly for debugging purposes. v2: some improvements from Chris Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 61 ++--- 1 file changed, 57 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 44aae25d12c7..552d89eded44 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -119,6 +119,36 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj) return size; } +static const char * +stringify_page_sizes(unsigned int page_sizes, char *buf, size_t len) +{ + size_t x = 0; + + switch (page_sizes) { + case 0: + return ""; + case I915_GTT_PAGE_SIZE_4K: + return "4K"; + case I915_GTT_PAGE_SIZE_64K: + return "64K"; + case I915_GTT_PAGE_SIZE_2M: + return "2M"; + default: + if (!buf) + return "M"; + + if (page_sizes & I915_GTT_PAGE_SIZE_2M) + x += snprintf(buf + x, len - x, "2M, "); + if (page_sizes & I915_GTT_PAGE_SIZE_64K) + x += snprintf(buf + x, len - x, "64K, "); + if (page_sizes & I915_GTT_PAGE_SIZE_4K) + x += snprintf(buf + x, len - x, "4K, "); + buf[x-2] = '\0'; + + return buf; + } +} + static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { @@ -156,9 +186,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) if (!drm_mm_node_allocated(&vma->node)) continue; - seq_printf(m, " (%sgtt offset: %08llx, size: %08llx", + seq_printf(m, " (%sgtt offset: %08llx, size: %08llx, pages: %s", i915_vma_is_ggtt(vma) ? 
"g" : "pp", - vma->node.start, vma->node.size); + vma->node.start, vma->node.size, + stringify_page_sizes(vma->page_sizes.gtt, NULL, 0)); if (i915_vma_is_ggtt(vma)) { switch (vma->ggtt_view.type) { case I915_GGTT_VIEW_NORMAL: @@ -403,10 +434,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) struct drm_i915_private *dev_priv = node_to_i915(m->private); struct drm_device *dev = &dev_priv->drm; struct i915_ggtt *ggtt = &dev_priv->ggtt; - u32 count, mapped_count, purgeable_count, dpy_count; - u64 size, mapped_size, purgeable_size, dpy_size; + u32 count, mapped_count, purgeable_count, dpy_count, huge_count; + u64 size, mapped_size, purgeable_size, dpy_size, huge_size; struct drm_i915_gem_object *obj; + unsigned int page_sizes = 0; struct drm_file *file; + char buf[80]; int ret; ret = mutex_lock_interruptible(&dev->struct_mutex); @@ -420,6 +453,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data) size = count = 0; mapped_size = mapped_count = 0; purgeable_size = purgeable_count = 0; + huge_size = huge_count = 0; list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) { size += obj->base.size; ++count; @@ -433,6 +467,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) mapped_count++; mapped_size += obj->base.size; } + + if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) { + huge_count++; + huge_size += obj->base.size; + page_sizes |= obj->mm.page_sizes.sg; + } } seq_printf(m, "%u unbound objects, %llu bytes\n", count, size); @@ -455,6 +495,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) mapped_count++; mapped_size += obj->base.size; } + + if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) { + huge_count++; + huge_size += obj->base.size; + page_sizes |= obj->mm.page_sizes.sg; + } } seq_printf(m, "%u bound objects, %llu bytes\n", count, size); @@ -462,11 +508,18 @@ static int i915_gem_object_info(struct seq_file *m, void *data) purgeable_count, purgeable_size); seq_printf(m, "%u mapped objects, 
%llu bytes\n", mapped_count, mapped_size); + seq_printf(m, "%u huge-paged objects (%s) %llu bytes\n", + huge_count, + stringify_page_sizes(page_sizes, buf, sizeof(buf)), + huge_size); seq_prin
[Intel-gfx] [PATCH 21/21] drm/i915: enable platform support for 2M pages
For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_pci.c                  | 6 ++++--
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8d349aec1902..bf467f30c99b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -376,7 +376,8 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
 	G75_FEATURES, \
 	BDW_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES, \
+	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
+		      I915_GTT_PAGE_SIZE_2M, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1, \
@@ -437,7 +438,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
-		      I915_GTT_PAGE_SIZE_64K
+		      I915_GTT_PAGE_SIZE_64K | \
+		      I915_GTT_PAGE_SIZE_2M
 
 #define GEN9_FEATURES \
 	GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 7a9735dac912..04eb9362f4f8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -176,7 +176,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->page_sizes =
 		I915_GTT_PAGE_SIZE_4K |
-		I915_GTT_PAGE_SIZE_64K;
+		I915_GTT_PAGE_SIZE_64K |
+		I915_GTT_PAGE_SIZE_2M;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.13.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
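The page_sizes field in this patch is a plain bitmask, so a platform check reduces to a subset test. A minimal sketch, assuming HAS_PAGE_SIZES() means "every requested size is advertised"; has_page_sizes() and the SZ_* constants are stand-ins for illustration, not the driver's own definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for I915_GTT_PAGE_SIZE_{4K,64K,2M}: one bit per page size. */
#define SZ_4K  (UINT32_C(1) << 12)
#define SZ_64K (UINT32_C(1) << 16)
#define SZ_2M  (UINT32_C(1) << 21)

/* True if every size in `sizes` is present in the `supported` mask. */
static bool has_page_sizes(uint32_t supported, uint32_t sizes)
{
	assert(sizes != 0);	/* asking about "no size" is a caller bug */
	return (sizes & ~supported) == 0;
}
```

With the GEN8 mask from this patch (4K | 2M), a query for 64K fails while one for 2M succeeds; the GEN9 mask (4K | 64K | 2M) accepts all three.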
[Intel-gfx] [PATCH 11/21] drm/i915: disable GTT cache for 2M pages
When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!
v3: explicitly check that the system does support 2M/1G pages
v4: split WA and decision logic

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Mika Kuoppala
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/intel_pm.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 171b21f6c4ad..9d0ca2656a23 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8453,6 +8453,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
+	/* The GTT cache must be disabled if the system is using 2M pages. */
+	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
+						 I915_GTT_PAGE_SIZE_2M);
 	enum pipe pipe;
 
 	ilk_init_lp_watermarks(dev_priv);
@@ -8487,12 +8490,8 @@ static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 	/* WaProgramL3SqcReg1Default:bdw */
 	gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
-	/*
-	 * WaGttCachingOffByDefault:bdw
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
-	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	/* WaGttCachingOffByDefault:bdw */
+	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
 	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-- 
2.13.5
[Intel-gfx] [PATCH 17/21] drm/i915/selftests: huge page tests
v2: mock test page support configurations and add MI_STORE_DWORD test v3: run all mockable huge page tests on all platforms via the mock_device v4: add pin_update regression test various improvements suggested by Chris v5: fix issues reported by kbuild test single sg spanning multiple page sizes don't explode when running the live-tests through the appgtt v6: lots of improvements from Chris v7: run on each engine for igt_write_huge add simple tmpfs fallback test v8: size_t is bad don't break the i386 build Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c|1 + drivers/gpu/drm/i915/i915_gem_object.h |2 + drivers/gpu/drm/i915/selftests/huge_pages.c| 1715 .../gpu/drm/i915/selftests/i915_live_selftests.h |1 + .../gpu/drm/i915/selftests/i915_mock_selftests.h |1 + 5 files changed, 1720 insertions(+) create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 695cb2a38c88..e59fc37bf56e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5408,6 +5408,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) #include "selftests/scatterlist.c" #include "selftests/mock_gem_device.c" #include "selftests/huge_gem_object.c" +#include "selftests/huge_pages.c" #include "selftests/i915_gem_object.c" #include "selftests/i915_gem_coherency.c" #endif diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h index e4e6dd93889d..956c911c2cbf 100644 --- a/drivers/gpu/drm/i915/i915_gem_object.h +++ b/drivers/gpu/drm/i915/i915_gem_object.h @@ -196,6 +196,8 @@ struct drm_i915_gem_object { unsigned int gtt; } page_sizes; + I915_SELFTEST_DECLARE(unsigned int page_mask); + struct i915_gem_object_page_iter { struct scatterlist *sg_pos; unsigned int sg_idx; /* in pages, but 32bit eek! 
*/ diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c new file mode 100644 index ..b8495882e5b0 --- /dev/null +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c @@ -0,0 +1,1715 @@ +/* + * Copyright © 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ * + */ + +#include "../i915_selftest.h" + +#include + +#include "mock_drm.h" + +static const unsigned int page_sizes[] = { + I915_GTT_PAGE_SIZE_2M, + I915_GTT_PAGE_SIZE_64K, + I915_GTT_PAGE_SIZE_4K, +}; + +static unsigned int get_largest_page_size(struct drm_i915_private *i915, + u64 rem) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) { + unsigned int page_size = page_sizes[i]; + + if (HAS_PAGE_SIZES(i915, page_size) && rem >= page_size) + return page_size; + } + + return 0; +} + +static void huge_pages_free_pages(struct sg_table *st) +{ + struct scatterlist *sg; + + for (sg = st->sgl; sg; sg = __sg_next(sg)) { + if (sg_page(sg)) + __free_pages(sg_page(sg), get_order(sg->length)); + } + + sg_free_table(st); + kfree(st); +} + +static int get_huge_pages(struct drm_i915_gem_object *obj) +{ +#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY) + unsigned int page_mask = obj->mm.page_mask; + struct sg_table *st; + struct scatterlist *sg; + unsigned int sg_mask; + u64 rem; + + st = kmalloc(sizeof(*st), GFP); + if (!st) + return -ENOMEM; + + if (sg_alloc_table(st, obj->base.size >> PAGE_SHIFT, GFP)) { +
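The get_largest_page_size() helper above walks a largest-first table and returns the first supported size that still fits. A self-contained sketch of that selection, plus a hypothetical count_chunks() helper (not part of the patch) that applies it greedily to a whole object; the SZ_* names are stand-ins for the I915_GTT_PAGE_SIZE_* constants:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Candidate sizes, largest first, mirroring the table in the selftest. */
static const uint64_t page_sizes[] = { SZ_2M, SZ_64K, SZ_4K };

/* Largest supported page size that still fits in rem, or 0 if none does. */
static uint64_t largest_page_size(uint64_t supported, uint64_t rem)
{
	size_t i;

	for (i = 0; i < sizeof(page_sizes) / sizeof(page_sizes[0]); i++) {
		if ((supported & page_sizes[i]) && rem >= page_sizes[i])
			return page_sizes[i];
	}
	return 0;
}

/* Hypothetical helper: number of chunks in a greedy split of size. */
static unsigned int count_chunks(uint64_t supported, uint64_t size)
{
	unsigned int n = 0;

	while (size) {
		uint64_t ps = largest_page_size(supported, size);

		/* The leftover must never be smaller than every supported page. */
		assert(ps != 0);
		size -= ps;
		n++;
	}
	return n;
}
```

An object of 2M + 64K + 4K splits into three chunks when all sizes are supported, whereas a 2M object on a 4K | 64K mask needs 32 chunks of 64K.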
[Intel-gfx] [PATCH 13/21] drm/i915: add support for 64K scratch page
Before we can fully enable 64K pages, we need to first support a 64K scratch page if we intend to support the case where we have object sizes < 2M, since any scratch PTE must also point to a 64K region. Without this our 64K usage is limited to objects which completely fill the page-table, and therefore don't need any scratch. v2: add reminder about why 48b PPGTT Reported-by: Chris Wilson Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 64 ++--- drivers/gpu/drm/i915/i915_gem_gtt.h | 1 + 2 files changed, 54 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 79ba485c5d42..7eae6ab8c5fd 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -519,22 +519,63 @@ static void fill_page_dma_32(struct i915_address_space *vm, static int setup_scratch_page(struct i915_address_space *vm, gfp_t gfp) { - struct page *page; + struct page *page = NULL; dma_addr_t addr; + int order; - page = alloc_page(gfp | __GFP_ZERO); - if (unlikely(!page)) - return -ENOMEM; + /* +* In order to utilize 64K pages for an object with a size < 2M, we will +* need to support a 64K scratch page, given that every 16th entry for a +* page-table operating in 64K mode must point to a properly aligned 64K +* region, including any PTEs which happen to point to scratch. +* +* This is only relevant for the 48b PPGTT where we support +* huge-gtt-pages, see also i915_vma_insert(). 
+* +* TODO: we should really consider write-protecting the scratch-page and +* sharing between ppgtt +*/ + if (i915_vm_is_48bit(vm) && + HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K)) { + order = get_order(I915_GTT_PAGE_SIZE_64K); + page = alloc_pages(gfp | __GFP_ZERO, order); + if (page) { + addr = dma_map_page(vm->dma, page, 0, + I915_GTT_PAGE_SIZE_64K, + PCI_DMA_BIDIRECTIONAL); + if (unlikely(dma_mapping_error(vm->dma, addr))) { + __free_pages(page, order); + page = NULL; + } - addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE, - PCI_DMA_BIDIRECTIONAL); - if (unlikely(dma_mapping_error(vm->dma, addr))) { - __free_page(page); - return -ENOMEM; + if (!IS_ALIGNED(addr, I915_GTT_PAGE_SIZE_64K)) { + dma_unmap_page(vm->dma, addr, + I915_GTT_PAGE_SIZE_64K, + PCI_DMA_BIDIRECTIONAL); + __free_pages(page, order); + page = NULL; + } + } + } + + if (!page) { + order = 0; + page = alloc_page(gfp | __GFP_ZERO); + if (unlikely(!page)) + return -ENOMEM; + + addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE, + PCI_DMA_BIDIRECTIONAL); + if (unlikely(dma_mapping_error(vm->dma, addr))) { + __free_page(page); + return -ENOMEM; + } } vm->scratch_page.page = page; vm->scratch_page.daddr = addr; + vm->scratch_page.order = order; + return 0; } @@ -542,8 +583,9 @@ static void cleanup_scratch_page(struct i915_address_space *vm) { struct i915_page_dma *p = &vm->scratch_page; - dma_unmap_page(vm->dma, p->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); - __free_page(p->page); + dma_unmap_page(vm->dma, p->daddr, BIT(p->order) << PAGE_SHIFT, + PCI_DMA_BIDIRECTIONAL); + __free_pages(p->page, p->order); } static struct i915_page_table *alloc_pt(struct i915_address_space *vm) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index b9d7036c3665..e9de3f05b0c9 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -215,6 +215,7 @@ struct i915_vma; struct i915_page_dma { struct page *page; + int order; union { dma_addr_t daddr; -- 
2.13.5
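The scratch-page patch stores an allocation order and later recovers the mapped length as BIT(order) << PAGE_SHIFT. A small sketch of that round trip, assuming 4K CPU pages; order_for_size() is an illustrative stand-in for the kernel's get_order(), not its implementation:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4K CPU pages */

/* Smallest order such that (1 << order) pages cover size bytes. */
static int order_for_size(size_t size)
{
	size_t pages;
	int order = 0;

	assert(size > 0);
	pages = (size + ((size_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Bytes covered by an allocation of the given order, as used on cleanup. */
static size_t size_for_order(int order)
{
	return ((size_t)1 << order) << PAGE_SHIFT;
}
```

Under that assumption a 64K scratch page is order 4 and the 4K fallback is order 0, so a single cleanup path can unmap and free either one from the stored order alone.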
[Intel-gfx] [PATCH 19/21] drm/i915: disable platform support for vGPU huge gtt pages
Currently gvt gtt handling doesn't support huge page entries, so disable
for now.

v2: remove useless 48b PPGTT check

Suggested-by: Zhenyu Wang
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Zhenyu Wang
Reviewed-by: Zhenyu Wang
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e59fc37bf56e..6d36ee9c3508 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4818,6 +4818,15 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
+	/*
+	 * We need to fallback to 4K pages since gvt gtt handling doesn't
+	 * support huge page entries - we will need to check either hypervisor
+	 * mm can support huge guest page or just do emulation in gvt.
+	 */
+	if (intel_vgpu_active(dev_priv))
+		mkwrite_device_info(dev_priv)->page_sizes =
+			I915_GTT_PAGE_SIZE_4K;
+
 	dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
 
 	if (!i915_modparams.enable_execlists) {
-- 
2.13.5
[Intel-gfx] [PATCH 20/21] drm/i915: enable platform support for 64K pages
For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_pci.c                  | 3 ++-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 7938006cf03a..8d349aec1902 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -436,7 +436,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-	.page_sizes = I915_GTT_PAGE_SIZE_4K
+	.page_sizes = I915_GTT_PAGE_SIZE_4K | \
+		      I915_GTT_PAGE_SIZE_64K
 
 #define GEN9_FEATURES \
 	GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f46c3a35d61a..7a9735dac912 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -175,7 +175,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->gen = -1;
 
 	mkwrite_device_info(i915)->page_sizes =
-		I915_GTT_PAGE_SIZE_4K;
+		I915_GTT_PAGE_SIZE_4K |
+		I915_GTT_PAGE_SIZE_64K;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.13.5
[Intel-gfx] [PATCH 18/21] drm/i915/selftests: mix huge pages
Try to mix sg page sizes for 4K, 64K and 2M pages.

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/

Suggested-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/selftests/scatterlist.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c b/drivers/gpu/drm/i915/selftests/scatterlist.c
index 1cc5d2931753..cd6d2a16071f 100644
--- a/drivers/gpu/drm/i915/selftests/scatterlist.c
+++ b/drivers/gpu/drm/i915/selftests/scatterlist.c
@@ -189,6 +189,20 @@ static unsigned int random(unsigned long n,
 	return 1 + (prandom_u32_state(rnd) % 1024);
 }
 
+static unsigned int random_page_size_pages(unsigned long n,
+					   unsigned long count,
+					   struct rnd_state *rnd)
+{
+	/* 4K, 64K, 2M */
+	static unsigned int page_count[] = {
+		BIT(12) >> PAGE_SHIFT,
+		BIT(16) >> PAGE_SHIFT,
+		BIT(21) >> PAGE_SHIFT,
+	};
+
+	return page_count[(prandom_u32_state(rnd) % 3)];
+}
+
 static inline bool page_contiguous(struct page *first,
 				   struct page *last,
 				   unsigned long npages)
@@ -252,6 +266,7 @@ static const npages_fn_t npages_funcs[] = {
 	grow,
 	shrink,
 	random,
+	random_page_size_pages,
 	NULL,
 };
 
-- 
2.13.5
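The page_count[] table in this patch converts each GTT page size into a number of CPU pages. A quick sketch of the arithmetic, assuming PAGE_SHIFT is 12; pages_per_gtt_page() is a name made up for this note:

```c
#include <assert.h>

#define PAGE_SHIFT 12			/* assumption: 4K CPU pages */
#define BIT(n) (1UL << (n))

/* Number of CPU pages spanned by a GTT page of 2^bit bytes. */
static unsigned long pages_per_gtt_page(unsigned int bit)
{
	assert(bit >= PAGE_SHIFT);	/* a smaller size would shift to zero */
	return BIT(bit) >> PAGE_SHIFT;
}
```

Under that assumption the three table entries come out as 1, 16 and 512 pages, so the random generator hands back sg chunk lengths that match exactly one 4K, 64K or 2M GTT page.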
[Intel-gfx] [PATCH 09/21] drm/i915: align 64K objects to 2M
We can't mix 64K and 4K pte's in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful but in reality shouldn't be too bad since this only
applies to the virtual address space of a 48b PPGTT.

v2: don't separate logically connected ops

Suggested-by: Chris Wilson
Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_vma.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5067eab27829..ecddf519a11c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -500,10 +500,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		 */
 		if (upper_32_bits(end) &&
 		    vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			/*
+			 * We can't mix 64K and 4K PTEs in the same page-table (2M
+			 * block), and so to avoid the ugliness and complexity of
+			 * coloring we opt for just aligning 64K objects to 2M.
+			 */
 			u64 page_alignment =
-				rounddown_pow_of_two(vma->page_sizes.sg);
+				rounddown_pow_of_two(vma->page_sizes.sg |
+						     I915_GTT_PAGE_SIZE_2M);
 
 			alignment = max(alignment, page_alignment);
+
+			if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
+				size = round_up(size, I915_GTT_PAGE_SIZE_2M);
 		}
 
 		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
-- 
2.13.5
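The alignment and padding arithmetic in this patch can be tried in isolation. A rough sketch, with rounddown_pow2() and round_up_pow2() as local stand-ins for the kernel's rounddown_pow_of_two() and round_up(), and the SZ_* constants standing in for I915_GTT_PAGE_SIZE_*:

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Largest power of two that is <= n (n must be non-zero). */
static uint64_t rounddown_pow2(uint64_t n)
{
	assert(n != 0);
	while (n & (n - 1))
		n &= n - 1;	/* clear the lowest set bit until one remains */
	return n;
}

/* Round n up to a multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t n, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return (n + align - 1) & ~(align - 1);
}

/* Alignment derived from the sg page-size mask, with 2M always OR-ed in. */
static uint64_t huge_alignment(uint64_t sg_mask)
{
	return rounddown_pow2(sg_mask | SZ_2M);
}

/* Node size: only masks that contain 64K get padded out to 2M. */
static uint64_t huge_node_size(uint64_t size, uint64_t sg_mask)
{
	if (sg_mask & SZ_64K)
		size = round_up_pow2(size, SZ_2M);
	return size;
}
```

OR-ing in 2M before rounding down means a 64K-only object still lands on a 2M boundary, and padding its size to 2M keeps any other object out of the same page-table, which is the trade-off the commit message calls potentially wasteful.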
[Intel-gfx] [PATCH 10/21] drm/i915: enable IPS bit for 64K pages
Before we can enable 64K pages through the IPS bit, we must first enable it through MMIO, otherwise the page-walker will simply ignore it. v2: add comment mentioning that 64K is BDW+ v3: move to more suitable home Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Mika Kuoppala Reviewed-by: Mika Kuoppala Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c | 17 + drivers/gpu/drm/i915/i915_reg.h | 3 +++ 2 files changed, 20 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index fb7ac66814ab..74fc9ac11cd5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1987,6 +1987,23 @@ static void gtt_write_workarounds(struct drm_i915_private *dev_priv) I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_SKL); else if (IS_GEN9_LP(dev_priv)) I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_BXT); + + /* +* To support 64K PTEs we need to first enable the use of the +* Intermediate-Page-Size(IPS) bit of the PDE field via some magical +* mmio, otherwise the page-walker will simply ignore the IPS bit. This +* shouldn't be needed after GEN10. +* +* 64K pages were first introduced from BDW+, although technically they +* only *work* from gen9+. For pre-BDW we instead have the option for +* 32K pages, but we don't currently have any support for it in our +* driver. 
+*/ + if (HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_64K) && + INTEL_GEN(dev_priv) <= 10) + I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA, + I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) | + GAMW_ECO_ENABLE_64K_IPS_FIELD); } int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index e7dba5539b11..50e65c98ca6c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2371,6 +2371,9 @@ enum i915_power_well_id { #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0) #define GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS (1<<18) +#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080) +#define GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF + #define GAMT_CHKN_BIT_REG _MMIO(0x4ab8) #define GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING (1<<28) #define GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT (1<<24) -- 2.13.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 14/21] drm/i915: support 64K pages for the 48b PPGTT
Support inserting 64K pages into the 48b PPGTT. v2: check for 64K scratch v3: we should only have to re-adjust maybe_64K at every sg interval Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 31 +++ drivers/gpu/drm/i915/i915_gem_gtt.h | 7 +++ 2 files changed, 38 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 7eae6ab8c5fd..118aad90468f 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1069,6 +1069,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, struct i915_page_directory_pointer *pdp = pdps[idx.pml4e]; struct i915_page_directory *pd = pdp->page_directory[idx.pdpe]; unsigned int page_size; + bool maybe_64K = false; gen8_pte_t encode = pte_encode; gen8_pte_t *vaddr; u16 index, max; @@ -1090,6 +1091,13 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, max = GEN8_PTES; page_size = I915_GTT_PAGE_SIZE; + if (!index && + vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K && + IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) && + (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) || +rem >= (max - index) << PAGE_SHIFT)) + maybe_64K = true; + vaddr = kmap_atomic_px(pt); } @@ -1109,12 +1117,35 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, iter->dma = sg_dma_address(iter->sg); iter->max = iter->dma + rem; + if (maybe_64K && index < max && + !(IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) && + (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) || + rem >= (max - index) << PAGE_SHIFT))) + maybe_64K = false; + if (unlikely(!IS_ALIGNED(iter->dma, page_size))) break; } } while (rem >= page_size && index < max); kunmap_atomic(vaddr); + + /* +* Is it safe to mark the 2M block as 64K? -- Either we have +* filled whole page-table with 64K entries, or filled part of +* it and have reached the end of the sg table and we have +* enough padding. 
+		 */
+		if (maybe_64K &&
+		    (index == max ||
+		     (i915_vm_has_scratch_64K(vma->vm) &&
+		      !iter->sg && IS_ALIGNED(vma->node.start +
+					      vma->node.size,
+					      I915_GTT_PAGE_SIZE_2M)))) {
+			vaddr = kmap_atomic_px(pd);
+			vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
+			kunmap_atomic(vaddr);
+		}
 	} while (iter->sg);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9de3f05b0c9..93211a96fdad 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4))
 #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6))
 
+#define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M BIT(7)
 
 struct sg_table;
@@ -352,6 +353,12 @@ i915_vm_is_48bit(const struct i915_address_space *vm)
 	return (vm->total - 1) >> 32;
 }
 
+static inline bool
+i915_vm_has_scratch_64K(struct i915_address_space *vm)
+{
+	return vm->scratch_page.order == get_order(I915_GTT_PAGE_SIZE_64K);
+}
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
-- 
2.13.5
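The maybe_64K test in this patch is easier to follow as a standalone predicate. A sketch of the same condition, assuming 4K CPU pages; may_use_64k() and its flat parameter list are invented for illustration and replace the vma and iterator state the driver actually consults:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4K CPU pages */
#define SZ_64K (UINT64_C(1) << 16)
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * A page-table is a 64K candidate only when we start at its first entry,
 * the object has 64K chunks, the DMA address is 64K aligned, and the
 * current sg chunk either ends on a 64K boundary or covers the rest of
 * the table.
 */
static bool may_use_64k(unsigned int index, unsigned int max,
			uint64_t sg_mask, uint64_t dma, uint64_t rem)
{
	assert(index <= max);
	return index == 0 &&
	       (sg_mask & SZ_64K) &&
	       IS_ALIGNED(dma, SZ_64K) &&
	       (IS_ALIGNED(rem, SZ_64K) ||
		rem >= ((uint64_t)(max - index) << PAGE_SHIFT));
}
```

The patch re-evaluates the alignment half of this test each time it steps to a new sg entry, and only sets the IPS bit on the PDE once the table turned out to be full or safely padded by a 64K scratch page.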
[Intel-gfx] [PATCH 12/21] drm/i915: support 2M pages for the 48b PPGTT
Support inserting 2M gtt pages into the 48b PPGTT. v2: sanity check sg->length against page_size v3: don't recalculate rem on each loop whitespace breakup Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 76 +++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + 2 files changed, 74 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 74fc9ac11cd5..79ba485c5d42 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1013,6 +1013,69 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm, cache_level); } +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma, + struct i915_page_directory_pointer **pdps, + struct sgt_dma *iter, + enum i915_cache_level cache_level) +{ + const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level); + u64 start = vma->node.start; + dma_addr_t rem = iter->sg->length; + + do { + struct gen8_insert_pte idx = gen8_insert_pte(start); + struct i915_page_directory_pointer *pdp = pdps[idx.pml4e]; + struct i915_page_directory *pd = pdp->page_directory[idx.pdpe]; + unsigned int page_size; + gen8_pte_t encode = pte_encode; + gen8_pte_t *vaddr; + u16 index, max; + + if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M && + IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) && + rem >= I915_GTT_PAGE_SIZE_2M && !idx.pte) { + index = idx.pde; + max = I915_PDES; + page_size = I915_GTT_PAGE_SIZE_2M; + + encode |= GEN8_PDE_PS_2M; + + vaddr = kmap_atomic_px(pd); + } else { + struct i915_page_table *pt = pd->page_table[idx.pde]; + + index = idx.pte; + max = GEN8_PTES; + page_size = I915_GTT_PAGE_SIZE; + + vaddr = kmap_atomic_px(pt); + } + + do { + GEM_BUG_ON(iter->sg->length < page_size); + vaddr[index++] = encode | iter->dma; + + start += page_size; + iter->dma += page_size; + rem -= page_size; + if (iter->dma >= iter->max) { + iter->sg = __sg_next(iter->sg); + if 
(!iter->sg) + break; + + rem = iter->sg->length; + iter->dma = sg_dma_address(iter->sg); + iter->max = iter->dma + rem; + + if (unlikely(!IS_ALIGNED(iter->dma, page_size))) + break; + } + } while (rem >= page_size && index < max); + + kunmap_atomic(vaddr); + } while (iter->sg); +} + static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm, struct i915_vma *vma, enum i915_cache_level cache_level, @@ -1025,11 +1088,16 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm, .max = iter.dma + iter.sg->length, }; struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps; - struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start); - while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter, -&idx, cache_level)) - GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4); + if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) { + gen8_ppgtt_insert_huge_entries(vma, pdps, &iter, cache_level); + } else { + struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start); + + while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], +&iter, &idx, cache_level)) + GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4); + } } static void gen8_free_page_tables(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f22491b4e6dc..b9d7036c3665 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -154,6 +154,8 @@ typedef u64 gen8_ppgtt_pml4e_t; #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4)) #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6)) +#define GEN8_PDE_PS_2
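The !idx.pte test in the 2M branch is easier to read once the index split is visible. A sketch assuming the usual 48b layout of four 9-bit indices above a 12-bit page offset; decode_va() and struct pte_index are illustrative names, not the driver's gen8_insert_pte():

```c
#include <assert.h>
#include <stdint.h>

struct pte_index {
	unsigned int pml4e, pdpe, pde, pte;
};

/* Split a 48-bit GPU virtual address into its four table indices. */
static struct pte_index decode_va(uint64_t va)
{
	struct pte_index idx;

	assert((va >> 48) == 0);	/* only 48 address bits are decoded here */
	idx.pml4e = (va >> 39) & 0x1ff;
	idx.pdpe  = (va >> 30) & 0x1ff;
	idx.pde   = (va >> 21) & 0x1ff;
	idx.pte   = (va >> 12) & 0x1ff;
	return idx;
}
```

Under this layout a zero pte index means the address sits on a 2M boundary, which is why the patch can write a single PDE with GEN8_PDE_PS_2M set instead of filling 512 PTEs.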
[Intel-gfx] [PATCH 07/21] drm/i915: introduce vm set_pages/clear_pages
Move the setting/clearing of the vma->pages to a vm operation. Doing so neatens things up a little, but more importantly gives us a sane place to also set/clear the vma->pages_sizes, which we introduce later in preparation for supporting huge-pages. v2: remove redundant vma->pages check v3: GEM_BUG_ON(vma->pages) following i915_vma_remove Suggested-by: Chris Wilson Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c | 70 +++ drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + drivers/gpu/drm/i915/i915_vma.c | 27 +++- drivers/gpu/drm/i915/selftests/mock_gtt.c | 11 ++--- 4 files changed, 66 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 4c82ceb8d318..c534b74eee32 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -205,8 +205,6 @@ static int ppgtt_bind_vma(struct i915_vma *vma, return ret; } - vma->pages = vma->obj->mm.pages; - /* Currently applicable only to VLV */ pte_flags = 0; if (vma->obj->gt_ro) @@ -222,6 +220,26 @@ static void ppgtt_unbind_vma(struct i915_vma *vma) vma->vm->clear_range(vma->vm, vma->node.start, vma->size); } +static int ppgtt_set_pages(struct i915_vma *vma) +{ + GEM_BUG_ON(vma->pages); + + vma->pages = vma->obj->mm.pages; + + return 0; +} + +static void clear_pages(struct i915_vma *vma) +{ + GEM_BUG_ON(!vma->pages); + + if (vma->pages != vma->obj->mm.pages) { + sg_free_table(vma->pages); + kfree(vma->pages); + } + vma->pages = NULL; +} + static gen8_pte_t gen8_pte_encode(dma_addr_t addr, enum i915_cache_level level) { @@ -1452,6 +1470,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.cleanup = gen8_ppgtt_cleanup; ppgtt->base.unbind_vma = ppgtt_unbind_vma; ppgtt->base.bind_vma = ppgtt_bind_vma; + ppgtt->base.set_pages = ppgtt_set_pages; + ppgtt->base.clear_pages = clear_pages; ppgtt->debug_dump = 
gen8_dump_ppgtt; return 0; @@ -1894,6 +1914,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; ppgtt->base.unbind_vma = ppgtt_unbind_vma; ppgtt->base.bind_vma = ppgtt_bind_vma; + ppgtt->base.set_pages = ppgtt_set_pages; + ppgtt->base.clear_pages = clear_pages; ppgtt->base.cleanup = gen6_ppgtt_cleanup; ppgtt->debug_dump = gen6_dump_ppgtt; @@ -2405,12 +2427,6 @@ static int ggtt_bind_vma(struct i915_vma *vma, struct drm_i915_gem_object *obj = vma->obj; u32 pte_flags; - if (unlikely(!vma->pages)) { - int ret = i915_get_ggtt_vma_pages(vma); - if (ret) - return ret; - } - /* Currently applicable only to VLV */ pte_flags = 0; if (obj->gt_ro) @@ -2447,12 +2463,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, u32 pte_flags; int ret; - if (unlikely(!vma->pages)) { - ret = i915_get_ggtt_vma_pages(vma); - if (ret) - return ret; - } - /* Currently applicable only to VLV */ pte_flags = 0; if (vma->obj->gt_ro) @@ -2467,7 +2477,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, vma->node.start, vma->size); if (ret) - goto err_pages; + return ret; } appgtt->base.insert_entries(&appgtt->base, vma, cache_level, @@ -2481,17 +2491,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, } return 0; - -err_pages: - if (!(vma->flags & (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND))) { - if (vma->pages != vma->obj->mm.pages) { - GEM_BUG_ON(!vma->pages); - sg_free_table(vma->pages); - kfree(vma->pages); - } - vma->pages = NULL; - } - return ret; } static void aliasing_gtt_unbind_vma(struct i915_vma *vma) @@ -2529,6 +2528,19 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj, dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL); } +static int ggtt_set_pages(struct i915_vma *vma) +{ + int ret; + + GEM_BUG_ON(vma->pages); + + ret = i915_get_ggtt_vma_pages(vma); + if (ret) + return ret; + + return 0; +} + static void i915_gtt_color_adjust(const struct drm_mm_node *node,
[Intel-gfx] [PATCH 03/21] drm/i915/gemfs: enable THP
Enable transparent-huge-pages through gemfs by mounting with huge=within_size. v2: sprinkle within_size comment Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gemfs.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c index 168d0bd98f60..e2993857df37 100644 --- a/drivers/gpu/drm/i915/i915_gemfs.c +++ b/drivers/gpu/drm/i915/i915_gemfs.c @@ -24,6 +24,7 @@ #include #include +#include #include "i915_drv.h" #include "i915_gemfs.h" @@ -41,6 +42,27 @@ int i915_gemfs_init(struct drm_i915_private *i915) if (IS_ERR(gemfs)) return PTR_ERR(gemfs); + /* +* Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most +* likely 2M. Note that within_size may overallocate huge-pages, if say +* we allocate an object of size 2M + 4K, we may get 2M + 2M, but under +* memory pressure shmem should split any huge-pages which can be +* shrunk. +*/ + + if (has_transparent_hugepage()) { + struct super_block *sb = gemfs->mnt_sb; + char options[] = "huge=within_size"; + int flags = 0; + int err; + + err = sb->s_op->remount_fs(sb, &flags, options); + if (err) { + kern_unmount(gemfs); + return err; + } + } + i915->mm.gemfs = gemfs; return 0; -- 2.13.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 00/21] huge gtt pages
Some more bits of polish.

Matthew Auld (21):
  mm/shmem: introduce shmem_file_setup_with_mnt
  drm/i915: introduce simple gemfs
  drm/i915/gemfs: enable THP
  drm/i915: introduce page_sizes field to dev_info
  drm/i915: push set_pages down to the callers
  drm/i915: introduce page_size members
  drm/i915: introduce vm set_pages/clear_pages
  drm/i915: align the vma start to the largest gtt page size
  drm/i915: align 64K objects to 2M
  drm/i915: enable IPS bit for 64K pages
  drm/i915: disable GTT cache for 2M pages
  drm/i915: support 2M pages for the 48b PPGTT
  drm/i915: add support for 64K scratch page
  drm/i915: support 64K pages for the 48b PPGTT
  drm/i915: accurate page size tracking for the ppgtt
  drm/i915/debugfs: include some gtt page size metrics
  drm/i915/selftests: huge page tests
  drm/i915/selftests: mix huge pages
  drm/i915: disable platform support for vGPU huge gtt pages
  drm/i915: enable platform support for 64K pages
  drm/i915: enable platform support for 2M pages

 drivers/gpu/drm/i915/Makefile                      |    1 +
 drivers/gpu/drm/i915/i915_debugfs.c                |   61 +-
 drivers/gpu/drm/i915/i915_drv.h                    |   29 +-
 drivers/gpu/drm/i915/i915_gem.c                    |  126 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c             |   18 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c                |  275 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.h                |   20 +-
 drivers/gpu/drm/i915/i915_gem_internal.c           |   18 +-
 drivers/gpu/drm/i915/i915_gem_object.h             |   31 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c             |   16 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c            |   15 +-
 drivers/gpu/drm/i915/i915_gemfs.c                  |   74 +
 drivers/gpu/drm/i915/i915_gemfs.h                  |   34 +
 drivers/gpu/drm/i915/i915_pci.c                    |   21 +
 drivers/gpu/drm/i915/i915_reg.h                    |    3 +
 drivers/gpu/drm/i915/i915_vma.c                    |   49 +-
 drivers/gpu/drm/i915/i915_vma.h                    |    1 +
 drivers/gpu/drm/i915/intel_pm.c                    |   11 +-
 drivers/gpu/drm/i915/selftests/huge_gem_object.c   |   14 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c        | 1715
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c      |   15 +-
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |    1 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h   |    1 +
 drivers/gpu/drm/i915/selftests/mock_gem_device.c   |    9 +
 drivers/gpu/drm/i915/selftests/mock_gtt.c          |   11 +-
 drivers/gpu/drm/i915/selftests/scatterlist.c       |   15 +
 include/linux/shmem_fs.h                           |    2 +
 mm/shmem.c                                         |   30 +-
 28 files changed, 2479 insertions(+), 137 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

--
2.13.5
[Intel-gfx] [PATCH 01/21] mm/shmem: introduce shmem_file_setup_with_mnt
We are planning to use our own tmpfs mnt in i915 in place of the
shm_mnt, such that we can control the mount options, in particular
huge=, which we require to support huge-gtt-pages. So rather than roll
our own version of __shmem_file_setup, it would be preferred if we could
just give shmem our mnt, and let it do the rest.

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Cc: Dave Hansen
Cc: Kirill A. Shutemov
Cc: Hugh Dickins
Cc: linux...@kvack.org
Acked-by: Andrew Morton
Acked-by: Kirill A. Shutemov
Reviewed-by: Joonas Lahtinen
---
 include/linux/shmem_fs.h |  2 ++
 mm/shmem.c               | 30 ++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index b6c3540e07bc..0937d9a7d8fb 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -53,6 +53,8 @@ extern struct file *shmem_file_setup(const char *name,
 					loff_t size, unsigned long flags);
 extern struct file *shmem_kernel_file_setup(const char *name, loff_t size,
 					    unsigned long flags);
+extern struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt,
+		const char *name, loff_t size, unsigned long flags);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern unsigned long shmem_get_unmapped_area(struct file *, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 07a1d22807be..3229d27503ec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4183,7 +4183,7 @@ static const struct dentry_operations anon_ops = {
 	.d_dname = simple_dname
 };

-static struct file *__shmem_file_setup(const char *name, loff_t size,
+static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
 				       unsigned long flags, unsigned int i_flags)
 {
 	struct file *res;
@@ -4192,8 +4192,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
 	struct super_block *sb;
 	struct qstr this;

-	if (IS_ERR(shm_mnt))
-		return ERR_CAST(shm_mnt);
+	if (IS_ERR(mnt))
+		return ERR_CAST(mnt);

 	if (size < 0 || size > MAX_LFS_FILESIZE)
 		return ERR_PTR(-EINVAL);
@@ -4205,8 +4205,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
 	this.name = name;
 	this.len = strlen(name);
 	this.hash = 0; /* will go */
-	sb = shm_mnt->mnt_sb;
-	path.mnt = mntget(shm_mnt);
+	sb = mnt->mnt_sb;
+	path.mnt = mntget(mnt);
 	path.dentry = d_alloc_pseudo(sb, &this);
 	if (!path.dentry)
 		goto put_memory;
@@ -4251,7 +4251,7 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
  */
 struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-	return __shmem_file_setup(name, size, flags, S_PRIVATE);
+	return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);
 }

 /**
@@ -4262,11 +4262,25 @@ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned lon
  */
 struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-	return __shmem_file_setup(name, size, flags, 0);
+	return __shmem_file_setup(shm_mnt, name, size, flags, 0);
 }
 EXPORT_SYMBOL_GPL(shmem_file_setup);

 /**
+ * shmem_file_setup_with_mnt - get an unlinked file living in tmpfs
+ * @mnt: the tmpfs mount where the file will be created
+ * @name: name for dentry (to be seen in /proc/<pid>/maps
+ * @size: size to be set for the file
+ * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
+ */
+struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt, const char *name,
+				       loff_t size, unsigned long flags)
+{
+	return __shmem_file_setup(mnt, name, size, flags, 0);
+}
+EXPORT_SYMBOL_GPL(shmem_file_setup_with_mnt);
+
+/**
  * shmem_zero_setup - setup a shared anonymous mapping
  * @vma: the vma to be mmapped is prepared by do_mmap_pgoff
  */
@@ -4281,7 +4295,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	 * accessible to the user through its mapping, use S_PRIVATE flag to
 	 * bypass file security, in the same way as shmem_kernel_file_setup().
 	 */
-	file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
+	file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags);
 	if (IS_ERR(file))
 		return PTR_ERR(file);

--
2.13.5
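The shape of this refactor (turn the hard-coded shm_mnt into a parameter of the common helper, keep the old entry points as thin wrappers, add one new wrapper that takes a caller-supplied mount) can be sketched in plain userspace C. Everything below is a hypothetical stand-in: `struct mock_mount`, `struct mock_file`, the `mock_*` functions and the default option string are assumptions made for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for struct vfsmount and struct file. */
struct mock_mount {
	const char *options;
};

struct mock_file {
	const struct mock_mount *mnt;
	char name[16];
};

/* Plays the role of the global shm_mnt; the option string is made up. */
static struct mock_mount default_mnt = { "huge=never" };

/* Common helper: the mount is a parameter instead of a hard-coded global. */
static int mock_file_setup_common(struct mock_file *f,
				  const struct mock_mount *mnt,
				  const char *name)
{
	if (mnt == NULL || name == NULL)
		return -1;

	f->mnt = mnt;
	snprintf(f->name, sizeof(f->name), "%s", name);

	return 0;
}

/* Existing entry point: behaviour unchanged, it just passes the default mount. */
static int mock_file_setup(struct mock_file *f, const char *name)
{
	return mock_file_setup_common(f, &default_mnt, name);
}

/* New entry point: the caller supplies its own mount. */
static int mock_file_setup_with_mnt(struct mock_file *f,
				    const struct mock_mount *mnt,
				    const char *name)
{
	return mock_file_setup_common(f, mnt, name);
}
```

Existing callers keep their old signature and behaviour, while a driver that owns a private mount can pass it in without duplicating the common helper, which is the motivation given in the commit message.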
[Intel-gfx] [PATCH 04/21] drm/i915: introduce page_sizes field to dev_info
In preparation for huge gtt pages expose page_sizes as part of the device info, to indicate the page sizes supported by the HW. Currently only 4K is supported. v2: s/page_size_mask/page_sizes/ v3: introduce I915_GTT_MAX_PAGE_SIZE Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Mika Kuoppala Cc: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_gtt.h | 8 +++- drivers/gpu/drm/i915/i915_pci.c | 18 ++ drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 +++ 4 files changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ec6f320cc4f5..3d4dee817381 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -869,6 +869,8 @@ struct intel_device_info { u8 num_sprites[I915_MAX_PIPES]; u8 num_scalers[I915_MAX_PIPES]; + unsigned int page_sizes; /* page sizes supported by the HW */ + #define DEFINE_FLAG(name) u8 name:1 DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG); #undef DEFINE_FLAG diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f62fb903dc24..50218c141c21 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -42,7 +42,13 @@ #include "i915_gem_request.h" #include "i915_selftest.h" -#define I915_GTT_PAGE_SIZE 4096UL +#define I915_GTT_PAGE_SIZE_4K BIT(12) +#define I915_GTT_PAGE_SIZE_64K BIT(16) +#define I915_GTT_PAGE_SIZE_2M BIT(21) + +#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K +#define I915_GTT_MAX_PAGE_SIZE I915_GTT_PAGE_SIZE_2M + #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE #define I915_FENCE_REG_NONE -1 diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 745b6a6e0188..7938006cf03a 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -58,6 +58,10 @@ .color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 } /* Keep in gen based order, and 
chronological order within a gen */ + +#define GEN_DEFAULT_PAGE_SIZES \ + .page_sizes = I915_GTT_PAGE_SIZE_4K + #define GEN2_FEATURES \ .gen = 2, .num_pipes = 1, \ .has_overlay = 1, .overlay_needs_physical = 1, \ @@ -67,6 +71,7 @@ .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i830_info __initconst = { @@ -100,6 +105,7 @@ static const struct intel_device_info intel_i865g_info __initconst = { .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i915g_info __initconst = { @@ -163,6 +169,7 @@ static const struct intel_device_info intel_pineview_info __initconst = { .ring_mask = RENDER_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_i965g_info __initconst = { @@ -205,6 +212,7 @@ static const struct intel_device_info intel_gm45_info __initconst = { .ring_mask = RENDER_RING | BSD_RING, \ .has_snoop = true, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS static const struct intel_device_info intel_ironlake_d_info __initconst = { @@ -228,6 +236,7 @@ static const struct intel_device_info intel_ironlake_m_info __initconst = { .has_rc6p = 1, \ .has_aliasing_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ CURSOR_OFFSETS #define SNB_D_PLATFORM \ @@ -271,6 +280,7 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info __initconst = .has_aliasing_ppgtt = 1, \ .has_full_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ + GEN_DEFAULT_PAGE_SIZES, \ IVB_CURSOR_OFFSETS #define IVB_D_PLATFORM \ @@ -327,6 +337,7 @@ static const struct intel_device_info intel_valleyview_info __initconst = { .has_snoop = true, .ring_mask = RENDER_RING | BSD_RING | BLT_RING, .display_mmio_offset = VLV_DISPLAY_BASE, + GEN_DEFAULT_PAGE_SIZES, 
GEN_DEFAULT_PIPEOFFSETS, CURSOR_OFFSETS }; @@ -365,6 +376,7 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = { #define GEN8_FEATURES \ G75_FEATURES, \ BDW_COLORS, \ + GEN_DEFAULT_PAGE_SIZES, \ .has_logical_ring_contexts = 1, \ .has_full_48bit_ppgtt = 1, \ .has_64bit_reloc = 1, \ @@ -417,13 +429,18 @@ static const struct intel_device_info intel_cherryview_info __initconst = { .has_reset_engine = 1, .has_snoop = true, .display_mmio_offse
[Intel-gfx] [PATCH 08/21] drm/i915: align the vma start to the largest gtt page size
For the 48b PPGTT try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning and objects which need to be
tightly packed into the lower 32bits we don't force any alignment.

v2: various improvements suggested by Chris
v3: use set_pages and better placement of page_sizes
v4: prefer upper_32_bits()
v5: assign vma->page_sizes = vma->obj->page_sizes directly
    prefer sizeof(vma->page_sizes)

Signed-off-by: Matthew Auld
Cc: Joonas Lahtinen
Cc: Chris Wilson
Reviewed-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  6 ++
 drivers/gpu/drm/i915/i915_vma.c     | 13 +
 drivers/gpu/drm/i915/i915_vma.h     |  1 +
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c534b74eee32..fb7ac66814ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -226,6 +226,8 @@ static int ppgtt_set_pages(struct i915_vma *vma)

 	vma->pages = vma->obj->mm.pages;

+	vma->page_sizes = vma->obj->mm.page_sizes;
+
 	return 0;
 }

@@ -238,6 +240,8 @@ static void clear_pages(struct i915_vma *vma)
 		kfree(vma->pages);
 	}
 	vma->pages = NULL;
+
+	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }

 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
@@ -2538,6 +2542,8 @@ static int ggtt_set_pages(struct i915_vma *vma)
 	if (ret)
 		return ret;

+	vma->page_sizes = vma->obj->mm.page_sizes;
+
 	return 0;
 }

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 49bf49571e47..5067eab27829 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -493,6 +493,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		if (ret)
 			goto err_clear;
 	} else {
+		/*
+		 * We only support huge gtt pages through the 48b PPGTT,
+		 * however we also don't want to force any alignment for
+		 * objects which need to be tightly packed into the low 32bits.
+		 */
+		if (upper_32_bits(end) &&
+		    vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			u64 page_alignment =
+				rounddown_pow_of_two(vma->page_sizes.sg);
+
+			alignment = max(alignment, page_alignment);
+		}
+
 		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
 					  size, alignment, obj->cache_level,
 					  start, end, flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e811067c7724..c59ba76613a3 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -55,6 +55,7 @@ struct i915_vma {
 	void __iomem *iomap;
 	u64 size;
 	u64 display_alignment;
+	struct i915_page_sizes page_sizes;

 	u32 fence_size;
 	u32 fence_alignment;
--
2.13.5
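The alignment rule in the i915_vma_insert() hunk can be tried out in userspace. This is only a sketch: `sketch_rounddown_pow_of_two` and `sketch_vma_alignment` are hypothetical helpers written for illustration, `SKETCH_GTT_PAGE_SIZE` assumes the 4K value of I915_GTT_PAGE_SIZE, and the `end >> 32` test stands in for upper_32_bits(end).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GTT_PAGE_SIZE (UINT64_C(1) << 12) /* assumed 4K base GTT page */

/* Largest power of two that is <= x; x must be non-zero. */
static uint64_t sketch_rounddown_pow_of_two(uint64_t x)
{
	uint64_t p = 1;

	while (x >>= 1)
		p <<= 1;

	return p;
}

/*
 * Raise the requested alignment to the largest page size present in the
 * sg mask, but only when the allowed range reaches above 4G and the mask
 * contains something larger than the base page.
 */
static uint64_t sketch_vma_alignment(uint64_t alignment, uint64_t end,
				     uint64_t sg_page_sizes)
{
	if ((end >> 32) != 0 && sg_page_sizes > SKETCH_GTT_PAGE_SIZE) {
		uint64_t page_alignment =
			sketch_rounddown_pow_of_two(sg_page_sizes);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}

	return alignment;
}
```

With a mask of 2M | 64K and a range end above 4G, the result is 2M, i.e. "just align to the largest". With an end inside the low 32 bits the requested alignment is returned unchanged, which corresponds to the tightly packed case described in the commit message.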
[Intel-gfx] [PATCH 05/21] drm/i915: push set_pages down to the callers
Each backend is now responsible for calling __i915_gem_object_set_pages upon successfully gathering its backing storage. This eliminates the inconsistency between the async and sync paths, which stands out even more when we start throwing around an sg_mask in a later patch. Suggested-by: Chris Wilson Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 45 +--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 15 +--- drivers/gpu/drm/i915/i915_gem_internal.c | 15 drivers/gpu/drm/i915/i915_gem_object.h | 2 +- drivers/gpu/drm/i915/i915_gem_stolen.c | 16 ++--- drivers/gpu/drm/i915/i915_gem_userptr.c | 12 +++ drivers/gpu/drm/i915/selftests/huge_gem_object.c | 14 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 12 --- 8 files changed, 77 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 81d70d23a057..c1c07d0957aa 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -162,8 +162,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data, return 0; } -static struct sg_table * -i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) +static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) { struct address_space *mapping = obj->base.filp->f_mapping; drm_dma_handle_t *phys; @@ -171,9 +170,10 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) struct scatterlist *sg; char *vaddr; int i; + int err; if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj))) - return ERR_PTR(-EINVAL); + return -EINVAL; /* Always aligning to the object size, allows a single allocation * to handle all possible callers, and given typical object sizes, @@ -183,7 +183,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) roundup_pow_of_two(obj->base.size), roundup_pow_of_two(obj->base.size)); if (!phys) - return ERR_PTR(-ENOMEM); + return -ENOMEM; 
vaddr = phys->vaddr; for (i = 0; i < obj->base.size / PAGE_SIZE; i++) { @@ -192,7 +192,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) page = shmem_read_mapping_page(mapping, i); if (IS_ERR(page)) { - st = ERR_CAST(page); + err = PTR_ERR(page); goto err_phys; } @@ -209,13 +209,13 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) st = kmalloc(sizeof(*st), GFP_KERNEL); if (!st) { - st = ERR_PTR(-ENOMEM); + err = -ENOMEM; goto err_phys; } if (sg_alloc_table(st, 1, GFP_KERNEL)) { kfree(st); - st = ERR_PTR(-ENOMEM); + err = -ENOMEM; goto err_phys; } @@ -227,11 +227,15 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) sg_dma_len(sg) = obj->base.size; obj->phys_handle = phys; - return st; + + __i915_gem_object_set_pages(obj, st); + + return 0; err_phys: drm_pci_free(obj->base.dev, phys); - return st; + + return err; } static void __start_cpu_write(struct drm_i915_gem_object *obj) @@ -2292,8 +2296,7 @@ static bool i915_sg_trim(struct sg_table *orig_st) return true; } -static struct sg_table * -i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) +static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) { struct drm_i915_private *dev_priv = to_i915(obj->base.dev); const unsigned long page_count = obj->base.size / PAGE_SIZE; @@ -2317,12 +2320,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) st = kmalloc(sizeof(*st), GFP_KERNEL); if (st == NULL) - return ERR_PTR(-ENOMEM); + return -ENOMEM; rebuild_st: if (sg_alloc_table(st, page_count, GFP_KERNEL)) { kfree(st); - return ERR_PTR(-ENOMEM); + return -ENOMEM; } /* Get the list of pages out of our struct file. 
They'll be pinned @@ -2430,7 +2433,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) if (i915_gem_object_needs_bit17_swizzle(obj)) i915_gem_object_do_bit_17_swizzle(obj, st); - return st; + __i915_gem_object_set_pages(obj, st); + + return 0; err_sg: sg_mark_end(sg); @@ -2451,7 +2456,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) if (ret == -ENOSPC) ret =
[Intel-gfx] [PATCH 06/21] drm/i915: introduce page_size members
In preparation for supporting huge gtt pages for the ppgtt, we introduce page size members for gem objects. We fill in the page sizes by scanning the sg table. v2: pass the sg_mask to set_pages v3: calculate the sg_mask inline with populating the sg_table where possible, and pass to set_pages along with the pages. v4: bunch of improvements from Joonas v5: fix num_pages blunder introduce i915_sg_page_sizes helper v6: prefer GEM_BUG_ON(sizes == 0) Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Daniel Vetter Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 22 - drivers/gpu/drm/i915/i915_gem.c | 42 +--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 5 ++- drivers/gpu/drm/i915/i915_gem_internal.c | 5 ++- drivers/gpu/drm/i915/i915_gem_object.h | 17 ++ drivers/gpu/drm/i915/i915_gem_stolen.c | 2 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 5 ++- drivers/gpu/drm/i915/selftests/huge_gem_object.c | 2 +- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 5 ++- 9 files changed, 93 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3d4dee817381..799a90abd81f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2872,6 +2872,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg) (((__iter).curr += PAGE_SIZE) >= (__iter).max) ? 
\ (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0) +static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg) +{ + unsigned int page_sizes; + + page_sizes = 0; + while (sg) { + GEM_BUG_ON(sg->offset); + GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE)); + page_sizes |= sg->length; + sg = __sg_next(sg); + } + + return page_sizes; +} + static inline unsigned int i915_sg_segment_size(void) { unsigned int size = swiotlb_max_segment(); @@ -3101,6 +3116,10 @@ intel_info(const struct drm_i915_private *dev_priv) #define USES_PPGTT(dev_priv) (i915_modparams.enable_ppgtt) #define USES_FULL_PPGTT(dev_priv) (i915_modparams.enable_ppgtt >= 2) #define USES_FULL_48BIT_PPGTT(dev_priv)(i915_modparams.enable_ppgtt == 3) +#define HAS_PAGE_SIZES(dev_priv, sizes) ({ \ + GEM_BUG_ON((sizes) == 0); \ + ((sizes) & ~(dev_priv)->info.page_sizes) == 0; \ +}) #define HAS_OVERLAY(dev_priv) ((dev_priv)->info.has_overlay) #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \ @@ -3517,7 +3536,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, unsigned long n); void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, -struct sg_table *pages); +struct sg_table *pages, +unsigned int sg_mask); int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c1c07d0957aa..695cb2a38c88 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -228,7 +228,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) obj->phys_handle = phys; - __i915_gem_object_set_pages(obj, st); + __i915_gem_object_set_pages(obj, st, sg->length); return 0; @@ -2266,6 +2266,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, if (!IS_ERR(pages)) obj->ops->put_pages(obj, pages); + obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0; + unlock: mutex_unlock(&obj->mm.lock); } @@ -2308,6 +2310,7 @@ static int 
i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ unsigned int max_segment = i915_sg_segment_size(); + unsigned int sg_mask; gfp_t noreclaim; int ret; @@ -2339,6 +2342,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) sg = st->sgl; st->nents = 0; + sg_mask = 0; for (i = 0; i < page_count; i++) { const unsigned int shrink[] = { I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_PURGEABLE, @@ -2391,8 +2395,10 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) if (!i || sg->length >= max_segment || page_to_pfn(page) != last_pfn + 1) { - if (i) + if (i) { + sg_mask |= sg->length; sg = sg_next(sg); + }
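The two ideas in this patch, OR-ing the segment lengths into an sg mask and checking that mask against the page sizes a device supports, can be sketched outside the kernel. The code below uses a plain array of lengths in place of a scatterlist; `sketch_sg_page_sizes`, `sketch_has_page_sizes` and the `SKETCH_PAGE_*` constants are hypothetical names for illustration, with values mirroring BIT(12), BIT(16) and BIT(21).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_4K  (1u << 12)
#define SKETCH_PAGE_64K (1u << 16)
#define SKETCH_PAGE_2M  (1u << 21)

/* OR together the length of every segment, as i915_sg_page_sizes() does per sg entry. */
static unsigned int sketch_sg_page_sizes(const unsigned int *lengths, size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= lengths[i];

	return mask;
}

/* True when every bit set in sizes is also set in supported (subset test). */
static bool sketch_has_page_sizes(unsigned int supported, unsigned int sizes)
{
	return (sizes & ~supported) == 0;
}
```

Three segments of 2M, 64K and 4K give a mask of 0x211000. A device that only advertises 4K fails the subset test for that mask, while one advertising all three sizes passes.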
[Intel-gfx] [PATCH 02/21] drm/i915: introduce simple gemfs
Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so moves us away from the shmemfs shm_mnt, and gives us the much needed flexibility to do things like set our own mount options, namely huge= which should allow us to enable the use of transparent-huge-pages for our shmem backed objects. v2: various improvements suggested by Joonas v3: move gemfs instance to i915.mm and simplify now that we have file_setup_with_mnt v4: fallback to tmpfs shm_mnt upon failure to setup gemfs v5: make tmpfs fallback kinder v5: better gemfs failure message flags variable Signed-off-by: Matthew Auld Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Dave Hansen Cc: Kirill A. Shutemov Cc: Hugh Dickins Cc: linux...@kvack.org Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/Makefile| 1 + drivers/gpu/drm/i915/i915_drv.h | 5 +++ drivers/gpu/drm/i915/i915_gem.c | 33 ++- drivers/gpu/drm/i915/i915_gemfs.c| 52 drivers/gpu/drm/i915/i915_gemfs.h| 34 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 4 ++ 6 files changed, 128 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 51d0d2929a4b..66d23b619db1 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -47,6 +47,7 @@ i915-y += i915_cmd_parser.o \ i915_gem_tiling.o \ i915_gem_timeline.o \ i915_gem_userptr.o \ + i915_gemfs.o \ i915_trace_points.o \ i915_vma.o \ intel_breadcrumbs.o \ diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1fc7080bfa7b..ec6f320cc4f5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1511,6 +1511,11 @@ struct i915_gem_mm { /** Usable portion of the GTT for GEM */ dma_addr_t stolen_base; /* limited to low memory (32-bit) */ + /** +* tmpfs instance used for shmem backed objects +*/ + struct vfsmount *gemfs; + /** PPGTT used for aliasing the PPGTT with 
the GTT */ struct i915_hw_ppgtt *aliasing_ppgtt; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ab8c6946fea4..81d70d23a057 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -35,6 +35,7 @@ #include "intel_drv.h" #include "intel_frontbuffer.h" #include "intel_mocs.h" +#include "i915_gemfs.h" #include #include #include @@ -4251,6 +4252,30 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = { .pwrite = i915_gem_object_pwrite_gtt, }; +static int i915_gem_object_create_shmem(struct drm_device *dev, + struct drm_gem_object *obj, + size_t size) +{ + struct drm_i915_private *i915 = to_i915(dev); + unsigned long flags = VM_NORESERVE; + struct file *filp; + + drm_gem_private_object_init(dev, obj, size); + + if (i915->mm.gemfs) + filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size, +flags); + else + filp = shmem_file_setup("i915", size, flags); + + if (IS_ERR(filp)) + return PTR_ERR(filp); + + obj->filp = filp; + + return 0; +} + struct drm_i915_gem_object * i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) { @@ -4275,7 +4300,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) if (obj == NULL) return ERR_PTR(-ENOMEM); - ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size); + ret = i915_gem_object_create_shmem(&dev_priv->drm, &obj->base, size); if (ret) goto fail; @@ -4915,6 +4940,10 @@ i915_gem_load_init(struct drm_i915_private *dev_priv) spin_lock_init(&dev_priv->fb_tracking.lock); + err = i915_gemfs_init(dev_priv); + if (err) + DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err); + return 0; err_priorities: @@ -4953,6 +4982,8 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv) /* And ensure that our DESTROY_BY_RCU slabs are truly destroyed */ rcu_barrier(); + + i915_gemfs_fini(dev_priv); } int i915_gem_freeze(struct drm_i915_private *dev_priv) diff --git 
a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c new file mode 100644 index ..168d0bd98f60 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_gemfs.c @@ -0,0 +1,52 @@ +/* + * Copyright © 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy o
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix pointer-to-int conversion (rev2)
== Series Details ==

Series: drm/i915: Fix pointer-to-int conversion (rev2)
URL   : https://patchwork.freedesktop.org/series/31488/
State : success

== Summary ==

Series 31488v2 drm/i915: Fix pointer-to-int conversion
https://patchwork.freedesktop.org/api/1.0/series/31488/revisions/2/mbox/

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:460s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:396s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:564s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:289s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:536s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:532s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:549s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:528s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:568s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:622s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:444s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:600s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:437s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:425s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:514s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:496s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:499s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:490s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:593s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:537s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:516s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:469s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:581s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:432s
fi-pnv-d510 failed to connect after reboot

7dacd1f2e70cb3202e2b153d76b05b601d099082 drm-tip: 2017y-10m-06d-12h-29m-28s UTC integration manifest
86ad6277e1ad drm/i915: Fix pointer-to-int conversion

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5932/
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice (rev2)
Hi,

> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rodrigo Vivi
> Sent: Friday, 6 October 2017 16:10
> To: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/cnl:
> WaDisableGatherAtSetShaderCommonSlice (rev2)
>
> On Fri, Oct 06, 2017 at 11:06:34AM +, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice (rev2)
> > URL   : https://patchwork.freedesktop.org/series/31457/
> > State : warning
> >
> > == Summary ==
> >
> > Series 31457v2 drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice
> > https://patchwork.freedesktop.org/api/1.0/series/31457/revisions/2/mbox/
> >
> > Test gem_exec_suspend:
> >         Subgroup basic-s3:
> >                 pass       -> DMESG-WARN (fi-cfl-s) fdo#103026
> >         Subgroup basic-s4-devices:
> >                 pass       -> DMESG-WARN (fi-kbl-7500u)
>
> I believe this is a false positive.
> This patch only changes CNL, not KBL.

[ 254.679399] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[ 254.679428] [drm:intel_dp_aux_ch [i915]] *ERROR* dp_aux_ch not done status 0xac1003ff

So is this something new, or already known?

Jani Saarinen
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
Re: [Intel-gfx] [PATCH 1/2] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock
On 06/10/2017 15:23, Daniel Vetter wrote:
> On Fri, Oct 06, 2017 at 12:34:02PM +0100, Tvrtko Ursulin wrote:
> > On 06/10/2017 10:06, Daniel Vetter wrote:
> > > 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> > > seems to have uncovered a few more rules about what is allowed and
> > > isn't.
> > >
> > > This one here seems to indicate that allocating a work-queue while
> > > holding mmap_sem is a no-go, so let's try to preallocate it.
> > >
> > > Of course another way to break this chain would be somewhere in the
> > > cpu hotplug code, since this isn't the only trace we're finding now
> > > which goes through msr_create_device.
> > >
> > > Full lockdep splat:

[snipped lockdep splat]

> > > v2: Set ret correctly when we raced with another thread.
> > >
> > > v3: Use Chris' diff. Attach the right lockdep splat.
> > >
> > > Cc: Chris Wilson
> > > Cc: Tvrtko Ursulin
> > > Cc: Joonas Lahtinen
> > > Cc: Peter Zijlstra
> > > Cc: Thomas Gleixner
> > > Cc: Sasha Levin
> > > Cc: Marta Lofstedt
> > > Cc: Tejun Heo
> > > References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939
> > > Signed-off-by: Daniel Vetter
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem_userptr.c | 35 +++--
> > >  1 file changed, 20 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > index 2d4996de7331..f9b3406401af 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > @@ -164,7 +164,6 @@ static struct i915_mmu_notifier *
> > >  i915_mmu_notifier_create(struct mm_struct *mm)
> > >  {
> > >  	struct i915_mmu_notifier *mn;
> > > -	int ret;
> > >
> > >  	mn = kmalloc(sizeof(*mn), GFP_KERNEL);
> > >  	if (mn == NULL)
> > > @@ -179,14 +178,6 @@ i915_mmu_notifier_create(struct mm_struct *mm)
> > >  		return ERR_PTR(-ENOMEM);
> > >  	}
> > >
> > > -	/* Protected by mmap_sem (write-lock) */
> > > -	ret = __mmu_notifier_register(&mn->mn, mm);
> > > -	if (ret) {
> > > -		destroy_workqueue(mn->wq);
> > > -		kfree(mn);
> > > -		return ERR_PTR(ret);
> > > -	}
> > > -
> > >  	return mn;
> > >  }
> > >
> > > @@ -210,23 +201,37 @@ i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
> > >  static struct i915_mmu_notifier *
> > >  i915_mmu_notifier_find(struct i915_mm_struct *mm)
> > >  {
> > > -	struct i915_mmu_notifier *mn = mm->mn;
> > > +	struct i915_mmu_notifier *mn;
> > > +	int err;
> > >
> > >  	mn = mm->mn;
> > >  	if (mn)
> > >  		return mn;
> > >
> > > +	mn = i915_mmu_notifier_create(mm->mm);
> > > +	if (IS_ERR(mn))
> > > +		return mn;
> >
> > Strictly speaking we don't want to fail just yet, only if we actually
> > needed a new notifier and we failed to create it.
>
> The check 2 lines above not good enough? It's somewhat racy, but I'm not
> sure what value we provide by being perfectly correct against low memory.
> This thread racing against a 2nd one, where the minimal allocation of the
> 2nd one pushed us perfectly over the oom threshold seems a very unlikely
> scenario.
>
> Also, small allocations actually never fail :-)

Yes, but, we otherwise make each other re-spin for much smaller things
than bailout logic being conceptually at the wrong place. So for me I'd
like a respin. It's not complicated at all, just move the bailout to
before the __mmu_notifier_register:

...
err = 0;
if (IS_ERR(mn))
	err = PTR_ERR(..);
...
if (mana->manah == NULL) { /* ;-D */
	/* Protect by mmap_sem...
	if (err == 0) {
		err = __mmu_notifier_register(..);
		...
	}
}
...
if (mn && !IS_ERR(mn)) {
	...free...
}

I think.. ?

R-b on this, plus below, unless I got something wrong.

> > > +
> > > +	err = 0;
> > >  	down_write(&mm->mm->mmap_sem);
> > >  	mutex_lock(&mm->i915->mm_lock);
> > > -	if ((mn = mm->mn) == NULL) {
> > > -		mn = i915_mmu_notifier_create(mm->mm);
> > > -		if (!IS_ERR(mn))
> > > -			mm->mn = mn;
> > > +	if (mm->mn == NULL) {
> > > +		/* Protected by mmap_sem (write-lock) */
> > > +		err = __mmu_notifier_register(&mn->mn, mm->mm);
> > > +		if (!err) {
> > > +			/* Protected by mm_lock */
> > > +			mm->mn = fetch_and_zero(&mn);
> > > +		}
> > >  	}
> > >  	mutex_unlock(&mm->i915->mm_lock);
> > >  	up_write(&mm->mm->mmap_sem);
> > >
> > > -	return mn;
> > > +	if (mn) {
> > > +		destroy_workqueue(mn->wq);
> > > +		kfree(mn);
> > > +	}
> > > +
> > > +	return err ? ERR_PTR(err) : mm->mn;
> > >  }
> > >
> > >  static int
> >
> > Otherwise looks good to me.
> >
> > I would also put a note in the commit on how working around the locking
> > issue is also beneficial to performance with moving the allocation step
> > outside the mmap_sem.
>
> Yeah Chris brought that up too, I don't really buy it given how
> heavy-weight __mmu_notifier_register is. But I can add something like:
>
> "This also has the minor benefit of slightly reducing the critical
> section where we hold mmap_sem."
>
> r-b with that added to the commit message?

I think for me it is
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/huc: Fix includes in intel_huc.c
== Series Details ==

Series: drm/i915/huc: Fix includes in intel_huc.c
URL   : https://patchwork.freedesktop.org/series/31475/
State : success

== Summary ==

Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                fail       -> PASS       (shard-hsw) fdo#102723

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1328 dwarn:6   dfail:0   fail:9   skip:1103 time:10148s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5925/shards.html
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone
== Series Details ==

Series: series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone
URL   : https://patchwork.freedesktop.org/series/31489/
State : success

== Summary ==

Series 31489v1 series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone
https://patchwork.freedesktop.org/api/1.0/series/31489/revisions/1/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-a-frame-sequence:
                pass       -> INCOMPLETE (fi-elk-e7500) fdo#102364

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102364 https://bugs.freedesktop.org/show_bug.cgi?id=102364

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:457s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:474s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:395s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:572s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:289s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:527s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:530s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:542s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:534s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:559s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:626s
fi-elk-e7500     total:234  pass:185  dwarn:0   dfail:0   fail:0   skip:48
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:602s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:441s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:418s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:510s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:475s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:499s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:490s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:600s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:659s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:471s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:545s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:517s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:472s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:588s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:435s

7dacd1f2e70cb3202e2b153d76b05b601d099082 drm-tip: 2017y-10m-06d-12h-29m-28s UTC integration manifest
7b94f33d5162 drm/i915/selftests: Pretty print engine state when requests fail to start
abe1c7b19674 drm/i915: Make i915_engine_info pretty printer to standalone

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5931/
Re: [Intel-gfx] [PATCH 1/2] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock
On Fri, Oct 06, 2017 at 12:34:02PM +0100, Tvrtko Ursulin wrote:
>
> On 06/10/2017 10:06, Daniel Vetter wrote:
> > 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> > seems to have uncovered a few more rules about what is allowed and
> > isn't.
> >
> > This one here seems to indicate that allocating a work-queue while
> > holding mmap_sem is a no-go, so let's try to preallocate it.
> >
> > Of course another way to break this chain would be somewhere in the
> > cpu hotplug code, since this isn't the only trace we're finding now
> > which goes through msr_create_device.
> >
> > Full lockdep splat:
> >
> > ==
> > WARNING: possible circular locking dependency detected
> > 4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U
> > --
> > prime_mmap/1551 is trying to acquire lock:
> >  (cpu_hotplug_lock.rw_sem){}, at: [] apply_workqueue_attrs+0x17/0x50
> >
> > but task is already holding lock:
> >  (&dev_priv->mm_lock){+.+.}, at: [] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #6 (&dev_priv->mm_lock){+.+.}:
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        __mutex_lock+0x86/0x9b0
> >        mutex_lock_nested+0x1b/0x20
> >        i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
> >        i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
> >        drm_ioctl_kernel+0x69/0xb0
> >        drm_ioctl+0x2f9/0x3d0
> >        do_vfs_ioctl+0x94/0x670
> >        SyS_ioctl+0x41/0x70
> >        entry_SYSCALL_64_fastpath+0x1c/0xb1
> >
> > -> #5 (&mm->mmap_sem){}:
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        __might_fault+0x68/0x90
> >        _copy_to_user+0x23/0x70
> >        filldir+0xa5/0x120
> >        dcache_readdir+0xf9/0x170
> >        iterate_dir+0x69/0x1a0
> >        SyS_getdents+0xa5/0x140
> >        entry_SYSCALL_64_fastpath+0x1c/0xb1
> >
> > -> #4 (&sb->s_type->i_mutex_key#5){}:
> >        down_write+0x3b/0x70
> >        handle_create+0xcb/0x1e0
> >        devtmpfsd+0x139/0x180
> >        kthread+0x152/0x190
> >        ret_from_fork+0x27/0x40
> >
> > -> #3 ((complete)&req.done){+.+.}:
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        wait_for_common+0x58/0x210
> >        wait_for_completion+0x1d/0x20
> >        devtmpfs_create_node+0x13d/0x160
> >        device_add+0x5eb/0x620
> >        device_create_groups_vargs+0xe0/0xf0
> >        device_create+0x3a/0x40
> >        msr_device_create+0x2b/0x40
> >        cpuhp_invoke_callback+0xa3/0x840
> >        cpuhp_thread_fun+0x7a/0x150
> >        smpboot_thread_fn+0x18a/0x280
> >        kthread+0x152/0x190
> >        ret_from_fork+0x27/0x40
> >
> > -> #2 (cpuhp_state){+.+.}:
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        cpuhp_issue_call+0x10b/0x170
> >        __cpuhp_setup_state_cpuslocked+0x134/0x2a0
> >        __cpuhp_setup_state+0x46/0x60
> >        page_writeback_init+0x43/0x67
> >        pagecache_init+0x3d/0x42
> >        start_kernel+0x3a8/0x3fc
> >        x86_64_start_reservations+0x2a/0x2c
> >        x86_64_start_kernel+0x6d/0x70
> >        verify_cpu+0x0/0xfb
> >
> > -> #1 (cpuhp_state_mutex){+.+.}:
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        __mutex_lock+0x86/0x9b0
> >        mutex_lock_nested+0x1b/0x20
> >        __cpuhp_setup_state_cpuslocked+0x52/0x2a0
> >        __cpuhp_setup_state+0x46/0x60
> >        page_alloc_init+0x28/0x30
> >        start_kernel+0x145/0x3fc
> >        x86_64_start_reservations+0x2a/0x2c
> >        x86_64_start_kernel+0x6d/0x70
> >        verify_cpu+0x0/0xfb
> >
> > -> #0 (cpu_hotplug_lock.rw_sem){}:
> >        check_prev_add+0x430/0x840
> >        __lock_acquire+0x1420/0x15e0
> >        lock_acquire+0xb0/0x200
> >        cpus_read_lock+0x3d/0xb0
> >        apply_workqueue_attrs+0x17/0x50
> >        __alloc_workqueue_key+0x1d8/0x4d9
> >        i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
> >        i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
> >        drm_ioctl_kernel+0x69/0xb0
> >        drm_ioctl+0x2f9/0x3d0
> >        do_vfs_ioctl+0x94/0x670
> >        SyS_ioctl+0x41/0x70
> >        entry_SYSCALL_64_fastpath+0x1c/0xb1
> >
> > other info that might help us debug this:
> >
> > Chain exists of:
> >   cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock
> >
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >
> >   lock(&dev_priv->mm_lock);
> >                                lock(&mm->mmap_sem);
> >                                lock(&dev_priv->mm_lock);
> >   lock(cpu_hotplug_lock.rw_