[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
URL   : https://patchwork.freedesktop.org/series/31525/
State : failure

== Summary ==

Test kms_properties:
Subgroup crtc-properties-legacy:
pass   -> DMESG-WARN (shard-hsw)
Test kms_atomic_transition:
Subgroup 4x-modeset-transitions-fencing:
skip   -> INCOMPLETE (shard-hsw)
Test kms_cursor_legacy:
Subgroup cursorA-vs-flipA-atomic-transitions:
fail   -> PASS   (shard-hsw) fdo#102723
Test kms_plane_multiple:
Subgroup legacy-pipe-C-tiling-none:
pass   -> SKIP   (shard-hsw)

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1292 dwarn:7   dfail:0   fail:8   skip:1090 time:9959s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5938/shards.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.IGT: warning for benchmark/gem_busy: Compare polling with syncobj_wait

2017-10-06 Thread Patchwork
== Series Details ==

Series: benchmark/gem_busy: Compare polling with syncobj_wait
URL   : https://patchwork.freedesktop.org/series/31507/
State : warning

== Summary ==

Test kms_setmode:
Subgroup basic:
pass   -> FAIL   (shard-hsw) fdo#99912
Test gem_eio:
Subgroup in-flight-external:
pass   -> DMESG-WARN (shard-hsw) fdo#102886 +1
Test kms_cursor_legacy:
Subgroup cursorA-vs-flipA-atomic-transitions:
fail   -> PASS   (shard-hsw) fdo#102723
Test gem_render_tiled_blits:
Subgroup basic:
pass   -> DMESG-WARN (shard-hsw)

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1327 dwarn:7   dfail:0   fail:9   skip:1103 time:10028s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_306/shards.html


[Intel-gfx] ✓ Fi.CI.IGT: success for igt/gem_eio: Check hang/eio recovery during suspend

2017-10-06 Thread Patchwork
== Series Details ==

Series: igt/gem_eio: Check hang/eio recovery during suspend
URL   : https://patchwork.freedesktop.org/series/31485/
State : success

== Summary ==

Test gem_eio:
Subgroup in-flight-contexts:
dmesg-warn -> PASS   (shard-hsw) fdo#102886 +1
Test kms_cursor_legacy:
Subgroup cursorA-vs-flipA-atomic-transitions:
fail   -> PASS   (shard-hsw) fdo#102723

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2447 pass:1329 dwarn:7   dfail:0   fail:8   skip:1103 time:10232s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_305/shards.html


[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Cancel the hotplug work when unregistering the connector (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector (rev2)
URL   : https://patchwork.freedesktop.org/series/31501/
State : success

== Summary ==

Test kms_cursor_legacy:
Subgroup cursorA-vs-flipA-atomic-transitions:
fail   -> PASS   (shard-hsw) fdo#102723

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10131s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5937/shards.html


[Intel-gfx] ✓ Fi.CI.IGT: success for igt/gem_memfd: Exercise hugepages and memfd

2017-10-06 Thread Patchwork
== Series Details ==

Series: igt/gem_memfd: Exercise hugepages and memfd
URL   : https://patchwork.freedesktop.org/series/31460/
State : success

== Summary ==

Test gem_eio:
Subgroup in-flight:
dmesg-warn -> PASS   (shard-hsw) fdo#102886 +2
Test kms_setmode:
Subgroup basic:
fail   -> PASS   (shard-hsw) fdo#99912
Test prime_self_import:
Subgroup reimport-vs-gem_close-race:
pass   -> FAIL   (shard-hsw) fdo#102655

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#102655 https://bugs.freedesktop.org/show_bug.cgi?id=102655

shard-hsw        total:2400 pass:1303 dwarn:5   dfail:0   fail:9   skip:1083 time:9973s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_303/shards.html


[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Cancel the hotplug work when unregistering the connector

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector
URL   : https://patchwork.freedesktop.org/series/31501/
State : warning

== Summary ==

Test drv_module_reload:
Subgroup basic-reload-inject:
pass   -> DMESG-WARN (shard-hsw) fdo#102707 +2
Test gem_eio:
Subgroup wait:
dmesg-warn -> PASS   (shard-hsw) fdo#102886
Test kms_cursor_legacy:
Subgroup basic-flip-before-cursor-atomic:
pass   -> SKIP   (shard-hsw)
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-primscrn-shrfb-plflip-blt:
pass   -> SKIP   (shard-hsw)
Test kms_rotation_crc:
Subgroup sprite-rotation-180:
pass   -> SKIP   (shard-hsw)
Test pm_rpm:
Subgroup gem-mmap-cpu:
pass   -> SKIP   (shard-hsw)
Test kms_draw_crc:
Subgroup draw-method-xrgb-render-xtiled:
pass   -> SKIP   (shard-hsw)
Test kms_chv_cursor_fail:
Subgroup pipe-A-64x64-left-edge:
pass   -> SKIP   (shard-hsw)
Test kms_setmode:
Subgroup basic:
fail   -> PASS   (shard-hsw) fdo#99912
Test kms_atomic_transition:
Subgroup plane-all-modeset-transition:
pass   -> DMESG-WARN (shard-hsw)

fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707
fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2446 pass:1320 dwarn:9   dfail:0   fail:8   skip:1109 time:10039s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5936/shards.html


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
URL   : https://patchwork.freedesktop.org/series/31525/
State : success

== Summary ==

Series 31525v1 series starting with [CI,01/21] mm/shmem: introduce shmem_file_setup_with_mnt
https://patchwork.freedesktop.org/api/1.0/series/31525/revisions/1/mbox/

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u   total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:454s
fi-bdw-gvtdvm  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:483s
fi-blb-e6850   total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:391s
fi-bsw-n3050   total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:577s
fi-bwr-2160    total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:285s
fi-bxt-dsi     total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:523s
fi-bxt-j4205   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:522s
fi-byt-j1900   total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:534s
fi-byt-n2820   total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:510s
fi-cfl-s       total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:565s
fi-cnl-y       total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:615s
fi-elk-e7500   total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:436s
fi-glk-1       total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:594s
fi-hsw-4770    total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:440s
fi-hsw-4770r   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:499s
fi-ivb-3770    total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:474s
fi-kbl-7500u   total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:500s
fi-kbl-7560u   total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:578s
fi-kbl-7567u   total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:486s
fi-kbl-r       total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:591s
fi-pnv-d510    total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:663s
fi-skl-6260u   total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:476s
fi-skl-6700hq  total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:662s
fi-skl-6700k   total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:530s
fi-skl-6770hq  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:510s
fi-skl-gvtdvm  total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:468s
fi-snb-2520m   total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:585s
fi-snb-2600    total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:432s

aaf31e875e72b50f6a970c11f797b7f5b61a2681 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest
b93aebea22b8 drm/i915: enable platform support for 2M pages
4a0c64a37ae2 drm/i915: enable platform support for 64K pages
0af112e2f579 drm/i915: disable platform support for vGPU huge gtt pages
f0a1be1a3dd7 drm/i915/selftests: mix huge pages
00a7d5db2c00 drm/i915/selftests: huge page tests
93c70ed742be drm/i915/debugfs: include some gtt page size metrics
1f0a8b9c5966 drm/i915: accurate page size tracking for the ppgtt
1654205c7949 drm/i915: support 64K pages for the 48b PPGTT
111e38def2cc drm/i915: add support for 64K scratch page
c61736868242 drm/i915: support 2M pages for the 48b PPGTT
b18b3f752993 drm/i915: disable GTT cache for 2M pages
e956c4176e78 drm/i915: enable IPS bit for 64K pages
e86cd1858ba7 drm/i915: align 64K objects to 2M
a6190ddbeaa0 drm/i915: align the vma start to the largest gtt page size
0ad35eb31cee drm/i915: introduce vm set_pages/clear_pages
e9108b31f52e drm/i915: introduce page_size members
24b5f6444521 drm/i915: push set_pages down to the callers
0fe2db9775b3 drm/i915: introduce page_sizes field to dev_info
2a1dc2a89a9b drm/i915/gemfs: enable THP
2ff188382116 drm/i915: introduce simple gemfs
ff8befdbed20 mm/shmem: introduce shmem_file_setup_with_mnt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5938/


Re: [Intel-gfx] [PATCH] drm/i915/execlists: Add a comment for the extra MI_ARB_ENABLE

2017-10-06 Thread Chris Wilson
Quoting Michel Thierry (2017-10-05 20:41:40)
> On 10/5/2017 12:10 PM, Chris Wilson wrote:
> > Michel Thierry noticed that we were applying WaDisableCtxRestoreArbitration
> > even to gen9, which does not require the w/a. The rationale is that we
> > need to enable MI arbitration for execlists to work, and to be safe we
> > do that before every batch (in addition to every context switch into the
> > batch). Since this is not clear from the single line comment suggesting
> > the MI_ARB_ENABLE is solely for the w/a, add a little more detail.
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Michel Thierry 
> > Cc: Joonas Lahtinen 
> > Cc: Michał Winiarski 
> 
> It can't be clearer. Thanks!
> 
> Reviewed-by: Michel Thierry 

Thanks for asking, and checking what I wrote made sense!
-Chris


[Intel-gfx] ✗ Fi.CI.IGT: warning for series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)
URL   : https://patchwork.freedesktop.org/series/31476/
State : warning

== Summary ==

Test gem_eio:
Subgroup in-flight-contexts:
dmesg-warn -> PASS   (shard-hsw) fdo#102886 +4
Test kms_cursor_crc:
Subgroup cursor-64x64-sliding:
pass   -> DMESG-WARN (shard-hsw)
Test prime_mmap:
Subgroup test_userptr:
dmesg-warn -> PASS   (shard-hsw) fdo#102939

fdo#102886 https://bugs.freedesktop.org/show_bug.cgi?id=102886
fdo#102939 https://bugs.freedesktop.org/show_bug.cgi?id=102939

shard-hsw        total:2446 pass:1333 dwarn:1   dfail:0   fail:9   skip:1103 time:10142s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5935/shards.html


[Intel-gfx] [CI 05/21] drm/i915: push set_pages down to the callers

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Each backend is now responsible for calling __i915_gem_object_set_pages
upon successfully gathering its backing storage. This eliminates the
inconsistency between the async and sync paths, which stands out even
more when we start throwing around an sg_mask in a later patch.

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-6-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
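As an illustration of the calling convention described in the commit message, here is a minimal userspace sketch, assuming simplified stand-in types: the backend returns 0 or a negative errno and installs the backing storage itself. The `*_stub` types and the `object_set_pages()`, `backend_get_pages()` and `object_get_pages()` names are hypothetical and only mirror the shape of the change; they are not the i915 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver types; not the real i915 structures. */
struct sg_table_stub {
	size_t nents;
};

struct gem_object_stub {
	size_t size;                 /* object size in bytes */
	struct sg_table_stub *pages; /* backing storage, NULL until set */
};

/* Plays the role of __i915_gem_object_set_pages(): publish the backing store. */
static void object_set_pages(struct gem_object_stub *obj,
			     struct sg_table_stub *st)
{
	assert(obj->pages == NULL); /* pages are installed only once */
	obj->pages = st;
}

/*
 * A backend in the new style: it returns 0 or a negative errno and calls
 * object_set_pages() itself once the backing storage has been gathered.
 */
static int backend_get_pages(struct gem_object_stub *obj)
{
	struct sg_table_stub *st;

	if (obj->size == 0)
		return -EINVAL;

	st = malloc(sizeof(*st));
	if (!st)
		return -ENOMEM;

	st->nents = 1;
	object_set_pages(obj, st);

	return 0;
}

/* The caller only checks the error code; it never handles the table itself. */
static int object_get_pages(struct gem_object_stub *obj)
{
	if (obj->pages)
		return 0; /* already populated */

	return backend_get_pages(obj);
}
```

With this shape, a backend that finishes its work later (the async case mentioned above) could call the same set-pages helper at completion time, so both paths publish pages in one place.
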
 drivers/gpu/drm/i915/i915_gem.c  | 45 +---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   | 15 +---
 drivers/gpu/drm/i915/i915_gem_internal.c | 15 
 drivers/gpu/drm/i915/i915_gem_object.h   |  2 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c   | 16 ++---
 drivers/gpu/drm/i915/i915_gem_userptr.c  | 12 +++
 drivers/gpu/drm/i915/selftests/huge_gem_object.c | 14 
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 12 ---
 8 files changed, 77 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1da1f52d12cc..42f2ca1e136b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -162,8 +162,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
return 0;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 {
struct address_space *mapping = obj->base.filp->f_mapping;
drm_dma_handle_t *phys;
@@ -171,9 +170,10 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
struct scatterlist *sg;
char *vaddr;
int i;
+   int err;
 
if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
-   return ERR_PTR(-EINVAL);
+   return -EINVAL;
 
/* Always aligning to the object size, allows a single allocation
 * to handle all possible callers, and given typical object sizes,
@@ -183,7 +183,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 roundup_pow_of_two(obj->base.size),
 roundup_pow_of_two(obj->base.size));
if (!phys)
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
 
vaddr = phys->vaddr;
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
@@ -192,7 +192,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
page = shmem_read_mapping_page(mapping, i);
if (IS_ERR(page)) {
-   st = ERR_CAST(page);
+   err = PTR_ERR(page);
goto err_phys;
}
 
@@ -209,13 +209,13 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (!st) {
-   st = ERR_PTR(-ENOMEM);
+   err = -ENOMEM;
goto err_phys;
}
 
if (sg_alloc_table(st, 1, GFP_KERNEL)) {
kfree(st);
-   st = ERR_PTR(-ENOMEM);
+   err = -ENOMEM;
goto err_phys;
}
 
@@ -227,11 +227,15 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
sg_dma_len(sg) = obj->base.size;
 
obj->phys_handle = phys;
-   return st;
+
+   __i915_gem_object_set_pages(obj, st);
+
+   return 0;
 
 err_phys:
drm_pci_free(obj->base.dev, phys);
-   return st;
+
+   return err;
 }
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
@@ -2292,8 +2296,7 @@ static bool i915_sg_trim(struct sg_table *orig_st)
return true;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
const unsigned long page_count = obj->base.size / PAGE_SIZE;
@@ -2317,12 +2320,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
 
 rebuild_st:
if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
kfree(st);
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
}
 
/* Get the list of pages out of our struct file.  They'll be pinned
@@ -2430,7 +2433,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
if (i915_gem_object_needs_bit17_swizzle(obj))
i915_gem_object_do_bit_17_swizzle(obj, st);
 
-   return st;
+   __i915_gem_object_set_pages(obj, st);
+
+   return 0;
 
 err_sg:
sg_

[Intel-gfx] [CI 09/21] drm/i915: align 64K objects to 2M

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

We can't mix 64K and 4K pte's in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful but in reality shouldn't be too bad since this only
applies to the virtual address space of a 48b PPGTT.

v2: don't separate logically connected ops

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
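To make the arithmetic in this patch concrete, here is a small userspace sketch of the two adjustments it describes: raising the alignment to at least 2M and padding the size of 64K-backed objects to whole 2M blocks. The constant and helper names (`PAGE_SIZE_*`, `huge_alignment()`, `huge_size()`) are hypothetical, and the sketch leaves out the surrounding checks on the address range that the driver performs first.

```c
#include <assert.h>
#include <stdint.h>

/* Page sizes as discussed in the patch; the names here are hypothetical. */
#define PAGE_SIZE_4K  (1ull << 12)
#define PAGE_SIZE_64K (1ull << 16)
#define PAGE_SIZE_2M  (1ull << 21)

/* Largest power of two that is <= n (n must be non-zero). */
static uint64_t rounddown_pow_of_two_u64(uint64_t n)
{
	uint64_t p = 1;

	assert(n != 0);
	while (n >>= 1)
		p <<= 1;

	return p;
}

/* Round x up to a multiple of a, where a is a power of two. */
static uint64_t round_up_u64(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Alignment for a vma whose sg mask contains a huge page size. */
static uint64_t huge_alignment(uint64_t alignment, uint64_t sg_mask)
{
	uint64_t page_alignment =
		rounddown_pow_of_two_u64(sg_mask | PAGE_SIZE_2M);

	return alignment > page_alignment ? alignment : page_alignment;
}

/* Size to reserve: 64K-backed objects are padded to whole 2M blocks. */
static uint64_t huge_size(uint64_t size, uint64_t sg_mask)
{
	if (sg_mask & PAGE_SIZE_64K)
		size = round_up_u64(size, PAGE_SIZE_2M);

	return size;
}
```

For example, a 64K object with `sg_mask == PAGE_SIZE_64K` ends up with a 2M alignment and a 2M reservation in this model, which is the address-space cost the commit message calls potentially wasteful.
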
 drivers/gpu/drm/i915/i915_vma.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5d4164406b63..72e86b32ab41 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -503,10 +503,20 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 */
if (upper_32_bits(end - 1) &&
vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   /*
+* We can't mix 64K and 4K PTEs in the same page-table
+* (2M block), and so to avoid the ugliness and
+* complexity of coloring we opt for just aligning 64K
+* objects to 2M.
+*/
u64 page_alignment =
-   rounddown_pow_of_two(vma->page_sizes.sg);
+   rounddown_pow_of_two(vma->page_sizes.sg |
+I915_GTT_PAGE_SIZE_2M);
 
alignment = max(alignment, page_alignment);
+
+   if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
+   size = round_up(size, I915_GTT_PAGE_SIZE_2M);
}
 
ret = i915_gem_gtt_insert(vma->vm, &vma->node,
-- 
2.14.2



[Intel-gfx] [CI 15/21] drm/i915: accurate page size tracking for the ppgtt

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-16-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
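The tracking added here amounts to OR-ing the page size of every inserted chunk into a per-vma bitmask. The following userspace sketch shows that bookkeeping in isolation; `struct page_sizes_stub` and `track_gtt_usage()` are hypothetical names, and clearing the mask before a rebind is assumed to be the caller's job.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_4K  (1u << 12)
#define PAGE_SIZE_64K (1u << 16)
#define PAGE_SIZE_2M  (1u << 21)

/* Hypothetical mirror of the per-vma page size bookkeeping. */
struct page_sizes_stub {
	unsigned int sg;  /* page sizes the backing store could support */
	unsigned int gtt; /* page sizes actually written into the page tables */
};

/*
 * Record the page size chosen for each inserted chunk. The result is a
 * bitmask, so a mixed mapping shows up as several bits set at once.
 */
static void track_gtt_usage(struct page_sizes_stub *ps,
			    const unsigned int *chunks, size_t count)
{
	size_t i;

	assert(ps != NULL);
	for (i = 0; i < count; i++)
		ps->gtt |= chunks[i];
}
```

Comparing `gtt` against `sg` after binding then shows whether the page tables used the larger sizes that the backing store made possible.
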
 drivers/gpu/drm/i915/i915_gem_gtt.c| 11 +++
 drivers/gpu/drm/i915/i915_gem_object.h | 10 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 118aad90468f..4c605785e2b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1053,6 +1053,8 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 
gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
  cache_level);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
@@ -1145,7 +1147,10 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
vaddr = kmap_atomic_px(pd);
vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
kunmap_atomic(vaddr);
+   page_size = I915_GTT_PAGE_SIZE_64K;
}
+
+   vma->page_sizes.gtt |= page_size;
} while (iter->sg);
 }
 
@@ -1170,6 +1175,8 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
 &iter, &idx, cache_level))
GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
}
 }
 
@@ -1891,6 +1898,8 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
}
} while (1);
kunmap_atomic(vaddr);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -2598,6 +2607,8 @@ static int ggtt_bind_vma(struct i915_vma *vma,
vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
intel_runtime_pm_put(i915);
 
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+
/*
 * Without aliasing PPGTT there's no difference between
 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 110672952a1c..e4e6dd93889d 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -169,6 +169,7 @@ struct drm_i915_gem_object {
struct sg_table *pages;
void *mapping;
 
+   /* TODO: whack some of this into the error state */
struct i915_page_sizes {
/**
 * The sg mask of the pages sg_table. i.e the mask of
@@ -184,6 +185,15 @@ struct drm_i915_gem_object {
 * to use opportunistically.
 */
unsigned int sg;
+
+   /**
+* The actual gtt page size usage. Since we can have
+* multiple vma associated with this object we need to
+* prevent any trampling of state, hence a copy of this
+* struct also lives in each vma, therefore the gtt
+* value here should only be read/write through the vma.
+*/
+   unsigned int gtt;
} page_sizes;
 
struct i915_gem_object_page_iter {
-- 
2.14.2



[Intel-gfx] [CI 03/21] drm/i915/gemfs: enable THP

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Enable transparent-huge-pages through gemfs by mounting with
huge=within_size.

v2: sprinkle within_size comment

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-4-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
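The over-allocation noted in the code comment of this patch (an object of 2M + 4K may be backed by 2M + 2M) can be modelled with a simple worst-case estimate. This is only a sketch based on that comment: `worst_case_backing()` is a hypothetical helper, the huge page size is passed in rather than taken from `HPAGE_PMD_SIZE`, and real shmem behaviour also depends on THP availability and memory pressure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case bytes of backing store for an object of the given size when
 * huge pages of hpage bytes may be used: objects smaller than one huge
 * page are left alone, larger ones may be rounded up to whole huge pages.
 */
static uint64_t worst_case_backing(uint64_t size, uint64_t hpage)
{
	/* The mask trick below requires hpage to be a power of two. */
	assert(hpage != 0 && (hpage & (hpage - 1)) == 0);

	if (size < hpage)
		return size;

	return (size + hpage - 1) & ~(hpage - 1);
}
```

With `hpage` set to 2M, an object of 2M + 4K yields 4M in this model, while a 4K object stays at 4K.
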
 drivers/gpu/drm/i915/i915_gemfs.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
index 168d0bd98f60..e2993857df37 100644
--- a/drivers/gpu/drm/i915/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -24,6 +24,7 @@
 
 #include 
 #include 
+#include 
 
 #include "i915_drv.h"
 #include "i915_gemfs.h"
@@ -41,6 +42,27 @@ int i915_gemfs_init(struct drm_i915_private *i915)
if (IS_ERR(gemfs))
return PTR_ERR(gemfs);
 
+   /*
+* Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
+* likely 2M. Note that within_size may overallocate huge-pages, if say
+* we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
+* memory pressure shmem should split any huge-pages which can be
+* shrunk.
+*/
+
+   if (has_transparent_hugepage()) {
+   struct super_block *sb = gemfs->mnt_sb;
+   char options[] = "huge=within_size";
+   int flags = 0;
+   int err;
+
+   err = sb->s_op->remount_fs(sb, &flags, options);
+   if (err) {
+   kern_unmount(gemfs);
+   return err;
+   }
+   }
+
i915->mm.gemfs = gemfs;
 
return 0;
-- 
2.14.2



[Intel-gfx] [CI 17/21] drm/i915/selftests: huge page tests

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

v2: mock test page support configurations and add MI_STORE_DWORD test

v3: run all mockable huge page tests on all platforms via the mock_device

v4: add pin_update regression test
various improvements suggested by Chris

v5: fix issues reported by kbuild
test single sg spanning multiple page sizes
don't explode when running the live-tests through the appgtt

v6: lots of improvements from Chris

v7: run on each engine for igt_write_huge
add simple tmpfs fallback test

v8: size_t is bad
don't break the i386 build

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
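The get_largest_page_size() helper in the diff picks the biggest supported page size that still fits in the remaining length. The userspace sketch below restates that selection and, purely as an illustration, uses it to split a size greedily; `supported` is a plain bitmask standing in for the HAS_PAGE_SIZES() check, and `split_page_mask()` is a hypothetical helper that is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K  (1u << 12)
#define PAGE_SIZE_64K (1u << 16)
#define PAGE_SIZE_2M  (1u << 21)

/* Candidate page sizes, largest first, as in the selftest. */
static const unsigned int page_sizes[] = {
	PAGE_SIZE_2M,
	PAGE_SIZE_64K,
	PAGE_SIZE_4K,
};

/* Largest supported page size that fits in rem bytes, or 0 if none does. */
static unsigned int largest_page_size(unsigned int supported, uint64_t rem)
{
	size_t i;

	for (i = 0; i < sizeof(page_sizes) / sizeof(page_sizes[0]); ++i) {
		unsigned int page_size = page_sizes[i];

		if ((supported & page_size) && rem >= page_size)
			return page_size;
	}

	return 0;
}

/* Hypothetical: split size greedily and return the mask of sizes used. */
static unsigned int split_page_mask(unsigned int supported, uint64_t size)
{
	unsigned int mask = 0;

	while (size) {
		unsigned int page_size = largest_page_size(supported, size);

		if (!page_size)
			break; /* remainder is smaller than any supported page */

		assert(page_size <= size);
		mask |= page_size;
		size -= page_size;
	}

	return mask;
}
```

Splitting 2M + 64K + 4K with all three sizes supported uses each size once in this sketch, whereas supporting only 4K reduces the whole range to 4K pages.
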
 drivers/gpu/drm/i915/i915_gem.c|1 +
 drivers/gpu/drm/i915/i915_gem_object.h |2 +
 drivers/gpu/drm/i915/selftests/huge_pages.c| 1715 
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |1 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h   |1 +
 5 files changed, 1720 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 34398696824c..f8c3ac1c8c67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5412,6 +5412,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 #include "selftests/scatterlist.c"
 #include "selftests/mock_gem_device.c"
 #include "selftests/huge_gem_object.c"
+#include "selftests/huge_pages.c"
 #include "selftests/i915_gem_object.c"
 #include "selftests/i915_gem_coherency.c"
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index e4e6dd93889d..956c911c2cbf 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -196,6 +196,8 @@ struct drm_i915_gem_object {
unsigned int gtt;
} page_sizes;
 
+   I915_SELFTEST_DECLARE(unsigned int page_mask);
+
struct i915_gem_object_page_iter {
struct scatterlist *sg_pos;
unsigned int sg_idx; /* in pages, but 32bit eek! */
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
new file mode 100644
index ..b8495882e5b0
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -0,0 +1,1715 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "../i915_selftest.h"
+
+#include 
+
+#include "mock_drm.h"
+
+static const unsigned int page_sizes[] = {
+   I915_GTT_PAGE_SIZE_2M,
+   I915_GTT_PAGE_SIZE_64K,
+   I915_GTT_PAGE_SIZE_4K,
+};
+
+static unsigned int get_largest_page_size(struct drm_i915_private *i915,
+ u64 rem)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+   unsigned int page_size = page_sizes[i];
+
+   if (HAS_PAGE_SIZES(i915, page_size) && rem >= page_size)
+   return page_size;
+   }
+
+   return 0;
+}
+
+static void huge_pages_free_pages(struct sg_table *st)
+{
+   struct scatterlist *sg;
+
+   for (sg = st->sgl; sg; sg = __sg_next(sg)) {
+   if (sg_page(sg))
+   __free_pages(sg_page(sg), get_order(sg->length));
+   }
+
+   sg_free_table(st);
+   kfree(st);
+}
+
+static int get_huge_pages(struct drm_i915_gem_object *obj)
+{
+#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
+   unsigned int page_mask = obj->mm.page_mask;
+   struct sg_table *st;
+   struct scatterlist *sg;
+   unsigned int sg_mask;
+   u64 rem;
+
+   st = 

[Intel-gfx] [CI 01/21] mm/shmem: introduce shmem_file_setup_with_mnt

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

We are planning to use our own tmpfs mnt in i915 in place of the
shm_mnt, such that we can control the mount options, in particular
huge=, which we require to support huge-gtt-pages. So rather than roll
our own version of __shmem_file_setup, it would be preferred if we could
just give shmem our mnt, and let it do the rest.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Dave Hansen 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Acked-by: Andrew Morton 
Acked-by: Kirill A. Shutemov 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-2-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
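The refactor in this patch threads the mount through one common helper and keeps the existing entry points as thin wrappers that pass the default mount. A userspace analogue of that shape, assuming stand-in types, might look as follows; every name here (`mount_stub`, `file_stub`, `file_setup_common()` and the wrappers) is hypothetical, and returning -ENODEV for a missing mount is an assumption of the sketch rather than the kernel's behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct vfsmount and struct file. */
struct mount_stub {
	const char *options;
};

struct file_stub {
	const struct mount_stub *mnt;
	const char *name;
	long long size;
};

/* Plays the role of the internal default mount (shm_mnt in the patch). */
static struct mount_stub default_mnt = { "" };

/* Common helper: every entry point funnels through here with its mount. */
static int file_setup_common(const struct mount_stub *mnt, const char *name,
			     long long size, struct file_stub *out)
{
	assert(out != NULL);

	if (!mnt)
		return -ENODEV; /* sketch-only error for a missing mount */
	if (size < 0)
		return -EINVAL;

	out->mnt = mnt;
	out->name = name;
	out->size = size;

	return 0;
}

/* Existing entry point: unchanged for callers, uses the default mount. */
static int file_setup(const char *name, long long size, struct file_stub *out)
{
	return file_setup_common(&default_mnt, name, size, out);
}

/* New entry point: the caller supplies a mount with its own options. */
static int file_setup_with_mnt(const struct mount_stub *mnt, const char *name,
			       long long size, struct file_stub *out)
{
	return file_setup_common(mnt, name, size, out);
}
```

A caller that wants different mount options, such as the `huge=` setting mentioned in the commit message, passes its own mount to the second wrapper, while existing callers keep using the first.
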
 include/linux/shmem_fs.h |  2 ++
 mm/shmem.c   | 30 ++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index b6c3540e07bc..0937d9a7d8fb 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -53,6 +53,8 @@ extern struct file *shmem_file_setup(const char *name,
loff_t size, unsigned long flags);
 extern struct file *shmem_kernel_file_setup(const char *name, loff_t size,
unsigned long flags);
+extern struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt,
+   const char *name, loff_t size, unsigned long flags);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern unsigned long shmem_get_unmapped_area(struct file *, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 07a1d22807be..3229d27503ec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4183,7 +4183,7 @@ static const struct dentry_operations anon_ops = {
.d_dname = simple_dname
 };
 
-static struct file *__shmem_file_setup(const char *name, loff_t size,
+static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
    unsigned long flags, unsigned int i_flags)
 {
struct file *res;
@@ -4192,8 +4192,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
struct super_block *sb;
struct qstr this;
 
-   if (IS_ERR(shm_mnt))
-   return ERR_CAST(shm_mnt);
+   if (IS_ERR(mnt))
+   return ERR_CAST(mnt);
 
if (size < 0 || size > MAX_LFS_FILESIZE)
return ERR_PTR(-EINVAL);
@@ -4205,8 +4205,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
this.name = name;
this.len = strlen(name);
this.hash = 0; /* will go */
-   sb = shm_mnt->mnt_sb;
-   path.mnt = mntget(shm_mnt);
+   sb = mnt->mnt_sb;
+   path.mnt = mntget(mnt);
path.dentry = d_alloc_pseudo(sb, &this);
if (!path.dentry)
goto put_memory;
@@ -4251,7 +4251,7 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
  */
 struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-   return __shmem_file_setup(name, size, flags, S_PRIVATE);
+   return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);
 }
 
 /**
@@ -4262,10 +4262,24 @@ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned lon
  */
 struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-   return __shmem_file_setup(name, size, flags, 0);
+   return __shmem_file_setup(shm_mnt, name, size, flags, 0);
 }
 EXPORT_SYMBOL_GPL(shmem_file_setup);
 
+/**
+ * shmem_file_setup_with_mnt - get an unlinked file living in tmpfs
+ * @mnt: the tmpfs mount where the file will be created
+ * @name: name for dentry (to be seen in /proc/<pid>/maps
+ * @size: size to be set for the file
+ * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
+ */
+struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt, const char *name,
+  loff_t size, unsigned long flags)
+{
+   return __shmem_file_setup(mnt, name, size, flags, 0);
+}
+EXPORT_SYMBOL_GPL(shmem_file_setup_with_mnt);
+
 /**
  * shmem_zero_setup - setup a shared anonymous mapping
  * @vma: the vma to be mmapped is prepared by do_mmap_pgoff
@@ -4281,7 +4295,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 * accessible to the user through its mapping, use S_PRIVATE flag to
 * bypass file security, in the same way as shmem_kernel_file_setup().
 */
-   file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
+   file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags);
if (IS_ERR(file))
return PTR_ERR(file);
 
-- 
2.14.2


[Intel-gfx] [CI 19/21] drm/i915: disable platform support for vGPU huge gtt pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Currently gvt gtt handling doesn't support huge page entries, so disable
for now.

v2: remove useless 48b PPGTT check

Suggested-by: Zhenyu Wang 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Zhenyu Wang 
Reviewed-by: Zhenyu Wang 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-20-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f8c3ac1c8c67..82a10036fb38 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4822,6 +4822,15 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
mutex_lock(&dev_priv->drm.struct_mutex);
 
+   /*
+* We need to fallback to 4K pages since gvt gtt handling doesn't
+* support huge page entries - we will need to check either hypervisor
+* mm can support huge guest page or just do emulation in gvt.
+*/
+   if (intel_vgpu_active(dev_priv))
+   mkwrite_device_info(dev_priv)->page_sizes =
+   I915_GTT_PAGE_SIZE_4K;
+
dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
 
if (!i915_modparams.enable_execlists) {
-- 
2.14.2



[Intel-gfx] [CI 21/21] drm/i915: enable platform support for 2M pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-22-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_pci.c  | 6 --
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8d349aec1902..bf467f30c99b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -376,7 +376,8 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
G75_FEATURES, \
BDW_COLORS, \
-   GEN_DEFAULT_PAGE_SIZES, \
+   .page_sizes = I915_GTT_PAGE_SIZE_4K | \
+ I915_GTT_PAGE_SIZE_2M, \
.has_logical_ring_contexts = 1, \
.has_full_48bit_ppgtt = 1, \
.has_64bit_reloc = 1, \
@@ -437,7 +438,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
.page_sizes = I915_GTT_PAGE_SIZE_4K | \
- I915_GTT_PAGE_SIZE_64K
+ I915_GTT_PAGE_SIZE_64K | \
+ I915_GTT_PAGE_SIZE_2M
 
 #define GEN9_FEATURES \
GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 7a9735dac912..04eb9362f4f8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -176,7 +176,8 @@ struct drm_i915_private *mock_gem_device(void)
 
mkwrite_device_info(i915)->page_sizes =
I915_GTT_PAGE_SIZE_4K |
-   I915_GTT_PAGE_SIZE_64K;
+   I915_GTT_PAGE_SIZE_64K |
+   I915_GTT_PAGE_SIZE_2M;
 
spin_lock_init(&i915->mm.object_stat_lock);
mock_uncore_init(i915);
-- 
2.14.2



[Intel-gfx] [CI 18/21] drm/i915/selftests: mix huge pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Try to mix sg page sizes for 4K, 64K and 2M pages.
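
As a rough userspace illustration of the idea (not the selftest itself), the sketch below picks a chunk length of 4K, 64K or 2M, expressed in CPU pages. It assumes PAGE_SHIFT is 12; struct rnd_state and next_u32() are hypothetical stand-ins for the kernel's PRNG state and prandom_u32_state().

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u			/* assumption: 4K CPU pages */
#define BIT(n)     (1ul << (n))

/* Hypothetical stand-in for the kernel's struct rnd_state. */
struct rnd_state {
	uint32_t s;			/* must be non-zero for xorshift */
};

/* xorshift32, used here only as a simple deterministic PRNG. */
static uint32_t next_u32(struct rnd_state *rnd)
{
	uint32_t x = rnd->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	rnd->s = x;
	return x;
}

/* Return the length of the next sg chunk in CPU pages: 4K, 64K or 2M. */
static unsigned int random_chunk_pages(struct rnd_state *rnd)
{
	static const unsigned int page_count[] = {
		BIT(12) >> PAGE_SHIFT,	/* 4K  */
		BIT(16) >> PAGE_SHIFT,	/* 64K */
		BIT(21) >> PAGE_SHIFT,	/* 2M  */
	};

	assert(rnd->s != 0);
	return page_count[next_u32(rnd) % 3];
}
```

Shifting by PAGE_SHIFT rather than a literal 12 keeps the table correct on configurations where the CPU page size differs, which is the point of the v2 change noted below.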

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-19-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/selftests/scatterlist.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c b/drivers/gpu/drm/i915/selftests/scatterlist.c
index 1cc5d2931753..cd6d2a16071f 100644
--- a/drivers/gpu/drm/i915/selftests/scatterlist.c
+++ b/drivers/gpu/drm/i915/selftests/scatterlist.c
@@ -189,6 +189,20 @@ static unsigned int random(unsigned long n,
return 1 + (prandom_u32_state(rnd) % 1024);
 }
 
+static unsigned int random_page_size_pages(unsigned long n,
+  unsigned long count,
+  struct rnd_state *rnd)
+{
+   /* 4K, 64K, 2M */
+   static unsigned int page_count[] = {
+   BIT(12) >> PAGE_SHIFT,
+   BIT(16) >> PAGE_SHIFT,
+   BIT(21) >> PAGE_SHIFT,
+   };
+
+   return page_count[(prandom_u32_state(rnd) % 3)];
+}
+
 static inline bool page_contiguous(struct page *first,
   struct page *last,
   unsigned long npages)
@@ -252,6 +266,7 @@ static const npages_fn_t npages_funcs[] = {
grow,
shrink,
random,
+   random_page_size_pages,
NULL,
 };
 
-- 
2.14.2



[Intel-gfx] [CI 02/21] drm/i915: introduce simple gemfs

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt, and gives us the much needed
flexibility to do things like set our own mount options, namely huge=
which should allow us to enable the use of transparent-huge-pages for
our shmem backed objects.
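
The fallback behaviour can be pictured with a small userspace mock. This is only a sketch under stated assumptions: struct mount and struct file are hypothetical stand-ins for struct vfsmount and the kernel's struct file, and default_mnt plays the role of shm_mnt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct vfsmount and struct file. */
struct mount {
	const char *opts;		/* mount options, e.g. a huge= setting */
};

struct file {
	const struct mount *mnt;	/* mount the file was created on */
	char name[16];
};

/* Plays the role of the global shm_mnt. */
static const struct mount default_mnt = { "" };

/* Create a file on an explicit mount, cf. shmem_file_setup_with_mnt(). */
static void file_setup_with_mnt(struct file *f, const struct mount *mnt,
				const char *name)
{
	f->mnt = mnt;
	strncpy(f->name, name, sizeof(f->name) - 1);
	f->name[sizeof(f->name) - 1] = '\0';
}

/* Prefer the private mount; fall back to the default when it is absent. */
static void object_create(struct file *f, const struct mount *gemfs)
{
	file_setup_with_mnt(f, gemfs ? gemfs : &default_mnt, "i915");
}
```

Passing NULL for gemfs models the case where the private mount could not be set up, so object creation still succeeds on the default mount.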

v2: various improvements suggested by Joonas

v3: move gemfs instance to i915.mm and simplify now that we have
file_setup_with_mnt

v4: fallback to tmpfs shm_mnt upon failure to setup gemfs

v5: make tmpfs fallback kinder

v5: better gemfs failure message
flags variable

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Dave Hansen 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/i915_drv.h  |  5 +++
 drivers/gpu/drm/i915/i915_gem.c  | 33 ++-
 drivers/gpu/drm/i915/i915_gemfs.c| 52 
 drivers/gpu/drm/i915/i915_gemfs.h| 34 
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  4 ++
 6 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 51d0d2929a4b..66d23b619db1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -47,6 +47,7 @@ i915-y += i915_cmd_parser.o \
  i915_gem_tiling.o \
  i915_gem_timeline.o \
  i915_gem_userptr.o \
+ i915_gemfs.o \
  i915_trace_points.o \
  i915_vma.o \
  intel_breadcrumbs.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1fc7080bfa7b..ec6f320cc4f5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1511,6 +1511,11 @@ struct i915_gem_mm {
/** Usable portion of the GTT for GEM */
dma_addr_t stolen_base; /* limited to low memory (32-bit) */
 
+   /**
+* tmpfs instance used for shmem backed objects
+*/
+   struct vfsmount *gemfs;
+
/** PPGTT used for aliasing the PPGTT with the GTT */
struct i915_hw_ppgtt *aliasing_ppgtt;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 50cc3c2cef06..1da1f52d12cc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -35,6 +35,7 @@
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include "intel_mocs.h"
+#include "i915_gemfs.h"
 #include 
 #include 
 #include 
@@ -4256,6 +4257,30 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
.pwrite = i915_gem_object_pwrite_gtt,
 };
 
+static int i915_gem_object_create_shmem(struct drm_device *dev,
+   struct drm_gem_object *obj,
+   size_t size)
+{
+   struct drm_i915_private *i915 = to_i915(dev);
+   unsigned long flags = VM_NORESERVE;
+   struct file *filp;
+
+   drm_gem_private_object_init(dev, obj, size);
+
+   if (i915->mm.gemfs)
+   filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size,
+flags);
+   else
+   filp = shmem_file_setup("i915", size, flags);
+
+   if (IS_ERR(filp))
+   return PTR_ERR(filp);
+
+   obj->filp = filp;
+
+   return 0;
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
@@ -4280,7 +4305,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
if (obj == NULL)
return ERR_PTR(-ENOMEM);
 
-   ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size);
+   ret = i915_gem_object_create_shmem(&dev_priv->drm, &obj->base, size);
if (ret)
goto fail;
 
@@ -4919,6 +4944,10 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 
spin_lock_init(&dev_priv->fb_tracking.lock);
 
+   err = i915_gemfs_init(dev_priv);
+   if (err)
+   DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);
+
return 0;
 
 err_priorities:
@@ -4957,6 +4986,8 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 
/* And ensure that our DESTROY_BY_RCU slabs are truly destroyed */
rcu_barrier();
+
+   i915_gemfs_fini(dev_priv);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
new file mode 100644
index ..168d0bd98f60
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@

[Intel-gfx] [CI 12/21] drm/i915: support 2M pages for the 48b PPGTT

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Support inserting 2M gtt pages into the 48b PPGTT.
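
The per-entry decision in this patch can be sketched in userspace as follows. It is a simplified model, assuming 4K PTEs and 512 PTEs per page table so that a PTE index of zero is equivalent to a 2M-aligned GTT offset; pick_page_size() and its parameters are hypothetical names, not driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K (1ull << 12)
#define SZ_2M (1ull << 21)

#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * Pick the GTT page size for the next entry.
 *
 * sg_sizes:   mask of page sizes present in the backing store
 * dma:        DMA address of the current position in the sg chunk
 * rem:        bytes remaining in the current sg chunk
 * gtt_offset: GTT address about to be filled
 */
static uint64_t pick_page_size(uint64_t sg_sizes, uint64_t dma,
			       uint64_t rem, uint64_t gtt_offset)
{
	/* A 2M entry replaces a whole page table, so it must start one. */
	bool at_pde_boundary = IS_ALIGNED(gtt_offset, SZ_2M);

	if ((sg_sizes & SZ_2M) && IS_ALIGNED(dma, SZ_2M) &&
	    rem >= SZ_2M && at_pde_boundary)
		return SZ_2M;

	return SZ_4K;
}
```

Any one failed condition (unaligned DMA address, short remainder, or a GTT offset in the middle of a page table) drops the entry back to 4K, which matches the else branch of the patch.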

v2: sanity check sg->length against page_size

v3: don't recalculate rem on each loop
whitespace breakup

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-13-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 76 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 +
 2 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 74fc9ac11cd5..79ba485c5d42 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1013,6 +1013,69 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
  cache_level);
 }
 
+static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
+  struct i915_page_directory_pointer **pdps,
+  struct sgt_dma *iter,
+  enum i915_cache_level cache_level)
+{
+   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
+   u64 start = vma->node.start;
+   dma_addr_t rem = iter->sg->length;
+
+   do {
+   struct gen8_insert_pte idx = gen8_insert_pte(start);
+   struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
+   struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
+   unsigned int page_size;
+   gen8_pte_t encode = pte_encode;
+   gen8_pte_t *vaddr;
+   u16 index, max;
+
+   if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
+   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
+   rem >= I915_GTT_PAGE_SIZE_2M && !idx.pte) {
+   index = idx.pde;
+   max = I915_PDES;
+   page_size = I915_GTT_PAGE_SIZE_2M;
+
+   encode |= GEN8_PDE_PS_2M;
+
+   vaddr = kmap_atomic_px(pd);
+   } else {
+   struct i915_page_table *pt = pd->page_table[idx.pde];
+
+   index = idx.pte;
+   max = GEN8_PTES;
+   page_size = I915_GTT_PAGE_SIZE;
+
+   vaddr = kmap_atomic_px(pt);
+   }
+
+   do {
+   GEM_BUG_ON(iter->sg->length < page_size);
+   vaddr[index++] = encode | iter->dma;
+
+   start += page_size;
+   iter->dma += page_size;
+   rem -= page_size;
+   if (iter->dma >= iter->max) {
+   iter->sg = __sg_next(iter->sg);
+   if (!iter->sg)
+   break;
+
+   rem = iter->sg->length;
+   iter->dma = sg_dma_address(iter->sg);
+   iter->max = iter->dma + rem;
+
+   if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
+   break;
+   }
+   } while (rem >= page_size && index < max);
+
+   kunmap_atomic(vaddr);
+   } while (iter->sg);
+}
+
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
   struct i915_vma *vma,
   enum i915_cache_level cache_level,
@@ -1025,11 +1088,16 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
.max = iter.dma + iter.sg->length,
};
struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
-   struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
-   while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
-&idx, cache_level))
-   GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+   if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   gen8_ppgtt_insert_huge_entries(vma, pdps, &iter, cache_level);
+   } else {
+   struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
+
+   while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
+&iter, &idx, cache_level))
+   GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+   }
 }
 
 static void gen8_free_page_tables(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f22491b4e6dc..b9d7036c3665 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,8 @@ t

[Intel-gfx] [CI 07/21] drm/i915: introduce vm set_pages/clear_pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Move the setting/clearing of the vma->pages to a vm operation. Doing so
neatens things up a little, but more importantly gives us a sane place
to also set/clear the vma->page_sizes, which we introduce later in
preparation for supporting huge-pages.
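
A cut-down userspace model of the pattern, for illustration only: the page bookkeeping moves behind two function pointers on the address space, and callers no longer touch vma->pages directly. struct vm_ops, simple_set_pages() and the other names here are hypothetical, and assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

struct vma;

/* Hypothetical, cut-down address-space operations table. */
struct vm_ops {
	int  (*set_pages)(struct vma *vma);
	void (*clear_pages)(struct vma *vma);
};

struct vma {
	const struct vm_ops *ops;
	const int *obj_pages;	/* pages owned by the backing object */
	const int *pages;	/* pages currently used by this mapping */
};

static int simple_set_pages(struct vma *vma)
{
	assert(vma->pages == NULL);	/* mirrors GEM_BUG_ON(vma->pages) */
	vma->pages = vma->obj_pages;
	return 0;
}

static void simple_clear_pages(struct vma *vma)
{
	assert(vma->pages != NULL);	/* mirrors GEM_BUG_ON(!vma->pages) */
	vma->pages = NULL;
}

static const struct vm_ops simple_ops = {
	.set_pages = simple_set_pages,
	.clear_pages = simple_clear_pages,
};

/* Callers go through the ops table rather than assigning pages inline. */
static int vma_bind(struct vma *vma)
{
	return vma->ops->set_pages(vma);
}

static void vma_unbind(struct vma *vma)
{
	vma->ops->clear_pages(vma);
}
```

Having a single set/clear pair per address space gives one obvious place to record extra per-mapping state later, such as the page sizes in use.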

v2: remove redundant vma->pages check

v3: GEM_BUG_ON(vma->pages) following i915_vma_remove

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 70 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   | 27 +++-
 drivers/gpu/drm/i915/selftests/mock_gtt.c | 11 ++---
 4 files changed, 66 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4c82ceb8d318..c534b74eee32 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -205,8 +205,6 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
return ret;
}
 
-   vma->pages = vma->obj->mm.pages;
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (vma->obj->gt_ro)
@@ -222,6 +220,26 @@ static void ppgtt_unbind_vma(struct i915_vma *vma)
vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 }
 
+static int ppgtt_set_pages(struct i915_vma *vma)
+{
+   GEM_BUG_ON(vma->pages);
+
+   vma->pages = vma->obj->mm.pages;
+
+   return 0;
+}
+
+static void clear_pages(struct i915_vma *vma)
+{
+   GEM_BUG_ON(!vma->pages);
+
+   if (vma->pages != vma->obj->mm.pages) {
+   sg_free_table(vma->pages);
+   kfree(vma->pages);
+   }
+   vma->pages = NULL;
+}
+
 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
  enum i915_cache_level level)
 {
@@ -1452,6 +1470,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.cleanup = gen8_ppgtt_cleanup;
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
+   ppgtt->base.set_pages = ppgtt_set_pages;
+   ppgtt->base.clear_pages = clear_pages;
ppgtt->debug_dump = gen8_dump_ppgtt;
 
return 0;
@@ -1894,6 +1914,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
+   ppgtt->base.set_pages = ppgtt_set_pages;
+   ppgtt->base.clear_pages = clear_pages;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->debug_dump = gen6_dump_ppgtt;
 
@@ -2405,12 +2427,6 @@ static int ggtt_bind_vma(struct i915_vma *vma,
struct drm_i915_gem_object *obj = vma->obj;
u32 pte_flags;
 
-   if (unlikely(!vma->pages)) {
-   int ret = i915_get_ggtt_vma_pages(vma);
-   if (ret)
-   return ret;
-   }
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (obj->gt_ro)
@@ -2447,12 +2463,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
u32 pte_flags;
int ret;
 
-   if (unlikely(!vma->pages)) {
-   ret = i915_get_ggtt_vma_pages(vma);
-   if (ret)
-   return ret;
-   }
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (vma->obj->gt_ro)
@@ -2467,7 +2477,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 vma->node.start,
 vma->size);
if (ret)
-   goto err_pages;
+   return ret;
}
 
appgtt->base.insert_entries(&appgtt->base, vma, cache_level,
@@ -2481,17 +2491,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
}
 
return 0;
-
-err_pages:
-   if (!(vma->flags & (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND))) {
-   if (vma->pages != vma->obj->mm.pages) {
-   GEM_BUG_ON(!vma->pages);
-   sg_free_table(vma->pages);
-   kfree(vma->pages);
-   }
-   vma->pages = NULL;
-   }
-   return ret;
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_vma *vma)
@@ -2529,6 +2528,19 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL);
 }
 
+static int ggtt_set_pages(struct i915_vma *vma)
+{
+   int ret;
+
+   GEM_BUG_ON(vma->pages);
+
+   ret = i915_get_ggtt_vma_page

[Intel-gfx] [CI 20/21] drm/i915: enable platform support for 64K pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-21-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_pci.c  | 3 ++-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 7938006cf03a..8d349aec1902 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -436,7 +436,8 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-   .page_sizes = I915_GTT_PAGE_SIZE_4K
+   .page_sizes = I915_GTT_PAGE_SIZE_4K | \
+ I915_GTT_PAGE_SIZE_64K
 
 #define GEN9_FEATURES \
GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f46c3a35d61a..7a9735dac912 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -175,7 +175,8 @@ struct drm_i915_private *mock_gem_device(void)
mkwrite_device_info(i915)->gen = -1;
 
mkwrite_device_info(i915)->page_sizes =
-   I915_GTT_PAGE_SIZE_4K;
+   I915_GTT_PAGE_SIZE_4K |
+   I915_GTT_PAGE_SIZE_64K;
 
spin_lock_init(&i915->mm.object_stat_lock);
mock_uncore_init(i915);
-- 
2.14.2



[Intel-gfx] [CI 16/21] drm/i915/debugfs: include some gtt page size metrics

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Good to know, mostly for debugging purposes.

v2: some improvements from Chris

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-17-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 61 ++---
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 84ab77c02d3e..f7817c667958 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -119,6 +119,36 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
return size;
 }
 
+static const char *
+stringify_page_sizes(unsigned int page_sizes, char *buf, size_t len)
+{
+   size_t x = 0;
+
+   switch (page_sizes) {
+   case 0:
+   return "";
+   case I915_GTT_PAGE_SIZE_4K:
+   return "4K";
+   case I915_GTT_PAGE_SIZE_64K:
+   return "64K";
+   case I915_GTT_PAGE_SIZE_2M:
+   return "2M";
+   default:
+   if (!buf)
+   return "M";
+
+   if (page_sizes & I915_GTT_PAGE_SIZE_2M)
+   x += snprintf(buf + x, len - x, "2M, ");
+   if (page_sizes & I915_GTT_PAGE_SIZE_64K)
+   x += snprintf(buf + x, len - x, "64K, ");
+   if (page_sizes & I915_GTT_PAGE_SIZE_4K)
+   x += snprintf(buf + x, len - x, "4K, ");
+   buf[x-2] = '\0';
+
+   return buf;
+   }
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -156,9 +186,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
if (!drm_mm_node_allocated(&vma->node))
continue;
 
-   seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
+   seq_printf(m, " (%sgtt offset: %08llx, size: %08llx, pages: %s",
   i915_vma_is_ggtt(vma) ? "g" : "pp",
-  vma->node.start, vma->node.size);
+  vma->node.start, vma->node.size,
+  stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
if (i915_vma_is_ggtt(vma)) {
switch (vma->ggtt_view.type) {
case I915_GGTT_VIEW_NORMAL:
@@ -403,10 +434,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
struct drm_i915_private *dev_priv = node_to_i915(m->private);
struct drm_device *dev = &dev_priv->drm;
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   u32 count, mapped_count, purgeable_count, dpy_count;
-   u64 size, mapped_size, purgeable_size, dpy_size;
+   u32 count, mapped_count, purgeable_count, dpy_count, huge_count;
+   u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
struct drm_i915_gem_object *obj;
+   unsigned int page_sizes = 0;
struct drm_file *file;
+   char buf[80];
int ret;
 
ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -420,6 +453,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
size = count = 0;
mapped_size = mapped_count = 0;
purgeable_size = purgeable_count = 0;
+   huge_size = huge_count = 0;
list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) {
size += obj->base.size;
++count;
@@ -433,6 +467,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
mapped_count++;
mapped_size += obj->base.size;
}
+
+   if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   huge_count++;
+   huge_size += obj->base.size;
+   page_sizes |= obj->mm.page_sizes.sg;
+   }
}
seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
@@ -455,6 +495,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
mapped_count++;
mapped_size += obj->base.size;
}
+
+   if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   huge_count++;
+   huge_size += obj->base.size;
+   page_sizes |= obj->mm.page_sizes.sg;
+   }
}
seq_printf(m, "%u bound objects, %llu bytes\n",
   count, size);
@@ -462,11 +508,18 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
   purgeable_count, purgeable_size);
seq_printf(m, "%u mapped objects, %llu bytes\n",
   mapped_count, mapped_size);
+   seq_printf(m, "%u huge-paged objects (%s) %llu bytes\n",

[Intel-gfx] [CI 04/21] drm/i915: introduce page_sizes field to dev_info

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

In preparation for huge gtt pages expose page_sizes as part of the
device info, to indicate the page sizes supported by the HW.  Currently
only 4K is supported.
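
To illustrate how such a mask might be queried, here is a minimal userspace sketch. Each flag is defined as BIT(log2(size)), so the flag value equals the page size in bytes. struct device_info and has_page_sizes() are hypothetical names for this example, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* Each flag doubles as the page size in bytes. */
#define PAGE_SIZE_4K  BIT(12)
#define PAGE_SIZE_64K BIT(16)
#define PAGE_SIZE_2M  BIT(21)

/* Hypothetical, cut-down device description. */
struct device_info {
	unsigned int page_sizes;	/* page sizes supported by the HW */
};

/* True only if every size in @sizes is supported. */
static bool has_page_sizes(const struct device_info *info, unsigned int sizes)
{
	return (info->page_sizes & sizes) == sizes;
}
```

A platform that only supports 4K sets page_sizes to PAGE_SIZE_4K; larger sizes can be ORed in later without changing any caller of the query helper.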

v2: s/page_size_mask/page_sizes/

v3: introduce I915_GTT_MAX_PAGE_SIZE

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Mika Kuoppala 
Cc: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |  2 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h  |  8 +++-
 drivers/gpu/drm/i915/i915_pci.c  | 18 ++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  3 +++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ec6f320cc4f5..3d4dee817381 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -869,6 +869,8 @@ struct intel_device_info {
u8 num_sprites[I915_MAX_PIPES];
u8 num_scalers[I915_MAX_PIPES];
 
+   unsigned int page_sizes; /* page sizes supported by the HW */
+
 #define DEFINE_FLAG(name) u8 name:1
DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
 #undef DEFINE_FLAG
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f62fb903dc24..50218c141c21 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -42,7 +42,13 @@
 #include "i915_gem_request.h"
 #include "i915_selftest.h"
 
-#define I915_GTT_PAGE_SIZE 4096UL
+#define I915_GTT_PAGE_SIZE_4K BIT(12)
+#define I915_GTT_PAGE_SIZE_64K BIT(16)
+#define I915_GTT_PAGE_SIZE_2M BIT(21)
+
+#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
+#define I915_GTT_MAX_PAGE_SIZE I915_GTT_PAGE_SIZE_2M
+
 #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
 
 #define I915_FENCE_REG_NONE -1
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 745b6a6e0188..7938006cf03a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -58,6 +58,10 @@
.color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 }
 
 /* Keep in gen based order, and chronological order within a gen */
+
+#define GEN_DEFAULT_PAGE_SIZES \
+   .page_sizes = I915_GTT_PAGE_SIZE_4K
+
 #define GEN2_FEATURES \
.gen = 2, .num_pipes = 1, \
.has_overlay = 1, .overlay_needs_physical = 1, \
@@ -67,6 +71,7 @@
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i830_info __initconst = {
@@ -100,6 +105,7 @@ static const struct intel_device_info intel_i865g_info __initconst = {
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i915g_info __initconst = {
@@ -163,6 +169,7 @@ static const struct intel_device_info intel_pineview_info __initconst = {
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i965g_info __initconst = {
@@ -205,6 +212,7 @@ static const struct intel_device_info intel_gm45_info __initconst = {
.ring_mask = RENDER_RING | BSD_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ironlake_d_info __initconst = {
@@ -228,6 +236,7 @@ static const struct intel_device_info intel_ironlake_m_info __initconst = {
.has_rc6p = 1, \
.has_aliasing_ppgtt = 1, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 #define SNB_D_PLATFORM \
@@ -271,6 +280,7 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info __initconst =
.has_aliasing_ppgtt = 1, \
.has_full_ppgtt = 1, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
IVB_CURSOR_OFFSETS
 
 #define IVB_D_PLATFORM \
@@ -327,6 +337,7 @@ static const struct intel_device_info intel_valleyview_info __initconst = {
.has_snoop = true,
.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
.display_mmio_offset = VLV_DISPLAY_BASE,
+   GEN_DEFAULT_PAGE_SIZES,
GEN_DEFAULT_PIPEOFFSETS,
CURSOR_OFFSETS
 };
@@ -365,6 +376,7 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
G75_FEATURES, \
BDW_COLORS, \
+   GEN_DEFAULT_PAGE_SIZES, \
.has_logical_ring_contexts = 1, \
.has_full_48bit_ppgtt = 1, \
.has_64bit_reloc = 1, \
@@ -417,13 +429,18 @@ static const 

[Intel-gfx] [CI 13/21] drm/i915: add support for 64K scratch page

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Before we can fully enable 64K pages, we need to first support a 64K
scratch page if we intend to support the case where we have object sizes
< 2M, since any scratch PTE must also point to a 64K region.  Without
this our 64K usage is limited to objects which completely fill the
page-table, and therefore don't need any scratch.
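
The try-large-then-fall-back shape of the allocation can be sketched in userspace. This is an analogy only: C11 aligned_alloc() stands in for alloc_pages() plus the DMA mapping, and struct scratch, setup_scratch() and cleanup_scratch() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SZ_4K  4096u
#define SZ_64K 65536u

/* Hypothetical stand-in for the scratch page bookkeeping. */
struct scratch {
	void *mem;
	size_t size;	/* remembered so cleanup frees the right amount */
};

/* Try a 64K-aligned 64K block first; fall back to a single 4K block. */
static int setup_scratch(struct scratch *s, bool want_64k)
{
	s->mem = NULL;
	s->size = 0;

	if (want_64k) {
		void *p = aligned_alloc(SZ_64K, SZ_64K);

		if (p && ((uintptr_t)p % SZ_64K) == 0) {
			s->mem = p;
			s->size = SZ_64K;
		} else {
			free(p);	/* unusable for 64K mode, retry small */
		}
	}

	if (!s->mem) {
		s->mem = aligned_alloc(SZ_4K, SZ_4K);
		if (!s->mem)
			return -1;
		s->size = SZ_4K;
	}

	memset(s->mem, 0, s->size);	/* scratch must read back as zeroes */
	return 0;
}

static void cleanup_scratch(struct scratch *s)
{
	free(s->mem);
	s->mem = NULL;
	s->size = 0;
}
```

Recording the size alongside the pointer serves the same purpose as the order field added by this patch: the teardown path needs to know which of the two allocations succeeded.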

v2: add reminder about why 48b PPGTT

Reported-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-14-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 64 ++---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 79ba485c5d42..7eae6ab8c5fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -519,22 +519,63 @@ static void fill_page_dma_32(struct i915_address_space *vm,
 static int
 setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
 {
-   struct page *page;
+   struct page *page = NULL;
dma_addr_t addr;
+   int order;
 
-   page = alloc_page(gfp | __GFP_ZERO);
-   if (unlikely(!page))
-   return -ENOMEM;
+   /*
+* In order to utilize 64K pages for an object with a size < 2M, we will
+* need to support a 64K scratch page, given that every 16th entry for a
+* page-table operating in 64K mode must point to a properly aligned 64K
+* region, including any PTEs which happen to point to scratch.
+*
+* This is only relevant for the 48b PPGTT where we support
+* huge-gtt-pages, see also i915_vma_insert().
+*
+* TODO: we should really consider write-protecting the scratch-page and
+* sharing between ppgtt
+*/
+   if (i915_vm_is_48bit(vm) &&
+   HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K)) {
+   order = get_order(I915_GTT_PAGE_SIZE_64K);
+   page = alloc_pages(gfp | __GFP_ZERO, order);
+   if (page) {
+   addr = dma_map_page(vm->dma, page, 0,
+   I915_GTT_PAGE_SIZE_64K,
+   PCI_DMA_BIDIRECTIONAL);
+   if (unlikely(dma_mapping_error(vm->dma, addr))) {
+   __free_pages(page, order);
+   page = NULL;
+   }
 
-   addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
-   PCI_DMA_BIDIRECTIONAL);
-   if (unlikely(dma_mapping_error(vm->dma, addr))) {
-   __free_page(page);
-   return -ENOMEM;
+   if (!IS_ALIGNED(addr, I915_GTT_PAGE_SIZE_64K)) {
+   dma_unmap_page(vm->dma, addr,
+  I915_GTT_PAGE_SIZE_64K,
+  PCI_DMA_BIDIRECTIONAL);
+   __free_pages(page, order);
+   page = NULL;
+   }
+   }
+   }
+
+   if (!page) {
+   order = 0;
+   page = alloc_page(gfp | __GFP_ZERO);
+   if (unlikely(!page))
+   return -ENOMEM;
+
+   addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
+   PCI_DMA_BIDIRECTIONAL);
+   if (unlikely(dma_mapping_error(vm->dma, addr))) {
+   __free_page(page);
+   return -ENOMEM;
+   }
}
 
vm->scratch_page.page = page;
vm->scratch_page.daddr = addr;
+   vm->scratch_page.order = order;
+
return 0;
 }
 
@@ -542,8 +583,9 @@ static void cleanup_scratch_page(struct i915_address_space *vm)
 {
struct i915_page_dma *p = &vm->scratch_page;
 
-   dma_unmap_page(vm->dma, p->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-   __free_page(p->page);
+   dma_unmap_page(vm->dma, p->daddr, BIT(p->order) << PAGE_SHIFT,
+  PCI_DMA_BIDIRECTIONAL);
+   __free_pages(p->page, p->order);
 }
 
 static struct i915_page_table *alloc_pt(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b9d7036c3665..e9de3f05b0c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -215,6 +215,7 @@ struct i915_vma;
 
 struct i915_page_dma {
struct page *page;
+   int order;
union {
dma_addr_t daddr;
 
-- 
2.14.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
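
The patch above stores the allocation order of the scratch page and later
derives the byte size from it with BIT(order) << PAGE_SHIFT. A minimal
userspace sketch of that arithmetic follows, assuming 4K CPU pages; the
demo_* names are hypothetical stand-ins and not kernel API.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12                       /* assumption: 4K CPU pages */
#define DEMO_PAGE_SIZE  ((size_t)1 << DEMO_PAGE_SHIFT)

/* Smallest order such that (PAGE_SIZE << order) covers size bytes. */
int demo_get_order(size_t size)
{
	int order = 0;

	while ((DEMO_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Byte size of an allocation of the given order, as needed at unmap time. */
size_t demo_scratch_bytes(int order)
{
	return ((size_t)1 << order) << DEMO_PAGE_SHIFT;
}
```

With these assumptions a 64K scratch page is an order-4 allocation, and the
order-0 fallback maps back to a single 4K page, so one stored integer is
enough for cleanup to pick the right unmap size.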


[Intel-gfx] [CI 06/21] drm/i915: introduce page_size members

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects.  We fill in the page sizes by
scanning the sg table.

v2: pass the sg_mask to set_pages

v3: calculate the sg_mask inline with populating the sg_table where
possible, and pass to set_pages along with the pages.

v4: bunch of improvements from Joonas

v5: fix num_pages blunder
introduce i915_sg_page_sizes helper

v6: prefer GEM_BUG_ON(sizes == 0)

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Daniel Vetter 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  | 22 -
 drivers/gpu/drm/i915/i915_gem.c  | 42 +---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   |  5 ++-
 drivers/gpu/drm/i915/i915_gem_internal.c |  5 ++-
 drivers/gpu/drm/i915/i915_gem_object.h   | 17 ++
 drivers/gpu/drm/i915/i915_gem_stolen.c   |  2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c  |  5 ++-
 drivers/gpu/drm/i915/selftests/huge_gem_object.c |  2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c|  5 ++-
 9 files changed, 93 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3d4dee817381..799a90abd81f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2872,6 +2872,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) >= (__iter).max) ?   \
 (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0)
 
+static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
+{
+   unsigned int page_sizes;
+
+   page_sizes = 0;
+   while (sg) {
+   GEM_BUG_ON(sg->offset);
+   GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE));
+   page_sizes |= sg->length;
+   sg = __sg_next(sg);
+   }
+
+   return page_sizes;
+}
+
 static inline unsigned int i915_sg_segment_size(void)
 {
unsigned int size = swiotlb_max_segment();
@@ -3101,6 +3116,10 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define USES_PPGTT(dev_priv)   (i915_modparams.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)  (i915_modparams.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv) (i915_modparams.enable_ppgtt == 3)
+#define HAS_PAGE_SIZES(dev_priv, sizes) ({ \
+   GEM_BUG_ON((sizes) == 0); \
+   ((sizes) & ~(dev_priv)->info.page_sizes) == 0; \
+})
 
 #define HAS_OVERLAY(dev_priv)   ((dev_priv)->info.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
@@ -3517,7 +3536,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
unsigned long n);
 
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
-struct sg_table *pages);
+struct sg_table *pages,
+unsigned int sg_mask);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
 static inline int __must_check
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 42f2ca1e136b..34398696824c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -228,7 +228,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
obj->phys_handle = phys;
 
-   __i915_gem_object_set_pages(obj, st);
+   __i915_gem_object_set_pages(obj, st, sg->length);
 
return 0;
 
@@ -2266,6 +2266,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
if (!IS_ERR(pages))
obj->ops->put_pages(obj, pages);
 
+   obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0;
+
 unlock:
mutex_unlock(&obj->mm.lock);
 }
@@ -2308,6 +2310,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
unsigned int max_segment = i915_sg_segment_size();
+   unsigned int sg_mask;
gfp_t noreclaim;
int ret;
 
@@ -2339,6 +2342,7 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
sg = st->sgl;
st->nents = 0;
+   sg_mask = 0;
for (i = 0; i < page_count; i++) {
const unsigned int shrink[] = {
 I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_PURGEABLE,
@@ -2391,8 +2395,10 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
if (!i ||
sg->length >= max_segment ||
page_to_pfn(page) != last_pfn + 1) {
-   if (i)
+  
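
The i915_sg_page_sizes() helper introduced above builds a mask by OR-ing
together the length of every scatterlist segment. The following is a rough
userspace sketch of the same idea; struct demo_sg is a hypothetical stand-in
for struct scatterlist, and 4K CPU pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u   /* assumption: 4K CPU pages */

/* Hypothetical stand-in for a scatterlist entry: only the fields used here. */
struct demo_sg {
	unsigned int offset;
	unsigned int length;
};

/* OR together all segment lengths into a single mask. */
unsigned int demo_sg_page_sizes(const struct demo_sg *sg, size_t nents)
{
	unsigned int page_sizes = 0;
	size_t i;

	for (i = 0; i < nents; i++) {
		/* Same preconditions the patch asserts with GEM_BUG_ON(). */
		assert(sg[i].offset == 0);
		assert(sg[i].length % DEMO_PAGE_SIZE == 0);
		page_sizes |= sg[i].length;
	}
	return page_sizes;
}
```

A table made of one 2M, one 64K and one 4K segment yields a mask with
exactly those three bits set, which is the value callers would then hand to
__i915_gem_object_set_pages() as sg_mask.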

[Intel-gfx] [CI 14/21] drm/i915: support 64K pages for the 48b PPGTT

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Support inserting 64K pages into the 48b PPGTT.

v2: check for 64K scratch

v3: we should only have to re-adjust maybe_64K at every sg interval

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-15-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 31 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  7 +++
 2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7eae6ab8c5fd..118aad90468f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1069,6 +1069,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
unsigned int page_size;
+   bool maybe_64K = false;
gen8_pte_t encode = pte_encode;
gen8_pte_t *vaddr;
u16 index, max;
@@ -1090,6 +1091,13 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
max = GEN8_PTES;
page_size = I915_GTT_PAGE_SIZE;
 
+   if (!index &&
+   vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
+   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+   (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+rem >= (max - index) << PAGE_SHIFT))
+   maybe_64K = true;
+
vaddr = kmap_atomic_px(pt);
}
 
@@ -1109,12 +1117,35 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
iter->dma = sg_dma_address(iter->sg);
iter->max = iter->dma + rem;
 
+   if (maybe_64K && index < max &&
+   !(IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+ (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+  rem >= (max - index) << PAGE_SHIFT)))
+   maybe_64K = false;
+
if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
break;
}
} while (rem >= page_size && index < max);
 
kunmap_atomic(vaddr);
+
+   /*
+* Is it safe to mark the 2M block as 64K? -- Either we have
+* filled whole page-table with 64K entries, or filled part of
+* it and have reached the end of the sg table and we have
+* enough padding.
+*/
+   if (maybe_64K &&
+   (index == max ||
+(i915_vm_has_scratch_64K(vma->vm) &&
+ !iter->sg && IS_ALIGNED(vma->node.start +
+ vma->node.size,
+ I915_GTT_PAGE_SIZE_2M)))) {
+   vaddr = kmap_atomic_px(pd);
+   vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
+   kunmap_atomic(vaddr);
+   }
} while (iter->sg);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9de3f05b0c9..93211a96fdad 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4))
 #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6))
 
+#define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
 struct sg_table;
@@ -352,6 +353,12 @@ i915_vm_is_48bit(const struct i915_address_space *vm)
return (vm->total - 1) >> 32;
 }
 
+static inline bool
+i915_vm_has_scratch_64K(struct i915_address_space *vm)
+{
+   return vm->scratch_page.order == get_order(I915_GTT_PAGE_SIZE_64K);
+}
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
-- 
2.14.2
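
The condition that sets maybe_64K when a new page table is entered can be
read as a single predicate. The sketch below restates it in userspace C as
an illustration only; the demo_* names are hypothetical, and 4K base pages
with 512 entries per page table are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                    /* assumption: 4K base pages */
#define DEMO_SZ_64K     (UINT64_C(1) << 16)
#define DEMO_PTES       512u                  /* assumption: entries per page table */

static bool demo_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * A page table is a 64K candidate only if we start at entry 0, the sg mask
 * carries the 64K bit, the DMA address is 64K aligned, and the remaining
 * segment length is either a multiple of 64K or long enough to fill the
 * rest of the page table.
 */
bool demo_maybe_64k(unsigned int index, unsigned int sg_mask,
		    uint64_t dma, uint64_t rem)
{
	return index == 0 &&
	       (sg_mask & DEMO_SZ_64K) != 0 &&
	       demo_aligned(dma, DEMO_SZ_64K) &&
	       (demo_aligned(rem, DEMO_SZ_64K) ||
		rem >= (uint64_t)(DEMO_PTES - index) << DEMO_PAGE_SHIFT);
}
```

The patch re-evaluates the alignment part of this test each time the walk
moves to the next sg entry, and only sets GEN8_PDE_IPS_64K once the whole
page table has been written.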



[Intel-gfx] [CI 11/21] drm/i915: disable GTT cache for 2M pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!

v3: explicitly check that the system does support 2M/1G pages

v4: split WA and decision logic

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-12-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/intel_pm.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 171b21f6c4ad..9d0ca2656a23 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8453,6 +8453,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
+   /* The GTT cache must be disabled if the system is using 2M pages. */
+   bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
+I915_GTT_PAGE_SIZE_2M);
enum pipe pipe;
 
ilk_init_lp_watermarks(dev_priv);
@@ -8487,12 +8490,8 @@ static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
/* WaProgramL3SqcReg1Default:bdw */
gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
-   /*
-* WaGttCachingOffByDefault:bdw
-* GTT cache may not work with big pages, so if those
-* are ever enabled GTT cache may need to be disabled.
-*/
-   I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+   /* WaGttCachingOffByDefault:bdw */
+   I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
 
/* WaKVMNotificationOnConfigChange:bdw */
I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-- 
2.14.2
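
The decision above hinges on HAS_PAGE_SIZES(), which is true only when every
requested size is present in the device's supported mask. A small userspace
sketch of that subset test and of the resulting GTT cache decision follows;
the demo_* names are hypothetical and the supported mask is passed in
directly rather than read from device info.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_SZ_4K  (1u << 12)
#define DEMO_SZ_64K (1u << 16)
#define DEMO_SZ_2M  (1u << 21)

/* True only if every bit in sizes is also set in supported. */
bool demo_has_page_sizes(unsigned int supported, unsigned int sizes)
{
	assert(sizes != 0);   /* mirrors GEM_BUG_ON((sizes) == 0) */
	return (sizes & ~supported) == 0;
}

/* The GTT cache may stay enabled only when 2M pages are not supported. */
bool demo_can_use_gtt_cache(unsigned int supported)
{
	return !demo_has_page_sizes(supported, DEMO_SZ_2M);
}
```

Keeping the workaround write unconditional and folding the decision into the
written value matches the "split WA and decision logic" note in v4.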



[Intel-gfx] [CI 08/21] drm/i915: align the vma start to the largest gtt page size

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

For the 48b PPGTT try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning and objects which need to be
tightly packed into the lower 32bits we don't force any alignment.

v2: various improvements suggested by Chris

v3: use set_pages and better placement of page_sizes

v4: prefer upper_32_bits()

v5: assign vma->page_sizes = vma->obj->page_sizes directly
prefer sizeof(vma->page_sizes)

v6: fixup checking of end to exclude GGTT (which are assumed to be
limited to 4G).

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  6 ++
 drivers/gpu/drm/i915/i915_vma.c | 16 
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 3 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c534b74eee32..fb7ac66814ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -226,6 +226,8 @@ static int ppgtt_set_pages(struct i915_vma *vma)
 
vma->pages = vma->obj->mm.pages;
 
+   vma->page_sizes = vma->obj->mm.page_sizes;
+
return 0;
 }
 
@@ -238,6 +240,8 @@ static void clear_pages(struct i915_vma *vma)
kfree(vma->pages);
}
vma->pages = NULL;
+
+   memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
@@ -2538,6 +2542,8 @@ static int ggtt_set_pages(struct i915_vma *vma)
if (ret)
return ret;
 
+   vma->page_sizes = vma->obj->mm.page_sizes;
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 49bf49571e47..5d4164406b63 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -493,6 +493,22 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
if (ret)
goto err_clear;
} else {
+   /*
+* We only support huge gtt pages through the 48b PPGTT,
+* however we also don't want to force any alignment for
+* objects which need to be tightly packed into the low 32bits.
+*
+* Note that we assume that GGTT are limited to 4GiB for the
+* foreseeable future. See also i915_ggtt_offset().
+*/
+   if (upper_32_bits(end - 1) &&
+   vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   u64 page_alignment =
+   rounddown_pow_of_two(vma->page_sizes.sg);
+
+   alignment = max(alignment, page_alignment);
+   }
+
ret = i915_gem_gtt_insert(vma->vm, &vma->node,
  size, alignment, obj->cache_level,
  start, end, flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e811067c7724..c59ba76613a3 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -55,6 +55,7 @@ struct i915_vma {
void __iomem *iomap;
u64 size;
u64 display_alignment;
+   struct i915_page_sizes page_sizes;
 
u32 fence_size;
u32 fence_alignment;
-- 
2.14.2
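
The alignment rule added to i915_vma_insert() can be tried out in isolation.
The sketch below is a userspace approximation only: the demo_* helpers are
hypothetical, the sg mask is passed in as a plain integer, and end is
assumed to be non-zero.

```c
#include <stdint.h>

#define DEMO_SZ_4K UINT64_C(4096)

/* Largest power of two that is <= n; n must be non-zero. */
uint64_t demo_rounddown_pow_of_two(uint64_t n)
{
	uint64_t p = 1;

	while (n >>= 1)
		p <<= 1;
	return p;
}

/*
 * Raise the requested alignment to the largest page size in sg_mask, but
 * only when the range reaches above 4GiB and something larger than a 4K
 * page is present.
 */
uint64_t demo_vma_alignment(uint64_t alignment, uint64_t end, uint64_t sg_mask)
{
	if (((end - 1) >> 32) != 0 && sg_mask > DEMO_SZ_4K) {
		uint64_t page_alignment = demo_rounddown_pow_of_two(sg_mask);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}
	return alignment;
}
```

For a mask of 2M | 64K the result is 2M, which is what the commit message
means by "just align to the largest" when multiple page sizes are in play.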



[Intel-gfx] [CI 10/21] drm/i915: enable IPS bit for 64K pages

2017-10-06 Thread Chris Wilson
From: Matthew Auld 

Before we can enable 64K pages through the IPS bit, we must first enable
it through MMIO, otherwise the page-walker will simply ignore it.

v2: add comment mentioning that 64K is BDW+

v3: move to more suitable home

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Reviewed-by: Mika Kuoppala 
Reviewed-by: Joonas Lahtinen 
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-11-matthew.a...@intel.com
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fb7ac66814ab..74fc9ac11cd5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1987,6 +1987,23 @@ static void gtt_write_workarounds(struct drm_i915_private *dev_priv)
 I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_SKL);
 else if (IS_GEN9_LP(dev_priv))
 I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_BXT);
+
+   /*
+* To support 64K PTEs we need to first enable the use of the
+* Intermediate-Page-Size(IPS) bit of the PDE field via some magical
+* mmio, otherwise the page-walker will simply ignore the IPS bit. This
+* shouldn't be needed after GEN10.
+*
+* 64K pages were first introduced from BDW+, although technically they
+* only *work* from gen9+. For pre-BDW we instead have the option for
+* 32K pages, but we don't currently have any support for it in our
+* driver.
+*/
+   if (HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_64K) &&
+   INTEL_GEN(dev_priv) <= 10)
+   I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA,
+  I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) |
+  GAMW_ECO_ENABLE_64K_IPS_FIELD);
 }
 
 int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e7dba5539b11..50e65c98ca6c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2371,6 +2371,9 @@ enum i915_power_well_id {
 #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0)
 #define   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS  (1<<18)
 
+#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080)
+#define   GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF
+
 #define GAMT_CHKN_BIT_REG  _MMIO(0x4ab8)
 #define   GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING (1<<28)
 #define   GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT   (1<<24)
-- 
2.14.2
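
The register update above is a read-modify-write: existing bits are kept and
the IPS enable field is OR-ed in, and only when 64K pages are supported on
gen10 or earlier. A minimal sketch of the value computation follows, with
the register read replaced by a plain argument; demo_gamw_eco_value() is a
hypothetical helper, and only the 0xF field value is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ENABLE_64K_IPS_FIELD 0xFu   /* field value as defined in the patch */

/*
 * Value to write back to the register: the current contents with the IPS
 * enable bits set, or the current contents unchanged when the condition in
 * the patch does not hold.
 */
uint32_t demo_gamw_eco_value(uint32_t current, bool has_64k, int gen)
{
	if (has_64k && gen <= 10)
		return current | DEMO_ENABLE_64K_IPS_FIELD;
	return current;
}
```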



Re: [Intel-gfx] [PATCH 4/5] drm/i915/guc: group initialization of GuC objects

2017-10-06 Thread Sujaritha



On 10/04/2017 06:58 AM, Michal Wajdeczko wrote:
On Wed, 04 Oct 2017 00:57:00 +0200, Sujaritha Sundaresan wrote:



The previous patch has split up the initialization of some of the GuC
objects in 2 different functions, let's pull them back together.

v3: Group initialization of GuC objects

v2: Decoupling ADS together with logs (Daniele)

v3: Rebase

v4: Rebase

v5: Separated from previous patch

Cc: Anusha Srivatsa 
Cc: Daniele Ceraolo Spurio 
Cc: Michal Wajdeczko 
Cc: Oscar Mateo 
Cc: Sagar Arun Kamble 
Signed-off-by: Sujaritha Sundaresan 
---
 drivers/gpu/drm/i915/i915_guc_submission.c |  7 ++---
 drivers/gpu/drm/i915/intel_uc.c            | 41 +-
 drivers/gpu/drm/i915/intel_uc.h    |  4 +--
 3 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c456c55..a351339 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -920,9 +920,8 @@ void i915_guc_policies_init(struct guc_policies *policies)
  * Set up the memory resources to be shared with the GuC (via the GGTT)
  * at firmware loading time.
  */
-int i915_guc_submission_init(struct drm_i915_private *dev_priv)
+int i915_guc_submission_shared_objects_init(struct intel_guc *guc)


Hmm, is "stage_ids" also considered as "shared object" ?
See ida_init(&guc->stage_ids); later in this function.

Also, since the function name starts with "i915", there is no reason to
change the parameter from dev_priv to guc.



I was actually undecided whether this change was worth doing. It seems to
be a better idea to keep the guc_submission stuff separate at a higher
level.

Sujaritha

 {
-    struct intel_guc *guc = &dev_priv->guc;
 struct i915_vma *vma;
 void *vaddr;
@@ -950,10 +949,8 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 return 0;
 }
-void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
+void i915_guc_submission_shared_objects_fini(struct intel_guc *guc)
 {
-    struct intel_guc *guc = &dev_priv->guc;
-
 ida_destroy(&guc->stage_ids);
 i915_gem_object_unpin_map(guc->stage_desc_pool->obj);
 i915_vma_unpin_and_release(&guc->stage_desc_pool);
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 732f188..69239e4 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -423,13 +423,33 @@ static int guc_shared_objects_init(struct intel_guc *guc)
ret = guc_ads_create(guc);
 if (ret < 0)
-    intel_guc_log_destroy(guc);
+    goto err_logs;
+
+    if (i915_modparams.enable_guc_submission) {
+    /*
+ * This is stuff we need to have available at fw load time
+ * if we are planning to enable submission later
+ */
+    ret = i915_guc_submission_shared_objects_init(guc);
+    if (ret)
+    goto err_ads;
+    }
+
+    return 0;
+
+err_ads:
+    guc_ads_destroy(guc);
+err_logs:
+    intel_guc_log_destroy(guc);
return ret;
 }
static void guc_shared_objects_fini(struct intel_guc *guc)
 {
+    if (i915_modparams.enable_guc_submission)
+    i915_guc_submission_shared_objects_fini(guc);
+
 guc_ads_destroy(guc);
 intel_guc_log_destroy(guc);
 }
@@ -452,16 +472,6 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
 if (ret)
 goto err_guc;
-    if (i915_modparams.enable_guc_submission) {
-    /*
- * This is stuff we need to have available at fw load time
- * if we are planning to enable submission later
- */
-    ret = i915_guc_submission_init(dev_priv);
-    if (ret)
-    goto err_shared;
-    }
-
 /* init WOPCM */
 I915_WRITE(GUC_WOPCM_SIZE, intel_guc_wopcm_size(dev_priv));
 I915_WRITE(DMA_GUC_WOPCM_OFFSET,
@@ -481,7 +491,7 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
  */
 ret = __intel_uc_reset_hw(dev_priv);
 if (ret)
-    goto err_submission;
+    goto err_shared;
    intel_huc_init_hw(&dev_priv->huc);
 ret = intel_guc_init_hw(&dev_priv->guc);
@@ -526,11 +536,8 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
 gen9_disable_guc_interrupts(dev_priv);
 err_log_capture:
 guc_capture_load_err_log(guc);
-err_submission:
-    if (i915_modparams.enable_guc_submission)
-    i915_guc_submission_fini(dev_priv);
 err_shared:
-    guc_shared_objects_fini(guc);
+    guc_shared_objects_fini(guc);


???


 err_guc:
 i915_ggtt_disable_guc(dev_priv);
@@ -567,7 +574,7 @@ void intel_uc_fini_hw(struct drm_i915_private *dev_priv)
if (i915_modparams.enable_guc_submission) {
 gen9_disable_guc_interrupts(dev_priv);
-    i915_guc_submission_fini(dev_priv);
+    i915_guc_submission_shared_objects_fini(dev_priv);
 }
guc_shared_objects_fini(&dev_priv->guc);
diff --git a/drivers/gpu/drm/i915/intel_uc.h 
b/drivers/gpu/drm/i915/
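
The reworked guc_shared_objects_init() in the quoted patch uses the usual
goto-unwind pattern: each failure jumps to a label that tears down only what
was already set up, in reverse order. A self-contained sketch of that
pattern follows; struct demo_guc and the fail_at parameter are hypothetical
and exist purely to make the unwind order observable.

```c
#include <stdbool.h>

/* Hypothetical flags standing in for the log, ADS and submission objects. */
struct demo_guc {
	bool log;
	bool ads;
	bool submission;
};

/* fail_at: 0 = no failure, 1 = log fails, 2 = ADS fails, 3 = submission fails. */
int demo_shared_objects_init(struct demo_guc *guc, int fail_at)
{
	int ret = -1;

	if (fail_at == 1)
		return ret;
	guc->log = true;

	if (fail_at == 2)
		goto err_log;
	guc->ads = true;

	if (fail_at == 3)
		goto err_ads;
	guc->submission = true;

	return 0;

err_ads:
	guc->ads = false;       /* undo the ADS step */
err_log:
	guc->log = false;       /* undo the log step */
	return ret;
}
```

A failure in the last step leaves nothing allocated, which is the property
the err_ads and err_logs labels in the patch are meant to provide.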

Re: [Intel-gfx] [PATCH 08/21] drm/i915: align the vma start to the largest gtt page size

2017-10-06 Thread Chris Wilson
Quoting Matthew Auld (2017-10-06 15:50:28)
> For the 48b PPGTT try to align the vma start address to the required
> page size boundary to guarantee we use said page size in the gtt. If we
> are dealing with multiple page sizes, we can't guarantee anything and
> just align to the largest. For soft pinning and objects which need to be
> tightly packed into the lower 32bits we don't force any alignment.
> 
> v2: various improvements suggested by Chris
> 
> v3: use set_pages and better placement of page_sizes
> 
> v4: prefer upper_32_bits()
> 
> v5: assign vma->page_sizes = vma->obj->page_sizes directly
> prefer sizeof(vma->page_sizes)
> 
> Signed-off-by: Matthew Auld 
> Cc: Joonas Lahtinen 
> Cc: Chris Wilson 
> Reviewed-by: Chris Wilson 
> Reviewed-by: Joonas Lahtinen 
> ---
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 49bf49571e47..5067eab27829 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -493,6 +493,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> if (ret)
> goto err_clear;
> } else {
> +   /*
> +* We only support huge gtt pages through the 48b PPGTT,
> +* however we also don't want to force any alignment for
> +* objects which need to be tightly packed into the low 32bits.
> +*/
> +   if (upper_32_bits(end) &&

Bah, this assumed PIN_ZONE_4G behaviour and forgot about 4G GGTT. :|

Insert a sly
  !i915_vma_is_ggtt(vma) &&
here. Or use upper_32_bits(end-1). Hmm. Atm we have the pervasive
assumption that GGTT is capped at 4G, so we could use end-1 with a
comment.

The theory about not wanting to waste space in the low 4G is theory no
more!
-Chris
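
The difference between testing end and end - 1 is easiest to see at the 4GiB
boundary. A tiny sketch under the assumption that end is an exclusive,
non-zero upper bound; the demo_* names, including the v5/v6 suffixes that
follow the changelog numbering, are hypothetical.

```c
#include <stdint.h>

/* Userspace model of upper_32_bits(): bits 63..32 of a 64-bit value. */
uint32_t demo_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Check as posted in v5: also true for a range ending at exactly 4GiB. */
int demo_wants_alignment_v5(uint64_t end)
{
	return demo_upper_32_bits(end) != 0;
}

/* Check with end - 1: false for any range that fits within 4GiB. */
int demo_wants_alignment_v6(uint64_t end)
{
	return demo_upper_32_bits(end - 1) != 0;
}
```

With a 4GiB GGTT, end is exactly 1 << 32, so the first form would force the
extra alignment while the second leaves it alone.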


Re: [Intel-gfx] ✓ Fi.CI.IGT: success for huge gtt pages (rev13)

2017-10-06 Thread Chris Wilson
Quoting Patchwork (2017-10-06 22:29:20)
> == Series Details ==
> 
> Series: huge gtt pages (rev13)
> URL   : https://patchwork.freedesktop.org/series/25118/
> State : success
> 
> == Summary ==
> 
> Test kms_setmode:
> Subgroup basic:
> fail   -> PASS   (shard-hsw) fdo#99912
> 
> fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
> 
> shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10117s
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/shards.html

An oddity pops out, ENOSPC from gem_exec_schedule/wide-render. At a
guess, overallocation?
-Chris


[Intel-gfx] ✓ Fi.CI.IGT: success for huge gtt pages (rev13)

2017-10-06 Thread Patchwork
== Series Details ==

Series: huge gtt pages (rev13)
URL   : https://patchwork.freedesktop.org/series/25118/
State : success

== Summary ==

Test kms_setmode:
Subgroup basic:
fail   -> PASS   (shard-hsw) fdo#99912

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10117s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/shards.html


Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port

2017-10-06 Thread Manasi Navare
On Fri, Oct 06, 2017 at 06:19:12PM -0300, Paulo Zanoni wrote:
> On Fri, 2017-10-06 at 10:45 +, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: series starting with [1/2] drm/i915: avoid unnecessary call
> > to intel_hpd_pin_to_port
> > URL   : https://patchwork.freedesktop.org/series/31459/
> > State : warning
> > 
> > == Summary ==
> > 
> > Series 31459v1 series starting with [1/2] drm/i915: avoid unnecessary
> > call to intel_hpd_pin_to_port
> > https://patchwork.freedesktop.org/api/1.0/series/31459/revisions/1/mbox/
> > 
> > Test chamelium:
> > Subgroup dp-crc-fast:
> > pass   -> FAIL   (fi-kbl-7500u) fdo#102514
> > Test gem_ctx_switch:
> > Subgroup basic-default:
> > pass   -> INCOMPLETE (fi-cnl-y) fdo#103027
> > Test gem_exec_suspend:
> > Subgroup basic-s3:
> > pass   -> DMESG-WARN (fi-cfl-s) fdo#103026
> > Subgroup basic-s4-devices:
> > pass   -> DMESG-WARN (fi-kbl-7500u)
> 
> [  242.023771] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not
> signal timeout (has irq: 1)!
> 
> I do not believe this is caused by my patches. This test on this
> machine is failing in many other recent patch series, but with
> different error messages. Looks very unstable.
> 
>

Yes this is the system where we have had these messages due to
LSPCON issue recently.

Manasi
 
> > 
> > fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
> > fdo#103027 https://bugs.freedesktop.org/show_bug.cgi?id=103027
> > fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
> > 
> > fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:461s
> > fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
> > fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:390s
> > fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:573s
> > fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
> > fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
> > fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:537s
> > fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:538s
> > fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:525s
> > fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:553s
> > fi-cnl-y         total:31   pass:21   dwarn:0   dfail:0   fail:0   skip:9
> > fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
> > fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:599s
> > fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:439s
> > fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:416s
> > fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:468s
> > fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
> > fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
> > fi-kbl-7500u     total:289  pass:262  dwarn:2   dfail:0   fail:1   skip:24  time:501s
> > fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:584s
> > fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:499s
> > fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:598s
> > fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:661s
> > fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
> > fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:663s
> > fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:534s
> > fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:521s
> > fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
> > fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:578s
> > fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:437s
> > 
> > 97c9e99b242fe40bbda48ba2bcaed07c47fba085 drm-tip: 2017y-10m-06d-09h-07m-21s UTC integration manifest
> > d0674fac8e07 drm/i915: avoid division by zero on cnl_calc_wrpll_link
> > 7d8046f85adf drm/i915: avoid unnecessary call to
> > intel_hpd_pin_to_port
> > 
> > == Logs ==
> > 
> > For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5919/
> 

Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915: avoid unnecessary call to intel_hpd_pin_to_port

2017-10-06 Thread Paulo Zanoni
On Fri, 2017-10-06 at 10:45 +, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [1/2] drm/i915: avoid unnecessary call
> to intel_hpd_pin_to_port
> URL   : https://patchwork.freedesktop.org/series/31459/
> State : warning
> 
> == Summary ==
> 
> Series 31459v1 series starting with [1/2] drm/i915: avoid unnecessary
> call to intel_hpd_pin_to_port
> https://patchwork.freedesktop.org/api/1.0/series/31459/revisions/1/mbox/
> 
> Test chamelium:
> Subgroup dp-crc-fast:
> pass   -> FAIL   (fi-kbl-7500u) fdo#102514
> Test gem_ctx_switch:
> Subgroup basic-default:
> pass   -> INCOMPLETE (fi-cnl-y) fdo#103027
> Test gem_exec_suspend:
> Subgroup basic-s3:
> pass   -> DMESG-WARN (fi-cfl-s) fdo#103026
> Subgroup basic-s4-devices:
> pass   -> DMESG-WARN (fi-kbl-7500u)

[  242.023771] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not signal timeout (has irq: 1)!

I do not believe this is caused by my patches. This test on this
machine is failing in many other recent patch series, but with
different error messages. Looks very unstable.


> 
> fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
> fdo#103027 https://bugs.freedesktop.org/show_bug.cgi?id=103027
> fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
> 
> fi-bdw-5557u  total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:461s
> fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
> fi-blb-e6850  total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:390s
> fi-bsw-n3050  total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:573s
> fi-bwr-2160   total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
> fi-bxt-dsi    total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
> fi-bxt-j4205  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:537s
> fi-byt-j1900  total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:538s
> fi-byt-n2820  total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:525s
> fi-cfl-s      total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:553s
> fi-cnl-y      total:31   pass:21   dwarn:0   dfail:0   fail:0   skip:9
> fi-elk-e7500  total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
> fi-glk-1      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:599s
> fi-hsw-4770   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:439s
> fi-hsw-4770r  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:416s
> fi-ilk-650    total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:468s
> fi-ivb-3520m  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
> fi-ivb-3770   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
> fi-kbl-7500u  total:289  pass:262  dwarn:2   dfail:0   fail:1   skip:24  time:501s
> fi-kbl-7560u  total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:584s
> fi-kbl-7567u  total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:499s
> fi-kbl-r      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:598s
> fi-pnv-d510   total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:661s
> fi-skl-6260u  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
> fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:663s
> fi-skl-6700k  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:534s
> fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:521s
> fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
> fi-snb-2520m  total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:578s
> fi-snb-2600   total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:437s
> 
> 97c9e99b242fe40bbda48ba2bcaed07c47fba085 drm-tip: 2017y-10m-06d-09h-07m-21s UTC integration manifest
> d0674fac8e07 drm/i915: avoid division by zero on cnl_calc_wrpll_link
> 7d8046f85adf drm/i915: avoid unnecessary call to intel_hpd_pin_to_port
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5919/


Re: [Intel-gfx] [PATCH igt] igt/gem_fence_thresh: Use streaming reads for verify

2017-10-06 Thread Chris Wilson
Quoting Chris Wilson (2017-08-23 13:55:55)
> At the moment, the verify tests use an extremely brutal write-read of
> every dword, degrading performance to UC. If we break those up into
> cachelines, we can do a wcb write/read at a time instead, roughly 8x
> faster. We lose the accuracy of the forced wcb flushes around every dword,
> but we are retaining the overall behaviour of checking reads following
> writes instead. To compensate, we do check a single dword write/read
> before using wcb aligned accesses.

This fixes one of the APL timeouts...
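
As a rough illustration of the idea in the quoted description, the sketch below writes and reads back one cacheline at a time instead of one dword at a time. It is only a sketch under stated assumptions: `write_verify_by_cacheline` is a hypothetical name, the buffer is ordinary memory rather than a fenced GTT mapping, and plain memcpy() stands in for the SSE4.1 streaming loads used by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE 64
#define DWORDS_PER_LINE (CACHELINE / sizeof(uint32_t))

/*
 * Hypothetical helper: write-then-verify one cacheline at a time.
 * Returns the index of the first mismatching dword, or -1 if every
 * dword read back as written. 'map' stands in for the mapping under
 * test; here it is just ordinary memory.
 */
static long write_verify_by_cacheline(uint32_t *map, size_t dwords)
{
	uint32_t line[DWORDS_PER_LINE];

	/* The sketch only handles whole cachelines. */
	assert(dwords % DWORDS_PER_LINE == 0);

	for (size_t i = 0; i < dwords; i += DWORDS_PER_LINE) {
		/* Expected pattern: each dword holds its own index. */
		for (size_t j = 0; j < DWORDS_PER_LINE; j++)
			line[j] = (uint32_t)(i + j);

		/* One cacheline-sized write ... */
		memcpy(&map[i], line, CACHELINE);

		/* ... followed by one cacheline-sized read back. */
		memcpy(line, &map[i], CACHELINE);

		for (size_t j = 0; j < DWORDS_PER_LINE; j++)
			if (line[j] != (uint32_t)(i + j))
				return (long)(i + j);
	}

	return -1;
}
```

Called on a buffer of OBJECT_SIZE / 4 dwords, this touches the mapping once per 64 bytes in each direction rather than once per dword, which is the batching the description relies on; whether the speed-up is roughly 8x depends on the real WC behaviour, which this sketch does not model.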

> Signed-off-by: Chris Wilson 
> ---
>  tests/gem_fence_thrash.c | 116 +--
>  1 file changed, 101 insertions(+), 15 deletions(-)
> 
> diff --git a/tests/gem_fence_thrash.c b/tests/gem_fence_thrash.c
> index 52095f26..3e1edb73 100644
> --- a/tests/gem_fence_thrash.c
> +++ b/tests/gem_fence_thrash.c
> @@ -30,7 +30,6 @@
>  #include "config.h"
>  #endif
>  
> -#include "igt.h"
>  #include 
>  #include 
>  #include 
> @@ -43,6 +42,12 @@
>  #include 
>  #include "drm.h"
>  
> +#include "igt.h"
> +#include "igt_x86.h"
> +
> +#define PAGE_SIZE 4096
> +#define CACHELINE 64
> +
>  #define OBJECT_SIZE (128*1024) /* restricted to 1MiB alignment on i915 fences */
>  
>  /* Before introduction of the LRU list for fences, allocation of a fence for a page
> @@ -104,15 +109,78 @@ bo_copy (void *_arg)
> return NULL;
>  }
>  
> +#if defined(__x86_64__) && !defined(__clang__)
> +#define MOVNT 512
> +
> +#pragma GCC push_options
> +#pragma GCC target("sse4.1")
> +
> +#include 
> +__attribute__((noinline))
> +static void copy_wc_page(void *dst, void *src)
> +{
> +   if (igt_x86_features() & SSE4_1) {
> +   __m128i *S = (__m128i *)src;
> +   __m128i *D = (__m128i *)dst;
> +
> +   for (int i = 0; i < PAGE_SIZE/CACHELINE; i++) {
> +   __m128i tmp[4];
> +
> +   tmp[0] = _mm_stream_load_si128(S++);
> +   tmp[1] = _mm_stream_load_si128(S++);
> +   tmp[2] = _mm_stream_load_si128(S++);
> +   tmp[3] = _mm_stream_load_si128(S++);
> +
> +   _mm_store_si128(D++, tmp[0]);
> +   _mm_store_si128(D++, tmp[1]);
> +   _mm_store_si128(D++, tmp[2]);
> +   _mm_store_si128(D++, tmp[3]);
> +   }
> +   } else
> +   memcpy(dst, src, PAGE_SIZE);
> +}
> +static void copy_wc_cacheline(void *dst, void *src)
> +{
> +   if (igt_x86_features() & SSE4_1) {
> +   __m128i *S = (__m128i *)src;
> +   __m128i *D = (__m128i *)dst;
> +   __m128i tmp[4];
> +
> +   tmp[0] = _mm_stream_load_si128(S++);
> +   tmp[1] = _mm_stream_load_si128(S++);
> +   tmp[2] = _mm_stream_load_si128(S++);
> +   tmp[3] = _mm_stream_load_si128(S++);
> +
> +   _mm_store_si128(D++, tmp[0]);
> +   _mm_store_si128(D++, tmp[1]);
> +   _mm_store_si128(D++, tmp[2]);
> +   _mm_store_si128(D++, tmp[3]);
> +   } else
> +   memcpy(dst, src, CACHELINE);
> +}
> +
> +#pragma GCC pop_options
> +
> +#else
> +static void copy_wc_page(void *dst, const void *src)
> +{
> +   memcpy(dst, src, PAGE_SIZE);
> +}
> +static void copy_wc_cacheline(void *dst, const void *src)
> +{
> +   memcpy(dst, src, CACHELINE);
> +}
> +#endif
> +
>  static void
>  _bo_write_verify(struct test *t)
>  {
> int fd = t->fd;
> int i, k;
> uint32_t **s;
> -   uint32_t v;
> unsigned int dwords = OBJECT_SIZE >> 2;
> const char *tile_str[] = { "none", "x", "y" };
> +   uint32_t tmp[PAGE_SIZE/sizeof(uint32_t)];
>  
> igt_assert(t->tiling >= 0 && t->tiling <= I915_TILING_Y);
> igt_assert_lt(0, t->num_surfaces);
> @@ -124,21 +192,39 @@ _bo_write_verify(struct test *t)
> s[k] = bo_create(fd, t->tiling);
>  
> for (k = 0; k < t->num_surfaces; k++) {
> -   volatile uint32_t *a = s[k];
> -
> -   for (i = 0; i < dwords; i++) {
> -   a[i] = i;
> -   v = a[i];
> -   igt_assert_f(v == i,
> -"tiling %s: write failed at %d (%x)\n",
> -tile_str[t->tiling], i, v);
> +   uint32_t *a = s[k];
> +
> +   a[0] = 0xdeadbeef;
> +   igt_assert_f(a[0] == 0xdeadbeef,
> +"tiling %s: write failed at start (%x)\n",
> +tile_str[t->tiling], a[0]);
> +
> +   a[dwords - 1] = 0xc0ffee;
> +   igt_assert_f(a[dwords - 1] == 0xc0ffee,
> +"tiling %s: write failed at end (%x)\n",
> +tile_str[t->tiling], a[dwords - 1]);
> +
> +   for (i = 0; i < dwords;

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Fix pointer-to-int conversion (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix pointer-to-int conversion (rev2)
URL   : https://patchwork.freedesktop.org/series/31488/
State : success

== Summary ==

shard-hsw total:2446 pass:1328 dwarn:6   dfail:0   fail:9   skip:1103 time:10066s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5932/shards.html


[Intel-gfx] ✓ Fi.CI.BAT: success for benchmark/gem_busy: Compare polling with syncobj_wait

2017-10-06 Thread Patchwork
== Series Details ==

Series: benchmark/gem_busy: Compare polling with syncobj_wait
URL   : https://patchwork.freedesktop.org/series/31507/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small 
priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

No testlist changes.

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u  total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:458s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:477s
fi-blb-e6850  total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:394s
fi-bsw-n3050  total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:566s
fi-bwr-2160   total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
fi-bxt-dsi    total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:531s
fi-bxt-j4205  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:532s
fi-byt-j1900  total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:546s
fi-byt-n2820  total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:523s
fi-cfl-s      total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:563s
fi-cnl-y      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:647s
fi-elk-e7500  total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:441s
fi-glk-1      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:608s
fi-hsw-4770   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:441s
fi-hsw-4770r  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:495s
fi-ivb-3770   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:476s
fi-kbl-7500u  total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:505s
fi-kbl-7560u  total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:591s
fi-kbl-7567u  total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:488s
fi-kbl-r      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:599s
fi-pnv-d510   total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:658s
fi-skl-6260u  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:472s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:664s
fi-skl-6700k  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:540s
fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:524s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:483s
fi-snb-2520m  total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:587s
fi-snb-2600   total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:441s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_306/


[Intel-gfx] [PATCH igt] benchmark/gem_busy: Compare polling with syncobj_wait

2017-10-06 Thread Chris Wilson
v2: Hook the syncobj array to the execbuf!

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 benchmarks/gem_busy.c | 75 ++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/benchmarks/gem_busy.c b/benchmarks/gem_busy.c
index f050454b..ce631d56 100644
--- a/benchmarks/gem_busy.c
+++ b/benchmarks/gem_busy.c
@@ -58,6 +58,15 @@
 #define DMABUF 0x4
 #define WAIT 0x8
 #define SYNC 0x10
+#define SYNCOBJ 0x20
+
+#define LOCAL_I915_EXEC_FENCE_ARRAY (1 << 19)
+struct local_gem_exec_fence {
+   uint32_t handle;
+   uint32_t flags;
+#define LOCAL_EXEC_FENCE_WAIT (1 << 0)
+#define LOCAL_EXEC_FENCE_SIGNAL (1 << 1)
+};
 
 static void gem_busy(int fd, uint32_t handle)
 {
@@ -109,11 +118,54 @@ static int sync_merge(int fd1, int fd2)
return data.fence;
 }
 
+static uint32_t __syncobj_create(int fd)
+{
+   struct local_syncobj_create {
+   uint32_t handle, flags;
+   } arg;
+#define LOCAL_IOCTL_SYNCOBJ_CREATE DRM_IOWR(0xBF, struct local_syncobj_create)
+
+   memset(&arg, 0, sizeof(arg));
+   ioctl(fd, LOCAL_IOCTL_SYNCOBJ_CREATE, &arg);
+
+   return arg.handle;
+}
+
+static uint32_t syncobj_create(int fd)
+{
+   uint32_t ret;
+
+   igt_assert_neq((ret = __syncobj_create(fd)), 0);
+
+   return ret;
+}
+
+#define LOCAL_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define LOCAL_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct local_syncobj_wait {
+   __u64 handles;
+   /* absolute timeout */
+   __s64 timeout_nsec;
+   __u32 count_handles;
+   __u32 flags;
+   __u32 first_signaled; /* only valid when not waiting all */
+   __u32 pad;
+};
+#define LOCAL_IOCTL_SYNCOBJ_WAIT   DRM_IOWR(0xC3, struct local_syncobj_wait)
+static int __syncobj_wait(int fd, struct local_syncobj_wait *args)
+{
+   int err = 0;
+   if (drmIoctl(fd, LOCAL_IOCTL_SYNCOBJ_WAIT, args))
+   err = -errno;
+   return err;
+}
+
 static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
 {
struct drm_i915_gem_execbuffer2 execbuf;
struct drm_i915_gem_exec_object2 obj[2];
struct drm_i915_gem_relocation_entry reloc[2];
+   struct local_gem_exec_fence syncobj;
unsigned engines[16];
unsigned nengine;
uint32_t *batch;
@@ -150,6 +202,15 @@ static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
return 77;
}
 
+   if (flags & SYNCOBJ) {
+   syncobj.handle = syncobj_create(fd);
+   syncobj.flags = LOCAL_EXEC_FENCE_SIGNAL;
+
+   execbuf.cliprects_ptr = (uintptr_t)&syncobj;
+   execbuf.num_cliprects = 1;
+   execbuf.flags |= LOCAL_I915_EXEC_FENCE_ARRAY;
+   }
+
if (ring == -1) {
nengine = 0;
for (ring = 1; ring < 16; ring++) {
@@ -235,6 +296,14 @@ static int loop(unsigned ring, int reps, int ncpus, unsigned flags)
struct pollfd pfd = { .fd = dmabuf, .events = POLLOUT };
for (int inner = 0; inner < 1024; inner++)
poll(&pfd, 1, 0);
+   } else if (flags & SYNCOBJ) {
+   struct local_syncobj_wait arg = {
+   .handles = to_user_pointer(&syncobj.handle),
+   .count_handles = 1,
+   };
+
+   for (int inner = 0; inner < 1024; inner++)
+   __syncobj_wait(fd, &arg);
} else if (flags & SYNC) {
struct pollfd pfd = { .fd = fence, .events = POLLOUT };
for (int inner = 0; inner < 1024; inner++)
@@ -275,7 +344,7 @@ int main(int argc, char **argv)
int ncpus = 1;
int c;
 
-   while ((c = getopt (argc, argv, "e:r:dfswWI")) != -1) {
+   while ((c = getopt (argc, argv, "e:r:dfsSwWI")) != -1) {
switch (c) {
case 'e':
if (strcmp(optarg, "rcs") == 0)
@@ -314,6 +383,10 @@ int main(int argc, char **argv)
flags |= SYNC;
break;
 
+   case 'S':
+   flags |= SYNCOBJ;
+   break;
+
case 'W':
flags |= WRITE;
break;
-- 
2.14.2



[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] drm/i915: Make i915_engine_info pretty 
printer to standalone
URL   : https://patchwork.freedesktop.org/series/31489/
State : failure

== Summary ==

Test kms_cursor_legacy:
Subgroup cursor-vs-flip-atomic-transitions:
pass   -> FAIL   (shard-hsw)
Test perf:
Subgroup polling:
pass   -> FAIL   (shard-hsw) fdo#102252

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-hsw total:2446 pass:1326 dwarn:6   dfail:0   fail:11  skip:1103 time:10112s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5931/shards.html


[Intel-gfx] ✓ Fi.CI.BAT: success for igt/gem_eio: Check hang/eio recovery during suspend

2017-10-06 Thread Patchwork
== Series Details ==

Series: igt/gem_eio: Check hang/eio recovery during suspend
URL   : https://patchwork.freedesktop.org/series/31485/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small 
priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

Testlist changes:
+igt@gem_eio@in-flight-suspend

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u  total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:457s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:479s
fi-blb-e6850  total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:393s
fi-bsw-n3050  total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:593s
fi-bwr-2160   total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:290s
fi-bxt-dsi    total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:535s
fi-bxt-j4205  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:539s
fi-byt-j1900  total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:549s
fi-byt-n2820  total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:543s
fi-cfl-s      total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:571s
fi-cnl-y      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:642s
fi-elk-e7500  total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:440s
fi-glk-1      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:607s
fi-hsw-4770   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:442s
fi-hsw-4770r  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:423s
fi-ivb-3520m  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:509s
fi-ivb-3770   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:477s
fi-kbl-7500u  total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:506s
fi-kbl-7560u  total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:590s
fi-kbl-7567u  total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:494s
fi-kbl-r      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:601s
fi-pnv-d510   total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:657s
fi-skl-6260u  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:479s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:664s
fi-skl-6700k  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:542s
fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:526s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:476s
fi-snb-2520m  total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:588s
fi-snb-2600   total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:437s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_305/


[Intel-gfx] ✗ Fi.CI.BAT: warning for igt/gem_exec_capture: Exercise readback of userptr

2017-10-06 Thread Patchwork
== Series Details ==

Series: igt/gem_exec_capture: Exercise readback of userptr
URL   : https://patchwork.freedesktop.org/series/31480/
State : warning

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small 
priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3188
aaf31e875e72 drm-tip: 2017y-10m-06d-17h-24m-22s UTC integration manifest

Testlist changes:
+igt@gem_exec_capture@userptr

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
Subgroup basic-reload:
pass   -> DMESG-WARN (fi-skl-6770hq)

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u  total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:460s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:473s
fi-blb-e6850  total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:397s
fi-bsw-n3050  total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:579s
fi-bwr-2160   total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
fi-bxt-dsi    total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:536s
fi-bxt-j4205  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:536s
fi-byt-j1900  total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:549s
fi-byt-n2820  total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:538s
fi-cfl-s      total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:571s
fi-cnl-y      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:643s
fi-elk-e7500  total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:445s
fi-glk-1      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:602s
fi-hsw-4770   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:447s
fi-hsw-4770r  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:504s
fi-ivb-3770   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
fi-kbl-7500u  total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:500s
fi-kbl-7560u  total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:581s
fi-kbl-7567u  total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:499s
fi-kbl-r      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:595s
fi-skl-6260u  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:472s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:662s
fi-skl-6700k  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:540s
fi-skl-6770hq total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  time:566s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
fi-snb-2520m  total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:584s
fi-snb-2600   total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:444s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_304/


[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Separate RC6, RPS, LLC ring Frequency management

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Separate RC6, RPS, LLC ring Frequency management
URL   : https://patchwork.freedesktop.org/series/31487/
State : success

== Summary ==

Test kms_setmode:
Subgroup basic:
fail   -> PASS   (shard-hsw) fdo#99912

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw total:2446 pass:1329 dwarn:6   dfail:0   fail:8   skip:1103 time:10141s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5930/shards.html


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Cancel the hotplug work when unregistering the connector (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector 
(rev2)
URL   : https://patchwork.freedesktop.org/series/31501/
State : success

== Summary ==

Series 31501v2 drm/i915: Cancel the hotplug work when unregistering the 
connector
https://patchwork.freedesktop.org/api/1.0/series/31501/revisions/2/mbox/

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (fi-byt-j1900) fdo#101705

fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u  total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:452s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:476s
fi-blb-e6850  total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:395s
fi-bsw-n3050  total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:561s
fi-bwr-2160   total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:288s
fi-bxt-dsi    total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:524s
fi-bxt-j4205  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:530s
fi-byt-j1900  total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:543s
fi-byt-n2820  total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:530s
fi-cfl-s      total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:560s
fi-cnl-y      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:623s
fi-elk-e7500  total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:444s
fi-glk-1      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:599s
fi-hsw-4770   total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:438s
fi-hsw-4770r  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:505s
fi-ivb-3770   total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:475s
fi-kbl-7500u  total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:502s
fi-kbl-7560u  total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:580s
fi-kbl-7567u  total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:495s
fi-kbl-r      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:599s
fi-skl-6260u  total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:469s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:661s
fi-skl-6700k  total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:530s
fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:525s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:473s
fi-snb-2520m  total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:580s
fi-snb-2600   total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:431s
fi-pnv-d510 failed to connect after reboot

aaf31e875e72b50f6a970c11f797b7f5b61a2681 drm-tip: 2017y-10m-06d-17h-24m-22s UTC 
integration manifest
e16fa7a43e0a drm/i915: Cancel the hotplug work when unregistering the connector

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5937/


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged

2017-10-06 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-06 15:20:09)
> On Fri, Oct 06, 2017 at 12:03:49PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2017-10-06 10:06:37)
> > > stop_machine is not really a locking primitive we should use, except
> > > when the hw folks tell us the hw is broken and that's the only way to
> > > work around it.
> > > 
> > > This patch tries to address the locking abuse of stop_machine() from
> > > 
> > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> > > Author: Chris Wilson 
> > > Date:   Tue Nov 22 14:41:21 2016 +
> > > 
> > > drm/i915: Stop the machine as we install the wedged submit_request 
> > > handler
> > > 
> > > Chris said part of the reason for going with stop_machine() was that
> > > it has no overhead for the fast-path. But these callbacks use irqsave
> > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
> > > 
> > > To stay as close as possible to the stop_machine semantics we first
> > > update all the submit function pointers to the nop handler, then call
> > > synchronize_rcu() to make sure no new requests can be submitted. This
> > > should give us exactly the huge barrier we want.
> > > 
> > > I pondered whether we should annotate engine->submit_request as __rcu
> > > and use rcu_assign_pointer and rcu_dereference on it. But the reason
> > > behind those is to make sure the compiler/cpu barriers are there for
> > > when you have an actual data structure you point at, to make sure all
> > > the writes are seen correctly on the read side. But we just have a
> > > function pointer, and .text isn't changed, so no need for these
> > > barriers and hence no need for annotations.
> > > 
> > > This should fix the following lockdep splat:
> > > 
> > > ==
> > > WARNING: possible circular locking dependency detected
> > > 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> > > --
> > > kworker/3:4/562 is trying to acquire lock:
> > >  (cpu_hotplug_lock.rw_sem){}, at: [] stop_machine+0x1c/0x40
> > > 
> > > but task is already holding lock:
> > >  (&dev->struct_mutex){+.+.}, at: [] i915_reset_device+0x1e8/0x260 [i915]
> > > 
> > > which lock already depends on the new lock.
> > > 
> > > the existing dependency chain (in reverse order) is:
> > > 
> > > -> #6 (&dev->struct_mutex){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__mutex_lock+0x86/0x9b0
> > >mutex_lock_interruptible_nested+0x1b/0x20
> > >i915_mutex_lock_interruptible+0x51/0x130 [i915]
> > >i915_gem_fault+0x209/0x650 [i915]
> > >__do_fault+0x1e/0x80
> > >__handle_mm_fault+0xa08/0xed0
> > >handle_mm_fault+0x156/0x300
> > >__do_page_fault+0x2c5/0x570
> > >do_page_fault+0x28/0x250
> > >page_fault+0x22/0x30
> > > 
> > > -> #5 (&mm->mmap_sem){}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__might_fault+0x68/0x90
> > >_copy_to_user+0x23/0x70
> > >filldir+0xa5/0x120
> > >dcache_readdir+0xf9/0x170
> > >iterate_dir+0x69/0x1a0
> > >SyS_getdents+0xa5/0x140
> > >entry_SYSCALL_64_fastpath+0x1c/0xb1
> > > 
> > > -> #4 (&sb->s_type->i_mutex_key#5){}:
> > >down_write+0x3b/0x70
> > >handle_create+0xcb/0x1e0
> > >devtmpfsd+0x139/0x180
> > >kthread+0x152/0x190
> > >ret_from_fork+0x27/0x40
> > > 
> > > -> #3 ((complete)&req.done){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >wait_for_common+0x58/0x210
> > >wait_for_completion+0x1d/0x20
> > >devtmpfs_create_node+0x13d/0x160
> > >device_add+0x5eb/0x620
> > >device_create_groups_vargs+0xe0/0xf0
> > >device_create+0x3a/0x40
> > >msr_device_create+0x2b/0x40
> > >cpuhp_invoke_callback+0xc9/0xbf0
> > >cpuhp_thread_fun+0x17b/0x240
> > >smpboot_thread_fn+0x18a/0x280
> > >kthread+0x152/0x190
> > >ret_from_fork+0x27/0x40
> > > 
> > > -> #2 (cpuhp_state-up){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >cpuhp_issue_call+0x133/0x1c0
> > >__cpuhp_setup_state_cpuslocked+0x139/0x2a0
> > >__cpuhp_setup_state+0x46/0x60
> > >page_writeback_init+0x43/0x67
> > >pagecache_init+0x3d/0x42
> > >start_kernel+0x3a8/0x3fc
> > >x86_64_start_reservations+0x2a/0x2c
> > >x86_64_start_kernel+0x6d/0x70
> > >verify_cpu+0x0/0xfb
> > > 
> > > -> #1 (cpuhp_state_mutex){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__mutex_lock+0x86/0x9b0
> > >mutex_lock_nested+0x1b/0x20
> > >__cpuhp_setup_state_cpuslocked+0x53/0x2a0
> > >__cpuhp_setup_state+0x46/0x60
> > >page_alloc_init+0x28/0x30
> > 

Re: [Intel-gfx] [PATCH v2] drm/i915: Order two completing nop_submit_request

2017-10-06 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-10-06 13:23:03)
> 
> On 06/10/2017 12:56, Chris Wilson wrote:
> > If two nop's (requests in-flight following a wedged device) complete at
> > the same time, the global_seqno value written to the HWSP is undefined
> > as the two threads are not serialized.
> > 
> > v2: Use irqsafe spinlock. We expect the callback may be called from
> > inside another irq spinlock, so we can't unconditionally restore irqs.
> > 
> > Fixes: ce1135c7de64 ("drm/i915: Complete requests in nop_submit_request")
> > Signed-off-by: Chris Wilson 
> > Cc: Tvrtko Ursulin 
> > Reviewed-by: Tvrtko Ursulin  #v1
> > ---
> >   drivers/gpu/drm/i915/i915_gem.c | 7 ++-
> >   1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index ab8c6946fea4..6a6974ed8f74 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3014,10 +3014,15 @@ void i915_gem_reset_finish(struct drm_i915_private 
> > *dev_priv)
> >   
> >   static void nop_submit_request(struct drm_i915_gem_request *request)
> >   {
> > + unsigned long flags;
> > +
> >   GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error));
> >   dma_fence_set_error(&request->fence, -EIO);
> > - i915_gem_request_submit(request);
> > +
> > + spin_lock_irqsave(&request->engine->timeline->lock, flags);
> > + __i915_gem_request_submit(request);
> > >   intel_engine_init_global_seqno(request->engine, request->global_seqno);
> > + spin_unlock_irqrestore(&request->engine->timeline->lock, flags);
> >   }
> >   
> >   static void engine_set_wedged(struct intel_engine_cs *engine)
> > 
> 
> Ooops..
> 
> Reviewed-by: Tvrtko Ursulin 

Thanks for asking the question that led to the discovery of the race and
then reviewing the results! Pushed,
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

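The race fixed by the patch above can be replayed in a small userspace sketch. It is a toy model, not i915 code: the toy_* names and the two-step split are assumptions made for illustration, and the interleaving is replayed on a single thread rather than with real interrupts or spinlocks. It only shows why assigning the seqno and writing it to the HWSP must happen as one critical section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of an engine; not the i915 structures. */
struct toy_engine {
	uint32_t last_seqno;	/* last global seqno assigned on submit */
	uint32_t hwsp_seqno;	/* value most recently written to the "HWSP" */
};

/* Step 1 of a nop submit: assign the next global seqno to the request. */
static uint32_t toy_submit(struct toy_engine *e)
{
	return ++e->last_seqno;
}

/* Step 2: publish that seqno as completed. */
static void toy_write_hwsp(struct toy_engine *e, uint32_t seqno)
{
	e->hwsp_seqno = seqno;
}

/* Unserialized: requests A and B interleave between their two steps. */
uint32_t replay_racy(void)
{
	struct toy_engine e = { 0, 0 };
	uint32_t a = toy_submit(&e);	/* A is assigned seqno 1 */
	uint32_t b = toy_submit(&e);	/* B is assigned seqno 2 */

	toy_write_hwsp(&e, b);		/* B publishes 2 */
	toy_write_hwsp(&e, a);		/* A publishes 1, HWSP goes backwards */
	return e.hwsp_seqno;
}

/* Serialized: each request performs both steps without interruption. */
uint32_t replay_serialized(void)
{
	struct toy_engine e = { 0, 0 };
	uint32_t a = toy_submit(&e);

	toy_write_hwsp(&e, a);

	uint32_t b = toy_submit(&e);

	toy_write_hwsp(&e, b);
	return e.hwsp_seqno;
}
```

In this model replay_racy() leaves the stale value in the HWSP slot while replay_serialized() leaves the latest one. In the patch the serialization comes from holding engine->timeline->lock across both calls.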

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged

2017-10-06 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-06 15:20:09)
> On Fri, Oct 06, 2017 at 12:03:49PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2017-10-06 10:06:37)
> > > stop_machine is not really a locking primitive we should use, except
> > > when the hw folks tell us the hw is broken and that's the only way to
> > > work around it.
> > > 
> > > This patch tries to address the locking abuse of stop_machine() from
> > > 
> > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> > > Author: Chris Wilson 
> > > Date:   Tue Nov 22 14:41:21 2016 +
> > > 
> > > drm/i915: Stop the machine as we install the wedged submit_request 
> > > handler
> > > 
> > > > Chris said part of the reason for going with stop_machine() was that
> > > > it adds no overhead to the fast-path. But these callbacks use irqsave
> > > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
> > > 
> > > To stay as close as possible to the stop_machine semantics we first
> > > update all the submit function pointers to the nop handler, then call
> > > synchronize_rcu() to make sure no new requests can be submitted. This
> > > should give us exactly the huge barrier we want.
> > > 
> > > I pondered whether we should annotate engine->submit_request as __rcu
> > > and use rcu_assign_pointer and rcu_dereference on it. But the reason
> > > behind those is to make sure the compiler/cpu barriers are there for
> > > when you have an actual data structure you point at, to make sure all
> > > the writes are seen correctly on the read side. But we just have a
> > > function pointer, and .text isn't changed, so no need for these
> > > barriers and hence no need for annotations.
> > > 
> > > > This should fix the following lockdep splat:
> > > 
> > > ==
> > > WARNING: possible circular locking dependency detected
> > > 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> > > --
> > > kworker/3:4/562 is trying to acquire lock:
> > >  (cpu_hotplug_lock.rw_sem){}, at: [] 
> > > stop_machine+0x1c/0x40
> > > 
> > > but task is already holding lock:
> > >  (&dev->struct_mutex){+.+.}, at: [] 
> > > i915_reset_device+0x1e8/0x260 [i915]
> > > 
> > > which lock already depends on the new lock.
> > > 
> > > the existing dependency chain (in reverse order) is:
> > > 
> > > -> #6 (&dev->struct_mutex){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__mutex_lock+0x86/0x9b0
> > >mutex_lock_interruptible_nested+0x1b/0x20
> > >i915_mutex_lock_interruptible+0x51/0x130 [i915]
> > >i915_gem_fault+0x209/0x650 [i915]
> > >__do_fault+0x1e/0x80
> > >__handle_mm_fault+0xa08/0xed0
> > >handle_mm_fault+0x156/0x300
> > >__do_page_fault+0x2c5/0x570
> > >do_page_fault+0x28/0x250
> > >page_fault+0x22/0x30
> > > 
> > > -> #5 (&mm->mmap_sem){}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__might_fault+0x68/0x90
> > >_copy_to_user+0x23/0x70
> > >filldir+0xa5/0x120
> > >dcache_readdir+0xf9/0x170
> > >iterate_dir+0x69/0x1a0
> > >SyS_getdents+0xa5/0x140
> > >entry_SYSCALL_64_fastpath+0x1c/0xb1
> > > 
> > > -> #4 (&sb->s_type->i_mutex_key#5){}:
> > >down_write+0x3b/0x70
> > >handle_create+0xcb/0x1e0
> > >devtmpfsd+0x139/0x180
> > >kthread+0x152/0x190
> > >ret_from_fork+0x27/0x40
> > > 
> > > -> #3 ((complete)&req.done){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >wait_for_common+0x58/0x210
> > >wait_for_completion+0x1d/0x20
> > >devtmpfs_create_node+0x13d/0x160
> > >device_add+0x5eb/0x620
> > >device_create_groups_vargs+0xe0/0xf0
> > >device_create+0x3a/0x40
> > >msr_device_create+0x2b/0x40
> > >cpuhp_invoke_callback+0xc9/0xbf0
> > >cpuhp_thread_fun+0x17b/0x240
> > >smpboot_thread_fn+0x18a/0x280
> > >kthread+0x152/0x190
> > >ret_from_fork+0x27/0x40
> > > 
> > > -> #2 (cpuhp_state-up){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >cpuhp_issue_call+0x133/0x1c0
> > >__cpuhp_setup_state_cpuslocked+0x139/0x2a0
> > >__cpuhp_setup_state+0x46/0x60
> > >page_writeback_init+0x43/0x67
> > >pagecache_init+0x3d/0x42
> > >start_kernel+0x3a8/0x3fc
> > >x86_64_start_reservations+0x2a/0x2c
> > >x86_64_start_kernel+0x6d/0x70
> > >verify_cpu+0x0/0xfb
> > > 
> > > -> #1 (cpuhp_state_mutex){+.+.}:
> > >__lock_acquire+0x1420/0x15e0
> > >lock_acquire+0xb0/0x200
> > >__mutex_lock+0x86/0x9b0
> > >mutex_lock_nested+0x1b/0x20
> > >__cpuhp_setup_state_cpuslocked+0x53/0x2a0
> > >__cpuhp_setup_state+0x46/0x60
> > >page_alloc_init+0x28/0x30
> > 

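The scheme described in the quoted commit message (point every submit hook at the nop handler, then wait until no submitter can still be running the old one) can be sketched in userspace. This is only an analogy under stated assumptions: a plain reader counter stands in for rcu_read_lock()/synchronize_rcu(), all toy_* names are hypothetical, and -5 merely mirrors -EIO. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>

typedef int (*submit_fn)(int request);

/* Normal path: pretend the request reached the hardware. */
static int real_submit(int request)
{
	return request;
}

/* Wedged path: complete the request with an error (-5 mirrors -EIO). */
static int nop_submit(int request)
{
	(void)request;
	return -5;
}

static _Atomic(submit_fn) submit_request = real_submit;
static atomic_int active_readers;

/* Read side: the counter brackets the call like a read-side critical section. */
int toy_submit(int request)
{
	atomic_fetch_add(&active_readers, 1);
	submit_fn fn = atomic_load(&submit_request);
	int ret = fn(request);
	atomic_fetch_sub(&active_readers, 1);
	return ret;
}

/* Write side: install the nop handler, then wait for in-flight submitters. */
void toy_set_wedged(void)
{
	atomic_store(&submit_request, nop_submit);
	while (atomic_load(&active_readers) != 0)
		;	/* crude stand-in for synchronize_rcu() */
}
```

Once toy_set_wedged() returns, every later toy_submit() sees the nop handler, and any caller that picked up the old pointer has already finished. That is the "huge barrier" property the commit message asks for, here without the lock-free read side that real RCU provides.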
[Intel-gfx] [PATCH v2] drm/i915: Cancel the hotplug work when unregistering the connector

2017-10-06 Thread Chris Wilson
When we unregister the connector, we may have a pending hotplug work.
This needs to be cancelled early during the teardown so that it does not
fire after we have freed the connector. Otherwise we may see something like:

 DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
 [ cut here ]
 WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 
mutex_destroy+0x4e/0x60
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem 
ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core 
prime_numbers i2c_hid [last unloaded: snd_hda_intel]
 CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U  
4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM 
RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 task: 8803c827aa40 task.stack: c952
 RIP: 0010:mutex_destroy+0x4e/0x60
 RSP: 0018:c9523d58 EFLAGS: 00010292
 RAX: 002a RBX: 88044fbef648 RCX: 
 RDX: 8001 RSI: 0001 RDI: 810f0cf0
 RBP: c9523d60 R08: 0001 R09: 0001
 R10: 0f21cb81 R11:  R12: 88044f71efc8
 R13: a02b3d20 R14: a02b3d90 R15: 880459b29308
 FS:  7f5df4d6e8c0() GS:88045d30() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 55ec51f00a18 CR3: 000451782006 CR4: 003606e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: fffe0ff0 DR7: 0400
 Call Trace:
  drm_fb_helper_fini+0xd9/0x130
  intel_fbdev_destroy+0x12/0x60 [i915]
  intel_fbdev_fini+0x28/0x30 [i915]
  intel_modeset_cleanup+0x45/0xa0 [i915]
  i915_driver_unload+0x92/0x180 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  pci_device_remove+0x39/0xb0
  device_release_driver_internal+0x15d/0x220
  driver_detach+0x40/0x80
  bus_remove_driver+0x58/0xd0
  driver_unregister+0x2c/0x40
  pci_unregister_driver+0x36/0xb0
  i915_exit+0x1a/0x8b [i915]
  SyS_delete_module+0x18c/0x1e0
  entry_SYSCALL_64_fastpath+0x1c/0xb1
 RIP: 0033:0x7f5df3286287
 RSP: 002b:7fff8e107cc8 EFLAGS: 0246 ORIG_RAX: 00b0
 RAX: ffda RBX: 81493a03 RCX: 7f5df3286287
 RDX: 0001 RSI: 0800 RDI: 564c7be02e48
 RBP: c9523f88 R08:  R09: 0080
 R10: 7f5df4d6e8c0 R11: 0246 R12: 
 R13: 7fff8e107eb0 R14:  R15: 
  ? __this_cpu_preempt_check+0x13/0x20
 Code: 00 00 5b 5d c3 e8 93 b9 3a 00 85 c0 74 ec 8b 05 e1 53 c3 01 85 c0 75 e2 
48 c7 c6 86 a6 c7 81 48 c7 c7 8b 8d c6 81 e8 03 ae 01 00 <0f> ff eb cb 0f 1f 40 
00 66 2e 0f 1f 84 00 00 00 00 00 55 48 b8
 ---[ end trace 08901ff1a77d30c6 ]---
 [drm:wait_panel_status [i915]] mask b80f value  status  
control 0060
 [drm:wait_panel_status [i915]] Wait complete
 [drm:edp_panel_vdd_on [i915]] PP_STATUS: 0x PP_CONTROL: 0x0068
 [drm:edp_panel_vdd_on [i915]] eDP port A panel power wasn't enabled
 [drm:drm_dp_read_desc] DP sink: OUI 00-1c-f8 dev-ID  HW-rev 0.0 SW-rev 7.49 
quirks 0x
 [drm:drm_edid_to_eld] ELD: no CEA Extension found
 [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:48:eDP-1] probed 
modes :
 [drm:drm_mode_debug_printmodeline] Modeline 49:"1920x1080" 60 138780 1920 1966 
1996 2080 1080 1082 1086 1112 0x48 0xa
 [drm:drm_mode_debug_printmodeline] Modeline 50:"1920x1080" 40 92520 1920 1966 
1996 2080 1080 1082 1086 1112 0x40 0xa
 general protection fault:  [#1] PREEMPT SMP
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem 
ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core 
prime_numbers i2c_hid [last unloaded: snd_hda_intel]
 CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G U  W   
4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM 
RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 Workqueue: events intel_dp_modeset_retry_work_fn [i915]
 task: 88045a5caa40 task.stack: c9378000
 RIP: 0010:drm_setup_crtcs+0x143/0xbf0
 RSP: 0018:c937bd20 EFLAGS: 00010202
 RAX: 6b6b6b6b6b6b6b6b RBX: 0002 RCX: 0001
 RDX: 0001 RSI: 0780 RDI: 
 RBP: c937bdb8 R08: 0001 R09: 0001
 R10: 0780 R11:  R12: 0002
 R13: 88044fbef4e8 R14: 0780 R15: 0438
 FS:  () GS:88045d20() knlGS:
 CS:  0010 DS:  ES:  

[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Order two completing nop_submit_request (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Order two completing nop_submit_request (rev2)
URL   : https://patchwork.freedesktop.org/series/31486/
State : warning

== Summary ==

Test kms_plane:
Subgroup plane-panning-bottom-right-suspend-pipe-B-planes:
pass   -> SKIP   (shard-hsw)

shard-hsw        total:2446 pass:1327 dwarn:6   dfail:0   fail:9   skip:1104 time:10071s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5929/shards.html


Re: [Intel-gfx] [PATCH] drm/i915: Silence compiler warning for hsw_power_well_enable()

2017-10-06 Thread Chris Wilson
Quoting Imre Deak (2017-10-02 12:42:24)
> On Mon, Oct 02, 2017 at 11:04:16AM +0100, Chris Wilson wrote:
> > Not all compilers are able to determine that pg is guarded by wait_fuses
> > and so may think that pg is used uninitialized.
> > 
> > Reported-by: Geert Uytterhoeven 
> > Fixes: b2891eb2531e ("drm/i915/hsw+: Add has_fuses power well attribute")
> > Signed-off-by: Chris Wilson 
> > Cc: Imre Deak 
> > Cc: Arkadiusz Hiler 
> 
> Reviewed-by: Imre Deak 

Thanks for the review, applied so we should be off the nag list in the
cycle.
-Chris

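The warning discussed in this thread follows a common C pattern: a variable is assigned and read only under the same flag, yet a compiler that does not correlate the two branches may report it as possibly uninitialized. The sketch below illustrates the pattern with hypothetical toy_* names. Giving pg an initial value is one generic way to silence such a warning and is an assumption here, not necessarily what the applied patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for looking up the power gate of a well. */
static int toy_lookup_pg(int well_id)
{
	return well_id + 1;
}

/*
 * pg is written and read only when wait_fuses is true. The explicit
 * initial value exists purely to keep the compiler quiet; it is never
 * read when wait_fuses is false.
 */
int toy_power_well_enable(int well_id, bool has_fuses)
{
	bool wait_fuses = has_fuses;
	int pg = 0;
	int waited_on = -1;	/* -1 means no fuse wait was needed */

	if (wait_fuses)
		pg = toy_lookup_pg(well_id);

	/* ... the power well itself would be enabled here ... */

	if (wait_fuses)
		waited_on = pg;

	return waited_on;
}
```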

[Intel-gfx] ✓ Fi.CI.BAT: success for igt/gem_memfd: Exercise hugepages and memfd

2017-10-06 Thread Patchwork
== Series Details ==

Series: igt/gem_memfd: Exercise hugepages and memfd
URL   : https://patchwork.freedesktop.org/series/31460/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small 
priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3186
cb32cc2ad1c3 drm-tip: 2017y-10m-06d-15h-01m-44s UTC integration manifest

Testlist changes:
+igt@gem_memfd@1G
+igt@gem_memfd@2M
+igt@gem_memfd@64k

Test chamelium:
Subgroup hdmi-crc-fast:
pass   -> DMESG-WARN (fi-skl-6700k) fdo#103019
Test drv_module_reload:
Subgroup basic-reload-inject:
incomplete -> PASS   (fi-cfl-s) fdo#103022

fdo#103019 https://bugs.freedesktop.org/show_bug.cgi?id=103019
fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  
time:457s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  
time:473s
fi-blb-e6850 total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  
time:404s
fi-bsw-n3050 total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  
time:570s
fi-bwr-2160  total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 
time:288s
fi-bxt-dsi   total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  
time:528s
fi-bxt-j4205 total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:535s
fi-byt-j1900 total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  
time:545s
fi-byt-n2820 total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  
time:532s
fi-cfl-s total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  
time:568s
fi-cnl-y total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:645s
fi-elk-e7500 total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  
time:439s
fi-glk-1 total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  
time:606s
fi-hsw-4770  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:451s
fi-hsw-4770r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:420s
fi-ivb-3520m total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:503s
fi-ivb-3770  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:484s
fi-kbl-7500u total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:501s
fi-kbl-7560u total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  
time:585s
fi-kbl-7567u total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  
time:489s
fi-kbl-r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:599s
fi-skl-6260u total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:472s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  
time:662s
fi-skl-6700k total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:537s
fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:524s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  
time:479s
fi-snb-2520m total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  
time:589s
fi-snb-2600  total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  
time:435s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_303/


[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Cancel the hotplug work when unregistering the connector

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Cancel the hotplug work when unregistering the connector
URL   : https://patchwork.freedesktop.org/series/31501/
State : warning

== Summary ==

Series 31501v1 drm/i915: Cancel the hotplug work when unregistering the 
connector
https://patchwork.freedesktop.org/api/1.0/series/31501/revisions/1/mbox/

Test drv_module_reload:
Subgroup basic-reload:
pass   -> DMESG-WARN (fi-blb-e6850)
pass   -> DMESG-WARN (fi-pnv-d510)
pass   -> DMESG-WARN (fi-bwr-2160)
pass   -> DMESG-WARN (fi-elk-e7500)
pass   -> DMESG-WARN (fi-snb-2520m)
pass   -> DMESG-WARN (fi-snb-2600)
pass   -> DMESG-WARN (fi-ivb-3520m)
pass   -> DMESG-WARN (fi-ivb-3770)
pass   -> DMESG-WARN (fi-byt-j1900)
pass   -> DMESG-WARN (fi-hsw-4770)
pass   -> DMESG-WARN (fi-hsw-4770r)
pass   -> DMESG-WARN (fi-bdw-5557u)
pass   -> DMESG-WARN (fi-bdw-gvtdvm)
pass   -> DMESG-WARN (fi-bsw-n3050)
pass   -> DMESG-WARN (fi-skl-6260u)
pass   -> DMESG-WARN (fi-skl-6700k)
pass   -> DMESG-WARN (fi-skl-6770hq)
pass   -> DMESG-WARN (fi-skl-gvtdvm)
pass   -> DMESG-WARN (fi-bxt-dsi)
pass   -> DMESG-WARN (fi-bxt-j4205)
pass   -> DMESG-WARN (fi-kbl-7500u)
pass   -> DMESG-WARN (fi-kbl-7560u)
pass   -> DMESG-WARN (fi-kbl-r)
pass   -> DMESG-WARN (fi-glk-1)
pass   -> DMESG-WARN (fi-cfl-s)
pass   -> DMESG-WARN (fi-cnl-y)
Subgroup basic-no-display:
pass   -> DMESG-WARN (fi-cfl-s) fdo#103022 +1

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u total:289  pass:267  dwarn:1   dfail:0   fail:0   skip:21  
time:462s
fi-bdw-gvtdvm total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:480s
fi-blb-e6850 total:289  pass:222  dwarn:2   dfail:0   fail:0   skip:65  
time:403s
fi-bsw-n3050 total:289  pass:242  dwarn:1   dfail:0   fail:0   skip:46  
time:566s
fi-bwr-2160  total:289  pass:182  dwarn:1   dfail:0   fail:0   skip:106 
time:286s
fi-bxt-dsi   total:289  pass:258  dwarn:1   dfail:0   fail:0   skip:30  
time:524s
fi-bxt-j4205 total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  
time:531s
fi-byt-j1900 total:289  pass:252  dwarn:2   dfail:0   fail:0   skip:35  
time:539s
fi-cfl-s total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  
time:579s
fi-cnl-y total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  
time:649s
fi-elk-e7500 total:289  pass:228  dwarn:1   dfail:0   fail:0   skip:60  
time:435s
fi-glk-1 total:289  pass:260  dwarn:1   dfail:0   fail:0   skip:28  
time:597s
fi-hsw-4770  total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  
time:438s
fi-hsw-4770r total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  
time:419s
fi-ivb-3520m total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  
time:503s
fi-ivb-3770  total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  
time:475s
fi-kbl-7500u total:289  pass:263  dwarn:2   dfail:0   fail:0   skip:24  
time:506s
fi-kbl-7560u total:289  pass:269  dwarn:1   dfail:0   fail:0   skip:19  
time:583s
fi-kbl-7567u total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  
time:488s
fi-kbl-r total:289  pass:261  dwarn:1   dfail:0   fail:0   skip:27  
time:591s
fi-pnv-d510  total:289  pass:221  dwarn:2   dfail:0   fail:0   skip:66  
time:667s
fi-skl-6260u total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  
time:467s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  
time:656s
fi-skl-6700k total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:570s
fi-skl-6770hq total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  
time:509s
fi-skl-gvtdvm total:289  pass:265  dwarn:1   dfail:0   fail:0   skip:23  
time:471s
fi-snb-2520m total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  
time:581s
fi-snb-2600  total:289  pass:248  dwarn:1   dfail:0   fail:0   skip:40  
time:433s
fi-byt-n2820 failed to connect after reboot

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC 
integration manifest
9af48be91aae drm/i915: Cancel the hotplug work when unregistering the connector

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5936/


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with drm/i915: Preallocate our mmu notifier workequeu 
to unbreak cpu hotplug deadlock (rev2)
URL   : https://patchwork.freedesktop.org/series/31476/
State : success

== Summary ==

Series 31476v2 series starting with drm/i915: Preallocate our mmu notifier 
workequeu to unbreak cpu hotplug deadlock
https://patchwork.freedesktop.org/api/1.0/series/31476/revisions/2/mbox/

Test drv_module_reload:
Subgroup basic-reload-inject:
incomplete -> PASS   (fi-cfl-s) fdo#103022

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  
time:455s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  
time:482s
fi-blb-e6850 total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  
time:397s
fi-bsw-n3050 total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  
time:566s
fi-bwr-2160  total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 
time:287s
fi-bxt-dsi   total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  
time:526s
fi-byt-j1900 total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  
time:544s
fi-byt-n2820 total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  
time:527s
fi-cfl-s total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  
time:559s
fi-cnl-y total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:618s
fi-elk-e7500 total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  
time:430s
fi-glk-1 total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  
time:612s
fi-hsw-4770  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:440s
fi-hsw-4770r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:417s
fi-ivb-3520m total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:504s
fi-ivb-3770  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:482s
fi-kbl-7500u total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:505s
fi-kbl-7560u total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  
time:584s
fi-kbl-7567u total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  
time:501s
fi-kbl-r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:597s
fi-pnv-d510  total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  
time:656s
fi-skl-6260u total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:469s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  
time:659s
fi-skl-6700k total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  
time:531s
fi-skl-6770hq total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:516s
fi-skl-gvtdvm total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  
time:476s
fi-snb-2520m total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  
time:582s
fi-snb-2600  total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  
time:432s
fi-bxt-j4205 failed to connect after reboot

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC 
integration manifest
999c4f026e85 drm/i915: Use rcu instead of stop_machine in set_wedged
cef3c4054a61 drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu 
hotplug deadlock

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5935/


Re: [Intel-gfx] [PATCH 01/10] drm/i915/guc: Precompute GuC shared data offset

2017-10-06 Thread Daniele Ceraolo Spurio



On 06/10/17 05:35, Michał Winiarski wrote:

On Thu, Oct 05, 2017 at 05:02:39PM +, Daniele Ceraolo Spurio wrote:



On 05/10/17 02:33, Chris Wilson wrote:

Quoting Michał Winiarski (2017-10-05 10:13:40)

We're using the first page of kernel context state to share data with GuC,
let's precompute the ggtt offset at GuC initialization time rather than
every time we're using GuC actions.


So LRC_GUCSHR_PN is still 0. Plans for that to change?



This is a requirement from the GuC side. GuC expects each context to have
that extra page before the PPHWSP and it uses it to dump some per-lrc info,
part of which is for internal use and part is info for the host (although we
don't need/use it).
On certain events (reset/preempt/suspend etc) GuC will dump extra info and
this is done in the page provided in the H2G. I think we use the one of the
default ctx just for simplicity, but it should be possible to use a
different one, possibly not attached to any lrc if needed, but I'm not sure
if this has ever been tested.


Done that (allocating a separate object for GuC shared data), seems to
work just fine on its own. Except if we try to remove the first page from
contexts. It seems to make GuC upset even though we're not using actions.



Yep, as I mentioned above GuC dumps runtime info about each lrc it 
handles in that page (e.g. if an lrc has been submitted via proxy), so 
it is probably going to either page-fault or write in the wrong memory 
if that page is not allocated.



We could still do that, though without removing the extra page we're just being
more wasteful. But perhaps it's cleaner that way? Having a separate object managed
in GuC code rather than reusing random places in context state? Thoughts?



This is similar to what we used to do by using the PPHWSP of the default 
ctx as the global HWSP. Personally I'd prefer to keep it separate as it 
feels cleaner and a single extra page shouldn't hurt us that much, but 
there was some push-back when I suggested the same for the HWSP.


Daniele


-Michał



-Daniele


Atm, we should be changing one pointer deref for another...
-Chris





Re: [Intel-gfx] [PATCH 2/2] drm/i915: Prove an assert for when we expect forcewake to be held

2017-10-06 Thread Chris Wilson
s/Prove/Provide/

Quoting Chris Wilson (2017-10-06 15:54:59)
> Add assert_forcewakes_active() (the complementary function to
> assert_forcewakes_inactive) that documents the requirement of a
> function for its callers to be holding the forcewake ref (i.e. the
> function is part of a sequence over which RC6 must be prevented).
> 
> One such example is during ringbuffer reset, where RC6 must be prevented
> across the whole reinitialisation sequence.
> 
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 
> Cc: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 11 ++-
>  drivers/gpu/drm/i915/intel_uncore.c | 12 
>  drivers/gpu/drm/i915/intel_uncore.h |  2 ++
>  3 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 05c08b0bc172..4285f09ff8b8 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -579,7 +579,16 @@ static int init_ring_common(struct intel_engine_cs *engine)
>  static void reset_ring_common(struct intel_engine_cs *engine,
>   struct drm_i915_gem_request *request)
>  {
> -   /* Try to restore the logical GPU state to match the continuation
> +   /*
> +* RC6 must be prevented until the reset is complete and the engine
> +* reinitialised. If it occurs in the middle of this sequence, the
> +* state written to/loaded from the power context is ill-defined (e.g.
> +* the PP_BASE_DIR may be lost).

PP_DIR_BASE
-Chris

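The idea behind assert_forcewakes_active(), as described in the quoted commit message, is that a function documents and checks that its caller already holds a forcewake reference. A rough userspace sketch follows, with hypothetical toy_* names and a bare reference count standing in for the real uncore state. Returning -1 is an assumption made so the check can be tested; the kernel helper is described only as an assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the uncore forcewake bookkeeping. */
struct toy_uncore {
	unsigned int fw_refcount;	/* outstanding forcewake references */
};

void toy_forcewake_get(struct toy_uncore *u)
{
	u->fw_refcount++;
}

void toy_forcewake_put(struct toy_uncore *u)
{
	assert(u->fw_refcount > 0);	/* unbalanced put is a caller bug */
	u->fw_refcount--;
}

/* True while at least one reference is held, i.e. RC6 is being prevented. */
bool toy_forcewakes_active(const struct toy_uncore *u)
{
	return u->fw_refcount > 0;
}

/*
 * The reset sequence requires its caller to hold forcewake throughout.
 * Returns 0 on success, -1 if the precondition is violated.
 */
int toy_reset_ring(struct toy_uncore *u)
{
	if (!toy_forcewakes_active(u))
		return -1;

	/* ... reset and reinitialise the ring here ... */
	return 0;
}
```

A caller would bracket the whole sequence with toy_forcewake_get() and toy_forcewake_put(), so the check inside toy_reset_ring() only fires when someone forgets to.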

[Intel-gfx] ✗ Fi.CI.IGT: warning for drm/i915: Try harder to finish the idle-worker (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Try harder to finish the idle-worker (rev2)
URL   : https://patchwork.freedesktop.org/series/29690/
State : warning

== Summary ==

Test kms_plane_multiple:
Subgroup legacy-pipe-C-tiling-none:
pass   -> SKIP   (shard-hsw)

shard-hsw        total:2446 pass:1327 dwarn:6   dfail:0   fail:9   skip:1104 time:10118s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5927/shards.html


[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] drm/i915/selftests: Hold the rpm/forcewake 
wakeref for the reset tests
URL   : https://patchwork.freedesktop.org/series/31498/
State : warning

== Summary ==

Series 31498v1 series starting with [1/2] drm/i915/selftests: Hold the 
rpm/forcewake wakeref for the reset tests
https://patchwork.freedesktop.org/api/1.0/series/31498/revisions/1/mbox/

Test gem_busy:
Subgroup basic-hang-default:
pass   -> DMESG-WARN (fi-snb-2520m)
Test gem_ringfill:
Subgroup basic-default-hang:
pass   -> DMESG-WARN (fi-ivb-3520m)
Test drv_module_reload:
Subgroup basic-reload-inject:
incomplete -> PASS   (fi-cfl-s) fdo#103022

fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  
time:452s
fi-bdw-gvtdvm total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  
time:472s
fi-blb-e6850 total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  
time:396s
fi-bsw-n3050 total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  
time:569s
fi-bwr-2160  total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 
time:287s
fi-bxt-dsi   total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  
time:519s
fi-bxt-j4205 total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:532s
fi-byt-j1900 total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  
time:541s
fi-byt-n2820 total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  
time:534s
fi-cfl-s total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  
time:554s
fi-cnl-y total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:626s
fi-elk-e7500 total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  
time:440s
fi-glk-1 total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  
time:604s
fi-hsw-4770  total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:441s
fi-hsw-4770r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:419s
fi-ivb-3520m total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  
time:501s
fi-ivb-3770  total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  
time:478s
fi-kbl-7500u total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  
time:506s
fi-kbl-7560u total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  
time:585s
fi-kbl-7567u total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  
time:503s
fi-kbl-r total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  
time:594s
fi-pnv-d510  total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  
time:657s
fi-skl-6260u total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:477s
fi-skl-6700hq total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  
time:660s
fi-skl-6700k total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  
time:537s
fi-skl-6770hqtotal:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  
time:568s
fi-skl-gvtdvmtotal:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  
time:476s
fi-snb-2520m total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  
time:577s
fi-snb-2600  total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  
time:440s

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC 
integration manifest
7793b7e9953b drm/i915: Prove an assert for when we expect forcewake to be held
3e2ce344584e drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset 
tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5934/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

2017-10-06 Thread Tvrtko Ursulin


On 06/10/2017 16:52, Daniel Vetter wrote:

4.14-rc1 gained the fancy new cross-release support in lockdep, which
seems to have uncovered a few more rules about what is allowed and
isn't.

This one here seems to indicate that allocating a work-queue while
holding mmap_sem is a no-go, so let's try to preallocate it.

Of course another way to break this chain would be somewhere in the
cpu hotplug code, since this isn't the only trace we're finding now
which goes through msr_device_create.

Full lockdep splat:

==
WARNING: possible circular locking dependency detected
4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U
--
prime_mmap/1551 is trying to acquire lock:
  (cpu_hotplug_lock.rw_sem){}, at: [] 
apply_workqueue_attrs+0x17/0x50

but task is already holding lock:
  (&dev_priv->mm_lock){+.+.}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&dev_priv->mm_lock){+.+.}:
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
__mutex_lock+0x86/0x9b0
mutex_lock_nested+0x1b/0x20
i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
drm_ioctl_kernel+0x69/0xb0
drm_ioctl+0x2f9/0x3d0
do_vfs_ioctl+0x94/0x670
SyS_ioctl+0x41/0x70
entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #5 (&mm->mmap_sem){}:
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
__might_fault+0x68/0x90
_copy_to_user+0x23/0x70
filldir+0xa5/0x120
dcache_readdir+0xf9/0x170
iterate_dir+0x69/0x1a0
SyS_getdents+0xa5/0x140
entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #4 (&sb->s_type->i_mutex_key#5){}:
down_write+0x3b/0x70
handle_create+0xcb/0x1e0
devtmpfsd+0x139/0x180
kthread+0x152/0x190
ret_from_fork+0x27/0x40

-> #3 ((complete)&req.done){+.+.}:
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
wait_for_common+0x58/0x210
wait_for_completion+0x1d/0x20
devtmpfs_create_node+0x13d/0x160
device_add+0x5eb/0x620
device_create_groups_vargs+0xe0/0xf0
device_create+0x3a/0x40
msr_device_create+0x2b/0x40
cpuhp_invoke_callback+0xa3/0x840
cpuhp_thread_fun+0x7a/0x150
smpboot_thread_fn+0x18a/0x280
kthread+0x152/0x190
ret_from_fork+0x27/0x40

-> #2 (cpuhp_state){+.+.}:
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
cpuhp_issue_call+0x10b/0x170
__cpuhp_setup_state_cpuslocked+0x134/0x2a0
__cpuhp_setup_state+0x46/0x60
page_writeback_init+0x43/0x67
pagecache_init+0x3d/0x42
start_kernel+0x3a8/0x3fc
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x6d/0x70
verify_cpu+0x0/0xfb

-> #1 (cpuhp_state_mutex){+.+.}:
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
__mutex_lock+0x86/0x9b0
mutex_lock_nested+0x1b/0x20
__cpuhp_setup_state_cpuslocked+0x52/0x2a0
__cpuhp_setup_state+0x46/0x60
page_alloc_init+0x28/0x30
start_kernel+0x145/0x3fc
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x6d/0x70
verify_cpu+0x0/0xfb

-> #0 (cpu_hotplug_lock.rw_sem){}:
check_prev_add+0x430/0x840
__lock_acquire+0x1420/0x15e0
lock_acquire+0xb0/0x200
cpus_read_lock+0x3d/0xb0
apply_workqueue_attrs+0x17/0x50
__alloc_workqueue_key+0x1d8/0x4d9
i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
drm_ioctl_kernel+0x69/0xb0
drm_ioctl+0x2f9/0x3d0
do_vfs_ioctl+0x94/0x670
SyS_ioctl+0x41/0x70
entry_SYSCALL_64_fastpath+0x1c/0xb1

other info that might help us debug this:

Chain exists of:
   cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&dev_priv->mm_lock);
                                lock(&mm->mmap_sem);
                                lock(&dev_priv->mm_lock);
   lock(cpu_hotplug_lock.rw_sem);

  *** DEADLOCK ***

2 locks held by prime_mmap/1551:
  #0:  (&mm->mmap_sem){}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915]
  #1:  (&dev_priv->mm_lock){+.+.}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

stack backtrace:
CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G U  
4.14.0-rc1-CI-CI_DRM_3118+ #1
Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
Call Trace:
  dump_stack+0x68/0x9f
  print_circular_bug+0x235/0x3c0
  ? lockdep_init_map_crosslock+0x20/0x20
  check_prev_add+0x430/0x840
  __lock_acquire+0x1420/

[Intel-gfx] [PATCH] drm/i915: Cancel the hotplug work when unregistering the connector

2017-10-06 Thread Chris Wilson
When we unregister the connector, we may have a pending hotplug work.
This needs to be cancelled early during the teardown so that it does not
fire after we have freed the connector. Or else we may see something like:

 DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
 [ cut here ]
 WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 
mutex_destroy+0x4e/0x60
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem 
ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core 
prime_numbers i2c_hid [last unloaded: snd_hda_intel]
 CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U  
4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM 
RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 task: 8803c827aa40 task.stack: c952
 RIP: 0010:mutex_destroy+0x4e/0x60
 RSP: 0018:c9523d58 EFLAGS: 00010292
 RAX: 002a RBX: 88044fbef648 RCX: 
 RDX: 8001 RSI: 0001 RDI: 810f0cf0
 RBP: c9523d60 R08: 0001 R09: 0001
 R10: 0f21cb81 R11:  R12: 88044f71efc8
 R13: a02b3d20 R14: a02b3d90 R15: 880459b29308
 FS:  7f5df4d6e8c0() GS:88045d30() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 55ec51f00a18 CR3: 000451782006 CR4: 003606e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: fffe0ff0 DR7: 0400
 Call Trace:
  drm_fb_helper_fini+0xd9/0x130
  intel_fbdev_destroy+0x12/0x60 [i915]
  intel_fbdev_fini+0x28/0x30 [i915]
  intel_modeset_cleanup+0x45/0xa0 [i915]
  i915_driver_unload+0x92/0x180 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  pci_device_remove+0x39/0xb0
  device_release_driver_internal+0x15d/0x220
  driver_detach+0x40/0x80
  bus_remove_driver+0x58/0xd0
  driver_unregister+0x2c/0x40
  pci_unregister_driver+0x36/0xb0
  i915_exit+0x1a/0x8b [i915]
  SyS_delete_module+0x18c/0x1e0
  entry_SYSCALL_64_fastpath+0x1c/0xb1
 RIP: 0033:0x7f5df3286287
 RSP: 002b:7fff8e107cc8 EFLAGS: 0246 ORIG_RAX: 00b0
 RAX: ffda RBX: 81493a03 RCX: 7f5df3286287
 RDX: 0001 RSI: 0800 RDI: 564c7be02e48
 RBP: c9523f88 R08:  R09: 0080
 R10: 7f5df4d6e8c0 R11: 0246 R12: 
 R13: 7fff8e107eb0 R14:  R15: 
  ? __this_cpu_preempt_check+0x13/0x20
 Code: 00 00 5b 5d c3 e8 93 b9 3a 00 85 c0 74 ec 8b 05 e1 53 c3 01 85 c0 75 e2 
48 c7 c6 86 a6 c7 81 48 c7 c7 8b 8d c6 81 e8 03 ae 01 00 <0f> ff eb cb 0f 1f 40 
00 66 2e 0f 1f 84 00 00 00 00 00 55 48 b8
 ---[ end trace 08901ff1a77d30c6 ]---
 [drm:wait_panel_status [i915]] mask b80f value  status  
control 0060
 [drm:wait_panel_status [i915]] Wait complete
 [drm:edp_panel_vdd_on [i915]] PP_STATUS: 0x PP_CONTROL: 0x0068
 [drm:edp_panel_vdd_on [i915]] eDP port A panel power wasn't enabled
 [drm:drm_dp_read_desc] DP sink: OUI 00-1c-f8 dev-ID  HW-rev 0.0 SW-rev 7.49 
quirks 0x
 [drm:drm_edid_to_eld] ELD: no CEA Extension found
 [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:48:eDP-1] probed 
modes :
 [drm:drm_mode_debug_printmodeline] Modeline 49:"1920x1080" 60 138780 1920 1966 
1996 2080 1080 1082 1086 1112 0x48 0xa
 [drm:drm_mode_debug_printmodeline] Modeline 50:"1920x1080" 40 92520 1920 1966 
1996 2080 1080 1082 1086 1112 0x40 0xa
 general protection fault:  [#1] PREEMPT SMP
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem 
ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core 
prime_numbers i2c_hid [last unloaded: snd_hda_intel]
 CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G U  W   
4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM 
RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 Workqueue: events intel_dp_modeset_retry_work_fn [i915]
 task: 88045a5caa40 task.stack: c9378000
 RIP: 0010:drm_setup_crtcs+0x143/0xbf0
 RSP: 0018:c937bd20 EFLAGS: 00010202
 RAX: 6b6b6b6b6b6b6b6b RBX: 0002 RCX: 0001
 RDX: 0001 RSI: 0780 RDI: 
 RBP: c937bdb8 R08: 0001 R09: 0001
 R10: 0780 R11:  R12: 0002
 R13: 88044fbef4e8 R14: 0780 R15: 0438
 FS:  () GS:88045d20() knlGS:
 CS:  0010 DS:  ES:  
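
The ordering asked for in the commit message above (cancel any pending work
before the connector is freed) can be sketched in plain userspace C. This is
only an illustration under stated assumptions: `toy_connector`,
`toy_cancel_work` and the other `toy_*` names are hypothetical stand-ins, not
the i915 or DRM API, and the "work item" is modelled as a simple flag rather
than a real workqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a connector that owns one deferred work item. */
struct toy_connector {
	bool registered;
	bool freed;
	bool work_pending;
	int work_runs;
};

/* Queue the deferred work (models scheduling a work item). */
static void toy_schedule_work(struct toy_connector *c)
{
	c->work_pending = true;
}

/* Drop a queued work item; returns true if one was pending. */
static bool toy_cancel_work(struct toy_connector *c)
{
	bool was_pending = c->work_pending;

	c->work_pending = false;
	return was_pending;
}

/* Unregister: cancel the work first, so nothing can fire later. */
static void toy_unregister(struct toy_connector *c)
{
	toy_cancel_work(c);
	c->registered = false;
}

/* Destroy: by now no work may be pending any more. */
static void toy_destroy(struct toy_connector *c)
{
	assert(!c->work_pending);
	c->freed = true;
}

/* The worker: must never touch a connector that was already freed. */
static void toy_run_pending(struct toy_connector *c)
{
	if (!c->work_pending)
		return;
	assert(!c->freed);
	c->work_pending = false;
	c->work_runs++;
}
```

With this ordering, a worker that runs after `toy_destroy()` finds nothing
pending and returns without touching the freed object.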

[Intel-gfx] [PATCH] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

2017-10-06 Thread Daniel Vetter
4.14-rc1 gained the fancy new cross-release support in lockdep, which
seems to have uncovered a few more rules about what is allowed and
isn't.

This one here seems to indicate that allocating a work-queue while
holding mmap_sem is a no-go, so let's try to preallocate it.

Of course another way to break this chain would be somewhere in the
cpu hotplug code, since this isn't the only trace we're finding now
which goes through msr_device_create.
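
As a rough illustration of the preallocation idea, here is a minimal
userspace sketch. It assumes toy stand-ins throughout: `toy_lock`,
`toy_alloc_wq` and `toy_notifier_get` are hypothetical names, the lock only
records whether it is held, and none of this is the actual i915 change. The
point is the shape: allocate before taking the lock, publish under the lock,
and free the spare copy outside the lock if another caller got there first.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy lock that only records whether it is currently held. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct toy_wq { int id; };
struct toy_notifier { struct toy_wq *wq; };

static struct toy_lock mm_lock;
static struct toy_notifier *installed;

/* Stand-in for an allocation that must not run with mm_lock held. */
static struct toy_wq *toy_alloc_wq(void)
{
	assert(!mm_lock.held);
	return calloc(1, sizeof(struct toy_wq));
}

static struct toy_notifier *toy_notifier_get(void)
{
	struct toy_notifier *mn, *ret;

	/* Step 1: preallocate everything before taking any lock. */
	mn = calloc(1, sizeof(*mn));
	if (!mn)
		return NULL;
	mn->wq = toy_alloc_wq();
	if (!mn->wq) {
		free(mn);
		return NULL;
	}

	/* Step 2: under the lock, only publish the pointer. */
	toy_lock_acquire(&mm_lock);
	if (!installed) {
		installed = mn;
		mn = NULL;
	}
	ret = installed;
	toy_lock_release(&mm_lock);

	/* Step 3: someone else won; free our spare outside the lock. */
	if (mn) {
		free(mn->wq);
		free(mn);
	}
	return ret;
}
```

The cost of this pattern is an occasional wasted allocation when two callers
race; in exchange, the allocator's own locks are never taken inside the
caller's lock.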

Full lockdep splat:

==
WARNING: possible circular locking dependency detected
4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U
--
prime_mmap/1551 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){}, at: [] 
apply_workqueue_attrs+0x17/0x50

but task is already holding lock:
 (&dev_priv->mm_lock){+.+.}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&dev_priv->mm_lock){+.+.}:
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   __mutex_lock+0x86/0x9b0
   mutex_lock_nested+0x1b/0x20
   i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
   i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
   drm_ioctl_kernel+0x69/0xb0
   drm_ioctl+0x2f9/0x3d0
   do_vfs_ioctl+0x94/0x670
   SyS_ioctl+0x41/0x70
   entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #5 (&mm->mmap_sem){}:
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   __might_fault+0x68/0x90
   _copy_to_user+0x23/0x70
   filldir+0xa5/0x120
   dcache_readdir+0xf9/0x170
   iterate_dir+0x69/0x1a0
   SyS_getdents+0xa5/0x140
   entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #4 (&sb->s_type->i_mutex_key#5){}:
   down_write+0x3b/0x70
   handle_create+0xcb/0x1e0
   devtmpfsd+0x139/0x180
   kthread+0x152/0x190
   ret_from_fork+0x27/0x40

-> #3 ((complete)&req.done){+.+.}:
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   wait_for_common+0x58/0x210
   wait_for_completion+0x1d/0x20
   devtmpfs_create_node+0x13d/0x160
   device_add+0x5eb/0x620
   device_create_groups_vargs+0xe0/0xf0
   device_create+0x3a/0x40
   msr_device_create+0x2b/0x40
   cpuhp_invoke_callback+0xa3/0x840
   cpuhp_thread_fun+0x7a/0x150
   smpboot_thread_fn+0x18a/0x280
   kthread+0x152/0x190
   ret_from_fork+0x27/0x40

-> #2 (cpuhp_state){+.+.}:
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   cpuhp_issue_call+0x10b/0x170
   __cpuhp_setup_state_cpuslocked+0x134/0x2a0
   __cpuhp_setup_state+0x46/0x60
   page_writeback_init+0x43/0x67
   pagecache_init+0x3d/0x42
   start_kernel+0x3a8/0x3fc
   x86_64_start_reservations+0x2a/0x2c
   x86_64_start_kernel+0x6d/0x70
   verify_cpu+0x0/0xfb

-> #1 (cpuhp_state_mutex){+.+.}:
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   __mutex_lock+0x86/0x9b0
   mutex_lock_nested+0x1b/0x20
   __cpuhp_setup_state_cpuslocked+0x52/0x2a0
   __cpuhp_setup_state+0x46/0x60
   page_alloc_init+0x28/0x30
   start_kernel+0x145/0x3fc
   x86_64_start_reservations+0x2a/0x2c
   x86_64_start_kernel+0x6d/0x70
   verify_cpu+0x0/0xfb

-> #0 (cpu_hotplug_lock.rw_sem){}:
   check_prev_add+0x430/0x840
   __lock_acquire+0x1420/0x15e0
   lock_acquire+0xb0/0x200
   cpus_read_lock+0x3d/0xb0
   apply_workqueue_attrs+0x17/0x50
   __alloc_workqueue_key+0x1d8/0x4d9
   i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
   i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
   drm_ioctl_kernel+0x69/0xb0
   drm_ioctl+0x2f9/0x3d0
   do_vfs_ioctl+0x94/0x670
   SyS_ioctl+0x41/0x70
   entry_SYSCALL_64_fastpath+0x1c/0xb1

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev_priv->mm_lock);
                               lock(&mm->mmap_sem);
                               lock(&dev_priv->mm_lock);
  lock(cpu_hotplug_lock.rw_sem);

 *** DEADLOCK ***

2 locks held by prime_mmap/1551:
 #0:  (&mm->mmap_sem){}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915]
 #1:  (&dev_priv->mm_lock){+.+.}, at: [] 
i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

stack backtrace:
CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G U  
4.14.0-rc1-CI-CI_DRM_3118+ #1
Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
Call Trace:
 dump_stack+0x68/0x9f
 print_circular_bug+0x235/0x3c0
 ? lockdep_init_map_crosslock+0x20/0x20
 check_prev_add+0x430/0x840
 __lock_acquire+0x1420/0x15e0
 ? __lock_acquire+0x1420/0x15e0
 ? lockdep_init_map_crosslock+0x20/0x20
 lock_acquire+0xb0/0x200
 ? apply_workqueue_attrs+0x17/0x5

[Intel-gfx] ✓ Fi.CI.BAT: success for huge gtt pages (rev13)

2017-10-06 Thread Patchwork
== Series Details ==

Series: huge gtt pages (rev13)
URL   : https://patchwork.freedesktop.org/series/25118/
State : success

== Summary ==

Series 25118v13 huge gtt pages
https://patchwork.freedesktop.org/api/1.0/series/25118/revisions/13/mbox/

Test gem_exec_suspend:
Subgroup basic-s3:
dmesg-warn -> PASS   (fi-cfl-s) fdo#103026
Test drv_module_reload:
Subgroup basic-reload-inject:
incomplete -> PASS   (fi-cfl-s) fdo#103022

fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
fdo#103022 https://bugs.freedesktop.org/show_bug.cgi?id=103022

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:453s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:468s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:389s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:562s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:284s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:524s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:521s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:533s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:523s
fi-cfl-s         total:289  pass:257  dwarn:0   dfail:0   fail:0   skip:32  time:574s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:615s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:431s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:608s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:436s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:416s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:509s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:479s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:501s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:583s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:489s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:589s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:651s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:486s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:652s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:531s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:509s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:465s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:573s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:430s

cb32cc2ad1c3ccd0803276d5af46c410f5104951 drm-tip: 2017y-10m-06d-15h-01m-44s UTC 
integration manifest
36ea69dd9d5b drm/i915: enable platform support for 2M pages
0f0029c50531 drm/i915: enable platform support for 64K pages
a3ceb490e30c drm/i915: disable platform support for vGPU huge gtt pages
bb78739d06a6 drm/i915/selftests: mix huge pages
db68d040aacb drm/i915/selftests: huge page tests
7ad2fcb4f3b9 drm/i915/debugfs: include some gtt page size metrics
a8ba9dc4b480 drm/i915: accurate page size tracking for the ppgtt
2bfd27e41b3e drm/i915: support 64K pages for the 48b PPGTT
77922800074a drm/i915: add support for 64K scratch page
1e0e1d625d57 drm/i915: support 2M pages for the 48b PPGTT
7afcdd9a842d drm/i915: disable GTT cache for 2M pages
4f6be3c66188 drm/i915: enable IPS bit for 64K pages
03330868e4af drm/i915: align 64K objects to 2M
3a6d462cc964 drm/i915: align the vma start to the largest gtt page size
33effb284bc0 drm/i915: introduce vm set_pages/clear_pages
ebf5d7e4e80e drm/i915: introduce page_size members
9e877d2dad5e drm/i915: push set_pages down to the callers
74be45fad762 drm/i915: introduce page_sizes field to dev_info
83936c1a8137 drm/i915/gemfs: enable THP
634bb031d367 drm/i915: introduce simple gemfs
7041b30f5891 mm/shmem: introduce shmem_file_setup_with_mnt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5933/


Re: [Intel-gfx] [PATCH v2] drm/i915: Fix pointer-to-int conversion

2017-10-06 Thread Chris Wilson
Quoting Chris Wilson (2017-10-06 14:16:55)
> Quoting Michal Wajdeczko (2017-10-06 14:08:44)
> > Commit faf654864b25 ("drm/i915: Unify uC variable types to avoid
> > flooding checkpatch.pl") breaks 32-bit kernel builds. Let's use a
> > cast helper to make the compiler happy.
> > 
> > v2: introduce ptr_to_u64 (Chris)
> > 
> > Signed-off-by: Michal Wajdeczko 
> > Cc: Joonas Lahtinen 
> > Cc: Chris Wilson 
> Reviewed-by: Chris Wilson 

Also applied to my queue, thanks for the quick fixup.
-Chris
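
For readers following along, the kind of cast helper the quoted commit message
refers to can be sketched as below. This is a userspace approximation, not the
patch itself: the `_sketch` suffixes mark the names as hypothetical, and
`uint64_t` stands in for the kernel's `u64`. Going through `uintptr_t` first
avoids a direct cast between a pointer and an integer of a different size,
which compilers commonly warn about on 32-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The widening below is only lossless if a pointer fits in 64 bits. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
	      "pointers must fit into 64 bits");

/* Convert a pointer to a 64-bit integer via uintptr_t. */
static inline uint64_t ptr_to_u64_sketch(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Convert back; the value must have come from ptr_to_u64_sketch(). */
static inline void *u64_to_ptr_sketch(uint64_t value)
{
	return (void *)(uintptr_t)value;
}
```

The round trip through both helpers is expected to return the original
pointer on both 32-bit and 64-bit targets.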


Re: [Intel-gfx] [PATCH i-g-t 1/7] intel-gpu-overlay: Move local perf implementation to a library

2017-10-06 Thread Tvrtko Ursulin


On 29/09/2017 14:43, Petri Latvala wrote:

On Fri, Sep 29, 2017 at 01:39:33PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Idea is to avoid duplication across multiple users in
upcoming patches.

v2: Commit message and use a separate library instead of piggy-
 backing to libintel_tools. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin 
---
  lib/Makefile.am  | 6 +-
  overlay/perf.c => lib/igt_perf.c | 2 +-
  overlay/perf.h => lib/igt_perf.h | 2 ++
  overlay/Makefile.am  | 6 ++
  overlay/gem-interrupts.c | 3 ++-
  overlay/gpu-freq.c   | 3 ++-
  overlay/gpu-perf.c   | 3 ++-
  overlay/gpu-top.c| 3 ++-
  overlay/power.c  | 3 ++-
  overlay/rc6.c| 3 ++-
  10 files changed, 22 insertions(+), 12 deletions(-)
  rename overlay/perf.c => lib/igt_perf.c (94%)
  rename overlay/perf.h => lib/igt_perf.h (99%)



This one was more of a doozey to mesonize for a newbie.

This is ugly but hopefully will make someone more knowledgeable point
out better ways and practices for using build targets vs. just lib
names around...

(Now sent with X-Patchwork-Hint, hopefully patchwork doesn't get
confused)

diff --git a/benchmarks/meson.build b/benchmarks/meson.build
index 9ab738f7..9f2672eb 100644
--- a/benchmarks/meson.build
+++ b/benchmarks/meson.build
@@ -31,6 +31,11 @@ endif
  foreach prog : benchmark_progs
# FIXME meson doesn't like binaries with the same name
# meanwhile just suffix with _bench
+   link = []
+   if prog == 'gem_wsim'
+  link += lib_igt_perf
+   endif
executable(prog + '_bench', prog + '.c',
-   dependencies : test_deps)
+   dependencies : test_deps,
+   link_with : link)
  endforeach
diff --git a/lib/meson.build b/lib/meson.build
index 203be520..2c33493d 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -178,4 +178,8 @@ lib_igt = declare_dependency(link_with : lib_igt_build,
  
  igt_deps = [ lib_igt ] + lib_deps
  
+lib_igt_perf = static_library('igt_perf',

+['igt_perf.c']
+)
+
  subdir('tests')
diff --git a/overlay/meson.build b/overlay/meson.build
index a92ef895..ffc011cc 100644
--- a/overlay/meson.build
+++ b/overlay/meson.build
@@ -10,7 +10,6 @@ gpu_overlay_src = [
'gpu-freq.c',
'igfx.c',
'overlay.c',
-   'perf.c',
'power.c',
'rc6.c',
  ]
@@ -56,5 +55,6 @@ if xrandr.found() and cairo.found()
include_directories : inc,
c_args : gpu_overlay_cflags,
dependencies : gpu_overlay_deps,
+   link_with : lib_igt_perf,
install : true)
  endif


Grumble, can we have a switch over day where it all gets converted to 
meson by the people in the know, and until then not concern ourselves 
with a two-headed build system?


At the moment it is just a distraction and time waste if everybody 
working on IGT has to test both build systems.


I know meson is great and all that but I'd rather focus on the actual 
work than having to maintain parallel build systems. Especially since I 
am clueless on it, so it would be one more thing competing for limited 
brain resources.


Regards,

Tvrtko


Re: [Intel-gfx] [PATCH v2 11/11] drm/i915: Introduce separate status variable for RC6 and LLC ring frequency setup

2017-10-06 Thread Sagar Arun Kamble



On 10/6/2017 6:25 PM, Chris Wilson wrote:

Quoting Sagar Arun Kamble (2017-10-06 13:13:40)

Defined new struct intel_rc6 to hold RC6 specific state and
intel_ring_pstate to hold ring specific state.

v2: s/intel_ring_pstate/intel_llc_pstate and rebase. (Chris)

Signed-off-by: Sagar Arun Kamble 
Cc: Imre Deak 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Radoslaw Szwichtenberg 
---
  drivers/gpu/drm/i915/i915_drv.c |  2 +-
  drivers/gpu/drm/i915/i915_drv.h | 10 
  drivers/gpu/drm/i915/intel_pm.c | 57 +++--
  3 files changed, 54 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 470807c..154f231 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2502,7 +2502,7 @@ static int intel_runtime_suspend(struct device *kdev)
 struct drm_i915_private *dev_priv = to_i915(dev);
 int ret;
  
-   if (WARN_ON_ONCE(!(dev_priv->pm.rps.enabled && intel_rc6_enabled(

+   if (WARN_ON_ONCE(!(dev_priv->pm.rc6.enabled && intel_rc6_enabled(
 return -ENODEV;
  
 if (WARN_ON_ONCE(!HAS_RUNTIME_PM(dev_priv)))

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 45944a8..a07aa71 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,8 +1363,18 @@ struct intel_rps {
 struct intel_rps_ei ei;
  };
  
+struct intel_rc6 {

+   bool enabled;
+};
+
+struct intel_llc_pstate {
+   bool configured;
+};
+
  struct intel_gen6_power_mgmt {
 struct intel_rps rps;
+   struct intel_rc6 rc6;
+   struct intel_llc_pstate llc_pstate;
 struct delayed_work autoenable_work;
  
 /*

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 03264fe..df36a6f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7873,7 +7873,12 @@ static void intel_init_emon(struct drm_i915_private 
*dev_priv)
  
  static inline void intel_update_ring_freq(struct drm_i915_private *i915)

  {
+   if (READ_ONCE(i915->pm.llc_pstate.configured))
+   return;

Tell me about how you expect the locking around this function to be.

The READ_ONCE() implies that we are doing an optimistic peek outside of a
lock, but then we set configured without acquiring a lock, so I assume
we are inside some lock.

That looks true for all, we don't need READ_ONCE() anymore as we only
inspect inside the mutex (and so READ_ONCE is giving the wrong
impression).


+
 gen6_update_ring_freq(i915);
+
+   i915->pm.llc_pstate.configured = true;
  }
  
  void intel_disable_gt_powersave(struct drm_i915_private *dev_priv)

  {
-   if (!READ_ONCE(dev_priv->pm.rps.enabled))
-   return;
-
 mutex_lock(&dev_priv->pm.pcu_lock);
  
 intel_disable_rc6(dev_priv);

 intel_disable_rps(dev_priv);
+   if (HAS_LLC(dev_priv))
+   dev_priv->pm.llc_pstate.configured = false;

Always clear it? If no llc, it can never be configured.
Hmm, better if we just made it symmetrical with
s/intel_update_ring_freq/intel_enable_llc_pstate/ and
intel_disable_llc_pstate here.

Will update.


  
-   dev_priv->pm.rps.enabled = false;

 mutex_unlock(&dev_priv->pm.pcu_lock);
  }
@@ -8080,7 +8103,10 @@ static void __intel_autoenable_gt_powersave(struct 
work_struct *work)
 struct intel_engine_cs *rcs;
 struct drm_i915_gem_request *req;
  
-   if (READ_ONCE(dev_priv->pm.rps.enabled))

+   if (READ_ONCE(dev_priv->pm.rps.enabled) &&
+   READ_ONCE(dev_priv->pm.rc6.enabled) &&
+   !(HAS_LLC(dev_priv) ^
+ READ_ONCE(dev_priv->pm.llc_pstate.configured)))
 goto out;

This optimisation has lost its appeal :)

Kill it, if we need something like it we can try again later.
-Chris

Sure. Understood that using READ_ONCE inside lock was unnecessary.
Will remove this triple condition.
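
The convention agreed in this thread (inspect the flag only while holding the
mutex, and pair the enable helper with a matching disable helper) can be
sketched in userspace C as follows. Everything here is an assumption made for
illustration: `toy_mutex`, `toy_pm` and the `toy_*` functions are hypothetical
stand-ins, the mutex only records whether it is held, and this is not the
revised patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy mutex that only tracks whether it is held. */
struct toy_mutex { bool held; };

static void toy_mutex_lock(struct toy_mutex *m) { assert(!m->held); m->held = true; }
static void toy_mutex_unlock(struct toy_mutex *m) { assert(m->held); m->held = false; }

struct toy_pm {
	struct toy_mutex pcu_lock;
	bool has_llc;
	bool llc_configured;
	int ring_freq_updates;
};

/* Caller holds pcu_lock, so a plain read of the flag is sufficient. */
static void toy_enable_llc_pstate(struct toy_pm *pm)
{
	assert(pm->pcu_lock.held);
	if (!pm->has_llc || pm->llc_configured)
		return;
	pm->ring_freq_updates++;	/* stands in for programming the ring frequency */
	pm->llc_configured = true;
}

/* Symmetric counterpart: always clear, it can only be set with LLC. */
static void toy_disable_llc_pstate(struct toy_pm *pm)
{
	assert(pm->pcu_lock.held);
	pm->llc_configured = false;
}

static void toy_enable_gt_powersave(struct toy_pm *pm)
{
	toy_mutex_lock(&pm->pcu_lock);
	toy_enable_llc_pstate(pm);
	toy_mutex_unlock(&pm->pcu_lock);
}

static void toy_disable_gt_powersave(struct toy_pm *pm)
{
	toy_mutex_lock(&pm->pcu_lock);
	toy_disable_llc_pstate(pm);
	toy_mutex_unlock(&pm->pcu_lock);
}
```

Because every reader and writer of `llc_configured` runs under `pcu_lock`, no
unlocked peek is needed, and a repeated enable is simply a no-op.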



[Intel-gfx] ✗ Fi.CI.BAT: failure for ] lib: Ask the kernel to quiesce the GPU

2017-10-06 Thread Patchwork
== Series Details ==

Series: ] lib: Ask the kernel to quiesce the GPU
URL   : https://patchwork.freedesktop.org/series/31448/
State : failure

== Summary ==

IGT patchset tested on top of latest successful build
d8954f05024d73a8b3f26fa0d5892d067a70fdac igt/gem_exec_scheduler: Add small 
priority sorting smoketest

with latest DRM-Tip kernel build CI_DRM_3185
7dacd1f2e70c drm-tip: 2017y-10m-06d-12h-29m-28s UTC integration manifest

No testlist changes.

Test chamelium:
Subgroup dp-edid-read:
pass   -> FAIL   (fi-kbl-7500u) fdo#102672
Test gem_sync:
Subgroup basic-all:
pass   -> FAIL   (fi-pnv-d510)

fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:449s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:563s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:292s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:526s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:543s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:546s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:518s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32  time:564s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:630s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:437s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:596s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:436s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:419s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:506s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:470s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:495s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:589s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20  time:483s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:588s
fi-pnv-d510      total:289  pass:221  dwarn:1   dfail:0   fail:1   skip:66  time:660s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:483s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:654s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:526s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:515s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:466s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:580s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:435s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_302/


Re: [Intel-gfx] [PATCH v2 10/11] drm/i915: Create generic functions to control RC6, RPS

2017-10-06 Thread Sagar Arun Kamble



On 10/6/2017 6:16 PM, Chris Wilson wrote:

Quoting Sagar Arun Kamble (2017-10-06 13:13:39)

Prepared generic functions intel_enable_rc6, intel_disable_rc6,
intel_enable_rps and intel_disable_rps to set up RC6/RPS
based on platform.

v2: Make intel_enable/disable_rc6/rps static. (Chris)

Signed-off-by: Sagar Arun Kamble 
Cc: Imre Deak 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Radoslaw Szwichtenberg 
---
  drivers/gpu/drm/i915/intel_pm.c | 97 ++---
  1 file changed, 62 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ce2dc5b..03264fe 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7972,75 +7972,102 @@ void intel_sanitize_gt_powersave(struct 
drm_i915_private *dev_priv)
 gen6_reset_rps_interrupts(dev_priv);
  }
  
-void intel_disable_gt_powersave(struct drm_i915_private *dev_priv)

+static void intel_disable_rc6(struct drm_i915_private *dev_priv)
  {
-   if (!READ_ONCE(dev_priv->pm.rps.enabled))
-   return;
-
-   mutex_lock(&dev_priv->pm.pcu_lock);

lockdep_assert_held(&dev_priv->pm.pcu_lock); ?

We often skip it for statics, unless we know we are planning on adding
an interface that may not take the lock.

Sure will add this. Thanks.


Reviewed-by: Chris Wilson 
-Chris
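
The suggestion above (static helpers assert that the lock is held, while the
public entry point is the one that takes it) can be illustrated with a small
userspace sketch. The names are hypothetical throughout: `toy_assert_held` is
a plain `assert()` standing in for the kernel's lockdep check, and `toy_dev`
and its helpers are not the i915 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy mutex that only tracks whether it is held. */
struct toy_mutex { bool held; };

static void toy_mutex_lock(struct toy_mutex *m) { assert(!m->held); m->held = true; }
static void toy_mutex_unlock(struct toy_mutex *m) { assert(m->held); m->held = false; }

/* Userspace stand-in for a "caller must hold this lock" assertion. */
#define toy_assert_held(m) assert((m)->held)

struct toy_dev {
	struct toy_mutex pcu_lock;
	bool rc6_enabled;
	bool rps_enabled;
};

/* Static helpers: document and check the locking requirement. */
static void toy_disable_rc6(struct toy_dev *dev)
{
	toy_assert_held(&dev->pcu_lock);
	dev->rc6_enabled = false;
}

static void toy_disable_rps(struct toy_dev *dev)
{
	toy_assert_held(&dev->pcu_lock);
	dev->rps_enabled = false;
}

/* Public entry point: the only place that takes the lock. */
void toy_disable_gt_powersave(struct toy_dev *dev)
{
	toy_mutex_lock(&dev->pcu_lock);
	toy_disable_rc6(dev);
	toy_disable_rps(dev);
	toy_mutex_unlock(&dev->pcu_lock);
}
```

The assertion costs little and pays off mainly when a new caller is added
later that might not take the lock.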




Re: [Intel-gfx] [PATCH] drm/i915/huc: Fix includes in intel_huc.c

2017-10-06 Thread Chris Wilson
Quoting Michal Wajdeczko (2017-10-06 10:02:09)
> Fix includes order and make sure we only include required headers.
> While here, make intel_huc.h header self-contained.
> 
> Signed-off-by: Michal Wajdeczko 
> Cc: Joonas Lahtinen 
> Cc: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/intel_huc.c | 6 --
>  drivers/gpu/drm/i915/intel_huc.h | 2 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_huc.c 
> b/drivers/gpu/drm/i915/intel_huc.c
> index 3f796fe..4b4cf56 100644
> --- a/drivers/gpu/drm/i915/intel_huc.c
> +++ b/drivers/gpu/drm/i915/intel_huc.c
> @@ -21,9 +21,11 @@
>   * IN THE SOFTWARE.
>   *
>   */
> -#include 
> +
> +#include 
> +
> +#include "intel_huc.h"
>  #include "i915_drv.h"
> -#include "intel_uc.h"
>  
>  /**
>   * DOC: HuC Firmware
> diff --git a/drivers/gpu/drm/i915/intel_huc.h 
> b/drivers/gpu/drm/i915/intel_huc.h
> index d58422b..aaa38b9 100644
> --- a/drivers/gpu/drm/i915/intel_huc.h
> +++ b/drivers/gpu/drm/i915/intel_huc.h
> @@ -25,6 +25,8 @@
>  #ifndef _INTEL_HUC_H_
>  #define _INTEL_HUC_H_
>  
> +#include "intel_uc_fw.h"
> +
>  struct intel_huc {
> /* Generic uC firmware management */
> struct intel_uc_fw fw;


Reviewed-by: Chris Wilson 

Applied to my queue, thanks.
-Chris


Re: [Intel-gfx] [PATCH v2 06/11] drm/i915: Name i915_runtime_pm structure in dev_priv as "rpm"

2017-10-06 Thread Sagar Arun Kamble



On 10/6/2017 6:10 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2017-10-06 13:13:35)
> > We were using dev_priv->pm for runtime power management related state.
> > This patch renames it to "rpm" which looks more apt. Will be using pm
> > for state containing RPS/RC6 state in the next patch.
> > 
> > Signed-off-by: Sagar Arun Kamble 
> > Cc: Imre Deak 
> > Cc: Chris Wilson 
> > Cc: Joonas Lahtinen 
> > Reviewed-by: Radoslaw Szwichtenberg 
> > Reviewed-by: Chris Wilson 
> 
> Thinking about this again, rpm, pm are very close. How about if we used
> i915->runtime_pm and i915->gt_pm (or i915->gt.pm)? Imre, any thoughts?

rps.hw_lock/pcu_lock is used by display too, so I just kept it pm.
Should we pull rps.hw_lock/pcu_lock out into drm_i915_private? Then
gt_pm would be good.

> -Chris




[Intel-gfx] [PATCH 2/2] drm/i915: Provide an assert for when we expect forcewake to be held

2017-10-06 Thread Chris Wilson
Add assert_forcewakes_active() (the complementary function to
assert_forcewakes_inactive) that documents the requirement of a
function for its callers to be holding the forcewake ref (i.e. the
function is part of a sequence over which RC6 must be prevented).

One such example is during ringbuffer reset, where RC6 must be held off
across the whole reinitialisation sequence.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 11 ++-
 drivers/gpu/drm/i915/intel_uncore.c | 12 
 drivers/gpu/drm/i915/intel_uncore.h |  2 ++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 05c08b0bc172..4285f09ff8b8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -579,7 +579,16 @@ static int init_ring_common(struct intel_engine_cs *engine)
 static void reset_ring_common(struct intel_engine_cs *engine,
  struct drm_i915_gem_request *request)
 {
-   /* Try to restore the logical GPU state to match the continuation
+   /*
+* RC6 must be prevented until the reset is complete and the engine
+* reinitialised. If it occurs in the middle of this sequence, the
+* state written to/loaded from the power context is ill-defined (e.g.
+* the PP_BASE_DIR may be lost).
+*/
+   assert_forcewakes_active(engine->i915, FORCEWAKE_ALL);
+
+   /*
+* Try to restore the logical GPU state to match the continuation
 * of the request queue. If we skip the context/PD restore, then
 * the next request may try to execute assuming that its context
 * is valid and loaded on the GPU and so may try to access invalid
diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index b3c3f94fc7e4..3d41667919dc 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -629,6 +629,18 @@ void assert_forcewakes_inactive(struct drm_i915_private 
*dev_priv)
WARN_ON(dev_priv->uncore.fw_domains_active);
 }
 
+void assert_forcewakes_active(struct drm_i915_private *dev_priv,
+ enum forcewake_domains fw_domains)
+{
+   if (!dev_priv->uncore.funcs.force_wake_get)
+   return;
+
+   assert_rpm_wakelock_held(dev_priv);
+
+   fw_domains &= dev_priv->uncore.fw_domains;
+   WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains_active);
+}
+
 /* We give fast paths for the really cool registers */
 #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4)
 
diff --git a/drivers/gpu/drm/i915/intel_uncore.h 
b/drivers/gpu/drm/i915/intel_uncore.h
index 66eae2ce2f29..582771251b57 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -137,6 +137,8 @@ void intel_uncore_resume_early(struct drm_i915_private 
*dev_priv);
 
 u64 intel_uncore_edram_size(struct drm_i915_private *dev_priv);
 void assert_forcewakes_inactive(struct drm_i915_private *dev_priv);
+void assert_forcewakes_active(struct drm_i915_private *dev_priv,
+ enum forcewake_domains fw_domains);
 const char *intel_uncore_forcewake_domain_to_str(const enum 
forcewake_domain_id id);
 
 enum forcewake_domains
-- 
2.14.2
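
The mask logic in assert_forcewakes_active() can be tried outside the kernel. The following is a minimal userspace sketch, not driver code: the domain bits and the function name forcewakes_active() are hypothetical stand-ins, and the three masks are passed in explicitly rather than read from dev_priv->uncore. It shows that requested domains the device does not have are ignored, and that the check passes only when every remaining requested domain is active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical domain bits, for illustration only. */
enum fw_domain_bits {
	FW_RENDER  = 1u << 0,
	FW_BLITTER = 1u << 1,
	FW_MEDIA   = 1u << 2,
	FW_ALL     = FW_RENDER | FW_BLITTER | FW_MEDIA,
};

/*
 * Return true if every requested domain that the device supports is
 * currently active. Mirrors the two-step masking in the patch:
 * first drop unsupported domains, then look for any that are not awake.
 */
static bool forcewakes_active(unsigned int requested,
			      unsigned int supported,
			      unsigned int active)
{
	/* An active domain that is not supported would be a bookkeeping bug. */
	assert((active & ~supported) == 0);

	requested &= supported;
	return (requested & ~active) == 0;
}
```

With this shape a caller can always ask for FW_ALL, as reset_ring_common() does with FORCEWAKE_ALL, without knowing which domains a given platform actually has.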



[Intel-gfx] [PATCH 1/2] drm/i915/selftests: Hold the rpm/forcewake wakeref for the reset tests

2017-10-06 Thread Chris Wilson
Resetting the engine requires us to hold the forcewake wakeref to
prevent RC6 trying to happen in the middle of the reset sequence.
Normally, this is taken by i915_handle_error(), but as we are calling
the lowlevel functions ourselves, we need to hold it. Wrap the entire
live_hangcheck set of subtests in a single forcewake section for
simplicity.

This greatly improves the reliability of drv_selftest/live_hangcheck on
Haswell, where it would exhibit an inability to restart a request
because it lost its PD registers (PD_DIR_BASE reported as 0).

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c 
b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 7e1bdd88eda3..a9e0de1f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -878,9 +878,18 @@ int intel_hangcheck_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_reset_queue),
SUBTEST(igt_handle_error),
};
+   int err;
 
if (!intel_has_gpu_reset(i915))
return 0;
 
-   return i915_subtests(tests, i915);
+   intel_runtime_pm_get(i915);
+   intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+   err = i915_subtests(tests, i915);
+
+   intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
+   intel_runtime_pm_put(i915);
+
+   return err;
 }
-- 
2.14.2



[Intel-gfx] [ANNOUNCE] dim-tools mailing list for drm maintainer tools

2017-10-06 Thread Jani Nikula

The drm maintainer tools and documentation [1][2], the dim script in
particular, have expanded in use and features and especially user base
beyond at least my imagination.

It's time to move the maintainer tools patches and discussion away from
the intel-gfx mailing list, but it seems best to not clutter dri-devel
any more. Hence we're introducing a new dim-tools mailing list [3] for
announcements, discussion, and development of drm maintainer tools and
documentation.

Please subscribe to the list if you use dim, so we can reach out to all
users with announcements. Some of you have been automatically
subscribed; apologies if this was not what you wanted.

BR,
Jani.


[1] https://cgit.freedesktop.org/drm/drm-intel/log/?h=maintainer-tools
[2] https://01.org/linuxgraphics/gfx-docs/maintainer-tools/index.html
[3] https://lists.freedesktop.org/mailman/listinfo/dim-tools

-- 
Jani Nikula, Intel Open Source Technology Center


[Intel-gfx] [PATCH 15/21] drm/i915: accurate page size tracking for the ppgtt

2017-10-06 Thread Matthew Auld
Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 11 +++
 drivers/gpu/drm/i915/i915_gem_object.h | 10 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 118aad90468f..4c605785e2b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1053,6 +1053,8 @@ static void gen8_ppgtt_insert_3lvl(struct 
i915_address_space *vm,
 
gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
  cache_level);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
@@ -1145,7 +1147,10 @@ static void gen8_ppgtt_insert_huge_entries(struct 
i915_vma *vma,
vaddr = kmap_atomic_px(pd);
vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
kunmap_atomic(vaddr);
+   page_size = I915_GTT_PAGE_SIZE_64K;
}
+
+   vma->page_sizes.gtt |= page_size;
} while (iter->sg);
 }
 
@@ -1170,6 +1175,8 @@ static void gen8_ppgtt_insert_4lvl(struct 
i915_address_space *vm,
while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
 &iter, &idx, cache_level))
GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
}
 }
 
@@ -1891,6 +1898,8 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
}
} while (1);
kunmap_atomic(vaddr);
+
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -2598,6 +2607,8 @@ static int ggtt_bind_vma(struct i915_vma *vma,
vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
intel_runtime_pm_put(i915);
 
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+
/*
 * Without aliasing PPGTT there's no difference between
 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h 
b/drivers/gpu/drm/i915/i915_gem_object.h
index 110672952a1c..e4e6dd93889d 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -169,6 +169,7 @@ struct drm_i915_gem_object {
struct sg_table *pages;
void *mapping;
 
+   /* TODO: whack some of this into the error state */
struct i915_page_sizes {
/**
 * The sg mask of the pages sg_table. i.e the mask of
@@ -184,6 +185,15 @@ struct drm_i915_gem_object {
 * to use opportunistically.
 */
unsigned int sg;
+
+   /**
+* The actual gtt page size usage. Since we can have
+* multiple vma associated with this object we need to
+* prevent any trampling of state, hence a copy of this
+* struct also lives in each vma, therefore the gtt
+* value here should only be read/write through the vma.
+*/
+   unsigned int gtt;
} page_sizes;
 
struct i915_gem_object_page_iter {
-- 
2.13.5



[Intel-gfx] [PATCH 16/21] drm/i915/debugfs: include some gtt page size metrics

2017-10-06 Thread Matthew Auld
Good to know, mostly for debugging purposes.

v2: some improvements from Chris

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 61 ++---
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 44aae25d12c7..552d89eded44 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -119,6 +119,36 @@ static u64 i915_gem_obj_total_ggtt_size(struct 
drm_i915_gem_object *obj)
return size;
 }
 
+static const char *
+stringify_page_sizes(unsigned int page_sizes, char *buf, size_t len)
+{
+   size_t x = 0;
+
+   switch (page_sizes) {
+   case 0:
+   return "";
+   case I915_GTT_PAGE_SIZE_4K:
+   return "4K";
+   case I915_GTT_PAGE_SIZE_64K:
+   return "64K";
+   case I915_GTT_PAGE_SIZE_2M:
+   return "2M";
+   default:
+   if (!buf)
+   return "M";
+
+   if (page_sizes & I915_GTT_PAGE_SIZE_2M)
+   x += snprintf(buf + x, len - x, "2M, ");
+   if (page_sizes & I915_GTT_PAGE_SIZE_64K)
+   x += snprintf(buf + x, len - x, "64K, ");
+   if (page_sizes & I915_GTT_PAGE_SIZE_4K)
+   x += snprintf(buf + x, len - x, "4K, ");
+   buf[x-2] = '\0';
+
+   return buf;
+   }
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -156,9 +186,10 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
if (!drm_mm_node_allocated(&vma->node))
continue;
 
-   seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
+   seq_printf(m, " (%sgtt offset: %08llx, size: %08llx, pages: %s",
   i915_vma_is_ggtt(vma) ? "g" : "pp",
-  vma->node.start, vma->node.size);
+  vma->node.start, vma->node.size,
+  stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
if (i915_vma_is_ggtt(vma)) {
switch (vma->ggtt_view.type) {
case I915_GGTT_VIEW_NORMAL:
@@ -403,10 +434,12 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
struct drm_i915_private *dev_priv = node_to_i915(m->private);
struct drm_device *dev = &dev_priv->drm;
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   u32 count, mapped_count, purgeable_count, dpy_count;
-   u64 size, mapped_size, purgeable_size, dpy_size;
+   u32 count, mapped_count, purgeable_count, dpy_count, huge_count;
+   u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
struct drm_i915_gem_object *obj;
+   unsigned int page_sizes = 0;
struct drm_file *file;
+   char buf[80];
int ret;
 
ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -420,6 +453,7 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
size = count = 0;
mapped_size = mapped_count = 0;
purgeable_size = purgeable_count = 0;
+   huge_size = huge_count = 0;
list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) {
size += obj->base.size;
++count;
@@ -433,6 +467,12 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
mapped_count++;
mapped_size += obj->base.size;
}
+
+   if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   huge_count++;
+   huge_size += obj->base.size;
+   page_sizes |= obj->mm.page_sizes.sg;
+   }
}
seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
@@ -455,6 +495,12 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
mapped_count++;
mapped_size += obj->base.size;
}
+
+   if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   huge_count++;
+   huge_size += obj->base.size;
+   page_sizes |= obj->mm.page_sizes.sg;
+   }
}
seq_printf(m, "%u bound objects, %llu bytes\n",
   count, size);
@@ -462,11 +508,18 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
   purgeable_count, purgeable_size);
seq_printf(m, "%u mapped objects, %llu bytes\n",
   mapped_count, mapped_size);
+   seq_printf(m, "%u huge-paged objects (%s) %llu bytes\n",
+  huge_count,
+  stringify_page_sizes(page_sizes, buf, sizeof(buf)),
+  huge_size);
seq_prin
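
The multi-size branch of stringify_page_sizes() builds a comma-separated list by accumulating snprintf() return values. Since snprintf() returns the length it would have written, a defensive variant may want to stop as soon as the output is truncated. The following is a minimal userspace sketch under that assumption, not the patch's code: page_sizes_to_str() and the SZ_* macros are hypothetical names, and it always writes into the caller's buffer instead of returning string literals for the single-size cases.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SZ_4K  (1u << 12)
#define SZ_64K (1u << 16)
#define SZ_2M  (1u << 21)

/* Format a page-size bitmask as "2M, 64K, 4K" (largest first). */
static const char *page_sizes_to_str(unsigned int mask, char *buf, size_t len)
{
	static const struct {
		unsigned int size;
		const char *name;
	} names[] = {
		{ SZ_2M,  "2M"  },
		{ SZ_64K, "64K" },
		{ SZ_4K,  "4K"  },
	};
	size_t used = 0;
	size_t i;

	assert(buf && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(mask & names[i].size))
			continue;

		/* Prefix a separator for every entry after the first. */
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? ", " : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* truncated: output is cut short but terminated */
		used += (size_t)n;
	}

	return buf;
}
```

Writing the separator before each entry rather than after it avoids having to trim a trailing ", " once the loop is done.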

[Intel-gfx] [PATCH 21/21] drm/i915: enable platform support for 2M pages

2017-10-06 Thread Matthew Auld
For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_pci.c  | 6 --
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8d349aec1902..bf467f30c99b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -376,7 +376,8 @@ static const struct intel_device_info 
intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
G75_FEATURES, \
BDW_COLORS, \
-   GEN_DEFAULT_PAGE_SIZES, \
+   .page_sizes = I915_GTT_PAGE_SIZE_4K | \
+ I915_GTT_PAGE_SIZE_2M, \
.has_logical_ring_contexts = 1, \
.has_full_48bit_ppgtt = 1, \
.has_64bit_reloc = 1, \
@@ -437,7 +438,8 @@ static const struct intel_device_info intel_cherryview_info 
__initconst = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
.page_sizes = I915_GTT_PAGE_SIZE_4K | \
- I915_GTT_PAGE_SIZE_64K
+ I915_GTT_PAGE_SIZE_64K | \
+ I915_GTT_PAGE_SIZE_2M
 
 #define GEN9_FEATURES \
GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c 
b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 7a9735dac912..04eb9362f4f8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -176,7 +176,8 @@ struct drm_i915_private *mock_gem_device(void)
 
mkwrite_device_info(i915)->page_sizes =
I915_GTT_PAGE_SIZE_4K |
-   I915_GTT_PAGE_SIZE_64K;
+   I915_GTT_PAGE_SIZE_64K |
+   I915_GTT_PAGE_SIZE_2M;
 
spin_lock_init(&i915->mm.object_stat_lock);
mock_uncore_init(i915);
-- 
2.13.5



[Intel-gfx] [PATCH 11/21] drm/i915: disable GTT cache for 2M pages

2017-10-06 Thread Matthew Auld
When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!

v3: explicitly check that the system does support 2M/1G pages

v4: split WA and decision logic

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_pm.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 171b21f6c4ad..9d0ca2656a23 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8453,6 +8453,9 @@ static void skl_init_clock_gating(struct drm_i915_private 
*dev_priv)
 
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
+   /* The GTT cache must be disabled if the system is using 2M pages. */
+   bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
+I915_GTT_PAGE_SIZE_2M);
enum pipe pipe;
 
ilk_init_lp_watermarks(dev_priv);
@@ -8487,12 +8490,8 @@ static void bdw_init_clock_gating(struct 
drm_i915_private *dev_priv)
/* WaProgramL3SqcReg1Default:bdw */
gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
-   /*
-* WaGttCachingOffByDefault:bdw
-* GTT cache may not work with big pages, so if those
-* are ever enabled GTT cache may need to be disabled.
-*/
-   I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+   /* WaGttCachingOffByDefault:bdw */
+   I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
 
/* WaKVMNotificationOnConfigChange:bdw */
I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-- 
2.13.5



[Intel-gfx] [PATCH 17/21] drm/i915/selftests: huge page tests

2017-10-06 Thread Matthew Auld
v2: mock test page support configurations and add MI_STORE_DWORD test

v3: run all mockable huge page tests on all platforms via the mock_device

v4: add pin_update regression test
various improvements suggested by Chris

v5: fix issues reported by kbuild
test single sg spanning multiple page sizes
don't explode when running the live-tests through the appgtt

v6: lots of improvements from Chris

v7: run on each engine for igt_write_huge
add simple tmpfs fallback test

v8: size_t is bad
don't break the i386 build

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c|1 +
 drivers/gpu/drm/i915/i915_gem_object.h |2 +
 drivers/gpu/drm/i915/selftests/huge_pages.c| 1715 
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |1 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h   |1 +
 5 files changed, 1720 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 695cb2a38c88..e59fc37bf56e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5408,6 +5408,7 @@ int i915_gem_object_attach_phys(struct 
drm_i915_gem_object *obj, int align)
 #include "selftests/scatterlist.c"
 #include "selftests/mock_gem_device.c"
 #include "selftests/huge_gem_object.c"
+#include "selftests/huge_pages.c"
 #include "selftests/i915_gem_object.c"
 #include "selftests/i915_gem_coherency.c"
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h 
b/drivers/gpu/drm/i915/i915_gem_object.h
index e4e6dd93889d..956c911c2cbf 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -196,6 +196,8 @@ struct drm_i915_gem_object {
unsigned int gtt;
} page_sizes;
 
+   I915_SELFTEST_DECLARE(unsigned int page_mask);
+
struct i915_gem_object_page_iter {
struct scatterlist *sg_pos;
unsigned int sg_idx; /* in pages, but 32bit eek! */
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/selftests/huge_pages.c
new file mode 100644
index ..b8495882e5b0
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -0,0 +1,1715 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "../i915_selftest.h"
+
+#include 
+
+#include "mock_drm.h"
+
+static const unsigned int page_sizes[] = {
+   I915_GTT_PAGE_SIZE_2M,
+   I915_GTT_PAGE_SIZE_64K,
+   I915_GTT_PAGE_SIZE_4K,
+};
+
+static unsigned int get_largest_page_size(struct drm_i915_private *i915,
+ u64 rem)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+   unsigned int page_size = page_sizes[i];
+
+   if (HAS_PAGE_SIZES(i915, page_size) && rem >= page_size)
+   return page_size;
+   }
+
+   return 0;
+}
+
+static void huge_pages_free_pages(struct sg_table *st)
+{
+   struct scatterlist *sg;
+
+   for (sg = st->sgl; sg; sg = __sg_next(sg)) {
+   if (sg_page(sg))
+   __free_pages(sg_page(sg), get_order(sg->length));
+   }
+
+   sg_free_table(st);
+   kfree(st);
+}
+
+static int get_huge_pages(struct drm_i915_gem_object *obj)
+{
+#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
+   unsigned int page_mask = obj->mm.page_mask;
+   struct sg_table *st;
+   struct scatterlist *sg;
+   unsigned int sg_mask;
+   u64 rem;
+
+   st = kmalloc(sizeof(*st), GFP);
+   if (!st)
+   return -ENOMEM;
+
+   if (sg_alloc_table(st, obj->base.size >> PAGE_SHIFT, GFP)) {
+  
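
The selection rule in get_largest_page_size() above is easy to exercise in isolation. The following is a minimal userspace sketch, assuming the supported sizes are passed as a plain bitmask instead of being queried with HAS_PAGE_SIZES(). The helper greedy_split_mask() is hypothetical and not taken from the patch (whose get_huge_pages() is cut off here); it only illustrates how repeatedly taking the largest fitting size covers a length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K  (1u << 12)
#define SZ_64K (1u << 16)
#define SZ_2M  (1u << 21)

/* Largest supported page size that still fits in the remaining length. */
static unsigned int largest_page_size(unsigned int supported, uint64_t rem)
{
	static const unsigned int sizes[] = { SZ_2M, SZ_64K, SZ_4K };
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		if ((supported & sizes[i]) && rem >= sizes[i])
			return sizes[i];
	}

	return 0;
}

/*
 * Hypothetical helper: cover len bytes greedily and report which page
 * sizes were used, as a bitmask.
 */
static unsigned int greedy_split_mask(unsigned int supported, uint64_t len)
{
	unsigned int used = 0;

	assert(len % SZ_4K == 0);

	while (len) {
		unsigned int sz = largest_page_size(supported, len);

		if (!sz)
			break;	/* nothing supported fits the remainder */
		used |= sz;
		len -= sz;
	}

	return used;
}
```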

[Intel-gfx] [PATCH 13/21] drm/i915: add support for 64K scratch page

2017-10-06 Thread Matthew Auld
Before we can fully enable 64K pages, we need to first support a 64K
scratch page if we intend to support the case where we have object sizes
< 2M, since any scratch PTE must also point to a 64K region.  Without
this our 64K usage is limited to objects which completely fill the
page-table, and therefore don't need any scratch.

v2: add reminder about why 48b PPGTT

Reported-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 64 ++---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 79ba485c5d42..7eae6ab8c5fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -519,22 +519,63 @@ static void fill_page_dma_32(struct i915_address_space 
*vm,
 static int
 setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
 {
-   struct page *page;
+   struct page *page = NULL;
dma_addr_t addr;
+   int order;
 
-   page = alloc_page(gfp | __GFP_ZERO);
-   if (unlikely(!page))
-   return -ENOMEM;
+   /*
+* In order to utilize 64K pages for an object with a size < 2M, we will
+* need to support a 64K scratch page, given that every 16th entry for a
+* page-table operating in 64K mode must point to a properly aligned 64K
+* region, including any PTEs which happen to point to scratch.
+*
+* This is only relevant for the 48b PPGTT where we support
+* huge-gtt-pages, see also i915_vma_insert().
+*
+* TODO: we should really consider write-protecting the scratch-page and
+* sharing between ppgtt
+*/
+   if (i915_vm_is_48bit(vm) &&
+   HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K)) {
+   order = get_order(I915_GTT_PAGE_SIZE_64K);
+   page = alloc_pages(gfp | __GFP_ZERO, order);
+   if (page) {
+   addr = dma_map_page(vm->dma, page, 0,
+   I915_GTT_PAGE_SIZE_64K,
+   PCI_DMA_BIDIRECTIONAL);
+   if (unlikely(dma_mapping_error(vm->dma, addr))) {
+   __free_pages(page, order);
+   page = NULL;
+   }
 
-   addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
-   PCI_DMA_BIDIRECTIONAL);
-   if (unlikely(dma_mapping_error(vm->dma, addr))) {
-   __free_page(page);
-   return -ENOMEM;
+   if (!IS_ALIGNED(addr, I915_GTT_PAGE_SIZE_64K)) {
+   dma_unmap_page(vm->dma, addr,
+  I915_GTT_PAGE_SIZE_64K,
+  PCI_DMA_BIDIRECTIONAL);
+   __free_pages(page, order);
+   page = NULL;
+   }
+   }
+   }
+
+   if (!page) {
+   order = 0;
+   page = alloc_page(gfp | __GFP_ZERO);
+   if (unlikely(!page))
+   return -ENOMEM;
+
+   addr = dma_map_page(vm->dma, page, 0, PAGE_SIZE,
+   PCI_DMA_BIDIRECTIONAL);
+   if (unlikely(dma_mapping_error(vm->dma, addr))) {
+   __free_page(page);
+   return -ENOMEM;
+   }
}
 
vm->scratch_page.page = page;
vm->scratch_page.daddr = addr;
+   vm->scratch_page.order = order;
+
return 0;
 }
 
@@ -542,8 +583,9 @@ static void cleanup_scratch_page(struct i915_address_space 
*vm)
 {
struct i915_page_dma *p = &vm->scratch_page;
 
-   dma_unmap_page(vm->dma, p->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-   __free_page(p->page);
+   dma_unmap_page(vm->dma, p->daddr, BIT(p->order) << PAGE_SHIFT,
+  PCI_DMA_BIDIRECTIONAL);
+   __free_pages(p->page, p->order);
 }
 
 static struct i915_page_table *alloc_pt(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b9d7036c3665..e9de3f05b0c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -215,6 +215,7 @@ struct i915_vma;
 
 struct i915_page_dma {
struct page *page;
+   int order;
union {
dma_addr_t daddr;
 
-- 
2.13.5
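
The patch stores the allocation order in the scratch page so that cleanup_scratch_page() can recover the mapped size as BIT(order) << PAGE_SHIFT. The following is a minimal userspace sketch of that arithmetic, assuming 4K CPU pages; order_for_size(), size_for_order() and is_aligned() are hypothetical stand-ins for the kernel's get_order(), the BIT()/PAGE_SHIFT expression and IS_ALIGNED().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K CPU pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Smallest order such that (page size << order) covers size. */
static int order_for_size(size_t size)
{
	int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;

	return order;
}

/* Size in bytes of an allocation of the given order. */
static size_t size_for_order(int order)
{
	return (size_t)1 << (order + SKETCH_PAGE_SHIFT);
}

/* True if addr is a multiple of align; align must be a power of two. */
static int is_aligned(unsigned long long addr, unsigned long long align)
{
	return (addr & (align - 1)) == 0;
}
```

Under the 4K assumption a 64K scratch page is an order-4 allocation and the fallback single page is order 0, which is why one stored integer is enough for both the unmap size and the free.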



[Intel-gfx] [PATCH 19/21] drm/i915: disable platform support for vGPU huge gtt pages

2017-10-06 Thread Matthew Auld
Currently gvt gtt handling doesn't support huge page entries, so disable
for now.

v2: remove useless 48b PPGTT check

Suggested-by: Zhenyu Wang 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Zhenyu Wang 
Reviewed-by: Zhenyu Wang 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e59fc37bf56e..6d36ee9c3508 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4818,6 +4818,15 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
mutex_lock(&dev_priv->drm.struct_mutex);
 
+   /*
+* We need to fallback to 4K pages since gvt gtt handling doesn't
+* support huge page entries - we will need to check either hypervisor
+* mm can support huge guest page or just do emulation in gvt.
+*/
+   if (intel_vgpu_active(dev_priv))
+   mkwrite_device_info(dev_priv)->page_sizes =
+   I915_GTT_PAGE_SIZE_4K;
+
dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
 
if (!i915_modparams.enable_execlists) {
-- 
2.13.5



[Intel-gfx] [PATCH 20/21] drm/i915: enable platform support for 64K pages

2017-10-06 Thread Matthew Auld
For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_pci.c  | 3 ++-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 7938006cf03a..8d349aec1902 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -436,7 +436,8 @@ static const struct intel_device_info intel_cherryview_info 
__initconst = {
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-   .page_sizes = I915_GTT_PAGE_SIZE_4K
+   .page_sizes = I915_GTT_PAGE_SIZE_4K | \
+ I915_GTT_PAGE_SIZE_64K
 
 #define GEN9_FEATURES \
GEN8_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c 
b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f46c3a35d61a..7a9735dac912 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -175,7 +175,8 @@ struct drm_i915_private *mock_gem_device(void)
mkwrite_device_info(i915)->gen = -1;
 
mkwrite_device_info(i915)->page_sizes =
-   I915_GTT_PAGE_SIZE_4K;
+   I915_GTT_PAGE_SIZE_4K |
+   I915_GTT_PAGE_SIZE_64K;
 
spin_lock_init(&i915->mm.object_stat_lock);
mock_uncore_init(i915);
-- 
2.13.5



[Intel-gfx] [PATCH 18/21] drm/i915/selftests: mix huge pages

2017-10-06 Thread Matthew Auld
Try to mix sg page sizes for 4K, 64K and 2M pages.

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/selftests/scatterlist.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c 
b/drivers/gpu/drm/i915/selftests/scatterlist.c
index 1cc5d2931753..cd6d2a16071f 100644
--- a/drivers/gpu/drm/i915/selftests/scatterlist.c
+++ b/drivers/gpu/drm/i915/selftests/scatterlist.c
@@ -189,6 +189,20 @@ static unsigned int random(unsigned long n,
return 1 + (prandom_u32_state(rnd) % 1024);
 }
 
+static unsigned int random_page_size_pages(unsigned long n,
+  unsigned long count,
+  struct rnd_state *rnd)
+{
+   /* 4K, 64K, 2M */
+   static unsigned int page_count[] = {
+   BIT(12) >> PAGE_SHIFT,
+   BIT(16) >> PAGE_SHIFT,
+   BIT(21) >> PAGE_SHIFT,
+   };
+
+   return page_count[(prandom_u32_state(rnd) % 3)];
+}
+
 static inline bool page_contiguous(struct page *first,
   struct page *last,
   unsigned long npages)
@@ -252,6 +266,7 @@ static const npages_fn_t npages_funcs[] = {
grow,
shrink,
random,
+   random_page_size_pages,
NULL,
 };
 
-- 
2.13.5
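
The table in random_page_size_pages() converts each GTT page size into a count of CPU pages by shifting. The following is a minimal userspace sketch of that conversion; the function names are hypothetical, the random value is supplied by the caller instead of coming from prandom_u32_state(), and the CPU page shift is a parameter so the assumption is visible.

```c
#include <assert.h>

/* Number of CPU pages spanned by one block of 2^size_shift bytes. */
static unsigned long pages_for_shift(unsigned int size_shift,
				     unsigned int page_shift)
{
	/* A block smaller than one CPU page would round down to zero. */
	assert(size_shift >= page_shift);
	return (1ul << size_shift) >> page_shift;
}

/* Pick the page count for 4K, 64K or 2M based on a random value. */
static unsigned long pick_page_count(unsigned int rnd, unsigned int page_shift)
{
	static const unsigned int size_shift[] = { 12, 16, 21 };

	return pages_for_shift(size_shift[rnd % 3], page_shift);
}
```

With a page shift of 12 the three possible results are 1, 16 and 512 pages.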



[Intel-gfx] [PATCH 09/21] drm/i915: align 64K objects to 2M

2017-10-06 Thread Matthew Auld
We can't mix 64K and 4K PTEs in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful, but in practice it shouldn't be too bad, since it
only applies to the virtual address space of a 48b PPGTT.
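
A minimal userspace sketch of the placement arithmetic described above, not the driver code itself: it raises the alignment to at least 2M for any vma backed by huge pages and pads the size to 2M when 64K pages are present. The real code additionally checks that the vma lives in the upper part of a 48b address space, which this sketch omits; all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_4K  (UINT64_C(1) << 12)
#define SKETCH_SZ_64K (UINT64_C(1) << 16)
#define SKETCH_SZ_2M  (UINT64_C(1) << 21)

/* Largest power of two that is <= v (v must be non-zero). */
static uint64_t sketch_rounddown_pow_of_two(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p <<= 1;
	return p;
}

/* Round v up to a multiple of align, where align is a power of two. */
static uint64_t sketch_round_up(uint64_t v, uint64_t align)
{
	return (v + align - 1) & ~(align - 1);
}

struct sketch_placement {
	uint64_t size;
	uint64_t alignment;
};

/* sg_mask is the bitmask of page sizes found in the backing store. */
static struct sketch_placement
sketch_align_for_huge(uint64_t sg_mask, uint64_t size, uint64_t alignment)
{
	struct sketch_placement p = { size, alignment };

	if (sg_mask > SKETCH_SZ_4K) {
		uint64_t page_alignment =
			sketch_rounddown_pow_of_two(sg_mask | SKETCH_SZ_2M);

		if (page_alignment > p.alignment)
			p.alignment = page_alignment;

		/* Pad so no 4K object can share the 64K page-table. */
		if (sg_mask & SKETCH_SZ_64K)
			p.size = sketch_round_up(p.size, SKETCH_SZ_2M);
	}
	return p;
}
```

Under these assumptions a single 64K object ends up occupying a full 2M slot of virtual address space, which is the waste the commit message refers to.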

v2: don't separate logically connected ops

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_vma.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5067eab27829..ecddf519a11c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -500,10 +500,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 */
if (upper_32_bits(end) &&
vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   /*
+* We can't mix 64K and 4K PTEs in the same page-table (2M
+* block), and so to avoid the ugliness and complexity of
+* coloring we opt for just aligning 64K objects to 2M.
+*/
u64 page_alignment =
-   rounddown_pow_of_two(vma->page_sizes.sg);
+   rounddown_pow_of_two(vma->page_sizes.sg |
+I915_GTT_PAGE_SIZE_2M);
 
alignment = max(alignment, page_alignment);
+
+   if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
+   size = round_up(size, I915_GTT_PAGE_SIZE_2M);
}
 
ret = i915_gem_gtt_insert(vma->vm, &vma->node,
-- 
2.13.5



[Intel-gfx] [PATCH 10/21] drm/i915: enable IPS bit for 64K pages

2017-10-06 Thread Matthew Auld
Before we can use 64K pages through the IPS bit, we must first enable
that bit through MMIO; otherwise the page-walker will simply ignore it.
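
The enable step amounts to a read-modify-write of one register. The sketch below models only the value computation in plain C. The 0xF field value is taken from the patch; the function name, the `has_64k` flag and passing the register value as an argument are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the patch: GAMW_ECO_ENABLE_64K_IPS_FIELD. */
#define SKETCH_ENABLE_64K_IPS_FIELD 0xFu

/*
 * Return the value to write back to the register: OR in the enable
 * field only when the platform advertises 64K pages and gen <= 10,
 * leaving all other bits as they were read.
 */
static uint32_t sketch_gamw_eco_value(uint32_t current, bool has_64k, int gen)
{
	if (has_64k && gen <= 10)
		return current | SKETCH_ENABLE_64K_IPS_FIELD;
	return current;
}
```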

v2: add comment mentioning that 64K is BDW+

v3: move to more suitable home

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Reviewed-by: Mika Kuoppala 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fb7ac66814ab..74fc9ac11cd5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1987,6 +1987,23 @@ static void gtt_write_workarounds(struct drm_i915_private *dev_priv)
I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_SKL);
else if (IS_GEN9_LP(dev_priv))
I915_WRITE(GEN8_L3_LRA_1_GPGPU, GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_BXT);
+
+   /*
+* To support 64K PTEs we need to first enable the use of the
+* Intermediate-Page-Size(IPS) bit of the PDE field via some magical
+* mmio, otherwise the page-walker will simply ignore the IPS bit. This
+* shouldn't be needed after GEN10.
+*
+* 64K pages were first introduced from BDW+, although technically they
+* only *work* from gen9+. For pre-BDW we instead have the option for
+* 32K pages, but we don't currently have any support for it in our
+* driver.
+*/
+   if (HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_64K) &&
+   INTEL_GEN(dev_priv) <= 10)
+   I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA,
+  I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) |
+  GAMW_ECO_ENABLE_64K_IPS_FIELD);
 }
 
 int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e7dba5539b11..50e65c98ca6c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2371,6 +2371,9 @@ enum i915_power_well_id {
 #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0)
 #define   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS  (1<<18)
 
+#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080)
+#define   GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF
+
 #define GAMT_CHKN_BIT_REG  _MMIO(0x4ab8)
 #define   GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING (1<<28)
 #define   GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT   (1<<24)
-- 
2.13.5



[Intel-gfx] [PATCH 14/21] drm/i915: support 64K pages for the 48b PPGTT

2017-10-06 Thread Matthew Auld
Support inserting 64K pages into the 48b PPGTT.

v2: check for 64K scratch

v3: we should only have to re-adjust maybe_64K at every sg interval
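
The maybe_64K test mentioned in v3 can be read as a pure predicate over the current position. A minimal userspace sketch follows, assuming 4K CPU pages and 512 entries per page table; the `sketch_` names are hypothetical, and the final decision to mark the whole 2M block as 64K (scratch page and padding checks) is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4K CPU pages */
#define SKETCH_PTES       512u /* assumption: entries per page table */
#define SKETCH_SZ_64K     (UINT64_C(1) << 16)

static bool sketch_is_aligned(uint64_t v, uint64_t align)
{
	return (v & (align - 1)) == 0;
}

/*
 * index:   slot within the current page table
 * sg_mask: bitmask of page sizes present in the backing store
 * dma:     DMA address of the current sg chunk
 * rem:     bytes remaining in the current sg chunk
 */
static bool sketch_maybe_64k(unsigned int index, uint64_t sg_mask,
			     uint64_t dma, uint64_t rem)
{
	/* Bytes needed to fill the rest of this page table. */
	uint64_t left = (uint64_t)(SKETCH_PTES - index) << SKETCH_PAGE_SHIFT;

	return index == 0 &&
	       (sg_mask & SKETCH_SZ_64K) != 0 &&
	       sketch_is_aligned(dma, SKETCH_SZ_64K) &&
	       (sketch_is_aligned(rem, SKETCH_SZ_64K) || rem >= left);
}
```

The same alignment and length conditions are what the patch re-evaluates whenever the iterator moves to the next sg entry part-way through a page table.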

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 31 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  7 +++
 2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7eae6ab8c5fd..118aad90468f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1069,6 +1069,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
unsigned int page_size;
+   bool maybe_64K = false;
gen8_pte_t encode = pte_encode;
gen8_pte_t *vaddr;
u16 index, max;
@@ -1090,6 +1091,13 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
max = GEN8_PTES;
page_size = I915_GTT_PAGE_SIZE;
 
+   if (!index &&
+   vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
+   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+   (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+rem >= (max - index) << PAGE_SHIFT))
+   maybe_64K = true;
+
vaddr = kmap_atomic_px(pt);
}
 
@@ -1109,12 +1117,35 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
iter->dma = sg_dma_address(iter->sg);
iter->max = iter->dma + rem;
 
+   if (maybe_64K && index < max &&
+   !(IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
+ (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
+  rem >= (max - index) << PAGE_SHIFT)))
+   maybe_64K = false;
+
if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
break;
}
} while (rem >= page_size && index < max);
 
kunmap_atomic(vaddr);
+
+   /*
+* Is it safe to mark the 2M block as 64K? -- Either we have
+* filled whole page-table with 64K entries, or filled part of
+* it and have reached the end of the sg table and we have
+* enough padding.
+*/
+   if (maybe_64K &&
+   (index == max ||
+(i915_vm_has_scratch_64K(vma->vm) &&
+ !iter->sg && IS_ALIGNED(vma->node.start +
+ vma->node.size,
+ I915_GTT_PAGE_SIZE_2M)))) {
+   vaddr = kmap_atomic_px(pd);
+   vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
+   kunmap_atomic(vaddr);
+   }
} while (iter->sg);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9de3f05b0c9..93211a96fdad 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4))
 #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6))
 
+#define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
 struct sg_table;
@@ -352,6 +353,12 @@ i915_vm_is_48bit(const struct i915_address_space *vm)
return (vm->total - 1) >> 32;
 }
 
+static inline bool
+i915_vm_has_scratch_64K(struct i915_address_space *vm)
+{
+   return vm->scratch_page.order == get_order(I915_GTT_PAGE_SIZE_64K);
+}
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
-- 
2.13.5



[Intel-gfx] [PATCH 12/21] drm/i915: support 2M pages for the 48b PPGTT

2017-10-06 Thread Matthew Auld
Support inserting 2M gtt pages into the 48b PPGTT.
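
To make the 2M case concrete, here is a small userspace sketch of when a 2M entry can be written directly into the page directory. It assumes the usual 4-level layout with 9 index bits per level (shifts 39/30/21/12); the `sketch_` names are hypothetical stand-ins for the driver's index helper and the condition in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_2M (UINT64_C(1) << 21)

struct sketch_idx {
	unsigned int pml4e, pdpe, pde, pte;
};

/* Split a GTT offset into per-level indices (assumed 9 bits each). */
static struct sketch_idx sketch_insert_idx(uint64_t start)
{
	struct sketch_idx idx = {
		.pml4e = (start >> 39) & 0x1ff,
		.pdpe  = (start >> 30) & 0x1ff,
		.pde   = (start >> 21) & 0x1ff,
		.pte   = (start >> 12) & 0x1ff,
	};

	return idx;
}

/*
 * A 2M PDE is usable only if the backing store contains 2M pages, the
 * DMA address is 2M aligned, at least 2M remains in the sg chunk, and
 * the GTT offset sits at the start of a page table (pte index 0).
 */
static bool sketch_can_use_2m(uint64_t sg_mask, uint64_t start,
			      uint64_t dma, uint64_t rem)
{
	struct sketch_idx idx = sketch_insert_idx(start);

	return (sg_mask & SKETCH_SZ_2M) != 0 &&
	       (dma & (SKETCH_SZ_2M - 1)) == 0 &&
	       rem >= SKETCH_SZ_2M &&
	       idx.pte == 0;
}
```

When any of these conditions fails, the insertion falls back to ordinary 4K PTEs for that stretch.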

v2: sanity check sg->length against page_size

v3: don't recalculate rem on each loop
whitespace breakup

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 76 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 +
 2 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 74fc9ac11cd5..79ba485c5d42 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1013,6 +1013,69 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
  cache_level);
 }
 
+static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
+  struct i915_page_directory_pointer **pdps,
+  struct sgt_dma *iter,
+  enum i915_cache_level cache_level)
+{
+   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
+   u64 start = vma->node.start;
+   dma_addr_t rem = iter->sg->length;
+
+   do {
+   struct gen8_insert_pte idx = gen8_insert_pte(start);
+   struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
+   struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
+   unsigned int page_size;
+   gen8_pte_t encode = pte_encode;
+   gen8_pte_t *vaddr;
+   u16 index, max;
+
+   if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
+   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
+   rem >= I915_GTT_PAGE_SIZE_2M && !idx.pte) {
+   index = idx.pde;
+   max = I915_PDES;
+   page_size = I915_GTT_PAGE_SIZE_2M;
+
+   encode |= GEN8_PDE_PS_2M;
+
+   vaddr = kmap_atomic_px(pd);
+   } else {
+   struct i915_page_table *pt = pd->page_table[idx.pde];
+
+   index = idx.pte;
+   max = GEN8_PTES;
+   page_size = I915_GTT_PAGE_SIZE;
+
+   vaddr = kmap_atomic_px(pt);
+   }
+
+   do {
+   GEM_BUG_ON(iter->sg->length < page_size);
+   vaddr[index++] = encode | iter->dma;
+
+   start += page_size;
+   iter->dma += page_size;
+   rem -= page_size;
+   if (iter->dma >= iter->max) {
+   iter->sg = __sg_next(iter->sg);
+   if (!iter->sg)
+   break;
+
+   rem = iter->sg->length;
+   iter->dma = sg_dma_address(iter->sg);
+   iter->max = iter->dma + rem;
+
+   if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
+   break;
+   }
+   } while (rem >= page_size && index < max);
+
+   kunmap_atomic(vaddr);
+   } while (iter->sg);
+}
+
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
   struct i915_vma *vma,
   enum i915_cache_level cache_level,
@@ -1025,11 +1088,16 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
.max = iter.dma + iter.sg->length,
};
struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
-   struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
-   while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
-&idx, cache_level))
-   GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+   if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   gen8_ppgtt_insert_huge_entries(vma, pdps, &iter, cache_level);
+   } else {
+   struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
+
+   while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
+&iter, &idx, cache_level))
+   GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+   }
 }
 
 static void gen8_free_page_tables(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f22491b4e6dc..b9d7036c3665 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -154,6 +154,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_GET_AGE(x) ((x) & (3 << 4))
 #define CHV_PPAT_GET_SNOOP(x) ((x) & (1 << 6))
 
+#define GEN8_PDE_PS_2

[Intel-gfx] [PATCH 07/21] drm/i915: introduce vm set_pages/clear_pages

2017-10-06 Thread Matthew Auld
Move the setting/clearing of the vma->pages to a vm operation. Doing so
neatens things up a little, but more importantly gives us a sane place
to also set/clear the vma->page_sizes, which we introduce later in
preparation for supporting huge-pages.
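
The shape of the change can be sketched in userspace as a small ops table: the address space owns set_pages/clear_pages, and callers go through it rather than touching vma->pages in the bind paths. This is only an illustration under simplifying assumptions. Every `sketch_` name is hypothetical, the backing store is an opaque pointer, and the real clear_pages also frees a private sg table when the vma does not share the object's pages.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_vma;

/* Per-address-space operations, mirroring set_pages/clear_pages. */
struct sketch_vm_ops {
	int (*set_pages)(struct sketch_vma *vma);
	void (*clear_pages)(struct sketch_vma *vma);
};

struct sketch_vma {
	const struct sketch_vm_ops *ops;
	const void *obj_pages; /* backing store owned by the object */
	const void *pages;     /* what the vma currently points at */
};

static int sketch_ppgtt_set_pages(struct sketch_vma *vma)
{
	assert(vma->pages == NULL); /* stands in for GEM_BUG_ON(vma->pages) */
	vma->pages = vma->obj_pages;
	return 0;
}

static void sketch_clear_pages(struct sketch_vma *vma)
{
	assert(vma->pages != NULL); /* stands in for GEM_BUG_ON(!vma->pages) */
	vma->pages = NULL;
}

static const struct sketch_vm_ops sketch_ppgtt_ops = {
	.set_pages = sketch_ppgtt_set_pages,
	.clear_pages = sketch_clear_pages,
};

/* Returns 1 if the vma pointed at backing after set and is empty after clear. */
static int sketch_bind_unbind(const void *backing)
{
	struct sketch_vma vma = { &sketch_ppgtt_ops, backing, NULL };
	int ok;

	if (vma.ops->set_pages(&vma))
		return 0;
	ok = (vma.pages == backing);
	vma.ops->clear_pages(&vma);
	return ok && vma.pages == NULL;
}
```

Having one pair of hooks per address space is what gives later patches a single place to also record and reset the page-size information.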

v2: remove redundant vma->pages check

v3: GEM_BUG_ON(vma->pages) following i915_vma_remove

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 70 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   | 27 +++-
 drivers/gpu/drm/i915/selftests/mock_gtt.c | 11 ++---
 4 files changed, 66 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4c82ceb8d318..c534b74eee32 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -205,8 +205,6 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
return ret;
}
 
-   vma->pages = vma->obj->mm.pages;
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (vma->obj->gt_ro)
@@ -222,6 +220,26 @@ static void ppgtt_unbind_vma(struct i915_vma *vma)
vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 }
 
+static int ppgtt_set_pages(struct i915_vma *vma)
+{
+   GEM_BUG_ON(vma->pages);
+
+   vma->pages = vma->obj->mm.pages;
+
+   return 0;
+}
+
+static void clear_pages(struct i915_vma *vma)
+{
+   GEM_BUG_ON(!vma->pages);
+
+   if (vma->pages != vma->obj->mm.pages) {
+   sg_free_table(vma->pages);
+   kfree(vma->pages);
+   }
+   vma->pages = NULL;
+}
+
 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
  enum i915_cache_level level)
 {
@@ -1452,6 +1470,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.cleanup = gen8_ppgtt_cleanup;
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
+   ppgtt->base.set_pages = ppgtt_set_pages;
+   ppgtt->base.clear_pages = clear_pages;
ppgtt->debug_dump = gen8_dump_ppgtt;
 
return 0;
@@ -1894,6 +1914,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
+   ppgtt->base.set_pages = ppgtt_set_pages;
+   ppgtt->base.clear_pages = clear_pages;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->debug_dump = gen6_dump_ppgtt;
 
@@ -2405,12 +2427,6 @@ static int ggtt_bind_vma(struct i915_vma *vma,
struct drm_i915_gem_object *obj = vma->obj;
u32 pte_flags;
 
-   if (unlikely(!vma->pages)) {
-   int ret = i915_get_ggtt_vma_pages(vma);
-   if (ret)
-   return ret;
-   }
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (obj->gt_ro)
@@ -2447,12 +2463,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
u32 pte_flags;
int ret;
 
-   if (unlikely(!vma->pages)) {
-   ret = i915_get_ggtt_vma_pages(vma);
-   if (ret)
-   return ret;
-   }
-
/* Currently applicable only to VLV */
pte_flags = 0;
if (vma->obj->gt_ro)
@@ -2467,7 +2477,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 vma->node.start,
 vma->size);
if (ret)
-   goto err_pages;
+   return ret;
}
 
appgtt->base.insert_entries(&appgtt->base, vma, cache_level,
@@ -2481,17 +2491,6 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
}
 
return 0;
-
-err_pages:
-   if (!(vma->flags & (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND))) {
-   if (vma->pages != vma->obj->mm.pages) {
-   GEM_BUG_ON(!vma->pages);
-   sg_free_table(vma->pages);
-   kfree(vma->pages);
-   }
-   vma->pages = NULL;
-   }
-   return ret;
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_vma *vma)
@@ -2529,6 +2528,19 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL);
 }
 
+static int ggtt_set_pages(struct i915_vma *vma)
+{
+   int ret;
+
+   GEM_BUG_ON(vma->pages);
+
+   ret = i915_get_ggtt_vma_pages(vma);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
 static void i915_gtt_color_adjust(const struct drm_mm_node *node,

[Intel-gfx] [PATCH 03/21] drm/i915/gemfs: enable THP

2017-10-06 Thread Matthew Auld
Enable transparent-huge-pages through gemfs by mounting with
huge=within_size.
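
As a back-of-the-envelope model of the overallocation that within_size can cause, the sketch below computes a worst-case backing size for an object. It assumes HPAGE_PMD_SIZE is 2M and that any object of at least that size may be backed entirely by huge pages. This is a simplification for illustration, not the shmem allocation policy itself, and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumption: HPAGE_PMD_SIZE is 2M, as on x86-64. */
#define SKETCH_HPAGE_SIZE (UINT64_C(1) << 21)

/*
 * Worst-case bytes of backing store for an object of the given size:
 * objects smaller than one huge page are left alone, larger ones may
 * be rounded up to a whole number of huge pages.
 */
static uint64_t sketch_worst_case_backing(uint64_t size)
{
	if (size < SKETCH_HPAGE_SIZE)
		return size;
	return (size + SKETCH_HPAGE_SIZE - 1) & ~(SKETCH_HPAGE_SIZE - 1);
}
```

Under this model a 2M + 4K object may consume 4M, which shmem can later reclaim by splitting the partially used huge page under memory pressure.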

v2: sprinkle within_size comment

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gemfs.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
index 168d0bd98f60..e2993857df37 100644
--- a/drivers/gpu/drm/i915/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -24,6 +24,7 @@
 
 #include 
 #include 
+#include 
 
 #include "i915_drv.h"
 #include "i915_gemfs.h"
@@ -41,6 +42,27 @@ int i915_gemfs_init(struct drm_i915_private *i915)
if (IS_ERR(gemfs))
return PTR_ERR(gemfs);
 
+   /*
+* Enable huge-pages for objects that are at least HPAGE_PMD_SIZE, most
+* likely 2M. Note that within_size may overallocate huge-pages, if say
+* we allocate an object of size 2M + 4K, we may get 2M + 2M, but under
+* memory pressure shmem should split any huge-pages which can be
+* shrunk.
+*/
+
+   if (has_transparent_hugepage()) {
+   struct super_block *sb = gemfs->mnt_sb;
+   char options[] = "huge=within_size";
+   int flags = 0;
+   int err;
+
+   err = sb->s_op->remount_fs(sb, &flags, options);
+   if (err) {
+   kern_unmount(gemfs);
+   return err;
+   }
+   }
+
i915->mm.gemfs = gemfs;
 
return 0;
-- 
2.13.5



[Intel-gfx] [PATCH 00/21] huge gtt pages

2017-10-06 Thread Matthew Auld
Some more bits of polish.

Matthew Auld (21):
  mm/shmem: introduce shmem_file_setup_with_mnt
  drm/i915: introduce simple gemfs
  drm/i915/gemfs: enable THP
  drm/i915: introduce page_sizes field to dev_info
  drm/i915: push set_pages down to the callers
  drm/i915: introduce page_size members
  drm/i915: introduce vm set_pages/clear_pages
  drm/i915: align the vma start to the largest gtt page size
  drm/i915: align 64K objects to 2M
  drm/i915: enable IPS bit for 64K pages
  drm/i915: disable GTT cache for 2M pages
  drm/i915: support 2M pages for the 48b PPGTT
  drm/i915: add support for 64K scratch page
  drm/i915: support 64K pages for the 48b PPGTT
  drm/i915: accurate page size tracking for the ppgtt
  drm/i915/debugfs: include some gtt page size metrics
  drm/i915/selftests: huge page tests
  drm/i915/selftests: mix huge pages
  drm/i915: disable platform support for vGPU huge gtt pages
  drm/i915: enable platform support for 64K pages
  drm/i915: enable platform support for 2M pages

 drivers/gpu/drm/i915/Makefile  |1 +
 drivers/gpu/drm/i915/i915_debugfs.c|   61 +-
 drivers/gpu/drm/i915/i915_drv.h|   29 +-
 drivers/gpu/drm/i915/i915_gem.c|  126 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |   18 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c|  275 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.h|   20 +-
 drivers/gpu/drm/i915/i915_gem_internal.c   |   18 +-
 drivers/gpu/drm/i915/i915_gem_object.h |   31 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c |   16 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c|   15 +-
 drivers/gpu/drm/i915/i915_gemfs.c  |   74 +
 drivers/gpu/drm/i915/i915_gemfs.h  |   34 +
 drivers/gpu/drm/i915/i915_pci.c|   21 +
 drivers/gpu/drm/i915/i915_reg.h|3 +
 drivers/gpu/drm/i915/i915_vma.c|   49 +-
 drivers/gpu/drm/i915/i915_vma.h|1 +
 drivers/gpu/drm/i915/intel_pm.c|   11 +-
 drivers/gpu/drm/i915/selftests/huge_gem_object.c   |   14 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c| 1715 
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c  |   15 +-
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |1 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h   |1 +
 drivers/gpu/drm/i915/selftests/mock_gem_device.c   |9 +
 drivers/gpu/drm/i915/selftests/mock_gtt.c  |   11 +-
 drivers/gpu/drm/i915/selftests/scatterlist.c   |   15 +
 include/linux/shmem_fs.h   |2 +
 mm/shmem.c |   30 +-
 28 files changed, 2479 insertions(+), 137 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

-- 
2.13.5



[Intel-gfx] [PATCH 01/21] mm/shmem: introduce shmem_file_setup_with_mnt

2017-10-06 Thread Matthew Auld
We are planning to use our own tmpfs mnt in i915 in place of the
shm_mnt, such that we can control the mount options, in particular
huge=, which we require to support huge-gtt-pages. So rather than roll
our own version of __shmem_file_setup, it would be preferred if we could
just give shmem our mnt, and let it do the rest.

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Dave Hansen 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Acked-by: Andrew Morton 
Acked-by: Kirill A. Shutemov 
Reviewed-by: Joonas Lahtinen 
---
 include/linux/shmem_fs.h |  2 ++
 mm/shmem.c   | 30 ++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index b6c3540e07bc..0937d9a7d8fb 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -53,6 +53,8 @@ extern struct file *shmem_file_setup(const char *name,
loff_t size, unsigned long flags);
 extern struct file *shmem_kernel_file_setup(const char *name, loff_t size,
unsigned long flags);
+extern struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt,
+   const char *name, loff_t size, unsigned long flags);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern unsigned long shmem_get_unmapped_area(struct file *, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 07a1d22807be..3229d27503ec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4183,7 +4183,7 @@ static const struct dentry_operations anon_ops = {
.d_dname = simple_dname
 };
 
-static struct file *__shmem_file_setup(const char *name, loff_t size,
+static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
    unsigned long flags, unsigned int i_flags)
 {
struct file *res;
@@ -4192,8 +4192,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
struct super_block *sb;
struct qstr this;
 
-   if (IS_ERR(shm_mnt))
-   return ERR_CAST(shm_mnt);
+   if (IS_ERR(mnt))
+   return ERR_CAST(mnt);
 
if (size < 0 || size > MAX_LFS_FILESIZE)
return ERR_PTR(-EINVAL);
@@ -4205,8 +4205,8 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
this.name = name;
this.len = strlen(name);
this.hash = 0; /* will go */
-   sb = shm_mnt->mnt_sb;
-   path.mnt = mntget(shm_mnt);
+   sb = mnt->mnt_sb;
+   path.mnt = mntget(mnt);
path.dentry = d_alloc_pseudo(sb, &this);
if (!path.dentry)
goto put_memory;
@@ -4251,7 +4251,7 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
  */
 struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-   return __shmem_file_setup(name, size, flags, S_PRIVATE);
+   return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);
 }
 
 /**
@@ -4262,11 +4262,25 @@ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned lon
  */
 struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags)
 {
-   return __shmem_file_setup(name, size, flags, 0);
+   return __shmem_file_setup(shm_mnt, name, size, flags, 0);
 }
 EXPORT_SYMBOL_GPL(shmem_file_setup);
 
 /**
+ * shmem_file_setup_with_mnt - get an unlinked file living in tmpfs
+ * @mnt: the tmpfs mount where the file will be created
+ * @name: name for dentry (to be seen in /proc/<pid>/maps
+ * @size: size to be set for the file
+ * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
+ */
+struct file *shmem_file_setup_with_mnt(struct vfsmount *mnt, const char *name,
+  loff_t size, unsigned long flags)
+{
+   return __shmem_file_setup(mnt, name, size, flags, 0);
+}
+EXPORT_SYMBOL_GPL(shmem_file_setup_with_mnt);
+
+/**
  * shmem_zero_setup - setup a shared anonymous mapping
  * @vma: the vma to be mmapped is prepared by do_mmap_pgoff
  */
@@ -4281,7 +4295,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 * accessible to the user through its mapping, use S_PRIVATE flag to
 * bypass file security, in the same way as shmem_kernel_file_setup().
 */
-   file = __shmem_file_setup("dev/zero", size, vma->vm_flags, S_PRIVATE);
+   file = shmem_kernel_file_setup("dev/zero", size, vma->vm_flags);
if (IS_ERR(file))
return PTR_ERR(file);
 
-- 
2.13.5



[Intel-gfx] [PATCH 04/21] drm/i915: introduce page_sizes field to dev_info

2017-10-06 Thread Matthew Auld
In preparation for huge gtt pages, expose page_sizes as part of the
device info to indicate the page sizes supported by the HW. Currently
only 4K is supported.
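
Because each page size is a distinct power of two, page_sizes works as a plain bitmask. A minimal userspace sketch of a "does the HW support all of these sizes" check follows; the helper name and the exact form of the test are assumptions for illustration and are not taken from this patch.

```c
#include <stdbool.h>

#define SKETCH_BIT(n)  (1u << (n))
#define SKETCH_SZ_4K   SKETCH_BIT(12)
#define SKETCH_SZ_64K  SKETCH_BIT(16)
#define SKETCH_SZ_2M   SKETCH_BIT(21)

/* True only if every size requested in sizes is present in supported. */
static bool sketch_has_page_sizes(unsigned int supported, unsigned int sizes)
{
	return (sizes & ~supported) == 0;
}
```

With the default of 4K only, a query for 64K or 2M fails; a platform that later ORs more sizes into its mask passes without any change to the callers.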

v2: s/page_size_mask/page_sizes/

v3: introduce I915_GTT_MAX_PAGE_SIZE

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Mika Kuoppala 
Cc: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |  2 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h  |  8 +++-
 drivers/gpu/drm/i915/i915_pci.c  | 18 ++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  3 +++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ec6f320cc4f5..3d4dee817381 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -869,6 +869,8 @@ struct intel_device_info {
u8 num_sprites[I915_MAX_PIPES];
u8 num_scalers[I915_MAX_PIPES];
 
+   unsigned int page_sizes; /* page sizes supported by the HW */
+
 #define DEFINE_FLAG(name) u8 name:1
DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
 #undef DEFINE_FLAG
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f62fb903dc24..50218c141c21 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -42,7 +42,13 @@
 #include "i915_gem_request.h"
 #include "i915_selftest.h"
 
-#define I915_GTT_PAGE_SIZE 4096UL
+#define I915_GTT_PAGE_SIZE_4K BIT(12)
+#define I915_GTT_PAGE_SIZE_64K BIT(16)
+#define I915_GTT_PAGE_SIZE_2M BIT(21)
+
+#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
+#define I915_GTT_MAX_PAGE_SIZE I915_GTT_PAGE_SIZE_2M
+
 #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
 
 #define I915_FENCE_REG_NONE -1
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 745b6a6e0188..7938006cf03a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -58,6 +58,10 @@
.color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 }
 
 /* Keep in gen based order, and chronological order within a gen */
+
+#define GEN_DEFAULT_PAGE_SIZES \
+   .page_sizes = I915_GTT_PAGE_SIZE_4K
+
 #define GEN2_FEATURES \
.gen = 2, .num_pipes = 1, \
.has_overlay = 1, .overlay_needs_physical = 1, \
@@ -67,6 +71,7 @@
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i830_info __initconst = {
@@ -100,6 +105,7 @@ static const struct intel_device_info intel_i865g_info __initconst = {
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i915g_info __initconst = {
@@ -163,6 +169,7 @@ static const struct intel_device_info intel_pineview_info __initconst = {
.ring_mask = RENDER_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i965g_info __initconst = {
@@ -205,6 +212,7 @@ static const struct intel_device_info intel_gm45_info __initconst = {
.ring_mask = RENDER_RING | BSD_RING, \
.has_snoop = true, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ironlake_d_info __initconst = {
@@ -228,6 +236,7 @@ static const struct intel_device_info intel_ironlake_m_info __initconst = {
.has_rc6p = 1, \
.has_aliasing_ppgtt = 1, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
 
 #define SNB_D_PLATFORM \
@@ -271,6 +280,7 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info __initconst =
.has_aliasing_ppgtt = 1, \
.has_full_ppgtt = 1, \
GEN_DEFAULT_PIPEOFFSETS, \
+   GEN_DEFAULT_PAGE_SIZES, \
IVB_CURSOR_OFFSETS
 
 #define IVB_D_PLATFORM \
@@ -327,6 +337,7 @@ static const struct intel_device_info intel_valleyview_info __initconst = {
.has_snoop = true,
.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
.display_mmio_offset = VLV_DISPLAY_BASE,
+   GEN_DEFAULT_PAGE_SIZES,
GEN_DEFAULT_PIPEOFFSETS,
CURSOR_OFFSETS
 };
@@ -365,6 +376,7 @@ static const struct intel_device_info intel_haswell_gt3_info __initconst = {
 #define GEN8_FEATURES \
G75_FEATURES, \
BDW_COLORS, \
+   GEN_DEFAULT_PAGE_SIZES, \
.has_logical_ring_contexts = 1, \
.has_full_48bit_ppgtt = 1, \
.has_64bit_reloc = 1, \
@@ -417,13 +429,18 @@ static const struct intel_device_info intel_cherryview_info __initconst = {
.has_reset_engine = 1,
.has_snoop = true,
.display_mmio_offse

[Intel-gfx] [PATCH 08/21] drm/i915: align the vma start to the largest gtt page size

2017-10-06 Thread Matthew Auld
For the 48b PPGTT try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning and objects which need to be
tightly packed into the lower 32bits we don't force any alignment.
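
The alignment rule can be sketched in userspace as follows. The largest page size in the sg mask becomes the minimum alignment, but only when the placement range reaches above 4G, so objects confined to the low 32 bits are left alone. The `sketch_` names are hypothetical, and soft pinning, which bypasses this path entirely, is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_4K (UINT64_C(1) << 12)

/* Largest power of two that is <= v (v must be non-zero). */
static uint64_t sketch_largest_page(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p <<= 1;
	return p;
}

/*
 * sg_mask:   bitmask of page sizes present in the backing store
 * end:       upper bound of the range the vma may be placed in
 * alignment: alignment requested by the caller
 */
static uint64_t sketch_vma_alignment(uint64_t sg_mask, uint64_t end,
				     uint64_t alignment)
{
	if ((end >> 32) != 0 && sg_mask > SKETCH_SZ_4K) {
		uint64_t page_alignment = sketch_largest_page(sg_mask);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}
	return alignment;
}
```

For a mixed mask such as 2M | 4K the result is 2M, which matches the "just align to the largest" behaviour described above without guaranteeing that every page actually lands in a 2M entry.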

v2: various improvements suggested by Chris

v3: use set_pages and better placement of page_sizes

v4: prefer upper_32_bits()

v5: assign vma->page_sizes = vma->obj->page_sizes directly
prefer sizeof(vma->page_sizes)

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  6 ++
 drivers/gpu/drm/i915/i915_vma.c | 13 +
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c534b74eee32..fb7ac66814ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -226,6 +226,8 @@ static int ppgtt_set_pages(struct i915_vma *vma)
 
vma->pages = vma->obj->mm.pages;
 
+   vma->page_sizes = vma->obj->mm.page_sizes;
+
return 0;
 }
 
@@ -238,6 +240,8 @@ static void clear_pages(struct i915_vma *vma)
kfree(vma->pages);
}
vma->pages = NULL;
+
+   memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
@@ -2538,6 +2542,8 @@ static int ggtt_set_pages(struct i915_vma *vma)
if (ret)
return ret;
 
+   vma->page_sizes = vma->obj->mm.page_sizes;
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 49bf49571e47..5067eab27829 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -493,6 +493,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
if (ret)
goto err_clear;
} else {
+   /*
+* We only support huge gtt pages through the 48b PPGTT,
+* however we also don't want to force any alignment for
+* objects which need to be tightly packed into the low 32bits.
+*/
+   if (upper_32_bits(end) &&
+   vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+   u64 page_alignment =
+   rounddown_pow_of_two(vma->page_sizes.sg);
+
+   alignment = max(alignment, page_alignment);
+   }
+
ret = i915_gem_gtt_insert(vma->vm, &vma->node,
  size, alignment, obj->cache_level,
  start, end, flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e811067c7724..c59ba76613a3 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -55,6 +55,7 @@ struct i915_vma {
void __iomem *iomap;
u64 size;
u64 display_alignment;
+   struct i915_page_sizes page_sizes;
 
u32 fence_size;
u32 fence_alignment;
-- 
2.13.5
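
The alignment rule added to i915_vma_insert() in the patch above can be tried outside the kernel. What follows is only a minimal userspace sketch: `rounddown_pow2`, `pick_alignment` and `GTT_PAGE_SIZE_4K` are hypothetical stand-ins for `rounddown_pow_of_two()`, the new hunk and `I915_GTT_PAGE_SIZE`, and are not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for I915_GTT_PAGE_SIZE (4K). */
#define GTT_PAGE_SIZE_4K (1ull << 12)

/* Largest power of two that is <= x; x must be non-zero. */
static uint64_t rounddown_pow2(uint64_t x)
{
	assert(x != 0);
	while (x & (x - 1))
		x &= x - 1;	/* drop the lowest set bit until one bit remains */
	return x;
}

/*
 * Hypothetical mirror of the new hunk: only raise the alignment when the
 * allowed range reaches above 4 GiB (upper 32 bits of 'end' are set) and
 * the sg mask contains something larger than a 4K page.
 */
static uint64_t pick_alignment(uint64_t alignment, uint64_t end,
			       uint64_t sg_mask)
{
	if ((end >> 32) && sg_mask > GTT_PAGE_SIZE_4K) {
		uint64_t page_alignment = rounddown_pow2(sg_mask);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}

	return alignment;
}
```

With an sg mask of 2M | 4K and a 48-bit range, the sketch raises a 4K alignment to 2M; with an end address below 4 GiB the requested alignment is returned unchanged, which matches the intent stated in the patch comment about tightly packed low-32-bit objects.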

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 05/21] drm/i915: push set_pages down to the callers

2017-10-06 Thread Matthew Auld
Each backend is now responsible for calling __i915_gem_object_set_pages
upon successfully gathering its backing storage. This eliminates the
inconsistency between the async and sync paths, which stands out even
more when we start throwing around an sg_mask in a later patch.

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c  | 45 +---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   | 15 +---
 drivers/gpu/drm/i915/i915_gem_internal.c | 15 
 drivers/gpu/drm/i915/i915_gem_object.h   |  2 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c   | 16 ++---
 drivers/gpu/drm/i915/i915_gem_userptr.c  | 12 +++
 drivers/gpu/drm/i915/selftests/huge_gem_object.c | 14 
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 12 ---
 8 files changed, 77 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 81d70d23a057..c1c07d0957aa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -162,8 +162,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void 
*data,
return 0;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 {
struct address_space *mapping = obj->base.filp->f_mapping;
drm_dma_handle_t *phys;
@@ -171,9 +170,10 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object 
*obj)
struct scatterlist *sg;
char *vaddr;
int i;
+   int err;
 
if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
-   return ERR_PTR(-EINVAL);
+   return -EINVAL;
 
/* Always aligning to the object size, allows a single allocation
 * to handle all possible callers, and given typical object sizes,
@@ -183,7 +183,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object 
*obj)
 roundup_pow_of_two(obj->base.size),
 roundup_pow_of_two(obj->base.size));
if (!phys)
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
 
vaddr = phys->vaddr;
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
@@ -192,7 +192,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object 
*obj)
 
page = shmem_read_mapping_page(mapping, i);
if (IS_ERR(page)) {
-   st = ERR_CAST(page);
+   err = PTR_ERR(page);
goto err_phys;
}
 
@@ -209,13 +209,13 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object 
*obj)
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (!st) {
-   st = ERR_PTR(-ENOMEM);
+   err = -ENOMEM;
goto err_phys;
}
 
if (sg_alloc_table(st, 1, GFP_KERNEL)) {
kfree(st);
-   st = ERR_PTR(-ENOMEM);
+   err = -ENOMEM;
goto err_phys;
}
 
@@ -227,11 +227,15 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object 
*obj)
sg_dma_len(sg) = obj->base.size;
 
obj->phys_handle = phys;
-   return st;
+
+   __i915_gem_object_set_pages(obj, st);
+
+   return 0;
 
 err_phys:
drm_pci_free(obj->base.dev, phys);
-   return st;
+
+   return err;
 }
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
@@ -2292,8 +2296,7 @@ static bool i915_sg_trim(struct sg_table *orig_st)
return true;
 }
 
-static struct sg_table *
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
const unsigned long page_count = obj->base.size / PAGE_SIZE;
@@ -2317,12 +2320,12 @@ i915_gem_object_get_pages_gtt(struct 
drm_i915_gem_object *obj)
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
 
 rebuild_st:
if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
kfree(st);
-   return ERR_PTR(-ENOMEM);
+   return -ENOMEM;
}
 
/* Get the list of pages out of our struct file.  They'll be pinned
@@ -2430,7 +2433,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object 
*obj)
if (i915_gem_object_needs_bit17_swizzle(obj))
i915_gem_object_do_bit_17_swizzle(obj, st);
 
-   return st;
+   __i915_gem_object_set_pages(obj, st);
+
+   return 0;
 
 err_sg:
sg_mark_end(sg);
@@ -2451,7 +2456,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object 
*obj)
if (ret == -ENOSPC)
ret =

[Intel-gfx] [PATCH 06/21] drm/i915: introduce page_size members

2017-10-06 Thread Matthew Auld
In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects.  We fill in the page sizes by
scanning the sg table.

v2: pass the sg_mask to set_pages

v3: calculate the sg_mask inline with populating the sg_table where
possible, and pass to set_pages along with the pages.

v4: bunch of improvements from Joonas

v5: fix num_pages blunder
introduce i915_sg_page_sizes helper

v6: prefer GEM_BUG_ON(sizes == 0)

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Daniel Vetter 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  | 22 -
 drivers/gpu/drm/i915/i915_gem.c  | 42 +---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   |  5 ++-
 drivers/gpu/drm/i915/i915_gem_internal.c |  5 ++-
 drivers/gpu/drm/i915/i915_gem_object.h   | 17 ++
 drivers/gpu/drm/i915/i915_gem_stolen.c   |  2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c  |  5 ++-
 drivers/gpu/drm/i915/selftests/huge_gem_object.c |  2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c|  5 ++-
 9 files changed, 93 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3d4dee817381..799a90abd81f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2872,6 +2872,21 @@ static inline struct scatterlist *__sg_next(struct 
scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) >= (__iter).max) ?   \
 (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0)
 
+static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
+{
+   unsigned int page_sizes;
+
+   page_sizes = 0;
+   while (sg) {
+   GEM_BUG_ON(sg->offset);
+   GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE));
+   page_sizes |= sg->length;
+   sg = __sg_next(sg);
+   }
+
+   return page_sizes;
+}
+
 static inline unsigned int i915_sg_segment_size(void)
 {
unsigned int size = swiotlb_max_segment();
@@ -3101,6 +3116,10 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define USES_PPGTT(dev_priv)   (i915_modparams.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)  (i915_modparams.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv)(i915_modparams.enable_ppgtt == 
3)
+#define HAS_PAGE_SIZES(dev_priv, sizes) ({ \
+   GEM_BUG_ON((sizes) == 0); \
+   ((sizes) & ~(dev_priv)->info.page_sizes) == 0; \
+})
 
 #define HAS_OVERLAY(dev_priv)   ((dev_priv)->info.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
@@ -3517,7 +3536,8 @@ i915_gem_object_get_dma_address(struct 
drm_i915_gem_object *obj,
unsigned long n);
 
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
-struct sg_table *pages);
+struct sg_table *pages,
+unsigned int sg_mask);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
 static inline int __must_check
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c1c07d0957aa..695cb2a38c88 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -228,7 +228,7 @@ static int i915_gem_object_get_pages_phys(struct 
drm_i915_gem_object *obj)
 
obj->phys_handle = phys;
 
-   __i915_gem_object_set_pages(obj, st);
+   __i915_gem_object_set_pages(obj, st, sg->length);
 
return 0;
 
@@ -2266,6 +2266,8 @@ void __i915_gem_object_put_pages(struct 
drm_i915_gem_object *obj,
if (!IS_ERR(pages))
obj->ops->put_pages(obj, pages);
 
+   obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0;
+
 unlock:
mutex_unlock(&obj->mm.lock);
 }
@@ -2308,6 +2310,7 @@ static int i915_gem_object_get_pages_gtt(struct 
drm_i915_gem_object *obj)
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
unsigned int max_segment = i915_sg_segment_size();
+   unsigned int sg_mask;
gfp_t noreclaim;
int ret;
 
@@ -2339,6 +2342,7 @@ static int i915_gem_object_get_pages_gtt(struct 
drm_i915_gem_object *obj)
 
sg = st->sgl;
st->nents = 0;
+   sg_mask = 0;
for (i = 0; i < page_count; i++) {
const unsigned int shrink[] = {
I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | 
I915_SHRINK_PURGEABLE,
@@ -2391,8 +2395,10 @@ static int i915_gem_object_get_pages_gtt(struct 
drm_i915_gem_object *obj)
if (!i ||
sg->length >= max_segment ||
page_to_pfn(page) != last_pfn + 1) {
-   if (i)
+   if (i) {
+   sg_mask |= sg->length;
sg = sg_next(sg);
+   }
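
The mask bookkeeping in this patch (OR-ing every segment length into sg_mask, then testing it against the supported sizes) can be illustrated in isolation. This is a rough userspace sketch under stated assumptions: segments are modelled as a plain array of lengths rather than a scatterlist, and `sg_page_sizes`, `has_page_sizes` and `SKETCH_PAGE_SIZE` are hypothetical names that only echo `i915_sg_page_sizes()`, `HAS_PAGE_SIZES()` and `PAGE_SIZE`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 4K base page, standing in for PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* OR together all segment lengths, as the helper does over a scatterlist. */
static unsigned int sg_page_sizes(const unsigned int *lengths, size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Same precondition as the GEM_BUG_ON on sg->length. */
		assert(lengths[i] % SKETCH_PAGE_SIZE == 0);
		mask |= lengths[i];
	}

	return mask;
}

/* Non-zero when no bit in 'sizes' falls outside the 'supported' mask. */
static int has_page_sizes(unsigned int supported, unsigned int sizes)
{
	assert(sizes != 0);	/* mirrors GEM_BUG_ON((sizes) == 0) */
	return (sizes & ~supported) == 0;
}
```

For segments of 2M, 64K and 4K the mask comes out as 0x211000; checking that mask against a device that only advertises 4K and 2M fails because of the 64K bit.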

[Intel-gfx] [PATCH 02/21] drm/i915: introduce simple gemfs

2017-10-06 Thread Matthew Auld
Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt, and gives us the much needed
flexibility to do things like set our own mount options, namely huge=
which should allow us to enable the use of transparent-huge-pages for
our shmem backed objects.

v2: various improvements suggested by Joonas

v3: move gemfs instance to i915.mm and simplify now that we have
file_setup_with_mnt

v4: fallback to tmpfs shm_mnt upon failure to setup gemfs

v5: make tmpfs fallback kinder

v5: better gemfs failure message
flags variable

Signed-off-by: Matthew Auld 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Dave Hansen 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/i915_drv.h  |  5 +++
 drivers/gpu/drm/i915/i915_gem.c  | 33 ++-
 drivers/gpu/drm/i915/i915_gemfs.c| 52 
 drivers/gpu/drm/i915/i915_gemfs.h| 34 
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  4 ++
 6 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 51d0d2929a4b..66d23b619db1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -47,6 +47,7 @@ i915-y += i915_cmd_parser.o \
  i915_gem_tiling.o \
  i915_gem_timeline.o \
  i915_gem_userptr.o \
+ i915_gemfs.o \
  i915_trace_points.o \
  i915_vma.o \
  intel_breadcrumbs.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1fc7080bfa7b..ec6f320cc4f5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1511,6 +1511,11 @@ struct i915_gem_mm {
/** Usable portion of the GTT for GEM */
dma_addr_t stolen_base; /* limited to low memory (32-bit) */
 
+   /**
+* tmpfs instance used for shmem backed objects
+*/
+   struct vfsmount *gemfs;
+
/** PPGTT used for aliasing the PPGTT with the GTT */
struct i915_hw_ppgtt *aliasing_ppgtt;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ab8c6946fea4..81d70d23a057 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -35,6 +35,7 @@
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include "intel_mocs.h"
+#include "i915_gemfs.h"
 #include 
 #include 
 #include 
@@ -4251,6 +4252,30 @@ static const struct drm_i915_gem_object_ops 
i915_gem_object_ops = {
.pwrite = i915_gem_object_pwrite_gtt,
 };
 
+static int i915_gem_object_create_shmem(struct drm_device *dev,
+   struct drm_gem_object *obj,
+   size_t size)
+{
+   struct drm_i915_private *i915 = to_i915(dev);
+   unsigned long flags = VM_NORESERVE;
+   struct file *filp;
+
+   drm_gem_private_object_init(dev, obj, size);
+
+   if (i915->mm.gemfs)
+   filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size,
+flags);
+   else
+   filp = shmem_file_setup("i915", size, flags);
+
+   if (IS_ERR(filp))
+   return PTR_ERR(filp);
+
+   obj->filp = filp;
+
+   return 0;
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
@@ -4275,7 +4300,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, 
u64 size)
if (obj == NULL)
return ERR_PTR(-ENOMEM);
 
-   ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size);
+   ret = i915_gem_object_create_shmem(&dev_priv->drm, &obj->base, size);
if (ret)
goto fail;
 
@@ -4915,6 +4940,10 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 
spin_lock_init(&dev_priv->fb_tracking.lock);
 
+   err = i915_gemfs_init(dev_priv);
+   if (err)
+   DRM_NOTE("Unable to create a private tmpfs mount, hugepage 
support will be disabled(%d).\n", err);
+
return 0;
 
 err_priorities:
@@ -4953,6 +4982,8 @@ void i915_gem_load_cleanup(struct drm_i915_private 
*dev_priv)
 
/* And ensure that our DESTROY_BY_RCU slabs are truly destroyed */
rcu_barrier();
+
+   i915_gemfs_fini(dev_priv);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gemfs.c 
b/drivers/gpu/drm/i915/i915_gemfs.c
new file mode 100644
index ..168d0bd98f60
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -0,0 +1,52 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy o

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix pointer-to-int conversion (rev2)

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix pointer-to-int conversion (rev2)
URL   : https://patchwork.freedesktop.org/series/31488/
State : success

== Summary ==

Series 31488v2 drm/i915: Fix pointer-to-int conversion
https://patchwork.freedesktop.org/api/1.0/series/31488/revisions/2/mbox/

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21   time:460s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24   time:467s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65   time:396s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46   time:564s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106  time:289s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30   time:536s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:532s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35   time:549s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39   time:528s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32   time:568s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:622s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60   time:444s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28   time:600s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:437s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:425s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:514s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:496s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24   time:499s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19   time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20   time:490s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:593s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20   time:477s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26   time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24   time:537s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20   time:516s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23   time:469s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39   time:581s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40   time:432s
fi-pnv-d510 failed to connect after reboot

7dacd1f2e70cb3202e2b153d76b05b601d099082 drm-tip: 2017y-10m-06d-12h-29m-28s UTC 
integration manifest
86ad6277e1ad drm/i915: Fix pointer-to-int conversion

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5932/


Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice (rev2)

2017-10-06 Thread Saarinen, Jani
Hi,

> -Original Message-
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rodrigo Vivi
> Sent: perjantai 6. lokakuuta 2017 16.10
> To: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/cnl:
> WaDisableGatherAtSetShaderCommonSlice (rev2)
> 
> On Fri, Oct 06, 2017 at 11:06:34AM +, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice (rev2)
> > URL   : https://patchwork.freedesktop.org/series/31457/
> > State : warning
> >
> > == Summary ==
> >
> > Series 31457v2 drm/i915/cnl: WaDisableGatherAtSetShaderCommonSlice
> >
> https://patchwork.freedesktop.org/api/1.0/series/31457/revisions/2/mbox/
> >
> > Test gem_exec_suspend:
> > Subgroup basic-s3:
> > pass   -> DMESG-WARN (fi-cfl-s) fdo#103026
> > Subgroup basic-s4-devices:
> > pass   -> DMESG-WARN (fi-kbl-7500u)
> 
> I believe this is a false positive.
> This patch only changes CNL, not KBL.
[  254.679399] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[  254.679428] [drm:intel_dp_aux_ch [i915]] *ERROR* dp_aux_ch not done status 0xac1003ff

So something new or known? 

Jani Saarinen
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo




Re: [Intel-gfx] [PATCH 1/2] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

2017-10-06 Thread Tvrtko Ursulin


On 06/10/2017 15:23, Daniel Vetter wrote:

On Fri, Oct 06, 2017 at 12:34:02PM +0100, Tvrtko Ursulin wrote:


On 06/10/2017 10:06, Daniel Vetter wrote:

4.14-rc1 gained the fancy new cross-release support in lockdep, which
seems to have uncovered a few more rules about what is allowed and
isn't.

This one here seems to indicate that allocating a work-queue while
holding mmap_sem is a no-go, so let's try to preallocate it.

Of course another way to break this chain would be somewhere in the
cpu hotplug code, since this isn't the only trace we're finding now
which goes through msr_create_device.

Full lockdep splat:


[snipped lockdep splat]


v2: Set ret correctly when we raced with another thread.

v3: Use Chris' diff. Attach the right lockdep splat.

Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Sasha Levin 
Cc: Marta Lofstedt 
Cc: Tejun Heo 
References: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939
Signed-off-by: Daniel Vetter 
---
   drivers/gpu/drm/i915/i915_gem_userptr.c | 35 
+++--
   1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 2d4996de7331..f9b3406401af 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -164,7 +164,6 @@ static struct i915_mmu_notifier *
   i915_mmu_notifier_create(struct mm_struct *mm)
   {
struct i915_mmu_notifier *mn;
-   int ret;
mn = kmalloc(sizeof(*mn), GFP_KERNEL);
if (mn == NULL)
@@ -179,14 +178,6 @@ i915_mmu_notifier_create(struct mm_struct *mm)
return ERR_PTR(-ENOMEM);
}
-/* Protected by mmap_sem (write-lock) */
-   ret = __mmu_notifier_register(&mn->mn, mm);
-   if (ret) {
-   destroy_workqueue(mn->wq);
-   kfree(mn);
-   return ERR_PTR(ret);
-   }
-
return mn;
   }
@@ -210,23 +201,37 @@ i915_gem_userptr_release__mmu_notifier(struct 
drm_i915_gem_object *obj)
   static struct i915_mmu_notifier *
   i915_mmu_notifier_find(struct i915_mm_struct *mm)
   {
-   struct i915_mmu_notifier *mn = mm->mn;
+   struct i915_mmu_notifier *mn;
+   int err;
mn = mm->mn;
if (mn)
return mn;
+   mn = i915_mmu_notifier_create(mm->mm);
+   if (IS_ERR(mn))
+   return mn;


Strictly speaking we don't want to fail just yet, only if we actually needed
a new notifier and we failed to create it.


Is the check 2 lines above not good enough? It's somewhat racy, but I'm not
sure what value we provide by being perfectly correct against low memory.
This thread racing against a 2nd one, where the minimal allocation of the
2nd one pushed us perfectly over the oom threshold seems a very unlikely
scenario.

Also, small allocations actually never fail :-)


Yes, but we otherwise make each other re-spin for much smaller things 
than bailout logic being conceptually in the wrong place. So I'd 
like a respin. It's not complicated at all, just move the bailout to 
before the __mmu_notifier_register:


...

err = 0;
if (IS_ERR(mn))
err = PTR_ERR(..);

...

if (mana->manah == NULL) { /* ;-D */
/* Protected by mmap_sem... */
if (err == 0) {
err = __mmu_notifier_register(..);
...
}
}

...

if (mn && !IS_ERR(mn)) {
...free...
}

I think.. ?

R-b on this, plus below, unless I got something wrong.
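
The ordering suggested above (allocate up front, but only treat an allocation failure as an error if a new notifier was actually needed) can be sketched in plain userspace C. This is an illustrative sketch only, assuming a pthread mutex in place of mmap_sem/mm_lock and `calloc()` in place of the workqueue and notifier allocation; `struct owner`, `struct notifier` and `notifier_find` are hypothetical names, not the driver's.

```c
#include <pthread.h>
#include <stdlib.h>

struct notifier {
	int placeholder;	/* stands in for the real notifier state */
};

struct owner {
	pthread_mutex_t lock;
	struct notifier *mn;	/* protected by lock */
};

/*
 * Return the owner's notifier, installing a preallocated one if none
 * exists yet. Returns NULL only when a new notifier was needed and the
 * allocation failed.
 */
static struct notifier *notifier_find(struct owner *o)
{
	/* Allocate before taking the lock; it may turn out to be unneeded. */
	struct notifier *spare = calloc(1, sizeof(*spare));
	struct notifier *found;

	pthread_mutex_lock(&o->lock);
	if (o->mn == NULL && spare != NULL) {
		o->mn = spare;	/* publish under the lock */
		spare = NULL;	/* ownership moved to the owner */
	}
	found = o->mn;
	pthread_mutex_unlock(&o->lock);

	/* Lost the race or never needed it: free outside the lock. */
	free(spare);

	return found;
}
```

A second caller that loses the race simply gets the already-installed notifier back and frees its own spare, so the lock is never held across an allocation.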






+
+   err = 0;
down_write(&mm->mm->mmap_sem);
mutex_lock(&mm->i915->mm_lock);
-   if ((mn = mm->mn) == NULL) {
-   mn = i915_mmu_notifier_create(mm->mm);
-   if (!IS_ERR(mn))
-   mm->mn = mn;
+   if (mm->mn == NULL) {
+   /* Protected by mmap_sem (write-lock) */
+   err = __mmu_notifier_register(&mn->mn, mm->mm);
+   if (!err) {
+   /* Protected by mm_lock */
+   mm->mn = fetch_and_zero(&mn);
+   }
}
mutex_unlock(&mm->i915->mm_lock);
up_write(&mm->mm->mmap_sem);
-   return mn;
+   if (mn) {
+   destroy_workqueue(mn->wq);
+   kfree(mn);
+   }
+
+   return err ? ERR_PTR(err) : mm->mn;
   }
   static int



Otherwise looks good to me.

I would also put a note in the commit on how working around the locking
issue is also beneficial to performance with moving the allocation step
outside the mmap_sem.


Yeah Chris brought that up too, I don't really buy it given how
heavy-weight __mmu_notifier_register is. But I can add something like:

"This also has the minor benefit of slightly reducing the critical
section where we hold mmap_sem."

r-b with that added to the commit message?


I think for me it is 

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/huc: Fix includes in intel_huc.c

2017-10-06 Thread Patchwork
== Series Details ==

Series: drm/i915/huc: Fix includes in intel_huc.c
URL   : https://patchwork.freedesktop.org/series/31475/
State : success

== Summary ==

Test kms_cursor_legacy:
Subgroup cursorA-vs-flipA-atomic-transitions:
fail   -> PASS   (shard-hsw) fdo#102723

fdo#102723 https://bugs.freedesktop.org/show_bug.cgi?id=102723

shard-hsw        total:2446 pass:1328 dwarn:6   dfail:0   fail:9   skip:1103 time:10148s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5925/shards.html


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Make i915_engine_info pretty printer to standalone

2017-10-06 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] drm/i915: Make i915_engine_info pretty 
printer to standalone
URL   : https://patchwork.freedesktop.org/series/31489/
State : success

== Summary ==

Series 31489v1 series starting with [1/2] drm/i915: Make i915_engine_info 
pretty printer to standalone
https://patchwork.freedesktop.org/api/1.0/series/31489/revisions/1/mbox/

Test chamelium:
Subgroup dp-crc-fast:
pass   -> FAIL   (fi-kbl-7500u) fdo#102514
Test kms_pipe_crc_basic:
Subgroup nonblocking-crc-pipe-a-frame-sequence:
pass   -> INCOMPLETE (fi-elk-e7500) fdo#102364

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102364 https://bugs.freedesktop.org/show_bug.cgi?id=102364

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21   time:457s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24   time:474s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65   time:395s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46   time:572s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106  time:289s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30   time:527s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:530s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35   time:542s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39   time:534s
fi-cfl-s         total:289  pass:256  dwarn:1   dfail:0   fail:0   skip:32   time:559s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:626s
fi-elk-e7500     total:234  pass:185  dwarn:0   dfail:0   fail:0   skip:48
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28   time:602s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:441s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:418s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:510s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29   time:475s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24   time:499s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19   time:585s
fi-kbl-7567u     total:289  pass:265  dwarn:4   dfail:0   fail:0   skip:20   time:490s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27   time:600s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66   time:659s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20   time:471s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26   time:659s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24   time:545s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20   time:517s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23   time:472s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39   time:588s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40   time:435s

7dacd1f2e70cb3202e2b153d76b05b601d099082 drm-tip: 2017y-10m-06d-12h-29m-28s UTC 
integration manifest
7b94f33d5162 drm/i915/selftests: Pretty print engine state when requests fail 
to start
abe1c7b19674 drm/i915: Make i915_engine_info pretty printer to standalone

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5931/


Re: [Intel-gfx] [PATCH 1/2] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

2017-10-06 Thread Daniel Vetter
On Fri, Oct 06, 2017 at 12:34:02PM +0100, Tvrtko Ursulin wrote:
> 
> On 06/10/2017 10:06, Daniel Vetter wrote:
> > 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> > seems to have uncovered a few more rules about what is allowed and
> > isn't.
> > 
> > This one here seems to indicate that allocating a work-queue while
> > holding mmap_sem is a no-go, so let's try to preallocate it.
> > 
> > Of course another way to break this chain would be somewhere in the
> > cpu hotplug code, since this isn't the only trace we're finding now
> > which goes through msr_create_device.
> > 
> > Full lockdep splat:
> > 
> > ==
> > WARNING: possible circular locking dependency detected
> > 4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G U
> > --
> > prime_mmap/1551 is trying to acquire lock:
> >   (cpu_hotplug_lock.rw_sem){}, at: [] 
> > apply_workqueue_attrs+0x17/0x50
> > 
> > but task is already holding lock:
> >   (&dev_priv->mm_lock){+.+.}, at: [] 
> > i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
> > 
> > which lock already depends on the new lock.
> > 
> > the existing dependency chain (in reverse order) is:
> > 
> > -> #6 (&dev_priv->mm_lock){+.+.}:
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > __mutex_lock+0x86/0x9b0
> > mutex_lock_nested+0x1b/0x20
> > i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
> > i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
> > drm_ioctl_kernel+0x69/0xb0
> > drm_ioctl+0x2f9/0x3d0
> > do_vfs_ioctl+0x94/0x670
> > SyS_ioctl+0x41/0x70
> > entry_SYSCALL_64_fastpath+0x1c/0xb1
> > 
> > -> #5 (&mm->mmap_sem){}:
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > __might_fault+0x68/0x90
> > _copy_to_user+0x23/0x70
> > filldir+0xa5/0x120
> > dcache_readdir+0xf9/0x170
> > iterate_dir+0x69/0x1a0
> > SyS_getdents+0xa5/0x140
> > entry_SYSCALL_64_fastpath+0x1c/0xb1
> > 
> > -> #4 (&sb->s_type->i_mutex_key#5){}:
> > down_write+0x3b/0x70
> > handle_create+0xcb/0x1e0
> > devtmpfsd+0x139/0x180
> > kthread+0x152/0x190
> > ret_from_fork+0x27/0x40
> > 
> > -> #3 ((complete)&req.done){+.+.}:
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > wait_for_common+0x58/0x210
> > wait_for_completion+0x1d/0x20
> > devtmpfs_create_node+0x13d/0x160
> > device_add+0x5eb/0x620
> > device_create_groups_vargs+0xe0/0xf0
> > device_create+0x3a/0x40
> > msr_device_create+0x2b/0x40
> > cpuhp_invoke_callback+0xa3/0x840
> > cpuhp_thread_fun+0x7a/0x150
> > smpboot_thread_fn+0x18a/0x280
> > kthread+0x152/0x190
> > ret_from_fork+0x27/0x40
> > 
> > -> #2 (cpuhp_state){+.+.}:
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > cpuhp_issue_call+0x10b/0x170
> > __cpuhp_setup_state_cpuslocked+0x134/0x2a0
> > __cpuhp_setup_state+0x46/0x60
> > page_writeback_init+0x43/0x67
> > pagecache_init+0x3d/0x42
> > start_kernel+0x3a8/0x3fc
> > x86_64_start_reservations+0x2a/0x2c
> > x86_64_start_kernel+0x6d/0x70
> > verify_cpu+0x0/0xfb
> > 
> > -> #1 (cpuhp_state_mutex){+.+.}:
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > __mutex_lock+0x86/0x9b0
> > mutex_lock_nested+0x1b/0x20
> > __cpuhp_setup_state_cpuslocked+0x52/0x2a0
> > __cpuhp_setup_state+0x46/0x60
> > page_alloc_init+0x28/0x30
> > start_kernel+0x145/0x3fc
> > x86_64_start_reservations+0x2a/0x2c
> > x86_64_start_kernel+0x6d/0x70
> > verify_cpu+0x0/0xfb
> > 
> > -> #0 (cpu_hotplug_lock.rw_sem){}:
> > check_prev_add+0x430/0x840
> > __lock_acquire+0x1420/0x15e0
> > lock_acquire+0xb0/0x200
> > cpus_read_lock+0x3d/0xb0
> > apply_workqueue_attrs+0x17/0x50
> > __alloc_workqueue_key+0x1d8/0x4d9
> > i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
> > i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
> > drm_ioctl_kernel+0x69/0xb0
> > drm_ioctl+0x2f9/0x3d0
> > do_vfs_ioctl+0x94/0x670
> > SyS_ioctl+0x41/0x70
> > entry_SYSCALL_64_fastpath+0x1c/0xb1
> > 
> > other info that might help us debug this:
> > 
> > Chain exists of:
> >cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock
> > 
> >   Possible unsafe locking scenario:
> > 
> > CPU0CPU1
> > 
> >lock(&dev_priv->mm_lock);
> > lock(&mm->mmap_sem);
> > lock(&dev_priv->mm_lock);
> >lock(cpu_hotplug_lock.rw_
