Re: [Intel-gfx] [01/15] drm/i915: Copy user requested buffers into the error state

2017-03-21 Thread Chris Wilson
On Tue, Mar 21, 2017 at 10:53:53AM -0700, Ben Widawsky wrote:
> On 17-03-21 16:23:05, Tahvanainen, Jari wrote:
> >See below [Jari]...
> >
> >-----Original Message-----
> >From: Ben Widawsky [mailto:b...@bwidawsk.net]
> >Sent: Tuesday, March 21, 2017 5:38 PM
> >To: Tahvanainen, Jari 
> >Cc: Chris Wilson ; intel-gfx@lists.freedesktop.org
> >Subject: Re: [01/15] drm/i915: Copy user requested buffers into the error 
> >state
> >
> >On 17-03-21 11:30:36, Tahvanainen, Jari wrote:
> >>Note that this is for all the patches in series, replied only on [1/15].
> >>
> >>See also https://bugs.freedesktop.org/show_bug.cgi?id=94001#c45
> >>
> >
> >Jari, did you test this patch specifically? It would involve introspection 
> >of the error state.
> >
> >[Jari] As I said, I tested the patch series including this patch:
> > "Note that this is for all the patches in series, replied only on [1/15]"
> > "Tested-by" for https://patchwork.freedesktop.org/series/21377
> > 
> > If this is not the way to do it, then I need to stop.
> > And since I am a tester (not a programmer), you need to tell me more about
> > what you mean by "would involve introspection of the error state".
> > What should the outcome be? What skills are needed for it, etc.? If I cannot
> > do it, then presumably Tested-by is not something I will do in future.
> 
> Well there is tested-by "this doesn't regress anything" and there is tested-by
> "this new feature works properly". I've no doubt you asserted the first, but my
> concern was around the second. For this patch specifically, it's a new feature
> and there is no igt test for it AFAIK.

There is/will be, gem_exec_capture; sent alongside the original
patch.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [01/15] drm/i915: Copy user requested buffers into the error state

2017-03-21 Thread Ben Widawsky

On 17-03-21 16:23:05, Tahvanainen, Jari wrote:

See below [Jari]...

-----Original Message-----
From: Ben Widawsky [mailto:b...@bwidawsk.net]
Sent: Tuesday, March 21, 2017 5:38 PM
To: Tahvanainen, Jari 
Cc: Chris Wilson ; intel-gfx@lists.freedesktop.org
Subject: Re: [01/15] drm/i915: Copy user requested buffers into the error state

On 17-03-21 11:30:36, Tahvanainen, Jari wrote:

Note that this is for all the patches in series, replied only on [1/15].

See also https://bugs.freedesktop.org/show_bug.cgi?id=94001#c45



Jari, did you test this patch specifically? It would involve introspection of 
the error state.

[Jari] As I said, I tested the patch series including this patch:
"Note that this is for all the patches in series, replied only on [1/15]"
"Tested-by" for https://patchwork.freedesktop.org/series/21377

If this is not the way to do it, then I need to stop.
And since I am a tester (not a programmer), you need to tell me more about
what you mean by "would involve introspection of the error state".
What should the outcome be? What skills are needed for it, etc.? If I cannot
do it, then presumably Tested-by is not something I will do in future.


Well there is tested-by "this doesn't regress anything" and there is tested-by
"this new feature works properly". I've no doubt you asserted the first, but my
concern was around the second. For this patch specifically, it's a new feature
and there is no igt test for it AFAIK.
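As a rough illustration of what "introspection of the error state" could involve, here is a minimal Python sketch that reads the error state text and checks whether anything was captured. The two marker strings are assumptions, not taken from this thread: the wording of the "empty" message and the way captured user buffers are labelled may differ between kernel versions, so both are kept as adjustable constants. The helper names are hypothetical.

```python
from pathlib import Path

# Assumption: an empty error state reads as a short "no error state collected"
# message; the exact wording may differ between kernel versions.
EMPTY_MARKER = "no error state collected"

# Assumption: captured user buffers are announced by a line containing this
# marker; adjust it to whatever your kernel actually prints.
USER_BUFFER_MARKER = "--- user = "


def has_error_state(text: str) -> bool:
    """Return True if the dump looks like a real error state, not the empty notice."""
    stripped = text.strip()
    return bool(stripped) and not stripped.lower().startswith(EMPTY_MARKER)


def count_user_buffers(text: str, marker: str = USER_BUFFER_MARKER) -> int:
    """Count the lines that announce a captured user buffer."""
    return sum(1 for line in text.splitlines() if marker in line)


def read_error_state(card: int = 0) -> str:
    """Read /sys/class/drm/cardN/error (usually needs root)."""
    return Path(f"/sys/class/drm/card{card}/error").read_text(errors="replace")
```

After provoking a hang with a buffer marked for capture, one would expect `has_error_state(read_error_state())` to be true and `count_user_buffers(...)` to be non-zero; a "this new feature works properly" test would check exactly that.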



From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
Sent: Thursday, March 16, 2017 3:20 PM
To: intel-gfx@lists.freedesktop.org
Cc: Ben Widawsky 
Subject: [01/15] drm/i915: Copy user requested buffers into the error
state

Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace
may use to indicate that it wants the contents of this buffer preserved
in the error state (/sys/class/drm/cardN/error) following a GPU hang
involving this batch.

Use this at your discretion, the contents of the error state, although
compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for
all eternity (until the error state is destroyed).
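To make the flag concrete, here is a small Python `ctypes` sketch of marking one buffer for capture. It is only an illustration under stated assumptions: the class is a hypothetical mirror of the kernel's exec object structure (the patch text only says "execobject.flag"), and the bit value of `EXEC_OBJECT_CAPTURE` is recalled from the upstream uapi header rather than taken from this thread, so check both against the `i915_drm.h` you build with.

```python
import ctypes

# Assumption: bit value as recalled from the upstream i915_drm.h uapi header;
# verify against the header shipped with your kernel.
EXEC_OBJECT_CAPTURE = 1 << 7


class ExecObject2(ctypes.Structure):
    """Hypothetical mirror of the kernel's exec object structure."""
    _fields_ = [
        ("handle", ctypes.c_uint32),
        ("relocation_count", ctypes.c_uint32),
        ("relocs_ptr", ctypes.c_uint64),
        ("alignment", ctypes.c_uint64),
        ("offset", ctypes.c_uint64),
        ("flags", ctypes.c_uint64),
        ("rsvd1", ctypes.c_uint64),
        ("rsvd2", ctypes.c_uint64),
    ]


def mark_for_capture(obj: ExecObject2) -> ExecObject2:
    """Ask for this buffer's contents to be kept in the error state after a hang."""
    obj.flags |= EXEC_OBJECT_CAPTURE
    return obj


# Handle 1 is a placeholder; a real handle comes from a GEM create call.
scratch = mark_for_capture(ExecObject2(handle=1))
```

Only buffers that are worth the limited GFP_ATOMIC memory should be marked, which is why the flag is opt-in per object rather than applied to the whole batch.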

Based on an earlier patch by Ben Widawsky
Signed-off-by: Chris Wilson
Cc: Ben Widawsky
Cc: Matt Turner
Acked-by: Ben Widawsky
Reviewed-by: Joonas Lahtinen


Tested-by: Jari Tahvanainen

for https://patchwork.freedesktop.org/series/21377 on my dev-SKL (i5-6600k) by 
taking all the gem_exec_reloc cases to testlist (151 tests).

Executing those as a full set through piglit was not successful due to out-of-memory 
conditions at the end of the testlist with some (varying) gtt-xx subcases causing 
"Command terminated by signal 9". cpu-xx did not signal any problems.



drm-tip: 2017y-03m-17d-08h-03m-19s without patch series produced:

[151/151] skip: 2, pass: 120, fail: 29



with patch series applied one gets:

[121/151] pass: 121 |

running: igt/gem_exec_reloc/gtt-28 - "Command terminated by signal 9"

Taking rest as new testlist

[30/30] skip: 2, pass: 30, dmesg-warn: 1

having

dmesg-warn: igt/gem_exec_reloc/readonly-32

skip: igt/gem_exec_reloc/active-bsd1

skip: igt/gem_exec_reloc/active-bsd2
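For comparing runs like the ones above, the summary lines can be tallied mechanically. The following Python sketch parses a line of the form `[done/total] status: n, ...` into counts; the format is inferred only from the lines quoted in this mail, not from piglit's documentation, and `parse_summary` is a hypothetical helper name.

```python
import re

# Matches "[done/total] rest-of-line" as seen in the quoted summary lines.
SUMMARY_RE = re.compile(r"\[(\d+)/(\d+)\]\s*(.*)")


def parse_summary(line: str):
    """Split '[done/total] status: n, ...' into (done, total, counts)."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    done, total = int(match.group(1)), int(match.group(2))
    counts = {}
    # Each status is a word (possibly hyphenated) followed by a colon and a number.
    for name, value in re.findall(r"([\w-]+):\s*(\d+)", match.group(3)):
        counts[name] = int(value)
    return done, total, counts


baseline = parse_summary("[151/151] skip: 2, pass: 120, fail: 29")
```

Summing `baseline[2].values()` and comparing it with the total is a quick sanity check that a summary line accounts for every test in the list.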



When running the gtt-xx tests individually, the result for all is pass.

$ sudo ./gem_exec_reloc --run-subtest cpu-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest cpu-31: SUCCESS (3,760s)

$ sudo ./gem_exec_reloc --run-subtest gtt-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-31: SUCCESS (25,313s)

$ sudo ./gem_exec_reloc --run-subtest gtt-30

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-30: SUCCESS (11,196s)

$ sudo ./gem_exec_reloc --run-subtest gtt-29

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-29: SUCCESS (5,198s)

$ sudo ./gem_exec_reloc --run-subtest gtt-28

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-28: SUCCESS (2,543s)




Re: [Intel-gfx] [01/15] drm/i915: Copy user requested buffers into the error state

2017-03-21 Thread Tahvanainen, Jari
See below [Jari]...

-----Original Message-----
From: Ben Widawsky [mailto:b...@bwidawsk.net] 
Sent: Tuesday, March 21, 2017 5:38 PM
To: Tahvanainen, Jari 
Cc: Chris Wilson ; intel-gfx@lists.freedesktop.org
Subject: Re: [01/15] drm/i915: Copy user requested buffers into the error state

On 17-03-21 11:30:36, Tahvanainen, Jari wrote:
>Note that this is for all the patches in series, replied only on [1/15].
>
>See also https://bugs.freedesktop.org/show_bug.cgi?id=94001#c45
>

Jari, did you test this patch specifically? It would involve introspection of 
the error state.

[Jari] As I said, I tested the patch series including this patch:
"Note that this is for all the patches in series, replied only on [1/15]"
"Tested-by" for https://patchwork.freedesktop.org/series/21377

If this is not the way to do it, then I need to stop.
And since I am a tester (not a programmer), you need to tell me more about
what you mean by "would involve introspection of the error state".
What should the outcome be? What skills are needed for it, etc.? If I cannot
do it, then presumably Tested-by is not something I will do in future.
>
>From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
>Sent: Thursday, March 16, 2017 3:20 PM
>To: intel-gfx@lists.freedesktop.org
>Cc: Ben Widawsky 
>Subject: [01/15] drm/i915: Copy user requested buffers into the error 
>state
>
>Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace 
>may use to indicate that it wants the contents of this buffer preserved 
>in the error state (/sys/class/drm/cardN/error) following a GPU hang 
>involving this batch.
>
>Use this at your discretion, the contents of the error state, although 
>compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for 
>all eternity (until the error state is destroyed).
>
>Based on an earlier patch by Ben Widawsky
>Signed-off-by: Chris Wilson
>Cc: Ben Widawsky
>Cc: Matt Turner
>Acked-by: Ben Widawsky
>Reviewed-by: Joonas Lahtinen
>
>
>Tested-by: Jari Tahvanainen 
>
>
>
>for https://patchwork.freedesktop.org/series/21377 on my dev-SKL (i5-6600k) by 
>taking all the gem_exec_reloc cases to testlist (151 tests).
>
>Executing those as a full set through piglit was not successful due to 
>out-of-memory conditions at the end of the testlist with some (varying) gtt-xx 
>subcases causing "Command terminated by signal 9". cpu-xx did not signal any 
>problems.
>
>
>
>drm-tip: 2017y-03m-17d-08h-03m-19s without patch series produced:
>
>[151/151] skip: 2, pass: 120, fail: 29
>
>
>
>with patch series applied one gets:
>
>[121/151] pass: 121 |
>
>running: igt/gem_exec_reloc/gtt-28 - "Command terminated by signal 9"
>
>Taking rest as new testlist
>
>[30/30] skip: 2, pass: 30, dmesg-warn: 1
>
>having
>
>dmesg-warn: igt/gem_exec_reloc/readonly-32
>
>skip: igt/gem_exec_reloc/active-bsd1
>
>skip: igt/gem_exec_reloc/active-bsd2
>
>
>
>When running the gtt-xx tests individually, the result for all is pass.
>
>$ sudo ./gem_exec_reloc --run-subtest cpu-31
>
>IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)
>
>Subtest cpu-31: SUCCESS (3,760s)
>
>$ sudo ./gem_exec_reloc --run-subtest gtt-31
>
>IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)
>
>Subtest gtt-31: SUCCESS (25,313s)
>
>$ sudo ./gem_exec_reloc --run-subtest gtt-30
>
>IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)
>
>Subtest gtt-30: SUCCESS (11,196s)
>
>$ sudo ./gem_exec_reloc --run-subtest gtt-29
>
>IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)
>
>Subtest gtt-29: SUCCESS (5,198s)
>
>$ sudo ./gem_exec_reloc --run-subtest gtt-28
>
>IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)
>
>Subtest gtt-28: SUCCESS (2,543s)
>


Re: [Intel-gfx] [01/15] drm/i915: Copy user requested buffers into the error state

2017-03-21 Thread Ben Widawsky

On 17-03-21 11:30:36, Tahvanainen, Jari wrote:

Note that this is for all the patches in series, replied only on [1/15].

See also https://bugs.freedesktop.org/show_bug.cgi?id=94001#c45



Jari, did you test this patch specifically? It would involve introspection of
the error state.



From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
Sent: Thursday, March 16, 2017 3:20 PM
To: intel-gfx@lists.freedesktop.org
Cc: Ben Widawsky 
Subject: [01/15] drm/i915: Copy user requested buffers into the error state

Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace may
use to indicate that it wants the contents of this buffer preserved in
the error state (/sys/class/drm/cardN/error) following a GPU hang
involving this batch.

Use this at your discretion, the contents of the error state, although
compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for all
eternity (until the error state is destroyed).

Based on an earlier patch by Ben Widawsky
Signed-off-by: Chris Wilson
Cc: Ben Widawsky
Cc: Matt Turner
Acked-by: Ben Widawsky
Reviewed-by: Joonas Lahtinen


Tested-by: Jari Tahvanainen 



for https://patchwork.freedesktop.org/series/21377 on my dev-SKL (i5-6600k) by 
taking all the gem_exec_reloc cases to testlist (151 tests).

Executing those as a full set through piglit was not successful due to out-of-memory 
conditions at the end of the testlist with some (varying) gtt-xx subcases causing 
"Command terminated by signal 9". cpu-xx did not signal any problems.



drm-tip: 2017y-03m-17d-08h-03m-19s without patch series produced:

[151/151] skip: 2, pass: 120, fail: 29



with patch series applied one gets:

[121/151] pass: 121 |

running: igt/gem_exec_reloc/gtt-28 - "Command terminated by signal 9"

Taking rest as new testlist

[30/30] skip: 2, pass: 30, dmesg-warn: 1

having

dmesg-warn: igt/gem_exec_reloc/readonly-32

skip: igt/gem_exec_reloc/active-bsd1

skip: igt/gem_exec_reloc/active-bsd2



When running the gtt-xx tests individually, the result for all is pass.

$ sudo ./gem_exec_reloc --run-subtest cpu-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest cpu-31: SUCCESS (3,760s)

$ sudo ./gem_exec_reloc --run-subtest gtt-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-31: SUCCESS (25,313s)

$ sudo ./gem_exec_reloc --run-subtest gtt-30

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-30: SUCCESS (11,196s)

$ sudo ./gem_exec_reloc --run-subtest gtt-29

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-29: SUCCESS (5,198s)

$ sudo ./gem_exec_reloc --run-subtest gtt-28

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-28: SUCCESS (2,543s)




Re: [Intel-gfx] [01/15] drm/i915: Copy user requested buffers into the error state

2017-03-21 Thread Tahvanainen, Jari
Note that this is for all the patches in series, replied only on [1/15].

See also https://bugs.freedesktop.org/show_bug.cgi?id=94001#c45


From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
Sent: Thursday, March 16, 2017 3:20 PM
To: intel-gfx@lists.freedesktop.org
Cc: Ben Widawsky 
Subject: [01/15] drm/i915: Copy user requested buffers into the error state

Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace may
use to indicate that it wants the contents of this buffer preserved in
the error state (/sys/class/drm/cardN/error) following a GPU hang
involving this batch.

Use this at your discretion, the contents of the error state, although
compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for all
eternity (until the error state is destroyed).

Based on an earlier patch by Ben Widawsky
Signed-off-by: Chris Wilson
Cc: Ben Widawsky
Cc: Matt Turner
Acked-by: Ben Widawsky
Reviewed-by: Joonas Lahtinen


Tested-by: Jari Tahvanainen 



for https://patchwork.freedesktop.org/series/21377 on my dev-SKL (i5-6600k) by 
taking all the gem_exec_reloc cases to testlist (151 tests).

Executing those as a full set through piglit was not successful due to 
out-of-memory conditions at the end of the testlist with some (varying) gtt-xx 
subcases causing "Command terminated by signal 9". cpu-xx did not signal any 
problems.



drm-tip: 2017y-03m-17d-08h-03m-19s without patch series produced:

[151/151] skip: 2, pass: 120, fail: 29



with patch series applied one gets:

[121/151] pass: 121 |

running: igt/gem_exec_reloc/gtt-28 - "Command terminated by signal 9"

Taking rest as new testlist

[30/30] skip: 2, pass: 30, dmesg-warn: 1

having

dmesg-warn: igt/gem_exec_reloc/readonly-32

skip: igt/gem_exec_reloc/active-bsd1

skip: igt/gem_exec_reloc/active-bsd2



When running the gtt-xx tests individually, the result for all is pass.

$ sudo ./gem_exec_reloc --run-subtest cpu-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest cpu-31: SUCCESS (3,760s)

$ sudo ./gem_exec_reloc --run-subtest gtt-31

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-31: SUCCESS (25,313s)

$ sudo ./gem_exec_reloc --run-subtest gtt-30

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-30: SUCCESS (11,196s)

$ sudo ./gem_exec_reloc --run-subtest gtt-29

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-29: SUCCESS (5,198s)

$ sudo ./gem_exec_reloc --run-subtest gtt-28

IGT-Version: 1.17-g3e3c1cd (x86_64) (Linux: 4.11.0-rc2-ezbench_cb106cd+ x86_64)

Subtest gtt-28: SUCCESS (2,543s)
