Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-25 Thread Daniel Vetter
On Thu, Mar 20, 2014 at 04:43:05PM +0200, Jani Nikula wrote:
 
 Hi Bradley -
 
 Apologies for my procrastination with the review; I don't easily recall
 as tedious a review as the command and register tables. And I sure have
 reviewed a lot of miserable stuff in the past.
 
 Most infuriatingly, I did not find a single real bug in the code!
 
 I think we'll need to automate some things going forward, for example
 listing the non-conforming length encoding with Damien's tools for cross
 checking.
 
 There are a few subtle points we need to discuss (separate mails
 internally) but all in all this series is:
 
 Reviewed-by: Jani Nikula jani.nik...@intel.com

Ok, pulled this one in, thanks a lot for the patchesreview. I think it's
time we start to move on with the next bits, the batch copy stuff seams
like a suitable piece. There's still issues with launching the entire
thing in the end, but we can start with the copy infrastructure.

Open issues I see still:

- The littel issue we're discussing internally. Like I've said that one is
  blocking us and needs to be resolved before we can switch to enforcing
  mode.

- Secure batch dispatch is still fubar.

- CodingStyle says: Functions should be a) at most 3 indent levels b) at
  most 3 ansi screens long (i.e. 75 lines). i915_parse_cmds violates both
  metrics pretty deftly. I think a few refactoring patches to extract
  helper functions and structure the flow a bit would be good.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-25 Thread Jani Nikula
On Tue, 25 Mar 2014, Daniel Vetter dan...@ffwll.ch wrote:
 On Thu, Mar 20, 2014 at 04:43:05PM +0200, Jani Nikula wrote:
 
 Hi Bradley -
 
 Apologies for my procrastination with the review; I don't easily recall
 as tedious a review as the command and register tables. And I sure have
 reviewed a lot of miserable stuff in the past.
 
 Most infuriatingly, I did not find a single real bug in the code!
 
 I think we'll need to automate some things going forward, for example
 listing the non-conforming length encoding with Damien's tools for cross
 checking.
 
 There are a few subtle points we need to discuss (separate mails
 internally) but all in all this series is:
 
 Reviewed-by: Jani Nikula jani.nik...@intel.com

 Ok, pulled this one in, thanks a lot for the patchesreview. I think it's
 time we start to move on with the next bits, the batch copy stuff seams
 like a suitable piece. There's still issues with launching the entire
 thing in the end, but we can start with the copy infrastructure.

 Open issues I see still:

 - The littel issue we're discussing internally. Like I've said that one is
   blocking us and needs to be resolved before we can switch to enforcing
   mode.

 - Secure batch dispatch is still fubar.

 - CodingStyle says: Functions should be a) at most 3 indent levels b) at
   most 3 ansi screens long (i.e. 75 lines). i915_parse_cmds violates both
   metrics pretty deftly. I think a few refactoring patches to extract
   helper functions and structure the flow a bit would be good.

Just extracting the handlers for (desc-flags  CMD_DESC_REGISTER) and
(desc-flags  CMD_DESC_BITMASK) would go a long way.

BR,
Jani.





 Cheers, Daniel
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-25 Thread Volkin, Bradley D
On Tue, Mar 25, 2014 at 06:15:36AM -0700, Daniel Vetter wrote:
 On Thu, Mar 20, 2014 at 04:43:05PM +0200, Jani Nikula wrote:
  
  Hi Bradley -
  
  Apologies for my procrastination with the review; I don't easily recall
  as tedious a review as the command and register tables. And I sure have
  reviewed a lot of miserable stuff in the past.
  
  Most infuriatingly, I did not find a single real bug in the code!
  
  I think we'll need to automate some things going forward, for example
  listing the non-conforming length encoding with Damien's tools for cross
  checking.
  
  There are a few subtle points we need to discuss (separate mails
  internally) but all in all this series is:
  
  Reviewed-by: Jani Nikula jani.nik...@intel.com
 
 Ok, pulled this one in, thanks a lot for the patchesreview. I think it's
 time we start to move on with the next bits, the batch copy stuff seams
 like a suitable piece. There's still issues with launching the entire
 thing in the end, but we can start with the copy infrastructure.
 
 Open issues I see still:
 
 - The littel issue we're discussing internally. Like I've said that one is
   blocking us and needs to be resolved before we can switch to enforcing
   mode.
 
 - Secure batch dispatch is still fubar.

I'm not sure that this will still impact us once we implement the batch copy
step. I was only using the secure dispatch stuff because it was a convenient
way to get the batch into GGTT. But with the copy step, we could just have
separate code to do that.

 
 - CodingStyle says: Functions should be a) at most 3 indent levels b) at
   most 3 ansi screens long (i.e. 75 lines). i915_parse_cmds violates both
   metrics pretty deftly. I think a few refactoring patches to extract
   helper functions and structure the flow a bit would be good.

:)

I'll start with a patch to move all of the actual checking logic into a
separate function, with maybe an extra helper for the bitmask checks. That
seems like it should cut the size down sufficiently.

Brad

 
 Cheers, Daniel
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-25 Thread Daniel Vetter
On Tue, Mar 25, 2014 at 12:46:37PM -0700, Volkin, Bradley D wrote:
 On Tue, Mar 25, 2014 at 06:15:36AM -0700, Daniel Vetter wrote:
  - Secure batch dispatch is still fubar.
 
 I'm not sure that this will still impact us once we implement the batch copy
 step. I was only using the secure dispatch stuff because it was a convenient
 way to get the batch into GGTT. But with the copy step, we could just have
 separate code to do that.

The problem isn't copying or allocating the bo, the issue is running it
with
a) the hw checker disabled
b) not mapped into any ppgtt so hidden from all (unchecked) access and
c) otherwise working like a normal batch.

For that we need to employ the secure batch dispatch code in the execbuf
code. Atm b) is broken for aliasing ppgtt and c) is broken for full ppgtt.
So a bit of blockers for us. But at least broken b) with aliasing ppgtt is
kinda a regression, which means I'll get around to it soonish (before
3.15-rc1 at least).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-12 Thread Volkin, Bradley D
On Tue, Mar 11, 2014 at 05:41:06AM -0700, Jani Nikula wrote:
 
 Hi Bradley -
 
 I've now rather meticulously reviewed what *is* in the command and
 register tables, and didn't spot any obvious errors.

Thanks Jani! I know it's a huge pain, so I appreciate you taking the
time for it.

 
 There is still the review of what is *not* in the command and register
 tables, but perhaps should be. This is important for two reasons:
 
 1) Any command that should fail that's missing from the tables will
pass.
 
 2) Any command that has a non-standard length encoding that's missing
from the tables will confuse the parser. Going out-of-sync with the
commands may fail a good batch, or, worse, a command that should fail
may pass.
 
 I'm thinking there's a certain fragility in this. In a sense, it would
 be better to whitelist everything, and fail everything else, to err on
 the safe side. But I'm guessing that's not really feasible.

There was some discussion on these issues previously. I think Daniel's
response here covers it.

http://lists.freedesktop.org/archives/intel-gfx/2014-January/039209.html

 
 Part of the fragility is the immense tediousness of the review... I know
 it gets easier with the incremental updates, but this first series is
 not fun. (You know, you could have planted a bug to reward me! Or did I
 miss it?! ;)

Yes, I'm afraid that I've taken all the fun for myself :p

Brad

 
 
 BR,
 Jani.
 
 
 On Tue, 18 Feb 2014, bradley.d.vol...@intel.com wrote:
  From: Brad Volkin bradley.d.vol...@intel.com
 
  Certain OpenGL features (e.g. transform feedback, performance monitoring)
  require userspace code to submit batches containing commands such as
  MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
  generations of the hardware will noop these commands in unsecure batches
  (which includes all userspace batches submitted via i915) even though the
  commands may be safe and represent the intended programming model of the 
  device.
 
  This series introduces a software command parser similar in operation to the
  command parsing done in hardware for unsecure batches. However, the software
  parser allows some operations that would be noop'd by hardware, if the 
  parser
  determines the operation is safe, and submits the batch as secure to 
  prevent
  hardware parsing. Currently the series implements this on IVB and HSW.
 
  The series has one piece of prep work, one patch for the parser logic, and a
  handful of patches to fill out the tables which drive the parser. There are
  follow-up patches to libdrm and to i-g-t. The i-g-t tests are basic and do 
  not
  test all of the commands used by the parser on the assumption that I'm 
  likely
  to make the same mistakes in both the parser and the test.
 
  I've previously run the i-g-t gem_* tests, the piglit quick tests, and 
  generally
  used Ubuntu 13.10 IVB and HSW systems with the parser running. Aside from a
  failure described below, I did not see any regressions.
 
  At this point there are a couple of required/potential improvements.
 
  1) Chained batches. The parser currently allows MI_BATCH_BUFFER_START 
  commands
 in userspace batches without parsing them. The media driver uses chained
 batches, so a solution is required. I'm still working through the
 requirements but don't want to continue delaying the review process for 
  what
 I have so far.
  2) Command buffer copy. To avoid CPU modifications to buffers after 
  parsing, and
 to avoid GPU modifications to buffers via EUs or commands in the batch, 
  we
 should copy the userspace batch buffer to memory that userspace does not
 have access to, map it into GGTT, and execute that batch buffer. I have a
 sense of how to do this for 1st-level batches, but it may need changes to
 tie in with the chained batch parsing, so I've again held off.
  3) Coherency. I've previously found a coherency issue on VLV when reading 
  the
 batch buffer from the CPU during execbuffer2. Userspace writes the batch 
  via
 pwrite fast path before calling execbuffer2. The parser reads stale data.
 This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC 
  issue.
 It's possible that the shmem pread refactoring fixes this, I just have 
  not
 been able to retest due to lack of a VLV system.
 
  v2:
  - Significantly reorder series
  - Scan secure batches (i.e. I915_EXEC_SECURE)
  - Check that parser tables are sorted during init
  - Fixed gem_cpu_reloc regression
  - HAS_CMD_PARSER - CMD_PARSER_VERSION getparam
  - Additional tests
 
  v3:
  - Don't actually send batches as secure yet
  - Improved documentation and commenting
  - Many other small cleanups throughout
 
  Brad Volkin (13):
drm/i915: Refactor shmem pread setup
drm/i915: Implement command buffer parsing logic
drm/i915: Initial command parser table definitions
drm/i915: Reject privileged commands
drm/i915: Allow some privileged 

Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-11 Thread Jani Nikula

Hi Bradley -

I've now rather meticulously reviewed what *is* in the command and
register tables, and didn't spot any obvious errors.

There is still the review of what is *not* in the command and register
tables, but perhaps should be. This is important for two reasons:

1) Any command that should fail that's missing from the tables will
   pass.

2) Any command that has a non-standard length encoding that's missing
   from the tables will confuse the parser. Going out-of-sync with the
   commands may fail a good batch, or, worse, a command that should fail
   may pass.

I'm thinking there's a certain fragility in this. In a sense, it would
be better to whitelist everything, and fail everything else, to err on
the safe side. But I'm guessing that's not really feasible.

Part of the fragility is the immense tediousness of the review... I know
it gets easier with the incremental updates, but this first series is
not fun. (You know, you could have planted a bug to reward me! Or did I
miss it?! ;)


BR,
Jani.


On Tue, 18 Feb 2014, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com

 Certain OpenGL features (e.g. transform feedback, performance monitoring)
 require userspace code to submit batches containing commands such as
 MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
 generations of the hardware will noop these commands in unsecure batches
 (which includes all userspace batches submitted via i915) even though the
 commands may be safe and represent the intended programming model of the 
 device.

 This series introduces a software command parser similar in operation to the
 command parsing done in hardware for unsecure batches. However, the software
 parser allows some operations that would be noop'd by hardware, if the parser
 determines the operation is safe, and submits the batch as secure to prevent
 hardware parsing. Currently the series implements this on IVB and HSW.

 The series has one piece of prep work, one patch for the parser logic, and a
 handful of patches to fill out the tables which drive the parser. There are
 follow-up patches to libdrm and to i-g-t. The i-g-t tests are basic and do not
 test all of the commands used by the parser on the assumption that I'm likely
 to make the same mistakes in both the parser and the test.

 I've previously run the i-g-t gem_* tests, the piglit quick tests, and 
 generally
 used Ubuntu 13.10 IVB and HSW systems with the parser running. Aside from a
 failure described below, I did not see any regressions.

 At this point there are a couple of required/potential improvements.

 1) Chained batches. The parser currently allows MI_BATCH_BUFFER_START commands
in userspace batches without parsing them. The media driver uses chained
batches, so a solution is required. I'm still working through the
requirements but don't want to continue delaying the review process for 
 what
I have so far.
 2) Command buffer copy. To avoid CPU modifications to buffers after parsing, 
 and
to avoid GPU modifications to buffers via EUs or commands in the batch, we
should copy the userspace batch buffer to memory that userspace does not
have access to, map it into GGTT, and execute that batch buffer. I have a
sense of how to do this for 1st-level batches, but it may need changes to
tie in with the chained batch parsing, so I've again held off.
 3) Coherency. I've previously found a coherency issue on VLV when reading the
batch buffer from the CPU during execbuffer2. Userspace writes the batch 
 via
pwrite fast path before calling execbuffer2. The parser reads stale data.
This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC issue.
It's possible that the shmem pread refactoring fixes this, I just have not
been able to retest due to lack of a VLV system.

 v2:
 - Significantly reorder series
 - Scan secure batches (i.e. I915_EXEC_SECURE)
 - Check that parser tables are sorted during init
 - Fixed gem_cpu_reloc regression
 - HAS_CMD_PARSER - CMD_PARSER_VERSION getparam
 - Additional tests

 v3:
 - Don't actually send batches as secure yet
 - Improved documentation and commenting
 - Many other small cleanups throughout

 Brad Volkin (13):
   drm/i915: Refactor shmem pread setup
   drm/i915: Implement command buffer parsing logic
   drm/i915: Initial command parser table definitions
   drm/i915: Reject privileged commands
   drm/i915: Allow some privileged commands from master
   drm/i915: Add register whitelists for mesa
   drm/i915: Add register whitelist for DRM master
   drm/i915: Enable register whitelist checks
   drm/i915: Reject commands that explicitly generate interrupts
   drm/i915: Enable PPGTT command parser checks
   drm/i915: Reject commands that would store to global HWS page
   drm/i915: Add a CMD_PARSER_VERSION getparam
   drm/i915: Enable command parsing by default

  drivers/gpu/drm/i915/Makefile  |   1 +
  

Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-05 Thread Daniel Vetter
On Tue, Feb 18, 2014 at 10:15:44AM -0800, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com
 
 Certain OpenGL features (e.g. transform feedback, performance monitoring)
 require userspace code to submit batches containing commands such as
 MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
 generations of the hardware will noop these commands in unsecure batches
 (which includes all userspace batches submitted via i915) even though the
 commands may be safe and represent the intended programming model of the 
 device.
 
 This series introduces a software command parser similar in operation to the
 command parsing done in hardware for unsecure batches. However, the software
 parser allows some operations that would be noop'd by hardware, if the parser
 determines the operation is safe, and submits the batch as secure to prevent
 hardware parsing. Currently the series implements this on IVB and HSW.
 
 The series has one piece of prep work, one patch for the parser logic, and a
 handful of patches to fill out the tables which drive the parser. There are
 follow-up patches to libdrm and to i-g-t. The i-g-t tests are basic and do not
 test all of the commands used by the parser on the assumption that I'm likely
 to make the same mistakes in both the parser and the test.
 
 I've previously run the i-g-t gem_* tests, the piglit quick tests, and 
 generally
 used Ubuntu 13.10 IVB and HSW systems with the parser running. Aside from a
 failure described below, I did not see any regressions.
 
 At this point there are a couple of required/potential improvements.
 
 1) Chained batches. The parser currently allows MI_BATCH_BUFFER_START commands
in userspace batches without parsing them. The media driver uses chained
batches, so a solution is required. I'm still working through the
requirements but don't want to continue delaying the review process for 
 what
I have so far.
 2) Command buffer copy. To avoid CPU modifications to buffers after parsing, 
 and
to avoid GPU modifications to buffers via EUs or commands in the batch, we
should copy the userspace batch buffer to memory that userspace does not
have access to, map it into GGTT, and execute that batch buffer. I have a
sense of how to do this for 1st-level batches, but it may need changes to
tie in with the chained batch parsing, so I've again held off.
 3) Coherency. I've previously found a coherency issue on VLV when reading the
batch buffer from the CPU during execbuffer2. Userspace writes the batch 
 via
pwrite fast path before calling execbuffer2. The parser reads stale data.
This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC issue.
It's possible that the shmem pread refactoring fixes this, I just have not
been able to retest due to lack of a VLV system.

Is it still true that we need to test this on vlv? The shmem_pread path
really should have fixed this ... Otherwise I think this looks ready to go
in, I'll pester Jani for the review.

Thanks, Daniel

 
 v2:
 - Significantly reorder series
 - Scan secure batches (i.e. I915_EXEC_SECURE)
 - Check that parser tables are sorted during init
 - Fixed gem_cpu_reloc regression
 - HAS_CMD_PARSER - CMD_PARSER_VERSION getparam
 - Additional tests
 
 v3:
 - Don't actually send batches as secure yet
 - Improved documentation and commenting
 - Many other small cleanups throughout
 
 Brad Volkin (13):
   drm/i915: Refactor shmem pread setup
   drm/i915: Implement command buffer parsing logic
   drm/i915: Initial command parser table definitions
   drm/i915: Reject privileged commands
   drm/i915: Allow some privileged commands from master
   drm/i915: Add register whitelists for mesa
   drm/i915: Add register whitelist for DRM master
   drm/i915: Enable register whitelist checks
   drm/i915: Reject commands that explicitly generate interrupts
   drm/i915: Enable PPGTT command parser checks
   drm/i915: Reject commands that would store to global HWS page
   drm/i915: Add a CMD_PARSER_VERSION getparam
   drm/i915: Enable command parsing by default
 
  drivers/gpu/drm/i915/Makefile  |   1 +
  drivers/gpu/drm/i915/i915_cmd_parser.c | 918 
 +
  drivers/gpu/drm/i915/i915_dma.c|   3 +
  drivers/gpu/drm/i915/i915_drv.h| 103 
  drivers/gpu/drm/i915/i915_gem.c|  51 +-
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  18 +
  drivers/gpu/drm/i915/i915_params.c |   5 +
  drivers/gpu/drm/i915/i915_reg.h|  96 +++
  drivers/gpu/drm/i915/intel_ringbuffer.c|   2 +
  drivers/gpu/drm/i915/intel_ringbuffer.h|  32 +
  include/uapi/drm/i915_drm.h|   1 +
  11 files changed, 1216 insertions(+), 14 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c
 
 -- 
 1.8.3.2
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 

Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-05 Thread Volkin, Bradley D
On Wed, Mar 05, 2014 at 09:14:38AM -0800, Daniel Vetter wrote:
 On Wed, Mar 05, 2014 at 08:59:56AM -0800, Volkin, Bradley D wrote:
  On Wed, Mar 05, 2014 at 02:46:35AM -0800, Daniel Vetter wrote:
   On Tue, Feb 18, 2014 at 10:15:44AM -0800, bradley.d.vol...@intel.com 
   wrote:
From: Brad Volkin bradley.d.vol...@intel.com
3) Coherency. I've previously found a coherency issue on VLV when 
reading the
   batch buffer from the CPU during execbuffer2. Userspace writes the 
batch via
   pwrite fast path before calling execbuffer2. The parser reads stale 
data.
   This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC 
issue.
   It's possible that the shmem pread refactoring fixes this, I just 
have not
   been able to retest due to lack of a VLV system.
   
   Is it still true that we need to test this on vlv? The shmem_pread path
   really should have fixed this ... Otherwise I think this looks ready to go
   in, I'll pester Jani for the review.
  
  Yes, I still don't have a system to test on.
 
 Steal one? I guess we'll notice when our QA runs all this stuff on theirs
 ;-)

It's not stealing if I give it back :)

Anyhow, sorry, the note isn't totally clear. It shouldn't be a problem for now 
because
the parser is still disabled on VLV (and platforms with ppgtt disabled in 
general). I
will look into retesting this though.

Brad

 
 Cheers, Daniel
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-03-04 Thread Volkin, Bradley D
Daniel, Jani,

I think I managed to send this while you were both out and I'm sure it got
buried. Can you take a look? I think this rev addresses all of the current
comments.

Thanks,
Brad

On Tue, Feb 18, 2014 at 10:15:44AM -0800, Volkin, Bradley D wrote:
 From: Brad Volkin bradley.d.vol...@intel.com
 
 Certain OpenGL features (e.g. transform feedback, performance monitoring)
 require userspace code to submit batches containing commands such as
 MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
 generations of the hardware will noop these commands in unsecure batches
 (which includes all userspace batches submitted via i915) even though the
 commands may be safe and represent the intended programming model of the 
 device.
 
 This series introduces a software command parser similar in operation to the
 command parsing done in hardware for unsecure batches. However, the software
 parser allows some operations that would be noop'd by hardware, if the parser
 determines the operation is safe, and submits the batch as secure to prevent
 hardware parsing. Currently the series implements this on IVB and HSW.
 
 The series has one piece of prep work, one patch for the parser logic, and a
 handful of patches to fill out the tables which drive the parser. There are
 follow-up patches to libdrm and to i-g-t. The i-g-t tests are basic and do not
 test all of the commands used by the parser on the assumption that I'm likely
 to make the same mistakes in both the parser and the test.
 
 I've previously run the i-g-t gem_* tests, the piglit quick tests, and 
 generally
 used Ubuntu 13.10 IVB and HSW systems with the parser running. Aside from a
 failure described below, I did not see any regressions.
 
 At this point there are a couple of required/potential improvements.
 
 1) Chained batches. The parser currently allows MI_BATCH_BUFFER_START commands
in userspace batches without parsing them. The media driver uses chained
batches, so a solution is required. I'm still working through the
requirements but don't want to continue delaying the review process for 
 what
I have so far.
 2) Command buffer copy. To avoid CPU modifications to buffers after parsing, 
 and
to avoid GPU modifications to buffers via EUs or commands in the batch, we
should copy the userspace batch buffer to memory that userspace does not
have access to, map it into GGTT, and execute that batch buffer. I have a
sense of how to do this for 1st-level batches, but it may need changes to
tie in with the chained batch parsing, so I've again held off.
 3) Coherency. I've previously found a coherency issue on VLV when reading the
batch buffer from the CPU during execbuffer2. Userspace writes the batch 
 via
pwrite fast path before calling execbuffer2. The parser reads stale data.
This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC issue.
It's possible that the shmem pread refactoring fixes this, I just have not
been able to retest due to lack of a VLV system.
 
 v2:
 - Significantly reorder series
 - Scan secure batches (i.e. I915_EXEC_SECURE)
 - Check that parser tables are sorted during init
 - Fixed gem_cpu_reloc regression
 - HAS_CMD_PARSER - CMD_PARSER_VERSION getparam
 - Additional tests
 
 v3:
 - Don't actually send batches as secure yet
 - Improved documentation and commenting
 - Many other small cleanups throughout
 
 Brad Volkin (13):
   drm/i915: Refactor shmem pread setup
   drm/i915: Implement command buffer parsing logic
   drm/i915: Initial command parser table definitions
   drm/i915: Reject privileged commands
   drm/i915: Allow some privileged commands from master
   drm/i915: Add register whitelists for mesa
   drm/i915: Add register whitelist for DRM master
   drm/i915: Enable register whitelist checks
   drm/i915: Reject commands that explicitly generate interrupts
   drm/i915: Enable PPGTT command parser checks
   drm/i915: Reject commands that would store to global HWS page
   drm/i915: Add a CMD_PARSER_VERSION getparam
   drm/i915: Enable command parsing by default
 
  drivers/gpu/drm/i915/Makefile  |   1 +
  drivers/gpu/drm/i915/i915_cmd_parser.c | 918 
 +
  drivers/gpu/drm/i915/i915_dma.c|   3 +
  drivers/gpu/drm/i915/i915_drv.h| 103 
  drivers/gpu/drm/i915/i915_gem.c|  51 +-
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  18 +
  drivers/gpu/drm/i915/i915_params.c |   5 +
  drivers/gpu/drm/i915/i915_reg.h|  96 +++
  drivers/gpu/drm/i915/intel_ringbuffer.c|   2 +
  drivers/gpu/drm/i915/intel_ringbuffer.h|  32 +
  include/uapi/drm/i915_drm.h|   1 +
  11 files changed, 1216 insertions(+), 14 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c
 
 -- 
 1.8.3.2
 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org

Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-02-05 Thread Jani Nikula

FYI, I did an initial review sweep of this. Will focus more on the
logic and registers etc. next.

BR,
Jani.


On Wed, 29 Jan 2014, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com

 Certain OpenGL features (e.g. transform feedback, performance monitoring)
 require userspace code to submit batches containing commands such as
 MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
 generations of the hardware will noop these commands in unsecure batches
 (which includes all userspace batches submitted via i915) even though the
 commands may be safe and represent the intended programming model of the 
 device.

 This series introduces a software command parser similar in operation to the
 command parsing done in hardware for unsecure batches. However, the software
 parser allows some operations that would be noop'd by hardware, if the parser
 determines the operation is safe, and submits the batch as secure to prevent
 hardware parsing. Currently the series implements this on IVB and HSW.

 The series has one piece of prep work, one patch for the parser logic, and a
 handful of patches to fill out the tables which drive the parser. There are
 follow-up patches to libdrm and to i-g-t. The i-g-t tests are basic and do not
 test all of the commands used by the parser on the assumption that I'm likely
 to make the same mistakes in both the parser and the test.

 WARNING!!!
 I've previously run the i-g-t gem_* tests, the piglit quick tests, and 
 generally
 used Ubuntu 13.10 IVB and HSW systems with the parser running. Aside from a
 failure described below, I did not see any regressions. However, the series
 currently hits a BUG_ON() if you enable the parser due to a regression in 
 secure
 batch handling on -nightly.

 At this point there are a couple of required/potential improvements.

 1) Chained batches. The parser currently allows MI_BATCH_BUFFER_START commands
in userspace batches without parsing them. The media driver uses chained
batches, so a solution is required. I'm still working through the
requirements but don't want to continue delaying the review process for 
 what
I have so far.
 2) Command buffer copy. To avoid CPU modifications to buffers after parsing, 
 and
to avoid GPU modifications to buffers via EUs or commands in the batch, we
should copy the userspace batch buffer to memory that userspace does not
have access to, map it into GGTT, and execute that batch buffer. I have a
sense of how to do this for 1st-level batches, but it may need changes to
tie in with the chained batch parsing, so I've again held off.
 3) Coherency. I've found a coherency issue on VLV when reading the batch 
 buffer
from the CPU during execbuffer2. Userspace writes the batch via pwrite fast
path before calling execbuffer2. The parser reads stale data. This works 
 fine
on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just 
 unclear
on what the correct flushing or synchronization is for this scenario. This
only matters if we get PPGTT working on VLV and enable the parser there.

 v2:
 - Significantly reorder series
 - Scan secure batches (i.e. I915_EXEC_SECURE)
 - Check that parser tables are sorted during init
 - Fixed gem_cpu_reloc regression
 - HAS_CMD_PARSER - CMD_PARSER_VERSION getparam
 - Additional tests

 Brad Volkin (13):
   drm/i915: Refactor shmem pread setup
   drm/i915: Implement command buffer parsing logic
   drm/i915: Initial command parser table definitions
   drm/i915: Reject privileged commands
   drm/i915: Allow some privileged commands from master
   drm/i915: Add register whitelists for mesa
   drm/i915: Add register whitelist for DRM master
   drm/i915: Enable register whitelist checks
   drm/i915: Reject commands that explicitly generate interrupts
   drm/i915: Enable PPGTT command parser checks
   drm/i915: Reject commands that would store to global HWS page
   drm/i915: Add a CMD_PARSER_VERSION getparam
   drm/i915: Enable command parsing by default

  drivers/gpu/drm/i915/Makefile  |   3 +-
  drivers/gpu/drm/i915/i915_cmd_parser.c | 845 
 +
  drivers/gpu/drm/i915/i915_dma.c|   4 +
  drivers/gpu/drm/i915/i915_drv.h| 103 
  drivers/gpu/drm/i915/i915_gem.c|  48 +-
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  17 +
  drivers/gpu/drm/i915/i915_params.c |   5 +
  drivers/gpu/drm/i915/i915_reg.h|  78 +++
  drivers/gpu/drm/i915/intel_ringbuffer.c|   2 +
  drivers/gpu/drm/i915/intel_ringbuffer.h|  32 ++
  include/uapi/drm/i915_drm.h|   1 +
  11 files changed, 1123 insertions(+), 15 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c

 -- 
 1.8.5.2

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open 

Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-01-29 Thread Daniel Vetter
On Wed, Jan 29, 2014 at 01:55:01PM -0800, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com
 3) Coherency. I've found a coherency issue on VLV when reading the batch 
 buffer
from the CPU during execbuffer2. Userspace writes the batch via pwrite fast
path before calling execbuffer2. The parser reads stale data. This works 
 fine
on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just 
 unclear
on what the correct flushing or synchronization is for this scenario. This
only matters if we get PPGTT working on VLV and enable the parser there.

Hm, adopting the shmem_read clflushing didn't help for this? That would be
fairly shocking, since it means our shmem read paths are broken. Which are
used e.g. by the libva readback code for the encoded bitstream.

One thing aside: When resending the complete series (even if it's just a
subset) it's better to start a new thread. We tend to use in-reply-to only
when resending individual patches, while the review discussion is still
ongoing. That way the discussion stays together. But when there's been a
bit a longer break it's imo better to start a new thread.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-01-29 Thread Volkin, Bradley D
On Wed, Jan 29, 2014 at 02:11:17PM -0800, Daniel Vetter wrote:
 On Wed, Jan 29, 2014 at 01:55:01PM -0800, bradley.d.vol...@intel.com wrote:
  From: Brad Volkin bradley.d.vol...@intel.com
  3) Coherency. I've found a coherency issue on VLV when reading the batch 
  buffer
 from the CPU during execbuffer2. Userspace writes the batch via pwrite 
  fast
 path before calling execbuffer2. The parser reads stale data. This works 
  fine
 on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just 
  unclear
 on what the correct flushing or synchronization is for this scenario. 
  This
 only matters if we get PPGTT working on VLV and enable the parser there.
 
 Hm, adopting the shmem_read clflushing didn't help for this? That would be
 fairly shocking, since it means our shmem read paths are broken. Which are
 used e.g. by the libva readback code for the encoded bitstream.

Sorry, not clear enough. I actually haven't retested that part with the 
clflushing
added since the opinion seemed to be that leaving the parser disabled for VLV 
was ok.
Just left the note for now so it doesn't get lost.

 
 One thing aside: When resending the complete series (even if it's just a
 subset) it's better to start a new thread. We tend to use in-reply-to only
 when resending individual patches, while the review discussion is still
 ongoing. That way the discussion stays together. But when there's been a
 bit a longer break it's imo better to start a new thread.

Ok, I thought it was more of a blanket thing. Noted.
-Brad

 
 Cheers, Daniel
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/13] Gen7 batch buffer command parser

2014-01-29 Thread Daniel Vetter
On Wed, Jan 29, 2014 at 02:22:49PM -0800, Volkin, Bradley D wrote:
 On Wed, Jan 29, 2014 at 02:11:17PM -0800, Daniel Vetter wrote:
  On Wed, Jan 29, 2014 at 01:55:01PM -0800, bradley.d.vol...@intel.com wrote:
   From: Brad Volkin bradley.d.vol...@intel.com
   3) Coherency. I've found a coherency issue on VLV when reading the batch 
   buffer
  from the CPU during execbuffer2. Userspace writes the batch via pwrite 
   fast
  path before calling execbuffer2. The parser reads stale data. This 
   works fine
  on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just 
   unclear
  on what the correct flushing or synchronization is for this scenario. 
   This
  only matters if we get PPGTT working on VLV and enable the parser 
   there.
  
  Hm, adopting the shmem_read clflushing didn't help for this? That would be
  fairly shocking, since it means our shmem read paths are broken. Which are
  used e.g. by the libva readback code for the encoded bitstream.
 
 Sorry, not clear enough. I actually haven't retested that part with the 
 clflushing
 added since the opinion seemed to be that leaving the parser disabled for VLV 
 was ok.
 Just left the note for now so it doesn't get lost.

Would be nice to give it spin though, since afaik this issue might persist
on vlv+1. And I guess we can't keep on sticking our heads into sand about
ppgtt not really working on soc platforms ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx