Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-14 Thread Francisco Jerez
Kenneth Graunke kenn...@whitecape.org writes:

 On Wednesday, May 13, 2015 07:33:22 PM Francisco Jerez wrote:
  - Nothing prevents you from evaluating the builder framework
independent from the tiling and format conversion code.
 [snip]

 I think you misunderstand - we all largely agree that the fs_builder
 infrastructure is a good idea.  I think we've mostly evaluated it and
 concluded that it's worth doing.

 But the array_utils stuff - a new subclass of backend_reg -
 emit_collect, emit_extract, emit_zip, emit_send, all of that is a bunch
 of new infrastructure that's only used in your new code, and is hard to
 evaluate.

Right, I misunderstood because I didn't consider those to be a large
amount of infrastructure, but rather an implementation detail I wasn't
expecting to receive much attention, and because in his mail Matt
explicitly mentioned the builder and quoted text from my builder patch.

 That's not nearly what I meant.  Of course whether and how we carry out
 the transition will still be open for discussion, that statement you
 quote was intended to show that I'm willing to do the hard work and not
 going to abandon the new infrastructure while the transition is half-way
 done (as Ken rudely insinuated on IRC).

 That's good news.  In the past, many instances of we'll fix it later
 have turned into we never got around to it - not from you, but from
 other developers.

I understand your concern, but dismissing a proposal that can
potentially improve the back-end architecture in the long run just
because you're uncertain about the author's commitment doesn't seem like
a very constructive reaction.  Most likely I misunderstood you and you
never meant to dismiss it, but a simple question about my long-term
plans would have been less discouraging.

 We've also been fixing bugs with subreg_offset for about a year
 now...which was added for ARB_image_load_store (IIRC)...

Although it's somewhat off-topic, I think it's good that you're bringing
this point up.  I admit that retrospectively adding the subreg_offset
field at that point was a mistake for several reasons:

 - It was a compromise from the start.  What I originally tried to do
   was to change reg_offset to keep track of the offset in bytes rather
   than in dispatch_width units (which is crazy but it's what it used to
   do back then), but the diff turned out to be massive and caused a
   considerable amount of breakage (this probably will sound familiar to
   Jason), so I settled on using a separate field under the pressure to
   get things working now, what retrospectively seems silly because my
   initial attempt at ARB_shader_image_load_store wasn't accepted.
   Ironically Jason made reg_offset much more sensible (kudos!) last
   year while I wasn't paying attention, but we missed the opportunity
   to unify the two offset fields.

 - It wasn't SSA-friendly.  subreg_offset was introduced before there
   was a consensus about switching the back-end IR to SSA-form.  One of
   its primary use cases was to implement packing efficiently using
   strided sub-dword copies, what fundamentally relies on doing multiple
   partial writes of a register.  Retrospectively it would have been
   better to define virtual opcodes similar to VEC4_OPCODE_PACK_BYTES
   but consistent across back-ends and supporting both packing and
   unpacking and different component widths.  This doesn't mean that
   subreg_offset doesn't make sense in an SSA-form IR, only this
   particular use case doesn't, we probably still want some mechanism to
   represent sub-GRF addressing even while in SSA-form in order to
   implement sub-dword arithmetic reasonably.

At some point my packing code based on strided copies stopped working
because of a bug in lower_load_payload().  I submitted a patch to fix it
but Jason was reluctant to include it for obvious reasons.  At that
point I decided to drop the strided copy path in favor of the more
general implementation which uses bit arithmetic to achieve the same
result but in the worst-case requires three instructions more.  I'm
planning to re-implement the strided copy path eventually as a virtual
opcode to get rid of the unnecessary instructions, if we consider it
worth doing.

 and not accounted for in a lot of places (i.e. dead code elimination
 thought writing g7.2 kills a previous write to any of g7).

Really?  I'm pretty sure I fixed fs_inst::is_partial_write() to return
true in that case, which DCE should have been using to decide whether to
kill a previous write.  Maybe you were doing a cross-register write or
something like that by accident?  (which is going to violate plenty of
other invariants of the compiler *and* of the hardware)

 So it's a real concern, no matter who's writing the code.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-13 Thread Francisco Jerez
Matt Turner matts...@gmail.com writes:

 On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 I like the idea of the builder refactor - having a mechanism for emitting
 code at arbitrary points makes a ton of sense.  But I'm uncomfortable
 with how much code is added - and duplicated from the existing visitor
 code - given that fs_visitor is refactored to use it.

 I agree with Jason that we should discuss our goal - what things should
 look like - but frankly, I think we should hold off on large scale
 refactoring until after we delete the GLSL IR backend.  There will be
 much less code to reorganize.

 I agree, I wasn't planning on migrating the rest of the compiler
 back-end to the builder framework until that happens.

 We talked about this on IRC yesterday -- I find it difficult to review
 some of these patches adding large amounts of new infrastructure
 that's only used in other in this series.

 Basically, the problem is that you're adding (1) a large amount of new
 infrastructure, and (2) a large amount of tiling calculation/format
 conversion code and neither can be independently evaluated. If the new
 infrastructure was proposed and we could see how it was useful in the
 existing compiler, we could evaluate it. If the new tiling
 calculation/format conversion code used existing compiler
 infrastructure, we could evaluate it.

 Combining the two in the same series though makes it difficult.

Matt, I know that a 3k LOC series might be intimidating at first glance,
but:

 - The additional infrastructure is badly needed.  If you've bothered to
   read the commit message [maybe longest in (our git tree's) history]
   or skimmed through the body of the fs_builder patch, read the
   6-screenful e-mail I sent you last October including a detailed
   explanation of the builder, or any of the working sample code
   (including a port of large parts of existing compiler code) I sent
   you back then, you surely already know that I'm not doing this on a
   whim.  The infrastructure equivalent to the builder should have
   *never* been part of the GLSL IR visitor, and the more new
   functionality we throw at the compiler relying on the old
   infrastructure the more difficult the migration will get.

 - Nothing prevents you from evaluating the builder framework
   independent from the tiling and format conversion code.  You have
   more than enough information to assess whether the change is
   reasonable on its own.  You have the lengthy explanation from last
   October.  You have sample code from last October showing how it's
   useful in the existing compiler -- And for the case that doesn't seem
   convincing enough I'm attaching a partial port of brw_fs_visitor.cpp
   to the builder framework based on a current tree [but note that it's
   just an example and not meant to be upstreamed].  You have enough
   knowledge of the existing infrastructure that hardly anything about
   the new infrastructure will come as a surprise for you -- It largely
   mimics the existing IR-building methods, their semantics and
   implementation, and the transition from one to the other usually
   amounts to replacing emit(FOO(...)) with bld.FOO(...).  You have
   enough documentation in the code itself -- I've spent a great deal of
   time documenting the builder framework and the image load/store
   implementation [to a much greater extent than is usual in the rest of
   the back-end code, a quickly written bash script gives a
   comment-to-code ratio of 31% vs. the average of 19% in the rest of
   the FS back-end].  Lastly you have me to answer any questions that
   could come up during review -- And the fact that you haven't asked
   any specific technical questions so far other than about codestyle
   and logistics makes me think that you haven't really attempted to
   understand the code in detail yet, and the supposed difficulty you
   talk about may be based on a presumption rather than on your actual
   experience.

 - Even if I turn out to be wrong, your evaluation is inconclusive for
   some strange reason, and the new infrastructure is merged with some
   hidden architectural flaw, the risk is close to non-existent.  The
   worst thing that could happen is we have to flip the switch to
   disable ARB_shader_image_load_store again until the problem is fixed.
   No critical functionality is affected because the new infrastructure
   is completely self-contained.

 It seems like we'd be buying into a transition to the new builder
 without having seen what the transition looks like ([switching to the
 new infrastructure] will be undertaken by a separate series, as it
 involves major back-end surgery).

That's not nearly what I meant.  Of course whether and how we carry out
the transition will still be open for discussion, that statement you
quote was intended to show that I'm willing to do the hard work and not
going to abandon the new 

Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-13 Thread Kenneth Graunke
On Wednesday, May 13, 2015 07:33:22 PM Francisco Jerez wrote:
  - Nothing prevents you from evaluating the builder framework
independent from the tiling and format conversion code.
[snip]

I think you misunderstand - we all largely agree that the fs_builder
infrastructure is a good idea.  I think we've mostly evaluated it and
concluded that it's worth doing.

But the array_utils stuff - a new subclass of backend_reg -
emit_collect, emit_extract, emit_zip, emit_send, all of that is a bunch
of new infrastructure that's only used in your new code, and is hard to
evaluate.

 That's not nearly what I meant.  Of course whether and how we carry out
 the transition will still be open for discussion, that statement you
 quote was intended to show that I'm willing to do the hard work and not
 going to abandon the new infrastructure while the transition is half-way
 done (as Ken rudely insinuated on IRC).

That's good news.  In the past, many instances of we'll fix it later
have turned into we never got around to it - not from you, but from
other developers.  We've also been fixing bugs with subreg_offset for
about a year now...which was added for ARB_image_load_store (IIRC)...
and not accounted for in a lot of places (i.e. dead code elimination
thought writing g7.2 kills a previous write to any of g7).

So it's a real concern, no matter who's writing the code.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-06 Thread Francisco Jerez
Kenneth Graunke kenn...@whitecape.org writes:

 On Tuesday, May 05, 2015 03:05:02 PM Matt Turner wrote:
 On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
  Kenneth Graunke kenn...@whitecape.org writes:
  That then begs the question - could we do the format conversion and
  address calculations in a i965-specific NIR pass?  It could simplify
  the backend changes.  NIR also has better CSE, which could help for
  repeated image access.
 
  I doubt that rewriting things in NIR would be of any help at this point.
  We could have support for ARB_shader_image_load_store already in the
  10.6 release if we refrain from reimplementing the world in the last
  minute.  And I doubt that an implementation in terms of NIR would have
  simplified anything, it would have required a bunch of intrinsics with
  driver-specific semantics,
 
 Can you give an example? We weren't sure exactly how to do the bounds
 checking on IVB in NIR (add an extra uniform, or maybe do the
 image-equivalent of textureSize?) but the other work-arounds didn't

imageSize is not enough...  We'd have to get through a bunch of other
driver-specific uniforms somehow (strides, tiling and swizzling
parameters) for untyped surface access to work.

 seem like they would be difficult to implement in NIR.

 Note that I'm assuming the NIR pass would be very driver specific, i.e.
 it would know which formats require typed/untyped messages, or format
 conversion.

 For example, it would see a NIR image load intrinsic that's supposed to
 return RGBA32_FLOAT, and turn it into an intrinsic that loads
 RGBA32_UINT(*), then emit code to do the conversion.

Yeah, the format conversion wasn't my concern, it's straightforward
arithmetic.  I'm more worried about handling typed vs untyped access and
passing through the bunch of parameters required for the latter.

 (*) or leave it alone, knowing i965/fs is going to do that anyway.

 I'm not sure how tricky it would be to set up an extra uniform, but with
 a uniform variable in place, reading/using its fields should be
 straightforward.

 I'm not sure what extra intrinsics we'd need.

  and I must admit that I find generating NIR
  even more annoying than generating i965 IR directly (for an example of
  what a mean see how [1] spends over twenty lines to calculate a simple
  a + b * c expression).
 
 That's nothing specific to NIR itself, and it's really already been
 solved with the new nir_builder. Take a look at Ken's
 src/mesa/program/prog_to_nir.c.

 Yeah, using nir_builder makes this /much/ less annoying...basically one
 line of code.

prog_to_nir seems far more readable indeed, but I don't see how this
suggestion is of any help at this point nine days before the end of the
merge window.  Given that reimplementing this in NIR wouldn't be
particularly beneficial neither short-term [we miss the merge window for
no practical benefit] nor long-term [long-term we have an SSA-form
back-end IR, hopefully one that allows us to share code between
back-ends], this is likely to be a complete waste of my time.

 --Ken


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-05 Thread Matt Turner
On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 That then begs the question - could we do the format conversion and
 address calculations in a i965-specific NIR pass?  It could simplify
 the backend changes.  NIR also has better CSE, which could help for
 repeated image access.

 I doubt that rewriting things in NIR would be of any help at this point.
 We could have support for ARB_shader_image_load_store already in the
 10.6 release if we refrain from reimplementing the world in the last
 minute.  And I doubt that an implementation in terms of NIR would have
 simplified anything, it would have required a bunch of intrinsics with
 driver-specific semantics,

Can you give an example? We weren't sure exactly how to do the bounds
checking on IVB in NIR (add an extra uniform, or maybe do the
image-equivalent of textureSize?) but the other work-arounds didn't
seem like they would be difficult to implement in NIR.

 and I must admit that I find generating NIR
 even more annoying than generating i965 IR directly (for an example of
 what a mean see how [1] spends over twenty lines to calculate a simple
 a + b * c expression).

That's nothing specific to NIR itself, and it's really already been
solved with the new nir_builder. Take a look at Ken's
src/mesa/program/prog_to_nir.c.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-05 Thread Kenneth Graunke
On Tuesday, May 05, 2015 03:05:02 PM Matt Turner wrote:
 On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote:
  Kenneth Graunke kenn...@whitecape.org writes:
  That then begs the question - could we do the format conversion and
  address calculations in a i965-specific NIR pass?  It could simplify
  the backend changes.  NIR also has better CSE, which could help for
  repeated image access.
 
  I doubt that rewriting things in NIR would be of any help at this point.
  We could have support for ARB_shader_image_load_store already in the
  10.6 release if we refrain from reimplementing the world in the last
  minute.  And I doubt that an implementation in terms of NIR would have
  simplified anything, it would have required a bunch of intrinsics with
  driver-specific semantics,
 
 Can you give an example? We weren't sure exactly how to do the bounds
 checking on IVB in NIR (add an extra uniform, or maybe do the
 image-equivalent of textureSize?) but the other work-arounds didn't
 seem like they would be difficult to implement in NIR.

Note that I'm assuming the NIR pass would be very driver specific, i.e.
it would know which formats require typed/untyped messages, or format
conversion.

For example, it would see a NIR image load intrinsic that's supposed to
return RGBA32_FLOAT, and turn it into an intrinsic that loads
RGBA32_UINT(*), then emit code to do the conversion.

(*) or leave it alone, knowing i965/fs is going to do that anyway.

I'm not sure how tricky it would be to set up an extra uniform, but with
a uniform variable in place, reading/using its fields should be
straightforward.

I'm not sure what extra intrinsics we'd need.

  and I must admit that I find generating NIR
  even more annoying than generating i965 IR directly (for an example of
  what a mean see how [1] spends over twenty lines to calculate a simple
  a + b * c expression).
 
 That's nothing specific to NIR itself, and it's really already been
 solved with the new nir_builder. Take a look at Ken's
 src/mesa/program/prog_to_nir.c.

Yeah, using nir_builder makes this /much/ less annoying...basically one
line of code.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-05 Thread Francisco Jerez
Kenneth Graunke kenn...@whitecape.org writes:

 On Saturday, May 02, 2015 06:29:27 PM Francisco Jerez wrote:
 This v2 is motivated by the less than enthusiastic reception of my
 last two series implementing the image load, store and atomic
 intrinsics in the i965 back-end.  It substitutes both.
 
 The built-ins are now implemented in the scalar back-end only, what
 means that IVB and HSW are no longer able to expose image support for
 the VS and GS stages -- behaviour acceptable according to the
 specification.  BDW and up should still be able to expose images in
 the VS stage, and possibly in the GS stage at some point too.  CS
 images can be supported on all gens (that support CS that is).  The
 benefit is that the FS/VEC4 abstraction becomes unnecessary and the
 overall set of changes amounts to less code.
 
 It can be found in the following branch along with its dependencies:
 http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar
 
 src/mesa/drivers/dri/i965/Makefile.sources   |4 +
 src/mesa/drivers/dri/i965/brw_context.c  |   12 ++
 src/mesa/drivers/dri/i965/brw_fs.cpp |   43 -
 src/mesa/drivers/dri/i965/brw_fs.h   |   19 +-
 src/mesa/drivers/dri/i965/brw_fs_builder.h   |  617 
 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |8 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |   56 --
 src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 
 +++
 src/mesa/drivers/dri/i965/brw_fs_surface_builder.h   |  233 
 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  466 
 ---
 src/mesa/drivers/dri/i965/brw_ir_fs.h|   56 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  |   55 ++
 src/mesa/drivers/dri/i965/brw_shader.cpp |   32 
 src/mesa/drivers/dri/i965/brw_shader.h   |6 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp   |   10 +
 src/mesa/drivers/dri/i965/brw_vec4.h |6 +
 src/mesa/drivers/dri/i965/brw_vec4_builder.h |  579 
 ++
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   |  179 
 ++
 src/mesa/drivers/dri/i965/intel_extensions.c |2 +
 20 files changed, 3224 insertions(+), 311 deletions(-)
 
 [PATCH 01/29] i965/fs: Have component() set the register stride to zero.
 [PATCH 02/29] i965: Add register constructors taking an backend_reg as 
 argument.
 [PATCH 03/29] i965: Define consistent interface to disable control flow 
 execution masking.
 [PATCH 04/29] i965: Define consistent interface to predicate an instruction.
 [PATCH 05/29] i965: Define consistent interface to enable instruction 
 conditional modifiers.
 [PATCH 06/29] i965: Define consistent interface to enable instruction result 
 saturation.
 [PATCH 07/29] i965/fs: Introduce FS IR builder.
 [PATCH 08/29] i965/vec4: Introduce VEC4 IR builder.
 [PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor 
 interface.
 [PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR 
 visitors.
 [PATCH 11/29] i965: Add a visitor method to extract the result of a visit.
 [PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through 
 the visitor.
 [PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing 
 through the visitor.
 [PATCH 14/29] i965: Lift the constness restriction on surface indices passed 
 to untyped ops.
 [PATCH 15/29] i965/fs: Import array utils for the surface message builder.
 [PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and 
 back.
 [PATCH 17/29] i965/fs: Import surface message builder functions.
 [PATCH 18/29] i965/fs: Import image access validity checks.
 [PATCH 19/29] i965/fs: Import image memory offset calculation code.
 [PATCH 20/29] i965/fs: Import image format metadata queries.
 [PATCH 21/29] i965/fs: Import image format conversion primitives.
 [PATCH 22/29] i965/fs: Implement image load, store and atomic.
 [PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation.
 [PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation.
 [PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code.
 [PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation 
 code.
 [PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit 
 methods.
 [PATCH 28/29] i965: Define implementation constants for 
 ARB_shader_image_load_store.
 [PATCH 29/29] i965: Expose ARB_shader_image_load_store.

 Hi Curro,

Hi Ken,

 Thanks for sending this out.  I'm definitely more comfortable with v2,
 though I do 

Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-04 Thread Kenneth Graunke
On Saturday, May 02, 2015 06:29:27 PM Francisco Jerez wrote:
 This v2 is motivated by the less than enthusiastic reception of my
 last two series implementing the image load, store and atomic
 intrinsics in the i965 back-end.  It substitutes both.
 
 The built-ins are now implemented in the scalar back-end only, what
 means that IVB and HSW are no longer able to expose image support for
 the VS and GS stages -- behaviour acceptable according to the
 specification.  BDW and up should still be able to expose images in
 the VS stage, and possibly in the GS stage at some point too.  CS
 images can be supported on all gens (that support CS that is).  The
 benefit is that the FS/VEC4 abstraction becomes unnecessary and the
 overall set of changes amounts to less code.
 
 It can be found in the following branch along with its dependencies:
 http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar
 
 src/mesa/drivers/dri/i965/Makefile.sources   |4 +
 src/mesa/drivers/dri/i965/brw_context.c  |   12 ++
 src/mesa/drivers/dri/i965/brw_fs.cpp |   43 -
 src/mesa/drivers/dri/i965/brw_fs.h   |   19 +-
 src/mesa/drivers/dri/i965/brw_fs_builder.h   |  617 
 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |8 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |   56 --
 src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 
 +++
 src/mesa/drivers/dri/i965/brw_fs_surface_builder.h   |  233 
 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  466 
 ---
 src/mesa/drivers/dri/i965/brw_ir_fs.h|   56 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  |   55 ++
 src/mesa/drivers/dri/i965/brw_shader.cpp |   32 
 src/mesa/drivers/dri/i965/brw_shader.h   |6 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp   |   10 +
 src/mesa/drivers/dri/i965/brw_vec4.h |6 +
 src/mesa/drivers/dri/i965/brw_vec4_builder.h |  579 
 ++
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   |  179 ++
 src/mesa/drivers/dri/i965/intel_extensions.c |2 +
 20 files changed, 3224 insertions(+), 311 deletions(-)
 
 [PATCH 01/29] i965/fs: Have component() set the register stride to zero.
 [PATCH 02/29] i965: Add register constructors taking an backend_reg as 
 argument.
 [PATCH 03/29] i965: Define consistent interface to disable control flow 
 execution masking.
 [PATCH 04/29] i965: Define consistent interface to predicate an instruction.
 [PATCH 05/29] i965: Define consistent interface to enable instruction 
 conditional modifiers.
 [PATCH 06/29] i965: Define consistent interface to enable instruction result 
 saturation.
 [PATCH 07/29] i965/fs: Introduce FS IR builder.
 [PATCH 08/29] i965/vec4: Introduce VEC4 IR builder.
 [PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor 
 interface.
 [PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR 
 visitors.
 [PATCH 11/29] i965: Add a visitor method to extract the result of a visit.
 [PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through 
 the visitor.
 [PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing through 
 the visitor.
 [PATCH 14/29] i965: Lift the constness restriction on surface indices passed 
 to untyped ops.
 [PATCH 15/29] i965/fs: Import array utils for the surface message builder.
 [PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and back.
 [PATCH 17/29] i965/fs: Import surface message builder functions.
 [PATCH 18/29] i965/fs: Import image access validity checks.
 [PATCH 19/29] i965/fs: Import image memory offset calculation code.
 [PATCH 20/29] i965/fs: Import image format metadata queries.
 [PATCH 21/29] i965/fs: Import image format conversion primitives.
 [PATCH 22/29] i965/fs: Implement image load, store and atomic.
 [PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation.
 [PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation.
 [PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code.
 [PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation 
 code.
 [PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit 
 methods.
 [PATCH 28/29] i965: Define implementation constants for 
 ARB_shader_image_load_store.
 [PATCH 29/29] i965: Expose ARB_shader_image_load_store.

Hi Curro,

Thanks for sending this out.  I'm definitely more comfortable with v2,
though I do still have concerns.

At this point, NIR is the default FS backend. 

[Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)

2015-05-02 Thread Francisco Jerez
This v2 is motivated by the less than enthusiastic reception of my
last two series implementing the image load, store and atomic
intrinsics in the i965 back-end.  It substitutes both.

The built-ins are now implemented in the scalar back-end only, what
means that IVB and HSW are no longer able to expose image support for
the VS and GS stages -- behaviour acceptable according to the
specification.  BDW and up should still be able to expose images in
the VS stage, and possibly in the GS stage at some point too.  CS
images can be supported on all gens (that support CS that is).  The
benefit is that the FS/VEC4 abstraction becomes unnecessary and the
overall set of changes amounts to less code.

It can be found in the following branch along with its dependencies:
http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar

src/mesa/drivers/dri/i965/Makefile.sources   |4 +
src/mesa/drivers/dri/i965/brw_context.c  |   12 ++
src/mesa/drivers/dri/i965/brw_fs.cpp |   43 -
src/mesa/drivers/dri/i965/brw_fs.h   |   19 +-
src/mesa/drivers/dri/i965/brw_fs_builder.h   |  617 
++
src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |8 +-
src/mesa/drivers/dri/i965/brw_fs_nir.cpp |   56 --
src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 
+++
src/mesa/drivers/dri/i965/brw_fs_surface_builder.h   |  233 

src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  466 
---
src/mesa/drivers/dri/i965/brw_ir_fs.h|   56 ++
src/mesa/drivers/dri/i965/brw_ir_vec4.h  |   55 ++
src/mesa/drivers/dri/i965/brw_shader.cpp |   32 
src/mesa/drivers/dri/i965/brw_shader.h   |6 +
src/mesa/drivers/dri/i965/brw_vec4.cpp   |   10 +
src/mesa/drivers/dri/i965/brw_vec4.h |6 +
src/mesa/drivers/dri/i965/brw_vec4_builder.h |  579 
++
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +-
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   |  179 ++
src/mesa/drivers/dri/i965/intel_extensions.c |2 +
20 files changed, 3224 insertions(+), 311 deletions(-)

[PATCH 01/29] i965/fs: Have component() set the register stride to zero.
[PATCH 02/29] i965: Add register constructors taking an backend_reg as argument.
[PATCH 03/29] i965: Define consistent interface to disable control flow 
execution masking.
[PATCH 04/29] i965: Define consistent interface to predicate an instruction.
[PATCH 05/29] i965: Define consistent interface to enable instruction 
conditional modifiers.
[PATCH 06/29] i965: Define consistent interface to enable instruction result 
saturation.
[PATCH 07/29] i965/fs: Introduce FS IR builder.
[PATCH 08/29] i965/vec4: Introduce VEC4 IR builder.
[PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor 
interface.
[PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR 
visitors.
[PATCH 11/29] i965: Add a visitor method to extract the result of a visit.
[PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through the 
visitor.
[PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing through 
the visitor.
[PATCH 14/29] i965: Lift the constness restriction on surface indices passed to 
untyped ops.
[PATCH 15/29] i965/fs: Import array utils for the surface message builder.
[PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and back.
[PATCH 17/29] i965/fs: Import surface message builder functions.
[PATCH 18/29] i965/fs: Import image access validity checks.
[PATCH 19/29] i965/fs: Import image memory offset calculation code.
[PATCH 20/29] i965/fs: Import image format metadata queries.
[PATCH 21/29] i965/fs: Import image format conversion primitives.
[PATCH 22/29] i965/fs: Implement image load, store and atomic.
[PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation.
[PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation.
[PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code.
[PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation code.
[PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit methods.
[PATCH 28/29] i965: Define implementation constants for 
ARB_shader_image_load_store.
[PATCH 29/29] i965: Expose ARB_shader_image_load_store.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev