Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
Kenneth Graunke kenn...@whitecape.org writes: On Wednesday, May 13, 2015 07:33:22 PM Francisco Jerez wrote: - Nothing prevents you from evaluating the builder framework independent from the tiling and format conversion code. [snip] I think you misunderstand - we all largely agree that the fs_builder infrastructure is a good idea. I think we've mostly evaluated it and concluded that it's worth doing. But the array_utils stuff - a new subclass of backend_reg - emit_collect, emit_extract, emit_zip, emit_send, all of that is a bunch of new infrastructure that's only used in your new code, and is hard to evaluate. Right, I misunderstood because I didn't consider those to be a large amount of infrastructure, but rather an implementation detail I wasn't expecting to receive much attention, and because in his mail Matt explicitly mentioned the builder and quoted text from my builder patch. That's not nearly what I meant. Of course whether and how we carry out the transition will still be open for discussion, that statement you quote was intended to show that I'm willing to do the hard work and not going to abandon the new infrastructure while the transition is half-way done (as Ken rudely insinuated on IRC). That's good news. In the past, many instances of we'll fix it later have turned into we never got around to it - not from you, but from other developers. I understand your concern, but dismissing a proposal that can potentially improve the back-end architecture in the long run just because you're uncertain about the author's commitment doesn't seem like a very constructive reaction. Most likely I misunderstood you and you never meant to dismiss it, but a simple question about my long-term plans would have been less discouraging. We've also been fixing bugs with subreg_offset for about a year now...which was added for ARB_image_load_store (IIRC)... Although it's somewhat off-topic, I think it's good that you're bringing this point up. I admit that retrospectively adding the subreg_offset field at that point was a mistake for several reasons: - It was a compromise from the start. What I originally tried to do was to change reg_offset to keep track of the offset in bytes rather than in dispatch_width units (which is crazy but it's what it used to do back then), but the diff turned out to be massive and caused a considerable amount of breakage (this probably will sound familiar to Jason), so I settled on using a separate field under the pressure to get things working now, what retrospectively seems silly because my initial attempt at ARB_shader_image_load_store wasn't accepted. Ironically Jason made reg_offset much more sensible (kudos!) last year while I wasn't paying attention, but we missed the opportunity to unify the two offset fields. - It wasn't SSA-friendly. subreg_offset was introduced before there was a consensus about switching the back-end IR to SSA-form. One of its primary use cases was to implement packing efficiently using strided sub-dword copies, what fundamentally relies on doing multiple partial writes of a register. Retrospectively it would have been better to define virtual opcodes similar to VEC4_OPCODE_PACK_BYTES but consistent across back-ends and supporting both packing and unpacking and different component widths. This doesn't mean that subreg_offset doesn't make sense in an SSA-form IR, only this particular use case doesn't, we probably still want some mechanism to represent sub-GRF addressing even while in SSA-form in order to implement sub-dword arithmetic reasonably. At some point my packing code based on strided copies stopped working because of a bug in lower_load_payload(). I submitted a patch to fix it but Jason was reluctant to include it for obvious reasons. At that point I decided to drop the strided copy path in favor of the more general implementation which uses bit arithmetic to achieve the same result but in the worst-case requires three instructions more. I'm planning to re-implement the strided copy path eventually as a virtual opcode to get rid of the unnecessary instructions, if we consider it worth doing. and not accounted for in a lot of places (i.e. dead code elimination thought writing g7.2 kills a previous write to any of g7). Really? I'm pretty sure I fixed fs_inst::is_partial_write() to return true in that case, which DCE should have been using to decide whether to kill a previous write. Maybe you were doing a cross-register write or something like that by accident? (which is going to violate plenty of other invariants of the compiler *and* of the hardware) So it's a real concern, no matter who's writing the code. signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
Matt Turner matts...@gmail.com writes: On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote: Kenneth Graunke kenn...@whitecape.org writes: I like the idea of the builder refactor - having a mechanism for emitting code at arbitrary points makes a ton of sense. But I'm uncomfortable with how much code is added - and duplicated from the existing visitor code - given that fs_visitor is refactored to use it. I agree with Jason that we should discuss our goal - what things should look like - but frankly, I think we should hold off on large scale refactoring until after we delete the GLSL IR backend. There will be much less code to reorganize. I agree, I wasn't planning on migrating the rest of the compiler back-end to the builder framework until that happens. We talked about this on IRC yesterday -- I find it difficult to review some of these patches adding large amounts of new infrastructure that's only used in other in this series. Basically, the problem is that you're adding (1) a large amount of new infrastructure, and (2) a large amount of tiling calculation/format conversion code and neither can be independently evaluated. If the new infrastructure was proposed and we could see how it was useful in the existing compiler, we could evaluate it. If the new tiling calculation/format conversion code used existing compiler infrastructure, we could evaluate it. Combining the two in the same series though makes it difficult. Matt, I know that a 3k LOC series might be intimidating at first glance, but: - The additional infrastructure is badly needed. If you've bothered to read the commit message [maybe longest in (our git tree's) history] or skimmed through the body of the fs_builder patch, read the 6-screenful e-mail I sent you last October including a detailed explanation of the builder, or any of the working sample code (including a port of large parts of existing compiler code) I sent you back then, you surely already know that I'm not doing this on a whim. The infrastructure equivalent to the builder should have *never* been part of the GLSL IR visitor, and the more new functionality we throw at the compiler relying on the old infrastructure the more difficult the migration will get. - Nothing prevents you from evaluating the builder framework independent from the tiling and format conversion code. You have more than enough information to assess whether the change is reasonable on its own. You have the lengthy explanation from last October. You have sample code from last October showing how it's useful in the existing compiler -- And for the case that doesn't seem convincing enough I'm attaching a partial port of brw_fs_visitor.cpp to the builder framework based on a current tree [but note that it's just an example and not meant to be upstreamed]. You have enough knowledge of the existing infrastructure that hardly anything about the new infrastructure will come as a surprise for you -- It largely mimics the existing IR-building methods, their semantics and implementation, and the transition from one to the other usually amounts to replacing emit(FOO(...)) with bld.FOO(...). You have enough documentation in the code itself -- I've spent a great deal of time documenting the builder framework and the image load/store implementation [to a much greater extent than is usual in the rest of the back-end code, a quickly written bash script gives a comment-to-code ratio of 31% vs. the average of 19% in the rest of the FS back-end]. Lastly you have me to answer any questions that could come up during review -- And the fact that you haven't asked any specific technical questions so far other than about codestyle and logistics makes me think that you haven't really attempted to understand the code in detail yet, and the supposed difficulty you talk about may be based on a presumption rather than on your actual experience. - Even if I turn out to be wrong, your evaluation is inconclusive for some strange reason, and the new infrastructure is merged with some hidden architectural flaw, the risk is close to non-existent. The worst thing that could happen is we have to flip the switch to disable ARB_shader_image_load_store again until the problem is fixed. No critical functionality is affected because the new infrastructure is completely self-contained. It seems like we'd be buying into a transition to the new builder without having seen what the transition looks like ([switching to the new infrastructure] will be undertaken by a separate series, as it involves major back-end surgery). That's not nearly what I meant. Of course whether and how we carry out the transition will still be open for discussion, that statement you quote was intended to show that I'm willing to do the hard work and not going to abandon the new
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
On Wednesday, May 13, 2015 07:33:22 PM Francisco Jerez wrote: - Nothing prevents you from evaluating the builder framework independent from the tiling and format conversion code. [snip] I think you misunderstand - we all largely agree that the fs_builder infrastructure is a good idea. I think we've mostly evaluated it and concluded that it's worth doing. But the array_utils stuff - a new subclass of backend_reg - emit_collect, emit_extract, emit_zip, emit_send, all of that is a bunch of new infrastructure that's only used in your new code, and is hard to evaluate. That's not nearly what I meant. Of course whether and how we carry out the transition will still be open for discussion, that statement you quote was intended to show that I'm willing to do the hard work and not going to abandon the new infrastructure while the transition is half-way done (as Ken rudely insinuated on IRC). That's good news. In the past, many instances of we'll fix it later have turned into we never got around to it - not from you, but from other developers. We've also been fixing bugs with subreg_offset for about a year now...which was added for ARB_image_load_store (IIRC)... and not accounted for in a lot of places (i.e. dead code elimination thought writing g7.2 kills a previous write to any of g7). So it's a real concern, no matter who's writing the code. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
Kenneth Graunke kenn...@whitecape.org writes: On Tuesday, May 05, 2015 03:05:02 PM Matt Turner wrote: On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote: Kenneth Graunke kenn...@whitecape.org writes: That then begs the question - could we do the format conversion and address calculations in a i965-specific NIR pass? It could simplify the backend changes. NIR also has better CSE, which could help for repeated image access. I doubt that rewriting things in NIR would be of any help at this point. We could have support for ARB_shader_image_load_store already in the 10.6 release if we refrain from reimplementing the world in the last minute. And I doubt that an implementation in terms of NIR would have simplified anything, it would have required a bunch of intrinsics with driver-specific semantics, Can you give an example? We weren't sure exactly how to do the bounds checking on IVB in NIR (add an extra uniform, or maybe do the image-equivalent of textureSize?) but the other work-arounds didn't imageSize is not enough... We'd have to get through a bunch of other driver-specific uniforms somehow (strides, tiling and swizzling parameters) for untyped surface access to work. seem like they would be difficult to implement in NIR. Note that I'm assuming the NIR pass would be very driver specific, i.e. it would know which formats require typed/untyped messages, or format conversion. For example, it would see a NIR image load intrinsic that's supposed to return RGBA32_FLOAT, and turn it into an intrinsic that loads RGBA32_UINT(*), then emit code to do the conversion. Yeah, the format conversion wasn't my concern, it's straightforward arithmetic. I'm more worried about handling typed vs untyped access and passing through the bunch of parameters required for the latter. (*) or leave it alone, knowing i965/fs is going to do that anyway. I'm not sure how tricky it would be to set up an extra uniform, but with a uniform variable in place, reading/using its fields should be straightforward. I'm not sure what extra intrinsics we'd need. and I must admit that I find generating NIR even more annoying than generating i965 IR directly (for an example of what a mean see how [1] spends over twenty lines to calculate a simple a + b * c expression). That's nothing specific to NIR itself, and it's really already been solved with the new nir_builder. Take a look at Ken's src/mesa/program/prog_to_nir.c. Yeah, using nir_builder makes this /much/ less annoying...basically one line of code. prog_to_nir seems far more readable indeed, but I don't see how this suggestion is of any help at this point nine days before the end of the merge window. Given that reimplementing this in NIR wouldn't be particularly beneficial neither short-term [we miss the merge window for no practical benefit] nor long-term [long-term we have an SSA-form back-end IR, hopefully one that allows us to share code between back-ends], this is likely to be a complete waste of my time. --Ken signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote: Kenneth Graunke kenn...@whitecape.org writes: That then begs the question - could we do the format conversion and address calculations in a i965-specific NIR pass? It could simplify the backend changes. NIR also has better CSE, which could help for repeated image access. I doubt that rewriting things in NIR would be of any help at this point. We could have support for ARB_shader_image_load_store already in the 10.6 release if we refrain from reimplementing the world in the last minute. And I doubt that an implementation in terms of NIR would have simplified anything, it would have required a bunch of intrinsics with driver-specific semantics, Can you give an example? We weren't sure exactly how to do the bounds checking on IVB in NIR (add an extra uniform, or maybe do the image-equivalent of textureSize?) but the other work-arounds didn't seem like they would be difficult to implement in NIR. and I must admit that I find generating NIR even more annoying than generating i965 IR directly (for an example of what a mean see how [1] spends over twenty lines to calculate a simple a + b * c expression). That's nothing specific to NIR itself, and it's really already been solved with the new nir_builder. Take a look at Ken's src/mesa/program/prog_to_nir.c. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
On Tuesday, May 05, 2015 03:05:02 PM Matt Turner wrote: On Tue, May 5, 2015 at 2:17 PM, Francisco Jerez curroje...@riseup.net wrote: Kenneth Graunke kenn...@whitecape.org writes: That then begs the question - could we do the format conversion and address calculations in a i965-specific NIR pass? It could simplify the backend changes. NIR also has better CSE, which could help for repeated image access. I doubt that rewriting things in NIR would be of any help at this point. We could have support for ARB_shader_image_load_store already in the 10.6 release if we refrain from reimplementing the world in the last minute. And I doubt that an implementation in terms of NIR would have simplified anything, it would have required a bunch of intrinsics with driver-specific semantics, Can you give an example? We weren't sure exactly how to do the bounds checking on IVB in NIR (add an extra uniform, or maybe do the image-equivalent of textureSize?) but the other work-arounds didn't seem like they would be difficult to implement in NIR. Note that I'm assuming the NIR pass would be very driver specific, i.e. it would know which formats require typed/untyped messages, or format conversion. For example, it would see a NIR image load intrinsic that's supposed to return RGBA32_FLOAT, and turn it into an intrinsic that loads RGBA32_UINT(*), then emit code to do the conversion. (*) or leave it alone, knowing i965/fs is going to do that anyway. I'm not sure how tricky it would be to set up an extra uniform, but with a uniform variable in place, reading/using its fields should be straightforward. I'm not sure what extra intrinsics we'd need. and I must admit that I find generating NIR even more annoying than generating i965 IR directly (for an example of what a mean see how [1] spends over twenty lines to calculate a simple a + b * c expression). That's nothing specific to NIR itself, and it's really already been solved with the new nir_builder. Take a look at Ken's src/mesa/program/prog_to_nir.c. Yeah, using nir_builder makes this /much/ less annoying...basically one line of code. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
Kenneth Graunke kenn...@whitecape.org writes: On Saturday, May 02, 2015 06:29:27 PM Francisco Jerez wrote: This v2 is motivated by the less than enthusiastic reception of my last two series implementing the image load, store and atomic intrinsics in the i965 back-end. It substitutes both. The built-ins are now implemented in the scalar back-end only, what means that IVB and HSW are no longer able to expose image support for the VS and GS stages -- behaviour acceptable according to the specification. BDW and up should still be able to expose images in the VS stage, and possibly in the GS stage at some point too. CS images can be supported on all gens (that support CS that is). The benefit is that the FS/VEC4 abstraction becomes unnecessary and the overall set of changes amounts to less code. It can be found in the following branch along with its dependencies: http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar src/mesa/drivers/dri/i965/Makefile.sources |4 + src/mesa/drivers/dri/i965/brw_context.c | 12 ++ src/mesa/drivers/dri/i965/brw_fs.cpp | 43 - src/mesa/drivers/dri/i965/brw_fs.h | 19 +- src/mesa/drivers/dri/i965/brw_fs_builder.h | 617 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 56 -- src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 +++ src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 233 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 466 --- src/mesa/drivers/dri/i965/brw_ir_fs.h| 56 ++ src/mesa/drivers/dri/i965/brw_ir_vec4.h | 55 ++ src/mesa/drivers/dri/i965/brw_shader.cpp | 32 src/mesa/drivers/dri/i965/brw_shader.h |6 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 10 + src/mesa/drivers/dri/i965/brw_vec4.h |6 + src/mesa/drivers/dri/i965/brw_vec4_builder.h | 579 ++ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 179 ++ src/mesa/drivers/dri/i965/intel_extensions.c |2 + 20 files changed, 3224 insertions(+), 311 deletions(-) [PATCH 01/29] i965/fs: Have component() set the register stride to zero. [PATCH 02/29] i965: Add register constructors taking an backend_reg as argument. [PATCH 03/29] i965: Define consistent interface to disable control flow execution masking. [PATCH 04/29] i965: Define consistent interface to predicate an instruction. [PATCH 05/29] i965: Define consistent interface to enable instruction conditional modifiers. [PATCH 06/29] i965: Define consistent interface to enable instruction result saturation. [PATCH 07/29] i965/fs: Introduce FS IR builder. [PATCH 08/29] i965/vec4: Introduce VEC4 IR builder. [PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor interface. [PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR visitors. [PATCH 11/29] i965: Add a visitor method to extract the result of a visit. [PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through the visitor. [PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing through the visitor. [PATCH 14/29] i965: Lift the constness restriction on surface indices passed to untyped ops. [PATCH 15/29] i965/fs: Import array utils for the surface message builder. [PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and back. [PATCH 17/29] i965/fs: Import surface message builder functions. [PATCH 18/29] i965/fs: Import image access validity checks. [PATCH 19/29] i965/fs: Import image memory offset calculation code. [PATCH 20/29] i965/fs: Import image format metadata queries. [PATCH 21/29] i965/fs: Import image format conversion primitives. [PATCH 22/29] i965/fs: Implement image load, store and atomic. [PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation. [PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation. [PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code. [PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation code. [PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit methods. [PATCH 28/29] i965: Define implementation constants for ARB_shader_image_load_store. [PATCH 29/29] i965: Expose ARB_shader_image_load_store. Hi Curro, Hi Ken, Thanks for sending this out. I'm definitely more comfortable with v2, though I do
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
On Saturday, May 02, 2015 06:29:27 PM Francisco Jerez wrote: This v2 is motivated by the less than enthusiastic reception of my last two series implementing the image load, store and atomic intrinsics in the i965 back-end. It substitutes both. The built-ins are now implemented in the scalar back-end only, what means that IVB and HSW are no longer able to expose image support for the VS and GS stages -- behaviour acceptable according to the specification. BDW and up should still be able to expose images in the VS stage, and possibly in the GS stage at some point too. CS images can be supported on all gens (that support CS that is). The benefit is that the FS/VEC4 abstraction becomes unnecessary and the overall set of changes amounts to less code. It can be found in the following branch along with its dependencies: http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar src/mesa/drivers/dri/i965/Makefile.sources |4 + src/mesa/drivers/dri/i965/brw_context.c | 12 ++ src/mesa/drivers/dri/i965/brw_fs.cpp | 43 - src/mesa/drivers/dri/i965/brw_fs.h | 19 +- src/mesa/drivers/dri/i965/brw_fs_builder.h | 617 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 56 -- src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 +++ src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 233 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 466 --- src/mesa/drivers/dri/i965/brw_ir_fs.h| 56 ++ src/mesa/drivers/dri/i965/brw_ir_vec4.h | 55 ++ src/mesa/drivers/dri/i965/brw_shader.cpp | 32 src/mesa/drivers/dri/i965/brw_shader.h |6 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 10 + src/mesa/drivers/dri/i965/brw_vec4.h |6 + src/mesa/drivers/dri/i965/brw_vec4_builder.h | 579 ++ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 179 ++ src/mesa/drivers/dri/i965/intel_extensions.c |2 + 20 files changed, 3224 insertions(+), 311 deletions(-) [PATCH 01/29] i965/fs: Have component() set the register stride to zero. [PATCH 02/29] i965: Add register constructors taking an backend_reg as argument. [PATCH 03/29] i965: Define consistent interface to disable control flow execution masking. [PATCH 04/29] i965: Define consistent interface to predicate an instruction. [PATCH 05/29] i965: Define consistent interface to enable instruction conditional modifiers. [PATCH 06/29] i965: Define consistent interface to enable instruction result saturation. [PATCH 07/29] i965/fs: Introduce FS IR builder. [PATCH 08/29] i965/vec4: Introduce VEC4 IR builder. [PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor interface. [PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR visitors. [PATCH 11/29] i965: Add a visitor method to extract the result of a visit. [PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through the visitor. [PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing through the visitor. [PATCH 14/29] i965: Lift the constness restriction on surface indices passed to untyped ops. [PATCH 15/29] i965/fs: Import array utils for the surface message builder. [PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and back. [PATCH 17/29] i965/fs: Import surface message builder functions. [PATCH 18/29] i965/fs: Import image access validity checks. [PATCH 19/29] i965/fs: Import image memory offset calculation code. [PATCH 20/29] i965/fs: Import image format metadata queries. [PATCH 21/29] i965/fs: Import image format conversion primitives. [PATCH 22/29] i965/fs: Implement image load, store and atomic. [PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation. [PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation. [PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code. [PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation code. [PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit methods. [PATCH 28/29] i965: Define implementation constants for ARB_shader_image_load_store. [PATCH 29/29] i965: Expose ARB_shader_image_load_store. Hi Curro, Thanks for sending this out. I'm definitely more comfortable with v2, though I do still have concerns. At this point, NIR is the default FS backend.
[Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v2)
This v2 is motivated by the less than enthusiastic reception of my last two series implementing the image load, store and atomic intrinsics in the i965 back-end. It substitutes both. The built-ins are now implemented in the scalar back-end only, what means that IVB and HSW are no longer able to expose image support for the VS and GS stages -- behaviour acceptable according to the specification. BDW and up should still be able to expose images in the VS stage, and possibly in the GS stage at some point too. CS images can be supported on all gens (that support CS that is). The benefit is that the FS/VEC4 abstraction becomes unnecessary and the overall set of changes amounts to less code. It can be found in the following branch along with its dependencies: http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-scalar src/mesa/drivers/dri/i965/Makefile.sources |4 + src/mesa/drivers/dri/i965/brw_context.c | 12 ++ src/mesa/drivers/dri/i965/brw_fs.cpp | 43 - src/mesa/drivers/dri/i965/brw_fs.h | 19 +- src/mesa/drivers/dri/i965/brw_fs_builder.h | 617 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 56 -- src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 1144 +++ src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 233 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 466 --- src/mesa/drivers/dri/i965/brw_ir_fs.h| 56 ++ src/mesa/drivers/dri/i965/brw_ir_vec4.h | 55 ++ src/mesa/drivers/dri/i965/brw_shader.cpp | 32 src/mesa/drivers/dri/i965/brw_shader.h |6 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 10 + src/mesa/drivers/dri/i965/brw_vec4.h |6 + src/mesa/drivers/dri/i965/brw_vec4_builder.h | 579 ++ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |8 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 179 ++ src/mesa/drivers/dri/i965/intel_extensions.c |2 + 20 files changed, 3224 insertions(+), 311 deletions(-) [PATCH 01/29] i965/fs: Have component() set the register stride to zero. [PATCH 02/29] i965: Add register constructors taking an backend_reg as argument. [PATCH 03/29] i965: Define consistent interface to disable control flow execution masking. [PATCH 04/29] i965: Define consistent interface to predicate an instruction. [PATCH 05/29] i965: Define consistent interface to enable instruction conditional modifiers. [PATCH 06/29] i965: Define consistent interface to enable instruction result saturation. [PATCH 07/29] i965/fs: Introduce FS IR builder. [PATCH 08/29] i965/vec4: Introduce VEC4 IR builder. [PATCH 09/29] i965: Define the setup_vector_uniform_values() backend_visitor interface. [PATCH 10/29] i965: Add support for handling image uniforms to the GLSL IR visitors. [PATCH 11/29] i965: Add a visitor method to extract the result of a visit. [PATCH 12/29] i965/fs: Obtain atomic counter locations by recursing through the visitor. [PATCH 13/29] i965/vec4: Obtain atomic counter locations by recursing through the visitor. [PATCH 14/29] i965: Lift the constness restriction on surface indices passed to untyped ops. [PATCH 15/29] i965/fs: Import array utils for the surface message builder. [PATCH 16/29] i965/fs: Import helpers to convert vectors into arrays and back. [PATCH 17/29] i965/fs: Import surface message builder functions. [PATCH 18/29] i965/fs: Import image access validity checks. [PATCH 19/29] i965/fs: Import image memory offset calculation code. [PATCH 20/29] i965/fs: Import image format metadata queries. [PATCH 21/29] i965/fs: Import image format conversion primitives. [PATCH 22/29] i965/fs: Implement image load, store and atomic. [PATCH 23/29] i965/fs: Revisit GLSL IR atomic counter intrinsic translation. [PATCH 24/29] i965/fs: Revisit NIR atomic counter intrinsic translation. [PATCH 25/29] i965/fs: Import GLSL IR image intrinsic translation code. [PATCH 26/29] i965/fs: Import GLSL IR memory barrier intrinsic translation code. [PATCH 27/29] i965/fs: Drop unused untyped surface read and atomic emit methods. [PATCH 28/29] i965: Define implementation constants for ARB_shader_image_load_store. [PATCH 29/29] i965: Expose ARB_shader_image_load_store. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev