Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-11 Thread Chema Casanova
El 09/12/17 a las 00:52, Jason Ekstrand escribió:
> While reviewing some of the UBO pushing comments from Topi, I
> discovered a fairly disturbing assert in brw_fs_nir.cpp in our
> implementation of nir_intrinsic_load_uniform:
>
>  /* Offsets are in bytes but they should always be multiples
> of 4 */
>  assert(const_offset->u32[0] % 4 == 0);
>
> This assertion isn't triggering with 16bit storage enabled for push
> constants.  Looking at the CTS tests in a bit more detail, they're
> very poor.  They only test basic types (scalars, vectors, and
> matrices) and only in arrays with a dynamic index.  This means that
> the constant optimization paths for UBO pulls aren't getting triggered
> at all.  Also, we're not using push constants with any offsets not
> aligned to 4 (as per the above assert) so there's no real assurance
> that that works.  Given that constant offsets are a very common case
> for apps, this is very disappointing.  For the moment, I'm going to
> push a patch to master to disable 16bit storage.  I'm really sorry
> about that.  I think your code is great and, based on my review, I'm
> pretty sure it should work but I don't think we can really ship this
> extension in good faith when we know that there is a massive gaping
> hole in test coverage like this.  (The coverage hole is not your
> fault!)  I've also filed a bug (893) against the CTS.

I agree that is better to increase testing coverage before enabling the
feature after your findings. At the same time for having the complete
support for VK_KHR_16bit_storage it is still pending the review of part
of input/output support of the feature, so we are not in a hurry to have
it enabled.

Chema

>
> On Wed, Dec 6, 2017 at 12:09 AM, Alejandro Piñeiro
> > wrote:
>
> On 06/12/17 01:19, Chema Casanova wrote:
> > On 05/12/17 18:31, Chema Casanova wrote:
> >> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
> >>> A couple of notes:
> >>>
> >>>  1) I *think* I gave you enough reviews to land the UBO/SSBO
> part and
> >>> the optimizations in 26-28.  If reviews are still missing
> anywhere,
> >>> please let me know.  If not, let's try and get that part landed.
> >> The series is almost ready to land, I have only pending to
> address your
> >> feedback about use untyped_read for reading vec3 ssbos.
> >>
> >> The only missing explicit R-b is that " [PATCH v4 28/44]
> i965/fs: Use
> >> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
> >> i965/fs: Enables 16-bit load_ubo with sampler" i've just
> answered your
> >> review to confirm the R-b.
> >>
> >> I expect to finish today vec3 ssbo and send the series to
> Jenkins before
> >> landing, confirm your "pending" R-b, do a last rebase to master
> and ask
> >> for a push.
> > I've just prepared a rebased branch with the reviewed commits
> ready to
> > land to enable VK_KHR_16bit_storage support for SSBO/UBO.
> >
> >
> 
> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land
> 
> 
> >
> > As I don't have still commit access to mesa, maybe Eduardo or
> Alejandro
> > can land it for me tomorrow. But, Jason feel free to push it if
> you want.
>
> I have just pushed it to master.
>
> >
> > Chema
> >
> >>>  2) I send out a patch to rewrite assign_constant_locations
> which I
> >>> think should make it automatically handle 8 and 16-bit values as
> >>> well.  I'd rather do that than more special casing if
> everything works
> >>> out ok.
> >> I'm testing this patch with 16-bits and make sure whatever is
> needed to
> >> have 16-bit working.
> >>
> >>>  3) I sent out a series of patches to enable pushing of UBOs in
> >>> Vulkan.  If we're not careful, these will clash with 16bit
> storage as
> >>> UBO support suddenly has to imply push constant support.  That
> said,
> >>> I"m willing to wait at least a little while before landing
> them to let
> >>> us get 16bit push constant support sorted out.  The UBO pushing
> >>> patches give us a nice little performance boost but we're
> nowhere near
> >>> a release and I don't want it blocking you.
> >> That would be my next priority, so we would only have pending
> to land
> >> the 16-bit input/output support to finish this extension.
> >>
> >> Chema
> >>
> >>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
> >>> 
> >> wrote:
> >>>
> >>>     Hello,
> >>>
> >>>     this is the V4 series for the implementation of the
> >>>     SPV_KHR_16bit_storage
> >>>     

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-08 Thread Jason Ekstrand
While reviewing some of the UBO pushing comments from Topi, I discovered a
fairly disturbing assert in brw_fs_nir.cpp in our implementation of
nir_intrinsic_load_uniform:

 /* Offsets are in bytes but they should always be multiples of 4 */
 assert(const_offset->u32[0] % 4 == 0);

This assertion isn't triggering with 16bit storage enabled for push
constants.  Looking at the CTS tests in a bit more detail, they're very
poor.  They only test basic types (scalars, vectors, and matrices) and only
in arrays with a dynamic index.  This means that the constant optimization
paths for UBO pulls aren't getting triggered at all.  Also, we're not using
push constants with any offsets not aligned to 4 (as per the above assert)
so there's no real assurance that that works.  Given that constant offsets
are a very common case for apps, this is very disappointing.  For the
moment, I'm going to push a patch to master to disable 16bit storage.  I'm
really sorry about that.  I think your code is great and, based on my
review, I'm pretty sure it should work but I don't think we can really ship
this extension in good faith when we know that there is a massive gaping
hole in test coverage like this.  (The coverage hole is not your fault!)
I've also filed a bug (893) against the CTS.

On Wed, Dec 6, 2017 at 12:09 AM, Alejandro Piñeiro 
wrote:

> On 06/12/17 01:19, Chema Casanova wrote:
> > On 05/12/17 18:31, Chema Casanova wrote:
> >> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
> >>> A couple of notes:
> >>>
> >>>  1) I *think* I gave you enough reviews to land the UBO/SSBO part and
> >>> the optimizations in 26-28.  If reviews are still missing anywhere,
> >>> please let me know.  If not, let's try and get that part landed.
> >> The series is almost ready to land, I have only pending to address your
> >> feedback about use untyped_read for reading vec3 ssbos.
> >>
> >> The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
> >> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
> >> i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
> >> review to confirm the R-b.
> >>
> >> I expect to finish today vec3 ssbo and send the series to Jenkins before
> >> landing, confirm your "pending" R-b, do a last rebase to master and ask
> >> for a push.
> > I've just prepared a rebased branch with the reviewed commits ready to
> > land to enable VK_KHR_16bit_storage support for SSBO/UBO.
> >
> > https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_
> storage-v4-ubo-ssbo-to-land
> >
> > As I don't have still commit access to mesa, maybe Eduardo or Alejandro
> > can land it for me tomorrow. But, Jason feel free to push it if you want.
>
> I have just pushed it to master.
>
> >
> > Chema
> >
> >>>  2) I send out a patch to rewrite assign_constant_locations which I
> >>> think should make it automatically handle 8 and 16-bit values as
> >>> well.  I'd rather do that than more special casing if everything works
> >>> out ok.
> >> I'm testing this patch with 16-bits and make sure whatever is needed to
> >> have 16-bit working.
> >>
> >>>  3) I sent out a series of patches to enable pushing of UBOs in
> >>> Vulkan.  If we're not careful, these will clash with 16bit storage as
> >>> UBO support suddenly has to imply push constant support.  That said,
> >>> I"m willing to wait at least a little while before landing them to let
> >>> us get 16bit push constant support sorted out.  The UBO pushing
> >>> patches give us a nice little performance boost but we're nowhere near
> >>> a release and I don't want it blocking you.
> >> That would be my next priority, so we would only have pending to land
> >> the 16-bit input/output support to finish this extension.
> >>
> >> Chema
> >>
> >>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
> >>> > wrote:
> >>>
> >>> Hello,
> >>>
> >>> this is the V4 series for the implementation of the
> >>> SPV_KHR_16bit_storage
> >>> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
> >>> addition
> >>> to the GLSL and NIR support needed.
> >>>
> >>> The original series can be found here [1], the following v2 [2]
> >>> and v3 [3].
> >>>
> >>> In short V4 includes the following:
> >>>
> >>>  * Reorder the series to enable features as they are implemented,
> >>> the series
> >>>now enables first UBO and SSBO support, and then inputs/outputs
> and
> >>>finally push constants.
> >>>  * Support the byte scattered read/write messages with different
> >>> bit sizes
> >>>byte/word/dword.
> >>>  * Refactor of the store_ssbo code and also fix stores when
> >>> writemask was .yz
> >>>  * Uses the sampler for load_ubo avoiding the initial
> >>> implementation of
> >>>the series using byte_scattered_read.
> >>>  * Addressed all the feedback provided by Jason and Topi 

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-06 Thread Alejandro Piñeiro
On 06/12/17 01:19, Chema Casanova wrote:
> On 05/12/17 18:31, Chema Casanova wrote:
>> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
>>> A couple of notes:
>>>
>>>  1) I *think* I gave you enough reviews to land the UBO/SSBO part and
>>> the optimizations in 26-28.  If reviews are still missing anywhere,
>>> please let me know.  If not, let's try and get that part landed.
>> The series is almost ready to land, I have only pending to address your
>> feedback about use untyped_read for reading vec3 ssbos.
>>
>> The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
>> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
>> i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
>> review to confirm the R-b.
>>
>> I expect to finish today vec3 ssbo and send the series to Jenkins before
>> landing, confirm your "pending" R-b, do a last rebase to master and ask
>> for a push.
> I've just prepared a rebased branch with the reviewed commits ready to
> land to enable VK_KHR_16bit_storage support for SSBO/UBO.
>
> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land
>
> As I don't have still commit access to mesa, maybe Eduardo or Alejandro
> can land it for me tomorrow. But, Jason feel free to push it if you want.

I have just pushed it to master.

>
> Chema
>
>>>  2) I send out a patch to rewrite assign_constant_locations which I
>>> think should make it automatically handle 8 and 16-bit values as
>>> well.  I'd rather do that than more special casing if everything works
>>> out ok.
>> I'm testing this patch with 16-bits and make sure whatever is needed to
>> have 16-bit working.
>>
>>>  3) I sent out a series of patches to enable pushing of UBOs in
>>> Vulkan.  If we're not careful, these will clash with 16bit storage as
>>> UBO support suddenly has to imply push constant support.  That said,
>>> I"m willing to wait at least a little while before landing them to let
>>> us get 16bit push constant support sorted out.  The UBO pushing
>>> patches give us a nice little performance boost but we're nowhere near
>>> a release and I don't want it blocking you.
>> That would be my next priority, so we would only have pending to land
>> the 16-bit input/output support to finish this extension.
>>
>> Chema
>>
>>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
>>> > wrote:
>>>
>>> Hello,
>>>
>>> this is the V4 series for the implementation of the
>>> SPV_KHR_16bit_storage
>>> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
>>> addition
>>> to the GLSL and NIR support needed.
>>>
>>> The original series can be found here [1], the following v2 [2]
>>> and v3 [3].
>>>
>>> In short V4 includes the following:
>>>
>>>  * Reorder the series to enable features as they are implemented,
>>> the series
>>>    now enables first UBO and SSBO support, and then inputs/outputs and
>>>    finally push constants.
>>>  * Support the byte scattered read/write messages with different
>>> bit sizes
>>>    byte/word/dword.
>>>  * Refactor of the store_ssbo code and also fix stores when
>>> writemask was .yz
>>>  * Uses the sampler for load_ubo avoiding the initial
>>> implementation of
>>>    the series using byte_scattered_read.
>>>  * Addressed all the feedback provided by Jason and Topi on v3 review.
>>>
>>> This series is also available at:
>>>
>>> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
>>> 
>>>
>>> The objective is to start landing part of this series, all
>>> feedback has been
>>> addressed for SSBO and UBO. But for input/outputs features it will
>>> probably
>>> need another iteration as was not completely reviewed. It is also
>>> needed
>>> to define the approach for push constants issues before of after
>>> landing
>>> the support with this implementation.
>>>
>>> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
>>> reviewed but as it has changed too much i would appreciate another
>>> review. When patches until 25 or 28 are reviewed we could land
>>> UBOs and
>>> SSBOs support.
>>>
>>> Finally an updated overview of the patches:
>>>
>>> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
>>> needed because NIR uses GLSL types internally. We use the enums
>>> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
>>> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
>>> types not handled on a switch.
>>>
>>> Patches 3-6 add NIR support for those new GLSL 16-bit types,
>>> conversion opcodes, and rounding modes for float to half-float
>>> conversions.
>>>
>>> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>>>
>>> 

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-05 Thread Chema Casanova
On 05/12/17 18:31, Chema Casanova wrote:
> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
>> A couple of notes:
>>
>>  1) I *think* I gave you enough reviews to land the UBO/SSBO part and
>> the optimizations in 26-28.  If reviews are still missing anywhere,
>> please let me know.  If not, let's try and get that part landed.
> 
> The series is almost ready to land, I have only pending to address your
> feedback about use untyped_read for reading vec3 ssbos.
> 
> The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
> i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
> review to confirm the R-b.
> 
> I expect to finish today vec3 ssbo and send the series to Jenkins before
> landing, confirm your "pending" R-b, do a last rebase to master and ask
> for a push.

I've just prepared a rebased branch with the reviewed commits ready to
land to enable VK_KHR_16bit_storage support for SSBO/UBO.

https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land

As I don't have still commit access to mesa, maybe Eduardo or Alejandro
can land it for me tomorrow. But, Jason feel free to push it if you want.

Chema

>>  2) I send out a patch to rewrite assign_constant_locations which I
>> think should make it automatically handle 8 and 16-bit values as
>> well.  I'd rather do that than more special casing if everything works
>> out ok.
> 
> I'm testing this patch with 16-bits and make sure whatever is needed to
> have 16-bit working.
> 
>>
>>  3) I sent out a series of patches to enable pushing of UBOs in
>> Vulkan.  If we're not careful, these will clash with 16bit storage as
>> UBO support suddenly has to imply push constant support.  That said,
>> I"m willing to wait at least a little while before landing them to let
>> us get 16bit push constant support sorted out.  The UBO pushing
>> patches give us a nice little performance boost but we're nowhere near
>> a release and I don't want it blocking you.
> 
> That would be my next priority, so we would only have pending to land
> the 16-bit input/output support to finish this extension.
> 
> Chema
> 
>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
>> > wrote:
>>
>> Hello,
>>
>> this is the V4 series for the implementation of the
>> SPV_KHR_16bit_storage
>> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
>> addition
>> to the GLSL and NIR support needed.
>>
>> The original series can be found here [1], the following v2 [2]
>> and v3 [3].
>>
>> In short V4 includes the following:
>>
>>  * Reorder the series to enable features as they are implemented,
>> the series
>>    now enables first UBO and SSBO support, and then inputs/outputs and
>>    finally push constants.
>>  * Support the byte scattered read/write messages with different
>> bit sizes
>>    byte/word/dword.
>>  * Refactor of the store_ssbo code and also fix stores when
>> writemask was .yz
>>  * Uses the sampler for load_ubo avoiding the initial
>> implementation of
>>    the series using byte_scattered_read.
>>  * Addressed all the feedback provided by Jason and Topi on v3 review.
>>
>> This series is also available at:
>>
>> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
>> 
>>
>> The objective is to start landing part of this series, all
>> feedback has been
>> addressed for SSBO and UBO. But for input/outputs features it will
>> probably
>> need another iteration as was not completely reviewed. It is also
>> needed
>> to define the approach for push constants issues before of after
>> landing
>> the support with this implementation.
>>
>> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
>> reviewed but as it has changed too much i would appreciate another
>> review. When patches until 25 or 28 are reviewed we could land
>> UBOs and
>> SSBOs support.
>>
>> Finally an updated overview of the patches:
>>
>> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
>> needed because NIR uses GLSL types internally. We use the enums
>> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
>> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
>> types not handled on a switch.
>>
>> Patches 3-6 add NIR support for those new GLSL 16-bit types,
>> conversion opcodes, and rounding modes for float to half-float
>> conversions.
>>
>> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>>
>> Patches 10-12 add general 16-bit support for i965. This includes
>> handling of new types on several general purpose methods,
>> update/remove some asserts.
>>
>> Patches 

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-05 Thread Chema Casanova
El 05/12/17 a las 06:16, Jason Ekstrand escribió:
> A couple of notes:
>
>  1) I *think* I gave you enough reviews to land the UBO/SSBO part and
> the optimizations in 26-28.  If reviews are still missing anywhere,
> please let me know.  If not, let's try and get that part landed.

The series is almost ready to land, I have only pending to address your
feedback about use untyped_read for reading vec3 ssbos.

The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
review to confirm the R-b.

I expect to finish today vec3 ssbo and send the series to Jenkins before
landing, confirm your "pending" R-b, do a last rebase to master and ask
for a push.

>
>  2) I send out a patch to rewrite assign_constant_locations which I
> think should make it automatically handle 8 and 16-bit values as
> well.  I'd rather do that than more special casing if everything works
> out ok.

I'm testing this patch with 16-bits and make sure whatever is needed to
have 16-bit working.

>
>  3) I sent out a series of patches to enable pushing of UBOs in
> Vulkan.  If we're not careful, these will clash with 16bit storage as
> UBO support suddenly has to imply push constant support.  That said,
> I"m willing to wait at least a little while before landing them to let
> us get 16bit push constant support sorted out.  The UBO pushing
> patches give us a nice little performance boost but we're nowhere near
> a release and I don't want it blocking you.

That would be my next priority, so we would only have pending to land
the 16-bit input/output support to finish this extension.

Chema

> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
> > wrote:
>
> Hello,
>
> this is the V4 series for the implementation of the
> SPV_KHR_16bit_storage
> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
> addition
> to the GLSL and NIR support needed.
>
> The original series can be found here [1], the following v2 [2]
> and v3 [3].
>
> In short V4 includes the following:
>
>  * Reorder the series to enable features as they are implemented,
> the series
>    now enables first UBO and SSBO support, and then inputs/outputs and
>    finally push constants.
>  * Support the byte scattered read/write messages with different
> bit sizes
>    byte/word/dword.
>  * Refactor of the store_ssbo code and also fix stores when
> writemask was .yz
>  * Uses the sampler for load_ubo avoiding the initial
> implementation of
>    the series using byte_scattered_read.
>  * Addressed all the feedback provided by Jason and Topi on v3 review.
>
> This series is also available at:
>
> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
> 
>
> The objective is to start landing part of this series, all
> feedback has been
> addressed for SSBO and UBO. But for input/outputs features it will
> probably
> need another iteration as was not completely reviewed. It is also
> needed
> to define the approach for push constants issues before of after
> landing
> the support with this implementation.
>
> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
> reviewed but as it has changed too much i would appreciate another
> review. When patches until 25 or 28 are reviewed we could land
> UBOs and
> SSBOs support.
>
> Finally an updated overview of the patches:
>
> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
> needed because NIR uses GLSL types internally. We use the enums
> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
> types not handled on a switch.
>
> Patches 3-6 add NIR support for those new GLSL 16-bit types,
> conversion opcodes, and rounding modes for float to half-float
> conversions.
>
> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>
> Patches 10-12 add general 16-bit support for i965. This includes
> handling of new types on several general purpose methods,
> update/remove some asserts.
>
> Patches 14-17 add support for 32 to 16-bit conversions for i965,
> including rounding mode opcodes (needed for float to half-float
> conversions), and an optimization that removes superfluous rounding
> mode sets.
>
> Patches 18-21 add and use two new messages: byte scattered read and
> write. Those were needed because untyped surface message has a fixed
> 32-bit write size. Those messages are used on the 16-bit support of
> store SSBO, load SSBO and load shared.
>
> Patch 22 adds helpers to allow un/shuffle 

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-04 Thread Jason Ekstrand
A couple of notes:

 1) I *think* I gave you enough reviews to land the UBO/SSBO part and the
optimizations in 26-28.  If reviews are still missing anywhere, please let
me know.  If not, let's try and get that part landed.

 2) I send out a patch to rewrite assign_constant_locations which I think
should make it automatically handle 8 and 16-bit values as well.  I'd
rather do that than more special casing if everything works out ok.

 3) I sent out a series of patches to enable pushing of UBOs in Vulkan.  If
we're not careful, these will clash with 16bit storage as UBO support
suddenly has to imply push constant support.  That said, I"m willing to
wait at least a little while before landing them to let us get 16bit push
constant support sorted out.  The UBO pushing patches give us a nice little
performance boost but we're nowhere near a release and I don't want it
blocking you.

--Jason



On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> Hello,
>
> this is the V4 series for the implementation of the SPV_KHR_16bit_storage
> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in addition
> to the GLSL and NIR support needed.
>
> The original series can be found here [1], the following v2 [2]
> and v3 [3].
>
> In short V4 includes the following:
>
>  * Reorder the series to enable features as they are implemented, the
> series
>now enables first UBO and SSBO support, and then inputs/outputs and
>finally push constants.
>  * Support the byte scattered read/write messages with different bit sizes
>byte/word/dword.
>  * Refactor of the store_ssbo code and also fix stores when writemask was
> .yz
>  * Uses the sampler for load_ubo avoiding the initial implementation of
>the series using byte_scattered_read.
>  * Addressed all the feedback provided by Jason and Topi on v3 review.
>
> This series is also available at:
>
> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
>
> The objective is to start landing part of this series, all feedback has
> been
> addressed for SSBO and UBO. But for input/outputs features it will probably
> need another iteration as was not completely reviewed. It is also needed
> to define the approach for push constants issues before of after landing
> the support with this implementation.
>
> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
> reviewed but as it has changed too much i would appreciate another
> review. When patches until 25 or 28 are reviewed we could land UBOs and
> SSBOs support.
>
> Finally an updated overview of the patches:
>
> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
> needed because NIR uses GLSL types internally. We use the enums
> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
> types not handled on a switch.
>
> Patches 3-6 add NIR support for those new GLSL 16-bit types,
> conversion opcodes, and rounding modes for float to half-float
> conversions.
>
> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>
> Patches 10-12 add general 16-bit support for i965. This includes
> handling of new types on several general purpose methods,
> update/remove some asserts.
>
> Patches 14-17 add support for 32 to 16-bit conversions for i965,
> including rounding mode opcodes (needed for float to half-float
> conversions), and an optimization that removes superfluous rounding
> mode sets.
>
> Patches 18-21 add and use two new messages: byte scattered read and
> write. Those were needed because untyped surface message has a fixed
> 32-bit write size. Those messages are used on the 16-bit support of
> store SSBO, load SSBO and load shared.
>
> Patch 22 adds helpers to allow un/shuffle 16-bit components in 32-bit
> ones. This would be needed for following optimizations on load/store
> ssbo. This huck was originally in the input/outputs support, but needed
> a relocation because of the new order of the series.
>
> Patch 23 Enables load_ubo support for 16-bit using the sampler un-shuffling
> pairs 16-bit components from 32-bit.
>
> Patches 24-25 enable SPV_KHR_16bit_storage and VK_KHR_16bit_storage but
> only
> the support for SSBO and UBO on anv vulkan driver.
>
> Patches 26-28 were new patches included in V2 to improve performance
> reducing the use of multiple scattered messages for untyped read/write
> opreations. 16bit CTS tests passes without them. The other one would
> fix a real problem using predication (patch 27), but unfourtunately no CTS
> test yet catching it.
>
> Patches 29-33 implement 16-bit vertex attribute inputs support on
> i965. These include changes on anv. This was needed because 16-bit
> surface formats do implicit conversion to 32-bit. To workaround this,
> we override the 16-bit surface format, and use 32-bit ones. Issues related
> to robust buffer access have been addressed.
>
> Patch 34 implements load input and load store 

[Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-11-29 Thread Jose Maria Casanova Crespo
Hello,

this is the V4 series for the implementation of the SPV_KHR_16bit_storage
and VK_KHR_16bit_storage extensions on the anv vulkan driver, in addition
to the GLSL and NIR support needed.

The original series can be found here [1], the following v2 [2]
and v3 [3].

In short V4 includes the following:

 * Reorder the series to enable features as they are implemented, the series
   now enables first UBO and SSBO support, and then inputs/outputs and
   finally push constants.
 * Support the byte scattered read/write messages with different bit sizes
   byte/word/dword.
 * Refactor of the store_ssbo code and also fix stores when writemask was .yz
 * Uses the sampler for load_ubo avoiding the initial implementation of
   the series using byte_scattered_read.
 * Addressed all the feedback provided by Jason and Topi on v3 review.

This series is also available at:

https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4

The objective is to start landing part of this series, all feedback has been
addressed for SSBO and UBO. But for input/outputs features it will probably
need another iteration as was not completely reviewed. It is also needed
to define the approach for push constants issues before of after landing
the support with this implementation.

Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
reviewed but as it has changed too much i would appreciate another
review. When patches until 25 or 28 are reviewed we could land UBOs and
SSBOs support.

Finally an updated overview of the patches:

Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
needed because NIR uses GLSL types internally. We use the enums
already defined at AMD_gpu_shader_half_float and NV_gpu_shader
extensions. Patch 2 updates mesa/st, in order to avoid warnings for
types not handled on a switch.

Patches 3-6 add NIR support for those new GLSL 16-bit types,
conversion opcodes, and rounding modes for float to half-float
conversions.

Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.

Patches 10-12 add general 16-bit support for i965. This includes
handling of new types on several general purpose methods,
update/remove some asserts.

Patches 14-17 add support for 32 to 16-bit conversions for i965,
including rounding mode opcodes (needed for float to half-float
conversions), and an optimization that removes superfluous rounding
mode sets.

Patches 18-21 add and use two new messages: byte scattered read and
write. Those were needed because untyped surface message has a fixed
32-bit write size. Those messages are used on the 16-bit support of
store SSBO, load SSBO and load shared.

Patch 22 adds helpers to allow un/shuffle 16-bit components in 32-bit
ones. This would be needed for following optimizations on load/store
ssbo. This huck was originally in the input/outputs support, but needed
a relocation because of the new order of the series.

Patch 23 Enables load_ubo support for 16-bit using the sampler un-shuffling
pairs 16-bit components from 32-bit.

Patches 24-25 enable SPV_KHR_16bit_storage and VK_KHR_16bit_storage but only
the support for SSBO and UBO on anv vulkan driver.

Patches 26-28 were new patches included in V2 to improve performance
reducing the use of multiple scattered messages for untyped read/write
opreations. 16bit CTS tests passes without them. The other one would
fix a real problem using predication (patch 27), but unfourtunately no CTS
test yet catching it.

Patches 29-33 implement 16-bit vertex attribute inputs support on
i965. These include changes on anv. This was needed because 16-bit
surface formats do implicit conversion to 32-bit. To workaround this,
we override the 16-bit surface format, and use 32-bit ones. Issues related
to robust buffer access have been addressed.

Patch 34 implements load input and load store for all intra stage. This
patch could have problems pointed by Jason related to how TCS outputs are
implmemented that need more work.

Patch 35-42 implements 16-bit store output support for fragment
shaders on i965. Last patch enables VK_KHR_16bit for input/outputs.

Patch 43-44 adds 16-bit support for push constant and enables the feature.
There is still pending to work on a general solution for push constants and
the mixture of different bit_sizes.

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html
[2] https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html
[3] https://lists.freedesktop.org/archives/mesa-dev/2017-October/172557.html

CC: Jason Ekstrand 
CC: Topi Pohjolainen 
CC: Matt Turner 

Alejandro Piñeiro (12):
  i965/vec4: Handle 16-bit types at type_size_xvec4
  i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
  i965/fs: Handle 32-bit to 16-bit conversions
  i965/fs: Define new shader opcode to set rounding modes
  i965/fs: Enable rounding mode on f2f16 ops
  i965/fs: Add remove_extra_rounding_modes optimization