On Thursday, September 28, 2017 1:24:21 AM PDT Iago Toral Quiroga wrote:
> Triggering the push model when 64-bit inputs are involved is not easy due to
> the constrains on the maximum number of registers that we allow for this mode,
> however, for GS with 'points' primitive type and just a couple of double
> varyings we can trigger this and it just doesn't work because the
> implementation is not 64-bit aware at all. For now, let's make sure that we
> don't attempt this model whith 64-bit inputs and we always fall back to pull
> model for them.
> 
> Also, don't enable the VUE handles in the thread payload on the fly when we
> find an input for which we need the pull model, this is not safe: if we need
> to resort to the pull model we need to account for that when we setup the
> thread payload so we compute the first non-payload register properly. If we
> didn't do that correctly and we enable it on-the-fly here then we will end up
> VUE handles on the first non-payload register which will probably lead to
> GPU hangs. Instead, always enable the VUE handles for the pull model so we
> can safely use them when needed. The GS is going to resort to pull model
> almost in every situation anyway, so this shouldn't make a significant
> difference and it makes things easier and safer.
> 
> v2: Always enable the VUE handles for pull model, this is easier and safer
>     and the GS is going to fallback to pull model almost always anyway (Ken)
> 
> v3: Only clamp the URB read length if we are over the maximum reserved for
>     push inputs as we were doing in the original code (Ken).
> ---
>  src/intel/compiler/brw_fs.cpp     | 16 +++++++++++-----
>  src/intel/compiler/brw_fs_nir.cpp |  4 +++-
>  2 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index eb9b4c3890..b13b56cfba 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5602,6 +5602,17 @@ fs_visitor::setup_gs_payload()
>        payload.num_regs++;
>     }
>  
> +   /* Always enable VUE handles so we can safely use pull model if needed.
> +    *
> +    * The push model for a GS uses a ton of register space even for trivial
> +    * scenarios with just a few inputs, so just make things easier and a bit
> +    * safer by always having pull model available.
> +    */
> +   gs_prog_data->base.include_vue_handles = true;
> +
> +   /* R3..RN: ICP Handles for each incoming vertex (when using pull model) */
> +   payload.num_regs += nir->info.gs.vertices_in;
> +
>     /* Use a maximum of 24 registers for push-model inputs. */
>     const unsigned max_push_components = 24;
>  
> @@ -5613,11 +5624,6 @@ fs_visitor::setup_gs_payload()
>      */
>     if (8 * vue_prog_data->urb_read_length * nir->info.gs.vertices_in >
>         max_push_components || gs_prog_data->invocations > 1) {

One last change...sorry!  It doesn't make much sense to clamp to the
maximum number of push components just because invocations > 1.  We only
needed to consider that in order to include the VUE handles, which we
now do unconditionally.

Let's change this to:

     if (8 * vue_prog_data->urb_read_length * nir->info.gs.vertices_in >
         max_push_components) {

With that change,
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

Looking at this more closely has really shown all the ways I made a mess
of this...so, thanks again for cleaning this all up!

> -      gs_prog_data->base.include_vue_handles = true;
> -
> -      /* R3..RN: ICP Handles for each incoming vertex (when using pull 
> model) */
> -      payload.num_regs += nir->info.gs.vertices_in;
> -
>        vue_prog_data->urb_read_length =
>           ROUND_DOWN_TO(max_push_components / nir->info.gs.vertices_in, 8) / 
> 8;
>     }
> diff --git a/src/intel/compiler/brw_fs_nir.cpp 
> b/src/intel/compiler/brw_fs_nir.cpp
> index d760946e62..aa57bb9d78 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -1915,7 +1915,9 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
>     const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>  
>     /* TODO: figure out push input layout for invocations == 1 */
> +   /* TODO: make this work with 64-bit inputs */
>     if (gs_prog_data->invocations == 1 &&
> +       type_sz(dst.type) <= 4 &&
>         offset_const != NULL && vertex_const != NULL &&
>         4 * (base_offset + offset_const->u32[0]) < push_reg_count) {
>        int imm_offset = (base_offset + offset_const->u32[0]) * 4 +
> @@ -1928,7 +1930,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
>     }
>  
>     /* Resort to the pull model.  Ensure the VUE handles are provided. */
> -   gs_prog_data->base.include_vue_handles = true;
> +   assert(gs_prog_data->base.include_vue_handles);
>  
>     unsigned first_icp_handle = gs_prog_data->include_primitive_id ? 3 : 2;
>     fs_reg icp_handle = bld.vgrf(BRW_REGISTER_TYPE_UD, 1);
> 

Attachment: signature.asc
Description: This is a digitally signed message part.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to