On Wed, Nov 9, 2016 at 10:04 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Wed, Nov 9, 2016 at 5:08 AM, Timothy Arceri
> <timothy.arc...@collabora.com> wrote:
>> On Sat, 2016-11-05 at 14:56 -0400, Ilia Mirkin wrote:
>>> On Sat, Nov 5, 2016 at 1:03 PM, Ilia Mirkin <imir...@alum.mit.edu>
>>> wrote:
>>> >
>>> > The previous code was confused about whether slot_end was inclusive
>>> > or
>>> > exclusive. Make it so that it is inclusive consistently, and use it
>>> > for
>>> > setting the new location. This should also better deal with arrays
>>> > of
>>> > incomplete size, since the slot check will assume it gets packed.
>>> >
>>> > Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
>>> > ---
>>> >
>>> > No regressions in Intel's jenkins runs, which I believe hit both
>>> > piglit and CTS.
>>> > This is somewhat subtle code, but I think I covered what it was
>>> > trying to do.
>>> >
>>> >  src/compiler/glsl/link_varyings.cpp | 50 ++++++++++++++++---------
>>> > ------------
>>> >  1 file changed, 22 insertions(+), 28 deletions(-)
>>> >
>>> > diff --git a/src/compiler/glsl/link_varyings.cpp
>>> > b/src/compiler/glsl/link_varyings.cpp
>>> > index e622b3e..49843cc 100644
>>> > --- a/src/compiler/glsl/link_varyings.cpp
>>> > +++ b/src/compiler/glsl/link_varyings.cpp
>>> > @@ -1546,14 +1546,16 @@ varying_matches::assign_locations(struct
>>> > gl_shader_program *prog,
>>> >
>>> >        previous_var_xfb_only = var->data.is_xfb_only;
>>> >
>>> > -      unsigned num_elements =  type-
>>> > >count_attribute_slots(is_vertex_input);
>>> > -      unsigned slot_end;
>>> > -      if (this->disable_varying_packing &&
>>> > -          !is_varying_packing_safe(type, var))
>>> > -         slot_end = 4;
>>> > -      else
>>> > -         slot_end = type->without_array()->vector_elements;
>>> > -      slot_end += *location - 1;
>>> > +      /* The number of components taken up by this variable. For
>>> > vertex shader
>>> > +       * inputs, we use the number of slots * 4, as they have
>>> > different
>>> > +       * counting rules.
>>> > +       */
>>> > +      unsigned num_components = is_vertex_input ?
>>> > +         type->count_attribute_slots(is_vertex_input) * 4 :
>>> > +         this->matches[i].num_components;
>>> > +
>>> > +      /* The last slot for this variable, inclusive. */
>>> > +      unsigned slot_end = *location + num_components - 1;
>>> >
>>> >        /* FIXME: We could be smarter in the below code and loop
>>> > back over
>>> >         * trying to fill any locations that we skipped because we
>>> > couldn't pack
>>> > @@ -1561,29 +1563,21 @@ varying_matches::assign_locations(struct
>>> > gl_shader_program *prog,
>>> >         * hit the linking error if we run out of room and suggest
>>> > they use
>>> >         * explicit locations.
>>> >         */
>>> > -      for (unsigned j = 0; j < num_elements; j++) {
>>> > -         while ((slot_end < MAX_VARYING * 4u) &&
>>> > -                ((reserved_slots & (UINT64_C(1) << *location / 4u)
>>> > ||
>>> > -                 (reserved_slots & (UINT64_C(1) << slot_end /
>>> > 4u))))) {
>>> > -
>>> > -            *location = ALIGN(*location + 1, 4);
>>> > -            slot_end = *location;
>>> > -
>>> > -            /* reset the counter and try again */
>>> > -            j = 0;
>>> > +      while (slot_end < MAX_VARYING * 4u) {
>>> > +         const unsigned slots = (slot_end / 4u) - (*location / 4u)
>>> > + 1;
>>> > +         const uint64_t slot_mask = ((1ull << slots) - 1) <<
>>> > (*location / 4u);
>>> > +
>>> > +         assert(slots > 0);
>>> > +         if (reserved_slots & slot_mask) {
>>> > +            *location = ALIGN(slot_end + 1, 4);
>>>
>>> I was a bit overzealous about changing this -- should be
>>>
>>
>> Right. It's much improved with these changes already :)> *location = 
>> ALIGN(*location + 1, 4);
>>>
>>> Since the reserved slot could be the first one. I've made this change
>>> locally. Intel's CI is still happy.
>>>
>>> Various bit arithmetic could be done to improve this, but ... meh.
>>> Hardly seems worth it.
>>
>> Right. It's much improved with these changes already :)
>>
>> I'm not really sure what you mean about arrays of incomplete size but
>> the logic in this change looks correct to me.
>
> I'll be honest - I don't remember either. A few things jump out as bad
> in the old code - e.g. type->without_array()->vector_elements == 0 for
> structs. Also the way it's using count_attribute_slots to compute the
> number of array elements is a wee bit suspect as well. These things
> mostly work, but I suspect there are scenarios where they don't. Put
> another way, I bet that putting a assert(matches[i].num_components ==
> num_elements * type->without_array()->vector_elements) would trigger.
> However that is what is implicitly being assumed.
>
> Happy to remove that comment though and just claim this is a cleanup
> to make slot_end be correct.

One simple example that occurs to me right now is an array of
double[]. That has vector_elements == 1 and count_attribute_slots ==
length of array. So we'll end up counting it as a single component
rather than 2. Which means that if you have e.g.

flat out double foo[4];
layout(location=1) out vec4 bar;

it will think that sticking foo at location 0 is OK. Haven't tested it.

I'm pretty sure there were situations going the other way as well
(i.e. it would fail to place the var when it would be fine) but I
can't come up with any right now.

  -ilia

>
>>
>> Reviewed-by: Timothy Arceri <timothy.arc...@collabora.com>
>
> Thanks!
>
>>
>>>
>>> >
>>> > +            slot_end = *location + num_components - 1;
>>> > +            continue;
>>> >           }
>>> >
>>> > -         /* Increase the slot to make sure there is enough room
>>> > for next
>>> > -          * array element.
>>> > -          */
>>> > -         if (this->disable_varying_packing &&
>>> > -             !is_varying_packing_safe(type, var))
>>> > -            slot_end += 4;
>>> > -         else
>>> > -            slot_end += type->without_array()->vector_elements;
>>> > +         break;
>>> >        }
>>> >
>>> > -      if (!var->data.patch && *location >= MAX_VARYING * 4u) {
>>> > +      if (!var->data.patch && slot_end >= MAX_VARYING * 4u) {
>>> >           linker_error(prog, "insufficient contiguous locations
>>> > available for "
>>> >                        "%s it is possible an array or struct could
>>> > not be "
>>> >                        "packed between varyings with explicit
>>> > locations. Try "
>>> > @@ -1593,7 +1587,7 @@ varying_matches::assign_locations(struct
>>> > gl_shader_program *prog,
>>> >
>>> >        this->matches[i].generic_location = *location;
>>> >
>>> > -      *location += this->matches[i].num_components;
>>> > +      *location = slot_end + 1;
>>> >     }
>>> >
>>> >     return (generic_location + 3) / 4;
>>> > --
>>> > 2.7.3
>>> >
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to