On Wed, Nov 9, 2016 at 10:04 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote: > On Wed, Nov 9, 2016 at 5:08 AM, Timothy Arceri > <timothy.arc...@collabora.com> wrote: >> On Sat, 2016-11-05 at 14:56 -0400, Ilia Mirkin wrote: >>> On Sat, Nov 5, 2016 at 1:03 PM, Ilia Mirkin <imir...@alum.mit.edu> >>> wrote: >>> > >>> > The previous code was confused about whether slot_end was inclusive >>> > or >>> > exclusive. Make it so that it is inclusive consistently, and use it >>> > for >>> > setting the new location. This should also better deal with arrays >>> > of >>> > incomplete size, since the slot check will assume it gets packed. >>> > >>> > Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> >>> > --- >>> > >>> > No regressions in Intel's jenkins runs, which I believe hit both >>> > piglit and CTS. >>> > This is somewhat subtle code, but I think I covered what it was >>> > trying to do. >>> > >>> > src/compiler/glsl/link_varyings.cpp | 50 ++++++++++++++++--------- >>> > ------------ >>> > 1 file changed, 22 insertions(+), 28 deletions(-) >>> > >>> > diff --git a/src/compiler/glsl/link_varyings.cpp >>> > b/src/compiler/glsl/link_varyings.cpp >>> > index e622b3e..49843cc 100644 >>> > --- a/src/compiler/glsl/link_varyings.cpp >>> > +++ b/src/compiler/glsl/link_varyings.cpp >>> > @@ -1546,14 +1546,16 @@ varying_matches::assign_locations(struct >>> > gl_shader_program *prog, >>> > >>> > previous_var_xfb_only = var->data.is_xfb_only; >>> > >>> > - unsigned num_elements = type- >>> > >count_attribute_slots(is_vertex_input); >>> > - unsigned slot_end; >>> > - if (this->disable_varying_packing && >>> > - !is_varying_packing_safe(type, var)) >>> > - slot_end = 4; >>> > - else >>> > - slot_end = type->without_array()->vector_elements; >>> > - slot_end += *location - 1; >>> > + /* The number of components taken up by this variable. For >>> > vertex shader >>> > + * inputs, we use the number of slots * 4, as they have >>> > different >>> > + * counting rules. >>> > + */ >>> > + unsigned num_components = is_vertex_input ? >>> > + type->count_attribute_slots(is_vertex_input) * 4 : >>> > + this->matches[i].num_components; >>> > + >>> > + /* The last slot for this variable, inclusive. */ >>> > + unsigned slot_end = *location + num_components - 1; >>> > >>> > /* FIXME: We could be smarter in the below code and loop >>> > back over >>> > * trying to fill any locations that we skipped because we >>> > couldn't pack >>> > @@ -1561,29 +1563,21 @@ varying_matches::assign_locations(struct >>> > gl_shader_program *prog, >>> > * hit the linking error if we run out of room and suggest >>> > they use >>> > * explicit locations. >>> > */ >>> > - for (unsigned j = 0; j < num_elements; j++) { >>> > - while ((slot_end < MAX_VARYING * 4u) && >>> > - ((reserved_slots & (UINT64_C(1) << *location / 4u) >>> > || >>> > - (reserved_slots & (UINT64_C(1) << slot_end / >>> > 4u))))) { >>> > - >>> > - *location = ALIGN(*location + 1, 4); >>> > - slot_end = *location; >>> > - >>> > - /* reset the counter and try again */ >>> > - j = 0; >>> > + while (slot_end < MAX_VARYING * 4u) { >>> > + const unsigned slots = (slot_end / 4u) - (*location / 4u) >>> > + 1; >>> > + const uint64_t slot_mask = ((1ull << slots) - 1) << >>> > (*location / 4u); >>> > + >>> > + assert(slots > 0); >>> > + if (reserved_slots & slot_mask) { >>> > + *location = ALIGN(slot_end + 1, 4); >>> >>> I was a bit overzealous about changing this -- should be >>> >> >> Right. It's much improved with these changes already :)> *location = >> ALIGN(*location + 1, 4); >>> >>> Since the reserved slot could be the first one. I've made this change >>> locally. Intel's CI is still happy. >>> >>> Various bit arithmetic could be done to improve this, but ... meh. >>> Hardly seems worth it. >> >> Right. It's much improved with these changes already :) >> >> I'm not really sure what you mean about arrays of incomplete size but >> the logic in this change looks correct to me. > > I'll be honest - I don't remember either. A few things jump out as bad > in the old code - e.g. type->without_array()->vector_elements == 0 for > structs. Also the way it's using count_attribute_slots to compute the > number of array elements is a wee bit suspect as well. These things > mostly work, but I suspect there are scenarios where they don't. Put > another way, I bet that putting a assert(matches[i].num_components == > num_elements * type->without_array()->vector_elements) would trigger. > However that is what is implicitly being assumed. > > Happy to remove that comment though and just claim this is a cleanup > to make slot_end be correct.
One simple example that occurs to me right now is an array of double[]. That has vector_elements == 1 and count_attribute_slots == length of array. So we'll end up counting it as a single component rather than 2. Which means that if you have e.g. flat out double foo[4]; layout(location=1) out vec4 bar; it will think that sticking foo at location 0 is OK. Haven't tested it. I'm pretty sure there were situations going the other way as well (i.e. it would fail to place the var when it would be fine) but I can't come up with any right now. -ilia > >> >> Reviewed-by: Timothy Arceri <timothy.arc...@collabora.com> > > Thanks! > >> >>> >>> > >>> > + slot_end = *location + num_components - 1; >>> > + continue; >>> > } >>> > >>> > - /* Increase the slot to make sure there is enough room >>> > for next >>> > - * array element. >>> > - */ >>> > - if (this->disable_varying_packing && >>> > - !is_varying_packing_safe(type, var)) >>> > - slot_end += 4; >>> > - else >>> > - slot_end += type->without_array()->vector_elements; >>> > + break; >>> > } >>> > >>> > - if (!var->data.patch && *location >= MAX_VARYING * 4u) { >>> > + if (!var->data.patch && slot_end >= MAX_VARYING * 4u) { >>> > linker_error(prog, "insufficient contiguous locations >>> > available for " >>> > "%s it is possible an array or struct could >>> > not be " >>> > "packed between varyings with explicit >>> > locations. Try " >>> > @@ -1593,7 +1587,7 @@ varying_matches::assign_locations(struct >>> > gl_shader_program *prog, >>> > >>> > this->matches[i].generic_location = *location; >>> > >>> > - *location += this->matches[i].num_components; >>> > + *location = slot_end + 1; >>> > } >>> > >>> > return (generic_location + 3) / 4; >>> > -- >>> > 2.7.3 >>> > >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev