On Thu, 2016-04-28 at 15:29 +0200, Ian Romanick wrote:
> On 04/28/2016 01:40 PM, Antia Puentes wrote:
> >
> > From: "Juan A. Suarez Romero" <jasua...@igalia.com>
> >
> > Even when the number of vertex attributes is under the limit, for
> > shaders that use a high number of them, we can quickly exhaust the
> > number of hardware registers.
> Were you able to construct a case where this actually occurs? Limits
> exposed by the driver and enforced by the GLSL linker should prevent
> this.
>

(Re-sending, because the original email was too big.)

Yes. See the attached shader1 test, which exposes this problem.

The driver supports up to 16 vertex attributes. ARB_vertex_attrib_64bit states that attribute variables of type dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 *may* count as consuming twice as many attributes as equivalent single-precision types. I highlight the "may" because it is not mandatory. If we count those types as consuming the same as a single-precision type (which is what Mesa does), the test consumes 15 attributes, so we are under the limit.

The issue is that in scalar mode (SIMD8) we require 4 registers for each vec4 attribute (or 8 for each dvec4 attribute), so it is easy to reach a huge number of registers. That is the problem the test exposes.

If we were working in SIMD4x2, this wouldn't happen, as we would require only 1 register per vec4 attribute (or 2 per dvec4). So the problem is the combination of a high number of attributes and SIMD8 mode.

One of the first approaches we tried was precisely to count the types listed above as consuming two attributes instead of one. In that case, the shader1 test would consume 29 attributes, so the limit would be exceeded. But I see a couple of drawbacks with this approach:

- There are tests under the same conditions (below the limit if you count those types the same as single-precision, but beyond the limit if you count them twice) that still work. An example is the attached shader2 test: it requires 13 attributes (or 19 when counting the mentioned types twice) and it works fine.

- This check affects all the backends, and there could be some backend that works perfectly fine with the current, less conservative implementation. In fact, we have an example: the same driver running in vec4 mode (SIMD4x2) works perfectly fine.

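To make the arithmetic above concrete, here is a rough sketch in Python. It is not Mesa code: the function names and the attribute mix are hypothetical, and the mix (1 vec4 plus 14 dvec4) was only chosen so that the totals match the 15/29 figures quoted for shader1; the real test may declare different types. It uses only the per-attribute register costs stated above and ignores matrix types.

```python
# Illustrative sketch only; this is not Mesa code and all names are hypothetical.

MAX_VERTEX_ATTRIBS = 16  # limit the driver exposes, per this thread

# Registers needed per attribute, using the per-type costs quoted above.
REGS_PER_ATTRIB = {
    "SIMD8": {"vec4": 4, "dvec4": 8},
    "SIMD4x2": {"vec4": 1, "dvec4": 2},
}


def count_slots(attribs, double_counts_twice):
    """Count attribute slots; optionally count each dvec4 as two slots."""
    total = 0
    for type_name, n in attribs.items():
        per_attrib = 2 if (double_counts_twice and type_name == "dvec4") else 1
        total += per_attrib * n
    return total


def count_registers(attribs, mode):
    """Sum the registers the attributes would need in the given dispatch mode."""
    costs = REGS_PER_ATTRIB[mode]
    return sum(costs[type_name] * n for type_name, n in attribs.items())


# Hypothetical attribute mix (assumption, not the actual shader1 contents).
example = {"vec4": 1, "dvec4": 14}

single = count_slots(example, double_counts_twice=False)
double = count_slots(example, double_counts_twice=True)
print("slots, counted once: ", single, "under limit:", single <= MAX_VERTEX_ATTRIBS)
print("slots, counted twice:", double, "under limit:", double <= MAX_VERTEX_ATTRIBS)
print("registers in SIMD8:  ", count_registers(example, "SIMD8"))
print("registers in SIMD4x2:", count_registers(example, "SIMD4x2"))
```

With this mix the slot count passes the limit check when doubles are counted once, yet the SIMD8 register need is four times the SIMD4x2 one, which is the gap the test runs into.
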
So all in all, the best way we found is to keep counting vertex attributes as we do now, and just abort if we exhaust the available registers.

Ideally, the best approach would be to switch to vec4 mode. But this would require supporting gen8+vec4 (right now we are working on support for gen7, which uses vec4), and also improving the switch from scalar mode to vec4 when compiling the shader.

J.A.

shader1.shader_test.gz
Description: application/gzip
shader2.shader_test.gz
Description: application/gzip