[Mesa-dev] [PATCH] glsl/tests: Add a test for properties of sampler types.

2013-09-12 Thread Kenneth Graunke
For each sampler type, this tests that:
- The base type is GLSL_TYPE_SAMPLER.
- The dimensionality is set correctly.
- The returned data type is correct.
- The sampler_array and sampler_shadow flags are set correctly.
- sampler_coordinate_components() returns the correct value.

Signed-off-by: Kenneth Graunke 
Cc: Ian Romanick 
---
 src/glsl/Makefile.am  |  13 +
 src/glsl/tests/sampler_types_test.cpp | 101 ++
 2 files changed, 114 insertions(+)
 create mode 100644 src/glsl/tests/sampler_types_test.cpp

Not a bad idea, since it relies on sampler_dimensionality being set correctly
and there's no test for that.

How's this look?

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 9352848..2e161b8 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -34,6 +34,7 @@ include Makefile.sources
 TESTS = glcpp/tests/glcpp-test \
        tests/optimization-test \
        tests/ralloc-test \
+       tests/sampler-types-test \
        tests/uniform-initializer-test
 
 TESTS_ENVIRONMENT= \
@@ -45,6 +46,7 @@ check_PROGRAMS = \
        glcpp/glcpp \
        glsl_test \
        tests/ralloc-test \
+       tests/sampler-types-test \
        tests/uniform-initializer-test
 
 tests_uniform_initializer_test_SOURCES = \
@@ -70,6 +72,17 @@ tests_ralloc_test_LDADD = \
        $(top_builddir)/src/gtest/libgtest.la \
        $(PTHREAD_LIBS)
 
+tests_sampler_types_test_SOURCES = \
+       $(top_srcdir)/src/mesa/program/prog_hash_table.c \
+       $(top_srcdir)/src/mesa/program/symbol_table.c \
+       tests/sampler_types_test.cpp
+tests_sampler_types_test_CFLAGS = \
+       $(PTHREAD_CFLAGS)
+tests_sampler_types_test_LDADD = \
+       $(top_builddir)/src/gtest/libgtest.la \
+       $(top_builddir)/src/glsl/libglsl.la \
+       $(PTHREAD_LIBS)
+
 libglcpp_la_SOURCES = \
        glcpp/glcpp-lex.c \
        glcpp/glcpp-parse.c \
diff --git a/src/glsl/tests/sampler_types_test.cpp b/src/glsl/tests/sampler_types_test.cpp
new file mode 100644
index 000..4fb30dd
--- /dev/null
+++ b/src/glsl/tests/sampler_types_test.cpp
@@ -0,0 +1,101 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include 
+#include "main/compiler.h"
+#include "main/mtypes.h"
+#include "main/macros.h"
+#include "ralloc.h"
+#include "ir.h"
+
+/**
+ * \file sampler_types_test.cpp
+ *
+ * Test that built-in sampler types have the right properties.
+ */
+
+#define ARRAY    EXPECT_TRUE(type->sampler_array);
+#define NONARRAY EXPECT_FALSE(type->sampler_array);
+#define SHADOW   EXPECT_TRUE(type->sampler_shadow);
+#define COLOR    EXPECT_FALSE(type->sampler_shadow);
+
+#define T(TYPE, DIM, DATA_TYPE, ARR, SHAD, COMPS)   \
+TEST(sampler_types, TYPE)   \
+{   \
+   const glsl_type *type = glsl_type::TYPE##_type;  \
+   EXPECT_EQ(GLSL_TYPE_SAMPLER, type->base_type);   \
+   EXPECT_EQ(DIM, type->sampler_dimensionality);\
+   EXPECT_EQ(DATA_TYPE, type->sampler_type);\
+   ARR; \
+   SHAD;\
+   EXPECT_EQ(COMPS, type->sampler_coordinate_components()); \
+}
+
+T( sampler1D, GLSL_SAMPLER_DIM_1D, GLSL_TYPE_FLOAT, NONARRAY, COLOR, 1)
+T( sampler2D,GLSL_SAMPLER_

Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Mathias Fröhlich

Hi,

On Thursday, September 12, 2013 08:41:10 Christian König wrote:
> I completely agree.
> 
> Building everything shared might speed up the build process a little bit
> and save some space, but at the cost of having to handle a lot of
> rather small shared libraries, each cluttering the symbol space
> of any application using these drivers with a bunch of unnecessary symbols.
> 
> Building everything as one big blob sounds like the better idea.
+1.

Symbol clashes with libraries used in drivers are a huge problem for
applications. Some ship their own version or variant of such a library and do
not expect a second one to be injected as a side effect of a user-space driver;
in the case of LLVM, the problem is simply that it is not reliably thread safe.
If your driver knows the version and instance of LLVM it has linked with,
because it is its own internal copy, technically hidden from all other
potential users, we will see fewer unwanted side effects.

To get this symbol clash problem right, there could be another solution I
have been playing with, which is loading the drivers with RTLD_DEEPBIND. I have
been running with patches for this for a long time here, but I never found the
time to test this for side effects on OpenCL use. The problem that still needs
to be investigated there is that you want to share buffer objects between
OpenCL and OpenGL, and for that to work you might(?) need a single instance of
libdrm in the application. Also, this dlopen flag is not part of the standard
that covers dlopen and is thus not available everywhere.

What problem do you want to solve exactly with this shared library split?

If you care about memory use of the running application, you will need to have
the driver binary loaded including all its code. Having a big blob without any
relocs in between increases the probability that read-only pages for the code
segments can be shared between applications.
Note that the RTLD_DEEPBIND variant above uses even more memory in the running
application, since each dlopened driver gets its own private set of mapped
shared objects.

If you care about on-disk use, which might be interesting for small
embedded-like machines, is it possible to split the drivers rpm into separate
pieces per driver? That way an embedded system integrator could install them
individually, based on the hardware available in such a specific case.

Greetings

Mathias
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Kenneth Graunke
On 09/11/2013 11:41 PM, Christian König wrote:
> I completely agree.
> 
> Building everything shared might speed up the build process a little bit
> and save some space, but at the cost of having to handle a lot of
> rather small shared libraries, each cluttering the symbol space
> of any application using these drivers with a bunch of unnecessary symbols.
> 
> Building everything as one big blob sounds like the better idea.
> 
> Christian.

Not to mention...installing a ton of shared libraries is just asking for
version mismatch problems.  I've had a /ton/ of problems due to
mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.

If anything, I'd like to get rid of libdricore and build core Mesa and
the drivers together again.  No more version clashes.  Far fewer symbols
exported.  LTO for extra performance with no extra effort...

Faster build times are nice, but...not if it means shipping a ton of
shared libraries...

--Ken


Re: [Mesa-dev] [Demos] EGLUT Wayland patch

2013-09-12 Thread Tarnyko
Hi Armin, 

Thank you very much ! My patch lacked the "configure_callback" part. I can 
now use EGLUT with no trouble. 

BTW, it would be nice to merge this upstream, as using git still fetches the 
deprecated code now. 


Regards,
Tarnyko 



Armin K. writes:

> On 09/11/2013 10:46 AM, Tarnyko wrote:
>> Hi folks,
>> Could someone review the following patch ?
>> https://bugs.freedesktop.org/show_bug.cgi?id=69135
>> As of today, the Wayland EGL demo doesn't compile anymore, because of
>> some API breakage between 1.0 and 1.2.
>> Patch solves this ; some improvement-insight is welcome though.
>>
>> Regards,
>> Tarnyko
>
> http://lists.freedesktop.org/archives/mesa-dev/2013-August/043858.html
>
> a bit of c/p from simple-egl demo and es2gears run on weston just fine.


Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Chia-I Wu
On Thu, Sep 12, 2013 at 2:06 PM, Chris Forbes  wrote:
> Can we make this approximation conditional on an image-quality control
> in driconf [or somewhere else]?
Sure.  What would be the default behavior?

> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>> From: Chia-I Wu 
>>
>> Replicate the gradient of the top-left pixel to the other three pixels in the
>> subspan, as how DDY is implemented.  Before, different gradients were used
>> for
>> pixels in the top row and pixels in the bottom row.
>>
>> This change results in a less accurate approximation.  However, it improves
>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>> observed.
>>
>> No piglit gpu.tests regressions.
>>
>> I failed to come up with an explanation for the performance difference.  The
>> change does not make a difference on Ivy Bridge either.  If anyone has the
>> insight, please kindly enlighten me.  Performance differences may also be
>> observed on other games that call textureGrad and dFdx.
>>
>> Signed-off-by: Chia-I Wu 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> index bfb3d33..c0d24a0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct 
>> brw_reg dst, struct brw_reg src
>>  void
>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct 
>> brw_reg src)
>>  {
>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>> +* which gives much better performance when the result is used with
>> +* sample_d
>> +*/
>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>> +  BRW_VERTICAL_STRIDE_2;
>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>> +BRW_WIDTH_2;
>> +
>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>  BRW_REGISTER_TYPE_F,
>> -BRW_VERTICAL_STRIDE_2,
>> -BRW_WIDTH_2,
>> +vstride,
>> +width,
>>  BRW_HORIZONTAL_STRIDE_0,
>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>  BRW_REGISTER_TYPE_F,
>> -BRW_VERTICAL_STRIDE_2,
>> -BRW_WIDTH_2,
>> +vstride,
>> +width,
>>  BRW_HORIZONTAL_STRIDE_0,
>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>> brw_ADD(p, dst, src0, negate(src1));
>> --
>> 1.8.3.1
>>



-- 
o...@lunarg.com


Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Chris Forbes
I guess fast-by-default. I imagine that more apps care about
performance than care about the granularity of their derivatives.

After a bit more thought -- In HLSL shader model 5 there's both
ddx_coarse() and ddx_fine(), which give the shader author the choice
between roughly these options. In a *very* quick look I haven't found
anything equivalent in GLSL -- but I might just be being blind.
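
To make the two options concrete, here is a minimal C++ sketch, not code from
the patch, that simulates how the two register regions in the patch pick
values. It assumes a SIMD8 register holding two 2x2 subspans in the order
tl, tr, bl, br; the names Reg, read_region and ddx are made up for this
illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One SIMD8 register holding two 2x2 subspans, assumed to be laid out as
// ss0.tl, ss0.tr, ss0.bl, ss0.br, ss1.tl, ss1.tr, ss1.bl, ss1.br.
using Reg = std::array<float, 8>;

// Read `r` through a <vstride; width, hstride> region that starts at element
// `start`: channel i reads element
//   start + (i / width) * vstride + (i % width) * hstride.
Reg read_region(const Reg &r, std::size_t start, std::size_t vstride,
                std::size_t width, std::size_t hstride)
{
    Reg out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::size_t elem =
            start + (i / width) * vstride + (i % width) * hstride;
        assert(elem < r.size());
        out[i] = r[elem];
    }
    return out;
}

// Horizontal derivative as "right column minus left column".
// fine:   <2;2,0> regions, so each row of a subspan gets its own difference.
// coarse: <4;4,0> regions, so the top-row difference is replicated to all
//         four pixels of the subspan.
Reg ddx(const Reg &src, bool coarse)
{
    const std::size_t vstride = coarse ? 4 : 2;
    const std::size_t width = coarse ? 4 : 2;
    const Reg right = read_region(src, 1, vstride, width, 0);
    const Reg left = read_region(src, 0, vstride, width, 0);
    Reg out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = right[i] - left[i];
    return out;
}
```

With the input {0, 1, 10, 13, 5, 7, 20, 26}, the fine variant yields
(tr - tl) for the top row and (br - bl) for the bottom row of each subspan,
while the coarse variant repeats (tr - tl) four times per subspan.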

CC'ing Ian -- any opinion? Is there any conformance issue here?

-- Chris

On Thu, Sep 12, 2013 at 8:41 PM, Chia-I Wu  wrote:
> On Thu, Sep 12, 2013 at 2:06 PM, Chris Forbes  wrote:
>> Can we make this approximation conditional on an image-quality control
>> in driconf [or somewhere else]?
> Sure.  What would be the default behavior?
>
>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>>> From: Chia-I Wu 
>>>
>>> Replicate the gradient of the top-left pixel to the other three pixels in 
>>> the
>>> subspan, as how DDY is implemented.  Before, different gradients were used
>>> for
>>> pixels in the top row and pixels in the bottom row.
>>>
>>> This change results in a less accurate approximation.  However, it improves
>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>>> observed.
>>>
>>> No piglit gpu.tests regressions.
>>>
>>> I failed to come up with an explanation for the performance difference.  The
>>> change does not make a difference on Ivy Bridge either.  If anyone has the
>>> insight, please kindly enlighten me.  Performance differences may also be
>>> observed on other games that call textureGrad and dFdx.
>>>
>>> Signed-off-by: Chia-I Wu 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> index bfb3d33..c0d24a0 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct 
>>> brw_reg dst, struct brw_reg src
>>>  void
>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct 
>>> brw_reg src)
>>>  {
>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on 
>>> Haswell,
>>> +* which gives much better performance when the result is used with
>>> +* sample_d
>>> +*/
>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>> +  BRW_VERTICAL_STRIDE_2;
>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>> +BRW_WIDTH_2;
>>> +
>>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>  BRW_REGISTER_TYPE_F,
>>> -BRW_VERTICAL_STRIDE_2,
>>> -BRW_WIDTH_2,
>>> +vstride,
>>> +width,
>>>  BRW_HORIZONTAL_STRIDE_0,
>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>  BRW_REGISTER_TYPE_F,
>>> -BRW_VERTICAL_STRIDE_2,
>>> -BRW_WIDTH_2,
>>> +vstride,
>>> +width,
>>>  BRW_HORIZONTAL_STRIDE_0,
>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>> brw_ADD(p, dst, src0, negate(src1));
>>> --
>>> 1.8.3.1
>>>
>
>
>
> --
> o...@lunarg.com


Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Chia-I Wu
On Thu, Sep 12, 2013 at 5:27 PM, Chris Forbes  wrote:
> I guess fast-by-default. I imagine that more apps care about
> performance than care about the granularity of their derivatives.
That is my preference too.  My concern is that the performance gain is
only observed on Haswell so far.  Why is that, and is there a way to
speed up sample_d on Ivy Bridge and Sandy Bridge?

> After a bit more thought -- In HLSL shader model 5 there's both
> ddx_coarse() and ddx_fine() which gives the shader author the choice
> between roughly these options. In a *very* quick look I haven't found
> anything equivalent -- but I might just be being blind.
>
> CC'ing Ian -- any opinion? Is there any conformance issue here?
>
> -- Chris
>
> On Thu, Sep 12, 2013 at 8:41 PM, Chia-I Wu  wrote:
>> On Thu, Sep 12, 2013 at 2:06 PM, Chris Forbes  wrote:
>>> Can we make this approximation conditional on an image-quality control
>>> in driconf [or somewhere else]?
>> Sure.  What would be the default behavior?
>>
>>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>>>> From: Chia-I Wu 
>>>>
>>>> Replicate the gradient of the top-left pixel to the other three pixels in the
>>>> subspan, as how DDY is implemented.  Before, different gradients were used for
>>>> pixels in the top row and pixels in the bottom row.
>>>>
>>>> This change results in a less accurate approximation.  However, it improves
>>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>>>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>>>> observed.
>>>>
>>>> No piglit gpu.tests regressions.
>>>>
>>>> I failed to come up with an explanation for the performance difference.  The
>>>> change does not make a difference on Ivy Bridge either.  If anyone has the
>>>> insight, please kindly enlighten me.  Performance differences may also be
>>>> observed on other games that call textureGrad and dFdx.
>>>>
>>>> Signed-off-by: Chia-I Wu 
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> index bfb3d33..c0d24a0 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>>>>  void
>>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
>>>>  {
>>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>>>> +* which gives much better performance when the result is used with
>>>> +* sample_d
>>>> +*/
>>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>>> +  BRW_VERTICAL_STRIDE_2;
>>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>>> +BRW_WIDTH_2;
>>>> +
>>>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>>  BRW_REGISTER_TYPE_F,
>>>> -BRW_VERTICAL_STRIDE_2,
>>>> -BRW_WIDTH_2,
>>>> +vstride,
>>>> +width,
>>>>  BRW_HORIZONTAL_STRIDE_0,
>>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>>  BRW_REGISTER_TYPE_F,
>>>> -BRW_VERTICAL_STRIDE_2,
>>>> -BRW_WIDTH_2,
>>>> +vstride,
>>>> +width,
>>>>  BRW_HORIZONTAL_STRIDE_0,
>>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>> brw_ADD(p, dst, src0, negate(src1));
>>>> --
>>>> 1.8.3.1

>>
>>
>>
>> --
>> o...@lunarg.com



-- 
o...@lunarg.com


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Johannes Obermayr
I see current situation is better:

Symbol table '.dynsym' contains ...

master:

libdricore:      3675
i965_dri:         398


after [PATCH 10/21]:

classic drivers:
libmesacore:      839
libmesadri:       348
total:           1187
i965_dri:         397

gallium drivers:
libgallium:       833
libmesacore:      839
libmesagallium:   360
total:           2032

Complaining about the weather instead of opening the shutter to see the sun.

On Thursday, 12 September 2013, 00:44:58, Kenneth Graunke wrote:
> On 09/11/2013 11:41 PM, Christian König wrote:
> > I completely agree.
> > 
> > Building everything shared might speed up the build process a little bit
> > and save some space, but at the cost of having to handle a lot of
> > rather small shared libraries, each cluttering the symbol space
> > of any application using these drivers with a bunch of unnecessary symbols.
> > 
> > Building everything as one big blob sounds like the better idea.
> > 
> > Christian.
> 
> Not to mention...installing a ton of shared libraries is just asking for
> version mismatch problems.  I've had a /ton/ of problems due to
> mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.
> 
> If anything, I'd like to get rid of libdricore and build core Mesa and
> the drivers together again.  No more version clashes.  Far fewer symbols
> exported.  LTO for extra performance with no extra effort...
> 
> Faster build times are nice, but...not if it means shipping a ton of
> shared libraries...
> 
> --Ken


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Marek Olšák
I think current Gallium drivers only need one exported symbol to work:
__driDriverExtensions. See:
http://lists.freedesktop.org/archives/mesa-dev/2013-August/043038.html

Marek

On Thu, Sep 12, 2013 at 2:41 PM, Johannes Obermayr
 wrote:
> I see current situation is better:
>
> Symbol table '.dynsym' contains ...
>
> master:
>
> libdricore: 3675
> i965_dri:398
>
>
> after [PATCH 10/21]:
>
> classic drivers:
> libmesacore: 839
> libmesadri:  348
> total:  1187
> i965_dri:397
>
> gallium drivers:
> libgallium:  833
> libmesacore: 839
> libmesagallium:  360
> total:  2032
>
> Complaining about the weather instead of opening the shutter to see the sun.
>
> On Thursday, 12 September 2013, 00:44:58, Kenneth Graunke wrote:
>> On 09/11/2013 11:41 PM, Christian König wrote:
>> > I completely agree.
>> >
>> > Building everything shared might speed up the build process a little bit
>> > and save some space, but at the cost of having to handle a lot of
>> > rather small shared libraries, each cluttering the symbol space
>> > of any application using these drivers with a bunch of unnecessary symbols.
>> >
>> > Building everything as one big blob sounds like the better idea.
>> >
>> > Christian.
>>
>> Not to mention...installing a ton of shared libraries is just asking for
>> version mismatch problems.  I've had a /ton/ of problems due to
>> mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.
>>
>> If anything, I'd like to get rid of libdricore and build core Mesa and
>> the drivers together again.  No more version clashes.  Far fewer symbols
>> exported.  LTO for extra performance with no extra effort...
>>
>> Faster build times are nice, but...not if it means shipping a ton of
>> shared libraries...
>>
>> --Ken


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Johannes Obermayr
On Thursday, 12 September 2013, 14:52:15, Marek Olšák wrote:
> I think current Gallium drivers only need one exported symbol to work:
> __driDriverExtensions. See:
> http://lists.freedesktop.org/archives/mesa-dev/2013-August/043038.html
> 
> Marek

readelf -s usr/lib64/dri/radeonsi_dri.so 

master:
Symbol table '.dynsym' contains 556 entries:

after patch series:
Symbol table '.dynsym' contains 153 entries:
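
How many entries land in '.dynsym' depends largely on which symbols are
visible by default. One common way to get down to a single exported entry
point with GCC or Clang is to build with -fvisibility=hidden and mark only
that symbol as public. The C++ sketch below illustrates the idea; it is not
what the patch series does, and every name in it (EXAMPLE_PUBLIC,
ExampleExtension, example_driver_extensions) is hypothetical.

```cpp
#include <cstddef>

// Mark a symbol as exported even when the file is built with
// -fvisibility=hidden (assumed build flag for this sketch).
#if defined(__GNUC__)
#define EXAMPLE_PUBLIC __attribute__((visibility("default")))
#else
#define EXAMPLE_PUBLIC
#endif

// Hypothetical stand-in for a driver extension record.
struct ExampleExtension {
    const char *name;
    int version;
};

// Internal linkage: never appears in the dynamic symbol table.
static const ExampleExtension example_core_extension = {"EXAMPLE_core", 1};

// The single entry point a loader would look up by name.
// The array is terminated by a null pointer.
extern "C" EXAMPLE_PUBLIC const ExampleExtension *const
    example_driver_extensions[] = {&example_core_extension, nullptr};

// Not marked public, so it stays hidden under -fvisibility=hidden.
std::size_t example_count_extensions()
{
    std::size_t n = 0;
    while (example_driver_extensions[n] != nullptr)
        ++n;
    return n;
}
```

A linker version script is another way to reach the same result; the
attribute approach keeps the export decision next to the code.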

> 
> On Thu, Sep 12, 2013 at 2:41 PM, Johannes Obermayr
>  wrote:
> > I see current situation is better:
> >
> > Symbol table '.dynsym' contains ...
> >
> > master:
> >
> > libdricore: 3675
> > i965_dri:398
> >
> >
> > after [PATCH 10/21]:
> >
> > classic drivers:
> > libmesacore: 839
> > libmesadri:  348
> > total:  1187
> > i965_dri:397
> >
> > gallium drivers:
> > libgallium:  833
> > libmesacore: 839
> > libmesagallium:  360
> > total:  2032
> >
> > Complaining about the weather instead of opening the shutter to see the sun.
> >
> > On Thursday, 12 September 2013, 00:44:58, Kenneth Graunke wrote:
> >> On 09/11/2013 11:41 PM, Christian König wrote:
> >> > I completely agree.
> >> >
> >> > Building everything shared might speed up the build process a little bit
> >> > and save some space, but at the cost of having to handle a lot of
> >> > rather small shared libraries, each cluttering the symbol space
> >> > of any application using these drivers with a bunch of unnecessary symbols.
> >> >
> >> > Building everything as one big blob sounds like the better idea.
> >> >
> >> > Christian.
> >>
> >> Not to mention...installing a ton of shared libraries is just asking for
> >> version mismatch problems.  I've had a /ton/ of problems due to
> >> mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.
> >>
> >> If anything, I'd like to get rid of libdricore and build core Mesa and
> >> the drivers together again.  No more version clashes.  Far fewer symbols
> >> exported.  LTO for extra performance with no extra effort...
> >>
> >> Faster build times are nice, but...not if it means shipping a ton of
> >> shared libraries...
> >>
> >> --Ken


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Christian König

On 12.09.2013 14:52, Marek Olšák wrote:
> I think current Gallium drivers only need one exported symbol to work:
> __driDriverExtensions. See:
> http://lists.freedesktop.org/archives/mesa-dev/2013-August/043038.html


Yes, that's indeed the right way of doing it. Feel free to make the
changes Chia-I Wu suggested to avoid the version file and commit the
patch with my rb.


Christian.


> Marek
>
> On Thu, Sep 12, 2013 at 2:41 PM, Johannes Obermayr
>  wrote:
>> I see current situation is better:
>>
>> Symbol table '.dynsym' contains ...
>>
>> master:
>>
>> libdricore:      3675
>> i965_dri:         398
>>
>>
>> after [PATCH 10/21]:
>>
>> classic drivers:
>> libmesacore:      839
>> libmesadri:       348
>> total:           1187
>> i965_dri:         397
>>
>> gallium drivers:
>> libgallium:       833
>> libmesacore:      839
>> libmesagallium:   360
>> total:           2032
>>
>> Complaining about the weather instead of opening the shutter to see the sun.
>>
>> On Thursday, 12 September 2013, 00:44:58, Kenneth Graunke wrote:
>>> On 09/11/2013 11:41 PM, Christian König wrote:
>>>> I completely agree.
>>>>
>>>> Building everything shared might speed up the build process a little bit
>>>> and save some space, but at the cost of having to handle a lot of
>>>> rather small shared libraries, each cluttering the symbol space
>>>> of any application using these drivers with a bunch of unnecessary symbols.
>>>>
>>>> Building everything as one big blob sounds like the better idea.
>>>>
>>>> Christian.
>>>
>>> Not to mention...installing a ton of shared libraries is just asking for
>>> version mismatch problems.  I've had a /ton/ of problems due to
>>> mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.
>>>
>>> If anything, I'd like to get rid of libdricore and build core Mesa and
>>> the drivers together again.  No more version clashes.  Far fewer symbols
>>> exported.  LTO for extra performance with no extra effort...
>>>
>>> Faster build times are nice, but...not if it means shipping a ton of
>>> shared libraries...
>>>
>>> --Ken


Re: [Mesa-dev] regression on nvc0 since floating point compare instructions

2013-09-12 Thread Roland Scheidegger
On 12.09.2013 03:40, Dave Airlie wrote:
>>
>> Maybe the type isn't set correctly? Looks to me like these instructions
>> end up in mkCmp, which will set both src and dst type but ignore src
>> type and set both according to the same type (which was the dst type).
>>
>> Roland
> 
> Okay I've attached my next attempt at fixing it, fixes the two testcases I 
> had.


No idea what setting type there really does but I guess that looks right
:-). Though I'm wondering if U32 vs. S32 would make a difference for dst
type since some of the (unsigned) comparisons still would use U32.

Roland


Re: [Mesa-dev] regression on nvc0 since floating point compare instructions

2013-09-12 Thread Christoph Bumiller
On 12.09.2013 16:14, Roland Scheidegger wrote:
> On 12.09.2013 03:40, Dave Airlie wrote:
>>> Maybe the type isn't set correctly? Looks to me like these instructions
>>> end up in mkCmp, which will set both src and dst type but ignore src
>>> type and set both according to the same type (which was the dst type).
>>>
>>> Roland
>> Okay I've attached my next attempt at fixing it, fixes the two testcases I 
>> had.
>
> No idea what setting type there really does but I guess that looks right
> :-). Though I'm wondering if U32 vs. S32 would make a difference for dst
> type since some of the (unsigned) comparisons still would use U32.

It doesn't make a difference, making it signed is unnecessary.
If it helped before that was just because it made negative floats be
interpreted as negative ints (instead of large ints) which has a
slightly better chance of "succeeding".
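
The effect is easy to reproduce outside the compiler. The following C++
sketch is only an illustration, not nvc0 code; the helper names are invented.
It compares two floats by their raw 32-bit patterns, once as unsigned and
once as signed integers, and assumes a 32-bit IEEE-754 float.

```cpp
#include <cstdint>
#include <cstring>

// Return the raw bit pattern of a float as an unsigned 32-bit integer.
std::uint32_t bits_u32(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "32-bit float expected");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Return the raw bit pattern of a float as a signed 32-bit integer.
std::int32_t bits_s32(float f)
{
    std::int32_t s;
    std::memcpy(&s, &f, sizeof s);
    return s;
}

// "a < b" evaluated on the bit patterns, as a mistyped comparison would.
bool less_as_u32(float a, float b) { return bits_u32(a) < bits_u32(b); }
bool less_as_s32(float a, float b) { return bits_s32(a) < bits_s32(b); }
```

For -1.0f < 1.0f the unsigned comparison is false, because the sign bit makes
-1.0f a large integer, while the signed comparison happens to be true. For
-2.0f < -1.0f both integer comparisons return false, so the signed variant
only succeeds by accident when the operands differ in sign.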

> Roland



Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Ian Romanick
On 09/12/2013 01:06 AM, Chris Forbes wrote:
> Can we make this approximation conditional on an image-quality control
> in driconf [or somewhere else]?

There's already a control that applications can use:
GL_FRAGMENT_SHADER_DERIVATIVE_HINT.  I don't know whether or not /any/
app has ever used it.  The default setting is GL_DONT_CARE, so,
technically speaking, we could do this optimization whenever the hint
isn't GL_NICEST.  Though, we may want a driconf override anyway.  Hmm...
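
A minimal C++ sketch of that decision, under stated assumptions: the three
constants carry the values of GL_DONT_CARE, GL_FASTEST and GL_NICEST from the
OpenGL headers, while DdxOverride and use_coarse_ddx are hypothetical names
standing in for a driconf option and a driver helper that do not exist in
this thread.

```cpp
// Enum values as defined in the OpenGL headers.
constexpr unsigned kGlDontCare = 0x1100; // GL_DONT_CARE
constexpr unsigned kGlFastest = 0x1101;  // GL_FASTEST
constexpr unsigned kGlNicest = 0x1102;   // GL_NICEST

// Hypothetical driconf-style override; not an existing option.
enum class DdxOverride { FollowHint, ForceFine, ForceCoarse };

// Decide whether the cheaper, subspan-uniform DDX may be used.
// derivative_hint is the current GL_FRAGMENT_SHADER_DERIVATIVE_HINT value.
bool use_coarse_ddx(unsigned derivative_hint, DdxOverride override_mode)
{
    switch (override_mode) {
    case DdxOverride::ForceFine:
        return false;
    case DdxOverride::ForceCoarse:
        return true;
    case DdxOverride::FollowHint:
        break;
    }
    // Without an override, only GL_NICEST asks for the accurate variant.
    return derivative_hint != kGlNicest;
}
```

With no override, the default hint of GL_DONT_CARE selects the fast path, and
an application has to request GL_NICEST explicitly to get per-row gradients.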

> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>> From: Chia-I Wu 
>>
>> Replicate the gradient of the top-left pixel to the other three pixels in the
>> subspan, as how DDY is implemented.  Before, different gradients were used
>> for
>> pixels in the top row and pixels in the bottom row.
>>
>> This change results in a less accurate approximation.  However, it improves
>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>> observed.
>>
>> No piglit gpu.tests regressions.
>>
>> I failed to come up with an explanation for the performance difference.  The
>> change does not make a difference on Ivy Bridge either.  If anyone has the
>> insight, please kindly enlighten me.  Performance differences may also be
>> observed on other games that call textureGrad and dFdx.
>>
>> Signed-off-by: Chia-I Wu 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> index bfb3d33..c0d24a0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct 
>> brw_reg dst, struct brw_reg src
>>  void
>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct 
>> brw_reg src)
>>  {
>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>> +* which gives much better performance when the result is used with
>> +* sample_d
>> +*/
>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>> +  BRW_VERTICAL_STRIDE_2;
>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>> +BRW_WIDTH_2;
>> +
>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>  BRW_REGISTER_TYPE_F,
>> -BRW_VERTICAL_STRIDE_2,
>> -BRW_WIDTH_2,
>> +vstride,
>> +width,
>>  BRW_HORIZONTAL_STRIDE_0,
>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>  BRW_REGISTER_TYPE_F,
>> -BRW_VERTICAL_STRIDE_2,
>> -BRW_WIDTH_2,
>> +vstride,
>> +width,
>>  BRW_HORIZONTAL_STRIDE_0,
>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>> brw_ADD(p, dst, src0, negate(src1));
>> --
>> 1.8.3.1
>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
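
The region change in the quoted DDX patch can be modeled on the CPU. The sketch below is only an illustration under stated assumptions: it treats a SIMD8 register as eight floats laid out as two subspans (tl, tr, bl, br each), and it models a <vstride; width, hstride> region as simple index arithmetic. `Region`, `read_region` and `ddx` are hypothetical names, not Mesa or hardware APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of a register region <vstride; width, hstride>
// that starts at element `start` and is read by an 8-wide instruction.
struct Region {
    std::size_t start;
    std::size_t vstride;
    std::size_t width;
    std::size_t hstride;
};

using Simd8 = std::array<float, 8>;

// Channel ch reads element start + row * vstride + col * hstride,
// where row = ch / width and col = ch % width.
Simd8 read_region(const Simd8 &reg, const Region &r)
{
    assert(r.width > 0 && 8 % r.width == 0);
    Simd8 out{};
    for (std::size_t ch = 0; ch < 8; ++ch) {
        std::size_t row = ch / r.width;
        std::size_t col = ch % r.width;
        std::size_t idx = r.start + row * r.vstride + col * r.hstride;
        assert(idx < 8);
        out[ch] = reg[idx];
    }
    return out;
}

// coarse == false models <2;2,0>: one gradient per pixel row.
// coarse == true models <4;4,0>: the top-row gradient for the whole subspan.
Simd8 ddx(const Simd8 &reg, bool coarse)
{
    std::size_t vstride = coarse ? 4 : 2;
    std::size_t width = coarse ? 4 : 2;
    Simd8 right = read_region(reg, {1, vstride, width, 0});
    Simd8 left = read_region(reg, {0, vstride, width, 0});
    Simd8 out{};
    for (std::size_t ch = 0; ch < 8; ++ch)
        out[ch] = right[ch] - left[ch];
    return out;
}
```

With horizontal stride 0, each row of the region repeats a single element, so the width decides how many channels share one gradient: two in the old code, four (a whole subspan) in the Haswell path.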


Re: [Mesa-dev] [PATCH] glsl/tests: Add a test for properties of sampler types.

2013-09-12 Thread Ian Romanick
On 09/12/2013 02:08 AM, Kenneth Graunke wrote:
> For each sampler type, this tests that:
> - The base type is GLSL_TYPE_SAMPLER.
> - The dimensionality is set correctly.
> - The returned data type is correct.
> - The sampler_array and sampler_shadow flags are set correctly.
> - sampler_coordinate_components() returns the correct value.
> 
> Signed-off-by: Kenneth Graunke 
> Cc: Ian Romanick 

Wow.  I was expecting that to be a much larger patch.  Strong work!

Reviewed-by: Ian Romanick 

> ---
>  src/glsl/Makefile.am  |  13 +
>  src/glsl/tests/sampler_types_test.cpp | 101 ++
>  2 files changed, 114 insertions(+)
>  create mode 100644 src/glsl/tests/sampler_types_test.cpp
> 
> Not a bad idea, since it relies on sampler_dimensionality being set correctly
> and there's no test for that.
> 
> How's this look?
> 
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index 9352848..2e161b8 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -34,6 +34,7 @@ include Makefile.sources
>  TESTS = glcpp/tests/glcpp-test   \
>   tests/optimization-test \
>   tests/ralloc-test   \
> + tests/sampler-types-test\
>   tests/uniform-initializer-test
>  
>  TESTS_ENVIRONMENT= \
> @@ -45,6 +46,7 @@ check_PROGRAMS =\
>   glcpp/glcpp \
>   glsl_test   \
>   tests/ralloc-test   \
> + tests/sampler-types-test\
>   tests/uniform-initializer-test
>  
>  tests_uniform_initializer_test_SOURCES = \
> @@ -70,6 +72,17 @@ tests_ralloc_test_LDADD =  \
>   $(top_builddir)/src/gtest/libgtest.la   \
>   $(PTHREAD_LIBS)
>  
> +tests_sampler_types_test_SOURCES =   \
> + $(top_srcdir)/src/mesa/program/prog_hash_table.c\
> + $(top_srcdir)/src/mesa/program/symbol_table.c   \
> + tests/sampler_types_test.cpp
> +tests_sampler_types_test_CFLAGS =\
> + $(PTHREAD_CFLAGS)
> +tests_sampler_types_test_LDADD = \
> + $(top_builddir)/src/gtest/libgtest.la   \
> + $(top_builddir)/src/glsl/libglsl.la \
> + $(PTHREAD_LIBS)
> +
>  libglcpp_la_SOURCES =\
>   glcpp/glcpp-lex.c   \
>   glcpp/glcpp-parse.c \
> diff --git a/src/glsl/tests/sampler_types_test.cpp b/src/glsl/tests/sampler_types_test.cpp
> new file mode 100644
> index 000..4fb30dd
> --- /dev/null
> +++ b/src/glsl/tests/sampler_types_test.cpp
> @@ -0,0 +1,101 @@
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +#include 
> +#include "main/compiler.h"
> +#include "main/mtypes.h"
> +#include "main/macros.h"
> +#include "ralloc.h"
> +#include "ir.h"
> +
> +/**
> + * \file sampler_types_test.cpp
> + *
> + * Test that built-in sampler types have the right properties.
> + */
> +
> +#define ARRAY    EXPECT_TRUE(type->sampler_array);
> +#define NONARRAY EXPECT_FALSE(type->sampler_array);
> +#define SHADOW   EXPECT_TRUE(type->sampler_shadow);
> +#define COLOR    EXPECT_FALSE(type->sampler_shadow);
> +
> +#define T(TYPE, DIM, DATA_TYPE, ARR, SHAD, COMPS)   \
> +TEST(sampler_types, TYPE)   \
> +{   \
> +   const glsl_type *type = glsl_type::TYPE##_type;  \
> +   EXPECT_EQ(GLSL_TYPE_SAMPLER, type->base_type);   \
> +   EXPECT_EQ(DIM, type->sampler_dimensionality);\
> +   EXPECT_EQ(DATA_TYPE, type->sampler_type)
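
The property the quoted test exercises last, the number of coordinate components per sampler type, can be sketched independently of Mesa. The code below is a toy model, not Mesa's implementation: `Dim`, `SamplerInfo` and `coordinate_components` are hypothetical names, and the rule it encodes (base dimensionality plus one for array samplers, with the shadow comparator not counted) is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a sampler dimensionality enum.
enum class Dim { D1, D2, D3, Cube, Rect, Buf };

// Hypothetical stand-in for the sampler fields the test inspects.
struct SamplerInfo {
    Dim dim;
    bool is_array;
    bool is_shadow;
};

// Assumed rule: coordinates needed to address one texel of the base
// dimensionality, plus one for the array layer. The shadow comparator
// is deliberately not counted in this model.
int coordinate_components(const SamplerInfo &s)
{
    int size = 0;
    switch (s.dim) {
    case Dim::D1:
    case Dim::Buf:
        size = 1;
        break;
    case Dim::D2:
    case Dim::Rect:
        size = 2;
        break;
    case Dim::D3:
    case Dim::Cube:
        size = 3;
        break;
    }
    assert(size > 0);
    if (s.is_array)
        size += 1;
    return size;
}
```

A table-driven test in the style of the T() macro would then compare each expected count against this function, one sampler type per row.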

Re: [Mesa-dev] [PATCH] mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix

2013-09-12 Thread Roland Scheidegger
Am 12.09.2013 18:47, schrieb Ian Romanick:
> From: Ian Romanick 
> 
> Everyone at the Khronos meeting was as surprised that GLSL didn't
> already support this as we were.  Several vendors said they'd ship it,
> but there didn't seem to be enough interest to put in the effort to make
> it ARB or KHR.
> 
> Signed-off-by: Ian Romanick 
> Cc: Matt Turner 
> ---
>  docs/specs/MESA_shader_integer_mix.spec  | 19 +++
>  src/glsl/builtin_functions.cpp   |  2 +-
>  src/glsl/glcpp/glcpp-parse.y |  4 ++--
>  src/glsl/glsl_parser_extras.cpp  |  2 +-
>  src/glsl/glsl_parser_extras.h|  4 ++--
>  src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
>  src/mesa/main/extensions.c   |  2 +-
>  src/mesa/main/mtypes.h   |  2 +-
>  8 files changed, 20 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/specs/MESA_shader_integer_mix.spec 
> b/docs/specs/MESA_shader_integer_mix.spec
> index d381ddd..f2f903b 100644
> --- a/docs/specs/MESA_shader_integer_mix.spec
> +++ b/docs/specs/MESA_shader_integer_mix.spec
> @@ -1,10 +1,10 @@
>  Name
>  
> -MESA_shader_integer_mix
> +EXT_shader_integer_mix
>  
>  Name Strings
>  
> -GL_MESA_shader_integer_mix
> +GL_EXT_shader_integer_mix
>  
>  Contact
>  
> @@ -21,12 +21,12 @@ Status
>  
>  Version
>  
> -Last Modified Date: 09/09/2013
> -Author Revision:5
> +Last Modified Date: 09/12/2013
> +Author Revision:6
>  
>  Number
>  
> -
> +TBD
>  
>  Dependencies
>  
> @@ -78,18 +78,18 @@ Modifications to The OpenGL Shading Language 
> Specification, Version 4.40
>  Including the following line in a shader can be used to control the
>  language features described in this extension:
>  
> -  #extension GL_MESA_shader_integer_mix : 
> +  #extension GL_EXT_shader_integer_mix : 
>  
>  where  is as specified in section 3.3.
>  
>  New preprocessor #defines are added to the OpenGL Shading Language:
>  
> -  #define GL_MESA_shader_integer_mix1
> +  #define GL_EXT_shader_integer_mix1
>  
>  Interactions with ARB_ES3_compatibility
>  
>  On desktop implementations that support ARB_ES3_compatibility,
> -GL_MESA_shader_integer_mix can be enabled (and the new functions
> +GL_EXT_shader_integer_mix can be enabled (and the new functions
>  used) in shaders declared with '#version 300 es'.
>  
>  GLX Protocol
> @@ -124,6 +124,9 @@ Revision History
>  
>  Rev.Date  AuthorChanges
>      -
> +  6   09/12/2013  idr   After discussions in Khronso, change vendor
> +prefix to EXT.
Khronso->Khronos



> +
>5   09/09/2013  idr   Add ARB_ES3_compatibility interaction.
>  
>4   09/06/2013  mattst88  Allow extension on OpenGL ES 3.0.
> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> index ce78df1..e005a95 100644
> --- a/src/glsl/builtin_functions.cpp
> +++ b/src/glsl/builtin_functions.cpp
> @@ -190,7 +190,7 @@ shader_bit_encoding(const _mesa_glsl_parse_state *state)
>  static bool
>  shader_integer_mix(const _mesa_glsl_parse_state *state)
>  {
> -   return v130(state) && state->MESA_shader_integer_mix_enable;
> +   return v130(state) && state->EXT_shader_integer_mix_enable;
>  }
>  
>  static bool
> diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
> index fb1c1d0..6eaa5f9 100644
> --- a/src/glsl/glcpp/glcpp-parse.y
> +++ b/src/glsl/glcpp/glcpp-parse.y
> @@ -1246,8 +1246,8 @@ glcpp_parser_create (const struct gl_extensions 
> *extensions, int api)
> if (extensions->ARB_shading_language_420pack)
>add_builtin_define(parser, "GL_ARB_shading_language_420pack", 
> 1);
>  
> -   if (extensions->MESA_shader_integer_mix)
> -  add_builtin_define(parser, "GL_MESA_shader_integer_mix", 1);
> +   if (extensions->EXT_shader_integer_mix)
> +  add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
>  }
>   }
>  
> diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
> index 1e4d7c7..1f5900b 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -517,7 +517,7 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
> EXT(ARB_texture_query_lod,  true,  false, 
> ARB_texture_query_lod),
> EXT(ARB_gpu_shader5,true,  false, ARB_gpu_shader5),
> EXT(AMD_vertex_shader_layer,true,  false, 
> AMD_vertex_shader_layer),
> -   EXT(MESA_shader_integer_mix,true,   true, 
> MESA_shader_integer_mix),
> +   EXT(EXT_shader_integer_mix,true,   true,  
> EXT_shader_integer_mix),
Formatting looks wrong.


>  };
>  
>  #undef EXT
> diff --git a/src/glsl/glsl_parser_e

Re: [Mesa-dev] [RFC] Consolidate the remaining source files to Makefile.sources

2013-09-12 Thread Jakob Bornecrantz
On Thu, Aug 15, 2013 at 7:04 PM, Emil Velikov wrote:

> Hello list
>
> Feeling inspired by the automake work done in mesa, I felt like
> finishing a few things that did not receive too much attention
>  * use Makefile.sources wherever possible
>  * cleanup the duplicated C{,PP,XX}FLAGS and factor out the common
> ones into Automake.inc
>
> If anyone is interested I have pushed my preliminary work to the
> makefile.sources branch at https://github.com/evelikov/Mesa/
>
> Currently I have gone through the following:
>  gallium/drivers
>  gallium/state_trackers/dri/drm
>  gallium/state_trackers/dri/sw
>  gallium/state_trackers/vega
>  gallium/state_trackers/xorg
>
> Producing the following diff stat:
>  54 files changed, 352 insertions(+), 552 deletions(-)
>
>
> Pros:
> * patches such as "gallium/dri-targets: hide all symbols except for
> __driDriverExtensions" will be brought down to changing only 2-3 lines
> * one less chance of breaking builds when someone forgets to update the
> SCons/Makefile.am/etc build, after adding/removing a file
>
> Cons:
> * Non-nouveau changes will be build-tested only (lacking any other hardware atm).
> Note that I will check the symbols for all drivers, to ensure that I do
> not cause chaos.
>
> Curious if anyone is interested and has any comments on this.
> Note: there may be some git rebase fails as I've started this ~3 months ago
>

This seems to have been dropped on the floor.

The Makefile.sources patches look good, can't comment on the other ones.

If you rebase them and clobber the Makefile.sources ones, then send that
out along with the rest of the patches on that branch, I'm sure they will
get reviewed.

Cheers, Jakob.


Re: [Mesa-dev] [PATCH] mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix

2013-09-12 Thread Matt Turner
On Thu, Sep 12, 2013 at 9:47 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Everyone at the Khronos meeting was as surprised that GLSL didn't
> already support this as we were.  Several vendors said they'd ship it,
> but there didn't seem to be enough interest to put in the effort to make
> it ARB or KHR.
>
> Signed-off-by: Ian Romanick 
> Cc: Matt Turner 

Nice! Thanks Ian.

Reviewed-by: Matt Turner 

> ---
>  docs/specs/MESA_shader_integer_mix.spec  | 19 +++
>  src/glsl/builtin_functions.cpp   |  2 +-
>  src/glsl/glcpp/glcpp-parse.y |  4 ++--
>  src/glsl/glsl_parser_extras.cpp  |  2 +-
>  src/glsl/glsl_parser_extras.h|  4 ++--
>  src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
>  src/mesa/main/extensions.c   |  2 +-
>  src/mesa/main/mtypes.h   |  2 +-
>  8 files changed, 20 insertions(+), 17 deletions(-)
>
> diff --git a/docs/specs/MESA_shader_integer_mix.spec 
> b/docs/specs/MESA_shader_integer_mix.spec
> index d381ddd..f2f903b 100644
> --- a/docs/specs/MESA_shader_integer_mix.spec
> +++ b/docs/specs/MESA_shader_integer_mix.spec
> @@ -1,10 +1,10 @@
>  Name
>
> -MESA_shader_integer_mix
> +EXT_shader_integer_mix
>
>  Name Strings
>
> -GL_MESA_shader_integer_mix
> +GL_EXT_shader_integer_mix
>
>  Contact
>
> @@ -21,12 +21,12 @@ Status
>
>  Version
>
> -Last Modified Date: 09/09/2013
> -Author Revision:5
> +Last Modified Date: 09/12/2013
> +Author Revision:6
>
>  Number
>
> -
> +TBD
>
>  Dependencies
>
> @@ -78,18 +78,18 @@ Modifications to The OpenGL Shading Language 
> Specification, Version 4.40
>  Including the following line in a shader can be used to control the
>  language features described in this extension:
>
> -  #extension GL_MESA_shader_integer_mix : 
> +  #extension GL_EXT_shader_integer_mix : 
>
>  where  is as specified in section 3.3.
>
>  New preprocessor #defines are added to the OpenGL Shading Language:
>
> -  #define GL_MESA_shader_integer_mix1
> +  #define GL_EXT_shader_integer_mix1
>
>  Interactions with ARB_ES3_compatibility
>
>  On desktop implementations that support ARB_ES3_compatibility,
> -GL_MESA_shader_integer_mix can be enabled (and the new functions
> +GL_EXT_shader_integer_mix can be enabled (and the new functions
>  used) in shaders declared with '#version 300 es'.
>
>  GLX Protocol
> @@ -124,6 +124,9 @@ Revision History
>
>  Rev.Date  AuthorChanges
>      -
> +  6   09/12/2013  idr   After discussions in Khronso, change vendor

Extra o on Khronos.
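
For context on what the renamed extension enables: it lets mix() select between integer components using a boolean third argument. The following is a CPU-side sketch of that selection rule only; `integer_mix` is a hypothetical helper name, and vectors are modeled as std::vector rather than GLSL types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Component-wise select: where a[i] is false take x[i], where it is
// true take y[i]. No interpolation happens, which is why the operation
// is well defined for integers.
std::vector<int> integer_mix(const std::vector<int> &x,
                             const std::vector<int> &y,
                             const std::vector<bool> &a)
{
    assert(x.size() == y.size() && y.size() == a.size());
    std::vector<int> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = a[i] ? y[i] : x[i];
    return out;
}
```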


[Mesa-dev] [PATCH] mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix

2013-09-12 Thread Ian Romanick
From: Ian Romanick 

Everyone at the Khronos meeting was as surprised that GLSL didn't
already support this as we were.  Several vendors said they'd ship it,
but there didn't seem to be enough interest to put in the effort to make
it ARB or KHR.

Signed-off-by: Ian Romanick 
Cc: Matt Turner 
---
 docs/specs/MESA_shader_integer_mix.spec  | 19 +++
 src/glsl/builtin_functions.cpp   |  2 +-
 src/glsl/glcpp/glcpp-parse.y |  4 ++--
 src/glsl/glsl_parser_extras.cpp  |  2 +-
 src/glsl/glsl_parser_extras.h|  4 ++--
 src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
 src/mesa/main/extensions.c   |  2 +-
 src/mesa/main/mtypes.h   |  2 +-
 8 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/docs/specs/MESA_shader_integer_mix.spec b/docs/specs/MESA_shader_integer_mix.spec
index d381ddd..f2f903b 100644
--- a/docs/specs/MESA_shader_integer_mix.spec
+++ b/docs/specs/MESA_shader_integer_mix.spec
@@ -1,10 +1,10 @@
 Name
 
-MESA_shader_integer_mix
+EXT_shader_integer_mix
 
 Name Strings
 
-GL_MESA_shader_integer_mix
+GL_EXT_shader_integer_mix
 
 Contact
 
@@ -21,12 +21,12 @@ Status
 
 Version
 
-Last Modified Date: 09/09/2013
-Author Revision:5
+Last Modified Date: 09/12/2013
+Author Revision:6
 
 Number
 
-
+TBD
 
 Dependencies
 
@@ -78,18 +78,18 @@ Modifications to The OpenGL Shading Language Specification, Version 4.40
 Including the following line in a shader can be used to control the
 language features described in this extension:
 
-  #extension GL_MESA_shader_integer_mix : <behavior>
+  #extension GL_EXT_shader_integer_mix : <behavior>
 
 where <behavior> is as specified in section 3.3.
 
 New preprocessor #defines are added to the OpenGL Shading Language:
 
-  #define GL_MESA_shader_integer_mix1
+  #define GL_EXT_shader_integer_mix1
 
 Interactions with ARB_ES3_compatibility
 
 On desktop implementations that support ARB_ES3_compatibility,
-GL_MESA_shader_integer_mix can be enabled (and the new functions
+GL_EXT_shader_integer_mix can be enabled (and the new functions
 used) in shaders declared with '#version 300 es'.
 
 GLX Protocol
@@ -124,6 +124,9 @@ Revision History
 
 Rev.Date  AuthorChanges
     -
+  6   09/12/2013  idr   After discussions in Khronso, change vendor
+prefix to EXT.
+
   5   09/09/2013  idr   Add ARB_ES3_compatibility interaction.
 
   4   09/06/2013  mattst88  Allow extension on OpenGL ES 3.0.
diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index ce78df1..e005a95 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -190,7 +190,7 @@ shader_bit_encoding(const _mesa_glsl_parse_state *state)
 static bool
 shader_integer_mix(const _mesa_glsl_parse_state *state)
 {
-   return v130(state) && state->MESA_shader_integer_mix_enable;
+   return v130(state) && state->EXT_shader_integer_mix_enable;
 }
 
 static bool
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index fb1c1d0..6eaa5f9 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1246,8 +1246,8 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api)
  if (extensions->ARB_shading_language_420pack)
             add_builtin_define(parser, "GL_ARB_shading_language_420pack", 1);
 
- if (extensions->MESA_shader_integer_mix)
-add_builtin_define(parser, "GL_MESA_shader_integer_mix", 1);
+ if (extensions->EXT_shader_integer_mix)
+add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
   }
}
 
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 1e4d7c7..1f5900b 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -517,7 +517,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(ARB_texture_query_lod,  true,  false, ARB_texture_query_lod),
    EXT(ARB_gpu_shader5,true,  false, ARB_gpu_shader5),
    EXT(AMD_vertex_shader_layer,true,  false, AMD_vertex_shader_layer),
-   EXT(MESA_shader_integer_mix,true,   true, MESA_shader_integer_mix),
+   EXT(EXT_shader_integer_mix,true,   true,  EXT_shader_integer_mix),
 };
 
 #undef EXT
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 15abbbc..2e2440a 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -315,8 +315,8 @@ struct _mesa_glsl_parse_state {
bool AMD_vertex_shader_layer_warn;
bool ARB_shading_language_420pack_enable;
bool ARB_shading_language_420pack_warn;
-   bool MESA_shader_i

Re: [Mesa-dev] [PATCH] gallium/dri-targets: hide all symbols except for __driDriverExtensions

2013-09-12 Thread Jakob Bornecrantz
On Thu, Aug 15, 2013 at 7:38 AM, Chia-I Wu  wrote:

> On Thu, Aug 15, 2013 at 1:26 PM, Chia-I Wu  wrote:
> > On Sat, Aug 10, 2013 at 2:56 AM, Marek Olšák  wrote:
> >> Most importantly, this hides all LLVM symbols. They shouldn't clash
> >> with a different LLVM version used by apps (at least in theory).
> >>
> >> $ nm -g --defined-only radeonsi_dri.so
> >> 01148f30 D __driDriverExtensions
> > I am not familiar with issues regarding LLVM symbols so I am fine with
> > the change if this is what needs to be done (except maybe use
> > -export-symbols-regex __driDriverExtensions to avoid the version
> > script?)
> >
> > But I ran the nm command on ilo_dri.so, and almost all of the exported
> > symbols are from libdricommon or st/dri.  I think those two components
> > need VISIBILITY_CFLAGS in their AM_CFLAGS and __driDriverExtensions
> > needs to be marked as PUBLIC.  This way other gallium targets can
> > benefit.
> There is no other gallium target that uses st/dri :)
>
> Anyway, in addition to controlling exported symbols using symbol
> files, I still like to see VISIBILITY_CFLAGS be added to st/dri and
> the dri targets, which directly list source files from libdricommon in
> their SOURCES.  Besides, it seems __driConfigOptions and
> __dri2ConfigOptions are also marked PUBLIC.  Do they need to be
> exported?
>

I tested the v2 patch on vmwgfx: it worked, but driconf complained. After
adding the above two symbols to dri.versions, driconf worked too, so:

Tested-by: Jakob Bornecrantz 
Reviewed-by: Jakob Bornecrantz 
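
The visibility approach discussed in the quoted text (hide everything by default, export only the table the loader looks up) can be sketched as follows. This is a minimal illustration, assuming a GCC-compatible compiler and a build that passes -fvisibility=hidden; `ExampleExtension`, `example_count_extensions` and `example_driver_extensions` are hypothetical names, not the real DRI types or symbols.

```cpp
#include <cassert>
#include <cstddef>

// Mark a symbol as exported even when the file is built with
// -fvisibility=hidden.
#if defined(__GNUC__)
#define PUBLIC __attribute__((visibility("default")))
#else
#define PUBLIC
#endif

// Hypothetical extension record used only for this sketch.
struct ExampleExtension {
    const char *name;
    int version;
};

// Internal linkage: never visible outside this file, whatever the flags.
static const ExampleExtension example_core = {"EXAMPLE_core", 1};
static const ExampleExtension example_config = {"EXAMPLE_config", 2};

// External linkage but not marked PUBLIC: hidden from the dynamic symbol
// table when the build uses -fvisibility=hidden.
std::size_t example_count_extensions(const ExampleExtension *const *list)
{
    assert(list != nullptr);
    std::size_t n = 0;
    while (list[n] != nullptr)
        ++n;
    return n;
}

extern "C" {
// The one table a loader would resolve by name, exported explicitly.
PUBLIC const ExampleExtension *example_driver_extensions[] = {
    &example_core,
    &example_config,
    nullptr,
};
}
```

Built as a shared object with -fvisibility=hidden, `nm -g --defined-only` on the result should then list the exported table but not the helper. A version script or -export-symbols-regex, as mentioned in the thread, reaches a similar result at link time instead.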


[Mesa-dev] [Bug 69285] New: LLVM rendering bug

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69285

  Priority: medium
Bug ID: 69285
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: LLVM rendering bug
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: genpfa...@gmail.com
  Hardware: x86 (IA32)
Status: NEW
   Version: 9.2
 Component: Mesa core
   Product: Mesa

Created attachment 85735
  --> https://bugs.freedesktop.org/attachment.cgi?id=85735&action=edit
SDL 1.2-based test case

A simple OpenGL ES 2.0 checkerboard shader has significantly different output
when LLVM is enabled on softpipe.

I think the non-LLVM output is correct.


$ gcc --version
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

$ uname -a
Linux garmin-vm 3.2.0-53-generic #81-Ubuntu SMP Thu Aug 22 21:03:54 UTC 2013
i686 i686 i386 GNU/Linux

$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 12.04.3 LTS
Release: 12.04
Codename: precise

# llvm 3.3
./configure --disable-assertions

# mesa 9.2.0 llvm enabled
./configure \
--enable-gles2 \
--enable-gallium-egl \
--with-egl-platforms="x11" \
--with-gallium-drivers="swrast" \
--with-dri-drivers="" \
--disable-xvmc \
--enable-gallium-llvm \

# mesa 9.2.0 llvm disabled
./configure \
--enable-gles2 \
--enable-gallium-egl \
--with-egl-platforms="x11" \
--with-gallium-drivers="swrast" \
--with-dri-drivers="" \
--disable-xvmc \
--disable-gallium-llvm \

# build
g++ main.cpp -lSDL -L/usr/local/lib -lEGL -lGLESv2

# run
LD_LIBRARY_PATH=/usr/local/lib EGL_SOFTWARE=true ./a.out

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 69285] LLVM rendering bug

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69285

--- Comment #1 from Charles Huber  ---
Created attachment 85736
  --> https://bugs.freedesktop.org/attachment.cgi?id=85736&action=edit
Output with --disable-gallium-llvm



[Mesa-dev] [Bug 69285] LLVM rendering bug

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69285

--- Comment #2 from Charles Huber  ---
Created attachment 85737
  --> https://bugs.freedesktop.org/attachment.cgi?id=85737&action=edit
Output with --enable-gallium-llvm



[Mesa-dev] [Bug 69285] Enabling LLVM results in substantially different rendering

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69285

Charles Huber  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|LLVM rendering bug          |Enabling LLVM results in
                   |                            |substantially different
                   |                            |rendering



[Mesa-dev] [PATCH] glsl/builtins: Fix {texture1D, texture2D, shadow1D}ArrayLod availibility.

2013-09-12 Thread Paul Berry
These functions are defined in EXT_texture_array, which makes no
mention of what shader types they should be allowed in.  At the time
EXT_texture_array was introduced, functions ending in "Lod" were
available only in vertex shaders; however, this restriction was lifted
in later spec versions and extensions.

We already have the function lod_exists_in_stage() for figuring out
whether functions ending in "Lod" should be available, so just re-use
that.
---
 src/glsl/builtin_functions.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index ce78df1..10fc590 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -214,9 +214,9 @@ gpu_shader5(const _mesa_glsl_parse_state *state)
 }
 
 static bool
-vs_texture_array(const _mesa_glsl_parse_state *state)
+texture_array_lod(const _mesa_glsl_parse_state *state)
 {
-   return state->target == vertex_shader &&
+   return lod_exists_in_stage(state) &&
   state->EXT_texture_array_enable;
 }
 
@@ -1610,7 +1610,7 @@ builtin_builder::create_builtins()
 NULL);
 
add_function("texture1DArrayLod",
-                _texture(ir_txl, vs_texture_array, glsl_type::vec4_type, glsl_type::sampler1DArray_type, 2, glsl_type::vec2_type),
+                _texture(ir_txl, texture_array_lod, glsl_type::vec4_type, glsl_type::sampler1DArray_type, 2, glsl_type::vec2_type),
 NULL);
 
add_function("texture1DProjLod",
@@ -1643,7 +1643,7 @@ builtin_builder::create_builtins()
 NULL);
 
add_function("texture2DArrayLod",
-                _texture(ir_txl, vs_texture_array, glsl_type::vec4_type, glsl_type::sampler2DArray_type, 3, glsl_type::vec3_type),
+                _texture(ir_txl, texture_array_lod, glsl_type::vec4_type, glsl_type::sampler2DArray_type, 3, glsl_type::vec3_type),
 NULL);
 
add_function("texture2DProjLod",
@@ -1726,7 +1726,7 @@ builtin_builder::create_builtins()
 NULL);
 
add_function("shadow1DArrayLod",
-                _texture(ir_txl, vs_texture_array, glsl_type::vec4_type, glsl_type::sampler1DArrayShadow_type, 2, glsl_type::vec3_type),
+                _texture(ir_txl, texture_array_lod, glsl_type::vec4_type, glsl_type::sampler1DArrayShadow_type, 2, glsl_type::vec3_type),
 NULL);
 
add_function("shadow1DProjLod",
-- 
1.8.4



Re: [Mesa-dev] [PATCH] glsl/builtins: Fix {texture1D, texture2D, shadow1D}ArrayLod availibility.

2013-09-12 Thread Kenneth Graunke
On 09/12/2013 11:29 AM, Paul Berry wrote:
> These functions are defined in EXT_texture_array, which makes no
> mention of what shader types they should be allowed in.  At the time
> EXT_texture_array was introduced, functions ending in "Lod" were
> available only in vertex shaders, however this restriction was lifted
> in later spec versions and extensions.
> 
> We already have the function lod_exists_in_stage() for figuring out
> whether functions ending in "Lod" should be available, so just re-use
> that.
> ---
>  src/glsl/builtin_functions.cpp | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

Yeah, I hadn't been thinking about geometry shaders when considering the
deprecated functions.  The lod_exists_in_stage() was also a late addition.

I like this.

Reviewed-by: Kenneth Graunke 
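
The effect of swapping the availability predicate can be sketched with a trimmed-down model. Everything below is an illustration under assumptions: `ParseState` is a hypothetical stand-in for the parser state, only desktop GLSL versions are modeled, and the body of `lod_exists_in_stage` is an assumed rule (vertex stage, or GLSL 1.30 and later, or ARB_shader_texture_lod enabled), not a copy of the function the patch reuses.

```cpp
#include <cassert>

enum class Stage { Vertex, Geometry, Fragment };

// Hypothetical, trimmed-down stand-in for the GLSL parse state.
struct ParseState {
    Stage target;
    unsigned glsl_version; // desktop GLSL only in this model, e.g. 120, 130
    bool ARB_shader_texture_lod_enable;
    bool EXT_texture_array_enable;
};

// Assumed rule for when functions ending in "Lod" are available.
bool lod_exists_in_stage(const ParseState &s)
{
    assert(s.glsl_version >= 110);
    return s.target == Stage::Vertex ||
           s.glsl_version >= 130 ||
           s.ARB_shader_texture_lod_enable;
}

// Old predicate from the patch: vertex shaders only.
bool vs_texture_array(const ParseState &s)
{
    return s.target == Stage::Vertex && s.EXT_texture_array_enable;
}

// New predicate from the patch: defer the stage question to the shared helper.
bool texture_array_lod(const ParseState &s)
{
    return lod_exists_in_stage(s) && s.EXT_texture_array_enable;
}
```

Under this model, a GLSL 1.30 fragment shader with EXT_texture_array enabled is rejected by the old predicate and accepted by the new one, while a GLSL 1.20 fragment shader without ARB_shader_texture_lod is rejected by both.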


Re: [Mesa-dev] The long way to a faster build with shared libs and some fixes ...

2013-09-12 Thread Ian Romanick
On 09/12/2013 02:44 AM, Kenneth Graunke wrote:
> On 09/11/2013 11:41 PM, Christian König wrote:
>> I completely agree.
>>
>> Building everything shared might speed up the build process a little bit
>> and save some space, but at the cost of having to handle a lot of
>> rather small shared libraries, each of which clutters the symbol space
>> of any application using these drivers with a bunch of unnecessary symbols.
>>
>> Building everything as one big blob sounds like the better idea.
>>
>> Christian.
> 
> Not to mention...installing a ton of shared libraries is just asking for
> version mismatch problems.  I've had a /ton/ of problems due to
> mismatching libdricore and i965_dri.so...usually due to rpath shenanigans.

The existing number of shared libraries already makes it a giant pain in
the ass to build and test multiple Mesa versions (master, 9.2.x, 9.1.x,
branches, etc.).  I'm not interested in seeing anything land that
exacerbates that problem.

> If anything, I'd like to get rid of libdricore and build core Mesa and
> the drivers together again.  No more version clashes.  Far fewer symbols
> exported.  LTO for extra performance with no extra effort...
> 
> Faster build times are nice, but...not if it means shipping a ton of
> shared libraries...
> 
> --Ken


Re: [Mesa-dev] [PATCH] mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix

2013-09-12 Thread Ian Romanick
On 09/12/2013 12:01 PM, Roland Scheidegger wrote:
> Am 12.09.2013 18:47, schrieb Ian Romanick:
>> From: Ian Romanick 
>>
>> Everyone at the Khronos meeting was as surprised that GLSL didn't
>> already support this as we were.  Several vendors said they'd ship it,
>> but there didn't seem to be enough interest to put in the effort to make
>> it ARB or KHR.
>>
>> Signed-off-by: Ian Romanick 
>> Cc: Matt Turner 
>> ---
>>  docs/specs/MESA_shader_integer_mix.spec  | 19 +++
>>  src/glsl/builtin_functions.cpp   |  2 +-
>>  src/glsl/glcpp/glcpp-parse.y |  4 ++--
>>  src/glsl/glsl_parser_extras.cpp  |  2 +-
>>  src/glsl/glsl_parser_extras.h|  4 ++--
>>  src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
>>  src/mesa/main/extensions.c   |  2 +-
>>  src/mesa/main/mtypes.h   |  2 +-
>>  8 files changed, 20 insertions(+), 17 deletions(-)
>>
>> diff --git a/docs/specs/MESA_shader_integer_mix.spec 
>> b/docs/specs/MESA_shader_integer_mix.spec
>> index d381ddd..f2f903b 100644
>> --- a/docs/specs/MESA_shader_integer_mix.spec
>> +++ b/docs/specs/MESA_shader_integer_mix.spec
>> @@ -1,10 +1,10 @@
>>  Name
>>  
>> -MESA_shader_integer_mix
>> +EXT_shader_integer_mix
>>  
>>  Name Strings
>>  
>> -GL_MESA_shader_integer_mix
>> +GL_EXT_shader_integer_mix
>>  
>>  Contact
>>  
>> @@ -21,12 +21,12 @@ Status
>>  
>>  Version
>>  
>> -Last Modified Date: 09/09/2013
>> -Author Revision:5
>> +Last Modified Date: 09/12/2013
>> +Author Revision:6
>>  
>>  Number
>>  
>> -
>> +TBD
>>  
>>  Dependencies
>>  
>> @@ -78,18 +78,18 @@ Modifications to The OpenGL Shading Language 
>> Specification, Version 4.40
>>  Including the following line in a shader can be used to control the
>>  language features described in this extension:
>>  
>> -  #extension GL_MESA_shader_integer_mix : 
>> +  #extension GL_EXT_shader_integer_mix : 
>>  
>>  where  is as specified in section 3.3.
>>  
>>  New preprocessor #defines are added to the OpenGL Shading Language:
>>  
>> -  #define GL_MESA_shader_integer_mix1
>> +  #define GL_EXT_shader_integer_mix1
>>  
>>  Interactions with ARB_ES3_compatibility
>>  
>>  On desktop implementations that support ARB_ES3_compatibility,
>> -GL_MESA_shader_integer_mix can be enabled (and the new functions
>> +GL_EXT_shader_integer_mix can be enabled (and the new functions
>>  used) in shaders declared with '#version 300 es'.
>>  
>>  GLX Protocol
>> @@ -124,6 +124,9 @@ Revision History
>>  
>>  Rev.Date  AuthorChanges
>>      
>> -
>> +  6   09/12/2013  idr   After discussions in Khronso, change vendor
>> +prefix to EXT.
> Khronso->Khronos

Oops.

>> +
>>5   09/09/2013  idr   Add ARB_ES3_compatibility interaction.
>>  
>>4   09/06/2013  mattst88  Allow extension on OpenGL ES 3.0.
>> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
>> index ce78df1..e005a95 100644
>> --- a/src/glsl/builtin_functions.cpp
>> +++ b/src/glsl/builtin_functions.cpp
>> @@ -190,7 +190,7 @@ shader_bit_encoding(const _mesa_glsl_parse_state *state)
>>  static bool
>>  shader_integer_mix(const _mesa_glsl_parse_state *state)
>>  {
>> -   return v130(state) && state->MESA_shader_integer_mix_enable;
>> +   return v130(state) && state->EXT_shader_integer_mix_enable;
>>  }
>>  
>>  static bool
>> diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
>> index fb1c1d0..6eaa5f9 100644
>> --- a/src/glsl/glcpp/glcpp-parse.y
>> +++ b/src/glsl/glcpp/glcpp-parse.y
>> @@ -1246,8 +1246,8 @@ glcpp_parser_create (const struct gl_extensions 
>> *extensions, int api)
>>if (extensions->ARB_shading_language_420pack)
>>   add_builtin_define(parser, "GL_ARB_shading_language_420pack", 
>> 1);
>>  
>> -  if (extensions->MESA_shader_integer_mix)
>> - add_builtin_define(parser, "GL_MESA_shader_integer_mix", 1);
>> +  if (extensions->EXT_shader_integer_mix)
>> + add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
>> }
>>  }
>>  
>> diff --git a/src/glsl/glsl_parser_extras.cpp 
>> b/src/glsl/glsl_parser_extras.cpp
>> index 1e4d7c7..1f5900b 100644
>> --- a/src/glsl/glsl_parser_extras.cpp
>> +++ b/src/glsl/glsl_parser_extras.cpp
>> @@ -517,7 +517,7 @@ static const _mesa_glsl_extension 
>> _mesa_glsl_supported_extensions[] = {
>> EXT(ARB_texture_query_lod,  true,  false, 
>> ARB_texture_query_lod),
>> EXT(ARB_gpu_shader5,true,  false, ARB_gpu_shader5),
>> EXT(AMD_vertex_shader_layer,true,  false, 
>> AMD_vertex_shader_layer),
>> -   EXT(MESA_shader_integer_mix,true,   true, 
>> MESA_shader_int

[Mesa-dev] RFC: Fixing FB issues in nv30 gallium

2013-09-12 Thread Ilia Mirkin
Hello,

I sent a patch earlier to add a new PIPE_CAP about disallowing mixed
fb cbuf/zsbuf sizes, which would be used to disable ARB_fbo. I think
people were generally in favor, but I didn't see any actual
Reviewed-By's. I'll resend it with updated nouveau directories (I
guess everything got moved for some reason) and help text that more
accurately describes the situation.

However, as it turns out, that doesn't go quite far enough. The bpp of
cbuf/zsbuf has to match up as well, which apparently can happen even
without ARB_fbo. Currently st_validate_framebuffer has special logic
to deal with PIPE_CAP_MIXED_COLORBUFFER_FORMATS and return
FRAMEBUFFER_UNSUPPORTED as necessary. I was thinking this could be
handled by adding a pipe_screen->validate_fb() hook, and folding the
MIXED_COLORBUFFER_FORMATS stuff into it (along with implementing the
nv30-specific logic). I could alternatively add another
PIPE_CAP, but I think it's getting to be too much.
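
To make the hook idea concrete, here is a rough sketch of what a
driver-side check could look like. Everything in it is hypothetical:
the struct names, the hook signature and the nv30-like rule are
simplified stand-ins, not existing gallium API. It only encodes the two
constraints described above (cbuf/zsbuf sizes must match, and their bpp
must match).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for gallium surface/framebuffer state. */
struct fb_surface {
   unsigned width, height;
   unsigned bytes_per_pixel;
};

struct fb_state {
   unsigned nr_cbufs;
   const struct fb_surface *cbufs[8];
   const struct fb_surface *zsbuf;   /* NULL when no depth/stencil is bound */
};

/* Hypothetical hook type: returns false for an unsupported combination. */
typedef bool (*validate_fb_func)(const struct fb_state *fb);

/* nv30-like rule: every bound cbuf must match zsbuf in size and bpp. */
static bool
nv30_like_validate_fb(const struct fb_state *fb)
{
   assert(fb != NULL);

   if (!fb->zsbuf)
      return true;

   for (unsigned i = 0; i < fb->nr_cbufs; i++) {
      const struct fb_surface *cbuf = fb->cbufs[i];

      if (!cbuf)
         continue;
      if (cbuf->width != fb->zsbuf->width ||
          cbuf->height != fb->zsbuf->height)
         return false;
      if (cbuf->bytes_per_pixel != fb->zsbuf->bytes_per_pixel)
         return false;
   }
   return true;
}
```

The state tracker would then report FRAMEBUFFER_UNSUPPORTED whenever the
hook returns false, and a driver without restrictions could leave the
hook unset.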

Thoughts? Alternate proposals?

Thanks for any feedback,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] gallium-bind-sampler-states branch

2013-09-12 Thread Brian Paul


I just pushed a gallium-bind-sampler-states branch to my git repo at 
git://people.freedesktop.org/~brianp/mesa


It replaces the four 
pipe_context::bind_fragment/vertex/geometry/compute_sampler_states() 
functions with a single bind_sampler_states() function:


 void (*bind_sampler_states)(struct pipe_context *,
 unsigned shader, unsigned start_slot,
 unsigned num_samplers, void **samplers);

At this point start_slot is always zero (at least for non-compute 
shaders).  And as the updated gallium docs explain, at some point calls 
to bind_sampler_states() will be used to update sub-ranges, but that 
never happens currently.
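
For driver authors, a minimal sketch of the sub-range behaviour might
look like the following. The toy_context struct, the slot limits and
the treatment of a NULL samplers pointer as "unbind the range" are
assumptions made for illustration only; they are not taken from the
branch. The point is just that slots outside [start_slot, start_slot +
num_samplers) are left alone.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SHADER_TYPES 4    /* hypothetical: vertex, fragment, geometry, compute */
#define TOY_MAX_SAMPLERS 16   /* hypothetical per-stage limit */

/* Hypothetical driver context holding the currently bound sampler CSOs. */
struct toy_context {
   void *samplers[TOY_SHADER_TYPES][TOY_MAX_SAMPLERS];
};

static void
toy_bind_sampler_states(struct toy_context *ctx,
                        unsigned shader, unsigned start_slot,
                        unsigned num_samplers, void **samplers)
{
   assert(shader < TOY_SHADER_TYPES);
   assert(start_slot + num_samplers <= TOY_MAX_SAMPLERS);

   /* Only touch the requested range; other slots keep their bindings.
    * Assumption: a NULL array unbinds every slot in the range.
    */
   for (unsigned i = 0; i < num_samplers; i++)
      ctx->samplers[shader][start_slot + i] = samplers ? samplers[i] : NULL;
}
```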


I've updated all the drivers, state trackers, utils, etc.

I've tested the svga, llvmpipe and softpipe drivers.  'make check' and a 
texture subset of piglit pass w/out regressions.  I'd appreciate it if 
other driver developers would test their favorite driver.



Next, I'd like to consolidate the 
set_vertex/geometry/fragment/compute_sampler_views() functions with a 
single function.  But I have no idea when I'll get around to that.


-Brian


Re: [Mesa-dev] RFC: Fixing FB issues in nv30 gallium

2013-09-12 Thread Roland Scheidegger
On 13.09.2013 01:09, Ilia Mirkin wrote:
> Hello,
> 
> I sent a patch earlier to add a new PIPE_CAP about disallowing mixed
> fb cbuf/zsbuf sizes, which would be used to disable ARB_fbo. I think
> people were generally in favor, but I didn't see any actual
> Reviewed-By's, I'll resend it with updated nouveau directories (I
> guess everything got moved for some reason) and help text that more
> accurately describes the situation.
> 
> However, as it turns out, that doesn't go quite far enough. The bpp of
> cbuf/zsbuf has to match up as well
your hardware sucks :-).


> , which apparently can happen even
> without ARB_fbo. Currently st_validate_framebuffer has special logic
> to deal with PIPE_CAP_MIXED_COLORBUFFER_FORMATS and return
> FRAMEBUFFER_UNSUPPORTED as necessary. I was thinking this could be
> handled by adding a pipe_screen->validate_fb() hook, and folding the
> MIXED_COLORBUFFER_FORMATS stuff into it (as well as implementing the
> nv30-specific logic as well). I could alternatively add another
> PIPE_CAP, but I think it's getting to be too much.
> 
> Thoughts? Alternate proposals?
> 
> Thanks for any feedback,
> 

Theoretically this sounds like the right approach, since currently
PIPE_CAP_MIXED_COLORBUFFER_FORMATS just enforces that color formats are
all the same - but this may not really be what the hw limitation is
(even wrt just color buffers). I could imagine some hw supporting some
but not all possible combinations. The framebuffer_unsupported stuff is
all entirely implementation dependent after all (though I guess in the
EXT_fbo version the only possibility to hit that is indeed with
non-matching color/zs combinations since different color formats are
explicitly prohibited).
I'm not sure though it's worth having this in the pipe interface since
it only seems to affect some fairly arcane hw, but I could live with it.

Roland


Re: [Mesa-dev] gallium-bind-sampler-states branch

2013-09-12 Thread Chia-I Wu
Hi Brian,

On Fri, Sep 13, 2013 at 8:46 AM, Brian Paul  wrote:
>
> I just pushed a gallium-bind-sampler-states branch to my git repo at
> git://people.freedesktop.org/~brianp/mesa
>
> It replaces the four
> pipe_context::bind_fragment/vertex/geometry/compute_sampler_states()
> functions with a single bind_sampler_states() function:
>
>  void (*bind_sampler_states)(struct pipe_context *,
>  unsigned shader, unsigned start_slot,
>  unsigned num_samplers, void **samplers);
>
> At this point start_slot is always zero (at least for non-compute shaders).
> And as the updated gallium docs explain, at some point calls to
> bind_sampler_states() will be used to update sub-ranges, but that never
> happens currently.
>
> I've updated all the drivers, state trackers, utils, etc.
>
> I've tested the svga, llvmpipe and softpipe drivers.  'make check' and a
> texture subset of piglit pass w/out regressions.  I'd appreciate it if other
> driver developers would test their favorite driver.
For ilo, the new code does not follow the doc and unbinds samplers not in range.

Is it fine if I implement the new bind_sampler_states as a helper
function on master branch, so that you hook it up to
pipe_context::bind_sampler_states in your branch and remove the old
ones?

>
> Next, I'd like to consolidate the
> set_vertex/geometry/fragment/compute_sampler_views() functions with a single
> function.  But I have no idea when I'll get around to that.
>
> -Brian



-- 
o...@lunarg.com


Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Chia-I Wu
On Thu, Sep 12, 2013 at 10:48 PM, Ian Romanick  wrote:
> On 09/12/2013 01:06 AM, Chris Forbes wrote:
>> Can we make this approximation conditional on an image-quality control
>> in driconf [or somewhere else]?
>
> There's already a control that applications can use:
> GL_FRAGMENT_SHADER_DERIVATIVE_HINT.  I don't know whether or not /any/
> app has ever used it.  The default setting is GL_DONT_CARE, so,
> technically speaking, we could do this optimization whenever the hint
> isn't GL_NICEST.  Though, we may want a driconf override anyway.  Hmm...
How about, in generate_ddx():

  if (brw->ctx.Hint.FragmentShaderDerivative == GL_NICEST ||
  brw->accurate_ddx) {
 // current code
  }
  else {
 // new code
  }

That is, when the app doesn't care, we treat it as GL_FASTEST.  If the
user cares, they can set the new drirc option, accurate_ddx, to true to
override.  accurate_ddx is false by default.
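
Pulled out of the driver, the selection logic can be sketched as a
small stand-alone predicate. The function name and the boolean
parameter are hypothetical stand-ins for brw->accurate_ddx and the
context hint; the GL_* values are the standard ones from the OpenGL
headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hint values as defined in the OpenGL headers. */
#define GL_DONT_CARE 0x1100
#define GL_FASTEST   0x1101
#define GL_NICEST    0x1102

/* Returns true when the current (more accurate) DDX code should be used.
 * GL_DONT_CARE and GL_FASTEST both fall through to the faster path unless
 * the hypothetical drirc option forces accuracy.
 */
static bool
use_accurate_ddx(unsigned derivative_hint, bool accurate_ddx_option)
{
   assert(derivative_hint == GL_DONT_CARE ||
          derivative_hint == GL_FASTEST ||
          derivative_hint == GL_NICEST);

   return derivative_hint == GL_NICEST || accurate_ddx_option;
}
```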

>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>>> From: Chia-I Wu 
>>>
>>> Replicate the gradient of the top-left pixel to the other three pixels in 
>>> the
>>> subspan, as how DDY is implemented.  Before, different gradients were used 
>>> for
>>> pixels in the top row and pixels in the bottom row.
>>>
>>> This change results in a less accurate approximation.  However, it improves
>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>>> observed.
>>>
>>> No piglit gpu.tests regressions.
>>>
>>> I failed to come up with an explanation for the performance difference.  The
>>> change does not make a difference on Ivy Bridge either.  If anyone has the
>>> insight, please kindly enlighten me.  Performance differences may also be
>>> observed on other games that call textureGrad and dFdx.
>>>
>>> Signed-off-by: Chia-I Wu 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> index bfb3d33..c0d24a0 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct 
>>> brw_reg dst, struct brw_reg src
>>>  void
>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct 
>>> brw_reg src)
>>>  {
>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on 
>>> Haswell,
>>> +* which gives much better performance when the result is used with
>>> +* sample_d
>>> +*/
>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>> +  BRW_VERTICAL_STRIDE_2;
>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>> +BRW_WIDTH_2;
>>> +
>>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>  BRW_REGISTER_TYPE_F,
>>> -BRW_VERTICAL_STRIDE_2,
>>> -BRW_WIDTH_2,
>>> +vstride,
>>> +width,
>>>  BRW_HORIZONTAL_STRIDE_0,
>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>  BRW_REGISTER_TYPE_F,
>>> -BRW_VERTICAL_STRIDE_2,
>>> -BRW_WIDTH_2,
>>> +vstride,
>>> +width,
>>>  BRW_HORIZONTAL_STRIDE_0,
>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>> brw_ADD(p, dst, src0, negate(src1));
>>> --
>>> 1.8.3.1
>>>
>



-- 
o...@lunarg.com


Re: [Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

2013-09-12 Thread Chris Forbes
Sounds good to me.

On Fri, Sep 13, 2013 at 3:11 PM, Chia-I Wu  wrote:
> On Thu, Sep 12, 2013 at 10:48 PM, Ian Romanick  wrote:
>> On 09/12/2013 01:06 AM, Chris Forbes wrote:
>>> Can we make this approximation conditional on an image-quality control
>>> in driconf [or somewhere else]?
>>
>> There's already a control that applications can use:
>> GL_FRAGMENT_SHADER_DERIVATIVE_HINT.  I don't know whether or not /any/
>> app has ever used it.  The default setting is GL_DONT_CARE, so,
>> technically speaking, we could do this optimization whenever the hint
>> isn't GL_NICEST.  Though, we may want a driconf override anyway.  Hmm...
> How about, in generate_ddx():
>
>   if (brw->ctx.Hint.FragmentShaderDerivative == GL_NICEST ||
>   brw->accurate_ddx) {
>  // current code
>   }
>   else {
>  // new code
>   }
>
> That is, when the app doesn't care, we treat it as GL_FASTEST.  If the
> user cares, they can set the new drirc option, accurate_ddx, to true to
> override.  accurate_ddx is false by default.
>
>>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu  wrote:
>>>> From: Chia-I Wu
>>>>
>>>> Replicate the gradient of the top-left pixel to the other three pixels in
>>>> the
>>>> subspan, as how DDY is implemented.  Before, different gradients were used
>>>> for
>>>> pixels in the top row and pixels in the bottom row.
>>>>
>>>> This change results in a less accurate approximation.  However, it improves
>>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202%
>>>> (at
>>>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>>>> observed.
>>>>
>>>> No piglit gpu.tests regressions.
>>>>
>>>> I failed to come up with an explanation for the performance difference.
>>>> The
>>>> change does not make a difference on Ivy Bridge either.  If anyone has the
>>>> insight, please kindly enlighten me.  Performance differences may also be
>>>> observed on other games that call textureGrad and dFdx.
>>>>
>>>> Signed-off-by: Chia-I Wu
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +
>>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> index bfb3d33..c0d24a0 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct
>>>> brw_reg dst, struct brw_reg src
>>>>  void
>>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct
>>>> brw_reg src)
>>>>  {
>>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on
>>>> Haswell,
>>>> +* which gives much better performance when the result is used with
>>>> +* sample_d
>>>> +*/
>>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>>> +  BRW_VERTICAL_STRIDE_2;
>>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>>> +BRW_WIDTH_2;
>>>> +
>>>> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>>  BRW_REGISTER_TYPE_F,
>>>> -BRW_VERTICAL_STRIDE_2,
>>>> -BRW_WIDTH_2,
>>>> +vstride,
>>>> +width,
>>>>  BRW_HORIZONTAL_STRIDE_0,
>>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>>  BRW_REGISTER_TYPE_F,
>>>> -BRW_VERTICAL_STRIDE_2,
>>>> -BRW_WIDTH_2,
>>>> +vstride,
>>>> +width,
>>>>  BRW_HORIZONTAL_STRIDE_0,
>>>>  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>> brw_ADD(p, dst, src0, negate(src1));
>>>> --
>>>> 1.8.3.1
>>>>
>>
>
>
>
> --
> o...@lunarg.com


[Mesa-dev] [PATCH] i965/hsw: compute DDX in a subspan based only on top row

2013-09-12 Thread Chia-I Wu
From: Chia-I Wu 

Consider only the top-left and top-right pixels to approximate DDX in a 2x2
subspan, unless the application or the user requests a more accurate
approximation.  This results in a less accurate approximation.  However, it
improves the performance of Xonotic with Ultra settings by 24.3879% +/-
0.832202% (at 95.0% confidence) on Haswell.  No noticeable image quality
difference observed.
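
As an illustration of what the two register regions select, here is a
small CPU-side model. It assumes the usual <vstride; width, hstride>
region addressing (channel i reads element start + (i / width) *
vstride + (i % width) * hstride); the helper names are made up for this
sketch and it is not driver code. The source layout is the one from the
comment in brw_fs_emit.cpp: ss0.tl ss0.tr ss0.bl ss0.br ss1.tl ss1.tr
ss1.bl ss1.br.

```c
#include <assert.h>

/* Element read by a given channel for a <vstride; width, hstride> region. */
static unsigned
region_element(unsigned start, unsigned vstride, unsigned width,
               unsigned hstride, unsigned channel)
{
   assert(width > 0);
   return start + (channel / width) * vstride + (channel % width) * hstride;
}

/* Model of the single ADD: dst = src0 - src1, where src0 starts at
 * element 1 (tr) and src1 at element 0 (tl), both with hstride = 0.
 */
static void
model_ddx(const float src[8], float dst[8], unsigned vstride, unsigned width)
{
   for (unsigned ch = 0; ch < 8; ch++) {
      unsigned right = region_element(1, vstride, width, 0, ch);
      unsigned left = region_element(0, vstride, width, 0, ch);

      dst[ch] = src[right] - src[left];
   }
}
```

With vstride = 2 and width = 2 each row of a subspan gets its own
difference (tr - tl for the top pair, br - bl for the bottom pair);
with vstride = 4 and width = 4 all four pixels of a subspan get tr - tl.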

No piglit gpu.tests regressions (tested with v1)

I failed to come up with an explanation for the performance difference, as the
change does not affect Ivy Bridge.  If anyone has the insight, please kindly
enlighten me.  Performance differences may also be observed on other games
that call textureGrad and dFdx.

v2: Honor GL_FRAGMENT_SHADER_DERIVATIVE_HINT and add a drirc option.  Update
comments.

Signed-off-by: Chia-I Wu 
---
 src/mesa/drivers/dri/i965/brw_context.c   |  1 +
 src/mesa/drivers/dri/i965/brw_context.h   |  1 +
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 40 ---
 src/mesa/drivers/dri/i965/intel_screen.c  |  4 
 4 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 4fcc9fb..1cdfb9d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -470,6 +470,7 @@ brwCreateContext(int api,
brw_draw_init( brw );
 
brw->precompile = driQueryOptionb(&brw->optionCache, "shader_precompile");
+   brw->accurate_derivative = driQueryOptionb(&brw->optionCache, 
"accurate_derivative");
 
ctx->Const.ContextFlags = 0;
if ((flags & __DRI_CTX_FLAG_FORWARD_COMPATIBLE) != 0)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index c566bba..8bfc54a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -964,6 +964,7 @@ struct brw_context
bool always_flush_cache;
bool disable_throttling;
bool precompile;
+   bool accurate_derivative;
 
driOptionCache optionCache;
/** @} */
diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index bfb3d33..69aeab1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -540,7 +540,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg 
dst, struct brw_reg src
  *
  * arg0: ss0.tl ss0.tr ss0.bl ss0.br ss1.tl ss1.tr ss1.bl ss1.br
  *
- * and we're trying to produce:
+ * Ideally, we want to produce:
  *
  *   DDX DDY
  * dst: (ss0.tr - ss0.tl) (ss0.tl - ss0.bl)
@@ -556,24 +556,48 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg 
dst, struct brw_reg src
  *
  * For DDX, it ends up being easy: width = 2, horiz=0 gets us the same result
  * for each pair, and vertstride = 2 jumps us 2 elements after processing a
- * pair. But for DDY, it's harder, as we want to produce the pairs swizzled
- * between each other.  We could probably do it like ddx and swizzle the right
- * order later, but bail for now and just produce
+ * pair.  But the ideal approximation of DDX may impose a huge performance
+ * cost on sample_d.  As such, we favor ((ss0.tr - ss0.tl)x4 (ss1.tr -
+ * ss1.tl)x4) unless the app or the user requests otherwise.
+ *
+ * For DDY, it's harder, as we want to produce the pairs swizzled between each
+ * other.  We could probably do it like ddx and swizzle the right order later,
+ * but bail for now and just produce
  * ((ss0.tl - ss0.bl)x4 (ss1.tl - ss1.bl)x4)
  */
 void
 fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg 
src)
 {
+   unsigned vstride, width;
+
+   /* Produce accurate result only when requested.  We emit only one
+* instruction for either case, but the problem is the result may affect
+* how fast sample_d executes.
+*
+* Since the performance difference is only observed on Haswell, ignore the
+* hints on other GENs for now.
+*/
+   if (!brw->is_haswell ||
+   brw->ctx.Hint.FragmentShaderDerivative == GL_NICEST ||
+   brw->accurate_derivative) {
+  vstride = BRW_VERTICAL_STRIDE_2;
+  width = BRW_WIDTH_2;
+   }
+   else {
+  vstride = BRW_VERTICAL_STRIDE_4;
+  width = BRW_WIDTH_4;
+   }
+
struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
 BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_2,
-BRW_WIDTH_2,
+vstride,
+width,
 BRW_HORIZONTAL_STRIDE_0,
 BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
 BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_2,
-BRW_WIDTH_2,
+vstride,
+