This pass flips (matrix * vector) operations to (vector *
matrixTranspose) for certain built-in matrices (currently
gl_ModelViewProjectionMatrix and gl_TextureMatrix).
This is equivalent, but results in dot products rather than multiplies
and adds. On some hardware, this is more efficient.
This
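The equivalence described in this message can be sketched in plain C. This is only an illustration, not the pass itself: the `mat4`/`vec4` types and every function name below are hypothetical, and the column-major layout is an assumption borrowed from OpenGL convention.

```c
/* Hypothetical types for illustration; column-major storage: m[col][row]. */
typedef struct { float m[4][4]; } mat4;
typedef struct { float v[4]; } vec4;

/* Four-component dot product: the work a single DP4 performs. */
static float dot4(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* matrix * vector: scale each column by one vector component and
 * accumulate, i.e. one multiply followed by three multiply/add steps. */
static vec4 mat_mul_vec(const mat4 *M, const vec4 *x)
{
    vec4 r;
    for (int row = 0; row < 4; row++)
        r.v[row] = M->m[0][row] * x->v[0];
    for (int col = 1; col < 4; col++)
        for (int row = 0; row < 4; row++)
            r.v[row] += M->m[col][row] * x->v[col];
    return r;
}

/* Plain transpose: swap the roles of rows and columns. */
static mat4 transpose(const mat4 *M)
{
    mat4 T;
    for (int col = 0; col < 4; col++)
        for (int row = 0; row < 4; row++)
            T.m[col][row] = M->m[row][col];
    return T;
}

/* vector * matrix: each result component is one dot product of the
 * vector with a column of the (already transposed) matrix. */
static vec4 vec_mul_mat(const vec4 *x, const mat4 *T)
{
    vec4 r;
    for (int col = 0; col < 4; col++)
        r.v[col] = dot4(x->v, T->m[col]);
    return r;
}
```

Calling `mat_mul_vec(&M, &x)` and `vec_mul_mat(&x, &T)` with `T = transpose(&M)` yields the same four components; only the shape of the arithmetic differs.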
Doing matrix multiplies with DP4s is fewer instructions than MUL/ADD,
especially since we don't support MAD in the vertex shader.
Not observed to improve performance in any fixed function applications,
but is useful for the next patch.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i96
On 03/31/2013 02:10 AM, Chris Forbes wrote:
This series implements ARB_texture_gather in core mesa, and the
driver side for Gen7 i965.
Not quite baked -- green/blue/alpha texture swizzles with VS don't
work yet. Everything else works, though (R/0/1 swizzles in VS; all
swizzles in FS; textureGath
On 03/31/2013 04:01 PM, Matt Turner wrote:
On Sun, Mar 31, 2013 at 2:10 AM, Chris Forbes wrote:
Signed-off-by: Chris Forbes
---
src/mesa/drivers/dri/i965/brw_context.c | 1 +
src/mesa/drivers/dri/intel/intel_extensions.c | 4
2 files changed, 5 insertions(+)
diff --git a/src/me
On 03/31/2013 02:10 AM, Chris Forbes wrote:
Pretty much the same as the FS case. Channel select goes in the header,
post-sampling swizzle only does the 0/1 cases.
Signed-off-by: Chris Forbes
---
src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
src/mesa/drivers/dri/i965/brw_vec4_emit.cp
On 03/31/2013 02:10 AM, Chris Forbes wrote:
Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to
SHADER_OPCODE_TG4.
The usual post-sampling swizzle workaround can't work for ir_tg4,
so avoid doing that:
* For R/G/B/A swizzles use the hardware channel select (lives in the
s
On 03/31/2013 02:10 AM, Chris Forbes wrote:
From: Maxence Le Dore
---
src/mapi/glapi/gen/ARB_texture_gather.xml | 14 ++
src/mapi/glapi/gen/gl_API.xml | 2 +-
src/mesa/main/context.c | 4
src/mesa/main/extensions.c| 1 +
sr
On 04/02/2013 01:38 PM, Matt Turner wrote:
---
src/mesa/program/register_allocate.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/mesa/program/register_allocate.c
b/src/mesa/program/register_allocate.c
index a9064c3..7d11b73 100644
--- a/src/mesa/program/regist
This still fails, since 8192*4bpp == 32768, which is too big to use the
blitter on.
Reviewed-by: Kenneth Graunke
---
src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 21 +
1 file changed, 21 insertions(+)
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
b/src/m
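The arithmetic in this message (8192 pixels at 4 bytes per pixel giving a 32768-byte row) can be written out as a small check. This is a sketch under assumptions: the helper names are hypothetical, and treating "strictly below 32768 bytes" as the blitter limit is inferred from the message, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for illustration only: the message says a 32768-byte
 * pitch is already too big, so the pitch must stay strictly below it. */
#define EXAMPLE_BLIT_PITCH_LIMIT 32768u

/* Bytes per row for a tightly packed surface. */
static uint32_t row_pitch_bytes(uint32_t width_px, uint32_t bytes_per_pixel)
{
    return width_px * bytes_per_pixel;
}

/* Hypothetical helper: can a surface with this pitch go through the blitter? */
static bool pitch_fits_blitter(uint32_t pitch)
{
    return pitch < EXAMPLE_BLIT_PITCH_LIMIT;
}
```

With these definitions, an 8192-wide surface at 4 bytes per pixel fails the check, while a 4096-wide one passes.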
Doing so was breaking miptree mapping, which we really need to be able to
handle. With this change, intel_miptree_map_direct() falls through to
doing a CPU mapping on the buffer like we need.
With the previous 2 patches, all of these should be fixed:
Bugzilla: https://bugs.freedesktop.org/show_bu
This will be used for handling updates of large textures.
Reviewed-by: Kenneth Graunke
---
src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 25 ++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
b/src/mesa/dri
From: Roland Scheidegger
This is trivial now, though we need to make sure we pass all the necessary
derivative values (3 each for ddx/ddy, not 2).
Untested (no piglit test); however, since the transform works the same
as for implicit derivatives, this should probably work correctly.
---
src/galliu
From: Roland Scheidegger
This proved to be tricky: the problem is that after selection/mirroring
we cannot calculate reasonable derivatives (if not all pixels in a quad
end up on the same face the derivatives could get "randomly" exceedingly
large).
However, it is actually quite easy to simply ca
From: Roland Scheidegger
Using a different packing for the single coord case should save a shuffle.
Plus some minor style fixes.
---
src/gallium/auxiliary/gallivm/lp_bld_quad.c | 20 +++-
src/gallium/auxiliary/gallivm/lp_bld_sample.c | 31 +++--
2 files chan
Brian Paul writes:
> On 04/02/2013 04:16 PM, Paul Berry wrote:
>> GCC 4.8 now warns about typedefs that are local to a scope and not
>> used anywhere within that scope. This produces spurious warnings with
>> the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
>> error in the ev
On Tue, Apr 2, 2013 at 4:19 PM, Christian König wrote:
> diff --git a/configure.ac b/configure.ac
> index 81d4a3f..93ec1d2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1814,6 +1814,7 @@ if test "x$with_gallium_drivers" != x; then
> if test "x$enable_r600_llvm" = xyes -o "x$e
On Tue, Apr 2, 2013 at 4:48 PM, Eric Anholt wrote:
> Matt Turner writes:
>
>> The original goal of pre-register allocation scheduling was to reduce
>> live ranges so we'd use fewer registers and hopefully fit into 16-wide.
>> In shader-db, this change causes us to lose 30 16-wide programs, but we
From: Marek Olšák
Ported from r600g commit:
8891b2f9c91b2f6c8625184c23a10b8e55875dc0
NOTE: This is a candidate for the stable branches.
---
src/gallium/drivers/radeonsi/r600_blit.c | 12
1 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/src/gallium/drivers/radeonsi
Matt Turner writes:
> The original goal of pre-register allocation scheduling was to reduce
> live ranges so we'd use fewer registers and hopefully fit into 16-wide.
> In shader-db, this change causes us to lose 30 16-wide programs, but we
> gain 29... so it's a toss-up. At least by choosing inst
On 04/02/2013 05:07 PM, srol...@vmware.com wrote:
From: Roland Scheidegger
Should be way faster of course on cpus supporting this (includes AMD
Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge
From: Roland Scheidegger
Should be way faster of course on cpus supporting this (includes AMD
Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.
---
src/gallium/auxiliary/gallivm/lp_bld_conv.c
On 03/30/2013 07:27 AM, Zack Rusin wrote:
Signed-off-by: Zack Rusin
---
src/gallium/auxiliary/draw/draw_gs.c |4
1 file changed, 4 deletions(-)
diff --git a/src/gallium/auxiliary/draw/draw_gs.c
b/src/gallium/auxiliary/draw/draw_gs.c
index b98b133..70db837 100644
--- a/src/gallium/au
On 04/02/2013 04:16 PM, Paul Berry wrote:
GCC 4.8 now warns about typedefs that are local to a scope and not
used anywhere within that scope. This produces spurious warnings with
the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
error in the event of an assertion failure).
Th
GCC 4.8 now warns about typedefs that are local to a scope and not
used anywhere within that scope. This produces spurious warnings with
the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
error in the event of an assertion failure).
This patch avoids the warning using the GCC __
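The preview above is cut off, so the exact fix is not visible here. As a rough sketch of the general idea, a typedef-based static assertion can be marked with GCC's `unused` attribute so that an unused local typedef does not trigger the warning. The macro names below are hypothetical and are not Mesa's actual definitions; the choice of `__attribute__((unused))` is an assumption.

```c
/* Hypothetical names, chosen to avoid clashing with Mesa's STATIC_ASSERT. */
#if defined(__GNUC__)
#define EXAMPLE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_UNUSED
#endif

/* A negative array size is a compile error, so the typedef only compiles
 * when cond is true. The typedef itself is never used, which is what the
 * unused-local-typedef warning would otherwise complain about. */
#define EXAMPLE_STATIC_ASSERT(cond)                                   \
    do {                                                              \
        typedef int example_static_assertion_failed[(cond) ? 1 : -1] \
            EXAMPLE_UNUSED;                                           \
    } while (0)

/* Usage: both assertions hold on any conforming C implementation. */
static int check_sizes(void)
{
    EXAMPLE_STATIC_ASSERT(sizeof(char) == 1);
    EXAMPLE_STATIC_ASSERT(sizeof(int) >= 2);
    return 1;
}
```

Changing either condition to something false makes the translation unit fail to compile, which is the behaviour the macro relies on.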
On Tue, Apr 02, 2013 at 01:38:07PM -0700, Matt Turner wrote:
> ---
Nice catch. Will this change have any effect on the compiled code?
-Tom
> src/mesa/program/register_allocate.c |2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/program/register_allocate.c
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.
---
src/gallium/drivers/r600/r600_shader.c | 45 --
1 file changed, 32 insertions(+), 13 deletions(-)
diff --git a/src/gallium/drivers/r600/r600_shade
---
src/mesa/program/register_allocate.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/mesa/program/register_allocate.c
b/src/mesa/program/register_allocate.c
index a9064c3..7d11b73 100644
--- a/src/mesa/program/register_allocate.c
+++ b/src/mesa/program/register_a
This clarifies that the offset of 2 is actually 16 kB / 8 kB units.
It also keys both computations off of a single variable, which should
make it easier to change in the future.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/gen7_urb.c | 5 +++--
1 file changed, 3 insertions(+), 2 d
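The unit conversion mentioned in this message can be made explicit with a couple of lines of C. This is a sketch only: the constant and function names are hypothetical, and the only facts taken from the message are the 16 kB size, the 8 kB unit, and the resulting offset of 2.

```c
/* Hypothetical names; only the 16 kB and 8 kB figures come from the message. */
enum { EXAMPLE_PUSH_SIZE_KB = 16 };  /* the single variable both values derive from */
enum { EXAMPLE_URB_UNIT_KB = 8 };    /* offsets are expressed in 8 kB units */

/* Size of the allocation, in kB. */
static unsigned example_push_size_kb(void)
{
    return EXAMPLE_PUSH_SIZE_KB;
}

/* Offset of whatever follows that allocation, in 8 kB units. */
static unsigned example_next_offset_units(void)
{
    return EXAMPLE_PUSH_SIZE_KB / EXAMPLE_URB_UNIT_KB;
}
```

Because both values come from `EXAMPLE_PUSH_SIZE_KB`, changing the size later keeps the offset consistent without touching a second magic number.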
These variables are only used within a single function, so we may as
well make them local variables.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/brw_context.h | 9 -
src/mesa/drivers/dri/i965/gen6_urb.c| 18 +-
src/mesa/drivers/dri/i965/gen7_urb.c
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Let's add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.
Signed-off-by: Zack Rusin
---
src/gallium/drivers/llvmpipe/lp_state_so.c |6 ++
1 file changed, 6 insertions(+)
diff --git a/src/gallium/drivers/llvmpipe/lp_state_so.c
b/src
we use draw_set_mapped_so_targets nowadays
Signed-off-by: Zack Rusin
---
src/gallium/auxiliary/draw/draw_context.c |7 ---
src/gallium/auxiliary/draw/draw_context.h |5 -
2 files changed, 12 deletions(-)
diff --git a/src/gallium/auxiliary/draw/draw_context.c
b/src/gallium/auxil
I think this was there before and got accidentally
removed during a merge. Same code as for the GS
context, which is also using an enum instead of
hardcoded numbers.
Signed-off-by: Zack Rusin
---
src/gallium/auxiliary/draw/draw_llvm.c |8
src/gallium/auxiliary/draw/draw_llvm.h | 17
Signed-off-by: Zack Rusin
---
src/gallium/auxiliary/draw/draw_gs.c |4
1 file changed, 4 deletions(-)
diff --git a/src/gallium/auxiliary/draw/draw_gs.c
b/src/gallium/auxiliary/draw/draw_gs.c
index b98b133..70db837 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/aux
On 1 April 2013 11:43, Kenneth Graunke wrote:
> On 04/01/2013 11:30 AM, Ian Romanick wrote:
>
>> On 03/29/2013 02:13 PM, Paul Berry wrote:
>>
>>> Mesa constant-folds built-in functions by using a miniature GLSL
>>> interpreter (see
>>> ir_function_signature::constant_expression_evaluate_
>>>
This is the same computation as the _WriteEnabled flag, so we may as
well use it.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/gen6_depthstencil.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/gen6_depthstencil.c
b/src/mesa/dr
ctx->Stencil.WriteMask is a statically sized array of 3 elements.
Checking it against 0 actually is a NULL check, and can never fail,
which meant that we always said stencil writes were enabled.
Use the new core Mesa derived state flag to fix this.
NOTE: This is a candidate for stable branches.
S
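The pitfall described in this message is easy to reproduce in isolation. The struct and helpers below are hypothetical stand-ins, not Mesa code, and the element-wise check is only an illustration of testing values rather than the address; it is not the derived state flag the patch refers to.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a context with a fixed-size mask array. */
struct example_stencil {
    unsigned WriteMask[3];
};

/* Buggy pattern: comparing the array itself against 0. The array decays
 * to a pointer to its first element, which is never NULL, so this is
 * always true no matter what the masks contain. The decay is written
 * out explicitly here to make that visible. */
static bool writes_enabled_buggy(const struct example_stencil *s)
{
    const unsigned *decayed = s->WriteMask;
    return decayed != 0;
}

/* Value-based check: look at the mask contents instead of the address. */
static bool writes_enabled_by_value(const struct example_stencil *s)
{
    for (int i = 0; i < 3; i++) {
        if (s->WriteMask[i] != 0)
            return true;
    }
    return false;
}
```

With every mask set to zero, the buggy version still reports that writes are enabled, while the value-based version does not.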
i965 needs to know whether stencil writes are enabled in several places,
and gets the test wrong sometimes. While we could create a function to
compute this, it seems generally useful enough to warrant a new piece of
derived state. Also, all the plumbing is already in place.
NOTE: This is a cand
On 04/01/2013 11:25 AM, Ian Romanick wrote:
From: Ian Romanick
A future commit will try to use this function in a different file.
Signed-off-by: Ian Romanick
Title of patch should be "check_builtin_array_max_size" (typo). I was
wondering what a check_build_array function would do :)
I'm
On 04/01/2013 11:25 AM, Ian Romanick wrote:
From: Ian Romanick
Previously the shader
uniform float x[6];
void main() { gl_Position.x = x[1.0]; }
would have generated the errors
0:2(33): error: array index must be integer type
0:2(36): error: array index must be < 6
Now only
0:2(33): error:
On 03/26/2013 09:54 PM, Paul Berry wrote:
This patch consolidates duplicate code in the brw_depthbuffer and
gen7_depthbuffer state atoms. Previously, these state atoms contained
5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
Gen4-6 and 2 for Gen7). Also a lot of logic for
On Tue, 2013-04-02 at 10:20 +0200, Michel Dänzer wrote:
> On Mon, 2013-04-01 at 14:11 -0700, Tom Stellard wrote:
> > From: Tom Stellard
> >
> > Building libradeonllvm as a shared object has led to a number of bugs
> > and build system complications, and I don't think it's necessary for
> > such
https://bugs.freedesktop.org/show_bug.cgi?id=62868
Brian Paul changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
Looks good to me in principle, but I have some remarks (inline) concerning the
implementation.
Jose
- Original Message -
> From: Adhemerval Zanella
>
> Reviewed-by: Adam Jackson
> Signed-off-by: Adhemerval Zanella
> ---
> src/gallium/auxiliary/gallivm/lp_bld_printf.c | 56
> +++
On 04/02/2013 08:43 AM, Christoph Bumiller wrote:
On 02.04.2013 16:39, Brian Paul wrote:
On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
NOTE: Changed the semantic index for the drawtex coordinate to
be the texture unit index instead of always 0.
Not sure if this is correct but since the valu
I don't see need/benefit in mixing "iround" (ie, float -> int) with "round"
(ie, float -> float).
If this is a one-off, then you should just call
lp_build_intrinsic_unary(builder, "llvm.ppc.altivec.vctsxs", ...)
If you really need a generic intrinsic helper for iround, then please add a new
On 02.04.2013 16:39, Brian Paul wrote:
> On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
>> NOTE: Changed the semantic index for the drawtex coordinate to
>> be the texture unit index instead of always 0.
>> Not sure if this is correct but since the value seems to depend
>> on the unit it would m
On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
NOTE: Changed the semantic index for the drawtex coordinate to
be the texture unit index instead of always 0.
Not sure if this is correct but since the value seems to depend
on the unit it would make sense to use different varying slots.
Tested-
From: Brian Paul
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.
---
src/gallium/drivers/svga/svga_context.h|9 +
src/gallium/drivers/svga/svga_pipe_draw.c |3 +++
src/gallium/drivers/svga/svga_pipe_query.c | 2
From: Brian Paul
This is in preparation for adding new query types for the HUD.
---
src/gallium/drivers/svga/svga_pipe_query.c | 218
1 file changed, 124 insertions(+), 94 deletions(-)
diff --git a/src/gallium/drivers/svga/svga_pipe_query.c
b/src/gallium/drivers/s
On 04/01/2013 07:36 PM, Marek Olšák wrote:
This allows using L8 and R8 for the font if I8 isn't supported.
---
src/gallium/auxiliary/hud/hud_context.c | 36 +--
1 file changed, 30 insertions(+), 6 deletions(-)
Tested-by: Brian Paul
- Original Message -
> From: Roland Scheidegger
>
> Conceptually the same as previously done in float_to_half.
> Should cut down number of instructions from 14 to 10 or so, but
> will promote some NaNs to Infs, so it's disabled.
> It gets a bit tricky though handling all the cases corre
- Original Message -
> On 03/29/2013 05:30 PM, Brian Paul wrote:
>
> Has this bug been reported to the Topogun developer?
Yes, I have reported via http://www.topogun.com/support/contact-us.htm on 29th
September 2012.
I received no reply since, nor did I try a second time.
Also note tha
From: Christian König
v2: fix intrinsic name as well
Signed-off-by: Christian König
---
src/gallium/drivers/radeonsi/radeonsi_shader.c | 19 +++
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c
b/src/gallium/drive
https://bugs.freedesktop.org/show_bug.cgi?id=62921
Roland Scheidegger changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
On Wed, 2013-03-27 at 16:35 +0100, Christian König wrote:
> From: Christian König
>
> v2: reduce key size, don't copy key around too much.
> v3: remove key size reduction
>
> Signed-off-by: Christian König
Reviewed-by: Michel Dänzer
--
Earthling Michel Dänzer |
Reported-by: `per` in #intel-gfx
The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.
NOTE: This is a candidate for stable branches.
V2: Don't leave silly holes in structure; use unsigned instead
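The idea in this message, keeping the real key size next to the key blob and comparing both, might look roughly like the following. The struct and function names are hypothetical and this is not the driver's cache code; it only illustrates why a size check has to precede the byte comparison when keys vary in length.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical cache entry: the blob plus its actual length in bytes. */
struct example_cache_item {
    const void *key;
    unsigned key_size;
};

/* Two keys match only if they have the same length and the same bytes.
 * Checking the stored size first avoids comparing (or reading past) a
 * shorter stored key using the caller's larger size. */
static bool example_key_matches(const struct example_cache_item *item,
                                const void *key, unsigned key_size)
{
    return item->key_size == key_size &&
           memcmp(item->key, key, key_size) == 0;
}
```

A lookup that passes a longer key whose prefix happens to equal a stored shorter key is rejected by the size test before `memcmp` is ever called.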
On Mon, 2013-04-01 at 14:11 -0700, Tom Stellard wrote:
> From: Tom Stellard
>
> Building libradeonllvm as a shared object has led to a number of bugs
> and build system complications, and I don't think it's necessary for
> such a small library.
>
> This library was originally changed to a share