Re: [Mesa-dev] [PATCH 2/2] glsl: keep track of intra-stage indices for atomics

2015-10-15 Thread Timothy Arceri
On Sun, 2015-10-04 at 18:45 +1100, Timothy Arceri wrote:
> This is more optimal as it means we no longer have to upload the same
> set
> of Atomic Buffer Object surfaces to all stages in the program.
> 
> This also fixes a bug where, since commit c0cd5b, var->data.binding was
> being used as a replacement for the atomic buffer index. They don't have
> to be the same value; they just happened to end up the same when
> binding is 0.
> 
> Cc: Francisco Jerez 
> Cc: Ilia Mirkin 
> Cc: Alejandro Piñeiro 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> ---


Hi guys,

Any thoughts on this patch?

Thanks,
Tim
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/vec4: dead_code_eliminate: update writemask on null_regs based on flag_live

2015-10-15 Thread Francisco Jerez
Alejandro Piñeiro  writes:

> ---
>
> This patch implements the idea proposed by Francisco Jerez. With this
> change, even adding the new condition pointed out by Matt Turner on the
> "2/5 i965/vec4: adding vec4_cmod_propagation optimization", the shader-db
> numbers remain the same. So this patch would go before the optimization
> (so in this series it would be patch 1.5).
>
> Note: I'm not resending patch 2/5, as Matt pointed out that he granted
> the reviewed status with his suggested change. I can send it if needed
> in any case.
>
>  .../drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp| 16 
> +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> index 8fc7a36..31ea128 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> @@ -78,13 +78,19 @@ vec4_visitor::dead_code_eliminate()
>   sizeof(BITSET_WORD));
>  
>foreach_inst_in_block_reverse(vec4_instruction, inst, block) {
> - if (inst->dst.file == GRF && !inst->has_side_effects()) {
> + if ((inst->dst.file == GRF && !inst->has_side_effects()) ||
> + (inst->dst.is_null() && inst->writes_flag())){
>  bool result_live[4] = { false };
>  
> -for (unsigned i = 0; i < inst->regs_written; i++) {
> -   for (int c = 0; c < 4; c++)
> -  result_live[c] |= BITSET_TEST(
> - live, var_from_reg(alloc, offset(inst->dst, i), c));
> +if (inst->dst.file == GRF) {
> +   for (unsigned i = 0; i < inst->regs_written; i++) {
> +  for (int c = 0; c < 4; c++)
> + result_live[c] |= BITSET_TEST(
> +live, var_from_reg(alloc, offset(inst->dst, i), c));
> +   }
> +} else {
> +   for (unsigned c = 0; c < 4; c++)
> +  result_live[c] |= BITSET_TEST(flag_live, c);

Sadly, flag liveness is not tracked per component, i.e. the flag_live
bitset and the flag live-out bitset calculated by liveness analysis have
only one bit representing the union of all components. This won't work
unless you fix that too.

>  }
>  
>  /* If the instruction can't do writemasking, then it's all or
> -- 
> 2.1.4
>
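
To illustrate the point about flag liveness: if only a single bit records
whether the flag register is live, the most a pass can derive from it is an
all-or-nothing writemask. The following is a minimal sketch under that
assumption; conservative_flag_writemask is a hypothetical helper, not a
function in the i965 backend, and it is not the fix the thread settles on.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only.
 *
 * With one liveness bit for the whole flag register there is no
 * per-channel information, so every channel has to be treated the
 * same way: all live or all dead.
 */
static unsigned
conservative_flag_writemask(bool flag_is_live)
{
   /* 0xF selects channels x, y, z and w; 0x0 selects none. */
   return flag_is_live ? 0xFu : 0x0u;
}
```

Narrowing the writemask of a flag-writing instruction to individual channels
would need one liveness bit per channel, which is what the reply above asks
for.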


Re: [Mesa-dev] [PATCH 1/3] gallium: add PIPE_CAP_SHAREABLE_SHADERS

2015-10-15 Thread Marek Olšák
Ping

On Sun, Oct 11, 2015 at 3:09 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> I'll let drivers figure out how to do it.
> ---
>  src/gallium/docs/source/screen.rst   | 2 ++
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c   | 1 +
>  src/gallium/drivers/ilo/ilo_screen.c | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
>  src/gallium/drivers/r300/r300_screen.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c   | 1 +
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  15 files changed, 16 insertions(+)
>


Re: [Mesa-dev] [PATCH] Fix the incorrect path of sse_minmax.c

2015-10-15 Thread Emil Velikov
On 12 October 2015 at 16:36, Chih-Wei Huang  wrote:
> Signed-off-by: Chih-Wei Huang 
> ---
>  src/mesa/Android.libmesa_dricore.mk | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/Android.libmesa_dricore.mk 
> b/src/mesa/Android.libmesa_dricore.mk
> index 2e308b8..fef76c8 100644
> --- a/src/mesa/Android.libmesa_dricore.mk
> +++ b/src/mesa/Android.libmesa_dricore.mk
> @@ -50,7 +50,7 @@ endif # MESA_ENABLE_ASM
>  ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
>  LOCAL_SRC_FILES += \
> main/streaming-load-memcpy.c \
> -   mesa/main/sse_minmax.c
> +   main/sse_minmax.c
Ouch ... seems like I broke this with 669cfc267a1
Added a fixes + stable tag and pushed to master.

Thank you Chih-Wei
-Emil


Re: [Mesa-dev] [PATCH] android: build i965_compile_FILES sources

2015-10-15 Thread Emil Velikov
On 11 October 2015 at 12:49, Mauro Rossi  wrote:
> i965_compile_FILES need to be built in order to avoid the following build
> errors:
>
> target SharedLib: i915_dri 
> (out/target/product/x86/obj/SHARED_LIBRARIES/i915_dri_intermediates/LINKED/i915_dri.so)
> external/mesa/src/mesa/drivers/dri/i965/brw_ir_fs.h:181: error: undefined 
> reference to 'fs_inst::~fs_inst()'
> ...
> ...
> external/mesa/src/mesa/drivers/dri/i965/intel_screen.c:1484: error: undefined 
> reference to 'brw_compiler_create'
> collect2: error: ld returned 1 exit status
> build/core/shared_library.mk:81: recipe for target 
> 'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so'
>  failed
> make: *** 
> [out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so]
>  Error 1
> ---
Thanks Mauro.

I slightly touched the commit message and pushed to master.

-Emil


Re: [Mesa-dev] [PATCH] nir/glsl: Use shader_prog->Name for naming the NIR shader

2015-10-15 Thread Neil Roberts
Ping, could you please push this patch? It's a pain to use the optimise
debug output without it. Thanks.

Reviewed-by: Neil Roberts 

- Neil

Jason Ekstrand  writes:

> This has the better name to use. Apparently, sh->Name is usually 0.
> ---
>  src/glsl/nir/glsl_to_nir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index 6e1dd84..3284bdc 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -150,7 +150,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
>if (sh->Program->SamplersUsed & (1 << i))
>   num_textures = i;
>  
> -   shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
> +   shader->info.name = ralloc_asprintf(shader, "GLSL%d", shader_prog->Name);
> if (shader_prog->Label)
>shader->info.label = ralloc_strdup(shader, shader_prog->Label);
> shader->info.num_textures = num_textures;
> -- 
> 2.5.0.400.gff86faf
>


Re: [Mesa-dev] [PATCH] egl/dri2: expose srgb configs when KHR_gl_colorspace is available

2015-10-15 Thread Emil Velikov
On 3 October 2015 at 12:19, Emil Velikov  wrote:
> On 3 October 2015 at 02:12, Marek Olšák  wrote:
>> I'm not sure if this is correct or if we should just return NULL in
>> this case like the "case" statement above that does.
>>
> Actually I was thinking about bailing out when the requested attribute
> is set. I.e.
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 1740ee3..0450269 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -237,6 +237,8 @@ dri2_add_config(_EGLDisplay *disp, const
> __DRIconfig *dri_config, int id,
>
>   case __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE:
>  srgb = value != 0;
> + if (!dpy->Extensions.KHR_gl_colorspace && srgb)
> +return NULL;
>  break;
>
>   default:
>
Guys, can anyone give this patch a quick test? AFAICT it is very
uncommon to get here, but it still does the right thing.

Thanks
Emil


>
> -Emil


[Mesa-dev] [Bug 92467] Program for dumping images crashes at OSMesa library giving floating exception in Linux(OpenSuse 13.2 and Centos 6.6)

2015-10-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92467

Bug ID: 92467
   Summary: Program for dumping images crashes at OSMesa library
giving floating exception in Linux(OpenSuse 13.2 and
Centos 6.6)
   Product: Mesa
   Version: 10.3
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: pawan24ghildi...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

We have CFD code which uses VTK for dumping images. We have compiled VTK with
onscreen rendering without Mesa and with offscreen rendering with Mesa. The
build with onscreen rendering works fine, but the one with offscreen rendering
with Mesa gives a segmentation fault in certain cases. It works fine for many
cases, but with one case it fails randomly. The message I am getting from Mesa
is shown below.

It fails both with the system-installed Mesa (openSUSE 13.2) and with Mesa
11.0.3 and 10.3.2. A similar error occurs on CentOS 6.6.


Configured using these options:

./configure \
CXXFLAGS="-O2 -g -DDEFAULT_SOFTWARE_DEPTH_BITS=31" \
CFLAGS="-O2 -g " \
--enable-shared  \
--disable-static \
--enable-texture-float \
--enable-osmesa  \
--disable-gallium-llvm \
--disable-dri \
--disable-egl \
--disable-glx \
--with-gallium-drivers="swrast" \
--prefix=$MESA_DIR

=

Program received signal SIGFPE, Arithmetic exception.
general_triangle (ctx=0x2ec10620, v0=, v1=,
v2=0x319f17a0) at swrast/s_tritemp.h:439
439   span.attrStepY[attr][c] = oneOverArea * (eMaj.dx *
eBot_da - eMaj_da * eBot.dx);
Missing separate debuginfos, use: zypper install
libICE6-debuginfo-1.0.9-2.1.3.x86_64 libSM6-debuginfo-1.2.2-4.1.2.x86_64
libX11-6-debuginfo-1.6.2-5.1.2.x86_64 libXau6-debuginfo-1.0.8-5.1.2.x86_64
libXext6-debuginfo-1.3.3-2.1.2.x86_64
libpciaccess0-debuginfo-0.13.2-4.1.2.x86_64
libuuid1-debuginfo-2.25.1-13.1.x86_64 libxcb1-debuginfo-1.11-2.1.2.x86_64
libz1-debuginfo-1.2.8-5.1.2.x86_64
(gdb) bt
#0  general_triangle (ctx=0x2ec10620, v0=, v1=,
v2=0x319f17a0) at swrast/s_tritemp.h:439
#1  0x7fffd27ebb4c in triangle_twoside_rgba (ctx=,
e0=, e1=, e2=) at
swrast_setup/ss_tritmp.h:176
#2  0x7fffd27a522d in _tnl_render_poly_elts (ctx=0x2ec10620, start=0,
count=3, flags=48) at tnl/t_vb_rendertmp.h:352
#3  0x7fffd27abbf9 in _tnl_RenderClippedPolygon (ctx=,
elts=, n=) at tnl/t_vb_render.c:246
#4  0x7fffd27a91ce in clip_tri_4 (ctx=ctx@entry=0x2ec10620,
v0=v0@entry=816, v1=v1@entry=817, v2=v2@entry=818, mask=) at
tnl/t_vb_cliptmp.h:259
#5  0x7fffd27aa1e8 in clip_render_triangles_verts (ctx=0x2ec10620,
start=, count=1024, flags=) at
tnl/t_vb_rendertmp.h:182
#6  0x7fffd27a5811 in run_render (ctx=0x2ec10620, stage=) at
tnl/t_vb_render.c:322
#7  0x7fffd279a8cd in _tnl_run_pipeline (ctx=0x2ec10620) at
tnl/t_pipeline.c:241
#8  0x7fffd2799f33 in _tnl_draw_prims (ctx=0x2ec10620, prim=0x2de201e0,
nr_prims=1, ib=0x0, index_bounds_valid=, min_index=0,
max_index=1023, tfb_vertcount=0x0, stream=0, indirect=0x0) at tnl/t_draw.c:520
#9  0x7fffd279772f in vbo_save_playback_vertex_list (ctx=0x2ec10620,
data=0x2fcab688) at vbo/vbo_save_draw.c:310
#10 0x7fffd2660072 in ext_opcode_execute (node=0x2fcab684, ctx=0x2ec10620)
at main/dlist.c:666
#11 execute_list (ctx=0x2ec10620, list=) at main/dlist.c:7756
#12 0x7fffd267540a in _mesa_CallList (list=1) at main/dlist.c:9121
#13 0x7fffda54bf81 in
vtkOpenGLDisplayListPainter::RenderInternal(vtkRenderer*, vtkActor*, unsigned
long, bool) () from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingOpenGL-6.2.so.1
#14 0x7fffda54a9ab in
vtkOpenGLClipPlanesPainter::RenderInternal(vtkRenderer*, vtkActor*, unsigned
long, bool) () from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingOpenGL-6.2.so.1
#15 0x7fffda574183 in
vtkOpenGLScalarsToColorsPainter::RenderInternal(vtkRenderer*, vtkActor*,
unsigned long, bool) () from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingOpenGL-6.2.so.1
#16 0x7fffda57fbca in vtkPainterPolyDataMapper::RenderPiece(vtkRenderer*,
vtkActor*) () from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingOpenGL-6.2.so.1
#17 0x7fffe19c41ff in vtkPolyDataMapper::Render(vtkRenderer*, vtkActor*) ()
from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingCore-6.2.so.1
#18 0x7fffda548ef4 in vtkOpenGLActor::Render(vtkRenderer*, vtkMapper*) ()
from
/home/ren2/pawan/OpenFOAM/ThirdParty-dev/VTK/offscreen/VTK-6.2.0/lib/libvtkRenderingOpenGL-6.2.so.1
#19 0x7fffe194b348 in 

Re: [Mesa-dev] [PATCH 0/4] i965: skip control-flow aware liveness analysis if we only have 1 block

2015-10-15 Thread Iago Toral
On Wed, 2015-10-14 at 11:08 -0700, Jordan Justen wrote:
> On 2015-10-13 22:49:08, Iago Toral wrote:
> > On Tue, 2015-10-13 at 09:44 -0700, Jordan Justen wrote:
> > > On 2015-10-13 05:17:37, Francisco Jerez wrote:
> > > > Iago Toral Quiroga  writes:
> > > > 
> > > > > This fixes the following test:
> > > > >
> > > > > [require]
> > > > > GL >= 3.3
> > > > > GLSL >= 3.30
> > > > > GL_ARB_shader_storage_buffer_object
> > > > >
> > > > > [fragment shader]
> > > > > #version 330
> > > > > #extension GL_ARB_shader_storage_buffer_object: require
> > > > >
> > > > > buffer SSBO {
> > > > > mat4 sm4;
> > > > > };
> > > > >
> > > > >
> > > > > mat4 um4;
> > > > >
> > > > > void main() {
> > > > > sm4 *= um4;
> > > > 
> > > > This is using the value of "um4", which is never assigned, so liveness
> > > > analysis will correctly extend its live interval up to the start of the
> > > > block.
> > > 
> > > This test was derived by simplifying a CTS test case.
> > > 
> > > Anyway, I'm not sure what happened on the way to the commit message,
> > > but um4 should be a uniform.
> > > 
> > > http://sprunge.us/cEUe
> > 
> > Oh yes, that was me playing around with the example. The patches also
> > fix the uniform version. Jordan, can you verify if this fixes the CTS
> > test case?
> 
> Unfortunately, no. The CTS case has some control flow. I had removed
> it to minimize the test case.

Yes, if it has control flow then my patch doesn't help.

> Here is a small shader_test that has control flow and still fails to
> compile with your patches:
> 
> http://sprunge.us/LIjA

Thanks, I am experimenting with a ssbo load cache (so we do not emit the
same ssbo load when we know there are no ssbo writes that could've
altered the underlying buffer) that should help the test. It is not
complete yet but it seems that this alone should be able to get the test
to pass.

> > In any case, since Curro is working on a more general fix for this
> > stuff, I guess you'd rather wait for his patches...
> 
> It depends how long we'd have to wait. :) Anyway, since we don't have
> a short-term fix anyhow, let's wait to see what curro has to say.

I think my patch will still take a couple of days, but I have just
tested Curro's hack and that seems to work for this test, so hopefully
we can use that?

Iago

> -Jordan
> 




Re: [Mesa-dev] [PATCH] glsl: Enable split of lower UBOs and SSBO also for compute shaders

2015-10-15 Thread Iago Toral
On Wed, 2015-10-14 at 16:18 +0300, Francisco Jerez wrote:
> "Lofstedt, Marta"  writes:
> 
> > I have found a couple more places in linker.cpp where we loop up to
> > MESA_SHADER_FRAGMENT.
> > Should these now also be up to MESA_SHADER_COMPUTE instead?
> >
> Some might be oversights like this, but I guess in some cases a loop up
> to MESA_SHADER_FRAGMENT might be the right thing to do when doing stuff
> like linking varyings that really only applies to the graphics pipeline
> and not to compute shaders.

Right, I imagine that if any of these should also include the CS stage
we'll probably know via broken CS tests.

Iago
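
The inclusive loop bound discussed above can be sketched as follows. The enum
and count_visited_stages are hypothetical stand-ins (they are not Mesa's
gl_shader_stage or the linker code); the only assumption carried over from the
thread is that the compute stage is ordered after the fragment stage.

```c
/* Hypothetical stage list, for illustration only. Only the relative
 * order matters: compute comes after fragment.
 */
enum ex_stage {
   EX_STAGE_VERTEX,
   EX_STAGE_GEOMETRY,
   EX_STAGE_FRAGMENT,
   EX_STAGE_COMPUTE,
   EX_STAGE_COUNT
};

/* Count the stages in [first, last] (inclusive) that are linked.
 * Bit i of linked_mask is set when stage i has a linked shader.
 */
static unsigned
count_visited_stages(unsigned linked_mask,
                     enum ex_stage first, enum ex_stage last)
{
   unsigned visited = 0;

   for (unsigned i = first; i <= last; i++) {
      if (linked_mask & (1u << i))
         visited++;
   }
   return visited;
}
```

For a compute-only program (linked_mask == 1u << EX_STAGE_COMPUTE), a loop
that stops at EX_STAGE_FRAGMENT visits no linked stage at all, so any
per-stage setup done in the loop body is skipped for the compute shader.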

> > /Marta
> >
> >> -Original Message-
> >> From: Marta Lofstedt [mailto:marta.lofst...@linux.intel.com]
> >> Sent: Wednesday, October 14, 2015 2:56 PM
> >> To: mesa-dev@lists.freedesktop.org
> >> Cc: Lofstedt, Marta; Marta Lofstedt
> >> Subject: [PATCH] glsl: Enable split of lower UBOs and SSBO also for compute
> >> shaders
> >> 
> >> From: Marta Lofstedt 
> >> 
> >> The split of uniform blocks and shader storage blocks only loops up to
> >> MESA_SHADER_FRAGMENT and ignores compute shaders.
> >> This causes a segfault when running the OpenGL ES 3.1 CTS tests with
> >> GL_ARB_compute_shader enabled.
> >> 
> >> Signed-off-by: Marta Lofstedt 
> >> ---
> >>  src/glsl/linker.cpp | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 
> >> c61c76e..5b5d6e6
> >> 100644
> >> --- a/src/glsl/linker.cpp
> >> +++ b/src/glsl/linker.cpp
> >> @@ -4392,7 +4392,7 @@ link_shaders(struct gl_context *ctx, struct
> >> gl_shader_program *prog)
> >>  * for gl_shader_program and gl_shader, so that drivers that need 
> >> separate
> >>  * index spaces for each set can have that.
> >>  */
> >> -   for (unsigned i = MESA_SHADER_VERTEX; i <=
> >> MESA_SHADER_FRAGMENT; i++) {
> >> +   for (unsigned i = MESA_SHADER_VERTEX; i <= MESA_SHADER_COMPUTE;
> >> i++)
> >
> > Remove new line.
> >
> >> + {
> >>if (prog->_LinkedShaders[i] != NULL) {
> >>   gl_shader *sh = prog->_LinkedShaders[i];
> >>   split_ubos_and_ssbos(sh,
> >> --
> >> 2.1.4
> >


[Mesa-dev] [PATCH 2/3] i965/fs: use the right number of UBOs

2015-10-15 Thread Iago Toral Quiroga
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 05f3f63..7afcd5b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1434,7 +1434,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   */
  brw_mark_surface_used(prog_data,
stage_prog_data->binding_table.ubo_start +
-   nir->info.num_ssbos - 1);
+   nir->info.num_ubos - 1);
   }
 
   if (has_indirect) {
-- 
1.9.1



[Mesa-dev] [PATCH 1/3] nir: Get the number of SSBOs and UBOs right

2015-10-15 Thread Iago Toral Quiroga
Before d31f98a272e429d and 56e2bdbca36a20 we had a single index space for UBOs
and SSBOs, so NumBufferInterfaceBlocks would contain the combined number of
blocks, not just one kind. This means that for shader programs using both
UBOs and SSBOs, we were setting num_ssbos and num_ubos to a larger number than
we should. Since the above commits we have separate index spaces for each,
so we can just get the right numbers.
---
 src/glsl/nir/glsl_to_nir.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 6f67b1d..b705f66 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -152,9 +152,9 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
 
shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
shader->info.num_textures = num_textures;
-   shader->info.num_ubos = sh->NumBufferInterfaceBlocks;
+   shader->info.num_ubos = sh->NumUniformBlocks;
shader->info.num_abos = shader_prog->NumAtomicBuffers;
-   shader->info.num_ssbos = shader_prog->NumBufferInterfaceBlocks;
+   shader->info.num_ssbos = sh->NumShaderStorageBlocks;
shader->info.num_images = sh->NumImages;
shader->info.inputs_read = sh->Program->InputsRead;
shader->info.outputs_written = sh->Program->OutputsWritten;
-- 
1.9.1



[Mesa-dev] [PATCH 3/3] i965/vec4: Use the right number of UBOs

2015-10-15 Thread Iago Toral Quiroga
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 0025f36..ea1e3e7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -765,7 +765,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   */
  brw_mark_surface_used(&prog_data->base,
prog_data->base.binding_table.ubo_start +
-   nir->info.num_ssbos - 1);
+   nir->info.num_ubos - 1);
   }
 
   unsigned const_offset = instr->const_index[0];
-- 
1.9.1



Re: [Mesa-dev] [PATCH 07/10] mesa: fix incorrect error string in _mesa_BlendEquationiARB()

2015-10-15 Thread Brian Paul

On 10/14/2015 06:54 PM, Eric Anholt wrote:
> Brian Paul  writes:
>
>> ---
>>   src/mesa/main/blend.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/blend.c b/src/mesa/main/blend.c
>> index d225f3d..f14949f 100644
>> --- a/src/mesa/main/blend.c
>> +++ b/src/mesa/main/blend.c
>> @@ -407,7 +407,7 @@ _mesa_BlendEquationiARB(GLuint buf, GLenum mode)
>> buf, _mesa_enum_to_string(mode));
>>
>>  if (buf >= ctx->Const.MaxDrawBuffers) {
>> -  _mesa_error(ctx, GL_INVALID_VALUE, "glBlendFuncSeparatei(buffer=%u)",
>> +  _mesa_error(ctx, GL_INVALID_VALUE, "glBlendFuncEquationi(buffer=%u)",
>> buf);
>> return;
>
> The other strings in this function say "glBlendEquationi".  I think you
> meant that, instead?

Yes, of course.  My typo was worse than the original!

-Brian





Re: [Mesa-dev] [PATCH 02/10] mesa: short-cut new_state == _NEW_LINE in _mesa_update_state_locked()

2015-10-15 Thread Brian Paul

On 10/14/2015 06:57 PM, Eric Anholt wrote:
> Brian Paul  writes:
>
>> We can skip to the end of _mesa_update_state_locked() if only the
>> _NEW_LINE flag is set since none of the derived state depends on it
>> (just like _NEW_CURRENT_ATTRIB).  Note that we still call the
>> ctx->Driver.UpdateState() function, of course.
>> ---
>>   src/mesa/main/state.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
>> index d3b1c72..7fa7da2 100644
>> --- a/src/mesa/main/state.c
>> +++ b/src/mesa/main/state.c
>> @@ -392,7 +392,8 @@ _mesa_update_state_locked( struct gl_context *ctx )
>>  GLbitfield prog_flags = _NEW_PROGRAM;
>>  GLbitfield new_prog_state = 0x0;
>>
>> -   if (new_state == _NEW_CURRENT_ATTRIB)
>> +   if (new_state == _NEW_CURRENT_ATTRIB ||
>> +   new_state == _NEW_LINE)
>> goto out;
>
> Perhaps something like:
>
> GLbitfield computed_states = ~(_NEW_CURRENT_ATTRIB | _NEW_LINE);
>
> if (!(new_state & computed_states))
> goto out;
>
> making the optimization slightly more general and more self-documenting?

Good idea.  I'll fix that before committing.

-Brian

> Either way, other than the comment on #7,
>
> Reviewed-by: Eric Anholt 
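
The mask-based check suggested above can be sketched as below. The EX_NEW_*
values are made up for illustration (Mesa's real _NEW_* bits differ), and
can_skip_derived_state is a hypothetical helper rather than code from state.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty-state bits, for illustration only. */
#define EX_NEW_CURRENT_ATTRIB (1u << 0)
#define EX_NEW_LINE           (1u << 1)
#define EX_NEW_TEXTURE        (1u << 2)

/* Returns true when new_state contains only flags that no derived
 * state depends on, so the derived-state update can be skipped.
 */
static bool
can_skip_derived_state(uint32_t new_state)
{
   /* Every bit except the two "no derived state" flags. */
   const uint32_t computed_states =
      ~(uint32_t)(EX_NEW_CURRENT_ATTRIB | EX_NEW_LINE);

   return (new_state & computed_states) == 0;
}
```

Unlike the two equality tests in the patch, the mask form also takes the
early exit when both flags are set at the same time.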





Re: [Mesa-dev] [PATCH v2] i965/vec4: Add unit tests for cmod propagation pass

2015-10-15 Thread Matt Turner
On Wed, Oct 14, 2015 at 2:11 PM, Alejandro Piñeiro  wrote:
> This includes the same tests coming from test_fs_cmod_propagation
> (non-vector GLSL types included), plus some new ones with vec4 types,
> inspired by the regressions found while the optimization was a work in
> progress.
>
> Additionally, the check of the number of instructions after the
> optimization was changed from EXPECT_EQ to ASSERT_EQ. This was done to
> avoid a crash on failing tests that expected no optimization, as after
> checking the number of instructions, there were some checks related to
> the last instruction's opcode/conditional mod.
>
> v2: include new tests to handle the case where inst and scan_inst have
> different writemasks
> ---
>  src/mesa/drivers/dri/i965/Makefile.am  |   7 +
>  .../dri/i965/test_vec4_cmod_propagation.cpp| 820 
> +
>  2 files changed, 827 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/test_vec4_cmod_propagation.cpp
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
> b/src/mesa/drivers/dri/i965/Makefile.am
> index 2e24151..63228a5 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -58,6 +58,7 @@ TESTS = \
> test_fs_saturate_propagation \
>  test_eu_compact \
> test_vf_float_conversions \
> +   test_vec4_cmod_propagation \
>  test_vec4_copy_propagation \
>  test_vec4_register_coalesce
>
> @@ -93,6 +94,12 @@ test_vec4_copy_propagation_LDADD = \
>  $(top_builddir)/src/gtest/libgtest.la \
>  $(TEST_LIBS)
>
> +test_vec4_cmod_propagation_SOURCES = \
> +   test_vec4_cmod_propagation.cpp
> +test_vec4_cmod_propagation_LDADD = \
> +   $(top_builddir)/src/gtest/libgtest.la \
> +   $(TEST_LIBS)
> +
>  test_eu_compact_SOURCES = \
> test_eu_compact.c
>  nodist_EXTRA_test_eu_compact_SOURCES = dummy.cpp
> diff --git a/src/mesa/drivers/dri/i965/test_vec4_cmod_propagation.cpp 
> b/src/mesa/drivers/dri/i965/test_vec4_cmod_propagation.cpp
> new file mode 100644
> index 000..2d9f6c7
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/test_vec4_cmod_propagation.cpp
> @@ -0,0 +1,820 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Based on test_fs_cmod_propagation.cpp
> + */
> +
> +#include 
> +#include "brw_vec4.h"
> +#include "brw_vec4_builder.h"
> +#include "brw_cfg.h"
> +#include "program/program.h"
> +
> +using namespace brw;
> +
> +class cmod_propagation_test : public ::testing::Test {
> +   virtual void SetUp();
> +
> +public:
> +   struct brw_compiler *compiler;
> +   struct brw_device_info *devinfo;
> +   struct gl_context *ctx;
> +   struct gl_shader_program *shader_prog;
> +   struct brw_vertex_program *vp;
> +   vec4_visitor *v;
> +};
> +
> +class cmod_propagation_vec4_visitor : public vec4_visitor
> +{
> +public:
> +   cmod_propagation_vec4_visitor(struct brw_compiler *compiler,
> + nir_shader *shader)
> +  : vec4_visitor(compiler, NULL, NULL, NULL, shader, NULL,
> + false, -1) {}
> +
> +protected:
> +   /* Dummy implementation for pure virtual methods */
> +   virtual dst_reg *make_reg_for_system_value(int location,
> +  const glsl_type *type)
> +   {
> +  unreachable("Not reached");
> +   }
> +
> +   virtual void setup_payload()
> +   {
> +  unreachable("Not reached");
> +   }
> +
> +   virtual void emit_prolog()
> +   {
> +  unreachable("Not reached");
> +   }
> +
> +   virtual void emit_program_code()
> +   {
> +  unreachable("Not reached");
> +   }
> +
> +   virtual void emit_thread_end()
> +   {
> +  unreachable("Not reached");
> +   }
> +
> +   virtual void emit_urb_write_header(int mrf)
> +   {
> +  unreachable("Not reached");
> 

Re: [Mesa-dev] [PATCH 1/3] nv50/ir: use C++11 standard std::unordered_map if possible

2015-10-15 Thread Ilia Mirkin
This patch and the nv30 one are both

Reviewed-by: Ilia Mirkin 

I guess adding a cc: stable makes sense for these too? Or are further
fixes required that would make building 11.0.x impractical?

On Thu, Oct 15, 2015 at 11:46 AM, Chih-Wei Huang
 wrote:
> Note that Android versions before Lollipop are not supported.
>
> Signed-off-by: Chih-Wei Huang 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index 400b9f0..7859c8e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -25,10 +25,24 @@
>
>  #include 
>  #include 
> +#if __cplusplus >= 201103L
> +#include 
> +#else
>  #include 
> +#endif
>
>  namespace nv50_ir {
>
> +#if __cplusplus >= 201103L
> +using std::hash;
> +using std::unordered_map;
> +#elif !defined(ANDROID)
> +using std::tr1::hash;
> +using std::tr1::unordered_map;
> +#else
> +#error Android release before Lollipop is not supported!
> +#endif
> +
>  #define MAX_REGISTER_FILE_SIZE 256
>
>  class RegisterSet
> @@ -349,12 +363,12 @@ RegAlloc::PhiMovesPass::needNewElseBlock(BasicBlock *b, 
> BasicBlock *p)
>
>  struct PhiMapHash {
> size_t operator()(const std::pair& val) 
> const {
> -  return std::tr1::hash()(val.first) * 31 +
> - std::tr1::hash()(val.second);
> +  return hash()(val.first) * 31 +
> + hash()(val.second);
> }
>  };
>
> -typedef std::tr1::unordered_map<
> +typedef unordered_map<
> std::pair, Value *, PhiMapHash> PhiMap;
>
>  // Critical edges need to be split up so that work can be inserted along
> --
> 1.9.1
>


Re: [Mesa-dev] [PATCH 1/3] nir: Get the number of SSBOs and UBOs right

2015-10-15 Thread Iago Toral
On Thu, 2015-10-15 at 08:06 -0700, Jason Ekstrand wrote:
> On Thu, Oct 15, 2015 at 12:18 AM, Iago Toral Quiroga  
> wrote:
> > Before d31f98a272e429d and 56e2bdbca36a20 we had a single index space for
> > UBOs and SSBOs, so NumBufferInterfaceBlocks would contain the combined
> > number of blocks, not just one kind. This means that for shader programs
> > using both UBOs and SSBOs, we were setting num_ssbos and num_ubos to a
> > larger number than we should. Since the above commits we have separate
> > index spaces for each, so we can just get the right numbers.
> 
> Shouldn't this patch go after the other two?  It seems like once we
> have this patch, we'll no longer be marking the right number of UBO's
> as used (as per the other two) but since NumBufferInterfaceBlocks is
> probably bigger than NumShaderStorageBlocks, it should be safe to do
> the other two first.

Yes, you're right. I'll push this one last.
Thanks!

> In any case, all three are
> 
> Reviewed-by: Jason Ekstrand 
> 
> > ---
> >  src/glsl/nir/glsl_to_nir.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> > index 6f67b1d..b705f66 100644
> > --- a/src/glsl/nir/glsl_to_nir.cpp
> > +++ b/src/glsl/nir/glsl_to_nir.cpp
> > @@ -152,9 +152,9 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
> >
> > shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
> > shader->info.num_textures = num_textures;
> > -   shader->info.num_ubos = sh->NumBufferInterfaceBlocks;
> > +   shader->info.num_ubos = sh->NumUniformBlocks;
> > shader->info.num_abos = shader_prog->NumAtomicBuffers;
> > -   shader->info.num_ssbos = shader_prog->NumBufferInterfaceBlocks;
> > +   shader->info.num_ssbos = sh->NumShaderStorageBlocks;
> > shader->info.num_images = sh->NumImages;
> > shader->info.inputs_read = sh->Program->InputsRead;
> > shader->info.outputs_written = sh->Program->OutputsWritten;
> > --
> > 1.9.1
> >
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/vec4: dead_code_eliminate: update writemask on null_regs based on flag_live

2015-10-15 Thread Alejandro Piñeiro


On 15/10/15 14:38, Francisco Jerez wrote:
> Alejandro Piñeiro  writes:
>
>> ---
>>
>> This patch implements the idea proposed by Francisco Jerez. With this
>> change, even adding the new condition pointed out by Matt Turner on the
>> "2/5 i965/vec4: adding vec4_cmod_propagation optimization", the shader-db
>> numbers remain the same. So this patch would go before the optimization
>> (so in this series it would be the patch 1.5).
>>
>> Note: I'm not resending the patch 2/5, as Matt pointed out that he granted
>> the reviewed status with his suggested change. I can send it if needed
>> in any case.
>>
>>  .../drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp| 16 
>> +++-
>>  1 file changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> index 8fc7a36..31ea128 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> @@ -78,13 +78,19 @@ vec4_visitor::dead_code_eliminate()
>>   sizeof(BITSET_WORD));
>>  
>>foreach_inst_in_block_reverse(vec4_instruction, inst, block) {
>> - if (inst->dst.file == GRF && !inst->has_side_effects()) {
>> + if ((inst->dst.file == GRF && !inst->has_side_effects()) ||
>> + (inst->dst.is_null() && inst->writes_flag())){
>>  bool result_live[4] = { false };
>>  
>> -for (unsigned i = 0; i < inst->regs_written; i++) {
>> -   for (int c = 0; c < 4; c++)
>> -  result_live[c] |= BITSET_TEST(
>> - live, var_from_reg(alloc, offset(inst->dst, i), c));
>> +if (inst->dst.file == GRF) {
>> +   for (unsigned i = 0; i < inst->regs_written; i++) {
>> +  for (int c = 0; c < 4; c++)
>> + result_live[c] |= BITSET_TEST(
>> +live, var_from_reg(alloc, offset(inst->dst, i), c));
>> +   }
>> +} else {
>> +   for (unsigned c = 0; c < 4; c++)
>> +  result_live[c] |= BITSET_TEST(flag_live, c);
> Sadly flag liveness is not kept track of per component -- I.e. the
> flag_live bit-set and the flag live-out bitset calculated by liveness
> analysis have only one bit representing the union of all components.
> This won't work unless you fix that too.

OK, I assumed liveness was tracked per component because I didn't detect
any piglit regression after this change. Then again, I also didn't detect
any piglit regression with vec4_cmod_propagation, even without Matt's last
suggestion.

I will work on adding component information to flag liveness (or at
least try to add it).
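
To make the problem concrete for myself, here is a simplified model of it. This is only a sketch and not the i965 data structures: FlagLive and both function names are made up, and I am assuming that bit 0 is the only bit the liveness analysis ever sets for the flag register, which is how I read Francisco's comment.

```cpp
#include <bitset>
#include <cassert>

// Simplified model, not the real i965 types. Bit 0 stands for "some
// component of the flag register is live"; bits 1..3 are never set by the
// (modelled) liveness analysis.
typedef std::bitset<4> FlagLive;

// Mirrors the loop in the patch: test one bit per channel.
unsigned writemask_per_channel_test(const FlagLive &flag_live)
{
   unsigned mask = 0;
   for (unsigned c = 0; c < 4; c++)
      if (flag_live.test(c))
         mask |= 1u << c;
   return mask;
}

// Conservative alternative while only the union is tracked: treat every
// channel as live whenever the single bit is set.
unsigned writemask_union(const FlagLive &flag_live)
{
   return flag_live.test(0) ? 0xfu : 0x0u;
}

// Self-check: a live flag yields only .x with the per-channel test.
void flag_liveness_self_check()
{
   FlagLive live;
   live.set(0);
   assert(writemask_per_channel_test(live) == 0x1u);
   assert(writemask_union(live) == 0xfu);
}
```

With only the union bit available, the per-channel test narrows the writemask to .x even if a later instruction reads another channel of the flag, so it needs real per-component tracking to be correct.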

BR

-- 
Alejandro Piñeiro (apinhe...@igalia.com)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] i965/vec4: use a custom envvar to decide to print the assembly on test_vec4_cmod_propagation

2015-10-15 Thread Matt Turner
On Sat, Oct 10, 2015 at 4:24 AM, Alejandro Piñeiro  wrote:
> The complete way to do this would be to parse INTEL_DEBUG and
> print the output if DEBUG_VS (or a new one) is present
> (see intel_debug.c).
>
> But that seems like overkill for the unit tests, whose
> most common use case, after all, is being run when
> calling make check.

Seems like a fine idea. I wouldn't mind giving the fs version of the
test the same treatment.

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] i965/vec4: adding vec4_cmod_propagation optimization

2015-10-15 Thread Matt Turner
On Wed, Oct 14, 2015 at 12:51 AM, Alejandro Piñeiro
 wrote:
> On 13/10/15 23:36, Matt Turner wrote:
>> The good news is that, unless I've done something wrong, this doesn't
>> affect any shaders in shader-db on ILK or HSW (I only tested those
>> two, but I expect the results are the same everywhere). Apparently
>> this is a pretty rare case.
>
> Are you sure? I have made a run adding your condition, and now comparing
> master vs having the optimization I get this:
> total instructions in shared programs: 6240631 -> 6240471 (-0.00%)
> instructions in affected programs: 18965 -> 18805 (-0.84%)
> helped:160
> HURT:  0
>
> That is a really small gain. Or, put another way, if we compare the
> conditions I have in the original patches vs adding the condition you
> are proposing, I get this:
>
> total instructions in shared programs: 6223900 -> 6240471 (0.27%)
> instructions in affected programs: 477537 -> 494108 (3.47%)
> helped:0
> HURT:  3047

Strange, I must have made a mistake in my shader-db results. Your
results certainly make more sense.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir: Get the number of SSBOs and UBOs right

2015-10-15 Thread Jason Ekstrand
On Thu, Oct 15, 2015 at 12:18 AM, Iago Toral Quiroga  wrote:
> Before d31f98a272e429d and 56e2bdbca36a20 we had a single index space for UBOs
> and SSBOs, so NumBufferInterfaceBlocks would contain the combined number of
> blocks, not just one kind. This means that for shader programs using both
> UBOs and SSBOs, we were setting num_ssbos and num_ubos to a larger number than
> we should. Since the above commits we have separate index spaces for each,
> so we can just get the right numbers.

Shouldn't this patch go after the other two?  It seems like once we
have this patch, we'll no longer be marking the right number of UBO's
as used (as per the other two) but since NumBufferInterfaceBlocks is
probably bigger than NumShaderStorageBlocks, it should be safe to do
the other two first.

In any case, all three are

Reviewed-by: Jason Ekstrand 

> ---
>  src/glsl/nir/glsl_to_nir.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index 6f67b1d..b705f66 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -152,9 +152,9 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
>
> shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
> shader->info.num_textures = num_textures;
> -   shader->info.num_ubos = sh->NumBufferInterfaceBlocks;
> +   shader->info.num_ubos = sh->NumUniformBlocks;
> shader->info.num_abos = shader_prog->NumAtomicBuffers;
> -   shader->info.num_ssbos = shader_prog->NumBufferInterfaceBlocks;
> +   shader->info.num_ssbos = sh->NumShaderStorageBlocks;
> shader->info.num_images = sh->NumImages;
> shader->info.inputs_read = sh->Program->InputsRead;
> shader->info.outputs_written = sh->Program->OutputsWritten;
> --
> 1.9.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/10] radeonsi: re-enable unsafe-fp-math for LLVM 3.8

2015-10-15 Thread Tom Stellard
On Sun, Oct 11, 2015 at 03:29:48AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 

I don't think we should globally enable this until we are sure it does not
introduce any illegal transforms.

> Required for 1/sqrt ==> rsq.

I think the arcp fast-math flag for instruction is supposed to allow this.
Let me check with some LLVM people.

-Tom
> 
> We should finally fix the hang instead of running away from the issue. This
> assumes the bug is in LLVM and we have time to fix it before the release.
> Include compute shaders as well, which only affects TGSI and thus OpenGL.
> 
> Totals:
> SGPRS: 344368 -> 345104 (0.21 %)
> VGPRS: 197552 -> 197420 (-0.07 %)
> Code Size: 7366304 -> 7324692 (-0.56 %) bytes
> LDS: 91 -> 91 (0.00 %) blocks
> Scratch: 1615872 -> 1524736 (-5.64 %) bytes per wave
> 
> Totals from affected shaders:
> SGPRS: 146696 -> 147432 (0.50 %)
> VGPRS: 87212 -> 87080 (-0.15 %)
> Code Size: 3852664 -> 3811052 (-1.08 %) bytes
> LDS: 48 -> 48 (0.00 %) blocks
> Scratch: 1179648 -> 1088512 (-7.73 %) bytes per wave
> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
> b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 6b2ebde..4bda4a4 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -84,6 +84,13 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>   sprintf(Str, "%1d", llvm_type);
>  
>   LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
> +
> +#if HAVE_LLVM >= 0x0308
> + /* This only affects TGSI (OpenGL), so it's okay to set it for
> +  * compute shaders too.
> +  */
> + LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
> +#endif
>  }
>  
>  static void init_r600_target()
> -- 
> 2.1.4
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/10] gallivm: implement the correct version of LRP

2015-10-15 Thread Roland Scheidegger
Am 15.10.2015 um 16:44 schrieb Marek Olšák:
> Any comment or is this okay with people? Given, "(1-t)*a + t*b", the
> original code didn't return b for t=1 because it's "floating-point".
> 
> Marek
> 
> On Sun, Oct 11, 2015 at 3:29 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> The previous version has precision issues. This can be a problem
>> with tessellation. Sadly, I can't find the article where I read it
>> anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert
>> this.
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 13 +++--
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
>> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> index 0ad78b0..512558b 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> @@ -538,12 +538,13 @@ lrp_emit(
>> struct lp_build_tgsi_context * bld_base,
>> struct lp_build_emit_data * emit_data)
>>  {
>> -   LLVMValueRef tmp;
>> -   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_SUB,
>> -   emit_data->args[1],
>> -   emit_data->args[2]);
>> -   emit_data->output[emit_data->chan] = lp_build_emit_llvm_ternary(bld_base,
>> -TGSI_OPCODE_MAD, emit_data->args[0], tmp, 
>> emit_data->args[2]);
>> +   struct lp_build_context *bld = &bld_base->base;
>> +   LLVMValueRef inv, a, b;
>> +
>> +   inv = lp_build_sub(bld, bld_base->base.one, emit_data->args[0]);
>> +   a = lp_build_mul(bld, emit_data->args[1], emit_data->args[0]);
>> +   b = lp_build_mul(bld, emit_data->args[2], inv);
>> +   emit_data->output[emit_data->chan] = lp_build_add(bld, a, b);
>>  }
>>
>>  /* TGSI_OPCODE_MAD */
>> --

Please add a comment why it's using t*a + (1-t)*b and not (a-b)*t + b.
Though it is yet another thing we should have some more control over in
tgsi. Because if you're willing to allow unsafe-fp-math, then you should
also be willing to accept the simpler formula (I'm quite sure
unsafe-fp-math would be allowed to turn one formula into the other).
But otherwise I guess this is ok - it is the formula specified by glsl
after all.
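
A minimal sketch of why the two formulas differ at the endpoint, using the naming above (the result should be a when t == 1). The function names and the input values are made up for illustration and are not part of gallivm; volatile is only there to force every intermediate to be rounded to single precision.

```cpp
#include <cassert>

// Form used by the patch: t*a + (1-t)*b.
float lrp_two_mul(float t, float a, float b)
{
   volatile float inv = 1.0f - t;
   volatile float ta = t * a;
   volatile float ib = inv * b;
   return ta + ib;
}

// Form the patch replaces: (a-b)*t + b, i.e. a SUB followed by a MAD.
float lrp_sub_mad(float t, float a, float b)
{
   volatile float diff = a - b;
   volatile float prod = diff * t;
   return prod + b;
}

// Self-check with inputs chosen so that a - b rounds in single precision.
void lrp_self_check()
{
   const float a = 1.0f, b = 1.0e8f;
   assert(lrp_two_mul(1.0f, a, b) == a);   // exact at the endpoint
   assert(lrp_sub_mad(1.0f, a, b) != a);   // a - b was rounded to -1e8
}
```

With a = 1 and b = 1e8 the difference a - b is not representable as a float and rounds to -1e8, so the SUB+MAD form gives 0 at t = 1, while the two-multiply form gives exactly a.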

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] nouveau: nv30: include the header of ffs prototype

2015-10-15 Thread Chih-Wei Huang
This fixes a build error on the Android 6.0 64-bit target.

Signed-off-by: Chih-Wei Huang 
---
 src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c 
b/src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c
index 5757eb1..dbbb8ba 100644
--- a/src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c
+++ b/src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c
@@ -1,3 +1,4 @@
+#include <strings.h>
 #include "pipe/p_context.h"
 #include "pipe/p_defines.h"
 #include "pipe/p_state.h"
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] android: gallium_dri: fix a linking error

2015-10-15 Thread Chih-Wei Huang
Link with libmesa_dricore to get '_mesa_uint_array_min_max',
which sse_minmax.c provides when USE_SSE41 is defined.

Signed-off-by: Chih-Wei Huang 
---
 src/gallium/targets/dri/Android.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/targets/dri/Android.mk 
b/src/gallium/targets/dri/Android.mk
index a33d7f8..2354323 100644
--- a/src/gallium/targets/dri/Android.mk
+++ b/src/gallium/targets/dri/Android.mk
@@ -106,6 +106,7 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_st_mesa \
libmesa_glsl \
libmesa_dri_common \
+   libmesa_dricore \
libmesa_megadriver_stub \
libmesa_gallium \
libmesa_util \
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/3] Patches for Android 6.0

2015-10-15 Thread Chih-Wei Huang
Here are some patches to fix build errors on
Android 6.0 Marshmallow.

Tested OK with Android-x86 marshmallow-x86 branch.

Chih-Wei Huang (3):
  nv50/ir: use C++11 standard std::unordered_map if possible
  android: gallium_dri: fix a linking error
  nouveau: nv30: include the header of ffs prototype

 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 20 +---
 src/gallium/drivers/nouveau/nv30/nvfx_vertprog.c   |  1 +
 src/gallium/targets/dri/Android.mk |  1 +
 3 files changed, 19 insertions(+), 3 deletions(-)

-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] nv50/ir: use C++11 standard std::unordered_map if possible

2015-10-15 Thread Chih-Wei Huang
Note that Android versions before Lollipop are not supported.

Signed-off-by: Chih-Wei Huang 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 400b9f0..7859c8e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -25,10 +25,24 @@
 
 #include 
 #include 
+#if __cplusplus >= 201103L
+#include <unordered_map>
+#else
 #include <tr1/unordered_map>
+#endif
 
 namespace nv50_ir {
 
+#if __cplusplus >= 201103L
+using std::hash;
+using std::unordered_map;
+#elif !defined(ANDROID)
+using std::tr1::hash;
+using std::tr1::unordered_map;
+#else
+#error Android release before Lollipop is not supported!
+#endif
+
 #define MAX_REGISTER_FILE_SIZE 256
 
 class RegisterSet
@@ -349,12 +363,12 @@ RegAlloc::PhiMovesPass::needNewElseBlock(BasicBlock *b, 
BasicBlock *p)
 
 struct PhiMapHash {
size_t operator()(const std::pair& val) const {
-  return std::tr1::hash()(val.first) * 31 +
- std::tr1::hash()(val.second);
+  return hash()(val.first) * 31 +
+ hash()(val.second);
}
 };
 
-typedef std::tr1::unordered_map<
+typedef unordered_map<
std::pair, Value *, PhiMapHash> PhiMap;
 
 // Critical edges need to be split up so that work can be inserted along
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
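
For readers of the archive, the pattern in the patch above can be sketched in isolation as follows. This is a hedged, standalone illustration and not the nv50_ir code: the template arguments of the real PhiMap are not visible in the archived patch, so Key, PairHash, PairMap and lookup_or are hypothetical names with int stand-ins, and the Android #error branch is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Select the standard container when C++11 is available, TR1 otherwise.
#if __cplusplus >= 201103L
#include <unordered_map>
using std::hash;
using std::unordered_map;
#else
#include <tr1/unordered_map>
using std::tr1::hash;
using std::tr1::unordered_map;
#endif

// Hypothetical key type standing in for the pair used by PhiMap.
typedef std::pair<int, int> Key;

// Combine the two element hashes the same way the patch does.
struct PairHash {
   std::size_t operator()(const Key &val) const {
      return hash<int>()(val.first) * 31 + hash<int>()(val.second);
   }
};

typedef unordered_map<Key, int, PairHash> PairMap;

// Return the mapped value, or `fallback` when the key is absent.
int lookup_or(const PairMap &m, int a, int b, int fallback)
{
   PairMap::const_iterator it = m.find(Key(a, b));
   return it == m.end() ? fallback : it->second;
}

// Self-check: the pair is ordered, so (2, 1) is a different key.
void pair_map_self_check()
{
   PairMap m;
   m[Key(1, 2)] = 42;
   assert(lookup_or(m, 1, 2, -1) == 42);
   assert(lookup_or(m, 2, 1, -1) == -1);
}
```

Because the names are pulled in with using-declarations, the code below the preprocessor block is the same for both library versions.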


Re: [Mesa-dev] [PATCH 5/5] i965/vec4: print predicate control at brw_vec4 dump_instruction

2015-10-15 Thread Matt Turner
On Sat, Oct 10, 2015 at 4:24 AM, Alejandro Piñeiro  wrote:
> ---
>
> I found this useful while I was using INTEL_DEBUG=optimizer after
> changing how the ifs are emitted. And after all, that info is
> also included by brw_disasm.c

Definitely.

> I assumed that at the vec4_visitor we would not need to handle
> pred_ctrl_align1, but I'm not totally sure.

That's correct.

>
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 55e381b..eb81523 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1358,9 +1358,21 @@ vec4_visitor::dump_instruction(backend_instruction 
> *be_inst, FILE *file)
> vec4_instruction *inst = (vec4_instruction *)be_inst;
>
> if (inst->predicate) {
> -  fprintf(file, "(%cf0.%d) ",
> +  static const char *const pred_ctrl_align16[16] = {
> + "",
> + "",
> + ".x",
> + ".y",
> + ".z",
> + ".w",
> + ".any4h",
> + ".all4h",
> +  };

Let's just externalize pred_ctrl_align16 from brw_disasm.c and use it
here. See for example commit b9af66528e5b7bd.

With that change,

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/5] Implementation of vec4 equivalent to fs_cmod_propagation optimization

2015-10-15 Thread Matt Turner
On Sat, Oct 10, 2015 at 4:24 AM, Alejandro Piñeiro  wrote:
> This series implements a vec4 equivalent to fs_cmod_propagation optimization.
>
> The last two commits are not really needed for the optimization; they are
> just nice-to-haves (imho) that I added while implementing the optimization.
>
> Alejandro Piñeiro (5):
>   i965/vec4: nir_emit_if doesn't need to predicate based on all the
> channels
>   i965/vec4: adding vec4_cmod_propagation optimization
>   i965/vec4: Add unit tests for cmod propagation pass.
>   i965/vec4: use a custom envvar to decide to print the assembly on
> test_vec4_cmod_propagation
>   i965/vec4: print predicate control at brw_vec4 dump_instruction

Pending the flag liveness analysis patch, these five are reviewed, but
aren't we still missing a vec4_visitor implementation of
fixup_3src_null_dest()?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glsl: silence warning about unhandled ast_unsized_array_dim case in switch

2015-10-15 Thread Brian Paul
---
 src/glsl/ast_to_hir.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index cd40fe3..ede02d9 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2017,6 +2017,9 @@ ast_expression::has_sequence_subexpression() const
 
case ast_function_call:
   unreachable("should be handled by ast_function_expression::hir");
+
+   case ast_unsized_array_dim:
+  unreachable("ast_unsized_array_dim: Should never get here.");
}
 
return false;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/10] radeonsi: don't emit AMDGPU intrinsics for EX2, ROUND, TRUNC

2015-10-15 Thread Tom Stellard
On Sun, Oct 11, 2015 at 03:29:44AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 

Reviewed-by: Tom Stellard 

> No difference according to shader-db.
> ---
>  src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index f548d1a..91cf658 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -1481,7 +1481,7 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_ENDIF].emit = endif_emit;
>   bld_base->op_actions[TGSI_OPCODE_ENDLOOP].emit = endloop_emit;
>   bld_base->op_actions[TGSI_OPCODE_EX2].emit = build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.AMDIL.exp.";
> + bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.exp2.f32";
>   bld_base->op_actions[TGSI_OPCODE_FLR].emit = build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_FLR].intr_name = "llvm.floor.f32";
>   bld_base->op_actions[TGSI_OPCODE_FMA].emit = build_tgsi_intrinsic_nomem;
> @@ -1530,7 +1530,7 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_POW].emit = build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_POW].intr_name = "llvm.pow.f32";
>   bld_base->op_actions[TGSI_OPCODE_ROUND].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_ROUND].intr_name = 
> "llvm.AMDIL.round.nearest.";
> + bld_base->op_actions[TGSI_OPCODE_ROUND].intr_name = "llvm.rint.f32";
>   bld_base->op_actions[TGSI_OPCODE_RSQ].intr_name = 
> "llvm.AMDGPU.rsq.clamped.f32";
>   bld_base->op_actions[TGSI_OPCODE_RSQ].emit = build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_SGE].emit = emit_cmp;
> @@ -1546,7 +1546,7 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_SQRT].intr_name = "llvm.sqrt.f32";
>   bld_base->op_actions[TGSI_OPCODE_SSG].emit = emit_ssg;
>   bld_base->op_actions[TGSI_OPCODE_TRUNC].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_TRUNC].intr_name = "llvm.AMDGPU.trunc";
> + bld_base->op_actions[TGSI_OPCODE_TRUNC].intr_name = "llvm.trunc.f32";
>   bld_base->op_actions[TGSI_OPCODE_UADD].emit = emit_uadd;
>   bld_base->op_actions[TGSI_OPCODE_UBFE].emit = 
> build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_UBFE].intr_name = 
> "llvm.AMDGPU.bfe.u32";
> -- 
> 2.1.4
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/10] radeonsi: don't emit AMDGPU intrinsics for integer abs, min, max

2015-10-15 Thread Tom Stellard
On Sun, Oct 11, 2015 at 03:29:45AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 

Reviewed-by: Tom Stellard 

> No difference according to shader-db. (with the new S_ABS_I32 pattern)
> ---
>  .../drivers/radeon/radeon_setup_tgsi_llvm.c| 60 
> ++
>  1 file changed, 50 insertions(+), 10 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index 91cf658..23ea23a 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -1393,6 +1393,51 @@ static void emit_imsb(const struct 
> lp_build_tgsi_action * action,
>   LLVMBuildSelect(builder, cond, all_ones, msb, "");
>  }
>  
> +static void emit_iabs(const struct lp_build_tgsi_action *action,
> +   struct lp_build_tgsi_context *bld_base,
> +   struct lp_build_emit_data *emit_data)
> +{
> + LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +
> + emit_data->output[emit_data->chan] =
> + lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_IMAX,
> +   emit_data->args[0],
> +   LLVMBuildNeg(builder,
> +emit_data->args[0], ""));
> +}
> +
> +static void emit_minmax_int(const struct lp_build_tgsi_action *action,
> + struct lp_build_tgsi_context *bld_base,
> + struct lp_build_emit_data *emit_data)
> +{
> + LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> + LLVMIntPredicate op;
> +
> + switch (emit_data->info->opcode) {
> + default:
> + assert(0);
> + case TGSI_OPCODE_IMAX:
> + op = LLVMIntSGT;
> + break;
> + case TGSI_OPCODE_IMIN:
> + op = LLVMIntSLT;
> + break;
> + case TGSI_OPCODE_UMAX:
> + op = LLVMIntUGT;
> + break;
> + case TGSI_OPCODE_UMIN:
> + op = LLVMIntULT;
> + break;
> + }
> +
> + emit_data->output[emit_data->chan] =
> + LLVMBuildSelect(builder,
> + LLVMBuildICmp(builder, op, emit_data->args[0],
> +   emit_data->args[1], ""),
> + emit_data->args[0],
> + emit_data->args[1], "");
> +}
> +
>  void radeon_llvm_context_init(struct radeon_llvm_context * ctx)
>  {
>   struct lp_type type;
> @@ -1493,17 +1538,14 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_FSGE].emit = emit_fcmp;
>   bld_base->op_actions[TGSI_OPCODE_FSLT].emit = emit_fcmp;
>   bld_base->op_actions[TGSI_OPCODE_FSNE].emit = emit_fcmp;
> - bld_base->op_actions[TGSI_OPCODE_IABS].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_IABS].intr_name = "llvm.AMDIL.abs.";
> + bld_base->op_actions[TGSI_OPCODE_IABS].emit = emit_iabs;
>   bld_base->op_actions[TGSI_OPCODE_IBFE].emit = 
> build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_IBFE].intr_name = 
> "llvm.AMDGPU.bfe.i32";
>   bld_base->op_actions[TGSI_OPCODE_IDIV].emit = emit_idiv;
>   bld_base->op_actions[TGSI_OPCODE_IF].emit = if_emit;
>   bld_base->op_actions[TGSI_OPCODE_UIF].emit = uif_emit;
> - bld_base->op_actions[TGSI_OPCODE_IMAX].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_IMAX].intr_name = "llvm.AMDGPU.imax";
> - bld_base->op_actions[TGSI_OPCODE_IMIN].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_IMIN].intr_name = "llvm.AMDGPU.imin";
> + bld_base->op_actions[TGSI_OPCODE_IMAX].emit = emit_minmax_int;
> + bld_base->op_actions[TGSI_OPCODE_IMIN].emit = emit_minmax_int;
>   bld_base->op_actions[TGSI_OPCODE_IMSB].emit = emit_imsb;
>   bld_base->op_actions[TGSI_OPCODE_INEG].emit = emit_ineg;
>   bld_base->op_actions[TGSI_OPCODE_ISHR].emit = emit_ishr;
> @@ -1551,10 +1593,8 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_UBFE].emit = 
> build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_UBFE].intr_name = 
> "llvm.AMDGPU.bfe.u32";
>   bld_base->op_actions[TGSI_OPCODE_UDIV].emit = emit_udiv;
> - bld_base->op_actions[TGSI_OPCODE_UMAX].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_UMAX].intr_name = "llvm.AMDGPU.umax";
> - bld_base->op_actions[TGSI_OPCODE_UMIN].emit = 
> build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_UMIN].intr_name = "llvm.AMDGPU.umin";
> + bld_base->op_actions[TGSI_OPCODE_UMAX].emit = emit_minmax_int;
> + bld_base->op_actions[TGSI_OPCODE_UMIN].emit = emit_minmax_int;
>   

[Mesa-dev] [PATCH] i965/fs: Run all of the optimisations after lower_load_payload

2015-10-15 Thread Neil Roberts
Instead of just running a couple of the possible optimisations in one
single iteration, it now runs the whole loop again after lowering the
load payloads. According to shader-db this gives:

total instructions in shared programs: 6493365 -> 6493289 (-0.00%)
instructions in affected programs: 1696 -> 1620 (-4.48%)
total loops in shared programs:2237 -> 2237 (0.00%)
helped:20
HURT:  0

Most of the shaders just benefit from running the register coalesce
pass multiple times. However the following two additionally benefit
from an extra pass of opt_saturate_propagation which causes a bunch of
further optimisations:

steam-metro-last-light-1719.shader_test
steam-metro-last-light-836.shader_test

The optimisations that get run after lowering the load payloads are:

04-14-register_coalesce
04-17-compact_virtual_grfs
05-12-opt_saturate_propagation
06-02-opt_algebraic
06-04-opt_copy_propagate
06-07-dead_code_eliminate
06-12-opt_saturate_propagation
06-14-register_coalesce
06-17-compact_virtual_grfs
07-18-opt_combine_constants

In the first shader this drops four instructions.
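
The shape of the change can be sketched on a toy example. Everything in it is hypothetical (the "program" is a vector of ints and the three passes are stand-ins, not i965 passes), and re-running the loop only when the lowering step made progress is an assumption of the sketch, not something stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Program;

// Stand-in for dead code elimination: drop zeros.
bool remove_zeros(Program &p)
{
   const std::size_t before = p.size();
   p.erase(std::remove(p.begin(), p.end(), 0), p.end());
   return p.size() != before;
}

// Stand-in for register coalescing: merge adjacent equal values.
bool merge_adjacent_equal(Program &p)
{
   const std::size_t before = p.size();
   p.erase(std::unique(p.begin(), p.end()), p.end());
   return p.size() != before;
}

// Stand-in for a lowering pass: rewrite negative values as positive ones,
// which can expose new merge opportunities.
bool lower_negatives(Program &p)
{
   bool progress = false;
   for (std::size_t i = 0; i < p.size(); i++) {
      if (p[i] < 0) {
         p[i] = -p[i];
         progress = true;
      }
   }
   return progress;
}

// The fixed-point loop, factored out so that it can be run more than once.
void optimize_loop(Program &p, int &iteration)
{
   bool progress;
   do {
      progress = false;
      iteration++;
      progress |= remove_zeros(p);
      progress |= merge_adjacent_equal(p);
   } while (progress);
}

void optimize(Program &p, int &iteration)
{
   optimize_loop(p, iteration);
   // Run the whole loop again if lowering changed the program.
   if (lower_negatives(p))
      optimize_loop(p, iteration);
}
```

For the input {1, 0, -1, 2} the first loop can only remove the zero; after lowering, the second run merges the two ones and leaves {1, 2}.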
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 94 +++-
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 ++
 2 files changed, 53 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 01a7c99..8d1db23 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4754,33 +4754,6 @@ fs_visitor::calculate_register_pressure()
}
 }
 
-void
-fs_visitor::optimize()
-{
-   /* Start by validating the shader we currently have. */
-   validate();
-
-   /* bld is the common builder object pointing at the end of the program we
-* used to translate it into i965 IR.  For the optimization and lowering
-* passes coming next, any code added after the end of the program without
-* having explicitly called fs_builder::at() clearly points at a mistake.
-* Ideally optimization passes wouldn't be part of the visitor so they
-* wouldn't have access to bld at all, but they do, so just in case some
-* pass forgets to ask for a location explicitly set it to NULL here to
-* make it trip.  The dispatch width is initialized to a bogus value to
-* make sure that optimizations set the execution controls explicitly to
-* match the code they are manipulating instead of relying on the defaults.
-*/
-   bld = fs_builder(this, 64);
-
-   assign_constant_locations();
-   demote_pull_constants();
-
-   validate();
-
-   split_virtual_grfs();
-   validate();
-
 #define OPT(pass, args...) ({   \
   pass_num++;   \
   bool this_progress = pass(args);  \
@@ -4799,20 +4772,10 @@ fs_visitor::optimize()
   this_progress;\
})
 
-   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
-  char filename[64];
-  snprintf(filename, 64, "%s%d-%s-00-start",
-   stage_abbrev, dispatch_width, nir->info.name);
-
-  backend_shader::dump_instructions(filename);
-   }
-
-   bool progress = false;
-   int iteration = 0;
-   int pass_num = 0;
-
-   OPT(lower_simd_width);
-   OPT(lower_logical_sends);
+void
+fs_visitor::optimize_loop(int &iteration, int &pass_num)
+{
+   bool progress;
 
do {
   progress = false;
@@ -4839,6 +4802,51 @@ fs_visitor::optimize()
 
   OPT(compact_virtual_grfs);
} while (progress);
+}
+
+void
+fs_visitor::optimize()
+{
+   /* Start by validating the shader we currently have. */
+   validate();
+
+   /* bld is the common builder object pointing at the end of the program we
+* used to translate it into i965 IR.  For the optimization and lowering
+* passes coming next, any code added after the end of the program without
+* having explicitly called fs_builder::at() clearly points at a mistake.
+* Ideally optimization passes wouldn't be part of the visitor so they
+* wouldn't have access to bld at all, but they do, so just in case some
+* pass forgets to ask for a location explicitly set it to NULL here to
+* make it trip.  The dispatch width is initialized to a bogus value to
+* make sure that optimizations set the execution controls explicitly to
+* match the code they are manipulating instead of relying on the defaults.
+*/
+   bld = fs_builder(this, 64);
+
+   assign_constant_locations();
+   demote_pull_constants();
+
+   validate();
+
+   split_virtual_grfs();
+   validate();
+
+   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
+  char filename[64];
+  snprintf(filename, 64, "%s%d-%s-00-start",
+   stage_abbrev, dispatch_width, nir->info.name);
+
+  backend_shader::dump_instructions(filename);
+   }
+
+   bool progress = false;
+   int iteration = 0;
+   int 

Re: [Mesa-dev] [PATCH 07/10] radeonsi: don't use the AMDGPU intrinsic for CMP

2015-10-15 Thread Tom Stellard
On Sun, Oct 11, 2015 at 03:29:47AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 

Reviewed-by: Tom Stellard 

> The increase in VGPRs is unfortunate, but the decrease in the scratch size
> is always welcome.
> 
> Totals:
> SGPRS: 344552 -> 344368 (-0.05 %)
> VGPRS: 197132 -> 197552 (0.21 %)
> Code Size: 7375376 -> 7366304 (-0.12 %) bytes
> LDS: 91 -> 91 (0.00 %) blocks
> Scratch: 1679360 -> 1615872 (-3.78 %) bytes per wave
> 
> Totals from affected shaders:
> SGPRS: 47736 -> 47552 (-0.39 %)
> VGPRS: 27952 -> 28372 (1.50 %)
> Code Size: 1392724 -> 1383652 (-0.65 %) bytes
> LDS: 39 -> 39 (0.00 %) blocks
> Scratch: 513024 -> 449536 (-12.38 %) bytes per wave
> ---
>  .../drivers/radeon/radeon_setup_tgsi_llvm.c| 31 
> +++---
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index c22ea7c..ac99e73 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -919,7 +919,21 @@ static void emit_ucmp(
>   LLVMBuildSelect(builder, v, emit_data->args[1], 
> emit_data->args[2], "");
>  }
>  
> -static void emit_cmp(
> +static void emit_cmp(const struct lp_build_tgsi_action *action,
> +  struct lp_build_tgsi_context *bld_base,
> +  struct lp_build_emit_data *emit_data)
> +{
> + LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> + LLVMValueRef cond, *args = emit_data->args;
> +
> + cond = LLVMBuildFCmp(builder, LLVMRealOLT, args[0],
> +  bld_base->base.zero, "");
> +
> + emit_data->output[emit_data->chan] =
> + LLVMBuildSelect(builder, cond, args[1], args[2], "");
> +}
> +
> +static void emit_set_cond(
>   const struct lp_build_tgsi_action *action,
>   struct lp_build_tgsi_context * bld_base,
>   struct lp_build_emit_data * emit_data)
> @@ -1503,8 +1517,7 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_CEIL].intr_name = "llvm.ceil.f32";
>   bld_base->op_actions[TGSI_OPCODE_CLAMP].emit = 
> build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_CLAMP].intr_name = "llvm.AMDIL.clamp.";
> - bld_base->op_actions[TGSI_OPCODE_CMP].emit = build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_CMP].intr_name = "llvm.AMDGPU.cndlt";
> + bld_base->op_actions[TGSI_OPCODE_CMP].emit = emit_cmp;
>   bld_base->op_actions[TGSI_OPCODE_CONT].emit = cont_emit;
>   bld_base->op_actions[TGSI_OPCODE_COS].emit = build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_COS].intr_name = "llvm.cos.f32";
> @@ -1573,13 +1586,13 @@ void radeon_llvm_context_init(struct 
> radeon_llvm_context * ctx)
>   bld_base->op_actions[TGSI_OPCODE_ROUND].intr_name = "llvm.rint.f32";
>   bld_base->op_actions[TGSI_OPCODE_RSQ].intr_name = 
> "llvm.AMDGPU.rsq.clamped.f32";
>   bld_base->op_actions[TGSI_OPCODE_RSQ].emit = build_tgsi_intrinsic_nomem;
> - bld_base->op_actions[TGSI_OPCODE_SGE].emit = emit_cmp;
> - bld_base->op_actions[TGSI_OPCODE_SEQ].emit = emit_cmp;
> + bld_base->op_actions[TGSI_OPCODE_SGE].emit = emit_set_cond;
> + bld_base->op_actions[TGSI_OPCODE_SEQ].emit = emit_set_cond;
>   bld_base->op_actions[TGSI_OPCODE_SHL].emit = emit_shl;
> - bld_base->op_actions[TGSI_OPCODE_SLE].emit = emit_cmp;
> - bld_base->op_actions[TGSI_OPCODE_SLT].emit = emit_cmp;
> - bld_base->op_actions[TGSI_OPCODE_SNE].emit = emit_cmp;
> - bld_base->op_actions[TGSI_OPCODE_SGT].emit = emit_cmp;
> + bld_base->op_actions[TGSI_OPCODE_SLE].emit = emit_set_cond;
> + bld_base->op_actions[TGSI_OPCODE_SLT].emit = emit_set_cond;
> + bld_base->op_actions[TGSI_OPCODE_SNE].emit = emit_set_cond;
> + bld_base->op_actions[TGSI_OPCODE_SGT].emit = emit_set_cond;
>   bld_base->op_actions[TGSI_OPCODE_SIN].emit = build_tgsi_intrinsic_nomem;
>   bld_base->op_actions[TGSI_OPCODE_SIN].intr_name = "llvm.sin.f32";
>   bld_base->op_actions[TGSI_OPCODE_SQRT].emit = 
> build_tgsi_intrinsic_nomem;
> -- 
> 2.1.4
> 
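For readers following the patch above, the new emit_cmp() builds an ordered less-than compare against zero followed by a select. A minimal scalar sketch of that behaviour in plain C is given below; the function name tgsi_cmp_scalar is hypothetical and is not part of the patch or of Mesa.

```c
/* Scalar model of what the patched emit_cmp() builds per channel:
 *   dst = (src0 < 0) ? src1 : src2
 * The C '<' operator on floats is an ordered comparison, like
 * LLVMRealOLT: it evaluates to false when src0 is NaN, so a NaN
 * condition selects src2.
 * tgsi_cmp_scalar is a hypothetical name used only for illustration. */
static float tgsi_cmp_scalar(float src0, float src1, float src2)
{
    return (src0 < 0.0f) ? src1 : src2;
}
```

Because the compare and the select are ordinary LLVM IR rather than an opaque intrinsic call, the generic LLVM optimization passes can see through them, which may be one reason the shader statistics change.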


Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-15 Thread Predut, Marius
> -Original Message-
> From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> Sent: Wednesday, October 07, 2015 1:53 PM
> To: Predut, Marius
> Cc: mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest
> width lines
> 
> On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > On PNV platform, for 1 pixel line thickness or less, the general
> > anti-aliasing algorithm gives up, and a garbage line is generated.
> > Setting a Line Width of 0.0 specifies the rasterization of the
> > "thinnest" (one-pixel-wide), non-antialiased lines.
> > Lines rendered with zero Line Width are rasterized using Grid
> > Intersection Quantization rules as specified by
> > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f of the
> > GEN3 docs.
> > The patch was tested on Intel Atom CPU N455.
> >
> > This patch follows the same rules as patches fixing the
> > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > bug.
> >
> > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > v2: Ian Romanick: comments fix.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> >
> > Signed-off-by: Marius Predut 
> > ---
> >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > b/src/mesa/drivers/dri/i915/i915_state.c
> > index 4c83073..897eb59 100644
> > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat
> > widthf)
> >
> > width = (int) (widthf * 2);
> > width = CLAMP(width, 1, 0xf);
> > +
> > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > + /* For 1 pixel line thickness or less, the general
> > +  * anti-aliasing algorithm gives up, and a garbage line is
> > +  * generated.  Setting a Line Width of 0.0 specifies the
> > +  * rasterization of the "thinnest" (one-pixel-wide),
> > +  * non-antialiased lines.
> > +  *
> > +  * Lines rendered with zero Line Width are rasterized using
> > +  * Grid Intersection Quantization rules as specified by
> > +  * volume 1f of the GEN3 docs,
> > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > +  */
> > +  width = 0;
> > +   }
> 
> I went to do some spec reading, and while I can't confirm the AA <= 1.0
> problem (no mention in the spec about such things), I can see this fix alone
> isn't sufficient to satisfy the spec (we lack the round to nearest integer for
> non-aa for instance).

Ville, thanks for the review!
There do not seem to be many docs on this; here we can rely on experiments or on the docs for later GENs.

> 
> I think what we'd want is a small helper. i965 has one, although that one
> looks quite messy. I think this is how I'd write the helper for
> i915:
> 
> unsigned intel_line_width(ctx)
> {
>   float line_width = ctx->Line.Width;
> 
>   if (ctx->Line.SmoothFlag)
>   line_width = CLAMP(line_width, MinAA, MaxAA);
>   else
>   line_width = CLAMP(roundf(line_width), Min, Max);
> 
>   /*
>* blah
>*/
>   if (line_width < 1.5f)
>   line_width = 0.0f;
> 
>   return U_FIXED(line_width, 1);
> }
> 
> and then use it for both gen2 and gen3 state setup.

Did you use this, and does it work for you? (I mean, did you test it on your PNV
platform?)
I left some comments in Bugzilla about the SmoothFlag flag (on 2015-06-04).
In my tests the flag seems to be set only if glLineWidth(lineWidth) is called
with lineWidth != 1.

> 
> The clamp part could even be moved to some central place so that all drivers
> could share it, or I suppose we could stash the appropriately rounded and
> clamped line width into the context as ctx->Line._Width.
> 
> Oh and BTW, the gen4/5 line width handling in i965 looks busted too (only
> gen6+ got fixed).

First I intend only to fix the bug, then add extra fixes like CLAMP.
CLAMP was not done before, and it can be the subject of a follow-up patch series.


> 
> > lis4 |= width << S4_LINE_WIDTH_SHIFT;
> >
> > if (lis4 != i915->state.Ctx[I915_CTXREG_LIS4]) {
> > --
> > 1.9.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> --
> Ville Syrjälä
> Intel OTC
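
The helper sketched in the quoted review can be written out as self-contained C roughly as follows. This is only a sketch under stated assumptions: the limits structure, the to_ufixed() helper (standing in for U_FIXED) and the limit values used in the usage note are hypothetical and are not taken from the i915 driver. Rounding uses a simple add-0.5-and-truncate, which is adequate only because line widths are positive.

```c
/* Hypothetical limits; real values would come from the GL context. */
struct line_limits {
    float min_w, max_w;       /* non-antialiased range */
    float min_aa_w, max_aa_w; /* antialiased range */
};

static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Unsigned fixed point with frac_bits fractional bits (truncating).
 * Stand-in for the U_FIXED macro mentioned in the review. */
static unsigned to_ufixed(float v, unsigned frac_bits)
{
    return (unsigned)(v * (float)(1u << frac_bits));
}

/* Returns the line width as U3.1 fixed point, or 0 to request the
 * "thinnest" zero-width (cosmetic) line. */
static unsigned line_width_u3_1(float width, int smooth,
                                const struct line_limits *lim)
{
    float w;

    if (smooth) {
        w = clampf(width, lim->min_aa_w, lim->max_aa_w);
    } else {
        /* Round to nearest integer; valid for positive widths only. */
        float rounded = (float)(int)(width + 0.5f);
        w = clampf(rounded, lim->min_w, lim->max_w);
    }

    /* Widths below 1.5 fall back to the zero-width line. */
    if (w < 1.5f)
        w = 0.0f;

    return to_ufixed(w, 1);
}
```

With assumed limits of 1.0 to 7.5 for both ranges, a non-antialiased width of 1.0 maps to 0, a width of 2.0 maps to 4, and an antialiased width of 1.6 maps to 3.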


Re: [Mesa-dev] [PATCH v2 12/17] i965/vs: Rework vs_emit to take a nir_shader and a brw_compiler

2015-10-15 Thread Jason Ekstrand
On Oct 14, 2015 10:48 PM, "Pohjolainen, Topi" 
wrote:
>
> On Wed, Oct 14, 2015 at 11:53:37AM -0700, Jason Ekstrand wrote:
> > On Wed, Oct 14, 2015 at 1:41 AM, Pohjolainen, Topi
> >  wrote:
> > > On Wed, Oct 14, 2015 at 11:25:40AM +0300, Pohjolainen, Topi wrote:
> > >> On Sat, Oct 10, 2015 at 08:09:01AM -0700, Jason Ekstrand wrote:
> > >> > This commit removes all dependence on GL state by getting rid of
the
> > >> > brw_context parameter and the GL data structures.
> > >> >
> > >> > v2 (Jason Ekstrand):
> > >> >- Patch use_legacy_snorm_formula through as a function argument
rather
> > >> >  than trying to go through the shader key.
> > >> > ---
> > >> >  src/mesa/drivers/dri/i965/brw_vec4.cpp | 70
+-
> > >> >  src/mesa/drivers/dri/i965/brw_vs.c | 16 +++-
> > >> >  src/mesa/drivers/dri/i965/brw_vs.h | 12 --
> > >> >  3 files changed, 49 insertions(+), 49 deletions(-)
> > >> >
> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > >> > index 4b8390f..8e38729 100644
> > >> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > >> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > >> > @@ -1937,51 +1937,42 @@ extern "C" {
> > >> >   * Returns the final assembly and the program's size.
> > >> >   */
> > >> >  const unsigned *
> > >> > -brw_vs_emit(struct brw_context *brw,
> > >> > +brw_vs_emit(const struct brw_compiler *compiler, void *log_data,
> > >> >  void *mem_ctx,
> > >> >  const struct brw_vs_prog_key *key,
> > >> >  struct brw_vs_prog_data *prog_data,
> > >> > -struct gl_vertex_program *vp,
> > >> > -struct gl_shader_program *prog,
> > >> > +const nir_shader *shader,
> > >> > +gl_clip_plane *clip_planes,
> > >> > +bool use_legacy_snorm_formula,
> > >> >  int shader_time_index,
> > >> > -unsigned *final_assembly_size)
> > >> > +unsigned *final_assembly_size,
> > >> > +char **error_str)
> > >> >  {
> > >> > const unsigned *assembly = NULL;
> > >> >
> > >> > -   if (brw->intelScreen->compiler->scalar_vs) {
> > >> > +   if (compiler->scalar_vs) {
> > >> >prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
> > >> >
> > >> > -  fs_visitor v(brw->intelScreen->compiler, brw,
> > >> > -   mem_ctx, key, &prog_data->base.base,
> > >> > +  fs_visitor v(compiler, log_data, mem_ctx, key,
&prog_data->base.base,
> > >> > NULL, /* prog; Only used for TEXTURE_RECTANGLE
on gen < 8 */
> > >> > -   vp->Base.nir, 8, shader_time_index);
> > >> > -  if (!v.run_vs(brw_select_clip_planes(&brw->ctx))) {
> > >> > - if (prog) {
> > >> > -prog->LinkStatus = false;
> > >> > -ralloc_strcat(&prog->InfoLog, v.fail_msg);
> > >> > - }
> > >> > -
> > >> > - _mesa_problem(NULL, "Failed to compile vertex shader:
%s\n",
> > >> > -   v.fail_msg);
> > >> > +   shader, 8, shader_time_index);
> > >> > +  if (!v.run_vs(clip_planes)) {
> > >> > + if (error_str)
> > >> > +*error_str = ralloc_strdup(mem_ctx, v.fail_msg);
> > >>
> > >> I don't particularly like the complexity of the error reporting
mechanism.
> > >> First vec4_visitor::fail() uses ralloc_asprintf() to create one
string, then
> > >> we make a copy of it here and finally the caller of brw_vs_emit()
makes yet
> > >> another copy using ralloc_strcat().
> > >> I wonder if we could pass the final destination all the way for the
> > >> vec4_visitor::fail() to augment with ralloc_asprintf() and hence
avoid all
> > >
> > > Or more appropriately using ralloc_asprintf_append()...
> > >
> > >> the indirection in the middle. What do you think?
> >
> > I'd be moderately ok with just doing "*error_str = v.fail_msg" and
> > avoiding the extra copy.  I'm not a big fan of the extra copy, but I
> > decided to leave it in for a couple of reasons
> >
> > 1) It only happens on the error path so it's not a big deal.
>
> I wasn't concerned about the overhead either, as you said this is error
path
> only.
>
> >
> > 2) Not copying it is kind of a layering violation.  You're grabbing a
> > string from an object without copying it, destroying the object, and
> > then handing it back to the thing that called you.  The only way this
> > works is if you know that the class ralloc'd the string from the
> > context you gave it.  We do, in this case, but it did seem like a bit
> > of a layering violation.
> >
> > 3) The first time I did this rework, I created a new memory context
> > for *_emit and destroyed that memory context at the end.  Because
> > fail_msg was allocated out of this temp context, I had to do something
> > with it before returning it.  The objective there was to remove the
> > mem_ctx input parameter and make it more self-contained.  Then I

Re: [Mesa-dev] [PATCH 03/10] radeonsi: initialize output, temp, and address registers to "undef"

2015-10-15 Thread Tom Stellard
On Sun, Oct 11, 2015 at 03:29:43AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 
> This removes "v_mov v0, 0" which typically occurs before exports.
> 

Reviewed-by: Tom Stellard 

> Totals:
> SGPRS: 345216 -> 344552 (-0.19 %)
> VGPRS: 197684 -> 197132 (-0.28 %)
> Code Size: 7390408 -> 7375376 (-0.20 %) bytes
> LDS: 91 -> 91 (0.00 %) blocks
> Scratch: 1842176 -> 1679360 (-8.84 %) bytes per wave
> 
> Totals from affected shaders:
> SGPRS: 101336 -> 100672 (-0.66 %)
> VGPRS: 53920 -> 53368 (-1.02 %)
> Code Size: 2170176 -> 2155144 (-0.69 %) bytes
> LDS: 2 -> 2 (0.00 %) blocks
> Scratch: 1015808 -> 852992 (-16.03 %) bytes per wave
> ---
>  src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index 2e9a013..f548d1a 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -272,6 +272,15 @@ static LLVMValueRef fetch_system_value(
>   return bitcast(bld_base, type, cval);
>  }
>  
> +static LLVMValueRef si_build_alloca_undef(struct gallivm_state *gallivm,
> +   LLVMTypeRef type,
> +   const char *name)
> +{
> + LLVMValueRef ptr = lp_build_alloca(gallivm, type, name);
> + LLVMBuildStore(gallivm->builder, LLVMGetUndef(type), ptr);
> + return ptr;
> +}
> +
>  static void emit_declaration(
>   struct lp_build_tgsi_context * bld_base,
>   const struct tgsi_full_declaration *decl)
> @@ -285,7 +294,7 @@ static void emit_declaration(
>   for (idx = decl->Range.First; idx <= decl->Range.Last; idx++) {
>   unsigned chan;
>   for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
> -  ctx->soa.addr[idx][chan] = lp_build_alloca(
> +  ctx->soa.addr[idx][chan] = 
> si_build_alloca_undef(
>   &ctx->gallivm,
>   ctx->soa.bld_base.uint_bld.elem_type, 
> "");
>   }
> @@ -315,8 +324,9 @@ static void emit_declaration(
>   for (idx = first; idx <= last; idx++) {
>   for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
>   ctx->temps[idx * TGSI_NUM_CHANNELS + i] =
> - lp_build_alloca(bld_base->base.gallivm, 
> bld_base->base.vec_type,
> - "temp");
> + 
> si_build_alloca_undef(bld_base->base.gallivm,
> +   
> bld_base->base.vec_type,
> +   "temp");
>   }
>   }
>   break;
> @@ -347,7 +357,8 @@ static void emit_declaration(
>   unsigned chan;
>   assert(idx < RADEON_LLVM_MAX_OUTPUTS);
>   for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
> - ctx->soa.outputs[idx][chan] = 
> lp_build_alloca(&ctx->gallivm,
> + ctx->soa.outputs[idx][chan] = 
> si_build_alloca_undef(
> + &ctx->gallivm,
>   ctx->soa.bld_base.base.elem_type, "");
>   }
>   }
> -- 
> 2.1.4
> 


[Mesa-dev] [PATCH 1/2] st/mesa: fix incorrect pointer type arguments in st_new_program()

2015-10-15 Thread Brian Paul
Silences 5 warnings of the type:
state_tracker/st_cb_program.c: In function 'st_new_program':
state_tracker/st_cb_program.c:108:7: warning: passing argument 1 of
'_mesa_init_gl_program' from incompatible pointer type [enabled by default]
   return _mesa_init_gl_program(&prog->Base, target, id);
   ^
---
 src/mesa/state_tracker/st_cb_program.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_program.c 
b/src/mesa/state_tracker/st_cb_program.c
index 26d128a..708bdf5 100644
--- a/src/mesa/state_tracker/st_cb_program.c
+++ b/src/mesa/state_tracker/st_cb_program.c
@@ -105,23 +105,23 @@ st_new_program(struct gl_context *ctx, GLenum target, 
GLuint id)
switch (target) {
case GL_VERTEX_PROGRAM_ARB: {
   struct st_vertex_program *prog = ST_CALLOC_STRUCT(st_vertex_program);
-  return _mesa_init_gl_program(&prog->Base, target, id);
+  return _mesa_init_gl_program(&prog->Base.Base, target, id);
}
case GL_FRAGMENT_PROGRAM_ARB: {
   struct st_fragment_program *prog = ST_CALLOC_STRUCT(st_fragment_program);
-  return _mesa_init_gl_program(&prog->Base, target, id);
+  return _mesa_init_gl_program(&prog->Base.Base, target, id);
}
case GL_GEOMETRY_PROGRAM_NV: {
   struct st_geometry_program *prog = ST_CALLOC_STRUCT(st_geometry_program);
-  return _mesa_init_gl_program(&prog->Base, target, id);
+  return _mesa_init_gl_program(&prog->Base.Base, target, id);
}
case GL_TESS_CONTROL_PROGRAM_NV: {
   struct st_tessctrl_program *prog = ST_CALLOC_STRUCT(st_tessctrl_program);
-  return _mesa_init_gl_program(&prog->Base, target, id);
+  return _mesa_init_gl_program(&prog->Base.Base, target, id);
}
case GL_TESS_EVALUATION_PROGRAM_NV: {
   struct st_tesseval_program *prog = ST_CALLOC_STRUCT(st_tesseval_program);
-  return _mesa_init_gl_program(&prog->Base, target, id);
+  return _mesa_init_gl_program(&prog->Base.Base, target, id);
}
default:
   assert(0);
-- 
1.9.1
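
The warning fixed by this patch comes from passing a pointer to the wrong level of a nested "base struct" chain. A reduced sketch in plain C is shown below; all struct and function names are hypothetical stand-ins for the Mesa types and are used only to illustrate the nesting.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the innermost program object. */
struct base_program {
    unsigned target;
    unsigned id;
};

/* Stand-in for the stage-specific program, which embeds the base first. */
struct vertex_program {
    struct base_program Base;
    int stage_state;
};

/* Stand-in for the state-tracker wrapper, which embeds the stage program. */
struct st_vertex_prog {
    struct vertex_program Base;
    void *driver_state;
};

/* The init function takes the innermost type. */
static struct base_program *init_program(struct base_program *p,
                                         unsigned target, unsigned id)
{
    if (p) {
        p->target = target;
        p->id = id;
    }
    return p;
}

static struct base_program *new_vertex_program(unsigned target, unsigned id)
{
    struct st_vertex_prog *prog = calloc(1, sizeof(*prog));

    /* &prog->Base has type 'struct vertex_program *', which triggers the
     * incompatible-pointer warning.  &prog->Base.Base has the expected
     * 'struct base_program *' type. */
    return init_program(prog ? &prog->Base.Base : NULL, target, id);
}
```

Both expressions yield the same address because each embedded struct is the first member, which is why the old code happened to work despite the warning.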



Re: [Mesa-dev] [PATCH 1/2] st/mesa: fix incorrect pointer type arguments in st_new_program()

2015-10-15 Thread Emil Velikov
On 15 October 2015 at 14:27, Brian Paul  wrote:
> Silences 5 warnings of the type:
> state_tracker/st_cb_program.c: In function 'st_new_program':
> state_tracker/st_cb_program.c:108:7: warning: passing argument 1 of
> '_mesa_init_gl_program' from incompatible pointer type [enabled by default]
>return _mesa_init_gl_program(&prog->Base, target, id);
>^
Forgot to git add and squash these before pushing. Sorry about that.

Reviewed-by: Emil Velikov 
-Emil


Re: [Mesa-dev] [PATCH] nir/glsl: Use shader_prog->Name for naming the NIR shader

2015-10-15 Thread Jason Ekstrand
On Thu, Oct 15, 2015 at 6:03 AM, Neil Roberts  wrote:
> Ping, could you please push this patch? It's a pain to use the optimise
> debug output without it. Thanks.

Pushed!  Sorry that took so long.  I was off doing other things.

> Reviewed-by: Neil Roberts 

Thanks.

> - Neil
>
> Jason Ekstrand  writes:
>
>> This has the better name to use. Apparently, sh->Name is usually 0.
>> ---
>>  src/glsl/nir/glsl_to_nir.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>> index 6e1dd84..3284bdc 100644
>> --- a/src/glsl/nir/glsl_to_nir.cpp
>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>> @@ -150,7 +150,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
>>if (sh->Program->SamplersUsed & (1 << i))
>>   num_textures = i;
>>
>> -   shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
>> +   shader->info.name = ralloc_asprintf(shader, "GLSL%d", shader_prog->Name);
>> if (shader_prog->Label)
>>shader->info.label = ralloc_strdup(shader, shader_prog->Label);
>> shader->info.num_textures = num_textures;
>> --
>> 2.5.0.400.gff86faf
>>


Re: [Mesa-dev] [PATCH 02/10] gallivm: implement the correct version of LRP

2015-10-15 Thread Marek Olšák
Any comments, or is this okay with people? Given "(1-t)*a + t*b", the
original code didn't return exactly b for t=1 because of floating-point rounding.

Marek

On Sun, Oct 11, 2015 at 3:29 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> The previous version has precision issues. This can be a problem
> with tessellation. Sadly, I can't find the article where I read it
> anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert
> this.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index 0ad78b0..512558b 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -538,12 +538,13 @@ lrp_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> -   LLVMValueRef tmp;
> -   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_SUB,
> -   emit_data->args[1],
> -   emit_data->args[2]);
> -   emit_data->output[emit_data->chan] = lp_build_emit_llvm_ternary(bld_base,
> -TGSI_OPCODE_MAD, emit_data->args[0], tmp, 
> emit_data->args[2]);
> +   struct lp_build_context *bld = &bld_base->base;
> +   LLVMValueRef inv, a, b;
> +
> +   inv = lp_build_sub(bld, bld_base->base.one, emit_data->args[0]);
> +   a = lp_build_mul(bld, emit_data->args[1], emit_data->args[0]);
> +   b = lp_build_mul(bld, emit_data->args[2], inv);
> +   emit_data->output[emit_data->chan] = lp_build_add(bld, a, b);
>  }
>
>  /* TGSI_OPCODE_MAD */
> --
> 2.1.4
>
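
The precision point made in this message can be reproduced with a small C sketch. Using the message's notation "(1-t)*a + t*b", the old code corresponds to the single-multiply form a + t*(b - a). The function names below are hypothetical, and the volatile temporaries are only there to force each intermediate to be rounded to single precision.

```c
/* Old form: one subtract feeding a multiply-add, a + t*(b - a).
 * At t = 1 this computes a + (b - a), which need not equal b. */
static float lerp_mad(float a, float b, float t)
{
    volatile float diff = b - a;   /* rounded to float here */
    volatile float prod = t * diff;
    return a + prod;
}

/* New form: (1 - t)*a + t*b.
 * At t = 1 this is 0*a + 1*b, which is exactly b for finite inputs. */
static float lerp_two_mul(float a, float b, float t)
{
    volatile float inv = 1.0f - t;
    volatile float lhs = inv * a;
    volatile float rhs = t * b;
    return lhs + rhs;
}
```

For example, with a = 1e8f, b = 1.0f and t = 1.0f, the difference b - a rounds to -1e8f in single precision, so the old form returns 0.0f while the new form returns 1.0f.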


[Mesa-dev] [PATCH 1/2] mesa: add more cases to print_list() in dlist.c

2015-10-15 Thread Brian Paul
---
 src/mesa/main/dlist.c | 46 ++
 1 file changed, 46 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index e8059c7..fdb839c 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -9741,6 +9741,46 @@ print_list(struct gl_context *ctx, GLuint list, const 
char *fname)
n[3].f, n[4].f, n[5].f, n[6].f,
get_pointer(&n[7]));
 break;
+ case OPCODE_BLEND_COLOR:
+fprintf(f, "BlendColor %f, %f, %f, %f\n",
+n[1].f, n[2].f, n[3].f, n[4].f);
+break;
+ case OPCODE_BLEND_EQUATION:
+fprintf(f, "BlendEquation %s\n",
+enum_string(n[1].e));
+break;
+ case OPCODE_BLEND_EQUATION_SEPARATE:
+fprintf(f, "BlendEquationSeparate %s, %s\n",
+enum_string(n[1].e),
+enum_string(n[2].e));
+break;
+ case OPCODE_BLEND_FUNC_SEPARATE:
+fprintf(f, "BlendFuncSeparate %s, %s, %s, %s\n",
+enum_string(n[1].e),
+enum_string(n[2].e),
+enum_string(n[3].e),
+enum_string(n[4].e));
+break;
+ case OPCODE_BLEND_EQUATION_I:
+fprintf(f, "BlendEquationi %u, %s\n",
+n[1].ui, enum_string(n[2].e));
+break;
+ case OPCODE_BLEND_EQUATION_SEPARATE_I:
+fprintf(f, "BlendEquationSeparatei %u, %s, %s\n",
+n[1].ui, enum_string(n[2].e), enum_string(n[3].e));
+break;
+ case OPCODE_BLEND_FUNC_I:
+fprintf(f, "BlendFunci %u, %s, %s\n",
+n[1].ui, enum_string(n[2].e), enum_string(n[3].e));
+break;
+ case OPCODE_BLEND_FUNC_SEPARATE_I:
+fprintf(f, "BlendFuncSeparatei %u, %s, %s, %s, %s\n",
+n[1].ui,
+enum_string(n[2].e),
+enum_string(n[3].e),
+enum_string(n[4].e),
+enum_string(n[5].e));
+break;
  case OPCODE_CALL_LIST:
 fprintf(f, "CallList %d\n", (int) n[1].ui);
 break;
@@ -9761,6 +9801,9 @@ print_list(struct gl_context *ctx, GLuint list, const 
char *fname)
  case OPCODE_LINE_STIPPLE:
 fprintf(f, "LineStipple %d %x\n", n[1].i, (int) n[2].us);
 break;
+ case OPCODE_LINE_WIDTH:
+fprintf(f, "LineWidth %f\n", n[1].f);
+break;
  case OPCODE_LOAD_IDENTITY:
 fprintf(f, "LoadIdentity\n");
 break;
@@ -9790,6 +9833,9 @@ print_list(struct gl_context *ctx, GLuint list, const 
char *fname)
 fprintf(f, "Ortho %g %g %g %g %g %g\n",
  n[1].f, n[2].f, n[3].f, n[4].f, n[5].f, n[6].f);
 break;
+ case OPCODE_POINT_SIZE:
+fprintf(f, "PointSize %f\n", n[1].f);
+break;
  case OPCODE_POP_ATTRIB:
 fprintf(f, "PopAttrib\n");
 break;
-- 
1.9.1



[Mesa-dev] [PATCH 2/2] mesa: fix incorrect opcode in save_BlendFunci()

2015-10-15 Thread Brian Paul
Fixes assertion failure with new piglit
arb_draw_buffers_blend-state_set_get test.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/dlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index fdb839c..2b65b2e 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -1400,7 +1400,7 @@ save_BlendFunci(GLuint buf, GLenum sfactor, GLenum 
dfactor)
GET_CURRENT_CONTEXT(ctx);
Node *n;
ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
-   n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_SEPARATE_I, 3);
+   n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_I, 3);
if (n) {
   n[1].ui = buf;
   n[2].e = sfactor;
-- 
1.9.1



Re: [Mesa-dev] [PATCH] glsl: initialise record count to 1

2015-10-15 Thread Marek Olšák
Thanks a lot.

Tested-by: Marek Olšák 

Marek

On Thu, Oct 15, 2015 at 5:16 AM, Timothy Arceri  wrote:
> This was only being done in one of the two process methods.
>
> Fixes issue with samplers using the array size of a previous record.
>
> Cc: Marek Olšák 
> Cc: Jason Ekstrand 
> ---
>  src/glsl/link_uniforms.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
> index 0ccd9c8..e60b050 100644
> --- a/src/glsl/link_uniforms.cpp
> +++ b/src/glsl/link_uniforms.cpp
> @@ -160,6 +160,7 @@ program_resource_visitor::process(ir_variable *var)
>  false, record_array_count);
>ralloc_free(name);
> } else {
> +  this->set_record_array_count(record_array_count);
>this->visit_field(t, var->name, row_major, NULL, packing, false);
> }
>  }
> --
> 2.4.3
>


Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-15 Thread Ville Syrjälä
On Thu, Oct 15, 2015 at 02:19:09PM +, Predut, Marius wrote:
> > -Original Message-
> > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > Sent: Wednesday, October 07, 2015 1:53 PM
> > To: Predut, Marius
> > Cc: mesa-dev@lists.freedesktop.org
> > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for 
> > thinnest
> > width lines
> > 
> > On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > > On PNV platform, for 1 pixel line thickness or less, the general
> > > anti-aliasing algorithm gives up, and a garbage line is generated.
> > > Setting a Line Width of 0.0 specifies the rasterization of the
> > > "thinnest" (one-pixel-wide), non-antialiased lines.
> > > Lines rendered with zero Line Width are rasterized using Grid
> > > Intersection Quantization rules as specified by
> > > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f of the
> > > GEN3 docs.
> > > The patch was tested on Intel Atom CPU N455.
> > >
> > > > This patch follows the same rules as patches fixing the
> > > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > > bug.
> > >
> > > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > > v2: Ian Romanick: comments fix.
> > >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> > >
> > > Signed-off-by: Marius Predut 
> > > ---
> > >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> > >  1 file changed, 15 insertions(+)
> > >
> > > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > > b/src/mesa/drivers/dri/i915/i915_state.c
> > > index 4c83073..897eb59 100644
> > > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat
> > > widthf)
> > >
> > > width = (int) (widthf * 2);
> > > width = CLAMP(width, 1, 0xf);
> > > +
> > > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > > + /* For 1 pixel line thickness or less, the general
> > > +  * anti-aliasing algorithm gives up, and a garbage line is
> > > +  * generated.  Setting a Line Width of 0.0 specifies the
> > > +  * rasterization of the "thinnest" (one-pixel-wide),
> > > +  * non-antialiased lines.
> > > +  *
> > > +  * Lines rendered with zero Line Width are rasterized using
> > > +  * Grid Intersection Quantization rules as specified by
> > > +  * volume 1f of the GEN3 docs,
> > > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > > +  */
> > > +  width = 0;
> > > +   }
> > 
> > I went to do some spec reading, and while I can't confirm the AA <= 1.0
> > problem (no mention in the spec about such things), I can see this fix alone
> > isn't sufficient to satisfy the spec (we lack the round to nearest integer 
> > for
> > non-aa for instance).
> 
> Ville, thanks for the review!
> There do not seem to be many docs on this; here we can rely on experiments or
> on the docs for later GENs.
> 
> > 
> > I think what we'd want is a small helper. i965 has one, although that one
> > looks quite messy. I think this is how I'd write the helper for
> > i915:
> > 
> > unsigned intel_line_width(ctx)
> > {
> > float line_width = ctx->Line.Width;
> > 
> > if (ctx->Line.SmoothFlag)
> > line_width = CLAMP(line_width, MinAA, MaxAA);
> > else
> > line_width = CLAMP(roundf(line_width), Min, Max);
> > 
> > /*
> >  * blah
> >  */
> > if (line_width < 1.5f)
> > line_width = 0.0f;
> > 
> > return U_FIXED(line_width, 1);
> > }
> > 
> > and then use it for both gen2 and gen3 state setup.
> 
> Did you use this, and does it work for you? (I mean, did you test it on your PNV
> platform?)

Didn't do any actual testing yet. I've been meaning to, but just been
too busy with other stuff. I can try to test on pnv today, and maybe on
830 and 85x on the weekend. Hmm, I wonder if the test even works on
gl1?

> I left some comments in Bugzilla about the SmoothFlag flag (on 2015-06-04).
> In my tests the flag seems to be set only if glLineWidth(lineWidth) is called
> with lineWidth != 1.
> 
> > 
> > The clamp part could even be moved to some central place so that all drivers
> > could share it, or I suppose we could stash the appropriately rounded and
> > clamped line width into the context as ctx->Line._Width.
> > 
> > Oh and BTW, the gen4/5 line width handling in i965 looks busted too (only
> > gen6+ got fixed).
> 
> First I intend only to fix the bug, then add extra fixes like CLAMP.
> CLAMP was not done before, and it can be the subject of a follow-up patch series.

Fair enough.

> 
> 
> > 
> > > lis4 |= width << S4_LINE_WIDTH_SHIFT;
> > >
> > > if (lis4 != i915->state.Ctx[I915_CTXREG_LIS4]) {
> > > --
> > > 1.9.1
> > >
> > 
> > --
> > Ville Syrjälä
> > Intel OTC


Re: [Mesa-dev] [PATCH v2 12/11] i965: Add scalar geometry shader support.

2015-10-15 Thread Kenneth Graunke
On Monday, October 12, 2015 02:55:32 PM Kenneth Graunke wrote:
> +void
> +fs_visitor::emit_gs_input_load(const fs_reg &dst,
> +   const nir_src &vertex_src,
> +   unsigned input_offset,
> +   unsigned num_components)
> +{
> +   const brw_vue_prog_data *vue_prog_data = (const brw_vue_prog_data *) prog_data;
> +   const unsigned vertex = nir_src_as_const_value(vertex_src)->u[0];
> +
> +   const unsigned array_stride = vue_prog_data->urb_read_length * 8;
> +
> +   const bool pushed = 4 * input_offset < array_stride;
> +
> +   if (input_offset == 0) {
> +  /* This is the VUE header, containing VARYING_SLOT_LAYER [.y],
> +   * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].
> +   * Only gl_PointSize is available as a GS input, so they must
> +   * be asking for that input.
> +   */
> +  if (pushed) {
> + bld.MOV(dst, fs_reg(ATTR, array_stride * vertex + 3, dst.type));
> +  } else {
> + fs_reg tmp = bld.vgrf(dst.type, 4);
> + fs_inst *inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp,
> +  fs_reg(vertex), fs_reg(0));
> + inst->regs_written = 4;
> + bld.MOV(dst, offset(tmp, bld, 3));
> +  }
> +   } else {
> +  if (pushed) {
> + int index = vertex * array_stride + 4 * input_offset;
> + for (unsigned i = 0; i < num_components; i++) {
> +bld.MOV(offset(dst, bld, i), fs_reg(ATTR, index + i, dst.type));
> + }
> +  } else {
> + fs_inst *inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst,
> +  fs_reg(vertex), fs_reg(input_offset));
> + inst->regs_written = num_components;
> +  }
> +   }
> +}
> +

Kristian pointed out that for instanced geometry shaders, the input VUE
handles are packed into a single register, rather than in 6 separate
registers.  So, this will probably not work out for ARB_gpu_shader5
GS instancing.

I don't remember failing any Piglit tests in that area, but I should
make sure this works.  It doesn't necessarily need to block this
landing, as this support is still hidden behind an environment variable
and most of it is still good.
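
To make the pushed-input addressing in the quoted function easier to follow, here is a small standalone sketch of just the index arithmetic. It assumes, as the patch does, that each vertex occupies urb_read_length * 8 scalar slots in the pushed region and that input_offset counts vec4 slots. The struct and function names are made up for illustration, and nothing here addresses the instanced-GS handle packing mentioned above.

```cpp
#include <cassert>

// Hypothetical result type for the sketch.
struct GsInputLocation {
   bool pushed;         // true if the input lies inside the pushed region
   unsigned attr_index; // scalar ATTR index of the first component (if pushed)
};

static GsInputLocation
locate_gs_input(unsigned urb_read_length, unsigned vertex,
                unsigned input_offset)
{
   // Scalar slots per vertex in the pushed region.
   const unsigned array_stride = urb_read_length * 8;

   // An input is pushed if its first scalar slot falls within the stride.
   const bool pushed = 4 * input_offset < array_stride;

   // Slot 0 is the VUE header; only .w (gl_PointSize) is readable there,
   // so the sketch points at component 3 in that case.
   const unsigned within_vertex = (input_offset == 0) ? 3u : 4 * input_offset;

   GsInputLocation loc;
   loc.pushed = pushed;
   loc.attr_index = pushed ? vertex * array_stride + within_vertex : 0u;
   return loc;
}
```

With urb_read_length = 2 the stride is 16 scalars, so vec4 slot 1 of vertex 1 lands at ATTR index 20, and vec4 slot 4 is no longer pushed and would take the URB read path.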

--Ken




Re: [Mesa-dev] [PATCH 2/3] i965: Implement a new type_size_4x() function.

2015-10-15 Thread Connor Abbott
On Thu, Oct 15, 2015 at 6:17 PM, Kenneth Graunke  wrote:
> Often, shader inputs/outputs are required to be aligned to vec4 slots
> for one reason or another.  When working with the scalar backend, we
> want to count the number of scalar components, yet still respect the
> vec4 packing rules as required.
>
> The new "hybrid" type_size_4x() function pads everything out to vec4
> slots, similar to type_size_vec4(), but counts in scalar components,
> similar to type_size_scalar().
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 52 
> ++
>  src/mesa/drivers/dri/i965/brw_shader.h |  1 +
>  2 files changed, 53 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 01a7c99..4af88c5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -499,6 +499,58 @@ type_size_scalar(const struct glsl_type *type)
>  }
>
>  /**
> + * Returns the number of scalar components needed to store type, assuming
> + * that vectors are padded out to vec4.
> + *
> + * This has the packing rules of type_size_vec4(), but counts components
> + * similar to type_size_scalar().
> + */
> +extern "C" int
> +type_size_4x(const struct glsl_type *type)
> +{
> +   int size;
> +
> +   switch (type->base_type) {
> +   case GLSL_TYPE_UINT:
> +   case GLSL_TYPE_INT:
> +   case GLSL_TYPE_FLOAT:
> +   case GLSL_TYPE_BOOL:
> +  if (type->is_matrix()) {
> + return 4 * type->matrix_columns;
> +  } else {
> + /* Regardless of the size of vector, it's padded out to a vec4. */
> + return 4;
> +  }
> +   case GLSL_TYPE_ARRAY:
> +  return type_size_4x(type->fields.array) * type->length;
> +   case GLSL_TYPE_STRUCT:
> +  size = 0;
> +  for (unsigned i = 0; i < type->length; i++) {
> +size += type_size_4x(type->fields.structure[i].type);
> +  }
> +  return size;
> +   case GLSL_TYPE_SAMPLER:
> +  /* Samplers take up no register space, since they're baked in at
> +   * link time.
> +   */
> +  return 0;
> +   case GLSL_TYPE_ATOMIC_UINT:
> +  return 0;
> +   case GLSL_TYPE_SUBROUTINE:
> +  return 4;
> +   case GLSL_TYPE_IMAGE:
> +  return ALIGN(BRW_IMAGE_PARAM_SIZE, 4);
> +   case GLSL_TYPE_VOID:
> +   case GLSL_TYPE_ERROR:
> +   case GLSL_TYPE_INTERFACE:
> +   case GLSL_TYPE_DOUBLE:
> +  unreachable("not reached");
> +   }
> +
> +   return 0;
> +}

Is there a difference between this and type_size_vec4(type) *  4?
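
One way to explore that question is with a toy model. The sketch below is not the Mesa code: Type, Kind, size_vec4() and size_4x() are simplified stand-ins that cover only scalars, vectors, matrices, arrays, structs and samplers, and they leave out the image and subroutine cases from the patch, so it says nothing definitive about the real functions. Within that reduced model, the padded scalar count is exactly four times the vec4 slot count.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Scalar, Vector, Matrix, Array, Struct, Sampler };

// Simplified stand-in for glsl_type (hypothetical, for illustration only).
struct Type {
   Kind kind;
   unsigned columns;           // Matrix: number of columns
   unsigned length;            // Array: number of elements
   std::vector<Type> members;  // Array: element type in members[0]; Struct: fields
};

// Counts vec4 slots, in the spirit of type_size_vec4().
static int size_vec4(const Type &t)
{
   int size = 0;
   switch (t.kind) {
   case Kind::Scalar:
   case Kind::Vector:
      return 1;
   case Kind::Matrix:
      return static_cast<int>(t.columns);
   case Kind::Array:
      return size_vec4(t.members[0]) * static_cast<int>(t.length);
   case Kind::Struct:
      for (const Type &m : t.members)
         size += size_vec4(m);
      return size;
   case Kind::Sampler:
      return 0;
   }
   return 0;
}

// Counts scalar components with vec4 padding, in the spirit of type_size_4x().
static int size_4x(const Type &t)
{
   int size = 0;
   switch (t.kind) {
   case Kind::Scalar:
   case Kind::Vector:
      return 4;
   case Kind::Matrix:
      return 4 * static_cast<int>(t.columns);
   case Kind::Array:
      return size_4x(t.members[0]) * static_cast<int>(t.length);
   case Kind::Struct:
      for (const Type &m : t.members)
         size += size_4x(m);
      return size;
   case Kind::Sampler:
      return 0;
   }
   return 0;
}

// struct { vec3 a; mat3 b; float c[5]; }
static Type example_struct()
{
   const Type vec3{Kind::Vector, 1, 0, {}};
   const Type mat3{Kind::Matrix, 3, 0, {}};
   const Type scalar{Kind::Scalar, 1, 0, {}};
   const Type float5{Kind::Array, 1, 5, {scalar}};
   return Type{Kind::Struct, 1, 0, {vec3, mat3, float5}};
}
```

For the example struct the model gives 9 vec4 slots and 36 padded scalar components.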

> +
> +/**
>   * Create a MOV to read the timestamp register.
>   *
>   * The caller is responsible for emitting the MOV.  The return value is
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> b/src/mesa/drivers/dri/i965/brw_shader.h
> index ad2de5e..06a5b4c 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -316,6 +316,7 @@ bool brw_cs_precompile(struct gl_context *ctx,
> struct gl_program *prog);
>
>  int type_size_scalar(const struct glsl_type *type);
> +int type_size_4x(const struct glsl_type *type);
>  int type_size_vec4(const struct glsl_type *type);
>
>  bool is_scalar_shader_stage(const struct brw_compiler *compiler, int stage);
> --
> 2.6.1
>


Re: [Mesa-dev] [PATCH] i965/vs: Drop hack that created NIR for fixed function vertex programs.

2015-10-15 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [RFC 2/2] gallium: add tegra support

2015-10-15 Thread Emil Velikov
Hi Christian,

Mostly minor suggestions I'm afraid. Things just look too good for
anything serious.

On 11 October 2015 at 16:09, Christian Gmeiner
 wrote:
> This commit adds tegra support, which uses the renderonly driver
> library.
>
> Signed-off-by: Christian Gmeiner 
> ---
>  configure.ac   | 19 +++-
>  src/gallium/Makefile.am|  6 +++
>  .../auxiliary/target-helpers/inline_drm_helper.h   | 29 
>  src/gallium/drivers/tegra/Automake.inc | 10 +
>  src/gallium/drivers/tegra/Makefile.am  |  9 
>  src/gallium/targets/dri/Makefile.am|  2 +
>  src/gallium/winsys/tegra/drm/Android.mk| 34 +++
>  src/gallium/winsys/tegra/drm/Makefile.am   | 33 ++
>  src/gallium/winsys/tegra/drm/Makefile.sources  |  3 ++
>  src/gallium/winsys/tegra/drm/tegra_drm_public.h| 31 +
>  src/gallium/winsys/tegra/drm/tegra_drm_winsys.c| 51 
> ++
>  11 files changed, 226 insertions(+), 1 deletion(-)
>  create mode 100644 src/gallium/drivers/tegra/Automake.inc
>  create mode 100644 src/gallium/drivers/tegra/Makefile.am
>  create mode 100644 src/gallium/winsys/tegra/drm/Android.mk
>  create mode 100644 src/gallium/winsys/tegra/drm/Makefile.am
>  create mode 100644 src/gallium/winsys/tegra/drm/Makefile.sources
>  create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_public.h
>  create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_winsys.c
>
> diff --git a/configure.ac b/configure.ac
> index ea485b1..9fb8244 100644
> --- a/configure.ac
> +++ b/configure.ac
[snip]
> @@ -2166,6 +2167,12 @@ if test -n "$with_gallium_drivers"; then
>  HAVE_GALLIUM_LLVMPIPE=yes
>  fi
>  ;;
> +xtegra)
> +HAVE_GALLIUM_TEGRA=yes
We need an extra NEED_GALLIUM_NOUVEAU conditional (set to yes here and
in the xnouveau case).
One will also need to duplicate (as a temporary workaround) the
nouveau PKG_CHECK_MODULES here.

Then update the src/gallium/Makefile.am to use it over HAVE_GALLIUM_NOUVEAU

[snip]
> +dnl We need to validate some needed dependencies for renderonly drivers.
> +
> +if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" == xyes  
> ; then
> +AC_ERROR([Building with tegra requires that nouveau])
> +fi
> +
> +
And then you can drop this hunk.

> --- a/src/gallium/auxiliary/target-helpers/inline_drm_helper.h
> +++ b/src/gallium/auxiliary/target-helpers/inline_drm_helper.h
> @@ -59,6 +59,10 @@
>  #include "vc4/drm/vc4_drm_public.h"
>  #endif
>
> +#if GALLIUM_TEGRA
> +#include "tegra/drm/tegra_drm_public.h"
> +#endif
> +
FYI, I'm just testing some updates/rewrites of these target-helpers,
so things might clash in the not so distant future.

[snip]
> --- /dev/null
> +++ b/src/gallium/drivers/tegra/Automake.inc
> @@ -0,0 +1,10 @@
> +if HAVE_GALLIUM_TEGRA
> +
> +TARGET_DRIVERS += tegra
> +TARGET_CPPFLAGS += -DGALLIUM_TEGRA
> +TARGET_LIB_DEPS += \
> +   $(top_builddir)/src/gallium/drivers/renderonly/librenderonly.la \
> +   $(top_builddir)/src/gallium/winsys/tegra/drm/libtegradrm.la \
> +   $(LIBDRM_LIBS)
This, perhaps, should be TEGRA_LIBS, yet we're not using anything from
libdrm_tegra so we should be safe.

[snip]
> --- /dev/null
> +++ b/src/gallium/winsys/tegra/drm/Android.mk
I think we can drop this file for now. Android + tegra is quite
incomplete as is.

[snip]
> --- /dev/null
> +++ b/src/gallium/winsys/tegra/drm/tegra_drm_winsys.c
[snip]
> +#include "renderonly/renderonly_screen.h"
> +#include "../winsys/tegra/drm/tegra_drm_public.h"
> +#include "../winsys/nouveau/drm/nouveau_drm_public.h"
> +
Please rework things to avoid the ../'s

> +#include 
(as mentioned before) Please drop the path from the include.

> +#include 
> +
Flip these two and move them to the top ?

> +static int tegra_tiling(int fd, uint32_t handle)
> +{
> +   struct drm_tegra_gem_set_tiling args;
> +
> +   memset(&args, 0, sizeof(args));
> +   args.handle = handle;
> +   args.mode = DRM_TEGRA_GEM_TILING_MODE_BLOCK;
> +   args.value = 4;
Worth adding a note wrt the magic number ?

Thanks
Emil


Re: [Mesa-dev] [PATCH v2] configure: show which gallium drivers/sts are built

2015-10-15 Thread Michel Dänzer
On 16.10.2015 02:55, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
> 
> v1 -> v2: Take Michel's suggestion to include mesa in the st list, append others

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [Bug 92278] Black screen in War Thunder

2015-10-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92278

Kai Huuhko  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kai.huu...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 1/2] gallium: add renderonly driver

2015-10-15 Thread Emil Velikov
Hi Christian,

I'm glad to see Thierry's work revived. Hopefully this will soon be
the basis of many more drivers.

On 11 October 2015 at 16:09, Christian Gmeiner
 wrote:
> This commit adds a generic renderonly driver library, which fulfills
> the requirements for tegra and etnaviv. As a result it is possible to
> run unmodified egl software directly (without any compositor) on
> supported devices.
>
> In every use case we import a dumb buffer from scanout gpu into
> the renderonly gpu.
>
> If the scanout hardware does support the used tiling format from the
> renderonly gpu, a driver can define a function which is used to 'setup'
> the needed tiling on that imported buffer. This function gets called
> during rendertarget resource creation.
>
> If the scanout hardware does not support the used tiling format we need
> to create an extra rendertarget resource for the renderonly gpu.
> During XXX we blit the renderonly rendertarget onto the imported dumb
> buffer.
>
I'd assume you meant to add something over the XXX here :-P

But seriously, some people might not be too happy with the blit onto a
dumb buffer. Personally I'm OK with it, esp. since we don't have anything
better atm.

That aside, there are a few minor nitpicks below. With those sorted I
believe the patch is good to land.

> We assume that the renderonly driver provides a blit function that is
> capable of resolving the tiled buffer into an untiled one.
>
> Signed-off-by: Christian Gmeiner 
> ---
>  configure.ac   |   1 +
>  src/gallium/drivers/renderonly/Makefile.am |  11 +
>  src/gallium/drivers/renderonly/Makefile.sources|   4 +
>  .../drivers/renderonly/renderonly_context.c| 721 
> +
>  .../drivers/renderonly/renderonly_context.h|  80 +++
>  .../drivers/renderonly/renderonly_resource.c   | 296 +
>  .../drivers/renderonly/renderonly_resource.h   | 101 +++
>  src/gallium/drivers/renderonly/renderonly_screen.c | 178 +
>  src/gallium/drivers/renderonly/renderonly_screen.h |  55 ++
>  9 files changed, 1447 insertions(+)
>  create mode 100644 src/gallium/drivers/renderonly/Makefile.am
>  create mode 100644 src/gallium/drivers/renderonly/Makefile.sources
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_context.c
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_context.h
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_resource.c
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_resource.h
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_screen.c
>  create mode 100644 src/gallium/drivers/renderonly/renderonly_screen.h
>
> diff --git a/configure.ac b/configure.ac
> index 217281f..ea485b1 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2361,6 +2361,7 @@ AC_CONFIG_FILES([Makefile
> src/gallium/drivers/radeon/Makefile
> src/gallium/drivers/radeonsi/Makefile
> src/gallium/drivers/rbug/Makefile
> +   src/gallium/drivers/renderonly/Makefile
> src/gallium/drivers/softpipe/Makefile
> src/gallium/drivers/svga/Makefile
> src/gallium/drivers/trace/Makefile

Don't recall off the top of my head, but we might need the following
hunk. Otherwise the files won't end up in the tarball and configure
will scream at us.

--- a/src/gallium/Makefile.am
+++ b/src/gallium/Makefile.am
@@ -109,6 +109,7 @@ EXTRA_DIST = \
   docs \
   README.portability \
   SConscript \
+   drivers/renderonly \
   winsys/sw/gdi \
   winsys/sw/hgl

> --- /dev/null
> +++ b/src/gallium/drivers/renderonly/Makefile.sources
> @@ -0,0 +1,4 @@
> +C_SOURCES := \
> +   renderonly_context.c \
> +   renderonly_resource.c \
> +   renderonly_screen.c
Please list all the sources (including the headers) in here, sorted
alphabetically.

> --- /dev/null
> +++ b/src/gallium/drivers/renderonly/renderonly_context.c
[snip]
> +static void
> +renderonly_draw_vbo(struct pipe_context *pcontext,
> +  const struct pipe_draw_info *pinfo)
> +{
> +   struct renderonly_context *context = to_renderonly_context(pcontext);
> +   struct pipe_draw_info info;
> +
> +   if (pinfo && pinfo->indirect) {
Can pinfo really be null here ?

> +   memcpy(&info, pinfo, sizeof(info));
> +   info.indirect = renderonly_resource_unwrap(info.indirect);
During the unwrapping sometimes we're using the base object sometimes
the wrapped one. Can we use just the latter ? It should minimize the
(brief) 'wtf !?' moments.

[snip]
> +static void
> +renderonly_set_framebuffer_state(struct pipe_context *pcontext,
> +   const struct pipe_framebuffer_state *fb)
> +{
> +   struct renderonly_context *context = to_renderonly_context(pcontext);
> +   struct pipe_framebuffer_state state;
> +   unsigned i;
> +
> +   if (fb) {

Re: [Mesa-dev] [PATCH 3/7] glsl: add AoA support to subroutines

2015-10-15 Thread Timothy Arceri
On Fri, 2015-10-16 at 11:44 +1000, Dave Airlie wrote:
> you gotta give me something in the commit msg to have any idea what
> I'm reading here :-)
> 
> why does process_parameters move?

Because we need actual_parameters processed earlier, so that we can pass it
to match_subroutine_by_name() to get the subroutine variable. We need to do
this inside the recursive function generate_array_index() because we can't
create the ir_dereference_array() until we have reached the outermost
array.

For the remainder of the array dimensions the type doesn't matter so we
can just use the existing _mesa_ast_array_index_to_hir() function to
process the ast.

Hope that makes sense.
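
In case it helps, the shape of that recursion can be sketched on a toy AST. This is only an illustration under my own assumptions: Expr, name_expr(), index_expr() and build_deref() are hypothetical names, and a string stands in for the IR that the real code builds. The point it shows is the order of operations: recurse down to the identifier first, resolve the name there, then wrap one index per level on the way back out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy AST node: either a plain identifier or "array[index]".
struct Expr {
   std::string identifier;       // set for a plain name
   std::unique_ptr<Expr> array;  // set for array[index]
   int index = 0;
};

static Expr name_expr(std::string name)
{
   Expr e;
   e.identifier = std::move(name);
   return e;
}

static Expr index_expr(Expr array, int index)
{
   Expr e;
   e.array = std::unique_ptr<Expr>(new Expr(std::move(array)));
   e.index = index;
   return e;
}

// Builds a textual "dereference" of array[idx]. The base name is only known
// once the recursion reaches the identifier, so it is reported via *name.
static std::string build_deref(const Expr &array, int idx, std::string *name)
{
   if (array.array) {
      // More dimensions to go: resolve the inner part first, then index it.
      std::string inner = build_deref(*array.array, array.index, name);
      return "(" + inner + ")[" + std::to_string(idx) + "]";
   }

   // Base case: this is where the variable lookup by name would happen.
   *name = array.identifier;
   return array.identifier + "[" + std::to_string(idx) + "]";
}
```

For a call through s[1][2], the caller passes the s[1] node and the index 2, and gets back "(s[1])[2]" with the name "s" filled in.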

> 
> is there a piglit for subroutine/arrays?

Just a simple one that extends one of your tests for AoA:

spec/arb_arrays_of_arrays/execution/subroutines/fs-subroutine.shader_test

> 
> Dave.


[Mesa-dev] [PATCH 3/7] glsl: add AoA support to subroutines

2015-10-15 Thread Timothy Arceri
Cc: Dave Airlie 
---
 src/glsl/ast_function.cpp | 43 ++-
 src/glsl/lower_subroutine.cpp |  2 +-
 2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index c5c5cae..e4e4a3f 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -610,6 +610,37 @@ match_subroutine_by_name(const char *name,
return sig;
 }
 
+static ir_rvalue *
+generate_array_index(void *mem_ctx, exec_list *instructions,
+ struct _mesa_glsl_parse_state *state, YYLTYPE loc,
+ const ast_expression *array, ast_expression *idx,
+ const char **function_name, exec_list *actual_parameters)
+{
+   if (array->oper == ast_array_index) {
+  /* This handles arrays of arrays */
+  ir_rvalue *outer_array = generate_array_index(mem_ctx, instructions,
+state, loc,
+array->subexpressions[0],
+array->subexpressions[1],
+function_name, 
actual_parameters);
+  ir_rvalue *outer_array_idx = idx->hir(instructions, state);
+
+  YYLTYPE index_loc = idx->get_location();
+  return _mesa_ast_array_index_to_hir(mem_ctx, state, outer_array,
+  outer_array_idx, loc,
+  index_loc);
+   } else {
+  ir_variable *sub_var = NULL;
+  *function_name = array->primary_expression.identifier;
+
+  match_subroutine_by_name(*function_name, actual_parameters,
+   state, &sub_var);
+
+  ir_rvalue *outer_array_idx = idx->hir(instructions, state);
+  return new(mem_ctx) ir_dereference_array(sub_var, outer_array_idx);
+   }
+}
+
 static void
 print_function_prototypes(_mesa_glsl_parse_state *state, YYLTYPE *loc,
   ir_function *f)
@@ -1989,16 +2020,18 @@ ast_function_expression::hir(exec_list *instructions,
   ir_variable *sub_var = NULL;
   ir_rvalue *array_idx = NULL;
 
+  process_parameters(instructions, &actual_parameters, &this->expressions,
+state);
+
   if (id->oper == ast_array_index) {
- func_name = id->subexpressions[0]->primary_expression.identifier;
-array_idx = id->subexpressions[1]->hir(instructions, state);
+ array_idx = generate_array_index(ctx, instructions, state, loc,
+  id->subexpressions[0],
+  id->subexpressions[1], &func_name,
+  &actual_parameters);
   } else {
  func_name = id->primary_expression.identifier;
   }
 
-  process_parameters(instructions, &actual_parameters, &this->expressions,
-state);
-
   ir_function_signature *sig =
 match_function_by_name(func_name, &actual_parameters, state);
 
diff --git a/src/glsl/lower_subroutine.cpp b/src/glsl/lower_subroutine.cpp
index c1aed61..a0df5e1 100644
--- a/src/glsl/lower_subroutine.cpp
+++ b/src/glsl/lower_subroutine.cpp
@@ -84,7 +84,7 @@ lower_subroutine_visitor::visit_leave(ir_call *ir)
  continue;
 
   if (ir->array_idx != NULL)
- var = new(mem_ctx) ir_dereference_array(ir->sub_var, 
ir->array_idx->clone(mem_ctx, NULL));
+ var = ir->array_idx->clone(mem_ctx, NULL);
   else
  var = new(mem_ctx) ir_dereference_variable(ir->sub_var);
 
-- 
2.4.3



[Mesa-dev] [PATCH 7/7] docs: Mark AoA as done for i965

2015-10-15 Thread Timothy Arceri
Reviewed-by: Ian Romanick 
---
 docs/GL3.txt  | 4 ++--
 docs/relnotes/11.1.0.html | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 6503e2a..f8e2680 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -149,7 +149,7 @@ GL 4.2, GLSL 4.20:
 
 GL 4.3, GLSL 4.30:
 
-  GL_ARB_arrays_of_arrays  started (Timothy)
+  GL_ARB_arrays_of_arrays  DONE (i965)
   GL_ARB_ES3_compatibility DONE (all drivers that 
support GLSL 3.30)
   GL_ARB_clear_buffer_object   DONE (all drivers)
   GL_ARB_compute_shaderin progress (jljusten)
@@ -209,7 +209,7 @@ GL 4.5, GLSL 4.50:
 
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
-  GL_ARB_arrays_of_arrays  started (Timothy)
+  GL_ARB_arrays_of_arrays  DONE (i965)
   GL_ARB_compute_shaderin progress (jljusten)
   GL_ARB_draw_indirect DONE (i965, nvc0, r600, 
radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location DONE (all drivers that 
support GLSL)
diff --git a/docs/relnotes/11.1.0.html b/docs/relnotes/11.1.0.html
index dcf425e..b5dd208 100644
--- a/docs/relnotes/11.1.0.html
+++ b/docs/relnotes/11.1.0.html
@@ -44,6 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ARB_arrays_of_arrays on i965
 GL_ARB_blend_func_extended on freedreno (a3xx)
 GL_ARB_gpu_shader_fp64 on r600 for Cypress/Cayman/Aruba chips
 GL_ARB_gpu_shader5 on r600 for Evergreen and later chips
-- 
2.4.3



[Mesa-dev] [PATCH 6/7] i965: enable ARB_arrays_of_arrays

2015-10-15 Thread Timothy Arceri
Reviewed-by: Samuel Iglesias Gonsálvez 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3f9afd1..c1f3d0d 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -174,6 +174,7 @@ intelInitExtensions(struct gl_context *ctx)
 
assert(brw->gen >= 4);
 
+   ctx->Extensions.ARB_arrays_of_arrays = true;
ctx->Extensions.ARB_buffer_storage = true;
ctx->Extensions.ARB_clear_texture = true;
ctx->Extensions.ARB_clip_control = true;
-- 
2.4.3



[Mesa-dev] [PATCH 5/7] i965: add support for image AoA

2015-10-15 Thread Timothy Arceri
Cc: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 44 --
 src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp |  2 ++
 2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 0e044d0..16b5f0a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1037,19 +1037,27 @@ fs_visitor::get_nir_image_deref(const nir_deref_var 
*deref)
 {
fs_reg image(UNIFORM, deref->var->data.driver_location,
 BRW_REGISTER_TYPE_UD);
-
-   if (deref->deref.child) {
-  const nir_deref_array *deref_array =
- nir_deref_as_array(deref->deref.child);
-  assert(deref->deref.child->deref_type == nir_deref_type_array &&
- deref_array->deref.child == NULL);
-  const unsigned size = glsl_get_length(deref->var->type);
+   fs_reg *indirect_offset = NULL;
+
+   unsigned img_offset = 0;
+   const nir_deref *tail = &deref->deref;
+   while (tail->child) {
+  const nir_deref_array *deref_array = nir_deref_as_array(tail->child);
+  assert(tail->child->deref_type == nir_deref_type_array);
+  tail = tail->child;
+  const unsigned size = glsl_get_length(tail->type);
+  const unsigned child_array_elements = tail->child != NULL ?
+ glsl_get_aoa_size(tail->type) : 1;
   const unsigned base = MIN2(deref_array->base_offset, size - 1);
-
-  image = offset(image, bld, base * BRW_IMAGE_PARAM_SIZE);
+  const unsigned aoa_size = child_array_elements * BRW_IMAGE_PARAM_SIZE;
+  img_offset += base * aoa_size;
 
   if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
- fs_reg *tmp = new(mem_ctx) fs_reg(vgrf(glsl_type::int_type));
+ fs_reg tmp = vgrf(glsl_type::int_type);
+ if (indirect_offset == NULL) {
+indirect_offset = new(mem_ctx) fs_reg(vgrf(glsl_type::int_type));
+bld.MOV(*indirect_offset, fs_reg(0));
+ }
 
  if (devinfo->gen == 7 && !devinfo->is_haswell) {
 /* IVB hangs when trying to access an invalid surface index with
@@ -1060,18 +1068,22 @@ fs_visitor::get_nir_image_deref(const nir_deref_var 
*deref)
  * of the possible outcomes of the hang.  Clamp the index to
  * prevent access outside of the array bounds.
  */
-bld.emit_minmax(*tmp, retype(get_nir_src(deref_array->indirect),
- BRW_REGISTER_TYPE_UD),
+bld.emit_minmax(tmp, retype(get_nir_src(deref_array->indirect),
+BRW_REGISTER_TYPE_UD),
 fs_reg(size - base - 1), BRW_CONDITIONAL_L);
  } else {
-bld.MOV(*tmp, get_nir_src(deref_array->indirect));
+bld.MOV(tmp, get_nir_src(deref_array->indirect));
  }
-
- bld.MUL(*tmp, *tmp, fs_reg(BRW_IMAGE_PARAM_SIZE));
- image.reladdr = tmp;
+ bld.MUL(tmp, tmp, fs_reg(aoa_size));
+ bld.ADD(*indirect_offset, *indirect_offset, tmp);
   }
}
 
+   if (indirect_offset) {
+  image.reladdr = indirect_offset;
+   }
+   image = offset(image, bld, img_offset);
+
return image;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
index d3326e9..87b3839 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
+++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
@@ -98,6 +98,8 @@ brw_nir_setup_glsl_uniform(gl_shader_stage stage, 
nir_variable *var,
   if (storage->type->is_image()) {
  brw_setup_image_uniform_values(stage, stage_prog_data,
 uniform_index, storage);
+ uniform_index +=
+BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
   } else {
  gl_constant_value *components = storage->storage;
  unsigned vector_count = (MAX2(storage->array_elements, 1) *
-- 
2.4.3
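
The constant part of the offset walk in this patch (and the similar loop in the atomic lowering patch) boils down to row-major flattening. The sketch below is a standalone illustration under my own assumptions, not driver code: flat_offset() is a hypothetical name, dims[0] is the outermost dimension, and elem_size stands in for a per-element size such as BRW_IMAGE_PARAM_SIZE or ATOMIC_COUNTER_SIZE. It leaves out the indirect-index and clamping handling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flattens one index per array dimension into a single element offset.
// At each level the index is scaled by the number of elements in all
// inner dimensions, which is the role child_array_elements plays above.
static unsigned flat_offset(const std::vector<unsigned> &dims,
                            const std::vector<unsigned> &indices,
                            unsigned elem_size)
{
   assert(dims.size() == indices.size());

   unsigned offset = 0;
   for (std::size_t level = 0; level < dims.size(); level++) {
      unsigned child_elements = 1;
      for (std::size_t inner = level + 1; inner < dims.size(); inner++)
         child_elements *= dims[inner];

      offset += indices[level] * child_elements * elem_size;
   }
   return offset;
}
```

For a [2][3][4] array, index [1][2][3] gives 1*12 + 2*4 + 3 = 23 elements, which is then scaled by the element size.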



[Mesa-dev] [PATCH 4/7] glsl: set image access qualifiers for AoA

2015-10-15 Thread Timothy Arceri
Cc: Francisco Jerez 
---
 src/glsl/link_uniforms.cpp | 77 +-
 1 file changed, 49 insertions(+), 28 deletions(-)

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 647aa2b..2a1da07 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -1008,38 +1008,37 @@ link_update_uniform_buffer_variables(struct gl_shader 
*shader)
}
 }
 
-/**
- * Scan the program for image uniforms and store image unit access
- * information into the gl_shader data structure.
- */
 static void
-link_set_image_access_qualifiers(struct gl_shader_program *prog)
+link_set_image_access_qualifiers(struct gl_shader_program *prog,
+ gl_shader *sh, unsigned shader_stage,
+ ir_variable *var, const glsl_type *type,
+ char **name, size_t name_length)
 {
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  gl_shader *sh = prog->_LinkedShaders[i];
-
-  if (sh == NULL)
-continue;
+   /* Handle arrays of arrays */
+   if (type->is_array() && type->fields.array->is_array()) {
+  for (unsigned i = 0; i < type->length; i++) {
+size_t new_length = name_length;
 
-  foreach_in_list(ir_instruction, node, sh->ir) {
-ir_variable *var = node->as_variable();
+/* Append the subscript to the current variable name */
+ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]", i);
 
- if (var && var->data.mode == ir_var_uniform &&
- var->type->contains_image()) {
-unsigned id = 0;
-bool found = prog->UniformHash->get(id, var->name);
-assert(found);
-(void) found;
-const gl_uniform_storage *storage = &prog->UniformStorage[id];
-const unsigned index = storage->opaque[i].index;
-const GLenum access = (var->data.image_read_only ? GL_READ_ONLY :
-   var->data.image_write_only ? GL_WRITE_ONLY :
-   GL_READ_WRITE);
-
-for (unsigned j = 0; j < MAX2(1, storage->array_elements); ++j)
-   sh->ImageAccess[index + j] = access;
- }
+ link_set_image_access_qualifiers(prog, sh, shader_stage, var,
+  type->fields.array, name,
+  new_length);
   }
+   } else {
+  unsigned id = 0;
+  bool found = prog->UniformHash->get(id, *name);
+  assert(found);
+  (void) found;
+  const gl_uniform_storage *storage = &prog->UniformStorage[id];
+  const unsigned index = storage->opaque[shader_stage].index;
+  const GLenum access = (var->data.image_read_only ? GL_READ_ONLY :
+ var->data.image_write_only ? GL_WRITE_ONLY :
+ GL_READ_WRITE);
+
+  for (unsigned j = 0; j < MAX2(1, storage->array_elements); ++j)
+ sh->ImageAccess[index + j] = access;
}
 }
 
@@ -1300,7 +1299,29 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog,
prog->NumHiddenUniforms = hidden_uniforms;
prog->UniformStorage = uniforms;
 
-   link_set_image_access_qualifiers(prog);
+   /**
+* Scan the program for image uniforms and store image unit access
+* information into the gl_shader data structure.
+*/
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  gl_shader *sh = prog->_LinkedShaders[i];
+
+  if (sh == NULL)
+continue;
+
+  foreach_in_list(ir_instruction, node, sh->ir) {
+ir_variable *var = node->as_variable();
+
+ if (var && var->data.mode == ir_var_uniform &&
+ var->type->contains_image()) {
+char *name_copy = ralloc_strdup(NULL, var->name);
+link_set_image_access_qualifiers(prog, sh, i, var, var->type,
+ &name_copy, strlen(var->name));
+ralloc_free(name_copy);
+ }
+  }
+   }
+
link_set_uniform_initializers(prog, boolean_true);
 
return;
-- 
2.4.3
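
As a rough picture of what the recursive name building does, here is a standalone sketch. It assumes, as the patch appears to, that subscripts are appended for every dimension except the innermost one, since the innermost array is covered by the array_elements loop of a single uniform storage entry. collect_names() and its parameters are hypothetical names for illustration, with dims[0] as the outermost dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends "[i]" for each outer dimension and records one name per
// innermost array, without expanding the innermost dimension itself.
static void collect_names(const std::string &prefix,
                          const std::vector<unsigned> &dims,
                          std::size_t depth,
                          std::vector<std::string> *out)
{
   if (depth + 1 < dims.size()) {
      // Array of arrays at this level: recurse once per element.
      for (unsigned i = 0; i < dims[depth]; i++) {
         collect_names(prefix + "[" + std::to_string(i) + "]",
                       dims, depth + 1, out);
      }
   } else {
      // Innermost array (or a non-array): this is the name to look up.
      out->push_back(prefix);
   }
}
```

For an image array declared as img[2][3][4], this yields six names, from img[0][0] to img[1][2].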



[Mesa-dev] [PATCH 1/7] nir: wrapper for glsl_type arrays_of_arrays_size()

2015-10-15 Thread Timothy Arceri
Reviewed-by: Tapani Pälli 
Reviewed-by: Ian Romanick 
---
 src/glsl/nir/nir_types.cpp | 6 ++
 src/glsl/nir/nir_types.h   | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
index da9807f..965f423 100644
--- a/src/glsl/nir/nir_types.cpp
+++ b/src/glsl/nir/nir_types.cpp
@@ -106,6 +106,12 @@ glsl_get_length(const struct glsl_type *type)
return type->is_matrix() ? type->matrix_columns : type->length;
 }
 
+unsigned
+glsl_get_aoa_size(const struct glsl_type *type)
+{
+   return type->arrays_of_arrays_size();
+}
+
 const char *
 glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index)
 {
diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
index 49d6a65..009a0fb 100644
--- a/src/glsl/nir/nir_types.h
+++ b/src/glsl/nir/nir_types.h
@@ -59,6 +59,8 @@ unsigned glsl_get_matrix_columns(const struct glsl_type 
*type);
 
 unsigned glsl_get_length(const struct glsl_type *type);
 
+unsigned glsl_get_aoa_size(const struct glsl_type *type);
+
 const char *glsl_get_struct_elem_name(const struct glsl_type *type,
   unsigned index);
 
-- 
2.4.3



[Mesa-dev] [PATCH 2/7] nir: add atomic lowering support for AoA

2015-10-15 Thread Timothy Arceri
Cc: Francisco Jerez 
Cc: Jason Ekstrand 
---
 src/glsl/nir/nir_lower_atomics.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/glsl/nir/nir_lower_atomics.c b/src/glsl/nir/nir_lower_atomics.c
index 6f9ecc0..46e1376 100644
--- a/src/glsl/nir/nir_lower_atomics.c
+++ b/src/glsl/nir/nir_lower_atomics.c
@@ -72,20 +72,22 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl 
*impl)
 
    nir_ssa_def *offset_def = &offset_const->def;
 
-   if (instr->variables[0]->deref.child != NULL) {
-  assert(instr->variables[0]->deref.child->deref_type ==
- nir_deref_type_array);
-  nir_deref_array *deref_array =
- nir_deref_as_array(instr->variables[0]->deref.child);
-  assert(deref_array->deref.child == NULL);
+   nir_deref *tail = &instr->variables[0]->deref;
+   while (tail->child != NULL) {
+  assert(tail->child->deref_type == nir_deref_type_array);
+  nir_deref_array *deref_array = nir_deref_as_array(tail->child);
+  tail = tail->child;
 
-  offset_const->value.u[0] +=
- deref_array->base_offset * ATOMIC_COUNTER_SIZE;
+  unsigned child_array_elements = tail->child != NULL ?
+ glsl_get_aoa_size(tail->type) : 1;
+
+  offset_const->value.u[0] += deref_array->base_offset *
+ child_array_elements * ATOMIC_COUNTER_SIZE;
 
   if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
  nir_load_const_instr *atomic_counter_size =
nir_load_const_instr_create(mem_ctx, 1);
- atomic_counter_size->value.u[0] = ATOMIC_COUNTER_SIZE;
+ atomic_counter_size->value.u[0] = child_array_elements * ATOMIC_COUNTER_SIZE;
  nir_instr_insert_before(&instr->instr, &atomic_counter_size->instr);
 
  nir_alu_instr *mul = nir_alu_instr_create(mem_ctx, nir_op_imul);
@@ -102,7 +104,7 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl *impl)
  add->src[0].src.is_ssa = true;
  add->src[0].src.ssa = &mul->dest.dest.ssa;
  add->src[1].src.is_ssa = true;
- add->src[1].src.ssa = &offset_const->def;
+ add->src[1].src.ssa = offset_def;
  nir_instr_insert_before(&instr->instr, &add->instr);
 
  offset_def = &add->dest.dest.ssa;
-- 
2.4.3
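
For reference, the offset arithmetic in the loop above can be sketched in isolation. This is a standalone illustration, not code from the patch: aoa_counter_offset() and its dims/idx parameters are hypothetical, and ATOMIC_COUNTER_SIZE is assumed to be 4 bytes. Each array index is scaled by the number of counters in the remaining inner dimensions, which is the role child_array_elements plays in the patch.

```c
#include <stddef.h>

/* Assumed size in bytes of one atomic counter. */
#define ATOMIC_COUNTER_SIZE 4

/*
 * Hypothetical helper: byte offset of counter[idx[0]][idx[1]]...[idx[n-1]]
 * inside an array of arrays with dimensions dims[0..n-1], starting at base.
 */
static unsigned
aoa_counter_offset(unsigned base, const unsigned *dims,
                   const unsigned *idx, size_t n)
{
   unsigned offset = base;

   for (size_t i = 0; i < n; i++) {
      /* Counters covered by one step of dimension i: the product of all
       * inner dimensions, or 1 for the innermost dimension. */
      unsigned child_array_elements = 1;
      for (size_t j = i + 1; j < n; j++)
         child_array_elements *= dims[j];

      offset += idx[i] * child_array_elements * ATOMIC_COUNTER_SIZE;
   }

   return offset;
}
```

Under these assumptions, for a counter declared as c[2][3][4], element c[1][2][3] at base offset 0 lands at (1*12 + 2*4 + 3) * 4 = 92 bytes.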



[Mesa-dev] Remaining unreviewed AoA patches

2015-10-15 Thread Timothy Arceri
This series is just a resend of the remaining unreviewed AoA v7 patches.

This time round I've Cc'd those with the most knowledge about the code
each change touches in the hope of finally wrapping this up.

Patches 2-5 are the unreviewed patches.

Thanks,
Tim


Re: [Mesa-dev] [RFC 1/2] gallium: add renderonly driver

2015-10-15 Thread Rob Clark
On Thu, Oct 15, 2015 at 7:09 PM, Emil Velikov  wrote:
> Hi Christian,
>
> I'm glad to see Thierry's work revived. Hopefully this will soon be
> the basis of many more drivers.
>
> On 11 October 2015 at 16:09, Christian Gmeiner
>  wrote:
> >> This commit adds a generic renderonly driver library, which fulfills
>> the requirements for tegra and etnaviv. As a result it is possible to
>> run unmodified egl software directly (without any compositor) on
>> supported devices.
>>
>> In every use case we import a dumb buffer from scanout gpu into
>> the renderonly gpu.
>>
>> If the scanout hardware does support the used tiling format from the
>> renderonly gpu, a driver can define a function which is used to 'setup'
> >> the needed tiling on that imported buffer. This function gets called
>> during rendertarget resource creation.
>>
>> If the scanout hardware does not support the used tiling format we need
>> to create an extra rendertarget resource for the renderonly gpu.
>> During XXX we blit the renderonly rendertarget onto the imported dumb
>> buffer.
>>
> I'd assume you meant to add something over the XXX here :-P
>
> But seriously some people might not be too happy with the blit onto
> dumb buffer. Personally I'm ok, esp. since we don't have anything better
> atm.

imho, it is ok if driver specific (or maybe in this case we should
call it "semi driver-specific") code is blitting (or otherwise using
gpu) on dumb buffers..  the main point about dumb buffers is some
particular gpu may not support operating on dumb buffers, so truly
generic code outside of drivers should not make assumptions..

I guess this code counts as "helper" code, so some hw that could not
support dumb buffer access simply doesn't use it and implements their
own version instead..

BR,
-R

> That aside, there are a few minor nitpicks below. With those sorted I
> believe the patch is good to land.
>
>> We assume that the renderonly driver provides a blit function that is
> >> capable of resolving the tiled buffer into an untiled one.
>>
>> Signed-off-by: Christian Gmeiner 
>> ---
>>  configure.ac   |   1 +
>>  src/gallium/drivers/renderonly/Makefile.am |  11 +
>>  src/gallium/drivers/renderonly/Makefile.sources|   4 +
>>  .../drivers/renderonly/renderonly_context.c| 721 
>> +
>>  .../drivers/renderonly/renderonly_context.h|  80 +++
>>  .../drivers/renderonly/renderonly_resource.c   | 296 +
>>  .../drivers/renderonly/renderonly_resource.h   | 101 +++
>>  src/gallium/drivers/renderonly/renderonly_screen.c | 178 +
>>  src/gallium/drivers/renderonly/renderonly_screen.h |  55 ++
>>  9 files changed, 1447 insertions(+)
>>  create mode 100644 src/gallium/drivers/renderonly/Makefile.am
>>  create mode 100644 src/gallium/drivers/renderonly/Makefile.sources
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_context.c
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_context.h
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_resource.c
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_resource.h
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_screen.c
>>  create mode 100644 src/gallium/drivers/renderonly/renderonly_screen.h
>>
>> diff --git a/configure.ac b/configure.ac
>> index 217281f..ea485b1 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -2361,6 +2361,7 @@ AC_CONFIG_FILES([Makefile
>> src/gallium/drivers/radeon/Makefile
>> src/gallium/drivers/radeonsi/Makefile
>> src/gallium/drivers/rbug/Makefile
>> +   src/gallium/drivers/renderonly/Makefile
>> src/gallium/drivers/softpipe/Makefile
>> src/gallium/drivers/svga/Makefile
>> src/gallium/drivers/trace/Makefile
>
> Don't recall off the top of my head but we might need the following
> hunk. Otherwise the files won't end up in the tarball and configure
> will scream at us.
>
> --- a/src/gallium/Makefile.am
> +++ b/src/gallium/Makefile.am
> @@ -109,6 +109,7 @@ EXTRA_DIST = \
>docs \
>README.portability \
>SConscript \
> +   drivers/renderonly \
>winsys/sw/gdi \
>winsys/sw/hgl
>
>> --- /dev/null
>> +++ b/src/gallium/drivers/renderonly/Makefile.sources
>> @@ -0,0 +1,4 @@
>> +C_SOURCES := \
>> +   renderonly_context.c \
>> +   renderonly_resource.c \
>> +   renderonly_screen.c
> Please list all the sources (including the headers) in here, sorted
> alphabetically.
>
>> --- /dev/null
>> +++ b/src/gallium/drivers/renderonly/renderonly_context.c
> [snip]
>> +static void
>> +renderonly_draw_vbo(struct pipe_context *pcontext,
>> +  const struct pipe_draw_info *pinfo)
>> +{
>> +   struct renderonly_context *context = to_renderonly_context(pcontext);
>> +   struct pipe_draw_info 

Re: [Mesa-dev] [PATCH] r600g: Implement ARB_texture_view

2015-10-15 Thread Roland Scheidegger
Interesting it doesn't work with llvmpipe, I thought there were tests
for this using some other state tracker...
My guess is that (for llvmpipe) in prepare_shader_sampling() and
lp_setup_set_fragment_sampler_views() we miss adjusting the mip offsets
since we only do that if it's an array target, but this is based on
view->target and not tex->target...

Roland


Am 16.10.2015 um 01:53 schrieb Glenn Kennard:
> Signed-off-by: Glenn Kennard 
> ---
> See also additional texture view piglit test case posted to piglit ml,
> which tests cases with layer>0. Notably softpipe and llvmpipe fail that
> case but i965/hsw, nv50/nvc0 and r600g pass.
> 
>  docs/GL3.txt   |  2 +-
>  docs/relnotes/11.1.0.html  |  1 +
>  src/gallium/drivers/r600/evergreen_state.c | 23 +--
>  src/gallium/drivers/r600/r600_pipe.c   |  2 +-
>  4 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 6503e2a..c03a574 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -169,7 +169,7 @@ GL 4.3, GLSL 4.30:
>GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
> i965, r600, radeonsi, llvmpipe)
>GL_ARB_texture_query_levels  DONE (all drivers 
> that support GLSL 1.30)
>GL_ARB_texture_storage_multisample   DONE (all drivers 
> that support GL_ARB_texture_multisample)
> -  GL_ARB_texture_view  DONE (i965, nv50, 
> nvc0, llvmpipe, softpipe)
> +  GL_ARB_texture_view  DONE (i965, nv50, 
> nvc0, r600, llvmpipe, softpipe)
>GL_ARB_vertex_attrib_binding DONE (all drivers)
>  
>  
> diff --git a/docs/relnotes/11.1.0.html b/docs/relnotes/11.1.0.html
> index dcf425e..cb8715c 100644
> --- a/docs/relnotes/11.1.0.html
> +++ b/docs/relnotes/11.1.0.html
> @@ -53,6 +53,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_texture_query_lod on softpipe
>  EGL_KHR_create_context on softpipe, llvmpipe
>  EGL_KHR_gl_colorspace on softpipe, llvmpipe
> +GL_ARB_texture_view on r600 for Evergreen and later chips
>  
>  
>  Bug fixes
> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
> b/src/gallium/drivers/r600/evergreen_state.c
> index c6702a9..60747d1 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.c
> @@ -666,6 +666,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
>   enum pipe_format pipe_format = state->format;
>   struct radeon_surf_level *surflevel;
>   unsigned base_level, first_level, last_level;
> + unsigned dim, last_layer;
>   uint64_t va;
>  
>   if (view == NULL)
> @@ -679,7 +680,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
>   view->base.reference.count = 1;
>   view->base.context = ctx;
>  
> - if (texture->target == PIPE_BUFFER)
> + if (state->target == PIPE_BUFFER)
>   return texture_buffer_sampler_view(rctx, view, width0, height0);
>  
>   swizzle[0] = state->swizzle_r;
> @@ -773,12 +774,12 @@ evergreen_create_sampler_view_custom(struct 
> pipe_context *ctx,
>   }
>   nbanks = eg_num_banks(rscreen->b.tiling_info.num_banks);
>  
> - if (texture->target == PIPE_TEXTURE_1D_ARRAY) {
> + if (state->target == PIPE_TEXTURE_1D_ARRAY) {
>   height = 1;
>   depth = texture->array_size;
> - } else if (texture->target == PIPE_TEXTURE_2D_ARRAY) {
> + } else if (state->target == PIPE_TEXTURE_2D_ARRAY) {
>   depth = texture->array_size;
> - } else if (texture->target == PIPE_TEXTURE_CUBE_ARRAY)
> + } else if (state->target == PIPE_TEXTURE_CUBE_ARRAY)
>   depth = texture->array_size / 6;
>  
>   va = tmp->resource.gpu_address;
> @@ -790,7 +791,13 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
>   view->is_stencil_sampler = true;
>  
>   view->tex_resource = &tmp->resource;
> - view->tex_resource_words[0] = 
> (S_03_DIM(r600_tex_dim(texture->target, texture->nr_samples)) |
> +
> + /* array type views and views into array types need to use layer offset 
> */
> + dim = state->target;
> + if (state->target != PIPE_TEXTURE_CUBE)
> + dim = MAX2(state->target, texture->target);
> +
> + view->tex_resource_words[0] = (S_03_DIM(r600_tex_dim(dim, 
> texture->nr_samples)) |
>  S_03_PITCH((pitch / 8) - 1) |
>  S_03_TEX_WIDTH(width - 1));
>   if (rscreen->b.chip_class == CAYMAN)
> @@ -818,10 +825,14 @@ evergreen_create_sampler_view_custom(struct 
> pipe_context *ctx,
>   view->tex_resource_words[3] = (surflevel[base_level].offset + 
> va) >> 8;
>   }
>  
> + last_layer = state->u.tex.last_layer;
> + if (state->target != 

Re: [Mesa-dev] [PATCH] r600g: Implement ARB_texture_view

2015-10-15 Thread Ilia Mirkin
On Thu, Oct 15, 2015 at 7:53 PM, Glenn Kennard  wrote:
> Signed-off-by: Glenn Kennard 
> ---
> See also additional texture view piglit test case posted to piglit ml,
> which tests cases with layer>0. Notably softpipe and llvmpipe fail that
> case but i965/hsw, nv50/nvc0 and r600g pass.
>
>  docs/GL3.txt   |  2 +-
>  docs/relnotes/11.1.0.html  |  1 +
>  src/gallium/drivers/r600/evergreen_state.c | 23 +--
>  src/gallium/drivers/r600/r600_pipe.c   |  2 +-
>  4 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 6503e2a..c03a574 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -169,7 +169,7 @@ GL 4.3, GLSL 4.30:
>GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
> i965, r600, radeonsi, llvmpipe)
>GL_ARB_texture_query_levels  DONE (all drivers 
> that support GLSL 1.30)
>GL_ARB_texture_storage_multisample   DONE (all drivers 
> that support GL_ARB_texture_multisample)
> -  GL_ARB_texture_view  DONE (i965, nv50, 
> nvc0, llvmpipe, softpipe)
> +  GL_ARB_texture_view  DONE (i965, nv50, 
> nvc0, r600, llvmpipe, softpipe)
>GL_ARB_vertex_attrib_binding DONE (all drivers)
>
>
> diff --git a/docs/relnotes/11.1.0.html b/docs/relnotes/11.1.0.html
> index dcf425e..cb8715c 100644
> --- a/docs/relnotes/11.1.0.html
> +++ b/docs/relnotes/11.1.0.html
> @@ -53,6 +53,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_texture_query_lod on softpipe
>  EGL_KHR_create_context on softpipe, llvmpipe
>  EGL_KHR_gl_colorspace on softpipe, llvmpipe
> +GL_ARB_texture_view on r600 for Evergreen and later chips
>  
>
>  Bug fixes
> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
> b/src/gallium/drivers/r600/evergreen_state.c
> index c6702a9..60747d1 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.c
> @@ -666,6 +666,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
> enum pipe_format pipe_format = state->format;
> struct radeon_surf_level *surflevel;
> unsigned base_level, first_level, last_level;
> +   unsigned dim, last_layer;
> uint64_t va;
>
> if (view == NULL)
> @@ -679,7 +680,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
> view->base.reference.count = 1;
> view->base.context = ctx;
>
> -   if (texture->target == PIPE_BUFFER)
> +   if (state->target == PIPE_BUFFER)

Not sure, but I'd guess things would have to be pretty messed up if
texture->target == buffer, but state->target == not buffer. I'd throw
in an assert actually.

> return texture_buffer_sampler_view(rctx, view, width0, 
> height0);
>
> swizzle[0] = state->swizzle_r;
> @@ -773,12 +774,12 @@ evergreen_create_sampler_view_custom(struct 
> pipe_context *ctx,
> }
> nbanks = eg_num_banks(rscreen->b.tiling_info.num_banks);
>
> -   if (texture->target == PIPE_TEXTURE_1D_ARRAY) {
> +   if (state->target == PIPE_TEXTURE_1D_ARRAY) {
> height = 1;
> depth = texture->array_size;

Not sure where depth is used, but doesn't this need to take first/last
layer into account? Does textureSize() return the right thing for
these? I guess there's no piglit for that either.
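
As a rough illustration of that question (a sketch, not code from the patch): if depth is meant to describe the view rather than the whole resource, it could be derived from the view's layer range. The view_layers struct and view_array_depth() below are hypothetical stand-ins, not r600g or gallium names.

```c
/* Hypothetical, simplified stand-in for the layer range of a sampler view. */
struct view_layers {
   unsigned first_layer;
   unsigned last_layer;   /* inclusive */
};

/*
 * Number of array elements visible through the view. For cube arrays the
 * layer count is divided by 6, mirroring the array_size / 6 in the patch.
 */
static unsigned
view_array_depth(const struct view_layers *v, int is_cube_array)
{
   unsigned layers = v->last_layer - v->first_layer + 1;

   return is_cube_array ? layers / 6 : layers;
}
```

With first_layer = 2 and last_layer = 5 this gives a depth of 4, regardless of how many layers the underlying texture has.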

> -   } else if (texture->target == PIPE_TEXTURE_2D_ARRAY) {
> +   } else if (state->target == PIPE_TEXTURE_2D_ARRAY) {
> depth = texture->array_size;
> -   } else if (texture->target == PIPE_TEXTURE_CUBE_ARRAY)
> +   } else if (state->target == PIPE_TEXTURE_CUBE_ARRAY)
> depth = texture->array_size / 6;
>
> va = tmp->resource.gpu_address;
> @@ -790,7 +791,13 @@ evergreen_create_sampler_view_custom(struct pipe_context 
> *ctx,
> view->is_stencil_sampler = true;
>
> view->tex_resource = &tmp->resource;
> -   view->tex_resource_words[0] = 
> (S_03_DIM(r600_tex_dim(texture->target, texture->nr_samples)) |
> +
> +   /* array type views and views into array types need to use layer 
> offset */
> +   dim = state->target;
> +   if (state->target != PIPE_TEXTURE_CUBE)
> +   dim = MAX2(state->target, texture->target);
> +
> +   view->tex_resource_words[0] = (S_03_DIM(r600_tex_dim(dim, 
> texture->nr_samples)) |
>S_03_PITCH((pitch / 8) - 1) |
>S_03_TEX_WIDTH(width - 1));
> if (rscreen->b.chip_class == CAYMAN)
> @@ -818,10 +825,14 @@ evergreen_create_sampler_view_custom(struct 
> pipe_context *ctx,
> view->tex_resource_words[3] = (surflevel[base_level].offset + 
> va) >> 8;
> }
>

[Mesa-dev] [PATCH] r600g: Implement ARB_texture_view

2015-10-15 Thread Glenn Kennard
Signed-off-by: Glenn Kennard 
---
See also additional texture view piglit test case posted to piglit ml,
which tests cases with layer>0. Notably softpipe and llvmpipe fail that
case but i965/hsw, nv50/nvc0 and r600g pass.

 docs/GL3.txt   |  2 +-
 docs/relnotes/11.1.0.html  |  1 +
 src/gallium/drivers/r600/evergreen_state.c | 23 +--
 src/gallium/drivers/r600/r600_pipe.c   |  2 +-
 4 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 6503e2a..c03a574 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -169,7 +169,7 @@ GL 4.3, GLSL 4.30:
   GL_ARB_texture_buffer_range  DONE (nv50, nvc0, i965, 
r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels  DONE (all drivers that 
support GLSL 1.30)
   GL_ARB_texture_storage_multisample   DONE (all drivers that 
support GL_ARB_texture_multisample)
-  GL_ARB_texture_view  DONE (i965, nv50, nvc0, 
llvmpipe, softpipe)
+  GL_ARB_texture_view  DONE (i965, nv50, nvc0, 
r600, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding DONE (all drivers)
 
 
diff --git a/docs/relnotes/11.1.0.html b/docs/relnotes/11.1.0.html
index dcf425e..cb8715c 100644
--- a/docs/relnotes/11.1.0.html
+++ b/docs/relnotes/11.1.0.html
@@ -53,6 +53,7 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_texture_query_lod on softpipe
 EGL_KHR_create_context on softpipe, llvmpipe
 EGL_KHR_gl_colorspace on softpipe, llvmpipe
+GL_ARB_texture_view on r600 for Evergreen and later chips
 
 
 Bug fixes
diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index c6702a9..60747d1 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -666,6 +666,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
enum pipe_format pipe_format = state->format;
struct radeon_surf_level *surflevel;
unsigned base_level, first_level, last_level;
+   unsigned dim, last_layer;
uint64_t va;
 
if (view == NULL)
@@ -679,7 +680,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
view->base.reference.count = 1;
view->base.context = ctx;
 
-   if (texture->target == PIPE_BUFFER)
+   if (state->target == PIPE_BUFFER)
return texture_buffer_sampler_view(rctx, view, width0, height0);
 
swizzle[0] = state->swizzle_r;
@@ -773,12 +774,12 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
}
nbanks = eg_num_banks(rscreen->b.tiling_info.num_banks);
 
-   if (texture->target == PIPE_TEXTURE_1D_ARRAY) {
+   if (state->target == PIPE_TEXTURE_1D_ARRAY) {
height = 1;
depth = texture->array_size;
-   } else if (texture->target == PIPE_TEXTURE_2D_ARRAY) {
+   } else if (state->target == PIPE_TEXTURE_2D_ARRAY) {
depth = texture->array_size;
-   } else if (texture->target == PIPE_TEXTURE_CUBE_ARRAY)
+   } else if (state->target == PIPE_TEXTURE_CUBE_ARRAY)
depth = texture->array_size / 6;
 
va = tmp->resource.gpu_address;
@@ -790,7 +791,13 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
view->is_stencil_sampler = true;
 
view->tex_resource = &tmp->resource;
-   view->tex_resource_words[0] = 
(S_03_DIM(r600_tex_dim(texture->target, texture->nr_samples)) |
+
+   /* array type views and views into array types need to use layer offset 
*/
+   dim = state->target;
+   if (state->target != PIPE_TEXTURE_CUBE)
+   dim = MAX2(state->target, texture->target);
+
+   view->tex_resource_words[0] = (S_03_DIM(r600_tex_dim(dim, 
texture->nr_samples)) |
   S_03_PITCH((pitch / 8) - 1) |
   S_03_TEX_WIDTH(width - 1));
if (rscreen->b.chip_class == CAYMAN)
@@ -818,10 +825,14 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
view->tex_resource_words[3] = (surflevel[base_level].offset + 
va) >> 8;
}
 
+   last_layer = state->u.tex.last_layer;
+   if (state->target != texture->target && depth == 1) {
+   last_layer = state->u.tex.first_layer;
+   }
view->tex_resource_words[4] = (word4 |
   S_030010_ENDIAN_SWAP(endian));
view->tex_resource_words[5] = 
S_030014_BASE_ARRAY(state->u.tex.first_layer) |
- 
S_030014_LAST_ARRAY(state->u.tex.last_layer);
+ S_030014_LAST_ARRAY(last_layer);
view->tex_resource_words[6] = S_030018_TILE_SPLIT(tile_split);
 
if 

[Mesa-dev] [PATCH] docs/index.html: fix typo

2015-10-15 Thread Fabio Pedretti
---
 docs/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/index.html b/docs/index.html
index 138447f..e28c63d 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -28,7 +28,7 @@ This is a bug-fix release.
 This is a bug-fix release.
 
 NOTE: It is anticipated that 10.6.9 will be the final release in the 10.6
-series. Users of 10.5 are encouraged to migrate to the 11.0 series in order
+series. Users of 10.6 are encouraged to migrate to the 11.0 series in order
 to obtain future fixes.
 
 
-- 
2.1.4



Re: [Mesa-dev] [PATCH 3/7] glsl: add AoA support to subroutines

2015-10-15 Thread Dave Airlie
you gotta give me something in the commit msg to have any idea what
I'm reading here :-)

why does process_parameters move?

is there a piglit for subroutine/arrays?

Dave.


Re: [Mesa-dev] Updating mesa3d.org docs?

2015-10-15 Thread Sarah Sharp
On Thu, Oct 15, 2015 at 01:34:54PM -0600, Brian Paul wrote:
> On 10/15/2015 01:18 PM, Sarah Sharp wrote:
> >1. What's the process for pushing updated documentation to the site?
> 
> All the website pages are found in the git docs/ directory.  Changes are
> submitted as patches and reviewed like code on the mesa-dev list.

Sounds good.

> >2. How often are updated docs pushed? Once every week, month, or when
> >there's a new Mesa version?
> 
> I push them whenever a new Mesa version is released, but I can do it at any
> time on request.

Ok, great! I'll ping you when I get patches in.

However, I will note that the push process isn't working for some pages
on the website. 11.0.1 was released in September 2015 (when I would
expect you to do an update). You pushed a commit in 2013 to remove
references to CVS (commit dbbe108951 "docs: replace CVS with git"), but
that change is still not reflected here:

http://www.mesa3d.org/sourcedocs.html

Other pages that are out of sync with master include:

http://www.mesa3d.org/systems.html
http://www.mesa3d.org/license.html
http://www.mesa3d.org/install.html
http://www.mesa3d.org/envvars.html
http://www.mesa3d.org/osmesa.html
http://www.mesa3d.org/extensions.html

The license page seems particularly important to have up-to-date on the
website, since the text changed from "IN NO EVENT SHALL BRIAN PAUL BE
LIABLE" to "IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE".

Pages that people would generally check often (index.html,
relnotes.html) all seem to get updated. There are also less-frequently
updated pages like shading.html that are up-to-date. It's a bit of
a mystery to me why the other pages aren't being updated.

> >3. Any chance I could get permissions to push updated docs?  I'll be
> >improving Mesa documentation as part of my new job, and I would love
> >to be able to push myself once patches are accepted, rather than
> >having to ping you.
> 
> The typical deal is we wait until a person has some track record of
> producing good patches before giving git-write/push privileges.
> 
> So, I'd suggest you make some changes/patches, post them to the mesa-dev
> list for review (others can push them for you initially), and then when
> you've got some history established you can file a request (via bugzilla)
> for git privileges.

Completely understandable. I have done a couple of commits to Mesa and
piglit, and I do have git repo access under the username 'sarah'.
I haven't pushed my own branches yet, as the patches are still being
tested internally, but I will have someone else push my initial
patches after they get mailing list review. I understand if you want to
wait a while for me to prove myself before you grant me additional
trusted privileges. :)

Sarah Sharp


[Mesa-dev] [PATCH] i965: Fix is-renderable check in intel_image_target_renderbuffer_storage

2015-10-15 Thread Ian Romanick
From: Ian Romanick 

Previously we could create a renderbuffer with format
MESA_FORMAT_R8G8B8A8_UNORM, convert that renderbuffer to an EGLImage,
then FAIL to convert the EGLImage back to a renderbuffer because
reasons.  Just use the same check in
intel_image_target_renderbuffer_storage that brw_render_target_supported
uses.

There are more checks in brw_render_target_supported, but I don't think
they are necessary here.  A different approach would be to refactor
brw_render_target_supported to take rb->Format and rb->NumSamples as
parameters (instead of a gl_renderbuffer) and use the new function here.

Fixes:

ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92476
Cc: "10.3 10.4 10.5 10.6 11.0" 
---
 src/mesa/drivers/dri/i965/intel_fbo.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 5a6b0dd..7f281fa 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -348,14 +348,10 @@ intel_image_target_renderbuffer_storage(struct gl_context 
*ctx,
}
 
/* __DRIimage is opaque to the core so it has to be checked here */
-   switch (image->format) {
-   case MESA_FORMAT_R8G8B8A8_UNORM:
+   if (!brw->format_supported_as_render_target[image->format]) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
 "glEGLImageTargetRenderbufferStorage(unsupported image format");
   return;
-  break;
-   default:
-  break;
}
 
irb = intel_renderbuffer(rb);
-- 
2.1.0
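
The change above replaces a per-format switch with a lookup in a per-format table. A minimal standalone sketch of that pattern follows; the enum, the table contents, and the function name are hypothetical and only mirror the shape of brw->format_supported_as_render_target, not its real contents.

```c
#include <stdbool.h>

/* Hypothetical formats standing in for the MESA_FORMAT_* enum. */
enum sketch_format {
   SKETCH_FORMAT_NONE,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_B8G8R8A8_UNORM,
   SKETCH_FORMAT_ETC1_RGB8,
   SKETCH_FORMAT_COUNT
};

/* One entry per format; entries not listed default to false. */
static const bool format_supported_as_render_target[SKETCH_FORMAT_COUNT] = {
   [SKETCH_FORMAT_R8G8B8A8_UNORM] = true,
   [SKETCH_FORMAT_B8G8R8A8_UNORM] = true,
};

/* Returns true if an image of this format may back a renderbuffer. */
static bool
image_format_is_renderable(enum sketch_format format)
{
   return (unsigned)format < SKETCH_FORMAT_COUNT &&
          format_supported_as_render_target[format];
}
```

The point of the table is that renderbuffer creation and EGLImage import consult the same data, so a format accepted by one path cannot be rejected by the other.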



Re: [Mesa-dev] [PATCH 5/6] nir: remove dependency on glsl

2015-10-15 Thread Rob Clark
On Tue, Oct 13, 2015 at 12:15 PM, Emil Velikov  wrote:
> On 13 October 2015 at 16:37, Rob Clark  wrote:
>> On Tue, Oct 13, 2015 at 11:22 AM, Emil Velikov  
>> wrote:
>>> Hi Rob,
>>>
>>> On 10 October 2015 at 19:47, Rob Clark  wrote:
 From: Rob Clark 

 Move glsl_types into NIR, now that the dependency on glsl_symbol_table
 has been split out.

 Possibly makes sense to rename things at this point, but if we do that
 I'd like to keep it split out into a separate patch to make git history
 easier to follow (IMHO).

 Signed-off-by: Rob Clark 
 ---
  src/glsl/Makefile.am   |3 -
  src/glsl/Makefile.sources  |4 +-
  src/glsl/builtin_type_macros.h |  172 --
  src/glsl/glsl_types.cpp| 1729 
 
  src/glsl/glsl_types.h  |  867 --
  src/glsl/nir/builtin_type_macros.h |  172 ++
  src/glsl/nir/glsl_types.cpp| 1729 
 
  src/glsl/nir/glsl_types.h  |  867 ++
  src/glsl/nir/nir_types.h   |2 +-
  .../drivers/dri/i965/brw_cubemap_normalize.cpp |2 +-
  src/mesa/drivers/dri/i965/brw_fs.cpp   |2 +-
  src/mesa/drivers/dri/i965/brw_fs.h |2 +-
  .../dri/i965/brw_fs_channel_expressions.cpp|2 +-
  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |2 +-
  .../drivers/dri/i965/brw_fs_vector_splitting.cpp   |2 +-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |2 +-
  .../dri/i965/brw_lower_unnormalized_offset.cpp |2 +-
  .../drivers/dri/i965/brw_schedule_instructions.cpp |2 +-
  src/mesa/main/ff_fragment_shader.cpp   |2 +-
  src/mesa/main/uniforms.h   |2 +-
  src/mesa/program/ir_to_mesa.cpp|2 +-
  src/mesa/program/sampler.cpp   |2 +-
  22 files changed, 2784 insertions(+), 2787 deletions(-)
  delete mode 100644 src/glsl/builtin_type_macros.h
  delete mode 100644 src/glsl/glsl_types.cpp
  delete mode 100644 src/glsl/glsl_types.h
  create mode 100644 src/glsl/nir/builtin_type_macros.h
  create mode 100644 src/glsl/nir/glsl_types.cpp
  create mode 100644 src/glsl/nir/glsl_types.h

 diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
 index 347919b..437c6a5 100644
 --- a/src/glsl/Makefile.am
 +++ b/src/glsl/Makefile.am
 @@ -148,9 +148,6 @@ libglsl_la_SOURCES =   
  \


  libnir_la_SOURCES =\
 -   glsl_types.cpp  \
 -   builtin_types.cpp   \
 -   glsl_symbol_table.cpp   \
 $(NIR_FILES)\
 $(NIR_GENERATED_FILES)

 diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
 index 436949c..6e61f23 100644
 --- a/src/glsl/Makefile.sources
 +++ b/src/glsl/Makefile.sources
 @@ -20,6 +20,8 @@ NIR_GENERATED_FILES = \
  NIR_FILES = \
 nir/glsl_to_nir.cpp \
 nir/glsl_to_nir.h \
 +   nir/glsl_types.cpp \
 +   nir/glsl_types.h \
 nir/nir.c \
 nir/nir.h \
 nir/nir_array.h \
 @@ -103,8 +105,6 @@ LIBGLSL_FILES = \
 glsl_parser_extras.h \
 glsl_symbol_table.cpp \
 glsl_symbol_table.h \
 -   glsl_types.cpp \
 -   glsl_types.h \
 hir_field_selection.cpp \
 ir_basic_block.cpp \
 ir_basic_block.h \
>>> Can we split this into two (or more) patches.
>>>  - move the files from glsl to glsl/nir, updating scons/android. note
>>> scons is missing everything NIR related.
>>>  - fold/nuke the additional glsl requirements, from NIR.
>>
>> It is already split up this way.. this patch is primarily the move
>> (plus header path tweaks, etc, to keep things compiling).  I don't see
>> how it could be split up any finer while keeping bisectability (ie.
>> not breaking compile in the middle).
>>
>> That said, I did completely ignore scons/android.  I don't know the
>> first thing about scons or how to do a scons build, so I think I'll
>> ignore that and let someone else fix it up.  I suppose I could fix
>> android build, although I can't build android on my laptop (and I
>> guess I'd have to rebase some of the other android related stuff that
>> isn't upstream yet), so maybe I'll just fix that in a follow-on patch
>> this weekend.
>>
> 

Re: [Mesa-dev] [PATCH 1/3] i965/nir: Switch on shader stage in nir_lower_outputs().

2015-10-15 Thread Kenneth Graunke
On Thursday, October 15, 2015 03:17:19 PM Kenneth Graunke wrote:
> VS, GS, and FS continue doing the same thing they did before.  We can
> simplify the FS code a bit because it is always scalar.
> 
> Compute shaders now assert that there are no outputs instead of doing
> a loop over 0 outputs.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c | 26 +-
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index af9d041..1b4dace 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -112,11 +112,27 @@ brw_nir_lower_inputs(nir_shader *nir, bool is_scalar)
>  static void
>  brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
>  {
> -   if (is_scalar) {
> -  nir_assign_var_locations(&nir->outputs, &nir->num_outputs, type_size_scalar);
> -   } else {
> -  nir_foreach_variable(var, &nir->outputs)
> - var->data.driver_location = var->data.location;
> +   switch (nir->stage) {
> +   case MESA_SHADER_VERTEX:
> +   case MESA_SHADER_GEOMETRY:
> +  if (is_scalar) {
> + nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
> +  type_size_scalar);
> +  } else {
> + nir_foreach_variable(var, &nir->outputs)
> +var->data.driver_location = var->data.location;
> +  }
> +  break;
> +   case MESA_SHADER_FRAGMENT:
> +  nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
> +   type_size_scalar);
> +  break;
> +   case MESA_SHADER_COMPUTE:
> +  /* Compute shaders have no outputs. */
> +  assert(exec_list_is_empty(&nir->outputs));
> +  break;
> +   default:
> +  unreachable("unsupported shader stage");
> }
>  }
>  
> 

Ilia pointed out that the GLSL IR level varying packing makes it so
float/vec2 arrays don't happen - they get packed to vec4 arrays.
However, varying packing doesn't happen for tessellation stages.

So, I think we should drop the Cc: stable on these.  I'd still
like to include them in master, however.




Re: [Mesa-dev] [PATCH v2] configure: show which gallium drivers/sts are built

2015-10-15 Thread Emil Velikov
On 15 October 2015 at 18:55, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>
> v1 -> v2: Take Michel's suggestion to include mesa in the st list, append 
> others
>
Was meaning to suggest the latter :-)

IIRC Matt was against these a while back, while personally I'm
ambivalent about whether we have them or not. Patch looks OK, so:
Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH 1/2] i965: Extract can_change_source_types() functions.

2015-10-15 Thread Jason Ekstrand
On Wed, Oct 14, 2015 at 11:30 AM, Matt Turner  wrote:
> Make them members of fs_inst/vec4_instruction for use elsewhere.
>
> Also fix the fs version to check that dst.type == src[1].type and for
> !saturate.

Reviewed-by: Jason Ekstrand 
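
For anyone skimming the thread, the predicate being factored out can be sketched in plain C roughly as below. This is only an illustration under assumptions: struct reg, struct inst and the enum values are simplified, hypothetical stand-ins, not the real fs_inst/vec4_instruction types.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the i965 IR types. */
enum reg_type { TYPE_F, TYPE_D, TYPE_UD };
enum opcode { OP_MOV, OP_SEL, OP_ADD };
enum predicate { PREDICATE_NONE, PREDICATE_NORMAL };

struct reg {
   enum reg_type type;
   bool abs;
   bool negate;
};

struct inst {
   enum opcode opcode;
   enum predicate predicate;
   bool saturate;
   struct reg dst;
   struct reg src[2];
};

/* True when the instruction only moves bits around, so retyping its
 * destination and sources together cannot change the result.
 */
static bool
can_change_types(const struct inst *inst)
{
   return inst->dst.type == inst->src[0].type &&
          !inst->src[0].abs && !inst->src[0].negate && !inst->saturate &&
          (inst->opcode == OP_MOV ||
           (inst->opcode == OP_SEL &&
            inst->dst.type == inst->src[1].type &&
            inst->predicate != PREDICATE_NONE &&
            !inst->src[1].abs && !inst->src[1].negate));
}
```

In this sketch a plain MOV qualifies, while a saturating MOV, an unpredicated SEL, or a SEL whose second source has a different type does not.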

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp| 12 
>  src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   | 15 ++-
>  src/mesa/drivers/dri/i965/brw_ir_fs.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4.cpp  | 12 
>  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 16 ++--
>  6 files changed, 30 insertions(+), 27 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index d000f16..3837bbc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -338,6 +338,18 @@ fs_inst::can_do_source_mods(const struct brw_device_info 
> *devinfo)
>  }
>
>  bool
> +fs_inst::can_change_types() const
> +{
> +   return dst.type == src[0].type &&
> +  !src[0].abs && !src[0].negate && !saturate &&
> +  (opcode == BRW_OPCODE_MOV ||
> +   (opcode == BRW_OPCODE_SEL &&
> +dst.type == src[1].type &&
> +predicate != BRW_PREDICATE_NONE &&
> +!src[1].abs && !src[1].negate));
> +}
> +
> +bool
>  fs_inst::has_side_effects() const
>  {
> return this->eot || backend_instruction::has_side_effects();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> index 230b0ca..5589716 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> @@ -275,17 +275,6 @@ is_logic_op(enum opcode opcode)
> opcode == BRW_OPCODE_NOT);
>  }
>
> -static bool
> -can_change_source_types(fs_inst *inst)
> -{
> -   return !inst->src[0].abs && !inst->src[0].negate &&
> -  inst->dst.type == inst->src[0].type &&
> -  (inst->opcode == BRW_OPCODE_MOV ||
> -   (inst->opcode == BRW_OPCODE_SEL &&
> -inst->predicate != BRW_PREDICATE_NONE &&
> -!inst->src[1].abs && !inst->src[1].negate));
> -}
> -
>  bool
>  fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>  {
> @@ -368,7 +357,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
> acp_entry *entry)
>
> if (has_source_modifiers &&
> entry->dst.type != inst->src[arg].type &&
> -   !can_change_source_types(inst))
> +   !inst->can_change_types())
>return false;
>
> if (devinfo->gen >= 8 && (entry->src.negate || entry->src.abs) &&
> @@ -438,7 +427,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
> acp_entry *entry)
>* type.  If we got here, then we can just change the source and
>* destination types of the instruction and keep going.
>*/
> - assert(can_change_source_types(inst));
> + assert(inst->can_change_types());
>   for (int i = 0; i < inst->sources; i++) {
>  inst->src[i].type = entry->dst.type;
>   }
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
> b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> index 97c6f8b..7726e4b 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> @@ -204,6 +204,7 @@ public:
> unsigned components_read(unsigned i) const;
> int regs_read(int arg) const;
> bool can_do_source_mods(const struct brw_device_info *devinfo);
> +   bool can_change_types() const;
> bool has_side_effects() const;
>
> bool reads_flag() const;
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> index 96dd633..1b57b65 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> @@ -179,6 +179,7 @@ public:
>int swizzle, int swizzle_mask);
> void reswizzle(int dst_writemask, int swizzle);
> bool can_do_source_mods(const struct brw_device_info *devinfo);
> +   bool can_change_types() const;
>
> bool reads_flag()
> {
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 08f3e91..f5242d3 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -280,6 +280,18 @@ vec4_instruction::can_do_source_mods(const struct 
> brw_device_info *devinfo)
> return true;
>  }
>
> +bool
> +vec4_instruction::can_change_types() const
> +{
> +   return dst.type == src[0].type &&
> +  !src[0].abs && !src[0].negate && !saturate &&
> +  (opcode == BRW_OPCODE_MOV ||
> +   (opcode == BRW_OPCODE_SEL &&
> +dst.type == src[1].type &&
> +

[Mesa-dev] [PATCH 2/3] i965: Implement a new type_size_4x() function.

2015-10-15 Thread Kenneth Graunke
Often, shader inputs/outputs are required to be aligned to vec4 slots
for one reason or another.  When working with the scalar backend, we
want to count the number of scalar components, yet still respect the
vec4 packing rules as required.

The new "hybrid" type_size_4x() function pads everything out to vec4
slots, similar to type_size_vec4(), but counts in scalar components,
similar to type_size_scalar().

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 52 ++
 src/mesa/drivers/dri/i965/brw_shader.h |  1 +
 2 files changed, 53 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 01a7c99..4af88c5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -499,6 +499,58 @@ type_size_scalar(const struct glsl_type *type)
 }
 
 /**
+ * Returns the number of scalar components needed to store type, assuming
+ * that vectors are padded out to vec4.
+ *
+ * This has the packing rules of type_size_vec4(), but counts components
+ * similar to type_size_scalar().
+ */
+extern "C" int
+type_size_4x(const struct glsl_type *type)
+{
+   int size;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+  if (type->is_matrix()) {
+ return 4 * type->matrix_columns;
+  } else {
+ /* Regardless of the size of vector, it's padded out to a vec4. */
+ return 4;
+  }
+   case GLSL_TYPE_ARRAY:
+  return type_size_4x(type->fields.array) * type->length;
+   case GLSL_TYPE_STRUCT:
+  size = 0;
+  for (unsigned i = 0; i < type->length; i++) {
+size += type_size_4x(type->fields.structure[i].type);
+  }
+  return size;
+   case GLSL_TYPE_SAMPLER:
+  /* Samplers take up no register space, since they're baked in at
+   * link time.
+   */
+  return 0;
+   case GLSL_TYPE_ATOMIC_UINT:
+  return 0;
+   case GLSL_TYPE_SUBROUTINE:
+  return 4;
+   case GLSL_TYPE_IMAGE:
+  return ALIGN(BRW_IMAGE_PARAM_SIZE, 4);
+   case GLSL_TYPE_VOID:
+   case GLSL_TYPE_ERROR:
+   case GLSL_TYPE_INTERFACE:
+   case GLSL_TYPE_DOUBLE:
+  unreachable("not reached");
+   }
+
+   return 0;
+}
+
+/**
  * Create a MOV to read the timestamp register.
  *
  * The caller is responsible for emitting the MOV.  The return value is
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index ad2de5e..06a5b4c 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -316,6 +316,7 @@ bool brw_cs_precompile(struct gl_context *ctx,
struct gl_program *prog);
 
 int type_size_scalar(const struct glsl_type *type);
+int type_size_4x(const struct glsl_type *type);
 int type_size_vec4(const struct glsl_type *type);
 
 bool is_scalar_shader_stage(const struct brw_compiler *compiler, int stage);
-- 
2.6.1
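
To make the three counting rules from the commit message concrete, here is a minimal standalone sketch in C. It is an illustration only: struct type and the size_*() helpers are hypothetical, much-simplified stand-ins for glsl_type and the real type_size_scalar()/type_size_vec4()/type_size_4x() functions, and they only model vectors, matrices and arrays.

```c
#include <stddef.h>

/* Hypothetical, much-simplified model of a GLSL type. */
enum kind { KIND_VECTOR, KIND_MATRIX, KIND_ARRAY };

struct type {
   enum kind kind;
   int components;           /* vector width, or rows per matrix column */
   int columns;              /* matrix columns (1 for plain vectors) */
   const struct type *elem;  /* element type, for arrays */
   int length;               /* array length */
};

/* Tightly packed count of scalar components. */
static int
size_scalar(const struct type *t)
{
   switch (t->kind) {
   case KIND_VECTOR: return t->components;
   case KIND_MATRIX: return t->components * t->columns;
   case KIND_ARRAY:  return size_scalar(t->elem) * t->length;
   }
   return 0;
}

/* Count of vec4 slots: every vector or matrix column takes one slot. */
static int
size_vec4(const struct type *t)
{
   switch (t->kind) {
   case KIND_VECTOR: return 1;
   case KIND_MATRIX: return t->columns;
   case KIND_ARRAY:  return size_vec4(t->elem) * t->length;
   }
   return 0;
}

/* Hybrid: vec4 packing rules, but counted in scalar components. */
static int
size_4x(const struct type *t)
{
   switch (t->kind) {
   case KIND_VECTOR: return 4;
   case KIND_MATRIX: return 4 * t->columns;
   case KIND_ARRAY:  return size_4x(t->elem) * t->length;
   }
   return 0;
}
```

Under this model a float[2] measures 2 scalar components, 2 vec4 slots and 8 padded components, and the hybrid count is always four times the vec4 slot count.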



[Mesa-dev] [PATCH 3/3] i965: Fix scalar VS float[] and vec2[] output arrays.

2015-10-15 Thread Kenneth Graunke
The scalar VS backend has never handled float[] and vec2[] outputs
correctly (my original code was broken).  Outputs need to be padded
out to vec4 slots.

In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot
by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4.  However,
this is wrong: type_size_scalar() for a float[2] would return 2, or
for vec2[2] it would return 4.  This looked like a single slot, even
though in reality each array element would be stored in separate vec4
slots.

Because of this bug, outputs[] and output_components[] would not get
initialized for the second element's VARYING_SLOT, which meant
emit_urb_writes() would skip writing them.  Nothing used those values,
and dead code elimination threw a party.

The new type_size_4x() function pads array elements correctly, but
still counts in scalar components, generating correct indices in
store_output intrinsics.

Not observed to fix any Piglit or dEQP tests, but does fix various
tcs-input Piglit tests on a branch that implements tessellation shaders.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_nir.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 0e044d0..a290656 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -93,7 +93,7 @@ fs_visitor::nir_setup_outputs()
 
   switch (stage) {
   case MESA_SHADER_VERTEX:
- for (unsigned int i = 0; i < ALIGN(type_size_scalar(var->type), 4) / 4; i++) {
+ for (int i = 0; i < type_size_4x(var->type) / 4; i++) {
 int output = var->data.location + i;
 this->outputs[output] = offset(reg, bld, 4 * i);
 this->output_components[output] = vector_elements;
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 1b4dace..c7f94a6 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -117,7 +117,8 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
case MESA_SHADER_GEOMETRY:
   if (is_scalar) {
  nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
-  type_size_scalar);
+  type_size_4x);
+ nir_lower_io(nir, nir_var_shader_out, type_size_4x);
   } else {
  nir_foreach_variable(var, &nir->outputs)
 var->data.driver_location = var->data.location;
-- 
2.6.1
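
The off-by-one-slot arithmetic described in the commit message can be checked with a tiny sketch in C. The names ALIGN_POT, slots_old() and slots_new() are hypothetical and exist only for this illustration; the input sizes are the ones quoted above (2 and 4 tightly packed components for float[2] and vec2[2], 8 padded components for both).

```c
/* Round v up to a multiple of a; a must be a power of two. */
#define ALIGN_POT(v, a) (((v) + (a) - 1) & ~((a) - 1))

/* Old loop bound: slots derived from the tightly packed scalar size. */
static int
slots_old(int scalar_size)
{
   return ALIGN_POT(scalar_size, 4) / 4;
}

/* New loop bound: the size is already padded to vec4 per array element. */
static int
slots_new(int padded_size)
{
   return padded_size / 4;
}
```

With these helpers, slots_old(2) and slots_old(4) both give 1, while slots_new(8) gives 2, which matches the one vec4 slot per array element that the outputs actually occupy.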



[Mesa-dev] [PATCH 1/3] i965/nir: Switch on shader stage in nir_lower_outputs().

2015-10-15 Thread Kenneth Graunke
VS, GS, and FS continue doing the same thing they did before.  We can
simplify the FS code a bit because it is always scalar.

Compute shaders now assert that there are no outputs instead of doing
a loop over 0 outputs.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index af9d041..1b4dace 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -112,11 +112,27 @@ brw_nir_lower_inputs(nir_shader *nir, bool is_scalar)
 static void
 brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
 {
-   if (is_scalar) {
-  nir_assign_var_locations(&nir->outputs, &nir->num_outputs, type_size_scalar);
-   } else {
-  nir_foreach_variable(var, &nir->outputs)
- var->data.driver_location = var->data.location;
+   switch (nir->stage) {
+   case MESA_SHADER_VERTEX:
+   case MESA_SHADER_GEOMETRY:
+  if (is_scalar) {
+ nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
+  type_size_scalar);
+  } else {
+ nir_foreach_variable(var, &nir->outputs)
+var->data.driver_location = var->data.location;
+  }
+  break;
+   case MESA_SHADER_FRAGMENT:
+  nir_assign_var_locations(&nir->outputs, &nir->num_outputs,
+   type_size_scalar);
+  break;
+   case MESA_SHADER_COMPUTE:
+  /* Compute shaders have no outputs. */
+  assert(exec_list_is_empty(&nir->outputs));
+  break;
+   default:
+  unreachable("unsupported shader stage");
}
 }
 
-- 
2.6.1



Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix is-renderable check in intel_image_target_renderbuffer_storage

2015-10-15 Thread Anuj Phogat
On Thu, Oct 15, 2015 at 2:01 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Previously we could create a renderbuffer with format
> MESA_FORMAT_R8G8B8A8_UNORM, convert that renderbuffer to an EGLImage,
> then FAIL to convert the EGLImage back to a renderbuffer because
> reasons.  Just use the same check in
> intel_image_target_renderbuffer_storage that brw_render_target_supported
> uses.
>
> There are more checks in brw_render_target_supported, but I don't think
> they are necessary here.  A different approach would be to refactor
> brw_render_target_supported to take rb->Format and rb->NumSamples as
> parameters (instead of a gl_renderbuffer) and use the new function here.
>
Right, those checks in brw_render_target_supported() shouldn't matter
for OpenGL ES.

> Fixes:
>
> ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image
>
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92476
> Cc: "10.3 10.4 10.5 10.6 11.0" 
> ---
>  src/mesa/drivers/dri/i965/intel_fbo.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index 5a6b0dd..7f281fa 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -348,14 +348,10 @@ intel_image_target_renderbuffer_storage(struct 
> gl_context *ctx,
> }
>
> /* __DRIimage is opaque to the core so it has to be checked here */
> -   switch (image->format) {
> -   case MESA_FORMAT_R8G8B8A8_UNORM:
> +   if (!brw->format_supported_as_render_target[image->format]) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
>  "glEGLImageTargetRenderbufferStorage(unsupported image format");
>return;
> -  break;
> -   default:
> -  break;
> }
>
> irb = intel_renderbuffer(rb);
> --
> 2.1.0
>

Patch is:
Reviewed-by: Anuj Phogat 
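
As a rough illustration of the change being reviewed, the sketch below contrasts a hard-coded switch with a per-format capability table. Everything here is hypothetical: the enum fmt values, the table contents and the helper names are invented for the example, and in a real driver the table would be filled in at context creation rather than being a constant.

```c
#include <stdbool.h>

/* Hypothetical format list; the real one is Mesa's mesa_format enum. */
enum fmt { FMT_NONE, FMT_BGRA8, FMT_RGBA8, FMT_COMPRESSED, FMT_COUNT };

/* Per-format capability table (constant only for this sketch). */
static const bool supported_as_render_target[FMT_COUNT] = {
   [FMT_BGRA8] = true,
   [FMT_RGBA8] = true,
};

/* Old-style check: one hard-coded format is rejected, the rest pass. */
static bool
old_check(enum fmt f)
{
   switch (f) {
   case FMT_RGBA8:
      return false;
   default:
      return true;
   }
}

/* New-style check: ask the same table the renderbuffer path uses. */
static bool
new_check(enum fmt f)
{
   return supported_as_render_target[f];
}
```

In this model the old check refuses FMT_RGBA8 even though the table says it is renderable, and lets FMT_COMPRESSED through even though it is not; the table lookup gets both cases right.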


Re: [Mesa-dev] [PATCH 2/2] i965/fs: Consider type mismatches in saturate propagation.

2015-10-15 Thread Jason Ekstrand
On Wed, Oct 14, 2015 at 11:30 AM, Matt Turner  wrote:
> NIR considers bcsel to produce and consume unsigned types, leading to
> SEL instructions operating on unsigned types when the data is really
> floating-point. Previous to this patch, saturate propagation would
> happily transform
>
>(+f0) sel  g20:UD, g30:UD, g40:UD
>  mov.sat  g50:F,  g20:F
>
> into
>
>(+f0) sel.sat  g20:UD, g30:UD, g40:UD
>  mov  g50:F,  g20:F
>
> But since the meaning of .sat is dependent on the type of the
> destination register, this is not valid.
>
> Instead, allow saturate propagation to change the types of dest/source
> on instructions that are simply copying data in order to propagate the
> saturate modifier.
>
> Fixes bad code gen in 158 programs.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp | 15 
> ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
> index e406c28..8792a8c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
> @@ -52,11 +52,12 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t 
> *block)
>ip--;
>
>if (inst->opcode != BRW_OPCODE_MOV ||
> +  !inst->saturate ||
>inst->dst.file != GRF ||
> +  inst->dst.type != inst->src[0].type ||
>inst->src[0].file != GRF ||
>inst->src[0].abs ||
> -  inst->src[0].negate ||
> -  !inst->saturate)
> +  inst->src[0].negate)
>   continue;
>
>int src_var = v->live_intervals->var_from_reg(inst->src[0]);
> @@ -65,7 +66,9 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t 
> *block)
>bool interfered = false;
>foreach_inst_in_block_reverse_starting_from(fs_inst, scan_inst, inst, 
> block) {
>   if (scan_inst->overwrites_reg(inst->src[0])) {
> -if (scan_inst->is_partial_write())
> +if (scan_inst->is_partial_write() ||
> +(scan_inst->dst.type != inst->dst.type &&
> + !scan_inst->can_change_types()))
> break;
>
>  if (scan_inst->saturate) {
> @@ -73,6 +76,12 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t 
> *block)
> progress = true;
>  } else if (src_end_ip <= ip || inst->dst.equals(inst->src[0])) {
> if (scan_inst->can_do_saturate()) {
> +  if (scan_inst->dst.type != inst->dst.type) {

Please add an

assert(scan_inst->can_change_types());

With that added,

Reviewed-by: Jason Ekstrand 

> + scan_inst->dst.type = inst->dst.type;
> + for (int i = 0; i < scan_inst->sources; i++) {
> +scan_inst->src[i].type = inst->dst.type;
> + }
> +  }
>scan_inst->saturate = true;
>inst->saturate = false;
>progress = true;
> --
> 2.4.9
>
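
The point in the commit message that the meaning of .sat depends on the destination type can be illustrated with a small sketch in C. It assumes a model where a float saturate clamps to [0.0, 1.0] and an integer saturate clamps to the range of the destination type (so it is a no-op for a 32-bit unsigned value written to a 32-bit unsigned destination); that model and all helper names are assumptions of this example, not taken from the driver.

```c
#include <stdint.h>
#include <string.h>

static float
bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

static uint32_t
float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return bits;
}

/* Saturate with the register interpreted as F: clamp to [0.0, 1.0]. */
static uint32_t
sat_as_float(uint32_t bits)
{
   float f = bits_to_float(bits);
   if (!(f > 0.0f))
      f = 0.0f;            /* negatives (and NaN, in this sketch) become 0 */
   else if (f > 1.0f)
      f = 1.0f;
   return float_to_bits(f);
}

/* Saturate with the register interpreted as UD: every 32-bit value is
 * already within the destination range, so nothing changes.
 */
static uint32_t
sat_as_uint32(uint32_t bits)
{
   return bits;
}
```

Feeding the bit pattern of 2.5f through sat_as_float() yields 1.0f, while sat_as_uint32() leaves it at 2.5f, which is why moving the saturate onto a UD-typed SEL without retyping it loses the clamp.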


Re: [Mesa-dev] Updating mesa3d.org docs?

2015-10-15 Thread Brian Paul
On Thu, Oct 15, 2015 at 3:27 PM, Sarah Sharp wrote:

> On Thu, Oct 15, 2015 at 01:34:54PM -0600, Brian Paul wrote:
> > On 10/15/2015 01:18 PM, Sarah Sharp wrote:
> > >1. What's the process for pushing updated documentation to the site?
> >
> > All the website pages are found in the git docs/ directory.  Changes are
> > submitted as patches and reviewed like code on the mesa-dev list.
>
> Sounds good.
>
> > >2. How often are updated docs pushed? Once every week, month, or when
> > >there's a new Mesa version?
> >
> > I push them whenever a new Mesa version is released, but I can do it at any
> > time on request.
>
> Ok, great! I'll ping you when I get patches in.
>
> However, I will note that the push process isn't working for some pages
> on the website. 11.0.1 was released in September 2015 (when I would
> expect you to do an update). You pushed a commit in 2013 to remove
> references to CVS (commit dbbe108951 "docs: replace CVS with git"), but
> that change is still not reflected here:
>
> http://www.mesa3d.org/sourcedocs.html
>
> Other pages that are out of sync with master include:
>
> http://www.mesa3d.org/systems.html
> http://www.mesa3d.org/license.html
> http://www.mesa3d.org/install.html
> http://www.mesa3d.org/envvars.html
> http://www.mesa3d.org/osmesa.html
> http://www.mesa3d.org/extensions.html


OK, should be fixed now.



>
> The license page seems particularly important to have up-to-date on the
> website, since the text changed from "IN NO EVENT SHALL BRIAN PAUL BE
> LIABLE" to "IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
> LIABLE".
>
> Pages that people would generally check often (index.html,
> relnotes.html) all seem to get updated. There are also less-frequently
> updated pages like shading.html that are up-to-date. It's a bit of
> a mystery to me why the other pages aren't being updated.
>

Just an oversight.



>
> > >3. Any chance I could get permissions to push updated docs?  I'll be
> > >improving Mesa documentation as part of my new job, and I would love
> > >to be able to push myself once patches are accepted, rather than
> > >having to ping you.
> >
> > The typical deal is we wait until a person has some track record of
> > producing good patches before giving git-write/push privileges.
> >
> > So, I'd suggest you make some changes/patches, post them to the mesa-dev
> > list for review (others can push them for you initially), and then when
> > you've got some history established you can file a request (via bugzilla)
> > for git privileges.
>
> Completely understandable. I have done a couple of commits to Mesa and
> piglit, and I do have git repo access under the username 'sarah'.
> I haven't pushed my own branches yet, as the patches are still being
> tested internally, but I will have someone else push my initial
> patches after they get mailing list review. I understand if you want to
> wait a while for me to prove myself before you grant me additional
> trusted privileges. :)
>

Yeah, I'm sure it won't take you long...

-Brian


Re: [Mesa-dev] [PATCH 2/2] glsl: silence warning about unhandled ast_unsized_array_dim case in switch

2015-10-15 Thread Timothy Arceri
On Thu, 2015-10-15 at 07:27 -0600, Brian Paul wrote:

Reviewed-by: Timothy Arceri 

> ---
>  src/glsl/ast_to_hir.cpp | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index cd40fe3..ede02d9 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -2017,6 +2017,9 @@ ast_expression::has_sequence_subexpression()
> const
>  
> case ast_function_call:
>unreachable("should be handled by
> ast_function_expression::hir");
> +
> +   case ast_unsized_array_dim:
> +  unreachable("ast_unsized_array_dim: Should never get here.");
> }
>  
> return false;


[Mesa-dev] Request for Proposal for XDC 2016

2015-10-15 Thread Daniel Vetter
Hi all,

The X.Org board is soliciting further proposals to organize XDC
somewhere in Europe. The board has already received a proposal for
Helsinki and plans to vote on that in the next meeting on the 29th
Oct, but if there is anyone else interested in hosting XDC we'd very
much like to hear about that.

Please send in your proposal to bo...@foundation.x.org by 28th Oct at
the latest to make sure the board can consider it.

Thanks,
Daniel, secretary of the board
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Mesa-dev] [PATCH] i965/vs: Drop hack that created NIR for fixed function vertex programs.

2015-10-15 Thread Kenneth Graunke
Marek made core Mesa call ProgramStringNotify(), which solves this
properly.  The hack is no longer needed.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vs.c | 12 
 1 file changed, 12 deletions(-)

Thanks, Marek!

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index de9a867..7253117 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -57,18 +57,6 @@ brw_codegen_vs_prog(struct brw_context *brw,
bool start_busy = false;
double start_time = 0;
 
-   if (!vp->program.Base.nir) {
-  /* Normally we generate NIR in LinkShader() or
-   * ProgramStringNotify(), but Mesa's fixed-function vertex program
-   * handling doesn't notify the driver at all.  Just do it here, at
-   * the last minute, even though it's lame.
-   */
-  assert(vp->program.Base.Id == 0 && prog == NULL);
-  vp->program.Base.nir =
- brw_create_nir(brw, NULL, &vp->program.Base, MESA_SHADER_VERTEX,
-brw->intelScreen->compiler->scalar_vs);
-   }
-
if (prog)
   vs = (struct brw_shader *) prog->_LinkedShaders[MESA_SHADER_VERTEX];
 
-- 
2.6.1



Re: [Mesa-dev] [PATCH] mesa: Improve handling of GL_BGRA format in es3 format_and_type checks

2015-10-15 Thread Jason Ekstrand
On Wed, Oct 14, 2015 at 6:56 PM, Eduardo Lima Mitev  wrote:
> We recently added support for GL_BGRA internal format when validating
> combination of format+type+internal_format in Tex(Sub)ImageXD calls
> (to fix https://bugs.freedesktop.org/show_bug.cgi?id=92265).
>
> However, the current implementation handles it as a special case when
> obtaining the effective internal format, treating GL_BGRA as if its
> base format is GL_RGBA except for the case of validation.
>
> This causes Mesa to accept a combination like:
> internalFormat = GL_BGRA_EXT, format = GL_RGBA, type = GL_UNSIGNED_BYTE as
> valid arguments to TexImage2D, when it is actually an invalid combination
> per EXT_texture_format_BGRA
> 
>
> This patch makes _mesa_base_tex_format() return GL_BGRA_EXT as base format of
> GL_BGRA_EXT internal format, which is consistent with the extension
> spec. As a result, the code for handling GL_BGRA during validation gets
> simplified.
> ---
>  src/mesa/main/glformats.c | 21 +++--
>  1 file changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index faa6382..e0192fe 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2148,6 +2148,9 @@ _mesa_es_error_check_format_and_type(GLenum format, 
> GLenum type,
>   *
>   * \return the corresponding \u base internal format (GL_ALPHA, GL_LUMINANCE,
>   * GL_LUMANCE_ALPHA, GL_INTENSITY, GL_RGB, or GL_RGBA), or -1 if invalid 
> enum.
> + * When profile is GLES, it will also return GL_BGRA as base format of
> + * GL_BGRA internal format, as specified by extension
> + * EXT_texture_format_BGRA.
>   *
>   * This is the format which is used during texture application (i.e. the
>   * texture format and env mode determine the arithmetic used.
> @@ -2215,7 +2218,7 @@ _mesa_base_tex_format(const struct gl_context *ctx, 
> GLint internalFormat)
> if (_mesa_is_gles(ctx)) {
>switch (internalFormat) {
>case GL_BGRA:
> - return GL_RGBA;
> + return GL_BGRA_EXT;

I don't think we can just up-and-change this.  It does get used some
places that don't expect GL_BGRA_EXT.  For instance, in
copytexture_error_check (teximage.c:2265) we call
_mesa_base_tex_format to get the base tex format for the renderbuffer
or texture and then pass that into _mesa_base_format_component_count()
which doesn't know about GL_BGRA_EXT and so returns -1.  Also, in
gallium in st_format.c:1982 they have a line to treat BGRA as RGBA
which I'm guessing is to hack around mesa doing the same.  I think we
probably can make this change, but it's going to take some more care.

Given that we completely broke Weston and KWin with this, I don't
think "It passes piglit" is a particularly convincing argument for
this one.  Some code-searching would be good and we probably need a
test that actually tests that the format works (not just API errors).
--Jason

>default:
>   ; /* fallthrough */
>}
> @@ -2799,18 +2802,8 @@ _mesa_es3_error_check_format_and_type(const struct 
> gl_context *ctx,
>   return GL_INVALID_OPERATION;
>
>GLenum baseInternalFormat;
> -  if (internalFormat == GL_BGRA_EXT) {
> - /* Unfortunately, _mesa_base_tex_format returns a base format of
> -  * GL_RGBA for GL_BGRA_EXT.  This makes perfect sense if you're
> -  * asking the question, "what channels does this format have?"
> -  * However, if we're trying to determine if two internal formats
> -  * match in the ES3 sense, we actually want GL_BGRA.
> -  */
> - baseInternalFormat = GL_BGRA_EXT;
> -  } else {
> - baseInternalFormat =
> -_mesa_base_tex_format(ctx, effectiveInternalFormat);
> -  }
> +  baseInternalFormat =
> + _mesa_base_tex_format(ctx, effectiveInternalFormat);
>
>if (internalFormat != baseInternalFormat)
>   return GL_INVALID_OPERATION;
> @@ -2820,7 +2813,7 @@ _mesa_es3_error_check_format_and_type(const struct 
> gl_context *ctx,
>
> switch (format) {
> case GL_BGRA_EXT:
> -  if (type != GL_UNSIGNED_BYTE || internalFormat != GL_BGRA)
> +  if (type != GL_UNSIGNED_BYTE || internalFormat != format)
>   return GL_INVALID_OPERATION;
>break;
>
> --
> 2.5.3
>
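
The failure mode described in the review, where a helper that predates BGRA as a base format falls through to -1, can be sketched as follows. component_count() is a hypothetical stand-in written for this illustration and is not the actual _mesa_base_format_component_count() code; the GL_* values are the usual ones from the OpenGL headers.

```c
/* Enum values as defined by the OpenGL headers. */
#define GL_ALPHA            0x1906
#define GL_RGB              0x1907
#define GL_RGBA             0x1908
#define GL_LUMINANCE        0x1909
#define GL_LUMINANCE_ALPHA  0x190A
#define GL_BGRA_EXT         0x80E1

/* Hypothetical component counter that only knows the classic base formats. */
static int
component_count(unsigned base_format)
{
   switch (base_format) {
   case GL_ALPHA:
   case GL_LUMINANCE:
      return 1;
   case GL_LUMINANCE_ALPHA:
      return 2;
   case GL_RGB:
      return 3;
   case GL_RGBA:
      return 4;
   default:
      return -1;   /* unknown base format */
   }
}
```

A caller that used to receive GL_RGBA for a BGRA texture gets 4 here; once the base format becomes GL_BGRA_EXT the same caller gets -1 unless the helper is taught about the new value, which is the kind of call site a code search would need to find.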


[Mesa-dev] [PATCH] nir/prog: Don't double-insert the fog-coord variable

2015-10-15 Thread Jason Ekstrand
nir_variable_create already inserts it in the right list for us so
inserting it again causes a linked list corruption.
---
 src/mesa/program/prog_to_nir.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index fe8c238..da61a2b 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -1001,11 +1001,10 @@ setup_registers_and_variables(struct ptn_compile *c)
 store->src[0] = nir_src_for_ssa(f001);
 nir_builder_instr_insert(b, &store->instr);
 
-/* Insert the real input into the list so the driver has real
- * inputs, but set c->input_vars[i] to the temporary so we use
+/* We inserted the real input into the list so the driver has real
+ * inputs, but we set c->input_vars[i] to the temporary so we use
  * the splatted value.
  */
-exec_list_push_tail(>inputs, >node);
 c->input_vars[i] = fullvar;
 continue;
  }
-- 
2.5.0.400.gff86faf
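
Why inserting the same node twice corrupts a list can be seen with a minimal intrusive list in C. This is a sketch under assumptions: it uses a simple circular list with a sentinel head, which is a hypothetical stand-in and not Mesa's exec_list implementation, though the same kind of damage applies to any intrusive doubly linked list.

```c
/* Minimal circular doubly linked list with a sentinel head. */
struct node {
   struct node *prev;
   struct node *next;
};

static void
list_init(struct node *head)
{
   head->prev = head;
   head->next = head;
}

/* Link n in just before the sentinel, i.e. at the tail. The caller must
 * ensure n is not already on a list; nothing here checks for that.
 */
static void
push_tail(struct node *head, struct node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}
```

After a second push_tail() of a node that is already the tail, the node's next and prev pointers both point back at the node itself, so a forward walk from the head never gets back to the sentinel.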



Re: [Mesa-dev] [PATCH 1/3] nv50/ir: use C++11 standard std::unordered_map if possible

2015-10-15 Thread Chih-Wei Huang
2015-10-16 0:11 GMT+08:00 Ilia Mirkin :
> This patch and the nv30 one are both
>
> Reviewed-by: Ilia Mirkin 

Thank you for the review.

> I guess adding a cc: stable makes sense for these too? Or are further
> fixes required that would make building 11.0.x impractical?

Ah, yes. They apply to 11.0.x as well.
Thank you for reminding.


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org


Re: [Mesa-dev] [PATCH] nir/prog: Don't double-insert the fog-coord variable

2015-10-15 Thread Matt Turner
Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH] gallium/util: fix debug_get_flags_option on 32-bit harder

2015-10-15 Thread Rob Clark
From: Rob Clark 

I noticed the FD_MESA_DEBUG=help numeric flags output was looking kinda
funny.

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/util/u_debug.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index 5fe9e33..7388a49 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -276,7 +276,7 @@ debug_get_flags_option(const char *name,
   for (; flags->name; ++flags)
  namealign = MAX2(namealign, strlen(flags->name));
   for (flags = orig; flags->name; ++flags)
- _debug_printf("| %*s [0x%0*"PRIu64"]%s%s\n", namealign, flags->name,
+ _debug_printf("| %*s [0x%0*"PRIx64"]%s%s\n", namealign, flags->name,
   (int)sizeof(uint64_t)*CHAR_BIT/4, flags->value,
   flags->desc ? " " : "", flags->desc ? flags->desc : "");
}
@@ -291,9 +291,9 @@ debug_get_flags_option(const char *name,
 
if (debug_get_option_should_print()) {
   if (str) {
- debug_printf("%s: %s = 0x%"PRIu64" (%s)\n", __FUNCTION__, name, result, str);
+ debug_printf("%s: %s = 0x%"PRIx64" (%s)\n", __FUNCTION__, name, result, str);
   } else {
- debug_printf("%s: %s = 0x%"PRIu64"\n", __FUNCTION__, name, result);
+ debug_printf("%s: %s = 0x%"PRIx64"\n", __FUNCTION__, name, result);
   }
}
 
-- 
2.5.0
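
The difference between the two format macros in the diff is easy to reproduce in isolation. The helper below is a hypothetical sketch (format_flag() is not part of u_debug.c); it only shows what a 0x prefix followed by PRIu64 versus PRIx64 produces for the same value.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format value after a "0x" prefix, either correctly in hex (PRIx64)
 * or, as before the patch, in decimal (PRIu64).
 */
static void
format_flag(char *buf, size_t len, uint64_t value, int use_hex)
{
   if (use_hex)
      snprintf(buf, len, "0x%" PRIx64, value);
   else
      snprintf(buf, len, "0x%" PRIu64, value);
}
```

For the value 255, the PRIu64 variant produces "0x255" and the PRIx64 variant produces "0xff"; only the latter is a correct hexadecimal rendering.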



Re: [Mesa-dev] Updating mesa3d.org docs?

2015-10-15 Thread Brian Paul

On 10/15/2015 01:18 PM, Sarah Sharp wrote:
> Hi Brian!

Hi Sarah,

> I'm a new Mesa developer in Intel's OTC graphics team (although not new
> to open source, I've been a Linux kernel developer for the last seven
> years).
>
> I heard that you're responsible for updating mesa3d.org documentation
> against the docs in the Mesa source code repo. I noticed the docs are
> out-of-date WRT the repo, and I had a couple questions:
>
> 1. What's the process for pushing updated documentation to the site?

All the website pages are found in the git docs/ directory.  Changes are
submitted as patches and reviewed like code on the mesa-dev list.

> 2. How often are updated docs pushed? Once every week, month, or when
> there's a new Mesa version?

I push them whenever a new Mesa version is released, but I can do it at
any time on request.

> 3. Any chance I could get permissions to push updated docs?  I'll be
> improving Mesa documentation as part of my new job, and I would love
> to be able to push myself once patches are accepted, rather than
> having to ping you.

The typical deal is we wait until a person has some track record of
producing good patches before giving git-write/push privileges.

So, I'd suggest you make some changes/patches, post them to the mesa-dev
list for review (others can push them for you initially), and then when
you've got some history established you can file a request (via
bugzilla) for git privileges.


-Brian



Re: [Mesa-dev] [PATCH 0/5] Implementation of vec4 equivalent to fs_cmod_propagation optimization

2015-10-15 Thread Alejandro Piñeiro


On 15/10/15 18:19, Matt Turner wrote:
> On Sat, Oct 10, 2015 at 4:24 AM, Alejandro Piñeiro  
> wrote:
>> This series implements a vec4 equivalent to fs_cmod_propagation optimization.
>>
>> The last two commits are not really needed for the optimization, are just
>> nice-to-have (imho) that I added while implementing the optimization.
>>
>> Alejandro Piñeiro (5):
>>   i965/vec4: nir_emit_if doesn't need to predicate based on all the
>> channels
>>   i965/vec4: adding vec4_cmod_propagation optimization
>>   i965/vec4: Add unit tests for cmod propagation pass.
>>   i965/vec4: use a custom envvar to decide to print the assembly of
>> test_vec4_cmod_propagation
>>   i965/vec4: print predicate control at brw_vec4 dump_instruction
> Pending the flag liveness analysis patch, these five are reviewed,

First, thanks for the review. Even though those patches could land
without the flag liveness analysis patch, I would prefer to wait until
everything is finished. I will not be able to work tomorrow, so that
would mean waiting until next Monday. If you prefer, I could instead
start landing the reviewed patches without waiting for the flag one.
>  but
> aren't we still missing a vec4_visitor implementation of
> fixup_3src_null_dest()?

Sorry for not answering that suggestion directly. I didn't need to
implement that one. Take a look at what I get from the piglit test
vs-refract-vec2-vec2-float.shader_test with the cmod optimization:

mad.l.f0(8) g18<1>.xF   g10<4,4,1>.xF   g17<4,4,1>.xF  
-g2.0<0,1,0>F { align16 1Q };
(+f0.x) if(8)   JIP: 6  UIP: 13  

The destination is already a non-null register. When I asked, the
problem was not a null destination, but that we were only writing the x
component of the flag in the mad.l while using all the components at the
if. So I didn't need to add a vec4 fixup_3src_null_dest.

Best regards

-- 
Alejandro Piñeiro (apinhe...@igalia.com)



Re: [Mesa-dev] [PATCH V7 03/24] glsl: allow AoA to be sized by initializer or constructor

2015-10-15 Thread Samuel Iglesias Gonsálvez


On 15/10/15 05:43, Timothy Arceri wrote:
> On Fri, 2015-10-09 at 13:33 +0200, Samuel Iglesias Gonsálvez wrote:
>>
>> On 09/10/15 13:25, Timothy Arceri wrote:
>>> On Thu, 2015-10-08 at 11:08 +0200, Samuel Iglesias Gonsálvez wrote:
 On 07/10/15 00:47, Timothy Arceri wrote:
> From Section 4.1.9 of the GLSL ES 3.10 spec:
>
>  "Arrays are sized either at compile-time or at run-time.
>   To size an array at compile-time, either the size
>   must be specified within the brackets as above or
>   must be inferred from the type of the initializer."
> ---
>  src/glsl/ast.h   | 15 ++---
>  src/glsl/ast_array_index.cpp |  7 ++--
>  src/glsl/ast_function.cpp| 33 +-
>  src/glsl/ast_to_hir.cpp  | 79
> ++--
>  src/glsl/glsl_parser.yy  | 11 +++---
>  5 files changed, 104 insertions(+), 41 deletions(-)
>
> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
> index 4c31436..b43be24 100644
> --- a/src/glsl/ast.h
> +++ b/src/glsl/ast.h
> @@ -181,6 +181,7 @@ enum ast_operators {
> ast_post_dec,
> ast_field_selection,
> ast_array_index,
> +   ast_unsized_array_dim,
>  
> ast_function_call,
>  
> @@ -318,16 +319,7 @@ public:
>  
>  class ast_array_specifier : public ast_node {
>  public:
> -   /** Unsized array specifier ([]) */
> -   explicit ast_array_specifier(const struct YYLTYPE )
> - : is_unsized_array(true)
> -   {
> -  set_location(locp);
> -   }
> -
> -   /** Sized array specifier ([dim]) */
> ast_array_specifier(const struct YYLTYPE ,
> ast_expression
> *dim)
> - : is_unsized_array(false)
> {
>set_location(locp);
>array_dimensions.push_tail(>link);
> @@ -340,11 +332,8 @@ public:
>  
> virtual void print(void) const;
>  
> -   /* If true, this means that the array has an unsized
> outermost
> dimension. */
> -   bool is_unsized_array;
> -
> /* This list contains objects of type ast_node containing
> the
> -* sized dimensions only, in outermost-to-innermost order.
> +* array dimensions in outermost-to-innermost order.
>  */
> exec_list array_dimensions;
>  };
> diff --git a/src/glsl/ast_array_index.cpp
> b/src/glsl/ast_array_index.cpp
> index 5e8f49d..7855e0a 100644
> --- a/src/glsl/ast_array_index.cpp
> +++ b/src/glsl/ast_array_index.cpp
> @@ -28,13 +28,10 @@
>  void
>  ast_array_specifier::print(void) const
>  {
> -   if (this->is_unsized_array) {
> -  printf("[ ] ");
> -   }
> -
> foreach_list_typed (ast_node, array_dimension, link, 
> ->array_dimensions) {
>printf("[ ");
> -  array_dimension->print();
> +  if (((ast_expression*)array_dimension)->oper !=
> ast_unsized_array_dim)
> + array_dimension->print();
>printf("] ");
> }
>  }
> diff --git a/src/glsl/ast_function.cpp
> b/src/glsl/ast_function.cpp
> index 26d4c62..cf4e64a 100644
> --- a/src/glsl/ast_function.cpp
> +++ b/src/glsl/ast_function.cpp
> @@ -950,6 +950,7 @@ process_array_constructor(exec_list
> *instructions,
> }
>  
> bool all_parameters_are_constant = true;
> +   const glsl_type *element_type = constructor_type
> ->fields.array;
>  
> /* Type cast each parameter and, if possible, fold
> constants.
> */
> foreach_in_list_safe(ir_rvalue, ir, _parameters) {
> @@ -976,12 +977,34 @@ process_array_constructor(exec_list
> *instructions,
>}
>}
>  
> -  if (result->type != constructor_type->fields.array) {
> +  if (constructor_type->fields.array->is_unsized_array())
> {
> + /* As the inner parameters of the constructor are
> created
> without
> +  * knowledge of each other we need to check to make
> sure
> unsized
> +  * parameters of unsized constructors all end up with
> the
> same size.
> +  *
> +  * e.g we make sure to fail for a constructor like
> this:
> +  * vec4[][] a = vec4[][](vec4[](vec4(0.0),
> vec4(1.0)),
> +  *   vec4[](vec4(0.0), vec4(1.0),
> vec4(1.0)),
> +  *   vec4[](vec4(0.0),
> vec4(1.0)));
> +  */
> + if (element_type->is_unsized_array()) {
> + /* This is the first parameter so just get the
> type
> */
> +element_type = result->type;
> + } else if (element_type != result->type) {
> +_mesa_glsl_error(loc, state, "type error in array
> constructor: "
> + "expected: %s, found 

Re: [Mesa-dev] [mesa-dev, mesa-demos][PATCH] sharedtex_mt: fix rendering thread hang

2015-10-15 Thread Belal, Awais
Thanks a lot Brian.

BR,
Awais


From: Brian Paul [bri...@vmware.com]
Sent: Wednesday, October 14, 2015 7:19 PM
To: Belal, Awais; mesa-dev@lists.freedesktop.org
Subject: Re: [Mesa-dev] [mesa-dev, mesa-demos][PATCH] sharedtex_mt: fix 
rendering thread hang

I just pushed it.  Thanks.

-Brian

On 10/14/2015 02:07 AM, Belal, Awais wrote:
> Hi Brian,
>
> Do you want me to update this or is it good to go as is?
>
> BR,
> Awais
>
> 
> From: mesa-dev [mesa-dev-boun...@lists.freedesktop.org] on behalf of Belal, 
> Awais
> Sent: Monday, October 12, 2015 6:33 PM
> To: Brian Paul; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [mesa-dev, mesa-demos][PATCH] sharedtex_mt: fix 
> rendering thread hang
>
> Hi Brian,
>
> Thanks for your reply :)
> The move of variable definition was just to make the code look a little 
> cleaner.
>
> BR,
> Awais
>
> 
> From: Brian Paul [bri...@vmware.com]
> Sent: Monday, October 12, 2015 6:31 PM
> To: Belal, Awais; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [mesa-dev, mesa-demos][PATCH] sharedtex_mt: fix 
> rendering thread hang
>
> On 10/12/2015 05:25 AM, Belal, Awais wrote:
>> Hi,
>>
>> Is there a reservation against the patch below?
>
> Looks OK, but one comment below.
>
>
>> BR,
>> Awais
>>
>> 
>> From: mesa-dev [mesa-dev-boun...@lists.freedesktop.org] on behalf of Belal, 
>> Awais
>> Sent: Thursday, October 08, 2015 2:00 PM
>> To: mesa-dev@lists.freedesktop.org
>> Subject: [Mesa-dev] [mesa-dev,  mesa-demos][PATCH] sharedtex_mt: fix 
>> rendering thread hang
>>
>> XNextEvent is a blocking call that holds the display mutex. This
>> causes the rendering threads to hang when they try to call
>> glXSwapBuffers(), as that tries to take the same mutex in
>> underlying calls through XCopyArea().
>> So we only call XNextEvent when at least one event is pending,
>> so that we never block indefinitely.
>>
>> Signed-off-by: Awais Belal 
>> ---
>>src/xdemos/sharedtex_mt.c | 9 +++--
>>1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/xdemos/sharedtex_mt.c b/src/xdemos/sharedtex_mt.c
>> index a90903a..1d503c4 100644
>> --- a/src/xdemos/sharedtex_mt.c
>> +++ b/src/xdemos/sharedtex_mt.c
>> @@ -420,9 +420,14 @@ Resize(struct window *h, unsigned int width, unsigned 
>> int height)
>>static void
>>EventLoop(void)
>>{
>> +   int i;
>> +   XEvent event;
>>   while (1) {
>> -  int i;
>> -  XEvent event;
>> +  /* Do we have an event? */
>> +  if (XPending(gDpy) == 0) {
>> + usleep(1);
>> + continue;
>> +  }
>>  XNextEvent(gDpy, &event);
>>  for (i = 0; i < NumWindows; i++) {
>>struct window *h = [i];
>
> Was there particular reason to move the i, event declarations?
>
> In any case, I'll commit this in a bit.
>
> -Brian
>
>



[Mesa-dev] [PATCH 2/4] st/mesa: check for out-of-memory in st_DrawPixels()

2015-10-15 Thread Brian Paul
Before, if make_texture() or st_create_texture_sampler_view() failed
we silently no-op'd the glDrawPixels.  Now, set GL_OUT_OF_MEMORY.
This also allows us to un-nest a bunch of code.
---
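As a rough illustration of the un-nesting, here is a standalone sketch of
the same early-return shape. Every name in it (sketch_ctx, sketch_make_texture
and so on) is a hypothetical stand-in, not Mesa API; it only models "record
an out-of-memory error, release what was acquired, return".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not Mesa API. */
enum sketch_error { SKETCH_NO_ERROR, SKETCH_OUT_OF_MEMORY };

struct sketch_ctx {
   enum sketch_error error;  /* last recorded error */
   bool fail_texture;        /* simulate texture creation failing */
   bool fail_view;           /* simulate sampler view creation failing */
   int live_textures;        /* textures created and not yet released */
   int draws;                /* number of quads actually drawn */
};

static bool
sketch_make_texture(struct sketch_ctx *ctx)
{
   if (ctx->fail_texture)
      return false;
   ctx->live_textures++;
   return true;
}

static bool
sketch_make_view(struct sketch_ctx *ctx)
{
   return !ctx->fail_view;
}

static void
sketch_draw_pixels(struct sketch_ctx *ctx)
{
   assert(ctx != NULL);

   /* Each failure is reported and handled immediately, so the
    * success path below stays at one indentation level. */
   if (!sketch_make_texture(ctx)) {
      ctx->error = SKETCH_OUT_OF_MEMORY;
      return;
   }

   if (!sketch_make_view(ctx)) {
      ctx->error = SKETCH_OUT_OF_MEMORY;
      ctx->live_textures--;   /* release the texture before bailing out */
      return;
   }

   ctx->draws++;
   ctx->live_textures--;      /* normal-path release */
}
```

The point of the shape is that a failure is no longer silent and the texture
is released on both the error path and the normal path.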
 src/mesa/state_tracker/st_cb_drawpixels.c | 74 +--
 1 file changed, 40 insertions(+), 34 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
index e4d3580..05f6e6b 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -975,6 +975,7 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
int num_sampler_view = 1;
struct gl_pixelstore_attrib clippedUnpack;
struct st_fp_variant *fpv = NULL;
+   struct pipe_resource *pt;
 
/* Mesa state should be up to date by now */
assert(ctx->NewState == 0x0);
@@ -1030,42 +1031,47 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
   st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
}
 
-   /* draw with textured quad */
-   {
-  struct pipe_resource *pt
- = make_texture(st, width, height, format, type, unpack, pixels);
-  if (pt) {
- sv[0] = st_create_texture_sampler_view(st->pipe, pt);
-
- if (sv[0]) {
-/* Create a second sampler view to read stencil.
- * The stencil is written using the shader stencil export
- * functionality. */
-if (write_stencil) {
-   enum pipe_format stencil_format =
- util_format_stencil_only(pt->format);
-   /* we should not be doing pixel map/transfer (see above) */
-   assert(num_sampler_view == 1);
-   sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
- stencil_format);
-   num_sampler_view++;
-}
+   /* Put glDrawPixels image into a texture */
+   pt = make_texture(st, width, height, format, type, unpack, pixels);
+   if (!pt) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  return;
+   }
 
-draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
-   width, height,
-   ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
-   sv,
-   num_sampler_view,
-   driver_vp,
-   driver_fp, fpv,
-   color, GL_FALSE, write_depth, write_stencil);
-pipe_sampler_view_reference(&sv[0], NULL);
-if (num_sampler_view > 1)
-   pipe_sampler_view_reference(&sv[1], NULL);
- }
- pipe_resource_reference(&pt, NULL);
-  }
+   /* create sampler view for the image */
+   sv[0] = st_create_texture_sampler_view(st->pipe, pt);
+   if (!sv[0]) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  pipe_resource_reference(&pt, NULL);
+  return;
}
+
+   /* Create a second sampler view to read stencil.  The stencil is
+* written using the shader stencil export functionality.
+*/
+   if (write_stencil) {
+  enum pipe_format stencil_format =
+ util_format_stencil_only(pt->format);
+  /* we should not be doing pixel map/transfer (see above) */
+  assert(num_sampler_view == 1);
+  sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
+stencil_format);
+  num_sampler_view++;
+   }
+
+   draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
+  width, height,
+  ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
+  sv,
+  num_sampler_view,
+  driver_vp,
+  driver_fp, fpv,
+  color, GL_FALSE, write_depth, write_stencil);
+   pipe_sampler_view_reference(&sv[0], NULL);
+   if (num_sampler_view > 1)
+  pipe_sampler_view_reference(&sv[1], NULL);
+
+   pipe_resource_reference(&pt, NULL);
 }
 
 
-- 
1.9.1



[Mesa-dev] [PATCH 4/4] st/mesa: optimize 4-component ubyte glDrawPixels

2015-10-15 Thread Brian Paul
If we didn't find a gallium surface format that exactly matched the
glDrawPixels format/type combination, we used some other 32-bit packed
RGBA format and swizzled the whole image in the mesa texstore/format code.

That slow path can be avoided in some common cases by using the
pipe_sampler_view's swizzle terms to do the swizzling at texture sampling
time instead.

For now, only GL_RGBA/ubyte and GL_BGRA/ubyte combinations are supported.
In the future other formats and types like GL_UNSIGNED_INT_8_8_8_8 could
be added.
---
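To make the idea concrete, here is a simplified standalone model of the
swizzle arithmetic. It is only a sketch: swizzle terms are plain indices
0..3 (the real gallium terms also include constant 0 and 1), the format is a
hypothetical B8G8R8A8 layout, and none of the sketch_* names exist in Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not Mesa code.  A texel is four bytes in memory;
 * fmt_swz[i] names the memory byte that holds component i of RGBA.
 * For a B8G8R8A8 layout, red lives in byte 2 and blue in byte 0. */
static const uint8_t sketch_bgra_fmt_swz[4] = { 2, 1, 0, 3 };

struct sketch_rgba { uint8_t c[4]; };  /* c[0]=R, c[1]=G, c[2]=B, c[3]=A */

/* What the texture format does when a texel is fetched. */
static struct sketch_rgba
sketch_fetch(const uint8_t mem[4], const uint8_t fmt_swz[4])
{
   struct sketch_rgba out;
   for (int i = 0; i < 4; i++)
      out.c[i] = mem[fmt_swz[i]];
   return out;
}

/* What the sampler view swizzle does on top of the fetched value. */
static struct sketch_rgba
sketch_apply_view(struct sketch_rgba in, const uint8_t view_swz[4])
{
   assert(view_swz != 0);
   struct sketch_rgba out;
   for (int i = 0; i < 4; i++)
      out.c[i] = in.c[view_swz[i]];
   return out;
}

/* Mirrors the patch: start from the format's swizzle and swap
 * red/blue when the source image is BGRA. */
static void
sketch_setup_view(uint8_t view_swz[4], const uint8_t fmt_swz[4],
                  bool src_is_bgra)
{
   for (int i = 0; i < 4; i++)
      view_swz[i] = fmt_swz[i];
   if (src_is_bgra) {
      view_swz[0] = fmt_swz[2];
      view_swz[2] = fmt_swz[0];
   }
}
```

In this model, RGBA source bytes copied verbatim into the BGRA texel come
out of the fetch with red and blue exchanged, and the view swizzle exchanges
them back, so no per-pixel CPU swizzling is needed.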
 src/mesa/state_tracker/st_cb_drawpixels.c | 73 +++
 1 file changed, 64 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
index 05f6e6b..a135761 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -395,15 +395,35 @@ make_texture(struct st_context *st,
* Note that the image is actually going to be upside down in
* the texture.  We deal with that with texcoords.
*/
-  success = _mesa_texstore(ctx, 2,   /* dims */
-   baseInternalFormat, /* baseInternalFormat */
-   mformat,  /* mesa_format */
-   transfer->stride, /* dstRowStride, bytes */
-   ,/* destSlices */
-   width, height, 1, /* size */
-   format, type, /* src format/type */
-   pixels,   /* data source */
-   unpack);
+  if ((format == GL_RGBA || format == GL_BGRA)
+  && type == GL_UNSIGNED_BYTE) {
+ /* Use a memcpy-based texstore to avoid software pixel swizzling.
+  * We'll do the necessary swizzling with the pipe_sampler_view to
+  * give much better performance.
+  * XXX in the future, expand this to accommodate more format and
+  * type combinations.
+  */
+ _mesa_memcpy_texture(ctx, 2,
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  ,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+ success = GL_TRUE;
+  }
+  else {
+ success = _mesa_texstore(ctx, 2,   /* dims */
+  baseInternalFormat, /* baseInternalFormat */
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  ,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+  }
 
   /* unmap */
   pipe_transfer_unmap(pipe, transfer);
@@ -958,6 +978,38 @@ clamp_size(struct pipe_context *pipe, GLsizei *width, 
GLsizei *height,
 
 
 /**
+ * Set the sampler view's swizzle terms.  This is used to handle RGBA
+ * swizzling when the incoming image format isn't an exact match for
+ * the actual texture format.  For example, if we have glDrawPixels(
+ * GL_RGBA, GL_UNSIGNED_BYTE) and we chose the texture format
+ * PIPE_FORMAT_B8G8R8A8 then we can use the sampler view swizzle to
+ * avoid swizzling all the pixels in software in the texstore code.
+ */
+static void
+setup_sampler_swizzle(struct pipe_sampler_view *sv, GLenum format, GLenum type)
+{
+   if ((format == GL_RGBA || format == GL_BGRA) && type == GL_UNSIGNED_BYTE) {
+  const struct util_format_description *desc =
+ util_format_description(sv->texture->format);
+  /* Every gallium driver supports at least one 32-bit packed RGBA format.
+   * We must have chosen one for (GL_RGBA, GL_UNSIGNED_BYTE).
+   */
+  assert(desc->block.bits == 32);
+  /* use the format's swizzle to setup the sampler swizzle */
+  sv->swizzle_r = desc->swizzle[0];
+  sv->swizzle_g = desc->swizzle[1];
+  sv->swizzle_b = desc->swizzle[2];
+  sv->swizzle_a = desc->swizzle[3];
+  if (format == GL_BGRA) {
+ /* swap red/blue */
+ sv->swizzle_r = desc->swizzle[2];
+ sv->swizzle_b = desc->swizzle[0];
+  }
+   }
+}
+
+
+/**
  * Called via ctx->Driver.DrawPixels()
  */
 static void
@@ -1046,6 +1098,9 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
   return;
}
 
+   /* Set up the sampler view's swizzle */
+   setup_sampler_swizzle(sv[0], format, type);
+
/* Create a second sampler view to 

[Mesa-dev] [PATCH 1/4] st/mesa: use MAX3() instead of MAX2(MAX2) in draw_textured_quad()

2015-10-15 Thread Brian Paul
---
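The rewrite relies on max(a, b) + 1 == max(a + 1, b + 1). A small standalone
check of that, using locally defined macros (Mesa ships its own MAX2/MAX3;
the definitions and function names here are stand-ins for this note):

```c
#include <assert.h>

/* Local stand-ins for this note; Mesa provides its own versions. */
#define MAX2(a, b)    ((a) > (b) ? (a) : (b))
#define MAX3(a, b, c) MAX2(MAX2((a), (b)), (c))

/* The expression before the patch. */
static unsigned
sketch_num_old(unsigned drawpix, unsigned pixelmap, unsigned bound)
{
   return MAX2(MAX2(drawpix, pixelmap) + 1, bound);
}

/* The expression after the patch. */
static unsigned
sketch_num_new(unsigned drawpix, unsigned pixelmap, unsigned bound)
{
   unsigned num = MAX3(drawpix + 1, pixelmap + 1, bound);
   assert(num >= 1);
   return num;
}
```

Both forms give the same count for any inputs; the second one just reads
more directly.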
 src/mesa/state_tracker/st_cb_drawpixels.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
index 7e8633e..e4d3580 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -667,7 +667,8 @@ draw_textured_quad(struct gl_context *ctx, GLint x, GLint 
y, GLfloat z,
/* user textures, plus the drawpix textures */
if (fpv) {
   struct pipe_sampler_view *sampler_views[PIPE_MAX_SAMPLERS];
-  uint num = MAX2(MAX2(fpv->drawpix_sampler, fpv->pixelmap_sampler) + 1,
+  uint num = MAX3(fpv->drawpix_sampler + 1,
+  fpv->pixelmap_sampler + 1,
   st->state.num_sampler_views[PIPE_SHADER_FRAGMENT]);
 
   memcpy(sampler_views, st->state.sampler_views[PIPE_SHADER_FRAGMENT],
-- 
1.9.1



[Mesa-dev] [PATCH 3/4] mesa: make memcpy_texture() non-static

2015-10-15 Thread Brian Paul
So that we can use it directly from the mesa/gallium state tracker.
---
 src/mesa/main/texstore.c | 40 
 src/mesa/main/texstore.h | 11 +++
 2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
index e50964e..4b13c42 100644
--- a/src/mesa/main/texstore.c
+++ b/src/mesa/main/texstore.c
@@ -97,16 +97,16 @@ static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE 
};
  * No pixel transfer operations or special texel encodings allowed.
  * 1D, 2D and 3D images supported.
  */
-static void
-memcpy_texture(struct gl_context *ctx,
-  GLuint dimensions,
-   mesa_format dstFormat,
-   GLint dstRowStride,
-   GLubyte **dstSlices,
-   GLint srcWidth, GLint srcHeight, GLint srcDepth,
-   GLenum srcFormat, GLenum srcType,
-   const GLvoid *srcAddr,
-   const struct gl_pixelstore_attrib *srcPacking)
+void
+_mesa_memcpy_texture(struct gl_context *ctx,
+ GLuint dimensions,
+ mesa_format dstFormat,
+ GLint dstRowStride,
+ GLubyte **dstSlices,
+ GLint srcWidth, GLint srcHeight, GLint srcDepth,
+ GLenum srcFormat, GLenum srcType,
+ const GLvoid *srcAddr,
+ const struct gl_pixelstore_attrib *srcPacking)
 {
const GLint srcRowStride = _mesa_image_row_stride(srcPacking, srcWidth,
  srcFormat, srcType);
@@ -296,11 +296,11 @@ _mesa_texstore_ycbcr(TEXSTORE_PARAMS)
assert(baseInternalFormat == GL_YCBCR_MESA);
 
/* always just memcpy since no pixel transfer ops apply */
-   memcpy_texture(ctx, dims,
-  dstFormat,
-  dstRowStride, dstSlices,
-  srcWidth, srcHeight, srcDepth, srcFormat, srcType,
-  srcAddr, srcPacking);
+   _mesa_memcpy_texture(ctx, dims,
+dstFormat,
+dstRowStride, dstSlices,
+srcWidth, srcHeight, srcDepth, srcFormat, srcType,
+srcAddr, srcPacking);
 
/* Check if we need byte swapping */
/* XXX the logic here _might_ be wrong */
@@ -899,11 +899,11 @@ _mesa_texstore_memcpy(TEXSTORE_PARAMS)
   return GL_FALSE;
}
 
-   memcpy_texture(ctx, dims,
-  dstFormat,
-  dstRowStride, dstSlices,
-  srcWidth, srcHeight, srcDepth, srcFormat, srcType,
-  srcAddr, srcPacking);
+   _mesa_memcpy_texture(ctx, dims,
+dstFormat,
+dstRowStride, dstSlices,
+srcWidth, srcHeight, srcDepth, srcFormat, srcType,
+srcAddr, srcPacking);
return GL_TRUE;
 }
 /**
diff --git a/src/mesa/main/texstore.h b/src/mesa/main/texstore.h
index 2c974f7..f08dc08 100644
--- a/src/mesa/main/texstore.h
+++ b/src/mesa/main/texstore.h
@@ -74,6 +74,17 @@ _mesa_texstore_needs_transfer_ops(struct gl_context *ctx,
   GLenum baseInternalFormat,
   mesa_format dstFormat);
 
+extern void
+_mesa_memcpy_texture(struct gl_context *ctx,
+ GLuint dimensions,
+ mesa_format dstFormat,
+ GLint dstRowStride,
+ GLubyte **dstSlices,
+ GLint srcWidth, GLint srcHeight, GLint srcDepth,
+ GLenum srcFormat, GLenum srcType,
+ const GLvoid *srcAddr,
+ const struct gl_pixelstore_attrib *srcPacking);
+
 extern GLboolean
 _mesa_texstore_can_use_memcpy(struct gl_context *ctx,
   GLenum baseInternalFormat, mesa_format dstFormat,
-- 
1.9.1



[Mesa-dev] [PATCH] vbo: reduce number of vertex buffer mappings for vertex attributes

2015-10-15 Thread Brian Paul
Whenever we got a glColor, glNormal, glTexCoord, etc. call outside a
glBegin/End pair, we'd immediately map a vertex buffer to begin
accumulating vertex data.  In some cases, such as with display lists,
this led to excessive vertex buffer mapping.  For example, if we have
a display list such as:

glNewList(42, GL_COMPILE);
glBegin(prim);
glVertex2f();
...
glVertex2f();
glEnd();
glEndList();

Then did:

glColor3f();
glCallList(42);

We'd map a vertex buffer as soon as we saw glColor3f but we'd never
actually write anything to it.  Note that the vertex position data
was put into a vertex buffer during display list compilation.

With this change, we delay mapping the vertex buffer until we actually
have a vertex to write to it (triggered by a glVertex() call).  In the
above case, we no longer map a vertex buffer when setting the color and
calling the list.

For drivers such as VMware's, reducing buffer mappings gives improved
performance.
---
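A toy standalone model of the deferred mapping, for illustration only. The
sketch_* names and the fixed-size storage are assumptions made for this
note; the real logic lives in the ATTR macro and vbo_exec_vtx_map().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_VERTS 32

/* Hypothetical stand-in for the exec state; not the real vbo structs. */
struct sketch_exec {
   float storage[SKETCH_MAX_VERTS * 2]; /* pretend vertex buffer */
   float *buffer_ptr;                   /* NULL until the buffer is mapped */
   unsigned map_count;                  /* how many times we mapped */
   unsigned vert_count;
   float current_color[3];
};

static void
sketch_map(struct sketch_exec *e)
{
   e->buffer_ptr = e->storage;
   e->map_count++;
}

/* A non-position attribute only updates current state; no mapping. */
static void
sketch_color3f(struct sketch_exec *e, float r, float g, float b)
{
   e->current_color[0] = r;
   e->current_color[1] = g;
   e->current_color[2] = b;
}

/* Only an actual vertex triggers the (lazy) mapping. */
static void
sketch_vertex2f(struct sketch_exec *e, float x, float y)
{
   assert(e->vert_count < SKETCH_MAX_VERTS);
   if (!e->buffer_ptr)
      sketch_map(e);
   e->buffer_ptr[0] = x;
   e->buffer_ptr[1] = y;
   e->buffer_ptr += 2;
   e->vert_count++;
}
```

In this model, setting a color and then replaying a display list never
touches sketch_map(); only the first sketch_vertex2f() does.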
 src/mesa/vbo/vbo_exec_api.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/mesa/vbo/vbo_exec_api.c b/src/mesa/vbo/vbo_exec_api.c
index 7ae08fe..789869a 100644
--- a/src/mesa/vbo/vbo_exec_api.c
+++ b/src/mesa/vbo/vbo_exec_api.c
@@ -446,10 +446,6 @@ do {   
\
 \
assert(sz == 1 || sz == 2);  \
 \
-   if (unlikely(!(ctx->Driver.NeedFlush & FLUSH_UPDATE_CURRENT))) { \
-  vbo_exec_begin_vertices(ctx);\
-   }   \
-\
/* check if attribute size or type is changing */\
if (unlikely(exec->vtx.active_sz[A] != N * sz) ||\
unlikely(exec->vtx.attrtype[A] != T)) {  \
@@ -470,6 +466,15 @@ do {   
\
   /* This is a glVertex call */\
   GLuint i;
\
\
+  if (unlikely((ctx->Driver.NeedFlush & FLUSH_UPDATE_CURRENT) == 0)) { \
+ vbo_exec_begin_vertices(ctx);  \
+  } \
+\
+  if (unlikely(!exec->vtx.buffer_ptr)) {\
+ vbo_exec_vtx_map(exec);\
+  } \
+  assert(exec->vtx.buffer_ptr); \
+\
   /* copy 32-bit words */   \
   for (i = 0; i < exec->vtx.vertex_size; i++)  \
 exec->vtx.buffer_ptr[i] = exec->vtx.vertex[i]; \
@@ -482,7 +487,10 @@ do {   
\
\
   if (++exec->vtx.vert_count >= exec->vtx.max_vert)
\
 vbo_exec_vtx_wrap( exec ); \
-   }   \
+   } else { \
+  /* we now have accumulated per-vertex attributes */   \
+  ctx->Driver.NeedFlush |= FLUSH_UPDATE_CURRENT;\
+   }\
 } while (0)
 
 #define ERROR(err) _mesa_error( ctx, err, __func__ )
-- 
1.9.1



[Mesa-dev] [PATCH] i965/vs: Move URB entry_size and read_length calculations to compile_vs

2015-10-15 Thread Jason Ekstrand
---
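For reference, the arithmetic being moved can be sketched standalone as
below. The sketch_* function names and the locally defined macros are
assumptions for this note; the formulas are the ones in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for this note; Mesa provides its own versions. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))
#define MAX2(a, b)         ((a) > (b) ? (a) : (b))

/* vec4 mode must read at least one attribute; SIMD8 (scalar) may read zero. */
static unsigned
sketch_urb_read_length(unsigned nr_attributes, bool scalar_vs)
{
   if (scalar_vs)
      return DIV_ROUND_UP(nr_attributes, 2);
   return DIV_ROUND_UP(MAX2(nr_attributes, 1), 2);
}

/* Inputs and outputs share one VUE entry, so size it for the larger of
 * the two.  Gen6 uses a different divisor than the other generations. */
static unsigned
sketch_urb_entry_size(unsigned nr_attributes, unsigned num_slots, int gen)
{
   const unsigned vue_entries = MAX2(nr_attributes, num_slots);
   assert(gen > 0);
   if (gen == 6)
      return DIV_ROUND_UP(vue_entries, 8);
   return DIV_ROUND_UP(vue_entries, 4);
}
```

For example, a vec4 shader with no attributes still gets a read length of 1,
and 10 VUE slots give an entry size of 2 on gen6 and 3 elsewhere.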
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 34 ++
 src/mesa/drivers/dri/i965/brw_vs.c | 34 --
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ca4d23a..00e2d63 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1933,6 +1933,40 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 {
const unsigned *assembly = NULL;
 
+   unsigned nr_attributes = _mesa_bitcount_64(prog_data->inputs_read);
+
+   /* gl_VertexID and gl_InstanceID are system values, but arrive via an
+* incoming vertex attribute.  So, add an extra slot.
+*/
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
+BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
+  nr_attributes++;
+   }
+
+   /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
+* Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
+* vec4 mode, the hardware appears to wedge unless we read something.
+*/
+   if (compiler->scalar_vs)
+  prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
+   else
+  prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 
2);
+
+   prog_data->nr_attributes = nr_attributes;
+
+   /* Since vertex shaders reuse the same VUE entry for inputs and outputs
+* (overwriting the original contents), we need to make sure the size is
+* the larger of the two.
+*/
+   const unsigned vue_entries =
+  MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots);
+
+   if (compiler->devinfo->gen == 6)
+  prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
+   else
+  prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 4);
+
if (compiler->scalar_vs) {
   prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
index 6c161d0..c9afc63 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -160,40 +160,6 @@ brw_codegen_vs_prog(struct brw_context *brw,
_data.base.vue_map, outputs_written,
prog ? prog->SeparateShader : false);
 
-   unsigned nr_attributes = _mesa_bitcount_64(prog_data.inputs_read);
-
-   /* gl_VertexID and gl_InstanceID are system values, but arrive via an
-* incoming vertex attribute.  So, add an extra slot.
-*/
-   if (vp->program.Base.SystemValuesRead &
-   (BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
-BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
-  nr_attributes++;
-   }
-
-   /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
-* Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
-* vec4 mode, the hardware appears to wedge unless we read something.
-*/
-   if (brw->intelScreen->compiler->scalar_vs)
-  prog_data.base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
-   else
-  prog_data.base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 2);
-
-   prog_data.nr_attributes = nr_attributes;
-
-   /* Since vertex shaders reuse the same VUE entry for inputs and outputs
-* (overwriting the original contents), we need to make sure the size is
-* the larger of the two.
-*/
-   const unsigned vue_entries =
-  MAX2(nr_attributes, prog_data.base.vue_map.num_slots);
-
-   if (brw->gen == 6)
-  prog_data.base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
-   else
-  prog_data.base.urb_entry_size = DIV_ROUND_UP(vue_entries, 4);
-
if (0) {
   _mesa_fprint_program_opt(stderr, >program.Base, PROG_PRINT_DEBUG,
   true);
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH] i965/vs: Move URB entry_size and read_length calculations to compile_vs

2015-10-15 Thread Jason Ekstrand
This patch applies on top of my previous series to shuffle a bunch of
the compiler code around.

On Thu, Oct 15, 2015 at 12:05 PM, Jason Ekstrand  wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 34 
> ++
>  src/mesa/drivers/dri/i965/brw_vs.c | 34 
> --
>  2 files changed, 34 insertions(+), 34 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index ca4d23a..00e2d63 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1933,6 +1933,40 @@ brw_compile_vs(const struct brw_compiler *compiler, 
> void *log_data,
>  {
> const unsigned *assembly = NULL;
>
> +   unsigned nr_attributes = _mesa_bitcount_64(prog_data->inputs_read);
> +
> +   /* gl_VertexID and gl_InstanceID are system values, but arrive via an
> +* incoming vertex attribute.  So, add an extra slot.
> +*/
> +   if (shader->info.system_values_read &
> +   (BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
> +BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
> +  nr_attributes++;
> +   }
> +
> +   /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
> +* Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
> +* vec4 mode, the hardware appears to wedge unless we read something.
> +*/
> +   if (compiler->scalar_vs)
> +  prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
> +   else
> +  prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 
> 2);
> +
> +   prog_data->nr_attributes = nr_attributes;
> +
> +   /* Since vertex shaders reuse the same VUE entry for inputs and outputs
> +* (overwriting the original contents), we need to make sure the size is
> +* the larger of the two.
> +*/
> +   const unsigned vue_entries =
> +  MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots);
> +
> +   if (compiler->devinfo->gen == 6)
> +  prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
> +   else
> +  prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 4);
> +
> if (compiler->scalar_vs) {
>prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index 6c161d0..c9afc63 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -160,40 +160,6 @@ brw_codegen_vs_prog(struct brw_context *brw,
> _data.base.vue_map, outputs_written,
> prog ? prog->SeparateShader : false);
>
> -   unsigned nr_attributes = _mesa_bitcount_64(prog_data.inputs_read);
> -
> -   /* gl_VertexID and gl_InstanceID are system values, but arrive via an
> -* incoming vertex attribute.  So, add an extra slot.
> -*/
> -   if (vp->program.Base.SystemValuesRead &
> -   (BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
> -BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
> -  nr_attributes++;
> -   }
> -
> -   /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
> -* Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
> -* vec4 mode, the hardware appears to wedge unless we read something.
> -*/
> -   if (brw->intelScreen->compiler->scalar_vs)
> -  prog_data.base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
> -   else
> -  prog_data.base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 2);
> -
> -   prog_data.nr_attributes = nr_attributes;
> -
> -   /* Since vertex shaders reuse the same VUE entry for inputs and outputs
> -* (overwriting the original contents), we need to make sure the size is
> -* the larger of the two.
> -*/
> -   const unsigned vue_entries =
> -  MAX2(nr_attributes, prog_data.base.vue_map.num_slots);
> -
> -   if (brw->gen == 6)
> -  prog_data.base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
> -   else
> -  prog_data.base.urb_entry_size = DIV_ROUND_UP(vue_entries, 4);
> -
> if (0) {
>_mesa_fprint_program_opt(stderr, &vp->program.Base, PROG_PRINT_DEBUG,
>true);
> --
> 2.5.0.400.gff86faf
>
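
The sizing arithmetic that the patch above moves into brw_compile_vs() can
be tried out in isolation. What follows is only a minimal sketch under
stated assumptions, not the driver code: compute_vs_urb_sizes(), struct
vs_urb_sizes, bitcount_64() and the boolean reads_vertex_or_instance_id are
hypothetical names invented for illustration, and DIV_ROUND_UP and MAX2 are
redefined locally as stand-ins for the Mesa macros of the same name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the Mesa macros of the same name (assumption). */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical container for the three values the patch stores in prog_data. */
struct vs_urb_sizes {
   unsigned nr_attributes;
   unsigned urb_read_length;
   unsigned urb_entry_size;
};

/* Portable popcount, standing in for _mesa_bitcount_64(). */
static unsigned
bitcount_64(uint64_t v)
{
   unsigned n = 0;
   for (; v; v &= v - 1)
      n++;
   return n;
}

/* Hypothetical helper mirroring the arithmetic in the quoted hunk. */
struct vs_urb_sizes
compute_vs_urb_sizes(uint64_t inputs_read, bool reads_vertex_or_instance_id,
                     bool scalar_vs, int gen, unsigned vue_map_num_slots)
{
   struct vs_urb_sizes s;

   /* One slot per input read, plus one extra slot if gl_VertexID or
    * gl_InstanceID is used, since those arrive via a vertex attribute.
    */
   s.nr_attributes = bitcount_64(inputs_read);
   if (reads_vertex_or_instance_id)
      s.nr_attributes++;

   /* In vec4 mode the read length is clamped so it is never zero. */
   if (scalar_vs)
      s.urb_read_length = DIV_ROUND_UP(s.nr_attributes, 2);
   else
      s.urb_read_length = DIV_ROUND_UP(MAX2(s.nr_attributes, 1u), 2);

   /* Inputs and outputs share one VUE entry, so size it for the larger. */
   const unsigned vue_entries = MAX2(s.nr_attributes, vue_map_num_slots);

   /* Gen6 divides by 8, every other generation by 4, as in the patch. */
   s.urb_entry_size = DIV_ROUND_UP(vue_entries, gen == 6 ? 8 : 4);

   return s;
}
```

For example, three inputs plus gl_VertexID with a six-slot VUE map on a
scalar Gen7 compile give nr_attributes = 4, urb_read_length = 2 and
urb_entry_size = 2; the same inputs on Gen6 give urb_entry_size = 1.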


[Mesa-dev] Updating mesa3d.org docs?

2015-10-15 Thread Sarah Sharp
Hi Brian!

I'm a new Mesa developer in Intel's OTC graphics team (although not new
to open source: I've been a Linux kernel developer for the last seven
years).

I heard that you're responsible for updating the mesa3d.org documentation
from the docs in the Mesa source code repo. I noticed the site's docs are
out of date with respect to the repo, and I have a couple of questions:

1. What's the process for pushing updated documentation to the site?
2. How often are updated docs pushed? Once every week, month, or when
   there's a new Mesa version?
3. Any chance I could get permissions to push updated docs?  I'll be
   improving Mesa documentation as part of my new job, and I would love
   to be able to push myself once patches are accepted, rather than
   having to ping you.

Thanks,
Sarah Sharp

