Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-03 Thread eocallaghan

On 2015-12-04 14:03, Dieter Nützel wrote:

On 03.12.2015 19:57, Dave Airlie wrote:

On 4 Dec 2015 03:01, "Aaron Watry"  wrote:


Hi Dave (and others),

I cloned your fdo r600g-tess-submit branch and gave it a spin on
CEDAR (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few
issues.

Just grab r600g-tess-staging

-submit only worked on cayman, but I'd posted it here so didn't want
to change it during review.

Dave.


Hello to all,

r600g-tess-staging WORKS
on Turks XT (6670) with a modest 2 GB of RAM on a 128-bit DDR bus, but slowly.
Time for an R9 290 or Tonga (R9 380X).


Tonga already works. Tonga is radeonsi, not r600g; this series only
applies to r600g.

The radeonsi Gallium Mesa driver has supported tessellation for a while now.



Only issue so far:
Switching tessellation (with the mouse) and/or wireframe (F2) in Heaven-4.0
makes the system hang, but NOT hard:
SysRq Key + S | U | B still works.
I haven't had time to hunt for a useful log yet.

GREAT WORK!

Dieter


1) Initially, I got an assertion in r600_add_atom stating that the
atom ID was not less than the R600_NUM_ATOMS value (id = 51,
R600_NUM_ATOMS=51).

  I bumped R600_NUM_ATOMS to 52 for now, and that got rid of that
issue... although I have no idea if that was a correct fix.
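
As a rough illustration of why id = 51 trips the check when R600_NUM_ATOMS
is also 51: a table with N slots only has valid indices 0..N-1, so an ID
equal to N is already out of range. The helper name atom_id_fits and the
two constants below are made up for this sketch; this is not the r600g code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the old and bumped R600_NUM_ATOMS values. */
#define NUM_ATOMS_OLD    51u
#define NUM_ATOMS_BUMPED 52u

/* An ID is a valid table index only if it is strictly below the table size. */
bool atom_id_fits(unsigned id, unsigned num_atoms)
{
    return id < num_atoms;
}
```

With these definitions, atom_id_fits(51, NUM_ATOMS_OLD) is false and
atom_id_fits(51, NUM_ATOMS_BUMPED) is true, which matches the observation
that bumping the constant silences the assertion. Whether 52 is the right
size for the driver is a separate question this sketch does not answer.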


2) Next, I kept getting a segfault in evergreen_adjust_gprs at line
3931. Turns out that rctx->hw_shader_stages[2].shader was null
(missing/miscompiled GS?).

I naively changed the code to the following, and now Heaven actually
runs with tessellation enabled (and it looks like it's working).

/* gather required shader gprs */
for (i = 0; i < EG_NUM_HW_STAGES; i++) {
        if (!rctx->hw_shader_stages[i].shader) {
                num_gprs[i] = def_gprs[i];
                continue;
        }
        num_gprs[i] = rctx->hw_shader_stages[i].shader->shader.bc.ngpr;
}

Just figured that I'd let you know...

If you don't have CEDAR hardware to test with, feel free to ping me
to test any additional changes.  Note that I didn't run the benchmark
to completion (too slow, had to get other work done), but it didn't
hang my GPU in the time that I did have it running.


--Aaron


On Mon, Nov 30, 2015 at 12:20 AM, Dave Airlie wrote:


Hi,

Patchbomb time: this set of patches is a first pass at adding
ARB_tessellation_shader support to the r600g driver. Only Evergreen
and newer GPUs support tessellation. On any of the GPUs that support
native FP64, this will enable OpenGL 4.1.

The first bunch of patches are a bit of a driver rework to get
things in better shape for tessellation, they shouldn't cause
any regressions.

This runs heaven on cayman and should pass all the piglits
unless I've done something wrong.

Development hit two HW programming fun times, one with tess and
dynamic GPR interaction requiring disabling dynamic GPRs, and
one with programming of some SIMD registers to block TESS shaders
on one unit. These fixed most of the hangs we saw during development.

This doesn't contain SB support yet, Glenn has started working on it.


Currently tested hw:
working: CAYMAN, REDWOOD, BARTS, TURKS
hangs on any tessellation: CAICOS
hangs differently at least with heaven: SUMO

This patchset doesn't block it on any GPUs, but when merged it
probably should.

Also available at:
http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-tess-submit

Thanks to Glenn Kennard for lots of discussion and testing.

Dave.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev







Re: [Mesa-dev] AppVeyor: Build failed: mesa 39

2015-12-03 Thread Jose Fonseca

On 04/12/15 03:21, Roland Scheidegger wrote:

On 04.12.2015 at 03:58, AppVeyor wrote:


   Build mesa 39 failed
   

Commit 51140f452a by Roland Scheidegger on 12/4/2015 2:42 AM:

draw: fix clipping of layer/vp index outputs

This was just plain broken. It used always the value from v0 (for vp_index)
but would pass the value from the provoking vertex to later stages - but only
if there was a corresponding fs input, otherwise the layer/vp index would get
lost completely (as it would try to interpolate the (unsigned) values as
floats).
So, make it obey provoking vertex rules (drivers relying on draw will need to
do the same). And make sure that the default interpolation mode (when no
corresponding fs input is found) for them is constant.
Also, change the code a bit so constant inputs aren't interpolated then
copied over later.

Fixes the new piglit test gl-layer-render-clipped.

v2: more consistent whitespaces fixes for function defs, and more tab killing
(overall still not quite right however).

Reviewed-by: Brian Paul
Reviewed-by: Jose Fonseca





"Failed to provision build worker virtual machine in Google Compute
Engine cloud. Try restarting build later."
I wonder if that's something happening frequently - if so these messages
will get annoying really fast. Would it be possible to skip them if
there's no log (thus indicating it wasn't really a code problem)?

Roland


It never happened for me with Apitrace and other pet projects I have on
AppVeyor.


But Mesa has a high commit frequency, so whatever rare problems there
may be will become slightly less rare.  Let's wait and see; if this
happens too frequently I'll try to figure out a way to reduce the spam.


Jose



[Mesa-dev] A few misc cleanups using Coccinelle semantic scripts

2015-12-03 Thread Edward O'Callaghan
`Write one patch run everywhere.'

These hopefully get a chunk of the churn (sorry about that) out
the way for more interesting Coccinelle semantic patching of mesa.



[Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro

2015-12-03 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index 3006bd1..4c837a8 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -145,11 +145,9 @@ ADDR_HANDLE amdgpu_addr_create(struct amdgpu_winsys *ws)
 
    regValue.backendDisables = ws->amdinfo.backend_disable[0];
    regValue.pTileConfig = ws->amdinfo.gb_tile_mode;
-   regValue.noOfEntries = sizeof(ws->amdinfo.gb_tile_mode) /
-                          sizeof(ws->amdinfo.gb_tile_mode[0]);
+   regValue.noOfEntries = ARRAY_SIZE(ws->amdinfo.gb_tile_mode);
    regValue.pMacroTileConfig = ws->amdinfo.gb_macro_tile_mode;
-   regValue.noOfMacroEntries = sizeof(ws->amdinfo.gb_macro_tile_mode) /
-                               sizeof(ws->amdinfo.gb_macro_tile_mode[0]);
+   regValue.noOfMacroEntries = ARRAY_SIZE(ws->amdinfo.gb_macro_tile_mode);
 
    createFlags.value = 0;
    createFlags.useTileIndex = 1;
-- 
2.5.0
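
For readers unfamiliar with the idiom, here is a minimal sketch of what a
macro like ARRAY_SIZE typically expands to, and why it is equivalent to the
open-coded sizeof division the patch removes. The macro body shown is an
assumed definition, and tile_modes plus the two helper functions are
hypothetical names for illustration only.

```c
#include <stddef.h>

/* Assumed definition; the real macro comes from Mesa's utility headers. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

/* Hypothetical table, standing in for arrays such as gb_tile_mode. */
const unsigned tile_modes[32] = { 0 };

/* Element count via the macro. */
size_t tile_mode_count(void)
{
    return ARRAY_SIZE(tile_modes);
}

/* Element count via the open-coded form; it yields the same value. */
size_t tile_mode_count_open_coded(void)
{
    return sizeof(tile_modes) / sizeof(tile_modes[0]);
}
```

Note that this idiom only works on true arrays. Applied to a pointer (for
example an array that has decayed into a function parameter), the division
yields the pointer size over the element size rather than the element count.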



[Mesa-dev] [PATCH 1/7] gallium/drivers/nouveau: Make use of ARRAY_SIZE macro

2015-12-03 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/nouveau/nv30/nv30_transfer.c       | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_state.c          | 6 +++---
 src/gallium/drivers/nouveau/nv50/nv50_state_validate.c | 3 +--
 src/gallium/drivers/nouveau/nv50/nv84_video_bsp.c      | 2 +-
 src/gallium/drivers/nouveau/nv50/nv84_video_vp.c       | 6 +++---
 src/gallium/drivers/nouveau/nv50/nv98_video_bsp.c      | 2 +-
 src/gallium/drivers/nouveau/nv50/nv98_video_ppp.c      | 2 +-
 src/gallium/drivers/nouveau/nv50/nv98_video_vp.c       | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c        | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c          | 6 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 3 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_video_bsp.c      | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_video_ppp.c      | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_video_vp.c       | 2 +-
 14 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv30/nv30_transfer.c b/src/gallium/drivers/nouveau/nv30/nv30_transfer.c
index 2452071..9ecbcd1 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_transfer.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_transfer.c
@@ -155,7 +155,7 @@ nv30_transfer_rect_blit(XFER_ARGS)
u32 format, stride;
 
if (nouveau_pushbuf_space(push, 512, 8, 0) ||
-   nouveau_pushbuf_refn (push, refs, sizeof(refs) / sizeof(refs[0])))
+   nouveau_pushbuf_refn (push, refs, ARRAY_SIZE(refs)))
   return;
 
/* various switches depending on cpp of the transfer */
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index b4ea08d..fd7c7cd 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -189,7 +189,7 @@ nv50_blend_state_create(struct pipe_context *pipe,
   SB_DATA(so, nv50_colormask(cso->rt[0].colormask));
}
 
-   assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
+   assert(so->size <= ARRAY_SIZE(so->state));
return so;
 }
 
@@ -326,7 +326,7 @@ nv50_rasterizer_state_create(struct pipe_context *pipe,
SB_BEGIN_3D(so, PIXEL_CENTER_INTEGER, 1);
SB_DATA(so, !cso->half_pixel_center);
 
-   assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
+   assert(so->size <= ARRAY_SIZE(so->state));
return (void *)so;
 }
 
@@ -415,7 +415,7 @@ nv50_zsa_state_create(struct pipe_context *pipe,
   SB_DATA(so, 0);
}
 
-   assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
+   assert(so->size <= ARRAY_SIZE(so->state));
return (void *)so;
 }
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 1df6bd9..d14d603 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
@@ -508,7 +508,6 @@ static struct state_validate {
 { nv50_vertex_arrays_validate, NV50_NEW_VERTEX | NV50_NEW_ARRAYS },
 { nv50_validate_min_samples,   NV50_NEW_MIN_SAMPLES },
 };
-#define validate_list_len (sizeof(validate_list) / sizeof(validate_list[0]))
 
 bool
 nv50_state_validate(struct nv50_context *nv50, uint32_t mask, unsigned words)
@@ -523,7 +522,7 @@ nv50_state_validate(struct nv50_context *nv50, uint32_t mask, unsigned words)
state_mask = nv50->dirty & mask;
 
if (state_mask) {
-  for (i = 0; i < validate_list_len; ++i) {
+  for (i = 0; i < ARRAY_SIZE(validate_list); ++i) {
  struct state_validate *validate = &validate_list[i];
 
  if (state_mask & validate->states)
diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video_bsp.c b/src/gallium/drivers/nouveau/nv50/nv84_video_bsp.c
index 1a520d2..38eca17 100644
--- a/src/gallium/drivers/nouveau/nv50/nv84_video_bsp.c
+++ b/src/gallium/drivers/nouveau/nv50/nv84_video_bsp.c
@@ -200,7 +200,7 @@ nv84_decoder_bsp(struct nv84_decoder *dec,
memcpy(dec->bitstream->map + 0x600, more_params, sizeof(more_params));
 
PUSH_SPACE(push, 5 + 21 + 3 + 2 + 4 + 2);
-   nouveau_pushbuf_refn(push, bo_refs, sizeof(bo_refs)/sizeof(bo_refs[0]));
+   nouveau_pushbuf_refn(push, bo_refs, ARRAY_SIZE(bo_refs));
 
/* Wait for the fence = 1 */
BEGIN_NV04(push, SUBC_BSP(0x10), 4);
diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video_vp.c b/src/gallium/drivers/nouveau/nv50/nv84_video_vp.c
index 8b12147..d98992c 100644
--- a/src/gallium/drivers/nouveau/nv50/nv84_video_vp.c
+++ b/src/gallium/drivers/nouveau/nv50/nv84_video_vp.c
@@ -81,7 +81,7 @@ nv84_decoder_vp_h264(struct nv84_decoder *dec,
   { dec->vp_params, NOUVEAU_BO_RDWR | NOUVEAU_BO_GART },
   { dec->fence, NOUVEAU_BO_RDWR | NOUVEAU_BO_VRAM },
};
-   int num_refs = sizeof(bo_refs)/sizeof(*bo_refs);
+   int num_refs = ARRAY_SIZE(bo_refs);
bool is_ref = desc->is_reference;
 
STATIC_ASSERT(sizeof(struct h264_iparm1) == 0x218);
@@ -141,7 +141,7 @@ nv84_decoder_

[Mesa-dev] [PATCH 3/7] gallium/drivers/svga: Make use of ARRAY_SIZE macro

2015-12-03 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/svga/svga_state_tss.c          | 2 +-
 src/gallium/drivers/svga/svgadump/svga_shader_op.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_tss.c b/src/gallium/drivers/svga/svga_state_tss.c
index 5991da1..4debbf1 100644
--- a/src/gallium/drivers/svga/svga_state_tss.c
+++ b/src/gallium/drivers/svga/svga_state_tss.c
@@ -316,7 +316,7 @@ svga_queue_tss( struct ts_queue *q,
 unsigned tss,
 unsigned value )
 {
-   assert(q->ts_count < sizeof(q->ts)/sizeof(q->ts[0]));
+   assert(q->ts_count < ARRAY_SIZE(q->ts));
q->ts[q->ts_count].stage = unit;
q->ts[q->ts_count].name = tss;
q->ts[q->ts_count].value = value;
diff --git a/src/gallium/drivers/svga/svgadump/svga_shader_op.c b/src/gallium/drivers/svga/svgadump/svga_shader_op.c
index ad1549d..03a63cf 100644
--- a/src/gallium/drivers/svga/svgadump/svga_shader_op.c
+++ b/src/gallium/drivers/svga/svgadump/svga_shader_op.c
@@ -144,7 +144,7 @@ const struct sh_opcode_info *svga_opcode_info( uint op )
 {
struct sh_opcode_info *info;
 
-   if (op >= sizeof( opcode_info ) / sizeof( opcode_info[0] )) {
+   if (op >= ARRAY_SIZE(opcode_info)) {
   /* The opcode is either PHASE, COMMENT, END or out of range. */
   assert( 0 );
-- 
2.5.0



[Mesa-dev] [PATCH 5/7] gallium/aux../u_mm.c: Fix zero integer literal to pointer comparison

2015-12-03 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/auxiliary/util/u_mm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_mm.c b/src/gallium/auxiliary/util/u_mm.c
index 2069b56..bd4c4e1 100644
--- a/src/gallium/auxiliary/util/u_mm.c
+++ b/src/gallium/auxiliary/util/u_mm.c
@@ -34,7 +34,7 @@ void
 u_mmDumpMemInfo(const struct mem_block *heap)
 {
    debug_printf("Memory heap %p:\n", (void *) heap);
-   if (heap == 0) {
+   if (heap == NULL) {
       debug_printf("  heap == 0\n");
    }
    else {
-- 
2.5.0
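
As a small illustration of what this patch does and does not change: in C,
the integer literal 0 is a null pointer constant, so comparing a pointer
against 0 and against NULL gives the same result. The change is about
making the intent readable, not about behaviour. The struct and function
names below are hypothetical stand-ins, not taken from u_mm.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_block. */
struct mem_block_stub {
    int unused;
};

/* Style before the patch: compare against the literal 0. */
bool heap_is_null_literal_zero(const struct mem_block_stub *heap)
{
    return heap == 0; /* 0 acts as a null pointer constant here */
}

/* Style after the patch: same comparison, intent spelled out. */
bool heap_is_null_macro(const struct mem_block_stub *heap)
{
    return heap == NULL;
}
```

Both helpers agree for every input, which is why a semantic-patch tool can
apply this kind of rewrite mechanically across a tree.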



[Mesa-dev] [PATCH 2/7] gallium/drivers/llvmpipe: Make use of ARRAY_SIZE macro

2015-12-03 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/llvmpipe/lp_test_blend.c | 6 +++---
 src/gallium/drivers/llvmpipe/lp_test_conv.c  | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_test_blend.c b/src/gallium/drivers/llvmpipe/lp_test_blend.c
index 37420b0..7b19174 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_blend.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_blend.c
@@ -625,9 +625,9 @@ const struct lp_type blend_types[] = {
 };
 
 
-const unsigned num_funcs = sizeof(blend_funcs)/sizeof(blend_funcs[0]);
-const unsigned num_factors = sizeof(blend_factors)/sizeof(blend_factors[0]);
-const unsigned num_types = sizeof(blend_types)/sizeof(blend_types[0]);
+const unsigned num_funcs = ARRAY_SIZE(blend_funcs);
+const unsigned num_factors = ARRAY_SIZE(blend_factors);
+const unsigned num_types = ARRAY_SIZE(blend_types);
 
 
 boolean
diff --git a/src/gallium/drivers/llvmpipe/lp_test_conv.c b/src/gallium/drivers/llvmpipe/lp_test_conv.c
index 8290da4..a30f35c 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_conv.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_conv.c
@@ -382,7 +382,7 @@ const struct lp_type conv_types[] = {
 };
 
 
-const unsigned num_types = sizeof(conv_types)/sizeof(conv_types[0]);
+const unsigned num_types = ARRAY_SIZE(conv_types);
 
 
 boolean
-- 
2.5.0



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #20 from Axel Davy  ---
Should this bug report be closed?
DRI3 has complete DRI_PRIME support now.

Also, I'm not sure why you are posting these benchmarks here.
Is there some hidden message about something not working right?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #19 from poma  ---

 = GLX VSync DRI3 NOUVEAU =

/etc/X11/xorg.conf.d/nouveau-dri3.conf
Section "Device"
Identifier  "nvidia0"
Driver  "nouveau"
Option  "DRI" "3"
EndSection

Xorg.0.log:
NOUVEAU(0): DRI3 on EXA enabled

# DRI_PRIME=1 LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: pci id for fd 5: 10de:1292, driver nouveau
libGL: OpenDriver: trying /usr/lib64/dri/tls/nouveau_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/nouveau_dri.so
libGL: Can't open configuration file /root/.drirc: No such file or directory.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI3 for screen 0
14127 frames in 5.0 seconds = 2825.278 FPS
14174 frames in 5.0 seconds = 2834.725 FPS
14196 frames in 5.0 seconds = 2839.174 FPS
14194 frames in 5.0 seconds = 2838.693 FPS
14189 frames in 5.0 seconds = 2837.756 FPS
14187 frames in 5.0 seconds = 2837.263 FPS
14182 frames in 5.0 seconds = 2836.374 FPS
14180 frames in 5.0 seconds = 2835.815 FPS
14188 frames in 5.0 seconds = 2837.346 FPS
14181 frames in 5.0 seconds = 2836.074 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #18 from poma  ---

 = GLX VSync DRI3 INTEL =

/etc/X11/xorg.conf.d/intel-dri3.conf
Section "Device"
Identifier  "intel0"
Driver  "intel"
Option  "DRI" "3"
EndSection

Xorg.0.log:
intel(0): direct rendering: DRI2 DRI3 enabled

# LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: pci id for fd 4: 8086:0156, driver i965
libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI3 for screen 0
libGL: Can't open configuration file /root/.drirc: No such file or directory.
20142 frames in 5.0 seconds = 4028.390 FPS
20267 frames in 5.0 seconds = 4053.330 FPS
20428 frames in 5.0 seconds = 4085.486 FPS
20363 frames in 5.0 seconds = 4072.486 FPS
20281 frames in 5.0 seconds = 4056.115 FPS
20248 frames in 5.0 seconds = 4048.564 FPS
20466 frames in 5.0 seconds = 4093.135 FPS
20355 frames in 5.0 seconds = 4070.827 FPS
20476 frames in 5.0 seconds = 4095.126 FPS
20529 frames in 5.0 seconds = 4105.680 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #17 from poma  ---

 = GLX VSync DRI2 INTEL =

# LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI2 for screen 0
libGL: Can't open configuration file /root/.drirc: No such file or directory.
20624 frames in 5.0 seconds = 4124.721 FPS
20847 frames in 5.0 seconds = 4169.352 FPS
20036 frames in 5.0 seconds = 4007.072 FPS
20823 frames in 5.0 seconds = 4164.519 FPS
20792 frames in 5.0 seconds = 4158.378 FPS
20800 frames in 5.0 seconds = 4159.968 FPS
20752 frames in 5.0 seconds = 4150.326 FPS
20831 frames in 5.0 seconds = 4166.060 FPS
20823 frames in 5.0 seconds = 4164.594 FPS
20811 frames in 5.0 seconds = 4162.066 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #16 from poma  ---

 = XPresent VSync DRI3 NOUVEAU =

/etc/X11/xorg.conf.d/nouveau-dri3.conf
Section "Device"
Identifier  "nvidia0"
Driver  "nouveau"
Option  "DRI" "3"
EndSection

Xorg.0.log:
NOUVEAU(0): DRI3 on EXA enabled

# DRI_PRIME=1 LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: pci id for fd 5: 10de:1292, driver nouveau
libGL: OpenDriver: trying /usr/lib64/dri/tls/nouveau_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/nouveau_dri.so
libGL: Can't open configuration file /root/.drirc: No such file or directory.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI3 for screen 0
13974 frames in 5.0 seconds = 2794.645 FPS
14077 frames in 5.0 seconds = 2815.394 FPS
14079 frames in 5.0 seconds = 2815.704 FPS
14078 frames in 5.0 seconds = 2815.532 FPS
14078 frames in 5.0 seconds = 2815.559 FPS
14079 frames in 5.0 seconds = 2815.626 FPS
14079 frames in 5.0 seconds = 2815.732 FPS
14079 frames in 5.0 seconds = 2815.622 FPS
14079 frames in 5.0 seconds = 2815.615 FPS
14069 frames in 5.0 seconds = 2813.748 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #15 from poma  ---

 = XPresent VSync DRI3 INTEL =

/etc/X11/xorg.conf.d/intel-dri3.conf
Section "Device"
Identifier  "intel0"
Driver  "intel"
Option  "DRI" "3"
EndSection

Xorg.0.log:
intel(0): direct rendering: DRI2 DRI3 enabled

# LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: pci id for fd 4: 8086:0156, driver i965
libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI3 for screen 0
libGL: Can't open configuration file /root/.drirc: No such file or directory.
24691 frames in 5.0 seconds = 4938.108 FPS
25310 frames in 5.0 seconds = 5061.974 FPS
23346 frames in 5.0 seconds = 4669.024 FPS
22835 frames in 5.0 seconds = 4566.970 FPS
26354 frames in 5.0 seconds = 5270.756 FPS
25152 frames in 5.0 seconds = 5030.316 FPS
25098 frames in 5.0 seconds = 5019.583 FPS
23447 frames in 5.0 seconds = 4689.399 FPS
23899 frames in 5.0 seconds = 4779.689 FPS
21966 frames in 5.0 seconds = 4393.033 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #14 from poma  ---

 = XPresent VSync DRI2 NOUVEAU =

# DRI_PRIME=1 LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: OpenDriver: trying /usr/lib64/dri/tls/nouveau_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/nouveau_dri.so
libGL: Can't open configuration file /root/.drirc: No such file or directory.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI2 for screen 0
11059 frames in 5.0 seconds = 2211.646 FPS
11099 frames in 5.0 seconds = 2219.753 FPS
11122 frames in 5.0 seconds = 2224.265 FPS
11120 frames in 5.0 seconds = 2223.937 FPS
11089 frames in 5.0 seconds = 2217.751 FPS
11070 frames in 5.0 seconds = 2213.801 FPS
11070 frames in 5.0 seconds = 2213.818 FPS
11071 frames in 5.0 seconds = 2214.184 FPS
11071 frames in 5.0 seconds = 2214.194 FPS
11069 frames in 5.0 seconds = 2213.607 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #13 from poma  ---

 = XPresent VSync DRI2 INTEL =

# LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
libGL: Can't open configuration file /root/.drirc: No such file or directory.
libGL: Using DRI2 for screen 0
libGL: Can't open configuration file /root/.drirc: No such file or directory.
14187 frames in 5.0 seconds = 2837.223 FPS
13509 frames in 5.0 seconds = 2701.708 FPS
13376 frames in 5.0 seconds = 2675.033 FPS
13652 frames in 5.0 seconds = 2730.306 FPS
13381 frames in 5.0 seconds = 2676.152 FPS
13656 frames in 5.0 seconds = 2731.119 FPS
13489 frames in 5.0 seconds = 2697.706 FPS
13471 frames in 5.0 seconds = 2694.155 FPS
13338 frames in 5.0 seconds = 2667.453 FPS
13565 frames in 5.0 seconds = 2712.999 FPS
^C



[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #12 from poma  ---

Some of the results with recent Mesa 3D on Optimus/Prime


# xrandr --listproviders 
Providers: number : 1
Provider 0: id: 0x48 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs: 4 outputs: 5 associated providers: 0 name:Intel

# cat /sys/kernel/debug/vgaswitcheroo/switch 
0:IGD:+:Pwr::00:02.0
1:DIS: :DynOff::01:00.0

# lspci | egrep -i vga\|3d
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 3D controller: NVIDIA Corporation GK208M [GeForce GT 740M] (rev a1)

# LIBGL_DEBUG=verbose vblank_mode=0 DRI_PRIME=1 glxgears
...

DIS powering up:

# cat /sys/kernel/debug/vgaswitcheroo/switch
0:IGD:+:Pwr::00:02.0
1:DIS: :DynPwr::01:00.0


# DRI_PRIME=0 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Mobile 

# DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: Gallium 0.4 on NV108

# DRI_PRIME=0 glxinfo | grep "OpenGL vendor string"
OpenGL vendor string: Intel Open Source Technology Center

# DRI_PRIME=1 glxinfo | grep "OpenGL vendor string"
OpenGL vendor string: nouveau



[Mesa-dev] AppVeyor: Build completed: mesa 40

2015-12-03 Thread AppVeyor


Build mesa 40 completed



Commit b715e6d528 by Jason Ekstrand on 11/26/2015 8:05 AM:

i965/vec4: Stop pretending to support indirect output stores

Since we're using nir_lower_outputs_to_temporaries to shadow all our
outputs, it's impossible to actually get an indirect store.  The code we
had to "handle" this was pretty bogus as it created a register with a
reladdr and then stuffed it in a fixed varying slot without so much as a
MOV.  Not only does this not do the MOV, it also puts the indirect on the
wrong side of the transaction.  Let's just delete the broken dead code.

Reviewed-by: Kenneth Graunke





Re: [Mesa-dev] [PATCH] r600g: fix outputing to non-0 buffers for stream 0.

2015-12-03 Thread Dave Airlie
On 4 December 2015 at 14:25, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This fixes:
> arb_transform_feedback3-ext_interleaved_two_bufs_gs
> arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
> transform-feedback-builtins

Actually, ignore this; I didn't notice that it breaks the xfb-streams cases.

Dave.


[Mesa-dev] [PATCH] r600g: fix outputing to non-0 buffers for stream 0.

2015-12-03 Thread Dave Airlie
From: Dave Airlie 

This fixes:
arb_transform_feedback3-ext_interleaved_two_bufs_gs
arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
transform-feedback-builtins
---
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 4142c3e..19eed06 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1399,7 +1399,7 @@ static int emit_streamout(struct r600_shader_ctx *ctx, struct pipe_stream_output
 	for (i = 0; i < so->num_outputs; i++) {
 		struct r600_bytecode_output output;
 
-		if (stream != -1 && stream != so->output[i].output_buffer)
+		if (stream != -1 && stream != 0 && stream != so->output[i].output_buffer)
 			continue;
 
 		memset(&output, 0, sizeof(struct r600_bytecode_output));
-- 
2.5.0
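
To make the effect of the one-line change easier to see, here is a
simplified sketch of the skip condition before and after the patch, pulled
out into two standalone predicates. The function names are hypothetical and
the parameters are plain ints for illustration; the real code reads
so->output[i].output_buffer inside emit_streamout.

```c
#include <stdbool.h>

/* Before the patch: skip unless emitting all streams (-1) or the
 * stream number equals the output's buffer index. */
bool skip_output_before(int stream, int output_buffer)
{
    return stream != -1 && stream != output_buffer;
}

/* After the patch: stream 0 additionally never skips, so it can
 * write to buffers other than buffer 0. */
bool skip_output_after(int stream, int output_buffer)
{
    return stream != -1 && stream != 0 && stream != output_buffer;
}
```

The only inputs where the two differ are stream == 0 with a non-zero
buffer: the old condition skips that output, the new one emits it. Note
that Dave's follow-up in this thread asks for the patch to be ignored
because it breaks the xfb-streams cases, so this sketch describes the
proposed change, not a final fix.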



[Mesa-dev] r600 tess branches updated

2015-12-03 Thread Dave Airlie
Hey all,

I've pushed an updated version of the r600g tess support to my
r600g-tess-submit branch.

I'm in two minds about whether we need to spam the list again.

I think I've included all the review feedback so far, thanks to
everyone that looked.

The major changes since the last posting are:

use 24-bit math operations for LDS index calculations.
CAICOS/SUMO thread count changes - seems to make heaven run
dropping pointless delay slots in LDS reads
attempt to calculate SQ_LDS_ALLOC.HS_NUM_WAVES properly
don't re-emit the LDS constant buffers if we don't have to.
fix sb GDS decoder as per Glenn's request
fix some minor bugs in the previous submit branch.

I'd probably like to push this all next week unless anyone can find
an objection!

Dave.


Re: [Mesa-dev] [PATCH 02/10] i965/vec4: Get rid of the nir_inputs array

2015-12-03 Thread Jason Ekstrand
On Tue, Dec 1, 2015 at 11:56 PM, Kenneth Graunke  wrote:
> On Wednesday, November 25, 2015 08:55:54 PM Jason Ekstrand wrote:
>> It's not really buying us anything at this point.  It's just a way of
>> remapping one offset namespace onto another.  We can just use the location
>> namespace the whole way through.
>> ---
>>  src/mesa/drivers/dri/i965/brw_nir.c| 28 
>>  src/mesa/drivers/dri/i965/brw_vec4.h   |  2 --
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 23 +--
>>  3 files changed, 13 insertions(+), 40 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> index 91358d8..bf3bc2d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -62,13 +62,6 @@ brw_nir_lower_inputs(nir_shader *nir,
>>  {
>> switch (nir->stage) {
>> case MESA_SHADER_VERTEX:
>> -  /* For now, leave the vec4 backend doing the old method. */
>> -  if (!is_scalar) {
>> - nir_assign_var_locations(&nir->inputs, &nir->num_inputs,
>> -  type_size_vec4);
>> - break;
>> -  }
>> -
>>/* Start with the location of the variable's base. */
>>foreach_list_typed(nir_variable, var, node, &nir->inputs) {
>>   var->data.driver_location = var->data.location;
>> @@ -80,15 +73,18 @@ brw_nir_lower_inputs(nir_shader *nir,
>> */
>>nir_lower_io(nir, nir_var_shader_in, type_size_vec4);
>>
>> -  /* Finally, translate VERT_ATTRIB_* values into the actual registers.
>> -   *
>> -   * Note that we can use nir->info.inputs_read instead of 
>> key->inputs_read
>> -   * since the two are identical aside from Gen4-5 edge flag 
>> differences.
>> -   */
>> -  GLbitfield64 inputs_read = nir->info.inputs_read;
>> -  nir_foreach_overload(nir, overload) {
>> - if (overload->impl) {
>> -nir_foreach_block(overload->impl, remap_vs_attrs, &inputs_read);
>> +  if (is_scalar) {
>> + /* Finally, translate VERT_ATTRIB_* values into the actual 
>> registers.
>> +  *
>> +  * Note that we can use nir->info.inputs_read instead of
>> +  * key->inputs_read since the two are identical aside from Gen4-5
>> +  * edge flag differences.
>> +  */
>> + GLbitfield64 inputs_read = nir->info.inputs_read;
>> + nir_foreach_overload(nir, overload) {
>> +if (overload->impl) {
>> +   nir_foreach_block(overload->impl, remap_vs_attrs, 
>> &inputs_read);
>> +}
>>   }
>>}
>>break;
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
>> b/src/mesa/drivers/dri/i965/brw_vec4.h
>> index 3f67432..a168250 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>> @@ -327,7 +327,6 @@ public:
>> bool is_high_sampler(src_reg sampler);
>>
>> virtual void emit_nir_code();
>> -   virtual void nir_setup_inputs();
>> virtual void nir_setup_uniforms();
>> virtual void nir_setup_system_value_intrinsic(nir_intrinsic_instr 
>> *instr);
>> virtual void nir_setup_system_values();
>> @@ -360,7 +359,6 @@ public:
>>
>> dst_reg *nir_locals;
>> dst_reg *nir_ssa_values;
>> -   src_reg *nir_inputs;
>> dst_reg *nir_system_values;
>>
>>  protected:
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index c777acf..96787db 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -35,9 +35,6 @@ namespace brw {
>>  void
>>  vec4_visitor::emit_nir_code()
>>  {
>> -   if (nir->num_inputs > 0)
>> -  nir_setup_inputs();
>> -
>> if (nir->num_uniforms > 0)
>>nir_setup_uniforms();
>>
>> @@ -118,24 +115,6 @@ vec4_visitor::nir_setup_system_values()
>>  }
>>
>>  void
>> -vec4_visitor::nir_setup_inputs()
>> -{
>> -   nir_inputs = ralloc_array(mem_ctx, src_reg, nir->num_inputs);
>> -   for (unsigned i = 0; i < nir->num_inputs; i++) {
>> -  nir_inputs[i] = src_reg();
>> -   }
>> -
>> -   nir_foreach_variable(var, &nir->inputs) {
>> -  int offset = var->data.driver_location;
>> -  unsigned size = type_size_vec4(var->type);
>> -  for (unsigned i = 0; i < size; i++) {
>> - src_reg src = src_reg(ATTR, var->data.location + i, var->type);
>> - nir_inputs[offset + i] = src;
>> -  }
>> -   }
>> -}
>> -
>> -void
>>  vec4_visitor::nir_setup_uniforms()
>>  {
>> uniforms = nir->num_uniforms;
>> @@ -399,7 +378,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
>> *instr)
>>/* fallthrough */
>> case nir_intrinsic_load_input: {
>>int offset = instr->const_index[0];
>> -  src = nir_inputs[offset];
>> +  src = src_reg(ATTR, offset, glsl_type::uvec4_type);
>>
>>if (has_indirect) {
>>   dest.reladdr = new(mem_ctx) src_reg(get_ni

Re: [Mesa-dev] AppVeyor: Build failed: mesa 39

2015-12-03 Thread Roland Scheidegger
On 04.12.2015 at 03:58, AppVeyor wrote:
> 
>   Build mesa 39 failed
>   
> 
> Commit 51140f452a by Roland Scheidegger  on
> 12/4/2015 2:42 AM:
> draw: fix clipping of layer/vp index outputs
>
> This was just plain broken. It used always the value from v0 (for vp_index)
> but would pass the value from the provoking vertex to later stages - but only
> if there was a corresponding fs input, otherwise the layer/vp index would get
> lost completely (as it would try to interpolate the (unsigned) values as
> floats).
> So, make it obey provoking vertex rules (drivers relying on draw will need to
> do the same). And make sure that the default interpolation mode (when no
> corresponding fs input is found) for them is constant.
> Also, change the code a bit so constant inputs aren't interpolated then
> copied over later.
>
> Fixes the new piglit test gl-layer-render-clipped.
>
> v2: more consistent whitespaces fixes for function defs, and more tab killing
> (overall still not quite right however).
>
> Reviewed-by: Brian Paul
> Reviewed-by: Jose Fonseca
> 
> Configure your notification preferences
> 
> 

"Failed to provision build worker virtual machine in Google Compute
Engine cloud. Try restarting build later."
I wonder if that's something happening frequently - if so these messages
will get annoying really fast. Would it be possible to skip them if
there's no log (thus indicating it wasn't really a code problem)?

Roland



[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #43 from Roland Scheidegger  ---
(In reply to bellamorte42 from comment #42)
> I'm not sure I follow your logic.  The code compiles fine on its own.  The
> code compiles fine under a variety of optimizations.  A single optimization
> causes a segfault. So it's the code's fault?
Why not? Things like that are quite common. For instance, code that uses
undefined values can easily work with some compiler options but not others. The
same goes for reading or writing past array bounds. All bets are off with such
code (and valgrind would catch both of these errors). I'm not saying that's
necessarily the case here, but I don't see any evidence to the contrary either...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-03 Thread Dieter Nützel

On 03.12.2015 19:57, Dave Airlie wrote:

On 4 Dec 2015 03:01, "Aaron Watry"  wrote:


Hi Dave (and others),

I cloned your fdo r600g-tess-submit branch and gave it a spin on
CEDAR (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few
issues.

Just grab r600g-tess-staging

-submit only worked on cayman, but I'd posted it here so didn't want
to change it during review.

Dave.


Hello to all,

r600g-tess-staging WORKS
on Turks XT (6670) with a meager 2 GB of DDR RAM on a 128-bit bus, but it is slow.
Time for an R9 290 or Tonga (R9 380X).

Only issue so far:
Switching tessellation (with the mouse) and/or wireframe (F2) in Heaven-4.0
makes the system hang, but NOT hard:
SysRq + S | U | B still works.
I haven't had time to hunt for a useful log yet.

GREAT WORK!

Dieter


1) Initially, I got an assertion in r600_add_atom stating that the
atom ID was not less than the R600_NUM_ATOMS value (id = 51,
R600_NUM_ATOMS=51).

  I bumped R600_NUM_ATOMS to 52 for now, and that got rid of that
issue... although I have no idea if that was a correct fix.

2) Next, I kept getting a segfault in evergreen_adjust_gprs at line
3931. Turns out that rctx->hw_shader_stages[2].shader was null
(missing/miscompiled GS?).

I naively changed the code to the following, and now Heaven actually
runs with tessellation enabled (and it looks like it's working).

/* gather required shader gprs */
for (i = 0; i < EG_NUM_HW_STAGES; i++) {
        if (!rctx->hw_shader_stages[i].shader) {
                num_gprs[i] = def_gprs[i];
                continue;
        }
        num_gprs[i] = rctx->hw_shader_stages[i].shader->shader.bc.ngpr;
}

Just figured that I'd let you know...

If you don't have CEDAR hardware to test with, feel free to ping me
to test any additional changes.  Note that I didn't run the benchmark
to completion (too slow, had to get other work done), but it didn't
hang my GPU in the time that I did have it running.
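
The workaround can be tried in isolation with a self-contained sketch like the one below. The struct layout, the NUM_HW_STAGES value and the function name are assumptions made for illustration; they only mimic the shape of the loop in evergreen_adjust_gprs and are not the r600g definitions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_HW_STAGES 6 /* assumed stand-in for EG_NUM_HW_STAGES */

/* Hypothetical stand-ins for the driver structures. */
struct shader_info { unsigned ngpr; };
struct hw_stage { const struct shader_info *shader; };

/* Gather the GPR count for every hardware stage. A stage with no bound
 * shader falls back to its default count instead of dereferencing NULL. */
void gather_gprs(const struct hw_stage stages[NUM_HW_STAGES],
                 const unsigned def_gprs[NUM_HW_STAGES],
                 unsigned num_gprs[NUM_HW_STAGES])
{
    assert(stages != NULL && def_gprs != NULL && num_gprs != NULL);
    for (size_t i = 0; i < NUM_HW_STAGES; i++) {
        if (!stages[i].shader) {
            num_gprs[i] = def_gprs[i];
            continue;
        }
        num_gprs[i] = stages[i].shader->ngpr;
    }
}
```

Whether falling back to the default count is the right behaviour for an unbound stage is a separate question; the sketch only shows that the guard avoids the NULL dereference.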


--Aaron


On Mon, Nov 30, 2015 at 12:20 AM, Dave Airlie 

wrote:


Hi,

Patchbomb time, this set of patches is a first pass at adding
ARB_tessellation_shader support to the r600g driver. Only Evergreen
and newer GPUs support tessellation. On any of the GPUs that support
native FP64, this will enable OpenGL 4.1 on them.

The first bunch of patches are a bit of a driver rework to get
things in better shape for tessellation, they shouldn't cause
any regressions.

This runs heaven on cayman and should pass all the piglits
unless I've done something wrong.

Development hit two HW programming fun times, one with tess and
dynamic GPR interaction requiring disabling dynamic GPRs, and
one with programming of some SIMD registers to block TESS shaders
on one unit. These fixed most of the hangs we saw during
development.

This doesn't contain SB support yet, Glenn has started working on
it.


Currently tested hw:
working: CAYMAN, REDWOOD, BARTS, TURKS
hangs on any tessellation: CAICOS
hangs differently at least with heaven: SUMO

This patchset doesn't block it on any GPUs, but when merged it
probably should.

Also available at:
http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-tess-submit

Thanks to Glenn Kennard for lots of discussion and testing.

Dave.



[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #42 from bellamort...@gmail.com ---
I'm not sure I follow your logic.  The code compiles fine on its own.  The code
compiles fine under a variety of optimizations.  A single optimization causes a
segfault. So it's the code's fault?



[Mesa-dev] AppVeyor: Build failed: mesa 39

2015-12-03 Thread AppVeyor



Build mesa 39 failed


Commit 51140f452a by Roland Scheidegger on 12/4/2015 2:42 AM:

draw: fix clipping of layer/vp index outputs

This was just plain broken. It used always the value from v0 (for vp_index)
but would pass the value from the provoking vertex to later stages - but only
if there was a corresponding fs input, otherwise the layer/vp index would get
lost completely (as it would try to interpolate the (unsigned) values as
floats).
So, make it obey provoking vertex rules (drivers relying on draw will need to
do the same). And make sure that the default interpolation mode (when no
corresponding fs input is found) for them is constant.
Also, change the code a bit so constant inputs aren't interpolated then
copied over later.

Fixes the new piglit test gl-layer-render-clipped.

v2: more consistent whitespaces fixes for function defs, and more tab killing
(overall still not quite right however).

Reviewed-by: Brian Paul
Reviewed-by: Jose Fonseca


Configure your notification preferences



Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics

2015-12-03 Thread Rob Clark
On Thu, Dec 3, 2015 at 9:03 PM, Eric Anholt  wrote:
> Jason Ekstrand  writes:
>
>> On Dec 3, 2015 3:47 PM, "Eric Anholt"  wrote:
>>>
>>> Jason Ekstrand  writes:
>>>
>>> > Cc: Eric Anholt 
>>>
>>> OK, I've pushed a branch of partial fixes for this series to nir-loads
>>> of my fdo tree, but it's still super broken.  I've spent too much of
>>> today on it, and this series was not ready for review.  There are still
>>> const_index[0] references all over that are obviously from intrinsics
>>> you've removed the const_index on.
>>
>> I'm going to start looking at this again tomorrow.  I've had other things
>> on my plate this week.  Rob already pointed me at some crashing shaders for
>> ir3 and his little standalone compiler. I'll start there.  Is there
>> something similar I can use to debug vc4?
>
> Right now from configure.ac I've got either targeting-hardware mode, or
> targeting-simulator mode.  What I need to add is something in between,
> where we use the ioctls from simulator mode so that people can run the
> driver on i965 systems, and then just don't make the simulator calls at
> the very end of the drawing pipeline since you don't have the simulator.
> Then people could even run shader-db without having the actual hardware,
> and that would get a lot of coverage of compiler changes.
>
> You could knock this together with some #ifdefs around all 7 mentions of
> "simpenrose", but I'll try to get a clean patch sent out.  I'm just
> trying to decide whether this should be a --enable argument or "oh, hey,
> you're not on arm, you must want either simulator or
> everything-but-simulator mode"
>
> I'd also like something a bit nicer than my symlink trick for loading
> the driver (right now I link the built vc4_dri.so to i965_dri.so in a
> subdirectory, and LIBGL_DRIVERS_PATH it). Maybe if I getenv() a name
> From loader.c for computing driver name, I could also skip a bunch of
> the i965 symbols I have to add.

fwiw, I have used https://github.com/robclark/fakedrm/ in the past for
faking the ioctls.. the annoying thing is remembering each time how to
hack up the driver loading to actually get it to load freedreno rather
than i965.  A better solution would be nice, if for nothing other than
shader-db runs.

something along the lines of a
getenv("NO_REALLY_LOAD_THIS_OTHER_DRIVER") somewhere would be useful..
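
A rough sketch of that idea, assuming the loader already has a detected driver name in hand. The environment variable name and both function names are made up for illustration; nothing here is an existing loader.c interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable name, not an existing Mesa option. */
#define DRIVER_OVERRIDE_ENV "EXAMPLE_DRIVER_OVERRIDE"

/* Pure helper: prefer a non-empty override, else keep the detected name. */
const char *pick_driver_name(const char *override, const char *detected)
{
    assert(detected != NULL);
    if (override != NULL && override[0] != '\0')
        return override;
    return detected;
}

/* Consult the environment, then fall back to the detected driver. */
const char *choose_driver_name(const char *detected)
{
    return pick_driver_name(getenv(DRIVER_OVERRIDE_ENV), detected);
}
```

Splitting out the pure helper keeps the selection rule testable without touching the process environment; with the variable set to "freedreno" on an i965 machine, choose_driver_name("i965") would return "freedreno".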

BR,
-R




Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics

2015-12-03 Thread Eric Anholt
Jason Ekstrand  writes:

> On Dec 3, 2015 3:47 PM, "Eric Anholt"  wrote:
>>
>> Jason Ekstrand  writes:
>>
>> > Cc: Eric Anholt 
>>
>> OK, I've pushed a branch of partial fixes for this series to nir-loads
>> of my fdo tree, but it's still super broken.  I've spent too much of
>> today on it, and this series was not ready for review.  There are still
>> const_index[0] references all over that are obviously from intrinsics
>> you've removed the const_index on.
>
> I'm going to start looking at this again tomorrow.  I've had other things
> on my plate this week.  Rob already pointed me at some crashing shaders for
> ir3 and his little standalone compiler. I'll start there.  Is there
> something similar I can use to debug vc4?

Right now from configure.ac I've got either targeting-hardware mode, or
targeting-simulator mode.  What I need to add is something in between,
where we use the ioctls from simulator mode so that people can run the
driver on i965 systems, and then just don't make the simulator calls at
the very end of the drawing pipeline since you don't have the simulator.
Then people could even run shader-db without having the actual hardware,
and that would get a lot of coverage of compiler changes.

You could knock this together with some #ifdefs around all 7 mentions of
"simpenrose", but I'll try to get a clean patch sent out.  I'm just
trying to decide whether this should be a --enable argument or "oh, hey,
you're not on arm, you must want either simulator or
everything-but-simulator mode"

I'd also like something a bit nicer than my symlink trick for loading
the driver (right now I link the built vc4_dri.so to i965_dri.so in a
subdirectory, and point LIBGL_DRIVERS_PATH at it). Maybe if I getenv() a name
from loader.c for computing driver name, I could also skip a bunch of
the i965 symbols I have to add.




Re: [Mesa-dev] [PATCH v3] glsl: add support for GL_OES_geometry_shader

2015-12-03 Thread Ian Romanick
On 12/02/2015 01:33 AM, Lofstedt, Marta wrote:
>> -Original Message-
>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
>> Behalf Of Ian Romanick
>> Sent: Tuesday, December 1, 2015 7:52 PM
>> To: Marta Lofstedt; mesa-dev@lists.freedesktop.org
>> Cc: Emil Velikov
>> Subject: Re: [Mesa-dev] [PATCH v3] glsl: add support for
>> GL_OES_geometry_shader
>>
>> On 12/01/2015 06:49 AM, Marta Lofstedt wrote:
>>> From: Marta Lofstedt 
>>>
>>> This adds glsl support of GL_OES_geometry_shader for OpenGL ES 3.1.
>>>
>>> Signed-off-by: Marta Lofstedt 
>>> ---
>>>  src/glsl/builtin_variables.cpp  | 25 +
>>>  src/glsl/glsl_parser.yy |  4 ++--
>>>  src/glsl/glsl_parser_extras.cpp |  1 +
>>>  src/glsl/glsl_parser_extras.h   |  7 +++
>>>  4 files changed, 23 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/glsl/builtin_variables.cpp
>>> b/src/glsl/builtin_variables.cpp index e8eab80..6f87600 100644
>>> --- a/src/glsl/builtin_variables.cpp
>>> +++ b/src/glsl/builtin_variables.cpp
>>> @@ -667,7 +667,7 @@ builtin_variable_generator::generate_constants()
>>>add_const("gl_MaxVaryingComponents", state->ctx-
>>> Const.MaxVarying * 4);
>>> }
>>>
>>> -   if (state->is_version(150, 0)) {
>>> +   if (state->has_geometry_shader()) {
>>>add_const("gl_MaxVertexOutputComponents",
>>>  state->Const.MaxVertexOutputComponents);
>>>add_const("gl_MaxGeometryInputComponents",
>>> @@ -730,12 +730,11 @@ builtin_variable_generator::generate_constants()
>>>add_const("gl_MaxAtomicCounterBindings",
>>>  state->Const.MaxAtomicBufferBindings);
>>>
>>> -  /* When Mesa adds support for GL_OES_geometry_shader and
>>> -   * GL_OES_tessellation_shader, this will need to change.
>>> -   */
>>> -  if (!state->es_shader) {
>>> +  if (state->has_geometry_shader()) {
>>>   add_const("gl_MaxGeometryAtomicCounters",
>>> state->Const.MaxGeometryAtomicCounters);
>>> +  }
>>> +  if (!state->es_shader) {
>>>   add_const("gl_MaxTessControlAtomicCounters",
>>> state->Const.MaxTessControlAtomicCounters);
>>
>> I think you got a little over excited here.  I don't see MaxTess*Atomic
>> anywhere in any GLES extension... not even in GL_EXT_tessellation_shader,
>> which seems like a bug in that extension spec.
>>
>> https://www.khronos.org/bugzilla/show_bug.cgi?id=1427
>>
> Ian, I am not sure I understand your comment. My patch enables 
> gl_MaxGeometryAtomicCounters and gl_MaxGeometryAtomicCounterBuffers if you 
> have geometry shader. These are part of the GL_OES_geometry_shader extension, 
> but was previously not exposed for ES shaders.
> 
> The "tess" stuff stays the same as it was before my patch, i.e. they are not 
> exposed to ES shaders.

Yeah... I'm not sure I understand my comment either.  I think I read the
patch "backwards"... as removing the second "if (!state->es_shader)"
instead of adding it.  Re-reading it with a clearer head, this patch is

Reviewed-by: Ian Romanick 

But I still think GL_EXT_tessellation_shader should add the
gl_MaxTess*Atomic built-in variables. :)

>>>   add_const("gl_MaxTessEvaluationAtomicCounters",
>>> @@ -753,12 +752,11 @@ builtin_variable_generator::generate_constants()
>>>add_const("gl_MaxAtomicCounterBufferSize",
>>>  state->Const.MaxAtomicCounterBufferSize);
>>>
>>> -  /* When Mesa adds support for GL_OES_geometry_shader and
>>> -   * GL_OES_tessellation_shader, this will need to change.
>>> -   */
>>> -  if (!state->es_shader) {
>>> +  if (state->has_geometry_shader()) {
>>>   add_const("gl_MaxGeometryAtomicCounterBuffers",
>>> state->Const.MaxGeometryAtomicCounterBuffers);
>>> +  }
>>> +  if(!state->es_shader) {
>>
>> Here too.
>>
>> With these two things reverted, this patch is
>>
>> Reviewed-by: Ian Romanick 
>>
>>>   add_const("gl_MaxTessControlAtomicCounterBuffers",
>>> state->Const.MaxTessControlAtomicCounterBuffers);
>>>   add_const("gl_MaxTessEvaluationAtomicCounterBuffers",
>>> @@ -814,13 +812,16 @@ builtin_variable_generator::generate_constants()
>>>add_const("gl_MaxCombinedImageUniforms",
>>>  state->Const.MaxCombinedImageUniforms);
>>>
>>> +  if (state->has_geometry_shader()) {
>>> + add_const("gl_MaxGeometryImageUniforms",
>>> +   state->Const.MaxGeometryImageUniforms);
>>> +  }
>>> +
>>>if (!state->es_shader) {
>>>   add_const("gl_MaxCombinedImageUnitsAndFragmentOutputs",
>>> state->Const.MaxCombinedShaderOutputResources);
>>>   add_const("gl_MaxImageSamples",
>>> state->Const.MaxImageSamples);
>>> - add_const("gl_MaxGeometryImageUniforms",
>>> -   state->Const.MaxGeometryImageUniforms);
>>>}
>>>
>>>if (state->is_version(45

Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Emil Velikov
On 4 December 2015 at 00:45, Matt Turner  wrote:
> On Thu, Dec 3, 2015 at 4:37 PM, Emil Velikov  wrote:
>>> FWIW, make check still fails for me even with this patch.
>> Do you have a log that I can take a look ? I've `make clean'ed and
>> rebuild a couple of times just in case and things seems to pass here.
>
> Attached is the output of ./main-test without (p.txt) and with (q.txt)
> your patch.
Interesting... upon closer look the build produces two tests -
lt-main-test and main-test. The former is executed on my system
(either via the top level make check or the one in
src/mesa/main/tests) and passes, while the latter produces the same
issues as you've spotted.

Then again the former has RPATH set to the built libglapi.so location
while the latter does not.

Can you try tracking it down on your end - perhaps setting up
LD_PRELOAD=/mesa/build/dir/path/to/libglapi.so ?

-Emil


Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics

2015-12-03 Thread Jason Ekstrand
On Dec 3, 2015 3:47 PM, "Eric Anholt"  wrote:
>
> Jason Ekstrand  writes:
>
> > Cc: Eric Anholt 
>
> OK, I've pushed a branch of partial fixes for this series to nir-loads
> of my fdo tree, but it's still super broken.  I've spent too much of
> today on it, and this series was not ready for review.  There are still
> const_index[0] references all over that are obviously from intrinsics
> you've removed the const_index on.

I'm going to start looking at this again tomorrow.  I've had other things
on my plate this week.  Rob already pointed me at some crashing shaders for
ir3 and his little standalone compiler. I'll start there.  Is there
something similar I can use to debug vc4?

> I'd recommend, if you're going to resubmit this series, that you submit
> it as two separate patches, one for stores and one for loads, and sanity
> check the remaining const_index[0] references.

I wanted to split it in half but, unfortunately, that's kind of painful for
nir_lower_io.  I can look into it again.


Re: [Mesa-dev] [PATCH 03/10] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-03 Thread Rob Clark
On Thu, Dec 3, 2015 at 6:01 PM, Eric Anholt  wrote:
> Jason Ekstrand  writes:
>
>> There is some special-casing needed in a competent back-end.  However, they
>> can do their special-casing easily enough based on whether or not the
>> offset is a constant.  In the mean time, having the *_indirect variants
>> adds special cases a number of places where they don't need to be and, in
>> general, only complicates things.  To complicate matters, NIR had no way to
>> convert an indirect load/store to a direct one in the case that the
>> indirect was a constant so we would still not really get what the back-ends
>> wanted.  The best solution seems to be to get rid of the *_indirect
>> variants entirely.
>
> I've been putting off debugging this series because NIR intrinsic
> documentation has been bad at const_index[] versus src[] explanations
> and I only ever get my code working through cargo culting.  It looks
> like you simplify things a lot, but I'm going to ask for some
> clarification still.

btw, I 2nd the vagueness of docs wrt const_index[] vs src[]..  we
*really* should get Connor's nir docs[1] patch(es) merged.. I'd take a
pass at updating them as I go, if only to get the patches reviewed to
confirm that my understanding was correct ;-)

Anyways, after http://hastebin.com/kixugulabe.coffee not *everything*
is failing.. current status is at:
http://people.freedesktop.org/~robclark/tmp/tmp.log

[1] ie, whatever produces: http://people.freedesktop.org/~cwabbott0/nir-docs/


Re: [Mesa-dev] [PATCH 03/10] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-03 Thread Jason Ekstrand
On Dec 3, 2015 3:01 PM, "Eric Anholt"  wrote:
>
> Jason Ekstrand  writes:
>
> > There is some special-casing needed in a competent back-end.  However, they
> > can do their special-casing easily enough based on whether or not the
> > offset is a constant.  In the mean time, having the *_indirect variants
> > adds special cases a number of places where they don't need to be and, in
> > general, only complicates things.  To complicate matters, NIR had no way to
> > convert an indirect load/store to a direct one in the case that the
> > indirect was a constant so we would still not really get what the back-ends
> > wanted.  The best solution seems to be to get rid of the *_indirect
> > variants entirely.
>
> I've been putting off debugging this series because NIR intrinsic
> documentation has been bad at const_index[] versus src[] explanations
> and I only ever get my code working through cargo culting.  It looks
> like you simplify things a lot, but I'm going to ask for some
> clarification still.
>
> > ---
> >  src/glsl/nir/nir_intrinsics.h   | 64
-
> >  src/glsl/nir/nir_lower_phis_to_scalar.c |  4 ---
> >  src/glsl/nir/nir_print.c| 19 +-
> >  3 files changed, 38 insertions(+), 49 deletions(-)
> >
> > diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> > index b2565c5..0fa5a27 100644
> > --- a/src/glsl/nir/nir_intrinsics.h
> > +++ b/src/glsl/nir/nir_intrinsics.h
> > @@ -228,54 +228,50 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
> >  SYSTEM_VALUE(helper_invocation, 1, 0)
> >
> >  /*
> > - * The format of the indices depends on the type of the load.  For uniforms,
> > - * the first index is the base address and the second index is an offset that
> > - * should be added to the base address.  (This way you can determine in the
> > - * back-end which variable is being accessed even in an array.)  For inputs,
> > - * the one and only index corresponds to the attribute slot.  UBO loads also
> > - * have a single index which is the base address to load from.
> > + * All load operations have a source specifying an offset which may or may
> > + * not be constant.  If the shader is still in SSA or partial SSA form, then
> > + * determining whether or not the offset is constant is trivial.  This is
> > + * always the last source in the intrinsic.
> >   *
> > - * UBO loads have a (possibly constant) source which is the UBO buffer index.
> > - * For each type of load, the _indirect variant has one additional source
> > - * (the second in the case of UBO's) that is the is an indirect to be added to
> > - * the constant address or base offset to compute the final offset.
> > + * Uniforms have a constant index that provides a secondary base offset that
> > + * should be added to the offset from the source.  This allows back-ends to
> > + * determine which uniform variable is being accessed.
>
> For clarity, since we have things that are indices/offsets and might be
> constants but appear in src[]: s/constant index/constant_index/.

Will do.

> > - * For vector backends, the address is in terms of one vec4, and so each array
> > - * element is +4 scalar components from the previous array element.  For scalar
> > - * backends, the address is in terms of a single 4-byte float/int and arrays
> > - * elements begin immediately after the previous array element.
> > + * UBO and SSBO loads have a (possibly constant) source which is the UBO
> > + * buffer index.  The pervertex_input intrinsic has a source which specifies
> > + * the (possibly constant) vertex id to load from.
> > + *
> > + * The exact address type depends on the lowering pass that generates the
> > + * load/store intrinsics.  Typically, this is vec4 units for things such as
> > + * varying slots and float units for fragment shader inputs.  UBO and SSBO
> > + * offsets are always in bytes.
> >   */
> >
> >  #define LOAD(name, extra_srcs, indices, flags) \
> > -   INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \
> > -   INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
> > - true, 0, 0, indices, flags)
> > +   INTRINSIC(load_##name, extra_srcs + 1, ARR(1, 1), true, 0, 0, indices, flags)
> >
> > -LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>
> Uniforms had two const_index entries before?  I've only ever used one.

Yeah, they did for a while.  It never affected any tgsi users though so I
didn't bother updating those drivers.

> > -LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> > -LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> > -LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> > -LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> > -LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> > -LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> > +LOAD(uniform, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE |
NIR_INTRINSIC_CAN_

[Mesa-dev] [Bug 93188] "nir/nir.h", line 552: Error: Unexpected type name "nir_src" encountered.

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93188

--- Comment #4 from Ian Romanick  ---
I guess the question is whether or not this builds with Visual Studio's C++
compiler.  If not, maybe we should reopen with changes to the summary?



Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 4:37 PM, Emil Velikov  wrote:
>> FWIW, make check still fails for me even with this patch.
> Do you have a log that I can take a look at? I've `make clean'ed and
> rebuilt a couple of times just in case, and things seem to pass here.

Attached is the output of ./main-test without (p.txt) and with (q.txt)
your patch.
Running main() from gtest_main.cc
[==] Running 12 tests from 5 test cases.
[--] Global test environment set-up.
[--] 2 tests from EnumStrings
[ RUN  ] EnumStrings.LookUpByNumber
[   OK ] EnumStrings.LookUpByNumber (0 ms)
[ RUN  ] EnumStrings.LookUpUnknownNumber
[   OK ] EnumStrings.LookUpUnknownNumber (0 ms)
[--] 2 tests from EnumStrings (1 ms total)

[--] 6 tests from DispatchSanity_test
[ RUN  ] DispatchSanity_test.GL31_CORE
[   OK ] DispatchSanity_test.GL31_CORE (1 ms)
[ RUN  ] DispatchSanity_test.GL30
[   OK ] DispatchSanity_test.GL30 (0 ms)
[ RUN  ] DispatchSanity_test.GLES11
[   OK ] DispatchSanity_test.GLES11 (1 ms)
[ RUN  ] DispatchSanity_test.GLES2
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x449fc0
Expected: nop_table[i]
Which is: 0x419650
i = 329 (GetPointerv)
[  FAILED  ] DispatchSanity_test.GLES2 (0 ms)
[ RUN  ] DispatchSanity_test.GLES3
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x449fc0
Expected: nop_table[i]
Which is: 0x419650
i = 329 (GetPointerv)
[  FAILED  ] DispatchSanity_test.GLES3 (0 ms)
[ RUN  ] DispatchSanity_test.GLES31
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x449fc0
Expected: nop_table[i]
Which is: 0x419650
i = 329 (GetPointerv)
[  FAILED  ] DispatchSanity_test.GLES31 (1 ms)
[--] 6 tests from DispatchSanity_test (3 ms total)

[--] 2 tests from MesaFormatsTest
[ RUN  ] MesaFormatsTest.FormatTypeAndComps
[   OK ] MesaFormatsTest.FormatTypeAndComps (0 ms)
[ RUN  ] MesaFormatsTest.FormatSanity
[   OK ] MesaFormatsTest.FormatSanity (0 ms)
[--] 2 tests from MesaFormatsTest (0 ms total)

[--] 1 test from MesaExtensionsTest
[ RUN  ] MesaExtensionsTest.AlphabeticallySorted
[   OK ] MesaExtensionsTest.AlphabeticallySorted (0 ms)
[--] 1 test from MesaExtensionsTest (0 ms total)

[--] 1 test from program_state_string
[ RUN  ] program_state_string.depth_range
[   OK ] program_state_string.depth_range (0 ms)
[--] 1 test from program_state_string (0 ms total)

[--] Global test environment tear-down
[==] 12 tests from 5 test cases ran. (4 ms total)
[  PASSED  ] 9 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] DispatchSanity_test.GLES2
[  FAILED  ] DispatchSanity_test.GLES3
[  FAILED  ] DispatchSanity_test.GLES31

 3 FAILED TESTS
Running main() from gtest_main.cc
[==] Running 12 tests from 5 test cases.
[--] Global test environment set-up.
[--] 2 tests from EnumStrings
[ RUN  ] EnumStrings.LookUpByNumber
[   OK ] EnumStrings.LookUpByNumber (0 ms)
[ RUN  ] EnumStrings.LookUpUnknownNumber
[   OK ] EnumStrings.LookUpUnknownNumber (0 ms)
[--] 2 tests from EnumStrings (0 ms total)

[--] 6 tests from DispatchSanity_test
[ RUN  ] DispatchSanity_test.GL31_CORE
[   OK ] DispatchSanity_test.GL31_CORE (1 ms)
[ RUN  ] DispatchSanity_test.GL30
[   OK ] DispatchSanity_test.GL30 (1 ms)
[ RUN  ] DispatchSanity_test.GLES11
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:151: Failure
Value of: _glapi_get_proc_offset(function_table[i].name)
  Actual: -1
Expected: offset
Which is: 329
Function: glGetPointervKHR
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x449fc0
Expected: nop_table[i]
Which is: 0x419650
i = 329 (GetPointerv)
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x5b2190
Expected: nop_table[i]
Which is: 0x419650
i = 1105 (ObjectLabel)
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x5b22a0
Expected: nop_table[i]
Which is: 0x419650
i = 1106 (ObjectPtrLabel)
[  FAILED  ] DispatchSanity_test.GLES11 (0 ms)
[ RUN  ] DispatchSanity_test.GLES2
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:148: Failure
Expected: (-1) != (offset), actual: -1 vs -1
Function: glGetPointervKHR
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x449fc0
Expected: nop_table[i]
Which is: 0x419650
i = 329 (GetPointerv)
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x5b2190
Expected: nop_table[i]
Which is: 0x419650
i = 1105 (ObjectLabel)
../../../../../mesa/src/mesa/main/tests/dispatch_sanity.cpp:173: Failure
Value of: table[i]
  Actual: 0x5b22a0
Expected:

Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Ilia Mirkin
On Thu, Dec 3, 2015 at 7:25 PM, Roland Scheidegger  wrote:
> Am 03.12.2015 um 23:48 schrieb Ilia Mirkin:
>> On Thu, Dec 3, 2015 at 4:44 AM, Edward O'Callaghan
>>  wrote:
>>> ARB_clear_texture is reasonably generic enough that it should
>>> be moved down to become part of the fallback mechanism of
>>> pipe->clear_texture.
>>>
>>> Signed-off-by: Edward O'Callaghan 
>>> ---
>>>  src/gallium/auxiliary/util/u_surface.c  | 83 
>>> +
>>>  src/gallium/auxiliary/util/u_surface.h  |  6 ++
>>>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
>>>  3 files changed, 90 insertions(+), 66 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_surface.c 
>>> b/src/gallium/auxiliary/util/u_surface.c
>>> index 6aa44f9..e7ab175 100644
>>> --- a/src/gallium/auxiliary/util/u_surface.c
>>> +++ b/src/gallium/auxiliary/util/u_surface.c
>>> @@ -36,6 +36,7 @@
>>>  #include "pipe/p_screen.h"
>>>  #include "pipe/p_state.h"
>>>
>>> +#include "util/u_math.h"
>>>  #include "util/u_format.h"
>>>  #include "util/u_inlines.h"
>>>  #include "util/u_rect.h"
>>> @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
>>> }
>>>  }
>>>
>>> +/**
>>> + * Fallback for pipe->clear_texture() function.
>>> + * clears a non-PIPE_BUFFER resource's specified level
>>> + * and bounding box with a clear value provided in that
>>> + * resource's native format.
>>> + *
>>> + * XXX sf->format = .. is problematic as hw need
>>> + * not necessarily support the format.
>>> + */
>>> +void
>>> +util_surface_clear_texture(struct pipe_context *pipe,
>>> +   struct pipe_resource *res,
>>> +   unsigned level,
>>> +   const struct pipe_box *box,
>>> +   const void *data)
>>> +{
>>> +   struct pipe_surface tmpl = {{0}}, *sf;
>>> +
>>> +   tmpl.format = res->format;
>>> +   tmpl.u.tex.first_layer = box->z;
>>> +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
>>> +   tmpl.u.tex.level = level;
>>> +   sf = pipe->create_surface(pipe, res, &tmpl);
>>> +   if (!sf)
>>> +  return;
>>> +
>>> +   if (util_format_is_depth_or_stencil(res->format)) {
>>> +  float depth = 0;
>>> +  uint8_t stencil = 0;
>>> +  unsigned clear = 0;
>>> +  const struct util_format_description *desc =
>>> + util_format_description(res->format);
>>> +
>>> +  if (util_format_has_depth(desc)) {
>>> + clear |= PIPE_CLEAR_DEPTH;
>>> + desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
>>> +  }
>>> +  if (util_format_has_stencil(desc)) {
>>> + clear |= PIPE_CLEAR_STENCIL;
>>> + desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
>>> +  }
>>> +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
>>> +box->x, box->y, box->width, box->height);
>>> +   } else {
>>> +  union pipe_color_union color;
>>> +
>>> +  switch (util_format_get_blocksizebits(res->format)) {
>>> +  case 128:
>>> + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
>>> + memcpy(&color.ui, data, 128 / 8);
>>> + break;
>>> +  case 64:
>>> + sf->format = PIPE_FORMAT_R32G32_UINT;
>>> + memcpy(&color.ui, data, 64 / 8);
>>> + memset(&color.ui[2], 0, 64 / 8);
>>> + break;
>>> +  case 32:
>>> + sf->format = PIPE_FORMAT_R32_UINT;
>>> + memcpy(&color.ui, data, 32 / 8);
>>> + memset(&color.ui[1], 0, 96 / 8);
>>> + break;
>>> +  case 16:
>>> + sf->format = PIPE_FORMAT_R16_UINT;
>>> + color.ui[0] = util_cpu_to_le32(
>>> +util_le16_to_cpu(*(unsigned short *)data));
>>> + memset(&color.ui[1], 0, 96 / 8);
>>> + break;
>>> +  case 8:
>>> + sf->format = PIPE_FORMAT_R8_UINT;
>>> + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
>>> + memset(&color.ui[1], 0, 96 / 8);
>>> + break;
>>> +  default:
>>> + assert(!"Unknown texel element size");
>>> + return;
>>> +  }
>>> +
>>> +  pipe->clear_render_target(pipe, sf, &color,
>>> +box->x, box->y, box->width, box->height);
>>
>> I only recently realized this, but I'm fairly sure this is wrong --
>> needs to be divided by util_format_blockwidth/height, otherwise we'll
>> go way out of bounds for compressed formats. [And yes,
>> nv50_clear_texture has the same problem.]
> You can't clear compressed textures with ARB_clear_texture, so I guess
> this shouldn't be a problem.

Ah yes, indeed. That's why I didn't get any weird failures :)
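
To illustrate the block-size point discussed above: a minimal sketch, assuming a format with a fixed block footprint, of how a pixel-space box could be converted to block units before being passed to a clear. The names used here (block_dims, box2d, box_to_blocks) are hypothetical and are not gallium API. As Roland points out above, ARB_clear_texture does not allow compressed formats, so this path may never need it.

```c
#include <assert.h>

/* Hypothetical stand-in for a format's block footprint, in pixels. */
struct block_dims { unsigned width, height; };

/* Hypothetical 2D box; units depend on context (pixels or blocks). */
struct box2d { unsigned x, y, width, height; };

static unsigned div_round_up(unsigned n, unsigned d)
{
   return (n + d - 1) / d;
}

/* Convert a pixel-space box to the smallest block-space box covering it. */
struct box2d box_to_blocks(struct box2d px, struct block_dims b)
{
   struct box2d out;

   assert(b.width > 0 && b.height > 0);

   out.x = px.x / b.width;                      /* round origin down */
   out.y = px.y / b.height;
   out.width = div_round_up(px.x + px.width, b.width) - out.x;   /* round end up */
   out.height = div_round_up(px.y + px.height, b.height) - out.y;
   return out;
}
```

With a 1x1 footprint the box comes back unchanged, which is the uncompressed case the patch actually handles.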

> Even if you think it should work in gallium, with the format illegally
> being changed after-the-fact (after creating the surface) it's not even
> worth thinking about what the correct dimensions should be (but yes if
> that would be fixed you'd be right)...

Yeah... that's why that stuff sat as an nv50 helper, not in generic
code. Figured it

Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Emil Velikov
On 4 December 2015 at 00:23, Matt Turner  wrote:
> On Thu, Dec 3, 2015 at 3:59 PM, Emil Velikov  wrote:
>> On 3 December 2015 at 22:15, Matt Turner  wrote:
>>> On Thu, Dec 3, 2015 at 2:05 PM, Emil Velikov  
>>> wrote:
>>>> Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
>>>> to the ES dispatch".
>>>>
>>>> Note: as the core symbol is present in GLES 1.1 we cannot (should not)
>>>> include the KHR one in the es11 table. Add the symbol, commented out,
>>>> with description for posterity.
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
>>>> Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
>>>> Signed-off-by: Emil Velikov 
>>>> ---
>>>>  src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
>>>> b/src/mesa/main/tests/dispatch_sanity.cpp
>>>> index 97f81f9..687c8f3 100644
>>>> --- a/src/mesa/main/tests/dispatch_sanity.cpp
>>>> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
>>>> @@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
>>>> { "glGetDebugMessageLogKHR", 11, -1 },
>>>> { "glGetObjectLabelKHR", 11, -1 },
>>>> { "glGetObjectPtrLabelKHR", 11, -1 },
>>>> +   // The following clashes with the non KHR definition above
>>>
>>> We have comments elsewhere like
>>>
>>>// We check for the aliased -OES version in GLES 2
>>>
>>> Can you make the comment match that?
>>>
>> Slightly confused here. The example is the opposite of what I'm doing here.
>
> Huh. I'm not sure why. It's just a problem of testing for two
> functions that are aliased.
>
> Whether we test for the "regular" one or the extension one doesn't
> matter. All I was saying was "make your comment match the format of
> the others noting the same problem elsewhere"
>
> ... unless I've misunderstood something.
>
You're spot on. Earlier I went with the shorter solution; with v2
things are consistent.

>> I'm fine either way just let me know whichever you prefer.
>> -Emil
>
> FWIW, make check still fails for me even with this patch.
Do you have a log that I can take a look at? I've `make clean'ed and
rebuilt a couple of times just in case, and things seem to pass here.

Thanks
Emil


[Mesa-dev] [PATCH v2] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Emil Velikov
Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
to the ES dispatch".

v2: comment out the ES1.1 symbol and use the same description (pattern)
as elsewhere (Matt)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
Signed-off-by: Emil Velikov 
---
 src/mesa/main/tests/dispatch_sanity.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp
index 97f81f9..d288b1d 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -1937,7 +1937,8 @@ const struct function gles11_functions_possible[] = {
{ "glGetLightxv", 11, -1 },
{ "glGetMaterialfv", 11, _gloffset_GetMaterialfv },
{ "glGetMaterialxv", 11, -1 },
-   { "glGetPointerv", 11, _gloffset_GetPointerv },
+   // We check for the aliased -KHR version in GLES 1.1
+// { "glGetPointerv", 11, _gloffset_GetPointerv },
{ "glGetRenderbufferParameterivOES", 11, -1 },
{ "glGetString", 11, _gloffset_GetString },
{ "glGetTexEnvfv", 11, _gloffset_GetTexEnvfv },
@@ -2049,6 +2050,7 @@ const struct function gles11_functions_possible[] = {
{ "glGetDebugMessageLogKHR", 11, -1 },
{ "glGetObjectLabelKHR", 11, -1 },
{ "glGetObjectPtrLabelKHR", 11, -1 },
+   { "glGetPointervKHR", 11, _gloffset_GetPointerv },
{ "glObjectLabelKHR", 11, -1 },
{ "glObjectPtrLabelKHR", 11, -1 },
 
@@ -2284,6 +2286,7 @@ const struct function gles2_functions_possible[] = {
{ "glGetDebugMessageLogKHR", 20, -1 },
{ "glGetObjectLabelKHR", 20, -1 },
{ "glGetObjectPtrLabelKHR", 20, -1 },
+   { "glGetPointervKHR", 20, -1 },
{ "glObjectLabelKHR", 20, -1 },
{ "glObjectPtrLabelKHR", 20, -1 },
 
-- 
2.6.2



[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #41 from Roland Scheidegger  ---
Miscompilations are possible but rare; often, different compiler options just
hide a bug. Did you try running this with valgrind?



Re: [Mesa-dev] [PATCH 1/6] glsl: Switch opcode and avail parameters to binop().

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 4:24 PM, Matt Turner  wrote:
> On Thu, Dec 3, 2015 at 4:22 PM, Eric Anholt  wrote:
>> Matt Turner  writes:
>>
>>> To make it match unop().
>>
>> FWIW, this series plus the algebraic patch appears to net the same code
>> generated on vc4, so I'm fine with it.  Based on the other feedback, it
>> does seem like we should leave the GLSL ops for now and just change the
>> NIR to be simpler.
>
> With the st_glsl_to_tgsi patch I sent, even the generated TGSI should
> be identical.

... though I suppose I could do the same for ir_to_mesa.


Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Roland Scheidegger
Am 03.12.2015 um 23:48 schrieb Ilia Mirkin:
> On Thu, Dec 3, 2015 at 4:44 AM, Edward O'Callaghan
>  wrote:
>> ARB_clear_texture is reasonably generic enough that it should
>> be moved down to become part of the fallback mechanism of
>> pipe->clear_texture.
>>
>> Signed-off-by: Edward O'Callaghan 
>> ---
>>  src/gallium/auxiliary/util/u_surface.c  | 83 
>> +
>>  src/gallium/auxiliary/util/u_surface.h  |  6 ++
>>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
>>  3 files changed, 90 insertions(+), 66 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_surface.c 
>> b/src/gallium/auxiliary/util/u_surface.c
>> index 6aa44f9..e7ab175 100644
>> --- a/src/gallium/auxiliary/util/u_surface.c
>> +++ b/src/gallium/auxiliary/util/u_surface.c
>> @@ -36,6 +36,7 @@
>>  #include "pipe/p_screen.h"
>>  #include "pipe/p_state.h"
>>
>> +#include "util/u_math.h"
>>  #include "util/u_format.h"
>>  #include "util/u_inlines.h"
>>  #include "util/u_rect.h"
>> @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
>> }
>>  }
>>
>> +/**
>> + * Fallback for pipe->clear_texture() function.
>> + * clears a non-PIPE_BUFFER resource's specified level
>> + * and bounding box with a clear value provided in that
>> + * resource's native format.
>> + *
>> + * XXX sf->format = .. is problematic as hw need
>> + * not necessarily support the format.
>> + */
>> +void
>> +util_surface_clear_texture(struct pipe_context *pipe,
>> +   struct pipe_resource *res,
>> +   unsigned level,
>> +   const struct pipe_box *box,
>> +   const void *data)
>> +{
>> +   struct pipe_surface tmpl = {{0}}, *sf;
>> +
>> +   tmpl.format = res->format;
>> +   tmpl.u.tex.first_layer = box->z;
>> +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
>> +   tmpl.u.tex.level = level;
>> +   sf = pipe->create_surface(pipe, res, &tmpl);
>> +   if (!sf)
>> +  return;
>> +
>> +   if (util_format_is_depth_or_stencil(res->format)) {
>> +  float depth = 0;
>> +  uint8_t stencil = 0;
>> +  unsigned clear = 0;
>> +  const struct util_format_description *desc =
>> + util_format_description(res->format);
>> +
>> +  if (util_format_has_depth(desc)) {
>> + clear |= PIPE_CLEAR_DEPTH;
>> + desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
>> +  }
>> +  if (util_format_has_stencil(desc)) {
>> + clear |= PIPE_CLEAR_STENCIL;
>> + desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
>> +  }
>> +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
>> +box->x, box->y, box->width, box->height);
>> +   } else {
>> +  union pipe_color_union color;
>> +
>> +  switch (util_format_get_blocksizebits(res->format)) {
>> +  case 128:
>> + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
>> + memcpy(&color.ui, data, 128 / 8);
>> + break;
>> +  case 64:
>> + sf->format = PIPE_FORMAT_R32G32_UINT;
>> + memcpy(&color.ui, data, 64 / 8);
>> + memset(&color.ui[2], 0, 64 / 8);
>> + break;
>> +  case 32:
>> + sf->format = PIPE_FORMAT_R32_UINT;
>> + memcpy(&color.ui, data, 32 / 8);
>> + memset(&color.ui[1], 0, 96 / 8);
>> + break;
>> +  case 16:
>> + sf->format = PIPE_FORMAT_R16_UINT;
>> + color.ui[0] = util_cpu_to_le32(
>> +util_le16_to_cpu(*(unsigned short *)data));
>> + memset(&color.ui[1], 0, 96 / 8);
>> + break;
>> +  case 8:
>> + sf->format = PIPE_FORMAT_R8_UINT;
>> + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
>> + memset(&color.ui[1], 0, 96 / 8);
>> + break;
>> +  default:
>> + assert(!"Unknown texel element size");
>> + return;
>> +  }
>> +
>> +  pipe->clear_render_target(pipe, sf, &color,
>> +box->x, box->y, box->width, box->height);
> 
> I only recently realized this, but I'm fairly sure this is wrong --
> needs to be divided by util_format_blockwidth/height, otherwise we'll
> go way out of bounds for compressed formats. [And yes,
> nv50_clear_texture has the same problem.]
You can't clear compressed textures with ARB_clear_texture, so I guess
this shouldn't be a problem.
Even if you think it should work in gallium, with the format illegally
being changed after-the-fact (after creating the surface) it's not even
worth thinking about what the correct dimensions should be (but yes if
that would be fixed you'd be right)...

Roland



> 
>> +   }
>> +   pipe->surface_destroy(pipe, sf);
>> +}
>>
>>  /* Return if the box is totally inside the resource.
>>   */
>> diff --git a/src/gallium/auxiliary/util/u_surface.h 
>> b/src/gallium/auxiliary/util/u_surface.h
>> index bfd8f40..069a393 100644
>> --- a/src/gallium/auxiliary/util/u_surface.h
>> +++ b/src/gall

Re: [Mesa-dev] [PATCH 1/6] glsl: Switch opcode and avail parameters to binop().

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 4:22 PM, Eric Anholt  wrote:
> Matt Turner  writes:
>
>> To make it match unop().
>
> FWIW, this series plus the algebraic patch appears to net the same code
> generated on vc4, so I'm fine with it.  Based on the other feedback, it
> does seem like we should leave the GLSL ops for now and just change the
> NIR to be simpler.

With the st_glsl_to_tgsi patch I sent, even the generated TGSI should
be identical.


Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 3:59 PM, Emil Velikov  wrote:
> On 3 December 2015 at 22:15, Matt Turner  wrote:
>> On Thu, Dec 3, 2015 at 2:05 PM, Emil Velikov  
>> wrote:
>>> Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
>>> to the ES dispatch".
>>>
>>> Note: as the core symbol is present in GLES 1.1 we cannot (should not)
>>> include the KHR one in the es11 table. Add the symbol, commented out,
>>> with description for posterity.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
>>> Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
>>> Signed-off-by: Emil Velikov 
>>> ---
>>>  src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
>>> b/src/mesa/main/tests/dispatch_sanity.cpp
>>> index 97f81f9..687c8f3 100644
>>> --- a/src/mesa/main/tests/dispatch_sanity.cpp
>>> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
>>> @@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
>>> { "glGetDebugMessageLogKHR", 11, -1 },
>>> { "glGetObjectLabelKHR", 11, -1 },
>>> { "glGetObjectPtrLabelKHR", 11, -1 },
>>> +   // The following clashes with the non KHR definition above
>>
>> We have comments elsewhere like
>>
>>// We check for the aliased -OES version in GLES 2
>>
>> Can you make the comment match that?
>>
> Slightly confused here. The example is the opposite of what I'm doing here.

Huh. I'm not sure why. It's just a problem of testing for two
functions that are aliased.

Whether we test for the "regular" one or the extension one doesn't
matter. All I was saying was "make your comment match the format of
the others noting the same problem elsewhere"

... unless I've misunderstood something.

Not important in any case.
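
A toy model of the aliasing point above. This is a sketch under an assumption, not the actual dispatch_sanity.cpp logic: it assumes the test marks the dispatch slot of every listed name as covered and then reports any remaining slot that still holds a real entry point. All names, slot numbers and helpers below are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define NUM_SLOTS 4

/* Hypothetical name-to-slot map: both spellings alias the same slot. */
struct entry { const char *name; int slot; };

static const struct entry known_names[] = {
   { "glGetString",      0 },
   { "glGetPointerv",    2 },
   { "glGetPointervKHR", 2 },
};

/* Return the slot a name resolves to, or -1 if the name is unknown. */
int lookup_slot(const char *name)
{
   for (size_t i = 0; i < sizeof(known_names) / sizeof(known_names[0]); i++) {
      if (strcmp(known_names[i].name, name) == 0)
         return known_names[i].slot;
   }
   return -1;
}

/* populated[i] is nonzero when slot i holds a real entry point.
 * Returns how many populated slots are not covered by any listed name. */
int count_uncovered(const int populated[NUM_SLOTS],
                    const char *const *listed, size_t num_listed)
{
   int covered[NUM_SLOTS] = { 0 };
   int uncovered = 0;

   for (size_t i = 0; i < num_listed; i++) {
      int slot = lookup_slot(listed[i]);
      if (slot >= 0 && slot < NUM_SLOTS)
         covered[slot] = 1;
   }
   for (int i = 0; i < NUM_SLOTS; i++) {
      if (populated[i] && !covered[i])
         uncovered++;
   }
   return uncovered;
}
```

In this model, listing either spelling covers the shared slot, and listing neither leaves one populated slot unaccounted for.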

> Are you suggesting that I comment out the normal function, give it a
> "// We check for the aliased -KHR version in GLES 1.1" comment and
> uncomment the below (I'll need to change the offset to
> _gloffset_GetPointerv) ?
>
> I'm fine either way just let me know whichever you prefer.
> -Emil

FWIW, make check still fails for me even with this patch.


Re: [Mesa-dev] [PATCH 1/6] glsl: Switch opcode and avail parameters to binop().

2015-12-03 Thread Eric Anholt
Matt Turner  writes:

> To make it match unop().

FWIW, this series plus the algebraic patch appears to net the same code
generated on vc4, so I'm fine with it.  Based on the other feedback, it
does seem like we should leave the GLSL ops for now and just change the
NIR to be simpler.




Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.

2015-12-03 Thread Eric Anholt
Matt Turner  writes:

> ---
> I add the true/false variables for clarity since there are some existing
> optimizations using ~0 where it actually has nothing to do with true.
>
> I could take it or leave it. We obviously can't use them for feq and
> friends. Maybe itrue/ifalse and ftrue/ffalse?

No changes on my shader-db, but glsl-fs-all-01.shader_test emits a lot
fewer instructions.  For either version:

Reviewed-by: Eric Anholt 
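
As a sanity check on the idea behind the patch, here is a small sketch assuming booleans are represented as 0 (false) and ~0 (true), which is what the mention of ~0 above suggests. The helper names are hypothetical models, not NIR API, and the four identities are examples of the kind of rewrite the subject line describes rather than a copy of the patch.

```c
#include <stdint.h>

#define BOOL_TRUE  (~0u)   /* assumed representation of true */
#define BOOL_FALSE 0u      /* assumed representation of false */

/* Models of integer compares that produce a 0 / ~0 boolean. */
uint32_t model_ieq(uint32_t x, uint32_t y) { return x == y ? BOOL_TRUE : BOOL_FALSE; }
uint32_t model_ine(uint32_t x, uint32_t y) { return x != y ? BOOL_TRUE : BOOL_FALSE; }
uint32_t model_inot(uint32_t x) { return ~x; }

/* Returns 1 if, for the value a, comparing against true/false is
 * equivalent to either a itself or its bitwise negation. */
int identities_hold(uint32_t a)
{
   return model_ieq(a, BOOL_TRUE)  == a &&
          model_ine(a, BOOL_FALSE) == a &&
          model_ieq(a, BOOL_FALSE) == model_inot(a) &&
          model_ine(a, BOOL_TRUE)  == model_inot(a);
}
```

The identities rely on a already being 0 or ~0; for any other integer value, such as 1, they do not hold.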




Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Emil Velikov
On 3 December 2015 at 22:15, Matt Turner  wrote:
> On Thu, Dec 3, 2015 at 2:05 PM, Emil Velikov  wrote:
>> Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
>> to the ES dispatch".
>>
>> Note: as the core symbol is present in GLES 1.1 we cannot (should not)
>> include the KHR one in the es11 table. Add the symbol, commented out,
>> with description for posterity.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
>> Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
>> b/src/mesa/main/tests/dispatch_sanity.cpp
>> index 97f81f9..687c8f3 100644
>> --- a/src/mesa/main/tests/dispatch_sanity.cpp
>> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
>> @@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
>> { "glGetDebugMessageLogKHR", 11, -1 },
>> { "glGetObjectLabelKHR", 11, -1 },
>> { "glGetObjectPtrLabelKHR", 11, -1 },
>> +   // The following clashes with the non KHR definition above
>
> We have comments elsewhere like
>
>// We check for the aliased -OES version in GLES 2
>
> Can you make the comment match that?
>
Slightly confused here. The example is the opposite of what I'm doing here.

Are you suggesting that I comment out the normal function, give it a
"// We check for the aliased -KHR version in GLES 1.1" comment and
uncomment the below (I'll need to change the offset to
_gloffset_GetPointerv) ?

I'm fine either way just let me know whichever you prefer.
-Emil


Re: [Mesa-dev] [PATCH 6/6] i965/vec4: Optimize predicate handling for any/all.

2015-12-03 Thread Kenneth Graunke
On Monday, November 30, 2015 03:32:11 PM Matt Turner wrote:
> For a select whose condition is any(v), instead of emitting
> 
>    cmp.nz.f0(8)        null<1>D    g1<0,4,1>D      0D
>    mov(8)              g7<1>.xUD   0xUD
>    (+f0.any4h) mov(8)  g7<1>.xUD   0xUD
>    cmp.nz.f0(8)        null<1>D    g7<4,4,1>.xD    0D
>    (+f0) sel(8)        g8<1>UD     g4<4,4,1>UD     g3<4,4,1>UD
> 
> we now emit
> 
>    cmp.nz.f0(8)        null<1>D    g1<0,4,1>D      0D
>    (+f0.any4h) sel(8)  g9<1>UD     g4<4,4,1>UD     g3<4,4,1>UD
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  2 +
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 96 
> --
>  2 files changed, 80 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 25b1139..ae98325 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -313,6 +313,8 @@ public:
>  
> bool is_high_sampler(src_reg sampler);
>  
> +   bool optimize_predicate(nir_alu_instr *instr, enum brw_predicate *predicate);
> +
> virtual void emit_nir_code();
> virtual void nir_setup_inputs();
> virtual void nir_setup_uniforms();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 457..f734171 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -928,6 +928,62 @@ brw_conditional_for_nir_comparison(nir_op op)
> }
>  }
>  
> +bool
> +vec4_visitor::optimize_predicate(nir_alu_instr *instr,
> + enum brw_predicate *predicate)
> +{
> +   if (!instr->src[0].src.is_ssa ||
> +   !instr->src[0].src.ssa->parent_instr)

I think that parent_instr will never be NULL, so this check is unnecessary.

With that fixed (or not),
Reviewed-by: Kenneth Graunke 
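
For readers following the two instruction sequences in the commit message, here is a rough C model. It assumes that an any4h predicate is true when any of the four channel flags in a vec4 is set. The function names are invented for this sketch and nothing here is i965 backend code; it only shows that, under that assumption, the shorter sequence selects the same values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed meaning of the any4h predicate: true if any channel flag is set. */
bool any4(const bool flag[4])
{
   return flag[0] || flag[1] || flag[2] || flag[3];
}

/* Model of the original five-instruction sequence. */
void select_old(const int32_t cond[4], const uint32_t a[4],
                const uint32_t b[4], uint32_t out[4])
{
   bool flag[4];
   uint32_t tmp = 0;                      /* mov: tmp.x = 0 */

   for (int c = 0; c < 4; c++)
      flag[c] = cond[c] != 0;             /* cmp.nz.f0 */
   if (any4(flag))
      tmp = ~0u;                          /* (+f0.any4h) mov: tmp.x = ~0 */
   for (int c = 0; c < 4; c++)
      flag[c] = tmp != 0;                 /* cmp.nz.f0 on the replicated .x */
   for (int c = 0; c < 4; c++)
      out[c] = flag[c] ? a[c] : b[c];     /* (+f0) sel */
}

/* Model of the new two-instruction sequence. */
void select_new(const int32_t cond[4], const uint32_t a[4],
                const uint32_t b[4], uint32_t out[4])
{
   bool flag[4];

   for (int c = 0; c < 4; c++)
      flag[c] = cond[c] != 0;             /* cmp.nz.f0 */

   const bool pred = any4(flag);          /* any4h predicate on the sel */
   for (int c = 0; c < 4; c++)
      out[c] = pred ? a[c] : b[c];        /* (+f0.any4h) sel */
}
```

Running both functions over all 16 combinations of zero and nonzero condition channels gives identical outputs in this model.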

> +  return false;
> +
> +   if (instr->src[0].src.ssa->parent_instr->type != nir_instr_type_alu)
> +  return false;
> +
> +   nir_alu_instr *cmp_instr =
> +  nir_instr_as_alu(instr->src[0].src.ssa->parent_instr);
> +
> +   switch (cmp_instr->op) {
> +   case nir_op_bany_fnequal2:
> +   case nir_op_bany_inequal2:
> +   case nir_op_bany_fnequal3:
> +   case nir_op_bany_inequal3:
> +   case nir_op_bany_fnequal4:
> +   case nir_op_bany_inequal4:
> +  *predicate = BRW_PREDICATE_ALIGN16_ANY4H;
> +  break;
> +   case nir_op_ball_fequal2:
> +   case nir_op_ball_iequal2:
> +   case nir_op_ball_fequal3:
> +   case nir_op_ball_iequal3:
> +   case nir_op_ball_fequal4:
> +   case nir_op_ball_iequal4:
> +  *predicate = BRW_PREDICATE_ALIGN16_ALL4H;
> +  break;
> +   default:
> +  return false;
> +   }
> +
> +   unsigned size_swizzle =
> +  brw_swizzle_for_size(nir_op_infos[cmp_instr->op].input_sizes[0]);
> +
> +   src_reg op[2];
> +   assert(nir_op_infos[cmp_instr->op].num_inputs == 2);
> +   for (unsigned i = 0; i < 2; i++) {
> +  op[i] = get_nir_src(cmp_instr->src[i].src,
> +  nir_op_infos[cmp_instr->op].input_types[i], 4);
> +  unsigned base_swizzle =
> + brw_swizzle_for_nir_swizzle(cmp_instr->src[i].swizzle);
> +  op[i].swizzle = brw_compose_swizzle(size_swizzle, base_swizzle);
> +  op[i].abs = cmp_instr->src[i].abs;
> +  op[i].negate = cmp_instr->src[i].negate;
> +   }
> +
> +   emit(CMP(dst_null_d(), op[0], op[1],
> +brw_conditional_for_nir_comparison(cmp_instr->op)));
> +
> +   return true;
> +}
> +
>  void
>  vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>  {
> @@ -1418,25 +1474,29 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>break;
>  
> case nir_op_bcsel:
> -  emit(CMP(dst_null_d(), op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ));
> -  inst = emit(BRW_OPCODE_SEL, dst, op[1], op[2]);
> -  switch (dst.writemask) {
> -  case WRITEMASK_X:
> - inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_X;
> - break;
> -  case WRITEMASK_Y:
> - inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Y;
> - break;
> -  case WRITEMASK_Z:
> - inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Z;
> - break;
> -  case WRITEMASK_W:
> - inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_W;
> - break;
> -  default:
> - inst->predicate = BRW_PREDICATE_NORMAL;
> - break;
> +  enum brw_predicate predicate;
> +  if (!optimize_predicate(instr, &predicate)) {
> + emit(CMP(dst_null_d(), op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ));
> + switch (dst.writemask) {
> + case WRITEMASK_X:
> +predicate = BRW_PREDICATE_ALIGN16_REPLICATE_X;
> +break;
> + case WRITEMASK_Y:
> +predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Y;
> +break;
> + case WRITEMASK_Z:
> +predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Z;
> +break;
> + case WRITEMASK_W:
> +   

Re: [Mesa-dev] [PATCH] meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER

2015-12-03 Thread Anuj Phogat
On Thu, Dec 3, 2015 at 2:43 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since
> _mesa_meta_begin hasn't been called yet, we have to work-around API
> difficulties.  The whole reason that GL_DRAW_FRAMEBUFFER is used instead
> of GL_FRAMEBUFFER is that the read framebuffer may be different.  This
> is moot in OpenGL ES 1.x.
>
> I have another patch series that would also fix this (by removing the
> calls to _mesa_BindFramebuffer and friends), but it's not quite ready
> yet... and I think it may be a bit heavy for some stable branches.
> Consider this a stop-gap fix.
>
> Signed-off-by: Ian Romanick 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93215
> Cc: "11.0 11.1" 
> ---
>  src/mesa/drivers/common/meta_generate_mipmap.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
> b/src/mesa/drivers/common/meta_generate_mipmap.c
> index d38e6b8..2b942d6 100644
> --- a/src/mesa/drivers/common/meta_generate_mipmap.c
> +++ b/src/mesa/drivers/common/meta_generate_mipmap.c
> @@ -62,6 +62,15 @@ fallback_required(struct gl_context *ctx, GLenum target,
> GLuint srcLevel;
> GLenum status;
>
> +   /* GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since
> +* _mesa_meta_begin hasn't been called yet, we have to work-around API
> +* difficulties.  The whole reason that GL_DRAW_FRAMEBUFFER is used instead
> +* of GL_FRAMEBUFFER is that the read framebuffer may be different.  This
> +* is moot in OpenGL ES 1.x.
> +*/
> +   const GLenum fbo_target = ctx->API == API_OPENGLES
> +  ? GL_FRAMEBUFFER : GL_DRAW_FRAMEBUFFER;
> +
> /* check for fallbacks */
> if (target == GL_TEXTURE_3D) {
>_mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
> @@ -102,13 +111,13 @@ fallback_required(struct gl_context *ctx, GLenum target,
>  */
> if (!mipmap->FBO)
>_mesa_GenFramebuffers(1, &mipmap->FBO);
> -   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, mipmap->FBO);
> +   _mesa_BindFramebuffer(fbo_target, mipmap->FBO);
>
> -   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, 
> baseImage, 0);
> +   _mesa_meta_bind_fbo_image(fbo_target, GL_COLOR_ATTACHMENT0, baseImage, 0);
>
> -   status = _mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER);
> +   status = _mesa_CheckFramebufferStatus(fbo_target);
>
> -   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fboSave);
> +   _mesa_BindFramebuffer(fbo_target, fboSave);
>
> if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
>_mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Reviewed-by: Anuj Phogat 
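
For anyone skimming the thread, the target selection in the patch comes down to one choice per context API. A minimal standalone sketch, assuming nothing about Mesa's internals: the ex_ names are hypothetical stand-ins (not Mesa's types), and the two enum values are the standard GL ones.

```c
#include <assert.h>

/* Hypothetical stand-in for the context API; not Mesa's gl_api enum. */
enum ex_api { EX_API_GL_COMPAT, EX_API_GLES1, EX_API_GLES2, EX_API_GL_CORE };

/* Standard GL enum values, renamed to keep the sketch self-contained. */
#define EX_GL_DRAW_FRAMEBUFFER 0x8CA9u
#define EX_GL_FRAMEBUFFER      0x8D40u

/* ES 1.x has no separate draw/read binding points, so fall back to the
 * combined GL_FRAMEBUFFER target there and use the draw target elsewhere.
 */
static unsigned
ex_pick_fbo_target(enum ex_api api)
{
   assert(api <= EX_API_GL_CORE);
   return api == EX_API_GLES1 ? EX_GL_FRAMEBUFFER : EX_GL_DRAW_FRAMEBUFFER;
}
```

Picking the target once and reusing it for bind, attach, status check and restore keeps the four calls consistent, which is what the patch does with fbo_target.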
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics

2015-12-03 Thread Eric Anholt
Jason Ekstrand  writes:

> Cc: Eric Anholt 

OK, I've pushed a branch of partial fixes for this series to nir-loads
of my fdo tree, but it's still super broken.  I've spent too much of
today on it, and this series was not ready for review.  There are still
const_index[0] references all over that are obviously from intrinsics
you've removed the const_index on.

I'd recommend, if you're going to resubmit this series, that you submit
it as two separate patches, one for stores and one for loads, and sanity
check the remaining const_index[0] references.




[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #40 from bellamort...@gmail.com ---
compiling with -O3 -fno-inline-small-functions works.



Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-03 Thread Aaron Watry
On Thu, Dec 3, 2015 at 12:57 PM, Dave Airlie  wrote:

>
> On 4 Dec 2015 03:01, "Aaron Watry"  wrote:
> >
> > Hi Dave (and others),
> >
> > I cloned your fdo r600g-tess-submit branch and gave it a spin on CEDAR
> > (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few issues.
>
> Just grab r600g-tess-staging
>
> -submit only worked on cayman, but I'd posted it here so didn't want to
> change it during review.
>

Yup, that branch works without modification with the version of heaven in
the phoronix-test-suite.

--Aaron


> Dave.
>
> >
> > 1) Initially, I got an assertion in r600_add_atom stating that the atom
> > ID was not less than the R600_NUM_ATOMS value (id = 51, R600_NUM_ATOMS=51).
> >   I bumped R600_NUM_ATOMS to 52 for now, and that got rid of that
> > issue... although I have no idea if that was a correct fix.
> >
> > 2) Next, I kept getting a segfault in evergreen_adjust_gprs at line
> > 3931. Turns out that rctx->hw_shader_stages[2].shader was null
> > (missing/miscompiled GS?).
> >
> > I naively changed the code to the following, and now Heaven actually
> > runs with tessellation enabled (and it looks like it's working).
> >
> > /* gather required shader gprs */
> > for (i = 0; i < EG_NUM_HW_STAGES; i++) {
> > if (!rctx->hw_shader_stages[i].shader) {
> > num_gprs[i] = def_gprs[i];
> > continue;
> > }
> > num_gprs[i] = rctx->hw_shader_stages[i].shader->shader.bc.ngpr;
> > }
> >
> > Just figured that I'd let you know...
> >
> > If you don't have CEDAR hardware to test with, feel free to ping me to
> > test any additional changes.  Note that I didn't run the benchmark to
> > completion (too slow, had to get other work done), but it didn't hang my
> > GPU in the time that I did have it running.
> >
> > --Aaron
> >
> >
> > On Mon, Nov 30, 2015 at 12:20 AM, Dave Airlie  wrote:
> >>
> >> Hi,
> >>
> >> Patchbomb time, this set of patches is a first pass at adding
> >> ARB_tessellation_shader support to the r600g driver. Only Evergreen
> >> and newer GPUs support tessellation. On any of the GPUs that support
> >> native FP64, this will enable OpenGL 4.1 on them.
> >>
> >> The first bunch of patches are a bit of a driver rework to get
> >> things in better shape for tessellation, they shouldn't cause
> >> any regressions.
> >>
> >> This runs heaven on cayman and should pass all the piglits
> >> unless I've done something wrong.
> >>
> >> Development hit two HW programming fun times, one with tess and
> >> dynamic GPR interaction requiring disabling dynamic GPRs, and
> >> one with programming of some SIMD registers to block TESS shaders
> >> on one unit. These fixed most of the hangs we saw during development.
> >>
> >> This doesn't contain SB support yet, Glenn has started working on it.
> >>
> >> Currently tested hw:
> >> working: CAYMAN, REDWOOD, BARTS, TURKS
> >> hangs on any tessellation: CAICOS
> >> hangs differently at least with heaven: SUMO
> >>
> >> This patchset doesn't block it on any GPUs, but when merged it
> >> probably should.
> >>
> >> Also available at:
> >> http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-tess-submit
> >>
> >> Thanks to Glenn Kennard for lots of discussion and testing.
> >>
> >> Dave.
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
>


Re: [Mesa-dev] [PATCH 03/10] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-03 Thread Eric Anholt
Jason Ekstrand  writes:

> There is some special-casing needed in a competent back-end.  However, they
> can do their special-casing easily enough based on whether or not the
> offset is a constant.  In the mean time, having the *_indirect variants
> adds special cases in a number of places where they don't need to be and, in
> general, only complicates things.  To complicate matters, NIR had no way to
> convert an indirect load/store to a direct one in the case that the
> indirect was a constant so we would still not really get what the back-ends
> wanted.  The best solution seems to be to get rid of the *_indirect
> variants entirely.

I've been putting off debugging this series because NIR intrinsic
documentation has been bad at const_index[] versus src[] explanations
and I only ever get my code working through cargo culting.  It looks
like you simplify things a lot, but I'm going to ask for some
clarification still.

> ---
>  src/glsl/nir/nir_intrinsics.h   | 64 
> -
>  src/glsl/nir/nir_lower_phis_to_scalar.c |  4 ---
>  src/glsl/nir/nir_print.c| 19 +-
>  3 files changed, 38 insertions(+), 49 deletions(-)
>
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index b2565c5..0fa5a27 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -228,54 +228,50 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
>  SYSTEM_VALUE(helper_invocation, 1, 0)
>  
>  /*
> - * The format of the indices depends on the type of the load.  For uniforms,
> - * the first index is the base address and the second index is an offset that
> - * should be added to the base address.  (This way you can determine in the
> - * back-end which variable is being accessed even in an array.)  For inputs,
> - * the one and only index corresponds to the attribute slot.  UBO loads also
> - * have a single index which is the base address to load from.
> + * All load operations have a source specifying an offset which may or may
> + * not be constant.  If the shader is still in SSA or partial SSA form, then
> + * determining whether or not the offset is constant is trivial.  This is
> + * always the last source in the intrinsic.
>   *
> - * UBO loads have a (possibly constant) source which is the UBO buffer index.
> - * For each type of load, the _indirect variant has one additional source
> - * (the second in the case of UBO's) that is the is an indirect to be added 
> to
> - * the constant address or base offset to compute the final offset.
> + * Uniforms have a constant index that provides a secondary base offset that
> + * should be added to the offset from the source.  This allows back-ends to
> + * determine which uniform variable is being accessed.

For clarity, since we have things that are indices/offsets and might be
constants but appear in src[]: s/constant index/constant_index/.
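
To make the const_index[] versus src[] split concrete, here is a rough sketch of how a back-end might tell a direct access from an indirect one under the scheme described in the new comment. Everything here is a simplified, hypothetical model (the ex_ structs are not the NIR types); it only illustrates "compile-time base plus an offset source that may or may not be constant".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an intrinsic source; not nir_src. */
struct ex_src {
   bool is_const;     /* true if the value is known at compile time */
   unsigned value;    /* only meaningful when is_const is true */
};

/* Hypothetical model of a uniform load after the rework. */
struct ex_load {
   unsigned const_index0;  /* compile-time base, lives in const_index[] */
   struct ex_src offset;   /* last source: constant or dynamic offset */
};

/* Returns true and writes the final address when the access is direct,
 * i.e. when the offset source is a constant.  Returns false for an
 * indirect access, where the back-end has to emit address arithmetic.
 */
static bool
ex_load_direct_address(const struct ex_load *load, unsigned *addr)
{
   assert(load != NULL && addr != NULL);
   if (!load->offset.is_const)
      return false;
   *addr = load->const_index0 + load->offset.value;
   return true;
}
```

With this shape there is no need for a separate *_indirect opcode: the special-casing is a single check on the offset source.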

> - * For vector backends, the address is in terms of one vec4, and so each 
> array
> - * element is +4 scalar components from the previous array element. For 
> scalar
> - * backends, the address is in terms of a single 4-byte float/int and arrays
> - * elements begin immediately after the previous array element.
> + * UBO and SSBO loads have a (possibly constant) source which is the UBO
> + * buffer index.  The pervertex_input intrinsic has a source which specifies
> + * the (possibly constant) vertex id to load from.
> + *
> + * The exact address type depends on the lowering pass that generates the
> + * load/store intrinsics.  Typically, this is vec4 units for things such as
> + * varying slots and float units for fragment shader inputs.  UBO and SSBO
> + * offsets are always in bytes.
>   */
>  
>  #define LOAD(name, extra_srcs, indices, flags) \
> -   INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \
> -   INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
> - true, 0, 0, indices, flags)
> +   INTRINSIC(load_##name, extra_srcs + 1, ARR(1, 1), true, 0, 0, indices, 
> flags)
>  
> -LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)

Uniforms had two const_index entries before?  I've only ever used one.

> -LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> -LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> -LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
> -LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> -LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> -LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
> +LOAD(uniform, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> +LOAD(ubo, 1, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> +LOAD(input, 0, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> +LOAD(per_vertex_input, 1, 0, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
> +LOAD(ssbo, 1, 0, NIR_INTRINSIC_CAN_ELIMINATE)
> +L

Re: [Mesa-dev] [PATCH] radeonsi: fix Fiji for LLVM <= 3.7

2015-12-03 Thread Alex Deucher
On Thu, Dec 3, 2015 at 5:51 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Cc: 11.0 11.1 

Reviewed-by: Alex Deucher 

> ---
>  src/gallium/drivers/radeon/r600_pipe_common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 1ed5eb7..f566a29 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -555,10 +555,11 @@ const char *r600_get_llvm_processor_name(enum 
> radeon_family family)
> case CHIP_TONGA: return "tonga";
> case CHIP_ICELAND: return "iceland";
> case CHIP_CARRIZO: return "carrizo";
> -   case CHIP_FIJI: return "fiji";
>  #if HAVE_LLVM <= 0x0307
> +   case CHIP_FIJI: return "tonga";
> case CHIP_STONEY: return "carrizo";
>  #else
> +   case CHIP_FIJI: return "fiji";
> case CHIP_STONEY: return "stoney";
>  #endif
> default: return "";
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi: fix Fiji for LLVM <= 3.7

2015-12-03 Thread Marek Olšák
From: Marek Olšák 

Cc: 11.0 11.1 
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 1ed5eb7..f566a29 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -555,10 +555,11 @@ const char *r600_get_llvm_processor_name(enum 
radeon_family family)
case CHIP_TONGA: return "tonga";
case CHIP_ICELAND: return "iceland";
case CHIP_CARRIZO: return "carrizo";
-   case CHIP_FIJI: return "fiji";
 #if HAVE_LLVM <= 0x0307
+   case CHIP_FIJI: return "tonga";
case CHIP_STONEY: return "carrizo";
 #else
+   case CHIP_FIJI: return "fiji";
case CHIP_STONEY: return "stoney";
 #endif
default: return "";
-- 
2.1.4



Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Ilia Mirkin
On Thu, Dec 3, 2015 at 4:44 AM, Edward O'Callaghan
 wrote:
> ARB_clear_texture is reasonably generic enough that it should
> be moved down to become part of the fallback mechanism of
> pipe->clear_texture.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/auxiliary/util/u_surface.c  | 83 
> +
>  src/gallium/auxiliary/util/u_surface.h  |  6 ++
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
>  3 files changed, 90 insertions(+), 66 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_surface.c 
> b/src/gallium/auxiliary/util/u_surface.c
> index 6aa44f9..e7ab175 100644
> --- a/src/gallium/auxiliary/util/u_surface.c
> +++ b/src/gallium/auxiliary/util/u_surface.c
> @@ -36,6 +36,7 @@
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
>
> +#include "util/u_math.h"
>  #include "util/u_format.h"
>  #include "util/u_inlines.h"
>  #include "util/u_rect.h"
> @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
> }
>  }
>
> +/**
> + * Fallback for pipe->clear_texture() function.
> + * clears a non-PIPE_BUFFER resource's specified level
> + * and bounding box with a clear value provided in that
> + * resource's native format.
> + *
> + * XXX sf->format = .. is problematic as hw need
> + * not nessarily support the format.
> + */
> +void
> +util_surface_clear_texture(struct pipe_context *pipe,
> +   struct pipe_resource *res,
> +   unsigned level,
> +   const struct pipe_box *box,
> +   const void *data)
> +{
> +   struct pipe_surface tmpl = {{0}}, *sf;
> +
> +   tmpl.format = res->format;
> +   tmpl.u.tex.first_layer = box->z;
> +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
> +   tmpl.u.tex.level = level;
> +   sf = pipe->create_surface(pipe, res, &tmpl);
> +   if (!sf)
> +  return;
> +
> +   if (util_format_is_depth_or_stencil(res->format)) {
> +  float depth = 0;
> +  uint8_t stencil = 0;
> +  unsigned clear = 0;
> +  const struct util_format_description *desc =
> + util_format_description(res->format);
> +
> +  if (util_format_has_depth(desc)) {
> + clear |= PIPE_CLEAR_DEPTH;
> + desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
> +  }
> +  if (util_format_has_stencil(desc)) {
> + clear |= PIPE_CLEAR_STENCIL;
> + desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
> +  }
> +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
> +box->x, box->y, box->width, box->height);
> +   } else {
> +  union pipe_color_union color;
> +
> +  switch (util_format_get_blocksizebits(res->format)) {
> +  case 128:
> + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
> + memcpy(&color.ui, data, 128 / 8);
> + break;
> +  case 64:
> + sf->format = PIPE_FORMAT_R32G32_UINT;
> + memcpy(&color.ui, data, 64 / 8);
> + memset(&color.ui[2], 0, 64 / 8);
> + break;
> +  case 32:
> + sf->format = PIPE_FORMAT_R32_UINT;
> + memcpy(&color.ui, data, 32 / 8);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 16:
> + sf->format = PIPE_FORMAT_R16_UINT;
> + color.ui[0] = util_cpu_to_le32(
> +util_le16_to_cpu(*(unsigned short *)data));
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 8:
> + sf->format = PIPE_FORMAT_R8_UINT;
> + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  default:
> + assert(!"Unknown texel element size");
> + return;
> +  }
> +
> +  pipe->clear_render_target(pipe, sf, &color,
> +box->x, box->y, box->width, box->height);

I only recently realized this, but I'm fairly sure this is wrong --
needs to be divided by util_format_blockwidth/height, otherwise we'll
go way out of bounds for compressed formats. [And yes,
nv50_clear_texture has the same problem.]
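
A minimal sketch of the conversion I mean, assuming the box origin is block-aligned (as it should be for compressed formats). The ex_ names are hypothetical, and blockw/blockh stand for whatever the format helpers report as the block dimensions in texels.

```c
#include <assert.h>

/* Hypothetical stand-in for the x/y/width/height part of a pipe_box. */
struct ex_box {
   unsigned x, y, width, height;
};

/* Round-up division, needed for partial blocks at the right/bottom edge. */
static unsigned
ex_div_round_up(unsigned n, unsigned d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

/* Convert a box expressed in texels into one expressed in format blocks.
 * For uncompressed formats blockw == blockh == 1 and the box is unchanged;
 * for a 4x4 block-compressed format every dimension shrinks by 4.
 */
static struct ex_box
ex_box_texels_to_blocks(struct ex_box in, unsigned blockw, unsigned blockh)
{
   struct ex_box out;
   out.x = in.x / blockw;
   out.y = in.y / blockh;
   out.width = ex_div_round_up(in.width, blockw);
   out.height = ex_div_round_up(in.height, blockh);
   return out;
}
```

Passing the texel-sized box straight through while the surface is reinterpreted as an integer format is what walks off the end of a compressed resource.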

> +   }
> +   pipe->surface_destroy(pipe, sf);
> +}
>
>  /* Return if the box is totally inside the resource.
>   */
> diff --git a/src/gallium/auxiliary/util/u_surface.h 
> b/src/gallium/auxiliary/util/u_surface.h
> index bfd8f40..069a393 100644
> --- a/src/gallium/auxiliary/util/u_surface.h
> +++ b/src/gallium/auxiliary/util/u_surface.h
> @@ -97,6 +97,12 @@ util_clear_depth_stencil(struct pipe_context *pipe,
>   unsigned stencil,
>   unsigned dstx, unsigned dsty,
>   unsigned width, unsigned height);
> +extern void
> +util_surface_clear_texture(struct pipe_context *pipe,
> +   struct pipe_resource *res,
> +   unsigned level,
> +   const struct pipe_box *box,
> +   const void

Re: [Mesa-dev] [RFC 1/3] gallium/aux/util: Trivial, we already have format use it

2015-12-03 Thread Marek Olšák
Pushed this patch.

Thanks,

Marek

On Thu, Dec 3, 2015 at 10:44 AM, Edward O'Callaghan
 wrote:
> No need to dereference again, fixup for clarity.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/auxiliary/util/u_surface.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_surface.c 
> b/src/gallium/auxiliary/util/u_surface.c
> index 70ed911..6aa44f9 100644
> --- a/src/gallium/auxiliary/util/u_surface.c
> +++ b/src/gallium/auxiliary/util/u_surface.c
> @@ -397,7 +397,7 @@ util_clear_render_target(struct pipe_context *pipe,
>   }
>}
>else {
> - util_pack_color(color->f, dst->format, &uc);
> + util_pack_color(color->f, format, &uc);
>}
>
>util_fill_box(dst_map, dst->format,
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER

2015-12-03 Thread Ian Romanick
From: Ian Romanick 

GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since
_mesa_meta_begin hasn't been called yet, we have to work-around API
difficulties.  The whole reason that GL_DRAW_FRAMEBUFFER is used instead
of GL_FRAMEBUFFER is that the read framebuffer may be different.  This
is moot in OpenGL ES 1.x.

I have another patch series that would also fix this (by removing the
calls to _mesa_BindFramebuffer and friends), but it's not quite ready
yet... and I think it may be a bit heavy for some stable branches.
Consider this a stop-gap fix.

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93215
Cc: "11.0 11.1" 
---
 src/mesa/drivers/common/meta_generate_mipmap.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
b/src/mesa/drivers/common/meta_generate_mipmap.c
index d38e6b8..2b942d6 100644
--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -62,6 +62,15 @@ fallback_required(struct gl_context *ctx, GLenum target,
GLuint srcLevel;
GLenum status;
 
+   /* GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since
+* _mesa_meta_begin hasn't been called yet, we have to work-around API
+* difficulties.  The whole reason that GL_DRAW_FRAMEBUFFER is used instead
+* of GL_FRAMEBUFFER is that the read framebuffer may be different.  This
+* is moot in OpenGL ES 1.x.
+*/
+   const GLenum fbo_target = ctx->API == API_OPENGLES
+  ? GL_FRAMEBUFFER : GL_DRAW_FRAMEBUFFER;
+
/* check for fallbacks */
if (target == GL_TEXTURE_3D) {
   _mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
@@ -102,13 +111,13 @@ fallback_required(struct gl_context *ctx, GLenum target,
 */
if (!mipmap->FBO)
   _mesa_GenFramebuffers(1, &mipmap->FBO);
-   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, mipmap->FBO);
+   _mesa_BindFramebuffer(fbo_target, mipmap->FBO);
 
-   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, 
baseImage, 0);
+   _mesa_meta_bind_fbo_image(fbo_target, GL_COLOR_ATTACHMENT0, baseImage, 0);
 
-   status = _mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER);
+   status = _mesa_CheckFramebufferStatus(fbo_target);
 
-   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fboSave);
+   _mesa_BindFramebuffer(fbo_target, fboSave);
 
if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
   _mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
-- 
2.5.0



[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #39 from bellamort...@gmail.com ---
Ugh, bisection is done.
CXXFLAG -finline-small-functions causes the segfault.



Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 2:05 PM, Emil Velikov  wrote:
> Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
> to the ES dispatch".
>
> Note: as the core symbol is present in GLES 1.1 we cannot (should not)
> include the KHR one in the es11 table. Add the symbol, commented out,
> with description for posterity.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
> Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 97f81f9..687c8f3 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
> { "glGetDebugMessageLogKHR", 11, -1 },
> { "glGetObjectLabelKHR", 11, -1 },
> { "glGetObjectPtrLabelKHR", 11, -1 },
> +   // The following clashes with the non KHR definition above

We have comments elsewhere like

   // We check for the aliased -OES version in GLES 2

Can you make the comment match that?

> +//   { "glGetPointervKHR", 11, -1 },

Indent.


Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Vinson Lee
On Thu, Dec 3, 2015 at 2:05 PM, Emil Velikov  wrote:
> Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
> to the ES dispatch".
>
> Note: as the core symbol is present in GLES 1.1 we cannot (should not)
> include the KHR one in the es11 table. Add the symbol, commented out,
> with description for posterity.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
> Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 97f81f9..687c8f3 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
> { "glGetDebugMessageLogKHR", 11, -1 },
> { "glGetObjectLabelKHR", 11, -1 },
> { "glGetObjectPtrLabelKHR", 11, -1 },
> +   // The following clashes with the non KHR definition above
> +//   { "glGetPointervKHR", 11, -1 },
> { "glObjectLabelKHR", 11, -1 },
> { "glObjectPtrLabelKHR", 11, -1 },
>
> @@ -2284,6 +2286,7 @@ const struct function gles2_functions_possible[] = {
> { "glGetDebugMessageLogKHR", 20, -1 },
> { "glGetObjectLabelKHR", 20, -1 },
> { "glGetObjectPtrLabelKHR", 20, -1 },
> +   { "glGetPointervKHR", 20, -1 },
> { "glObjectLabelKHR", 20, -1 },
> { "glObjectPtrLabelKHR", 20, -1 },
>
> --
> 2.6.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Tested-by: Vinson Lee 


[Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-03 Thread Emil Velikov
Should have been part of commit f53f9eb8d49 "glapi: add GetPointervKHR
to the ES dispatch".

Note: as the core symbol is present in GLES 1.1 we cannot (should not)
include the KHR one in the es11 table. Add the symbol, commented out,
with description for posterity.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235
Fixes: f53f9eb8d49 "glapi: add GetPointervKHR to the ES dispatch".
Signed-off-by: Emil Velikov 
---
 src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 97f81f9..687c8f3 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -2049,6 +2049,8 @@ const struct function gles11_functions_possible[] = {
{ "glGetDebugMessageLogKHR", 11, -1 },
{ "glGetObjectLabelKHR", 11, -1 },
{ "glGetObjectPtrLabelKHR", 11, -1 },
+   // The following clashes with the non KHR definition above
+//   { "glGetPointervKHR", 11, -1 },
{ "glObjectLabelKHR", 11, -1 },
{ "glObjectPtrLabelKHR", 11, -1 },
 
@@ -2284,6 +2286,7 @@ const struct function gles2_functions_possible[] = {
{ "glGetDebugMessageLogKHR", 20, -1 },
{ "glGetObjectLabelKHR", 20, -1 },
{ "glGetObjectPtrLabelKHR", 20, -1 },
+   { "glGetPointervKHR", 20, -1 },
{ "glObjectLabelKHR", 20, -1 },
{ "glObjectPtrLabelKHR", 20, -1 },
 
-- 
2.6.2



Re: [Mesa-dev] [PATCH v3 09/44] i965/hsw: Enable L3 atomics.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:27 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> Improves performance of the arb_shader_image_load_store-atomicity
> piglit test by over 25x (which isn't a real benchmark it's just heavy
> on atomics -- the improvement in a microbenchmark I wrote a while ago
> seemed to be even greater).  The drawback is one needs to be
> extra-careful not to hang the GPU (in fact the whole system).  A DC
> partition must have been allocated on L3, the "convert L3 cycle for DC
> to UC" bit may not be set, the MOCS L3 cacheability bit must be set
> for all surfaces accessed using DC atomics, and the SCRATCH1 and
> ROW_CHICKEN3 bits must be kept in sync.
> 
> A fairly recent kernel is required for the command parser to allow
> writes to these registers.
> 
> Reviewed-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 108f3a8..9aad563 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct 
> brw_l3_config *cfg)
>  SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>  
>ADVANCE_BATCH();
> +
> +  if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
> + /* Enable L3 atomics on HSW if we have a DC partition, otherwise 
> keep
> +  * them disabled to avoid crashing the system hard.
> +  */
> + BEGIN_BATCH(5);
> + OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
> + OUT_BATCH(HSW_SCRATCH1);
> + OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
> + OUT_BATCH(HSW_ROW_CHICKEN3);
> + OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> +   (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
> + ADVANCE_BATCH();
> +  }
> }
>  }
> 

This seems reasonable, so assuming it works,

Acked-by: Kenneth Graunke 
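
For reference, the ROW_CHICKEN3 dword in the patch has the shape of a masked register write. A small sketch of that encoding, under the assumption (taken from how the patch builds the value, not from the hardware docs) that the upper 16 bits select which of the lower 16 bits get written; ex_masked_reg_value and the bit used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build the dword for a masked register write.
 * 'bit' is the field to touch (must fit in the low 16 bits); 'set' says
 * whether to set or clear it.  The same bit shifted into the upper half
 * acts as the write-enable mask, so other fields are left alone.
 */
static uint32_t
ex_masked_reg_value(uint32_t bit, int set)
{
   assert((bit & 0xffff0000u) == 0);
   return (bit << 16) | (set ? bit : 0u);
}
```

With a made-up bit of 0x40, setting it yields 0x00400040 and clearing it yields 0x00400000, which mirrors the has_dc ? 0 : ..._DISABLE choice in the patch.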




Re: [Mesa-dev] [PATCH v3 14/44] i965: Add debug flag to print out the new L3 state during transitions.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:32 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> Reviewed-by: Jordan Justen 
> Reviewed-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 17 +
>  src/mesa/drivers/dri/i965/intel_debug.c   |  1 +
>  src/mesa/drivers/dri/i965/intel_debug.h   |  1 +
>  3 files changed, 19 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index a895723..b3b5b2e 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -456,6 +456,18 @@ update_urb_size(struct brw_context *brw, const struct 
> brw_l3_config *cfg)
> }
>  }
>  
> +/**
> + * Print out the specified L3 configuration.
> + */
> +static void
> +dump_l3_config(const struct brw_l3_config *cfg)
> +{
> +   fprintf(stderr, "SLM=%d URB=%d ALL=%d DC=%d RO=%d IS=%d C=%d T=%d\n",
> +   cfg->n[L3P_SLM], cfg->n[L3P_URB], cfg->n[L3P_ALL],
> +   cfg->n[L3P_DC], cfg->n[L3P_RO],
> +   cfg->n[L3P_IS], cfg->n[L3P_C], cfg->n[L3P_T]);
> +}
> +
>  static void
>  emit_l3_state(struct brw_context *brw)
>  {
> @@ -485,6 +497,11 @@ emit_l3_state(struct brw_context *brw)
>setup_l3_config(brw, cfg);
>update_urb_size(brw, cfg);
>brw->l3.config = cfg;
> +
> +  if (unlikely(INTEL_DEBUG & DEBUG_L3)) {
> + fprintf(stderr, "L3 config transition (%f > %f): ", dw, 
> dw_threshold);
> + dump_l3_config(cfg);
> +  }
> }
>  }
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
> b/src/mesa/drivers/dri/i965/intel_debug.c
> index d073d66..e08c296 100644
> --- a/src/mesa/drivers/dri/i965/intel_debug.c
> +++ b/src/mesa/drivers/dri/i965/intel_debug.c
> @@ -78,6 +78,7 @@ static const struct debug_control debug_control[] = {
> { "tcs", DEBUG_TCS },
> { "ds",  DEBUG_TES },
> { "tes", DEBUG_TES },
> +   { "l3",  DEBUG_L3 },
> { NULL,0 }
>  };
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
> b/src/mesa/drivers/dri/i965/intel_debug.h
> index 175ac68..b7b5111 100644
> --- a/src/mesa/drivers/dri/i965/intel_debug.h
> +++ b/src/mesa/drivers/dri/i965/intel_debug.h
> @@ -71,6 +71,7 @@ extern uint64_t INTEL_DEBUG;
>  #define DEBUG_NO_COMPACTION   (1ull << 35)
>  #define DEBUG_TCS (1ull << 36)
>  #define DEBUG_TES (1ull << 37)
> +#define DEBUG_L3  (1ull << 38)
>  
>  #ifdef HAVE_ANDROID_PLATFORM
>  #define LOG_TAG "INTEL-MESA"
> 

Good call, this will be really nice to have as a debug flag.

Reviewed-by: Kenneth Graunke 
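
For readers following along, here is a minimal sketch of how a name/flag table like the quoted debug_control[] can turn a comma-separated string into a 64-bit mask. The function name parse_debug_flags is hypothetical and this is not Mesa's actual parser; only the table entries and the DEBUG_* bit positions come from the patch. It also shows why the flags need 1ull: bit 38 does not fit in a 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit positions copied from the quoted intel_debug.h hunk. */
#define DEBUG_TCS (1ull << 36)
#define DEBUG_TES (1ull << 37)
#define DEBUG_L3  (1ull << 38)

struct debug_control {
   const char *name;
   uint64_t flag;
};

/* Same shape as the table in the quoted intel_debug.c hunk. */
static const struct debug_control debug_control[] = {
   { "tcs", DEBUG_TCS },
   { "ds",  DEBUG_TES },
   { "tes", DEBUG_TES },
   { "l3",  DEBUG_L3 },
   { NULL,  0 }
};

/* Hypothetical helper: OR together the flags of every known name found
 * in a comma-separated list.  Unknown names are silently ignored.
 */
static uint64_t
parse_debug_flags(const char *s)
{
   uint64_t flags = 0;

   assert(s != NULL);
   while (*s) {
      const size_t n = strcspn(s, ",");   /* length of the current token */

      for (const struct debug_control *c = debug_control; c->name; c++) {
         if (strlen(c->name) == n && strncmp(c->name, s, n) == 0)
            flags |= c->flag;
      }

      s += n;
      if (*s == ',')
         s++;
   }
   return flags;
}
```

With this sketch, a string such as "tes,l3" yields DEBUG_TES | DEBUG_L3, and "ds" maps to the same bit as "tes".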


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 13/44] i965: Implement L3 state atom.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:31 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> The L3 state atom calculates the target L3 partition weights when the
> program bound to some shader stage is modified, and in case they are
> far enough from the current partitioning it makes sure that the L3
> state is re-emitted.
> 
> v3: Fix for inconsistent units the context URB size is expressed in.
> Clamp URB size to 1008 KB on SKL due to FF hardware limitation.
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   |  6 +++
>  src/mesa/drivers/dri/i965/brw_state.h |  1 +
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 81 +++
>  3 files changed, 88 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 0b91147..48024c6 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -672,6 +672,8 @@ enum brw_predicate_state {
>  
>  struct shader_times;
>  
> +struct brw_l3_config;
> +
>  /**
>   * brw_context is derived from gl_context.
>   */
> @@ -1214,6 +1216,10 @@ struct brw_context
> int basevertex;
>  
> struct {
> +  const struct brw_l3_config *config;
> +   } l3;
> +
> +   struct {
>drm_intel_bo *bo;
>const char **names;
>int *ids;
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
> index 94734ba..49f301a 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -129,6 +129,7 @@ extern const struct brw_tracked_state gen7_depthbuffer;
>  extern const struct brw_tracked_state gen7_clip_state;
>  extern const struct brw_tracked_state gen7_disable_stages;
>  extern const struct brw_tracked_state gen7_gs_state;
> +extern const struct brw_tracked_state gen7_l3_state;
>  extern const struct brw_tracked_state gen7_ps_state;
>  extern const struct brw_tracked_state gen7_push_constant_space;
>  extern const struct brw_tracked_state gen7_sbe_state;
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 0b5e231..a895723 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -418,3 +418,84 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
>}
> }
>  }
> +
> +/**
> + * Return the unit brw_context::urb::size is expressed in, in KB.  \sa
> + * brw_device_info::urb::size.
> + */
> +static unsigned
> +get_urb_size_scale(const struct brw_device_info *devinfo)
> +{
> +   return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
> +}
> +
> +/**
> + * Update the URB size in the context state for the specified L3
> + * configuration.
> + */
> +static void
> +update_urb_size(struct brw_context *brw, const struct brw_l3_config *cfg)
> +{
> +   const struct brw_device_info *devinfo = brw->intelScreen->devinfo;
> +   /* From the SKL "L3 Allocation and Programming" documentation:
> +*
> +* "URB is limited to 1008KB due to programming restrictions.  This is not
> +* a restriction of the L3 implementation, but of the FF and other clients.
> +* Therefore, in a GT4 implementation it is possible for the programmed
> +* allocation of the L3 data array to provide 3*384KB=1152KB for URB, but
> +* only 1008KB of this will be used."
> +*/
> +   const unsigned max = (devinfo->gen == 9 ? 1008 : ~0);

I think this would be clearer as devinfo->gt == 4, since it applies to
all currently known GT4 parts, and doesn't apply to GT1-3.  Presumably
GT1-3 just don't have enough URB to hit that limit, so it's probably
moot...*shrug*.

Acked-by: Kenneth Graunke 

> +   const unsigned sz =
> +  MIN2(max, cfg->n[L3P_URB] * get_l3_way_size(devinfo)) /
> +  get_urb_size_scale(devinfo);
> +
> +   if (brw->urb.size != sz) {
> +  brw->urb.size = sz;
> +  brw->ctx.NewDriverState |= BRW_NEW_URB_SIZE;
> +   }
> +}
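
To make the arithmetic in update_urb_size() concrete, here is a small standalone sketch of the same clamp-then-scale computation. The function name sketch_urb_size and its plain integer parameters are hypothetical stand-ins for the devinfo and cfg fields; the 1008 KB limit and the 3*384KB=1152KB figure come from the quoted SKL documentation, while the split of 1152 KB into 48 ways of 24 KB is only an assumed example.

```c
#include <assert.h>

/* Hypothetical standalone version of the size computation:
 *  - total URB allocation is (number of URB ways) * (way size in KB),
 *  - on gen 9 it is clamped to 1008 KB,
 *  - on gen 8+ the context URB size is expressed per slice, so the
 *    result is divided by the slice count.
 */
static unsigned
sketch_urb_size(unsigned gen, unsigned num_slices,
                unsigned urb_ways, unsigned way_size_kb)
{
   assert(num_slices > 0);

   const unsigned max = (gen == 9 ? 1008 : ~0u);
   const unsigned total = urb_ways * way_size_kb;
   const unsigned clamped = (total < max ? total : max);
   const unsigned scale = (gen >= 8 ? num_slices : 1);

   return clamped / scale;
}
```

Under these assumptions, a three-slice gen 9 part programmed with 1152 KB of URB ends up with 1008 / 3 = 336 in the context state, whereas a gen 7 part is neither clamped nor divided.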
> +
> +static void
> +emit_l3_state(struct brw_context *brw)
> +{
> +   const struct brw_l3_weights w = get_pipeline_state_l3_weights(brw);
> +   const float dw = diff_l3_weights(w, get_config_l3_weights(brw->l3.config));
> +   /* The distance between any two compatible weight vectors cannot exceed two
> +* due to the triangle inequality.
> +*/
> +   const float large_dw_threshold = 2.0;
> +   /* Somewhat arbitrary, simply makes sure that there will be no repeated
> +* transitions to the same L3 configuration, could probably do better here.
> +*/
> +   const float small_dw_threshold = 0.5;
> +   /* If we're emitting a new batch the caches should already be clean and the
> +* transition should be relatively cheap, so it shouldn't hurt much to use
> +* the smaller threshold.  Otherwise use the larger threshold so that we
> +* only reprogram the L3 mid-batch if the most recently programmed
> +* configuration is incompatible with the current pipe

Re: [Mesa-dev] [PATCH 00/26] i965: Tessellation shaders for Gen8+!

2015-12-03 Thread Matt Turner
On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunke  wrote:
> Happy reviewing!

Patches 3-8 are

Reviewed-by: Matt Turner 

I sent a few small comments on 9. The only real question I have is
about the minimum number of VS URB entries -- it seems to be BDW-only
according to the docs.

More to come.


Re: [Mesa-dev] [PATCH 09/26] i965: URB allocations for tessellation

2015-12-03 Thread Matt Turner
On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunke  wrote:
> From: Chris Forbes 
>
> Signed-off-by: Chris Forbes 

The commit title should be some imperative statement. Maybe just add
"Add" to the beginning.

> ---
>  src/mesa/drivers/dri/i965/brw_context.h  |  17 +++-
>  src/mesa/drivers/dri/i965/gen7_blorp.cpp |   8 ++
>  src/mesa/drivers/dri/i965/gen7_urb.c | 162 +--
>  3 files changed, 157 insertions(+), 30 deletions(-)
>
> The URB code could use some janitorial work - using arrays based on
> MESA_SHADER_* instead of replicating a bunch of code would be much nicer.
>
> I just don't feel like doing it today.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index e22f21d..88f6713 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -995,6 +995,8 @@ struct brw_context
> struct {
>GLuint vsize;/* vertex size plus header in urb registers */
>GLuint gsize;/* GS output size in urb registers */
> +  GLuint hsize; /* Tessellation control output size in urb registers */
> +  GLuint dsize; /* Tessellation evaluation output size in urb registers */
>GLuint csize;/* constant buffer size in urb registers */
>GLuint sfsize;   /* setup data size in urb registers */
>
> @@ -1007,12 +1009,16 @@ struct brw_context
>GLuint max_gs_entries;   /* Maximum number of GS entries */
>
>GLuint nr_vs_entries;
> +  GLuint nr_hs_entries;
> +  GLuint nr_ds_entries;
>GLuint nr_gs_entries;
>GLuint nr_clip_entries;
>GLuint nr_sf_entries;
>GLuint nr_cs_entries;
>
>GLuint vs_start;
> +  GLuint hs_start;
> +  GLuint ds_start;
>GLuint gs_start;
>GLuint clip_start;
>GLuint sf_start;
> @@ -1023,6 +1029,7 @@ struct brw_context
> * URB space for the GS.
> */
>bool gs_present;
> +  bool ts_present;
> } urb;
>
>
> @@ -1628,12 +1635,18 @@ void gen8_emit_3dstate_sample_pattern(struct brw_context *brw);
>  /* gen7_urb.c */
>  void
>  gen7_emit_push_constant_state(struct brw_context *brw, unsigned vs_size,
> +  unsigned hs_size, unsigned ds_size,
>unsigned gs_size, unsigned fs_size);
>
>  void
>  gen7_emit_urb_state(struct brw_context *brw,
> -unsigned nr_vs_entries, unsigned vs_size,
> -unsigned vs_start, unsigned nr_gs_entries,
> +unsigned nr_vs_entries,
> +unsigned vs_size, unsigned vs_start,
> +unsigned nr_hs_entries,
> +unsigned hs_size, unsigned hs_start,
> +unsigned nr_ds_entries,
> +unsigned ds_size, unsigned ds_start,
> +unsigned nr_gs_entries,
>  unsigned gs_size, unsigned gs_start);
>
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> index e87b9d1..89b73ca 100644
> --- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> @@ -50,6 +50,8 @@ gen7_blorp_emit_urb_config(struct brw_context *brw)
> unsigned urb_size = (brw->is_haswell && brw->gt == 3) ? 32 : 16;
> gen7_emit_push_constant_state(brw,
>   urb_size / 2 /* vs_size */,
> + 0 /* hs_size */,
> + 0 /* ds_size */,
>   0 /* gs_size */,
>   urb_size / 2 /* fs_size */);
>
> @@ -60,6 +62,12 @@ gen7_blorp_emit_urb_config(struct brw_context *brw)
> 32 /* num_vs_entries */,
> 2 /* vs_size */,
> 2 /* vs_start */,
> +   0 /* num_hs_entries */,
> +   1 /* hs_size */,
> +   2 /* hs_start */,
> +   0 /* num_ds_entries */,
> +   1 /* ds_size */,
> +   2 /* ds_start */,
> 0 /* num_gs_entries */,
> 1 /* gs_size */,
> 2 /* gs_start */);
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
> index 161de77..9a09a19 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -34,7 +34,7 @@
>   *   __-__   _-_
>   *  / \ /   \
>   * +-+
> - * | VS/FS/GS Push |  VS/GS URB  |
> + * |  VS/HS/DS/GS/FS Push  |   VS/HS/DS/GS URB   |
>   * |   Constants   |  

Re: [Mesa-dev] [PATCH 09/26] i965: URB allocations for tessellation

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 12:01 PM, Matt Turner  wrote:
> Align the overflowing expression in these ALIGN()s

Ignore this -- just GMail messing up whitespace.


Re: [Mesa-dev] [PATCH v2] mesa/version: Update gl_extensions::Version during version override

2015-12-03 Thread Emil Velikov
On 3 December 2015 at 19:05, Nanley Chery  wrote:
> On Thu, Dec 3, 2015 at 10:58 AM, Emil Velikov 
> wrote:
>>
>> On 1 December 2015 at 20:21, Nanley Chery  wrote:
>> > On Tue, Dec 01, 2015 at 10:44:43AM -0800, Nanley Chery wrote:
>> > From: Nanley Chery 
>> >
>> > Commit a16ffb743ced9fde80b2485dfc2d86ae74e86f25, which introduced
>> > gl_extensions::Version, updates the field when the context version
>> > is computed and when entering/exiting meta. Update this field when
>> > the version is overridden as well.
>> >
>> > Cc: Marta Lofstedt 
>> > Cc: Emil Velikov 
>> > Cc: "11.1" 
>> > Signed-off-by: Nanley Chery 
>> > ---
>> >
>> > Cc'd mesa-stable.
>> >
>> You can also do that just as you push the commit :-)
>>
>
>  How do I do that?
>
Apply the patch in tree.
Amend.
Push.

Not a big deal if you'd like to resend the patch with the tag...
whichever works for you really.
-Emil


Re: [Mesa-dev] [PATCH v2] mesa/version: Update gl_extensions::Version during version override

2015-12-03 Thread Matt Turner
On Thu, Dec 3, 2015 at 11:05 AM, Nanley Chery  wrote:
>
>
> On Thu, Dec 3, 2015 at 10:58 AM, Emil Velikov 
> wrote:
>>
>> On 1 December 2015 at 20:21, Nanley Chery  wrote:
>> > On Tue, Dec 01, 2015 at 10:44:43AM -0800, Nanley Chery wrote:
>> > From: Nanley Chery 
>> >
>> > Commit a16ffb743ced9fde80b2485dfc2d86ae74e86f25, which introduced
>> > gl_extensions::Version, updates the field when the context version
>> > is computed and when entering/exiting meta. Update this field when
>> > the version is overridden as well.
>> >
>> > Cc: Marta Lofstedt 
>> > Cc: Emil Velikov 
>> > Cc: "11.1" 
>> > Signed-off-by: Nanley Chery 
>> > ---
>> >
>> > Cc'd mesa-stable.
>> >
>> You can also do that just as you push the commit :-)
>>
>
>  How do I do that?

git commit --amend


Re: [Mesa-dev] [PATCH v2] mesa/version: Update gl_extensions::Version during version override

2015-12-03 Thread Nanley Chery
On Thu, Dec 3, 2015 at 10:58 AM, Emil Velikov 
wrote:

> On 1 December 2015 at 20:21, Nanley Chery  wrote:
> > On Tue, Dec 01, 2015 at 10:44:43AM -0800, Nanley Chery wrote:
> > From: Nanley Chery 
> >
> > Commit a16ffb743ced9fde80b2485dfc2d86ae74e86f25, which introduced
> > gl_extensions::Version, updates the field when the context version
> > is computed and when entering/exiting meta. Update this field when
> > the version is overridden as well.
> >
> > Cc: Marta Lofstedt 
> > Cc: Emil Velikov 
> > Cc: "11.1" 
> > Signed-off-by: Nanley Chery 
> > ---
> >
> > Cc'd mesa-stable.
> >
> You can also do that just as you push the commit :-)
>
>
 How do I do that?


> Does exactly what it says on the tin. Thanks !
> Reviewed-by: Emil Velikov 
>
> -Emil
>


Re: [Mesa-dev] [PATCH v2 0/1] Do not require all components to apply opt_vector_float()

2015-12-03 Thread Matt Turner
On Wed, Dec 2, 2015 at 7:43 AM, Juan A. Suarez Romero
 wrote:
> This patch, based on Matt's suggestion, replaces the former two, as it gets
> better results.
>
> Basically, so far opt_vector_float() is only applied when all 4 components of
> the register are written with MOV. This patch changes the behaviour so that
> writing all 4 components is no longer required to apply it.
>
> Results obtained with shader-db tests are:
>
> total instructions in shared programs: 6819484 -> 6811698 (-0.11%)
> instructions in affected programs: 387245 -> 379459 (-2.01%)
> total loops in shared programs:1971 -> 1971 (0.00%)
> helped:3980
> HURT:  0
> GAINED:3
> LOST:  0

The GAINED: 3 is almost certainly because of a sporadic failure in
shader-db (or Mesa...?) during the baseline shader-db run. When you
ran shader-db with your patch applied, the same failure didn't occur,
so report.py thinks this means 3 programs were gained. When I see
things like that, I manually rerun that specific shader and append its
results to the appropriate file and rerun report.py.

> Which are better than the ones obtained in the first version.
>
> Couple of final comments:
>
> * In the original version Matt commented about a bug in
>   opt_dead_code_eliminate(). As he already wrote a patch, I'm just waiting for
>   him to send it.

I sent it ("[PATCH] i965: Don't mark dead instructions' sources
live.") Nov 25 (and Cc'd you). Ken reviewed it, and I pushed it two
days ago as commit 48b4e88.

> * Matt commented also about a possible improvement in an example that allows
>   evaluating at compile-time. As it is a different optimization, I'm not
>   covering it in this patch, and am rather leaving it for a future improvement.
>
> * I commented about a wrong application of opt_vector_float() in an example
>   Matt found. He said that it probably lacks resetting last_reg to -1. This
>   patch covers that error.

Excellent, thank you!


Re: [Mesa-dev] [PATCH v2 1/1] i965: add opportunistic behaviour to opt_vector_float()

2015-12-03 Thread Matt Turner
On Wed, Dec 2, 2015 at 7:43 AM, Juan A. Suarez Romero
 wrote:
> opt_vector_float() transforms several scalar MOV operations into a single
> vector MOV.
>
> This is done when those MOVs cover all the components of the destination
> register. So something like:
>
> mov vgrf3.0.xy:D, 0D
> mov vgrf3.0.w:D, 1065353216D
> mov vgrf3.0.z:D, 0D
>
> is transformed into:
>
> mov vgrf3.0:F, [0F, 0F, 0F, 1F]
>
> But there are cases where not all the components are written. For
> example, in:
>
> mov vgrf2.0.x:D, 1073741824D
> mov vgrf3.0.xy:D, 0D
> mov vgrf3.0.w:D, 1065353216D
> mov vgrf4.0.xy:D, 1065353216D
> mov vgrf4.0.w:D, 0D
> mov vgrf6.0:UD, u4.xyzw:UD
>
> Neither the vgrf3 nor the vgrf4 .z component is written, so the
> optimization is not applied.
>
> But it could be applied anyway with the components covered, using a
> writemask to select the ones written. So we could transform it into:
>
> mov vgrf2.0.x:D, 1073741824D
> mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
> mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
> mov vgrf6.0:UD, u4.xyzw:UD
>
> This commit does precisely that: opportunistically apply
> opt_vector_float() when possible.

Thanks for doing this. Some comments inline, but overall looks good!
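
As an illustration of the idea in the commit message, here is a rough standalone sketch of accumulating per-channel immediates and a writemask, then emitting one packed value. The names vf_accum, accum_mov and accum_flush are hypothetical and do not exist in the driver. It assumes the restricted 8-bit vector-float byte for 1.0F is 0x30 and for 0.0F is 0x00, and it packs with shifts rather than the memcpy() the patch uses, so the result does not depend on host endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Channel bits, following the usual X=1, Y=2, Z=4, W=8 layout. */
enum { WM_X = 1, WM_Y = 2, WM_Z = 4, WM_W = 8 };

/* Hypothetical accumulator for a run of scalar MOVs to one register. */
struct vf_accum {
   uint8_t imm[4];       /* one VF byte per channel */
   unsigned writemask;   /* union of the channels written so far */
   int inst_count;       /* number of MOVs folded into this run */
};

/* The :D immediates in the example are just float bit patterns. */
static float
bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Record one scalar MOV that writes the byte vf to the given channels. */
static void
accum_mov(struct vf_accum *a, unsigned writemask, uint8_t vf)
{
   for (int i = 0; i < 4; i++) {
      if (writemask & (1u << i))
         a->imm[i] = vf;
   }
   a->writemask |= writemask;
   a->inst_count++;
}

/* Returns 1 and fills *packed only if at least two MOVs were merged,
 * mirroring the inst_count < 2 early-out in the quoted vectorize_mov().
 */
static int
accum_flush(const struct vf_accum *a, uint32_t *packed)
{
   if (a->inst_count < 2)
      return 0;

   *packed = (uint32_t)a->imm[0] |
             (uint32_t)a->imm[1] << 8 |
             (uint32_t)a->imm[2] << 16 |
             (uint32_t)a->imm[3] << 24;
   return 1;
}
```

For the vgrf3 case above, recording a .xy MOV of 0 and a .w MOV of 1.0 leaves the writemask at XYW with the .z byte still zero, which corresponds to the [0F, 0F, 0F, 1F] immediate with a .xyw writemask.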

>
> The improvement obtained relative to current upstream (56aff6bb4eaf) is:
>
> total instructions in shared programs: 6819484 -> 6811698 (-0.11%)
> instructions in affected programs: 387245 -> 379459 (-2.01%)
> total loops in shared programs:1971 -> 1971 (0.00%)
> helped:3980
> HURT:  0
> GAINED:3

Run the shader-db on just the program that had the problem here, add
it to your before-results, and rerun report.py.

> LOST:  0
>
> Signed-off-by: Juan A. Suarez Romero 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 60 +-
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  3 ++
>  2 files changed, 41 insertions(+), 22 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index a697bdf..f9bf820 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -309,6 +309,28 @@ src_reg::equals(const src_reg &r) const
>  }
>
>  bool
> +vec4_visitor::vectorize_mov(vec4_instruction *current_inst, vec4_instruction *imm_inst[],
> +  int inst_count, uint writemask, uint8_t *imm, bblock_t *block)

s/uint/unsigned/

bool
vec4_visitor::vectorize_mov(bblock_t *block, vec4_instruction *current_inst,
uint8_t imm[4], vec4_instruction *imm_inst[4],
int inst_count, unsigned writemask)


> +{
> +   if (inst_count < 2) {
> +  return false;
> +   }
> +
> +   unsigned vf;
> +   memcpy(&vf, imm, sizeof(vf));
> +   vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf));
> +   mov->dst.type = BRW_REGISTER_TYPE_F;
> +   mov->dst.writemask = writemask;
> +   current_inst->insert_before(block, mov);
> +
> +   for (int i = 0; i < inst_count; i++) {
> +  imm_inst[i]->remove(block);
> +   }
> +
> +   return true;
> +}
> +
> +bool
>  vec4_visitor::opt_vector_float()
>  {
> bool progress = false;
> @@ -316,27 +338,36 @@ vec4_visitor::opt_vector_float()
> int last_reg = -1, last_reg_offset = -1;
> enum brw_reg_file last_reg_file = BAD_FILE;
>
> -   int remaining_channels = 0;
> -   uint8_t imm[4];
> +   uint8_t imm[4] = { 0 };
> int inst_count = 0;
> vec4_instruction *imm_inst[4];
> +   unsigned int writemask = 0;

s/unsigned int/unsigned/

>
> foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
>if (last_reg != inst->dst.nr ||
>last_reg_offset != inst->dst.reg_offset ||
>last_reg_file != inst->dst.file) {
> +
> + progress |= vectorize_mov(inst, imm_inst, inst_count, writemask, imm, block);

The argument order to this function seems strange to me. Could we make it this?

   vectorize_mov(block, inst, imm, imm_inst, inst_count, writemask)

> +
> + inst_count = 0;
> + writemask = 0;
>   last_reg = inst->dst.nr;
>   last_reg_offset = inst->dst.reg_offset;
>   last_reg_file = inst->dst.file;
> - remaining_channels = WRITEMASK_XYZW;
> -
> - inst_count = 0;
> + for (int i = 0; i < 4; i++) {
> +imm[i] = 0;
> + }
>}
>
>if (inst->opcode != BRW_OPCODE_MOV ||
>inst->dst.writemask == WRITEMASK_XYZW ||
> -  inst->src[0].file != IMM)
> +  inst->src[0].file != IMM) {
> + progress |= vectorize_mov(inst, imm_inst, inst_count, writemask, imm, block);
> + inst_count = 0;
> + last_reg = -1;
>   continue;
> +  }
>
>int vf = brw_float_to_vf(inst->src[0].f);
>if (vf == -1)
> @@ -351,23 +382,8 @@ vec4_visitor::opt_vector_float()
>if ((inst->dst.writemask & WRITEMASK_W) != 0)
>  

[Mesa-dev] [PATCH] nv50/ir: fix DCE to not generate 96-bit loads

2015-12-03 Thread Ilia Mirkin
A situation where there's a 128-bit load where the last component gets
DCE'd causes a 96-bit load to be generated, which no GPU can actually
emit. Avoid generating such instructions by scaling back to 64-bit on
the first load when splitting.

Signed-off-by: Ilia Mirkin 
Cc: "11.0 11.1" 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 32 +-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index dd99973..e07153e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2962,6 +2962,16 @@ DeadCodeElim::visit(BasicBlock *bb)
return true;
 }
 
+// Each load can go into up to 4 destinations, any of which might potentially
+// be dead (i.e. a hole). These can always be split into 2 loads, independent
+// of where the holes are. We find the first contiguous region, put it into
+// the first load, and then put the second contiguous region into the second
+// load. There can be at most 2 contiguous regions.
+//
+// Note that there are some restrictions, for example it's not possible to do
+// a 64-bit load that's not 64-bit aligned, so such a load has to be split
+// up. Also hardware doesn't support 96-bit loads, so those also have to be
+// split into a 64-bit and 32-bit load.
 void
 DeadCodeElim::checkSplitLoad(Instruction *ld1)
 {
@@ -2982,6 +2992,8 @@ DeadCodeElim::checkSplitLoad(Instruction *ld1)
addr1 = ld1->getSrc(0)->reg.data.offset;
n1 = n2 = 0;
size1 = size2 = 0;
+
+   // Compute address/width for first load
for (d = 0; ld1->defExists(d); ++d) {
   if (mask & (1 << d)) {
  if (size1 && (addr1 & 0x7))
@@ -2995,16 +3007,34 @@ DeadCodeElim::checkSplitLoad(Instruction *ld1)
  break;
   }
}
+
+   // Scale back the size of the first load until it can be loaded. This
+   // typically happens for TYPE_B96 loads.
+   while (n1 &&
+  !prog->getTarget()->isAccessSupported(ld1->getSrc(0)->reg.file,
+typeOfSize(size1))) {
+  size1 -= def1[--n1]->reg.size;
+  d--;
+   }
+
+   // Compute address/width for second load
for (addr2 = addr1 + size1; ld1->defExists(d); ++d) {
   if (mask & (1 << d)) {
+ assert(!size2 || !(addr2 & 0x7));
  def2[n2] = ld1->getDef(d);
  size2 += def2[n2++]->reg.size;
-  } else {
+  } else if (!n2) {
  assert(!n2);
  addr2 += ld1->getDef(d)->reg.size;
+  } else {
+ break;
   }
}
 
+   // Make sure that we've processed all the values
+   for (; ld1->defExists(d); ++d)
+  assert(!(mask & (1 << d)));
+
updateLdStOffset(ld1, addr1, func);
ld1->setType(typeOfSize(size1));
for (d = 0; d < 4; ++d)
-- 
2.4.10
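
The splitting rule described in the new comment can be sketched outside the compiler as follows. This is a simplified model, not the nv50_ir code: the names load_split, size_supported and split_load are hypothetical, every component is assumed to be 32 bits wide, the supported access sizes are assumed to be 4, 8 and 16 bytes, and the base address is assumed to be 16-byte aligned.

```c
#include <assert.h>

/* Result of splitting one up-to-128-bit load into at most two loads. */
struct load_split {
   unsigned addr1, size1;   /* first load: byte offset and byte size */
   unsigned addr2, size2;   /* second load: size2 == 0 means unused */
};

/* Assumed hardware rule for this sketch: no 96-bit (12-byte) accesses. */
static int
size_supported(unsigned size)
{
   return size == 4 || size == 8 || size == 16;
}

/* mask has one bit per live 32-bit component (bit 0 = first component). */
static struct load_split
split_load(unsigned base, unsigned mask)
{
   struct load_split s = { base, 0, 0, 0 };
   int d;

   assert((base & 0xf) == 0);

   /* First load: skip leading holes, then take the first contiguous run.
    * A run may not grow past 32 bits if it starts at an address that is
    * not 64-bit aligned.
    */
   for (d = 0; d < 4; ++d) {
      if (mask & (1u << d)) {
         if (s.size1 && (s.addr1 & 0x7))
            break;
         s.size1 += 4;
      } else if (!s.size1) {
         s.addr1 += 4;
      } else {
         break;
      }
   }

   /* Scale the first load back until its size is one we can emit,
    * handing the dropped components to the second load.
    */
   while (s.size1 && !size_supported(s.size1)) {
      s.size1 -= 4;
      d--;
   }

   /* Second load: the next contiguous run, if any. */
   s.addr2 = s.addr1 + s.size1;
   for (; d < 4; ++d) {
      if (mask & (1u << d))
         s.size2 += 4;
      else if (!s.size2)
         s.addr2 += 4;
      else
         break;
   }

   return s;
}
```

With the last component dead (mask 0x7), the naive first load would be 12 bytes; the scale-back step turns it into an 8-byte load at offset 0 followed by a 4-byte load at offset 8.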



Re: [Mesa-dev] [PATCH v2] mesa/version: Update gl_extensions::Version during version override

2015-12-03 Thread Emil Velikov
On 1 December 2015 at 20:21, Nanley Chery  wrote:
> On Tue, Dec 01, 2015 at 10:44:43AM -0800, Nanley Chery wrote:
> From: Nanley Chery 
>
> Commit a16ffb743ced9fde80b2485dfc2d86ae74e86f25, which introduced
> gl_extensions::Version, updates the field when the context version
> is computed and when entering/exiting meta. Update this field when
> the version is overridden as well.
>
> Cc: Marta Lofstedt 
> Cc: Emil Velikov 
> Cc: "11.1" 
> Signed-off-by: Nanley Chery 
> ---
>
> Cc'd mesa-stable.
>
You can also do that just as you push the commit :-)

Does exactly what it says on the tin. Thanks !
Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-03 Thread Dave Airlie
On 4 Dec 2015 03:01, "Aaron Watry"  wrote:
>
> Hi Dave (and others),
>
> I cloned your fdo r600g-tess-submit branch and gave it a spin on CEDAR
> (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few issues.

Just grab r600g-tess-staging

-submit only worked on cayman, but I'd posted it here so didn't want to
change it during review.

Dave.
>
> 1) Initially, I got an assertion in r600_add_atom stating that the atom
> ID was not less than the R600_NUM_ATOMS value (id = 51, R600_NUM_ATOMS=51).
>   I bumped R600_NUM_ATOMS to 52 for now, and that got rid of that
> issue... although I have no idea if that was a correct fix.
>
> 2) Next, I kept getting a segfault in evergreen_adjust_gprs at line 3931.
> Turns out that rctx->hw_shader_stages[2].shader was null
> (missing/miscompiled GS?).
>
> I naively changed the code to the following, and now Heaven actually runs
> with tessellation enabled (and it looks like it's working).
>
> /* gather required shader gprs */
> for (i = 0; i < EG_NUM_HW_STAGES; i++) {
> if (!rctx->hw_shader_stages[i].shader) {
> num_gprs[i] = def_gprs[i];
> continue;
> }
> num_gprs[i] = rctx->hw_shader_stages[i].shader->shader.bc.ngpr;
> }
>
> Just figured that I'd let you know...
>
> If you don't have CEDAR hardware to test with, feel free to ping me to
> test any additional changes.  Note that I didn't run the benchmark to
> completion (too slow, had to get other work done), but it didn't hang my
> GPU in the time that I did have it running.
>
> --Aaron
>
>
> On Mon, Nov 30, 2015 at 12:20 AM, Dave Airlie  wrote:
>>
>> Hi,
>>
>> Patchbomb time, this set of patches is a first pass at adding
>> ARB_tessellation_shader support to the r600g driver. Only Evergreen
>> and newer GPUs support tessellation. On any of the GPUs that support
>> native FP64, this will enable OpenGL 4.1 on them.
>>
>> The first bunch of patches are a bit of a driver rework to get
>> things in better shape for tessellation, they shouldn't cause
>> any regressions.
>>
>> This runs heaven on cayman and should pass all the piglits
>> unless I've done something wrong.
>>
>> Development hit two HW programming fun times, one with tess and
>> dynamic GPR interaction requiring disabling dynamic GPRs, and
>> one with programming of some SIMD registers to block TESS shaders
>> on one unit. These fixed most of the hangs we saw during development.
>>
>> This doesn't contain SB support yet, Glenn has started working on it.
>>
>> Currently tested hw:
>> working: CAYMAN, REDWOOD, BARTS, TURKS
>> hangs on any tessellation: CAICOS
>> hangs differently at least with heaven: SUMO
>>
>> This patchset doesn't block it on any GPUs, but when merged it
>> probably should.
>>
>> Also available at:
>> http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-tess-submit
>>
>> Thanks to Glenn Kennard for lots of discussion and testing.
>>
>> Dave.
>>
>
>


[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #38 from har...@gmx.de ---
@bellamorte,

for better performance of this workaround, you could try the following:

1. make clean
2. generate 'makefile' via 'autogen.sh' with your preferred settings
3. make
4. delete 'src/mesa/state_tracker/st_glsl_to_tgsi.lo'
5. generate 'makefile' via 'autogen.sh' with CXXFLAGS='-O1'
6. make (only st_glsl_to_tgsi.cpp will be compiled)
7. install as usual

... that is what works for me ...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v3 10/44] i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:28 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> Reviewed-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 6 ++
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 2 +-
>  src/mesa/drivers/dri/i965/gen7_l3_state.c| 2 +-
>  src/mesa/drivers/dri/i965/intel_reg.h| 2 +-
>  4 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index a511d5c..94f878b 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -41,6 +41,12 @@
>  #define GET_BITS(data, high, low) ((data & INTEL_MASK((high), (low))) >> (low))
>  #define GET_FIELD(word, field) (((word)  & field ## _MASK) >> field ## _SHIFT)
>  
> +/**
> + * For use with masked MMIO registers where the upper 16 bits control which
> + * of the lower bits are committed to the register.
> + */
> +#define REG_MASK(value) ((value) << 16)
> +
>  #ifndef BRW_DEFINES_H
>  #define BRW_DEFINES_H
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index aab5c91..3d14d65 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -382,7 +382,7 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
>BEGIN_BATCH(3);
>OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
>OUT_BATCH(GEN7_CACHE_MODE_1);
> -  OUT_BATCH((GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC << 16) |
> +  OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC) |
>  GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC);
>ADVANCE_BATCH();
> }
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 9aad563..2c692be 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -264,7 +264,7 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
>   OUT_BATCH(HSW_SCRATCH1);
>   OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
>   OUT_BATCH(HSW_ROW_CHICKEN3);
> - OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> + OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE) |
> (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
>   ADVANCE_BATCH();
>}
> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h b/src/mesa/drivers/dri/i965/intel_reg.h
> index 0b167d5..d6f 100644
> --- a/src/mesa/drivers/dri/i965/intel_reg.h
> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> @@ -183,7 +183,7 @@
>  # define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 << 13)
>  # define GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC (1 << 1)
>  # define GEN8_HIZ_PMA_MASK_BITS \
> -   ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) << 16)
> +   REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE)
>  
>  /* Predicate registers */
>  #define MI_PREDICATE_SRC0   0x2400
> 

Reviewed-by: Kenneth Graunke 
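
As a side note on what the macro expresses, here is a tiny software model of a masked register write, assuming the semantics stated in the new comment (the upper 16 bits select which of the lower 16 bits get committed). The function masked_write is hypothetical and only simulates the register; the cast inside REG_MASK is added for this sketch and is not in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as the macro in the patch, with an explicit 32-bit cast. */
#define REG_MASK(value) ((uint32_t)(value) << 16)

/* Hypothetical model of a masked MMIO register: only the low bits whose
 * corresponding mask bit (in the upper half of the dword) is set are
 * updated; all other bits keep their previous value.
 */
static uint32_t
masked_write(uint32_t reg, uint32_t dword)
{
   const uint32_t mask = dword >> 16;
   const uint32_t bits = dword & 0xffff;

   return (reg & ~mask) | (bits & mask);
}
```

In this model, writing REG_MASK(bit) | bit sets the bit, writing REG_MASK(bit) alone clears it, and writing the bit without its mask leaves the register unchanged, which matches the (has_dc ? 0 : ...) pattern in the quoted gen7_l3_state.c hunk.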




Re: [Mesa-dev] [PATCH v3 08/44] i965: Implement programming of the L3 configuration.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:26 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> ---
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 95 +++
>  1 file changed, 95 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 8765b11..108f3a8 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -161,3 +161,98 @@ get_l3_way_size(const struct brw_device_info *devinfo)
> else
>return 8 * devinfo->num_slices;
>  }
> +
> +/**
> + * Program the hardware to use the specified L3 configuration.
> + */
> +static void
> +setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
> +{
> +   const bool has_dc = cfg->n[L3P_DC] || cfg->n[L3P_ALL];
> +   const bool has_is = cfg->n[L3P_IS] || cfg->n[L3P_RO] || cfg->n[L3P_ALL];
> +   const bool has_c = cfg->n[L3P_C] || cfg->n[L3P_RO] || cfg->n[L3P_ALL];
> +   const bool has_t = cfg->n[L3P_T] || cfg->n[L3P_RO] || cfg->n[L3P_ALL];
> +   const bool has_slm = cfg->n[L3P_SLM];
> +
> +   /* According to the hardware docs, the L3 partitioning can only be changed
> +* while the pipeline is completely drained and the caches are flushed,
> +* what involves a first PIPE_CONTROL flush which stalls the pipeline and

which involves

Looks good to me at a quick glance.
Acked-by: Kenneth Graunke 

> +* initiates invalidation of the relevant caches...
> +*/
> +   brw_emit_pipe_control_flush(brw,
> +   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
> +   PIPE_CONTROL_CONST_CACHE_INVALIDATE |
> +   PIPE_CONTROL_INSTRUCTION_INVALIDATE |
> +   PIPE_CONTROL_DATA_CACHE_INVALIDATE |
> +   PIPE_CONTROL_NO_WRITE |
> +   PIPE_CONTROL_CS_STALL);
> +
> +   /* ...followed by a second stalling flush which guarantees that
> +* invalidation is complete when the L3 configuration registers are
> +* modified.
> +*/
> +   brw_emit_pipe_control_flush(brw,
> +   PIPE_CONTROL_DATA_CACHE_INVALIDATE |
> +   PIPE_CONTROL_NO_WRITE |
> +   PIPE_CONTROL_CS_STALL);
> +
> +   if (brw->gen >= 8) {
> +  assert(!cfg->n[L3P_IS] && !cfg->n[L3P_C] && !cfg->n[L3P_T]);
> +
> +  BEGIN_BATCH(3);
> +  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> +
> +  /* Set up the L3 partitioning. */
> +  OUT_BATCH(GEN8_L3CNTLREG);
> +  OUT_BATCH((has_slm ? GEN8_L3CNTLREG_SLM_ENABLE : 0) |
> +SET_FIELD(cfg->n[L3P_URB], GEN8_L3CNTLREG_URB_ALLOC) |
> +SET_FIELD(cfg->n[L3P_RO], GEN8_L3CNTLREG_RO_ALLOC) |
> +SET_FIELD(cfg->n[L3P_DC], GEN8_L3CNTLREG_DC_ALLOC) |
> +SET_FIELD(cfg->n[L3P_ALL], GEN8_L3CNTLREG_ALL_ALLOC));
> +
> +  ADVANCE_BATCH();
> +
> +   } else {
> +  assert(!cfg->n[L3P_ALL]);
> +
> +  /* When enabled SLM only uses a portion of the L3 on half of the banks,
> +   * the matching space on the remaining banks has to be allocated to a
> +   * client (URB for all validated configurations) set to the
> +   * lower-bandwidth 2-bank address hashing mode.
> +   */
> +  const bool urb_low_bw = has_slm && !brw->is_baytrail;
> +  assert(!urb_low_bw || cfg->n[L3P_URB] == cfg->n[L3P_SLM]);
> +
> +  /* Minimum number of ways that can be allocated to the URB. */
> +  const unsigned n0_urb = (brw->is_baytrail ? 32 : 0);
> +  assert(cfg->n[L3P_URB] >= n0_urb);
> +
> +  BEGIN_BATCH(7);
> +  OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
> +
> +  /* Demote any clients with no ways assigned to LLC. */
> +  OUT_BATCH(GEN7_L3SQCREG1);
> +  OUT_BATCH((brw->is_haswell ? HSW_L3SQCREG1_SQGHPCI_DEFAULT :
> + brw->is_baytrail ? VLV_L3SQCREG1_SQGHPCI_DEFAULT :
> + IVB_L3SQCREG1_SQGHPCI_DEFAULT) |
> +(has_dc ? 0 : GEN7_L3SQCREG1_CONV_DC_UC) |
> +(has_is ? 0 : GEN7_L3SQCREG1_CONV_IS_UC) |
> +(has_c ? 0 : GEN7_L3SQCREG1_CONV_C_UC) |
> +(has_t ? 0 : GEN7_L3SQCREG1_CONV_T_UC));
> +
> +  /* Set up the L3 partitioning. */
> +  OUT_BATCH(GEN7_L3CNTLREG2);
> +  OUT_BATCH((has_slm ? GEN7_L3CNTLREG2_SLM_ENABLE : 0) |
> +SET_FIELD(cfg->n[L3P_URB] - n0_urb, 
> GEN7_L3CNTLREG2_URB_ALLOC) |
> +(urb_low_bw ? GEN7_L3CNTLREG2_URB_LOW_BW : 0) |
> +SET_FIELD(cfg->n[L3P_ALL], GEN7_L3CNTLREG2_ALL_ALLOC) |
> +SET_FIELD(cfg->n[L3P_RO], GEN7_L3CNTLREG2_RO_ALLOC) |
> +SET_FIELD(cfg->n[L3P_DC], GEN7_L3CNTLREG2_DC_ALLOC));
> +  OUT_BATCH(GEN7_L3CNTLREG3);
> +  OUT_BATCH(SET_FIELD(cfg->n[L3P_IS], GEN7_L3CNTLREG3_IS_ALLOC) |
> +SET_FIELD(cfg->n[L3P_C], GEN7_L3CNTLREG3_C_ALLOC) |
> 

Re: [Mesa-dev] [PATCH v3 07/44] i965: Import tables enumerating the set of validated L3 configurations.

2015-12-03 Thread Kenneth Graunke
On Tuesday, December 01, 2015 12:19:25 AM Jordan Justen wrote:
> From: Francisco Jerez 
> 
> It should be possible to use additional L3 configurations other than
> the ones listed in the tables of validated allocations ("BSpec »
> 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*]
> » L3 Allocation and Programming"), but it seems sensible for now to
> hard-code the tables in order to stick to the hardware docs.  Instead
> of setting up the arbitrary L3 partitioning given as input, the
> closest validated L3 configuration will be looked up in these tables
> and used to program the hardware.
> 
> The included tables should work for Gen7-9.  Note that the quantities
> are specified in ways rather than in KB, this is because the L3
> control registers expect the value in ways, and because by doing that
> we can re-use a single table for all GT variants of the same
> generation (and in the case of IVB/HSW and CHV/SKL across different
> generations) which generally have different L3 way sizes but allow the
> same combinations of way allocations.
> 
> v3: Use slice count from the devinfo structure instead of the gt
> number to implement get_l3_way_size().
> 
> Reviewed-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/gen7_l3_state.c  | 163 
> +
>  2 files changed, 164 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/gen7_l3_state.c
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index 5e805fa..b413f73 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -182,6 +182,7 @@ i965_FILES = \
>   gen7_cs_state.c \
>   gen7_disable.c \
>   gen7_gs_state.c \
> + gen7_l3_state.c \
>   gen7_misc_state.c \
>   gen7_sf_state.c \
>   gen7_sol_state.c \
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> new file mode 100644
> index 000..8765b11
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -0,0 +1,163 @@
> +/*
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_context.h"
> +#include "brw_defines.h"
> +#include "brw_state.h"
> +#include "intel_batchbuffer.h"
> +
> +/**
> + * Chunk of L3 cache reserved for some specific purpose.
> + */
> +enum brw_l3_partition {
> +   /** Shared local memory. */
> +   L3P_SLM = 0,
> +   /** Unified return buffer. */
> +   L3P_URB,
> +   /** Union of DC and RO. */
> +   L3P_ALL,
> +   /** Data cluster RW partition. */
> +   L3P_DC,
> +   /** Union of IS, C and T. */
> +   L3P_RO,
> +   /** Instruction and state cache. */
> +   L3P_IS,
> +   /** Constant cache. */
> +   L3P_C,
> +   /** Texture cache. */
> +   L3P_T,
> +   /** Number of supported L3 partitions. */
> +   NUM_L3P
> +};
> +
> +/**
> + * L3 configuration represented as the number of ways allocated for each
> + * partition.  \sa get_l3_way_size().
> + */
> +struct brw_l3_config {
> +   unsigned n[NUM_L3P];
> +};
> +
> +/**
> + * IVB/HSW validated L3 configurations.
> + */
> +static const struct brw_l3_config ivb_l3_configs[] = {

It might be nice to add comments above the columns, similar to
brw_surface_formats.c:

   /* SLM  URB  ALL   DC   RO   IS    C    T */
   {{   0,  32,   0,   0,  32,   0,   0,   0 }},

Either way,
Acked-by: Kenneth Graunke 

Thanks Curro!

> +   {{  0, 32,  0,  0, 32,  0,  0,  0 }},
> +   {{  0, 32,  0, 16, 16,  0,  0,  0 }},
> +   {{  0, 32,  0,  4,  0,  8,  4, 16 }},
> +   {{  0, 28,  0,  8,  0,  8,  4, 16 }},
> +   {{  0, 28,  0, 16,  0,  8,  4,  8 }},
> +   {{  0, 28,  0,  8,  0, 16,  4,  8 }},
> +   {{  0, 28,  0,  0,  0, 16,  4, 16 }},
> +   {{  0, 32,  0,  0,  0, 16,  0, 16
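
The column-header suggestion and the "closest validated configuration" lookup described in the commit message can be sketched together. This is an illustration only. The three table rows are copied from the quoted patch, but `l3_config_distance()` and `closest_l3_config()` are hypothetical helpers that use a plain sum of absolute differences, which is not necessarily the metric the series itself uses.

```c
#include <assert.h>
#include <stddef.h>

enum l3_partition {
   L3P_SLM, L3P_URB, L3P_ALL, L3P_DC, L3P_RO, L3P_IS, L3P_C, L3P_T, NUM_L3P
};

/* Number of ways allocated to each partition. */
struct l3_config {
   unsigned n[NUM_L3P];
};

static const struct l3_config example_configs[] = {
   /* SLM URB ALL  DC  RO  IS   C   T */
   {{  0, 32,  0,  0, 32,  0,  0,  0 }},
   {{  0, 32,  0, 16, 16,  0,  0,  0 }},
   {{  0, 32,  0,  4,  0,  8,  4, 16 }},
};

/* Hypothetical metric: sum of per-partition absolute differences. */
static unsigned
l3_config_distance(const struct l3_config *a, const struct l3_config *b)
{
   unsigned d = 0;

   for (int i = 0; i < NUM_L3P; i++)
      d += a->n[i] > b->n[i] ? a->n[i] - b->n[i] : b->n[i] - a->n[i];
   return d;
}

/* Return the table entry nearest to the requested partitioning. */
static const struct l3_config *
closest_l3_config(const struct l3_config *want)
{
   const size_t count = sizeof(example_configs) / sizeof(example_configs[0]);
   const struct l3_config *best = &example_configs[0];

   assert(want != NULL);
   for (size_t i = 1; i < count; i++) {
      if (l3_config_distance(want, &example_configs[i]) <
          l3_config_distance(want, best))
         best = &example_configs[i];
   }
   return best;
}
```

Keeping the quantities in ways, as the commit message explains, is what lets one table serve every GT variant of a generation: only the way size differs between them.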

[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #37 from bellamort...@gmail.com ---
Confirmed that -O3 CFLAGS and -O1 CXXFLAGS work.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #36 from har...@gmx.de ---
I can now confirm:

Compiling only file 'st_glsl_to_tgsi.cpp' with '-O1' and everything else with
default '-O2' works flawlessly.



[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #35 from har...@gmx.de ---
I found that only CXXFLAGS do the trick.

If I append CXXFLAGS='-O1' to the configure params, WT works flawlessly.

Since 'st_glsl_to_tgsi.cpp' is the only .cpp file in folder 'state_tracker', it
seems reasonable to assume the problem is there.



[Mesa-dev] AppVeyor: Build completed: mesa 32

2015-12-03 Thread AppVeyor


Build mesa 32 completed



Commit a0f1bc18e5 by Brian Paul on 12/3/2015 4:40 PM:

mesa: print enum names rather than hexadecimal values in error messages

Trivial.




Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-03 Thread Aaron Watry
Hi Dave (and others),

I cloned your fdo r600g-tess-submit branch and gave it a spin on CEDAR
(Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few issues.

1) Initially, I got an assertion in r600_add_atom stating that the atom ID
was not less than the R600_NUM_ATOMS value (id = 51, R600_NUM_ATOMS=51).
  I bumped R600_NUM_ATOMS to 52 for now, and that got rid of that issue...
although I have no idea if that was a correct fix.

2) Next, I kept getting a segfault in evergreen_adjust_gprs at line 3931.
Turns out that rctx->hw_shader_stages[2].shader was null
(missing/miscompiled GS?).

I naively changed the code to the following, and now Heaven actually runs
with tessellation enabled (and it looks like it's working).

/* gather required shader gprs */
for (i = 0; i < EG_NUM_HW_STAGES; i++) {
	if (!rctx->hw_shader_stages[i].shader) {
		num_gprs[i] = def_gprs[i];
		continue;
	}
	num_gprs[i] = rctx->hw_shader_stages[i].shader->shader.bc.ngpr;
}

Just figured that I'd let you know...

If you don't have CEDAR hardware to test with, feel free to ping me to test
any additional changes.  Note that I didn't run the benchmark to completion
(too slow, had to get other work done), but it didn't hang my GPU in the
time that I did have it running.

--Aaron


On Mon, Nov 30, 2015 at 12:20 AM, Dave Airlie  wrote:

> Hi,
>
> Patchbomb time, this set of patches is a first pass at adding
> ARB_tessellation_shader support to the r600g driver. Only Evergreen
> and newer GPUs support tessellation. On any of the GPUs that support
> native FP64, this will enable OpenGL 4.1 on them.
>
> The first bunch of patches are a bit of a driver rework to get
> things in better shape for tessellation, they shouldn't cause
> any regressions.
>
> This runs heaven on cayman and should pass all the piglits
> unless I've done something wrong.
>
> Development hit two HW programming fun times, one with tess and
> dynamic GPR interaction requiring disabling dynamic GPRs, and
> one with programming of some SIMD registers to block TESS shaders
> on one unit. These fixed most of the hangs we saw during development.
>
> This doesn't contain SB support yet, Glenn has started working on it.
>
> Currently tested hw:
> working: CAYMAN, REDWOOD, BARTS, TURKS
> hangs on any tessellation: CAYMAN
> hangs differently at least with heaven: SUMO
>
> This patchset doesn't block it on any GPUs, but when merged it
> probably should.
>
> Also available at:
> http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-tess-submit
>
> Thanks to Glenn Kennard for lots of discussion and testing.
>
> Dave.
>


[Mesa-dev] AppVeyor: Build failed: mesa 31

2015-12-03 Thread AppVeyor



Build mesa 31 failed


Commit e832b5b7fa by Brian Paul on 12/3/2015 4:12 PM:

st/wgl: add support for WGL_ARB_render_texture

There are a few legacy OpenGL apps on Windows which need this extension.
We basically use glCopyTex[Sub]Image to implement wglBindTexImageARB (see
the implementation notes for details).

v2: refactor code to use st_copy_framebuffer_to_texture() helper function.

Reviewed-by: José Fonseca
Reviewed-by: Charmaine Lee




Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Roland Scheidegger
One problem with this is that clear_render_target and
clear_depth_stencil honor the render condition, whereas clear_texture
does not. nv50 got it wrong, but I care a lot more when that wrongness is
moved to util. This could be fixed by making clear_render_target and
clear_depth_stencil optionally honor the render condition (like blit does).
It does seem a bit odd, though, for everybody to implement this via util
on top of clear_render_target/clear_depth_stencil: the whole reason
pipe->clear_texture exists in the first place is so that you can clear
textures you might not be able to render to (that is, where
clear_render_target/clear_depth_stencil can't be used). Otherwise mesa/st
should just use the util code directly and not bother with clear_texture...

Roland
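
A minimal sketch of the option described above, in which a clear takes a flag saying whether it should honor the current render condition, the way blit does. Every type and name here (`fake_context`, `fake_clear_render_target`, `fake_clear_texture`) is a hypothetical stand-in and not the gallium interface.

```c
#include <assert.h>
#include <stdbool.h>

struct fake_context {
   bool render_cond_active;  /* a conditional-rendering query is bound */
   bool render_cond_passes;  /* outcome the hardware would observe */
   unsigned clears_done;     /* counts clears that actually executed */
};

/* Clear that optionally honors the render condition. */
static void
fake_clear_render_target(struct fake_context *ctx,
                         bool render_condition_enabled)
{
   assert(ctx);
   if (render_condition_enabled &&
       ctx->render_cond_active && !ctx->render_cond_passes)
      return; /* predicated off */
   ctx->clears_done++;
}

/* clear_texture must ignore the render condition, so a util fallback
 * built on the clear above would pass false. */
static void
fake_clear_texture(struct fake_context *ctx)
{
   fake_clear_render_target(ctx, false);
}
```

With a flag like this, the same low-level clear could serve both callers that must respect conditional rendering and a clear_texture fallback that must not.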




On 03.12.2015 at 10:44, Edward O'Callaghan wrote:
> ARB_clear_texture is reasonably generic enough that it should
> be moved down to become part of the fallback mechanism of
> pipe->clear_texture.
> 
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/auxiliary/util/u_surface.c  | 83 
> +
>  src/gallium/auxiliary/util/u_surface.h  |  6 ++
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
>  3 files changed, 90 insertions(+), 66 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_surface.c 
> b/src/gallium/auxiliary/util/u_surface.c
> index 6aa44f9..e7ab175 100644
> --- a/src/gallium/auxiliary/util/u_surface.c
> +++ b/src/gallium/auxiliary/util/u_surface.c
> @@ -36,6 +36,7 @@
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
>  
> +#include "util/u_math.h"
>  #include "util/u_format.h"
>  #include "util/u_inlines.h"
>  #include "util/u_rect.h"
> @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
> }
>  }
>  
> +/**
> + * Fallback for pipe->clear_texture() function.
> + * clears a non-PIPE_BUFFER resource's specified level
> + * and bounding box with a clear value provided in that
> + * resource's native format.
> + *
> + * XXX sf->format = .. is problematic as hw need
> + * not necessarily support the format.
> + */
> +void
> +util_surface_clear_texture(struct pipe_context *pipe,
> +   struct pipe_resource *res,
> +   unsigned level,
> +   const struct pipe_box *box,
> +   const void *data)
> +{
> +   struct pipe_surface tmpl = {{0}}, *sf;
> +
> +   tmpl.format = res->format;
> +   tmpl.u.tex.first_layer = box->z;
> +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
> +   tmpl.u.tex.level = level;
> +   sf = pipe->create_surface(pipe, res, &tmpl);
> +   if (!sf)
> +  return;
> +
> +   if (util_format_is_depth_or_stencil(res->format)) {
> +  float depth = 0;
> +  uint8_t stencil = 0;
> +  unsigned clear = 0;
> +  const struct util_format_description *desc =
> + util_format_description(res->format);
> +
> +  if (util_format_has_depth(desc)) {
> + clear |= PIPE_CLEAR_DEPTH;
> + desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
> +  }
> +  if (util_format_has_stencil(desc)) {
> + clear |= PIPE_CLEAR_STENCIL;
> + desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
> +  }
> +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
> +box->x, box->y, box->width, box->height);
> +   } else {
> +  union pipe_color_union color;
> +
> +  switch (util_format_get_blocksizebits(res->format)) {
> +  case 128:
> + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
> + memcpy(&color.ui, data, 128 / 8);
> + break;
> +  case 64:
> + sf->format = PIPE_FORMAT_R32G32_UINT;
> + memcpy(&color.ui, data, 64 / 8);
> + memset(&color.ui[2], 0, 64 / 8);
> + break;
> +  case 32:
> + sf->format = PIPE_FORMAT_R32_UINT;
> + memcpy(&color.ui, data, 32 / 8);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 16:
> + sf->format = PIPE_FORMAT_R16_UINT;
> + color.ui[0] = util_cpu_to_le32(
> +util_le16_to_cpu(*(unsigned short *)data));
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 8:
> + sf->format = PIPE_FORMAT_R8_UINT;
> + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  default:
> + assert(!"Unknown texel element size");
> + return;
> +  }
> +
> +  pipe->clear_render_target(pipe, sf, &color,
> +box->x, box->y, box->width, box->height);
> +   }
> +   pipe->surface_destroy(pipe, sf);
> +}
>  
>  /* Return if the box is totally inside the resource.
>   */
> diff --git a/src/gallium/auxiliary/util/u_surface.h 
> b/src/gallium/auxiliary/util/u_surface.h
> index bfd8f40..069a393 100644
> --- a/src/gallium/auxiliary/util/u_surface.h
> +++ b/src/gallium/auxiliary/util/u

Re: [Mesa-dev] [PATCH 3/3] draw: fix clipping of layer/vp index outputs

2015-12-03 Thread Jose Fonseca

On 03/12/15 00:35, srol...@vmware.com wrote:

From: Roland Scheidegger 

This was just plain broken. It always used the value from v0 (for vp_index)
but would pass the value from the provoking vertex to later stages - but only
if there was a corresponding fs input, otherwise the layer/vp index would get
lost completely (as it would try to interpolate the (unsigned) values as
floats).
So, make it obey provoking vertex rules (drivers relying on draw will need to
do the same). And make sure that the default interpolation mode (when no
corresponding fs input is found) for them is constant.
Also, change the code a bit so constant inputs aren't interpolated then
copied over later.

No piglit change (but I'm writing one...).
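
The distinction the message draws, constant attributes copied from the provoking vertex versus attributes interpolated along the clipped edge, can be sketched as follows. This is a simplified illustration with hypothetical names (`interp_mode`, `clip_attr`), not the draw module code. The interpolation formula matches the argument order of the `LINTERP(t, out, in)` calls in the patch, on the assumption that the macro computes `out + t * (in - out)`.

```c
#include <assert.h>
#include <string.h>

enum interp_mode { INTERP_CONSTANT, INTERP_LINEAR };

/* dst = out + t * (in - out), per component. */
static void
interp4(float dst[4], float t, const float in[4], const float out[4])
{
   for (int i = 0; i < 4; i++)
      dst[i] = out[i] + t * (in[i] - out[i]);
}

/* Produce one attribute of a vertex created by clipping. */
static void
clip_attr(float dst[4], enum interp_mode mode, float t,
          const float in[4], const float out[4], const float provoking[4])
{
   assert(t >= 0.0f && t <= 1.0f);
   if (mode == INTERP_CONSTANT)
      memcpy(dst, provoking, 4 * sizeof(float)); /* copied, never blended */
   else
      interp4(dst, t, in, out);
}
```

Integer-valued outputs such as the layer or viewport index belong in the constant path, since blending their bit patterns as floats produces meaningless values.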
---
  src/gallium/auxiliary/draw/draw_pipe_clip.c | 207 +---
  1 file changed, 130 insertions(+), 77 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c 
b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index c22758b..7d9ea5e 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -58,12 +58,17 @@
  struct clip_stage {
 struct draw_stage stage;  /**< base class */

-   /* List of the attributes to be flatshaded. */
-   uint num_flat_attribs;
-   uint flat_attribs[PIPE_MAX_SHADER_OUTPUTS];
-
-   /* Mask of attributes in noperspective mode */
-   boolean noperspective_attribs[PIPE_MAX_SHADER_OUTPUTS];
+   unsigned pos_attr;
+
+   /* List of the attributes to be constant interpolated. */
+   uint num_const_attribs;
+   uint8_t const_attribs[PIPE_MAX_SHADER_OUTPUTS];
+   /* List of the attributes to be linear interpolated. */
+   uint num_linear_attribs;
+   uint8_t linear_attribs[PIPE_MAX_SHADER_OUTPUTS];
+   /* List of the attributes to be perspective interpolated. */
+   uint num_perspect_attribs;
+   uint8_t perspect_attribs[PIPE_MAX_SHADER_OUTPUTS];

 float (*plane)[4];
  };
@@ -97,9 +102,9 @@ draw_viewport_index(struct draw_context *draw,
  /* All attributes are float[4], so this is easy:
   */
  static void interp_attr( float dst[4],
-float t,
-const float in[4],
-const float out[4] )
+ float t,
+ const float in[4],
+ const float out[4] )
  {
 dst[0] = LINTERP( t, out[0], in[0] );
 dst[1] = LINTERP( t, out[1], in[1] );
@@ -117,8 +122,8 @@ static void copy_flat( struct draw_stage *stage,
  {
 const struct clip_stage *clipper = clip_stage(stage);
 uint i;
-   for (i = 0; i < clipper->num_flat_attribs; i++) {
-  const uint attr = clipper->flat_attribs[i];
+   for (i = 0; i < clipper->num_const_attribs; i++) {
+  const uint attr = clipper->const_attribs[i];
COPY_4FV(dst->data[attr], src->data[attr]);
 }
  }
@@ -126,15 +131,13 @@ static void copy_flat( struct draw_stage *stage,
  /* Interpolate between two vertices to produce a third.
   */
  static void interp( const struct clip_stage *clip,
-   struct vertex_header *dst,
-   float t,
-   const struct vertex_header *out,
-   const struct vertex_header *in,
+struct vertex_header *dst,
+float t,
+const struct vertex_header *out,
+const struct vertex_header *in,
  unsigned viewport_index )
  {
-   const unsigned nr_attrs = draw_num_shader_outputs(clip->stage.draw);
-   const unsigned pos_attr = 
draw_current_shader_position_output(clip->stage.draw);
-   const unsigned clip_attr = 
draw_current_shader_clipvertex_output(clip->stage.draw);
+   const unsigned pos_attr = clip->pos_attr;
 unsigned j;
 float t_nopersp;

@@ -168,6 +171,13 @@ static void interp( const struct clip_stage *clip,
dst->data[pos_attr][3] = oow;
 }

+
+   /* interp perspective attribs */
+   for (j = 0; j < clip->num_perspect_attribs; j++) {
+  const unsigned attr = clip->perspect_attribs[j];
+  interp_attr(dst->data[attr], t, in->data[attr], out->data[attr]);
+   }
+
 /**
  * Compute the t in screen-space instead of 3d space to use
  * for noperspective interpolation.
@@ -177,7 +187,7 @@ static void interp( const struct clip_stage *clip,
  * pick whatever value (the interpolated point won't be in front
  * anyway), so just use the 3d t.
  */
-   {
+   if (clip->num_linear_attribs){
int k;
t_nopersp = t;
/* find either in.x != out.x or in.y != out.y */
@@ -191,24 +201,17 @@ static void interp( const struct clip_stage *clip,
  break;
   }
}
-   }
-
-   /* Other attributes
-*/
-   for (j = 0; j < nr_attrs; j++) {
-  if (j != pos_attr && j != clip_attr) {
- if (clip->noperspective_attribs[j])
-interp_attr(dst->data[j], t_nopersp, in->data[j], out->data[j]);
- else
-interp_attr(dst->data[j],

Re: [Mesa-dev] [RFC 3/3] radeonsi: Add ARB_clear_texture support

2015-12-03 Thread Marek Olšák
On Thu, Dec 3, 2015 at 10:44 AM, Edward O'Callaghan
 wrote:
> Unfortunately we may not toggle on PIPE_CAP_CLEAR_TEXTURE
> and update GL3.txt at this time as the piglit
> ARB_clear_texture-float test still fails for unknown
> reasons.
>
> However, this does allow a user to fake this extension
> now and get some reasonable level of support should
> they urgently need this.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/drivers/radeonsi/si_blit.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
> b/src/gallium/drivers/radeonsi/si_blit.c
> index 13d8e6f..e1185c9 100644
> --- a/src/gallium/drivers/radeonsi/si_blit.c
> +++ b/src/gallium/drivers/radeonsi/si_blit.c
> @@ -439,6 +439,15 @@ static void si_clear_depth_stencil(struct pipe_context 
> *ctx,
> si_blitter_end(ctx);
>  }
>
> +static void si_pipe_clear_texture(struct pipe_context *ctx,
> + struct pipe_resource *res,
> + unsigned level,
> + const struct pipe_box *box,
> + const void *data)
> +{
> +   util_surface_clear_texture(ctx, res, level, box, data);
> +}
> +
>  /* Helper for decompressing a portion of a color or depth resource before
>   * blitting if any decompression is needed.
>   * The driver doesn't decompress resources automatically while u_blitter is
> @@ -800,6 +809,7 @@ void si_init_blit_functions(struct si_context *sctx)
> sctx->b.b.clear_buffer = si_pipe_clear_buffer;
> sctx->b.b.clear_render_target = si_clear_render_target;
> sctx->b.b.clear_depth_stencil = si_clear_depth_stencil;
> +   sctx->b.b.clear_texture = si_pipe_clear_texture;

You can just do:

sctx->b.b.clear_texture = util_surface_clear_texture;

Marek
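
The point above is that a wrapper which adds nothing can be dropped when the helper already has the hook's exact signature. A generic sketch with hypothetical names (`struct ctx`, `util_clear`, `wrapper_clear`), not the radeonsi code:

```c
#include <assert.h>

struct ctx {
   void (*clear_texture)(struct ctx *c, int level);
   int last_level;
};

/* Shared helper whose signature already matches the hook. */
static void
util_clear(struct ctx *c, int level)
{
   c->last_level = level;
}

/* Redundant forwarding wrapper, as in the patch under review. */
static void
wrapper_clear(struct ctx *c, int level)
{
   util_clear(c, level);
}

static void init_with_wrapper(struct ctx *c) { c->clear_texture = wrapper_clear; }
static void init_direct(struct ctx *c)       { c->clear_texture = util_clear; }

/* Invoke the hook the way a caller of the context would. */
static void
call_clear(struct ctx *c, int level)
{
   assert(c->clear_texture);
   c->clear_texture(c, level);
}
```

Both initializers behave identically to callers; the direct assignment simply avoids one extra function and one extra call.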


Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Marek Olšák
On Thu, Dec 3, 2015 at 10:44 AM, Edward O'Callaghan
 wrote:
> ARB_clear_texture is reasonably generic enough that it should
> be moved down to become part of the fallback mechanism of
> pipe->clear_texture.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/auxiliary/util/u_surface.c  | 83 
> +
>  src/gallium/auxiliary/util/u_surface.h  |  6 ++
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
>  3 files changed, 90 insertions(+), 66 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_surface.c 
> b/src/gallium/auxiliary/util/u_surface.c
> index 6aa44f9..e7ab175 100644
> --- a/src/gallium/auxiliary/util/u_surface.c
> +++ b/src/gallium/auxiliary/util/u_surface.c
> @@ -36,6 +36,7 @@
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
>
> +#include "util/u_math.h"
>  #include "util/u_format.h"
>  #include "util/u_inlines.h"
>  #include "util/u_rect.h"
> @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
> }
>  }
>
> +/**
> + * Fallback for pipe->clear_texture() function.
> + * clears a non-PIPE_BUFFER resource's specified level
> + * and bounding box with a clear value provided in that
> + * resource's native format.
> + *
> + * XXX sf->format = .. is problematic as hw need
> + * not necessarily support the format.

This should be cleaned up before it can be moved to util. The correct
solution is to set the format in tmpl.format.

Marek

> + */
> +void
> +util_surface_clear_texture(struct pipe_context *pipe,
> +   struct pipe_resource *res,
> +   unsigned level,
> +   const struct pipe_box *box,
> +   const void *data)
> +{
> +   struct pipe_surface tmpl = {{0}}, *sf;
> +
> +   tmpl.format = res->format;
> +   tmpl.u.tex.first_layer = box->z;
> +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
> +   tmpl.u.tex.level = level;
> +   sf = pipe->create_surface(pipe, res, &tmpl);
> +   if (!sf)
> +  return;
> +
> +   if (util_format_is_depth_or_stencil(res->format)) {
> +  float depth = 0;
> +  uint8_t stencil = 0;
> +  unsigned clear = 0;
> +  const struct util_format_description *desc =
> + util_format_description(res->format);
> +
> +  if (util_format_has_depth(desc)) {
> + clear |= PIPE_CLEAR_DEPTH;
> + desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
> +  }
> +  if (util_format_has_stencil(desc)) {
> + clear |= PIPE_CLEAR_STENCIL;
> + desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
> +  }
> +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
> +box->x, box->y, box->width, box->height);
> +   } else {
> +  union pipe_color_union color;
> +
> +  switch (util_format_get_blocksizebits(res->format)) {
> +  case 128:
> + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
> + memcpy(&color.ui, data, 128 / 8);
> + break;
> +  case 64:
> + sf->format = PIPE_FORMAT_R32G32_UINT;
> + memcpy(&color.ui, data, 64 / 8);
> + memset(&color.ui[2], 0, 64 / 8);
> + break;
> +  case 32:
> + sf->format = PIPE_FORMAT_R32_UINT;
> + memcpy(&color.ui, data, 32 / 8);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 16:
> + sf->format = PIPE_FORMAT_R16_UINT;
> + color.ui[0] = util_cpu_to_le32(
> +util_le16_to_cpu(*(unsigned short *)data));
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  case 8:
> + sf->format = PIPE_FORMAT_R8_UINT;
> + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
> + memset(&color.ui[1], 0, 96 / 8);
> + break;
> +  default:
> + assert(!"Unknown texel element size");
> + return;
> +  }
> +
> +  pipe->clear_render_target(pipe, sf, &color,
> +box->x, box->y, box->width, box->height);
> +   }
> +   pipe->surface_destroy(pipe, sf);
> +}
>
>  /* Return if the box is totally inside the resource.
>   */
> diff --git a/src/gallium/auxiliary/util/u_surface.h 
> b/src/gallium/auxiliary/util/u_surface.h
> index bfd8f40..069a393 100644
> --- a/src/gallium/auxiliary/util/u_surface.h
> +++ b/src/gallium/auxiliary/util/u_surface.h
> @@ -97,6 +97,12 @@ util_clear_depth_stencil(struct pipe_context *pipe,
>   unsigned stencil,
>   unsigned dstx, unsigned dsty,
>   unsigned width, unsigned height);
> +extern void
> +util_surface_clear_texture(struct pipe_context *pipe,
> +   struct pipe_resource *res,
> +   unsigned level,
> +   const struct pipe_box *box,
> +   const void *data);
>
>  extern boolean
>  util_try_blit_via_copy_region(struct pipe_context *ctx,
> diff --git a/src/gall
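
The change requested in the review comment above, choosing the reinterpretation format before the surface is created instead of patching `sf->format` afterwards, could look roughly like this. The enum and `pick_clear_format()` are hypothetical stand-ins for the `PIPE_FORMAT_*` values and for whatever helper a revised patch would use. The zero-fill of unused channels mirrors the quoted switch, but this sketch reads 16-bit values in native byte order rather than reproducing the patch's endian conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum fake_format {
   FMT_NONE,
   FMT_R8_UINT,
   FMT_R16_UINT,
   FMT_R32_UINT,
   FMT_R32G32_UINT,
   FMT_R32G32B32A32_UINT
};

/* Pick an integer format of matching texel size and pack the clear
 * value into four 32-bit channels, zeroing the unused ones. */
static enum fake_format
pick_clear_format(unsigned blocksize_bits, const void *data, uint32_t color[4])
{
   assert(data && color);
   memset(color, 0, 4 * sizeof(uint32_t));

   switch (blocksize_bits) {
   case 128:
      memcpy(color, data, 128 / 8);
      return FMT_R32G32B32A32_UINT;
   case 64:
      memcpy(color, data, 64 / 8);
      return FMT_R32G32_UINT;
   case 32:
      memcpy(color, data, 32 / 8);
      return FMT_R32_UINT;
   case 16: {
      uint16_t v;
      memcpy(&v, data, sizeof(v)); /* native byte order, see note above */
      color[0] = v;
      return FMT_R16_UINT;
   }
   case 8:
      color[0] = *(const uint8_t *)data;
      return FMT_R8_UINT;
   default:
      return FMT_NONE; /* unknown texel size */
   }
}
```

A caller would store the returned format in `tmpl.format` before calling `create_surface`, so the driver sees the final format at creation time and can reject it if the hardware cannot render to it.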

[Mesa-dev] [RFC 3/3] radeonsi: Add ARB_clear_texture support

2015-12-03 Thread Edward O'Callaghan
Unfortunately we may not toggle on PIPE_CAP_CLEAR_TEXTURE
and update GL3.txt at this time as the piglit
ARB_clear_texture-float test still fails for unknown
reasons.

However, this does allow a user to fake this extension
now and get some reasonable level of support should
they urgently need this.

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/radeonsi/si_blit.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 13d8e6f..e1185c9 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -439,6 +439,15 @@ static void si_clear_depth_stencil(struct pipe_context 
*ctx,
si_blitter_end(ctx);
 }
 
+static void si_pipe_clear_texture(struct pipe_context *ctx,
+ struct pipe_resource *res,
+ unsigned level,
+ const struct pipe_box *box,
+ const void *data)
+{
+   util_surface_clear_texture(ctx, res, level, box, data);
+}
+
 /* Helper for decompressing a portion of a color or depth resource before
  * blitting if any decompression is needed.
  * The driver doesn't decompress resources automatically while u_blitter is
@@ -800,6 +809,7 @@ void si_init_blit_functions(struct si_context *sctx)
sctx->b.b.clear_buffer = si_pipe_clear_buffer;
sctx->b.b.clear_render_target = si_clear_render_target;
sctx->b.b.clear_depth_stencil = si_clear_depth_stencil;
+   sctx->b.b.clear_texture = si_pipe_clear_texture;
sctx->b.b.resource_copy_region = si_resource_copy_region;
sctx->b.b.blit = si_blit;
sctx->b.b.flush_resource = si_flush_resource;
-- 
2.5.0



[Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-03 Thread Edward O'Callaghan
ARB_clear_texture is reasonably generic enough that it should
be moved down to become part of the fallback mechanism of
pipe->clear_texture.

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/auxiliary/util/u_surface.c  | 83 +
 src/gallium/auxiliary/util/u_surface.h  |  6 ++
 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
 3 files changed, 90 insertions(+), 66 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_surface.c 
b/src/gallium/auxiliary/util/u_surface.c
index 6aa44f9..e7ab175 100644
--- a/src/gallium/auxiliary/util/u_surface.c
+++ b/src/gallium/auxiliary/util/u_surface.c
@@ -36,6 +36,7 @@
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
 
+#include "util/u_math.h"
 #include "util/u_format.h"
 #include "util/u_inlines.h"
 #include "util/u_rect.h"
@@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
}
 }
 
+/**
+ * Fallback for pipe->clear_texture() function.
+ * clears a non-PIPE_BUFFER resource's specified level
+ * and bounding box with a clear value provided in that
+ * resource's native format.
+ *
+ * XXX sf->format = .. is problematic as hw need
+ * not necessarily support the format.
+ */
+void
+util_surface_clear_texture(struct pipe_context *pipe,
+   struct pipe_resource *res,
+   unsigned level,
+   const struct pipe_box *box,
+   const void *data)
+{
+   struct pipe_surface tmpl = {{0}}, *sf;
+
+   tmpl.format = res->format;
+   tmpl.u.tex.first_layer = box->z;
+   tmpl.u.tex.last_layer = box->z + box->depth - 1;
+   tmpl.u.tex.level = level;
+   sf = pipe->create_surface(pipe, res, &tmpl);
+   if (!sf)
+  return;
+
+   if (util_format_is_depth_or_stencil(res->format)) {
+  float depth = 0;
+  uint8_t stencil = 0;
+  unsigned clear = 0;
+  const struct util_format_description *desc =
+ util_format_description(res->format);
+
+  if (util_format_has_depth(desc)) {
+ clear |= PIPE_CLEAR_DEPTH;
+ desc->unpack_z_float(&depth, 0, data, 0, 1, 1);
+  }
+  if (util_format_has_stencil(desc)) {
+ clear |= PIPE_CLEAR_STENCIL;
+ desc->unpack_s_8uint(&stencil, 0, data, 0, 1, 1);
+  }
+  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
+box->x, box->y, box->width, box->height);
+   } else {
+  union pipe_color_union color;
+
+  switch (util_format_get_blocksizebits(res->format)) {
+  case 128:
+ sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
+ memcpy(&color.ui, data, 128 / 8);
+ break;
+  case 64:
+ sf->format = PIPE_FORMAT_R32G32_UINT;
+ memcpy(&color.ui, data, 64 / 8);
+ memset(&color.ui[2], 0, 64 / 8);
+ break;
+  case 32:
+ sf->format = PIPE_FORMAT_R32_UINT;
+ memcpy(&color.ui, data, 32 / 8);
+ memset(&color.ui[1], 0, 96 / 8);
+ break;
+  case 16:
+ sf->format = PIPE_FORMAT_R16_UINT;
+ color.ui[0] = util_cpu_to_le32(
+util_le16_to_cpu(*(unsigned short *)data));
+ memset(&color.ui[1], 0, 96 / 8);
+ break;
+  case 8:
+ sf->format = PIPE_FORMAT_R8_UINT;
+ color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
+ memset(&color.ui[1], 0, 96 / 8);
+ break;
+  default:
+ assert(!"Unknown texel element size");
+ return;
+  }
+
+  pipe->clear_render_target(pipe, sf, &color,
+box->x, box->y, box->width, box->height);
+   }
+   pipe->surface_destroy(pipe, sf);
+}
 
 /* Return if the box is totally inside the resource.
  */
diff --git a/src/gallium/auxiliary/util/u_surface.h b/src/gallium/auxiliary/util/u_surface.h
index bfd8f40..069a393 100644
--- a/src/gallium/auxiliary/util/u_surface.h
+++ b/src/gallium/auxiliary/util/u_surface.h
@@ -97,6 +97,12 @@ util_clear_depth_stencil(struct pipe_context *pipe,
  unsigned stencil,
  unsigned dstx, unsigned dsty,
  unsigned width, unsigned height);
+extern void
+util_surface_clear_texture(struct pipe_context *pipe,
+   struct pipe_resource *res,
+   unsigned level,
+   const struct pipe_box *box,
+   const void *data);
 
 extern boolean
 util_try_blit_via_copy_region(struct pipe_context *ctx,
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
index 86be1b4..7980c9a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
@@ -27,7 +27,6 @@
 #include "util/u_inlines.h"
 #include "util/u_pack_color.h"
 #include "util/u_format.h"
-#include "util/u_math.h"
 #include "util/u_surface.h"
 
 #include "tgsi/tgsi_ureg.h"
@@ -446,71 +445,7 @@

[Mesa-dev] [RFC 1/3] gallium/aux/util: Trivial, we already have format use it

2015-12-03 Thread Edward O'Callaghan
No need to dereference again, fixup for clarity.

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/auxiliary/util/u_surface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_surface.c b/src/gallium/auxiliary/util/u_surface.c
index 70ed911..6aa44f9 100644
--- a/src/gallium/auxiliary/util/u_surface.c
+++ b/src/gallium/auxiliary/util/u_surface.c
@@ -397,7 +397,7 @@ util_clear_render_target(struct pipe_context *pipe,
  }
   }
   else {
- util_pack_color(color->f, dst->format, &uc);
+ util_pack_color(color->f, format, &uc);
   }
 
   util_fill_box(dst_map, dst->format,
-- 
2.5.0



[Mesa-dev] radeonsi ARB_clear_texture support

2015-12-03 Thread Edward O'Callaghan
The following patches make steps towards getting ARB_clear_texture
support working on radeonsi. There is still some outstanding work
to be done before we may toggle on the PIPE and check it off as
done.

The following is thus preliminary refactoring to set the stage
for further work without the churn. I would also like feedback
at this stage, and any new ideas on why the piglit
ARB_clear_texture-float test could be failing.
I've spent quite some time on this now without much success,
so throwing some ideas around could help this progress.
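
For readers following the colour path in util_surface_clear_texture from
patch 2/3, here is a minimal standalone sketch of how the raw texel bytes
end up in the clear colour. It is not Mesa code: struct clear_color and
pack_texel_as_uint are hypothetical names, and unlike the patch it reads
sub-32-bit texels in host byte order rather than going through
util_le16_to_cpu/util_cpu_to_le32. It only illustrates that a float texel's
bit pattern is passed through unchanged as a UINT channel; whether that is
related to the failing float test is not established here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the ui member of gallium's pipe_color_union. */
struct clear_color {
   uint32_t ui[4];
};

/*
 * Copy one texel of `bits` width from `data` into the low channels of a
 * zeroed uint colour. Returns 0 on success, -1 for an unsupported width.
 * Assumption: texels narrower than 32 bits are read in host byte order.
 */
static int
pack_texel_as_uint(struct clear_color *color, const void *data, unsigned bits)
{
   assert(color && data);

   /* Unused channels stay zero, matching the memset calls in the patch. */
   memset(color, 0, sizeof(*color));

   switch (bits) {
   case 128:
   case 64:
   case 32:
      /* Whole 32-bit channels: copy the bytes without interpreting them. */
      memcpy(color->ui, data, bits / 8);
      return 0;
   case 16: {
      uint16_t v;
      memcpy(&v, data, sizeof(v));
      color->ui[0] = v;
      return 0;
   }
   case 8: {
      uint8_t v;
      memcpy(&v, data, sizeof(v));
      color->ui[0] = v;
      return 0;
   }
   default:
      return -1;
   }
}
```

Passing a pointer to a 32-bit float with bits = 32 leaves the float's IEEE-754
bit pattern in ui[0], which is why the surface format is switched to a
same-sized UINT format before the clear.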



Re: [Mesa-dev] [PATCH] travis: Initial import of travis instructions.

2015-12-03 Thread Jose Fonseca

On 03/12/15 01:51, Michel Dänzer wrote:

On 03.12.2015 06:01, Jose Fonseca wrote:

On 02/12/15 03:39, Michel Dänzer wrote:

On 02.12.2015 07:06, Jose Fonseca wrote:

On 28/11/15 21:06, Emil Velikov wrote:

On 25 November 2015 at 07:20, Jose Fonseca  wrote:


BTW, I setup Mesa with Appveyor (like Travis for Windows)

 https://ci.appveyor.com/project/jrfonseca/mesa

I'll try to get that going and committed too.


As a person who has broken the Windows build on an occasion or two,
yes please.


The nice thing about Appveyor is that it can clone the git from
anywhere,
not just GitHub.


Would it be OK to have email notifications to mesa-commits?


I'm wondering if mesa-dev wouldn't be more suitable. Not many of us
(or is it just me?) follow mesa-commits.


I'm OK either way.


Whichever list we choose, we need to whitelist emails from
no-re...@appveyor.com somehow.   I don't know the procedure for that.


A list administrator can add a subscription for that address with
delivery of list posts disabled.


Great.

Per http://lists.freedesktop.org/mailman/listinfo/mesa-dev it looks like
you're one of the admins!  Could you please add no-re...@appveyor.com?


Done.


Thanks Michel.

I just enabled email for one build for testing.

But from now on mesa-dev will only receive emails when the build status
changes (success <-> fail).



The last thing I need to do is ask an fdo admin to install the git hook (ATM
I'm polling the mesa git repos and running the hook from a cron job).




One thing no one mentioned is that one can automatically feed the
results from Travis-CI/github into Coverity. I know Vinson has been
doing the latter, although I'm not sure how much of it is automated.


Yes, but Coverity recommends (i.e. asks) not to do so on every commit.

Clang and MSVC have static analyzers too.  I've been running them.  But
to be honest, running them all the time is not enough -- it takes effort
to actually act on issues, resolving them / whitelisting them.  And I'm
afraid I don't have the time.  The bulk is false negatives.


FWIW, I think you mean false positives here.


I know it should be false _something_, but I always hesitate on which to
use. :)


False positive means that the static analyzer reports a problem, but in
fact there is no problem. False negative means there is a problem, but
the static analyzer doesn't report it.
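
The distinction can be written down as a small truth table. This is only an
illustrative sketch; enum verdict, classify_report and verdict_name are
made-up names, not part of any analyzer's API.

```c
#include <assert.h>
#include <stdbool.h>

enum verdict {
   TRUE_POSITIVE,   /* reported, and the problem is real */
   FALSE_POSITIVE,  /* reported, but there is no problem */
   FALSE_NEGATIVE,  /* not reported, but the problem is real */
   TRUE_NEGATIVE    /* not reported, and there is no problem */
};

/* Classify one analyzer result against the ground truth. */
static enum verdict
classify_report(bool analyzer_reported, bool problem_exists)
{
   if (analyzer_reported)
      return problem_exists ? TRUE_POSITIVE : FALSE_POSITIVE;
   return problem_exists ? FALSE_NEGATIVE : TRUE_NEGATIVE;
}

/* Human-readable name for a verdict. */
static const char *
verdict_name(enum verdict v)
{
   static const char *const names[] = {
      "true positive", "false positive", "false negative", "true negative"
   };
   assert((unsigned)v < sizeof(names) / sizeof(names[0]));
   return names[v];
}
```

A warning that has to be whitelisted because the code is actually fine is
classify_report(true, false), i.e. a false positive.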


Thanks. Makes sense now that you've explained it.

Jose



[Mesa-dev] AppVeyor: Build completed: mesa 29

2015-12-03 Thread AppVeyor


Build mesa 29 completed



Commit 231db5869c by Tapani Pälli on 12/2/2015 11:21 AM:

i965: use _Shader to get fragment program when updating surface state

Atomic counters and Images were using ctx::Shader that does not take in
to account program pipeline changes, ctx::_Shader must be used for SSO to
work. Commit c0347705 already changed ubo's to use this.

Fixes failures seen with following Piglit test:
	arb_separate_shader_object-atomic-counter

Signed-off-by: Tapani Pälli 
Reviewed-by: Francisco Jerez 
Cc: "11.0 11.1" 


