Re: [Mesa-dev] [PATCH v2 2/3] etnaviv: Update includes from rnndb

2017-04-14 Thread Wladimir J. van der Laan
On Sat, Apr 15, 2017 at 07:49:53AM +0200, Wladimir J. van der Laan wrote:
> On Fri, Apr 14, 2017 at 11:57:21PM +0200, Christian Gmeiner wrote:
> > > +#define INST_OPCODE_IMADLOSAT0 0x004e
> > > +#define INST_OPCODE_IMADLOSAT0 0x004f
> > 
> > INST_OPCODE_IMADLOSAT0 got redefined...
> 
> Second one should be IMADLOSAT1. Strange, I fixed this but apparently it
> didn't make it to the patch, messed up with git again :(
> https://github.com/etnaviv/etna_viv/blob/master/src/etnaviv/isa.xml.h#L119

I now understand what went wrong: apparently I fixed another instance (IMUL not
IMAD) but not this one.
Strange, hadn't seen a warning for this.
Thanks for the fix,

Regards,
Wladimir
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/3] etnaviv: Update includes from rnndb

2017-04-14 Thread Wladimir J. van der Laan
On Fri, Apr 14, 2017 at 11:57:21PM +0200, Christian Gmeiner wrote:
> > +#define INST_OPCODE_IMADLOSAT0 0x004e
> > +#define INST_OPCODE_IMADLOSAT0 0x004f
> 
> INST_OPCODE_IMADLOSAT0 got redefined...

Second one should be IMADLOSAT1. Strange, I fixed this but apparently it didn't
make it to the patch, messed up with git again :(
https://github.com/etnaviv/etna_viv/blob/master/src/etnaviv/isa.xml.h#L119

Regards,
Wladimir


Re: [Mesa-dev] [PATCH 04/12] i965/cnl: Implement new pipe control workaround

2017-04-14 Thread Ilia Mirkin
On Fri, Apr 14, 2017 at 8:35 PM, Anuj Phogat wrote:
> From: Ben Widawsky 
>
> GEN10 requires flushing all previous pipe controls before issuing a render
> target cache flush. The docs seem to fairly explicitly say this is gen10 only.
>
> v2: Rebased on
> commit 04f74d66293222d5e1905cfb930bfa083e30463c
> Author: Francisco Jerez 
> Date:   Thu Jun 30 19:39:24 2016 -0700
>
> i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush.
>
> Cc: Francisco Jerez 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index b8f7406..b921fe7 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -128,6 +128,24 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
> uint32_t flags)
>   brw_emit_pipe_control_flush(brw, 0);
>}
>
> +  if (brw->gen == 10) {

Should this only be if flags & PIPE_CONTROL_RENDER_TARGET_FLUSH ?

> +/* Hardware workaround: CNL
> + *
> + * "Before sending a PIPE_CONTROL command with bit 12 set, SW
> + * must issue another PIPE_CONTROL with Render Target Cache
> + * Flush Enable (bit 12) = 0 and Pipe Control Flush Enable (bit
> + * 7) = 1."
> + */
> + BEGIN_BATCH(6);
> + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
> + OUT_BATCH(PIPE_CONTROL_FLUSH_ENABLE);

Based on the comment above, shouldn't this also be |
PIPE_CONTROL_RENDER_TARGET_FLUSH?

Also, the gen9 workaround above does this with a
brw_emit_pipe_control_flush(brw, fooflags) call; it makes sense to do
the same thing here, no?

> + OUT_BATCH(0);
> + OUT_BATCH(0);
> + OUT_BATCH(0);
> + OUT_BATCH(0);
> + ADVANCE_BATCH();
> +  }
> +
>BEGIN_BATCH(6);
>OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
>OUT_BATCH(flags);
> --
> 2.9.3
>
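A minimal sketch of the first review suggestion, for illustration only: gate
the CNL workaround on the render target flush bit and emit the extra
PIPE_CONTROL with only the flush-enable bit, as the quoted workaround text
describes. This is not the i965 code. Every sketch_* name below is
hypothetical, and the "batch" is just an array that records the flags dword of
each PIPE_CONTROL that would be emitted. The bit positions (12 and 7) are the
ones given in the quoted hardware comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions taken from the workaround text quoted in the patch. */
#define SKETCH_RENDER_TARGET_FLUSH (1u << 12)
#define SKETCH_FLUSH_ENABLE        (1u << 7)

/* Hypothetical stand-in for a batch buffer: it only records the flags dword
 * of each PIPE_CONTROL that would have been emitted.
 */
struct sketch_batch {
   uint32_t flags[8];
   size_t count;
};

static void
sketch_emit_raw(struct sketch_batch *batch, uint32_t flags)
{
   if (batch->count < 8)
      batch->flags[batch->count++] = flags;
}

static void
sketch_emit_pipe_control_flush(struct sketch_batch *batch, int gen,
                               uint32_t flags)
{
   /* Only pay for the extra PIPE_CONTROL when bit 12 is actually set. The
    * precursor carries bit 7 only, so its own bit 12 stays 0.
    */
   if (gen == 10 && (flags & SKETCH_RENDER_TARGET_FLUSH))
      sketch_emit_raw(batch, SKETCH_FLUSH_ENABLE);

   sketch_emit_raw(batch, flags);
}
```

With this shape, a gen10 flush that does not set bit 12 emits a single
PIPE_CONTROL, while one that does set it emits two, in the order the
workaround text asks for.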


Re: [Mesa-dev] [PATCH] mesa: print target string in glBindTexture() error message

2017-04-14 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

On 15/04/17 04:42, Brian Paul wrote:

---
 src/mesa/main/texobj.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index ad644ca..00feb97 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -1663,7 +1663,8 @@ _mesa_BindTexture( GLenum target, GLuint texName )

targetIndex = _mesa_tex_target_to_index(ctx, target);
if (targetIndex < 0) {
-  _mesa_error(ctx, GL_INVALID_ENUM, "glBindTexture(target)");
+  _mesa_error(ctx, GL_INVALID_ENUM, "glBindTexture(target = %s)",
+  _mesa_enum_to_string(target));
   return;
}
assert(targetIndex < NUM_TEXTURE_TARGETS);




[Mesa-dev] [PATCH 06/12] i965/cnl: Modify thread count shift for VS

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
 src/mesa/drivers/dri/i965/gen8_vs_state.c | 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 08106c0..688ff61 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -607,6 +607,7 @@ enum brw_wrap_mode {
 /* DW5 */
 # define GEN6_VS_MAX_THREADS_SHIFT 25
 # define HSW_VS_MAX_THREADS_SHIFT  23
+# define GEN10_VS_MAX_THREADS_SHIFT 22
 # define GEN6_VS_STATISTICS_ENABLE (1 << 10)
 # define GEN6_VS_CACHE_DISABLE (1 << 1)
 # define GEN6_VS_ENABLE(1 << 0)
diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c 
b/src/mesa/drivers/dri/i965/gen8_vs_state.c
index 7b66da4..c4ad9cd 100644
--- a/src/mesa/drivers/dri/i965/gen8_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c
@@ -75,7 +75,11 @@ upload_vs_state(struct brw_context *brw)
uint32_t simd8_enable =
   vue_prog_data->dispatch_mode == DISPATCH_MODE_SIMD8 ?
   GEN8_VS_SIMD8_ENABLE : 0;
-   OUT_BATCH(((devinfo->max_vs_threads - 1) << HSW_VS_MAX_THREADS_SHIFT) |
+
+   uint32_t threads = (devinfo->max_vs_threads - 1);
+   threads <<= brw->gen >= 10 ? GEN10_VS_MAX_THREADS_SHIFT :
+HSW_VS_MAX_THREADS_SHIFT;
+   OUT_BATCH(threads |
  GEN6_VS_STATISTICS_ENABLE |
  simd8_enable |
  GEN6_VS_ENABLE);
-- 
2.9.3
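
For illustration, the shift selection in this patch can be isolated into a
small helper. This is a sketch, not the driver code: the sketch_* names are
hypothetical, and only the two shift values (23 and 22) and the
"max_vs_threads - 1" encoding come from the diff above.

```c
#include <stdint.h>

/* Shift values as defined in brw_defines.h by this patch. */
#define SKETCH_HSW_VS_MAX_THREADS_SHIFT   23
#define SKETCH_GEN10_VS_MAX_THREADS_SHIFT 22

/* Returns the "maximum number of threads" field, positioned for the dword
 * that upload_vs_state() ORs together with the enable bits.
 */
static uint32_t
sketch_vs_max_threads_field(int gen, unsigned max_vs_threads)
{
   uint32_t threads = max_vs_threads - 1;
   unsigned shift = gen >= 10 ? SKETCH_GEN10_VS_MAX_THREADS_SHIFT
                              : SKETCH_HSW_VS_MAX_THREADS_SHIFT;

   return threads << shift;
}
```

As a side observation, with the gen10 value of max_vs_threads from patch 08/12
(728), 727 << 22 still fits in 32 bits, whereas 727 << 23 would not.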



[Mesa-dev] [PATCH 11/12] i965/cnl: Properly handle l3 configuration

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

V2: Squash the changes in one patch and rebased on master (Anuj).

Signed-off-by: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 src/intel/common/gen_l3_config.c | 43 ++--
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index 4fe3503..f3e8793 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -102,6 +102,26 @@ static const struct gen_l3_config chv_l3_configs[] = {
 };
 
 /**
+ * On CNL, RO clients are merged and shared with read/write space. As a result
+ * we have fewer allocation parameters. Also, programming does not require any
+ * back scaling. Programming simply works in 2k increments and is scaled by the
+ * hardware.
+ */
+static const struct gen_l3_config cnl_l3_configs[] = {
+   /* SLM URB Rest  DC  RO */
+   {{  0, 64, 64,  0,  0 }},
+   {{  0, 64,  0, 16, 48 }},
+   {{  0, 48,  0, 16, 64 }},
+   {{  0, 32,  0,  0, 96 }},
+   {{  0, 32, 96,  0,  0 }},
+   {{  0, 32,  0, 16, 80 }},
+   {{ 32, 16, 80,  0,  0 }},
+   {{ 32, 16,  0, 64, 16 }},
+   {{ 32,  0, 96,  0,  0 }},
+   {{ 0 }}
+};
+
+/**
  * Return a zero-terminated array of validated L3 configurations for the
  * specified device.
  */
@@ -116,9 +136,11 @@ get_l3_configs(const struct gen_device_info *devinfo)
   return (devinfo->is_cherryview ? chv_l3_configs : bdw_l3_configs);
 
case 9:
-   case 10:
   return chv_l3_configs;
 
+   case 10:
+  return cnl_l3_configs;
+
default:
   unreachable("Not implemented");
}
@@ -258,13 +280,19 @@ get_l3_way_size(const struct gen_device_info *devinfo)
if (devinfo->is_baytrail)
   return 2;
 
-   else if (devinfo->gt == 1 ||
-devinfo->is_cherryview ||
-devinfo->is_broxton)
+   /* Way size is actually 6 * num_slices, because it's 2k per bank, and
+* normally 3 banks per slice. However, on CNL+ this information isn't
+* needed to setup the URB/l3 configuration. We fudge the answer here
+* and then use the scaling to fix it up later.
+*/
+   if (devinfo->gen >= 10)
+  return 2 * devinfo->l3_banks;
+
+   /* XXX: Cherryview and Broxton are always gt1 */
+   if (devinfo->gt == 1)
   return 4;
 
-   else
-  return 8 * devinfo->num_slices;
+   return 8 * devinfo->num_slices;
 }
 
 /**
@@ -274,6 +302,9 @@ get_l3_way_size(const struct gen_device_info *devinfo)
 static unsigned
 get_urb_size_scale(const struct gen_device_info *devinfo)
 {
+   if (devinfo->gen == 10)
+  return devinfo->l3_banks;
+
return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
 }
 
-- 
2.9.3
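
To make the new control flow easier to follow, here is a standalone sketch of
the two helpers as they read after this patch. It assumes a cut-down,
hypothetical struct sketch_devinfo with only the members the helpers touch;
the return values mirror the diff above.

```c
#include <stdbool.h>

/* Hypothetical subset of gen_device_info, just enough for the sketch. */
struct sketch_devinfo {
   int gen;
   int gt;
   bool is_baytrail;
   unsigned num_slices;
   unsigned l3_banks;
};

/* Mirrors get_l3_way_size() after the patch. */
static unsigned
sketch_l3_way_size(const struct sketch_devinfo *d)
{
   if (d->is_baytrail)
      return 2;

   /* CNL+: 2k per bank, scaled by the bank count instead of the slices. */
   if (d->gen >= 10)
      return 2 * d->l3_banks;

   if (d->gt == 1)
      return 4;

   return 8 * d->num_slices;
}

/* Mirrors get_urb_size_scale() after the patch. */
static unsigned
sketch_urb_size_scale(const struct sketch_devinfo *d)
{
   if (d->gen == 10)
      return d->l3_banks;

   return d->gen >= 8 ? d->num_slices : 1;
}
```

For example, a device described as GEN10_FEATURES(2, 2, 6) in patch 08/12
(gt 2, 2 slices, 6 L3 banks) would get a way size of 12 and a URB size scale
of 6 from these helpers.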



[Mesa-dev] [PATCH 10/12] i965/cnl: Update memory barrier assert

2017-04-14 Thread Anuj Phogat
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/brw_program.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index e1f9896..ab719ad 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -292,7 +292,7 @@ brw_memory_barrier(struct gl_context *ctx, GLbitfield 
barriers)
unsigned bits = (PIPE_CONTROL_DATA_CACHE_FLUSH |
 PIPE_CONTROL_NO_WRITE |
 PIPE_CONTROL_CS_STALL);
-   assert(brw->gen >= 7 && brw->gen <= 9);
+   assert(brw->gen >= 7 && brw->gen <= 10);
 
if (barriers & (GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT |
GL_ELEMENT_ARRAY_BARRIER_BIT |
-- 
2.9.3



[Mesa-dev] [PATCH 12/12] i965/cnl: Add CNL MOCS defines

2017-04-14 Thread Anuj Phogat
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/brw_blorp.c| 7 ++-
 src/mesa/drivers/dri/i965/brw_defines.h  | 8 
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 2 ++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8a6cc66..eae925f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -94,12 +94,17 @@ brw_blorp_init(struct brw_context *brw)
   brw->blorp.exec = gen8_blorp_exec;
   break;
case 9:
-   case 10:
   brw->blorp.mocs.tex = SKL_MOCS_WB;
   brw->blorp.mocs.rb = SKL_MOCS_PTE;
   brw->blorp.mocs.vb = SKL_MOCS_WB;
   brw->blorp.exec = gen9_blorp_exec;
   break;
+   case 10:
+  brw->blorp.mocs.tex = CNL_MOCS_WB;
+  brw->blorp.mocs.rb = CNL_MOCS_PTE;
+  brw->blorp.mocs.vb = CNL_MOCS_WB;
+  brw->blorp.exec = gen9_blorp_exec;
+  break;
default:
   unreachable("Invalid gen");
}
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 688ff61..afa13b4 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1408,6 +1408,14 @@ enum brw_pixel_shader_coverage_mask_mode {
 /* TC=LLC/eLLC, LeCC=PTE, LRUM=3, L3CC=WB */
 #define SKL_MOCS_PTE (1 << 1)
 
+/* CannonLake: MOCS is now an index into an array of 62 different caching
+ * configurations programmed by the kernel.
+ */
+/* TC=LLC/eLLC, LeCC=WB, LRUM=3, L3CC=WB */
+#define CNL_MOCS_WB  (2 << 1)
+/* TC=LLC/eLLC, LeCC=PTE, LRUM=3, L3CC=WB */
+#define CNL_MOCS_PTE (1 << 1)
+
 #define MEDIA_VFE_STATE 0x7000
 /* GEN7 DW2, GEN8+ DW3 */
 # define MEDIA_VFE_STATE_MAX_THREADS_SHIFT  16
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 1d4953e..68942f7 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -64,12 +64,14 @@ uint32_t tex_mocs[] = {
[7] = GEN7_MOCS_L3,
[8] = BDW_MOCS_WB,
[9] = SKL_MOCS_WB,
+   [10] = CNL_MOCS_WB,
 };
 
 uint32_t rb_mocs[] = {
[7] = GEN7_MOCS_L3,
[8] = BDW_MOCS_PTE,
[9] = SKL_MOCS_PTE,
+   [10] = CNL_MOCS_PTE,
 };
 
 static void
-- 
2.9.3
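
A small sketch of why the [10] entries matter in the gen-indexed tables. In C,
an array declared without a size and filled with designated initializers is
sized by its largest index, so a table that stops at [9] has no element 10 and
indexing it with gen 10 would read past the end. The sketch_* names and the
bounds-checked lookup helper are hypothetical; the MOCS values are the ones
visible in the brw_defines.h hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Values as they appear in the brw_defines.h hunk of this patch. */
#define SKETCH_SKL_MOCS_PTE (1 << 1)
#define SKETCH_CNL_MOCS_WB  (2 << 1)
#define SKETCH_CNL_MOCS_PTE (1 << 1)

#define SKETCH_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Gen-indexed table in the style of rb_mocs[]; unlisted gens are 0. */
static const uint32_t sketch_rb_mocs[] = {
   [9]  = SKETCH_SKL_MOCS_PTE,
   [10] = SKETCH_CNL_MOCS_PTE,
};

/* Gen-indexed table in the style of tex_mocs[], gen10 entry only. */
static const uint32_t sketch_tex_mocs[] = {
   [10] = SKETCH_CNL_MOCS_WB,
};

/* Hypothetical bounds-checked lookup; returns 0 for an unknown gen. */
static uint32_t
sketch_mocs_lookup(const uint32_t *table, size_t len, int gen)
{
   return (gen >= 0 && (size_t)gen < len) ? table[gen] : 0;
}
```

Both sketch tables end up with 11 elements because their largest designated
index is 10.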



[Mesa-dev] [PATCH 09/12] i965/cnl: URB {VS, GS, HS, DS} sizes cannot be a multiple of 3

2017-04-14 Thread Anuj Phogat
v1: By Ben Widawsky 
v2: Add the restriction for GS, HS and DS and make sure
the allocated sizes are not a multiple of 3.

Signed-off-by: Anuj Phogat 
Cc: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/gen7_urb.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
b/src/mesa/drivers/dri/i965/gen7_urb.c
index 028161d..dc6826a 100644
--- a/src/mesa/drivers/dri/i965/gen7_urb.c
+++ b/src/mesa/drivers/dri/i965/gen7_urb.c
@@ -194,6 +194,17 @@ gen7_upload_urb(struct brw_context *brw, unsigned vs_size,
   entry_size[i] = prog_data[i] ? prog_data[i]->urb_entry_size : 1;
}
 
+   /* For Cannonlake:
+* Software shall not program an allocation size that specifies a size
+* that is a multiple of 3 64B (512-bit) cachelines.
+*/
+   if (brw->gen == 10) {
+  for (int i = MESA_SHADER_VERTEX; i <= MESA_SHADER_GEOMETRY; i++) {
+ if (entry_size[i] % 3 == 0)
+entry_size[i]++;
+  }
+   }
+
/* If we're just switching between programs with the same URB requirements,
 * skip the rest of the logic.
 */
@@ -224,6 +235,7 @@ gen7_upload_urb(struct brw_context *brw, unsigned vs_size,
 
BEGIN_BATCH(8);
for (int i = MESA_SHADER_VERTEX; i <= MESA_SHADER_GEOMETRY; i++) {
+  assert(brw->gen != 10 || entry_size[i] % 3);
   OUT_BATCH((_3DSTATE_URB_VS + i) << 16 | (2 - 2));
   OUT_BATCH(entries[i] |
 ((entry_size[i] - 1) << GEN7_URB_ENTRY_SIZE_SHIFT) |
-- 
2.9.3
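
The adjustment in this patch can be read as a tiny pure function. The sketch
below is illustrative only and the helper name is hypothetical; it applies the
same rule as the loop in gen7_upload_urb(): on gen10, bump an entry size that
is a multiple of 3 by one.

```c
/* Returns the entry size to program. On gen10 a size that is a multiple of
 * 3 is bumped by one; since n is a multiple of 3, n + 1 never is, so a
 * single increment is enough.
 */
static unsigned
sketch_cnl_fixup_urb_entry_size(int gen, unsigned entry_size)
{
   if (gen == 10 && entry_size % 3 == 0)
      entry_size++;

   return entry_size;
}
```

This is also why the assert added before OUT_BATCH() can check
"entry_size[i] % 3" directly on gen10: after the fixup the remainder is
always 1 or 2.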



[Mesa-dev] [PATCH 05/12] i965/cnl: Implement depth count workaround

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 5c3ecba..d0d0589 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -111,6 +111,14 @@ brw_write_depth_count(struct brw_context *brw, 
drm_intel_bo *query_bo, int idx)
if (brw->gen == 9 && brw->gt == 4)
   flags |= PIPE_CONTROL_CS_STALL;
 
+   if (brw->gen >= 10) {
+  /* "Driver must program PIPE_CONTROL with only Depth Stall Enable bit set
+   * prior to programming a PIPE_CONTROL with Write PS Depth Count Post sync
+   * operation."
+   */
+  brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_STALL);
+   }
+
brw_emit_pipe_control_write(brw, flags,
query_bo, idx * sizeof(uint64_t),
0, 0);
-- 
2.9.3



[Mesa-dev] [PATCH 07/12] i965/cnl: Restore lossless compression for sRGB formats

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

This support was removed on gen9 (it worked before then) and was brought back
for gen10.

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 467ada5..c8014b9 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -207,7 +207,7 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
brw_context *brw,
if (!brw->format_supported_as_render_target[mt->format])
   return false;
 
-   if (brw->gen >= 9) {
+   if (brw->gen == 9) {
   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
   const uint32_t brw_format = brw_isl_format_for_mesa_format(linear_format);
   return isl_format_supports_ccs_e(&brw->screen->devinfo, brw_format);
-- 
2.9.3



[Mesa-dev] [PATCH 08/12] i965/cnl: Add a preliminary device for CNL

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

Since we've implemented all the known quirks for supporting gen10 with none of
the new features (i.e. it functions like Skylake), it should be safe to
actually enable the device.

v2: rebased on top of master and updated pci ids (Anuj)

Signed-off-by: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 include/pci_ids/i965_pci_ids.h  | 12 ++
 src/intel/common/gen_device_info.c  | 59 +
 src/intel/common/gen_device_info.h  |  1 +
 src/intel/common/gen_l3_config.c|  1 +
 src/intel/compiler/brw_compiler.h   |  2 +-
 src/intel/compiler/brw_eu.c |  2 +
 src/intel/compiler/brw_eu_compact.c |  1 +
 src/intel/isl/isl.c |  2 +
 src/intel/vulkan/anv_cmd_buffer.c   |  1 +
 src/intel/vulkan/anv_device.c   |  1 +
 src/intel/vulkan/anv_entrypoints_gen.py |  1 +
 src/mesa/drivers/dri/i965/brw_blorp.c   |  1 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c |  1 +
 src/mesa/drivers/dri/i965/brw_formatquery.c |  1 +
 src/mesa/drivers/dri/i965/intel_screen.c|  1 +
 15 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 17504f5..b296359 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -165,3 +165,15 @@ CHIPSET(0x5927, kbl_gt3, "Intel(R) Iris Plus Graphics 650 
(Kaby Lake GT3)")
 CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
 CHIPSET(0x3184, glk, "Intel(R) HD Graphics (Geminilake)")
 CHIPSET(0x3185, glk_2x6, "Intel(R) HD Graphics (Geminilake 2x6)")
+CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
+CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
+CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")
+CHIPSET(0x5A42, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")
+CHIPSET(0x5A44, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")
+CHIPSET(0x5A59, cnl_4x8, "Intel(R) HD Graphics (Cannonlake 4x8 GT1.5)")
+CHIPSET(0x5A5A, cnl_4x8, "Intel(R) HD Graphics (Cannonlake 4x8 GT1.5)")
+CHIPSET(0x5A5C, cnl_4x8, "Intel(R) HD Graphics (Cannonlake 4x8 GT1.5)")
+CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
+CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
+CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
+CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 47aed9d..43d6f08 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -555,6 +555,65 @@ static const struct gen_device_info 
gen_device_info_glk_2x6 = {
GEN9_LP_FEATURES_2X6
 };
 
+#define GEN10_HW_INFO   \
+   .gen = 10,   \
+   .max_vs_threads = 728,   \
+   .max_gs_threads = 432,   \
+   .max_tcs_threads = 432,  \
+   .max_tes_threads = 624,  \
+   .max_wm_threads = 64 * 12,   \
+   .max_cs_threads = 56,\
+   .urb = { \
+  .size = 256,  \
+  .min_entries = {  \
+ [MESA_SHADER_VERTEX]= 64,  \
+ [MESA_SHADER_TESS_EVAL] = 34,  \
+  },\
+  .max_entries = {  \
+  [MESA_SHADER_VERTEX]   = 3936,\
+  [MESA_SHADER_TESS_CTRL]= 896, \
+  [MESA_SHADER_TESS_EVAL]= 2064,\
+  [MESA_SHADER_GEOMETRY] = 832, \
+  },\
+   }
+
+#define GEN10_FEATURES(_gt, _slices, _l3)   \
+   GEN8_FEATURES,   \
+   GEN10_HW_INFO,   \
+   .gt = _gt, .num_slices = _slices, .l3_banks = _l3
+
+static const struct gen_device_info gen_device_info_cnl_2x8 = {
+   /* GT0.5 */
+   GEN10_FEATURES(1, 1, 2)
+};
+
+static const struct gen_device_info gen_device_info_cnl_3x8 = {
+   /* GT1 */
+   GEN10_FEATURES(1, 1, 3)
+};
+
+static const struct gen_device_info gen_device_info_cnl_4x8 = {
+   /* GT 1.5 */
+   GEN10_FEATURES(1, 2, 6)
+};
+
+static const struct gen_device_info gen_device_info_cnl_5x8 = {
+   /* GT2 */
+   GEN10_FEATURES(2, 2, 6)
+};
+
+static const struct gen_device_info gen_device_info_cnl_gt1 = {
+   GEN10_FEATURES(1, 1, 3)
+};
+
+static const struct gen_device_info gen_device_info_cnl_gt2 = {
+   GEN10_FEATURES(2, 2, 6)
+};
+
+static const struct gen_device_info gen_device_info_cnl_gt3 = {
+   GEN10_FEATURES(3, 4, 12)
+};
+
 bool
 

[Mesa-dev] [PATCH 03/12] i965/cnl: Update the script generating genX_bits.h

2017-04-14 Thread Anuj Phogat
Signed-off-by: Anuj Phogat 
---
 src/intel/genxml/gen_bits_header.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/gen_bits_header.py 
b/src/intel/genxml/gen_bits_header.py
index 808e6cf..77cd966 100644
--- a/src/intel/genxml/gen_bits_header.py
+++ b/src/intel/genxml/gen_bits_header.py
@@ -84,6 +84,7 @@ static inline uint32_t ATTRIBUTE_PURE
 ${field.token_name}(const struct gen_device_info *devinfo)
 {
switch (devinfo->gen) {
+   case 10: return ${field.bits(10)};
case 9: return ${field.bits(9)};
case 8: return ${field.bits(8)};
case 7:
@@ -151,8 +152,7 @@ class Gen(object):
 def __init__(self, z):
 # Convert potential "major.minor" string
 z = float(z)
-if z < 10:
-z *= 10
+z *= 10
 self.tenx = int(z)
 
 def __lt__(self, other):
-- 
2.9.3



[Mesa-dev] [PATCH 00/12] Add Cannonlake support

2017-04-14 Thread Anuj Phogat
This series adds preliminary support for Cannonlake. We
still end up using gen9 paths in many cases. My upcoming
patches will change that by creating new functions and headers
for gen10. You can also find this series at:
https://github.com/aphogat/mesa.git
branch: reviews 

Anuj Phogat (4):
  i965/cnl: Update the script generating genX_bits.h
  i965/cnl: URB {VS, GS, HS, DS} sizes cannot be a multiple of 3
  i965/cnl: Update memory barrier assert
  i965/cnl: Add CNL MOCS defines

Ben Widawsky (7):
  i965: Make feature macros gen8 based
  i965/cnl: Implement new pipe control workaround
  i965/cnl: Implement depth count workaround
  i965/cnl: Modify thread count shift for VS
  i965/cnl: Restore lossless compression for sRGB formats
  i965/cnl: Add a preliminary device for CNL
  i965/cnl: Properly handle l3 configuration

Jason Ekstrand (1):
  i965/cnl: Add gen10.xml

 include/pci_ids/i965_pci_ids.h   |   12 +
 src/intel/Makefile.sources   |3 +-
 src/intel/common/gen_device_info.c   |   72 +-
 src/intel/common/gen_device_info.h   |1 +
 src/intel/common/gen_l3_config.c |   42 +-
 src/intel/compiler/brw_compiler.h|2 +-
 src/intel/compiler/brw_eu.c  |2 +
 src/intel/compiler/brw_eu_compact.c  |1 +
 src/intel/genxml/gen10.xml   | 3557 ++
 src/intel/genxml/gen_bits_header.py  |4 +-
 src/intel/isl/isl.c  |2 +
 src/intel/vulkan/anv_cmd_buffer.c|1 +
 src/intel/vulkan/anv_device.c|1 +
 src/intel/vulkan/anv_entrypoints_gen.py  |1 +
 src/mesa/drivers/dri/i965/brw_blorp.c|6 +
 src/mesa/drivers/dri/i965/brw_defines.h  |9 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c  |1 +
 src/mesa/drivers/dri/i965/brw_formatquery.c  |1 +
 src/mesa/drivers/dri/i965/brw_pipe_control.c |   18 +
 src/mesa/drivers/dri/i965/brw_program.c  |2 +-
 src/mesa/drivers/dri/i965/brw_queryobj.c |8 +
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |2 +
 src/mesa/drivers/dri/i965/gen7_urb.c |   12 +
 src/mesa/drivers/dri/i965/gen8_vs_state.c|6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|2 +-
 src/mesa/drivers/dri/i965/intel_screen.c |1 +
 26 files changed, 3749 insertions(+), 20 deletions(-)
 create mode 100644 src/intel/genxml/gen10.xml

-- 
2.9.3



[Mesa-dev] [PATCH 04/12] i965/cnl: Implement new pipe control workaround

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

GEN10 requires flushing all previous pipe controls before issuing a render
target cache flush. The docs seem to fairly explicitly say this is gen10 only.

v2: Rebased on
commit 04f74d66293222d5e1905cfb930bfa083e30463c
Author: Francisco Jerez 
Date:   Thu Jun 30 19:39:24 2016 -0700

i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush.

Cc: Francisco Jerez 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_pipe_control.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
b/src/mesa/drivers/dri/i965/brw_pipe_control.c
index b8f7406..b921fe7 100644
--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
@@ -128,6 +128,24 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
uint32_t flags)
  brw_emit_pipe_control_flush(brw, 0);
   }
 
+  if (brw->gen == 10) {
+/* Hardware workaround: CNL
+ *
+ * "Before sending a PIPE_CONTROL command with bit 12 set, SW
+ * must issue another PIPE_CONTROL with Render Target Cache
+ * Flush Enable (bit 12) = 0 and Pipe Control Flush Enable (bit
+ * 7) = 1."
+ */
+ BEGIN_BATCH(6);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
+ OUT_BATCH(PIPE_CONTROL_FLUSH_ENABLE);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ ADVANCE_BATCH();
+  }
+
   BEGIN_BATCH(6);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
   OUT_BATCH(flags);
-- 
2.9.3



[Mesa-dev] [PATCH 01/12] i965: Make feature macros gen8 based

2017-04-14 Thread Anuj Phogat
From: Ben Widawsky 

All the "features" of the hardware are similar starting with GEN8, so remove as
much of the GEN9 uniqueness as possible. This makes implementing future gen
platforms a bit easier.

Signed-off-by: Ben Widawsky 
Reviewed-by: Anuj Phogat 
---
 src/intel/common/gen_device_info.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 209b293..47aed9d 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -378,15 +378,8 @@ static const struct gen_device_info gen_device_info_chv = {
}
 };
 
-#define GEN9_FEATURES   \
+#define GEN9_HW_INFO\
.gen = 9,\
-   .has_hiz_and_separate_stencil = true,\
-   .has_resource_streamer = true,   \
-   .must_use_separate_stencil = true,   \
-   .has_llc = true, \
-   .has_pln = true, \
-   .supports_simd16_3src = true,\
-   .has_surface_tile_offset = true, \
.max_vs_threads = 336,   \
.max_gs_threads = 336,   \
.max_tcs_threads = 336,  \
@@ -454,6 +447,10 @@ static const struct gen_device_info gen_device_info_chv = {
   },   \
}
 
+#define GEN9_FEATURES   \
+   GEN8_FEATURES,   \
+   GEN9_HW_INFO
+
 static const struct gen_device_info gen_device_info_skl_gt1 = {
GEN9_FEATURES, .gt = 1,
.num_slices = 1,
-- 
2.9.3
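
For readers unfamiliar with the pattern, here is a rough sketch of how the
split macros compose. Each macro expands to a comma-separated run of
designated initializers, so GEN9_FEATURES can simply be "GEN8_FEATURES,
GEN9_HW_INFO". The struct and all SKETCH_* names are hypothetical, and the
contents of the GEN8 macro are an assumed subset (the real GEN8_FEATURES is
not shown in this patch); only .gen = 9 and .max_vs_threads = 336 are taken
from the diff.

```c
#include <stdbool.h>

/* Hypothetical cut-down device info struct. */
struct sketch_device_info {
   int gen;
   bool has_llc;
   bool has_pln;
   unsigned max_vs_threads;
   int gt;
   unsigned num_slices;
};

/* Assumed subset of the feature flags shared from gen8 onwards. */
#define SKETCH_GEN8_FEATURES \
   .has_llc = true,          \
   .has_pln = true

/* Per-generation hardware numbers; values taken from the diff. */
#define SKETCH_GEN9_HW_INFO  \
   .gen = 9,                 \
   .max_vs_threads = 336

/* Composition in the style of the new GEN9_FEATURES. */
#define SKETCH_GEN9_FEATURES \
   SKETCH_GEN8_FEATURES,     \
   SKETCH_GEN9_HW_INFO

static const struct sketch_device_info sketch_skl_gt1 = {
   SKETCH_GEN9_FEATURES, .gt = 1,
   .num_slices = 1,
};
```

A later generation can then reuse the shared feature macro and supply only
its own HW_INFO block, which is the shape GEN10_FEATURES takes in patch 08/12.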



Re: [Mesa-dev] [PATCH] swr: Fix swr osmesa build

2017-04-14 Thread Kyriazis, George
Thanks Emil,

I will attempt to un-meh a bit at checkin.

George

> On Apr 14, 2017, at 5:52 PM, Emil Velikov wrote:
> 
> Commit summary is a bit meh, but regardless.
> 
> Reviewed-by: Emil Velikov 
> 
> -Emil



Re: [Mesa-dev] [PATCH] docs: Document interaction Fixes tag and stable branches.

2017-04-14 Thread Emil Velikov
On 14 April 2017 at 22:43, Bas Nieuwenhuizen wrote:
> For the next time I forget.
>
> CC: Emil Velikov 
> Signed-off-by: Bas Nieuwenhuizen 
> ---
>  docs/submittingpatches.html | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/docs/submittingpatches.html b/docs/submittingpatches.html
> index 5310b1d8c17..4b025647039 100644
> --- a/docs/submittingpatches.html
> +++ b/docs/submittingpatches.html
> @@ -266,6 +266,11 @@ Note: by removing the tag [as the commit is pushed] the 
> patch is
>  Thus, drop the line only if you want to cancel the 
> nomination.
>  
>
> +Alternatively, if one uses the "Fixes" tag as desribed in the "Patch 
> formatting"
s/desribed/described/

> +section, it nominates a commit for all active stable branches that include 
> the
> +commit that is referred to. If the "CC" tag is also present the "Fixes" tag 
> will
> +be used to determine which active stable branches the commit applies to.
> +
Please drop the second sentence, since it does not bring much (it
even confuses the hell out of me).

Reviewed-by: Emil Velikov 

Thanks
Emil


Re: [Mesa-dev] [PATCH v3] etnaviv: native fence fd support

2017-04-14 Thread Christian Gmeiner
2017-04-12 12:31 GMT+02:00 Philipp Zabel :
> This adds native fence fd support to etnaviv, similarly to commit
> 0b98e84e9ba0 ("freedreno: native fence fd"), enabled for kernel
> driver version 1.1 or later.
>
> Signed-off-by: Philipp Zabel 
> Reviewed-By: Wladimir J. van der Laan 

Reviewed-by: Christian Gmeiner 

--
Christian Gmeiner, MSc

https://www.youtube.com/user/AloryOFFICIAL
https://soundcloud.com/christian-gmeiner


Re: [Mesa-dev] [PATCH] r600g: update dirty_level_mask after the 1-st draw after FB change

2017-04-14 Thread Dieter Nützel

Tested-by: Dieter Nützel 

On Turks XT (6670)

Dieter

On 13.04.2017 22:56, Constantine Kharlamov wrote:
Ported from radeonsi. Testing with Kane shows ≈1k skipped updates per
frame on average.

No piglit changes with tests/gpu.py, gbm mode.

Signed-off-by: Constantine Kharlamov 
---
 src/gallium/drivers/r600/evergreen_state.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.h |  1 +
 src/gallium/drivers/r600/r600_state.c|  1 +
 src/gallium/drivers/r600/r600_state_common.c | 41 
 4 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c
b/src/gallium/drivers/r600/evergreen_state.c
index 5697da4af9..19ad504097 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1550,6 +1550,7 @@ static void
evergreen_set_framebuffer_state(struct pipe_context *ctx,
r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);

r600_set_sample_locations_constant_buffer(rctx);
+   rctx->framebuffer.do_update_surf_dirtiness = true;
 }

 static void evergreen_set_min_samples(struct pipe_context *ctx,
unsigned min_samples)
diff --git a/src/gallium/drivers/r600/r600_pipe.h
b/src/gallium/drivers/r600/r600_pipe.h
index 7f1ecc278b..e1715e8628 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -189,6 +189,7 @@ struct r600_framebuffer {
bool cb0_is_integer;
bool is_msaa_resolve;
bool dual_src_blend;
+   bool do_update_surf_dirtiness;
 };

 struct r600_sample_mask {
diff --git a/src/gallium/drivers/r600/r600_state.c
b/src/gallium/drivers/r600/r600_state.c
index 06100abc4a..fc93eb02ad 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -1209,6 +1209,7 @@ static void r600_set_framebuffer_state(struct
pipe_context *ctx,
r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);

r600_set_sample_locations_constant_buffer(rctx);
+   rctx->framebuffer.do_update_surf_dirtiness = true;
 }

 static uint32_t sample_locs_2x[] = {
diff --git a/src/gallium/drivers/r600/r600_state_common.c
b/src/gallium/drivers/r600/r600_state_common.c
index 5be49dcdfe..7b52be36cd 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -99,6 +99,7 @@ static void r600_texture_barrier(struct pipe_context
*ctx, unsigned flags)
   R600_CONTEXT_FLUSH_AND_INV_CB |
   R600_CONTEXT_FLUSH_AND_INV |
   R600_CONTEXT_WAIT_3D_IDLE;
+   rctx->framebuffer.do_update_surf_dirtiness = true;
 }

 static unsigned r600_conv_pipe_prim(unsigned prim)
@@ -1732,6 +1733,7 @@ static void r600_draw_vbo(struct pipe_context
*ctx, const struct pipe_draw_info
if (unlikely(dirty_tex_counter != rctx->b.last_dirty_tex_counter)) {
rctx->b.last_dirty_tex_counter = dirty_tex_counter;
r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);
+   rctx->framebuffer.do_update_surf_dirtiness = true;
}

if (!r600_update_derived_state(rctx)) {
@@ -2034,29 +2036,32 @@ static void r600_draw_vbo(struct pipe_context
*ctx, const struct pipe_draw_info
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_SQ_NON_EVENT));
}

-   /* Set the depth buffer as dirty. */
-   if (rctx->framebuffer.state.zsbuf) {
-   struct pipe_surface *surf = rctx->framebuffer.state.zsbuf;
-   struct r600_texture *rtex = (struct r600_texture 
*)surf->texture;
+   if (rctx->framebuffer.do_update_surf_dirtiness) {
+   /* Set the depth buffer as dirty. */
+   if (rctx->framebuffer.state.zsbuf) {
+   struct pipe_surface *surf = 
rctx->framebuffer.state.zsbuf;
+   struct r600_texture *rtex = (struct r600_texture 
*)surf->texture;

-   rtex->dirty_level_mask |= 1 << surf->u.tex.level;
+   rtex->dirty_level_mask |= 1 << surf->u.tex.level;

-   if (rtex->surface.flags & RADEON_SURF_SBUFFER)
-   rtex->stencil_dirty_level_mask |= 1 << 
surf->u.tex.level;
-   }
-   if (rctx->framebuffer.compressed_cb_mask) {
-   struct pipe_surface *surf;
-   struct r600_texture *rtex;
-   unsigned mask = rctx->framebuffer.compressed_cb_mask;
+   if (rtex->surface.flags & RADEON_SURF_SBUFFER)
+   rtex->stencil_dirty_level_mask |= 1 << 
surf->u.tex.level;
+   }
+   if (rctx->framebuffer.compressed_cb_mask) {
+   struct pipe_surface *surf;
+   struct r600_texture *rtex;
+   unsigned mask = rctx->framebuffer.compressed_cb_mask;

-   do {
-   unsigned i = u_bit_scan(&mask);
-   surf = 

Re: [Mesa-dev] [PATCH] swr: Fix swr osmesa build

2017-04-14 Thread Emil Velikov
Commit summary is a bit meh, but regardless.

Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-14 Thread Dieter Nützel

Am 14.04.2017 07:53, schrieb gregory hainaut:

On Fri, 14 Apr 2017 05:20:38 +0200
Dieter Nützel  wrote:


Am 14.04.2017 02:06, schrieb Dieter Nützel:
> Hello Gregory,
>
> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> It results in crazy numbers and does not 'return' (one core stays @ 100%).

This is related to 'mesa_glthread=true'.
If I disable (unset) it, all is fine after the 'b' benchmark, and 'pbo'
exits with ESC as expected.
Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)

Hope that helps.

Dieter


Hello Dieter,

I tested the demo. There is a mostly unrelated bug when the application
exits.

Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
found non-freed data

I will add a call to a _mesa_HashDeleteAll to fix it.
i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);

Now let's go back to the test behavior. The benchmark sends 4 seconds' worth
of asynchronous PBO transfer commands and then syncs gl_thread, which
means the application thread is blocked until all PBO transfers are
done. Gl_thread dispatches commands faster, so you will need to wait
longer before the thread returns.

On my side, I need to wait around 45 seconds for 6 million commands.

Result:  6,440,627 reads (gl thread on + PBO patches)
Result:274,960 reads (gl thread off)

In your case, "Result:  77,444,412 reads", I hope you're patient.
I think you must wait at least 10 minutes.
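
As an aside on the "crazy numbers": the negative pixels/sec value in the
quoted output is consistent with a 32-bit overflow. The sketch below assumes
the demo multiplies the read count by the image width and height in a signed
32-bit integer before dividing by the elapsed time; that is an assumption
about the demo, not something confirmed in this thread. The helper names are
hypothetical. With 77,444,412 reads of a 194 by 188 image, the wrapped
product divided by 4 seconds comes out at -383971576, which matches the
reported figure.

```c
#include <stdint.h>

/* Total pixels read, computed in 64 bits so the product cannot wrap. */
static int64_t total_pixels_64(int64_t reads, int width, int height)
{
   return reads * (int64_t)width * (int64_t)height;
}

/* What a 32-bit signed counter would hold if the same product wrapped:
 * keep the low 32 bits, then reinterpret them as two's complement. */
static int32_t wrap_to_int32(int64_t value)
{
   uint32_t low = (uint32_t)value;   /* well-defined: reduction modulo 2^32 */

   if (low > (uint32_t)INT32_MAX)
      return (int32_t)((int64_t)low - 4294967296LL);
   return (int32_t)low;
}
```

Doing the multiplication in a 64-bit or floating-point type would keep the
reported rate positive.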


Now, I was patient...
I tried twice: the first time I killed it after ~20 minutes, and during the
second run I attached gdb to it.


0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0

(gdb) bt
#0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0

#1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
#2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
#3  0x00401e18 in ?? ()
#4  0x004028c7 in ?? ()
#5  0x7fbda9925781 in fghRedrawWindow () from 
/usr/lib64/libglut.so.3

#6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
#7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
#8  0x7fbda9925ce4 in glutMainLoopEvent () from 
/usr/lib64/libglut.so.3

#9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
#10 0x004019fc in ?? ()
#11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
#12 0x00401afa in ?? ()

Should I do more, or is it not worth it?

Dieter


> mesa-demos/tests> ./pbo
> ATTENTION: default value of option mesa_glthread overridden by
> environment.
> GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
> GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 /
> 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
> Loaded 194 by 188 image
> Converting RGB image to RGBA
> Benchmarking...
> Result:  7712 reads in 4.00 seconds = -383971576.00
> pixels/sec
>
> top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
> Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
> %Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080
> buff/cache
> KiB Swap:0 total,0 free,0 used. 18437888 avail
> Mem
>
>   PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+
> COMMAND
> 19380 dieter20   0 3259764 2,911g  22472 S 100,3 12,37   2:28.48
> pbo
> 27937 dieter20   0 4029572 570236 166116 S 5,980 2,310   9:45.53
> konqueror
> 13432 dieter20   0 1922820 269892 129152 S 5,648 1,094   4:33.80
> Web Content
>
> Other than that:
>
> For the series:
>
> Tested-by: Dieter Nützel 
> r600g, Turks XT (6670)
>
> Dieter
>
> Am 13.04.2017 19:32, schrieb Gregory Hainaut:
>> Hello,
>>
>> Please find a new version to handle invalid buffer handles.
>>
>> This allows handling this kind of case:
>>genBuffer();
>>BindBuffer(pbo)
>>DeleteBuffer(pbo);
>>BindBuffer(rand_pbo)
>>TexSubImage2D(user_memory_pointer); // Data transfer will be
>> synchronous
>>
>> There are various subtleties in handling a multi-threaded shared context. In
>> order to keep the code sane, I've considered a buffer invalid when it is
>> deleted by a context, even if it is still bound to other contexts. It will
>> force a synchronous
>> transfer which 

[Mesa-dev] [PATCH] nir: Add GLSL_TYPE_[U]INT64 to some switch statements

2017-04-14 Thread Jason Ekstrand
Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/nir/nir.c  | 2 ++
 src/compiler/nir/nir_split_var_copies.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 43fa60f..0abf9b6 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -699,7 +699,9 @@ deref_foreach_leaf_build_recur(nir_deref_var *deref, 
nir_deref *tail,
assert(tail->child == NULL);
switch (glsl_get_base_type(tail->type)) {
case GLSL_TYPE_UINT:
+   case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT:
+   case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
diff --git a/src/compiler/nir/nir_split_var_copies.c 
b/src/compiler/nir/nir_split_var_copies.c
index 58c7873..15a185e 100644
--- a/src/compiler/nir/nir_split_var_copies.c
+++ b/src/compiler/nir/nir_split_var_copies.c
@@ -147,7 +147,9 @@ split_var_copy_instr(nir_intrinsic_instr *old_copy,
   break;
 
case GLSL_TYPE_UINT:
+   case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT:
+   case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
-- 
2.5.0.400.gff86faf
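
For readers skimming the diff, here is a minimal sketch of the pattern the
patch extends: a switch over base types in which all scalar types share one
leaf case. The enum and function names below are hypothetical stand-ins, not
the Mesa definitions. If the 64-bit integer labels were missing, those types
would fall through to whatever the switch does for unhandled values.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GLSL base-type enum (not the Mesa one). */
enum sketch_base_type {
   SKETCH_TYPE_UINT,
   SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT64,
   SKETCH_TYPE_INT64,
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_DOUBLE,
   SKETCH_TYPE_BOOL,
   SKETCH_TYPE_STRUCT,
};

/* Returns true for scalar "leaf" types that need no further splitting. */
static bool sketch_is_leaf_type(enum sketch_base_type type)
{
   switch (type) {
   case SKETCH_TYPE_UINT:
   case SKETCH_TYPE_UINT64:   /* the kind of label the patch adds */
   case SKETCH_TYPE_INT:
   case SKETCH_TYPE_INT64:    /* the kind of label the patch adds */
   case SKETCH_TYPE_FLOAT:
   case SKETCH_TYPE_DOUBLE:
   case SKETCH_TYPE_BOOL:
      return true;
   case SKETCH_TYPE_STRUCT:
      return false;
   }
   return false;
}
```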



Re: [Mesa-dev] [PATCH 2/2] etnaviv: resolve tile status when flushing resource

2017-04-14 Thread Christian Gmeiner
2017-04-12 16:13 GMT+02:00 Lucas Stach :
> From: Philipp Zabel 
>
> When passing render buffers from EGL clients to a wayland compositor,
> the resource tile status must be resolved because otherwise the tile
> status is lost in the transfer and cleared parts of the buffer will
> contain old contents.
>
> The same applies when sampling directly from a renderable resource.
>
> lst: Add seqno tracking, to skip flush when not needed.
>
> Fixes: aadcb5e94b35 ("etnaviv: enable TS, but disable autodisable")
> Signed-off-by: Philipp Zabel 
> Signed-off-by: Lucas Stach 

Reviewed-by: Christian Gmeiner 

--
Christian Gmeiner, MSc

https://www.youtube.com/user/AloryOFFICIAL
https://soundcloud.com/christian-gmeiner


Re: [Mesa-dev] [PATCH 1/2] etnaviv: stop repeatedly resolving an unchanged resource into its scanout prime buffer

2017-04-14 Thread Christian Gmeiner
2017-04-12 16:13 GMT+02:00 Lucas Stach :
> From: Philipp Zabel 
>
> Before resolving a resource into its scanout prime buffer, check that
> the prime resource is actually older. If it is not, the resolve is an
> expensive no-op, and we better skip it.
>
> Signed-off-by: Philipp Zabel 

Reviewed-by: Christian Gmeiner 
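
The "is the prime resource actually older" check from the quoted commit
message can be sketched as a sequence-number comparison. This is only an
illustration under assumptions: the struct, its seqno field and the helper
names are hypothetical, and the real driver code may look different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical resource carrying a sequence number that is bumped
 * every time the resource is written. */
struct sketch_resource {
   uint32_t seqno;
};

/* True if a was last written before b. The subtraction is done modulo
 * 2^32, so the comparison keeps working after the counter wraps. */
static bool sketch_resource_older(const struct sketch_resource *a,
                                  const struct sketch_resource *b)
{
   uint32_t diff = a->seqno - b->seqno;

   return diff > (uint32_t)INT32_MAX;
}

/* Resolve into the scanout buffer only when it is stale. */
static bool sketch_needs_resolve(const struct sketch_resource *scanout,
                                 const struct sketch_resource *render)
{
   return sketch_resource_older(scanout, render);
}
```

After a resolve, the scanout copy would take over the render resource's
sequence number, so a second flush without new rendering is skipped.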

--
Christian Gmeiner, MSc

https://www.youtube.com/user/AloryOFFICIAL
https://soundcloud.com/christian-gmeiner


Re: [Mesa-dev] [PATCH v2 2/3] etnaviv: Update includes from rnndb

2017-04-14 Thread Christian Gmeiner
> +#define INST_OPCODE_IMADLOSAT0 0x004e
> +#define INST_OPCODE_IMADLOSAT0 0x004f

INST_OPCODE_IMADLOSAT0 got redefined...

greets
--
Christian Gmeiner, MSc

https://www.youtube.com/user/AloryOFFICIAL
https://soundcloud.com/christian-gmeiner


[Mesa-dev] [PATCH] nir: Add GLSL_TYPE_[U]INT64 to some switch statements

2017-04-14 Thread Jason Ekstrand
---
 src/compiler/nir/nir.c  | 2 ++
 src/compiler/nir/nir_split_var_copies.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 43fa60f..0abf9b6 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -699,7 +699,9 @@ deref_foreach_leaf_build_recur(nir_deref_var *deref, 
nir_deref *tail,
assert(tail->child == NULL);
switch (glsl_get_base_type(tail->type)) {
case GLSL_TYPE_UINT:
+   case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT:
+   case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
diff --git a/src/compiler/nir/nir_split_var_copies.c 
b/src/compiler/nir/nir_split_var_copies.c
index 58c7873..15a185e 100644
--- a/src/compiler/nir/nir_split_var_copies.c
+++ b/src/compiler/nir/nir_split_var_copies.c
@@ -147,7 +147,9 @@ split_var_copy_instr(nir_intrinsic_instr *old_copy,
   break;
 
case GLSL_TYPE_UINT:
+   case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT:
+   case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH] swr: Fix swr osmesa build

2017-04-14 Thread George Kyriazis
---
 src/gallium/targets/osmesa/SConscript | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/targets/osmesa/SConscript 
b/src/gallium/targets/osmesa/SConscript
index 47937a2..7be1b48 100644
--- a/src/gallium/targets/osmesa/SConscript
+++ b/src/gallium/targets/osmesa/SConscript
@@ -31,7 +31,7 @@ if env['llvm']:
 env.Prepend(LIBS = [llvmpipe])
 
 if env['swr']:
-env.Append(CPPDEFINES = 'HAVE_SWR')
+env.Append(CPPDEFINES = 'GALLIUM_SWR')
 env.Prepend(LIBS = [swr])
 
 if env['platform'] == 'windows':
-- 
2.7.4



[Mesa-dev] [PATCH] docs: Document interaction Fixes tag and stable branches.

2017-04-14 Thread Bas Nieuwenhuizen
For the next time I forget.

CC: Emil Velikov 
Signed-off-by: Bas Nieuwenhuizen 
---
 docs/submittingpatches.html | 5 +
 1 file changed, 5 insertions(+)

diff --git a/docs/submittingpatches.html b/docs/submittingpatches.html
index 5310b1d8c17..4b025647039 100644
--- a/docs/submittingpatches.html
+++ b/docs/submittingpatches.html
@@ -266,6 +266,11 @@ Note: by removing the tag [as the commit is pushed] the 
patch is
 Thus, drop the line only if you want to cancel the nomination.
 
 
+Alternatively, if one uses the "Fixes" tag as described in the "Patch 
formatting"
+section, it nominates a commit for all active stable branches that include the
+commit that is referred to. If the "CC" tag is also present the "Fixes" tag 
will
+be used to determine which active stable branches the commit applies to.
+
 Criteria for accepting patches to the stable branch
 
 Mesa has a designated release manager for each stable branch, and the release
-- 
2.12.2



Re: [Mesa-dev] [PATCH v3 4/4] vc4: Only build the NEON code on arm32.

2017-04-14 Thread Emil Velikov
On 14 April 2017 at 19:21, Eric Anholt  wrote:
> Emil Velikov  writes:
>
>> On 14 April 2017 at 18:47, Eric Anholt  wrote:
>>> NEON is sufficiently different on arm64 that we can't just reuse this
>>> code.  Disable it on arm64 for now.
>>>
>>> Signed-off-by: Eric Anholt 
>>> ---
>>>  src/gallium/drivers/vc4/vc4_tiling_lt.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/vc4/vc4_tiling_lt.c 
>>> b/src/gallium/drivers/vc4/vc4_tiling_lt.c
>>> index c9cbc65e2dbc..7de67b652daa 100644
>>> --- a/src/gallium/drivers/vc4/vc4_tiling_lt.c
>>> +++ b/src/gallium/drivers/vc4/vc4_tiling_lt.c
>>> @@ -61,7 +61,7 @@ static void
>>>  vc4_load_utile(void *cpu, void *gpu, uint32_t cpu_stride, uint32_t cpp)
>>>  {
>>>  uint32_t gpu_stride = vc4_utile_stride(cpp);
>>> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
>>> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
>>>  if (gpu_stride == 8) {
>>>  __asm__ volatile (
>>>  /* Load from the GPU in one shot, no interleave, to
>>> @@ -118,7 +118,7 @@ vc4_store_utile(void *gpu, void *cpu, uint32_t 
>>> cpu_stride, uint32_t cpp)
>>>  {
>>>  uint32_t gpu_stride = vc4_utile_stride(cpp);
>>>
>>> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
>>> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
>>
>> This patch should be before 4/4, or it will cause intermittent breakage.
>
> I don't think there is any new breakage.  We've been setting
> VC4_BUILD_NEON already.

From a quick skim it seemed that only the Android build is busted,
rather than everywhere.

I think you're right. Thanks for the correction.
Emil
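
For reference, the guard from the quoted patch can be isolated as below.
This is a sketch only: SKETCH_USE_NEON_ASM and sketch_uses_neon_asm() are
hypothetical names, while VC4_BUILD_NEON and __ARM_ARCH are the macros used
in the patch. Compilers that follow the Arm C Language Extensions define
__ARM_ARCH to the architecture version, so a value of 8 on arm64 disables
the arm32 inline-assembly path.

```c
#include <stdbool.h>

/* Same condition as in the patch: the build asked for NEON, the target
 * is ARM, and the architecture version is at most 7 (arm32). */
#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
#define SKETCH_USE_NEON_ASM 1
#else
#define SKETCH_USE_NEON_ASM 0
#endif

/* Lets callers and tests see which path was compiled in. */
static bool sketch_uses_neon_asm(void)
{
   return SKETCH_USE_NEON_ASM != 0;
}
```

On any build without VC4_BUILD_NEON, or on arm64, this evaluates to false
and the portable C path is used.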


Re: [Mesa-dev] [PATCH v2 3/3] etnaviv: SINGLE_BUFFER support on GC3000

2017-04-14 Thread Christian Gmeiner
2017-04-14 9:44 GMT+02:00 Wladimir J. van der Laan :
> This patch adds support for the SINGLE_BUFFER feature on GC3000
> GPUs, which allows rendering to a single buffer using multiple pixel
> pipes.
>
> This feature is always used when it is available, which means that
> multi-tiled formats are no longer being used in that case, and all
> buffers will be normal (super)tiled. This mimics the behavior of the
> blob on GC3000.
>
> - Because the same format can be used to render to and texture from,
>   this avoids an extra resolve pass when rendering to texture.
>
> - i.MX6qp includes a PRE which can scan-out directly from tiled formats,
>   avoiding untiling overhead.
>
> Signed-off-by: Wladimir J. van der Laan 

Series is:
Reviewed-by: Christian Gmeiner 

greets
--
Christian Gmeiner, MSc

https://www.youtube.com/user/AloryOFFICIAL
https://soundcloud.com/christian-gmeiner


Re: [Mesa-dev] [PATCH] radv: enable timestampComputeAndGraphics

2017-04-14 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Fri, Apr 14, 2017 at 11:24 PM, Grazvydas Ignotas  wrote:
> Commit bfee9866 "radv: Use RELEASE_MEM packet for MEC timestamp query."
> added WriteTimestamp handling for compute queues but forgot to flip
> the flag.
>
> Tested with DOOM (by me) and CTS (by Bas), but without verification
> that these tests actually use timestamps on compute queues.
>
> Signed-off-by: Grazvydas Ignotas 
> ---
>  src/amd/vulkan/radv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 12040a0..dd401f4 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -650,11 +650,11 @@ void radv_GetPhysicalDeviceProperties(
> .sampledImageIntegerSampleCounts  = 
> VK_SAMPLE_COUNT_1_BIT,
> .sampledImageDepthSampleCounts= sample_counts,
> .sampledImageStencilSampleCounts  = sample_counts,
> .storageImageSampleCounts = 
> VK_SAMPLE_COUNT_1_BIT,
> .maxSampleMaskWords   = 1,
> -   .timestampComputeAndGraphics  = false,
> +   .timestampComputeAndGraphics  = true,
> .timestampPeriod  = 100.0 / 
> pdevice->rad_info.clock_crystal_freq,
> .maxClipDistances = 8,
> .maxCullDistances = 8,
> .maxCombinedClipAndCullDistances  = 8,
> .discreteQueuePriorities  = 1,
> --
> 2.7.4
>


[Mesa-dev] [PATCH] radv: enable timestampComputeAndGraphics

2017-04-14 Thread Grazvydas Ignotas
Commit bfee9866 "radv: Use RELEASE_MEM packet for MEC timestamp query."
added WriteTimestamp handling for compute queues but forgot to flip
the flag.

Tested with DOOM (by me) and CTS (by Bas), but without verification
that these tests actually use timestamps on compute queues.

Signed-off-by: Grazvydas Ignotas 
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 12040a0..dd401f4 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -650,11 +650,11 @@ void radv_GetPhysicalDeviceProperties(
.sampledImageIntegerSampleCounts  = 
VK_SAMPLE_COUNT_1_BIT,
.sampledImageDepthSampleCounts= sample_counts,
.sampledImageStencilSampleCounts  = sample_counts,
.storageImageSampleCounts = 
VK_SAMPLE_COUNT_1_BIT,
.maxSampleMaskWords   = 1,
-   .timestampComputeAndGraphics  = false,
+   .timestampComputeAndGraphics  = true,
.timestampPeriod  = 100.0 / 
pdevice->rad_info.clock_crystal_freq,
.maxClipDistances = 8,
.maxCullDistances = 8,
.maxCombinedClipAndCullDistances  = 8,
.discreteQueuePriorities  = 1,
-- 
2.7.4



Re: [Mesa-dev] [PATCH 6/7] radeonsi: remove local variable 'mod' from si_compile_tgsi_shader

2017-04-14 Thread Nicolai Hähnle

On 14.04.2017 17:08, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 6242ec1..704c67e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -7481,21 +7481,20 @@ static void si_build_wrapper_function(struct 
si_shader_context *ctx,
 }

 int si_compile_tgsi_shader(struct si_screen *sscreen,
   LLVMTargetMachineRef tm,
   struct si_shader *shader,
   bool is_monolithic,
   struct pipe_debug_callback *debug)
 {
struct si_shader_selector *sel = shader->selector;
struct si_shader_context ctx;
-   LLVMModuleRef mod;
int r = -1;

/* Dump TGSI code before doing TGSI->LLVM conversion in case the
 * conversion fails. */
if (r600_can_dump_shader(&sscreen->b, sel->info.processor) &&
!(sscreen->b.debug_flags & DBG_NO_TGSI)) {
tgsi_dump(sel->tokens, 0);
si_dump_streamout(&sel->so);
}

@@ -7592,40 +7591,38 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
parts[0] = ctx.main_fn;
}

si_get_ps_epilog_key(shader, &epilog_key);
si_build_ps_epilog_function(&ctx, &epilog_key);
parts[need_prolog ? 2 : 1] = ctx.main_fn;

si_build_wrapper_function(&ctx, parts, need_prolog ? 3 : 2, 
need_prolog ? 1 : 0);
}

-   mod = ctx.gallivm.module;
-
/* Dump LLVM IR before any optimization passes */
if (sscreen->b.debug_flags & DBG_PREOPT_IR &&
r600_can_dump_shader(&sscreen->b, ctx.type))
-   ac_dump_module(mod);
+   LLVMDumpModule(ctx.gallivm.module);


Are you sure this works? Wasn't there some issue with different LLVM 
versions not having the function?


Or wait... I think the function was briefly removed in trunk and then 
added again, so it's probably fine.


The series is

Reviewed-by: Nicolai Hähnle 



si_llvm_finalize_module(&ctx,
r600_extra_shader_checks(&sscreen->b, 
ctx.type));

/* Post-optimization transformations and analysis. */
si_eliminate_const_vs_outputs(&ctx);

if ((debug && debug->debug_message) ||
r600_can_dump_shader(&sscreen->b, ctx.type))
si_count_scratch_private_memory(&ctx);

/* Compile to bytecode. */
r = si_compile_llvm(sscreen, &shader->binary, &shader->config, tm,
-   mod, debug, ctx.type, "TGSI shader");
+   ctx.gallivm.module, debug, ctx.type, "TGSI shader");
si_llvm_dispose(&ctx);
if (r) {
fprintf(stderr, "LLVM failed to compile shader\n");
return r;
}

/* Validate SGPR and VGPR usage for compute to detect compiler bugs.
 * LLVM 3.9svn has this bug.
 */
if (sel->type == PIPE_SHADER_COMPUTE) {




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


Re: [Mesa-dev] [PATCH 3/3] anv/blorp: Properly handle VK_ATTACHMENT_UNUSED

2017-04-14 Thread Jason Ekstrand
On Wed, Apr 12, 2017 at 2:54 PM, Nanley Chery  wrote:

> On Tue, Apr 11, 2017 at 07:54:23AM -0700, Jason Ekstrand wrote:
> > The Vulkan driver was originally written under the assumption that
> > VK_ATTACHMENT_UNUSED was basically just for depth-stencil attachments.
> > However, the way things fell together, VK_ATTACHMENT_UNUSED can be used
> > anywhere in the subpass description.  The blorp-based clear and resolve
> > code has a bunch of places where we walk lists of attachments and we
> > weren't handling VK_ATTACHMENT_UNUSED everywhere.  This commit should
> > fix all of them.
> >
> > Cc: "13.0 17.0" 
>
> I think specifying the specific stable branch in quotes has been
> deprecated according to
> https://www.mesa3d.org/submittingpatches.html#nominations
>
> > ---
> >  src/intel/vulkan/anv_blorp.c | 30 +-
> >  1 file changed, 25 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index 72a468a..d27132a 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -1148,6 +1148,9 @@ anv_cmd_buffer_flush_attachments(struct
> anv_cmd_buffer *cmd_buffer,
> >
> > for (uint32_t i = 0; i < subpass->color_count; ++i) {
> >uint32_t att = subpass->color_attachments[i].attachment;
> > +  if (att == VK_ATTACHMENT_UNUSED)
> > + continue;
> > +
> >assert(att < pass->attachment_count);
> >if (attachment_needs_flush(cmd_buffer, &pass->attachments[att],
> stage)) {
> >   cmd_buffer->state.pending_pipe_bits |=
> > @@ -1175,14 +1178,19 @@ subpass_needs_clear(const struct anv_cmd_buffer
> *cmd_buffer)
> >
> > for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
> >uint32_t a = cmd_state->subpass->color_attachments[i].attachment;
> > +  if (a == VK_ATTACHMENT_UNUSED)
> > + continue;
> > +
> > +  assert(a < cmd_state->pass->attachment_count);
> >if (cmd_state->attachments[a].pending_clear_aspects) {
> >   return true;
> >}
> > }
> >
> > -   if (ds != VK_ATTACHMENT_UNUSED &&
> > -   cmd_state->attachments[ds].pending_clear_aspects) {
> > -  return true;
> > +   if (ds != VK_ATTACHMENT_UNUSED) {
> > +  assert(ds < cmd_state->pass->attachment_count);
> > +  if (cmd_state->attachments[ds].pending_clear_aspects)
> > + return true;
>
> I'll refer to this hunk below.
>
> > }
> >
> > return false;
> > @@ -1214,6 +1222,10 @@ anv_cmd_buffer_clear_subpass(struct
> anv_cmd_buffer *cmd_buffer)
> > struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> > for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
> >const uint32_t a = cmd_state->subpass->color_
> attachments[i].attachment;
> > +  if (a == VK_ATTACHMENT_UNUSED)
> > + continue;
> > +
> > +  assert(a < cmd_state->pass->attachment_count);
> >struct anv_attachment_state *att_state =
> &cmd_state->attachments[a];
> >
> >if (!att_state->pending_clear_aspects)
> > @@ -1273,6 +1285,7 @@ anv_cmd_buffer_clear_subpass(struct
> anv_cmd_buffer *cmd_buffer)
> > }
> >
> > const uint32_t ds = cmd_state->subpass->depth_
> stencil_attachment.attachment;
> > +   assert(ds == VK_ATTACHMENT_UNUSED || ds <
> cmd_state->pass->attachment_count);
> >
>
> I wonder why this assertion differs from the one two hunks up.
>

Not for a good reason. I prefer the simpler assert used above, but doing
that here would have meant an extra, unneeded level of control-flow
nesting.  The one above is just an "if (...) return true;", so adding
nesting wasn't a big deal.
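
To make the pattern concrete, here is a minimal, self-contained sketch of the
"skip unused, then assert, then index" loop that the patch applies in each
place. The struct and function names are hypothetical and much simpler than
the driver's; only the sentinel value (~0U) is the same as
VK_ATTACHMENT_UNUSED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as VK_ATTACHMENT_UNUSED; redefined so the sketch stands alone. */
#define SKETCH_ATTACHMENT_UNUSED (~0U)

/* Hypothetical, reduced per-attachment state. */
struct sketch_attachment_state {
   uint32_t pending_clear_aspects;
};

static bool
sketch_subpass_needs_clear(const uint32_t *color_attachments,
                           uint32_t color_count, uint32_t ds,
                           const struct sketch_attachment_state *attachments,
                           uint32_t attachment_count)
{
   for (uint32_t i = 0; i < color_count; ++i) {
      uint32_t a = color_attachments[i];
      if (a == SKETCH_ATTACHMENT_UNUSED)
         continue;   /* never index the array with the sentinel */

      assert(a < attachment_count);
      if (attachments[a].pending_clear_aspects)
         return true;
   }

   if (ds != SKETCH_ATTACHMENT_UNUSED) {
      assert(ds < attachment_count);
      if (attachments[ds].pending_clear_aspects)
         return true;
   }

   return false;
}
```

Without the early "continue", an unused color slot would index the array
with 0xffffffff and read far out of bounds.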


> Nevertheless, this series is
> Reviewed-by: Nanley Chery 
>
> > if (ds != VK_ATTACHMENT_UNUSED &&
> > cmd_state->attachments[ds].pending_clear_aspects) {
> > @@ -1578,8 +1591,12 @@ anv_cmd_buffer_resolve_subpass(struct
> anv_cmd_buffer *cmd_buffer)
> > blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
> >
> > for (uint32_t i = 0; i < subpass->color_count; ++i) {
> > -  ccs_resolve_attachment(cmd_buffer, &batch,
> > - subpass->color_attachments[i].attachment);
> > +  const uint32_t att = subpass->color_attachments[i].attachment;
> > +  if (att == VK_ATTACHMENT_UNUSED)
> > + continue;
> > +
> > +  assert(att < cmd_buffer->state.pass->attachment_count);
> > +  ccs_resolve_attachment(cmd_buffer, &batch, att);
> > }
> >
> > anv_cmd_buffer_flush_attachments(cmd_buffer, SUBPASS_STAGE_DRAW);
> > @@ -1592,6 +1609,9 @@ anv_cmd_buffer_resolve_subpass(struct
> anv_cmd_buffer *cmd_buffer)
> >   if (dst_att == VK_ATTACHMENT_UNUSED)
> >  continue;
> >
> > + assert(src_att < cmd_buffer->state.pass->attachment_count);
> > + assert(dst_att < cmd_buffer->state.pass->attachment_count);
> > +
> >   if 

Re: [Mesa-dev] [PATCH v2] swr: update gallium driver docs

2017-04-14 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Apr 14, 2017, at 2:03 PM, Tim Rowley  wrote:
> 
> v2: add back scons section, mention additional built swr libraries
> ---
> src/gallium/docs/source/drivers/openswr.rst   |  2 +-
> src/gallium/docs/source/drivers/openswr/usage.rst | 16 +++-
> 2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/src/gallium/docs/source/drivers/openswr.rst 
> b/src/gallium/docs/source/drivers/openswr.rst
> index 84aa51f..e254d7b 100644
> --- a/src/gallium/docs/source/drivers/openswr.rst
> +++ b/src/gallium/docs/source/drivers/openswr.rst
> @@ -7,7 +7,7 @@ geometry heavy workloads there is a considerable speedup over 
> llvmpipe,
> which is to be expected as the geometry frontend of llvmpipe is single
> threaded.
> 
> -This rasterizer is x86 specific and requires AVX or AVX2.  The driver
> +This rasterizer is x86 specific and requires AVX or above.  The driver
> fits into the gallium framework, and reuses gallivm for doing the TGSI
> to vectorized llvm-IR conversion of the shader kernels.
> 
> diff --git a/src/gallium/docs/source/drivers/openswr/usage.rst 
> b/src/gallium/docs/source/drivers/openswr/usage.rst
> index e55b421..61c30c2 100644
> --- a/src/gallium/docs/source/drivers/openswr/usage.rst
> +++ b/src/gallium/docs/source/drivers/openswr/usage.rst
> @@ -4,8 +4,9 @@ Usage
> Requirements
> 
> 
> -* An x86 processor with AVX or AVX2
> -* LLVM version 3.6 or later
> +* An x86 processor with AVX or above
> +* LLVM version 3.9 or later
> +* C++14 capable compiler
> 
> Building
> 
> @@ -18,13 +19,18 @@ configure time, for example: ::
> Using
> ^
> 
> -On Linux, building will create a drop-in alternative for libGL.so into::
> +On Linux, building with autotools will create a drop-in alternative
> +for libGL.so into::
> 
>   lib/gallium/libGL.so
> +  lib/gallium/libswrAVX.so
> +  lib/gallium/libswrAVX2.so
> 
> -or::
> +Alternatively, building with SCons will produce::
> 
> -  build/foo/gallium/targets/libgl-xlib/libGL.so
> +  build/linux-x86_64/gallium/targets/libgl-xlib/libGL.so
> +  build/linux-x86_64/gallium/drivers/swr/libswrAVX.so
> +  build/linux-x86_64/gallium/drivers/swr/libswrAVX2.so
> 
> To use it set the LD_LIBRARY_PATH environment variable accordingly.
> 
> -- 
> 2.7.4
> 



Re: [Mesa-dev] [PATCH] swr: Enable MSAA in OpenSWR software renderer

2017-04-14 Thread Kyriazis, George

Reviewed-by: George Kyriazis

With the assumption that there are additional changes forthcoming.

On Apr 13, 2017, at 5:40 PM, Bruce Cherniak wrote:

This patch enables multisample antialiasing in the OpenSWR software renderer.

MSAA is a proof-of-concept/work-in-progress with bug fixes and performance
on the way.  We wanted to get the changes out now to allow several customers
to begin experimenting with MSAA in a software renderer.  So as not to
impact current customers, MSAA is turned off by default - previous
functionality and performance remain intact.  It is easily enabled via
environment variables, as described below.

It has only been tested with the glx-lib winsys.  The intention is to
enable other state-trackers, on both Windows and Linux, and to support
FBOs more fully.

There are 2 environment variables that affect behavior:

* SWR_MSAA_FORCE_ENABLE - force MSAA on, for apps that are not designed
 for MSAA... Beware, results will vary.  This is mainly for testing.

* SWR_MSAA_MAX_SAMPLE_COUNT - sets maximum supported number of
 samples (1,2,4,8,16), or 0 to disable MSAA altogether.
 (The default is currently 0.)
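
A minimal sketch of how such a variable could be read and validated follows.
It is an illustration only, not the driver's implementation: the function
names are hypothetical, and treating anything other than 1, 2, 4, 8 or 16 as
"MSAA disabled" is an assumption.

```c
#include <stdlib.h>

/* Parse a sample-count string. Returns 1, 2, 4, 8 or 16 for valid input
 * and 0 (MSAA disabled) for NULL, garbage or unsupported counts. */
static unsigned sketch_parse_max_samples(const char *value)
{
   char *end;
   long n;

   if (value == NULL)
      return 0;

   n = strtol(value, &end, 10);
   if (end == value || *end != '\0')
      return 0;

   switch (n) {
   case 1:
   case 2:
   case 4:
   case 8:
   case 16:
      return (unsigned)n;
   default:
      return 0;
   }
}

/* Read the limit from the environment variable named in the patch. */
static unsigned sketch_max_samples_from_env(void)
{
   return sketch_parse_max_samples(getenv("SWR_MSAA_MAX_SAMPLE_COUNT"));
}
```

With this sketch, running an application with SWR_MSAA_MAX_SAMPLE_COUNT=4
in its environment would report a limit of 4 samples, and leaving the
variable unset reports 0.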


---
src/gallium/drivers/swr/swr_context.cpp |  90 +-
src/gallium/drivers/swr/swr_context.h   |   3 +
src/gallium/drivers/swr/swr_resource.h  |   4 +
src/gallium/drivers/swr/swr_screen.cpp  | 159 +---
src/gallium/drivers/swr/swr_screen.h|   8 ++
src/gallium/drivers/swr/swr_state.cpp   |  74 +--
6 files changed, 313 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_context.cpp 
b/src/gallium/drivers/swr/swr_context.cpp
index 6f46d66..aa5cca8 100644
--- a/src/gallium/drivers/swr/swr_context.cpp
+++ b/src/gallium/drivers/swr/swr_context.cpp
@@ -267,20 +267,104 @@ swr_resource_copy(struct pipe_context *pipe,
}


+/* XXX: This resolve is incomplete and suboptimal. It will be removed once the
+ * pipelined resolve blit works. */
+void
+swr_do_msaa_resolve(struct pipe_resource *src_resource,
+struct pipe_resource *dst_resource)
+{
+   /* This is a pretty dumb inline resolve.  It only supports 8-bit formats
+* (ex RGBA8/BGRA8) - which are most common display formats anyway.
+*/
+
+   /* quick check for 8-bit and number of components */
+   uint8_t bits_per_component =
+  util_format_get_component_bits(src_resource->format,
+UTIL_FORMAT_COLORSPACE_RGB, 0);
+
+   /* Unsupported resolve format */
+   assert(src_resource->format == dst_resource->format);
+   assert(bits_per_component == 8);
+   if ((src_resource->format != dst_resource->format) ||
+   (bits_per_component != 8)) {
+  return;
+   }
+
+   uint8_t src_num_comps = util_format_get_nr_components(src_resource->format);
+
+   SWR_SURFACE_STATE *src_surface = &swr_resource(src_resource)->swr;
+   SWR_SURFACE_STATE *dst_surface = &swr_resource(dst_resource)->swr;
+
+   uint32_t *src, *dst, offset;
+   uint32_t num_samples = src_surface->numSamples;
+   float recip_num_samples = 1.0f / num_samples;
+   for (uint32_t y = 0; y < src_surface->height; y++) {
+  for (uint32_t x = 0; x < src_surface->width; x++) {
+ float r = 0.0f;
+ float g = 0.0f;
+ float b = 0.0f;
+ float a = 0.0f;
+ for (uint32_t sampleNum = 0;  sampleNum < num_samples; sampleNum++) {
+offset = ComputeSurfaceOffset(x, y, 0, 0, sampleNum, 0, src_surface);
+src = (uint32_t *) src_surface->pBaseAddress + offset/src_num_comps;
+const uint32_t sample = *src;
+r += (float)((sample >> 24) & 0xff) / 255.0f * recip_num_samples;
+g += (float)((sample >> 16) & 0xff) / 255.0f * recip_num_samples;
+b += (float)((sample >>  8) & 0xff) / 255.0f * recip_num_samples;
+a += (float)((sample  ) & 0xff) / 255.0f * recip_num_samples;
+ }
+ uint32_t result = 0;
+ result  = ((uint8_t)(r * 255.0f) & 0xff) << 24;
+ result |= ((uint8_t)(g * 255.0f) & 0xff) << 16;
+ result |= ((uint8_t)(b * 255.0f) & 0xff) <<  8;
+ result |= ((uint8_t)(a * 255.0f) & 0xff);
+ offset = ComputeSurfaceOffset(x, y, 0, 0, 0, 0, src_surface);
+ dst = (uint32_t *) dst_surface->pBaseAddress + offset/src_num_comps;
+ *dst = result;
+  }
+   }
+}
+
+
static void
swr_blit(struct pipe_context *pipe, const struct pipe_blit_info *blit_info)
{
   struct swr_context *ctx = swr_context(pipe);
+   /* Make a copy of the const blit_info, so we can modify it */
   struct pipe_blit_info info = *blit_info;

-   if (blit_info->render_condition_enable && !swr_check_render_cond(pipe))
+   if (info.render_condition_enable && !swr_check_render_cond(pipe))
  return;

   if (info.src.resource->nr_samples > 1 && info.dst.resource->nr_samples <= 1
   && 
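
The inline resolve in the patch above is a per-channel box filter over the
samples of each pixel.  As an illustration only, the same averaging for a
single pixel can be sketched with integer arithmetic.  This is not the
patch's code: resolve_pixel_rgba8 is a hypothetical name, and it rounds to
nearest where the patch goes through floats and truncates.

```c
#include <stdint.h>

/* Average num_samples packed 8-bit-per-channel pixels into one pixel.
 * Every byte lane is handled the same way, so the channel order
 * (RGBA vs. BGRA) does not matter.  num_samples must be non-zero. */
uint32_t
resolve_pixel_rgba8(const uint32_t *samples, uint32_t num_samples)
{
   uint32_t sum[4] = {0, 0, 0, 0};

   /* Accumulate each 8-bit channel separately. */
   for (uint32_t s = 0; s < num_samples; s++) {
      for (int c = 0; c < 4; c++)
         sum[c] += (samples[s] >> (8 * c)) & 0xff;
   }

   /* Integer average per channel, rounded to nearest, then repack. */
   uint32_t result = 0;
   for (int c = 0; c < 4; c++) {
      uint32_t avg = (sum[c] + num_samples / 2) / num_samples;
      result |= avg << (8 * c);
   }
   return result;
}
```

With at most 16 samples the per-channel sums stay far below the range of a
32-bit integer, so no intermediate float is needed.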

Re: [Mesa-dev] [PATCH 3/3] winsys/amdgpu: init buffer_indices_hashlist with memset()

2017-04-14 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Fri, Apr 14, 2017 at 6:32 PM, Samuel Pitoiset wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> index f068d8ea7a..8a277d08e1 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> @@ -695,8 +695,6 @@ static void amdgpu_ib_finalize(struct amdgpu_ib *ib)
>  static bool amdgpu_init_cs_context(struct amdgpu_cs_context *cs,
> enum ring_type ring_type)
>  {
> -   int i;
> -
> switch (ring_type) {
> case RING_DMA:
>cs->request.ip_type = AMDGPU_HW_IP_DMA;
> @@ -720,9 +718,7 @@ static bool amdgpu_init_cs_context(struct 
> amdgpu_cs_context *cs,
>break;
> }
>
> -   for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
> -  cs->buffer_indices_hashlist[i] = -1;
> -   }
> +   memset(cs->buffer_indices_hashlist, -1, 
> sizeof(cs->buffer_indices_hashlist));
> cs->last_added_bo = NULL;
>
> cs->request.number_of_ibs = 1;
> @@ -757,9 +753,7 @@ static void amdgpu_cs_context_cleanup(struct 
> amdgpu_cs_context *cs)
> cs->num_sparse_buffers = 0;
> amdgpu_fence_reference(&cs->fence, NULL);
>
> -   for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
> -  cs->buffer_indices_hashlist[i] = -1;
> -   }
> +   memset(cs->buffer_indices_hashlist, -1, 
> sizeof(cs->buffer_indices_hashlist));
> cs->last_added_bo = NULL;
>  }
>
> --
> 2.12.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
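
The quoted patch relies on memset() writing the byte 0xff everywhere when
given -1, which leaves every int in the array equal to -1 on
two's-complement targets.  A minimal sketch of that idiom follows; the
function names are hypothetical and the list is passed as pointer plus
count, unlike the fixed-size struct member in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mark every slot as unused (-1).  memset() stores (unsigned char)-1,
 * i.e. 0xff, in each byte; an int made only of 0xff bytes is -1 on
 * two's-complement machines.  The trick only works for 0 and -1. */
int *
hashlist_reset(int *list, size_t count)
{
   memset(list, -1, count * sizeof(list[0]));
   return list;
}

/* Check that the loop-based and memset-based initialization agree. */
bool
hashlist_all_unset(const int *list, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (list[i] != -1)
         return false;
   }
   return true;
}
```

Note that sizeof(cs->buffer_indices_hashlist) in the patch yields the full
array size only because the member is a real array; applied to a pointer
parameter, as in this sketch, sizeof would give the pointer size, hence the
explicit count.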


Re: [Mesa-dev] [PATCH 5/5] radeonsi: enable ARB_shader_viewport_layer_array

2017-04-14 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Apr 13, 2017 at 10:30 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  docs/features.txt  | 2 +-
>  docs/relnotes/17.1.0.html  | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index a2d7785..7ca5fd3 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -290,21 +290,21 @@ Khronos, ARB, and OES extensions that are not part of 
> any OpenGL or OpenGL ES ve
>GL_ARB_post_depth_coverageDONE (i965)
>GL_ARB_robustness_isolation   not started
>GL_ARB_sample_locations   not started
>GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
> radeonsi, r600, softpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
> nvc0, radeonsi, softpipe)
>GL_ARB_shader_ballot  DONE (nvc0, radeonsi)
>GL_ARB_shader_clock   DONE (i965/gen7+, 
> nv50, nvc0, radeonsi)
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_shader_group_vote  DONE (nvc0, radeonsi)
>GL_ARB_shader_stencil_export  DONE (i965/gen9+, 
> radeonsi, softpipe, llvmpipe, swr)
> -  GL_ARB_shader_viewport_layer_arrayDONE (i965/gen6+)
> +  GL_ARB_shader_viewport_layer_arrayDONE (i965/gen6+, 
> radeonsi)
>GL_ARB_sparse_buffer  DONE (radeonsi/CIK+)
>GL_ARB_sparse_texture not started
>GL_ARB_sparse_texture2not started
>GL_ARB_sparse_texture_clamp   not started
>GL_ARB_texture_filter_minmax  not started
>GL_ARB_transform_feedback_overflow_query  DONE (i965/gen6+)
>GL_KHR_blend_equation_advanced_coherent   DONE (i965/gen9+)
>GL_KHR_no_error   not started
>GL_KHR_texture_compression_astc_hdr   DONE (core only)
>GL_KHR_texture_compression_astc_sliced_3d not started
> diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
> index 8f237ed..82086d5 100644
> --- a/docs/relnotes/17.1.0.html
> +++ b/docs/relnotes/17.1.0.html
> @@ -41,20 +41,21 @@ TBD.
>
>  
>  Note: some of the new features are only available with certain drivers.
>  
>
>  
>  GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, 
> llvmpipe
>  GL_ARB_shader_ballot on nvc0, radeonsi
>  GL_ARB_shader_clock on nv50, nvc0, radeonsi
>  GL_ARB_shader_group_vote on radeonsi
> +GL_ARB_shader_viewport_layer_array on radeonsi
>  GL_ARB_sparse_buffer on radeonsi/CIK+
>  GL_ARB_transform_feedback2 on i965/gen6
>  GL_ARB_transform_feedback_overflow_query on i965/gen6+
>  GL_NV_fill_rectangle on nvc0
>  Geometry shaders enabled on swr
>  
>
>  Bug fixes
>
>  
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 2955249..f0e24c2 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -414,20 +414,21 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_STRING_MARKER:
> case PIPE_CAP_CLEAR_TEXTURE:
> case PIPE_CAP_CULL_DISTANCE:
> case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
> case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
> case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
> case PIPE_CAP_DOUBLES:
> case PIPE_CAP_TGSI_TEX_TXF_LZ:
> +   case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
> return 1;
>
> case PIPE_CAP_INT64:
> case PIPE_CAP_INT64_DIVMOD:
> case PIPE_CAP_TGSI_CLOCK:
> return HAVE_LLVM >= 0x0309;
>
> case PIPE_CAP_TGSI_VOTE:
> return HAVE_LLVM >= 0x0400;
>
> @@ -499,21 +500,20 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> case PIPE_CAP_NATIVE_FENCE_FD:
> case PIPE_CAP_TGSI_FS_FBFETCH:
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> case PIPE_CAP_UMA:
> case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
> -   case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
> return 0;
>
> case PIPE_CAP_QUERY_BUFFER_OBJECT:
>  

Re: [Mesa-dev] [PATCH] anv/cmd_buffer: Disable CCS on BDW input attachments

2017-04-14 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Apr 14, 2017 at 12:18 PM, Nanley Chery wrote:

> The description under RENDER_SURFACE_STATE::RedClearColor says,
>
>For Sampling Engine Multisampled Surfaces and Render Targets:
> Specifies the clear value for the red channel.
>For Other Surfaces:
> This field is ignored.
>
> This means that the sampler on BDW doesn't support CCS.
>
> Cc: Samuel Iglesias Gonsálvez 
> Cc: Jordan Justen 
> Cc: Jason Ekstrand 
> Cc: 
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_blorp.c   | 11 ---
>  src/intel/vulkan/genX_cmd_buffer.c | 32 +---
>  2 files changed, 13 insertions(+), 30 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 4904ee3a5f..8a3c4deed3 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1381,7 +1381,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer
> *cmd_buffer,
>  * still hot in the cache.
>  */
> bool found_draw = false;
> -   bool self_dep = false;
> enum anv_subpass_usage usage = 0;
> for (uint32_t s = subpass_idx + 1; s < pass->subpass_count; s++) {
>usage |= pass->attachments[att].subpass_usage[s];
> @@ -1391,8 +1390,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer
> *cmd_buffer,
>* wait to resolve until then.
>*/
>   found_draw = true;
> - if (pass->attachments[att].subpass_usage[s] &
> ANV_SUBPASS_USAGE_INPUT)
> -self_dep = true;
>   break;
>}
> }
> @@ -1451,14 +1448,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer
> *cmd_buffer,
>*binding this surface to Sampler."
>*/
>   resolve_op = BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
> -  } else if (cmd_buffer->device->info.gen == 8 && self_dep &&
> - att_state->input_aux_usage == ISL_AUX_USAGE_CCS_D) {
> - /* On Broadwell we still need to do resolves when there is a
> -  * self-dependency because HW could not see fast-clears and works
> -  * on the render cache as if there was regular non-fast-clear
> surface.
> -  * To avoid any inconsistency, we force the resolve.
> -  */
> - resolve_op = BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
>}
> }
>
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> index b78b13d88e..2e0108d3f5 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -291,27 +291,21 @@ color_attachment_compute_aux_usage(struct
> anv_device *device,
>att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
> } else if (att_state->fast_clear) {
>att_state->aux_usage = ISL_AUX_USAGE_CCS_D;
> -  if (GEN_GEN >= 9 &&
> -  !isl_format_supports_ccs_e(&device->info, iview->isl.format)) {
> - /* From the Sky Lake PRM, RENDER_SURFACE_STATE::
> AuxiliarySurfaceMode:
> -  *
> -  *"If Number of Multisamples is MULTISAMPLECOUNT_1, AUX_CCS_D
> -  *setting is only allowed if Surface Format supported for
> Fast
> -  *Clear. In addition, if the surface is bound to the sampling
> -  *engine, Surface Format must be supported for Render Target
> -  *Compression for surfaces bound to the sampling engine."
> -  *
> -  * In other words, we can't sample from a fast-cleared image if
> it
> -  * doesn't also support color compression.
> -  */
> - att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
> -  } else if (GEN_GEN >= 8) {
> - /* Broadwell/Skylake can sample from fast-cleared images */
> +  /* From the Sky Lake PRM, RENDER_SURFACE_STATE::
> AuxiliarySurfaceMode:
> +   *
> +   *"If Number of Multisamples is MULTISAMPLECOUNT_1, AUX_CCS_D
> +   *setting is only allowed if Surface Format supported for Fast
> +   *Clear. In addition, if the surface is bound to the sampling
> +   *engine, Surface Format must be supported for Render Target
> +   *Compression for surfaces bound to the sampling engine."
> +   *
> +   * In other words, we can only sample from a fast-cleared image if
> it
> +   * also supports color compression.
> +   */
> +  if (isl_format_supports_ccs_e(&device->info, iview->isl.format))
>   att_state->input_aux_usage = ISL_AUX_USAGE_CCS_D;
> -  } else {
> - /* Ivy Bridge and Haswell cannot */
> +  else
>   att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
> -  }
> } else {
>att_state->aux_usage = ISL_AUX_USAGE_NONE;
>att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
> --
> 2.12.2
>
>

Re: [Mesa-dev] [PATCH 2/2] radeonsi: cope with missing disassembly

2017-04-14 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Apr 13, 2017 at 8:23 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> For robustness and testing purposes.
> ---
>  src/gallium/drivers/radeonsi/si_state_shaders.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
> b/src/gallium/drivers/radeonsi/si_state_shaders.c
> index 78c7495..c52ffd9 100644
> --- a/src/gallium/drivers/radeonsi/si_state_shaders.c
> +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
> @@ -106,21 +106,22 @@ static uint32_t *read_chunk(uint32_t *ptr, void **data, 
> unsigned *size)
>
>  /**
>   * Return the shader binary in a buffer. The first 4 bytes contain its size
>   * as integer.
>   */
>  static void *si_get_shader_binary(struct si_shader *shader)
>  {
> /* There is always a size of data followed by the data itself. */
> unsigned relocs_size = shader->binary.reloc_count *
>sizeof(shader->binary.relocs[0]);
> -   unsigned disasm_size = strlen(shader->binary.disasm_string) + 1;
> +   unsigned disasm_size = shader->binary.disasm_string ?
> +  strlen(shader->binary.disasm_string) + 1 : 0;
> unsigned llvm_ir_size = shader->binary.llvm_ir_string ?
> strlen(shader->binary.llvm_ir_string) + 1 : 0;
> unsigned size =
> 4 + /* total size */
> 4 + /* CRC32 of the data below */
> align(sizeof(shader->config), 4) +
> align(sizeof(shader->info), 4) +
> 4 + align(shader->binary.code_size, 4) +
> 4 + align(shader->binary.rodata_size, 4) +
> 4 + align(relocs_size, 4) +
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
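
The quoted change makes the disassembly string optional in the same way
the LLVM IR string already is: a missing string contributes zero bytes
instead of being passed to strlen().  A small sketch of that size
computation, using hypothetical helper names and assuming the "4-byte size
followed by 4-byte-aligned data" chunk layout described in the quoted
comment:

```c
#include <stddef.h>
#include <string.h>

/* Size of an optional string including its terminating NUL;
 * a NULL string contributes zero bytes instead of crashing strlen(). */
size_t
optional_string_size(const char *s)
{
   return s ? strlen(s) + 1 : 0;
}

/* Round size up to the next multiple of 4. */
size_t
align4(size_t size)
{
   return (size + 3) & ~(size_t)3;
}

/* Bytes taken by one chunk: a 4-byte length prefix plus padded data. */
size_t
chunk_size(const char *s)
{
   return 4 + align4(optional_string_size(s));
}
```

An absent string still costs the 4-byte length prefix, so a reader of the
binary sees a zero-length chunk rather than a missing one.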


[Mesa-dev] [PATCH] anv/cmd_buffer: Disable CCS on BDW input attachments

2017-04-14 Thread Nanley Chery
The description under RENDER_SURFACE_STATE::RedClearColor says,

   For Sampling Engine Multisampled Surfaces and Render Targets:
Specifies the clear value for the red channel.
   For Other Surfaces:
This field is ignored.

This means that the sampler on BDW doesn't support CCS.

Cc: Samuel Iglesias Gonsálvez 
Cc: Jordan Justen 
Cc: Jason Ekstrand 
Cc: 
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_blorp.c   | 11 ---
 src/intel/vulkan/genX_cmd_buffer.c | 32 +---
 2 files changed, 13 insertions(+), 30 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 4904ee3a5f..8a3c4deed3 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1381,7 +1381,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
 * still hot in the cache.
 */
bool found_draw = false;
-   bool self_dep = false;
enum anv_subpass_usage usage = 0;
for (uint32_t s = subpass_idx + 1; s < pass->subpass_count; s++) {
   usage |= pass->attachments[att].subpass_usage[s];
@@ -1391,8 +1390,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
   * wait to resolve until then.
   */
  found_draw = true;
- if (pass->attachments[att].subpass_usage[s] & ANV_SUBPASS_USAGE_INPUT)
-self_dep = true;
  break;
   }
}
@@ -1451,14 +1448,6 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
   *binding this surface to Sampler."
   */
  resolve_op = BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
-  } else if (cmd_buffer->device->info.gen == 8 && self_dep &&
- att_state->input_aux_usage == ISL_AUX_USAGE_CCS_D) {
- /* On Broadwell we still need to do resolves when there is a
-  * self-dependency because HW could not see fast-clears and works
-  * on the render cache as if there was regular non-fast-clear surface.
-  * To avoid any inconsistency, we force the resolve.
-  */
- resolve_op = BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
   }
}
 
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index b78b13d88e..2e0108d3f5 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -291,27 +291,21 @@ color_attachment_compute_aux_usage(struct anv_device 
*device,
   att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
} else if (att_state->fast_clear) {
   att_state->aux_usage = ISL_AUX_USAGE_CCS_D;
-  if (GEN_GEN >= 9 &&
-  !isl_format_supports_ccs_e(&device->info, iview->isl.format)) {
- /* From the Sky Lake PRM, RENDER_SURFACE_STATE::AuxiliarySurfaceMode:
-  *
-  *"If Number of Multisamples is MULTISAMPLECOUNT_1, AUX_CCS_D
-  *setting is only allowed if Surface Format supported for Fast
-  *Clear. In addition, if the surface is bound to the sampling
-  *engine, Surface Format must be supported for Render Target
-  *Compression for surfaces bound to the sampling engine."
-  *
-  * In other words, we can't sample from a fast-cleared image if it
-  * doesn't also support color compression.
-  */
- att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
-  } else if (GEN_GEN >= 8) {
- /* Broadwell/Skylake can sample from fast-cleared images */
+  /* From the Sky Lake PRM, RENDER_SURFACE_STATE::AuxiliarySurfaceMode:
+   *
+   *"If Number of Multisamples is MULTISAMPLECOUNT_1, AUX_CCS_D
+   *setting is only allowed if Surface Format supported for Fast
+   *Clear. In addition, if the surface is bound to the sampling
+   *engine, Surface Format must be supported for Render Target
+   *Compression for surfaces bound to the sampling engine."
+   *
+   * In other words, we can only sample from a fast-cleared image if it
+   * also supports color compression.
+   */
+  if (isl_format_supports_ccs_e(&device->info, iview->isl.format))
  att_state->input_aux_usage = ISL_AUX_USAGE_CCS_D;
-  } else {
- /* Ivy Bridge and Haswell cannot */
+  else
  att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
-  }
} else {
   att_state->aux_usage = ISL_AUX_USAGE_NONE;
   att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
-- 
2.12.2



Re: [Mesa-dev] [PATCH 1/3] radv: add private push descriptors for meta

2017-04-14 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

for the series.

On Fri, Apr 14, 2017 at 12:26 AM, Fredrik Höglund  wrote:
> This allows meta to use push descriptors without disturbing user
> push descriptors.
>
> radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR
> in that partial updates are not supported; all descriptors used in
> subsequent draw commands must be pushed at the same time.
>
> Signed-off-by: Fredrik Höglund 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 33 +
>  src/amd/vulkan/radv_private.h|  8 
>  2 files changed, 41 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index f03e3dff34..31d04e535d 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1981,6 +1981,39 @@ static bool radv_init_push_descriptor_set(struct 
> radv_cmd_buffer *cmd_buffer,
> return true;
>  }
>
> +void radv_meta_push_descriptor_set(
> +   struct radv_cmd_buffer*  cmd_buffer,
> +   VkPipelineBindPoint  pipelineBindPoint,
> +   VkPipelineLayout _layout,
> +   uint32_t set,
> +   uint32_t descriptorWriteCount,
> +   const VkWriteDescriptorSet*  pDescriptorWrites)
> +{
> +   RADV_FROM_HANDLE(radv_pipeline_layout, layout, _layout);
> +   struct radv_descriptor_set *push_set = 
> &cmd_buffer->meta_push_descriptors;
> +   unsigned bo_offset;
> +
> +   assert(layout->set[set].layout->flags & 
> VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR);
> +
> +   push_set->size = layout->set[set].layout->size;
> +   push_set->layout = layout->set[set].layout;
> +
> +   if (!radv_cmd_buffer_upload_alloc(cmd_buffer, push_set->size, 32,
> + &bo_offset,
> + (void**) &push_set->mapped_ptr))
> +   return;
> +
> +   push_set->va = 
> cmd_buffer->device->ws->buffer_get_va(cmd_buffer->upload.upload_bo);
> +   push_set->va += bo_offset;
> +
> +   radv_update_descriptor_sets(cmd_buffer->device, cmd_buffer,
> +   radv_descriptor_set_to_handle(push_set),
> +   descriptorWriteCount, pDescriptorWrites, 
> 0, NULL);
> +
> +   cmd_buffer->state.descriptors[set] = push_set;
> +   cmd_buffer->state.descriptors_dirty |= (1 << set);
> +}
> +
>  void radv_CmdPushDescriptorSetKHR(
> VkCommandBuffer commandBuffer,
> VkPipelineBindPoint pipelineBindPoint,
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index 00190e7eee..a64336856f 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -787,6 +787,7 @@ struct radv_cmd_buffer {
> uint32_t dynamic_buffers[4 * MAX_DYNAMIC_BUFFERS];
> VkShaderStageFlags push_constant_stages;
> struct radv_push_descriptor_set push_descriptors;
> +   struct radv_descriptor_set meta_push_descriptors;
>
> struct radv_cmd_buffer_upload upload;
>
> @@ -1410,6 +1411,13 @@ radv_update_descriptor_set_with_template(struct 
> radv_device *device,
>   VkDescriptorUpdateTemplateKHR 
> descriptorUpdateTemplate,
>   const void *pData);
>
> +void radv_meta_push_descriptor_set(struct radv_cmd_buffer *cmd_buffer,
> +   VkPipelineBindPoint pipelineBindPoint,
> +   VkPipelineLayout _layout,
> +   uint32_t set,
> +   uint32_t descriptorWriteCount,
> +   const VkWriteDescriptorSet 
> *pDescriptorWrites);
> +
>  void radv_initialise_cmask(struct radv_cmd_buffer *cmd_buffer,
>struct radv_image *image, uint32_t value);
>  void radv_initialize_dcc(struct radv_cmd_buffer *cmd_buffer,
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] swr: update gallium driver docs

2017-04-14 Thread Tim Rowley
v2: add back scons section, mention additional built swr libraries
---
 src/gallium/docs/source/drivers/openswr.rst   |  2 +-
 src/gallium/docs/source/drivers/openswr/usage.rst | 16 +++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/gallium/docs/source/drivers/openswr.rst 
b/src/gallium/docs/source/drivers/openswr.rst
index 84aa51f..e254d7b 100644
--- a/src/gallium/docs/source/drivers/openswr.rst
+++ b/src/gallium/docs/source/drivers/openswr.rst
@@ -7,7 +7,7 @@ geometry heavy workloads there is a considerable speedup over 
llvmpipe,
 which is to be expected as the geometry frontend of llvmpipe is single
 threaded.
 
-This rasterizer is x86 specific and requires AVX or AVX2.  The driver
+This rasterizer is x86 specific and requires AVX or above.  The driver
 fits into the gallium framework, and reuses gallivm for doing the TGSI
 to vectorized llvm-IR conversion of the shader kernels.
 
diff --git a/src/gallium/docs/source/drivers/openswr/usage.rst 
b/src/gallium/docs/source/drivers/openswr/usage.rst
index e55b421..61c30c2 100644
--- a/src/gallium/docs/source/drivers/openswr/usage.rst
+++ b/src/gallium/docs/source/drivers/openswr/usage.rst
@@ -4,8 +4,9 @@ Usage
 Requirements
 
 
-* An x86 processor with AVX or AVX2
-* LLVM version 3.6 or later
+* An x86 processor with AVX or above
+* LLVM version 3.9 or later
+* C++14 capable compiler
 
 Building
 
@@ -18,13 +19,18 @@ configure time, for example: ::
 Using
 ^
 
-On Linux, building will create a drop-in alternative for libGL.so into::
+On Linux, building with autotools will create a drop-in alternative
+for libGL.so into::
 
   lib/gallium/libGL.so
+  lib/gallium/libswrAVX.so
+  lib/gallium/libswrAVX2.so
 
-or::
+Alternatively, building with SCons will produce::
 
-  build/foo/gallium/targets/libgl-xlib/libGL.so
+  build/linux-x86_64/gallium/targets/libgl-xlib/libGL.so
+  build/linux-x86_64/gallium/drivers/swr/libswrAVX.so
+  build/linux-x86_64/gallium/drivers/swr/libswrAVX2.so
 
 To use it set the LD_LIBRARY_PATH environment variable accordingly.
 
-- 
2.7.4



Re: [Mesa-dev] [PATCH] swr: Add polygon stipple support

2017-04-14 Thread Ilia Mirkin
On Fri, Apr 14, 2017 at 2:52 PM, Kyriazis, George wrote:
> > > +   /* work around the fact that poly stipple also affects lines */
> > > +   /* and points, since we rasterize them as triangles, too */
> > > +   /* Has to be before fragment shader, since it sets SWR_NEW_FS */
> > > +   if (p_draw_info) {
> > > +  bool new_prim_is_poly = (u_reduced_prim(p_draw_info->mode) ==
> > > PIPE_PRIM_TRIANGLES);
> >
> > What about glPolygonMode and what about geometry shaders that take in
> > e.g. points and put out triangles? Perhaps you need to pass in a "is
> > this *really* a triangle" parameter to the shader generated by the
> > rasterizer.
> >
> >
> > Actually the GS thing won't happen since polygon stippling is a
> > compat-only feature and we don't support GS in compat profiles. You do
> > need to check that the polymode == FILL here though.
>
> Well, currently we don’t have a working polygon mode.  Once we implement it,
> then we’ll look at stipple at that time.

Ah, indeed you don't. I thought you at least handled it when front ==
back, but I was mistaken.

Cheers,

  -ilia


Re: [Mesa-dev] [PATCH] swr: Add polygon stipple support

2017-04-14 Thread Kyriazis, George

On Apr 14, 2017, at 11:35 AM, Ilia Mirkin wrote:

On Fri, Apr 14, 2017 at 11:18 AM, Ilia Mirkin wrote:
On Thu, Apr 13, 2017 at 4:30 PM, George Kyriazis wrote:
Add polygon stipple functionality to the fragment shader.

Explicitly turn off polygon stipple for lines and points, since we
do them using tris.
---
src/gallium/drivers/swr/swr_context.h  |  4 ++-
src/gallium/drivers/swr/swr_shader.cpp | 56 ++
src/gallium/drivers/swr/swr_shader.h   |  1 +
src/gallium/drivers/swr/swr_state.cpp  | 27 ++--
src/gallium/drivers/swr/swr_state.h|  5 +++
5 files changed, 84 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_context.h 
b/src/gallium/drivers/swr/swr_context.h
index be65a20..9d80c70 100644
--- a/src/gallium/drivers/swr/swr_context.h
+++ b/src/gallium/drivers/swr/swr_context.h
@@ -98,6 +98,8 @@ struct swr_draw_context {

   float userClipPlanes[PIPE_MAX_CLIP_PLANES][4];

+   uint32_t polyStipple[32];
+
   SWR_SURFACE_STATE renderTargets[SWR_NUM_ATTACHMENTS];
   void *pStats;
};
@@ -127,7 +129,7 @@ struct swr_context {
   struct pipe_constant_buffer
  constants[PIPE_SHADER_TYPES][PIPE_MAX_CONSTANT_BUFFERS];
   struct pipe_framebuffer_state framebuffer;
-   struct pipe_poly_stipple poly_stipple;
+   struct swr_poly_stipple poly_stipple;
   struct pipe_scissor_state scissor;
   SWR_RECT swr_scissor;
   struct pipe_sampler_view *
diff --git a/src/gallium/drivers/swr/swr_shader.cpp 
b/src/gallium/drivers/swr/swr_shader.cpp
index 6fc0596..d8f5512 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -165,6 +165,9 @@ swr_generate_fs_key(struct swr_jit_fs_key &key,
  sizeof(key.vs_output_semantic_idx));

   swr_generate_sampler_key(swr_fs->info, ctx, PIPE_SHADER_FRAGMENT, key);
+
+   key.poly_stipple_enable = ctx->rasterizer->poly_stipple_enable &&
+  ctx->poly_stipple.prim_is_poly;
}

void
@@ -1099,17 +1102,58 @@ BuilderSWR::CompileFS(struct swr_context *ctx, swr_jit_fs_key &key)
   memset(&system_values, 0, sizeof(system_values));

   struct lp_build_mask_context mask;
+   bool uses_mask = false;

-   if (swr_fs->info.base.uses_kill) {
-  Value *mask_val = LOAD(pPS, {0, SWR_PS_CONTEXT_activeMask}, 
"activeMask");
+   if (swr_fs->info.base.uses_kill ||
+   key.poly_stipple_enable) {
+  Value *vActiveMask = NULL;
+  if (swr_fs->info.base.uses_kill) {
+ vActiveMask = LOAD(pPS, {0, SWR_PS_CONTEXT_activeMask}, "activeMask");
+  }
+  if (key.poly_stipple_enable) {
+ // first get fragment xy coords and clip to stipple bounds
+ Value *vXf = LOAD(pPS, {0, SWR_PS_CONTEXT_vX, PixelPositions_UL});
+ Value *vYf = LOAD(pPS, {0, SWR_PS_CONTEXT_vY, PixelPositions_UL});
+ Value *vXu = FP_TO_UI(vXf, mSimdInt32Ty);
+ Value *vYu = FP_TO_UI(vYf, mSimdInt32Ty);
+
+ // stipple pattern is 32x32, which means that one line of stipple
+ // is stored in one word:
+ // vXstipple is bit offset inside 32-bit stipple word
+ // vYstipple is word index in the stipple array
+ Value *vXstipple = AND(vXu, VIMMED1(0x1f)); // & (32-1)
+ Value *vYstipple = AND(vYu, VIMMED1(0x1f)); // & (32-1)
+
+ // grab stipple pattern base address
+ Value *stipplePtr = GEP(hPrivateData, {0, 
swr_draw_context_polyStipple, 0});
+ stipplePtr = BITCAST(stipplePtr, mInt8PtrTy);
+
+ // perform a gather to grab stipple words for each lane
+ Value *vStipple = GATHERDD(VUNDEF_I(), stipplePtr, vYstipple,
+VIMMED1(0x), C((char)4));
+
+ // create a mask with one bit corresponding to the x stipple
+ // and AND it with the pattern, to see if we have a bit
+ Value *vBitMask = LSHR(VIMMED1(0x8000), vXstipple);
+ Value *vStippleMask = AND(vStipple, vBitMask);
+ vStippleMask = ICMP_NE(vStippleMask, VIMMED1(0));
+ vStippleMask = VMASK(vStippleMask);
+
+ if (swr_fs->info.base.uses_kill) {
+vActiveMask = AND(vActiveMask, vStippleMask);
+ } else {
+vActiveMask = vStippleMask;
+ }
+  }
  lp_build_mask_begin(
- &mask, gallivm, lp_type_float_vec(32, 32 * 8), wrap(mask_val));
+ &mask, gallivm, lp_type_float_vec(32, 32 * 8), wrap(vActiveMask));
+  uses_mask = true;
   }

   lp_build_tgsi_soa(gallivm,
 swr_fs->pipe.tokens,
 lp_type_float_vec(32, 32 * 8),
- swr_fs->info.base.uses_kill ? &mask : NULL, // mask
+ uses_mask ? &mask : NULL, // mask
 wrap(consts_ptr),
 wrap(const_sizes_ptr),
 &system_values,
@@ -1172,13 +1216,13 @@ BuilderSWR::CompileFS(struct swr_context *ctx, 
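
For readers following the stipple logic in the hunk above, here is a
scalar sketch of the per-fragment test that the SIMD code performs per
lane: the fragment coordinates are wrapped to the 32x32 pattern, the row
word is fetched, and a single bit is tested.  The function name is
hypothetical, and the bit order (leftmost pixel in the most significant
bit) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true when the fragment at window position (x, y) passes a
 * 32x32 polygon stipple pattern stored as one 32-bit word per row.
 * Assumption: the leftmost pixel of a row is the most significant bit. */
bool
stipple_test(const uint32_t pattern[32], uint32_t x, uint32_t y)
{
   uint32_t bit  = x & 0x1f;   /* bit offset inside the row word */
   uint32_t row  = y & 0x1f;   /* word index into the pattern */
   uint32_t mask = UINT32_C(0x80000000) >> bit;

   return (pattern[row] & mask) != 0;
}
```

In the shader the result of this test is ANDed with the kill mask when the
fragment shader also uses kill, and used on its own otherwise.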

[Mesa-dev] [PATCH] mesa: print target string in glBindTexture() error message

2017-04-14 Thread Brian Paul
---
 src/mesa/main/texobj.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index ad644ca..00feb97 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -1663,7 +1663,8 @@ _mesa_BindTexture( GLenum target, GLuint texName )
 
targetIndex = _mesa_tex_target_to_index(ctx, target);
if (targetIndex < 0) {
-  _mesa_error(ctx, GL_INVALID_ENUM, "glBindTexture(target)");
+  _mesa_error(ctx, GL_INVALID_ENUM, "glBindTexture(target = %s)",
+  _mesa_enum_to_string(target));
   return;
}
assert(targetIndex < NUM_TEXTURE_TARGETS);
-- 
1.9.1



[Mesa-dev] Mesa 17.0.4 release candidate

2017-04-14 Thread Emil Velikov
Hello list,

The candidate for Mesa 17.0.4 is now available. Currently we have:
 - 28 queued
 - 1 nominated (outstanding)
 - and 0 rejected patch(es)


The current queue includes extra PCI IDs and a runtime warning fix for radeonsi,
while r600 has improved error handling in OOM conditions.

There is a GBM flush fix for VMWGFX and other drivers that queue DMA operations
on the mapping context. A performance regression in freedreno has been resolved.

For nouveau and i965 we have various fixes; among them, the correct GL version
is now reported on i965 devices.

Haiku build issues have been addressed.

Last but not least, Mesa no longer prints a harmless warning on
platform devices.



Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval
--------------------------------

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to release 17.0.4 this Sunday (16th of April), around or shortly
after 19:00 GMT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.


Trivial merge conflicts
-----------------------

commit 5094311078e23a3a9f62b143f2451d3b91691134
Author: Craig Stout 

anv/cmd_buffer: fix host memory leak

(cherry picked from commit 1da7a11de8113932871487efaeb2674a3d1c644a)


commit 04df217ac07847e7f020a180ac2951ed17209645
Author: Jason Ekstrand 

i965/blorp: Align vertex buffers to 64B

(cherry picked from commit f938354362655a378d474c5f79c52cea9852ab91)


commit e7f872f7b8a897e188cf7b0462867c8f0b5d9397
Author: Kenneth Graunke 

i965: Set screen->cmd_parser_version to 0 if we can't write registers.

(cherry picked from commit 31693a13f8fbc52d4f19f1e8800a4edabeecbe19)


commit a8e217d057a25584949f57093684fe9b4978dbf0
Author: Kenneth Graunke 

i965: Set kernel features before computing max GL version.

(cherry picked from commit 02ccd8f52cffcc25e5fefdd0f900cf04230395f4)


commit 1b2bcb6826ff8855e96117c9523821336a3be88a
Author: Julien Isorce 

winsys/radeon: check null return from radeon_cs_create_fence in cs_flush

(cherry picked from commit d08c0930af8aaef5bdf80df618bb906e0b349830)


Cheers,
Emil


Mesa stable queue
-----------------

Nominated (1)
=============

Boyan Ding (1):
  d941ef3 nvc0/ir: Properly handle a "split form" of predicate destination


Queued (28)
===========

Alex Deucher (1):
  radeonsi: add new polaris10 pci id

Alex Smith (1):
  radv: Invalidate L2 for TRANSFER_WRITE barriers

Craig Stout (1):
  anv/cmd_buffer: fix host memory leak

Emil Velikov (2):
  Revert "cherry-ignore: add the Flush after unmap in gbm/dri fix"
  Revert "freedreno: fix memory leak"

Fabio Estevam (1):
  loader: Move non-error message to debug level

Ilia Mirkin (4):
  nvc0/ir: fix LSB/BFE/BFI implementations
  nvc0/ir: fix overwriting of offset register with interpolateAtOffset
  nvc0: increase texture buffer object alignment to 256 for pre-GM107
  nouveau: when mapping a persistent buffer, synchronize on former xfers

Jason Ekstrand (5):
  i965/fs: Always provide a default LOD of 0 for TXS and TXL
  anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
  anv/blorp: Align vertex buffers to 64B
  i965/blorp: Align vertex buffers to 64B
  i965/blorp: Bump the batch space estimate

Jerome Duval (2):
  haiku: build fixes around debug defines
  haiku/winsys: fix dt prototype args

Julien Isorce (4):
  winsys/radeon: check null in radeon_cs_create_fence
  winsys/radeon: check null return from radeon_cs_create_fence in cs_flush
  radeon: initialize hole variable before calling container_of
  radeon_drm_bo: explicitly check return value of drmCommandWriteRead

Kenneth Graunke (4):
  i965: Document the sad story of the kernel command parser.
  i965: Set screen->cmd_parser_version to 0 if we can't write registers.
  i965: Skip register write detection when possible.
  i965: Set kernel features before computing max GL version.

Ken, the former three seem like an implicit requirement for the GL version fix.


Marek Olšák (1):
  targets: export radeon winsys_create functions to silence LLVM warning

Michal Srb (1):
  st: Add cubeMapFace parameter to st_finalize_texture.

Thomas Hellstrom (1):
  gbm/dri: Flush after unmap
Squashed with
  gbm/dri: Check dri extension version before flush after unmap



Rejected (0)
============



Re: [Mesa-dev] [PATCH V3 2/2] glsl: don't run the GLSL pre-processor when we are skipping compilation

2017-04-14 Thread Eric Anholt
Timothy Arceri  writes:

> Improves Deus Ex start-up times with a warm cache from ~30 seconds to
> ~22 seconds.
>
> Also fixes the leaking of state.

The commit message could use some more context:

"This moves the hashing of shader source for the cache lookup to before
the preprocessor.  In our experience, shaders are unlikely to hash the
same after preprocessing if they didn't hash the same before, so we can
skip preprocessing for cache hits."

With something like that,

Reviewed-by: Eric Anholt 
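
A minimal sketch of the ordering described above, not Mesa's actual code: the cache key is computed from the raw source text, and the preprocessor only runs on a miss. The names hash_source, preprocess and compile_shader are hypothetical stand-ins, FNV-1a replaces the real cache key purely to keep the example self-contained, and the cache is a single slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the real cache key; FNV-1a keeps the
 * sketch self-contained. */
static uint64_t
hash_source(const char *src)
{
   uint64_t h = 14695981039346656037ULL;
   for (; *src; src++) {
      h ^= (unsigned char)*src;
      h *= 1099511628211ULL;
   }
   return h;
}

/* Toy one-entry cache, purely illustrative. */
static uint64_t cached_key;
static bool cache_valid;
static unsigned preprocess_calls;

static void
preprocess(const char *src)
{
   (void)src;
   preprocess_calls++;   /* stands in for the expensive preprocessor pass */
}

/* Returns true on a cache hit, in which case preprocessing was skipped. */
static bool
compile_shader(const char *src)
{
   uint64_t key = hash_source(src);   /* hash the raw, unpreprocessed text */

   if (cache_valid && cached_key == key)
      return true;                    /* hit: skip the preprocessor */

   preprocess(src);                   /* miss: do the full compile */
   cached_key = key;
   cache_valid = true;
   return false;
}
```

Compiling the same source twice runs the stand-in preprocessor only once; any change to the raw text, even whitespace, produces a different key and therefore a miss.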




Re: [Mesa-dev] [PATCH v2] gbm: add support for loading third-party backend (v2)

2017-04-14 Thread Eric Anholt
Emil Velikov  writes:

> On 14 April 2017 at 10:38, Yu, Qiang  wrote:
>>
>> Hi Emil,
>>
>>> What happened with the idea of reusing your existing amdgpu_dri.so ?
>>> As mentioned before the DRI loader (libgbm) <> DRI driver (foo_dri.so)
>>> interface is stable, so things should just work.
>> Sorry for the late reply. I've asked our amdgpu_dri.so team about this; they
>> seem to have neither the interest nor the resources to implement this interface.
>> So the only option left for me is to reuse our current gbm_amdgpu.so
>> and upstream the libgbm.so changes if possible.
>>
> A quick look through `strings amdgpu_dri.so` shows that you are
> missing the DRI2_FENCE and DRI2_INTEROP extensions.
> Both are fairly trivial to implement, and doing so will be the better
> option.
>
> Doing so will give you:
>  - acknowledgement of the good work done by Marek (your colleague from
> the other end of the org chart)
>  - fewer binaries to manage - remove gbm_amdgpu.so
>  - less code to manage - remove many of the libEGL and libgbm patches
> that you have on top of Mesa.
>
> The proposed GBM interface will end up broken rather often since:
>  - there are no open-source users for people to test against
>  - we have no tests to catch regressions :-\
>
> TL;DR: You really want to implement the missing functionality in
> amdgpu_dri.so - it's more robust and it will reduce the code you have
> to maintain.

I agree with Emil here.  Building ABI-stable interfaces is hard and
error-prone, and the DRI interface and the GL interface are where
we do that already.  We shouldn't introduce another ABI at the GBM
backend level.




Re: [Mesa-dev] [PATCH] swr: add linux to scons build

2017-04-14 Thread Kyriazis, George

On Apr 14, 2017, at 12:44 PM, Emil Velikov wrote:
> On 13 April 2017 at 20:17, George Kyriazis wrote:
>> Make swr compile for both linux and windows.
>> ---
>> src/gallium/drivers/swr/SConscript| 7 +--
>> src/gallium/targets/libgl-xlib/SConscript | 2 +-
>> 2 files changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/gallium/drivers/swr/SConscript 
>> b/src/gallium/drivers/swr/SConscript
>> index eca5dba..5e3784b 100644
>> --- a/src/gallium/drivers/swr/SConscript
>> +++ b/src/gallium/drivers/swr/SConscript
>> @@ -17,11 +17,6 @@ if env['LLVM_VERSION'] < 
>> distutils.version.LooseVersion('3.9'):
>> env['swr'] = False
>> Return()
>>
>> -if env['platform'] != 'windows':
>> -print "warning: swr scons build only supports windows: not building swr"
>> -env['swr'] = False
>> -Return()
>> -
>> env.MSVC2013Compat()
>>
>> env = env.Clone()
>> @@ -205,7 +200,7 @@ envavx2.Append(CPPDEFINES = ['KNOB_ARCH=KNOB_ARCH_AVX2'])
>> if env['platform'] == 'windows':
>> envavx2.Append(CCFLAGS = ['/arch:AVX2'])
>> else:
>> -envavx2.Append(CCFLAGS = ['-mavx2'])
>> +envavx2.Append(CCFLAGS = ['-mavx2', '-mfma', '-mbmi2', '-mf16c'])
>>
>> swrAVX2 = envavx2.SharedLibrary(
>> target = 'swrAVX2',
>> diff --git a/src/gallium/targets/libgl-xlib/SConscript 
>> b/src/gallium/targets/libgl-xlib/SConscript
>> index d01bb3c..a81ac79 100644
>> --- a/src/gallium/targets/libgl-xlib/SConscript
>> +++ b/src/gallium/targets/libgl-xlib/SConscript
>> @@ -49,7 +49,7 @@ if env['llvm']:
>> env.Prepend(LIBS = [llvmpipe])
>>
>> if env['swr']:
>> -env.Append(CPPDEFINES = 'HAVE_SWR')
>> +env.Append(CPPDEFINES = 'GALLIUM_SWR')
> Seems like we want the same fix in src/gallium/targets/osmesa/SConscript.
> Please squash that alongside a small note in docs/relnotes/17.1.0.html

The check-in is already submitted, so I'll make a follow-up commit with those
changes.

> With the above
> Reviewed-by: Emil Velikov
>
> As a follow-up commit can we have $ sed -i s/HAVE_/GALLIUM_/
> src/gallium/targets/libgl-xlib/* && git commit -asm “…”

Yes, I want to fix this too, and I was planning on doing it in a later commit.

Thanks,

George

> Thanks
> Emil



Re: [Mesa-dev] [PATCH v3 4/4] vc4: Only build the NEON code on arm32.

2017-04-14 Thread Eric Anholt
Emil Velikov  writes:

> On 14 April 2017 at 18:47, Eric Anholt  wrote:
>> NEON is sufficiently different on arm64 that we can't just reuse this
>> code.  Disable it on arm64 for now.
>>
>> Signed-off-by: Eric Anholt 
>> ---
>>  src/gallium/drivers/vc4/vc4_tiling_lt.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/drivers/vc4/vc4_tiling_lt.c 
>> b/src/gallium/drivers/vc4/vc4_tiling_lt.c
>> index c9cbc65e2dbc..7de67b652daa 100644
>> --- a/src/gallium/drivers/vc4/vc4_tiling_lt.c
>> +++ b/src/gallium/drivers/vc4/vc4_tiling_lt.c
>> @@ -61,7 +61,7 @@ static void
>>  vc4_load_utile(void *cpu, void *gpu, uint32_t cpu_stride, uint32_t cpp)
>>  {
>>  uint32_t gpu_stride = vc4_utile_stride(cpp);
>> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
>> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
>>  if (gpu_stride == 8) {
>>  __asm__ volatile (
>>  /* Load from the GPU in one shot, no interleave, to
>> @@ -118,7 +118,7 @@ vc4_store_utile(void *gpu, void *cpu, uint32_t 
>> cpu_stride, uint32_t cpp)
>>  {
>>  uint32_t gpu_stride = vc4_utile_stride(cpp);
>>
>> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
>> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
>
> This patch should be before 4/4, or it will cause intermittent breakage.

I don't think there is any new breakage.  We've been setting
VC4_BUILD_NEON already.




Re: [Mesa-dev] [PATCH v3 4/4] vc4: Only build the NEON code on arm32.

2017-04-14 Thread Emil Velikov
On 14 April 2017 at 18:47, Eric Anholt  wrote:
> NEON is sufficiently different on arm64 that we can't just reuse this
> code.  Disable it on arm64 for now.
>
> Signed-off-by: Eric Anholt 
> ---
>  src/gallium/drivers/vc4/vc4_tiling_lt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_tiling_lt.c 
> b/src/gallium/drivers/vc4/vc4_tiling_lt.c
> index c9cbc65e2dbc..7de67b652daa 100644
> --- a/src/gallium/drivers/vc4/vc4_tiling_lt.c
> +++ b/src/gallium/drivers/vc4/vc4_tiling_lt.c
> @@ -61,7 +61,7 @@ static void
>  vc4_load_utile(void *cpu, void *gpu, uint32_t cpu_stride, uint32_t cpp)
>  {
>  uint32_t gpu_stride = vc4_utile_stride(cpp);
> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
>  if (gpu_stride == 8) {
>  __asm__ volatile (
>  /* Load from the GPU in one shot, no interleave, to
> @@ -118,7 +118,7 @@ vc4_store_utile(void *gpu, void *cpu, uint32_t 
> cpu_stride, uint32_t cpp)
>  {
>  uint32_t gpu_stride = vc4_utile_stride(cpp);
>
> -#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
> +#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7

This patch should be before 4/4, or it will cause intermittent breakage.

-Emil


Re: [Mesa-dev] [PATCH] radv: remove irrelevant comment

2017-04-14 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Fri, Apr 14, 2017 at 7:17 PM, Grazvydas Ignotas  wrote:
> A leftover from anv.
>
> Signed-off-by: Grazvydas Ignotas 
> ---
>  src/amd/vulkan/radv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 5f14394..7857e8f 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -660,11 +660,11 @@ void radv_GetPhysicalDeviceProperties(
> .driverVersion = radv_get_driver_version(),
> .vendorID = 0x1002,
> .deviceID = pdevice->rad_info.pci_id,
> .deviceType = VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU,
> .limits = limits,
> -   .sparseProperties = {0}, /* Broadwell doesn't do sparse. */
> +   .sparseProperties = {0},
> };
>
> strcpy(pProperties->deviceName, pdevice->name);
> memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
>  }
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH] radv: report timestampPeriod correctly

2017-04-14 Thread Bas Nieuwenhuizen
For some reason I thought it did it in 10 KHz.

Reviewed-by: Bas Nieuwenhuizen 
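
For reference, the unit conversion behind the fix can be sketched as follows. This is a hedged illustration rather than the radv code: timestamp_period_ns is a hypothetical helper, and the only assumption carried over from the patch is that the kernel reports the crystal frequency in kHz.

```c
#include <stdint.h>

/* Vulkan's timestampPeriod is the number of nanoseconds per timestamp
 * tick.  With the crystal frequency given in kHz:
 *   period_ns = 1e9 / (freq_khz * 1e3) = 1e6 / freq_khz
 */
static float
timestamp_period_ns(uint32_t clock_crystal_freq_khz)
{
   return 1000000.0f / (float)clock_crystal_freq_khz;
}
```

With a hypothetical 100 MHz crystal (100000 kHz) this gives 10 ns per tick; a dividend ten times smaller would only be correct if the frequency were reported in units of 10 kHz.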

On Fri, Apr 14, 2017 at 7:17 PM, Grazvydas Ignotas  wrote:
> The kernel returns frequency in kHz, so to convert to nanosecond
> interval that Vulkan uses the dividend should be 1000000.0 and not
> 100000.0.
>
> This fixes the GPU graph in DOOM and matches the amdgpu-pro blob.
>
> Signed-off-by: Grazvydas Ignotas 
> Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
> ---
>  src/amd/vulkan/radv_device.c| 2 +-
>  src/amd/vulkan/radv_radeon_winsys.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 7857e8f..796cc70 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -637,11 +637,11 @@ void radv_GetPhysicalDeviceProperties(
> .sampledImageDepthSampleCounts= sample_counts,
> .sampledImageStencilSampleCounts  = sample_counts,
> .storageImageSampleCounts = 
> VK_SAMPLE_COUNT_1_BIT,
> .maxSampleMaskWords   = 1,
> .timestampComputeAndGraphics  = false,
> -   .timestampPeriod  = 100000.0 / 
> pdevice->rad_info.clock_crystal_freq,
> +   .timestampPeriod  = 1000000.0 / 
> pdevice->rad_info.clock_crystal_freq,
> .maxClipDistances = 8,
> .maxCullDistances = 8,
> .maxCombinedClipAndCullDistances  = 8,
> .discreteQueuePriorities  = 1,
> .pointSizeRange   = { 0.125, 255.875 
> },
> diff --git a/src/amd/vulkan/radv_radeon_winsys.h 
> b/src/amd/vulkan/radv_radeon_winsys.h
> index 9f2430f..f6bab74 100644
> --- a/src/amd/vulkan/radv_radeon_winsys.h
> +++ b/src/amd/vulkan/radv_radeon_winsys.h
> @@ -93,11 +93,11 @@ struct radeon_info {
> bool has_uvd;
> uint32_tsdma_rings;
> uint32_tcompute_rings;
> uint32_tvce_fw_version;
> uint32_tvce_harvest_config;
> -   uint32_tclock_crystal_freq;
> +   uint32_tclock_crystal_freq; /* in kHz */
>
> /* Kernel info. */
> uint32_tdrm_major; /* version */
> uint32_tdrm_minor;
> uint32_tdrm_patchlevel;
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH v2] gbm: add support for loading third-party backend (v2)

2017-04-14 Thread Emil Velikov
On 14 April 2017 at 10:38, Yu, Qiang  wrote:
>
> Hi Emil,
>
>> What happened with the idea of reusing your existing amdgpu_dri.so ?
>> As mentioned before the DRI loader (libgbm) <> DRI driver (foo_dri.so)
>> interface is stable, so things should just work.
> Sorry for the late reply. I've asked our amdgpu_dri.so team about this; they
> seem to have neither the interest nor the resources to implement this interface.
> So the only option left for me is to reuse our current gbm_amdgpu.so
> and upstream the libgbm.so changes if possible.
>
A quick look through `strings amdgpu_dri.so` shows that you are
missing the DRI2_FENCE and DRI2_INTEROP extensions.
Both are fairly trivial to implement, and doing so will be the better option.

Doing so will give you:
 - acknowledgement of the good work done by Marek (your colleague from
the other end of the org chart)
 - fewer binaries to manage - remove gbm_amdgpu.so
 - less code to manage - remove many of the libEGL and libgbm patches
that you have on top of Mesa.

The proposed GBM interface will end up broken rather often since:
 - there are no open-source users for people to test against
 - we have no tests to catch regressions :-\

TL;DR: You really want to implement the missing functionality in
amdgpu_dri.so - it's more robust and it will reduce the code you have
to maintain.

Regards,
Emil


Re: [Mesa-dev] [PATCH] swr: update gallium driver docs

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 19:41, Tim Rowley  wrote:
> ---
>  src/gallium/docs/source/drivers/openswr.rst   | 2 +-
>  src/gallium/docs/source/drivers/openswr/usage.rst | 9 +++--
>  2 files changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/docs/source/drivers/openswr.rst 
> b/src/gallium/docs/source/drivers/openswr.rst
> index 84aa51f..e254d7b 100644
> --- a/src/gallium/docs/source/drivers/openswr.rst
> +++ b/src/gallium/docs/source/drivers/openswr.rst
> @@ -7,7 +7,7 @@ geometry heavy workloads there is a considerable speedup over 
> llvmpipe,
>  which is to be expected as the geometry frontend of llvmpipe is single
>  threaded.
>
> -This rasterizer is x86 specific and requires AVX or AVX2.  The driver
> +This rasterizer is x86 specific and requires AVX or above.  The driver
>  fits into the gallium framework, and reuses gallivm for doing the TGSI
>  to vectorized llvm-IR conversion of the shader kernels.
>
> diff --git a/src/gallium/docs/source/drivers/openswr/usage.rst 
> b/src/gallium/docs/source/drivers/openswr/usage.rst
> index e55b421..d2a664e 100644
> --- a/src/gallium/docs/source/drivers/openswr/usage.rst
> +++ b/src/gallium/docs/source/drivers/openswr/usage.rst
> @@ -4,8 +4,9 @@ Usage
>  Requirements
>  
>
> -* An x86 processor with AVX or AVX2
> -* LLVM version 3.6 or later
> +* An x86 processor with AVX or above
> +* LLVM version 3.9 or later
> +* C++14 capable compiler
>
>  Building
>  
> @@ -22,10 +23,6 @@ On Linux, building will create a drop-in alternative for 
> libGL.so into::
>
There is a hunk just outside of the diff that wants a
s/building will/building with autotools will/

>lib/gallium/libGL.so
>
> -or::
s/or/Alternatively, building with SCons will produce/

> -
> -  build/foo/gallium/targets/libgl-xlib/libGL.so
> -
and then keep this line.

Considering George wired everything, one might as well have it documented ;-)
Either way not my call, so feel free to ignore.

-Emil


[Mesa-dev] [PATCH v3 1/4] gallium: Enable ARM NEON CPU detection.

2017-04-14 Thread Eric Anholt
I wrote this code with reference to pixman, though I've only decided to
cover Linux (what I'm testing) and Android (seems obvious enough).  Linux
has getauxval() as a cleaner interface to the /proc entry, but it's more
glibc-specific and I didn't want to add detection for that.

This will be used to enable NEON at runtime on ARMv6 builds of vc4.

v2: Actually initialize the temp vars in the Android path (noticed by
daniels)
v3: Actually pull in the cpufeatures library (change by robher).  Use
O_CLOEXEC.  Break out of the loop when we find our feature.  Only do
NEON detection, until someone actually wants VFP features.
---
 src/gallium/auxiliary/Android.mk  |  2 ++
 src/gallium/auxiliary/util/u_cpu_detect.c | 43 +++
 src/gallium/auxiliary/util/u_cpu_detect.h |  1 +
 3 files changed, 46 insertions(+)

diff --git a/src/gallium/auxiliary/Android.mk b/src/gallium/auxiliary/Android.mk
index e8628e43744a..4f6f71bbf6a9 100644
--- a/src/gallium/auxiliary/Android.mk
+++ b/src/gallium/auxiliary/Android.mk
@@ -48,6 +48,8 @@ endif
 LOCAL_MODULE := libmesa_gallium
 LOCAL_STATIC_LIBRARIES += libmesa_nir
 
+LOCAL_WHOLE_STATIC_LIBRARIES += cpufeatures
+
 # generate sources
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 intermediates := $(call local-generated-sources-dir)
diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c 
b/src/gallium/auxiliary/util/u_cpu_detect.c
index 845fc6b34d5c..76115bf8d55d 100644
--- a/src/gallium/auxiliary/util/u_cpu_detect.c
+++ b/src/gallium/auxiliary/util/u_cpu_detect.c
@@ -59,12 +59,18 @@
 
 #if defined(PIPE_OS_LINUX)
 #include 
+#include 
+#include 
 #endif
 
 #ifdef PIPE_OS_UNIX
 #include 
 #endif
 
+#if defined(PIPE_OS_ANDROID)
+#include 
+#endif
+
 #if defined(PIPE_OS_WINDOWS)
 #include 
 #if defined(PIPE_CC_MSVC)
@@ -294,6 +300,38 @@ PIPE_ALIGN_STACK static inline boolean sse2_has_daz(void)
 
 #endif /* X86 or X86_64 */
 
+#if defined(PIPE_ARCH_ARM)
+static void
+check_os_arm_support(void)
+{
+#if defined(PIPE_OS_ANDROID)
+   AndroidCpuFamily cpu_family = android_getCpuFamily();
+   uint64_t cpu_features = android_getCpuFeatures();
+
+   if (cpu_family == ANDROID_CPU_FAMILY_ARM) {
+  if (cpu_features & ANDROID_CPU_ARM_FEATURE_NEON)
+ util_cpu_caps.has_neon = 1;
+   }
+#elif defined(PIPE_OS_LINUX)
+Elf32_auxv_t aux;
+int fd;
+
+fd = open("/proc/self/auxv", O_RDONLY | O_CLOEXEC);
+if (fd >= 0) {
+   while (read(fd, &aux, sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+  if (aux.a_type == AT_HWCAP) {
+ uint32_t hwcap = aux.a_un.a_val;
+
+ util_cpu_caps.has_neon = (hwcap >> 12) & 1;
+ break;
+  }
+   }
+   close (fd);
+}
+#endif /* PIPE_OS_LINUX */
+}
+#endif /* PIPE_ARCH_ARM */
+
 void
 util_cpu_detect(void)
 {
@@ -443,6 +481,10 @@ util_cpu_detect(void)
}
 #endif /* PIPE_ARCH_X86 || PIPE_ARCH_X86_64 */
 
+#if defined(PIPE_ARCH_ARM)
+   check_os_arm_support();
+#endif
+
 #if defined(PIPE_ARCH_PPC)
check_os_altivec_support();
 #endif /* PIPE_ARCH_PPC */
@@ -471,6 +513,7 @@ util_cpu_detect(void)
   debug_printf("util_cpu_caps.has_3dnow_ext = %u\n", 
util_cpu_caps.has_3dnow_ext);
   debug_printf("util_cpu_caps.has_xop = %u\n", util_cpu_caps.has_xop);
   debug_printf("util_cpu_caps.has_altivec = %u\n", 
util_cpu_caps.has_altivec);
+  debug_printf("util_cpu_caps.has_neon = %u\n", util_cpu_caps.has_neon);
   debug_printf("util_cpu_caps.has_daz = %u\n", util_cpu_caps.has_daz);
   debug_printf("util_cpu_caps.has_avx512f = %u\n", 
util_cpu_caps.has_avx512f);
   debug_printf("util_cpu_caps.has_avx512dq = %u\n", 
util_cpu_caps.has_avx512dq);
diff --git a/src/gallium/auxiliary/util/u_cpu_detect.h 
b/src/gallium/auxiliary/util/u_cpu_detect.h
index 3bd7294f0759..4a34ac4d9a63 100644
--- a/src/gallium/auxiliary/util/u_cpu_detect.h
+++ b/src/gallium/auxiliary/util/u_cpu_detect.h
@@ -72,6 +72,7 @@ struct util_cpu_caps {
unsigned has_xop:1;
unsigned has_altivec:1;
unsigned has_daz:1;
+   unsigned has_neon:1;
 
unsigned has_avx512f:1;
unsigned has_avx512dq:1;
-- 
2.11.0
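
The commit message mentions getauxval() as a cleaner interface than reading /proc/self/auxv. A minimal sketch of that alternative follows, assuming glibc 2.16 or later for <sys/auxv.h>. The helper names and the ARM_HWCAP_NEON_BIT macro are hypothetical; the bit position mirrors the (hwcap >> 12) & 1 test in the patch.

```c
#include <sys/auxv.h>

/* Bit 12 of AT_HWCAP is the NEON flag on 32-bit ARM Linux, matching the
 * (hwcap >> 12) & 1 test in the patch. */
#define ARM_HWCAP_NEON_BIT 12

/* Pure helper so the bit test can be checked on any host. */
static int
hwcap_has_neon(unsigned long hwcap)
{
   return (hwcap >> ARM_HWCAP_NEON_BIT) & 1;
}

/* Only meaningful on 32-bit ARM; on other architectures bit 12 of
 * AT_HWCAP means something else. */
static int
detect_neon_getauxval(void)
{
   return hwcap_has_neon(getauxval(AT_HWCAP));
}
```

The trade-off is the one stated in the commit message: this is shorter and needs no file descriptor, but it ties the code to a C library that provides getauxval().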



[Mesa-dev] [PATCH v3 4/4] vc4: Only build the NEON code on arm32.

2017-04-14 Thread Eric Anholt
NEON is sufficiently different on arm64 that we can't just reuse this
code.  Disable it on arm64 for now.

Signed-off-by: Eric Anholt 
---
 src/gallium/drivers/vc4/vc4_tiling_lt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_tiling_lt.c 
b/src/gallium/drivers/vc4/vc4_tiling_lt.c
index c9cbc65e2dbc..7de67b652daa 100644
--- a/src/gallium/drivers/vc4/vc4_tiling_lt.c
+++ b/src/gallium/drivers/vc4/vc4_tiling_lt.c
@@ -61,7 +61,7 @@ static void
 vc4_load_utile(void *cpu, void *gpu, uint32_t cpu_stride, uint32_t cpp)
 {
 uint32_t gpu_stride = vc4_utile_stride(cpp);
-#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
+#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
 if (gpu_stride == 8) {
 __asm__ volatile (
 /* Load from the GPU in one shot, no interleave, to
@@ -118,7 +118,7 @@ vc4_store_utile(void *gpu, void *cpu, uint32_t cpu_stride, 
uint32_t cpp)
 {
 uint32_t gpu_stride = vc4_utile_stride(cpp);
 
-#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH)
+#if defined(VC4_BUILD_NEON) && defined(__ARM_ARCH) && __ARM_ARCH <= 7
 if (gpu_stride == 8) {
 __asm__ volatile (
 /* Load each 8-byte line from cpu-side source,
-- 
2.11.0



[Mesa-dev] [PATCH v3 3/4] vc4: Use runtime CPU detection for whether NEON is available.

2017-04-14 Thread Eric Anholt
This will allow Raspbian's ARMv6 builds to take advantage of the new NEON
code, and could prevent problems if vc4 ends up getting used on a v7 CPU
without NEON.
---
 src/gallium/drivers/vc4/vc4_screen.c |  3 +++
 src/gallium/drivers/vc4/vc4_tiling.h | 25 +
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_screen.c 
b/src/gallium/drivers/vc4/vc4_screen.c
index 9030c4baf4bb..514af808b916 100644
--- a/src/gallium/drivers/vc4/vc4_screen.c
+++ b/src/gallium/drivers/vc4/vc4_screen.c
@@ -27,6 +27,7 @@
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
 
+#include "util/u_cpu_detect.h"
 #include "util/u_debug.h"
 #include "util/u_memory.h"
 #include "util/u_format.h"
@@ -627,6 +628,8 @@ vc4_screen_create(int fd)
 if (!vc4_get_chip_info(screen))
 goto fail;
 
+util_cpu_detect();
+
 slab_create_parent(&screen->transfer_pool, sizeof(struct 
vc4_transfer), 16);
 
 vc4_fence_init(screen);
diff --git a/src/gallium/drivers/vc4/vc4_tiling.h 
b/src/gallium/drivers/vc4/vc4_tiling.h
index ba1ad6fb3f7d..31317db7a949 100644
--- a/src/gallium/drivers/vc4/vc4_tiling.h
+++ b/src/gallium/drivers/vc4/vc4_tiling.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include "util/macros.h"
+#include "util/u_cpu_detect.h"
 
 /** Return the width in pixels of a 64-byte microtile. */
 static inline uint32_t
@@ -83,23 +84,18 @@ void vc4_store_tiled_image(void *dst, uint32_t dst_stride,
uint8_t tiling_format, int cpp,
const struct pipe_box *box);
 
-/* If we're building for ARMv7 (Pi 2+), assume it has NEON.  For Raspbian we
- * should extend this to have some runtime detection of being built for ARMv6
- * on a Pi 2+.
- */
-#if defined(__ARM_ARCH) && __ARM_ARCH == 7
-#define NEON_SUFFIX(x) x ## _neon
-#else
-#define NEON_SUFFIX(x) x ## _base
-#endif
-
 static inline void
 vc4_load_lt_image(void *dst, uint32_t dst_stride,
   void *src, uint32_t src_stride,
   int cpp, const struct pipe_box *box)
 {
-NEON_SUFFIX(vc4_load_lt_image)(dst, dst_stride, src, src_stride,
+if (util_cpu_caps.has_neon) {
+vc4_load_lt_image_neon(dst, dst_stride, src, src_stride,
+   cpp, box);
+} else {
+vc4_load_lt_image_base(dst, dst_stride, src, src_stride,
cpp, box);
+}
 }
 
 static inline void
@@ -107,8 +103,13 @@ vc4_store_lt_image(void *dst, uint32_t dst_stride,
void *src, uint32_t src_stride,
int cpp, const struct pipe_box *box)
 {
-NEON_SUFFIX(vc4_store_lt_image)(dst, dst_stride, src, src_stride,
+if (util_cpu_caps.has_neon) {
+vc4_store_lt_image_neon(dst, dst_stride, src, src_stride,
+cpp, box);
+} else {
+vc4_store_lt_image_base(dst, dst_stride, src, src_stride,
 cpp, box);
+}
 }
 
 #undef NEON_SUFFIX
-- 
2.11.0
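
The pattern in this patch, choosing between a NEON and a base implementation at run time instead of at build time, can be reduced to the following sketch. Everything here is a hypothetical stand-in: struct cpu_caps plays the role of util_cpu_caps, and copy_base/copy_neon merely report which path ran.

```c
/* Hypothetical stand-in for Mesa's util_cpu_caps. */
struct cpu_caps {
   unsigned has_neon:1;
};

static struct cpu_caps caps;

/* Toy "implementations" that just report which path ran. */
static int copy_base(void) { return 0; }
static int copy_neon(void) { return 1; }

/* Pick the implementation at run time rather than at build time, so a
 * single binary works on CPUs with and without NEON. */
static int
copy_dispatch(void)
{
   if (caps.has_neon)
      return copy_neon();
   return copy_base();
}
```

The cost is one well-predicted branch per call, which is negligible next to a tiled image copy, and both variants have to be built into the binary.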



[Mesa-dev] [PATCH v3 2/4] vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.

2017-04-14 Thread Eric Anholt
Android.mk was setting the flag across the entire driver, so we didn't
have non-NEON versions getting built.  This was going to be a problem with
the next commit, when I start auto-detecting NEON support and use the
non-NEON version when appropriate.
---
 src/gallium/drivers/vc4/Android.mk   |  2 --
 src/gallium/drivers/vc4/Makefile.am  |  6 --
 src/gallium/drivers/vc4/Makefile.sources |  1 +
 src/gallium/drivers/vc4/vc4_tiling_lt_neon.c | 30 
 4 files changed, 31 insertions(+), 8 deletions(-)
 create mode 100644 src/gallium/drivers/vc4/vc4_tiling_lt_neon.c

diff --git a/src/gallium/drivers/vc4/Android.mk 
b/src/gallium/drivers/vc4/Android.mk
index fdc06744e5ab..de9d5e3f5b3c 100644
--- a/src/gallium/drivers/vc4/Android.mk
+++ b/src/gallium/drivers/vc4/Android.mk
@@ -25,8 +25,6 @@ include $(LOCAL_PATH)/Makefile.sources
 
 include $(CLEAR_VARS)
 
-LOCAL_CFLAGS_arm := -DVC4_BUILD_NEON
-
 LOCAL_SRC_FILES := \
$(C_SOURCES)
 
diff --git a/src/gallium/drivers/vc4/Makefile.am 
b/src/gallium/drivers/vc4/Makefile.am
index b361a0c588a8..0ed49b128b2d 100644
--- a/src/gallium/drivers/vc4/Makefile.am
+++ b/src/gallium/drivers/vc4/Makefile.am
@@ -41,10 +41,4 @@ libvc4_la_SOURCES = $(C_SOURCES)
 libvc4_la_LIBADD = $(SIM_LIB) $(VC4_LIBS)
 libvc4_la_LDFLAGS = $(SIM_LDFLAGS)
 
-noinst_LTLIBRARIES += libvc4_neon.la
-libvc4_la_LIBADD += libvc4_neon.la
-
-libvc4_neon_la_SOURCES = vc4_tiling_lt.c
-libvc4_neon_la_CFLAGS = $(AM_CFLAGS) -DVC4_BUILD_NEON
-
 EXTRA_DIST = kernel/README
diff --git a/src/gallium/drivers/vc4/Makefile.sources 
b/src/gallium/drivers/vc4/Makefile.sources
index 10de34361260..442d7a561782 100644
--- a/src/gallium/drivers/vc4/Makefile.sources
+++ b/src/gallium/drivers/vc4/Makefile.sources
@@ -56,6 +56,7 @@ C_SOURCES := \
vc4_state.c \
vc4_tiling.c \
vc4_tiling_lt.c \
+   vc4_tiling_lt_neon.c \
vc4_tiling.h \
vc4_uniforms.c \
$()
diff --git a/src/gallium/drivers/vc4/vc4_tiling_lt_neon.c 
b/src/gallium/drivers/vc4/vc4_tiling_lt_neon.c
new file mode 100644
index 000000000000..7ba66ae4cdf4
--- /dev/null
+++ b/src/gallium/drivers/vc4/vc4_tiling_lt_neon.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright © 2017 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/* Wrapper file for building vc4_tiling_lt.c with the "build NEON assembly if
+ * possible" flag set, since Android.mk doesn't have a way to set CFLAGS for a
+ * single file.
+ */
+
+#define VC4_BUILD_NEON
+#include "vc4_tiling_lt.c"
-- 
2.11.0



Re: [Mesa-dev] [PATCH] swr: add linux to scons build

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 20:17, George Kyriazis  wrote:
> Make swr compile for both linux and windows.
> ---
>  src/gallium/drivers/swr/SConscript| 7 +--
>  src/gallium/targets/libgl-xlib/SConscript | 2 +-
>  2 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/swr/SConscript 
> b/src/gallium/drivers/swr/SConscript
> index eca5dba..5e3784b 100644
> --- a/src/gallium/drivers/swr/SConscript
> +++ b/src/gallium/drivers/swr/SConscript
> @@ -17,11 +17,6 @@ if env['LLVM_VERSION'] < 
> distutils.version.LooseVersion('3.9'):
>  env['swr'] = False
>  Return()
>
> -if env['platform'] != 'windows':
> -print "warning: swr scons build only supports windows: not building swr"
> -env['swr'] = False
> -Return()
> -
>  env.MSVC2013Compat()
>
>  env = env.Clone()
> @@ -205,7 +200,7 @@ envavx2.Append(CPPDEFINES = ['KNOB_ARCH=KNOB_ARCH_AVX2'])
>  if env['platform'] == 'windows':
>  envavx2.Append(CCFLAGS = ['/arch:AVX2'])
>  else:
> -envavx2.Append(CCFLAGS = ['-mavx2'])
> +envavx2.Append(CCFLAGS = ['-mavx2', '-mfma', '-mbmi2', '-mf16c'])
>
>  swrAVX2 = envavx2.SharedLibrary(
>  target = 'swrAVX2',
> diff --git a/src/gallium/targets/libgl-xlib/SConscript 
> b/src/gallium/targets/libgl-xlib/SConscript
> index d01bb3c..a81ac79 100644
> --- a/src/gallium/targets/libgl-xlib/SConscript
> +++ b/src/gallium/targets/libgl-xlib/SConscript
> @@ -49,7 +49,7 @@ if env['llvm']:
>  env.Prepend(LIBS = [llvmpipe])
>
>  if env['swr']:
> -env.Append(CPPDEFINES = 'HAVE_SWR')
> +env.Append(CPPDEFINES = 'GALLIUM_SWR')
Seems like we want the same fix in src/gallium/targets/osmesa/SConscript.
Please squash that in, along with a small note in docs/relnotes/17.1.0.html.

With the above addressed:
Reviewed-by: Emil Velikov 

As a follow-up commit, can we have
$ sed -i s/HAVE_/GALLIUM_/ src/gallium/targets/libgl-xlib/* && git commit -asm "..."

Thanks
Emil


[Mesa-dev] [RFC 21/21] anv: Use DRM sync objects for external semaphores when available

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 69 
 src/intel/vulkan/anv_device.c  |  2 +
 src/intel/vulkan/anv_private.h |  8 
 src/intel/vulkan/anv_queue.c   | 93 --
 4 files changed, 148 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index ec37c81..0f118c8 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -953,6 +953,19 @@ anv_cmd_buffer_add_secondary(struct anv_cmd_buffer *primary,
                          &secondary->surface_relocs, 0);
 }
 
+struct drm_i915_gem_exec_fence {
+   /**
+* User's handle for a dma-fence to wait on or signal.
+*/
+   __u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+   __u32 flags;
+};
+
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
 struct anv_execbuf {
struct drm_i915_gem_execbuffer2   execbuf;
 
@@ -962,6 +975,10 @@ struct anv_execbuf {
 
/* Allocated length of the 'objects' and 'bos' arrays */
uint32_t  array_length;
+
+   uint32_t  fence_count;
+   uint32_t  fence_array_length;
+   struct drm_i915_gem_exec_fence *  fences;
 };
 
 static void
@@ -976,6 +993,7 @@ anv_execbuf_finish(struct anv_execbuf *exec,
 {
vk_free(alloc, exec->objects);
vk_free(alloc, exec->bos);
+   vk_free(alloc, exec->fences);
 }
 
 static VkResult
@@ -1061,6 +1079,35 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
return VK_SUCCESS;
 }
 
+static VkResult
+anv_execbuf_add_syncobj(struct anv_execbuf *exec,
+uint32_t handle,
+uint32_t flags,
+const VkAllocationCallbacks *alloc)
+{
+   if (exec->fence_count >= exec->fence_array_length) {
+  uint32_t new_len = MAX2(exec->fence_array_length * 2, 64);
+
+  struct drm_i915_gem_exec_fence *new_fences =
+ vk_realloc(alloc, exec->fences, new_len * sizeof(*new_fences),
+8, VK_SYSTEM_ALLOCATION_SCOPE_COMMAND);
+  if (new_fences == NULL)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+  exec->fences = new_fences;
+  exec->fence_array_length = new_len;
+   }
+
+   exec->fences[exec->fence_count] = (struct drm_i915_gem_exec_fence) {
+  .handle = handle,
+  .flags = flags,
+   };
+
+   exec->fence_count++;
+
+   return VK_SUCCESS;
+}
+
 static void
 anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer,
   struct anv_reloc_list *list)
@@ -1447,6 +1494,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  impl->fd = -1;
  break;
 
+  case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_WAIT,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  break;
   }
@@ -1481,6 +1536,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  need_out_fence = true;
  break;
 
+  case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  break;
   }
@@ -1494,6 +1557,12 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   setup_empty_execbuf(&execbuf, device);
}
 
+   if (execbuf.fence_count > 0) {
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_ARRAY;
+  execbuf.execbuf.num_cliprects = execbuf.fence_count;
+  execbuf.execbuf.cliprects_ptr = (uintptr_t) execbuf.fences;
+   }
+
if (in_fence != -1) {
   execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
   execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index f853905..13d01d1 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -233,6 +233,8 @@ anv_physical_device_init(struct anv_physical_device *device,
 
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
+   device->has_syncobj =
+  anv_gem_get_param(fd, 47 /* I915_PARAM_HAS_EXEC_FENCE_ARRAY */);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index d1406ab..0731e89 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -648,6 +648,7 @@ struct anv_physical_device {
 int cmd_parser_version;
 bool 

[Mesa-dev] [RFC 20/21] anv/gem: Add a drm syncobj support

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 79 
 src/intel/vulkan/anv_gem_stubs.c | 24 
 src/intel/vulkan/anv_private.h   |  4 ++
 3 files changed, 107 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index e331fbb..6db15ba 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -449,3 +449,82 @@ anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2)
 
return args.fence;
 }
+
+#define DRM_IOCTL_SYNCOBJ_CREATE       DRM_IOWR(0xBF, struct drm_syncobj_create_info)
+#define DRM_IOCTL_SYNCOBJ_DESTROY      DRM_IOWR(0xC0, struct drm_syncobj_destroy)
+#define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD DRM_IOWR(0xC1, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_INFO         DRM_IOWR(0xC3, struct drm_syncobj_create_info)
+
+struct drm_syncobj_create_info {
+   __u32 handle;
+   __u32 type;
+   __u32 flags;
+   __u32 pad;
+};
+
+struct drm_syncobj_destroy {
+   __u32 handle;
+   __u32 pad;
+};
+
+struct drm_syncobj_handle {
+   __u32 handle;
+   /** Flags.. only applicable for handle->fd */
+   __u32 flags;
+
+   __s32 fd;
+};
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   struct drm_syncobj_create_info args = {
+  .type = 1 /* SYNC_FILE_TYPE_SEMAPHORE */,
+  .flags = 0,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_CREATE, &args);
+   if (ret)
+  return 0;
+
+   return args.handle;
+}
+
+void
+anv_gem_syncobj_close(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_destroy args = {
+  .handle = handle,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_DESTROY, &args);
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);
+   if (ret)
+  return -1;
+
+   return args.fd;
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   struct drm_syncobj_handle args = {
+  .fd = fd,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
+   if (ret)
+  return 0;
+
+   return args.handle;
+}
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index d93009f..e3998b9 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -187,3 +187,27 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)
 {
unreachable("Unused");
 }
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   unreachable("Unused");
+}
+
+void
+anv_gem_syncobj_close(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   unreachable("Unused");
+}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b99c93c..d1406ab 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -802,6 +802,10 @@ int anv_gem_set_domain(struct anv_device *device, uint32_t 
gem_handle,
 int anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2);
 int anv_gem_set_context_param(struct anv_device *device,
   uint64_t param, uint64_t value);
+uint32_t anv_gem_syncobj_create(struct anv_device *device);
+void anv_gem_syncobj_close(struct anv_device *device, uint32_t handle);
+int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
+uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
 
 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, 
uint64_t size);
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 17/21] anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 185086f..1392bf4 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -185,7 +185,10 @@ int
 anv_gem_execbuffer(struct anv_device *device,
struct drm_i915_gem_execbuffer2 *execbuf)
 {
-   return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
+   if (execbuf->flags & I915_EXEC_FENCE_OUT)
+  return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, execbuf);
+   else
+  return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
 }
 
 int
-- 
2.5.0.400.gff86faf
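
For readers following along, here is a minimal standalone sketch of the selection logic in this patch. The flag value and the two request codes are hypothetical stand-ins, not the real i915 uapi definitions. The assumption being illustrated is that the read/write ioctl variant is only needed when the kernel has to hand an out-fence back through the execbuffer struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; the real definitions
 * live in the kernel's i915_drm.h. */
#define SKETCH_EXEC_FENCE_OUT   (1u << 17)
#define SKETCH_IOCTL_EXECBUF    1
#define SKETCH_IOCTL_EXECBUF_WR 2

/* Pick the request code for a submission with the given execbuf flags.
 * The _WR variant is chosen only when an out-fence was requested. */
static int
sketch_pick_execbuf_ioctl(uint64_t flags)
{
   if (flags & SKETCH_EXEC_FENCE_OUT)
      return SKETCH_IOCTL_EXECBUF_WR;

   return SKETCH_IOCTL_EXECBUF;
}
```

Keeping the plain variant as the default means submissions that never ask for an out-fence take the same path as before the patch.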



[Mesa-dev] [PATCH 16/21] anv: Implement VK_KHX_external_semaphore_fd

2017-04-14 Thread Jason Ekstrand
This implementation allocates a 4k BO for each semaphore that can be
exported using OPAQUE_FD and uses the kernel's already-existing
synchronization mechanism on BOs.
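
A rough sketch of the idea, as read from the diff in this message: a submission that signals the semaphore lists the BO as written, while a submission that waits lists it with no extra flags, so the kernel's implicit BO synchronization orders the wait after the signal. The constant below is a hypothetical stand-in for EXEC_OBJECT_WRITE and the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's EXEC_OBJECT_WRITE flag. */
#define SKETCH_EXEC_OBJECT_WRITE (1u << 2)

/* Extra per-object flags to use when adding a semaphore BO to a
 * submission: signalers mark the BO as written, waiters do not. */
static uint32_t
sketch_semaphore_bo_flags(bool is_signal)
{
   return is_signal ? SKETCH_EXEC_OBJECT_WRITE : 0;
}
```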
---
 src/intel/vulkan/anv_batch_chain.c  |  53 ++--
 src/intel/vulkan/anv_device.c   |   4 +
 src/intel/vulkan/anv_entrypoints_gen.py |   1 +
 src/intel/vulkan/anv_private.h  |  16 +++-
 src/intel/vulkan/anv_queue.c| 141 ++--
 5 files changed, 199 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 136f273..0529f22 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -982,6 +982,7 @@ static VkResult
 anv_execbuf_add_bo(struct anv_execbuf *exec,
struct anv_bo *bo,
struct anv_reloc_list *relocs,
+   uint32_t extra_flags,
const VkAllocationCallbacks *alloc)
 {
struct drm_i915_gem_exec_object2 *obj = NULL;
@@ -1036,7 +1037,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
   obj->relocs_ptr = 0;
   obj->alignment = 0;
   obj->offset = bo->offset;
-  obj->flags = bo->flags;
+  obj->flags = bo->flags | extra_flags;
   obj->rsvd1 = 0;
   obj->rsvd2 = 0;
}
@@ -1052,7 +1053,8 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
   for (size_t i = 0; i < relocs->num_relocs; i++) {
  /* A quick sanity check on relocations */
  assert(relocs->relocs[i].offset < bo->size);
- anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
+ anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL,
+extra_flags, alloc);
   }
}
 
@@ -1261,7 +1263,7 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
    adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
                                       cmd_buffer->last_ss_pool_center);
    VkResult result =
-      anv_execbuf_add_bo(execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
+      anv_execbuf_add_bo(execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs, 0,
                          &cmd_buffer->device->alloc);
if (result != VK_SUCCESS)
   return result;
@@ -1274,7 +1276,7 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
   adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
cmd_buffer->last_ss_pool_center);
 
-  result = anv_execbuf_add_bo(execbuf, &(*bbo)->bo, &(*bbo)->relocs,
+  result = anv_execbuf_add_bo(execbuf, &(*bbo)->bo, &(*bbo)->relocs, 0,
                                   &cmd_buffer->device->alloc);
   if (result != VK_SUCCESS)
  return result;
@@ -1387,12 +1389,51 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf 
*execbuf,
 
 VkResult
 anv_cmd_buffer_execbuf(struct anv_device *device,
-   struct anv_cmd_buffer *cmd_buffer)
+   struct anv_cmd_buffer *cmd_buffer,
+   const VkSemaphore *in_semaphores,
+   uint32_t num_in_semaphores,
+   const VkSemaphore *out_semaphores,
+   uint32_t num_out_semaphores)
 {
struct anv_execbuf execbuf;
    anv_execbuf_init(&execbuf);
 
-   VkResult result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
+   VkResult result = VK_SUCCESS;
+   for (uint32_t i = 0; i < num_in_semaphores; i++) {
+      ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
+      assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
+      struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+      switch (impl->type) {
+      case ANV_SEMAPHORE_TYPE_BO:
+         result = anv_execbuf_add_bo(&execbuf, impl->bo, NULL,
+                                     0, &device->alloc);
+         if (result != VK_SUCCESS)
+            return result;
+         break;
+      default:
+         break;
+      }
+   }
+
+   for (uint32_t i = 0; i < num_out_semaphores; i++) {
+      ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
+      assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
+      struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+      switch (impl->type) {
+      case ANV_SEMAPHORE_TYPE_BO:
+         result = anv_execbuf_add_bo(&execbuf, impl->bo, NULL,
+                                     EXEC_OBJECT_WRITE, &device->alloc);
+         if (result != VK_SUCCESS)
+            return result;
+         break;
+      default:
+         break;
+      }
+   }
+
+   result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
if (result != VK_SUCCESS)
   return result;
 
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index b85cd40..f6e77ab 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -378,6 +378,10 @@ static const VkExtensionProperties device_extensions[] = {
   .extensionName = VK_KHX_EXTERNAL_SEMAPHORE_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  

[Mesa-dev] [PATCH 18/21] anv: Implement support for exporting semaphores as FENCE_FD

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 96 --
 src/intel/vulkan/anv_device.c  | 25 ++
 src/intel/vulkan/anv_gem.c | 36 ++
 src/intel/vulkan/anv_private.h | 24 +++---
 src/intel/vulkan/anv_queue.c   | 73 +++--
 5 files changed, 240 insertions(+), 14 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 0529f22..ec37c81 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1387,6 +1387,23 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
return VK_SUCCESS;
 }
 
+static void
+setup_empty_execbuf(struct anv_execbuf *execbuf, struct anv_device *device)
+{
+   anv_execbuf_add_bo(execbuf, &device->trivial_batch_bo, NULL, 0,
+                      &device->alloc);
+
+   execbuf->execbuf = (struct drm_i915_gem_execbuffer2) {
+  .buffers_ptr = (uintptr_t) execbuf->objects,
+  .buffer_count = execbuf->bo_count,
+  .batch_start_offset = 0,
+  .batch_len = 8, /* GEN8_MI_BATCH_BUFFER_END and NOOP */
+  .flags = I915_EXEC_HANDLE_LUT | I915_EXEC_RENDER,
+  .rsvd1 = device->context_id,
+  .rsvd2 = 0,
+   };
+}
+
 VkResult
 anv_cmd_buffer_execbuf(struct anv_device *device,
struct anv_cmd_buffer *cmd_buffer,
@@ -1398,11 +1415,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
struct anv_execbuf execbuf;
    anv_execbuf_init(&execbuf);
 
+   int in_fence = -1;
VkResult result = VK_SUCCESS;
for (uint32_t i = 0; i < num_in_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
-  assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
-  struct anv_semaphore_impl *impl = &semaphore->permanent;
+  struct anv_semaphore_impl *impl =
+ semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ?
+ &semaphore->temporary : &semaphore->permanent;
 
   switch (impl->type) {
   case ANV_SEMAPHORE_TYPE_BO:
@@ -1411,13 +1430,42 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  if (result != VK_SUCCESS)
 return result;
  break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ if (in_fence == -1) {
+in_fence = impl->fd;
+ } else {
+int merge = anv_gem_sync_file_merge(device, in_fence, impl->fd);
+if (merge == -1)
+   return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
+
+close(impl->fd);
+close(in_fence);
+in_fence = merge;
+ }
+
+ impl->fd = -1;
+ break;
+
   default:
  break;
   }
+
+  /* Waiting on a semaphore with temporary state implicitly resets it back
+   * to the permanent state.
+   */
+  if (semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE) {
+ assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_SYNC_FILE);
+ semaphore->temporary.type = ANV_SEMAPHORE_TYPE_NONE;
+  }
}
 
+   bool need_out_fence = false;
for (uint32_t i = 0; i < num_out_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
+  /* Out fences can't have temporary state because that would imply
+   * that we imported a sync file and are trying to signal it.
+   */
   assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
   struct anv_semaphore_impl *impl = &semaphore->permanent;
 
@@ -1428,17 +1476,55 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  if (result != VK_SUCCESS)
 return result;
  break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ need_out_fence = true;
+ break;
+
   default:
  break;
   }
}
 
-   result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
-   if (result != VK_SUCCESS)
-      return result;
+   if (cmd_buffer) {
+      result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
+      if (result != VK_SUCCESS)
+         return result;
+   } else {
+      setup_empty_execbuf(&execbuf, device);
+   }
+
+   if (in_fence != -1) {
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
+  execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
+   }
+
+   if (need_out_fence)
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_OUT;
 
    result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
+   /* Execbuf does not consume the in_fence.  It's our job to close it. */
+   close(in_fence);
+
+   if (result == VK_SUCCESS && need_out_fence) {
+  int out_fence = execbuf.execbuf.rsvd2 >> 32;
+  for (uint32_t i = 0; i < num_out_semaphores; i++) {
+ ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
+ /* Out fences can't have temporary state because that would imply
+  * that we imported a sync file and are trying to signal it.
+  */
+ assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
+ struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+ if (impl->type == 

[Mesa-dev] [HACK 19/21] anv: Set context priorities based on queue priorities

2017-04-14 Thread Jason Ekstrand
This patch will never be committed because Vulkan queue priorities are
supposed to be local to the device and not cross process boundaries.

---
 src/intel/vulkan/anv_device.c| 12 
 src/intel/vulkan/anv_gem.c   | 13 +
 src/intel/vulkan/anv_gem_stubs.c |  7 +++
 src/intel/vulkan/anv_private.h   |  2 ++
 4 files changed, 34 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 2885bb6..f853905 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1068,6 +1068,18 @@ VkResult anv_CreateDevice(
   goto fail_fd;
}
 
+   if (pCreateInfo->pQueueCreateInfos &&
+   pCreateInfo->pQueueCreateInfos->pQueuePriorities) {
+  float priority = *pCreateInfo->pQueueCreateInfos->pQueuePriorities;
+  int kernel_priority = 1023 * priority - 1023;
+  int ret = anv_gem_set_context_param(device, 6, kernel_priority);
+  if (ret == -1) {
+ result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
+"Setting I915_CONTEXT_PARAM_PRIORITY failed: %m");
+ goto fail_fd;
+  }
+   }
+
device->info = physical_device->info;
device->isl_dev = physical_device->isl_dev;
 
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index ffdc5a1..e331fbb 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -231,6 +231,19 @@ anv_gem_get_param(int fd, uint32_t param)
return 0;
 }
 
+int
+anv_gem_set_context_param(struct anv_device *device,
+  uint64_t param, uint64_t value)
+{
+   struct drm_i915_gem_context_param args = {
+  .ctx_id = device->context_id,
+  .param = param,
+  .value = value,
+   };
+
+   return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &args);
+}
+
 bool
 anv_gem_get_bit6_swizzle(int fd, uint32_t tiling)
 {
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index a63e96d..d93009f 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -126,6 +126,13 @@ anv_gem_get_param(int fd, uint32_t param)
unreachable("Unused");
 }
 
+int
+anv_gem_set_context_param(struct anv_device *device,
+  uint64_t param, uint64_t value)
+{
+   unreachable("Unused");
+}
+
 bool
 anv_gem_get_bit6_swizzle(int fd, uint32_t tiling)
 {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index a083a07..b99c93c 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -800,6 +800,8 @@ int anv_gem_set_caching(struct anv_device *device, uint32_t 
gem_handle, uint32_t
 int anv_gem_set_domain(struct anv_device *device, uint32_t gem_handle,
uint32_t read_domains, uint32_t write_domain);
 int anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2);
+int anv_gem_set_context_param(struct anv_device *device,
+  uint64_t param, uint64_t value);
 
 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, 
uint64_t size);
 
-- 
2.5.0.400.gff86faf
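
As a worked example of the arithmetic in anv_CreateDevice above: Vulkan queue priorities are floats in [0.0, 1.0], and the expression in the patch maps them linearly onto -1023..0, so a priority of 1.0 becomes 0 and a priority of 0.0 becomes -1023. The helper name below is hypothetical and the sketch only mirrors the expression from the diff; it makes no claim about which range the kernel accepts.

```c
#include <assert.h>

/* Same arithmetic as the patch: scale a [0.0, 1.0] queue priority onto
 * -1023..0.  The float-to-int conversion truncates toward zero, so a
 * priority of 0.5 gives -511 rather than -512. */
static int
sketch_kernel_priority(float priority)
{
   return (int)(1023 * priority - 1023);
}
```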



[Mesa-dev] [PATCH 15/21] anv: Pull the guts of cmd_buffer_execbuf into a helper

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 59 ++
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 3e9fa4c..136f273 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1250,22 +1250,19 @@ relocate_cmd_buffer(struct anv_cmd_buffer *cmd_buffer,
return true;
 }
 
-VkResult
-anv_cmd_buffer_execbuf(struct anv_device *device,
-   struct anv_cmd_buffer *cmd_buffer)
+static VkResult
+setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
+ struct anv_cmd_buffer *cmd_buffer)
 {
    struct anv_batch *batch = &cmd_buffer->batch;
    struct anv_block_pool *ss_pool =
       &cmd_buffer->device->surface_state_block_pool;
 
-   struct anv_execbuf execbuf;
-   anv_execbuf_init(&execbuf);
-
    adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
                                       cmd_buffer->last_ss_pool_center);
    VkResult result =
-      anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
-                         &device->alloc);
+      anv_execbuf_add_bo(execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
+                         &cmd_buffer->device->alloc);
if (result != VK_SUCCESS)
   return result;
 
@@ -1277,8 +1274,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
cmd_buffer->last_ss_pool_center);
 
-      result = anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
-                                  &device->alloc);
+      result = anv_execbuf_add_bo(execbuf, &(*bbo)->bo, &(*bbo)->relocs,
+                                  &cmd_buffer->device->alloc);
   if (result != VK_SUCCESS)
  return result;
}
@@ -1297,19 +1294,19 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 * corresponding to the first batch_bo in the chain with the last
 * element in the list.
 */
-   if (first_batch_bo->bo.index != execbuf.bo_count - 1) {
+   if (first_batch_bo->bo.index != execbuf->bo_count - 1) {
   uint32_t idx = first_batch_bo->bo.index;
-  uint32_t last_idx = execbuf.bo_count - 1;
+  uint32_t last_idx = execbuf->bo_count - 1;
 
-      struct drm_i915_gem_exec_object2 tmp_obj = execbuf.objects[idx];
-      assert(execbuf.bos[idx] == &first_batch_bo->bo);
+      struct drm_i915_gem_exec_object2 tmp_obj = execbuf->objects[idx];
+      assert(execbuf->bos[idx] == &first_batch_bo->bo);
 
-      execbuf.objects[idx] = execbuf.objects[last_idx];
-      execbuf.bos[idx] = execbuf.bos[last_idx];
-      execbuf.bos[idx]->index = idx;
+      execbuf->objects[idx] = execbuf->objects[last_idx];
+      execbuf->bos[idx] = execbuf->bos[last_idx];
+      execbuf->bos[idx]->index = idx;
 
-      execbuf.objects[last_idx] = tmp_obj;
-      execbuf.bos[last_idx] = &first_batch_bo->bo;
+      execbuf->objects[last_idx] = tmp_obj;
+      execbuf->bos[last_idx] = &first_batch_bo->bo;
   first_batch_bo->bo.index = last_idx;
}
 
@@ -1330,9 +1327,9 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   }
}
 
-   execbuf.execbuf = (struct drm_i915_gem_execbuffer2) {
-  .buffers_ptr = (uintptr_t) execbuf.objects,
-  .buffer_count = execbuf.bo_count,
+   execbuf->execbuf = (struct drm_i915_gem_execbuffer2) {
+  .buffers_ptr = (uintptr_t) execbuf->objects,
+  .buffer_count = execbuf->bo_count,
   .batch_start_offset = 0,
   .batch_len = batch->next - batch->start,
   .cliprects_ptr = 0,
@@ -1345,7 +1342,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   .rsvd2 = 0,
};
 
-   if (relocate_cmd_buffer(cmd_buffer, &execbuf)) {
+   if (relocate_cmd_buffer(cmd_buffer, execbuf)) {
   /* If we were able to successfully relocate everything, tell the kernel
* that it can skip doing relocations. The requirement for using
* NO_RELOC is:
@@ -1370,7 +1367,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
* the RENDER_SURFACE_STATE matches presumed_offset, so it should be
* safe for the kernel to relocate them as needed.
*/
-  execbuf.execbuf.flags |= I915_EXEC_NO_RELOC;
+  execbuf->execbuf.flags |= I915_EXEC_NO_RELOC;
} else {
   /* In the case where we fall back to doing kernel relocations, we need
* to ensure that the relocation list is valid.  All relocations on the
@@ -1385,6 +1382,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  cmd_buffer->surface_relocs.relocs[i].presumed_offset = -1;
}
 
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_cmd_buffer_execbuf(struct anv_device *device,
+   struct anv_cmd_buffer *cmd_buffer)
+{
+   struct anv_execbuf execbuf;
+   anv_execbuf_init(&execbuf);
+
+   VkResult result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
+   if (result != VK_SUCCESS)
+      return result;
+
    result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
  

[Mesa-dev] [PATCH 14/21] anv: Implement VK_KHX_external_semaphore

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_device.c   | 4 
 src/intel/vulkan/anv_entrypoints_gen.py | 1 +
 src/intel/vulkan/anv_queue.c| 8 
 3 files changed, 13 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 41e0fb3..b85cd40 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -374,6 +374,10 @@ static const VkExtensionProperties device_extensions[] = {
   .extensionName = VK_KHX_EXTERNAL_MEMORY_FD_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  .extensionName = VK_KHX_EXTERNAL_SEMAPHORE_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static void *
diff --git a/src/intel/vulkan/anv_entrypoints_gen.py 
b/src/intel/vulkan/anv_entrypoints_gen.py
index 5ad0f26..cfa9d68 100644
--- a/src/intel/vulkan/anv_entrypoints_gen.py
+++ b/src/intel/vulkan/anv_entrypoints_gen.py
@@ -48,6 +48,7 @@ SUPPORTED_EXTENSIONS = [
 'VK_KHX_external_memory',
 'VK_KHX_external_memory_capabilities',
 'VK_KHX_external_memory_fd',
+'VK_KHX_external_semaphore',
 'VK_KHX_external_semaphore_capabilities',
 ]
 
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 906eb25..64c5900 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -508,6 +508,14 @@ VkResult anv_CreateSemaphore(
if (semaphore == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
+   const VkExportSemaphoreCreateInfoKHX *export =
+      vk_find_struct_const(pCreateInfo->pNext, EXPORT_SEMAPHORE_CREATE_INFO_KHX);
+   VkExternalSemaphoreHandleTypeFlagsKHX handleTypes =
+      export ? export->handleTypes : 0;
+
+   /* External semaphores are not yet supported */
+   assert(handleTypes == 0);
+
    /* The DRM execbuffer ioctl always executes in-order, even between
 * different rings. As such, a dummy no-op semaphore is a perfectly
 * valid implementation.
-- 
2.5.0.400.gff86faf
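
The pNext lookup used in anv_CreateSemaphore above can be sketched as a plain linked-list walk. This is not Mesa's vk_find_struct_const implementation, just an illustration of the technique; the struct and function names are hypothetical, and the only assumption is the Vulkan convention that every chained struct begins with an sType tag followed by a pNext pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal header shared by every struct in a chain. */
struct sketch_base {
   int sType;
   const void *pNext;
};

/* Return the first struct in the chain whose sType matches, or NULL
 * if the chain ends without a match. */
static const void *
sketch_find_struct(const void *start, int sType)
{
   for (const struct sketch_base *s = start; s != NULL; s = s->pNext) {
      if (s->sType == sType)
         return s;
   }

   return NULL;
}
```

A NULL result corresponds to the "export ? export->handleTypes : 0" fallback in the patch: no export struct in the chain means no external handle types were requested.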



[Mesa-dev] [PATCH 12/21] anv: Add a real semaphore struct

2017-04-14 Thread Jason Ekstrand
It's just a dummy for now, but we'll flesh it out as needed for external
semaphores.
---
 src/intel/vulkan/anv_private.h | 28 
 src/intel/vulkan/anv_queue.c   | 32 ++--
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 898f0cf..5cbb0c5 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1706,6 +1706,33 @@ struct anv_event {
struct anv_state state;
 };
 
+enum anv_semaphore_type {
+   ANV_SEMAPHORE_TYPE_NONE = 0,
+   ANV_SEMAPHORE_TYPE_DUMMY
+};
+
+struct anv_semaphore_impl {
+   enum anv_semaphore_type type;
+};
+
+struct anv_semaphore {
+   /* Permanent semaphore state.  Every semaphore has some form of permanent
+* state (type != ANV_SEMAPHORE_TYPE_NONE).  This may be a BO to fence on
+* (for cross-process semaphores) or it could just be a dummy for use
+* internally.
+*/
+   struct anv_semaphore_impl permanent;
+
+   /* Temporary semaphore state.  A semaphore *may* have temporary state.
+* That state is added to the semaphore by an import operation and is reset
+* back to ANV_SEMAPHORE_TYPE_NONE when the semaphore is waited on.  A
+* semaphore with temporary state cannot be signaled because the semaphore
+* must already be signaled before the temporary state can be exported from
+* the semaphore in the other process and imported here.
+*/
+   struct anv_semaphore_impl temporary;
+};
+
 struct anv_shader_module {
 unsigned char sha1[20];
uint32_t size;
@@ -2314,6 +2341,7 @@ ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_pipeline_layout, 
VkPipelineLayout)
 ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_query_pool, VkQueryPool)
 ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_render_pass, VkRenderPass)
 ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_sampler, VkSampler)
+ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_semaphore, VkSemaphore)
 ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_shader_module, VkShaderModule)
 
 /* Gen-specific function declarations */
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 5a22ff7..f6ff41f 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -493,23 +493,43 @@ done:
 // Queue semaphore functions
 
 VkResult anv_CreateSemaphore(
-    VkDevice                                    device,
+    VkDevice                                    _device,
     const VkSemaphoreCreateInfo*                pCreateInfo,
     const VkAllocationCallbacks*                pAllocator,
     VkSemaphore*                                pSemaphore)
 {
-   /* The DRM execbuffer ioctl always execute in-oder, even between different
-* rings. As such, there's nothing to do for the user space semaphore.
+   ANV_FROM_HANDLE(anv_device, device, _device);
+   struct anv_semaphore *semaphore;
+
+   assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO);
+
+   semaphore = vk_alloc2(&device->alloc, pAllocator, sizeof(*semaphore), 8,
+ VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   if (semaphore == NULL)
+  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+   /* The DRM execbuffer ioctl always executes in-order, even between
+* different rings. As such, a dummy no-op semaphore is a perfectly
+* valid implementation.
 */
+   semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DUMMY;
+   semaphore->temporary.type = ANV_SEMAPHORE_TYPE_NONE;
 
-   *pSemaphore = (VkSemaphore)1;
+   *pSemaphore = anv_semaphore_to_handle(semaphore);
 
return VK_SUCCESS;
 }
 
 void anv_DestroySemaphore(
-    VkDevice                                    device,
-    VkSemaphore                                 semaphore,
+    VkDevice                                    _device,
+    VkSemaphore                                 _semaphore,
     const VkAllocationCallbacks*                pAllocator)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
+   ANV_FROM_HANDLE(anv_semaphore, semaphore, _semaphore);
+
+   if (semaphore == NULL)
+  return;
+
+   vk_free2(&device->alloc, pAllocator, semaphore);
 }
-- 
2.5.0.400.gff86faf
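
For readers following the permanent/temporary split introduced above, here
is a minimal sketch of how the two states could interact.  It is not code
from this series: sem_impl, sem_init, sem_active_impl and
sem_reset_temporary are made-up names, and the rule that a temporary payload
shadows the permanent one until it is consumed is taken from the Vulkan
external semaphore semantics rather than from this patch.

```c
#include <stddef.h>

/* Stand-ins for the patch's types; only NONE and DUMMY appear in the diff. */
enum sem_type { SEM_TYPE_NONE = 0, SEM_TYPE_DUMMY };

struct sem_impl { enum sem_type type; };

struct sem {
   struct sem_impl permanent;  /* state that survives across waits */
   struct sem_impl temporary;  /* imported state, consumed by one wait */
};

/* Hypothetical helper: mirrors what anv_CreateSemaphore sets up. */
static void sem_init(struct sem *s)
{
   s->permanent.type = SEM_TYPE_DUMMY;
   s->temporary.type = SEM_TYPE_NONE;
}

/* Hypothetical helper: a temporary payload, if present, shadows permanent. */
static struct sem_impl *sem_active_impl(struct sem *s)
{
   return s->temporary.type != SEM_TYPE_NONE ? &s->temporary : &s->permanent;
}

/* Hypothetical helper: once waited on, the temporary payload is dropped. */
static void sem_reset_temporary(struct sem *s)
{
   s->temporary.type = SEM_TYPE_NONE;
}
```

A freshly created semaphore resolves to its permanent (dummy) state; only an
import would populate the temporary slot, and a wait would clear it again.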



[Mesa-dev] [PATCH 09/21] anv: Use the BO cache for DeviceMemory allocations

2017-04-14 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c  | 27 ---
 src/intel/vulkan/anv_image.c   |  2 +-
 src/intel/vulkan/anv_intel.c   | 15 ++-
 src/intel/vulkan/anv_private.h |  4 +++-
 src/intel/vulkan/anv_wsi.c |  8 
 5 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a7ae6ce..eaf93b5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1124,10 +1124,14 @@ VkResult anv_CreateDevice(
 
    anv_bo_pool_init(&device->batch_bo_pool, device);
 
+   result = anv_bo_cache_init(&device->bo_cache);
+   if (result != VK_SUCCESS)
+  goto fail_batch_bo_pool;
+
    result = anv_block_pool_init(&device->dynamic_state_block_pool, device,
                                 16384);
if (result != VK_SUCCESS)
-  goto fail_batch_bo_pool;
+  goto fail_bo_cache;
 
    anv_state_pool_init(&device->dynamic_state_pool,
                        &device->dynamic_state_block_pool);
@@ -1199,6 +1203,8 @@ VkResult anv_CreateDevice(
  fail_dynamic_state_pool:
    anv_state_pool_finish(&device->dynamic_state_pool);
    anv_block_pool_finish(&device->dynamic_state_block_pool);
+ fail_bo_cache:
+   anv_bo_cache_finish(&device->bo_cache);
  fail_batch_bo_pool:
    anv_bo_pool_finish(&device->batch_bo_pool);
    pthread_cond_destroy(&device->queue_submit);
@@ -1246,6 +1252,8 @@ void anv_DestroyDevice(
    anv_state_pool_finish(&device->dynamic_state_pool);
    anv_block_pool_finish(&device->dynamic_state_block_pool);
 
+   anv_bo_cache_finish(&device->bo_cache);
+
    anv_bo_pool_finish(&device->batch_bo_pool);
 
    pthread_cond_destroy(&device->queue_submit);
@@ -1613,7 +1621,8 @@ VkResult anv_AllocateMemory(
/* The kernel is going to give us whole pages anyway */
uint64_t alloc_size = align_u64(pAllocateInfo->allocationSize, 4096);
 
-   result = anv_bo_init_new(&mem->bo, device, alloc_size);
+   result = anv_bo_cache_alloc(device, &device->bo_cache,
+                               alloc_size, &mem->bo);
if (result != VK_SUCCESS)
   goto fail;
 
@@ -1646,11 +1655,7 @@ void anv_FreeMemory(
if (mem->map)
   anv_UnmapMemory(_device, _mem);
 
-   if (mem->bo.map)
-  anv_gem_munmap(mem->bo.map, mem->bo.size);
-
-   if (mem->bo.gem_handle != 0)
-  anv_gem_close(device, mem->bo.gem_handle);
+   anv_bo_cache_release(device, &device->bo_cache, mem->bo);
 
    vk_free2(&device->alloc, pAllocator, mem);
 }
@@ -1672,7 +1677,7 @@ VkResult anv_MapMemory(
}
 
if (size == VK_WHOLE_SIZE)
-  size = mem->bo.size - offset;
+  size = mem->bo->size - offset;
 
/* From the Vulkan spec version 1.0.32 docs for MapMemory:
 *
@@ -1682,7 +1687,7 @@ VkResult anv_MapMemory(
 *equal to the size of the memory minus offset
 */
assert(size > 0);
-   assert(offset + size <= mem->bo.size);
+   assert(offset + size <= mem->bo->size);
 
/* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
 * takes a VkDeviceMemory pointer, it seems like only one map of the memory
@@ -1702,7 +1707,7 @@ VkResult anv_MapMemory(
/* Let's map whole pages */
map_size = align_u64(map_size, 4096);
 
-   void *map = anv_gem_mmap(device, mem->bo.gem_handle,
+   void *map = anv_gem_mmap(device, mem->bo->gem_handle,
 map_offset, map_size, gem_flags);
if (map == MAP_FAILED)
   return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
@@ -1854,7 +1859,7 @@ VkResult anv_BindBufferMemory(
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
 
if (mem) {
-      buffer->bo = &mem->bo;
+  buffer->bo = mem->bo;
   buffer->offset = memoryOffset;
} else {
   buffer->bo = NULL;
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index cf34dbe..4874f2f 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -341,7 +341,7 @@ VkResult anv_BindImageMemory(
   return VK_SUCCESS;
}
 
-   image->bo = &mem->bo;
+   image->bo = mem->bo;
image->offset = memoryOffset;
 
if (image->aux_surface.isl.size > 0) {
diff --git a/src/intel/vulkan/anv_intel.c b/src/intel/vulkan/anv_intel.c
index eda474e..991a935 100644
--- a/src/intel/vulkan/anv_intel.c
+++ b/src/intel/vulkan/anv_intel.c
@@ -49,18 +49,15 @@ VkResult anv_CreateDmaBufImageINTEL(
if (mem == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   uint32_t gem_handle = anv_gem_fd_to_handle(device, pCreateInfo->fd);
-   if (!gem_handle) {
-  result = vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
-  goto fail;
-   }
-
    uint64_t size = (uint64_t)pCreateInfo->strideInBytes * pCreateInfo->extent.height;
 
-   anv_bo_init(&mem->bo, gem_handle, size);
+   result = anv_bo_cache_import(device, &device->bo_cache,
+                                pCreateInfo->fd, size, &mem->bo);
+   if (result != VK_SUCCESS)
+  goto fail;
 
if (device->instance->physicalDevice.supports_48bit_addresses)
-  mem->bo.flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+  mem->bo->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
 

[Mesa-dev] [PATCH 13/21] anv: Implement VK_KHX_external_semaphore_capabilities

2017-04-14 Thread Jason Ekstrand
This just stubs things out.  Real external semaphore support will come
with VK_KHX_external_semaphore_fd.
---
 src/intel/vulkan/anv_device.c   |  4 
 src/intel/vulkan/anv_entrypoints_gen.py |  1 +
 src/intel/vulkan/anv_queue.c| 13 +
 3 files changed, 18 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 98b1868..41e0fb3 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -331,6 +331,10 @@ static const VkExtensionProperties global_extensions[] = {
   .extensionName = VK_KHX_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  .extensionName = VK_KHX_EXTERNAL_SEMAPHORE_CAPABILITIES_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static const VkExtensionProperties device_extensions[] = {
diff --git a/src/intel/vulkan/anv_entrypoints_gen.py b/src/intel/vulkan/anv_entrypoints_gen.py
index b4395c0..5ad0f26 100644
--- a/src/intel/vulkan/anv_entrypoints_gen.py
+++ b/src/intel/vulkan/anv_entrypoints_gen.py
@@ -48,6 +48,7 @@ SUPPORTED_EXTENSIONS = [
 'VK_KHX_external_memory',
 'VK_KHX_external_memory_capabilities',
 'VK_KHX_external_memory_fd',
+'VK_KHX_external_semaphore_capabilities',
 ]
 
 # We generate a static hash table for entry point lookup
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index f6ff41f..906eb25 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -533,3 +533,16 @@ void anv_DestroySemaphore(
 
    vk_free2(&device->alloc, pAllocator, semaphore);
 }
+
+void anv_GetPhysicalDeviceExternalSemaphorePropertiesKHX(
+    VkPhysicalDevice                            physicalDevice,
+    const VkPhysicalDeviceExternalSemaphoreInfoKHX* pExternalSemaphoreInfo,
+    VkExternalSemaphorePropertiesKHX*           pExternalSemaphoreProperties)
+{
+   switch (pExternalSemaphoreInfo->handleType) {
+   default:
+  pExternalSemaphoreProperties->exportFromImportedHandleTypes = 0;
+  pExternalSemaphoreProperties->compatibleHandleTypes = 0;
+  pExternalSemaphoreProperties->externalSemaphoreFeatures = 0;
+   }
+}
-- 
2.5.0.400.gff86faf
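
To make the effect of the stub concrete, here is a small sketch of what a
caller would observe.  The structs below are simplified stand-ins for the
Vulkan ones (only the field names are taken from the patch), and
ext_sem_supported is a hypothetical application-side helper, not part of the
driver.

```c
#include <stdint.h>

/* Simplified stand-ins for the Vulkan structs; field names follow the patch. */
struct ext_sem_info { uint32_t handleType; };

struct ext_sem_props {
   uint32_t exportFromImportedHandleTypes;
   uint32_t compatibleHandleTypes;
   uint32_t externalSemaphoreFeatures;
};

/* Same shape as the stub: every handle type falls through to default. */
static void get_ext_sem_props(const struct ext_sem_info *info,
                              struct ext_sem_props *props)
{
   switch (info->handleType) {
   default:
      props->exportFromImportedHandleTypes = 0;
      props->compatibleHandleTypes = 0;
      props->externalSemaphoreFeatures = 0;
   }
}

/* Hypothetical caller-side check: is this handle type usable at all? */
static int ext_sem_supported(uint32_t handle_type)
{
   struct ext_sem_info info = { .handleType = handle_type };
   struct ext_sem_props props;

   get_ext_sem_props(&info, &props);
   return props.externalSemaphoreFeatures != 0;
}
```

With the stub as posted, such a check reports "unsupported" for every handle
type, which matches the commit message: the extension is advertised but no
external handle type is usable yet.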



[Mesa-dev] [PATCH 10/21] anv: Implement VK_KHX_external_memory_fd

2017-04-14 Thread Jason Ekstrand
This commit just exposes the memory handle type.  There's interesting we
need to do here for images.  So long as the user doesn't set any crazy
environment variables such as INTEL_DEBUG=nohiz, all of the compression
formats etc. should "just work" at least for opaque handle types.

v2 (chadv):
  - Rebase.
  - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when
handleType == 0.
  - Move handleType-independency comments out of handleType-switch, in
vkGetPhysicalDeviceExternalBufferPropertiesKHX.  Reduces diff in
future dma_buf patches.

Co-authored-with: Chad Versace 
Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c   | 71 -
 src/intel/vulkan/anv_entrypoints_gen.py |  1 +
 src/intel/vulkan/anv_formats.c  | 59 +++
 3 files changed, 113 insertions(+), 18 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index eaf93b5..e891912 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -366,6 +366,10 @@ static const VkExtensionProperties device_extensions[] = {
   .extensionName = VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  .extensionName = VK_KHX_EXTERNAL_MEMORY_FD_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static void *
@@ -1600,7 +1604,7 @@ VkResult anv_AllocateMemory(
 {
ANV_FROM_HANDLE(anv_device, device, _device);
struct anv_device_memory *mem;
-   VkResult result;
+   VkResult result = VK_SUCCESS;
 
assert(pAllocateInfo->sType == VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO);
 
@@ -1618,19 +1622,36 @@ VkResult anv_AllocateMemory(
if (mem == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   /* The kernel is going to give us whole pages anyway */
-   uint64_t alloc_size = align_u64(pAllocateInfo->allocationSize, 4096);
-
-   result = anv_bo_cache_alloc(device, &device->bo_cache,
-                               alloc_size, &mem->bo);
-   if (result != VK_SUCCESS)
-  goto fail;
-
mem->type_index = pAllocateInfo->memoryTypeIndex;
-
mem->map = NULL;
mem->map_size = 0;
 
+   const VkImportMemoryFdInfoKHX *fd_info =
+  vk_find_struct_const(pAllocateInfo->pNext, IMPORT_MEMORY_FD_INFO_KHX);
+
+   /* The Vulkan spec permits handleType to be 0, in which case the struct is
+* ignored.
+*/
+   if (fd_info && fd_info->handleType) {
+  /* At the moment, we only support the OPAQUE_FD memory type which is
+   * just a GEM buffer.
+   */
+  assert(fd_info->handleType ==
+ VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHX);
+
+      result = anv_bo_cache_import(device, &device->bo_cache,
+                                   fd_info->fd, pAllocateInfo->allocationSize,
+                                   &mem->bo);
+      if (result != VK_SUCCESS)
+         goto fail;
+   } else {
+      result = anv_bo_cache_alloc(device, &device->bo_cache,
+                                  pAllocateInfo->allocationSize,
+                                  &mem->bo);
+  if (result != VK_SUCCESS)
+ goto fail;
+   }
+
*pMem = anv_device_memory_to_handle(mem);
 
return VK_SUCCESS;
@@ -1641,6 +1662,36 @@ VkResult anv_AllocateMemory(
return result;
 }
 
+VkResult anv_GetMemoryFdKHX(
+    VkDevice                                    device_h,
+    VkDeviceMemory                              memory_h,
+    VkExternalMemoryHandleTypeFlagBitsKHX       handleType,
+    int*                                        pFd)
+{
+   ANV_FROM_HANDLE(anv_device, dev, device_h);
+   ANV_FROM_HANDLE(anv_device_memory, mem, memory_h);
+
+   /* We support only one handle type. */
+   assert(handleType == VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHX);
+
+   return anv_bo_cache_export(dev, &dev->bo_cache, mem->bo, pFd);
+}
+
+VkResult anv_GetMemoryFdPropertiesKHX(
+    VkDevice                                    device_h,
+    VkExternalMemoryHandleTypeFlagBitsKHX       handleType,
+    int                                         fd,
+    VkMemoryFdPropertiesKHX*                    pMemoryFdProperties)
+{
+   /* The valid usage section for this function says:
+*
+*"handleType must not be one of the handle types defined as opaque."
+*
+* Since we only handle opaque handles for now, there are no FD properties.
+*/
+   return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX;
+}
+
 void anv_FreeMemory(
     VkDevice                                    _device,
     VkDeviceMemory                              _mem,
diff --git a/src/intel/vulkan/anv_entrypoints_gen.py b/src/intel/vulkan/anv_entrypoints_gen.py
index 400b567..b4395c0 100644
--- a/src/intel/vulkan/anv_entrypoints_gen.py
+++ b/src/intel/vulkan/anv_entrypoints_gen.py
@@ -47,6 +47,7 @@ SUPPORTED_EXTENSIONS = [
 'VK_KHR_xlib_surface',
 'VK_KHX_external_memory',
 'VK_KHX_external_memory_capabilities',
+'VK_KHX_external_memory_fd',
 ]
 
 # We generate a static 

[Mesa-dev] [PATCH 08/21] anv/allocator: Add a BO cache

2017-04-14 Thread Jason Ekstrand
This cache allows us to easily ensure that we have a unique anv_bo for
each gem handle.  We'll need this in order to support multiple-import of
memory objects and semaphores.

v2 (Jason Ekstrand):
 - Reject BO imports if the size doesn't match the prime fd size as
   reported by lseek().
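
As an illustration of the v2 size check, the sketch below validates a claimed
size against what lseek() reports for a file descriptor.  It is only a sketch
under stated assumptions: align_up_u64, import_size_matches and
demo_import_check are hypothetical names, and an ordinary temporary file
stands in for a prime fd so the example can run without a GPU.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint64_t align_up_u64(uint64_t v, uint64_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: returns 1 if the fd is exactly as large as the
 * page-aligned claimed size, 0 otherwise.
 */
static int import_size_matches(int fd, uint64_t claimed_size)
{
   uint64_t size = align_up_u64(claimed_size, 4096);
   off_t real = lseek(fd, 0, SEEK_END);

   if (real == (off_t)-1)
      return 0;

   return (uint64_t)real == size;
}

/* Demo on a temp file standing in for a prime fd; returns -1 if the
 * temporary file could not be set up.
 */
static int demo_import_check(uint64_t file_size, uint64_t claimed_size)
{
   FILE *f = tmpfile();
   if (!f)
      return -1;

   int ok = -1;
   if (ftruncate(fileno(f), (off_t)file_size) == 0)
      ok = import_size_matches(fileno(f), claimed_size);

   fclose(f);
   return ok;
}
```

Because the claimed size is rounded up to whole pages before the comparison,
a client asking for 5000 bytes of an 8192-byte buffer passes, while one
claiming 4096 bytes for the same buffer is rejected.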
---
 src/intel/vulkan/anv_allocator.c   | 257 +
 src/intel/vulkan/anv_private.h |  21 ++
 .../drivers/dri/i965/brw_nir_trig_workarounds.c| 191 +++
 3 files changed, 469 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.c

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 697309f..4ab5f60 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -34,6 +34,8 @@
 
 #include "anv_private.h"
 
+#include "util/hash_table.h"
+
 #ifdef HAVE_VALGRIND
 #define VG_NOACCESS_READ(__ptr) ({   \
VALGRIND_MAKE_MEM_DEFINED((__ptr), sizeof(*(__ptr))); \
@@ -1004,3 +1006,258 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool,
 
    return &bo->bo;
 }
+
+struct anv_cached_bo {
+   struct anv_bo bo;
+
+   uint32_t refcount;
+};
+
+VkResult
+anv_bo_cache_init(struct anv_bo_cache *cache)
+{
+   cache->bo_map = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+   _mesa_key_pointer_equal);
+   if (!cache->bo_map)
+  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+   if (pthread_mutex_init(&cache->mutex, NULL)) {
+      _mesa_hash_table_destroy(cache->bo_map, NULL);
+      return vk_errorf(VK_ERROR_OUT_OF_HOST_MEMORY,
+                       "pthread_mutex_init failed: %m");
+   }
+
+   return VK_SUCCESS;
+}
+
+void
+anv_bo_cache_finish(struct anv_bo_cache *cache)
+{
+   _mesa_hash_table_destroy(cache->bo_map, NULL);
+   pthread_mutex_destroy(&cache->mutex);
+}
+
+static struct anv_cached_bo *
+anv_bo_cache_lookup_locked(struct anv_bo_cache *cache, uint32_t gem_handle)
+{
+   struct hash_entry *entry =
+  _mesa_hash_table_search(cache->bo_map,
+  (const void *)(uintptr_t)gem_handle);
+   if (!entry)
+  return NULL;
+
+   struct anv_cached_bo *bo = (struct anv_cached_bo *)entry->data;
+   assert(bo->bo.gem_handle == gem_handle);
+
+   return bo;
+}
+
+static struct anv_bo *
+anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle)
+{
+   pthread_mutex_lock(&cache->mutex);
+
+   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
+
+   pthread_mutex_unlock(&cache->mutex);
+
+   return &bo->bo;
+}
+
+VkResult
+anv_bo_cache_alloc(struct anv_device *device,
+   struct anv_bo_cache *cache,
+   uint64_t size, struct anv_bo **bo_out)
+{
+   struct anv_cached_bo *bo =
+      vk_alloc(&device->alloc, sizeof(struct anv_cached_bo), 8,
+               VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   if (!bo)
+  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+   bo->refcount = 1;
+
+   /* The kernel is going to give us whole pages anyway */
+   size = align_u64(size, 4096);
+
+   VkResult result = anv_bo_init_new(&bo->bo, device, size);
+   if (result != VK_SUCCESS) {
+      vk_free(&device->alloc, bo);
+  return result;
+   }
+
+   assert(bo->bo.gem_handle);
+
+   pthread_mutex_lock(&cache->mutex);
+
+   _mesa_hash_table_insert(cache->bo_map,
+                           (void *)(uintptr_t)bo->bo.gem_handle, bo);
+
+   pthread_mutex_unlock(&cache->mutex);
+
+   *bo_out = &bo->bo;
+
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_bo_cache_import(struct anv_device *device,
+struct anv_bo_cache *cache,
+int fd, uint64_t size, struct anv_bo **bo_out)
+{
+   pthread_mutex_lock(&cache->mutex);
+
+   /* The kernel is going to give us whole pages anyway */
+   size = align_u64(size, 4096);
+
+   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
+   if (!gem_handle) {
+      pthread_mutex_unlock(&cache->mutex);
+  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
+   }
+
+   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
+   if (bo) {
+  if (bo->bo.size != size) {
+         pthread_mutex_unlock(&cache->mutex);
+         return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
+      }
+      __sync_fetch_and_add(&bo->refcount, 1);
+   } else {
+  /* For security purposes, we reject BO imports where the size does not
+   * match exactly.  This prevents a malicious client from passing a
+   * buffer to a trusted client, lying about the size, and telling the
+   * trusted client to try and texture from an image that goes
+   * out-of-bounds.  This sort of thing could lead to GPU hangs or worse
+   * in the trusted client.  The trusted client can protect itself against
+   * this sort of attack but only if it can trust the buffer size.
+   */
+  off_t import_size = lseek(fd, 0, SEEK_END);
+  if (import_size == (off_t)-1 || import_size != size) {
+ anv_gem_close(device, 

[Mesa-dev] [PATCH 11/21] anv: Move queues, events, and semaphores to their own file

2017-04-14 Thread Jason Ekstrand
Things are about to get more complicated, especially as far as
semaphores are concerned.

Reviewed-by: Chad Versace 
---
 src/intel/Makefile.sources|   1 +
 src/intel/vulkan/anv_device.c | 484 ---
 src/intel/vulkan/anv_queue.c  | 515 ++
 3 files changed, 516 insertions(+), 484 deletions(-)
 create mode 100644 src/intel/vulkan/anv_queue.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index d7bc09e..c64a5f2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -202,6 +202,7 @@ VULKAN_FILES := \
vulkan/anv_pipeline.c \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
+   vulkan/anv_queue.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e891912..98b1868 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -981,62 +981,6 @@ anv_device_init_border_colors(struct anv_device *device)
 border_colors);
 }
 
-VkResult
-anv_device_submit_simple_batch(struct anv_device *device,
-   struct anv_batch *batch)
-{
-   struct drm_i915_gem_execbuffer2 execbuf;
-   struct drm_i915_gem_exec_object2 exec2_objects[1];
-   struct anv_bo bo, *exec_bos[1];
-   VkResult result = VK_SUCCESS;
-   uint32_t size;
-
-   /* Kernel driver requires 8 byte aligned batch length */
-   size = align_u32(batch->next - batch->start, 8);
-   result = anv_bo_pool_alloc(&device->batch_bo_pool, &bo, size);
-   if (result != VK_SUCCESS)
-  return result;
-
-   memcpy(bo.map, batch->start, size);
-   if (!device->info.has_llc)
-  anv_flush_range(bo.map, size);
-
-   exec_bos[0] = &bo;
-   exec2_objects[0].handle = bo.gem_handle;
-   exec2_objects[0].relocation_count = 0;
-   exec2_objects[0].relocs_ptr = 0;
-   exec2_objects[0].alignment = 0;
-   exec2_objects[0].offset = bo.offset;
-   exec2_objects[0].flags = 0;
-   exec2_objects[0].rsvd1 = 0;
-   exec2_objects[0].rsvd2 = 0;
-
-   execbuf.buffers_ptr = (uintptr_t) exec2_objects;
-   execbuf.buffer_count = 1;
-   execbuf.batch_start_offset = 0;
-   execbuf.batch_len = size;
-   execbuf.cliprects_ptr = 0;
-   execbuf.num_cliprects = 0;
-   execbuf.DR1 = 0;
-   execbuf.DR4 = 0;
-
-   execbuf.flags =
-  I915_EXEC_HANDLE_LUT | I915_EXEC_NO_RELOC | I915_EXEC_RENDER;
-   execbuf.rsvd1 = device->context_id;
-   execbuf.rsvd2 = 0;
-
-   result = anv_device_execbuf(device, &execbuf, exec_bos);
-   if (result != VK_SUCCESS)
-  goto fail;
-
-   result = anv_device_wait(device, &bo, INT64_MAX);
-
- fail:
-   anv_bo_pool_free(&device->batch_bo_pool, &bo);
-
-   return result;
-}
-
 VkResult anv_CreateDevice(
     VkPhysicalDevice                            physicalDevice,
     const VkDeviceCreateInfo*                   pCreateInfo,
@@ -1350,26 +1294,6 @@ void anv_GetDeviceQueue(
 }
 
 VkResult
-anv_device_execbuf(struct anv_device *device,
-   struct drm_i915_gem_execbuffer2 *execbuf,
-   struct anv_bo **execbuf_bos)
-{
-   int ret = anv_gem_execbuffer(device, execbuf);
-   if (ret != 0) {
-  /* We don't know the real error. */
-  device->lost = true;
-  return vk_errorf(VK_ERROR_DEVICE_LOST, "execbuf2 failed: %m");
-   }
-
-   struct drm_i915_gem_exec_object2 *objects =
-  (void *)(uintptr_t)execbuf->buffers_ptr;
-   for (uint32_t k = 0; k < execbuf->buffer_count; k++)
-  execbuf_bos[k]->offset = objects[k].offset;
-
-   return VK_SUCCESS;
-}
-
-VkResult
 anv_device_query_status(struct anv_device *device)
 {
/* This isn't likely as most of the callers of this function already check
@@ -1446,119 +1370,6 @@ anv_device_wait(struct anv_device *device, struct anv_bo *bo,
return anv_device_query_status(device);
 }
 
-VkResult anv_QueueSubmit(
-    VkQueue                                     _queue,
-    uint32_t                                    submitCount,
-    const VkSubmitInfo*                         pSubmits,
-    VkFence                                     _fence)
-{
-   ANV_FROM_HANDLE(anv_queue, queue, _queue);
-   ANV_FROM_HANDLE(anv_fence, fence, _fence);
-   struct anv_device *device = queue->device;
-
-   /* Query for device status prior to submitting.  Technically, we don't need
-* to do this.  However, if we have a client that's submitting piles of
-* garbage, we would rather break as early as possible to keep the GPU
-* hanging contained.  If we don't check here, we'll either be waiting for
-* the kernel to kick us or we'll have to wait until the client waits on a
-* fence before we actually know whether or not we've hung.
-*/
-   VkResult result = anv_device_query_status(device);
-   if (result != VK_SUCCESS)
-  return result;
-
-   /* We lock around QueueSubmit for three main reasons:
-*
-*  1) When a block pool is 

[Mesa-dev] [PATCH 07/21] anv: Implement VK_KHX_external_memory

2017-04-14 Thread Jason Ekstrand
This is the trivial implementation that just exposes the extension
string but exposes zero external handle types.

Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c   | 4 
 src/intel/vulkan/anv_entrypoints_gen.py | 1 +
 2 files changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d8de707..a7ae6ce 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -362,6 +362,10 @@ static const VkExtensionProperties device_extensions[] = {
   .extensionName = VK_KHR_INCREMENTAL_PRESENT_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  .extensionName = VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static void *
diff --git a/src/intel/vulkan/anv_entrypoints_gen.py b/src/intel/vulkan/anv_entrypoints_gen.py
index 245d6d0..400b567 100644
--- a/src/intel/vulkan/anv_entrypoints_gen.py
+++ b/src/intel/vulkan/anv_entrypoints_gen.py
@@ -45,6 +45,7 @@ SUPPORTED_EXTENSIONS = [
 'VK_KHR_wayland_surface',
 'VK_KHR_xcb_surface',
 'VK_KHR_xlib_surface',
+'VK_KHX_external_memory',
 'VK_KHX_external_memory_capabilities',
 ]
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 06/21] anv: Implement VK_KHX_external_memory_capabilities

2017-04-14 Thread Jason Ekstrand
From: Chad Versace 

This is a complete but trivial implementation. It's trivial because we
support no external memory capabilities yet.  Most of the real work in
this commit is in reworking the UUIDs advertised by the driver.

v2 (chadv):
  - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR.
Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of
input structs, not the chain of output structs.
  - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the
input chain and the output chain separately. Reduces diff in future
dma_buf patches.

Co-authored-with: Jason Ekstrand 
Reviewed-by: Chad Versace 
Reviewed-by: Jason Ekstrand 
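
The v2 note about chain traversal is easier to follow with a small model of a
pNext chain.  The sketch below is not the driver's helper: base_in,
external_info, the STYPE_* values and find_struct_const are hypothetical
stand-ins that only mimic the "sType first, pNext second" layout shared by
chainable Vulkan structs.

```c
#include <stddef.h>
#include <stdint.h>

/* Every chainable struct starts with these two fields, as in Vulkan. */
struct base_in {
   uint32_t sType;
   const void *pNext;
};

/* Hypothetical sType values and extension struct, for illustration only. */
enum { STYPE_FORMAT_INFO = 1, STYPE_EXTERNAL_INFO = 2 };

struct external_info {
   uint32_t sType;
   const void *pNext;
   uint32_t handleType;
};

/* Walk a const (input) chain and return the first struct of the given
 * type, or NULL if the chain does not contain one.
 */
static const void *find_struct_const(const void *start, uint32_t sType)
{
   for (const struct base_in *s = start; s != NULL; s = s->pNext) {
      if (s->sType == sType)
         return s;
   }
   return NULL;
}
```

The bug fixed in v2 corresponds to calling such a lookup on the output chain:
the external-image query struct is something the application passes in, so it
can only ever be found by walking the input chain.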
---
 src/intel/vulkan/anv_device.c   | 52 ---
 src/intel/vulkan/anv_entrypoints_gen.py |  1 +
 src/intel/vulkan/anv_formats.c  | 75 +
 src/intel/vulkan/anv_private.h  |  2 +
 4 files changed, 116 insertions(+), 14 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 0a67414..d8de707 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -116,6 +116,9 @@ anv_physical_device_init_uuids(struct anv_physical_device *device)
uint8_t sha1[20];
STATIC_ASSERT(VK_UUID_SIZE <= sizeof(sha1));
 
+   /* The pipeline cache UUID is used for determining when a pipeline cache is
+* invalid.  It needs both a driver build and the PCI ID of the device.
+*/
    _mesa_sha1_init(&sha1_ctx);
    _mesa_sha1_update(&sha1_ctx, build_id_data(note), build_id_len);
    _mesa_sha1_update(&sha1_ctx, &device->chipset_id,
@@ -123,6 +126,27 @@ anv_physical_device_init_uuids(struct anv_physical_device *device)
    _mesa_sha1_final(&sha1_ctx, sha1);
memcpy(device->pipeline_cache_uuid, sha1, VK_UUID_SIZE);
 
+   /* The driver UUID is used for determining sharability of images and memory
+* between two Vulkan instances in separate processes.  People who want to
+* share memory need to also check the device UUID (below) so all this
+* needs to be is the build-id.
+*/
+   memcpy(device->driver_uuid, build_id_data(note), VK_UUID_SIZE);
+
+   /* The device UUID uniquely identifies the given device within the machine.
+* Since we never have more than one device, this doesn't need to be a real
+* UUID.  However, on the off-chance that someone tries to use this to
+* cache pre-tiled images or something of the like, we use the PCI ID and
+* some bits of ISL info to ensure that this is safe.
+*/
+   _mesa_sha1_init(&sha1_ctx);
+   _mesa_sha1_update(&sha1_ctx, &device->chipset_id,
+                     sizeof(device->chipset_id));
+   _mesa_sha1_update(&sha1_ctx, &device->isl_dev.has_bit6_swizzling,
+                     sizeof(device->isl_dev.has_bit6_swizzling));
+   _mesa_sha1_final(&sha1_ctx, sha1);
+   memcpy(device->device_uuid, sha1, VK_UUID_SIZE);
+
return VK_SUCCESS;
 }
 
@@ -209,10 +233,6 @@ anv_physical_device_init(struct anv_physical_device *device,
 
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
 
-   result = anv_physical_device_init_uuids(device);
-   if (result != VK_SUCCESS)
-  goto fail;
-
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
/* GENs prior to 8 do not support EU/Subslice info */
@@ -252,14 +272,18 @@ anv_physical_device_init(struct anv_physical_device *device,
device->compiler->shader_debug_log = compiler_debug_log;
device->compiler->shader_perf_log = compiler_perf_log;
 
+   isl_device_init(&device->isl_dev, &device->info, swizzled);
+
+   result = anv_physical_device_init_uuids(device);
+   if (result != VK_SUCCESS)
+  goto fail;
+
result = anv_init_wsi(device);
if (result != VK_SUCCESS) {
   ralloc_free(device->compiler);
   goto fail;
}
 
-   isl_device_init(&device->isl_dev, &device->info, swizzled);
-
device->local_fd = fd;
return VK_SUCCESS;
 
@@ -303,6 +327,10 @@ static const VkExtensionProperties global_extensions[] = {
   .extensionName = VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
   .specVersion = 1,
},
+   {
+  .extensionName = VK_KHX_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static const VkExtensionProperties device_extensions[] = {
@@ -729,6 +757,8 @@ void anv_GetPhysicalDeviceProperties2KHR(
     VkPhysicalDevice                            physicalDevice,
     VkPhysicalDeviceProperties2KHR*             pProperties)
 {
+   ANV_FROM_HANDLE(anv_physical_device, pdevice, physicalDevice);
+
    anv_GetPhysicalDeviceProperties(physicalDevice, &pProperties->properties);
 
vk_foreach_struct(ext, pProperties->pNext) {
@@ -741,6 +771,16 @@ void anv_GetPhysicalDeviceProperties2KHR(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES_KHX: {
+ VkPhysicalDeviceIDPropertiesKHX *id_props =
+(VkPhysicalDeviceIDPropertiesKHX 

[Mesa-dev] [PATCH 04/21] anv: Refactor device_get_cache_uuid into physical_device_init_uuids

2017-04-14 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 079b0c5..ad10531 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -97,16 +97,20 @@ anv_compute_heap_size(int fd, uint64_t *heap_size)
return VK_SUCCESS;
 }
 
-static bool
-anv_device_get_cache_uuid(void *uuid, uint16_t pci_id)
+static VkResult
+anv_physical_device_init_uuids(struct anv_physical_device *device)
 {
const struct build_id_note *note = build_id_find_nhdr("libvulkan_intel.so");
-   if (!note)
-  return false;
+   if (!note) {
+  return vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
+   "Failed to find build-id");
+   }
 
unsigned build_id_len = build_id_length(note);
-   if (build_id_len < 20) /* It should be a SHA-1 */
-  return false;
+   if (build_id_len < 20) {
+  return vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
+   "build-id too short.  It needs to be a SHA");
+   }
 
struct mesa_sha1 sha1_ctx;
uint8_t sha1[20];
@@ -114,11 +118,12 @@ anv_device_get_cache_uuid(void *uuid, uint16_t pci_id)
 
    _mesa_sha1_init(&sha1_ctx);
    _mesa_sha1_update(&sha1_ctx, build_id_data(note), build_id_len);
-   _mesa_sha1_update(&sha1_ctx, &pci_id, sizeof(pci_id));
+   _mesa_sha1_update(&sha1_ctx, &device->chipset_id,
+                     sizeof(device->chipset_id));
    _mesa_sha1_final(&sha1_ctx, sha1);
+   memcpy(device->uuid, sha1, VK_UUID_SIZE);
 
-   memcpy(uuid, sha1, VK_UUID_SIZE);
-   return true;
+   return VK_SUCCESS;
 }
 
 static VkResult
@@ -204,11 +209,10 @@ anv_physical_device_init(struct anv_physical_device *device,
 
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
 
-   if (!anv_device_get_cache_uuid(device->uuid, device->chipset_id)) {
-  result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
- "cannot generate UUID");
+   result = anv_physical_device_init_uuids(device);
+   if (result != VK_SUCCESS)
   goto fail;
-   }
+
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
/* GENs prior to 8 do not support EU/Subslice info */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 01/21] anv: Add the pci_id into the shader cache UUID

2017-04-14 Thread Jason Ekstrand
This prevents a user from using a cache created on one hardware
generation on a different one.  Of course, with Intel hardware, this
requires moving their drive from one machine to another but it's still
possible and we should prevent it.

Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 35ef4c4..7a25ee9 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -34,6 +34,7 @@
 #include "util/strtod.h"
 #include "util/debug.h"
 #include "util/build_id.h"
+#include "util/mesa-sha1.h"
 #include "util/vk_util.h"
 
 #include "genxml/gen7_pack.h"
@@ -97,17 +98,26 @@ anv_compute_heap_size(int fd, uint64_t *heap_size)
 }
 
 static bool
-anv_device_get_cache_uuid(void *uuid)
+anv_device_get_cache_uuid(void *uuid, uint16_t pci_id)
 {
const struct build_id_note *note = build_id_find_nhdr("libvulkan_intel.so");
if (!note)
   return false;
 
-   unsigned len = build_id_length(note);
-   if (len < VK_UUID_SIZE)
+   unsigned build_id_len = build_id_length(note);
+   if (build_id_len < 20) /* It should be a SHA-1 */
   return false;
 
-   memcpy(uuid, build_id_data(note), VK_UUID_SIZE);
+   struct mesa_sha1 sha1_ctx;
+   uint8_t sha1[20];
+   STATIC_ASSERT(VK_UUID_SIZE <= sizeof(sha1));
+
+   _mesa_sha1_init(&sha1_ctx);
+   _mesa_sha1_update(&sha1_ctx, build_id_data(note), build_id_len);
+   _mesa_sha1_update(&sha1_ctx, &pci_id, sizeof(pci_id));
+   _mesa_sha1_final(&sha1_ctx, sha1);
+
+   memcpy(uuid, sha1, VK_UUID_SIZE);
return true;
 }
 
@@ -192,7 +202,7 @@ anv_physical_device_init(struct anv_physical_device *device,
if (result != VK_SUCCESS)
   goto fail;
 
-   if (!anv_device_get_cache_uuid(device->uuid)) {
+   if (!anv_device_get_cache_uuid(device->uuid, device->chipset_id)) {
   result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
  "cannot generate UUID");
   goto fail;
-- 
2.5.0.400.gff86faf
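
The idea of the patch, hashing the build-id together with the PCI ID so that
either one changing invalidates the cache, can be shown in a few lines.  The
sketch below is not the driver code: it swaps in FNV-1a for SHA-1 to stay
free of dependencies, shortens the UUID to 8 bytes (VK_UUID_SIZE is 16), and
the names fnv_ctx, fnv_init, fnv_update and cache_uuid are made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_UUID_SIZE 8  /* VK_UUID_SIZE is 16; shortened for this sketch */

/* FNV-1a, used here only as a dependency-free stand-in for SHA-1. */
struct fnv_ctx { uint64_t h; };

static void fnv_init(struct fnv_ctx *c)
{
   c->h = 0xcbf29ce484222325ULL;  /* FNV-1a 64-bit offset basis */
}

static void fnv_update(struct fnv_ctx *c, const void *data, size_t len)
{
   const uint8_t *p = data;
   for (size_t i = 0; i < len; i++) {
      c->h ^= p[i];
      c->h *= 0x100000001b3ULL;   /* FNV-1a 64-bit prime */
   }
}

/* Hypothetical helper: mix the build-id bytes and the PCI ID into one UUID,
 * following the init/update/update/final shape used in the patch.
 */
static void cache_uuid(uint8_t uuid[SKETCH_UUID_SIZE],
                       const void *build_id, size_t build_id_len,
                       uint16_t pci_id)
{
   struct fnv_ctx ctx;

   fnv_init(&ctx);
   fnv_update(&ctx, build_id, build_id_len);
   fnv_update(&ctx, &pci_id, sizeof(pci_id));
   memcpy(uuid, &ctx.h, SKETCH_UUID_SIZE);
}
```

Two calls with the same build-id but different PCI IDs yield different UUIDs,
which is exactly the property that stops a cache written on one hardware
generation from being loaded on another.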



[Mesa-dev] [PATCH 05/21] anv/physical_device: Rename uuid to pipeline_cache_uuid

2017-04-14 Thread Jason Ekstrand
We're about to have more UUIDs for different things so this one really
needs to be properly labeled.

Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_device.c | 5 +++--
 src/intel/vulkan/anv_pipeline_cache.c | 4 ++--
 src/intel/vulkan/anv_private.h| 2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index ad10531..0a67414 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -121,7 +121,7 @@ anv_physical_device_init_uuids(struct anv_physical_device *device)
    _mesa_sha1_update(&sha1_ctx, &device->chipset_id,
                      sizeof(device->chipset_id));
    _mesa_sha1_final(&sha1_ctx, sha1);
-   memcpy(device->uuid, sha1, VK_UUID_SIZE);
+   memcpy(device->pipeline_cache_uuid, sha1, VK_UUID_SIZE);
 
return VK_SUCCESS;
 }
@@ -721,7 +721,8 @@ void anv_GetPhysicalDeviceProperties(
};
 
strcpy(pProperties->deviceName, pdevice->name);
-   memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
+   memcpy(pProperties->pipelineCacheUUID,
+  pdevice->pipeline_cache_uuid, VK_UUID_SIZE);
 }
 
 void anv_GetPhysicalDeviceProperties2KHR(
diff --git a/src/intel/vulkan/anv_pipeline_cache.c b/src/intel/vulkan/anv_pipeline_cache.c
index cdd8215..3cfe3ec 100644
--- a/src/intel/vulkan/anv_pipeline_cache.c
+++ b/src/intel/vulkan/anv_pipeline_cache.c
@@ -351,7 +351,7 @@ anv_pipeline_cache_load(struct anv_pipeline_cache *cache,
   return;
if (header.device_id != device->chipset_id)
   return;
-   if (memcmp(header.uuid, pdevice->uuid, VK_UUID_SIZE) != 0)
+   if (memcmp(header.uuid, pdevice->pipeline_cache_uuid, VK_UUID_SIZE) != 0)
   return;
 
const void *end = data + size;
@@ -498,7 +498,7 @@ VkResult anv_GetPipelineCacheData(
header->header_version = VK_PIPELINE_CACHE_HEADER_VERSION_ONE;
header->vendor_id = 0x8086;
header->device_id = device->chipset_id;
-   memcpy(header->uuid, pdevice->uuid, VK_UUID_SIZE);
+   memcpy(header->uuid, pdevice->pipeline_cache_uuid, VK_UUID_SIZE);
p += align_u32(header->header_size, 8);
 
uint32_t *count = p;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 1f12a59..2fb0019 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -630,7 +630,7 @@ struct anv_physical_device {
     uint32_t                                    eu_total;
     uint32_t                                    subslice_total;
 
-    uint8_t                                     uuid[VK_UUID_SIZE];
+    uint8_t                                     pipeline_cache_uuid[VK_UUID_SIZE];
 
     struct wsi_device                           wsi_device;
     int                                         local_fd;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 00/21] anv: Add support for VK_KHX_external*

2017-04-14 Thread Jason Ekstrand
This patch series adds support for a bunch of the VK_KHX_external
extensions.  This is mostly a re-send but there are a few bugfixes tucked
in here and there are also some new patches.  Changes of note:

 1) It's been freshly rebased on master

 2) The BO cache has undergone quite a few bugfixes.

 3) We're now setting EXEC_OBJECT_ASYNC on almost everything.

 4) Patches have been added to implement external semaphores using DRM sync
objects as created by Dave Airlie.

The only non-new patch that has undergone extensive changes (beyond just
fixing rebase issues) is the BO cache patch.

The last two patches in this series are marked RFC because they add support
for using the new DRM sync object API from Dave Airlie.  I think I'm
relatively happy with the kernel API but would like to give the kernel
people a chance to chip in before we commit to it.  Hopefully, we can get
the sync object API and its semantics nailed down soon.

The series has also undergone significantly better testing.  I have written
a new crucible test (that I will push later today after cleaning it up
a bit) which seems to do a pretty good job of testing this stuff.  It then
took me a while to get the crucible test to fail because the kernel
currently works on a first-come-first-served model so zero synchronization
is needed in order to get the proper Vulkan behavior.  Thanks to Chris'
kernel series to add support for context priorities and the patch labeled
"HACK" in this series, I was able to force one of the two contexts in my
test to run at significantly lower priority and things actually started
executing out of sync.  Once I finally had a test that failed, I was able
to prove that the patches work. :-)

In order to test it properly, you will need my drm-syncobj3 kernel branch
which contains patches from Chris Wilson, Dave Airlie, and myself.  It can
be found here:

https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj3

This series can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external

I now consider this stuff to be in good enough shape to merge.  I intend to
do so as soon as it is reviewed and the 17.1 branch point is past.

Cc: Chad Versace 
Cc: Dave Airlie 
Cc: Chris Wilson 
Cc: Daniel Vetter 

Chad Versace (1):
  anv: Implement VK_KHX_external_memory_capabilities

Jason Ekstrand (20):
  anv: Add the pci_id into the shader cache UUID
  anv/cmd_buffer: Use the device allocator for QueueSubmit
  anv: Set EXEC_OBJECT_ASYNC when available
  anv: Refactor device_get_cache_uuid into physical_device_init_uuids
  anv/physical_device: Rename uuid to pipeline_cache_uuid
  anv: Implement VK_KHX_external_memory
  anv/allocator: Add a BO cache
  anv: Use the BO cache for DeviceMemory allocations
  anv: Implement VK_KHX_external_memory_fd
  anv: Move queues, events, and semaphores to their own file
  anv: Add a real semaphore struct
  anv: Implement VK_KHX_external_semaphore_capabilities
  anv: Implement VK_KHX_external_semaphore
  anv: Pull the guts of cmd_buffer_execbuf into a helper
  anv: Implement VK_KHX_external_semaphore_fd
  anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
  anv: Implement support for exporting semaphores as FENCE_FD
  HACK/anv: Set context priorities based on queue priorities
  anv/gem: Add a drm syncobj support
  anv: Use DRM sync objects for external semaphores when available

 src/intel/Makefile.sources  |   1 +
 src/intel/vulkan/anv_allocator.c| 271 +++
 src/intel/vulkan/anv_batch_chain.c  | 263 +--
 src/intel/vulkan/anv_device.c   | 713 
 src/intel/vulkan/anv_entrypoints_gen.py |   6 +
 src/intel/vulkan/anv_formats.c  | 118 -
 src/intel/vulkan/anv_gem.c  | 133 +-
 src/intel/vulkan/anv_gem_stubs.c|  31 ++
 src/intel/vulkan/anv_image.c|   2 +-
 src/intel/vulkan/anv_intel.c|  15 +-
 src/intel/vulkan/anv_pipeline_cache.c   |   4 +-
 src/intel/vulkan/anv_private.h  |  98 +++-
 src/intel/vulkan/anv_queue.c| 793 
 src/intel/vulkan/anv_wsi.c  |   7 +-
 14 files changed, 1888 insertions(+), 567 deletions(-)
 create mode 100644 src/intel/vulkan/anv_queue.c

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 03/21] anv: Set EXEC_OBJECT_ASYNC when available

2017-04-14 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_allocator.c | 3 +++
 src/intel/vulkan/anv_device.c| 5 +
 src/intel/vulkan/anv_private.h   | 1 +
 src/intel/vulkan/anv_wsi.c   | 1 +
 4 files changed, 10 insertions(+)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 784191e..697309f 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -504,6 +504,9 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct anv_block_state *state)
    anv_bo_init(&pool->bo, gem_handle, size);
    pool->bo.map = map;
 
+   if (pool->device->instance->physicalDevice.has_exec_async)
+  pool->bo.flags |= EXEC_OBJECT_ASYNC;
+
 done:
    pthread_mutex_unlock(&pool->device->mutex);
 
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 7a25ee9..079b0c5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -202,6 +202,8 @@ anv_physical_device_init(struct anv_physical_device *device,
if (result != VK_SUCCESS)
   goto fail;
 
+   device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
+
if (!anv_device_get_cache_uuid(device->uuid, device->chipset_id)) {
   result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
  "cannot generate UUID");
@@ -1527,6 +1529,9 @@ anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, uint64_t size)
if (device->instance->physicalDevice.supports_48bit_addresses)
   bo->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
 
+   if (device->instance->physicalDevice.has_exec_async)
+  bo->flags |= EXEC_OBJECT_ASYNC;
+
return VK_SUCCESS;
 }
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 7d07900..1f12a59 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -625,6 +625,7 @@ struct anv_physical_device {
     struct brw_compiler *                       compiler;
     struct isl_device                           isl_dev;
     int                                         cmd_parser_version;
+    bool                                        has_exec_async;
 
     uint32_t                                    eu_total;
     uint32_t                                    subslice_total;
diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index ba66ea6..a024561 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -208,6 +208,7 @@ x11_anv_wsi_image_create(VkDevice device_h,
 * know we're writing to them and synchronize uses on other rings (eg if
 * the display server uses the blitter ring).
 */
+   memory->bo.flags &= ~EXEC_OBJECT_ASYNC;
memory->bo.flags |= EXEC_OBJECT_WRITE;
 
anv_BindImageMemory(device_h, image_h, memory_h, 0);
-- 
2.5.0.400.gff86faf
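
The flag handling in this patch boils down to: set ASYNC on every BO when the
kernel supports it, then clear it again (and set WRITE) for window-system
images so other rings still synchronize against them. A minimal sketch of
that bit manipulation follows. The SKETCH_* values are redefined locally and
are only believed to mirror i915_drm.h; sketch_bo, sketch_bo_init and
sketch_bo_make_wsi are hypothetical names, not driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values, redefined so the sketch stands alone. */
#define SKETCH_EXEC_OBJECT_WRITE (1u << 2)
#define SKETCH_EXEC_OBJECT_ASYNC (1u << 6)

struct sketch_bo {
   uint64_t flags;
};

/* Every new BO opts out of implicit synchronization when supported. */
static void
sketch_bo_init(struct sketch_bo *bo, bool has_exec_async)
{
   bo->flags = 0;
   if (has_exec_async)
      bo->flags |= SKETCH_EXEC_OBJECT_ASYNC;
}

/* WSI images are shared with the display server, so they opt back in to
 * implicit synchronization and are marked as written. */
static void
sketch_bo_make_wsi(struct sketch_bo *bo)
{
   bo->flags &= ~(uint64_t)SKETCH_EXEC_OBJECT_ASYNC;
   bo->flags |= SKETCH_EXEC_OBJECT_WRITE;
}
```

Clearing before setting keeps the WSI path correct regardless of whether the
kernel advertised the ASYNC capability in the first place.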



[Mesa-dev] [PATCH 02/21] anv/cmd_buffer: Use the device allocator for QueueSubmit

2017-04-14 Thread Jason Ekstrand
The command is really operating on a Queue not a command buffer and the
nearest object to that with an allocator is VkDevice.

Cc: "17.0" 
---
 src/intel/vulkan/anv_batch_chain.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 5f0528f..3e9fa4c 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1265,7 +1265,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   cmd_buffer->last_ss_pool_center);
    VkResult result =
       anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
-                         &cmd_buffer->pool->alloc);
+                         &device->alloc);
if (result != VK_SUCCESS)
   return result;
 
@@ -1278,7 +1278,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
cmd_buffer->last_ss_pool_center);
 
       result = anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
-                                  &cmd_buffer->pool->alloc);
+                                  &device->alloc);
   if (result != VK_SUCCESS)
  return result;
}
@@ -1387,7 +1387,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 
    result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
-   anv_execbuf_finish(&execbuf, &cmd_buffer->pool->alloc);
+   anv_execbuf_finish(&execbuf, &device->alloc);
 
return result;
 }
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH kmscube 6/6] common: Give cmdline parameter for forcing modifiers

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 19:22, Ben Widawsky  wrote:
> ---
>  common.c  | 13 -
>  common.h  | 11 ++-
>  kmscube.c | 14 +++---
>  3 files changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/common.c b/common.c
> index e63bb39..eaaa9a4 100644
> --- a/common.c
> +++ b/common.c
> @@ -31,9 +31,6 @@
>
>  static struct gbm gbm;
>
> -#ifndef DRM_FORMAT_MOD_LINEAR
> -#define DRM_FORMAT_MOD_LINEAR 0
> -#endif
>  static int
>  get_modifiers(uint64_t **mods)
>  {
> @@ -43,7 +40,7 @@ get_modifiers(uint64_t **mods)
> return 1;
>  }
>
> -const struct gbm * init_gbm(int drm_fd, int w, int h)
> +const struct gbm * init_gbm(int drm_fd, int w, int h, uint64_t modifier)
>  {
> gbm.dev = gbm_create_device(drm_fd);
>
> @@ -57,7 +54,13 @@ const struct gbm * init_gbm(int drm_fd, int w, int h)
> }
>  #else
> uint64_t *mods;
> -   int count = get_modifiers(&mods);
> +   int count;
> +   if (modifier != DRM_FORMAT_MOD_INVALID) {
> +           count = 1;
> +           mods = &modifier;
> +   } else {
> +           count = get_modifiers(&mods);
> +   }
> gbm.surface = gbm_surface_create_with_modifiers(gbm.dev, w, h,
> GBM_FORMAT_XRGB8888, mods, count);
>  #endif
> diff --git a/common.h b/common.h
> index f3d9d32..03634cc 100644
> --- a/common.h
> +++ b/common.h
> @@ -36,6 +36,14 @@
>#include "config.h"
>  #endif
>
> +#ifndef DRM_FORMAT_MOD_LINEAR
> +#define DRM_FORMAT_MOD_LINEAR 0
> +#endif
> +
> +#ifndef DRM_FORMAT_MOD_INVALID
> +#define DRM_FORMAT_MOD_INVALID ((((__u64)0) << 56) | ((1ULL << 56) - 1))
> +#endif
> +
>  #ifndef EGL_KHR_platform_gbm
>  #define EGL_KHR_platform_gbm 1
>  #define EGL_PLATFORM_GBM_KHR  0x31D7
> @@ -57,9 +65,10 @@ struct gbm {
> struct gbm_device *dev;
> struct gbm_surface *surface;
> int width, height;
> +   uint64_t forced_modifier;
Seems unused. Drop for now?

With my trivial suggestions the series is
Reviewed-by: Emil Velikov 

-Emil
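
The selection logic under review can be sketched in isolation as below: a
caller-supplied modifier wins unless it is the INVALID sentinel, in which
case the default list (LINEAR only) is used. This is a sketch under stated
assumptions: sketch_pick_modifiers and the SKETCH_* macros are hypothetical
stand-ins for init_gbm()'s inline code and the DRM_FORMAT_MOD_* fallbacks in
common.h, and the forced modifier is passed by pointer so the returned list
stays valid after the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MOD_LINEAR  0ULL
#define SKETCH_MOD_INVALID ((((uint64_t)0) << 56) | ((1ULL << 56) - 1))

/* Returns the number of modifiers and points *mods at the list to use. */
static int
sketch_pick_modifiers(const uint64_t *forced, const uint64_t **mods)
{
   static const uint64_t defaults[] = { SKETCH_MOD_LINEAR };

   assert(forced != NULL && mods != NULL);

   if (*forced != SKETCH_MOD_INVALID) {
      /* The command line forced a single modifier. */
      *mods = forced;
      return 1;
   }

   /* Nothing forced: fall back to the list assumed to work everywhere. */
   *mods = defaults;
   return (int)(sizeof(defaults) / sizeof(defaults[0]));
}
```

Using INVALID rather than 0 as the "nothing forced" value matters because
LINEAR itself is 0 and must remain a value the user can request explicitly.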


Re: [Mesa-dev] [PATCH kmscube 5/6] common: Use libdrm AddFB with modifiers

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 19:22, Ben Widawsky  wrote:
> Note: nothing happens here yet since LINEAR == 0.

Suggestion for the subject

common: use drmModeAddFB2* API over the legacy drmModeAddFB one


> ---
>  configure.ac |  2 +-
>  drm-common.c | 37 +
>  2 files changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 33167e4..f564ef3 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -35,7 +35,7 @@ AC_PROG_CC
>  m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
>
>  # Obtain compiler/linker options for depedencies
> -PKG_CHECK_MODULES(DRM, libdrm)
> +PKG_CHECK_MODULES(DRM, [libdrm >= 2.4.71])
>  PKG_CHECK_MODULES(GBM, gbm >= 13.0)
>  PKG_CHECK_MODULES(EGL, egl)
>  PKG_CHECK_MODULES(GLES2, glesv2)
> diff --git a/drm-common.c b/drm-common.c
> index b69ed70..eb460df 100644
> --- a/drm-common.c
> +++ b/drm-common.c
> @@ -46,7 +46,7 @@ struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo)
>  {
> int drm_fd = gbm_device_get_fd(gbm_bo_get_device(bo));
> struct drm_fb *fb = gbm_bo_get_user_data(bo);
> -   uint32_t width, height, stride, handle;
> +   uint32_t width, height, strides[4]={0}, handles[4] = {0}, offsets[4] = {0}, flags = 0;
Nit: Add spaces around = for strides[].

> int ret;
>
> if (fb)
> @@ -57,10 +57,39 @@ struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo)
>
> width = gbm_bo_get_width(bo);
> height = gbm_bo_get_height(bo);
> -   stride = gbm_bo_get_stride(bo);
> -   handle = gbm_bo_get_handle(bo).u32;
>
> -   ret = drmModeAddFB(drm_fd, width, height, 24, 32, stride, handle, &fb->fb_id);
> +#ifndef HAVE_GBM_MODIFIERS
> +   strides[0] = gbm_bo_get_stride(bo);
> +   handles[0] = gbm_bo_get_handle(bo).u32;
These two should go in the fallback path.

> +   ret = -1;
> +#else
> +   uint64_t modifiers[4] = {0};
> +   modifiers[0] = gbm_bo_get_modifier(bo);
> +   const int num_planes = gbm_bo_get_plane_count(bo);
> +   for (int i = 0; i < num_planes; i++) {
> +   strides[i] = gbm_bo_get_stride_for_plane(bo, i);
> +   handles[i] = gbm_bo_get_handle(bo).u32;
> +   offsets[i] = gbm_bo_get_offset(bo, i);
> +   modifiers[i] = modifiers[0];
> +   }
> +
> +   if (modifiers[0]) {
> +   flags = DRM_MODE_FB_MODIFIERS;
> +   printf("Using modifier %lx\n", modifiers[0]);
> +   }
> +
> +   ret = drmModeAddFB2WithModifiers(drm_fd, width, height,
> +   DRM_FORMAT_XRGB8888, handles, strides, offsets,
> +   modifiers, &fb->fb_id, flags);
> +#endif
> +   if (ret) {
> +   if (flags)
> +   fprintf(stderr, "Modifiers failed!\n");
> +   flags = 0;
Drop this line or use it in drmModeAddFB2?

Here we'd want to correctly initialise all of strides[] and handles[],
since they may contain the 'wrong' values from above.
It's a bit pedantic I admit, but it should make the code easier to read
and will prevent explosions in [buggy] kernel modules.

-Emil


[Mesa-dev] [PATCH] radv: remove irrelevant comment

2017-04-14 Thread Grazvydas Ignotas
A leftover from anv.

Signed-off-by: Grazvydas Ignotas 
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5f14394..7857e8f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -660,11 +660,11 @@ void radv_GetPhysicalDeviceProperties(
.driverVersion = radv_get_driver_version(),
.vendorID = 0x1002,
.deviceID = pdevice->rad_info.pci_id,
.deviceType = VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU,
.limits = limits,
-   .sparseProperties = {0}, /* Broadwell doesn't do sparse. */
+   .sparseProperties = {0},
};
 
strcpy(pProperties->deviceName, pdevice->name);
memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
 }
-- 
2.7.4



[Mesa-dev] [PATCH] radv: report timestampPeriod correctly

2017-04-14 Thread Grazvydas Ignotas
The kernel returns frequency in kHz, so to convert to nanosecond
interval that Vulkan uses the dividend should be 1000000.0 and not
100000.0.

This fixes the GPU graph in DOOM and matches the amdgpu-pro blob.

Signed-off-by: Grazvydas Ignotas 
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
 src/amd/vulkan/radv_device.c| 2 +-
 src/amd/vulkan/radv_radeon_winsys.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 7857e8f..796cc70 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -637,11 +637,11 @@ void radv_GetPhysicalDeviceProperties(
.sampledImageDepthSampleCounts= sample_counts,
.sampledImageStencilSampleCounts  = sample_counts,
.storageImageSampleCounts = VK_SAMPLE_COUNT_1_BIT,
.maxSampleMaskWords   = 1,
.timestampComputeAndGraphics  = false,
-   .timestampPeriod  = 100000.0 / pdevice->rad_info.clock_crystal_freq,
+   .timestampPeriod  = 1000000.0 / pdevice->rad_info.clock_crystal_freq,
.maxClipDistances = 8,
.maxCullDistances = 8,
.maxCombinedClipAndCullDistances  = 8,
.discreteQueuePriorities  = 1,
.pointSizeRange   = { 0.125, 255.875 },
diff --git a/src/amd/vulkan/radv_radeon_winsys.h 
b/src/amd/vulkan/radv_radeon_winsys.h
index 9f2430f..f6bab74 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -93,11 +93,11 @@ struct radeon_info {
     bool                        has_uvd;
     uint32_t                    sdma_rings;
     uint32_t                    compute_rings;
     uint32_t                    vce_fw_version;
     uint32_t                    vce_harvest_config;
-    uint32_t                    clock_crystal_freq;
+    uint32_t                    clock_crystal_freq; /* in kHz */
 
     /* Kernel info. */
     uint32_t                    drm_major; /* version */
     uint32_t                    drm_minor;
     uint32_t                    drm_patchlevel;
-- 
2.7.4
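
The unit conversion behind this fix is short enough to check by hand. Vulkan's
timestampPeriod is the number of nanoseconds per timestamp tick. With the
crystal frequency f given in kHz, one tick lasts 1 / (f * 1000) seconds, which
is 1e9 / (f * 1e3) = 1e6 / f nanoseconds. The sketch below assumes nothing
beyond that arithmetic; sketch_timestamp_period_ns is a hypothetical helper,
not a radv function.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per tick for a reference clock given in kHz. */
static float
sketch_timestamp_period_ns(uint32_t crystal_freq_khz)
{
   assert(crystal_freq_khz != 0);
   return 1000000.0f / (float)crystal_freq_khz;
}
```

For instance, a 100000 kHz (100 MHz) reference clock gives 10 ns per tick and
a 25000 kHz clock gives 40 ns per tick.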



Re: [Mesa-dev] [PATCH kmscube 4/6] common: Use the create with modifiers interface

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 19:22, Ben Widawsky  wrote:
> ---
>  common.c | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/common.c b/common.c
> index 4bf3c5a..e63bb39 100644
> --- a/common.c
> +++ b/common.c
> @@ -31,10 +31,23 @@
>
>  static struct gbm gbm;
>
> +#ifndef DRM_FORMAT_MOD_LINEAR
> +#define DRM_FORMAT_MOD_LINEAR 0
> +#endif
> +static int
> +get_modifiers(uint64_t **mods)
> +{
> +   /* Assumed LINEAR is supported everywhere */
> +   static uint64_t modifiers[] = {DRM_FORMAT_MOD_LINEAR};
> +   *mods = modifiers;
> +   return 1;
> +}
> +
>  const struct gbm * init_gbm(int drm_fd, int w, int h)
>  {
> gbm.dev = gbm_create_device(drm_fd);
>
> +#ifndef HAVE_GBM_MODIFIERS
> gbm.surface = gbm_surface_create(gbm.dev, w, h,
> GBM_FORMAT_XRGB8888,
> GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
> @@ -42,6 +55,12 @@ const struct gbm * init_gbm(int drm_fd, int w, int h)
> printf("failed to create gbm surface\n");
> return NULL;
> }
> +#else
> +   uint64_t *mods;
> +   int count = get_modifiers(&mods);
> +   gbm.surface = gbm_surface_create_with_modifiers(gbm.dev, w, h,
> +   GBM_FORMAT_XRGB8888, mods, count);
> +#endif
>
Since gbm_surface_create_with_modifiers() can fail we want to have
some error handling.
Move the existing one after the ifndef/else block ?

-Emil
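
A sketch of the shape suggested here, with one failure check shared by both
creation paths. Everything in it is a stand-in: the two sketch_create*
functions are stubs for gbm_surface_create() and
gbm_surface_create_with_modifiers(), and the have_modifiers argument replaces
the HAVE_GBM_MODIFIERS preprocessor switch so that both branches can be
exercised in a single build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_surface {
   int with_modifiers;
};

static struct sketch_surface sketch_storage;

/* Stub for the legacy creation call; returns NULL when asked to fail. */
static struct sketch_surface *
sketch_create(int fail)
{
   sketch_storage.with_modifiers = 0;
   return fail ? NULL : &sketch_storage;
}

/* Stub for the modifier-aware creation call; it can fail as well. */
static struct sketch_surface *
sketch_create_with_modifiers(int fail)
{
   sketch_storage.with_modifiers = 1;
   return fail ? NULL : &sketch_storage;
}

static struct sketch_surface *
sketch_init_surface(int have_modifiers, int fail)
{
   struct sketch_surface *surface;

   if (have_modifiers)
      surface = sketch_create_with_modifiers(fail);
   else
      surface = sketch_create(fail);

   /* Single check placed after both paths, so neither can be missed. */
   if (!surface) {
      printf("failed to create gbm surface\n");
      return NULL;
   }
   return surface;
}
```

Placing the check after the conditional block means a later third creation
path would inherit the error handling without any extra code.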


Re: [Mesa-dev] [PATCH kmscube 3/6] common: include config.h

2017-04-14 Thread Emil Velikov
Hi Ben,

On 13 April 2017 at 19:22, Ben Widawsky  wrote:
> ---
>  common.h | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/common.h b/common.h
> index 2eceac7..f3d9d32 100644
> --- a/common.h
> +++ b/common.h
> @@ -32,6 +32,10 @@
>  #include 
>  #include 
>
> +#ifdef HAVE_CONFIG_H
> +  #include "config.h"
> +#endif
> +
There's no config.h so you don't need this patch.

-Emil


Re: [Mesa-dev] [PATCH v2] configure.ac: add --enable-sanitize option

2017-04-14 Thread Emil Velikov
On 13 April 2017 at 17:14, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Enable code sanitizers by adding -fsanitize=$foo flags for the compiler
> and linker.
>
> In addition, this also disables checking for undefined symbols: running
> the address sanitizer requires additional symbols which should be provided
> by a preloaded libasan.so (preloaded for hooking into malloc & friends
> globally), and the undefined symbols check gets tripped up by that.
>
> Running the tests works normally via `make check`, but shows additional
> failures with the address sanitizer due to memory leaks that seem to be
> mostly leaks in the tests themselves. I believe those failures should
> really be fixed. In the mean-time, you can set
>
> export ASAN_OPTIONS=detect_leaks=0
>
> to only check for more serious error types.
>
> v2:
> - fail reasonably when an unsupported sanitize flag is given (Eric Engestrom)
>
> Reviewed-by: Bartosz Tomczyk  (v1)
> Reviewed-by: Eric Engestrom 
> --
> Eric, did you ever figure out what went wrong with LLVM? I'm compiling
> with a fairly recent LLVM trunk here and it works fine, and so apparently
> did you. FWIW, I'm using gcc 6.2.
>
> Emil, as you can see I tried `make check`, and it works without the
> preload because all the tests are standalone libraries.
>
Thought we had some tests that use shared libs. Glad to hear that
everything works as expected.
Thanks for double-checking!

Reviewed-by: Emil Velikov 

-Emil
P.S. Feel free to add a note to docs/relnotes/17.1.0.html or I'll add
one later today.


Re: [Mesa-dev] [PATCH] nir: Destination component count of shader_clock intrinsic is 2

2017-04-14 Thread Jason Ekstrand
+Matt, +Ken

On Wed, Apr 12, 2017 at 6:09 PM, Boyan Ding  wrote:

> 2017-04-13 2:25 GMT+08:00 Jason Ekstrand :
> > On Wed, Apr 12, 2017 at 6:14 AM, Boyan Ding wrote:
> >>
> >> This fixes the following error when using ARB_shader_clock on i965:
> >> vec1 32 ssa_0 = intrinsic shader_clock () () ()
> >> intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */
> >> error: src->ssa->num_components == num_components (nir/nir_validate.c:204)
> >>
> >> Cc: mesa-sta...@lists.freedesktop.org
> >> Signed-off-by: Boyan Ding 
> >> ---
> >>  src/compiler/glsl/glsl_to_nir.cpp | 3 ++-
> >>  src/compiler/nir/nir_intrinsics.h | 2 +-
> >>  2 files changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/compiler/glsl/glsl_to_nir.cpp
> >> b/src/compiler/glsl/glsl_to_nir.cpp
> >> index f0557f985b..870d457681 100644
> >> --- a/src/compiler/glsl/glsl_to_nir.cpp
> >> +++ b/src/compiler/glsl/glsl_to_nir.cpp
> >> @@ -930,7 +930,8 @@ nir_visitor::visit(ir_call *ir)
> >>   nir_builder_instr_insert(&b, &instr->instr);
> >>   break;
> >>case nir_intrinsic_shader_clock:
> >> - nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
> >> + nir_ssa_dest_init(&instr->instr, &instr->dest, 2, 32, NULL);
> >> + instr->num_components = 2;
> >
> >
> > This made me go look at the spec, and things get a bit more subtle...  In
> > particular, ARB_shader_clock specifies two builtin functions:
> >
> > uvec2 clock2x32ARB(void);
> > uint64_t clockARB(void);
> >
> >  Where the second one only exists if you support int64.  On gen8+, we do
> > support int64...
> >
> > My feeling is that the correct way to implement this is to make the NIR
> > intrinsic return a 64bit value and wrap it in a nir_unpack_64_2x32 if the
> > client asks for the 2x32 version.  If that's too much refactoring for you,
> > then this patch is probably sufficient to solve the issue today.
> >
>
> I agree with you. I'm not very familiar with nir internals, and was
> just copying TGSI's handling here. There will be more intrinsics with
> 64bit results, for example, ballot, which radv guys might be
> interested in.
>
> I won't mind if someone comes up with a better solution and replaces
> mine. But just as you said above, it solves the issue today. It's up
> to you to decide.
>
> Cheers,
> Boyan Ding
>
> >>   nir_builder_instr_insert(&b, &instr->instr);
> >>   break;
> >>case nir_intrinsic_store_ssbo: {
> >> diff --git a/src/compiler/nir/nir_intrinsics.h
> >> b/src/compiler/nir/nir_intrinsics.h
> >> index 105c56f759..3a519a73dd 100644
> >> --- a/src/compiler/nir/nir_intrinsics.h
> >> +++ b/src/compiler/nir/nir_intrinsics.h
> >> @@ -91,7 +91,7 @@ BARRIER(memory_barrier)
> >>   * The latter can be used as code motion barrier, which is currently not
> >>   * feasible with NIR.
> >>   */
> >> -INTRINSIC(shader_clock, 0, ARR(0), true, 1, 0, 0, xx, xx, xx,
> >> NIR_INTRINSIC_CAN_ELIMINATE)
> >> +INTRINSIC(shader_clock, 0, ARR(0), true, 2, 0, 0, xx, xx, xx,
> >> NIR_INTRINSIC_CAN_ELIMINATE)
> >>
> >>  /*
> >>   * Memory barrier with semantics analogous to the compute shader
> >> --
> >> 2.12.0
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
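
To make the 64-bit variant discussed in this thread concrete: if the intrinsic
produced a single 64-bit counter, the two-component form could be derived by
splitting it into low and high halves, which is what a 64-to-2x32 unpack
amounts to. The sketch below is plain C rather than NIR builder code;
sketch_uvec2 and sketch_unpack_64_2x32 are hypothetical names, and the
low-half-first ordering is an assumption based on how ARB_shader_clock is
understood to describe clock2x32ARB().

```c
#include <assert.h>
#include <stdint.h>

struct sketch_uvec2 {
   uint32_t x; /* least significant 32 bits */
   uint32_t y; /* most significant 32 bits */
};

/* Split one 64-bit counter value into two 32-bit components. */
static struct sketch_uvec2
sketch_unpack_64_2x32(uint64_t value)
{
   struct sketch_uvec2 result;

   result.x = (uint32_t)(value & 0xffffffffu);
   result.y = (uint32_t)(value >> 32);
   return result;
}
```

With this shape the 64-bit entry point would return the value unchanged and
only the 2x32 entry point would pay for the split.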


Re: [Mesa-dev] [PATCH] mesa: replace _mesa_index_buffer::type with index_size

2017-04-14 Thread Ilia Mirkin
On Fri, Apr 14, 2017 at 12:45 PM, Marek Olšák  wrote:
> On Fri, Apr 14, 2017 at 5:12 PM, Ilia Mirkin  wrote:
>> On Fri, Apr 14, 2017 at 11:06 AM, Marek Olšák  wrote:
>>> diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
>>> index d62ab4e..79f7538 100644
>>> --- a/src/mesa/vbo/vbo.h
>>> +++ b/src/mesa/vbo/vbo.h
>>
>> Should also be possible to remove vbo_sizeof_ib_type from here right?
>
> vbo_sizeof_ib_type is used to get index_size at the beginning of
> indexed draw calls. However, it's not used in other places anymore.

Erm right. Duh. My r-b still stands.


Re: [Mesa-dev] [PATCH 1/7] gallium: fold u_trim_pipe_prim call from st/mesa to drivers

2017-04-14 Thread Ilia Mirkin
On Fri, Apr 14, 2017 at 12:42 PM, Marek Olšák  wrote:
> On Fri, Apr 14, 2017 at 5:45 PM, Ilia Mirkin  wrote:
>> On Fri, Apr 14, 2017 at 11:07 AM, Marek Olšák  wrote:
>>> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_vbo.c b/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>>> index bc9b9a1..295c394 100644
>>> --- a/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>>> +++ b/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>>> @@ -543,20 +543,23 @@ nv30_draw_elements(struct nv30_context *nv30, bool shorten,
>>> }
>>>  }
>>>
>>>  static void
>>>  nv30_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
>>>  {
>>> struct nv30_context *nv30 = nv30_context(pipe);
>>> struct nouveau_pushbuf *push = nv30->base.pushbuf;
>>> int i;
>>>
>>> +   if (!u_trim_pipe_prim(info->mode, (unsigned*)&info->count))
>>> +  return;
>>> +
>>
>> Should this also have a !info->primitive_restart? It's supported on
>> nv4x (covered by this driver).
>
> In that case, I wonder if u_trim_pipe_prim is required with this
> driver. It might be better to just remove that call.

Based on a quick look, this seems to exist to prevent short draws and
trim the count to the nearest prim size, i.e. if you try to draw a tri
with %3 != 0 vertices, or a line with %2 != 0? I'm not 100% sure that
the NV30 HW handles those correctly, but it probably does. I can
double-check tonight, as I have one plugged in these days.

Cheers,

  -ilia
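
The trimming described above can be sketched as follows: each primitive type
has a minimum vertex count and an increment, a draw below the minimum is
dropped, and any remainder beyond a whole number of primitives is cut off.
This is an illustration of the idea only, not the u_trim_pipe_prim source;
the enum, its three modes and sketch_trim_prim are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_prim {
   SKETCH_PRIM_LINES,
   SKETCH_PRIM_TRIANGLES,
   SKETCH_PRIM_TRIANGLE_STRIP
};

/* Trims *count to a whole number of primitives. Returns false (and zeroes
 * *count) when there are too few vertices to draw anything. */
static bool
sketch_trim_prim(enum sketch_prim mode, unsigned *count)
{
   unsigned min, incr;

   switch (mode) {
   case SKETCH_PRIM_LINES:          min = 2; incr = 2; break;
   case SKETCH_PRIM_TRIANGLES:      min = 3; incr = 3; break;
   case SKETCH_PRIM_TRIANGLE_STRIP: min = 3; incr = 1; break;
   default:                         return false;
   }

   if (*count < min) {
      *count = 0;
      return false;
   }

   /* Drop the leftover vertices that do not complete a primitive. */
   *count -= (*count - min) % incr;
   return true;
}
```

Seven vertices drawn as triangles become six, five vertices drawn as lines
become four, and a strip keeps every vertex once it has at least three.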


Re: [Mesa-dev] [PATCH] nir: Destination component count of shader_clock intrinsic is 2

2017-04-14 Thread Jason Ekstrand
On Wed, Apr 12, 2017 at 6:14 AM, Boyan Ding  wrote:

> This fixes the following error when using ARB_shader_clock on i965:
> vec1 32 ssa_0 = intrinsic shader_clock () () ()
> intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */
> error: src->ssa->num_components == num_components (nir/nir_validate.c:204)
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Boyan Ding 
> ---
>  src/compiler/glsl/glsl_to_nir.cpp | 3 ++-
>  src/compiler/nir/nir_intrinsics.h | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/glsl_to_nir.cpp
> b/src/compiler/glsl/glsl_to_nir.cpp
> index f0557f985b..870d457681 100644
> --- a/src/compiler/glsl/glsl_to_nir.cpp
> +++ b/src/compiler/glsl/glsl_to_nir.cpp
> @@ -930,7 +930,8 @@ nir_visitor::visit(ir_call *ir)
>   nir_builder_instr_insert(&b, &instr->instr);
>   break;
>case nir_intrinsic_shader_clock:
> - nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
> + nir_ssa_dest_init(&instr->instr, &instr->dest, 2, 32, NULL);
> + instr->num_components = 2;
>

This isn't needed for things that have an explicit number of components.
You can drop it.  Other than that,

Reviewed-by: Jason Ekstrand 

We can figure out the int64 interactions later.


>   nir_builder_instr_insert(&b, &instr->instr);
>   break;
>case nir_intrinsic_store_ssbo: {
> diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
> index 105c56f759..3a519a73dd 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -91,7 +91,7 @@ BARRIER(memory_barrier)
>   * The latter can be used as code motion barrier, which is currently not
>   * feasible with NIR.
>   */
> -INTRINSIC(shader_clock, 0, ARR(0), true, 1, 0, 0, xx, xx, xx,
> NIR_INTRINSIC_CAN_ELIMINATE)
> +INTRINSIC(shader_clock, 0, ARR(0), true, 2, 0, 0, xx, xx, xx,
> NIR_INTRINSIC_CAN_ELIMINATE)
>
>  /*
>   * Memory barrier with semantics analogous to the compute shader
> --
> 2.12.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH] mesa: replace _mesa_index_buffer::type with index_size

2017-04-14 Thread Marek Olšák
On Fri, Apr 14, 2017 at 5:12 PM, Ilia Mirkin  wrote:
> On Fri, Apr 14, 2017 at 11:06 AM, Marek Olšák  wrote:
>> diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
>> index d62ab4e..79f7538 100644
>> --- a/src/mesa/vbo/vbo.h
>> +++ b/src/mesa/vbo/vbo.h
>
> Should also be possible to remove vbo_sizeof_ib_type from here right?

vbo_sizeof_ib_type is used to get index_size at the beginning of
indexed draw calls. However, it's not used in other places anymore.

Marek
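
For context, the conversion that remains at the top of indexed draws is a
small lookup from the GL index type to a byte size, after which only the size
is carried around. The sketch below shows that mapping in isolation. The
SKETCH_GL_* constants are local copies of the standard GL enum values, and
sketch_index_size is a hypothetical stand-in, not the body of
vbo_sizeof_ib_type.

```c
#include <assert.h>

#define SKETCH_GL_UNSIGNED_BYTE  0x1401
#define SKETCH_GL_UNSIGNED_SHORT 0x1403
#define SKETCH_GL_UNSIGNED_INT   0x1405

/* Size in bytes of one index of the given GL type, or 0 if unsupported. */
static unsigned
sketch_index_size(unsigned gl_type)
{
   switch (gl_type) {
   case SKETCH_GL_UNSIGNED_BYTE:  return 1;
   case SKETCH_GL_UNSIGNED_SHORT: return 2;
   case SKETCH_GL_UNSIGNED_INT:   return 4;
   default:                       return 0;
   }
}
```

Storing the size instead of the enum lets later code compute byte offsets
with a multiply rather than another switch.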


Re: [Mesa-dev] [PATCH 1/7] gallium: fold u_trim_pipe_prim call from st/mesa to drivers

2017-04-14 Thread Marek Olšák
On Fri, Apr 14, 2017 at 5:45 PM, Ilia Mirkin  wrote:
> On Fri, Apr 14, 2017 at 11:07 AM, Marek Olšák  wrote:
>> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_vbo.c b/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>> index bc9b9a1..295c394 100644
>> --- a/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>> +++ b/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
>> @@ -543,20 +543,23 @@ nv30_draw_elements(struct nv30_context *nv30, bool shorten,
>> }
>>  }
>>
>>  static void
>>  nv30_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
>>  {
>> struct nv30_context *nv30 = nv30_context(pipe);
>> struct nouveau_pushbuf *push = nv30->base.pushbuf;
>> int i;
>>
>> +   if (!u_trim_pipe_prim(info->mode, (unsigned*)&info->count))
>> +  return;
>> +
>
> Should this also have a !info->primitive_restart? It's supported on
> nv4x (covered by this driver).

In that case, I wonder if u_trim_pipe_prim is required with this
driver. It might be better to just remove that call.
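
To make the concern concrete, here is a rough sketch of what trimming does and why primitive restart interferes with it. This is not the gallium u_trim_pipe_prim implementation; the enum, the function names and the min/incr table are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of primitive types, enough to show the idea. */
enum prim_sketch {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINE_STRIP,
   PRIM_TRIANGLES,
   PRIM_TRIANGLE_STRIP
};

/* Drop trailing vertices that cannot form a complete primitive.
 * Returns false when nothing drawable is left. */
static bool
trim_count_sketch(enum prim_sketch mode, unsigned *count)
{
   unsigned min, incr;

   switch (mode) {
   case PRIM_POINTS:         min = 1; incr = 1; break;
   case PRIM_LINES:          min = 2; incr = 2; break;
   case PRIM_LINE_STRIP:     min = 2; incr = 1; break;
   case PRIM_TRIANGLES:      min = 3; incr = 3; break;
   case PRIM_TRIANGLE_STRIP: min = 3; incr = 1; break;
   default:                  return false;
   }

   if (*count < min) {
      *count = 0;
      return false;
   }

   *count -= (*count - min) % incr;
   return true;
}

/* The guard asked about in the review: with primitive restart the index
 * list also contains restart markers, so the raw count says nothing about
 * complete primitives and trimming is skipped. */
static bool
draw_should_proceed(enum prim_sketch mode, bool primitive_restart,
                    unsigned *count)
{
   if (primitive_restart)
      return true;
   return trim_count_sketch(mode, count);
}
```

For example, the triangle list 0,1,2,RESTART,3,4,5 has a count of 7. Trimming it to 6 would cut off index 5 and lose the second triangle, which is why the trim is only safe when restart is off.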

Marek


Re: [Mesa-dev] [PATCH 7/7] st/mesa: use one big translation table in st_pipe_vertex_format

2017-04-14 Thread Marek Olšák
Thanks.

I'm amending this:

diff --git a/src/mesa/state_tracker/st_atom_array.c b/src/mesa/state_tracker/st_atom_array.c
index 6cfbd24..436ea45 100644
--- a/src/mesa/state_tracker/st_atom_array.c
+++ b/src/mesa/state_tracker/st_atom_array.c
@@ -47,8 +47,9 @@
 #include "main/bufferobj.h"
 #include "main/glformats.h"

-static uint16_t vertex_formats[][4][4] = {
-   {
+/* vertex_formats[gltype - GL_BYTE][integer*2 + normalized][size - 1] */
+static const uint16_t vertex_formats[][4][4] = {
+   { /* GL_BYTE */
   {
  PIPE_FORMAT_R8_SSCALED,
  PIPE_FORMAT_R8G8_SSCALED,
@@ -68,7 +69,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R8G8B8A8_SINT
   },
},
-   {
+   { /* GL_UNSIGNED_BYTE */
   {
  PIPE_FORMAT_R8_USCALED,
  PIPE_FORMAT_R8G8_USCALED,
@@ -88,7 +89,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R8G8B8A8_UINT
   },
},
-   {
+   { /* GL_SHORT */
   {
  PIPE_FORMAT_R16_SSCALED,
  PIPE_FORMAT_R16G16_SSCALED,
@@ -108,7 +109,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R16G16B16A16_SINT
   },
},
-   {
+   { /* GL_UNSIGNED_SHORT */
   {
  PIPE_FORMAT_R16_USCALED,
  PIPE_FORMAT_R16G16_USCALED,
@@ -128,7 +129,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R16G16B16A16_UINT
   },
},
-   {
+   { /* GL_INT */
   {
  PIPE_FORMAT_R32_SSCALED,
  PIPE_FORMAT_R32G32_SSCALED,
@@ -148,7 +149,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R32G32B32A32_SINT
   },
},
-   {
+   { /* GL_UNSIGNED_INT */
   {
  PIPE_FORMAT_R32_USCALED,
  PIPE_FORMAT_R32G32_USCALED,
@@ -168,7 +169,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R32G32B32A32_UINT
   },
},
-   {
+   { /* GL_FLOAT */
   {
  PIPE_FORMAT_R32_FLOAT,
  PIPE_FORMAT_R32G32_FLOAT,
@@ -185,7 +186,7 @@ static uint16_t vertex_formats[][4][4] = {
{{0}}, /* GL_2_BYTES */
{{0}}, /* GL_3_BYTES */
{{0}}, /* GL_4_BYTES */
-   {
+   { /* GL_DOUBLE */
   {
  PIPE_FORMAT_R64_FLOAT,
  PIPE_FORMAT_R64G64_FLOAT,
@@ -199,7 +200,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R64G64B64A64_FLOAT
   },
},
-   {
+   { /* GL_HALF_FLOAT */
   {
  PIPE_FORMAT_R16_FLOAT,
  PIPE_FORMAT_R16G16_FLOAT,
@@ -213,7 +214,7 @@ static uint16_t vertex_formats[][4][4] = {
  PIPE_FORMAT_R16G16B16A16_FLOAT
   },
},
-   {
+   { /* GL_FIXED */
   {
  PIPE_FORMAT_R32_FIXED,
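
The indexing rule from the new table comment can be sketched as follows. This is only an illustration of how the three indices are derived, assuming the GL type enums are consecutive from GL_BYTE (0x1400) to GL_FIXED (0x140C) as in the OpenGL headers; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* GL enum values as defined in the OpenGL headers. */
#define GL_BYTE   0x1400
#define GL_FLOAT  0x1406
#define GL_DOUBLE 0x140A
#define GL_FIXED  0x140C

/* Hypothetical holder for the three table indices. */
struct format_index {
   unsigned type;    /* gltype - GL_BYTE */
   unsigned variant; /* integer*2 + normalized */
   unsigned size;    /* size - 1 */
};

/* Compute vertex_formats[type][variant][size] indices for one attribute. */
static struct format_index
vertex_format_index(unsigned gltype, unsigned size,
                    bool normalized, bool integer)
{
   assert(gltype >= GL_BYTE && gltype <= GL_FIXED);
   assert(size >= 1 && size <= 4);

   struct format_index idx = {
      .type    = gltype - GL_BYTE,
      .variant = (unsigned)integer * 2 + (unsigned)normalized,
      .size    = size - 1,
   };
   return idx;
}
```

A three-component, non-normalized GL_FLOAT attribute maps to indices [6][0][2]. The {{0}} rows for GL_2_BYTES, GL_3_BYTES and GL_4_BYTES are what keep GL_DOUBLE at row 10 without any branching.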

Marek

On Fri, Apr 14, 2017 at 5:41 PM, Brian Paul wrote:
> On 04/14/2017 09:07 AM, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> for lower overhead.
>> ---
>>   src/mesa/state_tracker/st_atom_array.c | 469
>> -
>>   1 file changed, 227 insertions(+), 242 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_atom_array.c
>> b/src/mesa/state_tracker/st_atom_array.c
>> index 221b2c7..6cfbd24 100644
>> --- a/src/mesa/state_tracker/st_atom_array.c
>> +++ b/src/mesa/state_tracker/st_atom_array.c
>> @@ -40,284 +40,269 @@
>>   #include "st_atom.h"
>>   #include "st_cb_bufferobjects.h"
>>   #include "st_draw.h"
>>   #include "st_program.h"
>>
>>   #include "cso_cache/cso_context.h"
>>   #include "util/u_math.h"
>>   #include "main/bufferobj.h"
>>   #include "main/glformats.h"
>>
>> -
>> -static GLuint double_types[4] = {
>> -   PIPE_FORMAT_R64_FLOAT,
>> -   PIPE_FORMAT_R64G64_FLOAT,
>> -   PIPE_FORMAT_R64G64B64_FLOAT,
>> -   PIPE_FORMAT_R64G64B64A64_FLOAT
>> -};
>> -
>> -static GLuint float_types[4] = {
>> -   PIPE_FORMAT_R32_FLOAT,
>> -   PIPE_FORMAT_R32G32_FLOAT,
>> -   PIPE_FORMAT_R32G32B32_FLOAT,
>> -   PIPE_FORMAT_R32G32B32A32_FLOAT
>> -};
>> -
>> -static GLuint half_float_types[4] = {
>> -   PIPE_FORMAT_R16_FLOAT,
>> -   PIPE_FORMAT_R16G16_FLOAT,
>> -   PIPE_FORMAT_R16G16B16_FLOAT,
>> -   PIPE_FORMAT_R16G16B16A16_FLOAT
>> -};
>> -
>> -static GLuint uint_types_norm[4] = {
>> -   PIPE_FORMAT_R32_UNORM,
>> -   PIPE_FORMAT_R32G32_UNORM,
>> -   PIPE_FORMAT_R32G32B32_UNORM,
>> -   PIPE_FORMAT_R32G32B32A32_UNORM
>> -};
>> -
>> -static GLuint uint_types_scale[4] = {
>> -   PIPE_FORMAT_R32_USCALED,
>> -   PIPE_FORMAT_R32G32_USCALED,
>> -   PIPE_FORMAT_R32G32B32_USCALED,
>> -   PIPE_FORMAT_R32G32B32A32_USCALED
>> -};
>> -
>> -static GLuint uint_types_int[4] = {
>> -   PIPE_FORMAT_R32_UINT,
>> -   PIPE_FORMAT_R32G32_UINT,
>> -   PIPE_FORMAT_R32G32B32_UINT,
>> -   PIPE_FORMAT_R32G32B32A32_UINT
>> -};
>> -
>> -static GLuint int_types_norm[4] = {
>> -   PIPE_FORMAT_R32_SNORM,
>> -   PIPE_FORMAT_R32G32_SNORM,
>> -   PIPE_FORMAT_R32G32B32_SNORM,
>> -   PIPE_FORMAT_R32G32B32A32_SNORM
>> -};
>> -
>> -static GLuint int_types_scale[4] = {
>> -   PIPE_FORMAT_R32_SSCALED,
>> -   PIPE_FORMAT_R32G32_SSCALED,
>> -   PIPE_FORMAT_R32G32B32_SSCALED,
>> -   

Re: [Mesa-dev] [PATCH] swr: Add polygon stipple support

2017-04-14 Thread Ilia Mirkin
On Fri, Apr 14, 2017 at 11:18 AM, Ilia Mirkin wrote:
> On Thu, Apr 13, 2017 at 4:30 PM, George Kyriazis wrote:
>> Add polygon stipple functionality to the fragment shader.
>>
>> Explicitly turn off polygon stipple for lines and points, since we
>> do them using tris.
>> ---
>>  src/gallium/drivers/swr/swr_context.h  |  4 ++-
>>  src/gallium/drivers/swr/swr_shader.cpp | 56 
>> ++
>>  src/gallium/drivers/swr/swr_shader.h   |  1 +
>>  src/gallium/drivers/swr/swr_state.cpp  | 27 ++--
>>  src/gallium/drivers/swr/swr_state.h|  5 +++
>>  5 files changed, 84 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/gallium/drivers/swr/swr_context.h b/src/gallium/drivers/swr/swr_context.h
>> index be65a20..9d80c70 100644
>> --- a/src/gallium/drivers/swr/swr_context.h
>> +++ b/src/gallium/drivers/swr/swr_context.h
>> @@ -98,6 +98,8 @@ struct swr_draw_context {
>>
>> float userClipPlanes[PIPE_MAX_CLIP_PLANES][4];
>>
>> +   uint32_t polyStipple[32];
>> +
>> SWR_SURFACE_STATE renderTargets[SWR_NUM_ATTACHMENTS];
>> void *pStats;
>>  };
>> @@ -127,7 +129,7 @@ struct swr_context {
>> struct pipe_constant_buffer
>>constants[PIPE_SHADER_TYPES][PIPE_MAX_CONSTANT_BUFFERS];
>> struct pipe_framebuffer_state framebuffer;
>> -   struct pipe_poly_stipple poly_stipple;
>> +   struct swr_poly_stipple poly_stipple;
>> struct pipe_scissor_state scissor;
>> SWR_RECT swr_scissor;
>> struct pipe_sampler_view *
>> diff --git a/src/gallium/drivers/swr/swr_shader.cpp b/src/gallium/drivers/swr/swr_shader.cpp
>> index 6fc0596..d8f5512 100644
>> --- a/src/gallium/drivers/swr/swr_shader.cpp
>> +++ b/src/gallium/drivers/swr/swr_shader.cpp
>> @@ -165,6 +165,9 @@ swr_generate_fs_key(struct swr_jit_fs_key &key,
>>sizeof(key.vs_output_semantic_idx));
>>
>> swr_generate_sampler_key(swr_fs->info, ctx, PIPE_SHADER_FRAGMENT, key);
>> +
>> +   key.poly_stipple_enable = ctx->rasterizer->poly_stipple_enable &&
>> +  ctx->poly_stipple.prim_is_poly;
>>  }
>>
>>  void
>> @@ -1099,17 +1102,58 @@ BuilderSWR::CompileFS(struct swr_context *ctx, swr_jit_fs_key &key)
>> memset(&system_values, 0, sizeof(system_values));
>>
>> struct lp_build_mask_context mask;
>> +   bool uses_mask = false;
>>
>> -   if (swr_fs->info.base.uses_kill) {
>> -  Value *mask_val = LOAD(pPS, {0, SWR_PS_CONTEXT_activeMask}, "activeMask");
>> +   if (swr_fs->info.base.uses_kill ||
>> +   key.poly_stipple_enable) {
>> +  Value *vActiveMask = NULL;
>> +  if (swr_fs->info.base.uses_kill) {
>> + vActiveMask = LOAD(pPS, {0, SWR_PS_CONTEXT_activeMask}, "activeMask");
>> +  }
>> +  if (key.poly_stipple_enable) {
>> + // first get fragment xy coords and clip to stipple bounds
>> + Value *vXf = LOAD(pPS, {0, SWR_PS_CONTEXT_vX, PixelPositions_UL});
>> + Value *vYf = LOAD(pPS, {0, SWR_PS_CONTEXT_vY, PixelPositions_UL});
>> + Value *vXu = FP_TO_UI(vXf, mSimdInt32Ty);
>> + Value *vYu = FP_TO_UI(vYf, mSimdInt32Ty);
>> +
>> + // stipple pattern is 32x32, which means that one line of stipple
>> + // is stored in one word:
>> + // vXstipple is bit offset inside 32-bit stipple word
>> + // vYstipple is word index in stipple array
>> + Value *vXstipple = AND(vXu, VIMMED1(0x1f)); // & (32-1)
>> + Value *vYstipple = AND(vYu, VIMMED1(0x1f)); // & (32-1)
>> +
>> + // grab stipple pattern base address
>> + Value *stipplePtr = GEP(hPrivateData, {0, swr_draw_context_polyStipple, 0});
>> + stipplePtr = BITCAST(stipplePtr, mInt8PtrTy);
>> +
>> + // perform a gather to grab stipple words for each lane
>> + Value *vStipple = GATHERDD(VUNDEF_I(), stipplePtr, vYstipple,
>> +VIMMED1(0xffffffff), C((char)4));
>> +
>> + // create a mask with one bit corresponding to the x stipple
>> + // and AND it with the pattern, to see if we have a bit
>> + Value *vBitMask = LSHR(VIMMED1(0x80000000), vXstipple);
>> + Value *vStippleMask = AND(vStipple, vBitMask);
>> + vStippleMask = ICMP_NE(vStippleMask, VIMMED1(0));
>> + vStippleMask = VMASK(vStippleMask);
>> +
>> + if (swr_fs->info.base.uses_kill) {
>> +vActiveMask = AND(vActiveMask, vStippleMask);
>> + } else {
>> +vActiveMask = vStippleMask;
>> + }
>> +  }
>>lp_build_mask_begin(
>> - &mask, gallivm, lp_type_float_vec(32, 32 * 8), wrap(mask_val));
>> + &mask, gallivm, lp_type_float_vec(32, 32 * 8), wrap(vActiveMask));
>> +  uses_mask = true;
>> }
>>
>> lp_build_tgsi_soa(gallivm,
>>   swr_fs->pipe.tokens,
>>   lp_type_float_vec(32, 32 * 8),
>> - swr_fs->info.base.uses_kill ? &mask : NULL, // mask
>> + uses_mask ? &mask : NULL, // 
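
For readers less used to the JIT builder calls, the per-lane stipple test in the quoted patch can be written as a scalar sketch. It assumes the pattern is stored as 32 words of 32 bits with the leftmost pixel in the most significant bit (a mask of 0x80000000 shifted right by x); the function name is hypothetical and this is not the swr code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar equivalent of one SIMD lane: returns true when the fragment at
 * window position (x, y) is kept by the 32x32 stipple pattern. */
static bool
stipple_passes(const uint32_t pattern[32], unsigned x, unsigned y)
{
   /* The pattern repeats every 32 pixels in both directions. */
   uint32_t word = pattern[y & 0x1f];                 /* row = word index */
   uint32_t bit  = UINT32_C(0x80000000) >> (x & 0x1f); /* column = bit */

   return (word & bit) != 0;
}
```

The JIT version does the same for all lanes at once: the gather fetches one pattern word per lane, and the resulting mask is ANDed into the active mask only when the shader also uses kill.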

[Mesa-dev] [PATCH 3/3] winsys/amdgpu: init buffer_indices_hashlist with memset()

2017-04-14 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index f068d8ea7a..8a277d08e1 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -695,8 +695,6 @@ static void amdgpu_ib_finalize(struct amdgpu_ib *ib)
 static bool amdgpu_init_cs_context(struct amdgpu_cs_context *cs,
enum ring_type ring_type)
 {
-   int i;
-
switch (ring_type) {
case RING_DMA:
   cs->request.ip_type = AMDGPU_HW_IP_DMA;
@@ -720,9 +718,7 @@ static bool amdgpu_init_cs_context(struct amdgpu_cs_context *cs,
   break;
}
 
-   for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
-  cs->buffer_indices_hashlist[i] = -1;
-   }
+   memset(cs->buffer_indices_hashlist, -1, sizeof(cs->buffer_indices_hashlist));
cs->last_added_bo = NULL;
 
cs->request.number_of_ibs = 1;
@@ -757,9 +753,7 @@ static void amdgpu_cs_context_cleanup(struct amdgpu_cs_context *cs)
cs->num_sparse_buffers = 0;
amdgpu_fence_reference(&cs->fence, NULL);
 
-   for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
-  cs->buffer_indices_hashlist[i] = -1;
-   }
+   memset(cs->buffer_indices_hashlist, -1, sizeof(cs->buffer_indices_hashlist));
cs->last_added_bo = NULL;
 }
 
-- 
2.12.2
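
The replacement relies on memset() filling every byte with 0xff, which reads back as -1 for each element on two's complement targets. A small sketch of the equivalence, assuming an int array; the struct, the array size and the function names are hypothetical and not taken from amdgpu_cs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HASHLIST_SIZE_SKETCH 512 /* hypothetical size */

struct cs_sketch {
   int buffer_indices_hashlist[HASHLIST_SIZE_SKETCH];
};

/* Old form: one store per element. */
static void
reset_with_loop(struct cs_sketch *cs)
{
   size_t n = sizeof(cs->buffer_indices_hashlist) /
              sizeof(cs->buffer_indices_hashlist[0]);

   for (size_t i = 0; i < n; i++)
      cs->buffer_indices_hashlist[i] = -1;
}

/* New form: memset() converts -1 to the byte 0xff and writes it to every
 * byte of the array, so each element ends up with all bits set. */
static void
reset_with_memset(struct cs_sketch *cs)
{
   memset(cs->buffer_indices_hashlist, -1,
          sizeof(cs->buffer_indices_hashlist));
}

/* Check that every element reads back as -1. */
static bool
all_minus_one(const struct cs_sketch *cs)
{
   size_t n = sizeof(cs->buffer_indices_hashlist) /
              sizeof(cs->buffer_indices_hashlist[0]);

   for (size_t i = 0; i < n; i++) {
      if (cs->buffer_indices_hashlist[i] != -1)
         return false;
   }
   return true;
}
```

Note that this only works for fill values whose bytes are all identical, such as 0 and -1; a fill value like 1 would still need the loop.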


