date:20181017

Re: [Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

2018-10-17 Thread Keith Packard

Bas Nieuwenhuizen  writes:

> Reviewed-by: Bas Nieuwenhuizen 

Thanks to you, Jason and Lionel for reviewing the code and helping
improve it.

-- 
-keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/15] A bunch of shared code and RadeonSI changes

2018-10-17 Thread Dieter Nützel


GREAT work Marek!

Best speed-up in months, at least on Polaris 20.
I just came back from vacation with an injured right ankle, so I had no
time to test before the commit. But the 'glmark2' numbers are now better
than they were before all the Spectre mitigations (~8-9%?!).


As we say in German: 'Da geht noch was...' (there is still more to come) ;-)

Greetings,
Dieter

===
glmark2 2017.07
===
OpenGL Information
GL_VENDOR: X.Org
GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.26.0, 
4.18.14-1.gce1c446-default, LLVM 8.0.0)
GL_VERSION:4.5 (Compatibility Profile) Mesa 18.3.0-devel 
(git-58a51d0a67)

===
[build] use-vbo=false: FPS: 3382 FrameTime: 0.296 ms
[build] use-vbo=true: FPS: 11679 FrameTime: 0.086 ms
[texture] texture-filter=nearest: FPS: 11607 FrameTime: 0.086 ms
[texture] texture-filter=linear: FPS: 11572 FrameTime: 0.086 ms
[texture] texture-filter=mipmap: FPS: 11676 FrameTime: 0.086 ms
[shading] shading=gouraud: FPS: 12207 FrameTime: 0.082 ms
[shading] shading=blinn-phong-inf: FPS: 11892 FrameTime: 0.084 ms
[shading] shading=phong: FPS: 12073 FrameTime: 0.083 ms
[shading] shading=cel: FPS: 11763 FrameTime: 0.085 ms
[bump] bump-render=high-poly: FPS: 11252 FrameTime: 0.089 ms
[bump] bump-render=normals: FPS: 11366 FrameTime: 0.088 ms
[bump] bump-render=height: FPS: 11226 FrameTime: 0.089 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 12171 FrameTime: 0.082 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 11314 FrameTime: 
0.088 ms
[pulsar] light=false:quads=5:texture=false: FPS: 10452 FrameTime: 0.096 
ms

libpng warning: iCCP: known incorrect sRGB profile
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: 
FPS: 5506 FrameTime: 0.182 ms

libpng warning: iCCP: known incorrect sRGB profile
[desktop] effect=shadow:windows=4: FPS: 5864 FrameTime: 0.171 ms
[buffer] 
columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: 
FPS: 812 FrameTime: 1.232 ms
[buffer] 
columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: 
FPS: 1128 FrameTime: 0.887 ms
[buffer] 
columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: 
FPS: 893 FrameTime: 1.120 ms

[ideas] speed=duration: FPS: 2999 FrameTime: 0.333 ms
[jellyfish] : FPS: 9422 FrameTime: 0.106 ms
[terrain] : FPS: 1787 FrameTime: 0.560 ms
[shadow] : FPS: 8930 FrameTime: 0.112 ms
[refract] : FPS: 3418 FrameTime: 0.293 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 11901 FrameTime: 
0.084 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 11567 FrameTime: 
0.086 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 11614 FrameTime: 
0.086 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 11611 
FrameTime: 0.086 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 11643 
FrameTime: 0.086 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 11933 
FrameTime: 0.084 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 
11964 FrameTime: 0.084 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 11714 
FrameTime: 0.085 ms

===
  glmark2 Score: 9101
===

Before, even with DRM 3.27.0 (amd-staging-drm-next) I had

glmark2 Score: 8361
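
As a quick check of the ~8-9% figure, the two scores above can be compared directly. The following is only an illustrative sketch; the helper names percent_change and frame_time_ms are made up for this note and are not part of glmark2.

```c
/* Relative change from `before` to `after`, in percent. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Mean time per frame in milliseconds for a given frames-per-second value
 * (assumes FrameTime is simply 1000 ms divided by FPS). */
double frame_time_ms(double fps)
{
    return 1000.0 / fps;
}
```

With the scores above, percent_change(8361, 9101) comes out at roughly 8.85, and frame_time_ms(3382) at roughly 0.296, which matches the first [build] line.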

On 03.10.2018 at 00:35, Marek Olšák wrote:

Hi,

Interesting bits:
- CP DMA support for GDS (unused but there is a test)
- switch back to DX sample positions
- center the viewport in the scanline area for maximizing the guardband
- optimal PA_SU_PRIM_FILTER_CNTL
- higher subpixel precision for 4K and lower resolutions
  (for more precise rendering of T-junctions in geometry)

Please review.

Thanks,
Marek

Re: [Mesa-dev] [PATCH 2/2] radv: use nir_shrink_vec_array_vars()

2018-10-17 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

for the series.

I wonder what the perf diff is for tessellation. See e.g.
https://github.com/doitsujin/dxvk/issues/645 for a game where
tessellation is hitting us hard.
On Thu, Oct 18, 2018 at 1:28 AM Timothy Arceri  wrote:
>
> Totals from affected shaders:
> SGPRS: 1096 -> 1096 (0.00 %)
> VGPRS: 1192 -> 1056 (-11.41 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 100940 -> 94384 (-6.49 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 100 -> 112 (12.00 %)
> Wait states: 0 -> 0 (0.00 %)
>
> All affected shaders are from Batman Arkham City.
> ---
>  src/amd/vulkan/radv_shader.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 13858b6130f..15c9de1e020 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -127,6 +127,7 @@ radv_optimize_nir(struct nir_shader *shader, bool 
> optimize_conservatively,
>  progress = false;
>
> NIR_PASS(progress, shader, nir_split_array_vars, 
> nir_var_local);
> +   NIR_PASS(progress, shader, nir_shrink_vec_array_vars, 
> nir_var_local);
>
>  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
> NIR_PASS_V(shader, nir_lower_pack);
> --
> 2.17.1
>

[Mesa-dev] [PATCH 2/2] radv: use nir_shrink_vec_array_vars()

2018-10-17 Thread Timothy Arceri

Totals from affected shaders:
SGPRS: 1096 -> 1096 (0.00 %)
VGPRS: 1192 -> 1056 (-11.41 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 100940 -> 94384 (-6.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 100 -> 112 (12.00 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.
---
 src/amd/vulkan/radv_shader.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 13858b6130f..15c9de1e020 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -127,6 +127,7 @@ radv_optimize_nir(struct nir_shader *shader, bool 
optimize_conservatively,
 progress = false;
 
NIR_PASS(progress, shader, nir_split_array_vars, nir_var_local);
+   NIR_PASS(progress, shader, nir_shrink_vec_array_vars, 
nir_var_local);
 
 NIR_PASS_V(shader, nir_lower_vars_to_ssa);
NIR_PASS_V(shader, nir_lower_pack);
-- 
2.17.1


[Mesa-dev] [PATCH 1/2] radv: use nir_split_array_vars()

2018-10-17 Thread Timothy Arceri

We call the pass inside the optimization loop in case another pass turns
an array's indirect accesses into direct accesses.
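
To illustrate why running a pass inside the loop matters, here is a minimal sketch of a fixed-point optimization loop. Everything in it is hypothetical: struct toy_shader and the toy_* passes are stand-ins invented for this note, not NIR code. The point is only that one pass can create new work for another, so both are repeated until neither reports progress.

```c
#include <stdbool.h>

/* Toy stand-in for a shader; all names here are hypothetical. */
struct toy_shader {
    int indirect_accesses; /* array accesses with a non-constant index */
    int unsplit_arrays;    /* arrays that are only accessed directly */
};

/* Pretend constant folding: turns one indirect access into a direct one,
 * which leaves behind an array that can now be split. */
static bool toy_fold_constants(struct toy_shader *s)
{
    if (s->indirect_accesses == 0)
        return false;
    s->indirect_accesses--;
    s->unsplit_arrays++;
    return true;
}

/* Pretend array splitting: only handles directly accessed arrays. */
static bool toy_split_arrays(struct toy_shader *s)
{
    if (s->unsplit_arrays == 0)
        return false;
    s->unsplit_arrays = 0;
    return true;
}

/* Repeat both passes until neither makes progress.
 * Returns the number of loop iterations that were needed. */
int toy_optimize(struct toy_shader *s)
{
    int iterations = 0;
    bool progress;

    do {
        progress = false;
        progress |= toy_split_arrays(s);
        progress |= toy_fold_constants(s);
        iterations++;
    } while (progress);

    return iterations;
}
```

If the splitting pass ran only once before the loop, the arrays that become directly accessed later would never be split in this toy model.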

Totals from affected shaders:
SGPRS: 512 -> 496 (-3.12 %)
VGPRS: 456 -> 452 (-0.88 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 40040 -> 39664 (-0.94 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 41 -> 43 (4.88 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.
---
 src/amd/vulkan/radv_shader.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 52aa83d4a5a..13858b6130f 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -126,6 +126,8 @@ radv_optimize_nir(struct nir_shader *shader, bool 
optimize_conservatively,
 do {
 progress = false;
 
+   NIR_PASS(progress, shader, nir_split_array_vars, nir_var_local);
+
 NIR_PASS_V(shader, nir_lower_vars_to_ssa);
NIR_PASS_V(shader, nir_lower_pack);
 
-- 
2.17.1


Re: [Mesa-dev] [PATCH] radv: use nir_opt_find_array_copies()

2018-10-17 Thread Jason Ekstrand

For full effect, you also want to enable nir_shrink_vec_array_vars and
nir_split_array_vars.

On Wed, Oct 17, 2018 at 6:00 PM Timothy Arceri 
wrote:

> Totals from affected shaders:
> SGPRS: 1112 -> 1112 (0.00 %)
> VGPRS: 1492 -> 1196 (-19.84 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 112172 -> 101316 (-9.68 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 93 -> 98 (5.38 %)
> Wait states: 0 -> 0 (0.00 %)
>
> All affected shaders are from "Batman: Arkham City" over DXVK.
>
> The pass detects that the temporary array created by DXVK for
> storing TCS inputs is a copy of the input arrays and allows
> us to avoid copying all of the input data and then indirecting
> on it with if-ladders, instead we just do indirect indexing.
> ---
>  src/amd/vulkan/radv_pipeline.c |  6 +++---
>  src/amd/vulkan/radv_shader.c   | 22 ++++++++++++++++++----
>  src/amd/vulkan/radv_shader.h   |  3 ++-
>  3 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c
> b/src/amd/vulkan/radv_pipeline.c
> index e1d665d0ac7..8d15a048bbf 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -1808,13 +1808,13 @@ radv_link_shaders(struct radv_pipeline *pipeline,
> nir_shader **shaders)
>
> ac_lower_indirect_derefs(ordered_shaders[i],
>
>  pipeline->device->physical_device->rad_info.chip_class);
> }
> -   radv_optimize_nir(ordered_shaders[i], false);
> +   radv_optimize_nir(ordered_shaders[i], false,
> false);
>
> if
> (nir_lower_global_vars_to_local(ordered_shaders[i - 1])) {
> ac_lower_indirect_derefs(ordered_shaders[i
> - 1],
>
>  pipeline->device->physical_device->rad_info.chip_class);
> }
> -   radv_optimize_nir(ordered_shaders[i - 1], false);
> +   radv_optimize_nir(ordered_shaders[i - 1], false,
> false);
> }
> }
>  }
> @@ -2073,7 +2073,7 @@ void radv_create_shaders(struct radv_pipeline
> *pipeline,
>
> if (!(flags &
> VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT)) {
> nir_lower_io_to_scalar_early(nir[i], mask);
> -   radv_optimize_nir(nir[i], false);
> +   radv_optimize_nir(nir[i], false, false);
> }
> }
> }
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3b3422c8da6..52aa83d4a5a 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -118,7 +118,8 @@ void radv_DestroyShaderModule(
>  }
>
>  void
> -radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively)
> +radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively,
> +  bool allow_copies)
>  {
>  bool progress;
>
> @@ -128,6 +129,15 @@ radv_optimize_nir(struct nir_shader *shader, bool
> optimize_conservatively)
>  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
> NIR_PASS_V(shader, nir_lower_pack);
>
> +   if (allow_copies) {
> +   /* Only run this pass in the first call to
> +* radv_optimize_nir.  Later calls assume that
> we've
> +* lowered away any copy_deref instructions and we
> +*  don't want to introduce any more.
> +   */
> +   NIR_PASS(progress, shader,
> nir_opt_find_array_copies);
> +   }
> +
> NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
> NIR_PASS(progress, shader, nir_opt_dead_write_vars);
>
> @@ -306,7 +316,6 @@ radv_shader_compile_to_nir(struct radv_device *device,
> }
>
> nir_split_var_copies(nir);
> -   nir_lower_var_copies(nir);
>
> nir_lower_global_vars_to_local(nir);
> nir_remove_dead_variables(nir, nir_var_local);
> @@ -323,7 +332,12 @@ radv_shader_compile_to_nir(struct radv_device *device,
> nir_lower_load_const_to_scalar(nir);
>
> if (!(flags & VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT))
> -   radv_optimize_nir(nir, false);
> +   radv_optimize_nir(nir, false, true);
> +
> +   /* We call nir_lower_var_copies() after the first
> radv_optimize_nir()
> +* to remove any copies introduced by nir_opt_find_array_copies().
> +*/
> +   nir_lower_var_copies(nir);
>
> /* Indirect lowering must be called after the radv_optimize_nir()
> loop
>  * has been called at least once. Otherwise indirect lowering can
> @@ -331,7 +345,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
>  * considered too large for unrolling.
>  */
> ac_lower_ind

Re: [Mesa-dev] [PATCH] radv: use nir_opt_find_array_copies()

2018-10-17 Thread Jason Ekstrand

and split_struct_vars while you're at it

On Wed, Oct 17, 2018 at 6:15 PM Jason Ekstrand  wrote:

> For full effect, you also want to enable nir_shrink_vec_array_vars and
> nir_split_array_vars.
>
> On Wed, Oct 17, 2018 at 6:00 PM Timothy Arceri 
> wrote:
>
>> Totals from affected shaders:
>> SGPRS: 1112 -> 1112 (0.00 %)
>> VGPRS: 1492 -> 1196 (-19.84 %)
>> Spilled SGPRs: 0 -> 0 (0.00 %)
>> Spilled VGPRs: 0 -> 0 (0.00 %)
>> Private memory VGPRs: 0 -> 0 (0.00 %)
>> Scratch size: 0 -> 0 (0.00 %) dwords per thread
>> Code Size: 112172 -> 101316 (-9.68 %) bytes
>> LDS: 0 -> 0 (0.00 %) blocks
>> Max Waves: 93 -> 98 (5.38 %)
>> Wait states: 0 -> 0 (0.00 %)
>>
>> All affected shaders are from "Batman: Arkham City" over DXVK.
>>
>> The pass detects that the temporary array created by DXVK for
>> storing TCS inputs is a copy of the input arrays and allows
>> us to avoid copying all of the input data and then indirecting
>> on it with if-ladders, instead we just do indirect indexing.
>> ---
>>  src/amd/vulkan/radv_pipeline.c |  6 +++---
>>  src/amd/vulkan/radv_shader.c   | 22 ++++++++++++++++++----
>>  src/amd/vulkan/radv_shader.h   |  3 ++-
>>  3 files changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/amd/vulkan/radv_pipeline.c
>> b/src/amd/vulkan/radv_pipeline.c
>> index e1d665d0ac7..8d15a048bbf 100644
>> --- a/src/amd/vulkan/radv_pipeline.c
>> +++ b/src/amd/vulkan/radv_pipeline.c
>> @@ -1808,13 +1808,13 @@ radv_link_shaders(struct radv_pipeline *pipeline,
>> nir_shader **shaders)
>>
>> ac_lower_indirect_derefs(ordered_shaders[i],
>>
>>  pipeline->device->physical_device->rad_info.chip_class);
>> }
>> -   radv_optimize_nir(ordered_shaders[i], false);
>> +   radv_optimize_nir(ordered_shaders[i], false,
>> false);
>>
>> if
>> (nir_lower_global_vars_to_local(ordered_shaders[i - 1])) {
>>
>> ac_lower_indirect_derefs(ordered_shaders[i - 1],
>>
>>  pipeline->device->physical_device->rad_info.chip_class);
>> }
>> -   radv_optimize_nir(ordered_shaders[i - 1], false);
>> +   radv_optimize_nir(ordered_shaders[i - 1], false,
>> false);
>> }
>> }
>>  }
>> @@ -2073,7 +2073,7 @@ void radv_create_shaders(struct radv_pipeline
>> *pipeline,
>>
>> if (!(flags &
>> VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT)) {
>> nir_lower_io_to_scalar_early(nir[i],
>> mask);
>> -   radv_optimize_nir(nir[i], false);
>> +   radv_optimize_nir(nir[i], false, false);
>> }
>> }
>> }
>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>> index 3b3422c8da6..52aa83d4a5a 100644
>> --- a/src/amd/vulkan/radv_shader.c
>> +++ b/src/amd/vulkan/radv_shader.c
>> @@ -118,7 +118,8 @@ void radv_DestroyShaderModule(
>>  }
>>
>>  void
>> -radv_optimize_nir(struct nir_shader *shader, bool
>> optimize_conservatively)
>> +radv_optimize_nir(struct nir_shader *shader, bool
>> optimize_conservatively,
>> +  bool allow_copies)
>>  {
>>  bool progress;
>>
>> @@ -128,6 +129,15 @@ radv_optimize_nir(struct nir_shader *shader, bool
>> optimize_conservatively)
>>  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
>> NIR_PASS_V(shader, nir_lower_pack);
>>
>> +   if (allow_copies) {
>> +   /* Only run this pass in the first call to
>> +* radv_optimize_nir.  Later calls assume that
>> we've
>> +* lowered away any copy_deref instructions and we
>> +*  don't want to introduce any more.
>> +   */
>> +   NIR_PASS(progress, shader,
>> nir_opt_find_array_copies);
>> +   }
>> +
>> NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
>> NIR_PASS(progress, shader, nir_opt_dead_write_vars);
>>
>> @@ -306,7 +316,6 @@ radv_shader_compile_to_nir(struct radv_device *device,
>> }
>>
>> nir_split_var_copies(nir);
>> -   nir_lower_var_copies(nir);
>>
>> nir_lower_global_vars_to_local(nir);
>> nir_remove_dead_variables(nir, nir_var_local);
>> @@ -323,7 +332,12 @@ radv_shader_compile_to_nir(struct radv_device
>> *device,
>> nir_lower_load_const_to_scalar(nir);
>>
>> if (!(flags & VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT))
>> -   radv_optimize_nir(nir, false);
>> +   radv_optimize_nir(nir, false, true);
>> +
>> +   /* We call nir_lower_var_copies() after the first
>> radv_optimize_nir()
>> +* to remove any copies introduced by nir_opt_find_array_copies().
>> +*/
>> +   nir_lower_var_copies(nir);
>>
>> /* Indirect lowering must be called after the radv_optimize_nir()
>> loop
>>  * has been called

Re: [Mesa-dev] [PATCH] radv: use nir_opt_find_array_copies()

2018-10-17 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 
On Thu, Oct 18, 2018 at 1:00 AM Timothy Arceri  wrote:
>
> Totals from affected shaders:
> SGPRS: 1112 -> 1112 (0.00 %)
> VGPRS: 1492 -> 1196 (-19.84 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 112172 -> 101316 (-9.68 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 93 -> 98 (5.38 %)
> Wait states: 0 -> 0 (0.00 %)
>
> All affected shaders are from "Batman: Arkham City" over DXVK.
>
> The pass detects that the temporary array created by DXVK for
> storing TCS inputs is a copy of the input arrays and allows
> us to avoid copying all of the input data and then indirecting
> on it with if-ladders, instead we just do indirect indexing.
> ---
>  src/amd/vulkan/radv_pipeline.c |  6 +++---
>  src/amd/vulkan/radv_shader.c   | 22 ++++++++++++++++++----
>  src/amd/vulkan/radv_shader.h   |  3 ++-
>  3 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index e1d665d0ac7..8d15a048bbf 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -1808,13 +1808,13 @@ radv_link_shaders(struct radv_pipeline *pipeline, 
> nir_shader **shaders)
> ac_lower_indirect_derefs(ordered_shaders[i],
>  
> pipeline->device->physical_device->rad_info.chip_class);
> }
> -   radv_optimize_nir(ordered_shaders[i], false);
> +   radv_optimize_nir(ordered_shaders[i], false, false);
>
> if (nir_lower_global_vars_to_local(ordered_shaders[i 
> - 1])) {
> ac_lower_indirect_derefs(ordered_shaders[i - 
> 1],
>  
> pipeline->device->physical_device->rad_info.chip_class);
> }
> -   radv_optimize_nir(ordered_shaders[i - 1], false);
> +   radv_optimize_nir(ordered_shaders[i - 1], false, 
> false);
> }
> }
>  }
> @@ -2073,7 +2073,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
>
> if (!(flags & 
> VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT)) {
> nir_lower_io_to_scalar_early(nir[i], mask);
> -   radv_optimize_nir(nir[i], false);
> +   radv_optimize_nir(nir[i], false, false);
> }
> }
> }
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3b3422c8da6..52aa83d4a5a 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -118,7 +118,8 @@ void radv_DestroyShaderModule(
>  }
>
>  void
> -radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively)
> +radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively,
> +  bool allow_copies)
>  {
>  bool progress;
>
> @@ -128,6 +129,15 @@ radv_optimize_nir(struct nir_shader *shader, bool 
> optimize_conservatively)
>  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
> NIR_PASS_V(shader, nir_lower_pack);
>
> +   if (allow_copies) {
> +   /* Only run this pass in the first call to
> +* radv_optimize_nir.  Later calls assume that we've
> +* lowered away any copy_deref instructions and we
> +*  don't want to introduce any more.
> +   */
> +   NIR_PASS(progress, shader, nir_opt_find_array_copies);
> +   }
> +
> NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
> NIR_PASS(progress, shader, nir_opt_dead_write_vars);
>
> @@ -306,7 +316,6 @@ radv_shader_compile_to_nir(struct radv_device *device,
> }
>
> nir_split_var_copies(nir);
> -   nir_lower_var_copies(nir);
>
> nir_lower_global_vars_to_local(nir);
> nir_remove_dead_variables(nir, nir_var_local);
> @@ -323,7 +332,12 @@ radv_shader_compile_to_nir(struct radv_device *device,
> nir_lower_load_const_to_scalar(nir);
>
> if (!(flags & VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT))
> -   radv_optimize_nir(nir, false);
> +   radv_optimize_nir(nir, false, true);
> +
> +   /* We call nir_lower_var_copies() after the first radv_optimize_nir()
> +* to remove any copies introduced by nir_opt_find_array_copies().
> +*/
> +   nir_lower_var_copies(nir);
>
> /* Indirect lowering must be called after the radv_optimize_nir() loop
>  * has been called at least once. Otherwise indirect lowering can
> @@ -331,7 +345,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
>

Re: [Mesa-dev] [PATCH] radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars

2018-10-17 Thread Timothy Arceri


On 18/10/18 9:51 am, Bas Nieuwenhuizen wrote:

On Thu, Oct 18, 2018 at 12:04 AM Timothy Arceri  wrote:


Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)


Interesting that both max waves and VGPRs increased.


Yeah, it was just one of those changes where the new NIR increased VGPR
use in more shaders than the number whose VGPR use dropped far enough to
bump the max waves. However, as I tried to indicate below, the increase
in VGPRs is something that could likely be improved on the LLVM side;
the NIR itself looks much better.
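
A small sketch of how both totals can rise at once. This is a simplified occupancy model with loudly assumed parameters (256 VGPRs available per SIMD lane, allocation in granules of 4, at most 10 waves per SIMD); the real limits depend on the chip, and the function name is made up for this note.

```c
/* Simplified occupancy model. ASSUMPTIONS, not taken from the patch:
 * 256 VGPRs per SIMD lane, allocation granularity of 4 registers,
 * and a hard cap of 10 waves per SIMD. */
unsigned max_waves_for_vgprs(unsigned vgprs)
{
    const unsigned budget = 256;
    const unsigned granule = 4;
    const unsigned cap = 10;

    if (vgprs == 0)
        return cap;

    /* Round the register count up to the allocation granularity. */
    unsigned allocated = (vgprs + granule - 1) / granule * granule;
    unsigned waves = budget / allocated;

    return waves < cap ? waves : cap;
}
```

Under these assumptions, two shaders going from 44 to 48 VGPRs stay at 5 waves each, while one shader going from 52 to 48 VGPRs moves from 4 to 5 waves. Summed over the three, VGPRs go from 140 to 144 and max waves from 14 to 15, which is the same shape as the totals quoted above.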




Reviewed-by: Bas Nieuwenhuizen 



Even in the cases where we have increased VGPR use, it appears
the NIR is improved significantly.
---
  src/amd/vulkan/radv_shader.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3e3eb96a531..3b3422c8da6 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -127,6 +127,10 @@ radv_optimize_nir(struct nir_shader *shader, bool 
optimize_conservatively)

  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
 NIR_PASS_V(shader, nir_lower_pack);
+
+   NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
+   NIR_PASS(progress, shader, nir_opt_dead_write_vars);
+
  NIR_PASS_V(shader, nir_lower_alu_to_scalar);
  NIR_PASS_V(shader, nir_lower_phis_to_scalar);

--
2.17.1


[Mesa-dev] [PATCH] radv: use nir_opt_find_array_copies()

2018-10-17 Thread Timothy Arceri

Totals from affected shaders:
SGPRS: 1112 -> 1112 (0.00 %)
VGPRS: 1492 -> 1196 (-19.84 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 112172 -> 101316 (-9.68 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 93 -> 98 (5.38 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from "Batman: Arkham City" over DXVK.

The pass detects that the temporary array created by DXVK for
storing TCS inputs is a copy of the input arrays and allows
us to avoid copying all of the input data and then indirecting
on it with if-ladders, instead we just do indirect indexing.
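
For readers unfamiliar with the pattern, the difference can be sketched in plain C. This is only an analogy with hypothetical function names, not the generated shader code: the first function copies the inputs into a temporary and selects from it with an if-ladder, the second indexes the input array directly.

```c
/* "Before": copy every input into a temporary array, then pick one
 * element with a chain of comparisons (an if-ladder).
 * Assumes idx < 4. */
float select_with_ladder(const float in[4], unsigned idx)
{
    float tmp[4];

    for (unsigned i = 0; i < 4; i++)
        tmp[i] = in[i];

    if (idx == 0)
        return tmp[0];
    else if (idx == 1)
        return tmp[1];
    else if (idx == 2)
        return tmp[2];
    else
        return tmp[3];
}

/* "After": once the temporary is known to be a plain copy of the
 * inputs, the copy can be dropped and the input indexed directly.
 * Assumes idx < 4. */
float select_indirect(const float in[4], unsigned idx)
{
    return in[idx];
}
```

Both functions return the same value for any valid index; the second simply avoids the copy and the comparisons.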
---
 src/amd/vulkan/radv_pipeline.c |  6 +++---
 src/amd/vulkan/radv_shader.c   | 22 ++++++++++++++++++----
 src/amd/vulkan/radv_shader.h   |  3 ++-
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index e1d665d0ac7..8d15a048bbf 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1808,13 +1808,13 @@ radv_link_shaders(struct radv_pipeline *pipeline, 
nir_shader **shaders)
ac_lower_indirect_derefs(ordered_shaders[i],
 
pipeline->device->physical_device->rad_info.chip_class);
}
-   radv_optimize_nir(ordered_shaders[i], false);
+   radv_optimize_nir(ordered_shaders[i], false, false);
 
if (nir_lower_global_vars_to_local(ordered_shaders[i - 
1])) {
ac_lower_indirect_derefs(ordered_shaders[i - 1],
 
pipeline->device->physical_device->rad_info.chip_class);
}
-   radv_optimize_nir(ordered_shaders[i - 1], false);
+   radv_optimize_nir(ordered_shaders[i - 1], false, false);
}
}
 }
@@ -2073,7 +2073,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 
if (!(flags & 
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT)) {
nir_lower_io_to_scalar_early(nir[i], mask);
-   radv_optimize_nir(nir[i], false);
+   radv_optimize_nir(nir[i], false, false);
}
}
}
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3b3422c8da6..52aa83d4a5a 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -118,7 +118,8 @@ void radv_DestroyShaderModule(
 }
 
 void
-radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively)
+radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively,
+  bool allow_copies)
 {
 bool progress;
 
@@ -128,6 +129,15 @@ radv_optimize_nir(struct nir_shader *shader, bool 
optimize_conservatively)
 NIR_PASS_V(shader, nir_lower_vars_to_ssa);
NIR_PASS_V(shader, nir_lower_pack);
 
+   if (allow_copies) {
+   /* Only run this pass in the first call to
+* radv_optimize_nir.  Later calls assume that we've
+* lowered away any copy_deref instructions and we
+*  don't want to introduce any more.
+   */
+   NIR_PASS(progress, shader, nir_opt_find_array_copies);
+   }
+
NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
NIR_PASS(progress, shader, nir_opt_dead_write_vars);
 
@@ -306,7 +316,6 @@ radv_shader_compile_to_nir(struct radv_device *device,
}
 
nir_split_var_copies(nir);
-   nir_lower_var_copies(nir);
 
nir_lower_global_vars_to_local(nir);
nir_remove_dead_variables(nir, nir_var_local);
@@ -323,7 +332,12 @@ radv_shader_compile_to_nir(struct radv_device *device,
nir_lower_load_const_to_scalar(nir);
 
if (!(flags & VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT))
-   radv_optimize_nir(nir, false);
+   radv_optimize_nir(nir, false, true);
+
+   /* We call nir_lower_var_copies() after the first radv_optimize_nir()
+* to remove any copies introduced by nir_opt_find_array_copies().
+*/
+   nir_lower_var_copies(nir);
 
/* Indirect lowering must be called after the radv_optimize_nir() loop
 * has been called at least once. Otherwise indirect lowering can
@@ -331,7 +345,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
 * considered too large for unrolling.
 */
ac_lower_indirect_derefs(nir, 
device->physical_device->rad_info.chip_class);
-   radv_optimize_nir(nir, flags & 
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT);
+   radv_optimize_nir(nir, flags & 
VK_PIPELINE_CREATE_DISABLE

Re: [Mesa-dev] [PATCH] radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars

2018-10-17 Thread Bas Nieuwenhuizen

On Thu, Oct 18, 2018 at 12:04 AM Timothy Arceri  wrote:
>
> Totals from affected shaders:
> SGPRS: 2856 -> 2856 (0.00 %)
> VGPRS: 3236 -> 3248 (0.37 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 236560 -> 233548 (-1.27 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 277 -> 283 (2.17 %)
> Wait states: 0 -> 0 (0.00 %)

Interesting that both max waves and VGPRs increased.

Reviewed-by: Bas Nieuwenhuizen 

>
> Even in the cases where we have increased VGPR use, it appears
> the NIR is improved significantly.
> ---
>  src/amd/vulkan/radv_shader.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3e3eb96a531..3b3422c8da6 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -127,6 +127,10 @@ radv_optimize_nir(struct nir_shader *shader, bool 
> optimize_conservatively)
>
>  NIR_PASS_V(shader, nir_lower_vars_to_ssa);
> NIR_PASS_V(shader, nir_lower_pack);
> +
> +   NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
> +   NIR_PASS(progress, shader, nir_opt_dead_write_vars);
> +
>  NIR_PASS_V(shader, nir_lower_alu_to_scalar);
>  NIR_PASS_V(shader, nir_lower_phis_to_scalar);
>
> --
> 2.17.1
>

Re: [Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

2018-10-17 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 
On Wed, Oct 17, 2018 at 6:49 PM Keith Packard  wrote:
>
> Offers three clocks, device, clock monotonic and clock monotonic
> raw. Could use some kernel support to reduce the deviation between
> clock values.
>
> v2:
> Ensure deviation is at least as big as the GPU time interval.
>
> v3:
> Set device->lost when returning DEVICE_LOST.
> Use MAX2 and DIV_ROUND_UP instead of open coding these.
> Delete spurious TIMESTAMP in radv version.
>
> Suggested-by: Jason Ekstrand 
> Suggested-by: Lionel Landwerlin 
>
> v4:
> Add anv_gem_reg_read to anv_gem_stubs.c
>
> Suggested-by: Jason Ekstrand 
>
> v5:
> Adjust maxDeviation computation to max(sampled_clock_period) +
> sample_interval.
>
> Suggested-by: Bas Nieuwenhuizen 
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
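
The v5 rule quoted above (maxDeviation = max(sampled_clock_period) + sample_interval) can be sketched as follows. This is a standalone illustration, not the driver code: the function names are made up for this note, and the device period helper assumes the crystal frequency is given in kHz.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds (1 s = 1000000000 ns). */
uint64_t timespec_to_ns(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Read a POSIX clock in nanoseconds; returns 0 on failure. */
uint64_t sample_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) < 0)
        return 0;
    return timespec_to_ns(&ts);
}

/* Length of one device tick in ns, rounded up.
 * ASSUMPTION: crystal_khz is the crystal frequency in kHz. */
uint64_t device_period_ns(uint32_t crystal_khz)
{
    return (1000000ULL + crystal_khz - 1) / crystal_khz;
}

/* Deviation bound: the time spent sampling plus the coarsest tick
 * among the clocks that were sampled. */
uint64_t max_deviation_ns(uint64_t begin_ns, uint64_t end_ns,
                          uint64_t max_clock_period_ns)
{
    assert(end_ns >= begin_ns);
    return (end_ns - begin_ns) + max_clock_period_ns;
}
```

For example, if sampling took 500 ns and the coarsest clock ticks every 40 ns (a hypothetical 25 MHz crystal), the reported bound would be 540 ns.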
> ---
>  src/amd/vulkan/radv_device.c   | 119 +++
>  src/amd/vulkan/radv_extensions.py  |   1 +
>  src/intel/vulkan/anv_device.c  | 127 +
>  src/intel/vulkan/anv_extensions.py |   1 +
>  src/intel/vulkan/anv_gem.c |  13 +++
>  src/intel/vulkan/anv_gem_stubs.c   |   7 ++
>  src/intel/vulkan/anv_private.h |   2 +
>  7 files changed, 270 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 174922780fc..4a705a724ef 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -4955,3 +4955,122 @@ radv_GetDeviceGroupPeerMemoryFeatures(
>VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT |
>VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT;
>  }
> +
> +static const VkTimeDomainEXT radv_time_domains[] = {
> +   VK_TIME_DOMAIN_DEVICE_EXT,
> +   VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT,
> +   VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT,
> +};
> +
> +VkResult radv_GetPhysicalDeviceCalibrateableTimeDomainsEXT(
> +   VkPhysicalDevice physicalDevice,
> +   uint32_t *pTimeDomainCount,
> +   VkTimeDomainEXT  *pTimeDomains)
> +{
> +   int d;
> +   VK_OUTARRAY_MAKE(out, pTimeDomains, pTimeDomainCount);
> +
> +   for (d = 0; d < ARRAY_SIZE(radv_time_domains); d++) {
> +   vk_outarray_append(&out, i) {
> +   *i = radv_time_domains[d];
> +   }
> +   }
> +
> +   return vk_outarray_status(&out);
> +}
> +
> +static uint64_t
> +radv_clock_gettime(clockid_t clock_id)
> +{
> +   struct timespec current;
> +   int ret;
> +
> +   ret = clock_gettime(clock_id, &current);
> +   if (ret < 0 && clock_id == CLOCK_MONOTONIC_RAW)
> +   ret = clock_gettime(CLOCK_MONOTONIC, &current);
> +   if (ret < 0)
> +   return 0;
> +
> +   return (uint64_t) current.tv_sec * 1000000000ULL + current.tv_nsec;
> +}
> +
> +VkResult radv_GetCalibratedTimestampsEXT(
> +   VkDevice _device,
> +   uint32_t timestampCount,
> +   const VkCalibratedTimestampInfoEXT   *pTimestampInfos,
> +   uint64_t *pTimestamps,
> +   uint64_t *pMaxDeviation)
> +{
> +   RADV_FROM_HANDLE(radv_device, device, _device);
> +   uint32_t clock_crystal_freq = 
> device->physical_device->rad_info.clock_crystal_freq;
> +   int d;
> +   uint64_t begin, end;
> +uint64_t max_clock_period = 0;
> +
> +   begin = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
> +
> +   for (d = 0; d < timestampCount; d++) {
> +   switch (pTimestampInfos[d].timeDomain) {
> +   case VK_TIME_DOMAIN_DEVICE_EXT:
> +   pTimestamps[d] = device->ws->query_value(device->ws,
> +
> RADEON_TIMESTAMP);
> +uint64_t device_period = DIV_ROUND_UP(100, 
> clock_crystal_freq);
> +max_clock_period = MAX2(max_clock_period, 
> device_period);
> +   break;
> +   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT:
> +   pTimestamps[d] = radv_clock_gettime(CLOCK_MONOTONIC);
> +max_clock_period = MAX2(max_clock_period, 1);
> +   break;
> +
> +   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT:
> +   pTimestamps[d] = begin;
> +   break;
> +   default:
> +   pTimestamps[d] = 0;
> +   break;
> +   }
> +   }
> +
> +   end = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
> +
> +/*
> + * The maximum deviation is the sum of the interval over which we
> + * perform the sampling and the maximum perio
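
The v5 rule quoted above (maxDeviation = max(sampled_clock_period) +
sample_interval) can be sketched outside the driver. This is only a rough
illustration under stated assumptions, not the radv code: the names
sample_clock_ns and max_deviation_ns are made up for this note, and the
fallback from CLOCK_MONOTONIC_RAW to CLOCK_MONOTONIC simply mirrors what the
quoted helper appears to do.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a POSIX clock in nanoseconds, 0 on failure.
 * If CLOCK_MONOTONIC_RAW cannot be read, retry with CLOCK_MONOTONIC. */
static uint64_t sample_clock_ns(clockid_t clock_id)
{
    struct timespec ts;
    int ret = clock_gettime(clock_id, &ts);
#ifdef CLOCK_MONOTONIC_RAW
    if (ret < 0 && clock_id == CLOCK_MONOTONIC_RAW)
        ret = clock_gettime(CLOCK_MONOTONIC, &ts);
#endif
    if (ret < 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: the deviation bound is the time spent sampling
 * plus the longest period of any clock that was sampled. */
static uint64_t max_deviation_ns(uint64_t begin_ns, uint64_t end_ns,
                                 uint64_t max_clock_period_ns)
{
    assert(end_ns >= begin_ns);
    uint64_t sample_interval = end_ns - begin_ns;
    return sample_interval + max_clock_period_ns;
}
```

A caller would take one reading before and one after sampling all requested
domains, track the largest clock period seen along the way, and pass the three
values to max_deviation_ns.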

[Mesa-dev] [PATCH] radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars

2018-10-17 Thread Timothy Arceri

Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)

Even in the cases where we have increased VGPR use it appears
the NIR is improved significantly.
---
 src/amd/vulkan/radv_shader.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3e3eb96a531..3b3422c8da6 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -127,6 +127,10 @@ radv_optimize_nir(struct nir_shader *shader, bool 
optimize_conservatively)
 
 NIR_PASS_V(shader, nir_lower_vars_to_ssa);
NIR_PASS_V(shader, nir_lower_pack);
+
+   NIR_PASS(progress, shader, nir_opt_copy_prop_vars);
+   NIR_PASS(progress, shader, nir_opt_dead_write_vars);
+
 NIR_PASS_V(shader, nir_lower_alu_to_scalar);
 NIR_PASS_V(shader, nir_lower_phis_to_scalar);
 
-- 
2.17.1


[Mesa-dev] [PATCH v2 0/1] swr: Fix for LLVM 5 to 6 API change

2018-10-17 Thread Alok Hota

This patch fixes a compile error resulting from a function whose API
changed between LLVM versions 5 and 6. I sent this to mesa-dev, but it's
primarily a fix for the stable branch as it affects releases with LLVM
5-based codegen.

v2: included mesa-stable cc

Alok Hota (1):
  swr/rast: ignore CreateElementUnorderedAtomicMemCpy

 .../drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.17.1


[Mesa-dev] [PATCH v2 1/1] swr/rast: ignore CreateElementUnorderedAtomicMemCpy

2018-10-17 Thread Alok Hota

This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball

CC: 
---
 .../drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py 
b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
index d34e88d1bc..485403ae1e 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
+++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
@@ -161,7 +161,8 @@ def parse_ir_builder(input_file):
 func_name == 'CreateAlignmentAssumptionHelper' or
 func_name == 'CreateGEP' or
 func_name == 'CreateLoad' or
-func_name == 'CreateMaskedLoad'):
+func_name == 'CreateMaskedLoad' or
+func_name == 'CreateElementUnorderedAtomicMemCpy'):
 ignore = True
 
 # Convert CamelCase to CAMEL_CASE
-- 
2.17.1


Re: [Mesa-dev] Q: to which software renderers should we contribute to help virgl conformance testing

2018-10-17 Thread Roland Scheidegger

On 17.10.18 at 19:15, Gert Wollny wrote:
> Dear all, 
> 
> we are looking into doing a CI for virglrenderer that also runs a
> subset of the GLES dEQP, and in order to be able to run this also in
> gitlab.fd.o we were looking into the available gallium software
> renderers. Initial tests by just running the dEQP-GLES2 were quite
> successful in the sense that the execution time is not too long (a full
> run on the GL and GLES host with llvmpipe takes about 10 min [1]). 
> 
> Now to extend on that work the focus is turning to which software
> renderer has the most features, the least failing tests, and is
> actively developed. 
> 
> Simply looking at the commit stats it seems that the development of
> softpipe and llvmpipe is mostly stalled; swr, on the other hand, has seen
> quite some development, but mostly regarding performance, and given the
> FAQ [2] the focus is on a very specific application space and not so
> much on getting more features in.
I wouldn't quite say llvmpipe is stalled, although it's true that there
weren't all that many changes (in particular as far as new features are
concerned).

> 
> When checking for conformance of virglrenderer we need a host driver
> that is conformant itself, and we are willing to contribute here, but
> it seems to make most sense to focus this work on just one driver. To
> make sensible choice there are some open questions:
> 
> Are there plans to get swr and/or llvmpipe to support gles 3.1, or
> carry any of the drivers even further, maybe GLES 3.2 and desktop 4.x?
At a quick glance, for gles 3.1 llvmpipe would be missing mostly
compute shaders and shader images / ssbo, so definitely some work. GL 4
would add tessellation as well (at least I think these are the big parts
missing).
Unfortunately I don't have time to work on this, but it would be nice to
have indeed. Well, volunteers welcome, no special hw nor docs needed :-).
(Although softpipe is easier to work with, but it's just not all that
interesting.)

> 
> 
> Is there any specific interest to fix all failures that occur when
> running gles dEQP? In this bug report [3] Roland pointed out that
> "there is no goal as such to pass dEQP, although patches are welcome",
> any opinion for the other drivers? (for swr beyond what is written in
> the FAQ).
I think it wouldn't really be all that much work to get dEQP passing -
since llvmpipe is built to honor dx10 rules, which are typically more
strict than GL. But some things specific to GL fail. So IMHO if you want
a non-hw driver to pass dEQP, llvmpipe is probably still your best bet
(but of course, softpipe is generally easier to fix).
Can't really comment on swr there.

Roland


> 
> As pointed out in the FAQ, swr is very Intel specific, are there plans
> not laid out in the FAQ to support other, non-x86 hardware?
> 
> many thanks 
> Gert


[Mesa-dev] [PATCH 2/2] vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR

2018-10-17 Thread Jason Ekstrand

This got missed during 1.1 enabling because it was defined as an
interaction between device groups and WSI and it wasn't obvious it was
in the delta.

The idea behind it is that it's supposed to provide a hint to the
application in a multi-GPU setup to indicate which regions of the screen
are being scanned out by which GPU so a multi-device split-screen
rendering application can render each part of the screen on the GPU that
will be presenting it and avoid extra bus traffic between GPUs.  On a
single-GPU setup or one which doesn't support this present mode, we need
to do something.  We choose to return the window size (or a max-size
rect) if the compositor, X server, or crtc is associated with the given
physical device and zero rectangles otherwise.
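
The behaviour described above (one rectangle covering the window if this
physical device is the one presenting, zero rectangles otherwise) fits the
usual Vulkan two-call enumeration idiom. The following is a minimal sketch of
that idea only; struct rect2d, enum query_result and get_present_rectangles
are hypothetical stand-ins invented for this note and are not the WSI code in
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkRect2D and VkResult. */
struct rect2d { int32_t x, y; uint32_t width, height; };
enum query_result { QUERY_SUCCESS = 0, QUERY_INCOMPLETE = 1 };

/* Report one window-sized rectangle if this device presents the surface,
 * and zero rectangles otherwise. With rects == NULL only the count is
 * returned; otherwise *rect_count is the capacity on entry and the number
 * of rectangles written on exit. */
static enum query_result
get_present_rectangles(bool presented_by_this_device,
                       uint32_t window_width, uint32_t window_height,
                       uint32_t *rect_count, struct rect2d *rects)
{
    assert(rect_count != NULL);
    uint32_t available = presented_by_this_device ? 1 : 0;

    if (rects == NULL) {
        *rect_count = available;
        return QUERY_SUCCESS;
    }

    uint32_t capacity = *rect_count;
    uint32_t written = capacity < available ? capacity : available;
    if (written == 1)
        rects[0] = (struct rect2d){ 0, 0, window_width, window_height };
    *rect_count = written;
    return written < available ? QUERY_INCOMPLETE : QUERY_SUCCESS;
}
```

An application would call this once with rects == NULL to learn the count and
a second time with storage for that many rectangles.
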
---
 src/amd/vulkan/radv_wsi.c   | 14 +++
 src/intel/vulkan/anv_wsi.c  | 14 +++
 src/vulkan/wsi/wsi_common.c | 14 +++
 src/vulkan/wsi/wsi_common.h |  7 
 src/vulkan/wsi/wsi_common_display.c | 41 +++
 src/vulkan/wsi/wsi_common_private.h |  5 +++
 src/vulkan/wsi/wsi_common_wayland.c | 21 ++
 src/vulkan/wsi/wsi_common_x11.c | 61 +
 8 files changed, 177 insertions(+)

diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 8b165ea3916..43103a4ef85 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -284,3 +284,17 @@ VkResult radv_GetDeviceGroupSurfacePresentModesKHR(
 
return VK_SUCCESS;
 }
+
+VkResult radv_GetPhysicalDevicePresentRectanglesKHR(
+   VkPhysicalDevicephysicalDevice,
+   VkSurfaceKHRsurface,
+   uint32_t*   pRectCount,
+   VkRect2D*   pRects)
+{
+   RADV_FROM_HANDLE(radv_physical_device, device, physicalDevice);
+
+   return wsi_common_get_present_rectangles(&device->wsi_device,
+device->local_fd,
+surface,
+pRectCount, pRects);
+}
diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index 1c9a54804e8..5d672c211c4 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -293,3 +293,17 @@ VkResult anv_GetDeviceGroupSurfacePresentModesKHR(
 
return VK_SUCCESS;
 }
+
+VkResult anv_GetPhysicalDevicePresentRectanglesKHR(
+VkPhysicalDevicephysicalDevice,
+VkSurfaceKHRsurface,
+uint32_t*   pRectCount,
+VkRect2D*   pRects)
+{
+   ANV_FROM_HANDLE(anv_physical_device, device, physicalDevice);
+
+   return wsi_common_get_present_rectangles(&device->wsi_device,
+device->local_fd,
+surface,
+pRectCount, pRects);
+}
diff --git a/src/vulkan/wsi/wsi_common.c b/src/vulkan/wsi/wsi_common.c
index 1e3c4e0028b..ad4b8c9075e 100644
--- a/src/vulkan/wsi/wsi_common.c
+++ b/src/vulkan/wsi/wsi_common.c
@@ -803,6 +803,20 @@ wsi_common_get_surface_present_modes(struct wsi_device 
*wsi_device,
pPresentModes);
 }
 
+VkResult
+wsi_common_get_present_rectangles(struct wsi_device *wsi_device,
+  int local_fd,
+  VkSurfaceKHR _surface,
+  uint32_t* pRectCount,
+  VkRect2D* pRects)
+{
+   ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
+   struct wsi_interface *iface = wsi_device->wsi[surface->platform];
+
+   return iface->get_present_rectangles(surface, wsi_device, local_fd,
+pRectCount, pRects);
+}
+
 VkResult
 wsi_common_create_swapchain(struct wsi_device *wsi,
 VkDevice device,
diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h
index 424330de566..5b69c573d9e 100644
--- a/src/vulkan/wsi/wsi_common.h
+++ b/src/vulkan/wsi/wsi_common.h
@@ -199,6 +199,13 @@ wsi_common_get_surface_present_modes(struct wsi_device 
*wsi_device,
  uint32_t *pPresentModeCount,
  VkPresentModeKHR *pPresentModes);
 
+VkResult
+wsi_common_get_present_rectangles(struct wsi_device *wsi,
+  int local_fd,
+  VkSurfaceKHR surface,
+  uint32_t* pRectCount,
+  VkRect2D* pRects);
+
 VkResult
 wsi_common_get_surface_capabilities2ext(
struct wsi_device *wsi_device,
diff --git a/src/vulkan/wsi/wsi_common_display.c 
b/src/vulkan/wsi/wsi_common_display.c
index c004060a205..2315717ef8e 100644
--- a/src/vulkan/wsi/wsi_common_display.c
+++

[Mesa-dev] [PATCH 1/2] vulkan/wsi: Store the instance allocator in wsi_device

2018-10-17 Thread Jason Ekstrand

We already have wsi_device and we know the instance allocator at
wsi_device_init time so there's no need to pass it into the physical
device queries.  This also fixes a memory allocation domain bug that can
occur if CreateSwapchain gets called prior to any queries (not likely)
in which case the cached connection gets allocated off the device
instead of the instance.
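
To illustrate the idea of capturing the allocator once at init time instead
of threading it through every query, here is a small sketch. All names in it
(alloc_callbacks, wsi_device_sketch, wsi_sketch_*) are hypothetical and much
simpler than VkAllocationCallbacks and struct wsi_device; it only shows why a
lazily cached object then always comes from the instance allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for VkAllocationCallbacks. */
struct alloc_callbacks {
    void *user_data;
    void *(*alloc_fn)(void *user_data, size_t size);
    void (*free_fn)(void *user_data, void *ptr);
};

struct wsi_device_sketch {
    struct alloc_callbacks instance_alloc; /* copied once at init */
    void *cached_connection;               /* created lazily */
};

static void wsi_sketch_init(struct wsi_device_sketch *wsi,
                            const struct alloc_callbacks *alloc)
{
    assert(wsi != NULL && alloc != NULL);
    memset(wsi, 0, sizeof(*wsi));
    wsi->instance_alloc = *alloc;
}

/* No allocator parameter: whichever entry point gets here first, the
 * cached object is allocated from the instance allocator. */
static void *wsi_sketch_get_connection(struct wsi_device_sketch *wsi)
{
    if (wsi->cached_connection == NULL)
        wsi->cached_connection =
            wsi->instance_alloc.alloc_fn(wsi->instance_alloc.user_data, 64);
    return wsi->cached_connection;
}

static void wsi_sketch_finish(struct wsi_device_sketch *wsi)
{
    if (wsi->cached_connection != NULL)
        wsi->instance_alloc.free_fn(wsi->instance_alloc.user_data,
                                    wsi->cached_connection);
    wsi->cached_connection = NULL;
}

/* Example allocator that counts live allocations through user_data. */
static void *counting_alloc(void *user_data, size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        ++*(int *)user_data;
    return p;
}

static void counting_free(void *user_data, void *ptr)
{
    if (ptr != NULL)
        --*(int *)user_data;
    free(ptr);
}
```
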
---
 src/amd/vulkan/radv_wsi.c   |  1 -
 src/amd/vulkan/radv_wsi_x11.c   |  2 --
 src/intel/vulkan/anv_wsi.c  |  1 -
 src/intel/vulkan/anv_wsi_x11.c  |  2 --
 src/vulkan/wsi/wsi_common.c |  4 ++--
 src/vulkan/wsi/wsi_common.h |  4 +++-
 src/vulkan/wsi/wsi_common_display.c |  1 -
 src/vulkan/wsi/wsi_common_private.h |  1 -
 src/vulkan/wsi/wsi_common_wayland.c |  1 -
 src/vulkan/wsi/wsi_common_x11.c | 25 +++--
 src/vulkan/wsi/wsi_common_x11.h |  1 -
 11 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 6479bea070b..8b165ea3916 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -75,7 +75,6 @@ VkResult radv_GetPhysicalDeviceSurfaceSupportKHR(
  device->local_fd,
  queueFamilyIndex,
  surface,
- &device->instance->alloc,
  pSupported);
 }
 
diff --git a/src/amd/vulkan/radv_wsi_x11.c b/src/amd/vulkan/radv_wsi_x11.c
index c65ac938772..9ef02ccc435 100644
--- a/src/amd/vulkan/radv_wsi_x11.c
+++ b/src/amd/vulkan/radv_wsi_x11.c
@@ -44,7 +44,6 @@ VkBool32 radv_GetPhysicalDeviceXcbPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   &device->wsi_device,
-  &device->instance->alloc,
   queueFamilyIndex,
   device->local_fd, true,
   connection, visual_id);
@@ -60,7 +59,6 @@ VkBool32 radv_GetPhysicalDeviceXlibPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   &device->wsi_device,
-  &device->instance->alloc,
   queueFamilyIndex,
   device->local_fd, true,
   XGetXCBConnection(dpy), visualID);
diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index 5ed1d711689..1c9a54804e8 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -92,7 +92,6 @@ VkResult anv_GetPhysicalDeviceSurfaceSupportKHR(
  device->local_fd,
  queueFamilyIndex,
  surface,
- &device->instance->alloc,
  pSupported);
 }
 
diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c
index 2feb5f13376..45c43f6f17f 100644
--- a/src/intel/vulkan/anv_wsi_x11.c
+++ b/src/intel/vulkan/anv_wsi_x11.c
@@ -40,7 +40,6 @@ VkBool32 anv_GetPhysicalDeviceXcbPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   &device->wsi_device,
-  &device->instance->alloc,
   queueFamilyIndex,
   device->local_fd, false,
   connection, visual_id);
@@ -56,7 +55,6 @@ VkBool32 anv_GetPhysicalDeviceXlibPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   &device->wsi_device,
-  &device->instance->alloc,
   queueFamilyIndex,
   device->local_fd, false,
   XGetXCBConnection(dpy), visualID);
diff --git a/src/vulkan/wsi/wsi_common.c b/src/vulkan/wsi/wsi_common.c
index 3416fef3076..1e3c4e0028b 100644
--- a/src/vulkan/wsi/wsi_common.c
+++ b/src/vulkan/wsi/wsi_common.c
@@ -39,6 +39,7 @@ wsi_device_init(struct wsi_device *wsi,
 
memset(wsi, 0, sizeof(*wsi));
 
+   wsi->instance_alloc = *alloc;
wsi->pdevice = pdevice;
 
 #define WSI_GET_CB(func) \
@@ -677,13 +678,12 @@ wsi_common_get_surface_support(struct wsi_device 
*wsi_device,
int local_fd,
uint32_t queueFamilyIndex,
VkSurfaceKHR _surface,
-   const VkAllocationCallbacks *alloc,
VkBool32* pSupported)
 {
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
struct wsi_interface *iface = wsi_device->wsi[surface->platform];
 
-   return iface->get_support(surface, wsi_device, alloc,
+   return iface->get_support(surface, wsi_device,
  queueFamilyIndex, local_fd, pSupported);
 }
 
diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h
index 14f65097bb3..424330de566 100644
--- a/src/vulkan/wsi/wsi_common.h
+++ b/src/vulkan/wsi/wsi_common.h
@@ -90,6 +90,9 @@ struct wsi_interface;
 #define VK_ICD_WSI_PLATFORM_MAX (VK_ICD_WSI_PLATFORM_DISPLAY + 1)
 
 struct wsi_device {
+   /* Allocator for the instance */
+   VkAllocationCallb

Re: [Mesa-dev] [PATCH 1/1] swr/rast: ignore CreateElementUnorderedAtomicMemCpy

2018-10-17 Thread Cherniak, Bruce

Reviewed-by: Bruce Cherniak  

> On Oct 17, 2018, at 1:51 PM, Alok Hota  wrote:
> 
> This function's API changed between LLVM 5 and 6. Compile errors occur
> when building with LLVM 6+ if LLVM 5 was used for a dist tarball
> ---
> .../drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py   | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py 
> b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
> index d34e88d1bc..485403ae1e 100644
> --- a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
> +++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
> @@ -161,7 +161,8 @@ def parse_ir_builder(input_file):
> func_name == 'CreateAlignmentAssumptionHelper' or
> func_name == 'CreateGEP' or
> func_name == 'CreateLoad' or
> -func_name == 'CreateMaskedLoad'):
> +func_name == 'CreateMaskedLoad' or
> +func_name == 'CreateElementUnorderedAtomicMemCpy'):
> ignore = True
> 
> # Convert CamelCase to CAMEL_CASE
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 108355] Civilization VI - Artifacts in mouse cursor

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=108355

Hadrien Nilsson  changed:

   What|Removed |Added

  Component|Drivers/Gallium/softpipe|Drivers/Gallium/swr

--- Comment #7 from Hadrien Nilsson  ---
amdgpu.dc=0 had no effect, but using xf86-video-amdgpu 18.1.0 indeed fixed the
problem :) Thank you Michel.

Hopefully that new version will be released somehow for my Linux distribution.

I still do not know if the mouse cursor is displayed as intended (there is a
shadow which seems to use additive blending instead of alpha blending) but at
least there are no more artifacts.

I guess I should change the Bugzilla Product to "xorg"?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 107765] [regression] Batman Arkham City crashes with DXVK under wine

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=107765

--- Comment #13 from farmboy0+freedesk...@googlemail.com ---
Can you tell me what settings you use?
Do you use a 64 bit prefix?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH 15/15] radeonsi: enable vcn jpeg decode for raven

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Enable vcn jpeg decode for raven.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeonsi/si_get.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index a87cb3cbc8..9b995bbcbf 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -628,6 +628,8 @@ static int si_get_video_param(struct pipe_screen *screen,
return profile == PIPE_VIDEO_PROFILE_HEVC_MAIN;
return false;
case PIPE_VIDEO_FORMAT_JPEG:
+   if (sscreen->info.family == CHIP_RAVEN)
+   return true;
if (sscreen->info.family < CHIP_CARRIZO || 
sscreen->info.family >= CHIP_VEGA10)
return false;
if (!(sscreen->info.drm_major == 3 && 
sscreen->info.drm_minor >= 19)) {
-- 
2.17.1


[Mesa-dev] [PATCH 13/15] amd/common: add vcn jpeg ip info query

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/amd/common/ac_gpu_info.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 766ad83547..8c50738c3f 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -99,7 +99,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
struct drm_amdgpu_info_device device_info = {};
struct amdgpu_buffer_size_alignments alignment_info = {};
struct drm_amdgpu_info_hw_ip dma = {}, compute = {}, uvd = {};
-   struct drm_amdgpu_info_hw_ip uvd_enc = {}, vce = {}, vcn_dec = {};
+   struct drm_amdgpu_info_hw_ip uvd_enc = {}, vce = {}, vcn_dec = {}, 
vcn_jpeg = {};
struct drm_amdgpu_info_hw_ip vcn_enc = {}, gfx = {};
struct amdgpu_gds_resource_info gds = {};
uint32_t vce_version = 0, vce_feature = 0, uvd_version = 0, uvd_feature 
= 0;
@@ -186,6 +186,14 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
}
}
 
+   if (info->drm_major == 3 && info->drm_minor >= 17) {
+   r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_VCN_JPEG, 0, 
&vcn_jpeg);
+   if (r) {
+   fprintf(stderr, "amdgpu: 
amdgpu_query_hw_ip_info(vcn_jpeg) failed.\n");
+   return false;
+   }
+   }
+
r = amdgpu_query_firmware_version(dev, AMDGPU_INFO_FW_GFX_ME, 0, 0,
&info->me_fw_version,
&info->me_fw_feature);
@@ -340,7 +348,8 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->max_se = amdinfo->num_shader_engines;
info->max_sh_per_se = amdinfo->num_shader_arrays_per_engine;
info->has_hw_decode =
-   (uvd.available_rings != 0) || (vcn_dec.available_rings != 0);
+   (uvd.available_rings != 0) || (vcn_dec.available_rings != 0) ||
+   (vcn_jpeg.available_rings != 0);
info->uvd_fw_version =
uvd.available_rings ? uvd_version : 0;
info->vce_fw_version =
@@ -439,6 +448,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
ib_align = MAX2(ib_align, vce.ib_start_alignment);
ib_align = MAX2(ib_align, vcn_dec.ib_start_alignment);
ib_align = MAX2(ib_align, vcn_enc.ib_start_alignment);
+   ib_align = MAX2(ib_align, vcn_jpeg.ib_start_alignment);
assert(ib_align);
info->ib_start_alignment = ib_align;
 
-- 
2.17.1


[Mesa-dev] [PATCH 14/15] winsys/amdgpu: add vcn jpeg cs support

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Add vcn jpeg cs support, align cs by no-op.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index c0f8b442b1..5986810d4e 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -845,6 +845,10 @@ static bool amdgpu_init_cs_context(struct amdgpu_winsys 
*ws,
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_VCN_ENC;
   break;
 
+   case RING_VCN_JPEG:
+  cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_VCN_JPEG;
+  break;
+
case RING_COMPUTE:
case RING_GFX:
   cs->ib[IB_MAIN].ip_type = ring_type == RING_GFX ? AMDGPU_HW_IP_GFX :
@@ -1589,6 +1593,14 @@ static int amdgpu_cs_flush(struct radeon_cmdbuf *rcs,
   while (rcs->current.cdw & 15)
  radeon_emit(rcs, 0x8000); /* type2 nop packet */
   break;
+   case RING_VCN_JPEG:
+  if (rcs->current.cdw % 2)
+ assert(0);
+  while (rcs->current.cdw & 15) {
+ radeon_emit(rcs, 0x6000); /* nop packet */
+ radeon_emit(rcs, 0x);
+  }
+  break;
case RING_VCN_DEC:
   while (rcs->current.cdw & 15)
  radeon_emit(rcs, 0x81ff); /* nop packet */
-- 
2.17.1
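
For readers unfamiliar with the "align cs by no-op" step in the patch above:
the JPEG ring case pads the command stream to a multiple of 16 dwords using
two-dword NOP packets, which only works if the current dword count is even.
A standalone sketch of that padding loop follows. The struct, the function
names and in particular the two NOP values are placeholders chosen for this
note; the real packet encodings are hardware specific and are not reproduced
here.

```c
#include <assert.h>
#include <stdint.h>

#define CS_MAX_DW   64u
#define CS_ALIGN_DW 16u

/* Placeholder packet words, NOT the real hardware encodings. */
#define SKETCH_NOP_HEADER  0xAAAAAAAAu
#define SKETCH_NOP_PAYLOAD 0x00000000u

/* Hypothetical miniature command buffer: cdw counts emitted dwords. */
struct cmdbuf_sketch {
    uint32_t buf[CS_MAX_DW];
    unsigned cdw;
};

static void emit_dw(struct cmdbuf_sketch *cs, uint32_t value)
{
    assert(cs->cdw < CS_MAX_DW);
    cs->buf[cs->cdw++] = value;
}

/* Pad with two-dword NOP packets until cdw is a multiple of 16.
 * Requires an even cdw on entry, otherwise pairs could never land on the
 * boundary. Returns the number of dwords added. */
static unsigned pad_with_nop_pairs(struct cmdbuf_sketch *cs)
{
    unsigned start = cs->cdw;

    assert(cs->cdw % 2 == 0);
    while (cs->cdw & (CS_ALIGN_DW - 1)) {
        emit_dw(cs, SKETCH_NOP_HEADER);
        emit_dw(cs, SKETCH_NOP_PAYLOAD);
    }
    return cs->cdw - start;
}
```

Because each iteration adds exactly two dwords and the boundary is a multiple
of two, the loop is guaranteed to terminate for any even starting count.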


[Mesa-dev] [PATCH 12/15] radeon/vcn: implement jpeg target buffer cmd

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang 
Acked-by: Leo Liu 
---
 .../drivers/radeon/radeon_vcn_dec_jpeg.c  | 73 ++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
index 0d96acfcd2..afa2015b09 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
@@ -116,7 +116,78 @@ static void send_cmd_target(struct radeon_decoder *dec,
 struct pb_buffer* buf, uint32_t off,
 enum radeon_bo_usage usage, enum radeon_bo_domain domain)
 {
-   /* TODO */
+   uint64_t addr;
+
+   set_reg_jpeg(dec, mmUVD_JPEG_PITCH, COND0, TYPE0, (dec->jpg.dt_pitch >> 
4));
+   set_reg_jpeg(dec, mmUVD_JPEG_UV_PITCH, COND0, TYPE0, 
((dec->jpg.dt_uv_pitch * 2) >> 4));
+
+   set_reg_jpeg(dec, mmUVD_JPEG_TILING_CTRL, COND0, TYPE0, 0);
+   set_reg_jpeg(dec, mmUVD_JPEG_UV_TILING_CTRL, COND0, TYPE0, 0);
+
+   dec->ws->cs_add_buffer(dec->cs, buf, usage | RADEON_USAGE_SYNCHRONIZED,
+  domain, 0);
+   addr = dec->ws->buffer_get_virtual_address(buf);
+   addr = addr + off;
+
+   // set UVD_LMI_JPEG_WRITE_64BIT_BAR_LOW/HIGH based on target buffer 
address
+   set_reg_jpeg(dec, mmUVD_LMI_JPEG_WRITE_64BIT_BAR_HIGH, COND0, TYPE0, 
(addr >> 32));
+   set_reg_jpeg(dec, mmUVD_LMI_JPEG_WRITE_64BIT_BAR_LOW, COND0, TYPE0, 
addr);
+
+   // set output buffer data address
+   set_reg_jpeg(dec, mmUVD_JPEG_INDEX, COND0, TYPE0, 0);
+   set_reg_jpeg(dec, mmUVD_JPEG_DATA, COND0, TYPE0, 
dec->jpg.dt_luma_top_offset);
+   set_reg_jpeg(dec, mmUVD_JPEG_INDEX, COND0, TYPE0, 1);
+   set_reg_jpeg(dec, mmUVD_JPEG_DATA, COND0, TYPE0, 
dec->jpg.dt_chroma_top_offset);
+   set_reg_jpeg(dec, mmUVD_JPEG_TIER_CNTL2, COND0, TYPE3, 0);
+
+   // set output buffer read pointer
+   set_reg_jpeg(dec, mmUVD_JPEG_OUTBUF_RPTR, COND0, TYPE0, 0);
+
+   // enable error interrupts
+   set_reg_jpeg(dec, mmUVD_JPEG_INT_EN, COND0, TYPE0, 0xFFFE);
+
+   // start engine command
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 0x6);
+
+   // wait for job completion, wait for job JBSI fetch done
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (dec->jpg.bsd_size >> 
2));
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C2);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, 0x01400200);
+   set_reg_jpeg(dec, mmUVD_JPEG_RB_RPTR, COND0, TYPE3, 0x);
+
+   // wait for job jpeg outbuf idle
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, 0x);
+   set_reg_jpeg(dec, mmUVD_JPEG_OUTBUF_WPTR, COND0, TYPE3, 0x0001);
+
+   // stop engine
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 0x4);
+
+   // asserting jpeg lmi drop
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x0005);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (1 << 23 | 1 << 0));
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE1, 0);
+
+   // asserting jpeg reset
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 1);
+
+   // ensure reset is asserted in sclk domain
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (1 << 9));
+   set_reg_jpeg(dec, mmUVD_SOFT_RESET, COND0, TYPE3, (1 << 9));
+
+   // de-assert jpeg reset
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 0);
+
+   // ensure reset is de-asserted in sclk domain
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (0 << 9));
+   set_reg_jpeg(dec, mmUVD_SOFT_RESET, COND0, TYPE3, (1 << 9));
+
+   // de-asserting jpeg lmi drop
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x0005);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, 0);
 }
 
 /**
-- 
2.17.1


[Mesa-dev] [PATCH 09/15] st/va: get mjpeg slice header

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c   |  13 +-
 src/gallium/state_trackers/va/picture_mjpeg.c | 142 ++
 src/gallium/state_trackers/va/va_private.h|  11 ++
 3 files changed, 164 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index e2cdb2b40c..04d2da0afe 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -259,11 +259,12 @@ handleVASliceDataBufferType(vlVaContext *context, 
vlVaBuffer *buf)
 {
enum pipe_video_format format;
unsigned num_buffers = 0;
-   void * const *buffers[2];
-   unsigned sizes[2];
+   void * const *buffers[3];
+   unsigned sizes[3];
static const uint8_t start_code_h264[] = { 0x00, 0x00, 0x01 };
static const uint8_t start_code_h265[] = { 0x00, 0x00, 0x01 };
static const uint8_t start_code_vc1[] = { 0x00, 0x00, 0x01, 0x0d };
+   static const uint8_t eoi_jpeg[] = { 0xff, 0xd9 };
 
format = u_reduce_video_profile(context->templat.profile);
switch (format) {
@@ -301,6 +302,9 @@ handleVASliceDataBufferType(vlVaContext *context, 
vlVaBuffer *buf)
   sizes[num_buffers++] = context->mpeg4.start_code_size;
   break;
case PIPE_VIDEO_FORMAT_JPEG:
+  vlVaGetJpegSliceHeader(context);
+  buffers[num_buffers] = (void *)context->mjpeg.slice_header;
+  sizes[num_buffers++] = context->mjpeg.slice_header_size;
   break;
case PIPE_VIDEO_FORMAT_VP9:
   vlVaDecoderVP9BitstreamHeader(context, buf);
@@ -313,6 +317,11 @@ handleVASliceDataBufferType(vlVaContext *context, 
vlVaBuffer *buf)
sizes[num_buffers] = buf->size;
++num_buffers;
 
+   if (format == PIPE_VIDEO_FORMAT_JPEG) {
+  buffers[num_buffers] = (void *const)&eoi_jpeg;
+  sizes[num_buffers++] = sizeof(eoi_jpeg);
+   }
+
if (context->needs_begin_frame) {
   context->decoder->begin_frame(context->decoder, context->target,
  &context->desc.base);
diff --git a/src/gallium/state_trackers/va/picture_mjpeg.c 
b/src/gallium/state_trackers/va/picture_mjpeg.c
index 396b743442..defb0b546d 100644
--- a/src/gallium/state_trackers/va/picture_mjpeg.c
+++ b/src/gallium/state_trackers/va/picture_mjpeg.c
@@ -114,3 +114,145 @@ void vlVaHandleSliceParameterBufferMJPEG(vlVaContext 
*context, vlVaBuffer *buf)
context->desc.mjpeg.slice_parameter.restart_interval = 
mjpeg->restart_interval;
context->desc.mjpeg.slice_parameter.num_mcus = mjpeg->num_mcus;
 }
+
+void vlVaGetJpegSliceHeader(vlVaContext *context)
+{
+   int size = 0, saved_size, len_pos, i;
+   uint16_t *bs;
+   uint8_t *p = context->mjpeg.slice_header;
+
+   /* SOI */
+   p[size++] = 0xff;
+   p[size++] = 0xd8;
+
+   /* DQT */
+   p[size++] = 0xff;
+   p[size++] = 0xdb;
+
+   len_pos = size++;
+   size++;
+
+   for (i = 0; i < 4; ++i) {
+  if (context->desc.mjpeg.quantization_table.load_quantiser_table[i] == 0)
+ continue;
+
+  p[size++] = i;
+  memcpy((p + size), 
&context->desc.mjpeg.quantization_table.quantiser_table[i], 64);
+  size += 64;
+   }
+
+   bs = (uint16_t*)&p[len_pos];
+   *bs = util_bswap16(size - 4);
+
+   saved_size = size;
+
+   /* DHT */
+   p[size++] = 0xff;
+   p[size++] = 0xc4;
+
+   len_pos = size++;
+   size++;
+
+   for (i = 0; i < 2; ++i) {
+  int num = 0, j;
+
+  if (context->desc.mjpeg.huffman_table.load_huffman_table[i] == 0)
+ continue;
+
+  p[size++] = 0x00 | i;
+  memcpy((p + size), 
&context->desc.mjpeg.huffman_table.table[i].num_dc_codes, 16);
+  size += 16;
+  for (j = 0; j < 16; ++j)
+ num += context->desc.mjpeg.huffman_table.table[i].num_dc_codes[j];
+  assert(num <= 12);
+  memcpy((p + size), 
&context->desc.mjpeg.huffman_table.table[i].dc_values, num);
+  size += num;
+   }
+
+   for (i = 0; i < 2; ++i) {
+  int num = 0, j;
+
+  if (context->desc.mjpeg.huffman_table.load_huffman_table[i] == 0)
+ continue;
+
+  p[size++] = 0x10 | i;
+  memcpy((p + size), 
&context->desc.mjpeg.huffman_table.table[i].num_ac_codes, 16);
+  size += 16;
+  for (j = 0; j < 16; ++j)
+ num += context->desc.mjpeg.huffman_table.table[i].num_ac_codes[j];
+  assert(num <= 162);
+  memcpy((p + size), 
&context->desc.mjpeg.huffman_table.table[i].ac_values, num);
+  size += num;
+   }
+
+   bs = (uint16_t*)&p[len_pos];
+   *bs = util_bswap16(size - saved_size - 2);
+
+   saved_size = size;
+
+   /* DRI */
+   if (context->desc.mjpeg.slice_parameter.restart_interval) {
+  p[size++] = 0xff;
+  p[size++] = 0xdd;
+  p[size++] = 0x00;
+  p[size++] = 0x04;
+  bs = (uint16_t*)&p[size++];
+  *bs = util_bswap16(context->desc.mjpeg.slice_parameter.restart_interval);
+  saved_size = ++size;
+   }
+
+   /* SOF */
+   p[size++] = 0xff;
+   p[size+

[Mesa-dev] [PATCH 08/15] radeon/vcn: add jpeg decode implementation

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Add a new file to handle VCN JPEG decode-specific functions. Use the
JPEG-specific command-sending function in the end_frame call.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c   | 21 ++--
 src/gallium/drivers/radeon/radeon_vcn_dec.h   |  4 +
 .../drivers/radeon/radeon_vcn_dec_jpeg.c  | 99 +++
 src/gallium/drivers/radeonsi/Makefile.sources |  1 +
 src/gallium/drivers/radeonsi/meson.build  |  1 +
 5 files changed, 119 insertions(+), 7 deletions(-)
 create mode 100644 src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index 30a98c2786..75ef4a5d40 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -1247,6 +1247,10 @@ static unsigned calc_dpb_size(struct radeon_decoder *dec)
dpb_size *= (3 / 2);
break;
 
+   case PIPE_VIDEO_FORMAT_JPEG:
+   dpb_size = 0;
+   break;
+
default:
// something is missing here
assert(0);
@@ -1547,14 +1551,14 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
}
 
dpb_size = calc_dpb_size(dec);
-
-   if (!si_vid_create_buffer(dec->screen, &dec->dpb, dpb_size, 
PIPE_USAGE_DEFAULT)) {
-   RVID_ERR("Can't allocated dpb.\n");
-   goto error;
+   if (dpb_size) {
+   if (!si_vid_create_buffer(dec->screen, &dec->dpb, dpb_size, 
PIPE_USAGE_DEFAULT)) {
+   RVID_ERR("Can't allocated dpb.\n");
+   goto error;
+   }
+   si_vid_clear_buffer(context, &dec->dpb);
}
 
-   si_vid_clear_buffer(context, &dec->dpb);
-
if (dec->stream_type == RDECODE_CODEC_H264_PERF) {
unsigned ctx_size = calc_ctx_size_h264_perf(dec);
if (!si_vid_create_buffer(dec->screen, &dec->ctx, ctx_size, 
PIPE_USAGE_DEFAULT)) {
@@ -1581,7 +1585,10 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
 
next_buffer(dec);
 
-   dec->send_cmd = send_cmd_dec;
+   if (stream_type == RDECODE_CODEC_JPEG)
+   dec->send_cmd = send_cmd_jpeg;
+   else
+   dec->send_cmd = send_cmd_dec;
 
return &dec->base;
 
diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.h 
b/src/gallium/drivers/radeon/radeon_vcn_dec.h
index 37c0503377..a6a726f46d 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.h
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.h
@@ -768,6 +768,10 @@ void send_cmd_dec(struct radeon_decoder *dec,
  struct pipe_video_buffer *target,
  struct pipe_picture_desc *picture);
 
+void send_cmd_jpeg(struct radeon_decoder *dec,
+ struct pipe_video_buffer *target,
+ struct pipe_picture_desc *picture);
+
 struct pipe_video_codec *radeon_create_decoder(struct pipe_context *context,
const struct pipe_video_codec *templat);
 
diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
new file mode 100644
index 0000000000..7c078a0964
--- /dev/null
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
@@ -0,0 +1,99 @@
+/**
+ *
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+#include 
+#include 
+
+#include "pipe/p_video_codec.h"
+
+#include "util/u_memory.h"
+#include "util/u_video.h"
+
+#include "radeonsi/si_pipe.h"
+#include "radeon_video.h"
+#include "radeon_vcn_dec.h"
+
+static struct p

[Mesa-dev] [PATCH 07/15] radeon/vcn: separate send cmd call from end frame

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Use a function pointer for sending commands in the end_frame call. This lets
us assign different command-sending logic for JPEG decode later.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c | 29 +++--
 src/gallium/drivers/radeon/radeon_vcn_dec.h |  7 +
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index 26ea1f82ff..30a98c2786 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -1368,21 +1368,15 @@ static void radeon_dec_decode_bitstream(struct 
pipe_video_codec *decoder,
 }
 
 /**
- * end decoding of the current frame
+ * send cmd for vcn dec
  */
-static void radeon_dec_end_frame(struct pipe_video_codec *decoder,
+void send_cmd_dec(struct radeon_decoder *dec,
   struct pipe_video_buffer *target,
   struct pipe_picture_desc *picture)
 {
-   struct radeon_decoder *dec = (struct radeon_decoder*)decoder;
struct pb_buffer *dt;
struct rvid_buffer *msg_fb_it_probs_buf, *bs_buf;
 
-   assert(decoder);
-
-   if (!dec->bs_ptr)
-   return;
-
msg_fb_it_probs_buf = &dec->msg_fb_it_probs_buffers[dec->cur_buffer];
bs_buf = &dec->bs_buffers[dec->cur_buffer];
 
@@ -1412,6 +1406,23 @@ static void radeon_dec_end_frame(struct pipe_video_codec 
*decoder,
send_cmd(dec, RDECODE_CMD_PROB_TBL_BUFFER, 
msg_fb_it_probs_buf->res->buf,
 FB_BUFFER_OFFSET + FB_BUFFER_SIZE, RADEON_USAGE_READ, 
RADEON_DOMAIN_GTT);
set_reg(dec, RDECODE_ENGINE_CNTL, 1);
+}
+
+/**
+ * end decoding of the current frame
+ */
+static void radeon_dec_end_frame(struct pipe_video_codec *decoder,
+  struct pipe_video_buffer *target,
+  struct pipe_picture_desc *picture)
+{
+   struct radeon_decoder *dec = (struct radeon_decoder*)decoder;
+
+   assert(decoder);
+
+   if (!dec->bs_ptr)
+   return;
+
+   dec->send_cmd(dec, target, picture);
 
flush(dec, PIPE_FLUSH_ASYNC);
next_buffer(dec);
@@ -1570,6 +1581,8 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
 
next_buffer(dec);
 
+   dec->send_cmd = send_cmd_dec;
+
return &dec->base;
 
 error:
diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.h 
b/src/gallium/drivers/radeon/radeon_vcn_dec.h
index 2bcc1bb542..37c0503377 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.h
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.h
@@ -759,8 +759,15 @@ struct radeon_decoder {
bool show_frame;
unsigned ref_idx;
struct jpeg_params  jpg;
+   void (*send_cmd)(struct radeon_decoder *dec,
+struct pipe_video_buffer *target,
+struct pipe_picture_desc *picture);
 };
 
+void send_cmd_dec(struct radeon_decoder *dec,
+ struct pipe_video_buffer *target,
+ struct pipe_picture_desc *picture);
+
 struct pipe_video_codec *radeon_create_decoder(struct pipe_context *context,
const struct pipe_video_codec *templat);
 
-- 
2.17.1
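
A minimal sketch of the dispatch pattern this patch sets up, using toy
stand-in types. The names toy_decoder, toy_send_cmd_dec, toy_send_cmd_jpeg,
toy_create and toy_end_frame are hypothetical and are not Mesa API; an int
stands in for the target and picture arguments. The patch itself only assigns
send_cmd_dec; the JPEG branch here is an assumption about how a later patch
might choose the pointer.

```c
#include <assert.h>
#include <stddef.h>

struct toy_decoder;

/* Same shape as the send_cmd member added to struct radeon_decoder,
 * with an int standing in for the target and picture arguments. */
typedef int (*toy_send_cmd_fn)(struct toy_decoder *dec, int target);

struct toy_decoder {
   const void *bs_ptr;        /* NULL means no bitstream was submitted */
   toy_send_cmd_fn send_cmd;  /* chosen once, when the decoder is created */
};

/* Hypothetical command sender for the regular decode path. */
static int toy_send_cmd_dec(struct toy_decoder *dec, int target)
{
   (void)dec;
   return 100 + target;
}

/* Hypothetical command sender for the JPEG path. */
static int toy_send_cmd_jpeg(struct toy_decoder *dec, int target)
{
   (void)dec;
   return 200 + target;
}

/* Pick the sender at creation time, as radeon_create_decoder does. */
static void toy_create(struct toy_decoder *dec, int is_jpeg, const void *bs)
{
   dec->bs_ptr = bs;
   dec->send_cmd = is_jpeg ? toy_send_cmd_jpeg : toy_send_cmd_dec;
}

/* Common checks stay in end_frame; the codec-specific part is one call. */
static int toy_end_frame(struct toy_decoder *dec, int target)
{
   assert(dec);
   if (!dec->bs_ptr)
      return -1;
   return dec->send_cmd(dec, target);
}
```

The point of the split is that end_frame no longer needs to know which codec
it is serving; only the creation code does.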

[Mesa-dev] [PATCH 11/15] radeon/vcn: implement jpeg bitstream buffer cmd

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Implement the JPEG bitstream buffer command by programming registers
directly, since there is no firmware for VCN JPEG decode.

Signed-off-by: Boyuan Zhang 
Acked-by: Leo Liu 
---
 .../drivers/radeon/radeon_vcn_dec_jpeg.c  | 46 ++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
index 7c078a0964..0d96acfcd2 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec_jpeg.c
@@ -59,12 +59,56 @@ static struct pb_buffer 
*radeon_jpeg_get_decode_param(struct radeon_decoder *dec
return luma->buffer.buf;
 }
 
+/* add a new set register command to the IB */
+static void set_reg_jpeg(struct radeon_decoder *dec, unsigned reg,
+unsigned cond, unsigned type, uint32_t val)
+{
+   radeon_emit(dec->cs, RDECODE_PKTJ(SOC15_REG_ADDR(reg), cond, type));
+   radeon_emit(dec->cs, val);
+}
+
 /* send a bitstream buffer command */
 static void send_cmd_bitstream(struct radeon_decoder *dec,
 struct pb_buffer* buf, uint32_t off,
 enum radeon_bo_usage usage, enum radeon_bo_domain domain)
 {
-   /* TODO */
+   uint64_t addr;
+
+   // jpeg soft reset
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 1);
+
+   // ensuring the Reset is asserted in SCLK domain
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C2);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, 0x01400200);
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (1 << 9));
+   set_reg_jpeg(dec, mmUVD_SOFT_RESET, COND0, TYPE3, (1 << 9));
+
+   // wait mem
+   set_reg_jpeg(dec, mmUVD_JPEG_CNTL, COND0, TYPE0, 0);
+
+   // ensuring the Reset is de-asserted in SCLK domain
+   set_reg_jpeg(dec, mmUVD_CTX_INDEX, COND0, TYPE0, 0x01C3);
+   set_reg_jpeg(dec, mmUVD_CTX_DATA, COND0, TYPE0, (0 << 9));
+   set_reg_jpeg(dec, mmUVD_SOFT_RESET, COND0, TYPE3, (1 << 9));
+
+   dec->ws->cs_add_buffer(dec->cs, buf, usage | RADEON_USAGE_SYNCHRONIZED,
+  domain, 0);
+   addr = dec->ws->buffer_get_virtual_address(buf);
+   addr = addr + off;
+
+   // set UVD_LMI_JPEG_READ_64BIT_BAR_LOW/HIGH based on bitstream buffer 
address
+   set_reg_jpeg(dec, mmUVD_LMI_JPEG_READ_64BIT_BAR_HIGH, COND0, TYPE0, 
(addr >> 32));
+   set_reg_jpeg(dec, mmUVD_LMI_JPEG_READ_64BIT_BAR_LOW, COND0, TYPE0, 
addr);
+
+   // set jpeg_rb_base
+   set_reg_jpeg(dec, mmUVD_JPEG_RB_BASE, COND0, TYPE0, 0);
+
+   // set jpeg_rb_size
+   set_reg_jpeg(dec, mmUVD_JPEG_RB_SIZE, COND0, TYPE0, 0xFFF0);
+
+   // set jpeg_rb_wptr
+   set_reg_jpeg(dec, mmUVD_JPEG_RB_WPTR, COND0, TYPE0, (dec->jpg.bsd_size 
>> 2));
 }
 
 /* send a target buffer command */
-- 
2.17.1
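
A small sketch of the address and size arithmetic used when programming the
bitstream registers above. The helper names toy_addr_hi, toy_addr_lo and
toy_bytes_to_dwords are hypothetical. The patch passes addr straight to a
uint32_t parameter, which truncates implicitly; the sketch only makes that
truncation explicit. Whether the bitstream size is always a multiple of four
is not stated in the patch, so note that the shift rounds down.

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit virtual address, as in (addr >> 32). */
static uint32_t toy_addr_hi(uint64_t addr)
{
   return (uint32_t)(addr >> 32);
}

/* Lower 32 bits; equivalent to passing addr to a uint32_t parameter. */
static uint32_t toy_addr_lo(uint64_t addr)
{
   return (uint32_t)(addr & 0xffffffffu);
}

/* Byte count to dword count, as in (bsd_size >> 2). Rounds down. */
static uint32_t toy_bytes_to_dwords(uint32_t bytes)
{
   return bytes >> 2;
}
```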

[Mesa-dev] [PATCH 10/15] radeon/uvd: remove get mjpeg slice header

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Move the previous get_mjpeg_slice_header function and eoi from
"radeon/uvd" to "st/va".

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_uvd.c | 157 
 1 file changed, 157 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index a7ef4252ee..0f3b43de81 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -964,149 +964,6 @@ static struct ruvd_mpeg4 get_mpeg4_msg(struct 
ruvd_decoder *dec,
return result;
 }
 
-static void get_mjpeg_slice_header(struct ruvd_decoder *dec, struct 
pipe_mjpeg_picture_desc *pic)
-{
-   int size = 0, saved_size, len_pos, i;
-   uint16_t *bs;
-   uint8_t *buf = dec->bs_ptr;
-
-   /* SOI */
-   buf[size++] = 0xff;
-   buf[size++] = 0xd8;
-
-   /* DQT */
-   buf[size++] = 0xff;
-   buf[size++] = 0xdb;
-
-   len_pos = size++;
-   size++;
-
-   for (i = 0; i < 4; ++i) {
-   if (pic->quantization_table.load_quantiser_table[i] == 0)
-   continue;
-
-   buf[size++] = i;
-   memcpy((buf + size), 
&pic->quantization_table.quantiser_table[i], 64);
-   size += 64;
-   }
-
-   bs = (uint16_t*)&buf[len_pos];
-   *bs = util_bswap16(size - 4);
-
-   saved_size = size;
-
-   /* DHT */
-   buf[size++] = 0xff;
-   buf[size++] = 0xc4;
-
-   len_pos = size++;
-   size++;
-
-   for (i = 0; i < 2; ++i) {
-   int num = 0, j;
-
-   if (pic->huffman_table.load_huffman_table[i] == 0)
-   continue;
-
-   buf[size++] = 0x00 | i;
-   memcpy((buf + size), &pic->huffman_table.table[i].num_dc_codes, 
16);
-   size += 16;
-   for (j = 0; j < 16; ++j)
-   num += pic->huffman_table.table[i].num_dc_codes[j];
-   assert(num <= 12);
-   memcpy((buf + size), &pic->huffman_table.table[i].dc_values, 
num);
-   size += num;
-   }
-
-   for (i = 0; i < 2; ++i) {
-   int num = 0, j;
-
-   if (pic->huffman_table.load_huffman_table[i] == 0)
-   continue;
-
-   buf[size++] = 0x10 | i;
-   memcpy((buf + size), &pic->huffman_table.table[i].num_ac_codes, 
16);
-   size += 16;
-   for (j = 0; j < 16; ++j)
-   num += pic->huffman_table.table[i].num_ac_codes[j];
-   assert(num <= 162);
-   memcpy((buf + size), &pic->huffman_table.table[i].ac_values, 
num);
-   size += num;
-   }
-
-   bs = (uint16_t*)&buf[len_pos];
-   *bs = util_bswap16(size - saved_size - 2);
-
-   saved_size = size;
-
-   /* DRI */
-   if (pic->slice_parameter.restart_interval) {
-   buf[size++] = 0xff;
-   buf[size++] = 0xdd;
-   buf[size++] = 0x00;
-   buf[size++] = 0x04;
-   bs = (uint16_t*)&buf[size++];
-   *bs = util_bswap16(pic->slice_parameter.restart_interval);
-   saved_size = ++size;
-   }
-
-   /* SOF */
-   buf[size++] = 0xff;
-   buf[size++] = 0xc0;
-
-   len_pos = size++;
-   size++;
-
-   buf[size++] = 0x08;
-
-   bs = (uint16_t*)&buf[size++];
-   *bs = util_bswap16(pic->picture_parameter.picture_height);
-   size++;
-
-   bs = (uint16_t*)&buf[size++];
-   *bs = util_bswap16(pic->picture_parameter.picture_width);
-   size++;
-
-   buf[size++] = pic->picture_parameter.num_components;
-
-   for (i = 0; i < pic->picture_parameter.num_components; ++i) {
-   buf[size++] = pic->picture_parameter.components[i].component_id;
-   buf[size++] = 
pic->picture_parameter.components[i].h_sampling_factor << 4 |
-   pic->picture_parameter.components[i].v_sampling_factor;
-   buf[size++] = 
pic->picture_parameter.components[i].quantiser_table_selector;
-   }
-
-   bs = (uint16_t*)&buf[len_pos];
-   *bs = util_bswap16(size - saved_size - 2);
-
-   saved_size = size;
-
-   /* SOS */
-   buf[size++] = 0xff;
-   buf[size++] = 0xda;
-
-   len_pos = size++;
-   size++;
-
-   buf[size++] = pic->slice_parameter.num_components;
-
-   for (i = 0; i < pic->slice_parameter.num_components; ++i) {
-   buf[size++] = 
pic->slice_parameter.components[i].component_selector;
-   buf[size++] = 
pic->slice_parameter.components[i].dc_table_selector << 4 |
-   pic->slice_parameter.components[i].ac_table_selector;
-   }
-
-   buf[size++] = 0x00;
-   buf[size++] = 0x3f;
-   buf[size++] = 0x00;
-
-   bs = (uint16_t*)&buf[len_pos];
-   *bs = util_bswap16(size - saved_size - 2);
-
-   dec->bs_ptr += size;
-

[Mesa-dev] [PATCH 06/15] radeon/vcn: create cs based on ring type

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Add RING_VCN_JPEG for VCN JPEG decode, and keep RING_VCN_DEC for other codecs.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index fbfef6d273..26ea1f82ff 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -1433,7 +1433,7 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
struct si_context *sctx = (struct si_context*)context;
struct radeon_winsys *ws = sctx->ws;
unsigned width = templ->width, height = templ->height;
-   unsigned dpb_size, bs_buf_size, stream_type = 0;
+   unsigned dpb_size, bs_buf_size, stream_type = 0, ring = RING_VCN_DEC;
struct radeon_decoder *dec;
int r, i;
 
@@ -1462,6 +1462,10 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
case PIPE_VIDEO_FORMAT_VP9:
stream_type = RDECODE_CODEC_VP9;
break;
+   case PIPE_VIDEO_FORMAT_JPEG:
+   stream_type = RDECODE_CODEC_JPEG;
+   ring = RING_VCN_JPEG;
+   break;
default:
assert(0);
break;
@@ -1488,7 +1492,7 @@ struct pipe_video_codec *radeon_create_decoder(struct 
pipe_context *context,
dec->stream_handle = si_vid_alloc_stream_handle();
dec->screen = context->screen;
dec->ws = ws;
-   dec->cs = ws->cs_create(sctx->ctx, RING_VCN_DEC, NULL, NULL);
+   dec->cs = ws->cs_create(sctx->ctx, ring, NULL, NULL);
if (!dec->cs) {
RVID_ERR("Can't get command submission context.\n");
goto error;
-- 
2.17.1
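
A minimal sketch of the selection this patch adds: every format keeps the
default decode ring, and only JPEG overrides it. The enums and the function
toy_ring_for_format are hypothetical stand-ins, not the Mesa definitions.

```c
/* Toy stand-ins for the pipe video formats and winsys ring types. */
enum toy_format { TOY_FORMAT_H264, TOY_FORMAT_VP9, TOY_FORMAT_JPEG };
enum toy_ring { TOY_RING_VCN_DEC, TOY_RING_VCN_JPEG };

/* Default to the decode ring; JPEG is the only format that changes it. */
static enum toy_ring toy_ring_for_format(enum toy_format fmt)
{
   enum toy_ring ring = TOY_RING_VCN_DEC;

   switch (fmt) {
   case TOY_FORMAT_JPEG:
      ring = TOY_RING_VCN_JPEG;
      break;
   default:
      break;
   }
   return ring;
}
```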

[Mesa-dev] [PATCH 05/15] radeon/winsys: add vcn jpeg ring type

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Add a new ring type for VCN JPEG.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_winsys.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index bb732ab314..c6800808cb 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -87,6 +87,7 @@ enum ring_type {
 RING_UVD_ENC,
 RING_VCN_DEC,
 RING_VCN_ENC,
+RING_VCN_JPEG,
 RING_LAST,
 };
 
-- 
2.17.1

[Mesa-dev] [PATCH 04/15] radeon/vcn: add vcn jpeg decode interface

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Add VCN JPEG decode interfaces and register defines.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.h | 90 +
 1 file changed, 90 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.h 
b/src/gallium/drivers/radeon/radeon_vcn_dec.h
index c6c2a933cc..2bcc1bb542 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.h
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.h
@@ -43,6 +43,15 @@
 
 #define RDECODE_PKT2() (RDECODE_PKT_TYPE_S(2))
 
+#define RDECODE_PKT_REG_J(x)   ((unsigned)(x) & 0x3)
+#define RDECODE_PKT_RES_J(x)   (((unsigned)(x) & 0x3F) << 18)
+#define RDECODE_PKT_COND_J(x)  (((unsigned)(x) & 0xF) << 24)
+#define RDECODE_PKT_TYPE_J(x)  (((unsigned)(x) & 0xF) << 28)
+#define RDECODE_PKTJ(reg, cond, type)  (RDECODE_PKT_REG_J(reg) | \
+   RDECODE_PKT_RES_J(0) | \
+   RDECODE_PKT_COND_J(cond) | \
+   RDECODE_PKT_TYPE_J(type))
+
 #define RDECODE_CMD_MSG_BUFFER 0x0000
 #define RDECODE_CMD_DPB_BUFFER 0x0001
 #define RDECODE_CMD_DECODING_TARGET_BUFFER 0x0002
@@ -62,6 +71,7 @@
 #define RDECODE_CODEC_MPEG2_VLD 0x0003
 #define RDECODE_CODEC_MPEG4 0x0004
 #define RDECODE_CODEC_H264_PERF 0x0007
+#define RDECODE_CODEC_JPEG 0x0008
 #define RDECODE_CODEC_H265 0x0010
 #define RDECODE_CODEC_VP9  0x0011
 
@@ -112,6 +122,77 @@
 
 #define RDECODE_VP9_PROBS_DATA_SIZE 2304
 
+#define mmUVD_JPEG_CNTL 0x0200
+#define mmUVD_JPEG_CNTL_BASE_IDX 1
+#define mmUVD_JPEG_RB_BASE 0x0201
+#define mmUVD_JPEG_RB_BASE_BASE_IDX 1
+#define mmUVD_JPEG_RB_WPTR 0x0202
+#define mmUVD_JPEG_RB_WPTR_BASE_IDX 1
+#define mmUVD_JPEG_RB_RPTR 0x0203
+#define mmUVD_JPEG_RB_RPTR_BASE_IDX 1
+#define mmUVD_JPEG_RB_SIZE 0x0204
+#define mmUVD_JPEG_RB_SIZE_BASE_IDX 1
+#define mmUVD_JPEG_TIER_CNTL2 0x021a
+#define mmUVD_JPEG_TIER_CNTL2_BASE_IDX 1
+#define mmUVD_JPEG_UV_TILING_CTRL 0x021c
+#define mmUVD_JPEG_UV_TILING_CTRL_BASE_IDX 1
+#define mmUVD_JPEG_TILING_CTRL 0x021e
+#define mmUVD_JPEG_TILING_CTRL_BASE_IDX 1
+#define mmUVD_JPEG_OUTBUF_RPTR 0x0220
+#define mmUVD_JPEG_OUTBUF_RPTR_BASE_IDX 1
+#define mmUVD_JPEG_OUTBUF_WPTR 0x0221
+#define mmUVD_JPEG_OUTBUF_WPTR_BASE_IDX 1
+#define mmUVD_JPEG_PITCH 0x0222
+#define mmUVD_JPEG_PITCH_BASE_IDX 1
+#define mmUVD_JPEG_INT_EN 0x0229
+#define mmUVD_JPEG_INT_EN_BASE_IDX 1
+#define mmUVD_JPEG_UV_PITCH 0x022b
+#define mmUVD_JPEG_UV_PITCH_BASE_IDX 1
+#define mmUVD_JPEG_INDEX 0x023e
+#define mmUVD_JPEG_INDEX_BASE_IDX 1
+#define mmUVD_JPEG_DATA 0x023f
+#define mmUVD_JPEG_DATA_BASE_IDX 1
+#define mmUVD_LMI_JPEG_WRITE_64BIT_BAR_HIGH 0x0438
+#define mmUVD_LMI_JPEG_WRITE_64BIT_BAR_HIGH_BASE_IDX 1
+#define mmUVD_LMI_JPEG_WRITE_64BIT_BAR_LOW 0x0439
+#define mmUVD_LMI_JPEG_WRITE_64BIT_BAR_LOW_BASE_IDX 1
+#define mmUVD_LMI_JPEG_READ_64BIT_BAR_HIGH 0x045a
+#define mmUVD_LMI_JPEG_READ_64BIT_BAR_HIGH_BASE_IDX 1
+#define mmUVD_LMI_JPEG_READ_64BIT_BAR_LOW 0x045b
+#define mmUVD_LMI_JPEG_READ_64BIT_BAR_LOW_BASE_IDX 1
+#define mmUVD_CTX_INDEX 0x0528
+#define mmUVD_CTX_INDEX_BASE_IDX 1
+#define mmUVD_CTX_DATA 0x0529
+#define mmUVD_CTX_DATA_BASE_IDX 1
+#define mmUVD_SOFT_RESET 0x05a0
+#define mmUVD_SOFT_RESET_BASE_IDX 1
+
+#define UVD_BASE_INST0_SEG0 0x7800
+#define UVD_BASE_INST0_SEG1 0x7E00
+#define UVD_BASE_INST0_SEG2 0
+#define UVD_BASE_INST0_SEG3 0
+#define UVD_BASE_INST0_SEG4 0
+
+#define SOC15_REG_ADDR(reg) (UVD_BASE_INST0_SEG1 + reg)
+
+#define COND0  0
+#define COND1  1
+#define COND2  2
+

[Mesa-dev] [PATCH 03/15] radeon/vcn: move radeon decoder define to header file

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c | 31 
 src/gallium/drivers/radeon/radeon_vcn_dec.h | 32 +
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index c2e22048ce..fbfef6d273 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -51,42 +51,11 @@
 #define RDECODE_GPCOM_VCPU_DATA1   0x20714
 #define RDECODE_ENGINE_CNTL 0x20718
 
-#define NUM_BUFFERS 4
 #define NUM_MPEG2_REFS 6
 #define NUM_H264_REFS  17
 #define NUM_VC1_REFS   5
 #define NUM_VP9_REFS   8
 
-struct radeon_decoder {
-   struct pipe_video_codec base;
-
-   unsigned stream_handle;
-   unsigned stream_type;
-   unsigned frame_number;
-
-   struct pipe_screen *screen;
-   struct radeon_winsys *ws;
-   struct radeon_cmdbuf *cs;
-
-   void *msg;
-   uint32_t *fb;
-   uint8_t *it;
-   uint8_t *probs;
-   void *bs_ptr;
-
-   struct rvid_buffer msg_fb_it_probs_buffers[NUM_BUFFERS];
-   struct rvid_buffer bs_buffers[NUM_BUFFERS];
-   struct rvid_buffer dpb;
-   struct rvid_buffer ctx;
-   struct rvid_buffer sessionctx;
-
-   unsigned bs_size;
-   unsigned cur_buffer;
-   void *render_pic_list[16];
-   bool show_frame;
-   unsigned ref_idx;
-};
-
 static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
struct pipe_h264_picture_desc *pic)
 {
diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.h 
b/src/gallium/drivers/radeon/radeon_vcn_dec.h
index 7a07ad0637..c6c2a933cc 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.h
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.h
@@ -108,6 +108,8 @@
 
 #define RDECODE_SPS_INFO_H264_EXTENSION_SUPPORT_FLAG_SHIFT 7
 
+#define NUM_BUFFERS 4
+
 #define RDECODE_VP9_PROBS_DATA_SIZE 2304
 
 /* VP9 Frame header flags */
@@ -639,6 +641,36 @@ typedef struct rvcn_dec_vp9_probs_segment_s {
 };
 } rvcn_dec_vp9_probs_segment_t;
 
+struct radeon_decoder {
+   struct pipe_video_codec base;
+
+   unsigned stream_handle;
+   unsigned stream_type;
+   unsigned frame_number;
+
+   struct pipe_screen *screen;
+   struct radeon_winsys *ws;
+   struct radeon_cmdbuf *cs;
+
+   void *msg;
+   uint32_t *fb;
+   uint8_t *it;
+   uint8_t *probs;
+   void *bs_ptr;
+
+   struct rvid_buffer msg_fb_it_probs_buffers[NUM_BUFFERS];
+   struct rvid_buffer bs_buffers[NUM_BUFFERS];
+   struct rvid_buffer dpb;
+   struct rvid_buffer ctx;
+   struct rvid_buffer sessionctx;
+
+   unsigned bs_size;
+   unsigned cur_buffer;
+   void *render_pic_list[16];
+   bool show_frame;
+   unsigned ref_idx;
+};
+
 struct pipe_video_codec *radeon_create_decoder(struct pipe_context *context,
const struct pipe_video_codec *templat);
 
-- 
2.17.1

[Mesa-dev] [PATCH 02/15] meson: update required amdgpu version to 2.4.95

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

VCN JPEG requires a new hardware IP.

Signed-off-by: Boyuan Zhang 
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 002ce35a60..35e3e934a3 100644
--- a/meson.build
+++ b/meson.build
@@ -1108,7 +1108,7 @@ dep_libdrm_etnaviv = null_dep
 dep_libdrm_freedreno = null_dep
 dep_libdrm_intel = null_dep
 
-_drm_amdgpu_ver = '2.4.93'
+_drm_amdgpu_ver = '2.4.95'
 _drm_radeon_ver = '2.4.71'
 _drm_nouveau_ver = '2.4.66'
 _drm_etnaviv_ver = '2.4.89'
-- 
2.17.1

[Mesa-dev] [PATCH 01/15] configure.ac: update libdrm amdgpu version to 2.4.95

2018-10-17 Thread boyuan.zhang

From: Boyuan Zhang 

VCN JPEG requires a new hardware IP.

Signed-off-by: Boyuan Zhang 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 520948b051..5fd7d8510d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -74,7 +74,7 @@ AC_SUBST([OPENCL_VERSION])
 # in the first entry.
 LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.71
-LIBDRM_AMDGPU_REQUIRED=2.4.93
+LIBDRM_AMDGPU_REQUIRED=2.4.95
 LIBDRM_INTEL_REQUIRED=2.4.75
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
-- 
2.17.1

[Mesa-dev] [PATCH 1/1] swr/rast: ignore CreateElementUnorderedAtomicMemCpy

2018-10-17 Thread Alok Hota

This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball.
---
 .../drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py 
b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
index d34e88d1bc..485403ae1e 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
+++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
@@ -161,7 +161,8 @@ def parse_ir_builder(input_file):
 func_name == 'CreateAlignmentAssumptionHelper' or
 func_name == 'CreateGEP' or
 func_name == 'CreateLoad' or
-func_name == 'CreateMaskedLoad'):
+func_name == 'CreateMaskedLoad' or
+func_name == 'CreateElementUnorderedAtomicMemCpy'):
 ignore = True
 
 # Convert CamelCase to CAMEL_CASE
-- 
2.17.1

[Mesa-dev] [PATCH 0/1] swr: Fix for LLVM 5 to 6 API change

2018-10-17 Thread Alok Hota

This is primarily a fix for the stable branch, as it is still packaged
with LLVM 5 libs. This fixes a compile error if a user tries to build
with LLVM 6+ from an 18.2.x release tarball.

Alok Hota (1):
  swr/rast: ignore CreateElementUnorderedAtomicMemCpy

 .../drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.17.1

[Mesa-dev] [PATCH 3/3] i965/nir: use vectorization for non-scalar stages

2018-10-17 Thread Jason Ekstrand

From: Connor Abbott 

Shader-db results on Haswell:

total instructions in shared programs: 2180337 -> 2154080 (-1.20%)
instructions in affected programs: 959766 -> 933509 (-2.74%)
helped: 5653
HURT: 2560

total cycles in shared programs: 12339326 -> 12307102 (-0.26%)
cycles in affected programs: 6102794 -> 6070570 (-0.53%)
helped: 3838
HURT: 4868

Most of the hurt programs seem to be hurt because we generate extra MOVs due
to vectorizing things. For example, in
shaders/non-free/steam/anomaly-2/158.shader_test, this:

add(8)  g116<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.xyyyF { align16 NoDDClr 1Q };
add(8)  g117<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.zwwwF { align16 NoDDClr 1Q };
add(8)  g116<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.xxxyF { align16 NoDDChk 1Q };
add(8)  g117<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.zzzwF { align16 NoDDChk 1Q };

Turns into this:

add(8)  g13<1>F g12<4,4,1>.xyxyF g1.4<0,4,1>F   { align16 1Q };
add(8)  g14<1>F g12<4,4,1>.xyxyF -g1.4<0,4,1>F  { align16 1Q };
mov(8)  g116<1>.xyD g13<4,4,1>.xyyyD { align16 NoDDClr 1Q };
mov(8)  g117<1>.xyD g13<4,4,1>.zwwwD { align16 NoDDClr 1Q };
mov(8)  g116<1>.zwD g14<4,4,1>.xxxyD { align16 NoDDChk 1Q };
mov(8)  g117<1>.zwD g14<4,4,1>.zzzwD { align16 NoDDChk 1Q };

So we eliminated two adds, but then had to introduce four movs to
transpose the result.  Some of the hurt is because vectorization is a bit
over-aggressive and we vectorize something when we should have left it
as a scalar and CSEd it.  Unfortunately, this is all really tricky to do
as it involves the interactions between many different components.
---
 src/intel/compiler/brw_nir.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 297845b89b7..564fd004a94 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -568,6 +568,12 @@ brw_nir_optimize(nir_shader *nir, const struct 
brw_compiler *compiler,
   OPT(nir_copy_prop);
   OPT(nir_opt_dce);
   OPT(nir_opt_cse);
+
+  if (!is_scalar) {
+ OPT(nir_opt_vectorize);
+ OPT(nir_copy_prop);
+  }
+
   OPT(nir_opt_peephole_select, 0);
   OPT(nir_opt_intrinsics);
   OPT(nir_opt_algebraic);
-- 
2.19.1

[Mesa-dev] [PATCH 1/3] intel/peephole_ffma: Fix swizzle propagation

2018-10-17 Thread Jason Ekstrand

The num_components value passed into get_mul_for_src is used to only
compose the parts of the swizzle that we know will be used so we don't
compose invalid swizzle components.  However, we had a bug where we
passed the number of components of the add all the way through.  For the
given source, we need the number of components read from that source.
In the case where we have a narrow add, say 2 components, that is
sourced from a chain of wider instructions, we may not compose all the
swizzles.  All we really need to do is pass through the right number of
components at each level.

Fixes: 2231cf0ba3a "nir: Fix output swizzle in get_mul_for_src"
---
 src/intel/compiler/brw_nir_opt_peephole_ffma.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_nir_opt_peephole_ffma.c 
b/src/intel/compiler/brw_nir_opt_peephole_ffma.c
index cc225e1847b..7271bdbca43 100644
--- a/src/intel/compiler/brw_nir_opt_peephole_ffma.c
+++ b/src/intel/compiler/brw_nir_opt_peephole_ffma.c
@@ -68,7 +68,7 @@ are_all_uses_fadd(nir_ssa_def *def)
 }
 
 static nir_alu_instr *
-get_mul_for_src(nir_alu_src *src, int num_components,
+get_mul_for_src(nir_alu_src *src, unsigned num_components,
 uint8_t swizzle[4], bool *negate, bool *abs)
 {
uint8_t swizzle_tmp[4];
@@ -93,16 +93,19 @@ get_mul_for_src(nir_alu_src *src, int num_components,
switch (alu->op) {
case nir_op_imov:
case nir_op_fmov:
-  alu = get_mul_for_src(&alu->src[0], num_components, swizzle, negate, abs);
+  alu = get_mul_for_src(&alu->src[0], alu->dest.dest.ssa.num_components,
+swizzle, negate, abs);
   break;
 
case nir_op_fneg:
-  alu = get_mul_for_src(&alu->src[0], num_components, swizzle, negate, abs);
+  alu = get_mul_for_src(&alu->src[0], alu->dest.dest.ssa.num_components,
+swizzle, negate, abs);
   *negate = !*negate;
   break;
 
case nir_op_fabs:
-  alu = get_mul_for_src(&alu->src[0], num_components, swizzle, negate, abs);
+  alu = get_mul_for_src(&alu->src[0], alu->dest.dest.ssa.num_components,
+swizzle, negate, abs);
   *negate = false;
   *abs = true;
   break;
-- 
2.19.1
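
The swizzle-propagation fix in this patch can be illustrated with a small standalone sketch. It assumes swizzles are plain 4-entry arrays; `compose_swizzle` is a hypothetical helper written for this note, not the code in brw_nir_opt_peephole_ffma.c. The point is that each level must compose as many entries as its own destination has components, because the consumer above it may index any of them.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not Mesa's implementation.
 *
 * outer: swizzle the consumer applies to the producer's result
 * inner: swizzle already resolved for the producer's own source
 * num_components: how many components the consumer actually reads
 *
 * Only the first num_components entries of out are written; the rest
 * are left untouched, so they must not be relied on by the caller.
 */
static void
compose_swizzle(const uint8_t outer[4], const uint8_t inner[4],
                unsigned num_components, uint8_t out[4])
{
   for (unsigned i = 0; i < num_components; i++)
      out[i] = inner[outer[i]];
}
```

For a 2-component add that reads `.zw` of a 4-component move whose source is swizzled `.wzyx`, the inner array must hold all four entries {3, 2, 1, 0}; composing with the outer {2, 3} then yields {1, 0}. If the inner level had only been resolved for two components, entries 2 and 3 would be stale, which is the situation the commit message describes.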


[Mesa-dev] [PATCH 2/3] nir: add a vectorization pass

2018-10-17 Thread Jason Ekstrand

From: Connor Abbott 

This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a similar
model as CSE, where we do a depth-first approach and keep around a hash
set of instructions to be combined, but there are a few major
differences:

1. For now, we only support entirely per-component ALU operations.
2. Since it's not always guaranteed that we'll be able to combine
equivalent instructions, we keep a stack of equivalent instructions
around, trying to combine new instructions with instructions on the
stack.

The pass isn't comprehensive by far; it can't handle operations where
some of the sources are per-component and others aren't, and it can't
handle phi nodes. But it should handle the more common cases, and it
should be reasonably efficient.
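
As a rough illustration of the CSE-like lookup described above, the sketch below hashes a simplified instruction by opcode, bit size and source values while leaving the swizzle out, so that per-component operations on the same sources land in the same bucket. Everything here is an assumption made for illustration: `struct toy_alu`, `toy_hash_alu` and `fnv1a_accumulate` are hypothetical stand-ins, not the NIR data structures or Mesa's hashing helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a parameters. */
#define FNV32_OFFSET_BIAS 2166136261u
#define FNV32_PRIME       16777619u

/* Hypothetical, heavily simplified per-component ALU instruction. */
struct toy_alu {
   unsigned op;          /* opcode */
   unsigned bit_size;    /* destination bit size */
   unsigned num_srcs;    /* number of sources in use (at most 2 here) */
   unsigned src_id[2];   /* which SSA value each source reads */
   uint8_t  swizzle[2];  /* which component of that value is read */
};

/* Fold size bytes of data into a running FNV-1a hash. */
static uint32_t
fnv1a_accumulate(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= FNV32_PRIME;
   }
   return hash;
}

/* Hash everything that must match for two instructions to be combinable. */
static uint32_t
toy_hash_alu(const struct toy_alu *alu)
{
   uint32_t hash = FNV32_OFFSET_BIAS;

   hash = fnv1a_accumulate(hash, &alu->op, sizeof(alu->op));
   hash = fnv1a_accumulate(hash, &alu->bit_size, sizeof(alu->bit_size));
   for (unsigned i = 0; i < alu->num_srcs; i++)
      hash = fnv1a_accumulate(hash, &alu->src_id[i], sizeof(alu->src_id[i]));

   /* The swizzle is deliberately not hashed. */
   return hash;
}
```

Two instructions that differ only in swizzle hash identically here, which is what lets a hash-set lookup find candidates to merge into one wider operation.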
---
 src/compiler/Makefile.sources|   1 +
 src/compiler/nir/meson.build |   1 +
 src/compiler/nir/nir.h   |   2 +
 src/compiler/nir/nir_opt_vectorize.c | 454 +++
 4 files changed, 458 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_vectorize.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index b65bb9b80b9..e231f4a9ab1 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -289,6 +289,7 @@ NIR_FILES = \
nir/nir_opt_shrink_load.c \
nir/nir_opt_trivial_continues.c \
nir/nir_opt_undef.c \
+   nir/nir_opt_vectorize.c \
nir/nir_phi_builder.c \
nir/nir_phi_builder.h \
nir/nir_print.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index d8f65640004..865d11bb278 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -173,6 +173,7 @@ files_libnir = files(
   'nir_opt_shrink_load.c',
   'nir_opt_trivial_continues.c',
   'nir_opt_undef.c',
+  'nir_opt_vectorize.c',
   'nir_phi_builder.c',
   'nir_phi_builder.h',
   'nir_print.c',
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 5b871812d46..f33e2d3b726 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -3088,6 +3088,8 @@ bool nir_opt_trivial_continues(nir_shader *shader);
 
 bool nir_opt_undef(nir_shader *shader);
 
+bool nir_opt_vectorize(nir_shader *shader);
+
 bool nir_opt_conditional_discard(nir_shader *shader);
 
 void nir_sweep(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_vectorize.c b/src/compiler/nir/nir_opt_vectorize.c
new file mode 100644
index 000..7e22726a3ef
--- /dev/null
+++ b/src/compiler/nir/nir_opt_vectorize.c
@@ -0,0 +1,454 @@
+/*
+ * Copyright © 2015 Connor Abbott
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "nir.h"
+#include "nir_vla.h"
+#include "nir_builder.h"
+#include "util/u_dynarray.h"
+
+#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data))
+
+static uint32_t
+hash_src(uint32_t hash, const nir_src *src)
+{
+   assert(src->is_ssa);
+
+   return HASH(hash, src->ssa);
+}
+
+static uint32_t
+hash_alu_src(uint32_t hash, const nir_alu_src *src)
+{
+   assert(!src->abs && !src->negate);
+
+   /* intentionally don't hash swizzle */
+
+   return hash_src(hash, &src->src);
+}
+
+static uint32_t
+hash_alu(uint32_t hash, const nir_alu_instr *instr)
+{
+   hash = HASH(hash, instr->op);
+
+   hash = HASH(hash, instr->dest.dest.ssa.bit_size);
+
+   for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++)
+  hash = hash_alu_src(hash, &instr->src[i]);
+
+   return hash;
+}
+
+static uint32_t
+hash_instr(const nir_instr *instr)
+{
+   uint32_t hash = _mesa_fnv32_1a_offset_bias;
+
+   switch (instr->type) {
+   case nir_instr_type_alu:
+  return hash_alu(hash, nir_instr_as_alu(instr));
+   default:
+  unreachable("bad instruction type");
+   }
+}
+
+static bool
+srcs_equal(const nir_src *s

Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts

Eric Engestrom  writes:

> That's absolutely fair :)
>
> I wanted to ack your patch earlier, since fixing it is good regardless,
> but freedreno isn't my area so I didn't feel comfortable doing so;
> I changed my mind in the meantime though, so here you go :P
> Acked-by: Eric Engestrom 
>
> You have push access, right?

Yes, I have push access. But actually Rob already pushed my other patch
to just remove it in the meantime, so there’s no need to do anything.

Thanks anyway.

Regards,
- Neil

Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Eric Engestrom

On Wednesday, 2018-10-17 17:25:22 +0200, Neil Roberts wrote:
> Eric Engestrom  writes:
> 
> > You might want to remove these instead, and use the .editorconfig [1]
> > already present at src/gallium/drivers/freedreno/.editorconfig. This is
> > much easier to maintain than per-file settings ;)
> 
> Either fixing it or removing it is fine by me. I now notice there is a
> .dir-locals.el file that should make it work anyway. (apparently I was
> the last person to touch it too!) It has a typo which makes it fail to
> set indent-tabs-mode though. I can make everything work locally either
> way, I just wanted to get rid of the annoying warning whenever you open
> a file.

That's absolutely fair :)

I wanted to ack your patch earlier, since fixing it is good regardless,
but freedreno isn't my area so I didn't feel comfortable doing so;
I changed my mind in the meantime though, so here you go :P
Acked-by: Eric Engestrom 

You have push access, right?

> 
> - Neil

Re: [Mesa-dev] [PATCH 2/2] radeonsi: clamp point size to the limit

2018-10-17 Thread Jakob Bornecrantz

Tested-by: Jakob Bornecrantz 
On Wed, Oct 17, 2018 at 5:29 PM Marek Olšák  wrote:
>
> From: Marek Olšák 
>
> This fixes dEQP-GLES2.functional.rasterization.limits.points.
> Broken by: ea039f789d9b54e1bd1d644b6a29863ca3500314
> ---
>  src/gallium/drivers/radeonsi/si_get.c   | 5 +++--
>  src/gallium/drivers/radeonsi/si_pipe.h  | 1 +
>  src/gallium/drivers/radeonsi/si_state.c | 2 +-
>  3 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_get.c 
> b/src/gallium/drivers/radeonsi/si_get.c
> index ac302b8a946..804276b3eda 100644
> --- a/src/gallium/drivers/radeonsi/si_get.c
> +++ b/src/gallium/drivers/radeonsi/si_get.c
> @@ -326,25 +326,26 @@ static int si_get_param(struct pipe_screen *pscreen, 
> enum pipe_cap param)
> default:
> return u_pipe_screen_get_param_defaults(pscreen, param);
> }
>  }
>
>  static float si_get_paramf(struct pipe_screen* pscreen, enum pipe_capf param)
>  {
> switch (param) {
> case PIPE_CAPF_MAX_LINE_WIDTH:
> case PIPE_CAPF_MAX_LINE_WIDTH_AA:
> -   case PIPE_CAPF_MAX_POINT_WIDTH:
> -   case PIPE_CAPF_MAX_POINT_WIDTH_AA:
> /* This depends on the quant mode, though the precise 
> interactions
>  * are unknown. */
> return 2048;
> +   case PIPE_CAPF_MAX_POINT_WIDTH:
> +   case PIPE_CAPF_MAX_POINT_WIDTH_AA:
> +   return SI_MAX_POINT_SIZE;
> case PIPE_CAPF_MAX_TEXTURE_ANISOTROPY:
> return 16.0f;
> case PIPE_CAPF_MAX_TEXTURE_LOD_BIAS:
> return 16.0f;
> case PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE:
> case PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE:
> case PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY:
> return 0.0f;
> }
> return 0.0f;
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
> b/src/gallium/drivers/radeonsi/si_pipe.h
> index 6edc06cece7..dc95afb7421 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -41,20 +41,21 @@
>  #define ATI_VENDOR_ID  0x1002
>
>  #define SI_NOT_QUERY   0x
>
>  /* The base vertex and primitive restart can be any number, but we must pick
>   * one which will mean "unknown" for the purpose of state tracking and
>   * the number shouldn't be a commonly-used one. */
>  #define SI_BASE_VERTEX_UNKNOWN INT_MIN
>  #define SI_RESTART_INDEX_UNKNOWN   INT_MIN
>  #define SI_NUM_SMOOTH_AA_SAMPLES   8
> +#define SI_MAX_POINT_SIZE  2048
>  #define SI_GS_PER_ES   128
>  /* Alignment for optimal CP DMA performance. */
>  #define SI_CPDMA_ALIGNMENT 32
>
>  /* Tunables for compute-based clear_buffer and copy_buffer: */
>  #define SI_COMPUTE_CLEAR_DW_PER_THREAD 4
>  #define SI_COMPUTE_COPY_DW_PER_THREAD  4
>  #define SI_COMPUTE_DST_CACHE_POLICYL2_STREAM
>
>  /* Pipeline & streamout query controls. */
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 8b2e6e57f45..176ec749148 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -891,21 +891,21 @@ static void *si_create_rs_state(struct pipe_context 
> *ctx,
> S_0286D4_PNT_SPRITE_OVRD_Z(V_0286D4_SPI_PNT_SPRITE_SEL_0) |
> S_0286D4_PNT_SPRITE_OVRD_W(V_0286D4_SPI_PNT_SPRITE_SEL_1) |
> S_0286D4_PNT_SPRITE_TOP_1(state->sprite_coord_mode != 
> PIPE_SPRITE_COORD_UPPER_LEFT));
>
> /* point size 12.4 fixed point */
> tmp = (unsigned)(state->point_size * 8.0);
> si_pm4_set_reg(pm4, R_028A00_PA_SU_POINT_SIZE, S_028A00_HEIGHT(tmp) | 
> S_028A00_WIDTH(tmp));
>
> if (state->point_size_per_vertex) {
> psize_min = util_get_min_point_size(state);
> -   psize_max = 8192;
> +   psize_max = SI_MAX_POINT_SIZE;
> } else {
> /* Force the point size to be as if the vertex output was 
> disabled. */
> psize_min = state->point_size;
> psize_max = state->point_size;
> }
> rs->max_point_size = psize_max;
>
> /* Divide by two, because 0.5 = 1 pixel. */
> si_pm4_set_reg(pm4, R_028A04_PA_SU_POINT_MINMAX,
> S_028A04_MIN_SIZE(si_pack_float_12p4(psize_min/2)) |
> --
> 2.17.1
>
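
The point-size arithmetic in the quoted patch (12.4 fixed point, half sizes, and the new upper limit) can be worked through with a small sketch. This is an illustration under assumptions, not the radeonsi code: `pack_float_12p4`, `clamp_point_size` and `point_half_size_12p4` are hypothetical helpers, and `MAX_POINT_SIZE` simply mirrors the value given to SI_MAX_POINT_SIZE in the patch.

```c
/* Mirrors the value of SI_MAX_POINT_SIZE from the quoted patch. */
#define MAX_POINT_SIZE 2048.0f

/*
 * Hypothetical packer for unsigned 12.4 fixed point:
 * 12 integer bits and 4 fraction bits, so one unit is 1/16.
 */
static unsigned
pack_float_12p4(float x)
{
   if (x <= 0.0f)
      return 0;
   if (x >= 4096.0f)
      return 0xffff;
   return (unsigned)(x * 16.0f);
}

/* Clamp a requested point size to the advertised maximum. */
static float
clamp_point_size(float size)
{
   return size > MAX_POINT_SIZE ? MAX_POINT_SIZE : size;
}

/*
 * The hardware field holds a half size (0.5 means one pixel), so the
 * size is halved before packing. Halving and then multiplying by 16 is
 * the same as the "* 8.0" seen in the patch context.
 */
static unsigned
point_half_size_12p4(float size)
{
   return pack_float_12p4(clamp_point_size(size) / 2.0f);
}
```

With these definitions a 1-pixel point packs to 8, and both the old limit of 8192 and the new limit of 2048 pack to 16384 once the clamp is applied.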

Re: [Mesa-dev] [PATCH 1/2] radeonsi: fix a VGT hang with primitive restart on Polaris10 and later

2018-10-17 Thread Jakob Bornecrantz

Tested-by: Jakob Bornecrantz 
On Wed, Oct 17, 2018 at 5:29 PM Marek Olšák  wrote:
>
> From: Marek Olšák 
>
> Cc: 18.1 18.2 
> ---
>  src/gallium/drivers/radeonsi/si_state_draw.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
> b/src/gallium/drivers/radeonsi/si_state_draw.c
> index 83eb646b791..612ca910cb9 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -376,21 +376,21 @@ si_get_init_multi_vgt_param(struct si_screen *sscreen,
> }
>
> if (sscreen->info.chip_class >= CIK) {
> /* WD_SWITCH_ON_EOP has no effect on GPUs with less than
>  * 4 shader engines. Set 1 to pass the assertion below.
>  * The other cases are hardware requirements.
>  *
>  * Polaris supports primitive restart with WD_SWITCH_ON_EOP=0
>  * for points, line strips, and tri strips.
>  */
> -   if (sscreen->info.max_se < 4 ||
> +   if (sscreen->info.max_se <= 2 ||
> key->u.prim == PIPE_PRIM_POLYGON ||
> key->u.prim == PIPE_PRIM_LINE_LOOP ||
> key->u.prim == PIPE_PRIM_TRIANGLE_FAN ||
> key->u.prim == PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY ||
> (key->u.primitive_restart &&
>  (sscreen->info.family < CHIP_POLARIS10 ||
>   (key->u.prim != PIPE_PRIM_POINTS &&
>key->u.prim != PIPE_PRIM_LINE_STRIP &&
>key->u.prim != PIPE_PRIM_TRIANGLE_STRIP))) ||
> key->u.count_from_stream_output)
> @@ -407,35 +407,41 @@ si_get_init_multi_vgt_param(struct si_screen *sscreen,
>  * instances are smaller than a primgroup.
>  * Assume indirect draws always use small instances.
>  * This is needed for good VS wave utilization.
>  */
> if (sscreen->info.chip_class <= VI &&
> sscreen->info.max_se == 4 &&
> key->u.multi_instances_smaller_than_primgroup)
> wd_switch_on_eop = true;
>
> /* Required on CIK and later. */
> -   if (sscreen->info.max_se > 2 && !wd_switch_on_eop)
> +   if (sscreen->info.max_se == 4 && !wd_switch_on_eop)
> ia_switch_on_eoi = true;
>
> /* Required by Hawaii and, for some special cases, by VI. */
> if (ia_switch_on_eoi &&
> (sscreen->info.family == CHIP_HAWAII ||
>  (sscreen->info.chip_class == VI &&
>   (key->u.uses_gs || max_primgroup_in_wave != 2
> partial_vs_wave = true;
>
> /* Instancing bug on Bonaire. */
> if (sscreen->info.family == CHIP_BONAIRE && ia_switch_on_eoi 
> &&
> key->u.uses_instancing)
> partial_vs_wave = true;
>
> +   /* This only applies to Polaris10 and later 4 SE chips.
> +* wd_switch_on_eop is already true on all other chips.
> +*/
> +   if (!wd_switch_on_eop && key->u.primitive_restart)
> +   partial_vs_wave = true;
> +
> /* If the WD switch is false, the IA switch must be false 
> too. */
> assert(wd_switch_on_eop || !ia_switch_on_eop);
> }
>
> /* If SWITCH_ON_EOI is set, PARTIAL_ES_WAVE must be set too. */
> if (sscreen->info.chip_class <= VI && ia_switch_on_eoi)
> partial_es_wave = true;
>
> return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
> S_028AA8_SWITCH_ON_EOI(ia_switch_on_eoi) |
> --
> 2.17.1
>

Re: [Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

2018-10-17 Thread Keith Packard

Jason Ekstrand  writes:

> I like it

When the comments are longer than the code, you know you're done?

-- 
-keith



Re: [Mesa-dev] [PATCH 2/3] mesa/st: enable EXT_sRGB_write_control for drivers that support it

2018-10-17 Thread Gert Wollny

On Wednesday, 17.10.2018 at 12:56 -0400, Ilia Mirkin wrote:
> On Wed, Oct 17, 2018 at 12:39 PM Gert Wollny 
> wrote:
> > 
> > From: Gert Wollny 
> > 
> > With this patch the extension EXT_sRGB_write_control is enabled for
> > gallium drivers that support sRGB formats as render targets.
> > 
> > Tested (and pass) on r600(evergreen) and softpipe:
> > 
> >   dEQP-GLES31.functional.fbo.srgb_write_control.framebuffer_srgb_enabled*
> > 
> > with "MESA_GLES_VERSION_OVERRIDE=3.2" (the tests needlessly check
> > for this)
> > 
> > Signed-off-by: Gert Wollny 
> > ---
> >  src/mesa/state_tracker/st_manager.c | 17 +
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/src/mesa/state_tracker/st_manager.c
> > b/src/mesa/state_tracker/st_manager.c
> > index ceb48dd490..562b12a1ef 100644
> > --- a/src/mesa/state_tracker/st_manager.c
> > +++ b/src/mesa/state_tracker/st_manager.c
> > @@ -457,14 +457,12 @@ st_framebuffer_create(struct st_context *st,
> >  * format such that util_format_srgb(visual->color_format) can
> > be supported
> >  * by the pipe driver.  We still need to advertise the
> > capability here.
> >  *
> > -* For GLES, however, sRGB framebuffer write is controlled only
> > by the
> > -* capability of the framebuffer.  There is
> > GL_EXT_sRGB_write_control to
> > -* give applications the control back, but sRGB write is still
> > enabled by
> > -* default.  To avoid unexpected results, we should not
> > advertise the
> > -* capability.  This could change when we add support for
> > -* EGL_KHR_gl_colorspace.
> > +* For GLES, however, sRGB framebuffer write is initially only
> > controlled
> > +* by the capability of the framebuffer, but with
> > GL_EXT_sRGB_write_control
> > +* control is given back to the applications. Similar to
> > desktop GL
> > +* support for this extension depends EXT_framebuffer_sRGB.
> >  */
> > -   if (_mesa_is_desktop_gl(st->ctx)) {
> > +   {
> >struct pipe_screen *screen = st->pipe->screen;
> >const enum pipe_format srgb_format =
> >   util_format_srgb(stfbi->visual->color_format);
> > @@ -475,8 +473,11 @@ st_framebuffer_create(struct st_context *st,
> >PIPE_TEXTURE_2D, stfbi-
> > >visual->samples,
> >stfbi->visual->samples,
> >(PIPE_BIND_DISPLAY_TARGET |
> > -   PIPE_BIND_RENDER_TARGET)))
> > +   PIPE_BIND_RENDER_TARGET)))
> > {
> >   mode.sRGBCapable = GL_TRUE;
> > + /* Exposing this as extension is only needed on GLES */
> > + st->ctx->Extensions.EXT_sRGB_write_control =
> > !_mesa_is_desktop_gl(st->ctx);
> 
> Having weird dependencies in extension enables creates a lot of
> confusion. I'd just flip it to true.
My reasoning here was that this is a GLES-only extension, but I now
see that this is actually done via the extension table.

Thanks for all the pointers. 
Gert 

> 
> > +  }
> > }
> > 
> > _mesa_initialize_window_framebuffer(&stfb->Base, &mode);
> > --
> > 2.18.1
> > 

[Mesa-dev] Q: to which software renderers should we contribute to help virgl conformance testing

2018-10-17 Thread Gert Wollny

Dear all, 

we are looking into doing a CI for virglrenderer that also runs a
subset of the GLES dEQP, and in order to be able to run this also in
gitlab.fd.o we were looking into the available gallium software
renderers. Initial tests by just running the dEQP-GLES2 were quite
successful in the sense that the execution time is not too long (a full
run on the GL and GLES host with llvmpipe takes about 10 min [1]). 

Now to extend on that work the focus is turning to which software
renderer has the most features, the least failing tests, and is
actively developed. 

Simply looking at the commit stats it seems that the development of
softpipe and llvmpipe is mostly stalled. swr, on the other hand, has seen
quite some development, but mostly regarding performance, and given the
FAQ [2] the focus is on a very specific application space and not so
much on getting more features in.

When checking for conformance of virglrenderer we need a host driver
that is conformant itself, and we are willing to contribute here, but
it seems to make most sense to focus this work on just one driver. To
make a sensible choice there are some open questions:

Are there plans to get swr and/or llvmpipe to support GLES 3.1, or
carry any of the drivers even further, maybe GLES 3.2 and desktop 4.x?


Is there any specific interest in fixing all failures that occur when
running the GLES dEQP? In this bug report [3] Roland pointed out that
"there is no goal as such to pass dEQP, although patches are welcome".
Any opinions for the other drivers (for swr, beyond what is written in
the FAQ)?

As pointed out in the FAQ, swr is very Intel specific. Are there plans
not laid out in the FAQ to support other, non-x86 hardware?

many thanks 
Gert

[1] https://gitlab.freedesktop.org/gerddie/virglrenderer/pipelines
[2] https://gallium.readthedocs.io/en/latest/drivers/openswr/faq.html#what-s-the-conformance
[3] https://bugs.freedesktop.org/show_bug.cgi?id=94957

Re: [Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

2018-10-17 Thread Jason Ekstrand

I like it

Reviewed-by: Jason Ekstrand 
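
The v5 rule quoted in this thread, maxDeviation = max(sampled_clock_period) + sample_interval, can be sketched on its own as follows. This is a minimal illustration assuming a POSIX system; `clock_ns` and `max_deviation_ns` are hypothetical names chosen for this note, not the functions added by the patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Read a POSIX clock and return nanoseconds, or 0 if the read fails. */
static uint64_t
clock_ns(clockid_t id)
{
   struct timespec ts;

   if (clock_gettime(id, &ts) < 0)
      return 0;

   return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * begin and end bracket the sampling of all requested clocks, in
 * nanoseconds. max_clock_period is the coarsest tick, in nanoseconds,
 * of any clock that was sampled. The reported deviation is the length
 * of the sampling window plus that coarsest tick.
 */
static uint64_t
max_deviation_ns(uint64_t begin, uint64_t end, uint64_t max_clock_period)
{
   uint64_t sample_interval = end - begin;

   return sample_interval + max_clock_period;
}
```

A caller would take `begin = clock_ns(CLOCK_MONOTONIC)` before sampling, `end` afterwards, and pass the largest period seen. For example, a 250 ns sampling window with a 40 ns device tick gives a deviation of 290 ns.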

On Wed, Oct 17, 2018 at 11:49 AM Keith Packard  wrote:

> Offers three clocks, device, clock monotonic and clock monotonic
> raw. Could use some kernel support to reduce the deviation between
> clock values.
>
> v2:
> Ensure deviation is at least as big as the GPU time interval.
>
> v3:
> Set device->lost when returning DEVICE_LOST.
> Use MAX2 and DIV_ROUND_UP instead of open coding these.
> Delete spurious TIMESTAMP in radv version.
>
> Suggested-by: Jason Ekstrand 
> Suggested-by: Lionel Landwerlin 
>
> v4:
> Add anv_gem_reg_read to anv_gem_stubs.c
>
> Suggested-by: Jason Ekstrand 
>
> v5:
> Adjust maxDeviation computation to max(sampled_clock_period) +
> sample_interval.
>
> Suggested-by: Bas Nieuwenhuizen 
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
> ---
>  src/amd/vulkan/radv_device.c   | 119 +++
>  src/amd/vulkan/radv_extensions.py  |   1 +
>  src/intel/vulkan/anv_device.c  | 127 +
>  src/intel/vulkan/anv_extensions.py |   1 +
>  src/intel/vulkan/anv_gem.c |  13 +++
>  src/intel/vulkan/anv_gem_stubs.c   |   7 ++
>  src/intel/vulkan/anv_private.h |   2 +
>  7 files changed, 270 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 174922780fc..4a705a724ef 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -4955,3 +4955,122 @@ radv_GetDeviceGroupPeerMemoryFeatures(
>VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT |
>VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT;
>  }
> +
> +static const VkTimeDomainEXT radv_time_domains[] = {
> +   VK_TIME_DOMAIN_DEVICE_EXT,
> +   VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT,
> +   VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT,
> +};
> +
> +VkResult radv_GetPhysicalDeviceCalibrateableTimeDomainsEXT(
> +   VkPhysicalDevice physicalDevice,
> +   uint32_t *pTimeDomainCount,
> +   VkTimeDomainEXT  *pTimeDomains)
> +{
> +   int d;
> +   VK_OUTARRAY_MAKE(out, pTimeDomains, pTimeDomainCount);
> +
> +   for (d = 0; d < ARRAY_SIZE(radv_time_domains); d++) {
> +   vk_outarray_append(&out, i) {
> +   *i = radv_time_domains[d];
> +   }
> +   }
> +
> +   return vk_outarray_status(&out);
> +}
> +
> +static uint64_t
> +radv_clock_gettime(clockid_t clock_id)
> +{
> +   struct timespec current;
> +   int ret;
> +
> +   ret = clock_gettime(clock_id, &current);
> +   if (ret < 0 && clock_id == CLOCK_MONOTONIC_RAW)
> +   ret = clock_gettime(CLOCK_MONOTONIC, &current);
> +   if (ret < 0)
> +   return 0;
> +
> +   return (uint64_t) current.tv_sec * 1000000000ULL + current.tv_nsec;
> +}
> +
> +VkResult radv_GetCalibratedTimestampsEXT(
> +   VkDevice _device,
> +   uint32_t timestampCount,
> +   const VkCalibratedTimestampInfoEXT   *pTimestampInfos,
> +   uint64_t *pTimestamps,
> +   uint64_t *pMaxDeviation)
> +{
> +   RADV_FROM_HANDLE(radv_device, device, _device);
> +   uint32_t clock_crystal_freq =
> device->physical_device->rad_info.clock_crystal_freq;
> +   int d;
> +   uint64_t begin, end;
> +uint64_t max_clock_period = 0;
> +
> +   begin = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
> +
> +   for (d = 0; d < timestampCount; d++) {
> +   switch (pTimestampInfos[d].timeDomain) {
> +   case VK_TIME_DOMAIN_DEVICE_EXT:
> +   pTimestamps[d] =
> device->ws->query_value(device->ws,
> +
> RADEON_TIMESTAMP);
> +uint64_t device_period = DIV_ROUND_UP(100,
> clock_crystal_freq);
> +max_clock_period = MAX2(max_clock_period,
> device_period);
> +   break;
> +   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT:
> +   pTimestamps[d] =
> radv_clock_gettime(CLOCK_MONOTONIC);
> +max_clock_period = MAX2(max_clock_period, 1);
> +   break;
> +
> +   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT:
> +   pTimestamps[d] = begin;
> +   break;
> +   default:
> +   pTimestamps[d] = 0;
> +   break;
> +   }
> +   }
> +
> +   end = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
> +
> +/*
> + * The maximum deviation is the sum of the interval over which we
> + * perform the sampling and the maximum period of any sampled
> + * clock. That's because t

Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts


> On Wed, Oct 17, 2018 at 12:45 PM Neil Roberts  wrote:
>
>> I wonder if you have something else in your setup that is setting it?

Ilia Mirkin  writes:

> Perhaps. It's the default, right?

It is the default but the toplevel .dir-locals.el sets it to nil. These
lower-level files are trying to override it back to the default.

> These might have a common source... although, HAH! IT WASN'T ME!
> Michel in 8d0a1a6bc05a set it to true, I probably copied, and am so
> used to emacs errors that I didn't even notice. Indents worked, so I
> was happy.

:)

> Yes, fixing these all is probably a good move. I don't think there are
> a lot of emacs users in mesa.

Lucky for them :)

Regards,
- Neil

Re: [Mesa-dev] [PATCH 1/3] mesa/core: Add support for EXT_sRGB_write_control

2018-10-17 Thread Ilia Mirkin

On Wed, Oct 17, 2018 at 12:49 PM Ilia Mirkin  wrote:
> On Wed, Oct 17, 2018 at 12:38 PM Gert Wollny  wrote:
> > diff --git a/src/mesa/main/extensions_table.h 
> > b/src/mesa/main/extensions_table.h
> > index 09bf923bd0..1185156f23 100644
> > --- a/src/mesa/main/extensions_table.h
> > +++ b/src/mesa/main/extensions_table.h
> > @@ -265,6 +265,7 @@ EXT(EXT_shader_integer_mix  , 
> > EXT_shader_integer_mix
> >  EXT(EXT_shader_io_blocks, dummy_true   
> >   ,  x ,  x ,  x ,  31, 2014)
> >  EXT(EXT_shader_samples_identical, EXT_shader_samples_identical 
> >   , GLL, GLC,  x ,  31, 2015)
> >  EXT(EXT_shadow_funcs, ARB_shadow   
> >   , GLL,  x ,  x ,  x , 2002)
> > +EXT(EXT_sRGB_write_control  , EXT_sRGB_write_control   
> >   , GLL,  x ,  x ,  30, 2013)
>
> I think you want an "x" instead of "GLL" -- it's an ES-only ext. Also
> I'd list "ES2" as the minimum. A driver that doesn't expose ES 3.0 or
> EXT_sRGB just shouldn't set this enable to true.

Oh, and an additional observation, since we don't expose EXT_sRGB at
all in mesa, the 30 is warranted here. But when we do, we should drop
this to ES2 and then ensure that the relevant drivers don't do
anything silly.

Cheers,

  -ilia
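
The division of labour discussed in this thread, where the driver sets the enable flag and the extension table decides which APIs and versions may see it, can be sketched as below. This is a simplified model under assumptions: `struct toy_ext`, `TOY_NEVER` and `toy_ext_exposed` are hypothetical and only imitate the idea of per-API minimum versions with "x" meaning never exposed.

```c
#include <stdbool.h>

/* The four API columns of the hypothetical table. */
enum toy_api {
   TOY_API_GL_COMPAT,
   TOY_API_GL_CORE,
   TOY_API_GLES1,
   TOY_API_GLES2,
   TOY_API_COUNT
};

/* Plays the role of "x" in the table: never exposed for that API. */
#define TOY_NEVER 0xff

struct toy_ext {
   const char *name;
   unsigned char min_version[TOY_API_COUNT]; /* e.g. 30 for ES 3.0 */
};

/*
 * An extension is advertised only if the driver enabled it AND the
 * table allows it for this API at this context version.
 */
static bool
toy_ext_exposed(const struct toy_ext *ext, bool driver_enable,
                enum toy_api api, unsigned version)
{
   return driver_enable &&
          ext->min_version[api] != TOY_NEVER &&
          version >= ext->min_version[api];
}
```

With a row of { x, x, x, 30 }, a driver can set the enable unconditionally: the extension is still hidden from desktop GL contexts and from ES contexts below 3.0, so the enable itself needs no API check.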

Re: [Mesa-dev] [PATCH 2/3] mesa/st: enable EXT_sRGB_write_control for drivers that support it

2018-10-17 Thread Ilia Mirkin

On Wed, Oct 17, 2018 at 12:39 PM Gert Wollny  wrote:
>
> From: Gert Wollny 
>
> With this patch the extension EXT_sRGB_write_control is enabled for
> gallium drivers that support sRGB formats as render targets.
>
> Tested (and pass) on r600(evergreen) and softpipe:
>
>   dEQP-GLES31.functional.fbo.srgb_write_control.framebuffer_srgb_enabled*
>
> with "MESA_GLES_VERSION_OVERRIDE=3.2" (the tests needlessly check for this)
>
> Signed-off-by: Gert Wollny 
> ---
>  src/mesa/state_tracker/st_manager.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_manager.c 
> b/src/mesa/state_tracker/st_manager.c
> index ceb48dd490..562b12a1ef 100644
> --- a/src/mesa/state_tracker/st_manager.c
> +++ b/src/mesa/state_tracker/st_manager.c
> @@ -457,14 +457,12 @@ st_framebuffer_create(struct st_context *st,
>  * format such that util_format_srgb(visual->color_format) can be 
> supported
>  * by the pipe driver.  We still need to advertise the capability here.
>  *
> -* For GLES, however, sRGB framebuffer write is controlled only by the
> -* capability of the framebuffer.  There is GL_EXT_sRGB_write_control to
> -* give applications the control back, but sRGB write is still enabled by
> -* default.  To avoid unexpected results, we should not advertise the
> -* capability.  This could change when we add support for
> -* EGL_KHR_gl_colorspace.
> +* For GLES, however, sRGB framebuffer write is initially only controlled
> +* by the capability of the framebuffer, but with 
> GL_EXT_sRGB_write_control
> +* control is given back to the applications. Similar to desktop GL
> +* support for this extension depends EXT_framebuffer_sRGB.
>  */
> -   if (_mesa_is_desktop_gl(st->ctx)) {
> +   {
>struct pipe_screen *screen = st->pipe->screen;
>const enum pipe_format srgb_format =
>   util_format_srgb(stfbi->visual->color_format);
> @@ -475,8 +473,11 @@ st_framebuffer_create(struct st_context *st,
>PIPE_TEXTURE_2D, 
> stfbi->visual->samples,
>stfbi->visual->samples,
>(PIPE_BIND_DISPLAY_TARGET |
> -   PIPE_BIND_RENDER_TARGET)))
> +   PIPE_BIND_RENDER_TARGET))) {
>   mode.sRGBCapable = GL_TRUE;
> + /* Exposing this as extension is only needed on GLES */
> + st->ctx->Extensions.EXT_sRGB_write_control = 
> !_mesa_is_desktop_gl(st->ctx);

Having weird dependencies in extension enables creates a lot of
confusion. I'd just flip it to true.

> +  }
> }
>
> _mesa_initialize_window_framebuffer(&stfb->Base, &mode);
> --
> 2.18.1
>

Re: [Mesa-dev] [PATCH v2] Fix setting indent-tabs-mode in the Emacs .dir-locals.el files

2018-10-17 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 
On Wed, Oct 17, 2018 at 12:51 PM Neil Roberts  wrote:
>
> Some of the .dir-locals.el had the wrong name for the truthy value so
> it wasn’t setting indent-tabs-mode.
> ---
>  src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
>  src/gallium/drivers/r600/.dir-locals.el  | 2 +-
>  src/gallium/drivers/radeon/.dir-locals.el| 2 +-
>  src/gallium/drivers/radeonsi/.dir-locals.el  | 2 +-
>  src/mesa/drivers/dri/nouveau/.dir-locals.el  | 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
> b/src/gallium/drivers/freedreno/.dir-locals.el
> index aa20d495465..b0e90fcbd53 100644
> --- a/src/gallium/drivers/freedreno/.dir-locals.el
> +++ b/src/gallium/drivers/freedreno/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 4)
>(c-basic-offset . 4)
>(c-file-style . "k&r")
> diff --git a/src/gallium/drivers/r600/.dir-locals.el 
> b/src/gallium/drivers/r600/.dir-locals.el
> index 4e35c129e70..15cd68edb0a 100644
> --- a/src/gallium/drivers/r600/.dir-locals.el
> +++ b/src/gallium/drivers/r600/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 8)
>(c-basic-offset . 8)
>(c-file-style . "stroustrup")
> diff --git a/src/gallium/drivers/radeon/.dir-locals.el 
> b/src/gallium/drivers/radeon/.dir-locals.el
> index 4e35c129e70..15cd68edb0a 100644
> --- a/src/gallium/drivers/radeon/.dir-locals.el
> +++ b/src/gallium/drivers/radeon/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 8)
>(c-basic-offset . 8)
>(c-file-style . "stroustrup")
> diff --git a/src/gallium/drivers/radeonsi/.dir-locals.el 
> b/src/gallium/drivers/radeonsi/.dir-locals.el
> index 4e35c129e70..15cd68edb0a 100644
> --- a/src/gallium/drivers/radeonsi/.dir-locals.el
> +++ b/src/gallium/drivers/radeonsi/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 8)
>(c-basic-offset . 8)
>(c-file-style . "stroustrup")
> diff --git a/src/mesa/drivers/dri/nouveau/.dir-locals.el 
> b/src/mesa/drivers/dri/nouveau/.dir-locals.el
> index 774f023ae6f..9b3ddf52461 100644
> --- a/src/mesa/drivers/dri/nouveau/.dir-locals.el
> +++ b/src/mesa/drivers/dri/nouveau/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 8)
>(c-basic-offset . 8)
>(c-file-style . "stroustrup")
> --
> 2.17.1
>

Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Ilia Mirkin

On Wed, Oct 17, 2018 at 12:45 PM Neil Roberts  wrote:
>
> Ilia Mirkin  writes:
>
> > Are you sure? It works fine for me... I'm not against fixing it to be
> > "t", but the current contents definitely worked fine for me. (As I
> > recall, I may be the one who checked this file in.)
>
> Yes, I’m sure. If you type “true” and then do C-x C-e to evaluate it
> then Emacs gives a void-variable error. If I leave it as “true” in the
> file then it does indeed indent without tabs. Also if I do C-h v it says
> the value is nil, whereas if I change the .dir-locals.el to “t” then the
> indentation works properly and the variable help says its value comes
> from the .dir-locals.el. I wonder if you have something else in your
> setup that is setting it?

Perhaps. It's the default, right?

>
> I notice that there are some other files with the same problem. It might
> be worth fixing them all in one patch.
>
> $ git grep 'indent-tabs-mode *\. *true'
> src/gallium/drivers/freedreno/.dir-locals.el:  (indent-tabs-mode . true)
> src/gallium/drivers/r600/.dir-locals.el:  (indent-tabs-mode . true)
> src/gallium/drivers/radeon/.dir-locals.el:  (indent-tabs-mode . true)
> src/gallium/drivers/radeonsi/.dir-locals.el:  (indent-tabs-mode . true)
> src/mesa/drivers/dri/nouveau/.dir-locals.el:  (indent-tabs-mode . true)

These might have a common source... although, HAH! IT WASN'T ME!
Michel in 8d0a1a6bc05a set it to true, I probably copied, and am so
used to emacs errors that I didn't even notice. Indents worked, so I
was happy.

Yes, fixing these all is probably a good move. I don't think there are
a lot of emacs users in mesa.

  -ilia

[Mesa-dev] [PATCH v2] Fix setting indent-tabs-mode in the Emacs .dir-locals.el files

2018-10-17 Thread Neil Roberts

Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.
---
 src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
 src/gallium/drivers/r600/.dir-locals.el  | 2 +-
 src/gallium/drivers/radeon/.dir-locals.el| 2 +-
 src/gallium/drivers/radeonsi/.dir-locals.el  | 2 +-
 src/mesa/drivers/dri/nouveau/.dir-locals.el  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
b/src/gallium/drivers/freedreno/.dir-locals.el
index aa20d495465..b0e90fcbd53 100644
--- a/src/gallium/drivers/freedreno/.dir-locals.el
+++ b/src/gallium/drivers/freedreno/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 4)
   (c-basic-offset . 4)
   (c-file-style . "k&r")
diff --git a/src/gallium/drivers/r600/.dir-locals.el 
b/src/gallium/drivers/r600/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/r600/.dir-locals.el
+++ b/src/gallium/drivers/r600/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/gallium/drivers/radeon/.dir-locals.el 
b/src/gallium/drivers/radeon/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/radeon/.dir-locals.el
+++ b/src/gallium/drivers/radeon/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/gallium/drivers/radeonsi/.dir-locals.el 
b/src/gallium/drivers/radeonsi/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/radeonsi/.dir-locals.el
+++ b/src/gallium/drivers/radeonsi/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/mesa/drivers/dri/nouveau/.dir-locals.el 
b/src/mesa/drivers/dri/nouveau/.dir-locals.el
index 774f023ae6f..9b3ddf52461 100644
--- a/src/mesa/drivers/dri/nouveau/.dir-locals.el
+++ b/src/mesa/drivers/dri/nouveau/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
-- 
2.17.1


Re: [Mesa-dev] [PATCH 1/3] mesa/core: Add support for EXT_sRGB_write_control

2018-10-17 Thread Ilia Mirkin

On Wed, Oct 17, 2018 at 12:38 PM Gert Wollny  wrote:
>
> From: Gert Wollny 
>
> This GLES extension gives the applications the control over deciding whether
> the conversion from linear space to sRGB is necessary by enabling or
> disabling this conversion at framebuffer write or blending time just
> like it is possible for desktop GL.
>
> Signed-off-by: Gert Wollny 
> ---
>  src/mesa/main/enable.c   | 4 ++--
>  src/mesa/main/extensions_table.h | 1 +
>  src/mesa/main/get.c  | 6 ++
>  src/mesa/main/get_hash_params.py | 1 +
>  src/mesa/main/mtypes.h   | 1 +
>  5 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
> index bd3e493da5..06c5a0eb68 100644
> --- a/src/mesa/main/enable.c
> +++ b/src/mesa/main/enable.c
> @@ -1125,7 +1125,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
> GLboolean state)
>
>/* GL3.0 - GL_framebuffer_sRGB */
>case GL_FRAMEBUFFER_SRGB_EXT:
> - if (!_mesa_is_desktop_gl(ctx))
> + if (!_mesa_is_desktop_gl(ctx) && 
> !ctx->Extensions.EXT_sRGB_write_control)
>  goto invalid_enum_error;
>   CHECK_EXTENSION(EXT_framebuffer_sRGB, cap);
>   _mesa_set_framebuffer_srgb(ctx, state);
> @@ -1765,7 +1765,7 @@ _mesa_IsEnabled( GLenum cap )
>
>/* GL3.0 - GL_framebuffer_sRGB */
>case GL_FRAMEBUFFER_SRGB_EXT:
> - if (!_mesa_is_desktop_gl(ctx))
> + if (!_mesa_is_desktop_gl(ctx) && 
> !ctx->Extensions.EXT_sRGB_write_control)
>  goto invalid_enum_error;
>   CHECK_EXTENSION(EXT_framebuffer_sRGB);
>   return ctx->Color.sRGBEnabled;
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 09bf923bd0..1185156f23 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -265,6 +265,7 @@ EXT(EXT_shader_integer_mix  , 
> EXT_shader_integer_mix
>  EXT(EXT_shader_io_blocks, dummy_true 
> ,  x ,  x ,  x ,  31, 2014)
>  EXT(EXT_shader_samples_identical, EXT_shader_samples_identical   
> , GLL, GLC,  x ,  31, 2015)
>  EXT(EXT_shadow_funcs, ARB_shadow 
> , GLL,  x ,  x ,  x , 2002)
> +EXT(EXT_sRGB_write_control  , EXT_sRGB_write_control 
> , GLL,  x ,  x ,  30, 2013)

I think you want an "x" instead of "GLL" -- it's an ES-only ext. Also
I'd list "ES2" as the minimum. A driver that doesn't expose ES 3.0 or
EXT_sRGB just shouldn't set this enable to true.
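
To make the effect of the two table rows concrete, here is a rough model. It is only a sketch: api_model, ext_row_model, VER_NEVER and ext_row_exposed are hypothetical names made up for illustration, not Mesa's real table machinery, and the version numbers (30 for 3.0, 20 for 2.0) are an assumed encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical API columns, in the table's order: GLL, GLC, ES1, ES2. */
enum api_model { API_GL_COMPAT, API_GL_CORE, API_GLES1, API_GLES2, API_COUNT };

/* Hypothetical stand-in for the table's "x" (never exposed on this API). */
#define VER_NEVER 0xff

struct ext_row_model {
    const char *name;
    uint8_t min_version[API_COUNT]; /* minimum version per API, e.g. 30 = 3.0 */
};

/* An extension is exposed when the column is not "x" and the context
 * version reaches the column's minimum. */
bool ext_row_exposed(const struct ext_row_model *row,
                     enum api_model api, uint8_t version)
{
    assert(api < API_COUNT);
    uint8_t min = row->min_version[api];
    return min != VER_NEVER && version >= min;
}

/* Row as posted: compat GL at any version, ES 3.0 as the ES minimum. */
const struct ext_row_model row_as_posted = {
    "GL_EXT_sRGB_write_control", { 0, VER_NEVER, VER_NEVER, 30 }
};

/* Row as suggested in the review: ES only, ES 2.0 as the minimum. */
const struct ext_row_model row_as_suggested = {
    "GL_EXT_sRGB_write_control", { VER_NEVER, VER_NEVER, VER_NEVER, 20 }
};
```

With the suggested row the table itself never hides the extension on ES 2.0, so the driver's enable bit is the only gate, which is the point of the comment above.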

>  EXT(EXT_stencil_two_side, EXT_stencil_two_side   
> , GLL,  x ,  x ,  x , 2001)
>  EXT(EXT_stencil_wrap, dummy_true 
> , GLL,  x ,  x ,  x , 2002)
>  EXT(EXT_subtexture  , dummy_true 
> , GLL,  x ,  x ,  x , 1995)
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 1b1679e8bf..fd9d3885f5 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -394,6 +394,12 @@ static const int extra_ARB_compute_shader_es31[] = {
> EXTRA_END
>  };
>
> +static const int extra_EXT_sRGB_write_control_es30[] = {
> +   EXT(EXT_sRGB_write_control),
> +   EXTRA_API_ES3,
> +   EXTRA_END
> +};

These get OR'd, I believe, which is not what you want. Just leave the
EXT() in, leave the EXTRA_API out.
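
If the entries are indeed OR'd, the difference looks roughly like the sketch below. The names (req_model, ctx_model, extras_any_met) are hypothetical stand-ins for illustration, not the real get.c code, and the any-of behaviour is the assumption stated above rather than something verified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical requirement tokens; not Mesa's real EXTRA_* values. */
enum req_model { REQ_END, REQ_EXT_SRGB_WRITE_CONTROL, REQ_API_ES3 };

struct ctx_model {
    bool ext_srgb_write_control; /* driver enabled the extension */
    bool is_gles3;               /* context is ES 3.0 or later */
};

static bool req_met(const struct ctx_model *ctx, enum req_model r)
{
    switch (r) {
    case REQ_EXT_SRGB_WRITE_CONTROL: return ctx->ext_srgb_write_control;
    case REQ_API_ES3:                return ctx->is_gles3;
    default:                         return false;
    }
}

/* Any-of semantics: the query is accepted as soon as one entry is met. */
bool extras_any_met(const struct ctx_model *ctx, const enum req_model *list)
{
    assert(ctx != NULL && list != NULL);
    for (size_t i = 0; list[i] != REQ_END; i++) {
        if (req_met(ctx, list[i]))
            return true;
    }
    return false;
}
```

Under that assumption, a list holding both the extension and the ES3 entry accepts the query on any ES 3.0 context, even when the driver never enabled the extension; a list with only the extension entry does not.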

> +
>  static const int extra_ARB_shader_storage_buffer_object_es31[] = {
> EXT(ARB_shader_storage_buffer_object),
> EXTRA_API_ES31,
> diff --git a/src/mesa/main/get_hash_params.py 
> b/src/mesa/main/get_hash_params.py
> index 1840db6ebb..822fab8151 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -262,6 +262,7 @@ descriptor=[
>  # Enums in GLES2, GLES3
>  { "apis": ["GLES2", "GLES3"], "params": [
>[ "GPU_DISJOINT_EXT", "LOC_CUSTOM, TYPE_INT, 0, 
> extra_EXT_disjoint_timer_query" ],
> +  [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
> extra_EXT_sRGB_write_control_es30" ],
>  ]},
>
>  { "apis": ["GL", "GL_CORE", "GLES2"], "params": [
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 9ed49b7ff2..31cf62fdb6 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4253,6 +4253,7 @@ struct gl_extensions
> GLboolean EXT_semaphore_fd;
> GLboolean EXT_shader_integer_mix;
> GLboolean EXT_shader_samples_identical;
> +   GLboolean EXT_sRGB_write_control;
> GLboolean EXT_stencil_two_side;
> GLboolean EXT_texture_array;
> GLboolean EXT_texture_compression_latc;
> --
> 2.18.1
>

[Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

2018-10-17 Thread Keith Packard

Offers three clocks, device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.

v2:
Ensure deviation is at least as big as the GPU time interval.

v3:
Set device->lost when returning DEVICE_LOST.
Use MAX2 and DIV_ROUND_UP instead of open coding these.
Delete spurious TIMESTAMP in radv version.

Suggested-by: Jason Ekstrand 
Suggested-by: Lionel Landwerlin 

v4:
Add anv_gem_reg_read to anv_gem_stubs.c

Suggested-by: Jason Ekstrand 

v5:
Adjust maxDeviation computation to max(sampled_clock_period) +
sample_interval.

Suggested-by: Bas Nieuwenhuizen 
Suggested-by: Jason Ekstrand 
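
For reference, the v5 rule can be sketched on its own as below. This is a minimal illustration, not the code from the patch: the helper names are made up, the crystal frequency is assumed to be in kHz, and any extra rounding the driver may apply is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
uint64_t timespec_to_ns(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Period in ns of a clock ticking at freq_khz (assumed unit), rounded up
 * so the deviation is never under-reported. */
uint64_t clock_period_ns(uint32_t freq_khz)
{
    assert(freq_khz > 0);
    return (1000000ULL + freq_khz - 1) / freq_khz;
}

/* v5 rule: deviation = sampling interval + largest period among the
 * clocks that were sampled. */
uint64_t max_deviation_ns(uint64_t begin_ns, uint64_t end_ns,
                          uint64_t max_clock_period_ns)
{
    assert(end_ns >= begin_ns);
    uint64_t sample_interval = end_ns - begin_ns;
    return sample_interval + max_clock_period_ns;
}
```

For example, a 27 MHz device clock has a period of 38 ns after rounding up, so sampling over a 500 ns window would report a deviation of 538 ns.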

Signed-off-by: Keith Packard 
---
 src/amd/vulkan/radv_device.c   | 119 +++
 src/amd/vulkan/radv_extensions.py  |   1 +
 src/intel/vulkan/anv_device.c  | 127 +
 src/intel/vulkan/anv_extensions.py |   1 +
 src/intel/vulkan/anv_gem.c |  13 +++
 src/intel/vulkan/anv_gem_stubs.c   |   7 ++
 src/intel/vulkan/anv_private.h |   2 +
 7 files changed, 270 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 174922780fc..4a705a724ef 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -4955,3 +4955,122 @@ radv_GetDeviceGroupPeerMemoryFeatures(
   VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT |
   VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT;
 }
+
+static const VkTimeDomainEXT radv_time_domains[] = {
+   VK_TIME_DOMAIN_DEVICE_EXT,
+   VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT,
+   VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT,
+};
+
+VkResult radv_GetPhysicalDeviceCalibrateableTimeDomainsEXT(
+   VkPhysicalDevice physicalDevice,
+   uint32_t *pTimeDomainCount,
+   VkTimeDomainEXT  *pTimeDomains)
+{
+   int d;
+   VK_OUTARRAY_MAKE(out, pTimeDomains, pTimeDomainCount);
+
+   for (d = 0; d < ARRAY_SIZE(radv_time_domains); d++) {
+   vk_outarray_append(&out, i) {
+   *i = radv_time_domains[d];
+   }
+   }
+
+   return vk_outarray_status(&out);
+}
+
+static uint64_t
+radv_clock_gettime(clockid_t clock_id)
+{
+   struct timespec current;
+   int ret;
+
+   ret = clock_gettime(clock_id, &current);
+   if (ret < 0 && clock_id == CLOCK_MONOTONIC_RAW)
+   ret = clock_gettime(CLOCK_MONOTONIC, &current);
+   if (ret < 0)
+   return 0;
+
+   return (uint64_t) current.tv_sec * 1000000000ULL + current.tv_nsec;
+}
+
+VkResult radv_GetCalibratedTimestampsEXT(
+   VkDevice _device,
+   uint32_t timestampCount,
+   const VkCalibratedTimestampInfoEXT   *pTimestampInfos,
+   uint64_t *pTimestamps,
+   uint64_t *pMaxDeviation)
+{
+   RADV_FROM_HANDLE(radv_device, device, _device);
+   uint32_t clock_crystal_freq = 
device->physical_device->rad_info.clock_crystal_freq;
+   int d;
+   uint64_t begin, end;
+uint64_t max_clock_period = 0;
+
+   begin = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
+
+   for (d = 0; d < timestampCount; d++) {
+   switch (pTimestampInfos[d].timeDomain) {
+   case VK_TIME_DOMAIN_DEVICE_EXT:
+   pTimestamps[d] = device->ws->query_value(device->ws,
+
RADEON_TIMESTAMP);
+uint64_t device_period = DIV_ROUND_UP(1000000, 
clock_crystal_freq);
+max_clock_period = MAX2(max_clock_period, 
device_period);
+   break;
+   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT:
+   pTimestamps[d] = radv_clock_gettime(CLOCK_MONOTONIC);
+max_clock_period = MAX2(max_clock_period, 1);
+   break;
+
+   case VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT:
+   pTimestamps[d] = begin;
+   break;
+   default:
+   pTimestamps[d] = 0;
+   break;
+   }
+   }
+
+   end = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
+
+/*
+ * The maximum deviation is the sum of the interval over which we
+ * perform the sampling and the maximum period of any sampled
+ * clock. That's because the maximum skew between any two sampled
+ * clock edges is when the sampled clock with the largest period is
+ * sampled at the end of that period but right at the beginning of the
+ * sampling interval and some other clock is sampled right at the
+ * begin

Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts

Ilia Mirkin  writes:

> Are you sure? It works fine for me... I'm not against fixing it to be
> "t", but the current contents definitely worked fine for me. (As I
> recall, I may be the one who checked this file in.)

Yes, I’m sure. If you type “true” and then do C-x C-e to evaluate it
then Emacs gives a void-variable error. If I leave it as “true” in the
file then it does indeed indent without tabs. Also if I do C-h v it says
the value is nil, whereas if I change the .dir-locals.el to “t” then the
indentation works properly and the variable help says its value comes
from the .dir-locals.el. I wonder if you have something else in your
setup that is setting it?

I notice that there are some other files with the same problem. It might
be worth fixing them all in one patch.

$ git grep 'indent-tabs-mode *\. *true'
src/gallium/drivers/freedreno/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/r600/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/radeon/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/radeonsi/.dir-locals.el:  (indent-tabs-mode . true)
src/mesa/drivers/dri/nouveau/.dir-locals.el:  (indent-tabs-mode . true)

Regards,
- Neil

Re: [Mesa-dev] [PATCH 2/4] radeonsi: use compute shaders for clear_buffer & copy_buffer

2018-10-17 Thread Marek Olšák

Can you test the attached patch?

Marek

On Wed, Oct 17, 2018 at 9:31 AM Michel Dänzer  wrote:

> On 2018-10-07 9:05 a.m., Marek Olšák wrote:
> > From: Marek Olšák 
> >
> > Fast color clears should be much faster. Also, fast color clears on
> > evicted buffers should be 200x faster on GFX8 and older.
>
> Nice! Unfortunately, this broke clover with radeonsi. Everything using
> OpenCL seems to hang, see e.g. the attached backtraces from clinfo.
>
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>
From f0978b2afae808edf4ac281b14cd371305a5164b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Wed, 17 Oct 2018 12:41:38 -0400
Subject: [PATCH] radeonsi: fix a deadlock due to partially-initialized context
 on CI

---
 src/gallium/drivers/radeonsi/si_pipe.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 59e41c53300..06740bd0f5c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -575,12 +575,6 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
  &sctx->null_const_buf);
 		si_set_rw_buffer(sctx, SI_PS_CONST_SAMPLE_POSITIONS,
  &sctx->null_const_buf);
-
-		/* Clear the NULL constant buffer, because loads should return zeros. */
-		uint32_t clear_value = 0;
-		si_clear_buffer(sctx, sctx->null_const_buf.buffer, 0,
-sctx->null_const_buf.buffer->width0,
-&clear_value, 4, SI_COHERENCY_SHADER);
 	}
 
 	uint64_t max_threads_per_block;
@@ -625,6 +619,14 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
 
 	/* this must be last */
 	si_begin_new_gfx_cs(sctx);
+
+	if (sctx->chip_class == CIK) {
+		/* Clear the NULL constant buffer, because loads should return zeros. */
+		uint32_t clear_value = 0;
+		si_clear_buffer(sctx, sctx->null_const_buf.buffer, 0,
+sctx->null_const_buf.buffer->width0,
+&clear_value, 4, SI_COHERENCY_SHADER);
+	}
 	return &sctx->b;
 fail:
 	fprintf(stderr, "radeonsi: Failed to create a context.\n");
-- 
2.17.1


[Mesa-dev] [PATCH 1/3] mesa/core: Add support for EXT_sRGB_write_control

2018-10-17 Thread Gert Wollny

From: Gert Wollny 

This GLES extension gives the applications the control over deciding whether
the conversion from linear space to sRGB is necessary by enabling or
disabling this conversion at framebuffer write or blending time just
like it is possible for desktop GL.

Signed-off-by: Gert Wollny 
---
 src/mesa/main/enable.c   | 4 ++--
 src/mesa/main/extensions_table.h | 1 +
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 1 +
 src/mesa/main/mtypes.h   | 1 +
 5 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index bd3e493da5..06c5a0eb68 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -1125,7 +1125,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
GLboolean state)
 
   /* GL3.0 - GL_framebuffer_sRGB */
   case GL_FRAMEBUFFER_SRGB_EXT:
- if (!_mesa_is_desktop_gl(ctx))
+ if (!_mesa_is_desktop_gl(ctx) && 
!ctx->Extensions.EXT_sRGB_write_control)
 goto invalid_enum_error;
  CHECK_EXTENSION(EXT_framebuffer_sRGB, cap);
  _mesa_set_framebuffer_srgb(ctx, state);
@@ -1765,7 +1765,7 @@ _mesa_IsEnabled( GLenum cap )
 
   /* GL3.0 - GL_framebuffer_sRGB */
   case GL_FRAMEBUFFER_SRGB_EXT:
- if (!_mesa_is_desktop_gl(ctx))
+ if (!_mesa_is_desktop_gl(ctx) && 
!ctx->Extensions.EXT_sRGB_write_control)
 goto invalid_enum_error;
  CHECK_EXTENSION(EXT_framebuffer_sRGB);
  return ctx->Color.sRGBEnabled;
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 09bf923bd0..1185156f23 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -265,6 +265,7 @@ EXT(EXT_shader_integer_mix  , 
EXT_shader_integer_mix
 EXT(EXT_shader_io_blocks, dummy_true   
  ,  x ,  x ,  x ,  31, 2014)
 EXT(EXT_shader_samples_identical, EXT_shader_samples_identical 
  , GLL, GLC,  x ,  31, 2015)
 EXT(EXT_shadow_funcs, ARB_shadow   
  , GLL,  x ,  x ,  x , 2002)
+EXT(EXT_sRGB_write_control  , EXT_sRGB_write_control   
  , GLL,  x ,  x ,  30, 2013)
 EXT(EXT_stencil_two_side, EXT_stencil_two_side 
  , GLL,  x ,  x ,  x , 2001)
 EXT(EXT_stencil_wrap, dummy_true   
  , GLL,  x ,  x ,  x , 2002)
 EXT(EXT_subtexture  , dummy_true   
  , GLL,  x ,  x ,  x , 1995)
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 1b1679e8bf..fd9d3885f5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -394,6 +394,12 @@ static const int extra_ARB_compute_shader_es31[] = {
EXTRA_END
 };
 
+static const int extra_EXT_sRGB_write_control_es30[] = {
+   EXT(EXT_sRGB_write_control),
+   EXTRA_API_ES3,
+   EXTRA_END
+};
+
 static const int extra_ARB_shader_storage_buffer_object_es31[] = {
EXT(ARB_shader_storage_buffer_object),
EXTRA_API_ES31,
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 1840db6ebb..822fab8151 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -262,6 +262,7 @@ descriptor=[
 # Enums in GLES2, GLES3
 { "apis": ["GLES2", "GLES3"], "params": [
   [ "GPU_DISJOINT_EXT", "LOC_CUSTOM, TYPE_INT, 0, 
extra_EXT_disjoint_timer_query" ],
+  [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
extra_EXT_sRGB_write_control_es30" ],
 ]},
 
 { "apis": ["GL", "GL_CORE", "GLES2"], "params": [
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 9ed49b7ff2..31cf62fdb6 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4253,6 +4253,7 @@ struct gl_extensions
GLboolean EXT_semaphore_fd;
GLboolean EXT_shader_integer_mix;
GLboolean EXT_shader_samples_identical;
+   GLboolean EXT_sRGB_write_control;
GLboolean EXT_stencil_two_side;
GLboolean EXT_texture_array;
GLboolean EXT_texture_compression_latc;
-- 
2.18.1


[Mesa-dev] [PATCH 3/3] intel/i965: Enable extension EXT_sRGB_write_control

2018-10-17 Thread Gert Wollny

From: Gert Wollny 

Enables and passes on i965:
  dEQP-GLES31.functional.fbo.srgb_write_control.framebuffer_srgb_enabled*

Signed-off-by: Gert Wollny 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index d7e02efb54..ca921de8e8 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -76,6 +76,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ARB_shading_language_packing = true;
ctx->Extensions.ARB_shadow = true;
ctx->Extensions.ARB_sync = true;
+   ctx->Extensions.EXT_sRGB_write_control = true;
ctx->Extensions.ARB_texture_border_clamp = true;
ctx->Extensions.ARB_texture_compression_rgtc = true;
ctx->Extensions.ARB_texture_cube_map = true;
-- 
2.18.1


[Mesa-dev] [PATCH 2/3] mesa/st: enable EXT_sRGB_write_control for drivers that support it

2018-10-17 Thread Gert Wollny

From: Gert Wollny 

With this patch the extension EXT_sRGB_write_control is enabled for
gallium drivers that support sRGB formats as render targets.

Tested (and pass) on r600(evergreen) and softpipe:

  dEQP-GLES31.functional.fbo.srgb_write_control.framebuffer_srgb_enabled*

with "MESA_GLES_VERSION_OVERRIDE=3.2" (the tests needlessly check for this)

Signed-off-by: Gert Wollny 
---
 src/mesa/state_tracker/st_manager.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_manager.c 
b/src/mesa/state_tracker/st_manager.c
index ceb48dd490..562b12a1ef 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -457,14 +457,12 @@ st_framebuffer_create(struct st_context *st,
 * format such that util_format_srgb(visual->color_format) can be supported
 * by the pipe driver.  We still need to advertise the capability here.
 *
-* For GLES, however, sRGB framebuffer write is controlled only by the
-* capability of the framebuffer.  There is GL_EXT_sRGB_write_control to
-* give applications the control back, but sRGB write is still enabled by
-* default.  To avoid unexpected results, we should not advertise the
-* capability.  This could change when we add support for
-* EGL_KHR_gl_colorspace.
+* For GLES, however, sRGB framebuffer write is initially only controlled
+* by the capability of the framebuffer, but with GL_EXT_sRGB_write_control
+* control is given back to the applications. Similar to desktop GL,
+* support for this extension depends on EXT_framebuffer_sRGB.
 */
-   if (_mesa_is_desktop_gl(st->ctx)) {
+   {
   struct pipe_screen *screen = st->pipe->screen;
   const enum pipe_format srgb_format =
  util_format_srgb(stfbi->visual->color_format);
@@ -475,8 +473,11 @@ st_framebuffer_create(struct st_context *st,
   PIPE_TEXTURE_2D, stfbi->visual->samples,
   stfbi->visual->samples,
   (PIPE_BIND_DISPLAY_TARGET |
-   PIPE_BIND_RENDER_TARGET)))
+   PIPE_BIND_RENDER_TARGET))) {
  mode.sRGBCapable = GL_TRUE;
+ /* Exposing this as extension is only needed on GLES */
+ st->ctx->Extensions.EXT_sRGB_write_control = 
!_mesa_is_desktop_gl(st->ctx);
+  }
}
 
_mesa_initialize_window_framebuffer(&stfb->Base, &mode);
-- 
2.18.1


[Mesa-dev] [PATCH 0/3] Add and enable extension EXT_sRGB_write_control

2018-10-17 Thread Gert Wollny

From: Gert Wollny 

Dear all, 

this series adds the basic plumbing for EXT_sRGB_write_control and enables it
for some drivers. 
Since this is the first time I have added an extension, I'd ask reviewers to
take a close look at the first patch.

One thing I left out therefore, is to enable this extension already for 
GLES 2.0 + EXT_sRGB, because I was not sure how to deal with the different 
dependencies in the tables in src/mesa/main/get_hash_params.py and 
src/mesa/main/extensions_table.h, so if someone can point me in the right 
direction there, I'll happily add this.  

many thanks for any review, 
Gert


Gert Wollny (3):
  mesa/core: Add support for EXT_sRGB_write_control
  mesa/st: enable EXT_sRGB_write_control for drivers that support it
  i965: Enable extension EXT_sRGB_write_control

 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 src/mesa/main/enable.c   |  4 ++--
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py |  1 +
 src/mesa/main/mtypes.h   |  1 +
 src/mesa/state_tracker/st_manager.c  | 17 +++++++++--------
 7 files changed, 21 insertions(+), 10 deletions(-)

-- 
2.18.1


Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Ilia Mirkin

Are you sure? It works fine for me... I'm not against fixing it to be
"t", but the current contents definitely worked fine for me. (As I
recall, I may be the one who checked this file in.)
On Wed, Oct 17, 2018 at 11:38 AM Neil Roberts  wrote:
>
> The .dir-locals.el had the wrong name for the truthy value so it
> wasn’t setting indent-tabs-mode.
> ---
>  src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
> b/src/gallium/drivers/freedreno/.dir-locals.el
> index aa20d495465..b0e90fcbd53 100644
> --- a/src/gallium/drivers/freedreno/.dir-locals.el
> +++ b/src/gallium/drivers/freedreno/.dir-locals.el
> @@ -1,5 +1,5 @@
>  ((prog-mode
> -  (indent-tabs-mode . true)
> +  (indent-tabs-mode . t)
>(tab-width . 4)
>(c-basic-offset . 4)
>(c-file-style . "k&r")
> --
> 2.17.1
>

[Mesa-dev] [PATCH 1/2] radeonsi: fix a VGT hang with primitive restart on Polaris10 and later

2018-10-17 Thread Marek Olšák

From: Marek Olšák 

Cc: 18.1 18.2 
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 83eb646b791..612ca910cb9 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -376,21 +376,21 @@ si_get_init_multi_vgt_param(struct si_screen *sscreen,
}
 
if (sscreen->info.chip_class >= CIK) {
/* WD_SWITCH_ON_EOP has no effect on GPUs with less than
 * 4 shader engines. Set 1 to pass the assertion below.
 * The other cases are hardware requirements.
 *
 * Polaris supports primitive restart with WD_SWITCH_ON_EOP=0
 * for points, line strips, and tri strips.
 */
-   if (sscreen->info.max_se < 4 ||
+   if (sscreen->info.max_se <= 2 ||
key->u.prim == PIPE_PRIM_POLYGON ||
key->u.prim == PIPE_PRIM_LINE_LOOP ||
key->u.prim == PIPE_PRIM_TRIANGLE_FAN ||
key->u.prim == PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY ||
(key->u.primitive_restart &&
 (sscreen->info.family < CHIP_POLARIS10 ||
  (key->u.prim != PIPE_PRIM_POINTS &&
   key->u.prim != PIPE_PRIM_LINE_STRIP &&
   key->u.prim != PIPE_PRIM_TRIANGLE_STRIP))) ||
key->u.count_from_stream_output)
@@ -407,35 +407,41 @@ si_get_init_multi_vgt_param(struct si_screen *sscreen,
 * instances are smaller than a primgroup.
 * Assume indirect draws always use small instances.
 * This is needed for good VS wave utilization.
 */
if (sscreen->info.chip_class <= VI &&
sscreen->info.max_se == 4 &&
key->u.multi_instances_smaller_than_primgroup)
wd_switch_on_eop = true;
 
/* Required on CIK and later. */
-   if (sscreen->info.max_se > 2 && !wd_switch_on_eop)
+   if (sscreen->info.max_se == 4 && !wd_switch_on_eop)
ia_switch_on_eoi = true;
 
/* Required by Hawaii and, for some special cases, by VI. */
if (ia_switch_on_eoi &&
(sscreen->info.family == CHIP_HAWAII ||
 (sscreen->info.chip_class == VI &&
  (key->u.uses_gs || max_primgroup_in_wave != 2))))
partial_vs_wave = true;
 
/* Instancing bug on Bonaire. */
if (sscreen->info.family == CHIP_BONAIRE && ia_switch_on_eoi &&
key->u.uses_instancing)
partial_vs_wave = true;
 
+   /* This only applies to Polaris10 and later 4 SE chips.
+* wd_switch_on_eop is already true on all other chips.
+*/
+   if (!wd_switch_on_eop && key->u.primitive_restart)
+   partial_vs_wave = true;
+
/* If the WD switch is false, the IA switch must be false too. 
*/
assert(wd_switch_on_eop || !ia_switch_on_eop);
}
 
/* If SWITCH_ON_EOI is set, PARTIAL_ES_WAVE must be set too. */
if (sscreen->info.chip_class <= VI && ia_switch_on_eoi)
partial_es_wave = true;
 
return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
S_028AA8_SWITCH_ON_EOI(ia_switch_on_eoi) |
-- 
2.17.1


[Mesa-dev] [PATCH 2/2] radeonsi: clamp point size to the limit

2018-10-17 Thread Marek Olšák

From: Marek Olšák 

This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by: ea039f789d9b54e1bd1d644b6a29863ca3500314
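
As a rough illustration of the clamp, assuming it is applied to the requested size before it reaches the hardware state: only SI_MAX_POINT_SIZE and its value come from this patch, while clamp_point_size and the min_size parameter are hypothetical names used for the sketch.

```c
#include <assert.h>

/* Limit taken from the patch; everything else here is illustrative. */
#define SI_MAX_POINT_SIZE 2048

/* Clamp a requested point size into [min_size, SI_MAX_POINT_SIZE]. */
float clamp_point_size(float requested, float min_size)
{
    assert(min_size <= (float)SI_MAX_POINT_SIZE);
    if (requested < min_size)
        return min_size;
    if (requested > (float)SI_MAX_POINT_SIZE)
        return (float)SI_MAX_POINT_SIZE;
    return requested;
}
```

A request of 4096 would come back as 2048, which matches the value the driver now reports for PIPE_CAPF_MAX_POINT_WIDTH.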
---
 src/gallium/drivers/radeonsi/si_get.c   | 5 +++--
 src/gallium/drivers/radeonsi/si_pipe.h  | 1 +
 src/gallium/drivers/radeonsi/si_state.c | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index ac302b8a946..804276b3eda 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -326,25 +326,26 @@ static int si_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
default:
return u_pipe_screen_get_param_defaults(pscreen, param);
}
 }
 
 static float si_get_paramf(struct pipe_screen* pscreen, enum pipe_capf param)
 {
switch (param) {
case PIPE_CAPF_MAX_LINE_WIDTH:
case PIPE_CAPF_MAX_LINE_WIDTH_AA:
-   case PIPE_CAPF_MAX_POINT_WIDTH:
-   case PIPE_CAPF_MAX_POINT_WIDTH_AA:
/* This depends on the quant mode, though the precise 
interactions
 * are unknown. */
return 2048;
+   case PIPE_CAPF_MAX_POINT_WIDTH:
+   case PIPE_CAPF_MAX_POINT_WIDTH_AA:
+   return SI_MAX_POINT_SIZE;
case PIPE_CAPF_MAX_TEXTURE_ANISOTROPY:
return 16.0f;
case PIPE_CAPF_MAX_TEXTURE_LOD_BIAS:
return 16.0f;
case PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE:
case PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE:
case PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY:
return 0.0f;
}
return 0.0f;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 6edc06cece7..dc95afb7421 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -41,20 +41,21 @@
 #define ATI_VENDOR_ID  0x1002
 
 #define SI_NOT_QUERY   0x
 
 /* The base vertex and primitive restart can be any number, but we must pick
  * one which will mean "unknown" for the purpose of state tracking and
  * the number shouldn't be a commonly-used one. */
 #define SI_BASE_VERTEX_UNKNOWN INT_MIN
 #define SI_RESTART_INDEX_UNKNOWN   INT_MIN
 #define SI_NUM_SMOOTH_AA_SAMPLES   8
+#define SI_MAX_POINT_SIZE  2048
 #define SI_GS_PER_ES   128
 /* Alignment for optimal CP DMA performance. */
 #define SI_CPDMA_ALIGNMENT 32
 
 /* Tunables for compute-based clear_buffer and copy_buffer: */
 #define SI_COMPUTE_CLEAR_DW_PER_THREAD 4
 #define SI_COMPUTE_COPY_DW_PER_THREAD  4
 #define SI_COMPUTE_DST_CACHE_POLICY L2_STREAM
 
 /* Pipeline & streamout query controls. */
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 8b2e6e57f45..176ec749148 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -891,21 +891,21 @@ static void *si_create_rs_state(struct pipe_context *ctx,
S_0286D4_PNT_SPRITE_OVRD_Z(V_0286D4_SPI_PNT_SPRITE_SEL_0) |
S_0286D4_PNT_SPRITE_OVRD_W(V_0286D4_SPI_PNT_SPRITE_SEL_1) |
S_0286D4_PNT_SPRITE_TOP_1(state->sprite_coord_mode != 
PIPE_SPRITE_COORD_UPPER_LEFT));
 
/* point size 12.4 fixed point */
tmp = (unsigned)(state->point_size * 8.0);
si_pm4_set_reg(pm4, R_028A00_PA_SU_POINT_SIZE, S_028A00_HEIGHT(tmp) | 
S_028A00_WIDTH(tmp));
 
if (state->point_size_per_vertex) {
psize_min = util_get_min_point_size(state);
-   psize_max = 8192;
+   psize_max = SI_MAX_POINT_SIZE;
} else {
/* Force the point size to be as if the vertex output was 
disabled. */
psize_min = state->point_size;
psize_max = state->point_size;
}
rs->max_point_size = psize_max;
 
/* Divide by two, because 0.5 = 1 pixel. */
si_pm4_set_reg(pm4, R_028A04_PA_SU_POINT_MINMAX,
S_028A04_MIN_SIZE(si_pack_float_12p4(psize_min/2)) |
-- 
2.17.1
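
For illustration only, a minimal sketch of the clamping idea in this patch:
the per-vertex point size is limited to the same maximum that the driver
reports through PIPE_CAPF_MAX_POINT_WIDTH. The names clamp_point_size,
pack_12p4 and MAX_POINT_SIZE are hypothetical and are not radeonsi code; the
packer is an assumed 12.4 fixed-point conversion and may differ from the
driver's si_pack_float_12p4.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the value of SI_MAX_POINT_SIZE added by the patch. */
#define MAX_POINT_SIZE 2048.0f

/* Clamp a requested point size into [min_size, MAX_POINT_SIZE]. */
static float clamp_point_size(float size, float min_size)
{
   assert(min_size <= MAX_POINT_SIZE);
   if (size < min_size)
      return min_size;
   if (size > MAX_POINT_SIZE)
      return MAX_POINT_SIZE;
   return size;
}

/* Hypothetical 12.4 fixed-point packer: 12 integer bits, 4 fraction bits.
 * Values at or above 4096 saturate to the largest encodable value. */
static uint32_t pack_12p4(float x)
{
   if (x <= 0.0f)
      return 0;
   if (x >= 4096.0f)
      return 0xffff;
   return (uint32_t)(x * 16.0f);
}
```

Since the hardware register takes half sizes (0.5 = 1 pixel), a caller would
pack clamp_point_size(size, min) / 2, in the same way the patch context packs
psize_min/2.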


Re: [Mesa-dev] [PATCH] intel/tools: Remove hardcoded PADDING_SIZE from sanitizer

2018-10-17 Thread Rafael Antognolli

On Wed, Oct 17, 2018 at 06:08:34PM +0300, Danylo Piliaiev wrote:
> Signed-off-by: Danylo Piliaiev 
> ---
>  src/intel/tools/intel_sanitize_gpu.c | 38 +++-
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/src/intel/tools/intel_sanitize_gpu.c 
> b/src/intel/tools/intel_sanitize_gpu.c
> index 9b49b0bbf2..36c4725a2f 100644
> --- a/src/intel/tools/intel_sanitize_gpu.c
> +++ b/src/intel/tools/intel_sanitize_gpu.c
> @@ -51,14 +51,6 @@ static int (*libc_fcntl)(int fd, int cmd, int param);
>  
>  #define DRM_MAJOR 226
>  
> -/* TODO: we want to make sure that the padding forces
> - * the BO to take another page on the (PP)GTT; 4KB
> - * may or may not be the page size for the BO. Indeed,
> - * depending on GPU, kernel version and GEM size, the
> - * page size can be one of 4KB, 64KB or 2M.
> - */
> -#define PADDING_SIZE 4096
> -
>  struct refcnt_hash_table {
> struct hash_table *t;
> int refcnt;
> @@ -80,6 +72,8 @@ pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
>  
>  static struct hash_table *fds_to_bo_sizes = NULL;
>  
> +static long padding_size = 0;
> +
>  static inline struct hash_table*
>  bo_size_table(int fd)
>  {
> @@ -166,7 +160,7 @@ padding_is_good(int fd, uint32_t handle)
> struct drm_i915_gem_mmap mmap_arg = {
>.handle = handle,
>.offset = bo_size(fd, handle),
> -  .size = PADDING_SIZE,
> +  .size = padding_size,
>.flags = 0,
> };
>  
> @@ -189,17 +183,17 @@ padding_is_good(int fd, uint32_t handle)
>  * if the bo is not cache coherent we likely need to
>  * invalidate the cache lines to get it.
>  */
> -   gen_invalidate_range(mapped, PADDING_SIZE);
> +   gen_invalidate_range(mapped, padding_size);
>  
> expected_value = handle & 0xFF;
> -   for (uint32_t i = 0; i < PADDING_SIZE; ++i) {
> +   for (uint32_t i = 0; i < padding_size; ++i) {
>if (expected_value != mapped[i]) {
> - munmap(mapped, PADDING_SIZE);
> + munmap(mapped, padding_size);
>   return false;
>}
>expected_value = next_noise_value(expected_value);
> }
> -   munmap(mapped, PADDING_SIZE);
> +   munmap(mapped, padding_size);
>  
> return true;
>  }
> @@ -207,9 +201,9 @@ padding_is_good(int fd, uint32_t handle)
>  static int
>  create_with_padding(int fd, struct drm_i915_gem_create *create)
>  {
> -   create->size += PADDING_SIZE;
> +   create->size += padding_size;
> int ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, create);
> -   create->size -= PADDING_SIZE;
> +   create->size -= padding_size;
>  
> if (ret != 0)
>return ret;
> @@ -218,7 +212,7 @@ create_with_padding(int fd, struct drm_i915_gem_create 
> *create)
> struct drm_i915_gem_mmap mmap_arg = {
>.handle = create->handle,
>.offset = create->size,
> -  .size = PADDING_SIZE,
> +  .size = padding_size,
>.flags = 0,
> };
>  
> @@ -228,8 +222,8 @@ create_with_padding(int fd, struct drm_i915_gem_create 
> *create)
>  
> noise_values = (uint8_t*) (uintptr_t) mmap_arg.addr_ptr;
> fill_noise_buffer(noise_values, create->handle & 0xFF,
> - PADDING_SIZE);
> -   munmap(noise_values, PADDING_SIZE);
> + padding_size);
> +   munmap(noise_values, padding_size);
>  
> _mesa_hash_table_insert(bo_size_table(fd), 
> (void*)(uintptr_t)create->handle,
> (void*)(uintptr_t)create->size);
> @@ -427,4 +421,12 @@ init(void)
> libc_close = dlsym(RTLD_NEXT, "close");
> libc_fcntl = dlsym(RTLD_NEXT, "fcntl");
> libc_ioctl = dlsym(RTLD_NEXT, "ioctl");
> +
> +   /* We want to make sure that the padding forces
> +* the BO to take another page on the (PP)GTT.
> +*/
> +   padding_size = sysconf(_SC_PAGESIZE);

I don't think this is the page size we want. This is the page size of
CPU/system memory, which might be different from what the GPU is using
to map pages. For instance, even if we are using 64K pages for GPU
mapping, I think this call would still return 4K.

Though I'm not sure if there's an interface to query the kernel which
page size we are using for the GPU...
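
To make the concern concrete, here is a minimal sketch, assuming the GPU page
size has to be supplied by the caller because no query interface is known
here. The helper names align_up and choose_padding_size are hypothetical and
not part of intel_sanitize_gpu.c; the sizes 4KB, 64KB and 2M come from the
TODO comment the patch removes.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Round value up to a multiple of alignment (must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Pick a padding size that covers a whole GPU page.
 * gpu_page_size is an assumption passed in by the caller (e.g. 4096,
 * 65536 or 2097152); pass 0 to fall back to the CPU page size only. */
static uint64_t choose_padding_size(uint64_t gpu_page_size)
{
   long cpu_page = sysconf(_SC_PAGESIZE); /* CPU/system page size */
   if (cpu_page <= 0)
      cpu_page = 4096; /* conservative fallback if sysconf fails */

   if (gpu_page_size == 0)
      return (uint64_t)cpu_page;

   /* With power-of-two sizes this yields the larger of the two. */
   return align_up((uint64_t)cpu_page, gpu_page_size);
}
```

With a 4K CPU page and an assumed 64K GPU page, choose_padding_size(65536)
yields 65536, whereas sysconf(_SC_PAGESIZE) alone would give 4096.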

> +   if (padding_size == -1) {
> +  unreachable("Bad page size");
> +   }
>  }
> -- 
> 2.18.0
> 

[Mesa-dev] [PATCH 2/2] freedreno: Remove the Emacs mode lines

2018-10-17 Thread Neil Roberts

These are not necessary because the corresponding settings are set via
the .dir-locals.el file anyway. Most of them were missing a ‘:’ after
“tab-width” which was making Emacs display an annoying warning
whenever you open the file.

This patch was made with:

sed -ri '/-\*- mode:/,/^$/d' \
$(find src/gallium/{drivers,winsys} -name \*.\[ch\] \
   -exec grep -l -- '-\*- mode:' {} \+)
---
 src/gallium/drivers/freedreno/a2xx/fd2_blend.c  | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_blend.h  | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_draw.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_draw.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_emit.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_emit.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_util.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_util.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_blend.c  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_blend.h  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_draw.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_draw.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_emit.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_query.c  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_query.h  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_blend.c  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_blend.h  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_draw.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_draw.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_emit.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_query.c  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_query.h  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 --
 src/gallium/drivers/freedreno/freedreno_context.c   | 2 --
 src/gallium/drivers/freedreno/freedreno_context.h

[Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts

The .dir-locals.el had the wrong name for the truthy value so it
wasn’t setting indent-tabs-mode.
---
 src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
b/src/gallium/drivers/freedreno/.dir-locals.el
index aa20d495465..b0e90fcbd53 100644
--- a/src/gallium/drivers/freedreno/.dir-locals.el
+++ b/src/gallium/drivers/freedreno/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 4)
   (c-basic-offset . 4)
   (c-file-style . "k&r")
-- 
2.17.1


Re: [Mesa-dev] [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v4]

2018-10-17 Thread Jason Ekstrand

On Wed, Oct 17, 2018 at 12:14 AM Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> > Doing all of the CPU sampling on one side or the other of the GPU sampling
> > would probably reduce our window.
>
> True, although as I said, it's taking several µs to get through the
> loop, and the gpu clock tick is far smaller than that, so even adding
> the two values together to make it fit the current implementation won't
> make the deviation that much larger.
>
> > This leaves us with a delta of I + max(P(M), P(R), P(G)).  In
> > particular, any two real-number valued times are, instantaneously,
> > within that interval.
>
> That, at least, would be easy to compute, and scale nicely if we added
> more clocks in the future.
>
> > Personally, I'm completely content to have the delta just be the first
> > one: a bound on the difference between any two real-valued times.  At this
> > point, I can guarantee you that far more thought has been put into this
> > mesa-dev discussion than was put into the spec and I think we're rapidly
> > getting to the point of diminishing returns. :-)
>
> It seems likely. How about we do the above computation for the current
> code and leave it at that?
>

Sounds like a plan.  Note that I should be computed as I = end - start +
monotonic_raw_tick_ns to ensure we get a big enough interval.  Given that
monotonic_raw_tick_ns is likely 1, this doesn't expand things much.
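
As a rough sketch of that computation (not the code from the patch), assuming
start and end are CLOCK_MONOTONIC_RAW samples in nanoseconds taken before and
after the other clocks are read, and that periods_ns holds the tick period of
each sampled clock, i.e. P(M), P(R) and P(G). The function name
max_deviation_ns and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bound on the difference between any two sampled clock values:
 *    I + max(P(M), P(R), P(G))
 * with I = end - start + raw_tick_ns. */
static uint64_t max_deviation_ns(uint64_t start, uint64_t end,
                                 uint64_t raw_tick_ns,
                                 const uint64_t *periods_ns, unsigned count)
{
   assert(end >= start);

   /* Widen the sampling interval by one tick of the bracketing clock. */
   uint64_t interval = end - start + raw_tick_ns;

   /* Largest tick period among the sampled clocks. */
   uint64_t max_period = 0;
   for (unsigned i = 0; i < count; i++) {
      if (periods_ns[i] > max_period)
         max_period = periods_ns[i];
   }

   return interval + max_period;
}
```

Taking the maximum over an array of periods keeps the bound easy to extend if
more clock domains are added later.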

I think a comment is likely also in order.  Probably not containing the
entire e-mail thread but maybe some of my reasoning above?

--Jason

Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts

Eric Engestrom  writes:

> You might want to remove these instead, and use the .editorconfig [1]
> already present at src/gallium/drivers/freedreno/.editorconfig This is
> much easier to maintain than per-files settings ;)

Either fixing it or removing it is fine by me. I now notice there is a
.dir-locals.el file that should make it work anyway. (apparently I was
the last person to touch it too!) It has a typo which makes it fail to
set indent-tabs-mode though. I can make everything work locally either
way, I just wanted to get rid of the annoying warning whenever you open
a file.

- Neil

Re: [Mesa-dev] [PATCH 0/7] EGLDevice, take 2.1

2018-10-17 Thread Emil Velikov

On Wed, 3 Oct 2018 at 15:08, Emil Velikov  wrote:
>
> Hi all,
>
> This re-spin of the series includes:
>  - correct flipped asserts
>  - cosmetic wording/comment fixes
>  - drop EGL_EXT_platform_device patches (swrast is broken)
>  - add the EGL_MESA_device_software spec patch
>
> At this point we should be pretty much set, so any formal Ack/Rb will
> be appreciated.
>
> Thanks
> Emil
>
> Cc: Adam Jackson 
> Cc: Eric Engestrom 
> Cc: Mathias Fröhlich 
>
> Adam Jackson (1):
>   specs: Add EGL_MESA_device_software
>
> Emil Velikov (6):
>   egl: add base EGL_EXT_device_base implementation
>   egl: add EGL_MESA_device_software support
>   egl: add EGL_EXT_device_drm support
>   egl: set the EGLDevice when creating a display
>   egl: enable EGL_EXT_device_{base,enumeration,query}

Humble ping?

-Emil

[Mesa-dev] [PATCH] intel/tools: Remove hardcoded PADDING_SIZE from sanitizer

2018-10-17 Thread Danylo Piliaiev

Signed-off-by: Danylo Piliaiev 
---
 src/intel/tools/intel_sanitize_gpu.c | 38 +++-
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/src/intel/tools/intel_sanitize_gpu.c 
b/src/intel/tools/intel_sanitize_gpu.c
index 9b49b0bbf2..36c4725a2f 100644
--- a/src/intel/tools/intel_sanitize_gpu.c
+++ b/src/intel/tools/intel_sanitize_gpu.c
@@ -51,14 +51,6 @@ static int (*libc_fcntl)(int fd, int cmd, int param);
 
 #define DRM_MAJOR 226
 
-/* TODO: we want to make sure that the padding forces
- * the BO to take another page on the (PP)GTT; 4KB
- * may or may not be the page size for the BO. Indeed,
- * depending on GPU, kernel version and GEM size, the
- * page size can be one of 4KB, 64KB or 2M.
- */
-#define PADDING_SIZE 4096
-
 struct refcnt_hash_table {
struct hash_table *t;
int refcnt;
@@ -80,6 +72,8 @@ pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
 
 static struct hash_table *fds_to_bo_sizes = NULL;
 
+static long padding_size = 0;
+
 static inline struct hash_table*
 bo_size_table(int fd)
 {
@@ -166,7 +160,7 @@ padding_is_good(int fd, uint32_t handle)
struct drm_i915_gem_mmap mmap_arg = {
   .handle = handle,
   .offset = bo_size(fd, handle),
-  .size = PADDING_SIZE,
+  .size = padding_size,
   .flags = 0,
};
 
@@ -189,17 +183,17 @@ padding_is_good(int fd, uint32_t handle)
 * if the bo is not cache coherent we likely need to
 * invalidate the cache lines to get it.
 */
-   gen_invalidate_range(mapped, PADDING_SIZE);
+   gen_invalidate_range(mapped, padding_size);
 
expected_value = handle & 0xFF;
-   for (uint32_t i = 0; i < PADDING_SIZE; ++i) {
+   for (uint32_t i = 0; i < padding_size; ++i) {
   if (expected_value != mapped[i]) {
- munmap(mapped, PADDING_SIZE);
+ munmap(mapped, padding_size);
  return false;
   }
   expected_value = next_noise_value(expected_value);
}
-   munmap(mapped, PADDING_SIZE);
+   munmap(mapped, padding_size);
 
return true;
 }
@@ -207,9 +201,9 @@ padding_is_good(int fd, uint32_t handle)
 static int
 create_with_padding(int fd, struct drm_i915_gem_create *create)
 {
-   create->size += PADDING_SIZE;
+   create->size += padding_size;
int ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, create);
-   create->size -= PADDING_SIZE;
+   create->size -= padding_size;
 
if (ret != 0)
   return ret;
@@ -218,7 +212,7 @@ create_with_padding(int fd, struct drm_i915_gem_create 
*create)
struct drm_i915_gem_mmap mmap_arg = {
   .handle = create->handle,
   .offset = create->size,
-  .size = PADDING_SIZE,
+  .size = padding_size,
   .flags = 0,
};
 
@@ -228,8 +222,8 @@ create_with_padding(int fd, struct drm_i915_gem_create 
*create)
 
noise_values = (uint8_t*) (uintptr_t) mmap_arg.addr_ptr;
fill_noise_buffer(noise_values, create->handle & 0xFF,
- PADDING_SIZE);
-   munmap(noise_values, PADDING_SIZE);
+ padding_size);
+   munmap(noise_values, padding_size);
 
_mesa_hash_table_insert(bo_size_table(fd), (void*)(uintptr_t)create->handle,
(void*)(uintptr_t)create->size);
@@ -427,4 +421,12 @@ init(void)
libc_close = dlsym(RTLD_NEXT, "close");
libc_fcntl = dlsym(RTLD_NEXT, "fcntl");
libc_ioctl = dlsym(RTLD_NEXT, "ioctl");
+
+   /* We want to make sure that the padding forces
+* the BO to take another page on the (PP)GTT.
+*/
+   padding_size = sysconf(_SC_PAGESIZE);
+   if (padding_size == -1) {
+  unreachable("Bad page size");
+   }
 }
-- 
2.18.0


Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Eric Engestrom

On Wednesday, 2018-10-17 15:48:41 +0200, Neil Roberts wrote:
> The modeline was missing a ‘:’ after the tab-width and Emacs was
> complaining every time you open a file. This patch was made with:
> 
> sed -ri '1 s/; tab-width ([0-9])/; tab-width: \1/' \
> $(find -name \*.\[ch\] -exec grep -l -- '-\*- mode:' {} \+)
> ---
[snip]
> 
> diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_blend.c 
> b/src/gallium/drivers/freedreno/a2xx/fd2_blend.c
> index 4e991794f07..48bd395b594 100644
> --- a/src/gallium/drivers/freedreno/a2xx/fd2_blend.c
> +++ b/src/gallium/drivers/freedreno/a2xx/fd2_blend.c
> @@ -1,4 +1,4 @@
> -/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
> +/* -*- mode: C; c-file-style: "k&r"; tab-width: 4; indent-tabs-mode: t; -*- */

You might want to remove these instead, and use the .editorconfig [1]
already present at src/gallium/drivers/freedreno/.editorconfig
This is much easier to maintain than per-files settings ;)

The website [1] has a link for a plugin for Emacs since it appears to
lack native support, but if you're ok with installing a plugin, this
should be a good solution for you :)

[1] https://editorconfig.org

[Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts

The modeline was missing a ‘:’ after the tab-width and Emacs was
complaining every time you open a file. This patch was made with:

sed -ri '1 s/; tab-width ([0-9])/; tab-width: \1/' \
$(find -name \*.\[ch\] -exec grep -l -- '-\*- mode:' {} \+)
---
 src/gallium/drivers/freedreno/a2xx/fd2_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_util.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_util.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_query.c  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_query.h  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_query.c  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_query.h  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 +-
 src/gallium/drivers/freedreno/freedreno_context.c   | 2 +-
 src/gallium/drivers/freedreno/freedreno_context.h   | 2 +-
 src/gallium/drivers/freedreno/freedreno_draw.c  | 2 +-
 src/gallium/drivers/freedreno/freedreno_draw.h  | 2 +-
 src/galli

Re: [Mesa-dev] [PATCH 2/4] radeonsi: use compute shaders for clear_buffer & copy_buffer

2018-10-17 Thread Michel Dänzer

On 2018-10-07 9:05 a.m., Marek Olšák wrote:
> From: Marek Olšák 
> 
> Fast color clears should be much faster. Also, fast color clears on
> evicted buffers should be 200x faster on GFX8 and older.

Nice! Unfortunately, this broke clover with radeonsi. Everything using
OpenCL seems to hang, see e.g. the attached backtraces from clinfo.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
(gdb) info threads 
  Id   Target Id Frame 
* 1Thread 0x7f63ecdb2740 (LWP 24202) "clinfo" syscall () at 
../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  2Thread 0x7f63e62bc700 (LWP 24203) "clinfo:rcs0" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915203af0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  3Thread 0x7f63e5934700 (LWP 24204) "clinfo:disk$0" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915204768) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  4Thread 0x7f63e510a700 (LWP 24205) "clinfo:cs0" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915214aa0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  5Thread 0x7f63d7fff700 (LWP 24206) "clinfo:disk$0" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e9152185a8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  6Thread 0x7f63d77fe700 (LWP 24207) "clinfo:sh0" __lll_lock_wait () at 
../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  7Thread 0x7f63d6ffd700 (LWP 24208) "clinfo:sh1" __lll_lock_wait () at 
../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  8Thread 0x7f63c700 (LWP 24209) "clinfo:sh2" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  9Thread 0x7f63d67fc700 (LWP 24210) "clinfo:sh3" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  10   Thread 0x7f63d5ffb700 (LWP 24211) "clinfo:sh4" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  11   Thread 0x7f63d57fa700 (LWP 24212) "clinfo:sh5" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  12   Thread 0x7f63d4ff9700 (LWP 24213) "clinfo:sh6" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  13   Thread 0x7f63cf7fe700 (LWP 24214) "clinfo:sh7" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  14   Thread 0x7f63ceffd700 (LWP 24215) "clinfo:sh8" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  15   Thread 0x7f63ce7fc700 (LWP 24216) "clinfo:sh9" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  16   Thread 0x7f63cdffb700 (LWP 24217) "clinfo:sh10" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  17   Thread 0x7f63cd7fa700 (LWP 24218) "clinfo:sh11" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915217d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  18   Thread 0x7f63ccff9700 (LWP 24219) "clinfo:shlo0" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915218280) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  19   Thread 0x7f639bfff700 (LWP 24220) "clinfo:shlo1" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915218280) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  20   Thread 0x7f639b7fe700 (LWP 24221) "clinfo:shlo2" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915218280) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  21   Thread 0x7f639affd700 (LWP 24222) "clinfo:shlo3" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915218280) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  22   Thread 0x7f639a7fc700 (LWP 24223) "clinfo:shlo4" 0x7f63e7e36e6c in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x55e915218280) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
(gdb) thread apply all bt

Thread 22 (Thread 0x7f639a7fc700 (LWP 24223)):
#0  0x7f63e7e36e6c in futex_wait_cancelable (private=, 
expected=0, futex_word=0x55e915218280) at 
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0

[Mesa-dev] [PATCH 2/2] freedreno: allocate batches from the cache in launch_grid

2018-10-17 Thread Hyunjun Ko

Batches need to be allocated from the cache so that they get a
valid index and resource dependency tracking works correctly.

In addition, this fixes an assertion failure seen on debug builds
since commit 1a40faa8 landed.
---
 src/gallium/drivers/freedreno/freedreno_draw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_draw.c 
b/src/gallium/drivers/freedreno/freedreno_draw.c
index e130895aac..fe026a5fd8 100644
--- a/src/gallium/drivers/freedreno/freedreno_draw.c
+++ b/src/gallium/drivers/freedreno/freedreno_draw.c
@@ -459,7 +459,7 @@ fd_launch_grid(struct pipe_context *pctx, const struct 
pipe_grid_info *info)
struct fd_batch *batch, *save_batch = NULL;
unsigned i;
 
-   batch = fd_batch_create(ctx, true);
+   batch = fd_bc_alloc_batch(&ctx->screen->batch_cache, ctx, true);
fd_batch_reference(&save_batch, ctx->batch);
fd_batch_reference(&ctx->batch, batch);
 
@@ -506,6 +506,7 @@ fd_launch_grid(struct pipe_context *pctx, const struct 
pipe_grid_info *info)
 
fd_batch_reference(&ctx->batch, save_batch);
fd_batch_reference(&save_batch, NULL);
+   fd_batch_reference(&batch, NULL);
 }
 
 void
-- 
2.17.1


[Mesa-dev] [PATCH 1/2] freedreno: adds nondraw param to fd_bc_alloc_batch

2018-10-17 Thread Hyunjun Ko

Allow callers to specify nondraw when creating a batch through
fd_bc_alloc_batch, since it is better to create batches through it
than to call fd_batch_create directly.
---
 src/gallium/drivers/freedreno/a6xx/fd6_blitter.c  | 2 +-
 src/gallium/drivers/freedreno/freedreno_batch_cache.c | 6 +++---
 src/gallium/drivers/freedreno/freedreno_batch_cache.h | 2 +-
 src/gallium/drivers/freedreno/freedreno_context.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c 
b/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c
index bd37005d50..c962fe7997 100644
--- a/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c
+++ b/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c
@@ -486,7 +486,7 @@ fd6_blit(struct pipe_context *pctx, const struct 
pipe_blit_info *info)
return;
}
 
-   batch = fd_bc_alloc_batch(&ctx->screen->batch_cache, ctx);
+   batch = fd_bc_alloc_batch(&ctx->screen->batch_cache, ctx, true);
 
fd6_emit_restore(batch, batch->draw);
fd6_emit_lrz_flush(batch->draw);
diff --git a/src/gallium/drivers/freedreno/freedreno_batch_cache.c 
b/src/gallium/drivers/freedreno/freedreno_batch_cache.c
index 9d046f205b..a8b32d9bd0 100644
--- a/src/gallium/drivers/freedreno/freedreno_batch_cache.c
+++ b/src/gallium/drivers/freedreno/freedreno_batch_cache.c
@@ -270,7 +270,7 @@ fd_bc_invalidate_resource(struct fd_resource *rsc, bool 
destroy)
 }
 
 struct fd_batch *
-fd_bc_alloc_batch(struct fd_batch_cache *cache, struct fd_context *ctx)
+fd_bc_alloc_batch(struct fd_batch_cache *cache, struct fd_context *ctx, bool 
nondraw)
 {
struct fd_batch *batch;
uint32_t idx;
@@ -333,7 +333,7 @@ fd_bc_alloc_batch(struct fd_batch_cache *cache, struct 
fd_context *ctx)
 
idx--;  /* bit zero returns 1 for ffs() */
 
-   batch = fd_batch_create(ctx, false);
+   batch = fd_batch_create(ctx, nondraw);
if (!batch)
goto out;
 
@@ -365,7 +365,7 @@ batch_from_key(struct fd_batch_cache *cache, struct key 
*key,
return batch;
}
 
-   batch = fd_bc_alloc_batch(cache, ctx);
+   batch = fd_bc_alloc_batch(cache, ctx, false);
 #ifdef DEBUG
DBG("%p: hash=0x%08x, %ux%u, %u layers, %u samples", batch, hash,
key->width, key->height, key->layers, key->samples);
diff --git a/src/gallium/drivers/freedreno/freedreno_batch_cache.h 
b/src/gallium/drivers/freedreno/freedreno_batch_cache.h
index 348418e187..0f2c40ba8d 100644
--- a/src/gallium/drivers/freedreno/freedreno_batch_cache.h
+++ b/src/gallium/drivers/freedreno/freedreno_batch_cache.h
@@ -68,7 +68,7 @@ void fd_bc_flush_deferred(struct fd_batch_cache *cache, 
struct fd_context *ctx);
 void fd_bc_invalidate_context(struct fd_context *ctx);
 void fd_bc_invalidate_batch(struct fd_batch *batch, bool destroy);
 void fd_bc_invalidate_resource(struct fd_resource *rsc, bool destroy);
-struct fd_batch * fd_bc_alloc_batch(struct fd_batch_cache *cache, struct 
fd_context *ctx);
+struct fd_batch * fd_bc_alloc_batch(struct fd_batch_cache *cache, struct 
fd_context *ctx, bool nondraw);
 
 struct fd_batch * fd_batch_from_fb(struct fd_batch_cache *cache,
struct fd_context *ctx, const struct pipe_framebuffer_state 
*pfb);
diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index 55e978073a..c540d6d143 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -316,7 +316,7 @@ fd_context_init(struct fd_context *ctx, struct pipe_screen 
*pscreen,
pctx->const_uploader = pctx->stream_uploader;
 
if (!ctx->screen->reorder)
-   ctx->batch = fd_bc_alloc_batch(&screen->batch_cache, ctx);
+   ctx->batch = fd_bc_alloc_batch(&screen->batch_cache, ctx, 
false);
 
slab_create_child(&ctx->transfer_pool, &screen->transfer_pool);
 
-- 
2.17.1
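
The slot lookup that the diff touches (the "idx--" after ffs()) can be sketched as a small bitmask allocator. This is a Python illustration under stated assumptions, not the freedreno code: `BatchCache`, `alloc_batch` and the dictionary used as a batch are hypothetical, and the sketch simply returns None when no slot is free, where a real driver would have to free a slot first.

```python
def ffs(mask):
    """1-based index of the least significant set bit, or 0 if none is set."""
    return (mask & -mask).bit_length()


class BatchCache:
    """Toy cache with a fixed number of slots tracked in a bitmask."""
    def __init__(self, slots=32):
        self.slots = slots
        self.batch_mask = 0      # bit i set => slot i is in use
        self.batches = {}

    def alloc_batch(self, ctx, nondraw):
        free = ~self.batch_mask & ((1 << self.slots) - 1)
        idx = ffs(free)
        if idx == 0:
            return None          # no free slot in this toy model
        idx -= 1                 # bit zero returns 1 for ffs()
        # The nondraw flag is passed through to the created batch.
        batch = {"ctx": ctx, "nondraw": nondraw, "idx": idx}
        self.batch_mask |= 1 << idx
        self.batches[idx] = batch
        return batch
```

The point of routing every creation through the allocator is that each batch ends up with a valid slot index, which is what the dependency tracking keys on.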


Re: [Mesa-dev] [RFC] Allow fd.o to join forces with X.Org

2018-10-17 Thread Daniel Vetter

On Wed, Oct 17, 2018 at 2:05 PM Daniel Stone  wrote:
>
> On Tue, 16 Oct 2018 at 08:17, Peter Hutterer  wrote:
> > On Mon, Oct 15, 2018 at 10:49:24AM -0400, Harry Wentland wrote:
> > > + \item Support free and open source projects through the 
> > > freedesktop.org
> > > + infrastructure. For projects outside the scope of item (\ref{1}) 
> > > support
> > > + extends to project hosting only.
> > > +
> >
> > Yes to the idea but given that the remaining 11 pages cover all the legalese
> > for xorg I think we need to add at least a section of what "project hosting"
> > means. Even if it's just a "includes but is not limited to blah".  And some
> > addition to 4.1 Powers is needed to spell out what the BoD can do in regards
> > to fdo.
>
> Yeah, I think it makes sense. Some things we do:
>   - provide hosted network services for collaborative development,
> testing, and discussion, of open-source projects
>   - administer, improve, and extend this suite of services as necessary
>   - assist open-source projects in their use of these services
>   - purchase, lease, or subscribe to, computing and networking
> infrastructure allowing these services to be run

I fully agree that we should document all this. I don't think the
bylaws are the right place though, much better to put that into
policies that the board approves and which can be adapted as needed.
Imo bylaws should cover the high-level mission and procedural details,
as our "constitution", with the really high acceptance criteria of
2/3rd of all members approving any changes. Some of the early
discussions tried to spell out a lot of the fd.o policies in bylaw
changes, but then we realized it's all there already. All the details
are much better served in policies enacted by the board, like we do
with everything else.

As an example, let's look at XDC. Definitely one of the biggest things
the foundation does, with handling finances, travel sponsoring grants,
papers committee, and acquiring lots of sponsors. None of this is
spelled out in the bylaws, it's all in policies that the board
deliberates and approves. I think this same approach will also work
well for fd.o.

And if members are unhappy with what the board does, they can fix that in
the next election by throwing out the unwanted directors.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Mesa-dev] [Bug 100960] Special block from Minecraft mod rendered out of place

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=100960

--- Comment #17 from Sergii Romantsov  ---
Hello, Fabian.

Unfortunately, it is unlikely that anyone will be very interested in landing
that fix in Mesa.
The reason is that the issue is actually in the game. The specification does
not say exactly how this case should be handled, so at the moment it is
implementation-dependent.

Suggestion: please report the issue to the owner of the Minecraft mod. That is
probably the fastest way to get it fixed: instead of calling
  'glRotated(angle = 180, x = 0, y = 0, z = 0)'
the application should call
  'glRotated(angle = 180, x = 1, y = 0, z = 0)'.
Please also provide a link to that report.

The proposed patches are still valid and make Mesa compatible with Nvidia
and Windows. Even so, the current behavior of Mesa can also be treated as
'correct'.
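
To illustrate why a zero rotation axis is a problem, here is a minimal Python sketch of the usual axis-angle rotation matrix. The function name `rotation_matrix` is hypothetical, and returning None for a zero-length axis is only this sketch's choice; it does not describe what Mesa or any other driver actually does.

```python
import math


def rotation_matrix(angle_deg, x, y, z):
    """3x3 axis-angle rotation; returns None when the axis has zero length."""
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        return None     # normalising (0, 0, 0) is undefined
    x, y, z = x / length, y / length, z / length
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]
```

With the axis (1, 0, 0) and an angle of 180 the result is well defined (up to rounding, the y and z axes are negated), whereas the axis (0, 0, 0) cannot be normalised at all, so whatever happens next is up to the implementation.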

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [RFC] Allow fd.o to join forces with X.Org

2018-10-17 Thread Daniel Stone

On Tue, 16 Oct 2018 at 08:17, Peter Hutterer  wrote:
> On Mon, Oct 15, 2018 at 10:49:24AM -0400, Harry Wentland wrote:
> > + \item Support free and open source projects through the 
> > freedesktop.org
> > + infrastructure. For projects outside the scope of item (\ref{1}) 
> > support
> > + extends to project hosting only.
> > +
>
> Yes to the idea but given that the remaining 11 pages cover all the legalese
> for xorg I think we need to add at least a section of what "project hosting"
> means. Even if it's just a "includes but is not limited to blah".  And some
> addition to 4.1 Powers is needed to spell out what the BoD can do in regards
> to fdo.

Yeah, I think it makes sense. Some things we do:
  - provide hosted network services for collaborative development,
testing, and discussion, of open-source projects
  - administer, improve, and extend this suite of services as necessary
  - assist open-source projects in their use of these services
  - purchase, lease, or subscribe to, computing and networking
infrastructure allowing these services to be run

Cheers,
Daniel

[Mesa-dev] [PATCH 3/3] anv: Implement VK_EXT_conditional_rendering for gen 7.5+

2018-10-17 Thread Danylo Piliaiev

Conditional rendering affects the following functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments

To reduce memory reads, the result of the condition is calculated once
and stored in the designated register MI_ALU_REG15.

In the current implementation, the affected functions expect
MI_PREDICATE_RESULT to be set before they are called, so any code that
changes the predicate should restore it with
restore_conditional_render_predicate.
An alternative is to restore MI_PREDICATE_RESULT at the beginning of
every affected function.

Signed-off-by: Danylo Piliaiev 
---
 src/intel/vulkan/anv_blorp.c   |   7 +-
 src/intel/vulkan/anv_device.c  |  12 ++
 src/intel/vulkan/anv_extensions.py |   1 +
 src/intel/vulkan/anv_private.h |   2 +
 src/intel/vulkan/genX_cmd_buffer.c | 192 -
 5 files changed, 209 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 478b8e7a3d..157875d16f 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1144,8 +1144,11 @@ void anv_CmdClearAttachments(
 * trash our depth and stencil buffers.
 */
struct blorp_batch batch;
-   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
-BLORP_BATCH_NO_EMIT_DEPTH_STENCIL);
+   enum blorp_batch_flags flags = BLORP_BATCH_NO_EMIT_DEPTH_STENCIL;
+   if (cmd_buffer->state.conditional_render_enabled) {
+   flags |= BLORP_BATCH_PREDICATE_ENABLE;
+   }
+   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, flags);
 
for (uint32_t a = 0; a < attachmentCount; ++a) {
   if (pAttachments[a].aspectMask & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a2551452eb..930a192c25 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -957,6 +957,18 @@ void anv_GetPhysicalDeviceFeatures2(
  break;
   }
 
+  case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CONDITIONAL_RENDERING_FEATURES_EXT: {
+ VkPhysicalDeviceConditionalRenderingFeaturesEXT *features =
+(VkPhysicalDeviceConditionalRenderingFeaturesEXT*)ext;
+ ANV_FROM_HANDLE(anv_physical_device, pdevice, physicalDevice);
+
+ features->conditionalRendering = pdevice->info.gen >= 8 ||
+  pdevice->info.is_haswell;
+ features->inheritedConditionalRendering = pdevice->info.gen >= 8 ||
+   pdevice->info.is_haswell;
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index c13ce531ee..2ef7a52d01 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -127,6 +127,7 @@ EXTENSIONS = [
 Extension('VK_EXT_vertex_attribute_divisor',  3, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen 
>= 9'),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen 
>= 9'),
+Extension('VK_EXT_conditional_rendering', 1, 'device->info.gen 
>= 8 || device->info.is_haswell'),
 ]
 
 class VkVersion:
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 599b903f25..108da51a59 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2032,6 +2032,8 @@ struct anv_cmd_state {
 */
bool hiz_enabled;
 
+   bool conditional_render_enabled;
+
/**
 * Array length is anv_cmd_state::pass::attachment_count. Array content is
 * valid only when recording a render pass instance.
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index f07a6aa7c9..87abc443b6 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -479,8 +479,9 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
0, 0, 1, hiz_op);
 }
 
-#define MI_PREDICATE_SRC0  0x2400
-#define MI_PREDICATE_SRC1  0x2408
+#define MI_PREDICATE_SRC0    0x2400
+#define MI_PREDICATE_SRC1    0x2408
+#define MI_PREDICATE_RESULT  0x2418
 
 static void
 set_image_compressed_bit(struct anv_cmd_buffer *cmd_buffer,
@@ -545,6 +546,14 @@ mi_alu(uint32_t opcode, uint32_t operand1, uint32_t 
operand2)
 
 #define CS_GPR(n) (0x2600 + (n) * 8)
 
+#if GEN_GEN >= 8 || GEN_IS_HASWELL
+static void
+restore_conditional_render_predicate(struct anv_cmd_buffer *cmd_buffer)
+{
+   emit_lrr(&cmd_buffer->batch, MI_PREDICATE_RESULT, CS_GPR(MI_ALU_REG15));
+}
+#endif
+
 /* This is only really practical on haswell and above because it requires
  * MI math in orde

[Mesa-dev] [PATCH 0/3] anv: Implement VK_KHR_draw_indirect_count and VK_EXT_conditional_rendering

2018-10-17 Thread Danylo Piliaiev

This series implements the VK_KHR_draw_indirect_count and
VK_EXT_conditional_rendering extensions.
They are implemented together because they are closely intertwined.

There are already tests in the Vulkan CTS for VK_KHR_draw_indirect_count, and
I have opened a pull request with tests for VK_EXT_conditional_rendering
(https://github.com/KhronosGroup/VK-GL-CTS/pull/131).

VK_KHR_draw_indirect_count is implemented for gen7+.
VK_EXT_conditional_rendering is implemented for gen7.5+ because implementing
it correctly requires MI_MATH.

Since some of the tests aren't in VK-GL-CTS master yet, I'm not sure how to
test the implementation of VK_EXT_conditional_rendering with my tests on CI.
Could anyone help me with this?

Also the one thing I'm uncertain of is described in the last patch.

Many thanks to Jason Ekstrand for the help with the extensions.

Danylo Piliaiev (3):
  anv: Implement VK_KHR_draw_indirect_count for gen 7.5+
  anv: Implement VK_KHR_draw_indirect_count for gen 7
  anv: Implement VK_EXT_conditional_rendering for gen 7.5+

 src/intel/vulkan/anv_blorp.c   |   7 +-
 src/intel/vulkan/anv_device.c  |  12 +
 src/intel/vulkan/anv_extensions.py |   2 +
 src/intel/vulkan/anv_private.h |   2 +
 src/intel/vulkan/genX_cmd_buffer.c | 355 -
 5 files changed, 373 insertions(+), 5 deletions(-)

-- 
2.18.0
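
The open question from the last patch (code that changes the predicate must restore it afterwards) can be pictured with a toy register model. This is a Python sketch under stated assumptions: `CommandStreamer` and its methods are hypothetical, the two attributes merely stand in for CS_GPR(MI_ALU_REG15) and MI_PREDICATE_RESULT, and the inverted-condition flag of the extension is ignored.

```python
class CommandStreamer:
    """Toy model of the two registers involved in conditional rendering."""
    def __init__(self):
        self.gpr15 = 0              # stands in for CS_GPR(MI_ALU_REG15)
        self.predicate_result = 0   # stands in for MI_PREDICATE_RESULT

    def begin_conditional_rendering(self, condition_value):
        # Evaluate the condition once and keep a copy in the spare register.
        self.gpr15 = 1 if condition_value != 0 else 0
        self.predicate_result = self.gpr15

    def internal_op_using_predicate(self, value):
        # Any driver-internal operation that reprograms the predicate.
        self.predicate_result = value

    def restore_conditional_render_predicate(self):
        # Register-to-register copy, so no memory read is needed.
        self.predicate_result = self.gpr15

    def draw(self, log):
        # A predicated draw is only recorded when the predicate is set.
        if self.predicate_result:
            log.append("draw")
```

In this model a draw issued after an internal operation has clobbered the predicate is wrongly skipped until the restore is performed, which is the trade-off between restoring after every clobber and restoring at the start of every affected function.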


[Mesa-dev] [PATCH 1/3] anv: Implement VK_KHR_draw_indirect_count for gen 7.5+

2018-10-17 Thread Danylo Piliaiev

Signed-off-by: Danylo Piliaiev 
---
 src/intel/vulkan/anv_extensions.py |   1 +
 src/intel/vulkan/genX_cmd_buffer.c | 155 +
 2 files changed, 156 insertions(+)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index d4915c9501..7f44da6648 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -113,6 +113,7 @@ EXTENSIONS = [
 Extension('VK_KHR_xlib_surface',  6, 
'VK_USE_PLATFORM_XLIB_KHR'),
 Extension('VK_KHR_multiview', 1, True),
 Extension('VK_KHR_display',  23, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
+Extension('VK_KHR_draw_indirect_count',   1, 'device->info.gen 
>= 8 || device->info.is_haswell'),
 Extension('VK_EXT_acquire_xlib_display',  1, 
'VK_USE_PLATFORM_XLIB_XRANDR_EXT'),
 Extension('VK_EXT_debug_report',  8, True),
 Extension('VK_EXT_direct_mode_display',   1, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 43a02f2256..d7b94efd19 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2982,6 +2982,161 @@ void genX(CmdDrawIndexedIndirect)(
}
 }
 
+#if GEN_IS_HASWELL || GEN_GEN >= 8
+static void
+emit_draw_count_predicate(struct anv_cmd_buffer *cmd_buffer,
+  struct anv_address count_address,
+  uint32_t draw_index)
+{
+   /* Upload the current draw count from the draw parameters buffer to
+* MI_PREDICATE_SRC0.
+*/
+   emit_lrr(&cmd_buffer->batch, MI_PREDICATE_SRC0, CS_GPR(MI_ALU_REG14));
+
+   /* Upload the index of the current primitive to MI_PREDICATE_SRC1. */
+   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1, draw_index);
+   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 + 4, 0);
+
+   if (draw_index == 0) {
+   anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation    = LOAD_LOADINV;
+  mip.CombineOperation = COMBINE_SET;
+  mip.CompareOperation = COMPARE_SRCS_EQUAL;
+   }
+   } else {
+   /* While draw_index < draw_count the predicate's result will be
+*  (draw_index == draw_count) ^ TRUE = TRUE
+* When draw_index == draw_count the result is
+*  (TRUE) ^ TRUE = FALSE
+* After this all results will be:
+*  (FALSE) ^ FALSE = FALSE
+*/
+   anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation    = LOAD_LOAD;
+  mip.CombineOperation = COMBINE_XOR;
+  mip.CompareOperation = COMPARE_SRCS_EQUAL;
+   }
+   }
+}
+
+void genX(CmdDrawIndirectCountKHR)(
+    VkCommandBuffer                             commandBuffer,
+    VkBuffer                                    _buffer,
+    VkDeviceSize                                offset,
+    VkBuffer                                    _countBuffer,
+    VkDeviceSize                                countBufferOffset,
+    uint32_t                                    maxDrawCount,
+    uint32_t                                    stride)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
+   ANV_FROM_HANDLE(anv_buffer, count_buffer, _countBuffer);
+   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+   struct anv_pipeline *pipeline = cmd_state->gfx.base.pipeline;
+   const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
+
+   if (anv_batch_has_error(&cmd_buffer->batch))
+  return;
+
+   genX(cmd_buffer_flush_state)(cmd_buffer);
+
+   struct anv_address count_address =
+  anv_address_add(count_buffer->address, countBufferOffset);
+
+   /* Needed to ensure the memory is coherent for the MI_LOAD_REGISTER_MEM
+* command when loading the values into the predicate source registers.
+*/
+   anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.PipeControlFlushEnable = true;
+   }
+
+   emit_lrm(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14), count_address);
+   emit_lri(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14) + 4, 0);
+
+   for (uint32_t i = 0; i < maxDrawCount; i++) {
+  struct anv_address draw = anv_address_add(buffer->address, offset);
+
+  emit_draw_count_predicate(cmd_buffer, count_address, i);
+
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_baseinstance)
+ emit_base_vertex_instance_bo(cmd_buffer, anv_address_add(draw, 8));
+  if (vs_prog_data->uses_drawid)
+ emit_draw_index(cmd_buffer, i);
+
+  load_indirect_parameters(cmd_buffer, draw, false);
+
+  anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
+ prim.IndirectParameterEnable  = true;
+ prim.PredicateEnable  = true;
+ prim.VertexAccessType = SEQUENTIAL;
+ prim.PrimitiveTopologyType= pi

[Mesa-dev] [PATCH 2/3] anv: Implement VK_KHR_draw_indirect_count for gen 7

2018-10-17 Thread Danylo Piliaiev

Without MI_MATH we are forced to load MI_PREDICATE_SRC0
from memory on every predicate emission.

Signed-off-by: Danylo Piliaiev 
---
 src/intel/vulkan/anv_extensions.py |  2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 12 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 7f44da6648..c13ce531ee 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -113,7 +113,7 @@ EXTENSIONS = [
 Extension('VK_KHR_xlib_surface',  6, 
'VK_USE_PLATFORM_XLIB_KHR'),
 Extension('VK_KHR_multiview', 1, True),
 Extension('VK_KHR_display',  23, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
-Extension('VK_KHR_draw_indirect_count',   1, 'device->info.gen 
>= 8 || device->info.is_haswell'),
+Extension('VK_KHR_draw_indirect_count',   1, True),
 Extension('VK_EXT_acquire_xlib_display',  1, 
'VK_USE_PLATFORM_XLIB_XRANDR_EXT'),
 Extension('VK_EXT_debug_report',  8, True),
 Extension('VK_EXT_direct_mode_display',   1, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index d7b94efd19..f07a6aa7c9 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2982,7 +2982,6 @@ void genX(CmdDrawIndexedIndirect)(
}
 }
 
-#if GEN_IS_HASWELL || GEN_GEN >= 8
 static void
 emit_draw_count_predicate(struct anv_cmd_buffer *cmd_buffer,
   struct anv_address count_address,
@@ -2991,7 +2990,13 @@ emit_draw_count_predicate(struct anv_cmd_buffer 
*cmd_buffer,
/* Upload the current draw count from the draw parameters buffer to
 * MI_PREDICATE_SRC0.
 */
+#if GEN_GEN >= 8 || GEN_IS_HASWELL
emit_lrr(&cmd_buffer->batch, MI_PREDICATE_SRC0, CS_GPR(MI_ALU_REG14));
+#else
+   emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0, count_address);
+   /* Zero the top 32-bits of MI_PREDICATE_SRC0 */
+   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4, 0);
+#endif
 
/* Upload the index of the current primitive to MI_PREDICATE_SRC1. */
emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1, draw_index);
@@ -3050,8 +3055,10 @@ void genX(CmdDrawIndirectCountKHR)(
  pc.PipeControlFlushEnable = true;
}
 
+#if GEN_GEN >= 8 || GEN_IS_HASWELL
emit_lrm(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14), count_address);
emit_lri(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14) + 4, 0);
+#endif
 
for (uint32_t i = 0; i < maxDrawCount; i++) {
   struct anv_address draw = anv_address_add(buffer->address, offset);
@@ -3108,8 +3115,10 @@ void genX(CmdDrawIndexedIndirectCountKHR)(
  pc.PipeControlFlushEnable = true;
}
 
+#if GEN_GEN >= 8 || GEN_IS_HASWELL
emit_lrm(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14), count_address);
emit_lri(&cmd_buffer->batch, CS_GPR(MI_ALU_REG14) + 4, 0);
+#endif
 
for (uint32_t i = 0; i < maxDrawCount; i++) {
   struct anv_address draw = anv_address_add(buffer->address, offset);
@@ -3135,7 +3144,6 @@ void genX(CmdDrawIndexedIndirectCountKHR)(
   offset += stride;
}
 }
-#endif
 
 static VkResult
 flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
-- 
2.18.0
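
The XOR chain described in the comment of emit_draw_count_predicate can be checked with a few lines of Python. This is a sketch, not the driver code: `predicate_sequence` is a hypothetical name, and it assumes that LOAD_LOADINV with COMBINE_SET stores the inverted comparison for the first draw while LOAD_LOAD with COMBINE_XOR combines the comparison with the previous predicate.

```python
def predicate_sequence(draw_count, max_draw_count):
    """Return the predicate value seen by each of the max_draw_count draws."""
    results = []
    predicate = False
    for draw_index in range(max_draw_count):
        equal = (draw_index == draw_count)   # COMPARE_SRCS_EQUAL
        if draw_index == 0:
            predicate = not equal            # LOAD_LOADINV + COMBINE_SET
        else:
            predicate = equal ^ predicate    # LOAD_LOAD + COMBINE_XOR
        results.append(predicate)
    return results
```

Under those assumptions the predicate stays true while draw_index < draw_count and false from then on, so exactly the first draw_count of the maxDrawCount emitted draws are executed.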


Re: [Mesa-dev] [PATCH] radv: Add support for VK_KHR_driver_properties.

2018-10-17 Thread Alex Smith

This patch never landed in git, is that intentional?

On Mon, 1 Oct 2018 at 17:46, Jason Ekstrand  wrote:

> On Sun, Sep 30, 2018 at 1:04 PM Bas Nieuwenhuizen 
> wrote:
>
>> ---
>>  src/amd/vulkan/radv_device.c  | 27 +++
>>  src/amd/vulkan/radv_extensions.py |  1 +
>>  2 files changed, 28 insertions(+)
>>
>> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
>> index f7752eac83b..fe7e7f7f6ac 100644
>> --- a/src/amd/vulkan/radv_device.c
>> +++ b/src/amd/vulkan/radv_device.c
>> @@ -1196,6 +1196,33 @@ void radv_GetPhysicalDeviceProperties2(
>>
>> properties->conservativeRasterizationPostDepthCoverage = VK_FALSE;
>> break;
>> }
>> +   case
>> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR: {
>> +   VkPhysicalDeviceDriverPropertiesKHR *driver_props
>> =
>> +   (VkPhysicalDeviceDriverPropertiesKHR *)
>> ext;
>> +
>> +   driver_props->driverID =
>> VK_DRIVER_ID_MESA_RADV_KHR;
>> +   memset(driver_props->driverName, 0,
>> VK_MAX_DRIVER_NAME_SIZE_KHR);
>> +   strcpy(driver_props->driverName, "radv");
>> +
>> +   memset(driver_props->driverInfo, 0,
>> VK_MAX_DRIVER_INFO_SIZE_KHR);
>> +   snprintf(driver_props->driverInfo,
>> VK_MAX_DRIVER_INFO_SIZE_KHR,
>> +   "Mesa " PACKAGE_VERSION
>> +#ifdef MESA_GIT_SHA1
>> +   " ("MESA_GIT_SHA1")"
>> +#endif
>> +   " (LLVM %i.%i.%i)",
>>
>
> I think %d is more customary, but I don't care.  Assuming you actually
> pass 1.1.0.2,
>
> Reviewed-by: Jason Ekstrand 
>
>
>> +(HAVE_LLVM >> 8) & 0xff, HAVE_LLVM &
>> 0xff,
>> +MESA_LLVM_VERSION_PATCH);
>> +
>> +   driver_props->conformanceVersion =
>> (VkConformanceVersionKHR) {
>> +   .major = 1,
>> +   .minor = 1,
>> +   .subminor = 0,
>> +   .patch = 2,
>> +   };
>> +   break;
>> +   }
>> +
>> default:
>> break;
>> }
>> diff --git a/src/amd/vulkan/radv_extensions.py
>> b/src/amd/vulkan/radv_extensions.py
>> index 584926df390..8df5da76ed5 100644
>> --- a/src/amd/vulkan/radv_extensions.py
>> +++ b/src/amd/vulkan/radv_extensions.py
>> @@ -59,6 +59,7 @@ EXTENSIONS = [
>>  Extension('VK_KHR_device_group',  1, True),
>>  Extension('VK_KHR_device_group_creation', 1, True),
>>  Extension('VK_KHR_draw_indirect_count',   1, True),
>> +Extension('VK_KHR_driver_properties', 1, True),
>>  Extension('VK_KHR_external_fence',1,
>> 'device->rad_info.has_syncobj_wait_for_submit'),
>>  Extension('VK_KHR_external_fence_capabilities',   1, True),
>>  Extension('VK_KHR_external_fence_fd', 1,
>> 'device->rad_info.has_syncobj_wait_for_submit'),
>> --
>> 2.19.0
>>

Re: [Mesa-dev] [PATCH] nir: remove some redundant bcsel instructions

2018-10-17 Thread Timothy Arceri


On 17/10/18 8:49 pm, Bas Nieuwenhuizen wrote:

On Wed, Oct 17, 2018 at 5:49 AM Timothy Arceri  wrote:


For example:

vec1 32 ssa_386 = feq ssa_333.x, ssa_6
vec1 32 ssa_387 = feq ssa_333.x, ssa_2
vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324
vec1 32 ssa_396 = bcsel ssa_386, ssa_324, ssa_391

Can be simplified to:

vec1 32 ssa_386 = feq ssa_333.x, ssa_6
vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324

There are a bunch of these in Rise of the Tomb Raider's Vulkan
shaders. There are also a handful of shaders helped in shader-db
but the changes there are smaller.

For RADV:

Totals from affected shaders:
SGPRS: 11184 -> 11168 (-0.14 %)
VGPRS: 11484 -> 11484 (0.00 %)
Spilled SGPRs: 1119 -> 1116 (-0.27 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1210856 -> 1210372 (-0.04 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 360 -> 360 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
---
  src/compiler/nir/nir_opt_algebraic.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index cc747250ba5..7530710cbe0 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -34,6 +34,7 @@ a = 'a'
  b = 'b'
  c = 'c'
  d = 'd'
+e = 'e'

  # Written in the form (, ) where  is an expression
  # and  is either an expression or a value.  An expression is
@@ -525,6 +526,9 @@ optimizations = [
 # The result of this should be hit by constant propagation and, in the
 # next round of opt_algebraic, get picked up by one of the above two.
 (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
+   # Remove redundant bcsel
+   (('bcsel', ('ieq', '#a', b), c, ('bcsel', ('ieq', '#d', b), e, c)), 
('bcsel', ('ieq', d, b), e, c)),


I think this only works if the value of a is not equal to the value of
d? if a is equal to d, then the expression on the left is always c,
while the expression on the right is e sometimes?


Hmm. I thought the search/matching code was smart enough to handle this,
but looking at it, it seems I was wrong.


I'll take a look tomorrow to see how hard it would be to handle this safely.





+   (('bcsel', ('feq', '#a', b), c, ('bcsel', ('feq', '#d', b), e, c)), 
('bcsel', ('feq', d, b), e, c)),




 (('bcsel', a, b, b), b),
 (('fcsel', a, b, b), b),
--
2.17.1


[Mesa-dev] [PATCH] intel/compiler: fix node interference of simd16 instructions

2018-10-17 Thread Iago Toral Quiroga

SIMD16 instructions need additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.

While we already have code to handle this, it only ran for SIMD16
dispatches. However, we can have SIMD16 instructions in a SIMD8 dispatch.
One example is pull constant loads since commit b56fa830c6095,
but there are more cases.

This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.
---
 src/intel/compiler/brw_fs_reg_allocate.cpp | 36 ++
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/src/intel/compiler/brw_fs_reg_allocate.cpp 
b/src/intel/compiler/brw_fs_reg_allocate.cpp
index 42ccb28de6..f72826bc41 100644
--- a/src/intel/compiler/brw_fs_reg_allocate.cpp
+++ b/src/intel/compiler/brw_fs_reg_allocate.cpp
@@ -632,26 +632,24 @@ fs_visitor::assign_regs(bool allow_spilling, bool 
spill_all)
   }
}
 
-   if (dispatch_width > 8) {
-  /* In 16-wide dispatch we have an issue where a compressed
-   * instruction is actually two instructions executed simultaneiously.
-   * It's actually ok to have the source and destination registers be
-   * the same.  In this case, each instruction over-writes its own
-   * source and there's no problem.  The real problem here is if the
-   * source and destination registers are off by one.  Then you can end
-   * up in a scenario where the first instruction over-writes the
-   * source of the second instruction.  Since the compiler doesn't know
-   * about this level of granularity, we simply make the source and
-   * destination interfere.
-   */
-  foreach_block_and_inst(block, fs_inst, inst, cfg) {
- if (inst->dst.file != VGRF)
-continue;
+   /* In 16-wide instructions we have an issue where a compressed
+* instruction is actually two instructions executed simultaneously.
+* It's actually ok to have the source and destination registers be
+* the same.  In this case, each instruction over-writes its own
+* source and there's no problem.  The real problem here is if the
+* source and destination registers are off by one.  Then you can end
+* up in a scenario where the first instruction over-writes the
+* source of the second instruction.  Since the compiler doesn't know
+* about this level of granularity, we simply make the source and
+* destination interfere.
+*/
+   foreach_block_and_inst(block, fs_inst, inst, cfg) {
+  if (inst->exec_size < 16 || inst->dst.file != VGRF)
+ continue;
 
- for (int i = 0; i < inst->sources; ++i) {
-if (inst->src[i].file == VGRF) {
-   ra_add_node_interference(g, inst->dst.nr, inst->src[i].nr);
-}
+  for (int i = 0; i < inst->sources; ++i) {
+ if (inst->src[i].file == VGRF) {
+ra_add_node_interference(g, inst->dst.nr, inst->src[i].nr);
  }
   }
}
-- 
2.17.1
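
The hazard described in the code comment can be reproduced with a toy register file. This is a Python sketch under stated assumptions, not a model of the real hardware: the two function names are hypothetical, each register holds 8 lanes, and the compressed instruction is treated as two 8-wide halves that run one after the other, each reading one source register and writing one destination register.

```python
def run_simd16_add_one(regs, src, dst):
    """Execute dst = src + 1 as two 8-wide halves, in order."""
    regs = [list(r) for r in regs]
    for half in range(2):
        # Each half reads its source register at the time it executes.
        regs[dst + half] = [v + 1 for v in regs[src + half]]
    return regs


def expected_simd16_add_one(regs, src, dst):
    """Reference result: read both source registers before writing anything."""
    regs = [list(r) for r in regs]
    sources = [list(regs[src]), list(regs[src + 1])]
    for half in range(2):
        regs[dst + half] = [v + 1 for v in sources[half]]
    return regs
```

With src == dst both versions agree, because each half only overwrites its own source. With dst == src + 1 the first half overwrites the register that the second half is about to read, so the results differ. This toy ordering only exhibits the hazard in that one direction, whereas the patch conservatively makes every VGRF source of a 16-wide instruction interfere with its destination.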


Re: [Mesa-dev] [PATCH] nir: remove some redundant bcsel instructions

2018-10-17 Thread Bas Nieuwenhuizen

On Wed, Oct 17, 2018 at 5:49 AM Timothy Arceri  wrote:
>
> For example:
>
>vec1 32 ssa_386 = feq ssa_333.x, ssa_6
>vec1 32 ssa_387 = feq ssa_333.x, ssa_2
>vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324
>vec1 32 ssa_396 = bcsel ssa_386, ssa_324, ssa_391
>
> Can be simplified to:
>
>vec1 32 ssa_386 = feq ssa_333.x, ssa_6
>vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324
>
> There are a bunch of these in Rise of the Tomb Raider's Vulkan
> shaders. There are also a handful of shaders helped in shader-db
> but the changes there are smaller.
>
> For RADV:
>
> Totals from affected shaders:
> SGPRS: 11184 -> 11168 (-0.14 %)
> VGPRS: 11484 -> 11484 (0.00 %)
> Spilled SGPRs: 1119 -> 1116 (-0.27 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 1210856 -> 1210372 (-0.04 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 360 -> 360 (0.00 %)
> Wait states: 0 -> 0 (0.00 %)
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index cc747250ba5..7530710cbe0 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -34,6 +34,7 @@ a = 'a'
>  b = 'b'
>  c = 'c'
>  d = 'd'
> +e = 'e'
>
>  # Written in the form (<search>, <replace>) where <search> is an expression
>  # and <replace> is either an expression or a value.  An expression is
> @@ -525,6 +526,9 @@ optimizations = [
> # The result of this should be hit by constant propagation and, in the
> # next round of opt_algebraic, get picked up by one of the above two.
> (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
> +   # Remove redundant bcsel
> +   (('bcsel', ('ieq', '#a', b), c, ('bcsel', ('ieq', '#d', b), e, c)), 
> ('bcsel', ('ieq', d, b), e, c)),

I think this only works if the value of a is not equal to the value of
d? If a is equal to d, then the expression on the left is always c,
while the expression on the right is sometimes e.
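
A quick way to check this concern is to brute-force both sides of the proposed rule over a small set of integer values. The sketch below is only an illustration: bcsel, lhs, rhs and counterexamples are hypothetical helper names, plain Python == stands in for ieq, and nothing here uses the real nir_opt_algebraic machinery.

```python
from itertools import product


def bcsel(cond, x, y):
    """NIR-style select: x when cond is true, otherwise y."""
    return x if cond else y


def lhs(a, b, c, d, e):
    # Search pattern: bcsel(ieq(a, b), c, bcsel(ieq(d, b), e, c))
    return bcsel(a == b, c, bcsel(d == b, e, c))


def rhs(a, b, c, d, e):
    # Proposed replacement: bcsel(ieq(d, b), e, c)
    return bcsel(d == b, e, c)


def counterexamples(values=range(3)):
    """Every (a, b, c, d, e) over `values` where the two sides differ."""
    return [
        (a, b, c, d, e)
        for a, b, c, d, e in product(values, repeat=5)
        if lhs(a, b, c, d, e) != rhs(a, b, c, d, e)
    ]


if __name__ == "__main__":
    bad = counterexamples()
    print(len(bad), "mismatches; all have a == d:",
          all(a == d for a, b, c, d, e in bad))
```

Under these assumptions every mismatch has a == d (and b equal to both), so the rewrite looks safe only when the two constants are known to differ. The feq variant would additionally need care with NaN and signed zero, which this integer sketch does not model.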


> +   (('bcsel', ('feq', '#a', b), c, ('bcsel', ('feq', '#d', b), e, c)), 
> ('bcsel', ('feq', d, b), e, c)),

>
> (('bcsel', a, b, b), b),
> (('fcsel', a, b, b), b),
> --
> 2.17.1
>

[Mesa-dev] [Bug 104302] Wolfenstein 2 (2017) under wine graphical artifacting on RADV

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=104302

--- Comment #22 from Samuel Pitoiset  ---
Patch available here https://reviews.llvm.org/D53359

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 108105] [DXVK] Dauntless Helmets rendering incorrectly on Vega, works in AMDVLK

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=108105

--- Comment #13 from Samuel Pitoiset  ---
Patch available here https://reviews.llvm.org/D53359

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] nir: remove some redundant bcsel instructions

2018-10-17 Thread Iago Toral

Reviewed-by: Iago Toral Quiroga 

On Wed, 2018-10-17 at 14:49 +1100, Timothy Arceri wrote:
> For example:
> 
>vec1 32 ssa_386 = feq ssa_333.x, ssa_6
>vec1 32 ssa_387 = feq ssa_333.x, ssa_2
>vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324
>vec1 32 ssa_396 = bcsel ssa_386, ssa_324, ssa_391
> 
> Can be simplified to:
> 
>vec1 32 ssa_387 = feq ssa_333.x, ssa_2
>vec1 32 ssa_391 = bcsel ssa_387, ssa_388, ssa_324
> 
> There are a bunch of these in Rise of the Tomb Raider's Vulkan
> shaders. There are also a handful of shaders helped in shader-db
> but the changes there are smaller.
> 
> For RADV:
> 
> Totals from affected shaders:
> SGPRS: 11184 -> 11168 (-0.14 %)
> VGPRS: 11484 -> 11484 (0.00 %)
> Spilled SGPRs: 1119 -> 1116 (-0.27 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 1210856 -> 1210372 (-0.04 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 360 -> 360 (0.00 %)
> Wait states: 0 -> 0 (0.00 %)
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_opt_algebraic.py
> b/src/compiler/nir/nir_opt_algebraic.py
> index cc747250ba5..7530710cbe0 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -34,6 +34,7 @@ a = 'a'
>  b = 'b'
>  c = 'c'
>  d = 'd'
> +e = 'e'
>  
>  # Written in the form (<search>, <replace>) where <search> is an expression
>  # and <replace> is either an expression or a value.  An expression is
> @@ -525,6 +526,9 @@ optimizations = [
> # The result of this should be hit by constant propagation and,
> in the
> # next round of opt_algebraic, get picked up by one of the above
> two.
> (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
> +   # Remove redundant bcsel
> +   (('bcsel', ('ieq', '#a', b), c, ('bcsel', ('ieq', '#d', b), e,
> c)), ('bcsel', ('ieq', d, b), e, c)),
> +   (('bcsel', ('feq', '#a', b), c, ('bcsel', ('feq', '#d', b), e,
> c)), ('bcsel', ('feq', d, b), e, c)),
>  
> (('bcsel', a, b, b), b),
> (('fcsel', a, b, b), b),

[Mesa-dev] [Bug 108355] Civilization VI - Artifacts in mouse cursor

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=108355

--- Comment #6 from Michel Dänzer  ---
Does it still happen with xf86-video-amdgpu 18.1.0?

Does amdgpu.dc=0 on the kernel command line avoid the problem?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 108355] Civilization VI - Artifacts in mouse cursor

2018-10-17 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=108355

Michel Dänzer  changed:

   What|Removed |Added

 Attachment #142059|text/x-log  |text/plain
  mime type||

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
