Re: [Mesa-dev] [PATCH 2/4] intel/compiler: Don't propagate cmod into integer multiplies

2017-10-04 Thread Matt Turner
On Wed, Oct 4, 2017 at 4:58 PM, Jason Ekstrand  wrote:
> No shader-db change on Sky Lake.
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/compiler/brw_fs_cmod_propagation.cpp   | 17 +
>  src/intel/compiler/brw_vec4_cmod_propagation.cpp | 17 +
>  2 files changed, 34 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp 
> b/src/intel/compiler/brw_fs_cmod_propagation.cpp
> index db63e94..e8f1069 100644
> --- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
> +++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
> @@ -150,6 +150,23 @@ opt_cmod_propagation_local(const gen_device_info 
> *devinfo, bblock_t *block)
>  if (scan_inst->saturate)
> break;
>
> +/* From the Sky Lake PRM, Vol 2a, "Multiply":
> + *
> + *"When multiplying integer data types, if one of the sources
> + *is a DW, the resulting full precision data is stored in
> + *the accumulator. However, if the destination data type is
> + *either W or DW, the low bits of the result are written to
> + *the destination register and the remaining high bits are
> + *discarded. This results in undefined Overflow and Sign
> + *flags. Therefore, conditional modifiers and saturation
> + *(.sat) cannot be used in this case.

Please indent the lines in the block quote one space more than the "

Patches 1-2 are

Reviewed-by: Matt Turner 
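
For readers following along, here is a minimal C++ sketch of why the PRM text
quoted above rules out conditional modifiers on a DW multiply.  It does not
model the hardware flags; it only shows that the sign of the discarded-high-bits
result can disagree with the sign of the full-precision product.  The helper
names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Full-precision 32x32 -> 64-bit signed product (the value the PRM says is
// held in the accumulator).
static int64_t full_product(int32_t a, int32_t b)
{
   return static_cast<int64_t>(a) * static_cast<int64_t>(b);
}

// Low 32 bits of the product, i.e. what a DW destination would receive.
// The final conversion wraps modulo 2^32 on the usual two's-complement targets.
static int32_t low_dword(int32_t a, int32_t b)
{
   const uint64_t bits = static_cast<uint64_t>(full_product(a, b));
   return static_cast<int32_t>(static_cast<uint32_t>(bits));
}

// True when a sign test on the truncated result agrees with the full result.
static bool sign_matches(int32_t a, int32_t b)
{
   return (full_product(a, b) < 0) == (low_dword(a, b) < 0);
}

// Usage example: 2^16 * 2^15 = 2^31 is positive, but its low dword is
// 0x80000000, which reads as negative.
static void example()
{
   assert(!sign_matches(0x10000, 0x8000));
}
```

A comparison folded into such a multiply would therefore be testing the
truncated value, not the real product.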
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Extend the live ranges of VGRFs which leave loops

2017-10-04 Thread Jason Ekstrand
Bah!  This one's bogus too.  I think it messes up register coalesce but I'm
not 100% sure...

On Wed, Oct 4, 2017 at 8:22 PM, Jason Ekstrand  wrote:

> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/compiler/brw_fs_live_variables.cpp | 55
> 
>  1 file changed, 55 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_live_variables.cpp
> b/src/intel/compiler/brw_fs_live_variables.cpp
> index c449672..23ec280 100644
> --- a/src/intel/compiler/brw_fs_live_variables.cpp
> +++ b/src/intel/compiler/brw_fs_live_variables.cpp
> @@ -223,6 +223,61 @@ fs_live_variables::compute_start_end()
>   }
>}
> }
> +
> +   /* Due to the explicit way the SIMD data is handled on GEN, we need to be a
> +    * bit more careful with live ranges and loops.  Consider the following
> +    * example:
> +    *
> +    *    vec4 color2;
> +    *    while (1) {
> +    *       vec4 color = texture();
> +    *       if (...) {
> +    *          color2 = color * 2;
> +    *          break;
> +    *       }
> +    *    }
> +    *    gl_FragColor = color2;
> +    *
> +    * In this case, the definition of color2 dominates the use because the
> +    * loop only has the one exit.  This means that the live range interval for
> +    * color2 goes from the statement in the if to its use below the loop.
> +    * Now suppose that the texture operation has a header register that gets
> +    * assigned one of the registers used for color2.  If the loop condition is
> +    * non-uniform and some of the threads will take the break and others will
> +    * continue.  In this case, the next pass through the loop, the WE_all
> +    * setup of the header register will stomp the disabled channels of color2
> +    * and corrupt the value.
> +    *
> +    * This same problem can occur if you have a mix of 64, 32, and 16-bit
> +    * registers because the channels do not line up or if you have a SIMD16
> +    * program and the first half of one value overlaps the second half of the
> +    * other.
> +    *
> +    * To solve this problem, we take any VGRFs whose live ranges cross the
> +    * while instruction of a loop and extend their live ranges to the top of
> +    * the loop.  This more accurately models the hardware because the value in
> +    * the VGRF needs to be carried through subsequent loop iterations in order
> +    * to remain valid when we finally do break.
> +    */
> +   foreach_block (block, cfg) {
> +      if (block->end()->opcode != BRW_OPCODE_WHILE)
> +         continue;
> +
> +      /* This is a WHILE instruction. Find the DO block. */
> +      bblock_t *do_block = NULL;
> +      foreach_list_typed(bblock_link, child_link, link, &block->children) {
> +         if (child_link->block->start_ip < block->end_ip) {
> +            assert(do_block == NULL);
> +            do_block = child_link->block;
> +         }
> +      }
> +      assert(do_block);
> +
> +      for (int i = 0; i < num_vars; i++) {
> +         if (start[i] < block->end_ip && end[i] > block->end_ip)
> +            start[i] = do_block->start_ip;
> +      }
> +   }
>  }
>
>  fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
> --
> 2.5.0.400.gff86faf
>
>


Re: [Mesa-dev] [PATCH 4/4] i965/gen10: Implement Wa3DStateMode

2017-10-04 Thread Jason Ekstrand
On Wed, Oct 4, 2017 at 3:11 PM, Anuj Phogat  wrote:

> On Mon, Oct 2, 2017 at 7:46 PM, Jason Ekstrand 
> wrote:
> > On Mon, Oct 2, 2017 at 4:08 PM, Anuj Phogat 
> wrote:
> >>
> >> Cc: mesa-sta...@lists.freedesktop.org
> >> Signed-off-by: Anuj Phogat 
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_state_upload.c | 7 +--
> >>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c
> >> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> >> index a1bf54dc72..c224355a2b 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> >> @@ -88,8 +88,11 @@ brw_upload_initial_gpu_state(struct brw_context
> *brw)
> >> if (devinfo->gen == 10) {
> >>BEGIN_BATCH(2);
> >>OUT_BATCH(_3DSTATE_3D_MODE  << 16 | (2 - 2));
> >> -  OUT_BATCH(GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE << 16 |
> >> -GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE);
> >> +  /* From gen10 workaround table in h/w specs:
> >> +   * "On 3DSTATE_3D_MODE, driver must always program bits 31:16 of
> >> DW1
> >> +   *  a value of 0x"
> >> +   */
> >> +  OUT_BATCH(0x << 16 | GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE);
> >
> >
> > Bits 31:16 are the mask bits.  By programming them to 0x, you're
> making
> > it write the entire register and not just the float blend optimization
> > enable bit.  If we're going to do that, we need to figure out what
> values we
> > want in the other fields and always set them along with the float blend
> > optimization enable bit.
> >
> Right. After looking at all other fields, I don't think we want to set
> any of them except one. That field is "Slice Hashing Table Enable" which
> says:
> "For gen10, when the total number of subslices enabled is 6,8,10, or
> 12, slice hashing table must be enabled."
>
> I have no idea about slice hashing tables and I think enabling it
> should be handled in a separate patch anyways.
>

What I wonder is what we're using today.  I don't think mesa is actually
setting anything other than the default right now but Ken was looking into
it at one point.
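
To make the mask-bit point concrete, here is a small C++ sketch of a
masked-register write as described above: bits 31:16 of the dword select which
of bits 15:0 actually get updated.  This is only a model under that assumption;
the function names and the example bit are hypothetical and are not taken from
the driver or the hardware spec.

```cpp
#include <cassert>
#include <cstdint>

// Apply one masked dword to the current 16-bit register contents.  Only bits
// whose mask bit (dword bits 31:16) is set take the new value.
static uint16_t apply_masked_write(uint16_t old_value, uint32_t dw)
{
   const uint16_t mask = static_cast<uint16_t>(dw >> 16);
   const uint16_t value = static_cast<uint16_t>(dw & 0xffffu);
   return static_cast<uint16_t>((old_value & ~mask) | (value & mask));
}

// Update a single field: the mask covers only that bit.
static uint32_t write_one_bit(uint16_t bit)
{
   return (static_cast<uint32_t>(bit) << 16) | bit;
}

// All mask bits set: every field in bits 15:0 is overwritten.
static uint32_t write_whole_register(uint16_t value)
{
   return (0xffffu << 16) | value;
}

// Usage example with a hypothetical enable bit 0x0004 and other fields
// already holding 0x00f0.
static void example()
{
   assert(apply_masked_write(0x00f0, write_one_bit(0x0004)) == 0x00f4);
   assert(apply_masked_write(0x00f0, write_whole_register(0x0004)) == 0x0004);
}
```

With a full mask the other fields are reset to whatever the dword carries,
which is why their intended values have to be decided before switching to it.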


Re: [Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Andres Rodriguez

This should be good for radeonsi to implement the feature as well.

FWIW:
Reviewed-by: Andres Rodriguez 

Little bikeshed comment.

I'm a little iffy about using a mask instead of an enum for priority 
values. It limits the flexibility on the number of levels drastically. 
Since you can't really be at two different priority levels 
simultaneously, this seems like a waste.


As long as we don't have a need for more than a handful of priority 
levels this should be okay. And if the requirement changes, it can be 
dealt with in the future.
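
A rough C++ sketch of the trade-off, with made-up names (these are not the
values defined in p_defines.h): a 32-bit mask can advertise a set of supported
levels in a single cap query, but it can describe at most 32 distinct levels,
whereas an enum value names exactly one level out of a much larger range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical priority bits, one bit per level.
constexpr uint32_t PRIO_LOW    = 1u << 0;
constexpr uint32_t PRIO_MEDIUM = 1u << 1;
constexpr uint32_t PRIO_HIGH   = 1u << 2;

// Hypothetical enum alternative: one value per level, no set semantics.
enum class Priority : uint32_t { Low, Medium, High };

// Does a driver-reported mask include the requested level?
static bool supports(uint32_t mask, uint32_t level_bit)
{
   return (mask & level_bit) != 0;
}

// Number of levels a mask advertises; never more than 32 for a 32-bit mask.
static unsigned count_levels(uint32_t mask)
{
   unsigned n = 0;
   for (; mask != 0; mask &= mask - 1)
      n++;
   return n;
}

// Usage example: a driver that offers only low and high priority.
static void example()
{
   const uint32_t caps = PRIO_LOW | PRIO_HIGH;
   assert(supports(caps, PRIO_HIGH) && !supports(caps, PRIO_MEDIUM));
}
```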


Regards,
Andres

On 2017-10-04 11:44 AM, Rob Clark wrote:

Signed-off-by: Rob Clark 
---
  src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
  src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
  src/gallium/drivers/i915/i915_screen.c  |  1 +
  src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
  src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
  src/gallium/drivers/r300/r300_screen.c  |  1 +
  src/gallium/drivers/r600/r600_pipe.c|  1 +
  src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
  src/gallium/drivers/softpipe/sp_screen.c|  1 +
  src/gallium/drivers/svga/svga_screen.c  |  1 +
  src/gallium/drivers/swr/swr_screen.cpp  |  1 +
  src/gallium/drivers/vc4/vc4_screen.c|  1 +
  src/gallium/drivers/virgl/virgl_screen.c|  1 +
  src/gallium/include/pipe/p_defines.h| 21 +
  src/gallium/include/state_tracker/st_api.h  |  2 ++
  src/gallium/state_trackers/dri/dri_context.c| 11 +++
  src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
  src/mesa/state_tracker/st_manager.c |  5 +
  20 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 42905ab0620..16bd4b7c0fb 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -264,6 +264,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 case PIPE_CAP_QUERY_SO_OVERFLOW:
 case PIPE_CAP_MEMOBJ:
 case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
  
 /* Stream output. */

diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 040c2c99ec0..96866d656be 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -325,6 +325,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
  
  	case PIPE_CAP_MAX_VIEWPORTS:

diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 8411c0f15cc..7bcf479c4be 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -317,6 +317,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
 case PIPE_CAP_QUERY_SO_OVERFLOW:
 case PIPE_CAP_MEMOBJ:
 case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
  
 case PIPE_CAP_MAX_VIEWPORTS:

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 53171162a54..19411adaf07 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -360,6 +360,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
 case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
 case PIPE_CAP_MEMOBJ:
 case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
 }
 /* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index a66b4fbe67b..782ba0a64db 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -224,6 +224,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 case PIPE_CAP_QUERY_SO_OVERFLOW:
 case PIPE_CAP_MEMOBJ:
 case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
  
 case PIPE_CAP_VENDOR_ID:

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 479283e1b7c..997cb4e71dc 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -276,6 +276,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 

Re: [Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Jason Ekstrand
New patch on the list.

On Wed, Oct 4, 2017 at 7:42 PM, Jason Ekstrand  wrote:

> On Wed, Oct 4, 2017 at 5:35 PM, Jason Ekstrand 
> wrote:
>
>> On Wed, Oct 4, 2017 at 5:29 PM, Connor Abbott 
>> wrote:
>>
>>> This won't completely solve the problem. For example, what if you
>>> hoist the assignment to color2 outside the loop?
>>>
>>> vec4 color2;
>>> while (1) {
>>>vec4 color = texture();
>>>color2 = color * 2;
>>>if (...) {
>>>   break;
>>>}
>>> }
>>> gl_FragColor = color2;
>>>
>>>
>>> Now the definition still dominates the use, even with the modified
>>> control-flow graph, and you have the same problem
>>
>>
>> Curro had me convinced that some detail of the liveness analysis pass
>> saved us here but now I can't remember what. :-(
>>
>>
>>> The real problem is
>>> that the assignment to color2 is really a conditional assignment: if
>>> we're going channel-by-channel, it's not, but if you consider the
>>> *whole* register at the same time, it is. To really fix the problem,
>>> you need to model exactly what the machine actually does: you need to
>>> insert "fake" edges like these, that model the jumps that the machine
>>> can take, and you need to make every assignment a conditional
>>> assignment (i.e. it doesn't kill the register). It's probably not as
>>> bad with Curro's patch on top, though. Also, once you do this you can
>>> make register allocation more accurate by generating interferences
>>> from the liveness information directly instead of from the intervals.
>>>
>>> One thing I've thought about is, in addition to maintaining this
>>> "whole-vector" view of things, is to maintain a "per-channel" liveness
>>> that doesn't use the extra edges, partial definitions etc. and then
>>> use the "per-channel view" to calculate interference when the channels
>>> always line up.
>>>
>>
>> Yes, we've considered that and it's a good idea.  However, I'm trying to
>> fix bugs right now, not write the world's best liveness analysis pass. :-)
>>
>
> You're correct, as usual... I've inspected the result of liveness analysis
> and we do indeed get it wrong.  I'll come up with something less bogus.
>
>
>> On Wed, Oct 4, 2017 at 7:58 PM, Jason Ekstrand 
>>> wrote:
>>> > Shader-db results on Sky Lake:
>>> >
>>> > total instructions in shared programs: 12955125 -> 12953698
>>> (-0.01%)
>>> > instructions in affected programs: 55956 -> 54529 (-2.55%)
>>> > helped: 6
>>> > HURT: 38
>>> >
>>> > All of the hurt programs were hurt by exactly one instruction because
>>> > this patch affects copy propagation.  Most of the helped instructions
>>> > came from a single orbital explorer shader that was helped by 14.26%
>>> >
>>> > Cc: mesa-sta...@lists.freedesktop.org
>>> > ---
>>> >  src/intel/compiler/brw_cfg.cpp | 37 ++
>>> +--
>>> >  1 file changed, 35 insertions(+), 2 deletions(-)
>>> >
>>> > diff --git a/src/intel/compiler/brw_cfg.cpp
>>> b/src/intel/compiler/brw_cfg.cpp
>>> > index fad12ee..d8bf725 100644
>>> > --- a/src/intel/compiler/brw_cfg.cpp
>>> > +++ b/src/intel/compiler/brw_cfg.cpp
>>> > @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
>>> >   assert(cur_while != NULL);
>>> >  cur->add_successor(mem_ctx, cur_while);
>>> >
>>> > + /* We also add the next block as a successor of the break.
>>> If the
>>> > +  * break is predicated, we need to do this because the break
>>> may not
>>> > +  * be taken.  If the break is not predicated, we add it
>>> anyway so
>>> > +  * that our live intervals computations will operate as if
>>> the break
>>> > +  * may or may not be taken.  Consider the following example:
>>> > +  *
>>> > +  *vec4 color2;
>>> > +  *while (1) {
>>> > +  *   vec4 color = texture();
>>> > +  *   if (...) {
>>> > +  *  color2 = color * 2;
>>> > +  *  break;
>>> > +  *   }
>>> > +  *}
>>> > +  *gl_FragColor = color2;
>>> > +  *
>>> > +  * In this case, the definition of color2 dominates the use
>>> because
>>> > +  * the loop only has the one exit.  This means that the live
>>> range
>>> > +  * interval for color2 goes from the statement in the if to
>>> its use
>>> > +  * below the loop.  Now suppose that the texture operation
>>> has a
>>> > +  * header register that gets assigned one of the registers
>>> used for
>>> > +  * color2.  If the loop condition is non-uniform and some of
>>> the
>>> > +  * threads will take the break and others will continue.  In
>>> this
>>> > +  * case, the next pass through the loop, the WE_all setup of
>>> the
>>> > +  * header register will stomp the disabled channels of
>>> color2 and
>>> > +  * corrupt the value.
>>> > +  *
>>> > + 

[Mesa-dev] [PATCH] i965/fs: Extend the live ranges of VGRFs which leave loops

2017-10-04 Thread Jason Ekstrand
Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_fs_live_variables.cpp | 55 
 1 file changed, 55 insertions(+)

diff --git a/src/intel/compiler/brw_fs_live_variables.cpp 
b/src/intel/compiler/brw_fs_live_variables.cpp
index c449672..23ec280 100644
--- a/src/intel/compiler/brw_fs_live_variables.cpp
+++ b/src/intel/compiler/brw_fs_live_variables.cpp
@@ -223,6 +223,61 @@ fs_live_variables::compute_start_end()
  }
   }
}
+
+   /* Due to the explicit way the SIMD data is handled on GEN, we need to be a
+    * bit more careful with live ranges and loops.  Consider the following
+    * example:
+    *
+    *    vec4 color2;
+    *    while (1) {
+    *       vec4 color = texture();
+    *       if (...) {
+    *          color2 = color * 2;
+    *          break;
+    *       }
+    *    }
+    *    gl_FragColor = color2;
+    *
+    * In this case, the definition of color2 dominates the use because the
+    * loop only has the one exit.  This means that the live range interval for
+    * color2 goes from the statement in the if to its use below the loop.
+    * Now suppose that the texture operation has a header register that gets
+    * assigned one of the registers used for color2.  If the loop condition is
+    * non-uniform and some of the threads will take the break and others will
+    * continue.  In this case, the next pass through the loop, the WE_all
+    * setup of the header register will stomp the disabled channels of color2
+    * and corrupt the value.
+    *
+    * This same problem can occur if you have a mix of 64, 32, and 16-bit
+    * registers because the channels do not line up or if you have a SIMD16
+    * program and the first half of one value overlaps the second half of the
+    * other.
+    *
+    * To solve this problem, we take any VGRFs whose live ranges cross the
+    * while instruction of a loop and extend their live ranges to the top of
+    * the loop.  This more accurately models the hardware because the value in
+    * the VGRF needs to be carried through subsequent loop iterations in order
+    * to remain valid when we finally do break.
+    */
+   foreach_block (block, cfg) {
+      if (block->end()->opcode != BRW_OPCODE_WHILE)
+         continue;
+
+      /* This is a WHILE instruction. Find the DO block. */
+      bblock_t *do_block = NULL;
+      foreach_list_typed(bblock_link, child_link, link, &block->children) {
+         if (child_link->block->start_ip < block->end_ip) {
+            assert(do_block == NULL);
+            do_block = child_link->block;
+         }
+      }
+      assert(do_block);
+
+      for (int i = 0; i < num_vars; i++) {
+         if (start[i] < block->end_ip && end[i] > block->end_ip)
+            start[i] = do_block->start_ip;
+      }
+   }
 }
 
 fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
-- 
2.5.0.400.gff86faf
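
The interval adjustment in this patch can be sketched outside the compiler as
plain C++ over arrays of instruction indices.  This is an illustration only:
LoopSpan and extend_live_ranges are hypothetical names, and unlike the patch
the sketch uses std::min so that a range which already starts before the loop
is not shortened (that is an assumption of the sketch, not something the patch
does).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Instruction indices of a loop's DO (top) and WHILE (bottom).
struct LoopSpan {
   int do_ip;
   int while_ip;
};

// For every variable whose live interval [start, end] crosses the WHILE,
// pull the start of the interval back to the top of the loop.
static void extend_live_ranges(std::vector<int> &start,
                               const std::vector<int> &end,
                               const LoopSpan &loop)
{
   assert(start.size() == end.size());
   for (std::size_t i = 0; i < start.size(); i++) {
      if (start[i] < loop.while_ip && end[i] > loop.while_ip)
         start[i] = std::min(start[i], loop.do_ip);
   }
}
```

For a loop spanning instructions 5..20, a value defined at 12 and used at 40
ends up with the interval [5, 40], so nothing else can be allocated to its
register on later iterations.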



Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Matt Arsenault

> On Oct 4, 2017, at 12:50, Marek Olšák  wrote:
> 
> The LLVM backends selects MAD (unfused) for fmuladd, and FMA (fused) for fma.

For f64 and f16 by default it will emit an FMA since mad doesn’t support 
denorms.
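
As background on fused versus unfused multiply-add, here is a small C++ sketch
using ordinary IEEE doubles on the CPU.  It says nothing about the MAD or FMA
GPU instructions themselves; it only shows that rounding the product before the
add (unfused) can give a different answer from rounding once at the end
(fused).  The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Unfused: the product is rounded to double, then the sum is rounded again.
// The volatile store keeps the compiler from contracting this into an FMA.
static double unfused_mad(double a, double b, double c)
{
   volatile double product = a * b;
   return product + c;
}

// Fused: a * b + c is computed exactly and rounded once.
static double fused_fma(double a, double b, double c)
{
   return std::fma(a, b, c);
}

// Usage example: a * a = 1 + 2^-26 + 2^-54 exactly.  The last term is lost
// when the product is rounded to double, but survives in the fused form.
static void example()
{
   const double a = 1.0 + std::ldexp(1.0, -27);
   const double c = -(1.0 + std::ldexp(1.0, -26));
   assert(unfused_mad(a, a, c) == 0.0);
   assert(fused_fma(a, a, c) == std::ldexp(1.0, -54));
}
```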


Re: [Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Jason Ekstrand
On Wed, Oct 4, 2017 at 5:35 PM, Jason Ekstrand  wrote:

> On Wed, Oct 4, 2017 at 5:29 PM, Connor Abbott  wrote:
>
>> This won't completely solve the problem. For example, what if you
>> hoist the assignment to color2 outside the loop?
>>
>> vec4 color2;
>> while (1) {
>>vec4 color = texture();
>>color2 = color * 2;
>>if (...) {
>>   break;
>>}
>> }
>> gl_FragColor = color2;
>>
>>
>> Now the definition still dominates the use, even with the modified
>> control-flow graph, and you have the same problem
>
>
> Curro had me convinced that some detail of the liveness analysis pass
> saved us here but now I can't remember what. :-(
>
>
>> The real problem is
>> that the assignment to color2 is really a conditional assignment: if
>> we're going channel-by-channel, it's not, but if you consider the
>> *whole* register at the same time, it is. To really fix the problem,
>> you need to model exactly what the machine actually does: you need to
>> insert "fake" edges like these, that model the jumps that the machine
>> can take, and you need to make every assignment a conditional
>> assignment (i.e. it doesn't kill the register). It's probably not as
>> bad with Curro's patch on top, though. Also, once you do this you can
>> make register allocation more accurate by generating interferences
>> from the liveness information directly instead of from the intervals.
>>
>> One thing I've thought about is, in addition to maintaining this
>> "whole-vector" view of things, is to maintain a "per-channel" liveness
>> that doesn't use the extra edges, partial definitions etc. and then
>> use the "per-channel view" to calculate interference when the channels
>> always line up.
>>
>
> Yes, we've considered that and it's a good idea.  However, I'm trying to
> fix bugs right now, not write the world's best liveness analysis pass. :-)
>

You're correct, as usual... I've inspected the result of liveness analysis
and we do indeed get it wrong.  I'll come up with something less bogus.


> On Wed, Oct 4, 2017 at 7:58 PM, Jason Ekstrand 
>> wrote:
>> > Shader-db results on Sky Lake:
>> >
>> > total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
>> > instructions in affected programs: 55956 -> 54529 (-2.55%)
>> > helped: 6
>> > HURT: 38
>> >
>> > All of the hurt programs were hurt by exactly one instruction because
>> > this patch affects copy propagation.  Most of the helped instructions
>> > came from a single orbital explorer shader that was helped by 14.26%
>> >
>> > Cc: mesa-sta...@lists.freedesktop.org
>> > ---
>> >  src/intel/compiler/brw_cfg.cpp | 37 ++
>> +--
>> >  1 file changed, 35 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/intel/compiler/brw_cfg.cpp
>> b/src/intel/compiler/brw_cfg.cpp
>> > index fad12ee..d8bf725 100644
>> > --- a/src/intel/compiler/brw_cfg.cpp
>> > +++ b/src/intel/compiler/brw_cfg.cpp
>> > @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
>> >   assert(cur_while != NULL);
>> >  cur->add_successor(mem_ctx, cur_while);
>> >
>> > + /* We also add the next block as a successor of the break.
>> If the
>> > +  * break is predicated, we need to do this because the break
>> may not
>> > +  * be taken.  If the break is not predicated, we add it
>> anyway so
>> > +  * that our live intervals computations will operate as if
>> the break
>> > +  * may or may not be taken.  Consider the following example:
>> > +  *
>> > +  *vec4 color2;
>> > +  *while (1) {
>> > +  *   vec4 color = texture();
>> > +  *   if (...) {
>> > +  *  color2 = color * 2;
>> > +  *  break;
>> > +  *   }
>> > +  *}
>> > +  *gl_FragColor = color2;
>> > +  *
>> > +  * In this case, the definition of color2 dominates the use
>> because
>> > +  * the loop only has the one exit.  This means that the live
>> range
>> > +  * interval for color2 goes from the statement in the if to
>> its use
>> > +  * below the loop.  Now suppose that the texture operation
>> has a
>> > +  * header register that gets assigned one of the registers
>> used for
>> > +  * color2.  If the loop condition is non-uniform and some of
>> the
>> > +  * threads will take the break and others will continue.  In
>> this
>> > +  * case, the next pass through the loop, the WE_all setup of
>> the
>> > +  * header register will stomp the disabled channels of color2
>> and
>> > +  * corrupt the value.
>> > +  *
>> > +  * This same problem can occur if you have a mix of 64, 32,
>> and
>> > +  * 16-bit registers because the channels do not line up or if
>> you
>> > +  * have a SIMD16 program and the first half of one value
>> 

Re: [Mesa-dev] [PATCH 10/21] i965: Only add the wpos state reference if we lowered something

2017-10-04 Thread Jordan Justen
On 2017-09-29 14:25:10, Jason Ekstrand wrote:
> Otherwise, in the ARB program case _mesa_add_state_reference may grow
> the parameter array which will cause brw_nir_setup_arb_uniforms to write
> past the end of the param array because it only looks at the parameter
> list length but the param array is allocated based on nir->num_uniforms.
> The only reason this hasn't caused us problems is because we are padding
> out the param array for fragment programs unnecessarily.
> ---
>  src/mesa/drivers/dri/i965/brw_program.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
> b/src/mesa/drivers/dri/i965/brw_program.c
> index ee464fc..7eec6f7 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -88,8 +88,6 @@ brw_create_nir(struct brw_context *brw,
> }
> nir_validate_shader(nir);
>  
> -   (void)progress;
> -
> nir = brw_preprocess_nir(brw->screen->compiler, nir);
>  
> if (stage == MESA_SHADER_FRAGMENT) {
> @@ -98,10 +96,13 @@ brw_create_nir(struct brw_context *brw,
>   .fs_coord_pixel_center_integer = 1,
>   .fs_coord_origin_upper_left = 1,
>};
> -  _mesa_add_state_reference(prog->Parameters,
> -(gl_state_index *) 
> wpos_options.state_tokens);
>  
> +  progress = false;

Should we move the `progress` declaration here?

>NIR_PASS(progress, nir, nir_lower_wpos_ytransform, &wpos_options);
> +  if (progress) {
> + _mesa_add_state_reference(prog->Parameters,
> +   (gl_state_index *) 
> wpos_options.state_tokens);
> +  }
> }
>  
> NIR_PASS(progress, nir, nir_lower_system_values);

And convert this to NIR_PASS_V?

-Jordan

> -- 
> 2.5.0.400.gff86faf
> 


Re: [Mesa-dev] [PATCH 06/21] i965: Store image_param in brw_context instead of prog_data

2017-10-04 Thread Jordan Justen
On 2017-09-29 14:25:06, Jason Ekstrand wrote:
> This burns an extra 10k of memory or so in the case where you don't have
> any images.  However, if you have several shaders which use images, this
> should be much less memory.  It also gets rid of a part of prog_data
> that really has nothing to do with the compiler.
> ---
>  src/intel/compiler/brw_compiler.h|  4 
>  src/intel/vulkan/anv_pipeline_cache.c|  6 ++
>  src/mesa/drivers/dri/i965/brw_context.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_cs.c   |  4 
>  src/mesa/drivers/dri/i965/brw_curbe.c|  4 ++--
>  src/mesa/drivers/dri/i965/brw_gs.c   |  4 
>  src/mesa/drivers/dri/i965/brw_program.c  |  1 -
>  src/mesa/drivers/dri/i965/brw_state.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_tcs.c  |  5 -
>  src/mesa/drivers/dri/i965/brw_tes.c  |  4 
>  src/mesa/drivers/dri/i965/brw_vs.c   |  5 -
>  src/mesa/drivers/dri/i965/brw_wm.c   |  4 
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
>  src/mesa/drivers/dri/i965/gen6_constant_state.c  | 19 +--
>  14 files changed, 17 insertions(+), 49 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_compiler.h 
> b/src/intel/compiler/brw_compiler.h
> index a415c44..04160aa 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -575,7 +575,6 @@ struct brw_stage_prog_data {
>  
> GLuint nr_params;   /**< number of float params/constants */
> GLuint nr_pull_params;
> -   unsigned nr_image_params;
>  
> unsigned curb_read_length;
> unsigned total_scratch;
> @@ -597,9 +596,6 @@ struct brw_stage_prog_data {
>  */
> uint32_t *param;
> uint32_t *pull_param;
> -
> -   /** Image metadata passed to the shader as uniforms. */
> -   struct brw_image_param *image_param;
>  };
>  
>  static inline void
> diff --git a/src/intel/vulkan/anv_pipeline_cache.c 
> b/src/intel/vulkan/anv_pipeline_cache.c
> index c3a62f5..b75dd7e 100644
> --- a/src/intel/vulkan/anv_pipeline_cache.c
> +++ b/src/intel/vulkan/anv_pipeline_cache.c
> @@ -76,7 +76,6 @@ anv_shader_bin_create(struct anv_device *device,
> data += align_u32(prog_data_size, 8);
>  
> assert(prog_data->nr_pull_params == 0);
> -   assert(prog_data->nr_image_params == 0);
> new_prog_data->param = data;
> uint32_t param_size = prog_data->nr_params * sizeof(void *);
> memcpy(data, prog_data_param, param_size);
> @@ -141,9 +140,8 @@ anv_shader_bin_write_data(const struct anv_shader_bin 
> *shader, void *data)
>   *
>   * - Review prog_data struct for size and cacheability: struct
>   *   brw_stage_prog_data has binding_table which uses a lot of uint32_t for 8
> - *   bit quantities etc; param, pull_param, and image_params are pointers, we
> - *   just need the compation map. use bit fields for all bools, eg
> - *   dual_src_blend.
> + *   bit quantities etc; param and pull_param are pointers, we just need the
> + *   compation map. use bit fields for all bools, eg dual_src_blend.

old/new has 'compation' typo

-Jordan

>   */
>  
>  static uint32_t
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index bc3d3e3..bb8588d 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -580,6 +580,8 @@ struct brw_stage_state
> uint32_t sampler_count;
> uint32_t sampler_offset;
>  
> +   struct brw_image_param image_param[BRW_MAX_IMAGES];
> +
> /** Need to re-emit 3DSTATE_CONSTANT_XS? */
> bool push_constants_dirty;
>  };
> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
> b/src/mesa/drivers/dri/i965/brw_cs.c
> index 68fca09..0c505b3 100644
> --- a/src/mesa/drivers/dri/i965/brw_cs.c
> +++ b/src/mesa/drivers/dri/i965/brw_cs.c
> @@ -91,11 +91,7 @@ brw_codegen_cs_prog(struct brw_context *brw,
> param_count += 2 * 
> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
> prog_data.base.param = rzalloc_array(NULL, uint32_t, param_count);
> prog_data.base.pull_param = rzalloc_array(NULL, uint32_t, param_count);
> -   prog_data.base.image_param =
> -  rzalloc_array(NULL, struct brw_image_param,
> -cp->program.info.num_images);
> prog_data.base.nr_params = param_count;
> -   prog_data.base.nr_image_params = cp->program.info.num_images;
>  
> brw_nir_setup_glsl_uniforms(cp->program.nir, &cp->program, &prog_data.base,
> true);
> diff --git a/src/mesa/drivers/dri/i965/brw_curbe.c 
> b/src/mesa/drivers/dri/i965/brw_curbe.c
> index 9a9c6d0..c747110 100644
> --- a/src/mesa/drivers/dri/i965/brw_curbe.c
> +++ b/src/mesa/drivers/dri/i965/brw_curbe.c
> @@ -227,7 +227,7 @@ brw_upload_constant_buffer(struct brw_context *brw)
>GLuint offset = brw->curbe.wm_start * 16;
>  
>/* BRW_NEW_FS_PROG_DATA | _NEW_PROGRAM_CONSTANTS: copy uniform 

Re: [Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Connor Abbott
On Wed, Oct 4, 2017 at 8:35 PM, Jason Ekstrand  wrote:
> On Wed, Oct 4, 2017 at 5:29 PM, Connor Abbott  wrote:
>>
>> This won't completely solve the problem. For example, what if you
>> hoist the assignment to color2 outside the loop?
>>
>> vec4 color2;
>> while (1) {
>>vec4 color = texture();
>>color2 = color * 2;
>>if (...) {
>>   break;
>>}
>> }
>> gl_FragColor = color2;
>>
>>
>> Now the definition still dominates the use, even with the modified
>> control-flow graph, and you have the same problem
>
>
> Curro had me convinced that some detail of the liveness analysis pass saved
> us here but now I can't remember what. :-(
>
>>
>> The real problem is
>> that the assignment to color2 is really a conditional assignment: if
>> we're going channel-by-channel, it's not, but if you consider the
>> *whole* register at the same time, it is. To really fix the problem,
>> you need to model exactly what the machine actually does: you need to
>> insert "fake" edges like these, that model the jumps that the machine
>> can take, and you need to make every assignment a conditional
>> assignment (i.e. it doesn't kill the register). It's probably not as
>> bad with Curro's patch on top, though. Also, once you do this you can
>> make register allocation more accurate by generating interferences
>> from the liveness information directly instead of from the intervals.
>>
>> One thing I've thought about is, in addition to maintaining this
>> "whole-vector" view of things, is to maintain a "per-channel" liveness
>> that doesn't use the extra edges, partial definitions etc. and then
>> use the "per-channel view" to calculate interference when the channels
>> always line up.
>
>
> Yes, we've considered that and it's a good idea.  However, I'm trying to fix
> bugs right now, not write the world's best liveness analysis pass. :-)

That's fair, although just implementing the first bit shouldn't be too hard.

>
>>
>> On Wed, Oct 4, 2017 at 7:58 PM, Jason Ekstrand 
>> wrote:
>> > Shader-db results on Sky Lake:
>> >
>> > total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
>> > instructions in affected programs: 55956 -> 54529 (-2.55%)
>> > helped: 6
>> > HURT: 38
>> >
>> > All of the hurt programs were hurt by exactly one instruction because
>> > this patch affects copy propagation.  Most of the helped instructions
>> > came from a single orbital explorer shader that was helped by 14.26%
>> >
>> > Cc: mesa-sta...@lists.freedesktop.org
>> > ---
>> >  src/intel/compiler/brw_cfg.cpp | 37 +++--
>> >  1 file changed, 35 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
>> > index fad12ee..d8bf725 100644
>> > --- a/src/intel/compiler/brw_cfg.cpp
>> > +++ b/src/intel/compiler/brw_cfg.cpp
>> > @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
>> >   assert(cur_while != NULL);
>> >  cur->add_successor(mem_ctx, cur_while);
>> >
>> > + /* We also add the next block as a successor of the break.  If the
>> > +  * break is predicated, we need to do this because the break may not
>> > +  * be taken.  If the break is not predicated, we add it anyway so
>> > +  * that our live intervals computations will operate as if the break
>> > +  * may or may not be taken.  Consider the following example:
>> > +  *
>> > +  *vec4 color2;
>> > +  *while (1) {
>> > +  *   vec4 color = texture();
>> > +  *   if (...) {
>> > +  *  color2 = color * 2;
>> > +  *  break;
>> > +  *   }
>> > +  *}
>> > +  *gl_FragColor = color2;
>> > +  *
>> > +  * In this case, the definition of color2 dominates the use because
>> > +  * the loop only has the one exit.  This means that the live range
>> > +  * interval for color2 goes from the statement in the if to its use
>> > +  * below the loop.  Now suppose that the texture operation has a
>> > +  * header register that gets assigned one of the registers used for
>> > +  * color2.  If the loop condition is non-uniform, some of the
>> > +  * threads will take the break and others will continue.  In this
>> > +  * case, on the next pass through the loop, the WE_all setup of the
>> > +  * header register will stomp the disabled channels of color2 and
>> > +  * corrupt the value.
>> > +  *
>> > +  * This same problem can occur if you have a mix of 64, 32, and
>> > +  * 16-bit registers because the channels do not line up or if you
>> > +  * have a SIMD16 program and the first half of one value overlaps the
>> > +  * 

Re: [Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Jason Ekstrand
On Wed, Oct 4, 2017 at 5:29 PM, Connor Abbott  wrote:

> This won't completely solve the problem. For example, what if you
> hoist the assignment to color2 outside the loop?
>
> vec4 color2;
> while (1) {
>vec4 color = texture();
>color2 = color * 2;
>if (...) {
>   break;
>}
> }
> gl_FragColor = color2;
>
>
> Now the definition still dominates the use, even with the modified
> control-flow graph, and you have the same problem


Curro had me convinced that some detail of the liveness analysis pass saved
us here but now I can't remember what. :-(


> The real problem is
> that the assignment to color2 is really a conditional assignment: if
> we're going channel-by-channel, it's not, but if you consider the
> *whole* register at the same time, it is. To really fix the problem,
> you need to model exactly what the machine actually does: you need to
> insert "fake" edges like these, that model the jumps that the machine
> can take, and you need to make every assignment a conditional
> assignment (i.e. it doesn't kill the register). It's probably not as
> bad with Curro's patch on top, though. Also, once you do this you can
> make register allocation more accurate by generating interferences
> from the liveness information directly instead of from the intervals.
>
> One thing I've thought about is, in addition to maintaining this
> "whole-vector" view of things, is to maintain a "per-channel" liveness
> that doesn't use the extra edges, partial definitions etc. and then
> use the "per-channel view" to calculate interference when the channels
> always line up.
>

Yes, we've considered that and it's a good idea.  However, I'm trying to
fix bugs right now, not write the world's best liveness analysis pass. :-)


> On Wed, Oct 4, 2017 at 7:58 PM, Jason Ekstrand 
> wrote:
> > Shader-db results on Sky Lake:
> >
> > total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
> > instructions in affected programs: 55956 -> 54529 (-2.55%)
> > helped: 6
> > HURT: 38
> >
> > All of the hurt programs were hurt by exactly one instruction because
> > this patch affects copy propagation.  Most of the helped instructions
> > came from a single orbital explorer shader that was helped by 14.26%
> >
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/intel/compiler/brw_cfg.cpp | 37 +++--
> >  1 file changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
> > index fad12ee..d8bf725 100644
> > --- a/src/intel/compiler/brw_cfg.cpp
> > +++ b/src/intel/compiler/brw_cfg.cpp
> > @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
> >   assert(cur_while != NULL);
> >  cur->add_successor(mem_ctx, cur_while);
> >
> > + /* We also add the next block as a successor of the break.  If the
> > +  * break is predicated, we need to do this because the break may not
> > +  * be taken.  If the break is not predicated, we add it anyway so
> > +  * that our live intervals computations will operate as if the break
> > +  * may or may not be taken.  Consider the following example:
> > +  *
> > +  *vec4 color2;
> > +  *while (1) {
> > +  *   vec4 color = texture();
> > +  *   if (...) {
> > +  *  color2 = color * 2;
> > +  *  break;
> > +  *   }
> > +  *}
> > +  *gl_FragColor = color2;
> > +  *
> > +  * In this case, the definition of color2 dominates the use because
> > +  * the loop only has the one exit.  This means that the live range
> > +  * interval for color2 goes from the statement in the if to its use
> > +  * below the loop.  Now suppose that the texture operation has a
> > +  * header register that gets assigned one of the registers used for
> > +  * color2.  If the loop condition is non-uniform, some of the
> > +  * threads will take the break and others will continue.  In this
> > +  * case, on the next pass through the loop, the WE_all setup of the
> > +  * header register will stomp the disabled channels of color2 and
> > +  * corrupt the value.
> > +  *
> > +  * This same problem can occur if you have a mix of 64, 32, and
> > +  * 16-bit registers because the channels do not line up or if you
> > +  * have a SIMD16 program and the first half of one value overlaps the
> > +  * second half of the other.  To solve it, we simply treat the break
> > +  * as if it may also continue on because some of the threads may
> > +  * continue on.
> > +  */
> >  next = new_block();
> > -if (inst->predicate)
> > -   cur->add_successor(mem_ctx, next);
> > +

Re: [Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Connor Abbott
This won't completely solve the problem. For example, what if you
hoist the assignment to color2 outside the loop?

vec4 color2;
while (1) {
   vec4 color = texture();
   color2 = color * 2;
   if (...) {
      break;
   }
}
gl_FragColor = color2;


Now the definition still dominates the use, even with the modified
control-flow graph, and you have the same problem. The real problem is
that the assignment to color2 is really a conditional assignment: if
we're going channel-by-channel, it's not, but if you consider the
*whole* register at the same time, it is. To really fix the problem,
you need to model exactly what the machine actually does: you need to
insert "fake" edges like these, that model the jumps that the machine
can take, and you need to make every assignment a conditional
assignment (i.e. it doesn't kill the register). It's probably not as
bad with Curro's patch on top, though. Also, once you do this you can
make register allocation more accurate by generating interferences
from the liveness information directly instead of from the intervals.

One thing I've thought about is, in addition to maintaining this
"whole-vector" view of things, is to maintain a "per-channel" liveness
that doesn't use the extra edges, partial definitions etc. and then
use the "per-channel view" to calculate interference when the channels
always line up.
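
A minimal sketch of the point about conditional assignments, not the
i965 liveness pass itself.  It runs a textbook backward liveness
analysis over the hoisted example twice: once treating the write to
color2 as a kill, and once treating every write as conditional, so that
it never kills.  Block, compute_liveness() and hoisted_example() are
names made up for this illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxVars = 8;
using VarSet = std::bitset<kMaxVars>;

enum Var { COLOR = 0, COLOR2 = 1 };

struct Block {
   VarSet use;                   // variables read before any write in the block
   VarSet def;                   // variables written in the block
   std::vector<int> successors;  // indices of successor blocks
   VarSet livein, liveout;       // results of the analysis
};

// Backward liveness to a fixed point.  When writes_kill is false, every
// write is treated as a conditional assignment that does not end the
// liveness of the previous value.
void compute_liveness(std::vector<Block> &blocks, bool writes_kill)
{
   bool progress;
   do {
      progress = false;
      for (int b = static_cast<int>(blocks.size()) - 1; b >= 0; b--) {
         Block &blk = blocks[b];

         VarSet out;
         for (int s : blk.successors)
            out |= blocks[s].livein;

         const VarSet kill = writes_kill ? blk.def : VarSet();
         const VarSet in = blk.use | (out & ~kill);

         if (in != blk.livein || out != blk.liveout) {
            blk.livein = in;
            blk.liveout = out;
            progress = true;
         }
      }
   } while (progress);
}

// Block 0: entry.  Block 1: loop body, writes color and color2, may
// break.  Block 2: after the loop, reads color2.
std::vector<Block> hoisted_example()
{
   std::vector<Block> blocks(3);
   blocks[0].successors = {1};
   blocks[1].def.set(COLOR);
   blocks[1].def.set(COLOR2);
   blocks[1].successors = {1, 2};  // back edge and loop exit
   blocks[2].use.set(COLOR2);
   return blocks;
}
```

With writes_kill set to true, color2 is not live at the top of the loop
body, so an allocator would be free to reuse its register there.  With
writes_kill set to false, color2 stays live around the back edge, and
also all the way up to the entry block, which appears to be the kind of
over-extension that patch 4/4 of this series tries to trim.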


On Wed, Oct 4, 2017 at 7:58 PM, Jason Ekstrand  wrote:
> Shader-db results on Sky Lake:
>
> total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
> instructions in affected programs: 55956 -> 54529 (-2.55%)
> helped: 6
> HURT: 38
>
> All of the hurt programs were hurt by exactly one instruction because
> this patch affects copy propagation.  Most of the helped instructions
> came from a single orbital explorer shader that was helped by 14.26%
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/compiler/brw_cfg.cpp | 37 +++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
> index fad12ee..d8bf725 100644
> --- a/src/intel/compiler/brw_cfg.cpp
> +++ b/src/intel/compiler/brw_cfg.cpp
> @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
>   assert(cur_while != NULL);
>  cur->add_successor(mem_ctx, cur_while);
>
> + /* We also add the next block as a successor of the break.  If the
> +  * break is predicated, we need to do this because the break may not
> +  * be taken.  If the break is not predicated, we add it anyway so
> +  * that our live intervals computations will operate as if the break
> +  * may or may not be taken.  Consider the following example:
> +  *
> +  *vec4 color2;
> +  *while (1) {
> +  *   vec4 color = texture();
> +  *   if (...) {
> +  *  color2 = color * 2;
> +  *  break;
> +  *   }
> +  *}
> +  *gl_FragColor = color2;
> +  *
> +  * In this case, the definition of color2 dominates the use because
> +  * the loop only has the one exit.  This means that the live range
> +  * interval for color2 goes from the statement in the if to its use
> +  * below the loop.  Now suppose that the texture operation has a
> +  * header register that gets assigned one of the registers used for
> +  * color2.  If the loop condition is non-uniform, some of the
> +  * threads will take the break and others will continue.  In this
> +  * case, on the next pass through the loop, the WE_all setup of the
> +  * header register will stomp the disabled channels of color2 and
> +  * corrupt the value.
> +  *
> +  * This same problem can occur if you have a mix of 64, 32, and
> +  * 16-bit registers because the channels do not line up or if you
> +  * have a SIMD16 program and the first half of one value overlaps the
> +  * second half of the other.  To solve it, we simply treat the break
> +  * as if it may also continue on because some of the threads may
> +  * continue on.
> +  */
>  next = new_block();
> -if (inst->predicate)
> -   cur->add_successor(mem_ctx, next);
> +cur->add_successor(mem_ctx, next);
>
>  set_next_block(&cur, next, ip);
>  break;
> --
> 2.5.0.400.gff86faf
>


Re: [Mesa-dev] [PATCH 4/6] meson: build gbm

2017-10-04 Thread Dylan Baker
Quoting Eric Anholt (2017-10-04 15:04:28)
> Dylan Baker  writes:
> 
> > This doesn't include egl support, just dri support.
> >
> > Signed-off-by: Dylan Baker 
> > ---
> >  meson.build | 49 +---
> >  meson_options.txt   | 14 +
> >  src/{loader => gbm}/meson.build | 63 
> > -
> >  src/glx/meson.build | 10 +++
> >  src/loader/meson.build  |  2 ++
> >  src/mesa/meson.build|  2 +-
> >  src/meson.build |  4 ++-
> >  7 files changed, 95 insertions(+), 49 deletions(-)
> >  copy src/{loader => gbm}/meson.build (50%)
> >
> > diff --git a/meson.build b/meson.build
> > index ec50e10b38c..185d70509c5 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -54,19 +54,6 @@ with_any_opengl = with_opengl or with_gles1 or with_gles2
> >  # Only build shared_glapi if at least one OpenGL API is enabled
> >  with_shared_glapi = get_option('shared-glapi') and with_any_opengl
> >  
> > -with_dri3 = get_option('dri3')
> > -if with_dri3 == 'auto'
> > -  if host_machine.system() == 'linux'
> > -with_dri3 = true
> > -  else
> > -with_dri3 = false
> > - endif
> > -elif with_dri3 == 'yes'
> > -  with_dri3 = true
> > -else
> > -  with_dri3 = false
> > -endif
> > -
> >  # TODO: these will need options, but at the moment they just control header
> >  # installs
> >  with_osmesa = false
> > @@ -107,6 +94,27 @@ with_dri_platform = 'drm'
> >  with_gallium = false
> >  # TODO: gallium drivers
> >  
> > +# TODO: conditionalize libdrm requirement
> > +dep_libdrm = dependency('libdrm', version : '>= 2.4.75')
> > +pre_args += '-DHAVE_LIBDRM'
> > +
> > +with_dri2 = with_dri and with_dri_platform == 'drm' and dep_libdrm.found()
> > +with_dri3 = get_option('dri3')
> > +if with_dri3 == 'auto'
> > +  if host_machine.system() == 'linux' and with_dri2
> > +with_dri3 = true
> > +  else
> > +with_dri3 = false
> > + endif
> > +elif with_dri3 == 'yes'
> > +  if not with_dri2
> > +error('dri3 support requires libdrm')
> > +  endif
> > +  with_dri3 = true
> > +else
> > +  with_dri3 = false
> > +endif
> 
> It would be great if the hunk could appear in its ultimate position
> earlier in the series.
> 
> 
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 130d3962db7..b6d44c44ba9 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -32,17 +32,19 @@ option('shader-cache',type : 'boolean', value : 
> > true,
> > description : 'Build with on-disk shader cache support')
> >  option('vulkan-icd-dir', type : 'string',  value : '',
> > description : 'Location relative to prefix to put vulkan icds on 
> > install. Default: $datadir/vulkan/icd.d')
> > -option('shared-glapi',   type : 'boolean', value : true,
> > +option('shared-glapi',type : 'boolean', value : true,
> > description : 'Whether to build a shared or static glapi')
> 
> More stray changes.
> 
> > -option('gles1',  type : 'boolean', value : true,
> > +option('gles1',   type : 'boolean', value : true,
> > description : 'Build support for OpenGL ES 1.x')
> > -option('gles2',  type : 'boolean', value : true,
> > +option('gles2',   type : 'boolean', value : true,
> > description : 'Build support for OpenGL ES 2.x and 3.x')
> > -option('opengl', type : 'boolean', value : true,
> > +option('opengl',  type : 'boolean', value : true,
> > description : 'Build support for OpenGL (all versions)')
> > -option('glx',type : 'combo',   value : 'auto', choices : 
> > ['auto', 'disabled', 'dri', 'xlib', 'gallium-xlib'],
> > +option('gbm', type : 'combo',   value : 'auto', choices : 
> > ['auto', 'yes', 'no'],
> > +   description : 'Build support for gbm platform')
> > +option('glx', type : 'combo',   value : 'auto', choices : 
> > ['auto', 'disabled', 'dri', 'xlib', 'gallium-xlib'],
> > description : 'Build support for GLX platform')
> > -option('glvnd',  type : 'boolean', vaule : false,
> > +option('glvnd',   type : 'boolean', value : false,
> > description : 'Enable GLVND support.')
> 
> Stray changes.
> 
> > diff --git a/src/glx/meson.build b/src/glx/meson.build
> > index 6b6e9095740..8c1b29a9ff8 100644
> > --- a/src/glx/meson.build
> > +++ b/src/glx/meson.build
> > @@ -106,8 +106,6 @@ elif with_windowsdri
> >#]
> >  endif
> >  
> > -# TODO: libglvnd
> > -
> >  dri_driver_dir = join_paths(get_option('prefix'), with_dri_drivers_path)
> >  if not with_glvnd
> >gl_lib_name = 'GL'
> > @@ -137,8 +135,8 @@ libglx = static_library(
> >[files_libglx, glx_indirect_c, glx_indirect_h, glx_indirect_init_c,
> > glx_indirect_size_c, glx_indirect_size_h],
> >include_directories : [
> > -inc_common, inc_glapi,
> > -include_directories('../loader', '../../include/GL/internal')
> > 

[Mesa-dev] [PATCH 2/4] intel/compiler: Don't propagate cmod into integer multiplies

2017-10-04 Thread Jason Ekstrand
No shader-db change on Sky Lake.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_fs_cmod_propagation.cpp   | 17 +
 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 17 +
 2 files changed, 34 insertions(+)

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp 
b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index db63e94..e8f1069 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -150,6 +150,23 @@ opt_cmod_propagation_local(const gen_device_info *devinfo, 
bblock_t *block)
 if (scan_inst->saturate)
break;
 
+/* From the Sky Lake PRM, Vol 2a, "Multiply":
+ *
+ *"When multiplying integer data types, if one of the sources
+ *is a DW, the resulting full precision data is stored in
+ *the accumulator. However, if the destination data type is
+ *either W or DW, the low bits of the result are written to
+ *the destination register and the remaining high bits are
+ *discarded. This results in undefined Overflow and Sign
+ *flags. Therefore, conditional modifiers and saturation
+ *(.sat) cannot be used in this case.
+ *
+ * We just disallow cmod propagation on all integer multiplies.
+ */
+if (!brw_reg_type_is_floating_point(scan_inst->dst.type) &&
+scan_inst->opcode == BRW_OPCODE_MUL)
+   break;
+
 /* Otherwise, try propagating the conditional. */
 enum brw_conditional_mod cond =
inst->src[0].negate ? brw_swap_cmod(inst->conditional_mod)
diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp 
b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 05e6516..0d72d82 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -137,6 +137,23 @@ opt_cmod_propagation_local(bblock_t *block)
 if (scan_inst->saturate)
break;
 
+/* From the Sky Lake PRM, Vol 2a, "Multiply":
+ *
+ *"When multiplying integer data types, if one of the sources
+ *is a DW, the resulting full precision data is stored in
+ *the accumulator. However, if the destination data type is
+ *either W or DW, the low bits of the result are written to
+ *the destination register and the remaining high bits are
+ *discarded. This results in undefined Overflow and Sign
+ *flags. Therefore, conditional modifiers and saturation
+ *(.sat) cannot be used in this case.
+ *
+ * We just disallow cmod propagation on all integer multiplies.
+ */
+if (!brw_reg_type_is_floating_point(scan_inst->dst.type) &&
+scan_inst->opcode == BRW_OPCODE_MUL)
+   break;
+
 /* Otherwise, try propagating the conditional. */
 enum brw_conditional_mod cond =
inst->src[0].negate ? brw_swap_cmod(inst->conditional_mod)
-- 
2.5.0.400.gff86faf
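
One way to see why the PRM calls the flags undefined here, as a sketch
that does not model how the hardware actually computes them: the sign of
the full-precision product and the sign of the low dword written to the
destination can disagree.  Both helper names below are invented for the
example.

```cpp
#include <cstdint>

// Sign of the full-precision 32x32 -> 64-bit product.
bool full_product_is_negative(int32_t a, int32_t b)
{
   return static_cast<int64_t>(a) * static_cast<int64_t>(b) < 0;
}

// Sign of what a DW destination would hold: only the low 32 bits of the
// product are kept and the high bits are discarded.
bool low_dword_is_negative(int32_t a, int32_t b)
{
   const uint64_t full =
      static_cast<uint64_t>(static_cast<int64_t>(a) * static_cast<int64_t>(b));
   const uint32_t low = static_cast<uint32_t>(full);
   return (low >> 31) != 0;
}
```

For a = 65536 and b = 32768 the full product is 2^31, which is positive,
while the low dword is 0x80000000, which reads as negative.  A
conditional modifier evaluated on one of these and a separate CMP
evaluated on the other would give different answers.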



[Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

2017-10-04 Thread Jason Ekstrand
Shader-db results on Sky Lake:

total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
instructions in affected programs: 55956 -> 54529 (-2.55%)
helped: 6
HURT: 38

All of the hurt programs were hurt by exactly one instruction because
this patch affects copy propagation.  Most of the helped instructions
came from a single orbital explorer shader that was helped by 14.26%.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_cfg.cpp | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
index fad12ee..d8bf725 100644
--- a/src/intel/compiler/brw_cfg.cpp
+++ b/src/intel/compiler/brw_cfg.cpp
@@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
  assert(cur_while != NULL);
 cur->add_successor(mem_ctx, cur_while);
 
+ /* We also add the next block as a successor of the break.  If the
+  * break is predicated, we need to do this because the break may not
+  * be taken.  If the break is not predicated, we add it anyway so
+  * that our live intervals computations will operate as if the break
+  * may or may not be taken.  Consider the following example:
+  *
+  *vec4 color2;
+  *while (1) {
+  *   vec4 color = texture();
+  *   if (...) {
+  *  color2 = color * 2;
+  *  break;
+  *   }
+  *}
+  *gl_FragColor = color2;
+  *
+  * In this case, the definition of color2 dominates the use because
+  * the loop only has the one exit.  This means that the live range
+  * interval for color2 goes from the statement in the if to its use
+  * below the loop.  Now suppose that the texture operation has a
+  * header register that gets assigned one of the registers used for
+  * color2.  If the loop condition is non-uniform, some of the
+  * threads will take the break and others will continue.  In this
+  * case, on the next pass through the loop, the WE_all setup of the
+  * header register will stomp the disabled channels of color2 and
+  * corrupt the value.
+  *
+  * This same problem can occur if you have a mix of 64, 32, and
+  * 16-bit registers because the channels do not line up or if you
+  * have a SIMD16 program and the first half of one value overlaps the
+  * second half of the other.  To solve it, we simply treat the break
+  * as if it may also continue on because some of the threads may
+  * continue on.
+  */
 next = new_block();
-if (inst->predicate)
-   cur->add_successor(mem_ctx, next);
+cur->add_successor(mem_ctx, next);
 
 set_next_block(&cur, next, ip);
 break;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 4/4] intel/fs: Restrict live intervals to the subset possibly reachable from any definition.

2017-10-04 Thread Jason Ekstrand
From: Francisco Jerez 

Currently the liveness analysis pass extends a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.

This can lead to a serious performance problem in shaders containing
many partial writes, like scalar arithmetic, FP64 and soon FP16
operations.  The number of oversized live intervals in such workloads
can cause the compilation time of the shader to explode because of the
worse-than-quadratic behavior of the register allocator and scheduler
when running out of registers.  It can also cause the running time
of the shader to explode due to the amount of spilling it leads to,
since accessing spilled values is orders of magnitude slower than
accessing the GRF.

This patch fixes it by computing the intersection of our current live
intervals with the subset of the program that can possibly be reached
from any definition of the variable.  Extending the storage allocation
of the variable beyond that is pretty useless because its value is
guaranteed to be undefined at a point that cannot be reached from any
definition.
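
A rough sketch of that idea on a toy CFG, assuming one bit per variable
and hand-filled livein sets; it is not the code from the patch.  Block,
propagate_definitions(), live_at_block_start() and
partial_write_example() are illustrative names, while defin and defout
follow the naming used in the diff.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxVars = 8;
using VarSet = std::bitset<kMaxVars>;

enum Var { X = 0 };

struct Block {
   VarSet livein;              // result of the existing liveness analysis
   VarSet defin;               // possibly defined on entry to the block
   VarSet defout;              // possibly defined on exit from the block
   std::vector<int> children;  // indices of successor blocks
};

// Forward propagation to a fixed point: anything possibly defined at
// the end of a block is possibly defined at the start (and therefore
// the end) of each of its children.
void propagate_definitions(std::vector<Block> &blocks)
{
   bool cont;
   do {
      cont = false;
      for (std::size_t b = 0; b < blocks.size(); b++) {
         for (int child : blocks[b].children) {
            const VarSet new_def = blocks[b].defout & ~blocks[child].defin;
            blocks[child].defin |= new_def;
            blocks[child].defout |= new_def;
            cont = cont || new_def.any();
         }
      }
   } while (cont);
}

// A variable only needs storage at the start of a block if it is live
// there and some definition can reach that point.
bool live_at_block_start(const Block &block, Var v)
{
   return block.livein.test(v) && block.defin.test(v);
}

// Blocks 0 -> 1 -> 2.  Block 1 partially writes X, block 2 reads it.
// Because the write is partial, the old analysis reports X as live all
// the way up to the top of the program.
std::vector<Block> partial_write_example()
{
   std::vector<Block> blocks(3);
   blocks[0].children = {1};
   blocks[1].children = {2};
   blocks[1].defout.set(X);
   for (Block &block : blocks)
      block.livein.set(X);
   return blocks;
}
```

After propagation only block 2 has X in defin, so the interval no longer
needs to cover the start of blocks 0 and 1 even though livein still
claims X there.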

No significant change in the running time of shader-db (with 5%
statistical significance).

shader-db results on IVB:

  total cycles in shared programs: 61108780 -> 60932856 (-0.29%)
  cycles in affected programs: 16335482 -> 16159558 (-1.08%)
  helped: 5121
  HURT: 4347

  total spills in shared programs: 1309 -> 1288 (-1.60%)
  spills in affected programs: 249 -> 228 (-8.43%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1652 -> 1597 (-3.33%)
  fills in affected programs: 262 -> 207 (-20.99%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 209

shader-db results on BDW:

  total cycles in shared programs: 67617262 -> 67361220 (-0.38%)
  cycles in affected programs: 23397142 -> 23141100 (-1.09%)
  helped: 8045
  HURT: 6488

  total spills in shared programs: 1456 -> 1252 (-14.01%)
  spills in affected programs: 465 -> 261 (-43.87%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1720 -> 1465 (-14.83%)
  fills in affected programs: 471 -> 216 (-54.14%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 162

shader-db results on SKL:

  total cycles in shared programs: 65436248 -> 65245186 (-0.29%)
  cycles in affected programs: 22560936 -> 22369874 (-0.85%)
  helped: 8457
  HURT: 6247

  total spills in shared programs: 437 -> 437 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 870 -> 854 (-1.84%)
  fills in affected programs: 16 -> 0
  helped: 1
  HURT: 0

  LOST:   0
  GAINED: 107

Reviewed-by: Jason Ekstrand 
---
 src/intel/compiler/brw_fs_live_variables.cpp | 34 
 src/intel/compiler/brw_fs_live_variables.h   | 12 ++
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_live_variables.cpp 
b/src/intel/compiler/brw_fs_live_variables.cpp
index c449672..059f076 100644
--- a/src/intel/compiler/brw_fs_live_variables.cpp
+++ b/src/intel/compiler/brw_fs_live_variables.cpp
@@ -83,9 +83,11 @@ fs_live_variables::setup_one_write(struct block_data *bd, 
fs_inst *inst,
/* The def[] bitset marks when an initialization in a block completely
 * screens off previous updates of that variable (VGRF channel).
 */
-   if (inst->dst.file == VGRF && !inst->is_partial_write()) {
-  if (!BITSET_TEST(bd->use, var))
+   if (inst->dst.file == VGRF) {
+  if (!inst->is_partial_write() && !BITSET_TEST(bd->use, var))
  BITSET_SET(bd->def, var);
+
+  BITSET_SET(bd->defout, var);
}
 }
 
@@ -199,6 +201,28 @@ fs_live_variables::compute_live_variables()
  }
   }
}
+
+   /* Propagate defin and defout down the CFG to calculate the union of live
+    * variables potentially defined along any possible control flow path.
+    */
+   do {
+      cont = false;
+
+      foreach_block (block, cfg) {
+         const struct block_data *bd = &block_data[block->num];
+
+         foreach_list_typed(bblock_link, child_link, link, &block->children) {
+            struct block_data *child_bd = &block_data[child_link->block->num];
+
+            for (int i = 0; i < bitset_words; i++) {
+               const BITSET_WORD new_def = bd->defout[i] & ~child_bd->defin[i];
+               child_bd->defin[i] |= new_def;
+               child_bd->defout[i] |= new_def;
+               cont |= new_def;
+            }
+         }
+      }
+   } while (cont);
 }
 
 /**
@@ -212,12 +236,12 @@ fs_live_variables::compute_start_end()
       struct block_data *bd = &block_data[block->num];
 
   for (int i = 0; i < num_vars; i++) {
- if (BITSET_TEST(bd->livein, i)) {
+ if (BITSET_TEST(bd->livein, i) && BITSET_TEST(bd->defin, i)) {
 start[i] = MIN2(start[i], block->start_ip);
 end[i] = MAX2(end[i], block->start_ip);
  }
 
- if (BITSET_TEST(bd->liveout, i)) {
+ if (BITSET_TEST(bd->liveout, i) && 

[Mesa-dev] [PATCH 1/4] intel/compiler: Don't cmod propagate into a saturated operation

2017-10-04 Thread Jason Ekstrand
Shader-db results on Sky Lake:

total instructions in shared programs: 12954445 -> 12955125 (0.01%)
instructions in affected programs: 141862 -> 142542 (0.48%)
helped: 0
HURT: 626

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_fs_cmod_propagation.cpp   | 8 
 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 8 
 2 files changed, 16 insertions(+)

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp 
b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 2d50c92..db63e94 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -142,6 +142,14 @@ opt_cmod_propagation_local(const gen_device_info *devinfo, 
bblock_t *block)
 scan_inst->opcode == BRW_OPCODE_CMPN)
break;
 
+/* From the Sky Lake PRM Vol. 7 "Assigning Conditional Mods":
+ *
+ ** Note that the [post condition signal] bits generated at
+ *  the output of a compute are before the .sat.
+ */
+if (scan_inst->saturate)
+   break;
+
 /* Otherwise, try propagating the conditional. */
 enum brw_conditional_mod cond =
inst->src[0].negate ? brw_swap_cmod(inst->conditional_mod)
diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp 
b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 4454cdb..05e6516 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -129,6 +129,14 @@ opt_cmod_propagation_local(bblock_t *block)
 scan_inst->opcode == BRW_OPCODE_CMPN)
break;
 
+/* From the Sky Lake PRM Vol. 7 "Assigning Conditional Mods":
+ *
+ ** Note that the [post condition signal] bits generated at
+ *  the output of a compute are before the .sat.
+ */
+if (scan_inst->saturate)
+   break;
+
 /* Otherwise, try propagating the conditional. */
 enum brw_conditional_mod cond =
inst->src[0].negate ? brw_swap_cmod(inst->conditional_mod)
-- 
2.5.0.400.gff86faf
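
A small sketch of why the propagation is unsafe, under the simplifying
assumption that the PRM note means the flag is computed from the
unsaturated result.  The function names are invented, and the
instruction sequences in the comments are only indicative.

```cpp
#include <algorithm>

// Clamp to [0, 1], which is what .sat does for float destinations.
float saturate(float x)
{
   return std::min(std::max(x, 0.0f), 1.0f);
}

// Before propagation: roughly "add.sat tmp, a, b" followed by a separate
// "cmp.l tmp, 0".  The comparison sees the saturated value.
bool flag_from_separate_cmp(float a, float b)
{
   return saturate(a + b) < 0.0f;
}

// After propagation: roughly "add.sat.l tmp, a, b".  If the flag is
// generated before the .sat, it sees the unsaturated sum.
bool flag_from_propagated_cmod(float a, float b)
{
   return (a + b) < 0.0f;
}
```

For a = 0.25 and b = -0.75 the sum is -0.5.  The separate CMP compares
the clamped value 0.0 and leaves the flag clear, while the propagated
modifier would set it.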



Re: [Mesa-dev] [PATCH] ac/nir: use llvm fma intrinsic if nir instruction is exact.

2017-10-04 Thread Connor Abbott
Reviewed-by: Connor Abbott 

On Wed, Oct 4, 2017 at 4:04 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> As pointed out by Connor we still need to use fma if nir wants
> exact (precise) behaviour.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 11ba487..38a2bbe 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
>   result);
> break;
> case nir_op_ffma:
> -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
> +   result = emit_intrin_3f_param(&ctx->ac, instr->exact ? "llvm.fma" : "llvm.fmuladd",
>   ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
> break;
> case nir_op_ibitfield_extract:
> --
> 2.9.5
>
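
For context on why the distinction matters: llvm.fma always rounds once,
whereas llvm.fmuladd leaves the backend free to emit either a fused
operation or a separate multiply and add.  The host-side sketch below
shows the two results differing; it illustrates the arithmetic only and
says nothing about what any particular GPU backend picks.  The helper
names are made up, and the volatile temporary is there to stop the
compiler from contracting the unfused version.

```cpp
#include <cmath>

// Multiply, round to double, then add: two roundings.
double mul_add_unfused(double a, double b, double c)
{
   volatile double product = a * b;  // forces the intermediate rounding
   return product + c;
}

// Fused multiply-add: the exact product is added to c and the result is
// rounded once.
double mul_add_fused(double a, double b, double c)
{
   return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is
1 - 2^-54.  Rounding it to double gives 1.0, so the unfused result is 0,
while the fused result is -2^-54.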


Re: [Mesa-dev] [PATCH] gallium: remove TGSI_OPCODE_KILL

2017-10-04 Thread Roland Scheidegger
I didn't like this when it was proposed a couple weeks ago, and
unsurprisingly I still don't like it now.
The reason is that KILL is a simple opcode which even maps to what both
glsl and d3d10 actually need, whereas KILL_IF is a complicated mess
including combined per-channel comparisons.
I realize you can of course optimize away all the comparisons if you're
using immediates, but it still doesn't look very clean to me.

Roland
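
To make the two opcodes concrete, here is a sketch of the semantics as
tgsi.rst describes them: KILL_IF discards when any channel of its source
is negative, and KILL discards unconditionally.  Operand and
is_unconditional_kill() are hypothetical and do not correspond to any
gallium structure or helper.  The patch description only mentions
checking for an IMM source; also checking the immediate's value is an
assumption made here.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// KILL_IF: discard the fragment if any channel of src is negative.
bool kill_if_discards(const Vec4 &src)
{
   for (float channel : src) {
      if (channel < 0.0f)
         return true;
   }
   return false;
}

// Hypothetical stand-in for a source operand.
struct Operand {
   bool is_immediate;  // true if the operand is an immediate
   Vec4 value;         // only meaningful when is_immediate is true
};

// A KILL_IF whose source is an immediate with a negative channel
// discards every fragment, which is what KILL used to express directly.
bool is_unconditional_kill(const Operand &src)
{
   return src.is_immediate && kill_if_discards(src.value);
}
```

A driver that wants the old fast path would have to recognise this
pattern itself, which is the extra work being objected to above.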


Am 04.10.2017 um 23:15 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> It can be recognized from KILL_IF by checking if the src operand is IMM.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 11 --
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c|  3 --
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c| 42 
> --
>  src/gallium/auxiliary/nir/tgsi_to_nir.c| 14 
>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 18 --
>  src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h |  2 +-
>  src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h   |  1 -
>  src/gallium/auxiliary/tgsi/tgsi_scan.c |  3 +-
>  src/gallium/auxiliary/vl/vl_mc.c   |  2 +-
>  src/gallium/docs/source/tgsi.rst   |  5 ---
>  src/gallium/drivers/i915/i915_fpc_optimize.c   |  1 -
>  src/gallium/drivers/i915/i915_fpc_translate.c  | 13 ---
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  3 --
>  src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c   |  3 --
>  src/gallium/drivers/r300/r300_tgsi_to_rc.c |  1 -
>  src/gallium/drivers/r600/r600_shader.c | 13 +++
>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c  | 32 ++---
>  src/gallium/drivers/svga/svga_tgsi_insn.c  | 29 +--
>  src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 21 ---
>  src/gallium/include/pipe/p_shader_tokens.h |  2 +-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  3 +-
>  21 files changed, 22 insertions(+), 200 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index ce2b927..edcfc6e 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -355,30 +355,20 @@ kil_fetch_args(
> /* src0.z */
> emit_data->args[2] = lp_build_emit_fetch(bld_base, emit_data->inst,
>  0, TGSI_CHAN_Z);
> /* src0.w */
> emit_data->args[3] = lp_build_emit_fetch(bld_base, emit_data->inst,
>  0, TGSI_CHAN_W);
> emit_data->arg_count = 4;
> emit_data->dst_type = 
> LLVMVoidTypeInContext(bld_base->base.gallivm->context);
>  }
>  
> -/* TGSI_OPCODE_KILL */
> -
> -static void
> -kilp_fetch_args(
> -   struct lp_build_tgsi_context * bld_base,
> -   struct lp_build_emit_data * emit_data)
> -{
> -   emit_data->dst_type = 
> LLVMVoidTypeInContext(bld_base->base.gallivm->context);
> -}
> -
>  /* TGSI_OPCODE_LIT */
>  
>  static void
>  lit_fetch_args(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> /* src0.x */
> emit_data->args[0] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, 
> TGSI_CHAN_X);
> /* src0.y */
> @@ -1172,21 +1162,20 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
> bld_base)
> bld_base->op_actions[TGSI_OPCODE_POW] = pow_action;
> bld_base->op_actions[TGSI_OPCODE_UP2H] = up2h_action;
>  
> bld_base->op_actions[TGSI_OPCODE_SWITCH].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_CASE].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_COS].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_EX2].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_IF].fetch_args = scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_UIF].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_KILL_IF].fetch_args = kil_fetch_args;
> -   bld_base->op_actions[TGSI_OPCODE_KILL].fetch_args = kilp_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_RCP].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_SIN].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_LG2].fetch_args = 
> scalar_unary_fetch_args;
>  
> bld_base->op_actions[TGSI_OPCODE_ADD].emit = add_emit;
> bld_base->op_actions[TGSI_OPCODE_ARR].emit = arr_emit;
> bld_base->op_actions[TGSI_OPCODE_END].emit = end_emit;
> bld_base->op_actions[TGSI_OPCODE_FRC].emit = frc_emit;
> bld_base->op_actions[TGSI_OPCODE_LRP].emit = lrp_emit;
> bld_base->op_actions[TGSI_OPCODE_MAD].emit = mad_emit;
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
> index 2529c6a..675b9a5 

Re: [Mesa-dev] [PATCH 2/6] meson: build glx

2017-10-04 Thread Dylan Baker
Quoting Eric Anholt (2017-10-04 14:57:23)
> Dylan Baker  writes:
> 
> > > This gets GLX and the loader building. The resulting GLX and i965 have
> > > been tested on piglit and seem to work fine. This patch leaves a lot of
> > > TODOs in its wake: GLX is quite complicated, the build options
> > > involved are many, and the goal at the moment is to get dri and gallium
> > > drivers building.
> >
> > Signed-off-by: Dylan Baker 
> 
> > diff --git a/meson.build b/meson.build
> > index 1824a7ea184..52ac24f59ca 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -21,7 +21,18 @@
> >  project('mesa', ['c', 'cpp'], version : '17.3.0-devel', license : 'MIT',
> >  default_options : ['c_std=c99', 'cpp_std=c++11'])
> >  
> > -with_dri3 = true  # XXX: need a switch for this
> > +# Arguments for the preprocessor, put these in a separate array from the C 
> > and
> > +# C++ (cpp in meson terminology) arguments since they need to be added to 
> > the
> > +# default arguments for both C and C++.
> > +pre_args = [
> > +  '-D__STDC_CONSTANT_MACROS',
> > +  '-D__STDC_FORMAT_MACROS',
> > +  '-D__STDC_LIMIT_MACROS',
> > +  '-DVERSION="@0@"'.format(meson.project_version()),
> > +  '-DPACKAGE_VERSION=VERSION',
> > +  
> > > '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"',
> > +]
> > +
> 
> It would be nice if this hunk appeared in its end position in patch 1.

fixed

> 
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 568903f1a0a..62d6b593f88 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -20,6 +20,8 @@
> >  
> >  option('platforms',  type : 'string',  value : 'x11,wayland',
> > description : 'comma separated list of window systems to support. 
> > wayland, x11, surfaceless, drm, etc.')
> > +option('dri3',   type : 'combo',   value : 'auto', choices : 
> > ['auto', 'yes', 'no'],
> > +   description : 'comma separated list of window systems to support. 
> > wayland, x11, surfaceless, drm, etc.')
> 
> Update the description.

I apparently squashed my fixup into the wrong patch, but I've fixed that
locally.

> 
> > diff --git a/src/glx/meson.build b/src/glx/meson.build
> > new file mode 100644
> > index 000..821623dc263
> > --- /dev/null
> > +++ b/src/glx/meson.build
> 
> > +dri_driver_dir = join_paths(get_option('prefix'), get_option('libdir'), 
> > 'dri')
> > +if not with_glvnd
> > +  gl_lib_name = 'GL'
> > +  gl_lib_version = '1.2'
> > +else
> > +  gl_lib_name = 'GLX_mesa'
> > +  gl_lib_version = '0'
> > +  files_libglx += files(
> > +'g_glxglvnddispatchfuncs.c',
> > +'g_glxglvnddispatchindices.h',
> > +'glxglvnd.c',
> > +'glxglvnd.h',
> > +'glxglvnddispatchfuncs.h',
> > +  )
> > +endif
> > +
> > +gl_lib_cargs = [
> > +  '-D_RENTRANT', '-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_driver_dir),
> 
> "_REENTRANT"
> 
> We probably actually don't need _REENTRANT at all -- if it's needed
> here, it's surely needed across the tree, but _GNU_SOURCE should have us
> covered.

Will we need this for Windows?

> 
> > +]
> > +
> > +if dep_xf86vm != [] and dep_xf86vm.found()
> > +  gl_lib_cargs += '-DHAVE_XF86VIDMODE'
> > +endif
> > +
> > +libglx = static_library(
> > +  'glx',
> > +  [files_libglx, glx_indirect_c, glx_indirect_h, glx_indirect_init_c,
> > +   glx_indirect_size_c, glx_indirect_size_h],
> > +  include_directories : [
> > +inc_common, inc_glapi,
> > +include_directories('../loader', '../../include/GL/internal')
> > +  ],
> > +  c_args : [c_vis_args, gl_lib_cargs,
> > +'-DGL_LIB_NAME="lib@0@.so.@1@"'.format(gl_lib_name, 
> > gl_lib_version)],
> 
> GL_LIB_NAME looks like it was libGL.so.1 on !glvnd in automake, not
> libGL.so.1.2.

yup, I just misread that.

> 
> > +  link_with : [libloader, libloader_dri3_helper, libmesa_util, 
> > libxmlconfig],
> > +  dependencies : [dep_libdrm, dep_dri2proto, dep_glproto, dep_x11, 
> > dep_glvnd],
> > +  build_by_default : false,
> > +)
> > +
> > +# workaround for bug #2180
> > +dummy_c = custom_target(
> > +  'dummy_c',
> > +  output : 'dummy.c',
> > +  command : [prog_touch, '@OUTPUT@'],
> > +)
> > +
> > +if with_glx == 'dri'
> > +  libgl = shared_library(
> > +gl_lib_name,
> > +dummy_c,  # workaround for bug #2180
> > +include_directories : [
> > +  inc_common, inc_glapi, 
> > +  include_directories('../loader', '../../include/GL/internal')
> > +],
> > +link_with : [libglapi_static, libglapi],
> > +link_whole : libglx,
> 
> It's not clear to me why we're building a static libglx above if it's
> only used in one place.

The glx tests link with it too, and I should build those as well.

> 
> > +link_args : [ld_args_bsymbolic, ld_args_gc_sections],
> 
> Missing -no-undefined?

meson enables that by default.

> 
> > diff --git a/src/mapi/glapi/gen/meson.build b/src/mapi/glapi/gen/meson.build
> > index f4c1343202c..cf1f014b4f0 100644
> > --- a/src/mapi/glapi/gen/meson.build
> > +++ 

Re: [Mesa-dev] [PATCH] mesa: Use a 565 format for GL_RGB and GL_UNSIGNED_SHORT_5_6_5 textures.

2017-10-04 Thread Eric Anholt
Kenneth Graunke  writes:

> Found while trying to optimize an application.
>
> Not observed to help performance on i965, but should at least reduce
> the memory usage of such textures a bit.

Reviewed-by: Eric Anholt 




Re: [Mesa-dev] [PATCH 4/4] i965/gen10: Implement Wa3DStateMode

2017-10-04 Thread Anuj Phogat
On Mon, Oct 2, 2017 at 7:46 PM, Jason Ekstrand  wrote:
> On Mon, Oct 2, 2017 at 4:08 PM, Anuj Phogat  wrote:
>>
>> Cc: mesa-sta...@lists.freedesktop.org
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/dri/i965/brw_state_upload.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c
>> b/src/mesa/drivers/dri/i965/brw_state_upload.c
>> index a1bf54dc72..c224355a2b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
>> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
>> @@ -88,8 +88,11 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
>> if (devinfo->gen == 10) {
>>BEGIN_BATCH(2);
>>OUT_BATCH(_3DSTATE_3D_MODE  << 16 | (2 - 2));
>> -  OUT_BATCH(GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE << 16 |
>> -GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE);
>> +  /* From gen10 workaround table in h/w specs:
>> +   * "On 3DSTATE_3D_MODE, driver must always program bits 31:16 of
>> DW1
>> +   *  a value of 0x"
>> +   */
>> +  OUT_BATCH(0x << 16 | GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE);
>
>
> Bits 31:16 are the mask bits.  By programming them to 0x, you're making
> it write the entire register and not just the float blend optimization
> enable bit.  If we're going to do that, we need to figure out what values we
> want in the other fields and always set them along with the float blend
> optimization enable bit.
>
Right. After looking at all other fields, I don't think we want to set
any of them except one. That field is "Slice Hashing Table Enable" which says:
"For gen10, when the total number of subslices enabled is 6,8,10, or
12, slice hashing table must be enabled."

I have no idea about slice hashing tables and I think enabling it
should be handled in a separate patch anyways.
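
For readers unfamiliar with masked registers, the behaviour Jason describes can be sketched as follows. This is a toy Python model, not driver code: the bit position of the float blend optimization enable, the starting register contents, and the all-ones mask value are assumptions made for the example.

```python
def masked_write(old, dword):
    """Model a masked-register write.

    Bits 31:16 of dword are the write mask; only the bits of 15:0
    selected by the mask replace the old register contents.
    """
    mask = (dword >> 16) & 0xFFFF
    value = dword & 0xFFFF
    return (old & ~mask & 0xFFFF) | (value & mask)


FLOAT_BLEND_OPT = 1 << 4   # hypothetical bit position
old = 0x00A0               # hypothetical contents: other fields already set

# Mask covers only the one bit: other fields are preserved.
only_bit = masked_write(old, FLOAT_BLEND_OPT << 16 | FLOAT_BLEND_OPT)

# Mask of all ones (assumed): every field is overwritten, so any field
# not named in the value ends up cleared.
whole = masked_write(old, 0xFFFF << 16 | FLOAT_BLEND_OPT)

print(hex(only_bit), hex(whole))
```

With a full mask the driver owns the whole register, which is why every other field (such as the slice hashing table enable) has to be given a deliberate value.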


> --Jason
>
>>
>>ADVANCE_BATCH();
>> }
>>
>> --
>> 2.13.5
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>


Re: [Mesa-dev] [PATCH 6/6] meson: build classic swrast

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> This adds support for building the classic swrast implementation. This
> driver has been tested with glxinfo and glxgears.
>
> Signed-off-by: Dylan Baker 
> ---
>  meson.build   |  2 ++
>  meson_options.txt |  2 +-
>  src/mesa/drivers/dri/meson.build  |  5 +++
>  src/mesa/drivers/dri/{ => swrast}/meson.build | 45 
> +--
>  4 files changed, 16 insertions(+), 38 deletions(-)
>  copy src/mesa/drivers/dri/{ => swrast}/meson.build (52%)
>

> diff --git a/src/mesa/drivers/dri/meson.build 
> b/src/mesa/drivers/dri/meson.build
> index f7403ec09fc..153aa15efb6 100644
> --- a/src/mesa/drivers/dri/meson.build
> +++ b/src/mesa/drivers/dri/meson.build
> @@ -19,11 +19,16 @@
>  # SOFTWARE.
>  
>  subdir('common')
> +subdir('swrast')
>  subdir('i965')
>  
>  if with_dri
>dri_drivers = []
>dri_link = []
> +  if with_dri_swrast
> +dri_drivers += libswrast_dri
> +dri_link += 'swrast_dri.so'
> +  endif
>if with_dri_i965
>  dri_drivers += libi965
>  dri_link += 'i965_dri.so'

General style suggestion for this and i965: Define these two variables
before the subdir(driver), conditionally go into the subdir, and have
the driver just append itself to the array.  Then you don't need
build_by_default flags on the driver's libs.




Re: [Mesa-dev] [PATCH 4/6] meson: build gbm

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> This doesn't include egl support, just dri support.
>
> Signed-off-by: Dylan Baker 
> ---
>  meson.build | 49 +---
>  meson_options.txt   | 14 +
>  src/{loader => gbm}/meson.build | 63 
> -
>  src/glx/meson.build | 10 +++
>  src/loader/meson.build  |  2 ++
>  src/mesa/meson.build|  2 +-
>  src/meson.build |  4 ++-
>  7 files changed, 95 insertions(+), 49 deletions(-)
>  copy src/{loader => gbm}/meson.build (50%)
>
> diff --git a/meson.build b/meson.build
> index ec50e10b38c..185d70509c5 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -54,19 +54,6 @@ with_any_opengl = with_opengl or with_gles1 or with_gles2
>  # Only build shared_glapi if at least one OpenGL API is enabled
>  with_shared_glapi = get_option('shared-glapi') and with_any_opengl
>  
> -with_dri3 = get_option('dri3')
> -if with_dri3 == 'auto'
> -  if host_machine.system() == 'linux'
> -with_dri3 = true
> -  else
> -with_dri3 = false
> - endif
> -elif with_dri3 == 'yes'
> -  with_dri3 = true
> -else
> -  with_dri3 = false
> -endif
> -
>  # TODO: these will need options, but at the moment they just control header
>  # installs
>  with_osmesa = false
> @@ -107,6 +94,27 @@ with_dri_platform = 'drm'
>  with_gallium = false
>  # TODO: gallium drivers
>  
> +# TODO: conditionalize libdrm requirement
> +dep_libdrm = dependency('libdrm', version : '>= 2.4.75')
> +pre_args += '-DHAVE_LIBDRM'
> +
> +with_dri2 = with_dri and with_dri_platform == 'drm' and dep_libdrm.found()
> +with_dri3 = get_option('dri3')
> +if with_dri3 == 'auto'
> +  if host_machine.system() == 'linux' and with_dri2
> +with_dri3 = true
> +  else
> +with_dri3 = false
> + endif
> +elif with_dri3 == 'yes'
> +  if not with_dri2
> +error('dri3 support requires libdrm')
> +  endif
> +  with_dri3 = true
> +else
> +  with_dri3 = false
> +endif

It would be great if the hunk could appear in its ultimate position
earlier in the series.


> diff --git a/meson_options.txt b/meson_options.txt
> index 130d3962db7..b6d44c44ba9 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -32,17 +32,19 @@ option('shader-cache',type : 'boolean', value : true,
> description : 'Build with on-disk shader cache support')
>  option('vulkan-icd-dir', type : 'string',  value : '',
> description : 'Location relative to prefix to put vulkan icds on 
> install. Default: $datadir/vulkan/icd.d')
> -option('shared-glapi',   type : 'boolean', value : true,
> +option('shared-glapi',type : 'boolean', value : true,
> description : 'Whether to build a shared or static glapi')

More stray changes.

> -option('gles1',  type : 'boolean', value : true,
> +option('gles1',   type : 'boolean', value : true,
> description : 'Build support for OpenGL ES 1.x')
> -option('gles2',  type : 'boolean', value : true,
> +option('gles2',   type : 'boolean', value : true,
> description : 'Build support for OpenGL ES 2.x and 3.x')
> -option('opengl', type : 'boolean', value : true,
> +option('opengl',  type : 'boolean', value : true,
> description : 'Build support for OpenGL (all versions)')
> -option('glx',type : 'combo',   value : 'auto', choices : 
> ['auto', 'disabled', 'dri', 'xlib', 'gallium-xlib'],
> +option('gbm', type : 'combo',   value : 'auto', choices : 
> ['auto', 'yes', 'no'],
> +   description : 'Build support for gbm platform')
> +option('glx', type : 'combo',   value : 'auto', choices : 
> ['auto', 'disabled', 'dri', 'xlib', 'gallium-xlib'],
> description : 'Build support for GLX platform')
> -option('glvnd',  type : 'boolean', vaule : false,
> +option('glvnd',   type : 'boolean', value : false,
> description : 'Enable GLVND support.')

Stray changes.

> diff --git a/src/glx/meson.build b/src/glx/meson.build
> index 6b6e9095740..8c1b29a9ff8 100644
> --- a/src/glx/meson.build
> +++ b/src/glx/meson.build
> @@ -106,8 +106,6 @@ elif with_windowsdri
>#]
>  endif
>  
> -# TODO: libglvnd
> -
>  dri_driver_dir = join_paths(get_option('prefix'), with_dri_drivers_path)
>  if not with_glvnd
>gl_lib_name = 'GL'
> @@ -137,8 +135,8 @@ libglx = static_library(
>[files_libglx, glx_indirect_c, glx_indirect_h, glx_indirect_init_c,
> glx_indirect_size_c, glx_indirect_size_h],
>include_directories : [
> -inc_common, inc_glapi,
> -include_directories('../loader', '../../include/GL/internal')
> +inc_common, inc_glapi, inc_loader,
> +include_directories('../../include/GL/internal')
>],
>c_args : [c_vis_args, gl_lib_cargs,
>  '-DGL_LIB_NAME="lib@0@.so.@1@"'.format(gl_lib_name, 
> gl_lib_version)],
> @@ -159,8 +157,8 @@ if with_glx == 'dri'
>  gl_lib_name,
>  

Re: [Mesa-dev] [PATCH 3/6] meson: Add support for configuring dri drivers directory.

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> Signed-off-by: Dylan Baker 
> ---
>  meson.build  |  6 ++
>  meson_options.txt| 14 --
>  src/glx/meson.build  |  2 +-
>  src/mesa/drivers/dri/meson.build |  2 +-
>  4 files changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/meson.build b/meson.build
> index 52ac24f59ca..ec50e10b38c 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -42,6 +42,11 @@ with_asm = get_option('asm')
>  with_appledri = false
>  with_windowsdri = false
>  
> +with_dri_drivers_path = get_option('dri-drivers-path')
> +if with_dri_drivers_path == ''
> +  with_dri_drivers_path = join_paths(get_option('libdir'), 'dri')
> +endif

Could we drop "with_" from the name of this variable?

> +
>  with_gles1 = get_option('gles1')
>  with_gles2 = get_option('gles2')
>  with_opengl = get_option('opengl')
> @@ -573,6 +578,7 @@ if with_platform_x11
>dependency('xcb-dri2', version : '>= 1.8'),
>dependency('xcb-xfixes'),
>  ]
> +pre_args += '-DHAVE_X11_PLATFORM'
>  if with_dri3
>pre_args += '-DHAVE_DRI3'
>dep_xcb_dri3 = [

Part of patch 1?

> diff --git a/meson_options.txt b/meson_options.txt
> index 62d6b593f88..130d3962db7 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -18,13 +18,15 @@
>  # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
>  # SOFTWARE.
>  
> -option('platforms',  type : 'string',  value : 'x11,wayland',
> +option('platforms',   type : 'string',  value : 'x11,wayland',
> description : 'comma separated list of window systems to
> support. wayland, x11, surfaceless, drm, etc.')

Stray change?

> -option('dri3',   type : 'combo',   value : 'auto', choices : 
> ['auto', 'yes', 'no'],
> -   description : 'comma separated list of window systems to support. 
> wayland, x11, surfaceless, drm, etc.')
> -option('dri-drivers',type : 'string',  value : 'i965',
> +option('dri3',type : 'combo',   value : 'auto', choices : 
> ['auto', 'yes', 'no'],
> +   description : 'enable support for dri3')
> +option('dri-drivers', type : 'string',  value : 'i965',
> description : 'comma separated list of dri drivers to build.')

Squash into earlier patches?

> -option('vulkan-drivers', type : 'string',  value : 'intel,amd',
> +option('dri-drivers-path', type : 'string',  value : '',
> +   description : 'Location of dri drivers. Default: $libdir/dri.')
> +option('vulkan-drivers',  type : 'string',  value : 'intel,amd',
> description : 'comma separated list of vulkan drivers to build.')
>  option('shader-cache',type : 'boolean', value : true,
> description : 'Build with on-disk shader cache support')
> @@ -46,5 +48,5 @@ option('asm',type : 'boolean', value : true,
> description : 'Build assembly code if possible')
>  option('valgrind',   type : 'boolean', vaule : true,
> description : 'Build with valgrind support if possible')
> -option('build-tests',type : 'boolean', value : false,
> +option('build-tests', type : 'boolean', value : false,

Stray change?




Re: [Mesa-dev] [PATCH 1/6] meson: Build i965 and dri stack

2017-10-04 Thread Dylan Baker
Quoting Eric Anholt (2017-10-04 14:34:40)
> Dylan Baker  writes:
> 
> > This gets pretty much the entire classic tree building, as well as
> > i965, including the various glapis. There are some workarounds for bugs
> > that are fixed in meson 0.43.0, which is due out on October 8th.
> >
> > I have tested this with piglit using glx.
> >
> > Signed-off-by: Dylan Baker 
> 
> > diff --git a/meson.build b/meson.build
> > index 5de64acefd6..1824a7ea184 100644
> > --- a/meson.build
> > +++ b/meson.build
> 
> > @@ -336,7 +419,12 @@ endif
> >  
> >  # pthread stubs. Lets not and say we didn't
> >  
> > +_req_parse = with_opengl or with_gles1 or with_gles2
> > +prog_bison = find_program('bison', required : _req_parse)
> > +prog_flex = find_program('flex', required : _req_parse)
> 
> Just reuse with_any_opengl here?

Yup, those predate the with_any_opengl variable in my development.

Dylan




Re: [Mesa-dev] [PATCH 1/6] meson: Build i965 and dri stack

2017-10-04 Thread Dylan Baker
Quoting Eric Anholt (2017-10-04 14:25:30)
> Dylan Baker  writes:
> 
> > This gets pretty much the entire classic tree building, as well as
> > i965, including the various glapis. There are some workarounds for bugs
> > that are fixed in meson 0.43.0, which is due out on October 8th.
> >
> > I have tested this with piglit using glx.
> >
> > Signed-off-by: Dylan Baker 
> 
> I didn't do a side-by-side diff or anything, but this looks pretty
> good.  A few comments...
> 
> > diff --git a/bin/install_megadrivers.py b/bin/install_megadrivers.py
> > new file mode 100755
> > index 000..50a4323a6e8
> > --- /dev/null
> > +++ b/bin/install_megadrivers.py
> > @@ -0,0 +1,68 @@
> > +#!/usr/bin/env python
> > +# encoding=utf-8
> > +# Copyright © 2017 Intel Corporation
> > +
> > +# Permission is hereby granted, free of charge, to any person obtaining a 
> > copy
> > +# of this software and associated documentation files (the "Software"), to 
> > deal
> > +# in the Software without restriction, including without limitation the 
> > rights
> > +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> > +# copies of the Software, and to permit persons to whom the Software is
> > +# furnished to do so, subject to the following conditions:
> > +
> > +# The above copyright notice and this permission notice shall be included 
> > in
> > +# all copies or substantial portions of the Software.
> > +
> > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL 
> > THE
> > +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> > FROM,
> > +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> > IN THE
> > +# SOFTWARE.
> > +
> > +"""Script to install megadriver symlinks for meson."""
> > +
> > +import argparse
> > +import errno
> > +import os
> > +import shutil
> > +
> > +
> > +def main():
> > +parser = argparse.ArgumentParser()
> > +parser.add_argument('megadriver')
> > +parser.add_argument('libdir')
> > +parser.add_argument('drivers', nargs='+')
> > +args = parser.parse_args()
> > +
> > +to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), 
> > args.libdir)
> > +
> > +cross_found = False
> > +
> > +if not os.path.exists(to):
> > +os.makedirs(to)
> > +from_ = args.megadriver
> > +
> > +for each in args.drivers:
> > +final = os.path.join(to, each)
> > +if os.path.exists(final):
> > +os.unlink(final)
> > +print('installing {} to {}'.format(args.megadriver, to))
> > +try:
> > +os.link(from_, final)
> > +except OSError as e:
> > +if e.errno == errno.EXDEV:
> > +if cross_found:
> > +raise Exception('Something went very wrong.')
> > +# if we hit this then we're trying to link from one 
> > filesystem,
> > +# which is obviously invalid. Instead copy the first 
> > binary,
> > +# then set that as the from so that the hard links will 
> > work
> > +shutil.copy(from_, final)
> > +from_ = final
> > +cross_found = True
> > +else:
> > +raise
> 
> The old megadrivers install method would install under the build target
> name ("libmesa_dri_drivers.so"), link from that to the names we need,
> then remove libmesa_dri_drivers.so.  I like the sound of that -- no
> weird error handling like this, and if the developer decides to do
> something silly to the build target in the tree, it doesn't have
> surprise effects on the installed copy.  (I was worried about strip,
> though it does look like strip makes a new inode)
> 
> Would that be hard to do?

Not at all, and in fact that sounds better than what I did, which was to not
think about cross-device linking, and then add this after I tried to install to
/tmp and said, "hey..."

I'll fix that for v2
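
A rough sketch of the approach Eric describes (install the megadriver under its build target name, hard-link it to each driver name, then remove the original) might look like the following. This is not the v2 script; the function name and arguments are hypothetical, and it assumes the master file has already been installed into the destination directory, so every link stays on one filesystem.

```python
import os


def install_megadriver_links(master, names):
    """Hard-link an installed megadriver to each driver name, then drop it.

    master: path to the installed megadriver (e.g. libmesa_dri_drivers.so)
    names:  driver file names to create next to it
    """
    dest_dir = os.path.dirname(master)
    for name in names:
        final = os.path.join(dest_dir, name)
        # Replace any stale copy left by a previous install.
        if os.path.lexists(final):
            os.unlink(final)
        os.link(master, final)
    # The links keep the inode alive; the build target name is not needed.
    os.unlink(master)
```

Because the source and the links live in the same directory, the EXDEV case from the quoted script cannot occur, and later changes to the file in the build tree do not affect the installed copy.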

> > diff --git a/meson.build b/meson.build
> > index 5de64acefd6..1824a7ea184 100644
> > --- a/meson.build
> > +++ b/meson.build
> 
> > @@ -204,7 +255,39 @@ endif
> >  
> >  # TODO: cross-compiling. I don't think this is relavent to meson
> >  
> > -# TODO: assembly support. mesa and vc4
> > +# FIXME: enable asm when cross compiler
> > +# This is doable (autotools does it), but it's not of immediate concern
> > +if meson.is_cross_build()
> > +  message('Cross compiling, disabling asm')
> > +  with_asm = false
> > +endif
> > +
> > +with_asm_arch = ''
> > +if with_asm
> > +  # TODO: SPARC and PPC
> > +  if target_machine.cpu_family() == 'x86'
> > +if ['linux', 'bsd'].contains(target_machine.system()) # FIXME: hurd?
> > +  with_asm_arch = 

Re: [Mesa-dev] [PATCH 2/6] meson: build glx

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> This gets GLX and the loader building. The resulting GLX and i965 have
> been tested on piglit and seem to work fine. This patch leaves a lot of
> TODOs in its wake: GLX is quite complicated, the build options
> involved are many, and the goal at the moment is to get dri and gallium
> drivers building.
>
> Signed-off-by: Dylan Baker 

> diff --git a/meson.build b/meson.build
> index 1824a7ea184..52ac24f59ca 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -21,7 +21,18 @@
>  project('mesa', ['c', 'cpp'], version : '17.3.0-devel', license : 'MIT',
>  default_options : ['c_std=c99', 'cpp_std=c++11'])
>  
> -with_dri3 = true  # XXX: need a switch for this
> +# Arguments for the preprocessor, put these in a separate array from the C 
> and
> +# C++ (cpp in meson terminology) arguments since they need to be added to the
> +# default arguments for both C and C++.
> +pre_args = [
> +  '-D__STDC_CONSTANT_MACROS',
> +  '-D__STDC_FORMAT_MACROS',
> +  '-D__STDC_LIMIT_MACROS',
> +  '-DVERSION="@0@"'.format(meson.project_version()),
> +  '-DPACKAGE_VERSION=VERSION',
> +  
> '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"',
> +]
> +

It would be nice if this hunk appeared in its end position in patch 1.

> diff --git a/meson_options.txt b/meson_options.txt
> index 568903f1a0a..62d6b593f88 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -20,6 +20,8 @@
>  
>  option('platforms',  type : 'string',  value : 'x11,wayland',
> description : 'comma separated list of window systems to support. 
> wayland, x11, surfaceless, drm, etc.')
> +option('dri3',   type : 'combo',   value : 'auto', choices : 
> ['auto', 'yes', 'no'],
> +   description : 'comma separated list of window systems to support. 
> wayland, x11, surfaceless, drm, etc.')

Update the description.

> diff --git a/src/glx/meson.build b/src/glx/meson.build
> new file mode 100644
> index 000..821623dc263
> --- /dev/null
> +++ b/src/glx/meson.build

> +dri_driver_dir = join_paths(get_option('prefix'), get_option('libdir'), 
> 'dri')
> +if not with_glvnd
> +  gl_lib_name = 'GL'
> +  gl_lib_version = '1.2'
> +else
> +  gl_lib_name = 'GLX_mesa'
> +  gl_lib_version = '0'
> +  files_libglx += files(
> +'g_glxglvnddispatchfuncs.c',
> +'g_glxglvnddispatchindices.h',
> +'glxglvnd.c',
> +'glxglvnd.h',
> +'glxglvnddispatchfuncs.h',
> +  )
> +endif
> +
> +gl_lib_cargs = [
> +  '-D_RENTRANT', '-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_driver_dir),

"_REENTRANT"

We probably actually don't need _REENTRANT at all -- if it's needed
here, it's surely needed across the tree, but _GNU_SOURCE should have us
covered.

> +]
> +
> +if dep_xf86vm != [] and dep_xf86vm.found()
> +  gl_lib_cargs += '-DHAVE_XF86VIDMODE'
> +endif
> +
> +libglx = static_library(
> +  'glx',
> +  [files_libglx, glx_indirect_c, glx_indirect_h, glx_indirect_init_c,
> +   glx_indirect_size_c, glx_indirect_size_h],
> +  include_directories : [
> +inc_common, inc_glapi,
> +include_directories('../loader', '../../include/GL/internal')
> +  ],
> +  c_args : [c_vis_args, gl_lib_cargs,
> +'-DGL_LIB_NAME="lib@0@.so.@1@"'.format(gl_lib_name, 
> gl_lib_version)],

GL_LIB_NAME looks like it was libGL.so.1 on !glvnd in automake, not
libGL.so.1.2.

> +  link_with : [libloader, libloader_dri3_helper, libmesa_util, libxmlconfig],
> +  dependencies : [dep_libdrm, dep_dri2proto, dep_glproto, dep_x11, 
> dep_glvnd],
> +  build_by_default : false,
> +)
> +
> +# workaround for bug #2180
> +dummy_c = custom_target(
> +  'dummy_c',
> +  output : 'dummy.c',
> +  command : [prog_touch, '@OUTPUT@'],
> +)
> +
> +if with_glx == 'dri'
> +  libgl = shared_library(
> +gl_lib_name,
> +dummy_c,  # workaround for bug #2180
> +include_directories : [
> +  inc_common, inc_glapi, 
> +  include_directories('../loader', '../../include/GL/internal')
> +],
> +link_with : [libglapi_static, libglapi],
> +link_whole : libglx,

It's not clear to me why we're building a static libglx above if it's
only used in one place.

> +link_args : [ld_args_bsymbolic, ld_args_gc_sections],

Missing -no-undefined?

> diff --git a/src/mapi/glapi/gen/meson.build b/src/mapi/glapi/gen/meson.build
> index f4c1343202c..cf1f014b4f0 100644
> --- a/src/mapi/glapi/gen/meson.build
> +++ b/src/mapi/glapi/gen/meson.build
> @@ -247,7 +247,7 @@ glx_indirect_size_h = custom_target(
>input : ['glX_proto_size.py', 'gl_API.xml'],
>output : 'indirect_size.h',
>command : [prog_python2, '@INPUT0@', '-f', '@INPUT1@', '-m', 'size_h',
> - '--header-tag', '_INDIRECT_SIZE_H_'],
> + '--header-tag', '_INDIRECT_SIZE_H_', '--only-set'],
>depend_files : glx_gen_depends,
>capture : true,
>  )
> @@ -256,7 +256,8 @@ glx_indirect_size_c = custom_target(
>'indirect_size.c',
>input : ['glX_proto_size.py', 'gl_API.xml'],
>

Re: [Mesa-dev] [PATCH 1/6] meson: Build i965 and dri stack

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> This gets pretty much the entire classic tree building, as well as
> i965, including the various glapis. There are some workarounds for bugs
> that are fixed in meson 0.43.0, which is due out on October 8th.
>
> I have tested this with piglit using glx.
>
> Signed-off-by: Dylan Baker 

> diff --git a/meson.build b/meson.build
> index 5de64acefd6..1824a7ea184 100644
> --- a/meson.build
> +++ b/meson.build

> @@ -336,7 +419,12 @@ endif
>  
>  # pthread stubs. Lets not and say we didn't
>  
> +_req_parse = with_opengl or with_gles1 or with_gles2
> +prog_bison = find_program('bison', required : _req_parse)
> +prog_flex = find_program('flex', required : _req_parse)

Just reuse with_any_opengl here?




Re: [Mesa-dev] [PATCH v3 10/12] anv: add nir lowering pass for ycrcb textures

2017-10-04 Thread Jason Ekstrand
On Wed, Oct 4, 2017 at 10:34 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> This pass implements all the implicit conversions required by the
> VK_KHR_sampler_ycbcr_conversion specification.
>
> It also inserts plane sources onto sampling instructions that we then
> let the pipeline layout pass deal with, when mapping things correctly
> to descriptors.
>
> v2: Add new file to meson build (Lionel)
> Use nir_frcp() rather than (1.0f / x) (Jason)
> Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason)
> Return progress (Jason)
> Account for array of samplers (Jason)
>
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/Makefile.sources   |   1 +
>  src/intel/vulkan/anv_nir.h   |   3 +
>  src/intel/vulkan/anv_nir_apply_pipeline_layout.c |  61 ++-
>  src/intel/vulkan/anv_nir_lower_ycbcr_textures.c  | 469
> +++
>  src/intel/vulkan/anv_pipeline.c  |   2 +
>  src/intel/vulkan/anv_private.h   |  16 +-
>  src/intel/vulkan/meson.build |   1 +
>  7 files changed, 547 insertions(+), 6 deletions(-)
>  create mode 100644 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index bca7a132b26..9672dcc252d 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -219,6 +219,7 @@ VULKAN_FILES := \
> vulkan/anv_nir_lower_input_attachments.c \
> vulkan/anv_nir_lower_multiview.c \
> vulkan/anv_nir_lower_push_constants.c \
> +   vulkan/anv_nir_lower_ycbcr_textures.c \
> vulkan/anv_pass.c \
> vulkan/anv_pipeline.c \
> vulkan/anv_pipeline_cache.c \
> diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
> index 5b450b45cdf..8ac0a119dac 100644
> --- a/src/intel/vulkan/anv_nir.h
> +++ b/src/intel/vulkan/anv_nir.h
> @@ -37,6 +37,9 @@ void anv_nir_lower_push_constants(nir_shader *shader);
>
>  bool anv_nir_lower_multiview(nir_shader *shader, uint32_t view_mask);
>
> +bool anv_nir_lower_ycbcr_textures(nir_shader *shader,
> +  struct anv_pipeline *pipeline);
> +
>  void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
> nir_shader *shader,
> struct brw_stage_prog_data *prog_data,
> diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> index 428cfdf42d1..28cbb98c563 100644
> --- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> +++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> @@ -131,7 +131,7 @@ lower_res_index_intrinsic(nir_intrinsic_instr *intrin,
>  static void
>  lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
>  unsigned *const_index, unsigned hw_binding_size,
> -nir_tex_src_type src_type,
> +nir_tex_src_type src_type, bool allow_indirect,
>  struct apply_pipeline_layout_state *state)
>  {
> nir_builder *b = &state->builder;
> @@ -141,6 +141,15 @@ lower_tex_deref(nir_tex_instr *tex, nir_deref_var
> *deref,
>nir_deref_array *deref_array = nir_deref_as_array(deref->
> deref.child);
>
>if (deref_array->deref_array_type == nir_deref_array_type_indirect)
> {
> + /* From VK_KHR_sampler_ycbcr_conversion:
> +  *
> +  * If sampler Y’CBCR conversion is enabled, the combined image
> +  * sampler must be indexed only by constant integral expressions
> when
> +  * aggregated into arrays in shader code, irrespective of the
> +  * shaderSampledImageArrayDynamicIndexing feature.
> +  */
> + assert(allow_indirect);
> +
>   nir_ssa_def *index =
>  nir_iadd(b, nir_imm_int(b, deref_array->base_offset),
>  nir_ssa_for_src(b, deref_array->indirect, 1));
> @@ -186,6 +195,46 @@ cleanup_tex_deref(nir_tex_instr *tex, nir_deref_var
> *deref)
> nir_instr_rewrite_src(&tex->instr, &deref_array->indirect,
> NIR_SRC_INIT);
>  }
>
> +static bool
> +has_tex_src_plane(nir_tex_instr *tex)
> +{
> +   for (unsigned i = 0; i < tex->num_srcs; i++) {
> +  if (tex->src[i].src_type == nir_tex_src_plane)
> + return true;
> +   }
> +
> +   return false;
> +}
> +
> +static uint32_t
> +extract_tex_src_plane(nir_tex_instr *tex)
> +{
> +   nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src, tex->num_srcs
> - 1);
> +   unsigned plane = 0;
> +
> +   for (unsigned i = 0, w = 0; i < tex->num_srcs; i++) {
> +  if (tex->src[i].src_type == nir_tex_src_plane) {
> + nir_const_value *const_plane =
> +nir_src_as_const_value(tex->src[i].src);
> +
> + /* Our color conversion lowering pass should only ever insert
> +  * constants. */
> + assert(const_plane);
> + plane = const_plane->u32[0];
> +  } 

Re: [Mesa-dev] [PATCH 1/6] meson: Build i965 and dri stack

2017-10-04 Thread Eric Anholt
Dylan Baker  writes:

> This gets pretty much the entire classic tree building, as well as
> i965, including the various glapis. There are some workarounds for bugs
> that are fixed in meson 0.43.0, which is due out on October 8th.
>
> I have tested this with piglit using glx.
>
> Signed-off-by: Dylan Baker 

I didn't do a side-by-side diff or anything, but this looks pretty
good.  A few comments...

> diff --git a/bin/install_megadrivers.py b/bin/install_megadrivers.py
> new file mode 100755
> index 000..50a4323a6e8
> --- /dev/null
> +++ b/bin/install_megadrivers.py
> @@ -0,0 +1,68 @@
> +#!/usr/bin/env python
> +# encoding=utf-8
> +# Copyright © 2017 Intel Corporation
> +
> +# Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> +# of this software and associated documentation files (the "Software"), to 
> deal
> +# in the Software without restriction, including without limitation the 
> rights
> +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> +# copies of the Software, and to permit persons to whom the Software is
> +# furnished to do so, subject to the following conditions:
> +
> +# The above copyright notice and this permission notice shall be included in
> +# all copies or substantial portions of the Software.
> +
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
> +# SOFTWARE.
> +
> +"""Script to install megadriver symlinks for meson."""
> +
> +import argparse
> +import errno
> +import os
> +import shutil
> +
> +
> +def main():
> +parser = argparse.ArgumentParser()
> +parser.add_argument('megadriver')
> +parser.add_argument('libdir')
> +parser.add_argument('drivers', nargs='+')
> +args = parser.parse_args()
> +
> +to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), 
> args.libdir)
> +
> +cross_found = False
> +
> +if not os.path.exists(to):
> +os.makedirs(to)
> +from_ = args.megadriver
> +
> +for each in args.drivers:
> +final = os.path.join(to, each)
> +if os.path.exists(final):
> +os.unlink(final)
> +print('installing {} to {}'.format(args.megadriver, to))
> +try:
> +os.link(from_, final)
> +except OSError as e:
> +if e.errno == errno.EXDEV:
> +if cross_found:
> +raise Exception('Something went very wrong.')
> +# if we hit this then we're trying to link from one 
> filesystem,
> +# which is obviously invalid. Instead copy the first binary,
> +# then set that as the from so that the hard links will work
> +shutil.copy(from_, final)
> +from_ = final
> +cross_found = True
> +else:
> +raise

The old megadrivers install method would install under the build target
name ("libmesa_dri_drivers.so"), link from that to the names we need,
then remove libmesa_dri_drivers.so.  I like the sound of that -- no
weird error handling like this, and if the developer decides to do
something silly to the build target in the tree, it doesn't have
surprise effects on the installed copy.  (I was worried about strip,
though it does look like strip makes a new inode.)

Would that be hard to do?
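
For reference, the install-then-link-then-remove approach described above
could look roughly like the following. This is only a sketch in Python 3:
the function name and its arguments are made up for illustration and are not
part of the patch, and it assumes the destination filesystem supports hard
links.

```python
import os
import shutil
import tempfile


def install_megadrivers(megadriver, libdir, drivers):
    """Install one megadriver binary under several driver names.

    Hypothetical helper: copy the build output into libdir under its own
    name, hard link every driver name to that copy, then remove the copy.
    """
    os.makedirs(libdir, exist_ok=True)

    # Copying first keeps every link on the destination filesystem and
    # gives the installed files an inode separate from the build tree.
    master = os.path.join(libdir, os.path.basename(megadriver))
    shutil.copy(megadriver, master)

    for name in drivers:
        final = os.path.join(libdir, name)
        if os.path.lexists(final):
            os.unlink(final)
        os.link(master, final)

    # The data stays alive through the driver links.
    os.unlink(master)


if __name__ == "__main__":
    build = tempfile.mkdtemp()
    dest = os.path.join(tempfile.mkdtemp(), "dri")
    built = os.path.join(build, "libmesa_dri_drivers.so")
    with open(built, "wb") as f:
        f.write(b"not a real driver")
    install_megadrivers(built, dest, ["i965_dri.so", "i915_dri.so"])
    print(sorted(os.listdir(dest)))
```

Because the copy happens before any link is made, there is no EXDEV case to
handle, and later changes to the file in the build tree cannot reach the
installed drivers.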

> diff --git a/meson.build b/meson.build
> index 5de64acefd6..1824a7ea184 100644
> --- a/meson.build
> +++ b/meson.build

> @@ -204,7 +255,39 @@ endif
>  
>  # TODO: cross-compiling. I don't think this is relavent to meson
>  
> -# TODO: assembly support. mesa and vc4
> +# FIXME: enable asm when cross compiler
> +# This is doable (autotools does it), but it's not of immediate concern
> +if meson.is_cross_build()
> +  message('Cross compiling, disabling asm')
> +  with_asm = false
> +endif
> +
> +with_asm_arch = ''
> +if with_asm
> +  # TODO: SPARC and PPC
> +  if target_machine.cpu_family() == 'x86'
> +if ['linux', 'bsd'].contains(target_machine.system()) # FIXME: hurd?
> +  with_asm_arch = 'x86'
> +  pre_args += ['-DUSE_X86_ASM', '-DUSE_MMX_ASM', '-DUSE_3DNOW_ASM',
> +   '-DUSE_SSE_ASM']
> +endif
> +  elif target_machine.cpu_family() == 'x86_64'
> +if target_machine.system() == 'linux'
> +  with_asm_arch = 'x86_64'
> +  pre_args += ['-DUSE_X86_64_ASM']
> +endif
> +  elif target_machine.cpu_family() == 'arm'
> +if target_machine.system() == 'linux'
> +  with_asm_arch = 'arm'
> +  pre_args += ['-DUSE_ARM_ASM']
> +endif
> +  elif 

Re: [Mesa-dev] [PATCH v3 06/12] anv: modify the internal concept of format to express multiple planes

2017-10-04 Thread Jason Ekstrand
All my comments below are on chunks that are no longer needed now that
anv_get_isl_format hasn't had its name changed.  Take or leave them as
your personal level of pedantry dictates. :)

On Wed, Oct 4, 2017 at 10:34 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> A given Vulkan format can now be decomposed into a set of planes. We
> now use 'struct anv_format_plane' to represent the format of those
> planes.
>
> v2: by Jason
> Rename anv_get_plane_format() to anv_get_format_plane()
> Don't rename anv_get_isl_format()
> Replace ds_fmt() by fmt2()
> Introduce fmt_unsupported()
>
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/vulkan/anv_blorp.c |  18 +-
>  src/intel/vulkan/anv_formats.c   | 512 +-
> -
>  src/intel/vulkan/anv_image.c |  12 +-
>  src/intel/vulkan/anv_private.h   |  54 -
>  src/intel/vulkan/genX_pipeline.c |   7 +-
>  5 files changed, 339 insertions(+), 264 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 8dead1d87a8..187042c71cf 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -459,12 +459,12 @@ void anv_CmdBlitImage(
>get_blorp_surf_for_anv_image(dst_image, dst_res->aspectMask,
> dst_image->aux_usage, );
>
> -  struct anv_format src_format =
> - anv_get_format(&cmd_buffer->device->info, src_image->vk_format,
> -src_res->aspectMask, src_image->tiling);
> -  struct anv_format dst_format =
> - anv_get_format(&cmd_buffer->device->info, dst_image->vk_format,
> -dst_res->aspectMask, dst_image->tiling);
> +  struct anv_format_plane src_format =
> + anv_get_format_plane(&cmd_buffer->device->info,
> src_image->vk_format,
> +  src_res->aspectMask, src_image->tiling);
> +  struct anv_format_plane dst_format =
> + anv_get_format_plane(&cmd_buffer->device->info,
> dst_image->vk_format,
> +  dst_res->aspectMask, dst_image->tiling);
>
>unsigned dst_start, dst_end;
>if (dst_image->type == VK_IMAGE_TYPE_3D) {
> @@ -758,9 +758,9 @@ void anv_CmdClearColorImage(
>
>assert(pRanges[r].aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
>
> -  struct anv_format src_format =
> - anv_get_format(&cmd_buffer->device->info, image->vk_format,
> -VK_IMAGE_ASPECT_COLOR_BIT, image->tiling);
> +  struct anv_format_plane src_format =
> + anv_get_format_plane(&cmd_buffer->device->info,
> image->vk_format,
> +  VK_IMAGE_ASPECT_COLOR_BIT, image->tiling);
>
>unsigned base_layer = pRanges[r].baseArrayLayer;
>unsigned layer_count = anv_get_layerCount(image, &pRanges[r]);
> diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_
> formats.c
> index 9db80ba14e3..e623b4f6324 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -44,14 +44,40 @@
>  #define BGRA _ISL_SWIZZLE(BLUE, GREEN, RED, ALPHA)
>  #define RGB1 _ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
>
> -#define swiz_fmt(__vk_fmt, __hw_fmt, __swizzle) \
> +#define _fmt(__hw_fmt, __swizzle) \
> +   { .isl_format = __hw_fmt, \
> + .swizzle = __swizzle }
> +
> +#define swiz_fmt1(__vk_fmt, __hw_fmt, __swizzle) \
> [VK_ENUM_OFFSET(__vk_fmt)] = { \
> -  .isl_format = __hw_fmt, \
> -  .swizzle = __swizzle, \
> +  .planes = { \
> +  { .isl_format = __hw_fmt, .swizzle = __swizzle }, \
> +  }, \
> +  .n_planes = 1, \
> }
>
> -#define fmt(__vk_fmt, __hw_fmt) \
> -   swiz_fmt(__vk_fmt, __hw_fmt, RGBA)
> +#define fmt1(__vk_fmt, __hw_fmt) \
> +   swiz_fmt1(__vk_fmt, __hw_fmt, RGBA)
> +
> +#define fmt2(__vk_fmt, __fmt1, __fmt2) \
> +   [VK_ENUM_OFFSET(__vk_fmt)] = { \
> +  .planes = { \
> + { .isl_format = __fmt1, \
> +   .swizzle = RGBA,   \
> + }, \
> + { .isl_format = __fmt2, \
> +   .swizzle = RGBA,   \
> + }, \
> +  }, \
> +  .n_planes = 2, \
> +   }
> +
> +#define fmt_unsupported(__vk_fmt) \
> +   [VK_ENUM_OFFSET(__vk_fmt)] = { \
> +  .planes = { \
> + { .isl_format = ISL_FORMAT_UNSUPPORTED, }, \
> +  }, \
> +   }
>
>  /* HINT: For array formats, the ISL name should match the VK name.  For
>   * packed formats, they should have the channels in reverse order from
> each
> @@ -59,196 +85,199 @@
>   * bspec) names are in LSB -> MSB order while VK formats are MSB -> LSB.
>   */
>  static const struct anv_format main_formats[] = {
> -   fmt(VK_FORMAT_UNDEFINED,   ISL_FORMAT_UNSUPPORTED),
> -   fmt(VK_FORMAT_R4G4_UNORM_PACK8,ISL_FORMAT_UNSUPPORTED),
> -   fmt(VK_FORMAT_R4G4B4A4_UNORM_PACK16,   ISL_FORMAT_A4B4G4R4_UNORM),
> -   swiz_fmt(VK_FORMAT_B4G4R4A4_UNORM_PACK16,
>  ISL_FORMAT_A4B4G4R4_UNORM,  BGRA),
> -   

[Mesa-dev] [PATCH] gallium: remove TGSI_OPCODE_KILL

2017-10-04 Thread Marek Olšák
From: Marek Olšák 

It can be recognized from KILL_IF by checking if the src operand is IMM.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 11 --
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c|  3 --
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c| 42 --
 src/gallium/auxiliary/nir/tgsi_to_nir.c| 14 
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 18 --
 src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h |  2 +-
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h   |  1 -
 src/gallium/auxiliary/tgsi/tgsi_scan.c |  3 +-
 src/gallium/auxiliary/vl/vl_mc.c   |  2 +-
 src/gallium/docs/source/tgsi.rst   |  5 ---
 src/gallium/drivers/i915/i915_fpc_optimize.c   |  1 -
 src/gallium/drivers/i915/i915_fpc_translate.c  | 13 ---
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  3 --
 src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c   |  3 --
 src/gallium/drivers/r300/r300_tgsi_to_rc.c |  1 -
 src/gallium/drivers/r600/r600_shader.c | 13 +++
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c  | 32 ++---
 src/gallium/drivers/svga/svga_tgsi_insn.c  | 29 +--
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 21 ---
 src/gallium/include/pipe/p_shader_tokens.h |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  3 +-
 21 files changed, 22 insertions(+), 200 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index ce2b927..edcfc6e 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -355,30 +355,20 @@ kil_fetch_args(
/* src0.z */
emit_data->args[2] = lp_build_emit_fetch(bld_base, emit_data->inst,
 0, TGSI_CHAN_Z);
/* src0.w */
emit_data->args[3] = lp_build_emit_fetch(bld_base, emit_data->inst,
 0, TGSI_CHAN_W);
emit_data->arg_count = 4;
emit_data->dst_type = 
LLVMVoidTypeInContext(bld_base->base.gallivm->context);
 }
 
-/* TGSI_OPCODE_KILL */
-
-static void
-kilp_fetch_args(
-   struct lp_build_tgsi_context * bld_base,
-   struct lp_build_emit_data * emit_data)
-{
-   emit_data->dst_type = 
LLVMVoidTypeInContext(bld_base->base.gallivm->context);
-}
-
 /* TGSI_OPCODE_LIT */
 
 static void
 lit_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
/* src0.x */
emit_data->args[0] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, 
TGSI_CHAN_X);
/* src0.y */
@@ -1172,21 +1162,20 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
bld_base)
bld_base->op_actions[TGSI_OPCODE_POW] = pow_action;
bld_base->op_actions[TGSI_OPCODE_UP2H] = up2h_action;
 
bld_base->op_actions[TGSI_OPCODE_SWITCH].fetch_args = 
scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_CASE].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_COS].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_EX2].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_IF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_UIF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILL_IF].fetch_args = kil_fetch_args;
-   bld_base->op_actions[TGSI_OPCODE_KILL].fetch_args = kilp_fetch_args;
bld_base->op_actions[TGSI_OPCODE_RCP].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_SIN].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_LG2].fetch_args = scalar_unary_fetch_args;
 
bld_base->op_actions[TGSI_OPCODE_ADD].emit = add_emit;
bld_base->op_actions[TGSI_OPCODE_ARR].emit = arr_emit;
bld_base->op_actions[TGSI_OPCODE_END].emit = end_emit;
bld_base->op_actions[TGSI_OPCODE_FRC].emit = frc_emit;
bld_base->op_actions[TGSI_OPCODE_LRP].emit = lrp_emit;
bld_base->op_actions[TGSI_OPCODE_MAD].emit = mad_emit;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
index 2529c6a..675b9a5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
@@ -595,23 +595,20 @@ lp_emit_instruction_aos(
   tmp0 = swizzle_scalar_aos(bld, src0, TGSI_SWIZZLE_X);
   dst0 = lp_build_cos(&bld->bld_base.base, tmp0);
   break;
 
case TGSI_OPCODE_DDX:
   return FALSE;
 
case TGSI_OPCODE_DDY:
   return FALSE;
 
-   case TGSI_OPCODE_KILL:
-  return FALSE;
-
case TGSI_OPCODE_KILL_IF:
   return FALSE;
 
case TGSI_OPCODE_PK2H:
   return FALSE;
   break;
 
case TGSI_OPCODE_PK2US:
   return FALSE;
   break;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 

Re: [Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Rob Clark
On Wed, Oct 4, 2017 at 3:33 PM, Roland Scheidegger  wrote:
> On 04.10.2017 at 17:44, Rob Clark wrote:
>> Signed-off-by: Rob Clark 
>> ---
>>  src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
>>  src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
>>  src/gallium/drivers/i915/i915_screen.c  |  1 +
>>  src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
>>  src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
>>  src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
>>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
>>  src/gallium/drivers/r300/r300_screen.c  |  1 +
>>  src/gallium/drivers/r600/r600_pipe.c|  1 +
>>  src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
>>  src/gallium/drivers/softpipe/sp_screen.c|  1 +
>>  src/gallium/drivers/svga/svga_screen.c  |  1 +
>>  src/gallium/drivers/swr/swr_screen.cpp  |  1 +
>>  src/gallium/drivers/vc4/vc4_screen.c|  1 +
>>  src/gallium/drivers/virgl/virgl_screen.c|  1 +
>>  src/gallium/include/pipe/p_defines.h| 21 
>> +
>>  src/gallium/include/state_tracker/st_api.h  |  2 ++
>>  src/gallium/state_trackers/dri/dri_context.c| 11 +++
>>  src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
>>  src/mesa/state_tracker/st_manager.c |  5 +
>>  20 files changed, 61 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
>> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> index 42905ab0620..16bd4b7c0fb 100644
>> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> @@ -264,6 +264,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
>> pipe_cap param)
>> case PIPE_CAP_QUERY_SO_OVERFLOW:
>> case PIPE_CAP_MEMOBJ:
>> case PIPE_CAP_LOAD_CONSTBUF:
>> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>return 0;
>>
>> /* Stream output. */
>> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
>> b/src/gallium/drivers/freedreno/freedreno_screen.c
>> index 040c2c99ec0..96866d656be 100644
>> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
>> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
>> @@ -325,6 +325,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
>> pipe_cap param)
>>   case PIPE_CAP_QUERY_SO_OVERFLOW:
>>   case PIPE_CAP_MEMOBJ:
>>   case PIPE_CAP_LOAD_CONSTBUF:
>> + case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>   return 0;
>>
>>   case PIPE_CAP_MAX_VIEWPORTS:
>> diff --git a/src/gallium/drivers/i915/i915_screen.c 
>> b/src/gallium/drivers/i915/i915_screen.c
>> index 8411c0f15cc..7bcf479c4be 100644
>> --- a/src/gallium/drivers/i915/i915_screen.c
>> +++ b/src/gallium/drivers/i915/i915_screen.c
>> @@ -317,6 +317,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
>> cap)
>> case PIPE_CAP_QUERY_SO_OVERFLOW:
>> case PIPE_CAP_MEMOBJ:
>> case PIPE_CAP_LOAD_CONSTBUF:
>> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>return 0;
>>
>> case PIPE_CAP_MAX_VIEWPORTS:
>> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
>> b/src/gallium/drivers/llvmpipe/lp_screen.c
>> index 53171162a54..19411adaf07 100644
>> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
>> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
>> @@ -360,6 +360,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
>> pipe_cap param)
>> case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
>> case PIPE_CAP_MEMOBJ:
>> case PIPE_CAP_LOAD_CONSTBUF:
>> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>return 0;
>> }
>> /* should only get here on unhandled cases */
>> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
>> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>> index a66b4fbe67b..782ba0a64db 100644
>> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>> @@ -224,6 +224,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
>> pipe_cap param)
>> case PIPE_CAP_QUERY_SO_OVERFLOW:
>> case PIPE_CAP_MEMOBJ:
>> case PIPE_CAP_LOAD_CONSTBUF:
>> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>return 0;
>>
>> case PIPE_CAP_VENDOR_ID:
>> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
>> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>> index 479283e1b7c..997cb4e71dc 100644
>> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>> @@ -276,6 +276,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
>> pipe_cap param)
>> case PIPE_CAP_QUERY_SO_OVERFLOW:
>> case PIPE_CAP_MEMOBJ:
>> case PIPE_CAP_LOAD_CONSTBUF:
>> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>>return 0;
>>
>> case PIPE_CAP_VENDOR_ID:
>> diff --git 

Re: [Mesa-dev] [PATCH 1/2] radv: check that pipeline is different before binding it

2017-10-04 Thread Bas Nieuwenhuizen
Series is

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Oct 4, 2017 at 10:27 PM, Samuel Pitoiset
 wrote:
> We only need to dirty the descriptors when the pipeline is
> a new one, because user SGPRs can be potentially different.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 61ea11c12a..4b41b358e9 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2454,14 +2454,20 @@ void radv_CmdBindPipeline(
> RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
> RADV_FROM_HANDLE(radv_pipeline, pipeline, _pipeline);
>
> -   radv_mark_descriptor_sets_dirty(cmd_buffer);
> -
> switch (pipelineBindPoint) {
> case VK_PIPELINE_BIND_POINT_COMPUTE:
> +   if (cmd_buffer->state.compute_pipeline == pipeline)
> +   return;
> +   radv_mark_descriptor_sets_dirty(cmd_buffer);
> +
> cmd_buffer->state.compute_pipeline = pipeline;
> cmd_buffer->push_constant_stages |= 
> VK_SHADER_STAGE_COMPUTE_BIT;
> break;
> case VK_PIPELINE_BIND_POINT_GRAPHICS:
> +   if (cmd_buffer->state.pipeline == pipeline)
> +   return;
> +   radv_mark_descriptor_sets_dirty(cmd_buffer);
> +
> cmd_buffer->state.pipeline = pipeline;
> if (!pipeline)
> break;
> --
> 2.14.2
>


[Mesa-dev] [PATCH 1/2] radv: check that pipeline is different before binding it

2017-10-04 Thread Samuel Pitoiset
We only need to dirty the descriptors when the pipeline is
a new one, because user SGPRs can be potentially different.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 61ea11c12a..4b41b358e9 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2454,14 +2454,20 @@ void radv_CmdBindPipeline(
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_pipeline, pipeline, _pipeline);
 
-   radv_mark_descriptor_sets_dirty(cmd_buffer);
-
switch (pipelineBindPoint) {
case VK_PIPELINE_BIND_POINT_COMPUTE:
+   if (cmd_buffer->state.compute_pipeline == pipeline)
+   return;
+   radv_mark_descriptor_sets_dirty(cmd_buffer);
+
cmd_buffer->state.compute_pipeline = pipeline;
cmd_buffer->push_constant_stages |= VK_SHADER_STAGE_COMPUTE_BIT;
break;
case VK_PIPELINE_BIND_POINT_GRAPHICS:
+   if (cmd_buffer->state.pipeline == pipeline)
+   return;
+   radv_mark_descriptor_sets_dirty(cmd_buffer);
+
cmd_buffer->state.pipeline = pipeline;
if (!pipeline)
break;
-- 
2.14.2
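
The early return added by this patch is the usual "skip redundant state
change" pattern. As a rough illustration only, here is a toy model in Python
rather than the driver's C; the class, its fields, and the string bind points
are invented for the example and do not exist in radv.

```python
class CmdBufferModel:
    """Toy stand-in for the command buffer state touched by the patch."""

    def __init__(self):
        self.pipelines = {"compute": None, "graphics": None}
        self.descriptor_dirty_count = 0

    def bind_pipeline(self, bind_point, pipeline):
        # Re-binding the pipeline that is already bound changes nothing,
        # so the descriptor sets are not marked dirty again.
        if self.pipelines[bind_point] is pipeline:
            return False

        # A different pipeline may use different user SGPRs, so the
        # descriptors have to be treated as dirty.
        self.descriptor_dirty_count += 1
        self.pipelines[bind_point] = pipeline
        return True


if __name__ == "__main__":
    cb = CmdBufferModel()
    blit = object()
    cb.bind_pipeline("graphics", blit)
    cb.bind_pipeline("graphics", blit)  # skipped, already bound
    print(cb.descriptor_dirty_count)
```

With the check inside the bind function itself, callers no longer need their
own "is it already bound?" guard, which is what patch 2/2 removes.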



[Mesa-dev] [PATCH 2/2] radv: remove useless checks around radv_CmdBindPipeline()

2017-10-04 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_blit.c   |  6 ++--
 src/amd/vulkan/radv_meta_blit2d.c | 18 --
 src/amd/vulkan/radv_meta_bufimage.c   | 65 +++
 src/amd/vulkan/radv_meta_clear.c  | 12 +++
 src/amd/vulkan/radv_meta_decompress.c |  8 ++---
 src/amd/vulkan/radv_meta_fast_clear.c |  6 ++--
 src/amd/vulkan/radv_meta_resolve.c|  9 ++---
 src/amd/vulkan/radv_meta_resolve_cs.c |  7 ++--
 8 files changed, 34 insertions(+), 97 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_blit.c b/src/amd/vulkan/radv_meta_blit.c
index a0be498de5..88df1f7f41 100644
--- a/src/amd/vulkan/radv_meta_blit.c
+++ b/src/amd/vulkan/radv_meta_blit.c
@@ -409,10 +409,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
unreachable(!"bad VkImageType");
}
 
-   if (cmd_buffer->state.pipeline != radv_pipeline_from_handle(pipeline)) {
-   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
-VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
-   }
+   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
+VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
 
radv_meta_push_descriptor_set(cmd_buffer, 
VK_PIPELINE_BIND_POINT_GRAPHICS,
  device->meta_state.blit.pipeline_layout,
diff --git a/src/amd/vulkan/radv_meta_blit2d.c 
b/src/amd/vulkan/radv_meta_blit2d.c
index 946c741a27..30f58abb5f 100644
--- a/src/amd/vulkan/radv_meta_blit2d.c
+++ b/src/amd/vulkan/radv_meta_blit2d.c
@@ -186,10 +186,8 @@ bind_pipeline(struct radv_cmd_buffer *cmd_buffer,
VkPipeline pipeline =

cmd_buffer->device->meta_state.blit2d.pipelines[src_type][fs_key];
 
-   if (cmd_buffer->state.pipeline != radv_pipeline_from_handle(pipeline)) {
-   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
-VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
-   }
+   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
+VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
 }
 
 static void
@@ -199,10 +197,8 @@ bind_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
VkPipeline pipeline =

cmd_buffer->device->meta_state.blit2d.depth_only_pipeline[src_type];
 
-   if (cmd_buffer->state.pipeline != radv_pipeline_from_handle(pipeline)) {
-   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
-VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
-   }
+   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
+VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
 }
 
 static void
@@ -212,10 +208,8 @@ bind_stencil_pipeline(struct radv_cmd_buffer *cmd_buffer,
VkPipeline pipeline =

cmd_buffer->device->meta_state.blit2d.stencil_only_pipeline[src_type];
 
-   if (cmd_buffer->state.pipeline != radv_pipeline_from_handle(pipeline)) {
-   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
-VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
-   }
+   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
+VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
 }
 
 static void
diff --git a/src/amd/vulkan/radv_meta_bufimage.c 
b/src/amd/vulkan/radv_meta_bufimage.c
index cb028dccdc..f5bbf3cb90 100644
--- a/src/amd/vulkan/radv_meta_bufimage.c
+++ b/src/amd/vulkan/radv_meta_bufimage.c
@@ -865,18 +865,6 @@ itob_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
  });
 }
 
-static void
-itob_bind_pipeline(struct radv_cmd_buffer *cmd_buffer)
-{
-   VkPipeline pipeline =
-   cmd_buffer->device->meta_state.itob.pipeline;
-
-   if (cmd_buffer->state.compute_pipeline != 
radv_pipeline_from_handle(pipeline)) {
-   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
-VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
-   }
-}
-
 void
 radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
  struct radv_meta_blit2d_surf *src,
@@ -884,6 +872,7 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer 
*cmd_buffer,
  unsigned num_rects,
  struct radv_meta_blit2d_rect *rects)
 {
+   VkPipeline pipeline = cmd_buffer->device->meta_state.itob.pipeline;
struct radv_device *device = cmd_buffer->device;
struct itob_temps temps;
 
@@ -891,7 +880,9 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer 
*cmd_buffer,
create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, 
&temps.dst_bview);
 itob_bind_descriptors(cmd_buffer, &temps);
 
-   itob_bind_pipeline(cmd_buffer);
+
+   

[Mesa-dev] [Bug 103078] MATLAB broken with mesa software rendering

2017-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103078

--- Comment #4 from sergio.calleg...@gmail.com ---
@Brian

thanks for testing! Indeed, I think that Matlab 2016 is fine with the NVIDIA
proprietary driver. However, I cannot use it because I have a KDE desktop and
the nvidia proprietary drivers hang the konsole as per

https://devtalk.nvidia.com/default/topic/879586/linux/kf5-konsole-15-04-and-15-08-consumes-100-cpu-on-close-only-with-proprietary-nvidia-driver/2

https://bugs.kde.org/show_bug.cgi?id=343803

https://bugreports.qt.io/browse/blockquote%3E

Furthermore, I prefer the free drivers when my graphics card copes with them.

The different response to the opengl info may be related to a slightly
different mesa/graphics stack combination. Incidentally, it is unclear to me
whether you got the "MATLAB has experienced a low-level graphics error" with
llvmpipe or the hardware driver.

Matlab seems to use some java library to interact with opengl and I do not know
if it is something commercial or something open source that MathWorks has
adapted to its needs. In the latter case, testing might be easier.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] ac/nir: use llvm fma intrinsic if nir instruction is exact.

2017-10-04 Thread Dave Airlie
From: Dave Airlie 

As pointed out by Connor we still need to use fma if nir wants
exact (precise) behaviour.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 11ba487..38a2bbe 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
  result);
break;
case nir_op_ffma:
-   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
+   result = emit_intrin_3f_param(&ctx->ac, instr->exact ? 
"llvm.fma" : "llvm.fmuladd",
  ac_to_float_type(&ctx->ac, 
def_type), src[0], src[1], src[2]);
break;
case nir_op_ibitfield_extract:
-- 
2.9.5
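
The selection made by the hunk above can be sketched in isolation. This is
only an illustration: ffma_intrinsic_name() is a hypothetical helper, not a
function in ac_nir_to_llvm.c, and its bool argument stands in for
instr->exact.

```c
#include <stdbool.h>

/* Hypothetical helper (not Mesa API): pick the LLVM intrinsic name for a
 * NIR ffma.  An exact (precise/invariant) ffma must always be fused, so it
 * maps to llvm.fma; otherwise llvm.fmuladd lets the backend choose.
 */
static const char *
ffma_intrinsic_name(bool exact)
{
   return exact ? "llvm.fma" : "llvm.fmuladd";
}
```

The returned string would then be passed where the patch passes the
intrinsic name to emit_intrin_3f_param().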



Re: [Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Marek Olšák
What others said, and yes radeonsi shouldn't expose the extension.
Other than those:

Reviewed-by: Marek Olšák 

Marek

On Wed, Oct 4, 2017 at 5:44 PM, Rob Clark  wrote:
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
>  src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
>  src/gallium/drivers/i915/i915_screen.c  |  1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
>  src/gallium/drivers/r300/r300_screen.c  |  1 +
>  src/gallium/drivers/r600/r600_pipe.c|  1 +
>  src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
>  src/gallium/drivers/softpipe/sp_screen.c|  1 +
>  src/gallium/drivers/svga/svga_screen.c  |  1 +
>  src/gallium/drivers/swr/swr_screen.cpp  |  1 +
>  src/gallium/drivers/vc4/vc4_screen.c|  1 +
>  src/gallium/drivers/virgl/virgl_screen.c|  1 +
>  src/gallium/include/pipe/p_defines.h| 21 
> +
>  src/gallium/include/state_tracker/st_api.h  |  2 ++
>  src/gallium/state_trackers/dri/dri_context.c| 11 +++
>  src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
>  src/mesa/state_tracker/st_manager.c |  5 +
>  20 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index 42905ab0620..16bd4b7c0fb 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -264,6 +264,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>
> /* Stream output. */
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index 040c2c99ec0..96866d656be 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -325,6 +325,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
> return 0;
>
> case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/i915/i915_screen.c 
> b/src/gallium/drivers/i915/i915_screen.c
> index 8411c0f15cc..7bcf479c4be 100644
> --- a/src/gallium/drivers/i915/i915_screen.c
> +++ b/src/gallium/drivers/i915/i915_screen.c
> @@ -317,6 +317,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
> cap)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>
> case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 53171162a54..19411adaf07 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -360,6 +360,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
> case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
> }
> /* should only get here on unhandled cases */
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> index a66b4fbe67b..782ba0a64db 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> @@ -224,6 +224,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> index 479283e1b7c..997cb4e71dc 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> @@ -276,6 +276,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> diff --git 

Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Marek Olšák
On Wed, Oct 4, 2017 at 7:35 PM, Ilia Mirkin  wrote:
> Ah OK. So llvm.fmuladd is more like llvm.fmadontcare. Wrong assumption
> on my part.

The LLVM backend selects MAD (unfused) for fmuladd, and FMA (fused) for fma.

Personally I would prefer having separate fmul and fadd in LLVM IR
instead of the intrinsic.

Marek

>
> On Wed, Oct 4, 2017 at 1:00 PM, Connor Abbott  wrote:
>> No. From the LLVM langref:
>>
>> The ‘llvm.fmuladd.*‘ intrinsic functions represent multiply-add
>> expressions that can be fused if the code generator determines that
>> (a) the target instruction set has support for a fused operation, and
>> (b) that the fused operation is more efficient than the equivalent,
>> separate pair of mul and add instructions.
>>
>> The (b) part is especially important -- it says that LLVM can pick and
>> choose which fmuladd intrinsics to turn into FMA instructions, or
>> unfused MULADD instructions, or just a sequence of mul+add. For
>> example, if many instructions call fmuladd with the first two
>> arguments the same, it can break it up into a mul followed by a bunch
>> of adds. That wouldn't be ok under the GLSL precise semantics
>> (assuming the target would've used FMA otherwise, which I think some
>> GCN cards will do).
>>
>> Also, and maybe more importantly, if an app developer explicitly asks
>> for fma() with a precise modifier, it's probably not a great idea to
>> then give them an unfused mul+add -- it's legal, thanks to GLSL's
>> weasel-wording, but probably not what you really want, on HW which
>> actually does have an FMA instruction :)
>>
>> Connor
>>
>>
>> On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin  wrote:
>>> Wouldn't this guarantee that nothing is fused (and thus fine)?
>>> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>>>
>>> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott  wrote:
>>>> If the fma has the exact flag, then we need to use the llvm.fma
>>>> intrinsic. These come from fma() calls with the precise or invariant
>>>> qualifiers in GLSL, where you basically have to fuse everything or
>>>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>>>>
>>>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie  wrote:
>>>>> From: Dave Airlie 
>>>>>
>>>>> For Vulkan SPIR-V the spec states
>>>>> fma() Inherited from OpFMul followed by OpFAdd.
>>>>>
>>>>> Matt says the backend will do the right thing depending on the
>>>>> hardware being compiled for, if you use the fmuladd intrinsic.
>>>>>
>>>>> Using the Mad Max pts test, on high settings at 4K:
>>>>> CHP: 55->60
>>>>> HGDD: 46->50
>>>>> LM: 55->60
>>>>> No change on Stronghold.
>>>>>
>>>>> Thanks to Feral for spending the time to track this down.
>>>>>
>>>>> Signed-off-by: Dave Airlie 
>>>>> ---
>>>>>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/amd/common/ac_nir_to_llvm.c 
>>>>> b/src/amd/common/ac_nir_to_llvm.c
>>>>> index d7b6259..11ba487 100644
>>>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, 
>>>>> const nir_alu_instr *instr)
>>>>>   result);
>>>>> break;
>>>>> case nir_op_ffma:
>>>>> -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>>>>> +   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>>>>   ac_to_float_type(&ctx->ac, 
>>>>> def_type), src[0], src[1], src[2]);
>>>>> break;
>>>>> case nir_op_ibitfield_extract:
>>>>> --
>>>>> 2.9.4
>>>>>


Re: [Mesa-dev] [PATCH 5/6] freedreno: context priority support

2017-10-04 Thread Roland Scheidegger
On 04.10.2017 at 17:44, Rob Clark wrote:
> For devices (and kernels) which support different priority ringbuffers,
> expose context priority support.
> 
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/drivers/freedreno/freedreno_context.c |  9 -
>  src/gallium/drivers/freedreno/freedreno_screen.c  | 12 +++-
>  src/gallium/drivers/freedreno/freedreno_screen.h  |  1 +
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
> b/src/gallium/drivers/freedreno/freedreno_context.c
> index 20480f4f8c1..7fdb848f380 100644
> --- a/src/gallium/drivers/freedreno/freedreno_context.c
> +++ b/src/gallium/drivers/freedreno/freedreno_context.c
> @@ -249,10 +249,17 @@ fd_context_init(struct fd_context *ctx, struct 
> pipe_screen *pscreen,
>  {
>   struct fd_screen *screen = fd_screen(pscreen);
>   struct pipe_context *pctx;
> + unsigned prio = 1;
>   int i;
>  
> + /* lower numerical value == higher priority: */
> + if (flags & PIPE_CONTEXT_HIGH_PRIORITY)
> + prio = 0;
> + else if (flags & PIPE_CONTEXT_LOW_PRIORITY)
> + prio = 2;
> +
>   ctx->screen = screen;
> - ctx->pipe = fd_pipe_new(screen->dev, FD_PIPE_3D);
> + ctx->pipe = fd_pipe_new2(screen->dev, FD_PIPE_3D, prio);
>  
>   ctx->primtypes = primtypes;
>   ctx->primtype_mask = 0;
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index 96866d656be..aa451f501ff 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -325,9 +325,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_QUERY_SO_OVERFLOW:
>   case PIPE_CAP_MEMOBJ:
>   case PIPE_CAP_LOAD_CONSTBUF:
> - case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>   return 0;
>  
> + case PIPE_CAP_CONTEXT_PRIORITY_MASK:
> + return screen->priority_mask;
> +
>   case PIPE_CAP_MAX_VIEWPORTS:
>   return 1;
>  
> @@ -803,6 +805,14 @@ fd_screen_create(struct fd_device *dev)
>   }
>   screen->chip_id = val;
>  
> +	if (fd_pipe_get_param(screen->pipe, FD_NR_RINGS, &val)) {
> + DBG("could not get # of rings");
> + screen->priority_mask = 0;
> + } else {
> + /* # of rings equates to number of unique priority values: */
> + screen->priority_mask = (1 << val) - 1;
> + }
This doesn't quite seem to guarantee that you only return valid values for
the cap, unless the number of rings never exceeds 3. Maybe that's always
the case, but I think you should either mention that in the comment or
explicitly mask off the invalid bits.
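
A minimal sketch of the explicit masking, assuming three priority levels
(low, medium, high) so that only the low three bits of the cap are valid.
priority_mask_from_rings() and VALID_PRIORITY_BITS are hypothetical names
for illustration, not freedreno or gallium API.

```c
#include <stdint.h>

/* Assumption: three priority levels, hence three valid cap bits. */
#define VALID_PRIORITY_BITS 0x7u

/* Hypothetical helper: turn a ring count into a priority mask and drop any
 * bits that do not correspond to a defined priority level.
 */
static uint32_t
priority_mask_from_rings(uint64_t nr_rings)
{
   if (nr_rings == 0)
      return 0;

   /* Shifting a 32-bit value by 32 or more is undefined, so saturate. */
   if (nr_rings >= 32)
      return VALID_PRIORITY_BITS;

   uint32_t mask = ((uint32_t)1 << nr_rings) - 1;

   return mask & VALID_PRIORITY_BITS;
}
```

With this shape, a kernel reporting more than three rings still yields a
mask of 0x7 rather than advertising undefined bits.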


>   DBG("Pipe Info:");
>   DBG(" GPU-id:  %d", screen->gpu_id);
>   DBG(" Chip-id: 0x%08x", screen->chip_id);
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.h 
> b/src/gallium/drivers/freedreno/freedreno_screen.h
> index 68518ef721b..d5e497d4f65 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.h
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.h
> @@ -67,6 +67,7 @@ struct fd_screen {
>   uint32_t max_rts;/* max # of render targets */
>   uint32_t gmem_alignw, gmem_alignh;
>   uint32_t num_vsc_pipes;
> + uint32_t priority_mask;
>   bool has_timestamp;
>  
>   void *compiler;  /* currently unused for a2xx */
> 



Re: [Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Roland Scheidegger
On 04.10.2017 at 17:44, Rob Clark wrote:
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
>  src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
>  src/gallium/drivers/i915/i915_screen.c  |  1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
>  src/gallium/drivers/r300/r300_screen.c  |  1 +
>  src/gallium/drivers/r600/r600_pipe.c|  1 +
>  src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
>  src/gallium/drivers/softpipe/sp_screen.c|  1 +
>  src/gallium/drivers/svga/svga_screen.c  |  1 +
>  src/gallium/drivers/swr/swr_screen.cpp  |  1 +
>  src/gallium/drivers/vc4/vc4_screen.c|  1 +
>  src/gallium/drivers/virgl/virgl_screen.c|  1 +
>  src/gallium/include/pipe/p_defines.h| 21 
> +
>  src/gallium/include/state_tracker/st_api.h  |  2 ++
>  src/gallium/state_trackers/dri/dri_context.c| 11 +++
>  src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
>  src/mesa/state_tracker/st_manager.c |  5 +
>  20 files changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index 42905ab0620..16bd4b7c0fb 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -264,6 +264,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>  
> /* Stream output. */
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index 040c2c99ec0..96866d656be 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -325,6 +325,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_QUERY_SO_OVERFLOW:
>   case PIPE_CAP_MEMOBJ:
>   case PIPE_CAP_LOAD_CONSTBUF:
> + case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>   return 0;
>  
>   case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/i915/i915_screen.c 
> b/src/gallium/drivers/i915/i915_screen.c
> index 8411c0f15cc..7bcf479c4be 100644
> --- a/src/gallium/drivers/i915/i915_screen.c
> +++ b/src/gallium/drivers/i915/i915_screen.c
> @@ -317,6 +317,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
> cap)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>  
> case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 53171162a54..19411adaf07 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -360,6 +360,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
> case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
> }
> /* should only get here on unhandled cases */
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> index a66b4fbe67b..782ba0a64db 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> @@ -224,6 +224,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>  
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> index 479283e1b7c..997cb4e71dc 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> @@ -276,6 +276,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_QUERY_SO_OVERFLOW:
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> +   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>return 0;
>  
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index ac850c493da..05913bccb65 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ 

Re: [Mesa-dev] [PATCH v3 08/22] egl/tizen: add support of dri2_loader (v2)

2017-10-04 Thread Emil Velikov
On 4 October 2017 at 15:28, Rob Herring  wrote:
> On Wed, Oct 4, 2017 at 1:50 AM, Gwan-gyeong Mun  wrote:
>> It adds support of dri2_loader to egl dri2 tizen backend.
>>   - referenced a basic buffer flow and management implementation from 
>> android.
>>
>> And it implements a query buffer age extension for tizen and turns on
>> swap_buffers_with_damage extension.
>>   - it adds color buffer related member variables to dri_egl_surface for the
>> management of color buffers.
>>
>> v2: Fixes from Emil's review:
>>a) Remove a temporary variable and return directly on get_format_bpp()
>>b) Remove unneeded compiler pragma
>>c) Follow coding style
>>d) Rename get_pitch() to get_stride() for using of consistent naming
>>e) Remove mis-referencing from android implementation on treatment of 
>> buffer
>>   age.
>>   reference: 
>> https://lists.freedesktop.org/archives/mesa-dev/2017-June/158409.html
>>f) Use dri2_egl_surface_free_outdated_buffers_and_update_size() helper
>>g) Use dri2_egl_surface_record_buffers_and_update_back_buffer() helper
>>h) Use add dri2_egl_surface_update_buffer_age() helper
>>i) Use env_var_as_boolean for hw_accel variable on dri2_initialize_tizen()
>>j) Remove getting of the device name and opening of the device node on 
>> dri2_initialize_tizen()
>>   And add duplicating of tbm_bufmgr_fd. As tbm_bufmgr_fd is managed by 
>> tbm_bufmgr,
>>   if mesa use this fd then we should duplicate it.
>>k) Add comments why we can not drop the dri2 codepath on 
>> dri2_initialize_tizen()
>>   As some kernels ported for tizen don't support render node feature yet,
>>   currently we cannot drop the dri2 codepath.
>>
>> Signed-off-by: Mun Gwan-gyeong 
>> ---
>>  src/egl/drivers/dri2/egl_dri2.h   |   9 ++
>>  src/egl/drivers/dri2/platform_tizen.c | 257 
>> --
>>  2 files changed, 252 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/egl/drivers/dri2/egl_dri2.h 
>> b/src/egl/drivers/dri2/egl_dri2.h
>> index 6f9d936ca5..7d047bf5dd 100644
>> --- a/src/egl/drivers/dri2/egl_dri2.h
>> +++ b/src/egl/drivers/dri2/egl_dri2.h
>> @@ -340,6 +340,15 @@ struct dri2_egl_surface
>> tpl_surface_t *tpl_surface;
>> tbm_surface_h  tbm_surface;
>> tbm_format tbm_format;
>> +
>> +   /* Used to record all the tbm_surface created by tpl_surface and their 
>> ages.
>> +* Usually Tizen uses at most triple buffers in tpl_surface 
>> (tbm_surface_queue)
>> +* so hardcode the number of color_buffers to 3.
>> +*/
>> +   struct {
>> +  tbm_surface_h   buffer;
>> +  int age;
>> +   } color_buffers[3], *back;
>
> dri2_egl_surface is quite the mess of ifdefery.
>
> So now we have 3 instances of color_buffer and *back. This struct
> really needs some refactoring to separate out the common and platform
> specific bits. I'm not saying it has to be done as part of this series
> though.
>
The helpers refactoring was meant to address a bit of it. Namely:

   struct {
      void *native_buffer; // aka wl_buffer/gbm_bo/ANativeWindowBuffer
      __DRIimage *dri_image;
      /* for is_different_gpu case. NULL else */
      __DRIimage *linear_copy;
      /* for swrast */
      void *data;
      int data_size;
      bool locked; // can we reuse it for android/tizen ?
      int age;
   } color_buffers[4], *back, *current;

Side note: The 4 buffers thing is a bit strange. The Wayland commit
that introduces it is sparse on details about why it does so.

With above in mind there will be the odd cast in the platform_foo.c
but the helpers will be perfectly fine.
Aka there should be no new ifdef magic.
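
As a rough illustration of what a platform-independent helper could look
like on top of such a struct, here is a sketch of the usual buffer age
bookkeeping (age 0 means undefined contents, age 1 means the most recently
presented frame). The struct is trimmed down and every name below is
hypothetical, not the actual egl_dri2.h layout or helper.

```c
#include <stddef.h>

#define NUM_COLOR_BUFFERS 4

/* Trimmed-down, hypothetical version of the shared per-buffer record. */
struct color_buffer {
   void *native_buffer;  /* opaque platform handle */
   int age;              /* 0 = contents undefined */
};

struct surface {
   struct color_buffer color_buffers[NUM_COLOR_BUFFERS];
   struct color_buffer *back;
};

/* Call once after the current back buffer has been presented: every buffer
 * that already holds a frame gets one frame older, and the buffer that was
 * just presented becomes age 1.
 */
static void
surface_update_buffer_age(struct surface *surf)
{
   for (size_t i = 0; i < NUM_COLOR_BUFFERS; i++) {
      if (surf->color_buffers[i].age > 0)
         surf->color_buffers[i].age++;
   }

   if (surf->back)
      surf->back->age = 1;
}
```

Nothing in this function touches native_buffer, which is the point: only
the platform_foo.c code would need to cast it to its own handle type.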

-Emil


[Mesa-dev] [Bug 102488] radv_handle_depth_image_transition() wrongly clearing depth data when transitioning to htile.

2017-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102488

--- Comment #3 from Bas Nieuwenhuizen  ---
Note that we are only setting the metadata though. 0x30f should be the metadata
that means "data is uncompressed, just take the value from data", 0x is
probably "clear to depth 1.0". 

You might even try just not setting any value at all, given in this transition
you did a decompress previously, so the hardware might have set the metadata to
something similar to 0x30f.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 1/2] wayland-drm: use a copy of the wayland_drm_callbacks struct

2017-10-04 Thread Emil Velikov
On 4 October 2017 at 11:46, Daniel Stone  wrote:
> Hi Emil,
>
> On 27 September 2017 at 19:49, Emil Velikov  wrote:
>> The callbacks may be called even when they are no longer valid.
>> Say, the user is dlclose(ing) libEGL while the buffers are being
>> destroyed.
>
> Series looks good to me, but if the user calls dlclose on EGL or
> whatever before the display/surface/buffers have all been destroyed,
> we're screwed anyway, since the authenticate callback will still point
> into oblivion. So the commit message probably needs some rework.
Indeed, the example is a bad one.
Nothing else comes to mind off the top of my head, so I'll just drop
the sentence for now.

> Regardless:
> Reviewed-by: Daniel Stone 
>

Thanks
Emil


[Mesa-dev] [Bug 103078] MATLAB broken with mesa software rendering

2017-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103078

--- Comment #3 from Brian Paul  ---
Sergio, I installed the 30-day trial of Matlab r2017b and typed 'opengl info'
in the command window.  I did not get a java exception.  I got an error message
that reads:

"""
MATLAB has experienced a low-level graphics error, and may not have drawn
correctly.
Read about what you can do to prevent this issue at Resolving Low-Level
Graphics Issues then restart MATLAB.
To share details of this issue with MathWorks technical support,
please include this file with your service request.
"""

I don't see that issue when using NVIDIA's driver.

I used apitrace to create a trace of Matlab's GL calls with llvmpipe and with
NVIDIA's linux driver.  It looks like Matlab begins by trying to find the
highest supported GL version of the core profile.  With NVIDIA's driver it
finds 4.5.  With llvmpipe it finds 3.3.

Then, it also tries to create a compatibility profile context.  With NVIDIA's
driver it again gets a 4.5 context.  With llvmpipe, we don't have such profiles
and Matlab stops after getting a GL 3.1 context.

Finally, with llvmpipe, Matlab tries to create a 3.3 compatibility profile and
fails.  It then successfully creates a legacy context with
glXCreateNewContext().  It calls glGetIntegerv(GL_MAJOR/MINOR_VERSION) then
destroys the context.

As far as I can tell, Matlab simply doesn't accept llvmpipe's context/version
offerings.  Either that's by design or there's some sort of logic bug in Matlab
that gives up on GL support after failing to create a particular kind of
context.

You could try installing some older versions of Mesa to see if Matlab 2016a
works/fails.  But Matlab 2017b seems to act differently.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH v3 12/12] anv: enable VK_KHR_sampler_ycbcr_conversion

2017-10-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_device.c  | 51 ++--
 src/intel/vulkan/anv_extensions.py |  1 +
 src/intel/vulkan/anv_formats.c | 59 ++
 src/intel/vulkan/anv_image.c   | 42 ++-
 src/intel/vulkan/anv_private.h |  4 +++
 src/intel/vulkan/genX_state.c  | 50 ++--
 6 files changed, 183 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d576bb55315..8cb01b386c2 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -703,6 +703,13 @@ void anv_GetPhysicalDeviceFeatures2KHR(
  break;
   }
 
+  case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES_KHR: {
+ VkPhysicalDeviceSamplerYcbcrConversionFeaturesKHR *features =
+(VkPhysicalDeviceSamplerYcbcrConversionFeaturesKHR *) ext;
+ features->samplerYcbcrConversion = true;
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
@@ -1826,8 +1833,48 @@ void anv_GetImageMemoryRequirements2KHR(
 const VkImageMemoryRequirementsInfo2KHR*pInfo,
 VkMemoryRequirements2KHR*   pMemoryRequirements)
 {
-   anv_GetImageMemoryRequirements(_device, pInfo->image,
-  &pMemoryRequirements->memoryRequirements);
+   if (pInfo->pNext == NULL) {
+  anv_GetImageMemoryRequirements(_device, pInfo->image,
+ &pMemoryRequirements->memoryRequirements);
+   } else {
+  vk_foreach_struct_const(ext, pInfo->pNext) {
+ switch (ext->sType) {
+ case VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO_KHR: {
+ANV_FROM_HANDLE(anv_image, image, pInfo->image);
+ANV_FROM_HANDLE(anv_device, device, _device);
+struct anv_physical_device *pdevice = 
&device->instance->physicalDevice;
+const VkImagePlaneMemoryRequirementsInfoKHR *plane_reqs =
+   (const VkImagePlaneMemoryRequirementsInfoKHR *) ext;
+uint32_t plane = anv_image_aspect_to_plane(image->aspects,
+   
plane_reqs->planeAspect);
+
+assert(image->planes[plane].offset == 0);
+
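+/* The Vulkan spec (git aaed022) says:
+ *
+ *memoryTypeBits is a bitfield and contains one bit set for
+ *every supported memory type for the resource. The bit `1<<i`
+ *is set if and only if the memory type `i` in the
+ *VkPhysicalDeviceMemoryProperties structure for the
+ *physical device is supported.
+ */
+pMemoryRequirements->memoryRequirements.memoryTypeBits =
+   (1ull << pdevice->memory.type_count) - 1;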
+
+pMemoryRequirements->memoryRequirements.size = 
image->planes[plane].size;
+pMemoryRequirements->memoryRequirements.alignment =
+   image->planes[plane].alignment;
+break;
+ }
+
+ default:
+anv_debug_ignored_stype(ext->sType);
+break;
+ }
+  }
+   }
 
vk_foreach_struct(ext, pMemoryRequirements->pNext) {
   switch (ext->sType) {
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 491e7086838..a828a668d65 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -74,6 +74,7 @@ EXTENSIONS = [
 Extension('VK_KHR_push_descriptor',   1, True),
 Extension('VK_KHR_relaxed_block_layout',  1, True),
 Extension('VK_KHR_sampler_mirror_clamp_to_edge',  1, True),
+Extension('VK_KHR_sampler_ycbcr_conversion',  1, True),
 Extension('VK_KHR_shader_draw_parameters',1, True),
 Extension('VK_KHR_storage_buffer_storage_class',  1, True),
 Extension('VK_KHR_surface',  25, 
'ANV_HAS_SURFACE'),
diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index 879eb072b10..f07c12eb422 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -1006,3 +1006,62 @@ void anv_GetPhysicalDeviceExternalBufferPropertiesKHR(
pExternalBufferProperties->externalMemoryProperties =
   (VkExternalMemoryPropertiesKHR) {0};
 }
+
+VkResult anv_CreateSamplerYcbcrConversionKHR(
+VkDevice_device,
+const VkSamplerYcbcrConversionCreateInfoKHR* pCreateInfo,
+const VkAllocationCallbacks*pAllocator,
+VkSamplerYcbcrConversionKHR*pYcbcrConversion)
+{
+   ANV_FROM_HANDLE(anv_device, device, _device);
+   struct anv_ycbcr_conversion *conversion;
+
+   assert(pCreateInfo->sType == 

Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Ilia Mirkin
Ah OK. So llvm.fmuladd is more like llvm.fmadontcare. Wrong assumption
on my part.

On Wed, Oct 4, 2017 at 1:00 PM, Connor Abbott  wrote:
> No. From the LLVM langref:
>
> The ‘llvm.fmuladd.*‘ intrinsic functions represent multiply-add
> expressions that can be fused if the code generator determines that
> (a) the target instruction set has support for a fused operation, and
> (b) that the fused operation is more efficient than the equivalent,
> separate pair of mul and add instructions.
>
> The (b) part is especially important -- it says that LLVM can pick and
> choose which fmuladd intrinsics to turn into FMA instructions, or
> unfused MULADD instructions, or just a sequence of mul+add. For
> example, if many instructions call fmuladd with the first two
> arguments the same, it can break it up into a mul followed by a bunch
> of adds. That wouldn't be ok under the GLSL precise semantics
> (assuming the target would've used FMA otherwise, which I think some
> GCN cards will do).
>
> Also, and maybe more importantly, if an app developer explicitly asks
> for fma() with a precise modifier, it's probably not a great idea to
> then give them an unfused mul+add -- it's legal, thanks to GLSL's
> weasel-wording, but probably not what you really want, on HW which
> actually does have an FMA instruction :)
>
> Connor
>
>
> On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin  wrote:
>> Wouldn't this guarantee that nothing is fused (and thus fine)?
>> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>>
>> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott  wrote:
>>> If the fma has the exact flag, then we need to use the llvm.fma
>>> intrinsic. These come from fma() calls with the precise or invariant
>>> qualifiers in GLSL, where you basically have to fuse everything or
>>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>>>
>>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie  wrote:
 From: Dave Airlie 

 For Vulkan SPIR-V the spec states
 fma() Inherited from OpFMul followed by OpFAdd.

 Matt says the backend will do the right thing depending on the
 hardware being compiled for, if you use the fmuladd intrinsic.

 Using the Mad Max pts test, on high settings at 4K:
 CHP: 55->60
 HGDD: 46->50
 LM: 55->60
 No change on Stronghold.

 Thanks to Feral for spending the time to track this down.

 Signed-off-by: Dave Airlie 
 ---
  src/amd/common/ac_nir_to_llvm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/amd/common/ac_nir_to_llvm.c 
 b/src/amd/common/ac_nir_to_llvm.c
 index d7b6259..11ba487 100644
 --- a/src/amd/common/ac_nir_to_llvm.c
 +++ b/src/amd/common/ac_nir_to_llvm.c
 @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, 
 const nir_alu_instr *instr)
   result);
 break;
 case nir_op_ffma:
 -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
 +   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
   ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
 break;
 case nir_op_ibitfield_extract:
 --
 2.9.4
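
To make the fused vs. unfused distinction in this thread concrete, here is a minimal sketch in plain C. It does not use LLVM at all: fused_mul_add() is a hypothetical stand-in that emulates a fused multiply-add for floats by keeping the product in double, and mul_then_add() rounds the product to float first. Both function names are made up for illustration.

```c
#include <assert.h>

/* Unfused: the product is rounded to float before the add. The volatile
 * store keeps the compiler from contracting the two operations. */
float
mul_then_add(float a, float b, float c)
{
   volatile float product = a * b;
   return product + c;
}

/* Emulated fused multiply-add (an illustration, not llvm.fma): a
 * float * float product is exact in double (24 + 24 significand bits fit
 * in 53), so the product is not rounded before the add. This emulation
 * can double-round for some inputs, so treat it as a sketch only. */
float
fused_mul_add(float a, float b, float c)
{
   volatile double sum = (double)a * (double)b + (double)c;
   return (float)sum;
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26, which rounds to 1.0f as a float. The unfused path therefore returns 0 while the fused path returns -2^-26. That is the kind of difference a precise fma() has to keep consistent.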



[Mesa-dev] [PATCH v3 09/12] anv: prepare sampler emission code for multiplanar images

2017-10-04 Thread Lionel Landwerlin
New settings from the KHR_sampler_ycbcr_conversion specifications
might require different sampler settings for luma and chroma planes.
This change makes the sampler table emission ready to handle multiple
planes.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_private.h |  2 +-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_state.c  | 80 +++---
 3 files changed, 43 insertions(+), 41 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index c2ce8ee43f7..2b3b5a1810a 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2561,7 +2561,7 @@ void anv_fill_buffer_surface_state(struct anv_device 
*device,
uint32_t stride);
 
 struct anv_sampler {
-   uint32_t state[4];
+   uint32_t state[3][4];
uint32_t n_planes;
 };
 
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 367fddcf02a..dc5cf687dc6 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1745,7 +1745,7 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
   }
 
   memcpy(state->map + (s * 16),
- sampler->state, sizeof(sampler->state));
+ sampler->state, sampler->n_planes * sizeof(sampler->state[0]));
 
   s += sampler->n_planes;
}
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 81570825a54..91da05cddbf 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -166,7 +166,7 @@ VkResult genX(CreateSampler)(
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO);
 
-   sampler = vk_alloc2(&device->alloc, pAllocator, sizeof(*sampler), 8,
+   sampler = vk_zalloc2(&device->alloc, pAllocator, sizeof(*sampler), 8,
 VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!sampler)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
@@ -181,55 +181,57 @@ VkResult genX(CreateSampler)(
bool enable_mag_filter_addr_rounding =
   pCreateInfo->magFilter != VK_FILTER_NEAREST;
 
-   struct GENX(SAMPLER_STATE) sampler_state = {
-  .SamplerDisable = false,
-  .TextureBorderColorMode = DX10OGL,
+   for (unsigned p = 0; p < sampler->n_planes; p++) {
+  struct GENX(SAMPLER_STATE) sampler_state = {
+ .SamplerDisable = false,
+ .TextureBorderColorMode = DX10OGL,
 
 #if GEN_GEN >= 8
-  .LODPreClampMode = CLAMP_MODE_OGL,
+ .LODPreClampMode = CLAMP_MODE_OGL,
 #else
-  .LODPreClampEnable = CLAMP_ENABLE_OGL,
+ .LODPreClampEnable = CLAMP_ENABLE_OGL,
 #endif
 
 #if GEN_GEN == 8
-  .BaseMipLevel = 0.0,
+ .BaseMipLevel = 0.0,
 #endif
-  .MipModeFilter = vk_to_gen_mipmap_mode[pCreateInfo->mipmapMode],
-  .MagModeFilter = vk_to_gen_tex_filter(pCreateInfo->magFilter,
-pCreateInfo->anisotropyEnable),
-  .MinModeFilter = vk_to_gen_tex_filter(pCreateInfo->minFilter,
-pCreateInfo->anisotropyEnable),
-  .TextureLODBias = anv_clamp_f(pCreateInfo->mipLodBias, -16, 15.996),
-  .AnisotropicAlgorithm = EWAApproximation,
-  .MinLOD = anv_clamp_f(pCreateInfo->minLod, 0, 14),
-  .MaxLOD = anv_clamp_f(pCreateInfo->maxLod, 0, 14),
-  .ChromaKeyEnable = 0,
-  .ChromaKeyIndex = 0,
-  .ChromaKeyMode = 0,
-  .ShadowFunction = vk_to_gen_shadow_compare_op[pCreateInfo->compareOp],
-  .CubeSurfaceControlMode = OVERRIDE,
-
-  .BorderColorPointer = border_color_offset,
+ .MipModeFilter = vk_to_gen_mipmap_mode[pCreateInfo->mipmapMode],
+ .MagModeFilter = vk_to_gen_tex_filter(pCreateInfo->magFilter,
+   pCreateInfo->anisotropyEnable),
+ .MinModeFilter = vk_to_gen_tex_filter(pCreateInfo->minFilter,
+   pCreateInfo->anisotropyEnable),
+ .TextureLODBias = anv_clamp_f(pCreateInfo->mipLodBias, -16, 15.996),
+ .AnisotropicAlgorithm = EWAApproximation,
+ .MinLOD = anv_clamp_f(pCreateInfo->minLod, 0, 14),
+ .MaxLOD = anv_clamp_f(pCreateInfo->maxLod, 0, 14),
+ .ChromaKeyEnable = 0,
+ .ChromaKeyIndex = 0,
+ .ChromaKeyMode = 0,
+ .ShadowFunction = vk_to_gen_shadow_compare_op[pCreateInfo->compareOp],
+ .CubeSurfaceControlMode = OVERRIDE,
+
+ .BorderColorPointer = border_color_offset,
 
 #if GEN_GEN >= 8
-  .LODClampMagnificationMode = MIPNONE,
+ .LODClampMagnificationMode = MIPNONE,
 #endif
 
-  .MaximumAnisotropy = 
vk_to_gen_max_anisotropy(pCreateInfo->maxAnisotropy),
-  .RAddressMinFilterRoundingEnable = enable_min_filter_addr_rounding,
-  .RAddressMagFilterRoundingEnable = enable_mag_filter_addr_rounding,
-  .VAddressMinFilterRoundingEnable = enable_min_filter_addr_rounding,
-  
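
A minimal sketch of the idea behind this patch, assuming a simplified sampler with up to three planes of four dwords each (matching state[3][4] above). The example_* names are hypothetical and not part of anv.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_PLANES 3      /* assumption: mirrors state[3][4] */
#define EXAMPLE_STATE_DWORDS 4

struct example_sampler {
   uint32_t state[EXAMPLE_MAX_PLANES][EXAMPLE_STATE_DWORDS];
   uint32_t n_planes;
};

/* Copies only the planes in use into slot `s` of the sampler table and
 * returns the next free slot. The caller must make sure the table has
 * room for s + n_planes entries. */
uint32_t
example_emit_sampler(uint8_t *table, uint32_t s,
                     const struct example_sampler *sampler)
{
   assert(sampler->n_planes <= EXAMPLE_MAX_PLANES);
   memcpy(table + s * sizeof(sampler->state[0]),
          sampler->state, sampler->n_planes * sizeof(sampler->state[0]));
   return s + sampler->n_planes;
}
```

Copying n_planes entries rather than sizeof(state) keeps a single-plane sampler at one 16-byte slot, so existing layouts should not grow.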

[Mesa-dev] [PATCH v3 10/12] anv: add nir lowering pass for ycrcb textures

2017-10-04 Thread Lionel Landwerlin
This pass implements all the implicit conversions required by the
VK_KHR_sampler_ycbcr_conversion specification.

It also adds plane sources to sampling instructions. The pipeline
layout pass then consumes those sources when mapping textures to
descriptors.

v2: Add new file to meson build (Lionel)
Use nir_frcp() rather than (1.0f / x) (Jason)
Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason)
Return progress (Jason)
Account for array of samplers (Jason)

Signed-off-by: Lionel Landwerlin 
---
 src/intel/Makefile.sources   |   1 +
 src/intel/vulkan/anv_nir.h   |   3 +
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c |  61 ++-
 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c  | 469 +++
 src/intel/vulkan/anv_pipeline.c  |   2 +
 src/intel/vulkan/anv_private.h   |  16 +-
 src/intel/vulkan/meson.build |   1 +
 7 files changed, 547 insertions(+), 6 deletions(-)
 create mode 100644 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index bca7a132b26..9672dcc252d 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -219,6 +219,7 @@ VULKAN_FILES := \
vulkan/anv_nir_lower_input_attachments.c \
vulkan/anv_nir_lower_multiview.c \
vulkan/anv_nir_lower_push_constants.c \
+   vulkan/anv_nir_lower_ycbcr_textures.c \
vulkan/anv_pass.c \
vulkan/anv_pipeline.c \
vulkan/anv_pipeline_cache.c \
diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
index 5b450b45cdf..8ac0a119dac 100644
--- a/src/intel/vulkan/anv_nir.h
+++ b/src/intel/vulkan/anv_nir.h
@@ -37,6 +37,9 @@ void anv_nir_lower_push_constants(nir_shader *shader);
 
 bool anv_nir_lower_multiview(nir_shader *shader, uint32_t view_mask);
 
+bool anv_nir_lower_ycbcr_textures(nir_shader *shader,
+  struct anv_pipeline *pipeline);
+
 void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
nir_shader *shader,
struct brw_stage_prog_data *prog_data,
diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c 
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index 428cfdf42d1..28cbb98c563 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -131,7 +131,7 @@ lower_res_index_intrinsic(nir_intrinsic_instr *intrin,
 static void
 lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
 unsigned *const_index, unsigned hw_binding_size,
-nir_tex_src_type src_type,
+nir_tex_src_type src_type, bool allow_indirect,
 struct apply_pipeline_layout_state *state)
 {
nir_builder *b = &state->builder;
@@ -141,6 +141,15 @@ lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
   nir_deref_array *deref_array = nir_deref_as_array(deref->deref.child);
 
   if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
+ /* From VK_KHR_sampler_ycbcr_conversion:
+  *
+  * If sampler Y’CBCR conversion is enabled, the combined image
+  * sampler must be indexed only by constant integral expressions when
+  * aggregated into arrays in shader code, irrespective of the
+  * shaderSampledImageArrayDynamicIndexing feature.
+  */
+ assert(allow_indirect);
+
  nir_ssa_def *index =
 nir_iadd(b, nir_imm_int(b, deref_array->base_offset),
 nir_ssa_for_src(b, deref_array->indirect, 1));
@@ -186,6 +195,46 @@ cleanup_tex_deref(nir_tex_instr *tex, nir_deref_var *deref)
nir_instr_rewrite_src(&tex->instr, &deref_array->indirect, NIR_SRC_INIT);
 }
 
+static bool
+has_tex_src_plane(nir_tex_instr *tex)
+{
+   for (unsigned i = 0; i < tex->num_srcs; i++) {
+  if (tex->src[i].src_type == nir_tex_src_plane)
+ return true;
+   }
+
+   return false;
+}
+
+static uint32_t
+extract_tex_src_plane(nir_tex_instr *tex)
+{
+   nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src, tex->num_srcs - 1);
+   unsigned plane = 0;
+
+   for (unsigned i = 0, w = 0; i < tex->num_srcs; i++) {
+  if (tex->src[i].src_type == nir_tex_src_plane) {
+ nir_const_value *const_plane =
+nir_src_as_const_value(tex->src[i].src);
+
+ /* Our color conversion lowering pass should only ever insert
+  * constants. */
+ assert(const_plane);
+ plane = const_plane->u32[0];
+  } else {
+ new_srcs[w].src_type = tex->src[i].src_type;
+ nir_instr_move_src(&tex->instr, &new_srcs[w].src, &tex->src[i].src);
+ w++;
+  }
+   }
+
+   ralloc_free(tex->src);
+   tex->src = new_srcs;
+   tex->num_srcs--;
+
+   return plane;
+}
+
 static void
 lower_tex(nir_tex_instr *tex, struct 
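
The source-removal step in extract_tex_src_plane() can be sketched without NIR. The types below are hypothetical stand-ins (plain malloc/free instead of ralloc, a bare value instead of a nir_src), so this only illustrates the compaction, not the real API.

```c
#include <assert.h>
#include <stdlib.h>

enum example_src_type {
   EXAMPLE_SRC_COORD,
   EXAMPLE_SRC_LOD,
   EXAMPLE_SRC_PLANE,
};

struct example_src {
   enum example_src_type type;
   unsigned value;
};

struct example_tex {
   struct example_src *srcs;   /* heap-allocated */
   unsigned num_srcs;
};

/* Removes the single plane source from tex and returns its value. The
 * caller is expected to have checked that a plane source exists. */
unsigned
example_extract_plane(struct example_tex *tex)
{
   assert(tex->num_srcs > 0);
   /* One spare element avoids a zero-sized allocation when the plane
    * source is the only source. */
   struct example_src *new_srcs = malloc(tex->num_srcs * sizeof(*new_srcs));
   assert(new_srcs != NULL);

   unsigned plane = 0;
   unsigned w = 0;
   for (unsigned i = 0; i < tex->num_srcs; i++) {
      if (tex->srcs[i].type == EXAMPLE_SRC_PLANE)
         plane = tex->srcs[i].value;
      else
         new_srcs[w++] = tex->srcs[i];
   }
   assert(w == tex->num_srcs - 1);   /* exactly one plane source */

   free(tex->srcs);
   tex->srcs = new_srcs;
   tex->num_srcs = w;
   return plane;
}
```

The remaining sources keep their relative order, which matters if later code walks them by index.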

[Mesa-dev] [PATCH v3 06/12] anv: modify the internal concept of format to express multiple planes

2017-10-04 Thread Lionel Landwerlin
A given Vulkan format can now be decomposed into a set of planes. We
now use 'struct anv_format_plane' to represent the format of those
planes.

v2: by Jason
Rename anv_get_plane_format() to anv_get_format_plane()
Don't rename anv_get_isl_format()
Replace ds_fmt() by fmt2()
Introduce fmt_unsupported()

Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_blorp.c |  18 +-
 src/intel/vulkan/anv_formats.c   | 512 +--
 src/intel/vulkan/anv_image.c |  12 +-
 src/intel/vulkan/anv_private.h   |  54 -
 src/intel/vulkan/genX_pipeline.c |   7 +-
 5 files changed, 339 insertions(+), 264 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 8dead1d87a8..187042c71cf 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -459,12 +459,12 @@ void anv_CmdBlitImage(
   get_blorp_surf_for_anv_image(dst_image, dst_res->aspectMask,
dst_image->aux_usage, &dst);
 
-  struct anv_format src_format =
- anv_get_format(&cmd_buffer->device->info, src_image->vk_format,
-src_res->aspectMask, src_image->tiling);
-  struct anv_format dst_format =
- anv_get_format(&cmd_buffer->device->info, dst_image->vk_format,
-dst_res->aspectMask, dst_image->tiling);
+  struct anv_format_plane src_format =
+ anv_get_format_plane(&cmd_buffer->device->info, src_image->vk_format,
+  src_res->aspectMask, src_image->tiling);
+  struct anv_format_plane dst_format =
+ anv_get_format_plane(&cmd_buffer->device->info, dst_image->vk_format,
+  dst_res->aspectMask, dst_image->tiling);
 
   unsigned dst_start, dst_end;
   if (dst_image->type == VK_IMAGE_TYPE_3D) {
@@ -758,9 +758,9 @@ void anv_CmdClearColorImage(
 
   assert(pRanges[r].aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
 
-  struct anv_format src_format =
- anv_get_format(&cmd_buffer->device->info, image->vk_format,
-VK_IMAGE_ASPECT_COLOR_BIT, image->tiling);
+  struct anv_format_plane src_format =
+ anv_get_format_plane(&cmd_buffer->device->info, image->vk_format,
+  VK_IMAGE_ASPECT_COLOR_BIT, image->tiling);
 
   unsigned base_layer = pRanges[r].baseArrayLayer;
   unsigned layer_count = anv_get_layerCount(image, &pRanges[r]);
diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index 9db80ba14e3..e623b4f6324 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -44,14 +44,40 @@
 #define BGRA _ISL_SWIZZLE(BLUE, GREEN, RED, ALPHA)
 #define RGB1 _ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
 
-#define swiz_fmt(__vk_fmt, __hw_fmt, __swizzle) \
+#define _fmt(__hw_fmt, __swizzle) \
+   { .isl_format = __hw_fmt, \
+ .swizzle = __swizzle }
+
+#define swiz_fmt1(__vk_fmt, __hw_fmt, __swizzle) \
[VK_ENUM_OFFSET(__vk_fmt)] = { \
-  .isl_format = __hw_fmt, \
-  .swizzle = __swizzle, \
+  .planes = { \
+  { .isl_format = __hw_fmt, .swizzle = __swizzle }, \
+  }, \
+  .n_planes = 1, \
}
 
-#define fmt(__vk_fmt, __hw_fmt) \
-   swiz_fmt(__vk_fmt, __hw_fmt, RGBA)
+#define fmt1(__vk_fmt, __hw_fmt) \
+   swiz_fmt1(__vk_fmt, __hw_fmt, RGBA)
+
+#define fmt2(__vk_fmt, __fmt1, __fmt2) \
+   [VK_ENUM_OFFSET(__vk_fmt)] = { \
+  .planes = { \
+ { .isl_format = __fmt1, \
+   .swizzle = RGBA,   \
+ }, \
+ { .isl_format = __fmt2, \
+   .swizzle = RGBA,   \
+ }, \
+  }, \
+  .n_planes = 2, \
+   }
+
+#define fmt_unsupported(__vk_fmt) \
+   [VK_ENUM_OFFSET(__vk_fmt)] = { \
+  .planes = { \
+ { .isl_format = ISL_FORMAT_UNSUPPORTED, }, \
+  }, \
+   }
 
 /* HINT: For array formats, the ISL name should match the VK name.  For
  * packed formats, they should have the channels in reverse order from each
@@ -59,196 +85,199 @@
  * bspec) names are in LSB -> MSB order while VK formats are MSB -> LSB.
  */
 static const struct anv_format main_formats[] = {
-   fmt(VK_FORMAT_UNDEFINED,   ISL_FORMAT_UNSUPPORTED),
-   fmt(VK_FORMAT_R4G4_UNORM_PACK8,ISL_FORMAT_UNSUPPORTED),
-   fmt(VK_FORMAT_R4G4B4A4_UNORM_PACK16,   ISL_FORMAT_A4B4G4R4_UNORM),
-   swiz_fmt(VK_FORMAT_B4G4R4A4_UNORM_PACK16,   ISL_FORMAT_A4B4G4R4_UNORM,  
BGRA),
-   fmt(VK_FORMAT_R5G6B5_UNORM_PACK16, ISL_FORMAT_B5G6R5_UNORM),
-   swiz_fmt(VK_FORMAT_B5G6R5_UNORM_PACK16, ISL_FORMAT_B5G6R5_UNORM, BGRA),
-   fmt(VK_FORMAT_R5G5B5A1_UNORM_PACK16,   ISL_FORMAT_A1B5G5R5_UNORM),
-   fmt(VK_FORMAT_B5G5R5A1_UNORM_PACK16,   ISL_FORMAT_UNSUPPORTED),
-   fmt(VK_FORMAT_A1R5G5B5_UNORM_PACK16,   ISL_FORMAT_B5G5R5A1_UNORM),
-   fmt(VK_FORMAT_R8_UNORM,ISL_FORMAT_R8_UNORM),
-   fmt(VK_FORMAT_R8_SNORM,ISL_FORMAT_R8_SNORM),
-   fmt(VK_FORMAT_R8_USCALED,  
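
For anyone unfamiliar with the table style used by fmt1()/fmt2()/fmt_unsupported(), here is a small sketch of the same designated-initializer pattern. The enums, the ex_* macros and the lookup helper are all hypothetical; only the shape (a planes array plus n_planes per format) follows the patch.

```c
#include <assert.h>
#include <stdint.h>

enum { HW_R8 = 0, HW_D24, HW_S8, HW_UNSUPPORTED = 0xffff };
enum { EX_FMT_UNDEFINED, EX_FMT_R8, EX_FMT_D24_S8 };

struct example_format_plane {
   uint16_t hw_format;
};

struct example_format {
   struct example_format_plane planes[3];
   uint8_t n_planes;
};

#define ex_fmt1(vk, hw) \
   [vk] = { .planes = { { .hw_format = hw }, }, .n_planes = 1 }

#define ex_fmt2(vk, hw0, hw1) \
   [vk] = { .planes = { { .hw_format = hw0 }, { .hw_format = hw1 }, }, \
            .n_planes = 2 }

/* n_planes is left at zero, so no plane is ever reported as valid. */
#define ex_fmt_unsupported(vk) \
   [vk] = { .planes = { { .hw_format = HW_UNSUPPORTED }, }, }

static const struct example_format example_formats[] = {
   ex_fmt_unsupported(EX_FMT_UNDEFINED),
   ex_fmt1(EX_FMT_R8, HW_R8),
   ex_fmt2(EX_FMT_D24_S8, HW_D24, HW_S8),
};

/* Returns the hardware format of one plane, or HW_UNSUPPORTED when the
 * format or the plane index is out of range. */
uint16_t
example_plane_hw_format(unsigned vk_format, unsigned plane)
{
   if (vk_format >= sizeof(example_formats) / sizeof(example_formats[0]))
      return HW_UNSUPPORTED;

   const struct example_format *format = &example_formats[vk_format];
   if (plane >= format->n_planes)
      return HW_UNSUPPORTED;

   return format->planes[plane].hw_format;
}
```

Because HW_UNSUPPORTED is not zero in this sketch, the unsupported entry has to be written out explicitly. A zero-initialized slot would otherwise read as HW_R8.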

[Mesa-dev] [PATCH v3 05/12] anv: prepare formats to handle disjoints sets

2017-10-04 Thread Lionel Landwerlin
Newer format enums start at offset 1000000000, making it impossible to
have them all in one table. This change splits the formats into sets
that we then access through indirection.

v2: rename format_extract to vk_to_anv_format (Chad/Jason)

Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_formats.c | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index 049ffe17ac0..9db80ba14e3 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -45,7 +45,7 @@
 #define RGB1 _ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
 
 #define swiz_fmt(__vk_fmt, __hw_fmt, __swizzle) \
-   [__vk_fmt] = { \
+   [VK_ENUM_OFFSET(__vk_fmt)] = { \
   .isl_format = __hw_fmt, \
   .swizzle = __swizzle, \
}
@@ -58,7 +58,7 @@
  * other.  The reason for this is that, for packed formats, the ISL (and
  * bspec) names are in LSB -> MSB order while VK formats are MSB -> LSB.
  */
-static const struct anv_format anv_formats[] = {
+static const struct anv_format main_formats[] = {
fmt(VK_FORMAT_UNDEFINED,   ISL_FORMAT_UNSUPPORTED),
fmt(VK_FORMAT_R4G4_UNORM_PACK8,ISL_FORMAT_UNSUPPORTED),
fmt(VK_FORMAT_R4G4B4A4_UNORM_PACK16,   ISL_FORMAT_A4B4G4R4_UNORM),
@@ -251,13 +251,30 @@ static const struct anv_format anv_formats[] = {
 
 #undef fmt
 
+static const struct {
+   const struct anv_format *formats;
+   uint32_t n_formats;
+} anv_formats[] = {
+   [0] = { .formats = main_formats, .n_formats = ARRAY_SIZE(main_formats), },
+};
+
+static struct anv_format
+vk_to_anv_format(VkFormat vk_format)
+{
+   uint32_t enum_offset = VK_ENUM_OFFSET(vk_format);
+   uint32_t ext_number = VK_ENUM_EXTENSION(vk_format);
+
+   if (ext_number >= ARRAY_SIZE(anv_formats) ||
+   enum_offset >= anv_formats[ext_number].n_formats)
+  return (struct anv_format) { .isl_format = ISL_FORMAT_UNSUPPORTED };
+
+   return anv_formats[ext_number].formats[enum_offset];
+}
+
 static bool
 format_supported(VkFormat vk_format)
 {
-   if (vk_format >= ARRAY_SIZE(anv_formats))
-  return false;
-
-   return anv_formats[vk_format].isl_format != ISL_FORMAT_UNSUPPORTED;
+   return vk_to_anv_format(vk_format).isl_format != ISL_FORMAT_UNSUPPORTED;
 }
 
 /**
@@ -267,10 +284,10 @@ struct anv_format
 anv_get_format(const struct gen_device_info *devinfo, VkFormat vk_format,
VkImageAspectFlags aspect, VkImageTiling tiling)
 {
-   if (!format_supported(vk_format))
-  return anv_formats[VK_FORMAT_UNDEFINED];
+   struct anv_format format = vk_to_anv_format(vk_format);
 
-   struct anv_format format = anv_formats[vk_format];
+   if (format.isl_format == ISL_FORMAT_UNSUPPORTED)
+  return format;
 
if (aspect == VK_IMAGE_ASPECT_STENCIL_BIT) {
   assert(vk_format_aspects(vk_format) & VK_IMAGE_ASPECT_STENCIL_BIT);
@@ -553,7 +570,7 @@ anv_get_image_format_properties(
 ** This field cannot be ASTC format if the Surface Type is SURFTYPE_1D.
 */
if (info->type == VK_IMAGE_TYPE_1D &&
-   isl_format_is_compressed(anv_formats[info->format].isl_format)) {
+   isl_format_is_compressed(vk_to_anv_format(info->format).isl_format)) {
goto unsupported;
}
 
-- 
2.14.2
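
The indirection added by vk_to_anv_format() can be tried out in isolation. The sketch below is an assumption-laden toy: the EX_* macros copy the shape of the vk_util.h macros from patch 01, the table contents are invented, and placing the second set at index 157 is a guess at how a later patch might register the ycbcr formats.

```c
#include <assert.h>
#include <stdint.h>

#define EX_EXT_OFFSET (1000000000UL)
#define EX_ENUM_EXTENSION(e) \
   ((e) >= EX_EXT_OFFSET ? ((((e) - EX_EXT_OFFSET) / 1000UL) + 1) : 0)
#define EX_ENUM_OFFSET(e) \
   ((e) >= EX_EXT_OFFSET ? ((e) % 1000) : (e))
#define EX_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

#define EX_UNSUPPORTED 0xffffu

/* Invented hardware format numbers, for illustration only. */
static const uint16_t ex_core_formats[] = { EX_UNSUPPORTED, 10, 11 };
static const uint16_t ex_ext_formats[] = { 20, 21 };

static const struct {
   const uint16_t *formats;
   uint32_t n_formats;
} ex_sets[] = {
   [0]   = { ex_core_formats, EX_ARRAY_SIZE(ex_core_formats) },
   [157] = { ex_ext_formats, EX_ARRAY_SIZE(ex_ext_formats) },
};

/* Two-level lookup: pick the set by extension number, then index it by
 * the offset within that extension. Unpopulated sets have n_formats == 0
 * and therefore always fall through to EX_UNSUPPORTED. */
uint16_t
ex_lookup(uint32_t vk_format)
{
   uint32_t offset = EX_ENUM_OFFSET(vk_format);
   uint32_t ext = EX_ENUM_EXTENSION(vk_format);

   if (ext >= EX_ARRAY_SIZE(ex_sets) || offset >= ex_sets[ext].n_formats)
      return EX_UNSUPPORTED;

   return ex_sets[ext].formats[offset];
}
```

The sparse outer array costs a few unused entries, but it keeps the lookup a pair of bounds checks and two loads.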



[Mesa-dev] [PATCH v3 08/12] anv: add descriptor support for multiplanar image/sampler

2017-10-04 Thread Lionel Landwerlin
v2: Drop a memset by using zalloc (Jason)
Decouple vulkan descriptors from the underlying binding table
(Jason)

Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_descriptor_set.c| 24 -
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 66 ++--
 src/intel/vulkan/anv_private.h   | 22 
 src/intel/vulkan/genX_cmd_buffer.c   | 17 +++---
 src/intel/vulkan/genX_state.c|  2 +
 5 files changed, 97 insertions(+), 34 deletions(-)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index 84077982307..704693e227f 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -35,6 +35,21 @@
  * Descriptor set layouts.
  */
 
+static uint32_t
+layout_binding_get_hw_binding_size(const VkDescriptorSetLayoutBinding *binding)
+{
+   if (binding->pImmutableSamplers == NULL)
+  return binding->descriptorCount;
+
+   uint32_t immutable_sampler_count = 0;
+   for (uint32_t i = 0; i < binding->descriptorCount; i++) {
+  ANV_FROM_HANDLE(anv_sampler, sampler, binding->pImmutableSamplers[i]);
+  immutable_sampler_count += sampler->n_planes;
+   }
+
+   return immutable_sampler_count;
+}
+
 VkResult anv_CreateDescriptorSetLayout(
 VkDevice _device,
 const VkDescriptorSetLayoutCreateInfo*  pCreateInfo,
@@ -75,6 +90,7 @@ VkResult anv_CreateDescriptorSetLayout(
 
   set_layout->binding[b].array_size = 0;
   set_layout->binding[b].immutable_samplers = NULL;
+  set_layout->binding[b].hw_binding_size = 0;
}
 
/* Initialize all samplers to 0 */
@@ -108,8 +124,13 @@ VkResult anv_CreateDescriptorSetLayout(
   set_layout->binding[b].type = binding->descriptorType;
 #endif
   set_layout->binding[b].array_size = binding->descriptorCount;
+  set_layout->binding[b].hw_binding_size =
+ layout_binding_get_hw_binding_size(binding);
   set_layout->binding[b].descriptor_index = set_layout->size;
+  set_layout->binding[b].hw_binding_index = set_layout->hw_size;
+
   set_layout->size += binding->descriptorCount;
+  set_layout->hw_size += set_layout->binding[b].hw_binding_size;
 
   switch (binding->descriptorType) {
   case VK_DESCRIPTOR_TYPE_SAMPLER:
@@ -323,6 +344,8 @@ VkResult anv_CreateDescriptorPool(
   case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
   case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC:
  buffer_count += pCreateInfo->pPoolSizes[i].descriptorCount;
+ /* Fallthrough */
+
   default:
  descriptor_count += pCreateInfo->pPoolSizes[i].descriptorCount;
  break;
@@ -612,7 +635,6 @@ anv_descriptor_set_write_image_view(struct 
anv_descriptor_set *set,
sampler = bind_layout->immutable_samplers ?
  bind_layout->immutable_samplers[element] :
  sampler;
-
*desc = (struct anv_descriptor) {
   .type = type,
   .layout = info->imageLayout,
diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c 
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index 67bcf5e29ef..428cfdf42d1 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -130,7 +130,7 @@ lower_res_index_intrinsic(nir_intrinsic_instr *intrin,
 
 static void
 lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
-unsigned *const_index, unsigned array_size,
+unsigned *const_index, unsigned hw_binding_size,
 nir_tex_src_type src_type,
 struct apply_pipeline_layout_state *state)
 {
@@ -146,7 +146,7 @@ lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
 nir_ssa_for_src(b, deref_array->indirect, 1));
 
  if (state->add_bounds_checks)
-index = nir_umin(b, index, nir_imm_int(b, array_size - 1));
+index = nir_umin(b, index, nir_imm_int(b, hw_binding_size - 1));
 
  nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src,
tex->num_srcs + 1);
@@ -167,7 +167,7 @@ lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
nir_src_for_ssa(index));
  tex->num_srcs++;
   } else {
- *const_index += MIN2(deref_array->base_offset, array_size - 1);
+ *const_index += MIN2(deref_array->base_offset, hw_binding_size - 1);
   }
}
 }
@@ -196,19 +196,18 @@ lower_tex(nir_tex_instr *tex, struct 
apply_pipeline_layout_state *state)
 
unsigned set = tex->texture->var->data.descriptor_set;
unsigned binding = tex->texture->var->data.binding;
-   unsigned array_size =
-  state->layout->set[set].layout->binding[binding].array_size;
+   unsigned hw_binding_size =
+  state->layout->set[set].layout->binding[binding].hw_binding_size;
tex->texture_index = 
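
A rough sketch of the bookkeeping this patch introduces, assuming simplified stand-in types (the ex_* names are hypothetical). A binding with immutable samplers takes one hardware slot per plane, while its descriptor count stays what the application asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_sampler {
   uint32_t n_planes;
};

struct ex_binding {
   uint32_t descriptor_count;
   const struct ex_sampler *const *immutable_samplers;   /* NULL if none */
};

struct ex_layout_entry {
   uint32_t descriptor_index;
   uint32_t hw_binding_index;
};

/* Number of hardware binding table slots one binding needs. */
uint32_t
ex_hw_binding_size(const struct ex_binding *binding)
{
   if (binding->immutable_samplers == NULL)
      return binding->descriptor_count;

   uint32_t count = 0;
   for (uint32_t i = 0; i < binding->descriptor_count; i++)
      count += binding->immutable_samplers[i]->n_planes;
   return count;
}

/* Assigns both running indices for every binding and returns the total
 * hardware size of the set. */
uint32_t
ex_assign_indices(const struct ex_binding *bindings, uint32_t count,
                  struct ex_layout_entry *out)
{
   uint32_t size = 0, hw_size = 0;
   for (uint32_t b = 0; b < count; b++) {
      out[b].descriptor_index = size;
      out[b].hw_binding_index = hw_size;
      size += bindings[b].descriptor_count;
      hw_size += ex_hw_binding_size(&bindings[b]);
   }
   return hw_size;
}
```

Keeping two running totals is what decouples the Vulkan-visible descriptor index from the position in the binding table.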

[Mesa-dev] [PATCH v3 07/12] anv: add new formats KHR_sampler_ycbcr_conversion

2017-10-04 Thread Lionel Landwerlin
Add new downsampling factors for each plane.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/vulkan/anv_formats.c| 158 --
 src/intel/vulkan/anv_private.h|  10 +++
 src/intel/vulkan/vk_format_info.h |  27 +++
 3 files changed, 189 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index e623b4f6324..795055b52ff 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -22,6 +22,7 @@
  */
 
 #include "anv_private.h"
+#include "vk_enum_to_str.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 
@@ -44,14 +45,12 @@
 #define BGRA _ISL_SWIZZLE(BLUE, GREEN, RED, ALPHA)
 #define RGB1 _ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
 
-#define _fmt(__hw_fmt, __swizzle) \
-   { .isl_format = __hw_fmt, \
- .swizzle = __swizzle }
-
 #define swiz_fmt1(__vk_fmt, __hw_fmt, __swizzle) \
[VK_ENUM_OFFSET(__vk_fmt)] = { \
   .planes = { \
-  { .isl_format = __hw_fmt, .swizzle = __swizzle }, \
+ { .isl_format = __hw_fmt, .swizzle = __swizzle, \
+   .denominator_scales = { 1, 1, }, \
+ }, \
   }, \
   .n_planes = 1, \
}
@@ -64,9 +63,11 @@
   .planes = { \
  { .isl_format = __fmt1, \
.swizzle = RGBA,   \
+   .denominator_scales = { 1, 1, }, \
  }, \
  { .isl_format = __fmt2, \
.swizzle = RGBA,   \
+   .denominator_scales = { 1, 1, }, \
  }, \
   }, \
   .n_planes = 2, \
@@ -79,6 +80,31 @@
   }, \
}
 
+#define y_plane(__hw_fmt, __swizzle, __ycbcr_swizzle, dhs, dvs) \
+   { .isl_format = __hw_fmt, \
+ .swizzle = __swizzle, \
+ .ycbcr_swizzle = __ycbcr_swizzle, \
+ .denominator_scales = { dhs, dvs, }, \
+ .has_chroma = false, \
+   }
+
+#define chroma_plane(__hw_fmt, __swizzle, __ycbcr_swizzle, dhs, dvs) \
+   { .isl_format = __hw_fmt, \
+ .swizzle = __swizzle, \
+ .ycbcr_swizzle = __ycbcr_swizzle, \
+ .denominator_scales = { dhs, dvs, }, \
+ .has_chroma = true, \
+   }
+
+#define ycbcr_fmt(__vk_fmt, __n_planes, ...) \
+   [VK_ENUM_OFFSET(__vk_fmt)] = { \
+  .planes = { \
+ __VA_ARGS__, \
+  }, \
+  .n_planes = __n_planes, \
+  .can_ycbcr = true, \
+   }
+
 /* HINT: For array formats, the ISL name should match the VK name.  For
  * packed formats, they should have the channels in reverse order from each
  * other.  The reason for this is that, for packed formats, the ISL (and
@@ -275,6 +301,76 @@ static const struct anv_format main_formats[] = {
fmt1(VK_FORMAT_B8G8R8A8_SRGB, 
ISL_FORMAT_B8G8R8A8_UNORM_SRGB),
 };
 
+static const struct anv_format ycbcr_formats[] = {
+   ycbcr_fmt(VK_FORMAT_G8B8G8R8_422_UNORM_KHR, 1,
+ y_plane(ISL_FORMAT_YCRCB_SWAPUV, RGBA, _ISL_SWIZZLE(BLUE, GREEN, 
RED, ZERO), 1, 1)),
+   ycbcr_fmt(VK_FORMAT_B8G8R8G8_422_UNORM_KHR, 1,
+ y_plane(ISL_FORMAT_YCRCB_SWAPUVY, RGBA, _ISL_SWIZZLE(BLUE, GREEN, 
RED, ZERO), 1, 1)),
+   ycbcr_fmt(VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM_KHR, 3,
+ y_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(GREEN, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(BLUE, ZERO, 
ZERO, ZERO), 2, 2),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(RED, ZERO, 
ZERO, ZERO), 2, 2)),
+   ycbcr_fmt(VK_FORMAT_G8_B8R8_2PLANE_420_UNORM_KHR, 2,
+ y_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(GREEN, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8G8_UNORM, RGBA, _ISL_SWIZZLE(BLUE, RED, 
ZERO, ZERO), 2, 2)),
+   ycbcr_fmt(VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM_KHR, 3,
+ y_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(GREEN, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(BLUE, ZERO, 
ZERO, ZERO), 2, 1),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(RED, ZERO, 
ZERO, ZERO), 2, 1)),
+   ycbcr_fmt(VK_FORMAT_G8_B8R8_2PLANE_422_UNORM_KHR, 2,
+ y_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(GREEN, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8G8_UNORM, RGBA, _ISL_SWIZZLE(BLUE, RED, 
ZERO, ZERO), 2, 1)),
+   ycbcr_fmt(VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM_KHR, 3,
+ y_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(GREEN, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(BLUE, ZERO, 
ZERO, ZERO), 1, 1),
+ chroma_plane(ISL_FORMAT_R8_UNORM, RGBA, _ISL_SWIZZLE(RED, ZERO, 
ZERO, ZERO), 1, 1)),
+
+   fmt_unsupported(VK_FORMAT_R10X6_UNORM_PACK16_KHR),
+   fmt_unsupported(VK_FORMAT_R10X6G10X6_UNORM_2PACK16_KHR),
+   fmt_unsupported(VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16_KHR),
+   fmt_unsupported(VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16_KHR),
+   fmt_unsupported(VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16_KHR),
+   
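
To show what the denominator_scales pairs in the table above mean in practice, here is a small sketch that derives a plane's extent from the image extent. The ex_* types are hypothetical, and the helper assumes the image dimensions are multiples of the scale factors.

```c
#include <assert.h>
#include <stdint.h>

struct ex_plane {
   uint8_t denominator_scales[2];   /* { horizontal, vertical } */
};

struct ex_extent {
   uint32_t width;
   uint32_t height;
};

/* Divides the image extent by the per-plane downsampling factors.
 * Assumes both dimensions are multiples of the factors. */
struct ex_extent
ex_plane_extent(struct ex_extent image, const struct ex_plane *plane)
{
   struct ex_extent extent = {
      .width = image.width / plane->denominator_scales[0],
      .height = image.height / plane->denominator_scales[1],
   };
   return extent;
}
```

For a 1920x1080 image this gives 960x540 chroma planes with { 2, 2 } (the 420 formats), 960x1080 with { 2, 1 } (422), and full-size planes with { 1, 1 } (444).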

[Mesa-dev] [PATCH v3 04/12] isl: fill out layout descriptions for yuv formats

2017-10-04 Thread Lionel Landwerlin
Some descriptions were missing.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/isl/isl_format_layout.csv | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/isl/isl_format_layout.csv 
b/src/intel/isl/isl_format_layout.csv
index f340e30a1bf..ebb3d22bc18 100644
--- a/src/intel/isl/isl_format_layout.csv
+++ b/src/intel/isl/isl_format_layout.csv
@@ -222,8 +222,8 @@ I8_UINT ,   8,  1,  1,  1, , ,  
   , , ,  ui
 I8_SINT ,   8,  1,  1,  1, , , , , ,  
si8,, linear,
 DXT1_RGB_SRGB   ,  64,  4,  4,  1,  un4,  un4,  un4, , ,   
  ,,   srgb,  dxt1
 R1_UNORM,   1,  1,  1,  1,  un1, , , , ,   
  ,, linear,
-YCRCB_NORMAL,   0,  0,  0,  0, , , , , ,   
  ,,yuv,
-YCRCB_SWAPUVY   ,   0,  0,  0,  0, , , , , ,   
  ,,yuv,
+YCRCB_NORMAL,  16,  1,  1,  1,  un8,  un8,  un8, , ,   
  ,,yuv,
+YCRCB_SWAPUVY   ,  16,  1,  1,  1,  un8,  un8,  un8, , ,   
  ,,yuv,
 P2_UNORM_PALETTE0   ,   2,  1,  1,  1, , , , , ,   
  , un2, linear,
 P2_UNORM_PALETTE1   ,   2,  1,  1,  1, , , , , ,   
  , un2, linear,
 BC1_UNORM   ,  64,  4,  4,  1,  un4,  un4,  un4,  un4, ,   
  ,, linear,  dxt1
@@ -235,8 +235,8 @@ BC1_UNORM_SRGB  ,  64,  4,  4,  1,  un4,  un4,  
un4,  un4, ,
 BC2_UNORM_SRGB  , 128,  4,  4,  1,  un4,  un4,  un4,  un4, ,   
  ,,   srgb,  dxt3
 BC3_UNORM_SRGB  , 128,  4,  4,  1,  un4,  un4,  un4,  un4, ,   
  ,,   srgb,  dxt5
 MONO8   ,   1,  1,  1,  1, , , , , ,   
  ,,   ,
-YCRCB_SWAPUV,   0,  0,  0,  0, , , , , ,   
  ,,yuv,
-YCRCB_SWAPY ,   0,  0,  0,  0, , , , , ,   
  ,,yuv,
+YCRCB_SWAPUV,  16,  1,  1,  1,  un8,  un8,  un8, , ,   
  ,,yuv,
+YCRCB_SWAPY ,  16,  1,  1,  1,  un8,  un8,  un8, , ,   
  ,,yuv,
 DXT1_RGB,  64,  4,  4,  1,  un4,  un4,  un4, , ,   
  ,, linear,  dxt1
 FXT1, 128,  8,  4,  1,  un4,  un4,  un4, , ,   
  ,, linear,  fxt1
 R8G8B8_UNORM,  24,  1,  1,  1,  un8,  un8,  un8, , ,   
  ,, linear,
-- 
2.14.2



[Mesa-dev] [PATCH v3 01/12] vulkan: util: add macros to extract extension/offset number from enums

2017-10-04 Thread Lionel Landwerlin
v2: Simplify offset enum computation (Jason)

v3: capitalize macros (Chad)

Signed-off-by: Lionel Landwerlin 
---
 src/vulkan/util/vk_util.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/vulkan/util/vk_util.h b/src/vulkan/util/vk_util.h
index 2ed601f881e..4c18a196b71 100644
--- a/src/vulkan/util/vk_util.h
+++ b/src/vulkan/util/vk_util.h
@@ -199,4 +199,10 @@ __vk_find_struct(void *start, VkStructureType sType)
 
 uint32_t vk_get_driver_version(void);
 
+#define VK_EXT_OFFSET (1000000000UL)
+#define VK_ENUM_EXTENSION(__enum) \
+   ((__enum) >= VK_EXT_OFFSET ? ((((__enum) - VK_EXT_OFFSET) / 1000UL) + 1) : 0)
+#define VK_ENUM_OFFSET(__enum) \
+   ((__enum) >= VK_EXT_OFFSET ? ((__enum) % 1000) : (__enum))
+
 #endif /* VK_UTIL_H */
-- 
2.14.2
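
A worked example of the arithmetic, as a sketch: the EX_* macros below repeat the shape of the macros in this patch under different names, wrapped in two hypothetical helper functions so the values can be checked.

```c
#include <assert.h>

#define EX_EXT_OFFSET (1000000000UL)
#define EX_ENUM_EXTENSION(e) \
   ((e) >= EX_EXT_OFFSET ? ((((e) - EX_EXT_OFFSET) / 1000UL) + 1) : 0)
#define EX_ENUM_OFFSET(e) \
   ((e) >= EX_EXT_OFFSET ? ((e) % 1000) : (e))

/* Extension number encoded in an enum value, or 0 for core enums. */
unsigned long
ex_enum_extension(unsigned long value)
{
   return EX_ENUM_EXTENSION(value);
}

/* Index of the enum value within its extension (or within core). */
unsigned long
ex_enum_offset(unsigned long value)
{
   return EX_ENUM_OFFSET(value);
}
```

For example, VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM_KHR is 1000156002, which decomposes into extension 157 (VK_KHR_sampler_ycbcr_conversion) and offset 2. A core value such as 37 stays at extension 0, offset 37.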



[Mesa-dev] [PATCH v3 00/12] anv: implement KHR_sampler_ycbcr_conversion

2017-10-04 Thread Lionel Landwerlin
Hi,

A quick update addressing the comments on the first patches. I expect
more comments will follow, but probably from patch 7 onwards.

Cheers,

Lionel Landwerlin (12):
  vulkan: util: add macros to extract extension/offset number from enums
  isl: make format layout channels accessible by index
  isl: check whether a format is rgb if colorspace is yuv
  isl: fill out layout descriptions for yuv formats
  anv: prepare formats to handle disjoints sets
  anv: modify the internal concept of format to express multiple planes
  anv: add new formats KHR_sampler_ycbcr_conversion
  anv: add descriptor support for multiplanar image/sampler
  anv: prepare sampler emission code for multiplanar images
  anv: add nir lowering pass for ycrcb textures
  anv: enable multiple planes per image/imageView
  anv: enable VK_KHR_sampler_ycbcr_conversion

 src/intel/Makefile.sources   |   1 +
 src/intel/isl/isl.h  |  23 +-
 src/intel/isl/isl_format_layout.csv  |   8 +-
 src/intel/vulkan/anv_blorp.c | 320 ++
 src/intel/vulkan/anv_descriptor_set.c|  24 +-
 src/intel/vulkan/anv_device.c|  51 +-
 src/intel/vulkan/anv_dump.c  |  17 +-
 src/intel/vulkan/anv_extensions.py   |   1 +
 src/intel/vulkan/anv_formats.c   | 733 +++
 src/intel/vulkan/anv_image.c | 645 
 src/intel/vulkan/anv_intel.c |   4 +-
 src/intel/vulkan/anv_nir.h   |   3 +
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 127 +++-
 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c  | 469 +++
 src/intel/vulkan/anv_pipeline.c  |   2 +
 src/intel/vulkan/anv_private.h   | 348 ---
 src/intel/vulkan/anv_wsi.c   |   8 +-
 src/intel/vulkan/gen8_cmd_buffer.c   |   2 +-
 src/intel/vulkan/genX_cmd_buffer.c   | 331 ++
 src/intel/vulkan/genX_pipeline.c |   7 +-
 src/intel/vulkan/genX_state.c| 122 ++--
 src/intel/vulkan/meson.build |   1 +
 src/intel/vulkan/vk_format_info.h|  27 +
 src/vulkan/util/vk_util.h|   6 +
 24 files changed, 2389 insertions(+), 891 deletions(-)
 create mode 100644 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c

--
2.14.2
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 02/12] isl: make format layout channels accessible by index

2017-10-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chad Versace 
---
 src/intel/isl/isl.h | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index df275f85c49..98de4c0f57f 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -994,15 +994,18 @@ struct isl_format_layout {
uint8_t bh; /**< Block height, in pixels */
uint8_t bd; /**< Block depth, in pixels */
 
-   struct {
-  struct isl_channel_layout r; /**< Red channel */
-  struct isl_channel_layout g; /**< Green channel */
-  struct isl_channel_layout b; /**< Blue channel */
-  struct isl_channel_layout a; /**< Alpha channel */
-  struct isl_channel_layout l; /**< Luminance channel */
-  struct isl_channel_layout i; /**< Intensity channel */
-  struct isl_channel_layout p; /**< Palette channel */
-   } channels;
+   union {
+  struct {
+ struct isl_channel_layout r; /**< Red channel */
+ struct isl_channel_layout g; /**< Green channel */
+ struct isl_channel_layout b; /**< Blue channel */
+ struct isl_channel_layout a; /**< Alpha channel */
+ struct isl_channel_layout l; /**< Luminance channel */
+ struct isl_channel_layout i; /**< Intensity channel */
+ struct isl_channel_layout p; /**< Palette channel */
+  } channels;
+  struct isl_channel_layout channels_array[7];
+   };
 
enum isl_colorspace colorspace;
enum isl_txc txc;
-- 
2.14.2
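
For readers skimming the archive, here is a minimal sketch of the pattern this patch introduces: a named view and an indexed view of the same channels, overlaid in an anonymous union. Every example_* name is hypothetical and the channel struct is reduced to a single field; the real definitions live in isl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct isl_channel_layout. */
struct example_channel_layout {
   uint8_t bits; /* size of the channel in bits, 0 if the channel is absent */
};

#define EXAMPLE_NUM_CHANNELS 7

/* Hypothetical stand-in for the channel part of struct isl_format_layout. */
struct example_format_layout {
   union {
      struct {
         struct example_channel_layout r, g, b, a, l, i, p;
      } channels;
      struct example_channel_layout channels_array[EXAMPLE_NUM_CHANNELS];
   };
};

/* The two views only line up if the named struct carries no padding. */
_Static_assert(sizeof(((struct example_format_layout *)0)->channels) ==
               sizeof(((struct example_format_layout *)0)->channels_array),
               "named and indexed channel views must have the same size");

/* Sum the bit sizes of all channels by walking the indexed view. */
static unsigned
example_total_bits(const struct example_format_layout *fmt)
{
   unsigned total = 0;
   for (size_t c = 0; c < EXAMPLE_NUM_CHANNELS; c++)
      total += fmt->channels_array[c].bits;
   return total;
}

/* Fill in an RGBA8-like layout through the named view. */
static struct example_format_layout
example_rgba8(void)
{
   struct example_format_layout fmt = {0};
   fmt.channels.r.bits = 8;
   fmt.channels.g.bits = 8;
   fmt.channels.b.bits = 8;
   fmt.channels.a.bits = 8;
   return fmt;
}
```

Code that cares about a specific channel keeps using channels.r and friends, while code that treats all channels alike can loop over channels_array instead of repeating itself seven times.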



[Mesa-dev] [PATCH v3 03/12] isl: check whether a format is rgb if colorspace is yuv

2017-10-04 Thread Lionel Landwerlin
Suggested by Chad.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/isl/isl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 98de4c0f57f..e3acb0ec280 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -1512,6 +1512,8 @@ enum isl_format isl_format_srgb_to_linear(enum isl_format 
fmt);
 static inline bool
 isl_format_is_rgb(enum isl_format fmt)
 {
+   if (isl_format_is_yuv(fmt))
+  return false;
return isl_format_layouts[fmt].channels.r.bits > 0 &&
   isl_format_layouts[fmt].channels.g.bits > 0 &&
   isl_format_layouts[fmt].channels.b.bits > 0 &&
-- 
2.14.2



Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Connor Abbott
No. From the LLVM langref:

The ‘llvm.fmuladd.*‘ intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that
(a) the target instruction set has support for a fused operation, and
(b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.

The (b) part is especially important -- it says that LLVM can pick and
choose which fmuladd intrinsics to turn into FMA instructions, or
unfused MULADD instructions, or just a sequence of mul+add. For
example, if many instructions call fmuladd with the first two
arguments the same, it can break it up into a mul followed by a bunch
of adds. That wouldn't be ok under the GLSL precise semantics
(assuming the target would've used FMA otherwise, which I think some
GCN cards will do).

Also, and maybe more importantly, if an app developer explicitly asks
for fma() with a precise modifier, it's probably not a great idea to
then give them an unfused mul+add -- it's legal, thanks to GLSL's
weasel-wording, but probably not what you really want, on HW which
actually does have an FMA instruction :)
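
To make the numerical side concrete, here is a minimal sketch in plain C (not LLVM IR, and not what the driver emits). The example_* names are made up for illustration, and the single-rounding path is emulated in double, which is exact for this input but is not a general replacement for fmaf(). With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the unfused version should yield 0 and the single-rounded one 2^-24:

```c
#include <assert.h>

/* Unfused: the product is rounded to float before the add (two roundings). */
static float
example_mul_then_add(float a, float b, float c)
{
   /* volatile forces the intermediate to be rounded to float and keeps the
    * compiler from contracting the expression into a fused operation.
    */
   volatile float product = a * b;
   return product + c;
}

/* Single rounding, emulated: the product of two floats is exact in double,
 * so only the final conversion back to float rounds for this input.
 */
static float
example_single_rounding(float a, float b, float c)
{
   double exact_product = (double)a * (double)b;
   return (float)(exact_product + (double)c);
}

/* Returns 1 if the two evaluation orders disagree for the sample input. */
static int
example_results_differ(void)
{
   const float a = 1.0f + 0x1p-12f;
   const float c = -(1.0f + 0x1p-11f);

   assert(a > 1.0f); /* sanity check: 2^-12 must survive in float */
   return example_mul_then_add(a, a, c) != example_single_rounding(a, a, c);
}
```

If the compiler is free to pick either form per call site, two shaders computing the same expression can end up with different bits, which is exactly what precise/invariant are supposed to rule out.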

Connor


On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin  wrote:
> Wouldn't this guarantee that nothing is fused (and thus fine)?
> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>
> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott  wrote:
>> If the fma has the exact flag, then we need to use the llvm.fma
>> intrinsic. These come from fma() calls with the precise or invariant
>> qualifiers in GLSL, where you basically have to fuse everything or
>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>>
>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie  wrote:
>>> From: Dave Airlie 
>>>
>>> For Vulkan SPIR-V the spec states
>>> fma() Inherited from OpFMul followed by OpFAdd.
>>>
>>> Matt says the backend will do the right thing depending on the
>>> hardware being compiled for, if you use the fmuladd intrinsic.
>>>
>>> Using the Mad Max pts test, on high settings at 4K:
>>> CHP: 55->60
>>> HGDD: 46->50
>>> LM: 55->60
>>> No change on Stronghold.
>>>
>>> Thanks to Feral for spending the time to track this down.
>>>
>>> Signed-off-by: Dave Airlie 
>>> ---
>>>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c 
>>> b/src/amd/common/ac_nir_to_llvm.c
>>> index d7b6259..11ba487 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, 
>>> const nir_alu_instr *instr)
>>>   result);
>>> break;
>>> case nir_op_ffma:
>>> -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>>> +   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>>   ac_to_float_type(&ctx->ac,
>>> def_type), src[0], src[1], src[2]);
>>> break;
>>> case nir_op_ibitfield_extract:
>>> --
>>> 2.9.4
>>>


[Mesa-dev] [Bug 101397] [EGL] Surfaceless lacks swrast support

2017-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101397

--- Comment #2 from Eric Engestrom  ---
Initial support landed yesterday, but might need further work and testing:

commit 9d9a46d4efc00b256d2c0d04dda6c4ee3f0dc47a
Author: Gurchetan Singh 
Date:   Mon Oct 2 13:48:24 2017 -0700

egl/surfaceless: Use KMS swrast fallback

Please run your unit tests, and CC me on any bug you raise :)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Brian Paul

On 10/04/2017 09:44 AM, Rob Clark wrote:

Signed-off-by: Rob Clark 
---
  src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
  src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
  src/gallium/drivers/i915/i915_screen.c  |  1 +
  src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
  src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
  src/gallium/drivers/r300/r300_screen.c  |  1 +
  src/gallium/drivers/r600/r600_pipe.c|  1 +
  src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
  src/gallium/drivers/softpipe/sp_screen.c|  1 +
  src/gallium/drivers/svga/svga_screen.c  |  1 +
  src/gallium/drivers/swr/swr_screen.cpp  |  1 +
  src/gallium/drivers/vc4/vc4_screen.c|  1 +
  src/gallium/drivers/virgl/virgl_screen.c|  1 +
  src/gallium/include/pipe/p_defines.h| 21 +
  src/gallium/include/state_tracker/st_api.h  |  2 ++
  src/gallium/state_trackers/dri/dri_context.c| 11 +++
  src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
  src/mesa/state_tracker/st_manager.c |  5 +
  20 files changed, 61 insertions(+), 1 deletion(-)


Can you document the new CAP in src/gallium/docs/?

-Brian




Re: [Mesa-dev] [PATCH] configure.ac: bump Clover LLVM requirement to 3.9

2017-10-04 Thread Emil Velikov
On 4 October 2017 at 15:10, Jan Vesely  wrote:
> On Wed, 2017-10-04 at 14:59 +0100, Emil Velikov wrote:
>> On 3 October 2017 at 19:19, Jan Vesely  wrote:
>> > On Tue, 2017-10-03 at 17:51 +0100, Emil Velikov wrote:
>> > > From: Emil Velikov 
>> > >
>> > > The only driver that utilises Clover already depends on LLVM 3.9.
>> > > Additionally close to every supported distribution has said version.
>> > >
>> > > Additionally libclc requires LLVM 4.0 these days.
>> >
>> > support for llvm-3.9 has been restored to libclc since our discussion.
>> > sorry, I should have mentioned that.
>> >
>>
>> Right, I'll update the commit message as follows and push it in a few hours.
>
> Thanks.
> Acked-by: Jan Vesely 
>
> you might want to get the maintainer's (Francisco) ack as well.
>
I would love some input from him, that's why he's been in the CC chain
from the first email ;-)
I guess ^^ is not urgent, yet I'd love to trim down the Travis combinations a bit.

-Emil


Re: [Mesa-dev] [PATCH 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround

2017-10-04 Thread Rafael Antognolli
On Mon, Oct 02, 2017 at 07:39:04PM -0700, Jason Ekstrand wrote:
> On Mon, Oct 2, 2017 at 4:07 PM, Anuj Phogat  wrote:
> 
> WaFlushHangWhenNonPipelineStateAndMarkerStalled goes along
> with WaSampleOffsetIZ. Both recommend the same.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h|  2 +
>  src/mesa/drivers/dri/i965/brw_defines.h|  1 +
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 54
> ++
>  src/mesa/drivers/dri/i965/gen8_multisample_state.c |  8 
>  4 files changed, 65 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri
> /i965/brw_context.h
> index 92fc16de13..f0e8d562e9 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct
> brw_context *brw);
>  void brw_emit_depth_stall_flushes(struct brw_context *brw);
>  void gen7_emit_vs_workaround_flush(struct brw_context *brw);
>  void gen7_emit_cs_stall_flush(struct brw_context *brw);
> +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw);
> +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw);
> 
>  /* brw_queryformat.c */
>  void brw_query_internal_format(struct gl_context *ctx, GLenum target,
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri
> /i965/brw_defines.h
> index 4abb790612..270cdf29db 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode {
>  #define GEN7_GPGPU_DISPATCHDIMY 0x2504
>  #define GEN7_GPGPU_DISPATCHDIMZ 0x2508
> 
> +#define GEN7_CACHE_MODE_0   0x7000
>  #define GEN7_CACHE_MODE_1   0x7004
>  # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
>  # define GEN8_HIZ_NP_PMA_FIX_ENABLE(1 << 11)
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/
> drivers/dri/i965/brw_pipe_control.c
> index 460b8f73b6..6326957a7a 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -278,6 +278,60 @@ gen7_emit_cs_stall_flush(struct brw_context *brw)
> brw->workaround_bo, 0, 0);
>  }
> 
> +static void
> +brw_flush_write_caches(struct brw_context *brw) {
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CACHE_FLUSH_BITS);
> +}
> +
> +static void
> +brw_flush_read_caches(struct brw_context *brw) {
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> +}
> +
> +/**
> + * From Gen10 Workarounds page in h/w specs:
> + * WaSampleOffsetIZ:
> + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no
> + * markers in the pipeline by programming a PIPE_CONTROL with stall.
> + */
> +void
> +gen10_emit_wa_cs_stall_flush(struct brw_context *brw)
> +{
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   assert(devinfo->gen == 10);
> +   brw_emit_pipe_control_flush(brw,
> +   PIPE_CONTROL_CS_STALL |
> +   PIPE_CONTROL_STALL_AT_SCOREBOARD);
> +}
> +
> +/**
> + * From Gen10 Workarounds page in h/w specs:
> + * WaSampleOffsetIZ:
> + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an
> + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF
> (SVL)
> + * after the command to ensure the state has been delivered prior to any
> + * command causing a marker in the pipeline.
> + */
> +void
> +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw)
> +{
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   assert(devinfo->gen == 10);
> +
> +   /* Before changing the value of CACHE_MODE_0 register, GFX pipeline
> must
> +* be idle; i.e., full flush is required.
> +*/
> +   brw_flush_write_caches(brw);
> +   brw_flush_read_caches(brw);
> 
> 
> If you do
> 
> brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CACHE_FLUSH_BITS |
>  PIPE_CONTROL_CACHE_INVALIDATE_BITS)
> 
> It will automatically do both and insert a stall between them.  I don't think
> what you have above will ever actually CS stall, which appears to be required
> when changing CACHE_MODE_0.
>  
> 
> +
> +   /* Write to CACHE_MODE_0 (0x7000) */
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> +   OUT_BATCH(GEN7_CACHE_MODE_0);
> +   OUT_BATCH(0);
> 
> 

Re: [Mesa-dev] [PATCH 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround

2017-10-04 Thread Rafael Antognolli
Hi Anuj,

On Mon, Oct 02, 2017 at 04:07:57PM -0700, Anuj Phogat wrote:
> WaFlushHangWhenNonPipelineStateAndMarkerStalled goes along
> with WaSampleOffsetIZ. Both recommend the same.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h|  2 +
>  src/mesa/drivers/dri/i965/brw_defines.h|  1 +
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 54 
> ++
>  src/mesa/drivers/dri/i965/gen8_multisample_state.c |  8 
>  4 files changed, 65 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 92fc16de13..f0e8d562e9 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct 
> brw_context *brw);
>  void brw_emit_depth_stall_flushes(struct brw_context *brw);
>  void gen7_emit_vs_workaround_flush(struct brw_context *brw);
>  void gen7_emit_cs_stall_flush(struct brw_context *brw);
> +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw);
> +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw);
>  
>  /* brw_queryformat.c */
>  void brw_query_internal_format(struct gl_context *ctx, GLenum target,
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 4abb790612..270cdf29db 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode {
>  #define GEN7_GPGPU_DISPATCHDIMY 0x2504
>  #define GEN7_GPGPU_DISPATCHDIMZ 0x2508
>  
> +#define GEN7_CACHE_MODE_0   0x7000
>  #define GEN7_CACHE_MODE_1   0x7004
>  # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
>  # define GEN8_HIZ_NP_PMA_FIX_ENABLE(1 << 11)
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 460b8f73b6..6326957a7a 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -278,6 +278,60 @@ gen7_emit_cs_stall_flush(struct brw_context *brw)
> brw->workaround_bo, 0, 0);
>  }
>  
> +static void
> +brw_flush_write_caches(struct brw_context *brw) {
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CACHE_FLUSH_BITS);
> +}
> +
> +static void
> +brw_flush_read_caches(struct brw_context *brw) {
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> +}
> +
> +/**
> + * From Gen10 Workarounds page in h/w specs:
> + * WaSampleOffsetIZ:
> + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no
> + * markers in the pipeline by programming a PIPE_CONTROL with stall.
> + */
> +void
> +gen10_emit_wa_cs_stall_flush(struct brw_context *brw)
> +{
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   assert(devinfo->gen == 10);
> +   brw_emit_pipe_control_flush(brw,
> +   PIPE_CONTROL_CS_STALL |
> +   PIPE_CONTROL_STALL_AT_SCOREBOARD);
> +}
> +
> +/**
> + * From Gen10 Workarounds page in h/w specs:
> + * WaSampleOffsetIZ:
> + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an
> + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL)
> + * after the command to ensure the state has been delivered prior to any
> + * command causing a marker in the pipeline.
> + */
> +void
> +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw)
> +{
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   assert(devinfo->gen == 10);
> +
> +   /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
> +* be idle; i.e., full flush is required.
> +*/
> +   brw_flush_write_caches(brw);
> +   brw_flush_read_caches(brw);
> +
> +   /* Write to CACHE_MODE_0 (0x7000) */
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> +   OUT_BATCH(GEN7_CACHE_MODE_0);
> +   OUT_BATCH(0);
> +   ADVANCE_BATCH();
> +}
> +
>  /**
>   * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
>   * implementing two workarounds on gen6.  From section 1.4.7.1
> diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c 
> b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> index 7a31a5df4a..14043025b6 100644
> --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> @@ -49,6 +49,11 @@ gen8_emit_3dstate_multisample(struct brw_context *brw, 
> unsigned num_samples)
>  void
>  gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
>  {
> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +
> +   if (devinfo->gen == 10)
> +  gen10_emit_wa_cs_stall_flush(brw);

Note there's a mention in a document that describes:

"New workaround 

Re: [Mesa-dev] [PATCH 2/6] meson: build glx

2017-10-04 Thread Dylan Baker
Meson does in most cases, just apparently not in options. I'm going to write a
patch for it right now.

Dylan

Quoting Nicholas Miell (2017-10-03 18:38:26)
> On 10/03/2017 05:26 PM, Dylan Baker wrote:
> 
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 568903f1a0a..62d6b593f88 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> 
> > +option('glvnd',  type : 'boolean', vaule : false,
> 
> "vaule" again, although you fix this in "[PATCH 4/6] meson: build gbm" 
> of this series.
> 
> I'm beginning to think that Meson should warn about unknown keys in 
> dictionaries.




[Mesa-dev] [PATCH 5/6] freedreno: context priority support

2017-10-04 Thread Rob Clark
For devices (and kernels) which support different priority ringbuffers,
expose context priority support.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/freedreno_context.c |  9 -
 src/gallium/drivers/freedreno/freedreno_screen.c  | 12 +++-
 src/gallium/drivers/freedreno/freedreno_screen.h  |  1 +
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index 20480f4f8c1..7fdb848f380 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -249,10 +249,17 @@ fd_context_init(struct fd_context *ctx, struct 
pipe_screen *pscreen,
 {
struct fd_screen *screen = fd_screen(pscreen);
struct pipe_context *pctx;
+   unsigned prio = 1;
int i;
 
+   /* lower numerical value == higher priority: */
+   if (flags & PIPE_CONTEXT_HIGH_PRIORITY)
+   prio = 0;
+   else if (flags & PIPE_CONTEXT_LOW_PRIORITY)
+   prio = 2;
+
ctx->screen = screen;
-   ctx->pipe = fd_pipe_new(screen->dev, FD_PIPE_3D);
+   ctx->pipe = fd_pipe_new2(screen->dev, FD_PIPE_3D, prio);
 
ctx->primtypes = primtypes;
ctx->primtype_mask = 0;
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 96866d656be..aa451f501ff 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -325,9 +325,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
-   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
 
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
+   return screen->priority_mask;
+
case PIPE_CAP_MAX_VIEWPORTS:
return 1;
 
@@ -803,6 +805,14 @@ fd_screen_create(struct fd_device *dev)
}
screen->chip_id = val;
 
+   if (fd_pipe_get_param(screen->pipe, FD_NR_RINGS, &val)) {
+   DBG("could not get # of rings");
+   screen->priority_mask = 0;
+   } else {
+   /* # of rings equates to number of unique priority values: */
+   screen->priority_mask = (1 << val) - 1;
+   }
+
DBG("Pipe Info:");
DBG(" GPU-id:  %d", screen->gpu_id);
DBG(" Chip-id: 0x%08x", screen->chip_id);
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.h 
b/src/gallium/drivers/freedreno/freedreno_screen.h
index 68518ef721b..d5e497d4f65 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.h
+++ b/src/gallium/drivers/freedreno/freedreno_screen.h
@@ -67,6 +67,7 @@ struct fd_screen {
uint32_t max_rts;/* max # of render targets */
uint32_t gmem_alignw, gmem_alignh;
uint32_t num_vsc_pipes;
+   uint32_t priority_mask;
bool has_timestamp;
 
void *compiler;  /* currently unused for a2xx */
-- 
2.13.5
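
As a reading aid, a minimal standalone sketch of the two small computations in this patch: mapping context flags to a ring priority, and deriving the priority mask from the ring count. The EXAMPLE_* flag bits and example_* helpers are hypothetical stand-ins; the real PIPE_CONTEXT_* values are defined in p_defines.h and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define EXAMPLE_CONTEXT_LOW_PRIORITY  (1u << 0)
#define EXAMPLE_CONTEXT_HIGH_PRIORITY (1u << 1)

/* Lower numerical value == higher priority; 1 is the default. */
static unsigned
example_prio_from_flags(unsigned flags)
{
   if (flags & EXAMPLE_CONTEXT_HIGH_PRIORITY)
      return 0;
   if (flags & EXAMPLE_CONTEXT_LOW_PRIORITY)
      return 2;
   return 1;
}

/* One bit per ring: nr_rings rings give nr_rings unique priority values. */
static uint32_t
example_priority_mask(unsigned nr_rings)
{
   assert(nr_rings < 32); /* keep the shift well defined */
   return (1u << nr_rings) - 1;
}
```

With three rings the mask comes out as 0x7, and with zero rings it is 0, which matches the fallback taken when the ring count cannot be queried.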



[Mesa-dev] [PATCH 4/6] freedreno: per-context fd_pipe

2017-10-04 Thread Rob Clark
To enable per-context priorities, we need to have per-context pipes.
Unfortunately we still need to keep the global screen pipe, mostly just
for screen->get_timestamp().

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/a5xx/fd5_draw.c   | 2 +-
 src/gallium/drivers/freedreno/freedreno_batch.c | 6 +++---
 src/gallium/drivers/freedreno/freedreno_context.c   | 2 ++
 src/gallium/drivers/freedreno/freedreno_context.h   | 1 +
 src/gallium/drivers/freedreno/freedreno_fence.c | 2 +-
 src/gallium/drivers/freedreno/freedreno_query_acc.c | 6 +++---
 src/gallium/drivers/freedreno/freedreno_query_hw.c  | 4 ++--
 src/gallium/drivers/freedreno/freedreno_resource.c  | 4 ++--
 src/gallium/drivers/freedreno/freedreno_screen.h| 5 +
 9 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a5xx/fd5_draw.c 
b/src/gallium/drivers/freedreno/a5xx/fd5_draw.c
index d1f1d039b69..1e9117a5b96 100644
--- a/src/gallium/drivers/freedreno/a5xx/fd5_draw.c
+++ b/src/gallium/drivers/freedreno/a5xx/fd5_draw.c
@@ -194,7 +194,7 @@ fd5_clear_lrz(struct fd_batch *batch, struct fd_resource 
*zsbuf, double depth)
// draw
 
if (!batch->lrz_clear) {
-   batch->lrz_clear = fd_ringbuffer_new(batch->ctx->screen->pipe, 
0x1000);
+   batch->lrz_clear = fd_ringbuffer_new(batch->ctx->pipe, 0x1000);
fd_ringbuffer_set_parent(batch->lrz_clear, batch->gmem);
}
 
diff --git a/src/gallium/drivers/freedreno/freedreno_batch.c 
b/src/gallium/drivers/freedreno/freedreno_batch.c
index c2142b5a214..8f0f78861cf 100644
--- a/src/gallium/drivers/freedreno/freedreno_batch.c
+++ b/src/gallium/drivers/freedreno/freedreno_batch.c
@@ -53,9 +53,9 @@ batch_init(struct fd_batch *batch)
size = 0x10;
}
 
-   batch->draw= fd_ringbuffer_new(ctx->screen->pipe, size);
-   batch->binning = fd_ringbuffer_new(ctx->screen->pipe, size);
-   batch->gmem= fd_ringbuffer_new(ctx->screen->pipe, size);
+   batch->draw= fd_ringbuffer_new(ctx->pipe, size);
+   batch->binning = fd_ringbuffer_new(ctx->pipe, size);
+   batch->gmem= fd_ringbuffer_new(ctx->pipe, size);
 
fd_ringbuffer_set_parent(batch->gmem, NULL);
fd_ringbuffer_set_parent(batch->draw, batch->gmem);
diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index e17dcf7b684..20480f4f8c1 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -144,6 +144,7 @@ fd_context_destroy(struct pipe_context *pctx)
}
 
fd_device_del(ctx->dev);
+   fd_pipe_del(ctx->pipe);
 
if (fd_mesa_debug & (FD_DBG_BSTAT | FD_DBG_MSGS)) {
printf("batch_total=%u, batch_sysmem=%u, batch_gmem=%u, 
batch_restore=%u\n",
@@ -251,6 +252,7 @@ fd_context_init(struct fd_context *ctx, struct pipe_screen 
*pscreen,
int i;
 
ctx->screen = screen;
+   ctx->pipe = fd_pipe_new(screen->dev, FD_PIPE_3D);
 
ctx->primtypes = primtypes;
ctx->primtype_mask = 0;
diff --git a/src/gallium/drivers/freedreno/freedreno_context.h 
b/src/gallium/drivers/freedreno/freedreno_context.h
index 393b485a096..f10f7ef4ea5 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.h
+++ b/src/gallium/drivers/freedreno/freedreno_context.h
@@ -156,6 +156,7 @@ struct fd_context {
 
struct fd_device *dev;
struct fd_screen *screen;
+   struct fd_pipe *pipe;
 
struct util_queue flush_queue;
 
diff --git a/src/gallium/drivers/freedreno/freedreno_fence.c 
b/src/gallium/drivers/freedreno/freedreno_fence.c
index f20c6ac120e..e3d200aa3a1 100644
--- a/src/gallium/drivers/freedreno/freedreno_fence.c
+++ b/src/gallium/drivers/freedreno/freedreno_fence.c
@@ -69,7 +69,7 @@ boolean fd_fence_finish(struct pipe_screen *pscreen,
return ret == 0;
}
 
-   if (fd_pipe_wait_timeout(fence->screen->pipe, fence->timestamp, 
timeout))
+   if (fd_pipe_wait_timeout(fence->ctx->pipe, fence->timestamp, timeout))
return false;
 
return true;
diff --git a/src/gallium/drivers/freedreno/freedreno_query_acc.c 
b/src/gallium/drivers/freedreno/freedreno_query_acc.c
index 96cee1aee84..724ef69dc24 100644
--- a/src/gallium/drivers/freedreno/freedreno_query_acc.c
+++ b/src/gallium/drivers/freedreno/freedreno_query_acc.c
@@ -66,7 +66,7 @@ realloc_query_bo(struct fd_context *ctx, struct fd_acc_query 
*aq)
/* don't assume the buffer is zero-initialized: */
rsc = fd_resource(aq->prsc);
 
-   fd_bo_cpu_prep(rsc->bo, ctx->screen->pipe, DRM_FREEDRENO_PREP_WRITE);
+   fd_bo_cpu_prep(rsc->bo, ctx->pipe, DRM_FREEDRENO_PREP_WRITE);
 
map = fd_bo_map(rsc->bo);
memset(map, 0, aq->provider->size);
@@ -142,7 +142,7 @@ fd_acc_get_query_result(struct fd_context *ctx, struct 
fd_query *q,
  

[Mesa-dev] [PATCH 6/6] freedreno: add debug flag to force high priority context

2017-10-04 Thread Rob Clark
Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority
contexts.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/freedreno_context.c | 4 +++-
 src/gallium/drivers/freedreno/freedreno_screen.c  | 1 +
 src/gallium/drivers/freedreno/freedreno_util.h| 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index 7fdb848f380..fe46f710b87 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -253,7 +253,9 @@ fd_context_init(struct fd_context *ctx, struct pipe_screen 
*pscreen,
int i;
 
/* lower numerical value == higher priority: */
-   if (flags & PIPE_CONTEXT_HIGH_PRIORITY)
+   if (fd_mesa_debug & FD_DBG_HIPRIO)
+   prio = 0;
+   else if (flags & PIPE_CONTEXT_HIGH_PRIORITY)
prio = 0;
else if (flags & PIPE_CONTEXT_LOW_PRIORITY)
prio = 2;
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index aa451f501ff..8807fe86189 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -79,6 +79,7 @@ static const struct debug_named_value debug_options[] = {
{"bstat", FD_DBG_BSTAT,  "Print batch stats at context 
destroy"},
{"nogrow",FD_DBG_NOGROW, "Disable \"growable\" cmdstream 
buffers, even if kernel supports it"},
{"lrz",   FD_DBG_LRZ,"Enable experimental LRZ support 
(a5xx+)"},
+   {"hiprio",FD_DBG_HIPRIO, "Force high-priority context"},
DEBUG_NAMED_VALUE_END
 };
 
diff --git a/src/gallium/drivers/freedreno/freedreno_util.h 
b/src/gallium/drivers/freedreno/freedreno_util.h
index 14fcf1d6725..3ef669ca861 100644
--- a/src/gallium/drivers/freedreno/freedreno_util.h
+++ b/src/gallium/drivers/freedreno/freedreno_util.h
@@ -80,6 +80,7 @@ enum adreno_stencil_op fd_stencil_op(unsigned op);
 #define FD_DBG_BSTAT0x8000
 #define FD_DBG_NOGROW  0x1
 #define FD_DBG_LRZ 0x2
+#define FD_DBG_HIPRIO  0x4
 
 extern int fd_mesa_debug;
 extern bool fd_binning_enabled;
-- 
2.13.5



[Mesa-dev] [PATCH 3/6] freedreno: rename pipe -> vsc_pipe

2017-10-04 Thread Rob Clark
To add context priority support we need to have an fd_pipe per context,
rather than per-screen.  That conflicts with the existing ctx->pipe (which
is actually a visibility stream pipe, a hw resource), so just rename it.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c | 4 ++--
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c | 8 
 src/gallium/drivers/freedreno/a5xx/fd5_gmem.c | 8 
 src/gallium/drivers/freedreno/freedreno_context.c | 4 ++--
 src/gallium/drivers/freedreno/freedreno_context.h | 2 +-
 src/gallium/drivers/freedreno/freedreno_gmem.c| 4 ++--
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c 
b/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
index 151ecfbf613..4bbbcf90ffa 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
@@ -778,7 +778,7 @@ update_vsc_pipe(struct fd_batch *batch)
OUT_RELOCW(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
 
for (i = 0; i < 8; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
 
if (!pipe->bo) {
pipe->bo = fd_bo_new(ctx->dev, 0x4,
@@ -1011,7 +1011,7 @@ fd3_emit_tile_renderprep(struct fd_batch *batch, struct 
fd_tile *tile)
}
 
if (use_hw_binning(batch)) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[tile->p];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[tile->p];
 
assert(pipe->w * pipe->h);
 
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_gmem.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_gmem.c
index 49476d8636d..ebfbcabf67d 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_gmem.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_gmem.c
@@ -569,7 +569,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT0(ring, REG_A4XX_VSC_PIPE_CONFIG_REG(0), 8);
for (i = 0; i < 8; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
OUT_RING(ring, A4XX_VSC_PIPE_CONFIG_REG_X(pipe->x) |
A4XX_VSC_PIPE_CONFIG_REG_Y(pipe->y) |
A4XX_VSC_PIPE_CONFIG_REG_W(pipe->w) |
@@ -578,7 +578,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT0(ring, REG_A4XX_VSC_PIPE_DATA_ADDRESS_REG(0), 8);
for (i = 0; i < 8; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
if (!pipe->bo) {
pipe->bo = fd_bo_new(ctx->dev, 0x4,
DRM_FREEDRENO_GEM_TYPE_KMEM);
@@ -588,7 +588,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT0(ring, REG_A4XX_VSC_PIPE_DATA_LENGTH_REG(0), 8);
for (i = 0; i < 8; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
OUT_RING(ring, fd_bo_size(pipe->bo) - 32); /* 
VSC_PIPE_DATA_LENGTH[i] */
}
 }
@@ -767,7 +767,7 @@ fd4_emit_tile_renderprep(struct fd_batch *batch, struct 
fd_tile *tile)
uint32_t y2 = tile->yoff + tile->bin_h - 1;
 
if (use_hw_binning(batch)) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[tile->p];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[tile->p];
 
assert(pipe->w * pipe->h);
 
diff --git a/src/gallium/drivers/freedreno/a5xx/fd5_gmem.c 
b/src/gallium/drivers/freedreno/a5xx/fd5_gmem.c
index c623b572be5..d8d79217d5b 100644
--- a/src/gallium/drivers/freedreno/a5xx/fd5_gmem.c
+++ b/src/gallium/drivers/freedreno/a5xx/fd5_gmem.c
@@ -275,7 +275,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT4(ring, REG_A5XX_VSC_PIPE_CONFIG_REG(0), 16);
for (i = 0; i < 16; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
OUT_RING(ring, A5XX_VSC_PIPE_CONFIG_REG_X(pipe->x) |
A5XX_VSC_PIPE_CONFIG_REG_Y(pipe->y) |
A5XX_VSC_PIPE_CONFIG_REG_W(pipe->w) |
@@ -284,7 +284,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT4(ring, REG_A5XX_VSC_PIPE_DATA_ADDRESS_LO(0), 32);
for (i = 0; i < 16; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
if (!pipe->bo) {
pipe->bo = fd_bo_new(ctx->dev, 0x2,
DRM_FREEDRENO_GEM_TYPE_KMEM);
@@ -294,7 +294,7 @@ update_vsc_pipe(struct fd_batch *batch)
 
OUT_PKT4(ring, REG_A5XX_VSC_PIPE_DATA_LENGTH_REG(0), 16);
for (i = 0; i < 16; i++) {
-   struct fd_vsc_pipe *pipe = &ctx->pipe[i];
+   struct fd_vsc_pipe *pipe = &ctx->vsc_pipe[i];
OUT_RING(ring, 

[Mesa-dev] [PATCH 2/6] freedreno: pass context flags through to fd_context_init()

2017-10-04 Thread Rob Clark
Prep work for later patch.

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/a2xx/fd2_context.c  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.c  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_context.c  | 2 +-
 src/gallium/drivers/freedreno/a5xx/fd5_context.c  | 2 +-
 src/gallium/drivers/freedreno/freedreno_context.c | 2 +-
 src/gallium/drivers/freedreno/freedreno_context.h | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_context.c 
b/src/gallium/drivers/freedreno/a2xx/fd2_context.c
index ec76a227999..4f6e432c965 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_context.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_context.c
@@ -113,7 +113,7 @@ fd2_context_create(struct pipe_screen *pscreen, void *priv, 
unsigned flags)
 
pctx = fd_context_init(&fd2_ctx->base, pscreen,
(screen->gpu_id >= 220) ? a22x_primtypes : 
a20x_primtypes,
-   priv);
+   priv, flags);
if (!pctx)
return NULL;
 
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_context.c 
b/src/gallium/drivers/freedreno/a3xx/fd3_context.c
index b432f593e0f..476d06d43ff 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_context.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_context.c
@@ -94,7 +94,7 @@ fd3_context_create(struct pipe_screen *pscreen, void *priv, 
unsigned flags)
fd3_prog_init(pctx);
fd3_emit_init(pctx);
 
-   pctx = fd_context_init(&fd3_ctx->base, pscreen, primtypes, priv);
+   pctx = fd_context_init(&fd3_ctx->base, pscreen, primtypes, priv, flags);
if (!pctx)
return NULL;
 
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_context.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_context.c
index db292af8be1..82ba94a0895 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_context.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_context.c
@@ -94,7 +94,7 @@ fd4_context_create(struct pipe_screen *pscreen, void *priv, 
unsigned flags)
fd4_prog_init(pctx);
fd4_emit_init(pctx);
 
-   pctx = fd_context_init(&fd4_ctx->base, pscreen, primtypes, priv);
+   pctx = fd_context_init(&fd4_ctx->base, pscreen, primtypes, priv, flags);
if (!pctx)
return NULL;
 
diff --git a/src/gallium/drivers/freedreno/a5xx/fd5_context.c 
b/src/gallium/drivers/freedreno/a5xx/fd5_context.c
index 3632cc522ee..1d086338e9f 100644
--- a/src/gallium/drivers/freedreno/a5xx/fd5_context.c
+++ b/src/gallium/drivers/freedreno/a5xx/fd5_context.c
@@ -93,7 +93,7 @@ fd5_context_create(struct pipe_screen *pscreen, void *priv, 
unsigned flags)
fd5_prog_init(pctx);
fd5_emit_init(pctx);
 
-   pctx = fd_context_init(&fd5_ctx->base, pscreen, primtypes, priv);
+   pctx = fd_context_init(&fd5_ctx->base, pscreen, primtypes, priv, flags);
if (!pctx)
return NULL;
 
diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index 1cf366b0c6a..3d0ac3a22db 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -244,7 +244,7 @@ fd_context_cleanup_common_vbos(struct fd_context *ctx)
 
 struct pipe_context *
 fd_context_init(struct fd_context *ctx, struct pipe_screen *pscreen,
-   const uint8_t *primtypes, void *priv)
+   const uint8_t *primtypes, void *priv, unsigned flags)
 {
struct fd_screen *screen = fd_screen(pscreen);
struct pipe_context *pctx;
diff --git a/src/gallium/drivers/freedreno/freedreno_context.h 
b/src/gallium/drivers/freedreno/freedreno_context.h
index 4472afb83e1..c045661468e 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.h
+++ b/src/gallium/drivers/freedreno/freedreno_context.h
@@ -432,7 +432,7 @@ void fd_context_cleanup_common_vbos(struct fd_context *ctx);
 
 struct pipe_context * fd_context_init(struct fd_context *ctx,
struct pipe_screen *pscreen, const uint8_t *primtypes,
-   void *priv);
+   void *priv, unsigned flags);
 
 void fd_context_destroy(struct pipe_context *pctx);
 
-- 
2.13.5



[Mesa-dev] [PATCH 1/6] gallium: plumb context priority through to driver

2017-10-04 Thread Rob Clark
Signed-off-by: Rob Clark 
---
 src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
 src/gallium/drivers/freedreno/freedreno_screen.c|  1 +
 src/gallium/drivers/i915/i915_screen.c  |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
 src/gallium/drivers/r300/r300_screen.c  |  1 +
 src/gallium/drivers/r600/r600_pipe.c|  1 +
 src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
 src/gallium/drivers/softpipe/sp_screen.c|  1 +
 src/gallium/drivers/svga/svga_screen.c  |  1 +
 src/gallium/drivers/swr/swr_screen.cpp  |  1 +
 src/gallium/drivers/vc4/vc4_screen.c|  1 +
 src/gallium/drivers/virgl/virgl_screen.c|  1 +
 src/gallium/include/pipe/p_defines.h| 21 +
 src/gallium/include/state_tracker/st_api.h  |  2 ++
 src/gallium/state_trackers/dri/dri_context.c| 11 +++
 src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
 src/mesa/state_tracker/st_manager.c |  5 +
 20 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 42905ab0620..16bd4b7c0fb 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -264,6 +264,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
   return 0;
 
/* Stream output. */
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 040c2c99ec0..96866d656be 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -325,6 +325,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 8411c0f15cc..7bcf479c4be 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -317,6 +317,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
   return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 53171162a54..19411adaf07 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -360,6 +360,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index a66b4fbe67b..782ba0a64db 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -224,6 +224,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 479283e1b7c..997cb4e71dc 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -276,6 +276,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index ac850c493da..05913bccb65 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -305,6 +305,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
+   case 

[Mesa-dev] [PATCH 0/6] gallium/freedreno support for context priority

2017-10-04 Thread Rob Clark
These apply on top of Chris Wilson's patches which add the corresponding
EGL/core bits for IMG_context_priority[1] and add the gallium and
freedreno bits.  The freedreno parts depend on some libdrm_freedreno
patches that are WIP (need updating for some last minute changes we
made to the kernel UABI), so while I don't expect the freedreno gallium
parts to change, they aren't quite ready to merge.  Just including them
for reference, and so people can begin reviewing the gallium part (first
patch).

[1] https://patchwork.freedesktop.org/series/31159/
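
For anyone reviewing the gallium part without the full series at hand, the
idea behind a priority-mask cap can be sketched roughly as below. This is only
an illustration under assumptions: the EX_* names, the bit values, and the
fall-back-to-medium policy are made up for the sketch and are not taken from
the patches (the real definitions are in the p_defines.h hunk of patch 1).

```c
/* Hypothetical priority bits, for illustration only. */
#define EX_PRIORITY_LOW    (1u << 0)
#define EX_PRIORITY_MEDIUM (1u << 1)
#define EX_PRIORITY_HIGH   (1u << 2)

/* Driver side: report which priorities can be requested.
 * Returning 0 means context priority is not supported at all. */
static unsigned
ex_get_priority_mask(int kernel_supports_priorities)
{
   if (!kernel_supports_priorities)
      return 0;
   return EX_PRIORITY_LOW | EX_PRIORITY_MEDIUM | EX_PRIORITY_HIGH;
}

/* Caller side: honour a request only if the driver advertises it,
 * otherwise fall back to medium (an assumed default for this sketch). */
static unsigned
ex_pick_priority(unsigned mask, unsigned requested)
{
   return (mask & requested) ? requested : EX_PRIORITY_MEDIUM;
}
```

Drivers that simply return 0 for the new cap, as most do in patch 1, would
always end up on the default path in a scheme like this.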

Rob Clark (6):
  gallium: plumb context priority through to driver
  freedreno: pass context flags through to fd_context_init()
  freedreno: rename pipe -> vsc_pipe
  freedreno: per-context fd_pipe
  freedreno: context priority support
  freedreno: add debug flag to force high priority context

 src/gallium/drivers/etnaviv/etnaviv_screen.c|  1 +
 src/gallium/drivers/freedreno/a2xx/fd2_context.c|  2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.c|  2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c   |  4 ++--
 src/gallium/drivers/freedreno/a4xx/fd4_context.c|  2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c   |  8 
 src/gallium/drivers/freedreno/a5xx/fd5_context.c|  2 +-
 src/gallium/drivers/freedreno/a5xx/fd5_draw.c   |  2 +-
 src/gallium/drivers/freedreno/a5xx/fd5_gmem.c   |  8 
 src/gallium/drivers/freedreno/freedreno_batch.c |  6 +++---
 src/gallium/drivers/freedreno/freedreno_context.c   | 17 ++---
 src/gallium/drivers/freedreno/freedreno_context.h   |  5 +++--
 src/gallium/drivers/freedreno/freedreno_fence.c |  2 +-
 src/gallium/drivers/freedreno/freedreno_gmem.c  |  4 ++--
 src/gallium/drivers/freedreno/freedreno_query_acc.c |  6 +++---
 src/gallium/drivers/freedreno/freedreno_query_hw.c  |  4 ++--
 src/gallium/drivers/freedreno/freedreno_resource.c  |  4 ++--
 src/gallium/drivers/freedreno/freedreno_screen.c| 12 
 src/gallium/drivers/freedreno/freedreno_screen.h|  6 ++
 src/gallium/drivers/freedreno/freedreno_util.h  |  1 +
 src/gallium/drivers/i915/i915_screen.c  |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c|  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c  |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c  |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c  |  1 +
 src/gallium/drivers/r300/r300_screen.c  |  1 +
 src/gallium/drivers/r600/r600_pipe.c|  1 +
 src/gallium/drivers/radeonsi/si_pipe.c  |  1 +
 src/gallium/drivers/softpipe/sp_screen.c|  1 +
 src/gallium/drivers/svga/svga_screen.c  |  1 +
 src/gallium/drivers/swr/swr_screen.cpp  |  1 +
 src/gallium/drivers/vc4/vc4_screen.c|  1 +
 src/gallium/drivers/virgl/virgl_screen.c|  1 +
 src/gallium/include/pipe/p_defines.h| 21 +
 src/gallium/include/state_tracker/st_api.h  |  2 ++
 src/gallium/state_trackers/dri/dri_context.c| 11 +++
 src/gallium/state_trackers/dri/dri_query_renderer.c |  8 +++-
 src/mesa/state_tracker/st_manager.c |  5 +
 38 files changed, 124 insertions(+), 34 deletions(-)

-- 
2.13.5



Re: [Mesa-dev] [PATCH v3 22/22] docs: add a high level info about Tizen / Tizen Porting Layer (TPL) for EGL / Tizen Buffer Manager (TBM) / etc

2017-10-04 Thread Eric Engestrom
On Wednesday, 2017-10-04 06:50:34 +, Gwan-gyeong Mun wrote:
> It gives a quick overview and references of developing OpenGLES / EGL
> Driver for Tizen.

Thanks for that; haven't read through it, but it should be quite useful :)
One request below.

> 
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  docs/systems.html |   1 +
>  docs/tizen.html   | 245 
> ++
>  2 files changed, 246 insertions(+)
>  create mode 100644 docs/tizen.html
> 
> diff --git a/docs/systems.html b/docs/systems.html
> index b97e1f0a79..ab6c9c3f74 100644
> --- a/docs/systems.html
> +++ b/docs/systems.html
> @@ -63,6 +63,7 @@ drivers for the X Window System
>  and Unix-like operating systems
>  Microsoft Windows
>  VMware guest OS driver
> +Tizen
>  
>  
>  
> diff --git a/docs/tizen.html b/docs/tizen.html
> new file mode 100644
> index 00..bce3d05bda
> --- /dev/null
> +++ b/docs/tizen.html
> @@ -0,0 +1,245 @@
> + "http://www.w3.org/TR/html4/loose.dtd;>
> +
> +
> +  
> +  Tizen
> +  
> +
> +
> +
> +
> +  The Mesa 3D Graphics Library
> +
> +
> +
> +
> +
> +Introduction
> +
> +
> +This document describes the essential elements of Tizen's platform-level
> +graphics architecture related to OpenGL ES and EGL,
> +and how it is used by the application framework and the display server.
> +The focus is on how graphical data buffers move through the system.
> +
> +
> +
> +The Tizen platform requires an OpenGL ES driver to accelerate
> +the Wayland display server and wayland-egl clients.
> +The platform needs an OpenGL ES and EGL driver that is implemented on top of
> +the Tizen Porting Layer for EGL.
> +
> +
> +
> +Tizen OpenGL ES and EGL Architecture
> +
> +
> +The following figure illustrates the Tizen OpenGL ES and EGL architecture.
> +
> +
> +
> +  https://wiki.tizen.org/images/d/d6/OPENGLES_STACK.png;
> +  width="800" height="582" />
> +
> +
> +
> +CoreGL
> +
> +An injection layer of OpenGL ES that provides the following 
> capabilities:
> +
> +
> +   Support for driver-independent optimization (FastPath)
> +   EGL/OpenGL ES debugging
> +   Performance logging
> +
> +
> +
> +Tizen Porting Layer (TPL) for EGL
> +
> +
> +TPL-EGL is an abstraction layer for surface and buffer management on Tizen
> +platform. It is used for implementation of the EGL platform functions.
> +
> +
> +
> +  https://wiki.tizen.org/images/0/0e/Tpl_architecture.png;
> +  width="800" height="204" />
> +
> +
> +
> +
> +  
> +  The Tizen Porting Layer for EGL grew out of the various window
> +  system protocols in Tizen: there was a need to separate the common layer
> +  from the backends.
> +  
> +  
> +  Tizen uses the Tizen Porting Layer for EGL because the TPL-EGL APIs reduce
> +  the burden of porting EGL to various window system protocols.
> +  The GPU GL driver’s window system porting layer can be implemented with
> +  the TPL-EGL APIs, which stand in for the corresponding window system APIs.
> +  The TBM, Wayland, and GBM backends are supported.
> +  
> +
> +
> +
> +Tizen Porting Layer for EGL Object Model
> +
> +
> +TPL-EGL provides interfaces based on an object-driven model.
> +Every TPL-EGL object can be represented as a generic tpl_object_t,
> +which is reference-counted and provides common functions.
> +Currently, display and surface types of TPL-EGL objects are provided.
> +Display, like normal display, represents a display system which is usually
> +used for connection to the server. Surface corresponds to a native surface
> +like wl_surface. A surface might be configured to use N-buffers,
> +but is usually double-buffered or triple-buffered.
> +Buffer is actually something to render on, usually a set of pixels
> +or a block of memory. For these two objects, Wayland, GBM, and TBM backends are
> +defined, each corresponding to its own window system.
> +This means that you do not need to care about the underlying window system.
> +
> +
> +
> +TPL-EGL Core Object
> +
> +
> +  TPL-EGL Object
> +  
> +Base class for all TPL-EGL objects
> +  
> +
> +  TPL-EGL Display
> +  
> +
> +Encapsulates the native display object (Display *, wl_display). Like a
> +normal display, it represents a display system, which is usually used for
> +connection to the server and as a scope for other objects.
> +  
> +  
> +
> +  TPL-EGL Surface
> +  
> +
> +Encapsulates the native drawable object (Window, Pixmap, wl_surface)
> +The surface corresponds to a native surface, such as tbm_surface_queue
> +or wl_surface. A surface can be configured to use N-buffers,
> +but they are usually double-buffered or triple-buffered.
> +
> +  
> +
> +
> +
> +TPL-EGL Objects and Corresponding EGL Objects
> +
> +Both TPL-EGL and the vendor GLES/EGL driver handle the tbm_surface as
> +the TPL surface's corresponding buffer. It is represented by the TBM_Surface
> +part in the following figure.
> +
> +
> +
> +  https://wiki.tizen.org/images/e/e6/Relationship_TPL_EGL_Gray.png;
> +  width="800" height="403" />
> +
> +
> +
> +The 

Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Ilia Mirkin
Wouldn't this guarantee that nothing is fused (and thus fine)?
Presumably fmuladd always does mul+add either as 1 or 2 instructions?

On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott  wrote:
> If the fma has the exact flag, then we need to use the llvm.fma
> intrinsic. These come from fma() calls with the precise or invariant
> qualifiers in GLSL, where you basically have to fuse everything or
> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>
> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie  wrote:
>> From: Dave Airlie 
>>
>> For Vulkan SPIR-V the spec states
>> fma() Inherited from OpFMul followed by OpFAdd.
>>
>> Matt says the backend will do the right thing depending on the
>> hardware being compiled for, if you use the fmuladd intrinsic.
>>
>> Using the Mad Max pts test, on high settings at 4K:
>> CHP: 55->60
>> HGDD: 46->50
>> LM: 55->60
>> No change on Stronghold.
>>
>> Thanks to Feral for spending the time to track this down.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/amd/common/ac_nir_to_llvm.c 
>> b/src/amd/common/ac_nir_to_llvm.c
>> index d7b6259..11ba487 100644
>> --- a/src/amd/common/ac_nir_to_llvm.c
>> +++ b/src/amd/common/ac_nir_to_llvm.c
>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, 
>> const nir_alu_instr *instr)
>>   result);
>> break;
>> case nir_op_ffma:
>> -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>> +   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>   ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
>> break;
>> case nir_op_ibitfield_extract:
>> --
>> 2.9.4
>>


Re: [Mesa-dev] [PATCH v2 01/11] vulkan: util: add macros to extract extension/offset number from enums

2017-10-04 Thread Lionel Landwerlin

On 04/10/17 00:40, Chad Versace wrote:

On Tue 03 Oct 2017, Jason Ekstrand wrote:

On Tue, Oct 3, 2017 at 3:18 PM, Lionel Landwerlin <lionel.g.landwer...@intel.com> wrote:

 On 03/10/17 21:21, Chad Versace wrote:

 On Tue 03 Oct 2017, Lionel Landwerlin wrote:

 On 03/10/17 19:13, Jason Ekstrand wrote:

      +1 to static inline

 Done locally.

 Cool. Waiting to see it appear in wip/djdeath/ycbcr_conversion.


 Ah...
 I didn't actually test that (with all the other commits on top).

 Unfortunately, that somewhat breaks the way we index formats:

 https://github.com/djdeath/mesa/blob/wip/djdeath/ycbcr_conversion/src/intel/vulkan/anv_formats.c#L49


Right... That's a bummer.  Macros it is, I guess.

Fair enough. But please make the macros uppercase, so no one is hurt by
the multiple evaluation.
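
To make the trade-off discussed here concrete, a rough sketch follows. The
EX_* names are hypothetical and not the ones from the patch; the only fact
relied on is the Vulkan convention that extension enum values are numbered
1000000000 + (extension number - 1) * 1000 + offset. The sketch assumes the
macros are wanted because they can appear in constant expressions such as
array designators, which a static inline function cannot, and it shows why
the uppercase spelling is a useful warning: the argument may be evaluated
twice.

```c
#include <stdint.h>

/* Base value of extension-provided enums (Vulkan numbering convention). */
#define EX_EXT_BASE 1000000000u

/* Hypothetical helper macros. NOTE: the argument is evaluated more than
 * once, so it must not have side effects. */
#define EX_ENUM_EXTENSION(e) \
   ((e) >= EX_EXT_BASE ? ((((e) - EX_EXT_BASE) / 1000u) + 1u) : 0u)
#define EX_ENUM_OFFSET(e) \
   ((e) >= EX_EXT_BASE ? ((e) % 1000u) : (e))

/* A macro is usable as an array designator (a constant expression). */
static const char *const ex_names[] = {
   [EX_ENUM_OFFSET(1000156002u)] = "offset 2 of extension 157",
};

/* A side-effecting argument, to make the multiple evaluation visible. */
static unsigned ex_calls;

static uint32_t
ex_next_enum(void)
{
   ex_calls++;
   return 1000156002u;
}
```

Passing ex_next_enum() to EX_ENUM_EXTENSION() calls it twice: once for the
comparison and once for the subtraction.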

I updated the branch on github. The first 7 patches should have 
addressed the review comments.

Still need to deal with array of samplers in the nir pass.

I can resend if you want.

-
Lionel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 09/11] anv: add nir lowering pass for ycrcb textures

2017-10-04 Thread Lionel Landwerlin

On 04/10/17 00:58, Jason Ekstrand wrote:



+   struct ycbcr_state state = {
+      .builder = builder,
+      .origin_tex = tex,
+      .conversion = sampler->conversion,


What about arrays of samplers?  You're not allowed to indirect on them 
but they can, in theory, exist and have different samplers with 
different conversions.  I think we need to fish the array index out 
and take sampler[idx].conversion here instead.
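
A minimal sketch of what taking sampler[idx].conversion could look like,
using made-up stand-in types (ex_conversion, ex_sampler, ex_binding) rather
than the real anv structures, and assuming the array index has already been
extracted from the deref as a constant:

```c
#include <stddef.h>

/* Stand-in types for illustration; not the anv definitions. */
struct ex_conversion {
   int ycbcr_model;
};

struct ex_sampler {
   const struct ex_conversion *conversion; /* NULL if no conversion */
};

struct ex_binding {
   const struct ex_sampler *samplers;
   unsigned array_size;
};

/* Look up the conversion of the sampler at a constant array index,
 * instead of always using the first sampler of the binding.
 * Returns NULL for an out-of-range index or a sampler without conversion. */
static const struct ex_conversion *
ex_binding_conversion(const struct ex_binding *binding, unsigned idx)
{
   if (idx >= binding->array_size)
      return NULL;
   return binding->samplers[idx].conversion;
}
```

Each element of the array can then carry a different conversion, which is
the case the single sampler->conversion lookup would miss.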


Thanks, need to do that indeed.



Re: [Mesa-dev] [PATCH v2 09/11] anv: add nir lowering pass for ycrcb textures

2017-10-04 Thread Lionel Landwerlin

On 04/10/17 00:54, Jason Ekstrand wrote:
On Tue, Oct 3, 2017 at 9:29 AM, Lionel Landwerlin wrote:


This pass implements all the implicit conversions required by the
VK_KHR_sampler_ycbcr_conversion specification.

It also inserts plane sources onto sampling instructions that we then
let the pipeline layout pass deal with, when mapping things correctly
to descriptors.

Signed-off-by: Lionel Landwerlin
---
 src/intel/Makefile.sources                       |   1 +
 src/intel/vulkan/anv_nir.h                       |   3 +
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 62 ++-
 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c  | 468
+++
 src/intel/vulkan/anv_pipeline.c                  |  2 +
 src/intel/vulkan/anv_private.h                   |  16 +-
 6 files changed, 545 insertions(+), 7 deletions(-)
 create mode 100644 src/intel/vulkan/anv_nir_lower_ycbcr_textures.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index bca7a132b26..9672dcc252d 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -219,6 +219,7 @@ VULKAN_FILES := \
        vulkan/anv_nir_lower_input_attachments.c \
        vulkan/anv_nir_lower_multiview.c \
        vulkan/anv_nir_lower_push_constants.c \
+       vulkan/anv_nir_lower_ycbcr_textures.c \
        vulkan/anv_pass.c \
        vulkan/anv_pipeline.c \
        vulkan/anv_pipeline_cache.c \
diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
index 5b450b45cdf..0a06e3a1cf0 100644
--- a/src/intel/vulkan/anv_nir.h
+++ b/src/intel/vulkan/anv_nir.h
@@ -37,6 +37,9 @@ void anv_nir_lower_push_constants(nir_shader
*shader);

 bool anv_nir_lower_multiview(nir_shader *shader, uint32_t view_mask);

+void anv_nir_lower_ycbcr_textures(nir_shader *shader,
+                                  struct anv_pipeline *pipeline);
+
 void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
                                    nir_shader *shader,
                                    struct brw_stage_prog_data
*prog_data,
diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index 428cfdf42d1..7cd28debe09 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -131,7 +131,7 @@ lower_res_index_intrinsic(nir_intrinsic_instr
*intrin,
 static void
 lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
                 unsigned *const_index, unsigned hw_binding_size,
-                nir_tex_src_type src_type,
+                nir_tex_src_type src_type, bool allow_indirect,
                 struct apply_pipeline_layout_state *state)
 {
    nir_builder *b = &state->builder;
@@ -141,6 +141,15 @@ lower_tex_deref(nir_tex_instr *tex,
nir_deref_var *deref,
       nir_deref_array *deref_array =
nir_deref_as_array(deref->deref.child);

       if (deref_array->deref_array_type ==
nir_deref_array_type_indirect) {
+         /* From VK_KHR_sampler_ycbcr_conversion:
+          *
+          * If sampler Y’CBCR conversion is enabled, the combined
image
+          * sampler must be indexed only by constant integral
expressions when
+          * aggregated into arrays in shader code, irrespective
of the
+          * shaderSampledImageArrayDynamicIndexing feature.
+          */
+         assert(allow_indirect);
+
          nir_ssa_def *index =
             nir_iadd(b, nir_imm_int(b, deref_array->base_offset),
                         nir_ssa_for_src(b, deref_array->indirect,
1));
@@ -150,7 +159,6 @@ lower_tex_deref(nir_tex_instr *tex,
nir_deref_var *deref,

          nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src,
tex->num_srcs + 1);
-


Spurious change?

          for (unsigned i = 0; i < tex->num_srcs; i++) {
             new_srcs[i].src_type = tex->src[i].src_type;
             nir_instr_move_src(&tex->instr, &new_srcs[i].src,
&tex->src[i].src);
@@ -186,6 +194,46 @@ cleanup_tex_deref(nir_tex_instr *tex,
nir_deref_var *deref)
    nir_instr_rewrite_src(&tex->instr, &deref_array->indirect,
NIR_SRC_INIT);
 }

+static bool
+has_tex_src_plane(nir_tex_instr *tex)
+{
+   for (unsigned i = 0; i < tex->num_srcs; i++) {
+      if (tex->src[i].src_type == nir_tex_src_plane)
+         return true;
+   }
+
+   return false;
+}
+
+static uint32_t
+extract_tex_src_plane(nir_tex_instr *tex)
+{
+   nir_tex_src *new_srcs = rzalloc_array(tex, nir_tex_src,
tex->num_srcs - 1);
+   

Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Connor Abbott
If the fma has the exact flag, then we need to use the llvm.fma
intrinsic. These come from fma() calls with the precise or invariant
qualifiers in GLSL, where you basically have to fuse everything or
fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
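
Roughly what I have in mind, as a sketch only: ex_ffma_intrinsic() is a
made-up helper, and how the exact flag would actually be read from the
instruction in visit_alu() is left out here.

```c
#include <stdbool.h>

/* Hypothetical helper: pick the LLVM intrinsic for a NIR ffma.
 * An exact ffma must stay fused, so it keeps llvm.fma; otherwise
 * llvm.fmuladd lets the backend choose fused or unfused per target. */
static const char *
ex_ffma_intrinsic(bool exact)
{
   return exact ? "llvm.fma" : "llvm.fmuladd";
}
```

The returned name would then be passed to emit_intrin_3f_param() in place of
the hard-coded string.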

On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> For Vulkan SPIR-V the spec states
> fma() Inherited from OpFMul followed by OpFAdd.
>
> Matt says the backend will do the right thing depending on the
> hardware being compiled for, if you use the fmuladd intrinsic.
>
> Using the Mad Max pts test, on high settings at 4K:
> CHP: 55->60
> HGDD: 46->50
> LM: 55->60
> No change on Stronghold.
>
> Thanks to Feral for spending the time to track this down.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index d7b6259..11ba487 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
> nir_alu_instr *instr)
>   result);
> break;
> case nir_op_ffma:
> -   result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
> +   result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>   ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
> break;
> case nir_op_ibitfield_extract:
> --
> 2.9.4
>


Re: [Mesa-dev] [PATCH v3 05/22] egl: add dri2_egl_surface_free_outdated_buffers_and_update_size() helper

2017-10-04 Thread Eric Engestrom
On Wednesday, 2017-10-04 06:50:17 +, Gwan-gyeong Mun wrote:
> To share common free outdated buffers and update size code.
> This compares width and height arguments with current egl surface dimension,
> if the compared surface dimension is differ, then it free local buffers and
> updates dimension.

Can you split out these refactors into a separate series, and then
mention in your next spin of your tizen patches that the latter depends
on the former?

As for the refactors themselves, can you add the new functions and use
them to replace the old code in the same patches?

It's much easier to review "this code block has been moved into
a separate function and is now called here" rather than having to juggle
multiple patches to see if the code is identical or if you're changing
something and why.

Cheers,
  Eric

> 
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.c | 12 
>  src/egl/drivers/dri2/egl_dri2.h |  3 +++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 89e18b6331..8d4bfa8c1a 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -1066,6 +1066,18 @@ dri2_egl_surface_free_local_buffers(struct 
> dri2_egl_surface *dri2_surf)
> }
>  }
>  
> +void
> +dri2_egl_surface_free_outdated_buffers_and_update_size(struct 
> dri2_egl_surface *dri2_surf,
> +   int width, int height)
> +{
> +   /* free outdated buffers and update the surface size */
> +   if (dri2_surf->base.Width != width || dri2_surf->base.Height != height) {
> +  dri2_egl_surface_free_local_buffers(dri2_surf);
> +  dri2_surf->base.Width = width;
> +  dri2_surf->base.Height = height;
> +   }
> +}
> +
>  /**
>   * Called via eglTerminate(), drv->API.Terminate().
>   *
> diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
> index d3cd9e1fef..4d2348e584 100644
> --- a/src/egl/drivers/dri2/egl_dri2.h
> +++ b/src/egl/drivers/dri2/egl_dri2.h
> @@ -486,6 +486,9 @@ dri2_egl_surface_alloc_local_buffer(struct 
> dri2_egl_surface *dri2_surf,
>  void
>  dri2_egl_surface_free_local_buffers(struct dri2_egl_surface *dri2_surf);
>  
> +void
> +dri2_egl_surface_free_outdated_buffers_and_update_size(struct 
> dri2_egl_surface *dri2_surf,
> +   int width, int 
> height);
>  EGLBoolean
>  dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
>  _EGLConfig *conf, const EGLint *attrib_list, EGLBoolean 
> enable_out_fence);
> -- 
> 2.14.2
> 


Re: [Mesa-dev] [PATCH v3 13/22] egl: add dri2_egl_surface_destroy_image_front() helper

2017-10-04 Thread Rob Herring
On Wed, Oct 4, 2017 at 9:02 AM, Rob Herring  wrote:
> On Wed, Oct 4, 2017 at 1:50 AM, Gwan-gyeong Mun  wrote:
>> To share common destroy dri_image_front code.
>>
>> Signed-off-by: Mun Gwan-gyeong 
>> ---
>>  src/egl/drivers/dri2/egl_dri2.c | 14 ++
>>  src/egl/drivers/dri2/egl_dri2.h |  3 +++
>>  2 files changed, 17 insertions(+)
>>
>> diff --git a/src/egl/drivers/dri2/egl_dri2.c 
>> b/src/egl/drivers/dri2/egl_dri2.c
>> index e13b13c282..4070a80b23 100644
>> --- a/src/egl/drivers/dri2/egl_dri2.c
>> +++ b/src/egl/drivers/dri2/egl_dri2.c
>> @@ -1153,6 +1153,20 @@ dri2_egl_surface_destroy_image_back(struct 
>> dri2_egl_surface *dri2_surf)
>>  #endif
>>  }
>>
>> +void
>> +dri2_egl_surface_destroy_image_front(struct dri2_egl_surface *dri2_surf)
>> +{
>> +#if defined(HAVE_ANDROID_PLATFORM) || defined(HAVE_TIZEN_PLATFORM)
>
> It seems this function only gets called from the Android and Tizen
> specific code, so you don't need the #ifdef.
>
> Plus you can probably also rely on dri_image_front being NULL for
> other platforms.

NM, I guess it is needed as dri_image_front is still conditional.
Perhaps it should not be?

Rob


Re: [Mesa-dev] [PATCH v3 08/22] egl/tizen: add support of dri2_loader (v2)

2017-10-04 Thread Rob Herring
On Wed, Oct 4, 2017 at 1:50 AM, Gwan-gyeong Mun  wrote:
> It adds support for dri2_loader to the egl dri2 tizen backend.
>   - the basic buffer flow and management implementation is based on android's.
>
> It also implements the buffer age query extension for tizen and turns on the
> swap_buffers_with_damage extension.
>   - it adds color buffer related member variables to dri_egl_surface for the
>     management of color buffers.
>
> v2: Fixes from Emil's review:
>a) Remove a temporary variable and return directly on get_format_bpp()
>b) Remove unneeded compiler pragma
>c) Follow coding style
>d) Rename get_pitch() to get_stride() for using of consistent naming
>    e) Remove mis-referencing from android implementation on treatment of
>       buffer age.
>       reference: https://lists.freedesktop.org/archives/mesa-dev/2017-June/158409.html
>    f) Use dri2_egl_surface_free_outdated_buffers_and_update_size() helper
>    g) Use dri2_egl_surface_record_buffers_and_update_back_buffer() helper
>    h) Use dri2_egl_surface_update_buffer_age() helper
>    i) Use env_var_as_boolean for hw_accel variable on dri2_initialize_tizen()
>    j) Remove getting of the device name and opening of the device node on
>       dri2_initialize_tizen()
>       And add duplicating of tbm_bufmgr_fd. As tbm_bufmgr_fd is managed by
>       tbm_bufmgr, if mesa uses this fd then we should duplicate it.
>    k) Add comments why we can not drop the dri2 codepath on
>       dri2_initialize_tizen()
>       As some kernels ported for tizen don't support render node feature yet,
>       currently we cannot drop the dri2 codepath.
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.h   |   9 ++
>  src/egl/drivers/dri2/platform_tizen.c | 257 
> --
>  2 files changed, 252 insertions(+), 14 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
> index 6f9d936ca5..7d047bf5dd 100644
> --- a/src/egl/drivers/dri2/egl_dri2.h
> +++ b/src/egl/drivers/dri2/egl_dri2.h
> @@ -340,6 +340,15 @@ struct dri2_egl_surface
> tpl_surface_t *tpl_surface;
> tbm_surface_h  tbm_surface;
> tbm_format tbm_format;
> +
> +   /* Used to record all the tbm_surface created by tpl_surface and their 
> ages.
> +* Usually Tizen uses at most triple buffers in tpl_surface 
> (tbm_surface_queue)
> +* so hardcode the number of color_buffers to 3.
> +*/
> +   struct {
> +  tbm_surface_h   buffer;
> +  int age;
> +   } color_buffers[3], *back;

dri2_egl_surface is quite the mess of ifdefery.

So now we have 3 instances of color_buffer and *back. This struct
really needs some refactoring to separate out the common and platform
specific bits. I'm not saying it has to be done as part of this series
though.

Rob


Re: [Mesa-dev] [PATCH v3 07/22] egl: add dri2_egl_surface_update_buffer_age() helper

2017-10-04 Thread Rob Herring
On Wed, Oct 4, 2017 at 1:50 AM, Gwan-gyeong Mun  wrote:
> To share common update buffer age code.
> This updates old buffer's age and sets current back buffer's age to 1.
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.c | 19 +++
>  src/egl/drivers/dri2/egl_dri2.h |  3 +++
>  2 files changed, 22 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 807403dc51..8f6a8a62cb 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -1120,6 +1120,25 @@ 
> dri2_egl_surface_record_buffers_and_update_back_buffer(struct 
> dri2_egl_surface *
>  #endif
>  }
>
> +void
> +dri2_egl_surface_update_buffer_age(struct dri2_egl_surface *dri2_surf)
> +{
> +   for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
> +  if (dri2_surf->color_buffers[i].age > 0)
> + dri2_surf->color_buffers[i].age++;
> +   }
> +
> +#ifdef HAVE_ANDROID_PLATFORM
> +   /* "XXX: we don't use get_back_bo() since it causes regressions in
> +* several dEQP tests.
> +*/
> +   if (dri2_surf->back)
> +  dri2_surf->back->age = 1;
> +#else
> +   dri2_surf->back->age = 1;

No need for the ifdef here. The only difference is in the !ANDROID
case you would crash if back is NULL. Is that somehow desired or
necessary behavior?
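
For reference, a minimal sketch of what the unified variant could look like
with the NULL check done on every platform. The struct below is a hypothetical
stand-in for the color-buffer members of dri2_egl_surface (the real struct has
many more members and platform-specific buffer types), so this only
illustrates the control flow, not the actual helper:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for one color buffer slot of dri2_egl_surface. */
struct color_buffer {
   void *buffer;
   int age;
};

/* Hypothetical stand-in for the relevant part of dri2_egl_surface. */
struct surface_sketch {
   struct color_buffer color_buffers[3];
   struct color_buffer *back;
};

/* Age every buffer that has already been used, then mark the current back
 * buffer as the newest one. The NULL check is not platform-specific.
 */
static void
surface_update_buffer_age(struct surface_sketch *surf)
{
   assert(surf != NULL);

   for (size_t i = 0; i < ARRAY_SIZE(surf->color_buffers); i++) {
      if (surf->color_buffers[i].age > 0)
         surf->color_buffers[i].age++;
   }

   if (surf->back)
      surf->back->age = 1;
}
```

With this shape, a surface without a back buffer only has its used buffers
aged, and no platform dereferences a NULL pointer.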

Rob
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac: bump Clover LLVM requirement to 3.9

2017-10-04 Thread Jan Vesely
On Wed, 2017-10-04 at 14:59 +0100, Emil Velikov wrote:
> On 3 October 2017 at 19:19, Jan Vesely  wrote:
> > On Tue, 2017-10-03 at 17:51 +0100, Emil Velikov wrote:
> > > From: Emil Velikov 
> > > 
> > > The only driver that utilises Clover already depends on LLVM 3.9.
> > > Additionally close to every supported distribution has said version.
> > > 
> > > Additionally libclc requires LLVM 4.0 these days.
> > 
> > support for llvm-3.9 has been restored to libclc since our discussion.
> > sorry, I should have mentioned that.
> > 
> 
> Right, I'll update the commit message as follows and push it in a few hours.

Thanks.
Acked-by: Jan Vesely 

you might want to get the maintainer's (Francisco) ack as well.

Jan

> 
> ---
> 
> The only driver that utilises Clover already depends on LLVM 3.9.
> Close to every supported distribution has said version.
> 
> Additionally libclc also requires LLVM 3.9.
> 
> With this in mind, we can safely bump the requirement.
> 
> There is a handful of dead code that we could remove, which will be
> resolved with later commits.
> 
> Note: this drops the LLVM 3.6 build from the Travis build. LLVM 3.9 (and
> later) are already covered in there.
> 
> https://lists.freedesktop.org/archives/mesa-dev/2017-September/170028.html
> 
> v2: Add reference to discussion thread (Eric), adjust libclc LLVM req. (Jan).
> 
> Cc: Jan Vesely 
> Cc: Francisco Jerez 
> Signed-off-by: Emil Velikov 
> Reviewed-by: Eric Engestrom 
> Acked-by: Vedran Miletić 
> 
> 
> -Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH v3 13/22] egl: add dri2_egl_surface_destroy_image_front() helper

2017-10-04 Thread Rob Herring
On Wed, Oct 4, 2017 at 1:50 AM, Gwan-gyeong Mun  wrote:
> To share common destroy dri_image_front code.
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/egl/drivers/dri2/egl_dri2.c | 14 ++
>  src/egl/drivers/dri2/egl_dri2.h |  3 +++
>  2 files changed, 17 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index e13b13c282..4070a80b23 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -1153,6 +1153,20 @@ dri2_egl_surface_destroy_image_back(struct 
> dri2_egl_surface *dri2_surf)
>  #endif
>  }
>
> +void
> +dri2_egl_surface_destroy_image_front(struct dri2_egl_surface *dri2_surf)
> +{
> +#if defined(HAVE_ANDROID_PLATFORM) || defined(HAVE_TIZEN_PLATFORM)

It seems this function only gets called from the Android and Tizen
specific code, so you don't need the #ifdef.

Plus you can probably also rely on dri_image_front being NULL for
other platforms.

> +   struct dri2_egl_display *dri2_dpy =
> +  dri2_egl_display(dri2_surf->base.Resource.Display);
> +
> +   if (dri2_surf->dri_image_front) {
> +  dri2_dpy->image->destroyImage(dri2_surf->dri_image_front);
> +  dri2_surf->dri_image_front = NULL;
> +   }
> +#endif
> +}
> +
>  /**
>   * Called via eglTerminate(), drv->API.Terminate().
>   *
> diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
> index a990fa3d83..fbef031fb6 100644
> --- a/src/egl/drivers/dri2/egl_dri2.h
> +++ b/src/egl/drivers/dri2/egl_dri2.h
> @@ -509,6 +509,9 @@ dri2_egl_surface_update_buffer_age(struct 
> dri2_egl_surface *dri2_surf);
>  void
>  dri2_egl_surface_destroy_image_back(struct dri2_egl_surface *dri2_surf);
>
> +void
> +dri2_egl_surface_destroy_image_front(struct dri2_egl_surface *dri2_surf);
> +
>  EGLBoolean
>  dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
>  _EGLConfig *conf, const EGLint *attrib_list, EGLBoolean 
> enable_out_fence);
> --
> 2.14.2
>


Re: [Mesa-dev] [PATCH] configure.ac: bump Clover LLVM requirement to 3.9

2017-10-04 Thread Emil Velikov
On 3 October 2017 at 19:19, Jan Vesely  wrote:
> On Tue, 2017-10-03 at 17:51 +0100, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> The only driver that utilises Clover already depends on LLVM 3.9.
>> Additionally close to every supported distribution has said version.
>>
>> Additionally libclc requires LLVM 4.0 these days.
>
> support for llvm-3.9 has been restored to libclc since our discussion.
> sorry, I should have mentioned that.
>
Right, I'll update the commit message as follows and push it in a few hours.

---

The only driver that utilises Clover already depends on LLVM 3.9.
Close to every supported distribution has said version.

Additionally libclc also requires LLVM 3.9.

With this in mind, we can safely bump the requirement.

There is a handful of dead code that we could remove, which will be
resolved with later commits.

Note: this drops the LLVM 3.6 build from the Travis build. LLVM 3.9 (and
later) are already covered in there.

https://lists.freedesktop.org/archives/mesa-dev/2017-September/170028.html

v2: Add reference to discussion thread (Eric), adjust libclc LLVM req. (Jan).

Cc: Jan Vesely 
Cc: Francisco Jerez 
Signed-off-by: Emil Velikov 
Reviewed-by: Eric Engestrom 
Acked-by: Vedran Miletić 


-Emil


Re: [Mesa-dev] [PATCH] wayland-egl: redistribute the wayland.egl.h include

2017-10-04 Thread Tobias Klausmann

On 10/4/17 3:47 PM, Emil Velikov wrote:
> On 3 October 2017 at 14:45, Tobias Klausmann
>  wrote:
>> Starting with commit ab0589c6ed ("wayland-egl: remove no longer needed
>> wayland-client dependency") the wayland-egl.h include was missing, leading
>> to a build failure:
>>
>>   CC   wayland-egl.lo
>> wayland-egl.c:33:10: fatal error: wayland-egl.h: No such file or directory
>>  #include "wayland-egl.h"
>>   ^~~
>>
> Thanks. I've added some text why we don't use WAYLAND_EGL_CFLAGS and
> pushed it to master.
>
> -Emil


Ah good point indeed, thanks for adding and pushing,

Tobias



Re: [Mesa-dev] [Libclc-dev] opencl-example: didn't compile (run) with latest LLVM git (for some days)

2017-10-04 Thread Emil Velikov
[dropping the libclc-devel list, which annoyingly rejects people who are
not subscribed]

On 3 October 2017 at 13:48, Emil Velikov  wrote:
> On 20 September 2017 at 18:26, Jan Vesely  wrote:
>> adding mesa-dev. This is not really related to libclc.
>>
>> On Wed, 2017-09-20 at 12:50 +0200, Dieter Nützel via Libclc-dev wrote:
>>> Worked OK with older version (for me latest was #6c9f36933c5) but with
>>> your 'clover: Fix build after LLVM r313390' reverted.
>>>
>>> Now I get this during compilation tries:
>>>
>>> /opt/opencl-example> make
>>> gcc -o hello_world hello_world.o cl_simple.o cl_util.o -L/usr/local/lib
>>> -lOpenCL
>>> /usr/local/lib64/libOpenCL.so: undefined reference to
>>> `llvm::LLVMContext::getDiagnosticHandler() const@LLVM_6.0'
>>> /usr/local/lib64/libOpenCL.so: undefined reference to
>>> `llvm::isKnownNonNull(llvm::Value const*)@LLVM_6.0'
>>> /usr/local/lib64/libOpenCL.so: undefined reference to
>>> `llvm::DIBuilder::createCompileUnit(unsigned int, llvm::DIFile*,
>>> llvm::StringRef, bool, llvm::StringRef, unsigned int, llvm::StringRef,
>>> llvm::DICompileUnit::DebugEmissionKind, unsigned long, bool,
>>> bool)@LLVM_6.0'
>>
>> This is odd, this function is pretty old.
>>
>>> /usr/local/lib64/libOpenCL.so: undefined reference to
>>> `llvm::LLVMContext::setDiagnosticHandler(void (*)(llvm::DiagnosticInfo
>>> const&, void*), void*, bool)@LLVM_6.0'
>>> collect2: error: ld returned 1 exit status
>>> make: *** [Makefile:10: hello_world] Fehler 1
>>>
>>> Greetings,
>>> Dieter
>>>
>>> For reference (running '/opt/amdgpu-pro/bin/clinfo'):
>>>
>>> /opt/opencl-example> /opt/amdgpu-pro/bin/clinfo
>>> /opt/amdgpu-pro/bin/clinfo: /usr/local/lib64/libOpenCL.so.1: no version
>>> information available (required by /opt/amdgpu-pro/bin/clinfo)
>>> /opt/amdgpu-pro/bin/clinfo: /usr/local/lib64/libOpenCL.so.1: no version
>>> information available (required by /opt/amdgpu-pro/bin/clinfo)
>>> ATTENTION: default value of option mesa_glthread overridden by
>>> environment.
>>> ATTENTION: default value of option radeonsi_assume_no_z_fights
>>> overridden by environment.
>>> ATTENTION: default value of option radeonsi_commutative_blend_add
>>> overridden by environment.
>>> ATTENTION: default value of option mesa_glthread overridden by
>>> environment.
>>> Number of platforms: 1
>>>Platform Profile:  FULL_PROFILE
>>>Platform Version:  OpenCL 1.1 Mesa
>>> 17.3.0-devel (git-94fef19509)
>>>Platform Name: Clover
>>>Platform Vendor:   Mesa
>>>Platform Extensions:   cl_khr_icd
>>>
>>>
>>>Platform Name: Clover
>>> Number of devices:   1
>>>Device Type:   CL_DEVICE_TYPE_GPU
>>>Vendor ID: 1002h
>>>Max compute units: 36
>>>Max work items dimensions: 3
>>>  Max work items[0]:   256
>>>  Max work items[1]:   256
>>>  Max work items[2]:   256
>>>Max work group size:   256
>>>Preferred vector width char:   16
>>>Preferred vector width short:  8
>>>Preferred vector width int:4
>>>Preferred vector width long:   2
>>>Preferred vector width float:  4
>>>Preferred vector width double: 2
>>>Native vector width char:  16
>>>Native vector width short: 8
>>>Native vector width int:   4
>>>Native vector width long:  2
>>>Native vector width float: 4
>>>Native vector width double:2
>>>Max clock frequency:   1411Mhz
>>>Address bits:  64
>>>Max memory allocation: 6010904166
>>>Image support: No
>>>Max size of kernel argument:   1024
>>>Alignment (bits) of base address:  1024
>>>Minimum alignment (bytes) for any datatype:128
>>>Single precision floating point capability
>>>  Denorms: No
>>>  Quiet NaNs:  Yes
>>>  Round to nearest even:   Yes
>>>  Round to zero:   No
>>>  Round to +ve and infinity:   No
>>>  IEEE754-2008 fused multiply-add: No
>>>Cache type:None
>>>Cache line size:   0
>>>Cache size:0
>>> 

Re: [Mesa-dev] [PATCH 2/2] swr/rast: use proper alignment for debug transposedPrims

2017-10-04 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Oct 3, 2017, at 3:23 PM, Tim Rowley  wrote:
> 
> Causing a crash in ParaView waveletcontour.py test when
> _DEBUG defined due to vector aligned copy with unaligned
> address.
> ---
> src/gallium/drivers/swr/rasterizer/core/clip.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h 
> b/src/gallium/drivers/swr/rasterizer/core/clip.h
> index cde5261521..e9a410daa3 100644
> --- a/src/gallium/drivers/swr/rasterizer/core/clip.h
> +++ b/src/gallium/drivers/swr/rasterizer/core/clip.h
> @@ -561,7 +561,7 @@ public:
> 
> #if defined(_DEBUG)
> // TODO: need to increase stack size, allocating SIMD16-widened 
> transposedPrims causes stack overflow in debug builds
> -SIMDVERTEX_T *transposedPrims = 
> reinterpret_cast(malloc(sizeof(SIMDVERTEX_T) 
> * 2));
> +SIMDVERTEX_T *transposedPrims = 
> reinterpret_cast *>(AlignedMalloc(sizeof(SIMDVERTEX_T) * 2, 64));
> 
> #else
> SIMDVERTEX_T transposedPrims[2];
> @@ -667,7 +667,7 @@ public:
> }
> 
> #if defined(_DEBUG)
> -free(transposedPrims);
> +AlignedFree(transposedPrims);
> 
> #endif
> // update global pipeline stat
> -- 
> 2.11.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
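
For readers following the thread: the crash described in the quoted patch
comes from plain malloc() only guaranteeing the platform's default alignment,
which is smaller than what an aligned SIMD copy needs. A minimal sketch of the
idea in plain C, using C11 aligned_alloc() as a stand-in for SWR's
AlignedMalloc(size, alignment); the helper names and the SIMD_ALIGNMENT
constant are assumptions made for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment, matching the 64 passed to AlignedMalloc in the patch. */
#define SIMD_ALIGNMENT 64

/* Allocate scratch memory suitable for aligned vector loads and stores.
 * aligned_alloc() expects the size to be a multiple of the alignment, so
 * the requested size is rounded up first. Release the result with free().
 */
static void *
alloc_simd_scratch(size_t size)
{
   assert(size > 0);

   size_t rounded = (size + SIMD_ALIGNMENT - 1) & ~(size_t)(SIMD_ALIGNMENT - 1);
   return aligned_alloc(SIMD_ALIGNMENT, rounded);
}

/* Returns nonzero if ptr can be used for an aligned 64-byte access. */
static int
is_simd_aligned(const void *ptr)
{
   return ((uintptr_t)ptr % SIMD_ALIGNMENT) == 0;
}
```

As in the patch, the allocation and the release have to be changed together;
memory from aligned_alloc() is released with free(), whereas SWR pairs
AlignedMalloc() with AlignedFree().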



Re: [Mesa-dev] [PATCH] configure.ac: bump Clover LLVM requirement to 3.9

2017-10-04 Thread Vedran Miletić
On 10/03/2017 06:51 PM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The only driver that utilises Clover already depends on LLVM 3.9.
> Additionally close to every supported distribution has said version.
> 
> Additionally libclc requires LLVM 4.0 these days.
> 
> With this in mind, there is a handful of dead code that we could remove.
> That will come with later commits.
> 
> Note: this drops the LLVM 3.6 build from the Travis build. LLVM 3.9 (and
> later) are already covered in there.
> 
> Cc: Vedran Miletić 
> Cc: Jan Vesely 
> Cc: Aaron Watry 
> Cc: Francisco Jerez 
> Signed-off-by: Emil Velikov 

Acked-by: Vedran Miletić 

> ---
> Vedran can we volunteer you for the cleanup ;-)
> ---

Yes, incoming.

Regards,
Vedran

-- 
Vedran Miletić
vedran.miletic.net


Re: [Mesa-dev] [PATCH] wayland-egl: redistribute the wayland.egl.h include

2017-10-04 Thread Emil Velikov
On 3 October 2017 at 14:45, Tobias Klausmann
 wrote:
> Starting with commit ab0589c6ed ("wayland-egl: remove no longer needed
> wayland-client dependency") the wayland-egl.h include was missing leading to a
> build failure:
>
>   CC   wayland-egl.lo
> wayland-egl.c:33:10: fatal error: wayland-egl.h: No such file or directory
>  #include "wayland-egl.h"
>   ^~~
>
Thanks. I've added some text why we don't use WAYLAND_EGL_CFLAGS and
pushed it to master.

-Emil


Re: [Mesa-dev] [PATCH mesa] travis: move include path from $CC to $CFLAGS

2017-10-04 Thread Eric Engestrom
On Wednesday, 2017-10-04 14:23:46 +0100, Emil Velikov wrote:
> On 4 October 2017 at 14:10, Eric Engestrom  wrote:
> > Signed-off-by: Eric Engestrom 
> Considering things still work (I'll push the wayland fix in a second)

Yeah, it worked for the meson build, but the make one is broken anyway,
so kinda hard to test.

> Reviewed-by: Emil Velikov 

Thanks, I'll push it when the wayland fix has landed.

> 
> -Emil


[Mesa-dev] [PATCH] i965: pass wanted format to intel_miptree_create_for_dri_image

2017-10-04 Thread Tapani Pälli
Change b3a44ae7a4 caused regressions on Android, where DRI and the renderbuffer
can disagree on the format being used. This patch removes the colorspace
parameter and passes the renderbuffer format instead. For non-winsys images we
still do the sRGB/linear modification in the same manner as change b3a44ae7a4
intended, but take the format from the renderbuffer instead of the DRI image.

This patch fixes regressions seen with following test sets:

   dEQP-EGL.functional.color_clears*
   dEQP-EGL.functional.render*

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102999
---
 src/mesa/drivers/dri/i965/brw_context.c   | 14 +--
 src/mesa/drivers/dri/i965/intel_fbo.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 34 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 +-
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  4 ++--
 5 files changed, 18 insertions(+), 38 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 1fd967e424..751b026439 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1596,21 +1596,9 @@ intel_update_image_buffer(struct brw_context *intel,
if (last_mt && last_mt->bo == buffer->bo)
   return;
 
-   enum isl_colorspace colorspace;
-   switch (_mesa_get_format_color_encoding(intel_rb_format(rb))) {
-   case GL_SRGB:
-  colorspace = ISL_COLORSPACE_SRGB;
-  break;
-   case GL_LINEAR:
-  colorspace = ISL_COLORSPACE_LINEAR;
-  break;
-   default:
-  unreachable("Invalid color encoding");
-   }
-
struct intel_mipmap_tree *mt =
   intel_miptree_create_for_dri_image(intel, buffer, GL_TEXTURE_2D,
- colorspace, true);
+ intel_rb_format(rb), true);
if (!mt)
   return;
 
diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 46f140c028..4a592f37ef 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -364,7 +364,7 @@ intel_image_target_renderbuffer_storage(struct gl_context 
*ctx,
 * content.
 */
irb->mt = intel_miptree_create_for_dri_image(brw, image, GL_TEXTURE_2D,
-ISL_COLORSPACE_NONE, false);
+image->format, false);
if (!irb->mt)
   return;
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5b7cde82f6..9870748711 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -959,34 +959,26 @@ create_ccs_buf_for_image(struct brw_context *brw,
 struct intel_mipmap_tree *
 intel_miptree_create_for_dri_image(struct brw_context *brw,
__DRIimage *image, GLenum target,
-   enum isl_colorspace colorspace,
+   mesa_format format,
bool is_winsys_image)
 {
-   if (image->planar_format && image->planar_format->nplanes > 1) {
-  assert(colorspace == ISL_COLORSPACE_NONE ||
- colorspace == ISL_COLORSPACE_YUV);
+   if (image->planar_format && image->planar_format->nplanes > 1)
   return miptree_create_for_planar_image(brw, image, target);
-   }
 
if (image->planar_format)
   assert(image->planar_format->planes[0].dri_format == image->dri_format);
 
-   mesa_format format = image->format;
-   switch (colorspace) {
-   case ISL_COLORSPACE_NONE:
-  /* Keep the image format unmodified */
-  break;
-
-   case ISL_COLORSPACE_LINEAR:
-  format =_mesa_get_srgb_format_linear(format);
-  break;
-
-   case ISL_COLORSPACE_SRGB:
-  format =_mesa_get_linear_format_srgb(format);
-  break;
-
-   default:
-  unreachable("Inalid colorspace for non-planar image");
+   if (!is_winsys_image) {
+  switch(_mesa_get_format_color_encoding(format)) {
+  case GL_SRGB:
+ format =_mesa_get_linear_format_srgb(format);
+ break;
+  case GL_LINEAR:
+ format =_mesa_get_srgb_format_linear(format);
+ break;
+  default:
+ unreachable("Invalid color encoding");
+  }
}
 
if (!brw->ctx.TextureFormatSupported[format]) {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 2fce28c524..439b0f66ae 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -407,7 +407,7 @@ struct intel_mipmap_tree *
 intel_miptree_create_for_dri_image(struct brw_context *brw,
__DRIimage *image,
GLenum target,
-   enum isl_colorspace colorspace,
+  

Re: [Mesa-dev] [PATCH mesa] travis: move include path from $CC to $CFLAGS

2017-10-04 Thread Emil Velikov
On 4 October 2017 at 14:10, Eric Engestrom  wrote:
> Signed-off-by: Eric Engestrom 
Considering things still work (I'll push the wayland fix in a second)
Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH mesa] travis: move include path from $CC to $CFLAGS

2017-10-04 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 .travis.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 2c87f60ec12c7a287a2c..19fd6acf3b9e4d6d9a8e 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -507,7 +507,7 @@ script:
   test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
   test -n "$OVERRIDE_PATH" && export PATH="$OVERRIDE_PATH:$PATH";
 
-  export CC="$CC -isystem`pwd`";
+  export CFLAGS="$CFLAGS -isystem`pwd`";
 
   ./autogen.sh --enable-debug
 $LIBUNWIND_FLAGS
@@ -528,7 +528,7 @@ script:
 fi
 
   - if test "x$BUILD" = xmeson; then
-  export CC="$CC -isystem`pwd`";
+  export CFLAGS="$CFLAGS -isystem`pwd`";
   meson _build $MESON_OPTIONS;
   ninja -C _build test;
 fi
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH] vulkan/wsi/wayland: Extend matching between vk and wl_drm formats

2017-10-04 Thread Alexandros Frantzis
Extend the matching from vk to wl_drm formats and vice-versa, to include
all supported RGB(A) formats. Since the memory layout of many Vulkan
formats depends on system endianness, take endianness into account when
performing the matching.
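
To illustrate why endianness matters here, a minimal sketch (not part of the
patch): a *_PACK32 Vulkan format is specified as one 32-bit word in host byte
order, so its byte order in memory differs between little- and big-endian
hosts, whereas the DRM fourcc formats that wl_drm mirrors are defined in terms
of little-endian words. The function and enum names below are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int
host_is_little_endian(void)
{
   const uint32_t probe = 1;
   uint8_t first;
   memcpy(&first, &probe, 1);
   return first == 1;
}

/* Pack a pixel the way an A8B8G8R8 *_PACK32 format is specified: a single
 * 32-bit word with A in bits 31..24 down to R in bits 7..0.
 */
static uint32_t
pack_a8b8g8r8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)g << 8) | r;
}

/* Hypothetical names for the two possible byte orders in memory. */
enum memory_layout {
   LAYOUT_R_G_B_A,   /* little-endian host */
   LAYOUT_A_B_G_R,   /* big-endian host */
};

/* Inspect how the packed word is actually laid out in memory on this host. */
static enum memory_layout
layout_of_packed_pixel(void)
{
   uint32_t px = pack_a8b8g8r8(0x11, 0x22, 0x33, 0x44);
   uint8_t bytes[4];
   memcpy(bytes, &px, sizeof(bytes));

   assert(bytes[0] == 0x11 || bytes[0] == 0x44);
   return bytes[0] == 0x11 ? LAYOUT_R_G_B_A : LAYOUT_A_B_G_R;
}
```

This is why the patch maps VK_FORMAT_A8B8G8R8_*_PACK32 to the ABGR variants on
little-endian hosts and to the RGBA variants on big-endian ones, while the
byte-array formats such as VK_FORMAT_R8G8B8A8_UNORM map to the same wl_drm
format on both.
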
---
 src/vulkan/wsi/wsi_common_wayland.c | 166 ++--
 1 file changed, 102 insertions(+), 64 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_wayland.c 
b/src/vulkan/wsi/wsi_common_wayland.c
index 4c94cd60a5..9bb1f68c1a 100644
--- a/src/vulkan/wsi/wsi_common_wayland.c
+++ b/src/vulkan/wsi/wsi_common_wayland.c
@@ -36,6 +36,7 @@
 #include "wayland-drm-client-protocol.h"
 
 #include 
+#include 
 #include 
 
 #define typed_memcpy(dest, src, count) ({ \
@@ -102,45 +103,57 @@ drm_handle_device(void *data, struct wl_drm *drm, const 
char *name)
 static uint32_t
 wl_drm_format_for_vk_format(VkFormat vk_format, bool alpha)
 {
+#define WL_DRM_FMT_XA(X, A) (alpha ? WL_DRM_FORMAT_ ## A : WL_DRM_FORMAT_ ## X)
+#ifdef PIPE_ARCH_LITTLE_ENDIAN
+#define WL_DRM_FMT_LE_XA(X, A) WL_DRM_FMT_XA(X, A)
+#define WL_DRM_FMT_LE_BE_XA(LE_X, LE_A, BE_X, BE_A) WL_DRM_FMT_XA(LE_X, LE_A)
+#else
+#define WL_DRM_FMT_LE_XA(X, A) 0
+#define WL_DRM_FMT_LE_BE_XA(LE_X, LE_A, BE_X, BE_A) WL_DRM_FMT_XA(BE_X, BE_A)
+#endif
+
switch (vk_format) {
-   /* TODO: Figure out what all the formats mean and make this table
-* correct.
-*/
-#if 0
-   case VK_FORMAT_R4G4B4A4_UNORM:
-  return alpha ? WL_DRM_FORMAT_ABGR : WL_DRM_FORMAT_XBGR;
-   case VK_FORMAT_R5G6B5_UNORM:
-  return WL_DRM_FORMAT_BGR565;
-   case VK_FORMAT_R5G5B5A1_UNORM:
-  return alpha ? WL_DRM_FORMAT_ABGR1555 : WL_DRM_FORMAT_XBGR1555;
+   case VK_FORMAT_R4G4B4A4_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(RGBX, RGBA);
+   case VK_FORMAT_B4G4R4A4_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(BGRX, BGRA);
+   case VK_FORMAT_R5G6B5_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(RGB565, RGB565);
+   case VK_FORMAT_B5G6R5_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(BGR565, BGR565);
+   case VK_FORMAT_R5G5B5A1_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(RGBX5551, RGBA5551);
+   case VK_FORMAT_B5G5R5A1_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(BGRX5551, BGRA5551);
+   case VK_FORMAT_A1R5G5B5_UNORM_PACK16:
+  return WL_DRM_FMT_LE_XA(XRGB1555, ARGB1555);
case VK_FORMAT_R8G8B8_UNORM:
-  return WL_DRM_FORMAT_XBGR;
-   case VK_FORMAT_R8G8B8A8_UNORM:
-  return alpha ? WL_DRM_FORMAT_ABGR : WL_DRM_FORMAT_XBGR;
-   case VK_FORMAT_R10G10B10A2_UNORM:
-  return alpha ? WL_DRM_FORMAT_ABGR2101010 : WL_DRM_FORMAT_XBGR2101010;
-   case VK_FORMAT_B4G4R4A4_UNORM:
-  return alpha ? WL_DRM_FORMAT_ARGB : WL_DRM_FORMAT_XRGB;
-   case VK_FORMAT_B5G6R5_UNORM:
-  return WL_DRM_FORMAT_RGB565;
-   case VK_FORMAT_B5G5R5A1_UNORM:
-  return alpha ? WL_DRM_FORMAT_XRGB1555 : WL_DRM_FORMAT_XRGB1555;
-#endif
+   case VK_FORMAT_R8G8B8_SRGB:
+  return WL_DRM_FORMAT_BGR888;
case VK_FORMAT_B8G8R8_UNORM:
case VK_FORMAT_B8G8R8_SRGB:
-  return WL_DRM_FORMAT_BGRX;
+  return WL_DRM_FORMAT_RGB888;
+   case VK_FORMAT_R8G8B8A8_UNORM:
+   case VK_FORMAT_R8G8B8A8_SRGB:
+  return WL_DRM_FMT_XA(XBGR, ABGR);
case VK_FORMAT_B8G8R8A8_UNORM:
case VK_FORMAT_B8G8R8A8_SRGB:
-  return alpha ? WL_DRM_FORMAT_ARGB : WL_DRM_FORMAT_XRGB;
-#if 0
-   case VK_FORMAT_B10G10R10A2_UNORM:
-  return alpha ? WL_DRM_FORMAT_ARGB2101010 : WL_DRM_FORMAT_XRGB2101010;
-#endif
-
+  return WL_DRM_FMT_XA(XRGB, ARGB);
+   case VK_FORMAT_A8B8G8R8_UNORM_PACK32:
+   case VK_FORMAT_A8B8G8R8_SRGB_PACK32:
+  return WL_DRM_FMT_LE_BE_XA(XBGR, ABGR, RGBX, RGBA);
+   case VK_FORMAT_A2R10G10B10_UNORM_PACK32:
+  return WL_DRM_FMT_LE_XA(XRGB2101010, ARGB2101010);
+   case VK_FORMAT_A2B10G10R10_UNORM_PACK32:
+  return WL_DRM_FMT_LE_XA(XBGR2101010, ABGR2101010);
default:
   assert(!"Unsupported Vulkan format");
   return 0;
}
+
+#undef WL_DRM_FMT_LE_BE_XA
+#undef WL_DRM_FMT_LE_XA
+#undef WL_DRM_FMT_XA
 }
 
 static void
@@ -150,56 +163,81 @@ drm_handle_format(void *data, struct wl_drm *drm, 
uint32_t wl_format)
if (display->formats.element_size == 0)
   return;
 
+#define ADD_VK_FORMAT(FMT) \
+   wsi_wl_display_add_vk_format(display, VK_FORMAT_ ## FMT)
+#ifdef PIPE_ARCH_LITTLE_ENDIAN
+#define ADD_VK_FORMAT_LE(FMT) ADD_VK_FORMAT(FMT)
+#define ADD_VK_FORMAT_BE(FMT)
+#else
+#define ADD_VK_FORMAT_LE(FMT)
+#define ADD_VK_FORMAT_BE(FMT) ADD_VK_FORMAT(FMT)
+#endif
+
switch (wl_format) {
-#if 0
-   case WL_DRM_FORMAT_ABGR:
-   case WL_DRM_FORMAT_XBGR:
-  wsi_wl_display_add_vk_format(display, VK_FORMAT_R4G4B4A4_UNORM);
+   case WL_DRM_FORMAT_RGBA:
+   case WL_DRM_FORMAT_RGBX:
+  ADD_VK_FORMAT_LE(R4G4B4A4_UNORM_PACK16);
   break;
-   case WL_DRM_FORMAT_BGR565:
-  wsi_wl_display_add_vk_format(display, VK_FORMAT_R5G6B5_UNORM);
+   case WL_DRM_FORMAT_BGRA:
+   case 

Re: [Mesa-dev] [PATCH] radv: enable tc compatible htile for d32s8 also.

2017-10-04 Thread Samuel Pitoiset

This totally breaks DOW3, I have to retract my Rb.

On 10/04/2017 02:13 PM, Samuel Pitoiset wrote:

Reviewed-by: Samuel Pitoiset 

On 10/04/2017 04:41 AM, Dave Airlie wrote:

From: Dave Airlie 

This enables tc compatible htile for stencil surfaces as well.

This gives a 3-5fps boost on Mad Max on high@4k.

It also depends on Bas's tc-compat htile patch.

Signed-off-by: Dave Airlie 
---
  src/amd/vulkan/radv_image.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index bf30281..c017bf8 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -114,7 +114,8 @@ radv_init_surface(struct radv_device *device,
  pCreateInfo->tiling != VK_IMAGE_TILING_LINEAR &&
  pCreateInfo->mipLevels <= 1 &&
  device->physical_device->rad_info.chip_class >= VI &&
-    (pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
+    ((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
+  pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
   (device->physical_device->rad_info.chip_class >= GFX9 &&
    pCreateInfo->format == VK_FORMAT_D16_UNORM)))
  surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;




Re: [Mesa-dev] [PATCH] radv: enable tc compatible htile for d32s8 also.

2017-10-04 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 

On 10/04/2017 04:41 AM, Dave Airlie wrote:

From: Dave Airlie 

This enables tc compatible htile for stencil surfaces as well.

This gives a 3-5fps boost on Mad Max on high@4k.

It also depends on Bas's tc-compat htile patch.

Signed-off-by: Dave Airlie 
---
  src/amd/vulkan/radv_image.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index bf30281..c017bf8 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -114,7 +114,8 @@ radv_init_surface(struct radv_device *device,
pCreateInfo->tiling != VK_IMAGE_TILING_LINEAR &&
pCreateInfo->mipLevels <= 1 &&
device->physical_device->rad_info.chip_class >= VI &&
-   (pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
+   ((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
+ pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
 (device->physical_device->rad_info.chip_class >= GFX9 &&
  pCreateInfo->format == VK_FORMAT_D16_UNORM)))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;



