Re: [Mesa-dev] [PATCH 03/10] mesa: Implement _mesa_array_element by walking enabled arrays.

2019-05-03 Thread Mathias Fröhlich
Hi Brian,

On Friday, 3 May 2019 14:40:26 CEST Brian Paul wrote:
> All your suggested changes look good.
>
> Reviewed-by: Brian Paul 
>
> Thanks.

Pushed Thanks!

best
Mathias


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] egl: add EGL_platform_device support

2019-05-03 Thread Marek Olšák
On Fri, May 3, 2019 at 1:58 AM Mathias Fröhlich 
wrote:

> Good Morning,
>
> On Wednesday, 1 May 2019 21:43:08 CEST Marek Olšák wrote:
> > BTW, swrast doesn't have to exist on the system. It's not uncommon for me
> > to have no swrast on my development system.
>
> Ok. I see. I use swrast regularly to test changes with different backend
> drivers.
> Also especially classic swrast as something that is close to the good old
> swtnl
> drivers - to catch bad interactions with those.
>
> Anyhow, with a very old swrast I think you will get test failures.
> But else if the system swrast is found in the hopefully not so distant
> future
> the tests should even pass - well depends on what Emil now does to get a
> better overall swrast behavior.
> On a production system with a full set of driver packages I do expect to
> find swrast, right? At least on a workstation grade linux distribution.
>
> I start to see the actual problem for AMD there.
> Not your test system at home, but the pro driver that needs to ship
> and QA swrast then.
>
> Anyhow, I do not actually understand the way how we walk all
> installed egl driver implementations - including closed drivers - finally
> and present all those devices. In a perfect world *for the customer*
> I could enumerate all devices - including oss i965 and the closed nvidia
> bumblebee device - on my laptop for example.
>
> Means - if that works fine AMD could hook into that mechanism and
> provide further devices. Well - in the long term.
>

We include libGL and libEGL along with radeonsi in our binary driver
installer. We probably don't include swrast, but I'm not 100% sure.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110603] Blocky and black opacity/alpha using RADV on some games

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110603

Lucas Francesco  changed:

   What|Removed |Added

 CC||lucas.francesc...@gmail.com
Version|unspecified |git

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110603] Blocky and black opacity/alpha using RADV on some games

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110603

Bug ID: 110603
   Summary: Blocky and black opacity/alpha using RADV on some
games
   Product: Mesa
   Version: unspecified
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: minor
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: lucas.francesc...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 144153
  --> https://bugs.freedesktop.org/attachment.cgi?id=144153&action=edit
the bug itself on dota2 with vulkan enabled

I am experiencing a RADV bug with both arch and Gentoo linux, on LLVM7+ (cant
test on LLVM 6)


i can't reproduce it with  Ubuntu 18.10(strangely it works flawlessly there, i
didnt test with 19.10 but i can give it a go if its needed) on the same system
with the same hardware specs, i tried nuking Gentoo and installing arch to see
if the bug was a Gentoo specific one and it wasn't, i reinstalled Gentoo 2
times while testing it (with different use flags) and wasn't able to stop that
to happening

Already tried:
switching LLVM versions
switching to arch
changing compiler flags
changing around with the debug enable flag on mesa
downgrading glibc a bit
downgrading x-server
forcing the game to use wayland directly on SDL (in the case of artifact)


The games I can reproduce are mainly Source 2 ones, but i can reproduce it with
skyrim (dxvk dx11 version on proton) 

i'm putting it as minor severity as no one else that i asked that haves the
same gpu hardware besides me can reproduce the issue


System info:
https://gist.github.com/Uramekus/03308e0cdb776374d7cfa9ceb125bbe7

RenderDoc Capture (quite old at this point, i might make a new one tomorrow)

https://drive.google.com/file/d/1XZ8XMiA-j2eeZJU0vpfUeq85iQxsE67b/view?usp=sharing

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa master #11025 completed

2019-05-03 Thread AppVeyor


Build mesa 11025 completed



Commit d0ea9877b8 by Connor Abbott on 5/2/2019 8:22 PM:

nir/algebraic: Don't emit empty initializers for MSVC\n\nJust don't emit the transform array at all if there are no transforms\n\nv2:\n- Don't use len(array) > 0 (Dylan)\n- Keep using ARRAY_SIZE to make the generated C code easier to read\n(Jason).


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/algebraic: Don't emit empty initializers for MSVC

2019-05-03 Thread Connor Abbott
On Fri, May 3, 2019 at 10:39 PM Jason Ekstrand  wrote:

> On Fri, May 3, 2019 at 3:29 PM Connor Abbott  wrote:
>
>> FWIW, the reason I changed it away was to keep it consistent with the
>> line directly above that uses the length (since the C array won't exist if
>> it's length 0). Does that convince you?
>>
>
> Nope.  First off, if you take Dylan's suggestions (which I think are
> reasonable), it doesn't use the length.  Second, it means that the C code
> will now have an unverifiable random number in it.  Are you sure it's
> really 137?  I dare you to go count them.
>

Ok, I pushed it with your change.

Connor


>
> --Jason
>
>
>> On Fri, May 3, 2019 at 10:26 PM Jason Ekstrand 
>> wrote:
>>
>>> On Thu, May 2, 2019 at 3:51 PM Dylan Baker  wrote:
>>>
 Quoting Connor Abbott (2019-05-02 13:34:07)
 > Just don't emit the transform array at all if there are no transforms
 > for a state, and avoid trying to walk over it.
 > ---
 > Brian, does this build on Windows? I tested it on my shader-db
 > on radeonsi.
 >
 > ---
 >  src/compiler/nir/nir_algebraic.py | 6 +-
 >  1 file changed, 5 insertions(+), 1 deletion(-)
 >
 > diff --git a/src/compiler/nir/nir_algebraic.py
 b/src/compiler/nir/nir_algebraic.py
 > index 6db749e9248..7af80a4f92e 100644
 > --- a/src/compiler/nir/nir_algebraic.py
 > +++ b/src/compiler/nir/nir_algebraic.py
 > @@ -993,11 +993,13 @@ static const uint16_t CONST_STATE = 1;
 >  % endfor
 >
 >  % for state_id, state_xforms in enumerate(automaton.state_patterns):
 > +% if len(state_xforms) > 0: # avoid emitting a 0-length array for
 MSVC

 if not state_xforms:

 Using len() to test container emptiness is an anti-pattern in python,
 is is
 roughly 10x slower  than this.

 >  static const struct transform ${pass_name}_state${state_id}_xforms[]
 = {
 >  % for i in state_xforms:
 >{ ${xforms[i].search.c_ptr(cache)},
 ${xforms[i].replace.c_value_ptr(cache)}, ${xforms[i].condition_index} },
 >  % endfor
 >  };
 > +% endif
 >  % endfor
 >
 >  static const struct per_op_table
 ${pass_name}_table[nir_num_search_ops] = {
 > @@ -1080,7 +1082,8 @@ ${pass_name}_block(nir_builder *build,
 nir_block *block,
 >switch (states[alu->dest.dest.ssa.index]) {
 >  % for i in range(len(automaton.state_patterns)):
 >case ${i}:
 > - for (unsigned i = 0; i <
 ARRAY_SIZE(${pass_name}_state${i}_xforms); i++) {

>>>
>>> I'd rather keep the ARRAY_SIZE unless we've got a really good reason to
>>> drop it.  With that and Dylan's changes,
>>>
>>> Reviewed-by: Jason Ekstrand 
>>>
>>>
 > + % if len(automaton.state_patterns[i]) > 0:

 same here

 Dylan

 > + for (unsigned i = 0; i <
 ${len(automaton.state_patterns[i])}; i++) {
 >  const struct transform *xform =
 &${pass_name}_state${i}_xforms[i];
 >  if (condition_flags[xform->condition_offset] &&
 >  nir_replace_instr(build, alu, xform->search,
 xform->replace)) {
 > @@ -1088,6 +1091,7 @@ ${pass_name}_block(nir_builder *build,
 nir_block *block,
 > break;
 >  }
 >   }
 > + % endif
 >   break;
 >  % endfor
 >default: assert(0);
 > --
 > 2.17.2
 >
 > ___
 > mesa-dev mailing list
 > mesa-dev@lists.freedesktop.org
 > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

>>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110459] Escape from Tarkov on DXVK renders wrong windows reflection unless RADV_DEBUG=nohiz is passed

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110459

faalag...@gmail.com changed:

   What|Removed |Added

 Resolution|--- |NOTOURBUG
 Status|NEW |RESOLVED

--- Comment #6 from faalag...@gmail.com ---
Thank you! I was actually about to test it, but since I was messing with other
stuff I didn't do that earlier, I only got to it today. Updating mesa didn't
help, but it turned out that some DXVK update must have, so I misjudged the
culprit here, and it seemed to be a DXVK bug, so I'm closing it now :). Thank
you again for the help everyone!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/algebraic: Don't emit empty initializers for MSVC

2019-05-03 Thread Jason Ekstrand
On Fri, May 3, 2019 at 3:29 PM Connor Abbott  wrote:

> FWIW, the reason I changed it away was to keep it consistent with the line
> directly above that uses the length (since the C array won't exist if it's
> length 0). Does that convince you?
>

Nope.  First off, if you take Dylan's suggestions (which I think are
reasonable), it doesn't use the length.  Second, it means that the C code
will now have an unverifiable random number in it.  Are you sure it's
really 137?  I dare you to go count them.

--Jason


> On Fri, May 3, 2019 at 10:26 PM Jason Ekstrand 
> wrote:
>
>> On Thu, May 2, 2019 at 3:51 PM Dylan Baker  wrote:
>>
>>> Quoting Connor Abbott (2019-05-02 13:34:07)
>>> > Just don't emit the transform array at all if there are no transforms
>>> > for a state, and avoid trying to walk over it.
>>> > ---
>>> > Brian, does this build on Windows? I tested it on my shader-db
>>> > on radeonsi.
>>> >
>>> > ---
>>> >  src/compiler/nir/nir_algebraic.py | 6 +-
>>> >  1 file changed, 5 insertions(+), 1 deletion(-)
>>> >
>>> > diff --git a/src/compiler/nir/nir_algebraic.py
>>> b/src/compiler/nir/nir_algebraic.py
>>> > index 6db749e9248..7af80a4f92e 100644
>>> > --- a/src/compiler/nir/nir_algebraic.py
>>> > +++ b/src/compiler/nir/nir_algebraic.py
>>> > @@ -993,11 +993,13 @@ static const uint16_t CONST_STATE = 1;
>>> >  % endfor
>>> >
>>> >  % for state_id, state_xforms in enumerate(automaton.state_patterns):
>>> > +% if len(state_xforms) > 0: # avoid emitting a 0-length array for MSVC
>>>
>>> if not state_xforms:
>>>
>>> Using len() to test container emptiness is an anti-pattern in python, is
>>> is
>>> roughly 10x slower  than this.
>>>
>>> >  static const struct transform ${pass_name}_state${state_id}_xforms[]
>>> = {
>>> >  % for i in state_xforms:
>>> >{ ${xforms[i].search.c_ptr(cache)},
>>> ${xforms[i].replace.c_value_ptr(cache)}, ${xforms[i].condition_index} },
>>> >  % endfor
>>> >  };
>>> > +% endif
>>> >  % endfor
>>> >
>>> >  static const struct per_op_table
>>> ${pass_name}_table[nir_num_search_ops] = {
>>> > @@ -1080,7 +1082,8 @@ ${pass_name}_block(nir_builder *build, nir_block
>>> *block,
>>> >switch (states[alu->dest.dest.ssa.index]) {
>>> >  % for i in range(len(automaton.state_patterns)):
>>> >case ${i}:
>>> > - for (unsigned i = 0; i <
>>> ARRAY_SIZE(${pass_name}_state${i}_xforms); i++) {
>>>
>>
>> I'd rather keep the ARRAY_SIZE unless we've got a really good reason to
>> drop it.  With that and Dylan's changes,
>>
>> Reviewed-by: Jason Ekstrand 
>>
>>
>>> > + % if len(automaton.state_patterns[i]) > 0:
>>>
>>> same here
>>>
>>> Dylan
>>>
>>> > + for (unsigned i = 0; i <
>>> ${len(automaton.state_patterns[i])}; i++) {
>>> >  const struct transform *xform =
>>> &${pass_name}_state${i}_xforms[i];
>>> >  if (condition_flags[xform->condition_offset] &&
>>> >  nir_replace_instr(build, alu, xform->search,
>>> xform->replace)) {
>>> > @@ -1088,6 +1091,7 @@ ${pass_name}_block(nir_builder *build, nir_block
>>> *block,
>>> > break;
>>> >  }
>>> >   }
>>> > + % endif
>>> >   break;
>>> >  % endfor
>>> >default: assert(0);
>>> > --
>>> > 2.17.2
>>> >
>>> > ___
>>> > mesa-dev mailing list
>>> > mesa-dev@lists.freedesktop.org
>>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] anv: fix alphaToCoverage when there is no color attachment

2019-05-03 Thread Jason Ekstrand
On Fri, May 3, 2019 at 6:52 AM Iago Toral Quiroga  wrote:

> From: Samuel Iglesias Gonsálvez 
>
> There are tests in CTS for alpha to coverage without a color attachment
> that are failing. This happens because when we remove the shader color
> outputs when we don't have a valid color attachment for them, but when
> alpha to coverage is enabled we still want to preserve the the output
> at location 0 since we need its alpha component for alpha to coverage.
> In that case we will also need to create a null render target for RT 0.
>
> v2:
>   - We already create a null rt when we don't have any, so reuse that
> for this case (Jason)
>   - Simplify the code a bit (Iago)
>
> v3:
>   - Take alpha to coverage from the key and don't tie this to depth-only
> rendering only, we want the same behavior if we have multiple render
> targets but the one at location 0 is not used. (Jason).
>   - Rewrite commit message (Iago)
>
> Fixes the following CTS tests:
> dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> Signed-off-by: Iago Toral Quiroga 
> ---
>  src/intel/vulkan/anv_pipeline.c | 48 +
>  1 file changed, 37 insertions(+), 11 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c
> b/src/intel/vulkan/anv_pipeline.c
> index 20eab548fb2..f379dd2752e 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -818,15 +818,28 @@ anv_pipeline_link_fs(const struct brw_compiler
> *compiler,
> memset(rt_used, 0, sizeof(rt_used));
>
> /* Flag used render targets */
> +   bool needs_null_rt_for_alpha_to_coverage = false;
> nir_foreach_variable_safe(var, &stage->nir->outputs) {
>if (var->data.location < FRAG_RESULT_DATA0)
>   continue;
>
>const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
> -  /* Unused or out-of-bounds */
> -  if (rt >= MAX_RTS || !(stage->key.wm.color_outputs_valid & (1 <<
> rt)))
> +  /* Out-of-bounds */
> +  if (rt >= MAX_RTS)
>   continue;
>
> +  /* Unused */
> +  if (!(stage->key.wm.color_outputs_valid & (1 << rt))) {
>

While we're here, I realized reading this code today that we're only
checking one bit for the color attachment whereas we really should be
comparing against BITFIELD_RANGE(rt, array_size) here.


> + /* If this is the RT at location 0 and we have alpha to coverage
> +  * enabled, we'll have to create a null render target and it must
> +  * be at index 0.
> +  */
> + if (rt == 0 && stage->key.wm.alpha_to_coverage)
> +needs_null_rt_for_alpha_to_coverage = true;
>

Why do we need all this needs_null_rt_for_alpha_to_coverage buisiness?  Why
not just let it fall through and set rt_used to true and then


> +
> + continue;
> +  }
> +
>const unsigned array_len =
>   glsl_type_is_array(var->type) ? glsl_get_length(var->type) : 1;
>assert(rt + array_len <= max_rt);
> @@ -835,7 +848,12 @@ anv_pipeline_link_fs(const struct brw_compiler
> *compiler,
>   rt_used[rt + i] = true;
> }
>
> -   /* Set new, compacted, location */
> +   /* Make sure we leave the first RT slot available for alpha to coverage
> +* if we don't have a valid RT 0.
> +*/
> +   if (needs_null_rt_for_alpha_to_coverage)
> +  num_rts = 1;
> +
> for (unsigned i = 0; i < max_rt; i++) {
>if (!rt_used[i])
>   continue;
>

Down here just do

if (stage->key.wm.color_outputs_valid & (1 << i)) {
   /* Set up a color attachment */
} else {
   /* Set up a null attachment */
}
num_rts++;

This would also fix a bug that I think we have today if you have an array
in the shader that only has some of it's inputs valid.  I think today you'd
end up crashing when we go to bind the non-existent color attachments.


> @@ -857,11 +875,15 @@ anv_pipeline_link_fs(const struct brw_compiler
> *compiler,
>const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
>if (rt >= MAX_RTS ||
>!(stage->key.wm.color_outputs_valid & (1 << rt))) {
> - /* Unused or out-of-bounds, throw it away */
> - deleted_output = true;
> - var->data.mode = nir_var_function_temp;
> - exec_node_remove(&var->node);
> - exec_list_push_tail(&impl->locals, &var->node);
> + /* Unused or out-of-bounds, throw it away, unless it is the first
> +  * RT and we have alpha to coverage.
> +  */
> + if (rt != 0 || !stage->key.wm.alpha_to_coverage) {
> +deleted_output = true;
> +var->data.mode = nir_var_function_temp;
> +exec_node_remove(&var->node);
> +exec_list_push_tail(&impl->locals, &var->node);
> + }
>   continue;
>}
>
> @@ -873,14 +895,18 @@ anv_pipeline_link_fs(const struct brw_compiler
> *compiler,
> if (deleted_output)
>nir_fixup_deref_modes(stage->nir)

Re: [Mesa-dev] [PATCH] nir/algebraic: Don't emit empty initializers for MSVC

2019-05-03 Thread Connor Abbott
FWIW, the reason I changed it away was to keep it consistent with the line
directly above that uses the length (since the C array won't exist if it's
length 0). Does that convince you?

On Fri, May 3, 2019 at 10:26 PM Jason Ekstrand  wrote:

> On Thu, May 2, 2019 at 3:51 PM Dylan Baker  wrote:
>
>> Quoting Connor Abbott (2019-05-02 13:34:07)
>> > Just don't emit the transform array at all if there are no transforms
>> > for a state, and avoid trying to walk over it.
>> > ---
>> > Brian, does this build on Windows? I tested it on my shader-db
>> > on radeonsi.
>> >
>> > ---
>> >  src/compiler/nir/nir_algebraic.py | 6 +-
>> >  1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/compiler/nir/nir_algebraic.py
>> b/src/compiler/nir/nir_algebraic.py
>> > index 6db749e9248..7af80a4f92e 100644
>> > --- a/src/compiler/nir/nir_algebraic.py
>> > +++ b/src/compiler/nir/nir_algebraic.py
>> > @@ -993,11 +993,13 @@ static const uint16_t CONST_STATE = 1;
>> >  % endfor
>> >
>> >  % for state_id, state_xforms in enumerate(automaton.state_patterns):
>> > +% if len(state_xforms) > 0: # avoid emitting a 0-length array for MSVC
>>
>> if not state_xforms:
>>
>> Using len() to test container emptiness is an anti-pattern in python, is
>> is
>> roughly 10x slower  than this.
>>
>> >  static const struct transform ${pass_name}_state${state_id}_xforms[] =
>> {
>> >  % for i in state_xforms:
>> >{ ${xforms[i].search.c_ptr(cache)},
>> ${xforms[i].replace.c_value_ptr(cache)}, ${xforms[i].condition_index} },
>> >  % endfor
>> >  };
>> > +% endif
>> >  % endfor
>> >
>> >  static const struct per_op_table
>> ${pass_name}_table[nir_num_search_ops] = {
>> > @@ -1080,7 +1082,8 @@ ${pass_name}_block(nir_builder *build, nir_block
>> *block,
>> >switch (states[alu->dest.dest.ssa.index]) {
>> >  % for i in range(len(automaton.state_patterns)):
>> >case ${i}:
>> > - for (unsigned i = 0; i <
>> ARRAY_SIZE(${pass_name}_state${i}_xforms); i++) {
>>
>
> I'd rather keep the ARRAY_SIZE unless we've got a really good reason to
> drop it.  With that and Dylan's changes,
>
> Reviewed-by: Jason Ekstrand 
>
>
>> > + % if len(automaton.state_patterns[i]) > 0:
>>
>> same here
>>
>> Dylan
>>
>> > + for (unsigned i = 0; i < ${len(automaton.state_patterns[i])};
>> i++) {
>> >  const struct transform *xform =
>> &${pass_name}_state${i}_xforms[i];
>> >  if (condition_flags[xform->condition_offset] &&
>> >  nir_replace_instr(build, alu, xform->search,
>> xform->replace)) {
>> > @@ -1088,6 +1091,7 @@ ${pass_name}_block(nir_builder *build, nir_block
>> *block,
>> > break;
>> >  }
>> >   }
>> > + % endif
>> >   break;
>> >  % endfor
>> >default: assert(0);
>> > --
>> > 2.17.2
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/algebraic: Don't emit empty initializers for MSVC

2019-05-03 Thread Jason Ekstrand
On Thu, May 2, 2019 at 3:51 PM Dylan Baker  wrote:

> Quoting Connor Abbott (2019-05-02 13:34:07)
> > Just don't emit the transform array at all if there are no transforms
> > for a state, and avoid trying to walk over it.
> > ---
> > Brian, does this build on Windows? I tested it on my shader-db
> > on radeonsi.
> >
> > ---
> >  src/compiler/nir/nir_algebraic.py | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/nir/nir_algebraic.py
> b/src/compiler/nir/nir_algebraic.py
> > index 6db749e9248..7af80a4f92e 100644
> > --- a/src/compiler/nir/nir_algebraic.py
> > +++ b/src/compiler/nir/nir_algebraic.py
> > @@ -993,11 +993,13 @@ static const uint16_t CONST_STATE = 1;
> >  % endfor
> >
> >  % for state_id, state_xforms in enumerate(automaton.state_patterns):
> > +% if len(state_xforms) > 0: # avoid emitting a 0-length array for MSVC
>
> if not state_xforms:
>
> Using len() to test container emptiness is an anti-pattern in python, is is
> roughly 10x slower  than this.
>
> >  static const struct transform ${pass_name}_state${state_id}_xforms[] = {
> >  % for i in state_xforms:
> >{ ${xforms[i].search.c_ptr(cache)},
> ${xforms[i].replace.c_value_ptr(cache)}, ${xforms[i].condition_index} },
> >  % endfor
> >  };
> > +% endif
> >  % endfor
> >
> >  static const struct per_op_table ${pass_name}_table[nir_num_search_ops]
> = {
> > @@ -1080,7 +1082,8 @@ ${pass_name}_block(nir_builder *build, nir_block
> *block,
> >switch (states[alu->dest.dest.ssa.index]) {
> >  % for i in range(len(automaton.state_patterns)):
> >case ${i}:
> > - for (unsigned i = 0; i <
> ARRAY_SIZE(${pass_name}_state${i}_xforms); i++) {
>

I'd rather keep the ARRAY_SIZE unless we've got a really good reason to
drop it.  With that and Dylan's changes,

Reviewed-by: Jason Ekstrand 


> > + % if len(automaton.state_patterns[i]) > 0:
>
> same here
>
> Dylan
>
> > + for (unsigned i = 0; i < ${len(automaton.state_patterns[i])};
> i++) {
> >  const struct transform *xform =
> &${pass_name}_state${i}_xforms[i];
> >  if (condition_flags[xform->condition_offset] &&
> >  nir_replace_instr(build, alu, xform->search,
> xform->replace)) {
> > @@ -1088,6 +1091,7 @@ ${pass_name}_block(nir_builder *build, nir_block
> *block,
> > break;
> >  }
> >   }
> > + % endif
> >   break;
> >  % endfor
> >default: assert(0);
> > --
> > 2.17.2
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/algebraic: Don't emit empty arrays for MSVC

2019-05-03 Thread Brian Paul
Just don't emit the transform array at all if there are no transforms
for a state, and avoid trying to walk over it.

Original patch by Connor Abbott.  Updated with suggestions by
Dylan Baker.
---
 src/compiler/nir/nir_algebraic.py | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_algebraic.py 
b/src/compiler/nir/nir_algebraic.py
index 6db749e..da5e39c 100644
--- a/src/compiler/nir/nir_algebraic.py
+++ b/src/compiler/nir/nir_algebraic.py
@@ -993,11 +993,13 @@ static const uint16_t CONST_STATE = 1;
 % endfor
 
 % for state_id, state_xforms in enumerate(automaton.state_patterns):
+% if state_xforms:  # avoid emitting a 0-length array for MSVC
 static const struct transform ${pass_name}_state${state_id}_xforms[] = {
 % for i in state_xforms:
   { ${xforms[i].search.c_ptr(cache)}, ${xforms[i].replace.c_value_ptr(cache)}, 
${xforms[i].condition_index} },
 % endfor
 };
+% endif
 % endfor
 
 static const struct per_op_table ${pass_name}_table[nir_num_search_ops] = {
@@ -1080,7 +1082,8 @@ ${pass_name}_block(nir_builder *build, nir_block *block,
   switch (states[alu->dest.dest.ssa.index]) {
 % for i in range(len(automaton.state_patterns)):
   case ${i}:
- for (unsigned i = 0; i < ARRAY_SIZE(${pass_name}_state${i}_xforms); 
i++) {
+ % if automaton.state_patterns[i]:
+ for (unsigned i = 0; i < ${len(automaton.state_patterns[i])}; i++) {
 const struct transform *xform = &${pass_name}_state${i}_xforms[i];
 if (condition_flags[xform->condition_offset] &&
 nir_replace_instr(build, alu, xform->search, xform->replace)) {
@@ -1088,6 +1091,7 @@ ${pass_name}_block(nir_builder *build, nir_block *block,
break;
 }
  }
+ % endif
  break;
 % endfor
   default: assert(0);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110573] Mesa vulkan-radeon 19.0.3 system freeze and visual artifacts (RADV)

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110573

--- Comment #16 from ant...@gmx.de ---
Great work, thanks again!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa master #11022 failed

2019-05-03 Thread AppVeyor



Build mesa 11022 failed


Commit a381dbf253 by Chuck Atkins on 5/3/2019 4:06 PM:

meson: Fix missing glproto dependency for gallium-glx\n\nSigned-off-by: Chuck Atkins \nCc: mesa-stable \nReviewed-by: Dylan Baker 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa staging/19.0 #11021 completed

2019-05-03 Thread AppVeyor


Build mesa 11021 completed



Commit a37e0454e1 by Bas Nieuwenhuizen on 5/2/2019 2:03 PM:

radv: Disable VK_EXT_descriptor_indexing.\n\nWe did not implement the required non-uniform indexing features.\n\nThis patch is to disable the extension on 19.0. For 19.1 the\nnecessary functionality has been implemented.\n\nFixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."\nReviewed-by: Samuel Pitoiset 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110573] Mesa vulkan-radeon 19.0.3 system freeze and visual artifacts (RADV)

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110573

Samuel Pitoiset  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #15 from Samuel Pitoiset  ---
Fixed with
https://cgit.freedesktop.org/mesa/mesa/commit/?id=4f18c43d1df64135e8968a7d4fbfd2c9918b76ae

It should be in the 19.0.4 and 19.1 releases.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] panfrost: Questions regarding pan_wallpaper.c (and the 'reload FB content' logic in general)

2019-05-03 Thread Alyssa Rosenzweig
> Actually, I tried several things, so I might have left it flipped at
> some point, but it doesn't work with with src_factor=one and
> dts_factor=zero.

Hum.

> Good question. I haven't dumped the buffers yet. Another thing to note:
> the texture I'm reloading from is using PIPE_FORMAT_B8G8R8X8_UNORM as
> a format, so no alpha component in there. I don't know exactly what
> happens in this case (do we have garbage in the alpha component?) :-/.

That just means the alpha component is implicitly cleared to 1.0 and
never really supposed to be read/written beyond that.

> We definitely bind the new sampler/texture, but I'm not sure we restore
> the old one. This being said, the version I have made using
> util_blitter_blit() was taking care of saving/restoring those, and I
> still had the issue :-/.

I meant the restore bit, but :/

> You mean we should apply the viewport transform on top, right?

No, this is ok, I just wanted to give context on what this is. The
input varying from OpenGL side (with a blitter implementation, for
instance), would be in the range (-1.0, +1.0), but then that gets
viewport transformed. With the default (0, 0)->(w, h) transform, it's
this. I guess for modified viewports this is a little, so we need to fix that
as well, but I don't think that's the issue here, since then blitter
would've been ok.

> Hehe, glad to hear that at least part of this investigation lead to
> something useful :-).

Haha :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] panfrost: Questions regarding pan_wallpaper.c (and the 'reload FB content' logic in general)

2019-05-03 Thread Boris Brezillon
On Fri, 3 May 2019 07:29:22 -0700
Alyssa Rosenzweig  wrote:

> > +else if (!(job->clear & PIPE_CLEAR_COLOR))  
> 
> Make sure this is actually being called when you expect. I don't
> remember if job->clear is being zeroed when we expect (hint: it might
> not be due to a missing job_free routine somewhere, *blush*).
> 
> > +   .rt[0].rgb_func = PIPE_BLEND_ADD,
> > +   .rt[0].rgb_src_factor = PIPE_BLENDFACTOR_ONE,
> > +   .rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_ZERO,
> > +   .rt[0].alpha_func = PIPE_BLEND_ADD,
> > +   .rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ZERO,
> > +   .rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE,  
> 
> Why is alpha flipped?

Actually, I tried several things, so I might have left it flipped at
some point, but it doesn't work with with src_factor=one and
dts_factor=zero.

> Is the black you're seeing really (0, 0, 0, a) or
> might it be (0, 0, 0, 0)?

Good question. I haven't dumped the buffers yet. Another thing to note:
the texture I'm reloading from is using PIPE_FORMAT_B8G8R8X8_UNORM as
a format, so no alpha component in there. I don't know exactly what
happens in this case (do we have garbage in the alpha component?) :-/.


> 
> >  /* Bind texture/sampler. TODO: push/pop */  
> 
> (Was this TODO addressed? It might explain the missing panel)

We definitely bind the new sampler/texture, but I'm not sure we restore
the old one. This being said, the version I have made using
util_blitter_blit() was taking care of saving/restoring those, and I
still had the issue :-/.

> 
> > +0.0, ctx->pipe_framebuffer.height, 0.0, 1.0,
> > +0.0, 0.0, 0.0, 1.0,
> > +ctx->pipe_framebuffer.width, ctx->pipe_framebuffer.height, 
> > 0.0, 1.0,
> > +ctx->pipe_framebuffer.width, 0.0, 0.0, 1.0,  
> 
> Just FWIW, this routine is running a fragment shader _without a vertex
> shader_. In effect, we're running the vertex shader in software and
> writing varyings straight to memory, as if the VS ran. So these values
> are essentially the transformed output of a vertex shader.

You mean we should apply the viewport transform on top, right? But
again, the blitter-based implementation had a full VS -> FS pipeline
with viewport transform applied and it didn't work either.

> 
> > +.format = MALI_RGBA32F,
> > +.swizzle = panfrost_get_default_swizzle(4),
> > +.unknown1 = 0x2,  
> 
> +1
> 
> > +   /* Looks like setting first_tiler_job->dependency_index_2 to point 
> > to
> > +* out 'reload fb content' job is not enough, the link order 
> > matters too.
> > +* Let's insert our job in the first slot so that 
> > panfrost_link_jobs()
> > +* place it before any other tiler jobs.  
> 
> Oh, that's very interesting, I never realized that was the issue (I had
> draw order issues). It's worth more investigation in the future, but +1
> and here's a cookie :P

Hehe, glad to hear that at least part of this investigation lead to
something useful :-).

> 
> --
> 
> Hoping one of these is related, but the diff looks fairly solid

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] etnaviv: fill missing offset in etna_resource_get_handle

2019-05-03 Thread Lucas Stach
Am Freitag, den 03.05.2019, 12:05 +0200 schrieb Philipp Zabel:
> Without this gbm_bo_get_offset() can return 0 where it shouldn't.

Reviewed-by: Lucas Stach 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_resource.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c 
> b/src/gallium/drivers/etnaviv/etnaviv_resource.c
> index 83179d3cd088..ab77a80c72b3 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_resource.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c
> @@ -622,6 +622,7 @@ etna_resource_get_handle(struct pipe_screen *pscreen,
>    rsc = etna_resource(rsc->external);
>  
> handle->stride = rsc->levels[0].stride;
> +   handle->offset = rsc->levels[0].offset;
> handle->modifier = layout_to_modifier(rsc->layout);
>  
> if (handle->type == WINSYS_HANDLE_TYPE_SHARED) {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] panfrost: Questions regarding pan_wallpaper.c (and the 'reload FB content' logic in general)

2019-05-03 Thread Alyssa Rosenzweig
> +else if (!(job->clear & PIPE_CLEAR_COLOR))

Make sure this is actually being called when you expect. I don't
remember if job->clear is being zeroed when we expect (hint: it might
not be due to a missing job_free routine somewhere, *blush*).

> +   .rt[0].rgb_func = PIPE_BLEND_ADD,
> +   .rt[0].rgb_src_factor = PIPE_BLENDFACTOR_ONE,
> +   .rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_ZERO,
> +   .rt[0].alpha_func = PIPE_BLEND_ADD,
> +   .rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ZERO,
> +   .rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE,

Why is alpha flipped? Is the black you're seeing really (0, 0, 0, a) or
might it be (0, 0, 0, 0)?

>  /* Bind texture/sampler. TODO: push/pop */

(Was this TODO addressed? It might explain the missing panel)

> +0.0, ctx->pipe_framebuffer.height, 0.0, 1.0,
> +0.0, 0.0, 0.0, 1.0,
> +ctx->pipe_framebuffer.width, ctx->pipe_framebuffer.height, 
> 0.0, 1.0,
> +ctx->pipe_framebuffer.width, 0.0, 0.0, 1.0,

Just FWIW, this routine is running a fragment shader _without a vertex
shader_. In effect, we're running the vertex shader in software and
writing varyings straight to memory, as if the VS ran. So these values
are essentially the transformed output of a vertex shader.

> +.format = MALI_RGBA32F,
> +.swizzle = panfrost_get_default_swizzle(4),
> +.unknown1 = 0x2,

+1

> +   /* Looks like setting first_tiler_job->dependency_index_2 to point to
> +* out 'reload fb content' job is not enough, the link order matters 
> too.
> +* Let's insert our job in the first slot so that panfrost_link_jobs()
> +* place it before any other tiler jobs.

Oh, that's very interesting, I never realized that was the issue (I had
draw order issues). It's worth more investigation in the future, but +1
and here's a cookie :P

--

Hoping one of these is related, but the diff looks fairly solid
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: apply the indexing workaround for atomic buffer operations on GFX9

2019-05-03 Thread Bas Nieuwenhuizen
On Fri, May 3, 2019 at 11:42 AM Samuel Pitoiset
 wrote:
>
> Because the new raw/struct intrinsics are buggy with LLVM 8
> (they weren't marked as source of divergence), we fallback to the
> old instrinsics for atomic buffer operations. This means we need
> to apply the indexing workaround for GFX9.

Can you make it more clear that we only delayed atomics to LLVM 9 and
not load/store. I was confused on why we needed another variable.

Otherwise r-b
>
> The fact that we need another workaround is painful but we should
> be able to clean up that a bit once LLVM 7 support will be dropped.
>
> This fixes a GPU hang with AC Odyssey and some rendering problems
> with Nioh.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
> Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic 
> intrinsics with LLVM 9+")
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_nir_to_llvm.c   | 12 +++-
>  src/amd/common/ac_shader_abi.h|  1 +
>  src/amd/vulkan/radv_nir_to_llvm.c |  6 ++
>  3 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index c92eaaca31d..151e0d0f961 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -2417,10 +2417,12 @@ static void get_image_coords(struct ac_nir_context 
> *ctx,
>  }
>
>  static LLVMValueRef get_image_buffer_descriptor(struct ac_nir_context *ctx,
> -const nir_intrinsic_instr 
> *instr, bool write)
> +const nir_intrinsic_instr 
> *instr,
> +   bool write, bool atomic)
>  {
> LLVMValueRef rsrc = get_image_descriptor(ctx, instr, AC_DESC_BUFFER, 
> write);
> -   if (ctx->abi->gfx9_stride_size_workaround) {
> +   if (ctx->abi->gfx9_stride_size_workaround ||
> +   (ctx->abi->gfx9_stride_size_workaround_for_atomic && atomic)) {
> LLVMValueRef elem_count = 
> LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, 
> 0), "");
> LLVMValueRef stride = 
> LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 1, 
> 0), "");
> stride = LLVMBuildLShr(ctx->ac.builder, stride, 
> LLVMConstInt(ctx->ac.i32, 16, 0), "");
> @@ -2466,7 +2468,7 @@ static LLVMValueRef visit_image_load(struct 
> ac_nir_context *ctx,
> unsigned num_channels = util_last_bit(mask);
> LLVMValueRef rsrc, vindex;
>
> -   rsrc = get_image_buffer_descriptor(ctx, instr, false);
> +   rsrc = get_image_buffer_descriptor(ctx, instr, false, false);
> vindex = LLVMBuildExtractElement(ctx->ac.builder, 
> get_src(ctx, instr->src[1]),
>  ctx->ac.i32_0, "");
>
> @@ -2520,7 +2522,7 @@ static void visit_image_store(struct ac_nir_context 
> *ctx,
> args.cache_policy = get_cache_policy(ctx, access, true, 
> writeonly_memory);
>
> if (dim == GLSL_SAMPLER_DIM_BUF) {
> -   LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, 
> true);
> +   LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, 
> true, false);
> LLVMValueRef src = ac_to_float(&ctx->ac, get_src(ctx, 
> instr->src[3]));
> unsigned src_channels = ac_get_llvm_num_components(src);
> LLVMValueRef vindex;
> @@ -2632,7 +2634,7 @@ static LLVMValueRef visit_image_atomic(struct 
> ac_nir_context *ctx,
> params[param_count++] = get_src(ctx, instr->src[3]);
>
> if (dim == GLSL_SAMPLER_DIM_BUF) {
> -   params[param_count++] = get_image_buffer_descriptor(ctx, 
> instr, true);
> +   params[param_count++] = get_image_buffer_descriptor(ctx, 
> instr, true, true);
> params[param_count++] = 
> LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[1]),
> 
> ctx->ac.i32_0, ""); /* vindex */
> params[param_count++] = ctx->ac.i32_0; /* voffset */
> diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
> index 108fe58ce57..8debb1ff986 100644
> --- a/src/amd/common/ac_shader_abi.h
> +++ b/src/amd/common/ac_shader_abi.h
> @@ -203,6 +203,7 @@ struct ac_shader_abi {
> /* Whether to workaround GFX9 ignoring the stride for the buffer size 
> if IDXEN=0
> * and LLVM optimizes an indexed load with constant index to IDXEN=0. 
> */
> bool gfx9_stride_size_workaround;
> +   bool gfx9_stride_size_workaround_for_atomic;
>  };
>
>  #endif /* AC_SHADER_ABI_H */
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 796d78e34f4..d83f0bd547f 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -3687,6 +3687,12 @@ LLVMM

Re: [Mesa-dev] [PATCH 03/10] mesa: Implement _mesa_array_element by walking enabled arrays.

2019-05-03 Thread Brian Paul

On 05/02/2019 11:18 PM, Mathias Fröhlich wrote:

Hi Brian,

On Friday, 3 May 2019 00:17:51 CEST Brian Paul wrote:

On 05/02/2019 03:27 AM, mathias.froehl...@gmx.net wrote:

From: Mathias Fröhlich 

In glArrayElement, use the bitmask trick to just walk the enabled
vao arrays. This should be about equivalent in execution time to
walk the prepare aelt_context list. Finally this will allow us to
reduce the _mesa_update_state calls in a few patches.

Signed-off-by: Mathias Fröhlich 
---
   src/mesa/main/api_arrayelt.c | 78 
   1 file changed, 61 insertions(+), 17 deletions(-)

diff --git a/src/mesa/main/api_arrayelt.c b/src/mesa/main/api_arrayelt.c
index d46c8d14b68..62f1e73ca4c 100644
--- a/src/mesa/main/api_arrayelt.c
+++ b/src/mesa/main/api_arrayelt.c
@@ -1541,32 +1541,76 @@ _ae_update_state(struct gl_context *ctx)
   }


+static inline attrib_func
+func_nv(const struct gl_vertex_format *vformat)
+{
+   return AttribFuncsNV[vformat->Normalized][vformat->Size-1]
+  [TYPE_IDX(vformat->Type)];
+}
+
+
+static inline attrib_func
+func_arb(const struct gl_vertex_format *vformat)
+{
+   return AttribFuncsARB[NORM_IDX(vformat)][vformat->Size-1]
+  [TYPE_IDX(vformat->Type)];
+}
+
+
+static inline const void *
+attrib_src(const struct gl_vertex_array_object *vao,
+   const struct gl_array_attributes *array, GLint elt)
+{
+   const struct gl_vertex_buffer_binding *binding =
+  &vao->BufferBinding[array->BufferBindingIndex];
+   const GLubyte *src
+  = ADD_POINTERS(binding->BufferObj->Mappings[MAP_INTERNAL].Pointer,
+ _mesa_vertex_attrib_address(array, binding))
+  + elt * binding->Stride;
+   return src;
+}


Could you add some brief comments on those functions to explain what
they do?


Added brief comments:

/*
  * Return VertexAttrib*NV function pointer matching the provided vertex format.
  */
static inline attrib_func
func_nv(const struct gl_vertex_format *vformat)

[...]

/*
  * Return VertexAttrib*ARB function pointer matching the provided vertex 
format.
  */
static inline attrib_func
func_arb(const struct gl_vertex_format *vformat)

[...]

/*
  * Return the address of the array attribute array at elt in the
  * vertex array object vao.
  */
static inline const void *
attrib_src(const struct gl_vertex_array_object *vao,
const struct gl_array_attributes *array, GLint elt)



Otherwise, for the rest of the series,
Reviewed-by: Brian Paul 

Nice work!!


Thanks for looking at the patches!


All your suggested changes look good.

Reviewed-by: Brian Paul 

Thanks.

-Brian



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3] anv: fix alphaToCoverage when there is no color attachment

2019-05-03 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez 

There are tests in CTS for alpha to coverage without a color attachment
that are failing. This happens because when we remove the shader color
outputs when we don't have a valid color attachment for them, but when
alpha to coverage is enabled we still want to preserve the the output
at location 0 since we need its alpha component for alpha to coverage.
In that case we will also need to create a null render target for RT 0.

v2:
  - We already create a null rt when we don't have any, so reuse that
for this case (Jason)
  - Simplify the code a bit (Iago)

v3:
  - Take alpha to coverage from the key and don't tie this to depth-only
rendering only, we want the same behavior if we have multiple render
targets but the one at location 0 is not used. (Jason).
  - Rewrite commit message (Iago)

Fixes the following CTS tests:
dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*

Signed-off-by: Samuel Iglesias Gonsálvez 
Signed-off-by: Iago Toral Quiroga 
---
 src/intel/vulkan/anv_pipeline.c | 48 +
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 20eab548fb2..f379dd2752e 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -818,15 +818,28 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
memset(rt_used, 0, sizeof(rt_used));
 
/* Flag used render targets */
+   bool needs_null_rt_for_alpha_to_coverage = false;
nir_foreach_variable_safe(var, &stage->nir->outputs) {
   if (var->data.location < FRAG_RESULT_DATA0)
  continue;
 
   const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
-  /* Unused or out-of-bounds */
-  if (rt >= MAX_RTS || !(stage->key.wm.color_outputs_valid & (1 << rt)))
+  /* Out-of-bounds */
+  if (rt >= MAX_RTS)
  continue;
 
+  /* Unused */
+  if (!(stage->key.wm.color_outputs_valid & (1 << rt))) {
+ /* If this is the RT at location 0 and we have alpha to coverage
+  * enabled, we'll have to create a null render target and it must
+  * be at index 0.
+  */
+ if (rt == 0 && stage->key.wm.alpha_to_coverage)
+needs_null_rt_for_alpha_to_coverage = true;
+
+ continue;
+  }
+
   const unsigned array_len =
  glsl_type_is_array(var->type) ? glsl_get_length(var->type) : 1;
   assert(rt + array_len <= max_rt);
@@ -835,7 +848,12 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
  rt_used[rt + i] = true;
}
 
-   /* Set new, compacted, location */
+   /* Make sure we leave the first RT slot available for alpha to coverage
+* if we don't have a valid RT 0.
+*/
+   if (needs_null_rt_for_alpha_to_coverage)
+  num_rts = 1;
+
for (unsigned i = 0; i < max_rt; i++) {
   if (!rt_used[i])
  continue;
@@ -857,11 +875,15 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
   const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
   if (rt >= MAX_RTS ||
   !(stage->key.wm.color_outputs_valid & (1 << rt))) {
- /* Unused or out-of-bounds, throw it away */
- deleted_output = true;
- var->data.mode = nir_var_function_temp;
- exec_node_remove(&var->node);
- exec_list_push_tail(&impl->locals, &var->node);
+ /* Unused or out-of-bounds, throw it away, unless it is the first
+  * RT and we have alpha to coverage.
+  */
+ if (rt != 0 || !stage->key.wm.alpha_to_coverage) {
+deleted_output = true;
+var->data.mode = nir_var_function_temp;
+exec_node_remove(&var->node);
+exec_list_push_tail(&impl->locals, &var->node);
+ }
  continue;
   }
 
@@ -873,14 +895,18 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
if (deleted_output)
   nir_fixup_deref_modes(stage->nir);
 
-   if (num_rts == 0) {
-  /* If we have no render targets, we need a null render target */
+   /* If we have no render targets or we need to create one for alpha to
+* coverage, we need a null render target.
+*/
+   if (num_rts == 0 || needs_null_rt_for_alpha_to_coverage) {
   rt_bindings[0] = (struct anv_pipeline_binding) {
  .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
  .binding = 0,
  .index = UINT32_MAX,
   };
-  num_rts = 1;
+
+  if (num_rts == 0)
+ num_rts = 1;
}
 
/* Now that we've determined the actual number of render targets, adjust
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/10] radeonsi: don't declare pointers to static strings

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The compiler should be able to optimize them away, but still. There's
no point in declaring those as pointers, and if the compiler *doesn't*
optimize them away, they add unnecessary load-time relocations.
---
 src/gallium/drivers/radeonsi/si_shader.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 71c85eb79a5..c457ca12b9a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -30,24 +30,24 @@
 
 #include "ac_exp_param.h"
 #include "ac_shader_util.h"
 #include "ac_llvm_util.h"
 #include "si_shader_internal.h"
 #include "si_pipe.h"
 #include "sid.h"
 
 #include "compiler/nir/nir.h"
 
-static const char *scratch_rsrc_dword0_symbol =
+static const char scratch_rsrc_dword0_symbol[] =
"SCRATCH_RSRC_DWORD0";
 
-static const char *scratch_rsrc_dword1_symbol =
+static const char scratch_rsrc_dword1_symbol[] =
"SCRATCH_RSRC_DWORD1";
 
 struct si_shader_output_values
 {
LLVMValueRef values[4];
unsigned semantic_name;
unsigned semantic_index;
ubyte vertex_stream[4];
 };
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/10] amd/common: clarify ac_shader_binary::lds_size

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/amd/common/ac_binary.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_binary.h b/src/amd/common/ac_binary.h
index febc4da7fed..8f594a9ce75 100644
--- a/src/amd/common/ac_binary.h
+++ b/src/amd/common/ac_binary.h
@@ -68,21 +68,21 @@ struct ac_shader_binary {
/** Disassembled shader in a string. */
char *disasm_string;
char *llvm_ir_string;
 };
 
 struct ac_shader_config {
unsigned num_sgprs;
unsigned num_vgprs;
unsigned spilled_sgprs;
unsigned spilled_vgprs;
-   unsigned lds_size;
+   unsigned lds_size; /* in HW allocation units; i.e 256 bytes on SI, 512 
bytes on CI+ */
unsigned spi_ps_input_ena;
unsigned spi_ps_input_addr;
unsigned float_mode;
unsigned scratch_bytes_per_wave;
 };
 
 /*
  * Parse the elf binary stored in \p elf_data and create a
  * ac_shader_binary object.
  */
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/10] amd/common: extract ac_parse_shader_binary_config

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/amd/common/ac_binary.c | 77 +-
 src/amd/common/ac_binary.h |  4 ++
 2 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
index fabeb15a204..44251886b5f 100644
--- a/src/amd/common/ac_binary.c
+++ b/src/amd/common/ac_binary.c
@@ -199,57 +199,30 @@ const unsigned char *ac_shader_binary_config_start(
unsigned i;
for (i = 0; i < binary->global_symbol_count; ++i) {
if (binary->global_symbol_offsets[i] == symbol_offset) {
unsigned offset = i * binary->config_size_per_symbol;
return binary->config + offset;
}
}
return binary->config;
 }
 
-
-static const char *scratch_rsrc_dword0_symbol =
-   "SCRATCH_RSRC_DWORD0";
-
-static const char *scratch_rsrc_dword1_symbol =
-   "SCRATCH_RSRC_DWORD1";
-
-void ac_shader_binary_read_config(struct ac_shader_binary *binary,
- struct ac_shader_config *conf,
- unsigned symbol_offset,
- bool supports_spill)
+/* Parse configuration data in .AMDGPU.config section format. */
+void ac_parse_shader_binary_config(const char *data, size_t nbytes,
+  bool really_needs_scratch,
+  struct ac_shader_config *conf)
 {
-   unsigned i;
-   const unsigned char *config =
-   ac_shader_binary_config_start(binary, symbol_offset);
-   bool really_needs_scratch = false;
uint32_t wavesize = 0;
-   /* LLVM adds SGPR spills to the scratch size.
-* Find out if we really need the scratch buffer.
-*/
-   if (supports_spill) {
-   really_needs_scratch = true;
-   } else {
-   for (i = 0; i < binary->reloc_count; i++) {
-   const struct ac_shader_reloc *reloc = 
&binary->relocs[i];
 
-   if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) ||
-   !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) {
-   really_needs_scratch = true;
-   break;
-   }
-   }
-   }
-
-   for (i = 0; i < binary->config_size_per_symbol; i+= 8) {
-   unsigned reg = util_le32_to_cpu(*(uint32_t*)(config + i));
-   unsigned value = util_le32_to_cpu(*(uint32_t*)(config + i + 4));
+   for (size_t i = 0; i < nbytes; i += 8) {
+   unsigned reg = util_le32_to_cpu(*(uint32_t*)(data + i));
+   unsigned value = util_le32_to_cpu(*(uint32_t*)(data + i + 4));
switch (reg) {
case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
case R_00B848_COMPUTE_PGM_RSRC1:
case R_00B428_SPI_SHADER_PGM_RSRC1_HS:
conf->num_sgprs = MAX2(conf->num_sgprs, 
(G_00B028_SGPRS(value) + 1) * 8);
conf->num_vgprs = MAX2(conf->num_vgprs, 
(G_00B028_VGPRS(value) + 1) * 4);
conf->float_mode =  G_00B028_FLOAT_MODE(value);
break;
@@ -292,20 +265,56 @@ void ac_shader_binary_read_config(struct ac_shader_binary 
*binary,
if (!conf->spi_ps_input_addr)
conf->spi_ps_input_addr = conf->spi_ps_input_ena;
}
 
if (really_needs_scratch) {
/* sgprs spills aren't spilling */
conf->scratch_bytes_per_wave = G_00B860_WAVESIZE(wavesize) * 
256 * 4;
}
 }
 
+static const char *scratch_rsrc_dword0_symbol =
+   "SCRATCH_RSRC_DWORD0";
+
+static const char *scratch_rsrc_dword1_symbol =
+   "SCRATCH_RSRC_DWORD1";
+
+void ac_shader_binary_read_config(struct ac_shader_binary *binary,
+ struct ac_shader_config *conf,
+ unsigned symbol_offset,
+ bool supports_spill)
+{
+   unsigned i;
+   const char *config =
+   (const char *)ac_shader_binary_config_start(binary, 
symbol_offset);
+   bool really_needs_scratch = false;
+   /* LLVM adds SGPR spills to the scratch size.
+* Find out if we really need the scratch buffer.
+*/
+   if (supports_spill) {
+   really_needs_scratch = true;
+   } else {
+   for (i = 0; i < binary->reloc_count; i++) {
+   const struct ac_shader_reloc *reloc = 
&binary->relocs[i];
+
+   if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) ||
+   !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) {
+   really_needs_scratch = true;
+   break;
+  

[Mesa-dev] [PATCH 06/10] radeonsi: return bool from si_shader_binary_upload

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

We didn't really use error codes anyway.
---
 src/gallium/drivers/radeonsi/si_compute.c |  6 +++---
 src/gallium/drivers/radeonsi/si_shader.c  | 21 +--
 src/gallium/drivers/radeonsi/si_shader.h  |  2 +-
 .../drivers/radeonsi/si_state_shaders.c   |  6 ++
 4 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 2acd96545aa..2899ee146d4 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -137,21 +137,21 @@ static void si_create_compute_state_async(void *job, int 
thread_index)
mtx_lock(&sscreen->shader_cache_mutex);
 
if (ir_binary &&
si_shader_cache_load_shader(sscreen, ir_binary, shader)) {
mtx_unlock(&sscreen->shader_cache_mutex);
 
si_shader_dump_stats_for_shader_db(shader, debug);
si_shader_dump(sscreen, shader, debug, PIPE_SHADER_COMPUTE,
   stderr, true);
 
-   if (si_shader_binary_upload(sscreen, shader))
+   if (!si_shader_binary_upload(sscreen, shader))
program->shader.compilation_failed = true;
} else {
mtx_unlock(&sscreen->shader_cache_mutex);
 
if (!si_shader_create(sscreen, compiler, &program->shader, 
debug)) {
program->shader.compilation_failed = true;
 
if (program->ir_type == PIPE_SHADER_IR_TGSI)
FREE(program->ir.tgsi);
program->shader.selector = NULL;
@@ -246,21 +246,21 @@ static void *si_create_compute_state(
program->shader.binary.reloc_count);
FREE(program);
return NULL;
}
} else {
ac_shader_binary_read_config(&program->shader.binary,
 &program->shader.config, 0, false);
}
si_shader_dump(sctx->screen, &program->shader, &sctx->debug,
   PIPE_SHADER_COMPUTE, stderr, true);
-   if (si_shader_binary_upload(sctx->screen, &program->shader) < 
0) {
+   if (!si_shader_binary_upload(sctx->screen, &program->shader)) {
fprintf(stderr, "LLVM failed to upload shader\n");
FREE(program);
return NULL;
}
}
 
return program;
 }
 
 static void si_bind_compute_state(struct pipe_context *ctx, void *state)
@@ -388,21 +388,21 @@ static bool si_setup_compute_scratch_buffer(struct 
si_context *sctx,
 
if (!sctx->compute_scratch_buffer)
return false;
}
 
if (sctx->compute_scratch_buffer != shader->scratch_bo && 
scratch_needed) {
uint64_t scratch_va = sctx->compute_scratch_buffer->gpu_address;
 
si_shader_apply_scratch_relocs(shader, scratch_va);
 
-   if (si_shader_binary_upload(sctx->screen, shader))
+   if (!si_shader_binary_upload(sctx->screen, shader))
return false;
 
si_resource_reference(&shader->scratch_bo,
sctx->compute_scratch_buffer);
}
 
return true;
 }
 
 static bool si_switch_compute_shader(struct si_context *sctx,
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 4d08ab88f4a..71c85eb79a5 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5005,21 +5005,21 @@ static unsigned si_get_shader_binary_size(const struct 
si_shader *shader)
size += shader->prolog->binary.code_size;
if (shader->previous_stage)
size += shader->previous_stage->binary.code_size;
if (shader->prolog2)
size += shader->prolog2->binary.code_size;
if (shader->epilog)
size += shader->epilog->binary.code_size;
return size + DEBUGGER_NUM_MARKERS * 4;
 }
 
-int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader)
+bool si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader)
 {
const struct ac_shader_binary *prolog =
shader->prolog ? &shader->prolog->binary : NULL;
const struct ac_shader_binary *previous_stage =
shader->previous_stage ? &shader->previous_stage->binary : NULL;
const struct ac_shader_binary *prolog2 =
shader->prolog2 ? &shader->prolog2->binary : NULL;
const struct ac_shader_binary *epilog =
shader->epilog ? &shader->epilog->binary : NULL;
const struct ac_shader_binary *mainb = &shader->b

[Mesa-dev] [PATCH 03/10] amd/common: add a more powerful runtime linker

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Using an explicit linker instead of just concatenating .text
sections will allow us to start using .rodata sections and
explicit descriptions of data on LDS that is shared between
stages.
---
 src/amd/Makefile.sources   |   2 +
 src/amd/common/ac_binary.h |   2 +
 src/amd/common/ac_rtld.c   | 556 +
 src/amd/common/ac_rtld.h   |  87 ++
 src/amd/common/meson.build |   2 +
 5 files changed, 649 insertions(+)
 create mode 100644 src/amd/common/ac_rtld.c
 create mode 100644 src/amd/common/ac_rtld.h

diff --git a/src/amd/Makefile.sources b/src/amd/Makefile.sources
index 58e0008ee62..122fa306eb1 100644
--- a/src/amd/Makefile.sources
+++ b/src/amd/Makefile.sources
@@ -35,20 +35,22 @@ ADDRLIB_FILES = \
 
 AMD_COMPILER_FILES = \
common/ac_binary.c \
common/ac_binary.h \
common/ac_exp_param.h \
common/ac_llvm_build.c \
common/ac_llvm_build.h \
common/ac_llvm_helper.cpp \
common/ac_llvm_util.c \
common/ac_llvm_util.h \
+   common/ac_rtld.c \
+   common/ac_rtld.h \
common/ac_shader_abi.h \
common/ac_shader_util.c \
common/ac_shader_util.h
 
 
 AMD_NIR_FILES = \
common/ac_nir_to_llvm.c \
common/ac_nir_to_llvm.h
 
 AMD_COMMON_FILES = \
diff --git a/src/amd/common/ac_binary.h b/src/amd/common/ac_binary.h
index 8f594a9ce75..b91ecb4317b 100644
--- a/src/amd/common/ac_binary.h
+++ b/src/amd/common/ac_binary.h
@@ -73,20 +73,22 @@ struct ac_shader_binary {
 struct ac_shader_config {
unsigned num_sgprs;
unsigned num_vgprs;
unsigned spilled_sgprs;
unsigned spilled_vgprs;
unsigned lds_size; /* in HW allocation units; i.e 256 bytes on SI, 512 
bytes on CI+ */
unsigned spi_ps_input_ena;
unsigned spi_ps_input_addr;
unsigned float_mode;
unsigned scratch_bytes_per_wave;
+   unsigned rsrc1;
+   unsigned rsrc2;
 };
 
 /*
  * Parse the elf binary stored in \p elf_data and create a
  * ac_shader_binary object.
  */
 bool ac_elf_read(const char *elf_data, unsigned elf_size,
 struct ac_shader_binary *binary);
 
 /**
diff --git a/src/amd/common/ac_rtld.c b/src/amd/common/ac_rtld.c
new file mode 100644
index 000..a79447904f3
--- /dev/null
+++ b/src/amd/common/ac_rtld.c
@@ -0,0 +1,556 @@
+/*
+ * Copyright 2014-2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
THE
+ * SOFTWARE.
+ */
+
+#include "ac_rtld.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ac_binary.h"
+#include "util/u_math.h"
+
+// Old distributions may not have this enum constant
+#define MY_EM_AMDGPU 224
+
+#ifndef R_AMDGPU_NONE
+#define R_AMDGPU_NONE 0
+#define R_AMDGPU_ABS32_LO 1
+#define R_AMDGPU_ABS32_HI 2
+#define R_AMDGPU_ABS64 3
+#define R_AMDGPU_REL32 4
+#define R_AMDGPU_REL64 5
+#define R_AMDGPU_ABS32 6
+#define R_AMDGPU_GOTPCREL 7
+#define R_AMDGPU_GOTPCREL32_LO 8
+#define R_AMDGPU_GOTPCREL32_HI 9
+#define R_AMDGPU_REL32_LO 10
+#define R_AMDGPU_REL32_HI 11
+#define R_AMDGPU_RELATIVE64 13
+#endif
+
+/* For the UMR disassembler. */
+#define DEBUGGER_END_OF_CODE_MARKER0xbf9f /* invalid instruction */
+#define DEBUGGER_NUM_MARKERS   5
+
+struct ac_rtld_section {
+   bool is_rx : 1;
+   bool is_pasted_text : 1;
+   uint64_t offset;
+   const char *name;
+};
+
+struct ac_rtld_part {
+   Elf *elf;
+   struct ac_rtld_section *sections;
+   unsigned num_sections;
+};
+
+static void report_erroraf(const char *fmt, va_list va)
+{
+   char *msg;
+   int ret = asprintf(&msg, fmt, va);
+   if (ret < 0)
+   msg = "(asprintf failed)";
+
+   fprintf(stderr, "ac_rtld error: %s\n", msg);
+
+   if (ret >= 0)
+   free(msg);
+}
+
+static void report_errorf(const char *fmt, ...) PRINTFLIKE(1, 2);
+
+static void report_err

[Mesa-dev] [PATCH 08/10] amd/common: add ac_compile_module_to_elf

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

A new variant of ac_compile_module_to_binary that allows us to
keep the entire ELF around.
---
 src/amd/common/ac_llvm_helper.cpp | 88 ---
 src/amd/common/ac_llvm_util.h |  2 +
 2 files changed, 83 insertions(+), 7 deletions(-)

diff --git a/src/amd/common/ac_llvm_helper.cpp 
b/src/amd/common/ac_llvm_helper.cpp
index dcfb8008546..834c5d7f94e 100644
--- a/src/amd/common/ac_llvm_helper.cpp
+++ b/src/amd/common/ac_llvm_helper.cpp
@@ -22,23 +22,27 @@
  * of the Software.
  *
  */
 
 /* based on Marek's patch to lp_bld_misc.cpp */
 
 // Workaround http://llvm.org/PR23628
 #pragma push_macro("DEBUG")
 #undef DEBUG
 
+#include 
+
 #include "ac_binary.h"
 #include "ac_llvm_util.h"
 
+#include "util/macros.h"
+
 #include 
 #include 
 #include 
 #include 
 #include 
 
 #include 
 
 void ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
 {
@@ -102,28 +106,90 @@ ac_create_target_library_info(const char *triple)
 {
return reinterpret_cast(new 
llvm::TargetLibraryInfoImpl(llvm::Triple(triple)));
 }
 
 void
 ac_dispose_target_library_info(LLVMTargetLibraryInfoRef library_info)
 {
delete reinterpret_cast(library_info);
 }
 
+/* Implementation of raw_pwrite_stream that works on malloc()ed memory for
+ * better compatibility with C code. */
+struct raw_memory_ostream : public llvm::raw_pwrite_stream {
+   char *buffer;
+   size_t written;
+   size_t bufsize;
+
+   raw_memory_ostream()
+   {
+   buffer = NULL;
+   written = 0;
+   bufsize = 0;
+   SetUnbuffered();
+   }
+
+   ~raw_memory_ostream()
+   {
+   free(buffer);
+   }
+
+   void clear()
+   {
+   written = 0;
+   }
+
+   void take(char *&out_buffer, size_t &out_size)
+   {
+   out_buffer = buffer;
+   out_size = written;
+   buffer = NULL;
+   written = 0;
+   bufsize = 0;
+   }
+
+   void flush() = delete;
+
+   void write_impl(const char *ptr, size_t size) override
+   {
+   if (unlikely(written + size < written))
+   abort();
+   if (written + size > bufsize) {
+   bufsize = MAX3(1024, written + size, bufsize / 3 * 4);
+   buffer = (char *)realloc(buffer, bufsize);
+   if (!buffer) {
+   fprintf(stderr, "amd: out of memory allocating 
ELF buffer\n");
+   abort();
+   }
+   }
+   memcpy(buffer + written, ptr, size);
+   written += size;
+   }
+
+   void pwrite_impl(const char *ptr, size_t size, uint64_t offset) override
+   {
+   assert(offset == (size_t)offset &&
+  offset + size >= offset && offset + size <= written);
+   memcpy(buffer + offset, ptr, size);
+   }
+
+   uint64_t current_pos() const override
+   {
+   return written;
+   }
+};
+
 /* The LLVM compiler is represented as a pass manager containing passes for
  * optimizations, instruction selection, and code generation.
  */
 struct ac_compiler_passes {
-   ac_compiler_passes(): ostream(code_string) {}
-
-   llvm::SmallString<0> code_string;  /* ELF shader binary */
-   llvm::raw_svector_ostream ostream; /* stream for appending data to the 
binary */
+   raw_memory_ostream ostream; /* ELF shader binary stream */
llvm::legacy::PassManager passmgr; /* list of passes */
 };
 
 struct ac_compiler_passes *ac_create_llvm_passes(LLVMTargetMachineRef tm)
 {
struct ac_compiler_passes *p = new ac_compiler_passes();
if (!p)
return NULL;
 
llvm::TargetMachine *TM = reinterpret_cast(tm);
@@ -142,28 +208,36 @@ void ac_destroy_llvm_passes(struct ac_compiler_passes *p)
 {
delete p;
 }
 
 /* This returns false on failure. */
 bool ac_compile_module_to_binary(struct ac_compiler_passes *p, LLVMModuleRef 
module,
 struct ac_shader_binary *binary)
 {
p->passmgr.run(*llvm::unwrap(module));
 
-   llvm::StringRef data = p->ostream.str();
-   bool success = ac_elf_read(data.data(), data.size(), binary);
-   p->code_string = ""; /* release the ELF shader binary */
+   bool success = ac_elf_read(p->ostream.buffer, p->ostream.written, 
binary);
+   p->ostream.clear();
 
if (!success)
fprintf(stderr, "amd: cannot read an ELF shader binary\n");
return success;
 }
 
+/* This returns false on failure. */
+bool ac_compile_module_to_elf(struct ac_compiler_passes *p, LLVMModuleRef 
module,
+ char **pelf_buffer, size_t *pelf_size)
+{
+   p->passmgr.run(*llvm::unwrap(module));
+   p->ostream.take(*pelf_buffer, *pelf_size);
+   return true;
+}
+
 void ac_llvm_add

[Mesa-dev] [PATCH 00/10] amd,radeonsi: add a real runtime linker

2019-05-03 Thread Nicolai Hähnle
these patches change the way we load shaders, initially for radeonsi
but ideally radv would adopt the same approach.

Basically, instead of hard-coding that we have a single .text section
in the ELF generated by LLVM, we align ourselves more with the ELF
standard and actually look at all the sections in the file(s), lay
them out in memory, and resolve relocations between them.

There is still hard-coding of ".text" sections for the purpose of
gfx9+ merged shaders.

The immediate consequence is that we will be able to emit .rodata
in LLVM and emit absolute or relative relocations that will be
resolved when shaders are uploaded to the GPU.

As a next step, I want us to explicitly record LDS symbol in the ELF
symbol table and have ac_rtld lay out and resolve those symbols at
load time. This will allow us to use LDS both for communication
between shader parts and for temporary variables used within each
part.

Please review!

Thanks,
Nicolai



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/10] radeonsi: dump shader binary buffer contents

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Help identify bugs related to corruption of shaders in memory,
or errors in shader upload / rtld.
---
 src/gallium/drivers/radeonsi/si_debug.c| 18 ++
 .../drivers/radeonsi/si_debug_options.h|  1 +
 2 files changed, 19 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index 9a4494a98fe..c40dcd0b5d6 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -98,20 +98,38 @@ void si_destroy_saved_cs(struct si_saved_cs *scs)
 }
 
 static void si_dump_shader(struct si_screen *sscreen,
   enum pipe_shader_type processor,
   const struct si_shader *shader, FILE *f)
 {
if (shader->shader_log)
fwrite(shader->shader_log, shader->shader_log_size, 1, f);
else
si_shader_dump(sscreen, shader, NULL, processor, f, false);
+
+   if (shader->bo && sscreen->options.dump_shader_binary) {
+   unsigned size = shader->bo->b.b.width0;
+   fprintf(f, "BO: VA=%"PRIx64" Size=%u\n", 
shader->bo->gpu_address, size);
+
+   const char *mapped = sscreen->ws->buffer_map(shader->bo->buf, 
NULL,
+  
PIPE_TRANSFER_UNSYNCHRONIZED |
+  PIPE_TRANSFER_READ |
+  
RADEON_TRANSFER_TEMPORARY);
+
+   for (unsigned i = 0; i < size; i += 4) {
+   fprintf(f, " %4x: %08x\n", i, *(uint32_t*)(mapped + i));
+   }
+
+   sscreen->ws->buffer_unmap(shader->bo->buf);
+
+   fprintf(f, "\n");
+   }
 }
 
 struct si_log_chunk_shader {
/* The shader destroy code assumes a current context for unlinking of
 * PM4 packets etc.
 *
 * While we should be able to destroy shaders without a context, doing
 * so would happen only very rarely and be therefore likely to fail
 * just when you're trying to debug something. Let's just remember the
 * current context in the chunk.
diff --git a/src/gallium/drivers/radeonsi/si_debug_options.h 
b/src/gallium/drivers/radeonsi/si_debug_options.h
index 0bde7910fc6..db642366ca6 100644
--- a/src/gallium/drivers/radeonsi/si_debug_options.h
+++ b/src/gallium/drivers/radeonsi/si_debug_options.h
@@ -1,7 +1,8 @@
 OPT_BOOL(clear_db_cache_before_clear, false, "Clear DB cache before fast depth 
clear")
 OPT_BOOL(enable_nir, false, "Enable NIR")
 OPT_BOOL(aux_debug, false, "Generate ddebug_dumps for the auxiliary context")
 OPT_BOOL(sync_compile, false, "Always compile synchronously (will cause 
stalls)")
+OPT_BOOL(dump_shader_binary, false, "Dump shader binary as part of 
ddebug_dumps")
 OPT_BOOL(vs_fetch_always_opencode, false, "Always open code vertex fetches 
(less efficient, purely for testing)")
 
 #undef OPT_BOOL
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/10] radeonsi: use the new run-time linker for shaders

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_compute.c |  63 ++--
 src/gallium/drivers/radeonsi/si_debug.c   |  74 +++--
 src/gallium/drivers/radeonsi/si_pipe.c|   2 +-
 src/gallium/drivers/radeonsi/si_pipe.h|   2 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 291 +-
 src/gallium/drivers/radeonsi/si_shader.h  |  19 +-
 .../drivers/radeonsi/si_shader_internal.h |   3 +-
 .../drivers/radeonsi/si_shader_tgsi_setup.c   |  14 +-
 .../drivers/radeonsi/si_state_shaders.c   |  39 +--
 9 files changed, 270 insertions(+), 237 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 2899ee146d4..e4ef138db33 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -21,20 +21,21 @@
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE.
  *
  */
 
 #include "tgsi/tgsi_parse.h"
 #include "util/u_async_debug.h"
 #include "util/u_memory.h"
 #include "util/u_upload_mgr.h"
 
+#include "ac_rtld.h"
 #include "amd_kernel_code_t.h"
 #include "si_build_pm4.h"
 #include "si_compute.h"
 
 #define COMPUTE_DBG(sscreen, fmt, args...) \
do { \
if ((sscreen->debug_flags & DBG(COMPUTE))) fprintf(stderr, fmt, 
##args); \
} while (0);
 
 struct dispatch_packet {
@@ -54,22 +55,40 @@ struct dispatch_packet {
uint64_t reserved2;
 };
 
 static const amd_kernel_code_t *si_compute_get_code_object(
const struct si_compute *program,
uint64_t symbol_offset)
 {
if (!program->use_code_object_v2) {
return NULL;
}
-   return (const amd_kernel_code_t*)
-   (program->shader.binary.code + symbol_offset);
+
+   struct ac_rtld_binary rtld;
+   if (!ac_rtld_open(&rtld, 1, &program->shader.binary.elf_buffer,
+ &program->shader.binary.elf_size))
+   return NULL;
+
+   const amd_kernel_code_t *result = NULL;
+   const char *text;
+   size_t size;
+   if (!ac_rtld_get_section_by_name(&rtld, ".text", &text, &size))
+   goto out;
+
+   if (symbol_offset + sizeof(amd_kernel_code_t) > size)
+   goto out;
+
+   result = (const amd_kernel_code_t*)(text + symbol_offset);
+
+out:
+   ac_rtld_close(&rtld);
+   return result;
 }
 
 static void code_object_to_config(const amd_kernel_code_t *code_object,
  struct ac_shader_config *out_config) {
 
uint32_t rsrc1 = code_object->compute_pgm_resource_registers;
uint32_t rsrc2 = code_object->compute_pgm_resource_registers >> 32;
out_config->num_sgprs = code_object->wavefront_sgpr_count;
out_config->num_vgprs = code_object->workitem_vgpr_count;
out_config->float_mode = G_00B028_FLOAT_MODE(rsrc1);
@@ -137,21 +156,21 @@ static void si_create_compute_state_async(void *job, int 
thread_index)
mtx_lock(&sscreen->shader_cache_mutex);
 
if (ir_binary &&
si_shader_cache_load_shader(sscreen, ir_binary, shader)) {
mtx_unlock(&sscreen->shader_cache_mutex);
 
si_shader_dump_stats_for_shader_db(shader, debug);
si_shader_dump(sscreen, shader, debug, PIPE_SHADER_COMPUTE,
   stderr, true);
 
-   if (!si_shader_binary_upload(sscreen, shader))
+   if (!si_shader_binary_upload(sscreen, shader, 0))
program->shader.compilation_failed = true;
} else {
mtx_unlock(&sscreen->shader_cache_mutex);
 
if (!si_shader_create(sscreen, compiler, &program->shader, 
debug)) {
program->shader.compilation_failed = true;
 
if (program->ir_type == PIPE_SHADER_IR_TGSI)
FREE(program->ir.tgsi);
program->shader.selector = NULL;
@@ -229,39 +248,37 @@ static void *si_create_compute_state(
si_schedule_initial_compile(sctx, PIPE_SHADER_COMPUTE,
&program->ready,
&program->compiler_ctx_state,
program, 
si_create_compute_state_async);
} else {
const struct pipe_llvm_program_header *header;
const char *code;
header = cso->prog;
code = cso->prog + sizeof(struct pipe_llvm_program_header);
 
-   ac_elf_read(code, header->num_bytes, &program->shader.binary);
-   if (program->use_code_object_v2) {
-   const amd_kernel_code_t *code_object =
-   si_compute_get_code_object(program, 0);
-   code_object_to_config(code_object, 
&program->shader.config);
-   i

[Mesa-dev] [PATCH 04/10] radeonsi: use ac_shader_config

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/amd/common/ac_binary.c|   2 +
 src/gallium/drivers/radeonsi/si_compute.c |  14 +--
 src/gallium/drivers/radeonsi/si_shader.c  | 112 +++---
 src/gallium/drivers/radeonsi/si_shader.h  |  25 +
 4 files changed, 27 insertions(+), 126 deletions(-)

diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
index 44251886b5f..d0ca55e0e0d 100644
--- a/src/amd/common/ac_binary.c
+++ b/src/amd/common/ac_binary.c
@@ -218,26 +218,28 @@ void ac_parse_shader_binary_config(const char *data, 
size_t nbytes,
unsigned value = util_le32_to_cpu(*(uint32_t*)(data + i + 4));
switch (reg) {
case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
case R_00B848_COMPUTE_PGM_RSRC1:
case R_00B428_SPI_SHADER_PGM_RSRC1_HS:
conf->num_sgprs = MAX2(conf->num_sgprs, 
(G_00B028_SGPRS(value) + 1) * 8);
conf->num_vgprs = MAX2(conf->num_vgprs, 
(G_00B028_VGPRS(value) + 1) * 4);
conf->float_mode =  G_00B028_FLOAT_MODE(value);
+   conf->rsrc1 = value;
break;
case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
conf->lds_size = MAX2(conf->lds_size, 
G_00B02C_EXTRA_LDS_SIZE(value));
break;
case R_00B84C_COMPUTE_PGM_RSRC2:
conf->lds_size = MAX2(conf->lds_size, 
G_00B84C_LDS_SIZE(value));
+   conf->rsrc2 = value;
break;
case R_0286CC_SPI_PS_INPUT_ENA:
conf->spi_ps_input_ena = value;
break;
case R_0286D0_SPI_PS_INPUT_ADDR:
conf->spi_ps_input_addr = value;
break;
case R_0286E8_SPI_TMPRING_SIZE:
case R_00B860_COMPUTE_TMPRING_SIZE:
/* WAVESIZE is in units of 256 dwords. */
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 541d7e6f118..02d7bac406a 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -59,21 +59,21 @@ static const amd_kernel_code_t *si_compute_get_code_object(
uint64_t symbol_offset)
 {
if (!program->use_code_object_v2) {
return NULL;
}
return (const amd_kernel_code_t*)
(program->shader.binary.code + symbol_offset);
 }
 
 static void code_object_to_config(const amd_kernel_code_t *code_object,
- struct si_shader_config *out_config) {
+ struct ac_shader_config *out_config) {
 
uint32_t rsrc1 = code_object->compute_pgm_resource_registers;
uint32_t rsrc2 = code_object->compute_pgm_resource_registers >> 32;
out_config->num_sgprs = code_object->wavefront_sgpr_count;
out_config->num_vgprs = code_object->workitem_vgpr_count;
out_config->float_mode = G_00B028_FLOAT_MODE(rsrc1);
out_config->rsrc1 = rsrc1;
out_config->lds_size = MAX2(out_config->lds_size, 
G_00B84C_LDS_SIZE(rsrc2));
out_config->rsrc2 = rsrc2;
out_config->scratch_bytes_per_wave =
@@ -241,22 +241,22 @@ static void *si_create_compute_state(
const amd_kernel_code_t *code_object =
si_compute_get_code_object(program, 0);
code_object_to_config(code_object, 
&program->shader.config);
if (program->shader.binary.reloc_count != 0) {
fprintf(stderr, "Error: %d unsupported 
relocations\n",
program->shader.binary.reloc_count);
FREE(program);
return NULL;
}
} else {
-   si_shader_binary_read_config(&program->shader.binary,
-&program->shader.config, 0);
+   ac_shader_binary_read_config(&program->shader.binary,
+&program->shader.config, 0, false);
}
si_shader_dump(sctx->screen, &program->shader, &sctx->debug,
   PIPE_SHADER_COMPUTE, stderr, true);
if (si_shader_binary_upload(sctx->screen, &program->shader) < 
0) {
fprintf(stderr, "LLVM failed to upload shader\n");
FREE(program);
return NULL;
}
}
 
@@ -362,21 +362,21 @@ static void si_initialize_compute(struct si_context *sctx)
  bc_va >> 8);
}
}
 
   

[Mesa-dev] [PATCH 05/10] radeonsi: let si_shader_create return a boolean

2019-05-03 Thread Nicolai Hähnle
From: Nicolai Hähnle 

We didn't really use error codes anyway.
---
 src/gallium/drivers/radeonsi/si_compute.c  |  2 +-
 src/gallium/drivers/radeonsi/si_shader.c   | 18 +-
 src/gallium/drivers/radeonsi/si_shader.h   |  2 +-
 .../drivers/radeonsi/si_state_shaders.c|  8 +++-
 4 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 02d7bac406a..2acd96545aa 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -142,21 +142,21 @@ static void si_create_compute_state_async(void *job, int 
thread_index)
 
si_shader_dump_stats_for_shader_db(shader, debug);
si_shader_dump(sscreen, shader, debug, PIPE_SHADER_COMPUTE,
   stderr, true);
 
if (si_shader_binary_upload(sscreen, shader))
program->shader.compilation_failed = true;
} else {
mtx_unlock(&sscreen->shader_cache_mutex);
 
-   if (si_shader_create(sscreen, compiler, &program->shader, 
debug)) {
+   if (!si_shader_create(sscreen, compiler, &program->shader, 
debug)) {
program->shader.compilation_failed = true;
 
if (program->ir_type == PIPE_SHADER_IR_TGSI)
FREE(program->ir.tgsi);
program->shader.selector = NULL;
return;
}
 
bool scratch_enabled = shader->config.scratch_bytes_per_wave > 
0;
unsigned user_sgprs = SI_NUM_RESOURCE_SGPRS +
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index da43447013d..4d08ab88f4a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -7769,94 +7769,94 @@ static void si_fix_resource_usage(struct si_screen 
*sscreen,
 
shader->config.num_sgprs = MAX2(shader->config.num_sgprs, min_sgprs);
 
if (shader->selector->type == PIPE_SHADER_COMPUTE &&
si_get_max_workgroup_size(shader) > 64) {
si_multiwave_lds_size_workaround(sscreen,
 &shader->config.lds_size);
}
 }
 
-int si_shader_create(struct si_screen *sscreen, struct ac_llvm_compiler 
*compiler,
+bool si_shader_create(struct si_screen *sscreen, struct ac_llvm_compiler 
*compiler,
 struct si_shader *shader,
 struct pipe_debug_callback *debug)
 {
struct si_shader_selector *sel = shader->selector;
struct si_shader *mainp = *si_get_main_shader_part(sel, &shader->key);
int r;
 
/* LS, ES, VS are compiled on demand if the main part hasn't been
 * compiled for that stage.
 *
 * Vertex shaders are compiled on demand when a vertex fetch
 * workaround must be applied.
 */
if (shader->is_monolithic) {
/* Monolithic shader (compiled as a whole, has many variants,
 * may take a long time to compile).
 */
r = si_compile_tgsi_shader(sscreen, compiler, shader, debug);
if (r)
-   return r;
+   return false;
} else {
/* The shader consists of several parts:
 *
 * - the middle part is the user shader, it has 1 variant only
 *   and it was compiled during the creation of the shader
 *   selector
 * - the prolog part is inserted at the beginning
 * - the epilog part is inserted at the end
 *
 * The prolog and epilog have many (but simple) variants.
 *
 * Starting with gfx9, geometry and tessellation control
 * shaders also contain the prolog and user shader parts of
 * the previous shader stage.
 */
 
if (!mainp)
-   return -1;
+   return false;
 
/* Copy the compiled TGSI shader data over. */
shader->is_binary_shared = true;
shader->binary = mainp->binary;
shader->config = mainp->config;
shader->info.num_input_sgprs = mainp->info.num_input_sgprs;
shader->info.num_input_vgprs = mainp->info.num_input_vgprs;
shader->info.face_vgpr_index = mainp->info.face_vgpr_index;
shader->info.ancillary_vgpr_index = 
mainp->info.ancillary_vgpr_index;
memcpy(shader->info.vs_output_param_offset,
   mainp->info.vs_output_param_offset,
   sizeof(mainp->info.vs_output_param_offset));
shader->info.uses_instanceid 

Re: [Mesa-dev] [PATCH 3/3] radeonsi: overhaul the vertex fetch fixup mechanism

2019-05-03 Thread Haehnle, Nicolai
On 03.05.19 12:36, Nicolai Hähnle wrote:
> On 25.04.19 13:18, Nicolai Hähnle wrote:
>> @@ -4618,21 +4648,27 @@ static void si_bind_vertex_elements(struct 
>> pipe_context *ctx, void *state)
>>   struct si_vertex_elements *old = sctx->vertex_elements;
>>   struct si_vertex_elements *v = (struct si_vertex_elements*)state;
>>   sctx->vertex_elements = v;
>>   sctx->vertex_buffers_dirty = true;
>>   if (v &&
>>   (!old ||
>>    old->count != v->count ||
>>    old->uses_instance_divisors != v->uses_instance_divisors ||
>> - v->uses_instance_divisors || /* we don't check which 
>> divisors changed */
>> + /* we don't check which divisors changed */
>> + v->uses_instance_divisors ||
>> + /* fix_fetch_{always,opencode,unaligned} and 
>> hw_load_is_dword are
>> +  * functions of fix_fetch and the src_offset alignment.
>> +  * If they change and fix_fetch doesn't, it must be due to 
>> different
>> +  * src_offset alignment, which is reflected in 
>> fix_fetch_opencode. */
>> + old->fix_fetch_opencode != v->fix_fetch_opencode ||
>>    memcmp(old->fix_fetch, v->fix_fetch, 
>> sizeof(v->fix_fetch[0]) * v->count)))
> 
> The following condition got dropped in a late cleanup that I was doing:
> 
>> (old->vb_alignment_check_mask ^ v->vb_alignment_check_mask) & 
>> sctx->vertex_buffer_unaligned ||

... and also this:

>((v->vb_alignment_check_mask & sctx->vertex_buffer_unaligned) &&
> memcmp(old->vertex_buffer_index, v->vertex_buffer_index,
>sizeof(v->vertex_buffer_index[0]) * v->count)) ||

Cheers,
Nicolai


> 
> I've fixed that locally.
> 
> Cheers,
> Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] panfrost: Questions regarding pan_wallpaper.c (and the 'reload FB content' logic in general)

2019-05-03 Thread Boris Brezillon
Hello Alyssa,

This week I've tried to make the wallpapering logic to work on panfrost
but I'm not quite there yet.

I've done all my tests with weston that tries to use buffer_age and
update only the parts that have changed (which I know is inefficient,
but before implementing partial_update() I need to have the 'reload
FB content into tile buf' stuff working).

Looks like with the below diff applied, weston background is completely
replaced by a black color and the panel is missing.

Interestingly, if I update the texcoord/pos varyings to duplicate the
FB to half the output height it seems to draw something valid on the
screen, but in any case, everytime I open a terminal window the output
is all messed up (meaning that the 'reload' logic does not work properly).

I know you're working other topics right and don't necessarily have the
time to have a deeper look into that code, but maybe you have some
ideas on what could be wrong in my code/approach or have some tricks
to debug this kind of issues.

Also had a look at the reload logic in the lima driver and the driver
seems to do something similar. Quiang, Vasily, any ideas?

Thanks in advance for your help,

Boris

NOTE: I also tried with an u_blitter based implementation and it produces
the same effects.

--->8---
diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index c50c546a3995..46d82b2e19c2 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1314,6 +1314,8 @@ panfrost_submit_frame(struct panfrost_context *ctx, bool 
flush_immediate,
 /* Workaround a bizarre lockup (a hardware errata?) */
 if (!has_draws)
 flush_immediate = true;
+else if (!(job->clear & PIPE_CLEAR_COLOR))
+panfrost_draw_wallpaper(gallium);
 
 /* A number of jobs are batched -- this must be linked and cleared */
 panfrost_link_jobs(ctx);
diff --git a/src/gallium/drivers/panfrost/pan_wallpaper.c 
b/src/gallium/drivers/panfrost/pan_wallpaper.c
index ac77ad089bc1..96a4f87723e0 100644
--- a/src/gallium/drivers/panfrost/pan_wallpaper.c
+++ b/src/gallium/drivers/panfrost/pan_wallpaper.c
@@ -100,7 +100,9 @@ panfrost_create_wallpaper_program(struct pipe_context *pctx)
 }
 
 static struct panfrost_shader_variants *wallpaper_program = NULL;
+static struct panfrost_blend_state *wallpaper_blend = NULL;
 static struct panfrost_shader_variants *wallpaper_saved_program = NULL;
+static struct panfrost_blend_state *wallpaper_saved_blend = NULL;
 
 static void
 panfrost_enable_wallpaper_program(struct pipe_context *pctx)
@@ -108,14 +110,28 @@ panfrost_enable_wallpaper_program(struct pipe_context 
*pctx)
 struct panfrost_context *ctx = pan_context(pctx);
 
 if (!wallpaper_program) {
+   struct pipe_blend_state bs = {
+   .rt[0].blend_enable = 1,
+   .rt[0].rgb_func = PIPE_BLEND_ADD,
+   .rt[0].rgb_src_factor = PIPE_BLENDFACTOR_ONE,
+   .rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_ZERO,
+   .rt[0].alpha_func = PIPE_BLEND_ADD,
+   .rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ZERO,
+   .rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE,
+   .rt[0].colormask = PIPE_MASK_R | PIPE_MASK_G | 
PIPE_MASK_B | PIPE_MASK_A,
+   };
+
 wallpaper_program = panfrost_create_wallpaper_program(pctx);
+   wallpaper_blend = pctx->create_blend_state(pctx, &bs);
 }
 
 /* Push the shader state */
 wallpaper_saved_program = ctx->fs;
+wallpaper_saved_blend = ctx->blend;
 
 /* Bind the program */
 pctx->bind_fs_state(pctx, wallpaper_program);
+pctx->bind_blend_state(pctx, wallpaper_blend);
 }
 
 static void
@@ -123,6 +139,7 @@ panfrost_disable_wallpaper_program(struct pipe_context 
*pctx)
 {
 /* Pop off the shader state */
 pctx->bind_fs_state(pctx, wallpaper_saved_program);
+pctx->bind_blend_state(pctx, wallpaper_saved_blend);
 }
 
 /* Essentially, we insert a fullscreen textured quad, reading from the
@@ -133,12 +150,7 @@ panfrost_draw_wallpaper(struct pipe_context *pipe)
 {
 /* Disable wallpapering for now, but still exercise the shader 
generation to minimise bit rot */
 
-panfrost_enable_wallpaper_program(pipe);
-panfrost_disable_wallpaper_program(pipe);
-
-return;
-
-#if 0
+#if 1
 struct panfrost_context *ctx = pan_context(pipe);
 
 /* Setup payload for elided quad. TODO: Refactor draw_vbo so this can
@@ -180,12 +192,12 @@ panfrost_draw_wallpaper(struct pipe_context *pipe)
 .normalized_coords = 1
 };
 
-struct pipe_resource *rsrc = pan_screen(pipe->screen)->display_target;
+struct pipe_resource *rsrc = 
&pan_screen(pipe->screen)->display_target->base;
 struct

Re: [Mesa-dev] [PATCH 0/3] radeonsi: handle unaligned vertex buffers in hardware

2019-05-03 Thread Samuel Pitoiset


On 5/3/19 12:39 PM, Nicolai Hähnle wrote:

On 30.04.19 21:20, Marek Olšák wrote:

Why can we not use tbuffer loads?


tbuffer_load_format has the exact same limitations as 
buffer_load_format. They both use the same hardware path, the only 
difference is that tbuffer_load_format gets the format information 
from the instruction, while buffer_load_format gets it from the 
resource descriptor.


Therefore, in all cases where we *can* use tbuffer_load_format, we may 
as well use buffer_load_format (because we can just initialize the 
descriptor for that vertex input / vertex element correctly).


The benefit that tbuffer_load_format could potentially give us in the 
future is that when multiple vertex elements reference the same vertex 
buffer, we could put a single buffer descriptor into the descriptor 
table (or into USER_SGPRs) instead of having one buffer descriptor for 
every element.
Yes, I did that change for RADV recently, and the SGPRs decrease was 
nice. :-)


Cheers,
Nicolai




Marek

On Thu, Apr 25, 2019 at 7:18 AM Nicolai Hähnle > wrote:


    Hi all,

    the following patches contain code to implement all vertex fetches
    using plain, non-format loads plus explicit shader arithmetic for
    format conversion.

    This allows us to remove the software workaround for unaligned 
vertex

    buffers on SI, because we can just load individual bytes on the GPU.
    CI+ will still use short/dword loads even in the unaligned case.

    The format conversion code was tested by running with
    radeonsi_vs_fetch_always_opencode=true on both Verde and Vega.

    Please review!

    Thanks,
    Nicolai
    --
  src/amd/common/ac_llvm_build.c               | 313 
+

  src/amd/common/ac_llvm_build.h               |  30 ++
  .../drivers/radeonsi/si_debug_options.h      |   1 +
  src/gallium/drivers/radeonsi/si_get.c        |   2 +-
  src/gallium/drivers/radeonsi/si_pipe.h       |   1 +
  src/gallium/drivers/radeonsi/si_shader.c     | 249 +
  src/gallium/drivers/radeonsi/si_shader.h     |  46 +--
  src/gallium/drivers/radeonsi/si_state.c      | 233 +++-
  src/gallium/drivers/radeonsi/si_state.h      |  19 +
  .../drivers/radeonsi/si_state_shaders.c      |  37 +-
  10 files changed, 645 insertions(+), 286 deletions(-)


    ___
    mesa-dev mailing list
    mesa-dev@lists.freedesktop.org 


    https://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/3] radeonsi: handle unaligned vertex buffers in hardware

2019-05-03 Thread Nicolai Hähnle

On 30.04.19 21:20, Marek Olšák wrote:

Why can we not use tbuffer loads?


tbuffer_load_format has the exact same limitations as 
buffer_load_format. They both use the same hardware path, the only 
difference is that tbuffer_load_format gets the format information from 
the instruction, while buffer_load_format gets it from the resource 
descriptor.


Therefore, in all cases where we *can* use tbuffer_load_format, we may 
as well use buffer_load_format (because we can just initialize the 
descriptor for that vertex input / vertex element correctly).


The benefit that tbuffer_load_format could potentially give us in the 
future is that when multiple vertex elements reference the same vertex 
buffer, we could put a single buffer descriptor into the descriptor 
table (or into USER_SGPRs) instead of having one buffer descriptor for 
every element.


Cheers,
Nicolai




Marek

On Thu, Apr 25, 2019 at 7:18 AM Nicolai Hähnle > wrote:


Hi all,

the following patches contain code to implement all vertex fetches
using plain, non-format loads plus explicit shader arithmetic for
format conversion.

This allows us to remove the software workaround for unaligned vertex
buffers on SI, because we can just load individual bytes on the GPU.
CI+ will still use short/dword loads even in the unaligned case.

The format conversion code was tested by running with
radeonsi_vs_fetch_always_opencode=true on both Verde and Vega.

Please review!

Thanks,
Nicolai
--
  src/amd/common/ac_llvm_build.c               | 313 +
  src/amd/common/ac_llvm_build.h               |  30 ++
  .../drivers/radeonsi/si_debug_options.h      |   1 +
  src/gallium/drivers/radeonsi/si_get.c        |   2 +-
  src/gallium/drivers/radeonsi/si_pipe.h       |   1 +
  src/gallium/drivers/radeonsi/si_shader.c     | 249 +
  src/gallium/drivers/radeonsi/si_shader.h     |  46 +--
  src/gallium/drivers/radeonsi/si_state.c      | 233 +++-
  src/gallium/drivers/radeonsi/si_state.h      |  19 +
  .../drivers/radeonsi/si_state_shaders.c      |  37 +-
  10 files changed, 645 insertions(+), 286 deletions(-)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] radeonsi: overhaul the vertex fetch fixup mechanism

2019-05-03 Thread Haehnle, Nicolai
On 25.04.19 13:18, Nicolai Hähnle wrote:
> @@ -4618,21 +4648,27 @@ static void si_bind_vertex_elements(struct 
> pipe_context *ctx, void *state)
>   struct si_vertex_elements *old = sctx->vertex_elements;
>   struct si_vertex_elements *v = (struct si_vertex_elements*)state;
>   
>   sctx->vertex_elements = v;
>   sctx->vertex_buffers_dirty = true;
>   
>   if (v &&
>   (!old ||
>old->count != v->count ||
>old->uses_instance_divisors != v->uses_instance_divisors ||
> -  v->uses_instance_divisors || /* we don't check which divisors 
> changed */
> +  /* we don't check which divisors changed */
> +  v->uses_instance_divisors ||
> +  /* fix_fetch_{always,opencode,unaligned} and hw_load_is_dword are
> +   * functions of fix_fetch and the src_offset alignment.
> +   * If they change and fix_fetch doesn't, it must be due to 
> different
> +   * src_offset alignment, which is reflected in fix_fetch_opencode. 
> */
> +  old->fix_fetch_opencode != v->fix_fetch_opencode ||
>memcmp(old->fix_fetch, v->fix_fetch, sizeof(v->fix_fetch[0]) * 
> v->count)))

The following condition got dropped in a late cleanup that I was doing:

> (old->vb_alignment_check_mask ^ v->vb_alignment_check_mask) & 
> sctx->vertex_buffer_unaligned ||

I've fixed that locally.

Cheers,
Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110573] Mesa vulkan-radeon 19.0.3 system freeze and visual artifacts (RADV)

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110573

--- Comment #14 from ant...@gmx.de ---
Yes that seems to fix the issue!
No more visual artifacts or system hangs. Thanks a lot for the help!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] etnaviv: fill missing offset in etna_resource_get_handle

2019-05-03 Thread Philipp Zabel
Without this gbm_bo_get_offset() can return 0 where it shouldn't.
---
 src/gallium/drivers/etnaviv/etnaviv_resource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c 
b/src/gallium/drivers/etnaviv/etnaviv_resource.c
index 83179d3cd088..ab77a80c72b3 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_resource.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c
@@ -622,6 +622,7 @@ etna_resource_get_handle(struct pipe_screen *pscreen,
   rsc = etna_resource(rsc->external);
 
handle->stride = rsc->levels[0].stride;
+   handle->offset = rsc->levels[0].offset;
handle->modifier = layout_to_modifier(rsc->layout);
 
if (handle->type == WINSYS_HANDLE_TYPE_SHARED) {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110540] [AMD TAHITI XT] valve artifact broken

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110540

--- Comment #14 from Sylvain BERTRAND  ---
Don't know why I got the wrong one in my clipboard:
http://bugs.freedesktop.org/show_bug.cgi?id=109550

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: apply the indexing workaround for atomic buffer operations on GFX9

2019-05-03 Thread Samuel Pitoiset
Because the new raw/struct intrinsics are buggy with LLVM 8
(they weren't marked as source of divergence), we fallback to the
old instrinsics for atomic buffer operations. This means we need
to apply the indexing workaround for GFX9.

The fact that we need another workaround is painful but we should
be able to clean up that a bit once LLVM 7 support will be dropped.

This fixes a GPU hang with AC Odyssey and some rendering problems
with Nioh.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic 
intrinsics with LLVM 9+")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c   | 12 +++-
 src/amd/common/ac_shader_abi.h|  1 +
 src/amd/vulkan/radv_nir_to_llvm.c |  6 ++
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index c92eaaca31d..151e0d0f961 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2417,10 +2417,12 @@ static void get_image_coords(struct ac_nir_context *ctx,
 }
 
 static LLVMValueRef get_image_buffer_descriptor(struct ac_nir_context *ctx,
-const nir_intrinsic_instr 
*instr, bool write)
+const nir_intrinsic_instr 
*instr,
+   bool write, bool atomic)
 {
LLVMValueRef rsrc = get_image_descriptor(ctx, instr, AC_DESC_BUFFER, 
write);
-   if (ctx->abi->gfx9_stride_size_workaround) {
+   if (ctx->abi->gfx9_stride_size_workaround ||
+   (ctx->abi->gfx9_stride_size_workaround_for_atomic && atomic)) {
LLVMValueRef elem_count = 
LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, 0), 
"");
LLVMValueRef stride = LLVMBuildExtractElement(ctx->ac.builder, 
rsrc, LLVMConstInt(ctx->ac.i32, 1, 0), "");
stride = LLVMBuildLShr(ctx->ac.builder, stride, 
LLVMConstInt(ctx->ac.i32, 16, 0), "");
@@ -2466,7 +2468,7 @@ static LLVMValueRef visit_image_load(struct 
ac_nir_context *ctx,
unsigned num_channels = util_last_bit(mask);
LLVMValueRef rsrc, vindex;
 
-   rsrc = get_image_buffer_descriptor(ctx, instr, false);
+   rsrc = get_image_buffer_descriptor(ctx, instr, false, false);
vindex = LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, 
instr->src[1]),
 ctx->ac.i32_0, "");
 
@@ -2520,7 +2522,7 @@ static void visit_image_store(struct ac_nir_context *ctx,
args.cache_policy = get_cache_policy(ctx, access, true, 
writeonly_memory);
 
if (dim == GLSL_SAMPLER_DIM_BUF) {
-   LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, 
true);
+   LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, 
true, false);
LLVMValueRef src = ac_to_float(&ctx->ac, get_src(ctx, 
instr->src[3]));
unsigned src_channels = ac_get_llvm_num_components(src);
LLVMValueRef vindex;
@@ -2632,7 +2634,7 @@ static LLVMValueRef visit_image_atomic(struct 
ac_nir_context *ctx,
params[param_count++] = get_src(ctx, instr->src[3]);
 
if (dim == GLSL_SAMPLER_DIM_BUF) {
-   params[param_count++] = get_image_buffer_descriptor(ctx, instr, 
true);
+   params[param_count++] = get_image_buffer_descriptor(ctx, instr, 
true, true);
params[param_count++] = 
LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[1]),
ctx->ac.i32_0, 
""); /* vindex */
params[param_count++] = ctx->ac.i32_0; /* voffset */
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index 108fe58ce57..8debb1ff986 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -203,6 +203,7 @@ struct ac_shader_abi {
/* Whether to workaround GFX9 ignoring the stride for the buffer size 
if IDXEN=0
* and LLVM optimizes an indexed load with constant index to IDXEN=0. */
bool gfx9_stride_size_workaround;
+   bool gfx9_stride_size_workaround_for_atomic;
 };
 
 #endif /* AC_SHADER_ABI_H */
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 796d78e34f4..d83f0bd547f 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -3687,6 +3687,12 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
ctx.abi.clamp_shadow_reference = false;
ctx.abi.gfx9_stride_size_workaround = ctx.ac.chip_class == GFX9 && 
HAVE_LLVM < 0x800;
 
+   /* Because the new raw/struct atomic intrinsics are buggy with LLVM 8,
+* we fallback to the old intrinsics for atomic buffer image operations
+* and thus we need to apply the 

[Mesa-dev] [Bug 110573] Mesa vulkan-radeon 19.0.3 system freeze and visual artifacts (RADV)

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110573

--- Comment #13 from Samuel Pitoiset  ---
Can you try https://patchwork.freedesktop.org/series/60252/ ? That should fix
the regression.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110573] Mesa vulkan-radeon 19.0.3 system freeze and visual artifacts (RADV)

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110573

--- Comment #12 from Samuel Pitoiset  ---
I think I found the problem with a different game, I'm working on.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] anv: fix alphaToCoverage when there is no color attachment

2019-05-03 Thread Iago Toral
On Thu, 2019-05-02 at 09:03 -0500, Jason Ekstrand wrote:
> On Thu, May 2, 2019 at 5:46 AM Iago Toral Quiroga 
> wrote:
> > From: Samuel Iglesias Gonsálvez 
> > 
> > 
> > 
> > There tests in CTS for for alpha to coverage without a color
> > attachment.
> > 
> > First the test draws a primitive with alpha 0 and a subpass with
> > only
> > 
> > a depth buffer. No writes to a depth buffer are expected. Then a
> > 
> > second draw with a color buffer and the same depth buffer is done
> > to
> > 
> > verify the depth buffer still has the original clear values.
> > 
> > 
> > 
> > This behavior is not explicitly forbidden by the Vulkan spec, so
> > 
> > it seems it is allowed.
> > 
> > 
> > 
> > When there is no color attachment for a given output, we discard it
> > 
> > so at the end we have an FS assembly like:
> > 
> > 
> > 
> > Native code for unnamed fragment shader (null)
> > 
> > SIMD16 shader: 1 instructions. 0 loops. 4 cycles. 0:0 spills:fills.
> > 
> > Promoted 0 constants. Compacted 16 to 16 bytes (0%)
> > 
> >   START B0 (4 cycles)
> > 
> > sendc(16)   null<1>UW   g120<0,1,0>F0x90031000
> > 
> > 
> > 
> > render MsgDesc: RT write SIMD16 LastRT Surface = 0 mlen 8 rlen 0 {
> > 
> > align1 1H EOT };
> > 
> > 
> > 
> > As g120 is not initialized, we see random writes to the depth
> > buffer
> > 
> > due to the alphaToCoverage enablement. This patch fixes that by
> > 
> > keeping the output in that case.
> > 
> > 
> > 
> > v2:
> > 
> >  - No need to create a null render target, the driver is already
> > 
> >doing that (Jason)
> > 
> >  - Simplified code a bit (Iago)
> > 
> > 
> > 
> > Fixes the following CTS tests:
> > 
> > dEQP-
> > VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*
> > 
> > 
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > 
> > Signed-off-by: Iago Toral Quiroga 
> > 
> > ---
> > 
> >  src/intel/vulkan/anv_pipeline.c | 25 ++---
> > 
> >  1 file changed, 18 insertions(+), 7 deletions(-)
> > 
> > 
> > 
> > diff --git a/src/intel/vulkan/anv_pipeline.c
> > b/src/intel/vulkan/anv_pipeline.c
> > 
> > index b9c9bfd7598..07f1a939e43 100644
> > 
> > --- a/src/intel/vulkan/anv_pipeline.c
> > 
> > +++ b/src/intel/vulkan/anv_pipeline.c
> > 
> > @@ -808,7 +808,9 @@ anv_pipeline_compile_gs(const struct
> > brw_compiler *compiler,
> > 
> > 
> > 
> >  static void
> > 
> >  anv_pipeline_link_fs(const struct brw_compiler *compiler,
> > 
> > - struct anv_pipeline_stage *stage)
> > 
> > + struct anv_pipeline_stage *stage,
> > 
> > + bool has_depth_stencil_att,
> > 
> > + bool has_alpha_to_coverage)
> > 
> >  {
> > 
> > unsigned num_rts = 0;
> > 
> > const int max_rt = FRAG_RESULT_DATA7 - FRAG_RESULT_DATA0 + 1;
> > 
> > @@ -859,11 +861,17 @@ anv_pipeline_link_fs(const struct
> > brw_compiler *compiler,
> > 
> >const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
> > 
> >if (rt >= MAX_RTS ||
> > 
> >!(stage->key.wm.color_outputs_valid & (1 << rt))) {
> > 
> > - /* Unused or out-of-bounds, throw it away */
> > 
> > - deleted_output = true;
> > 
> > - var->data.mode = nir_var_function_temp;
> > 
> > - exec_node_remove(&var->node);
> > 
> > - exec_list_push_tail(&impl->locals, &var->node);
> > 
> > + /* Unused or out-of-bounds, throw it away. The exception
> > is depth-only
> > 
> > +  * rendering with alphaToCoverage, as in this case we
> > need to keep the
> > 
> > +  * fragment output in location 0, which we will bind
> > later to a null
> > 
> > +  * render target.
> > 
> > +  */
> > 
> > + if (rt != 0 || !has_alpha_to_coverage ||
> > !has_depth_stencil_att) {
> > 
> > +deleted_output = true;
> > 
> > +var->data.mode = nir_var_function_temp;
> > 
> > +exec_node_remove(&var->node);
> > 
> > +exec_list_push_tail(&impl->locals, &var->node);
> > 
> > + }
> > 
> >   continue;
> > 
> >}
> > 
> > 
> > 
> > @@ -1120,7 +1128,10 @@ anv_pipeline_compile_graphics(struct
> > anv_pipeline *pipeline,
> > 
> >   anv_pipeline_link_gs(compiler, &stages[s], next_stage);
> > 
> >   break;
> > 
> >case MESA_SHADER_FRAGMENT:
> > 
> > - anv_pipeline_link_fs(compiler, &stages[s]);
> > 
> > + anv_pipeline_link_fs(compiler, &stages[s],
> > 
> > +  pipeline->subpass-
> > >depth_stencil_attachment,
> > 
> > +  info->pMultisampleState &&
> > 
> > +  info->pMultisampleState-
> > >alphaToCoverageEnable);
> 
> You don't need to pass alphaToCoverageEnable through because it's
> already in the key.  For pipeline->subpass->depth_stencil_attachment, 
> I'm inclined to just not have that in the calculation.  It's not in
> the key, so we'd have to add it, and the calculation should be
> correct

[Mesa-dev] [Bug 92552] piglit egl-create-context-valid-flag-forward-compatible-gl regression

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92552

Sergii Romantsov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Sergii Romantsov  ---
Should be fixed:

commit 5c581b3dd6979b79cb3e3ab8e2e03b442e6ecb0d
Author: Andrii Simiklit 
Date:   Thu Oct 11 13:53:21 2018 +0300

egl: return correct error code for a case req ver < 3 with
forward-compatible

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110540] [AMD TAHITI XT] valve artifact broken

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110540

--- Comment #13 from Samuel Pitoiset  ---
(In reply to Sylvain BERTRAND from comment #11)
> I did report the regression ages ago: since there is a year in time lag,
> could be anywhere, even in linux drm or worse, in llvm:
> https://bugs.freedesktop.org/show_bug.cgi?id=110540

Wrong link?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 110540] [AMD TAHITI XT] valve artifact broken

2019-05-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110540

Samuel Pitoiset  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #12 from Samuel Pitoiset  ---
Should be fixed with
https://cgit.freedesktop.org/mesa/mesa/commit/?id=e68d7bec677f61645dc41226df5cfa9b56b01b56

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev