Re: [Mesa-dev] [PATCH 02/10] dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE

2018-04-25 Thread Marek Olšák
Why would you want to modify it? It's exactly what you get when you map it,
but that stride can't be used for modesetting.

Marek

On Wed, Apr 25, 2018 at 11:14 PM, Gurchetan Singh <
gurchetansi...@chromium.org> wrote:

> That sounds fine to me.  We can just modify the stride after
> dri_bo_create(..).
>
> On Wed, Apr 25, 2018 at 7:30 PM, Marek Olšák  wrote:
> > On Wed, Apr 25, 2018 at 6:56 PM, Gurchetan Singh
> >  wrote:
> >>
> >> On Wed, Apr 25, 2018 at 2:16 PM, Marek Olšák  wrote:
> >> > From: Nicolai Hähnle 
> >> >
> >> > Allow the caller to specify the row stride (in bytes) with which an
> >> > image
> >> > should be mapped. Note that completely ignoring USER_STRIDE is a valid
> >> > implementation of mapImage.
> >> >
> >> > This is horrible API design. Unfortunately, cros_gralloc does indeed
> >> > have
> >> > a horrible API design -- in that arbitrary images should be allowed to
> >> > be
> >> > mapped with the stride that a linear image of the same width would
> have.
> >>
> >> Yes, unfortunately the gralloc API doesn't return the stride when
> >> (*lock) is called.  However, the stride is returned during (*alloc).
> >> Currently, for the dri backend, minigbm uses
> >> __DRI_IMAGE_ATTRIB_STRIDE to compute the pixel stride given to
> >> Android.
> >>
> >> Is AMD seeing problems with the current approach (I haven't seen any
> >> bugs filed for this issue)?
> >>
> >> Another possible solution is to call mapImage()/unmapImage right after
> >> allocating the image, and use the stride returned by mapImage() to
> >> compute the pixel stride.  That could also fix whatever bugs AMD is
> >> seeing.
> >
> >
> > > Thanks. You cleared it up for me. It looks like everything we've been
> > > told so far is BS. This series isn't needed.
> >
> > The solution is to do this in the amdgpu minigbm backend at alloc time:
> >stride = align(width * Bpp, 64);
> >
> > Later chips should change that to:
> >stride = align(width * Bpp, 256);
> >
> > No querying needed. What do you think?
> >
> > Marek
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Nouveau driver problem when using EGL_LINUX_DMA_BUF_EXT

2018-04-25 Thread Ilia Mirkin
On Wed, Apr 18, 2018 at 6:04 AM, Volker Vogelhuber
 wrote:
> On 17.04.2018 15:44, Pekka Paalanen wrote:
>> If Nouveau cannot handle that correctly, it would hopefully refuse the
>> import.
>
> Although it would not solve my problem, it would be at least a proper
> handling of the API calls. I still doubt that the implementation is
> unsupported, as I got the image data rendered. It's just not rendered
> correctly. So it seems the data transfer succeeds in general,
> just not with the right parameters.

My leading theory is that NVIDIA hardware can only texture/render from
linear surfaces with pitch (and offset) aligned to 64 bytes.
Unfortunately this is something of a pain to test without all the
proper setup, but it's something I'll hopefully get to. If that theory
pans out, we'll start failing the imports of such images.
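
To make the proposed import check concrete, here is a minimal sketch in C. It is not Nouveau code: the helper name and the constant are hypothetical, and the 64-byte figure is only the unconfirmed theory stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the 64-byte requirement is the unverified theory from this
 * thread, not a documented hardware limit. */
#define LINEAR_IMPORT_ALIGNMENT 64u

/* Hypothetical helper: returns true when both the pitch and the offset of a
 * linear surface are multiples of the assumed alignment. An importer could
 * reject the dma-buf when this returns false. */
static bool linear_import_is_aligned(uint32_t pitch, uint32_t offset)
{
   return pitch % LINEAR_IMPORT_ALIGNMENT == 0 &&
          offset % LINEAR_IMPORT_ALIGNMENT == 0;
}
```

With this check, a pitch of 2560 bytes at offset 0 would pass, while a pitch of 2572 bytes or an offset of 32 bytes would be refused.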

Cheers,

  -ilia


Re: [Mesa-dev] [PATCH 02/10] dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE

2018-04-25 Thread Gurchetan Singh
That sounds fine to me.  We can just modify the stride after dri_bo_create(..).

On Wed, Apr 25, 2018 at 7:30 PM, Marek Olšák  wrote:
> On Wed, Apr 25, 2018 at 6:56 PM, Gurchetan Singh
>  wrote:
>>
>> On Wed, Apr 25, 2018 at 2:16 PM, Marek Olšák  wrote:
>> > From: Nicolai Hähnle 
>> >
>> > Allow the caller to specify the row stride (in bytes) with which an
>> > image
>> > should be mapped. Note that completely ignoring USER_STRIDE is a valid
>> > implementation of mapImage.
>> >
>> > This is horrible API design. Unfortunately, cros_gralloc does indeed
>> > have
>> > a horrible API design -- in that arbitrary images should be allowed to
>> > be
>> > mapped with the stride that a linear image of the same width would have.
>>
>> Yes, unfortunately the gralloc API doesn't return the stride when
>> (*lock) is called.  However, the stride is returned during (*alloc).
>> Currently, for the dri backend, minigbm uses
>> __DRI_IMAGE_ATTRIB_STRIDE to compute the pixel stride given to
>> Android.
>>
>> Is AMD seeing problems with the current approach (I haven't seen any
>> bugs filed for this issue)?
>>
>> Another possible solution is to call mapImage()/unmapImage right after
>> allocating the image, and use the stride returned by mapImage() to
>> compute the pixel stride.  That could also fix whatever bugs AMD is
>> seeing.
>
>
> Thanks. You cleared it up for me. It looks like everything we've been
> told so far is BS. This series isn't needed.
>
> The solution is to do this in the amdgpu minigbm backend at alloc time:
>stride = align(width * Bpp, 64);
>
> Later chips should change that to:
>stride = align(width * Bpp, 256);
>
> No querying needed. What do you think?
>
> Marek


Re: [Mesa-dev] [PATCH 02/10] dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE

2018-04-25 Thread Marek Olšák
On Wed, Apr 25, 2018 at 6:56 PM, Gurchetan Singh <
gurchetansi...@chromium.org> wrote:

> On Wed, Apr 25, 2018 at 2:16 PM, Marek Olšák  wrote:
> > From: Nicolai Hähnle 
> >
> > Allow the caller to specify the row stride (in bytes) with which an image
> > should be mapped. Note that completely ignoring USER_STRIDE is a valid
> > implementation of mapImage.
> >
> > This is horrible API design. Unfortunately, cros_gralloc does indeed have
> > a horrible API design -- in that arbitrary images should be allowed to be
> > mapped with the stride that a linear image of the same width would have.
>
> Yes, unfortunately the gralloc API doesn't return the stride when
> (*lock) is called.  However, the stride is returned during (*alloc).
> Currently, for the dri backend, minigbm uses
> __DRI_IMAGE_ATTRIB_STRIDE to compute the pixel stride given to
> Android.
>
> Is AMD seeing problems with the current approach (I haven't seen any
> bugs filed for this issue)?
>
> Another possible solution is to call mapImage()/unmapImage right after
> allocating the image, and use the stride returned by mapImage() to
> compute the pixel stride.  That could also fix whatever bugs AMD is
> seeing.
>

Thanks. You cleared it up for me. It looks like everything we've been
told so far is BS. This series isn't needed.

The solution is to do this in the amdgpu minigbm backend at alloc time:
   stride = align(width * Bpp, 64);

Later chips should change that to:
   stride = align(width * Bpp, 256);

No querying needed. What do you think?
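
For reference, the stride computation above can be sketched as standalone C. This is only an illustration, not the minigbm code: align_pot() and linear_stride() are hypothetical names, and the alignment is passed in so that 64 or 256 can be chosen per chip.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a non-zero power of two. */
static uint32_t align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: row stride in bytes of a linear image, computed at
 * allocation time from the width and the bytes per pixel (Bpp). */
static uint32_t linear_stride(uint32_t width, uint32_t bytes_per_pixel,
                              uint32_t pitch_alignment)
{
   return align_pot(width * bytes_per_pixel, pitch_alignment);
}
```

For a 100-pixel-wide image at 4 bytes per pixel, this gives 448 bytes with 64-byte alignment and 512 bytes with 256-byte alignment.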

Marek


[Mesa-dev] [Bug 106232] LLVM unit tests have error in random number handling

2018-04-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106232

Roland Scheidegger  changed:

   What|Removed |Added

 CC||jfons...@vmware.com,
   ||srol...@vmware.com

--- Comment #1 from Roland Scheidegger  ---
I think for "normal" unsigned type it should still cover the whole range
because of the "value += (double)(mask & rand());" line above? So the numbers
covered whole uint range before.
This line is weird though indeed and you're quite right that it will make all
numbers negative just to get them clamped to zero later.
And float range is only from 0 to 2.0f?
Honestly I don't quite understand what the code is trying to do.
CC Jose as he wrote it.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Jason Ekstrand



On April 25, 2018 20:25:16 Nanley Chery  wrote:

On Wed, Apr 25, 2018 at 04:50:11PM -0700, Jason Ekstrand wrote:
On Tue, Apr 24, 2018 at 5:48 PM, Nanley Chery  wrote:

Determine the predicate for updating the indirect depth value in the
loop which inspects whether or not we need to resolve any slices.
---
src/mesa/drivers/dri/i965/brw_clear.c | 43 +-
-
1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clear.c
b/src/mesa/drivers/dri/i965/brw_clear.c
index 6521141d7f6..e372d28926e 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
struct intel_mipmap_tree *mt = depth_irb->mt;
struct gl_renderbuffer_attachment *depth_att =
&fb->Attachment[BUFFER_DEPTH];
const struct gen_device_info *devinfo = &brw->screen->devinfo;
-   bool same_clear_value = true;

if (devinfo->gen < 6)
return false;
@@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
const uint32_t num_layers = depth_att->Layered ?
depth_irb->layer_count : 1;

/* If we're clearing to a new clear value, then we need to resolve any
clear
-* flags out of the HiZ buffer into the real depth buffer.
+* flags out of the HiZ buffer into the real depth buffer and update
the
+* miptree's clear value.
*/
if (mt->fast_clear_color.f32[0] != clear_value) {
+  /* BLORP updates the indirect clear color buffer when we do fast
clears.
+   * If we won't do a fast clear, we'll have to update it ourselves.
Start
+   * off assuming we won't perform a fast clear.
+   */
+  bool blorp_will_update_indirect_color = false;

This boolean is rather awkward.

Why's that?

It does have a clear meaning and it does what it says it does.  However, 
it's not that obvious of a thing to work with compared to "did we do a clear?"



+
for (uint32_t level = mt->first_level; level <= mt->last_level;
level++) {
if (!intel_miptree_level_has_hiz(mt, level))
continue;
@@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
const unsigned level_layers = brw_get_num_logical_layers(mt,
level);

for (uint32_t layer = 0; layer < level_layers; layer++) {
+const enum isl_aux_state aux_state =
+   intel_miptree_get_aux_state(mt, level, layer);
+
if (level == depth_irb->mt_level &&
layer >= depth_irb->mt_layer &&
layer < depth_irb->mt_layer + num_layers) {
+
+   if (aux_state != ISL_AUX_STATE_CLEAR)
+  blorp_will_update_indirect_color = true;

Putting this here separates the detection of whether or not we are doing a 
fast clear (and therefore don't need to set the clear color) even further 
from where we do the clear and use this value than it was previously.



+
/* We're going to clear this layer anyway.  Leave it
alone. */
continue;
}

-enum isl_aux_state aux_state =
-   intel_miptree_get_aux_state(mt, level, layer);
-
if (aux_state != ISL_AUX_STATE_CLEAR &&
aux_state != ISL_AUX_STATE_COMPRESSED_CLEAR) {
/* This slice doesn't have any fast-cleared bits. */
@@ -214,29 +224,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
}

intel_miptree_set_depth_clear_value(brw, mt, clear_value);
-  same_clear_value = false;
-   }
-
-   bool need_clear = false;
-   for (unsigned a = 0; a < num_layers; a++) {
-  enum isl_aux_state aux_state =
- intel_miptree_get_aux_state(mt, depth_irb->mt_level,
- depth_irb->mt_layer + a);
-
-  if (aux_state != ISL_AUX_STATE_CLEAR) {
- need_clear = true;
- break;
-  }
-   }
-
-   if (!need_clear) {
-  if (!same_clear_value) {
- /* BLORP updates the indirect clear color buffer when performing
a
-  * fast clear. Since we are skipping the fast clear here, we
need to
-  * do the update ourselves.
-  */
+  if (!blorp_will_update_indirect_color)
intel_miptree_update_indirect_color(brw, mt);
-  }

I think we can do this even better.  We could do

bool blorp_updated_indirect_clear_color = false;

and then set it to true if we call intel_hiz_exec below.  Then, after the
loop below we would do

if (!blorp_updated_indirect_clear_color)
intel_miptree_update_indirect_color(brw, mt);

after we've done the clears.

I had something like that originally and I think that solution would
have marginally better performance. I went with doing it this way
because it allows us to:

* Do all the clear color updates in one place.

That's sort-of true.  It puts all the clear color updates that happen in 
this function together.  But there is another update that BLORP is doing 
that, I would argue, it separates even further.



* Place blorp_will_update_indirect_color in a scope smaller
than the function.

True, but its declaration, update, and use are much further apart in terms 
of logic and lines of code.  Also, it's much further away from 

Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Nanley Chery
On Wed, Apr 25, 2018 at 04:50:11PM -0700, Jason Ekstrand wrote:
> On Tue, Apr 24, 2018 at 5:48 PM, Nanley Chery  wrote:
> 
> > Determine the predicate for updating the indirect depth value in the
> > loop which inspects whether or not we need to resolve any slices.
> > ---
> >  src/mesa/drivers/dri/i965/brw_clear.c | 43 +-
> > -
> >  1 file changed, 16 insertions(+), 27 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c
> > b/src/mesa/drivers/dri/i965/brw_clear.c
> > index 6521141d7f6..e372d28926e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > @@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > struct intel_mipmap_tree *mt = depth_irb->mt;
> > struct gl_renderbuffer_attachment *depth_att =
> > &fb->Attachment[BUFFER_DEPTH];
> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > -   bool same_clear_value = true;
> >
> > if (devinfo->gen < 6)
> >return false;
> > @@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > const uint32_t num_layers = depth_att->Layered ?
> > depth_irb->layer_count : 1;
> >
> > /* If we're clearing to a new clear value, then we need to resolve any
> > clear
> > -* flags out of the HiZ buffer into the real depth buffer.
> > +* flags out of the HiZ buffer into the real depth buffer and update
> > the
> > +* miptree's clear value.
> >  */
> > if (mt->fast_clear_color.f32[0] != clear_value) {
> > +  /* BLORP updates the indirect clear color buffer when we do fast
> > clears.
> > +   * If we won't do a fast clear, we'll have to update it ourselves.
> > Start
> > +   * off assuming we won't perform a fast clear.
> > +   */
> > +  bool blorp_will_update_indirect_color = false;
> >
> 
> This boolean is rather awkward.
> 
> 

Why's that?

> > +
> >for (uint32_t level = mt->first_level; level <= mt->last_level;
> > level++) {
> >   if (!intel_miptree_level_has_hiz(mt, level))
> >  continue;
> > @@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
> >   const unsigned level_layers = brw_get_num_logical_layers(mt,
> > level);
> >
> >   for (uint32_t layer = 0; layer < level_layers; layer++) {
> > +const enum isl_aux_state aux_state =
> > +   intel_miptree_get_aux_state(mt, level, layer);
> > +
> >  if (level == depth_irb->mt_level &&
> >  layer >= depth_irb->mt_layer &&
> >  layer < depth_irb->mt_layer + num_layers) {
> > +
> > +   if (aux_state != ISL_AUX_STATE_CLEAR)
> > +  blorp_will_update_indirect_color = true;
> > +
> > /* We're going to clear this layer anyway.  Leave it
> > alone. */
> > continue;
> >  }
> >
> > -enum isl_aux_state aux_state =
> > -   intel_miptree_get_aux_state(mt, level, layer);
> > -
> >  if (aux_state != ISL_AUX_STATE_CLEAR &&
> >  aux_state != ISL_AUX_STATE_COMPRESSED_CLEAR) {
> > /* This slice doesn't have any fast-cleared bits. */
> > @@ -214,29 +224,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
> >}
> >
> >intel_miptree_set_depth_clear_value(brw, mt, clear_value);
> > -  same_clear_value = false;
> > -   }
> > -
> > -   bool need_clear = false;
> > -   for (unsigned a = 0; a < num_layers; a++) {
> > -  enum isl_aux_state aux_state =
> > - intel_miptree_get_aux_state(mt, depth_irb->mt_level,
> > - depth_irb->mt_layer + a);
> > -
> > -  if (aux_state != ISL_AUX_STATE_CLEAR) {
> > - need_clear = true;
> > - break;
> > -  }
> > -   }
> > -
> > -   if (!need_clear) {
> > -  if (!same_clear_value) {
> > - /* BLORP updates the indirect clear color buffer when performing
> > a
> > -  * fast clear. Since we are skipping the fast clear here, we
> > need to
> > -  * do the update ourselves.
> > -  */
> > +  if (!blorp_will_update_indirect_color)
> >   intel_miptree_update_indirect_color(brw, mt);
> > -  }
> >
> 
> I think we can do this even better.  We could do
> 
> bool blorp_updated_indirect_clear_color = false;
> 
> and then set it to true if we call intel_hiz_exec below.  Then, after the
> loop below we would do
> 
> if (!blorp_updated_indirect_clear_color)
>intel_miptree_update_indirect_color(brw, mt);
> 
> after we've done the clears.
> 
> 

I had something like that originally and I think that solution would
have marginally better performance. I went with doing it this way
because it allows us to:

* Do all the clear color updates in one place.
* Place blorp_will_update_indirect_color in a scope smaller
  than the function.
* Delete more code.

If we wait until the loop below to assign

Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Jason Ekstrand
On Tue, Apr 24, 2018 at 5:48 PM, Nanley Chery  wrote:

> Determine the predicate for updating the indirect depth value in the
> loop which inspects whether or not we need to resolve any slices.
> ---
>  src/mesa/drivers/dri/i965/brw_clear.c | 43 +-
> -
>  1 file changed, 16 insertions(+), 27 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index 6521141d7f6..e372d28926e 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> struct intel_mipmap_tree *mt = depth_irb->mt;
> struct gl_renderbuffer_attachment *depth_att =
> &fb->Attachment[BUFFER_DEPTH];
> const struct gen_device_info *devinfo = &brw->screen->devinfo;
> -   bool same_clear_value = true;
>
> if (devinfo->gen < 6)
>return false;
> @@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
> const uint32_t num_layers = depth_att->Layered ?
> depth_irb->layer_count : 1;
>
> /* If we're clearing to a new clear value, then we need to resolve any
> clear
> -* flags out of the HiZ buffer into the real depth buffer.
> +* flags out of the HiZ buffer into the real depth buffer and update
> the
> +* miptree's clear value.
>  */
> if (mt->fast_clear_color.f32[0] != clear_value) {
> +  /* BLORP updates the indirect clear color buffer when we do fast
> clears.
> +   * If we won't do a fast clear, we'll have to update it ourselves.
> Start
> +   * off assuming we won't perform a fast clear.
> +   */
> +  bool blorp_will_update_indirect_color = false;
>

This boolean is rather awkward.


> +
>for (uint32_t level = mt->first_level; level <= mt->last_level;
> level++) {
>   if (!intel_miptree_level_has_hiz(mt, level))
>  continue;
> @@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
>   const unsigned level_layers = brw_get_num_logical_layers(mt,
> level);
>
>   for (uint32_t layer = 0; layer < level_layers; layer++) {
> +const enum isl_aux_state aux_state =
> +   intel_miptree_get_aux_state(mt, level, layer);
> +
>  if (level == depth_irb->mt_level &&
>  layer >= depth_irb->mt_layer &&
>  layer < depth_irb->mt_layer + num_layers) {
> +
> +   if (aux_state != ISL_AUX_STATE_CLEAR)
> +  blorp_will_update_indirect_color = true;
> +
> /* We're going to clear this layer anyway.  Leave it
> alone. */
> continue;
>  }
>
> -enum isl_aux_state aux_state =
> -   intel_miptree_get_aux_state(mt, level, layer);
> -
>  if (aux_state != ISL_AUX_STATE_CLEAR &&
>  aux_state != ISL_AUX_STATE_COMPRESSED_CLEAR) {
> /* This slice doesn't have any fast-cleared bits. */
> @@ -214,29 +224,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
>}
>
>intel_miptree_set_depth_clear_value(brw, mt, clear_value);
> -  same_clear_value = false;
> -   }
> -
> -   bool need_clear = false;
> -   for (unsigned a = 0; a < num_layers; a++) {
> -  enum isl_aux_state aux_state =
> - intel_miptree_get_aux_state(mt, depth_irb->mt_level,
> - depth_irb->mt_layer + a);
> -
> -  if (aux_state != ISL_AUX_STATE_CLEAR) {
> - need_clear = true;
> - break;
> -  }
> -   }
> -
> -   if (!need_clear) {
> -  if (!same_clear_value) {
> - /* BLORP updates the indirect clear color buffer when performing
> a
> -  * fast clear. Since we are skipping the fast clear here, we
> need to
> -  * do the update ourselves.
> -  */
> +  if (!blorp_will_update_indirect_color)
>   intel_miptree_update_indirect_color(brw, mt);
> -  }
>

I think we can do this even better.  We could do

bool blorp_updated_indirect_clear_color = false;

and then set it to true if we call intel_hiz_exec below.  Then, after the
loop below we would do

if (!blorp_updated_indirect_clear_color)
   intel_miptree_update_indirect_color(brw, mt);

after we've done the clears.
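
To spell out the control flow being suggested, here is a self-contained sketch with stand-in functions. Nothing here is the real i965 code: fake_hiz_clear() and fake_update_indirect_color() are hypothetical stubs for intel_hiz_exec() and intel_miptree_update_indirect_color(), and the sketch leaves out the condition that the clear value actually changed.

```c
#include <stdbool.h>

/* Counters so the effect of the stubs can be observed. */
struct clear_stats {
   int hiz_clears;      /* how often the stand-in fast clear ran */
   int manual_updates;  /* how often we updated the clear color ourselves */
};

/* Stand-in for the fast clear, which also updates the indirect clear color. */
static void fake_hiz_clear(struct clear_stats *s) { s->hiz_clears++; }

/* Stand-in for the manual indirect clear color update. */
static void fake_update_indirect_color(struct clear_stats *s)
{
   s->manual_updates++;
}

/* Clear every layer that is not already in the clear state, and remember
 * whether any fast clear happened. Only if none did is the manual update
 * needed. */
static void clear_layers(const bool *layer_is_clear, int num_layers,
                         struct clear_stats *s)
{
   bool blorp_updated_indirect_clear_color = false;

   for (int a = 0; a < num_layers; a++) {
      if (!layer_is_clear[a]) {
         fake_hiz_clear(s);
         blorp_updated_indirect_clear_color = true;
      }
   }

   if (!blorp_updated_indirect_clear_color)
      fake_update_indirect_color(s);
}
```

The flag is set right where the clear is issued and read right after the loop, so its declaration, update, and use stay close together.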


> }
>
> for (unsigned a = 0; a < num_layers; a++) {
> --
> 2.16.2
>


Re: [Mesa-dev] [PATCH] anv/icl: Enable Vulkan on Ice Lake

2018-04-25 Thread Nanley Chery
On Wed, Apr 25, 2018 at 03:59:52PM -0700, Anuj Phogat wrote:
> This patch enables the Vulkan driver on Ice Lake h/w
> with added warning about preliminary support.
> 
> Signed-off-by: Anuj Phogat 
> ---
>  src/intel/vulkan/anv_device.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 7522b7865c..b456d3d4c5 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -323,6 +323,8 @@ anv_physical_device_init(struct anv_physical_device 
> *device,
>intel_logw("Bay Trail Vulkan support is incomplete");
> } else if (device->info.gen >= 8 && device->info.gen <= 10) {
>/* Gen8-10 fully supported */
> +   } else if (device->info.gen == 11) {
> +  intel_logw("Vulkan is not yet fully supported on gen11.");
> } else {
>result = vk_errorf(device->instance, device,
>   VK_ERROR_INCOMPATIBLE_DRIVER,
> -- 
> 2.13.6
> 


[Mesa-dev] [PATCH] anv/icl: Enable Vulkan on Ice Lake

2018-04-25 Thread Anuj Phogat
This patch enables the Vulkan driver on Ice Lake h/w
with added warning about preliminary support.

Signed-off-by: Anuj Phogat 
---
 src/intel/vulkan/anv_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 7522b7865c..b456d3d4c5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -323,6 +323,8 @@ anv_physical_device_init(struct anv_physical_device *device,
   intel_logw("Bay Trail Vulkan support is incomplete");
} else if (device->info.gen >= 8 && device->info.gen <= 10) {
   /* Gen8-10 fully supported */
+   } else if (device->info.gen == 11) {
+  intel_logw("Vulkan is not yet fully supported on gen11.");
} else {
   result = vk_errorf(device->instance, device,
  VK_ERROR_INCOMPATIBLE_DRIVER,
-- 
2.13.6



Re: [Mesa-dev] [PATCH 02/10] dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE

2018-04-25 Thread Gurchetan Singh
On Wed, Apr 25, 2018 at 2:16 PM, Marek Olšák  wrote:
> From: Nicolai Hähnle 
>
> Allow the caller to specify the row stride (in bytes) with which an image
> should be mapped. Note that completely ignoring USER_STRIDE is a valid
> implementation of mapImage.
>
> This is horrible API design. Unfortunately, cros_gralloc does indeed have
> a horrible API design -- in that arbitrary images should be allowed to be
> mapped with the stride that a linear image of the same width would have.

Yes, unfortunately the gralloc API doesn't return the stride when
(*lock) is called.  However, the stride is returned during (*alloc).
Currently, for the dri backend, minigbm uses
__DRI_IMAGE_ATTRIB_STRIDE to compute the pixel stride given to
Android.

Is AMD seeing problems with the current approach (I haven't seen any
bugs filed for this issue)?

Another possible solution is to call mapImage()/unmapImage right after
allocating the image, and use the stride returned by mapImage() to
compute the pixel stride.  That could also fix whatever bugs AMD is
seeing.
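
As a rough illustration of the last step, converting a byte stride (from __DRI_IMAGE_ATTRIB_STRIDE or from mapImage()) into the pixel stride handed to Android could look like the sketch below. The helper name is hypothetical and this is not the actual minigbm code; it assumes a single-plane format with a fixed number of bytes per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a row stride in bytes into a stride in
 * pixels. Assumes the byte stride is a whole number of pixels. */
static uint32_t pixel_stride_from_bytes(uint32_t byte_stride,
                                        uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel != 0);
   assert(byte_stride % bytes_per_pixel == 0);
   return byte_stride / bytes_per_pixel;
}
```

For example, a 448-byte stride at 4 bytes per pixel corresponds to a pixel stride of 112.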

>
> There is no separate capability bit because it's unclear how stricter
> requirements should be defined.
> ---
>  include/GL/internal/dri_interface.h | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/GL/internal/dri_interface.h 
> b/include/GL/internal/dri_interface.h
> index 07dfd74f9d8..4247e61415f 100644
> --- a/include/GL/internal/dri_interface.h
> +++ b/include/GL/internal/dri_interface.h
> @@ -1213,21 +1213,21 @@ struct __DRIdri2ExtensionRec {
>  */
> __DRIcreateNewScreen2Func createNewScreen2;
>  };
>
>
>  /**
>   * This extension provides functionality to enable various EGLImage
>   * extensions.
>   */
>  #define __DRI_IMAGE "DRI_IMAGE"
> -#define __DRI_IMAGE_VERSION 17
> +#define __DRI_IMAGE_VERSION 18
>
>  /**
>   * These formats correspond to the similarly named MESA_FORMAT_*
>   * tokens, except in the native endian of the CPU.  For example, on
>   * little endian __DRI_IMAGE_FORMAT_XRGB corresponds to
>   * MESA_FORMAT_XRGB, but MESA_FORMAT_XRGB_REV on big endian.
>   *
>   * __DRI_IMAGE_FORMAT_NONE is for images that aren't directly usable
>   * by the driver (YUV planar formats) but serve as a base image for
>   * creating sub-images for the different planes within the image.
> @@ -1263,20 +1263,21 @@ struct __DRIdri2ExtensionRec {
>   * in contrary to gbm buffers, front buffers and fake front buffers, which
>   * could be read after a flush."
>   */
>  #define __DRI_IMAGE_USE_BACKBUFFER  0x0010
>
>
>  #define __DRI_IMAGE_TRANSFER_READ        0x1
>  #define __DRI_IMAGE_TRANSFER_WRITE   0x2
>  #define __DRI_IMAGE_TRANSFER_READ_WRITE  \
>  (__DRI_IMAGE_TRANSFER_READ | __DRI_IMAGE_TRANSFER_WRITE)
> +#define __DRI_IMAGE_TRANSFER_USER_STRIDE 0x4 /* since version 18 */
>
>  /**
>   * Four CC formats that matches with WL_DRM_FORMAT_* from wayland_drm.h,
>   * GBM_FORMAT_* from gbm.h, and DRM_FORMAT_* from drm_fourcc.h. Used with
>   * createImageFromNames.
>   *
>   * \since 5
>   */
>
>  #define __DRI_IMAGE_FOURCC_R8  0x20203852
> @@ -1554,22 +1555,31 @@ struct __DRIimageExtensionRec {
> /**
>  * Returns a map of the specified region of a __DRIimage for the 
> specified usage.
>  *
>  * flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the
>  * mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ
>  * is not included in the flags, the buffer content at map time is
>  * undefined. Users wanting to modify the mapping must include
>  * __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not
>  * included, behaviour when writing the mapping is undefined.
>  *
> -* Returns the byte stride in *stride, and an opaque pointer to data
> -* tracking the mapping in **data, which must be passed to unmapImage().
> +* When __DRI_IMAGE_TRANSFER_USER_STRIDE is set in \p flags (since 
> version 18),
> +* the driver should attempt to map the image with the byte stride given 
> in
> +* *stride. The caller must ensure that *stride is large enough to hold a
> +* row of the mapping. If the requested stride is not supported, the 
> mapping
> +* may fail, or a mapping with a different stride may be created (in which
> +* case the actual stride is returned in *stride).
> +*
> +* Returns an opaque pointer to data tracking the mapping in **data, which
> +* must be passed to unmapImage().
> +*
> +* Returns the byte stride in *stride.
>  *
>  * Returns NULL on error.
>  *
>  * \since 12
>  */
> void *(*mapImage)(__DRIcontext *context, __DRIimage *image,
>   int x0, int y0, int width, int height,
>   unsigned int flags, int *stride, void **data);
>
> /**
> --
> 2.17.0
>
> ___
> mesa-dev 

[Mesa-dev] [Bug 106231] llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785

2018-04-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106231

--- Comment #2 from Roland Scheidegger  ---
FWIW I won't have time to look into this until next week, so volunteers
welcome.
Not using the intrinsics for newer llvm versions is trivial, but we need to update
our existing code (when intrinsics can't be used) to match what the
auto-upgrader would do, otherwise we'll probably get really crappy code (at
least I suspect in this case llvm won't recognize our patterns), which is of
course the reason why we used intrinsics in the first place...
(Can't really say I'm all that happy to see the intrinsics go, as the code to
emulate it is rather complex with 8 instructions, thus blowing up IR size and
likely compile time as well, but I guess that's life when compilers are getting
smarter...)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Rafael Antognolli
On Wed, Apr 25, 2018 at 02:53:26PM -0700, Nanley Chery wrote:
> On Wed, Apr 25, 2018 at 02:26:18PM -0700, Rafael Antognolli wrote:
> > On Tue, Apr 24, 2018 at 05:48:45PM -0700, Nanley Chery wrote:
> > > Determine the predicate for updating the indirect depth value in the
> > > loop which inspects whether or not we need to resolve any slices.
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_clear.c | 43 
> > > +--
> > >  1 file changed, 16 insertions(+), 27 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> > > b/src/mesa/drivers/dri/i965/brw_clear.c
> > > index 6521141d7f6..e372d28926e 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > > @@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > > struct intel_mipmap_tree *mt = depth_irb->mt;
> > > struct gl_renderbuffer_attachment *depth_att = 
> > > > &fb->Attachment[BUFFER_DEPTH];
> > > > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > > -   bool same_clear_value = true;
> > >  
> > > if (devinfo->gen < 6)
> > >return false;
> > > @@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > > const uint32_t num_layers = depth_att->Layered ? 
> > > depth_irb->layer_count : 1;
> > >  
> > > /* If we're clearing to a new clear value, then we need to resolve 
> > > any clear
> > > -* flags out of the HiZ buffer into the real depth buffer.
> > > +* flags out of the HiZ buffer into the real depth buffer and update 
> > > the
> > > +* miptree's clear value.
> > >  */
> > 
> > I got confused by this comment here. I think your addition to the
> > comment is fine, but the original one wasn't very descriptive of what's
> > going on (at least it wasn't obvious to me).
> > 
> > Since you are already changing it, maybe we can improve it to something
> > like:
> > 
> > /* If we are clearing to a new clear value, the levels/layers being
> >  * cleared don't need resolving because they will stay in the clear
> > >  * state, and only the miptree's clear value needs updating. However, if
> >  * some levels/layers were already in a clear state, but are not being
> >  * cleared now, and the clear value is changing, then we need to resolve
> >  * their clear flags out of the HiZ buffer into the real depth buffer.
> >  */
> > 
> 
> I see. The original comment does fail to mention that we don't resolve
> the level/layer range being cleared. 
> 
> > I'm not sure if this actually helps or if it just makes the comment
> > unnecessarily complex.
> > 
> 
> I think we can fix this while keeping the comment simple. What do you
> think about one of these:
> 
>/* If we're clearing to a new clear value, then we need to resolve
> * any clear flags that are outside of the specified range and then
> * update the miptree's clear value.
> */
> 
>/* If we're clearing to a new clear value, then we need to resolve
> * any clear flags that are outside of the level/layer range
> * specified for clearing and then update the miptree's clear value.
> */

Ah, excellent, either of those options is great imho! Choose whichever
you want.

Rafael

> > On a second thought, this doesn't need to be changed in this commit if
> > you don't want to. We can just send a new one later clarifying these
> > points, and we could also update the comment where the resolve happens
> > to clarify that it should only happen to layers not being cleared now.
> > 
> > In any case, this patch is a nice cleanup.
> > 
> > Reviewed-by: Rafael Antognolli 
> > 
> 
> Thanks!
> 
> -Nanley
> 
> > > if (mt->fast_clear_color.f32[0] != clear_value) {
> > > +  /* BLORP updates the indirect clear color buffer when we do fast 
> > > clears.
> > > +   * If we won't do a fast clear, we'll have to update it ourselves. 
> > > Start
> > > +   * off assuming we won't perform a fast clear.
> > > +   */
> > > +  bool blorp_will_update_indirect_color = false;
> > > +
> > >for (uint32_t level = mt->first_level; level <= mt->last_level; 
> > > level++) {
> > >   if (!intel_miptree_level_has_hiz(mt, level))
> > >  continue;
> > > @@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > >   const unsigned level_layers = brw_get_num_logical_layers(mt, 
> > > level);
> > >  
> > >   for (uint32_t layer = 0; layer < level_layers; layer++) {
> > > +const enum isl_aux_state aux_state =
> > > +   intel_miptree_get_aux_state(mt, level, layer);
> > > +
> > >  if (level == depth_irb->mt_level &&
> > >  layer >= depth_irb->mt_layer &&
> > >  layer < depth_irb->mt_layer + num_layers) {
> > > +
> > > +   if (aux_state != ISL_AUX_STATE_CLEAR)
> > > +  blorp_will_update_indirect_color = true;
> > > +
> > > /* We're going to clear 

Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Nanley Chery
On Wed, Apr 25, 2018 at 02:26:18PM -0700, Rafael Antognolli wrote:
> On Tue, Apr 24, 2018 at 05:48:45PM -0700, Nanley Chery wrote:
> > Determine the predicate for updating the indirect depth value in the
> > loop which inspects whether or not we need to resolve any slices.
> > ---
> >  src/mesa/drivers/dri/i965/brw_clear.c | 43 
> > +--
> >  1 file changed, 16 insertions(+), 27 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> > b/src/mesa/drivers/dri/i965/brw_clear.c
> > index 6521141d7f6..e372d28926e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > @@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > struct intel_mipmap_tree *mt = depth_irb->mt;
> > struct gl_renderbuffer_attachment *depth_att = 
> > &fb->Attachment[BUFFER_DEPTH];
> > const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > -   bool same_clear_value = true;
> >  
> > if (devinfo->gen < 6)
> >return false;
> > @@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > const uint32_t num_layers = depth_att->Layered ? depth_irb->layer_count 
> > : 1;
> >  
> > /* If we're clearing to a new clear value, then we need to resolve any 
> > clear
> > -* flags out of the HiZ buffer into the real depth buffer.
> > +* flags out of the HiZ buffer into the real depth buffer and update the
> > +* miptree's clear value.
> >  */
> 
> I got confused by this comment here. I think your addition to the
> comment is fine, but the original one wasn't very descriptive of what's
> going on (at least it wasn't obvious to me).
> 
> Since you are already changing it, maybe we can improve it to something
> like:
> 
> /* If we are clearing to a new clear value, the levels/layers being
>  * cleared don't need resolving because they will stay in the clear
>  * state, and only the miptree's clear value needs updating. However, if
>  * some levels/layers were already in a clear state, but are not being
>  * cleared now, and the clear value is changing, then we need to resolve
>  * their clear flags out of the HiZ buffer into the real depth buffer.
>  */
> 

I see. The original comment does fail to mention that we don't resolve
the level/layer range being cleared. 

> I'm not sure if this actually helps or if it just makes the comment
> unnecessarily complex.
> 

I think we can fix this while keeping the comment simple. What do you
think about one of these:

   /* If we're clearing to a new clear value, then we need to resolve
* any clear flags that are outside of the specified range and then
* update the miptree's clear value.
*/

   /* If we're clearing to a new clear value, then we need to resolve
* any clear flags that are outside of the level/layer range
* specified for clearing and then update the miptree's clear value.
*/

> On a second thought, this doesn't need to be changed in this commit if
> you don't want to. We can just send a new one later clarifying these
> points, and we could also update the comment where the resolve happens
> to clarify that it should only happen to layers not being cleared now.
> 
> In any case, this patch is a nice cleanup.
> 
> Reviewed-by: Rafael Antognolli 
> 

Thanks!

-Nanley
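
For reference, the rule the proposed comments describe can be sketched in
isolation. This is only a simplified model, not the driver code: the enum,
the struct, plan_depth_clear() and MAX_LAYERS are hypothetical stand-ins,
and it looks at a single miplevel instead of walking the whole miptree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's aux states. */
enum aux_state { AUX_RESOLVED, AUX_CLEAR, AUX_COMPRESSED_CLEAR };

#define MAX_LAYERS 8

struct clear_plan {
   bool needs_resolve[MAX_LAYERS];        /* layers outside the cleared range */
   bool blorp_will_update_indirect_color; /* true if a fast clear will run */
};

/* Model of one level: layers [first, first + count) are being cleared
 * to a new clear value.
 */
static struct clear_plan
plan_depth_clear(const enum aux_state *state, uint32_t num_layers,
                 uint32_t first, uint32_t count)
{
   struct clear_plan plan = { {false}, false };

   for (uint32_t layer = 0; layer < num_layers && layer < MAX_LAYERS; layer++) {
      if (layer >= first && layer < first + count) {
         /* Inside the range: never resolved. A fast clear is only
          * needed if the layer is not already in the clear state.
          */
         if (state[layer] != AUX_CLEAR)
            plan.blorp_will_update_indirect_color = true;
         continue;
      }

      /* Outside the range: resolve only layers that hold clear flags. */
      if (state[layer] == AUX_CLEAR || state[layer] == AUX_COMPRESSED_CLEAR)
         plan.needs_resolve[layer] = true;
   }

   return plan;
}
```

With layers 0-1 being cleared and layers 2-3 holding clear flags, only
layers 2 and 3 are marked for resolve, which is the point both candidate
comments try to make.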

> > if (mt->fast_clear_color.f32[0] != clear_value) {
> > +  /* BLORP updates the indirect clear color buffer when we do fast 
> > clears.
> > +   * If we won't do a fast clear, we'll have to update it ourselves. 
> > Start
> > +   * off assuming we won't perform a fast clear.
> > +   */
> > +  bool blorp_will_update_indirect_color = false;
> > +
> >for (uint32_t level = mt->first_level; level <= mt->last_level; 
> > level++) {
> >   if (!intel_miptree_level_has_hiz(mt, level))
> >  continue;
> > @@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
> >   const unsigned level_layers = brw_get_num_logical_layers(mt, 
> > level);
> >  
> >   for (uint32_t layer = 0; layer < level_layers; layer++) {
> > +const enum isl_aux_state aux_state =
> > +   intel_miptree_get_aux_state(mt, level, layer);
> > +
> >  if (level == depth_irb->mt_level &&
> >  layer >= depth_irb->mt_layer &&
> >  layer < depth_irb->mt_layer + num_layers) {
> > +
> > +   if (aux_state != ISL_AUX_STATE_CLEAR)
> > +  blorp_will_update_indirect_color = true;
> > +
> > /* We're going to clear this layer anyway.  Leave it alone. 
> > */
> > continue;
> >  }
> >  
> > -enum isl_aux_state aux_state =
> > -   intel_miptree_get_aux_state(mt, level, layer);
> > -
> >  if (aux_state != ISL_AUX_STATE_CLEAR &&
> >  aux_state != ISL_AUX_STATE_COMPRESSED_CLEAR) {
> > /* This slice 

Re: [Mesa-dev] [PATCH] st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.

2018-04-25 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Apr 25, 2018 at 3:08 PM, Eric Anholt  wrote:

> GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an
> unsized internalformat, and it should be non-color-renderable.
> fbobject.c's implementation of the check for color-renderable checks
> that the texture has a 2101010 mesa format, so make sure that we have
> chosen a 2101010 format so that check can do what it meant to.
>
> Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5.
> ---
>  src/mesa/state_tracker/st_format.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_format.c
> b/src/mesa/state_tracker/st_format.c
> index 3db3c7e967c6..418f5342025c 100644
> --- a/src/mesa/state_tracker/st_format.c
> +++ b/src/mesa/state_tracker/st_format.c
> @@ -2138,6 +2138,19 @@ st_choose_format(struct st_context *st, GLenum
> internalFormat,
>goto success;
> }
>
> +   /* For an unsized GL_RGB but a 2_10_10_10 type, try to pick one of the
> +* 2_10_10_10 formats.  This is important for
> +* GL_EXT_texture_type_2_10_10_10_REV support, which says that these
> +* formats are not color-renderable.  Mesa's check for making those
> +* non-color-renderable is based on our chosen format being 2101010.
> +*/
> +   if (type == GL_UNSIGNED_INT_2_10_10_10_REV) {
> +  if (internalFormat == GL_RGB)
> + internalFormat = GL_RGB10;
> +  else if (internalFormat == GL_RGBA)
> + internalFormat = GL_RGB10_A2;
> +   }
> +
> /* search table for internalFormat */
> for (i = 0; i < ARRAY_SIZE(format_map); i++) {
>const struct format_mapping *mapping = &format_map[i];
> --
> 2.17.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
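
The promotion step in the patch above can be tried outside of Mesa with a
small sketch. The helper name promote_unsized_2101010() is hypothetical;
the enum values are the ones from the standard GL headers, redefined here
only so the snippet compiles on its own.

```c
#include <assert.h>

/* Enum values as found in the standard OpenGL headers. */
#define GL_UNSIGNED_BYTE                0x1401
#define GL_RGB                          0x1907
#define GL_RGBA                         0x1908
#define GL_RGB10                        0x8052
#define GL_RGB10_A2                     0x8059
#define GL_UNSIGNED_INT_2_10_10_10_REV  0x8368

typedef unsigned int GLenum;

/* Hypothetical helper: map an unsized RGB/RGBA internal format to its
 * sized 10-bit counterpart when the upload type is 2_10_10_10_REV, so
 * that a later format lookup ends up on a 2101010 format.
 */
static GLenum
promote_unsized_2101010(GLenum internal_format, GLenum type)
{
   if (type != GL_UNSIGNED_INT_2_10_10_10_REV)
      return internal_format;

   if (internal_format == GL_RGB)
      return GL_RGB10;
   if (internal_format == GL_RGBA)
      return GL_RGB10_A2;

   /* Sized or unrelated formats are left untouched. */
   return internal_format;
}
```

Any other type, or an already sized internal format, passes through
unchanged, so the table search that follows behaves as before for those.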


Re: [Mesa-dev] [PATCH 09/10] gallium: add PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT

2018-04-25 Thread Marek Olšák
On Wed, Apr 25, 2018 at 5:29 PM, Roland Scheidegger 
wrote:

> Am 25.04.2018 um 23:16 schrieb Marek Olšák:
> > From: Marek Olšák 
> >
> > ---
> >  src/gallium/docs/source/screen.rst   | 3 +++
> >  src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
> >  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
> >  src/gallium/drivers/i915/i915_screen.c   | 1 +
> >  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
> >  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
> >  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
> >  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
> >  src/gallium/drivers/r300/r300_screen.c   | 1 +
> >  src/gallium/drivers/r600/r600_pipe.c | 1 +
> >  src/gallium/drivers/radeonsi/si_get.c| 3 +++
> >  src/gallium/drivers/softpipe/sp_screen.c | 1 +
> >  src/gallium/drivers/svga/svga_screen.c   | 1 +
> >  src/gallium/drivers/swr/swr_screen.cpp   | 1 +
> >  src/gallium/drivers/vc4/vc4_screen.c | 1 +
> >  src/gallium/drivers/vc5/vc5_screen.c | 1 +
> >  src/gallium/drivers/virgl/virgl_screen.c | 1 +
> >  src/gallium/include/pipe/p_defines.h | 1 +
> >  18 files changed, 22 insertions(+)
> >
> > diff --git a/src/gallium/docs/source/screen.rst
> b/src/gallium/docs/source/screen.rst
> > index 3837360fb40..7cc6d378306 100644
> > --- a/src/gallium/docs/source/screen.rst
> > +++ b/src/gallium/docs/source/screen.rst
> > @@ -413,20 +413,23 @@ The integer capabilities:
> >supported priority levels.  A driver that does not support prioritized
> >contexts can return 0.
> >  * ``PIPE_CAP_FENCE_SIGNAL``: True if the driver supports signaling
> semaphores
> >using fence_server_signal().
> >  * ``PIPE_CAP_CONSTBUF0_FLAGS``: The bits of pipe_resource::flags that
> must be
> >set when binding that buffer as constant buffer 0. If the buffer
> doesn't have
> >those bits set, pipe_context::set_constant_buffer(.., 0, ..) is
> ignored
> >by the driver, and the driver can throw assertion failures.
> >  * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed
> uniforms
> >as opposed to padding to vec4s.
> > +* ``PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT``: The minimum supported
> alignment of
> > +  the user_stride parameter of transfer_map. If 0, the user-specified
> stride
> > +  is unsupported and the user_stride parameter is ignored.
> Does this really make a whole lot of sense? What if the minimum stride
> natively supported isn't always the same? What happens if the stride
> requested is larger than what the hw usually would do, does that need to
> be honored as well - it certainly looks like the cap query here wouldn't
> answer that?
>

Correct. The CAP query is only used by the test. I don't think gralloc
cares what you (or anyone) support.

Marek
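
To make the documented semantics concrete, here is a minimal sketch of how
a caller might interpret the cap value. The helper
user_stride_is_acceptable() is hypothetical, and the lower bound against
the row size is an assumption of this sketch. The cap as documented only
covers alignment, which is exactly the gap Roland points out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper.
 * alignment_cap: value returned for PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT
 * user_stride:   stride in bytes the caller would like to get
 * row_bytes:     bytes actually occupied by one row of the mapped box
 */
static bool
user_stride_is_acceptable(unsigned alignment_cap, unsigned user_stride,
                          unsigned row_bytes)
{
   /* 0 means user strides are unsupported and the parameter is ignored. */
   if (alignment_cap == 0)
      return false;

   /* Assumption of this sketch: rows must not overlap. */
   if (user_stride < row_bytes)
      return false;

   return user_stride % alignment_cap == 0;
}
```

Nothing here says whether a stride larger than the native one must be
honored; the cap query alone cannot answer that.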


Re: [Mesa-dev] [PATCH 09/10] gallium: add PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT

2018-04-25 Thread Roland Scheidegger
Am 25.04.2018 um 23:16 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> ---
>  src/gallium/docs/source/screen.rst   | 3 +++
>  src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c   | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
>  src/gallium/drivers/r300/r300_screen.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeonsi/si_get.c| 3 +++
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c   | 1 +
>  src/gallium/drivers/swr/swr_screen.cpp   | 1 +
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/drivers/vc5/vc5_screen.c | 1 +
>  src/gallium/drivers/virgl/virgl_screen.c | 1 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  18 files changed, 22 insertions(+)
> 
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index 3837360fb40..7cc6d378306 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -413,20 +413,23 @@ The integer capabilities:
>supported priority levels.  A driver that does not support prioritized
>contexts can return 0.
>  * ``PIPE_CAP_FENCE_SIGNAL``: True if the driver supports signaling semaphores
>using fence_server_signal().
>  * ``PIPE_CAP_CONSTBUF0_FLAGS``: The bits of pipe_resource::flags that must be
>set when binding that buffer as constant buffer 0. If the buffer doesn't 
> have
>those bits set, pipe_context::set_constant_buffer(.., 0, ..) is ignored
>by the driver, and the driver can throw assertion failures.
>  * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms
>as opposed to padding to vec4s.
> +* ``PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT``: The minimum supported 
> alignment of
> +  the user_stride parameter of transfer_map. If 0, the user-specified stride
> +  is unsupported and the user_stride parameter is ignored.
Does this really make a whole lot of sense? What if the minimum stride
natively supported isn't always the same? What happens if the stride
requested is larger than what the hw usually would do, does that need to
be honored as well - it certainly looks like the cap query here wouldn't
answer that?

Roland




>  
>  
>  .. _pipe_capf:
>  
>  PIPE_CAPF_*
>  
>  
>  The floating-point capabilities are:
>  
>  * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index b0f8b4bebe3..915e7d7da7d 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -267,20 +267,21 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_MEMOBJ:
> case PIPE_CAP_LOAD_CONSTBUF:
> case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
> case PIPE_CAP_TILE_RASTER_ORDER:
> case PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES:
> case PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET:
> case PIPE_CAP_CONTEXT_PRIORITY_MASK:
> case PIPE_CAP_FENCE_SIGNAL:
> case PIPE_CAP_CONSTBUF0_FLAGS:
> case PIPE_CAP_PACKED_UNIFORMS:
> +   case PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT:
>return 0;
>  
> /* Stream output. */
> case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
> case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
> case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
> case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
>return 0;
>  
> /* Geometry shader output, unsupported. */
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index f338d756dfe..dd052a22f25 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -333,20 +333,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
>   case PIPE_CAP_QUERY_SO_OVERFLOW:
>   case PIPE_CAP_MEMOBJ:
>   case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
>   case PIPE_CAP_TILE_RASTER_ORDER:
>   case PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES:
>   case PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET:
>   case PIPE_CAP_FENCE_SIGNAL:
>   case PIPE_CAP_CONSTBUF0_FLAGS:
>   case PIPE_CAP_PACKED_UNIFORMS:
> + case PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT:
>   return 0;
>  
>   case PIPE_CAP_CONTEXT_PRIORITY_MASK:
>   return screen->priority_mask;
>  
>   case PIPE_CAP_DRAW_INDIRECT:
>  

Re: [Mesa-dev] [PATCH v2 4/5] i965/clear: Simplify updating the indirect depth value

2018-04-25 Thread Rafael Antognolli
On Tue, Apr 24, 2018 at 05:48:45PM -0700, Nanley Chery wrote:
> Determine the predicate for updating the indirect depth value in the
> loop which inspects whether or not we need to resolve any slices.
> ---
>  src/mesa/drivers/dri/i965/brw_clear.c | 43 
> +--
>  1 file changed, 16 insertions(+), 27 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index 6521141d7f6..e372d28926e 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> struct intel_mipmap_tree *mt = depth_irb->mt;
> struct gl_renderbuffer_attachment *depth_att = 
> &fb->Attachment[BUFFER_DEPTH];
> const struct gen_device_info *devinfo = &brw->screen->devinfo;
> -   bool same_clear_value = true;
>  
> if (devinfo->gen < 6)
>return false;
> @@ -174,9 +173,16 @@ brw_fast_clear_depth(struct gl_context *ctx)
> const uint32_t num_layers = depth_att->Layered ? depth_irb->layer_count : 
> 1;
>  
> /* If we're clearing to a new clear value, then we need to resolve any 
> clear
> -* flags out of the HiZ buffer into the real depth buffer.
> +* flags out of the HiZ buffer into the real depth buffer and update the
> +* miptree's clear value.
>  */

I got confused by this comment here. I think your addition to the
comment is fine, but the original one wasn't very descriptive of what's
going on (at least it wasn't obvious to me).

Since you are already changing it, maybe we can improve it to something
like:

/* If we are clearing to a new clear value, the levels/layers being
 * cleared don't need resolving because they will stay in the clear
 * state, and only the miptree's clear value needs updating. However, if
 * some levels/layers were already in a clear state, but are not being
 * cleared now, and the clear value is changing, then we need to resolve
 * their clear flags out of the HiZ buffer into the real depth buffer.
 */

I'm not sure if this actually helps or if it just makes the comment
unnecessarily complex.

On a second thought, this doesn't need to be changed in this commit if
you don't want to. We can just send a new one later clarifying these
points, and we could also update the comment where the resolve happens
to clarify that it should only happen to layers not being cleared now.

In any case, this patch is a nice cleanup.

Reviewed-by: Rafael Antognolli 

> if (mt->fast_clear_color.f32[0] != clear_value) {
> +  /* BLORP updates the indirect clear color buffer when we do fast 
> clears.
> +   * If we won't do a fast clear, we'll have to update it ourselves. 
> Start
> +   * off assuming we won't perform a fast clear.
> +   */
> +  bool blorp_will_update_indirect_color = false;
> +
>for (uint32_t level = mt->first_level; level <= mt->last_level; 
> level++) {
>   if (!intel_miptree_level_has_hiz(mt, level))
>  continue;
> @@ -184,16 +190,20 @@ brw_fast_clear_depth(struct gl_context *ctx)
>   const unsigned level_layers = brw_get_num_logical_layers(mt, level);
>  
>   for (uint32_t layer = 0; layer < level_layers; layer++) {
> +const enum isl_aux_state aux_state =
> +   intel_miptree_get_aux_state(mt, level, layer);
> +
>  if (level == depth_irb->mt_level &&
>  layer >= depth_irb->mt_layer &&
>  layer < depth_irb->mt_layer + num_layers) {
> +
> +   if (aux_state != ISL_AUX_STATE_CLEAR)
> +  blorp_will_update_indirect_color = true;
> +
> /* We're going to clear this layer anyway.  Leave it alone. */
> continue;
>  }
>  
> -enum isl_aux_state aux_state =
> -   intel_miptree_get_aux_state(mt, level, layer);
> -
>  if (aux_state != ISL_AUX_STATE_CLEAR &&
>  aux_state != ISL_AUX_STATE_COMPRESSED_CLEAR) {
> /* This slice doesn't have any fast-cleared bits. */
> @@ -214,29 +224,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
>}
>  
>intel_miptree_set_depth_clear_value(brw, mt, clear_value);
> -  same_clear_value = false;
> -   }
> -
> -   bool need_clear = false;
> -   for (unsigned a = 0; a < num_layers; a++) {
> -  enum isl_aux_state aux_state =
> - intel_miptree_get_aux_state(mt, depth_irb->mt_level,
> - depth_irb->mt_layer + a);
> -
> -  if (aux_state != ISL_AUX_STATE_CLEAR) {
> - need_clear = true;
> - break;
> -  }
> -   }
> -
> -   if (!need_clear) {
> -  if (!same_clear_value) {
> - /* BLORP updates the indirect clear color buffer when performing a
> -  * fast clear. Since we are skipping the fast clear here, we need to
> -  * do the update ourselves.
> -  */
> +  if 

[Mesa-dev] [PATCH 10/10] gallium/u_tests: test user-specified transfer stride

2018-04-25 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_tests.c | 62 
 1 file changed, 62 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_tests.c 
b/src/gallium/auxiliary/util/u_tests.c
index e7d11ce117e..32c543027cb 100644
--- a/src/gallium/auxiliary/util/u_tests.c
+++ b/src/gallium/auxiliary/util/u_tests.c
@@ -778,35 +778,97 @@ test_texture_barrier(struct pipe_context *ctx, bool 
use_fbfetch,
/* Cleanup. */
cso_destroy_context(cso);
ctx->delete_vs_state(ctx, vs);
ctx->delete_fs_state(ctx, fs);
pipe_sampler_view_reference(&view, NULL);
pipe_resource_reference(&cb, NULL);
 
util_report_result_helper(pass, name);
 }
 
+/* Write a value into a texture, or read and check a value inside a texture,
+ * while using a user-specified stride.
+ */
+static bool
+test_one_transfer(struct pipe_context *ctx, struct pipe_resource *tex,
+  unsigned x, unsigned y, unsigned value,
+  unsigned user_stride, bool read)
+{
+   struct pipe_transfer *transfer = NULL;
+   struct pipe_box box;
+   uint32_t *map;
+
+   assert(y >= 1);
+   u_box_2d(x, y - 1, 1, 2, &box);
+   map = ctx->transfer_map(ctx, tex, 0,
+   read ? PIPE_TRANSFER_READ : PIPE_TRANSFER_WRITE,
+   &box, user_stride, &transfer);
+
+   if (transfer->stride != user_stride) {
+  ctx->transfer_unmap(ctx, transfer);
+  return false;
+   }
+
+   if (read) {
+  if (map[transfer->stride / 4] != value) {
+ ctx->transfer_unmap(ctx, transfer);
+ return false;
+  }
+   } else {
+  map[transfer->stride / 4] = value;
+   }
+
+   ctx->transfer_unmap(ctx, transfer);
+   return true;
+}
+
+static void
+test_transfer_user_stride(struct pipe_context *ctx)
+{
+   struct pipe_screen *screen = ctx->screen;
+   struct pipe_resource *tex =
+  util_create_texture2d(screen, 320, 240, PIPE_FORMAT_R8G8B8A8_UNORM, 1);
+   unsigned stride =
+  screen->get_param(screen, PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT);
+   bool status = true;
+
+   /* Write pixels. Strides are in the ascending order. */
+   for (unsigned i = 0; i < 20; i++) {
+  status = status && test_one_transfer(ctx, tex, 2 + i, 2 + i, i + 1,
+   stride * (i + 1), false);
+   }
+   /* Read pixels and compare values. Strides are in the descending order. */
+   for (unsigned i = 0; i < 20; i++) {
+  status = status && test_one_transfer(ctx, tex, 2 + i, 2 + i, i + 1,
+   stride * (20 - i), true);
+   }
+
+   pipe_resource_reference(&tex, NULL);
+   util_report_result(status);
+}
+
 /**
  * Run all tests. This should be run with a clean context after
  * context_create.
  */
 void
 util_run_tests(struct pipe_screen *screen)
 {
struct pipe_context *ctx = screen->context_create(screen, NULL, 0);
 
null_fragment_shader(ctx);
tgsi_vs_window_space_position(ctx);
null_sampler_view(ctx, TGSI_TEXTURE_2D);
null_sampler_view(ctx, TGSI_TEXTURE_BUFFER);
util_test_constant_buffer(ctx, NULL);
test_sync_file_fences(ctx);
+   test_transfer_user_stride(ctx);
 
for (int i = 1; i <= 8; i = i * 2)
   test_texture_barrier(ctx, false, i);
for (int i = 1; i <= 8; i = i * 2)
   test_texture_barrier(ctx, true, i);
 
ctx->destroy(ctx);
 
puts("Done. Exiting..");
exit(0);
-- 
2.17.0
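
The indexing used by test_one_transfer() may be easier to follow as plain
arithmetic. The test maps a box that is 1 pixel wide and 2 rows tall
starting at (x, y - 1), so map[stride / 4] is the first pixel of the second
row, i.e. pixel (x, y). A minimal sketch, with pixel_index() being a
hypothetical helper and 4 bytes per pixel assumed as for
PIPE_FORMAT_R8G8B8A8_UNORM:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index, in 32-bit elements, of the pixel at
 * (row, col) inside a mapped box whose rows are stride_bytes apart.
 */
static size_t
pixel_index(unsigned stride_bytes, unsigned row, unsigned col)
{
   return (size_t)row * (stride_bytes / sizeof(uint32_t)) + col;
}
```

Because the index depends only on the stride the mapping reports, writing
with one stride and reading back with another should still hit the same
texel as long as the driver honors both strides.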



[Mesa-dev] [PATCH 05/10] st/dri: implement __DRI_IMAGE_TRANSFER_MAP_USER_STRIDE

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

---
 src/gallium/state_trackers/dri/dri2.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 58a6757f037..b9c09fbabd0 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -1627,32 +1627,38 @@ dri2_blit_image(__DRIcontext *context, __DRIimage *dst, 
__DRIimage *src,
 
 static void *
 dri2_map_image(__DRIcontext *context, __DRIimage *image,
 int x0, int y0, int width, int height,
 unsigned int flags, int *stride, void **data)
 {
struct dri_context *ctx = dri_context(context);
struct pipe_context *pipe = ctx->st->pipe;
enum pipe_transfer_usage pipe_access = 0;
struct pipe_transfer *trans;
+   struct pipe_box box;
+   unsigned user_stride = 0;
void *map;
 
if (!image || !data || *data)
   return NULL;
 
if (flags & __DRI_IMAGE_TRANSFER_READ)
- pipe_access |= PIPE_TRANSFER_READ;
+  pipe_access |= PIPE_TRANSFER_READ;
if (flags & __DRI_IMAGE_TRANSFER_WRITE)
- pipe_access |= PIPE_TRANSFER_WRITE;
+  pipe_access |= PIPE_TRANSFER_WRITE;
+   if (flags & __DRI_IMAGE_TRANSFER_USER_STRIDE)
+  user_stride = *stride;
 
-   map = pipe_transfer_map(pipe, image->texture,
-   0, 0, pipe_access, x0, y0, width, height,
+   u_box_2d(x0, y0, width, height, &box);
+
+   map = pipe->transfer_map(pipe, image->texture,
+   0, pipe_access, &box, user_stride,
 &trans);
if (map) {
   *data = trans;
   *stride = trans->stride;
}
 
return map;
 }
 
 static void
@@ -1661,27 +1667,26 @@ dri2_unmap_image(__DRIcontext *context, __DRIimage 
*image, void *data)
struct dri_context *ctx = dri_context(context);
struct pipe_context *pipe = ctx->st->pipe;
 
pipe_transfer_unmap(pipe, (struct pipe_transfer *)data);
 }
 
 static int
 dri2_get_capabilities(__DRIscreen *_screen)
 {
struct dri_screen *screen = dri_screen(_screen);
-
return (screen->can_share_buffer ? __DRI_IMAGE_CAP_GLOBAL_NAMES : 0);
 }
 
 /* The extension is modified during runtime if DRI_PRIME is detected */
 static __DRIimageExtension dri2ImageExtension = {
-.base = { __DRI_IMAGE, 17 },
+.base = { __DRI_IMAGE, 18 },
 
 .createImageFromName  = dri2_create_image_from_name,
 .createImageFromRenderbuffer  = dri2_create_image_from_renderbuffer,
 .destroyImage = dri2_destroy_image,
 .createImage  = dri2_create_image,
 .queryImage   = dri2_query_image,
 .dupImage = dri2_dup_image,
 .validateUsage= dri2_validate_usage,
 .createImageFromNames = dri2_from_names,
 .fromPlanar   = dri2_from_planar,
-- 
2.17.0
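
From the caller's side, the stride argument becomes an in/out parameter
with this flag: the desired stride goes in, the stride actually used comes
out. Since ignoring USER_STRIDE is a valid implementation of mapImage, the
caller has to compare the two. A rough sketch with a mock in place of the
real entry point; mock_map_image(), map_with_stride() and the flag value
are all hypothetical and only model the contract:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value, for illustration only. */
#define MOCK_TRANSFER_USER_STRIDE 0x4

/* Mock of a mapImage-like call: *stride is read as the request when the
 * flag is set and is always overwritten with the stride in effect.
 */
static void
mock_map_image(unsigned flags, int natural_stride, bool honors_user_stride,
               int *stride)
{
   int requested = (flags & MOCK_TRANSFER_USER_STRIDE) ? *stride : 0;

   *stride = (requested && honors_user_stride) ? requested : natural_stride;
}

/* Caller side: ask for a stride and report whether it was honored. */
static bool
map_with_stride(int wanted, int natural_stride, bool honors_user_stride,
                int *actual)
{
   *actual = wanted;
   mock_map_image(MOCK_TRANSFER_USER_STRIDE, natural_stride,
                  honors_user_stride, actual);
   return *actual == wanted;
}
```

A caller that cannot cope with a different stride would have to unmap and
fall back to a copy when the check fails.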



[Mesa-dev] [PATCH 03/10] gallium: use pipe_transfer_map_box inline helper

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

We will change pipe_context::transfer_map in a subsequent commit. Wrapping
it in an inline function makes that subsequent change less noisy.
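
The idea can be illustrated with a mock, assuming the later change adds a
user_stride parameter to the hook as the rest of the series suggests. All
names below (mock_context, mock_transfer_map, mock_transfer_map_box) are
hypothetical; the sketch shows the state after the signature change, where
only the wrapper had to be touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a context with a transfer_map hook whose
 * signature has just grown a user_stride parameter.
 */
struct mock_context {
   void *(*transfer_map)(struct mock_context *ctx, unsigned level,
                         unsigned user_stride);
   unsigned last_user_stride;
   char storage[16];
};

static void *
mock_transfer_map(struct mock_context *ctx, unsigned level,
                  unsigned user_stride)
{
   (void)level;
   ctx->last_user_stride = user_stride;
   return ctx->storage;
}

/* Callers go through this wrapper, so only this one place needs to know
 * about the new parameter; it passes 0 for "no user stride".
 */
static inline void *
mock_transfer_map_box(struct mock_context *ctx, unsigned level)
{
   return ctx->transfer_map(ctx, level, 0);
}
```

Call sites that do want a user stride would call the hook directly, which
keeps the diff of the signature change limited to those few places.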
---
 src/gallium/auxiliary/util/u_inlines.h   | 16 
 src/gallium/auxiliary/util/u_surface.c   |  4 ++--
 src/gallium/auxiliary/util/u_transfer.c  |  4 ++--
 src/gallium/auxiliary/util/u_transfer_helper.c   |  2 +-
 src/gallium/auxiliary/vl/vl_idct.c   |  2 +-
 src/gallium/auxiliary/vl/vl_mpeg12_decoder.c |  2 +-
 src/gallium/auxiliary/vl/vl_zscan.c  |  4 ++--
 src/gallium/drivers/r600/compute_memory_pool.c   |  6 +++---
 src/gallium/drivers/r600/evergreen_compute.c |  2 +-
 .../state_trackers/clover/core/resource.cpp  |  2 +-
 src/gallium/state_trackers/nine/buffer9.c|  2 +-
 src/gallium/state_trackers/nine/device9.c|  6 +++---
 src/gallium/state_trackers/nine/nine_state.c |  4 ++--
 src/gallium/state_trackers/nine/surface9.c   |  6 +++---
 src/gallium/state_trackers/nine/volume9.c|  4 ++--
 .../state_trackers/omx/bellagio/vid_enc.c|  2 +-
 src/gallium/state_trackers/osmesa/osmesa.c   |  4 ++--
 src/gallium/state_trackers/va/buffer.c   |  2 +-
 src/gallium/state_trackers/va/image.c|  4 ++--
 src/gallium/state_trackers/va/surface.c  |  2 +-
 src/gallium/state_trackers/vdpau/output.c|  2 +-
 src/gallium/state_trackers/vdpau/surface.c   |  4 ++--
 src/gallium/state_trackers/xvmc/subpicture.c |  6 +++---
 src/gallium/tests/trivial/compute.c  |  4 ++--
 src/gallium/tests/trivial/quad-tex.c |  2 +-
 src/mesa/state_tracker/st_texture.c  |  2 +-
 26 files changed, 58 insertions(+), 42 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index 4bd9b7e3c62..b7a28568807 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -448,20 +448,36 @@ pipe_buffer_read(struct pipe_context *pipe,
 PIPE_TRANSFER_READ,
 &src_transfer);
if (!map)
   return;
 
memcpy(data, map, size);
pipe_buffer_unmap(pipe, src_transfer);
 }
 
 
+/**
+ * Map a resource for reading/writing.
+ */
+static inline void *
+pipe_transfer_map_box(struct pipe_context *context,
+  struct pipe_resource *resource,
+  unsigned level,
+  enum pipe_transfer_usage usage,
+  const struct pipe_box *box,
+  struct pipe_transfer **out_transfer)
+{
+   return context->transfer_map(context, resource, level, usage, box,
+out_transfer);
+}
+
+
 /**
  * Map a resource for reading/writing.
  * \param access  bitmask of PIPE_TRANSFER_x flags
  */
 static inline void *
 pipe_transfer_map(struct pipe_context *context,
   struct pipe_resource *resource,
   unsigned level, unsigned layer,
   unsigned access,
   unsigned x, unsigned y,
diff --git a/src/gallium/auxiliary/util/u_surface.c 
b/src/gallium/auxiliary/util/u_surface.c
index 5f07eb1cdac..c28c93abd4c 100644
--- a/src/gallium/auxiliary/util/u_surface.c
+++ b/src/gallium/auxiliary/util/u_surface.c
@@ -330,31 +330,31 @@ util_resource_copy_region(struct pipe_context *pipe,
/* check that region boxes are not out of bounds */
assert(src_box.x + src_box.width <= (int)u_minify(src->width0, src_level));
assert(src_box.y + src_box.height <= (int)u_minify(src->height0, 
src_level));
assert(dst_box.x + dst_box.width <= (int)u_minify(dst->width0, dst_level));
assert(dst_box.y + dst_box.height <= (int)u_minify(dst->height0, 
dst_level));
 
/* check that total number of src, dest bytes match */
assert((src_box.width / src_bw) * (src_box.height / src_bh) * src_bs ==
   (dst_box.width / dst_bw) * (dst_box.height / dst_bh) * dst_bs);
 
-   src_map = pipe->transfer_map(pipe,
+   src_map = pipe_transfer_map_box(pipe,
 src,
 src_level,
 PIPE_TRANSFER_READ,
 &src_box, &src_trans);
assert(src_map);
if (!src_map) {
   goto no_src_map;
}
 
-   dst_map = pipe->transfer_map(pipe,
+   dst_map = pipe_transfer_map_box(pipe,
 dst,
 dst_level,
 PIPE_TRANSFER_WRITE |
 PIPE_TRANSFER_DISCARD_RANGE, &dst_box,
 &dst_trans);
assert(dst_map);
if (!dst_map) {
   goto no_dst_map;
}
 
diff --git a/src/gallium/auxiliary/util/u_transfer.c 
b/src/gallium/auxiliary/util/u_transfer.c
index 3089bcb1f34..0e0c4cc91cd 100644
--- a/src/gallium/auxiliary/util/u_transfer.c
+++ 

[Mesa-dev] [PATCH 07/10] ac/surface: don't apply the 256-byte alignment to staging surfaces

2018-04-25 Thread Marek Olšák
From: Marek Olšák 

Having the over-alignment on staging surfaces breaks the user_stride
mechanism.

v2: Add a new SURF flag.
---
 src/amd/common/ac_surface.c   | 5 -
 src/amd/common/ac_surface.h   | 1 +
 src/gallium/drivers/radeonsi/si_texture.c | 3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index a23952717e3..9595298055b 100644
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -266,22 +266,25 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
 {
struct legacy_surf_level *surf_level;
ADDR_E_RETURNCODE ret;
 
AddrSurfInfoIn->mipLevel = level;
AddrSurfInfoIn->width = u_minify(config->info.width, level);
AddrSurfInfoIn->height = u_minify(config->info.height, level);
 
/* Make GFX6 linear surfaces compatible with GFX9 for hybrid graphics,
 * because GFX9 needs linear alignment of 256 bytes.
+*
+* This should not be applied to staging surfaces.
 */
-   if (config->info.levels == 1 &&
+   if (!(surf->flags & RADEON_SURF_TRANSFER_STAGING) &&
+   config->info.levels == 1 &&
AddrSurfInfoIn->tileMode == ADDR_TM_LINEAR_ALIGNED &&
AddrSurfInfoIn->bpp) {
unsigned alignment = 256 / (AddrSurfInfoIn->bpp / 8);
 
assert(util_is_power_of_two_or_zero(AddrSurfInfoIn->bpp));
AddrSurfInfoIn->width = align(AddrSurfInfoIn->width, alignment);
}
 
if (config->is_3d)
AddrSurfInfoIn->numSlices = u_minify(config->info.depth, level);
diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
index 37df859e6de..4060b84edab 100644
--- a/src/amd/common/ac_surface.h
+++ b/src/amd/common/ac_surface.h
@@ -61,20 +61,21 @@ enum radeon_micro_mode {
 #define RADEON_SURF_ZBUFFER (1 << 17)
 #define RADEON_SURF_SBUFFER (1 << 18)
 #define RADEON_SURF_Z_OR_SBUFFER(RADEON_SURF_ZBUFFER | 
RADEON_SURF_SBUFFER)
 /* bits 19 and 20 are reserved for libdrm_radeon, don't use them */
 #define RADEON_SURF_FMASK   (1 << 21)
 #define RADEON_SURF_DISABLE_DCC (1 << 22)
 #define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
 #define RADEON_SURF_IMPORTED(1 << 24)
 #define RADEON_SURF_OPTIMIZE_FOR_SPACE  (1 << 25)
 #define RADEON_SURF_SHAREABLE   (1 << 26)
+#define RADEON_SURF_TRANSFER_STAGING(1 << 27)
 
 struct legacy_surf_level {
 uint64_toffset;
 uint32_tslice_size_dw; /* in dwords; max = 4GB / 4. */
 uint32_tdcc_offset; /* relative offset within DCC mip 
tree */
 uint32_tdcc_fast_clear_size;
 unsignednblk_x:15;
 unsignednblk_y:15;
 enum radeon_surf_mode   mode:2;
 };
diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index 368fb034977..4ac284bb9d4 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -295,20 +295,23 @@ static int si_init_surface(struct si_screen *sscreen,
flags |= RADEON_SURF_SCANOUT;
}
 
if (ptex->bind & PIPE_BIND_SHARED)
flags |= RADEON_SURF_SHAREABLE;
if (is_imported)
flags |= RADEON_SURF_IMPORTED | RADEON_SURF_SHAREABLE;
if (!(ptex->flags & SI_RESOURCE_FLAG_FORCE_TILING))
flags |= RADEON_SURF_OPTIMIZE_FOR_SPACE;
 
+   if (ptex->flags & SI_RESOURCE_FLAG_TRANSFER)
+   flags |= RADEON_SURF_TRANSFER_STAGING;
+
r = sscreen->ws->surface_init(sscreen->ws, ptex, flags, bpe,
  array_mode, surface);
if (r) {
return r;
}
 
unsigned pitch = pitch_in_bytes_override / bpe;
 
if (sscreen->info.chip_class >= GFX9) {
if (pitch) {
-- 
2.17.0
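
For readers following the arithmetic, the over-alignment that this patch skips for staging surfaces can be sketched as below. This is only a standalone illustration, not the ac_surface code: `align_pot` and `linear_width_aligned` are hypothetical names, and the 256-byte figure is taken from the comment in the hunk above.

```c
#include <stdbool.h>

/* Round v up to a multiple of a; a is assumed to be a power of two. */
static unsigned align_pot(unsigned v, unsigned a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: width in pixels after the 256-byte over-alignment.
 * bpp is bits per pixel, assumed to be a power of two.
 * Staging surfaces (and bpp < 8, to avoid dividing by zero) keep their width. */
static unsigned linear_width_aligned(unsigned width, unsigned bpp,
                                     bool is_staging)
{
   if (is_staging || bpp < 8)
      return width;

   unsigned alignment = 256 / (bpp / 8); /* pixels per 256 bytes */
   return align_pot(width, alignment);
}
```

With 32 bpp the alignment is 64 pixels, so a 100-pixel-wide linear surface is padded to 128 pixels unless it is a staging surface.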

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/10] dri_interface: document error behavior of mapImage

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

This function is meant to return NULL on error, unlike some other APIs
(such as mmap()), which return MAP_FAILED.
---
 include/GL/internal/dri_interface.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 319a1fe4f90..07dfd74f9d8 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1557,20 +1557,22 @@ struct __DRIimageExtensionRec {
 * flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the
 * mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ
 * is not included in the flags, the buffer content at map time is
 * undefined. Users wanting to modify the mapping must include
 * __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not
 * included, behaviour when writing the mapping is undefined.
 *
 * Returns the byte stride in *stride, and an opaque pointer to data
 * tracking the mapping in **data, which must be passed to unmapImage().
 *
+* Returns NULL on error.
+*
 * \since 12
 */
void *(*mapImage)(__DRIcontext *context, __DRIimage *image,
  int x0, int y0, int width, int height,
  unsigned int flags, int *stride, void **data);
 
/**
 * Unmap a previously mapped __DRIimage
 *
 * \since 12
-- 
2.17.0
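
To illustrate why the distinction matters, here is a small sketch of the two error conventions. The helper names are hypothetical and exist only for this example; the only facts assumed are that mmap() reports failure with MAP_FAILED and that mapImage reports failure with NULL, as documented by this patch.

```c
#include <stddef.h>
#include <sys/mman.h>

/* mmap() reports failure with MAP_FAILED, which is not NULL. */
static int mmap_succeeded(const void *ptr)
{
   return ptr != MAP_FAILED;
}

/* mapImage reports failure with NULL (hypothetical helper name). */
static int map_image_succeeded(const void *ptr)
{
   return ptr != NULL;
}
```

Applying the mmap-style check to a mapImage result would treat a NULL return as a valid mapping, which is the mistake the added comment is meant to prevent.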



[Mesa-dev] [PATCH 04/10] gallium: add user_stride parameter to pipe_context::transfer_map

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

Allow callers to prescribe a desired stride for a transfer. Drivers
are free to ignore this new parameter.

There is no new capability because it's unclear how strict requirements
on this feature should be expressed.
---
 src/gallium/auxiliary/driver_ddebug/dd_draw.c| 3 ++-
 src/gallium/auxiliary/driver_noop/noop_pipe.c| 1 +
 src/gallium/auxiliary/driver_rbug/rbug_context.c | 5 -
 src/gallium/auxiliary/driver_trace/tr_context.c  | 3 ++-
 src/gallium/auxiliary/util/u_inlines.h   | 8 
 src/gallium/auxiliary/util/u_threaded_context.c  | 5 +++--
 src/gallium/auxiliary/util/u_transfer.c  | 3 ++-
 src/gallium/auxiliary/util/u_transfer.h  | 2 ++
 src/gallium/auxiliary/util/u_transfer_helper.c   | 1 +
 src/gallium/auxiliary/util/u_transfer_helper.h   | 1 +
 src/gallium/drivers/etnaviv/etnaviv_transfer.c   | 1 +
 src/gallium/drivers/i915/i915_resource_buffer.c  | 1 +
 src/gallium/drivers/i915/i915_resource_texture.c | 1 +
 src/gallium/drivers/llvmpipe/lp_texture.c| 1 +
 src/gallium/drivers/nouveau/nouveau_buffer.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_miptree.c  | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_transfer.c | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_transfer.c | 1 +
 src/gallium/drivers/r300/r300_screen_buffer.c| 1 +
 src/gallium/drivers/r300/r300_transfer.c | 1 +
 src/gallium/drivers/r300/r300_transfer.h | 1 +
 src/gallium/drivers/r600/evergreen_compute.c | 1 +
 src/gallium/drivers/r600/r600_buffer_common.c| 3 ++-
 src/gallium/drivers/r600/r600_texture.c  | 1 +
 src/gallium/drivers/radeonsi/si_buffer.c | 3 ++-
 src/gallium/drivers/radeonsi/si_texture.c| 1 +
 src/gallium/drivers/softpipe/sp_texture.c| 1 +
 src/gallium/drivers/svga/svga_resource_buffer.c  | 1 +
 src/gallium/drivers/svga/svga_resource_texture.c | 1 +
 src/gallium/drivers/vc4/vc4_resource.c   | 1 +
 src/gallium/drivers/virgl/virgl_buffer.c | 1 +
 src/gallium/drivers/virgl/virgl_texture.c| 1 +
 src/gallium/include/pipe/p_context.h | 7 +++
 33 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/driver_ddebug/dd_draw.c 
b/src/gallium/auxiliary/driver_ddebug/dd_draw.c
index cb5db8ab83b..125f6041324 100644
--- a/src/gallium/auxiliary/driver_ddebug/dd_draw.c
+++ b/src/gallium/auxiliary/driver_ddebug/dd_draw.c
@@ -1478,33 +1478,34 @@ dd_context_clear_texture(struct pipe_context *_pipe,
 }
 
 /
  * transfer
  */
 
 static void *
 dd_context_transfer_map(struct pipe_context *_pipe,
 struct pipe_resource *resource, unsigned level,
 unsigned usage, const struct pipe_box *box,
+unsigned user_stride,
 struct pipe_transfer **transfer)
 {
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_draw_record *record =
   dd_screen(dctx->base.screen)->transfers ? dd_create_record(dctx) : NULL;
 
if (record) {
   record->call.type = CALL_TRANSFER_MAP;
 
   dd_before_draw(dctx, record);
}
-   void *ptr = pipe->transfer_map(pipe, resource, level, usage, box, transfer);
+   void *ptr = pipe->transfer_map(pipe, resource, level, usage, box, 
user_stride, transfer);
if (record) {
   record->call.info.transfer_map.transfer_ptr = *transfer;
   record->call.info.transfer_map.ptr = ptr;
   if (*transfer) {
  record->call.info.transfer_map.transfer = **transfer;
  record->call.info.transfer_map.transfer.resource = NULL;
  
   pipe_resource_reference(&record->call.info.transfer_map.transfer.resource,
  (*transfer)->resource);
   } else {
   memset(&record->call.info.transfer_map.transfer, 0, sizeof(struct pipe_transfer));
diff --git a/src/gallium/auxiliary/driver_noop/noop_pipe.c 
b/src/gallium/auxiliary/driver_noop/noop_pipe.c
index d1e795dab16..cc74fcbd5df 100644
--- a/src/gallium/auxiliary/driver_noop/noop_pipe.c
+++ b/src/gallium/auxiliary/driver_noop/noop_pipe.c
@@ -167,20 +167,21 @@ static void noop_resource_destroy(struct pipe_screen 
*screen,
 
 
 /*
  * transfer
  */
 static void *noop_transfer_map(struct pipe_context *pipe,
struct pipe_resource *resource,
unsigned level,
enum pipe_transfer_usage usage,
const struct pipe_box *box,
+   unsigned user_stride,
struct pipe_transfer **ptransfer)
 {
struct pipe_transfer *transfer;
struct noop_resource *nresource = (struct noop_resource *)resource;
 
transfer = CALLOC_STRUCT(pipe_transfer);
if (!transfer)
   return NULL;
 pipe_resource_reference(&transfer->resource, resource);

[Mesa-dev] [PATCH 08/10] gallium/u_tests: test NULL in constbuf[1] instead of constbuf[0]

2018-04-25 Thread Marek Olšák
From: Marek Olšák 

radeonsi doesn't support constbuf[0] = NULL.
---
 src/gallium/auxiliary/util/u_tests.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_tests.c 
b/src/gallium/auxiliary/util/u_tests.c
index 293a4580a9f..e7d11ce117e 100644
--- a/src/gallium/auxiliary/util/u_tests.c
+++ b/src/gallium/auxiliary/util/u_tests.c
@@ -421,30 +421,30 @@ util_test_constant_buffer(struct pipe_context *ctx,
struct pipe_resource *cb;
void *fs, *vs;
bool pass = true;
static const float zero[] = {0, 0, 0, 0};
 
cso = cso_create_context(ctx, 0);
cb = util_create_texture2d(ctx->screen, 256, 256,
   PIPE_FORMAT_R8G8B8A8_UNORM, 0);
util_set_common_states_and_clear(cso, ctx, cb);
 
-   pipe_set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT, 0, constbuf);
+   pipe_set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT, 1, constbuf);
 
/* Fragment shader. */
{
   static const char *text = /* I don't like ureg... */
 "FRAG\n"
-"DCL CONST[0][0]\n"
+"DCL CONST[1][0]\n"
 "DCL OUT[0], COLOR\n"
 
-"MOV OUT[0], CONST[0][0]\n"
+"MOV OUT[0], CONST[1][0]\n"
 "END\n";
   struct tgsi_token tokens[1000];
   struct pipe_shader_state state;
 
   if (!tgsi_text_translate(text, tokens, ARRAY_SIZE(tokens))) {
  puts("Can't compile a fragment shader.");
  util_report_result(FAIL);
  return;
   }
    pipe_shader_state_from_tgsi(&state, tokens);
-- 
2.17.0



[Mesa-dev] [PATCH 00/10] DRI interface, gallium: User-specified transfer stride

2018-04-25 Thread Marek Olšák
Hi,

This feature is for gralloc, which requires drivers to be able to map
an image with a stride of its own choosing, which is usually the same
as the image stride. This is a very silly feature that probably comes
from designing around mobile GPUs, and must be emulated on everything
else.

GCN is pretty limited here. A 16x16 tiled image can have a stride of 16
because it's tiled, but the stride of a linear mapping of that image
will be 64 on <= Polaris, or 256 on >= Vega.

The hardware doesn't have the capability to give you a stride that is
not a multiple of 64 or 256, so gralloc will have to deal with it.

Nicolai wrote the first 6 commits. I added the last 4.
(so I'll put my Rb on the first 6)

Please review,

Thanks,
Marek
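
As a rough illustration of the numbers above, the stride of a linear mapping could be computed as follows. This is a sketch under assumptions: strides are taken to be in bytes, the helper names are made up for this example, and the 64/256 values are simply the ones quoted in this message.

```c
/* Round v up to a multiple of a; a is assumed to be a power of two. */
static unsigned align_pot(unsigned v, unsigned a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: byte stride of a linear mapping.
 * Assumes an alignment of 64 on <= Polaris and 256 on >= Vega. */
static unsigned linear_map_stride(unsigned width, unsigned bytes_per_pixel,
                                  int is_vega_or_later)
{
   unsigned alignment = is_vega_or_later ? 256 : 64;
   return align_pot(width * bytes_per_pixel, alignment);
}
```

For a 16-pixel-wide image at 1 byte per pixel this gives 64 on the older chips and 256 on the newer ones, matching the example in the text.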


[Mesa-dev] [PATCH 06/10] radeonsi: implement transfer_map with user_stride

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

The stride ends up being aligned by AddrLib in ways that are
inconvenient to express clearly, but basically, a stride that
is aligned to both 64 pixels and 256 bytes will go through
unchanged in practice.
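
A caller could test for that "safe" case along these lines. This is only a sketch of the rule of thumb stated above, not what AddrLib does internally, and `stride_likely_unchanged` is a hypothetical name.

```c
#include <stdbool.h>

/* Hypothetical check: true if the byte stride is aligned to both
 * 64 pixels and 256 bytes, the case described as passing through
 * unchanged in practice. */
static bool stride_likely_unchanged(unsigned stride_bytes,
                                    unsigned bytes_per_pixel)
{
   if (bytes_per_pixel == 0 || stride_bytes % bytes_per_pixel != 0)
      return false;

   unsigned stride_pixels = stride_bytes / bytes_per_pixel;
   return stride_pixels % 64 == 0 && stride_bytes % 256 == 0;
}
```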
---
 src/gallium/drivers/radeonsi/si_texture.c | 35 +++
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index 43f1560ec3e..368fb034977 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -170,43 +170,54 @@ static void si_copy_from_staging_texture(struct 
pipe_context *ctx, struct r600_t
   transfer->box.x, transfer->box.y, 
transfer->box.z,
   src, 0, );
return;
}
 
sctx->dma_copy(ctx, dst, transfer->level,
   transfer->box.x, transfer->box.y, transfer->box.z,
   src, 0, );
 }
 
+static unsigned si_texture_get_stride(struct si_screen *sscreen,
+ struct r600_texture *rtex,
+ unsigned level)
+{
+   if (sscreen->info.chip_class >= GFX9) {
+   return rtex->surface.u.gfx9.surf_pitch * rtex->surface.bpe;
+   } else {
+   return rtex->surface.u.legacy.level[level].nblk_x *
+  rtex->surface.bpe;
+   }
+}
+
 static unsigned si_texture_get_offset(struct si_screen *sscreen,
  struct r600_texture *rtex, unsigned level,
  const struct pipe_box *box,
  unsigned *stride,
  unsigned *layer_stride)
 {
+   *stride = si_texture_get_stride(sscreen, rtex, level);
+
if (sscreen->info.chip_class >= GFX9) {
-   *stride = rtex->surface.u.gfx9.surf_pitch * rtex->surface.bpe;
*layer_stride = rtex->surface.u.gfx9.surf_slice_size;
 
if (!box)
return 0;
 
/* Each texture is an array of slices. Each slice is an array
 * of mipmap levels. */
return box->z * rtex->surface.u.gfx9.surf_slice_size +
   rtex->surface.u.gfx9.offset[level] +
   (box->y / rtex->surface.blk_h *
rtex->surface.u.gfx9.surf_pitch +
box->x / rtex->surface.blk_w) * rtex->surface.bpe;
} else {
-   *stride = rtex->surface.u.legacy.level[level].nblk_x *
- rtex->surface.bpe;

assert((uint64_t)rtex->surface.u.legacy.level[level].slice_size_dw * 4 <= 
UINT_MAX);
*layer_stride = 
(uint64_t)rtex->surface.u.legacy.level[level].slice_size_dw * 4;
 
if (!box)
return rtex->surface.u.legacy.level[level].offset;
 
/* Each texture is an array of mipmap levels. Each level is
 * an array of slices. */
return rtex->surface.u.legacy.level[level].offset +
   box->z * 
(uint64_t)rtex->surface.u.legacy.level[level].slice_size_dw * 4 +
@@ -1686,21 +1697,23 @@ static void *si_texture_transfer_map(struct 
pipe_context *ctx,
 
/* Tiled textures need to be converted into a linear texture 
for CPU
 * access. The staging texture is always linear and is placed 
in GART.
 *
 * Reading from VRAM or GTT WC is slow, always use the staging
 * texture in this case.
 *
 * Use the staging texture for uploads if the underlying BO
 * is busy.
 */
-   if (!rtex->surface.is_linear)
+   if (!rtex->surface.is_linear ||
+   (user_stride &&
+user_stride != si_texture_get_stride(sctx->screen, rtex, 
level)))
use_staging_texture = true;
else if (usage & PIPE_TRANSFER_READ)
use_staging_texture =
rtex->resource.domains & RADEON_DOMAIN_VRAM ||
rtex->resource.flags & RADEON_FLAG_GTT_WC;
/* Write & linear only: */
else if (si_rings_is_buffer_referenced(sctx, rtex->resource.buf,
   RADEON_USAGE_READWRITE) 
||
 !sctx->ws->buffer_wait(rtex->resource.buf, 0,
RADEON_USAGE_READWRITE)) {
@@ -1778,23 +1791,33 @@ static void *si_texture_transfer_map(struct 
pipe_context *ctx,
 level, box,
 >b.b.stride,
 

[Mesa-dev] [PATCH 09/10] gallium: add PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT

2018-04-25 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/docs/source/screen.rst   | 3 +++
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_get.c| 3 +++
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/vc5/vc5_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 18 files changed, 22 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 3837360fb40..7cc6d378306 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -413,20 +413,23 @@ The integer capabilities:
   supported priority levels.  A driver that does not support prioritized
   contexts can return 0.
 * ``PIPE_CAP_FENCE_SIGNAL``: True if the driver supports signaling semaphores
   using fence_server_signal().
 * ``PIPE_CAP_CONSTBUF0_FLAGS``: The bits of pipe_resource::flags that must be
   set when binding that buffer as constant buffer 0. If the buffer doesn't have
   those bits set, pipe_context::set_constant_buffer(.., 0, ..) is ignored
   by the driver, and the driver can throw assertion failures.
 * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms
   as opposed to padding to vec4s.
+* ``PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT``: The minimum supported alignment 
of
+  the user_stride parameter of transfer_map. If 0, the user-specified stride
+  is unsupported and the user_stride parameter is ignored.
 
 
 .. _pipe_capf:
 
 PIPE_CAPF_*
 
 
 The floating-point capabilities are:
 
 * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index b0f8b4bebe3..915e7d7da7d 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -267,20 +267,21 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
case PIPE_CAP_TILE_RASTER_ORDER:
case PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES:
case PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET:
case PIPE_CAP_CONTEXT_PRIORITY_MASK:
case PIPE_CAP_FENCE_SIGNAL:
case PIPE_CAP_CONSTBUF0_FLAGS:
case PIPE_CAP_PACKED_UNIFORMS:
+   case PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT:
   return 0;
 
/* Stream output. */
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
   return 0;
 
/* Geometry shader output, unsupported. */
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index f338d756dfe..dd052a22f25 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -333,20 +333,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
case PIPE_CAP_TILE_RASTER_ORDER:
case PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES:
case PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET:
case PIPE_CAP_FENCE_SIGNAL:
case PIPE_CAP_CONSTBUF0_FLAGS:
case PIPE_CAP_PACKED_UNIFORMS:
+   case PIPE_CAP_TRANSFER_USER_STRIDE_ALIGNMENT:
return 0;
 
case PIPE_CAP_CONTEXT_PRIORITY_MASK:
return screen->priority_mask;
 
case PIPE_CAP_DRAW_INDIRECT:
if (is_a4xx(screen) || is_a5xx(screen))
return 1;
return 0;
 
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 59d2ec66284..6ec0d026bed 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -320,20 +320,21 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_MEMOBJ:
case PIPE_CAP_LOAD_CONSTBUF:
case PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS:
case 

[Mesa-dev] [PATCH 02/10] dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE

2018-04-25 Thread Marek Olšák
From: Nicolai Hähnle 

Allow the caller to specify the row stride (in bytes) with which an image
should be mapped. Note that completely ignoring USER_STRIDE is a valid
implementation of mapImage.

This is horrible API design. Unfortunately, cros_gralloc does indeed have
a horrible API design -- in that arbitrary images should be allowed to be
mapped with the stride that a linear image of the same width would have.

There is no separate capability bit because it's unclear how stricter
requirements should be defined.
---
 include/GL/internal/dri_interface.h | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 07dfd74f9d8..4247e61415f 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1213,21 +1213,21 @@ struct __DRIdri2ExtensionRec {
 */
__DRIcreateNewScreen2FunccreateNewScreen2;
 };
 
 
 /**
  * This extension provides functionality to enable various EGLImage
  * extensions.
  */
 #define __DRI_IMAGE "DRI_IMAGE"
-#define __DRI_IMAGE_VERSION 17
+#define __DRI_IMAGE_VERSION 18
 
 /**
  * These formats correspond to the similarly named MESA_FORMAT_*
  * tokens, except in the native endian of the CPU.  For example, on
  * little endian __DRI_IMAGE_FORMAT_XRGB corresponds to
  * MESA_FORMAT_XRGB, but MESA_FORMAT_XRGB_REV on big endian.
  *
  * __DRI_IMAGE_FORMAT_NONE is for images that aren't directly usable
  * by the driver (YUV planar formats) but serve as a base image for
  * creating sub-images for the different planes within the image.
@@ -1263,20 +1263,21 @@ struct __DRIdri2ExtensionRec {
  * in contrary to gbm buffers, front buffers and fake front buffers, which
  * could be read after a flush."
  */
 #define __DRI_IMAGE_USE_BACKBUFFER  0x0010
 
 
 #define __DRI_IMAGE_TRANSFER_READ0x1
 #define __DRI_IMAGE_TRANSFER_WRITE   0x2
 #define __DRI_IMAGE_TRANSFER_READ_WRITE  \
 (__DRI_IMAGE_TRANSFER_READ | __DRI_IMAGE_TRANSFER_WRITE)
+#define __DRI_IMAGE_TRANSFER_USER_STRIDE 0x4 /* since version 18 */
 
 /**
  * Four CC formats that matches with WL_DRM_FORMAT_* from wayland_drm.h,
  * GBM_FORMAT_* from gbm.h, and DRM_FORMAT_* from drm_fourcc.h. Used with
  * createImageFromNames.
  *
  * \since 5
  */
 
 #define __DRI_IMAGE_FOURCC_R8  0x20203852
@@ -1554,22 +1555,31 @@ struct __DRIimageExtensionRec {
/**
 * Returns a map of the specified region of a __DRIimage for the specified 
usage.
 *
 * flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the
 * mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ
 * is not included in the flags, the buffer content at map time is
 * undefined. Users wanting to modify the mapping must include
 * __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not
 * included, behaviour when writing the mapping is undefined.
 *
-* Returns the byte stride in *stride, and an opaque pointer to data
-* tracking the mapping in **data, which must be passed to unmapImage().
+* When __DRI_IMAGE_TRANSFER_USER_STRIDE is set in \p flags (since version 
18),
+* the driver should attempt to map the image with the byte stride given in
+* *stride. The caller must ensure that *stride is large enough to hold a
+* row of the mapping. If the requested stride is not supported, the mapping
+* may fail, or a mapping with a different stride may be created (in which
+* case the actual stride is returned in *stride).
+*
+* Returns an opaque pointer to data tracking the mapping in **data, which
+* must be passed to unmapImage().
+*
+* Returns the byte stride in *stride.
 *
 * Returns NULL on error.
 *
 * \since 12
 */
void *(*mapImage)(__DRIcontext *context, __DRIimage *image,
  int x0, int y0, int width, int height,
  unsigned int flags, int *stride, void **data);
 
/**
-- 
2.17.0
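
Since a driver may ignore the requested stride or return a different one, a caller has to compare the returned *stride with what it asked for and fall back if they differ. The sketch below shows one possible caller-side handling. It does not call the real DRI interface: `check_user_stride` and `copy_rows` are hypothetical helpers, and the row-by-row copy is just one assumed fallback.

```c
#include <stddef.h>
#include <string.h>

enum stride_result {
   STRIDE_MAP_FAILED, /* mapImage returned NULL */
   STRIDE_HONORED,    /* returned stride equals the requested one */
   STRIDE_CHANGED     /* driver picked a different stride */
};

/* Hypothetical helper: classify the outcome of a USER_STRIDE mapping. */
static enum stride_result
check_user_stride(const void *map, int requested_stride, int returned_stride)
{
   if (map == NULL)
      return STRIDE_MAP_FAILED;
   return returned_stride == requested_stride ? STRIDE_HONORED
                                              : STRIDE_CHANGED;
}

/* Hypothetical fallback: repack rows from the driver's stride into the
 * stride the caller wanted. row_bytes must not exceed either stride. */
static void copy_rows(unsigned char *dst, size_t dst_stride,
                      const unsigned char *src, size_t src_stride,
                      size_t row_bytes, size_t height)
{
   for (size_t y = 0; y < height; y++)
      memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
}
```

The repacking step is what the USER_STRIDE flag tries to avoid, so a caller would only take that path when the result is STRIDE_CHANGED.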



Re: [Mesa-dev] [PATCH] Intel: Add a Kaby Lake PCI ID

2018-04-25 Thread Rafael Antognolli
And pushed.

Thanks,
Rafael

On Wed, Apr 25, 2018 at 09:49:45AM -0700, Rafael Antognolli wrote:
> This patch is
> 
> Reviewed-by: Rafael Antognolli 
> 
> On Wed, Apr 25, 2018 at 09:23:04AM -0700, matthew.s.atw...@intel.com wrote:
> > From: Matt Atwood 
> > 
> > v2: Branding changed
> > 
> > Signed-off-by: Matt Atwood 
> > ---
> >  include/pci_ids/i965_pci_ids.h | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> > index c740a50..82e4a54 100644
> > --- a/include/pci_ids/i965_pci_ids.h
> > +++ b/include/pci_ids/i965_pci_ids.h
> > @@ -156,6 +156,7 @@ CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 
> > (Kaby Lake GT2)")
> >  CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")
> >  CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
> >  CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
> > +CHIPSET(0x591C, kbl_gt2, "Intel(R) Kaby Lake GT2")
> >  CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
> >  CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")
> >  CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
> > -- 
> > 2.7.4
> > 


[Mesa-dev] [PATCH 12/21] swr/rast: Fix init in EventHandlerWorkerStats

2018-04-25 Thread George Kyriazis
Make sure we initialize variables.
---
 src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp 
b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
index 871db79..ff7bdc3 100644
--- a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
+++ b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
@@ -121,7 +121,10 @@ namespace ArchRast
 class EventHandlerWorkerStats : public EventHandlerFile
 {
 public:
-EventHandlerWorkerStats(uint32_t id) : EventHandlerFile(id), 
mNeedFlush(false) {}
+EventHandlerWorkerStats(uint32_t id) : EventHandlerFile(id), 
mNeedFlush(false)
+{
+memset(mShaderStats, 0, sizeof(mShaderStats));
+}
 
 virtual void Handle(const EarlyDepthStencilInfoSingleSample& event)
 {
-- 
2.7.4



[Mesa-dev] [PATCH 15/21] swr/rast: Fix regressions.

2018-04-25 Thread George Kyriazis
Bump jit cache revision number to force recompile.
---
 src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
index bfc3e42..3b4c3f5 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
@@ -474,7 +474,7 @@ struct JitCacheFileHeader
 uint64_t GetObjectCRC() const { return m_objCRC; }
 
 private:
-static const uint64_t   JC_MAGIC_NUMBER = 0xfedcba9876543211ULL + 3;
+static const uint64_t   JC_MAGIC_NUMBER = 0xfedcba9876543211ULL + 4;
 static const size_t JC_STR_MAX_LEN = 32;
 static const uint32_t   JC_PLATFORM_KEY =
 (LLVM_VERSION_MAJOR << 24)  |
-- 
2.7.4



[Mesa-dev] [PATCH 11/21] swr/rast: Fix return type of VCVTPS2PH.

2018-04-25 Thread George Kyriazis
expecting <8xi16> return.
---
 src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py 
b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
index bced657..2e7f1a8 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
+++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
@@ -53,7 +53,7 @@ intrinsics = [
 ['VPERMPS', ['idx', 'a'], 'a'],
 ['VCVTPD2PS',   ['a'], 'VectorType::get(mFP32Ty, 
a->getType()->getVectorNumElements())'],
 ['VCVTPH2PS',   ['a'], 'VectorType::get(mFP32Ty, 
a->getType()->getVectorNumElements())'],
-['VCVTPS2PH',   ['a', 'round'], 'mSimdFP16Ty'],
+['VCVTPS2PH',   ['a', 'round'], 'mSimdInt16Ty'],
 ['VHSUBPS', ['a', 'b'], 'a'],
 ['VPTESTC', ['a', 'b'], 'mInt32Ty'],
 ['VPTESTZ', ['a', 'b'], 'mInt32Ty'],
-- 
2.7.4



[Mesa-dev] [PATCH 03/21] swr/rast: Fix wrong type allocation

2018-04-25 Thread George Kyriazis
ALLOCA pointer elements, not pointers.
---
 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 09590b7..a43c787 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -1014,7 +1014,7 @@ template Value* 
FetchJit::GetSimdValidIndicesHelper(Value* pIndices,
 
 {
 // store 0 index on stack to be used to conditionally load from if 
index address is OOB
-Value* pZeroIndex = ALLOCA(Ty);
+Value* pZeroIndex = ALLOCA(Ty->getPointerElementType());
 STORE(C((T)0), pZeroIndex);
 
 // Load a SIMD of index pointers
-- 
2.7.4



[Mesa-dev] [PATCH 18/21] swr/rast: Output rasterizer dir to console since it's process specific

2018-04-25 Thread George Kyriazis
---
 .../swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
index 54d2486..4f87e0c 100644
--- 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
+++ 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
@@ -36,6 +36,7 @@
 #include "${event_header}"
 #include 
 #include 
+#include <iostream>
 #include 
 
 namespace ArchRast
@@ -57,7 +58,9 @@ namespace ArchRast
 std::stringstream outDir;
 outDir << KNOB_DEBUG_OUTPUT_DIR << pBaseName << "_" << pid << 
std::ends;
 mOutputDir = outDir.str();
-CreateDirectory(mOutputDir.c_str(), NULL);
+if (CreateDirectory(mOutputDir.c_str(), NULL)) {
+std::cout << "Rasterizer Dir:  " << mOutputDir << 
std::endl << std::endl << std::flush;
+}
 
 // There could be multiple threads creating thread pools. We
 // want to make sure they are uniquly identified by adding in
-- 
2.7.4



[Mesa-dev] [PATCH 17/21] swr/rast: Add TranslateGfxAddress for shader

2018-04-25 Thread George Kyriazis
Also add GFX_MEM_CLIENT_SHADER
---
 .../drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp   |  2 +-
 .../drivers/swr/rasterizer/jitter/builder_gfx_mem.h | 17 -
 src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h |  3 ++-
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
index 9b70716..03e34db 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
@@ -201,7 +201,7 @@ namespace SwrJit
 return Builder::MASKED_LOAD(Ptr, Align, Mask, PassThru, Name, Ty, 
usage);
 }
 
-Value* BuilderGfxMem::TranslateGfxAddress(Value* xpGfxAddress, Type* 
PtrTy, const Twine &Name)
+Value* BuilderGfxMem::TranslateGfxAddress(Value* xpGfxAddress, Type* 
PtrTy, const Twine &Name, JIT_MEM_CLIENT /* usage */)
 {
 if (PtrTy == nullptr)
 {
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
index effbe05..d1a25c4 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
@@ -57,7 +57,22 @@ namespace SwrJit
 
 virtual Value *GATHERDD(Value* src, Value* pBase, Value* indices, 
Value* mask, uint8_t scale = 1, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 
-Value* TranslateGfxAddress(Value* xpGfxAddress, Type* PtrTy = nullptr, 
const Twine &Name = "");
+Value* TranslateGfxAddress(Value* xpGfxAddress, Type* PtrTy = nullptr, 
const Twine &Name = "", JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
+template 
+Value* TranslateGfxAddress(Value* xpGfxBaseAddress, const 
std::initializer_list , Type* PtrTy = nullptr, const Twine  = 
"", JIT_MEM_CLIENT usage = GFX_MEM_CLIENT_SHADER)
+{
+AssertGFXMemoryParams(xpGfxBaseAddress, usage);
+SWR_ASSERT(xpGfxBaseAddress->getType()->isPointerTy() == false);
+
+if (!PtrTy)
+{
+PtrTy = mInt8PtrTy;
+}
+
+Value* ptr = INT_TO_PTR(xpGfxBaseAddress, PtrTy);
+ptr = GEP(ptr, offset);
+return TranslateGfxAddress(PTR_TO_INT(ptr, mInt64Ty), PtrTy, Name, 
usage);
+}
 
 
 protected:
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
index 9ccac4f..3823a13 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
@@ -35,7 +35,8 @@ typedef enum _JIT_MEM_CLIENT
 {
 MEM_CLIENT_INTERNAL,
 GFX_MEM_CLIENT_FETCH,
-GFX_MEM_CLIENT_SAMPLER
+GFX_MEM_CLIENT_SAMPLER,
+GFX_MEM_CLIENT_SHADER,
 } JIT_MEM_CLIENT;
 
 protected:
-- 
2.7.4



[Mesa-dev] [PATCH 06/21] swr/rast: Internal core change

2018-04-25 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/core/utils.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/swr/rasterizer/core/utils.h 
b/src/gallium/drivers/swr/rasterizer/core/utils.h
index d6cbf24..7769e05 100644
--- a/src/gallium/drivers/swr/rasterizer/core/utils.h
+++ b/src/gallium/drivers/swr/rasterizer/core/utils.h
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "common/os.h"
 #include "common/intrin.h"
 #include "common/swr_assert.h"
-- 
2.7.4



[Mesa-dev] [PATCH 05/21] swr/rast: Fix x86 lowering 64-bit float handling

2018-04-25 Thread George Kyriazis
- 64-bit cvt-to-float needs to be explicitly handled
- gathers need the right parameter types to work with doubles

Fixes draw-vertices piglit tests
---
 .../drivers/swr/rasterizer/jitter/builder_misc.h   | 12 ++
 .../rasterizer/jitter/functionpasses/lower_x86.cpp | 50 +++---
 2 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
index bd4be9f..a51aad0 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
@@ -55,6 +55,18 @@ Constant *CA(LLVMContext& ctx, ArrayRef constList)
 return ConstantDataArray::get(ctx, constList);
 }
 
+template<typename Ty>
+Constant *CInc(uint32_t base, uint32_t count)
+{
+std::vector<Constant*> vConsts;
+
+for(uint32_t i = 0; i < count; i++) {
+vConsts.push_back(C((Ty)base));
+base++;
+}
+return ConstantVector::get(vConsts);
+}
+
 Constant *PRED(bool pred);
 
 Value *VIMMED1(int i);
diff --git 
a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
index baf3ab5..eac0549 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
@@ -115,7 +115,7 @@ namespace SwrJit
 {"meta.intrinsic.VGATHERPD",   {{Intrinsic::not_intrinsic, 
  Intrinsic::not_intrinsic},  VGATHER_EMU}},
 {"meta.intrinsic.VGATHERPS",   {{Intrinsic::not_intrinsic, 
  Intrinsic::not_intrinsic},  VGATHER_EMU}},
 {"meta.intrinsic.VGATHERDD",   {{Intrinsic::not_intrinsic, 
  Intrinsic::not_intrinsic},  VGATHER_EMU}},
-{"meta.intrinsic.VCVTPD2PS",   {{Intrinsic::x86_avx_cvt_pd2_ps_256,
  Intrinsic::not_intrinsic},  NO_EMU}},
+{"meta.intrinsic.VCVTPD2PS",   {{Intrinsic::x86_avx_cvt_pd2_ps_256,
  DOUBLE},NO_EMU}},
 {"meta.intrinsic.VCVTPH2PS",   {{Intrinsic::x86_vcvtph2ps_256, 
  Intrinsic::not_intrinsic},  NO_EMU}},
 {"meta.intrinsic.VROUND",  {{Intrinsic::x86_avx_round_ps_256,  
  DOUBLE},NO_EMU}},
 {"meta.intrinsic.VHSUBPS", {{Intrinsic::x86_avx_hsub_ps_256,   
  DOUBLE},NO_EMU}},
@@ -166,10 +166,18 @@ namespace SwrJit
 // across all intrinsics, and will have to be rethought. Probably need 
something
 // similar to llvm's getDeclaration() utility to map a set of inputs 
to a specific typed
 // intrinsic.
-void GetRequestedWidthAndType(CallInst* pCallInst, TargetWidth* 
pWidth, Type** pTy)
+void GetRequestedWidthAndType(CallInst* pCallInst, const StringRef 
intrinName, TargetWidth* pWidth, Type** pTy)
 {
 uint32_t vecWidth;
 Type* pVecTy = pCallInst->getType();
+
+// Check for intrinsic specific types
+// VCVTPD2PS type comes from src, not dst
+if (intrinName.equals("meta.intrinsic.VCVTPD2PS"))
+{
+pVecTy = pCallInst->getOperand(0)->getType();
+}
+
 if (!pVecTy->isVectorTy())
 {
 for (auto& op : pCallInst->arg_operands())
@@ -231,7 +239,7 @@ namespace SwrJit
 auto& intrinsic = intrinsicMap2[mTarget][pFunc->getName()];
 TargetWidth vecWidth;
 Type* pElemTy;
-GetRequestedWidthAndType(pCallInst, &vecWidth, &pElemTy);
+GetRequestedWidthAndType(pCallInst, pFunc->getName(), &vecWidth, 
&pElemTy);
 
 // Check if there is a native intrinsic for this instruction
 Intrinsic::ID id = intrinsic.intrin[vecWidth];
@@ -460,7 +468,9 @@ namespace SwrJit
 // Double pump 4-wide for 64bit elements
 if (vSrc->getType()->getVectorElementType() == B->mDoubleTy)
 {
-auto v64Mask = B->S_EXT(pThis->VectorMask(vi1Mask), 
B->mInt64Ty);
+auto v64Mask = pThis->VectorMask(vi1Mask);
+v64Mask = B->S_EXT(v64Mask,
+   VectorType::get(B->mInt64Ty, 
v64Mask->getType()->getVectorNumElements()));
 v64Mask = B->BITCAST(v64Mask, vSrc->getType());
 
 Value* src0 = B->VSHUFFLE(vSrc, vSrc, B->C({ 0, 1, 2, 3 
}));
@@ -472,10 +482,15 @@ namespace SwrJit
 Value* mask0 = B->VSHUFFLE(v64Mask, v64Mask, B->C({ 0, 1, 
2, 3 }));
 Value* mask1 = B->VSHUFFLE(v64Mask, v64Mask, B->C({ 4, 5, 
6, 7 }));
 
+src0 = B->BITCAST(src0, VectorType::get(B->mInt64Ty, 
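
Note on PATCH 05/21 (the diff above is cut off in this archive): the CInc
helper added to builder_misc.h builds a vector of `count` constants that start
at `base` and increase by one, each converted to the template type. A plain-C++
analogue is sketched below. IncrementingValues is a hypothetical name, and
std::vector stands in for the LLVM ConstantVector that the real helper returns.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Plain-C++ analogue of CInc: returns {base, base+1, ..., base+count-1},
// each converted to Ty. The real helper wraps every value in an LLVM
// constant; this sketch keeps the raw values instead.
template <typename Ty>
std::vector<Ty> IncrementingValues(uint32_t base, uint32_t count)
{
    std::vector<Ty> values;
    values.reserve(count);
    for (uint32_t i = 0; i < count; i++)
    {
        values.push_back(static_cast<Ty>(base));
        base++;
    }
    return values;
}
```

For example, IncrementingValues<int32_t>(4, 4) holds 4, 5, 6, 7, which is the
shape of index list one would pass to a shuffle that selects the upper half of
an 8-wide vector.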

[Mesa-dev] [PATCH 20/21] swr/rast: Small editorial changes

2018-04-25 Thread George Kyriazis
---
 .../swr/rasterizer/jitter/builder_gfx_mem.cpp  | 33 ++
 .../swr/rasterizer/jitter/builder_gfx_mem.h|  1 +
 .../rasterizer/jitter/functionpasses/lower_x86.cpp |  2 +-
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
index 03e34db..c6d0619 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
@@ -55,6 +55,7 @@ namespace SwrJit
 SWR_ASSERT(!(ptr->getType() == mInt64Ty && usage == 
MEM_CLIENT_INTERNAL), "Internal memory should not be gfxptr_t.");
 }
 
+
 //
 /// @brief Generate a masked gather operation in LLVM IR.  If not  
 /// supported on the underlying platform, emulate it with loads
@@ -63,17 +64,15 @@ namespace SwrJit
 /// @param vIndices - SIMD wide value of VB byte offsets
 /// @param vMask - SIMD wide mask that controls whether to access memory 
or the src values
 /// @param scale - value to scale indices by
-Value *BuilderGfxMem::GATHERPS(Value *vSrc, Value *pBase, Value *vIndices, 
Value *vMask, uint8_t scale, JIT_MEM_CLIENT usage)
+Value* BuilderGfxMem::GATHERPS(Value* vSrc, Value* pBase, Value* vIndices, 
Value* vMask, uint8_t scale, JIT_MEM_CLIENT usage)
 {
-Value *vGather;
-
 // address may be coming in as 64bit int now so get the pointer
 if (pBase->getType() == mInt64Ty)
 {
 pBase = INT_TO_PTR(pBase, PointerType::get(mInt8Ty, 0));
 }
 
-vGather = Builder::GATHERPS(vSrc, pBase, vIndices, vMask, scale);
+Value* vGather = Builder::GATHERPS(vSrc, pBase, vIndices, vMask, 
scale);
 return vGather;
 }
 
@@ -85,10 +84,8 @@ namespace SwrJit
 /// @param vIndices - SIMD wide value of VB byte offsets
 /// @param vMask - SIMD wide mask that controls whether to access memory 
or the src values
 /// @param scale - value to scale indices by
-Value *BuilderGfxMem::GATHERDD(Value* vSrc, Value* pBase, Value* vIndices, 
Value* vMask, uint8_t scale, JIT_MEM_CLIENT usage)
+Value* BuilderGfxMem::GATHERDD(Value* vSrc, Value* pBase, Value* vIndices, 
Value* vMask, uint8_t scale, JIT_MEM_CLIENT usage)
 {
-Value* vGather = VIMMED1(0.0f);
-
 
 // address may be coming in as 64bit int now so get the pointer
 if (pBase->getType() == mInt64Ty)
@@ -96,7 +93,7 @@ namespace SwrJit
 pBase = INT_TO_PTR(pBase, PointerType::get(mInt8Ty, 0));
 }
 
-vGather = Builder::GATHERDD(vSrc, pBase, vIndices, vMask, scale);
+Value* vGather = Builder::GATHERDD(vSrc, pBase, vIndices, vMask, 
scale);
 return vGather;
 }
 
@@ -106,31 +103,31 @@ namespace SwrJit
 return ADD(base, offset);
 }
 
-Value *BuilderGfxMem::GEP(Value *Ptr, Value *Idx, Type *Ty, const Twine 
&Name)
+Value* BuilderGfxMem::GEP(Value* Ptr, Value* Idx, Type *Ty, const Twine 
&Name)
 {
 Ptr = TranslationHelper(Ptr, Ty);
 return Builder::GEP(Ptr, Idx, nullptr, Name);
 }
 
-Value *BuilderGfxMem::GEP(Type *Ty, Value *Ptr, Value *Idx, const Twine 
&Name)
+Value* BuilderGfxMem::GEP(Type *Ty, Value* Ptr, Value* Idx, const Twine 
&Name)
 {
 Ptr = TranslationHelper(Ptr, Ty);
 return Builder::GEP(Ty, Ptr, Idx, Name);
 }
 
-Value *BuilderGfxMem::GEP(Value* Ptr, const std::initializer_list 
, Type *Ty)
+Value* BuilderGfxMem::GEP(Value* Ptr, const std::initializer_list 
, Type *Ty)
 {
 Ptr = TranslationHelper(Ptr, Ty);
 return Builder::GEP(Ptr, indexList);
 }
 
-Value *BuilderGfxMem::GEP(Value* Ptr, const 
std::initializer_list , Type *Ty)
+Value* BuilderGfxMem::GEP(Value* Ptr, const 
std::initializer_list , Type *Ty)
 {
 Ptr = TranslationHelper(Ptr, Ty);
 return Builder::GEP(Ptr, indexList);
 }
 
-Value* BuilderGfxMem::TranslationHelper(Value *Ptr, Type *Ty)
+Value* BuilderGfxMem::TranslationHelper(Value* Ptr, Type *Ty)
 {
 SWR_ASSERT(!(Ptr->getType() == mInt64Ty && Ty == nullptr), "Access of 
GFX pointers must have non-null type specified.");
 
@@ -144,7 +141,7 @@ namespace SwrJit
 return Ptr;
 }
 
-LoadInst* BuilderGfxMem::LOAD(Value *Ptr, const char *Name, Type *Ty, 
JIT_MEM_CLIENT usage)
+LoadInst* BuilderGfxMem::LOAD(Value* Ptr, const char *Name, Type *Ty, 
JIT_MEM_CLIENT usage)
 {
 AssertGFXMemoryParams(Ptr, usage);
 
@@ -152,7 +149,7 @@ namespace SwrJit
 return Builder::LOAD(Ptr, Name);
 }
 
-LoadInst* BuilderGfxMem::LOAD(Value *Ptr, const Twine &Name, Type *Ty, 
JIT_MEM_CLIENT usage)
+LoadInst* BuilderGfxMem::LOAD(Value* Ptr, const Twine &Name, Type *Ty, 
JIT_MEM_CLIENT usage)
 {
 AssertGFXMemoryParams(Ptr, 

[Mesa-dev] [PATCH 16/21] swr/rast: jit PRINT improvements.

2018-04-25 Thread George Kyriazis
Sign-extend integer types to 32 bits when "%d" is specified, and add a new
"%u" that zero-extends to 32 bits. This improves printing of integer types
narrower than 32 bits (i1 specifically).
---
 .../drivers/swr/rasterizer/jitter/builder_misc.cpp| 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
index f893693..619a67b 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
@@ -416,9 +416,20 @@ namespace SwrJit
 {
 tempStr.insert(pos, std::string("%d "));
 pos += 3;
-printCallArgs.push_back(VEXTRACT(pArg, C(i)));
+printCallArgs.push_back(S_EXT(VEXTRACT(pArg, C(i)), 
Type::getInt32Ty(JM()->mContext)));
+}
+printCallArgs.push_back(S_EXT(VEXTRACT(pArg, C(i)), 
Type::getInt32Ty(JM()->mContext)));
+}
+else if ((tempStr[pos + 1] == 'u') && 
(pContainedType->isIntegerTy()))
+{
+uint32_t i = 0;
+for (; i < (pArg->getType()->getVectorNumElements()) - 1; 
i++)
+{
+tempStr.insert(pos, std::string("%d "));
+pos += 3;
+printCallArgs.push_back(Z_EXT(VEXTRACT(pArg, C(i)), 
Type::getInt32Ty(JM()->mContext)));
 }
-printCallArgs.push_back(VEXTRACT(pArg, C(i)));
+printCallArgs.push_back(Z_EXT(VEXTRACT(pArg, C(i)), 
Type::getInt32Ty(JM()->mContext)));
 }
 }
 else
-- 
2.7.4
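
Note on PATCH 16/21: the difference between the two format specifiers is
easiest to see on a 1-bit value. Sign extension copies the top bit into the
upper bits, so an i1 "true" becomes -1, while zero extension yields 1. The
sketch below reproduces both widenings on plain integers. SignExtend and
ZeroExtend are hypothetical helper names that stand in for the S_EXT and Z_EXT
builder calls and do not touch LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low `bits` bits of `value` to 32 bits.
uint32_t ZeroExtend(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
    {
        return value;
    }
    return value & ((1u << bits) - 1u);
}

// Sign-extend the low `bits` bits of `value` to 32 bits.
int32_t SignExtend(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t low     = ZeroExtend(value, bits);
    uint32_t signBit = 1u << (bits - 1);
    // (low ^ signBit) - signBit replicates the sign bit into the upper bits.
    return static_cast<int32_t>((low ^ signBit) - signBit);
}
```

With these helpers, SignExtend(1, 1) is -1 and ZeroExtend(1, 1) is 1, which is
why a mask of i1 values reads more naturally when printed through the
zero-extending path.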



[Mesa-dev] [PATCH 07/21] swr/rast: Add support for TexelMask evaluation

2018-04-25 Thread George Kyriazis
---
 .../drivers/swr/rasterizer/jitter/builder.cpp  | 42 ++
 .../drivers/swr/rasterizer/jitter/builder.h|  2 ++
 2 files changed, 44 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
index bd81560..3248735 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
@@ -128,4 +128,46 @@ namespace SwrJit
 
 return (pAlloca->getMetadata("is_temp_alloca") != nullptr);
 }
+
+// Returns true if able to find an intrinsic to mark
+bool Builder::SetTexelMaskEvaluate(Instruction* inst)
+{
+CallInst* pGenIntrin = dyn_cast<CallInst>(inst);
+if (pGenIntrin)
+{
+MDNode* N = MDNode::get(JM()->mContext, 
MDString::get(JM()->mContext, "is_evaluate"));
+pGenIntrin->setMetadata("is_evaluate", N);
+return true;
+}
+else
+{
+// Follow use def chain back up
+for (Use& u : inst->operands())
+{
+Instruction* srcInst = dyn_cast<Instruction>(u.get());
+if (srcInst)
+{
+if (SetTexelMaskEvaluate(srcInst))
+{
+return true;
+}
+}
+}
+}
+
+return false;
+}
+
+bool Builder::IsTexelMaskEvaluate(Instruction* genSampleOrLoadIntrinsic)
+{
+CallInst* pGenIntrin = dyn_cast<CallInst>(genSampleOrLoadIntrinsic);
+
+if (!pGenIntrin)
+{
+return false;
+}
+
+return (pGenIntrin->getMetadata("is_evaluate") != nullptr);
+}
+
 }
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
index e2ad1e8..82c5f8c 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
@@ -121,6 +121,8 @@ namespace SwrJit
 void SetTargetWidth(uint32_t width);
 void SetTempAlloca(Value* inst);
 bool IsTempAlloca(Value* inst);
+bool SetTexelMaskEvaluate(Instruction* inst);
+bool IsTexelMaskEvaluate(Instruction* inst);
 
 #include "gen_builder.hpp"
 #include "gen_builder_meta.hpp"
-- 
2.7.4
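
Note on PATCH 07/21: SetTexelMaskEvaluate marks the instruction itself when it
is a call, and otherwise walks depth-first up the operand (use-def) chain until
it finds a call to mark. The sketch below shows the same walk on a hypothetical
Node type, with a boolean standing in for the "is_evaluate" metadata. None of
these names exist in swr or LLVM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an LLVM instruction: a node that may be a call
// and that refers to the nodes producing its operands. A null operand models
// an operand that is not an instruction (e.g. a constant).
struct Node
{
    bool isCall     = false;
    bool isEvaluate = false;  // stands in for the "is_evaluate" metadata
    std::vector<Node*> operands;
};

// Marks the first call found by a depth-first walk up the operand chain.
// Returns true if a call was found and marked.
bool SetEvaluate(Node* node)
{
    if (node->isCall)
    {
        node->isEvaluate = true;
        return true;
    }
    for (Node* src : node->operands)
    {
        if (src && SetEvaluate(src))
        {
            return true;
        }
    }
    return false;
}

// Mirrors the query side: only calls can carry the mark.
bool IsEvaluate(const Node* node)
{
    return node->isCall && node->isEvaluate;
}
```

As in the patch, the walk stops at the first call it reaches, so only one
producer is marked even when several operands lead back to calls.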



[Mesa-dev] [PATCH 21/21] swr/rast: No need to export GetSimdValidIndicesGfx

2018-04-25 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 48f0961..7b0b80a 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -985,10 +985,6 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE 
,
 }
 }
 
-typedef void*(*PFN_TRANSLATEGFXADDRESS_FUNC)(void* pdc, gfxptr_t va);
-extern "C" void GetSimdValid8bitIndicesGfx(gfxptr_t indices, gfxptr_t 
lastIndex, uint32_t vWidth, PFN_TRANSLATEGFXADDRESS_FUNC pfnTranslate, void* 
pdc, uint32_t* outIndices);
-extern "C" void GetSimdValid16bitIndicesGfx(gfxptr_t indices, gfxptr_t 
lastIndex, uint32_t vWidth, PFN_TRANSLATEGFXADDRESS_FUNC pfnTranslate, void* 
pdc, uint32_t* outIndices);
-
 template Value* FetchJit::GetSimdValidIndicesHelper(Value* 
pIndices, Value* pLastIndex)
 {
 SWR_ASSERT(pIndices->getType() == mInt64Ty && pLastIndex->getType() == 
mInt64Ty, "Function expects gfxptr_t for both input parameters.");
-- 
2.7.4



[Mesa-dev] [PATCH 19/21] swr/rast: Use new processor detection mechanism

2018-04-25 Thread George Kyriazis
Use a specific AVX512 selection mechanism based on the avx512er bit instead of
getHostCPUName(). LLVM 6.0.0 has a bug that reports the wrong string for KNL
(fixed in 6.0.1).
---
 .../drivers/swr/rasterizer/jitter/JitManager.cpp   | 50 +-
 .../drivers/swr/rasterizer/jitter/JitManager.h |  2 +
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
index 3b4c3f5..28aadc6 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
@@ -80,7 +80,55 @@ JitManager::JitManager(uint32_t simdWidth, const char *arch, 
const char* core)
 
 StringRef hostCPUName;
 
-hostCPUName = sys::getHostCPUName();
+// force JIT to use the same CPU arch as the rest of swr
+if(mArch.AVX512F())
+{
+#if USE_SIMD16_SHADERS
+if(mArch.AVX512ER())
+{
+hostCPUName = StringRef("knl");
+}
+else
+{
+hostCPUName = StringRef("skylake-avx512");
+}
+mUsingAVX512 = true;
+#else
+hostCPUName = StringRef("core-avx2");
+#endif
+if (mVWidth == 0)
+{
+mVWidth = 8;
+}
+}
+else if(mArch.AVX2())
+{
+hostCPUName = StringRef("core-avx2");
+if (mVWidth == 0)
+{
+mVWidth = 8;
+}
+}
+else if(mArch.AVX())
+{
+if (mArch.F16C())
+{
+hostCPUName = StringRef("core-avx-i");
+}
+else
+{
+hostCPUName = StringRef("corei7-avx");
+}
+if (mVWidth == 0)
+{
+mVWidth = 8;
+}
+}
+else
+{
+SWR_INVALID("Jitting requires at least AVX ISA support");
+}
+
 
 auto optLevel = CodeGenOpt::Aggressive;
 
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.h 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.h
index c15e0d1..54a25d8 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.h
@@ -69,6 +69,7 @@ public:
 
 bool AVX2(void) { return bForceAVX ? 0 : InstructionSet::AVX2(); }
 bool AVX512F(void) { return (bForceAVX | bForceAVX2) ? 0 : 
InstructionSet::AVX512F(); }
+bool AVX512ER(void) { return (bForceAVX | bForceAVX2) ? 0 : 
InstructionSet::AVX512ER(); }
 bool BMI2(void) { return bForceAVX ? 0 : InstructionSet::BMI2(); }
 
 private:
@@ -142,6 +143,7 @@ struct JitManager
 
 uint32_tmVWidth;
 
+boolmUsingAVX512 = false;
 
 // fetch shader types
 llvm::FunctionType* mFetchShaderTy;
-- 
2.7.4
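
Note on PATCH 19/21: the new if/else chain in the JitManager constructor maps
the detected instruction-set features to an LLVM CPU name. The decision table
can be sketched as a small pure function. CpuFeatures and SelectCpuName are
hypothetical names; in the patch the flags come from mArch, the SIMD16 switch
is the USE_SIMD16_SHADERS preprocessor macro, and the no-AVX case calls
SWR_INVALID rather than returning an empty string.

```cpp
#include <cassert>
#include <string>

// Hypothetical feature flags; in the patch these are queried from mArch.
struct CpuFeatures
{
    bool avx      = false;
    bool f16c     = false;
    bool avx2     = false;
    bool avx512f  = false;
    bool avx512er = false;
};

// Returns the CPU name chosen by the patch's if/else chain, or an empty
// string when AVX is missing (the patch reports an error in that case).
std::string SelectCpuName(const CpuFeatures& f, bool useSimd16Shaders)
{
    if (f.avx512f)
    {
        if (!useSimd16Shaders)
        {
            return "core-avx2";
        }
        return f.avx512er ? "knl" : "skylake-avx512";
    }
    if (f.avx2)
    {
        return "core-avx2";
    }
    if (f.avx)
    {
        return f.f16c ? "core-avx-i" : "corei7-avx";
    }
    return "";
}
```

The point of keying on the avx512er bit is that the result no longer depends
on the string returned by getHostCPUName(), which the commit message says is
wrong for KNL under LLVM 6.0.0.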



[Mesa-dev] [PATCH 10/21] swr/rast: WIP Translation handling

2018-04-25 Thread George Kyriazis
---
 .../swr/rasterizer/jitter/builder_gfx_mem.cpp  | 41 +-
 .../swr/rasterizer/jitter/builder_gfx_mem.h|  3 +-
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
index 6ecd969..9b70716 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
@@ -160,14 +160,6 @@ namespace SwrJit
 return Builder::LOAD(Ptr, Name);
 }
 
-LoadInst* BuilderGfxMem::LOAD(Type *Ty, Value *Ptr, const Twine &Name, 
JIT_MEM_CLIENT usage)
-{
-AssertGFXMemoryParams(Ptr, usage);
-
-Ptr = TranslationHelper(Ptr, Ty);
-return Builder::LOAD(Ty, Ptr, Name);
-}
-
 LoadInst* BuilderGfxMem::LOAD(Value *Ptr, bool isVolatile, const Twine 
&Name, Type *Ty, JIT_MEM_CLIENT usage)
 {
 AssertGFXMemoryParams(Ptr, usage);
@@ -180,12 +172,25 @@ namespace SwrJit
 {
 AssertGFXMemoryParams(BasePtr, usage);
 
-// This call is just a pass through to the base class.
-// It needs to be here to compile due to the combination of virtual 
overrides and signature overloads.
-// It doesn't do anything meaningful because the implementation in the 
base class is going to call 
-// another version of LOAD inside itself where the actual per offset 
translation will take place 
-// and we can't just translate the BasePtr once, each address needs 
individual translation.
-return Builder::LOAD(BasePtr, offset, name, Ty, usage);
+bool bNeedTranslation = false;
+if (BasePtr->getType() == mInt64Ty)
+{
+SWR_ASSERT(Ty);
+BasePtr = INT_TO_PTR(BasePtr, Ty, name);
+bNeedTranslation = true;
+}
+std::vector valIndices;
+for (auto i : offset)
+{
+valIndices.push_back(C(i));
+}
+BasePtr = Builder::GEPA(BasePtr, valIndices, name);
+if (bNeedTranslation)
+{
+BasePtr = PTR_TO_INT(BasePtr, mInt64Ty, name);
+}
+
+return LOAD(BasePtr, name, Ty, usage);
 }
 
 CallInst* BuilderGfxMem::MASKED_LOAD(Value *Ptr, unsigned Align, Value 
*Mask, Value *PassThru, const Twine &Name, Type *Ty, JIT_MEM_CLIENT usage)
@@ -196,8 +201,12 @@ namespace SwrJit
 return Builder::MASKED_LOAD(Ptr, Align, Mask, PassThru, Name, Ty, 
usage);
 }
 
-Value* BuilderGfxMem::TranslateGfxAddress(Value* xpGfxAddress)
+Value* BuilderGfxMem::TranslateGfxAddress(Value* xpGfxAddress, Type* 
PtrTy, const Twine &Name)
 {
-return INT_TO_PTR(xpGfxAddress, PointerType::get(mInt8Ty, 0));
+if (PtrTy == nullptr)
+{
+PtrTy = mInt8PtrTy;
+}
+return INT_TO_PTR(xpGfxAddress, PtrTy, Name);
 }
 }
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
index f8ec0ac..effbe05 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
@@ -48,7 +48,6 @@ namespace SwrJit
 
 virtual LoadInst* LOAD(Value *Ptr, const char *Name, Type *Ty = 
nullptr, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 virtual LoadInst* LOAD(Value *Ptr, const Twine &Name = "", Type *Ty = 
nullptr, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
-virtual LoadInst* LOAD(Type *Ty, Value *Ptr, const Twine &Name = "", 
JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 virtual LoadInst* LOAD(Value *Ptr, bool isVolatile, const Twine &Name 
= "", Type *Ty = nullptr, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 virtual LoadInst* LOAD(Value *BasePtr, const 
std::initializer_list , const llvm::Twine& Name = "", Type *Ty 
= nullptr, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 
@@ -58,7 +57,7 @@ namespace SwrJit
 
 virtual Value *GATHERDD(Value* src, Value* pBase, Value* indices, 
Value* mask, uint8_t scale = 1, JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
 
-Value* TranslateGfxAddress(Value* xpGfxAddress);
+Value* TranslateGfxAddress(Value* xpGfxAddress, Type* PtrTy = nullptr, 
const Twine &Name = "");
 
 
 protected:
-- 
2.7.4



[Mesa-dev] [PATCH 02/21] swr: touch generated files to update timestamp

2018-04-25 Thread George Kyriazis
A previous change in the generators necessitates this change.
---
 src/gallium/drivers/swr/Makefile.am | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/swr/Makefile.am 
b/src/gallium/drivers/swr/Makefile.am
index c22f09e..8b31502 100644
--- a/src/gallium/drivers/swr/Makefile.am
+++ b/src/gallium/drivers/swr/Makefile.am
@@ -104,6 +104,7 @@ gen_swr_context_llvm.h: 
rasterizer/codegen/gen_llvm_types.py rasterizer/codegen/
$(srcdir)/rasterizer/codegen/gen_llvm_types.py \
--input $(srcdir)/swr_context.h \
--output ./gen_swr_context_llvm.h
+   $(AM_V_GEN)touch $@
 
 rasterizer/codegen/gen_knobs.cpp: rasterizer/codegen/gen_knobs.py 
rasterizer/codegen/knob_defs.py rasterizer/codegen/templates/gen_knobs.cpp 
rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -111,6 +112,7 @@ rasterizer/codegen/gen_knobs.cpp: 
rasterizer/codegen/gen_knobs.py rasterizer/cod
$(srcdir)/rasterizer/codegen/gen_knobs.py \
--output rasterizer/codegen/gen_knobs.cpp \
--gen_cpp
+   $(AM_V_GEN)touch $@
 
 rasterizer/codegen/gen_knobs.h: rasterizer/codegen/gen_knobs.py 
rasterizer/codegen/knob_defs.py rasterizer/codegen/templates/gen_knobs.h 
rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -118,6 +120,7 @@ rasterizer/codegen/gen_knobs.h: 
rasterizer/codegen/gen_knobs.py rasterizer/codeg
$(srcdir)/rasterizer/codegen/gen_knobs.py \
--output rasterizer/codegen/gen_knobs.h \
--gen_h
+   $(AM_V_GEN)touch $@
 
 rasterizer/jitter/gen_state_llvm.h: rasterizer/codegen/gen_llvm_types.py 
rasterizer/codegen/templates/gen_llvm.hpp rasterizer/core/state.h 
rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -125,6 +128,7 @@ rasterizer/jitter/gen_state_llvm.h: 
rasterizer/codegen/gen_llvm_types.py rasteri
$(srcdir)/rasterizer/codegen/gen_llvm_types.py \
--input $(srcdir)/rasterizer/core/state.h \
--output rasterizer/jitter/gen_state_llvm.h
+   $(AM_V_GEN)touch $@
 
 rasterizer/jitter/gen_builder.hpp: rasterizer/codegen/gen_llvm_ir_macros.py 
rasterizer/codegen/templates/gen_builder.hpp rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -133,6 +137,7 @@ rasterizer/jitter/gen_builder.hpp: 
rasterizer/codegen/gen_llvm_ir_macros.py rast
--input $(LLVM_INCLUDEDIR)/llvm/IR/IRBuilder.h \
--output rasterizer/jitter \
--gen_h
+   $(AM_V_GEN)touch $@
 
 rasterizer/jitter/gen_builder_meta.hpp: 
rasterizer/codegen/gen_llvm_ir_macros.py 
rasterizer/codegen/templates/gen_builder.hpp rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -140,6 +145,7 @@ rasterizer/jitter/gen_builder_meta.hpp: 
rasterizer/codegen/gen_llvm_ir_macros.py
$(srcdir)/rasterizer/codegen/gen_llvm_ir_macros.py \
--output rasterizer/jitter \
--gen_meta_h
+   $(AM_V_GEN)touch $@
 
 rasterizer/jitter/gen_builder_intrin.hpp: 
rasterizer/codegen/gen_llvm_ir_macros.py 
rasterizer/codegen/templates/gen_builder.hpp rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -147,6 +153,7 @@ rasterizer/jitter/gen_builder_intrin.hpp: 
rasterizer/codegen/gen_llvm_ir_macros.
$(srcdir)/rasterizer/codegen/gen_llvm_ir_macros.py \
--output rasterizer/jitter \
--gen_intrin_h
+   $(AM_V_GEN)touch $@
 
 rasterizer/archrast/gen_ar_event.hpp: rasterizer/codegen/gen_archrast.py 
rasterizer/codegen/templates/gen_ar_event.hpp rasterizer/archrast/events.proto 
rasterizer/archrast/events_private.proto rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -156,6 +163,7 @@ rasterizer/archrast/gen_ar_event.hpp: 
rasterizer/codegen/gen_archrast.py rasteri
--proto_private 
$(srcdir)/rasterizer/archrast/events_private.proto \
--output rasterizer/archrast/gen_ar_event.hpp \
--gen_event_hpp
+   $(AM_V_GEN)touch $@
 
 rasterizer/archrast/gen_ar_event.cpp: rasterizer/codegen/gen_archrast.py 
rasterizer/codegen/templates/gen_ar_event.cpp rasterizer/archrast/events.proto 
rasterizer/archrast/events_private.proto rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -165,6 +173,7 @@ rasterizer/archrast/gen_ar_event.cpp: 
rasterizer/codegen/gen_archrast.py rasteri
--proto_private 
$(srcdir)/rasterizer/archrast/events_private.proto \
--output rasterizer/archrast/gen_ar_event.cpp \
--gen_event_cpp
+   $(AM_V_GEN)touch $@
 
 rasterizer/archrast/gen_ar_eventhandler.hpp: 
rasterizer/codegen/gen_archrast.py 
rasterizer/codegen/templates/gen_ar_eventhandler.hpp 
rasterizer/archrast/events.proto rasterizer/archrast/events_private.proto 
rasterizer/codegen/gen_common.py
$(MKDIR_GEN)
@@ -174,6 +183,7 @@ rasterizer/archrast/gen_ar_eventhandler.hpp: 
rasterizer/codegen/gen_archrast.py
--proto_private 

[Mesa-dev] [PATCH 00/21] OpenSWR batch change

2018-04-25 Thread George Kyriazis
Misc changes.  Include:
- fix KNL behavior with LLVM 6.0.0
- fix byte offset for non-indexed draws
- fix 64-bit float handling with code generator
- misc cleanup

George Kyriazis (21):
  swr/rast: Fix byte offset for non-indexed draws
  swr: touch generated files to update timestamp
  swr/rast: Fix wrong type allocation
  swr/rast: Add some SIMD_T utility functors
  swr/rast: Fix x86 lowering 64-bit float handling
  swr/rast: Internal core change
  swr/rast: Add support for TexelMask evaluation
  swr/rast: Silence warnings
  swr/rast: Use different handing for stream masks
  swr/rast: WIP Translation handling
  swr/rast: Fix return type of VCVTPS2PH.
  swr/rast: Fix init in EventHandlerWorkerStats
  swr/rast: Package events.proto with core output
  swr/rast: Cleanup old windows cruft.
  swr/rast: Fix regressions.
  swr/rast: jit PRINT improvements.
  swr/rast: Add TranslateGfxAddress for shader
  swr/rast: Output rasterizer dir to console since it's process specific
  swr/rast: Use new processor detection mechanism
  swr/rast: Small editorial changes
  swr/rast: No need to export GetSimdValidIndicesGfx

 src/gallium/drivers/swr/Makefile.am| 11 
 .../drivers/swr/rasterizer/archrast/archrast.cpp   | 35 +-
 .../swr/rasterizer/codegen/gen_llvm_ir_macros.py   |  2 +-
 .../codegen/templates/gen_ar_eventhandlerfile.hpp  |  7 +-
 src/gallium/drivers/swr/rasterizer/common/os.h |  3 +
 .../drivers/swr/rasterizer/common/simdlib.hpp  | 66 +++
 src/gallium/drivers/swr/rasterizer/core/api.cpp|  4 +-
 .../drivers/swr/rasterizer/core/frontend.cpp   | 12 ++--
 src/gallium/drivers/swr/rasterizer/core/state.h|  2 +-
 src/gallium/drivers/swr/rasterizer/core/utils.h|  1 +
 .../drivers/swr/rasterizer/jitter/JitManager.cpp   | 66 ++-
 .../drivers/swr/rasterizer/jitter/JitManager.h |  2 +
 .../drivers/swr/rasterizer/jitter/blend_jit.cpp|  2 -
 .../drivers/swr/rasterizer/jitter/builder.cpp  | 42 
 .../drivers/swr/rasterizer/jitter/builder.h|  2 +
 .../swr/rasterizer/jitter/builder_gfx_mem.cpp  | 74 --
 .../swr/rasterizer/jitter/builder_gfx_mem.h| 19 +-
 .../drivers/swr/rasterizer/jitter/builder_mem.h|  3 +-
 .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 15 -
 .../drivers/swr/rasterizer/jitter/builder_misc.h   | 12 
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp|  7 +-
 .../rasterizer/jitter/functionpasses/lower_x86.cpp | 55 +---
 .../swr/rasterizer/jitter/streamout_jit.cpp|  2 +
 23 files changed, 361 insertions(+), 83 deletions(-)

-- 
2.7.4



[Mesa-dev] [PATCH 01/21] swr/rast: Fix byte offset for non-indexed draws

2018-04-25 Thread George Kyriazis
for the case when USE_SIMD16_SHADERS == FALSE
---
 src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp 
b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
index 9630afa..6e2bab3 100644
--- a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
@@ -1729,15 +1729,15 @@ void ProcessDraw(
 fetchInfo_lo.xpLastIndex = fetchInfo_lo.xpIndices;
 uint32_t offset;
 offset = std::min(endVertex-i, (uint32_t) 
KNOB_SIMD16_WIDTH);
-#if USE_SIMD16_SHADERS
 offset *= 4; // convert from index to address
+#if USE_SIMD16_SHADERS
 fetchInfo_lo.xpLastIndex += offset;
 #else
-fetchInfo_lo.xpLastIndex += std::min(offset, (uint32_t) 
KNOB_SIMD_WIDTH) * 4; // * 4 for converting index to address
+fetchInfo_lo.xpLastIndex += std::min(offset, (uint32_t) 
KNOB_SIMD_WIDTH);
 uint32_t offset2 = std::min(offset, (uint32_t) 
KNOB_SIMD16_WIDTH)-KNOB_SIMD_WIDTH;
 assert(offset >= 0);
 fetchInfo_hi.xpLastIndex = fetchInfo_hi.xpIndices;
-fetchInfo_hi.xpLastIndex += offset2 * 4; // * 4 for 
converting index to address
+fetchInfo_hi.xpLastIndex += offset2;
 #endif
 }
 // 1. Execute FS/VS for a single SIMD.
-- 
2.7.4



[Mesa-dev] [PATCH 14/21] swr/rast: Cleanup old windows cruft.

2018-04-25 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
index 284eb27..bfc3e42 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
@@ -82,13 +82,6 @@ JitManager::JitManager(uint32_t simdWidth, const char *arch, 
const char* core)
 
 hostCPUName = sys::getHostCPUName();
 
-#if defined(_WIN32)
-// Needed for MCJIT on windows
-Triple hostTriple(sys::getProcessTriple());
-hostTriple.setObjectFormat(Triple::COFF);
-mpCurrentModule->setTargetTriple(hostTriple.getTriple());
-#endif // _WIN32
-
 auto optLevel = CodeGenOpt::Aggressive;
 
 if (KNOB_JIT_OPTIMIZATION_LEVEL >= CodeGenOpt::None &&
@@ -97,6 +90,7 @@ JitManager::JitManager(uint32_t simdWidth, const char *arch, 
const char* core)
 optLevel = CodeGenOpt::Level(KNOB_JIT_OPTIMIZATION_LEVEL);
 }
 
+mpCurrentModule->setTargetTriple(sys::getProcessTriple());
 mpExec = EngineBuilder(std::move(newModule))
 .setTargetOptions(tOpts)
 .setOptLevel(optLevel)
@@ -163,13 +157,7 @@ void JitManager::SetupNewModule()
 
 std::unique_ptr<Module> newModule(new Module("", mContext));
 mpCurrentModule = newModule.get();
-#if defined(_WIN32)
-// Needed for MCJIT on windows
-Triple hostTriple(sys::getProcessTriple());
-hostTriple.setObjectFormat(Triple::COFF);
-newModule->setTargetTriple(hostTriple.getTriple());
-#endif // _WIN32
-
+mpCurrentModule->setTargetTriple(sys::getProcessTriple());
 mpExec->addModule(std::move(newModule));
 mIsModuleFinalized = false;
 }
-- 
2.7.4



[Mesa-dev] [PATCH 09/21] swr/rast: Use different handling for stream masks

2018-04-25 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/common/os.h  | 3 +++
 src/gallium/drivers/swr/rasterizer/core/api.cpp | 4 ++--
 src/gallium/drivers/swr/rasterizer/core/frontend.cpp| 6 +++---
 src/gallium/drivers/swr/rasterizer/core/state.h | 2 +-
 src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp | 2 ++
 5 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
b/src/gallium/drivers/swr/rasterizer/common/os.h
index 5cfd12f..e779562 100644
--- a/src/gallium/drivers/swr/rasterizer/common/os.h
+++ b/src/gallium/drivers/swr/rasterizer/common/os.h
@@ -209,6 +209,9 @@ unsigned char _BitScanReverse(unsigned int *Index, unsigned 
int Mask)
 return (Mask != 0);
 }
 
+#define _BitScanForward64 _BitScanForward
+#define _BitScanReverse64 _BitScanReverse
+
 inline
 void *AlignedMalloc(size_t size, size_t alignment)
 {
diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp 
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index e37e2e4..a2ee85d 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -976,14 +976,14 @@ void SetupPipeline(DRAW_CONTEXT *pDC)
 
 if (pState->state.soState.soEnable)
 {
-uint32_t streamMasks = 0;
+uint64_t streamMasks = 0;
 for (uint32_t i = 0; i < 4; ++i)
 {
 streamMasks |= pState->state.soState.streamMasks[i];
 }
 
 DWORD maxAttrib;
-if (_BitScanReverse(&maxAttrib, streamMasks))
+if (_BitScanReverse64(&maxAttrib, streamMasks))
 {
 pState->state.feNumAttributes = 
std::max(pState->state.feNumAttributes, (uint32_t)(maxAttrib + 1));
 }
diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp 
b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
index 6e2bab3..1847c3e 100644
--- a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
@@ -528,10 +528,10 @@ static void StreamOut(
 for (uint32_t primIndex = 0; primIndex < numPrims; ++primIndex)
 {
 DWORD slot = 0;
-uint32_t soMask = soState.streamMasks[streamIndex];
+uint64_t soMask = soState.streamMasks[streamIndex];
 
 // Write all entries into primitive data buffer for SOS.
-while (_BitScanForward(&slot, soMask))
+while (_BitScanForward64(&slot, soMask))
 {
 simd4scalar attrib[MAX_NUM_VERTS_PER_PRIM];// prim attribs 
(always 4 wide)
 uint32_t paSlot = slot + soState.vertexAttribOffset[streamIndex];
@@ -551,7 +551,7 @@ static void StreamOut(
 _mm_store_ps((float*)pPrimDataAttrib, attrib[v]);
 }
 
-soMask &= ~(1 << slot);
+soMask &= ~(uint64_t(1) << slot);
 }
 
 // Update pPrimData pointer 
diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h 
b/src/gallium/drivers/swr/rasterizer/core/state.h
index 217cf44..f160913 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -702,7 +702,7 @@ struct SWR_STREAMOUT_STATE
 // The stream masks specify which attributes are sent to which streams.
 // These masks help the FE to setup the pPrimData buffer that is passed
 // the Stream Output Shader (SOS) function.
-uint32_t streamMasks[MAX_SO_STREAMS];
+uint64_t streamMasks[MAX_SO_STREAMS];
 
 // Number of attributes, including position, per vertex that are streamed 
out.
 // This should match number of bits in stream mask.
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
index 15a6bc4..f804900 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
@@ -313,6 +313,7 @@ struct StreamOutJit : public Builder
 
 JitManager::DumpToFile(soFunc, "SoFunc_optimized");
 
+
 return soFunc;
 }
 };
@@ -333,6 +334,7 @@ PFN_SO_FUNC JitStreamoutFunc(HANDLE hJitMgr, const HANDLE 
hFunc)
 
 pJitMgr->DumpAsm(func, "SoFunc_optimized");
 
+
 return pfnStreamOut;
 }
 
-- 
2.7.4
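
The mask-walking loop touched by the patch above can be sketched in portable C++ as follows. This is only an illustration: bit_scan_forward64 and collect_slots are hypothetical names, and the scan helper is a simple stand-in for _BitScanForward64, not the SWR implementation. The point it shows is why the clearing step needs a 64-bit constant once the mask is widened to uint64_t.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable stand-in for _BitScanForward64 (hypothetical helper): writes the
// index of the lowest set bit to *index and returns true, or returns false
// when the mask is zero.
inline bool bit_scan_forward64(uint32_t* index, uint64_t mask)
{
    if (mask == 0)
        return false;
    uint32_t i = 0;
    while ((mask & 1u) == 0)
    {
        mask >>= 1;
        ++i;
    }
    *index = i;
    return true;
}

// Collects the set slots of a 64-bit stream mask, lowest slot first.
inline std::vector<uint32_t> collect_slots(uint64_t soMask)
{
    std::vector<uint32_t> slots;
    uint32_t slot = 0;
    while (bit_scan_forward64(&slot, soMask))
    {
        slots.push_back(slot);
        // The shifted constant must be 64 bits wide; a plain int 1 cannot
        // represent bit positions 32..63.
        soMask &= ~(uint64_t(1) << slot);
    }
    return slots;
}

// Usage: slots above bit 31 are visited and cleared like any other slot.
inline void example_usage()
{
    std::vector<uint32_t> slots = collect_slots((uint64_t(1) << 40) | 0x5);
    assert(slots.size() == 3);
    assert(slots[0] == 0 && slots[1] == 2 && slots[2] == 40);
}
```

With the old `1 << slot` form, a slot of 32 or more would shift a 32-bit int out of range, so the bit would never be cleared reliably and the loop could misbehave.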



[Mesa-dev] [PATCH 04/21] swr/rast: Add some SIMD_T utility functors

2018-04-25 Thread George Kyriazis
VecEqual and VecHash
---
 .../drivers/swr/rasterizer/common/simdlib.hpp  | 66 ++
 1 file changed, 66 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib.hpp 
b/src/gallium/drivers/swr/rasterizer/common/simdlib.hpp
index 4114645..24cf27d 100644
--- a/src/gallium/drivers/swr/rasterizer/common/simdlib.hpp
+++ b/src/gallium/drivers/swr/rasterizer/common/simdlib.hpp
@@ -580,3 +580,69 @@ template <typename SIMD_T> using Double = typename 
SIMD_T::Double;
 template <typename SIMD_T> using Integer= typename SIMD_T::Integer;
 template <typename SIMD_T> using Vec4   = typename SIMD_T::Vec4;
 template <typename SIMD_T> using Mask   = typename SIMD_T::Mask;
+
+template <typename SIMD_T>
+struct SIMDVecEqual
+{
+INLINE bool operator () (Integer<SIMD_T> a, Integer<SIMD_T> b) const
+{
+Integer<SIMD_T> c = SIMD_T::xor_si(a, b);
+return SIMD_T::testz_si(c, c);
+}
+
+INLINE bool operator () (Float<SIMD_T> a, Float<SIMD_T> b) const
+{
+return this->operator()(SIMD_T::castps_si(a), SIMD_T::castps_si(b));
+}
+
+INLINE bool operator () (Double<SIMD_T> a, Double<SIMD_T> b) const
+{
+return this->operator()(SIMD_T::castpd_si(a), SIMD_T::castpd_si(b));
+}
+};
+
+template <typename SIMD_T>
+struct SIMDVecHash
+{
+INLINE uint32_t operator ()(Integer<SIMD_T> val) const
+{
+#if defined(_WIN64) || !defined(_WIN32) // assume non-Windows is always 64-bit
+static_assert(sizeof(void*) == 8, "This path only meant for 64-bit 
code");
+
+uint64_t crc32 = 0;
+const uint64_t *pData = reinterpret_cast<const uint64_t*>(&val);
+static const uint32_t loopIterations = sizeof(val) / sizeof(void*);
+static_assert(loopIterations * sizeof(void*) == sizeof(val), "bad 
vector size");
+
+for (uint32_t i = 0; i < loopIterations; ++i)
+{
+crc32 = _mm_crc32_u64(crc32, pData[i]);
+}
+
+return static_cast<uint32_t>(crc32);
+#else
+static_assert(sizeof(void*) == 4, "This path only meant for 32-bit 
code");
+
+uint32_t crc32 = 0;
+const uint32_t *pData = reinterpret_cast<const uint32_t*>(&val);
+static const uint32_t loopIterations = sizeof(val) / sizeof(void*);
+static_assert(loopIterations * sizeof(void*) == sizeof(val), "bad 
vector size");
+
+for (uint32_t i = 0; i < loopIterations; ++i)
+{
+crc32 = _mm_crc32_u32(crc32, pData[i]);
+}
+
+return crc32;
+#endif
+};
+
+INLINE uint32_t operator ()(Float<SIMD_T> val) const
+{
+return operator()(SIMD_T::castps_si(val));
+};
+INLINE uint32_t operator ()(Double<SIMD_T> val) const
+{
+return operator()(SIMD_T::castpd_si(val));
+}
+};
-- 
2.7.4
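
Equality and hash functors of this shape fit the KeyEqual and Hash parameters of std::unordered_map. The sketch below shows that pairing with a plain array as a stand-in for a SIMD register. It is an assumption that SWR uses the functors this way; Vec8i, VecCache and lookup_or_insert are hypothetical names, and FNV-1a is swapped in for the CRC32 intrinsics so the example does not depend on SSE4.2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Hypothetical stand-in for a SIMD register: eight 32-bit lanes.
using Vec8i = std::array<uint32_t, 8>;

// Bitwise equality, analogous in effect to the xor + testz check in the patch.
struct VecEqual
{
    bool operator()(const Vec8i& a, const Vec8i& b) const
    {
        return std::memcmp(a.data(), b.data(), sizeof(Vec8i)) == 0;
    }
};

// FNV-1a over the raw bytes. This replaces the CRC32 intrinsics used in the
// patch so the sketch builds without SSE4.2 support.
struct VecHash
{
    std::size_t operator()(const Vec8i& v) const
    {
        unsigned char bytes[sizeof(Vec8i)];
        std::memcpy(bytes, v.data(), sizeof(bytes));
        uint64_t h = 14695981039346656037ull; // FNV-1a 64-bit offset basis
        for (unsigned char b : bytes)
        {
            h ^= b;
            h *= 1099511628211ull; // FNV-1a 64-bit prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Maps a vector value to a small integer id, e.g. to deduplicate constants.
using VecCache = std::unordered_map<Vec8i, int, VecHash, VecEqual>;

// Returns the id already stored for v, or assigns the next free id.
inline int lookup_or_insert(VecCache& cache, const Vec8i& v)
{
    auto it = cache.find(v);
    if (it != cache.end())
        return it->second;
    int id = static_cast<int>(cache.size());
    cache.emplace(v, id);
    return id;
}

// Usage: identical vectors share one id, different vectors get new ids.
inline void example_usage()
{
    VecCache cache;
    Vec8i a{1, 2, 3, 4, 5, 6, 7, 8};
    Vec8i b{1, 2, 3, 4, 5, 6, 7, 9};
    assert(lookup_or_insert(cache, a) == 0);
    assert(lookup_or_insert(cache, b) == 1);
    assert(lookup_or_insert(cache, a) == 0);
}
```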



[Mesa-dev] [PATCH 08/21] swr/rast: Silence warnings

2018-04-25 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp| 2 --
 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp| 1 -
 src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp | 3 ++-
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index 58fdb7f..72bf900 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -558,8 +558,6 @@ struct BlendJit : public Builder
 ppoMask->setName("ppoMask");
 Value* ppMask = LOAD(pBlendContext, { 0, SWR_BLEND_CONTEXT_pMask });
 ppMask->setName("pMask");
-Value* AlphaTest1 = LOAD(pBlendContext, { 0, 
SWR_BLEND_CONTEXT_isAlphaBlended });
-ppMask->setName("AlphaTest1");
 
 static_assert(KNOB_COLOR_HOT_TILE_FORMAT == R32G32B32A32_FLOAT, 
"Unsupported hot tile format");
 Value* dst[4];
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index a43c787..48f0961 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -1070,7 +1070,6 @@ Value* FetchJit::GetSimdValid16bitIndices(Value* 
pIndices, Value* pLastIndex)
 Value* FetchJit::GetSimdValid32bitIndices(Value* pIndices, Value* pLastIndex)
 {
 DataLayout dL(JM()->mpCurrentModule);
-unsigned int ptrSize = dL.getPointerSize() * 8;  // ptr size in bits
 Value* iLastIndex = pLastIndex; 
 Value* iIndices = pIndices;
 
diff --git 
a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
index eac0549..b8c3296 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
@@ -168,7 +168,6 @@ namespace SwrJit
 // intrinsic.
 void GetRequestedWidthAndType(CallInst* pCallInst, const StringRef 
intrinName, TargetWidth* pWidth, Type** pTy)
 {
-uint32_t vecWidth;
 Type* pVecTy = pCallInst->getType();
 
 // Check for intrinsic specific types
@@ -210,6 +209,7 @@ namespace SwrJit
 {
 case W256: numElem = 8; break;
 case W512: numElem = 16; break;
+   default: SWR_ASSERT(false, "Unhandled vector width type %d\n", 
width);
 }
 
 return ConstantVector::getNullValue(VectorType::get(pTy, numElem));
@@ -222,6 +222,7 @@ namespace SwrJit
 {
 case W256: mask = B->C((uint8_t)-1); break;
 case W512: mask = B->C((uint16_t)-1); break;
+   default: SWR_ASSERT(false, "Unhandled vector width type %d\n", 
width);
 }
 return mask;
 }
-- 
2.7.4



[Mesa-dev] [PATCH 13/21] swr/rast: Package events.proto with core output

2018-04-25 Thread George Kyriazis
This happens only if the file exists in DEBUG_OUTPUT_DIR. The expectation is
that the AR rasterizerLauncher will start placing it there when launching
a workload (that change is in a subsequent check-in).
---
 .../drivers/swr/rasterizer/archrast/archrast.cpp   | 30 +-
 .../codegen/templates/gen_ar_eventhandlerfile.hpp  |  4 ++-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp 
b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
index ff7bdc3..285d1ac 100644
--- a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
+++ b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
@@ -93,7 +93,35 @@ namespace ArchRast
 class EventHandlerApiStats : public EventHandlerFile
 {
 public:
-EventHandlerApiStats(uint32_t id) : EventHandlerFile(id) {}
+EventHandlerApiStats(uint32_t id) : EventHandlerFile(id) {
+#if defined(_WIN32)
+// Attempt to copy the events.proto file to the ArchRasty output 
dir. It's common for tools to place the events.proto file
+// in the DEBUG_OUTPUT_DIR when launching AR. If it exists, this 
will attempt to copy it the first time we get here to package
+// it with the stats. Otherwise, the user would need to specify 
the events.proto location when parsing the stats in post.
+std::stringstream eventsProtoSrcFilename, eventsProtoDstFilename;
+eventsProtoSrcFilename << KNOB_DEBUG_OUTPUT_DIR << 
"\\events.proto" << std::ends;
+eventsProtoDstFilename << mOutputDir.substr(0, mOutputDir.size() - 
1) << "\\events.proto" << std::ends;
+
+// If event.proto already exists, we're done; else do the copy
+struct stat buf; // Use a Posix stat for file existence check
+if (!stat(eventsProtoDstFilename.str().c_str(), &buf) == 0) {
+// Now check to make sure the events.proto source exists
+if (stat(eventsProtoSrcFilename.str().c_str(), &buf) == 0) {
+std::ifstream srcFile;
+srcFile.open(eventsProtoSrcFilename.str().c_str(), 
std::ios::binary);
+if (srcFile.is_open())
+{
+// Just do a binary buffer copy
+std::ofstream dstFile;
+dstFile.open(eventsProtoDstFilename.str().c_str(), 
std::ios::binary);
+dstFile << srcFile.rdbuf();
+dstFile.close();
+}
+srcFile.close();
+}
+}
+#endif
+}
 
 virtual void Handle(const DrawInstancedEvent& event)
 {
diff --git 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
index d1852b3..54d2486 100644
--- 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
+++ 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
@@ -56,7 +56,8 @@ namespace ArchRast
 const char* pBaseName = strrchr(procname, '\\');
 std::stringstream outDir;
 outDir << KNOB_DEBUG_OUTPUT_DIR << pBaseName << "_" << pid << 
std::ends;
-CreateDirectory(outDir.str().c_str(), NULL);
+mOutputDir = outDir.str();
+CreateDirectory(mOutputDir.c_str(), NULL);
 
 // There could be multiple threads creating thread pools. We
 // want to make sure they are uniquly identified by adding in
@@ -152,6 +153,7 @@ namespace ArchRast
 }
 
 std::string mFilename;
+std::string mOutputDir;
 
 static const uint32_t mBufferSize = 1024;
 uint8_t mBuffer[mBufferSize];
-- 
2.7.4



[Mesa-dev] [PATCH] st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.

2018-04-25 Thread Eric Anholt
GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an
unsized internalformat, and it should be non-color-renderable.
fbobject.c's check for color-renderability tests whether the texture has
a 2101010 mesa format, so make sure that we have chosen a 2101010 format
so that the check can do what it was meant to.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5.
---
 src/mesa/state_tracker/st_format.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/mesa/state_tracker/st_format.c 
b/src/mesa/state_tracker/st_format.c
index 3db3c7e967c6..418f5342025c 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -2138,6 +2138,19 @@ st_choose_format(struct st_context *st, GLenum 
internalFormat,
   goto success;
}
 
+   /* For an unsized GL_RGB but a 2_10_10_10 type, try to pick one of the
+* 2_10_10_10 formats.  This is important for
+* GL_EXT_texture_type_2_10_10_10_EXT support, which says that these
+* formats are not color-renderable.  Mesa's check for making those
+* non-color-renderable is based on our chosen format being 2101010.
+*/
+   if (type == GL_UNSIGNED_INT_2_10_10_10_REV) {
+  if (internalFormat == GL_RGB)
+ internalFormat = GL_RGB10;
+  else if (internalFormat == GL_RGBA)
+ internalFormat = GL_RGB10_A2;
+   }
+
/* search table for internalFormat */
for (i = 0; i < ARRAY_SIZE(format_map); i++) {
       const struct format_mapping *mapping = &format_map[i];
-- 
2.17.0
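
The remapping this patch adds ahead of the table search can be sketched as a small standalone function. The enum values are copied from the Khronos GL headers and redefined under k-prefixed names so the sketch builds without them; promote_unsized_format is a hypothetical name, not a Mesa function.

```cpp
#include <cassert>
#include <cstdint>

// Enum values as listed in the Khronos GL headers; the k-prefixed names are
// local to this example.
constexpr uint32_t kGL_UNSIGNED_BYTE = 0x1401;
constexpr uint32_t kGL_RGB = 0x1907;
constexpr uint32_t kGL_RGBA = 0x1908;
constexpr uint32_t kGL_RGB10 = 0x8052;
constexpr uint32_t kGL_RGB10_A2 = 0x8059;
constexpr uint32_t kGL_UNSIGNED_INT_2_10_10_10_REV = 0x8368;

// Returns the internal format to use for the table lookup. An unsized
// GL_RGB/GL_RGBA combined with a 2_10_10_10_REV type is promoted to the
// matching sized 10-bit format; everything else passes through unchanged.
inline uint32_t promote_unsized_format(uint32_t internalFormat, uint32_t type)
{
    if (type == kGL_UNSIGNED_INT_2_10_10_10_REV)
    {
        if (internalFormat == kGL_RGB)
            return kGL_RGB10;
        if (internalFormat == kGL_RGBA)
            return kGL_RGB10_A2;
    }
    return internalFormat;
}

// Usage: only the unsized formats with the packed type are changed.
inline void example_usage()
{
    assert(promote_unsized_format(kGL_RGB, kGL_UNSIGNED_INT_2_10_10_10_REV) == kGL_RGB10);
    assert(promote_unsized_format(kGL_RGB, kGL_UNSIGNED_BYTE) == kGL_RGB);
}
```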



[Mesa-dev] [Bug 106231] llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785

2018-04-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106231

Roland Scheidegger  changed:

   What|Removed |Added

 CC||jfons...@vmware.com,
   ||srol...@vmware.com

--- Comment #1 from Roland Scheidegger  ---
(In reply to Tom Hudson from comment #0)
> https://reviews.llvm.org/D44785 changed the way adds, addus, subs, subus are
> handled.
> 
> llvmpipe issues llvm.x86.sse2.padds and llvm.x86.sse2.psubs in
> src/gallium/auxiliary/gallivm/lp_bld_arit.c:lp_build_add() and
> lp_build_sub().
> 
> After D44785 landed, lp_test_blend.c started crashing every time it entered
> LLVM-compiled code for type=u8nx16.
> 
> Commenting out the issues of padds/psubs avoids this crash. The LLVM
> project, in discussing the bug at https://reviews.llvm.org/D44785, suspects
> that the cause may be because llvmpipe is "missing the autoupgrade stage"?

Autoupgrade doesn't work for jit code (at least I wouldn't know how, and it
never has in the past), so the way we handled disappearing of intrinsics in the
past was to just not use them any more (for newer llvm versions) and do
essentially the same as what autoupgrade would do, and we'll have to do the
same here.
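
As a scalar illustration of what "not using the intrinsic" can look like, a saturating add or subtract can be written with only an ordinary add/sub, a compare and a select. The exact IR that autoupgrade emits is not shown in this thread, so the sequence below is an assumption, and add_sat_u8 / sub_sat_u8 are hypothetical names rather than llvmpipe functions.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add for one 8-bit lane: do the wrapping add, then
// select the maximum value if the result wrapped around.
inline uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    uint8_t sum = static_cast<uint8_t>(a + b); // wraps modulo 256
    return sum < a ? uint8_t(0xFF) : sum;      // wrapped => clamp to 255
}

// Unsigned saturating subtract for one 8-bit lane: clamp to zero instead of
// wrapping below zero.
inline uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? static_cast<uint8_t>(a - b) : uint8_t(0);
}

// Usage: results clamp at the ends of the 8-bit range.
inline void example_usage()
{
    assert(add_sat_u8(200, 100) == 255);
    assert(sub_sat_u8(5, 10) == 0);
}
```

A vectorized version would apply the same add, compare and select per lane.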

I am however sure that in the past when intrinsics disappeared, it would
complain when compiling the IR, rather than just calling a null (address 0)
function in the compiled code. Which is of course much nicer...
For instance when the min/max (integer) intrinsics disappeared:
https://bugs.llvm.org/show_bug.cgi?id=28176
I'm not sure whether just calling a null function is now expected due to
some llvm changes, but if it is, that's definitely making everybody's life
harder...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2 3/5] i965/clear: Remove an early return in fast_clear_depth

2018-04-25 Thread Nanley Chery
On Wed, Apr 25, 2018 at 11:43:23AM -0700, Rafael Antognolli wrote:
> On Wed, Apr 25, 2018 at 11:40:15AM -0700, Nanley Chery wrote:
> > On Wed, Apr 25, 2018 at 11:30:14AM -0700, Rafael Antognolli wrote:
> > > On Tue, Apr 24, 2018 at 05:48:44PM -0700, Nanley Chery wrote:
> > > > Reduce complexity and allow the next patch to delete some code. With
> > > > this change, clear operations will still be skipped and setting the
> > > > aux_state will cause no side-effects.
> > > 
> > > It's going to skip the fast clear, but if I understood correctly it will
> > > call intel_miptree_set_aux_state(), which marks the BRW_NEW_AUX_STATE
> > > and will cause all the surface state to be re-emitted.
> > > 
> > > I'm not sure if there's something else that already triggers that, but
> > > if that wasn't the case already, maybe we are going to be emitting a lot
> > > more state now?
> > > 
> > 
> > The surface state won't be re-emitted. intel_miptree_set_aux_state()
> > will only mark BRW_NEW_AUX_STATE if the new aux state differs from the
> > current one. In the case where we can skip the fast clear, the current
> > and the new states will both equal ISL_AUX_STATE_CLEAR.
> 
> Ouch, you are right, I missed the big "if" there. In this case, this
> patch is
> 
> Reviewed-by: Rafael Antognolli 
> 

Thanks!

> > > > Remove the associated comment which implies an early return.
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_clear.c | 5 -
> > > >  1 file changed, 5 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> > > > b/src/mesa/drivers/dri/i965/brw_clear.c
> > > > index fdc31cd9b68..6521141d7f6 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > > > @@ -230,10 +230,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > > > }
> > > >  
> > > > if (!need_clear) {
> > > > -  /* If all of the layers we intend to clear are already in the 
> > > > clear
> > > > -   * state then simply updating the miptree fast clear value is 
> > > > sufficient
> > > > -   * to change their clear value.
> > > > -   */
> > > >if (!same_clear_value) {
> > > >   /* BLORP updates the indirect clear color buffer when 
> > > > performing a
> > > >* fast clear. Since we are skipping the fast clear here, we 
> > > > need to
> > > > @@ -241,7 +237,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > > >*/
> > > >   intel_miptree_update_indirect_color(brw, mt);
> > > >}
> > > > -  return true;
> > > > }
> > > >  
> > > > for (unsigned a = 0; a < num_layers; a++) {
> > > > -- 
> > > > 2.16.2
> > > > 


Re: [Mesa-dev] [PATCH] llvmpipe : Fixed an issue where the display target texture was mapped multiple times.

2018-04-25 Thread Roland Scheidegger
On 20.04.2018 at 08:07, 정성찬 wrote:
> Dear Roland Scheidegger
> 
> Thank you very much for your time and efforts.
> 
> First, I want to talk about the problem that I encountered.
> I am currently developing a display server system using the llvmpipe driver
> and the kms-dri winsys module. During the compositing process, winsys->
> displaytarget_map () will be called continuously for the same resource. So 
> in the case of kms winsys, mmap () returns MAP_FAILED inside the 
> kms_sw_displaytarget_map function, and segfault terminates the process.
Isn't there actually a bug in kms_sw_displaytarget_map() too?
By the looks of it (together with kms_sw_displaytarget_unmap()) it tries
to keep a count of mappings and only map when the target isn't already
mapped (the unmap only decreases the count if the map count is still
higher than 1). But the logic can't work since the mapped/rd_mapped
fields are never set.

That said, even if the logic there worked, we'd obviously still have an
imbalance in the map count.
I suppose we can't keep it always mapped?
If there's only ever one dt texture, we could maybe simply always try to
unmap it at the beginning of lp_setup_set_fragment_sampler_views (and I
suppose in theory we should do the same for vertex/geometry sampler view
setup, unless we outright forbid mapping dt textures there).
Then inside the loop, only map it when it's seen for the first time (and
record on the setup context that it has been mapped).
(And don't forget to unmap it when the setup context is destroyed too.)
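
A balanced map count of the kind discussed above could look roughly like the following. This is a minimal sketch, not the kms-dri winsys code: DisplayTarget, dt_map and dt_unmap are hypothetical names, and the counters stand in for the real mmap()/munmap() calls.

```cpp
#include <cassert>

// Hypothetical display target; the backend counters stand in for the real
// mmap()/munmap() calls.
struct DisplayTarget
{
    int  map_count = 0;      // outstanding map requests
    int  backend_maps = 0;   // how often the real mapping was created
    int  backend_unmaps = 0; // how often the real mapping was destroyed
    char storage[16] = {};   // stand-in for the mapped memory
};

// Creates the real mapping only for the first user; later callers share it.
inline void* dt_map(DisplayTarget& dt)
{
    if (dt.map_count++ == 0)
        ++dt.backend_maps;
    return dt.storage;
}

// Destroys the real mapping only when the last user is gone.
inline void dt_unmap(DisplayTarget& dt)
{
    if (dt.map_count == 0)
        return; // unbalanced unmap, ignore
    if (--dt.map_count == 0)
        ++dt.backend_unmaps;
}

// Usage: two maps and two unmaps touch the backend exactly once each.
inline void example_usage()
{
    DisplayTarget dt;
    void* p0 = dt_map(dt);
    void* p1 = dt_map(dt);
    assert(p0 == p1 && dt.backend_maps == 1);
    dt_unmap(dt);
    assert(dt.backend_unmaps == 0);
    dt_unmap(dt);
    assert(dt.backend_unmaps == 1);
}
```

This only works if every map is paired with an unmap, which is the imbalance the sampler-view setup would still need to fix.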

> 
> I also do not think this patch is perfect. As you said, the resource is
> still 
> mapped. But in my opinion, this approach is a good way to solve the 
> aforementioned critical issues.

I think it would be better to fix this for real rather than some half
attempt which still leaves things quite broken.

Roland

> 
> What do you think? I look forward to your reply.
> Sincerely yours,
> 
> Seongchan Jeong.
> 
> 
> 2018-04-20 11:15 GMT+09:00 Roland Scheidegger:
> 
> On 19.04.2018 at 08:04, Seongchan Jeong wrote:
> > The lp_setup_set_fragment_sampler_views function can be called
> > when the texture module is enabled. However, mapping can be
> > performed several times for one display target texture, but
> > unmapping never happens. So some logic has been added to
> > unmap the display target texture to prevent additional mappings
> > when the texture is already mapped.
> > ---
> >  src/gallium/drivers/llvmpipe/lp_setup.c   | 9 +
> >  src/gallium/drivers/llvmpipe/lp_texture.h | 2 ++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c
> b/src/gallium/drivers/llvmpipe/lp_setup.c
> > index c157323133..71ceafe2b7 100644
> > --- a/src/gallium/drivers/llvmpipe/lp_setup.c
> > +++ b/src/gallium/drivers/llvmpipe/lp_setup.c
> > @@ -907,6 +907,13 @@ lp_setup_set_fragment_sampler_views(struct
> lp_setup_context *setup,
> >               */
> >              struct llvmpipe_screen *screen =
> llvmpipe_screen(res->screen);
> >              struct sw_winsys *winsys = screen->winsys;
> > +
> > +            /* unmap the texture which is already mapped */
> > +            if(lp_tex->mapped){
> > +                winsys->displaytarget_unmap(winsys, lp_tex->dt);
> > +                lp_tex->mapped = false;
> > +            }
> > +
> >              jit_tex->base = winsys->displaytarget_map(winsys,
> lp_tex->dt,
> >                                                         
>  PIPE_TRANSFER_READ);
> >              jit_tex->row_stride[0] = lp_tex->row_stride[0];
> > @@ -917,6 +924,8 @@ lp_setup_set_fragment_sampler_views(struct
> lp_setup_context *setup,
> >              jit_tex->depth = res->depth0;
> >              jit_tex->first_level = jit_tex->last_level = 0;
> >              assert(jit_tex->base);
> > +
> > +            lp_tex->mapped = true;
> 
> I am not quite convinced this is the right fix.
> Clearly the code right now isn't right, and pretty much relies on the
> winsys->displaytarget_map() being a no-op there just giving the mapping
> without any side effects.
> The problem with this fix is it still would be kept mapped in the end
> after sampling (and, it can and probably will be mapped elsewhere too
> still).
> 
> Do you hit any specific bug with the code as-is?
> 
> Roland
> 
> 
> >           }
> >        }
> >        else {
> > diff --git a/src/gallium/drivers/llvmpipe/lp_texture.h
> b/src/gallium/drivers/llvmpipe/lp_texture.h
> > index 3d315bb9a7..9e39d31eb3 100644
> > --- a/src/gallium/drivers/llvmpipe/lp_texture.h
> > +++ b/src/gallium/drivers/llvmpipe/lp_texture.h
> > @@ -75,6 +75,8 @@ struct llvmpipe_resource
> >      */
> >     struct sw_displaytarget *dt;
> > 
>   

Re: [Mesa-dev] [PATCH v2 3/5] i965/clear: Remove an early return in fast_clear_depth

2018-04-25 Thread Rafael Antognolli
On Wed, Apr 25, 2018 at 11:40:15AM -0700, Nanley Chery wrote:
> On Wed, Apr 25, 2018 at 11:30:14AM -0700, Rafael Antognolli wrote:
> > On Tue, Apr 24, 2018 at 05:48:44PM -0700, Nanley Chery wrote:
> > > Reduce complexity and allow the next patch to delete some code. With
> > > this change, clear operations will still be skipped and setting the
> > > aux_state will cause no side-effects.
> > 
> > It's going to skip the fast clear, but if I understood correctly it will
> > call intel_miptree_set_aux_state(), which marks the BRW_NEW_AUX_STATE
> > and will cause all the surface state to be re-emitted.
> > 
> > I'm not sure if there's something else that already triggers that, but
> > if that wasn't the case already, maybe we are going to be emitting a lot
> > more state now?
> > 
> 
> The surface state won't be re-emitted. intel_miptree_set_aux_state()
> will only mark BRW_NEW_AUX_STATE if the new aux state differs from the
> current one. In the case where we can skip the fast clear, the current
> and the new states will both equal ISL_AUX_STATE_CLEAR.

Ouch, you are right, I missed the big "if" there. In this case, this
patch is

Reviewed-by: Rafael Antognolli 

> > > Remove the associated comment which implies an early return.
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_clear.c | 5 -
> > >  1 file changed, 5 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> > > b/src/mesa/drivers/dri/i965/brw_clear.c
> > > index fdc31cd9b68..6521141d7f6 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > > @@ -230,10 +230,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > > }
> > >  
> > > if (!need_clear) {
> > > -  /* If all of the layers we intend to clear are already in the clear
> > > -   * state then simply updating the miptree fast clear value is 
> > > sufficient
> > > -   * to change their clear value.
> > > -   */
> > >if (!same_clear_value) {
> > >   /* BLORP updates the indirect clear color buffer when 
> > > performing a
> > >* fast clear. Since we are skipping the fast clear here, we 
> > > need to
> > > @@ -241,7 +237,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > >*/
> > >   intel_miptree_update_indirect_color(brw, mt);
> > >}
> > > -  return true;
> > > }
> > >  
> > > for (unsigned a = 0; a < num_layers; a++) {
> > > -- 
> > > 2.16.2
> > > 


Re: [Mesa-dev] [PATCH v2 3/5] i965/clear: Remove an early return in fast_clear_depth

2018-04-25 Thread Nanley Chery
On Wed, Apr 25, 2018 at 11:30:14AM -0700, Rafael Antognolli wrote:
> On Tue, Apr 24, 2018 at 05:48:44PM -0700, Nanley Chery wrote:
> > Reduce complexity and allow the next patch to delete some code. With
> > this change, clear operations will still be skipped and setting the
> > aux_state will cause no side-effects.
> 
> It's going to skip the fast clear, but if I understood correctly it will
> call intel_miptree_set_aux_state(), which marks the BRW_NEW_AUX_STATE
> and will cause all the surface state to be re-emitted.
> 
> I'm not sure if there's something else that already triggers that, but
> if that wasn't the case already, maybe we are going to be emitting a lot
> more state now?
> 

The surface state won't be re-emitted. intel_miptree_set_aux_state()
will only mark BRW_NEW_AUX_STATE if the new aux state differs from the
current one. In the case where we can skip the fast clear, the current
and the new states will both equal ISL_AUX_STATE_CLEAR.
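
The behavior described here, marking the dirty bit only when the state actually changes, can be sketched like this. The types and names (AuxState, Context, Miptree, set_aux_state) are hypothetical simplifications for illustration, not the i965 structures.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for the driver's aux state and context.
enum class AuxState { Clear, CompressedClear, Resolved };

struct Context
{
    bool new_aux_state = false; // stand-in for the BRW_NEW_AUX_STATE dirty bit
};

struct Miptree
{
    AuxState aux_state = AuxState::Resolved;
};

// Updates the aux state and flags the context dirty only on a real change,
// so redundant calls do not trigger a surface state re-emit.
inline void set_aux_state(Context& ctx, Miptree& mt, AuxState new_state)
{
    if (mt.aux_state == new_state)
        return;
    mt.aux_state = new_state;
    ctx.new_aux_state = true;
}

// Usage: setting Clear on an already-clear miptree leaves the flag untouched.
inline void example_usage()
{
    Context ctx;
    Miptree mt;
    mt.aux_state = AuxState::Clear;
    set_aux_state(ctx, mt, AuxState::Clear);
    assert(!ctx.new_aux_state);
}
```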

-Nanley

> > Remove the associated comment which implies an early return.
> > ---
> >  src/mesa/drivers/dri/i965/brw_clear.c | 5 -
> >  1 file changed, 5 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> > b/src/mesa/drivers/dri/i965/brw_clear.c
> > index fdc31cd9b68..6521141d7f6 100644
> > --- a/src/mesa/drivers/dri/i965/brw_clear.c
> > +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> > @@ -230,10 +230,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> > }
> >  
> > if (!need_clear) {
> > -  /* If all of the layers we intend to clear are already in the clear
> > -   * state then simply updating the miptree fast clear value is 
> > sufficient
> > -   * to change their clear value.
> > -   */
> >if (!same_clear_value) {
> >   /* BLORP updates the indirect clear color buffer when performing a
> >* fast clear. Since we are skipping the fast clear here, we need 
> > to
> > @@ -241,7 +237,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> >*/
> >   intel_miptree_update_indirect_color(brw, mt);
> >}
> > -  return true;
> > }
> >  
> > for (unsigned a = 0; a < num_layers; a++) {
> > -- 
> > 2.16.2
> > 


Re: [Mesa-dev] [PATCH v2 1/5] i965: Add and use a helper to update the indirect miptree color

2018-04-25 Thread Rafael Antognolli
This patch is

Reviewed-by: Rafael Antognolli 

On Tue, Apr 24, 2018 at 05:48:42PM -0700, Nanley Chery wrote:
> Split out this functionality to enable a fast-clear optimization for
> color miptrees in the next commit.
> 
> v2: Avoid the additional refactor (Jason).
> ---
>  src/mesa/drivers/dri/i965/brw_clear.c | 23 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 ++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  7 +++
>  3 files changed, 34 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index 3d540d6d905..fdc31cd9b68 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -234,25 +234,12 @@ brw_fast_clear_depth(struct gl_context *ctx)
> * state then simply updating the miptree fast clear value is 
> sufficient
> * to change their clear value.
> */
> -  if (devinfo->gen >= 10 && !same_clear_value) {
> - /* Before gen10, it was enough to just update the clear value in the
> -  * miptree. But on gen10+, we let blorp update the clear value state
> -  * buffer when doing a fast clear. Since we are skipping the fast
> -  * clear here, we need to update the clear color ourselves.
> +  if (!same_clear_value) {
> + /* BLORP updates the indirect clear color buffer when performing a
> +  * fast clear. Since we are skipping the fast clear here, we need to
> +  * do the update ourselves.
>*/
> - uint32_t clear_offset = mt->aux_buf->clear_color_offset;
> - union isl_color_value clear_color = { .f32 = { clear_value, } };
> -
> - /* We can't update the clear color while the hardware is still using
> -  * the previous one for a resolve or sampling from it. So make sure
> -  * that there's no pending commands at this point.
> -  */
> - brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
> - for (int i = 0; i < 4; i++) {
> -brw_store_data_imm32(brw, mt->aux_buf->clear_color_bo,
> - clear_offset + i * 4, clear_color.u32[i]);
> - }
> - brw_emit_pipe_control_flush(brw, 
> PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> + intel_miptree_update_indirect_color(brw, mt);
>}
>return true;
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 93a91fd8081..1006635c0d7 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -3837,3 +3837,25 @@ intel_miptree_get_clear_color(const struct 
> gen_device_info *devinfo,
>return mt->fast_clear_color;
> }
>  }
> +
> +void
> +intel_miptree_update_indirect_color(struct brw_context *brw,
> +struct intel_mipmap_tree *mt)
> +{
> +   assert(mt->aux_buf);
> +
> +   if (mt->aux_buf->clear_color_bo == NULL)
> +  return;
> +
> +   /* We can't update the clear color while the hardware is still using the
> +* previous one for a resolve or sampling from it. Make sure that there 
> are
> +* no pending commands at this point.
> +*/
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
> +   for (int i = 0; i < 4; i++) {
> +  brw_store_data_imm32(brw, mt->aux_buf->clear_color_bo,
> +   mt->aux_buf->clear_color_offset + i * 4,
> +   mt->fast_clear_color.u32[i]);
> +   }
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> +}
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index e99ea44b809..1c2361c1cb0 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -749,6 +749,13 @@ intel_miptree_set_depth_clear_value(struct brw_context 
> *brw,
>  struct intel_mipmap_tree *mt,
>  float clear_value);
>  
> +/* If this miptree has an indirect clear color, update it with the value 
> stored
> + * in the miptree object.
> + */
> +void
> +intel_miptree_update_indirect_color(struct brw_context *brw,
> +struct intel_mipmap_tree *mt);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> -- 
> 2.16.2
> 


Re: [Mesa-dev] [PATCH v2 3/5] i965/clear: Remove an early return in fast_clear_depth

2018-04-25 Thread Rafael Antognolli
On Tue, Apr 24, 2018 at 05:48:44PM -0700, Nanley Chery wrote:
> Reduce complexity and allow the next patch to delete some code. With
> this change, clear operations will still be skipped and setting the
> aux_state will cause no side-effects.

It's going to skip the fast clear, but if I understood correctly it will
call intel_miptree_set_aux_state(), which marks BRW_NEW_AUX_STATE
and will cause all the surface state to be re-emitted.

I'm not sure if there's something else that already triggers that, but
if that wasn't the case already, maybe we are going to be emitting a lot
more state now?

> Remove the associated comment which implies an early return.
> ---
>  src/mesa/drivers/dri/i965/brw_clear.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index fdc31cd9b68..6521141d7f6 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -230,10 +230,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
> }
>  
> if (!need_clear) {
> -  /* If all of the layers we intend to clear are already in the clear
> -   * state then simply updating the miptree fast clear value is 
> sufficient
> -   * to change their clear value.
> -   */
>if (!same_clear_value) {
>   /* BLORP updates the indirect clear color buffer when performing a
>* fast clear. Since we are skipping the fast clear here, we need to
> @@ -241,7 +237,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
>*/
>   intel_miptree_update_indirect_color(brw, mt);
>}
> -  return true;
> }
>  
> for (unsigned a = 0; a < num_layers; a++) {
> -- 
> 2.16.2
> 


[Mesa-dev] [PATCH] egl/x11: Send invalidate to the driver on dri2_copy_region

2018-04-25 Thread Deepak Rawat
Similar to what is done in dri2_x11_swap_buffers_msc, send invalidate
to the driver because egl/X11 is not watching for the server's
invalidate events. The dri2_copy_region path is triggered when the
server supports DRI2 minor version 1.

Tested with piglit egl tests for regression.

Cc: 
Signed-off-by: Deepak Rawat 
Reviewed-by: Thomas Hellstrom 
---
 src/egl/drivers/dri2/platform_x11.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 6c287b4d06..e99434ea3a 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -841,6 +841,13 @@ dri2_copy_region(_EGLDriver *drv, _EGLDisplay *disp,
   render_attachment);
free(xcb_dri2_copy_region_reply(dri2_dpy->conn, cookie, NULL));
 
+   /*
+* Just like as done in dri2_x11_swap_buffers_msc we aren't watching for
+* server's invalidate events, so just send invalidate to driver.
+*/
+   if (dri2_dpy->flush->base.version >= 3 && dri2_dpy->flush->invalidate)
+  dri2_dpy->flush->invalidate(dri2_surf->dri_drawable);
+
return EGL_TRUE;
 }
 
-- 
2.17.0
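
For readers following along, the guard used in the hunk above (check the extension version, then check that the hook is set, then call it) can be sketched in isolation. This is only a sketch under assumptions: sketch_flush_ext and the helper names are hypothetical and much simpler than the real DRI2 flush extension; the "version >= 3" threshold is taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified extension table; not the real DRI2 flush
 * extension structure. */
struct sketch_flush_ext {
   int version;
   void (*invalidate)(void *drawable); /* checked only when version >= 3 */
};

/* Counts how often the example hook below has been called. */
static int sketch_invalidate_count;

static void
sketch_count_invalidate(void *drawable)
{
   (void)drawable;
   sketch_invalidate_count++;
}

/* Call the hook only if the extension is new enough and the hook is set.
 * Returns 1 if the driver was notified, 0 otherwise. */
static int
sketch_notify_invalidate(const struct sketch_flush_ext *flush, void *drawable)
{
   assert(flush != NULL);
   if (flush->version >= 3 && flush->invalidate) {
      flush->invalidate(drawable);
      return 1;
   }
   return 0;
}
```

With this shape, an older extension or a missing hook simply results in no notification rather than a NULL call.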



Re: [Mesa-dev] [PATCH] gallium/util: Fix incorrect refcounting of separate stencil.

2018-04-25 Thread Rob Clark
On Wed, Apr 25, 2018 at 12:49 PM, Eric Anholt  wrote:
> The driver may have a reference on the separate stencil buffer for some
> reason (like an unflushed job using it), so we can't directly free the
> resource and should instead just decrement the refcount that we own.
> Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
> on vc5.
>
> Fixes: e94eb5e6000e ("gallium/util: add u_transfer_helper")

oh, whoops

Reviewed-by: Rob Clark 

> ---
>  src/gallium/auxiliary/util/u_transfer_helper.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_transfer_helper.c 
> b/src/gallium/auxiliary/util/u_transfer_helper.c
> index dd31049920fc..3b085fd99f09 100644
> --- a/src/gallium/auxiliary/util/u_transfer_helper.c
> +++ b/src/gallium/auxiliary/util/u_transfer_helper.c
> @@ -138,8 +138,7 @@ u_transfer_helper_resource_destroy(struct pipe_screen 
> *pscreen,
> if (helper->vtbl->get_stencil) {
>struct pipe_resource *stencil = helper->vtbl->get_stencil(prsc);
>
> -  if (stencil)
> - helper->vtbl->resource_destroy(pscreen, stencil);
> +  pipe_resource_reference(&stencil, NULL);
> }
>
> helper->vtbl->resource_destroy(pscreen, prsc);
> --
> 2.17.0
>


Re: [Mesa-dev] [PATCH] Intel: Add a Kaby Lake PCI ID

2018-04-25 Thread Rafael Antognolli
This patch is

Reviewed-by: Rafael Antognolli 

On Wed, Apr 25, 2018 at 09:23:04AM -0700, matthew.s.atw...@intel.com wrote:
> From: Matt Atwood 
> 
> v2: Branding changed
> 
> Signed-off-by: Matt Atwood 
> ---
>  include/pci_ids/i965_pci_ids.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index c740a50..82e4a54 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -156,6 +156,7 @@ CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby 
> Lake GT2)")
>  CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")
>  CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
>  CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
> +CHIPSET(0x591C, kbl_gt2, "Intel(R) Kaby Lake GT2")
>  CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
>  CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")
>  CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
> -- 
> 2.7.4
> 


[Mesa-dev] [PATCH] gallium/util: Fix incorrect refcounting of separate stencil.

2018-04-25 Thread Eric Anholt
The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.

Fixes: e94eb5e6000e ("gallium/util: add u_transfer_helper")
---
 src/gallium/auxiliary/util/u_transfer_helper.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_transfer_helper.c 
b/src/gallium/auxiliary/util/u_transfer_helper.c
index dd31049920fc..3b085fd99f09 100644
--- a/src/gallium/auxiliary/util/u_transfer_helper.c
+++ b/src/gallium/auxiliary/util/u_transfer_helper.c
@@ -138,8 +138,7 @@ u_transfer_helper_resource_destroy(struct pipe_screen 
*pscreen,
if (helper->vtbl->get_stencil) {
   struct pipe_resource *stencil = helper->vtbl->get_stencil(prsc);
 
-  if (stencil)
- helper->vtbl->resource_destroy(pscreen, stencil);
+  pipe_resource_reference(&stencil, NULL);
}
 
helper->vtbl->resource_destroy(pscreen, prsc);
-- 
2.17.0
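
To make the ownership argument in the commit message concrete, here is a small sketch of "drop only the reference you own". It assumes a plain integer refcount; sketch_resource and sketch_resource_reference are hypothetical names that imitate the general shape of a reference helper and are not gallium's actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted resource; not gallium's struct pipe_resource. */
struct sketch_resource {
   int refcount;
};

/* Number of resources currently allocated, for illustration only. */
static int sketch_live_resources;

static struct sketch_resource *
sketch_resource_create(void)
{
   struct sketch_resource *res = calloc(1, sizeof(*res));
   assert(res != NULL);
   res->refcount = 1;
   sketch_live_resources++;
   return res;
}

/* Point *dst at src, taking a reference on src and dropping the one that
 * *dst held. The old object is freed only when its last reference goes. */
static void
sketch_resource_reference(struct sketch_resource **dst,
                          struct sketch_resource *src)
{
   struct sketch_resource *old = *dst;

   if (src)
      src->refcount++;
   if (old && --old->refcount == 0) {
      free(old);
      sketch_live_resources--;
   }
   *dst = src;
}
```

If some other holder (an unflushed job, say) still has a reference, sketch_resource_reference(&stencil, NULL) leaves the object alive, whereas destroying it directly would set up a double free once that holder drops its own reference.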



[Mesa-dev] [PATCH] Intel: Add a Kaby Lake PCI ID

2018-04-25 Thread matthew . s . atwood
From: Matt Atwood 

v2: Branding changed

Signed-off-by: Matt Atwood 
---
 include/pci_ids/i965_pci_ids.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index c740a50..82e4a54 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -156,6 +156,7 @@ CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby 
Lake GT2)")
 CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")
 CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
 CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
+CHIPSET(0x591C, kbl_gt2, "Intel(R) Kaby Lake GT2")
 CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
 CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")
 CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
-- 
2.7.4



Re: [Mesa-dev] [PATCH] meson: Fix with_intel_vk and with_amd_vk variables

2018-04-25 Thread Dylan Baker
Quoting Mike Lothian (2018-04-24 18:49:10)
> Can you also add radeon to the amd one? That works on autotools
> 
> On Wed, 25 Apr 2018 at 02:16 Jordan Justen  wrote:
> 
> Fixes: 5608d0a2cee "meson: use array type options"
> Cc: Dylan Baker 
> Signed-off-by: Jordan Justen 
> ---
>  meson.build | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/meson.build b/meson.build
> index 52a1075823f..c0e5c94d794 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -213,8 +213,8 @@ if _vulkan_drivers.contains('auto')
>    endif
>  endif
>  if _vulkan_drivers != ['']
> -  with_intel_vk = _drivers.contains('intel')
> -  with_amd_vk = _drivers.contains('amd')
> +  with_intel_vk = _vulkan_drivers.contains('intel')
> +  with_amd_vk = _vulkan_drivers.contains('amd')
>    with_any_vk = true
>  endif
> 
> --
> 2.16.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

At this point I think the ship has kinda sailed on changing the names of various
flags unless we're significantly reworking the functionality; we have a lot of
devs and some distros using meson already. Adding a compatibility option to
match autotools seems short-sighted, since the plan to remove autotools and scons
remains (once we have functional equivalence). I also like the symmetry of intel
and amd being the choices, but that's just personal taste.

Just my 2¢.

Dylan




Re: [Mesa-dev] [PATCH] radv: set ac_surf_info::num_channels correctly

2018-04-25 Thread Samuel Pitoiset



On 04/25/2018 05:11 PM, Bas Nieuwenhuizen wrote:

Reviewed-by: Bas Nieuwenhuizen 

Do we want this in 18.1?


Not sure if we have to, but we can backport it.

I will run a full CTS before pushing all pending fixes.



On Wed, Apr 25, 2018 at 11:22 AM, Samuel Pitoiset
 wrote:

num_channels was introduced by "ac/surface: don't set
the display flag for obviously unsupported cases".

Based on RadeonSI.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_image.c | 2 +-
  src/amd/vulkan/vk_format.h  | 7 +++
  2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 942df56d42..5dfd0dc739 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -968,7 +968,7 @@ radv_image_create(VkDevice _device,
 image->info.samples = pCreateInfo->samples;
 image->info.array_size = pCreateInfo->arrayLayers;
 image->info.levels = pCreateInfo->mipLevels;
-   image->info.num_channels = 4; /* TODO: set this correctly */
+   image->info.num_channels = vk_format_get_nr_components(format);

 image->vk_format = pCreateInfo->format;
 image->tiling = pCreateInfo->tiling;
diff --git a/src/amd/vulkan/vk_format.h b/src/amd/vulkan/vk_format.h
index 43265ed3d9..b8cb4f4ed3 100644
--- a/src/amd/vulkan/vk_format.h
+++ b/src/amd/vulkan/vk_format.h
@@ -488,4 +488,11 @@ vk_to_non_srgb_format(VkFormat format)
 }
  }

+static inline unsigned
+vk_format_get_nr_components(VkFormat format)
+{
+   const struct vk_format_description *desc = 
vk_format_description(format);
+   return desc->nr_channels;
+}
+
  #endif /* VK_FORMAT_H */
--
2.17.0



Re: [Mesa-dev] [PATCH] radv: fix DCC enablement since partial MSAA implementation

2018-04-25 Thread Samuel Pitoiset



On 04/25/2018 05:10 PM, Bas Nieuwenhuizen wrote:

Oops.


Yes, oops. :)



Reviewed-by: Bas Nieuwenhuizen 

On Wed, Apr 25, 2018 at 10:56 AM, Samuel Pitoiset
 wrote:

dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not have been.

This is likely going to fix a performance regression.

Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an 
option")
Cc: 18.1 
Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_image.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 348f4c7b34..793f861f4f 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -133,12 +133,12 @@ radv_use_dcc_for_image(struct radv_device *device,
 if (create_info->scanout)
 return false;

-   /* FIXME: DCC for MSAA with 4x and 8x samples doesn't work yet. */
-   if (pCreateInfo->samples > 2)
-   return false;
-
-   /* TODO: Enable DCC for MSAA textures. */
-   if (!device->physical_device->dcc_msaa_allowed)
+   /* FIXME: DCC for MSAA with 4x and 8x samples doesn't work yet, while
+* 2x can be enabled with an option.
+*/
+   if (pCreateInfo->samples > 2 ||
+   (pCreateInfo->samples == 2 &&
+!device->physical_device->dcc_msaa_allowed))
 return false;

 /* Determine if the formats are DCC compatible. */
--
2.17.0
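
The difference between the removed and the added condition is easiest to see as two small predicates. The sketch below mirrors only the two checks visible in the hunk above; the function names are hypothetical and the rest of radv_use_dcc_for_image is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Old check, as removed by the patch: the dcc_msaa_allowed test applied to
 * every sample count, including single-sampled images. */
static bool
sketch_old_dcc_allowed(unsigned samples, bool dcc_msaa_allowed)
{
   if (samples > 2)
      return false;
   if (!dcc_msaa_allowed)
      return false;
   return true;
}

/* New check, as added by the patch: the option only gates 2x MSAA. */
static bool
sketch_new_dcc_allowed(unsigned samples, bool dcc_msaa_allowed)
{
   if (samples > 2 || (samples == 2 && !dcc_msaa_allowed))
      return false;
   return true;
}
```

For a single-sampled image with the option unset, the old predicate rejects DCC while the new one allows it, which is the regression the commit message describes.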



[Mesa-dev] [PATCH v2] ac: fix texture query LOD for 1D textures on GFX9

2018-04-25 Thread Samuel Pitoiset
1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

v2: - move the fixup into ac_nir_to_llvm

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 900c1c4afea..e4ae6ef49ad 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1309,6 +1309,14 @@ static LLVMValueRef build_tex_intrinsic(struct 
ac_nir_context *ctx,
}
}
 
+   /* Fixup for GFX9 which allocates 1D textures as 2D. */
+   if (instr->op == nir_texop_lod && ctx->ac.chip_class >= GFX9) {
+   if ((args->dim == ac_image_2darray ||
+args->dim == ac_image_2d) && !args->coords[1]) {
+   args->coords[1] = ctx->ac.i32_0;
+   }
+   }
+
args->attributes = AC_FUNC_ATTR_READNONE;
return ac_build_image_opcode(&ctx->ac, args);
 }
-- 
2.17.0
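
The fixup in the hunk above amounts to filling in a missing second coordinate with zero when a LOD query hits a 2D-typed image. A stand-alone sketch of that idea follows; it is an assumption-laden simplification in which coordinates are plain int pointers instead of LLVM values, and every name (sketch_args, sketch_fixup_lod_coords) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real code works on LLVM values. */
enum sketch_dim { SKETCH_1D, SKETCH_2D, SKETCH_2D_ARRAY };

struct sketch_args {
   enum sketch_dim dim;
   const int *coords[3]; /* NULL means "not provided" */
};

static const int sketch_zero = 0;

/* If a LOD query targets a 2D-typed image but has no second coordinate
 * (a 1D texture allocated as 2D), fill the gap with a constant zero.
 * Returns true if a coordinate was added. */
static bool
sketch_fixup_lod_coords(struct sketch_args *args, bool is_lod_query,
                        bool is_gfx9_plus)
{
   assert(args != NULL);
   if (!is_lod_query || !is_gfx9_plus)
      return false;

   if ((args->dim == SKETCH_2D || args->dim == SKETCH_2D_ARRAY) &&
       !args->coords[1]) {
      args->coords[1] = &sketch_zero;
      return true;
   }
   return false;
}
```

The fixup is idempotent: once the second coordinate is present, a repeated call leaves the arguments untouched.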



Re: [Mesa-dev] [PATCH] clover: Fix host access validation for sub-buffer creation

2018-04-25 Thread Aaron Watry
On Wed, Apr 25, 2018 at 9:03 AM, Jan Vesely  wrote:
> On Thu, 2018-04-19 at 20:39 -0500, Aaron Watry wrote:
>>   From CL 1.2 Section 5.2.1:
>> CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and
>> flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with
>> CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if
>> buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify
>> CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY .
>>
>> Fixes CL 1.2 CTS test/api get_buffer_info
>
> Hi Aaron,
>
> there are similar failures in test/mem_host_flags:
>
> test_mem_host_write_only_buffer_RW_Mapping
> Mapped host pointer difference found
> ERROR: test_mem_host_write_only_buffer_RW_Mapping! ((unknown) from 
> /home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:267)
> ERROR: test_mem_host_write_only_buffer! ((unknown) from 
> /home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:295)
> test_mem_host_write_only_buffer FAILED
>
> test_mem_host_write_only_buffer_RW_Mapping
> Mapped host pointer difference found
> ERROR: test_mem_host_write_only_buffer_RW_Mapping! ((unknown) from 
> /home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:267)
> ERROR: test_mem_host_write_only_subbuffer! ((unknown) from 
> /home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:328)
> test_mem_host_write_only_subbuffer FAILED
>
> ...
> FAILED 2 of 9 tests
>
> Are you looking into those as well?

Thanks for making me aware of that one.  I hadn't been looking into it.

The next thing I had been trying to look into was issues with kernel
attributes not being available after compilation for
clGetKernelWorkGroupInfo when running the API test-group.

Your error looks potentially simpler to solve with possibly less
interference with Pierre/Karol's work, so maybe I'll look into that
instead.

For reference, the one I had been looking at was in test_api:

kernel_required_group_size...
Device reported CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3.
The CL_KERNEL_WORK_GROUP_SIZE for the kernel is 256.
For global dimension 64 x 14 x 10, kernel will require local dimension
64 x 2 x 2.
ERROR: Incorrect compile work group size returned for specified size!
(returned 0,0,0, expected 64,2,2)
kernel_required_group_size FAILED

--Aaron

>
> thanks,
> Jan
>
>>
>> v2: Correct host_access_flags check (Francisco)
>>
>> Signed-off-by: Aaron Watry 
>> Cc: Francisco Jerez 
>> ---
>>  src/gallium/state_trackers/clover/api/memory.cpp | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/clover/api/memory.cpp 
>> b/src/gallium/state_trackers/clover/api/memory.cpp
>> index 9b3cd8b1f5..e83be0286a 100644
>> --- a/src/gallium/state_trackers/clover/api/memory.cpp
>> +++ b/src/gallium/state_trackers/clover/api/memory.cpp
>> @@ -57,8 +57,12 @@ namespace {
>>parent.flags() & host_access_flags) |
>>   (parent.flags() & host_ptr_flags));
>>
>> - if (~flags & parent.flags() &
>> - ((dev_access_flags & ~CL_MEM_READ_WRITE) | host_access_flags))
>> + if (~flags & parent.flags() & (dev_access_flags & 
>> ~CL_MEM_READ_WRITE))
>> +throw error(CL_INVALID_VALUE);
>> +
>> + //Check if new host access flags cause a mismatch between 
>> host-read/write-only.
>> + if (!(flags & CL_MEM_HOST_NO_ACCESS) &&
>> + (~flags & parent.flags() & host_access_flags))
>>  throw error(CL_INVALID_VALUE);
>>
>>   return flags;


Re: [Mesa-dev] [PATCH] st/va: Fix typos

2018-04-25 Thread Leo Liu

Reviewed-by: Leo Liu 


On 2018-04-25 11:32 AM, Drew Davenport wrote:

s/attibute/attribute/
s/suface/surface/
---
  src/gallium/state_trackers/va/surface.c | 48 -
  1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 8604136944..1dc4466560 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -525,9 +525,9 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, VAConfigID 
config_id,
  }
  
  static VAStatus

-suface_from_external_memory(VADriverContextP ctx, vlVaSurface *surface,
-VASurfaceAttribExternalBuffers *memory_attibute,
-unsigned index, struct pipe_video_buffer *templat)
+surface_from_external_memory(VADriverContextP ctx, vlVaSurface *surface,
+ VASurfaceAttribExternalBuffers *memory_attribute,
+ unsigned index, struct pipe_video_buffer *templat)
  {
 vlVaDriver *drv;
 struct pipe_screen *pscreen;
@@ -539,21 +539,21 @@ suface_from_external_memory(VADriverContextP ctx, 
vlVaSurface *surface,
 pscreen = VL_VA_PSCREEN(ctx);
 drv = VL_VA_DRIVER(ctx);
  
-   if (!memory_attibute || !memory_attibute->buffers ||

-   index > memory_attibute->num_buffers)
+   if (!memory_attribute || !memory_attribute->buffers ||
+   index > memory_attribute->num_buffers)
return VA_STATUS_ERROR_INVALID_PARAMETER;
  
-   if (surface->templat.width != memory_attibute->width ||

-   surface->templat.height != memory_attibute->height ||
-   memory_attibute->num_planes < 1)
+   if (surface->templat.width != memory_attribute->width ||
+   surface->templat.height != memory_attribute->height ||
+   memory_attribute->num_planes < 1)
return VA_STATUS_ERROR_INVALID_PARAMETER;
  
-   switch (memory_attibute->pixel_format) {

+   switch (memory_attribute->pixel_format) {
 case VA_FOURCC_RGBA:
 case VA_FOURCC_RGBX:
 case VA_FOURCC_BGRA:
 case VA_FOURCC_BGRX:
-  if (memory_attibute->num_planes != 1)
+  if (memory_attribute->num_planes != 1)
   return VA_STATUS_ERROR_INVALID_PARAMETER;
break;
 default:
@@ -565,16 +565,16 @@ suface_from_external_memory(VADriverContextP ctx, 
vlVaSurface *surface,
 res_templ.last_level = 0;
 res_templ.depth0 = 1;
 res_templ.array_size = 1;
-   res_templ.width0 = memory_attibute->width;
-   res_templ.height0 = memory_attibute->height;
+   res_templ.width0 = memory_attribute->width;
+   res_templ.height0 = memory_attribute->height;
 res_templ.format = surface->templat.buffer_format;
 res_templ.bind = PIPE_BIND_SAMPLER_VIEW;
 res_templ.usage = PIPE_USAGE_DEFAULT;
  
 memset(&whandle, 0, sizeof(struct winsys_handle));

 whandle.type = DRM_API_HANDLE_TYPE_FD;
-   whandle.handle = memory_attibute->buffers[index];
-   whandle.stride = memory_attibute->pitches[index];
+   whandle.handle = memory_attribute->buffers[index];
+   whandle.stride = memory_attribute->pitches[index];
  
 resource = pscreen->resource_from_handle(pscreen, &res_templ, &whandle,

  PIPE_HANDLE_USAGE_READ_WRITE);
@@ -629,7 +629,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
  VASurfaceAttrib *attrib_list, unsigned int num_attribs)
  {
 vlVaDriver *drv;
-   VASurfaceAttribExternalBuffers *memory_attibute;
+   VASurfaceAttribExternalBuffers *memory_attribute;
 struct pipe_video_buffer templat;
 struct pipe_screen *pscreen;
 int i;
@@ -655,7 +655,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
return VA_STATUS_ERROR_INVALID_CONTEXT;
  
 /* Default. */

-   memory_attibute = NULL;
+   memory_attribute = NULL;
 memory_type = VA_SURFACE_ATTRIB_MEM_TYPE_VA;
 expected_fourcc = 0;
  
@@ -687,7 +687,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format,

(attrib_list[i].flags == VA_SURFACE_ATTRIB_SETTABLE)) {
   if (attrib_list[i].value.type != VAGenericValueTypePointer)
  return VA_STATUS_ERROR_INVALID_PARAMETER;
- memory_attibute = (VASurfaceAttribExternalBuffers 
*)attrib_list[i].value.value.p;
+ memory_attribute = (VASurfaceAttribExternalBuffers 
*)attrib_list[i].value.value.p;
}
 }
  
@@ -703,10 +703,10 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int format,

 case VA_SURFACE_ATTRIB_MEM_TYPE_VA:
break;
 case VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME:
-  if (!memory_attibute)
+  if (!memory_attribute)
   return VA_STATUS_ERROR_INVALID_PARAMETER;
  
-  expected_fourcc = memory_attibute->pixel_format;

+  expected_fourcc = memory_attribute->pixel_format;
break;
 default:
assert(0);
@@ -730,7 +730,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
  

[Mesa-dev] [PATCH] radeon/vcn: fix mpeg4 msg buffer settings

2018-04-25 Thread boyuan.zhang
From: Boyuan Zhang 

The previous bit-field assignments are incorrect and cause certain MPEG-4
decodes to fail due to wrong flag values. This patch fixes these assignments.

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index f83e9e5..4bc922d 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -554,15 +554,15 @@ static rvcn_dec_message_mpeg4_asp_vld_t 
get_mpeg4_msg(struct radeon_decoder *dec
 
result.vop_time_increment_resolution = 
pic->vop_time_increment_resolution;
 
-   result.short_video_header |= pic->short_video_header << 0;
-   result.interlaced |= pic->interlaced << 2;
-result.load_intra_quant_mat |= 1 << 3;
-   result.load_nonintra_quant_mat |= 1 << 4;
-   result.quarter_sample |= pic->quarter_sample << 5;
-   result.complexity_estimation_disable |= 1 << 6;
-   result.resync_marker_disable |= pic->resync_marker_disable << 7;
-   result.newpred_enable |= 0 << 10; //
-   result.reduced_resolution_vop_enable |= 0 << 11;
+   result.short_video_header = pic->short_video_header;
+   result.interlaced = pic->interlaced;
+   result.load_intra_quant_mat = 1;
+   result.load_nonintra_quant_mat = 1;
+   result.quarter_sample = pic->quarter_sample;
+   result.complexity_estimation_disable = 1;
+   result.resync_marker_disable = pic->resync_marker_disable;
+   result.newpred_enable = 0;
+   result.reduced_resolution_vop_enable = 0;
 
result.quant_type = pic->quant_type;
 
-- 
2.7.4
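
Why the shifted form loses flags can be shown with a tiny stand-alone example. It assumes the message fields are one-bit unsigned bit-fields; the struct below is hypothetical and only loosely modelled on the fields touched by the patch, not the real rvcn_dec_message_mpeg4_asp_vld_t layout.

```c
#include <assert.h>

/* Hypothetical message layout with one-bit flags. */
struct sketch_msg {
   unsigned short_video_header : 1;
   unsigned interlaced : 1;
   unsigned quarter_sample : 1;
};

/* Old style: shifts the flag as if packing a plain integer word. For a
 * non-zero shift the value no longer fits in a one-bit field, so the
 * stored bit ends up as zero. */
static unsigned
sketch_set_shifted(unsigned flag, unsigned shift)
{
   struct sketch_msg msg = { 0 };

   msg.interlaced |= flag << shift;
   return msg.interlaced;
}

/* New style: store the flag value in the field directly. */
static unsigned
sketch_set_direct(unsigned flag)
{
   struct sketch_msg msg = { 0 };

   msg.interlaced = flag;
   return msg.interlaced;
}
```

The compiler already places each bit-field at its own position in the word, so shifting by the bit position a second time pushes the value out of the field.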



[Mesa-dev] [PATCH] st/va: Fix typos

2018-04-25 Thread Drew Davenport
s/attibute/attribute/
s/suface/surface/
---
 src/gallium/state_trackers/va/surface.c | 48 -
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 8604136944..1dc4466560 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -525,9 +525,9 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, VAConfigID 
config_id,
 }
 
 static VAStatus
-suface_from_external_memory(VADriverContextP ctx, vlVaSurface *surface,
-VASurfaceAttribExternalBuffers *memory_attibute,
-unsigned index, struct pipe_video_buffer *templat)
+surface_from_external_memory(VADriverContextP ctx, vlVaSurface *surface,
+ VASurfaceAttribExternalBuffers *memory_attribute,
+ unsigned index, struct pipe_video_buffer *templat)
 {
vlVaDriver *drv;
struct pipe_screen *pscreen;
@@ -539,21 +539,21 @@ suface_from_external_memory(VADriverContextP ctx, 
vlVaSurface *surface,
pscreen = VL_VA_PSCREEN(ctx);
drv = VL_VA_DRIVER(ctx);
 
-   if (!memory_attibute || !memory_attibute->buffers ||
-   index > memory_attibute->num_buffers)
+   if (!memory_attribute || !memory_attribute->buffers ||
+   index > memory_attribute->num_buffers)
   return VA_STATUS_ERROR_INVALID_PARAMETER;
 
-   if (surface->templat.width != memory_attibute->width ||
-   surface->templat.height != memory_attibute->height ||
-   memory_attibute->num_planes < 1)
+   if (surface->templat.width != memory_attribute->width ||
+   surface->templat.height != memory_attribute->height ||
+   memory_attribute->num_planes < 1)
   return VA_STATUS_ERROR_INVALID_PARAMETER;
 
-   switch (memory_attibute->pixel_format) {
+   switch (memory_attribute->pixel_format) {
case VA_FOURCC_RGBA:
case VA_FOURCC_RGBX:
case VA_FOURCC_BGRA:
case VA_FOURCC_BGRX:
-  if (memory_attibute->num_planes != 1)
+  if (memory_attribute->num_planes != 1)
  return VA_STATUS_ERROR_INVALID_PARAMETER;
   break;
default:
@@ -565,16 +565,16 @@ suface_from_external_memory(VADriverContextP ctx, 
vlVaSurface *surface,
res_templ.last_level = 0;
res_templ.depth0 = 1;
res_templ.array_size = 1;
-   res_templ.width0 = memory_attibute->width;
-   res_templ.height0 = memory_attibute->height;
+   res_templ.width0 = memory_attribute->width;
+   res_templ.height0 = memory_attribute->height;
res_templ.format = surface->templat.buffer_format;
res_templ.bind = PIPE_BIND_SAMPLER_VIEW;
res_templ.usage = PIPE_USAGE_DEFAULT;
 
memset(&whandle, 0, sizeof(struct winsys_handle));
whandle.type = DRM_API_HANDLE_TYPE_FD;
-   whandle.handle = memory_attibute->buffers[index];
-   whandle.stride = memory_attibute->pitches[index];
+   whandle.handle = memory_attribute->buffers[index];
+   whandle.stride = memory_attribute->pitches[index];
 
resource = pscreen->resource_from_handle(pscreen, &res_templ, &whandle,
 PIPE_HANDLE_USAGE_READ_WRITE);
@@ -629,7 +629,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
 VASurfaceAttrib *attrib_list, unsigned int num_attribs)
 {
vlVaDriver *drv;
-   VASurfaceAttribExternalBuffers *memory_attibute;
+   VASurfaceAttribExternalBuffers *memory_attribute;
struct pipe_video_buffer templat;
struct pipe_screen *pscreen;
int i;
@@ -655,7 +655,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
/* Default. */
-   memory_attibute = NULL;
+   memory_attribute = NULL;
memory_type = VA_SURFACE_ATTRIB_MEM_TYPE_VA;
expected_fourcc = 0;
 
@@ -687,7 +687,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
   (attrib_list[i].flags == VA_SURFACE_ATTRIB_SETTABLE)) {
  if (attrib_list[i].value.type != VAGenericValueTypePointer)
 return VA_STATUS_ERROR_INVALID_PARAMETER;
- memory_attibute = (VASurfaceAttribExternalBuffers 
*)attrib_list[i].value.value.p;
+ memory_attribute = (VASurfaceAttribExternalBuffers 
*)attrib_list[i].value.value.p;
   }
}
 
@@ -703,10 +703,10 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
case VA_SURFACE_ATTRIB_MEM_TYPE_VA:
   break;
case VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME:
-  if (!memory_attibute)
+  if (!memory_attribute)
  return VA_STATUS_ERROR_INVALID_PARAMETER;
 
-  expected_fourcc = memory_attibute->pixel_format;
+  expected_fourcc = memory_attribute->pixel_format;
   break;
default:
   assert(0);
@@ -730,7 +730,7 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
if (expected_fourcc) {
   enum pipe_format expected_format = VaFourccToPipeFormat(expected_fourcc);
 
-  if (expected_format != 

Re: [Mesa-dev] [PATCH] radv: set ac_surf_info::num_channels correctly

2018-04-25 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

Do we want this in 18.1?

On Wed, Apr 25, 2018 at 11:22 AM, Samuel Pitoiset
 wrote:
> num_channels was introduced by "ac/surface: don't set
> the display flag for obviously unsupported cases".
>
> Based on RadeonSI.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 2 +-
>  src/amd/vulkan/vk_format.h  | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 942df56d42..5dfd0dc739 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -968,7 +968,7 @@ radv_image_create(VkDevice _device,
> image->info.samples = pCreateInfo->samples;
> image->info.array_size = pCreateInfo->arrayLayers;
> image->info.levels = pCreateInfo->mipLevels;
> -   image->info.num_channels = 4; /* TODO: set this correctly */
> +   image->info.num_channels = vk_format_get_nr_components(format);
>
> image->vk_format = pCreateInfo->format;
> image->tiling = pCreateInfo->tiling;
> diff --git a/src/amd/vulkan/vk_format.h b/src/amd/vulkan/vk_format.h
> index 43265ed3d9..b8cb4f4ed3 100644
> --- a/src/amd/vulkan/vk_format.h
> +++ b/src/amd/vulkan/vk_format.h
> @@ -488,4 +488,11 @@ vk_to_non_srgb_format(VkFormat format)
> }
>  }
>
> +static inline unsigned
> +vk_format_get_nr_components(VkFormat format)
> +{
> +   const struct vk_format_description *desc = 
> vk_format_description(format);
> +   return desc->nr_channels;
> +}
> +
>  #endif /* VK_FORMAT_H */
> --
> 2.17.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: fix DCC enablement since partial MSAA implementation

2018-04-25 Thread Bas Nieuwenhuizen
Oops.

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Apr 25, 2018 at 10:56 AM, Samuel Pitoiset
 wrote:
> dcc_msaa_allowed is always false on GFX9+ and only true on VI
> if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
> in some situations where it should not be.
>
> This is likely going to fix a performance regression.
>
> Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an 
> option")
> Cc: 18.1 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 348f4c7b34..793f861f4f 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -133,12 +133,12 @@ radv_use_dcc_for_image(struct radv_device *device,
> if (create_info->scanout)
> return false;
>
> -   /* FIXME: DCC for MSAA with 4x and 8x samples doesn't work yet. */
> -   if (pCreateInfo->samples > 2)
> -   return false;
> -
> -   /* TODO: Enable DCC for MSAA textures. */
> -   if (!device->physical_device->dcc_msaa_allowed)
> +   /* FIXME: DCC for MSAA with 4x and 8x samples doesn't work yet, while
> +* 2x can be enabled with an option.
> +*/
> +   if (pCreateInfo->samples > 2 ||
> +   (pCreateInfo->samples == 2 &&
> +!device->physical_device->dcc_msaa_allowed))
> return false;
>
> /* Determine if the formats are DCC compatible. */
> --
> 2.17.0
>
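For reference, the sample-count logic before and after this patch can be sketched in isolation. This is only a minimal illustration, not the RADV code: dcc_allowed_for_samples() and dcc_allowed_for_samples_old() are hypothetical names, and every other condition checked by radv_use_dcc_for_image() is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sample-count checks only. */
static bool
dcc_allowed_for_samples(unsigned samples, bool dcc_msaa_allowed)
{
   /* 4x and 8x MSAA never get DCC. */
   if (samples > 2)
      return false;

   /* 2x MSAA gets DCC only when the opt-in is set. */
   if (samples == 2 && !dcc_msaa_allowed)
      return false;

   /* Single-sample images no longer depend on the MSAA opt-in. */
   return true;
}

/* The pre-patch checks, kept for comparison. */
static bool
dcc_allowed_for_samples_old(unsigned samples, bool dcc_msaa_allowed)
{
   if (samples > 2)
      return false;

   /* This rejected single-sample images too when the opt-in was unset. */
   if (!dcc_msaa_allowed)
      return false;

   return true;
}
```

With samples == 1 and the opt-in unset, the old variant returns false while the new one returns true, which is the regression described in the commit message.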


Re: [Mesa-dev] [PATCH] ac: fix texture query LOD for 1D textures on GFX9

2018-04-25 Thread Nicolai Hähnle

On 25.04.2018 16:46, Samuel Pitoiset wrote:



On 04/25/2018 04:10 PM, Nicolai Hähnle wrote:

On 25.04.2018 11:58, Samuel Pitoiset wrote:

1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 
Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c 
b/src/amd/common/ac_llvm_build.c

index f21a5d2623c..be7379f72ef 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1533,6 +1533,16 @@ LLVMValueRef ac_build_image_opcode(struct 
ac_llvm_context *ctx,

  default:
  break;
  }
+
+    /* Fixup for GFX9 which allocates 1D textures as 2D, because at
+ * this point we don't know the original sampler dimension.
+ */
+    if (ctx->chip_class >= GFX9) {
+    if ((a->dim == ac_image_2darray ||
+ a->dim == ac_image_2d) && !a->coords[1]) {
+    num_coords = 1;
+    }
+    }


Can we do this fixup in ac_nir_to_llvm instead, please? Pretty sure 
that that's needed for correctness anyway: with this change, the 
second coordinate will be basically random, which can probably affect 
the hardware's LOD even when the texture's height is 1.


Yes, I can do that, but what should we put in coords[1]? just zero?


Yes, that seems best.

Thanks,
Nicolai






Thanks,
Nicolai


  }
  if (a->offset)







--
Learn what the world is really like,
But never forget how it should be.


Re: [Mesa-dev] [PATCH] ac: fix texture query LOD for 1D textures on GFX9

2018-04-25 Thread Samuel Pitoiset



On 04/25/2018 04:10 PM, Nicolai Hähnle wrote:

On 25.04.2018 11:58, Samuel Pitoiset wrote:

1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 
Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c 
b/src/amd/common/ac_llvm_build.c

index f21a5d2623c..be7379f72ef 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1533,6 +1533,16 @@ LLVMValueRef ac_build_image_opcode(struct 
ac_llvm_context *ctx,

  default:
  break;
  }
+
+    /* Fixup for GFX9 which allocates 1D textures as 2D, because at
+ * this point we don't know the original sampler dimension.
+ */
+    if (ctx->chip_class >= GFX9) {
+    if ((a->dim == ac_image_2darray ||
+ a->dim == ac_image_2d) && !a->coords[1]) {
+    num_coords = 1;
+    }
+    }


Can we do this fixup in ac_nir_to_llvm instead, please? Pretty sure that 
that's needed for correctness anyway: with this change, the second 
coordinate will be basically random, which can probably affect the 
hardware's LOD even when the texture's height is 1.


Yes, I can do that, but what should we put in coords[1]? just zero?



Thanks,
Nicolai


  }
  if (a->offset)







Re: [Mesa-dev] [PATCH] ac: fix texture query LOD for 1D textures on GFX9

2018-04-25 Thread Nicolai Hähnle

On 25.04.2018 11:58, Samuel Pitoiset wrote:

1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 
Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index f21a5d2623c..be7379f72ef 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1533,6 +1533,16 @@ LLVMValueRef ac_build_image_opcode(struct 
ac_llvm_context *ctx,
default:
break;
}
+
+   /* Fixup for GFX9 which allocates 1D textures as 2D, because at
+* this point we don't know the original sampler dimension.
+*/
+   if (ctx->chip_class >= GFX9) {
+   if ((a->dim == ac_image_2darray ||
+a->dim == ac_image_2d) && !a->coords[1]) {
+   num_coords = 1;
+   }
+   }


Can we do this fixup in ac_nir_to_llvm instead, please? Pretty sure that 
that's needed for correctness anyway: with this change, the second 
coordinate will be basically random, which can probably affect the 
hardware's LOD even when the texture's height is 1.


Thanks,
Nicolai


}
  
  	if (a->offset)





--
Learn what the world is really like,
But never forget how it should be.


Re: [Mesa-dev] [PATCH 01/11] intel/compiler: lower 16-bit integer extended math instructions

2018-04-25 Thread Jason Ekstrand
Some of these comments may be duplicates of ones I made the first time
through.

On Wed, Apr 11, 2018 at 12:20 AM, Iago Toral Quiroga 
wrote:

> The hardware doesn't support 16-bit integer types, so we need to implement
> these using 32-bit integer instructions and then convert the result back
> to 16-bit.
> ---
>  src/intel/Makefile.sources|   1 +
>  src/intel/compiler/brw_nir.c  |   2 +
>  src/intel/compiler/brw_nir.h  |   2 +
>  src/intel/compiler/brw_nir_lower_16bit_int_math.c | 108
> ++
>  src/intel/compiler/meson.build|   1 +
>  5 files changed, 114 insertions(+)
>  create mode 100644 src/intel/compiler/brw_nir_lower_16bit_int_math.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 91c71a8dfaf..2cd76961ea4 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -79,6 +79,7 @@ COMPILER_FILES = \
> compiler/brw_nir_analyze_boolean_resolves.c \
> compiler/brw_nir_analyze_ubo_ranges.c \
> compiler/brw_nir_attribute_workarounds.c \
> +   compiler/brw_nir_lower_16bit_int_math.c \
> compiler/brw_nir_lower_cs_intrinsics.c \
> compiler/brw_nir_opt_peephole_ffma.c \
> compiler/brw_nir_tcs_workarounds.c \
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 69ab162f888..2e5754076ed 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -638,6 +638,8 @@ brw_preprocess_nir(const struct brw_compiler
> *compiler, nir_shader *nir)
>  nir_lower_isign64 |
>  nir_lower_divmod64);
>
> +   brw_nir_lower_16bit_int_math(nir);
> +
> nir = brw_nir_optimize(nir, compiler, is_scalar);
>
> if (is_scalar) {
> diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h
> index 03f52da08e5..6ba1a8bc654 100644
> --- a/src/intel/compiler/brw_nir.h
> +++ b/src/intel/compiler/brw_nir.h
> @@ -152,6 +152,8 @@ void brw_nir_analyze_ubo_ranges(const struct
> brw_compiler *compiler,
>
>  bool brw_nir_opt_peephole_ffma(nir_shader *shader);
>
> +bool brw_nir_lower_16bit_int_math(nir_shader *shader);
> +
>  nir_shader *brw_nir_optimize(nir_shader *nir,
>   const struct brw_compiler *compiler,
>   bool is_scalar);
> diff --git a/src/intel/compiler/brw_nir_lower_16bit_int_math.c
> b/src/intel/compiler/brw_nir_lower_16bit_int_math.c
> new file mode 100644
> index 000..6876309a822
> --- /dev/null
> +++ b/src/intel/compiler/brw_nir_lower_16bit_int_math.c
> @@ -0,0 +1,108 @@
> +/*
> + * Copyright © 2018 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_nir.h"
> +#include "nir_builder.h"
> +
> +/**
> + * Intel hardware doesn't support 16-bit integer Math instructions so this
> + * pass implements them in 32-bit and then converts the result back to
> 16-bit.
> + */
> +static void
> +lower_math_instr(nir_builder *bld, nir_alu_instr *alu, bool is_signed)
> +{
> +   const nir_op op = alu->op;
> +
> +   bld->cursor = nir_before_instr(&alu->instr);
> +
> +   nir_ssa_def *srcs_32[4] = { NULL, NULL, NULL, NULL };
> +   const uint32_t num_inputs = nir_op_infos[op].num_inputs;
> +   for (uint32_t i = 0; i < num_inputs; i++) {
> +  nir_ssa_def *src = nir_ssa_for_alu_src(bld, alu, i);
> +  srcs_32[i] = is_signed ? nir_i2i32(bld, src) : nir_u2u32(bld, src);
>

For float16, we'll need f2f32.  Also, is_signed can be derived from
nir_op_infos[op].input_types so it doesn't need to be passed in.  If we
want to make it fully general, we probably also want to only do the
conversion if the source type is unsized.
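
To make the widen/operate/narrow idea concrete, here is a small stand-alone sketch in plain C. It is not the NIR pass: udiv16_via_32() and idiv16_via_32() are made-up names, division is picked only as an example operation (the opcodes the pass actually lowers are not visible in this excerpt), and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Unsigned case: zero-extend (u2u32), operate in 32 bits, truncate (u2u16). */
static uint16_t
udiv16_via_32(uint16_t a, uint16_t b)
{
   uint32_t a32 = a;
   uint32_t b32 = b;
   uint32_t r32 = a32 / b32;   /* b is assumed to be non-zero */
   return (uint16_t)r32;
}

/* Signed case: sign-extend (i2i32), operate in 32 bits, narrow (i2i16). */
static int16_t
idiv16_via_32(int16_t a, int16_t b)
{
   int32_t a32 = a;
   int32_t b32 = b;
   int32_t r32 = a32 / b32;    /* b is assumed to be non-zero */
   return (int16_t)r32;
}
```

The same 16-bit pattern 0xffff is 65535 when zero-extended and -1 when sign-extended, so the two helpers give different results for it. That is why the choice of extension has to follow the opcode's input type.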


> +   }
> +
> +   nir_ssa_def *dst_32 =
> +  nir_build_alu(bld, op, 

Re: [Mesa-dev] [PATCH] clover: Fix host access validation for sub-buffer creation

2018-04-25 Thread Jan Vesely
On Thu, 2018-04-19 at 20:39 -0500, Aaron Watry wrote:
>   From CL 1.2 Section 5.2.1:
> CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and
> flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with
> CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if
> buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify
> CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY .
> 
> Fixes CL 1.2 CTS test/api get_buffer_info

Hi Aaron,

there are similar failures in test/mem_host_flags:

test_mem_host_write_only_buffer_RW_Mapping
Mapped host pointer difference found
ERROR: test_mem_host_write_only_buffer_RW_Mapping! ((unknown) from 
/home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:267)
ERROR: test_mem_host_write_only_buffer! ((unknown) from 
/home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:295)
test_mem_host_write_only_buffer FAILED

test_mem_host_write_only_buffer_RW_Mapping
Mapped host pointer difference found
ERROR: test_mem_host_write_only_buffer_RW_Mapping! ((unknown) from 
/home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:267)
ERROR: test_mem_host_write_only_subbuffer! ((unknown) from 
/home/jvesely/OpenCL-CTS/test_conformance/mem_host_flags/mem_host_buffer.cpp:328)
test_mem_host_write_only_subbuffer FAILED

...
FAILED 2 of 9 tests

Are you looking into those as well?

thanks,
Jan

> 
> v2: Correct host_access_flags check (Francisco)
> 
> Signed-off-by: Aaron Watry 
> Cc: Francisco Jerez 
> ---
>  src/gallium/state_trackers/clover/api/memory.cpp | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/memory.cpp 
> b/src/gallium/state_trackers/clover/api/memory.cpp
> index 9b3cd8b1f5..e83be0286a 100644
> --- a/src/gallium/state_trackers/clover/api/memory.cpp
> +++ b/src/gallium/state_trackers/clover/api/memory.cpp
> @@ -57,8 +57,12 @@ namespace {
>parent.flags() & host_access_flags) |
>   (parent.flags() & host_ptr_flags));
>  
> - if (~flags & parent.flags() &
> - ((dev_access_flags & ~CL_MEM_READ_WRITE) | host_access_flags))
> + if (~flags & parent.flags() & (dev_access_flags & 
> ~CL_MEM_READ_WRITE))
> +throw error(CL_INVALID_VALUE);
> +
> + //Check if new host access flags cause a mismatch between 
> host-read/write-only.
> + if (!(flags & CL_MEM_HOST_NO_ACCESS) &&
> + (~flags & parent.flags() & host_access_flags))
>  throw error(CL_INVALID_VALUE);
>  
>   return flags;
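
As a cross-check, the CL 1.2 wording quoted in the commit message can be translated literally into a small predicate. This is a sketch in plain C, not clover's code and not the bitmask expression from the patch: sub_buffer_host_flags_valid() and the HOST_* constants are local stand-ins, and their bit values are only assumed to mirror the corresponding CL_MEM_HOST_* defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; assumed to mirror the CL_MEM_HOST_* bits. */
enum {
   HOST_WRITE_ONLY = 1 << 7,
   HOST_READ_ONLY  = 1 << 8,
   HOST_NO_ACCESS  = 1 << 9,
};

/* Returns false where the quoted spec text asks for CL_INVALID_VALUE. */
static bool
sub_buffer_host_flags_valid(uint64_t parent_flags, uint64_t sub_flags)
{
   /* Parent is host-write-only, sub-buffer asks for host-read-only. */
   if ((parent_flags & HOST_WRITE_ONLY) && (sub_flags & HOST_READ_ONLY))
      return false;

   /* Parent is host-read-only, sub-buffer asks for host-write-only. */
   if ((parent_flags & HOST_READ_ONLY) && (sub_flags & HOST_WRITE_ONLY))
      return false;

   /* Parent has no host access, sub-buffer asks for any host access. */
   if ((parent_flags & HOST_NO_ACCESS) &&
       (sub_flags & (HOST_READ_ONLY | HOST_WRITE_ONLY)))
      return false;

   return true;
}
```

A sub-buffer that narrows a host-write-only parent to HOST_NO_ACCESS passes this predicate, which matches the case the second check in the patch exempts.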




Re: [Mesa-dev] [Mesa-stable] [PATCH v2] travis: update libva required version

2018-04-25 Thread Juan A. Suarez Romero
On Wed, 2018-04-25 at 13:54 +0100, Emil Velikov wrote:
> On 24 April 2018 at 08:49, Juan A. Suarez Romero  wrote:
> > On Fri, 2018-04-20 at 16:42 +0200, Juan A. Suarez Romero wrote:
> > > Commit fa328456e8f29 added VP9 config support, but this needs a newer
> > > libva version, 1.7.0 or above.
> > > 
> > > Fixes: fa328456e8f ("st/va: add VP9 config to enable profile2")
> > 
> > Besides requesting R-B, CCing to @stable, as this fixes 18.1 build in 
> > Travis CI.
> > 
> 
> The Fixes should be enough but stable@ won't hurt.
> 
> Seems like we should also bump the versions in configure.ac
> (LIBVA_REQUIRED) and meson.build, right?
> Can be done as a follow-up, though. As-is patch is

The thing is, I did that in a first version, but then I reverted the change. The reason
is that LIBVA_REQUIRED contains the VA API version we expect when interacting with the
library, but in this case we are not changing the functions we use, so we
are fine keeping the same API version.


> 
> Reviewed-by: Emil Velikov 
> 
> -Emil
> 


Re: [Mesa-dev] [PATCH i-g-t] [RFC] CONTRIBUTING: commit rights docs

2018-04-25 Thread Daniel Vetter
On Wed, Apr 25, 2018 at 01:27:20PM +0100, Emil Velikov wrote:
> On 24 April 2018 at 20:14, Daniel Vetter  wrote:
> > On Tue, Apr 24, 2018 at 7:30 PM, Emil Velikov  
> > wrote:
> >> On 13 April 2018 at 11:00, Daniel Vetter  wrote:
> >>> This tries to align with the X.org community's long-standing
> >>> tradition of trying to be an inclusive community and handing out
> >>> commit rights fairly freely.
> >>>
> >>> We also tend to not revoke commit rights for people no longer
> >>> regularly active in a given project, as long as they're still part of
> >>> the larger community.
> >>>
> >>> Finally make sure that commit rights, like anything happening on fd.o
> >>> infrastructure, is subject to the fd.o's Code of Conduct.
> >>>
> >>> v2: Point at MAINTAINERS for contact info (Daniel S.)
> >>>
> >>> v3:
> >>> - Make it clear that commit rights are voluntary and that committers
> >>>   need to acknowledge positively when they're nominated by someone
> >>>   else (Keith).
> >>> - Encourage committers to drop their commit rights when they're no
> >>>   longer active, and make it clear they'll get readded (Keith).
> >>> - Add a line that maintainers and committers should actively nominate
> >>>   new committers (me).
> >>>
> >>> v4: Typo (Petri).
> >>>
> >>> v5: Typo (Sean).
> >>>
> >>> v6: Wording clarifications and spelling (Jani).
> >>>
> >>> v7: Require an explicit commitment to the documented merge criteria
> >>> and rules, instead of just the implied one through the Code of Conduct
> >>> threat (Jani).
> >>>
> >>> Acked-by: Alex Deucher 
> >>> Acked-by: Arkadiusz Hiler 
> >>> Acked-by: Daniel Stone 
> >>> Acked-by: Eric Anholt 
> >>> Acked-by: Gustavo Padovan 
> >>> Acked-by: Petri Latvala 
> >>> Cc: Alex Deucher 
> >>> Cc: Arkadiusz Hiler 
> >>> Cc: Ben Widawsky 
> >>> Cc: Daniel Stone 
> >>> Cc: Dave Airlie 
> >>> Cc: Eric Anholt 
> >>> Cc: Gustavo Padovan 
> >>> Cc: Jani Nikula 
> >>> Cc: Joonas Lahtinen 
> >>> Cc: Keith Packard 
> >>> Cc: Kenneth Graunke 
> >>> Cc: Kristian H. Kristensen 
> >>> Cc: Maarten Lankhorst 
> >>> Cc: Petri Latvala 
> >>> Cc: Rodrigo Vivi 
> >>> Cc: Sean Paul 
> >>> Reviewed-by: Keith Packard 
> >>> Signed-off-by: Daniel Vetter 
> >>> ---
> >>> If you wonder about the wide distribution list for an igt patch: I'd
> >>> like to start a discussion about x.org community norms around commit
> >>> rights at large, at least for all the shared repos. I plan to propose
> >>> the same text for drm-misc and libdrm too, and hopefully others like
> >>> mesa/xserver/wayland would follow.
> >>>
> >> I think the idea is pretty good, simply highlighting some bits.
> >>
> >> What you've outlined in this patch has been in practice for many years:
> >>  a) undocumented, applicable to most xorg projects [1]
> >>  b) documented, mesa
> >
> > Hm, I chatted with a few mesa devs about this, and I wasn't aware
> > there's explicit documentation for mesa. Where is it? I'd very much
> > want to align as much as we can.
> >
> See the "Developer git Access" section in [1]. FWIW I prefer the
> wording used in this patch and the CoC reference is a big plus.
> 
> HTH
> Emil
> 
> [1] https://www.mesa3d.org/repository.html

Ah missed this indeed. One thing to note wrt mesa is that this text here
relies heavily on _documented_ merge criteria. When I discussed it with
mesa we realized that the documented merge criteria do not really match
the actual criteria:

https://www.mesa3d.org/submittingpatches.html

E.g. for many drivers review is mandatory I think, same for core code. And
Intel folks require that you go through their CI too.

So the bigger part in adopting this for mesa would be in updating the
merge criteria doc to reflect reality.

Anyway, I'm happy that even the few terse lines match what I'm proposing
here (minus lots of details), I think we're good to go.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] [Mesa-stable] [PATCH v2] travis: update libva required version

2018-04-25 Thread Emil Velikov
On 24 April 2018 at 08:49, Juan A. Suarez Romero  wrote:
> On Fri, 2018-04-20 at 16:42 +0200, Juan A. Suarez Romero wrote:
>> Commit fa328456e8f29 added VP9 config support, but this needs a newer
>> libva version, 1.7.0 or above.
>>
>> Fixes: fa328456e8f ("st/va: add VP9 config to enable profile2")
>
> Besides requesting R-B, CCing to @stable, as this fixes 18.1 build in Travis 
> CI.
>
The Fixes should be enough but stable@ won't hurt.

Seems like we should also bump the versions in configure.ac
(LIBVA_REQUIRED) and meson.build, right?
Can be done as a follow-up, though. As-is patch is

Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH v2] travis: update libva required version

2018-04-25 Thread Andres Gomez
This is:

Reviewed-by: Andres Gomez 


On Fri, 2018-04-20 at 16:42 +0200, Juan A. Suarez Romero wrote:
> Commit fa328456e8f29 added VP9 config support, but this needs a newer
> libva version, 1.7.0 or above.
> 
> Fixes: fa328456e8f ("st/va: add VP9 config to enable profile2")
> ---
>  .travis.yml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index 45c5b80cbac..e0d6a827a6d 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -21,7 +21,7 @@ env:
>  - LIBXCB_VERSION=libxcb-1.13
>  - LIBXSHMFENCE_VERSION=libxshmfence-1.2
>  - LIBVDPAU_VERSION=libvdpau-1.1
> -- LIBVA_VERSION=libva-1.6.2
> +- LIBVA_VERSION=libva-1.7.0
>  - LIBWAYLAND_VERSION=wayland-1.11.1
>  - WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
>  - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
-- 
Br,

Andres


Re: [Mesa-dev] [PATCH 03/11] i965/compiler: handle conversion to smaller type in the lowering pass for that

2018-04-25 Thread Jason Ekstrand


On April 25, 2018 02:04:03 Iago Toral  wrote:
On Tue, 2018-04-24 at 07:58 -0700, Jason Ekstrand wrote:
On Wed, Apr 11, 2018 at 12:20 AM, Iago Toral Quiroga  wrote:

The lowering pass was specialized to act on 64-bit to 32-bit conversions only,
but the implementation is valid for other cases.
---
src/intel/compiler/brw_fs_lower_conversions.cpp |  5 -
src/intel/compiler/brw_fs_nir.cpp   | 14 +++---
2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp 
b/src/intel/compiler/brw_fs_lower_conversions.cpp

index 663c9674c49..f95b39d3e86 100644
--- a/src/intel/compiler/brw_fs_lower_conversions.cpp
+++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
@@ -54,7 +54,7 @@ fs_visitor::lower_conversions()
bool saturate = inst->saturate;

if (supports_type_conversion(inst)) {
- if (get_exec_type_size(inst) == 8 && type_sz(inst->dst.type) < 8) {
+ if (type_sz(inst->dst.type) < get_exec_type_size(inst)) {
/* From the Broadwell PRM, 3D Media GPGPU, "Double Precision Float to
* Single Precision Float":
*
@@ -64,6 +64,9 @@ fs_visitor::lower_conversions()
* So we need to allocate a temporary that's two registers, and then do
* a strided MOV to get the lower DWord of every Qword that has the
* result.
+ *
+ * This restriction applies, in general, whenever we convert to
+ * a type with a smaller bit-size.
*/
fs_reg temp = ibld.vgrf(get_exec_type(inst));
fs_reg strided_temp = subscript(temp, dst.type, 0);
diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp

index f40a3540e31..5e0dd37eefd 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -753,19 +753,9 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
nir_alu_instr *instr)

*/

case nir_op_f2f16_undef:
-   case nir_op_i2i16:
-   case nir_op_u2u16: {
-  /* TODO: Fixing aligment rules for conversions from 32-bits to
-   * 16-bit types should be moved to lower_conversions
-   */
-  fs_reg tmp = bld.vgrf(op[0].type, 1);
-  tmp = subscript(tmp, result.type, 0);
-  inst = bld.MOV(tmp, op[0]);
-  inst->saturate = instr->dest.saturate;
-  inst = bld.MOV(result, tmp);
+  inst = bld.MOV(result, op[0]);
inst->saturate = instr->dest.saturate;
break;

It appears to me that we can move f2f16_undef to the block below as well.
With or without that,


f2f16_undef is the fallthrough for the other f2f16 cases (the ones that 
handle rounding modes) and the cases we are grouping here are also 
fallthrough cases for other things, so if we moved it here we'd need to 
replicate the code again for the other f2f16 cases anyway.


Ok, that's reasonable.



Reviewed-by: Jason Ekstrand 

-   }

case nir_op_f2f64:
case nir_op_f2i64:
@@ -803,6 +793,8 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
nir_alu_instr *instr)

case nir_op_f2u32:
case nir_op_i2i32:
case nir_op_u2u32:
+   case nir_op_i2i16:
+   case nir_op_u2u16:
inst = bld.MOV(result, op[0]);
inst->saturate = instr->dest.saturate;
break;
--
2.14.1
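
As a rough model of what the generalized check in lower_conversions means, the following plain C sketch may help. It is only an illustration under stated assumptions: needs_strided_temp() and narrow_via_strided_temp() are made-up names, sizes are in bytes, and ordinary arrays stand in for registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* New condition from the patch; the old one was
 * exec_size == 8 && dst_size < 8. */
static bool
needs_strided_temp(unsigned exec_size, unsigned dst_size)
{
   return dst_size < exec_size;
}

/* Models a 32-bit to 16-bit conversion; temp must hold 2 * n elements. */
static void
narrow_via_strided_temp(uint16_t *dst, uint16_t *temp,
                        const uint32_t *src, size_t n)
{
   /* First MOV: each result lands in every other 16-bit slot of the
    * temporary, i.e. with a stride of two elements. */
   for (size_t i = 0; i < n; i++)
      temp[2 * i] = (uint16_t)src[i];

   /* Second MOV: read the strided results back into a packed destination. */
   for (size_t i = 0; i < n; i++)
      dst[i] = temp[2 * i];
}
```

The old condition only caught the 64-bit to 32-bit case; the new one also returns true for 32-bit to 16-bit, which is what lets the i2i16/u2u16 special case in brw_fs_nir.cpp go away.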



Re: [Mesa-dev] [PATCH 1/3] nir: support converting to 8-bit integers in nir_type_conversion_op

2018-04-25 Thread Jason Ekstrand



On April 25, 2018 05:14:17 Karol Herbst  wrote:

Signed-off-by: Karol Herbst 
---
src/compiler/nir/nir_opcodes_c.py | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_opcodes_c.py 
b/src/compiler/nir/nir_opcodes_c.py

index c19185534af..8afccca9504 100644
--- a/src/compiler/nir/nir_opcodes_c.py
+++ b/src/compiler/nir/nir_opcodes_c.py
@@ -62,7 +62,12 @@ nir_type_conversion_op(nir_alu_type src, nir_alu_type 
dst, nir_rounding_mode rnd

% endif
%  endif
switch (dst_bit_size) {
-% for dst_bits in [16, 32, 64]:
+% if dst_t == 'float':
+<%bit_sizes = [16, 32, 64] %>

The <% can be indented. It doesn't have to be at the start of the line.  
Doesn't really matter that much though.



+% else:
+<%bit_sizes = [8, 16, 32, 64] %>
+% endif
+% for dst_bits in bit_sizes:

You could also do
%if dst_t == 'float' and dst_bits == 8:
<% continue %>

I'm not sure which is better.  What you did is fine.  Rb


case ${dst_bits}:
%if src_t == 'float' and dst_t == 'float' and dst_bits 
== 16:

switch(rnd) {
--
2.14.3





Re: [Mesa-dev] [PATCH 2/3] nir: print 8 and 16 bit constants correctly

2018-04-25 Thread Jason Ekstrand

Making the float comment thing work with 16-bit would be cool.  R-b anyway.

On April 25, 2018 05:14:18 Karol Herbst  wrote:


Signed-off-by: Karol Herbst 
---
src/compiler/nir/nir_print.c | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 21f13097651..1c84b4b7076 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -846,11 +846,21 @@ print_load_const_instr(nir_load_const_instr *instr, 
print_state *state)

   * and then print the float in a comment for readability.
   */

-  if (instr->def.bit_size == 64)
+  switch (instr->def.bit_size) {
+  case 64:
 fprintf(fp, "0x%16" PRIx64 " /* %f */", instr->value.u64[i],
 instr->value.f64[i]);
-  else
+ break;
+  case 32:
 fprintf(fp, "0x%08x /* %f */", instr->value.u32[i], 
instr->value.f32[i]);
+ break;
+  case 16:
+ fprintf(fp, "0x%04x", instr->value.u16[i]);
+ break;
+  case 8:
+ fprintf(fp, "0x%02x", instr->value.u8[i]);
+ break;
+  }
   }

   fprintf(fp, ")");
--
2.14.3
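
On the float comment for 16-bit values: one possible shape, sketched in stand-alone C. This is not the nir_print.c code. format_const() and half_to_float() are hypothetical helpers written for this note, the raw bits are passed in as a uint64_t, and the 64-bit case is zero-padded here, which differs from the format string in the quoted patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode an IEEE 754 binary16 bit pattern into a float. */
static float
half_to_float(uint16_t h)
{
   uint32_t sign = (uint32_t)(h & 0x8000) << 16;
   uint32_t exp = (h >> 10) & 0x1f;
   uint32_t man = h & 0x3ff;
   uint32_t bits;
   float f;

   if (exp == 0 && man == 0) {
      bits = sign;                              /* signed zero */
   } else if (exp == 0) {
      /* Subnormal: shift the mantissa up until the implicit bit is set. */
      int e = -1;
      do {
         man <<= 1;
         e++;
      } while (!(man & 0x400));
      bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3ff) << 13);
   } else if (exp == 0x1f) {
      bits = sign | 0x7f800000u | (man << 13);  /* infinity or NaN */
   } else {
      bits = sign | ((exp + 112) << 23) | (man << 13);
   }

   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Print the hex bits at the width of the constant, plus a float comment
 * where one makes sense. Returns -1 for an unsupported bit size. */
static int
format_const(char *buf, size_t len, unsigned bit_size, uint64_t bits)
{
   switch (bit_size) {
   case 64: {
      double d;
      memcpy(&d, &bits, sizeof(d));
      return snprintf(buf, len, "0x%016" PRIx64 " /* %f */", bits, d);
   }
   case 32: {
      uint32_t u = (uint32_t)bits;
      float f;
      memcpy(&f, &u, sizeof(f));
      return snprintf(buf, len, "0x%08" PRIx32 " /* %f */", u, f);
   }
   case 16:
      return snprintf(buf, len, "0x%04x /* %f */",
                      (unsigned)(bits & 0xffff),
                      half_to_float((uint16_t)bits));
   case 8:
      return snprintf(buf, len, "0x%02x", (unsigned)(bits & 0xff));
   default:
      return -1;
   }
}
```

Mesa may already have a half-float helper that would be the better choice in the real patch; the hand-rolled decoder above is only there to keep the sketch self-contained.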






Re: [Mesa-dev] [PATCH 3/3] nir/opt_constant_folding: fix folding of 8 and 16 bit ints

2018-04-25 Thread Jason Ekstrand

Rb

On April 25, 2018 05:14:20 Karol Herbst  wrote:


Signed-off-by: Karol Herbst 
---
src/compiler/nir/nir_opt_constant_folding.c | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opt_constant_folding.c 
b/src/compiler/nir/nir_opt_constant_folding.c

index d6be807b3dc..a848b145874 100644
--- a/src/compiler/nir/nir_opt_constant_folding.c
+++ b/src/compiler/nir/nir_opt_constant_folding.c
@@ -76,10 +76,20 @@ constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)


  for (unsigned j = 0; j < nir_ssa_alu_instr_src_components(instr, i);
   j++) {
- if (load_const->def.bit_size == 64)
+ switch(load_const->def.bit_size) {
+ case 64:
src[i].u64[j] = load_const->value.u64[instr->src[i].swizzle[j]];
- else
+break;
+ case 32:
src[i].u32[j] = load_const->value.u32[instr->src[i].swizzle[j]];
+break;
+ case 16:
+src[i].u16[j] = load_const->value.u16[instr->src[i].swizzle[j]];
+break;
+ case 8:
+src[i].u8[j] = load_const->value.u8[instr->src[i].swizzle[j]];
+break;
+ }
  }

  /* We shouldn't have any source modifiers in the optimization loop. */
--
2.14.3
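
The reason a per-size switch is needed can be shown outside of NIR. In the
sketch below, const_value and copy_swizzled() are hypothetical stand-ins, not
the real NIR types: each bit size views the same storage with a different
element width, so a 16-bit constant copied through the 32-bit view would pick
up the wrong lanes. The sketch assumes at most four components and that every
swizzle index is below that limit.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_COMPONENTS 4

/* Hypothetical stand-in for a constant value: one storage area viewed
 * with a different element width per bit size. */
typedef union {
   uint8_t u8[MAX_COMPONENTS];
   uint16_t u16[MAX_COMPONENTS];
   uint32_t u32[MAX_COMPONENTS];
   uint64_t u64[MAX_COMPONENTS];
} const_value;

/* Copy num_components lanes from src to dst, applying the swizzle and
 * using the view that matches bit_size. Returns false for a bit size
 * the sketch does not know about. */
static bool
copy_swizzled(const_value *dst, const const_value *src, unsigned bit_size,
              const uint8_t *swizzle, unsigned num_components)
{
   for (unsigned j = 0; j < num_components; j++) {
      switch (bit_size) {
      case 64:
         dst->u64[j] = src->u64[swizzle[j]];
         break;
      case 32:
         dst->u32[j] = src->u32[swizzle[j]];
         break;
      case 16:
         dst->u16[j] = src->u16[swizzle[j]];
         break;
      case 8:
         dst->u8[j] = src->u8[swizzle[j]];
         break;
      default:
         return false;
      }
   }
   return true;
}
```

Unlike the patch, the sketch reports unknown bit sizes through its return
value; that is a choice made here for illustration only.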






Re: [Mesa-dev] [PATCH 2/3] nir: print 8 and 16 bit constants correctly

2018-04-25 Thread Rob Clark
On Wed, Apr 25, 2018 at 5:14 AM, Karol Herbst  wrote:
> Signed-off-by: Karol Herbst 

Reviewed-by: Rob Clark 

> ---
>  src/compiler/nir/nir_print.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
> index 21f13097651..1c84b4b7076 100644
> --- a/src/compiler/nir/nir_print.c
> +++ b/src/compiler/nir/nir_print.c
> @@ -846,11 +846,21 @@ print_load_const_instr(nir_load_const_instr *instr, 
> print_state *state)
> * and then print the float in a comment for readability.
> */
>
> -  if (instr->def.bit_size == 64)
> +  switch (instr->def.bit_size) {
> +  case 64:
>   fprintf(fp, "0x%16" PRIx64 " /* %f */", instr->value.u64[i],
>   instr->value.f64[i]);
> -  else
> + break;
> +  case 32:
>   fprintf(fp, "0x%08x /* %f */", instr->value.u32[i], 
> instr->value.f32[i]);
> + break;
> +  case 16:
> + fprintf(fp, "0x%04x", instr->value.u16[i]);
> + break;
> +  case 8:
> + fprintf(fp, "0x%02x", instr->value.u8[i]);
> + break;
> +  }
> }
>
> fprintf(fp, ")");
> --
> 2.14.3
>


Re: [Mesa-dev] [PATCH i-g-t] [RFC] CONTRIBUTING: commit rights docs

2018-04-25 Thread Emil Velikov
On 24 April 2018 at 20:14, Daniel Vetter  wrote:
> On Tue, Apr 24, 2018 at 7:30 PM, Emil Velikov wrote:
>> On 13 April 2018 at 11:00, Daniel Vetter  wrote:
>>> This tries to align with the X.org community's long-standing
>>> tradition of trying to be an inclusive community and handing out
>>> commit rights fairly freely.
>>>
>>> We also tend to not revoke commit rights for people no longer
>>> regularly active in a given project, as long as they're still part of
>>> the larger community.
>>>
>>> Finally make sure that commit rights, like anything happening on fd.o
>>> infrastructure, are subject to the fd.o Code of Conduct.
>>>
>>> v2: Point at MAINTAINERS for contact info (Daniel S.)
>>>
>>> v3:
>>> - Make it clear that commit rights are voluntary and that committers
>>>   need to acknowledge positively when they're nominated by someone
>>>   else (Keith).
>>> - Encourage committers to drop their commit rights when they're no
>>>   longer active, and make it clear they'll get readded (Keith).
>>> - Add a line that maintainers and committers should actively nominate
>>>   new committers (me).
>>>
>>> v4: Typo (Petri).
>>>
>>> v5: Typo (Sean).
>>>
>>> v6: Wording clarifications and spelling (Jani).
>>>
>>> v7: Require an explicit commitment to the documented merge criteria
>>> and rules, instead of just the implied one through the Code of Conduct
>>> threat (Jani).
>>>
>>> Acked-by: Alex Deucher 
>>> Acked-by: Arkadiusz Hiler 
>>> Acked-by: Daniel Stone 
>>> Acked-by: Eric Anholt 
>>> Acked-by: Gustavo Padovan 
>>> Acked-by: Petri Latvala 
>>> Cc: Alex Deucher 
>>> Cc: Arkadiusz Hiler 
>>> Cc: Ben Widawsky 
>>> Cc: Daniel Stone 
>>> Cc: Dave Airlie 
>>> Cc: Eric Anholt 
>>> Cc: Gustavo Padovan 
>>> Cc: Jani Nikula 
>>> Cc: Joonas Lahtinen 
>>> Cc: Keith Packard 
>>> Cc: Kenneth Graunke 
>>> Cc: Kristian H. Kristensen 
>>> Cc: Maarten Lankhorst 
>>> Cc: Petri Latvala 
>>> Cc: Rodrigo Vivi 
>>> Cc: Sean Paul 
>>> Reviewed-by: Keith Packard 
>>> Signed-off-by: Daniel Vetter 
>>> ---
>>> If you wonder about the wide distribution list for an igt patch: I'd
>>> like to start a discussion about x.org community norms around commit
>>> rights at large, at least for all the shared repos. I plan to propose
>>> the same text for drm-misc and libdrm too, and hopefully others like
>>> mesa/xserver/wayland would follow.
>>>
>> I think the idea is pretty good, simply highlighting some bits.
>>
>> What you've outlined in this patch has been in practice for many years:
>>  a) undocumented, applicable to most xorg projects [1]
>>  b) documented, mesa
>
> Hm, I chatted with a few mesa devs about this, and I wasn't aware
> there's explicit documentation for mesa. Where is it? I'd very much
> want to align as much as we can.
>
See the "Developer git Access" section in [1]. FWIW I prefer the
wording used in this patch and the CoC reference is a big plus.

HTH
Emil

[1] https://www.mesa3d.org/repository.html


Re: [Mesa-dev] [PATCH 3/3] nir/opt_constant_folding: fix folding of 8 and 16 bit ints

2018-04-25 Thread Karol Herbst
On Wed, Apr 25, 2018 at 1:50 PM, Chema Casanova  wrote:
> I've already got to the same code addressing Jason's feedback about
> "[PATCH 06/11] nir/constant_folding: support 16-bit constants."
>

okay, will push then as soon as possible, so that you don't have to wait.

> So this is:
>
> Reviewed-by: Jose Maria Casanova Crespo 
>
> On 25/04/18 at 11:14, Karol Herbst wrote:
>> Signed-off-by: Karol Herbst 
>> ---
>>  src/compiler/nir/nir_opt_constant_folding.c | 14 --
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir_opt_constant_folding.c 
>> b/src/compiler/nir/nir_opt_constant_folding.c
>> index d6be807b3dc..a848b145874 100644
>> --- a/src/compiler/nir/nir_opt_constant_folding.c
>> +++ b/src/compiler/nir/nir_opt_constant_folding.c
>> @@ -76,10 +76,20 @@ constant_fold_alu_instr(nir_alu_instr *instr, void 
>> *mem_ctx)
>>
>>for (unsigned j = 0; j < nir_ssa_alu_instr_src_components(instr, i);
>> j++) {
>> - if (load_const->def.bit_size == 64)
>> + switch(load_const->def.bit_size) {
>> + case 64:
>>  src[i].u64[j] = load_const->value.u64[instr->src[i].swizzle[j]];
>> - else
>> +break;
>> + case 32:
>>  src[i].u32[j] = load_const->value.u32[instr->src[i].swizzle[j]];
>> +break;
>> + case 16:
>> +src[i].u16[j] = load_const->value.u16[instr->src[i].swizzle[j]];
>> +break;
>> + case 8:
>> +src[i].u8[j] = load_const->value.u8[instr->src[i].swizzle[j]];
>> +break;
>> + }
>>}
>>
>>/* We shouldn't have any source modifiers in the optimization loop. */
>>


[Mesa-dev] [PATCH] spirv: convert the shift operand for bitwise shift ops to uint32

2018-04-25 Thread Samuel Iglesias Gonsálvez
SPIR-V allows the shift operand of shift opcodes to have a bit size
other than 32 bits, but the NIR opcodes require it to be 32 bits.
As agreed on the mailing list, this patch adds a conversion to
32 bits to fix this.

For more info, see:

https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/vtn_alu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 71e743cdd1e..1d33ae28273 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -640,6 +640,19 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
}
 
+   case SpvOpShiftLeftLogical:
+   case SpvOpShiftRightArithmetic:
+   case SpvOpShiftRightLogical: {
+  if (src[1]->bit_size != 32) {
+ /* Convert the Shift operand to 32 bits, which is the bitsize
+  * supported by the NIR instruction. See discussion here:
+  *
+  * https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html
+  */
+ src[1] = nir_build_alu(&b->nb, nir_op_u2u32, src[1], NULL, NULL, NULL);
+  }
+   }
+   /* fall-through */
default: {
   bool swap;
   unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
-- 
2.14.1
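
The effect of the inserted conversion can be modelled in plain C. The sketch
below is an illustration under stated assumptions, not Mesa code: ssa_value,
to_u32() and shift_left() are hypothetical names, bit sizes are assumed to be
8, 16, 32 or 64, and the shift count is masked to the value's width only so
that the C shift stays well defined.

```c
#include <stdint.h>

/* Hypothetical, simplified SSA value: a payload plus its bit size. */
typedef struct {
   uint64_t bits;
   unsigned bit_size;
} ssa_value;

/* All-ones mask covering the low bit_size bits. */
static uint64_t
mask_for(unsigned bit_size)
{
   return bit_size >= 64 ? UINT64_MAX : (UINT64_C(1) << bit_size) - 1;
}

/* Zero-extend or truncate to 32 bits, the role u2u32 plays in the patch. */
static ssa_value
to_u32(ssa_value v)
{
   ssa_value out = { (v.bits & mask_for(v.bit_size)) & 0xffffffffu, 32 };
   return out;
}

/* Left shift whose count operand is normalized to 32 bits first, so the
 * shift itself only ever sees a 32-bit count. */
static ssa_value
shift_left(ssa_value value, ssa_value count)
{
   if (count.bit_size != 32)
      count = to_u32(count);

   /* Assumption of this sketch: wrap the count to the value's width. */
   unsigned amount = (unsigned)(count.bits & (value.bit_size - 1));
   ssa_value out = {
      (value.bits << amount) & mask_for(value.bit_size),
      value.bit_size,
   };
   return out;
}
```

The result keeps the bit size of the shifted value; only the count operand is
widened or narrowed.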



Re: [Mesa-dev] [PATCH 06/11] nir/constant_folding: support 16-bit constants

2018-04-25 Thread Chema Casanova
On 24/04/18 at 23:52, Jason Ekstrand wrote:
> On Wed, Apr 11, 2018 at 12:20 AM, Iago Toral Quiroga wrote:
> 
> From: Jose Maria Casanova Crespo
> 
> ---
>  src/compiler/nir/nir_opt_constant_folding.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_opt_constant_folding.c
> b/src/compiler/nir/nir_opt_constant_folding.c
> index d6be807b3dc..b63660ea4da 100644
> --- a/src/compiler/nir/nir_opt_constant_folding.c
> +++ b/src/compiler/nir/nir_opt_constant_folding.c
> @@ -78,6 +78,8 @@ constant_fold_alu_instr(nir_alu_instr *instr, void
> *mem_ctx)
>             j++) {
>           if (load_const->def.bit_size == 64)
>              src[i].u64[j] =
> load_const->value.u64[instr->src[i].swizzle[j]];
> +         else if (load_const->def.bit_size == 16)
> +            src[i].u16[j] =
> load_const->value.u16[instr->src[i].swizzle[j]];
>           else
>              src[i].u32[j] =
> load_const->value.u32[instr->src[i].swizzle[j]];
> 
> 
> Let's make this a switch and support 8 while we're at it.

Karol Herbst has just sent a patch with these changes done, so I've
reviewed it, since I had arrived at the same patch.

> 
>        }
> -- 
> 2.14.1
> 


Re: [Mesa-dev] [PATCH] egl/android: remove flink name support

2018-04-25 Thread Robert Foss

Hey Emil & Chih-Wei,


On 04/24/2018 01:59 PM, Emil Velikov wrote:
> On 24 April 2018 at 12:28, Emil Velikov wrote:
>> On the topic of keeping the old code behind a #define or just removing
>> it, it'll be great if interested parties can reach a consensus.
>
> Actually one can simply drop this code and drm_gralloc users can add a
> drm_ioctl_permit() hack.
> Namely: loosen the restrictions to consider render nodes identical to
> primary/card ones.
>
> Yes, it's a nasty hack, yet no worse than the existing one that
> removes the auth :-\

I'm fine with adding a #define.
Chih-Wei: Do you have any objections?


Rob.


Re: [Mesa-dev] [PATCH 3/3] nir/opt_constant_folding: fix folding of 8 and 16 bit ints

2018-04-25 Thread Chema Casanova
I've already got to the same code addressing Jason's feedback about
"[PATCH 06/11] nir/constant_folding: support 16-bit constants."

So this is:

Reviewed-by: Jose Maria Casanova Crespo 

On 25/04/18 at 11:14, Karol Herbst wrote:
> Signed-off-by: Karol Herbst 
> ---
>  src/compiler/nir/nir_opt_constant_folding.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/nir/nir_opt_constant_folding.c 
> b/src/compiler/nir/nir_opt_constant_folding.c
> index d6be807b3dc..a848b145874 100644
> --- a/src/compiler/nir/nir_opt_constant_folding.c
> +++ b/src/compiler/nir/nir_opt_constant_folding.c
> @@ -76,10 +76,20 @@ constant_fold_alu_instr(nir_alu_instr *instr, void 
> *mem_ctx)
>  
>for (unsigned j = 0; j < nir_ssa_alu_instr_src_components(instr, i);
> j++) {
> - if (load_const->def.bit_size == 64)
> + switch(load_const->def.bit_size) {
> + case 64:
>  src[i].u64[j] = load_const->value.u64[instr->src[i].swizzle[j]];
> - else
> +break;
> + case 32:
>  src[i].u32[j] = load_const->value.u32[instr->src[i].swizzle[j]];
> +break;
> + case 16:
> +src[i].u16[j] = load_const->value.u16[instr->src[i].swizzle[j]];
> +break;
> + case 8:
> +src[i].u8[j] = load_const->value.u8[instr->src[i].swizzle[j]];
> +break;
> + }
>}
>  
>/* We shouldn't have any source modifiers in the optimization loop. */
> 


Re: [Mesa-dev] [PATCH 1/3] nir: support converting to 8-bit integers in nir_type_conversion_op

2018-04-25 Thread Chema Casanova
Reviewed-by: Jose Maria Casanova Crespo 

On 25/04/18 at 11:14, Karol Herbst wrote:
> Signed-off-by: Karol Herbst 
> ---
>  src/compiler/nir/nir_opcodes_c.py | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_opcodes_c.py 
> b/src/compiler/nir/nir_opcodes_c.py
> index c19185534af..8afccca9504 100644
> --- a/src/compiler/nir/nir_opcodes_c.py
> +++ b/src/compiler/nir/nir_opcodes_c.py
> @@ -62,7 +62,12 @@ nir_type_conversion_op(nir_alu_type src, nir_alu_type dst, 
> nir_rounding_mode rnd
>  % endif
>  %  endif
> switch (dst_bit_size) {
> -% for dst_bits in [16, 32, 64]:
> +% if dst_t == 'float':
> +<%bit_sizes = [16, 32, 64] %>
> +% else:
> +<%bit_sizes = [8, 16, 32, 64] %>
> +% endif
> +% for dst_bits in bit_sizes:
>case ${dst_bits}:
>  %if src_t == 'float' and dst_t == 'float' and dst_bits 
> == 16:
>   switch(rnd) {
> 


[Mesa-dev] [Bug 105494] UT2004 cube map reflection problem

2018-04-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105494

--- Comment #5 from Denis  ---
Hello. I can confirm the same behavior with a pond on the i965 driver (on KBL
and SNB CPUs).

I also tried to find a Mesa version without the issue, but I couldn't. Tested
on Mesa versions 18.1.0 and 13.0.0.
Update: the whole game has issues with shadows (visible on the maps Idoma,
Curse4, Grendelkeep, and Icetomb). When you change the camera angle, shadows
don't disappear or change smoothly; instead a border appears between the "old
view" with the shadow and the new one without it. (I haven't seen this
behavior on Windows with an NVIDIA GPU and drivers.)

And the last thing: "commit f02f1ad13fa4123986d17a5d04b0e2831c3a7091" didn't
fix this issue, so the two issues are different.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 0/2] i965: Add support for fp16 <-> fp64 conversions

2018-04-25 Thread Samuel Iglesias Gonsálvez
These two patches are still unreviewed.

Sam


On 13/04/18 07:30, Samuel Iglesias Gonsálvez wrote:
> Hello,
>
> This series implements support for doing fp16 <-> fp64 conversions on
> i965. The PRM says we need to do an intermediate conversion to a 32 bit
> type.
>
> This patch series applies on top of shaderInt16's patch series [0].
> There is a branch for testing on Github:
>
> $ git clone https://github.com/Igalia/mesa.git \
>   -b siglesias/vulkan-fp16-fp64-conversions
>
> There are tests for Vulkan CTS under review (CL#2246) for testing these
> patches.
>
> Best regards,
>
> Sam
>
> [0] https://lists.freedesktop.org/archives/mesa-dev/2018-April/191888.html
>
> Samuel Iglesias Gonsálvez (2):
>   i965/fs: implement conversions from float16 to 64 bits data types
>   i965/fs: Implement float64 to float16 conversion
>
>  src/intel/compiler/brw_fs_nir.cpp | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>






Re: [Mesa-dev] [PATCH] spirv: Don’t check for NaN for most OpFOrd* comparisons

2018-04-25 Thread Iago Toral
Thanks Neil!

Reviewed-by: Iago Toral Quiroga 

Maybe we need other drivers (radv?) to double-check that this doesn't
break stuff for them either?

Iago

On Tue, 2018-04-24 at 16:55 +0200, Neil Roberts wrote:
> For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
> should probably already return false if one of the operands is NaN so
> we don’t need to have an explicit check for it. This seems to at
> least
> work on Intel hardware. This should reduce the number of instructions
> generated for the most common comparisons.
> 
> For what it’s worth, the original code to handle this was added in
> e062eb6415de3a. The commit message for that says that it was to fix
> some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
> handle NaNs this patch shouldn’t affect those tests. At any rate they
> have since been moved out of the mustpass list. Incidentally those
> tests fail on the nvidia proprietary driver so it doesn’t seem like
> handling NaNs correctly is a priority.
> ---
> 
> I made a VkRunner test case for all of the OpFOrd* and OpFUnord*
> opcodes with and without NaNs on the test branch. It can be run like
> this:
> 
> git clone -b tests https://github.com/Igalia/vkrunner.git
> cd vkrunner
> ./autogen.sh && make -j8
> ./src/vkrunner examples/unordered-comparison.shader_test
> 
>  src/compiler/spirv/vtn_alu.c | 17 ++---
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/src/compiler/spirv/vtn_alu.c
> b/src/compiler/spirv/vtn_alu.c
> index 71e743cdd1e..3134849ba90 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -597,23 +597,18 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp
> opcode,
>break;
> }
>  
> -   case SpvOpFOrdEqual:
> -   case SpvOpFOrdNotEqual:
> -   case SpvOpFOrdLessThan:
> -   case SpvOpFOrdGreaterThan:
> -   case SpvOpFOrdLessThanEqual:
> -   case SpvOpFOrdGreaterThanEqual: {
> +   case SpvOpFOrdNotEqual: {
> +  /* For all the SpvOpFOrd* comparisons apart from NotEqual, the
> value
> +   * from the ALU will probably already be false if the operands
> are not
> +   * ordered so we don’t need to handle it specially.
> +   */
>bool swap;
>unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
>unsigned dst_bit_size = glsl_get_bit_size(type);
>nir_op op = vtn_nir_alu_op_for_spirv_opcode(b, opcode, &swap,
>src_bit_size,
> dst_bit_size);
>  
> -  if (swap) {
> - nir_ssa_def *tmp = src[0];
> - src[0] = src[1];
> - src[1] = tmp;
> -  }
> +  assert(!swap);
>  
>val->ssa->def =
>   nir_iand(&b->nb,
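
The reasoning in the quoted commit message matches how IEEE 754 comparisons
behave in C, which makes for a quick illustration. This is only a sketch with
hypothetical function names, not the NIR lowering itself, and it assumes the
compiler is not run with options such as -ffast-math that drop NaN handling.
Ordered comparisons like "less than" are already false when an operand is
NaN, while "not equal" is true for NaN operands and so needs the explicit
ordered check.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered less-than: the plain comparison is already false if either
 * operand is NaN, so no extra check is needed. */
static bool
ford_less_than(float a, float b)
{
   return a < b;
}

/* Ordered not-equal: the plain != is true for NaN operands, so both
 * operands must additionally be checked for being ordered. */
static bool
ford_not_equal(float a, float b)
{
   return a != b && !isnan(a) && !isnan(b);
}

/* Unordered not-equal: true if the operands differ or either is NaN,
 * which is exactly what the plain != gives. */
static bool
funord_not_equal(float a, float b)
{
   return a != b;
}
```

The same pattern holds for the other ordered comparisons (equal, greater
than, and the "or equal" forms): only not-equal changes its answer when a NaN
is involved.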


Re: [Mesa-dev] NIR issue with SPIRV ops that have operands with different bit-size

2018-04-25 Thread Samuel Iglesias Gonsálvez
Thanks to all for the opinions.

I'm going to implement the conversion then.

Sam


On 24/04/18 15:45, Jason Ekstrand wrote:
> On Tue, Apr 24, 2018 at 6:42 AM, Ian Romanick wrote:
>
> On 04/24/2018 05:44 AM, Rob Clark wrote:
> > On Tue, Apr 24, 2018 at 4:24 AM, Samuel Iglesias Gonsálvez wrote:
> >> Hello,
> >>
> >> Recently, we have found problems between some SPIR-V opcodes and NIR.
> >>
> >> SPIR-V allows opcodes to mix different bit sizes for their operands,
> >> such as for some bitfield operations and other ops like bitwise shifts.
> >>
> >> In NIR, when an ALU opcode doesn't specify bit sizes for its operands,
> >> they are expected to have the same bit size (see the assert in
> >> nir_build_alu() in nir_builder.h). I suppose this assumption comes from
> >> the time when NIR was only fed with GLSL IR, but now with SPIR-V that
> >> assert is wrong.
> >>
> >> Instead of adding new variants of the opcodes (such as nir_op_ishl16)
> >> to work around the issue, I think this needs to be fixed by removing
> >> this assumption from NIR and its validator. I am sending this email to
> >> ask for ideas and to find the best way to handle this.
> >>
> >
> > Karol hit the same thing (with, for example, shift instructions) with
> > the work for spv compute/kernel support.  I *think* the number of
> > special cases isn't too high, so probably vtn should just insert the
> > appropriate conversion instruction (i.e. u2u32, etc.) so that if the src
> > bitsize is incorrect it will be converted.
>
> That's what I was going to suggest.  I guess it's possible that some HW
> might benefit from using the smaller bit-size, but the code generator
> should be able to see through the conversions to get the smaller data.
>
>
> Agreed.  I think most of these cases should have no difference between
> the theoretical new opcode and original + conversion.




