[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

--- Comment #5 from Tapani Pälli  ---
(In reply to Tapani Pälli from comment #4)
> I've noticed that if I skip uniform removal in opt_dead_code (even builtin
> uniforms), then this test passes. I have no idea why this is though. I have
> been running chrome with following arguments: "--use-gl=egl
> --disable-gpu-program-cache --disable-gpu-shader-disk-cache".

Hmm, it seems that Chrome does not really honor these flags, though. If I
'disable' program binary by just returning from the program binary functions,
there are far fewer failures ... so this is probably about program binary after
all.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: reduce maxFragmentInputComponents

2018-06-14 Thread Samuel Iglesias Gonsálvez
This patch is still unreviewed.

Sam


On 29/05/18 09:07, Samuel Iglesias Gonsálvez wrote:
> If the application asks for the maximum number of fragment input
> components (128), use all of them plus some builtins that are
> passed in the VUE, then we exceed the maximum number of used VUE
> slots (32) and we break one assert that checks this limit.
>
> Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1
> builtins in brw_compute_vue_map() because we don't know if
> gl_ClipDistance is going to be read or written by an adjacent stage.
>
> Fixes VK-GL-CTS CL#2569.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/intel/vulkan/anv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 374fc16c4c9..87c0d0cb4a6 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -898,7 +898,7 @@ void anv_GetPhysicalDeviceProperties(
>.maxGeometryOutputComponents  = 128,
>.maxGeometryOutputVertices= 256,
>.maxGeometryTotalOutputComponents = 1024,
> -  .maxFragmentInputComponents   = 128,
> +  .maxFragmentInputComponents   = 112, /* 128 components - (POS, PSIZ, CLIP_DIST0, CLIP_DIST1) */
>.maxFragmentOutputAttachments = 8,
>.maxFragmentDualSrcAttachments= 1,
>.maxFragmentCombinedOutputResources   = 8,
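
The 112 in the patch above follows from simple slot arithmetic. A minimal sketch, assuming each VUE slot carries one vec4 (four components) and that each of the four builtins named in the comment occupies one slot. The macro and function names are illustrative only and are not taken from anv:

```c
#include <assert.h>

/* Illustrative constants; the names are hypothetical. The values come from
 * the commit message (32 VUE slots) and the patch comment (4 builtins). */
#define MAX_VUE_SLOTS        32
#define COMPONENTS_PER_SLOT  4   /* assumption: one vec4 per slot */
#define RESERVED_BUILTINS    4   /* POS, PSIZ, CLIP_DIST0, CLIP_DIST1 */

/* Fragment input components left for the application once the slots
 * reserved for builtins are subtracted. */
static unsigned
max_fragment_input_components(void)
{
   assert(RESERVED_BUILTINS <= MAX_VUE_SLOTS);
   return (MAX_VUE_SLOTS - RESERVED_BUILTINS) * COMPONENTS_PER_SLOT;
}
```

With these assumptions, 28 free slots of 4 components each give the limit the patch advertises.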






Re: [Mesa-dev] [PATCH v3 3/3] egl/android: Add DRM node probing and filtering

2018-06-14 Thread Amit Pundir
On 13 June 2018 at 20:45, Rob Herring  wrote:
>
> +Amit and John
>
> On Sat, Jun 9, 2018 at 11:27 AM, Robert Foss  
> wrote:
> > This patch both adds support for probing & filtering DRM nodes
> > and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
> > gralloc call.
> >
> > Currently the filtering is based just on the driver name,
> > and the desired name is supplied using the "drm.gpu.vendor_name"
> > Android property.
>
> There's a potential issue with this whole approach and that is
> SELinux. With the way SELinux locks down accesses, getting probing
> thru device files to work can be a pain. It may be better now than the
> prior version because sysfs is not probed. I'll leave it to Amit or
> John to comment.

Right.. so ICYMI, this patch has already been pulled into the external/mesa3d
project of AOSP, and I recently stumbled upon one such /dev/dri/ access denial
on db820c.

In AOSP, zygote-spawned apps already have access to GPU device nodes
in the form of the /dev/gpu_device file, but the missing part is the
open-read access to "/dev/dri/", which needs to be allowed explicitly.
The rest of the denials, related to sysfs access, can be easily resolved
using the audit2allow tool.

Regards,
Amit Pundir
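
For context, the manual renderD node iteration mentioned in the v3 changelog has to touch paths under /dev/dri/, which is why the SELinux policy matters here. A minimal sketch of how such paths are formed, assuming the usual Linux DRM layout (render nodes at minors 128 to 191). render_node_path() is a hypothetical helper, not a function from the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: render nodes use DRM minors 128..191 and appear under
 * /dev/dri/ as renderD<minor>. */
#define RENDER_NODE_FIRST 128u
#define RENDER_NODE_COUNT 64u

/* Write the path of the idx-th render node into buf.
 * Returns 0 on success, -1 if idx is out of range or buf is too small. */
static int
render_node_path(unsigned idx, char *buf, size_t len)
{
   assert(buf != NULL);

   if (idx >= RENDER_NODE_COUNT)
      return -1;

   int n = snprintf(buf, len, "/dev/dri/renderD%u", RENDER_NODE_FIRST + idx);
   return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A prober would open each such path in turn, so both directory search on /dev/dri/ and open-read on the node itself need to be permitted.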

>
> Rob
>
> >
> > Signed-off-by: Robert Foss 
> > ---
> >
> > Changes since v2:
> >  - Switch from drmGetDevices2 to manual renderD node iteration
> >  - Add probe_res enum to communicate probing results better
> >  - Avoid using _eglError() in internal static functions
> >  - Avoid actually loading the driver while probing, just verify
> >that it exists.
> >  - Replace strlen call with the assumed length PROPERTY_VALUE_MAX
> >
> > Changes since v1:
> >  - Do not rely on libdrm for probing
> >  - Distinguish between errors and when no drm devices are found
> >
> > Changes since RFC:
> >  - Rebased on newer libdrm drmHandleMatch patch
> >  - Added support for driver probing
> >
> >
> >  src/egl/drivers/dri2/platform_android.c | 222 ++--
> >  1 file changed, 169 insertions(+), 53 deletions(-)
> >
> > diff --git a/src/egl/drivers/dri2/platform_android.c 
> > b/src/egl/drivers/dri2/platform_android.c
> > index 4ba96aad90..a2cbe92d93 100644
> > --- a/src/egl/drivers/dri2/platform_android.c
> > +++ b/src/egl/drivers/dri2/platform_android.c
> > @@ -27,12 +27,16 @@
> >   * DEALINGS IN THE SOFTWARE.
> >   */
> >
> > +#include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> > +#include 
> >
> >  #include "loader.h"
> >  #include "egl_dri2.h"
> > @@ -1130,31 +1134,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, 
> > _EGLDisplay *dpy)
> > return (config_count != 0);
> >  }
> >
> > -enum {
> > -/* perform(const struct gralloc_module_t *mod,
> > - * int op,
> > - * int *fd);
> > - */
> > -GRALLOC_MODULE_PERFORM_GET_DRM_FD = 0x4002,
> > -};
> > -
> > -static int
> > -droid_open_device(struct dri2_egl_display *dri2_dpy)
> > -{
> > -   int fd = -1, err = -EINVAL;
> > -
> > -   if (dri2_dpy->gralloc->perform)
> > - err = dri2_dpy->gralloc->perform(dri2_dpy->gralloc,
> > -  
> > GRALLOC_MODULE_PERFORM_GET_DRM_FD,
> > -  &fd);
> > -   if (err || fd < 0) {
> > -  _eglLog(_EGL_WARNING, "fail to get drm fd");
> > -  fd = -1;
> > -   }
> > -
> > -   return (fd >= 0) ? fcntl(fd, F_DUPFD_CLOEXEC, 3) : -1;
> > -}
> > -
> >  static const struct dri2_egl_display_vtbl droid_display_vtbl = {
> > .authenticate = NULL,
> > .create_window_surface = droid_create_window_surface,
> > @@ -1215,6 +1194,168 @@ static const __DRIextension 
> > *droid_image_loader_extensions[] = {
> > NULL,
> >  };
> >
> > +EGLBoolean
> > +droid_load_driver(_EGLDisplay *disp)
> > +{
> > +   struct dri2_egl_display *dri2_dpy = disp->DriverData;
> > +   const char *err;
> > +
> > +   dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
> > +   if (dri2_dpy->driver_name == NULL)
> > +  return false;
> > +
> > +   dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == 
> > DRM_NODE_RENDER;
> > +
> > +   if (!dri2_dpy->is_render_node) {
> > +   #ifdef HAVE_DRM_GRALLOC
> > +   /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM 
> > names
> > +* for backwards compatibility with drm_gralloc. (Do not use on new
> > +* systems.) */
> > +   dri2_dpy->loader_extensions = droid_dri2_loader_extensions;
> > +   if (!dri2_load_driver(disp)) {
> > +  err = "DRI2: failed to load driver";
> > +  goto error;
> > +   }
> > +   #else
> > +   err = "DRI2: handle is not for a render node";
> > +   goto error;
> > +   #endif
> > +   } else {
> > +   dri2_dpy->loader_extensions = droid_image_loader_extensions;
> > +   if (!dri2_load_driver_dri3(disp)) {
> > +  err = "DRI3: failed to load driver";
> > +  goto error

Re: [Mesa-dev] [PATCH] anv: reduce maxFragmentInputComponents

2018-06-14 Thread Jason Ekstrand
Makes sense.

Reviewed-by: Jason Ekstrand 

On Tue, May 29, 2018 at 12:07 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:

> If the application asks for the maximum number of fragment input
> components (128), use all of them plus some builtins that are
> passed in the VUE, then we exceed the maximum number of used VUE
> slots (32) and we break one assert that checks this limit.
>
> Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1
> builtins in brw_compute_vue_map() because we don't know if
> gl_ClipDistance is going to be read or written by an adjacent stage.
>
> Fixes VK-GL-CTS CL#2569.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/intel/vulkan/anv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 374fc16c4c9..87c0d0cb4a6 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -898,7 +898,7 @@ void anv_GetPhysicalDeviceProperties(
>.maxGeometryOutputComponents  = 128,
>.maxGeometryOutputVertices= 256,
>.maxGeometryTotalOutputComponents = 1024,
> -  .maxFragmentInputComponents   = 128,
> +  .maxFragmentInputComponents   = 112, /* 128 components - (POS, PSIZ, CLIP_DIST0, CLIP_DIST1) */
>.maxFragmentOutputAttachments = 8,
>.maxFragmentDualSrcAttachments= 1,
>.maxFragmentCombinedOutputResources   = 8,
> --
> 2.17.0
>
>


[Mesa-dev] [Bug 106807] Failed to parse macro "#line"

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106807

--- Comment #2 from Juan A. Suarez  ---
As Kenneth said, Khronos voted to consider #line with an expression as
undefined behaviour, and thus these tests were removed:

https://github.com/KhronosGroup/VK-GL-CTS/commit/4ff5a922a15bcdb93e59313221033bee1204be2c



[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

Tapani Pälli  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|mesa-dev@lists.freedesktop. |lem...@gmail.com
   |org |



Re: [Mesa-dev] [PATCH] radv: Fix output for sparse MRTs.

2018-06-14 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 

On 06/13/2018 11:35 PM, Bas Nieuwenhuizen wrote:

We need to init the cb_shader_format correctly with the changed
col_format, so this moves the col_format adjustment to before the point where
the cb_shader_mask gets generated.

Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1 
---
  src/amd/vulkan/radv_pipeline.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index b8b425aca9f..6eeedc65a39 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -524,20 +524,21 @@ radv_pipeline_compute_spi_color_formats(struct 
radv_pipeline *pipeline,
col_format |= cf << (4 * i);
}
  
-	blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);

-
-   if (blend->mrt0_is_dual_src)
-   col_format |= (col_format & 0xf) << 4;
-   blend->spi_shader_col_format = col_format;
-
/* If the i-th target format is set, all previous target formats must
 * be non-zero to avoid hangs.
 */
-   num_targets = (util_last_bit(blend->spi_shader_col_format) + 3) / 4;
+   num_targets = (util_last_bit(col_format) + 3) / 4;
for (unsigned i = 0; i < num_targets; i++) {
-   if (!(blend->spi_shader_col_format & (0xf << (i * 4))))
-   blend->spi_shader_col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+   if (!(col_format & (0xf << (i * 4)))) {
+   col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+   }
}
+
+   blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
+
+   if (blend->mrt0_is_dual_src)
+   col_format |= (col_format & 0xf) << 4;
+   blend->spi_shader_col_format = col_format;
  }
  
  static bool
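
The hole-filling loop in this patch can be tried in isolation. A minimal sketch, assuming V_028714_SPI_SHADER_32_R has the value 1 and that a zero nibble means "no export" for that target. last_bit() is a stand-in for Mesa's util_last_bit(), and fill_sparse_col_format() is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of V_028714_SPI_SHADER_32_R; a zero nibble means the
 * target has no export format. */
#define SPI_SHADER_32_R 1u

/* Index of the highest set bit plus one, or 0 if v is 0. */
static unsigned
last_bit(uint32_t v)
{
   unsigned n = 0;
   while (v) {
      n++;
      v >>= 1;
   }
   return n;
}

/* Give every empty 4-bit format slot below the highest used slot a
 * non-zero format, as the patch does before computing cb_shader_mask. */
static uint32_t
fill_sparse_col_format(uint32_t col_format)
{
   unsigned num_targets = (last_bit(col_format) + 3) / 4;

   assert(num_targets <= 8);
   for (unsigned i = 0; i < num_targets; i++) {
      if (!(col_format & (0xfu << (i * 4))))
         col_format |= SPI_SHADER_32_R << (i * 4);
   }
   return col_format;
}
```

With only the third target in use, the two lower slots get filled in; a format with no targets at all is left untouched.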





Re: [Mesa-dev] [PATCH 0/8] i965: Don't recycle BOs until they are idle

2018-06-14 Thread Michel Dänzer
On 2018-06-13 10:26 PM, Jason Ekstrand wrote:
> The current BO cache puts BOs back into the recycle bucket the moment the
> refcount hits zero.  If the BO is busy, we just don't re-use it until it
> isn't or we re-use it for a render target which we assume will be used
> first for drawing.  This patch series reworks the way the BO cache works a
> bit so that we don't ever recycle a busy BO.  On the down side, it means
> that we don't get the "keep busy BOs busy" heuristic (which we have no
> proof actually helps).  On the up side, we can now easily use a MRU
> heuristic instead of round-robin for all buffers and not just the busy
> ones.  Will this be an improvement, a regression or a wash?  I don't know
> but I doubt it will have a major effect one way or another.

FWIW, I suspect this could be a significant loss with overlapping copies
in glamor (e.g. x11perf -copywinwin500), because it won't be able to
reuse the busy BOs anymore (glamor creates a temporary FBO for each
overlapping copy).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] radv: allow RADV_PERFTEST=dccmsaa on GFX9

2018-06-14 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5936b43093..e7fc45ef35 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -329,8 +329,8 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->out_of_order_rast_allowed = device->has_out_of_order_rast &&
!(device->instance->debug_flags & 
RADV_DEBUG_NO_OUT_OF_ORDER);
 
-   device->dcc_msaa_allowed = device->rad_info.chip_class == VI &&
-  (device->instance->perftest_flags & 
RADV_PERFTEST_DCC_MSAA);
+   device->dcc_msaa_allowed =
+   (device->instance->perftest_flags & RADV_PERFTEST_DCC_MSAA);
 
radv_physical_device_init_mem_types(device);
radv_fill_device_extension_table(device, &device->supported_extensions);
-- 
2.17.1



[Mesa-dev] [Bug 105396] tc compatible htile sets depth of htiles of discarded fragments to 1.0

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105396

Samuel Pitoiset  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Samuel Pitoiset  ---
Fixed.
https://cgit.freedesktop.org/mesa/mesa/commit/?id=68dead112e710b261ad33604175d635dec6afd34



Re: [Mesa-dev] [PATCH v2 1/5] util: manually extract the program name from program_invocation_name

2018-06-14 Thread Eric Engestrom
On Thursday, 2018-06-14 11:00:21 +1000, Timothy Arceri wrote:
> Glibc has the same code to get program_invocation_short_name. However
> for some reason the short name gets mangled for some wine apps.
> 
> For example with Google Earth VR I get:
> 
> program_invocation_name:
> "/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"
> 
> program_invocation_short_name:
> "e"
> ---
>  src/util/xmlconfig.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/src/util/xmlconfig.c b/src/util/xmlconfig.c
> index 60a6331c86c..ad943e2ce48 100644
> --- a/src/util/xmlconfig.c
> +++ b/src/util/xmlconfig.c
> @@ -45,7 +45,16 @@
>  /* These aren't declared in any libc5 header */
>  extern char *program_invocation_name, *program_invocation_short_name;
>  #endif
> -#define GET_PROGRAM_NAME() program_invocation_short_name
> +static const char *
> +__getProgramName()
> +{
> +char * arg = strrchr(program_invocation_name, '/');
> +if (arg)
> +return arg+1;
> +else
> +return program_invocation_name;
> +}
> +#define GET_PROGRAM_NAME() __getProgramName()

How about:

  #include 
  #define GET_PROGRAM_NAME() basename(program_invocation_name)

>  #elif defined(__CYGWIN__)
>  #define GET_PROGRAM_NAME() program_invocation_short_name
>  #elif defined(__FreeBSD__) && (__FreeBSD__ >= 2)
> -- 
> 2.17.1
> 
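
For comparison, the strrchr() approach from the patch can be written as a small standalone helper. A minimal sketch; short_program_name() is a hypothetical name, and the path is passed in as a parameter instead of being read from program_invocation_name:

```c
#include <assert.h>
#include <string.h>

/* Return the part of path after the last '/', or path itself if it
 * contains no '/'. The result points into path; nothing is copied
 * or modified. */
static const char *
short_program_name(const char *path)
{
   assert(path != NULL);

   const char *slash = strrchr(path, '/');
   return slash ? slash + 1 : path;
}
```

One thing to keep in mind when weighing this against basename(): the POSIX version is allowed to modify its argument, while the GNU version is not, so which behaviour you get depends on the headers in use.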


[Mesa-dev] [PATCH] glsl: serialize data from glTransformFeedbackVaryings

2018-06-14 Thread Tapani Pälli
While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).

Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
---
 src/compiler/glsl/serialize.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/compiler/glsl/serialize.cpp b/src/compiler/glsl/serialize.cpp
index 727822633d..4cb74ddba9 100644
--- a/src/compiler/glsl/serialize.cpp
+++ b/src/compiler/glsl/serialize.cpp
@@ -323,6 +323,12 @@ write_xfb(struct blob *metadata, struct gl_shader_program 
*shProg)
 
blob_write_uint32(metadata, prog->info.stage);
 
+   /* Data set by glTransformFeedbackVaryings. */
+   blob_write_uint32(metadata, shProg->TransformFeedback.BufferMode);
+   blob_write_uint32(metadata, shProg->TransformFeedback.NumVarying);
+   for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; i++)
+  blob_write_string(metadata, shProg->TransformFeedback.VaryingNames[i]);
+
blob_write_uint32(metadata, ltf->NumOutputs);
blob_write_uint32(metadata, ltf->ActiveBuffers);
blob_write_uint32(metadata, ltf->NumVarying);
@@ -352,6 +358,15 @@ read_xfb(struct blob_reader *metadata, struct 
gl_shader_program *shProg)
if (xfb_stage == ~0u)
   return;
 
+   /* Data set by glTransformFeedbackVaryings. */
+   shProg->TransformFeedback.BufferMode = blob_read_uint32(metadata);
+   shProg->TransformFeedback.NumVarying = blob_read_uint32(metadata);
+   shProg->TransformFeedback.VaryingNames = (char **)
+  malloc(shProg->TransformFeedback.NumVarying * sizeof(GLchar *));
+   for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; i++)
+  shProg->TransformFeedback.VaryingNames[i] =
+ strdup(blob_read_string(metadata));
+
struct gl_program *prog = shProg->_LinkedShaders[xfb_stage]->Program;
struct gl_transform_feedback_info *ltf =
   rzalloc(prog, struct gl_transform_feedback_info);
-- 
2.14.4
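
The write/read symmetry the patch relies on (a count first, then each name, read back in the same order) can be sketched without Mesa. This is only an illustration under stated assumptions: struct buf and the buf_* helpers are hypothetical stand-ins for struct blob and the blob_* functions, with a fixed-size buffer and asserts in place of real error handling:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size byte buffer standing in for struct blob. */
struct buf {
   uint8_t data[256];
   size_t len;   /* bytes written so far */
   size_t pos;   /* read cursor */
};

static void
buf_write_u32(struct buf *b, uint32_t v)
{
   assert(b->len + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + b->len, &v, sizeof(v));
   b->len += sizeof(v);
}

static void
buf_write_string(struct buf *b, const char *s)
{
   size_t n = strlen(s) + 1;   /* include the terminating NUL */
   assert(b->len + n <= sizeof(b->data));
   memcpy(b->data + b->len, s, n);
   b->len += n;
}

static uint32_t
buf_read_u32(struct buf *b)
{
   uint32_t v;
   assert(b->pos + sizeof(v) <= b->len);
   memcpy(&v, b->data + b->pos, sizeof(v));
   b->pos += sizeof(v);
   return v;
}

/* Returns a pointer into the buffer; callers copy it to keep it. */
static const char *
buf_read_string(struct buf *b)
{
   assert(b->pos < b->len);
   const char *s = (const char *)(b->data + b->pos);
   assert(memchr(s, '\0', b->len - b->pos) != NULL);
   b->pos += strlen(s) + 1;
   return s;
}

/* Write the varying count followed by each name. */
static void
write_varyings(struct buf *b, uint32_t count, const char *const *names)
{
   buf_write_u32(b, count);
   for (uint32_t i = 0; i < count; i++)
      buf_write_string(b, names[i]);
}

/* Read the names back in the same order. The caller owns the array
 * and every string in it. */
static char **
read_varyings(struct buf *b, uint32_t *count)
{
   *count = buf_read_u32(b);
   char **names = calloc(*count ? *count : 1, sizeof(char *));
   assert(names != NULL);
   for (uint32_t i = 0; i < *count; i++) {
      const char *s = buf_read_string(b);
      size_t n = strlen(s) + 1;
      names[i] = malloc(n);
      assert(names[i] != NULL);
      memcpy(names[i], s, n);
   }
   return names;
}
```

The reader has to consume fields in exactly the order the writer emitted them, which is why the patch adds its new fields at matching positions in write_xfb() and read_xfb().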



[Mesa-dev] [Bug 106903] radv: Fragment shader output goes to wrong attachments when render targets are sparse

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106903

Bas Nieuwenhuizen  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #3 from Bas Nieuwenhuizen  ---
Fixed by 

https://gitlab.freedesktop.org/mesa/mesa/commit/41dabdc47538fb7660f7063d9dd423473eaa2515



[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

--- Comment #6 from Tapani Pälli  ---
Fix proposal sent here:
https://lists.freedesktop.org/archives/mesa-dev/2018-June/197678.html



[Mesa-dev] [Bug 106915] [GLSL] Unused arrays declared without a size should be handled like arrays of size 1.

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106915

Bug ID: 106915
   Summary: [GLSL] Unused arrays declared without a size should be
handled like arrays of size 1.
   Product: Mesa
   Version: unspecified
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: es...@igalia.com
QA Contact: mesa-dev@lists.freedesktop.org

From GLSLang Spec 4.60 Section 4.2 Scoping:

"An array implicitly sized in one shader can be explicitly sized by another
shader in the same stage. If no shader in a stage has an explicit size for the
array, the largest implicit size (one more than the largest index used) in that
stage is used. There is no cross-stage array sizing. If there is no static
access to an implicitly sized array within the stage declaring it, then the
array is given a size of 1, which is relevant when the array is declared within
an interface block that is shared with other stages or the application (other
unused arrays might be eliminated by the optimizer)."

According to the paragraph above, the following piglit test should not generate
any errors as s[] would be treated as an array of size 1:

[vertex shader]
#version 150
#extension GL_ARB_shader_storage_buffer_object: require
buffer a {
vec4 s[];
vec4 a[];
} b;

in vec4 piglit_vertex;
out vec4 c;

void main(void) {
c = b.a[0];

gl_Position = piglit_vertex;
}

[test]
link error

Test:
spec/arb_shader_storage_buffer_object/linker/unsized_array_member.shader_test

but it does.

If we convert the GLSL code to SPIR-V there are no linker errors anymore since
s[] seems to have type: OpTypeArray with size 1 and the linker accepts it.
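
The sizing rule quoted from the spec boils down to three cases. A minimal sketch of that rule only, not of Mesa's linker; resolve_array_size() and its parameter conventions are hypothetical:

```c
#include <assert.h>

/* explicit_size: size declared by any shader in the stage, or 0 if none.
 * max_index:     largest index statically used in the stage, or -1 if
 *                the array is never accessed. */
static unsigned
resolve_array_size(unsigned explicit_size, int max_index)
{
   assert(max_index >= -1);

   if (explicit_size > 0)
      return explicit_size;          /* an explicit size wins */
   if (max_index >= 0)
      return (unsigned)max_index + 1; /* largest implicit size */
   return 1;                         /* no static access: size 1 */
}
```

Under this reading, s[] in the test above has no explicit size and no static access, so it would resolve to a size of 1.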



[Mesa-dev] [Bug 106915] [GLSL] Unused arrays declared without a size should be handled like arrays of size 1.

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106915

Neil Roberts  changed:

   What|Removed |Added

 CC||nrobe...@igalia.com


[Mesa-dev] [PATCH 2/3] mesa: add a space between headers and source (trivial)

2018-06-14 Thread Tapani Pälli
There used to be one and it looks like it was removed by eb63640c1d.

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/program_resource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c
index 41024d68ce..fedd1f183c 100644
--- a/src/mesa/main/program_resource.c
+++ b/src/mesa/main/program_resource.c
@@ -31,6 +31,7 @@
 #include "main/context.h"
 #include "program_resource.h"
 #include "compiler/glsl/ir_uniform.h"
+
 static bool
 supported_interface_enum(struct gl_context *ctx, GLenum iface)
 {
-- 
2.14.4



[Mesa-dev] [PATCH 3/3] i965: small cleanup in blorp debug printing output (trivial)

2018-06-14 Thread Tapani Pälli
Signed-off-by: Tapani Pälli 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8c6d77e1b7..5f99e51bc2 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -292,7 +292,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
 {
const struct gen_device_info *devinfo = &brw->screen->devinfo;
 
-   DBG("%s from %dx %s mt %p %d %d (%f,%f) (%f,%f)"
+   DBG("%s from %dx %s mt %p %d %d (%f,%f) (%f,%f) "
"to %dx %s mt %p %d %d (%f,%f) (%f,%f) (flip %d,%d)\n",
__func__,
src_mt->surf.samples, _mesa_get_format_name(src_mt->format), src_mt,
-- 
2.14.4



[Mesa-dev] [PATCH 1/3] features.txt: mark some extensions as done

2018-06-14 Thread Tapani Pälli
Signed-off-by: Tapani Pälli 
---
 docs/features.txt | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index b32606d223..423b03a9a9 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -322,12 +322,14 @@ Khronos, ARB, and OES extensions that are not part of any 
OpenGL or OpenGL ES ve
   GL_EXT_semaphore  DONE (radeonsi)
   GL_EXT_semaphore_fd   DONE (radeonsi)
   GL_EXT_semaphore_win32not started
+  GL_EXT_texture_norm16 DONE (i965, r600, 
radeonsi, nvc0)
   GL_KHR_blend_equation_advanced_coherent   DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr   DONE (i965/bxt)
   GL_KHR_texture_compression_astc_sliced_3d DONE (i965/gen9+)
   GL_OES_depth_texture_cube_map DONE (all drivers that 
support GLSL 1.30+)
   GL_OES_EGL_image  DONE (all drivers)
-  GL_OES_EGL_image_external_essl3   not started
+  GL_OES_EGL_image_external DONE (all drivers)
+  GL_OES_EGL_image_external_essl3   DONE (all drivers)
   GL_OES_required_internalformatDONE (all drivers)
   GL_OES_surfaceless_contextDONE (all drivers)
   GL_OES_texture_compression_astc   DONE (core only)
@@ -335,7 +337,7 @@ Khronos, ARB, and OES extensions that are not part of any 
OpenGL or OpenGL ES ve
   GL_OES_texture_float_linear   DONE (freedreno, i965, 
r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float DONE (freedreno, i965, 
r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear  DONE (freedreno, i965, 
r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
-  GL_OES_texture_view   not started - based on 
GL_ARB_texture_view
+  GL_OES_texture_view   DONE (i965/gen8+)
   GL_OES_viewport_array DONE (i965, nvc0, 
radeonsi)
   GLX_ARB_context_flush_control not started
   GLX_ARB_robustness_application_isolation  not started
-- 
2.14.4



[Mesa-dev] [PATCH 1/6] radv: update the fast color clear values only if the image is bound

2018-06-14 Thread Samuel Pitoiset
It's unnecessary to update the fast color clear values if the
fast cleared color image isn't currently bound.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 35 +---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 53fb4988a8..a4a2e97321 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1299,6 +1299,37 @@ radv_set_dcc_need_cmask_elim_pred(struct radv_cmd_buffer 
*cmd_buffer,
radeon_emit(cmd_buffer->cs, pred_val >> 32);
 }
 
+/**
+ * Update the fast clear color values if the image is bound as a color buffer.
+ */
+static void
+radv_update_bound_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
+  struct radv_image *image,
+  int cb_idx,
+  uint32_t color_values[2])
+{
+   struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
+   const struct radv_subpass *subpass = cmd_buffer->state.subpass;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   struct radv_attachment_info *att;
+   uint32_t att_idx;
+
+   if (!framebuffer || !subpass)
+   return;
+
+   att_idx = subpass->color_attachments[cb_idx].attachment;
+   if (att_idx == VK_ATTACHMENT_UNUSED)
+   return;
+
+   att = &framebuffer->attachments[att_idx];
+   if (att->attachment->image != image)
+   return;
+
+   radeon_set_context_reg_seq(cs, R_028C8C_CB_COLOR0_CLEAR_WORD0 + cb_idx 
* 0x3c, 2);
+   radeon_emit(cs, color_values[0]);
+   radeon_emit(cs, color_values[1]);
+}
+
 void
 radv_set_color_clear_regs(struct radv_cmd_buffer *cmd_buffer,
  struct radv_image *image,
@@ -1319,9 +1350,7 @@ radv_set_color_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
radeon_emit(cmd_buffer->cs, color_values[0]);
radeon_emit(cmd_buffer->cs, color_values[1]);
 
-   radeon_set_context_reg_seq(cmd_buffer->cs, 
R_028C8C_CB_COLOR0_CLEAR_WORD0 + idx * 0x3c, 2);
-   radeon_emit(cmd_buffer->cs, color_values[0]);
-   radeon_emit(cmd_buffer->cs, color_values[1]);
+   radv_update_bound_fast_clear_color(cmd_buffer, image, idx, 
color_values);
 }
 
 static void
-- 
2.17.1
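
The early returns in radv_update_bound_fast_clear_color() amount to a single "is this image currently bound at this color slot?" test. A minimal sketch of that check, using stripped-down hypothetical structs in place of the radv ones; only the ~0u value of VK_ATTACHMENT_UNUSED is taken from Vulkan itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATTACHMENT_UNUSED (~0u)   /* same value as VK_ATTACHMENT_UNUSED */

/* Hypothetical, simplified stand-ins for the radv structures. */
struct image { int id; };
struct framebuffer { const struct image **attachments; };
struct subpass { const uint32_t *color_attachments; uint32_t color_count; };

/* True only if there is an active framebuffer and subpass, the color
 * slot refers to a real attachment, and that attachment is the image. */
static bool
color_image_is_bound(const struct framebuffer *fb, const struct subpass *sp,
                     uint32_t cb_idx, const struct image *image)
{
   if (fb == NULL || sp == NULL)
      return false;

   assert(cb_idx < sp->color_count);

   uint32_t att_idx = sp->color_attachments[cb_idx];
   if (att_idx == ATTACHMENT_UNUSED)
      return false;

   return fb->attachments[att_idx] == image;
}
```

In the patch, the context registers are written only when the equivalent of this check passes; the metadata write in memory happens regardless.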



[Mesa-dev] [PATCH 3/6] radv: update the fast ds clear values only if the image is bound

2018-06-14 Thread Samuel Pitoiset
It's unnecessary to update the fast depth/stencil clear values
if the fast cleared depth/stencil image isn't currently bound.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 37 +++-
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index de4af76ce6..ad83bc6c6f 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1171,6 +1171,37 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
   ds->pa_su_poly_offset_db_fmt_cntl);
 }
 
+/**
+ * Update the fast clear depth/stencil values if the image is bound as a
+ * depth/stencil buffer.
+ */
+static void
+radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
+   struct radv_image *image,
+   VkClearDepthStencilValue ds_clear_value)
+{
+   struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
+   const struct radv_subpass *subpass = cmd_buffer->state.subpass;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   struct radv_attachment_info *att;
+   uint32_t att_idx;
+
+   if (!framebuffer || !subpass)
+   return;
+
+   att_idx = subpass->depth_stencil_attachment.attachment;
+   if (att_idx == VK_ATTACHMENT_UNUSED)
+   return;
+
+   att = &framebuffer->attachments[att_idx];
+   if (att->attachment->image != image)
+   return;
+
+   radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
+   radeon_emit(cs, ds_clear_value.stencil);
+   radeon_emit(cs, fui(ds_clear_value.depth));
+}
+
 void
 radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
  struct radv_image *image,
@@ -1203,11 +1234,7 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth));
 
-   radeon_set_context_reg_seq(cmd_buffer->cs, R_028028_DB_STENCIL_CLEAR + 
4 * reg_offset, reg_count);
-   if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT)
-   radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* 
R_028028_DB_STENCIL_CLEAR */
-   if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
-   radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* 
R_02802C_DB_DEPTH_CLEAR */
+   radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value);
 
/* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
 * only needed when clearing Z to 0.0.
-- 
2.17.1



[Mesa-dev] [PATCH 2/6] radv: clean up radv_{set, load}_color_clear_regs() helpers

2018-06-14 Thread Samuel Pitoiset
And replace _regs by _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 67 +++-
 src/amd/vulkan/radv_meta_clear.c |  3 +-
 src/amd/vulkan/radv_private.h| 10 +++--
 3 files changed, 47 insertions(+), 33 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index a4a2e97321..de4af76ce6 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1330,53 +1330,64 @@ radv_update_bound_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, color_values[1]);
 }
 
+/**
+ * Set the clear color values to the image's metadata.
+ */
 void
-radv_set_color_clear_regs(struct radv_cmd_buffer *cmd_buffer,
- struct radv_image *image,
- int idx,
- uint32_t color_values[2])
+radv_set_color_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
+ struct radv_image *image,
+ int cb_idx,
+ uint32_t color_values[2])
 {
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
+
va += image->offset + image->clear_value_offset;
 
assert(radv_image_has_cmask(image) || radv_image_has_dcc(image));
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
-   radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
-   S_370_WR_CONFIRM(1) |
-   S_370_ENGINE_SEL(V_370_PFP));
-   radeon_emit(cmd_buffer->cs, va);
-   radeon_emit(cmd_buffer->cs, va >> 32);
-   radeon_emit(cmd_buffer->cs, color_values[0]);
-   radeon_emit(cmd_buffer->cs, color_values[1]);
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_PFP));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, color_values[0]);
+   radeon_emit(cs, color_values[1]);
 
-   radv_update_bound_fast_clear_color(cmd_buffer, image, idx, color_values);
+   radv_update_bound_fast_clear_color(cmd_buffer, image, cb_idx,
+  color_values);
 }
 
+/**
+ * Load the clear color values from the image's metadata.
+ */
 static void
-radv_load_color_clear_regs(struct radv_cmd_buffer *cmd_buffer,
-  struct radv_image *image,
-  int idx)
+radv_load_color_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
+  struct radv_image *image,
+  int cb_idx)
 {
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
+
va += image->offset + image->clear_value_offset;
 
if (!radv_image_has_cmask(image) && !radv_image_has_dcc(image))
return;
 
-   uint32_t reg = R_028C8C_CB_COLOR0_CLEAR_WORD0 + idx * 0x3c;
+   uint32_t reg = R_028C8C_CB_COLOR0_CLEAR_WORD0 + cb_idx * 0x3c;
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COPY_DATA, 4, cmd_buffer->state.predicating));
-   radeon_emit(cmd_buffer->cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
-   COPY_DATA_DST_SEL(COPY_DATA_REG) |
-   COPY_DATA_COUNT_SEL);
-   radeon_emit(cmd_buffer->cs, va);
-   radeon_emit(cmd_buffer->cs, va >> 32);
-   radeon_emit(cmd_buffer->cs, reg >> 2);
-   radeon_emit(cmd_buffer->cs, 0);
+   radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, cmd_buffer->state.predicating));
+   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
+   COPY_DATA_DST_SEL(COPY_DATA_REG) |
+   COPY_DATA_COUNT_SEL);
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, reg >> 2);
+   radeon_emit(cs, 0);
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_PFP_SYNC_ME, 0, cmd_buffer->state.predicating));
-   radeon_emit(cmd_buffer->cs, 0);
+   radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, cmd_buffer->state.predicating));
+   radeon_emit(cs, 0);
 }
 
 static void
@@ -1407,7 +1418,7 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
assert(att->attachment->aspect_mask & VK_IMAGE_ASPECT_COLOR_BIT);
radv_emit_fb_color_state(cmd_buffer, i, att, image, layout);
 
-   radv_load_color_clear_regs(cmd_buffer, image, i);
+   radv_load_color_clear_metadata(cmd_buffer, image, i);
}
 
if(subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) {
diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index 373072dd36..26dc3e6ede 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulka

[Mesa-dev] [PATCH 4/6] radv: always set/load both depth and stencil clear values

2018-06-14 Thread Samuel Pitoiset
I don't think it matters much if we always emit both values, and
that makes the code a bit simpler.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 33 +---
 1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ad83bc6c6f..c2db11d041 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1210,29 +1210,17 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
 {
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
-   unsigned reg_offset = 0, reg_count = 0;
 
assert(radv_image_has_htile(image));
 
-   if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
-   ++reg_count;
-   } else {
-   ++reg_offset;
-   va += 4;
-   }
-   if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
-   ++reg_count;
-
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 2 + reg_count, 0));
+   radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, va >> 32);
-   if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT)
-   radeon_emit(cmd_buffer->cs, ds_clear_value.stencil);
-   if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
-   radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth));
+   radeon_emit(cmd_buffer->cs, ds_clear_value.stencil);
+   radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth));
 
radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value);
 
@@ -1270,30 +1258,19 @@ static void
 radv_load_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
   struct radv_image *image)
 {
-   VkImageAspectFlags aspects = vk_format_aspects(image->vk_format);
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
-   unsigned reg_offset = 0, reg_count = 0;
 
if (!radv_image_has_htile(image))
return;
 
-   if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
-   ++reg_count;
-   } else {
-   ++reg_offset;
-   va += 4;
-   }
-   if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
-   ++reg_count;
-
radeon_emit(cmd_buffer->cs, PKT3(PKT3_COPY_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
COPY_DATA_DST_SEL(COPY_DATA_REG) |
-   (reg_count == 2 ? COPY_DATA_COUNT_SEL : 0));
+   COPY_DATA_COUNT_SEL);
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, va >> 32);
-   radeon_emit(cmd_buffer->cs, (R_028028_DB_STENCIL_CLEAR + 4 * reg_offset) >> 2);
+   radeon_emit(cmd_buffer->cs, R_028028_DB_STENCIL_CLEAR >> 2);
radeon_emit(cmd_buffer->cs, 0);
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
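
A rough illustration of what patch 4/6 relies on: the stencil and depth
clear values occupy two consecutive dwords (R_028028_DB_STENCIL_CLEAR
followed by R_02802C_DB_DEPTH_CLEAR), so both can always be written as a
pair regardless of which aspects the image has. The sketch below is
standalone C, not RADV code. float_bits() stands in for the role fui()
plays in the patch, and struct ds_clear_value and pack_ds_clear() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for VkClearDepthStencilValue. */
struct ds_clear_value {
    float depth;
    uint32_t stencil;
};

/* Reinterpret the bits of a float as a 32-bit integer, without
 * converting the value (assumes a 32-bit float). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Fill the two payload dwords in register order: stencil, then depth. */
static void pack_ds_clear(struct ds_clear_value v, uint32_t out[2])
{
    assert(out != NULL);
    out[0] = v.stencil;           /* DB_STENCIL_CLEAR */
    out[1] = float_bits(v.depth); /* DB_DEPTH_CLEAR */
}
```

With the payload always two dwords long, the packet length and the
destination address no longer depend on the aspect mask, which is where
the removed reg_offset/reg_count bookkeeping came from.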


[Mesa-dev] [PATCH 6/6] radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds()

2018-06-14 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 46 +++-
 1 file changed, 15 insertions(+), 31 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 894960461a..56dbb759cb 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1178,7 +1178,8 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
 static void
 radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
-   VkClearDepthStencilValue ds_clear_value)
+   VkClearDepthStencilValue ds_clear_value,
+   VkImageAspectFlags aspects)
 {
struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
@@ -1200,6 +1201,17 @@ radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
radeon_emit(cs, ds_clear_value.stencil);
radeon_emit(cs, fui(ds_clear_value.depth));
+
+   /* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
+* only needed when clearing Z to 0.0.
+*/
+   if ((aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
+   ds_clear_value.depth == 0.0) {
+   VkImageLayout layout = subpass->depth_stencil_attachment.layout;
+
+   radv_update_zrange_precision(cmd_buffer, &att->ds, image,
+layout, false);
+   }
 }
 
 /**
@@ -1227,36 +1239,8 @@ radv_set_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, ds_clear_value.stencil);
radeon_emit(cs, fui(ds_clear_value.depth));
 
-   radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value);
-
-   /* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
-* only needed when clearing Z to 0.0.
-*/
-   if ((aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
-   ds_clear_value.depth == 0.0) {
-   struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
-   const struct radv_subpass *subpass = cmd_buffer->state.subpass;
-
-   if (!framebuffer || !subpass)
-   return;
-
-   if (subpass->depth_stencil_attachment.attachment == VK_ATTACHMENT_UNUSED)
-   return;
-
-   int idx = subpass->depth_stencil_attachment.attachment;
-   VkImageLayout layout = subpass->depth_stencil_attachment.layout;
-   struct radv_attachment_info *att = &framebuffer->attachments[idx];
-   struct radv_image *image = att->attachment->image;
-
-   /* Only needed if the image is currently bound as the depth
-* surface.
-*/
-   if (att->attachment->image != image)
-   return;
-
-   radv_update_zrange_precision(cmd_buffer, &att->ds, image,
-layout, false);
-   }
+   radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value,
+   aspects);
 }
 
 /**
-- 
2.17.1



[Mesa-dev] [PATCH 5/6] radv: clean up radv_{set, load}_depth_clear_regs() helpers

2018-06-14 Thread Samuel Pitoiset
Also replace the _regs suffix with _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 62 ++--
 src/amd/vulkan/radv_meta_clear.c |  5 +--
 src/amd/vulkan/radv_private.h|  9 ++---
 3 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index c2db11d041..894960461a 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1202,25 +1202,30 @@ radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, fui(ds_clear_value.depth));
 }
 
+/**
+ * Set the clear depth/stencil values to the image's metadata.
+ */
 void
-radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
- struct radv_image *image,
- VkClearDepthStencilValue ds_clear_value,
- VkImageAspectFlags aspects)
+radv_set_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
+  struct radv_image *image,
+  VkClearDepthStencilValue ds_clear_value,
+  VkImageAspectFlags aspects)
 {
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
+
va += image->offset + image->clear_value_offset;
 
assert(radv_image_has_htile(image));
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
-   radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
-   S_370_WR_CONFIRM(1) |
-   S_370_ENGINE_SEL(V_370_PFP));
-   radeon_emit(cmd_buffer->cs, va);
-   radeon_emit(cmd_buffer->cs, va >> 32);
-   radeon_emit(cmd_buffer->cs, ds_clear_value.stencil);
-   radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth));
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_PFP));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, ds_clear_value.stencil);
+   radeon_emit(cs, fui(ds_clear_value.depth));
 
radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value);
 
@@ -1254,27 +1259,32 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
}
 }
 
+/**
+ * Load the clear depth/stencil values from the image's metadata.
+ */
 static void
-radv_load_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
-  struct radv_image *image)
+radv_load_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
+   struct radv_image *image)
 {
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
+
va += image->offset + image->clear_value_offset;
 
if (!radv_image_has_htile(image))
return;
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COPY_DATA, 4, 0));
-   radeon_emit(cmd_buffer->cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
-   COPY_DATA_DST_SEL(COPY_DATA_REG) |
-   COPY_DATA_COUNT_SEL);
-   radeon_emit(cmd_buffer->cs, va);
-   radeon_emit(cmd_buffer->cs, va >> 32);
-   radeon_emit(cmd_buffer->cs, R_028028_DB_STENCIL_CLEAR >> 2);
-   radeon_emit(cmd_buffer->cs, 0);
+   radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, 0));
+   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
+   COPY_DATA_DST_SEL(COPY_DATA_REG) |
+   COPY_DATA_COUNT_SEL);
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, R_028028_DB_STENCIL_CLEAR >> 2);
+   radeon_emit(cs, 0);
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
-   radeon_emit(cmd_buffer->cs, 0);
+   radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
+   radeon_emit(cs, 0);
 }
 
 /*
@@ -1444,7 +1454,7 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
cmd_buffer->state.offset_scale = att->ds.offset_scale;
}
-   radv_load_depth_clear_regs(cmd_buffer, image);
+   radv_load_ds_clear_metadata(cmd_buffer, image);
} else {
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9)
radeon_set_context_reg_seq(cmd_buffer->cs, R_028038_DB_Z_INFO, 2);
@@ -3934,7 +3944,7 @@ static void radv_initialize_htile(struct radv_cmd_buffer *cmd_buffer,
if (vk_format_is_stencil(image->vk_format))
aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
 
-   radv_set_depth_clear_regs(cmd_buffer, image, value, aspects);
+   radv_set_ds_clear_metadata(cmd_buffer, 

[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #8 from Giovanni ongaro  ---
I did build Mesa in debug mode, but how do I activate it? I also submitted the
same bug to the DXVK developers, and they said that all the shaders are
validated and that it is probably not a DXVK bug.
Now I do not know whether it is a Mesa or an LLVM bug.
I used Wine 3.10, DXVK git, Mesa git and LLVM git from 12 June.
My system is a Ryzen 2700X with a Vega 64, FC27 and 16 GB of RAM.
So far I have found 3 games crashing with the same bug:
The Long Journey Home
Quantum Break
Just Cause 3
I also have an RX 480 in my secondary PCIe slot. I tested with the RX 480
on the same system setup and all 3 games work.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #9 from Samuel Pitoiset  ---
Well, the LLVM documentation [1] says:

"BUILD_SHARED_LIBS is only recommended for use by LLVM developers. If you want
to build LLVM as a shared library, you should use the LLVM_BUILD_LLVM_DYLIB
option."

Please, don't use that and re-build your LLVM with:

-DLLVM_BUILD_LLVM_DYLIB=ON
-DLLVM_LINK_LLVM_DYLIB=ON

(and remove BUILD_SHARED_LIBS, of course).

Let me know if it still crashes with these new flags.

[1] https://llvm.org/docs/CMake.html



[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #10 from Pavel Ondračka  ---
BTW, regarding the debug information from winedbg: it might help to specifically
build mesa (and maybe llvm) with CFLAGS="-g -gdwarf-2", since some distros
(like Fedora) default to a newer DWARF version which winedbg doesn't support.



[Mesa-dev] [PATCH] radv: add RADV_DEBUG=checkir

2018-06-14 Thread Samuel Pitoiset
This allows running the LLVM verifier pass.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.h   |  1 +
 src/amd/vulkan/radv_device.c  |  1 +
 src/amd/vulkan/radv_nir_to_llvm.c | 10 +++---
 src/amd/vulkan/radv_shader.c  |  1 +
 src/amd/vulkan/radv_shader.h  |  1 +
 5 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index 762b338219..1e71349509 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -48,6 +48,7 @@ enum {
RADV_DEBUG_INFO  = 0x4,
RADV_DEBUG_ERRORS= 0x8,
RADV_DEBUG_STARTUP   = 0x10,
+   RADV_DEBUG_CHECKIR   = 0x20,
 };
 
 enum {
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5936b43093..1ffbe75ef6 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -410,6 +410,7 @@ static const struct debug_control radv_debug_options[] = {
{"info", RADV_DEBUG_INFO},
{"errors", RADV_DEBUG_ERRORS},
{"startup", RADV_DEBUG_STARTUP},
+   {"checkir", RADV_DEBUG_CHECKIR},
{NULL, 0}
 };
 
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index a56f017e25..5168c9d554 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2967,13 +2967,17 @@ handle_shader_outputs_post(struct ac_shader_abi *abi, unsigned max_outputs,
}
 }
 
-static void ac_llvm_finalize_module(struct radv_shader_context *ctx)
+static void ac_llvm_finalize_module(struct radv_shader_context *ctx,
+   const struct radv_nir_compiler_options *options)
 {
LLVMPassManagerRef passmgr;
/* Create the pass manager */
passmgr = LLVMCreateFunctionPassManagerForModule(
ctx->ac.module);
 
+   if (options->check_ir)
+   LLVMAddVerifierPass(passmgr);
+
/* This pass should eliminate all the load and store instructions */
LLVMAddPromoteMemoryToRegisterPass(passmgr);
 
@@ -3299,7 +3303,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
if (options->dump_preoptir)
ac_dump_module(ctx.ac.module);
 
-   ac_llvm_finalize_module(&ctx);
+   ac_llvm_finalize_module(&ctx, options);
 
if (shader_count == 1)
ac_nir_eliminate_const_vs_outputs(&ctx);
@@ -3617,7 +3621,7 @@ radv_compile_gs_copy_shader(LLVMTargetMachineRef tm,
 
LLVMBuildRetVoid(ctx.ac.builder);
 
-   ac_llvm_finalize_module(&ctx);
+   ac_llvm_finalize_module(&ctx, options);
 
ac_compile_llvm_module(tm, ctx.ac.module, binary, config, shader_info,
   MESA_SHADER_VERTEX, options);
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 76790a1904..a68e1d0254 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -571,6 +571,7 @@ shader_variant_create(struct radv_device *device,
options->dump_preoptir = options->dump_shader &&
 device->instance->debug_flags & RADV_DEBUG_PREOPTIR;
options->record_llvm_ir = device->keep_shader_info;
+   options->check_ir = device->instance->debug_flags & RADV_DEBUG_CHECKIR;
options->tess_offchip_block_dw_size = device->tess_offchip_block_dw_size;
options->address32_hi = device->physical_device->rad_info.address32_hi;
 
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index 05de188e3f..0473f3fa6a 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -120,6 +120,7 @@ struct radv_nir_compiler_options {
bool dump_shader;
bool dump_preoptir;
bool record_llvm_ir;
+   bool check_ir;
enum radeon_family family;
enum chip_class chip_class;
uint32_t tess_offchip_block_dw_size;
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
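
For readers unfamiliar with how a name such as "checkir" ends up as a bit
in debug_flags, here is a minimal standalone sketch of a table-driven
parser for a comma-separated option string. It is not Mesa's actual
parsing code; parse_debug_flags() and struct debug_option are
hypothetical, and only the option names and flag values are taken from
the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct debug_option {
    const char *name;
    uint64_t flag;
};

/* Subset of the option table touched by the patch; values from the diff. */
static const struct debug_option debug_options[] = {
    {"info",    0x4},
    {"errors",  0x8},
    {"startup", 0x10},
    {"checkir", 0x20},
    {NULL, 0}
};

/* Hypothetical parser: OR together the flag of every recognised
 * comma-separated name; unknown names are silently ignored. */
static uint64_t parse_debug_flags(const char *str)
{
    uint64_t flags = 0;

    assert(str != NULL);

    while (*str) {
        size_t len = strcspn(str, ",");

        for (const struct debug_option *o = debug_options; o->name; o++) {
            if (strlen(o->name) == len && !strncmp(o->name, str, len))
                flags |= o->flag;
        }

        str += len;
        if (*str == ',')
            str++;
    }

    return flags;
}
```

In the patch itself the resulting bit is only tested once, in
shader_variant_create(), and forwarded as options->check_ir so that the
verifier pass is added before the other passes.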


[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #11 from Samuel Pitoiset  ---
I can reproduce the issue with https://patchwork.freedesktop.org/patch/229503/.
Not sure why it doesn't crash for me... I will fix it.

Anyway, I highly recommend not using BUILD_SHARED_LIBS.



Re: [Mesa-dev] [PATCH 3/8] i965/bufmgr: Drop the BO_ALLOC_BUSY flag

2018-06-14 Thread Lionel Landwerlin

On 13/06/18 21:26, Jason Ekstrand wrote:

---
  src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 ++
  src/mesa/drivers/dri/i965/brw_bufmgr.h |  1 -
  2 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 58bb559fdee..e9d3daa5985 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,11 +448,6 @@ int
  brw_bo_busy(struct brw_bo *bo)
  {
 struct brw_bufmgr *bufmgr = bo->bufmgr;


I don't really understand this hunk.
It seems related to this patch and undoes what you added in patch 1.


-
-   /* If we know it's idle, don't bother with the kernel round trip */
-   if (bo->idle && !bo->external)
-  return false;
-
 struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
  
 int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);

@@ -506,20 +501,11 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
 struct bo_cache_bucket *bucket;
 bool alloc_from_cache;
 uint64_t bo_size;
-   bool busy = false;
 bool zeroed = false;
  
-   if (flags & BO_ALLOC_BUSY)
-  busy = true;
-
 if (flags & BO_ALLOC_ZEROED)
zeroed = true;
  
-   /* BUSY does doesn't really jive with ZEROED as we have to wait for it to
-* be idle before we can memset.  Just disallow that combination.
-*/
-   assert(!(busy && zeroed));
-
 /* Round the allocated size up to a power of two number of pages. */
 bucket = bucket_for_size(bufmgr, size);
  
@@ -539,29 +525,17 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,

  retry:
 alloc_from_cache = false;
 if (bucket != NULL && !list_empty(&bucket->head)) {
-  if (busy && !zeroed) {
- /* Allocate new render-target BOs from the tail (MRU)
-  * of the list, as it will likely be hot in the GPU
-  * cache and in the aperture for us.  If the caller
-  * asked us to zero the buffer, we don't want this
-  * because we are going to mmap it.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
- list_del(&bo->head);
+  /* For non-render-target BOs (where we're probably
+   * going to map it first thing in order to fill it
+   * with data), check if the last BO in the cache is
+   * unbusy, and only reuse in that case. Otherwise,
+   * allocating a new buffer is probably faster than
+   * waiting for the GPU to finish.
+   */
+  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+  if (!brw_bo_busy(bo)) {
   alloc_from_cache = true;
-  } else {
- /* For non-render-target BOs (where we're probably
-  * going to map it first thing in order to fill it
-  * with data), check if the last BO in the cache is
-  * unbusy, and only reuse in that case. Otherwise,
-  * allocating a new buffer is probably faster than
-  * waiting for the GPU to finish.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
- if (!brw_bo_busy(bo)) {
-alloc_from_cache = true;
-list_del(&bo->head);
- }
+ list_del(&bo->head);
}
  
if (alloc_from_cache) {

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 32fc7a553c9..d3b3aadc0db 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -195,7 +195,6 @@ struct brw_bo {
 bool cache_coherent;
  };
  
-#define BO_ALLOC_BUSY   (1<<0)

  #define BO_ALLOC_ZEROED (1<<1)
  
  /**



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
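
The reuse policy that remains after this patch (look at the BO at the
front of the bucket and take it only if it is idle, otherwise allocate a
fresh buffer) can be sketched in isolation as follows. This is a
simplified model, not the i965 code: struct fake_bo, struct fake_bucket
and bucket_try_reuse() are hypothetical, and the `busy` field stands in
for the result of brw_bo_busy().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified buffer object. */
struct fake_bo {
    bool busy;            /* stands in for the kernel busy query */
    struct fake_bo *next;
};

/* Singly linked cache bucket; the entry at the head is the one that
 * was returned to the cache the longest time ago. */
struct fake_bucket {
    struct fake_bo *head;
};

/* Return the BO at the head of the bucket if it is idle and unlink it.
 * Return NULL otherwise, so the caller allocates a new buffer instead
 * of waiting for the GPU. */
static struct fake_bo *bucket_try_reuse(struct fake_bucket *bucket)
{
    assert(bucket != NULL);

    struct fake_bo *bo = bucket->head;
    if (bo == NULL || bo->busy)
        return NULL;

    bucket->head = bo->next; /* unlink, like list_del() in the patch */
    bo->next = NULL;
    return bo;
}
```

Only the head is examined: if the oldest entry is still busy, the newer
ones are assumed to be busy as well.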


[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #12 from Giovanni ongaro  ---
I built it with dylib; it still crashes.


Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function

2018-06-14 Thread Chema Casanova
On 14/06/18 at 02:46, Jason Ekstrand wrote:
> On Wed, Jun 13, 2018 at 5:07 PM, Chema Casanova wrote:
> 
> On 13/06/18 22:46, Jason Ekstrand wrote:
> > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo wrote:
> >
> >     This new function takes care of shuffle/unshuffle components of a
> >     particular bit-size in components with a different bit-size.
> >
> >     If source type size is smaller than destination type size the operation
> >     needed is a component shuffle. The opposite case would be an unshuffle.
> >
> >     The operation allows to skip first_component number of components from
> >     the source.
> >
> >     Shuffle MOVs are retyped using integer types avoiding problems with
> >     denorms and float types. This allows to simplify uses of shuffle
> >     functions that are dealing with these retypes individually.
> >
> >     Now there is a new restriction so source and destination can not overlap
> >     anymore when calling this shuffle function. Following patches that
> >     migrate to use this new function will take care individually of
> >     avoiding source and destination overlaps.
> >     ---
> >      src/intel/compiler/brw_fs_nir.cpp | 92 +++
> >      1 file changed, 92 insertions(+)
> >
> >     diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> >     index 166da0aa6d7..1a9d3c41d1d 100644
> >     --- a/src/intel/compiler/brw_fs_nir.cpp
> >     +++ b/src/intel/compiler/brw_fs_nir.cpp
> >     @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const fs_builder &bld,
> >         }
> >      }
> >
> >     +/*
> >     + * This helper takes a source register and un/shuffles it into the destination
> >     + * register.
> >     + *
> >     + * If source type size is smaller than destination type size the operation
> >     + * needed is a component shuffle. The opposite case would be an unshuffle. If
> >     + * source/destination type size is equal a shuffle is done that would be
> >     + * equivalent to a simple MOV.
> >
> >
> > There's a sticky bit here if we want this to work with 64-bit types on
> > gen7 and earlier because we only have DF there and not Q so the
> > brw_reg_type_from_bit_size below doesn't work.  If we care about that
> > case (and I'm not convinced we do), it should be easy enough to add a
> > type_sz(src.type) == type_sz(dst.type) case which just does MOVs from
> > source to dest.
> 
> At this moment, the current uses of this function either read from 32-bit
> or write to 32-bit. But I think that for completeness it would be
> nice to have all cases covered. The option of doing the MOVs in the case
> of equality (that would be quite normal) saves us from doing the shuffle
> calculation for the simple case. So I'm going for it.
> 
> >     + *
> >     + * For example, if source is a 16-bit type and destination is 32-bit. A 3
> >     + * components .xyz 16-bit vector on SIMD8 would be.
> >     + *
> >     + *    |x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8|
> >     + *    |z1|z2|z3|z4|z5|z6|z7|z8|  |  |  |  |  |  |  |  |
> >     + *
> >     + * This helper will return the following 2 32-bit components with the 16-bit
> >     + * values shuffled:
> >     + *
> >     + *    |x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8|
> >     + *    |z1   |z2   |z3   |z4   |z5   |z6   |z7   |z8   |
> >     + *
> >     + * For unshuffle, the example would be the opposite, a 64-bit type source
> >     + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8
> >     + * would be:
> >     + *
> >     + *    | x1l   x1h | x2l   x2h | x3l   x3h | x4l   x4h |
> >     + *    | x5l   x5h | x6l   x6h | x7l   x7h | x8l   x8h |
> >     + *    | y1l   y1h | y2l   y2h | y3l   y3h | y4l   y4h |
> >     + *    | y5l   y5h | y6l   y6h | y7l   y7h | y8l   y8h |
> >     + *
> >     + * The returned result would be the following 4 32-bit components unshuffled:
> >     + *
> >     + *    | x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l |
> >     + *    | x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h |
> >     + *    | y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l |
> >     + *    | y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h |
> >     + *
> >     + * - Source and destination register must not be overlapped.
> >     + * - first_com
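
The 16-bit to 32-bit shuffle from the quoted comment can be modelled on
the CPU as below. This is only a sketch of the data layout, not the
fs_builder code from the patch: shuffle_16_to_32() is a hypothetical
name, registers are modelled as flat arrays with one SIMD8 row per
component, and placing the even source component in the low half of
each 32-bit lane (and zero-filling a missing odd component) is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIMD_WIDTH 8

/* Shuffle `count` 16-bit components, each SIMD_WIDTH lanes wide, into
 * (count + 1) / 2 32-bit components.
 *
 * src holds count * SIMD_WIDTH values, component-major.
 * dst holds ((count + 1) / 2) * SIMD_WIDTH values, component-major.
 * src and dst must not overlap. */
static void shuffle_16_to_32(const uint16_t *src, unsigned count,
                             uint32_t *dst)
{
    assert(src != NULL && dst != NULL);

    for (unsigned c = 0; c < count; c += 2) {
        for (unsigned lane = 0; lane < SIMD_WIDTH; lane++) {
            uint32_t lo = src[c * SIMD_WIDTH + lane];
            /* A trailing odd component has no partner: pad with 0. */
            uint32_t hi = (c + 1 < count) ?
                          src[(c + 1) * SIMD_WIDTH + lane] : 0;

            dst[(c / 2) * SIMD_WIDTH + lane] = lo | (hi << 16);
        }
    }
}
```

For the .xyz example in the comment, the first 32-bit component then
holds the x/y pairs and the second holds z with an unused upper half.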

Re: [Mesa-dev] [PATCH 3/8] i965/bufmgr: Drop the BO_ALLOC_BUSY flag

2018-06-14 Thread Lionel Landwerlin

On 14/06/18 14:01, Lionel Landwerlin wrote:

On 13/06/18 21:26, Jason Ekstrand wrote:

---
  src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 ++
  src/mesa/drivers/dri/i965/brw_bufmgr.h |  1 -
  2 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 58bb559fdee..e9d3daa5985 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,11 +448,6 @@ int
  brw_bo_busy(struct brw_bo *bo)
  {
 struct brw_bufmgr *bufmgr = bo->bufmgr;


I don't really understand this hunk.
It seems related to this patch and undoes what you added in patch 1.


I meant "unrelated".




-
-   /* If we know it's idle, don't bother with the kernel round trip */
-   if (bo->idle && !bo->external)
-  return false;
-
 struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
   int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
@@ -506,20 +501,11 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
 struct bo_cache_bucket *bucket;
 bool alloc_from_cache;
 uint64_t bo_size;
-   bool busy = false;
 bool zeroed = false;
  -   if (flags & BO_ALLOC_BUSY)
-  busy = true;
-
 if (flags & BO_ALLOC_ZEROED)
    zeroed = true;
  -   /* BUSY does doesn't really jive with ZEROED as we have to wait 
for it to

-    * be idle before we can memset.  Just disallow that combination.
-    */
-   assert(!(busy && zeroed));
-
 /* Round the allocated size up to a power of two number of 
pages. */

 bucket = bucket_for_size(bufmgr, size);
  @@ -539,29 +525,17 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
  retry:
 alloc_from_cache = false;
 if (bucket != NULL && !list_empty(&bucket->head)) {
-  if (busy && !zeroed) {
- /* Allocate new render-target BOs from the tail (MRU)
-  * of the list, as it will likely be hot in the GPU
-  * cache and in the aperture for us.  If the caller
-  * asked us to zero the buffer, we don't want this
-  * because we are going to mmap it.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
- list_del(&bo->head);
+  /* For non-render-target BOs (where we're probably
+   * going to map it first thing in order to fill it
+   * with data), check if the last BO in the cache is
+   * unbusy, and only reuse in that case. Otherwise,
+   * allocating a new buffer is probably faster than
+   * waiting for the GPU to finish.
+   */
+  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+  if (!brw_bo_busy(bo)) {
   alloc_from_cache = true;
-  } else {
- /* For non-render-target BOs (where we're probably
-  * going to map it first thing in order to fill it
-  * with data), check if the last BO in the cache is
-  * unbusy, and only reuse in that case. Otherwise,
-  * allocating a new buffer is probably faster than
-  * waiting for the GPU to finish.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
- if (!brw_bo_busy(bo)) {
-    alloc_from_cache = true;
-    list_del(&bo->head);
- }
+ list_del(&bo->head);
    }
      if (alloc_from_cache) {
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 32fc7a553c9..d3b3aadc0db 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -195,7 +195,6 @@ struct brw_bo {
 bool cache_coherent;
  };
  -#define BO_ALLOC_BUSY   (1<<0)
  #define BO_ALLOC_ZEROED (1<<1)
    /**
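
For anyone following the second hunk above: the code path that survives
the patch peeks at the first BO in the cache bucket and reuses it only if
it is idle, otherwise the caller allocates a fresh buffer. A minimal
sketch of that policy, assuming a toy singly linked list rather than
Mesa's list helpers (struct sketch_bo, struct sketch_bucket and
bucket_try_reuse are hypothetical names, not part of brw_bufmgr.c):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct brw_bo; only what the sketch needs. */
struct sketch_bo {
    bool busy;               /* stands in for the brw_bo_busy() query */
    struct sketch_bo *next;
};

/* Hypothetical stand-in for a cache bucket. */
struct sketch_bucket {
    struct sketch_bo *head;  /* least recently used BO first */
};

/* Returns the first cached BO if it is idle, unlinking it from the bucket.
 * Returns NULL if the bucket is empty or the first BO is still busy, in
 * which case the caller is expected to allocate a new buffer instead of
 * waiting for the GPU. */
static struct sketch_bo *bucket_try_reuse(struct sketch_bucket *bucket)
{
    struct sketch_bo *bo = bucket->head;

    if (bo == NULL || bo->busy)
        return NULL;

    bucket->head = bo->next;
    bo->next = NULL;
    return bo;
}
```

Only the head of the bucket is checked; a busy head means "allocate new"
even if an idle BO sits further down the list, which keeps the check to a
single busy query.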





[Mesa-dev] [PATCH] configure: use compliant grep regex checks

2018-06-14 Thread Emil Velikov
From: Emil Velikov 

The current `grep "foo\|bar"' trips on some grep implementations, like
the FreeBSD one. Instead use `egrep "foo|bar"' as suggested by Stefan.

Cc: Stefan Esser 
Reported-by: Stefan Esser 
Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228673
Fixes: 1914c814a6c ("configure: error out if building OMX w/o supported platform")
Fixes: 63e11ac2b5c ("configure: error out if building VA w/o supported platform")
Signed-off-by: Emil Velikov 
---
 configure.ac | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 7c19c8f99d7..958e1e10d77 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2235,13 +2235,13 @@ else
 have_vdpau_platform=no
 fi
 
-if echo $platforms | grep -q "x11\|drm"; then
+if echo $platforms | egrep -q "x11|drm"; then
 have_omx_platform=yes
 else
 have_omx_platform=no
 fi
 
-if echo $platforms | grep -q "x11\|drm\|wayland"; then
+if echo $platforms | egrep -q "x11|drm|wayland"; then
 have_va_platform=yes
 else
 have_va_platform=no
-- 
2.16.0
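
The portability point in the commit message can also be seen from C: in
POSIX basic regular expressions the sequence `\|` has no defined meaning
(GNU grep accepts it as an extension), while in extended regular
expressions a bare `|` is alternation everywhere. A minimal sketch using
the POSIX <regex.h> API; ere_matches and have_va_platform are
hypothetical helper names chosen for this example:

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `text` matches `pattern`, compiled as a POSIX extended
 * regular expression (the dialect egrep / grep -E use). */
static bool ere_matches(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED makes an unescaped '|' mean alternation. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    bool matched = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}

/* Hypothetical helper mirroring the VA platform check in the patch. */
static bool have_va_platform(const char *platforms)
{
    return ere_matches("x11|drm|wayland", platforms);
}
```

The sketch deliberately does not compile the `x11\|drm` form as a basic
regex, since what that does depends on the libc in use.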



Re: [Mesa-dev] [PATCH] radv: add RADV_DEBUG=checkir

2018-06-14 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Jun 14, 2018 at 2:28 PM, Samuel Pitoiset
 wrote:
> This allows running the LLVM verifier pass.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.h   |  1 +
>  src/amd/vulkan/radv_device.c  |  1 +
>  src/amd/vulkan/radv_nir_to_llvm.c | 10 +++---
>  src/amd/vulkan/radv_shader.c  |  1 +
>  src/amd/vulkan/radv_shader.h  |  1 +
>  5 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
> index 762b338219..1e71349509 100644
> --- a/src/amd/vulkan/radv_debug.h
> +++ b/src/amd/vulkan/radv_debug.h
> @@ -48,6 +48,7 @@ enum {
> RADV_DEBUG_INFO  = 0x4,
> RADV_DEBUG_ERRORS= 0x8,
> RADV_DEBUG_STARTUP   = 0x10,
> +   RADV_DEBUG_CHECKIR   = 0x20,
>  };
>
>  enum {
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 5936b43093..1ffbe75ef6 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -410,6 +410,7 @@ static const struct debug_control radv_debug_options[] = {
> {"info", RADV_DEBUG_INFO},
> {"errors", RADV_DEBUG_ERRORS},
> {"startup", RADV_DEBUG_STARTUP},
> +   {"checkir", RADV_DEBUG_CHECKIR},
> {NULL, 0}
>  };
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index a56f017e25..5168c9d554 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2967,13 +2967,17 @@ handle_shader_outputs_post(struct ac_shader_abi *abi, 
> unsigned max_outputs,
> }
>  }
>
> -static void ac_llvm_finalize_module(struct radv_shader_context *ctx)
> +static void ac_llvm_finalize_module(struct radv_shader_context *ctx,
> +   const struct radv_nir_compiler_options 
> *options)
>  {
> LLVMPassManagerRef passmgr;
> /* Create the pass manager */
> passmgr = LLVMCreateFunctionPassManagerForModule(
> ctx->ac.module);
>
> +   if (options->check_ir)
> +   LLVMAddVerifierPass(passmgr);
> +
> /* This pass should eliminate all the load and store instructions */
> LLVMAddPromoteMemoryToRegisterPass(passmgr);
>
> @@ -3299,7 +3303,7 @@ LLVMModuleRef 
> ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
> if (options->dump_preoptir)
> ac_dump_module(ctx.ac.module);
>
> -   ac_llvm_finalize_module(&ctx);
> +   ac_llvm_finalize_module(&ctx, options);
>
> if (shader_count == 1)
> ac_nir_eliminate_const_vs_outputs(&ctx);
> @@ -3617,7 +3621,7 @@ radv_compile_gs_copy_shader(LLVMTargetMachineRef tm,
>
> LLVMBuildRetVoid(ctx.ac.builder);
>
> -   ac_llvm_finalize_module(&ctx);
> +   ac_llvm_finalize_module(&ctx, options);
>
> ac_compile_llvm_module(tm, ctx.ac.module, binary, config, shader_info,
>MESA_SHADER_VERTEX, options);
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 76790a1904..a68e1d0254 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -571,6 +571,7 @@ shader_variant_create(struct radv_device *device,
> options->dump_preoptir = options->dump_shader &&
>  device->instance->debug_flags & 
> RADV_DEBUG_PREOPTIR;
> options->record_llvm_ir = device->keep_shader_info;
> +   options->check_ir = device->instance->debug_flags & 
> RADV_DEBUG_CHECKIR;
> options->tess_offchip_block_dw_size = 
> device->tess_offchip_block_dw_size;
> options->address32_hi = 
> device->physical_device->rad_info.address32_hi;
>
> diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
> index 05de188e3f..0473f3fa6a 100644
> --- a/src/amd/vulkan/radv_shader.h
> +++ b/src/amd/vulkan/radv_shader.h
> @@ -120,6 +120,7 @@ struct radv_nir_compiler_options {
> bool dump_shader;
> bool dump_preoptir;
> bool record_llvm_ir;
> +   bool check_ir;
> enum radeon_family family;
> enum chip_class chip_class;
> uint32_t tess_offchip_block_dw_size;
> --
> 2.17.1
>
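
For readers wondering how RADV_DEBUG=checkir turns into a bit in
debug_flags: the quoted patch adds a name/flag pair to a table that is
matched against a comma-separated string. A minimal sketch of that kind
of lookup follows. It is not Mesa's actual parser; the struct, the table
and parse_debug_flags are hypothetical, and the two flag values are
simply the ones shown in the quoted enum:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name/flag pair, modelled on the table in the patch. */
struct sketch_debug_control {
    const char *name;
    uint64_t flag;
};

enum {
    SKETCH_DEBUG_STARTUP = 0x10,  /* value as shown in the quoted enum */
    SKETCH_DEBUG_CHECKIR = 0x20,  /* value as shown in the quoted enum */
};

static const struct sketch_debug_control sketch_options[] = {
    {"startup", SKETCH_DEBUG_STARTUP},
    {"checkir", SKETCH_DEBUG_CHECKIR},
    {NULL, 0}
};

/* ORs together the flags of every comma-separated token in `str` that
 * exactly matches a table entry. Unknown tokens are ignored. */
static uint64_t parse_debug_flags(const char *str,
                                  const struct sketch_debug_control *ctl)
{
    uint64_t flags = 0;

    if (str == NULL)
        return 0;

    while (*str != '\0') {
        size_t len = strcspn(str, ",");

        for (const struct sketch_debug_control *c = ctl; c->name; c++) {
            if (strlen(c->name) == len && strncmp(c->name, str, len) == 0)
                flags |= c->flag;
        }

        str += len;
        if (*str == ',')
            str++;
    }

    return flags;
}
```

With the flag set, the patch's options->check_ir becomes true and the
verifier pass is added ahead of the other passes.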


[Mesa-dev] [PATCH] [RFC] i965/blit: bump some limits to 64k

2018-06-14 Thread Martin Peres
This fixes screenshots using 8k+ wide display setups in modesetting.

Chris Wilson even recommended the changes in intel_mipmap_tree.c
should read 131072 instead of 65535, but I for sure got confused by
his explanation.

In any case, I would like to use this RFC patch as a forum to discuss
why the fallback path is broken[1], and as to what should be the
limits for HW-accelerated blits now that we got rid of the blitter
usage on recent platforms.

Tested-by: Martin Peres  # HSW
Cc: Chris Wilson 

[1] https://fs.mupuf.org/corruption_8k%2B.png
---
 src/mesa/drivers/dri/i965/intel_blit.c| 2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
index 90784c5b1958..458f8bd42857 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -403,7 +403,7 @@ emit_miptree_blit(struct brw_context *brw,
 * for linear surfaces and DWords for tiled surfaces.  So the maximum
 * pitch is 32k linear and 128k tiled.
 */
-   if (blt_pitch(src_mt) >= 32768 || blt_pitch(dst_mt) >= 32768) {
+   if (blt_pitch(src_mt) >= 65536 || blt_pitch(dst_mt) >= 65536) {
   perf_debug("Falling back due to >= 32k/128k pitch\n");
   return false;
}
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 6b89bf6848af..7347ea8b99d8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -523,7 +523,7 @@ need_to_retile_as_linear(struct brw_context *brw, unsigned row_pitch,
if (row_pitch < 64)
   return true;
 
-   if (ALIGN(row_pitch, 512) >= 32768) {
+   if (ALIGN(row_pitch, 512) >= 65536) {
   perf_debug("row pitch %u too large to blit, falling back to untiled",
  row_pitch);
   return true;
@@ -3583,7 +3583,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
unsigned int level, unsigned int slice)
 {
/* See intel_miptree_blit() for details on the 32k pitch limit. */
-   if (mt->surf.row_pitch >= 32768)
+   if (mt->surf.row_pitch >= 65536)
   return false;
 
return true;
-- 
2.17.1



[Mesa-dev] [PATCH] radv: fix emitting the TCS regs on GFX9

2018-06-14 Thread Samuel Pitoiset
The primitive ID is NULL if the vertex shader is LS. This
generates an invalid select instruction which crashes
because one operand is NULL.

This fixes crashes in The Long Journey Home, Quantum Break
and Just Cause 3 with DXVK.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756
CC: 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index 5168c9d554..f6de71176f 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -3107,9 +3107,16 @@ static void ac_nir_fixup_ls_hs_input_vgprs(struct radv_shader_context *ctx)
 LLVMValueRef hs_empty = LLVMBuildICmp(ctx->ac.builder, LLVMIntEQ, count,
   ctx->ac.i32_0, "");
 ctx->abi.instance_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, ctx->rel_auto_id, ctx->abi.instance_id, "");
-   ctx->vs_prim_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, ctx->abi.vertex_id, ctx->vs_prim_id, "");
-   ctx->rel_auto_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, ctx->abi.tcs_rel_ids, ctx->rel_auto_id, "");
 ctx->abi.vertex_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, ctx->abi.tcs_patch_id, ctx->abi.vertex_id, "");
+   if (ctx->options->key.vs.as_ls) {
+   ctx->rel_auto_id =
+   LLVMBuildSelect(ctx->ac.builder, hs_empty,
+   ctx->abi.tcs_rel_ids, ctx->rel_auto_id, "");
+   } else {
+   ctx->vs_prim_id =
+   LLVMBuildSelect(ctx->ac.builder, hs_empty,
+   ctx->abi.vertex_id, ctx->vs_prim_id, "");
+   }
 }
 
 static void prepare_gs_input_vgprs(struct radv_shader_context *ctx)
-- 
2.17.1



[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #13 from Samuel Pitoiset  ---
This should be fixed with https://patchwork.freedesktop.org/patch/229508/

Can you confirm? Thanks!



[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

--- Comment #7 from xinghua  ---
(In reply to Tapani Pälli from comment #6)
> Fix proposal sent here:
> https://lists.freedesktop.org/archives/mesa-dev/2018-June/197678.html

Hi Tapani, thank you for your patch. It seems that this patch could resolve
the bug; I will double-check it tomorrow.
Will you merge this patch into Mesa master?



Re: [Mesa-dev] [PATCH] radv: fix emitting the TCS regs on GFX9

2018-06-14 Thread Bas Nieuwenhuizen
On Thu, Jun 14, 2018 at 3:23 PM, Samuel Pitoiset
 wrote:
> The primitive ID is NULL if the vertex shader is LS. This
> generates an invalid select instruction which crashes
> because one operand is NULL.
>
> This fixes crashes in The Long Journey Home, Quantum Break
> and Just Cause 3 with DXVK.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756
> CC: 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 5168c9d554..f6de71176f 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -3107,9 +3107,16 @@ static void ac_nir_fixup_ls_hs_input_vgprs(struct 
> radv_shader_context *ctx)
> LLVMValueRef hs_empty = LLVMBuildICmp(ctx->ac.builder, LLVMIntEQ, 
> count,
>   ctx->ac.i32_0, "");
> ctx->abi.instance_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
> ctx->rel_auto_id, ctx->abi.instance_id, "");
> -   ctx->vs_prim_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
> ctx->abi.vertex_id, ctx->vs_prim_id, "");
> -   ctx->rel_auto_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
> ctx->abi.tcs_rel_ids, ctx->rel_auto_id, "");
> ctx->abi.vertex_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
> ctx->abi.tcs_patch_id, ctx->abi.vertex_id, "");
> +   if (ctx->options->key.vs.as_ls) {

Isn't vs_as_ls always true here?

> +   ctx->rel_auto_id =
> +   LLVMBuildSelect(ctx->ac.builder, hs_empty,
> +   ctx->abi.tcs_rel_ids, 
> ctx->rel_auto_id, "");
> +   } else {
> +   ctx->vs_prim_id =
> +   LLVMBuildSelect(ctx->ac.builder, hs_empty,
> +   ctx->abi.vertex_id, ctx->vs_prim_id, 
> "");
> +   }
>  }
>
>  static void prepare_gs_input_vgprs(struct radv_shader_context *ctx)
> --
> 2.17.1
>


[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

--- Comment #8 from Tapani Pälli  ---
(In reply to xinghua from comment #7)
> (In reply to Tapani Pälli from comment #6)
> > Fix proposal sent here:
> > https://lists.freedesktop.org/archives/mesa-dev/2018-June/197678.html
> 
> Hi Tapani, thank you for your patch. It seems that this patch could resolve
> the bug; I will double-check it tomorrow.
> Will you merge this patch into Mesa master?

If others (reviewers) agree that this is correct, then yes.



[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.0 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106644

--- Comment #11 from erhar...@mailbox.org ---
Built 18.0.5 and 17.3.9, same test failures here.



[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #14 from Giovanni ongaro  ---
Thank you for the quick response.
I can confirm that all 3 games now work on Vega 64 under DXVK
with this patch applied.
You did a very good job.



[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

--- Comment #15 from Giovanni ongaro  ---
it also fixes NFS Payback



[Mesa-dev] [PATCH] st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit()

2018-06-14 Thread Brian Paul
To silence compiler warning about unhandled switch cases.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b321112..673c0f6 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3990,6 +3990,8 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
case ir_intrinsic_generic_atomic_max:
case ir_intrinsic_generic_atomic_exchange:
case ir_intrinsic_generic_atomic_comp_swap:
+   case ir_intrinsic_begin_invocation_interlock:
+   case ir_intrinsic_end_invocation_interlock:
   unreachable("Invalid intrinsic");
}
 }
-- 
2.7.4
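
As background on the warning being silenced: when a switch over an enum
has no default label, GCC and Clang can report every enumerator that
lacks a case (-Wswitch, which -Wall enables in GCC), so adding the new
enumerators to the list is the usual fix. A minimal C sketch under that
assumption; the enum and function are hypothetical stand-ins, and unlike
the patch, which routes the new cases to unreachable(), the sketch just
returns false for them:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ir_intrinsic_* enum; names are illustrative. */
enum sketch_intrinsic {
    SKETCH_ATOMIC_ADD,
    SKETCH_ATOMIC_MAX,
    SKETCH_BEGIN_INTERLOCK,
    SKETCH_END_INTERLOCK,
};

/* Returns true for intrinsics this hypothetical visitor lowers itself. */
static bool sketch_visitor_handles(enum sketch_intrinsic id)
{
    /* No default label on purpose: if a new enumerator is added later and
     * not listed here, -Wswitch flags the omission at compile time. */
    switch (id) {
    case SKETCH_ATOMIC_ADD:
    case SKETCH_ATOMIC_MAX:
        return true;
    case SKETCH_BEGIN_INTERLOCK:
    case SKETCH_END_INTERLOCK:
        return false;
    }

    return false; /* only reached for values outside the enum */
}
```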



Re: [Mesa-dev] [PATCH mesa 1/9] vulkan: Add KHR_display extension using DRM [v8]

2018-06-14 Thread Keith Packard
Jason Ekstrand  writes:

> I'm trusting that not much changed other than what was explicitly called
> out.  I didn't want to re-read in *that* much detail again. :-)

You are correct, all of the changes from the previous patch were listed
in the commit message.

> Reviewed-by: Jason Ekstrand 

Thanks much!

-- 
-keith




Re: [Mesa-dev] [PATCH] st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit()

2018-06-14 Thread Charmaine Lee

Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Thursday, June 14, 2018 8:13:01 AM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee; Neha Bhende
Subject: [PATCH] st/mesa: add missing switch cases in 
glsl_to_tgsi_visitor::visit()

To silence compiler warning about unhandled switch cases.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b321112..673c0f6 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3990,6 +3990,8 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
case ir_intrinsic_generic_atomic_max:
case ir_intrinsic_generic_atomic_exchange:
case ir_intrinsic_generic_atomic_comp_swap:
+   case ir_intrinsic_begin_invocation_interlock:
+   case ir_intrinsic_end_invocation_interlock:
   unreachable("Invalid intrinsic");
}
 }
--
2.7.4



Re: [Mesa-dev] [PATCH] configure: use compliant grep regex checks

2018-06-14 Thread Matt Turner
Reviewed-by: Matt Turner 


[Mesa-dev] [ANNOUNCE] Mesa 18.1.2 release candidate

2018-06-14 Thread dylan
Hello list,

The candidate for the Mesa 18.1.2 is now available. Currently we have:
 - 42 queued
 - 6 nominated (outstanding)
 - and 0 rejected patches

Notable changes in this release:
- numerous fixes for radv
- libatomic checks for meson, as well as fixing coverage for less common (not
  arm or x86) platforms
- lots of common Intel fixes
- GLX fixes
- tarball fixes for android
- meson assembly fixes for x86 when doing an x86 -> x86 cross compile

Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval
--------------------------------

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 18.1.2 this Friday (June 13th), around or shortly after 10
AM PDT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.


Mesa stable queue
-----------------

Nominated (6)
=============

Bas Nieuwenhuizen (1):
  radv: Fix output for sparse MRTs.

Dave Airlie (1):
  glsl: allow standalone semicolons outside main()

Marek Olšák (2):
  radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers
  ac/gpu_info: report real total memory sizes

Samuel Pitoiset (2):
  radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
  radv: update the ZRANGE_PRECISION value for the TC-compat bug


Queued (42)
===========

Alex Smith (4):
  radv: Consolidate GFX9 merged shader lookup logic
  radv: Handle GFX9 merged shaders in radv_flush_constants()
  radeonsi: Fix crash on shaders using MSAA image load/store
  radv: Set active_stages the same whether or not shaders were cached

Andrew Galante (2):
  meson: Test for __atomic_add_fetch in atomic checks
  configure.ac: Test for __atomic_add_fetch in atomic checks

Bas Nieuwenhuizen (1):
  radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.

Cameron Kumar (1):
  vulkan/wsi: Destroy swapchain images after terminating FIFO queues

Dylan Baker (5):
  docs/relnotes: Add sha256 sums for mesa 18.1.1
  cherry-ignore: add commits not to pull
  cherry-ignore: Add patches from Jason that he rebased on 18.1
  meson: work around gentoo applying -m32 to host compiler in cross builds
  cherry-ignore: Add another patch

Eric Engestrom (3):
  autotools: add missing android file to package
  configure: radv depends on mako
  i965: fix resource leak

Jason Ekstrand (10):
  intel/eu: Add some brw_get_default_ helpers
  intel/eu: Copy fields manually in brw_next_insn
  intel/eu: Set flag [sub]register number differently for 3src
  intel/blorp: Don't vertex fetch directly from clear values
  intel/isl: Add bounds-checking assertions in isl_format_get_layout
  intel/isl: Add bounds-checking assertions for the format_info table
  i965/screen: Refactor query_dma_buf_formats
  i965/screen: Use RGBA non-sRGB formats for images
  anv: Set fence/semaphore types to NONE in impl_cleanup
  i965/screen: Return false for unsupported formats in query_modifiers

Jordan Justen (1):
  mesa/program_binary: add implicit UseProgram after successful ProgramBinary

Juan A. Suarez Romero (1):
  glsl: Add ir_binop_vector_extract in NIR

Kenneth Graunke (2):
  i965: Fix batch-last mode to properly swap BOs.
  anv: Disable __gen_validate_value if NDEBUG is set.

Marek Olšák (1):
  r300g/swtcl: make pipe_context uploaders use malloc'd memory as before

Matt Turner (1):
  meson: Fix -latomic check

Michel Dänzer (1):
  glx: Fix number of property values to read in glXImportContextEXT

Nicolas Boichat (1):
  configure.ac/meson.build: Fix -latomic test

Philip Rebohle (1):
  radv: Use correct color format for fast clears

Samuel Pitoiset (3):
  radv: fix a GPU hang when MRTs are sparse
  radv: fix missing ZRANGE_PRECISION(1) for GFX9+
  radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold

Scott D Phillips (1):
  intel/tools: add intel_sanitize_gpu to EXTRA_DIST

Thomas Petazzoni (1):
  configure.ac: rework -latomic check

Timothy Arceri (2):
  ac: fix possible truncation of intrinsic name
  radeonsi: fix possible truncation on renderer string




Re: [Mesa-dev] [ANNOUNCE] Mesa 18.1.2 release candidate

2018-06-14 Thread Bas Nieuwenhuizen
On Thu, Jun 14, 2018 at 6:13 PM,   wrote:
> Hello list,
>
> The candidate for the Mesa 18.1.2 is now available. Currently we have:
>  - 42 queued
>  - 6 nominated (outstanding)
>  - and 0 rejected patches
>
> Notable changes in this release:
> - numerous fixes for radv
> - libatomic checks for meson, as well as fixing coverage for less common (not
>   arm or x86) platforms
> - lots of common Intel fixes
> - GLX fixes
> - tarball fixes for android
> - meson assembly fixes for x86 when doing an x86 -> x86 cross compile
>
> Take a look at section "Mesa stable queue" for more information.
>
>
> Testing reports/general approval
> --------------------------------
> Any testing reports (or general approval of the state of the branch) will be
> greatly appreciated.
>
> The plan is to have 18.1.2 this Friday (June 13th), around or shortly after 10
> AM PDT.

June 15th?

>
> If you have any questions or suggestions - be that about the current patch
> queue or otherwise, please go ahead.
>
>
> Mesa stable queue
> -----------------
>
> Nominated (6)
> =============
>
> Bas Nieuwenhuizen (1):
>   radv: Fix output for sparse MRTs.
>
> Dave Airlie (1):
>   glsl: allow standalone semicolons outside main()
>
> Marek Olšák (2):
>   radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers
>   ac/gpu_info: report real total memory sizes
>
> Samuel Pitoiset (2):
>   radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
>   radv: update the ZRANGE_PRECISION value for the TC-compat bug
>
>
> Queued (42)
> ===========
>
> Alex Smith (4):
>   radv: Consolidate GFX9 merged shader lookup logic
>   radv: Handle GFX9 merged shaders in radv_flush_constants()
>   radeonsi: Fix crash on shaders using MSAA image load/store
>   radv: Set active_stages the same whether or not shaders were cached
>
> Andrew Galante (2):
>   meson: Test for __atomic_add_fetch in atomic checks
>   configure.ac: Test for __atomic_add_fetch in atomic checks
>
> Bas Nieuwenhuizen (1):
>   radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.
>
> Cameron Kumar (1):
>   vulkan/wsi: Destroy swapchain images after terminating FIFO queues
>
> Dylan Baker (5):
>   docs/relnotes: Add sha256 sums for mesa 18.1.1
>   cherry-ignore: add commits not to pull
>   cherry-ignore: Add patches from Jason that he rebased on 18.1
>   meson: work around gentoo applying -m32 to host compiler in cross builds
>   cherry-ignore: Add another patch
>
> Eric Engestrom (3):
>   autotools: add missing android file to package
>   configure: radv depends on mako
>   i965: fix resource leak
>
> Jason Ekstrand (10):
>   intel/eu: Add some brw_get_default_ helpers
>   intel/eu: Copy fields manually in brw_next_insn
>   intel/eu: Set flag [sub]register number differently for 3src
>   intel/blorp: Don't vertex fetch directly from clear values
>   intel/isl: Add bounds-checking assertions in isl_format_get_layout
>   intel/isl: Add bounds-checking assertions for the format_info table
>   i965/screen: Refactor query_dma_buf_formats
>   i965/screen: Use RGBA non-sRGB formats for images
>   anv: Set fence/semaphore types to NONE in impl_cleanup
>   i965/screen: Return false for unsupported formats in query_modifiers
>
> Jordan Justen (1):
>   mesa/program_binary: add implicit UseProgram after successful ProgramBinary
>
> Juan A. Suarez Romero (1):
>   glsl: Add ir_binop_vector_extract in NIR
>
> Kenneth Graunke (2):
>   i965: Fix batch-last mode to properly swap BOs.
>   anv: Disable __gen_validate_value if NDEBUG is set.
>
> Marek Olšák (1):
>   r300g/swtcl: make pipe_context uploaders use malloc'd memory as before
>
> Matt Turner (1):
>   meson: Fix -latomic check
>
> Michel Dänzer (1):
>   glx: Fix number of property values to read in glXImportContextEXT
>
> Nicolas Boichat (1):
>   configure.ac/meson.build: Fix -latomic test
>
> Philip Rebohle (1):
>   radv: Use correct color format for fast clears
>
> Samuel Pitoiset (3):
>   radv: fix a GPU hang when MRTs are sparse
>   radv: fix missing ZRANGE_PRECISION(1) for GFX9+
>   radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold
>
> Scott D Phillips (1):
>   intel/tools: add intel_sanitize_gpu to EXTRA_DIST
>
> Thomas Petazzoni (1):
>   configure.ac: rework -latomic check
>
> Timothy Arceri (2):
>   ac: fix possible truncation of intrinsic name
>   radeonsi: fix possible truncation on renderer string
>


Re: [Mesa-dev] [PATCH] [RFC] i965/blit: bump some limits to 64k

2018-06-14 Thread Nanley Chery
On Thu, Jun 14, 2018 at 04:18:30PM +0300, Martin Peres wrote:
> This fixes screenshots using 8k+ wide display setups in modesetting.
> 
> Chris Wilson even recommended the changes in intel_mipmap_tree.c
> should read 131072 instead of 65535, but I for sure got confused by
> his explanation.
> 
> In any case, I would like to use this RFC patch as a forum to discuss
> why the fallback path is broken[1], and as to what should be the
> limits for HW-accelerated blits now that we got rid of the blitter
> usage on recent platforms.
> 

Hi,

My understanding is that the fallback path is broken because we silently
ignore miptree_create_for_bo's request for a tiled miptree. This results
in some parts of mesa treating the surface as tiled and other parts
treating the surface as linear.

I couldn't come up with a piglit test for this when I was working on a
fix. Please let me know if you can think of any.

I think what the limits should be depends on which mesa branch you're
working off of.

* On the master branch of mesa, which has some commits which reduce the
  dependence on the BLT engine, we can remove these limits by using BLORP.
  As far as I can tell, BLORP can handle images as wide as the surface
  pitch limit in the RENDER_SURFACE_STATE packet will allow.

  I sent out a series [a] a couple weeks ago that removes the limits
  imposed by the hardware blitter.

* On the stable branch however, we can modify some incorrect code to set
  the correct BLT limits (as Chris has suggested). The BLT engine's pitch
  field is a signed 16-bit integer, whose unit changes depending on the
  tiling of the surface. For linear surfaces, it's in units of bytes and
  for non-linear surfaces, it's in units of dwords. This translates to a
  maximum pitch of 2^15-1 bytes or (2^15-1) * 4 bytes respectively.
  
  I made a branch [b] which does this already, but I think my rebasing +
  testing strategy for stable branches on the CI might be incorrect.

[a] https://patchwork.freedesktop.org/series/43971/
[b] https://cgit.freedesktop.org/~nchery/mesa/log/?h=wip/stable/stop-retiling
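
To make the arithmetic in the second bullet concrete, here is a minimal
sketch of the limits as described above (a signed 16-bit pitch field,
counted in bytes for linear surfaces and in dwords for tiled ones). The
function names are hypothetical and this is not the code from either
branch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value a signed 16-bit pitch field can hold: 2^15 - 1 = 32767. */
#define SKETCH_BLT_PITCH_FIELD_MAX INT16_MAX

/* Maximum surface pitch in bytes that the field can express. For tiled
 * surfaces the field counts dwords, so the byte limit is four times larger. */
static uint32_t sketch_blt_max_pitch_bytes(bool tiled)
{
    uint32_t field_max = (uint32_t)SKETCH_BLT_PITCH_FIELD_MAX;

    return tiled ? field_max * 4 : field_max;
}

/* Returns true if a surface with this pitch fits the blitter's pitch field. */
static bool sketch_blt_pitch_ok(uint32_t pitch_bytes, bool tiled)
{
    return pitch_bytes <= sketch_blt_max_pitch_bytes(tiled);
}
```

Under these assumptions a 65535-byte pitch fits for a tiled surface but
not for a linear one, which is why a single 65536 cutoff applied to both
cases is too permissive for linear miptrees.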

> Tested-by: Martin Peres  # HSW
> Cc: Chris Wilson 
> 
> [1] https://fs.mupuf.org/corruption_8k%2B.png
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c| 2 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
> b/src/mesa/drivers/dri/i965/intel_blit.c
> index 90784c5b1958..458f8bd42857 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -403,7 +403,7 @@ emit_miptree_blit(struct brw_context *brw,
>  * for linear surfaces and DWords for tiled surfaces.  So the maximum
>  * pitch is 32k linear and 128k tiled.
>  */
> -   if (blt_pitch(src_mt) >= 32768 || blt_pitch(dst_mt) >= 32768) {
> +   if (blt_pitch(src_mt) >= 65536 || blt_pitch(dst_mt) >= 65536) {

This is too large for linear miptrees.

>perf_debug("Falling back due to >= 32k/128k pitch\n");
>return false;
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 6b89bf6848af..7347ea8b99d8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -523,7 +523,7 @@ need_to_retile_as_linear(struct brw_context *brw, 
> unsigned row_pitch,
> if (row_pitch < 64)
>return true;
>  
> -   if (ALIGN(row_pitch, 512) >= 32768) {
> +   if (ALIGN(row_pitch, 512) >= 65536) {
>perf_debug("row pitch %u too large to blit, falling back to untiled",
>   row_pitch);
>return true;
> @@ -3583,7 +3583,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
> unsigned int level, unsigned int slice)
>  {
> /* See intel_miptree_blit() for details on the 32k pitch limit. */
> -   if (mt->surf.row_pitch >= 32768)
> +   if (mt->surf.row_pitch >= 65536)

This is also too large for linear miptrees.

-Nanley

>  
> return true;
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/9] intel/batch-decoder: handle non-contiguous binding table / surface state

2018-06-14 Thread Lionel Landwerlin
From: Scott D Phillips 

Reviewed-by: Lionel Landwerlin 
---
 src/intel/common/gen_batch_decoder.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index 3852f32de36..2b6978da92d 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -236,20 +236,30 @@ dump_binding_table(struct gen_batch_decode_ctx *ctx, 
uint32_t offset, int count)
   return;
}
 
+   struct gen_batch_decode_bo bo = ctx->surface_base;
const uint32_t *pointers = ctx->surface_base.map + offset;
for (int i = 0; i < count; i++) {
   if (pointers[i] == 0)
  continue;
 
-  if (pointers[i] % 32 != 0 ||
-  (pointers[i] + strct->dw_length * 4) >= ctx->surface_base.size) {
+  if (pointers[i] % 32 != 0) {
+ fprintf(ctx->fp, "pointer %u: %08x <not valid>\n", i, pointers[i]);
+ continue;
+  }
+
+  uint64_t addr = ctx->surface_base.addr + pointers[i];
+  uint32_t size = strct->dw_length * 4;
+
+  if (addr < bo.addr || addr + size >= bo.addr + bo.size)
+ bo = ctx->get_bo(ctx->user_data, addr);
+
+  if (addr < bo.addr || addr + size >= bo.addr + bo.size) {
  fprintf(ctx->fp, "pointer %u: %08x <not valid>\n", i, pointers[i]);
  continue;
   }
 
   fprintf(ctx->fp, "pointer %u: %08x\n", i, pointers[i]);
-  ctx_print_group(ctx, strct, ctx->surface_base.addr + pointers[i],
-  ctx->surface_base.map + pointers[i]);
+  ctx_print_group(ctx, strct, addr, bo.map + (addr - bo.addr));
}
 }
 
-- 
2.17.1



[Mesa-dev] [PATCH 0/9] intel: aubinator: handle ppgtt & softpin

2018-06-14 Thread Lionel Landwerlin
Hi all,

This series is based on what Scott did earlier this year to handle
aubs with ppgtt. This has the nice side effect of also fixing decoding
with the recent softpin changes that allocate virtual addresses from
the top of the address space. Because we didn't have more than 1Tb of
GTT mapping, we just couldn't deal with those mappings.

This series replaces what I sent earlier in:
https://patchwork.freedesktop.org/series/44535/
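
As a rough illustration of why a flat 1Tb mapping cannot hold softpinned
buffers, here is a small sketch. The names and the 48-bit masking are
assumptions made for the example (this is not the aubinator code): it strips
an address to 48 bits and checks whether a buffer fits in a 1Tb window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the flat window the old approach mmapped (assumed: 1 << 40). */
#define FLAT_GTT_SIZE (1ull << 40)

/* Hypothetical helper: drop the top 16 bits of a canonical address. */
static uint64_t
strip_to_48b(uint64_t addr)
{
   return addr & ((1ull << 48) - 1);
}

/* Hypothetical check: does [addr, addr + size) fit in the flat window? */
static bool
fits_flat_gtt(uint64_t addr, uint64_t size)
{
   assert(size > 0);
   addr = strip_to_48b(addr);
   /* Written as a subtraction so addr + size cannot overflow. */
   return addr < FLAT_GTT_SIZE && size <= FLAT_GTT_SIZE - addr;
}
```

A buffer placed near the top of a 48-bit address space lands far beyond
1 << 40, so it can only be handled by tracking mappings individually.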

Cheers,

Jason Ekstrand (1):
  util: rb-tree: A simple, invasive, red-black tree

Lionel Landwerlin (6):
  intel: batch-decoder: don't asks for constant BO until decoding
  intel: batch-decoder: add missing return line
  intel: aubinator: move handle trace function around
  intel: aubinator: move address masking
  intel: aubinator: drop the 1Tb GTT mapping
  intel: aubinator: remove standard input processing option

Scott D Phillips (2):
  intel/tools/aubinator: aubinate ppgtt aubs
  intel/batch-decoder: handle non-contiguous binding table / surface
state

 src/intel/common/gen_batch_decoder.c |  37 +-
 src/intel/tools/aubinator.c  | 608 +++
 src/util/Makefile.sources|   2 +
 src/util/meson.build |   2 +
 src/util/rb_tree.c   | 421 +++
 src/util/rb_tree.h   | 269 
 6 files changed, 1152 insertions(+), 187 deletions(-)
 create mode 100644 src/util/rb_tree.c
 create mode 100644 src/util/rb_tree.h

--
2.17.1


[Mesa-dev] [PATCH 2/9] intel/tools/aubinator: aubinate ppgtt aubs

2018-06-14 Thread Lionel Landwerlin
From: Scott D Phillips 

v2: by Lionel
Fix memfd_create compilation issue
Fix pml4 address stored on 32 instead of 64bits
Return no buffer if first ppgtt page is not mapped

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 460 
 1 file changed, 410 insertions(+), 50 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 3120e82b22e..99cd010dd9d 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -37,12 +37,24 @@
 #include 
 #include 
 
+#include "util/list.h"
 #include "util/macros.h"
+#include "util/rb_tree.h"
 
 #include "common/gen_decoder.h"
 #include "common/gen_disasm.h"
 #include "intel_aub.h"
 
+#ifndef HAVE_MEMFD_CREATE
+#include <sys/syscall.h>
+
+static inline int
+memfd_create(const char *name, unsigned int flags)
+{
+   return syscall(SYS_memfd_create, name, flags);
+}
+#endif
+
 /* Below is the only command missing from intel_aub.h in libdrm
  * So, reuse intel_aub.h from libdrm and #define the
  * AUB_MI_BATCH_BUFFER_END as below
@@ -70,6 +82,31 @@ struct gen_batch_decode_ctx batch_ctx;
 
 uint64_t gtt_size, gtt_end;
 void *gtt;
+
+struct bo_map {
+   struct list_head link;
+   struct gen_batch_decode_bo bo;
+};
+
+struct ggtt_entry {
+   struct rb_node node;
+   uint64_t virt_addr;
+   uint64_t phys_addr;
+};
+
+struct phys_mem {
+   struct rb_node node;
+   uint64_t fd_offset;
+   uint64_t phys_addr;
+   uint8_t *data;
+};
+
+static struct list_head maps;
+static struct rb_tree ggtt = {NULL};
+static struct rb_tree mem = {NULL};
+int mem_fd = -1;
+off_t mem_fd_len = 0;
+
 uint64_t general_state_base;
 uint64_t surface_state_base;
 uint64_t dynamic_state_base;
@@ -99,6 +136,191 @@ valid_offset(uint32_t offset)
 #define GEN_ENGINE_RENDER 1
 #define GEN_ENGINE_BLITTER 2
 
+static inline struct ggtt_entry *
+ggtt_entry_next(struct ggtt_entry *entry)
+{
+   if (!entry)
+  return NULL;
+   struct rb_node *node = rb_node_next(&entry->node);
+   if (!node)
+  return NULL;
+   return rb_node_data(struct ggtt_entry, node, node);
+}
+
+static inline int
+cmp_uint64(uint64_t a, uint64_t b)
+{
+   if (a < b)
+  return -1;
+   if (a > b)
+  return 1;
+   return 0;
+}
+
+static inline int
+cmp_ggtt_entry(const struct rb_node *node, const void *addr)
+{
+   struct ggtt_entry *entry = rb_node_data(struct ggtt_entry, node, node);
+   return cmp_uint64(entry->virt_addr, *(uint64_t *)addr);
+}
+
+static struct ggtt_entry *
+ensure_ggtt_entry(struct rb_tree *tree, uint64_t virt_addr)
+{
+   struct rb_node *node = rb_tree_search_sloppy(&ggtt, &virt_addr,
+cmp_ggtt_entry);
+   int cmp = 0;
+   if (!node || (cmp = cmp_ggtt_entry(node, &virt_addr))) {
+  struct ggtt_entry *new_entry = calloc(1, sizeof(*new_entry));
+  new_entry->virt_addr = virt_addr;
+  rb_tree_insert_at(&ggtt, node, &new_entry->node, cmp > 0);
+  node = &new_entry->node;
+   }
+
+   return rb_node_data(struct ggtt_entry, node, node);
+}
+
+static struct ggtt_entry *
+search_ggtt_entry(uint64_t virt_addr)
+{
+   virt_addr &= ~0xfff;
+
+   struct rb_node *node = rb_tree_search(&ggtt, &virt_addr, cmp_ggtt_entry);
+
+   if (!node)
+  return NULL;
+
+   return rb_node_data(struct ggtt_entry, node, node);
+}
+
+static inline int
+cmp_phys_mem(const struct rb_node *node, const void *addr)
+{
+   struct phys_mem *mem = rb_node_data(struct phys_mem, node, node);
+   return cmp_uint64(mem->phys_addr, *(uint64_t *)addr);
+}
+
+static struct phys_mem *
+ensure_phys_mem(uint64_t phys_addr)
+{
+   struct rb_node *node = rb_tree_search_sloppy(&mem, &phys_addr, 
cmp_phys_mem);
+   int cmp = 0;
+   if (!node || (cmp = cmp_phys_mem(node, &phys_addr))) {
+  struct phys_mem *new_mem = calloc(1, sizeof(*new_mem));
+  new_mem->phys_addr = phys_addr;
+  new_mem->fd_offset = mem_fd_len;
+
+  int ftruncate_res = ftruncate(mem_fd, mem_fd_len += 4096);
+  assert(ftruncate_res == 0);
+
+  new_mem->data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
+   mem_fd, new_mem->fd_offset);
+  assert(new_mem->data != MAP_FAILED);
+
+  rb_tree_insert_at(&mem, node, &new_mem->node, cmp > 0);
+  node = &new_mem->node;
+   }
+
+   return rb_node_data(struct phys_mem, node, node);
+}
+
+static struct phys_mem *
+search_phys_mem(uint64_t phys_addr)
+{
+   phys_addr &= ~0xfff;
+
+   struct rb_node *node = rb_tree_search(&mem, &phys_addr, cmp_phys_mem);
+
+   if (!node)
+  return NULL;
+
+   return rb_node_data(struct phys_mem, node, node);
+}
+
+static void
+handle_ggtt_entry_write(uint64_t address, void *_data, uint32_t _size)
+{
+   uint64_t virt_addr = (address / sizeof(uint64_t)) << 12;
+   uint64_t *data = _data;
+   size_t size = _size / sizeof(*data);
+   for (uint64_t *entry = data;
+entry < data + size;
+entry++, virt_addr += 4096) {
+  struct ggtt_entry *pt = ensure_ggtt_entry(&ggtt, virt_addr)

[Mesa-dev] [PATCH 6/9] intel: aubinator: move handle trace function around

2018-06-14 Thread Lionel Landwerlin
No functional changes.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 95 +++--
 1 file changed, 49 insertions(+), 46 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 99cd010dd9d..2a1b91c0e54 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -321,54 +321,8 @@ get_gen_batch_bo(void *user_data, uint64_t address)
};
 }
 
-static void
-handle_trace_block(uint32_t *p)
-{
-   int operation = p[1] & AUB_TRACE_OPERATION_MASK;
-   int type = p[1] & AUB_TRACE_TYPE_MASK;
-   int address_space = p[1] & AUB_TRACE_ADDRESS_SPACE_MASK;
-   uint64_t offset = p[3];
-   uint32_t size = p[4];
-   int header_length = p[0] & 0xffff;
-   uint32_t *data = p + header_length + 2;
-   int engine = GEN_ENGINE_RENDER;
-
-   if (devinfo.gen >= 8)
-  offset += (uint64_t) p[5] << 32;
-
-   switch (operation) {
-   case AUB_TRACE_OP_DATA_WRITE:
-  if (address_space != AUB_TRACE_MEMTYPE_GTT)
- break;
-  if (gtt_size < offset + size) {
- fprintf(stderr, "overflow gtt space: %s\n", strerror(errno));
- exit(EXIT_FAILURE);
-  }
-  memcpy((char *) gtt + offset, data, size);
-  if (gtt_end < offset + size)
- gtt_end = offset + size;
-  break;
-   case AUB_TRACE_OP_COMMAND_WRITE:
-  switch (type) {
-  case AUB_TRACE_TYPE_RING_PRB0:
- engine = GEN_ENGINE_RENDER;
- break;
-  case AUB_TRACE_TYPE_RING_PRB2:
- engine = GEN_ENGINE_BLITTER;
- break;
-  default:
- fprintf(outfile, "command write to unknown ring %d\n", type);
- break;
-  }
 
-  (void)engine; /* TODO */
-  batch_ctx.get_bo = get_gen_batch_bo;
-  gen_print_batch(&batch_ctx, data, size, 0);
 
-  gtt_end = 0;
-  break;
-   }
-}
 
 static struct gen_batch_decode_bo
 get_ggtt_batch_bo(void *user_data, uint64_t address)
@@ -470,6 +424,55 @@ clear_bo_maps(void)
}
 }
 
+static void
+handle_trace_block(uint32_t *p)
+{
+   int operation = p[1] & AUB_TRACE_OPERATION_MASK;
+   int type = p[1] & AUB_TRACE_TYPE_MASK;
+   int address_space = p[1] & AUB_TRACE_ADDRESS_SPACE_MASK;
+   uint64_t offset = p[3];
+   uint32_t size = p[4];
+   int header_length = p[0] & 0xffff;
+   uint32_t *data = p + header_length + 2;
+   int engine = GEN_ENGINE_RENDER;
+
+   if (devinfo.gen >= 8)
+  offset += (uint64_t) p[5] << 32;
+
+   switch (operation) {
+   case AUB_TRACE_OP_DATA_WRITE:
+  if (address_space != AUB_TRACE_MEMTYPE_GTT)
+ break;
+  if (gtt_size < offset + size) {
+ fprintf(stderr, "overflow gtt space: %s\n", strerror(errno));
+ exit(EXIT_FAILURE);
+  }
+  memcpy((char *) gtt + offset, data, size);
+  if (gtt_end < offset + size)
+ gtt_end = offset + size;
+  break;
+   case AUB_TRACE_OP_COMMAND_WRITE:
+  switch (type) {
+  case AUB_TRACE_TYPE_RING_PRB0:
+ engine = GEN_ENGINE_RENDER;
+ break;
+  case AUB_TRACE_TYPE_RING_PRB2:
+ engine = GEN_ENGINE_BLITTER;
+ break;
+  default:
+ fprintf(outfile, "command write to unknown ring %d\n", type);
+ break;
+  }
+
+  (void)engine; /* TODO */
+  batch_ctx.get_bo = get_gen_batch_bo;
+  gen_print_batch(&batch_ctx, data, size, 0);
+
+  gtt_end = 0;
+  break;
+   }
+}
+
 static void
 aubinator_init(uint16_t aub_pci_id, const char *app_name)
 {
-- 
2.17.1



[Mesa-dev] [PATCH 1/9] util: rb-tree: A simple, invasive, red-black tree

2018-06-14 Thread Lionel Landwerlin
From: Jason Ekstrand 

This is a simple, invasive, liberally licensed red-black tree
implementation. It's an invasive data structure similar to the
Linux kernel linked-list where the intention is that you embed an
rb_node struct in the data structure you intend to put into the
tree.

The implementation is mostly based on the one in "Introduction to
Algorithms", third edition, by Cormen, Leiserson, Rivest, and
Stein. There were a few other key design points:

 * It's an invasive data structure similar to the [Linux kernel
   linked list].

 * It uses NULL for leaves instead of a sentinel. This means a few
   algorithms differ a small bit from the ones in "Introduction to
   Algorithms".

 * All search operations are inlined so that the compiler can
   optimize away the function pointer call.
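
For readers unfamiliar with invasive containers, the design points above can
be sketched roughly as follows. This is an illustration only: tree_node,
tree_node_data, tree_search and struct range are hypothetical stand-ins, not
the rb_tree.h API, and the sketch leaves out colouring and rebalancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rb_node: links only, no colour or balancing. */
struct tree_node {
   struct tree_node *left;
   struct tree_node *right;
};

/* Recover the containing object from a pointer to its embedded node. */
#define tree_node_data(type, node, field) \
   ((type *)((char *)(node) - offsetof(type, field)))

/* User data embeds the node; the tree never allocates wrappers. */
struct range {
   uint64_t start;
   struct tree_node link;
};

/* Comparator: <0, 0 or >0 as the node's key is below, equal to or
 * above *key.
 */
static inline int
range_cmp(const struct tree_node *n, const void *key)
{
   const struct range *r = tree_node_data(const struct range, n, link);
   uint64_t k = *(const uint64_t *)key;
   return (r->start > k) - (r->start < k);
}

/* Inline search: with a constant cmp argument the compiler can inline
 * the comparator instead of making an indirect call.
 */
static inline struct tree_node *
tree_search(struct tree_node *root, const void *key,
            int (*cmp)(const struct tree_node *, const void *))
{
   struct tree_node *n = root;
   while (n != NULL) {            /* NULL marks a leaf, no sentinel */
      int c = cmp(n, key);
      if (c == 0)
         return n;
      n = c > 0 ? n->left : n->right;
   }
   return NULL;
}
```

The caller owns the storage for every element, so inserting into or searching
such a tree needs no allocation of its own.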
---
 src/util/Makefile.sources |   2 +
 src/util/meson.build  |   2 +
 src/util/rb_tree.c| 421 ++
 src/util/rb_tree.h| 269 
 4 files changed, 694 insertions(+)
 create mode 100644 src/util/rb_tree.c
 create mode 100644 src/util/rb_tree.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 534520ce763..37eb0880e35 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -30,6 +30,8 @@ MESA_UTIL_FILES := \
ralloc.h \
rand_xor.c \
rand_xor.h \
+   rb_tree.c \
+   rb_tree.h \
register_allocate.c \
register_allocate.h \
rgtc.c \
diff --git a/src/util/meson.build b/src/util/meson.build
index c777984e28d..62425bb237b 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -54,6 +54,8 @@ files_mesa_util = files(
   'ralloc.h',
   'rand_xor.c',
   'rand_xor.h',
+  'rb_tree.c',
+  'rb_tree.h',
   'register_allocate.c',
   'register_allocate.h',
   'rgtc.c',
diff --git a/src/util/rb_tree.c b/src/util/rb_tree.c
new file mode 100644
index 000..a86fa31a809
--- /dev/null
+++ b/src/util/rb_tree.c
@@ -0,0 +1,421 @@
+/*
+ * Copyright © 2017 Jason Ekstrand
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "rb_tree.h"
+
+/** \file rb_tree.c
+ *
+ * An implementation of a red-black tree
+ *
+ * This file implements the guts of a red-black tree.  The implementation
+ * is mostly based on the one in "Introduction to Algorithms", third
+ * edition, by Cormen, Leiserson, Rivest, and Stein.  The primary
+ * divergence in our algorithms from those presented in CLRS is that we use
+ * NULL for the leaves instead of a sentinel.  This means we have to do a
+ * tiny bit more tracking in our implementation of delete but it makes the
+ * algorithms far more explicit than stashing stuff in the sentinel.
+ */
+
+#include 
+#include 
+#include 
+
+static bool
+rb_node_is_black(struct rb_node *n)
+{
+/* NULL nodes are leaves and therefore black */
+return (n == NULL) || (n->parent & 1);
+}
+
+static bool
+rb_node_is_red(struct rb_node *n)
+{
+return !rb_node_is_black(n);
+}
+
+static void
+rb_node_set_black(struct rb_node *n)
+{
+n->parent |= 1;
+}
+
+static void
+rb_node_set_red(struct rb_node *n)
+{
+n->parent &= ~1ull;
+}
+
+static void
+rb_node_copy_color(struct rb_node *dst, struct rb_node *src)
+{
+dst->parent = (dst->parent & ~1ull) | (src->parent & 1);
+}
+
+static void
+rb_node_set_parent(struct rb_node *n, struct rb_node *p)
+{
+n->parent = (n->parent & 1) | (uintptr_t)p;
+}
+
+static struct rb_node *
+rb_node_minimum(struct rb_node *node)
+{
+while (node->left)
+node = node->left;
+return node;
+}
+
+static struct rb_node *
+rb_node_maximum(struct rb_node *node)
+{
+while (node->right)
+node = node->right;
+return node;
+}
+
+void
+rb_tree_init(struct rb_tree *T)
+{
+T->root = NULL;
+}
+
+/**
+ * Replace the subtree of T rooted at u with the subtree rooted at v
+ *
+ * This is called RB-transplant in CLRS.
+ *
+ * The node to be replaced is assumed to be a non

[Mesa-dev] [PATCH 4/9] intel: batch-decoder: don't asks for constant BO until decoding

2018-06-14 Thread Lionel Landwerlin
With PPGTT mappings, our aubinator implementation can be quite slow if
we request a buffer that doesn't exist. Instead of doing a PPGTT walk
for invalid addresses (0 lengths), wait until we're sure we want to
decode the data.
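
The idea can be sketched as below. All names are hypothetical and
counting_lookup merely stands in for an expensive PPGTT walk; the point is
that slots with a zero read length never trigger a lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down buffer description. */
struct sketch_bo {
   const void *map;
   uint64_t addr;
};

/* Counts how many (supposedly expensive) lookups were performed. */
static int lookup_calls;

/* Stand-in for a page-table walk: address 0 is treated as unmapped. */
static struct sketch_bo
counting_lookup(uint64_t addr)
{
   static const uint8_t backing[64];

   lookup_calls++;
   if (addr == 0)
      return (struct sketch_bo) { .map = NULL, .addr = addr };
   return (struct sketch_bo) { .map = backing, .addr = addr };
}

/* Record addresses first, resolve them only for slots that will be
 * decoded. Returns the number of buffers that would be printed.
 */
static int
decode_constants(const uint32_t read_length[4], const uint64_t read_addr[4])
{
   int decoded = 0;

   assert(read_length != NULL && read_addr != NULL);

   for (int i = 0; i < 4; i++) {
      if (read_length[i] == 0)
         continue;               /* unused slot: no lookup at all */

      struct sketch_bo bo = counting_lookup(read_addr[i]);
      if (bo.map == NULL)
         continue;               /* wanted, but unavailable */

      decoded++;
   }
   return decoded;
}
```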

Signed-off-by: Lionel Landwerlin 
---
 src/intel/common/gen_batch_decoder.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index 2b6978da92d..81d8298c28b 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -562,9 +562,8 @@ decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, 
const uint32_t *p)
struct gen_group *body =
   gen_spec_find_struct(ctx->spec, "3DSTATE_CONSTANT_BODY");
 
-   uint32_t read_length[4];
-   struct gen_batch_decode_bo buffer[4];
-   memset(buffer, 0, sizeof(buffer));
+   uint32_t read_length[4] = {0};
+   uint64_t read_addr[4];
 
struct gen_field_iterator outer;
gen_field_iterator_init(&outer, inst, p, 0, false);
@@ -581,18 +580,24 @@ decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, 
const uint32_t *p)
  if (sscanf(iter.name, "Read Length[%d]", &idx) == 1) {
 read_length[idx] = iter.raw_value;
  } else if (sscanf(iter.name, "Buffer[%d]", &idx) == 1) {
-buffer[idx] = ctx_get_bo(ctx, iter.raw_value);
+read_addr[idx] = iter.raw_value;
  }
   }
 
   for (int i = 0; i < 4; i++) {
- if (read_length[i] == 0 || buffer[i].map == NULL)
+ if (read_length[i] == 0)
 continue;
 
+ struct gen_batch_decode_bo buffer = ctx_get_bo(ctx, read_addr[i]);
+ if (!buffer.map) {
+fprintf(ctx->fp, "constant buffer %d unavailable\n", i);
+continue;
+ }
+
  unsigned size = read_length[i] * 32;
  fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
 
- ctx_print_buffer(ctx, buffer[i], size, 0, -1);
+ ctx_print_buffer(ctx, buffer, size, 0, -1);
   }
}
 }
-- 
2.17.1



[Mesa-dev] [PATCH 5/9] intel: batch-decoder: add missing return line

2018-06-14 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/common/gen_batch_decoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index 81d8298c28b..fc0ff95a476 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -854,7 +854,7 @@ gen_print_batch(struct gen_batch_decode_ctx *ctx,
  }
 
  if (next_batch.map == NULL) {
-fprintf(ctx->fp, "Secondary batch at 0x%08"PRIx64" unavailable",
+fprintf(ctx->fp, "Secondary batch at 0x%08"PRIx64" unavailable\n",
 next_batch.addr);
  }
 
-- 
2.17.1



[Mesa-dev] [PATCH 9/9] intel: aubinator: remove standard input processing option

2018-06-14 Thread Lionel Landwerlin
Now that we rely on mmap of the data to parse, we can't process the
standard input anymore.

This isn't much of a big deal because we have an in-process batch
decoder (run with INTEL_DEBUG=batch) that does essentially the same
thing.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 102 +---
 1 file changed, 12 insertions(+), 90 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index c9308b56137..4ee3df9cd9e 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -694,17 +694,6 @@ aub_file_open(const char *filename)
return file;
 }
 
-static struct aub_file *
-aub_file_stdin(void)
-{
-   struct aub_file *file;
-
-   file = calloc(1, sizeof *file);
-   file->stream = stdin;
-
-   return file;
-}
-
 #define TYPE(dw)   (((dw) >> 29) & 7)
 #define OPCODE(dw) (((dw) >> 23) & 0x3f)
 #define SUBOPCODE(dw)  (((dw) >> 16) & 0x7f)
@@ -742,8 +731,7 @@ aub_file_decode_batch(struct aub_file *file)
uint32_t *p, h, *new_cursor;
int header_length, bias;
 
-   if (file->end - file->cursor < 1)
-  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+   assert(file->cursor < file->end);
 
p = file->cursor;
h = *p;
@@ -765,13 +753,11 @@ aub_file_decode_batch(struct aub_file *file)
 
new_cursor = p + header_length + bias;
if ((h & 0xffff0000) == MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BLOCK)) 
{
-  if (file->end - file->cursor < 4)
- return AUB_ITEM_DECODE_NEED_MORE_DATA;
+  assert(file->end - file->cursor >= 4);
   new_cursor += p[4] / 4;
}
 
-   if (new_cursor > file->end)
-  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+   assert(new_cursor <= file->end);
 
switch (h & 0xffff0000) {
case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
@@ -812,48 +798,6 @@ aub_file_more_stuff(struct aub_file *file)
return file->cursor < file->end || (file->stream && !feof(file->stream));
 }
 
-#define AUB_READ_BUFFER_SIZE (4096)
-#define MAX(a, b) ((a) < (b) ? (b) : (a))
-
-static void
-aub_file_data_grow(struct aub_file *file)
-{
-   size_t old_size = (file->mem_end - file->map) * 4;
-   size_t new_size = MAX(old_size * 2, AUB_READ_BUFFER_SIZE);
-   uint32_t *new_start = realloc(file->map, new_size);
-
-   file->cursor = new_start + (file->cursor - file->map);
-   file->end = new_start + (file->end - file->map);
-   file->map = new_start;
-   file->mem_end = file->map + (new_size / 4);
-}
-
-static bool
-aub_file_data_load(struct aub_file *file)
-{
-   size_t r;
-
-   if (file->stream == NULL)
-  return false;
-
-   /* First remove any consumed data */
-   if (file->cursor > file->map) {
-  memmove(file->map, file->cursor,
-  (file->end - file->cursor) * 4);
-  file->end -= file->cursor - file->map;
-  file->cursor = file->map;
-   }
-
-   /* Then load some new data in */
-   if ((file->mem_end - file->end) < (AUB_READ_BUFFER_SIZE / 4))
-  aub_file_data_grow(file);
-
-   r = fread(file->end, 1, (file->mem_end - file->end) * 4, file->stream);
-   file->end += r / 4;
-
-   return r != 0;
-}
-
 static void
 setup_pager(void)
 {
@@ -885,9 +829,8 @@ static void
 print_help(const char *progname, FILE *file)
 {
fprintf(file,
-   "Usage: %s [OPTION]... [FILE]\n"
-   "Decode aub file contents from either FILE or the standard 
input.\n\n"
-   "A valid --gen option must be provided.\n\n"
+   "Usage: %s [OPTION]... FILE\n"
+   "Decode aub file contents from FILE.\n\n"
"  --help display this help and exit\n"
"  --gen=platform decode for given platform (3 letter 
platform name)\n"
"  --headers  decode only command headers\n"
@@ -956,14 +899,14 @@ int main(int argc, char *argv[])
   }
}
 
-   if (help || argc == 1) {
+   if (optind < argc)
+  input_file = argv[optind];
+
+   if (help || !input_file) {
   print_help(argv[0], stderr);
   exit(0);
}
 
-   if (optind < argc)
-  input_file = argv[optind];
-
/* Do this before we redirect stdout to pager. */
if (option_color == COLOR_AUTO)
   option_color = isatty(1) ? COLOR_ALWAYS : COLOR_NEVER;
@@ -971,35 +914,14 @@ int main(int argc, char *argv[])
if (isatty(1) && pager)
   setup_pager();
 
-   if (input_file == NULL)
-  file = aub_file_stdin();
-   else
-  file = aub_file_open(input_file);
-
mem_fd = memfd_create("phys memory", 0);
 
list_inithead(&maps);
 
-   while (aub_file_more_stuff(file)) {
-  switch (aub_file_decode_batch(file)) {
-  case AUB_ITEM_DECODE_OK:
- break;
-  case AUB_ITEM_DECODE_NEED_MORE_DATA:
- if (!file->stream) {
-file->cursor = file->end;
-break;
- }
- if (aub_file_more_stuff(file) && !aub_file_data_load(file)) {
-fprintf(stderr, "failed to load data from stdin\n");
-exit(EXIT_FAILURE);
- }
-

[Mesa-dev] [PATCH 7/9] intel: aubinator: move address masking

2018-06-14 Thread Lionel Landwerlin
The masking is only needed for entry matching.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 2a1b91c0e54..6f2e0d503df 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -329,12 +329,12 @@ get_ggtt_batch_bo(void *user_data, uint64_t address)
 {
struct gen_batch_decode_bo bo = {0};
 
-   address &= ~0xfff;
-
list_for_each_entry(struct bo_map, i, &maps, link)
   if (i->bo.addr <= address && i->bo.addr + i->bo.size > address)
  return i->bo;
 
+   address &= ~0xfff;
+
struct ggtt_entry *start =
   (struct ggtt_entry *)rb_tree_search_sloppy(&ggtt, &address,
  cmp_ggtt_entry);
-- 
2.17.1



[Mesa-dev] [PATCH 8/9] intel: aubinator: drop the 1Tb GTT mapping

2018-06-14 Thread Lionel Landwerlin
We reuse the existing mechanism introduced in 482a7d1593c621
("intel/tools/aubinator: aubinate ppgtt aubs"), a list of GTT
addresses and their corresponding mmapped pointers, so that we can get
rid of the 1Tb of mmapped memory and instead just use the already
mmapped aub file.
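
In rough terms the lookup then becomes a walk over a list of address ranges.
The sketch below uses hypothetical names (it is not the bo_map code from this
series) and a plain singly linked list, but shows the containment test and
why entries that point into the aub file must not be unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record: a GTT range and where its bytes live. */
struct sketch_map {
   uint64_t addr;
   uint64_t size;
   const void *map;
   /* false when map points into the already mmapped aub file. */
   bool unmap_after_use;
   const struct sketch_map *next;
};

/* Return the mapping that contains address, or NULL. */
static const struct sketch_map *
find_map(const struct sketch_map *head, uint64_t address)
{
   for (const struct sketch_map *m = head; m != NULL; m = m->next) {
      /* address - m->addr cannot wrap once m->addr <= address holds. */
      if (m->addr <= address && address - m->addr < m->size)
         return m;
   }
   return NULL;
}

/* Translate a GTT address inside m to a CPU pointer. */
static const void *
map_pointer(const struct sketch_map *m, uint64_t address)
{
   assert(m->addr <= address && address - m->addr < m->size);
   return (const char *)m->map + (address - m->addr);
}
```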

Sorry Kristian.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 85 -
 1 file changed, 28 insertions(+), 57 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 6f2e0d503df..c9308b56137 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -43,6 +43,7 @@
 
 #include "common/gen_decoder.h"
 #include "common/gen_disasm.h"
+#include "common/gen_gem.h"
 #include "intel_aub.h"
 
 #ifndef HAVE_MEMFD_CREATE
@@ -80,12 +81,10 @@ char *input_file = NULL, *xml_path = NULL;
 struct gen_device_info devinfo;
 struct gen_batch_decode_ctx batch_ctx;
 
-uint64_t gtt_size, gtt_end;
-void *gtt;
-
 struct bo_map {
struct list_head link;
struct gen_batch_decode_bo bo;
+   bool unmap_after_use;
 };
 
 struct ggtt_entry {
@@ -127,12 +126,6 @@ field(uint32_t value, int start, int end)
 
 struct brw_instruction;
 
-static inline int
-valid_offset(uint32_t offset)
-{
-   return offset < gtt_end;
-}
-
 #define GEN_ENGINE_RENDER 1
 #define GEN_ENGINE_BLITTER 2
 
@@ -307,22 +300,15 @@ ppgtt_mapped(uint64_t pml4, uint64_t address)
return ppgtt_walk(pml4, address) != NULL;
 }
 
-static struct gen_batch_decode_bo
-get_gen_batch_bo(void *user_data, uint64_t address)
+static void
+add_gtt_bo_map(struct gen_batch_decode_bo bo, bool unmap_after_use)
 {
-   if (address > gtt_end)
-  return (struct gen_batch_decode_bo) { .map = NULL };
-
-   /* We really only have one giant address range */
-   return (struct gen_batch_decode_bo) {
-  .addr = 0,
-  .map = gtt,
-  .size = gtt_size
-   };
-}
-
-
+   struct bo_map *m = calloc(1, sizeof(*m));
 
+   m->bo = bo;
+   m->unmap_after_use = unmap_after_use;
+   list_add(&m->link, &maps);
+}
 
 static struct gen_batch_decode_bo
 get_ggtt_batch_bo(void *user_data, uint64_t address)
@@ -369,9 +355,7 @@ get_ggtt_batch_bo(void *user_data, uint64_t address)
   assert(res != MAP_FAILED);
}
 
-   struct bo_map *m = calloc(1, sizeof(*m));
-   m->bo = bo;
-   list_add(&m->link, &maps);
+   add_gtt_bo_map(bo, true);
 
return bo;
 }
@@ -407,9 +391,7 @@ get_ppgtt_batch_bo(void *user_data, uint64_t address)
   assert(res != MAP_FAILED);
}
 
-   struct bo_map *m = calloc(1, sizeof(*m));
-   m->bo = bo;
-   list_add(&m->link, &maps);
+   add_gtt_bo_map(bo, true);
 
return bo;
 }
@@ -418,7 +400,8 @@ static void
 clear_bo_maps(void)
 {
list_for_each_entry_safe(struct bo_map, i, &maps, link) {
-  munmap((void *)i->bo.map, i->bo.size);
+  if (i->unmap_after_use)
+ munmap((void *)i->bo.map, i->bo.size);
   list_del(&i->link);
   free(i);
}
@@ -430,26 +413,23 @@ handle_trace_block(uint32_t *p)
int operation = p[1] & AUB_TRACE_OPERATION_MASK;
int type = p[1] & AUB_TRACE_TYPE_MASK;
int address_space = p[1] & AUB_TRACE_ADDRESS_SPACE_MASK;
-   uint64_t offset = p[3];
-   uint32_t size = p[4];
int header_length = p[0] & 0xffff;
-   uint32_t *data = p + header_length + 2;
int engine = GEN_ENGINE_RENDER;
-
-   if (devinfo.gen >= 8)
-  offset += (uint64_t) p[5] << 32;
+   struct gen_batch_decode_bo bo = {
+  .map = p + header_length + 2,
+  /* Addresses written by aubdump here are in canonical form but the batch
+   * decoder always gives us addresses with the top 16bits zeroed, so do
+   * the same here.
+   */
+  .addr = gen_48b_address((devinfo.gen >= 8 ? ((uint64_t) p[5] << 32) : 0) 
|
+  ((uint64_t) p[3])),
+  .size = p[4],
+   };
 
switch (operation) {
case AUB_TRACE_OP_DATA_WRITE:
-  if (address_space != AUB_TRACE_MEMTYPE_GTT)
- break;
-  if (gtt_size < offset + size) {
- fprintf(stderr, "overflow gtt space: %s\n", strerror(errno));
- exit(EXIT_FAILURE);
-  }
-  memcpy((char *) gtt + offset, data, size);
-  if (gtt_end < offset + size)
- gtt_end = offset + size;
+  if (address_space == AUB_TRACE_MEMTYPE_GTT)
+ add_gtt_bo_map(bo, false);
   break;
case AUB_TRACE_OP_COMMAND_WRITE:
   switch (type) {
@@ -465,10 +445,10 @@ handle_trace_block(uint32_t *p)
   }
 
   (void)engine; /* TODO */
-  batch_ctx.get_bo = get_gen_batch_bo;
-  gen_print_batch(&batch_ctx, data, size, 0);
+  batch_ctx.get_bo = get_ggtt_batch_bo;
+  gen_print_batch(&batch_ctx, bo.map, bo.size, 0);
 
-  gtt_end = 0;
+  clear_bo_maps();
   break;
}
 }
@@ -996,15 +976,6 @@ int main(int argc, char *argv[])
else
   file = aub_file_open(input_file);
 
-   /* mmap a terabyte for our gtt space. */
-   gtt_size = 1ull << 40;
-   gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
-   

Re: [Mesa-dev] [ANNOUNCE] Mesa 18.1.2 release candidate

2018-06-14 Thread Dylan Baker
Quoting Bas Nieuwenhuizen (2018-06-14 09:21:49)
> On Thu, Jun 14, 2018 at 6:13 PM,   wrote:
> > Hello list,
> >
> > The candidate for the Mesa 18.1.2 is now available. Currently we have:
> >  - 42 queued
> >  - 6 nominated (outstanding)
> >  - and 0 rejected patches
> >
> > Notable changes in this release:
> > - numerous fixes for radv
> > - libatomic checks for meson, as well as fixing coverage for less common (not
> >   arm or x86) platforms
> > - lots of common Intel fixes
> > - GLX fixes
> > - tarball fixes for android
> > - meson assembly fixes for x86 when doing an x86 -> x86 cross compile
> >
> > Take a look at section "Mesa stable queue" for more information.
> >
> >
> > Testing reports/general approval
> > 
> > Any testing reports (or general approval of the state of the branch) will be
> > greatly appreciated.
> >
> > The plan is to have 18.1.2 this Friday (June 13th), around or shortly after 10
> > AM PDT.
> 
> June 15th?

Yes, June 15th.

Apparently being woken up at 5AM does more brain damage than I thought.

Dylan




Re: [Mesa-dev] [PATCH 1/3] features.txt: mark some extensions as done

2018-06-14 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

On 2018-06-14 04:08:09, Tapani Pälli wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  docs/features.txt | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/features.txt b/docs/features.txt
> index b32606d223..423b03a9a9 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -322,12 +322,14 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
>    GL_EXT_semaphore                                  DONE (radeonsi)
>    GL_EXT_semaphore_fd                               DONE (radeonsi)
>    GL_EXT_semaphore_win32                            not started
> +  GL_EXT_texture_norm16                             DONE (i965, r600, radeonsi, nvc0)
>    GL_KHR_blend_equation_advanced_coherent           DONE (i965/gen9+)
>    GL_KHR_texture_compression_astc_hdr               DONE (i965/bxt)
>    GL_KHR_texture_compression_astc_sliced_3d         DONE (i965/gen9+)
>    GL_OES_depth_texture_cube_map                     DONE (all drivers that support GLSL 1.30+)
>    GL_OES_EGL_image                                  DONE (all drivers)
> -  GL_OES_EGL_image_external_essl3                   not started
> +  GL_OES_EGL_image_external                         DONE (all drivers)
> +  GL_OES_EGL_image_external_essl3                   DONE (all drivers)
>    GL_OES_required_internalformat                    DONE (all drivers)
>    GL_OES_surfaceless_context                        DONE (all drivers)
>    GL_OES_texture_compression_astc                   DONE (core only)
> @@ -335,7 +337,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
>    GL_OES_texture_float_linear                       DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
>    GL_OES_texture_half_float                         DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
>    GL_OES_texture_half_float_linear                  DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
> -  GL_OES_texture_view                               not started - based on GL_ARB_texture_view
> +  GL_OES_texture_view                               DONE (i965/gen8+)
>    GL_OES_viewport_array                             DONE (i965, nvc0, radeonsi)
>    GLX_ARB_context_flush_control                     not started
>    GLX_ARB_robustness_application_isolation          not started
> -- 
> 2.14.4
> 


Re: [Mesa-dev] [PATCH v2] meson: fix private libs when building without glx

2018-06-14 Thread Lukas Rusak
How can I get some traction on this?

On Mon, Jun 4, 2018 at 12:38 PM Lukas Rusak  wrote:

> I noticed that the generated pkg-config files will include
> glx and x11 dependencies even when x11 isn't a selected platform.
>
> This fixes the private libs and was tested by building kmscube
>
> V2:
>   - check if gallium-xlib is being used for glx
>
> Reviewed-by: Dylan Baker 
> ---
>  meson.build | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/meson.build b/meson.build
> index 4aafba802a..b1ab9d6a20 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1339,18 +1339,24 @@ endforeach
>
>  inc_include = include_directories('include')
>
> -gl_priv_reqs = [
> -  'x11', 'xext', 'xdamage >= 1.1', 'xfixes', 'x11-xcb', 'xcb',
> -  'xcb-glx >= 1.8.1']
> +gl_priv_reqs = []
> +
> +if with_glx == 'xlib' or with_glx == 'gallium-xlib'
> +  gl_priv_reqs += ['x11', 'xext', 'xcb']
> +elif with_glx == 'dri'
> +  gl_priv_reqs += [
> +'x11', 'xext', 'xdamage >= 1.1', 'xfixes', 'x11-xcb', 'xcb',
> +'xcb-glx >= 1.8.1']
> +  if with_dri_platform == 'drm'
> +gl_priv_reqs += 'xcb-dri2 >= 1.8'
> +  endif
> +endif
>  if dep_libdrm.found()
>gl_priv_reqs += 'libdrm >= 2.4.75'
>  endif
>  if dep_xxf86vm.found()
>gl_priv_reqs += 'xxf86vm'
>  endif
> -if with_dri_platform == 'drm'
> -  gl_priv_reqs += 'xcb-dri2 >= 1.8'
> -endif
>
>  gl_priv_libs = []
>  if dep_thread.found()
> --
> 2.17.0
>
>


Re: [Mesa-dev] [PATCH] meson: only build vl_winsys_dri.c when x11 platform is used

2018-06-14 Thread Lukas Rusak
any updates here?

On Fri, Jun 1, 2018 at 2:09 PM Lukas Rusak  wrote:

> This seems to have been missed in the move from autotools
>
> This fixes the following build issue:
>
> ../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
>  #include <X11/Xlib-xcb.h>
>   ^~~~
> ---
>  src/gallium/auxiliary/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/meson.build b/src/gallium/auxiliary/meson.build
> index 584cbe4509..857001e12c 100644
> --- a/src/gallium/auxiliary/meson.build
> +++ b/src/gallium/auxiliary/meson.build
> @@ -453,7 +453,7 @@ files_libgalliumvl = files(
>  )
>
>  files_libgalliumvlwinsys = files('vl/vl_winsys.h')
> -if with_dri2
> +if with_dri2 and with_platform_x11
>files_libgalliumvlwinsys += files('vl/vl_winsys_dri.c')
>if with_dri3
>  files_libgalliumvlwinsys += files('vl/vl_winsys_dri3.c')
> --
> 2.17.0
>
>


Re: [Mesa-dev] [PATCH] glsl: serialize data from glTransformFeedbackVaryings

2018-06-14 Thread Jordan Justen
On 2018-06-14 02:58:33, Tapani Pälli wrote:
> While XFB has been enabled for cache, we did not serialize enough
> data for the whole API to work (such as glGetProgramiv).
> 
> Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
> ---
>  src/compiler/glsl/serialize.cpp | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/src/compiler/glsl/serialize.cpp b/src/compiler/glsl/serialize.cpp
> index 727822633d..4cb74ddba9 100644
> --- a/src/compiler/glsl/serialize.cpp
> +++ b/src/compiler/glsl/serialize.cpp
> @@ -323,6 +323,12 @@ write_xfb(struct blob *metadata, struct gl_shader_program *shProg)
>  
> blob_write_uint32(metadata, prog->info.stage);
>  
> +   /* Data set by glTransformFeedbackVaryings. */
> +   blob_write_uint32(metadata, shProg->TransformFeedback.BufferMode);
> +   blob_write_uint32(metadata, shProg->TransformFeedback.NumVarying);
> +   for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; i++)
> +  blob_write_string(metadata, shProg->TransformFeedback.VaryingNames[i]);

I guess we need BufferStride too?

> +
> blob_write_uint32(metadata, ltf->NumOutputs);
> blob_write_uint32(metadata, ltf->ActiveBuffers);
> blob_write_uint32(metadata, ltf->NumVarying);
> @@ -352,6 +358,15 @@ read_xfb(struct blob_reader *metadata, struct gl_shader_program *shProg)
> if (xfb_stage == ~0u)
>return;
>  
> +   /* Data set by glTransformFeedbackVaryings. */
> +   shProg->TransformFeedback.BufferMode = blob_read_uint32(metadata);
> +   shProg->TransformFeedback.NumVarying = blob_read_uint32(metadata);
> +   shProg->TransformFeedback.VaryingNames = (char **)
> +  malloc(shProg->TransformFeedback.NumVarying * sizeof(GLchar *));
> +   for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; i++)
> +  shProg->TransformFeedback.VaryingNames[i] =
> + strdup(blob_read_string(metadata));

I guess VaryingNames uses malloc/free rather than ralloc. Maybe worth
a comment? A comment might just be extra noise though too.

With BufferStride added,

Reviewed-by: Jordan Justen 

> +
> struct gl_program *prog = shProg->_LinkedShaders[xfb_stage]->Program;
> struct gl_transform_feedback_info *ltf =
>rzalloc(prog, struct gl_transform_feedback_info);
> -- 
> 2.14.4
> 
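
The patch above writes a count followed by each varying name, and the reader rebuilds the array with plain heap allocations rather than ralloc, as the review notes. A self-contained sketch of that round trip is below. The toy_blob type and the toy_* helpers are hypothetical stand-ins invented for illustration; they are not Mesa's blob API, and the sketch leaves out BufferMode and BufferStride.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for a serialization blob. */
struct toy_blob {
   uint8_t data[256];
   size_t size; /* bytes written so far */
   size_t pos;  /* read cursor */
};

static void
toy_write_uint32(struct toy_blob *b, uint32_t v)
{
   assert(b->size + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + b->size, &v, sizeof(v));
   b->size += sizeof(v);
}

static void
toy_write_string(struct toy_blob *b, const char *s)
{
   size_t len = strlen(s) + 1; /* keep the NUL so the reader finds the end */
   assert(b->size + len <= sizeof(b->data));
   memcpy(b->data + b->size, s, len);
   b->size += len;
}

static uint32_t
toy_read_uint32(struct toy_blob *b)
{
   uint32_t v;
   assert(b->pos + sizeof(v) <= b->size);
   memcpy(&v, b->data + b->pos, sizeof(v));
   b->pos += sizeof(v);
   return v;
}

static const char *
toy_read_string(struct toy_blob *b)
{
   assert(b->pos < b->size);
   const char *s = (const char *) b->data + b->pos;
   b->pos += strlen(s) + 1;
   return s;
}

/* Writer side: the count first, then every name in order. */
static void
write_varying_names(struct toy_blob *b, const char *const *names, uint32_t n)
{
   toy_write_uint32(b, n);
   for (uint32_t i = 0; i < n; i++)
      toy_write_string(b, names[i]);
}

/* Reader side: heap copies that the caller releases with free(). */
static char **
read_varying_names(struct toy_blob *b, uint32_t *count_out)
{
   uint32_t n = toy_read_uint32(b);
   /* Allocate at least one slot so n == 0 still yields a valid pointer. */
   char **names = malloc((n ? n : 1) * sizeof(*names));
   assert(names != NULL);
   for (uint32_t i = 0; i < n; i++) {
      const char *s = toy_read_string(b);
      size_t len = strlen(s) + 1;
      names[i] = malloc(len);
      assert(names[i] != NULL);
      memcpy(names[i], s, len);
   }
   *count_out = n;
   return names;
}
```

The point of the sketch is only the ordering contract: the reader must consume fields in exactly the order the writer emitted them, which is why any extra field added on the write side needs a matching read.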


Re: [Mesa-dev] [PATCH 0/8] i965: Don't recycle BOs until they are idle

2018-06-14 Thread Jason Ekstrand

On June 14, 2018 01:43:12 Michel Dänzer  wrote:

> On 2018-06-13 10:26 PM, Jason Ekstrand wrote:
>> The current BO cache puts BOs back into the recycle bucket the moment the
>> refcount hits zero.  If the BO is busy, we just don't re-use it until it
>> isn't or we re-use it for a render target which we assume will be used
>> first for drawing.  This patch series reworks the way the BO cache works a
>> bit so that we don't ever recycle a busy BO.  On the down side, it means
>> that we don't get the "keep busy BOs busy" heuristic (which we have no
>> proof actually helps).  On the up side, we can now easily use a MRU
>> heuristic instead of round-robin for all buffers and not just the busy
>> ones.  Will this be an improvement, a regression or a wash?  I don't know
>> but I doubt it will have a major effect one way or another.
>
> FWIW, I suspect this could be a significant loss with overlapping copies
> in glamor (e.g. x11perf -copywinwin500), because it won't be able to
> reuse the busy BOs anymore (glamor creates a temporary FBO for each
> overlapping copy).

That's rather horrific... That seems like something glamor could do
better.  How common are overlapping copies in practice?  Are we talking a
couple per frame or hundreds?  If that really is going on then we may need
to rethink our approach on this one. :-(
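
For readers following the cache discussion, here is a rough sketch of the policy the cover letter describes: only idle BOs enter the bucket, and the most recently returned one is handed out first. Everything in it (toy_bo, toy_bucket, the fixed capacity) is made up for illustration and is not the brw_bufmgr code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BUCKET_CAPACITY 8

/* Hypothetical buffer object: just an id and a busy flag. */
struct toy_bo {
   int id;
   bool busy;
};

/* Hypothetical cache bucket holding only idle BOs, used as a stack. */
struct toy_bucket {
   struct toy_bo *idle[TOY_BUCKET_CAPACITY];
   size_t count;
};

/* Return a BO to the bucket. Busy BOs are refused, so the caller has to hold
 * on to them until the GPU is done.
 */
static bool
toy_bucket_put(struct toy_bucket *bucket, struct toy_bo *bo)
{
   if (bo->busy || bucket->count == TOY_BUCKET_CAPACITY)
      return false;
   bucket->idle[bucket->count++] = bo;
   return true;
}

/* Hand out the most recently returned BO (MRU), or NULL if the bucket is
 * empty and a fresh allocation is needed.
 */
static struct toy_bo *
toy_bucket_get(struct toy_bucket *bucket)
{
   if (bucket->count == 0)
      return NULL;
   return bucket->idle[--bucket->count];
}

/* Usage sketch. */
void
toy_bucket_demo(void)
{
   struct toy_bucket bucket = { .count = 0 };
   struct toy_bo a = { .id = 1, .busy = false };
   struct toy_bo b = { .id = 2, .busy = true };
   struct toy_bo c = { .id = 3, .busy = false };

   assert(toy_bucket_put(&bucket, &a));
   assert(!toy_bucket_put(&bucket, &b)); /* busy, never recycled */
   assert(toy_bucket_put(&bucket, &c));
   assert(toy_bucket_get(&bucket) == &c); /* most recently returned first */
   assert(toy_bucket_get(&bucket) == &a);
   assert(toy_bucket_get(&bucket) == NULL);
}
```

In this model the glamor case Michel raises shows up directly: a temporary BO that is still busy when released never reaches the bucket, so the next overlapping copy gets NULL and has to allocate.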





Re: [Mesa-dev] [PATCH 3/8] i965/bufmgr: Drop the BO_ALLOC_BUSY flag

2018-06-14 Thread Jason Ekstrand
On June 14, 2018 06:01:33 Lionel Landwerlin  wrote:



On 13/06/18 21:26, Jason Ekstrand wrote:

---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 ++
src/mesa/drivers/dri/i965/brw_bufmgr.h |  1 -
2 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c

index 58bb559fdee..e9d3daa5985 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,11 +448,6 @@ int
brw_bo_busy(struct brw_bo *bo)
{
struct brw_bufmgr *bufmgr = bo->bufmgr;


I don't really understand this hunk.
It seems unrelated to this patch and undoes what you added in patch 1.


Oops. That was completely unintended.




-
-   /* If we know it's idle, don't bother with the kernel round trip */
-   if (bo->idle && !bo->external)
-  return false;
-
struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };

int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
@@ -506,20 +501,11 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
struct bo_cache_bucket *bucket;
bool alloc_from_cache;
uint64_t bo_size;
-   bool busy = false;
bool zeroed = false;

-   if (flags & BO_ALLOC_BUSY)
-  busy = true;
-
if (flags & BO_ALLOC_ZEROED)
zeroed = true;

-   /* BUSY does doesn't really jive with ZEROED as we have to wait for it to
-* be idle before we can memset.  Just disallow that combination.
-*/
-   assert(!(busy && zeroed));
-
/* Round the allocated size up to a power of two number of pages. */
bucket = bucket_for_size(bufmgr, size);

@@ -539,29 +525,17 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
retry:
alloc_from_cache = false;
if (bucket != NULL && !list_empty(&bucket->head)) {
-  if (busy && !zeroed) {
- /* Allocate new render-target BOs from the tail (MRU)
-  * of the list, as it will likely be hot in the GPU
-  * cache and in the aperture for us.  If the caller
-  * asked us to zero the buffer, we don't want this
-  * because we are going to mmap it.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
- list_del(&bo->head);
+  /* For non-render-target BOs (where we're probably
+   * going to map it first thing in order to fill it
+   * with data), check if the last BO in the cache is
+   * unbusy, and only reuse in that case. Otherwise,
+   * allocating a new buffer is probably faster than
+   * waiting for the GPU to finish.
+   */
+  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+  if (!brw_bo_busy(bo)) {
  alloc_from_cache = true;
-  } else {
- /* For non-render-target BOs (where we're probably
-  * going to map it first thing in order to fill it
-  * with data), check if the last BO in the cache is
-  * unbusy, and only reuse in that case. Otherwise,
-  * allocating a new buffer is probably faster than
-  * waiting for the GPU to finish.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
- if (!brw_bo_busy(bo)) {
-alloc_from_cache = true;
-list_del(&bo->head);
- }
+ list_del(&bo->head);
}

if (alloc_from_cache) {
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h

index 32fc7a553c9..d3b3aadc0db 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -195,7 +195,6 @@ struct brw_bo {
bool cache_coherent;
};

-#define BO_ALLOC_BUSY   (1<<0)
#define BO_ALLOC_ZEROED (1<<1)

/**






Re: [Mesa-dev] [PATCH] meson: only build vl_winsys_dri.c when x11 platform is used

2018-06-14 Thread Dylan Baker
Quoting Lukas Rusak (2018-06-14 10:25:43)
> any updates here?
> 
> On Fri, Jun 1, 2018 at 2:09 PM Lukas Rusak  wrote:
> 
> This seems to have been missed in the move from autotools
> 
> This fixes the following build issue:
> 
> ../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
>  #include <X11/Xlib-xcb.h>
>           ^~~~
> ---
>  src/gallium/auxiliary/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/meson.build b/src/gallium/auxiliary/meson.build
> index 584cbe4509..857001e12c 100644
> --- a/src/gallium/auxiliary/meson.build
> +++ b/src/gallium/auxiliary/meson.build
> @@ -453,7 +453,7 @@ files_libgalliumvl = files(
>  )
> 
>  files_libgalliumvlwinsys = files('vl/vl_winsys.h')
> -if with_dri2
> +if with_dri2 and with_platform_x11
>    files_libgalliumvlwinsys += files('vl/vl_winsys_dri.c')
>    if with_dri3
>      files_libgalliumvlwinsys += files('vl/vl_winsys_dri3.c')
> --
> 2.17.0
> 
> 

Sorry about that, I've pushed this with my rb and a fixes tag, so it should be
in 18.1.3

Dylan




[Mesa-dev] [Bug 106774] GLSL IR copy propagates loads of SSBOs

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106774

Ian Romanick  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i...@freedesktop.org,
                   |                            |mic...@daenzer.net
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #18 from Ian Romanick  ---
This bug should be fixed by the commits listed below.

(In reply to Michel Dänzer from comment #10)
> Please don't add piglit tests which are expected to hang the GPU, or at
> least don't set them up to be run by default. Not all drivers can reliably
> recover from GPU hangs.

The updated test does not cause GPU hangs with these patches applied.  Are you
ok with the test landing in piglit master?



commit 37bd9ccd21b860d2b5ffea7e1f472ec83b68b43b (HEAD -> bug-106774, origin/master, origin/HEAD)
Author: Ian Romanick 
Date:   Tue Jun 5 16:02:25 2018 -0700

glsl: Don't copy propagate elements from SSBO or shared variables either

Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

The same shader was helped by this patch and the previous.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0

total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick 
Reviewed-by: Caio Marcelo de Oliveira Filho 
Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774

commit 461a5c899c08064467abb635536381a5a5659280
Author: Ian Romanick 
Date:   Tue Jun 5 15:04:24 2018 -0700

glsl: Don't copy propagate from SSBO or shared variables either

Since SSBOs can be written by other GPU threads, copy propagating a read
can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399120 -> 14399119 (<.01%)
instructions in affected programs: 684 -> 683 (-0.15%)
helped: 1
HURT: 0

total cycles in shared programs: 532978931 -> 532973113 (<.01%)
cycles in affected programs: 530484 -> 524666 (-1.10%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick 
Reviewed-by: Caio Marcelo de Oliveira Filho 
Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774



Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]

2018-06-14 Thread Keith Packard
Jason Ekstrand  writes:

>> +   if (instance->enabled_extensions.KHR_display) {
>> +  master_fd = open(path, O_RDWR | O_CLOEXEC);
>>
>
> Is this supposed to be opening primary_path instead?

Yes, and this section of code needs to occur before anv_init_wsi.

I appear to have skipped testing this path on ANV and only tested it on
RADV -- RADV has the code in the right order. Thanks for catching this;
sorry I messed up and didn't test it.

> This could just be
>
> if (anv_gem_get_param(master_fd, I915_PARAM_CHIPSET_ID) == 0) {
>close(master_fd);
>master_fd = -1;
> }
>
> No need to type out all that IOCTL stuff.

Thanks, that's lots shorter (RADV doesn't appear to have a similar helper).
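
The shape of the check being discussed is "open the node, probe it, and drop the fd again if the probe fails". A generic sketch of that pattern follows. The probe callback and open_probed_fd are hypothetical names used only for illustration; in the driver the probe would be the anv_gem_get_param call quoted above, where a result of 0 is treated as failure.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe callback: returns 0 when the fd is not usable, mirroring
 * how the quoted snippet treats a 0 result as failure.
 */
typedef int (*probe_fn)(int fd);

/* Open path and keep the fd only if the probe succeeds. Returns -1 when the
 * open fails or the probe rejects the fd, so callers can test for fd >= 0.
 */
static int
open_probed_fd(const char *path, probe_fn probe)
{
   int fd = open(path, O_RDWR | O_CLOEXEC);
   if (fd < 0)
      return -1;

   if (probe(fd) == 0) {
      close(fd);
      return -1;
   }
   return fd;
}

/* Stub probes for the usage sketch below. */
static int
probe_always_ok(int fd)
{
   (void) fd;
   return 1;
}

static int
probe_always_fail(int fd)
{
   (void) fd;
   return 0;
}

/* Usage sketch; /dev/null stands in for a device node. */
void
open_probed_fd_demo(void)
{
   assert(open_probed_fd("/dev/null", probe_always_fail) == -1);

   int fd = open_probed_fd("/dev/null", probe_always_ok);
   assert(fd >= 0);
   close(fd);
}
```

Returning -1 for both failure modes keeps the "if (fd >= 0) close(fd)" test at teardown valid without any extra state.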

Here's an amendment to the proposed patch which fixes the bug and
switches to the simpler detection method.

From f4dac824a2566367cc3c66e1eda27fe4aaf64543 Mon Sep 17 00:00:00 2001
From: Keith Packard 
Date: Thu, 14 Jun 2018 11:31:20 -0700
Subject: [PATCH] anv: Open DRM master before initializing WSI layer. Close on
 device_finish.

The DRM master FD is passed to the WSI layer during initialization, so
we need to open the device slightly earlier in the function.

Also, close the DRM master FD when the driver is being shut down.

v2:
	Use anv_gem_get_param to detect working master_fd

Signed-off-by: Keith Packard 
---
 src/intel/vulkan/anv_device.c | 34 --
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index b3c6d1a8722..3507a91810f 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -437,36 +437,32 @@ anv_physical_device_init(struct anv_physical_device *device,
if (result != VK_SUCCESS)
   goto fail;
 
-   result = anv_init_wsi(device);
-   if (result != VK_SUCCESS) {
-  ralloc_free(device->compiler);
-  goto fail;
-   }
-
-   anv_physical_device_get_supported_extensions(device,
-&device->supported_extensions);
-
if (instance->enabled_extensions.KHR_display) {
-  master_fd = open(path, O_RDWR | O_CLOEXEC);
+  master_fd = open(primary_path, O_RDWR | O_CLOEXEC);
   if (master_fd >= 0) {
  /* prod the device with a GETPARAM call which will fail if
   * we don't have permission to even render on this device
   */
- drm_i915_getparam_t gp;
- memset(&gp, '\0', sizeof(gp));
- int devid = 0;
- gp.param = I915_PARAM_CHIPSET_ID;
- gp.value = &devid;
- int ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
- if (ret < 0) {
+ if (anv_gem_get_param(master_fd, I915_PARAM_CHIPSET_ID) == 0) {
 close(master_fd);
 master_fd = -1;
  }
   }
}
+   device->master_fd = master_fd;
+
+   result = anv_init_wsi(device);
+   if (result != VK_SUCCESS) {
+  ralloc_free(device->compiler);
+  goto fail;
+   }
+
+   anv_physical_device_get_supported_extensions(device,
+&device->supported_extensions);
+
 
device->local_fd = fd;
-   device->master_fd = master_fd;
+
return VK_SUCCESS;
 
 fail:
@@ -482,6 +478,8 @@ anv_physical_device_finish(struct anv_physical_device *device)
anv_finish_wsi(device);
ralloc_free(device->compiler);
close(device->local_fd);
+   if (device->master_fd >= 0)
+  close(device->master_fd);
 }
 
 static void *
-- 
2.17.1


-- 
-keith




Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]

2018-06-14 Thread Keith Packard
Jason Ekstrand  writes:


>> Signed-off-by: Keith Packard 
>>
>> fixup
>>
>
> Did you mean to leave this in here?

Nope; just rebasing/squashing noise. I noticed this in passing and have
already removed it.

-- 
-keith




Re: [Mesa-dev] [PATCH] [RFC] i965/blit: bump some limits to 64k

2018-06-14 Thread Nanley Chery
On Thu, Jun 14, 2018 at 10:01:18AM -0700, Nanley Chery wrote:
> On Thu, Jun 14, 2018 at 04:18:30PM +0300, Martin Peres wrote:
> > This fixes screenshots using 8k+ wide display setups in modesetting.
> > 
> > Chris Wilson even recommended the changes in intel_mipmap_tree.c
> > should read 131072 instead of 65536, but I for sure got confused by
> > his explanation.
> > 
> > In any case, I would like to use this RFC patch as a forum to discuss
> > why the fallback path is broken[1], and as to what should be the
> > limits for HW-accelerated blits now that we got rid of the blitter
> > usage on recent platforms.
> > 
> 
> Hi,
> 
> My understanding is that the fallback path is broken because we silently
> ignore miptree_create_for_bo's request for a tiled miptree. This results
> in some parts of mesa treating the surface as tiled and other parts
> treating the surface as linear.
> 
> I couldn't come up with a piglit test for this when I was working on a
> fix. Please let me know if you can think of any.
> 
> I think what the limits should be depends on which mesa branch you're
> working off of.
> 
> * On the master branch of mesa, which has some commits which reduce the
>   dependence on the BLT engine, we can remove these limits by using BLORP.
>   As far as I can tell, BLORP can handle images as wide as the surface
>   pitch limit in the RENDER_SURFACE_STATE packet will allow.
> 
>   I sent out a series [a] a couple weeks ago that removes the limits
>   imposed by the hardware blitter.
> 
> * On the stable branch however, we can modify some incorrect code to set
>   the correct BLT limits (as Chris has suggested). The BLT engine's pitch
>   field is a signed 16bit integer, whose unit changes depending on the
>   tiling of the surface. For linear surfaces, it's in units of bytes and
>   for non-linear surfaces, it's in units of dwords. This translates to
>   2^15-1 bytes or (2^15-1) * 4 bytes respectively.
>   
>   I made a branch [b] which does this already, but I think my rebasing +
>   testing strategy for stable branches on the CI might be incorrect.
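
To make the arithmetic in the stable-branch bullet concrete, here is a small sketch of the limit as described there: a signed 16-bit pitch field counted in bytes for linear surfaces and in dwords for tiled ones. The helper names are hypothetical and this is not the check from intel_blit.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest pitch, in bytes, that a signed 16-bit pitch field can express.
 * Linear surfaces count bytes, tiled surfaces count dwords (4 bytes each).
 */
static uint32_t
blt_max_pitch_bytes(bool tiled)
{
   const uint32_t max_field = INT16_MAX; /* 2^15 - 1 */
   return tiled ? max_field * 4 : max_field;
}

/* Hypothetical predicate: can the blitter address a surface of this pitch? */
static bool
blt_pitch_fits(uint32_t pitch_bytes, bool tiled)
{
   return pitch_bytes <= blt_max_pitch_bytes(tiled);
}

/* Usage sketch. */
void
blt_pitch_demo(void)
{
   assert(blt_max_pitch_bytes(false) == 32767);
   assert(blt_max_pitch_bytes(true) == 131068);
   /* A 64k pitch is representable when tiled but not when linear. */
   assert(blt_pitch_fits(65536, true));
   assert(!blt_pitch_fits(65536, false));
}
```

Under these assumptions a single 65536 threshold is fine for tiled miptrees but lets through linear ones that the field cannot represent.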

I just rebased this branch onto master and tested it on the CI.
Everything passes except for SNB. I get 1 GPU hang and two test
failures:
* failure-gpu-hang-otc-gfxtest-snbgt1-01-snbm64.compile.error
* KHR-GL33.shaders.uniform_block.random.all_shared_buffer.3.snbm64
* dEQP-EGL.functional.color_clears.multi_context.gles3.rgba_pbuffer.snbm64

I'm not sure why this happens.

-Nanley

> 
> [a] https://patchwork.freedesktop.org/series/43971/
> [b] https://cgit.freedesktop.org/~nchery/mesa/log/?h=wip/stable/stop-retiling
> 
> > Tested-by: Martin Peres  # HSW
> > Cc: Chris Wilson 
> > 
> > [1] https://fs.mupuf.org/corruption_8k%2B.png
> > ---
> >  src/mesa/drivers/dri/i965/intel_blit.c| 2 +-
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
> > index 90784c5b1958..458f8bd42857 100644
> > --- a/src/mesa/drivers/dri/i965/intel_blit.c
> > +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> > @@ -403,7 +403,7 @@ emit_miptree_blit(struct brw_context *brw,
> >  * for linear surfaces and DWords for tiled surfaces.  So the maximum
> >  * pitch is 32k linear and 128k tiled.
> >  */
> > -   if (blt_pitch(src_mt) >= 32768 || blt_pitch(dst_mt) >= 32768) {
> > +   if (blt_pitch(src_mt) >= 65536 || blt_pitch(dst_mt) >= 65536) {
> 
> This is too large for linear miptrees.
> 
> >perf_debug("Falling back due to >= 32k/128k pitch\n");
> >return false;
> > }
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 6b89bf6848af..7347ea8b99d8 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -523,7 +523,7 @@ need_to_retile_as_linear(struct brw_context *brw, unsigned row_pitch,
> > if (row_pitch < 64)
> >return true;
> >  
> > -   if (ALIGN(row_pitch, 512) >= 32768) {
> > +   if (ALIGN(row_pitch, 512) >= 65536) {
> >perf_debug("row pitch %u too large to blit, falling back to untiled",
> >   row_pitch);
> >return true;
> > @@ -3583,7 +3583,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
> > unsigned int level, unsigned int slice)
> >  {
> > /* See intel_miptree_blit() for details on the 32k pitch limit. */
> > -   if (mt->surf.row_pitch >= 32768)
> > +   if (mt->surf.row_pitch >= 65536)
> 
> This is also too large for linear miptrees.
> 
> -Nanley
> 
> >  
> > return true;
> > -- 
> > 2.17.1
> > 

Re: [Mesa-dev] [PATCH] [RFC] i965/blit: bump some limits to 64k

2018-06-14 Thread Chris Wilson
Quoting Nanley Chery (2018-06-14 19:46:09)
> On Thu, Jun 14, 2018 at 10:01:18AM -0700, Nanley Chery wrote:
> > On Thu, Jun 14, 2018 at 04:18:30PM +0300, Martin Peres wrote:
> > > This fixes screenshots using 8k+ wide display setups in modesetting.
> > > 
> > > Chris Wilson even recommended the changes in intel_mipmap_tree.c
> > > should read 131072 instead of 65536, but I for sure got confused by
> > > his explanation.
> > > 
> > > In any case, I would like to use this RFC patch as a forum to discuss
> > > why the fallback path is broken[1], and as to what should be the
> > > limits for HW-accelerated blits now that we got rid of the blitter
> > > usage on recent platforms.
> > > 
> > 
> > Hi,
> > 
> > My understanding is that the fallback path is broken because we silently
> > ignore miptree_create_for_bo's request for a tiled miptree. This results
> > in some parts of mesa treating the surface as tiled and other parts
> > treating the surface as linear.
> > 
> > I couldn't come up with a piglit test for this when I was working on a
> > fix. Please let me know if you can think of any.
> > 
> > I think what the limits should be depends on which mesa branch you're
> > working off of.
> > 
> > * On the master branch of mesa, which has some commits which reduce the
> >   dependence on the BLT engine, we can remove these limits by using BLORP.
> >   As far as I can tell, BLORP can handle images as wide as the surface
> >   pitch limit in the RENDER_SURFACE_STATE packet will allow.
> > 
> >   I sent out a series [a] a couple weeks ago that removes the limits
> >   imposed by the hardware blitter.
> > 
> > * On the stable branch however, we can modify some incorrect code to set
> >   the correct BLT limits (as Chris has suggested). The BLT engine's pitch
> >   field is a signed 16bit integer, whose unit changes depending on the
> >   tiling of the surface. For linear surfaces, it's in units of bytes and
> >   for non-linear surfaces, it's in units of dwords. This translates to
> >   2^15-1 bytes or (2^15-1) * 4 bytes respectively.
> >   
> >   I made a branch [b] which does this already, but I think my rebasing +
> >   testing strategy for stable branches on the CI might be incorrect.
> 
> I just rebased this branch onto master and tested it on the CI.
> Everything passes except for SNB. I get 1 GPU hang and two test
> failures:
> * failure-gpu-hang-otc-gfxtest-snbgt1-01-snbm64.compile.error
> * KHR-GL33.shaders.uniform_block.random.all_shared_buffer.3.snbm64
> * dEQP-EGL.functional.color_clears.multi_context.gles3.rgba_pbuffer.snbm64

What are the command lines? Assuming piglit, which the last one doesn't
appear to be, I can try, see what happens, and see if I can be of
assistance.
-Chris


Re: [Mesa-dev] [PATCH mesa 7/9] vulkan: Add EXT_acquire_xlib_display [v3]

2018-06-14 Thread Keith Packard
Jason Ekstrand  writes:

>> Signed-off-by: Keith Packard 
>>
>> fixup for acquire
>>
>> fixup for RROutput type
>>
>> Signed-off-by: Keith Packard 
>>
>> fixup
>>
>
> Lots of "fixup".  Did you mean to actually comment on what that was?

Sorry; I was squashing patches and moving comments into the main message
and just left this noise below.

>> +static bool
>> +wsi_display_mode_matches_x(struct wsi_display_mode *wsi,
>> +   xcb_randr_mode_info_t *xcb)
>> +{
>> +   return wsi->clock == (xcb->dot_clock + 500) / 1000 &&
>> +  wsi->hdisplay == xcb->width &&
>> +  wsi->hsync_start == xcb->hsync_start &&
>> +  wsi->hsync_end == xcb->hsync_end &&
>> +  wsi->htotal == xcb->htotal &&
>> +  wsi->hskew == xcb->hskew &&
>> +  wsi->vdisplay == xcb->height &&
>> +  wsi->vsync_start == xcb->vsync_start &&
>> +  wsi->vsync_end == xcb->vsync_end &&
>> +  wsi->vtotal == xcb->vtotal &&
>>
>
> You're not checking vscan here.

Yeah, I'm really unsure what vscan means exactly. X only has the
DOUBLE_SCAN flag, while vscan appears more flexible. DRM drivers appear
to use vscan == 0 to mean the same thing as single scan mode, which
seems like it is also covered by vscan == 1. I think what I probably
need is a function which returns the effective vscan value for both X
and DRM modes and then compare those. Maybe something like:

static int
wsi_display_drm_vscan(drmModeModeInfoPtr drm)
{
   if (drm->vscan > 1)
      return drm->vscan;
   return 1;
}

static int
wsi_display_x_vscan(xcb_randr_mode_info_t *x_mode)
{
   if (x_mode->mode_flags & XCB_RANDR_MODE_FLAG_DOUBLE_SCAN)
      return 2;
   return 1;
}

If these look reasonable, then I could use them as appropriate and the
values should all compare correctly.
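
As a sketch of how the two helpers might slot into the mode comparison, the following uses stand-in structs so it builds without the DRM or XCB headers. The toy_* types, the flag constant and the reduced field set are assumptions for illustration; only the effective-vscan idea and the dot clock rounding come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the RandR double-scan mode flag; the value is illustrative. */
#define TOY_X_MODE_FLAG_DOUBLE_SCAN (1u << 5)

/* Stand-in for a DRM mode: clock in kHz, vscan of 0 or 1 means single scan. */
struct toy_drm_mode {
   uint32_t clock;
   uint32_t vscan;
};

/* Stand-in for a RandR mode: dot clock in Hz, double scan is a flag. */
struct toy_x_mode {
   uint32_t dot_clock;
   uint32_t mode_flags;
};

/* Effective vscan of a DRM-style mode: 0 and 1 both mean single scan. */
static uint32_t
toy_drm_vscan(const struct toy_drm_mode *drm)
{
   return drm->vscan > 1 ? drm->vscan : 1;
}

/* Effective vscan of an X-style mode: the flag is the only information. */
static uint32_t
toy_x_vscan(const struct toy_x_mode *x)
{
   return (x->mode_flags & TOY_X_MODE_FLAG_DOUBLE_SCAN) ? 2 : 1;
}

/* Compare on normalized values: Hz rounded to the nearest kHz, and effective
 * vscan instead of the raw fields.
 */
static bool
toy_mode_matches(const struct toy_drm_mode *drm, const struct toy_x_mode *x)
{
   return drm->clock == (x->dot_clock + 500) / 1000 &&
          toy_drm_vscan(drm) == toy_x_vscan(x);
}

/* Usage sketch with made-up timings. */
void
toy_mode_matches_demo(void)
{
   struct toy_drm_mode drm = { .clock = 148500, .vscan = 0 };
   struct toy_x_mode x = { .dot_clock = 148499600, .mode_flags = 0 };

   assert(toy_mode_matches(&drm, &x)); /* vscan 0 counts as single scan */

   x.mode_flags = TOY_X_MODE_FLAG_DOUBLE_SCAN;
   assert(!toy_mode_matches(&drm, &x));

   drm.vscan = 2;
   assert(toy_mode_matches(&drm, &x));
}
```

Normalizing both sides first means the raw vscan values 0 and 1 no longer cause a spurious mismatch against a mode that simply lacks the double-scan flag.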

> Why are you fetching these here and not lower down?  The only uses of them
> inside the "if (!connector)" is to free them.  Seems to be a bit of a
> waste.

Good point. I've moved them below that block, just above the code which
uses their values.

>> +   /* XXX no support for multiple leases yet */
>> +   if (wsi->fd >= 0)
>> +  return VK_ERROR_OUT_OF_DATE_KHR;
>>
>
> This function is supposed to return either VK_SUCCESS or
> VK_ERROR_INITIALIZATION_FAILED.  The errors here and below should probably
> all be VK_ERROR_INITIALIZATION_FAILED.

Thanks. Vulkan error values remain largely a mystery to me. I've changed
the values.

> Everything else looks good to me.

Awesome!

Here's an updated version of this patch:

From 3a47d4543a054a9a1d2333ea311c9f5c057d5e9f Mon Sep 17 00:00:00 2001
From: Keith Packard 
Date: Fri, 9 Feb 2018 07:45:58 -0800
Subject: [PATCH 7/9] vulkan: Add EXT_acquire_xlib_display [v4]

This extension adds the ability to borrow an X RandR output for
temporary use directly by a Vulkan application. For DRM, we use the
Linux resource leasing mechanism.

v2:
	Clean up xlib_lease detection

	* Use separate temporary '_xlib_lease' variable to hold the
	  option value to avoid changing the type of a variable.

	* Use boolean expressions instead of additional if statements
	  to compute resulting with_xlib_lease value.

	* Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to
	  vulkan_wsi_args

	  Suggested-by: Eric Engestrom 

	Move mode list from wsi_display to wsi_display_connector

	Fix scope for wsi_display_mode and wsi_display_connector allocs

	  Suggested-by: Jason Ekstrand 

v3:
	Adopt Jason Ekstrand's coding conventions

	Declare variables at first use, eliminate extra whitespace
	between types and names. Wrap lines to 80 columns.

	Explicitly forbid multiple DRM leases. Making the code support
	this looks tricky and will require additional thought.

	Use xcb_randr_output_t throughout the internals of the
	implementation. Convert at the public API
	(wsi_get_randr_output_display).

	Clean up check for usable active_crtc (possible when only the
	desired output is connected to the crtc).

	Suggested-by: Jason Ekstrand 

v4:
	Move output resource fetching closer to use in
	wsi_display_get_output. This simplifies the error returns in
	earlier parts of the code a bit.

	Return VK_ERROR_INITIALIZATION_FAILED from
	wsi_acquire_xlib_display. Jason says this is the right error
	message.

	Suggested-by: Jason Ekstrand 

Signed-off-by: Keith Packard 
---
 configure.ac|  32 ++
 meson.build |  11 +
 meson_options.txt   |   7 +
 src/vulkan/Makefile.am  |   5 +
 src/vulkan/wsi/meson.build  |   5 +
 src/vulkan/wsi/wsi_common_display.c | 489 
 src/vulkan/wsi/wsi_common_display.h |  17 +
 7 files changed, 566 insertions(+)

diff --git a/configure.ac b/configure.ac
index 75ee1a7c01c..e01d0399681 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1577,6 +1577,7 @@ AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = x
 AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
 AM_CONDITIONAL(HAV

Re: [Mesa-dev] [PATCH v5] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-06-14 Thread Nanley Chery
On Thu, Jun 07, 2018 at 09:34:41AM +0300, Eleni Maria Stea wrote:
> Gen 7 GPUs store the compressed EAC/ETC2 images in other non-compressed
> formats that can render. When GetCompressed* functions are called, the
> pixels are returned in the non-compressed format that is used for the
> rendering.
> 
> With this patch we store both the compressed and non-compressed versions
> of the image, so that both rendering commands and GetCompressed*
> commands work.
> 
> Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
> in intel_miptree_map_etc function have been removed because when the
> miptree is mapped for reading (for example from a GetCompress*
> function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set).
> 
> Fixes: the following test in CTS for gen7:
> KHR-GL45.direct_state_access.textures_compressed_subimage test
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843
> 
> v2: fixes issues:
>a) initialized uninitialized variables (Juan A. Suarez, Andres Gomez)
>b) fixed race condition where mt and cmt were mapped at the same time
>c) fixed indentation issues (Andres Gomez)
> v3: adds bugzilla bug with id: 104272
> v4: adds bugzilla bug with id: 81843
> v5: replaced the flags with a bitfield, refactoring (Kenneth Graunke)

+Jason, Ken

Hello,

I recently did some miptree work relating to the r8stencil_mt and I
think I now have a more informed opinion about how things should be
structured. I'd like to propose an alternative solution.

I had initially thought we should have a separate miptree to hold the
compressed data, like this patch does, but now I think we should
actually have the compressed data be the main miptree and to store the
decompressed miptree as part of the main one. The reasoning is that we
could reuse this structure to handle the r8stencil workaround and to
eventually handle the ASTC_LDR surfaces that are modified on gen9.

I'm proposing something like the following:

1. Rename r8stencil_mt -> shadow_mt and
   r8stencil_needs_update -> shadow_needs_update.
2. Make shadow_mt hold the decompressed ETC miptree
3. Update shadow_needs_update whenever the main mt is modified
4. Add a function to update the shadow_mt using the main mt as a source
5. Sample from the shadow_mt as appropriate
6. Make the main miptree hold the compressed data
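
Very roughly, the flow I have in mind looks like the sketch below. The
types and names are made-up stand-ins (not the real intel_mipmap_tree
code), a single int plays the role of the buffer contents, and
sketch_decompress() is only a placeholder for the real conversion:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct intel_mipmap_tree. */
struct sketch_miptree {
   int data;
   struct sketch_miptree *shadow_mt;   /* decompressed copy, or NULL */
   bool shadow_needs_update;
};

/* Placeholder for the real ETC -> uncompressed conversion. */
static int
sketch_decompress(int compressed)
{
   return compressed * 2;
}

/* Step 3: every write to the main miptree marks the shadow as stale. */
static void
sketch_miptree_write(struct sketch_miptree *mt, int value)
{
   mt->data = value;
   if (mt->shadow_mt != NULL)
      mt->shadow_needs_update = true;
}

/* Step 4: refresh the shadow using the main miptree as the source. */
static void
sketch_miptree_update_shadow(struct sketch_miptree *mt)
{
   if (mt->shadow_mt == NULL || !mt->shadow_needs_update)
      return;

   mt->shadow_mt->data = sketch_decompress(mt->data);
   mt->shadow_needs_update = false;
}

/* Step 5: sample from the shadow when there is one. */
static const struct sketch_miptree *
sketch_miptree_for_sampling(struct sketch_miptree *mt)
{
   if (mt->shadow_mt == NULL)
      return mt;

   sketch_miptree_update_shadow(mt);
   return mt->shadow_mt;
}
```

The point of the design is that the compressed data stays authoritative
in the main miptree, and the shadow is only regenerated lazily when
something actually needs to sample from it.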

This method should also be able to handle the CopyImage functions. What
do you all think?

-Nanley

> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  10 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  10 ++
>  src/mesa/drivers/dri/i965/intel_tex.c | 106 +++---
>  src/mesa/drivers/dri/i965/intel_tex.h |   1 -
>  src/mesa/drivers/dri/i965/intel_tex_image.c   |  46 +++-
>  src/mesa/drivers/dri/i965/intel_tex_obj.h |   8 ++
>  src/mesa/main/texstore.c  |  51 ++---
>  src/mesa/main/texstore.h  |   8 +-
>  8 files changed, 197 insertions(+), 43 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 7d1fa96b91..cc807977de 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -733,9 +733,10 @@ miptree_create(struct brw_context *brw,
> mesa_format etc_format = MESA_FORMAT_NONE;
> uint32_t alloc_flags = 0;
>  
> -   format = intel_lower_compressed_format(brw, format);
> -
> -   etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> +   if (!(flags & MIPTREE_CREATE_ETC)) {
> +  format = intel_lower_compressed_format(brw, format);
> +  etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> +   }
>  
> if (flags & MIPTREE_CREATE_BUSY)
>alloc_flags |= BO_ALLOC_BUSY;
> @@ -3372,9 +3373,6 @@ intel_miptree_map_etc(struct brw_context *brw,
>assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
> }
>  
> -   assert(map->mode & GL_MAP_WRITE_BIT);
> -   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
> -
> intel_miptree_access_raw(brw, mt, level, slice, true);
>  
> map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index 42f73ba1f9..9e7a401229 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -74,6 +74,7 @@ struct intel_texture_image;
>   * without transcoding back.  This flag to intel_miptree_map() gets you that.
>   */
>  #define BRW_MAP_DIRECT_BIT   0x8000
> +#define BRW_MAP_ETC_BIT  0x4000
>  
>  struct intel_miptree_map {
> /** Bitfield of GL_MAP_*_BIT and BRW_MAP_*_BIT. */
> @@ -380,6 +381,15 @@ enum intel_miptree_create_flags {
>  * that the miptree will be created with mt->aux_usage == NONE.
>  */
> MIPTREE_CREATE_NO_AUX   = 

Re: [Mesa-dev] [PATCH v5] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-06-14 Thread Eleni Maria Stea
On 06/14/2018 10:27 PM, Nanley Chery wrote:

> +Jason, Ken
> 
> Hello,
> 
> I recently did some miptree work relating to the r8stencil_mt and I
> think I now have a more informed opinion about how things should be
> structured. I'd like to propose an alternative solution.
> 
> I had initially thought we should have a separate miptree to hold the
> compressed data, like this patch does, but now I think we should
> actually have the compressed data be the main miptree and to store the
> decompressed miptree as part of the main one. The reasoning is that we
> could reuse this structure to handle the r8stencil workaround and to
> eventually handle the ASTC_LDR surfaces that are modified on gen9.
> 
> I'm proposing something like the following:
> 
> 1. Rename r8stencil_mt -> shadow_mt and
>    r8stencil_needs_update -> shadow_needs_update.
> 2. Make shadow_mt hold the decompressed ETC miptree
> 3. Update shadow_needs_update whenever the main mt is modified
> 4. Add a function to update the shadow_mt using the main mt as a source
> 5. Sample from the shadow_mt as appropriate
> 6. Make the main miptree hold the compressed data
> 
> This method should also be able to handle the CopyImage functions. What
> do you all think?
> 
> -Nanley

Hi Nanley,

Thank you for your reply. I wasn't aware that there are other cases where
we might need to store a second image. I agree that it's more reasonable
to use one general-purpose miptree that is accessible from different
parts of the i965 code for such cases, instead of storing miptrees in
different places for different hacks when a feature is not supported.

I will look for your patch and also look through the Mesa code to see
how easy this fix would be (and which parts of the code it might
affect). If everyone agrees that this is a good idea, I will modify this
patch according to your suggestions.

BR :)
Eleni
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]

2018-06-14 Thread Jason Ekstrand
On Thu, Jun 14, 2018 at 11:42 AM, Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> >> +   if (instance->enabled_extensions.KHR_display) {
> >> +  master_fd = open(path, O_RDWR | O_CLOEXEC);
> >>
> >
> > Is this supposed to be opening primary_path instead?
>
> Yes, and this section of code needs to occur before anv_init_wsi.
>
> I appear to have skipped testing this path on ANV and only tested it on
> RADV -- RADV has the code in the right order. Thanks for catching this;
> sorry I messed up and didn't test it.
>
> > This could just be
> >
> > if (anv_gem_get_param(master_fd, I915_PARAM_CHIPSET_ID) == 0) {
> >close(master_fd);
> >master_fd = -1;
> > }
> >
> > No need to type out all that IOCTL stuff.
>
> Thanks, that's lots shorter (RADV doesn't appear to have a similar helper).
>
> Here's an amendment to the proposed patch which fixes the bug and
> switches to the simpler detection method.
>

Looks good to me.  With that,

Reviewed-by: Jason Ekstrand 


[Mesa-dev] [PATCH 1/4] mesa: add header for share bptc decompress functions

2018-06-14 Thread Denis Pauk
Make functions public:
* fetch_rgba_unorm_from_block
* fetch_rgb_float_from_block
* compress_rgba_unorm
* compress_rgb_float

Functions will be reused in gallium/auxiliary code.
---
 src/mesa/Makefile.sources  |  1 +
 src/mesa/main/texcompress_bptc.c   |  9 ++---
 src/mesa/main/texcompress_bptc_share.h | 47 ++
 3 files changed, 53 insertions(+), 4 deletions(-)
 create mode 100644 src/mesa/main/texcompress_bptc_share.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 00aba0a2f7..d644112e6a 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -216,6 +216,7 @@ MAIN_FILES = \
main/texcompress.c \
main/texcompress_bptc.c \
main/texcompress_bptc.h \
+   main/texcompress_bptc_share.h \
main/texcompress_cpal.c \
main/texcompress_cpal.h \
main/texcompress_etc.c \
diff --git a/src/mesa/main/texcompress_bptc.c b/src/mesa/main/texcompress_bptc.c
index fd37be97f3..6cfd9aece7 100644
--- a/src/mesa/main/texcompress_bptc.c
+++ b/src/mesa/main/texcompress_bptc.c
@@ -29,6 +29,7 @@
 #include 
 #include "texcompress.h"
 #include "texcompress_bptc.h"
+#include "texcompress_bptc_share.h"
 #include "util/format_srgb.h"
 #include "util/half_float.h"
 #include "texstore.h"
@@ -535,7 +536,7 @@ apply_rotation(int rotation,
result[3] = t;
 }
 
-static void
+void
 fetch_rgba_unorm_from_block(const uint8_t *block,
 uint8_t *result,
 int texel)
@@ -840,7 +841,7 @@ finish_signed_unquantize(int32_t value)
   return value * 31 / 32;
 }
 
-static void
+void
 fetch_rgb_float_from_block(const uint8_t *block,
float *result,
int texel,
@@ -1247,7 +1248,7 @@ compress_rgba_unorm_block(int src_width, int src_height,
  endpoints);
 }
 
-static void
+void
 compress_rgba_unorm(int width, int height,
 const uint8_t *src, int src_rowstride,
 uint8_t *dst, int dst_rowstride)
@@ -1555,7 +1556,7 @@ compress_rgb_float_block(int src_width, int src_height,
endpoints);
 }
 
-static void
+void
 compress_rgb_float(int width, int height,
const float *src, int src_rowstride,
uint8_t *dst, int dst_rowstride,
diff --git a/src/mesa/main/texcompress_bptc_share.h b/src/mesa/main/texcompress_bptc_share.h
new file mode 100644
index 00..bf885ef038
--- /dev/null
+++ b/src/mesa/main/texcompress_bptc_share.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef TEXCOMPRESS_BPTC_SHARE_H
+#define TEXCOMPRESS_BPTC_SHARE_H
+
+void
+fetch_rgba_unorm_from_block(const uint8_t *block,
+uint8_t *result,
+int texel);
+void
+fetch_rgb_float_from_block(const uint8_t *block,
+   float *result,
+   int texel,
+   bool is_signed);
+
+void
+compress_rgb_float(int width, int height,
+   const float *src, int src_rowstride,
+   uint8_t *dst, int dst_rowstride,
+   bool is_signed);
+
+void
+compress_rgba_unorm(int width, int height,
+const uint8_t *src, int src_rowstride,
+uint8_t *dst, int dst_rowstride);
+#endif
-- 
2.17.1



[Mesa-dev] Add support GL_ARB_texture_compression_bptc in llvmpipe and softpipe.

2018-06-14 Thread Denis Pauk
Add code to reuse the BPTC decode logic from mesa/main/texcompress_bptc.c
by making several functions public (non-static) and declaring them in
texcompress_bptc_share.h.

I have made minimal changes to the code and have not attempted any
performance improvements. The code decodes the image pixel by pixel
instead of decoding a full 4x4 block at a time. The compress code
compresses the full image at once (reusing the compress functions from
texcompress_bptc).
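
For reference, the per-pixel lookup boils down to the addressing below.
This is only a sketch: the sketch_ names are made up for illustration,
and it assumes src_stride is the byte size of one row of 4x4 blocks, with
each BPTC block taking 16 bytes:

```c
#include <stddef.h>

#define SKETCH_BLOCK_DIM   4    /* a BPTC block covers 4x4 texels */
#define SKETCH_BLOCK_BYTES 16   /* and is stored in 128 bits */

/* Byte offset of the block that contains texel (x, y). */
static size_t
sketch_block_offset(unsigned x, unsigned y, size_t src_stride)
{
   return (size_t)(y / SKETCH_BLOCK_DIM) * src_stride +
          (size_t)(x / SKETCH_BLOCK_DIM) * SKETCH_BLOCK_BYTES;
}

/* Index of texel (x, y) inside its block, in row-major order (0..15). */
static unsigned
sketch_texel_index(unsigned x, unsigned y)
{
   return (x % SKETCH_BLOCK_DIM) + (y % SKETCH_BLOCK_DIM) * SKETCH_BLOCK_DIM;
}
```

Every pixel recomputes its block pointer and texel index and then calls
the fetch function, so each block header is parsed up to 16 times. That
is the performance cost mentioned above.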

Checked on x86_64 by: 
* LIBGL_ALWAYS_SOFTWARE=true GALLIUM_DRIVER={llvmpipe,softpipe}
* piglit/bin/bptc-float-modes
* piglit/bin/bptc-modes
* piglit/bin/compressedteximage GL_COMPRESSED_RGBA_BPTC_UNORM
* piglit/bin/compressedteximage GL_COMPRESSED_SRGB_ALPHA_BPTC_UNORM
* piglit/bin/compressedteximage GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT
* piglit/bin/compressedteximage GL_COMPRESSED_RGB_BPTC_SIGNED_FLOAT

Could you please review?

Best regards,
 Denis.




[Mesa-dev] [PATCH 2/4] gallium/auxiliary: Add helper support for bptc format compress/decompress

2018-06-14 Thread Denis Pauk
Reuse code shared with mesa/main/texcompress_bptc.
---
 src/gallium/auxiliary/Makefile.sources   |   2 +
 src/gallium/auxiliary/meson.build|   2 +
 src/gallium/auxiliary/util/u_format_bptc.c   | 322 +++
 src/gallium/auxiliary/util/u_format_bptc.h   | 122 +++
 src/gallium/auxiliary/util/u_format_table.py |   3 +-
 src/gallium/auxiliary/util/u_tile.c  |   1 +
 6 files changed, 451 insertions(+), 1 deletion(-)
 create mode 100644 src/gallium/auxiliary/util/u_format_bptc.c
 create mode 100644 src/gallium/auxiliary/util/u_format_bptc.h

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 066746f2d0..626cde123a 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -256,6 +256,8 @@ C_SOURCES := \
util/u_fifo.h \
util/u_format.c \
util/u_format.h \
+   util/u_format_bptc.c \
+   util/u_format_bptc.h \
util/u_format_etc.c \
util/u_format_etc.h \
util/u_format_latc.c \
diff --git a/src/gallium/auxiliary/meson.build b/src/gallium/auxiliary/meson.build
index 92cfb8f7af..31b75c7207 100644
--- a/src/gallium/auxiliary/meson.build
+++ b/src/gallium/auxiliary/meson.build
@@ -276,6 +276,8 @@ files_libgallium = files(
   'util/u_fifo.h',
   'util/u_format.c',
   'util/u_format.h',
+  'util/u_format_bptc.c',
+  'util/u_format_bptc.h',
   'util/u_format_etc.c',
   'util/u_format_etc.h',
   'util/u_format_latc.c',
diff --git a/src/gallium/auxiliary/util/u_format_bptc.c b/src/gallium/auxiliary/util/u_format_bptc.c
new file mode 100644
index 00..d968f766a4
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_format_bptc.c
@@ -0,0 +1,322 @@
+/**
+ *
+ * Copyright (C) 1999-2007  Brian Paul   All Rights Reserved.
+ * Copyright (c) 2008 VMware, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+#include "u_math.h"
+#include "u_format.h"
+#include "u_format_bptc.h"
+#include "util/format_srgb.h"
+#include "../../../mesa/main/texcompress_bptc_share.h"
+#include 
+
+static void
+rgb_float_unpack_rgba_float_with_sign(float *dst_row, unsigned dst_stride,
+  const uint8_t *src_row, unsigned src_stride,
+  unsigned width, unsigned height, bool is_signed)
+{
+   unsigned x, y;
+
+   for(y = 0; y < height; y += 1) {
+  float *dst = dst_row;
+  const uint8_t *src = src_row;
+
+  for(x = 0; x < width; x += 1) {
+ const uint8_t *block;
+ block = src + (x / 4) * 16;
+ fetch_rgb_float_from_block(block, dst, (x % 4) + (y % 4) * 4, is_signed);
+ dst += 4;
+  }
+  if (y % 4 == 3) {
+ src_row += src_stride;
+  }
+  dst_row += dst_stride / sizeof(*dst_row);
+   }
+}
+
+static void
+rgba_unorm_unpack_rgba_int8(uint8_t *dst_row, unsigned dst_stride,
+const uint8_t *src_row, unsigned src_stride,
+unsigned width, unsigned height)
+{
+   unsigned x, y;
+
+   for(y = 0; y < height; y += 1) {
+  uint8_t *dst = dst_row;
+  const uint8_t *src = src_row;
+
+  for(x = 0; x < width; x += 1) {
+ const uint8_t *block;
+ block = src + (x / 4) * 16;
+ fetch_rgba_unorm_from_block(block, dst, (x % 4) + (y % 4) * 4);
+ dst += 4;
+  }
+  if (y % 4 == 3) {
+ src_row += src_stride;
+  }
+  dst_row += (dst_stride * 1)/sizeof(*dst_row);
+   }
+}
+
+void
+util_format_bptc_rgba_unorm_unpack_rgba_8unorm(uint8_t *dst_row, unsigned dst_stride,
+   const uint8_t *src_row, unsigned src_stride,
+   unsigned width, unsigned height)
+{
+  rgba_unorm_unpack_r

[Mesa-dev] [PATCH 4/4] gallium/llvmpipe: Enable support bptc format.

2018-06-14 Thread Denis Pauk
---
 src/gallium/drivers/llvmpipe/lp_screen.c  | 3 +--
 src/gallium/drivers/llvmpipe/lp_test_format.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index f12ad09298..c1a2fd3379 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -533,8 +533,7 @@ llvmpipe_is_format_supported( struct pipe_screen *_screen,
   }
}
 
-   if (format_desc->layout == UTIL_FORMAT_LAYOUT_BPTC ||
-   format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
+   if (format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
   /* Software decoding is not hooked up. */
   return FALSE;
}
diff --git a/src/gallium/drivers/llvmpipe/lp_test_format.c b/src/gallium/drivers/llvmpipe/lp_test_format.c
index e9a6e01fdc..a8aa33d8ae 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_format.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_format.c
@@ -388,8 +388,7 @@ test_all(unsigned verbose, FILE *fp)
   }
 
   /* missing fetch funcs */
-  if (format_desc->layout == UTIL_FORMAT_LAYOUT_BPTC ||
-  format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
+  if (format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
  continue;
   }
 
-- 
2.17.1



[Mesa-dev] [PATCH 3/4] gallium/softpipe: Enable support bptc format.

2018-06-14 Thread Denis Pauk
---
 src/gallium/drivers/softpipe/sp_screen.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_screen.c b/src/gallium/drivers/softpipe/sp_screen.c
index f9c916d938..676cd0812a 100644
--- a/src/gallium/drivers/softpipe/sp_screen.c
+++ b/src/gallium/drivers/softpipe/sp_screen.c
@@ -440,8 +440,7 @@ softpipe_is_format_supported( struct pipe_screen *screen,
  return FALSE;
}
 
-   if (format_desc->layout == UTIL_FORMAT_LAYOUT_BPTC ||
-   format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
+   if (format_desc->layout == UTIL_FORMAT_LAYOUT_ASTC) {
   /* Software decoding is not hooked up. */
   return FALSE;
}
-- 
2.17.1



Re: [Mesa-dev] [PATCH] [RFC] i965/blit: bump some limits to 64k

2018-06-14 Thread Ian Romanick
On 06/14/2018 06:18 AM, Martin Peres wrote:
> This fixes screenshots using 8k+ wide display setups in modesetting.
> 
> Chris Wilson even recommended the changes in intel_mipmap_tree.c
> should read 131072 instead of 65535, but I for sure got confused by
> his explanation.
> 
> In any case, I would like to use this RFC patch as a forum to discuss
> why the fallback path is broken[1], and as to what should be the
> limits for HW-accelerated blits now that we got rid of the blitter
> usage on recent platforms.
> 
> Tested-by: Martin Peres  # HSW
> Cc: Chris Wilson 
> 
> [1] https://fs.mupuf.org/corruption_8k%2B.png
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c| 2 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
> index 90784c5b1958..458f8bd42857 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -403,7 +403,7 @@ emit_miptree_blit(struct brw_context *brw,
>  * for linear surfaces and DWords for tiled surfaces.  So the maximum
>  * pitch is 32k linear and 128k tiled.
>  */
> -   if (blt_pitch(src_mt) >= 32768 || blt_pitch(dst_mt) >= 32768) {
> +   if (blt_pitch(src_mt) >= 65536 || blt_pitch(dst_mt) >= 65536) {
>perf_debug("Falling back due to >= 32k/128k pitch\n");

Should this message be updated?

>return false;
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 6b89bf6848af..7347ea8b99d8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -523,7 +523,7 @@ need_to_retile_as_linear(struct brw_context *brw, unsigned row_pitch,
> if (row_pitch < 64)
>return true;
>  
> -   if (ALIGN(row_pitch, 512) >= 32768) {
> +   if (ALIGN(row_pitch, 512) >= 65536) {
>perf_debug("row pitch %u too large to blit, falling back to untiled",
>   row_pitch);
>return true;
> @@ -3583,7 +3583,7 @@ can_blit_slice(struct intel_mipmap_tree *mt,
> unsigned int level, unsigned int slice)
>  {
> /* See intel_miptree_blit() for details on the 32k pitch limit. */
> -   if (mt->surf.row_pitch >= 32768)
> +   if (mt->surf.row_pitch >= 65536)
>return false;
>  
> return true;
> 



Re: [Mesa-dev] [PATCH 44/48] meson: add windows specific linker flags

2018-06-14 Thread Jose Fonseca

On 12/06/18 17:50, Dylan Baker wrote:

Quoting Eric Engestrom (2018-06-12 04:38:04)

On Monday, 2018-06-11 15:56:11 -0700, Dylan Baker wrote:

---
  meson.build | 21 +
  1 file changed, 21 insertions(+)

diff --git a/meson.build b/meson.build
index a244694fd4a..e1b3afbe688 100644
--- a/meson.build
+++ b/meson.build
@@ -847,6 +847,27 @@ else
endforeach
  endif
  
+# set linker arguments

+if host_machine.system() == 'windows'
+  if cc.get_id() == 'msvc'
+add_project_link_arguments(
+  '/fixed:no',
+  '/incremental:no',
+  '/dynamicbase',
+  '/nxcompat',
+  language : ['c', 'cpp'],
+)
+  else
+add_project_link_arguments(
+  '-Wl,--nxcompat',
+  '-Wl,--dynamicbase',
+  '-static-libgcc',
+  '-static-libstdc++',
+  language : ['c', 'cpp'],


probably harmless, but it feels like libgcc and libstdc++ should be
only added to c, respectively cpp, not both.


cpp needs -static-libgcc if the target has both C and C++ code, right?

I copied this from scons/gallium.py.

Brian or Jose, I don't know what the right thing to do is here. Do either of
you know?

Dylan



I'm not entirely sure.

The thing is, one should always use /usr/bin/c++ for linking whenever 
there's a C++ dependency, even if the main program is just C.


I'd go for both, just in case.  At any rate, I doubt it harms.

Jose


[Mesa-dev] [Bug 106922] Tangrams demo: LLVM ERROR: Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106922

Bug ID: 106922
   Summary: Tangrams demo: LLVM ERROR: Cannot select: 0x7e8d8750:
i16 = bitcast 0x7e8d8af8
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: haa...@frickel.club
QA Contact: mesa-dev@lists.freedesktop.org

RX 480
mesa radv 41dabdc4753
llvm 7svn 165520

The demo is for windows only and I think closed source but it runs well with
the most recent amdvlk. On radv it crashes.
You can get it here: http://www.pouet.net/prod.php?which=76464

Make a 64 bit wine prefix, run wine64 Tangrams.exe, click demo and it
immediately crashes with this message:

LLVM ERROR: Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8
  0x7e8d8af8: f32,ch = BUFFER_LOAD<(dereferenceable load 4 from custom
TargetCustom7, align 1, addrspace 4)> 0x7d433f78, 0x7e8d7430, Constant:i32<0>,
0x7e8d7638, Constant:i1<0>, Constant:i1<0>
0x7e8d7430: v4i32,ch = load<(dereferenceable invariant load 16 from %ir.56,
addrspace 6)> 0x7d433f78, 0x7e8d72f8, undef:i32
  0x7e8d72f8: i32 = add 0x7e8ce2c8, Constant:i32<48>
0x7e8ce2c8: i32,ch = CopyFromReg 0x7d433f78, Register:i32 %100
  0x7e8ce260: i32 = Register %100
0x7e8d3e40: i32 = Constant<48>
  0x7e8cebb8: i32 = undef
0x7e8ceb50: i32 = Constant<0>
0x7e8d7638: i32 = add 0x7e8d73c8, Constant:i32<2>
  0x7e8d73c8: i32 = mul 0x7e96cb20, Constant:i32<6>
0x7e96cb20: i32 = add 0x7e96c918, 0x7e96c9e8
  0x7e96c918: i32 = mulhs 0x7e8ce810, Constant:i32<1431655766>
0x7e8ce810: i32 = add 0x7e8ce538, 0x7e8ce398
  0x7e8ce538: i32,ch = CopyFromReg 0x7d433f78, Register:i32 %103
0x7e8ce4d0: i32 = Register %103
  0x7e8ce398: i32,ch = CopyFromReg 0x7d433f78, Register:i32 %101
0x7e8ce330: i32 = Register %101
0x7e8d7290: i32 = Constant<1431655766>
  0x7e96c9e8: i32 = srl 0x7e96c918, Constant:i32<31>
0x7e96c918: i32 = mulhs 0x7e8ce810, Constant:i32<1431655766>
  0x7e8ce810: i32 = add 0x7e8ce538, 0x7e8ce398
0x7e8ce538: i32,ch = CopyFromReg 0x7d433f78, Register:i32 %103
  0x7e8ce4d0: i32 = Register %103
0x7e8ce398: i32,ch = CopyFromReg 0x7d433f78, Register:i32 %101
  0x7e8ce330: i32 = Register %101
  0x7e8d7290: i32 = Constant<1431655766>
0x7e96c980: i32 = Constant<31>
0x7e8d7360: i32 = Constant<6>
  0x7e8d75d0: i32 = Constant<2>
0x7e8d3b68: i1 = Constant<0>
0x7e8d3b68: i1 = Constant<0>
In function: main

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH mesa 7/9] vulkan: Add EXT_acquire_xlib_display [v3]

2018-06-14 Thread Jason Ekstrand
On Thu, Jun 14, 2018 at 12:24 PM, Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> >> Signed-off-by: Keith Packard 
> >>
> >> fixup for acquire
> >>
> >> fixup for RROutput type
> >>
> >> Signed-off-by: Keith Packard 
> >>
> >> fixup
> >>
> >
> > Lots of "fixup".  Did you mean to actually comment on what that was?
>
> Sorry; I was squashing patches and moving comments into the main message
> and just left this noise below.
>
> >> +static bool
> >> +wsi_display_mode_matches_x(struct wsi_display_mode *wsi,
> >> +   xcb_randr_mode_info_t *xcb)
> >> +{
> >> +   return wsi->clock == (xcb->dot_clock + 500) / 1000 &&
> >> +  wsi->hdisplay == xcb->width &&
> >> +  wsi->hsync_start == xcb->hsync_start &&
> >> +  wsi->hsync_end == xcb->hsync_end &&
> >> +  wsi->htotal == xcb->htotal &&
> >> +  wsi->hskew == xcb->hskew &&
> >> +  wsi->vdisplay == xcb->height &&
> >> +  wsi->vsync_start == xcb->vsync_start &&
> >> +  wsi->vsync_end == xcb->vsync_end &&
> >> +  wsi->vtotal == xcb->vtotal &&
> >>
> >
> > You're not checking vscan here.
>
> Yeah, I'm really unsure what vscan means exactly. X only has the
> DOUBLE_SCAN flag, while vscan appears more flexible. DRM drivers appear
> to use vscan == 0 to mean the same thing as single scan mode, which
> seems like it is also covered by vscan == 1. I think what I probably
> need is a function which returns the effective vscan value for both X
> and DRM modes and then compare those. Maybe something like:
>
> static int
> wsi_display_drm_vscan(drmModeModeInfoPtr drm)
> {
> if (drm->vscan > 1)
> return drm->vscan;
> return 1;
> }
>
> static int
> wsi_display_x_vscan(xcb_randr_mode_info_t *x_mode)
> {
>if (x_mode->mode_flags & XCB_RANDR_MODE_FLAG_DOUBLE_SCAN)
>   return 2;
>return 1;
> }
>
> If these look reasonable, then I could use them as appropriate and the
> values should all compare correctly.
>

I had a chat with one of our kernel display people and he seemed to think
that neither vscan nor doublescan were still a thing on modern display
technologies.  I also did some searching through the DRM code in the kernel
and, as far as I can tell, there is no code which creates a mode with
vscan != 0.  How would you feel about just rejecting any modes we get in
from KMS which have vscan > 1 or the DBLSCAN flag set and rejecting
anything from XRandR with DOUBLE_SCAN set?  Then we could just delete vscan
from struct wsi_display_mode.  If someone comes back with an HMD or some
other application of the Vulkan display extensions that they want to work
on a doublescan display, we can deal with it then.

My primary fear here is that I don't want to get a mode from the X server
and try and conjure up some value for vscan and then get it wrong in such a
way that KMS rejects it when we try to flip an image to the display.  It's
better to just reject those modes outright and have
vkGetDisplayModePropertiesKHR return zero modes than to return modes that
don't work.

Does that sound reasonable?
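
For illustration, a minimal sketch of the "just reject them" check. The structs and flag macros here are stand-ins with made-up sketch_* names, not the real drmModeModeInfo or xcb_randr_mode_info_t definitions; the real flag values come from the DRM and xcb-randr headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for DRM_MODE_FLAG_DBLSCAN and XCB_RANDR_MODE_FLAG_DOUBLE_SCAN.
 * The real definitions live in the DRM and xcb-randr headers. */
#define SKETCH_DRM_FLAG_DBLSCAN   (1u << 5)
#define SKETCH_RANDR_FLAG_DBLSCAN (1u << 5)

/* Only the fields this check needs; hypothetical, not the real structs. */
struct sketch_drm_mode {
   uint32_t flags;
   uint16_t vscan;
};

struct sketch_x_mode {
   uint32_t mode_flags;
};

/* A KMS mode is accepted only if it is plain single scan:
 * vscan of 0 or 1 and no doublescan flag. */
static bool
sketch_drm_mode_supported(const struct sketch_drm_mode *drm)
{
   assert(drm);
   return drm->vscan <= 1 && !(drm->flags & SKETCH_DRM_FLAG_DBLSCAN);
}

/* An XRandR mode is accepted only if DOUBLE_SCAN is not set. */
static bool
sketch_x_mode_supported(const struct sketch_x_mode *x)
{
   assert(x);
   return !(x->mode_flags & SKETCH_RANDR_FLAG_DBLSCAN);
}
```

With a filter like this in front of mode enumeration, struct wsi_display_mode would never need to carry vscan at all, since every mode that gets through is single scan.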


> > Why are you fetching these here and not lower down?  The only uses of them
> > inside the "if (!connector)" is to free them.  Seems to be a bit of a
> > waste.
>
> Good point. I've moved them below that block, just above the code which
> uses their values.
>

Much better. Thanks!


> >> +   /* XXX no support for multiple leases yet */
> >> +   if (wsi->fd >= 0)
> >> +  return VK_ERROR_OUT_OF_DATE_KHR;
> >>
> >
> > This function is supposed to return either VK_SUCCESS or
> > VK_ERROR_INITIALIZATION_FAILED.  The errors here and below should probably
> > all be VK_ERROR_INITIALIZATION_FAILED.
>
> Thanks. Vulkan error values remain largely a mystery to me. I've changed
> the values.
>

They are largely mysterious in general. :-)  VK_ERROR_OUT_OF_DATE_KHR
exists as an error code to be returned by swapchain functions to indicate
that something has gone sideways and you need a new swapchain before you
can do anything more.  This function, however, does not involve a swapchain
and is really more of an initialization function.  Also, the spec has a
list of allowed return values for each function; it's generally good to
pick from that list. :-)
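
As a rough sketch of what that looks like for the lease check quoted above: the enum below is a stand-in for the two VkResult values mentioned here, and sketch_wsi / sketch_acquire_display are hypothetical names, not the real WSI code.

```c
#include <assert.h>

/* Stand-ins for the two VkResult codes this function is allowed to return. */
typedef enum {
   SKETCH_VK_SUCCESS = 0,
   SKETCH_VK_ERROR_INITIALIZATION_FAILED = -3,
} sketch_result;

/* Hypothetical state: fd < 0 means no lease is currently held. */
struct sketch_wsi {
   int fd;
};

static sketch_result
sketch_acquire_display(struct sketch_wsi *wsi, int lease_fd)
{
   assert(wsi);

   /* No support for multiple leases yet. This is an initialization
    * problem, not an out-of-date swapchain, so report it as such. */
   if (wsi->fd >= 0)
      return SKETCH_VK_ERROR_INITIALIZATION_FAILED;

   /* Any other failure to set things up maps to the same code. */
   if (lease_fd < 0)
      return SKETCH_VK_ERROR_INITIALIZATION_FAILED;

   wsi->fd = lease_fd;
   return SKETCH_VK_SUCCESS;
}
```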

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 44/48] meson: add windows specific linker flags

2018-06-14 Thread Dylan Baker
Quoting Jose Fonseca (2018-06-14 14:02:39)
> On 12/06/18 17:50, Dylan Baker wrote:
> > Quoting Eric Engestrom (2018-06-12 04:38:04)
> >> On Monday, 2018-06-11 15:56:11 -0700, Dylan Baker wrote:
> >>> ---
> >>>   meson.build | 21 +
> >>>   1 file changed, 21 insertions(+)
> >>>
> >>> diff --git a/meson.build b/meson.build
> >>> index a244694fd4a..e1b3afbe688 100644
> >>> --- a/meson.build
> >>> +++ b/meson.build
> >>> @@ -847,6 +847,27 @@ else
> >>> endforeach
> >>>   endif
> >>>   
> >>> +# set linker arguments
> >>> +if host_machine.system() == 'windows'
> >>> +  if cc.get_id() == 'msvc'
> >>> +add_project_link_arguments(
> >>> +  '/fixed:no',
> >>> +  '/incremental:no',
> >>> +  '/dynamicbase',
> >>> +  '/nxcompat',
> >>> +  language : ['c', 'cpp'],
> >>> +)
> >>> +  else
> >>> +add_project_link_arguments(
> >>> +  '-Wl,--nxcompat',
> >>> +  '-Wl,--dynamicbase',
> >>> +  '-static-libgcc',
> >>> +  '-static-libstdc++',
> >>> +  language : ['c', 'cpp'],
> >>
> >> probably harmless, but it feels like libgcc and libstdc++ should be
> >> only added to c, respectively cpp, not both.
> > 
> > cpp needs -static-libgcc if the target has both C and C++ code, right?
> > 
> > I copied this from scons/gallium.py.
> > 
> > Brian or Jose, I don't know what the right thing to do is here, do one of
> > you guys?
> > 
> > Dylan
> > 
> 
> I'm not entirely sure.
> 
> The thing is, one should always use /usr/bin/c++ for linking whenever 
> there's a C++ dependency, even if the main program is just C.
> 
> I'd go for both, just in case.  At any rate, I doubt it harms.
> 
> Jose

Okay, then I think I'll just stick to both as I couldn't find anything when I
googled it.

Dylan




[Mesa-dev] [Bug 106922] Tangrams demo: LLVM ERROR: Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106922

--- Comment #1 from Bas Nieuwenhuizen  ---
Can reproduce a LLVM error here.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 106922] Tangrams demo: LLVM ERROR: Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106922

--- Comment #2 from Bas Nieuwenhuizen  ---
If this is any indication it may just be not checking exts:

SPIR-V WARNING:
In file ../mesa/src/compiler/spirv/spirv_to_nir.c:3312
Unsupported SPIR-V capability: SpvCapabilityInt16
28 bytes into the SPIR-V binary
SPIR-V WARNING:
In file ../mesa/src/compiler/spirv/spirv_to_nir.c:3394
Unsupported SPIR-V capability: SpvCapabilityStorageBuffer16BitAccess
36 bytes into the SPIR-V binary
SPIR-V WARNING:
In file ../mesa/src/compiler/spirv/spirv_to_nir.c:3394
Unsupported SPIR-V capability:
SpvCapabilityUniformAndStorageBuffer16BitAccess
44 bytes into the SPIR-V binary

Since we don't support these.

(or the undef is messing things up, let me find out where it gets introduced)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 02/14] intel/compiler: new shuffle_for_32bit_write and shuffle_from_32bit_read

2018-06-14 Thread Chema Casanova
On 14/06/18 03:02, Jason Ekstrand wrote:
> On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo
> mailto:jmcasan...@igalia.com>> wrote:
> 
> These new shuffle functions deal with the shuffle/unshuffle operations
> needed for read/write operations using 32-bit components when the
> read/written components have a different bit-size (8, 16, 64-bits).
> Shuffle from 32-bit to 32-bit becomes a simple MOV.
> 
> As the new function shuffle_src_to_dst takes of doing a shuffle or an
> unshuffle based on the different type_sz of source an destination this
> generic functions work with any source/destination assuming that writes
> use a 32-bit destination or reads use a 32-bit source.
> 
> 
> I'm having a lot of trouble understanding this paragraph.  Would you
> mind rephrasing it?
>  

Sure, that English didn't compile:

"shuffle_src_to_dst takes care of doing a shuffle when source type is
smaller than destination type and an unshuffle when source type is
bigger than destination. So these new read/write functions just need
to call shuffle_src_to_dst assuming that writes use a 32-bit
destination and reads use a 32-bit source."

I included also this comment in the commit log:

"As shuffle_for_32bit_write/from_32bit_read components take components
in units of source/destination types and shuffle_src_to_dst takes units
of the smallest type component we adjust the components and
first_component parameters."
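
To make the unit conversion concrete, here is a small host-side sketch of the adjustment. The sketch_* names are hypothetical; it only mirrors the arithmetic in the patch, not the register handling.

```c
#include <assert.h>
#include <stdint.h>

/* A component range expressed in units of the smallest type involved. */
struct sketch_range {
   uint32_t first_component;
   uint32_t components;
};

/* other_sz is the byte size of the non-32-bit side: the destination type
 * for reads, the source type for writes. For a 64-bit type the 32-bit
 * side is the smaller one, so each component counts as two units. For
 * 8-, 16- and 32-bit types the units already match. */
static struct sketch_range
sketch_to_shuffle_units(uint32_t other_sz, uint32_t first_component,
                        uint32_t components)
{
   struct sketch_range r = { first_component, components };

   if (other_sz > 4) {
      assert(other_sz == 8);
      r.first_component *= 2;
      r.components *= 2;
   }
   return r;
}

/* Number of 32-bit slots needed to hold `components` values of type_sz
 * bytes each, i.e. DIV_ROUND_UP(components * type_sz, 4). */
static uint32_t
sketch_dwords_needed(uint32_t components, uint32_t type_sz)
{
   return (components * type_sz + 3) / 4;
}
```

For example, two 64-bit components starting at component 1 become four 32-bit units starting at unit 2, while three 16-bit components fit in two 32-bit slots.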

> 
> To enable these new functions, there must be no
> source/destination overlap in the case of shuffle_from_32bit_read.
> That never happens with shuffle_for_32bit_write, as it allocates a new
> destination register, as shuffle_64bit_data_for_32bit_write did.
> ---
>  src/intel/compiler/brw_fs.h       | 11 +
>  src/intel/compiler/brw_fs_nir.cpp | 38 +++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index faf51568637..779170ecc95 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -519,6 +519,17 @@ void shuffle_16bit_data_for_32bit_write(const
> brw::fs_builder &bld,
>                                          const fs_reg &src,
>                                          uint32_t components);
> 
> +void shuffle_from_32bit_read(const brw::fs_builder &bld,
> +                             const fs_reg &dst,
> +                             const fs_reg &src,
> +                             uint32_t first_component,
> +                             uint32_t components);
> +
> +fs_reg shuffle_for_32bit_write(const brw::fs_builder &bld,
> +                               const fs_reg &src,
> +                               uint32_t first_component,
> +                               uint32_t components);
> +
>  fs_reg setup_imm_df(const brw::fs_builder &bld,
>                      double v);
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 1a9d3c41d1d..1f684149fd5 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -5454,6 +5454,44 @@ shuffle_src_to_dst(const fs_builder &bld,
>     }
>  }
> 
> +void
> +shuffle_from_32bit_read(const fs_builder &bld,
> +                        const fs_reg &dst,
> +                        const fs_reg &src,
> +                        uint32_t first_component,
> +                        uint32_t components)
> +{
> +   assert(type_sz(src.type) == 4);
> +
> 
> 
> /* This function takes components in units of the destination type while
> shuffle_src_to_dst takes components in units of the smallest type */

Done.

> +   if (type_sz(dst.type) > 4) {
> +      assert(type_sz(dst.type) == 8);
> +      first_component *= 2;
> +      components *= 2;
> +   }
> +
> +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> +}
> +
> +fs_reg
> +shuffle_for_32bit_write(const fs_builder &bld,
> +                        const fs_reg &src,
> +                        uint32_t first_component,
> +                        uint32_t components)
> +{
> +   fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_D,
> +                         DIV_ROUND_UP (components *
> type_sz(src.type), 4));
> +
> 
> 
> /* This function takes components in units of the source type while
> shuffle_src_to_dst takes components in units of the smallest type */

Done.

> With those added and the commit message re-worded a bit,
> 
> Reviewed-by: Jason Ekstrand

Thanks for the review.

Chema

> +   if (type_sz(src.type) > 4) {
> +      assert(type_sz(src.type) == 8);
> +      first_component *= 2;
> +      components *= 2;
> +   }
> +
> +   shuffle_src_to_dst

[Mesa-dev] [Bug 106922] Tangrams demo: LLVM ERROR: Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8

2018-06-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106922

--- Comment #3 from Bas Nieuwenhuizen  ---
So top level is a f32 -> i16 bitcast:

Cannot select: 0x7e8d8750: i16 = bitcast 0x7e8d8af8

which is not allowed. In the LLVM IR it looks like this:

  %76 = call float @llvm.amdgcn.buffer.load.f32(<4 x i32> %75, i32 0, i32 %74, i1 false, i1 false) #3
  %77 = bitcast float %76 to i16

It comes from this NIR:

vec1 16 ssa_115 = intrinsic load_ssbo (ssa_113, ssa_114) () ()

I'm really inclined to say it is our lack of 16-bit support.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH] ac: Clear meminfo to avoid valgrind warning.

2018-06-14 Thread Bas Nieuwenhuizen
Somehow valgrind misses that the value is initialized by the ioctl.
---
 src/amd/common/ac_gpu_info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index e908cc6fa96..e885c0538e9 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -235,7 +235,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
}
 
if (info->drm_minor >= 9) {
-   struct drm_amdgpu_memory_info meminfo;
+   struct drm_amdgpu_memory_info meminfo = {};
 
r = amdgpu_query_info(dev, AMDGPU_INFO_MEMORY, sizeof(meminfo), 
&meminfo);
if (r) {
-- 
2.17.0
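
For reference, the same pattern as a stand-alone sketch. sketch_meminfo and sketch_query are made-up stand-ins for the real struct and query call. Note that the empty initializer `= {}` used in the patch is a GNU C extension (standard only from C23 on); `= {0}` is the portable C spelling and is what the sketch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the queried struct. */
struct sketch_meminfo {
   uint64_t vram_total;
   uint64_t gtt_total;
};

/* Hypothetical stand-in for a query that fills the struct on success
 * and leaves it untouched on failure. */
static int
sketch_query(int should_fail, struct sketch_meminfo *out)
{
   assert(out);
   if (should_fail)
      return -1;
   out->vram_total = 1024;
   out->gtt_total = 2048;
   return 0;
}

/* Zero-initialize before the call, so every byte of the struct is
 * defined even when a checker cannot see the callee's writes, or when
 * the call fails without writing anything. */
static int
sketch_get_meminfo(int should_fail, struct sketch_meminfo *result)
{
   struct sketch_meminfo info = {0};
   int r = sketch_query(should_fail, &info);

   *result = info;
   return r;
}
```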



Re: [Mesa-dev] [PATCH 07/14] intel/compiler: shuffle_from_32bit_read for 64-bit do_untyped_vector_read

2018-06-14 Thread Chema Casanova


On 14/06/18 03:26, Jason Ekstrand wrote:
> On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo
> <jmcasan...@igalia.com> wrote:
> 
> do_untyped_vector_read is used at load_ssbo and load_shared.
> 
> The previous MOVs are removed because shuffle_from_32bit_read
> can handle storing the shuffle results in the expected destination
> just using the proper offset.
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 7e738ade82e..780a9e228de 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -2434,16 +2434,8 @@ do_untyped_vector_read(const fs_builder &bld,
>                                                  BRW_PREDICATE_NONE);
> 
>           /* Shuffle the 32-bit load result into valid 64-bit data */
> -         const fs_reg packed_result = bld.vgrf(dest.type,
> iter_components);
> -         shuffle_32bit_load_result_to_64bit_data(
> -            bld, packed_result, read_result, iter_components);
> -
> -         /* Move each component to its destination */
> -         read_result = retype(read_result, BRW_REGISTER_TYPE_DF);
> -         for (int c = 0; c < iter_components; c++) {
> -            bld.MOV(offset(dest, bld, it * 2 + c),
> -                    offset(packed_result, bld, c));
> -         }
> 
> 
> I really don't know why we needed this extra set of MOVs.  They seem
> pretty pointless to me.  Maybe history?  In any case, this looks good.

I've just checked and there is not much history, as the 64-bit code of
this function hasn't changed since it landed. I think the logic was to
shuffle first and then move to the proper destination, instead of just
shuffling to the final destination directly.

So maybe Iago remembers if there was any reason why...
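
As a sanity check of the equivalence, here is a scalar model of both shapes. It is only a sketch: it works on plain arrays for a single channel, ignores the SIMD register layout, and assumes the low dword comes first. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of shuffle_from_32bit_read for a 64-bit destination:
 * each pair of dwords becomes one 64-bit value (low dword first). */
static void
sketch_shuffle_from_32bit_read(uint64_t *dst, const uint32_t *src,
                               uint32_t components)
{
   for (uint32_t c = 0; c < components; c++)
      dst[c] = (uint64_t)src[2 * c] | ((uint64_t)src[2 * c + 1] << 32);
}

/* Old shape: shuffle into a temporary, then one move per component. */
static void
sketch_read_via_temp(uint64_t *dest, uint32_t it,
                     const uint32_t *read_result, uint32_t iter_components)
{
   uint64_t packed_result[2];

   assert(iter_components <= 2);
   sketch_shuffle_from_32bit_read(packed_result, read_result,
                                  iter_components);
   for (uint32_t c = 0; c < iter_components; c++)
      dest[it * 2 + c] = packed_result[c];
}

/* New shape: shuffle straight into the offset destination. */
static void
sketch_read_direct(uint64_t *dest, uint32_t it,
                   const uint32_t *read_result, uint32_t iter_components)
{
   sketch_shuffle_from_32bit_read(dest + it * 2, read_result,
                                  iter_components);
}
```

Both variants write the same values to the same slots; the direct one just skips the temporary and the per-component moves.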

> Reviewed-by: Jason Ekstrand
>  
> 
> +         shuffle_from_32bit_read(bld, offset(dest, bld, it * 2),
> +                                 read_result, 0, iter_components);
> 
>           bld.ADD(read_offset, read_offset, brw_imm_ud(16));
>        }
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH 13/14] intel/compiler: use new shuffle_32bit_write for all 64-bit storage writes

2018-06-14 Thread Chema Casanova
On 14/06/18 03:44, Jason Ekstrand wrote:
> On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo
> <jmcasan...@igalia.com> wrote:
> 
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 2521f3c001b..833fad4247a 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -2839,8 +2839,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const
> fs_builder &bld,
>                  * for that.
>                  */
>                 unsigned channel = iter * 2 + i;
> -               fs_reg dest = shuffle_64bit_data_for_32bit_write(bld,
> -                  offset(value, bld, channel), 1);
> +               fs_reg dest = shuffle_for_32bit_write(bld, value,
> channel, 1);
> 
> 
> What happened to offsetting "value"?

Using channel as first_component in shuffle_for_32bit_write is
equivalent to offsetting value, and we save one line. :)
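
A scalar sketch of that equivalence, for what it's worth. It models a single channel on plain arrays, ignores the SIMD register layout, and assumes the low dword is written first; sketch_shuffle_for_32bit_write is a hypothetical stand-in, not the real function.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of shuffle_for_32bit_write for 64-bit sources: each
 * selected 64-bit component is split into two dwords, low half first.
 * first_component and components are in units of the source type. */
static void
sketch_shuffle_for_32bit_write(uint32_t *dst, const uint64_t *src,
                               uint32_t first_component,
                               uint32_t components)
{
   assert(dst && src);
   for (uint32_t c = 0; c < components; c++) {
      uint64_t v = src[first_component + c];

      dst[2 * c] = (uint32_t)(v & 0xffffffffu);
      dst[2 * c + 1] = (uint32_t)(v >> 32);
   }
}
```

In this model, calling it with `(value + channel, 0, 1)` and with `(value, channel, 1)` reads the same source element and produces the same two dwords.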

>  
> 
> 
>                 srcs[header_regs + (i + first_component) * 2] = dest;
>                 srcs[header_regs + (i + first_component) * 2 + 1] =
> @@ -3694,8 +3693,8 @@ fs_visitor::nir_emit_cs_intrinsic(const
> fs_builder &bld,
>        unsigned type_size = 4;
>        if (nir_src_bit_size(instr->src[0]) == 64) {
>           type_size = 8;
> -         val_reg = shuffle_64bit_data_for_32bit_write(bld,
> -            val_reg, instr->num_components);
> +         val_reg = shuffle_for_32bit_write(bld, val_reg, 0,
> +                                           instr->num_components);
>        }
> 
>        unsigned type_slots = type_size / 4;
> @@ -4236,8 +4235,8 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>               * iteration handle the rest.
>               */
>              num_components = MIN2(2, num_components);
> -            write_src = shuffle_64bit_data_for_32bit_write(bld,
> write_src,
> -                                                         
>  num_components);
> +            write_src = shuffle_for_32bit_write(bld, write_src, 0,
> +                                                num_components);
>           } else if (type_size < 4) {
>              assert(type_size == 2);
>              /* For 16-bit types we pack two consecutive values into
> a 32-bit
> @@ -4333,7 +4332,7 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
>        unsigned num_components = instr->num_components;
>        unsigned first_component = nir_intrinsic_component(instr);
>        if (nir_src_bit_size(instr->src[0]) == 64) {
> -         src = shuffle_64bit_data_for_32bit_write(bld, src,
> num_components);
> +         src = shuffle_for_32bit_write(bld, src, 0, num_components);
>           num_components *= 2;
>        }
>  
> -- 
> 2.17.1
> 


[Mesa-dev] [PATCH] radv: remove multisample bit from shader key.

2018-06-14 Thread Dave Airlie
From: Dave Airlie 

This wasn't being used anywhere inside the shader from what I can see.
---
 src/amd/vulkan/radv_pipeline.c | 2 --
 src/amd/vulkan/radv_private.h  | 1 -
 src/amd/vulkan/radv_shader.h   | 1 -
 3 files changed, 4 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 6eeedc65a39..ccbcbbadd55 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1868,7 +1868,6 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline 
*pipeline,
pCreateInfo->pMultisampleState->rasterizationSamples > 1) {
uint32_t num_samples = 
pCreateInfo->pMultisampleState->rasterizationSamples;
uint32_t ps_iter_samples = 
radv_pipeline_get_ps_iter_samples(pCreateInfo->pMultisampleState);
-   key.multisample = true;
key.log2_num_samples = util_logbase2(num_samples);
key.log2_ps_iter_samples = util_logbase2(ps_iter_samples);
}
@@ -1909,7 +1908,6 @@ radv_fill_shader_keys(struct radv_shader_variant_key 
*keys,
for(int i = 0; i < MESA_SHADER_STAGES; ++i)
keys[i].has_multiview_view_index = 
key->has_multiview_view_index;
 
-   keys[MESA_SHADER_FRAGMENT].fs.multisample = key->multisample;
keys[MESA_SHADER_FRAGMENT].fs.col_format = key->col_format;
keys[MESA_SHADER_FRAGMENT].fs.is_int8 = key->is_int8;
keys[MESA_SHADER_FRAGMENT].fs.is_int10 = key->is_int10;
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 316fbc9af1d..7841d70deea 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -360,7 +360,6 @@ struct radv_pipeline_key {
uint32_t is_int10;
uint8_t log2_ps_iter_samples;
uint8_t log2_num_samples;
-   uint32_t multisample : 1;
uint32_t has_multiview_view_index : 1;
uint32_t optimisations_disabled : 1;
 };
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index 05de188e3f3..5b2284efcfd 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -98,7 +98,6 @@ struct radv_fs_variant_key {
uint8_t log2_num_samples;
uint32_t is_int8;
uint32_t is_int10;
-   uint32_t multisample : 1;
 };
 
 struct radv_shader_variant_key {
-- 
2.17.0



Re: [Mesa-dev] [PATCH] radv: remove multisample bit from shader key.

2018-06-14 Thread Bas Nieuwenhuizen
On Fri, Jun 15, 2018 at 12:51 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This wasn't being used anywhere inside the shader from what I can see.

Well it was used for the BC optimize, but then Samuel enabled it for
non-multisample too, so now we don't use it anymore. (or rather we
were already enabling it, but not putting the shader part in there for
non-multisample)

Reviewed-by: Bas Nieuwenhuizen 

> ---
>  src/amd/vulkan/radv_pipeline.c | 2 --
>  src/amd/vulkan/radv_private.h  | 1 -
>  src/amd/vulkan/radv_shader.h   | 1 -
>  3 files changed, 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 6eeedc65a39..ccbcbbadd55 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -1868,7 +1868,6 @@ radv_generate_graphics_pipeline_key(struct 
> radv_pipeline *pipeline,
> pCreateInfo->pMultisampleState->rasterizationSamples > 1) {
> uint32_t num_samples = 
> pCreateInfo->pMultisampleState->rasterizationSamples;
> uint32_t ps_iter_samples = 
> radv_pipeline_get_ps_iter_samples(pCreateInfo->pMultisampleState);
> -   key.multisample = true;
> key.log2_num_samples = util_logbase2(num_samples);
> key.log2_ps_iter_samples = util_logbase2(ps_iter_samples);
> }
> @@ -1909,7 +1908,6 @@ radv_fill_shader_keys(struct radv_shader_variant_key 
> *keys,
> for(int i = 0; i < MESA_SHADER_STAGES; ++i)
> keys[i].has_multiview_view_index = 
> key->has_multiview_view_index;
>
> -   keys[MESA_SHADER_FRAGMENT].fs.multisample = key->multisample;
> keys[MESA_SHADER_FRAGMENT].fs.col_format = key->col_format;
> keys[MESA_SHADER_FRAGMENT].fs.is_int8 = key->is_int8;
> keys[MESA_SHADER_FRAGMENT].fs.is_int10 = key->is_int10;
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index 316fbc9af1d..7841d70deea 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -360,7 +360,6 @@ struct radv_pipeline_key {
> uint32_t is_int10;
> uint8_t log2_ps_iter_samples;
> uint8_t log2_num_samples;
> -   uint32_t multisample : 1;
> uint32_t has_multiview_view_index : 1;
> uint32_t optimisations_disabled : 1;
>  };
> diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
> index 05de188e3f3..5b2284efcfd 100644
> --- a/src/amd/vulkan/radv_shader.h
> +++ b/src/amd/vulkan/radv_shader.h
> @@ -98,7 +98,6 @@ struct radv_fs_variant_key {
> uint8_t log2_num_samples;
> uint32_t is_int8;
> uint32_t is_int10;
> -   uint32_t multisample : 1;
>  };
>
>  struct radv_shader_variant_key {
> --
> 2.17.0
>


[Mesa-dev] [PATCH] tgsi: add some atomic opcodes to tgsi_opcode_infer_type

2018-06-14 Thread Gurchetan Singh
We don't have cases for atomic types, some of which are explicitly
signed or unsigned.

The other opcodes could have uint or int return types, based on the
sources.
---
 src/gallium/auxiliary/tgsi/tgsi_info.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 4aa658785cf..6f3cd9c5304 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -253,6 +253,9 @@ tgsi_opcode_infer_src_type(enum tgsi_opcode opcode, uint 
src_idx)
   return TGSI_TYPE_SIGNED;
 
switch (opcode) {
+   case TGSI_OPCODE_ATOMUADD:
+   case TGSI_OPCODE_ATOMUMAX:
+   case TGSI_OPCODE_ATOMUMIN:
case TGSI_OPCODE_UIF:
case TGSI_OPCODE_TXF:
case TGSI_OPCODE_TXF_LZ:
@@ -268,6 +271,8 @@ tgsi_opcode_infer_src_type(enum tgsi_opcode opcode, uint 
src_idx)
case TGSI_OPCODE_U2I64:
case TGSI_OPCODE_MEMBAR:
   return TGSI_TYPE_UNSIGNED;
+   case TGSI_OPCODE_ATOMIMAX:
+   case TGSI_OPCODE_ATOMIMIN:
case TGSI_OPCODE_IMUL_HI:
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_I2D:
-- 
2.13.5
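
The mapping described above, as a stand-alone sketch. The enums are local stand-ins with hypothetical sketch names (they mirror the TGSI opcode names in the patch but are not the real TGSI enums), and the "from sources" result is just a marker for opcodes whose type is not fixed by the opcode itself.

```c
#include <assert.h>

/* Stand-ins for a handful of TGSI atomic opcodes. */
enum sketch_opcode {
   SKETCH_OPCODE_ATOMUADD,
   SKETCH_OPCODE_ATOMUMAX,
   SKETCH_OPCODE_ATOMUMIN,
   SKETCH_OPCODE_ATOMIMAX,
   SKETCH_OPCODE_ATOMIMIN,
   SKETCH_OPCODE_ATOMXCHG,
};

enum sketch_type {
   SKETCH_TYPE_UNSIGNED,
   SKETCH_TYPE_SIGNED,
   SKETCH_TYPE_FROM_SOURCES, /* not decided by the opcode alone */
};

/* Opcodes with an explicit U or I in the name have a fixed type;
 * everything else depends on what the sources are. */
static enum sketch_type
sketch_infer_atomic_type(enum sketch_opcode opcode)
{
   switch (opcode) {
   case SKETCH_OPCODE_ATOMUADD:
   case SKETCH_OPCODE_ATOMUMAX:
   case SKETCH_OPCODE_ATOMUMIN:
      return SKETCH_TYPE_UNSIGNED;
   case SKETCH_OPCODE_ATOMIMAX:
   case SKETCH_OPCODE_ATOMIMIN:
      return SKETCH_TYPE_SIGNED;
   default:
      return SKETCH_TYPE_FROM_SOURCES;
   }
}
```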



Re: [Mesa-dev] [PATCH 02/14] intel/compiler: new shuffle_for_32bit_write and shuffle_from_32bit_read

2018-06-14 Thread Jason Ekstrand
On Thu, Jun 14, 2018 at 2:39 PM, Chema Casanova 
wrote:

> On 14/06/18 03:02, Jason Ekstrand wrote:
> > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo
> > <jmcasan...@igalia.com> wrote:
> >
> > These new shuffle functions deal with the shuffle/unshuffle operations
> > needed for read/write operations using 32-bit components when the
> > read/written components have a different bit-size (8, 16, 64-bits).
> > Shuffle from 32-bit to 32-bit becomes a simple MOV.
> >
> > As the new function shuffle_src_to_dst takes of doing a shuffle or an
> > unshuffle based on the different type_sz of source an destination this
> > generic functions work with any source/destination assuming that writes
> > use a 32-bit destination or reads use a 32-bit source.
> >
> >
> > I'm having a lot of trouble understanding this paragraph.  Would you
> > mind rephrasing it?
> >
>
> Sure, that English didn't compile:
>
> "shuffle_src_to_dst takes care of doing a shuffle when source type is
> smaller than destination type and an unshuffle when source type is
> bigger than destination. So these new read/write functions just need
> to call shuffle_src_to_dst assuming that writes use a 32-bit
> destination and reads use a 32-bit source."
>
> I included also this comment in the commit log:
>
> "As shuffle_for_32bit_write/from_32bit_read components take components
> in units of source/destination types and shuffle_src_to_dst takes units
> of the smallest type component we adjust the components and
> first_component parameters."
>

Those both look good.


> >
> > To enable these new functions, there must be no
> > source/destination overlap in the case of shuffle_from_32bit_read.
> > That never happens with shuffle_for_32bit_write, as it allocates a new
> > destination register, as shuffle_64bit_data_for_32bit_write did.
> > ---
> >  src/intel/compiler/brw_fs.h   | 11 +
> >  src/intel/compiler/brw_fs_nir.cpp | 38
> +++
> >  2 files changed, 49 insertions(+)
> >
> > diff --git a/src/intel/compiler/brw_fs.h
> b/src/intel/compiler/brw_fs.h
> > index faf51568637..779170ecc95 100644
> > --- a/src/intel/compiler/brw_fs.h
> > +++ b/src/intel/compiler/brw_fs.h
> > @@ -519,6 +519,17 @@ void shuffle_16bit_data_for_32bit_write(const
> > brw::fs_builder &bld,
> >  const fs_reg &src,
> >  uint32_t components);
> >
> > +void shuffle_from_32bit_read(const brw::fs_builder &bld,
> > + const fs_reg &dst,
> > + const fs_reg &src,
> > + uint32_t first_component,
> > + uint32_t components);
> > +
> > +fs_reg shuffle_for_32bit_write(const brw::fs_builder &bld,
> > +   const fs_reg &src,
> > +   uint32_t first_component,
> > +   uint32_t components);
> > +
> >  fs_reg setup_imm_df(const brw::fs_builder &bld,
> >  double v);
> >
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 1a9d3c41d1d..1f684149fd5 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -5454,6 +5454,44 @@ shuffle_src_to_dst(const fs_builder &bld,
> > }
> >  }
> >
> > +void
> > +shuffle_from_32bit_read(const fs_builder &bld,
> > +const fs_reg &dst,
> > +const fs_reg &src,
> > +uint32_t first_component,
> > +uint32_t components)
> > +{
> > +   assert(type_sz(src.type) == 4);
> > +
> >
> >
> > /* This function takes components in units of the destination type while
> > shuffle_src_to_dst takes components in units of the smallest type */
>
> Done.
>
> > +   if (type_sz(dst.type) > 4) {
> > +  assert(type_sz(dst.type) == 8);
> > +  first_component *= 2;
> > +  components *= 2;
> > +   }
> > +
> > +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> > +}
> > +
> > +fs_reg
> > +shuffle_for_32bit_write(const fs_builder &bld,
> > +const fs_reg &src,
> > +uint32_t first_component,
> > +uint32_t components)
> > +{
> > +   fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_D,
> > + DIV_ROUND_UP (components *
> > type_sz(src.type), 4));
> > +
> >
> >
> > /* This function takes components in units of the source type while
> > shuffle_src_to_dst takes components in units of the smallest type */
>
> Done.
>
> > With those added and the commit message re
