Re: [Mesa-dev] [PATCH 21/30] i965/screen: Use ISL for doing image import checks

2017-08-04 Thread Jason Ekstrand
On Fri, Aug 4, 2017 at 2:16 AM, Rainer Hochecker wrote:

> This seems to break exporting 16bit vaapi images via drm buffers
>

Yes, I'm aware of the problem and there are two patches on the list which
should fix it:

https://patchwork.freedesktop.org/patch/170051/
https://patchwork.freedesktop.org/patch/170052/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH shader-db 1/1] shaders: Add Dolphin’s übershaders

2017-08-04 Thread Emmanuel Gil Peyrot
These shaders have been generated by Dolphin 9649494f67 on Mesa
8c26b52349 for an HD4000 GPU.

They include a lot of uniform branches, mostly on integers, as well as
switch statements branching on small and bounded integers.

Signed-off-by: Emmanuel Gil Peyrot 
---

The actual patch isn’t included because it was more than 1 MiB; I
hosted it on my website instead:
https://linkmauve.fr/files/0001-shaders-Add-Dolphin-s-bershaders.patch

 shaders/dolphin/ubershaders/102.shader_test | 1258 +
 shaders/dolphin/ubershaders/111.shader_test | 1268 +
 shaders/dolphin/ubershaders/12.shader_test  |  961 +++
 shaders/dolphin/ubershaders/120.shader_test | 1281 ++
 shaders/dolphin/ubershaders/129.shader_test | 1269 +
 shaders/dolphin/ubershaders/138.shader_test | 1279 ++
 shaders/dolphin/ubershaders/147.shader_test | 1292 ++
 shaders/dolphin/ubershaders/156.shader_test | 1280 ++
 shaders/dolphin/ubershaders/165.shader_test | 1290 ++
 shaders/dolphin/ubershaders/174.shader_test | 1303 ++
 shaders/dolphin/ubershaders/183.shader_test | 1291 ++
 shaders/dolphin/ubershaders/192.shader_test | 1301 ++
 shaders/dolphin/ubershaders/201.shader_test | 1314 ++
 shaders/dolphin/ubershaders/21.shader_test  |  949 +++
 shaders/dolphin/ubershaders/210.shader_test | 1302 ++
 shaders/dolphin/ubershaders/219.shader_test | 1312 ++
 shaders/dolphin/ubershaders/228.shader_test | 1325 +++
 shaders/dolphin/ubershaders/237.shader_test | 1313 ++
 shaders/dolphin/ubershaders/3.shader_test   |  948 +++
 shaders/dolphin/ubershaders/30.shader_test  | 1235 +
 shaders/dolphin/ubershaders/39.shader_test  | 1248 +
 shaders/dolphin/ubershaders/48.shader_test  | 1236 +
 shaders/dolphin/ubershaders/57.shader_test  | 1246 +
 shaders/dolphin/ubershaders/66.shader_test  | 1259 +
 shaders/dolphin/ubershaders/75.shader_test  | 1247 +
 shaders/dolphin/ubershaders/84.shader_test  | 1257 +
 shaders/dolphin/ubershaders/93.shader_test  | 1270 +
 27 files changed, 33534 insertions(+)
 create mode 100644 shaders/dolphin/ubershaders/102.shader_test
 create mode 100644 shaders/dolphin/ubershaders/111.shader_test
 create mode 100644 shaders/dolphin/ubershaders/12.shader_test
 create mode 100644 shaders/dolphin/ubershaders/120.shader_test
 create mode 100644 shaders/dolphin/ubershaders/129.shader_test
 create mode 100644 shaders/dolphin/ubershaders/138.shader_test
 create mode 100644 shaders/dolphin/ubershaders/147.shader_test
 create mode 100644 shaders/dolphin/ubershaders/156.shader_test
 create mode 100644 shaders/dolphin/ubershaders/165.shader_test
 create mode 100644 shaders/dolphin/ubershaders/174.shader_test
 create mode 100644 shaders/dolphin/ubershaders/183.shader_test
 create mode 100644 shaders/dolphin/ubershaders/192.shader_test
 create mode 100644 shaders/dolphin/ubershaders/201.shader_test
 create mode 100644 shaders/dolphin/ubershaders/21.shader_test
 create mode 100644 shaders/dolphin/ubershaders/210.shader_test
 create mode 100644 shaders/dolphin/ubershaders/219.shader_test
 create mode 100644 shaders/dolphin/ubershaders/228.shader_test
 create mode 100644 shaders/dolphin/ubershaders/237.shader_test
 create mode 100644 shaders/dolphin/ubershaders/3.shader_test
 create mode 100644 shaders/dolphin/ubershaders/30.shader_test
 create mode 100644 shaders/dolphin/ubershaders/39.shader_test
 create mode 100644 shaders/dolphin/ubershaders/48.shader_test
 create mode 100644 shaders/dolphin/ubershaders/57.shader_test
 create mode 100644 shaders/dolphin/ubershaders/66.shader_test
 create mode 100644 shaders/dolphin/ubershaders/75.shader_test
 create mode 100644 shaders/dolphin/ubershaders/84.shader_test
 create mode 100644 shaders/dolphin/ubershaders/93.shader_test

-- 
Emmanuel Gil Peyrot


Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS

2017-08-04 Thread Tomasz Figa
Hi Yogesh,

On Sat, Aug 5, 2017 at 1:22 AM, Marathe, Yogesh wrote:
>> -Original Message-
>> From: Tomasz Figa [mailto:tf...@chromium.org]
>> Sent: Friday, August 4, 2017 9:39 PM
>> On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogesh
>>  wrote:
>> > Tomasz, Emil,
>> >
>> >> -Original Message-
>> >> From: Tomasz Figa [mailto:tf...@chromium.org]
>> >> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov 
>> wrote:
>> >> >>> >>  - version check (2+) the fence extension, calling
>> >> >>> >> .create_fence_fd() only when
>> >> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD
>> >> >>
>> >> >> The check looks like below now, this is in
>> >> >> dri2_surf_update_fence_fd() before
>> >> create_fence_fd is called.
>> >> >>
>> >> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) {
>> >> >>   if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen)) {
>> >> >>   //create_fence_fd call
>> >> >>}
>> >> >> }
>> >> >>
>> >> > Close but no cigar.
>> >> >
>> >> > if (dri2_surf->enable_out_fence && dri2_dpy->fence &&
>> >> > dri2_dpy->fence->base.version >= 2 &&
>> >> > dri2_dpy->fence->get_capabilities) {
>> >> >
>> >> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
>> >> > __DRI_FENCE_CAP_NATIVE_FD) {
>> >> > //create_fence_fd call
>> >> >}
>> >> > }
>> >>
>> >> If this needs so complicated series of checks, maybe it would make
>> >> more sense to just set enable_out_fence based on availability of the
>> >> capability at initialization time?
>> >
>> > I liked this one compared to nested ifs in dri2_surf_update_fence_fd().
>> >
>> >>
>> >> >
>> >> >> Overall, if I further go ahead and check, actually
>> >> >> get_capabilities() ultimately returns based on has_exec_fence
>> >> >> which depends on I915_PARAM_HAS_EXEC_FENCE. This is always set to
>> >> >> true for i915 in kernel drv unless forced to false!! I'm not sure
>> >> >> if that inner check of
>> >> get_capabilities still makes sense. Isn't the first one sufficient?
>> >> >>
>> >> > Not sure what you mean with "first one", but consider the following
>> example:
>> >> >  - old kernel which does not support (or has force disabled)
>> >> > I915_PARAM_HAS_EXEC_FENCE.
>> >> >  - new userspace which unconditionally advertises the fence v2
>> >> > extension IIRC one may tweak that things to only conditionally
>> >> > advertise it, but IMHO it's not worth the hassle.
>> >> >
>> >> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module)
>> >> > so focusing on one doesn't quite work.
>> >> >
>> >> >>> >>  - don't introduce unused variables (in make_current)
>> >> >>
>> >> >> Done.
>> >> >>
>> >> >>> >>  - the create fd for the old display surface (in make_current)
>> >> >>> >> seems bogus
>> >> >>
>> >> >> Done.
>> >> >>
>> >> > Did you drop it all together or changed to use some other surface?
>> >> > Would be nice to hear the reason why it was added - perhaps I'm
>> >> > missing something.
>> >>
>> >> We have to keep it, otherwise there would be no fence available at
>> >> the time of surface destruction, while, at least for Android, a fence
>> >> can be passed to window's cancelBuffer callback.
>> >>
>> >> >
>> >> > I think that we want a fence/fd for the new draw surface. Since
>> >> > otherwise one won't get created up until the first SwapBuffers call.
>> >>
>> >> I might be missing something, but wouldn't that insert a fence at the
>> >> beginning of command stream, before even doing anything? At least in
>> >> Android use cases, the only places we need the fence is in
>> >> SwapBuffers and DestroySurface and the fence should be inserted after
>> >> all the commands for rendering into given surface.
>> >>
>> >
>> > Emil,
>> >
>> > Tomasz sounds convincing to me here, I just went ahead with the
>> > comment to try out and flatland worked even after removing that.
>> > Zhongmin can explain better but I think in earlier revisions this was
>> > done for cancelBuffer to match with queueBuffer, I mean we are passing
>> > valid fd for queueBuffer by doing this we would have a valid fd during
>> > cancelBuffer.  Not sure if this is the reason / one of the reasons.
>> >
>> > I will go ahead with rest of your comments if we are ok to keep fd for
>> > old display surface in make_current.
>>
>> My understanding is that nobody actually cares about the fence that
>> cancelBuffer returns, because the contents of the buffer are going to be
>> discarded anyway and the buffer doesn't go to the consumer (e.g.
>> flatland code that reads the timestamp). I even suspect that typically
>> destroySurface would be called directly after swapBuffers and the surface
>> wouldn't have a buffer to cancel. You can easily check this by adding a print
>> before cancelBuffer call happens. So we might actually be fine with simpler
>> code that gets fence only for swapBuffers.
>>
>
> Sure. I can 
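
The difference between the two checks quoted above comes down to `|` versus `&`: OR-ing a non-zero capability bit into any value is always non-zero, so the first version never actually tests the capability. A minimal sketch of the init-time approach suggested in the thread might look like the following. The struct layout, `CAP_NATIVE_FD`, and `can_enable_out_fence()` are hypothetical stand-ins chosen for illustration, not the real Mesa definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRI fence extension; not Mesa's real types. */
#define CAP_NATIVE_FD (1u << 0)

struct fence_ext {
   int version;                                /* extension version */
   unsigned (*get_capabilities)(void *screen); /* may be NULL on old drivers */
};

/* Decide once, at initialization, whether out-fences can be used. */
static bool
can_enable_out_fence(const struct fence_ext *fence, void *screen)
{
   if (!fence || fence->version < 2 || !fence->get_capabilities)
      return false;

   /* Bitwise AND tests the bit; bitwise OR would always be non-zero. */
   return (fence->get_capabilities(screen) & CAP_NATIVE_FD) != 0;
}

/* Two fake drivers for illustration only. */
static unsigned caps_none(void *screen) { (void) screen; return 0; }
static unsigned caps_native_fd(void *screen) { (void) screen; return CAP_NATIVE_FD; }
```

With the result cached in something like `enable_out_fence`, the per-frame path would only need to test that one flag instead of repeating the nested checks.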

Re: [Mesa-dev] [PATCH] radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)

2017-08-04 Thread Dave Airlie
On 4 August 2017 at 18:53, Nicolai Hähnle  wrote:
> On 04.08.2017 04:51, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> This is a bug in the app, but I'd rather avoid hanging the GPU,
>> esp if someone is running in validation and it takes out their
>> development environment.
>>
>> v2: get it right, reverse the polarity.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>   src/amd/vulkan/radv_meta_resolve.c | 5 +
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/src/amd/vulkan/radv_meta_resolve.c
>> b/src/amd/vulkan/radv_meta_resolve.c
>> index 6cd0c38..6023e0f 100644
>> --- a/src/amd/vulkan/radv_meta_resolve.c
>> +++ b/src/amd/vulkan/radv_meta_resolve.c
>> @@ -382,6 +382,11 @@ void radv_CmdResolveImage(
>> radv_meta_save_graphics_reset_vport_scissor_novertex(&saved_state,
>> cmd_buffer);
>> assert(src_image->info.samples > 1);
>> +   if (src_image->info.samples <= 1) {
>> +   /* this causes GPU hangs if we get past here */
>> +   fprintf(stderr, "radv: Illegal resolve operation (src not multisampled), will hang GPU.");
>> +   return;
>
>
> If you really want to make sure developers get this right, you should
> probably abort(); here? Although that might then bug users... maybe an
> abort() that can be skipped by explicitly setting an environment variable?

Well, ideally we do nothing and let GPU reset happen, but I think we've
got a bit of work in that area first!

This should be fine for now, hopefully they spot the missing rendering
if they attempt this.

Dave.
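
For reference, the review suggestion of an abort() that can be skipped through an environment variable could be sketched roughly as below. This is only an illustration under stated assumptions: `check_resolve_source()`, `abort_on_bad_resolve()` and the `RESOLVE_NO_ABORT` variable name are hypothetical and do not exist in radv.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical policy switch: abort unless RESOLVE_NO_ABORT is set. */
static bool
abort_on_bad_resolve(void)
{
   return getenv("RESOLVE_NO_ABORT") == NULL;
}

/* Returns true if the resolve may proceed.  For a non-multisampled source
 * it logs an error, optionally aborts, and otherwise tells the caller to
 * skip the operation instead of submitting work that would hang the GPU. */
static bool
check_resolve_source(unsigned samples, bool abort_on_error)
{
   if (samples > 1)
      return true;

   fprintf(stderr, "Illegal resolve operation (src not multisampled), "
                   "skipping to avoid a GPU hang.\n");

   if (abort_on_error)
      abort();

   return false;
}
```

A caller would use it as `if (!check_resolve_source(samples, abort_on_bad_resolve())) return;`, which keeps the early return from the patch while letting developers opt into a hard failure.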


Re: [Mesa-dev] [PATCH 21/30] i965/screen: Use ISL for doing image import checks

2017-08-04 Thread Rainer Hochecker

This seems to break exporting 16bit vaapi images via drm buffers


Re: [Mesa-dev] [PATCH 8/8] egl/drm: rename dri2_drm_create_surface()

2017-08-04 Thread Emil Velikov
On 5 August 2017 at 00:25, Emil Velikov  wrote:
> From: Emil Velikov 
>
> The function can handle only window surfaces, so let's rename it
> accordingly, killing the wrapper around it.
>
> Suggested-by: Eric Engestrom 
> Signed-off-by: Emil Velikov 
> ---
> New patch
> ---
>  src/egl/drivers/dri2/platform_drm.c | 17 -
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_drm.c 
> b/src/egl/drivers/dri2/platform_drm.c
> index 8d56fcb7698..89ad9e0d10c 100644
> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -91,9 +91,9 @@ has_free_buffers(struct gbm_surface *_surf)
>  }
>
>  static _EGLSurface *
> -dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
> -_EGLConfig *conf, void *native_surface,
> -const EGLint *attrib_list)
> +dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
> +   _EGLConfig *conf, void *native_window,

Fixed this locally to read native_surface, instead of native_window.

-Emil


Re: [Mesa-dev] No reloc for i965

2017-08-04 Thread Kenneth Graunke
On Friday, August 4, 2017 12:22:19 PM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-08-04 19:47:14)
> > On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > > Patch reordering from last time so that the cosmetic tweaks are done first
> > > and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> > > hopefully it doesn't look too bad and we can land at least as far as
> > > there (patch 8/10).
> > > 
> > > Thanks,
> > > -Chris
> > 
> > I split up some patches and pushed a modified version of this series.
> > 
> > To ssh://git.freedesktop.org/git/mesa/mesa
> >5c007203b73..6c530ad1160  master -> master
> > 
> > Thanks a ton for getting us to NO_RELOC.  I really like the new reloc
> > flags system as well.  It's so much nicer!
> 
> I've still got to win you over to using LUT indices (kernel side, there
> shouldn't be any case where it is worse, but the differences are easily
> dwarfed in typical cases where it is only about 10% faster, but any
> reduction inside the struct_mutex is a must), I see, and the per-context
> bo along with removing the auxiliary render_cache set...
> 
> But now for something completely different...
> -Chris

I landed I915_EXEC_HANDLE_LUT too, actually.

I definitely want per-context BOs, but Jason and I had come up with some
patches for that as well, and I haven't had a chance to compare your
approach with ours to see which is better.  I hope to do that soon.

--Ken



Re: [Mesa-dev] [PATCH] intel/genxml: Fix gen10 BLEND_STATE variable length packing

2017-08-04 Thread Rafael Antognolli
Oh, I saw it had the old xml and was assuming it didn't cause any errors,
but clearly I was wrong.

Reviewed-by: Rafael Antognolli 

On Fri, Aug 04, 2017 at 10:21:43PM +, Scott D Phillips wrote:
> BLEND_STATE packing was modified to be variable-length in:
> 
>  9670124e31 genxml: Make BLEND_STATE command support variable length array.
> 
> The initial gen10.xml still had the old, fixed-length style
> definition for BLEND_STATE. So gen10_upload_blend_state would
> overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
> of all-zero entries when packing BLEND_STATE. This caused
> BLEND_STATE upload to not work at all.
> 
> Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
> ---
>  src/intel/genxml/gen10.xml | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
> index 23c2adb995..a7ae49ae65 100644
> --- a/src/intel/genxml/gen10.xml
> +++ b/src/intel/genxml/gen10.xml
> @@ -554,7 +554,7 @@
>  
>
>  
> -  
> +  
>  
>   type="bool"/>
>  
> @@ -564,7 +564,7 @@
>  
>  
>  
> -
> +
>
>  
>
> -- 
> 2.11.0
> 


Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency

2017-08-04 Thread Emil Velikov
On 4 August 2017 at 23:08, Dieter Nützel  wrote:
> For the series:
>
> Tested-by: Dieter Nützel 
>
> on RX580
>
> with Clover, vdpau and Nine.
>
> ./autogen.sh --prefix=/usr/local --with-dri-drivers=""
> --with-gallium-drivers=r600,radeonsi,swrast --with-platforms=drm,x11
> --enable-nine --enable-texture-float --enable-opencl
> --with-vulkan-drivers=radeon
>
Thanks Dieter.

I've refrained from pushing 2/3 and 3/3 since they seem incomplete.

-Emil


[Mesa-dev] [PATCH 3/8] egl: handle BAD_NATIVE_PIXMAP further up the stack

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

The basic (null) check is identical across all backends.
Just move it to the top.

v2:
 - Split the WINDOW vs PIXMAP into separate patches
 - Move check after the dpy and config - dEQP expects so

Cc: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/platform_x11.c | 5 -
 src/egl/main/eglapi.c   | 3 +++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 00cab577b77..e2007a0313e 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -234,11 +234,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
dri2_surf->drawable, dri2_dpy->screen->root,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
-  if (!drawable) {
- assert(type == EGL_PIXMAP_BIT_BIT)
- _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
- goto cleanup_surf;
-  }
   dri2_surf->drawable = drawable;
}
 
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index c5e3955c48c..3ca3dd4c7c1 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1021,6 +1021,9 @@ _eglCreatePixmapSurfaceCommon(_EGLDisplay *disp, 
EGLConfig config,
if ((conf->SurfaceType & EGL_PIXMAP_BIT) == 0)
   RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
 
+   if (native_pixmap == NULL)
+  RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_PIXMAP, EGL_NO_SURFACE);
+
surf = drv->API.CreatePixmapSurface(drv, disp, conf, native_pixmap,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
-- 
2.13.3
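
The idea of the patch, validating the native handle once in the common entry point and in a fixed order, can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in (`enum result`, `struct config`, `SURFACE_PIXMAP_BIT`, `validate_pixmap_surface_args()`); only the ordering, config check first and NULL check second, is taken from the commit message and the hunk above.

```c
#include <stddef.h>

/* Hypothetical error codes and config type standing in for the EGL ones. */
enum result { RESULT_OK, RESULT_BAD_MATCH, RESULT_BAD_NATIVE_PIXMAP };

#define SURFACE_PIXMAP_BIT 0x2u

struct config { unsigned surface_type; };

/* Common validation done once, before any platform backend is called.
 * The config check comes first, then the NULL check on the native pixmap. */
static enum result
validate_pixmap_surface_args(const struct config *conf,
                             const void *native_pixmap)
{
   if ((conf->surface_type & SURFACE_PIXMAP_BIT) == 0)
      return RESULT_BAD_MATCH;

   if (native_pixmap == NULL)
      return RESULT_BAD_NATIVE_PIXMAP;

   return RESULT_OK;
}
```

Because every backend sits behind this single check, the per-platform copies of the NULL test become unreachable and can be dropped.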



[Mesa-dev] [PATCH 6/8] egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

Signed-off-by: Emil Velikov 
Reviewed-by: Matt Turner 
---
 src/egl/drivers/dri2/platform_x11.c  | 2 +-
 src/egl/drivers/dri2/platform_x11_dri3.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index ce5450155aa..ce4ba6b6e15 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -359,7 +359,7 @@ dri2_x11_create_pbuffer_surface(_EGLDriver *drv, 
_EGLDisplay *disp,
 _EGLConfig *conf, const EGLint *attrib_list)
 {
return dri2_x11_create_surface(drv, disp, EGL_PBUFFER_BIT, conf,
-  XCB_WINDOW_NONE, attrib_list);
+  NULL, attrib_list);
 }
 
 static EGLBoolean
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index b88374c1cbb..9c018168b1c 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -238,7 +238,7 @@ dri3_create_pbuffer_surface(_EGLDriver *drv, _EGLDisplay 
*disp,
 _EGLConfig *conf, const EGLint *attrib_list)
 {
return dri3_create_surface(drv, disp, EGL_PBUFFER_BIT, conf,
-  XCB_WINDOW_NONE, attrib_list);
+  NULL, attrib_list);
 }
 
 static EGLBoolean
-- 
2.13.3
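
Why NULL is a reasonable replacement here: the X11 backends carry a 32-bit drawable ID through a `void *` argument by casting via `uintptr_t`, and XCB_WINDOW_NONE is 0, so a NULL pointer round-trips to the same "no drawable" value without passing an enum where a pointer is expected. A minimal sketch of that round trip, with the hypothetical names `drawable_id`, `drawable_to_native()` and `native_to_drawable()`:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t drawable_id;   /* stand-in for xcb_drawable_t, a 32-bit XID */

/* Pack an XID into the opaque pointer argument... */
static void *
drawable_to_native(drawable_id id)
{
   return (void *) (uintptr_t) id;
}

/* ...and recover it on the other side.  NULL maps to 0, i.e. "none". */
static drawable_id
native_to_drawable(void *native_surface)
{
   return (drawable_id) (uintptr_t) native_surface;
}
```

The pbuffer paths never read the native handle at all, which is the other reason passing NULL is harmless in these two call sites.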



[Mesa-dev] [PATCH 8/8] egl/drm: rename dri2_drm_create_surface()

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

The function can handle only window surfaces, so let's rename it
accordingly, killing the wrapper around it.

Suggested-by: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
New patch
---
 src/egl/drivers/dri2/platform_drm.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index 8d56fcb7698..89ad9e0d10c 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -91,9 +91,9 @@ has_free_buffers(struct gbm_surface *_surf)
 }
 
 static _EGLSurface *
-dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
-_EGLConfig *conf, void *native_surface,
-const EGLint *attrib_list)
+dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
+   _EGLConfig *conf, void *native_window,
+   const EGLint *attrib_list)
 {
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_config *dri2_conf = dri2_egl_config(conf);
@@ -110,7 +110,7 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
   return NULL;
}
 
-   if (!_eglInitSurface(&dri2_surf->base, disp, type, conf, attrib_list))
+   if (!_eglInitSurface(&dri2_surf->base, disp, EGL_WINDOW_BIT, conf, 
attrib_list))
   goto cleanup_surf;
 
surf = gbm_dri_surface(window);
@@ -149,15 +149,6 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
 }
 
 static _EGLSurface *
-dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
-   _EGLConfig *conf, void *native_window,
-   const EGLint *attrib_list)
-{
-   return dri2_drm_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
-  native_window, attrib_list);
-}
-
-static _EGLSurface *
 dri2_drm_create_pixmap_surface(_EGLDriver *drv, _EGLDisplay *disp,
_EGLConfig *conf, void *native_window,
const EGLint *attrib_list)
-- 
2.13.3



[Mesa-dev] [PATCH 2/8] egl: drop unreachable BAD_NATIVE_WINDOW conditions

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

The code in _eglCreateWindowSurfaceCommon() already has a NULL check
which handles the condition. There's no point in checking again further
down the stack.

v2: Split the WINDOW vs PIXMAP into separate patches

Cc: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/platform_android.c | 2 +-
 src/egl/drivers/dri2/platform_drm.c | 5 -
 src/egl/drivers/dri2/platform_wayland.c | 5 -
 src/egl/drivers/dri2/platform_x11.c | 6 ++
 4 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 50a82486956..beb474025f7 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -329,7 +329,7 @@ droid_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
if (type == EGL_WINDOW_BIT) {
   int format;
 
-  if (!window || window->common.magic != ANDROID_NATIVE_WINDOW_MAGIC) {
+  if (window->common.magic != ANDROID_NATIVE_WINDOW_MAGIC) {
  _eglError(EGL_BAD_NATIVE_WINDOW, "droid_create_surface");
  goto cleanup_surface;
   }
diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index a952aa54560..7ea43e62010 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -115,11 +115,6 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
 
switch (type) {
case EGL_WINDOW_BIT:
-  if (!window) {
- _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
- goto cleanup_surf;
-  }
-
   surf = gbm_dri_surface(window);
   dri2_surf->gbm_surf = surf;
   dri2_surf->base.Width =  surf->base.width;
diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index 38fdfe974fa..dcc777a3e8b 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -162,11 +162,6 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay 
*disp,
  dri2_surf->format = WL_SHM_FORMAT_ARGB8888;
}
 
-   if (!window) {
-  _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
-  goto cleanup_surf;
-   }
-
dri2_surf->wl_win = window;
dri2_surf->wl_queue = wl_display_create_queue(dri2_dpy->wl_dpy);
if (!dri2_surf->wl_queue) {
diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 7b5a1770bd7..00cab577b77 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -235,10 +235,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
   if (!drawable) {
- if (type == EGL_WINDOW_BIT)
-_eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
- else
-_eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
+ assert(type == EGL_PIXMAP_BIT_BIT)
+ _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
  goto cleanup_surf;
   }
   dri2_surf->drawable = drawable;
-- 
2.13.3



[Mesa-dev] [PATCH 7/8] egl/drm: remove unreachable code in dri2_drm_create_surface()

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

The function can be called only when the type is EGL_WINDOW_BIT.
Remove the unneeded switch statement.

Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/platform_drm.c | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index 7ea43e62010..8d56fcb7698 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -92,13 +92,13 @@ has_free_buffers(struct gbm_surface *_surf)
 
 static _EGLSurface *
 dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
-_EGLConfig *conf, void *native_window,
+_EGLConfig *conf, void *native_surface,
 const EGLint *attrib_list)
 {
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_config *dri2_conf = dri2_egl_config(conf);
struct dri2_egl_surface *dri2_surf;
-   struct gbm_surface *window = native_window;
+   struct gbm_surface *window = native_surface;
struct gbm_dri_surface *surf;
const __DRIconfig *config;
 
@@ -113,17 +113,11 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
if (!_eglInitSurface(&dri2_surf->base, disp, type, conf, attrib_list))
   goto cleanup_surf;
 
-   switch (type) {
-   case EGL_WINDOW_BIT:
-  surf = gbm_dri_surface(window);
-  dri2_surf->gbm_surf = surf;
-  dri2_surf->base.Width =  surf->base.width;
-  dri2_surf->base.Height = surf->base.height;
-  surf->dri_private = dri2_surf;
-  break;
-   default:
-  goto cleanup_surf;
-   }
+   surf = gbm_dri_surface(window);
+   dri2_surf->gbm_surf = surf;
+   dri2_surf->base.Width =  surf->base.width;
+   dri2_surf->base.Height = surf->base.height;
+   surf->dri_private = dri2_surf;
 
config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
 dri2_surf->base.GLColorspace);
-- 
2.13.3



[Mesa-dev] [PATCH 1/8] egl: add dri2_setup_swap_interval helper

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

The two current implementations, X11 and Wayland, were identical,
barring the upper limit.

Instead of having same code twice - introduce a helper and pass the
limit as an argument.

Thus as Android/DRM/others get support - they only need to call the
function ;-)

v2: Rebase on top of keeping ::swap_available

Signed-off-by: Emil Velikov 
Reviewed-by: Eric Engestrom  (v1)
---
 src/egl/drivers/dri2/egl_dri2.c | 35 +++
 src/egl/drivers/dri2/egl_dri2.h |  3 +++
 src/egl/drivers/dri2/platform_wayland.c | 37 +
 src/egl/drivers/dri2/platform_x11.c | 37 ++---
 4 files changed, 49 insertions(+), 63 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 733659d547f..936b7c5199e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -728,6 +728,41 @@ dri2_setup_screen(_EGLDisplay *disp)
}
 }
 
+void
+dri2_setup_swap_interval(_EGLDisplay *disp, int max_swap_interval)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   GLint vblank_mode = DRI_CONF_VBLANK_DEF_INTERVAL_1;
+
+   /* Allow driconf to override applications.*/
+   if (dri2_dpy->config)
+  dri2_dpy->config->configQueryi(dri2_dpy->dri_screen,
+ "vblank_mode", &vblank_mode);
+   switch (vblank_mode) {
+   case DRI_CONF_VBLANK_NEVER:
+  dri2_dpy->min_swap_interval = 0;
+  dri2_dpy->max_swap_interval = 0;
+  dri2_dpy->default_swap_interval = 0;
+  break;
+   case DRI_CONF_VBLANK_ALWAYS_SYNC:
+  dri2_dpy->min_swap_interval = 1;
+  dri2_dpy->max_swap_interval = max_swap_interval;
+  dri2_dpy->default_swap_interval = 1;
+  break;
+   case DRI_CONF_VBLANK_DEF_INTERVAL_0:
+  dri2_dpy->min_swap_interval = 0;
+  dri2_dpy->max_swap_interval = max_swap_interval;
+  dri2_dpy->default_swap_interval = 0;
+  break;
+   default:
+   case DRI_CONF_VBLANK_DEF_INTERVAL_1:
+  dri2_dpy->min_swap_interval = 0;
+  dri2_dpy->max_swap_interval = max_swap_interval;
+  dri2_dpy->default_swap_interval = 1;
+  break;
+   }
+}
+
 /* All platforms but DRM call this function to create the screen and populate
  * the driver_configs. DRM inherits that information from its display - GBM.
  */
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index ccfefef61fc..751e7a4e2f3 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -370,6 +370,9 @@ dri2_load_driver(_EGLDisplay *disp);
 void
 dri2_setup_screen(_EGLDisplay *disp);
 
+void
+dri2_setup_swap_interval(_EGLDisplay *disp, int max_swap_interval);
+
 EGLBoolean
 dri2_load_driver_swrast(_EGLDisplay *disp);
 
diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index 73966b7c504..38fdfe974fa 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -925,7 +925,7 @@ dri2_wl_query_buffer_age(_EGLDriver *drv,
 static EGLBoolean
 dri2_wl_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
 {
-   return dri2_wl_swap_buffers_with_damage (drv, disp, draw, NULL, 0);
+   return dri2_wl_swap_buffers_with_damage(drv, disp, draw, NULL, 0);
 }
 
 static struct wl_buffer *
@@ -1140,41 +1140,14 @@ static const struct wl_registry_listener 
registry_listener_drm = {
 };
 
 static void
-dri2_wl_setup_swap_interval(struct dri2_egl_display *dri2_dpy)
+dri2_wl_setup_swap_interval(_EGLDisplay *disp)
 {
-   GLint vblank_mode = DRI_CONF_VBLANK_DEF_INTERVAL_1;
-
/* We can't use values greater than 1 on Wayland because we are using the
 * frame callback to synchronise the frame and the only way we be sure to
 * get a frame callback is to attach a new buffer. Therefore we can't just
 * sit drawing nothing to wait until the next ‘n’ frame callbacks */
 
-   if (dri2_dpy->config)
-  dri2_dpy->config->configQueryi(dri2_dpy->dri_screen,
- "vblank_mode", &vblank_mode);
-   switch (vblank_mode) {
-   case DRI_CONF_VBLANK_NEVER:
-  dri2_dpy->min_swap_interval = 0;
-  dri2_dpy->max_swap_interval = 0;
-  dri2_dpy->default_swap_interval = 0;
-  break;
-   case DRI_CONF_VBLANK_ALWAYS_SYNC:
-  dri2_dpy->min_swap_interval = 1;
-  dri2_dpy->max_swap_interval = 1;
-  dri2_dpy->default_swap_interval = 1;
-  break;
-   case DRI_CONF_VBLANK_DEF_INTERVAL_0:
-  dri2_dpy->min_swap_interval = 0;
-  dri2_dpy->max_swap_interval = 1;
-  dri2_dpy->default_swap_interval = 0;
-  break;
-   default:
-   case DRI_CONF_VBLANK_DEF_INTERVAL_1:
-  dri2_dpy->min_swap_interval = 0;
-  dri2_dpy->max_swap_interval = 1;
-  dri2_dpy->default_swap_interval = 1;
-  break;
-   }
+   dri2_setup_swap_interval(disp, 1);
 }
 
 static const struct 
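
The decision table in the new helper can be exercised outside of Mesa with a small sketch like the one below. The enum, `struct swap_limits`, `swap_limits_for()` and `clamp_swap_interval()` are hypothetical names chosen for illustration; the table itself mirrors the switch statement in the patch, while the clamping step is an assumption about how such limits would typically be applied to a requested interval.

```c
/* Hypothetical stand-ins for the driconf vblank_mode values. */
enum vblank_mode {
   VBLANK_NEVER,
   VBLANK_DEF_INTERVAL_0,
   VBLANK_DEF_INTERVAL_1,
   VBLANK_ALWAYS_SYNC,
};

struct swap_limits {
   int min_interval;
   int max_interval;
   int default_interval;
};

/* Same decision table as the helper in the patch, returned by value. */
static struct swap_limits
swap_limits_for(enum vblank_mode mode, int max_swap_interval)
{
   /* Start from the DEF_INTERVAL_1 row, which is also the default case. */
   struct swap_limits l = { 0, max_swap_interval, 1 };

   switch (mode) {
   case VBLANK_NEVER:
      l.max_interval = 0;
      l.default_interval = 0;
      break;
   case VBLANK_ALWAYS_SYNC:
      l.min_interval = 1;
      break;
   case VBLANK_DEF_INTERVAL_0:
      l.default_interval = 0;
      break;
   default:
      break;
   }
   return l;
}

/* Clamp an application's requested interval into the allowed range. */
static int
clamp_swap_interval(const struct swap_limits *l, int requested)
{
   if (requested < l->min_interval)
      return l->min_interval;
   if (requested > l->max_interval)
      return l->max_interval;
   return requested;
}
```

Passing the upper limit as an argument is what lets one table serve every platform: Wayland passes 1, as in the hunk above, and other platforms can pass whatever maximum they support.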

[Mesa-dev] [PATCH 5/8] egl: Clean up native_type vs drawable mess

2017-08-04 Thread Emil Velikov
From: Matt Turner 

The next patch is going to stop passing XCB_WINDOW_NONE (of type
xcb_window_enum_t) as an argument where these functions expect a void *,
which clang does not appreciate.

This patch cleans things up to better convince me and reviewers that
it's safe to do that.
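
A minimal sketch of the conversion this patch moves into the else branch, assuming a 32-bit drawable ID carried in a pointer-sized argument. The names `sketch_drawable_t` and `sketch_drawable_from_native` are hypothetical stand-ins, not identifiers from platform_x11.c:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical stand-in for xcb_drawable_t, which is a 32-bit XID. */
typedef uint32_t sketch_drawable_t;

/* Recover an integer drawable ID that the caller passed through a void *.
 * Going through uintptr_t keeps the pointer-to-integer conversion explicit
 * before narrowing to the 32-bit ID. */
static sketch_drawable_t
sketch_drawable_from_native(void *native_surface)
{
   static_assert(sizeof(uintptr_t) == sizeof(native_surface),
                 "native handle must fit in uintptr_t");
   return (sketch_drawable_t)(uintptr_t)native_surface;
}
```

Doing the cast only on the path that actually consumes `native_surface` means the pixmap path never has to interpret the argument at all.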

v2: Emil Velikov: rebase/integrate with series
Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/platform_x11.c  | 7 ++-
 src/egl/drivers/dri2/platform_x11_dri3.c | 6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index e2007a0313e..ce5450155aa 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -210,12 +210,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
xcb_get_geometry_cookie_t cookie;
xcb_get_geometry_reply_t *reply;
xcb_generic_error_t *error;
-   xcb_drawable_t drawable;
const __DRIconfig *config;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
(void) drv;
 
dri2_surf = malloc(sizeof *dri2_surf);
@@ -234,7 +230,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
dri2_surf->drawable, dri2_dpy->screen->root,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
-  dri2_surf->drawable = drawable;
+  STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+  dri2_surf->drawable = (uintptr_t) native_surface;
}
 
config = dri2_get_dri_config(dri2_conf, type,
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index 3a0efc6ccc9..b88374c1cbb 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -141,9 +141,6 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
const __DRIconfig *dri_config;
xcb_drawable_t drawable;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
(void) drv;
 
dri3_surf = calloc(1, sizeof *dri3_surf);
@@ -160,6 +157,9 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
   xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize,
 drawable, dri2_dpy->screen->root,
 dri3_surf->base.Width, dri3_surf->base.Height);
+   } else {
+  STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+  drawable = (uintptr_t) native_surface;
}
 
dri_config = dri2_get_dri_config(dri2_conf, type,
-- 
2.13.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/8] egl: rework input validation order in _eglCreateWindowSurfaceCommon

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

As mentioned in the previous commit, the negative tests in dEQP expect the
arguments to be evaluated in a particular order.

Namely - first the dpy, then the config, followed by the surface/window.
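
The ordering can be sketched as follows. This is an illustration only: the enum, the struct and `sketch_validate_window_surface` are hypothetical and much simpler than the real eglapi.c code paths.

```c
#include <stddef.h>

/* Hypothetical error codes mirroring the EGL errors named in this thread. */
enum sketch_error {
   SKETCH_SUCCESS,
   SKETCH_BAD_DISPLAY,
   SKETCH_BAD_CONFIG,
   SKETCH_BAD_MATCH,
   SKETCH_BAD_NATIVE_WINDOW
};

/* Hypothetical config: only records whether window surfaces are supported. */
struct sketch_config {
   int window_bit;
};

/* Validate in the order the tests expect: display, then config,
 * then the native window. */
static enum sketch_error
sketch_validate_window_surface(const void *dpy,
                               const struct sketch_config *conf,
                               const void *native_window)
{
   if (dpy == NULL)
      return SKETCH_BAD_DISPLAY;
   if (conf == NULL)
      return SKETCH_BAD_CONFIG;
   if (!conf->window_bit)
      return SKETCH_BAD_MATCH;
   if (native_window == NULL)
      return SKETCH_BAD_NATIVE_WINDOW;
   return SKETCH_SUCCESS;
}
```

With a valid display but both a bad config and a NULL window, this ordering reports the config error, which is what the negative test checks for.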

Move the check further down; otherwise, executing the test below produces
the following error.

   dEQP-EGL.functional.negative_api.create_pbuffer_surface


   
  eglCreateWindowSurface(0x9bfff0f150, 0x, 
0x, { EGL_NONE });
  // 0x returned
  // ERROR expected: EGL_BAD_CONFIG, Got: EGL_BAD_NATIVE_WINDOW
   

Cc: 
Cc: Mark Janes 
Cc: Chad Versace 
Signed-off-by: Emil Velikov 
---
Mark,

IMHO the CI does the impossible and passes the test. Perhaps it's worth
looking into how/why it does so - I don't know.

I'll pipe the series through Jenkins tomorrow - don't want to stall
things for the guys still working.

Chad, I see that in the EGL_MESA_surfaceless implementation you
explicitly mentioned that the surface is checked prior to the config.

Wouldn't it be better to stay consistent and move those, as per the
above? AFAICT the spec does not explicitly dictate the order.
---
 src/egl/main/eglapi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3ca3dd4c7c1..3b0f896f74c 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -872,10 +872,6 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig 
config,
_EGLSurface *surf;
EGLSurface ret;
 
-
-   if (native_window == NULL)
-  RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
-
 #ifdef HAVE_SURFACELESS_PLATFORM
if (disp && disp->Platform == _EGL_PLATFORM_SURFACELESS) {
   /* From the EGL_MESA_platform_surfaceless spec (v1):
@@ -899,6 +895,9 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig 
config,
if ((conf->SurfaceType & EGL_WINDOW_BIT) == 0)
   RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
 
+   if (native_window == NULL)
+  RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
+
surf = drv->API.CreateWindowSurface(drv, disp, conf, native_window,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
-- 
2.13.3



[Mesa-dev] [PATCH] intel/genxml: Fix gen10 BLEND_STATE variable length packing

2017-08-04 Thread Scott D Phillips
BLEND_STATE packing was modified to be variable-length in:

 9670124e31 genxml: Make BLEND_STATE command support variable length array.

The initial gen10.xml still had the old, fixed-length style
definition for BLEND_STATE. So gen10_upload_blend_state would
overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
of all-zero entries when packing BLEND_STATE. This caused
BLEND_STATE upload to not work at all.
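
A simplified sketch of the failure mode described above, assuming the per-render-target entries are written first and the header is packed afterwards. The one-dword entry size and all `sketch_` names are assumptions for illustration, not the generated genxml pack functions:

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_RT 8   /* assumed number of per-render-target entries */

/* Fixed-length style: the pack routine owns the whole structure, so it
 * writes the header and then its own (all-zero) entry array, clobbering
 * anything the caller packed into the entry slots beforehand. */
static void
sketch_pack_fixed(uint32_t *dst, uint32_t header)
{
   dst[0] = header;
   memset(&dst[1], 0, SKETCH_MAX_RT * sizeof(uint32_t));
}

/* Variable-length style: the pack routine writes only the header and
 * leaves the trailing entries to the caller. */
static void
sketch_pack_variable(uint32_t *dst, uint32_t header)
{
   dst[0] = header;
}
```

Under that assumption, switching the definition to the variable-length form is what lets the previously packed entries survive.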

Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
---
 src/intel/genxml/gen10.xml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index 23c2adb995..a7ae49ae65 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -554,7 +554,7 @@
 
   
 
-  
+  
 
 
 
@@ -564,7 +564,7 @@
 
 
 
-
+
   
 
   
-- 
2.11.0



Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency

2017-08-04 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

on RX580

with Clover, vdpau and Nine.

./autogen.sh --prefix=/usr/local --with-dri-drivers="" 
--with-gallium-drivers=r600,radeonsi,swrast --with-platforms=drm,x11 
--enable-nine --enable-texture-float --enable-opencl 
--with-vulkan-drivers=radeon


Dieter

On 04.08.2017 20:18, Emil Velikov wrote:

From: Emil Velikov 

Currently xmlconfig is conditionally used, only when --enable-dri is
available.

As the library has moved to src/util and has a wider userbase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.

Unconditionally enable xmlconfig and drop the linking. As said before,
there are other users of the library, so depending on the configure
options we would otherwise get multiple definitions of said symbols.

NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.

Cc: Nicolai Hähnle 
Cc: Aaron Watry 
Signed-off-by: Emil Velikov 
---
Nicolai, here is an alternative solution.

I have a very slight inclination towards this one over your earlier
patch. But either one should do, really.
---
 src/egl/Makefile.am   |  8 ++--
 src/gallium/auxiliary/pipe-loader/Makefile.am |  6 --
 src/gallium/targets/opencl/Makefile.am|  1 -
 src/gbm/Makefile.am   |  1 +
 src/glx/Makefile.am   |  4 +++-
 src/loader/Makefile.am| 15 ++-
 6 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index ecaf148aaec..bb8ec9745dd 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \
$(dri2_backend_FILES) \
$(dri3_backend_FILES)

-libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la
-libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB)
+libEGL_common_la_LIBADD += \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
+   $(DLOPEN_LIBS) \
+   $(LIBDRM_LIBS) \
+   $(CLOCK_LIB)

 GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \
generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \
diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am
b/src/gallium/auxiliary/pipe-loader/Makefile.am
index 4ebfc97e6d9..878159f2343 100644
--- a/src/gallium/auxiliary/pipe-loader/Makefile.am
+++ b/src/gallium/auxiliary/pipe-loader/Makefile.am
@@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \
 endif

 libpipe_loader_static_la_LIBADD = \
-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la

 libpipe_loader_dynamic_la_LIBADD = \
-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la

 EXTRA_DIST = SConscript
diff --git a/src/gallium/targets/opencl/Makefile.am
b/src/gallium/targets/opencl/Makefile.am
index e88fa0fd382..c9d2be7afd0 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/util/libmesautil.la \
-   $(top_builddir)/src/util/libxmlconfig.la \
$(EXPAT_LIBS) \
$(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am
index de8396000b7..7a9a12f87a0 100644
--- a/src/gbm/Makefile.am
+++ b/src/gbm/Makefile.am
@@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \

 libgbm_la_LIBADD = \
$(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
$(DLOPEN_LIBS)

 if HAVE_PLATFORM_WAYLAND
diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index b306bcc08db..34600475d98 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -97,7 +97,9 @@ libglx_la_SOURCES = \
singlepix.c \
vertarr.c

-libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la
+libglx_la_LIBADD = \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la

 if HAVE_DRISW
 libglx_la_SOURCES += \
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index 8b197f2995c..74ac6c51e77 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -26,6 +26,8 @@ EXTRA_DIST = SConscript
 noinst_LTLIBRARIES = libloader.la

 AM_CPPFLAGS = \
+   -I$(top_builddir)/src/util/ \
+   -DUSE_DRICONF \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
@@ -37,19 +39,6 @@ libloader_la_CPPFLAGS = 

Re: [Mesa-dev] [PATCH 10/12] egl/drm: remove unreachable code in dri2_drm_create_surface()

2017-08-04 Thread Emil Velikov
On 4 August 2017 at 11:03, Eric Engestrom  wrote:
> On Thursday, 2017-08-03 19:29:36 +0100, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> The function can be called only when the type is EGL_WINDOW_BIT.
>> Remove the unneeded switch statement.
>
> I take it we plan on never supporting pbuffers or pixmaps in platform_drm?
>
Pixmaps are explicitly forbidden, in the EGL platform
gbm/wayland/android (yes there is one) spec.
Pbuffers on the other hand are not mentioned at all in the ^^+x11 ones.

> If so, I'd rather fold dri2_drm_create_surface() into
> dri2_drm_create_window_surface(), as `type` is meaningless now and should
> be dropped, and without it the latter is an empty pass-through function.
> (can/should be a separate commit, but please send both as a single series)
>
Ack. Will respin the lot, splitting the unrelated patches into separate series.

> Reviewed-by: Eric Engestrom 
>
Thanks
Emil


Re: [Mesa-dev] [PATCH 08/11] i965/miptree: Call alloc_aux in create_for_bo

2017-08-04 Thread Jordan Justen
On 2017-08-02 13:35:33, Jason Ekstrand wrote:
> Originally, I had moved it to the caller to make some things easier when
> adding the CCS modifier.  However, this broke DRI2 because
> intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
> calls intel_miptree_alloc_aux.  Also, in hindsight, it should be pretty
> easy to make the CCS modifier stuff work even if create_for_bo allocates
> the CCS when DISABLE_AUX is not set.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101925

I guess you want to drop this based on Tapani's feedback.

Reviewed-by: Jordan Justen 

> Cc: Tapani Palli 
> Cc: "17.2" 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 910bb46..305912c 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -839,9 +839,15 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> mt->bo = bo;
> mt->offset = offset;
>  
> -   if (!(layout_flags & MIPTREE_LAYOUT_DISABLE_AUX))
> +   if (!(layout_flags & MIPTREE_LAYOUT_DISABLE_AUX)) {
>intel_miptree_choose_aux_usage(brw, mt);
>  
> +  if (!intel_miptree_alloc_aux(brw, mt)) {
> + intel_miptree_release(&mt);
> + return NULL;
> +  }
> +   }
> +
> return mt;
>  }
>  
> @@ -978,11 +984,6 @@ intel_miptree_create_for_dri_image(struct brw_context 
> *brw,
> if (is_winsys_image)
>image->bo->cache_coherent = false;
>  
> -   if (!intel_miptree_alloc_aux(brw, mt)) {
> -  intel_miptree_release(&mt);
> -  return NULL;
> -   }
> -
> return mt;
>  }
>  
> -- 
> 2.5.0.400.gff86faf
> 


[Mesa-dev] [PATCH 01/12] i965: Only put external handles into the handle ht

2017-08-04 Thread Chris Wilson
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 36 +++---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index e1036f25a4..844ccaf1e5 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -363,7 +363,6 @@ retry:
   }
 
   bo->gem_handle = create.handle;
-  _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
 
   bo->bufmgr = bufmgr;
   bo->align = alignment;
@@ -555,7 +554,6 @@ bo_free(struct brw_bo *bo)
 {
struct brw_bufmgr *bufmgr = bo->bufmgr;
struct drm_gem_close close;
-   struct hash_entry *entry;
int ret;
 
if (bo->map_cpu) {
@@ -571,12 +569,17 @@ bo_free(struct brw_bo *bo)
   drm_munmap(bo->map_gtt, bo->size);
}
 
-   if (bo->global_name) {
-  entry = _mesa_hash_table_search(bufmgr->name_table, &bo->global_name);
-  _mesa_hash_table_remove(bufmgr->name_table, entry);
+   if (bo->external) {
+  struct hash_entry *entry;
+
+  if (bo->global_name) {
+ entry = _mesa_hash_table_search(bufmgr->name_table, &bo->global_name);
+ _mesa_hash_table_remove(bufmgr->name_table, entry);
+  }
+
+  entry = _mesa_hash_table_search(bufmgr->handle_table, &bo->gem_handle);
+  _mesa_hash_table_remove(bufmgr->handle_table, entry);
}
-   entry = _mesa_hash_table_search(bufmgr->handle_table, &bo->gem_handle);
-   _mesa_hash_table_remove(bufmgr->handle_table, entry);
 
/* Close this object */
memclear(close);
@@ -1161,12 +1164,20 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int 
*prime_fd)
 {
struct brw_bufmgr *bufmgr = bo->bufmgr;
 
+   if (!bo->external) {
+  pthread_mutex_lock(&bufmgr->lock);
+  if (!bo->external) {
+ _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
+ bo->external = true;
+  }
+  pthread_mutex_unlock(&bufmgr->lock);
+   }
+
if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
   DRM_CLOEXEC, prime_fd) != 0)
   return -errno;
 
bo->reusable = false;
-   bo->external = true;
 
return 0;
 }
@@ -1185,14 +1196,17 @@ brw_bo_flink(struct brw_bo *bo, uint32_t *name)
  return -errno;
 
   pthread_mutex_lock(&bufmgr->lock);
+  if (!bo->external) {
+ _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
+ bo->external = true;
+  }
   if (!bo->global_name) {
  bo->global_name = flink.name;
- bo->reusable = false;
- bo->external = true;
-
  _mesa_hash_table_insert(bufmgr->name_table, &bo->global_name, bo);
   }
   pthread_mutex_unlock(&bufmgr->lock);
+
+  bo->reusable = false;
}
 
*name = bo->global_name;
-- 
2.13.3



[Mesa-dev] [PATCH 04/12] i965: Replace open-coded gen6 queryobj offsets with simple helpers

2017-08-04 Thread Chris Wilson
Lots of places open-coded the assumed layout of the predicate/results
within the query object; replace those with simple helpers.
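
As a rough illustration of the idea, the layout constants and byte-offset helpers might look like the sketch below. The slot numbers follow the values this patch defines (results at slot 0, predicate at slot 2); the `sketch_` names and the `sketch_read_result` reader are hypothetical additions for the example:

```c
#include <stdint.h>

/* Slot layout inside the query buffer, in units of uint64_t. */
#define SKETCH_QUERY_PREDICATE 2
#define SKETCH_QUERY_RESULTS   0

/* Byte offset of the availability/predicate word. */
static inline unsigned
sketch_query_predicate_offset(void)
{
   return SKETCH_QUERY_PREDICATE * sizeof(uint64_t);
}

/* Byte offset of result slot idx (0 = begin value, 1 = end value). */
static inline unsigned
sketch_query_results_offset(unsigned idx)
{
   return (SKETCH_QUERY_RESULTS + idx) * sizeof(uint64_t);
}

/* Hypothetical CPU-side reader that goes through the helper instead of
 * hard-coding "0" and "1" at each call site. */
static inline uint64_t
sketch_read_result(const uint64_t *map, unsigned idx)
{
   return map[sketch_query_results_offset(idx) / sizeof(uint64_t)];
}
```

Once every call site goes through the helpers, changing the layout becomes a matter of editing the two constants.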

v2: Fix function decl style.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_conditional_render.c | 10 --
 src/mesa/drivers/dri/i965/brw_context.h| 15 +++
 src/mesa/drivers/dri/i965/gen6_queryobj.c  |  6 +++---
 src/mesa/drivers/dri/i965/hsw_queryobj.c   | 18 +-
 4 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_conditional_render.c 
b/src/mesa/drivers/dri/i965/brw_conditional_render.c
index e33e79fb6c..0177a7f80b 100644
--- a/src/mesa/drivers/dri/i965/brw_conditional_render.c
+++ b/src/mesa/drivers/dri/i965/brw_conditional_render.c
@@ -87,8 +87,14 @@ set_predicate_for_occlusion_query(struct brw_context *brw,
 */
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);
 
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC0, query->bo, 0 /* offset */);
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC1, query->bo, 8 /* offset */);
+   brw_load_register_mem64(brw,
+   MI_PREDICATE_SRC0,
+   query->bo,
+   gen6_query_results_offset(query, 0));
+   brw_load_register_mem64(brw,
+   MI_PREDICATE_SRC1,
+   query->bo,
+   gen6_query_results_offset(query, 1));
 }
 
 static void
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index d41e6aa7bd..d37e05bb47 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -428,6 +428,21 @@ struct brw_query_object {
bool flushed;
 };
 
+#define GEN6_QUERY_PREDICATE (2)
+#define GEN6_QUERY_RESULTS (0)
+
+static inline unsigned
+gen6_query_predicate_offset(const struct brw_query_object *query)
+{
+   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+}
+
+static inline unsigned
+gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
+{
+   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+}
+
 enum brw_gpu_ring {
UNKNOWN_RING,
RENDER_RING,
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 1ee3974198..a0b786f5d9 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -71,7 +71,7 @@ set_query_availability(struct brw_context *brw, struct 
brw_query_object *query,
   }
 
   brw_emit_pipe_control_write(brw, flags,
-  query->bo, 2 * sizeof(uint64_t),
+  query->bo, 
gen6_query_predicate_offset(query),
   available);
}
 }
@@ -318,7 +318,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 0;
+   const int idx = GEN6_QUERY_RESULTS;
 
/* Since we're starting a new query, we need to throw away old results. */
brw_bo_unreference(query->bo);
@@ -407,7 +407,7 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 1;
+   const int idx = GEN6_QUERY_RESULTS + 1;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
diff --git a/src/mesa/drivers/dri/i965/hsw_queryobj.c 
b/src/mesa/drivers/dri/i965/hsw_queryobj.c
index 9dc3b3de86..32b2e1f342 100644
--- a/src/mesa/drivers/dri/i965/hsw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/hsw_queryobj.c
@@ -191,7 +191,7 @@ load_overflow_data_to_cs_gprs(struct brw_context *brw,
   struct brw_query_object *query,
   int idx)
 {
-   int offset = idx * sizeof(uint64_t) * 4;
+   int offset = gen6_query_results_offset(query, 0) + idx * sizeof(uint64_t) * 
4;
 
brw_load_register_mem64(brw, HSW_CS_GPR(1), query->bo, offset);
 
@@ -282,7 +282,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct 
brw_query_object *query,
   brw_load_register_mem64(brw,
   HSW_CS_GPR(0),
   query->bo,
-  2 * sizeof(uint64_t));
+  gen6_query_predicate_offset(query));
   return;
}
 
@@ -299,7 +299,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct 
brw_query_object *query,
   brw_load_register_mem64(brw,
   HSW_CS_GPR(0),
   query->bo,
-  0 * sizeof(uint64_t));
+  gen6_query_results_offset(query, 0));
} else if 

[Mesa-dev] [PATCH 06/12] i965: Use snoop bo for accessing query results on !llc

2017-08-04 Thread Chris Wilson
On non-llc architectures, where we are primarily reading back the
results of the GPU queries, we can improve performance by using a
cacheable mapping of the results. Unfortunately, enabling snooping makes
the writes from the GPU slower, which may adversely affect pipelined
query operations (where the results are used directly by the GPU and not
CPU).

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c| 24 
 src/mesa/drivers/dri/i965/brw_bufmgr.h|  2 ++
 src/mesa/drivers/dri/i965/gen6_queryobj.c |  4 +++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5c7647f8bc..d71cef25e3 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -683,6 +683,30 @@ brw_bo_unreference(struct brw_bo *bo)
}
 }
 
+static bool
+__brw_bo_set_caching(struct brw_bo *bo, int caching)
+{
+   struct drm_i915_gem_caching arg = {
+  .handle = bo->gem_handle,
+  .caching = caching
+   };
+   return drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg) == 0;
+}
+
+void
+brw_bo_set_cache_coherent(struct brw_bo *bo)
+{
+   assert(!bo->external);
+   if (bo->cache_coherent)
+  return;
+
+   if (!__brw_bo_set_caching(bo, I915_CACHING_CACHED))
+  return;
+
+   bo->reusable = false;
+   bo->cache_coherent = true;
+}
+
 static void
 bo_wait_with_stall_warning(struct brw_context *brw,
struct brw_bo *bo,
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 9848fe9268..45819c17c5 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -221,6 +221,8 @@ void brw_bo_unreference(struct brw_bo *bo);
 #define MAP_INTERNAL_MASK   (0xff << 24)
 #define MAP_RAW (0x01 << 24)
 
+void brw_bo_set_cache_coherent(struct brw_bo *bo);
+
 /**
  * Maps the buffer into userspace.
  *
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index c96f00d8ba..a3b552c6c1 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -225,7 +225,7 @@ gen6_queryobj_get_results(struct gl_context *ctx,
 
brw_bo_wait_rendering(query->bo);
uint64_t *results = query->results;
-   if (!query->bo->cache_coherent)
+   if (unlikely(!query->bo->cache_coherent))
   gen_invalidate_range(results, query->bo->size);
 
switch (query->Base.Target) {
@@ -320,6 +320,8 @@ gen6_alloc_query(struct brw_context *brw, struct 
brw_query_object *query)
   brw_bo_unreference(query->bo);
 
query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
+   brw_bo_set_cache_coherent(query->bo);
+
query->results = brw_bo_map(brw, query->bo,
MAP_COHERENT | MAP_PERSISTENT |
MAP_READ | MAP_ASYNC);
-- 
2.13.3



[Mesa-dev] [PATCH 08/12] i965: Pack simple pipelined query objects into the same buffer

2017-08-04 Thread Chris Wilson
Reuse the same query object buffer for multiple queries within the same
batch.
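
One way to picture the packing is a small bump allocator over a shared 4096-byte page, sketched below. This is not the code from the patch: the pool structure, the three-slot query size and all `sketch_` names are assumptions, and a page counter stands in for allocating a fresh buffer object.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_SLOTS_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(uint64_t))
#define SKETCH_QUERY_SLOTS    3u   /* assumed: begin, end, availability */

struct sketch_query_pool {
   unsigned page_id;     /* stands in for the current buffer object */
   unsigned next_index;  /* next free uint64_t slot in that page */
};

static void
sketch_pool_init(struct sketch_query_pool *pool)
{
   pool->page_id = 0;
   /* Start "full" so the first allocation takes a fresh page. */
   pool->next_index = SKETCH_SLOTS_PER_PAGE;
}

/* Hand out the first slot index for a new query; *page reports which
 * page the query landed in. A new page is started only when the current
 * one cannot hold another query. */
static unsigned
sketch_pool_alloc(struct sketch_query_pool *pool, unsigned *page)
{
   if (pool->next_index + SKETCH_QUERY_SLOTS > SKETCH_SLOTS_PER_PAGE) {
      pool->page_id++;
      pool->next_index = 0;
   }

   unsigned index = pool->next_index;
   pool->next_index += SKETCH_QUERY_SLOTS;
   *page = pool->page_id;
   return index;
}
```

With three slots per query, 170 queries share one page before the allocator moves on, instead of each query taking a 4096-byte buffer of its own.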

A task for the future is propagating the GL_OUT_OF_MEMORY errors.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.c   |  4 +++
 src/mesa/drivers/dri/i965/brw_context.h   | 10 +--
 src/mesa/drivers/dri/i965/brw_queryobj.c  | 16 +--
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 46 +--
 4 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index d0b22d4342..5cf5a67432 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -860,6 +860,8 @@ brwCreateContext(gl_api api,
 
brw->isl_dev = screen->isl_dev;
 
+   brw->query.last_index = 4096;
+
brw->vs.base.stage = MESA_SHADER_VERTEX;
brw->tcs.base.stage = MESA_SHADER_TESS_CTRL;
brw->tes.base.stage = MESA_SHADER_TESS_EVAL;
@@ -1047,6 +1049,8 @@ intelDestroyContext(__DRIcontext * driContextPriv)
   brw_bo_unreference(brw->gs.base.scratch_bo);
if (brw->wm.base.scratch_bo)
   brw_bo_unreference(brw->wm.base.scratch_bo);
+   if (brw->query.bo)
+  brw_bo_unreference(brw->query.bo);
 
brw_destroy_hw_context(brw->bufmgr, brw->hw_ctx);
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index b415013e47..376bcbb399 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -423,7 +423,7 @@ struct brw_query_object {
uint64_t *results;
 
/** Last index in bo with query data for this object. */
-   int last_index;
+   unsigned index;
 
/** True if we know the batch has been flushed since we ended the query. */
bool flushed;
@@ -435,13 +435,13 @@ struct brw_query_object {
 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
 {
-   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_PREDICATE) * sizeof(uint64_t);
 }
 
 static inline unsigned
 gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
 {
-   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
 }
 
 enum brw_gpu_ring {
@@ -1103,6 +1103,10 @@ struct brw_context
} cc;
 
struct {
+  struct brw_bo *bo;
+  uint64_t *map;
+  unsigned last_index;
+
   struct brw_query_object *obj;
   bool begin_emitted;
} query;
diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 04ce9a94ca..8b14c72176 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -184,7 +184,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
* run out of space in the query's BO and allocated a new one.  If so,
* this function was already called to accumulate the results so far.
*/
-  for (i = 0; i < query->last_index; i++) {
+  for (i = 0; i < query->index; i++) {
 query->Base.Result += results[i * 2 + 1] - results[i * 2];
   }
   break;
@@ -194,7 +194,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
   /* If the starting and ending PS_DEPTH_COUNT from any of the batches
* differ, then some fragments passed the depth test.
*/
-  for (i = 0; i < query->last_index; i++) {
+  for (i = 0; i < query->index; i++) {
 if (results[i * 2 + 1] != results[i * 2]) {
 query->Base.Result = GL_TRUE;
 break;
@@ -298,7 +298,7 @@ brw_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
*/
   brw_bo_unreference(query->bo);
   query->bo = NULL;
-  query->last_index = -1;
+  query->index = -1;
 
   brw->query.obj = query;
 
@@ -430,7 +430,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct 
brw_query_object *query)
 
assert(brw->gen < 6);
 
-   if (!query->bo || query->last_index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
+   if (!query->bo || query->index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
 
   if (query->bo != NULL) {
  /* The old query BO did not have enough space, so we allocated a new
@@ -441,7 +441,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct 
brw_query_object *query)
   }
 
   query->bo = brw_bo_alloc(brw->bufmgr, "query", 4096, 1);
-  query->last_index = 0;
+  query->index = 0;
}
 }
 
@@ -482,7 +482,7 @@ brw_emit_query_begin(struct brw_context *brw)
 
ensure_bo_has_space(ctx, query);
 
-   brw_write_depth_count(brw, query->bo, query->last_index * 2);
+   brw_write_depth_count(brw, query->bo, query->index * 2);
 
brw->query.begin_emitted = true;
 }
@@ -504,10 +504,10 @@ brw_emit_query_end(struct brw_context *brw)
if (!brw->query.begin_emitted)
   

[Mesa-dev] [PATCH 05/12] i965: Map the query results for the life of the bo

2017-08-04 Thread Chris Wilson
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.

v2: Inline the wait on results; it disappears shortly in the next few
patches.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  1 +
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 42 +++
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index d37e05bb47..4d0b76bebb 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -420,6 +420,7 @@ struct brw_query_object {
 
/** Last query BO associated with this query. */
struct brw_bo *bo;
+   uint64_t *results;
 
/** Last index in bo with query data for this object. */
int last_index;
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index a0b786f5d9..c96f00d8ba 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -33,6 +33,8 @@
  */
 #include "main/imports.h"
 
+#include "common/gen_clflush.h"
+
 #include "brw_context.h"
 #include "brw_defines.h"
 #include "brw_state.h"
@@ -221,7 +223,11 @@ gen6_queryobj_get_results(struct gl_context *ctx,
if (query->bo == NULL)
   return;
 
-   uint64_t *results = brw_bo_map(brw, query->bo, MAP_READ);
+   brw_bo_wait_rendering(query->bo);
+   uint64_t *results = query->results;
+   if (!query->bo->cache_coherent)
+  gen_invalidate_range(results, query->bo->size);
+
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
   /* The query BO contains the starting and ending timestamps.
@@ -296,7 +302,6 @@ gen6_queryobj_get_results(struct gl_context *ctx,
default:
   unreachable("Unrecognized query target in brw_queryobj_get_results()");
}
-   brw_bo_unmap(query->bo);
 
/* Now that we've processed the data stored in the query's buffer object,
 * we can release it.
@@ -307,6 +312,24 @@ gen6_queryobj_get_results(struct gl_context *ctx,
query->Base.Ready = true;
 }
 
+static int
+gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
+{
+   /* Since we're starting a new query, we need to throw away old results. */
+   if (query->bo)
+  brw_bo_unreference(query->bo);
+
+   query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
+   query->results = brw_bo_map(brw, query->bo,
+   MAP_COHERENT | MAP_PERSISTENT |
+   MAP_READ | MAP_ASYNC);
+
+   /* For ARB_query_buffer_object: The result is not available */
+   set_query_availability(brw, query, false);
+
+   return 0;
+}
+
 /**
  * Driver hook for glBeginQuery().
  *
@@ -318,14 +341,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = GEN6_QUERY_RESULTS;
-
-   /* Since we're starting a new query, we need to throw away old results. */
-   brw_bo_unreference(query->bo);
-   query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
-
-   /* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
@@ -539,8 +555,12 @@ gen6_query_counter(struct gl_context *ctx, struct gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   brw_query_counter(ctx, q);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
+
+   brw_write_timestamp(brw, query->bo, idx);
set_query_availability(brw, query, true);
+
+   query->flushed = false;
 }
 
 /* Initialize Gen6+-specific query object functions. */
-- 
2.13.3



[Mesa-dev] [PATCH 07/12] i965: Use 'available' fence for polling query results

2017-08-04 Thread Chris Wilson
If we always write the 'available' flag after writing the final result
of the query, we can probe that predicate to quickly query whether the
result is ready from userspace. The primary advantage of checking the
predicate is that it allows for more fine-grained queries: we do not
have to wait for the batch to finish before the query is marked as
ready.

We still do check the status of the batch after probing the query so
that if the worst happens and the batch did hang without completing the
query, we do not spin forever (although it is not as nice as completely
eliminating the ioctl, the busy-ioctl is lightweight!).
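
The polling order described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: `struct poll_bo`,
`poll_bo_busy()` and `query_result_ready()` are hypothetical stand-ins and not
the driver's types or functions; only the slot order mirrors the
GEN6_QUERY_PREDICATE/GEN6_QUERY_RESULTS defines in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot layout mirroring the defines in this patch (names are hypothetical). */
enum { SKETCH_QUERY_PREDICATE = 0, SKETCH_QUERY_RESULTS = 1 };

/* Hypothetical stand-in for a query buffer with a persistent CPU mapping. */
struct poll_bo {
   volatile uint64_t map[4]; /* [0] = 'available' flag, [1..] = results */
   bool busy;                /* what a kernel busy check would report */
   unsigned busy_calls;      /* how often we had to ask the "kernel" */
};

/* Hypothetical stand-in for the lightweight busy-ioctl. */
static bool poll_bo_busy(struct poll_bo *bo)
{
   bo->busy_calls++;
   return bo->busy;
}

/* Returns true if the results can be read without blocking. */
static bool query_result_ready(struct poll_bo *bo)
{
   /* Fast path: the flag is written after the final result, so seeing it
    * set means the result has landed, even if the batch is still running. */
   if (bo->map[SKETCH_QUERY_PREDICATE])
      return true;

   /* Fallback: if the batch is no longer busy (for example after a hang),
    * stop waiting for a flag that will never be written. */
   return !poll_bo_busy(bo);
}
```

In this sketch the busy check is only reached when the flag is still clear, so
the common "already available" case never leaves userspace.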

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  4 +--
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 53 ++-
 2 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 4d0b76bebb..b415013e47 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -429,8 +429,8 @@ struct brw_query_object {
bool flushed;
 };
 
-#define GEN6_QUERY_PREDICATE (2)
-#define GEN6_QUERY_RESULTS (0)
+#define GEN6_QUERY_PREDICATE (0)
+#define GEN6_QUERY_RESULTS (1)
 
 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index a3b552c6c1..c6887661a5 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -42,8 +42,7 @@
 #include "intel_buffer_objects.h"
 
 static inline void
-set_query_availability(struct brw_context *brw, struct brw_query_object *query,
-   bool available)
+set_query_available(struct brw_context *brw, struct brw_query_object *query)
 {
/* For platforms that support ARB_query_buffer_object, we write the
 * query availability for "pipelined" queries.
@@ -60,22 +59,12 @@ set_query_availability(struct brw_context *brw, struct brw_query_object *query,
 * PIPE_CONTROL with an immediate write will synchronize with
 * those earlier writes, so we write 1 when the value has landed.
 */
-   if (brw->ctx.Extensions.ARB_query_buffer_object &&
-   brw_is_query_pipelined(query)) {
-  unsigned flags = PIPE_CONTROL_WRITE_IMMEDIATE;
 
-  if (available) {
- /* Order available *after* the query results. */
- flags |= PIPE_CONTROL_FLUSH_ENABLE;
-  } else {
- /* Make it unavailable *before* any pipelined reads. */
- flags |= PIPE_CONTROL_CS_STALL;
-  }
-
-  brw_emit_pipe_control_write(brw, flags,
-  query->bo, gen6_query_predicate_offset(query),
-  available);
-   }
+   brw_emit_pipe_control_write(brw,
+   PIPE_CONTROL_WRITE_IMMEDIATE |
+   PIPE_CONTROL_FLUSH_ENABLE,
+   query->bo, gen6_query_predicate_offset(query),
+   true);
 }
 
 static void
@@ -141,12 +130,12 @@ write_xfb_overflow_streams(struct gl_context *ctx,
 }
 
 static bool
-check_xfb_overflow_streams(uint64_t *results, int count)
+check_xfb_overflow_streams(const uint64_t *results, int count)
 {
bool overflow = false;
 
for (int i = 0; i < count; i++) {
-      uint64_t *result_i = &results[4 * i];
+      const uint64_t *result_i = &results[4 * i];
 
   if ((result_i[3] - result_i[2]) != (result_i[1] - result_i[0])) {
  overflow = true;
@@ -216,15 +205,14 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo *bo,
  */
 static void
 gen6_queryobj_get_results(struct gl_context *ctx,
-  struct brw_query_object *query)
+  struct brw_query_object *query,
+  uint64_t *results)
 {
struct brw_context *brw = brw_context(ctx);
 
if (query->bo == NULL)
   return;
 
-   brw_bo_wait_rendering(query->bo);
-   uint64_t *results = query->results;
if (unlikely(!query->bo->cache_coherent))
   gen_invalidate_range(results, query->bo->size);
 
@@ -324,10 +312,10 @@ gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
 
query->results = brw_bo_map(brw, query->bo,
MAP_COHERENT | MAP_PERSISTENT |
-   MAP_READ | MAP_ASYNC);
+   MAP_READ | MAP_WRITE);
 
/* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   query->results[GEN6_QUERY_PREDICATE] = false;
 
return 0;
 }
@@ -482,7 +470,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
query->flushed = false;
 
/* For ARB_query_buffer_object: The result is now available */
-   

Re: [Mesa-dev] [PATCH 01/11] intel/isl: Stop padding surfaces

2017-08-04 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2017-08-02 13:35:26, Jason Ekstrand wrote:
> The docs contain a bunch of commentary about the need to pad various
> surfaces out to multiples of something or other.  However, all of those
> requirements are about avoiding GTT errors due to missing pages when the
> data port or sampler accesses slightly out-of-bounds.  However, because
> the kernel already fills all the empty space in our GTT with the scratch
> page, we never have to worry about faulting due to OOB reads.  There are
> two caveats to this:
> 
>  1) There is some potential for issues with caches here if extra data
> ends up in a cache we don't expect due to OOB reads.  However,
> because we always trash the entire cache whenever we need to move
> anything between cache domains, this shouldn't be an issue.
> 
>  2) There is a potential issue if a surface gets placed at the very top
> of the GTT by the kernel.  In this case, the hardware could
> potentially end up trying to read past the top of the GTT.  If it
> nicely wraps around at the 48-bit (or 32-bit) boundary, then this
> shouldn't be an issue thanks to the scratch page.  If it doesn't,
> then we need to come up with something to handle it.
> 
> Up until some of the GL move to ISL, having the padding code in there
> just caused us to harmlessly use a bit more memory in Vulkan.  However,
> now that we're using ISL sizes to validate external dma-buf images,
> these padding requirements are causing us to reject otherwise valid
> images due to the size of the BO being too small.
> 
> Cc: "17.2" 
> Cc: Chad Versace 
> Tested-by: Tapani Pälli 
> Tested-by: Tomasz Figa 
> ---
>  src/intel/isl/isl.c | 119 +---
>  1 file changed, 2 insertions(+), 117 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 5e3d279..d3124de 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1374,116 +1374,6 @@ isl_calc_row_pitch(const struct isl_device *dev,
> return true;
>  }
>  
> -/**
> - * Calculate and apply any padding required for the surface.
> - *
> - * @param[inout] total_h_el is updated with the new height
> - * @param[out] pad_bytes is overwritten with additional padding requirements.
> - */
> -static void
> -isl_apply_surface_padding(const struct isl_device *dev,
> -  const struct isl_surf_init_info *restrict info,
> -  const struct isl_tile_info *tile_info,
> -  uint32_t *total_h_el,
> -  uint32_t *pad_bytes)
> -{
> -   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
> -
> -   *pad_bytes = 0;
> -
> -   /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
> -* Formats >> Surface Padding Requirements >> Render Target and Media
> -* Surfaces:
> -*
> -*   The data port accesses data (pixels) outside of the surface if they
> -*   are contained in the same cache request as pixels that are within the
> -*   surface. These pixels will not be returned by the requesting message,
> -*   however if these pixels lie outside of defined pages in the GTT,
> -*   a GTT error will result when the cache request is processed. In
> -*   order to avoid these GTT errors, “padding” at the bottom of the
> -*   surface is sometimes necessary.
> -*
> -* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
> -* Formats >> Surface Padding Requirements >> Sampling Engine Surfaces:
> -*
> -*... Lots of padding requirements, all listed separately below.
> -*/
> -
> -   /* We can safely ignore the first padding requirement, quoted below,
> -* because isl doesn't do buffers.
> -*
> -*- [pre-BDW] For buffers, which have no inherent “height,” padding
> -*  requirements are different. A buffer must be padded to the next
> -*  multiple of 256 array elements, with an additional 16 bytes added
> -*  beyond that to account for the L1 cache line.
> -*/
> -
> -   /*
> -*- For compressed textures [...], padding at the bottom of the surface
> -*  is to an even compressed row.
> -*/
> -   if (isl_format_is_compressed(info->format))
> -  *total_h_el = isl_align(*total_h_el, 2);
> -
> -   /*
> -*- For cube surfaces, an additional two rows of padding are required
> -*  at the bottom of the surface.
> -*/
> -   if (info->usage & ISL_SURF_USAGE_CUBE_BIT)
> -  *total_h_el += 2;
> -
> -   /*
> -*- For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats,
> -*  additional padding is required. These surfaces require an extra row
> -*  plus 16 bytes of padding at the bottom in addition to the general
> -*  padding 

[Mesa-dev] [PATCH 02/12] i965: Check last known busy status on bo before asking the kernel

2017-08-04 Thread Chris Wilson
If we know the bo is idle (that is we have not submitted a command buffer
referencing this bo since the last query) we can skip asking the kernel.
Note this may report a false negative if the target is being shared
between processes (exported via dmabuf or flink). To allow the caller
control over using the last known flag, the query is split into two.

v2: Check against external bo before trusting our own tracking.
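
A minimal sketch of the caching rule described above, assuming a simplified
buffer object. `struct cached_bo`, `fake_kernel_busy()`, `cached_bo_busy()`
and `cached_bo_submit()` are hypothetical names used only for illustration;
the counter stands in for calls to the kernel busy query.

```c
#include <stdbool.h>

/* Hypothetical, simplified buffer object. */
struct cached_bo {
   bool idle;               /* last known status from our own tracking */
   bool external;           /* shared with another process (dmabuf/flink) */
   bool kernel_busy;        /* what the kernel would answer right now */
   unsigned kernel_queries; /* number of times we had to ask the kernel */
};

/* Hypothetical stand-in for the busy query to the kernel. */
static bool fake_kernel_busy(struct cached_bo *bo)
{
   bo->kernel_queries++;
   return bo->kernel_busy;
}

static bool cached_bo_busy(struct cached_bo *bo)
{
   /* Trust the cached flag only for buffers that nobody else can submit
    * work against; an external bo may be busy without our knowledge. */
   if (bo->idle && !bo->external)
      return false;

   bool busy = fake_kernel_busy(bo);
   bo->idle = !busy;
   return busy;
}

/* Submitting a command buffer that references the bo invalidates the cache. */
static void cached_bo_submit(struct cached_bo *bo)
{
   bo->idle = false;
   bo->kernel_busy = true;
}
```

The second consecutive query on an idle, private buffer is answered from the
cached flag; an external buffer always goes back to the kernel.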

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 42 --
 src/mesa/drivers/dri/i965/brw_bufmgr.h | 11 +++--
 2 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 844ccaf1e5..5c7647f8bc 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -196,22 +196,40 @@ bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
return NULL;
 }
 
-int
+static int
+__brw_bo_busy(struct brw_bo *bo)
+{
+   struct drm_i915_gem_busy busy = { bo->gem_handle };
+
+   if (bo->idle && !bo->external)
+  return 0;
+
+   /* If we hit an error here, it means that bo->gem_handle is invalid.
+* Treat it as being idle (busy.busy is left as 0) and move along.
+*/
+   drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
+
+   bo->idle = !busy.busy;
+   return busy.busy;
+}
+
+bool
 brw_bo_busy(struct brw_bo *bo)
 {
-   struct brw_bufmgr *bufmgr = bo->bufmgr;
-   struct drm_i915_gem_busy busy;
-   int ret;
+   return __brw_bo_busy(bo);
+}
 
-   memclear(busy);
-   busy.handle = bo->gem_handle;
+bool
+brw_bo_map_busy(struct brw_bo *bo, unsigned flags)
+{
+   unsigned mask;
 
-   ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
-   if (ret == 0) {
-  bo->idle = !busy.busy;
-  return busy.busy;
-   }
-   return false;
+   if (flags & MAP_WRITE)
+  mask = ~0u;
+   else
+  mask = 0x;
+
+   return __brw_bo_busy(bo) & mask;
 }
 
 int
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index d09bc74c9c..9848fe9268 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -271,10 +271,17 @@ int brw_bo_get_tiling(struct brw_bo *bo, uint32_t *tiling_mode,
 int brw_bo_flink(struct brw_bo *bo, uint32_t *name);
 
 /**
- * Returns 1 if mapping the buffer for write could cause the process
+ * Returns false if the buffer is not in active use by the GPU.
+ * If it returns true, any mapping for write could cause the process
  * to block, due to the object being active in the GPU.
  */
-int brw_bo_busy(struct brw_bo *bo);
+bool brw_bo_busy(struct brw_bo *bo);
+
+/**
+ * Returns true if mapping the buffer for the set of flags (i.e. MAP_READ or
+ * MAP_WRITE) will cause the process to block.
+ */
+bool brw_bo_map_busy(struct brw_bo *bo, unsigned flags);
 
 /**
  * Specify the volatility of the buffer.
-- 
2.13.3



[Mesa-dev] [PATCH 11/12] i965: Use snooping for mapping the miptree

2017-08-04 Thread Chris Wilson
Avoid having to clflush after blitting the miptree to a linear buffer
for mapping by enabling snooping on !llc and treating the buffer as
coherent. Similarly, it avoids the clflush afterwards if used for
READ | WRITE.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 858279fcba..3b5e5595d7 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2955,11 +2955,16 @@ intel_miptree_map_blit(struct brw_context *brw,
  map->w, map->h, 1,
  /* samples */ 1,
  MIPTREE_LAYOUT_TILING_NONE);
-
if (!map->linear_mt) {
   fprintf(stderr, "Failed to allocate blit temporary\n");
   goto fail;
}
+
+   /* Make the GPU do the work of invalidating the CPU cache (using snoop on
+* !llc), it's much faster than clflush!
+*/
+   brw_bo_set_cache_coherent(map->linear_mt->bo);
+
map->stride = map->linear_mt->surf.row_pitch;
 
/* One of either READ_BIT or WRITE_BIT or both is set.  READ_BIT implies no
@@ -3422,11 +3427,11 @@ use_intel_mipree_map_blit(struct brw_context *brw,
   unsigned int level,
   unsigned int slice)
 {
-   if (brw->has_llc &&
-  /* It's probably not worth swapping to the blit ring because of
-   * all the overhead involved.
-   */
-   !(mode & GL_MAP_WRITE_BIT) &&
+   /* It's probably not worth swapping to the blit ring because of
+* all the overhead involved.
+*/
+
+   if (!(mode & GL_MAP_WRITE_BIT) &&
!mt->compressed &&
(mt->surf.tiling == ISL_TILING_X ||
 /* Prior to Sandybridge, the blitter can't handle Y tiling */
-- 
2.13.3



[Mesa-dev] [PATCH 09/12] i965: Pass consistent args along gen6_queryobj.c

2017-08-04 Thread Chris Wilson
Be consistent in passing along brw_context rather than switching between
that and gl_context.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 32 ++-
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 25ea51503e..0ba6919374 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -99,12 +99,10 @@ write_xfb_primitives_written(struct brw_context *brw,
 }
 
 static void
-write_xfb_overflow_streams(struct gl_context *ctx,
+write_xfb_overflow_streams(struct brw_context *brw,
struct brw_bo *bo, int stream, int count,
int idx)
 {
-   struct brw_context *brw = brw_context(ctx);
-
brw_emit_mi_flush(brw);
 
for (int i = 0; i < count; i++) {
@@ -204,15 +202,10 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo *bo,
  * Wait on the query object's BO and calculate the final result.
  */
 static void
-gen6_queryobj_get_results(struct gl_context *ctx,
+gen6_queryobj_get_results(struct brw_context *brw,
   struct brw_query_object *query,
   uint64_t *results)
 {
-   struct brw_context *brw = brw_context(ctx);
-
-   if (query->bo == NULL)
-  return;
-
if (unlikely(!query->bo->cache_coherent))
   gen_invalidate_range(results, query->bo->size);
 
@@ -232,7 +225,7 @@ gen6_queryobj_get_results(struct gl_context *ctx,
   /* Ensure the scaled timestamp overflows according to
* GL_QUERY_COUNTER_BITS
*/
-  query->Base.Result &= (1ull << ctx->Const.QueryCounterBits.Timestamp) - 1;
+  query->Base.Result &= (1ull << brw->ctx.Const.QueryCounterBits.Timestamp) - 1;
   break;
 
case GL_SAMPLES_PASSED_ARB:
@@ -396,7 +389,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
case GL_PRIMITIVES_GENERATED:
   write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
- ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
+ brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
@@ -404,11 +397,11 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
+  write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
+  write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
case GL_VERTICES_SUBMITTED_ARB:
@@ -459,7 +452,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
case GL_PRIMITIVES_GENERATED:
   write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
- ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
+ brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
@@ -467,11 +460,11 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
+  write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
+  write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
   /* calculate overflow here */
@@ -530,6 +523,9 @@ static void gen6_wait_query(struct gl_context *ctx, struct gl_query_object *q)
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
 
+   if (query->bo == NULL)
+  return;
+
/* If the application has requested the query result, but this batch is
 * still contributing to it, flush it now to finish that work so the
 * result will become available (eventually).
@@ -540,7 +536,7 @@ static void gen6_wait_query(struct gl_context *ctx, struct gl_query_object *q)
if (!results[GEN6_QUERY_PREDICATE]) /* not yet available, must wait */
   brw_bo_wait_rendering(query->bo);
 
-   gen6_queryobj_get_results(ctx, query, results + GEN6_QUERY_RESULTS);
+   gen6_queryobj_get_results(brw, query, results + GEN6_QUERY_RESULTS);
 }
 
 /**
@@ -572,7 +568,7 @@ static void gen6_check_query(struct gl_context *ctx, struct gl_query_object *q)
uint64_t *results = query->results;
if (results[GEN6_QUERY_PREDICATE] || /* already available, can read async */
 

[Mesa-dev] [PATCH 03/12] i965: Replace hard-coded indices with const named variables in gen6_queryobj

2017-08-04 Thread Chris Wilson
To simplify replacement later, replace repeated use of explicit 0/1 with
local variables of the same value.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 8e639cfeef..1ee3974198 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -318,6 +318,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
+   const int idx = 0;
 
/* Since we're starting a new query, we need to throw away old results. */
brw_bo_unreference(query->bo);
@@ -347,31 +348,31 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
* obtain the time elapsed.  Notably, this includes time elapsed while
* the system was doing other work, such as running other applications.
*/
-  brw_write_timestamp(brw, query->bo, 0);
+  brw_write_timestamp(brw, query->bo, idx);
   break;
 
case GL_ANY_SAMPLES_PASSED:
case GL_ANY_SAMPLES_PASSED_CONSERVATIVE:
case GL_SAMPLES_PASSED_ARB:
-  brw_write_depth_count(brw, query->bo, 0);
+  brw_write_depth_count(brw, query->bo, idx);
   break;
 
case GL_PRIMITIVES_GENERATED:
-  write_primitives_generated(brw, query->bo, query->Base.Stream, 0);
+  write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
  ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
-  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 0);
+  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 0);
+  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 0);
+  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
case GL_VERTICES_SUBMITTED_ARB:
@@ -385,7 +386,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
case GL_COMPUTE_SHADER_INVOCATIONS_ARB:
case GL_TESS_CONTROL_SHADER_PATCHES_ARB:
case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB:
-  emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, 0);
+  emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, idx);
   break;
 
default:
@@ -406,34 +407,35 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
+   const int idx = 1;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
-  brw_write_timestamp(brw, query->bo, 1);
+  brw_write_timestamp(brw, query->bo, idx);
   break;
 
case GL_ANY_SAMPLES_PASSED:
case GL_ANY_SAMPLES_PASSED_CONSERVATIVE:
case GL_SAMPLES_PASSED_ARB:
-  brw_write_depth_count(brw, query->bo, 1);
+  brw_write_depth_count(brw, query->bo, idx);
   break;
 
case GL_PRIMITIVES_GENERATED:
-  write_primitives_generated(brw, query->bo, query->Base.Stream, 1);
+  write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
  ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
-  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 1);
+  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 1);
+  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 1);
+  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
   /* calculate overflow here */
@@ -449,7 +451,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
case GL_TESS_CONTROL_SHADER_PATCHES_ARB:
case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB:
   emit_pipeline_stat(brw, query->bo,
- query->Base.Stream, query->Base.Target, 1);
+ query->Base.Stream, query->Base.Target, idx);
   break;
 

[Mesa-dev] [PATCH 12/12] i965: Prefer to use the GPU copy if we need to stall for reads

2017-08-04 Thread Chris Wilson
If we need to stall to read the bo, ask the GPU to copy it into the CPU
cache whilst we wait.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 3b5e5595d7..5cd8d24f1e 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3421,6 +3421,18 @@ can_blit_slice(struct intel_mipmap_tree *mt,
 }
 
 static bool
+map_will_stall(struct brw_bo *bo, GLbitfield mode)
+{
+   /* If we need to stall for reading the buffer, offload the cost
+* of clflushing it to the GPU.
+*/
+   if (!bo->cache_coherent && !(mode & GL_MAP_INVALIDATE_RANGE_BIT))
+  mode |= GL_MAP_READ_BIT;
+
+   return brw_bo_map_busy(bo, mode);
+}
+
+static bool
 use_intel_mipree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
   GLbitfield mode,
@@ -3431,7 +3443,7 @@ use_intel_mipree_map_blit(struct brw_context *brw,
 * all the overhead involved.
 */
 
-   if (!(mode & GL_MAP_WRITE_BIT) &&
+   if (map_will_stall(mt->bo, mode) &&
!mt->compressed &&
(mt->surf.tiling == ISL_TILING_X ||
 /* Prior to Sandybridge, the blitter can't handle Y tiling */
-- 
2.13.3



[Mesa-dev] [PATCH 10/12] i965: Set query->flush after flushing the query

2017-08-04 Thread Chris Wilson
Skip the next check for brw_batch_references() by recording when we
flush the query.
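
The intended control flow can be sketched on its own, assuming a toy batch
that tracks a single referenced buffer. `struct sketch_batch`,
`struct sketch_query` and the helpers below are hypothetical and exist only
to make the early return visible; they are not the driver's types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy batch: remembers one referenced buffer and counts flushes. */
struct sketch_batch {
   const void *referenced_bo;
   unsigned flushes;
   unsigned reference_checks;
};

struct sketch_query {
   const void *bo;
   bool flushed;
};

static bool sketch_batch_references(struct sketch_batch *batch, const void *bo)
{
   batch->reference_checks++;
   return batch->referenced_bo == bo;
}

static void sketch_batch_flush(struct sketch_batch *batch)
{
   batch->flushes++;
   batch->referenced_bo = NULL;
}

static void sketch_flush_if_needed(struct sketch_batch *batch,
                                   struct sketch_query *query)
{
   /* Already recorded as flushed: skip the reference check entirely. */
   if (query->flushed)
      return;

   /* Either the batch still holds the query and we flush it now, or it
    * was flushed earlier for another reason. Both ways it is flushed. */
   if (sketch_batch_references(batch, query->bo))
      sketch_batch_flush(batch);

   query->flushed = true;
}
```

After the first call the flag short-circuits every later call until the query
is ended again and the flag is reset.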
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 0ba6919374..30dda5ae1f 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -502,14 +502,16 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
 static void
 flush_batch_if_needed(struct brw_context *brw, struct brw_query_object *query)
 {
+   if (query->flushed)
+  return;
+
/* If the batch doesn't reference the BO, it must have been flushed
 * (for example, due to being full).  Record that it's been flushed.
 */
-   query->flushed = query->flushed ||
-!brw_batch_references(&brw->batch, query->bo);
-
-   if (!query->flushed)
+   if (brw_batch_references(&brw->batch, query->bo))
   intel_batchbuffer_flush(brw);
+
+   query->flushed = true;
 }
 
 /**
-- 
2.13.3



Re: [Mesa-dev] No reloc for i965

2017-08-04 Thread Chris Wilson
Quoting Kenneth Graunke (2017-08-04 19:47:14)
> On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > Patch reordering from last time so that the cosmetic tweaks are done first
> > and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> > hopefully it doesn't look too bad and we can land at least as far as
> > there (patch 8/10).
> > 
> > Thanks,
> > -Chris
> 
> I split up some patches and pushed a modified version of this series.
> 
> To ssh://git.freedesktop.org/git/mesa/mesa
>5c007203b73..6c530ad1160  master -> master
> 
> Thanks a ton for getting us to NO_RELOC.  I really like the new reloc
> flags system as well.  It's so much nicer!

I've still got to win you over to using LUT indices (kernel side, there
shouldn't be any case where it is worse, but the differences are easily
dwarfed in typical cases where it is only about 10% faster, but any
reduction inside the struct_mutex is a must), I see, and the per-context
bo along with removing the auxiliary render_cache set...

But now for something completely different...
-Chris


Re: [Mesa-dev] [PATCH 01/11] intel/isl: Stop padding surfaces

2017-08-04 Thread Jason Ekstrand
Ken and I had a fairly lengthy conversation about this on IRC today:

https://people.freedesktop.org/~jekstrand/isl-padding

The conclusion was that we both hate the patch but it's probably safe and
it does fix bugs.  The thing that really wins me over is that we have
historically done none of this padding in the GL driver (except for one bit
about cube maps) and seem to have gotten away with it.  We have had some
underallocation issues in the past but none of them have been tracked back to
this.
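
To make the dma-buf rejection concrete, here is a rough sketch. It assumes a
much simplified size check (row pitch times row count against the BO size;
the real ISL calculation also deals with tiling and array layout), and the
helper names are made up for illustration. The two padding rules are the ones
quoted in the removed code: compressed surfaces padded to an even row count,
and cube surfaces given two extra rows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round v up to a multiple of a (a > 0). */
static uint32_t sketch_align(uint32_t v, uint32_t a)
{
   return (v + a - 1) / a * a;
}

/* Row count after applying the two padding rules quoted in the patch. */
static uint32_t sketch_padded_rows(uint32_t rows, bool compressed, bool cube)
{
   if (compressed)
      rows = sketch_align(rows, 2); /* pad to an even compressed row */
   if (cube)
      rows += 2;                    /* two extra rows for cube surfaces */
   return rows;
}

/* Simplified import check: does the surface fit in the external BO? */
static bool sketch_import_fits(uint64_t bo_size, uint32_t row_pitch,
                               uint32_t rows)
{
   return (uint64_t)row_pitch * rows <= bo_size;
}
```

With a 256-byte pitch and 15 compressed rows, an exporter that allocated
exactly 15 * 256 = 3840 bytes passes the unpadded check but fails once the
row count is padded to 16 (4096 bytes), which is the kind of rejection the
patch removes.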

--Jason

On Wed, Aug 2, 2017 at 1:35 PM, Jason Ekstrand  wrote:

> The docs contain a bunch of commentary about the need to pad various
> surfaces out to multiples of something or other.  However, all of those
> requirements are about avoiding GTT errors due to missing pages when the
> data port or sampler accesses slightly out-of-bounds.  However, because
> the kernel already fills all the empty space in our GTT with the scratch
> page, we never have to worry about faulting due to OOB reads.  There are
> two caveats to this:
>
>  1) There is some potential for issues with caches here if extra data
> ends up in a cache we don't expect due to OOB reads.  However,
> because we always trash the entire cache whenever we need to move
> anything between cache domains, this shouldn't be an issue.
>
>  2) There is a potential issue if a surface gets placed at the very top
> of the GTT by the kernel.  In this case, the hardware could
> potentially end up trying to read past the top of the GTT.  If it
> nicely wraps around at the 48-bit (or 32-bit) boundary, then this
> shouldn't be an issue thanks to the scratch page.  If it doesn't,
> then we need to come up with something to handle it.
>
> Up until some of the GL move to ISL, having the padding code in there
> just caused us to harmlessly use a bit more memory in Vulkan.  However,
> now that we're using ISL sizes to validate external dma-buf images,
> these padding requirements are causing us to reject otherwise valid
> images due to the size of the BO being too small.
>
> Cc: "17.2" 
> Cc: Chad Versace 
> Tested-by: Tapani Pälli 
> Tested-by: Tomasz Figa 
> ---
>  src/intel/isl/isl.c | 119 +---
>  1 file changed, 2 insertions(+), 117 deletions(-)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 5e3d279..d3124de 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1374,116 +1374,6 @@ isl_calc_row_pitch(const struct isl_device *dev,
> return true;
>  }
>
> -/**
> - * Calculate and apply any padding required for the surface.
> - *
> - * @param[inout] total_h_el is updated with the new height
> - * @param[out] pad_bytes is overwritten with additional padding requirements.
> - */
> -static void
> -isl_apply_surface_padding(const struct isl_device *dev,
> -  const struct isl_surf_init_info *restrict info,
> -  const struct isl_tile_info *tile_info,
> -  uint32_t *total_h_el,
> -  uint32_t *pad_bytes)
> -{
> -   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
> -
> -   *pad_bytes = 0;
> -
> -   /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
> -* Formats >> Surface Padding Requirements >> Render Target and Media
> -* Surfaces:
> -*
> -*   The data port accesses data (pixels) outside of the surface if they
> -*   are contained in the same cache request as pixels that are within the
> -*   surface. These pixels will not be returned by the requesting message,
> -*   however if these pixels lie outside of defined pages in the GTT,
> -*   a GTT error will result when the cache request is processed. In
> -*   order to avoid these GTT errors, “padding” at the bottom of the
> -*   surface is sometimes necessary.
> -*
> -* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
> -* Formats >> Surface Padding Requirements >> Sampling Engine Surfaces:
> -*
> -*... Lots of padding requirements, all listed separately below.
> -*/
> -
> -   /* We can safely ignore the first padding requirement, quoted below,
> -* because isl doesn't do buffers.
> -*
> -*- [pre-BDW] For buffers, which have no inherent “height,” padding
> -*  requirements are different. A buffer must be padded to the next
> -*  multiple of 256 array elements, with an additional 16 bytes
> added
> -*  beyond that to account for the L1 cache line.
> -*/
> -
> -   /*
> -*- For compressed textures [...], padding at the bottom of the
> surface
> -*  is to an even compressed row.
> -*/
> -   if (isl_format_is_compressed(info->format))
> -  *total_h_el = isl_align(*total_h_el, 2);
> -
> -   

Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS

2017-08-04 Thread Emil Velikov
On 4 August 2017 at 14:23, Tomasz Figa  wrote:

>
> If this needs such a complicated series of checks, maybe it would make
> more sense to just set enable_out_fence based on availability of the
> capability at initialization time?
>
Either way is fine with me.

>> Did you drop it altogether or change it to use some other surface?
>> Would be nice to hear the reason why it was added - perhaps I'm
>> missing something.
>
> We have to keep it, otherwise there would be no fence available at the
> time of surface destruction, while, at least for Android, a fence can
> be passed to window's cancelBuffer callback.
>
>>
>> I think that we want a fence/fd for the new draw surface. Since
>> otherwise one won't get created up until the first SwapBuffers call.
>
> I might be missing something, but wouldn't that insert a fence at the
> beginning of command stream, before even doing anything? At least in
> Android use cases, the only places we need the fence is in SwapBuffers
> and DestroySurface and the fence should be inserted after all the
> commands for rendering into given surface.
>
Thanks for the correction. You're absolutely correct in both cases.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8 v2] A few clover fixes for both CTS and eventual 1.2 support

2017-08-04 Thread Jan Vesely
Hi,

I went through most of the series. I think the approach is OK. The
biggest issue I had is with the sequence:
1.) add an interface
2.) implement a feature
3.) change the interface

I gave my rb to 1 and 2, but you might want to consider changing them
as well, if returning int from the functions is better. Generating
strings is IMO easier/faster than parsing them.

Also, you might want to consider cc'ing Francisco as he'll have the
final say and might see things differently.

thanks,
Jan

On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> I've dropped the first patch of the previous series for now. I'm not
> withdrawing it completely, just going to see if there's anything about
> the user_ptr stuff that could have been causing the issue instead, and
> if I'm using too big a hammer in this patch. If I convince myself of its
> correctness, it'll be back.
> 
> The rest of the patches move the device version declaration to core/device
> and then use that along with the -cl-std option to determine which
> OpenCL language version to enable in clang.
> 
> I've done a full piglit run (again) before/after, and there are no changes
> for me on radeonsi/pitcairn if the device is left at CL 1.1.
> 
> When I bump my platform/device versions to 1.2, the clang instance has
> been confirmed to enable 1.2 language features (like the static keyword
> required in test/cl/program/execute/static.cl, which goes skip->pass).
> 
> Major changes since v1:
>   Addressed Pierre's build-breakage comments
>   Added a check for cl-std > device_clc_version
>   Added a patch to pass the device object down into invocation.cpp
> instead of adding a bunch of device-based arguments.
>   Use device_clc_version for cl version detection instead of device_version
>   Added device_clc_version in device.cpp/hpp
> 
> Anyway, happy reviewing.
> 
> Cc: Jan Vesely 
> Cc: Pierre Moreau 
> 




Re: [Mesa-dev] No reloc for i965

2017-08-04 Thread Kenneth Graunke
On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> Patch reordering from last time so that the cosmetic tweaks are done first
> and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> hopefully it doesn't look too bad and we can land at least as far as
> there (patch 8/10).
> 
> Thanks,
> -Chris

I split up some patches and pushed a modified version of this series.

To ssh://git.freedesktop.org/git/mesa/mesa
   5c007203b73..6c530ad1160  master -> master

Thanks a ton for getting us to NO_RELOC.  I really like the new reloc
flags system as well.  It's so much nicer!

--Ken



Re: [Mesa-dev] [PATCH 8/8] clover/llvm: Make __OPENCL_VERSION__ dynamic

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> Signed-off-by: Aaron Watry 
> CC: Jan Vesely 
> 
> v2: base it on the device version
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 63b2961752..443cd31e66 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -224,7 +224,8 @@ namespace {
>c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
>  
>// Add definition for the OpenCL version
> -  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
> +  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
> +          std::to_string(get_language_from_version_str(dev.device_version())));

I don't think you can use the same parsing function here.
__OPENCL_VERSION__ can go up to 2.2, while __OPENCL_C_VERSION__ is max
2.0

Jan
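
To illustrate the point above, here is a minimal sketch (not the actual
clover code) of deriving the __OPENCL_VERSION__ value from the device
version string, kept separate from the language-standard lookup. The helper
names version_to_number and opencl_version_macro are hypothetical, and the
sketch assumes the string has the form "major.minor" with a single-digit
minor.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a "major.minor" version string such as "1.1"
// into the integer encoding used by __OPENCL_VERSION__
// (major * 100 + minor * 10, so "1.1" becomes 110).
unsigned version_to_number(const std::string &version) {
   const auto dot = version.find('.');
   if (dot == std::string::npos || dot == 0 || dot + 1 >= version.size())
      throw std::invalid_argument("expected \"major.minor\": " + version);

   // std::stoul throws std::invalid_argument if a part is not numeric.
   const unsigned major = std::stoul(version.substr(0, dot));
   const unsigned minor = std::stoul(version.substr(dot + 1));
   return major * 100 + minor * 10;
}

// Hypothetical helper: build the macro definition string that would be
// handed to the preprocessor options.
std::string opencl_version_macro(const std::string &device_version) {
   return "__OPENCL_VERSION__=" +
          std::to_string(version_to_number(device_version));
}
```

With a split like this, the macro can follow the device version all the way
up, while the OpenCL C language standard is chosen (and capped) by a
different function.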

>  
>// clc.h requires that this macro be defined:
>
> c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");




Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency

2017-08-04 Thread Nicolai Hähnle

On 04.08.2017 20:18, Emil Velikov wrote:

From: Emil Velikov 

Currently xmlconfig is conditionally used, only when --enable-dri is
available.

As the library has moved to src/util and has a wider userbase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.

Unconditionally enable xmlconfig and drop the linking. As said before
there's other users of the library, so depending on the configure
options we will get multiple definitions of said symbols.

NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.

Cc: Nicolai Hähnle 
Cc: Aaron Watry 
Signed-off-by: Emil Velikov 
---
Nicolai, here is an alternative solution.

I have a very slight inclination towards this one over your earlier
patch. But either one should do, really.


This looks reasonable to me, and you're the build system expert, so go 
for it :)


Patch is

Reviewed-by: Nicolai Hähnle 



---
  src/egl/Makefile.am   |  8 ++--
  src/gallium/auxiliary/pipe-loader/Makefile.am |  6 --
  src/gallium/targets/opencl/Makefile.am|  1 -
  src/gbm/Makefile.am   |  1 +
  src/glx/Makefile.am   |  4 +++-
  src/loader/Makefile.am| 15 ++-
  6 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index ecaf148aaec..bb8ec9745dd 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \
$(dri2_backend_FILES) \
$(dri3_backend_FILES)
  
-libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la

-libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB)
+libEGL_common_la_LIBADD += \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
+   $(DLOPEN_LIBS) \
+   $(LIBDRM_LIBS) \
+   $(CLOCK_LIB)
  
  GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \

generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \
diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am 
b/src/gallium/auxiliary/pipe-loader/Makefile.am
index 4ebfc97e6d9..878159f2343 100644
--- a/src/gallium/auxiliary/pipe-loader/Makefile.am
+++ b/src/gallium/auxiliary/pipe-loader/Makefile.am
@@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \
  endif
  
  libpipe_loader_static_la_LIBADD = \

-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
  
  libpipe_loader_dynamic_la_LIBADD = \

-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
  
  EXTRA_DIST = SConscript

diff --git a/src/gallium/targets/opencl/Makefile.am 
b/src/gallium/targets/opencl/Makefile.am
index e88fa0fd382..c9d2be7afd0 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/util/libmesautil.la \
-   $(top_builddir)/src/util/libxmlconfig.la \
$(EXPAT_LIBS) \
$(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am
index de8396000b7..7a9a12f87a0 100644
--- a/src/gbm/Makefile.am
+++ b/src/gbm/Makefile.am
@@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \
  
  libgbm_la_LIBADD = \

$(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
$(DLOPEN_LIBS)
  
  if HAVE_PLATFORM_WAYLAND

diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index b306bcc08db..34600475d98 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -97,7 +97,9 @@ libglx_la_SOURCES = \
singlepix.c \
vertarr.c
  
-libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la

+libglx_la_LIBADD = \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
  
  if HAVE_DRISW

  libglx_la_SOURCES += \
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index 8b197f2995c..74ac6c51e77 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -26,6 +26,8 @@ EXTRA_DIST = SConscript
  noinst_LTLIBRARIES = libloader.la
  
  AM_CPPFLAGS = \

+   -I$(top_builddir)/src/util/ \
+   -DUSE_DRICONF \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
@@ -37,19 +39,6 @@ libloader_la_CPPFLAGS = $(AM_CPPFLAGS)
  libloader_la_SOURCES = $(LOADER_C_FILES)
  libloader_la_LIBADD =
  
-if HAVE_DRICOMMON

-libloader_la_CPPFLAGS 

Re: [Mesa-dev] [PATCH 5/8] clover/llvm: Use device in llvm compilation instead of copying fields

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> Copying the individual fields from the device when compiling/linking
> will lead to an unnecessarily large number of fields getting passed
> around.
> 
> Signed-off-by: Aaron Watry 
> Cc: Jan Vesely 

I think this should be patch 3/8. It looks weird to implement new
functionality one way, only to change it to a different interface in
the same patch series.

Jan

> ---
>  src/gallium/state_trackers/clover/core/program.cpp |  9 +++--
>  .../state_trackers/clover/llvm/invocation.cpp  | 22 
> ++
>  .../state_trackers/clover/llvm/invocation.hpp  |  7 ++-
>  3 files changed, 15 insertions(+), 23 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
> b/src/gallium/state_trackers/clover/core/program.cpp
> index f0f0f38548..4e74fccd97 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -53,9 +53,8 @@ program::compile(const ref_vector , const 
> std::string ,
>   try {
>  const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
>tgsi::compile_program(_source, log) :
> -  llvm::compile_program(_source, headers,
> -dev.ir_target(), opts,
> -
> dev.device_clc_version(), log));
> +  llvm::compile_program(_source, headers, dev,
> +opts, log));
>  _builds[] = { m, opts, log };
>   } catch (...) {
>  _builds[] = { module(), opts, log };
> @@ -79,9 +78,7 @@ program::link(const ref_vector , const 
> std::string ,
>try {
>   const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
> tgsi::link_program(ms) :
> -   llvm::link_program(ms, dev.ir_format(),
> -  dev.ir_target(), opts,
> -  dev.device_clc_version(), 
> log));
> +   llvm::link_program(ms, dev, opts, log));
>   _builds[] = { m, opts, log };
>} catch (...) {
>   _builds[] = { module(), opts, log };
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index ca75596b05..e761ca188d 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -261,17 +261,16 @@ namespace {
>  module
>  clover::llvm::compile_program(const std::string ,
>const header_map ,
> -  const std::string ,
> +  const device ,
>const std::string ,
> -  const std::string _version,
>std::string _log) {
> if (has_flag(debug::clc))
>debug::log(".cl", "// Options: " + opts + '\n' + source);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> - device_version, r_log);
> -   auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
> +   auto c = create_compiler_instance(dev.ir_target(), tokenize(opts + " 
> input.cl"),
> + dev.device_clc_version(), r_log);
> +   auto mod = compile(*ctx, *c, "input.cl", source, headers, 
> dev.ir_target(), opts,
>r_log);
>  
> if (has_flag(debug::llvm))
> @@ -330,16 +329,15 @@ namespace {
>  
>  module
>  clover::llvm::link_program(const std::vector ,
> -   enum pipe_shader_ir ir, const std::string ,
> +   const device ,
> const std::string ,
> -   const std::string _version,
> std::string _log) {
> std::vector options = tokenize(opts + " input.cl");
> const bool create_library = count("-create-library", options);
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, options, device_version, r_log);
> +   auto c = create_compiler_instance(dev.ir_target(), options, 
> dev.device_clc_version(), r_log);
> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
> @@ -354,14 +352,14 @@ clover::llvm::link_program(const std::vector 
> ,
> if (create_library) {
>return build_module_library(*mod, module::section::text_library);
>  
> -   } else if (ir == PIPE_SHADER_IR_LLVM) {
> +   } else if (dev.ir_format() == 

Re: [Mesa-dev] [PATCH 4/8] clover/llvm: Use -cl-std and device version to select language defaults

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>  1) If you have -cl-std=CL1.1+ use the version specified
>  2) If not, use the highest 1.x version that the device supports
> 
> Curiously, there is no valid value for -cl-std=CL1.0
> 
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> 
> v2: (Pierre) Move create_compiler_instance changes to correct patch
> to prevent temporary build breakage.
> Convert version_str into unsigned and use it to find language version
> Add build_error for unknown language version string
> Whitespace fixes
> ---
>  .../state_trackers/clover/llvm/invocation.cpp  | 61 
> +-
>  1 file changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 7c8d0e738d..ca75596b05 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -93,6 +93,65 @@ namespace {
>return ctx;
> }
>  
> +   unsigned get_language_version_from_string(const std::string _str){
> +  if (version_str == "1.0"){
> + return 100;
> +  }
> +  if (version_str == "1.1"){
> + return 110;
> +  }
> +  if (version_str == "1.2"){
> + return 120;
> +  }
> +  if (version_str == "2.0"){
> + return 200;
> +  }
> +  throw build_error("Unknown/Unsupported language version");
> +   }

I'm a bit conflicted about this. Returning int from device.cl_version()
might be nicer, but we are using C++ strings, so we probably don't have to
worry about generating new strings all the time.

> +
> +   clang::LangStandard::Kind
> +   get_language_from_version_str(const std::string _str,
> + bool is_opt = false) {
> +   /**
> +* Per CL 2.0 spec, section 5.8.4.5:
> +* If it's an option, use the value directly.
> +* If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> +*/
> +  unsigned version = get_language_version_from_string(version_str);
> +  if (!is_opt && version > 120 ){
> + version = 120;
> +  }
> +  switch (version){
> + case 100:
> +return clang::LangStandard::lang_opencl10;
> + case 110:
> +return clang::LangStandard::lang_opencl11;
> + case 120:
> +return clang::LangStandard::lang_opencl12;
> + case 200:
> +return clang::LangStandard::lang_opencl20;
> + default:
> +throw build_error("Unknown/Unsupported language version");
> +  }
> +   }
> +
> +   clang::LangStandard::Kind
> +   get_language_version(const std::vector ,
> +const std::string _version) {
> +
> +  const std::string search = "-cl-std=CL";
> +
> +  for(auto opt: opts){
> + auto pos = opt.find(search);
> + if (pos == 0){
> +auto ver = opt.substr(pos+search.size());
> +return get_language_from_version_str(ver, true);
> + }
> +  }

I don't think you need the above. We only set the defaults, so clang
should be able to parse this option on its own if we pass it along.

> +
> +  return get_language_from_version_str(device_version);
> +   }
> +
> std::unique_ptr
> create_compiler_instance(const target ,
>  const std::vector ,
> @@ -129,7 +188,7 @@ namespace {
>compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
>  compat::ik_opencl, 
> ::llvm::Triple(target.triple),
>  c->getPreprocessorOpts(),
> -clang::LangStandard::lang_opencl11);
> +get_language_version(opts, device_version));

I'd imagine this could be something like
get_language_from_version(std::max(dev.clc_version(), 120))

Jan
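
A minimal sketch of that idea, assuming a hypothetical integer encoding of
the CLC version (110, 120, 200) and a hypothetical helper name
default_clc_version. Note that capping the default at 1.2, as the patch
does, is an upper bound, so it corresponds to std::min rather than std::max.

```cpp
// Hypothetical encoding: 110 == OpenCL C 1.1, 120 == 1.2, 200 == 2.0.
constexpr unsigned max_default_clc_version = 120;

// Pick the default language version when no -cl-std option is given:
// the device's CLC version, capped at the highest 1.x version (1.2).
// Same result as std::min(device_clc_version, max_default_clc_version),
// written as a conditional so it stays constexpr in C++11.
constexpr unsigned default_clc_version(unsigned device_clc_version) {
   return device_clc_version < max_default_clc_version
             ? device_clc_version
             : max_default_clc_version;
}
```

A 1.1 device keeps 110, while a 2.0 device falls back to 120 unless the
user explicitly asks for CL2.0 through -cl-std.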

>  
>c->createDiagnostics(new clang::TextDiagnosticPrinter(
>*new raw_string_ostream(r_log),




Re: [Mesa-dev] [PATCH] loader: always include libxmlconfig on autotools build

2017-08-04 Thread Emil Velikov
On 4 August 2017 at 10:53, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> This aligns with the fact that we also check for EXPAT_LIBS
> unconditionally in configure.ac now. It should make all the
> various build permutations of Clover work (whether DRI is
> enabled or disabled in the build).
>
> Cc: Aaron Watry 
> Cc: Emil Velikov 
> --
> This change keeps everything green on Travis, and it should fix
> the duplicate-symbol linker error seen by Aaron and others when
> building Clover.
> ---
>  src/gallium/targets/opencl/Makefile.am |  1 -
>  src/loader/Makefile.am | 13 +
>  2 files changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/targets/opencl/Makefile.am 
> b/src/gallium/targets/opencl/Makefile.am
> index e88fa0fd382..c9d2be7afd0 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
> $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
> $(top_builddir)/src/gallium/auxiliary/libgallium.la \
> $(top_builddir)/src/util/libmesautil.la \
> -   $(top_builddir)/src/util/libxmlconfig.la \
> $(EXPAT_LIBS) \
> $(LIBELF_LIBS) \
> $(DLOPEN_LIBS) \
> diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
> index 8b197f2995c..5ed87820664 100644
> --- a/src/loader/Makefile.am
> +++ b/src/loader/Makefile.am
> @@ -33,21 +33,18 @@ AM_CPPFLAGS = \
> $(XCB_DRI3_CFLAGS) \
> $(LIBDRM_CFLAGS)
>
> -libloader_la_CPPFLAGS = $(AM_CPPFLAGS)
> +libloader_la_CPPFLAGS = $(AM_CPPFLAGS) \
> +   -DUSE_DRICONF
>  libloader_la_SOURCES = $(LOADER_C_FILES)
> -libloader_la_LIBADD =
> +libloader_la_LIBADD = \
> +   $(top_builddir)/src/util/libxmlconfig.la
>
>  if HAVE_DRICOMMON
>  libloader_la_CPPFLAGS += \
> -I$(top_builddir)/src/util/ \
> -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
> -I$(top_srcdir)/src/mesa/ \
> -   -I$(top_srcdir)/src/mapi/ \
> -   -DUSE_DRICONF
> -
> -libloader_la_LIBADD += \
> -   $(top_builddir)/src/util/libxmlconfig.la
> -
> +   -I$(top_srcdir)/src/mapi/
Just sent an alternative solution. It's a bit more invasive, so I'll
understand if you prefer this one.

Sidenote: dri/common, mesa and mapi are no longer needed. One could
drop them as follow-up.

Please drop the HAVE_DRICOMMON guard, and assign libloader_la_CPPFLAGS
at once. With that
Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH 3/3] WIP: loader: android: allow using of xmlconfig

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

Brings the Android binaries on par with Autoconf, allowing users to
select their GPU via device_id.

Signed-off-by: Emil Velikov 
---
Completely untested. Posting in case anyone is interested in polishing it up.
---
 src/loader/Android.mk | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/loader/Android.mk b/src/loader/Android.mk
index ca9218846c9..4a45bf61865 100644
--- a/src/loader/Android.mk
+++ b/src/loader/Android.mk
@@ -33,6 +33,9 @@ include $(CLEAR_VARS)
 LOCAL_SRC_FILES := \
$(LOADER_C_FILES)
 
+# XXX: might need an include for the generated xmlconfig files
+LOCAL_CPPFLAGS := -DUSE_DRICONF
+
 LOCAL_EXPORT_C_INCLUDE_DIRS := $(LOCAL_PATH)
 
 LOCAL_MODULE := libmesa_loader
-- 
2.13.3



[Mesa-dev] [PATCH 2/3] WIP: loader: scons: allow using of xmlconfig on supported platforms

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

Brings the SCons binaries on par with Autoconf, allowing users to select
their GPU via device_id.

Signed-off-by: Emil Velikov 
---
Completely untested. Posting in case anyone is interested in polishing it up.
---
 src/loader/SConscript | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/loader/SConscript b/src/loader/SConscript
index f70654f43ae..e3474b2e4f0 100644
--- a/src/loader/SConscript
+++ b/src/loader/SConscript
@@ -12,6 +12,10 @@ if env['drm']:
 env.PkgUseModules('DRM')
 env.Append(CPPDEFINES = ['HAVE_LIBDRM'])
 
+# XXX: might need an include for the generated xmlconfig files
+if env['dri']:
+env.Append(CPPDEFINES = ['USE_DRICONF'])
+
 # parse Makefile.sources
 sources = env.ParseSourceList('Makefile.sources', 'LOADER_C_FILES')
 
-- 
2.13.3



[Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency

2017-08-04 Thread Emil Velikov
From: Emil Velikov 

Currently xmlconfig is conditionally used, only when --enable-dri is
available.

As the library has moved to src/util and has a wider userbase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.

Unconditionally enable xmlconfig and drop the linking. As said before
there's other users of the library, so depending on the configure
options we will get multiple definitions of said symbols.

NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.

Cc: Nicolai Hähnle 
Cc: Aaron Watry 
Signed-off-by: Emil Velikov 
---
Nicolai, here is an alternative solution.

I have a very slight inclination towards this one over your earlier
patch. But either one should do, really.
---
 src/egl/Makefile.am   |  8 ++--
 src/gallium/auxiliary/pipe-loader/Makefile.am |  6 --
 src/gallium/targets/opencl/Makefile.am|  1 -
 src/gbm/Makefile.am   |  1 +
 src/glx/Makefile.am   |  4 +++-
 src/loader/Makefile.am| 15 ++-
 6 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index ecaf148aaec..bb8ec9745dd 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \
$(dri2_backend_FILES) \
$(dri3_backend_FILES)
 
-libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la
-libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB)
+libEGL_common_la_LIBADD += \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
+   $(DLOPEN_LIBS) \
+   $(LIBDRM_LIBS) \
+   $(CLOCK_LIB)
 
 GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \
generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \
diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am 
b/src/gallium/auxiliary/pipe-loader/Makefile.am
index 4ebfc97e6d9..878159f2343 100644
--- a/src/gallium/auxiliary/pipe-loader/Makefile.am
+++ b/src/gallium/auxiliary/pipe-loader/Makefile.am
@@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \
 endif
 
 libpipe_loader_static_la_LIBADD = \
-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
 
 libpipe_loader_dynamic_la_LIBADD = \
-   $(top_builddir)/src/loader/libloader.la
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
 
 EXTRA_DIST = SConscript
diff --git a/src/gallium/targets/opencl/Makefile.am 
b/src/gallium/targets/opencl/Makefile.am
index e88fa0fd382..c9d2be7afd0 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/util/libmesautil.la \
-   $(top_builddir)/src/util/libxmlconfig.la \
$(EXPAT_LIBS) \
$(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am
index de8396000b7..7a9a12f87a0 100644
--- a/src/gbm/Makefile.am
+++ b/src/gbm/Makefile.am
@@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \
 
 libgbm_la_LIBADD = \
$(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la \
$(DLOPEN_LIBS)
 
 if HAVE_PLATFORM_WAYLAND
diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index b306bcc08db..34600475d98 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -97,7 +97,9 @@ libglx_la_SOURCES = \
singlepix.c \
vertarr.c
 
-libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la
+libglx_la_LIBADD = \
+   $(top_builddir)/src/loader/libloader.la \
+   $(top_builddir)/src/util/libxmlconfig.la
 
 if HAVE_DRISW
 libglx_la_SOURCES += \
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index 8b197f2995c..74ac6c51e77 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -26,6 +26,8 @@ EXTRA_DIST = SConscript
 noinst_LTLIBRARIES = libloader.la
 
 AM_CPPFLAGS = \
+   -I$(top_builddir)/src/util/ \
+   -DUSE_DRICONF \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
@@ -37,19 +39,6 @@ libloader_la_CPPFLAGS = $(AM_CPPFLAGS)
 libloader_la_SOURCES = $(LOADER_C_FILES)
 libloader_la_LIBADD =
 
-if HAVE_DRICOMMON
-libloader_la_CPPFLAGS += \
-   -I$(top_builddir)/src/util/ \
-   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-   -I$(top_srcdir)/src/mesa/ \
-   -I$(top_srcdir)/src/mapi/ \
-   -DUSE_DRICONF
-
-libloader_la_LIBADD += \
-   

Re: [Mesa-dev] [PATCH 3/8] clover: Add device_clc_version to llvm::[compile|link]_program

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> We'll be using it to select the default language version soon.
> 
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> Cc: Jan Vesely 
> 
> v2: (Pierre) Move changes to create_compiler_instance invocation to correct
> patch to prevent temporary build breakage.
> (Jan) Use device_clc_version instead of device_version for compile/link
> ---

This patch looks redundant wrt changes in 5/8. Why not just add the device
parameter right away instead of adding the version and then changing it
later?

Jan

>  src/gallium/state_trackers/clover/core/program.cpp|  6 --
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 10 +++---
>  src/gallium/state_trackers/clover/llvm/invocation.hpp |  2 ++
>  3 files changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
> b/src/gallium/state_trackers/clover/core/program.cpp
> index ae4b50a879..f0f0f38548 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -54,7 +54,8 @@ program::compile(const ref_vector , const 
> std::string ,
>  const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
>tgsi::compile_program(_source, log) :
>llvm::compile_program(_source, headers,
> -dev.ir_target(), opts, 
> log));
> +dev.ir_target(), opts,
> +
> dev.device_clc_version(), log));
>  _builds[] = { m, opts, log };
>   } catch (...) {
>  _builds[] = { module(), opts, log };
> @@ -79,7 +80,8 @@ program::link(const ref_vector , const 
> std::string ,
>   const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
> tgsi::link_program(ms) :
> llvm::link_program(ms, dev.ir_format(),
> -  dev.ir_target(), opts, log));
> +  dev.ir_target(), opts,
> +  dev.device_clc_version(), 
> log));
>   _builds[] = { m, opts, log };
>} catch (...) {
>   _builds[] = { module(), opts, log };
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 6412377faa..7c8d0e738d 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -96,6 +96,7 @@ namespace {
> std::unique_ptr
> create_compiler_instance(const target ,
>  const std::vector ,
> +const std::string _version,
>  std::string _log) {
>std::unique_ptr c { new 
> clang::CompilerInstance };
>clang::TextDiagnosticBuffer *diag_buffer = new 
> clang::TextDiagnosticBuffer;
> @@ -203,13 +204,14 @@ clover::llvm::compile_program(const std::string ,
>const header_map ,
>const std::string ,
>const std::string ,
> +  const std::string _version,
>std::string _log) {
> if (has_flag(debug::clc))
>debug::log(".cl", "// Options: " + opts + '\n' + source);
>  
> auto ctx = create_context(r_log);
> auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> - r_log);
> + device_version, r_log);
> auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
>r_log);
>  
> @@ -270,13 +272,15 @@ namespace {
>  module
>  clover::llvm::link_program(const std::vector ,
> enum pipe_shader_ir ir, const std::string ,
> -   const std::string , std::string _log) {
> +   const std::string ,
> +   const std::string _version,
> +   std::string _log) {
> std::vector options = tokenize(opts + " input.cl");
> const bool create_library = count("-create-library", options);
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, options, r_log);
> +   auto c = create_compiler_instance(target, options, device_version, r_log);
> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.hpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.hpp
> index 

Re: [Mesa-dev] [PATCH 2/8] clover: Add device_clc_version to device.[hc]pp

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> device_version and device_clc_version are not necessarily the same for
> devices that support CL 1.0, but have a 1.1 compiler and the necessary
> extensions.
> 
> CC: Jan Vesely 

I think you might consider squashing 1/8 and 2/8. Squashed or not:
Reviewed-by: Jan Vesely 

Jan

> ---
>  src/gallium/state_trackers/clover/api/device.cpp  | 2 +-
>  src/gallium/state_trackers/clover/core/device.cpp | 5 +
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
> b/src/gallium/state_trackers/clover/api/device.cpp
> index 18ed2f059f..b1b7917e4e 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -368,7 +368,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>break;
>  
> case CL_DEVICE_OPENCL_C_VERSION:
> -  buf.as_string() = "OpenCL C " + dev.device_version() + " ";
> +  buf.as_string() = "OpenCL C " + dev.device_clc_version() + " ";
>break;
>  
> case CL_DEVICE_PRINTF_BUFFER_SIZE:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 0277495506..68856ae36b 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -245,3 +245,8 @@ std::string
>  device::device_version() const {
>  return "1.1";
>  }
> +
> +std::string
> +device::device_clc_version() const {
> +return "1.1";
> +}
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp 
> b/src/gallium/state_trackers/clover/core/device.hpp
> index 3cf7e20be5..efc217aedb 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -75,6 +75,7 @@ namespace clover {
>std::string device_name() const;
>std::string vendor_name() const;
>std::string device_version() const;
> +  std::string device_clc_version() const;
>enum pipe_shader_ir ir_format() const;
>std::string ir_target() const;
>enum pipe_endian endianness() const;




Re: [Mesa-dev] [PATCH 1/8] clover/device: Move device version into core/device.cpp

2017-08-04 Thread Jan Vesely
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> The device version is the maximum CL version that the device supports.
> 
> Eventually, this will be based on the features/extensions of the actual
> device, but for now move it a bit closer to its eventual destination.
> 
> Signed-off-by: Aaron Watry 
> ---
>  src/gallium/state_trackers/clover/api/device.cpp  | 4 ++--
>  src/gallium/state_trackers/clover/core/device.cpp | 5 +
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
> b/src/gallium/state_trackers/clover/api/device.cpp
> index 0b33350bb2..18ed2f059f 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -314,7 +314,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>break;
>  
> case CL_DEVICE_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  buf.as_string() = "OpenCL " + dev.device_version() + " Mesa " 
> PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> @@ -368,7 +368,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>break;
>  
> case CL_DEVICE_OPENCL_C_VERSION:
> -  buf.as_string() = "OpenCL C 1.1 ";
> +  buf.as_string() = "OpenCL C " + dev.device_version() + " ";
>break;

This chunk looks out of place, especially since you change it again in
2/8. With this fixed:
Reviewed-by: Jan Vesely 

Jan

>  
> case CL_DEVICE_PRINTF_BUFFER_SIZE:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 2ad9e49cf8..0277495506 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -240,3 +240,8 @@ enum pipe_endian
>  device::endianness() const {
> return (enum pipe_endian)pipe->get_param(pipe, PIPE_CAP_ENDIANNESS);
>  }
> +
> +std::string
> +device::device_version() const {
> +return "1.1";
> +}
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp 
> b/src/gallium/state_trackers/clover/core/device.hpp
> index 7b3353df34..3cf7e20be5 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -74,6 +74,7 @@ namespace clover {
>cl_uint address_bits() const;
>std::string device_name() const;
>std::string vendor_name() const;
> +  std::string device_version() const;
>enum pipe_shader_ir ir_format() const;
>std::string ir_target() const;
>enum pipe_endian endianness() const;




[Mesa-dev] [PATCH 24/25] i965: Mark brw_hw_type_to_reg_type() as a pure function

2017-08-04 Thread Matt Turner
   text    data     bss     dec     hex filename
7816886  346248  420496 8583630  82f9ce i965_dri.so before
7816214  346248  420496 8582958  82f72e i965_dri.so after
---
 src/intel/compiler/brw_reg_type.h | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_reg_type.h 
b/src/intel/compiler/brw_reg_type.h
index 5d05f293c6..08dc1715a9 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -28,6 +28,12 @@
 extern "C" {
 #endif
 
+#ifdef HAVE_FUNC_ATTRIBUTE_PURE
+#define ATTRIBUTE_PURE __attribute__((__pure__))
+#else
+#define ATTRIBUTE_PURE
+#endif
+
 enum brw_reg_file;
 struct gen_device_info;
 
@@ -59,7 +65,7 @@ unsigned
 brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, enum brw_reg_type type);
 
-enum brw_reg_type
+enum brw_reg_type ATTRIBUTE_PURE
 brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, unsigned hw_type);
 
-- 
2.13.0
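
For readers unfamiliar with the attribute, here is a minimal sketch of what marking a function pure allows. It is not code from the patch: the `EXAMPLE_PURE` macro, the size table, and both functions are hypothetical, and the macro tests the compiler directly instead of using Mesa's HAVE_FUNC_ATTRIBUTE_PURE configure check.

```c
#include <assert.h>

/* Assumption: detect GCC/clang directly; the patch uses a configure-time
 * HAVE_FUNC_ATTRIBUTE_PURE check instead. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_PURE __attribute__((__pure__))
#else
#define EXAMPLE_PURE
#endif

/* Hypothetical per-type sizes in bytes; read-only data. */
static const unsigned char example_sizes[] = { 4, 4, 2, 2, 1, 1 };

/* The result depends only on the argument and on memory that is never
 * written, and the function has no side effects, so "pure" is valid. */
EXAMPLE_PURE unsigned
example_type_size(unsigned type)
{
   if (type >= sizeof(example_sizes) / sizeof(example_sizes[0]))
      return 0;
   return example_sizes[type];
}

/* Because the callee is pure, the compiler is allowed to evaluate the
 * call once and reuse the value instead of calling it on every pass. */
unsigned
example_total_bytes(unsigned type, unsigned count)
{
   assert(count <= 65536); /* keep the sum far away from overflow */
   unsigned total = 0;
   for (unsigned i = 0; i < count; i++)
      total += example_type_size(type);
   return total;
}
```

The size reduction quoted in the commit message presumably comes from this kind of call merging at the many inlined call sites, though the patch itself does not say so.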



[Mesa-dev] [PATCH 22/25] i965: Stop using hardware register types directly

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_disasm.c  |  47 -
 src/intel/compiler/brw_eu_validate.c | 196 ---
 src/intel/compiler/brw_reg_type.c|  17 +--
 src/intel/compiler/brw_reg_type.h|  11 +-
 4 files changed, 113 insertions(+), 158 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 02b48c9cf2..e2675b5f4c 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -707,7 +707,8 @@ reg(FILE *file, unsigned _reg_file, unsigned _reg_nr)
 static int
 dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst)
 {
-   unsigned elem_size = brw_element_size(devinfo, inst, dst);
+   enum brw_reg_type type = brw_inst_dst_type(devinfo, inst);
+   unsigned elem_size = brw_reg_type_to_size(type);
int err = 0;
 
if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
@@ -723,10 +724,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  err |= control(file, "horiz stride", horiz_stride,
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
- string(file,
-brw_hw_reg_type_to_letters(devinfo,
-   brw_inst_dst_reg_file(devinfo, 
inst),
-   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
+ string(file, brw_reg_type_to_letters(type));
   } else {
  string(file, "g[a0");
  if (brw_inst_dst_ia_subreg_nr(devinfo, inst))
@@ -738,10 +736,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  err |= control(file, "horiz stride", horiz_stride,
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
- string(file,
-brw_hw_reg_type_to_letters(devinfo,
-   brw_inst_dst_reg_file(devinfo, 
inst),
-   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
+ string(file, brw_reg_type_to_letters(type));
   }
} else {
   if (brw_inst_dst_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) {
@@ -754,10 +749,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  string(file, "<1>");
  err |= control(file, "writemask", writemask,
 brw_inst_da16_writemask(devinfo, inst), NULL);
- string(file,
-brw_hw_reg_type_to_letters(devinfo,
-   brw_inst_dst_reg_file(devinfo, 
inst),
-   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
+ string(file, brw_reg_type_to_letters(type));
   } else {
  err = 1;
  string(file, "Indirect align16 address mode not supported");
@@ -812,7 +804,7 @@ static int
 src_da1(FILE *file,
 const struct gen_device_info *devinfo,
 unsigned opcode,
-unsigned type, unsigned _reg_file,
+enum brw_reg_type type, unsigned _reg_file,
 unsigned _vert_stride, unsigned _width, unsigned _horiz_stride,
 unsigned reg_num, unsigned sub_reg_num, unsigned __abs,
 unsigned _negate)
@@ -830,11 +822,11 @@ src_da1(FILE *file,
if (err == -1)
   return 0;
if (sub_reg_num) {
-  unsigned elem_size = brw_hw_reg_type_to_size(devinfo, _reg_file, type);
+  unsigned elem_size = brw_reg_type_to_size(type);
   format(file, ".%d", sub_reg_num / elem_size);   /* use formal style like 
spec */
}
src_align1_region(file, _vert_stride, _width, _horiz_stride);
-   string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type));
+   string(file, brw_reg_type_to_letters(type));
return err;
 }
 
@@ -842,7 +834,7 @@ static int
 src_ia1(FILE *file,
 const struct gen_device_info *devinfo,
 unsigned opcode,
-unsigned type,
+enum brw_reg_type type,
 unsigned _reg_file,
 int _addr_imm,
 unsigned _addr_subreg_nr,
@@ -866,7 +858,7 @@ src_ia1(FILE *file,
   format(file, " %d", _addr_imm);
string(file, "]");
src_align1_region(file, _vert_stride, _width, _horiz_stride);
-   string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type));
+   string(file, brw_reg_type_to_letters(type));
return err;
 }
 
@@ -896,7 +888,7 @@ static int
 src_da16(FILE *file,
  const struct gen_device_info *devinfo,
  unsigned opcode,
- unsigned _reg_type,
+ enum brw_reg_type type,
  unsigned _reg_file,
  unsigned _vert_stride,
  unsigned _reg_nr,
@@ -918,8 +910,7 @@ src_da16(FILE *file,
if (err == -1)
   return 0;
if (_subreg_nr) {
-  unsigned elem_size =
- brw_hw_reg_type_to_size(devinfo, _reg_file, _reg_type);
+  unsigned elem_size = brw_reg_type_to_size(type);
 
   /* bit4 for subreg number byte addressing. Make 

[Mesa-dev] [PATCH 25/25] i965: Optimize reading the destination type

2017-08-04 Thread Matt Turner
brw_hw_type_to_reg_type() needs to know only whether the file is
BRW_IMMEDIATE_VALUE or not, which is not a valid file for the
destination. gcc and clang will evaluate __builtin_strcmp() at compile
time, so we can use it to pass a constant file for the destination.

   text    data     bss     dec     hex filename
7816214  346248  420496 8582958  82f72e i965_dri.so before
7816070  346248  420496 8582814  82f69e i965_dri.so after
---
 src/intel/compiler/brw_inst.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index eacc0a024a..e9dad38f69 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -669,7 +669,9 @@ static inline enum brw_reg_type 
  \
 brw_inst_##reg##_type(const struct gen_device_info *devinfo,  \
   const brw_inst *inst)   \
 { \
-   unsigned file = brw_inst_##reg##_reg_file(devinfo, inst);  \
+   unsigned file = __builtin_strcmp("dst", #reg) == 0 ?   \
+   BRW_GENERAL_REGISTER_FILE :\
+   brw_inst_##reg##_reg_file(devinfo, inst);  \
unsigned hw_type = brw_inst_##reg##_reg_hw_type(devinfo, inst);\
return brw_hw_type_to_reg_type(devinfo, (enum brw_reg_file)file, hw_type); \
 }
-- 
2.13.0
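
The trick in the commit message can be sketched in isolation as follows. All names here (`example_inst`, `EXAMPLE_FILE_GETTER`, the enum values) are hypothetical stand-ins, not the real brw_inst accessors; the only thing taken from the patch is the idea of comparing the stringified macro argument against "dst" with __builtin_strcmp().

```c
#include <assert.h>

/* Hypothetical register files; values are arbitrary for the example. */
enum example_file { EXAMPLE_GRF = 1, EXAMPLE_IMM = 3 };

/* Hypothetical stand-in for the decoded instruction fields. */
struct example_inst {
   unsigned dst_file;
   unsigned src0_file;
};

/* Assumption: fall back to plain strcmp() on compilers without the
 * builtin; the result is the same, only the folding may differ. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_STREQ(a, b) (__builtin_strcmp((a), (b)) == 0)
#else
#include <string.h>
#define EXAMPLE_STREQ(a, b) (strcmp((a), (b)) == 0)
#endif

/* #reg turns the macro argument into a string literal, so the comparison
 * has two constant operands and can be folded at compile time. For "dst"
 * the field read disappears and a constant file is used instead. */
#define EXAMPLE_FILE_GETTER(reg)                                  \
static inline unsigned                                            \
example_##reg##_file(const struct example_inst *inst)             \
{                                                                 \
   assert(inst);                                                  \
   return EXAMPLE_STREQ("dst", #reg) ? (unsigned)EXAMPLE_GRF      \
                                     : inst->reg##_file;          \
}

EXAMPLE_FILE_GETTER(dst)
EXAMPLE_FILE_GETTER(src0)
```

With this sketch, `example_dst_file()` returns the general register file no matter what the field holds, while `example_src0_file()` still reads the field.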



[Mesa-dev] [PATCH 23/25] i965: Hide the register type hardware encodings

2017-08-04 Thread Matt Turner
So we stop mixing them with the logical enum.
---
 src/intel/compiler/brw_eu_defines.h | 31 ---
 src/intel/compiler/brw_reg_type.c   | 31 +++
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/src/intel/compiler/brw_eu_defines.h 
b/src/intel/compiler/brw_eu_defines.h
index 44bde3ff51..da482b73c5 100644
--- a/src/intel/compiler/brw_eu_defines.h
+++ b/src/intel/compiler/brw_eu_defines.h
@@ -819,37 +819,6 @@ enum PACKED brw_reg_file {
BAD_FILE,
 };
 
-enum hw_reg_type {
-   BRW_HW_REG_TYPE_UD  = 0,
-   BRW_HW_REG_TYPE_D   = 1,
-   BRW_HW_REG_TYPE_UW  = 2,
-   BRW_HW_REG_TYPE_W   = 3,
-   BRW_HW_REG_TYPE_F   = 7,
-   GEN8_HW_REG_TYPE_UQ = 8,
-   GEN8_HW_REG_TYPE_Q  = 9,
-
-   BRW_HW_REG_TYPE_UB  = 4,
-   BRW_HW_REG_TYPE_B   = 5,
-   GEN7_HW_REG_TYPE_DF = 6,
-   GEN8_HW_REG_TYPE_HF = 10,
-};
-
-enum hw_imm_type {
-   BRW_HW_IMM_TYPE_UD  = 0,
-   BRW_HW_IMM_TYPE_D   = 1,
-   BRW_HW_IMM_TYPE_UW  = 2,
-   BRW_HW_IMM_TYPE_W   = 3,
-   BRW_HW_IMM_TYPE_F   = 7,
-   GEN8_HW_IMM_TYPE_UQ = 8,
-   GEN8_HW_IMM_TYPE_Q  = 9,
-
-   BRW_HW_IMM_TYPE_UV  = 4, /* Gen6+ packed unsigned immediate vector */
-   BRW_HW_IMM_TYPE_VF  = 5, /* packed float immediate vector */
-   BRW_HW_IMM_TYPE_V   = 6, /* packed int imm. vector; uword dest only */
-   GEN8_HW_IMM_TYPE_DF = 10,
-   GEN8_HW_IMM_TYPE_HF = 11,
-};
-
 /* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so
  * the types were implied. IVB adds BFE and BFI2 that operate on doublewords
  * and unsigned doublewords, so a new field is also available in the da3src
diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index b3e24b195c..fced942740 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -27,6 +27,37 @@
 
 #define INVALID (-1)
 
+enum hw_reg_type {
+   BRW_HW_REG_TYPE_UD  = 0,
+   BRW_HW_REG_TYPE_D   = 1,
+   BRW_HW_REG_TYPE_UW  = 2,
+   BRW_HW_REG_TYPE_W   = 3,
+   BRW_HW_REG_TYPE_F   = 7,
+   GEN8_HW_REG_TYPE_UQ = 8,
+   GEN8_HW_REG_TYPE_Q  = 9,
+
+   BRW_HW_REG_TYPE_UB  = 4,
+   BRW_HW_REG_TYPE_B   = 5,
+   GEN7_HW_REG_TYPE_DF = 6,
+   GEN8_HW_REG_TYPE_HF = 10,
+};
+
+enum hw_imm_type {
+   BRW_HW_IMM_TYPE_UD  = 0,
+   BRW_HW_IMM_TYPE_D   = 1,
+   BRW_HW_IMM_TYPE_UW  = 2,
+   BRW_HW_IMM_TYPE_W   = 3,
+   BRW_HW_IMM_TYPE_F   = 7,
+   GEN8_HW_IMM_TYPE_UQ = 8,
+   GEN8_HW_IMM_TYPE_Q  = 9,
+
+   BRW_HW_IMM_TYPE_UV  = 4,
+   BRW_HW_IMM_TYPE_VF  = 5,
+   BRW_HW_IMM_TYPE_V   = 6,
+   GEN8_HW_IMM_TYPE_DF = 10,
+   GEN8_HW_IMM_TYPE_HF = 11,
+};
+
 static const struct {
enum hw_reg_type reg_type;
enum hw_imm_type imm_type;
-- 
2.13.0
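
As a rough illustration of why the encodings can live in the .c file, the sketch below keeps a private table and exposes only two conversion functions. It is a simplified assumption-laden model, not the contents of brw_reg_type.c: the `ex_` names and the four-entry subset are hypothetical, and only the numeric encodings (UD = 0, D = 1, F = 7, immediate VF = 5) are taken from the enums in the diff above.

```c
#include <assert.h>

/* Hypothetical logical types; a small subset for the example. */
enum ex_type { EX_UD, EX_D, EX_F, EX_VF, EX_TYPE_COUNT };

#define EX_INVALID (-1)

/* Private table: one hardware encoding for register operands and one
 * for immediates. VF only exists as an immediate. */
static const struct {
   int reg_hw;
   int imm_hw;
} ex_table[EX_TYPE_COUNT] = {
   [EX_UD] = { 0, 0 },
   [EX_D]  = { 1, 1 },
   [EX_F]  = { 7, 7 },
   [EX_VF] = { EX_INVALID, 5 },
};

/* Logical type -> hardware encoding for the given operand kind. */
int
ex_type_to_hw(int is_imm, enum ex_type type)
{
   assert(type < EX_TYPE_COUNT);
   return is_imm ? ex_table[type].imm_hw : ex_table[type].reg_hw;
}

/* Hardware encoding -> logical type, or EX_INVALID if nothing matches. */
int
ex_hw_to_type(int is_imm, int hw)
{
   if (hw < 0)
      return EX_INVALID;
   for (int t = 0; t < EX_TYPE_COUNT; t++) {
      if (ex_type_to_hw(is_imm, (enum ex_type)t) == hw)
         return t;
   }
   return EX_INVALID;
}
```

Callers outside the file only ever see the logical enum, which is the separation the commit message asks for.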



[Mesa-dev] [PATCH 20/25] i965: Move brw_reg_type_letters() as well

2017-08-04 Thread Matt Turner
And add "to_" to the name for consistency with the other functions in
this file.
---
 src/intel/compiler/brw_eu.c   | 28 
 src/intel/compiler/brw_fs.cpp |  4 ++--
 src/intel/compiler/brw_reg.h  |  1 -
 src/intel/compiler/brw_reg_type.c | 30 ++
 src/intel/compiler/brw_reg_type.h |  3 +++
 src/intel/compiler/brw_vec4.cpp   |  4 ++--
 6 files changed, 37 insertions(+), 33 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index 700a1badd4..b0bdc38f4b 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -37,34 +37,6 @@
 
 #include "util/ralloc.h"
 
-/**
- * Converts a BRW_REGISTER_TYPE_* enum to a short string (F, UD, and so on).
- *
- * This is different than reg_encoding from brw_disasm.c in that it operates
- * on the abstract enum values, rather than the generation-specific encoding.
- */
-const char *
-brw_reg_type_letters(unsigned type)
-{
-   const char *names[] = {
-  [BRW_REGISTER_TYPE_UD] = "UD",
-  [BRW_REGISTER_TYPE_D]  = "D",
-  [BRW_REGISTER_TYPE_UW] = "UW",
-  [BRW_REGISTER_TYPE_W]  = "W",
-  [BRW_REGISTER_TYPE_F]  = "F",
-  [BRW_REGISTER_TYPE_UB] = "UB",
-  [BRW_REGISTER_TYPE_B]  = "B",
-  [BRW_REGISTER_TYPE_UV] = "UV",
-  [BRW_REGISTER_TYPE_V]  = "V",
-  [BRW_REGISTER_TYPE_VF] = "VF",
-  [BRW_REGISTER_TYPE_DF] = "DF",
-  [BRW_REGISTER_TYPE_HF] = "HF",
-  [BRW_REGISTER_TYPE_UQ] = "UQ",
-  [BRW_REGISTER_TYPE_Q]  = "Q",
-   };
-   return names[type];
-}
-
 /* Returns a conditional modifier that negates the condition. */
 enum brw_conditional_mod
 brw_negate_cmod(uint32_t cmod)
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 0ea4c4f1cc..b48dc4167e 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -5346,7 +5346,7 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
 
if (inst->dst.stride != 1)
   fprintf(file, "<%u>", inst->dst.stride);
-   fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type));
+   fprintf(file, ":%s, ", brw_reg_type_to_letters(inst->dst.type));
 
for (int i = 0; i < inst->sources; i++) {
   if (inst->src[i].negate)
@@ -5443,7 +5443,7 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
  if (stride != 1)
 fprintf(file, "<%u>", stride);
 
- fprintf(file, ":%s", brw_reg_type_letters(inst->src[i].type));
+ fprintf(file, ":%s", brw_reg_type_to_letters(inst->src[i].type));
   }
 
   if (i < inst->sources - 1 && inst->src[i + 1].file != BAD_FILE)
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 9be2b52831..441dfb2447 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -203,7 +203,6 @@ brw_mask_for_swizzle(unsigned swz)
return brw_apply_inv_swizzle_to_mask(swz, ~0);
 }
 
-const char *brw_reg_type_letters(unsigned brw_reg_type);
 uint32_t brw_swizzle_immediate(enum brw_reg_type type, uint32_t x, unsigned 
swz);
 
 #define REG_SIZE (8*4)
diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index 859bcac047..9b048f228d 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -124,3 +124,33 @@ brw_hw_reg_type_to_size(const struct gen_device_info 
*devinfo,
enum brw_reg_type type = brw_hw_type_to_reg_type(devinfo, file, hw_type);
return type_size[type];
 }
+
+/**
+ * Converts a BRW_REGISTER_TYPE_* enum to a short string (F, UD, and so on).
+ *
+ * This is different than reg_encoding from brw_disasm.c in that it operates
+ * on the abstract enum values, rather than the generation-specific encoding.
+ */
+const char *
+brw_reg_type_to_letters(enum brw_reg_type type)
+{
+   static const char letters[][3] = {
+  [BRW_REGISTER_TYPE_DF] = "DF",
+  [BRW_REGISTER_TYPE_F]  = "F",
+  [BRW_REGISTER_TYPE_HF] = "HF",
+  [BRW_REGISTER_TYPE_VF] = "VF",
+
+  [BRW_REGISTER_TYPE_Q]  = "Q",
+  [BRW_REGISTER_TYPE_UQ] = "UQ",
+  [BRW_REGISTER_TYPE_D]  = "D",
+  [BRW_REGISTER_TYPE_UD] = "UD",
+  [BRW_REGISTER_TYPE_W]  = "W",
+  [BRW_REGISTER_TYPE_UW] = "UW",
+  [BRW_REGISTER_TYPE_B]  = "B",
+  [BRW_REGISTER_TYPE_UB] = "UB",
+  [BRW_REGISTER_TYPE_V]  = "V",
+  [BRW_REGISTER_TYPE_UV] = "UV",
+   };
+   assert(type < ARRAY_SIZE(letters));
+   return letters[type];
+}
diff --git a/src/intel/compiler/brw_reg_type.h 
b/src/intel/compiler/brw_reg_type.h
index 743522b294..64f259d2a3 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -71,6 +71,9 @@ unsigned
 brw_hw_reg_type_to_size(const struct gen_device_info *devinfo,
 enum brw_reg_file file, unsigned hw_type);
 
+const char *
+brw_reg_type_to_letters(enum brw_reg_type type);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/intel/compiler/brw_vec4.cpp 
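
The new brw_reg_type_to_letters() stores its strings in a static two-dimensional char array rather than an array of pointers. A minimal sketch of that pattern, with a hypothetical `example_type` enum standing in for the real register type enum:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical logical types; not the real BRW_REGISTER_TYPE_* values. */
enum example_type {
   EXAMPLE_TYPE_F,
   EXAMPLE_TYPE_D,
   EXAMPLE_TYPE_UD,
   EXAMPLE_TYPE_HF,
};

const char *
example_type_to_letters(enum example_type type)
{
   /* Each row holds up to two letters plus the terminating NUL, so the
    * strings are stored inline and the table contains no pointers. */
   static const char letters[][3] = {
      [EXAMPLE_TYPE_F]  = "F",
      [EXAMPLE_TYPE_D]  = "D",
      [EXAMPLE_TYPE_UD] = "UD",
      [EXAMPLE_TYPE_HF] = "HF",
   };
   assert((size_t)type < sizeof(letters) / sizeof(letters[0]));
   return letters[type];
}
```

A plausible reason for this layout is that a pointer-free static table needs no relocations in a shared library and is not rebuilt on each call, unlike the non-static `const char *names[]` in the removed function; the patch itself does not state the motivation.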

[Mesa-dev] [PATCH 21/25] i965: Add brw_hw_reg_type_to_letters() and use it in brw_disasm.c

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_disasm.c   | 72 ++-
 src/intel/compiler/brw_reg_type.c |  8 +
 src/intel/compiler/brw_reg_type.h |  4 +++
 3 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 731e64a8ad..02b48c9cf2 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -237,21 +237,6 @@ static const char *const access_mode[2] = {
[1] = "align16",
 };
 
-static const char * const reg_encoding[] = {
-   [BRW_HW_REG_TYPE_UD]  = "UD",
-   [BRW_HW_REG_TYPE_D]   = "D",
-   [BRW_HW_REG_TYPE_UW]  = "UW",
-   [BRW_HW_REG_TYPE_W]   = "W",
-   [BRW_HW_REG_TYPE_F]   = "F",
-   [GEN8_HW_REG_TYPE_UQ] = "UQ",
-   [GEN8_HW_REG_TYPE_Q]  = "Q",
-
-   [BRW_HW_REG_TYPE_UB]  = "UB",
-   [BRW_HW_REG_TYPE_B]   = "B",
-   [GEN7_HW_REG_TYPE_DF] = "DF",
-   [GEN8_HW_REG_TYPE_HF] = "HF",
-};
-
 static const char *const three_source_reg_encoding[] = {
[BRW_3SRC_TYPE_F]  = "F",
[BRW_3SRC_TYPE_D]  = "D",
@@ -738,8 +723,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  err |= control(file, "horiz stride", horiz_stride,
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
- err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
+ string(file,
+brw_hw_reg_type_to_letters(devinfo,
+   brw_inst_dst_reg_file(devinfo, 
inst),
+   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
   } else {
  string(file, "g[a0");
  if (brw_inst_dst_ia_subreg_nr(devinfo, inst))
@@ -751,8 +738,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  err |= control(file, "horiz stride", horiz_stride,
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
- err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
+ string(file,
+brw_hw_reg_type_to_letters(devinfo,
+   brw_inst_dst_reg_file(devinfo, 
inst),
+   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
   }
} else {
   if (brw_inst_dst_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) {
@@ -765,8 +754,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  string(file, "<1>");
  err |= control(file, "writemask", writemask,
 brw_inst_da16_writemask(devinfo, inst), NULL);
- err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
+ string(file,
+brw_hw_reg_type_to_letters(devinfo,
+   brw_inst_dst_reg_file(devinfo, 
inst),
+   brw_inst_dst_reg_hw_type(devinfo, 
inst)));
   } else {
  err = 1;
  string(file, "Indirect align16 address mode not supported");
@@ -843,7 +834,7 @@ src_da1(FILE *file,
   format(file, ".%d", sub_reg_num / elem_size);   /* use formal style like 
spec */
}
src_align1_region(file, _vert_stride, _width, _horiz_stride);
-   err |= control(file, "src reg encoding", reg_encoding, type, NULL);
+   string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type));
return err;
 }
 
@@ -875,7 +866,7 @@ src_ia1(FILE *file,
   format(file, " %d", _addr_imm);
string(file, "]");
src_align1_region(file, _vert_stride, _width, _horiz_stride);
-   err |= control(file, "src reg encoding", reg_encoding, type, NULL);
+   string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type));
return err;
 }
 
@@ -938,7 +929,7 @@ src_da16(FILE *file,
err |= control(file, "vert stride", vert_stride, _vert_stride, NULL);
string(file, ">");
err |= src_swizzle(file, BRW_SWIZZLE4(swz_x, swz_y, swz_z, swz_w));
-   err |= control(file, "src da16 reg type", reg_encoding, _reg_type, NULL);
+   string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, _reg_type));
return err;
 }
 
@@ -1025,50 +1016,53 @@ src2_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
 }
 
 static int
-imm(FILE *file, const struct gen_device_info *devinfo, enum hw_imm_type type,
+imm(FILE *file, const struct gen_device_info *devinfo, enum brw_reg_type type,
 const brw_inst *inst)
 {
switch (type) {
-   case GEN8_HW_IMM_TYPE_UQ:
+   case BRW_REGISTER_TYPE_UQ:
   format(file, "0x%16lxUD", brw_inst_imm_uq(devinfo, inst));
   break;
-   case GEN8_HW_IMM_TYPE_Q:
+   case BRW_REGISTER_TYPE_Q:
   format(file, "%ldD", brw_inst_imm_uq(devinfo, inst));
   break;
-   case 

[Mesa-dev] [PATCH 19/25] i965: Switch to using the logical register types

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_eu_compact.c | 27 ---
 src/intel/compiler/brw_eu_emit.c| 13 +++--
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index 743ee9519c..7674aa8b85 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -995,10 +995,11 @@ precompact(const struct gen_device_info *devinfo, 
brw_inst inst)
!(devinfo->is_haswell &&
  brw_inst_opcode(devinfo, ) == BRW_OPCODE_DIM) &&
!(devinfo->gen >= 8 &&
- (brw_inst_src0_reg_hw_type(devinfo, ) == GEN8_HW_IMM_TYPE_DF ||
-  brw_inst_src0_reg_hw_type(devinfo, ) == GEN8_HW_IMM_TYPE_UQ ||
-  brw_inst_src0_reg_hw_type(devinfo, ) == GEN8_HW_IMM_TYPE_Q))) {
-  brw_inst_set_src1_reg_hw_type(devinfo, , BRW_HW_REG_TYPE_UD);
+ (brw_inst_src0_type(devinfo, ) == BRW_REGISTER_TYPE_DF ||
+  brw_inst_src0_type(devinfo, ) == BRW_REGISTER_TYPE_UQ ||
+  brw_inst_src0_type(devinfo, ) == BRW_REGISTER_TYPE_Q))) {
+  enum brw_reg_file file = brw_inst_src0_reg_file(devinfo, );
+  brw_inst_set_src1_file_type(devinfo, , file, BRW_REGISTER_TYPE_UD);
}
 
/* Compacted instructions only have 12-bits (plus 1 for the other 20)
@@ -1013,10 +1014,11 @@ precompact(const struct gen_device_info *devinfo, 
brw_inst inst)
 * If we see a 0.0:F, change the type to VF so that it can be compacted.
 */
if (brw_inst_imm_ud(devinfo, ) == 0x0 &&
-   brw_inst_src0_reg_hw_type(devinfo, ) == BRW_HW_REG_TYPE_F &&
-   brw_inst_dst_reg_hw_type(devinfo, ) == BRW_HW_REG_TYPE_F &&
+   brw_inst_src0_type(devinfo, ) == BRW_REGISTER_TYPE_F &&
+   brw_inst_dst_type(devinfo, ) == BRW_REGISTER_TYPE_F &&
brw_inst_dst_hstride(devinfo, ) == BRW_HORIZONTAL_STRIDE_1) {
-  brw_inst_set_src0_reg_hw_type(devinfo, , BRW_HW_IMM_TYPE_VF);
+  enum brw_reg_file file = brw_inst_src0_reg_file(devinfo, );
+  brw_inst_set_src0_file_type(devinfo, , file, BRW_REGISTER_TYPE_VF);
}
 
/* There are no mappings for dst:d | i:d, so if the immediate is suitable
@@ -1024,10 +1026,13 @@ precompact(const struct gen_device_info *devinfo, 
brw_inst inst)
 */
if (is_compactable_immediate(brw_inst_imm_ud(devinfo, )) &&
brw_inst_cond_modifier(devinfo, ) == BRW_CONDITIONAL_NONE &&
-   brw_inst_src0_reg_hw_type(devinfo, ) == BRW_HW_REG_TYPE_D &&
-   brw_inst_dst_reg_hw_type(devinfo, ) == BRW_HW_REG_TYPE_D) {
-  brw_inst_set_src0_reg_hw_type(devinfo, , BRW_HW_REG_TYPE_UD);
-  brw_inst_set_dst_reg_hw_type(devinfo, , BRW_HW_REG_TYPE_UD);
+   brw_inst_src0_type(devinfo, ) == BRW_REGISTER_TYPE_D &&
+   brw_inst_dst_type(devinfo, ) == BRW_REGISTER_TYPE_D) {
+  enum brw_reg_file src_file = brw_inst_src0_reg_file(devinfo, );
+  enum brw_reg_file dst_file = brw_inst_dst_reg_file(devinfo, );
+
+  brw_inst_set_src0_file_type(devinfo, , src_file, 
BRW_REGISTER_TYPE_UD);
+  brw_inst_set_dst_file_type(devinfo, , dst_file, 
BRW_REGISTER_TYPE_UD);
}
 
return inst;
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 064e4a0387..8c952e7da2 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -96,10 +96,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg dest)
 
gen7_convert_mrf_to_grf(p, );
 
-   brw_inst_set_dst_reg_file(devinfo, inst, dest.file);
-   brw_inst_set_dst_reg_hw_type(devinfo, inst,
-brw_reg_type_to_hw_type(devinfo, dest.file,
-dest.type));
+   brw_inst_set_dst_file_type(devinfo, inst, dest.file, dest.type);
brw_inst_set_dst_address_mode(devinfo, inst, dest.address_mode);
 
if (dest.address_mode == BRW_ADDRESS_DIRECT) {
@@ -263,9 +260,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg reg)
 
validate_reg(devinfo, inst, reg);
 
-   brw_inst_set_src0_reg_file(devinfo, inst, reg.file);
-   brw_inst_set_src0_reg_hw_type(devinfo, inst,
- brw_reg_type_to_hw_type(devinfo, reg.file, 
reg.type));
+   brw_inst_set_src0_file_type(devinfo, inst, reg.file, reg.type);
brw_inst_set_src0_abs(devinfo, inst, reg.abs);
brw_inst_set_src0_negate(devinfo, inst, reg.negate);
brw_inst_set_src0_address_mode(devinfo, inst, reg.address_mode);
@@ -370,9 +365,7 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg reg)
 
validate_reg(devinfo, inst, reg);
 
-   brw_inst_set_src1_reg_file(devinfo, inst, reg.file);
-   brw_inst_set_src1_reg_hw_type(devinfo, inst,
- brw_reg_type_to_hw_type(devinfo, reg.file, 
reg.type));
+   brw_inst_set_src1_file_type(devinfo, inst, reg.file, reg.type);
brw_inst_set_src1_abs(devinfo, inst, reg.abs);
brw_inst_set_src1_negate(devinfo, inst, reg.negate);
 
-- 
2.13.0


[Mesa-dev] [PATCH 17/25] i965: Rename brw_inst's functions that access the register type

2017-08-04 Thread Matt Turner
Put hw_ in the name so that it's clear these are the hardware encodings.
---
 src/intel/compiler/brw_disasm.c | 22 
 src/intel/compiler/brw_eu_compact.c | 22 
 src/intel/compiler/brw_eu_emit.c| 18 +++
 src/intel/compiler/brw_eu_validate.c| 28 +-
 src/intel/compiler/brw_inst.h   |  6 +--
 src/intel/compiler/brw_reg_type.h   |  8 +--
 src/intel/compiler/test_eu_validate.cpp | 94 -
 7 files changed, 99 insertions(+), 99 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index aafea693fc..731e64a8ad 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -739,7 +739,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
  err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_type(devinfo, inst), NULL);
+brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
   } else {
  string(file, "g[a0");
  if (brw_inst_dst_ia_subreg_nr(devinfo, inst))
@@ -752,7 +752,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
 brw_inst_dst_hstride(devinfo, inst), NULL);
  string(file, ">");
  err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_type(devinfo, inst), NULL);
+brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
   }
} else {
   if (brw_inst_dst_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) {
@@ -766,7 +766,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  err |= control(file, "writemask", writemask,
 brw_inst_da16_writemask(devinfo, inst), NULL);
  err |= control(file, "dest reg encoding", reg_encoding,
-brw_inst_dst_reg_type(devinfo, inst), NULL);
+brw_inst_dst_reg_hw_type(devinfo, inst), NULL);
   } else {
  err = 1;
  string(file, "Indirect align16 address mode not supported");
@@ -1077,13 +1077,13 @@ static int
 src0(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst)
 {
if (brw_inst_src0_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) {
-  return imm(file, devinfo, brw_inst_src0_reg_type(devinfo, inst), inst);
+  return imm(file, devinfo, brw_inst_src0_reg_hw_type(devinfo, inst), 
inst);
} else if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
   if (brw_inst_src0_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) {
  return src_da1(file,
 devinfo,
 brw_inst_opcode(devinfo, inst),
-brw_inst_src0_reg_type(devinfo, inst),
+brw_inst_src0_reg_hw_type(devinfo, inst),
 brw_inst_src0_reg_file(devinfo, inst),
 brw_inst_src0_vstride(devinfo, inst),
 brw_inst_src0_width(devinfo, inst),
@@ -1096,7 +1096,7 @@ src0(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  return src_ia1(file,
 devinfo,
 brw_inst_opcode(devinfo, inst),
-brw_inst_src0_reg_type(devinfo, inst),
+brw_inst_src0_reg_hw_type(devinfo, inst),
 brw_inst_src0_reg_file(devinfo, inst),
 brw_inst_src0_ia1_addr_imm(devinfo, inst),
 brw_inst_src0_ia_subreg_nr(devinfo, inst),
@@ -,7 +,7 @@ src0(FILE *file, const struct gen_device_info *devinfo, 
const brw_inst *inst)
  return src_da16(file,
  devinfo,
  brw_inst_opcode(devinfo, inst),
- brw_inst_src0_reg_type(devinfo, inst),
+ brw_inst_src0_reg_hw_type(devinfo, inst),
  brw_inst_src0_reg_file(devinfo, inst),
  brw_inst_src0_vstride(devinfo, inst),
  brw_inst_src0_da_reg_nr(devinfo, inst),
@@ -1133,13 +1133,13 @@ static int
 src1(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst)
 {
if (brw_inst_src1_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) {
-  return imm(file, devinfo, brw_inst_src1_reg_type(devinfo, inst), inst);
+  return imm(file, devinfo, brw_inst_src1_reg_hw_type(devinfo, inst), 
inst);
} else if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
   if (brw_inst_src1_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) {
  return src_da1(file,
 devinfo,
 brw_inst_opcode(devinfo, inst),
-

[Mesa-dev] [PATCH 18/25] i965: Add functions to abstract access to register types

2017-08-04 Thread Matt Turner
Previously the brw_inst{,_set}_{dst,src0,src1}_reg_type() functions
provided access to the hardware encodings for the register types. We
often mixed these with the logical BRW_REGISTER_TYPE_* enums (which
themselves used to be the hardware format!) with bad results.

With that functionality now available with the hw_ versions (see
previous commit), we now add functions that take the logical
BRW_REGISTER_TYPE_* enums and convert into the hardware format and vice
versa. To do the conversion we also have to provide the file.

Note the asymmetry between the two functions: the new getter reads the
file from the instruction word, and to ensure that is always set the
setter writes both the file and the type.
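
The asymmetry described above can be sketched in isolation. This is a minimal illustration, not the Mesa code: struct inst, enum reg_file, enum reg_type, to_hw(), from_hw() and every numeric encoding below are hypothetical stand-ins. The point is only that the getter can decode the type correctly because the setter always stores the file next to it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Mesa enums; all values are illustrative. */
enum reg_file { FILE_GRF, FILE_IMM };
enum reg_type { TYPE_D, TYPE_F, TYPE_B, TYPE_VF };

/* Toy instruction word holding just the two fields of interest. */
struct inst {
   uint8_t file;
   uint8_t hw_type;
};

/* Logical type -> encoding. B and VF share encoding 5 in different files. */
static unsigned
to_hw(enum reg_file file, enum reg_type type)
{
   switch (type) {
   case TYPE_D:  return 1;
   case TYPE_F:  return 7;
   case TYPE_B:  assert(file == FILE_GRF); return 5;
   case TYPE_VF: assert(file == FILE_IMM); return 5;
   }
   assert(!"unknown logical type");
   return 0;
}

/* Encoding -> logical type. Encoding 5 is ambiguous without the file. */
static enum reg_type
from_hw(enum reg_file file, unsigned hw)
{
   switch (hw) {
   case 1: return TYPE_D;
   case 7: return TYPE_F;
   case 5: return file == FILE_IMM ? TYPE_VF : TYPE_B;
   }
   assert(!"unknown encoding");
   return TYPE_D;
}

/* The setter writes both fields, so the getter below can always decode. */
static void
inst_set_file_type(struct inst *i, enum reg_file file, enum reg_type type)
{
   i->file = (uint8_t)file;
   i->hw_type = (uint8_t)to_hw(file, type);
}

/* The getter needs no file argument: it reads it from the instruction. */
static enum reg_type
inst_type(const struct inst *i)
{
   return from_hw((enum reg_file)i->file, i->hw_type);
}
```

Setting (FILE_IMM, TYPE_VF) and (FILE_GRF, TYPE_B) stores the same hw_type in this sketch, yet inst_type() returns the right logical type for each because the file was stored too.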
---
 src/intel/compiler/brw_inst.h   |  28 +
 src/intel/compiler/test_eu_validate.cpp | 102 
 2 files changed, 79 insertions(+), 51 deletions(-)

diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 4195150112..eacc0a024a 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -35,6 +35,7 @@
 #include 
 
 #include "brw_eu_defines.h"
+#include "brw_reg_type.h"
 #include "common/gen_device_info.h"
 
 #ifdef __cplusplus
@@ -652,6 +653,33 @@ brw_inst_set_imm_uq(const struct gen_device_info *devinfo,
 
 /** @} */
 
+#define REG_TYPE(reg) \
+static inline void\
+brw_inst_set_##reg##_file_type(const struct gen_device_info *devinfo, \
+   brw_inst *inst, enum brw_reg_file file,\
+   enum brw_reg_type type)\
+{ \
+   assert(file <= BRW_IMMEDIATE_VALUE);   \
+   unsigned hw_type = brw_reg_type_to_hw_type(devinfo, file, type);   \
+   brw_inst_set_##reg##_reg_file(devinfo, inst, file);\
+   brw_inst_set_##reg##_reg_hw_type(devinfo, inst, hw_type);  \
+} \
+  \
+static inline enum brw_reg_type   \
+brw_inst_##reg##_type(const struct gen_device_info *devinfo,  \
+  const brw_inst *inst)   \
+{ \
+   unsigned file = brw_inst_##reg##_reg_file(devinfo, inst);  \
+   unsigned hw_type = brw_inst_##reg##_reg_hw_type(devinfo, inst);\
+   return brw_hw_type_to_reg_type(devinfo, (enum brw_reg_file)file, hw_type); \
+}
+
+REG_TYPE(dst)
+REG_TYPE(src0)
+REG_TYPE(src1)
+#undef REG_TYPE
+
+
 /* The AddrImm fields are split into two discontiguous sections on Gen8+ */
 #define BRW_IA1_ADDR_IMM(reg, g4_high, g4_low, g8_nine, g8_high, g8_low) \
 static inline void   \
diff --git a/src/intel/compiler/test_eu_validate.cpp 
b/src/intel/compiler/test_eu_validate.cpp
index c368688829..46d2b83e34 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -208,19 +208,19 @@ TEST_P(validation_test, opcode46)
 TEST_P(validation_test, 
dest_stride_must_be_equal_to_the_ratio_of_exec_size_to_dest_size)
 {
brw_ADD(p, g0, g0, g0);
-   brw_inst_set_dst_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_W);
-   brw_inst_set_src0_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_D);
-   brw_inst_set_src1_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_D);
+   brw_inst_set_dst_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_W);
+   brw_inst_set_src0_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D);
+   brw_inst_set_src1_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D);
 
EXPECT_FALSE(validate(p));
 
clear_instructions(p);
 
brw_ADD(p, g0, g0, g0);
-   brw_inst_set_dst_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_W);
+   brw_inst_set_dst_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_W);
brw_inst_set_dst_hstride(&devinfo, last_inst, BRW_HORIZONTAL_STRIDE_2);
-   brw_inst_set_src0_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_D);
-   brw_inst_set_src1_reg_hw_type(&devinfo, last_inst, BRW_HW_REG_TYPE_D);
+   brw_inst_set_src0_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D);
+   brw_inst_set_src1_file_type(&devinfo, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D);
 
EXPECT_TRUE(validate(p));
 }
@@ -234,9 +234,9 @@ TEST_P(validation_test, 
dst_subreg_must_be_aligned_to_exec_type_size)
brw_ADD(p, g0, g0, g0);
brw_inst_set_dst_da1_subreg_nr(&devinfo, last_inst, 2);
brw_inst_set_dst_hstride(&devinfo, last_inst, BRW_HORIZONTAL_STRIDE_2);
-   

[Mesa-dev] [PATCH 15/25] i965: Add a brw_hw_type_to_reg_type() function

2017-08-04 Thread Matt Turner
Will be used in later commits.
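
A reverse lookup of this kind can be sketched as a linear scan over a table indexed by logical type. Everything below is a hypothetical miniature: the names enc_table and hw_to_logical() and all numeric encodings are assumptions made for illustration. Unlike the patch, which ends in unreachable(), this sketch returns -1 when nothing matches so that the failure case can be observed.

```c
#define INVALID (-1)

/* Hypothetical logical types; LT_COUNT is the number of entries. */
enum logical_type { LT_D, LT_F, LT_B, LT_VF, LT_COUNT };

/* One row per logical type: encoding as a register and as an immediate.
 * The numeric values are illustrative only.
 */
static const struct {
   int reg_enc;
   int imm_enc;
} enc_table[LT_COUNT] = {
   [LT_D]  = { 1,       1       },
   [LT_F]  = { 7,       7       },
   [LT_B]  = { 5,       INVALID },
   [LT_VF] = { INVALID, 5       },
};

/* Return the logical type whose encoding in the given file is hw,
 * or -1 when the table has no such entry.
 */
static int
hw_to_logical(int is_immediate, int hw)
{
   if (hw < 0)
      return -1;   /* never match the INVALID sentinel itself */

   for (int i = 0; i < LT_COUNT; i++) {
      int enc = is_immediate ? enc_table[i].imm_enc : enc_table[i].reg_enc;
      if (enc == hw)
         return i;
   }
   return -1;
}
```

The scan costs one pass over a table with a handful of rows, which keeps a single table as the source of truth for both directions of the conversion.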
---
 src/intel/compiler/brw_reg_type.c | 25 +
 src/intel/compiler/brw_reg_type.h |  4 
 2 files changed, 29 insertions(+)

diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index b0696570e5..8da93ae1cb 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -70,6 +70,31 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
 }
 
 /**
+ * Convert the hardware representation into a brw_reg_type enumeration value.
+ *
+ * The hardware encoding may depend on whether the value is an immediate.
+ */
+enum brw_reg_type
+brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
+enum brw_reg_file file, unsigned hw_type)
+{
+   if (file == BRW_IMMEDIATE_VALUE) {
+  for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
+ if (gen4_hw_type[i].imm_type == hw_type) {
+return i;
+ }
+  }
+   } else {
+  for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
+ if (gen4_hw_type[i].reg_type == hw_type) {
+return i;
+ }
+  }
+   }
+   unreachable("not reached");
+}
+
+/**
  * Return the element size given a hardware register type and file.
  *
  * The hardware encoding may depend on whether the value is an immediate.
diff --git a/src/intel/compiler/brw_reg_type.h 
b/src/intel/compiler/brw_reg_type.h
index 54262af1fc..f5c19c03f9 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -59,6 +59,10 @@ unsigned
 brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, enum brw_reg_type type);
 
+enum brw_reg_type
+brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
+enum brw_reg_file file, unsigned hw_type);
+
 #define brw_element_size(devinfo, inst, operand) \
brw_hw_reg_type_to_size(devinfo,  \
brw_inst_ ## operand ## _reg_file(devinfo, inst), \
-- 
2.13.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 16/25] i965: Index brw_hw_reg_type_to_size()'s table by logical type

2017-08-04 Thread Matt Turner
I'll be transitioning everything to use the logical types.
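
As a rough sketch of what indexing by logical type buys: one size table serves both register and immediate operands, because the file only matters while decoding the hardware value. The enum and function names below are hypothetical; the byte sizes are the ones listed in the table in this patch.

```c
/* Hypothetical subset of logical types; LT_COUNT is the number of entries. */
enum logical_type { LT_DF, LT_F, LT_HF, LT_D, LT_W, LT_B, LT_COUNT };

/* Element size in bytes for a logical type, or 0 if out of range. */
static unsigned
logical_type_size(enum logical_type type)
{
   static const unsigned size[LT_COUNT] = {
      [LT_DF] = 8,
      [LT_F]  = 4,
      [LT_HF] = 2,
      [LT_D]  = 4,
      [LT_W]  = 2,
      [LT_B]  = 1,
   };

   return (unsigned)type < LT_COUNT ? size[type] : 0;
}
```

The bounds check is an addition of this sketch; the patch instead relies on the preceding conversion having produced a valid logical type.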
---
 src/intel/compiler/brw_reg_type.c | 58 +--
 1 file changed, 19 insertions(+), 39 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index 8da93ae1cb..859bcac047 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -104,43 +104,23 @@ brw_hw_reg_type_to_size(const struct gen_device_info 
*devinfo,
 enum brw_reg_file file,
 unsigned hw_type)
 {
-   if (file == BRW_IMMEDIATE_VALUE) {
-  static const int hw_sizes[] = {
- [0 ... 15]= -1,
- [BRW_HW_IMM_TYPE_UD]  = 4,
- [BRW_HW_IMM_TYPE_D]   = 4,
- [BRW_HW_IMM_TYPE_UW]  = 2,
- [BRW_HW_IMM_TYPE_W]   = 2,
- [BRW_HW_IMM_TYPE_UV]  = 2,
- [BRW_HW_IMM_TYPE_VF]  = 4,
- [BRW_HW_IMM_TYPE_V]   = 2,
- [BRW_HW_IMM_TYPE_F]   = 4,
- [GEN8_HW_IMM_TYPE_UQ] = 8,
- [GEN8_HW_IMM_TYPE_Q]  = 8,
- [GEN8_HW_IMM_TYPE_DF] = 8,
- [GEN8_HW_IMM_TYPE_HF] = 2,
-  };
-  assert(hw_type < ARRAY_SIZE(hw_sizes));
-  assert(hw_sizes[hw_type] != -1);
-  return hw_sizes[hw_type];
-   } else {
-  /* Non-immediate registers */
-  static const int hw_sizes[] = {
- [0 ... 15]= -1,
- [BRW_HW_REG_TYPE_UD]  = 4,
- [BRW_HW_REG_TYPE_D]   = 4,
- [BRW_HW_REG_TYPE_UW]  = 2,
- [BRW_HW_REG_TYPE_W]   = 2,
- [BRW_HW_REG_TYPE_UB]  = 1,
- [BRW_HW_REG_TYPE_B]   = 1,
- [GEN7_HW_REG_TYPE_DF] = 8,
- [BRW_HW_REG_TYPE_F]   = 4,
- [GEN8_HW_REG_TYPE_UQ] = 8,
- [GEN8_HW_REG_TYPE_Q]  = 8,
- [GEN8_HW_REG_TYPE_HF] = 2,
-  };
-  assert(hw_type < ARRAY_SIZE(hw_sizes));
-  assert(hw_sizes[hw_type] != -1);
-  return hw_sizes[hw_type];
-   }
+   static const unsigned type_size[] = {
+  [BRW_REGISTER_TYPE_DF] = 8,
+  [BRW_REGISTER_TYPE_F]  = 4,
+  [BRW_REGISTER_TYPE_HF] = 2,
+  [BRW_REGISTER_TYPE_VF] = 4,
+
+  [BRW_REGISTER_TYPE_Q]  = 8,
+  [BRW_REGISTER_TYPE_UQ] = 8,
+  [BRW_REGISTER_TYPE_D]  = 4,
+  [BRW_REGISTER_TYPE_UD] = 4,
+  [BRW_REGISTER_TYPE_W]  = 2,
+  [BRW_REGISTER_TYPE_UW] = 2,
+  [BRW_REGISTER_TYPE_B]  = 1,
+  [BRW_REGISTER_TYPE_UB] = 1,
+  [BRW_REGISTER_TYPE_V]  = 2,
+  [BRW_REGISTER_TYPE_UV] = 2,
+   };
+   enum brw_reg_type type = brw_hw_type_to_reg_type(devinfo, file, hw_type);
+   return type_size[type];
 }
-- 
2.13.0



[Mesa-dev] [PATCH 13/25] i965: Extract functions dealing with register types to separate file

2017-08-04 Thread Matt Turner
I'm going to encapsulate all of the logic dealing with register types in
this file.

Rename the parameters for the hardware encodings from type -> hw_type at
the same time.
---
 src/intel/Makefile.sources|   2 +
 src/intel/compiler/brw_eu_emit.c  | 102 --
 src/intel/compiler/brw_reg.h  |  35 +--
 src/intel/compiler/brw_reg_type.c | 128 ++
 src/intel/compiler/brw_reg_type.h |  74 ++
 5 files changed, 205 insertions(+), 136 deletions(-)
 create mode 100644 src/intel/compiler/brw_reg_type.c
 create mode 100644 src/intel/compiler/brw_reg_type.h

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 5d8785832f..4074ba9ee5 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -81,6 +81,8 @@ COMPILER_FILES = \
compiler/brw_packed_float.c \
compiler/brw_predicated_break.cpp \
compiler/brw_reg.h \
+   compiler/brw_reg_type.c \
+   compiler/brw_reg_type.h \
compiler/brw_schedule_instructions.cpp \
compiler/brw_shader.cpp \
compiler/brw_shader.h \
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 12e9d332a1..133a28e1bf 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -84,108 +84,6 @@ gen7_convert_mrf_to_grf(struct brw_codegen *p, struct 
brw_reg *reg)
}
 }
 
-/**
- * Convert a brw_reg_type enumeration value into the hardware representation.
- *
- * The hardware encoding may depend on whether the value is an immediate.
- */
-unsigned
-brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
-enum brw_reg_file file,
-enum brw_reg_type type)
-{
-   if (file == BRW_IMMEDIATE_VALUE) {
-  static const enum hw_imm_type hw_types[] = {
- [0 ... BRW_REGISTER_TYPE_LAST] = -1,
- [BRW_REGISTER_TYPE_UD] = BRW_HW_IMM_TYPE_UD,
- [BRW_REGISTER_TYPE_D]  = BRW_HW_IMM_TYPE_D,
- [BRW_REGISTER_TYPE_UW] = BRW_HW_IMM_TYPE_UW,
- [BRW_REGISTER_TYPE_W]  = BRW_HW_IMM_TYPE_W,
- [BRW_REGISTER_TYPE_F]  = BRW_HW_IMM_TYPE_F,
- [BRW_REGISTER_TYPE_UV] = BRW_HW_IMM_TYPE_UV,
- [BRW_REGISTER_TYPE_VF] = BRW_HW_IMM_TYPE_VF,
- [BRW_REGISTER_TYPE_V]  = BRW_HW_IMM_TYPE_V,
- [BRW_REGISTER_TYPE_DF] = GEN8_HW_IMM_TYPE_DF,
- [BRW_REGISTER_TYPE_HF] = GEN8_HW_IMM_TYPE_HF,
- [BRW_REGISTER_TYPE_UQ] = GEN8_HW_IMM_TYPE_UQ,
- [BRW_REGISTER_TYPE_Q]  = GEN8_HW_IMM_TYPE_Q,
-  };
-  assert(type < ARRAY_SIZE(hw_types));
-  assert(hw_types[type] != -1);
-  return hw_types[type];
-   } else {
-  /* Non-immediate registers */
-  static const enum hw_reg_type hw_types[] = {
- [0 ... BRW_REGISTER_TYPE_LAST] = -1,
- [BRW_REGISTER_TYPE_UD] = BRW_HW_REG_TYPE_UD,
- [BRW_REGISTER_TYPE_D]  = BRW_HW_REG_TYPE_D,
- [BRW_REGISTER_TYPE_UW] = BRW_HW_REG_TYPE_UW,
- [BRW_REGISTER_TYPE_W]  = BRW_HW_REG_TYPE_W,
- [BRW_REGISTER_TYPE_UB] = BRW_HW_REG_TYPE_UB,
- [BRW_REGISTER_TYPE_B]  = BRW_HW_REG_TYPE_B,
- [BRW_REGISTER_TYPE_F]  = BRW_HW_REG_TYPE_F,
- [BRW_REGISTER_TYPE_DF] = GEN7_HW_REG_TYPE_DF,
- [BRW_REGISTER_TYPE_HF] = GEN8_HW_REG_TYPE_HF,
- [BRW_REGISTER_TYPE_UQ] = GEN8_HW_REG_TYPE_UQ,
- [BRW_REGISTER_TYPE_Q]  = GEN8_HW_REG_TYPE_Q,
-  };
-  assert(type < ARRAY_SIZE(hw_types));
-  assert(hw_types[type] != -1);
-  return hw_types[type];
-   }
-}
-
-/**
- * Return the element size given a hardware register type and file.
- *
- * The hardware encoding may depend on whether the value is an immediate.
- */
-unsigned
-brw_hw_reg_type_to_size(const struct gen_device_info *devinfo,
-enum brw_reg_file file,
-unsigned type)
-{
-   if (file == BRW_IMMEDIATE_VALUE) {
-  static const int hw_sizes[] = {
- [0 ... 15]= -1,
- [BRW_HW_IMM_TYPE_UD]  = 4,
- [BRW_HW_IMM_TYPE_D]   = 4,
- [BRW_HW_IMM_TYPE_UW]  = 2,
- [BRW_HW_IMM_TYPE_W]   = 2,
- [BRW_HW_IMM_TYPE_UV]  = 2,
- [BRW_HW_IMM_TYPE_VF]  = 4,
- [BRW_HW_IMM_TYPE_V]   = 2,
- [BRW_HW_IMM_TYPE_F]   = 4,
- [GEN8_HW_IMM_TYPE_UQ] = 8,
- [GEN8_HW_IMM_TYPE_Q]  = 8,
- [GEN8_HW_IMM_TYPE_DF] = 8,
- [GEN8_HW_IMM_TYPE_HF] = 2,
-  };
-  assert(type < ARRAY_SIZE(hw_sizes));
-  assert(hw_sizes[type] != -1);
-  return hw_sizes[type];
-   } else {
-  /* Non-immediate registers */
-  static const int hw_sizes[] = {
- [0 ... 15]= -1,
- [BRW_HW_REG_TYPE_UD]  = 4,
- [BRW_HW_REG_TYPE_D]   = 4,
- [BRW_HW_REG_TYPE_UW]  = 2,
- [BRW_HW_REG_TYPE_W]   = 2,
- [BRW_HW_REG_TYPE_UB]  = 1,
- [BRW_HW_REG_TYPE_B]   = 1,
- [GEN7_HW_REG_TYPE_DF] = 8,
- 

[Mesa-dev] [PATCH 14/25] i965: Use a common table to translate logical to hardware types

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_reg_type.c | 65 +--
 1 file changed, 29 insertions(+), 36 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index 8aac0ca009..b0696570e5 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -25,6 +25,29 @@
 #include "brw_eu_defines.h"
 #include "common/gen_device_info.h"
 
+#define INVALID (-1)
+
+static const struct {
+   enum hw_reg_type reg_type;
+   enum hw_imm_type imm_type;
+} gen4_hw_type[] = {
+   [BRW_REGISTER_TYPE_DF] = { GEN7_HW_REG_TYPE_DF, GEN8_HW_IMM_TYPE_DF },
+   [BRW_REGISTER_TYPE_F]  = { BRW_HW_REG_TYPE_F,   BRW_HW_IMM_TYPE_F   },
+   [BRW_REGISTER_TYPE_HF] = { GEN8_HW_REG_TYPE_HF, GEN8_HW_IMM_TYPE_HF },
+   [BRW_REGISTER_TYPE_VF] = { INVALID, BRW_HW_IMM_TYPE_VF  },
+
+   [BRW_REGISTER_TYPE_Q]  = { GEN8_HW_REG_TYPE_Q,  GEN8_HW_IMM_TYPE_Q  },
+   [BRW_REGISTER_TYPE_UQ] = { GEN8_HW_REG_TYPE_UQ, GEN8_HW_IMM_TYPE_UQ },
+   [BRW_REGISTER_TYPE_D]  = { BRW_HW_REG_TYPE_D,   BRW_HW_IMM_TYPE_D   },
+   [BRW_REGISTER_TYPE_UD] = { BRW_HW_REG_TYPE_UD,  BRW_HW_IMM_TYPE_UD  },
+   [BRW_REGISTER_TYPE_W]  = { BRW_HW_REG_TYPE_W,   BRW_HW_IMM_TYPE_W   },
+   [BRW_REGISTER_TYPE_UW] = { BRW_HW_REG_TYPE_UW,  BRW_HW_IMM_TYPE_UW  },
+   [BRW_REGISTER_TYPE_B]  = { BRW_HW_REG_TYPE_B,   INVALID },
+   [BRW_REGISTER_TYPE_UB] = { BRW_HW_REG_TYPE_UB,  INVALID },
+   [BRW_REGISTER_TYPE_V]  = { INVALID, BRW_HW_IMM_TYPE_V   },
+   [BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV  },
+};
+
 /**
  * Convert a brw_reg_type enumeration value into the hardware representation.
  *
@@ -35,44 +58,14 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
 enum brw_reg_file file,
 enum brw_reg_type type)
 {
+   assert(type < ARRAY_SIZE(gen4_hw_type));
+
if (file == BRW_IMMEDIATE_VALUE) {
-  static const enum hw_imm_type hw_types[] = {
- [0 ... BRW_REGISTER_TYPE_LAST] = -1,
- [BRW_REGISTER_TYPE_UD] = BRW_HW_IMM_TYPE_UD,
- [BRW_REGISTER_TYPE_D]  = BRW_HW_IMM_TYPE_D,
- [BRW_REGISTER_TYPE_UW] = BRW_HW_IMM_TYPE_UW,
- [BRW_REGISTER_TYPE_W]  = BRW_HW_IMM_TYPE_W,
- [BRW_REGISTER_TYPE_F]  = BRW_HW_IMM_TYPE_F,
- [BRW_REGISTER_TYPE_UV] = BRW_HW_IMM_TYPE_UV,
- [BRW_REGISTER_TYPE_VF] = BRW_HW_IMM_TYPE_VF,
- [BRW_REGISTER_TYPE_V]  = BRW_HW_IMM_TYPE_V,
- [BRW_REGISTER_TYPE_DF] = GEN8_HW_IMM_TYPE_DF,
- [BRW_REGISTER_TYPE_HF] = GEN8_HW_IMM_TYPE_HF,
- [BRW_REGISTER_TYPE_UQ] = GEN8_HW_IMM_TYPE_UQ,
- [BRW_REGISTER_TYPE_Q]  = GEN8_HW_IMM_TYPE_Q,
-  };
-  assert(type < ARRAY_SIZE(hw_types));
-  assert(hw_types[type] != -1);
-  return hw_types[type];
+  assert(gen4_hw_type[type].imm_type != INVALID);
+  return gen4_hw_type[type].imm_type;
} else {
-  /* Non-immediate registers */
-  static const enum hw_reg_type hw_types[] = {
- [0 ... BRW_REGISTER_TYPE_LAST] = -1,
- [BRW_REGISTER_TYPE_UD] = BRW_HW_REG_TYPE_UD,
- [BRW_REGISTER_TYPE_D]  = BRW_HW_REG_TYPE_D,
- [BRW_REGISTER_TYPE_UW] = BRW_HW_REG_TYPE_UW,
- [BRW_REGISTER_TYPE_W]  = BRW_HW_REG_TYPE_W,
- [BRW_REGISTER_TYPE_UB] = BRW_HW_REG_TYPE_UB,
- [BRW_REGISTER_TYPE_B]  = BRW_HW_REG_TYPE_B,
- [BRW_REGISTER_TYPE_F]  = BRW_HW_REG_TYPE_F,
- [BRW_REGISTER_TYPE_DF] = GEN7_HW_REG_TYPE_DF,
- [BRW_REGISTER_TYPE_HF] = GEN8_HW_REG_TYPE_HF,
- [BRW_REGISTER_TYPE_UQ] = GEN8_HW_REG_TYPE_UQ,
- [BRW_REGISTER_TYPE_Q]  = GEN8_HW_REG_TYPE_Q,
-  };
-  assert(type < ARRAY_SIZE(hw_types));
-  assert(hw_types[type] != -1);
-  return hw_types[type];
+  assert(gen4_hw_type[type].reg_type != INVALID);
+  return gen4_hw_type[type].reg_type;
}
 }
 
-- 
2.13.0



[Mesa-dev] [PATCH 12/25] i965: Reverse file/type arguments to register type functions

2017-08-04 Thread Matt Turner
I think of the initial arguments as "state" and the last as the actual
subject.
---
 src/intel/compiler/brw_disasm.c  |  4 ++--
 src/intel/compiler/brw_eu_emit.c | 14 --
 src/intel/compiler/brw_eu_validate.c |  2 +-
 src/intel/compiler/brw_reg.h |  8 
 4 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 6da7060517..aafea693fc 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -839,7 +839,7 @@ src_da1(FILE *file,
if (err == -1)
   return 0;
if (sub_reg_num) {
-  unsigned elem_size = brw_hw_reg_type_to_size(devinfo, type, _reg_file);
+  unsigned elem_size = brw_hw_reg_type_to_size(devinfo, _reg_file, type);
   format(file, ".%d", sub_reg_num / elem_size);   /* use formal style like 
spec */
}
src_align1_region(file, _vert_stride, _width, _horiz_stride);
@@ -928,7 +928,7 @@ src_da16(FILE *file,
   return 0;
if (_subreg_nr) {
   unsigned elem_size =
- brw_hw_reg_type_to_size(devinfo, _reg_type, _reg_file);
+ brw_hw_reg_type_to_size(devinfo, _reg_file, _reg_type);
 
   /* bit4 for subreg number byte addressing. Make this same meaning as
  in da1 case, so output looks consistent. */
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 69ebf6345c..12e9d332a1 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -91,7 +91,8 @@ gen7_convert_mrf_to_grf(struct brw_codegen *p, struct brw_reg 
*reg)
  */
 unsigned
 brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
-enum brw_reg_type type, enum brw_reg_file file)
+enum brw_reg_file file,
+enum brw_reg_type type)
 {
if (file == BRW_IMMEDIATE_VALUE) {
   static const enum hw_imm_type hw_types[] = {
@@ -141,7 +142,8 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
  */
 unsigned
 brw_hw_reg_type_to_size(const struct gen_device_info *devinfo,
-unsigned type, enum brw_reg_file file)
+enum brw_reg_file file,
+unsigned type)
 {
if (file == BRW_IMMEDIATE_VALUE) {
   static const int hw_sizes[] = {
@@ -198,8 +200,8 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg dest)
 
brw_inst_set_dst_reg_file(devinfo, inst, dest.file);
brw_inst_set_dst_reg_type(devinfo, inst,
- brw_reg_type_to_hw_type(devinfo, dest.type,
- dest.file));
+ brw_reg_type_to_hw_type(devinfo, dest.file,
+ dest.type));
brw_inst_set_dst_address_mode(devinfo, inst, dest.address_mode);
 
if (dest.address_mode == BRW_ADDRESS_DIRECT) {
@@ -365,7 +367,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg reg)
 
brw_inst_set_src0_reg_file(devinfo, inst, reg.file);
brw_inst_set_src0_reg_type(devinfo, inst,
-  brw_reg_type_to_hw_type(devinfo, reg.type, 
reg.file));
+  brw_reg_type_to_hw_type(devinfo, reg.file, 
reg.type));
brw_inst_set_src0_abs(devinfo, inst, reg.abs);
brw_inst_set_src0_negate(devinfo, inst, reg.negate);
brw_inst_set_src0_address_mode(devinfo, inst, reg.address_mode);
@@ -472,7 +474,7 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg reg)
 
brw_inst_set_src1_reg_file(devinfo, inst, reg.file);
brw_inst_set_src1_reg_type(devinfo, inst,
-  brw_reg_type_to_hw_type(devinfo, reg.type, 
reg.file));
+  brw_reg_type_to_hw_type(devinfo, reg.file, 
reg.type));
brw_inst_set_src1_abs(devinfo, inst, reg.abs);
brw_inst_set_src1_negate(devinfo, inst, reg.negate);
 
diff --git a/src/intel/compiler/brw_eu_validate.c 
b/src/intel/compiler/brw_eu_validate.c
index 54e0a2e62e..cacf962904 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -459,7 +459,7 @@ general_restrictions_based_on_operand_types(const struct 
gen_device_info *devinf
 
unsigned exec_type = execution_type(devinfo, inst);
unsigned exec_type_size =
-  brw_hw_reg_type_to_size(devinfo, exec_type, BRW_GENERAL_REGISTER_FILE);
+  brw_hw_reg_type_to_size(devinfo, BRW_GENERAL_REGISTER_FILE, exec_type);
unsigned dst_type_size = brw_element_size(devinfo, inst, dst);
 
/* On IVB/BYT, region parameters and execution size for DF are in terms of
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index bd179606b0..db932cfeee 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -227,14 +227,14 @@ enum PACKED brw_reg_type {
 };
 
 unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
- 

[Mesa-dev] [PATCH 11/25] i965: Add support for disassembling 64-bit integer immediates

2017-08-04 Thread Matt Turner
After the last patch converted things into enums, I helpfully got a
compiler warning about these missing from the switch statement.
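
For reference, a width-independent way to format 64-bit values is the <inttypes.h> macro family. The sketch below is standalone and does not use the disassembler's format() helper; format_uq() and format_q() are hypothetical names. The UQ and Q suffixes are an assumption chosen to match the type names; the patch as posted prints UD and D.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format an unsigned 64-bit immediate as 16 zero-padded hex digits. */
static int
format_uq(char *buf, size_t size, uint64_t value)
{
   return snprintf(buf, size, "0x%016" PRIx64 "UQ", value);
}

/* Format a signed 64-bit immediate in decimal. */
static int
format_q(char *buf, size_t size, int64_t value)
{
   return snprintf(buf, size, "%" PRId64 "Q", value);
}
```

PRIx64 and PRId64 expand to the right length modifier on both LP64 and ILP32 targets, where a bare %lx or %ld would only be correct if long is 64 bits wide.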
---
 src/intel/compiler/brw_disasm.c | 6 ++
 src/intel/compiler/brw_inst.h   | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index b5c283058a..6da7060517 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -1029,6 +1029,12 @@ imm(FILE *file, const struct gen_device_info *devinfo, 
enum hw_imm_type type,
 const brw_inst *inst)
 {
switch (type) {
+   case GEN8_HW_IMM_TYPE_UQ:
+  format(file, "0x%16lxUD", brw_inst_imm_uq(devinfo, inst));
+  break;
+   case GEN8_HW_IMM_TYPE_Q:
+  format(file, "%ldD", brw_inst_imm_uq(devinfo, inst));
+  break;
case BRW_HW_IMM_TYPE_UD:
   format(file, "0x%08xUD", brw_inst_imm_ud(devinfo, inst));
   break;
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 5b2ce32ae4..cd3b0e95ea 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -569,6 +569,13 @@ brw_inst_imm_ud(const struct gen_device_info *devinfo, 
const brw_inst *insn)
return brw_inst_bits(insn, 127, 96);
 }
 
+static inline uint64_t
+brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
+{
+   assert(devinfo->gen >= 8);
+   return brw_inst_bits(insn, 127, 64);
+}
+
 static inline float
 brw_inst_imm_f(const struct gen_device_info *devinfo, const brw_inst *insn)
 {
-- 
2.13.0



[Mesa-dev] [PATCH 07/25] i965: Don't let raw-move check be tricked by immediate vector types

2017-08-04 Thread Matt Turner
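The UB and B register type encodings are the same as the UV and VF
immediate encodings, so an immediate vector source could be mistaken for
a byte-typed one. Noticed when writing the following patch.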
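
The shape of the fix can be sketched without the rest of the validator: look at the file first, and only then interpret the type bits. The struct, the function name and the numeric encodings below are hypothetical placeholders; only the control flow mirrors the patch.

```c
#include <stdbool.h>

/* Placeholder encodings: the same number means different things
 * depending on whether the source is an immediate.
 */
enum {
   ENC_UB_OR_UV = 4,   /* UB as a register, UV as an immediate */
   ENC_B_OR_VF  = 5,   /* B as a register, VF as an immediate */
   ENC_IMM_V    = 6,
   ENC_F        = 7,
};

/* Hypothetical view of the source operand fields that matter here. */
struct src_operand {
   bool is_immediate;
   unsigned hw_type;
   bool negate;
   bool abs;
};

/* True when the source operand does not rule out a raw move. */
static bool
src_allows_raw_move(const struct src_operand *src)
{
   if (src->is_immediate) {
      /* Vector immediates are expanded by the hardware, so the bits that
       * are moved are not the bits stored in the instruction.
       */
      return src->hw_type != ENC_UB_OR_UV &&
             src->hw_type != ENC_B_OR_VF &&
             src->hw_type != ENC_IMM_V;
   }

   /* Source modifiers only exist for register sources. */
   return !src->negate && !src->abs;
}
```

With these placeholder values, a register source of type B and an immediate of type VF carry the same hw_type, and only the file check tells them apart.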
---
 src/intel/compiler/brw_eu_validate.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_eu_validate.c 
b/src/intel/compiler/brw_eu_validate.c
index e089c1f90f..827cd707c7 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -96,10 +96,17 @@ inst_is_raw_move(const struct gen_device_info *devinfo, 
const brw_inst *inst)
unsigned dst_type = signed_type(brw_inst_dst_reg_type(devinfo, inst));
unsigned src_type = signed_type(brw_inst_src0_reg_type(devinfo, inst));
 
-   if (brw_inst_src0_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE &&
-   (brw_inst_src0_negate(devinfo, inst) ||
-brw_inst_src0_abs(devinfo, inst)))
+   if (brw_inst_src0_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) {
+  /* FIXME: not strictly true */
+  if (brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_VF ||
+  brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_UV ||
+  brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_V) {
+ return false;
+  }
+   } else if (brw_inst_src0_negate(devinfo, inst) ||
+  brw_inst_src0_abs(devinfo, inst)) {
   return false;
+   }
 
return brw_inst_opcode(devinfo, inst) == BRW_OPCODE_MOV &&
   brw_inst_saturate(devinfo, inst) == 0 &&
-- 
2.13.0



[Mesa-dev] [PATCH 10/25] i965: Use separate enums for register vs immediate types

2017-08-04 Thread Matt Turner
The hardware encodings often mean different things depending on whether
the source is an immediate.
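
A small sketch of why two enums help, using hypothetical names and placeholder values: once each decoder takes its own enum type, a switch over that enum lets the compiler report missing cases (for example with GCC's or Clang's -Wswitch), and the overlapping numbers can no longer be confused by accident.

```c
/* Placeholder encodings; 4 and 5 deliberately overlap between the enums. */
enum hw_reg_type { HW_REG_UB = 4, HW_REG_B = 5, HW_REG_F = 7 };
enum hw_imm_type { HW_IMM_UV = 4, HW_IMM_VF = 5, HW_IMM_F = 7 };

/* Name of a register-file type encoding. */
static const char *
reg_type_name(enum hw_reg_type type)
{
   switch (type) {
   case HW_REG_UB: return "UB";
   case HW_REG_B:  return "B";
   case HW_REG_F:  return "F";
   }
   return "?";
}

/* Name of an immediate type encoding. */
static const char *
imm_type_name(enum hw_imm_type type)
{
   switch (type) {
   case HW_IMM_UV: return "UV";
   case HW_IMM_VF: return "VF";
   case HW_IMM_F:  return "F";
   }
   return "?";
}
```

Neither switch has a default label on purpose: adding a new enumerator without a matching case then produces a warning instead of silently falling through to "?".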
---
 src/intel/compiler/brw_disasm.c  |  46 ---
 src/intel/compiler/brw_eu_compact.c  |   8 +--
 src/intel/compiler/brw_eu_defines.h  |  48 +--
 src/intel/compiler/brw_eu_emit.c | 109 +--
 src/intel/compiler/brw_eu_validate.c |  60 +--
 src/intel/compiler/brw_reg.h |   2 +
 6 files changed, 144 insertions(+), 129 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 3a33614523..b5c283058a 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -238,17 +238,18 @@ static const char *const access_mode[2] = {
 };
 
 static const char * const reg_encoding[] = {
-   [BRW_HW_REG_TYPE_UD]  = "UD",
-   [BRW_HW_REG_TYPE_D]   = "D",
-   [BRW_HW_REG_TYPE_UW]  = "UW",
-   [BRW_HW_REG_TYPE_W]   = "W",
-   [BRW_HW_REG_NON_IMM_TYPE_UB]  = "UB",
-   [BRW_HW_REG_NON_IMM_TYPE_B]   = "B",
-   [GEN7_HW_REG_NON_IMM_TYPE_DF] = "DF",
-   [BRW_HW_REG_TYPE_F]   = "F",
-   [GEN8_HW_REG_TYPE_UQ] = "UQ",
-   [GEN8_HW_REG_TYPE_Q]  = "Q",
-   [GEN8_HW_REG_NON_IMM_TYPE_HF] = "HF",
+   [BRW_HW_REG_TYPE_UD]  = "UD",
+   [BRW_HW_REG_TYPE_D]   = "D",
+   [BRW_HW_REG_TYPE_UW]  = "UW",
+   [BRW_HW_REG_TYPE_W]   = "W",
+   [BRW_HW_REG_TYPE_F]   = "F",
+   [GEN8_HW_REG_TYPE_UQ] = "UQ",
+   [GEN8_HW_REG_TYPE_Q]  = "Q",
+
+   [BRW_HW_REG_TYPE_UB]  = "UB",
+   [BRW_HW_REG_TYPE_B]   = "B",
+   [GEN7_HW_REG_TYPE_DF] = "DF",
+   [GEN8_HW_REG_TYPE_HF] = "HF",
 };
 
 static const char *const three_source_reg_encoding[] = {
@@ -1024,41 +1025,42 @@ src2_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
 }
 
 static int
-imm(FILE *file, const struct gen_device_info *devinfo, unsigned type, const 
brw_inst *inst)
+imm(FILE *file, const struct gen_device_info *devinfo, enum hw_imm_type type,
+const brw_inst *inst)
 {
switch (type) {
-   case BRW_HW_REG_TYPE_UD:
+   case BRW_HW_IMM_TYPE_UD:
   format(file, "0x%08xUD", brw_inst_imm_ud(devinfo, inst));
   break;
-   case BRW_HW_REG_TYPE_D:
+   case BRW_HW_IMM_TYPE_D:
   format(file, "%dD", brw_inst_imm_d(devinfo, inst));
   break;
-   case BRW_HW_REG_TYPE_UW:
+   case BRW_HW_IMM_TYPE_UW:
   format(file, "0x%04xUW", (uint16_t) brw_inst_imm_ud(devinfo, inst));
   break;
-   case BRW_HW_REG_TYPE_W:
+   case BRW_HW_IMM_TYPE_W:
   format(file, "%dW", (int16_t) brw_inst_imm_d(devinfo, inst));
   break;
-   case BRW_HW_REG_IMM_TYPE_UV:
+   case BRW_HW_IMM_TYPE_UV:
   format(file, "0x%08xUV", brw_inst_imm_ud(devinfo, inst));
   break;
-   case BRW_HW_REG_IMM_TYPE_VF:
+   case BRW_HW_IMM_TYPE_VF:
   format(file, "[%-gF, %-gF, %-gF, %-gF]VF",
  brw_vf_to_float(brw_inst_imm_ud(devinfo, inst)),
  brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 8),
  brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 16),
  brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 24));
   break;
-   case BRW_HW_REG_IMM_TYPE_V:
+   case BRW_HW_IMM_TYPE_V:
   format(file, "0x%08xV", brw_inst_imm_ud(devinfo, inst));
   break;
-   case BRW_HW_REG_TYPE_F:
+   case BRW_HW_IMM_TYPE_F:
   format(file, "%-gF", brw_inst_imm_f(devinfo, inst));
   break;
-   case GEN8_HW_REG_IMM_TYPE_DF:
+   case GEN8_HW_IMM_TYPE_DF:
   format(file, "%-gDF", brw_inst_imm_df(devinfo, inst));
   break;
-   case GEN8_HW_REG_IMM_TYPE_HF:
+   case GEN8_HW_IMM_TYPE_HF:
   string(file, "Half Float IMM");
   break;
}
diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index 79103d7883..bca526f592 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -995,9 +995,9 @@ precompact(const struct gen_device_info *devinfo, brw_inst 
inst)
!(devinfo->is_haswell &&
  brw_inst_opcode(devinfo, &inst) == BRW_OPCODE_DIM) &&
!(devinfo->gen >= 8 &&
- (brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_IMM_TYPE_DF ||
-  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_TYPE_UQ ||
-  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_TYPE_Q))) {
+ (brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_DF ||
+  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_UQ ||
+  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_Q))) {
   brw_inst_set_src1_reg_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
}
 
@@ -1016,7 +1016,7 @@ precompact(const struct gen_device_info *devinfo, 
brw_inst inst)
brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
brw_inst_dst_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
brw_inst_dst_hstride(devinfo, &inst) == BRW_HORIZONTAL_STRIDE_1) {
-  brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_IMM_TYPE_VF);
+  

[Mesa-dev] [PATCH 09/25] i965: Reorder brw_reg_type enum values

2017-08-04 Thread Matt Turner
These vaguely corresponded to the hardware encodings, but that is purely
historical at this point. Reorder them so we stop making things "almost
work" when mixing enums.

The ordering has been chosen so that no enum value is the same as a
compatible hardware encoding.
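
The property claimed above can be checked mechanically. The sketch below counts table rows whose index (the logical enum value) equals one of its own hardware encodings. The two tables are hypothetical five-entry miniatures with placeholder encodings, not the real Mesa values.

```c
/* Register and immediate encoding for one logical type. */
struct encoding {
   int reg_enc;
   int imm_enc;
};

/* Count rows where the logical value (the index) equals an encoding. */
static int
count_collisions(const struct encoding *table, int count)
{
   int collisions = 0;

   for (int i = 0; i < count; i++) {
      if (table[i].reg_enc == i || table[i].imm_enc == i)
         collisions++;
   }
   return collisions;
}

/* Old-style order (UD, D, UW, W, F): mixing the enums "almost works". */
static const struct encoding old_order[] = {
   { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 7, 7 },
};

/* Reordered (DF, F, HF, D, UD): any mix-up yields a different value. */
static const struct encoding new_order[] = {
   { 6, 10 }, { 7, 7 }, { 10, 11 }, { 1, 1 }, { 0, 0 },
};
```

With these placeholder tables, the old order has four colliding rows and the reordered one has none, so passing a logical value where a hardware encoding is expected changes the result immediately rather than only for some types.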
---
 src/intel/compiler/brw_eu.c  |  1 -
 src/intel/compiler/brw_eu_emit.c |  6 --
 src/intel/compiler/brw_fs.cpp|  1 +
 src/intel/compiler/brw_reg.h | 32 ++--
 src/intel/compiler/brw_vec4.cpp  |  3 ++-
 5 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index 0ef52e219c..700a1badd4 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -62,7 +62,6 @@ brw_reg_type_letters(unsigned type)
   [BRW_REGISTER_TYPE_UQ] = "UQ",
   [BRW_REGISTER_TYPE_Q]  = "Q",
};
-   assert(type <= BRW_REGISTER_TYPE_Q);
return names[type];
 }
 
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 6673e0741a..b59fc33a54 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -112,7 +112,6 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
   };
   assert(type < ARRAY_SIZE(imm_hw_types));
   assert(imm_hw_types[type] != -1);
-  assert(devinfo->gen >= 8 || type < BRW_REGISTER_TYPE_DF);
   return imm_hw_types[type];
} else {
   /* Non-immediate registers */
@@ -134,8 +133,6 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
   };
   assert(type < ARRAY_SIZE(hw_types));
   assert(hw_types[type] != -1);
-  assert(devinfo->gen >= 7 || type < BRW_REGISTER_TYPE_DF);
-  assert(devinfo->gen >= 8 || type < BRW_REGISTER_TYPE_Q);
   return hw_types[type];
}
 }
@@ -184,9 +181,6 @@ brw_hw_reg_type_to_size(const struct gen_device_info 
*devinfo,
  [GEN8_HW_REG_NON_IMM_TYPE_HF] = 2,
   };
   assert(type < ARRAY_SIZE(hw_sizes));
-  assert(devinfo->gen >= 7 ||
- (type < GEN7_HW_REG_NON_IMM_TYPE_DF || type == 
BRW_HW_REG_TYPE_F));
-  assert(devinfo->gen >= 8 || type <= BRW_HW_REG_TYPE_F);
   return hw_sizes[type];
}
 }
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index fdc30d450c..0ea4c4f1cc 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -403,6 +403,7 @@ void
 fs_reg::init()
 {
memset(this, 0, sizeof(*this));
+   type = BRW_REGISTER_TYPE_UD;
stride = 1;
 }
 
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 17a51fbd65..48e6fd7b7d 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -203,29 +203,25 @@ brw_mask_for_swizzle(unsigned swz)
 }
 
 enum PACKED brw_reg_type {
-   BRW_REGISTER_TYPE_UD = 0,
-   BRW_REGISTER_TYPE_D,
-   BRW_REGISTER_TYPE_UW,
-   BRW_REGISTER_TYPE_W,
+   /** Floating-point types: @{ */
+   BRW_REGISTER_TYPE_DF,
BRW_REGISTER_TYPE_F,
-
-   /** Non-immediates only: @{ */
-   BRW_REGISTER_TYPE_UB,
-   BRW_REGISTER_TYPE_B,
-   /** @} */
-
-   /** Immediates only: @{ */
-   BRW_REGISTER_TYPE_UV, /* Gen6+ */
-   BRW_REGISTER_TYPE_V,
+   BRW_REGISTER_TYPE_HF,
BRW_REGISTER_TYPE_VF,
/** @} */
 
-   BRW_REGISTER_TYPE_DF, /* Gen7+ (no immediates until Gen8+) */
-
-   /* Gen8+ */
-   BRW_REGISTER_TYPE_HF,
-   BRW_REGISTER_TYPE_UQ,
+   /** Integer types: @{ */
BRW_REGISTER_TYPE_Q,
+   BRW_REGISTER_TYPE_UQ,
+   BRW_REGISTER_TYPE_D,
+   BRW_REGISTER_TYPE_UD,
+   BRW_REGISTER_TYPE_W,
+   BRW_REGISTER_TYPE_UW,
+   BRW_REGISTER_TYPE_B,
+   BRW_REGISTER_TYPE_UB,
+   BRW_REGISTER_TYPE_V,
+   BRW_REGISTER_TYPE_UV,
+   /** @} */
 };
 
 unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 410922c62b..bf9a271900 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -42,8 +42,8 @@ void
 src_reg::init()
 {
memset(this, 0, sizeof(*this));
-
this->file = BAD_FILE;
+   this->type = BRW_REGISTER_TYPE_UD;
 }
 
 src_reg::src_reg(enum brw_reg_file file, int nr, const glsl_type *type)
@@ -85,6 +85,7 @@ dst_reg::init()
 {
memset(this, 0, sizeof(*this));
this->file = BAD_FILE;
+   this->type = BRW_REGISTER_TYPE_UD;
this->writemask = WRITEMASK_XYZW;
 }
 
-- 
2.13.0



[Mesa-dev] [PATCH 06/25] i965: Only change type of 0.0f to VF if destination stride == 1

2017-08-04 Thread Matt Turner
The destination stride must be equivalent to a dword if VF is used.

Also, since the only compaction table entries with "i:vf" have the
destination as "r:f", specifically check that the destination is of type
float.
---
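
For illustration only, a standalone sketch of the condition described
above, assuming IEEE 754 single-precision floats. The function names
float_bits and can_retype_zero_as_vf are hypothetical and are not part of
the patch; the destination stride is passed in elements rather than as a
hardware encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float without aliasing violations. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* Hypothetical helper: an F-typed immediate may be retyped to VF only if
 * its bits are all zero (i.e. +0.0f), the destination is float, and the
 * destination stride in bytes is a dword (4 bytes * stride of 1 element).
 */
static bool
can_retype_zero_as_vf(uint32_t imm_bits, bool src0_is_f, bool dst_is_f,
                      unsigned dst_hstride_elems)
{
   const unsigned float_size = 4; /* bytes in an F element */

   return imm_bits == 0 &&
          src0_is_f &&
          dst_is_f &&
          float_size * dst_hstride_elems == 4;
}
```

Note that -0.0f has the sign bit set, so it does not pass the all-zero
check and would keep its F type under this sketch.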
 src/intel/compiler/brw_eu_compact.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index bf57ddf85c..79103d7883 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -1014,7 +1014,8 @@ precompact(const struct gen_device_info *devinfo, 
brw_inst inst)
 */
if (brw_inst_imm_ud(devinfo, &inst) == 0x0 &&
brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
-   brw_inst_dst_reg_type(devinfo, &inst) != GEN7_HW_REG_NON_IMM_TYPE_DF) {
+   brw_inst_dst_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
+   brw_inst_dst_hstride(devinfo, &inst) == BRW_HORIZONTAL_STRIDE_1) {
   brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_IMM_TYPE_VF);
}
 
-- 
2.13.0



[Mesa-dev] [PATCH 05/25] i965: Remove CONT/BREAK from instruction compaction test

2017-08-04 Thread Matt Turner
These cannot be compacted. A similar mistake was fixed in commit
90eaf01616a8
---
 src/intel/compiler/test_eu_compact.cpp | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/intel/compiler/test_eu_compact.cpp 
b/src/intel/compiler/test_eu_compact.cpp
index 668a972bfa..1532e3b984 100644
--- a/src/intel/compiler/test_eu_compact.cpp
+++ b/src/intel/compiler/test_eu_compact.cpp
@@ -68,8 +68,6 @@ clear_pad_bits(const struct gen_device_info *devinfo, 
brw_inst *inst)
 {
if (brw_inst_opcode(devinfo, inst) != BRW_OPCODE_SEND &&
brw_inst_opcode(devinfo, inst) != BRW_OPCODE_SENDC &&
-   brw_inst_opcode(devinfo, inst) != BRW_OPCODE_BREAK &&
-   brw_inst_opcode(devinfo, inst) != BRW_OPCODE_CONTINUE &&
brw_inst_src0_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE &&
brw_inst_src1_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE) {
   brw_inst_set_bits(inst, 127, 111, 0);
@@ -133,8 +131,6 @@ skip_bit(const struct gen_device_info *devinfo, brw_inst 
*src, int bit)
/* sometimes these are pad bits. */
if (brw_inst_opcode(devinfo, src) != BRW_OPCODE_SEND &&
brw_inst_opcode(devinfo, src) != BRW_OPCODE_SENDC &&
-   brw_inst_opcode(devinfo, src) != BRW_OPCODE_BREAK &&
-   brw_inst_opcode(devinfo, src) != BRW_OPCODE_CONTINUE &&
brw_inst_src0_reg_file(devinfo, src) != BRW_IMMEDIATE_VALUE &&
brw_inst_src1_reg_file(devinfo, src) != BRW_IMMEDIATE_VALUE &&
bit >= 121) {
-- 
2.13.0



[Mesa-dev] [PATCH 08/25] i965: Validate destination restrictions with vector immediates

2017-08-04 Thread Matt Turner
---
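
A rough standalone sketch of the rule this patch validates, based only on
the PRM text quoted in the diff: with a vector immediate the destination
must be 128-bit aligned, and its stride in bytes must be a word for V/UV
or a dword for VF. The enum imm_kind and the function
check_vector_immediate are hypothetical names, and the arguments are plain
byte and element counts rather than instruction fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the immediate operand. */
enum imm_kind { IMM_V, IMM_UV, IMM_VF, IMM_OTHER };

/* Returns NULL if the destination is acceptable, otherwise an error
 * string.  dst_subreg_bytes is the destination subregister offset in
 * bytes, dst_type_size the element size in bytes, and dst_stride the
 * horizontal stride in elements.
 */
static const char *
check_vector_immediate(enum imm_kind kind, unsigned dst_subreg_bytes,
                       unsigned dst_type_size, unsigned dst_stride)
{
   if (kind == IMM_OTHER)
      return NULL; /* the restriction only applies to vector immediates */

   if (dst_subreg_bytes % (128 / 8) != 0)
      return "destination must be 128-bit aligned";

   /* VF needs a dword stride; V and UV need a word stride. */
   const unsigned required_bytes = (kind == IMM_VF) ? 4 : 2;
   if (dst_type_size * dst_stride != required_bytes)
      return "destination stride does not match the immediate type";

   return NULL;
}
```

For example, a float destination with stride 1 accepts VF, while the same
destination paired with a V immediate is rejected because 4 bytes is not a
word.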
 src/intel/compiler/brw_eu_emit.c| 13 +-
 src/intel/compiler/brw_eu_validate.c| 61 +
 src/intel/compiler/test_eu_validate.cpp | 79 +
 3 files changed, 141 insertions(+), 12 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 8a6ec035cc..6673e0741a 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -279,19 +279,8 @@ validate_reg(const struct gen_device_info *devinfo,
const int execsize_for_reg[] = {1, 2, 4, 8, 16, 32};
int width, hstride, vstride, execsize;
 
-   if (reg.file == BRW_IMMEDIATE_VALUE) {
-  /* 3.3.6: Region Parameters.  Restriction: Immediate vectors
-   * mean the destination has to be 128-bit aligned and the
-   * destination horiz stride has to be a word.
-   */
-  if (reg.type == BRW_REGISTER_TYPE_V) {
- unsigned UNUSED elem_size = brw_element_size(devinfo, inst, dst);
- assert(hstride_for_reg[brw_inst_dst_hstride(devinfo, inst)] *
-elem_size == 2);
-  }
-
+   if (reg.file == BRW_IMMEDIATE_VALUE)
   return;
-   }
 
if (reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
reg.file == BRW_ARF_NULL)
diff --git a/src/intel/compiler/brw_eu_validate.c 
b/src/intel/compiler/brw_eu_validate.c
index 827cd707c7..7f0595e6f8 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -1036,6 +1036,66 @@ region_alignment_rules(const struct gen_device_info 
*devinfo,
return error_msg;
 }
 
+static struct string
+vector_immediate_restrictions(const struct gen_device_info *devinfo,
+  const brw_inst *inst)
+{
+   unsigned num_sources = num_sources_from_inst(devinfo, inst);
+   struct string error_msg = { .str = NULL, .len = 0 };
+
+   if (num_sources == 3 || num_sources == 0)
+  return (struct string){};
+
+   unsigned file = num_sources == 1 ?
+   brw_inst_src0_reg_file(devinfo, inst) :
+   brw_inst_src1_reg_file(devinfo, inst);
+   if (file != BRW_IMMEDIATE_VALUE)
+  return (struct string){};
+
+   unsigned dst_type_size = brw_element_size(devinfo, inst, dst);
+   unsigned dst_subreg = brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1 ?
+ brw_inst_dst_da1_subreg_nr(devinfo, inst) : 0;
+   unsigned dst_stride = 1 << (brw_inst_dst_hstride(devinfo, inst) - 1);
+   unsigned type = num_sources == 1 ?
+   brw_inst_src0_reg_type(devinfo, inst) :
+   brw_inst_src1_reg_type(devinfo, inst);
+
+   /* The PRMs say:
+*
+*When an immediate vector is used in an instruction, the destination
+*must be 128-bit aligned with destination horizontal stride equivalent
+*to a word for an immediate integer vector (v) and equivalent to a
+*DWord for an immediate float vector (vf).
+*
+* The text has not been updated for the addition of the immediate unsigned
+* integer vector type (uv) on SNB, but presumably the same restriction
+* applies.
+*/
+   switch (type) {
+   case BRW_HW_REG_IMM_TYPE_V:
+   case BRW_HW_REG_IMM_TYPE_UV:
+   case BRW_HW_REG_IMM_TYPE_VF:
+  ERROR_IF(dst_subreg % (128 / 8) != 0,
+   "Destination must be 128-bit aligned in order to use immediate "
+   "vector types");
+
+  if (type == BRW_HW_REG_IMM_TYPE_VF) {
+ ERROR_IF(dst_type_size * dst_stride != 4,
+  "Destination must have stride equivalent to dword in order "
+  "to use the VF type");
+  } else {
+ ERROR_IF(dst_type_size * dst_stride != 2,
+  "Destination must have stride equivalent to word in order "
+  "to use the V or UV type");
+  }
+  break;
+   default:
+  break;
+   }
+
+   return error_msg;
+}
+
 bool
 brw_validate_instructions(const struct gen_device_info *devinfo,
   void *assembly, int start_offset, int end_offset,
@@ -1063,6 +1123,7 @@ brw_validate_instructions(const struct gen_device_info 
*devinfo,
  CHECK(general_restrictions_based_on_operand_types);
  CHECK(general_restrictions_on_region_parameters);
  CHECK(region_alignment_rules);
+ CHECK(vector_immediate_restrictions);
   }
 
   if (error_msg.str && annotation) {
diff --git a/src/intel/compiler/test_eu_validate.cpp 
b/src/intel/compiler/test_eu_validate.cpp
index 09f4cc142a..b43c41704b 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -132,6 +132,7 @@ validate(struct brw_codegen *p)
 #define last_inst(&p->store[p->nr_insn - 1])
 #define g0   brw_vec8_grf(0, 0)
 #define null brw_null_reg()
+#define zero brw_imm_f(0.0f)
 
 static void
 clear_instructions(struct brw_codegen *p)
@@ -844,5 +845,83 @@ TEST_P(validation_test, byte_destination_relaxed_alignment)
} else {

[Mesa-dev] [PATCH 04/25] i965: Test instruction compaction on all supported Gens

2017-08-04 Thread Matt Turner
Note that there's no point in testing on G45, since its compaction is
the same as Gen5. The same logic applies to Gen7 variants and low-power
parts.
---
 src/intel/compiler/test_eu_compact.cpp | 50 --
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/test_eu_compact.cpp 
b/src/intel/compiler/test_eu_compact.cpp
index 1ef7e5ae7f..668a972bfa 100644
--- a/src/intel/compiler/test_eu_compact.cpp
+++ b/src/intel/compiler/test_eu_compact.cpp
@@ -74,6 +74,13 @@ clear_pad_bits(const struct gen_device_info *devinfo, 
brw_inst *inst)
brw_inst_src1_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE) {
   brw_inst_set_bits(inst, 127, 111, 0);
}
+
+   if (devinfo->gen == 8 && !devinfo->is_cherryview &&
+   is_3src(devinfo, (opcode)brw_inst_opcode(devinfo, inst))) {
+  brw_inst_set_bits(inst, 105, 105, 0);
+  brw_inst_set_bits(inst, 84, 84, 0);
+  brw_inst_set_bits(inst, 36, 35, 0);
+   }
 }
 
 static bool
@@ -87,13 +94,41 @@ skip_bit(const struct gen_device_info *devinfo, brw_inst 
*src, int bit)
if (bit == 29)
   return true;
 
-   /* pad bit */
-   if (bit == 47)
-  return true;
+   if (is_3src(devinfo, (opcode)brw_inst_opcode(devinfo, src))) {
+  if (devinfo->gen >= 9 || devinfo->is_cherryview) {
+ if (bit == 127)
+return true;
+  } else {
+ if (bit >= 126 && bit <= 127)
+return true;
 
-   /* pad bits */
-   if (bit >= 90 && bit <= 95)
-  return true;
+ if (bit == 105)
+return true;
+
+ if (bit == 84)
+return true;
+
+ if (bit >= 35 && bit <= 36)
+return true;
+  }
+   } else {
+  if (bit == 47)
+ return true;
+
+  if (devinfo->gen >= 8) {
+ if (bit == 11)
+return true;
+
+ if (bit == 95)
+return true;
+  } else {
+ if (devinfo->gen < 7 && bit == 90)
+return true;
+
+ if (bit >= 91 && bit <= 95)
+return true;
+  }
+   }
 
/* sometimes these are pad bits. */
if (brw_inst_opcode(devinfo, src) != BRW_OPCODE_SEND &&
@@ -289,10 +324,9 @@ int
 main(int argc, char **argv)
 {
struct gen_device_info *devinfo = (struct gen_device_info *)calloc(1, 
sizeof(*devinfo));
-   devinfo->gen = 6;
bool fail = false;
 
-   for (devinfo->gen = 6; devinfo->gen <= 7; devinfo->gen++) {
+   for (devinfo->gen = 5; devinfo->gen <= 9; devinfo->gen++) {
   fail |= run_tests(devinfo);
}
 
-- 
2.13.0



[Mesa-dev] [PATCH 01/25] i965: Mark src inst pointer const in compaction code

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_eu.h |  2 +-
 src/intel/compiler/brw_eu_compact.c | 23 ---
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index a3a9c63239..8e597b212a 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -542,7 +542,7 @@ void brw_compact_instructions(struct brw_codegen *p, int 
start_offset,
 void brw_uncompact_instruction(const struct gen_device_info *devinfo,
brw_inst *dst, brw_compact_inst *src);
 bool brw_try_compact_instruction(const struct gen_device_info *devinfo,
- brw_compact_inst *dst, brw_inst *src);
+ brw_compact_inst *dst, const brw_inst *src);
 
 void brw_debug_compact_uncompact(const struct gen_device_info *devinfo,
  brw_inst *orig, brw_inst *uncompacted);
diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index 740a395f78..a940e214f2 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -671,7 +671,7 @@ static const uint16_t *src_index_table;
 
 static bool
 set_control_index(const struct gen_device_info *devinfo,
-  brw_compact_inst *dst, brw_inst *src)
+  brw_compact_inst *dst, const brw_inst *src)
 {
uint32_t uncompacted = devinfo->gen >= 8  /* 17b/G45; 19b/IVB+ */
   ? (brw_inst_bits(src, 33, 31) << 16) | /*  3b */
@@ -700,7 +700,7 @@ set_control_index(const struct gen_device_info *devinfo,
 
 static bool
 set_datatype_index(const struct gen_device_info *devinfo, brw_compact_inst 
*dst,
-   brw_inst *src)
+   const brw_inst *src)
 {
uint32_t uncompacted = devinfo->gen >= 8  /* 18b/G45+; 21b/BDW+ */
   ? (brw_inst_bits(src, 63, 61) << 18) | /*  3b */
@@ -721,7 +721,7 @@ set_datatype_index(const struct gen_device_info *devinfo, 
brw_compact_inst *dst,
 
 static bool
 set_subreg_index(const struct gen_device_info *devinfo, brw_compact_inst *dst,
- brw_inst *src, bool is_immediate)
+ const brw_inst *src, bool is_immediate)
 {
uint16_t uncompacted = /* 15b */
   (brw_inst_bits(src, 52, 48) << 0) | /*  5b */
@@ -756,7 +756,7 @@ get_src_index(uint16_t uncompacted,
 
 static bool
 set_src0_index(const struct gen_device_info *devinfo,
-   brw_compact_inst *dst, brw_inst *src)
+   brw_compact_inst *dst, const brw_inst *src)
 {
uint16_t compacted;
uint16_t uncompacted = brw_inst_bits(src, 88, 77); /* 12b */
@@ -771,7 +771,7 @@ set_src0_index(const struct gen_device_info *devinfo,
 
 static bool
 set_src1_index(const struct gen_device_info *devinfo, brw_compact_inst *dst,
-   brw_inst *src, bool is_immediate)
+   const brw_inst *src, bool is_immediate)
 {
uint16_t compacted;
 
@@ -791,7 +791,7 @@ set_src1_index(const struct gen_device_info *devinfo, 
brw_compact_inst *dst,
 
 static bool
 set_3src_control_index(const struct gen_device_info *devinfo,
-   brw_compact_inst *dst, brw_inst *src)
+   brw_compact_inst *dst, const brw_inst *src)
 {
assert(devinfo->gen >= 8);
 
@@ -814,7 +814,7 @@ set_3src_control_index(const struct gen_device_info 
*devinfo,
 
 static bool
 set_3src_source_index(const struct gen_device_info *devinfo,
-  brw_compact_inst *dst, brw_inst *src)
+  brw_compact_inst *dst, const brw_inst *src)
 {
assert(devinfo->gen >= 8);
 
@@ -847,7 +847,7 @@ set_3src_source_index(const struct gen_device_info *devinfo,
 }
 
 static bool
-has_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src)
+has_unmapped_bits(const struct gen_device_info *devinfo, const brw_inst *src)
 {
/* EOT can only be mapped on a send if the src1 is an immediate */
if ((brw_inst_opcode(devinfo, src) == BRW_OPCODE_SENDC ||
@@ -878,7 +878,8 @@ has_unmapped_bits(const struct gen_device_info *devinfo, 
brw_inst *src)
 }
 
 static bool
-has_3src_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src)
+has_3src_unmapped_bits(const struct gen_device_info *devinfo,
+   const brw_inst *src)
 {
/* Check for three-source instruction bits that don't map to any of the
 * fields of the compacted instruction.  All of them seem to be reserved
@@ -901,7 +902,7 @@ has_3src_unmapped_bits(const struct gen_device_info 
*devinfo, brw_inst *src)
 
 static bool
 brw_try_compact_3src_instruction(const struct gen_device_info *devinfo,
- brw_compact_inst *dst, brw_inst *src)
+ brw_compact_inst *dst, const brw_inst *src)
 {
assert(devinfo->gen >= 8);
 
@@ -962,7 +963,7 @@ is_compactable_immediate(unsigned imm)
  */
 bool
 brw_try_compact_instruction(const struct gen_device_info *devinfo,
-  

[Mesa-dev] [PATCH 00/25] i965: Switch to always using logical register types

2017-08-04 Thread Matt Turner
The mixture of hardware encodings and logical types has caused lots of
confusion. It's time to fix that.



[Mesa-dev] [PATCH 03/25] i965: Silence signed/unsigned comparison warning

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/test_eu_compact.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/test_eu_compact.cpp 
b/src/intel/compiler/test_eu_compact.cpp
index 39e7f1a27c..1ef7e5ae7f 100644
--- a/src/intel/compiler/test_eu_compact.cpp
+++ b/src/intel/compiler/test_eu_compact.cpp
@@ -254,7 +254,7 @@ run_tests(const struct gen_device_info *devinfo)
brw_init_compaction_tables(devinfo);
bool fail = false;
 
-   for (int i = 0; i < ARRAY_SIZE(tests); i++) {
+   for (unsigned i = 0; i < ARRAY_SIZE(tests); i++) {
   for (int align_16 = 0; align_16 <= 1; align_16++) {
 struct brw_codegen *p = rzalloc(NULL, struct brw_codegen);
 brw_init_codegen(devinfo, p, p);
-- 
2.13.0



[Mesa-dev] [PATCH 02/25] i965: Move compaction "prepass" into brw_eu_compact.c

2017-08-04 Thread Matt Turner
---
 src/intel/compiler/brw_eu_compact.c | 82 -
 src/intel/compiler/brw_eu_emit.c| 72 +---
 2 files changed, 82 insertions(+), 72 deletions(-)

diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index a940e214f2..bf57ddf85c 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -956,6 +956,83 @@ is_compactable_immediate(unsigned imm)
 }
 
 /**
+ * Applies some small changes to instruction types to increase chances of
+ * compaction.
+ */
+static brw_inst
+precompact(const struct gen_device_info *devinfo, brw_inst inst)
+{
+   if (brw_inst_src0_reg_file(devinfo, &inst) != BRW_IMMEDIATE_VALUE)
+  return inst;
+
+   /* The Bspec's section titled "Non-present Operands" claims that if src0
+* is an immediate that src1's type must be the same as that of src0.
+*
+* The SNB+ DataTypeIndex instruction compaction tables contain mappings
+* that do not follow this rule. E.g., from the IVB/HSW table:
+*
+*  DataTypeIndex   18-Bit Mapping   Mapped Meaning
+*3 0010101101   r:f | i:vf | a:ud | <1> | dir |
+*
+* And from the SNB table:
+*
+*  DataTypeIndex   18-Bit Mapping   Mapped Meaning
+*8 0010001100   a:w | i:w | a:ud | <1> | dir |
+*
+* Neither of these cause warnings from the simulator when used,
+* compacted or otherwise. In fact, all compaction mappings that have an
+* immediate in src0 use a:ud for src1.
+*
+* The GM45 instruction compaction tables do not contain mapped meanings
+* so it's not clear whether it has the restriction. We'll assume it was
+* lifted on SNB. (FINISHME: decode the GM45 tables and check.)
+*
+* Don't do any of this for 64-bit immediates, since the src1 fields
+* overlap with the immediate and setting them would overwrite the
+* immediate we set.
+*/
+   if (devinfo->gen >= 6 &&
+   !(devinfo->is_haswell &&
+ brw_inst_opcode(devinfo, &inst) == BRW_OPCODE_DIM) &&
+   !(devinfo->gen >= 8 &&
+ (brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_IMM_TYPE_DF ||
+  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_TYPE_UQ ||
+  brw_inst_src0_reg_type(devinfo, &inst) == GEN8_HW_REG_TYPE_Q))) {
+  brw_inst_set_src1_reg_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
+   }
+
+   /* Compacted instructions only have 12-bits (plus 1 for the other 20)
+* for immediate values. Presumably the hardware engineers realized
+* that the only useful floating-point value that could be represented
+* in this format is 0.0, which can also be represented as a VF-typed
+* immediate, so they gave us the previously mentioned mapping on IVB+.
+*
+* Strangely, we do have a mapping for imm:f in src1, so we don't need
+* to do this there.
+*
+* If we see a 0.0:F, change the type to VF so that it can be compacted.
+*/
+   if (brw_inst_imm_ud(devinfo, &inst) == 0x0 &&
+   brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
+   brw_inst_dst_reg_type(devinfo, &inst) != GEN7_HW_REG_NON_IMM_TYPE_DF) {
+  brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_IMM_TYPE_VF);
+   }
+
+   /* There are no mappings for dst:d | i:d, so if the immediate is suitable
+* set the types to :UD so the instruction can be compacted.
+*/
+   if (is_compactable_immediate(brw_inst_imm_ud(devinfo, &inst)) &&
+   brw_inst_cond_modifier(devinfo, &inst) == BRW_CONDITIONAL_NONE &&
+   brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_D &&
+   brw_inst_dst_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_D) {
+  brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
+  brw_inst_set_dst_reg_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
+   }
+
+   return inst;
+}
+
+/**
  * Tries to compact instruction src into dst.
  *
  * It doesn't modify dst unless src is compactable, which is relied on by
@@ -1427,9 +1504,10 @@ brw_compact_instructions(struct brw_codegen *p, int 
start_offset,
   old_ip[offset / sizeof(brw_compact_inst)] = src_offset / 
sizeof(brw_inst);
   compacted_counts[src_offset / sizeof(brw_inst)] = compacted_count;
 
-  brw_inst saved = *src;
+  brw_inst inst = precompact(devinfo, *src);
+  brw_inst saved = inst;
 
-  if (brw_try_compact_instruction(devinfo, dst, src)) {
+  if (brw_try_compact_instruction(devinfo, dst, &inst)) {
  compacted_count++;
 
  if (INTEL_DEBUG) {
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 0b0d67a5c5..8a6ec035cc 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -356,16 +356,6 @@ validate_reg(const struct gen_device_info *devinfo,
/* 10. Check destination issues. */
 }
 
-static bool
-is_compactable_immediate(unsigned imm)
-{
-   /* We get the low 12 bits as-is. */
-   imm &= ~0xfff;
-
-   /* We get one bit replicated through the 

[Mesa-dev] [PATCH v3 7/8] anv: Use DRM sync objects for external semaphores when available

2017-08-04 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 59 +++
 src/intel/vulkan/anv_device.c  |  1 +
 src/intel/vulkan/anv_private.h |  8 
 src/intel/vulkan/anv_queue.c   | 83 +++---
 4 files changed, 128 insertions(+), 23 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 7a84bbd..e670ad7 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -957,6 +957,11 @@ struct anv_execbuf {
 
/* Allocated length of the 'objects' and 'bos' arrays */
uint32_t  array_length;
+
+   uint32_t  fence_count;
+   uint32_t  fence_array_length;
+   struct drm_i915_gem_exec_fence *  fences;
+   struct anv_syncobj ** syncobjs;
 };
 
 static void
@@ -971,6 +976,8 @@ anv_execbuf_finish(struct anv_execbuf *exec,
 {
vk_free(alloc, exec->objects);
vk_free(alloc, exec->bos);
+   vk_free(alloc, exec->fences);
+   vk_free(alloc, exec->syncobjs);
 }
 
 static VkResult
@@ -1061,6 +1068,35 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
return VK_SUCCESS;
 }
 
+static VkResult
+anv_execbuf_add_syncobj(struct anv_execbuf *exec,
+uint32_t handle, uint32_t flags,
+const VkAllocationCallbacks *alloc)
+{
+   assert(flags != 0);
+
+   if (exec->fence_count >= exec->fence_array_length) {
+  uint32_t new_len = MAX2(exec->fence_array_length * 2, 64);
+
+  exec->fences = vk_realloc(alloc, exec->fences,
+new_len * sizeof(*exec->fences),
+8, VK_SYSTEM_ALLOCATION_SCOPE_COMMAND);
+  if (exec->fences == NULL)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+  exec->fence_array_length = new_len;
+   }
+
+   exec->fences[exec->fence_count] = (struct drm_i915_gem_exec_fence) {
+  .handle = handle,
+  .flags = flags,
+   };
+
+   exec->fence_count++;
+
+   return VK_SUCCESS;
+}
+
 static void
 anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer,
   struct anv_reloc_list *list)
@@ -1448,6 +1484,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  impl->fd = -1;
  break;
 
+  case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_WAIT,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  break;
   }
@@ -1484,6 +1528,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  need_out_fence = true;
  break;
 
+  case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  break;
   }
@@ -1497,6 +1549,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   setup_empty_execbuf(, device);
}
 
+   if (execbuf.fence_count > 0) {
+  assert(device->instance->physicalDevice.has_syncobj);
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_ARRAY;
+  execbuf.execbuf.num_cliprects = execbuf.fence_count;
+  execbuf.execbuf.cliprects_ptr = (uintptr_t) execbuf.fences;
+   }
+
if (in_fence != -1) {
   execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
   execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 3c5f78c..a6d5215 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -338,6 +338,7 @@ anv_physical_device_init(struct anv_physical_device *device,
 
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
+   device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b451fa5..de74637 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -653,6 +653,7 @@ struct anv_physical_device {
 int cmd_parser_version;
 boolhas_exec_async;
 boolhas_exec_fence;
+boolhas_syncobj;
 
 uint32_teu_total;
 uint32_tsubslice_total;
@@ -1742,6 +1743,7 @@ enum anv_semaphore_type {
ANV_SEMAPHORE_TYPE_DUMMY,

[Mesa-dev] [PATCH v3 8/8] anv: Advertise VK_KHR_external_semaphore

2017-08-04 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_extensions.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 00186bc..3252e0f 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -50,9 +50,9 @@ EXTENSIONS = [
 Extension('VK_KHR_external_memory',   1, True),
 Extension('VK_KHR_external_memory_capabilities',  1, True),
 Extension('VK_KHR_external_memory_fd',1, True),
-Extension('VK_KHR_external_semaphore',1, False),
-Extension('VK_KHR_external_semaphore_capabilities',   1, False),
-Extension('VK_KHR_external_semaphore_fd', 1, False),
+Extension('VK_KHR_external_semaphore',1, True),
+Extension('VK_KHR_external_semaphore_capabilities',   1, True),
+Extension('VK_KHR_external_semaphore_fd', 1, True),
 Extension('VK_KHR_get_memory_requirements2',  1, True),
 Extension('VK_KHR_get_physical_device_properties2',   1, True),
 Extension('VK_KHR_get_surface_capabilities2', 1, True),
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 6/8] anv/gem: Add drm syncobj support

2017-08-04 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 52 
 src/intel/vulkan/anv_gem_stubs.c | 24 +++
 src/intel/vulkan/anv_private.h   |  4 
 3 files changed, 80 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 5b68e9b..9e6b2bb 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -436,3 +436,55 @@ anv_gem_sync_file_merge(struct anv_device *device, int 
fd1, int fd2)
 
return args.fence;
 }
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   struct drm_syncobj_create args = {
+  .flags = 0,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_CREATE, &args);
+   if (ret)
+  return 0;
+
+   return args.handle;
+}
+
+void
+anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_destroy args = {
+  .handle = handle,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_DESTROY, &args);
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);
+   if (ret)
+  return -1;
+
+   return args.fd;
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   struct drm_syncobj_handle args = {
+  .fd = fd,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
+   if (ret)
+  return 0;
+
+   return args.handle;
+}
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index 8d81eb5..842efb3 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -180,3 +180,27 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)
 {
unreachable("Unused");
 }
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   unreachable("Unused");
+}
+
+void
+anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   unreachable("Unused");
+}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5c7b3b4..b451fa5 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -812,6 +812,10 @@ int anv_gem_set_caching(struct anv_device *device, 
uint32_t gem_handle, uint32_t
 int anv_gem_set_domain(struct anv_device *device, uint32_t gem_handle,
uint32_t read_domains, uint32_t write_domain);
 int anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2);
+uint32_t anv_gem_syncobj_create(struct anv_device *device);
+void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
+int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
+uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
 
 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, 
uint64_t size);
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 5/8] intel/drm: Pull in the i915 fence array API

2017-08-04 Thread Jason Ekstrand
---
 include/drm-uapi/i915_drm.h | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index c26bf7c..338c8c2 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -431,6 +431,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST 48
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
__s32 param;
/*
@@ -812,6 +817,17 @@ struct drm_i915_gem_exec_object2 {
__u64 rsvd2;
 };
 
+struct drm_i915_gem_exec_fence {
+   /**
+* User's handle for a dma-fence to wait on or signal.
+*/
+   __u32 handle;
+
+#define I915_EXEC_FENCE_WAIT    (1<<0)
+#define I915_EXEC_FENCE_SIGNAL  (1<<1)
+   __u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
/**
 * List of gem_exec_object2 structs
@@ -826,7 +842,10 @@ struct drm_i915_gem_execbuffer2 {
__u32 DR1;
__u32 DR4;
__u32 num_cliprects;
-   /** This is a struct drm_clip_rect *cliprects */
+   /** This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+ * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
+ * struct drm_i915_gem_exec_fence *fences.
+ */
__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK  (7<<0)
 #define I915_EXEC_DEFAULT    (0<<0)
@@ -927,7 +946,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST  (1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of i915_gem_exec_fence structures which specify a set of
+ * dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK  (0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 3/8] anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set

2017-08-04 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/vulkan/anv_gem.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index ac47da4..36692f5 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -185,7 +185,10 @@ int
 anv_gem_execbuffer(struct anv_device *device,
struct drm_i915_gem_execbuffer2 *execbuf)
 {
-   return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
+   if (execbuf->flags & I915_EXEC_FENCE_OUT)
+  return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, execbuf);
+   else
+  return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
 }
 
 int
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 4/8] anv: Implement support for exporting semaphores as FENCE_FD

2017-08-04 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 57 +--
 src/intel/vulkan/anv_device.c  |  1 +
 src/intel/vulkan/anv_gem.c | 36 
 src/intel/vulkan/anv_private.h | 23 +
 src/intel/vulkan/anv_queue.c   | 69 --
 5 files changed, 175 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 65fe366..7a84bbd 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1416,11 +1416,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
struct anv_execbuf execbuf;
anv_execbuf_init(&execbuf);
 
+   int in_fence = -1;
VkResult result = VK_SUCCESS;
for (uint32_t i = 0; i < num_in_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
-  assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
-  struct anv_semaphore_impl *impl = &semaphore->permanent;
+  struct anv_semaphore_impl *impl =
+ semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ?
+ &semaphore->temporary : &semaphore->permanent;
 
   switch (impl->type) {
   case ANV_SEMAPHORE_TYPE_BO:
@@ -1429,11 +1431,29 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  if (result != VK_SUCCESS)
 return result;
  break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ if (in_fence == -1) {
+in_fence = impl->fd;
+ } else {
+int merge = anv_gem_sync_file_merge(device, in_fence, impl->fd);
+if (merge == -1)
+   return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
+
+close(impl->fd);
+close(in_fence);
+in_fence = merge;
+ }
+
+ impl->fd = -1;
+ break;
+
   default:
  break;
   }
}
 
+   bool need_out_fence = false;
for (uint32_t i = 0; i < num_out_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
 
@@ -1459,6 +1479,11 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  if (result != VK_SUCCESS)
 return result;
  break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ need_out_fence = true;
+ break;
+
   default:
  break;
   }
@@ -1472,9 +1497,19 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   setup_empty_execbuf(&execbuf, device);
}
 
+   if (in_fence != -1) {
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
+  execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
+   }
+
+   if (need_out_fence)
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_OUT;
 
result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
+   /* Execbuf does not consume the in_fence.  It's our job to close it. */
+   close(in_fence);
+
for (uint32_t i = 0; i < num_in_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
   /* From the Vulkan 1.0.53 spec:
@@ -1489,6 +1524,24 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   anv_semaphore_reset_temporary(device, semaphore);
}
 
+   if (result == VK_SUCCESS && need_out_fence) {
+  int out_fence = execbuf.execbuf.rsvd2 >> 32;
+  for (uint32_t i = 0; i < num_out_semaphores; i++) {
+ ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
+ /* Out fences can't have temporary state because that would imply
+  * that we imported a sync file and are trying to signal it.
+  */
+ assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
+ struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+ if (impl->type == ANV_SEMAPHORE_TYPE_SYNC_FILE) {
+assert(impl->fd == -1);
+impl->fd = dup(out_fence);
+ }
+  }
+  close(out_fence);
+   }
+
anv_execbuf_finish(&execbuf, &device->alloc);
 
return result;
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e82e1e9..3c5f78c 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -337,6 +337,7 @@ anv_physical_device_init(struct anv_physical_device *device,
   goto fail;
 
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
+   device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 36692f5..5b68e9b 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -22,6 +22,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -400,3 +401,38 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)
 
return args.handle;
 }
+
+#ifndef SYNC_IOC_MAGIC
+/* duplicated from linux/sync_file.h to avoid build-time dependency
+ * on new (v4.7) kernel headers.  Once distros are mostly using
+ * something newer than v4.7 drop this and #include <linux/sync_file.h>
+ * instead.
+ */
+struct 
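
The fence plumbing in the patch above relies on how the i915 execbuffer2 ioctl reuses rsvd2: with I915_EXEC_FENCE_IN the low 32 bits carry the in-fence fd, and with I915_EXEC_FENCE_OUT the kernel returns the out-fence fd in the high 32 bits (which is why the _WR ioctl variant is needed). A minimal sketch of that packing, assuming hypothetical helper names that are not part of anv:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; anv open-codes this. */

/* Place the in-fence fd in the low 32 bits of rsvd2, keeping the high bits. */
static uint64_t rsvd2_set_in_fence(uint64_t rsvd2, int in_fence)
{
   assert(in_fence >= 0);
   return (rsvd2 & ~UINT64_C(0xffffffff)) | (uint32_t)in_fence;
}

/* Read back the out-fence fd that the kernel stores in the high 32 bits. */
static int rsvd2_get_out_fence(uint64_t rsvd2)
{
   return (int)(rsvd2 >> 32);
}
```

The two halves are independent, so one submission can both wait on a merged in-fence and produce an out-fence.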

[Mesa-dev] [PATCH v3 2/8] anv: Submit a dummy batch when only semaphores are provided.

2017-08-04 Thread Jason Ekstrand
Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores.  The easiest way for us to support that right
now is to insert a dummy execbuf.
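
For illustration, the dummy batch amounts to two dwords: MI_BATCH_BUFFER_END followed by an MI_NOOP, which lines up with the batch_len of 8 in the patch. The sketch below assumes the usual MI encoding (opcode in bits 28:23) and uses a hypothetical helper name; the patch itself emits the commands through anv_batch_emit():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MI commands keep their opcode in bits 28:23 of the header dword. */
#define TOY_MI_INSTR(opcode)     ((uint32_t)(opcode) << 23)
#define TOY_MI_NOOP              TOY_MI_INSTR(0x00)
#define TOY_MI_BATCH_BUFFER_END  TOY_MI_INSTR(0x0a)

/* Hypothetical helper: fill a mapped buffer with the trivial batch and
 * return the number of bytes written.
 */
static size_t toy_write_trivial_batch(uint32_t *map, size_t size_bytes)
{
   assert(size_bytes >= 2 * sizeof(uint32_t));
   map[0] = TOY_MI_BATCH_BUFFER_END; /* ends execution immediately */
   map[1] = TOY_MI_NOOP;             /* pads the batch to 8 bytes */
   return 2 * sizeof(uint32_t);
}
```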
---
 src/intel/vulkan/anv_batch_chain.c | 28 +---
 src/intel/vulkan/anv_device.c  | 30 ++
 src/intel/vulkan/anv_private.h |  1 +
 src/intel/vulkan/anv_queue.c   | 17 +
 4 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 94e7a7d..65fe366 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1388,6 +1388,23 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
return VK_SUCCESS;
 }
 
+static void
+setup_empty_execbuf(struct anv_execbuf *execbuf, struct anv_device *device)
+{
+   anv_execbuf_add_bo(execbuf, &device->trivial_batch_bo, NULL, 0,
+  &device->alloc);
+
+   execbuf->execbuf = (struct drm_i915_gem_execbuffer2) {
+  .buffers_ptr = (uintptr_t) execbuf->objects,
+  .buffer_count = execbuf->bo_count,
+  .batch_start_offset = 0,
+  .batch_len = 8, /* GEN8_MI_BATCH_BUFFER_END and NOOP */
+  .flags = I915_EXEC_HANDLE_LUT | I915_EXEC_RENDER,
+  .rsvd1 = device->context_id,
+  .rsvd2 = 0,
+   };
+}
+
 VkResult
 anv_cmd_buffer_execbuf(struct anv_device *device,
struct anv_cmd_buffer *cmd_buffer,
@@ -1447,9 +1464,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   }
}
 
-   result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
-   if (result != VK_SUCCESS)
-  return result;
+   if (cmd_buffer) {
+  result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
+  if (result != VK_SUCCESS)
+ return result;
+   } else {
+  setup_empty_execbuf(&execbuf, device);
+   }
+
 
result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 793e519..e82e1e9 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1014,6 +1014,32 @@ anv_device_init_border_colors(struct anv_device *device)
 border_colors);
 }
 
+static void
+anv_device_init_trivial_batch(struct anv_device *device)
+{
+   anv_bo_init_new(&device->trivial_batch_bo, device, 4096);
+
+   if (device->instance->physicalDevice.has_exec_async)
+  device->trivial_batch_bo.flags |= EXEC_OBJECT_ASYNC;
+
+   void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_handle,
+0, 4096, 0);
+
+   struct anv_batch batch = {
+  .start = map,
+  .next = map,
+  .end = map + 4096,
+   };
+
+   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
+
+   if (!device->info.has_llc)
+  gen_clflush_range(map, batch.next - map);
+
+   anv_gem_munmap(map, device->trivial_batch_bo.size);
+}
+
 VkResult anv_CreateDevice(
 VkPhysicalDevicephysicalDevice,
 const VkDeviceCreateInfo*   pCreateInfo,
@@ -1131,6 +1157,8 @@ VkResult anv_CreateDevice(
if (result != VK_SUCCESS)
   goto fail_surface_state_pool;
 
+   anv_device_init_trivial_batch(device);
+
anv_scratch_pool_init(device, &device->scratch_pool);
 
anv_queue_init(device, &device->queue);
@@ -1220,6 +1248,8 @@ void anv_DestroyDevice(
anv_gem_munmap(device->workaround_bo.map, device->workaround_bo.size);
anv_gem_close(device, device->workaround_bo.gem_handle);
 
+   anv_gem_close(device, device->trivial_batch_bo.gem_handle);
+
anv_state_pool_finish(&device->surface_state_pool);
anv_state_pool_finish(&device->instruction_state_pool);
anv_state_pool_finish(&device->dynamic_state_pool);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b599db3..bc67bb6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -745,6 +745,7 @@ struct anv_device {
 struct anv_state_pool   surface_state_pool;
 
 struct anv_bo   workaround_bo;
+struct anv_bo   trivial_batch_bo;
 
 struct anv_pipeline_cache   blorp_shader_cache;
 struct blorp_contextblorp;
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 9a0789c..039dfd7 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -159,6 +159,23 @@ VkResult anv_QueueSubmit(
pthread_mutex_lock(&device->mutex);
 
for (uint32_t i = 0; i < submitCount; i++) {
+  if (pSubmits[i].commandBufferCount == 0) {
+ /* If we don't have any command buffers, we need to submit a dummy
+  * batch to give GEM something to wait on.  We could, potentially,
+  * come up with something more efficient but this shouldn't be a
+  * common case.
+  */
+ result = anv_cmd_buffer_execbuf(device, NULL,
+   

[Mesa-dev] [PATCH v3 1/8] anv: Add a basic implementation of VK_KHX_external_semaphore

2017-08-04 Thread Jason Ekstrand
This patch adds an implementation based on DRM BOs.  We don't actually
advertise the extension yet because we want to add a couple more paths
first.
---
 src/intel/vulkan/anv_batch_chain.c |  31 +++-
 src/intel/vulkan/anv_extensions.py |   3 +
 src/intel/vulkan/anv_private.h |   3 +
 src/intel/vulkan/anv_queue.c   | 154 +++--
 4 files changed, 184 insertions(+), 7 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index ad76dc1..94e7a7d 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1419,8 +1419,21 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 
for (uint32_t i = 0; i < num_out_semaphores; i++) {
   ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
-  assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
-  struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+  /* Under most circumstances, out fences won't be temporary.  However,
+   * the spec does allow it for opaque_fd.  From the Vulkan 1.0.53 spec:
+   *
+   *"If the import is temporary, the implementation must restore the
+   *semaphore to its prior permanent state after submitting the next
+   *semaphore wait operation."
+   *
+   * The spec says nothing whatsoever about signal operations on
+   * temporarily imported semaphores so it appears they are allowed.
+   * There are also CTS tests that require this to work.
+   */
+  struct anv_semaphore_impl *impl =
+ semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ?
+ &semaphore->temporary : &semaphore->permanent;
 
   switch (impl->type) {
   case ANV_SEMAPHORE_TYPE_BO:
@@ -1440,6 +1453,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 
result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
 
+   for (uint32_t i = 0; i < num_in_semaphores; i++) {
+  ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
+  /* From the Vulkan 1.0.53 spec:
+   *
+   *"If the import is temporary, the implementation must restore the
+   *semaphore to its prior permanent state after submitting the next
+   *semaphore wait operation."
+   *
+   * This has to happen after the execbuf in case we close any syncobjs in
+   * the process.
+   */
+  anv_semaphore_reset_temporary(device, semaphore);
+   }
+
anv_execbuf_finish(&execbuf, &device->alloc);
 
return result;
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index ae22249..00186bc 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -50,6 +50,9 @@ EXTENSIONS = [
 Extension('VK_KHR_external_memory',   1, True),
 Extension('VK_KHR_external_memory_capabilities',  1, True),
 Extension('VK_KHR_external_memory_fd',1, True),
+Extension('VK_KHR_external_semaphore',1, False),
+Extension('VK_KHR_external_semaphore_capabilities',   1, False),
+Extension('VK_KHR_external_semaphore_fd', 1, False),
 Extension('VK_KHR_get_memory_requirements2',  1, True),
 Extension('VK_KHR_get_physical_device_properties2',   1, True),
 Extension('VK_KHR_get_surface_capabilities2', 1, True),
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index c364491..b599db3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1765,6 +1765,9 @@ struct anv_semaphore {
struct anv_semaphore_impl temporary;
 };
 
+void anv_semaphore_reset_temporary(struct anv_device *device,
+   struct anv_semaphore *semaphore);
+
 struct anv_shader_module {
unsigned charsha1[20];
uint32_t size;
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 2c10e9d..9a0789c 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -528,11 +528,38 @@ VkResult anv_CreateSemaphore(
if (semaphore == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   /* The DRM execbuffer ioctl always execute in-oder so long as you stay
-* on the same ring.  Since we don't expose the blit engine as a DMA
-* queue, a dummy no-op semaphore is a perfectly valid implementation.
-*/
-   semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DUMMY;
+   const VkExportSemaphoreCreateInfoKHR *export =
+  vk_find_struct_const(pCreateInfo->pNext, EXPORT_SEMAPHORE_CREATE_INFO_KHR);
+VkExternalSemaphoreHandleTypeFlagsKHR handleTypes =
+  export ? export->handleTypes : 0;
+
+   if (handleTypes == 0) {
+  /* The DRM execbuffer ioctl always execute in-oder so long as you stay
+   * on the same ring.  Since we don't expose the blit engine as a DMA
+   * queue, a dummy no-op semaphore is a perfectly valid implementation.
+   */
+

[Mesa-dev] [PATCH v3 0/8] anv: Implement VK_KHR_external_semaphore

2017-08-04 Thread Jason Ekstrand
This series is a quick re-spin of the v2 sent yesterday to address review
feedback from Chris.  In particular, we now set EXEC_ASYNC on the trivial
batch and I deleted the syncobj cache.  Somehow, when I was working on this
yesterday, I got it into my head that the kernel deduplicates syncobj
handles and that we needed a cache to handle them correctly.  This is not
true.  Every call to SYNCOBJ_FD_TO_HANDLE produces a new handle and the
kernel does the reference counting for us.
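
To make the handle semantics concrete, here is a toy model (not kernel or anv code; every name in it is hypothetical) in which each import of the same underlying object hands out a fresh handle and takes one reference, so no userspace-side deduplication cache is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_HANDLES 16

/* Stand-in for the kernel's refcounted sync object. */
struct toy_syncobj {
   int refcount;
};

/* Stand-in for a per-file handle table; handle 0 means "no handle". */
struct toy_file {
   struct toy_syncobj *handles[TOY_MAX_HANDLES];
};

/* Every import allocates a new handle and takes a reference. */
static uint32_t toy_fd_to_handle(struct toy_file *file, struct toy_syncobj *obj)
{
   for (uint32_t h = 1; h < TOY_MAX_HANDLES; h++) {
      if (file->handles[h] == NULL) {
         file->handles[h] = obj;
         obj->refcount++;
         return h;
      }
   }
   return 0; /* table full */
}

/* Destroying a handle drops exactly the reference that import took. */
static void toy_destroy(struct toy_file *file, uint32_t handle)
{
   assert(handle > 0 && handle < TOY_MAX_HANDLES);
   assert(file->handles[handle] != NULL);
   file->handles[handle]->refcount--;
   file->handles[handle] = NULL;
}
```

Importing the same object twice yields two distinct handles, and destroying each one independently is safe because each holds its own reference.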

Cc: Chad Versace 
Cc: Kristian H. Kristensen 
Cc: Chris Wilson 

Jason Ekstrand (8):
  anv: Add a basic implementation of VK_KHX_external_semaphore
  anv: Submit a dummy batch when only semaphores are provided.
  anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
  anv: Implement support for exporting semaphores as FENCE_FD
  intel/drm: Pull in the i915 fence array API
  anv/gem: Add a drm syncobj support
  anv: Use DRM sync objects for external semaphores when available
  anv: Advertise VK_KHR_external_semaphore

 include/drm-uapi/i915_drm.h|  30 +++-
 src/intel/vulkan/anv_batch_chain.c | 175 +++-
 src/intel/vulkan/anv_device.c  |  32 +
 src/intel/vulkan/anv_extensions.py |   3 +
 src/intel/vulkan/anv_gem.c |  93 -
 src/intel/vulkan/anv_gem_stubs.c   |  24 
 src/intel/vulkan/anv_private.h |  39 +-
 src/intel/vulkan/anv_queue.c   | 271 -
 8 files changed, 646 insertions(+), 21 deletions(-)

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH] docs: removed the '--with-sha1' requirement from shading.html

2017-08-04 Thread Eleni Maria Stea
The configuration option --with-sha1 is no longer required for the
MESA_SHADER_READ_PATH, MESA_SHADER_DUMP_PATH environment variables
to take effect.

1- removed the "--with-sha1" sentence from docs/shading.html
2- added an extra note: that the corresponding dumped and replacement
shaders must have the same filenames for the feature to take effect.
---
 docs/shading.html | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/shading.html b/docs/shading.html
index c789102e64..8b4cfb36a1 100644
--- a/docs/shading.html
+++ b/docs/shading.html
@@ -65,8 +65,7 @@ Example:  export MESA_GLSL=dump,nopt
 
 
 
-Shaders can be dumped and replaced on runtime for debugging purposes. Mesa 
-needs to be configured with '--with-sha1' to enable this functionality. This 
+Shaders can be dumped and replaced on runtime for debugging purposes. This
 feature is not currently supported by SCons build.
 
 This is controlled via following environment variables:
@@ -76,7 +75,8 @@ This is controlled via following environment variables:
 
 Note, path set must exist before running for dumping or replacing to work. 
 When both are set, these paths should be different so the dumped shaders do 
-not clobber the replacement shaders.
+not clobber the replacement shaders. Also, the filenames of the replacement shaders
+should match the filenames of the corresponding dumped shaders.
 
 
 GLSL Version
-- 
2.13.3



Re: [Mesa-dev] [PATCH 09/12] egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface

2017-08-04 Thread Matt Turner
Thanks. I wrote the same patch some time ago, but never had a chance
to send it out. I'll send you another patch that I wrote to clear up
some of this confusion. I put it in my series immediately before this
patch. Feel free to add it to yours.

Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH] egl: Clean up native_type vs drawable mess

2017-08-04 Thread Matt Turner
The next patch is going to stop passing XCB_WINDOW_NONE (of type
xcb_window_enum_t) as an argument where these functions expect a void *,
which clang does not appreciate.

This patch cleans things up to better convince me and reviewers that
it's safe to do that.
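
As a rough sketch of the conversion being tidied up here: the native surface arrives as a void * but actually carries a 32-bit X11 drawable ID, so it is narrowed through uintptr_t, and a NULL pointer maps to drawable 0. The type and function names below are stand-ins for illustration, not the EGL driver's own:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_drawable_t; /* stands in for xcb_drawable_t */

/* Recover the drawable ID carried in an opaque native-surface pointer. */
static toy_drawable_t toy_drawable_from_native(void *native_surface)
{
   _Static_assert(sizeof(uintptr_t) == sizeof(void *),
                  "pointer must round-trip through uintptr_t");
   return (toy_drawable_t)(uintptr_t)native_surface;
}

/* Inverse direction, as a caller would use to pass an XID through void *. */
static void *toy_native_from_drawable(toy_drawable_t drawable)
{
   return (void *)(uintptr_t)drawable;
}
```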
---
 src/egl/drivers/dri2/platform_x11.c  | 10 --
 src/egl/drivers/dri2/platform_x11_dri3.c |  6 +++---
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index c10cd84fce..063c50bcce 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -210,12 +210,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
xcb_get_geometry_cookie_t cookie;
xcb_get_geometry_reply_t *reply;
xcb_generic_error_t *error;
-   xcb_drawable_t drawable;
const __DRIconfig *config;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
(void) drv;
 
dri2_surf = malloc(sizeof *dri2_surf);
@@ -234,14 +230,16 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
dri2_surf->drawable, dri2_dpy->screen->root,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
-  if (!drawable) {
+  if (!native_surface) {
  if (type == EGL_WINDOW_BIT)
 _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
  else
 _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
  goto cleanup_surf;
   }
-  dri2_surf->drawable = drawable;
+
+  STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+  dri2_surf->drawable = (uintptr_t) native_surface;
}
 
config = dri2_get_dri_config(dri2_conf, type,
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index 515be27e20..df17cfa7aa 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -172,9 +172,6 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
const __DRIconfig *dri_config;
xcb_drawable_t drawable;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
(void) drv;
 
dri3_surf = calloc(1, sizeof *dri3_surf);
@@ -191,6 +188,9 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
   xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize,
 drawable, dri2_dpy->screen->root,
 dri3_surf->base.Width, dri3_surf->base.Height);
+   } else {
+  STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+  drawable = (uintptr_t) native_surface;
}
 
dri_config = dri2_get_dri_config(dri2_conf, type,
-- 
2.13.0



[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #2 from Brian Paul  ---
Hmm, swr isn't working at all for me.  It's hanging in a swr_fence_finish()
call with everything I've tried.  Even glxinfo hangs (but elsewhere).

Maybe one of the swr developers can take a look.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #1 from Brian Paul  ---
Sorry for the breakage, Brad.

I'll try to investigate ASAP, but I'm about to leave town for a week.

I can't repro with llmvpipe, fwiw.  I suspect it may be an issue in the swr
driver.  I'm building it now, but may run out of time.



Re: [Mesa-dev] [PATCH 1/2] etnaviv: fix etna_bo_from_name

2017-08-04 Thread Philipp Zabel
On Fri, 2017-08-04 at 18:15 +0200, Wladimir J. van der Laan wrote:
> On Fri, Aug 04, 2017 at 05:07:54PM +0200, Philipp Zabel wrote:
> > Look up BOs from the name table using the name parameter instead of
> > req.handle (which at this point is always zero).
> 
> Good catch.
> 
> Just out of interest: when is this used, what problems does this cause?

It is used by the etnaviv gallium driver in etna_screen_bo_from_handle
for DRM_API_HANDLE_TYPE_SHARED handles. Since this just falls back to
asking the kernel to DRM_IOCTL_GEM_OPEN if the BO is not found in the
name_table already, this bug caused no problems.

regards
Philipp
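
A small illustration of the lookup issue discussed above, using a toy table rather than the real etnaviv name_table (all names here are hypothetical): searching with the still-zeroed request handle never matches, while searching with the name parameter finds the cached BO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bo {
   uint32_t name; /* global (flink) name of the buffer */
};

/* Linear search standing in for the driver's hash-table lookup. */
static struct toy_bo *toy_lookup(struct toy_bo *table, size_t count,
                                 uint32_t key)
{
   for (size_t i = 0; i < count; i++) {
      if (table[i].name == key)
         return &table[i];
   }
   return NULL; /* caller falls back to asking the kernel to open it */
}
```

With the zero key the lookup always misses and takes the fallback path, which is consistent with the bug having caused no visible problems.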



Re: [Mesa-dev] [PATCH v2 8/9] anv: Use DRM sync objects for external semaphores when available

2017-08-04 Thread Jason Ekstrand
On Fri, Aug 4, 2017 at 2:03 AM, Chris Wilson 
wrote:

> Quoting Jason Ekstrand (2017-08-04 02:25:27)
> > @@ -1497,6 +1569,12 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >setup_empty_execbuf(, device);
> > }
> >
> > +   if (execbuf.fence_count > 0) {
>
> For sanity, since I just had to check, assert(device->has_syncobj);
>

Good call.


> > +  execbuf.execbuf.flags |= I915_EXEC_FENCE_ARRAY;
> > +  execbuf.execbuf.num_cliprects = execbuf.fence_count;
> > +  execbuf.execbuf.cliprects_ptr = (uintptr_t) execbuf.fences;
> > +   }
> > +
> > if (in_fence != -1) {
> >execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
> >execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
>


Re: [Mesa-dev] [PATCH v2 7/9] anv/allocator: Add a syncobj cache

2017-08-04 Thread Jason Ekstrand
On Fri, Aug 4, 2017 at 1:59 AM, Chris Wilson 
wrote:

> Quoting Jason Ekstrand (2017-08-04 02:25:26)
> > This is mostly a copy+paste of the BO cache but it's a bit simpler
> > because syncobjs don't have actual backing storage so we don't need to
> > check sizes or anything like that.  Also, we put the refcount directly
> > in anv_syncobj because they will always be heap pointers.
>
> Ok, but why do we need one at all? Some part of the Vk spec, some bad
> behaviour you noticed? Or just that it is more elegant to be minimalist?
>

Gah!  I thought I saw a real-world problem and decided the kernel must be
de-duplicating for me.  But now I remember that it doesn't and just looked
at the kernel code and confirmed that it gives you a new idr entry on every
fd_to_handle.  I'll delete all this garbage and go back to doing it the way
I was before.  Thanks for pointing that out!


> > ---
> >  src/intel/vulkan/anv_allocator.c | 194 ++
> +
> >  src/intel/vulkan/anv_device.c|   9 +-
> >  src/intel/vulkan/anv_private.h   |  40 
> >  3 files changed, 242 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_
> allocator.c
> > index efaaebc..204c466 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -1422,3 +1422,197 @@ anv_bo_cache_release(struct anv_device *device,
> >
> > vk_free(>alloc, bo);
> >  }
> > +
> > +VkResult
> > +anv_syncobj_cache_init(struct anv_syncobj_cache *cache)
> > +{
> > +   cache->map = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> > +_mesa_key_pointer_equal);
>
> Not hash_uint for u32? Bah, for the number of ht mesa creates for
> looking up u32 names, you would think it would have an ultra-specialised
> data struct for it. :(
> -Chris
>


Re: [Mesa-dev] [PATCH 1/5] clover/memory: Copy data when creating buffers with CL_MEM_USE_HOST_PTR

2017-08-04 Thread Grigori Goronzy

On 2017-08-03 22:26, Alex Deucher wrote:


IIRC, user_ptrs require page alignment.

Alex



I didn't follow the whole discussion (sorry if I'm saying something 
redundant), but AMD's older OpenCL Optimization Guide [1] has some notes 
regarding the implementation of the USE_HOST_PTR flag.
It initially recommends 4KB (aka page) alignment but also supports 
arbitrary alignment (with additional overhead, I suppose it pins an 
extra page for bad alignments). It also does some optimizations to 
minimize mapping/unmapping operations, called "pre-pinning". Not sure if 
that is applicable to Mesa/Clover, aren't (GTT) buffers usually mapped 
forever?


Grigori

[1] 
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide.pdf
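
For reference, checking whether a host pointer meets a given power-of-two alignment (4 KiB pages or the 64-byte value mentioned in this thread) is a mask test on the address. This is a generic sketch with a hypothetical helper name, not code from clover or radeonsi:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if ptr is a multiple of alignment (a power of two). */
static bool toy_is_aligned(const void *ptr, size_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```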




Right now it's hard-coded to R600_MAP_BUFFER_ALIGNMENT in si_pipe.c
and r600_pipe.c which has a value of 64 (bytes, I believe).





And also change si_pipe.c:si_get_param's switch statement value to return:

  case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
return sscreen->b.info.gart_page_size;


I'm not sure what the correct value is here. AFAIK, EG uses 256B cache
lines so I'd expect the value to be at least that.


Depending on how the weather works out tonight, I might be able to at
least find out what NI reports for gart page sizes and compare that to
my SI.  I haven't tried to test user pointer support on r600g yet, so
either it's working alright with the existing 64-byte alignment, or
it's broken when we allocate pointers using the actual alignments
reported by clGetDeviceInfo. If it's broken, I'll try 256B, then keep
bumping it up until it either starts working or I hit GART page size.

--Aaron



Both NI and GCN should be able to use 4K pages (which is what
gart_page_size is set to), but we might want higher alignment for
better performance[0]

[0] https://lists.freedesktop.org/archives/dri-devel/2014-May/058858.html


Then I can successfully create buffers from user pointers on my SI card.


I'm a bit fuzzy on what alignment restrictions exist for SI/GCN cards,
but the winsys seems to indicate we should align things to gart page
size, which makes sense on the surface at least.

If the alignment restrictions have changed between R600 and GCN, that
might explain why what's broken for me is working for you/Grigori (on
r600).


I remember there was a buffer alignment patch from AMD recently for
SI/CI vs. VI+, but I can't find it.
It looks like a separate issue, however. If incorrect alignment makes
user_ptr fail, and the test still fails, it looks like the no-user_ptr
fallback is broken.

Jan



--Aaron

>
> Jan
>
>
> >
> > Signed-off-by: Aaron Watry 
> > CC: Francisco Jerez 
> > ---
> >  src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/core/memory.cpp 
b/src/gallium/state_trackers/clover/core/memory.cpp
> > index b852e6896f..912d74830a 100644
> > --- a/src/gallium/state_trackers/clover/core/memory.cpp
> > +++ b/src/gallium/state_trackers/clover/core/memory.cpp
> > @@ -30,7 +30,7 @@ memory_obj::memory_obj(clover::context , cl_mem_flags 
flags,
> > size_t size, void *host_ptr) :
> > context(ctx), _flags(flags),
> > _size(size), _host_ptr(host_ptr) {
> > -   if (flags & CL_MEM_COPY_HOST_PTR)
> > +   if (flags & (CL_MEM_COPY_HOST_PTR | CL_MEM_USE_HOST_PTR))
> >data.append((char *)host_ptr, size);
> >  }
> >
>
> --
> Jan Vesely 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS

2017-08-04 Thread Marathe, Yogesh
> -Original Message-
> From: Tomasz Figa [mailto:tf...@chromium.org]
> Sent: Friday, August 4, 2017 9:39 PM
> On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogesh
>  wrote:
> > Tomasz, Emil,
> >
> >> -Original Message-
> >> From: Tomasz Figa [mailto:tf...@chromium.org]
> >> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov 
> wrote:
> >> >>> >>  - version check (2+) the fence extension, calling
> >> >>> >> .create_fence_fd() only when
> >> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD
> >> >>
> >> >> The check looks like below now, this is in
> >> >> dri2_surf_update_fence_fd() before
> >> create_fence_fd is called.
> >> >>
> >> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) {
> >> >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence-
> >> >get_capabilities(dri2_dpy->dri_screen)) {
> >> >>   //create_fence_fd call
> >> >>}
> >> >> }
> >> >>
> >> > Close but no cigar.
> >> >
> >> > if (dri2_surf->enable_out_fence && dri2_dpy->fence &&
> >> > dri2_dpy->fence->base.version >= 2 &&
> >> > dri2_dpy->fence->get_capabilities) {
> >> >
> >> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
> >> > __DRI_FENCE_CAP_NATIVE_FD) {
> >> > //create_fence_fd call
> >> >}
> >> > }
> >>
> >> If this needs so complicated series of checks, maybe it would make
> >> more sense to just set enable_out_fence based on availability of the
> >> capability at initialization time?
> >
> > I liked this one compared to nested ifs in dri2_surf_update_fence_fd().
> >
> >>
> >> >
> >> >> Overall, if I further go ahead and check, actually
> >> >> get_capabilities() ultimately returns based on has_exec_fence
> >> >> which depends on I915_PARAM_HAS_EXEC_FENCE. This is always set to
> >> >> true for i915 in kernel drv unless forced to false!! I'm not sure
> >> >> if that inner check of
> >> get_capabilities still makes sense. Isn't the first one sufficient?
> >> >>
> >> > Not sure what you mean with "first one", but consider the following
> example:
> >> >  - old kernel which does not support (or has force disabled)
> >> > I915_PARAM_HAS_EXEC_FENCE.
> >> >  - new userspace which unconditionally advertises the fence v2
> >> > extension IIRC one may tweak that things to only conditionally
> >> > advertise it, but IMHO it's not worth the hassle.
> >> >
> >> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module)
> >> > so focusing on one doesn't quite work.
> >> >
> >> >>> >>  - don't introduce unused variables (in make_current)
> >> >>
> >> >> Done.
> >> >>
> >> >>> >>  - the create fd for the old display surface (in make_current)
> >> >>> >> seems bogus
> >> >>
> >> >> Done.
> >> >>
> >> > Did you drop it all together or changed to use some other surface?
> >> > Would be nice to hear the reason why it was added - perhaps I'm
> >> > missing something.
> >>
> >> We have to keep it, otherwise there would be no fence available at
> >> the time of surface destruction, while, at least for Android, a fence
> >> can be passed to window's cancelBuffer callback.
> >>
> >> >
> >> > I think that we want a fence/fd for the new draw surface. Since
> >> > otherwise one won't get created up until the first SwapBuffers call.
> >>
> >> I might be missing something, but wouldn't that insert a fence at the
> >> beginning of command stream, before even doing anything? At least in
> >> Android use cases, the only places we need the fence is in
> >> SwapBuffers and DestroySurface and the fence should be inserted after
> >> all the commands for rendering into given surface.
> >>
> >
> > Emil,
> >
> > Tomasz sounds convincing to me here, I just went ahead with the
> > comment to try out and flatland worked even after removing that.
> > Zhongmin can explain better but I think in earlier revisions this was
> > done for cancelBuffer to match with queueBuffer, I mean we are passing
> > valid fd for queueBuffer by doing this we would have a valid fd during
> cancelBuffer.  Not sure if this is the reason / one of the reason.
> >
> > I will go ahead with rest of your comments if we are ok to keep fd for
> > old display surface in make_current.
> 
> My understanding is that nobody actually cares about the fence that
> cancelBuffer returns, because the contents of the buffer are going to be
> discarded anyway and the buffer doesn't go to the consumer (e.g.
> flatland code that reads the timestamp). I even suspect that typically
> destroySurface would be called directly after swapBuffers and the surface
> wouldn't have a buffer to cancel. You can easily check this by adding a print
> before cancelBuffer call happens. So we might actually be fine with simpler
> code that gets fence only for swapBuffers.
> 

Sure. I can confirm this.

> Changing the topic, the patch doesn't seem to change the implementation of
> swapBuffers to stop doing a flush on the buffer, which defeats the purpose of
> the fence, as the it is likely already 

Re: [Mesa-dev] [PATCH v2 2/9] anv: Submit a dummy batch when only semaphores are provided.

2017-08-04 Thread Jason Ekstrand
On Fri, Aug 4, 2017 at 1:43 AM, Chris Wilson 
wrote:

> Quoting Jason Ekstrand (2017-08-04 02:25:21)
> > Vulkan allows you to do a submit whose only job is to wait on and
> > trigger semaphores.  The easiest way for us to support that right
> > now is to insert a dummy execbuf.
> > ---
> > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > index 793e519..0f0aa22 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1014,6 +1014,28 @@ anv_device_init_border_colors(struct anv_device *device)
> >  border_colors);
> >  }
> >
> > +static void
> > +anv_device_init_trivial_batch(struct anv_device *device)
> > +{
> > +   anv_bo_init_new(&device->trivial_batch_bo, device, 4096);
>
> Is this created with ASYNC?


No, it isn't but I'm happy to set the flag.  This patch predates the ASYNC
stuff, I believe.

> Just thinking that you only want the
> external ordering constraints on this bo, and not accidentally serialize
> between contexts.
>

Is this really an issue?  No other process will ever see this BO.  I
suppose the kernel is still doing unneeded flushing but this shouldn't
cause cross-context synchronization.


> > +   void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_handle,
> > +0, 4096, 0);
> > +
> > +   struct anv_batch batch = {
> > +  .start = map,
> > +  .next = map,
> > +  .end = map + 4096,
> > +   };
> > +
> > +   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
> > +   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
> > +
> > +   if (!device->info.has_llc)
> > +  gen_clflush_range(map, batch.next - map);
> > +
> > +   anv_gem_munmap(map, device->trivial_batch_bo.size);
> > +}
>
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> > index b599db3..bc67bb6 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -745,6 +745,7 @@ struct anv_device {
> >  struct anv_state_pool   surface_state_pool;
> >
> >  struct anv_bo   workaround_bo;
> > +struct anv_bo   trivial_batch_bo;
>
> Do you use all 4096 bytes of the workaround_bo, or could you spare 64?
> ;)
>

I could... Then again, I can also easily spare a single 4K page per context
and prevent myself from accidentally overwriting my little batch. :-)


> > diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
> > index 446c3de..9934fef 100644
> > --- a/src/intel/vulkan/anv_queue.c
> > +++ b/src/intel/vulkan/anv_queue.c
> > @@ -159,6 +159,23 @@ VkResult anv_QueueSubmit(
> > pthread_mutex_lock(&device->mutex);
> >
> > for (uint32_t i = 0; i < submitCount; i++) {
> > +  if (pSubmits[i].commandBufferCount == 0) {
> > + /* If we don't have any command buffers, we need to submit a dummy
> > +  * batch to give GEM something to wait on.  We could, potentially,
> > +  * come up with something more efficient but this shouldn't be a
> > +  * common case.
> > +  */
> > + result = anv_cmd_buffer_execbuf(device, NULL,
> > + pSubmits[i].pWaitSemaphores,
> > + pSubmits[i].waitSemaphoreCount,
> > + pSubmits[i].pSignalSemaphores,
> > + pSubmits[i].signalSemaphoreCount);
>
> Might as well just pass &pSubmits[i] along?
>

I can't.  See below where we only pass the wait semaphores to the first
execbuf in the batch and only pass the signal semaphores to the last.
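
To illustrate the split, here is a rough sketch of how semaphores end up distributed over the execbufs of one submit. The structs and `plan_execbufs` are hypothetical, simplified stand-ins that only track counts; they are not the actual anv or Vulkan types, and the real code passes semaphore arrays to anv_cmd_buffer_execbuf rather than building a plan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one submit: counts only. */
struct fake_submit {
   uint32_t command_buffer_count;
   uint32_t wait_semaphore_count;
   uint32_t signal_semaphore_count;
};

/* One planned execbuf. cmd_buffer_index is -1 for the dummy batch. */
struct fake_execbuf {
   int32_t cmd_buffer_index;
   uint32_t wait_count;
   uint32_t signal_count;
};

/* Fills `out` (capacity `max`) and returns the number of execbufs planned.
 * Waits go only to the first execbuf and signals only to the last one.
 * A submit without command buffers gets a single dummy execbuf that
 * carries both the waits and the signals. */
static size_t plan_execbufs(const struct fake_submit *submit,
                            struct fake_execbuf *out, size_t max)
{
   assert(submit && out);

   if (submit->command_buffer_count == 0) {
      if (max < 1)
         return 0;
      out[0].cmd_buffer_index = -1;
      out[0].wait_count = submit->wait_semaphore_count;
      out[0].signal_count = submit->signal_semaphore_count;
      return 1;
   }

   size_t n = 0;
   const uint32_t last = submit->command_buffer_count - 1;
   for (uint32_t j = 0; j <= last && n < max; j++, n++) {
      out[n].cmd_buffer_index = (int32_t)j;
      out[n].wait_count = (j == 0) ? submit->wait_semaphore_count : 0;
      out[n].signal_count = (j == last) ? submit->signal_semaphore_count : 0;
   }
   return n;
}
```

The point is that each execbuf gets a different slice of the submit, which is why handing the whole submit struct to the helper doesn't fit.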

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] etnaviv: fix etna_bo_from_name

2017-08-04 Thread Wladimir J. van der Laan
On Fri, Aug 04, 2017 at 05:07:54PM +0200, Philipp Zabel wrote:
> Look up BOs from the name table using the name parameter instead of
> req.handle (which at this point is always zero).

Good catch.

Just out of interest: when is this used, what problems does this cause?

Regards,
Wladimir
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS

2017-08-04 Thread Tomasz Figa
On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogesh
 wrote:
> Tomasz, Emil,
>
>> -Original Message-
>> From: Tomasz Figa [mailto:tf...@chromium.org]
>> Sent: Friday, August 4, 2017 6:54 PM
>> To: Emil Velikov 
>> Cc: Marathe, Yogesh ; Antognolli, Rafael
>> ; ML mesa-dev > d...@lists.freedesktop.org>; Wu, Zhongmin ; Gao,
>> Shuo ; Liu, Zhiquan ; Daniel
>> Stone ; Timothy Arceri ; Eric
>> Engestrom ; Kenneth Graunke ;
>> Kondapally, Kalyan ; Varad Gautam
>> ; Rainer Hochecker ;
>> Nicolai Hähnle 
>> Subject: Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync 
>> fence
>> for Android OS
>>
>> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov  
>> wrote:
>> >>> >>  - version check (2+) the fence extension, calling
>> >>> >> .create_fence_fd() only when
>> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD
>> >>
>> >> The check looks like below now, this is in dri2_surf_update_fence_fd() 
>> >> before
>> create_fence_fd is called.
>> >>
>> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) {
>> >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence-
>> >get_capabilities(dri2_dpy->dri_screen)) {
>> >>   //create_fence_fd call
>> >>}
>> >> }
>> >>
>> > Close but no cigar.
>> >
>> > if (dri2_surf->enable_out_fence && dri2_dpy->fence &&
>> > dri2_dpy->fence->base.version >= 2 &&
>> > dri2_dpy->fence->get_capabilities) {
>> >
>> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
>> > __DRI_FENCE_CAP_NATIVE_FD) {
>> > //create_fence_fd call
>> >}
>> > }
>>
>> If this needs so complicated series of checks, maybe it would make more sense
>> to just set enable_out_fence based on availability of the capability at
>> initialization time?
>
> I liked this one compared to nested ifs in dri2_surf_update_fence_fd().
>
>>
>> >
>> >> Overall, if I further go ahead and check, actually get_capabilities()
>> >> ultimately returns based on has_exec_fence which depends on
>> >> I915_PARAM_HAS_EXEC_FENCE. This is always set to true for i915 in
>> >> kernel drv unless forced to false!! I'm not sure if that inner check of
>> get_capabilities still makes sense. Isn't the first one sufficient?
>> >>
>> > Not sure what you mean with "first one", but consider the following 
>> > example:
>> >  - old kernel which does not support (or has force disabled)
>> > I915_PARAM_HAS_EXEC_FENCE.
>> >  - new userspace which unconditionally advertises the fence v2
>> > extension IIRC one may tweak that things to only conditionally
>> > advertise it, but IMHO it's not worth the hassle.
>> >
>> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module) so
>> > focusing on one doesn't quite work.
>> >
>> >>> >>  - don't introduce unused variables (in make_current)
>> >>
>> >> Done.
>> >>
>> >>> >>  - the create fd for the old display surface (in make_current)
>> >>> >> seems bogus
>> >>
>> >> Done.
>> >>
>> > Did you drop it all together or changed to use some other surface?
>> > Would be nice to hear the reason why it was added - perhaps I'm
>> > missing something.
>>
>> We have to keep it, otherwise there would be no fence available at the time 
>> of
>> surface destruction, while, at least for Android, a fence can be passed to
>> window's cancelBuffer callback.
>>
>> >
>> > I think that we want a fence/fd for the new draw surface. Since
>> > otherwise one won't get created up until the first SwapBuffers call.
>>
>> I might be missing something, but wouldn't that insert a fence at the 
>> beginning
>> of command stream, before even doing anything? At least in Android use cases,
>> the only places we need the fence is in SwapBuffers and DestroySurface and 
>> the
>> fence should be inserted after all the commands for rendering into given
>> surface.
>>
>
> Emil,
>
> Tomasz sounds convincing to me here, I just went ahead with the comment to 
> try out and
> flatland worked even after removing that. Zhongmin can explain better but I 
> think in earlier
> revisions this was done for cancelBuffer to match with queueBuffer, I mean we 
> are passing
> valid fd for queueBuffer by doing this we would have a valid fd during 
> cancelBuffer.  Not
> sure if this is the reason / one of the reason.
>
> I will go ahead with rest of your comments if we are ok to keep fd for old 
> display surface
> in make_current.

My understanding is that nobody actually cares about the fence that
cancelBuffer returns, because the contents of the buffer are going to
be discarded anyway and the buffer doesn't go to the consumer (e.g.
flatland code that reads the timestamp). I even suspect that typically

Re: [Mesa-dev] [PATCH] loader: always include libxmlconfig on autotools build

2017-08-04 Thread Jan Vesely
On Fri, 2017-08-04 at 11:53 +0200, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> This aligns with the fact that we also check for EXPAT_LIBS
> unconditionally in configure.ac now. It should make all the
> various build permutations of Clover work (whether DRI is
> enabled or disabled in the build).
> 
> Cc: Aaron Watry 
> Cc: Emil Velikov 
> --
> This change keeps everything green on Travis, and it should fix
> the duplicate-symbol linker error seen by Aaron and others when
> building Clover.

It does. This patch fixes the last of the problems I've had since the
driconf changes.
Tested-by: Jan Vesely 

thanks,
Jan

> ---
>  src/gallium/targets/opencl/Makefile.am |  1 -
>  src/loader/Makefile.am | 13 +
>  2 files changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
> index e88fa0fd382..c9d2be7afd0 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
>   $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
>   $(top_builddir)/src/gallium/auxiliary/libgallium.la \
>   $(top_builddir)/src/util/libmesautil.la \
> - $(top_builddir)/src/util/libxmlconfig.la \
>   $(EXPAT_LIBS) \
>   $(LIBELF_LIBS) \
>   $(DLOPEN_LIBS) \
> diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
> index 8b197f2995c..5ed87820664 100644
> --- a/src/loader/Makefile.am
> +++ b/src/loader/Makefile.am
> @@ -33,21 +33,18 @@ AM_CPPFLAGS = \
>   $(XCB_DRI3_CFLAGS) \
>   $(LIBDRM_CFLAGS)
>  
> -libloader_la_CPPFLAGS = $(AM_CPPFLAGS)
> +libloader_la_CPPFLAGS = $(AM_CPPFLAGS) \
> + -DUSE_DRICONF
>  libloader_la_SOURCES = $(LOADER_C_FILES)
> -libloader_la_LIBADD =
> +libloader_la_LIBADD = \
> + $(top_builddir)/src/util/libxmlconfig.la
>  
>  if HAVE_DRICOMMON
>  libloader_la_CPPFLAGS += \
>   -I$(top_builddir)/src/util/ \
>   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
>   -I$(top_srcdir)/src/mesa/ \
> - -I$(top_srcdir)/src/mapi/ \
> - -DUSE_DRICONF
> -
> -libloader_la_LIBADD += \
> - $(top_builddir)/src/util/libxmlconfig.la
> -
> + -I$(top_srcdir)/src/mapi/
>  endif
>  
>  if HAVE_LIBDRM


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS

2017-08-04 Thread Marathe, Yogesh
Tomasz, Emil,

> -Original Message-
> From: Tomasz Figa [mailto:tf...@chromium.org]
> Sent: Friday, August 4, 2017 6:54 PM
> To: Emil Velikov 
> Cc: Marathe, Yogesh ; Antognolli, Rafael
> ; ML mesa-dev  d...@lists.freedesktop.org>; Wu, Zhongmin ; Gao,
> Shuo ; Liu, Zhiquan ; Daniel
> Stone ; Timothy Arceri ; Eric
> Engestrom ; Kenneth Graunke ;
> Kondapally, Kalyan ; Varad Gautam
> ; Rainer Hochecker ;
> Nicolai Hähnle 
> Subject: Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync 
> fence
> for Android OS
> 
> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov  wrote:
> >>> >>  - version check (2+) the fence extension, calling
> >>> >> .create_fence_fd() only when
> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD
> >>
> >> The check looks like below now, this is in dri2_surf_update_fence_fd() 
> >> before
> create_fence_fd is called.
> >>
> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) {
> >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence-
> >get_capabilities(dri2_dpy->dri_screen)) {
> >>   //create_fence_fd call
> >>}
> >> }
> >>
> > Close but no cigar.
> >
> > if (dri2_surf->enable_out_fence && dri2_dpy->fence &&
> > dri2_dpy->fence->base.version >= 2 &&
> > dri2_dpy->fence->get_capabilities) {
> >
> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
> > __DRI_FENCE_CAP_NATIVE_FD) {
> > //create_fence_fd call
> >}
> > }
> 
> If this needs so complicated series of checks, maybe it would make more sense
> to just set enable_out_fence based on availability of the capability at
> initialization time?

I liked this one compared to nested ifs in dri2_surf_update_fence_fd().
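
Something along these lines is what I understand the init-time variant to be. It is only a sketch: the `fake_*` structs and `FAKE_FENCE_CAP_NATIVE_FD` are hypothetical stand-ins, not the real dri_interface.h or egl_dri2 definitions, and `surface_init` is an invented name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the native fence fd capability bit. */
#define FAKE_FENCE_CAP_NATIVE_FD 0x1u

/* Hypothetical stand-in for the DRI fence extension. */
struct fake_fence_ext {
   int version;
   unsigned (*get_capabilities)(void *screen);
};

struct fake_display {
   const struct fake_fence_ext *fence;
   void *screen;
};

struct fake_surface {
   bool enable_out_fence;
};

/* All checks in one place: extension present, version 2+, callback
 * present, and the capability bit actually set. */
static bool display_has_native_fence_fd(const struct fake_display *dpy)
{
   assert(dpy);
   if (!dpy->fence || dpy->fence->version < 2 ||
       !dpy->fence->get_capabilities)
      return false;

   /* '&' tests the bit. '|' with a non-zero constant is always true. */
   return (dpy->fence->get_capabilities(dpy->screen) &
           FAKE_FENCE_CAP_NATIVE_FD) != 0;
}

/* Decide once at surface creation; later code only reads the flag. */
static void surface_init(struct fake_surface *surf,
                         const struct fake_display *dpy)
{
   assert(surf);
   surf->enable_out_fence = display_has_native_fence_fd(dpy);
}

/* Sample capability callbacks, for illustration only. */
static unsigned caps_none(void *screen)
{
   (void)screen;
   return 0;
}

static unsigned caps_native_fd(void *screen)
{
   (void)screen;
   return FAKE_FENCE_CAP_NATIVE_FD;
}
```

With that, dri2_surf_update_fence_fd() would only need to test enable_out_fence before calling create_fence_fd.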

> 
> >
> >> Overall, if I further go ahead and check, actually get_capabilities()
> >> ultimately returns based on has_exec_fence which depends on
> >> I915_PARAM_HAS_EXEC_FENCE. This is always set to true for i915 in
> >> kernel drv unless forced to false!! I'm not sure if that inner check of
> get_capabilities still makes sense. Isn't the first one sufficient?
> >>
> > Not sure what you mean with "first one", but consider the following example:
> >  - old kernel which does not support (or has force disabled)
> > I915_PARAM_HAS_EXEC_FENCE.
> >  - new userspace which unconditionally advertises the fence v2
> > extension IIRC one may tweak that things to only conditionally
> > advertise it, but IMHO it's not worth the hassle.
> >
> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module) so
> > focusing on one doesn't quite work.
> >
> >>> >>  - don't introduce unused variables (in make_current)
> >>
> >> Done.
> >>
> >>> >>  - the create fd for the old display surface (in make_current)
> >>> >> seems bogus
> >>
> >> Done.
> >>
> > Did you drop it all together or changed to use some other surface?
> > Would be nice to hear the reason why it was added - perhaps I'm
> > missing something.
> 
> We have to keep it, otherwise there would be no fence available at the time of
> surface destruction, while, at least for Android, a fence can be passed to
> window's cancelBuffer callback.
> 
> >
> > I think that we want a fence/fd for the new draw surface. Since
> > otherwise one won't get created up until the first SwapBuffers call.
> 
> I might be missing something, but wouldn't that insert a fence at the 
> beginning
> of command stream, before even doing anything? At least in Android use cases,
> the only places we need the fence is in SwapBuffers and DestroySurface and the
> fence should be inserted after all the commands for rendering into given
> surface.
> 

Emil,

Tomasz sounds convincing to me here. I just went ahead with the comment to try
it out, and flatland worked even after removing that. Zhongmin can explain
better, but I think in earlier revisions this was done for cancelBuffer to
match queueBuffer: we pass a valid fd for queueBuffer, and by doing this we
would also have a valid fd during cancelBuffer. Not sure if this is the reason,
or one of the reasons.

I will go ahead with the rest of your comments if we are OK to keep the fd for
the old display surface in make_current.

> Best regards,
> Tomasz
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

