Re: [Mesa-dev] [PATCH 21/30] i965/screen: Use ISL for doing image import checks
On Fri, Aug 4, 2017 at 2:16 AM, Rainer Hochecker wrote:
> This seems to break exporting 16bit vaapi images via drm buffers
>

Yes, I'm aware of the problem and there are two patches on the list which
should fix it:

https://patchwork.freedesktop.org/patch/170051/
https://patchwork.freedesktop.org/patch/170052/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH shader-db 1/1] shaders: Add Dolphin’s übershaders
These shaders have been generated by Dolphin 9649494f67 on Mesa
8c26b52349 for an HD4000 GPU. They include a lot of uniform branches,
mostly on integers, as well as switch statements branching on small and
bounded integers.

Signed-off-by: Emmanuel Gil Peyrot
---
The actual patch isn’t included because it was more than 1 MiB, I hosted
it on my website instead:
https://linkmauve.fr/files/0001-shaders-Add-Dolphin-s-bershaders.patch

 shaders/dolphin/ubershaders/102.shader_test | 1258 +
 shaders/dolphin/ubershaders/111.shader_test | 1268 +
 shaders/dolphin/ubershaders/12.shader_test  |  961 +++
 shaders/dolphin/ubershaders/120.shader_test | 1281 ++
 shaders/dolphin/ubershaders/129.shader_test | 1269 +
 shaders/dolphin/ubershaders/138.shader_test | 1279 ++
 shaders/dolphin/ubershaders/147.shader_test | 1292 ++
 shaders/dolphin/ubershaders/156.shader_test | 1280 ++
 shaders/dolphin/ubershaders/165.shader_test | 1290 ++
 shaders/dolphin/ubershaders/174.shader_test | 1303 ++
 shaders/dolphin/ubershaders/183.shader_test | 1291 ++
 shaders/dolphin/ubershaders/192.shader_test | 1301 ++
 shaders/dolphin/ubershaders/201.shader_test | 1314 ++
 shaders/dolphin/ubershaders/21.shader_test  |  949 +++
 shaders/dolphin/ubershaders/210.shader_test | 1302 ++
 shaders/dolphin/ubershaders/219.shader_test | 1312 ++
 shaders/dolphin/ubershaders/228.shader_test | 1325 +++
 shaders/dolphin/ubershaders/237.shader_test | 1313 ++
 shaders/dolphin/ubershaders/3.shader_test   |  948 +++
 shaders/dolphin/ubershaders/30.shader_test  | 1235 +
 shaders/dolphin/ubershaders/39.shader_test  | 1248 +
 shaders/dolphin/ubershaders/48.shader_test  | 1236 +
 shaders/dolphin/ubershaders/57.shader_test  | 1246 +
 shaders/dolphin/ubershaders/66.shader_test  | 1259 +
 shaders/dolphin/ubershaders/75.shader_test  | 1247 +
 shaders/dolphin/ubershaders/84.shader_test  | 1257 +
 shaders/dolphin/ubershaders/93.shader_test  | 1270 +
 27 files changed, 33534 insertions(+)
 create mode 100644 shaders/dolphin/ubershaders/102.shader_test
 create mode 100644 shaders/dolphin/ubershaders/111.shader_test
 create mode 100644 shaders/dolphin/ubershaders/12.shader_test
 create mode 100644 shaders/dolphin/ubershaders/120.shader_test
 create mode 100644 shaders/dolphin/ubershaders/129.shader_test
 create mode 100644 shaders/dolphin/ubershaders/138.shader_test
 create mode 100644 shaders/dolphin/ubershaders/147.shader_test
 create mode 100644 shaders/dolphin/ubershaders/156.shader_test
 create mode 100644 shaders/dolphin/ubershaders/165.shader_test
 create mode 100644 shaders/dolphin/ubershaders/174.shader_test
 create mode 100644 shaders/dolphin/ubershaders/183.shader_test
 create mode 100644 shaders/dolphin/ubershaders/192.shader_test
 create mode 100644 shaders/dolphin/ubershaders/201.shader_test
 create mode 100644 shaders/dolphin/ubershaders/21.shader_test
 create mode 100644 shaders/dolphin/ubershaders/210.shader_test
 create mode 100644 shaders/dolphin/ubershaders/219.shader_test
 create mode 100644 shaders/dolphin/ubershaders/228.shader_test
 create mode 100644 shaders/dolphin/ubershaders/237.shader_test
 create mode 100644 shaders/dolphin/ubershaders/3.shader_test
 create mode 100644 shaders/dolphin/ubershaders/30.shader_test
 create mode 100644 shaders/dolphin/ubershaders/39.shader_test
 create mode 100644 shaders/dolphin/ubershaders/48.shader_test
 create mode 100644 shaders/dolphin/ubershaders/57.shader_test
 create mode 100644 shaders/dolphin/ubershaders/66.shader_test
 create mode 100644 shaders/dolphin/ubershaders/75.shader_test
 create mode 100644 shaders/dolphin/ubershaders/84.shader_test
 create mode 100644 shaders/dolphin/ubershaders/93.shader_test

-- 
Emmanuel Gil Peyrot
Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS
Hi Yogesh,

On Sat, Aug 5, 2017 at 1:22 AM, Marathe, Yogesh wrote:
>> -Original Message-
>> From: Tomasz Figa [mailto:tf...@chromium.org]
>> Sent: Friday, August 4, 2017 9:39 PM
>> On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogesh wrote:
>> > Tomasz, Emil,
>> >
>> >> -Original Message-
>> >> From: Tomasz Figa [mailto:tf...@chromium.org]
>> >> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov wrote:
>> >> >>> >> - version check (2+) the fence extension, calling
>> >> >>> >> .create_fence_fd() only when
>> >> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD
>> >> >>
>> >> >> The check looks like below now, this is in
>> >> >> dri2_surf_update_fence_fd() before create_fence_fd is called.
>> >> >>
>> >> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) {
>> >> >>    if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen)) {
>> >> >>       //create_fence_fd call
>> >> >>    }
>> >> >> }
>> >> >>
>> >> > Close but no cigar.
>> >> >
>> >> > if (dri2_surf->enable_out_fence && dri2_dpy->fence &&
>> >> >     dri2_dpy->fence->base.version >= 2 &&
>> >> >     dri2_dpy->fence->get_capabilities) {
>> >> >
>> >> >    if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
>> >> >        __DRI_FENCE_CAP_NATIVE_FD) {
>> >> >       //create_fence_fd call
>> >> >    }
>> >> > }
>> >>
>> >> If this needs so complicated series of checks, maybe it would make
>> >> more sense to just set enable_out_fence based on availability of the
>> >> capability at initialization time?
>> >
>> > I liked this one compared to nested ifs in dri2_surf_update_fence_fd().
>> >
>> >>
>> >> >
>> >> >> Overall, if I further go ahead and check, actually
>> >> >> get_capabilities() ultimately returns based on has_exec_fence
>> >> >> which depends on I915_PARAM_HAS_EXEC_FENCE. This is always set to
>> >> >> true for i915 in kernel drv unless forced to false!! I'm not sure
>> >> >> if that inner check of get_capabilities still makes sense. Isn't
>> >> >> the first one sufficient?
>> >> >>
>> >> > Not sure what you mean with "first one", but consider the following example:
>> >> > - old kernel which does not support (or has force disabled)
>> >> > I915_PARAM_HAS_EXEC_FENCE.
>> >> > - new userspace which unconditionally advertises the fence v2
>> >> > extension IIRC one may tweak that things to only conditionally
>> >> > advertise it, but IMHO it's not worth the hassle.
>> >> >
>> >> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module)
>> >> > so focusing on one doesn't quite work.
>> >> >
>> >> >>> >> - don't introduce unused variables (in make_current)
>> >> >>
>> >> >> Done.
>> >> >>
>> >> >>> >> - the create fd for the old display surface (in make_current)
>> >> >>> >> seems bogus
>> >> >>
>> >> >> Done.
>> >> >>
>> >> > Did you drop it all together or changed to use some other surface?
>> >> > Would be nice to hear the reason why it was added - perhaps I'm
>> >> > missing something.
>> >>
>> >> We have to keep it, otherwise there would be no fence available at
>> >> the time of surface destruction, while, at least for Android, a fence
>> >> can be passed to window's cancelBuffer callback.
>> >>
>> >> >
>> >> > I think that we want a fence/fd for the new draw surface. Since
>> >> > otherwise one won't get created up until the first SwapBuffers call.
>> >>
>> >> I might be missing something, but wouldn't that insert a fence at the
>> >> beginning of command stream, before even doing anything? At least in
>> >> Android use cases, the only places we need the fence is in
>> >> SwapBuffers and DestroySurface and the fence should be inserted after
>> >> all the commands for rendering into given surface.
>> >>
>> >
>> > Emil,
>> >
>> > Tomasz sounds convincing to me here, I just went ahead with the
>> > comment to try out and flatland worked even after removing that.
>> > Zhongmin can explain better but I think in earlier revisions this was
>> > done for cancelBuffer to match with queueBuffer, I mean we are passing
>> > valid fd for queueBuffer by doing this we would have a valid fd during
>> > cancelBuffer. Not sure if this is the reason / one of the reason.
>> >
>> > I will go ahead with rest of your comments if we are ok to keep fd for
>> > old display surface in make_current.
>>
>> My understanding is that nobody actually cares about the fence that
>> cancelBuffer returns, because the contents of the buffer are going to be
>> discarded anyway and the buffer doesn't go to the consumer (e.g.
>> flatland code that reads the timestamp). I even suspect that typically
>> destroySurface would be called directly after swapBuffers and the surface
>> wouldn't have a buffer to cancel. You can easily check this by adding a print
>> before cancelBuffer call happens. So we might actually be fine with simpler
>> code that gets fence only for swapBuffers.
>>
>
> Sure. I can
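
The suggestion in the thread above (decide once, at initialization time, whether out-fences are usable) could look roughly like the sketch below. This is only an illustration under stated assumptions: `sketch_fence_ext`, `SKETCH_FENCE_CAP_NATIVE_FD` and `sketch_can_use_out_fence()` are hypothetical stand-ins, not the Mesa types or the code from the patch. It also shows why the capability bit has to be tested with `&`: with `|` the expression is non-zero no matter what the driver reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the native-fd capability bit. */
#define SKETCH_FENCE_CAP_NATIVE_FD 1u

/* Hypothetical stand-in for the fence extension; only the members
 * discussed in the thread are modelled. */
struct sketch_fence_ext {
   int version;                                /* extension version */
   unsigned (*get_capabilities)(void *screen); /* may be NULL on old versions */
};

/* Decide once, at initialization time, whether out-fences can be used.
 * The result would be cached in something like enable_out_fence. */
static bool
sketch_can_use_out_fence(const struct sketch_fence_ext *fence, void *screen)
{
   if (fence == NULL || fence->version < 2 || fence->get_capabilities == NULL)
      return false;

   /* Test the bit with '&'. Using '|' here would always be non-zero. */
   return (fence->get_capabilities(screen) & SKETCH_FENCE_CAP_NATIVE_FD) != 0;
}

/* Two fake capability callbacks, for illustration only. */
static unsigned
sketch_caps_native_fd(void *screen)
{
   (void) screen;
   return SKETCH_FENCE_CAP_NATIVE_FD;
}

static unsigned
sketch_caps_none(void *screen)
{
   (void) screen;
   return 0;
}
```

With the result cached at init, the per-frame path only needs to look at the cached flag before creating the fence fd, which avoids the nested checks debated above. A version 2 extension backed by `sketch_caps_none` (the "old kernel, new userspace" case Emil describes) yields false.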
Re: [Mesa-dev] [PATCH] radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)
On 4 August 2017 at 18:53, Nicolai Hähnle wrote:
> On 04.08.2017 04:51, Dave Airlie wrote:
>>
>> From: Dave Airlie
>>
>> This is a bug in the app, but I'd rather avoid hanging the GPU,
>> esp if someone is running in validation and it takes out their
>> development environment.
>>
>> v2: get it right, reverse the polarity.
>>
>> Signed-off-by: Dave Airlie
>> ---
>> src/amd/vulkan/radv_meta_resolve.c | 5 +
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/src/amd/vulkan/radv_meta_resolve.c
>> b/src/amd/vulkan/radv_meta_resolve.c
>> index 6cd0c38..6023e0f 100644
>> --- a/src/amd/vulkan/radv_meta_resolve.c
>> +++ b/src/amd/vulkan/radv_meta_resolve.c
>> @@ -382,6 +382,11 @@ void radv_CmdResolveImage(
>> radv_meta_save_graphics_reset_vport_scissor_novertex(_state,
>> cmd_buffer);
>> assert(src_image->info.samples > 1);
>> + if (src_image->info.samples <= 1) {
>> + /* this causes GPU hangs if we get past here */
>> + fprintf(stderr, "radv: Illegal resolve operation (src not
>> multisampled), will hang GPU.");
>> + return;
>
>
> If you really want to make sure developers get this right, you should
> probably abort(); here? Although that might then bug users... maybe an
> abort() that can be skipped by explicitly setting an environment variable?

Well ideally we do nothing and let GPU reset happen, but I think we've
got a bit of work in that area first!

This should be fine for now, hopefully they spot the missing rendering
if they attempt this.

Dave.
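
Nicolai's idea of an abort() that can be skipped through an environment variable might be sketched as follows. This is not what the patch does (the patch prints a message and returns), and the variable name `SKETCH_SKIP_ILLEGAL_RESOLVE_ABORT` as well as all `sketch_` identifiers are made up for illustration; radv has no such option that I know of.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variable name, not an existing radv option. */
#define SKETCH_SKIP_ABORT_ENV "SKETCH_SKIP_ILLEGAL_RESOLVE_ABORT"

enum sketch_resolve_action {
   SKETCH_RESOLVE_PROCEED, /* source is multisampled, carry on */
   SKETCH_RESOLVE_SKIP,    /* illegal, but the user opted out of abort() */
   SKETCH_RESOLVE_ABORT    /* illegal, the caller should abort() */
};

/* Pure decision helper: skip_env_value is the value of the opt-out
 * variable, or NULL when it is unset. */
static enum sketch_resolve_action
sketch_classify_resolve(unsigned src_samples, const char *skip_env_value)
{
   if (src_samples > 1)
      return SKETCH_RESOLVE_PROCEED;

   fprintf(stderr, "illegal resolve operation (src not multisampled)\n");

   return skip_env_value != NULL ? SKETCH_RESOLVE_SKIP : SKETCH_RESOLVE_ABORT;
}

/* Thin wrapper that reads the opt-out from the process environment. */
static enum sketch_resolve_action
sketch_check_resolve_source(unsigned src_samples)
{
   return sketch_classify_resolve(src_samples, getenv(SKETCH_SKIP_ABORT_ENV));
}
```

A caller would abort() on `SKETCH_RESOLVE_ABORT` and return early without recording the resolve on `SKETCH_RESOLVE_SKIP`. Keeping the decision separate from the getenv() call makes it easy to exercise both paths without touching the environment.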
Re: [Mesa-dev] [PATCH 21/30] i965/screen: Use ISL for doing image import checks
This seems to break exporting 16bit vaapi images via drm buffers
Re: [Mesa-dev] [PATCH 8/8] egl/drm: rename dri2_drm_create_surface()
On 5 August 2017 at 00:25, Emil Velikov wrote:
> From: Emil Velikov
>
> The function can handle only window surfaces, so let's rename it
> accordingly, killing the wrapper around it.
>
> Suggested-by: Eric Engestrom
> Signed-off-by: Emil Velikov
> ---
> New patch
> ---
> src/egl/drivers/dri2/platform_drm.c | 17 -
> 1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_drm.c
> b/src/egl/drivers/dri2/platform_drm.c
> index 8d56fcb7698..89ad9e0d10c 100644
> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -91,9 +91,9 @@ has_free_buffers(struct gbm_surface *_surf)
> }
>
> static _EGLSurface *
> -dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
> -_EGLConfig *conf, void *native_surface,
> -const EGLint *attrib_list)
> +dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
> + _EGLConfig *conf, void *native_window,

Fixed this locally to read native_surface, instead of native_window.

-Emil
Re: [Mesa-dev] No reloc for i965
On Friday, August 4, 2017 12:22:19 PM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-08-04 19:47:14)
> > On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > > Patch reordering from last time so that the cosmetic tweaks are done first
> > > and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> > > hopefully it doesn't look too bad and we can land at least as far as
> > > there (patch 8/10).
> > >
> > > Thanks,
> > > -Chris
> >
> > I split up some patches and pushed a modified version of this series.
> >
> > To ssh://git.freedesktop.org/git/mesa/mesa
> >    5c007203b73..6c530ad1160 master -> master
> >
> > Thanks a ton for getting us to NO_RELOC. I really like the new reloc
> > flags system as well. It's so much nicer!
>
> I've still got to win you over to using LUT indices (kernel side, there
> shouldn't be any case where it is worse, but the differences are easily
> dwarfed in typical cases where it is only about 10% faster, but any
> reduction inside the struct_mutex is a must), I see, and the per-context
> bo along with removing the auxiliary render_cache set...
>
> But now for something completely different...
> -Chris

I landed I915_EXEC_HANDLE_LUT too, actually.

I definitely want per-context BOs, but Jason and I had come up with
some patches for that as well, and I haven't had a chance to compare
your approach with ours to see which is better. I hope to do that soon.

--Ken
Re: [Mesa-dev] [PATCH] intel/genxml: Fix gen10 BLEND_STATE variable length packing
Oh, I saw it had the old xml and was assuming it didn't cause any errors, but
clearly I was wrong.

Reviewed-by: Rafael Antognolli

On Fri, Aug 04, 2017 at 10:21:43PM +, Scott D Phillips wrote:
> BLEND_STATE packing was modified to be variable-length in:
>
> 9670124e31 genxml: Make BLEND_STATE command support variable length array.
>
> The initial gen10.xml still had the old, fixed-length style
> definition for BLEND_STATE. So gen10_upload_blend_state would
> overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
> of all-zero entries when packing BLEND_STATE. This caused
> BLEND_STATE upload to not work at all.
>
> Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
> ---
> src/intel/genxml/gen10.xml | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
> index 23c2adb995..a7ae49ae65 100644
> --- a/src/intel/genxml/gen10.xml
> +++ b/src/intel/genxml/gen10.xml
> @@ -554,7 +554,7 @@
>
>
>
> -
> +
>
> type="bool"/>
>
> @@ -564,7 +564,7 @@
>
>
>
> -
> +
>
>
>
> --
> 2.11.0
>
Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency
On 4 August 2017 at 23:08, Dieter Nützel wrote:
> For the series:
>
> Tested-by: Dieter Nützel
>
> on RX580
>
> with Clover, vdpau and Nine.
>
> ./autogen.sh --prefix=/usr/local --with-dri-drivers=""
> --with-gallium-drivers=r600,radeonsi,swrast --with-platforms=drm,x11
> --enable-nine --enable-texture-float --enable-opencl
> --with-vulkan-drivers=radeon
>
Thanks Dieter.

I've refrained from pushing 2/3 and 3/3 since they seem incomplete.

-Emil
[Mesa-dev] [PATCH 3/8] egl: handle BAD_NATIVE_PIXMAP further up the stack
From: Emil Velikov

The basic (null) check is identical across all backends.
Just move it to the top.

v2:
 - Split the WINDOW vs PIXMAP into separate patches
 - Move check after the dpy and config - dEQP expects so

Cc: Eric Engestrom
Signed-off-by: Emil Velikov
---
 src/egl/drivers/dri2/platform_x11.c | 5 -
 src/egl/main/eglapi.c               | 3 +++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 00cab577b77..e2007a0313e 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -234,11 +234,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
                         dri2_surf->drawable, dri2_dpy->screen->root,
                         dri2_surf->base.Width, dri2_surf->base.Height);
    } else {
-      if (!drawable) {
-         assert(type == EGL_PIXMAP_BIT_BIT)
-         _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
-         goto cleanup_surf;
-      }
       dri2_surf->drawable = drawable;
    }
 
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index c5e3955c48c..3ca3dd4c7c1 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1021,6 +1021,9 @@ _eglCreatePixmapSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
    if ((conf->SurfaceType & EGL_PIXMAP_BIT) == 0)
       RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
 
+   if (native_pixmap == NULL)
+      RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_PIXMAP, EGL_NO_SURFACE);
+
    surf = drv->API.CreatePixmapSurface(drv, disp, conf, native_pixmap,
                                        attrib_list);
    ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
-- 
2.13.3
[Mesa-dev] [PATCH 6/8] egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface
From: Emil Velikov

Signed-off-by: Emil Velikov
Reviewed-by: Matt Turner
---
 src/egl/drivers/dri2/platform_x11.c      | 2 +-
 src/egl/drivers/dri2/platform_x11_dri3.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index ce5450155aa..ce4ba6b6e15 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -359,7 +359,7 @@ dri2_x11_create_pbuffer_surface(_EGLDriver *drv, _EGLDisplay *disp,
                                 _EGLConfig *conf, const EGLint *attrib_list)
 {
    return dri2_x11_create_surface(drv, disp, EGL_PBUFFER_BIT, conf,
-                                  XCB_WINDOW_NONE, attrib_list);
+                                  NULL, attrib_list);
 }
 
 static EGLBoolean
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index b88374c1cbb..9c018168b1c 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -238,7 +238,7 @@ dri3_create_pbuffer_surface(_EGLDriver *drv, _EGLDisplay *disp,
                             _EGLConfig *conf, const EGLint *attrib_list)
 {
    return dri3_create_surface(drv, disp, EGL_PBUFFER_BIT, conf,
-                              XCB_WINDOW_NONE, attrib_list);
+                              NULL, attrib_list);
 }
 
 static EGLBoolean
-- 
2.13.3
[Mesa-dev] [PATCH 8/8] egl/drm: rename dri2_drm_create_surface()
From: Emil Velikov

The function can handle only window surfaces, so let's rename it
accordingly, killing the wrapper around it.

Suggested-by: Eric Engestrom
Signed-off-by: Emil Velikov
---
New patch
---
 src/egl/drivers/dri2/platform_drm.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
index 8d56fcb7698..89ad9e0d10c 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -91,9 +91,9 @@ has_free_buffers(struct gbm_surface *_surf)
 }
 
 static _EGLSurface *
-dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
-                        _EGLConfig *conf, void *native_surface,
-                        const EGLint *attrib_list)
+dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
+                               _EGLConfig *conf, void *native_window,
+                               const EGLint *attrib_list)
 {
    struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
    struct dri2_egl_config *dri2_conf = dri2_egl_config(conf);
@@ -110,7 +110,7 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
       return NULL;
    }
 
-   if (!_eglInitSurface(&dri2_surf->base, disp, type, conf, attrib_list))
+   if (!_eglInitSurface(&dri2_surf->base, disp, EGL_WINDOW_BIT, conf, attrib_list))
       goto cleanup_surf;
 
    surf = gbm_dri_surface(window);
@@ -149,15 +149,6 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
 }
 
 static _EGLSurface *
-dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
-                               _EGLConfig *conf, void *native_window,
-                               const EGLint *attrib_list)
-{
-   return dri2_drm_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
-                                  native_window, attrib_list);
-}
-
-static _EGLSurface *
 dri2_drm_create_pixmap_surface(_EGLDriver *drv, _EGLDisplay *disp,
                                _EGLConfig *conf, void *native_window,
                                const EGLint *attrib_list)
-- 
2.13.3
[Mesa-dev] [PATCH 2/8] egl: drop unreachable BAD_NATIVE_WINDOW conditions
From: Emil Velikov

The code in _eglCreatePixmapSurfaceCommon() already has a NULL check
which handles the condition. There's no point in checking again further
down the stack.

v2: Split the WINDOW vs PIXMAP into separate patches

Cc: Eric Engestrom
Signed-off-by: Emil Velikov
---
 src/egl/drivers/dri2/platform_android.c | 2 +-
 src/egl/drivers/dri2/platform_drm.c     | 5 -
 src/egl/drivers/dri2/platform_wayland.c | 5 -
 src/egl/drivers/dri2/platform_x11.c     | 6 ++
 4 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c
index 50a82486956..beb474025f7 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -329,7 +329,7 @@ droid_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    if (type == EGL_WINDOW_BIT) {
       int format;
 
-      if (!window || window->common.magic != ANDROID_NATIVE_WINDOW_MAGIC) {
+      if (window->common.magic != ANDROID_NATIVE_WINDOW_MAGIC) {
          _eglError(EGL_BAD_NATIVE_WINDOW, "droid_create_surface");
          goto cleanup_surface;
       }
diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
index a952aa54560..7ea43e62010 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -115,11 +115,6 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
 
    switch (type) {
    case EGL_WINDOW_BIT:
-      if (!window) {
-         _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
-         goto cleanup_surf;
-      }
-
       surf = gbm_dri_surface(window);
       dri2_surf->gbm_surf = surf;
       dri2_surf->base.Width = surf->base.width;
diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c
index 38fdfe974fa..dcc777a3e8b 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -162,11 +162,6 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
       dri2_surf->format = WL_SHM_FORMAT_ARGB;
    }
 
-   if (!window) {
-      _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
-      goto cleanup_surf;
-   }
-
    dri2_surf->wl_win = window;
    dri2_surf->wl_queue = wl_display_create_queue(dri2_dpy->wl_dpy);
    if (!dri2_surf->wl_queue) {
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 7b5a1770bd7..00cab577b77 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -235,10 +235,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
                         dri2_surf->base.Width, dri2_surf->base.Height);
    } else {
       if (!drawable) {
-         if (type == EGL_WINDOW_BIT)
-            _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
-         else
-            _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
+         assert(type == EGL_PIXMAP_BIT_BIT)
+         _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface");
          goto cleanup_surf;
       }
       dri2_surf->drawable = drawable;
-- 
2.13.3
[Mesa-dev] [PATCH 7/8] egl/drm: remove unreachable code in dri2_drm_create_surface()
From: Emil Velikov

The function can be called only when the type is EGL_WINDOW_BIT.
Remove the unneeded switch statement.

Signed-off-by: Emil Velikov
---
 src/egl/drivers/dri2/platform_drm.c | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
index 7ea43e62010..8d56fcb7698 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -92,13 +92,13 @@ has_free_buffers(struct gbm_surface *_surf)
 
 static _EGLSurface *
 dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
-                        _EGLConfig *conf, void *native_window,
+                        _EGLConfig *conf, void *native_surface,
                         const EGLint *attrib_list)
 {
    struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
    struct dri2_egl_config *dri2_conf = dri2_egl_config(conf);
    struct dri2_egl_surface *dri2_surf;
-   struct gbm_surface *window = native_window;
+   struct gbm_surface *window = native_surface;
    struct gbm_dri_surface *surf;
    const __DRIconfig *config;
 
@@ -113,17 +113,11 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    if (!_eglInitSurface(&dri2_surf->base, disp, type, conf, attrib_list))
       goto cleanup_surf;
 
-   switch (type) {
-   case EGL_WINDOW_BIT:
-      surf = gbm_dri_surface(window);
-      dri2_surf->gbm_surf = surf;
-      dri2_surf->base.Width = surf->base.width;
-      dri2_surf->base.Height = surf->base.height;
-      surf->dri_private = dri2_surf;
-      break;
-   default:
-      goto cleanup_surf;
-   }
+   surf = gbm_dri_surface(window);
+   dri2_surf->gbm_surf = surf;
+   dri2_surf->base.Width = surf->base.width;
+   dri2_surf->base.Height = surf->base.height;
+   surf->dri_private = dri2_surf;
 
    config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
                                 dri2_surf->base.GLColorspace);
-- 
2.13.3
[Mesa-dev] [PATCH 1/8] egl: add dri2_setup_swap_interval helper
From: Emil Velikov

The current two implementations - X11 and Wayland were identical,
barring the upper limit.

Instead of having same code twice - introduce a helper and pass the
limit as an argument. Thus as Android/DRM/others get support - they only
need to call the function ;-)

v2: Rebase on top of keeping ::swap_available

Signed-off-by: Emil Velikov
Reviewed-by: Eric Engestrom (v1)
---
 src/egl/drivers/dri2/egl_dri2.c         | 35 +++
 src/egl/drivers/dri2/egl_dri2.h         | 3 +++
 src/egl/drivers/dri2/platform_wayland.c | 37 +
 src/egl/drivers/dri2/platform_x11.c     | 37 ++---
 4 files changed, 49 insertions(+), 63 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 733659d547f..936b7c5199e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -728,6 +728,41 @@ dri2_setup_screen(_EGLDisplay *disp)
    }
 }
 
+void
+dri2_setup_swap_interval(_EGLDisplay *disp, int max_swap_interval)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   GLint vblank_mode = DRI_CONF_VBLANK_DEF_INTERVAL_1;
+
+   /* Allow driconf to override applications.*/
+   if (dri2_dpy->config)
+      dri2_dpy->config->configQueryi(dri2_dpy->dri_screen,
+                                     "vblank_mode", &vblank_mode);
+   switch (vblank_mode) {
+   case DRI_CONF_VBLANK_NEVER:
+      dri2_dpy->min_swap_interval = 0;
+      dri2_dpy->max_swap_interval = 0;
+      dri2_dpy->default_swap_interval = 0;
+      break;
+   case DRI_CONF_VBLANK_ALWAYS_SYNC:
+      dri2_dpy->min_swap_interval = 1;
+      dri2_dpy->max_swap_interval = max_swap_interval;
+      dri2_dpy->default_swap_interval = 1;
+      break;
+   case DRI_CONF_VBLANK_DEF_INTERVAL_0:
+      dri2_dpy->min_swap_interval = 0;
+      dri2_dpy->max_swap_interval = max_swap_interval;
+      dri2_dpy->default_swap_interval = 0;
+      break;
+   default:
+   case DRI_CONF_VBLANK_DEF_INTERVAL_1:
+      dri2_dpy->min_swap_interval = 0;
+      dri2_dpy->max_swap_interval = max_swap_interval;
+      dri2_dpy->default_swap_interval = 1;
+      break;
+   }
+}
+
 /* All platforms but DRM call this function to create the screen and populate
  * the driver_configs. DRM inherits that information from its display - GBM.
  */
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index ccfefef61fc..751e7a4e2f3 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -370,6 +370,9 @@ dri2_load_driver(_EGLDisplay *disp);
 void
 dri2_setup_screen(_EGLDisplay *disp);
 
+void
+dri2_setup_swap_interval(_EGLDisplay *disp, int max_swap_interval);
+
 EGLBoolean
 dri2_load_driver_swrast(_EGLDisplay *disp);
 
diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c
index 73966b7c504..38fdfe974fa 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -925,7 +925,7 @@ dri2_wl_query_buffer_age(_EGLDriver *drv,
 static EGLBoolean
 dri2_wl_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
 {
-   return dri2_wl_swap_buffers_with_damage (drv, disp, draw, NULL, 0);
+   return dri2_wl_swap_buffers_with_damage(drv, disp, draw, NULL, 0);
 }
 
 static struct wl_buffer *
@@ -1140,41 +1140,14 @@ static const struct wl_registry_listener registry_listener_drm = {
 };
 
 static void
-dri2_wl_setup_swap_interval(struct dri2_egl_display *dri2_dpy)
+dri2_wl_setup_swap_interval(_EGLDisplay *disp)
 {
-   GLint vblank_mode = DRI_CONF_VBLANK_DEF_INTERVAL_1;
-
    /* We can't use values greater than 1 on Wayland because we are using the
     * frame callback to synchronise the frame and the only way we be sure to
     * get a frame callback is to attach a new buffer. Therefore we can't just
     * sit drawing nothing to wait until the next ‘n’ frame callbacks */
 
-   if (dri2_dpy->config)
-      dri2_dpy->config->configQueryi(dri2_dpy->dri_screen,
-                                     "vblank_mode", &vblank_mode);
-   switch (vblank_mode) {
-   case DRI_CONF_VBLANK_NEVER:
-      dri2_dpy->min_swap_interval = 0;
-      dri2_dpy->max_swap_interval = 0;
-      dri2_dpy->default_swap_interval = 0;
-      break;
-   case DRI_CONF_VBLANK_ALWAYS_SYNC:
-      dri2_dpy->min_swap_interval = 1;
-      dri2_dpy->max_swap_interval = 1;
-      dri2_dpy->default_swap_interval = 1;
-      break;
-   case DRI_CONF_VBLANK_DEF_INTERVAL_0:
-      dri2_dpy->min_swap_interval = 0;
-      dri2_dpy->max_swap_interval = 1;
-      dri2_dpy->default_swap_interval = 0;
-      break;
-   default:
-   case DRI_CONF_VBLANK_DEF_INTERVAL_1:
-      dri2_dpy->min_swap_interval = 0;
-      dri2_dpy->max_swap_interval = 1;
-      dri2_dpy->default_swap_interval = 1;
-      break;
-   }
+   dri2_setup_swap_interval(disp, 1);
 }
 
 static const struct
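
The mapping that the new helper centralises can be summarised in a small standalone sketch. The names below (`sketch_vblank_mode`, `sketch_swap_limits`, `sketch_swap_limits_for()`) are hypothetical and exist only for illustration. The real helper writes into `dri2_dpy` and reads `vblank_mode` from driconf, while this version just returns the three values so the table is easy to check.

```c
#include <assert.h>

/* Hypothetical mirror of the four vblank_mode settings used in the patch. */
enum sketch_vblank_mode {
   SKETCH_VBLANK_NEVER,
   SKETCH_VBLANK_DEF_INTERVAL_0,
   SKETCH_VBLANK_DEF_INTERVAL_1,
   SKETCH_VBLANK_ALWAYS_SYNC
};

struct sketch_swap_limits {
   int min; /* min_swap_interval */
   int max; /* max_swap_interval */
   int def; /* default_swap_interval */
};

/* Same table as in the patch, with the platform limit passed in by the
 * caller (1 on Wayland in the patch above). */
static struct sketch_swap_limits
sketch_swap_limits_for(enum sketch_vblank_mode mode, int max_swap_interval)
{
   /* Start from the DEF_INTERVAL_1 row, which is also the fallback. */
   struct sketch_swap_limits limits = { 0, max_swap_interval, 1 };

   assert(max_swap_interval >= 0);

   switch (mode) {
   case SKETCH_VBLANK_NEVER:
      limits.max = 0;
      limits.def = 0;
      break;
   case SKETCH_VBLANK_ALWAYS_SYNC:
      limits.min = 1;
      break;
   case SKETCH_VBLANK_DEF_INTERVAL_0:
      limits.def = 0;
      break;
   case SKETCH_VBLANK_DEF_INTERVAL_1:
   default:
      break;
   }

   return limits;
}
```

Only the upper bound differs per platform, which is why a single `max_swap_interval` argument is enough to replace the duplicated switch statements.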
[Mesa-dev] [PATCH 5/8] egl: Clean up native_type vs drawable mess
From: Matt Turner

The next patch is going to stop passing XCB_WINDOW_NONE (of type
xcb_window_enum_t) as an argument where these functions expect a void *,
which clang does not appreciate.

This patch cleans things up to better convince me and reviewers that it's
safe to do that.

v2: Emil Velikov: rebase/integrate with series

Signed-off-by: Emil Velikov
---
 src/egl/drivers/dri2/platform_x11.c      | 7 ++-
 src/egl/drivers/dri2/platform_x11_dri3.c | 6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index e2007a0313e..ce5450155aa 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -210,12 +210,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    xcb_get_geometry_cookie_t cookie;
    xcb_get_geometry_reply_t *reply;
    xcb_generic_error_t *error;
-   xcb_drawable_t drawable;
    const __DRIconfig *config;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
    (void) drv;
 
    dri2_surf = malloc(sizeof *dri2_surf);
@@ -234,7 +230,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
                         dri2_surf->drawable, dri2_dpy->screen->root,
                         dri2_surf->base.Width, dri2_surf->base.Height);
    } else {
-      dri2_surf->drawable = drawable;
+      STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+      dri2_surf->drawable = (uintptr_t) native_surface;
    }
 
    config = dri2_get_dri_config(dri2_conf, type,
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index 3a0efc6ccc9..b88374c1cbb 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -141,9 +141,6 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    const __DRIconfig *dri_config;
    xcb_drawable_t drawable;
 
-   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
-   drawable = (uintptr_t) native_surface;
-
    (void) drv;
 
    dri3_surf = calloc(1, sizeof *dri3_surf);
@@ -160,6 +157,9 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
       xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize,
                         drawable, dri2_dpy->screen->root,
                         dri3_surf->base.Width, dri3_surf->base.Height);
+   } else {
+      STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
+      drawable = (uintptr_t) native_surface;
    }
 
    dri_config = dri2_get_dri_config(dri2_conf, type,
-- 
2.13.3
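For readers unfamiliar with the idiom in the patch above, here is a minimal sketch of carrying a 32-bit X11 drawable ID through a void * handle and getting it back via uintptr_t. The names drawable_id, drawable_from_native and native_from_drawable are hypothetical and exist only for this illustration. The size check uses C11 static_assert rather than Mesa's STATIC_ASSERT macro, and it compares against sizeof(void *), an assumption that holds on common platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for xcb_drawable_t, which XCB defines as a 32-bit ID. */
typedef uint32_t drawable_id;

/* Same idea as the patch's STATIC_ASSERT: the opaque handle must be
 * exactly as wide as uintptr_t for the cast below to be meaningful. */
static_assert(sizeof(uintptr_t) == sizeof(void *),
              "native handle must fit in uintptr_t");

/* Hypothetical helper: recover the integer ID stored in an opaque handle. */
static drawable_id
drawable_from_native(void *native_surface)
{
   return (drawable_id)(uintptr_t)native_surface;
}

/* Hypothetical helper: what a caller does when it passes an XID as void *. */
static void *
native_from_drawable(drawable_id id)
{
   return (void *)(uintptr_t)id;
}
```

Doing the cast only in the branch that actually consumes the handle, as the patch does, means the pixmap path never reinterprets a value it does not use.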
[Mesa-dev] [PATCH 4/8] egl: rework input validation order in _eglCreateWindowSurfaceCommon
From: Emil Velikov

As mentioned in the previous commit, the negative tests in dEQP expect the
arguments to be evaluated in a particular order: first the dpy, then the
config, followed by the surface/window.

Move the check further down, or executing the test below will produce the
following error.

   dEQP-EGL.functional.negative_api.create_pbuffer_surface

   eglCreateWindowSurface(0x9bfff0f150, 0x, 0x, { EGL_NONE });
   // 0x returned
   // ERROR expected: EGL_BAD_CONFIG, Got: EGL_BAD_NATIVE_WINDOW

Cc: Mark Janes
Cc: Chad Versace
Signed-off-by: Emil Velikov
---
Mark, IMHO the CI does the impossible and passes the test. Perhaps it's
worth looking into how/why it does so - I don't know. I'll pipe the series
through Jenkins tomorrow - don't want to stall things for the guys still
working.

Chad, I see that in the EGL_MESA_surfaceless implementation you explicitly
mentioned that the surface is checked prior to the config. Wouldn't it be
better to stay consistent and move those, as per the above? AFAICT the spec
does not explicitly dictate the order.
---
 src/egl/main/eglapi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3ca3dd4c7c1..3b0f896f74c 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -872,10 +872,6 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
    _EGLSurface *surf;
    EGLSurface ret;
 
-
-   if (native_window == NULL)
-      RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
-
 #ifdef HAVE_SURFACELESS_PLATFORM
    if (disp && disp->Platform == _EGL_PLATFORM_SURFACELESS) {
       /* From the EGL_MESA_platform_surfaceless spec (v1):
@@ -899,6 +895,9 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
    if ((conf->SurfaceType & EGL_WINDOW_BIT) == 0)
       RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
 
+   if (native_window == NULL)
+      RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
+
    surf = drv->API.CreateWindowSurface(drv, disp, conf, native_window,
                                        attrib_list);
    ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
-- 
2.13.3
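The ordering that the dEQP test expects can be sketched in isolation as below. This is only an illustration under stated assumptions: the enum values, struct config and validate_create_window are hypothetical stand-ins, not the EGL or Mesa definitions. It shows why a call with both a bad config and a NULL window has to report the config error.

```c
#include <stddef.h>

/* Hypothetical result codes standing in for the EGL_BAD_* errors. */
enum check_result {
   CHECK_OK,
   BAD_DISPLAY,
   BAD_CONFIG,
   BAD_MATCH,
   BAD_NATIVE_WINDOW
};

/* Hypothetical config: only records whether window surfaces are allowed. */
struct config {
   int supports_window;
};

/* Validate in the order the test expects: display, then config,
 * then the native window. The first failing check decides the error. */
static enum check_result
validate_create_window(const void *display,
                       const struct config *conf,
                       const void *native_window)
{
   if (display == NULL)
      return BAD_DISPLAY;
   if (conf == NULL)
      return BAD_CONFIG;
   if (!conf->supports_window)
      return BAD_MATCH;
   if (native_window == NULL)
      return BAD_NATIVE_WINDOW;
   return CHECK_OK;
}
```

With the window check first, the same call would report BAD_NATIVE_WINDOW instead, which is the mismatch quoted in the commit message.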
[Mesa-dev] [PATCH] intel/genxml: Fix gen10 BLEND_STATE variable length packing
BLEND_STATE packing was modified to be variable-length in:

   9670124e31 genxml: Make BLEND_STATE command support variable length array.

The initial gen10.xml still had the old, fixed-length style definition for
BLEND_STATE. So gen10_upload_blend_state would overwrite the packed
BLEND_STATE_ENTRYs with its own fixed array of all-zero entries when packing
BLEND_STATE. This caused BLEND_STATE upload to not work at all.

Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
---
 src/intel/genxml/gen10.xml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index 23c2adb995..a7ae49ae65 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -554,7 +554,7 @@
-
+
@@ -564,7 +564,7 @@
-
+
-- 
2.11.0
Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency
For the series:

Tested-by: Dieter Nützel

on RX580 with Clover, vdpau and Nine.

./autogen.sh --prefix=/usr/local --with-dri-drivers="" --with-gallium-drivers=r600,radeonsi,swrast --with-platforms=drm,x11 --enable-nine --enable-texture-float --enable-opencl --with-vulkan-drivers=radeon

Dieter

On 04.08.2017 20:18, Emil Velikov wrote:

From: Emil Velikov

Currently xmlconfig is conditionally used, only when --enable-dri is
available. As the library has moved to src/util and has a wider userbase,
this guard is no longer correct. Strictly speaking, it hasn't been correct
since the introduction of xmlconfig into st/nine a while ago.

Unconditionally enable xmlconfig and drop the linking. As said before there
are other users of the library, so depending on the configure options we
will get multiple definitions of said symbols.

NOTE: To avoid breaking other combinations, this commit adds the xmlconfig
link to the required places - throughout gallium and the DRI loaders.

Cc: Nicolai Hähnle
Cc: Aaron Watry
Signed-off-by: Emil Velikov
---
Nicolai, here is an alternative solution. I have a very slight inclination
towards this one over your earlier patch. But either one should do, really.
--- src/egl/Makefile.am | 8 ++-- src/gallium/auxiliary/pipe-loader/Makefile.am | 6 -- src/gallium/targets/opencl/Makefile.am| 1 - src/gbm/Makefile.am | 1 + src/glx/Makefile.am | 4 +++- src/loader/Makefile.am| 15 ++- 6 files changed, 16 insertions(+), 19 deletions(-) diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index ecaf148aaec..bb8ec9745dd 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \ $(dri2_backend_FILES) \ $(dri3_backend_FILES) -libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la -libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB) +libEGL_common_la_LIBADD += \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ + $(DLOPEN_LIBS) \ + $(LIBDRM_LIBS) \ + $(CLOCK_LIB) GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \ generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \ diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am b/src/gallium/auxiliary/pipe-loader/Makefile.am index 4ebfc97e6d9..878159f2343 100644 --- a/src/gallium/auxiliary/pipe-loader/Makefile.am +++ b/src/gallium/auxiliary/pipe-loader/Makefile.am @@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \ endif libpipe_loader_static_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la libpipe_loader_dynamic_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la EXTRA_DIST = SConscript diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index e88fa0fd382..c9d2be7afd0 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \ $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ 
$(top_builddir)/src/util/libmesautil.la \ - $(top_builddir)/src/util/libxmlconfig.la \ $(EXPAT_LIBS) \ $(LIBELF_LIBS) \ $(DLOPEN_LIBS) \ diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am index de8396000b7..7a9a12f87a0 100644 --- a/src/gbm/Makefile.am +++ b/src/gbm/Makefile.am @@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \ libgbm_la_LIBADD = \ $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ $(DLOPEN_LIBS) if HAVE_PLATFORM_WAYLAND diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am index b306bcc08db..34600475d98 100644 --- a/src/glx/Makefile.am +++ b/src/glx/Makefile.am @@ -97,7 +97,9 @@ libglx_la_SOURCES = \ singlepix.c \ vertarr.c -libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la +libglx_la_LIBADD = \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la if HAVE_DRISW libglx_la_SOURCES += \ diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am index 8b197f2995c..74ac6c51e77 100644 --- a/src/loader/Makefile.am +++ b/src/loader/Makefile.am @@ -26,6 +26,8 @@ EXTRA_DIST = SConscript noinst_LTLIBRARIES = libloader.la AM_CPPFLAGS = \ + -I$(top_builddir)/src/util/ \ + -DUSE_DRICONF \ $(DEFINES) \ -I$(top_srcdir)/include \ -I$(top_srcdir)/src \ @@ -37,19 +39,6 @@ libloader_la_CPPFLAGS =
Re: [Mesa-dev] [PATCH 10/12] egl/drm: remove unreachable code in dri2_drm_create_surface()
On 4 August 2017 at 11:03, Eric Engestrom wrote:
> On Thursday, 2017-08-03 19:29:36 +0100, Emil Velikov wrote:
>> From: Emil Velikov
>>
>> The function can be called only when the type is EGL_WINDOW_BIT.
>> Remove the unneeded switch statement.
>
> I take it we plan on never supporting pbuffers or pixmaps in platform_drm?
>
Pixmaps are explicitly forbidden, in the EGL platform gbm/wayland/android
(yes there is one) spec.
Pbuffers on the other hand are not mentioned at all in the ^^+x11 ones.

> If so, I'd rather fold dri2_drm_create_surface() into
> dri2_drm_create_window_surface(), as `type` is meaningless now and should
> be dropped, and without it the latter is an empty pass-through function.
> (can/should be a separate commit, but please send both as a single series)
>
Ack. Will respin the lot, splitting the unrelated patches into separate series.

> Reviewed-by: Eric Engestrom
>
Thanks
Emil
Re: [Mesa-dev] [PATCH 08/11] i965/miptree: Call alloc_aux in create_for_bo
On 2017-08-02 13:35:33, Jason Ekstrand wrote:
> Originally, I had moved it to the caller to make some things easier when
> adding the CCS modifier.  However, this broke DRI2 because
> intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
> calls intel_miptree_alloc_aux.  Also, in hindsight, it should be pretty
> easy to make the CCS modifier stuff work even if create_for_bo allocates
> the CCS when DISABLE_AUX is not set.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101925

I guess you want to drop this based on Tapani's feedback.

Reviewed-by: Jordan Justen

> Cc: Tapani Palli
> Cc: "17.2"
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 910bb46..305912c 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -839,9 +839,15 @@ intel_miptree_create_for_bo(struct brw_context *brw,
>     mt->bo = bo;
>     mt->offset = offset;
>
> -   if (!(layout_flags & MIPTREE_LAYOUT_DISABLE_AUX))
> +   if (!(layout_flags & MIPTREE_LAYOUT_DISABLE_AUX)) {
>        intel_miptree_choose_aux_usage(brw, mt);
>
> +      if (!intel_miptree_alloc_aux(brw, mt)) {
> +         intel_miptree_release(&mt);
> +         return NULL;
> +      }
> +   }
> +
>     return mt;
>  }
>
> @@ -978,11 +984,6 @@ intel_miptree_create_for_dri_image(struct brw_context *brw,
>     if (is_winsys_image)
>        image->bo->cache_coherent = false;
>
> -   if (!intel_miptree_alloc_aux(brw, mt)) {
> -      intel_miptree_release(&mt);
> -      return NULL;
> -   }
> -
>     return mt;
>  }
>
> --
> 2.5.0.400.gff86faf
[Mesa-dev] [PATCH 01/12] i965: Only put external handles into the handle ht
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 36 +++---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index e1036f25a4..844ccaf1e5 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -363,7 +363,6 @@ retry:
       }
 
       bo->gem_handle = create.handle;
-      _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
 
       bo->bufmgr = bufmgr;
       bo->align = alignment;
@@ -555,7 +554,6 @@ bo_free(struct brw_bo *bo)
 {
    struct brw_bufmgr *bufmgr = bo->bufmgr;
    struct drm_gem_close close;
-   struct hash_entry *entry;
    int ret;
 
    if (bo->map_cpu) {
@@ -571,12 +569,17 @@ bo_free(struct brw_bo *bo)
       drm_munmap(bo->map_gtt, bo->size);
    }
 
-   if (bo->global_name) {
-      entry = _mesa_hash_table_search(bufmgr->name_table, &bo->global_name);
-      _mesa_hash_table_remove(bufmgr->name_table, entry);
+   if (bo->external) {
+      struct hash_entry *entry;
+
+      if (bo->global_name) {
+         entry = _mesa_hash_table_search(bufmgr->name_table, &bo->global_name);
+         _mesa_hash_table_remove(bufmgr->name_table, entry);
+      }
+
+      entry = _mesa_hash_table_search(bufmgr->handle_table, &bo->gem_handle);
+      _mesa_hash_table_remove(bufmgr->handle_table, entry);
    }
-   entry = _mesa_hash_table_search(bufmgr->handle_table, &bo->gem_handle);
-   _mesa_hash_table_remove(bufmgr->handle_table, entry);
 
    /* Close this object */
    memclear(close);
@@ -1161,12 +1164,20 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd)
 {
    struct brw_bufmgr *bufmgr = bo->bufmgr;
 
+   if (!bo->external) {
+      pthread_mutex_lock(&bufmgr->lock);
+      if (!bo->external) {
+         _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
+         bo->external = true;
+      }
+      pthread_mutex_unlock(&bufmgr->lock);
+   }
+
    if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
                           DRM_CLOEXEC, prime_fd) != 0)
       return -errno;
 
    bo->reusable = false;
-   bo->external = true;
 
    return 0;
 }
@@ -1185,14 +1196,17 @@ brw_bo_flink(struct brw_bo *bo, uint32_t *name)
          return -errno;
 
       pthread_mutex_lock(&bufmgr->lock);
+      if (!bo->external) {
+         _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);
+         bo->external = true;
+      }
       if (!bo->global_name) {
          bo->global_name = flink.name;
-         bo->reusable = false;
-         bo->external = true;
-
          _mesa_hash_table_insert(bufmgr->name_table, &bo->global_name, bo);
       }
       pthread_mutex_unlock(&bufmgr->lock);
+
+      bo->reusable = false;
    }
 
    *name = bo->global_name;
-- 
2.13.3
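The bookkeeping in this patch can be read as "a buffer enters the handle table the first time it is exported, and leaves it on free only if it was ever exported". Below is a minimal single-threaded sketch of that rule. struct buffer, struct handle_table, mark_external and buffer_free are hypothetical names for this illustration, the table is reduced to an entry counter, and the locking that the patch performs around the insert is deliberately left out.

```c
#include <stdbool.h>

/* Toy buffer object: only the fields the rule depends on. */
struct buffer {
   unsigned handle;
   bool external;   /* true once the buffer has been exported */
};

/* Toy handle table: counts entries instead of storing them. */
struct handle_table {
   unsigned count;
};

/* Called on export (prime fd or flink in the patch). Idempotent, so a
 * buffer exported twice is still tracked exactly once. The real code
 * does this while holding the buffer manager lock. */
static void
mark_external(struct handle_table *table, struct buffer *bo)
{
   if (!bo->external) {
      table->count++;
      bo->external = true;
   }
}

/* Called on free. Buffers that were never exported were never inserted,
 * so there is nothing to look up or remove for them. */
static void
buffer_free(struct handle_table *table, struct buffer *bo)
{
   if (bo->external) {
      table->count--;
      bo->external = false;
   }
}
```

The effect for the common case of purely internal buffers is that allocation and free no longer touch the table at all.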
[Mesa-dev] [PATCH 04/12] i965: Replace open-coded gen6 queryobj offsets with simple helpers
Lots of places open-coded the assumed layout of the predicate/results
within the query object; replace those with simple helpers.

v2: Fix function decl style.

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_conditional_render.c | 10 --
 src/mesa/drivers/dri/i965/brw_context.h            | 15 +++
 src/mesa/drivers/dri/i965/gen6_queryobj.c          | 6 +++---
 src/mesa/drivers/dri/i965/hsw_queryobj.c           | 18 +-
 4 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_conditional_render.c b/src/mesa/drivers/dri/i965/brw_conditional_render.c
index e33e79fb6c..0177a7f80b 100644
--- a/src/mesa/drivers/dri/i965/brw_conditional_render.c
+++ b/src/mesa/drivers/dri/i965/brw_conditional_render.c
@@ -87,8 +87,14 @@ set_predicate_for_occlusion_query(struct brw_context *brw,
     */
    brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);
 
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC0, query->bo, 0 /* offset */);
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC1, query->bo, 8 /* offset */);
+   brw_load_register_mem64(brw,
+                           MI_PREDICATE_SRC0,
+                           query->bo,
+                           gen6_query_results_offset(query, 0));
+   brw_load_register_mem64(brw,
+                           MI_PREDICATE_SRC1,
+                           query->bo,
+                           gen6_query_results_offset(query, 1));
 }
 
 static void
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index d41e6aa7bd..d37e05bb47 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -428,6 +428,21 @@ struct brw_query_object {
    bool flushed;
 };
 
+#define GEN6_QUERY_PREDICATE (2)
+#define GEN6_QUERY_RESULTS (0)
+
+static inline unsigned
+gen6_query_predicate_offset(const struct brw_query_object *query)
+{
+   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+}
+
+static inline unsigned
+gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
+{
+   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+}
+
 enum brw_gpu_ring {
    UNKNOWN_RING,
    RENDER_RING,
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 1ee3974198..a0b786f5d9 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -71,7 +71,7 @@ set_query_availability(struct brw_context *brw, struct brw_query_object *query,
       }
 
       brw_emit_pipe_control_write(brw, flags,
-                                  query->bo, 2 * sizeof(uint64_t),
+                                  query->bo, gen6_query_predicate_offset(query),
                                   available);
    }
 }
@@ -318,7 +318,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
 {
    struct brw_context *brw = brw_context(ctx);
    struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 0;
+   const int idx = GEN6_QUERY_RESULTS;
 
    /* Since we're starting a new query, we need to throw away old results. */
    brw_bo_unreference(query->bo);
@@ -407,7 +407,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
 {
    struct brw_context *brw = brw_context(ctx);
    struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 1;
+   const int idx = GEN6_QUERY_RESULTS + 1;
 
    switch (query->Base.Target) {
    case GL_TIME_ELAPSED:
diff --git a/src/mesa/drivers/dri/i965/hsw_queryobj.c b/src/mesa/drivers/dri/i965/hsw_queryobj.c
index 9dc3b3de86..32b2e1f342 100644
--- a/src/mesa/drivers/dri/i965/hsw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/hsw_queryobj.c
@@ -191,7 +191,7 @@ load_overflow_data_to_cs_gprs(struct brw_context *brw,
                               struct brw_query_object *query,
                               int idx)
 {
-   int offset = idx * sizeof(uint64_t) * 4;
+   int offset = gen6_query_results_offset(query, 0) + idx * sizeof(uint64_t) * 4;
 
    brw_load_register_mem64(brw, HSW_CS_GPR(1), query->bo, offset);
 
@@ -282,7 +282,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct brw_query_object *query,
       brw_load_register_mem64(brw,
                               HSW_CS_GPR(0),
                               query->bo,
-                              2 * sizeof(uint64_t));
+                              gen6_query_predicate_offset(query));
       return;
    }
 
@@ -299,7 +299,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct brw_query_object *query,
       brw_load_register_mem64(brw,
                               HSW_CS_GPR(0),
                               query->bo,
-                              0 * sizeof(uint64_t));
+                              gen6_query_results_offset(query, 0));
    } else if
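As a quick check of the arithmetic, the slot layout introduced by this patch (results starting at slot 0, predicate in slot 2, each slot a uint64_t) reproduces the literal byte offsets it replaces. The sketch below restates the two helpers without the query argument, which the helpers in this patch do not use yet. The shortened names are hypothetical.

```c
#include <stdint.h>

/* Slot numbers as defined in this patch: begin/end results occupy
 * slots 0 and 1, the availability predicate occupies slot 2. */
#define QUERY_PREDICATE 2
#define QUERY_RESULTS   0

/* Byte offset of the availability predicate within the query buffer. */
static inline unsigned
query_predicate_offset(void)
{
   return QUERY_PREDICATE * sizeof(uint64_t);
}

/* Byte offset of result slot idx within the query buffer. */
static inline unsigned
query_results_offset(unsigned idx)
{
   return (QUERY_RESULTS + idx) * sizeof(uint64_t);
}
```

These evaluate to 16 for the predicate and to 0 and 8 for the first two result slots, matching the open-coded `2 * sizeof(uint64_t)`, `0 /* offset */` and `8 /* offset */` removed by the diff.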
[Mesa-dev] [PATCH 06/12] i965: Use snoop bo for accessing query results on !llc
On non-llc architectures, where we are primarily reading back the results
of the GPU queries, we can improve performance by using a cacheable mapping
of the results. Unfortunately, enabling snooping makes the writes from the
GPU slower, which may adversely affect pipelined query operations (where
the results are used directly by the GPU and not the CPU).

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c    | 24 
 src/mesa/drivers/dri/i965/brw_bufmgr.h    | 2 ++
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 4 +++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5c7647f8bc..d71cef25e3 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -683,6 +683,30 @@ brw_bo_unreference(struct brw_bo *bo)
    }
 }
 
+static bool
+__brw_bo_set_caching(struct brw_bo *bo, int caching)
+{
+   struct drm_i915_gem_caching arg = {
+      .handle = bo->gem_handle,
+      .caching = caching
+   };
+   return drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg) == 0;
+}
+
+void
+brw_bo_set_cache_coherent(struct brw_bo *bo)
+{
+   assert(!bo->external);
+   if (bo->cache_coherent)
+      return;
+
+   if (!__brw_bo_set_caching(bo, I915_CACHING_CACHED))
+      return;
+
+   bo->reusable = false;
+   bo->cache_coherent = true;
+}
+
 static void
 bo_wait_with_stall_warning(struct brw_context *brw,
                            struct brw_bo *bo,
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 9848fe9268..45819c17c5 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -221,6 +221,8 @@ void brw_bo_unreference(struct brw_bo *bo);
 #define MAP_INTERNAL_MASK (0xff << 24)
 #define MAP_RAW (0x01 << 24)
 
+void brw_bo_set_cache_coherent(struct brw_bo *bo);
+
 /**
  * Maps the buffer into userspace.
  *
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index c96f00d8ba..a3b552c6c1 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -225,7 +225,7 @@ gen6_queryobj_get_results(struct gl_context *ctx,
 
    brw_bo_wait_rendering(query->bo);
    uint64_t *results = query->results;
-   if (!query->bo->cache_coherent)
+   if (unlikely(!query->bo->cache_coherent))
       gen_invalidate_range(results, query->bo->size);
 
    switch (query->Base.Target) {
@@ -320,6 +320,8 @@ gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
       brw_bo_unreference(query->bo);
 
    query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
+   brw_bo_set_cache_coherent(query->bo);
+
    query->results = brw_bo_map(brw, query->bo,
                                MAP_COHERENT | MAP_PERSISTENT |
                                MAP_READ | MAP_ASYNC);
-- 
2.13.3
[Mesa-dev] [PATCH 08/12] i965: Pack simple pipelined query objects into the same buffer
Reuse the same query object buffer for multiple queries within the same
batch.

A task for the future is propagating the GL_NO_MEMORY errors.

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_context.c   | 4 +++
 src/mesa/drivers/dri/i965/brw_context.h   | 10 +--
 src/mesa/drivers/dri/i965/brw_queryobj.c  | 16 +--
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 46 +--
 4 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index d0b22d4342..5cf5a67432 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -860,6 +860,8 @@ brwCreateContext(gl_api api,
 
    brw->isl_dev = screen->isl_dev;
 
+   brw->query.last_index = 4096;
+
    brw->vs.base.stage = MESA_SHADER_VERTEX;
    brw->tcs.base.stage = MESA_SHADER_TESS_CTRL;
    brw->tes.base.stage = MESA_SHADER_TESS_EVAL;
@@ -1047,6 +1049,8 @@ intelDestroyContext(__DRIcontext * driContextPriv)
       brw_bo_unreference(brw->gs.base.scratch_bo);
    if (brw->wm.base.scratch_bo)
       brw_bo_unreference(brw->wm.base.scratch_bo);
+   if (brw->query.bo)
+      brw_bo_unreference(brw->query.bo);
 
    brw_destroy_hw_context(brw->bufmgr, brw->hw_ctx);
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index b415013e47..376bcbb399 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -423,7 +423,7 @@ struct brw_query_object {
    uint64_t *results;
 
    /** Last index in bo with query data for this object. */
-   int last_index;
+   unsigned index;
 
    /** True if we know the batch has been flushed since we ended the query. */
    bool flushed;
@@ -435,13 +435,13 @@ struct brw_query_object {
 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
 {
-   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_PREDICATE) * sizeof(uint64_t);
 }
 
 static inline unsigned
 gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
 {
-   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
 }
 
 enum brw_gpu_ring {
@@ -1103,6 +1103,10 @@ struct brw_context
    } cc;
 
    struct {
+      struct brw_bo *bo;
+      uint64_t *map;
+      unsigned last_index;
+
       struct brw_query_object *obj;
       bool begin_emitted;
    } query;
diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 04ce9a94ca..8b14c72176 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -184,7 +184,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
        * run out of space in the query's BO and allocated a new one. If so,
        * this function was already called to accumulate the results so far.
        */
-      for (i = 0; i < query->last_index; i++) {
+      for (i = 0; i < query->index; i++) {
          query->Base.Result += results[i * 2 + 1] - results[i * 2];
       }
       break;
@@ -194,7 +194,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
       /* If the starting and ending PS_DEPTH_COUNT from any of the batches
        * differ, then some fragments passed the depth test.
        */
-      for (i = 0; i < query->last_index; i++) {
+      for (i = 0; i < query->index; i++) {
          if (results[i * 2 + 1] != results[i * 2]) {
             query->Base.Result = GL_TRUE;
             break;
@@ -298,7 +298,7 @@ brw_begin_query(struct gl_context *ctx, struct gl_query_object *q)
        */
       brw_bo_unreference(query->bo);
       query->bo = NULL;
-      query->last_index = -1;
+      query->index = -1;
 
       brw->query.obj = query;
 
@@ -430,7 +430,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct brw_query_object *query)
 
    assert(brw->gen < 6);
 
-   if (!query->bo || query->last_index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
+   if (!query->bo || query->index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
 
       if (query->bo != NULL) {
          /* The old query BO did not have enough space, so we allocated a new
@@ -441,7 +441,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct brw_query_object *query)
       }
 
       query->bo = brw_bo_alloc(brw->bufmgr, "query", 4096, 1);
-      query->last_index = 0;
+      query->index = 0;
    }
 }
 
@@ -482,7 +482,7 @@ brw_emit_query_begin(struct brw_context *brw)
 
    ensure_bo_has_space(ctx, query);
 
-   brw_write_depth_count(brw, query->bo, query->last_index * 2);
+   brw_write_depth_count(brw, query->bo, query->index * 2);
 
    brw->query.begin_emitted = true;
 }
@@ -504,10 +504,10 @@ brw_emit_query_end(struct brw_context *brw)
 
    if (!brw->query.begin_emitted)
[Mesa-dev] [PATCH 05/12] i965: Map the query results for the life of the bo
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.

v2: Inline the wait on results; it disappears shortly in the next few
patches.

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_context.h   | 1 +
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 42 +++
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index d37e05bb47..4d0b76bebb 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -420,6 +420,7 @@ struct brw_query_object {
 
    /** Last query BO associated with this query. */
    struct brw_bo *bo;
+   uint64_t *results;
 
    /** Last index in bo with query data for this object. */
    int last_index;
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index a0b786f5d9..c96f00d8ba 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -33,6 +33,8 @@
  */
 
 #include "main/imports.h"
+#include "common/gen_clflush.h"
+
 #include "brw_context.h"
 #include "brw_defines.h"
 #include "brw_state.h"
@@ -221,7 +223,11 @@ gen6_queryobj_get_results(struct gl_context *ctx,
    if (query->bo == NULL)
       return;
 
-   uint64_t *results = brw_bo_map(brw, query->bo, MAP_READ);
+   brw_bo_wait_rendering(query->bo);
+   uint64_t *results = query->results;
+   if (!query->bo->cache_coherent)
+      gen_invalidate_range(results, query->bo->size);
+
    switch (query->Base.Target) {
    case GL_TIME_ELAPSED:
       /* The query BO contains the starting and ending timestamps.
@@ -296,7 +302,6 @@ gen6_queryobj_get_results(struct gl_context *ctx,
    default:
       unreachable("Unrecognized query target in brw_queryobj_get_results()");
    }
-   brw_bo_unmap(query->bo);
 
    /* Now that we've processed the data stored in the query's buffer object,
     * we can release it.
@@ -307,6 +312,24 @@ gen6_queryobj_get_results(struct gl_context *ctx,
    query->Base.Ready = true;
 }
 
+static int
+gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
+{
+   /* Since we're starting a new query, we need to throw away old results. */
+   if (query->bo)
+      brw_bo_unreference(query->bo);
+
+   query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
+   query->results = brw_bo_map(brw, query->bo,
+                               MAP_COHERENT | MAP_PERSISTENT |
+                               MAP_READ | MAP_ASYNC);
+
+   /* For ARB_query_buffer_object: The result is not available */
+   set_query_availability(brw, query, false);
+
+   return 0;
+}
+
 /**
  * Driver hook for glBeginQuery().
  *
@@ -318,14 +341,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q)
 {
    struct brw_context *brw = brw_context(ctx);
    struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = GEN6_QUERY_RESULTS;
-
-   /* Since we're starting a new query, we need to throw away old results. */
-   brw_bo_unreference(query->bo);
-   query->bo = brw_bo_alloc(brw->bufmgr, "query results", 4096, 4096);
-
-   /* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
 
    switch (query->Base.Target) {
    case GL_TIME_ELAPSED:
@@ -539,8 +555,12 @@ gen6_query_counter(struct gl_context *ctx, struct gl_query_object *q)
 {
    struct brw_context *brw = brw_context(ctx);
    struct brw_query_object *query = (struct brw_query_object *)q;
-   brw_query_counter(ctx, q);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
+
+   brw_write_timestamp(brw, query->bo, idx);
    set_query_availability(brw, query, true);
+
+   query->flushed = false;
 }
 
 /* Initialize Gen6+-specific query object functions. */
-- 
2.13.3
[Mesa-dev] [PATCH 07/12] i965: Use 'available' fence for polling query results
If we always write the 'available' flag after writing the final result
of the query, we can probe that predicate to quickly query whether the
result is ready from userspace. The primary advantage of checking the
predicate is that it allows for more fine-grained queries, we do not
have to wait for the batch to finish before the query is marked as
ready.

We still do check the status of the batch after probing the query so
that if the worst happens and the batch did hang without completing the
query, we do not spin forever (although it is not as nice as completely
eliminating the ioctl, the busy-ioctl is lightweight!).

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_context.h   |  4 +--
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 53 ++-
 2 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 4d0b76bebb..b415013e47 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -429,8 +429,8 @@ struct brw_query_object {
    bool flushed;
 };

-#define GEN6_QUERY_PREDICATE (2)
-#define GEN6_QUERY_RESULTS (0)
+#define GEN6_QUERY_PREDICATE (0)
+#define GEN6_QUERY_RESULTS (1)

 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index a3b552c6c1..c6887661a5 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -42,8 +42,7 @@
 #include "intel_buffer_objects.h"

 static inline void
-set_query_availability(struct brw_context *brw, struct brw_query_object *query,
-                       bool available)
+set_query_available(struct brw_context *brw, struct brw_query_object *query)
 {
    /* For platforms that support ARB_query_buffer_object, we write the
     * query availability for "pipelined" queries.
@@ -60,22 +59,12 @@ set_query_availability(struct brw_context *brw, struct brw_query_object *query,
     * PIPE_CONTROL with an immediate write will synchronize with
     * those earlier writes, so we write 1 when the value has landed.
     */
-   if (brw->ctx.Extensions.ARB_query_buffer_object &&
-       brw_is_query_pipelined(query)) {
-      unsigned flags = PIPE_CONTROL_WRITE_IMMEDIATE;
-      if (available) {
-         /* Order available *after* the query results. */
-         flags |= PIPE_CONTROL_FLUSH_ENABLE;
-      } else {
-         /* Make it unavailable *before* any pipelined reads. */
-         flags |= PIPE_CONTROL_CS_STALL;
-      }
-
-      brw_emit_pipe_control_write(brw, flags,
-                                  query->bo, gen6_query_predicate_offset(query),
-                                  available);
-   }
+   brw_emit_pipe_control_write(brw,
+                               PIPE_CONTROL_WRITE_IMMEDIATE |
+                               PIPE_CONTROL_FLUSH_ENABLE,
+                               query->bo, gen6_query_predicate_offset(query),
+                               true);
 }

 static void
@@ -141,12 +130,12 @@ write_xfb_overflow_streams(struct gl_context *ctx,
 }

 static bool
-check_xfb_overflow_streams(uint64_t *results, int count)
+check_xfb_overflow_streams(const uint64_t *results, int count)
 {
    bool overflow = false;

    for (int i = 0; i < count; i++) {
-      uint64_t *result_i = &results[4 * i];
+      const uint64_t *result_i = &results[4 * i];

       if ((result_i[3] - result_i[2]) != (result_i[1] - result_i[0])) {
          overflow = true;
@@ -216,15 +205,14 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo *bo,
  */
 static void
 gen6_queryobj_get_results(struct gl_context *ctx,
-                          struct brw_query_object *query)
+                          struct brw_query_object *query,
+                          uint64_t *results)
 {
    struct brw_context *brw = brw_context(ctx);

    if (query->bo == NULL)
       return;

-   brw_bo_wait_rendering(query->bo);
-   uint64_t *results = query->results;
    if (unlikely(!query->bo->cache_coherent))
       gen_invalidate_range(results, query->bo->size);

@@ -324,10 +312,10 @@ gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
    query->results = brw_bo_map(brw, query->bo,
                                MAP_COHERENT | MAP_PERSISTENT |
-                               MAP_READ | MAP_ASYNC);
+                               MAP_READ | MAP_WRITE);

    /* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   query->results[GEN6_QUERY_PREDICATE] = false;

    return 0;
 }
@@ -482,7 +470,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
    query->flushed = false;

    /* For ARB_query_buffer_object: The result is now available */
-
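The availability probe this patch describes can be sketched in standalone C. This is only an illustration, not the driver code: `sketch_query_ready` and the `SKETCH_*` indices are hypothetical names, and the batch status is passed in as a plain flag instead of coming from the busy ioctl. What is taken from the patch is the layout (predicate in slot 0, results starting at slot 1) and the fallback to a busy check when the predicate has not been written yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the new layout: predicate first, results after. */
enum {
   SKETCH_QUERY_PREDICATE = 0,
   SKETCH_QUERY_RESULTS = 1,
};

/*
 * Returns true when the result can be read without blocking.
 *
 * map        - CPU mapping of the query buffer (written by the GPU)
 * batch_busy - whether the batch that writes the query is still executing;
 *              in the driver this would come from a busy query, here it is
 *              just a parameter.
 */
static bool
sketch_query_ready(const volatile uint64_t *map, bool batch_busy)
{
   /* Fast path: the GPU has already written the 'available' flag. */
   if (map[SKETCH_QUERY_PREDICATE])
      return true;

   /* Slow path: if the batch is no longer busy, nothing more will be
    * written, so do not keep waiting on the predicate.
    */
   return !batch_busy;
}
```

The point of the fast path is that a query can be reported as ready while the rest of its batch is still running; the slow path keeps a hung batch from making the caller spin.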
Re: [Mesa-dev] [PATCH 01/11] intel/isl: Stop padding surfaces
Reviewed-by: Jordan JustenOn 2017-08-02 13:35:26, Jason Ekstrand wrote: > The docs contain a bunch of commentary about the need to pad various > surfaces out to multiples of something or other. However, all of those > requirements are about avoiding GTT errors due to missing pages when the > data port or sampler accesses slightly out-of-bounds. However, because > the kernel already fills all the empty space in our GTT with the scratch > page, we never have to worry about faulting due to OOB reads. There are > two caveats to this: > > 1) There is some potential for issues with caches here if extra data > ends up in a cache we don't expect due to OOB reads. However, > because we always trash the entire cache whenever we need to move > anything between cache domains, this shouldn't be an issue. > > 2) There is a potential issue if a surface gets placed at the very top > of the GTT by the kernel. In this case, the hardware could > potentially end up trying to read past the top of the GTT. If it > nicely wraps around at the 48-bit (or 32-bit) boundary, then this > shouldn't be an issue thanks to the scratch page. If it doesn't, > then we need to come up with something to handle it. > > Up until some of the GL move to ISL, having the padding code in there > just caused us to harmlessly use a bit more memory in Vulkan. However, > now that we're using ISL sizes to validate external dma-buf images, > these padding requirements are causing us to reject otherwise valid > images due to the size of the BO being too small. 
> > Cc: "17.2" > Cc: Chad Versace > Tested-by: Tapani Pälli > Tested-by: Tomasz Figa > --- > src/intel/isl/isl.c | 119 > +--- > 1 file changed, 2 insertions(+), 117 deletions(-) > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c > index 5e3d279..d3124de 100644 > --- a/src/intel/isl/isl.c > +++ b/src/intel/isl/isl.c > @@ -1374,116 +1374,6 @@ isl_calc_row_pitch(const struct isl_device *dev, > return true; > } > > -/** > - * Calculate and apply any padding required for the surface. > - * > - * @param[inout] total_h_el is updated with the new height > - * @param[out] pad_bytes is overwritten with additional padding requirements. > - */ > -static void > -isl_apply_surface_padding(const struct isl_device *dev, > - const struct isl_surf_init_info *restrict info, > - const struct isl_tile_info *tile_info, > - uint32_t *total_h_el, > - uint32_t *pad_bytes) > -{ > - const struct isl_format_layout *fmtl = > isl_format_get_layout(info->format); > - > - *pad_bytes = 0; > - > - /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface > -* Formats >> Surface Padding Requirements >> Render Target and Media > -* Surfaces: > -* > -* The data port accesses data (pixels) outside of the surface if they > -* are contained in the same cache request as pixels that are within the > -* surface. These pixels will not be returned by the requesting message, > -* however if these pixels lie outside of defined pages in the GTT, > -* a GTT error will result when the cache request is processed. In > -* order to avoid these GTT errors, “padding” at the bottom of the > -* surface is sometimes necessary. > -* > -* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface > -* Formats >> Surface Padding Requirements >> Sampling Engine Surfaces: > -* > -*... Lots of padding requirements, all listed separately below. > -*/ > - > - /* We can safely ignore the first padding requirement, quoted below, > -* because isl doesn't do buffers. 
> -* > -*- [pre-BDW] For buffers, which have no inherent “height,” padding > -* requirements are different. A buffer must be padded to the next > -* multiple of 256 array elements, with an additional 16 bytes added > -* beyond that to account for the L1 cache line. > -*/ > - > - /* > -*- For compressed textures [...], padding at the bottom of the > surface > -* is to an even compressed row. > -*/ > - if (isl_format_is_compressed(info->format)) > - *total_h_el = isl_align(*total_h_el, 2); > - > - /* > -*- For cube surfaces, an additional two rows of padding are required > -* at the bottom of the surface. > -*/ > - if (info->usage & ISL_SURF_USAGE_CUBE_BIT) > - *total_h_el += 2; > - > - /* > -*- For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats, > -* additional padding is required. These surfaces require an extra > row > -* plus 16 bytes of padding at the bottom in addition to the general > -* padding
[Mesa-dev] [PATCH 02/12] i965: Check last known busy status on bo before asking the kernel
If we know the bo is idle (that is, we have not submitted a command
buffer referencing this bo since the last query) we can skip asking the
kernel. Note this may report a false negative if the target is being
shared between processes (exported via dmabuf or flink). To allow the
caller control over using the last known flag, the query is split into
two.

v2: Check against external bo before trusting our own tracking.

Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 42 --
 src/mesa/drivers/dri/i965/brw_bufmgr.h | 11 +++--
 2 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 844ccaf1e5..5c7647f8bc 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -196,22 +196,40 @@ bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
    return NULL;
 }

-int
+static int
+__brw_bo_busy(struct brw_bo *bo)
+{
+   struct drm_i915_gem_busy busy = { bo->gem_handle };
+
+   if (bo->idle && !bo->external)
+      return 0;
+
+   /* If we hit an error here, it means that bo->gem_handle is invalid.
+    * Treat it as being idle (busy.busy is left as 0) and move along.
+    */
+   drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
+
+   bo->idle = !busy.busy;
+   return busy.busy;
+}
+
+bool
 brw_bo_busy(struct brw_bo *bo)
 {
-   struct brw_bufmgr *bufmgr = bo->bufmgr;
-   struct drm_i915_gem_busy busy;
-   int ret;
+   return __brw_bo_busy(bo);
+}

-   memclear(busy);
-   busy.handle = bo->gem_handle;
+bool
+brw_bo_map_busy(struct brw_bo *bo, unsigned flags)
+{
+   unsigned mask;

-   ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
-   if (ret == 0) {
-      bo->idle = !busy.busy;
-      return busy.busy;
-   }
-   return false;
+   if (flags & MAP_WRITE)
+      mask = ~0u;
+   else
+      mask = 0x;
+
+   return __brw_bo_busy(bo) & mask;
 }

 int
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index d09bc74c9c..9848fe9268 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -271,10 +271,17 @@ int brw_bo_get_tiling(struct brw_bo *bo, uint32_t *tiling_mode,
 int brw_bo_flink(struct brw_bo *bo, uint32_t *name);

 /**
- * Returns 1 if mapping the buffer for write could cause the process
+ * Returns false if mapping the buffer is not in active use by the gpu.
+ * If it returns true, any mapping for for write could cause the process
  * to block, due to the object being active in the GPU.
  */
-int brw_bo_busy(struct brw_bo *bo);
+bool brw_bo_busy(struct brw_bo *bo);
+
+/**
+ * Returns true if mapping the buffer for the set of flags (i.e. MAP_READ or
+ * MAP_WRITE) will cause the process to block.
+ */
+bool brw_bo_map_busy(struct brw_bo *bo, unsigned flags);

 /**
  * Specify the volatility of the buffer.
-- 
2.13.3
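The caching logic in this patch can be illustrated without the kernel. The following is a minimal sketch, assuming a made-up `struct sketch_bo` and a simulated kernel query that just counts how often it is called; none of these names exist in Mesa. It shows the two properties the commit message relies on: a bo known to be idle is answered locally, and an external (shared) bo always goes back to the kernel because another process may have made it busy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct brw_bo; only the fields the logic reads. */
struct sketch_bo {
   bool idle;              /* last known status: no work outstanding */
   bool external;          /* shared with another process (dmabuf/flink) */
   uint32_t kernel_busy;   /* what the simulated kernel would report */
   unsigned kernel_calls;  /* how many times we had to ask */
};

/* Simulates the busy ioctl; this is NOT the real DRM call. */
static uint32_t
sketch_query_kernel(struct sketch_bo *bo)
{
   bo->kernel_calls++;
   return bo->kernel_busy;
}

static uint32_t
sketch_bo_busy(struct sketch_bo *bo)
{
   /* Trust our own tracking only for buffers nobody else can touch. */
   if (bo->idle && !bo->external)
      return 0;

   uint32_t busy = sketch_query_kernel(bo);

   /* Remember the answer so the next call can skip the kernel. */
   bo->idle = !busy;
   return busy;
}
```

In the real driver the `idle` flag would also have to be cleared whenever a batch referencing the bo is submitted; that bookkeeping is outside this sketch.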
[Mesa-dev] [PATCH 11/12] i965: Use snooping for mapping the miptree
Avoid having to clflush after blitting the miptree to a linear buffer
for mapping by enabling snooping on !llc and treating the buffer as
coherent. Similarly, it avoids the clflush afterwards if used for
READ | WRITE.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 858279fcba..3b5e5595d7 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2955,11 +2955,16 @@ intel_miptree_map_blit(struct brw_context *brw,
                                          map->w, map->h, 1,
                                          /* samples */ 1,
                                          MIPTREE_LAYOUT_TILING_NONE);
-
    if (!map->linear_mt) {
       fprintf(stderr, "Failed to allocate blit temporary\n");
       goto fail;
    }
+
+   /* Make the GPU do the work of invalidating the CPU cache (using snoop on
+    * !llc), it's much faster than clflush!
+    */
+   brw_bo_set_cache_coherent(map->linear_mt->bo);
+
    map->stride = map->linear_mt->surf.row_pitch;

    /* One of either READ_BIT or WRITE_BIT or both is set. READ_BIT implies no
@@ -3422,11 +3427,11 @@ use_intel_mipree_map_blit(struct brw_context *brw,
                           unsigned int level,
                           unsigned int slice)
 {
-   if (brw->has_llc &&
-       /* It's probably not worth swapping to the blit ring because of
-        * all the overhead involved.
-        */
-       !(mode & GL_MAP_WRITE_BIT) &&
+   /* It's probably not worth swapping to the blit ring because of
+    * all the overhead involved.
+    */
+
+   if (!(mode & GL_MAP_WRITE_BIT) &&
        !mt->compressed &&
        (mt->surf.tiling == ISL_TILING_X ||
         /* Prior to Sandybridge, the blitter can't handle Y tiling */
-- 
2.13.3
[Mesa-dev] [PATCH 09/12] i965: Pass consistent args along gen6_queryobj.c
Be consistent in passing along brw_context rather than switching between that and gl_context. Signed-off-by: Chris Wilson--- src/mesa/drivers/dri/i965/gen6_queryobj.c | 32 ++- 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c index 25ea51503e..0ba6919374 100644 --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c @@ -99,12 +99,10 @@ write_xfb_primitives_written(struct brw_context *brw, } static void -write_xfb_overflow_streams(struct gl_context *ctx, +write_xfb_overflow_streams(struct brw_context *brw, struct brw_bo *bo, int stream, int count, int idx) { - struct brw_context *brw = brw_context(ctx); - brw_emit_mi_flush(brw); for (int i = 0; i < count; i++) { @@ -204,15 +202,10 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo *bo, * Wait on the query object's BO and calculate the final result. */ static void -gen6_queryobj_get_results(struct gl_context *ctx, +gen6_queryobj_get_results(struct brw_context *brw, struct brw_query_object *query, uint64_t *results) { - struct brw_context *brw = brw_context(ctx); - - if (query->bo == NULL) - return; - if (unlikely(!query->bo->cache_coherent)) gen_invalidate_range(results, query->bo->size); @@ -232,7 +225,7 @@ gen6_queryobj_get_results(struct gl_context *ctx, /* Ensure the scaled timestamp overflows according to * GL_QUERY_COUNTER_BITS */ - query->Base.Result &= (1ull << ctx->Const.QueryCounterBits.Timestamp) - 1; + query->Base.Result &= (1ull << brw->ctx.Const.QueryCounterBits.Timestamp) - 1; break; case GL_SAMPLES_PASSED_ARB: @@ -396,7 +389,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q) case GL_PRIMITIVES_GENERATED: write_primitives_generated(brw, query->bo, query->Base.Stream, idx); if (query->Base.Stream == 0) - ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; + brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; break; case 
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN: @@ -404,11 +397,11 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q) break; case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx); + write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx); break; case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx); + write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx); break; case GL_VERTICES_SUBMITTED_ARB: @@ -459,7 +452,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q) case GL_PRIMITIVES_GENERATED: write_primitives_generated(brw, query->bo, query->Base.Stream, idx); if (query->Base.Stream == 0) - ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; + brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; break; case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN: @@ -467,11 +460,11 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q) break; case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx); + write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx); break; case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx); + write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx); break; /* calculate overflow here */ @@ -530,6 +523,9 @@ static void gen6_wait_query(struct gl_context *ctx, struct gl_query_object *q) struct brw_context *brw = brw_context(ctx); struct brw_query_object *query = (struct brw_query_object *)q; + if (query->bo == NULL) + return; + /* If the application has requested the query result, but this batch is * still contributing to it, flush it now to finish that work so the * result will become available (eventually). 
@@ -540,7 +536,7 @@ static void gen6_wait_query(struct gl_context *ctx, struct gl_query_object *q) if (!results[GEN6_QUERY_PREDICATE]) /* not yet available, must wait */ brw_bo_wait_rendering(query->bo); - gen6_queryobj_get_results(ctx, query, results + GEN6_QUERY_RESULTS); + gen6_queryobj_get_results(brw, query, results + GEN6_QUERY_RESULTS); } /** @@ -572,7 +568,7 @@ static void gen6_check_query(struct gl_context *ctx, struct gl_query_object *q) uint64_t *results = query->results; if (results[GEN6_QUERY_PREDICATE] || /* already available, can read async */
[Mesa-dev] [PATCH 03/12] i965: Replace hard-coded indices with const named variables in gen6_queryobj
To simplify replacement later, replace repeated use of explicit 0/1 with local variables of the same value. Signed-off-by: Chris WilsonCc: Kenneth Graunke Cc: Matt Turner --- src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 -- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c index 8e639cfeef..1ee3974198 100644 --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c @@ -318,6 +318,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q) { struct brw_context *brw = brw_context(ctx); struct brw_query_object *query = (struct brw_query_object *)q; + const int idx = 0; /* Since we're starting a new query, we need to throw away old results. */ brw_bo_unreference(query->bo); @@ -347,31 +348,31 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q) * obtain the time elapsed. Notably, this includes time elapsed while * the system was doing other work, such as running other applications. 
*/ - brw_write_timestamp(brw, query->bo, 0); + brw_write_timestamp(brw, query->bo, idx); break; case GL_ANY_SAMPLES_PASSED: case GL_ANY_SAMPLES_PASSED_CONSERVATIVE: case GL_SAMPLES_PASSED_ARB: - brw_write_depth_count(brw, query->bo, 0); + brw_write_depth_count(brw, query->bo, idx); break; case GL_PRIMITIVES_GENERATED: - write_primitives_generated(brw, query->bo, query->Base.Stream, 0); + write_primitives_generated(brw, query->bo, query->Base.Stream, idx); if (query->Base.Stream == 0) ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; break; case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN: - write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 0); + write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx); break; case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 0); + write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx); break; case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 0); + write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx); break; case GL_VERTICES_SUBMITTED_ARB: @@ -385,7 +386,7 @@ gen6_begin_query(struct gl_context *ctx, struct gl_query_object *q) case GL_COMPUTE_SHADER_INVOCATIONS_ARB: case GL_TESS_CONTROL_SHADER_PATCHES_ARB: case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB: - emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, 0); + emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, idx); break; default: @@ -406,34 +407,35 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q) { struct brw_context *brw = brw_context(ctx); struct brw_query_object *query = (struct brw_query_object *)q; + const int idx = 1; switch (query->Base.Target) { case GL_TIME_ELAPSED: - brw_write_timestamp(brw, query->bo, 1); + brw_write_timestamp(brw, query->bo, idx); break; case GL_ANY_SAMPLES_PASSED: case 
GL_ANY_SAMPLES_PASSED_CONSERVATIVE: case GL_SAMPLES_PASSED_ARB: - brw_write_depth_count(brw, query->bo, 1); + brw_write_depth_count(brw, query->bo, idx); break; case GL_PRIMITIVES_GENERATED: - write_primitives_generated(brw, query->bo, query->Base.Stream, 1); + write_primitives_generated(brw, query->bo, query->Base.Stream, idx); if (query->Base.Stream == 0) ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD; break; case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN: - write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 1); + write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx); break; case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 1); + write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx); break; case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB: - write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 1); + write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx); break; /* calculate overflow here */ @@ -449,7 +451,7 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q) case GL_TESS_CONTROL_SHADER_PATCHES_ARB: case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB: emit_pipeline_stat(brw, query->bo, - query->Base.Stream, query->Base.Target, 1); + query->Base.Stream, query->Base.Target, idx); break;
[Mesa-dev] [PATCH 12/12] i965: Prefer to use the GPU copy if we need to stall for reads
If we need to stall to read the bo, ask the GPU to copy it into the CPU
cache whilst we wait.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 3b5e5595d7..5cd8d24f1e 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3421,6 +3421,18 @@ can_blit_slice(struct intel_mipmap_tree *mt,
 }

 static bool
+map_will_stall(struct brw_bo *bo, GLbitfield mode)
+{
+   /* If we need to stall for reading the buffer, offload the cost
+    * of clflushing it to the GPU.
+    */
+   if (!bo->cache_coherent && !(mode & GL_MAP_INVALIDATE_RANGE_BIT))
+      mode |= GL_MAP_READ_BIT;
+
+   return brw_bo_map_busy(bo, mode);
+}
+
+static bool
 use_intel_mipree_map_blit(struct brw_context *brw,
                           struct intel_mipmap_tree *mt,
                           GLbitfield mode,
@@ -3431,7 +3443,7 @@ use_intel_mipree_map_blit(struct brw_context *brw,
     * all the overhead involved.
     */

-   if (!(mode & GL_MAP_WRITE_BIT) &&
+   if (map_will_stall(mt->bo, mode) &&
        !mt->compressed &&
        (mt->surf.tiling == ISL_TILING_X ||
         /* Prior to Sandybridge, the blitter can't handle Y tiling */
-- 
2.13.3
[Mesa-dev] [PATCH 10/12] i965: Set query->flush after flushing the query
Skip the next check for brw_batch_references() by recording when we
flush the query.
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 0ba6919374..30dda5ae1f 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -502,14 +502,16 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
 static void
 flush_batch_if_needed(struct brw_context *brw, struct brw_query_object *query)
 {
+   if (query->flushed)
+      return;
+
    /* If the batch doesn't reference the BO, it must have been flushed
     * (for example, due to being full). Record that it's been flushed.
     */
-   query->flushed = query->flushed ||
-                    !brw_batch_references(&brw->batch, query->bo);
-
-   if (!query->flushed)
+   if (brw_batch_references(&brw->batch, query->bo))
       intel_batchbuffer_flush(brw);
+
+   query->flushed = true;
 }

 /**
-- 
2.13.3
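The control flow after this change can be tried out in isolation. Below is a minimal sketch in plain C with hypothetical `sketch_*` types standing in for the batch and the query object; the flush is simulated by a counter. It is meant to show the intended behaviour (at most one flush per query, and no reference check once the query is marked flushed), not to reproduce the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the batch and the query object. */
struct sketch_batch {
   bool references_bo;  /* batch still holds commands touching the query BO */
   unsigned flushes;    /* number of simulated batch submissions */
};

struct sketch_query {
   bool flushed;        /* set once we know the query's work was submitted */
};

/* Simulated flush: submitting the batch drops its references. */
static void
sketch_flush_batch(struct sketch_batch *batch)
{
   batch->flushes++;
   batch->references_bo = false;
}

static void
sketch_flush_if_needed(struct sketch_batch *batch, struct sketch_query *query)
{
   /* Early out: nothing to check once the flush has been recorded. */
   if (query->flushed)
      return;

   /* Either the batch was already submitted for another reason, or we
    * submit it now; in both cases the query is flushed afterwards.
    */
   if (batch->references_bo)
      sketch_flush_batch(batch);

   query->flushed = true;
}
```

Compared with the removed code, the flag is now set on both paths, which is what lets later calls skip the reference lookup entirely.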
Re: [Mesa-dev] No reloc for i965
Quoting Kenneth Graunke (2017-08-04 19:47:14)
> On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > Patch reordering from last time so that the cosmetic tweaks are done first
> > and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> > hopefully it doesn't look too bad and we can land at least as far as
> > there (patch 8/10).
> >
> > Thanks,
> > -Chris
>
> I split up some patches and pushed a modified version of this series.
>
> To ssh://git.freedesktop.org/git/mesa/mesa
>    5c007203b73..6c530ad1160 master -> master
>
> Thanks a ton for getting us to NO_RELOC. I really like the new reloc
> flags system as well. It's so much nicer!

I've still got to win you over to using LUT indices (kernel side, there
shouldn't be any case where it is worse, but the differences are easily
dwarfed in typical cases where it is only about 10% faster, but any
reduction inside the struct_mutex is a must), I see, and the per-context
bo along with removing the auxiliary render_cache set...

But now for something completely different...
-Chris
Re: [Mesa-dev] [PATCH 01/11] intel/isl: Stop padding surfaces
Ken and I had a fairly lengthy conversation about this on IRC today: https://people.freedesktop.org/~jekstrand/isl-padding The conclusion was that we both hate the patch but it's probably safe and it does fix bugs. The thing that really wins me over is that we have historically done none of this padding in the GL driver (except for one bit about cube maps) and seem to have gotten away with it. We have had some underallocation issues in the past but none have them have tracked back to this. --Jason On Wed, Aug 2, 2017 at 1:35 PM, Jason Ekstrandwrote: > The docs contain a bunch of commentary about the need to pad various > surfaces out to multiples of something or other. However, all of those > requirements are about avoiding GTT errors due to missing pages when the > data port or sampler accesses slightly out-of-bounds. However, because > the kernel already fills all the empty space in our GTT with the scratch > page, we never have to worry about faulting due to OOB reads. There are > two caveats to this: > > 1) There is some potential for issues with caches here if extra data > ends up in a cache we don't expect due to OOB reads. However, > because we always trash the entire cache whenever we need to move > anything between cache domains, this shouldn't be an issue. > > 2) There is a potential issue if a surface gets placed at the very top > of the GTT by the kernel. In this case, the hardware could > potentially end up trying to read past the top of the GTT. If it > nicely wraps around at the 48-bit (or 32-bit) boundary, then this > shouldn't be an issue thanks to the scratch page. If it doesn't, > then we need to come up with something to handle it. > > Up until some of the GL move to ISL, having the padding code in there > just caused us to harmlessly use a bit more memory in Vulkan. 
However, > now that we're using ISL sizes to validate external dma-buf images, > these padding requirements are causing us to reject otherwise valid > images due to the size of the BO being too small. > > Cc: "17.2" > Cc: Chad Versace > Tested-by: Tapani Pälli > Tested-by: Tomasz Figa > --- > src/intel/isl/isl.c | 119 +- > -- > 1 file changed, 2 insertions(+), 117 deletions(-) > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c > index 5e3d279..d3124de 100644 > --- a/src/intel/isl/isl.c > +++ b/src/intel/isl/isl.c > @@ -1374,116 +1374,6 @@ isl_calc_row_pitch(const struct isl_device *dev, > return true; > } > > -/** > - * Calculate and apply any padding required for the surface. > - * > - * @param[inout] total_h_el is updated with the new height > - * @param[out] pad_bytes is overwritten with additional padding > requirements. > - */ > -static void > -isl_apply_surface_padding(const struct isl_device *dev, > - const struct isl_surf_init_info *restrict info, > - const struct isl_tile_info *tile_info, > - uint32_t *total_h_el, > - uint32_t *pad_bytes) > -{ > - const struct isl_format_layout *fmtl = isl_format_get_layout(info->fo > rmat); > - > - *pad_bytes = 0; > - > - /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface > -* Formats >> Surface Padding Requirements >> Render Target and Media > -* Surfaces: > -* > -* The data port accesses data (pixels) outside of the surface if > they > -* are contained in the same cache request as pixels that are within > the > -* surface. These pixels will not be returned by the requesting > message, > -* however if these pixels lie outside of defined pages in the GTT, > -* a GTT error will result when the cache request is processed. In > -* order to avoid these GTT errors, “padding” at the bottom of the > -* surface is sometimes necessary. > -* > -* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface > -* Formats >> Surface Padding Requirements >> Sampling Engine Surfaces: > -* > -*... 
Lots of padding requirements, all listed separately below. > -*/ > - > - /* We can safely ignore the first padding requirement, quoted below, > -* because isl doesn't do buffers. > -* > -*- [pre-BDW] For buffers, which have no inherent “height,” padding > -* requirements are different. A buffer must be padded to the next > -* multiple of 256 array elements, with an additional 16 bytes > added > -* beyond that to account for the L1 cache line. > -*/ > - > - /* > -*- For compressed textures [...], padding at the bottom of the > surface > -* is to an even compressed row. > -*/ > - if (isl_format_is_compressed(info->format)) > - *total_h_el = isl_align(*total_h_el, 2); > - > -
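The problem described in the quoted commit message can be sketched with the two simplest rules from the removed function (even-row alignment for compressed formats, two extra rows for cube surfaces). This is standalone C with hypothetical `sketch_*` names and made-up example sizes; it is not ISL API and the real import check may be structured differently. It only illustrates how padding the computed height can make a tightly sized buffer fail a size comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t
sketch_align(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Height in element rows after applying the two padding rules. */
static uint32_t
sketch_padded_rows(uint32_t rows, bool compressed, bool cube)
{
   if (compressed)
      rows = sketch_align(rows, 2);   /* pad to an even compressed row */
   if (cube)
      rows += 2;                      /* two extra rows for cube surfaces */
   return rows;
}

/* A size check of the kind used when validating an imported buffer. */
static bool
sketch_import_ok(uint32_t rows, uint32_t row_pitch, uint64_t bo_size)
{
   return (uint64_t)rows * row_pitch <= bo_size;
}
```

For example, a compressed surface with 5 rows and a 64-byte pitch fits a 320-byte buffer exactly, but after padding to 6 rows the same check asks for 384 bytes and the buffer is rejected, even though no in-bounds access needs the extra row.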
Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS
On 4 August 2017 at 14:23, Tomasz Figa wrote:
>
> If this needs so complicated series of checks, maybe it would make
> more sense to just set enable_out_fence based on availability of the
> capability at initialization time?
>
Either way is fine with me.

>> Did you drop it all together or changed to use some other surface?
>> Would be nice to hear the reason why it was added - perhaps I'm
>> missing something.
>
> We have to keep it, otherwise there would be no fence available at the
> time of surface destruction, while, at least for Android, a fence can
> be passed to window's cancelBuffer callback.
>
>>
>> I think that we want a fence/fd for the new draw surface. Since
>> otherwise one won't get created up until the first SwapBuffers call.
>
> I might be missing something, but wouldn't that insert a fence at the
> beginning of command stream, before even doing anything? At least in
> Android use cases, the only places we need the fence is in SwapBuffers
> and DestroySurface and the fence should be inserted after all the
> commands for rendering into given surface.
>
Thanks for the correction. You're absolutely correct in both cases.

-Emil
Re: [Mesa-dev] [PATCH 0/8 v2] A few clover fixes for both CTS and eventual 1.2 support
Hi,

I went through most of the series. I think the approach is OK.
The biggest issue I had is with the sequence:
1.) add an interface
2.) implement a feature
3.) change the interface

I gave my rb to 1 and 2, but you might want to consider changing them
as well, if returning int from the functions is better. Generating
strings is IMO easier/faster than parsing them.

Also, you might want to consider cc'ing Francisco as he'll have the
final say and might see things differently.

thanks,
Jan

On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> I've dropped the first patch of the previous series for now. I'm not
> withdrawing it completely, just going to see if there's anything about
> the user_ptr stuff that could have been causing the issue instead, and
> if I'm using too big a hammer in this patch. If I convince myself of its
> correctness, it'll be back.
>
> The rest of the patches move the device version declaration to core/device
> and then use that along with the -cl-std option to determine which
> OpenCL language version to enable in clang.
>
> I've done a full piglit run (again) before/after, and there are no changes
> for me on radeonsi/pitcairn if the device is left at CL 1.1.
>
> When I bump my platform/device versions to 1.2, the clang instance has
> been confirmed to enable 1.2 language features (like the static keyword
> required in test/cl/program/execute/static.cl, which goes skip->pass).
>
> Major changes since v1:
>   Addressed Pierre's build-breakage comments
>   Added a check for cl-std > device_clc_version
>   Added a patch to pass the device object down into invocation.cpp
>     instead of adding a bunch of device-based arguments.
>   Use device_clc_version for cl version detection instead of device_version
>   Added device_clc_version in device.cpp/hpp
>
> Anyway, happy reviewing.
>
> Cc: Jan Vesely
> Cc: Pierre Moreau
>
Re: [Mesa-dev] No reloc for i965
On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> Patch reordering from last time so that the cosmetic tweaks are done first
> and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> hopefully it doesn't look too bad and we can land at least as far as
> there (patch 8/10).
>
> Thanks,
> -Chris

I split up some patches and pushed a modified version of this series.

To ssh://git.freedesktop.org/git/mesa/mesa
   5c007203b73..6c530ad1160 master -> master

Thanks a ton for getting us to NO_RELOC. I really like the new reloc
flags system as well. It's so much nicer!

--Ken
Re: [Mesa-dev] [PATCH 8/8] clover/llvm: Make __OPENCL_VERSION__ dynamic
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> Signed-off-by: Aaron Watry
> CC: Jan Vesely
>
> v2: base it on the device version
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 63b2961752..443cd31e66 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -224,7 +224,8 @@ namespace {
>        c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
>
>        // Add definition for the OpenCL version
> -      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
> +      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
> +         std::to_string(get_language_from_version_str(dev.device_version())));

I don't think you can use the same parsing function here.
__OPENCL_VERSION__ can go up to 2.2, while __OPENCL_C_VERSION__ is max 2.0

Jan

>
>        // clc.h requires that this macro be defined:
>        c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
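To make the review comment concrete, here is a small sketch in C (not clover's C++ code) of why one parsing function cannot serve both macros. The function names are hypothetical, the input is assumed to be a bare "major.minor" string, and the 2.0 cap for the language version is taken from the comment above rather than checked against a specification.

```c
#include <stdio.h>

/*
 * Convert a "major.minor" version string into the integer form used by
 * the version macros, e.g. "1.2" -> 120. Returns -1 if the string does
 * not start with two dot-separated integers.
 */
static int
sketch_version_macro(const char *version)
{
   int major, minor;

   if (sscanf(version, "%d.%d", &major, &minor) != 2)
      return -1;

   return major * 100 + minor * 10;
}

/*
 * Same conversion for the language-version macro, capped at 200 on the
 * assumption (from the review) that the language version stops at 2.0.
 */
static int
sketch_clc_version_macro(const char *version)
{
   int value = sketch_version_macro(version);

   return value > 200 ? 200 : value;
}
```

With a device that reports 2.2, the first function yields 220 while the second yields 200, so a helper written for the language version would give the wrong value if reused for the device version unchanged.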
Re: [Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency
On 04.08.2017 20:18, Emil Velikov wrote: From: Emil VelikovCurrently xmlconfig is conditionally used, only when --enable-dri is available. As the library has moved to src/util and has wider wisebase, this guard is no longer correct. Strictly speaking - it wasn't since the introduction of xmlconfig into st/nine a while ago. Unconditionally enable xmlconfig and drop the linking. As said before there's other users of the library, so depending on the configure options we will get multiple definitions of said symbols. NOTE: To avoid breaking other combinations, this commit adds the xmlconfig link to the required places - throughout gallium and the DRI loaders. Cc: Nicolai Hähnle Cc: Aaron Watry Signed-off-by: Emil Velikov --- Nicolai, here is an alternative solution. I have a very slight inclination towards this one over your earlier patch. But either one should do, really. This looks reasonable to me, and you're the build system expert, so go for it :) Patch is Reviewed-by: Nicolai Hähnle --- src/egl/Makefile.am | 8 ++-- src/gallium/auxiliary/pipe-loader/Makefile.am | 6 -- src/gallium/targets/opencl/Makefile.am| 1 - src/gbm/Makefile.am | 1 + src/glx/Makefile.am | 4 +++- src/loader/Makefile.am| 15 ++- 6 files changed, 16 insertions(+), 19 deletions(-) diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index ecaf148aaec..bb8ec9745dd 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \ $(dri2_backend_FILES) \ $(dri3_backend_FILES) -libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la -libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB) +libEGL_common_la_LIBADD += \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ + $(DLOPEN_LIBS) \ + $(LIBDRM_LIBS) \ + $(CLOCK_LIB) GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \ generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \ diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am 
b/src/gallium/auxiliary/pipe-loader/Makefile.am index 4ebfc97e6d9..878159f2343 100644 --- a/src/gallium/auxiliary/pipe-loader/Makefile.am +++ b/src/gallium/auxiliary/pipe-loader/Makefile.am @@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \ endif libpipe_loader_static_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la libpipe_loader_dynamic_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la EXTRA_DIST = SConscript diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index e88fa0fd382..c9d2be7afd0 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \ $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ $(top_builddir)/src/util/libmesautil.la \ - $(top_builddir)/src/util/libxmlconfig.la \ $(EXPAT_LIBS) \ $(LIBELF_LIBS) \ $(DLOPEN_LIBS) \ diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am index de8396000b7..7a9a12f87a0 100644 --- a/src/gbm/Makefile.am +++ b/src/gbm/Makefile.am @@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \ libgbm_la_LIBADD = \ $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ $(DLOPEN_LIBS) if HAVE_PLATFORM_WAYLAND diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am index b306bcc08db..34600475d98 100644 --- a/src/glx/Makefile.am +++ b/src/glx/Makefile.am @@ -97,7 +97,9 @@ libglx_la_SOURCES = \ singlepix.c \ vertarr.c -libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la +libglx_la_LIBADD = \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la if HAVE_DRISW libglx_la_SOURCES += \ diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am index 8b197f2995c..74ac6c51e77 100644 --- 
a/src/loader/Makefile.am +++ b/src/loader/Makefile.am @@ -26,6 +26,8 @@ EXTRA_DIST = SConscript noinst_LTLIBRARIES = libloader.la AM_CPPFLAGS = \ + -I$(top_builddir)/src/util/ \ + -DUSE_DRICONF \ $(DEFINES) \ -I$(top_srcdir)/include \ -I$(top_srcdir)/src \ @@ -37,19 +39,6 @@ libloader_la_CPPFLAGS = $(AM_CPPFLAGS) libloader_la_SOURCES = $(LOADER_C_FILES) libloader_la_LIBADD = -if HAVE_DRICOMMON -libloader_la_CPPFLAGS
Re: [Mesa-dev] [PATCH 5/8] clover/llvm: Use device in llvm compilation instead of copying fields
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote: > Copying the individual fields from the device when compiling/linking > will lead to an unnecessarily large number of fields getting passed > around. > > Signed-off-by: Aaron Watry> Cc: Jan Vesey I think this should be patch 3/8. It looks weird to implement new functionality one way, only to change it to a different interface in the same patch series. Jan > --- > src/gallium/state_trackers/clover/core/program.cpp | 9 +++-- > .../state_trackers/clover/llvm/invocation.cpp | 22 > ++ > .../state_trackers/clover/llvm/invocation.hpp | 7 ++- > 3 files changed, 15 insertions(+), 23 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/core/program.cpp > b/src/gallium/state_trackers/clover/core/program.cpp > index f0f0f38548..4e74fccd97 100644 > --- a/src/gallium/state_trackers/clover/core/program.cpp > +++ b/src/gallium/state_trackers/clover/core/program.cpp > @@ -53,9 +53,8 @@ program::compile(const ref_vector , const > std::string , > try { > const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? >tgsi::compile_program(_source, log) : > - llvm::compile_program(_source, headers, > -dev.ir_target(), opts, > - > dev.device_clc_version(), log)); > + llvm::compile_program(_source, headers, dev, > +opts, log)); > _builds[] = { m, opts, log }; > } catch (...) { > _builds[] = { module(), opts, log }; > @@ -79,9 +78,7 @@ program::link(const ref_vector , const > std::string , >try { > const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? > tgsi::link_program(ms) : > - llvm::link_program(ms, dev.ir_format(), > - dev.ir_target(), opts, > - dev.device_clc_version(), > log)); > + llvm::link_program(ms, dev, opts, log)); > _builds[] = { m, opts, log }; >} catch (...) 
{ > _builds[] = { module(), opts, log }; > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index ca75596b05..e761ca188d 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -261,17 +261,16 @@ namespace { > module > clover::llvm::compile_program(const std::string , >const header_map , > - const std::string , > + const device , >const std::string , > - const std::string _version, >std::string _log) { > if (has_flag(debug::clc)) >debug::log(".cl", "// Options: " + opts + '\n' + source); > > auto ctx = create_context(r_log); > - auto c = create_compiler_instance(target, tokenize(opts + " input.cl"), > - device_version, r_log); > - auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts, > + auto c = create_compiler_instance(dev.ir_target(), tokenize(opts + " > input.cl"), > + dev.device_clc_version(), r_log); > + auto mod = compile(*ctx, *c, "input.cl", source, headers, > dev.ir_target(), opts, >r_log); > > if (has_flag(debug::llvm)) > @@ -330,16 +329,15 @@ namespace { > > module > clover::llvm::link_program(const std::vector , > - enum pipe_shader_ir ir, const std::string , > + const device , > const std::string , > - const std::string _version, > std::string _log) { > std::vector options = tokenize(opts + " input.cl"); > const bool create_library = count("-create-library", options); > erase_if(equals("-create-library"), options); > > auto ctx = create_context(r_log); > - auto c = create_compiler_instance(target, options, device_version, r_log); > + auto c = create_compiler_instance(dev.ir_target(), options, > dev.device_clc_version(), r_log); > auto mod = link(*ctx, *c, modules, r_log); > > optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library); > @@ -354,14 +352,14 @@ clover::llvm::link_program(const std::vector > , > if (create_library) { >return build_module_library(*mod, 
module::section::text_library); > > - } else if (ir == PIPE_SHADER_IR_LLVM) { > + } else if (dev.ir_format() ==
Re: [Mesa-dev] [PATCH 4/8] clover/llvm: Use -cl-std and device version to select language defaults
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote: > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by: > 1) If you have -cl-std=CL1.1+ use the version specified > 2) If not, use the highest 1.x version that the device supports > > Curiously, there is no valid value for -cl-std=CL1.0 > > Signed-off-by: Aaron Watry> Cc: Pierre Moreau > > v2: (Pierre) Move create_compiler_instance changes to correct patch > to prevent temporary build breakage. > Convert version_str into unsigned and use it to find language version > Add build_error for unknown language version string > Whitespace fixes > --- > .../state_trackers/clover/llvm/invocation.cpp | 61 > +- > 1 file changed, 60 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 7c8d0e738d..ca75596b05 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -93,6 +93,65 @@ namespace { >return ctx; > } > > + unsigned get_language_version_from_string(const std::string _str){ > + if (version_str == "1.0"){ > + return 100; > + } > + if (version_str == "1.1"){ > + return 110; > + } > + if (version_str == "1.2"){ > + return 120; > + } > + if (version_str == "2.0"){ > + return 200; > + } > + throw build_error("Unknown/Unsupported language version"); > + } I'm a bit conflicted about this. returning int from device.cl_version() might be nicer, we are using C++ string so we probably don't have to worry about generating new strings all the time. > + > + clang::LangStandard::Kind > + get_language_from_version_str(const std::string _str, > + bool is_opt = false) { > + /** > +* Per CL 2.0 spec, section 5.8.4.5: > +* If it's an option, use the value directly. > +* If it's a device version, clamp to max 1.x version, a.k.a. 
1.2 > +*/ > + unsigned version = get_language_version_from_string(version_str); > + if (!is_opt && version > 120 ){ > + version = 120; > + } > + switch (version){ > + case 100: > +return clang::LangStandard::lang_opencl10; > + case 110: > +return clang::LangStandard::lang_opencl11; > + case 120: > +return clang::LangStandard::lang_opencl12; > + case 200: > +return clang::LangStandard::lang_opencl20; > + default: > +throw build_error("Unknown/Unsupported language version"); > + } > + } > + > + clang::LangStandard::Kind > + get_language_version(const std::vector , > +const std::string _version) { > + > + const std::string search = "-cl-std=CL"; > + > + for(auto opt: opts){ > + auto pos = opt.find(search); > + if (pos == 0){ > +auto ver = opt.substr(pos+search.size()); > +return get_language_from_version_str(ver, true); > + } > + } I don't think you need the above. we only set the defaults, so clang should be able to parse this option on its own if we pass it along. > + > + return get_language_from_version_str(device_version); > + } > + > std::unique_ptr > create_compiler_instance(const target , > const std::vector , > @@ -129,7 +188,7 @@ namespace { >compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(), > compat::ik_opencl, > ::llvm::Triple(target.triple), > c->getPreprocessorOpts(), > -clang::LangStandard::lang_opencl11); > +get_language_version(opts, device_version)); I'd imagine this could be something like get_language_from_version(std::max(dev.clc_version(), 120)) Jan > >c->createDiagnostics(new clang::TextDiagnosticPrinter( >*new raw_string_ostream(r_log), signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
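A rough sketch of the integer-based variant discussed above, assuming the device reported its CL C version as a number (100, 110, 120, 200) rather than a string. All names here (version_to_number, cl_c_std, cl_c_std_from_number, default_cl_c_std) are hypothetical stand-ins, and the enum only mimics clang::LangStandard::Kind. Since the patch comment describes an upper bound ("clamp to max 1.x version"), the sketch uses std::min for the clamp.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for clang::LangStandard::Kind.
enum class cl_c_std { cl10, cl11, cl12, cl20 };

// Parse a "major.minor" string with single digits, e.g. "1.2" -> 120.
unsigned version_to_number(const std::string &s) {
   if (s.size() != 3 || s[1] != '.' ||
       !std::isdigit(static_cast<unsigned char>(s[0])) ||
       !std::isdigit(static_cast<unsigned char>(s[2])))
      throw std::invalid_argument("malformed version string: " + s);
   return (s[0] - '0') * 100u + (s[2] - '0') * 10u;
}

// Map a numeric version to a language standard; used as-is for an explicit
// -cl-std= option.
cl_c_std cl_c_std_from_number(unsigned version) {
   switch (version) {
   case 100: return cl_c_std::cl10;
   case 110: return cl_c_std::cl11;
   case 120: return cl_c_std::cl12;
   case 200: return cl_c_std::cl20;
   default:
      throw std::invalid_argument("Unknown/Unsupported language version");
   }
}

// Default when no -cl-std= option is given: the device's CL C version,
// clamped to the highest 1.x version (1.2).
cl_c_std default_cl_c_std(unsigned device_clc_version) {
   return cl_c_std_from_number(std::min(device_clc_version, 120u));
}
```

A device reporting 2.0 would then default to CL C 1.2, while an explicit -cl-std=CL2.0 would still select 2.0 through cl_c_std_from_number.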
Re: [Mesa-dev] [PATCH] loader: always include libxmlconfig on autotools build
On 4 August 2017 at 10:53, Nicolai Hähnlewrote: > From: Nicolai Hähnle > > This aligns with the fact that we also check for EXPAT_LIBS > unconditionally in configure.ac now. It should make all the > various build permutations of Clover work (whether DRI is > enabled or disabled in the build). > > Cc: Aaron Watry > Cc: Emil Velikov > -- > This change keeps everything green on Travis, and it should fix > the duplicate-symbol linker error seen by Aaron and others when > building Clover. > --- > src/gallium/targets/opencl/Makefile.am | 1 - > src/loader/Makefile.am | 13 + > 2 files changed, 5 insertions(+), 9 deletions(-) > > diff --git a/src/gallium/targets/opencl/Makefile.am > b/src/gallium/targets/opencl/Makefile.am > index e88fa0fd382..c9d2be7afd0 100644 > --- a/src/gallium/targets/opencl/Makefile.am > +++ b/src/gallium/targets/opencl/Makefile.am > @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \ > $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \ > $(top_builddir)/src/gallium/auxiliary/libgallium.la \ > $(top_builddir)/src/util/libmesautil.la \ > - $(top_builddir)/src/util/libxmlconfig.la \ > $(EXPAT_LIBS) \ > $(LIBELF_LIBS) \ > $(DLOPEN_LIBS) \ > diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am > index 8b197f2995c..5ed87820664 100644 > --- a/src/loader/Makefile.am > +++ b/src/loader/Makefile.am > @@ -33,21 +33,18 @@ AM_CPPFLAGS = \ > $(XCB_DRI3_CFLAGS) \ > $(LIBDRM_CFLAGS) > > -libloader_la_CPPFLAGS = $(AM_CPPFLAGS) > +libloader_la_CPPFLAGS = $(AM_CPPFLAGS) \ > + -DUSE_DRICONF > libloader_la_SOURCES = $(LOADER_C_FILES) > -libloader_la_LIBADD = > +libloader_la_LIBADD = \ > + $(top_builddir)/src/util/libxmlconfig.la > > if HAVE_DRICOMMON > libloader_la_CPPFLAGS += \ > -I$(top_builddir)/src/util/ \ > -I$(top_srcdir)/src/mesa/drivers/dri/common/ \ > -I$(top_srcdir)/src/mesa/ \ > - -I$(top_srcdir)/src/mapi/ \ > - -DUSE_DRICONF > - > -libloader_la_LIBADD += \ > - $(top_builddir)/src/util/libxmlconfig.la > - > + -I$(top_srcdir)/src/mapi/ 
Just sent an alternative solution. It's a bit more invasive, so I'll understand if you prefer this one.

Sidenote: dri/common, mesa and mapi are no longer needed. One could drop them as a follow-up.

Please drop the HAVE_DRICOMMON guard, and assign libloader_la_CPPFLAGS at once. With that:
Reviewed-by: Emil Velikov

-Emil
[Mesa-dev] [PATCH 3/3] WIP: loader: android: allow using of xmlconfig
From: Emil Velikov

Brings the Android binaries on par with Autoconf, allowing users to select their GPU via device_id.

Signed-off-by: Emil Velikov
---
Completely untested. Posting in case anyone is interested in polishing it up.
---
 src/loader/Android.mk | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/loader/Android.mk b/src/loader/Android.mk
index ca9218846c9..4a45bf61865 100644
--- a/src/loader/Android.mk
+++ b/src/loader/Android.mk
@@ -33,6 +33,9 @@ include $(CLEAR_VARS)
 LOCAL_SRC_FILES := \
 	$(LOADER_C_FILES)
 
+# XXX: might need an include for the generated xmlconfig files
+LOCAL_CPPFLAGS := -DUSE_DRICONF
+
 LOCAL_EXPORT_C_INCLUDE_DIRS := $(LOCAL_PATH)
 LOCAL_MODULE := libmesa_loader
 
-- 
2.13.3
[Mesa-dev] [PATCH 2/3] WIP: loader: scons: allow using of xmlconfig on supported platforms
From: Emil Velikov

Brings the SCons binaries on par with Autoconf, allowing users to select their GPU via device_id.

Signed-off-by: Emil Velikov
---
Completely untested. Posting in case anyone is interested in polishing it up.
---
 src/loader/SConscript | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/loader/SConscript b/src/loader/SConscript
index f70654f43ae..e3474b2e4f0 100644
--- a/src/loader/SConscript
+++ b/src/loader/SConscript
@@ -12,6 +12,10 @@ if env['drm']:
     env.PkgUseModules('DRM')
     env.Append(CPPDEFINES = ['HAVE_LIBDRM'])
 
+# XXX: might need an include for the generated xmlconfig files
+if env['dri']:
+    env.Append(CPPDEFINES = ['USE_DRICONF'])
+
 # parse Makefile.sources
 sources = env.ParseSourceList('Makefile.sources', 'LOADER_C_FILES')
 
-- 
2.13.3
[Mesa-dev] [PATCH 1/3] loader: rework xmlconfig dependency
From: Emil VelikovCurrently xmlconfig is conditionally used, only when --enable-dri is available. As the library has moved to src/util and has wider wisebase, this guard is no longer correct. Strictly speaking - it wasn't since the introduction of xmlconfig into st/nine a while ago. Unconditionally enable xmlconfig and drop the linking. As said before there's other users of the library, so depending on the configure options we will get multiple definitions of said symbols. NOTE: To avoid breaking other combinations, this commit adds the xmlconfig link to the required places - throughout gallium and the DRI loaders. Cc: Nicolai Hähnle Cc: Aaron Watry Signed-off-by: Emil Velikov --- Nicolai, here is an alternative solution. I have a very slight inclination towards this one over your earlier patch. But either one should do, really. --- src/egl/Makefile.am | 8 ++-- src/gallium/auxiliary/pipe-loader/Makefile.am | 6 -- src/gallium/targets/opencl/Makefile.am| 1 - src/gbm/Makefile.am | 1 + src/glx/Makefile.am | 4 +++- src/loader/Makefile.am| 15 ++- 6 files changed, 16 insertions(+), 19 deletions(-) diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index ecaf148aaec..bb8ec9745dd 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -120,8 +120,12 @@ libEGL_common_la_SOURCES += \ $(dri2_backend_FILES) \ $(dri3_backend_FILES) -libEGL_common_la_LIBADD += $(top_builddir)/src/loader/libloader.la -libEGL_common_la_LIBADD += $(DLOPEN_LIBS) $(LIBDRM_LIBS) $(CLOCK_LIB) +libEGL_common_la_LIBADD += \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ + $(DLOPEN_LIBS) \ + $(LIBDRM_LIBS) \ + $(CLOCK_LIB) GLVND_GEN_DEPS = generate/gen_egl_dispatch.py \ generate/egl.xml generate/eglFunctionList.py generate/genCommon.py \ diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am b/src/gallium/auxiliary/pipe-loader/Makefile.am index 4ebfc97e6d9..878159f2343 100644 --- a/src/gallium/auxiliary/pipe-loader/Makefile.am +++ 
b/src/gallium/auxiliary/pipe-loader/Makefile.am @@ -41,9 +41,11 @@ libpipe_loader_dynamic_la_SOURCES += \ endif libpipe_loader_static_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la libpipe_loader_dynamic_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la EXTRA_DIST = SConscript diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index e88fa0fd382..c9d2be7afd0 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \ $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ $(top_builddir)/src/util/libmesautil.la \ - $(top_builddir)/src/util/libxmlconfig.la \ $(EXPAT_LIBS) \ $(LIBELF_LIBS) \ $(DLOPEN_LIBS) \ diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am index de8396000b7..7a9a12f87a0 100644 --- a/src/gbm/Makefile.am +++ b/src/gbm/Makefile.am @@ -26,6 +26,7 @@ libgbm_la_LDFLAGS = \ libgbm_la_LIBADD = \ $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la \ $(DLOPEN_LIBS) if HAVE_PLATFORM_WAYLAND diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am index b306bcc08db..34600475d98 100644 --- a/src/glx/Makefile.am +++ b/src/glx/Makefile.am @@ -97,7 +97,9 @@ libglx_la_SOURCES = \ singlepix.c \ vertarr.c -libglx_la_LIBADD = $(top_builddir)/src/loader/libloader.la +libglx_la_LIBADD = \ + $(top_builddir)/src/loader/libloader.la \ + $(top_builddir)/src/util/libxmlconfig.la if HAVE_DRISW libglx_la_SOURCES += \ diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am index 8b197f2995c..74ac6c51e77 100644 --- a/src/loader/Makefile.am +++ b/src/loader/Makefile.am @@ -26,6 +26,8 @@ EXTRA_DIST = SConscript noinst_LTLIBRARIES = libloader.la AM_CPPFLAGS = \ + 
-I$(top_builddir)/src/util/ \ + -DUSE_DRICONF \ $(DEFINES) \ -I$(top_srcdir)/include \ -I$(top_srcdir)/src \ @@ -37,19 +39,6 @@ libloader_la_CPPFLAGS = $(AM_CPPFLAGS) libloader_la_SOURCES = $(LOADER_C_FILES) libloader_la_LIBADD = -if HAVE_DRICOMMON -libloader_la_CPPFLAGS += \ - -I$(top_builddir)/src/util/ \ - -I$(top_srcdir)/src/mesa/drivers/dri/common/ \ - -I$(top_srcdir)/src/mesa/ \ - -I$(top_srcdir)/src/mapi/ \ - -DUSE_DRICONF - -libloader_la_LIBADD += \ -
Re: [Mesa-dev] [PATCH 3/8] clover: Add device_clc_version to llvm::[compile|link]_program
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote: > We'll be using it to select the default language version soon. > > Signed-off-by: Aaron Watry> Cc: Pierre Moreau > Cc: Jan Vesely > > v2: (Pierre) Move changes to create_compiler_instance invocation to correct > patch to prevent temporary build breakage. > (Jan) Use device_clc_version instead of device_version for compile/link > --- This patch looks redundant wrt changes in 5/8. Why not just add device parameter right away instead of adding version and then changing it later. Jan > src/gallium/state_trackers/clover/core/program.cpp| 6 -- > src/gallium/state_trackers/clover/llvm/invocation.cpp | 10 +++--- > src/gallium/state_trackers/clover/llvm/invocation.hpp | 2 ++ > 3 files changed, 13 insertions(+), 5 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/core/program.cpp > b/src/gallium/state_trackers/clover/core/program.cpp > index ae4b50a879..f0f0f38548 100644 > --- a/src/gallium/state_trackers/clover/core/program.cpp > +++ b/src/gallium/state_trackers/clover/core/program.cpp > @@ -54,7 +54,8 @@ program::compile(const ref_vector , const > std::string , > const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? >tgsi::compile_program(_source, log) : >llvm::compile_program(_source, headers, > -dev.ir_target(), opts, > log)); > +dev.ir_target(), opts, > + > dev.device_clc_version(), log)); > _builds[] = { m, opts, log }; > } catch (...) { > _builds[] = { module(), opts, log }; > @@ -79,7 +80,8 @@ program::link(const ref_vector , const > std::string , > const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? > tgsi::link_program(ms) : > llvm::link_program(ms, dev.ir_format(), > - dev.ir_target(), opts, log)); > + dev.ir_target(), opts, > + dev.device_clc_version(), > log)); > _builds[] = { m, opts, log }; >} catch (...) 
{ > _builds[] = { module(), opts, log }; > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 6412377faa..7c8d0e738d 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -96,6 +96,7 @@ namespace { > std::unique_ptr > create_compiler_instance(const target , > const std::vector , > +const std::string _version, > std::string _log) { >std::unique_ptr c { new > clang::CompilerInstance }; >clang::TextDiagnosticBuffer *diag_buffer = new > clang::TextDiagnosticBuffer; > @@ -203,13 +204,14 @@ clover::llvm::compile_program(const std::string , >const header_map , >const std::string , >const std::string , > + const std::string _version, >std::string _log) { > if (has_flag(debug::clc)) >debug::log(".cl", "// Options: " + opts + '\n' + source); > > auto ctx = create_context(r_log); > auto c = create_compiler_instance(target, tokenize(opts + " input.cl"), > - r_log); > + device_version, r_log); > auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts, >r_log); > > @@ -270,13 +272,15 @@ namespace { > module > clover::llvm::link_program(const std::vector , > enum pipe_shader_ir ir, const std::string , > - const std::string , std::string _log) { > + const std::string , > + const std::string _version, > + std::string _log) { > std::vector options = tokenize(opts + " input.cl"); > const bool create_library = count("-create-library", options); > erase_if(equals("-create-library"), options); > > auto ctx = create_context(r_log); > - auto c = create_compiler_instance(target, options, r_log); > + auto c = create_compiler_instance(target, options, device_version, r_log); > auto mod = link(*ctx, *c, modules, r_log); > > optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library); > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.hpp > 
b/src/gallium/state_trackers/clover/llvm/invocation.hpp > index
Re: [Mesa-dev] [PATCH 2/8] clover: Add device_clc_version to device.[hc]pp
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> device_version and device_clc_version are not necessarily the same for
> devices that support CL 1.0, but have a 1.1 compiler and the necessary
> extensions.
>
> CC: Jan Vesey

I think you might consider squashing 1/8 and 2/8.
squashed or not:
Reviewed-by: Jan Vesely

Jan

> ---
>  src/gallium/state_trackers/clover/api/device.cpp  | 2 +-
>  src/gallium/state_trackers/clover/core/device.cpp | 5 +++++
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp
> index 18ed2f059f..b1b7917e4e 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -368,7 +368,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>        break;
>
>     case CL_DEVICE_OPENCL_C_VERSION:
> -      buf.as_string() = "OpenCL C " + dev.device_version() + " ";
> +      buf.as_string() = "OpenCL C " + dev.device_clc_version() + " ";
>        break;
>
>     case CL_DEVICE_PRINTF_BUFFER_SIZE:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp
> index 0277495506..68856ae36b 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -245,3 +245,8 @@ std::string
>  device::device_version() const {
>     return "1.1";
>  }
> +
> +std::string
> +device::device_clc_version() const {
> +   return "1.1";
> +}
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp
> index 3cf7e20be5..efc217aedb 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -75,6 +75,7 @@ namespace clover {
>        std::string device_name() const;
>        std::string vendor_name() const;
>        std::string device_version() const;
> +      std::string device_clc_version() const;
>        enum pipe_shader_ir ir_format() const;
>        std::string ir_target() const;
>        enum pipe_endian endianness() const;
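To illustrate why the two queries are kept separate, here is a minimal sketch assuming a device that supports CL 1.0 but ships a 1.1 compiler, which is the case the commit message describes. The device_sketch type and the two formatting helpers are hypothetical; in the patch itself both accessors return "1.1", and the package version is passed in as a plain string here rather than taken from PACKAGE_VERSION.

```cpp
#include <string>

// Hypothetical stand-in for clover::device, with differing versions chosen
// purely for illustration.
struct device_sketch {
   std::string device_version() const { return "1.0"; }
   std::string device_clc_version() const { return "1.1"; }
};

// Mirrors the CL_DEVICE_VERSION string layout from the patch.
std::string cl_device_version_string(const device_sketch &dev,
                                     const std::string &package_version) {
   return "OpenCL " + dev.device_version() + " Mesa " + package_version;
}

// Mirrors the CL_DEVICE_OPENCL_C_VERSION string layout from the patch
// (including the trailing space).
std::string cl_device_opencl_c_version_string(const device_sketch &dev) {
   return "OpenCL C " + dev.device_clc_version() + " ";
}
```

The point is only that CL_DEVICE_OPENCL_C_VERSION reads from the compiler version, so the two strings can diverge without either query lying.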
Re: [Mesa-dev] [PATCH 1/8] clover/device: Move device version into core/device.cpp
On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> The device version is the maximum CL version that the device supports.
>
> Eventually, this will be based on the features/extensions of the actual
> device, but for now move it a bit closer to its eventual destination.
>
> Signed-off-by: Aaron Watry
> ---
>  src/gallium/state_trackers/clover/api/device.cpp  | 4 ++--
>  src/gallium/state_trackers/clover/core/device.cpp | 5 +++++
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp
> index 0b33350bb2..18ed2f059f 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -314,7 +314,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>        break;
>
>     case CL_DEVICE_VERSION:
> -      buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +      buf.as_string() = "OpenCL " + dev.device_version() + " Mesa " PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>           " (" MESA_GIT_SHA1 ")"
>  #endif
> @@ -368,7 +368,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>        break;
>
>     case CL_DEVICE_OPENCL_C_VERSION:
> -      buf.as_string() = "OpenCL C 1.1 ";
> +      buf.as_string() = "OpenCL C " + dev.device_version() + " ";
>        break;

This chunk looks out of place, especially since you change it again in 2/8.
With this fixed:
Reviewed-by: Jan Vesely

Jan

>
>     case CL_DEVICE_PRINTF_BUFFER_SIZE:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp
> index 2ad9e49cf8..0277495506 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -240,3 +240,8 @@ enum pipe_endian
>  device::endianness() const {
>     return (enum pipe_endian)pipe->get_param(pipe, PIPE_CAP_ENDIANNESS);
>  }
> +
> +std::string
> +device::device_version() const {
> +   return "1.1";
> +}
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp
> index 7b3353df34..3cf7e20be5 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -74,6 +74,7 @@ namespace clover {
>        cl_uint address_bits() const;
>        std::string device_name() const;
>        std::string vendor_name() const;
> +      std::string device_version() const;
>        enum pipe_shader_ir ir_format() const;
>        std::string ir_target() const;
>        enum pipe_endian endianness() const;
[Mesa-dev] [PATCH 24/25] i965: Mark brw_hw_type_to_reg_type() as a pure function
   text    data     bss     dec     hex filename
7816886  346248  420496 8583630  82f9ce i965_dri.so before
7816214  346248  420496 8582958  82f72e i965_dri.so after
---
 src/intel/compiler/brw_reg_type.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_reg_type.h b/src/intel/compiler/brw_reg_type.h
index 5d05f293c6..08dc1715a9 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -28,6 +28,12 @@
 extern "C" {
 #endif
 
+#ifdef HAVE_FUNC_ATTRIBUTE_PURE
+#define ATTRIBUTE_PURE __attribute__((__pure__))
+#else
+#define ATTRIBUTE_PURE
+#endif
+
 enum brw_reg_file;
 struct gen_device_info;
 
@@ -59,7 +65,7 @@ unsigned
 brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
                         enum brw_reg_file file, enum brw_reg_type type);
 
-enum brw_reg_type
+enum brw_reg_type ATTRIBUTE_PURE
 brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
                         enum brw_reg_file file, unsigned hw_type);
 
-- 
2.13.0
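As a standalone illustration of what the attribute buys, here is a minimal sketch of a table lookup marked pure. It assumes compiler detection via __GNUC__ / __clang__ instead of the configure-provided HAVE_FUNC_ATTRIBUTE_PURE used in the patch, and the reg_type enum, the table contents and hw_type_to_reg_type are invented for the example (they are not the i965 encodings). A pure function has no side effects and its result depends only on its arguments and readable memory, so the compiler may merge repeated calls with the same arguments, which is presumably where the text-size reduction above comes from.

```cpp
// Assumption: detect attribute support from the compiler rather than from a
// configure check as the patch does.
#if defined(__GNUC__) || defined(__clang__)
#define ATTRIBUTE_PURE __attribute__((__pure__))
#else
#define ATTRIBUTE_PURE
#endif

// Hypothetical register types; not the real brw_reg_type values.
enum class reg_type { UD, D, F, invalid };

// The declaration carries the attribute so every caller sees it.
ATTRIBUTE_PURE reg_type hw_type_to_reg_type(unsigned hw_type);

// Only reads a constant table and its argument, so "pure" is accurate here.
reg_type hw_type_to_reg_type(unsigned hw_type) {
   static const reg_type table[] = { reg_type::UD, reg_type::D, reg_type::F };
   const unsigned count = sizeof(table) / sizeof(table[0]);
   return hw_type < count ? table[hw_type] : reg_type::invalid;
}
```

If the function ever grew a side effect (logging, caching into a global), the attribute would have to go, since the compiler is allowed to drop or merge calls.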
[Mesa-dev] [PATCH 22/25] i965: Stop using hardware register types directly
--- src/intel/compiler/brw_disasm.c | 47 - src/intel/compiler/brw_eu_validate.c | 196 --- src/intel/compiler/brw_reg_type.c| 17 +-- src/intel/compiler/brw_reg_type.h| 11 +- 4 files changed, 113 insertions(+), 158 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index 02b48c9cf2..e2675b5f4c 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -707,7 +707,8 @@ reg(FILE *file, unsigned _reg_file, unsigned _reg_nr) static int dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { - unsigned elem_size = brw_element_size(devinfo, inst, dst); + enum brw_reg_type type = brw_inst_dst_type(devinfo, inst); + unsigned elem_size = brw_reg_type_to_size(type); int err = 0; if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) { @@ -723,10 +724,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) err |= control(file, "horiz stride", horiz_stride, brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); - string(file, -brw_hw_reg_type_to_letters(devinfo, - brw_inst_dst_reg_file(devinfo, inst), - brw_inst_dst_reg_hw_type(devinfo, inst))); + string(file, brw_reg_type_to_letters(type)); } else { string(file, "g[a0"); if (brw_inst_dst_ia_subreg_nr(devinfo, inst)) @@ -738,10 +736,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) err |= control(file, "horiz stride", horiz_stride, brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); - string(file, -brw_hw_reg_type_to_letters(devinfo, - brw_inst_dst_reg_file(devinfo, inst), - brw_inst_dst_reg_hw_type(devinfo, inst))); + string(file, brw_reg_type_to_letters(type)); } } else { if (brw_inst_dst_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) { @@ -754,10 +749,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) string(file, "<1>"); err |= control(file, "writemask", writemask, brw_inst_da16_writemask(devinfo, inst), NULL); - 
string(file, -brw_hw_reg_type_to_letters(devinfo, - brw_inst_dst_reg_file(devinfo, inst), - brw_inst_dst_reg_hw_type(devinfo, inst))); + string(file, brw_reg_type_to_letters(type)); } else { err = 1; string(file, "Indirect align16 address mode not supported"); @@ -812,7 +804,7 @@ static int src_da1(FILE *file, const struct gen_device_info *devinfo, unsigned opcode, -unsigned type, unsigned _reg_file, +enum brw_reg_type type, unsigned _reg_file, unsigned _vert_stride, unsigned _width, unsigned _horiz_stride, unsigned reg_num, unsigned sub_reg_num, unsigned __abs, unsigned _negate) @@ -830,11 +822,11 @@ src_da1(FILE *file, if (err == -1) return 0; if (sub_reg_num) { - unsigned elem_size = brw_hw_reg_type_to_size(devinfo, _reg_file, type); + unsigned elem_size = brw_reg_type_to_size(type); format(file, ".%d", sub_reg_num / elem_size); /* use formal style like spec */ } src_align1_region(file, _vert_stride, _width, _horiz_stride); - string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type)); + string(file, brw_reg_type_to_letters(type)); return err; } @@ -842,7 +834,7 @@ static int src_ia1(FILE *file, const struct gen_device_info *devinfo, unsigned opcode, -unsigned type, +enum brw_reg_type type, unsigned _reg_file, int _addr_imm, unsigned _addr_subreg_nr, @@ -866,7 +858,7 @@ src_ia1(FILE *file, format(file, " %d", _addr_imm); string(file, "]"); src_align1_region(file, _vert_stride, _width, _horiz_stride); - string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type)); + string(file, brw_reg_type_to_letters(type)); return err; } @@ -896,7 +888,7 @@ static int src_da16(FILE *file, const struct gen_device_info *devinfo, unsigned opcode, - unsigned _reg_type, + enum brw_reg_type type, unsigned _reg_file, unsigned _vert_stride, unsigned _reg_nr, @@ -918,8 +910,7 @@ src_da16(FILE *file, if (err == -1) return 0; if (_subreg_nr) { - unsigned elem_size = - brw_hw_reg_type_to_size(devinfo, _reg_file, _reg_type); + unsigned elem_size = brw_reg_type_to_size(type); 
/* bit4 for subreg number byte addressing. Make
[Mesa-dev] [PATCH 25/25] i965: Optimize reading the destination type
brw_hw_type_to_reg_type() needs to know only whether the file is
BRW_IMMEDIATE_VALUE or not, which is not a valid file for the
destination. gcc and clang will evaluate __builtin_strcmp() at compile
time, so we can use it to pass a constant file for the destination.

   text    data     bss     dec     hex filename
7816214  346248  420496 8582958  82f72e i965_dri.so before
7816070  346248  420496 8582814  82f69e i965_dri.so after
---
 src/intel/compiler/brw_inst.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index eacc0a024a..e9dad38f69 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -669,7 +669,9 @@ static inline enum brw_reg_type \
 brw_inst_##reg##_type(const struct gen_device_info *devinfo, \
                       const brw_inst *inst) \
 { \
-   unsigned file = brw_inst_##reg##_reg_file(devinfo, inst); \
+   unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \
+                   BRW_GENERAL_REGISTER_FILE : \
+                   brw_inst_##reg##_reg_file(devinfo, inst); \
    unsigned hw_type = brw_inst_##reg##_reg_hw_type(devinfo, inst); \
    return brw_hw_type_to_reg_type(devinfo, (enum brw_reg_file)file, hw_type); \
 }
--
2.13.0
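For anyone who wants to try the trick outside of Mesa, here is a minimal
sketch of the same idea. It is not the brw_inst.h code: toy_inst,
TOY_FILE_* and the generated toy_<reg>_file() getters are made-up names
for illustration, and it assumes gcc or clang for __builtin_strcmp().
The macro stringifies its argument, so the comparison is between two
string literals and the compiler can fold it to a constant.

```c
#include <assert.h>

/* Hypothetical stand-ins for brw_reg_file and brw_inst; not the Mesa types. */
enum toy_file { TOY_FILE_GRF = 1, TOY_FILE_IMM = 3 };

struct toy_inst {
   unsigned dst_file;
   unsigned src0_file;
};

/*
 * Generate toy_<reg>_file().  #reg turns the macro argument into a string
 * literal, so __builtin_strcmp() compares two literals.  For "dst" the
 * condition is true and the getter returns a fixed file without reading
 * the instruction; for every other operand it reads the stored field.
 */
#define TOY_FILE_GETTER(reg)                                   \
static inline unsigned                                         \
toy_##reg##_file(const struct toy_inst *inst)                  \
{                                                              \
   return __builtin_strcmp("dst", #reg) == 0 ?                 \
          (unsigned)TOY_FILE_GRF :                             \
          inst->reg##_file;                                    \
}

TOY_FILE_GETTER(dst)
TOY_FILE_GETTER(src0)
#undef TOY_FILE_GETTER
```

Note that the dst getter ignores whatever is stored in dst_file. That is
only acceptable because, as the commit message says, an immediate is not
a valid file for the destination.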
[Mesa-dev] [PATCH 23/25] i965: Hide the register type hardware encodings
So we stop mixing them with the logical enum. --- src/intel/compiler/brw_eu_defines.h | 31 --- src/intel/compiler/brw_reg_type.c | 31 +++ 2 files changed, 31 insertions(+), 31 deletions(-) diff --git a/src/intel/compiler/brw_eu_defines.h b/src/intel/compiler/brw_eu_defines.h index 44bde3ff51..da482b73c5 100644 --- a/src/intel/compiler/brw_eu_defines.h +++ b/src/intel/compiler/brw_eu_defines.h @@ -819,37 +819,6 @@ enum PACKED brw_reg_file { BAD_FILE, }; -enum hw_reg_type { - BRW_HW_REG_TYPE_UD = 0, - BRW_HW_REG_TYPE_D = 1, - BRW_HW_REG_TYPE_UW = 2, - BRW_HW_REG_TYPE_W = 3, - BRW_HW_REG_TYPE_F = 7, - GEN8_HW_REG_TYPE_UQ = 8, - GEN8_HW_REG_TYPE_Q = 9, - - BRW_HW_REG_TYPE_UB = 4, - BRW_HW_REG_TYPE_B = 5, - GEN7_HW_REG_TYPE_DF = 6, - GEN8_HW_REG_TYPE_HF = 10, -}; - -enum hw_imm_type { - BRW_HW_IMM_TYPE_UD = 0, - BRW_HW_IMM_TYPE_D = 1, - BRW_HW_IMM_TYPE_UW = 2, - BRW_HW_IMM_TYPE_W = 3, - BRW_HW_IMM_TYPE_F = 7, - GEN8_HW_IMM_TYPE_UQ = 8, - GEN8_HW_IMM_TYPE_Q = 9, - - BRW_HW_IMM_TYPE_UV = 4, /* Gen6+ packed unsigned immediate vector */ - BRW_HW_IMM_TYPE_VF = 5, /* packed float immediate vector */ - BRW_HW_IMM_TYPE_V = 6, /* packed int imm. vector; uword dest only */ - GEN8_HW_IMM_TYPE_DF = 10, - GEN8_HW_IMM_TYPE_HF = 11, -}; - /* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so * the types were implied. 
IVB adds BFE and BFI2 that operate on doublewords * and unsigned doublewords, so a new field is also available in the da3src diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c index b3e24b195c..fced942740 100644 --- a/src/intel/compiler/brw_reg_type.c +++ b/src/intel/compiler/brw_reg_type.c @@ -27,6 +27,37 @@ #define INVALID (-1) +enum hw_reg_type { + BRW_HW_REG_TYPE_UD = 0, + BRW_HW_REG_TYPE_D = 1, + BRW_HW_REG_TYPE_UW = 2, + BRW_HW_REG_TYPE_W = 3, + BRW_HW_REG_TYPE_F = 7, + GEN8_HW_REG_TYPE_UQ = 8, + GEN8_HW_REG_TYPE_Q = 9, + + BRW_HW_REG_TYPE_UB = 4, + BRW_HW_REG_TYPE_B = 5, + GEN7_HW_REG_TYPE_DF = 6, + GEN8_HW_REG_TYPE_HF = 10, +}; + +enum hw_imm_type { + BRW_HW_IMM_TYPE_UD = 0, + BRW_HW_IMM_TYPE_D = 1, + BRW_HW_IMM_TYPE_UW = 2, + BRW_HW_IMM_TYPE_W = 3, + BRW_HW_IMM_TYPE_F = 7, + GEN8_HW_IMM_TYPE_UQ = 8, + GEN8_HW_IMM_TYPE_Q = 9, + + BRW_HW_IMM_TYPE_UV = 4, + BRW_HW_IMM_TYPE_VF = 5, + BRW_HW_IMM_TYPE_V = 6, + GEN8_HW_IMM_TYPE_DF = 10, + GEN8_HW_IMM_TYPE_HF = 11, +}; + static const struct { enum hw_reg_type reg_type; enum hw_imm_type imm_type; -- 2.13.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 20/25] i965: Move brw_reg_type_letters() as well
And add "to_" to the name for consistency with the other functions in this file. --- src/intel/compiler/brw_eu.c | 28 src/intel/compiler/brw_fs.cpp | 4 ++-- src/intel/compiler/brw_reg.h | 1 - src/intel/compiler/brw_reg_type.c | 30 ++ src/intel/compiler/brw_reg_type.h | 3 +++ src/intel/compiler/brw_vec4.cpp | 4 ++-- 6 files changed, 37 insertions(+), 33 deletions(-) diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c index 700a1badd4..b0bdc38f4b 100644 --- a/src/intel/compiler/brw_eu.c +++ b/src/intel/compiler/brw_eu.c @@ -37,34 +37,6 @@ #include "util/ralloc.h" -/** - * Converts a BRW_REGISTER_TYPE_* enum to a short string (F, UD, and so on). - * - * This is different than reg_encoding from brw_disasm.c in that it operates - * on the abstract enum values, rather than the generation-specific encoding. - */ -const char * -brw_reg_type_letters(unsigned type) -{ - const char *names[] = { - [BRW_REGISTER_TYPE_UD] = "UD", - [BRW_REGISTER_TYPE_D] = "D", - [BRW_REGISTER_TYPE_UW] = "UW", - [BRW_REGISTER_TYPE_W] = "W", - [BRW_REGISTER_TYPE_F] = "F", - [BRW_REGISTER_TYPE_UB] = "UB", - [BRW_REGISTER_TYPE_B] = "B", - [BRW_REGISTER_TYPE_UV] = "UV", - [BRW_REGISTER_TYPE_V] = "V", - [BRW_REGISTER_TYPE_VF] = "VF", - [BRW_REGISTER_TYPE_DF] = "DF", - [BRW_REGISTER_TYPE_HF] = "HF", - [BRW_REGISTER_TYPE_UQ] = "UQ", - [BRW_REGISTER_TYPE_Q] = "Q", - }; - return names[type]; -} - /* Returns a conditional modifier that negates the condition. 
*/ enum brw_conditional_mod brw_negate_cmod(uint32_t cmod) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 0ea4c4f1cc..b48dc4167e 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -5346,7 +5346,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file) if (inst->dst.stride != 1) fprintf(file, "<%u>", inst->dst.stride); - fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type)); + fprintf(file, ":%s, ", brw_reg_type_to_letters(inst->dst.type)); for (int i = 0; i < inst->sources; i++) { if (inst->src[i].negate) @@ -5443,7 +5443,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file) if (stride != 1) fprintf(file, "<%u>", stride); - fprintf(file, ":%s", brw_reg_type_letters(inst->src[i].type)); + fprintf(file, ":%s", brw_reg_type_to_letters(inst->src[i].type)); } if (i < inst->sources - 1 && inst->src[i + 1].file != BAD_FILE) diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h index 9be2b52831..441dfb2447 100644 --- a/src/intel/compiler/brw_reg.h +++ b/src/intel/compiler/brw_reg.h @@ -203,7 +203,6 @@ brw_mask_for_swizzle(unsigned swz) return brw_apply_inv_swizzle_to_mask(swz, ~0); } -const char *brw_reg_type_letters(unsigned brw_reg_type); uint32_t brw_swizzle_immediate(enum brw_reg_type type, uint32_t x, unsigned swz); #define REG_SIZE (8*4) diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c index 859bcac047..9b048f228d 100644 --- a/src/intel/compiler/brw_reg_type.c +++ b/src/intel/compiler/brw_reg_type.c @@ -124,3 +124,33 @@ brw_hw_reg_type_to_size(const struct gen_device_info *devinfo, enum brw_reg_type type = brw_hw_type_to_reg_type(devinfo, file, hw_type); return type_size[type]; } + +/** + * Converts a BRW_REGISTER_TYPE_* enum to a short string (F, UD, and so on). 
+ * + * This is different than reg_encoding from brw_disasm.c in that it operates + * on the abstract enum values, rather than the generation-specific encoding. + */ +const char * +brw_reg_type_to_letters(enum brw_reg_type type) +{ + static const char letters[][3] = { + [BRW_REGISTER_TYPE_DF] = "DF", + [BRW_REGISTER_TYPE_F] = "F", + [BRW_REGISTER_TYPE_HF] = "HF", + [BRW_REGISTER_TYPE_VF] = "VF", + + [BRW_REGISTER_TYPE_Q] = "Q", + [BRW_REGISTER_TYPE_UQ] = "UQ", + [BRW_REGISTER_TYPE_D] = "D", + [BRW_REGISTER_TYPE_UD] = "UD", + [BRW_REGISTER_TYPE_W] = "W", + [BRW_REGISTER_TYPE_UW] = "UW", + [BRW_REGISTER_TYPE_B] = "B", + [BRW_REGISTER_TYPE_UB] = "UB", + [BRW_REGISTER_TYPE_V] = "V", + [BRW_REGISTER_TYPE_UV] = "UV", + }; + assert(type < ARRAY_SIZE(letters)); + return letters[type]; +} diff --git a/src/intel/compiler/brw_reg_type.h b/src/intel/compiler/brw_reg_type.h index 743522b294..64f259d2a3 100644 --- a/src/intel/compiler/brw_reg_type.h +++ b/src/intel/compiler/brw_reg_type.h @@ -71,6 +71,9 @@ unsigned brw_hw_reg_type_to_size(const struct gen_device_info *devinfo, enum brw_reg_file file, unsigned hw_type); +const char * +brw_reg_type_to_letters(enum brw_reg_type type); + #ifdef __cplusplus } #endif diff --git a/src/intel/compiler/brw_vec4.cpp
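The new lookup in this patch swaps an array of pointers for a
`char [][3]` table and adds a bounds assert. A stripped-down sketch of
that pattern follows; toy_type and toy_type_to_letters() are hypothetical
names, not the Mesa enum or function, and the motivation given in the
comment is my reading rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical logical type enum, much smaller than brw_reg_type. */
enum toy_type { TOY_TYPE_F, TOY_TYPE_D, TOY_TYPE_UD, TOY_TYPE_HF };

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

const char *
toy_type_to_letters(enum toy_type type)
{
   /*
    * Each row stores at most two letters plus the terminating NUL, so the
    * table is a single block of constant bytes.  Unlike a table of
    * "const char *" it contains no pointers that need relocating in a
    * shared library (an assumed benefit, not one the patch claims).
    */
   static const char letters[][3] = {
      [TOY_TYPE_F]  = "F",
      [TOY_TYPE_D]  = "D",
      [TOY_TYPE_UD] = "UD",
      [TOY_TYPE_HF] = "HF",
   };

   /* Catch out-of-range values before indexing, as the patch does. */
   assert((size_t)type < TOY_ARRAY_SIZE(letters));
   return letters[type];
}
```

Any name longer than two letters would need the row width bumped, which
the compiler reports as an over-long initializer string.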
[Mesa-dev] [PATCH 21/25] i965: Add brw_hw_reg_type_to_letters() and use it in brw_disasm.c
--- src/intel/compiler/brw_disasm.c | 72 ++- src/intel/compiler/brw_reg_type.c | 8 + src/intel/compiler/brw_reg_type.h | 4 +++ 3 files changed, 45 insertions(+), 39 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index 731e64a8ad..02b48c9cf2 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -237,21 +237,6 @@ static const char *const access_mode[2] = { [1] = "align16", }; -static const char * const reg_encoding[] = { - [BRW_HW_REG_TYPE_UD] = "UD", - [BRW_HW_REG_TYPE_D] = "D", - [BRW_HW_REG_TYPE_UW] = "UW", - [BRW_HW_REG_TYPE_W] = "W", - [BRW_HW_REG_TYPE_F] = "F", - [GEN8_HW_REG_TYPE_UQ] = "UQ", - [GEN8_HW_REG_TYPE_Q] = "Q", - - [BRW_HW_REG_TYPE_UB] = "UB", - [BRW_HW_REG_TYPE_B] = "B", - [GEN7_HW_REG_TYPE_DF] = "DF", - [GEN8_HW_REG_TYPE_HF] = "HF", -}; - static const char *const three_source_reg_encoding[] = { [BRW_3SRC_TYPE_F] = "F", [BRW_3SRC_TYPE_D] = "D", @@ -738,8 +723,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) err |= control(file, "horiz stride", horiz_stride, brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); - err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_hw_type(devinfo, inst), NULL); + string(file, +brw_hw_reg_type_to_letters(devinfo, + brw_inst_dst_reg_file(devinfo, inst), + brw_inst_dst_reg_hw_type(devinfo, inst))); } else { string(file, "g[a0"); if (brw_inst_dst_ia_subreg_nr(devinfo, inst)) @@ -751,8 +738,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) err |= control(file, "horiz stride", horiz_stride, brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); - err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_hw_type(devinfo, inst), NULL); + string(file, +brw_hw_reg_type_to_letters(devinfo, + brw_inst_dst_reg_file(devinfo, inst), + brw_inst_dst_reg_hw_type(devinfo, inst))); } } else { if (brw_inst_dst_address_mode(devinfo, inst) == 
BRW_ADDRESS_DIRECT) { @@ -765,8 +754,10 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) string(file, "<1>"); err |= control(file, "writemask", writemask, brw_inst_da16_writemask(devinfo, inst), NULL); - err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_hw_type(devinfo, inst), NULL); + string(file, +brw_hw_reg_type_to_letters(devinfo, + brw_inst_dst_reg_file(devinfo, inst), + brw_inst_dst_reg_hw_type(devinfo, inst))); } else { err = 1; string(file, "Indirect align16 address mode not supported"); @@ -843,7 +834,7 @@ src_da1(FILE *file, format(file, ".%d", sub_reg_num / elem_size); /* use formal style like spec */ } src_align1_region(file, _vert_stride, _width, _horiz_stride); - err |= control(file, "src reg encoding", reg_encoding, type, NULL); + string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type)); return err; } @@ -875,7 +866,7 @@ src_ia1(FILE *file, format(file, " %d", _addr_imm); string(file, "]"); src_align1_region(file, _vert_stride, _width, _horiz_stride); - err |= control(file, "src reg encoding", reg_encoding, type, NULL); + string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, type)); return err; } @@ -938,7 +929,7 @@ src_da16(FILE *file, err |= control(file, "vert stride", vert_stride, _vert_stride, NULL); string(file, ">"); err |= src_swizzle(file, BRW_SWIZZLE4(swz_x, swz_y, swz_z, swz_w)); - err |= control(file, "src da16 reg type", reg_encoding, _reg_type, NULL); + string(file, brw_hw_reg_type_to_letters(devinfo, _reg_file, _reg_type)); return err; } @@ -1025,50 +1016,53 @@ src2_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *ins } static int -imm(FILE *file, const struct gen_device_info *devinfo, enum hw_imm_type type, +imm(FILE *file, const struct gen_device_info *devinfo, enum brw_reg_type type, const brw_inst *inst) { switch (type) { - case GEN8_HW_IMM_TYPE_UQ: + case BRW_REGISTER_TYPE_UQ: format(file, "0x%16lxUD", brw_inst_imm_uq(devinfo, inst)); 
break; - case GEN8_HW_IMM_TYPE_Q: + case BRW_REGISTER_TYPE_Q: format(file, "%ldD", brw_inst_imm_uq(devinfo, inst)); break; - case
[Mesa-dev] [PATCH 19/25] i965: Switch to using the logical register types
---
 src/intel/compiler/brw_eu_compact.c | 27 ++++++++++++++++-----------
 src/intel/compiler/brw_eu_emit.c    | 13 +++----------
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c
index 743ee9519c..7674aa8b85 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -995,10 +995,11 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst)
        !(devinfo->is_haswell &&
          brw_inst_opcode(devinfo, &inst) == BRW_OPCODE_DIM) &&
        !(devinfo->gen >= 8 &&
-         (brw_inst_src0_reg_hw_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_DF ||
-          brw_inst_src0_reg_hw_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_UQ ||
-          brw_inst_src0_reg_hw_type(devinfo, &inst) == GEN8_HW_IMM_TYPE_Q))) {
-      brw_inst_set_src1_reg_hw_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
+         (brw_inst_src0_type(devinfo, &inst) == BRW_REGISTER_TYPE_DF ||
+          brw_inst_src0_type(devinfo, &inst) == BRW_REGISTER_TYPE_UQ ||
+          brw_inst_src0_type(devinfo, &inst) == BRW_REGISTER_TYPE_Q))) {
+      enum brw_reg_file file = brw_inst_src0_reg_file(devinfo, &inst);
+      brw_inst_set_src1_file_type(devinfo, &inst, file, BRW_REGISTER_TYPE_UD);
    }

    /* Compacted instructions only have 12-bits (plus 1 for the other 20)
@@ -1013,10 +1014,11 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst)
     * If we see a 0.0:F, change the type to VF so that it can be compacted.
     */
    if (brw_inst_imm_ud(devinfo, &inst) == 0x0 &&
-       brw_inst_src0_reg_hw_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
-       brw_inst_dst_reg_hw_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
+       brw_inst_src0_type(devinfo, &inst) == BRW_REGISTER_TYPE_F &&
+       brw_inst_dst_type(devinfo, &inst) == BRW_REGISTER_TYPE_F &&
        brw_inst_dst_hstride(devinfo, &inst) == BRW_HORIZONTAL_STRIDE_1) {
-      brw_inst_set_src0_reg_hw_type(devinfo, &inst, BRW_HW_IMM_TYPE_VF);
+      enum brw_reg_file file = brw_inst_src0_reg_file(devinfo, &inst);
+      brw_inst_set_src0_file_type(devinfo, &inst, file, BRW_REGISTER_TYPE_VF);
    }

    /* There are no mappings for dst:d | i:d, so if the immediate is suitable
@@ -1024,10 +1026,13 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst)
     */
    if (is_compactable_immediate(brw_inst_imm_ud(devinfo, &inst)) &&
        brw_inst_cond_modifier(devinfo, &inst) == BRW_CONDITIONAL_NONE &&
-       brw_inst_src0_reg_hw_type(devinfo, &inst) == BRW_HW_REG_TYPE_D &&
-       brw_inst_dst_reg_hw_type(devinfo, &inst) == BRW_HW_REG_TYPE_D) {
-      brw_inst_set_src0_reg_hw_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
-      brw_inst_set_dst_reg_hw_type(devinfo, &inst, BRW_HW_REG_TYPE_UD);
+       brw_inst_src0_type(devinfo, &inst) == BRW_REGISTER_TYPE_D &&
+       brw_inst_dst_type(devinfo, &inst) == BRW_REGISTER_TYPE_D) {
+      enum brw_reg_file src_file = brw_inst_src0_reg_file(devinfo, &inst);
+      enum brw_reg_file dst_file = brw_inst_dst_reg_file(devinfo, &inst);
+
+      brw_inst_set_src0_file_type(devinfo, &inst, src_file, BRW_REGISTER_TYPE_UD);
+      brw_inst_set_dst_file_type(devinfo, &inst, dst_file, BRW_REGISTER_TYPE_UD);
    }

    return inst;
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 064e4a0387..8c952e7da2 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -96,10 +96,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)

    gen7_convert_mrf_to_grf(p, &dest);

-   brw_inst_set_dst_reg_file(devinfo, inst, dest.file);
-   brw_inst_set_dst_reg_hw_type(devinfo, inst,
-                                brw_reg_type_to_hw_type(devinfo, dest.file,
-                                                        dest.type));
+   brw_inst_set_dst_file_type(devinfo, inst, dest.file, dest.type);
    brw_inst_set_dst_address_mode(devinfo, inst, dest.address_mode);

    if (dest.address_mode == BRW_ADDRESS_DIRECT) {
@@ -263,9 +260,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)

    validate_reg(devinfo, inst, reg);

-   brw_inst_set_src0_reg_file(devinfo, inst, reg.file);
-   brw_inst_set_src0_reg_hw_type(devinfo, inst,
-                                 brw_reg_type_to_hw_type(devinfo, reg.file, reg.type));
+   brw_inst_set_src0_file_type(devinfo, inst, reg.file, reg.type);
    brw_inst_set_src0_abs(devinfo, inst, reg.abs);
    brw_inst_set_src0_negate(devinfo, inst, reg.negate);
    brw_inst_set_src0_address_mode(devinfo, inst, reg.address_mode);
@@ -370,9 +365,7 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)

    validate_reg(devinfo, inst, reg);

-   brw_inst_set_src1_reg_file(devinfo, inst, reg.file);
-   brw_inst_set_src1_reg_hw_type(devinfo, inst,
-                                 brw_reg_type_to_hw_type(devinfo, reg.file, reg.type));
+   brw_inst_set_src1_file_type(devinfo, inst, reg.file, reg.type);
    brw_inst_set_src1_abs(devinfo, inst, reg.abs);
    brw_inst_set_src1_negate(devinfo, inst, reg.negate);

--
2.13.0
[Mesa-dev] [PATCH 17/25] i965: Rename brw_inst's functions that access the register type
Put hw_ in the name so that it's clear these are the hardware encodings. --- src/intel/compiler/brw_disasm.c | 22 src/intel/compiler/brw_eu_compact.c | 22 src/intel/compiler/brw_eu_emit.c| 18 +++ src/intel/compiler/brw_eu_validate.c| 28 +- src/intel/compiler/brw_inst.h | 6 +-- src/intel/compiler/brw_reg_type.h | 8 +-- src/intel/compiler/test_eu_validate.cpp | 94 - 7 files changed, 99 insertions(+), 99 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index aafea693fc..731e64a8ad 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -739,7 +739,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_type(devinfo, inst), NULL); +brw_inst_dst_reg_hw_type(devinfo, inst), NULL); } else { string(file, "g[a0"); if (brw_inst_dst_ia_subreg_nr(devinfo, inst)) @@ -752,7 +752,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) brw_inst_dst_hstride(devinfo, inst), NULL); string(file, ">"); err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_type(devinfo, inst), NULL); +brw_inst_dst_reg_hw_type(devinfo, inst), NULL); } } else { if (brw_inst_dst_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) { @@ -766,7 +766,7 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) err |= control(file, "writemask", writemask, brw_inst_da16_writemask(devinfo, inst), NULL); err |= control(file, "dest reg encoding", reg_encoding, -brw_inst_dst_reg_type(devinfo, inst), NULL); +brw_inst_dst_reg_hw_type(devinfo, inst), NULL); } else { err = 1; string(file, "Indirect align16 address mode not supported"); @@ -1077,13 +1077,13 @@ static int src0(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { if (brw_inst_src0_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) { - 
return imm(file, devinfo, brw_inst_src0_reg_type(devinfo, inst), inst); + return imm(file, devinfo, brw_inst_src0_reg_hw_type(devinfo, inst), inst); } else if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) { if (brw_inst_src0_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) { return src_da1(file, devinfo, brw_inst_opcode(devinfo, inst), -brw_inst_src0_reg_type(devinfo, inst), +brw_inst_src0_reg_hw_type(devinfo, inst), brw_inst_src0_reg_file(devinfo, inst), brw_inst_src0_vstride(devinfo, inst), brw_inst_src0_width(devinfo, inst), @@ -1096,7 +1096,7 @@ src0(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) return src_ia1(file, devinfo, brw_inst_opcode(devinfo, inst), -brw_inst_src0_reg_type(devinfo, inst), +brw_inst_src0_reg_hw_type(devinfo, inst), brw_inst_src0_reg_file(devinfo, inst), brw_inst_src0_ia1_addr_imm(devinfo, inst), brw_inst_src0_ia_subreg_nr(devinfo, inst), @@ -,7 +,7 @@ src0(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) return src_da16(file, devinfo, brw_inst_opcode(devinfo, inst), - brw_inst_src0_reg_type(devinfo, inst), + brw_inst_src0_reg_hw_type(devinfo, inst), brw_inst_src0_reg_file(devinfo, inst), brw_inst_src0_vstride(devinfo, inst), brw_inst_src0_da_reg_nr(devinfo, inst), @@ -1133,13 +1133,13 @@ static int src1(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { if (brw_inst_src1_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) { - return imm(file, devinfo, brw_inst_src1_reg_type(devinfo, inst), inst); + return imm(file, devinfo, brw_inst_src1_reg_hw_type(devinfo, inst), inst); } else if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) { if (brw_inst_src1_address_mode(devinfo, inst) == BRW_ADDRESS_DIRECT) { return src_da1(file, devinfo, brw_inst_opcode(devinfo, inst), -
[Mesa-dev] [PATCH 18/25] i965: Add functions to abstract access to register types
Previously the brw_inst{,_set}_{dst,src0,src1}_reg_type() functions provided access to the hardware encodings for the register types. We often mixed these with the logical BRW_REGISTER_TYPE_* enums (which themselves used to be the hardware format!) with bad results. With that functionality now available with the hw_ versions (see previous commit), we now add functions that take the logical BRW_REGISTER_TYPE_* enums and convert into the hardware format and vice versa. To do the conversion we also have to provide the file. Note the asymmetry between the two functions: the new getter reads the file from the instruction word, and to ensure that is always set the setter writes both the file and the type. --- src/intel/compiler/brw_inst.h | 28 + src/intel/compiler/test_eu_validate.cpp | 102 2 files changed, 79 insertions(+), 51 deletions(-) diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h index 4195150112..eacc0a024a 100644 --- a/src/intel/compiler/brw_inst.h +++ b/src/intel/compiler/brw_inst.h @@ -35,6 +35,7 @@ #include #include "brw_eu_defines.h" +#include "brw_reg_type.h" #include "common/gen_device_info.h" #ifdef __cplusplus @@ -652,6 +653,33 @@ brw_inst_set_imm_uq(const struct gen_device_info *devinfo, /** @} */ +#define REG_TYPE(reg) \ +static inline void\ +brw_inst_set_##reg##_file_type(const struct gen_device_info *devinfo, \ + brw_inst *inst, enum brw_reg_file file,\ + enum brw_reg_type type)\ +{ \ + assert(file <= BRW_IMMEDIATE_VALUE); \ + unsigned hw_type = brw_reg_type_to_hw_type(devinfo, file, type); \ + brw_inst_set_##reg##_reg_file(devinfo, inst, file);\ + brw_inst_set_##reg##_reg_hw_type(devinfo, inst, hw_type); \ +} \ + \ +static inline enum brw_reg_type \ +brw_inst_##reg##_type(const struct gen_device_info *devinfo, \ + const brw_inst *inst) \ +{ \ + unsigned file = brw_inst_##reg##_reg_file(devinfo, inst); \ + unsigned hw_type = brw_inst_##reg##_reg_hw_type(devinfo, inst);\ + return brw_hw_type_to_reg_type(devinfo, (enum 
brw_reg_file)file, hw_type); \ +} + +REG_TYPE(dst) +REG_TYPE(src0) +REG_TYPE(src1) +#undef REG_TYPE + + /* The AddrImm fields are split into two discontiguous sections on Gen8+ */ #define BRW_IA1_ADDR_IMM(reg, g4_high, g4_low, g8_nine, g8_high, g8_low) \ static inline void \ diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp index c368688829..46d2b83e34 100644 --- a/src/intel/compiler/test_eu_validate.cpp +++ b/src/intel/compiler/test_eu_validate.cpp @@ -208,19 +208,19 @@ TEST_P(validation_test, opcode46) TEST_P(validation_test, dest_stride_must_be_equal_to_the_ratio_of_exec_size_to_dest_size) { brw_ADD(p, g0, g0, g0); - brw_inst_set_dst_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_W); - brw_inst_set_src0_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_D); - brw_inst_set_src1_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_D); + brw_inst_set_dst_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_W); + brw_inst_set_src0_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D); + brw_inst_set_src1_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D); EXPECT_FALSE(validate(p)); clear_instructions(p); brw_ADD(p, g0, g0, g0); - brw_inst_set_dst_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_W); + brw_inst_set_dst_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_W); brw_inst_set_dst_hstride(, last_inst, BRW_HORIZONTAL_STRIDE_2); - brw_inst_set_src0_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_D); - brw_inst_set_src1_reg_hw_type(, last_inst, BRW_HW_REG_TYPE_D); + brw_inst_set_src0_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D); + brw_inst_set_src1_file_type(, last_inst, BRW_GENERAL_REGISTER_FILE, BRW_REGISTER_TYPE_D); EXPECT_TRUE(validate(p)); } @@ -234,9 +234,9 @@ TEST_P(validation_test, dst_subreg_must_be_aligned_to_exec_type_size) brw_ADD(p, g0, g0, g0); brw_inst_set_dst_da1_subreg_nr(, last_inst, 2); brw_inst_set_dst_hstride(, last_inst, 
BRW_HORIZONTAL_STRIDE_2); -
[Mesa-dev] [PATCH 15/25] i965: Add a brw_hw_type_to_reg_type() function
Will be used in later commits.
---
 src/intel/compiler/brw_reg_type.c | 25 +++++++++++++++++++++++++
 src/intel/compiler/brw_reg_type.h |  4 ++++
 2 files changed, 29 insertions(+)

diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c
index b0696570e5..8da93ae1cb 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -70,6 +70,31 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
 }

 /**
+ * Convert the hardware representation into a brw_reg_type enumeration value.
+ *
+ * The hardware encoding may depend on whether the value is an immediate.
+ */
+enum brw_reg_type
+brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
+                        enum brw_reg_file file, unsigned hw_type)
+{
+   if (file == BRW_IMMEDIATE_VALUE) {
+      for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
+         if (gen4_hw_type[i].imm_type == hw_type) {
+            return i;
+         }
+      }
+   } else {
+      for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
+         if (gen4_hw_type[i].reg_type == hw_type) {
+            return i;
+         }
+      }
+   }
+   unreachable("not reached");
+}
+
+/**
  * Return the element size given a hardware register type and file.
  *
  * The hardware encoding may depend on whether the value is an immediate.
diff --git a/src/intel/compiler/brw_reg_type.h b/src/intel/compiler/brw_reg_type.h
index 54262af1fc..f5c19c03f9 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -59,6 +59,10 @@ unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
                                  enum brw_reg_file file,
                                  enum brw_reg_type type);

+enum brw_reg_type
+brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
+                        enum brw_reg_file file, unsigned hw_type);
+
 #define brw_element_size(devinfo, inst, operand)                             \
    brw_hw_reg_type_to_size(devinfo,                                          \
                            brw_inst_ ## operand ## _reg_file(devinfo, inst), \
--
2.13.0
[Mesa-dev] [PATCH 16/25] i965: Index brw_hw_reg_type_to_size()'s table by logical type
I'll be transitioning everything to use the logical types.
---
 src/intel/compiler/brw_reg_type.c | 58 +--
 1 file changed, 19 insertions(+), 39 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c
index 8da93ae1cb..859bcac047 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -104,43 +104,23 @@ brw_hw_reg_type_to_size(const struct gen_device_info *devinfo,
                         enum brw_reg_file file,
                         unsigned hw_type)
 {
-   if (file == BRW_IMMEDIATE_VALUE) {
-      static const int hw_sizes[] = {
-         [0 ... 15]            = -1,
-         [BRW_HW_IMM_TYPE_UD]  = 4,
-         [BRW_HW_IMM_TYPE_D]   = 4,
-         [BRW_HW_IMM_TYPE_UW]  = 2,
-         [BRW_HW_IMM_TYPE_W]   = 2,
-         [BRW_HW_IMM_TYPE_UV]  = 2,
-         [BRW_HW_IMM_TYPE_VF]  = 4,
-         [BRW_HW_IMM_TYPE_V]   = 2,
-         [BRW_HW_IMM_TYPE_F]   = 4,
-         [GEN8_HW_IMM_TYPE_UQ] = 8,
-         [GEN8_HW_IMM_TYPE_Q]  = 8,
-         [GEN8_HW_IMM_TYPE_DF] = 8,
-         [GEN8_HW_IMM_TYPE_HF] = 2,
-      };
-      assert(hw_type < ARRAY_SIZE(hw_sizes));
-      assert(hw_sizes[hw_type] != -1);
-      return hw_sizes[hw_type];
-   } else {
-      /* Non-immediate registers */
-      static const int hw_sizes[] = {
-         [0 ... 15]            = -1,
-         [BRW_HW_REG_TYPE_UD]  = 4,
-         [BRW_HW_REG_TYPE_D]   = 4,
-         [BRW_HW_REG_TYPE_UW]  = 2,
-         [BRW_HW_REG_TYPE_W]   = 2,
-         [BRW_HW_REG_TYPE_UB]  = 1,
-         [BRW_HW_REG_TYPE_B]   = 1,
-         [GEN7_HW_REG_TYPE_DF] = 8,
-         [BRW_HW_REG_TYPE_F]   = 4,
-         [GEN8_HW_REG_TYPE_UQ] = 8,
-         [GEN8_HW_REG_TYPE_Q]  = 8,
-         [GEN8_HW_REG_TYPE_HF] = 2,
-      };
-      assert(hw_type < ARRAY_SIZE(hw_sizes));
-      assert(hw_sizes[hw_type] != -1);
-      return hw_sizes[hw_type];
-   }
+   static const unsigned type_size[] = {
+      [BRW_REGISTER_TYPE_DF] = 8,
+      [BRW_REGISTER_TYPE_F]  = 4,
+      [BRW_REGISTER_TYPE_HF] = 2,
+      [BRW_REGISTER_TYPE_VF] = 4,
+
+      [BRW_REGISTER_TYPE_Q]  = 8,
+      [BRW_REGISTER_TYPE_UQ] = 8,
+      [BRW_REGISTER_TYPE_D]  = 4,
+      [BRW_REGISTER_TYPE_UD] = 4,
+      [BRW_REGISTER_TYPE_W]  = 2,
+      [BRW_REGISTER_TYPE_UW] = 2,
+      [BRW_REGISTER_TYPE_B]  = 1,
+      [BRW_REGISTER_TYPE_UB] = 1,
+      [BRW_REGISTER_TYPE_V]  = 2,
+      [BRW_REGISTER_TYPE_UV] = 2,
+   };
+   enum brw_reg_type type = brw_hw_type_to_reg_type(devinfo, file, hw_type);
+   return type_size[type];
 }
--
2.13.0
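The shape of this change, decoding to the logical type first and then
indexing one size table, can be sketched in a few lines. Everything
below is illustrative rather than Mesa code: toy_decode() is a
hypothetical stand-in for brw_hw_type_to_reg_type(), written as a switch
for brevity, and only five types are modelled. The encodings used (1, 3,
5, 7) mirror the hw_reg_type/hw_imm_type values quoted in patch 23.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_type { TOY_TYPE_F, TOY_TYPE_VF, TOY_TYPE_D, TOY_TYPE_W, TOY_TYPE_B,
                TOY_TYPE_COUNT };

/* Hypothetical decode step: hardware encoding + file kind -> logical type. */
static enum toy_type
toy_decode(bool is_immediate, unsigned hw_type)
{
   switch (hw_type) {
   case 1: return TOY_TYPE_D;
   case 3: return TOY_TYPE_W;
   case 5: return is_immediate ? TOY_TYPE_VF : TOY_TYPE_B;
   case 7: return TOY_TYPE_F;
   default:
      assert(!"unknown hardware encoding");
      return TOY_TYPE_F;
   }
}

/* One table keyed by logical type replaces one table per kind of file. */
static unsigned
toy_hw_type_to_size(bool is_immediate, unsigned hw_type)
{
   static const unsigned type_size[TOY_TYPE_COUNT] = {
      [TOY_TYPE_F]  = 4,
      [TOY_TYPE_VF] = 4,
      [TOY_TYPE_D]  = 4,
      [TOY_TYPE_W]  = 2,
      [TOY_TYPE_B]  = 1,
   };
   return type_size[toy_decode(is_immediate, hw_type)];
}
```

The file-dependent part now lives only in the decode step, so encoding 5
still yields 1 byte for a register operand and 4 bytes for an immediate
without the size table knowing anything about files.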
[Mesa-dev] [PATCH 13/25] i965: Extract functions dealing with register types to separate file
I'm going to encapsulate all of the logic dealing with register types in this file. Rename the parameters for the hardware encodings from type -> hw_type at the same time. --- src/intel/Makefile.sources| 2 + src/intel/compiler/brw_eu_emit.c | 102 -- src/intel/compiler/brw_reg.h | 35 +-- src/intel/compiler/brw_reg_type.c | 128 ++ src/intel/compiler/brw_reg_type.h | 74 ++ 5 files changed, 205 insertions(+), 136 deletions(-) create mode 100644 src/intel/compiler/brw_reg_type.c create mode 100644 src/intel/compiler/brw_reg_type.h diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources index 5d8785832f..4074ba9ee5 100644 --- a/src/intel/Makefile.sources +++ b/src/intel/Makefile.sources @@ -81,6 +81,8 @@ COMPILER_FILES = \ compiler/brw_packed_float.c \ compiler/brw_predicated_break.cpp \ compiler/brw_reg.h \ + compiler/brw_reg_type.c \ + compiler/brw_reg_type.h \ compiler/brw_schedule_instructions.cpp \ compiler/brw_shader.cpp \ compiler/brw_shader.h \ diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 12e9d332a1..133a28e1bf 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -84,108 +84,6 @@ gen7_convert_mrf_to_grf(struct brw_codegen *p, struct brw_reg *reg) } } -/** - * Convert a brw_reg_type enumeration value into the hardware representation. - * - * The hardware encoding may depend on whether the value is an immediate. - */ -unsigned -brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, -enum brw_reg_file file, -enum brw_reg_type type) -{ - if (file == BRW_IMMEDIATE_VALUE) { - static const enum hw_imm_type hw_types[] = { - [0 ... 
BRW_REGISTER_TYPE_LAST] = -1, - [BRW_REGISTER_TYPE_UD] = BRW_HW_IMM_TYPE_UD, - [BRW_REGISTER_TYPE_D] = BRW_HW_IMM_TYPE_D, - [BRW_REGISTER_TYPE_UW] = BRW_HW_IMM_TYPE_UW, - [BRW_REGISTER_TYPE_W] = BRW_HW_IMM_TYPE_W, - [BRW_REGISTER_TYPE_F] = BRW_HW_IMM_TYPE_F, - [BRW_REGISTER_TYPE_UV] = BRW_HW_IMM_TYPE_UV, - [BRW_REGISTER_TYPE_VF] = BRW_HW_IMM_TYPE_VF, - [BRW_REGISTER_TYPE_V] = BRW_HW_IMM_TYPE_V, - [BRW_REGISTER_TYPE_DF] = GEN8_HW_IMM_TYPE_DF, - [BRW_REGISTER_TYPE_HF] = GEN8_HW_IMM_TYPE_HF, - [BRW_REGISTER_TYPE_UQ] = GEN8_HW_IMM_TYPE_UQ, - [BRW_REGISTER_TYPE_Q] = GEN8_HW_IMM_TYPE_Q, - }; - assert(type < ARRAY_SIZE(hw_types)); - assert(hw_types[type] != -1); - return hw_types[type]; - } else { - /* Non-immediate registers */ - static const enum hw_reg_type hw_types[] = { - [0 ... BRW_REGISTER_TYPE_LAST] = -1, - [BRW_REGISTER_TYPE_UD] = BRW_HW_REG_TYPE_UD, - [BRW_REGISTER_TYPE_D] = BRW_HW_REG_TYPE_D, - [BRW_REGISTER_TYPE_UW] = BRW_HW_REG_TYPE_UW, - [BRW_REGISTER_TYPE_W] = BRW_HW_REG_TYPE_W, - [BRW_REGISTER_TYPE_UB] = BRW_HW_REG_TYPE_UB, - [BRW_REGISTER_TYPE_B] = BRW_HW_REG_TYPE_B, - [BRW_REGISTER_TYPE_F] = BRW_HW_REG_TYPE_F, - [BRW_REGISTER_TYPE_DF] = GEN7_HW_REG_TYPE_DF, - [BRW_REGISTER_TYPE_HF] = GEN8_HW_REG_TYPE_HF, - [BRW_REGISTER_TYPE_UQ] = GEN8_HW_REG_TYPE_UQ, - [BRW_REGISTER_TYPE_Q] = GEN8_HW_REG_TYPE_Q, - }; - assert(type < ARRAY_SIZE(hw_types)); - assert(hw_types[type] != -1); - return hw_types[type]; - } -} - -/** - * Return the element size given a hardware register type and file. - * - * The hardware encoding may depend on whether the value is an immediate. - */ -unsigned -brw_hw_reg_type_to_size(const struct gen_device_info *devinfo, -enum brw_reg_file file, -unsigned type) -{ - if (file == BRW_IMMEDIATE_VALUE) { - static const int hw_sizes[] = { - [0 ... 
15]= -1, - [BRW_HW_IMM_TYPE_UD] = 4, - [BRW_HW_IMM_TYPE_D] = 4, - [BRW_HW_IMM_TYPE_UW] = 2, - [BRW_HW_IMM_TYPE_W] = 2, - [BRW_HW_IMM_TYPE_UV] = 2, - [BRW_HW_IMM_TYPE_VF] = 4, - [BRW_HW_IMM_TYPE_V] = 2, - [BRW_HW_IMM_TYPE_F] = 4, - [GEN8_HW_IMM_TYPE_UQ] = 8, - [GEN8_HW_IMM_TYPE_Q] = 8, - [GEN8_HW_IMM_TYPE_DF] = 8, - [GEN8_HW_IMM_TYPE_HF] = 2, - }; - assert(type < ARRAY_SIZE(hw_sizes)); - assert(hw_sizes[type] != -1); - return hw_sizes[type]; - } else { - /* Non-immediate registers */ - static const int hw_sizes[] = { - [0 ... 15]= -1, - [BRW_HW_REG_TYPE_UD] = 4, - [BRW_HW_REG_TYPE_D] = 4, - [BRW_HW_REG_TYPE_UW] = 2, - [BRW_HW_REG_TYPE_W] = 2, - [BRW_HW_REG_TYPE_UB] = 1, - [BRW_HW_REG_TYPE_B] = 1, - [GEN7_HW_REG_TYPE_DF] = 8, -
[Mesa-dev] [PATCH 14/25] i965: Use a common table to translate logical to hardware types
--- src/intel/compiler/brw_reg_type.c | 65 +-- 1 file changed, 29 insertions(+), 36 deletions(-) diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c index 8aac0ca009..b0696570e5 100644 --- a/src/intel/compiler/brw_reg_type.c +++ b/src/intel/compiler/brw_reg_type.c @@ -25,6 +25,29 @@ #include "brw_eu_defines.h" #include "common/gen_device_info.h" +#define INVALID (-1) + +static const struct { + enum hw_reg_type reg_type; + enum hw_imm_type imm_type; +} gen4_hw_type[] = { + [BRW_REGISTER_TYPE_DF] = { GEN7_HW_REG_TYPE_DF, GEN8_HW_IMM_TYPE_DF }, + [BRW_REGISTER_TYPE_F] = { BRW_HW_REG_TYPE_F, BRW_HW_IMM_TYPE_F }, + [BRW_REGISTER_TYPE_HF] = { GEN8_HW_REG_TYPE_HF, GEN8_HW_IMM_TYPE_HF }, + [BRW_REGISTER_TYPE_VF] = { INVALID, BRW_HW_IMM_TYPE_VF }, + + [BRW_REGISTER_TYPE_Q] = { GEN8_HW_REG_TYPE_Q, GEN8_HW_IMM_TYPE_Q }, + [BRW_REGISTER_TYPE_UQ] = { GEN8_HW_REG_TYPE_UQ, GEN8_HW_IMM_TYPE_UQ }, + [BRW_REGISTER_TYPE_D] = { BRW_HW_REG_TYPE_D, BRW_HW_IMM_TYPE_D }, + [BRW_REGISTER_TYPE_UD] = { BRW_HW_REG_TYPE_UD, BRW_HW_IMM_TYPE_UD }, + [BRW_REGISTER_TYPE_W] = { BRW_HW_REG_TYPE_W, BRW_HW_IMM_TYPE_W }, + [BRW_REGISTER_TYPE_UW] = { BRW_HW_REG_TYPE_UW, BRW_HW_IMM_TYPE_UW }, + [BRW_REGISTER_TYPE_B] = { BRW_HW_REG_TYPE_B, INVALID }, + [BRW_REGISTER_TYPE_UB] = { BRW_HW_REG_TYPE_UB, INVALID }, + [BRW_REGISTER_TYPE_V] = { INVALID, BRW_HW_IMM_TYPE_V }, + [BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV }, +}; + /** * Convert a brw_reg_type enumeration value into the hardware representation. * @@ -35,44 +58,14 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, enum brw_reg_file file, enum brw_reg_type type) { + assert(type < ARRAY_SIZE(gen4_hw_type)); + if (file == BRW_IMMEDIATE_VALUE) { - static const enum hw_imm_type hw_types[] = { - [0 ... 
BRW_REGISTER_TYPE_LAST] = -1, - [BRW_REGISTER_TYPE_UD] = BRW_HW_IMM_TYPE_UD, - [BRW_REGISTER_TYPE_D] = BRW_HW_IMM_TYPE_D, - [BRW_REGISTER_TYPE_UW] = BRW_HW_IMM_TYPE_UW, - [BRW_REGISTER_TYPE_W] = BRW_HW_IMM_TYPE_W, - [BRW_REGISTER_TYPE_F] = BRW_HW_IMM_TYPE_F, - [BRW_REGISTER_TYPE_UV] = BRW_HW_IMM_TYPE_UV, - [BRW_REGISTER_TYPE_VF] = BRW_HW_IMM_TYPE_VF, - [BRW_REGISTER_TYPE_V] = BRW_HW_IMM_TYPE_V, - [BRW_REGISTER_TYPE_DF] = GEN8_HW_IMM_TYPE_DF, - [BRW_REGISTER_TYPE_HF] = GEN8_HW_IMM_TYPE_HF, - [BRW_REGISTER_TYPE_UQ] = GEN8_HW_IMM_TYPE_UQ, - [BRW_REGISTER_TYPE_Q] = GEN8_HW_IMM_TYPE_Q, - }; - assert(type < ARRAY_SIZE(hw_types)); - assert(hw_types[type] != -1); - return hw_types[type]; + assert(gen4_hw_type[type].imm_type != INVALID); + return gen4_hw_type[type].imm_type; } else { - /* Non-immediate registers */ - static const enum hw_reg_type hw_types[] = { - [0 ... BRW_REGISTER_TYPE_LAST] = -1, - [BRW_REGISTER_TYPE_UD] = BRW_HW_REG_TYPE_UD, - [BRW_REGISTER_TYPE_D] = BRW_HW_REG_TYPE_D, - [BRW_REGISTER_TYPE_UW] = BRW_HW_REG_TYPE_UW, - [BRW_REGISTER_TYPE_W] = BRW_HW_REG_TYPE_W, - [BRW_REGISTER_TYPE_UB] = BRW_HW_REG_TYPE_UB, - [BRW_REGISTER_TYPE_B] = BRW_HW_REG_TYPE_B, - [BRW_REGISTER_TYPE_F] = BRW_HW_REG_TYPE_F, - [BRW_REGISTER_TYPE_DF] = GEN7_HW_REG_TYPE_DF, - [BRW_REGISTER_TYPE_HF] = GEN8_HW_REG_TYPE_HF, - [BRW_REGISTER_TYPE_UQ] = GEN8_HW_REG_TYPE_UQ, - [BRW_REGISTER_TYPE_Q] = GEN8_HW_REG_TYPE_Q, - }; - assert(type < ARRAY_SIZE(hw_types)); - assert(hw_types[type] != -1); - return hw_types[type]; + assert(gen4_hw_type[type].reg_type != INVALID); + return gen4_hw_type[type].reg_type; } } -- 2.13.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
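For anyone skimming the series, the idea of this patch can be sketched in isolation: one table row per logical type, one column per register file, and a sentinel where no encoding exists. This is only a minimal sketch. The enum names, the `to_hw_type` helper and the numeric encodings below are made up for illustration and are not the driver's real definitions; the fields are plain `int` so that the `INVALID` comparison stays simple.

```c
#include <assert.h>
#include <stddef.h>

#define INVALID (-1)
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down stand-ins for brw_reg_file / brw_reg_type. */
enum reg_file { FILE_GRF, FILE_IMM };
enum reg_type { TYPE_F, TYPE_VF, TYPE_D, TYPE_UB };

/* One row per logical type; each column holds the encoding for one file.
 * The numeric values are invented for this example.
 */
static const struct {
   int reg_type;
   int imm_type;
} hw_type_table[] = {
   [TYPE_F]  = { 7,       7       },
   [TYPE_VF] = { INVALID, 5       },   /* immediates only */
   [TYPE_D]  = { 1,       1       },
   [TYPE_UB] = { 4,       INVALID },   /* registers only */
};

/* Returns the encoding, or INVALID when the type has none for that file. */
static int
to_hw_type(enum reg_file file, enum reg_type type)
{
   assert((size_t)type < ARRAY_SIZE(hw_type_table));

   return file == FILE_IMM ? hw_type_table[type].imm_type
                           : hw_type_table[type].reg_type;
}
```

Unlike the patch, which asserts on an invalid combination, the sketch returns the sentinel so a caller can test for it; that choice is mine, not the author's.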
[Mesa-dev] [PATCH 12/25] i965: Reverse file/type arguments to register type functions
I think of the initial arguments as "state" and the last as the actual subject. --- src/intel/compiler/brw_disasm.c | 4 ++-- src/intel/compiler/brw_eu_emit.c | 14 -- src/intel/compiler/brw_eu_validate.c | 2 +- src/intel/compiler/brw_reg.h | 8 4 files changed, 15 insertions(+), 13 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index 6da7060517..aafea693fc 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -839,7 +839,7 @@ src_da1(FILE *file, if (err == -1) return 0; if (sub_reg_num) { - unsigned elem_size = brw_hw_reg_type_to_size(devinfo, type, _reg_file); + unsigned elem_size = brw_hw_reg_type_to_size(devinfo, _reg_file, type); format(file, ".%d", sub_reg_num / elem_size); /* use formal style like spec */ } src_align1_region(file, _vert_stride, _width, _horiz_stride); @@ -928,7 +928,7 @@ src_da16(FILE *file, return 0; if (_subreg_nr) { unsigned elem_size = - brw_hw_reg_type_to_size(devinfo, _reg_type, _reg_file); + brw_hw_reg_type_to_size(devinfo, _reg_file, _reg_type); /* bit4 for subreg number byte addressing. Make this same meaning as in da1 case, so output looks consistent. 
*/ diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 69ebf6345c..12e9d332a1 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -91,7 +91,8 @@ gen7_convert_mrf_to_grf(struct brw_codegen *p, struct brw_reg *reg) */ unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, -enum brw_reg_type type, enum brw_reg_file file) +enum brw_reg_file file, +enum brw_reg_type type) { if (file == BRW_IMMEDIATE_VALUE) { static const enum hw_imm_type hw_types[] = { @@ -141,7 +142,8 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, */ unsigned brw_hw_reg_type_to_size(const struct gen_device_info *devinfo, -unsigned type, enum brw_reg_file file) +enum brw_reg_file file, +unsigned type) { if (file == BRW_IMMEDIATE_VALUE) { static const int hw_sizes[] = { @@ -198,8 +200,8 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest) brw_inst_set_dst_reg_file(devinfo, inst, dest.file); brw_inst_set_dst_reg_type(devinfo, inst, - brw_reg_type_to_hw_type(devinfo, dest.type, - dest.file)); + brw_reg_type_to_hw_type(devinfo, dest.file, + dest.type)); brw_inst_set_dst_address_mode(devinfo, inst, dest.address_mode); if (dest.address_mode == BRW_ADDRESS_DIRECT) { @@ -365,7 +367,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg) brw_inst_set_src0_reg_file(devinfo, inst, reg.file); brw_inst_set_src0_reg_type(devinfo, inst, - brw_reg_type_to_hw_type(devinfo, reg.type, reg.file)); + brw_reg_type_to_hw_type(devinfo, reg.file, reg.type)); brw_inst_set_src0_abs(devinfo, inst, reg.abs); brw_inst_set_src0_negate(devinfo, inst, reg.negate); brw_inst_set_src0_address_mode(devinfo, inst, reg.address_mode); @@ -472,7 +474,7 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg) brw_inst_set_src1_reg_file(devinfo, inst, reg.file); brw_inst_set_src1_reg_type(devinfo, inst, - brw_reg_type_to_hw_type(devinfo, reg.type, reg.file)); + 
brw_reg_type_to_hw_type(devinfo, reg.file, reg.type)); brw_inst_set_src1_abs(devinfo, inst, reg.abs); brw_inst_set_src1_negate(devinfo, inst, reg.negate); diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c index 54e0a2e62e..cacf962904 100644 --- a/src/intel/compiler/brw_eu_validate.c +++ b/src/intel/compiler/brw_eu_validate.c @@ -459,7 +459,7 @@ general_restrictions_based_on_operand_types(const struct gen_device_info *devinf unsigned exec_type = execution_type(devinfo, inst); unsigned exec_type_size = - brw_hw_reg_type_to_size(devinfo, exec_type, BRW_GENERAL_REGISTER_FILE); + brw_hw_reg_type_to_size(devinfo, BRW_GENERAL_REGISTER_FILE, exec_type); unsigned dst_type_size = brw_element_size(devinfo, inst, dst); /* On IVB/BYT, region parameters and execution size for DF are in terms of diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h index bd179606b0..db932cfeee 100644 --- a/src/intel/compiler/brw_reg.h +++ b/src/intel/compiler/brw_reg.h @@ -227,14 +227,14 @@ enum PACKED brw_reg_type { }; unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, -
[Mesa-dev] [PATCH 11/25] i965: Add support for disassembling 64-bit integer immediates
After the last patch converted things into enums, I helpfully got a compiler
warning about these missing from the switch statement.
---
 src/intel/compiler/brw_disasm.c | 6 ++++++
 src/intel/compiler/brw_inst.h   | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index b5c283058a..6da7060517 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -1029,6 +1029,12 @@ imm(FILE *file, const struct gen_device_info *devinfo, enum hw_imm_type type,
     const brw_inst *inst)
 {
    switch (type) {
+   case GEN8_HW_IMM_TYPE_UQ:
+      format(file, "0x%16lxUD", brw_inst_imm_uq(devinfo, inst));
+      break;
+   case GEN8_HW_IMM_TYPE_Q:
+      format(file, "%ldD", brw_inst_imm_uq(devinfo, inst));
+      break;
    case BRW_HW_IMM_TYPE_UD:
       format(file, "0x%08xUD", brw_inst_imm_ud(devinfo, inst));
       break;
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 5b2ce32ae4..cd3b0e95ea 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -569,6 +569,13 @@ brw_inst_imm_ud(const struct gen_device_info *devinfo, const brw_inst *insn)
    return brw_inst_bits(insn, 127, 96);
 }
 
+static inline uint64_t
+brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
+{
+   assert(devinfo->gen >= 8);
+   return brw_inst_bits(insn, 127, 64);
+}
+
 static inline float
 brw_inst_imm_f(const struct gen_device_info *devinfo, const brw_inst *insn)
 {
-- 
2.13.0
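A side note on printing the 64-bit immediates: `%lx` and `%ld` match `uint64_t` only on platforms where `long` is 64 bits wide, so a portable variant would probably go through the `<inttypes.h>` macros. The following is a rough sketch under my own assumptions, not the patch's code: `struct inst128`, `inst_imm_uq`, `format_imm_uq` and `format_imm_q` are hypothetical names, the two-word instruction layout is assumed, and the `UQ`/`Q` suffixes are my guess at what one might want (the posted patch prints `UD` and `D`).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: a 128-bit instruction stored as two 64-bit words,
 * data[0] = bits 63:0 and data[1] = bits 127:64.
 */
struct inst128 {
   uint64_t data[2];
};

/* In this sketch the 64-bit immediate occupies bits 127:64. */
static uint64_t
inst_imm_uq(const struct inst128 *inst)
{
   assert(inst != NULL);
   return inst->data[1];
}

/* Unsigned: fixed-width, zero-padded hex. Returns the snprintf() result. */
static int
format_imm_uq(char *buf, size_t len, uint64_t value)
{
   return snprintf(buf, len, "0x%016" PRIx64 "UQ", value);
}

/* Signed: reinterpret the same bits as int64_t (two's complement is
 * assumed for the conversion) and print in decimal.
 */
static int
format_imm_q(char *buf, size_t len, uint64_t bits)
{
   return snprintf(buf, len, "%" PRId64 "Q", (int64_t)bits);
}
```

With these helpers, an immediate whose bits are 0xfffffffffffffffb formats as a negative decimal through `format_imm_q` and as sixteen hex digits through `format_imm_uq`.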
[Mesa-dev] [PATCH 07/25] i965: Don't let raw-move check be tricked by immediate vector types
UB and B type encodings are the same as UV and VF. Noticed when writing the
following patch.
---
 src/intel/compiler/brw_eu_validate.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index e089c1f90f..827cd707c7 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -96,10 +96,17 @@ inst_is_raw_move(const struct gen_device_info *devinfo, const brw_inst *inst)
    unsigned dst_type = signed_type(brw_inst_dst_reg_type(devinfo, inst));
    unsigned src_type = signed_type(brw_inst_src0_reg_type(devinfo, inst));
 
-   if (brw_inst_src0_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE &&
-       (brw_inst_src0_negate(devinfo, inst) ||
-        brw_inst_src0_abs(devinfo, inst)))
+   if (brw_inst_src0_reg_file(devinfo, inst) == BRW_IMMEDIATE_VALUE) {
+      /* FIXME: not strictly true */
+      if (brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_VF ||
+          brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_UV ||
+          brw_inst_src0_reg_type(devinfo, inst) == BRW_HW_REG_IMM_TYPE_V) {
+         return false;
+      }
+   } else if (brw_inst_src0_negate(devinfo, inst) ||
+              brw_inst_src0_abs(devinfo, inst)) {
       return false;
+   }
 
    return brw_inst_opcode(devinfo, inst) == BRW_OPCODE_MOV &&
           brw_inst_saturate(devinfo, inst) == 0 &&
-- 
2.13.0
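To make the collision concrete: the same numeric type field means one thing for a register source and another for an immediate, so a check that only looks at the number can mistake a vector immediate for a byte type. A minimal sketch of the source-side check follows, assuming a hypothetical flattened `struct src0` and illustrative encoding values picked only so that UB == UV and B == VF as described above (see brw_eu_defines.h for the real ones).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative encodings only, chosen so that UB == UV and B == VF. */
enum { HW_REG_UB = 4, HW_REG_B = 5 };
enum { HW_IMM_UV = 4, HW_IMM_VF = 5, HW_IMM_V = 6 };

/* Hypothetical flattened view of the src0 fields the check reads. */
struct src0 {
   bool is_immediate;
   unsigned hw_type;
   bool negate;
   bool abs;
};

/* True when src0 does not, by itself, rule out a raw move. */
static bool
src0_allows_raw_move(const struct src0 *src)
{
   assert(src != NULL);

   if (src->is_immediate) {
      /* The type field must be decoded as an immediate type here:
       * vector immediates are expanded, so the move is not raw.
       */
      return src->hw_type != HW_IMM_VF &&
             src->hw_type != HW_IMM_UV &&
             src->hw_type != HW_IMM_V;
   }

   /* Source modifiers only exist for register sources. */
   return !src->negate && !src->abs;
}
```

Note that a UB register and a UV immediate carry the same number in the type field here, yet only the register case passes.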
[Mesa-dev] [PATCH 10/25] i965: Use separate enums for register vs immediate types
The hardware encodings often mean different things depending on whether the source is an immediate. --- src/intel/compiler/brw_disasm.c | 46 --- src/intel/compiler/brw_eu_compact.c | 8 +-- src/intel/compiler/brw_eu_defines.h | 48 +-- src/intel/compiler/brw_eu_emit.c | 109 +-- src/intel/compiler/brw_eu_validate.c | 60 +-- src/intel/compiler/brw_reg.h | 2 + 6 files changed, 144 insertions(+), 129 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index 3a33614523..b5c283058a 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -238,17 +238,18 @@ static const char *const access_mode[2] = { }; static const char * const reg_encoding[] = { - [BRW_HW_REG_TYPE_UD] = "UD", - [BRW_HW_REG_TYPE_D] = "D", - [BRW_HW_REG_TYPE_UW] = "UW", - [BRW_HW_REG_TYPE_W] = "W", - [BRW_HW_REG_NON_IMM_TYPE_UB] = "UB", - [BRW_HW_REG_NON_IMM_TYPE_B] = "B", - [GEN7_HW_REG_NON_IMM_TYPE_DF] = "DF", - [BRW_HW_REG_TYPE_F] = "F", - [GEN8_HW_REG_TYPE_UQ] = "UQ", - [GEN8_HW_REG_TYPE_Q] = "Q", - [GEN8_HW_REG_NON_IMM_TYPE_HF] = "HF", + [BRW_HW_REG_TYPE_UD] = "UD", + [BRW_HW_REG_TYPE_D] = "D", + [BRW_HW_REG_TYPE_UW] = "UW", + [BRW_HW_REG_TYPE_W] = "W", + [BRW_HW_REG_TYPE_F] = "F", + [GEN8_HW_REG_TYPE_UQ] = "UQ", + [GEN8_HW_REG_TYPE_Q] = "Q", + + [BRW_HW_REG_TYPE_UB] = "UB", + [BRW_HW_REG_TYPE_B] = "B", + [GEN7_HW_REG_TYPE_DF] = "DF", + [GEN8_HW_REG_TYPE_HF] = "HF", }; static const char *const three_source_reg_encoding[] = { @@ -1024,41 +1025,42 @@ src2_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *ins } static int -imm(FILE *file, const struct gen_device_info *devinfo, unsigned type, const brw_inst *inst) +imm(FILE *file, const struct gen_device_info *devinfo, enum hw_imm_type type, +const brw_inst *inst) { switch (type) { - case BRW_HW_REG_TYPE_UD: + case BRW_HW_IMM_TYPE_UD: format(file, "0x%08xUD", brw_inst_imm_ud(devinfo, inst)); break; - case BRW_HW_REG_TYPE_D: + case BRW_HW_IMM_TYPE_D: format(file, 
"%dD", brw_inst_imm_d(devinfo, inst)); break; - case BRW_HW_REG_TYPE_UW: + case BRW_HW_IMM_TYPE_UW: format(file, "0x%04xUW", (uint16_t) brw_inst_imm_ud(devinfo, inst)); break; - case BRW_HW_REG_TYPE_W: + case BRW_HW_IMM_TYPE_W: format(file, "%dW", (int16_t) brw_inst_imm_d(devinfo, inst)); break; - case BRW_HW_REG_IMM_TYPE_UV: + case BRW_HW_IMM_TYPE_UV: format(file, "0x%08xUV", brw_inst_imm_ud(devinfo, inst)); break; - case BRW_HW_REG_IMM_TYPE_VF: + case BRW_HW_IMM_TYPE_VF: format(file, "[%-gF, %-gF, %-gF, %-gF]VF", brw_vf_to_float(brw_inst_imm_ud(devinfo, inst)), brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 8), brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 16), brw_vf_to_float(brw_inst_imm_ud(devinfo, inst) >> 24)); break; - case BRW_HW_REG_IMM_TYPE_V: + case BRW_HW_IMM_TYPE_V: format(file, "0x%08xV", brw_inst_imm_ud(devinfo, inst)); break; - case BRW_HW_REG_TYPE_F: + case BRW_HW_IMM_TYPE_F: format(file, "%-gF", brw_inst_imm_f(devinfo, inst)); break; - case GEN8_HW_REG_IMM_TYPE_DF: + case GEN8_HW_IMM_TYPE_DF: format(file, "%-gDF", brw_inst_imm_df(devinfo, inst)); break; - case GEN8_HW_REG_IMM_TYPE_HF: + case GEN8_HW_IMM_TYPE_HF: string(file, "Half Float IMM"); break; } diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c index 79103d7883..bca526f592 100644 --- a/src/intel/compiler/brw_eu_compact.c +++ b/src/intel/compiler/brw_eu_compact.c @@ -995,9 +995,9 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst) !(devinfo->is_haswell && brw_inst_opcode(devinfo, ) == BRW_OPCODE_DIM) && !(devinfo->gen >= 8 && - (brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_IMM_TYPE_DF || - brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_TYPE_UQ || - brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_TYPE_Q))) { + (brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_IMM_TYPE_DF || + brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_IMM_TYPE_UQ || + brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_IMM_TYPE_Q))) { 
brw_inst_set_src1_reg_type(devinfo, &inst, BRW_HW_REG_TYPE_UD); } @@ -1016,7 +1016,7 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst) brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F && brw_inst_dst_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F && brw_inst_dst_hstride(devinfo, &inst) == BRW_HORIZONTAL_STRIDE_1) { - brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_IMM_TYPE_VF); +
[Mesa-dev] [PATCH 09/25] i965: Reorder brw_reg_type enum values
These vaguely corresponded to the hardware encodings, but that is purely historical at this point. Reorder them so we stop making things "almost work" when mixing enums. The ordering has been chosen so that no enum value is the same as a compatible hardware encoding. --- src/intel/compiler/brw_eu.c | 1 - src/intel/compiler/brw_eu_emit.c | 6 -- src/intel/compiler/brw_fs.cpp| 1 + src/intel/compiler/brw_reg.h | 32 ++-- src/intel/compiler/brw_vec4.cpp | 3 ++- 5 files changed, 17 insertions(+), 26 deletions(-) diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c index 0ef52e219c..700a1badd4 100644 --- a/src/intel/compiler/brw_eu.c +++ b/src/intel/compiler/brw_eu.c @@ -62,7 +62,6 @@ brw_reg_type_letters(unsigned type) [BRW_REGISTER_TYPE_UQ] = "UQ", [BRW_REGISTER_TYPE_Q] = "Q", }; - assert(type <= BRW_REGISTER_TYPE_Q); return names[type]; } diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 6673e0741a..b59fc33a54 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -112,7 +112,6 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, }; assert(type < ARRAY_SIZE(imm_hw_types)); assert(imm_hw_types[type] != -1); - assert(devinfo->gen >= 8 || type < BRW_REGISTER_TYPE_DF); return imm_hw_types[type]; } else { /* Non-immediate registers */ @@ -134,8 +133,6 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, }; assert(type < ARRAY_SIZE(hw_types)); assert(hw_types[type] != -1); - assert(devinfo->gen >= 7 || type < BRW_REGISTER_TYPE_DF); - assert(devinfo->gen >= 8 || type < BRW_REGISTER_TYPE_Q); return hw_types[type]; } } @@ -184,9 +181,6 @@ brw_hw_reg_type_to_size(const struct gen_device_info *devinfo, [GEN8_HW_REG_NON_IMM_TYPE_HF] = 2, }; assert(type < ARRAY_SIZE(hw_sizes)); - assert(devinfo->gen >= 7 || - (type < GEN7_HW_REG_NON_IMM_TYPE_DF || type == BRW_HW_REG_TYPE_F)); - assert(devinfo->gen >= 8 || type <= BRW_HW_REG_TYPE_F); return hw_sizes[type]; } } diff --git
a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index fdc30d450c..0ea4c4f1cc 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -403,6 +403,7 @@ void fs_reg::init() { memset(this, 0, sizeof(*this)); + type = BRW_REGISTER_TYPE_UD; stride = 1; } diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h index 17a51fbd65..48e6fd7b7d 100644 --- a/src/intel/compiler/brw_reg.h +++ b/src/intel/compiler/brw_reg.h @@ -203,29 +203,25 @@ brw_mask_for_swizzle(unsigned swz) } enum PACKED brw_reg_type { - BRW_REGISTER_TYPE_UD = 0, - BRW_REGISTER_TYPE_D, - BRW_REGISTER_TYPE_UW, - BRW_REGISTER_TYPE_W, + /** Floating-point types: @{ */ + BRW_REGISTER_TYPE_DF, BRW_REGISTER_TYPE_F, - - /** Non-immediates only: @{ */ - BRW_REGISTER_TYPE_UB, - BRW_REGISTER_TYPE_B, - /** @} */ - - /** Immediates only: @{ */ - BRW_REGISTER_TYPE_UV, /* Gen6+ */ - BRW_REGISTER_TYPE_V, + BRW_REGISTER_TYPE_HF, BRW_REGISTER_TYPE_VF, /** @} */ - BRW_REGISTER_TYPE_DF, /* Gen7+ (no immediates until Gen8+) */ - - /* Gen8+ */ - BRW_REGISTER_TYPE_HF, - BRW_REGISTER_TYPE_UQ, + /** Integer types: @{ */ BRW_REGISTER_TYPE_Q, + BRW_REGISTER_TYPE_UQ, + BRW_REGISTER_TYPE_D, + BRW_REGISTER_TYPE_UD, + BRW_REGISTER_TYPE_W, + BRW_REGISTER_TYPE_UW, + BRW_REGISTER_TYPE_B, + BRW_REGISTER_TYPE_UB, + BRW_REGISTER_TYPE_V, + BRW_REGISTER_TYPE_UV, + /** @} */ }; unsigned brw_reg_type_to_hw_type(const struct gen_device_info *devinfo, diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 410922c62b..bf9a271900 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -42,8 +42,8 @@ void src_reg::init() { memset(this, 0, sizeof(*this)); - this->file = BAD_FILE; + this->type = BRW_REGISTER_TYPE_UD; } src_reg::src_reg(enum brw_reg_file file, int nr, const glsl_type *type) @@ -85,6 +85,7 @@ dst_reg::init() { memset(this, 0, sizeof(*this)); this->file = BAD_FILE; + this->type = BRW_REGISTER_TYPE_UD; this->writemask = 
WRITEMASK_XYZW; } -- 2.13.0
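One consequence of the reordering is worth spelling out: once UD is no longer enum value 0, a zeroing `memset()` no longer yields a UD register, which is why the `init()` functions in the diff now assign the type explicitly. A minimal sketch, assuming a hypothetical cut-down `enum reg_type` and `struct reg` rather than the real `fs_reg`/`src_reg` classes:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of a reordered type enum: value 0 is now DF. */
enum reg_type { TYPE_DF, TYPE_F, TYPE_D, TYPE_UD };

struct reg {
   enum reg_type type;
   unsigned stride;
};

static void
reg_init(struct reg *r)
{
   assert(r != NULL);

   memset(r, 0, sizeof(*r));   /* leaves type == TYPE_DF, i.e. 0 */
   r->type = TYPE_UD;          /* so the old default has to be set by hand */
   r->stride = 1;
}
```

Any code that silently relied on "zeroed means UD" would now see DF instead, so making the default explicit keeps the old behaviour.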
[Mesa-dev] [PATCH 06/25] i965: Only change type of 0.0f to VF if destination stride == 1
The destination stride must be equivalent to a dword if VF is used.

Also, since the only compaction table entries with "i:vf" have the
destination as "r:f", specifically check that the destination is of type
float.
---
 src/intel/compiler/brw_eu_compact.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c
index bf57ddf85c..79103d7883 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -1014,7 +1014,8 @@ precompact(const struct gen_device_info *devinfo, brw_inst inst)
     */
    if (brw_inst_imm_ud(devinfo, &inst) == 0x0 &&
        brw_inst_src0_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
-       brw_inst_dst_reg_type(devinfo, &inst) != GEN7_HW_REG_NON_IMM_TYPE_DF) {
+       brw_inst_dst_reg_type(devinfo, &inst) == BRW_HW_REG_TYPE_F &&
+       brw_inst_dst_hstride(devinfo, &inst) == BRW_HORIZONTAL_STRIDE_1) {
       brw_inst_set_src0_reg_type(devinfo, &inst, BRW_HW_REG_IMM_TYPE_VF);
    }
 
-- 
2.13.0
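The condition can be read as plain arithmetic: a float destination has 4-byte elements, so a horizontal stride of one element is exactly the dword stride that VF requires, and an all-zero 32-bit immediate has the same bits whether it is read as 0.0f or as a VF vector of four zeros. A small sketch of that predicate, using a hypothetical `struct mov_imm` summary rather than the real `brw_inst` accessors:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the fields the precompact check reads. */
struct mov_imm {
   uint32_t imm_bits;      /* raw 32-bit immediate */
   bool src_is_f;          /* src0 type is F */
   bool dst_is_f;          /* destination type is F */
   unsigned dst_hstride;   /* destination horizontal stride, in elements */
};

/* True when a 0.0f immediate may be retyped as VF. */
static bool
can_retype_zero_as_vf(const struct mov_imm *mov)
{
   const unsigned f_size = 4;   /* bytes per float element */

   assert(mov != NULL);

   return mov->imm_bits == 0 &&
          mov->src_is_f &&
          mov->dst_is_f &&
          mov->dst_hstride * f_size == 4;   /* stride equivalent to a dword */
}
```

A destination stride of 2 gives an 8-byte step and is rejected, as is any destination that is not float.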
[Mesa-dev] [PATCH 05/25] i965: Remove CONT/BREAK from instruction compaction test
These cannot be compacted. A similar mistake was fixed in commit
90eaf01616a8.
---
 src/intel/compiler/test_eu_compact.cpp | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/intel/compiler/test_eu_compact.cpp b/src/intel/compiler/test_eu_compact.cpp
index 668a972bfa..1532e3b984 100644
--- a/src/intel/compiler/test_eu_compact.cpp
+++ b/src/intel/compiler/test_eu_compact.cpp
@@ -68,8 +68,6 @@ clear_pad_bits(const struct gen_device_info *devinfo, brw_inst *inst)
 {
    if (brw_inst_opcode(devinfo, inst) != BRW_OPCODE_SEND &&
        brw_inst_opcode(devinfo, inst) != BRW_OPCODE_SENDC &&
-       brw_inst_opcode(devinfo, inst) != BRW_OPCODE_BREAK &&
-       brw_inst_opcode(devinfo, inst) != BRW_OPCODE_CONTINUE &&
        brw_inst_src0_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE &&
        brw_inst_src1_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE) {
       brw_inst_set_bits(inst, 127, 111, 0);
@@ -133,8 +131,6 @@ skip_bit(const struct gen_device_info *devinfo, brw_inst *src, int bit)
    /* sometimes these are pad bits. */
    if (brw_inst_opcode(devinfo, src) != BRW_OPCODE_SEND &&
        brw_inst_opcode(devinfo, src) != BRW_OPCODE_SENDC &&
-       brw_inst_opcode(devinfo, src) != BRW_OPCODE_BREAK &&
-       brw_inst_opcode(devinfo, src) != BRW_OPCODE_CONTINUE &&
        brw_inst_src0_reg_file(devinfo, src) != BRW_IMMEDIATE_VALUE &&
        brw_inst_src1_reg_file(devinfo, src) != BRW_IMMEDIATE_VALUE &&
        bit >= 121) {
-- 
2.13.0
[Mesa-dev] [PATCH 08/25] i965: Validate destination restrictions with vector immediates
--- src/intel/compiler/brw_eu_emit.c| 13 +- src/intel/compiler/brw_eu_validate.c| 61 + src/intel/compiler/test_eu_validate.cpp | 79 + 3 files changed, 141 insertions(+), 12 deletions(-) diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 8a6ec035cc..6673e0741a 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -279,19 +279,8 @@ validate_reg(const struct gen_device_info *devinfo, const int execsize_for_reg[] = {1, 2, 4, 8, 16, 32}; int width, hstride, vstride, execsize; - if (reg.file == BRW_IMMEDIATE_VALUE) { - /* 3.3.6: Region Parameters. Restriction: Immediate vectors - * mean the destination has to be 128-bit aligned and the - * destination horiz stride has to be a word. - */ - if (reg.type == BRW_REGISTER_TYPE_V) { - unsigned UNUSED elem_size = brw_element_size(devinfo, inst, dst); - assert(hstride_for_reg[brw_inst_dst_hstride(devinfo, inst)] * -elem_size == 2); - } - + if (reg.file == BRW_IMMEDIATE_VALUE) return; - } if (reg.file == BRW_ARCHITECTURE_REGISTER_FILE && reg.file == BRW_ARF_NULL) diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c index 827cd707c7..7f0595e6f8 100644 --- a/src/intel/compiler/brw_eu_validate.c +++ b/src/intel/compiler/brw_eu_validate.c @@ -1036,6 +1036,66 @@ region_alignment_rules(const struct gen_device_info *devinfo, return error_msg; } +static struct string +vector_immediate_restrictions(const struct gen_device_info *devinfo, + const brw_inst *inst) +{ + unsigned num_sources = num_sources_from_inst(devinfo, inst); + struct string error_msg = { .str = NULL, .len = 0 }; + + if (num_sources == 3 || num_sources == 0) + return (struct string){}; + + unsigned file = num_sources == 1 ? 
+ brw_inst_src0_reg_file(devinfo, inst) : + brw_inst_src1_reg_file(devinfo, inst); + if (file != BRW_IMMEDIATE_VALUE) + return (struct string){}; + + unsigned dst_type_size = brw_element_size(devinfo, inst, dst); + unsigned dst_subreg = brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1 ? + brw_inst_dst_da1_subreg_nr(devinfo, inst) : 0; + unsigned dst_stride = 1 << (brw_inst_dst_hstride(devinfo, inst) - 1); + unsigned type = num_sources == 1 ? + brw_inst_src0_reg_type(devinfo, inst) : + brw_inst_src1_reg_type(devinfo, inst); + + /* The PRMs say: +* +*When an immediate vector is used in an instruction, the destination +*must be 128-bit aligned with destination horizontal stride equivalent +*to a word for an immediate integer vector (v) and equivalent to a +*DWord for an immediate float vector (vf). +* +* The text has not been updated for the addition of the immediate unsigned +* integer vector type (uv) on SNB, but presumably the same restriction +* applies. +*/ + switch (type) { + case BRW_HW_REG_IMM_TYPE_V: + case BRW_HW_REG_IMM_TYPE_UV: + case BRW_HW_REG_IMM_TYPE_VF: + ERROR_IF(dst_subreg % (128 / 8) != 0, + "Destination must be 128-bit aligned in order to use immediate " + "vector types"); + + if (type == BRW_HW_REG_IMM_TYPE_VF) { + ERROR_IF(dst_type_size * dst_stride != 4, + "Destination must have stride equivalent to dword in order " + "to use the VF type"); + } else { + ERROR_IF(dst_type_size * dst_stride != 2, + "Destination must have stride equivalent to word in order " + "to use the V or UV type"); + } + break; + default: + break; + } + + return error_msg; +} + bool brw_validate_instructions(const struct gen_device_info *devinfo, void *assembly, int start_offset, int end_offset, @@ -1063,6 +1123,7 @@ brw_validate_instructions(const struct gen_device_info *devinfo, CHECK(general_restrictions_based_on_operand_types); CHECK(general_restrictions_on_region_parameters); CHECK(region_alignment_rules); + CHECK(vector_immediate_restrictions); } if (error_msg.str && 
annotation) { diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp index 09f4cc142a..b43c41704b 100644 --- a/src/intel/compiler/test_eu_validate.cpp +++ b/src/intel/compiler/test_eu_validate.cpp @@ -132,6 +132,7 @@ validate(struct brw_codegen *p) #define last_inst (&p->store[p->nr_insn - 1]) #define g0 brw_vec8_grf(0, 0) #define null brw_null_reg() +#define zero brw_imm_f(0.0f) static void clear_instructions(struct brw_codegen *p) @@ -844,5 +845,83 @@ TEST_P(validation_test, byte_destination_relaxed_alignment) } else {
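The restriction quoted from the PRMs in this patch boils down to two checks on the destination, which may be easier to see outside the validator. The sketch below is only an approximation under stated assumptions: `enum imm_kind` and `check_vector_immediate` are hypothetical names, the destination subregister is taken as a byte offset, and the stride is passed already decoded into elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the immediate source type. */
enum imm_kind { IMM_OTHER, IMM_V, IMM_UV, IMM_VF };

/* dst_subreg is a byte offset, dst_type_size is in bytes and dst_stride
 * is in elements. Returns NULL when the restriction is met, otherwise a
 * short description of the violation.
 */
static const char *
check_vector_immediate(enum imm_kind kind, unsigned dst_subreg,
                       unsigned dst_type_size, unsigned dst_stride)
{
   assert(dst_type_size > 0);

   if (kind == IMM_OTHER)
      return NULL;   /* rule only concerns vector immediates */

   if (dst_subreg % (128 / 8) != 0)
      return "destination must be 128-bit aligned";

   /* VF wants a dword (4-byte) step, V and UV a word (2-byte) step. */
   const unsigned wanted = kind == IMM_VF ? 4 : 2;
   if (dst_type_size * dst_stride != wanted)
      return kind == IMM_VF ? "destination stride must equal a dword"
                            : "destination stride must equal a word";

   return NULL;
}
```

For example, a byte destination with a stride of 2 satisfies the word rule for UV, while a float destination with a stride of 2 fails the dword rule for VF.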
[Mesa-dev] [PATCH 04/25] i965: Test instruction compaction on all supported Gens
Note that there's no point in testing on G45, since its compaction is the same as Gen5. Same logic applies to Gen7 variants and low-power parts. --- src/intel/compiler/test_eu_compact.cpp | 50 -- 1 file changed, 42 insertions(+), 8 deletions(-) diff --git a/src/intel/compiler/test_eu_compact.cpp b/src/intel/compiler/test_eu_compact.cpp index 1ef7e5ae7f..668a972bfa 100644 --- a/src/intel/compiler/test_eu_compact.cpp +++ b/src/intel/compiler/test_eu_compact.cpp @@ -74,6 +74,13 @@ clear_pad_bits(const struct gen_device_info *devinfo, brw_inst *inst) brw_inst_src1_reg_file(devinfo, inst) != BRW_IMMEDIATE_VALUE) { brw_inst_set_bits(inst, 127, 111, 0); } + + if (devinfo->gen == 8 && !devinfo->is_cherryview && + is_3src(devinfo, (opcode)brw_inst_opcode(devinfo, inst))) { + brw_inst_set_bits(inst, 105, 105, 0); + brw_inst_set_bits(inst, 84, 84, 0); + brw_inst_set_bits(inst, 36, 35, 0); + } } static bool @@ -87,13 +94,41 @@ skip_bit(const struct gen_device_info *devinfo, brw_inst *src, int bit) if (bit == 29) return true; - /* pad bit */ - if (bit == 47) - return true; + if (is_3src(devinfo, (opcode)brw_inst_opcode(devinfo, src))) { + if (devinfo->gen >= 9 || devinfo->is_cherryview) { + if (bit == 127) +return true; + } else { + if (bit >= 126 && bit <= 127) +return true; - /* pad bits */ - if (bit >= 90 && bit <= 95) - return true; + if (bit == 105) +return true; + + if (bit == 84) +return true; + + if (bit >= 35 && bit <= 36) +return true; + } + } else { + if (bit == 47) + return true; + + if (devinfo->gen >= 8) { + if (bit == 11) +return true; + + if (bit == 95) +return true; + } else { + if (devinfo->gen < 7 && bit == 90) +return true; + + if (bit >= 91 && bit <= 95) +return true; + } + } /* sometimes these are pad bits. 
*/ if (brw_inst_opcode(devinfo, src) != BRW_OPCODE_SEND && @@ -289,10 +324,9 @@ int main(int argc, char **argv) { struct gen_device_info *devinfo = (struct gen_device_info *)calloc(1, sizeof(*devinfo)); - devinfo->gen = 6; bool fail = false; - for (devinfo->gen = 6; devinfo->gen <= 7; devinfo->gen++) { + for (devinfo->gen = 5; devinfo->gen <= 9; devinfo->gen++) { fail |= run_tests(devinfo); } -- 2.13.0
[Mesa-dev] [PATCH 01/25] i965: Mark src inst pointer const in compaction code
--- src/intel/compiler/brw_eu.h | 2 +- src/intel/compiler/brw_eu_compact.c | 23 --- 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h index a3a9c63239..8e597b212a 100644 --- a/src/intel/compiler/brw_eu.h +++ b/src/intel/compiler/brw_eu.h @@ -542,7 +542,7 @@ void brw_compact_instructions(struct brw_codegen *p, int start_offset, void brw_uncompact_instruction(const struct gen_device_info *devinfo, brw_inst *dst, brw_compact_inst *src); bool brw_try_compact_instruction(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src); + brw_compact_inst *dst, const brw_inst *src); void brw_debug_compact_uncompact(const struct gen_device_info *devinfo, brw_inst *orig, brw_inst *uncompacted); diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c index 740a395f78..a940e214f2 100644 --- a/src/intel/compiler/brw_eu_compact.c +++ b/src/intel/compiler/brw_eu_compact.c @@ -671,7 +671,7 @@ static const uint16_t *src_index_table; static bool set_control_index(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src) + brw_compact_inst *dst, const brw_inst *src) { uint32_t uncompacted = devinfo->gen >= 8 /* 17b/G45; 19b/IVB+ */ ? (brw_inst_bits(src, 33, 31) << 16) | /* 3b */ @@ -700,7 +700,7 @@ set_control_index(const struct gen_device_info *devinfo, static bool set_datatype_index(const struct gen_device_info *devinfo, brw_compact_inst *dst, - brw_inst *src) + const brw_inst *src) { uint32_t uncompacted = devinfo->gen >= 8 /* 18b/G45+; 21b/BDW+ */ ? 
(brw_inst_bits(src, 63, 61) << 18) | /* 3b */ @@ -721,7 +721,7 @@ set_datatype_index(const struct gen_device_info *devinfo, brw_compact_inst *dst, static bool set_subreg_index(const struct gen_device_info *devinfo, brw_compact_inst *dst, - brw_inst *src, bool is_immediate) + const brw_inst *src, bool is_immediate) { uint16_t uncompacted = /* 15b */ (brw_inst_bits(src, 52, 48) << 0) | /* 5b */ @@ -756,7 +756,7 @@ get_src_index(uint16_t uncompacted, static bool set_src0_index(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src) + brw_compact_inst *dst, const brw_inst *src) { uint16_t compacted; uint16_t uncompacted = brw_inst_bits(src, 88, 77); /* 12b */ @@ -771,7 +771,7 @@ set_src0_index(const struct gen_device_info *devinfo, static bool set_src1_index(const struct gen_device_info *devinfo, brw_compact_inst *dst, - brw_inst *src, bool is_immediate) + const brw_inst *src, bool is_immediate) { uint16_t compacted; @@ -791,7 +791,7 @@ set_src1_index(const struct gen_device_info *devinfo, brw_compact_inst *dst, static bool set_3src_control_index(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src) + brw_compact_inst *dst, const brw_inst *src) { assert(devinfo->gen >= 8); @@ -814,7 +814,7 @@ set_3src_control_index(const struct gen_device_info *devinfo, static bool set_3src_source_index(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src) + brw_compact_inst *dst, const brw_inst *src) { assert(devinfo->gen >= 8); @@ -847,7 +847,7 @@ set_3src_source_index(const struct gen_device_info *devinfo, } static bool -has_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src) +has_unmapped_bits(const struct gen_device_info *devinfo, const brw_inst *src) { /* EOT can only be mapped on a send if the src1 is an immediate */ if ((brw_inst_opcode(devinfo, src) == BRW_OPCODE_SENDC || @@ -878,7 +878,8 @@ has_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src) } static bool 
-has_3src_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src) +has_3src_unmapped_bits(const struct gen_device_info *devinfo, + const brw_inst *src) { /* Check for three-source instruction bits that don't map to any of the * fields of the compacted instruction. All of them seem to be reserved @@ -901,7 +902,7 @@ has_3src_unmapped_bits(const struct gen_device_info *devinfo, brw_inst *src) static bool brw_try_compact_3src_instruction(const struct gen_device_info *devinfo, - brw_compact_inst *dst, brw_inst *src) + brw_compact_inst *dst, const brw_inst *src) { assert(devinfo->gen >= 8); @@ -962,7 +963,7 @@ is_compactable_immediate(unsigned imm) */ bool brw_try_compact_instruction(const struct gen_device_info *devinfo, -
[Mesa-dev] [PATCH 00/25] i965: Switch to always using logical register types
The mixture of hardware encodings and logical types has caused lots of confusion. It's time to fix that.
[Mesa-dev] [PATCH 03/25] i965: Silence signed/unsigned comparison warning
---
 src/intel/compiler/test_eu_compact.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/test_eu_compact.cpp b/src/intel/compiler/test_eu_compact.cpp
index 39e7f1a27c..1ef7e5ae7f 100644
--- a/src/intel/compiler/test_eu_compact.cpp
+++ b/src/intel/compiler/test_eu_compact.cpp
@@ -254,7 +254,7 @@ run_tests(const struct gen_device_info *devinfo)
    brw_init_compaction_tables(devinfo);
    bool fail = false;

-   for (int i = 0; i < ARRAY_SIZE(tests); i++) {
+   for (unsigned i = 0; i < ARRAY_SIZE(tests); i++) {
       for (int align_16 = 0; align_16 <= 1; align_16++) {
          struct brw_codegen *p = rzalloc(NULL, struct brw_codegen);
          brw_init_codegen(devinfo, p, p);
-- 
2.13.0
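The warning silenced here comes from comparing a signed index against a sizeof-based element count, which has type size_t. The following is a minimal sketch of that situation, assuming ARRAY_SIZE is defined in the usual sizeof-based way (its definition is not part of this patch); the `tests` table and both helper functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: the usual sizeof-based element count (type size_t). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the test table in test_eu_compact.cpp. */
static const int tests[] = { 10, 20, 30, 40 };

/* Counts table entries with an unsigned index, so both sides of the
 * comparison are unsigned and -Wsign-compare has nothing to report. */
static unsigned
count_tests(void)
{
   unsigned n = 0;
   for (unsigned i = 0; i < ARRAY_SIZE(tests); i++)
      n++;
   return n;
}

/* Shows the hazard the warning guards against: a negative int becomes a
 * huge unsigned value once converted, so it never compares as "less".
 * The cast makes explicit what the implicit conversion would do. */
static int
negative_index_compares_less(void)
{
   int i = -1;
   return (size_t)i < ARRAY_SIZE(tests);
}
```

With a loop index that only counts up from zero the conversion is harmless, which is why switching the index to unsigned is enough.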
[Mesa-dev] [PATCH 02/25] i965: Move compaction "prepass" into brw_eu_compact.c
--- src/intel/compiler/brw_eu_compact.c | 82 - src/intel/compiler/brw_eu_emit.c| 72 +--- 2 files changed, 82 insertions(+), 72 deletions(-) diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c index a940e214f2..bf57ddf85c 100644 --- a/src/intel/compiler/brw_eu_compact.c +++ b/src/intel/compiler/brw_eu_compact.c @@ -956,6 +956,83 @@ is_compactable_immediate(unsigned imm) } /** + * Applies some small changes to instruction types to increase chances of + * compaction. + */ +static brw_inst +precompact(const struct gen_device_info *devinfo, brw_inst inst) +{ + if (brw_inst_src0_reg_file(devinfo, ) != BRW_IMMEDIATE_VALUE) + return inst; + + /* The Bspec's section titled "Non-present Operands" claims that if src0 +* is an immediate that src1's type must be the same as that of src0. +* +* The SNB+ DataTypeIndex instruction compaction tables contain mappings +* that do not follow this rule. E.g., from the IVB/HSW table: +* +* DataTypeIndex 18-Bit Mapping Mapped Meaning +*3 0010101101 r:f | i:vf | a:ud | <1> | dir | +* +* And from the SNB table: +* +* DataTypeIndex 18-Bit Mapping Mapped Meaning +*8 0010001100 a:w | i:w | a:ud | <1> | dir | +* +* Neither of these cause warnings from the simulator when used, +* compacted or otherwise. In fact, all compaction mappings that have an +* immediate in src0 use a:ud for src1. +* +* The GM45 instruction compaction tables do not contain mapped meanings +* so it's not clear whether it has the restriction. We'll assume it was +* lifted on SNB. (FINISHME: decode the GM45 tables and check.) +* +* Don't do any of this for 64-bit immediates, since the src1 fields +* overlap with the immediate and setting them would overwrite the +* immediate we set. 
+*/ + if (devinfo->gen >= 6 && + !(devinfo->is_haswell && + brw_inst_opcode(devinfo, ) == BRW_OPCODE_DIM) && + !(devinfo->gen >= 8 && + (brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_IMM_TYPE_DF || + brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_TYPE_UQ || + brw_inst_src0_reg_type(devinfo, ) == GEN8_HW_REG_TYPE_Q))) { + brw_inst_set_src1_reg_type(devinfo, , BRW_HW_REG_TYPE_UD); + } + + /* Compacted instructions only have 12-bits (plus 1 for the other 20) +* for immediate values. Presumably the hardware engineers realized +* that the only useful floating-point value that could be represented +* in this format is 0.0, which can also be represented as a VF-typed +* immediate, so they gave us the previously mentioned mapping on IVB+. +* +* Strangely, we do have a mapping for imm:f in src1, so we don't need +* to do this there. +* +* If we see a 0.0:F, change the type to VF so that it can be compacted. +*/ + if (brw_inst_imm_ud(devinfo, ) == 0x0 && + brw_inst_src0_reg_type(devinfo, ) == BRW_HW_REG_TYPE_F && + brw_inst_dst_reg_type(devinfo, ) != GEN7_HW_REG_NON_IMM_TYPE_DF) { + brw_inst_set_src0_reg_type(devinfo, , BRW_HW_REG_IMM_TYPE_VF); + } + + /* There are no mappings for dst:d | i:d, so if the immediate is suitable +* set the types to :UD so the instruction can be compacted. +*/ + if (is_compactable_immediate(brw_inst_imm_ud(devinfo, )) && + brw_inst_cond_modifier(devinfo, ) == BRW_CONDITIONAL_NONE && + brw_inst_src0_reg_type(devinfo, ) == BRW_HW_REG_TYPE_D && + brw_inst_dst_reg_type(devinfo, ) == BRW_HW_REG_TYPE_D) { + brw_inst_set_src0_reg_type(devinfo, , BRW_HW_REG_TYPE_UD); + brw_inst_set_dst_reg_type(devinfo, , BRW_HW_REG_TYPE_UD); + } + + return inst; +} + +/** * Tries to compact instruction src into dst. 
* * It doesn't modify dst unless src is compactable, which is relied on by @@ -1427,9 +1504,10 @@ brw_compact_instructions(struct brw_codegen *p, int start_offset, old_ip[offset / sizeof(brw_compact_inst)] = src_offset / sizeof(brw_inst); compacted_counts[src_offset / sizeof(brw_inst)] = compacted_count; - brw_inst saved = *src; + brw_inst inst = precompact(devinfo, *src); + brw_inst saved = inst; - if (brw_try_compact_instruction(devinfo, dst, src)) { + if (brw_try_compact_instruction(devinfo, dst, )) { compacted_count++; if (INTEL_DEBUG) { diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 0b0d67a5c5..8a6ec035cc 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -356,16 +356,6 @@ validate_reg(const struct gen_device_info *devinfo, /* 10. Check destination issues. */ } -static bool -is_compactable_immediate(unsigned imm) -{ - /* We get the low 12 bits as-is. */ - imm &= ~0xfff; - - /* We get one bit replicated through the
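The immediate check that this patch moves can be sketched as follows. This is an illustration only: it assumes, based on the "12-bits (plus 1 for the other 20)" comment above, that an immediate is compactable when its upper 20 bits are all zeros or all ones. The `sketch_` names and `float_bits` are hypothetical, and `float_bits` assumes IEEE-754 single-precision floats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Sketch: an immediate fits if bits 31:12 are all zeros or all ones,
 * i.e. a single bit replicated through the upper 20 bits (assumption
 * drawn from the comment in the patch). */
static bool
sketch_is_compactable_immediate(uint32_t imm)
{
   imm &= ~UINT32_C(0xfff);          /* the low 12 bits are kept as-is */
   return imm == 0 || imm == UINT32_C(0xfffff000);
}

/* Returns the raw bit pattern of a float (assumes IEEE-754 binary32). */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}
```

Under these assumptions small positive and small negative integers pass the check, while a float such as 1.0f does not because its exponent bits land in the upper 20 bits. That is consistent with the comment's remark that 0.0 is the only useful float representable this way.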
[Mesa-dev] [PATCH v3 7/8] anv: Use DRM sync objects for external semaphores when available
--- src/intel/vulkan/anv_batch_chain.c | 59 +++ src/intel/vulkan/anv_device.c | 1 + src/intel/vulkan/anv_private.h | 8 src/intel/vulkan/anv_queue.c | 83 +++--- 4 files changed, 128 insertions(+), 23 deletions(-) diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c index 7a84bbd..e670ad7 100644 --- a/src/intel/vulkan/anv_batch_chain.c +++ b/src/intel/vulkan/anv_batch_chain.c @@ -957,6 +957,11 @@ struct anv_execbuf { /* Allocated length of the 'objects' and 'bos' arrays */ uint32_t array_length; + + uint32_t fence_count; + uint32_t fence_array_length; + struct drm_i915_gem_exec_fence * fences; + struct anv_syncobj ** syncobjs; }; static void @@ -971,6 +976,8 @@ anv_execbuf_finish(struct anv_execbuf *exec, { vk_free(alloc, exec->objects); vk_free(alloc, exec->bos); + vk_free(alloc, exec->fences); + vk_free(alloc, exec->syncobjs); } static VkResult @@ -1061,6 +1068,35 @@ anv_execbuf_add_bo(struct anv_execbuf *exec, return VK_SUCCESS; } +static VkResult +anv_execbuf_add_syncobj(struct anv_execbuf *exec, +uint32_t handle, uint32_t flags, +const VkAllocationCallbacks *alloc) +{ + assert(flags != 0); + + if (exec->fence_count >= exec->fence_array_length) { + uint32_t new_len = MAX2(exec->fence_array_length * 2, 64); + + exec->fences = vk_realloc(alloc, exec->fences, +new_len * sizeof(*exec->fences), +8, VK_SYSTEM_ALLOCATION_SCOPE_COMMAND); + if (exec->fences == NULL) + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY); + + exec->fence_array_length = new_len; + } + + exec->fences[exec->fence_count] = (struct drm_i915_gem_exec_fence) { + .handle = handle, + .flags = flags, + }; + + exec->fence_count++; + + return VK_SUCCESS; +} + static void anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer, struct anv_reloc_list *list) @@ -1448,6 +1484,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device, impl->fd = -1; break; + case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ: + result = anv_execbuf_add_syncobj(, impl->syncobj, + I915_EXEC_FENCE_WAIT, + 
>alloc); + if (result != VK_SUCCESS) +return result; + break; + default: break; } @@ -1484,6 +1528,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device, need_out_fence = true; break; + case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ: + result = anv_execbuf_add_syncobj(, impl->syncobj, + I915_EXEC_FENCE_SIGNAL, + >alloc); + if (result != VK_SUCCESS) +return result; + break; + default: break; } @@ -1497,6 +1549,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device, setup_empty_execbuf(, device); } + if (execbuf.fence_count > 0) { + assert(device->instance->physicalDevice.has_syncobj); + execbuf.execbuf.flags |= I915_EXEC_FENCE_ARRAY; + execbuf.execbuf.num_cliprects = execbuf.fence_count; + execbuf.execbuf.cliprects_ptr = (uintptr_t) execbuf.fences; + } + if (in_fence != -1) { execbuf.execbuf.flags |= I915_EXEC_FENCE_IN; execbuf.execbuf.rsvd2 |= (uint32_t)in_fence; diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index 3c5f78c..a6d5215 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -338,6 +338,7 @@ anv_physical_device_init(struct anv_physical_device *device, device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC); device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE); + device->has_syncobj = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE_ARRAY); bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index b451fa5..de74637 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -653,6 +653,7 @@ struct anv_physical_device { int cmd_parser_version; boolhas_exec_async; boolhas_exec_fence; +boolhas_syncobj; uint32_teu_total; uint32_tsubslice_total; @@ -1742,6 +1743,7 @@ enum anv_semaphore_type { ANV_SEMAPHORE_TYPE_DUMMY,
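The growth strategy in anv_execbuf_add_syncobj (double the array, with a floor of 64 entries) can be sketched in isolation. The version below is an assumption-laden illustration: it uses plain realloc instead of vk_realloc, and the `sketch_` structs are hypothetical mirrors of the types in the patch, not the driver's own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of struct drm_i915_gem_exec_fence from the patch. */
struct sketch_exec_fence {
   uint32_t handle;
   uint32_t flags;
};

struct sketch_fence_list {
   uint32_t count;      /* entries in use */
   uint32_t capacity;   /* entries allocated */
   struct sketch_exec_fence *fences;
};

/* Appends one fence, doubling the array (minimum 64 entries) when full.
 * Returns 0 on success, -1 if the allocation fails. */
static int
sketch_fence_list_add(struct sketch_fence_list *list,
                      uint32_t handle, uint32_t flags)
{
   if (list->count >= list->capacity) {
      uint32_t new_cap = list->capacity * 2 > 64 ? list->capacity * 2 : 64;
      struct sketch_exec_fence *grown =
         realloc(list->fences, new_cap * sizeof(*grown));
      if (grown == NULL)
         return -1;   /* the old array is still valid and owned by list */
      list->fences = grown;
      list->capacity = new_cap;
   }

   list->fences[list->count++] =
      (struct sketch_exec_fence) { .handle = handle, .flags = flags };
   return 0;
}
```

The sketch holds the realloc result in a temporary so the old array is not lost if the allocation fails; that is a choice made for the example rather than something taken from the patch.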
[Mesa-dev] [PATCH v3 8/8] anv: Advertise VK_KHR_external_semaphore
---
 src/intel/vulkan/anv_extensions.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 00186bc..3252e0f 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -50,9 +50,9 @@ EXTENSIONS = [
     Extension('VK_KHR_external_memory',                 1, True),
     Extension('VK_KHR_external_memory_capabilities',    1, True),
     Extension('VK_KHR_external_memory_fd',              1, True),
-    Extension('VK_KHR_external_semaphore',              1, False),
-    Extension('VK_KHR_external_semaphore_capabilities', 1, False),
-    Extension('VK_KHR_external_semaphore_fd',           1, False),
+    Extension('VK_KHR_external_semaphore',              1, True),
+    Extension('VK_KHR_external_semaphore_capabilities', 1, True),
+    Extension('VK_KHR_external_semaphore_fd',           1, True),
     Extension('VK_KHR_get_memory_requirements2',        1, True),
     Extension('VK_KHR_get_physical_device_properties2', 1, True),
     Extension('VK_KHR_get_surface_capabilities2',       1, True),
-- 
2.5.0.400.gff86faf
[Mesa-dev] [PATCH v3 6/8] anv/gem: Add a drm syncobj support
---
 src/intel/vulkan/anv_gem.c       | 52
 src/intel/vulkan/anv_gem_stubs.c | 24 +++
 src/intel/vulkan/anv_private.h   | 4
 3 files changed, 80 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 5b68e9b..9e6b2bb 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -436,3 +436,55 @@ anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2)

    return args.fence;
 }
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   struct drm_syncobj_create args = {
+      .flags = 0,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_CREATE, &args);
+   if (ret)
+      return 0;
+
+   return args.handle;
+}
+
+void
+anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_destroy args = {
+      .handle = handle,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_DESTROY, &args);
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_handle args = {
+      .handle = handle,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);
+   if (ret)
+      return -1;
+
+   return args.fd;
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   struct drm_syncobj_handle args = {
+      .fd = fd,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
+   if (ret)
+      return 0;
+
+   return args.handle;
+}
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index 8d81eb5..842efb3 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -180,3 +180,27 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)
 {
    unreachable("Unused");
 }
+
+uint32_t
+anv_gem_syncobj_create(struct anv_device *device)
+{
+   unreachable("Unused");
+}
+
+void
+anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+int
+anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+uint32_t
+anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
+{
+   unreachable("Unused");
+}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5c7b3b4..b451fa5 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -812,6 +812,10 @@ int anv_gem_set_caching(struct anv_device *device, uint32_t gem_handle, uint32_t
 int anv_gem_set_domain(struct anv_device *device, uint32_t gem_handle,
                        uint32_t read_domains, uint32_t write_domain);
 int anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2);
+uint32_t anv_gem_syncobj_create(struct anv_device *device);
+void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
+int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
+uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);

 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, uint64_t size);
-- 
2.5.0.400.gff86faf
[Mesa-dev] [PATCH v3 5/8] intel/drm: Pull in the i915 fence array API
---
 include/drm-uapi/i915_drm.h | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index c26bf7c..338c8c2 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -431,6 +431,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST  48

+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures. See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
         __s32 param;
         /*
@@ -812,6 +817,17 @@ struct drm_i915_gem_exec_object2 {
         __u64 rsvd2;
 };

+struct drm_i915_gem_exec_fence {
+        /**
+         * User's handle for a dma-fence to wait on or signal.
+         */
+        __u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+        __u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
         /**
          * List of gem_exec_object2 structs
@@ -826,7 +842,10 @@ struct drm_i915_gem_execbuffer2 {
         __u32 DR1;
         __u32 DR4;
         __u32 num_cliprects;
-        /** This is a struct drm_clip_rect *cliprects */
+        /** This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+         * is not set. If I915_EXEC_FENCE_ARRAY is set, then this is a
+         * struct drm_i915_gem_exec_fence *fences.
+         */
         __u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (7<<0)
 #define I915_EXEC_DEFAULT                (0<<0)
@@ -927,7 +946,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST           (1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_EXEC_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of i915_gem_exec_fence structures which specify a set of
+ * dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))

 #define I915_EXEC_CONTEXT_ID_MASK       (0x)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.5.0.400.gff86faf
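The `-(FLAG<<1)` idiom used for __I915_EXEC_UNKNOWN_FLAGS is worth spelling out, since each new flag has to bump it. The sketch below uses the bit position from the patch (1<<19) but local `SKETCH_` names; the helper functions are hypothetical and only illustrate the arithmetic, presumably the same arithmetic a check for unsupported flags would rely on.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the patch; the SKETCH_ name is a local stand-in. */
#define SKETCH_EXEC_FENCE_ARRAY   (1 << 19)

/* -(highest_known_flag << 1) has every bit above the highest known flag
 * set and every bit at or below it clear. Widening the negative int to
 * 64 bits sign-extends the mask over the upper half as well. */
static uint64_t
sketch_unknown_flags_mask(void)
{
   return (uint64_t)(int64_t)(-(SKETCH_EXEC_FENCE_ARRAY << 1));
}

/* Returns nonzero if flags contains a bit above the highest known one. */
static int
sketch_has_unknown_flags(uint64_t flags)
{
   return (flags & sketch_unknown_flags_mask()) != 0;
}
```

Because the mask is derived from the highest defined flag, adding I915_EXEC_FENCE_ARRAY at bit 19 means the mask must be recomputed from it, which is what the last hunk of the patch does.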
[Mesa-dev] [PATCH v3 3/8] anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
Reviewed-by: Chad Versace
---
 src/intel/vulkan/anv_gem.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index ac47da4..36692f5 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -185,7 +185,10 @@ int
 anv_gem_execbuffer(struct anv_device *device,
                    struct drm_i915_gem_execbuffer2 *execbuf)
 {
-   return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
+   if (execbuf->flags & I915_EXEC_FENCE_OUT)
+      return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, execbuf);
+   else
+      return anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
 }

 int
-- 
2.5.0.400.gff86faf
[Mesa-dev] [PATCH v3 4/8] anv: Implement support for exporting semaphores as FENCE_FD
--- src/intel/vulkan/anv_batch_chain.c | 57 +-- src/intel/vulkan/anv_device.c | 1 + src/intel/vulkan/anv_gem.c | 36 src/intel/vulkan/anv_private.h | 23 + src/intel/vulkan/anv_queue.c | 69 -- 5 files changed, 175 insertions(+), 11 deletions(-) diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c index 65fe366..7a84bbd 100644 --- a/src/intel/vulkan/anv_batch_chain.c +++ b/src/intel/vulkan/anv_batch_chain.c @@ -1416,11 +1416,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device, struct anv_execbuf execbuf; anv_execbuf_init(); + int in_fence = -1; VkResult result = VK_SUCCESS; for (uint32_t i = 0; i < num_in_semaphores; i++) { ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]); - assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE); - struct anv_semaphore_impl *impl = >permanent; + struct anv_semaphore_impl *impl = + semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ? + >temporary : >permanent; switch (impl->type) { case ANV_SEMAPHORE_TYPE_BO: @@ -1429,11 +1431,29 @@ anv_cmd_buffer_execbuf(struct anv_device *device, if (result != VK_SUCCESS) return result; break; + + case ANV_SEMAPHORE_TYPE_SYNC_FILE: + if (in_fence == -1) { +in_fence = impl->fd; + } else { +int merge = anv_gem_sync_file_merge(device, in_fence, impl->fd); +if (merge == -1) + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR); + +close(impl->fd); +close(in_fence); +in_fence = merge; + } + + impl->fd = -1; + break; + default: break; } } + bool need_out_fence = false; for (uint32_t i = 0; i < num_out_semaphores; i++) { ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]); @@ -1459,6 +1479,11 @@ anv_cmd_buffer_execbuf(struct anv_device *device, if (result != VK_SUCCESS) return result; break; + + case ANV_SEMAPHORE_TYPE_SYNC_FILE: + need_out_fence = true; + break; + default: break; } @@ -1472,9 +1497,19 @@ anv_cmd_buffer_execbuf(struct anv_device *device, setup_empty_execbuf(, device); } + if (in_fence != -1) { + execbuf.execbuf.flags |= 
I915_EXEC_FENCE_IN; + execbuf.execbuf.rsvd2 |= (uint32_t)in_fence; + } + + if (need_out_fence) + execbuf.execbuf.flags |= I915_EXEC_FENCE_OUT; result = anv_device_execbuf(device, , execbuf.bos); + /* Execbuf does not consume the in_fence. It's our job to close it. */ + close(in_fence); + for (uint32_t i = 0; i < num_in_semaphores; i++) { ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]); /* From the Vulkan 1.0.53 spec: @@ -1489,6 +1524,24 @@ anv_cmd_buffer_execbuf(struct anv_device *device, anv_semaphore_reset_temporary(device, semaphore); } + if (result == VK_SUCCESS && need_out_fence) { + int out_fence = execbuf.execbuf.rsvd2 >> 32; + for (uint32_t i = 0; i < num_out_semaphores; i++) { + ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]); + /* Out fences can't have temporary state because that would imply + * that we imported a sync file and are trying to signal it. + */ + assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE); + struct anv_semaphore_impl *impl = >permanent; + + if (impl->type == ANV_SEMAPHORE_TYPE_SYNC_FILE) { +assert(impl->fd == -1); +impl->fd = dup(out_fence); + } + } + close(out_fence); + } + anv_execbuf_finish(, >alloc); return result; diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index e82e1e9..3c5f78c 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -337,6 +337,7 @@ anv_physical_device_init(struct anv_physical_device *device, goto fail; device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC); + device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE); bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X); diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c index 36692f5..5b68e9b 100644 --- a/src/intel/vulkan/anv_gem.c +++ b/src/intel/vulkan/anv_gem.c @@ -22,6 +22,7 @@ */ #include +#include #include #include #include @@ -400,3 +401,38 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd) 
 return args.handle; } + +#ifndef SYNC_IOC_MAGIC +/* duplicated from linux/sync_file.h to avoid build-time dependency + * on new (v4.7) kernel headers. Once distros are mostly using + * something newer than v4.7 drop this and #include <linux/sync_file.h> + * instead. + */ +struct
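The patch passes both fence file descriptors through the single 64-bit rsvd2 field: the input fence goes into the low 32 bits and the output fence is read back from the high 32 bits. A minimal sketch of that packing follows; the `sketch_` helpers are hypothetical and only restate the two expressions that appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stores the input fence fd in the low 32 bits, as the patch does with
 * "rsvd2 |= (uint32_t)in_fence". The cast to uint32_t keeps a value from
 * being sign-extended into the upper half. */
static uint64_t
sketch_pack_in_fence(uint64_t rsvd2, int in_fence)
{
   return rsvd2 | (uint32_t)in_fence;
}

/* Reads the output fence fd back from the high 32 bits, as the patch
 * does with "rsvd2 >> 32". */
static int
sketch_unpack_out_fence(uint64_t rsvd2)
{
   return (int)(rsvd2 >> 32);
}
```

In the patch the input fence is only packed when it is not -1, so the cast matters mainly as a guard: without it, a negative value would overwrite the half of the field reserved for the output fence.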
[Mesa-dev] [PATCH v3 2/8] anv: Submit a dummy batch when only semaphores are provided.
Vulkan allows you to do a submit whose only job is to wait on and trigger semaphores. The easiest way for us to support that right now is to insert a dummy execbuf. --- src/intel/vulkan/anv_batch_chain.c | 28 +--- src/intel/vulkan/anv_device.c | 30 ++ src/intel/vulkan/anv_private.h | 1 + src/intel/vulkan/anv_queue.c | 17 + 4 files changed, 73 insertions(+), 3 deletions(-) diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c index 94e7a7d..65fe366 100644 --- a/src/intel/vulkan/anv_batch_chain.c +++ b/src/intel/vulkan/anv_batch_chain.c @@ -1388,6 +1388,23 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf, return VK_SUCCESS; } +static void +setup_empty_execbuf(struct anv_execbuf *execbuf, struct anv_device *device) +{ + anv_execbuf_add_bo(execbuf, >trivial_batch_bo, NULL, 0, + >alloc); + + execbuf->execbuf = (struct drm_i915_gem_execbuffer2) { + .buffers_ptr = (uintptr_t) execbuf->objects, + .buffer_count = execbuf->bo_count, + .batch_start_offset = 0, + .batch_len = 8, /* GEN8_MI_BATCH_BUFFER_END and NOOP */ + .flags = I915_EXEC_HANDLE_LUT | I915_EXEC_RENDER, + .rsvd1 = device->context_id, + .rsvd2 = 0, + }; +} + VkResult anv_cmd_buffer_execbuf(struct anv_device *device, struct anv_cmd_buffer *cmd_buffer, @@ -1447,9 +1464,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device, } } - result = setup_execbuf_for_cmd_buffer(, cmd_buffer); - if (result != VK_SUCCESS) - return result; + if (cmd_buffer) { + result = setup_execbuf_for_cmd_buffer(, cmd_buffer); + if (result != VK_SUCCESS) + return result; + } else { + setup_empty_execbuf(, device); + } + result = anv_device_execbuf(device, , execbuf.bos); diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index 793e519..e82e1e9 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -1014,6 +1014,32 @@ anv_device_init_border_colors(struct anv_device *device) border_colors); } +static void +anv_device_init_trivial_batch(struct 
anv_device *device) +{ + anv_bo_init_new(>trivial_batch_bo, device, 4096); + + if (device->instance->physicalDevice.has_exec_async) + device->trivial_batch_bo.flags |= EXEC_OBJECT_ASYNC; + + void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_handle, +0, 4096, 0); + + struct anv_batch batch = { + .start = map, + .next = map, + .end = map + 4096, + }; + + anv_batch_emit(, GEN7_MI_BATCH_BUFFER_END, bbe); + anv_batch_emit(, GEN7_MI_NOOP, noop); + + if (!device->info.has_llc) + gen_clflush_range(map, batch.next - map); + + anv_gem_munmap(map, device->trivial_batch_bo.size); +} + VkResult anv_CreateDevice( VkPhysicalDevicephysicalDevice, const VkDeviceCreateInfo* pCreateInfo, @@ -1131,6 +1157,8 @@ VkResult anv_CreateDevice( if (result != VK_SUCCESS) goto fail_surface_state_pool; + anv_device_init_trivial_batch(device); + anv_scratch_pool_init(device, >scratch_pool); anv_queue_init(device, >queue); @@ -1220,6 +1248,8 @@ void anv_DestroyDevice( anv_gem_munmap(device->workaround_bo.map, device->workaround_bo.size); anv_gem_close(device, device->workaround_bo.gem_handle); + anv_gem_close(device, device->trivial_batch_bo.gem_handle); + anv_state_pool_finish(>surface_state_pool); anv_state_pool_finish(>instruction_state_pool); anv_state_pool_finish(>dynamic_state_pool); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index b599db3..bc67bb6 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -745,6 +745,7 @@ struct anv_device { struct anv_state_pool surface_state_pool; struct anv_bo workaround_bo; +struct anv_bo trivial_batch_bo; struct anv_pipeline_cache blorp_shader_cache; struct blorp_contextblorp; diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c index 9a0789c..039dfd7 100644 --- a/src/intel/vulkan/anv_queue.c +++ b/src/intel/vulkan/anv_queue.c @@ -159,6 +159,23 @@ VkResult anv_QueueSubmit( pthread_mutex_lock(>mutex); for (uint32_t i = 0; i < submitCount; i++) { + if 
(pSubmits[i].commandBufferCount == 0) { + /* If we don't have any command buffers, we need to submit a dummy + * batch to give GEM something to wait on. We could, potentially, + * come up with something more efficient but this shouldn't be a + * common case. + */ + result = anv_cmd_buffer_execbuf(device, NULL, +
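The trivial batch this patch submits is just two dwords, matching the batch_len of 8 set in setup_empty_execbuf. The sketch below shows what such a buffer might contain. The command encodings are an assumption not stated in the patch (MI opcode in bits 28:23, with 0x0A for batch-buffer-end and 0 for a no-op), and the `SKETCH_` names and helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings (not stated in the patch): MI commands carry their
 * opcode in bits 28:23; opcode 0x0A ends a batch, opcode 0 is a no-op. */
#define SKETCH_MI_BATCH_BUFFER_END  (UINT32_C(0x0A) << 23)
#define SKETCH_MI_NOOP              UINT32_C(0)

/* Writes the two-dword trivial batch into buf and returns its length in
 * bytes, or 0 if buf is too small. */
static size_t
sketch_emit_trivial_batch(uint32_t *buf, size_t buf_dwords)
{
   if (buf_dwords < 2)
      return 0;
   buf[0] = SKETCH_MI_BATCH_BUFFER_END;
   buf[1] = SKETCH_MI_NOOP;   /* presumably pads the batch to 8 bytes */
   return 2 * sizeof(uint32_t);
}
```

The driver builds the real batch with anv_batch_emit into a 4096-byte BO; the sketch only mirrors the resulting contents and length under the stated assumptions.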
[Mesa-dev] [PATCH v3 1/8] anv: Add a basic implementation of VK_KHX_external_semaphore
This patch adds an implementation based on DRM BOs. We don't actually advertise the extension yet because we want to add a couple more paths first. --- src/intel/vulkan/anv_batch_chain.c | 31 +++- src/intel/vulkan/anv_extensions.py | 3 + src/intel/vulkan/anv_private.h | 3 + src/intel/vulkan/anv_queue.c | 154 +++-- 4 files changed, 184 insertions(+), 7 deletions(-) diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c index ad76dc1..94e7a7d 100644 --- a/src/intel/vulkan/anv_batch_chain.c +++ b/src/intel/vulkan/anv_batch_chain.c @@ -1419,8 +1419,21 @@ anv_cmd_buffer_execbuf(struct anv_device *device, for (uint32_t i = 0; i < num_out_semaphores; i++) { ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]); - assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE); - struct anv_semaphore_impl *impl = >permanent; + + /* Under most circumstances, out fences won't be temporary. However, + * the spec does allow it for opaque_fd. From the Vulkan 1.0.53 spec: + * + *"If the import is temporary, the implementation must restore the + *semaphore to its prior permanent state after submitting the next + *semaphore wait operation." + * + * The spec says nothing whatsoever about signal operations on + * temporarily imported semaphores so it appears they are allowed. + * There are also CTS tests that require this to work. + */ + struct anv_semaphore_impl *impl = + semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ? + >temporary : >permanent; switch (impl->type) { case ANV_SEMAPHORE_TYPE_BO: @@ -1440,6 +1453,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device, result = anv_device_execbuf(device, , execbuf.bos); + for (uint32_t i = 0; i < num_in_semaphores; i++) { + ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]); + /* From the Vulkan 1.0.53 spec: + * + *"If the import is temporary, the implementation must restore the + *semaphore to its prior permanent state after submitting the next + *semaphore wait operation." 
+ * + * This has to happen after the execbuf in case we close any syncobjs in + * the process. + */ + anv_semaphore_reset_temporary(device, semaphore); + } + anv_execbuf_finish(, >alloc); return result; diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py index ae22249..00186bc 100644 --- a/src/intel/vulkan/anv_extensions.py +++ b/src/intel/vulkan/anv_extensions.py @@ -50,6 +50,9 @@ EXTENSIONS = [ Extension('VK_KHR_external_memory', 1, True), Extension('VK_KHR_external_memory_capabilities', 1, True), Extension('VK_KHR_external_memory_fd',1, True), +Extension('VK_KHR_external_semaphore',1, False), +Extension('VK_KHR_external_semaphore_capabilities', 1, False), +Extension('VK_KHR_external_semaphore_fd', 1, False), Extension('VK_KHR_get_memory_requirements2', 1, True), Extension('VK_KHR_get_physical_device_properties2', 1, True), Extension('VK_KHR_get_surface_capabilities2', 1, True), diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index c364491..b599db3 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -1765,6 +1765,9 @@ struct anv_semaphore { struct anv_semaphore_impl temporary; }; +void anv_semaphore_reset_temporary(struct anv_device *device, + struct anv_semaphore *semaphore); + struct anv_shader_module { unsigned charsha1[20]; uint32_t size; diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c index 2c10e9d..9a0789c 100644 --- a/src/intel/vulkan/anv_queue.c +++ b/src/intel/vulkan/anv_queue.c @@ -528,11 +528,38 @@ VkResult anv_CreateSemaphore( if (semaphore == NULL) return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY); - /* The DRM execbuffer ioctl always execute in-oder so long as you stay -* on the same ring. Since we don't expose the blit engine as a DMA -* queue, a dummy no-op semaphore is a perfectly valid implementation. 
-*/ - semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DUMMY; + const VkExportSemaphoreCreateInfoKHR *export = + vk_find_struct_const(pCreateInfo->pNext, EXPORT_SEMAPHORE_CREATE_INFO_KHR); +VkExternalSemaphoreHandleTypeFlagsKHR handleTypes = + export ? export->handleTypes : 0; + + if (handleTypes == 0) { + /* The DRM execbuffer ioctl always execute in-oder so long as you stay + * on the same ring. Since we don't expose the blit engine as a DMA + * queue, a dummy no-op semaphore is a perfectly valid implementation. + */ +
[Mesa-dev] [PATCH v3 0/8] anv: Implement VK_KHR_external_semaphore
This series is a quick re-spin of the v2 sent yesterday to address review feedback from Chris. In particular, we now set EXEC_ASYNC on the trivial batch and I deleted the syncobj cache. Somehow, when I was working on this yesterday, I got it into my head that the kernel deduplicates syncobj handles and that we needed a cache to handle them correctly. This is not true. Every call to SYNCOBJ_FD_TO_HANDLE produces a new handle and the kernel does the reference counting for us.

Cc: Chad Versace
Cc: Kristian H. Kristensen
Cc: Chris Wilson

Jason Ekstrand (8):
  anv: Add a basic implementation of VK_KHX_external_semaphore
  anv: Submit a dummy batch when only semaphores are provided.
  anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
  anv: Implement support for exporting semaphores as FENCE_FD
  intel/drm: Pull in the i915 fence array API
  anv/gem: Add drm syncobj support
  anv: Use DRM sync objects for external semaphores when available
  anv: Advertise VK_KHR_external_semaphore

 include/drm-uapi/i915_drm.h        |  30 +++-
 src/intel/vulkan/anv_batch_chain.c | 175 +++-
 src/intel/vulkan/anv_device.c      |  32 +
 src/intel/vulkan/anv_extensions.py |   3 +
 src/intel/vulkan/anv_gem.c         |  93 -
 src/intel/vulkan/anv_gem_stubs.c   |  24
 src/intel/vulkan/anv_private.h     |  39 +-
 src/intel/vulkan/anv_queue.c       | 271 -
 8 files changed, 646 insertions(+), 21 deletions(-)

--
2.5.0.400.gff86faf
[Mesa-dev] [PATCH] docs: removed the '--with-sha1' requirement from shading.html
The configuration option --with-sha1 is no longer required for the MESA_SHADER_READ_PATH, MESA_SHADER_DUMP_PATH environment variables to take effect. 1- removed the "--with-sha1" sentence from docs/shading.html 2- added an extra note: that the corresponding dumped and replacement shaders must have the same filenames for the feature to take effect. --- docs/shading.html | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/shading.html b/docs/shading.html index c789102e64..8b4cfb36a1 100644 --- a/docs/shading.html +++ b/docs/shading.html @@ -65,8 +65,7 @@ Example: export MESA_GLSL=dump,nopt -Shaders can be dumped and replaced on runtime for debugging purposes. Mesa -needs to be configured with '--with-sha1' to enable this functionality. This +Shaders can be dumped and replaced on runtime for debugging purposes. This feature is not currently supported by SCons build. This is controlled via following environment variables: @@ -76,7 +75,8 @@ This is controlled via following environment variables: Note, path set must exist before running for dumping or replacing to work. When both are set, these paths should be different so the dumped shaders do -not clobber the replacement shaders. +not clobber the replacement shaders. Also, the filenames of the replacement shaders +should match the filenames of the corresponding dumped shaders. GLSL Version -- 2.13.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/12] egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface
Thanks. I wrote the same patch some time ago, but never had a chance to send it out. I'll send you another patch that I wrote to clear up some of this confusion. I put it in my series immediately before this patch. Feel free to add it to yours.

Reviewed-by: Matt Turner
[Mesa-dev] [PATCH] egl: Clean up native_type vs drawable mess
The next patch is going to stop passing XCB_WINDOW_NONE (of type xcb_window_enum_t) as an argument where these functions expect a void *, which clang does not appreciate. This patch cleans things up to better convince me and reviewers that it's safe to do that. --- src/egl/drivers/dri2/platform_x11.c | 10 -- src/egl/drivers/dri2/platform_x11_dri3.c | 6 +++--- 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c index c10cd84fce..063c50bcce 100644 --- a/src/egl/drivers/dri2/platform_x11.c +++ b/src/egl/drivers/dri2/platform_x11.c @@ -210,12 +210,8 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type, xcb_get_geometry_cookie_t cookie; xcb_get_geometry_reply_t *reply; xcb_generic_error_t *error; - xcb_drawable_t drawable; const __DRIconfig *config; - STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface)); - drawable = (uintptr_t) native_surface; - (void) drv; dri2_surf = malloc(sizeof *dri2_surf); @@ -234,14 +230,16 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type, dri2_surf->drawable, dri2_dpy->screen->root, dri2_surf->base.Width, dri2_surf->base.Height); } else { - if (!drawable) { + if (!native_surface) { if (type == EGL_WINDOW_BIT) _eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface"); else _eglError(EGL_BAD_NATIVE_PIXMAP, "dri2_create_surface"); goto cleanup_surf; } - dri2_surf->drawable = drawable; + + STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface)); + dri2_surf->drawable = (uintptr_t) native_surface; } config = dri2_get_dri_config(dri2_conf, type, diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c index 515be27e20..df17cfa7aa 100644 --- a/src/egl/drivers/dri2/platform_x11_dri3.c +++ b/src/egl/drivers/dri2/platform_x11_dri3.c @@ -172,9 +172,6 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type, const __DRIconfig *dri_config; xcb_drawable_t drawable; - 
STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface)); - drawable = (uintptr_t) native_surface; - (void) drv; dri3_surf = calloc(1, sizeof *dri3_surf); @@ -191,6 +188,9 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type, xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize, drawable, dri2_dpy->screen->root, dri3_surf->base.Width, dri3_surf->base.Height); + } else { + STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface)); + drawable = (uintptr_t) native_surface; } dri_config = dri2_get_dri_config(dri2_conf, type, -- 2.13.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
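The cast that the patch above moves around can be sketched in isolation. This is only an illustration, not the Mesa code: `drawable_id` is a hypothetical stand-in for `xcb_drawable_t` (a 32-bit XID), both helper names are made up, and C11 `_Static_assert` plays the role of Mesa's `STATIC_ASSERT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xcb_drawable_t, which is a 32-bit XID. */
typedef uint32_t drawable_id;

/* Same idea as the STATIC_ASSERT in the patch: refuse to build if a
 * pointer and uintptr_t ever differ in size. */
_Static_assert(sizeof(uintptr_t) == sizeof(void *),
               "native surface pointer must fit in uintptr_t");

/* Recover the XID that a caller passed through a void * argument.
 * Going through uintptr_t avoids a direct pointer-to-uint32_t cast. */
static drawable_id
drawable_from_native(void *native_surface)
{
   return (drawable_id) (uintptr_t) native_surface;
}

/* Opposite direction: wrap an XID in a void * for a generic entry point. */
static void *
native_from_drawable(drawable_id id)
{
   return (void *) (uintptr_t) id;
}
```

With this shape, a NULL `native_surface` maps to drawable 0, which is why testing the pointer itself (`!native_surface`) is equivalent to testing the converted drawable.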
[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #2 from Brian Paul ---
Hmm, swr isn't working at all for me. It's hanging in a swr_fence_finish() call with everything I've tried. Even glxinfo hangs (but elsewhere). Maybe one of the swr developers can take a look.
[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #1 from Brian Paul ---
Sorry for the breakage, Brad. I'll try to investigate ASAP, but I'm about to leave town for a week. I can't repro with llvmpipe, fwiw. I suspect it may be an issue in the swr driver. I'm building it now, but may run out of time.
Re: [Mesa-dev] [PATCH 1/2] etnaviv: fix etna_bo_from_name
On Fri, 2017-08-04 at 18:15 +0200, Wladimir J. van der Laan wrote:
> On Fri, Aug 04, 2017 at 05:07:54PM +0200, Philipp Zabel wrote:
> > Look up BOs from the name table using the name parameter instead of
> > req.handle (which at this point is always zero).
>
> Good catch.
>
> Just out of interest: when is this used, what problems does this cause?

It is used by the etnaviv gallium driver in etna_screen_bo_from_handle for DRM_API_HANDLE_TYPE_SHARED handles. Since this just falls back to asking the kernel to DRM_IOCTL_GEM_OPEN if the BO is not found in the name_table already, this bug caused no problems.

regards
Philipp
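To make the explanation above concrete, here is a toy model of the lookup. It is a sketch under assumptions, not the etnaviv code: `toy_bo`, `toy_name_table`, and both functions are hypothetical, the table is a plain array rather than a hash table, and the "ask the kernel" fallback is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer object: a global (flink) name plus a local handle. */
struct toy_bo {
   uint32_t name;
   uint32_t handle;
};

/* Hypothetical name table; a flat array stands in for the real hash table. */
struct toy_name_table {
   struct toy_bo *entries;
   size_t count;
};

static struct toy_bo *
toy_lookup_by_name(const struct toy_name_table *table, uint32_t name)
{
   for (size_t i = 0; i < table->count; i++) {
      if (table->entries[i].name == name)
         return &table->entries[i];
   }
   return NULL;
}

/* Look a BO up by name. With use_buggy_key set, the lookup uses a handle
 * field that has not been filled in yet (always zero), so it never hits
 * and every call falls through to the kernel path. */
static struct toy_bo *
toy_bo_from_name(const struct toy_name_table *table, uint32_t name,
                 int use_buggy_key, int *asked_kernel)
{
   uint32_t unset_handle = 0;
   uint32_t key = use_buggy_key ? unset_handle : name;
   struct toy_bo *bo = toy_lookup_by_name(table, key);

   /* The real code would open the BO via the kernel here; the sketch
    * only records that the fallback would have been taken. */
   *asked_kernel = (bo == NULL);
   return bo;
}
```

In this model the buggy key costs an avoidable kernel round trip on every call but still ends up with a usable BO, which matches the "caused no problems" assessment.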
Re: [Mesa-dev] [PATCH v2 8/9] anv: Use DRM sync objects for external semaphores when available
On Fri, Aug 4, 2017 at 2:03 AM, Chris Wilson wrote:
> Quoting Jason Ekstrand (2017-08-04 02:25:27)
> > @@ -1497,6 +1569,12 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >        setup_empty_execbuf(&execbuf, device);
> >     }
> >
> > +   if (execbuf.fence_count > 0) {
>
> For sanity, since I just had to check, assert(device->has_syncobj);

Good call.

> > +      execbuf.execbuf.flags |= I915_EXEC_FENCE_ARRAY;
> > +      execbuf.execbuf.num_cliprects = execbuf.fence_count;
> > +      execbuf.execbuf.cliprects_ptr = (uintptr_t) execbuf.fences;
> > +   }
> > +
> >     if (in_fence != -1) {
> >        execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
> >        execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
Re: [Mesa-dev] [PATCH v2 7/9] anv/allocator: Add a syncobj cache
On Fri, Aug 4, 2017 at 1:59 AM, Chris Wilson wrote:
> Quoting Jason Ekstrand (2017-08-04 02:25:26)
> > This is mostly a copy+paste of the BO cache but it's a bit simpler
> > because syncobjs don't have actual backing storage so we don't need to
> > check sizes or anything like that. Also, we put the refcount directly
> > in anv_syncobj because they will always be heap pointers.
>
> Ok, but why do we need one at all? Some part of the Vk spec, some bad
> behaviour you noticed? Or just that it is more elegant to be minimalist?

Gah! I thought I saw a real-world problem and decided the kernel must be de-duplicating for me. But now I remember that it doesn't; I just looked at the kernel code and confirmed that it gives you a new idr entry on every fd_to_handle. I'll delete all this garbage and go back to doing it the way I was before. Thanks for pointing that out!

> > ---
> >  src/intel/vulkan/anv_allocator.c | 194 +++
> >  src/intel/vulkan/anv_device.c    |   9 +-
> >  src/intel/vulkan/anv_private.h   |  40
> >  3 files changed, 242 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
> > index efaaebc..204c466 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -1422,3 +1422,197 @@ anv_bo_cache_release(struct anv_device *device,
> >
> >     vk_free(&device->alloc, bo);
> >  }
> > +
> > +VkResult
> > +anv_syncobj_cache_init(struct anv_syncobj_cache *cache)
> > +{
> > +   cache->map = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> > +                                        _mesa_key_pointer_equal);
>
> Not hash_uint for u32? Bah, for the number of ht mesa creates for
> looking up u32 names, you would think it would have an ultra-specialised
> data struct for it. :(
> -Chris
Re: [Mesa-dev] [PATCH 1/5] clover/memory: Copy data when creating buffers with CL_MEM_USE_HOST_PTR
On 2017-08-03 22:26, Alex Deucher wrote:

IIRC, user_ptrs require page alignment.

Alex

I didn't follow the whole discussion (sorry if I'm saying something redundant), but AMD's older OpenCL Optimization Guide [1] has some notes regarding the implementation of the USE_HOST_PTR flag. It initially recommends 4KB (aka page) alignment but also supports arbitrary alignment (with additional overhead; I suppose it pins an extra page for bad alignments). It also does some optimizations to minimize mapping/unmapping operations, called "pre-pinning". Not sure if that is applicable to Mesa/Clover; aren't (GTT) buffers usually mapped forever?

Grigori

[1] http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide.pdf

Right now it's hard-coded to R600_MAP_BUFFER_ALIGNMENT in si_pipe.c and r600_pipe.c which has a value of 64 (bytes, I believe).

And also change si_pipe.c:si_get_param's switch statement value to return:

case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT: return sscreen->b.info.gart_page_size;

I'm not sure what the correct value is here. AFAIK, EG uses 256B cache lines so I'd expect the value of to be at least that

Depending on how the weather works out tonight, I might be able to at least find out what NI reports for gart page sizes and compare that to my SI.

I haven't tried to test user pointer support on r600g yet, so either it's working alright with the existing 64-byte alignment, or it's broken when we allocate pointers using the actual alignments reported by clGetDeviceInfo. If it's broken, I'll try 256B, then keep bumping it up until it either starts working or I hit GART page size.

--Aaron

Both NI and GCN should be able to use 4K pages (which is what gart_page_size is set to), but we might want higher alignment for better performance[0]

[0] https://lists.freedesktop.org/archives/dri-devel/2014-May/058858.html

Then I can successfully create buffers from user pointers on my SI card.

I'm a bit fuzzy on what alignment restrictions exist for SI/GCN cards, but the winsys seems to indicate we should align things to gart page size, which makes sense on the surface at least. If the alignment restrictions have changed between R600 and GCN, that might explain why what's broken for me is working for you/Grigori (on r600).

I remember there was a buffer alignment patch from AMD recently for SI/CI vs. VI+, but I can't find it. It looks like a separate issue, however. If incorrect alignment makes user_ptr fail, and the test still fails, it looks like the no-user_ptr fallback is broken.

Jan

--Aaron

> > Jan
> >
> > >
> > > Signed-off-by: Aaron Watry
> > > CC: Francisco Jerez
> > > ---
> > >  src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/src/gallium/state_trackers/clover/core/memory.cpp b/src/gallium/state_trackers/clover/core/memory.cpp
> > > index b852e6896f..912d74830a 100644
> > > --- a/src/gallium/state_trackers/clover/core/memory.cpp
> > > +++ b/src/gallium/state_trackers/clover/core/memory.cpp
> > > @@ -30,7 +30,7 @@ memory_obj::memory_obj(clover::context &ctx, cl_mem_flags flags,
> > >                         size_t size, void *host_ptr) :
> > >     context(ctx), _flags(flags),
> > >     _size(size), _host_ptr(host_ptr) {
> > > -   if (flags & CL_MEM_COPY_HOST_PTR)
> > > +   if (flags & (CL_MEM_COPY_HOST_PTR | CL_MEM_USE_HOST_PTR))
> > >        data.append((char *)host_ptr, size);
> > >  }
> > >
>
> --
> Jan Vesely
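The alignment question in this thread (64 bytes vs. 256 bytes vs. a 4K GART page) comes down to a mask test on the host pointer. The helpers below are a generic sketch and not part of Clover or the radeon winsys; the function names are made up, and both assume the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if ptr sits on an 'alignment' boundary.
 * Assumes alignment is a non-zero power of two. */
static int
is_aligned(const void *ptr, size_t alignment)
{
   return ((uintptr_t) ptr & (alignment - 1)) == 0;
}

/* Round value up to the next multiple of 'alignment'.
 * Assumes alignment is a non-zero power of two. */
static size_t
align_up(size_t value, size_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}
```

A pointer that passes the 64-byte check can still fail the 4096-byte one, which would be consistent with user pointers working on one generation and failing on another if the hardware requirements really do differ.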
Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS
> -Original Message- > From: Tomasz Figa [mailto:tf...@chromium.org] > Sent: Friday, August 4, 2017 9:39 PM > On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogesh >wrote: > > Tomasz, Emil, > > > >> -Original Message- > >> From: Tomasz Figa [mailto:tf...@chromium.org] > >> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov > wrote: > >> >>> >> - version check (2+) the fence extension, calling > >> >>> >> .create_fence_fd() only when > >> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD > >> >> > >> >> The check looks like below now, this is in > >> >> dri2_surf_update_fence_fd() before > >> create_fence_fd is called. > >> >> > >> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) { > >> >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence- > >> >get_capabilities(dri2_dpy->dri_screen)) { > >> >> //create_fence_fd call > >> >>} > >> >> } > >> >> > >> > Close but no cigar. > >> > > >> > if (dri2_surf->enable_out_fence && dri2_dpy->fence && > >> > dri2_dpy->fence->base.version >= 2 && > >> > dri2_dpy->fence->get_capabilities) { > >> > > >> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) & > >> > __DRI_FENCE_CAP_NATIVE_FD) { > >> > //create_fence_fd call > >> >} > >> > } > >> > >> If this needs so complicated series of checks, maybe it would make > >> more sense to just set enable_out_fence based on availability of the > >> capability at initialization time? > > > > I liked this one compared to nested ifs in dri2_surf_update_fence_fd(). > > > >> > >> > > >> >> Overall, if I further go ahead and check, actually > >> >> get_capabilities() ultimately returns based on has_exec_fence > >> >> which depends on I915_PARAM_HAS_EXEC_FENCE. This is always set to > >> >> true for i915 in kernel drv unless forced to false!! I'm not sure > >> >> if that inner check of > >> get_capabilities still makes sense. Isn't the first one sufficient? 
> >> >> > >> > Not sure what you mean with "first one", but consider the following > example: > >> > - old kernel which does not support (or has force disabled) > >> > I915_PARAM_HAS_EXEC_FENCE. > >> > - new userspace which unconditionally advertises the fence v2 > >> > extension IIRC one may tweak that things to only conditionally > >> > advertise it, but IMHO it's not worth the hassle. > >> > > >> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module) > >> > so focusing on one doesn't quite work. > >> > > >> >>> >> - don't introduce unused variables (in make_current) > >> >> > >> >> Done. > >> >> > >> >>> >> - the create fd for the old display surface (in make_current) > >> >>> >> seems bogus > >> >> > >> >> Done. > >> >> > >> > Did you drop it all together or changed to use some other surface? > >> > Would be nice to hear the reason why it was added - perhaps I'm > >> > missing something. > >> > >> We have to keep it, otherwise there would be no fence available at > >> the time of surface destruction, while, at least for Android, a fence > >> can be passed to window's cancelBuffer callback. > >> > >> > > >> > I think that we want a fence/fd for the new draw surface. Since > >> > otherwise one won't get created up until the first SwapBuffers call. > >> > >> I might be missing something, but wouldn't that insert a fence at the > >> beginning of command stream, before even doing anything? At least in > >> Android use cases, the only places we need the fence is in > >> SwapBuffers and DestroySurface and the fence should be inserted after > >> all the commands for rendering into given surface. > >> > > > > Emil, > > > > Tomasz sounds convincing to me here, I just went ahead with the > > comment to try out and flatland worked even after removing that. 
> > Zhongmin can explain better but I think in earlier revisions this was > > done for cancelBuffer to match with queueBuffer, I mean we are passing > > valid fd for queueBuffer by doing this we would have a valid fd during > cancelBuffer. Not sure if this is the reason / one of the reason. > > > > I will go ahead with rest of your comments if we are ok to keep fd for > > old display surface in make_current. > > My understanding is that nobody actually cares about the fence that > cancelBuffer returns, because the contents of the buffer are going to be > discarded anyway and the buffer doesn't go to the consumer (e.g. > flatland code that reads the timestamp). I even suspect that typically > destroySurface would be called directly after swapBuffers and the surface > wouldn't have a buffer to cancel. You can easily check this by adding a print > before cancelBuffer call happens. So we might actually be fine with simpler > code > that gets fence only for swapBuffers. > Sure. I can confirm this. > Changing the topic, the patch doesn't seem to change the implementation of > swapBuffers to stop doing a flush on the buffer, which defeats the purpose of > the fence, as the it is likely already
Re: [Mesa-dev] [PATCH v2 2/9] anv: Submit a dummy batch when only semaphores are provided.
On Fri, Aug 4, 2017 at 1:43 AM, Chris Wilson wrote:
> Quoting Jason Ekstrand (2017-08-04 02:25:21)
> > Vulkan allows you to do a submit whose only job is to wait on and
> > trigger semaphores. The easiest way for us to support that right
> > now is to insert a dummy execbuf.
> > ---
> > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > index 793e519..0f0aa22 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1014,6 +1014,28 @@ anv_device_init_border_colors(struct anv_device *device)
> >                          border_colors);
> >  }
> >
> > +static void
> > +anv_device_init_trivial_batch(struct anv_device *device)
> > +{
> > +   anv_bo_init_new(&device->trivial_batch_bo, device, 4096);
>
> Is this created with ASYNC?

No, it isn't, but I'm happy to set the flag. This patch predates the ASYNC stuff, I believe.

> Just thinking that you only want the external ordering constraints on
> this bo, and not accidentally serialize between contexts.

Is this really an issue? No other process will ever see this BO. I suppose the kernel is still doing unneeded flushing, but this shouldn't cause cross-context synchronization.

> > +   void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_handle,
> > +                            0, 4096, 0);
> > +
> > +   struct anv_batch batch = {
> > +      .start = map,
> > +      .next = map,
> > +      .end = map + 4096,
> > +   };
> > +
> > +   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
> > +   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
> > +
> > +   if (!device->info.has_llc)
> > +      gen_clflush_range(map, batch.next - map);
> > +
> > +   anv_gem_munmap(map, device->trivial_batch_bo.size);
> > +}
>
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> > index b599db3..bc67bb6 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -745,6 +745,7 @@ struct anv_device {
> >      struct anv_state_pool surface_state_pool;
> >
> >      struct anv_bo workaround_bo;
> > +    struct anv_bo trivial_batch_bo;
>
> Do you use all 4096 bytes of the workaround_bo, or could you spare 64? ;)

I could... Then again, I can also easily spare a single 4K page per context and prevent myself from accidentally overwriting my little batch. :-)

> > diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
> > index 446c3de..9934fef 100644
> > --- a/src/intel/vulkan/anv_queue.c
> > +++ b/src/intel/vulkan/anv_queue.c
> > @@ -159,6 +159,23 @@ VkResult anv_QueueSubmit(
> >     pthread_mutex_lock(&device->mutex);
> >
> >     for (uint32_t i = 0; i < submitCount; i++) {
> > +      if (pSubmits[i].commandBufferCount == 0) {
> > +         /* If we don't have any command buffers, we need to submit a dummy
> > +          * batch to give GEM something to wait on. We could, potentially,
> > +          * come up with something more efficient but this shouldn't be a
> > +          * common case.
> > +          */
> > +         result = anv_cmd_buffer_execbuf(device, NULL,
> > +                                         pSubmits[i].pWaitSemaphores,
> > +                                         pSubmits[i].waitSemaphoreCount,
> > +                                         pSubmits[i].pSignalSemaphores,
> > +                                         pSubmits[i].signalSemaphoreCount);
>
> Might as well just pass &pSubmits[i] along?

I can't. See below where we only pass the wait semaphores to the first execbuf in the batch and only pass the signal semaphores to the last.

--Jason
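The rule Jason describes (wait semaphores go to the first execbuf of a submit, signal semaphores to the last, and an empty submit gets one dummy execbuf carrying both) can be sketched as a small planning function. All names here are hypothetical and only the counts are modelled; this is not the anv implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, count-only view of one VkSubmitInfo. */
struct toy_submit {
   uint32_t cmd_buffer_count;
   uint32_t wait_count;
   uint32_t signal_count;
};

/* How many semaphores of each kind one execbuf would carry. */
struct toy_execbuf_plan {
   uint32_t wait_count;
   uint32_t signal_count;
};

/* Fill 'plans' with one entry per execbuf and return how many were used.
 * The caller must provide room for max(1, cmd_buffer_count) entries. */
static uint32_t
toy_plan_submit(const struct toy_submit *submit,
                struct toy_execbuf_plan *plans, uint32_t max_plans)
{
   if (submit->cmd_buffer_count == 0) {
      /* No command buffers: a single dummy batch waits and signals. */
      assert(max_plans >= 1);
      plans[0].wait_count = submit->wait_count;
      plans[0].signal_count = submit->signal_count;
      return 1;
   }

   assert(max_plans >= submit->cmd_buffer_count);
   for (uint32_t j = 0; j < submit->cmd_buffer_count; j++) {
      /* Only the first execbuf waits; only the last one signals. */
      plans[j].wait_count = (j == 0) ? submit->wait_count : 0;
      plans[j].signal_count =
         (j == submit->cmd_buffer_count - 1) ? submit->signal_count : 0;
   }
   return submit->cmd_buffer_count;
}
```

Because the semaphore lists are split per execbuf in the non-empty case, the execbuf helper takes the arrays and counts separately rather than the whole submit structure.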
Re: [Mesa-dev] [PATCH 1/2] etnaviv: fix etna_bo_from_name
On Fri, Aug 04, 2017 at 05:07:54PM +0200, Philipp Zabel wrote:
> Look up BOs from the name table using the name parameter instead of
> req.handle (which at this point is always zero).

Good catch.

Just out of interest: when is this used, what problems does this cause?

Regards,
Wladimir
Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS
On Sat, Aug 5, 2017 at 12:53 AM, Marathe, Yogeshwrote: > Tomasz, Emil, > >> -Original Message- >> From: Tomasz Figa [mailto:tf...@chromium.org] >> Sent: Friday, August 4, 2017 6:54 PM >> To: Emil Velikov >> Cc: Marathe, Yogesh ; Antognolli, Rafael >> ; ML mesa-dev > d...@lists.freedesktop.org>; Wu, Zhongmin ; Gao, >> Shuo ; Liu, Zhiquan ; Daniel >> Stone ; Timothy Arceri ; Eric >> Engestrom ; Kenneth Graunke ; >> Kondapally, Kalyan ; Varad Gautam >> ; Rainer Hochecker ; >> Nicolai Hähnle >> Subject: Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync >> fence >> for Android OS >> >> On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov >> wrote: >> >>> >> - version check (2+) the fence extension, calling >> >>> >> .create_fence_fd() only when >> >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD >> >> >> >> The check looks like below now, this is in dri2_surf_update_fence_fd() >> >> before >> create_fence_fd is called. >> >> >> >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) { >> >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence- >> >get_capabilities(dri2_dpy->dri_screen)) { >> >> //create_fence_fd call >> >>} >> >> } >> >> >> > Close but no cigar. >> > >> > if (dri2_surf->enable_out_fence && dri2_dpy->fence && >> > dri2_dpy->fence->base.version >= 2 && >> > dri2_dpy->fence->get_capabilities) { >> > >> >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) & >> > __DRI_FENCE_CAP_NATIVE_FD) { >> > //create_fence_fd call >> >} >> > } >> >> If this needs so complicated series of checks, maybe it would make more sense >> to just set enable_out_fence based on availability of the capability at >> initialization time? > > I liked this one compared to nested ifs in dri2_surf_update_fence_fd(). > >> >> > >> >> Overall, if I further go ahead and check, actually get_capabilities() >> >> ultimately returns based on has_exec_fence which depends on >> >> I915_PARAM_HAS_EXEC_FENCE. 
This is always set to true for i915 in >> >> kernel drv unless forced to false!! I'm not sure if that inner check of >> get_capabilities still makes sense. Isn't the first one sufficient? >> >> >> > Not sure what you mean with "first one", but consider the following >> > example: >> > - old kernel which does not support (or has force disabled) >> > I915_PARAM_HAS_EXEC_FENCE. >> > - new userspace which unconditionally advertises the fence v2 >> > extension IIRC one may tweak that things to only conditionally >> > advertise it, but IMHO it's not worth the hassle. >> > >> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module) so >> > focusing on one doesn't quite work. >> > >> >>> >> - don't introduce unused variables (in make_current) >> >> >> >> Done. >> >> >> >>> >> - the create fd for the old display surface (in make_current) >> >>> >> seems bogus >> >> >> >> Done. >> >> >> > Did you drop it all together or changed to use some other surface? >> > Would be nice to hear the reason why it was added - perhaps I'm >> > missing something. >> >> We have to keep it, otherwise there would be no fence available at the time >> of >> surface destruction, while, at least for Android, a fence can be passed to >> window's cancelBuffer callback. >> >> > >> > I think that we want a fence/fd for the new draw surface. Since >> > otherwise one won't get created up until the first SwapBuffers call. >> >> I might be missing something, but wouldn't that insert a fence at the >> beginning >> of command stream, before even doing anything? At least in Android use cases, >> the only places we need the fence is in SwapBuffers and DestroySurface and >> the >> fence should be inserted after all the commands for rendering into given >> surface. >> > > Emil, > > Tomasz sounds convincing to me here, I just went ahead with the comment to > try out and > flatland worked even after removing that. 
Zhongmin can explain better but I > think in earlier > revisions this was done for cancelBuffer to match with queueBuffer, I mean we > are passing > valid fd for queueBuffer by doing this we would have a valid fd during > cancelBuffer. Not > sure if this is the reason / one of the reason. > > I will go ahead with rest of your comments if we are ok to keep fd for old > display surface > in make_current. My understanding is that nobody actually cares about the fence that cancelBuffer returns, because the contents of the buffer are going to be discarded anyway and the buffer doesn't go to the consumer (e.g. flatland code that reads the timestamp). I even suspect that typically
Re: [Mesa-dev] [PATCH] loader: always include libxmlconfig on autotools build
On Fri, 2017-08-04 at 11:53 +0200, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> This aligns with the fact that we also check for EXPAT_LIBS
> unconditionally in configure.ac now. It should make all the
> various build permutations of Clover work (whether DRI is
> enabled or disabled in the build).
>
> Cc: Aaron Watry
> Cc: Emil Velikov
> --
> This change keeps everything green on Travis, and it should fix
> the duplicate-symbol linker error seen by Aaron and others when
> building Clover.

It does. This patch fixes the last of the problems I had since the driconf changes.

Tested-by: Jan Vesely

thanks,
Jan

> ---
>  src/gallium/targets/opencl/Makefile.am |  1 -
>  src/loader/Makefile.am                 | 13 +
>  2 files changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
> index e88fa0fd382..c9d2be7afd0 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -19,7 +19,6 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
>     $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
>     $(top_builddir)/src/gallium/auxiliary/libgallium.la \
>     $(top_builddir)/src/util/libmesautil.la \
> -   $(top_builddir)/src/util/libxmlconfig.la \
>     $(EXPAT_LIBS) \
>     $(LIBELF_LIBS) \
>     $(DLOPEN_LIBS) \
> diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
> index 8b197f2995c..5ed87820664 100644
> --- a/src/loader/Makefile.am
> +++ b/src/loader/Makefile.am
> @@ -33,21 +33,18 @@ AM_CPPFLAGS = \
>     $(XCB_DRI3_CFLAGS) \
>     $(LIBDRM_CFLAGS)
>
> -libloader_la_CPPFLAGS = $(AM_CPPFLAGS)
> +libloader_la_CPPFLAGS = $(AM_CPPFLAGS) \
> +   -DUSE_DRICONF
>  libloader_la_SOURCES = $(LOADER_C_FILES)
> -libloader_la_LIBADD =
> +libloader_la_LIBADD = \
> +   $(top_builddir)/src/util/libxmlconfig.la
>
>  if HAVE_DRICOMMON
>  libloader_la_CPPFLAGS += \
>     -I$(top_builddir)/src/util/ \
>     -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
>     -I$(top_srcdir)/src/mesa/ \
> -   -I$(top_srcdir)/src/mapi/ \
> -   -DUSE_DRICONF
> -
> -libloader_la_LIBADD += \
> -   $(top_builddir)/src/util/libxmlconfig.la
> -
> +   -I$(top_srcdir)/src/mapi/
>  endif
>
>  if HAVE_LIBDRM
Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync fence for Android OS
Tomasz, Emil, > -Original Message- > From: Tomasz Figa [mailto:tf...@chromium.org] > Sent: Friday, August 4, 2017 6:54 PM > To: Emil Velikov> Cc: Marathe, Yogesh ; Antognolli, Rafael > ; ML mesa-dev d...@lists.freedesktop.org>; Wu, Zhongmin ; Gao, > Shuo ; Liu, Zhiquan ; Daniel > Stone ; Timothy Arceri ; Eric > Engestrom ; Kenneth Graunke ; > Kondapally, Kalyan ; Varad Gautam > ; Rainer Hochecker ; > Nicolai Hähnle > Subject: Re: [Mesa-dev] [PATCH v5 2/2] i965: Queue the buffer with a sync > fence > for Android OS > > On Fri, Aug 4, 2017 at 9:55 PM, Emil Velikov wrote: > >>> >> - version check (2+) the fence extension, calling > >>> >> .create_fence_fd() only when > >>> >> .get_capabilities() advertises __DRI_FENCE_CAP_NATIVE_FD > >> > >> The check looks like below now, this is in dri2_surf_update_fence_fd() > >> before > create_fence_fd is called. > >> > >> if (dri2_surf->enable_out_fence && dri2_dpy->fence) { > >>if(__DRI_FENCE_CAP_NATIVE_FD | dri2_dpy->fence- > >get_capabilities(dri2_dpy->dri_screen)) { > >> //create_fence_fd call > >>} > >> } > >> > > Close but no cigar. > > > > if (dri2_surf->enable_out_fence && dri2_dpy->fence && > > dri2_dpy->fence->base.version >= 2 && > > dri2_dpy->fence->get_capabilities) { > > > >if (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) & > > __DRI_FENCE_CAP_NATIVE_FD) { > > //create_fence_fd call > >} > > } > > If this needs so complicated series of checks, maybe it would make more sense > to just set enable_out_fence based on availability of the capability at > initialization time? I liked this one compared to nested ifs in dri2_surf_update_fence_fd(). > > > > >> Overall, if I further go ahead and check, actually get_capabilities() > >> ultimately returns based on has_exec_fence which depends on > >> I915_PARAM_HAS_EXEC_FENCE. This is always set to true for i915 in > >> kernel drv unless forced to false!! I'm not sure if that inner check of > get_capabilities still makes sense. Isn't the first one sufficient? 
> >>
> > Not sure what you mean with "first one", but consider the following example:
> > - old kernel which does not support (or has force disabled)
> > I915_PARAM_HAS_EXEC_FENCE.
> > - new userspace which unconditionally advertises the fence v2
> > extension IIRC one may tweak that things to only conditionally
> > advertise it, but IMHO it's not worth the hassle.
> >
> > Even then, Mesa can produce 20 DRI drivers (used by the EGL module) so
> > focusing on one doesn't quite work.
> >
> >>> >> - don't introduce unused variables (in make_current)
> >>
> >> Done.
> >>
> >>> >> - the create fd for the old display surface (in make_current)
> >>> >> seems bogus
> >>
> >> Done.
> >>
> > Did you drop it all together or changed to use some other surface?
> > Would be nice to hear the reason why it was added - perhaps I'm
> > missing something.
>
> We have to keep it, otherwise there would be no fence available at the time of
> surface destruction, while, at least for Android, a fence can be passed to
> window's cancelBuffer callback.
>
> >
> > I think that we want a fence/fd for the new draw surface. Since
> > otherwise one won't get created up until the first SwapBuffers call.
>
> I might be missing something, but wouldn't that insert a fence at the beginning
> of command stream, before even doing anything? At least in Android use cases,
> the only places we need the fence is in SwapBuffers and DestroySurface and the
> fence should be inserted after all the commands for rendering into given
> surface.
>

Emil, Tomasz sounds convincing to me here. I went ahead and tried out the
comment, and flatland worked even after removing that. Zhongmin can explain
better, but I think in earlier revisions this was done so that cancelBuffer
matches queueBuffer: we pass a valid fd for queueBuffer, and by doing this we
would also have a valid fd during cancelBuffer. I'm not sure whether this is
the reason, or one of the reasons.

I will go ahead with the rest of your comments if we are OK to keep the fd for
the old display surface in make_current.

> Best regards,
> Tomasz
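
For reference, a minimal standalone sketch of why the two checks quoted in this thread behave differently. It does not use the real Mesa headers: struct fence_ext, FENCE_CAP_NATIVE_FD, and the caps_* helpers are hypothetical stand-ins for the DRI fence extension and __DRI_FENCE_CAP_NATIVE_FD, and the value 1u is an assumption made only for illustration. ORing with a non-zero constant is always true, so the first form never consults the driver's answer; the second form tests the bit and also guards against an older extension version or a missing callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __DRI_FENCE_CAP_NATIVE_FD (value assumed). */
#define FENCE_CAP_NATIVE_FD 1u

/* Hypothetical, simplified stand-in for the DRI fence extension. */
struct fence_ext {
    int version;                                /* stands in for base.version */
    unsigned (*get_capabilities)(void *screen); /* may be NULL on old versions */
};

/* First form from the thread: '|' with a non-zero constant is always
 * non-zero, so this returns true whatever the driver reports. */
static bool has_native_fd_buggy(const struct fence_ext *fence, void *screen)
{
    return (FENCE_CAP_NATIVE_FD | fence->get_capabilities(screen)) != 0;
}

/* Second form from the thread: check the version, check that the callback
 * exists, then test the capability bit with '&'. */
static bool has_native_fd(const struct fence_ext *fence, void *screen)
{
    return fence != NULL &&
           fence->version >= 2 &&
           fence->get_capabilities != NULL &&
           (fence->get_capabilities(screen) & FENCE_CAP_NATIVE_FD) != 0;
}

/* Fake drivers: one models a kernel without exec-fence support,
 * the other a kernel that has it. */
static unsigned caps_none(void *screen)
{
    (void)screen;
    return 0u;
}

static unsigned caps_native_fd(void *screen)
{
    (void)screen;
    return FENCE_CAP_NATIVE_FD;
}
```

With caps_none (the old-kernel case from the example above), has_native_fd_buggy() still reports true while has_native_fd() reports false. Following the suggestion to decide at initialization time, the result of has_native_fd() could be computed once and stored in a flag such as enable_out_fence, rather than repeated on every update.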