[Mesa-dev] [PATCH] nvc0: don't try to go through the push path for indirect draws
This fixes dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute These tests were causing a const vbo to be set up, and were small enough draws that the logic was trying to go via the push path (which emits data directly into the cmd stream rather than uploading a user vbo). Signed-off-by: Ilia Mirkin Cc: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c index 4d9cd57..888c094 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c @@ -948,7 +948,8 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) * if index count is larger and we expect repeated vertices, suggest upload. */ nvc0->vbo_push_hint = - info->indexed && (nvc0->vb_elt_limit >= (info->count * 2)); + !info->indirect && info->indexed && + (nvc0->vb_elt_limit >= (info->count * 2)); /* Check whether we want to switch vertex-submission mode. */ if (nvc0->vbo_user && !(nvc0->dirty_3d & (NVC0_NEW_3D_ARRAYS | NVC0_NEW_3D_VERTEX))) { -- 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 95395] glsl: NULL type value in add_uniform() leads to SIGSEGV
https://bugs.freedesktop.org/show_bug.cgi?id=95395 --- Comment #2 from Jonathan Gray --- None of the builds were with LLVM enabled. Interestingly I can't reproduce this on sparc64 which requires strict alignment. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] clover: add LLVM version to device and platform version
Giuseppe Bilotta writes: > Code generation (kernel compilation) may sometimes hit LLVM-specific > bugs. Adding the used LLVM version to the version string may make bug > triaging easier. > > Signed-off-by: Giuseppe Bilotta Acked-by: Francisco Jerez > --- > configure.ac | 2 +- > src/gallium/state_trackers/clover/api/device.cpp | 2 +- > src/gallium/state_trackers/clover/api/platform.cpp | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) > > > I believe similar additions could be made for OpenGL version strings as well. > > diff --git a/configure.ac b/configure.ac > index 023110e..4fcadcf 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -2116,7 +2116,7 @@ if test "x$enable_gallium_llvm" = xyes; then > LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker > instrumentation" > LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option objcarcopts > profiledata" > fi > -DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT > -DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH" > +DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT > -DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH > '-DMESA_LLVM_VERSION_STRING=\"$LLVM_VERSION_MAJOR.$LLVM_VERSION_MINOR.$LLVM_VERSION_PATCH\"'" > MESA_LLVM=1 > > dnl Check for Clang internal headers > diff --git a/src/gallium/state_trackers/clover/api/device.cpp > b/src/gallium/state_trackers/clover/api/device.cpp > index bc93f91..0d0f77b 100644 > --- a/src/gallium/state_trackers/clover/api/device.cpp > +++ b/src/gallium/state_trackers/clover/api/device.cpp > @@ -300,7 +300,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, >break; > > case CL_DEVICE_VERSION: > - buf.as_string() = "OpenCL 1.1 MESA " PACKAGE_VERSION; > + buf.as_string() = "OpenCL 1.1 MESA " PACKAGE_VERSION " LLVM " > MESA_LLVM_VERSION_STRING; >break; > > case CL_DEVICE_EXTENSIONS: > diff --git a/src/gallium/state_trackers/clover/api/platform.cpp > b/src/gallium/state_trackers/clover/api/platform.cpp > index cf71593..06eb4ec 100644 > --- a/src/gallium/state_trackers/clover/api/platform.cpp > +++ b/src/gallium/state_trackers/clover/api/platform.cpp > @@ -57,7 +57,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, > cl_platform_info param, >break; > > case CL_PLATFORM_VERSION: > - buf.as_string() = "OpenCL 1.1 MESA " PACKAGE_VERSION; > + buf.as_string() = "OpenCL 1.1 MESA " PACKAGE_VERSION " LLVM " > MESA_LLVM_VERSION_STRING; >break; > > case CL_PLATFORM_NAME: > -- > 2.8.1.372.g9612035 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] st/dri: don't call close(-1) in dri{2, kms_}_init_screen error path
Series is: Reviewed-by: Leo Liu On 05/14/2016 11:33 AM, Emil Velikov wrote: Add separate labels and jump to the correct one as needed. Signed-off-by: Emil Velikov --- src/gallium/state_trackers/dri/dri2.c | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 675a9bb..2330530 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -1714,7 +1714,7 @@ dri2_init_screen(__DRIscreen * sPriv) struct pipe_screen *pscreen = NULL; const struct drm_conf_ret *throttle_ret; const struct drm_conf_ret *dmabuf_ret; - int fd = -1; + int fd; screen = CALLOC_STRUCT(dri_screen); if (!screen) @@ -1727,13 +1727,13 @@ dri2_init_screen(__DRIscreen * sPriv) sPriv->driverPrivate = (void *)screen; if (screen->fd < 0 || (fd = dup(screen->fd)) < 0) - goto fail; + goto free_screen; if (pipe_loader_drm_probe_fd(&screen->dev, fd)) pscreen = pipe_loader_create_screen(screen->dev); if (!pscreen) - goto fail; + goto release_pipe; throttle_ret = pipe_loader_configuration(screen->dev, DRM_CONF_THROTTLE); dmabuf_ret = pipe_loader_configuration(screen->dev, DRM_CONF_SHARE_FD); @@ -1762,7 +1762,7 @@ dri2_init_screen(__DRIscreen * sPriv) configs = dri_init_screen_helper(screen, pscreen, screen->dev->driver_name); if (!configs) - goto fail; + goto destroy_screen; screen->can_share_buffer = true; screen->auto_fake_front = dri_with_format(sPriv); @@ -1770,12 +1770,17 @@ dri2_init_screen(__DRIscreen * sPriv) screen->lookup_egl_image = dri2_lookup_egl_image; return configs; -fail: + +destroy_screen: dri_destroy_screen_helper(screen); + +release_pipe: if (screen->dev) pipe_loader_release(&screen->dev, 1); else close(fd); + +free_screen: FREE(screen); return NULL; } @@ -1793,7 +1798,7 @@ dri_kms_init_screen(__DRIscreen * sPriv) struct dri_screen *screen; struct pipe_screen *pscreen = NULL; uint64_t cap; - int fd = -1; + int fd; screen = CALLOC_STRUCT(dri_screen); if (!screen) @@ -1805,13 +1810,13 @@ dri_kms_init_screen(__DRIscreen * sPriv) sPriv->driverPrivate = (void *)screen; if (screen->fd < 0 || (fd = dup(screen->fd)) < 0) - goto fail; + goto free_screen; if (pipe_loader_sw_probe_kms(&screen->dev, fd)) pscreen = pipe_loader_create_screen(screen->dev); if (!pscreen) - goto fail; + goto release_pipe; if (drmGetCap(sPriv->fd, DRM_CAP_PRIME, &cap) == 0 && (cap & DRM_PRIME_CAP_IMPORT)) { @@ -1823,7 +1828,7 @@ dri_kms_init_screen(__DRIscreen * sPriv) configs = dri_init_screen_helper(screen, pscreen, "swrast"); if (!configs) - goto fail; + goto destroy_screen; screen->can_share_buffer = false; screen->auto_fake_front = dri_with_format(sPriv); @@ -1831,12 +1836,17 @@ dri_kms_init_screen(__DRIscreen * sPriv) screen->lookup_egl_image = dri2_lookup_egl_image; return configs; -fail: + +destroy_screen: dri_destroy_screen_helper(screen); + +release_pipe: if (screen->dev) pipe_loader_release(&screen->dev, 1); else close(fd); + +free_screen: FREE(screen); #endif // GALLIUM_SOFTPIPE return NULL; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] clover: Error on incomplete switch statements
Jan Vesely writes: > Signed-off-by: Jan Vesely > --- > src/gallium/state_trackers/clover/Makefile.am | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/gallium/state_trackers/clover/Makefile.am > b/src/gallium/state_trackers/clover/Makefile.am > index 4c9d7d9..26ebd3b 100644 > --- a/src/gallium/state_trackers/clover/Makefile.am > +++ b/src/gallium/state_trackers/clover/Makefile.am > @@ -1,5 +1,9 @@ > include Makefile.sources > > +AM_CXXFLAGS = -Werror=switch > + > +CXXFLAGS += $(AM_CXXFLAGS) > + I'm not much into build systems, but I don't think this is the way you're supposed to add flags to the compiler command line, because the user can easily override your definition inadvertently. AFAIK the usual idiom is to add AM_CXXFLAGS explicitly to each of the per-target CXXFLAGS variables. Once you do that it should be easy to remove some redundancy between per-target CXXFLAGS (e.g. -std=c++11 and $(VISIBILITY_CXXFLAGS)). > AM_CPPFLAGS = \ > -I$(top_srcdir)/include \ > -I$(top_srcdir)/src \ > -- > 2.5.5 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] clover: Handle PIPE_SHADER_IR_NIR in switch
Jan Vesely writes: > Signed-off-by: Jan Vesely > --- > src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 96f6a48..e2cadda 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -893,8 +893,9 @@ clover::compile_program_llvm(const std::string &source, > module m; > // Build the clover::module > switch (ir) { > + case PIPE_SHADER_IR_NIR: >case PIPE_SHADER_IR_TGSI: > - //XXX: Handle TGSI > + //XXX: Handle TGSI, NIR Heh, I doubt that writing a NIR LLVM back-end would be particularly rewarding or useful, but sure: Reviewed-by: Francisco Jerez > assert(0); > m = module(); > break; > -- > 2.5.5 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nvc0/ir: make sure to align the second arg of TXD to 4, as we do for TEX
This was handled in handleTEX(), however the way the logic works, those extra arguments aren't added on by then, so it did nothing. Instead we must duplicate that bit here. GK110 appears to complain about MISALIGNED_GPR, however it's reasonable to believe that GK104 has the same requirements. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403 Signed-off-by: Ilia Mirkin Cc: mesa-sta...@lists.freedesktop.org --- .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 1068c21..869b06c 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp @@ -993,6 +993,20 @@ NVC0LoweringPass::handleTXD(TexInstruction *txd) txd->dPdx[c].set(NULL); txd->dPdy[c].set(NULL); } + + // In this case we have fewer than 4 "real" arguments, which means that + // handleTEX didn't apply any padding. However we have to make sure that + // the second "group" of arguments still gets padded up to 4. + if (chipset >= NVISA_GK104_CHIPSET) { + int s = arg + 2 * dim; + if (s >= 4 && s < 7) { + if (txd->srcExists(s)) // move potential predicate out of the way +txd->moveSources(s, 7 - s); + while (s < 7) +txd->setSrc(s++, bld.loadImm(NULL, 0)); + } + } + return true; } -- 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 15/30] i965/fs: support doubles with UBO loads
Samuel Iglesias Gonsálvez writes: > On 14/05/16 01:16, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >>> From: Iago Toral Quiroga >>> >>> UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD >>> instruction, which reads 16 bytes (a vec4) of data from memory. For dvec >>> types this only provides components x and y. Thus, if we are reading >>> more than 2 components we need to issue a second load at offset+16 to >>> read the next 16-byte chunk with components w and z. >>> >>> UBO loads with non-constant offset emit a load for each component >>> in the vector (and rely in CSE to fix redundant loads), so we only >>> need to consider the size of the data type when computing the offset >>> of each element in a vector. >>> >>> v2 (Sam): >>> - Adapt the code to use component() (Curro). >>> >>> Signed-off-by: Samuel Iglesias Gonsálvez >>> Reviewed-by: Kenneth Graunke >>> --- >>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 >>> +++- >>> 1 file changed, 45 insertions(+), 7 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> index 2d57fd3..02f1e81 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> @@ -3362,6 +3362,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, >>> nir_intrinsic_instr *instr >>> nir->info.num_ubos - 1); >>>} >>> >>> + /* Number of 32-bit slots in the type */ >>> + unsigned type_slots = MAX2(1, type_sz(dest.type) / 4); >>> + >>>nir_const_value *const_offset = >>> nir_src_as_const_value(instr->src[1]); >>>if (const_offset == NULL) { >>> fs_reg base_offset = retype(get_nir_src(instr->src[1]), >>> @@ -3369,19 +3372,54 @@ fs_visitor::nir_emit_intrinsic(const fs_builder >>> &bld, nir_intrinsic_instr *instr >>> >>> for (int i = 0; i < instr->num_components; i++) >>> VARYING_PULL_CONSTANT_LOAD(bld, offset(dest, bld, i), >>> surf_index, >>> - base_offset, i * 4); >>> + base_offset, i * 4 * type_slots); >> >> Why not 'i * type_sz(...)'? As before it seems like type_slots is just >> going to introduce rounding errors here for no benefit? >> > > Right, I will fix it. > >>>} else { >>> + /* Even if we are loading doubles, a pull constant load will load >>> + * a 32-bit vec4, so should only reserve vgrf space for that. If >>> we >>> + * need to load a full dvec4 we will have to emit 2 loads. This is >>> + * similar to demote_pull_constants(), except that in that case we >>> + * see individual accesses to each component of the vector and >>> then >>> + * we let CSE deal with duplicate loads. Here we see a vector >>> access >>> + * and we have to split it if necessary. >>> + */ >>> fs_reg packed_consts = vgrf(glsl_type::float_type); >>> packed_consts.type = dest.type; >>> >>> - struct brw_reg const_offset_reg = brw_imm_ud(const_offset->u32[0] >>> & ~15); >>> - bld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD, packed_consts, >>> - surf_index, const_offset_reg); >>> + unsigned const_offset_aligned = const_offset->u32[0] & ~15; >>> + >>> + /* A vec4 only contains half of a dvec4, if we need more than 2 >>> + * components of a dvec4 we will have to issue another load for >>> + * components z and w >>> + */ >>> + int num_components; >>> + if (type_slots == 1) >>> +num_components = instr->num_components; >>> + else >>> +num_components = MIN2(2, instr->num_components); >>> >>> - const fs_reg consts = byte_offset(packed_consts, >>> const_offset->u32[0] % 16); >>> + int remaining_components = instr->num_components; >>> + while (remaining_components > 0) { >>> +/* Read the vec4 from a 16-byte aligned offset */ >>> +struct brw_reg const_offset_reg = >>> brw_imm_ud(const_offset_aligned); >>> +bld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD, >>> + retype(packed_consts, BRW_REGISTER_TYPE_F), >>> + surf_index, const_offset_reg); >>> >>> - for (unsigned i = 0; i < instr->num_components; i++) >>> -bld.MOV(offset(dest, bld, i), component(consts, i)); >>> +const fs_reg consts = byte_offset(packed_consts, >>> (const_offset->u32[0] % 16)); >> >> This looks really fishy to me, if the initial offset is not 16B aligned >> you'll apply the same sub-16B offset to the result from each one of the >> subsequent pull constant loads. > > This cannot happen thanks to the layout alignment rules, see below. > >> Also you don't seem to take into >> account whether the initial offset is misaligned in
[Mesa-dev] [Bug 95374] ARK:survival of the fittest fails when GL4.3 is enabled.
https://bugs.freedesktop.org/show_bug.cgi?id=95374 Vedran Miletić changed: What|Removed |Added CC||riva...@gmail.com -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/17] nir: add lowering pass for y-transform
On Sat, May 14, 2016 at 4:43 PM, Jason Ekstrand wrote: > On Sat, May 14, 2016 at 12:23 PM, Rob Clark wrote: >> >> On Thu, May 12, 2016 at 10:17 PM, Jason Ekstrand >> wrote: >> > >> > >> > On Mon, May 9, 2016 at 12:33 PM, Rob Clark wrote: >> >> >> >> From: Rob Clark >> >> >> >> Signed-off-by: Rob Clark >> >> Reviewed-by: Connor Abbott >> >> --- >> >> src/compiler/Makefile.sources| 1 + >> >> src/compiler/nir/nir.h | 11 + >> >> src/compiler/nir/nir_lower_wpos_ytransform.c | 310 >> >> +++ >> >> 3 files changed, 322 insertions(+) >> >> create mode 100644 src/compiler/nir/nir_lower_wpos_ytransform.c >> >> >> >> diff --git a/src/compiler/Makefile.sources >> >> b/src/compiler/Makefile.sources >> >> index 2a52319..b542a1a 100644 >> >> --- a/src/compiler/Makefile.sources >> >> +++ b/src/compiler/Makefile.sources >> >> @@ -208,6 +208,7 @@ NIR_FILES = \ >> >> nir/nir_lower_vars_to_ssa.c \ >> >> nir/nir_lower_var_copies.c \ >> >> nir/nir_lower_vec_to_movs.c \ >> >> + nir/nir_lower_wpos_ytransform.c \ >> >> nir/nir_metadata.c \ >> >> nir/nir_move_vec_src_uses_to_dest.c \ >> >> nir/nir_normalize_cubemap_coords.c \ >> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >> >> index 8a616d4..474ba63 100644 >> >> --- a/src/compiler/nir/nir.h >> >> +++ b/src/compiler/nir/nir.h >> >> @@ -2374,6 +2374,17 @@ void nir_lower_two_sided_color(nir_shader >> >> *shader); >> >> >> >> void nir_lower_clamp_color_outputs(nir_shader *shader); >> >> >> >> +typedef struct nir_lower_wpos_ytransform_options { >> >> + int state_tokens[5]; >> >> + bool fs_coord_origin_upper_left :1; >> >> + bool fs_coord_origin_lower_left :1; >> >> + bool fs_coord_pixel_center_integer :1; >> >> + bool fs_coord_pixel_center_half_integer :1; >> > >> > >> > Drive-by commentary: Why are we using two booleans for one boolean here? >> > All hardware should be either lower-left or upper-left and I'm going to >> > hazard that the other two are mutually exclusive as well. The pass >> > certainly seems to assume so. >> >> mostly just because gallium splits it out into two caps, and this >> matches the logic in the equiv tgsi lowering pass more closely.. >> >> The way it is currently would, I think, work if there was some hw that >> supported both cases (which is, I assume, why the gallium part of it >> works the way it does) > > > Yeah, I guess I could see that. In that case, I suppose you could just not > run the pass? I guess you could have a case where HW supports both > gl_FragCoords modes but only one pixel-center mode. Whatever. I guess I'm > ok with 4 bools if it's useful. > Yeah, not entirely sure why you'd run the pass if hw supported both cases.. at this point, the strongest argument for keeping it as-is is probably just to keep the logic similar to equiv tgsi lowering pass. Maybe someone with a longer history on the gallium/tgsi side of things will pipe up. BR, -R >> >> >> BR, >> -R >> >> > Let's just make it two booleans. If we come across hardware that puts >> > the >> > pixel center at 0.75, 0.25 then we can make fs_coord_pixel_center an >> > enum. >> > --Jason >> > >> >> >> >> +} nir_lower_wpos_ytransform_options; >> >> + >> >> +bool nir_lower_wpos_ytransform(nir_shader *shader, >> >> + const nir_lower_wpos_ytransform_options >> >> *options); >> >> + >> >> void nir_lower_atomics(nir_shader *shader, >> >> const struct gl_shader_program >> >> *shader_program); >> >> void nir_lower_to_source_mods(nir_shader *shader); >> >> diff --git a/src/compiler/nir/nir_lower_wpos_ytransform.c >> >> b/src/compiler/nir/nir_lower_wpos_ytransform.c >> >> new file mode 100644 >> >> index 000..1d53530 >> >> --- /dev/null >> >> +++ b/src/compiler/nir/nir_lower_wpos_ytransform.c >> >> @@ -0,0 +1,310 @@ >> >> +/* >> >> + * Copyright © 2015 Red Hat >> >> + * >> >> + * Permission is hereby granted, free of charge, to any person >> >> obtaining >> >> a >> >> + * copy of this software and associated documentation files (the >> >> "Software"), >> >> + * to deal in the Software without restriction, including without >> >> limitation >> >> + * the rights to use, copy, modify, merge, publish, distribute, >> >> sublicense, >> >> + * and/or sell copies of the Software, and to permit persons to whom >> >> the >> >> + * Software is furnished to do so, subject to the following >> >> conditions: >> >> + * >> >> + * The above copyright notice and this permission notice (including >> >> the >> >> next >> >> + * paragraph) shall be included in all copies or substantial portions >> >> of >> >> the >> >> + * Software. >> >> + * >> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> >> EXPRESS OR >> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> >> MERCHANTABILITY, >> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVEN
Re: [Mesa-dev] [PATCH 04/17] nir: add lowering pass for y-transform
On Sat, May 14, 2016 at 12:23 PM, Rob Clark wrote: > On Thu, May 12, 2016 at 10:17 PM, Jason Ekstrand > wrote: > > > > > > On Mon, May 9, 2016 at 12:33 PM, Rob Clark wrote: > >> > >> From: Rob Clark > >> > >> Signed-off-by: Rob Clark > >> Reviewed-by: Connor Abbott > >> --- > >> src/compiler/Makefile.sources| 1 + > >> src/compiler/nir/nir.h | 11 + > >> src/compiler/nir/nir_lower_wpos_ytransform.c | 310 > >> +++ > >> 3 files changed, 322 insertions(+) > >> create mode 100644 src/compiler/nir/nir_lower_wpos_ytransform.c > >> > >> diff --git a/src/compiler/Makefile.sources > b/src/compiler/Makefile.sources > >> index 2a52319..b542a1a 100644 > >> --- a/src/compiler/Makefile.sources > >> +++ b/src/compiler/Makefile.sources > >> @@ -208,6 +208,7 @@ NIR_FILES = \ > >> nir/nir_lower_vars_to_ssa.c \ > >> nir/nir_lower_var_copies.c \ > >> nir/nir_lower_vec_to_movs.c \ > >> + nir/nir_lower_wpos_ytransform.c \ > >> nir/nir_metadata.c \ > >> nir/nir_move_vec_src_uses_to_dest.c \ > >> nir/nir_normalize_cubemap_coords.c \ > >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > >> index 8a616d4..474ba63 100644 > >> --- a/src/compiler/nir/nir.h > >> +++ b/src/compiler/nir/nir.h > >> @@ -2374,6 +2374,17 @@ void nir_lower_two_sided_color(nir_shader > *shader); > >> > >> void nir_lower_clamp_color_outputs(nir_shader *shader); > >> > >> +typedef struct nir_lower_wpos_ytransform_options { > >> + int state_tokens[5]; > >> + bool fs_coord_origin_upper_left :1; > >> + bool fs_coord_origin_lower_left :1; > >> + bool fs_coord_pixel_center_integer :1; > >> + bool fs_coord_pixel_center_half_integer :1; > > > > > > Drive-by commentary: Why are we using two booleans for one boolean here? > > All hardware should be either lower-left or upper-left and I'm going to > > hazard that the other two are mutually exclusive as well. The pass > > certainly seems to assume so. > > mostly just because gallium splits it out into two caps, and this > matches the logic in the equiv tgsi lowering pass more closely.. > > The way it is currently would, I think, work if there was some hw that > supported both cases (which is, I assume, why the gallium part of it > works the way it does) > Yeah, I guess I could see that. In that case, I suppose you could just not run the pass? I guess you could have a case where HW supports both gl_FragCoords modes but only one pixel-center mode. Whatever. I guess I'm ok with 4 bools if it's useful. > > BR, > -R > > > Let's just make it two booleans. If we come across hardware that puts > the > > pixel center at 0.75, 0.25 then we can make fs_coord_pixel_center an > enum. > > --Jason > > > >> > >> +} nir_lower_wpos_ytransform_options; > >> + > >> +bool nir_lower_wpos_ytransform(nir_shader *shader, > >> + const nir_lower_wpos_ytransform_options > >> *options); > >> + > >> void nir_lower_atomics(nir_shader *shader, > >> const struct gl_shader_program *shader_program); > >> void nir_lower_to_source_mods(nir_shader *shader); > >> diff --git a/src/compiler/nir/nir_lower_wpos_ytransform.c > >> b/src/compiler/nir/nir_lower_wpos_ytransform.c > >> new file mode 100644 > >> index 000..1d53530 > >> --- /dev/null > >> +++ b/src/compiler/nir/nir_lower_wpos_ytransform.c > >> @@ -0,0 +1,310 @@ > >> +/* > >> + * Copyright © 2015 Red Hat > >> + * > >> + * Permission is hereby granted, free of charge, to any person > obtaining > >> a > >> + * copy of this software and associated documentation files (the > >> "Software"), > >> + * to deal in the Software without restriction, including without > >> limitation > >> + * the rights to use, copy, modify, merge, publish, distribute, > >> sublicense, > >> + * and/or sell copies of the Software, and to permit persons to whom > the > >> + * Software is furnished to do so, subject to the following conditions: > >> + * > >> + * The above copyright notice and this permission notice (including the > >> next > >> + * paragraph) shall be included in all copies or substantial portions > of > >> the > >> + * Software. > >> + * > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > >> EXPRESS OR > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >> MERCHANTABILITY, > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT > >> SHALL > >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > >> OTHER > >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > >> ARISING FROM, > >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > >> IN THE > >> + * SOFTWARE. > >> + */ > >> + > >> +#include "nir.h" > >> +#include "nir_builder.h" > >> + > >> +/* Lower gl_FragCoord (and fddy) to account for driver's requested > >> coordinate- > >> + * origin and pixel-center vs. shader. If tra
Re: [Mesa-dev] [PATCH 00/28] i965/blorp: Use NIR for compiling shaders
On Tuesday, May 10, 2016 4:16:20 PM PDT Jason Ekstrand wrote: > When Paul originally wrote blorp he hand-rolled a shader builder that > builds i965 shaders directly. This has caused headaches because every time > we make a change to the back-end compiler, we have to update blorp. NIR on > the other hand tends to be more stable at this point since it has many > different users all across mesa. > > Using NIR also means that we get decent optimizations, register allocation, > and scheduling. The original blorp codegen code tried fairly hard to emit > reasonably efficient code in that it didn't do more work than needed but it > was fairly naieve when it came to register allocation and scheduling. > Using the full compiler stack also means that we get new features for free > without having to re-implement them in blorp. On Sky Lake, for instance, > we are now generating shaders with sampler-EOT. > > In spite of all this, this series shows no measurable performance > difference on Haswell with every benchmark in sixonyx run 25 times. Patches 1-13 are: Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] nir: forward-declare 'struct gl_shader_program'
Reviewed-by: Jason Ekstrand On Sat, May 14, 2016 at 1:10 PM, Rob Clark wrote: > From: Rob Clark > > Drop extra #include which is otherwise unneeded (and makes this header > difficult to include from outside of src/mesa). > > Signed-off-by: Rob Clark > --- > src/compiler/nir/glsl_to_nir.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/compiler/nir/glsl_to_nir.h > b/src/compiler/nir/glsl_to_nir.h > index e3fe9b0..14641fc 100644 > --- a/src/compiler/nir/glsl_to_nir.h > +++ b/src/compiler/nir/glsl_to_nir.h > @@ -26,12 +26,13 @@ > */ > > #include "nir.h" > -#include "compiler/glsl/glsl_parser_extras.h" > > #ifdef __cplusplus > extern "C" { > #endif > > +struct gl_shader_program; > + > nir_shader *glsl_to_nir(const struct gl_shader_program *shader_prog, > gl_shader_stage stage, > const nir_shader_compiler_options *options); > -- > 2.5.5 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/11] tgsi: remove culldist semantic.
On Sat, May 14, 2016 at 2:58 PM, Roland Scheidegger wrote: > On 05/14/2016 04:24 PM, Ilia Mirkin wrote: >> >> On Sat, May 14, 2016 at 10:23 AM, Roland Scheidegger >> wrote: >>> >>> Am 14.05.2016 um 14:55 schrieb Marek Olšák: Dave, It should be noted that clip distances can be disabled by pipe_rasterizer_state::clip_plane_enable, but cull distances can't. (same as GL) >>> >>> >>> That only applies to user clip planes, not shader clip distances. >> >> >> Actually, it applies to both. > > > Yes, you are right. Ahh crap. draw, however, ignores the enable bits for > clip distances (and we're probably relying on this even internally right > now). Do blobs actually honor them? I'm wondering because some code changes > I was recently doing at vmware shouldn't have worked if they did... Or maybe > I got lucky... > In any case honoring the enable bits should still be possible even with both > clip and cull integrated into the same output. What I do is compute a clip & cull mask separately in the shader and then &= clip mask, then |= cull mask. (Onto the rast->clip_enable mask.) Seems to work. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] freedreno/ir3: cmdline compiler for glsl
From: Rob Clark use glsl/libstandalone.la to add support for taking glsl src files (in addition to .tgsi) as input. Then glsl->nir and feed the result into the ir3 backend as normal. Signed-off-by: Rob Clark --- src/gallium/drivers/freedreno/Makefile.am | 2 + src/gallium/drivers/freedreno/ir3/ir3_cmdline.c | 89 + 2 files changed, 77 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/freedreno/Makefile.am b/src/gallium/drivers/freedreno/Makefile.am index 9c0ccdf..1af8dec 100644 --- a/src/gallium/drivers/freedreno/Makefile.am +++ b/src/gallium/drivers/freedreno/Makefile.am @@ -37,6 +37,8 @@ ir3_compiler_LDADD = \ libfreedreno.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ $(top_builddir)/src/compiler/nir/libnir.la \ + $(top_builddir)/src/compiler/glsl/libstandalone.la \ $(top_builddir)/src/util/libmesautil.la \ + $(top_builddir)/src/mesa/libmesagallium.la \ $(GALLIUM_COMMON_LIB_DEPS) \ $(FREEDRENO_LIBS) diff --git a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c index 47bcec4..0e4827c 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c @@ -44,6 +44,9 @@ #include "instr-a3xx.h" #include "ir3.h" +#include "compiler/glsl/standalone.h" +#include "compiler/nir/glsl_to_nir.h" + static void dump_info(struct ir3_shader_variant *so, const char *str) { uint32_t *bin; @@ -55,6 +58,47 @@ static void dump_info(struct ir3_shader_variant *so, const char *str) free(bin); } +int st_glsl_type_size(const struct glsl_type *type); + +static nir_shader * +load_glsl(const char *filename, gl_shader_stage stage) +{ + static const struct standalone_options options = { + .glsl_version = 140, + .do_link = true, + }; + struct gl_shader_program *prog; + + prog = standalone_compile_shader(&options, 1, (char * const*)&filename); + if (!prog) + errx(1, "couldn't parse `%s'", filename); + + nir_shader *nir = glsl_to_nir(prog, stage, ir3_get_compiler_options()); + + standalone_compiler_cleanup(prog); + + /* required NIR passes: */ + /* TODO cmdline args for some of the conditional lowering passes? */ + + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), + true, true); + NIR_PASS_V(nir, nir_lower_global_vars_to_local); + NIR_PASS_V(nir, nir_split_var_copies); + NIR_PASS_V(nir, nir_lower_var_copies); + + NIR_PASS_V(nir, nir_split_var_copies); + NIR_PASS_V(nir, nir_lower_var_copies); + NIR_PASS_V(nir, nir_lower_io_types); + + // TODO nir_assign_var_locations?? + + NIR_PASS_V(nir, nir_lower_system_values); + NIR_PASS_V(nir, nir_lower_io, nir_var_all, st_glsl_type_size); + NIR_PASS_V(nir, nir_lower_samplers, prog); + + return nir; +} static int read_file(const char *filename, void **ptr, size_t *size) @@ -86,7 +130,7 @@ read_file(const char *filename, void **ptr, size_t *size) static void print_usage(void) { - printf("Usage: ir3_compiler [OPTIONS]... FILE\n"); + printf("Usage: ir3_compiler [OPTIONS]... \n"); printf("--verbose - verbose compiler/debug messages\n"); printf("--binning-pass- generate binning pass shader (VERT)\n"); printf("--color-two-side - emulate two-sided color (FRAG)\n"); @@ -105,8 +149,6 @@ int main(int argc, char **argv) { int ret = 0, n = 1; const char *filename; - struct tgsi_token toks[65536]; - struct tgsi_parse_context parse; struct ir3_shader_variant v; struct ir3_shader s; struct ir3_shader_key key = {}; @@ -234,31 +276,50 @@ int main(int argc, char **argv) if (fd_mesa_debug & FD_DBG_OPTMSGS) debug_printf("%s\n", (char *)ptr); - if (!tgsi_text_translate(ptr, toks, ARRAY_SIZE(toks))) - errx(1, "could not parse `%s'", filename); + nir_shader *nir; - if (fd_mesa_debug & FD_DBG_OPTMSGS) - tgsi_dump(toks, 0); + char *ext = rindex(filename, '.'); + + if (strcmp(ext, ".tgsi") == 0) { + struct tgsi_token toks[65536]; + + if (!tgsi_text_translate(ptr, toks, ARRAY_SIZE(toks))) + errx(1, "could not parse `%s'", filename); + + if (fd_mesa_debug & FD_DBG_OPTMSGS) + tgsi_dump(toks, 0); + + nir = ir3_tgsi_to_nir(toks); + s.from_tgsi = true; + } else if (strcmp(ext, ".frag") == 0) { + nir = load_glsl(filename, MESA_SHADER_FRAGMENT); + s.from_tgsi = false; + } else if (strcmp(ext, ".vert") == 0) { + nir = load_glsl(filename, MESA_SHADER_FRAGMENT); +
[Mesa-dev] [PATCH 1/3] glsl: split out libstandalone
From: Rob Clark Split standalone glsl_compiler into a libstandalone.la and a thin main.cpp. This way drivers can re-use the glsl standalone frontend in their own standalone compilers. Signed-off-by: Rob Clark --- There is one kinda ugly hack (#including a .cpp file) to work around an automake issue.. not sure if there is a better way to do that, or if we should bother caring (since it isn't something that is installed anyways) src/compiler/Makefile.glsl.am| 12 +- src/compiler/Makefile.sources| 5 +- src/compiler/glsl/main.cpp | 380 ++ src/compiler/glsl/standalone.cpp | 437 +++ src/compiler/glsl/standalone.h | 51 + 5 files changed, 514 insertions(+), 371 deletions(-) create mode 100644 src/compiler/glsl/standalone.cpp create mode 100644 src/compiler/glsl/standalone.h diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am index daf98f6..69def41 100644 --- a/src/compiler/Makefile.glsl.am +++ b/src/compiler/Makefile.glsl.am @@ -93,7 +93,7 @@ glsl_tests_sampler_types_test_LDADD = \ $(top_builddir)/src/libglsl_util.la \ $(PTHREAD_LIBS) -noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la +noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la glsl/libstandalone.la glsl_libglcpp_la_LIBADD = \ $(top_builddir)/src/util/libmesautil.la @@ -121,15 +121,21 @@ glsl_libglsl_la_SOURCES = \ $(LIBGLSL_FILES) -glsl_compiler_SOURCES = \ +glsl_libstandalone_la_SOURCES = \ $(GLSL_COMPILER_CXX_FILES) -glsl_compiler_LDADD = \ +glsl_libstandalone_la_LIBADD = \ glsl/libglsl.la \ $(top_builddir)/src/libglsl_util.la \ $(top_builddir)/src/util/libmesautil.la \ $(PTHREAD_LIBS) +glsl_compiler_SOURCES = \ + glsl/main.cpp + +glsl_compiler_LDADD = \ + glsl/libstandalone.la + glsl_glsl_test_SOURCES = \ glsl/standalone_scaffolding.cpp \ glsl/test.cpp \ diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index 66fbd84..881a616 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -136,9 +136,8 @@ LIBGLSL_FILES = \ # glsl_compiler GLSL_COMPILER_CXX_FILES = \ - glsl/standalone_scaffolding.cpp \ - glsl/standalone_scaffolding.h \ - glsl/main.cpp + glsl/standalone.cpp \ + glsl/standalone.h # libglsl generated sources LIBGLSL_GENERATED_CXX_FILES = \ diff --git a/src/compiler/glsl/main.cpp b/src/compiler/glsl/main.cpp index d253575..f65b185 100644 --- a/src/compiler/glsl/main.cpp +++ b/src/compiler/glsl/main.cpp @@ -20,6 +20,8 @@ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. */ + +#include #include /** @file main.cpp @@ -31,255 +33,16 @@ * offline compile GLSL code and examine the resulting GLSL IR. */ -#include "ast.h" -#include "glsl_parser_extras.h" -#include "ir_optimization.h" -#include "program.h" -#include "program/hash_table.h" -#include "loop_analysis.h" -#include "standalone_scaffolding.h" - -static int glsl_version = 330; - -static void -initialize_context(struct gl_context *ctx, gl_api api) -{ - initialize_context_to_defaults(ctx, api); - - /* The standalone compiler needs to claim support for almost -* everything in order to compile the built-in functions. -*/ - ctx->Const.GLSLVersion = glsl_version; - ctx->Extensions.ARB_ES3_compatibility = true; - ctx->Const.MaxComputeWorkGroupCount[0] = 65535; - ctx->Const.MaxComputeWorkGroupCount[1] = 65535; - ctx->Const.MaxComputeWorkGroupCount[2] = 65535; - ctx->Const.MaxComputeWorkGroupSize[0] = 1024; - ctx->Const.MaxComputeWorkGroupSize[1] = 1024; - ctx->Const.MaxComputeWorkGroupSize[2] = 64; - ctx->Const.MaxComputeWorkGroupInvocations = 1024; - ctx->Const.MaxComputeSharedMemorySize = 32768; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxCombinedUniformComponents = 1024; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */ - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxOutputComponents = 0; /* not used */ - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxAtomicBuffers = 8; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxAtomicCounters = 8; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxImageUniforms = 8; - ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformBlocks = 12; - - switch (ctx->Const.GLSLVersion) { - case 100: - ctx->Const.MaxClipPlanes = 0; - ctx->Const.MaxCombinedTextureImageUnits = 8; - ctx->Const.MaxDrawBuffers = 2; - ctx->Const.MinProgramTexelOffset = 0; - ctx->Const.MaxP
[Mesa-dev] [PATCH 2/3] nir: forward-declare 'struct gl_shader_program'
From: Rob Clark Drop extra #include which is otherwise unneeded (and makes this header difficult to include from outside of src/mesa). Signed-off-by: Rob Clark --- src/compiler/nir/glsl_to_nir.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/compiler/nir/glsl_to_nir.h b/src/compiler/nir/glsl_to_nir.h index e3fe9b0..14641fc 100644 --- a/src/compiler/nir/glsl_to_nir.h +++ b/src/compiler/nir/glsl_to_nir.h @@ -26,12 +26,13 @@ */ #include "nir.h" -#include "compiler/glsl/glsl_parser_extras.h" #ifdef __cplusplus extern "C" { #endif +struct gl_shader_program; + nir_shader *glsl_to_nir(const struct gl_shader_program *shader_prog, gl_shader_stage stage, const nir_shader_compiler_options *options); -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] nir/algebraic: support for power-of-two optimizations
On Sat, May 14, 2016 at 12:20 PM, Rob Clark wrote: > On Thu, May 12, 2016 at 10:55 PM, Jason Ekstrand > wrote: > > > > > > On Tue, May 10, 2016 at 11:57 AM, Rob Clark wrote: > >> > >> From: Rob Clark > >> > >> Some optimizations, like converting integer multiply/divide into left/ > >> right shifts, have additional constraints on the search expression. > >> Like requiring that a variable is a constant power of two. Support > >> these cases by allowing a fxn name to be appended to the search var > >> expression (ie. "a#32(is_power_of_two)"). > >> > >> TODO update doc/comment explaining search var syntax > >> TODO the eagle-eyed viewer might have noticed that this could also > >> replace the existing const syntax (ie. "#a"). Not sure if we should > >> keep that.. we could make it syntactic sugar (ie '#' automatically sets > >> the cond fxn ptr to 'is_const') or just get rid of it entirely? Maybe > >> that is a follow-on clean-up patch? > >> > >> Signed-off-by: Rob Clark > >> --- > >> src/compiler/nir/nir_algebraic.py | 8 +++-- > >> src/compiler/nir/nir_opt_algebraic.py | 5 +++ > >> src/compiler/nir/nir_search.c | 3 ++ > >> src/compiler/nir/nir_search.h | 10 ++ > >> src/compiler/nir/nir_search_helpers.h | 66 > >> +++ > >> 5 files changed, 90 insertions(+), 2 deletions(-) > >> create mode 100644 src/compiler/nir/nir_search_helpers.h > >> > >> diff --git a/src/compiler/nir/nir_algebraic.py > >> b/src/compiler/nir/nir_algebraic.py > >> index 285f853..19ac6ee 100644 > >> --- a/src/compiler/nir/nir_algebraic.py > >> +++ b/src/compiler/nir/nir_algebraic.py > >> @@ -76,6 +76,7 @@ class Value(object): > >> return Constant(val, name_base) > >> > >> __template = mako.template.Template(""" > >> +#include "compiler/nir/nir_search_helpers.h" > >> static const ${val.c_type} ${val.name} = { > >> { ${val.type_enum}, ${val.bit_size} }, > >> % if isinstance(val, Constant): > >> @@ -84,6 +85,7 @@ static const ${val.c_type} ${val.name} = { > >> ${val.index}, /* ${val.var_name} */ > >> ${'true' if val.is_constant else 'false'}, > >> ${val.type() or 'nir_type_invalid' }, > >> + ${val.cond if val.cond else 'NULL'}, > >> % elif isinstance(val, Expression): > >> ${'true' if val.inexact else 'false'}, > >> nir_op_${val.opcode}, > >> @@ -113,7 +115,7 @@ static const ${val.c_type} ${val.name} = { > >> Variable=Variable, > >> Expression=Expression) > >> > >> -_constant_re = re.compile(r"(?P[^@]+)(?:@(?P\d+))?") > >> +_constant_re = re.compile(r"(?P[^@\(]+)(?:@(?P\d+))?") > > > > > > Spurious change? > > > > I thought it needed to avoid matching something like > a(is_power_of_two).. but it seems to work with that hunk reverted so I > guess I can drop it.. > > >> > >> > >> class Constant(Value): > >> def __init__(self, val, name): > >> @@ -150,7 +152,8 @@ class Constant(Value): > >> return "nir_type_float" > >> > >> _var_name_re = re.compile(r"(?P#)?(?P\w+)" > >> - > >> r"(?:@(?Pint|uint|bool|float)?(?P\d+)?)?") > >> + > >> r"(?:@(?Pint|uint|bool|float)?(?P\d+)?)?" > >> + r"(?P\([^\)]+\))?") > >> > >> class Variable(Value): > >> def __init__(self, val, name, varset): > >> @@ -161,6 +164,7 @@ class Variable(Value): > >> > >>self.var_name = m.group('name') > >>self.is_constant = m.group('const') is not None > >> + self.cond = m.group('cond') > >>self.required_type = m.group('type') > >>self.bit_size = int(m.group('bits')) if m.group('bits') else 0 > >> > >> diff --git a/src/compiler/nir/nir_opt_algebraic.py > >> b/src/compiler/nir/nir_opt_algebraic.py > >> index 0a95725..952a91a 100644 > >> --- a/src/compiler/nir/nir_opt_algebraic.py > >> +++ b/src/compiler/nir/nir_opt_algebraic.py > >> @@ -62,6 +62,11 @@ d = 'd' > >> # constructed value should have that bit-size. > >> > >> optimizations = [ > >> + > >> + (('imul', a, '#b@32(is_power_of_two)'), ('ishl', a, ('find_lsb', > b))), > >> + (('udiv', a, '#b@32(is_power_of_two)'), ('ushr', a, ('find_lsb', > b))), > >> + (('umod', a, '#b(is_power_of_two)'),('iand', a, ('isub', b, > 1))), > >> + > >> (('fneg', ('fneg', a)), a), > >> (('ineg', ('ineg', a)), a), > >> (('fabs', ('fabs', a)), ('fabs', a)), > >> diff --git a/src/compiler/nir/nir_search.c > b/src/compiler/nir/nir_search.c > >> index 2c2fd92..b21fb2c 100644 > >> --- a/src/compiler/nir/nir_search.c > >> +++ b/src/compiler/nir/nir_search.c > >> @@ -127,6 +127,9 @@ match_value(const nir_search_value *value, > >> nir_alu_instr *instr, unsigned src, > >> instr->src[src].src.ssa->parent_instr->type != > >> nir_instr_type_load_const) > >> return false; > >> > >> + if (var->cond && !var->cond(instr, src, num_components, > >> new_swizzle)) > >> +return false; > >> + > >> if (var->type != nir_type_inval
Re: [Mesa-dev] [PATCH v2] i965/blorp: Special-case the clear color in MSAA resolves
On Fri, May 13, 2016 at 10:49 AM, Jason Ekstrand wrote: > > > On Wed, May 11, 2016 at 7:42 PM, Jason Ekstrand > wrote: > >> The current MSAA resolve code has a special-case for if the MCS value is >> 0. >> In this case we can only sample once because we know that all values are >> in >> slice 0. This commit adds a second optimization that detecs the magic MCS >> value that indicates the clear color and grabs the color from a push >> constant and avoids sampling altogether. On a microbenchmark written by >> Neil Roberts that tests resolving surfaces with just clear color, this >> improves performance by 60% for 8x, 40% for 4x, and 28% for 2x MSAA on my >> SKL gte3 laptop. The benchmark can be found on the ML archive: >> >> https://lists.freedesktop.org/archives/mesa-dev/2016-February/108077.html >> > More data: It seems to help T-Rex on Haswell by maybe 0.5% and hurts some of the cpu-bound synthetics just a bit. Meh? --Jason > --- >> src/mesa/drivers/dri/i965/brw_blorp.h| 4 +- >> src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 101 >> +-- >> 2 files changed, 100 insertions(+), 5 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h >> b/src/mesa/drivers/dri/i965/brw_blorp.h >> index 15114d0..9d71ca4 100644 >> --- a/src/mesa/drivers/dri/i965/brw_blorp.h >> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h >> @@ -197,7 +197,9 @@ struct brw_blorp_wm_push_constants >> uint32_t src_z; >> >> /* Pad out to an integral number of registers */ >> - uint32_t pad[5]; >> + uint32_t pad; >> + >> + union gl_color_union clear_color; >> }; >> >> #define BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS \ >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp >> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp >> index 514a316..45b696d 100644 >> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp >> @@ -346,6 +346,7 @@ struct brw_blorp_blit_vars { >>nir_variable *offset; >> } u_x_transform, u_y_transform; >> nir_variable *u_src_z; >> + nir_variable *u_clear_color; >> >> /* gl_FragCoord */ >> nir_variable *frag_coord; >> @@ -374,6 +375,7 @@ brw_blorp_blit_vars_init(nir_builder *b, struct >> brw_blorp_blit_vars *v, >> LOAD_UNIFORM(y_transform.multiplier, glsl_float_type()) >> LOAD_UNIFORM(y_transform.offset, glsl_float_type()) >> LOAD_UNIFORM(src_z, glsl_uint_type()) >> + LOAD_UNIFORM(clear_color, glsl_vec4_type()) >> >> #undef DECL_UNIFORM >> >> @@ -858,7 +860,8 @@ static nir_ssa_def * >> blorp_nir_manual_blend_average(nir_builder *b, nir_ssa_def *pos, >> unsigned tex_samples, >> enum intel_msaa_layout tex_layout, >> - enum brw_reg_type dst_type) >> + enum brw_reg_type dst_type, >> + struct brw_blorp_blit_vars *v) >> { >> /* If non-null, this is the outer-most if statement */ >> nir_if *outer_if = NULL; >> @@ -867,9 +870,53 @@ blorp_nir_manual_blend_average(nir_builder *b, >> nir_ssa_def *pos, >>nir_local_variable_create(b->impl, glsl_vec4_type(), "color"); >> >> nir_ssa_def *mcs = NULL; >> - if (tex_layout == INTEL_MSAA_LAYOUT_CMS) >> + if (tex_layout == INTEL_MSAA_LAYOUT_CMS) { >>mcs = blorp_nir_txf_ms_mcs(b, pos); >> >> + /* The MCS buffer stores a packed value that provides a mapping >> from >> + * samples to array slices. The magic value of all ones means >> that all >> + * samples have the clear color. In this case, we can >> short-circuit the >> + * sampling process and just use the clear color that we pushed >> into the >> + * shader. >> + */ >> + nir_ssa_def *is_clear_color; >> + switch (tex_samples) { >> + case 2: >> + /* Empirical evidence suggests that the value returned from the >> + * sampler is not always 0x3 for clear color so we need to mask >> it. >> + */ >> + is_clear_color = >> +nir_ieq(b, nir_iand(b, nir_channel(b, mcs, 0), >> nir_imm_int(b, 0x3)), >> + nir_imm_int(b, 0x3)); >> + break; >> + case 4: >> + is_clear_color = >> +nir_ieq(b, nir_channel(b, mcs, 0), nir_imm_int(b, 0xff)); >> + break; >> + case 8: >> + is_clear_color = >> +nir_ieq(b, nir_channel(b, mcs, 0), nir_imm_int(b, ~0)); >> + break; >> + case 16: >> + is_clear_color = >> +nir_ior(b, nir_ieq(b, nir_channel(b, mcs, 0), nir_imm_int(b, >> ~0)), >> > > This needs to be nir_iand. Fixed locally... > > >> + nir_ieq(b, nir_channel(b, mcs, 1), nir_imm_int(b, >> ~0))); >> + break; >> + default: >> + unreachable("Invalid sample count"); >> + } >> + >> + nir_if *if_stmt = nir_if_create(b->shader); >> + if_stmt->condition = nir_src_for_ssa(is_cl
Re: [Mesa-dev] [PATCH] nir: fix comment typo about f2d/d2f
On Saturday, May 14, 2016 3:26:41 PM PDT Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark > --- > src/compiler/nir/nir_opcodes.py | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/ nir_opcodes.py > index 24ffc31..9d05594 100644 > --- a/src/compiler/nir/nir_opcodes.py > +++ b/src/compiler/nir/nir_opcodes.py > @@ -180,8 +180,8 @@ unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int conversion > unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float conversion. > unop_convert("u2d", tfloat64, tuint32, "src0") # Unsigned-to-double conversion. > # double-to-float conversion > -unop_convert("d2f", tfloat32, tfloat64, "src0") # Single to double precision > -unop_convert("f2d", tfloat64, tfloat32, "src0") # Double to single precision > +unop_convert("d2f", tfloat32, tfloat64, "src0") # Double to single precision > +unop_convert("f2d", tfloat64, tfloat32, "src0") # Single to double precision > > # half/full conversion: > unop_convert("f2h", tfloat16, tfloat32, "src0") > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] nir/print: add support for print annotations
From: Rob Clark Caller can pass a hashtable mapping NIR object (currently instr or var, but I guess others could be added as needed) to annotation msg to print inline with the shader dump. As the annotation msg is printed, it is removed from the hashtable to give the caller a way to know about any unassociated msgs. This is used in the next patch, for nir_validate to try to associate error msgs to nir_print dump. Signed-off-by: Rob Clark --- src/compiler/nir/nir.h | 1 + src/compiler/nir/nir_print.c | 41 - 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index ade584c..5f2cc8e 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -2196,6 +2196,7 @@ unsigned nir_index_instrs(nir_function_impl *impl); void nir_index_blocks(nir_function_impl *impl); void nir_print_shader(nir_shader *shader, FILE *fp); +void nir_print_shader_annotated(nir_shader *shader, FILE *fp, struct hash_table *errors); void nir_print_instr(const nir_instr *instr, FILE *fp); nir_shader *nir_shader_clone(void *mem_ctx, const nir_shader *s); diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c index a36561e..70ed73f 100644 --- a/src/compiler/nir/nir_print.c +++ b/src/compiler/nir/nir_print.c @@ -53,8 +53,28 @@ typedef struct { /* an index used to make new non-conflicting names */ unsigned index; + + /** +* Optional table of annotations mapping nir object +* (such as instr or var) to message to print. +*/ + struct hash_table *annotations; } print_state; +static const char * +get_annotation(print_state *state, void *obj) +{ + if (!state->annotations) + return NULL; + + struct hash_entry *entry = _mesa_hash_table_search(state->annotations, obj); + if (entry) { + _mesa_hash_table_remove(state->annotations, entry); + return entry->data; + } + return NULL; +} + static void print_register(nir_register *reg, print_state *state) { @@ -413,6 +433,11 @@ print_var_decl(nir_variable *var, print_state *state) } fprintf(fp, "\n"); + + const char *note = get_annotation(state, var); + if (note) { + fprintf(stderr, "%s\n", note); + } } static void @@ -918,6 +943,11 @@ print_block(nir_block *block, print_state *state, unsigned tabs) nir_foreach_instr(instr, block) { print_instr(instr, state, tabs); fprintf(fp, "\n"); + + const char *note = get_annotation(state, instr); + if (note) { + fprintf(stderr, "%s\n", note); + } } print_tabs(tabs, fp); @@ -1090,11 +1120,14 @@ destroy_print_state(print_state *state) } void -nir_print_shader(nir_shader *shader, FILE *fp) +nir_print_shader_annotated(nir_shader *shader, FILE *fp, + struct hash_table *annotations) { print_state state; init_print_state(&state, shader, fp); + state.annotations = annotations; + fprintf(fp, "shader: %s\n", gl_shader_stage_name(shader->stage)); if (shader->info.name) @@ -1144,6 +1177,12 @@ nir_print_shader(nir_shader *shader, FILE *fp) } void +nir_print_shader(nir_shader *shader, FILE *fp) +{ + nir_print_shader_annotated(shader, fp, NULL); +} + +void nir_print_instr(const nir_instr *instr, FILE *fp) { print_state state = { -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] nir/validate: dump annotated shader with error msgs
From: Rob Clark Log all the errors, and at the end dump the shader w/ error annotations to make it easier to see where the problems are. Signed-off-by: Rob Clark --- src/compiler/nir/nir_validate.c | 58 + 1 file changed, 58 insertions(+) diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c index 84334d4..a26f480 100644 --- a/src/compiler/nir/nir_validate.c +++ b/src/compiler/nir/nir_validate.c @@ -69,6 +69,9 @@ typedef struct { /* the current instruction being validated */ nir_instr *instr; + /* the current variable being validated */ + nir_variable *var; + /* the current basic block being validated */ nir_block *block; @@ -95,8 +98,29 @@ typedef struct { /* map of local variable -> function implementation where it is defined */ struct hash_table *var_defs; + + /* map of instruction/var/etc to failed assert string */ + struct hash_table *errors; } validate_state; +static void +log_error(validate_state *state, const char *failed) +{ + const void *obj; + + if (state->instr) + obj = state->instr; + else if (state->var) + obj = state->var; + else + obj = failed; + + _mesa_hash_table_insert(state->errors, obj, (void *)failed); +} + +#undef assert +#define assert(x) do { if (!(x)) log_error(state, "error: "#x); } while (0) + static void validate_src(nir_src *src, validate_state *state); static void @@ -901,6 +925,8 @@ postvalidate_reg_decl(nir_register *reg, validate_state *state) static void validate_var_decl(nir_variable *var, bool is_global, validate_state *state) { + state->var = var; + assert(is_global == nir_variable_is_global(var)); /* Must have exactly one mode set */ @@ -914,6 +940,8 @@ validate_var_decl(nir_variable *var, bool is_global, validate_state *state) if (!is_global) { _mesa_hash_table_insert(state->var_defs, var, state->impl); } + + state->var = NULL; } static bool @@ -1042,7 +1070,12 @@ init_validate_state(validate_state *state) state->regs_found = NULL; state->var_defs = _mesa_hash_table_create(NULL, _mesa_hash_pointer, _mesa_key_pointer_equal); + state->errors = _mesa_hash_table_create(NULL, _mesa_hash_pointer, + _mesa_key_pointer_equal); + state->loop = NULL; + state->instr = NULL; + state->var = NULL; } static void @@ -1053,6 +1086,28 @@ destroy_validate_state(validate_state *state) free(state->ssa_defs_found); free(state->regs_found); _mesa_hash_table_destroy(state->var_defs, NULL); + _mesa_hash_table_destroy(state->errors, NULL); +} + +static void +dump_errors(validate_state *state) +{ + struct hash_table *errors = state->errors; + + fprintf(stderr, "%d errors:\n", _mesa_hash_table_num_entries(errors)); + + nir_print_shader_annotated(state->shader, stderr, errors); + + if (_mesa_hash_table_num_entries(errors) > 0) { + fprintf(stderr, "%d additional errors:\n", + _mesa_hash_table_num_entries(errors)); + struct hash_entry *entry; + hash_table_foreach(errors, entry) { + fprintf(stderr, "%s\n", (char *)entry->data); + } + } + + abort(); } void @@ -1112,6 +1167,9 @@ nir_validate_shader(nir_shader *shader) postvalidate_reg_decl(reg, &state); } + if (_mesa_hash_table_num_entries(state.errors) > 0) + dump_errors(&state); + destroy_validate_state(&state); } -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] clover: Handle PIPE_SHADER_IR_NIR in switch
Signed-off-by: Jan Vesely --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 96f6a48..e2cadda 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -893,8 +893,9 @@ clover::compile_program_llvm(const std::string &source, module m; // Build the clover::module switch (ir) { + case PIPE_SHADER_IR_NIR: case PIPE_SHADER_IR_TGSI: - //XXX: Handle TGSI + //XXX: Handle TGSI, NIR assert(0); m = module(); break; -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] clover: Error on incomplete switch statements
Signed-off-by: Jan Vesely --- src/gallium/state_trackers/clover/Makefile.am | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/state_trackers/clover/Makefile.am b/src/gallium/state_trackers/clover/Makefile.am index 4c9d7d9..26ebd3b 100644 --- a/src/gallium/state_trackers/clover/Makefile.am +++ b/src/gallium/state_trackers/clover/Makefile.am @@ -1,5 +1,9 @@ include Makefile.sources +AM_CXXFLAGS = -Werror=switch + +CXXFLAGS += $(AM_CXXFLAGS) + AM_CPPFLAGS = \ -I$(top_srcdir)/include \ -I$(top_srcdir)/src \ -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir: fix comment typo about f2d/d2f
From: Rob Clark Signed-off-by: Rob Clark --- src/compiler/nir/nir_opcodes.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py index 24ffc31..9d05594 100644 --- a/src/compiler/nir/nir_opcodes.py +++ b/src/compiler/nir/nir_opcodes.py @@ -180,8 +180,8 @@ unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int conversion unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float conversion. unop_convert("u2d", tfloat64, tuint32, "src0") # Unsigned-to-double conversion. # double-to-float conversion -unop_convert("d2f", tfloat32, tfloat64, "src0") # Single to double precision -unop_convert("f2d", tfloat64, tfloat32, "src0") # Double to single precision +unop_convert("d2f", tfloat32, tfloat64, "src0") # Double to single precision +unop_convert("f2d", tfloat64, tfloat32, "src0") # Single to double precision # half/full conversion: unop_convert("f2h", tfloat16, tfloat32, "src0") -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/17] nir: add lowering pass for y-transform
On Thu, May 12, 2016 at 10:17 PM, Jason Ekstrand wrote: > > > On Mon, May 9, 2016 at 12:33 PM, Rob Clark wrote: >> >> From: Rob Clark >> >> Signed-off-by: Rob Clark >> Reviewed-by: Connor Abbott >> --- >> src/compiler/Makefile.sources| 1 + >> src/compiler/nir/nir.h | 11 + >> src/compiler/nir/nir_lower_wpos_ytransform.c | 310 >> +++ >> 3 files changed, 322 insertions(+) >> create mode 100644 src/compiler/nir/nir_lower_wpos_ytransform.c >> >> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources >> index 2a52319..b542a1a 100644 >> --- a/src/compiler/Makefile.sources >> +++ b/src/compiler/Makefile.sources >> @@ -208,6 +208,7 @@ NIR_FILES = \ >> nir/nir_lower_vars_to_ssa.c \ >> nir/nir_lower_var_copies.c \ >> nir/nir_lower_vec_to_movs.c \ >> + nir/nir_lower_wpos_ytransform.c \ >> nir/nir_metadata.c \ >> nir/nir_move_vec_src_uses_to_dest.c \ >> nir/nir_normalize_cubemap_coords.c \ >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >> index 8a616d4..474ba63 100644 >> --- a/src/compiler/nir/nir.h >> +++ b/src/compiler/nir/nir.h >> @@ -2374,6 +2374,17 @@ void nir_lower_two_sided_color(nir_shader *shader); >> >> void nir_lower_clamp_color_outputs(nir_shader *shader); >> >> +typedef struct nir_lower_wpos_ytransform_options { >> + int state_tokens[5]; >> + bool fs_coord_origin_upper_left :1; >> + bool fs_coord_origin_lower_left :1; >> + bool fs_coord_pixel_center_integer :1; >> + bool fs_coord_pixel_center_half_integer :1; > > > Drive-by commentary: Why are we using two booleans for one boolean here? > All hardware should be either lower-left or upper-left and I'm going to > hazard that the other two are mutually exclusive as well. The pass > certainly seems to assume so. mostly just because gallium splits it out into two caps, and this matches the logic in the equiv tgsi lowering pass more closely.. The way it is currently would, I think, work if there was some hw that supported both cases (which is, I assume, why the gallium part of it works the way it does) BR, -R > Let's just make it two booleans. If we come across hardware that puts the > pixel center at 0.75, 0.25 then we can make fs_coord_pixel_center an enum. > --Jason > >> >> +} nir_lower_wpos_ytransform_options; >> + >> +bool nir_lower_wpos_ytransform(nir_shader *shader, >> + const nir_lower_wpos_ytransform_options >> *options); >> + >> void nir_lower_atomics(nir_shader *shader, >> const struct gl_shader_program *shader_program); >> void nir_lower_to_source_mods(nir_shader *shader); >> diff --git a/src/compiler/nir/nir_lower_wpos_ytransform.c >> b/src/compiler/nir/nir_lower_wpos_ytransform.c >> new file mode 100644 >> index 000..1d53530 >> --- /dev/null >> +++ b/src/compiler/nir/nir_lower_wpos_ytransform.c >> @@ -0,0 +1,310 @@ >> +/* >> + * Copyright © 2015 Red Hat >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining >> a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * the rights to use, copy, modify, merge, publish, distribute, >> sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the >> next >> + * paragraph) shall be included in all copies or substantial portions of >> the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >> SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> ARISING FROM, >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS >> IN THE >> + * SOFTWARE. >> + */ >> + >> +#include "nir.h" >> +#include "nir_builder.h" >> + >> +/* Lower gl_FragCoord (and fddy) to account for driver's requested >> coordinate- >> + * origin and pixel-center vs. shader. If transformation is required, a >> + * gl_FbWposYTransform uniform is inserted (with the specified >> state-slots) >> + * and additional instructions are inserted to transform gl_FragCoord >> (and >> + * fddy src arg). >> + * >> + * This is based on the logic in emit_wpos()/emit_wpos_adjustment() in >> TGSI >> + * compiler. >> + * >> + * Run before nir_lower_io. >> + */ >> + >> +typedef struct { >> + const nir_lower_wpos_ytransform_options *options; >> + nir_shader *shader; >> + nir_builder b; >> + nir_variable *transform; >> +} lower_wpos_ytransform_stat
Re: [Mesa-dev] [PATCH 2/3] nir/algebraic: support for power-of-two optimizations
On Thu, May 12, 2016 at 10:55 PM, Jason Ekstrand wrote: > > > On Tue, May 10, 2016 at 11:57 AM, Rob Clark wrote: >> >> From: Rob Clark >> >> Some optimizations, like converting integer multiply/divide into left/ >> right shifts, have additional constraints on the search expression. >> Like requiring that a variable is a constant power of two. Support >> these cases by allowing a fxn name to be appended to the search var >> expression (ie. "a#32(is_power_of_two)"). >> >> TODO update doc/comment explaining search var syntax >> TODO the eagle-eyed viewer might have noticed that this could also >> replace the existing const syntax (ie. "#a"). Not sure if we should >> keep that.. we could make it syntactic sugar (ie '#' automatically sets >> the cond fxn ptr to 'is_const') or just get rid of it entirely? Maybe >> that is a follow-on clean-up patch? >> >> Signed-off-by: Rob Clark >> --- >> src/compiler/nir/nir_algebraic.py | 8 +++-- >> src/compiler/nir/nir_opt_algebraic.py | 5 +++ >> src/compiler/nir/nir_search.c | 3 ++ >> src/compiler/nir/nir_search.h | 10 ++ >> src/compiler/nir/nir_search_helpers.h | 66 >> +++ >> 5 files changed, 90 insertions(+), 2 deletions(-) >> create mode 100644 src/compiler/nir/nir_search_helpers.h >> >> diff --git a/src/compiler/nir/nir_algebraic.py >> b/src/compiler/nir/nir_algebraic.py >> index 285f853..19ac6ee 100644 >> --- a/src/compiler/nir/nir_algebraic.py >> +++ b/src/compiler/nir/nir_algebraic.py >> @@ -76,6 +76,7 @@ class Value(object): >> return Constant(val, name_base) >> >> __template = mako.template.Template(""" >> +#include "compiler/nir/nir_search_helpers.h" >> static const ${val.c_type} ${val.name} = { >> { ${val.type_enum}, ${val.bit_size} }, >> % if isinstance(val, Constant): >> @@ -84,6 +85,7 @@ static const ${val.c_type} ${val.name} = { >> ${val.index}, /* ${val.var_name} */ >> ${'true' if val.is_constant else 'false'}, >> ${val.type() or 'nir_type_invalid' }, >> + ${val.cond if val.cond else 'NULL'}, >> % elif isinstance(val, Expression): >> ${'true' if val.inexact else 'false'}, >> nir_op_${val.opcode}, >> @@ -113,7 +115,7 @@ static const ${val.c_type} ${val.name} = { >> Variable=Variable, >> Expression=Expression) >> >> -_constant_re = re.compile(r"(?P[^@]+)(?:@(?P\d+))?") >> +_constant_re = re.compile(r"(?P[^@\(]+)(?:@(?P\d+))?") > > > Spurious change? > I thought it needed to avoid matching something like a(is_power_of_two).. but it seems to work with that hunk reverted so I guess I can drop it.. >> >> >> class Constant(Value): >> def __init__(self, val, name): >> @@ -150,7 +152,8 @@ class Constant(Value): >> return "nir_type_float" >> >> _var_name_re = re.compile(r"(?P#)?(?P\w+)" >> - >> r"(?:@(?Pint|uint|bool|float)?(?P\d+)?)?") >> + >> r"(?:@(?Pint|uint|bool|float)?(?P\d+)?)?" >> + r"(?P\([^\)]+\))?") >> >> class Variable(Value): >> def __init__(self, val, name, varset): >> @@ -161,6 +164,7 @@ class Variable(Value): >> >>self.var_name = m.group('name') >>self.is_constant = m.group('const') is not None >> + self.cond = m.group('cond') >>self.required_type = m.group('type') >>self.bit_size = int(m.group('bits')) if m.group('bits') else 0 >> >> diff --git a/src/compiler/nir/nir_opt_algebraic.py >> b/src/compiler/nir/nir_opt_algebraic.py >> index 0a95725..952a91a 100644 >> --- a/src/compiler/nir/nir_opt_algebraic.py >> +++ b/src/compiler/nir/nir_opt_algebraic.py >> @@ -62,6 +62,11 @@ d = 'd' >> # constructed value should have that bit-size. >> >> optimizations = [ >> + >> + (('imul', a, '#b@32(is_power_of_two)'), ('ishl', a, ('find_lsb', b))), >> + (('udiv', a, '#b@32(is_power_of_two)'), ('ushr', a, ('find_lsb', b))), >> + (('umod', a, '#b(is_power_of_two)'),('iand', a, ('isub', b, 1))), >> + >> (('fneg', ('fneg', a)), a), >> (('ineg', ('ineg', a)), a), >> (('fabs', ('fabs', a)), ('fabs', a)), >> diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c >> index 2c2fd92..b21fb2c 100644 >> --- a/src/compiler/nir/nir_search.c >> +++ b/src/compiler/nir/nir_search.c >> @@ -127,6 +127,9 @@ match_value(const nir_search_value *value, >> nir_alu_instr *instr, unsigned src, >> instr->src[src].src.ssa->parent_instr->type != >> nir_instr_type_load_const) >> return false; >> >> + if (var->cond && !var->cond(instr, src, num_components, >> new_swizzle)) >> +return false; >> + >> if (var->type != nir_type_invalid) { >> if (instr->src[src].src.ssa->parent_instr->type != >> nir_instr_type_alu) >> return false; >> diff --git a/src/compiler/nir/nir_search.h b/src/compiler/nir/nir_search.h >> index a500feb..f55d797 100644 >> --- a/src/compiler/nir/nir_search.h >> +++ b/src/compiler/nir/ni
Re: [Mesa-dev] [PATCH 09/11] tgsi: remove culldist semantic.
On 05/14/2016 04:24 PM, Ilia Mirkin wrote: On Sat, May 14, 2016 at 10:23 AM, Roland Scheidegger wrote: Am 14.05.2016 um 14:55 schrieb Marek Olšák: Dave, It should be noted that clip distances can be disabled by pipe_rasterizer_state::clip_plane_enable, but cull distances can't. (same as GL) That only applies to user clip planes, not shader clip distances. Actually, it applies to both. Yes, you are right. Ahh crap. draw, however, ignores the enable bits for clip distances (and we're probably relying on this even internally right now). Do blobs actually honor them? I'm wondering because some code changes I was recently doing at vmware shouldn't have worked if they did... Or maybe I got lucky... In any case honoring the enable bits should still be possible even with both clip and cull integrated into the same output. Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] egl: android: drop dri2_create_image_android_native_buffer argument
On Sun, May 1, 2016 at 6:42 AM, Emil Velikov wrote: > The drv is no longer used/needed as of last commit. > > Cc: Rob Herring > Signed-off-by: Emil Velikov > --- > src/egl/drivers/dri2/platform_android.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Acked-by: Rob Herring ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] egl: android: directly use dri2_create_image_dma_buf()
On Sun, May 1, 2016 at 6:42 AM, Emil Velikov wrote: > Make the function non static so that we can use it directly from the > android platform code. > > Cc: Rob Herring > Signed-off-by: Emil Velikov > --- > src/egl/drivers/dri2/egl_dri2.c | 2 +- > src/egl/drivers/dri2/egl_dri2.h | 4 > src/egl/drivers/dri2/platform_android.c | 3 +-- > 3 files changed, 6 insertions(+), 3 deletions(-) Acked-by: Rob Herring ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 95395] glsl: NULL type value in add_uniform() leads to SIGSEGV
https://bugs.freedesktop.org/show_bug.cgi?id=95395 --- Comment #1 from Ilia Mirkin --- Another thing to check is whether you can reproduce on amd64 with DRAW_USE_LLVM=0 - otherwise softpipe uses llvm for the vertex stages when available. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 95395] glsl: NULL type value in add_uniform() leads to SIGSEGV
https://bugs.freedesktop.org/show_bug.cgi?id=95395 Kenneth Graunke changed: What|Removed |Added Assignee|i...@freedesktop.org |mesa-dev@lists.freedesktop. ||org -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Android: apps crashed on Intel Gen9 GPU
2016-05-13 15:30 GMT+08:00 Pohjolainen, Topi : > On Thu, May 12, 2016 at 12:25:25AM +0800, Chih-Wei Huang wrote: >> Testing android-x86 with mesa 11.2.2, >> I found the Google Play crashed forever on >> a device with Intel Gen9 GPU (e.g., Skylake). >> >> After analyzing, the i965 driver seems to assume >> irb->mt is not null. For example in >> brw_meta_fast_clear of brw_meta_fast_clear.c: >> >> struct intel_renderbuffer *irb = intel_renderbuffer(rb); >> ... >> if (brw->gen >= 9 && >> brw_format_for_mesa_format(irb->mt->format) != >> ^ => crashing >> brw->render_target_format[irb->mt->format]) >> clear_type = REP_CLEAR; >> >> If I added null checking to irb->mt, it fixes this crashing. >> However, the app still crashed at other place that >> accesses irb->mt similarly. >> (brw_draw.c line 399, gen8_surface_state.c line 432, etc) >> >> Please comment how to fix it correctly. >> Why irb->mt is null but the code assumes it's not? > > As far as I understand something has gone wrong before - having an > intel_renderbuffer without a miptree shouldn't be a reachable state at all. Thank you for the reply. When/where should the miptree be set? How can I debug it? -- Chih-Wei Android-x86 project http://www.android-x86.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 00/14] radeonsi: Offchip tessellation
Hi, there are minor rendering glitches in Shadow of Mordor on R9 390. git-59156b2 + this patch v2. I can send a screenshot if you are unable to reproduce this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa 11.2.2 problems with Intel i965 graphics on Arch Linux
On Saturday, May 14, 2016 8:55:12 AM PDT Vanja Z wrote: > Hi all, > > I'm sorry if this is the wrong place to post this. Upgrading from mesa 11.2.1 to 11.2.2 on Arch Linux results in several programs not working. I am getting the following errors when launching Paraview for example, > > libGL error: unable to load driver: i965_dri.so > libGL error: driver pointer missing > libGL error: failed to load driver: i965 > libGL error: unable to load driver: swrast_dri.so > libGL error: failed to load driver: swrast > > Both files exist on my system, > > /usr/lib/xorg/modules/dri/i965_dri.so > /usr/lib/xorg/modules/dri/swrast_dri.so > > I am not sure if this is a problem with mesa, or with the Arch package or with my X configuration. I've tried asking on the Arch forums to no avail. > > > Best regards, > Vanja This is likely an issue with your installation. Setting LIBGL_DEBUG=verbose when running the application would give you more information. If some programs are working and not others, it could be a multilib issue - maybe your lib32-mesa packages are messed up? --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] c11/threads: create mutexattrs only when needed
Any comments on the patch and/or discussion after the --- line ? On 24 April 2016 at 16:14, Emil Velikov wrote: > From: Emil Velikov > > If the mutexattrs are the default one can just pass NULL to > pthread_mutex_init. As the compiler does not know this detail it > unnecessarily creates/destroys the attrs. > > Signed-off-by: Emil Velikov > --- > While going through GLVND, I've noticed that it (sort of) breaks its > assumptions/goals - 'we don't want the heavy locking/etc. brought by > pthreads' [for single threaded uses] > > Thus I gave mesa a quick look and the following popped up: > > - pthread_once - libglapi, classic + dri > Replace with an atomic test & set combo ? > > - pthread_mutexattr_* - all dri modules, libGL-apple > Using a recursive lock in src/mesa/main/shared.c and > src/glx/apple/apple_glx_drawable.c > > - pthread_key_* - EGL > - pthread_.etspecific - EGL > Extend pthread-stubs explicitly required it by mesa ? > Note: the original code that pthread-stubs is based on (libX11) does > have these ;-) > > - pthread_barrier_* - llvmpipe > Fall-back to the mutex + cond implementation ? > > - pthread_setname_np - llvmpipe > Do we need this ? Afaict the Windows build does not have an equivalent. > > - pthread_join - nine, llvmpipe, radeon(s), rbug, omx (thanks bellagio) > - pthread_create - nine, llvmpipe, radeon(s), rbug > - pthread_sigmask - nine, llvmpipe, radeon(s), rbug > > These four (five inc bellagio/omx) want more than one thread. How do we > get others pthread free, while keeping these happy ? > > Please let me know how you feel on the topic. Do you see this as worthy > goal ? Does the proposed solutions sound OK ? Can you think of any > alternatives? > > -Emil > > P.S. For anyone who wonders, libc (GNU one only iirc) provides > lightweight stubs, thus single-threaded apps work without the overhead. > --- > include/c11/threads_posix.h | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/include/c11/threads_posix.h b/include/c11/threads_posix.h > index ce9853b..11d36e4 100644 > --- a/include/c11/threads_posix.h > +++ b/include/c11/threads_posix.h > @@ -180,9 +180,14 @@ mtx_init(mtx_t *mtx, int type) >&& type != (mtx_timed|mtx_recursive) >&& type != (mtx_try|mtx_recursive)) > return thrd_error; > + > +if ((type & mtx_recursive) == 0) { > +pthread_mutex_init(mtx, NULL); > +return thrd_success; > +} > + > pthread_mutexattr_init(&attr); > -if ((type & mtx_recursive) != 0) > -pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE); > +pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE); > pthread_mutex_init(mtx, &attr); > pthread_mutexattr_destroy(&attr); > return thrd_success; > -- > 2.8.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] egl: android: directly use dri2_create_image_dma_buf()
On 1 May 2016 at 12:42, Emil Velikov wrote: > Make the function non static so that we can use it directly from the > android platform code. > > Cc: Rob Herring > Signed-off-by: Emil Velikov > --- > src/egl/drivers/dri2/egl_dri2.c | 2 +- > src/egl/drivers/dri2/egl_dri2.h | 4 > src/egl/drivers/dri2/platform_android.c | 3 +-- > 3 files changed, 6 insertions(+), 3 deletions(-) > > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index d8448f4..95dc0d6 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -1960,7 +1960,7 @@ dri2_check_dma_buf_format(const _EGLImageAttribs *attrs) > * > * Therefore we must never close or otherwise modify the file descriptors. > */ > -static _EGLImage * > +_EGLImage * > dri2_create_image_dma_buf(_EGLDisplay *disp, _EGLContext *ctx, > EGLClientBuffer buffer, const EGLint *attr_list) > { > diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h > index ddb5f39..925294b 100644 > --- a/src/egl/drivers/dri2/egl_dri2.h > +++ b/src/egl/drivers/dri2/egl_dri2.h > @@ -361,6 +361,10 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp, > _EGLContext *ctx, EGLenum target, > EGLClientBuffer buffer, const EGLint *attr_list); > > +_EGLImage * > +dri2_create_image_dma_buf(_EGLDisplay *disp, _EGLContext *ctx, > + EGLClientBuffer buffer, const EGLint *attr_list); > + > EGLBoolean > dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp); > > diff --git a/src/egl/drivers/dri2/platform_android.c > b/src/egl/drivers/dri2/platform_android.c > index c837b35..9f0f133 100644 > --- a/src/egl/drivers/dri2/platform_android.c > +++ b/src/egl/drivers/dri2/platform_android.c > @@ -494,8 +494,7 @@ dri2_create_image_android_native_buffer(_EGLDriver *drv, > _EGLDisplay *disp, >if (fourcc == -1 || pitch == 0) > return NULL; > > - return dri2_create_image_khr(drv, disp, ctx, EGL_LINUX_DMA_BUF_EXT, > - NULL, attr_list); > + return dri2_create_image_dma_buf(disp, ctx, NULL, attr_list); Rob, care to ack this and 2/3 ? Using dri2_create_image_dma_buf over the generic dri2_create_image_khr seems should make things a bit more obvious Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gbm: remove define _BSD_SOURCE
On 1 May 2016 at 13:48, Emil Velikov wrote: > The build systems already add this as applicable. There's no need to > have this in the source file. > > Signed-off-by: Emil Velikov > --- > src/gbm/main/gbm.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c > index c046b1a..a8da082 100644 > --- a/src/gbm/main/gbm.c > +++ b/src/gbm/main/gbm.c > @@ -25,7 +25,6 @@ > *Benjamin Franzke > */ > > -#define _BSD_SOURCE > #define _DEFAULT_SOURCE > Do we have a brave soul to ack/r-b this and the other two "kill the manual _{BSD,DEFAUL}_SOURCE defines" ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/14] vl/dri3: implement dri3 screen create and destroy
On 12 May 2016 at 16:25, Leo Liu wrote: > On 05/12/2016 11:08 AM, Emil Velikov wrote: >> >> On 12 May 2016 at 15:01, Leo Liu wrote: >>> >>> >>> On 05/12/2016 09:47 AM, Emil Velikov wrote: Hi Leo, On 11 May 2016 at 22:14, Leo Liu wrote: > > On 05/11/2016 04:20 PM, Axel Davy wrote: >> >> On 11/05/2016 17:06, Leo Liu wrote: >>> >>> Screen created with device fd returned from X server, >>> also will bail out to DRI2 with certain conditions. >>> >>> Signed-off-by: Leo Liu >>> --- >>> configure.ac | 7 ++- >>> src/gallium/auxiliary/vl/vl_winsys_dri3.c | 88 >>> ++- >>> 2 files changed, 93 insertions(+), 2 deletions(-) >>> >>> diff --git a/configure.ac b/configure.ac >>> index 023110e..8c3960a 100644 >>> --- a/configure.ac >>> +++ b/configure.ac >>> @@ -1779,7 +1779,12 @@ if test "x$enable_xvmc" = xyes -o \ >>> "x$enable_vdpau" = xyes -o \ >>> "x$enable_omx" = xyes -o \ >>> "x$enable_va" = xyes; then >>> -PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= >>> $XCBDRI2_REQUIRED]) >>> +if test x"$enable_dri3" = xyes; then >>> +PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync >>> xshmfence = $XSHMFENCE_REQUIRED >>> >>> + x11-xcb xcb xcb-dri2 >= >>> $XCBDRI2_REQUIRED]) We don't need xcb-dri2 in the above do we ? >>> >>> >>> Yes I think so. That's for all vl, includes building vl_winsys_dri.c. >>> >>> >> Yes we need it, or yes we don't need it ? Afaict the vl_winsys_dri.c >> case is handled in the else statement. > > We still need vl_winsys_dri.c even with "enable_dri3", because there's > fallback case. > Indeed you are correct - had a PEBKAC moment. Thanks for the patience Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] vl/dri: fix close fd error out
On 12 May 2016 at 16:19, Leo Liu wrote: > On 05/12/2016 11:10 AM, Emil Velikov wrote: >> >> On 12 May 2016 at 15:10, Leo Liu wrote: >>> >>> fd should be set to -1 only if it got closed by pipe_loader_release. >>> >>> Signed-off-by: Leo Liu >>> --- >>> src/gallium/auxiliary/vl/vl_winsys_dri.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> b/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> index 0136526..4636feb 100644 >>> --- a/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> +++ b/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> @@ -427,9 +427,10 @@ vl_dri2_screen_create(Display *display, int screen) >>> return &scrn->base; >>> >>> release_pipe: >>> - if (scrn->base.dev) >>> + if (scrn->base.dev) { >>> pipe_loader_release(&scrn->base.dev, 1); >>> - fd = -1; >>> + fd = -1; >>> + } >>> free_authenticate: >>> free(authenticate); >>> close_fd: >> >> +if (fd != -1) >> close(fd) >> >> Please add a -1 check before the close. > > > Sure I will add it, commit it to the repo later today. > In general it would be better to reply with the updated patch first. Obviously it's not a big deal here. Thanks again for squashing my silly mistake(s). Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] st/xa: don't call close(-1) in xa_tracker_create error path
Analogous to previous commit. Signed-off-by: Emil Velikov --- src/gallium/state_trackers/xa/xa_tracker.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/state_trackers/xa/xa_tracker.c b/src/gallium/state_trackers/xa/xa_tracker.c index f09baed..e091b083 100644 --- a/src/gallium/state_trackers/xa/xa_tracker.c +++ b/src/gallium/state_trackers/xa/xa_tracker.c @@ -152,7 +152,7 @@ xa_tracker_create(int drm_fd) struct xa_tracker *xa = calloc(1, sizeof(struct xa_tracker)); enum xa_surface_type stype; unsigned int num_formats; -int fd = -1; +int fd; if (!xa) return NULL; @@ -212,9 +212,9 @@ xa_tracker_create(int drm_fd) out_no_screen: if (xa->dev) pipe_loader_release(&xa->dev, 1); -fd = -1; +else + close(fd); out_no_fd: -close(fd); free(xa); return NULL; } -- 2.8.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] vl/drm: don't call close(-1) in vl_drm_screen_create error path
Analogous to previous commits. Signed-off-by: Emil Velikov --- src/gallium/auxiliary/vl/vl_winsys_drm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/vl/vl_winsys_drm.c b/src/gallium/auxiliary/vl/vl_winsys_drm.c index 6d9d947..6a759ae 100644 --- a/src/gallium/auxiliary/vl/vl_winsys_drm.c +++ b/src/gallium/auxiliary/vl/vl_winsys_drm.c @@ -41,20 +41,20 @@ struct vl_screen * vl_drm_screen_create(int fd) { struct vl_screen *vscreen; - int new_fd = -1; + int new_fd; vscreen = CALLOC_STRUCT(vl_screen); if (!vscreen) return NULL; if (fd < 0 || (new_fd = dup(fd)) < 0) - goto error; + goto free_screen; if (pipe_loader_drm_probe_fd(&vscreen->dev, new_fd)) vscreen->pscreen = pipe_loader_create_screen(vscreen->dev); if (!vscreen->pscreen) - goto error; + goto release_pipe; vscreen->destroy = vl_drm_screen_destroy; vscreen->texture_from_drawable = NULL; @@ -64,12 +64,13 @@ vl_drm_screen_create(int fd) vscreen->get_private = NULL; return vscreen; -error: +release_pipe: if (vscreen->dev) pipe_loader_release(&vscreen->dev, 1); else close(new_fd); +free_screen: FREE(vscreen); return NULL; } -- 2.8.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] st/dri: don't call close(-1) in dri{2, kms_}_init_screen error path
Add separate labels and jump to the correct one as needed. Signed-off-by: Emil Velikov --- src/gallium/state_trackers/dri/dri2.c | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 675a9bb..2330530 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -1714,7 +1714,7 @@ dri2_init_screen(__DRIscreen * sPriv) struct pipe_screen *pscreen = NULL; const struct drm_conf_ret *throttle_ret; const struct drm_conf_ret *dmabuf_ret; - int fd = -1; + int fd; screen = CALLOC_STRUCT(dri_screen); if (!screen) @@ -1727,13 +1727,13 @@ dri2_init_screen(__DRIscreen * sPriv) sPriv->driverPrivate = (void *)screen; if (screen->fd < 0 || (fd = dup(screen->fd)) < 0) - goto fail; + goto free_screen; if (pipe_loader_drm_probe_fd(&screen->dev, fd)) pscreen = pipe_loader_create_screen(screen->dev); if (!pscreen) - goto fail; + goto release_pipe; throttle_ret = pipe_loader_configuration(screen->dev, DRM_CONF_THROTTLE); dmabuf_ret = pipe_loader_configuration(screen->dev, DRM_CONF_SHARE_FD); @@ -1762,7 +1762,7 @@ dri2_init_screen(__DRIscreen * sPriv) configs = dri_init_screen_helper(screen, pscreen, screen->dev->driver_name); if (!configs) - goto fail; + goto destroy_screen; screen->can_share_buffer = true; screen->auto_fake_front = dri_with_format(sPriv); @@ -1770,12 +1770,17 @@ dri2_init_screen(__DRIscreen * sPriv) screen->lookup_egl_image = dri2_lookup_egl_image; return configs; -fail: + +destroy_screen: dri_destroy_screen_helper(screen); + +release_pipe: if (screen->dev) pipe_loader_release(&screen->dev, 1); else close(fd); + +free_screen: FREE(screen); return NULL; } @@ -1793,7 +1798,7 @@ dri_kms_init_screen(__DRIscreen * sPriv) struct dri_screen *screen; struct pipe_screen *pscreen = NULL; uint64_t cap; - int fd = -1; + int fd; screen = CALLOC_STRUCT(dri_screen); if (!screen) @@ -1805,13 +1810,13 @@ dri_kms_init_screen(__DRIscreen * sPriv) sPriv->driverPrivate = (void *)screen; if (screen->fd < 0 || (fd = dup(screen->fd)) < 0) - goto fail; + goto free_screen; if (pipe_loader_sw_probe_kms(&screen->dev, fd)) pscreen = pipe_loader_create_screen(screen->dev); if (!pscreen) - goto fail; + goto release_pipe; if (drmGetCap(sPriv->fd, DRM_CAP_PRIME, &cap) == 0 && (cap & DRM_PRIME_CAP_IMPORT)) { @@ -1823,7 +1828,7 @@ dri_kms_init_screen(__DRIscreen * sPriv) configs = dri_init_screen_helper(screen, pscreen, "swrast"); if (!configs) - goto fail; + goto destroy_screen; screen->can_share_buffer = false; screen->auto_fake_front = dri_with_format(sPriv); @@ -1831,12 +1836,17 @@ dri_kms_init_screen(__DRIscreen * sPriv) screen->lookup_egl_image = dri2_lookup_egl_image; return configs; -fail: + +destroy_screen: dri_destroy_screen_helper(screen); + +release_pipe: if (screen->dev) pipe_loader_release(&screen->dev, 1); else close(fd); + +free_screen: FREE(screen); #endif // GALLIUM_SOFTPIPE return NULL; -- 2.8.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 7/9] gbm: rename gbm_dri_bo_{map, unmap} to gbm_dri_bo_{map, unmap}_dumb
On 4 May 2016 at 03:02, Rob Herring wrote: > In preparation to add public map/unmap functions, rename the existing > gbm_dri_bo_{map,unmap} functions to indicate that they are only for dumb > buffers. > > Signed-off-by: Rob Herring > --- > v2: > - moved into new patch > > src/gbm/backends/dri/gbm_dri.c| 4 ++-- > src/gbm/backends/dri/gbm_driint.h | 4 ++-- > 2 files changed, 4 insertions(+), 4 deletions(-) > I mentioned it before, guess I wasn't clear enough - there are cases of these functions used outside of GBM (yes it is a bit of a 'nasty' ABI) Namely, there's two of each in origin/master:src/egl/drivers/dri2/platform_drm.c Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa 11.3.0/12.0.0 release plan
On 29 April 2016 at 14:07, Iago Toral wrote: > On Fri, 2016-04-29 at 14:01 +0100, Emil Velikov wrote: >> On 29 April 2016 at 13:19, Iago Toral wrote: >> > On Fri, 2016-04-29 at 00:55 -0700, Kenneth Graunke wrote: >> >> On Thursday, April 28, 2016 4:06:49 PM PDT Emil Velikov wrote: >> >> > Hi all, >> >> > >> >> > Here is the current tentative 11.3.0/12.0.0 release schedule. >> >> > >> >> > May 20th 2016 - Feature freeze/Release candidate 1 >> >> > May 27th 2016 - Release candidate 2 >> >> > June 03rd 2016 - Release candidate 3 >> >> > June 10th 2016 - Release candidate 4/final release >> >> > >> >> > With the above in mind we have three weeks to get new features. >> >> > >> >> > Do we have some serious work that we want to squeeze in and the time >> >> > is not enough. Does the proposed dates align with distributions >> >> > needs/expectations ? >> >> > >> >> > Kindly let me know. >> >> > >> >> > Thanks >> >> > Emil >> >> >> >> I'd really love to get fp64/va64 for Broadwell+ landed - with that in >> >> place, we'll jump forward to GL 4.2. We'll try and pull it off, but we >> >> might need a little bit more time... >> > >> > We have just sent the first bunch of i965 patches for review so it is >> > all going to depend on the kind of review feedback we get. It is a large >> > series that touches a lot of things so I imagine it might take some time >> > to get it in a shape where everyone feels comfortable merging it, but >> > let's see. >> > >> So should we keep the dates as-is and re-estimate in a week ? Afaict >> it's be nearly impossible to say how much extra time will be needed, >> if any. > > Yes, I think that sounds reasonable. > Gents am I missing something or not many of the fp64/va64 patches have landed yet ? The proposed branchpoint is a week away, so I'd like to hear your thoughts on how long you think is going to take to land the work. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 8/8] nvc0: expose GLSL version 420 on GF100
Except for patches 1 and 6, this series is Reviewed-by: Ilia Mirkin On Sat, May 14, 2016 at 9:54 AM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c > index bd68ca9..40e5a9d 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c > @@ -120,7 +120,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum > pipe_cap param) > case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE: >return 128 * 1024 * 1024; > case PIPE_CAP_GLSL_FEATURE_LEVEL: > - if (class_3d == NVE4_3D_CLASS || class_3d == NVF0_3D_CLASS) > + if (class_3d <= NVF0_3D_CLASS) > return 420; >return 410; > case PIPE_CAP_MAX_RENDER_TARGETS: > -- > 2.8.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/8] nvc0/ir: add a lowering pass for surfaces on Fermi
On Sat, May 14, 2016 at 9:54 AM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 117 > + > .../nouveau/codegen/nv50_ir_lowering_nvc0.h| 2 + > 2 files changed, 119 insertions(+) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp > index 1068c21..002f09d 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp > @@ -1982,6 +1982,121 @@ NVC0LoweringPass::handleSurfaceOpNVE4(TexInstruction > *su) >su->sType = (su->tex.target == TEX_TARGET_BUFFER) ? TYPE_U32 : TYPE_U8; > } > > +void > +NVC0LoweringPass::processSurfaceCoordsNVC0(TexInstruction *su) > +{ > + const int idx = su->tex.r; > + const int dim = su->tex.target.getDim(); > + const int arg = dim + (su->tex.target.isArray() || > su->tex.target.isCube()); > + const uint16_t base = idx * NVE4_SU_INFO__STRIDE; > + int c; > + Value *zero = bld.mkImm(0); > + Value *src[3]; > + Value *v; > + Value *ind = NULL; > + > + if (su->tex.rIndirectSrc >= 0) { > + // FIXME: out of bounds > + assert(su->tex.r == 0); > + ind = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), > + su->getIndirectR(), bld.mkImm(6)); > + } > + > + // get surface coordinates > + for (c = 0; c < arg; ++c) > + src[c] = su->getSrc(c); > + for (; c < 3; ++c) > + src[c] = zero; > + > + // calculate pixel offset > + if (su->op == OP_SULDP || su->op == OP_SUREDP) { > + v = loadSuInfo32(ind, base + NVE4_SU_INFO_BSIZE); > + su->setSrc(0, bld.mkOp2v(OP_MUL, TYPE_U32, bld.getSSA(), src[0], v)); > + } > + > + // add array layer offset > + if (su->tex.target.isArray() || su->tex.target.isCube()) { > + v = loadSuInfo32(ind, base + NVE4_SU_INFO_ARRAY); > + assert(dim > 1); > + su->setSrc(2, bld.mkOp2v(OP_MUL, TYPE_U32, bld.getSSA(), src[2], v)); > + } > + > + // prevent read fault when the image is not actually bound > + CmpInstruction *pred = > + bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, bld.getSSA(1, FILE_PREDICATE), > +TYPE_U32, bld.mkImm(0), > +loadSuInfo32(ind, base + NVE4_SU_INFO_ADDR)); > + if (su->tex.format) { > + const TexInstruction::ImgFormatDesc *format = su->tex.format; > + int blockwidth = format->bits[0] + format->bits[1] + > + format->bits[2] + format->bits[3]; > + > + if (blockwidth >= 8) { Why is the blockwidth so important here? Don't you just want to do this for reads, since those use byte-type accesses as well as atomics? i.e. do you need to do this for regular stores? Even if you decide to stick with it, what you're really protecting against here is a format of PIPE_FORMAT_NONE, which you should check for explicitly here rather than creating an arbitrary 8-bit limit. > + // make sure that the format doesn't mismatch > + bld.mkCmp(OP_SET_OR, CC_NE, TYPE_U32, pred->getDef(0), > + TYPE_U32, bld.loadImm(NULL, blockwidth / 8), > + loadSuInfo32(ind, base + NVE4_SU_INFO_BSIZE), > + pred->getDef(0)); > + } > + } > + su->setPredicate(CC_NOT_P, pred->getDef(0)); > +} > + > +void > +NVC0LoweringPass::handleSurfaceOpNVC0(TexInstruction *su) > +{ > + if (su->tex.target == TEX_TARGET_1D_ARRAY) { > + /* As 1d arrays also need 3 coordinates, switching to > TEX_TARGET_2D_ARRAY > + * will simplify the lowering pass and the texture constraints. */ > + su->moveSources(1, 1); > + su->setSrc(2, su->getSrc(1)); Is this line necessary? I thought that moveSources would take src(1) and move it to src(2) [and so on]. > + su->setSrc(1, bld.loadImm(NULL, 0)); > + su->tex.target = TEX_TARGET_2D_ARRAY; > + } > + > + processSurfaceCoordsNVC0(su); > + > + if (su->op == OP_SULDP) > + convertSurfaceFormat(su); > + > + if (su->op == OP_SUREDB || su->op == OP_SUREDP) { > + const int dim = su->tex.target.getDim(); > + const int arg = dim + (su->tex.target.isArray() || > su->tex.target.isCube()); > + LValue *addr = bld.getSSA(8); > + Value *def = su->getDef(0); > + > + su->op = OP_SULEA; > + > + // Set the destination to the address > + su->dType = TYPE_U64; > + su->setDef(0, addr); > + su->setDef(1, su->getPredicate()); > + > + bld.setPosition(su, true); > + > + // Perform the atomic op > + Instruction *red = bld.mkOp(OP_ATOM, su->sType, bld.getSSA()); > + red->subOp = su->subOp; > + red->setSrc(0, bld.mkSymbol(FILE_MEMORY_GLOBAL, 0, su->sType, 0)); > + red->setSrc(1, su->getSrc(arg)); > + if (red->subOp == NV50_IR_SUBOP_ATOM_CAS) > + red->setSrc(2, su->getSrc(arg + 1)); > + red->setIndirect(0, 0, addr); > + > + // make sure to i
Re: [Mesa-dev] [PATCH] nv50/ir: avoid asserts when the state tracker feeds us bogus inputs
On 05/13/2016 05:45 AM, Ilia Mirkin wrote: INTERP is defined (by me) to have to have a INPUT source. However the state tracker does not always obey this. This happens due to varying packing logic introducing additional mov's which can't always be undone. Instead of just giving up, we instead try harder to find the original input. This won't always be possible, for example with indirect accesses. There's not much we can (easily) do about that though. This fixes a bunch of dEQP interpolateAt* tests that happen to hit this. Maybe you can just add: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.* dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.* to be (a little) more precise? Anyway, this patch is: Reviewed-by: Samuel Pitoiset Signed-off-by: Ilia Mirkin --- .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 59 ++ 1 file changed, 48 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp index 69e1a34..73c824c 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp @@ -2733,24 +2733,61 @@ Converter::handleINTERP(Value *dst[4]) // Check whether the input is linear. All other attributes ignored. Instruction *insn; Value *offset = NULL, *ptr = NULL, *w = NULL; + Symbol *sym[4] = { NULL }; bool linear; operation op; int c, mode; tgsi::Instruction::SrcRegister src = tgsi.getSrc(0); - assert(src.getFile() == TGSI_FILE_INPUT); - if (src.isIndirect(0)) + // In some odd cases, in large part due to varying packing, the source + // might not actually be an input. This is illegal TGSI, but it's easier to + // account for it here than it is to fix it where the TGSI is being + // generated. In that case, it's going to be a straight up mov (or sequence + // of mov's) from the input in question. We follow the mov chain to see + // which input we need to use. + FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) { + if (src.getFile() == TGSI_FILE_INPUT) { + sym[c] = srcToSym(src, c); + continue; + } + Value *val = fetchSrc(0, c); + assert(val->defs.size() == 1); + insn = val->getInsn(); + while (insn->op == OP_MOV) { + assert(insn->getSrc(0)->defs.size() == 1); + insn = insn->getSrc(0)->getInsn(); + assert(insn); + if (!insn) { +// This could happen if there's an indirect situation which caused +// us to move this temp array into local memory. Just bail. +WARN("Miscompiling shader due to unhandled INTERP\n"); +return; + } + } + sym[c] = insn->getSrc(0)->asSym(); + op = insn->op; + mode = insn->ipa; + } + + if (src.isIndirect(0)) { + // In the case where there's varying packing *and* indirect inputs going + // on, we're sunk. + assert(src.getFile() == TGSI_FILE_INPUT); ptr = fetchSrc(src.getIndirect(0), 0, NULL); + } - // XXX: no way to know interp mode if we don't know the index - linear = info->in[ptr ? 0 : src.getIndex(0)].linear; - if (linear) { - op = OP_LINTERP; - mode = NV50_IR_INTERP_LINEAR; - } else { - op = OP_PINTERP; - mode = NV50_IR_INTERP_PERSPECTIVE; + // We can assume that the fixed index will point to an input of the same + // interpolation type in case of an indirect. + if (src.getFile() == TGSI_FILE_INPUT) { + linear = info->in[src.getIndex(0)].linear; + if (linear) { + op = OP_LINTERP; + mode = NV50_IR_INTERP_LINEAR; + } else { + op = OP_PINTERP; + mode = NV50_IR_INTERP_PERSPECTIVE; + } } switch (tgsi.getOpcode()) { @@ -2793,7 +2830,7 @@ Converter::handleINTERP(Value *dst[4]) FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) { - insn = mkOp1(op, TYPE_F32, dst[c], srcToSym(src, c)); + insn = mkOp1(op, TYPE_F32, dst[c], sym[c]); if (op == OP_PINTERP) insn->setSrc(1, w); if (ptr) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/8] nvc0: bind images on fragment and compute shaders for Fermi
On Sat, May 14, 2016 at 9:54 AM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 53 > src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1 + > src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 8 +- > src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 154 > +++- > 4 files changed, 209 insertions(+), 7 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > index bbc8edb..78ce000 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > @@ -258,6 +258,45 @@ nvc0_compute_validate_globals(struct nvc0_context *nvc0) > } > } > > +static inline void > +nvc0_compute_invalidate_surfaces(struct nvc0_context *nvc0, const int s) > +{ > + struct nouveau_pushbuf *push = nvc0->base.pushbuf; > + int i; > + > + for (i = 0; i < NVC0_MAX_IMAGES; ++i) { > + if (s == 5) > + BEGIN_NVC0(push, NVC0_CP(IMAGE(i)), 6); > + else > + BEGIN_NVC0(push, NVC0_3D(IMAGE(i)), 6); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0x14000); > + PUSH_DATA(push, 0); > + } > +} > + > +static void > +nvc0_compute_validate_surfaces(struct nvc0_context *nvc0) > +{ > + /* TODO: Invalidating both 3D and CP surfaces before validating surfaces > for > +* compute is probably not really necessary, but we didn't find any better > +* solutions for now. This fixes some invalidation issues when compute and > +* fragment shaders are used inside the same context. Anyway, we > definitely > +* have invalidation issues between 3D and CP for other resources like > SSBO > +* and atomic counters. */ > + nvc0_compute_invalidate_surfaces(nvc0, 4); > + nvc0_compute_invalidate_surfaces(nvc0, 5); > + > + nvc0_validate_suf(nvc0, 5); > + > + /* Invalidate all FRAGMENT images because they are aliased with COMPUTE. > */ > + nvc0->dirty_3d |= NVC0_NEW_3D_SURFACES; > + nvc0->images_dirty[4] |= nvc0->images_valid[4]; > +} > + > static struct nvc0_state_validate > validate_list_cp[] = { > { nvc0_compprog_validate, NVC0_NEW_CP_PROGRAM }, > @@ -267,6 +306,7 @@ validate_list_cp[] = { > { nvc0_compute_validate_textures, NVC0_NEW_CP_TEXTURES}, > { nvc0_compute_validate_samplers, NVC0_NEW_CP_SAMPLERS}, > { nvc0_compute_validate_globals, NVC0_NEW_CP_GLOBALS }, > + { nvc0_compute_validate_surfaces, NVC0_NEW_CP_SURFACES}, > }; > > static bool > @@ -384,6 +424,19 @@ nvc0_launch_grid(struct pipe_context *pipe, const struct > pipe_grid_info *info) >PUSH_DATA (push, 0x1); > } > > + for (int i = 0; i < NVC0_MAX_IMAGES; ++i) { > + BEGIN_NVC0(push, NVC0_CP(IMAGE(i)), 6); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0); > + PUSH_DATA(push, 0x14000); > + PUSH_DATA(push, 0); > + } > + > + /* TODO: Not sure if this is really necessary. */ > + nvc0_compute_invalidate_surfaces(nvc0, 5); Errr... so you're doing this 2x? Did you mean to get rid of the loop above? > + > /* Invalidate all 3D constbufs because they are aliased with COMPUTE. */ > nvc0->dirty_3d |= NVC0_NEW_3D_CONSTBUF; > for (s = 0; s < 5; s++) { > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > index 7fcbf4a..436e912 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > @@ -323,6 +323,7 @@ extern void nvc0_init_surface_functions(struct > nvc0_context *); > bool nvc0_validate_tic(struct nvc0_context *nvc0, int s); > bool nvc0_validate_tsc(struct nvc0_context *nvc0, int s); > bool nve4_validate_tsc(struct nvc0_context *nvc0, int s); > +void nvc0_validate_suf(struct nvc0_context *nvc0, int s); > void nvc0_validate_textures(struct nvc0_context *); > void nvc0_validate_samplers(struct nvc0_context *); > void nve4_set_tex_handles(struct nvc0_context *); > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c > index 9db45c0..9e214a5 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c > @@ -552,22 +552,18 @@ nvc0_program_translate(struct nvc0_program *prog, > uint16_t chipset, > info->io.texBindBase = NVC0_CB_AUX_TEX_INFO(0); > info->prop.cp.gridInfoBase = NVC0_CB_AUX_GRID_INFO; > info->io.uboInfoBase = NVC0_CB_AUX_UBO_INFO(0); > - info->io.suInfoBase = NVC0_CB_AUX_SU_INFO(0); > - } else { > - info->io.suInfoBase = 0; /* TODO */ >} >info->io.msInfoCBSlot = 0; >info->io.msInfoBase = NVC0_CB_AUX_MS_INFO; >
Re: [Mesa-dev] [PATCH 09/11] tgsi: remove culldist semantic.
On Sat, May 14, 2016 at 10:23 AM, Roland Scheidegger wrote: > Am 14.05.2016 um 14:55 schrieb Marek Olšák: >> Dave, >> It should be noted that clip distances can be disabled by >> pipe_rasterizer_state::clip_plane_enable, but cull distances can't. >> (same as GL) > > That only applies to user clip planes, not shader clip distances. Actually, it applies to both. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/11] tgsi: remove culldist semantic.
Am 14.05.2016 um 14:55 schrieb Marek Olšák: > Dave, > It should be noted that clip distances can be disabled by > pipe_rasterizer_state::clip_plane_enable, but cull distances can't. > (same as GL) That only applies to user clip planes, not shader clip distances. > > Roland, > Our hardware only has 2 vec4 outputs. Each component can be configured > to be "clip distance", "cull distance", or "disabled" independently. > Ok. Roland > > On Sat, May 14, 2016 at 12:43 AM, Roland Scheidegger > wrote: >> Am 13.05.2016 um 23:10 schrieb Dave Airlie: >>> From: Dave Airlie >>> >>> This isn't used anymore in the tree, culldist's >>> are part of the clipdist semantic, we could in theory >>> rename it, but I'm not sure there is much point, and >>> I'd have to be careful with virgl. >>> >>> Signed-off-by: Dave Airlie >>> --- >>> src/gallium/auxiliary/tgsi/tgsi_strings.c | 1 - >>> src/gallium/docs/source/tgsi.rst | 22 ++ >>> src/gallium/include/pipe/p_shader_tokens.h | 1 - >>> 3 files changed, 18 insertions(+), 6 deletions(-) >>> >>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c >>> b/src/gallium/auxiliary/tgsi/tgsi_strings.c >>> index 306ab4f..c13f7ea 100644 >>> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c >>> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c >>> @@ -85,7 +85,6 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] = >>> "PCOORD", >>> "VIEWPORT_INDEX", >>> "LAYER", >>> - "CULLDIST", >>> "SAMPLEID", >>> "SAMPLEPOS", >>> "SAMPLEMASK", >>> diff --git a/src/gallium/docs/source/tgsi.rst >>> b/src/gallium/docs/source/tgsi.rst >>> index 4315707..ab12490 100644 >>> --- a/src/gallium/docs/source/tgsi.rst >>> +++ b/src/gallium/docs/source/tgsi.rst >>> @@ -2876,18 +2876,32 @@ annotated with those semantics. >>> TGSI_SEMANTIC_CLIPDIST >>> "" >>> >>> +Note this covers clipping and culling distances. >>> + >>> When components of vertex elements are identified this way, these >>> values are each assumed to be a float32 signed distance to a plane. >>> + >>> +For clip distances: >>> Primitive setup only invokes rasterization on pixels for which >>> -the interpolated plane distances are >= 0. Multiple clip planes >>> -can be implemented simultaneously, by annotating multiple >>> -components of one or more vertex elements with the above specified >>> -semantic. The limits on both clip and cull distances are bound >>> +the interpolated plane distances are >= 0. >>> + >>> +For cull distances: >>> +Primitives will be completely discarded if the plane distance >>> +for all of the vertices in the primitive are < 0. >>> +If a vertex has a cull distance of NaN, that vertex counts as "out" >>> +(as if its < 0); >>> + >>> +Multiple clip/cull planes can be implemented simultaneously, by >>> +annotating multiple components of one or more vertex elements with >>> +the above specified semantic. >>> +The limits on both clip and cull distances are bound >>> by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines >>> the maximum number of components that can be used to hold the >>> distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT >>> which specifies the maximum number of registers which can be >>> annotated with those semantics. >>> +The properties NUM_CLIPDIST_ENABLED and NUM_CULLDIST_ENABLED >>> +are used to divide up the 2 x vec4 space between clipping and culling. >> This should really say how it's determined which one is which (so clip >> dists come first). >> >> >> You should remove the TGSI_SEMANTIC_CULLDIST section. >> >> For patch 10, shouldn't this work with softpipe too? >> >> Honestly, I'm not a big fan of packed clip and cull dists in the same >> regs (it's still not the same as what d3d10 does in any case), my >> opinion is since we generally don't allow different semantics within the >> same reg, I see no good reason why we allow it here (and clip dists and >> cull dists, albeit somewhat similar, are still different). So, if some >> drivers wanted it in different regs and some in the same regs, I'd >> prefer it to be different regs in the interface, with drivers having to >> merge it when required, just because it looks cleaner. But if really all >> hw wants it like that, 6,8-11 are >> Reviewed-by: Roland Scheidegger >> (But I'd like to hear from other driver's authors.) >> >> Roland >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=CwIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=mWAND3ELitFSIGn3LaQ9eDlEXitrSp5g2LRX0nzGYF8&s=c_Ik7rEVzrYiqaJEZb_A51FunW8lKm-znV3nP6F_Jvc&e= >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50,nvc0: add support for cull distances
From: Tobias Klausmann Cull distances are just a special case of clip distances as far as the hardware is concerned. Make sure that the relevant "planes" are enabled, and flip the clip mode to cull for those. Signed-off-by: Tobias Klausmann [imirkin: add enables on nvc0, add nv50 support] Signed-off-by: Ilia Mirkin --- docs/GL3.txt | 2 +- docs/relnotes/11.3.0.html | 2 +- src/gallium/drivers/nouveau/nv50/nv50_program.c| 9 - src/gallium/drivers/nouveau/nv50/nv50_program.h| 3 +++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +- src/gallium/drivers/nouveau/nv50/nv50_screen.h | 1 + src/gallium/drivers/nouveau/nv50/nv50_shader_state.c | 5 +++-- src/gallium/drivers/nouveau/nv50/nv50_state_validate.c | 15 +++ src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 5 +++-- src/gallium/drivers/nouveau/nvc0/nvc0_program.h| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 1 + 12 files changed, 35 insertions(+), 13 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 5e49c57..b8b4361 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -211,7 +211,7 @@ GL 4.5, GLSL 4.50: GL_ARB_ES3_1_compatibilitynot started GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe) GL_ARB_conditional_render_invertedDONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe) - GL_ARB_cull_distance DONE (i965) + GL_ARB_cull_distance DONE (i965, nv50, nvc0) GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600, radeonsi) GL_ARB_direct_state_accessDONE (all drivers) GL_ARB_get_texture_sub_image DONE (all drivers) diff --git a/docs/relnotes/11.3.0.html b/docs/relnotes/11.3.0.html index 6a964f2..f456c0e 100644 --- a/docs/relnotes/11.3.0.html +++ b/docs/relnotes/11.3.0.html @@ -46,7 +46,7 @@ Note: some of the new features are only available with certain drivers. OpenGL 4.2 on radeonsi GL_ARB_compute_shader on radeonsi, softpipe -GL_ARB_cull_distance on i965/gen6+ +GL_ARB_cull_distance on i965/gen6+, nv50, nvc0 GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe GL_ARB_internalformat_query2 on all drivers GL_ARB_query_buffer_object on i965/hsw+ diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c index 89db67f..648cb73 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c @@ -319,7 +319,7 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset, struct pipe_debug_callback *debug) { struct nv50_ir_prog_info *info; - int ret; + int i, ret; const uint8_t map_undef = (prog->type == PIPE_SHADER_VERTEX) ? 0x40 : 0x80; info = CALLOC_STRUCT(nv50_ir_prog_info); @@ -378,6 +378,13 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset, prog->vp.need_vertex_id = info->io.vertexId < PIPE_MAX_SHADER_INPUTS; + prog->vp.clip_enable = (1 << info->io.clipDistances) - 1; + prog->vp.cull_enable = + ((1 << info->io.cullDistances) - 1) << info->io.clipDistances; + prog->vp.clip_mode = 0; + for (i = 0; i < info->io.cullDistances; ++i) + prog->vp.clip_mode |= 1 << ((info->io.clipDistances + i) * 4); + if (prog->type == PIPE_SHADER_FRAGMENT) { if (info->prop.fp.writesDepth) { prog->fp.flags[0] |= NV50_3D_FP_CONTROL_EXPORTS_Z; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.h b/src/gallium/drivers/nouveau/nv50/nv50_program.h index 1de5122..0a22e5b 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_program.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.h @@ -79,6 +79,9 @@ struct nv50_program { ubyte clpd[2]; /* output slot of clip distance[i]'s 1st component */ ubyte clpd_nr; bool need_vertex_id; + uint32_t clip_mode; + uint8_t clip_enable; /* mask of defined clip planes */ + uint8_t cull_enable; /* mask of defined cull distances */ } vp; struct { diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index 0912150..fa2493c 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -195,6 +195,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL: case PIPE_CAP_INVALIDATE_BUFFER: case PIPE_CAP_STRING_MARKER: + case PIPE_CAP_CULL_DISTANCE: return 1; case PIPE_CAP_SEAMLESS_CUBE_MAP: return 1; /* class_3d >= NVA0_3D_CLASS; */ @
[Mesa-dev] [PATCH] st/mesa: disable cull distance for now
The pass that st/mesa relies on to combine clip and cull distances has been reverted, so we can't expose ARB_cull_distance until that is resolved. Signed-off-by: Ilia Mirkin --- src/mesa/state_tracker/st_extensions.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 4b9a3bd..ea60e41 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -574,7 +574,7 @@ void st_init_extensions(struct pipe_screen *screen, { o(ARB_color_buffer_float), PIPE_CAP_VERTEX_COLOR_UNCLAMPED }, { o(ARB_conditional_render_inverted), PIPE_CAP_CONDITIONAL_RENDER_INVERTED }, { o(ARB_copy_image), PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS }, - { o(ARB_cull_distance),PIPE_CAP_CULL_DISTANCE }, + //{ o(ARB_cull_distance),PIPE_CAP_CULL_DISTANCE }, { o(ARB_depth_clamp), PIPE_CAP_DEPTH_CLIP_DISABLE }, { o(ARB_depth_texture),PIPE_CAP_TEXTURE_SHADOW_MAP }, { o(ARB_derivative_control), PIPE_CAP_TGSI_FS_FINE_DERIVATIVE }, -- 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] nvc0/ir: add emission for OP_SULEA
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 58 ++ 1 file changed, 58 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index 14f4be4..f7bdc19 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp @@ -63,6 +63,8 @@ private: void emitInterpMode(const Instruction *); void emitLoadStoreType(DataType ty); void emitSUGType(DataType); + void emitSUAddr(const TexInstruction *); + void emitSUDim(const TexInstruction *); void emitCachingMode(CacheMode c); void emitShortSrc2(const ValueRef&); @@ -137,6 +139,8 @@ private: void emitSULDGB(const TexInstruction *); void emitSUSTGx(const TexInstruction *); + void emitSULEA(const TexInstruction *); + void emitVSHL(const Instruction *); void emitVectorSubOp(const Instruction *); @@ -2285,6 +2289,57 @@ CodeEmitterNVC0::emitSUSTGx(const TexInstruction *i) } void +CodeEmitterNVC0::emitSUAddr(const TexInstruction *i) +{ + assert(targ->getChipset() < NVISA_GK104_CHIPSET); + + if (i->tex.rIndirectSrc < 0) { + code[1] |= 0x4000; + code[0] |= i->tex.r << 26; + } else { + srcId(i, i->tex.rIndirectSrc, 26); + } +} + +void +CodeEmitterNVC0::emitSUDim(const TexInstruction *i) +{ + assert(targ->getChipset() < NVISA_GK104_CHIPSET); + + code[1] |= (i->tex.target.getDim() - 1) << 12; + if (i->tex.target.isArray() || i->tex.target.isCube() || + i->tex.target.getDim() == 3) { + // use e2d mode for 3-dim images, arrays and cubes. + code[1] |= 3 << 12; + } + + srcId(i->src(0), 20); +} + +void +CodeEmitterNVC0::emitSULEA(const TexInstruction *i) +{ + assert(targ->getChipset() < NVISA_GK104_CHIPSET); + + code[0] = 0x5; + code[1] = 0xf000; + + emitPredicate(i); + emitLoadStoreType(i->sType); + + defId(i->def(0), 14); + + if (i->defExists(1)) { + defId(i->def(1), 32 + 22); + } else { + code[1] |= 7 << 22; + } + + emitSUAddr(i); + emitSUDim(i); +} + +void CodeEmitterNVC0::emitVectorSubOp(const Instruction *i) { switch (NV50_IR_SUBOP_Vn(i->subOp)) { @@ -2579,6 +2634,9 @@ CodeEmitterNVC0::emitInstruction(Instruction *insn) else ERROR("SUSTx not yet supported on < nve4\n"); break; + case OP_SULEA: + emitSULEA(insn->asTex()); + break; case OP_ATOM: emitATOM(insn); break; -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] nvc0: bind images on fragment and compute shaders for Fermi
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 53 src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 8 +- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 154 +++- 4 files changed, 209 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c index bbc8edb..78ce000 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c @@ -258,6 +258,45 @@ nvc0_compute_validate_globals(struct nvc0_context *nvc0) } } +static inline void +nvc0_compute_invalidate_surfaces(struct nvc0_context *nvc0, const int s) +{ + struct nouveau_pushbuf *push = nvc0->base.pushbuf; + int i; + + for (i = 0; i < NVC0_MAX_IMAGES; ++i) { + if (s == 5) + BEGIN_NVC0(push, NVC0_CP(IMAGE(i)), 6); + else + BEGIN_NVC0(push, NVC0_3D(IMAGE(i)), 6); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0x14000); + PUSH_DATA(push, 0); + } +} + +static void +nvc0_compute_validate_surfaces(struct nvc0_context *nvc0) +{ + /* TODO: Invalidating both 3D and CP surfaces before validating surfaces for +* compute is probably not really necessary, but we didn't find any better +* solutions for now. This fixes some invalidation issues when compute and +* fragment shaders are used inside the same context. Anyway, we definitely +* have invalidation issues between 3D and CP for other resources like SSBO +* and atomic counters. */ + nvc0_compute_invalidate_surfaces(nvc0, 4); + nvc0_compute_invalidate_surfaces(nvc0, 5); + + nvc0_validate_suf(nvc0, 5); + + /* Invalidate all FRAGMENT images because they are aliased with COMPUTE. */ + nvc0->dirty_3d |= NVC0_NEW_3D_SURFACES; + nvc0->images_dirty[4] |= nvc0->images_valid[4]; +} + static struct nvc0_state_validate validate_list_cp[] = { { nvc0_compprog_validate, NVC0_NEW_CP_PROGRAM }, @@ -267,6 +306,7 @@ validate_list_cp[] = { { nvc0_compute_validate_textures, NVC0_NEW_CP_TEXTURES}, { nvc0_compute_validate_samplers, NVC0_NEW_CP_SAMPLERS}, { nvc0_compute_validate_globals, NVC0_NEW_CP_GLOBALS }, + { nvc0_compute_validate_surfaces, NVC0_NEW_CP_SURFACES}, }; static bool @@ -384,6 +424,19 @@ nvc0_launch_grid(struct pipe_context *pipe, const struct pipe_grid_info *info) PUSH_DATA (push, 0x1); } + for (int i = 0; i < NVC0_MAX_IMAGES; ++i) { + BEGIN_NVC0(push, NVC0_CP(IMAGE(i)), 6); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0); + PUSH_DATA(push, 0x14000); + PUSH_DATA(push, 0); + } + + /* TODO: Not sure if this is really necessary. */ + nvc0_compute_invalidate_surfaces(nvc0, 5); + /* Invalidate all 3D constbufs because they are aliased with COMPUTE. */ nvc0->dirty_3d |= NVC0_NEW_3D_CONSTBUF; for (s = 0; s < 5; s++) { diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h index 7fcbf4a..436e912 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h @@ -323,6 +323,7 @@ extern void nvc0_init_surface_functions(struct nvc0_context *); bool nvc0_validate_tic(struct nvc0_context *nvc0, int s); bool nvc0_validate_tsc(struct nvc0_context *nvc0, int s); bool nve4_validate_tsc(struct nvc0_context *nvc0, int s); +void nvc0_validate_suf(struct nvc0_context *nvc0, int s); void nvc0_validate_textures(struct nvc0_context *); void nvc0_validate_samplers(struct nvc0_context *); void nve4_set_tex_handles(struct nvc0_context *); diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c index 9db45c0..9e214a5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c @@ -552,22 +552,18 @@ nvc0_program_translate(struct nvc0_program *prog, uint16_t chipset, info->io.texBindBase = NVC0_CB_AUX_TEX_INFO(0); info->prop.cp.gridInfoBase = NVC0_CB_AUX_GRID_INFO; info->io.uboInfoBase = NVC0_CB_AUX_UBO_INFO(0); - info->io.suInfoBase = NVC0_CB_AUX_SU_INFO(0); - } else { - info->io.suInfoBase = 0; /* TODO */ } info->io.msInfoCBSlot = 0; info->io.msInfoBase = NVC0_CB_AUX_MS_INFO; info->io.bufInfoBase = NVC0_CB_AUX_BUF_INFO(0); + info->io.suInfoBase = NVC0_CB_AUX_SU_INFO(0); } else { if (chipset >= NVISA_GK104_CHIPSET) { info->io.texBindBase = NVC0_CB_AUX_TEX_INFO(0); - info->io.suInfoBase = NVC0_CB_AUX_SU_INFO(0); - } else { - info->io.suInfoBase = 0; /* TODO */ } info->io.sampleInfoBase =
[Mesa-dev] [PATCH 3/8] nv50/ir: fix tex constraints for surface coords on Fermi
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index 27883a0..b893996 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp @@ -2154,6 +2154,9 @@ RegAlloc::InsertConstraintsPass::texConstraintNVC0(TexInstruction *tex) if (tex->op == OP_TXQ) { s = tex->srcCount(0xff); n = 0; + } else if (isSurfaceOp(tex->op)) { + s = tex->tex.target.getDim() + (tex->tex.target.isArray() || tex->tex.target.isCube()); + n = tex->srcCount(0xff) - s; } else { s = tex->tex.target.getArgCount() - tex->tex.target.isMS(); if (!tex->tex.target.isArray() && -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] nvc0/ir: add a lowering pass for surfaces on Fermi
Signed-off-by: Samuel Pitoiset --- .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 117 + .../nouveau/codegen/nv50_ir_lowering_nvc0.h| 2 + 2 files changed, 119 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 1068c21..002f09d 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp @@ -1982,6 +1982,121 @@ NVC0LoweringPass::handleSurfaceOpNVE4(TexInstruction *su) su->sType = (su->tex.target == TEX_TARGET_BUFFER) ? TYPE_U32 : TYPE_U8; } +void +NVC0LoweringPass::processSurfaceCoordsNVC0(TexInstruction *su) +{ + const int idx = su->tex.r; + const int dim = su->tex.target.getDim(); + const int arg = dim + (su->tex.target.isArray() || su->tex.target.isCube()); + const uint16_t base = idx * NVE4_SU_INFO__STRIDE; + int c; + Value *zero = bld.mkImm(0); + Value *src[3]; + Value *v; + Value *ind = NULL; + + if (su->tex.rIndirectSrc >= 0) { + // FIXME: out of bounds + assert(su->tex.r == 0); + ind = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), + su->getIndirectR(), bld.mkImm(6)); + } + + // get surface coordinates + for (c = 0; c < arg; ++c) + src[c] = su->getSrc(c); + for (; c < 3; ++c) + src[c] = zero; + + // calculate pixel offset + if (su->op == OP_SULDP || su->op == OP_SUREDP) { + v = loadSuInfo32(ind, base + NVE4_SU_INFO_BSIZE); + su->setSrc(0, bld.mkOp2v(OP_MUL, TYPE_U32, bld.getSSA(), src[0], v)); + } + + // add array layer offset + if (su->tex.target.isArray() || su->tex.target.isCube()) { + v = loadSuInfo32(ind, base + NVE4_SU_INFO_ARRAY); + assert(dim > 1); + su->setSrc(2, bld.mkOp2v(OP_MUL, TYPE_U32, bld.getSSA(), src[2], v)); + } + + // prevent read fault when the image is not actually bound + CmpInstruction *pred = + bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, bld.getSSA(1, FILE_PREDICATE), +TYPE_U32, bld.mkImm(0), +loadSuInfo32(ind, base + NVE4_SU_INFO_ADDR)); + if (su->tex.format) { + const TexInstruction::ImgFormatDesc *format = su->tex.format; + int blockwidth = format->bits[0] + format->bits[1] + + format->bits[2] + format->bits[3]; + + if (blockwidth >= 8) { + // make sure that the format doesn't mismatch + bld.mkCmp(OP_SET_OR, CC_NE, TYPE_U32, pred->getDef(0), + TYPE_U32, bld.loadImm(NULL, blockwidth / 8), + loadSuInfo32(ind, base + NVE4_SU_INFO_BSIZE), + pred->getDef(0)); + } + } + su->setPredicate(CC_NOT_P, pred->getDef(0)); +} + +void +NVC0LoweringPass::handleSurfaceOpNVC0(TexInstruction *su) +{ + if (su->tex.target == TEX_TARGET_1D_ARRAY) { + /* As 1d arrays also need 3 coordinates, switching to TEX_TARGET_2D_ARRAY + * will simplify the lowering pass and the texture constraints. */ + su->moveSources(1, 1); + su->setSrc(2, su->getSrc(1)); + su->setSrc(1, bld.loadImm(NULL, 0)); + su->tex.target = TEX_TARGET_2D_ARRAY; + } + + processSurfaceCoordsNVC0(su); + + if (su->op == OP_SULDP) + convertSurfaceFormat(su); + + if (su->op == OP_SUREDB || su->op == OP_SUREDP) { + const int dim = su->tex.target.getDim(); + const int arg = dim + (su->tex.target.isArray() || su->tex.target.isCube()); + LValue *addr = bld.getSSA(8); + Value *def = su->getDef(0); + + su->op = OP_SULEA; + + // Set the destination to the address + su->dType = TYPE_U64; + su->setDef(0, addr); + su->setDef(1, su->getPredicate()); + + bld.setPosition(su, true); + + // Perform the atomic op + Instruction *red = bld.mkOp(OP_ATOM, su->sType, bld.getSSA()); + red->subOp = su->subOp; + red->setSrc(0, bld.mkSymbol(FILE_MEMORY_GLOBAL, 0, su->sType, 0)); + red->setSrc(1, su->getSrc(arg)); + if (red->subOp == NV50_IR_SUBOP_ATOM_CAS) + red->setSrc(2, su->getSrc(arg + 1)); + red->setIndirect(0, 0, addr); + + // make sure to initialize dst value when the atomic operation is not + // performed + Instruction *mov = bld.mkMov(bld.getSSA(), bld.loadImm(NULL, 0)); + + assert(su->cc == CC_NOT_P); + red->setPredicate(su->cc, su->getPredicate()); + mov->setPredicate(CC_P, su->getPredicate()); + + bld.mkOp2(OP_UNION, TYPE_U32, def, red->getDef(0), mov->getDef(0)); + + handleCasExch(red, false); + } +} + bool NVC0LoweringPass::handleWRSV(Instruction *i) { @@ -2455,6 +2570,8 @@ NVC0LoweringPass::visit(Instruction *i) case OP_SUREDP: if (targ->getChipset() >= NVISA_GK104_CHIPSET) handleSurfaceOpNVE4(i->asTex()); + else + handleSurfaceOpNVC0(i->asTex()); break; case OP_SUQ: handleSUQ(i->asTex()); diff --git a/src/gallium/drivers/nou
[Mesa-dev] [PATCH 5/8] nvc0/ir: add emission for SULDB and SUSTx
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 46 +- 1 file changed, 44 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index f7bdc19..596293e 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp @@ -139,6 +139,8 @@ private: void emitSULDGB(const TexInstruction *); void emitSUSTGx(const TexInstruction *); + void emitSULDB(const TexInstruction *); + void emitSUSTx(const TexInstruction *); void emitSULEA(const TexInstruction *); void emitVSHL(const Instruction *); @@ -2340,6 +2342,46 @@ CodeEmitterNVC0::emitSULEA(const TexInstruction *i) } void +CodeEmitterNVC0::emitSULDB(const TexInstruction *i) +{ + assert(targ->getChipset() < NVISA_GK104_CHIPSET); + + code[0] = 0x5; + code[1] = 0xd400 | (i->subOp << 15); + + emitPredicate(i); + emitLoadStoreType(i->dType); + + defId(i->def(0), 14); + + emitCachingMode(i->cache); + emitSUAddr(i); + emitSUDim(i); +} + +void +CodeEmitterNVC0::emitSUSTx(const TexInstruction *i) +{ + assert(targ->getChipset() < NVISA_GK104_CHIPSET); + + code[0] = 0x5; + code[1] = 0xdc00 | (i->subOp << 15); + + if (i->op == OP_SUSTP) + code[1] |= i->tex.mask << 17; + else + emitLoadStoreType(i->dType); + + emitPredicate(i); + + srcId(i->src(1), 14); + + emitCachingMode(i->cache); + emitSUAddr(i); + emitSUDim(i); +} + +void CodeEmitterNVC0::emitVectorSubOp(const Instruction *i) { switch (NV50_IR_SUBOP_Vn(i->subOp)) { @@ -2625,14 +2667,14 @@ CodeEmitterNVC0::emitInstruction(Instruction *insn) if (targ->getChipset() >= NVISA_GK104_CHIPSET) emitSULDGB(insn->asTex()); else - ERROR("SULDB not yet supported on < nve4\n"); + emitSULDB(insn->asTex()); break; case OP_SUSTB: case OP_SUSTP: if (targ->getChipset() >= NVISA_GK104_CHIPSET) emitSUSTGx(insn->asTex()); else - ERROR("SUSTx not yet supported on < nve4\n"); + emitSUSTx(insn->asTex()); break; case OP_SULEA: emitSULEA(insn->asTex()); -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/8] nvc0: enable ARB_shader_image_load_store on GF100
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index eaf9c78..bd68ca9 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c @@ -376,6 +376,9 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader, case PIPE_SHADER_CAP_MAX_SHADER_IMAGES: if (class_3d == NVE4_3D_CLASS || class_3d == NVF0_3D_CLASS) return NVC0_MAX_IMAGES; + if (class_3d < NVE4_3D_CLASS) + if (shader == PIPE_SHADER_FRAGMENT || shader == PIPE_SHADER_COMPUTE) +return NVC0_MAX_IMAGES; return 0; default: NOUVEAU_ERR("unknown PIPE_SHADER_CAP %d\n", param); -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] nvc0: expose GLSL version 420 on GF100
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index bd68ca9..40e5a9d 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c @@ -120,7 +120,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE: return 128 * 1024 * 1024; case PIPE_CAP_GLSL_FEATURE_LEVEL: - if (class_3d == NVE4_3D_CLASS || class_3d == NVF0_3D_CLASS) + if (class_3d <= NVF0_3D_CLASS) return 420; return 410; case PIPE_CAP_MAX_RENDER_TARGETS: -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/8] nvc0: expose OpenGL 4.2 on Fermi
Hi there, This series implements both ARB_shader_image_load_store (GL 4.2) and ARB_shader_image_size (GL 4.3) which allows us to enable OpenGL 4.2 on Fermi GPUS. (GL3.txt won't be updated until images are also implemented on Maxwell) 3D images are fully not supported because we don't think they are used in real applications and because it's a bit tricky to do. Anyway this could be implemented with a separate series later if we really need them. Except 3d images, we have exactly the same passrate as Kepler. Next step is to implement images on Maxwell GPUs but this won't be ready for the next release. As usual, the list of dEQP/piglit fails is listed below. Please review, Thanks! Ilia Mirkin (1): nv50/ir: use moveSources to condense sources Samuel Pitoiset (7): nvc0: bind images on fragment and compute shaders for Fermi nv50/ir: fix tex constraints for surface coords on Fermi nvc0/ir: add emission for OP_SULEA nvc0/ir: add emission for SULDB and SUSTx nvc0/ir: add a lowering pass for surfaces on Fermi nvc0: enable ARB_shader_image_load_store on GF100 nvc0: expose GLSL version 420 on GF100 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 104 +- .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 117 .../nouveau/codegen/nv50_ir_lowering_nvc0.h| 2 + src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 53 +++ src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 8 +- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 5 +- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c| 154 - 9 files changed, 438 insertions(+), 16 deletions(-) -- 2.8.2 ** dEQP ** deqp-gles31/functional/image_load_store/3d/atomic/add_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/add_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/add_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/add_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/and_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/and_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/and_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/and_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/comp_swap_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/comp_swap_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/comp_swap_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/comp_swap_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32f_result: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32f_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/exchange_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/max_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/max_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/max_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/max_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/min_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/min_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/min_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/min_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/or_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/or_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/or_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/or_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/xor_r32i_result: fail deqp-gles31/functional/image_load_store/3d/atomic/xor_r32i_return_value: fail deqp-gles31/functional/image_load_store/3d/atomic/xor_r32ui_result: fail deqp-gles31/functional/image_load_store/3d/atomic/xor_r32ui_return_value: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_r32i: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_r32ui: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8_snorm: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8i: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8ui: fail deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32i_r32f: fail deqp-gles31/fu
[Mesa-dev] [PATCH 2/8] nv50/ir: use moveSources to condense sources
From: Ilia Mirkin This makes sure that rIndirectSrc and other things stay updated. Signed-off-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index 7e8bb17..27883a0 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp @@ -2073,14 +2073,9 @@ RegAlloc::InsertConstraintsPass::condenseSrcs(Instruction *insn, merge->setDef(0, lval); for (int s = a, i = 0; s <= b; ++s, ++i) { merge->setSrc(i, insn->getSrc(s)); - insn->setSrc(s, NULL); } + insn->moveSources(b + 1, a - b); insn->setSrc(a, lval); - - for (int k = a + 1, s = b + 1; insn->srcExists(s); ++s, ++k) { - insn->setSrc(k, insn->getSrc(s)); - insn->setSrc(s, NULL); - } insn->bb->insertBefore(insn, merge); insn->putExtraSources(0, save); -- 2.8.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fix undefined df bits in brw_reg comparisons.
Hi Ken, On 14 May 2016 at 01:44, Kenneth Graunke wrote: > Commit 5310bca024f77da40ea6f4c275455f9cb0528f9e added a new "double df" > field to the brw_reg struct, adding an extra 4 bytes of data that isn't > usually initialized (or may contain irrelevant garbage if the struct is > mutated). This means that it's no longer safe to memcmp(). > > Instead, add a brw_regs_equal() function which ignores the extra df bits > unless they matter. To keep the implementation cheap, we wrap the first > set of fields in a union/struct so that we can use a single DWord > comparison. > This seems to be roughly what I did a while ago [1], as part of a series [2] to remove the memclear/memcmp in {brw,fs,src,dst}_reg. Shame the series never got much input, even after a few pings :-( -Emil [1] https://patchwork.freedesktop.org/patch/63840/ [2] https://patchwork.freedesktop.org/series/483/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/11] tgsi: remove culldist semantic.
Dave, It should be noted that clip distances can be disabled by pipe_rasterizer_state::clip_plane_enable, but cull distances can't. (same as GL) Roland, Our hardware only has 2 vec4 outputs. Each component can be configured to be "clip distance", "cull distance", or "disabled" independently. Marek On Sat, May 14, 2016 at 12:43 AM, Roland Scheidegger wrote: > Am 13.05.2016 um 23:10 schrieb Dave Airlie: >> From: Dave Airlie >> >> This isn't used anymore in the tree, culldist's >> are part of the clipdist semantic, we could in theory >> rename it, but I'm not sure there is much point, and >> I'd have to be careful with virgl. >> >> Signed-off-by: Dave Airlie >> --- >> src/gallium/auxiliary/tgsi/tgsi_strings.c | 1 - >> src/gallium/docs/source/tgsi.rst | 22 ++ >> src/gallium/include/pipe/p_shader_tokens.h | 1 - >> 3 files changed, 18 insertions(+), 6 deletions(-) >> >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c >> b/src/gallium/auxiliary/tgsi/tgsi_strings.c >> index 306ab4f..c13f7ea 100644 >> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c >> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c >> @@ -85,7 +85,6 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] = >> "PCOORD", >> "VIEWPORT_INDEX", >> "LAYER", >> - "CULLDIST", >> "SAMPLEID", >> "SAMPLEPOS", >> "SAMPLEMASK", >> diff --git a/src/gallium/docs/source/tgsi.rst >> b/src/gallium/docs/source/tgsi.rst >> index 4315707..ab12490 100644 >> --- a/src/gallium/docs/source/tgsi.rst >> +++ b/src/gallium/docs/source/tgsi.rst >> @@ -2876,18 +2876,32 @@ annotated with those semantics. >> TGSI_SEMANTIC_CLIPDIST >> "" >> >> +Note this covers clipping and culling distances. >> + >> When components of vertex elements are identified this way, these >> values are each assumed to be a float32 signed distance to a plane. >> + >> +For clip distances: >> Primitive setup only invokes rasterization on pixels for which >> -the interpolated plane distances are >= 0. Multiple clip planes >> -can be implemented simultaneously, by annotating multiple >> -components of one or more vertex elements with the above specified >> -semantic. The limits on both clip and cull distances are bound >> +the interpolated plane distances are >= 0. >> + >> +For cull distances: >> +Primitives will be completely discarded if the plane distance >> +for all of the vertices in the primitive are < 0. >> +If a vertex has a cull distance of NaN, that vertex counts as "out" >> +(as if its < 0); >> + >> +Multiple clip/cull planes can be implemented simultaneously, by >> +annotating multiple components of one or more vertex elements with >> +the above specified semantic. >> +The limits on both clip and cull distances are bound >> by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines >> the maximum number of components that can be used to hold the >> distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT >> which specifies the maximum number of registers which can be >> annotated with those semantics. >> +The properties NUM_CLIPDIST_ENABLED and NUM_CULLDIST_ENABLED >> +are used to divide up the 2 x vec4 space between clipping and culling. > This should really say how it's determined which one is which (so clip > dists come first). > > > You should remove the TGSI_SEMANTIC_CULLDIST section. > > For patch 10, shouldn't this work with softpipe too? > > Honestly, I'm not a big fan of packed clip and cull dists in the same > regs (it's still not the same as what d3d10 does in any case), my > opinion is since we generally don't allow different semantics within the > same reg, I see no good reason why we allow it here (and clip dists and > cull dists, albeit somewhat similar, are still different). So, if some > drivers wanted it in different regs and some in the same regs, I'd > prefer it to be different regs in the interface, with drivers having to > merge it when required, just because it looks cleaner. But if really all > hw wants it like that, 6,8-11 are > Reviewed-by: Roland Scheidegger > (But I'd like to hear from other driver's authors.) > > Roland > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 15/30] i965/fs: support doubles with UBO loads
On 14/05/16 01:16, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez writes: > >> From: Iago Toral Quiroga >> >> UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD >> instruction, which reads 16 bytes (a vec4) of data from memory. For dvec >> types this only provides components x and y. Thus, if we are reading >> more than 2 components we need to issue a second load at offset+16 to >> read the next 16-byte chunk with components w and z. >> >> UBO loads with non-constant offset emit a load for each component >> in the vector (and rely in CSE to fix redundant loads), so we only >> need to consider the size of the data type when computing the offset >> of each element in a vector. >> >> v2 (Sam): >> - Adapt the code to use component() (Curro). >> >> Signed-off-by: Samuel Iglesias Gonsálvez >> Reviewed-by: Kenneth Graunke >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 >> +++- >> 1 file changed, 45 insertions(+), 7 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index 2d57fd3..02f1e81 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -3362,6 +3362,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, >> nir_intrinsic_instr *instr >> nir->info.num_ubos - 1); >>} >> >> + /* Number of 32-bit slots in the type */ >> + unsigned type_slots = MAX2(1, type_sz(dest.type) / 4); >> + >>nir_const_value *const_offset = nir_src_as_const_value(instr->src[1]); >>if (const_offset == NULL) { >> fs_reg base_offset = retype(get_nir_src(instr->src[1]), >> @@ -3369,19 +3372,54 @@ fs_visitor::nir_emit_intrinsic(const fs_builder >> &bld, nir_intrinsic_instr *instr >> >> for (int i = 0; i < instr->num_components; i++) >> VARYING_PULL_CONSTANT_LOAD(bld, offset(dest, bld, i), >> surf_index, >> - base_offset, i * 4); >> + base_offset, i * 4 * type_slots); > > Why not 'i * type_sz(...)'? As before it seems like type_slots is just > going to introduce rounding errors here for no benefit? > Right, I will fix it. >>} else { >> + /* Even if we are loading doubles, a pull constant load will load >> + * a 32-bit vec4, so should only reserve vgrf space for that. If we >> + * need to load a full dvec4 we will have to emit 2 loads. This is >> + * similar to demote_pull_constants(), except that in that case we >> + * see individual accesses to each component of the vector and then >> + * we let CSE deal with duplicate loads. Here we see a vector >> access >> + * and we have to split it if necessary. >> + */ >> fs_reg packed_consts = vgrf(glsl_type::float_type); >> packed_consts.type = dest.type; >> >> - struct brw_reg const_offset_reg = brw_imm_ud(const_offset->u32[0] >> & ~15); >> - bld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD, packed_consts, >> - surf_index, const_offset_reg); >> + unsigned const_offset_aligned = const_offset->u32[0] & ~15; >> + >> + /* A vec4 only contains half of a dvec4, if we need more than 2 >> + * components of a dvec4 we will have to issue another load for >> + * components z and w >> + */ >> + int num_components; >> + if (type_slots == 1) >> +num_components = instr->num_components; >> + else >> +num_components = MIN2(2, instr->num_components); >> >> - const fs_reg consts = byte_offset(packed_consts, >> const_offset->u32[0] % 16); >> + int remaining_components = instr->num_components; >> + while (remaining_components > 0) { >> +/* Read the vec4 from a 16-byte aligned offset */ >> +struct brw_reg const_offset_reg = >> brw_imm_ud(const_offset_aligned); >> +bld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD, >> + retype(packed_consts, BRW_REGISTER_TYPE_F), >> + surf_index, const_offset_reg); >> >> - for (unsigned i = 0; i < instr->num_components; i++) >> -bld.MOV(offset(dest, bld, i), component(consts, i)); >> +const fs_reg consts = byte_offset(packed_consts, >> (const_offset->u32[0] % 16)); > > This looks really fishy to me, if the initial offset is not 16B aligned > you'll apply the same sub-16B offset to the result from each one of the > subsequent pull constant loads. This cannot happen thanks to the layout alignment rules, see below. > Also you don't seem to take into > account whether the initial offset is misaligned in the calculation of > num_components -- If it is it looks like the first pull constant load > could return less than "num_components" usable components and you wou
Re: [Mesa-dev] [PATCH 13/18] winsys/amdgpu: start with smaller IBs, growing as necessary
On Tue, May 10, 2016 at 1:21 AM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > This avoids allocating giant IBs from the outset, especially for CE and DMA. > > With this change, we also never flush prematurely due to the CE IB: as long > as there is space in the buffer, we will use it. > --- > src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 55 > +-- > src/gallium/winsys/amdgpu/drm/amdgpu_cs.h | 1 + > 2 files changed, 46 insertions(+), 10 deletions(-) > > diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c > b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c > index a318670..546f224 100644 > --- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c > +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c > @@ -335,11 +335,31 @@ static unsigned amdgpu_cs_add_buffer(struct > radeon_winsys_cs *rcs, > return index; > } > > -static bool amdgpu_ib_new_buffer(struct radeon_winsys *ws, struct amdgpu_ib > *ib, > - unsigned buffer_size) > +static bool amdgpu_ib_new_buffer(struct radeon_winsys *ws, struct amdgpu_ib > *ib) > { > struct pb_buffer *pb; > uint8_t *mapped; > + unsigned buffer_size; > + > + /* Always create a buffer that is at least as large as the largest IB > +* seen so far (multiplied by a factor to reduce internal fragmentation), > +* but never more than the maximum IB size supported by the hardware. > +*/ > + buffer_size = 4 << MIN2(19, 2 + util_last_bit(ib->max_ib_size)); Would you please use something more readable? I think it's equal or very similar to this expression: MIN2(2 * 1024 * 1024, 4 * 4 * util_next_power_of_two(ib->max_ib_size)) And a comment explaining those numbers would be useful. For example, "2MB is the maximum IB size allowed by the winsys" (I think the hw limit is 4 MB actually) "and we always allocate 4 times more space than the maximum seen IB size aligned to 2^n". > + > + switch (ib->ib_type) { > + case IB_CONST_PREAMBLE: > + buffer_size = MAX2(buffer_size, 4 * 1024); > + break; > + case IB_CONST: > + buffer_size = MAX2(buffer_size, 16 * 1024 * 4); > + break; > + case IB_MAIN: > + buffer_size = MAX2(buffer_size, 8 * 1024 * 4); > + break; > + default: > + unreachable("unhandled IB type"); > + } > > pb = ws->buffer_create(ws, buffer_size, 4096, RADEON_DOMAIN_GTT, >RADEON_FLAG_CPU_ACCESS); > @@ -370,35 +390,34 @@ static bool amdgpu_get_new_ib(struct radeon_winsys *ws, > struct amdgpu_cs *cs, > */ > struct amdgpu_ib *ib = NULL; > struct amdgpu_cs_ib_info *info = &cs->csc->ib[ib_type]; > - unsigned buffer_size, ib_size; > + unsigned ib_size = 0; > > switch (ib_type) { > case IB_CONST_PREAMBLE: >ib = &cs->const_preamble_ib; > - buffer_size = 4 * 1024 * 4; > - ib_size = 1024 * 4; > + ib_size = 256 * 4; >break; > case IB_CONST: >ib = &cs->const_ib; > - buffer_size = 512 * 1024 * 4; > - ib_size = 128 * 1024 * 4; > + ib_size = 8 * 1024 * 4; >break; > case IB_MAIN: >ib = &cs->main; > - buffer_size = 128 * 1024 * 4; > - ib_size = 20 * 1024 * 4; > + ib_size = 4 * 1024 * 4; >break; > default: >unreachable("unhandled IB type"); > } > > + ib_size = MAX2(ib_size, 4 << MIN2(19, util_last_bit(ib->max_ib_size))); This is an expression similar to the one above. Some unification would be nice. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 11.2.2 problems with Intel i965 graphics on Arch Linux
Hi all, I'm sorry if this is the wrong place to post this. Upgrading from mesa 11.2.1 to 11.2.2 on Arch Linux results in several programs not working. I am getting the following errors when launching Paraview for example, libGL error: unable to load driver: i965_dri.so libGL error: driver pointer missing libGL error: failed to load driver: i965 libGL error: unable to load driver: swrast_dri.so libGL error: failed to load driver: swrast Both files exist on my system, /usr/lib/xorg/modules/dri/i965_dri.so /usr/lib/xorg/modules/dri/swrast_dri.so I am not sure if this is a problem with mesa, or with the Arch package or with my X configuration. I've tried asking on the Arch forums to no avail. Best regards, Vanja ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Drop bad ASSERT_TRUE in gl_CullDistance link_varyings test.
On 2016-05-13 19:26:37, Kenneth Graunke wrote: > I don't know what the intention was here, but this function returns > void. We can't assert anything about its return value. c1bbaff1e83f901d67d78f9e1ddfe8291dd09bfa seems to be related, and appears to have changed this file similarly for some other cases. Maybe a rebase issue. There is a linker::populate_consumer_input_sets prototype at the top of the file that has bool rather than void for the return. Can you update it too? Reviewed-by: Jordan Justen > > Fixes "make check" failures. > > Signed-off-by: Kenneth Graunke > --- > src/compiler/glsl/tests/varyings_test.cpp | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/src/compiler/glsl/tests/varyings_test.cpp > b/src/compiler/glsl/tests/varyings_test.cpp > index 936f495..09bf1eb 100644 > --- a/src/compiler/glsl/tests/varyings_test.cpp > +++ b/src/compiler/glsl/tests/varyings_test.cpp > @@ -210,11 +210,11 @@ TEST_F(link_varyings, gl_CullDistance) > > ir.push_tail(culldistance); > > - ASSERT_TRUE(linker::populate_consumer_input_sets(mem_ctx, > -&ir, > -consumer_inputs, > - > consumer_interface_inputs, > -junk)); > + linker::populate_consumer_input_sets(mem_ctx, > +&ir, > +consumer_inputs, > +consumer_interface_inputs, > +junk); > > EXPECT_EQ(culldistance, junk[VARYING_SLOT_CULL_DIST0]); > EXPECT_TRUE(is_empty(consumer_inputs)); > -- > 2.8.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fix undefined df bits in brw_reg comparisons.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 With Curro's comment addressed, Reviewed-by: Samuel Iglesias Gonsálvez On 14/05/16 02:44, Kenneth Graunke wrote: > Commit 5310bca024f77da40ea6f4c275455f9cb0528f9e added a new "double > df" field to the brw_reg struct, adding an extra 4 bytes of data > that isn't usually initialized (or may contain irrelevant garbage > if the struct is mutated). This means that it's no longer safe to > memcmp(). > > Instead, add a brw_regs_equal() function which ignores the extra df > bits unless they matter. To keep the implementation cheap, we wrap > the first set of fields in a union/struct so that we can use a > single DWord comparison. > > Signed-off-by: Kenneth Graunke --- > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_reg.h | 27 > +--- src/mesa/drivers/dri/i965/brw_shader.cpp > | 2 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 2 +- 4 > files changed, 22 insertions(+), 11 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index > 4f6f3a3..3b50a82 100644 --- > a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1010,7 +1010,7 > @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, > struct brw_reg src brw_set_default_mask_control(p, > BRW_MASK_DISABLE); brw_set_default_access_mode(p, BRW_ALIGN_1); > > - if (memcmp(&surface_reg, &sampler_reg, sizeof(surface_reg)) > == 0) { + if (brw_regs_equal(&surface_reg, &sampler_reg)) { > brw_MUL(p, addr, sampler_reg, brw_imm_uw(0x101)); } else { > brw_SHL(p, addr, sampler_reg, brw_imm_ud(8)); diff --git > a/src/mesa/drivers/dri/i965/brw_reg.h > b/src/mesa/drivers/dri/i965/brw_reg.h index 6d51623..71e1024 > 100644 --- a/src/mesa/drivers/dri/i965/brw_reg.h +++ > b/src/mesa/drivers/dri/i965/brw_reg.h @@ -234,14 +234,19 @@ > uint32_t brw_swizzle_immediate(enum brw_reg_type type, uint32_t x, > unsigned swz) * or "structure of array" form: */ struct brw_reg { - > enum brw_reg_type type:4; - enum brw_reg_file file:3; /* :2 > hardware format */ - unsigned negate:1; /* source > only */ - unsigned abs:1;/* source only */ - > unsigned address_mode:1; /* relative addressing, hopefully! > */ - unsigned pad0:1; - unsigned subnr:5; /* :1 in > align16 */ - unsigned nr:16; + union { + struct { + > enum brw_reg_type type:4; + enum brw_reg_file file:3; > /* :2 hardware format */ + unsigned negate:1; > /* source only */ + unsigned abs:1;/* > source only */ + unsigned address_mode:1; /* relative > addressing, hopefully! */ + unsigned pad0:1; + > unsigned subnr:5; /* :1 in align16 */ + > unsigned nr:16; + }; + uint32_t bits; + }; > > union { struct { @@ -261,6 +266,12 @@ struct brw_reg { }; }; > > +static inline bool +brw_regs_equal(const struct brw_reg *a, const > struct brw_reg *b) +{ + const bool df = a->type == > BRW_REGISTER_TYPE_DF && a->file == IMM; + return a->bits == > b->bits && (df ? a->df == b->df : a->ud == b->ud); +} > > struct brw_indirect { unsigned addr_subnr:4; diff --git > a/src/mesa/drivers/dri/i965/brw_shader.cpp > b/src/mesa/drivers/dri/i965/brw_shader.cpp index a23f14e..8d9e309 > 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ > b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -687,7 +687,7 @@ > backend_shader::backend_shader(const struct brw_compiler > *compiler, bool backend_reg::equals(const backend_reg &r) const { - > return memcmp((brw_reg *)this, (brw_reg *)&r, sizeof(brw_reg)) == 0 > && + return brw_regs_equal((brw_reg *)this, (brw_reg *)&r) && > reg_offset == r.reg_offset; } > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index > 4b44c3a..baf4422 100644 --- > a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -295,7 +295,7 > @@ generate_tex(struct brw_codegen *p, > brw_set_default_mask_control(p, BRW_MASK_DISABLE); > brw_set_default_access_mode(p, BRW_ALIGN_1); > > - if (memcmp(&surface_reg, &sampler_reg, sizeof(surface_reg)) > == 0) { + if (brw_regs_equal(&surface_reg, &sampler_reg)) { > brw_MUL(p, addr, sampler_reg, brw_imm_uw(0x101)); } else { > brw_SHL(p, addr, sampler_reg, brw_imm_ud(8)); > -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBCAAGBQJXNs3JAAoJEH/0ujLxfcNDbNoP/A1sDKrs2iHW7rN/pCT59Qvy xe4ZoPaQU++gUzQbizOvrdKaibIj5SgwY6Cs9gvWoOy8FsOjRrQs2ptnCKyzfEog TDrwQ/4CDY6Kc0NykVQgxJHmw/363XHqWo3FF6mpVyl7MZHyA9ffBxzFO1OSEhx8 uwpY6mt0NfDtBh/R4Pju5UAV7WT9VnIYWh4Te7M098EsEgf6iqg2I833ct1FbNWu LJYu6g46cIN3Mig4Bak5H495Ws4phP+vBcIPKe+wcWSS/p3bGG0OOk3fcm3fDsD4 h7pE8sYBh/t5TInAMFfjAm9SSnXHmxe1zqkvh5XwD8WQPIAx9E9HnzccGDAs35o6 gU3O43DAkYqXm53oOJi5qOWeisllxQcr