Re: [Mesa-dev] [PATCH 00/26] i965: Tessellation shaders for Gen8+!
I'm working on rebasing these patches on Jason's NIR input/output intrinsic changes. Patches 17, 18, 19, 22, and 24 are probably not worth reviewing in their current form. 10, 12-16, and 23 still apply in their current form.

--Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/aux../util: Make u_prims_for_vertices() safe
Let us avoid trapping in hardware from a SIGFPE and instead assert on a zero divisor.

Hint: This can occur if a PIPE_PRIM_? is not handled in u_prim_vertex_count(), which results in 'info' not being initialized in the expected manner.

Further, we also fix a possible NULL pointer dereference when u_prim_vertex_count() returns NULL for 'info'.

Signed-off-by: Edward O'Callaghan
---
 src/gallium/auxiliary/util/u_prim.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_prim.h b/src/gallium/auxiliary/util/u_prim.h
index 3668015..a09c315 100644
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -145,6 +145,9 @@ u_prims_for_vertices(unsigned prim, unsigned num)
 {
    const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
 
+   assert(info);
+   assert(info->incr != 0);
+
    if (num < info->min)
       return 0;
 
--
2.5.0
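For anyone following along without the tree at hand, here is a minimal sketch of the guarded helper. It is not the actual u_prim.h code: the lookup table, the names lookup_prim() and prims_for_vertices(), and the final return expression are assumptions made for illustration. Only the two asserts come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct u_prim_vertex_count. */
struct prim_vertex_count {
   unsigned min;   /* vertices needed for the first primitive */
   unsigned incr;  /* vertices needed for each further primitive */
};

/* Hypothetical lookup: returns NULL for primitive types it does not know,
 * which is the situation the patch wants to catch early. */
static const struct prim_vertex_count *
lookup_prim(unsigned prim)
{
   static const struct prim_vertex_count table[] = {
      { 1, 1 },  /* 0: points */
      { 2, 2 },  /* 1: lines */
      { 3, 3 },  /* 2: triangles */
      { 3, 1 },  /* 3: triangle strip */
   };

   if (prim >= sizeof(table) / sizeof(table[0]))
      return NULL;
   return &table[prim];
}

static unsigned
prims_for_vertices(unsigned prim, unsigned num)
{
   const struct prim_vertex_count *info = lookup_prim(prim);

   /* Fail loudly in debug builds instead of dereferencing NULL or
    * dividing by zero further down. */
   assert(info);
   assert(info->incr != 0);

   if (num < info->min)
      return 0;

   /* Assumed formula: one primitive for the first `min` vertices, plus
    * one more for every further `incr` vertices. */
   return 1 + (num - info->min) / info->incr;
}
```

Note that the asserts compile away with NDEBUG, so a release build would still hit the division; the sketch only shows the debug-time behaviour.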
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #16 from Pekka Paalanen ---
(In reply to nerdopolis1 from comment #15)
> Doesn't seem that break _mesa_error works, it's not defined...

It should become defined once all the related Mesa libraries get loaded. Looks like 'start' is not enough to pull all shared objects in, the driver needs to load also. Just let it pend "on future shared library load".
Re: [Mesa-dev] [PATCH] clover: Fix build against LLVM 3.8 SVN >= r255078
Michel Dänzer writes:

> From: Michel Dänzer
>
> Signed-off-by: Michel Dänzer

Looks OK to me,

Reviewed-by: Francisco Jerez

> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 3b37f08..4d11c24 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -661,7 +661,11 @@ namespace {
>
>        if (dump_asm) {
>           LLVMSetTargetMachineAsmVerbosity(tm, true);
> +#if HAVE_LLVM >= 0x0308
> +         LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod).release());
> +#else
>           LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod));
> +#endif
>           emit_code(tm, debug_mod, LLVMAssemblyFile, &out_buffer, r_log);
>           buffer_size = LLVMGetBufferSize(out_buffer);
>           buffer_data = LLVMGetBufferStart(out_buffer);
> --
> 2.6.2
Re: [Mesa-dev] [PATCH 1/8] glsl: pass stage into mark function
On Wed, Dec 9, 2015 at 8:06 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> Don't use a bool here, as for some 64-bit fixes we need
> the stage.
>
> Signed-off-by: Dave Airlie
> ---
>  src/glsl/ir_set_program_inouts.cpp | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp
> index d7c29b0..70d754f 100644
> --- a/src/glsl/ir_set_program_inouts.cpp
> +++ b/src/glsl/ir_set_program_inouts.cpp
> @@ -90,7 +90,7 @@ is_dual_slot(ir_variable *var)
>
>  static void
>  mark(struct gl_program *prog, ir_variable *var, int offset, int len,
> -     bool is_fragment_shader)
> +     gl_shader_stage stage)
>  {
>     /* As of GLSL 1.20, varyings can only be floats, floating-point
>      * vectors or matrices, or arrays of them.  For Mesa programs using
> @@ -125,7 +125,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len,
>
>           if (dual_slot)
>              prog->DoubleInputsRead |= bitfield;
> -         if (is_fragment_shader) {
> +         if (stage == MESA_SHADER_FRAGMENT) {
>              gl_fragment_program *fprog = (gl_fragment_program *) prog;
>              fprog->InterpQualifier[idx] =
>                 (glsl_interp_qualifier) var->data.interpolation;
> @@ -178,7 +178,7 @@ ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var)
>     }
>
>     mark(this->prog, var, 0, type->count_attribute_slots(),
> -        this->shader_stage == MESA_SHADER_FRAGMENT);
> +        this->shader_stage);
>  }
>
>  /* Default handler: Mark all the locations in the variable as used. */
> @@ -302,7 +302,7 @@ ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var,
>     }
>
>     mark(this->prog, var, index_as_constant->value.u[0] * elem_width,
> -        elem_width, this->shader_stage == MESA_SHADER_FRAGMENT);
> +        elem_width, this->shader_stage);
>     return true;
>  }
>
> --
> 2.5.0

Reviewed-by: Oded Gabbay
[Mesa-dev] [Bug 1626] X server should not poll() on DRM fd
https://bugs.freedesktop.org/show_bug.cgi?id=1626

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #4 from Michel Dänzer ---
Agreed.
Re: [Mesa-dev] [PATCH v4 15/44] i965: Work around L3 state leaks during context switches.
On 2015-12-08 08:43:53, Francisco Jerez wrote: > This is going to require some rather intrusive kernel changes to fix > properly, in the meantime (and forever on at least pre-v4.1 kernels) > we'll have to restore the hardware defaults at the end of every batch > in which the L3 configuration was changed to avoid interfering with > the DDX and GL clients that use an older non-L3-aware version of Mesa. > > Reviewed-by: Samuel Iglesias Gonsálvez> Reviewed-by: Kristian Høgsberg > > v4: Optimize look-up of the default configuration by assuming it's the > first entry of the L3 config array in order to avoid an FPS > regression in GpuTest Triangle and SynMark OglBatch2-7 on most > affected platforms. > --- > src/mesa/drivers/dri/i965/brw_state.h | 4 +++ > src/mesa/drivers/dri/i965/gen7_l3_state.c | 51 > +++ > src/mesa/drivers/dri/i965/intel_batchbuffer.c | 7 > src/mesa/drivers/dri/i965/intel_batchbuffer.h | 6 +++- > 4 files changed, 67 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_state.h > b/src/mesa/drivers/dri/i965/brw_state.h > index 49f301a..b7c0039 100644 > --- a/src/mesa/drivers/dri/i965/brw_state.h > +++ b/src/mesa/drivers/dri/i965/brw_state.h > @@ -380,6 +380,10 @@ void gen7_update_binding_table_from_array(struct > brw_context *brw, > void gen7_disable_hw_binding_tables(struct brw_context *brw); > void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw); > > +/* gen7_l3_state.c */ > +void > +gen7_restore_default_l3_config(struct brw_context *brw); > + > #ifdef __cplusplus > } > #endif > diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c > b/src/mesa/drivers/dri/i965/gen7_l3_state.c > index 7956935..7fa7336 100644 > --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c > +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c > @@ -520,3 +520,54 @@ const struct brw_tracked_state gen7_l3_state = { > }, > .emit = emit_l3_state > }; > + > +/** > + * Hack to restore the default L3 configuration. 
> + * > + * This will be called at the end of every batch in order to reset the L3 > + * configuration to the default values for the time being until the kernel is > + * fixed. Until kernel commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b > + * (included in v4.1) we would set the MI_RESTORE_INHIBIT bit when submitting > + * batch buffers for the default context used by the DDX, which meant that > any > + * context state changed by the GL would leak into the DDX, the assumption > + * being that the DDX would initialize any state it cares about manually. > The > + * DDX is however not careful enough to program an L3 configuration > + * explicitly, and it makes assumptions about it (URB size) which won't hold > + * and cause it to misrender if we let our L3 set-up to leak into the DDX. > + * > + * Since v4.1 of the Linux kernel the default context is saved and restored > + * normally, so it's far less likely for our L3 programming to interfere with > + * other contexts -- In fact restoring the default L3 configuration at the > end > + * of the batch will be redundant most of the time. A kind of state leak is > + * still possible though if the context making assumptions about L3 state is > + * created immediately after our context was active (e.g. without the DDX > + * default context being scheduled in between) because at present the DRM > + * doesn't fully initialize the contents of newly created contexts and > instead > + * sets the MI_RESTORE_INHIBIT flag causing it to inherit the state from the > + * last active context. 
> + * > + * It's possible to realize such a scenario if, say, an X server (or a GL > + * application using an outdated non-L3-aware Mesa version) is started while > + * another GL application is running and happens to have modified the L3 > + * configuration, or if no X server is running at all and a GL application > + * using a non-L3-aware Mesa version is started after another GL application > + * ran and modified the L3 configuration -- The latter situation can actually > + * be reproduced easily on IVB in our CI system. > + */ > +void > +gen7_restore_default_l3_config(struct brw_context *brw) > +{ > + const struct brw_device_info *devinfo = brw->intelScreen->devinfo; > + /* For efficiency assume that the first entry of the array matches the > +* default configuration. > +*/ > + const struct brw_l3_config *const cfg = get_l3_configs(devinfo); > + assert(cfg == get_l3_config(devinfo, > + get_default_l3_weights(devinfo, false, > false))); > + > + if (cfg != brw->l3.config && brw->can_do_pipelined_register_writes) { > + setup_l3_config(brw, cfg); > + update_urb_size(brw, cfg); > + brw->l3.config = cfg; > + } > +} > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c > b/src/mesa/drivers/dri/i965/intel_batchbuffer.c > index 0363bd3..f778074 100644 > ---
Re: [Mesa-dev] [PATCH] svga: initialize pipe_driver_query_info entries with a macro
On 9 December 2015 at 00:35, Brian Paul wrote:
> To be safe, set all the fields in case the enums ordering/values
> ever change.

Since you guys are using MSVC 2013 to build svga, one can even use C99 initializers. This should be more robust wrt the issues mentioned.

-Emil
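To make the suggestion concrete, here is a rough sketch of what C99 designated initializers buy in a table like this. The struct, enum and entries are hypothetical stand-ins, not the real pipe_driver_query_info or svga definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the real gallium definitions. */
enum query_type {
   QUERY_NUM_DRAW_CALLS = 0,
   QUERY_NUM_FLUSHES = 1,
};

struct query_info {
   const char *name;
   enum query_type type;
   unsigned long long max_value;
   int group_id;
};

/* Every field is named explicitly, so reordering the struct members or
 * adding new ones cannot silently shift a value into the wrong field.
 * Members that are not named (max_value here) are zero-initialized. */
static const struct query_info queries[] = {
   { .name = "num-draw-calls", .type = QUERY_NUM_DRAW_CALLS, .group_id = -1 },
   { .name = "num-flushes",    .type = QUERY_NUM_FLUSHES,    .group_id = -1 },
};

/* Small lookup helper so the table can be exercised. */
static const struct query_info *
find_query(const char *name)
{
   assert(name != NULL);

   for (size_t i = 0; i < sizeof(queries) / sizeof(queries[0]); i++) {
      if (strcmp(queries[i].name, name) == 0)
         return &queries[i];
   }
   return NULL;
}
```

Compared with a positional macro, the cost is a little more typing per entry; the gain is that a change in field order shows up as nothing at all rather than as a subtle bug.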
Re: [Mesa-dev] [PATCH] Fix locking of GLsync objects
Hello Steinar,

On 8 December 2015 at 01:01, Steinar H. Gunderson wrote:
> Hi,
>
> I was told that it's easier for people to review my patch if it comes in via
> email than being stuck in the bug tracker; FWIW, this is for bug 120238.

Which bugtracker is this? bugs.fd.o does not like the number mentioned. Please add the full URL to the commit message with a Bugzilla: tag.

> (It's the same patch as is already in the tracker.)
>
> /* Steinar */
>
> ===
>
> From 6e3d1880fa78a3a965cb7eb51ee12b1f785f84bb Mon Sep 17 00:00:00 2001
> From: "Steinar H. Gunderson"
> Date: Tue, 1 Dec 2015 22:05:11 +0100
> Subject: [PATCH] Fix locking of GLsync objects.
>
> GLsync objects had a race condition when used from multiple threads
> (which is the main point of the extension, really); it could be
> validated as a sync object at the beginning of the function, and then
> deleted by another thread before use, causing crashes. Fix this by
> changing all casts from GLsync to struct gl_sync_object to a new
> function _mesa_get_sync() that validates and increases the refcount.
>

Might be worth keeping _mesa_ref_sync_object(), even if it's an inline wrapper around the above, as things get a bit confusing otherwise: foo_get vs foo_unref.

Alternatively one could even throw the locking (+extra checks) into the validate, use it in _mesa_IsSync(), while using the ref/unref combo elsewhere and drop the "amount" argument from unref.

I'm biased towards the latter, although let's see how others feel on the topic.

> In a similar vein, validation itself uses _mesa_set_search(), which
> requires synchronization -- it was called without a mutex held, causing
> spurious error returns and other issues. Since _mesa_get_sync() now
> takes the shared context mutex, this problem is also resolved.
>

Please mention if this commit fixes a certain game/program.

Can you also add the following tag? This way it'll be harder to miss the patch when picking things for the stable branch(es).

Cc: "11.0 11.1"

Thanks
Emil
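To illustrate the race being discussed, here is a minimal sketch of the "validate and take a reference under one lock" pattern. It is not the Mesa implementation: the fixed-size table, the integer handles and the names get_sync(), unref_sync() and create_sync() are all made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYNCS 8

/* Hypothetical sync object; the real gl_sync_object carries much more. */
struct sync_object {
   int refcount;
   bool in_use;
};

/* Hypothetical shared state: a table of sync objects plus the mutex
 * that guards both the table and the refcounts. */
static pthread_mutex_t shared_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct sync_object sync_table[MAX_SYNCS];

/* Allocate a slot with an initial reference; returns -1 if full. */
static int
create_sync(void)
{
   int handle = -1;

   pthread_mutex_lock(&shared_mutex);
   for (int i = 0; i < MAX_SYNCS; i++) {
      if (!sync_table[i].in_use) {
         sync_table[i].in_use = true;
         sync_table[i].refcount = 1;
         handle = i;
         break;
      }
   }
   pthread_mutex_unlock(&shared_mutex);
   return handle;
}

/* Validate the handle and take a reference inside one critical section,
 * so another thread cannot delete the object between check and use. */
static struct sync_object *
get_sync(int handle)
{
   struct sync_object *obj = NULL;

   pthread_mutex_lock(&shared_mutex);
   if (handle >= 0 && handle < MAX_SYNCS && sync_table[handle].in_use) {
      obj = &sync_table[handle];
      obj->refcount++;
   }
   pthread_mutex_unlock(&shared_mutex);
   return obj;
}

/* Drop one reference; the slot is released when the count hits zero. */
static void
unref_sync(struct sync_object *obj)
{
   pthread_mutex_lock(&shared_mutex);
   assert(obj->refcount > 0);
   if (--obj->refcount == 0)
      obj->in_use = false;
   pthread_mutex_unlock(&shared_mutex);
}
```

A caller pairs each successful get_sync() with an unref_sync() once it is done with the object. Splitting the lookup and the reference into two separately locked steps would reopen the window the patch description talks about.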
[Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.
My tessellation branch has two additional remap functions. I don't want
to replicate this logic there.

Signed-off-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_nir.c | 78 -
 1 file changed, 50 insertions(+), 28 deletions(-)

Hey Jason,

If you like this patch, and haven't yet merged your NIR input reworks,
feel free to just squash it into your changes. Or, we can land it
separately after your changes. It's up to you.

Separating this out allows me to reuse this in my new tessellation input
and output remapping functions, and also means we don't need to add
structs for the remap functions...we can just pass the builder, or
inputs_read, or the VUE map...and not have to pack multiple things
together.

 --Ken

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index 14ad172..105a175 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -27,15 +27,19 @@
 #include "glsl/nir/nir_builder.h"
 #include "program/prog_to_nir.h"
 
-struct remap_vs_attrs_state {
-   nir_builder b;
-   uint64_t inputs_read;
-};
-
+/**
+ * In many cases, we just add the base and offset together, so there's no
+ * reason to keep them separate.  Sometimes, combining them is essential:
+ * if a shader only accesses part of a compound variable (such as a matrix
+ * or array), the variable's base may not actually exist in the VUE map.
+ *
+ * This pass adds constant offsets to instr->const_index[0], and resets
+ * the offset source to 0.  Non-constant offsets remain unchanged.
+ */
 static bool
-remap_vs_attrs(nir_block *block, void *void_state)
+add_const_offset_to_base(nir_block *block, void *closure)
 {
-   struct remap_vs_attrs_state *state = void_state;
+   nir_builder *b = closure;
 
    nir_foreach_instr_safe(block, instr) {
       if (instr->type != nir_instr_type_intrinsic)
@@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
 
       nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
 
+      if (intrin->intrinsic == nir_intrinsic_load_input ||
+          intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
+          intrin->intrinsic == nir_intrinsic_load_output ||
+          intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
+          intrin->intrinsic == nir_intrinsic_store_output ||
+          intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {
+         nir_src *offset = nir_get_io_offset_src(intrin);
+         nir_const_value *const_offset = nir_src_as_const_value(*offset);
+
+         if (const_offset) {
+            intrin->const_index[0] += const_offset->u[0];
+            b->cursor = nir_before_instr(&intrin->instr);
+            nir_instr_rewrite_src(&intrin->instr, offset,
+                                  nir_src_for_ssa(nir_imm_int(b, 0)));
+         }
+      }
+   }
+   return true;
+
+}
+
+static bool
+remap_vs_attrs(nir_block *block, void *closure)
+{
+   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
+
+   nir_foreach_instr(block, instr) {
+      if (instr->type != nir_instr_type_intrinsic)
+         continue;
+
+      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
+
       if (intrin->intrinsic == nir_intrinsic_load_input) {
          /* Attributes come in a contiguous block, ordered by their
           * gl_vert_attrib value.  That means we can compute the slot
           * number for an attribute by masking out the enabled attributes
           * before it and counting the bits.
           */
-         nir_const_value *const_offset = nir_src_as_const_value(intrin->src[0]);
-
-         /* We set EmitNoIndirect for VS inputs, so there are no indirects. */
-         assert(const_offset);
-
-         int attr = intrin->const_index[0] + const_offset->u[0];
-         int slot = _mesa_bitcount_64(state->inputs_read &
-                                      BITFIELD64_MASK(attr));
+         int attr = intrin->const_index[0];
+         int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr));
 
-         /* The NIR -> FS pass will just add the base and offset together, so
-          * there's no reason to keep them separate.  Just put it all in
-          * const_index[0] and set the offset src[0] to load_const(0).
-          */
          intrin->const_index[0] = 4 * slot;
-
-         state->b.cursor = nir_before_instr(&intrin->instr);
-         nir_instr_rewrite_src(&intrin->instr, &intrin->src[0],
-                               nir_src_for_ssa(nir_imm_int(&state->b, 0)));
       }
    }
    return true;
@@ -97,17 +119,17 @@ brw_nir_lower_inputs(nir_shader *nir,
        * key->inputs_read since the two are identical aside from Gen4-5
        * edge flag differences.
        */
-      struct remap_vs_attrs_state remap_state = {
-         .inputs_read = nir->info.inputs_read,
-      };
+      GLbitfield64 inputs_read = nir->info.inputs_read;
 
       /* This pass needs actual constants */
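The slot computation described in the comment ("masking out the enabled attributes before it and counting the bits") can be tried in isolation. The sketch below is standalone and uses hypothetical helpers, mask_below() and popcount64(), in place of Mesa's BITFIELD64_MASK and _mesa_bitcount_64.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (stand-in for BITFIELD64_MASK). */
static uint64_t
mask_below(unsigned bits)
{
   assert(bits <= 64);
   return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Portable population count (stand-in for _mesa_bitcount_64). */
static unsigned
popcount64(uint64_t v)
{
   unsigned n = 0;

   while (v) {
      v &= v - 1;  /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Slot of attribute `attr` = number of enabled attributes with a lower
 * index, since enabled attributes are packed contiguously in order. */
static unsigned
attr_to_slot(uint64_t inputs_read, unsigned attr)
{
   assert(attr < 64);
   return popcount64(inputs_read & mask_below(attr));
}
```

With attributes 0, 3 and 5 enabled (inputs_read = 0x29), attribute 5 lands in slot 2, so the patched code would store 4 * 2 = 8 in const_index[0].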
Re: [Mesa-dev] [Mesa-stable] [PATCH v2] configure.ac: use pkg-config for libelf
On 9 December 2015 at 05:55, Jonathan Gray wrote:
> Use PKG_CHECK_MODULES to get the flags to link libelf
>
> v2: keep AC_CHECK_LIB as a fallback for elfutils provided
> libelf that doesn't install a pkg-config file.
>
> Signed-off-by: Jonathan Gray
> Reviewed-by: Michel Dänzer
> Tested-by: Michel Dänzer
> Cc: "11.0 11.1"
> ---
>  configure.ac                           | 12 +++++++++---
>  src/gallium/drivers/radeon/Makefile.am |  5 +++--
>  src/gallium/targets/opencl/Makefile.am |  5 ++++-
>  3 files changed, 16 insertions(+), 6 deletions(-)
>

Reviewed-by: Emil Velikov

François, this should allow you guys to remove these nasty workarounds [1]. Previously you only needed them for opencl or when r600/radeonsi was selected (be that dri, d3dadapter or any of the video backend drivers).

-Emil

[1] http://gitweb.dragonflybsd.org/dports.git/blob/HEAD:/graphics/libosmesa/Makefile.DragonFly
Re: [Mesa-dev] [PATCH 7/8] glsl: fix transform feedback for 64-bit outupts.
On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote:
> From: Dave Airlie
>
> This fixes the calculations for transform feedback for doubles.
>
> Signed-off-by: Dave Airlie

Patches 4-7 are also:

Reviewed-by: Timothy Arceri
[Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
This code in brw_set_dest adjusts the execution size of any instruction with a dst.width < 8. However, we don't want to do this with instructions operating on doubles, since these will have a width of 4, but still need an execution size of 8 (for SIMD8).

Unfortunately, we can't just check the size of the operands involved to detect if we are doing an operation on doubles, because we can have instructions that do operations on double operands interpreted as UD, operating on either of its two 32-bit components.

Previous commits have made it so we never emit instructions with a horizontal width of 4 that don't have the correct execution size set for gen7/gen8, so we can skip it in this case, avoiding the conflicts with fp64 requirements.

Expanding the same fix to other hardware generations requires many more changes, but since we are not targeting fp64 support on them we don't really care for now.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 78f2c8c..50a8771 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
    /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
     * or 16 (SIMD16), as that's normally correct.  However, when dealing with
     * small registers, we automatically reduce it to match the register size.
+    *
+    * In platforms that support fp64 we can emit instructions with a width of
+    * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
+    * cases we need to make sure that these instructions have their exec sizes
+    * set properly when they are emitted and we can't rely on this code to fix
+    * it.
     */
-   if (dest.width < BRW_EXECUTE_8)
+   bool fix_exec_size;
+   if (devinfo->gen == 7 || devinfo->gen == 8)
+      fix_exec_size = dest.width < BRW_EXECUTE_4;
+   else
+      fix_exec_size = dest.width < BRW_EXECUTE_8;
+
+   if (fix_exec_size)
       brw_inst_set_exec_size(devinfo, inst, dest.width);
 }
 
--
2.1.4
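For reviewers who want to poke at the decision outside the driver, here is a small standalone sketch of the rule the patch introduces. The enum and the function names are hypothetical stand-ins; the only thing taken from the patch is the gen7/gen8 special case, and the log2-style enum values are an assumption mirroring how BRW_EXECUTE_* compare.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the BRW_EXECUTE_* values; ordered so that
 * a smaller channel count compares as less-than. */
enum exec_size {
   EXECUTE_1 = 0,
   EXECUTE_2 = 1,
   EXECUTE_4 = 2,
   EXECUTE_8 = 3,
   EXECUTE_16 = 4,
};

/* Should the destination width override the default execution size? */
static bool
should_fix_exec_size(int gen, enum exec_size dest_width)
{
   /* On gen7/gen8 a width of 4 may be an fp64 instruction that still
    * needs exec size 8, so only widths below 4 are adjusted. */
   if (gen == 7 || gen == 8)
      return dest_width < EXECUTE_4;

   return dest_width < EXECUTE_8;
}

/* Execution size an instruction ends up with after the adjustment. */
static enum exec_size
resolve_exec_size(int gen, enum exec_size dflt, enum exec_size dest_width)
{
   assert(dflt == EXECUTE_8 || dflt == EXECUTE_16);
   return should_fix_exec_size(gen, dest_width) ? dest_width : dflt;
}
```

The practical consequence is the one the series deals with: on gen7/gen8 any non-double instruction with a width of 4 no longer gets fixed up automatically and has to set its execution size explicitly.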
[Mesa-dev] [RFC PATCH 3/5] i965/eu: set execution size for SEND message in brw_send_indirect_message
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 9543d5e..13c8c36 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2590,6 +2590,9 @@ brw_send_indirect_message(struct brw_codegen *p,
       brw_set_src1(p, send, addr);
    }
 
+   if (dst.width < BRW_EXECUTE_8)
+      brw_inst_set_exec_size(devinfo, send, dst.width);
+
    brw_set_dest(p, send, dst);
    brw_set_src0(p, send, retype(payload, BRW_REGISTER_TYPE_UD));
    brw_inst_set_sfid(devinfo, send, sfid);
--
2.1.4
[Mesa-dev] [RFC PATCH 1/5] i965/eu: set correct execution size in brw_NOP
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index f8c0f80..9543d5e 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1256,6 +1256,7 @@ brw_F16TO32(struct brw_codegen *p, struct brw_reg dst, struct brw_reg src)
 void brw_NOP(struct brw_codegen *p)
 {
    brw_inst *insn = next_insn(p, BRW_OPCODE_NOP);
+   brw_inst_set_exec_size(p->devinfo, insn, BRW_EXECUTE_4);
    brw_set_dest(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
    brw_set_src0(p, insn, retype(brw_vec4_grf(0,0), BRW_REGISTER_TYPE_UD));
    brw_set_src1(p, insn, brw_imm_ud(0x0));
--
2.1.4
[Mesa-dev] [RFC PATCH 0/5] Skip automatic execsize for instructions with a width of 4
Right now we rely on the code at the bottom of brw_set_dest to set the correct execution size for anything that does not operate on a full SIMD register (dst.width < BRW_EXECUTE_8). However, this won't work with fp64, where operands are twice as big and we see instructions with a horizontal width of 4 that still require an execution size of 8.

We cannot fix this by simply checking the type of the operands involved and skipping the automatic execsize adjustment when they are doubles, because we can also operate on doubles as integers (for pack and unpack operations, for example).

Connor and I have been discussing this and we see two options:

1) We fix all the instructions that do not operate on doubles to set the correct execution size so they don't depend on the code in brw_set_dest any more, then remove that code from brw_set_dest so it does not break fp64. However, this involves a lot of changes throughout the driver and across all hardware platforms... it looks rather scary, so I think we probably don't want to go down this path.

2) Since we only produce double instructions with a width of 4, we can limit the change to these instructions. This means that we only automatically adjust execsize in brw_set_dest for instructions where dst.width < BRW_EXECUTE_4 and we fix driver code that emits non-double instructions with a width of 4 to set the execsize they need explicitly. For gen7/8 it seems that only a handful of changes are required for this according to piglit.

This RFC series implements 2). As mentioned, piglit seems happy with the changes, but it might be a good idea to run this through Jenkins to make sure that we are not missing anything.

My original idea was to expand the fix to cover other gens as well, but unfortunately, this would involve many more changes. For example, ILK's strip-and-fans or clipping modules are full of code that mixes width8 and width4 instructions (I have ~10 patches that fix this that involve changing the default execution size to 4 in the sf module), gen6 also needs a few changes and I would not be surprised if gen4 required more. Gen9 might require changes too. Another problem with this is that we don't have gen4/9 hardware available, so trying this would also require involvement from other people with access to this hardware.

So considering that we are not targeting fp64 support on gens < 7, at least for now, it seems that implementing 2) but special-casing it for gen7 and gen8 only, leaving other generations intact, is a reasonable way to go for now.

Opinions?

Iago Toral Quiroga (5):
  i965/eu: set correct execution size in brw_NOP
  i965/fs: set execution size for SEND messages in generate_uniform_pull_constant_load_gen7
  i965/eu: set execution size for SEND message in brw_send_indirect_message
  i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel
  i965: Skip execution size adjustment for instructions of width 4

 src/mesa/drivers/dri/i965/brw_eu_emit.c        | 21 ++++++++++++++++++++-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  2 ++
 2 files changed, 22 insertions(+), 1 deletion(-)

--
2.1.4
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #17 from nerdopol...@verizon.net ---
Argh, I tried to recompile mesa master with all of the symbols, and now SDL is working, not sure what to think now...
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On Wed, Dec 9, 2015 at 1:09 PM, Emil Velikov wrote:
> On 9 December 2015 at 05:37, Jonathan Gray wrote:
>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>> all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls
>> still get optimised out but now there is at least one SSE4.1 instruction
>> generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions
>> got optimised out the configure test would incorrectly pass when the
>> compiler supported the intrinsics and the assembler didn't support the
>> instructions.
>>
> Must admit that I was not expecting that one. Looks like pixman (the
> inspiration for this check) is missing volatile as well. Does that one
> build/run fine on OpenBSD ?
>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>> Signed-off-by: Jonathan Gray
>> Cc: "11.0 11.1"
> Reviewed-by: Emil Velikov
>
> I'll pick this in a couple of days (barring any objections).
>
> Thanks
> Emil

Adding pixman ML.

I must admit ignorance on this one. I looked at configure.ac of pixman and I don't see any SSE4.1 reference, and AFAIK, we don't use those instructions (only SSE2 and SSSE3). Is the above patch relevant for those as well? I ask because the tests in configure.ac do *not* contain volatile.

Oded
Re: [Mesa-dev] [PATCH 3/8] glsl: use dual slot helper in the linker code.
On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote:
> From: Dave Airlie
>
> Signed-off-by: Dave Airlie

Great timing :) I was going to have to look into fixing this stuff for enhanced layouts.

Patches 1 & 2 are:

Reviewed-by: Timothy Arceri

I have a question about this patch. If these doubles only take up a single attribute then why do we even bother with this test? The spec says it's optional and you're fixing the counting up in later patches, so what does it do that's useful?

If there is a good reason for keeping it then:

Reviewed-by: Timothy Arceri

> ---
>  src/glsl/linker.cpp | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index ae628cd..89659c7 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -2603,17 +2603,8 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
>               * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
>               * is optional behavior, but it seems preferable.
>               */
> -            const glsl_type *type = var->type->without_array();
> -            if (type == glsl_type::dvec3_type ||
> -                type == glsl_type::dvec4_type ||
> -                type == glsl_type::dmat2x3_type ||
> -                type == glsl_type::dmat2x4_type ||
> -                type == glsl_type::dmat3_type ||
> -                type == glsl_type::dmat3x4_type ||
> -                type == glsl_type::dmat4x3_type ||
> -                type == glsl_type::dmat4_type) {
> +            if (var->type->without_array()->is_dual_slot_double())
>                 double_storage_locations |= (use_mask << attr);
> -            }
>           }
>
>           continue;
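As a side note on what the helper replaces: every type in the removed list is a double type whose column vector has more than two components (dvec3, dvec4, and the dmatCxR types with 3 or 4 rows). The sketch below encodes that rule. It is inferred from the type list in the quoted diff, not copied from the real is_dual_slot_double(), and the struct and enum are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much reduced type descriptor. */
enum base_type {
   TYPE_FLOAT,
   TYPE_DOUBLE,
};

struct type_desc {
   enum base_type base;
   unsigned rows;     /* components per column vector */
   unsigned columns;  /* 1 for scalars and vectors */
};

/* Assumed rule: a double type needs two slots per column once a column
 * holds more than two components (3 or 4 doubles exceed 128 bits). */
static bool
is_dual_slot_double(const struct type_desc *t)
{
   assert(t != NULL);
   return t->base == TYPE_DOUBLE && t->rows > 2;
}
```

Under this rule dvec2, dmat2, dmat3x2 and dmat4x2 stay single-slot, which matches their absence from the list being removed.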
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On Wed, Dec 9, 2015 at 2:34 PM, Jonathan Gray wrote:
> On Wed, Dec 09, 2015 at 01:39:30PM +0200, Oded Gabbay wrote:
>> On Wed, Dec 9, 2015 at 1:09 PM, Emil Velikov wrote:
>> > On 9 December 2015 at 05:37, Jonathan Gray wrote:
>> >> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>> >> all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls
>> >> still get optimised out but now there is at least one SSE4.1 instruction
>> >> generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions
>> >> got optimised out the configure test would incorrectly pass when the
>> >> compiler supported the intrinsics and the assembler didn't support the
>> >> instructions.
>> >>
>> > Must admit that I was not expecting that one. Looks like pixman (the
>> > inspiration for this check) is missing volatile as well. Does that one
>> > build/run fine on OpenBSD ?
>> >
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>> >> Signed-off-by: Jonathan Gray
>> >> Cc: "11.0 11.1"
>> > Reviewed-by: Emil Velikov
>> >
>> > I'll pick this in a couple of days (barring any objections).
>> >
>> > Thanks
>> > Emil
>>
>> Adding pixman ML.
>> I must admit ignorance on this one.
>> I looked at configure.ac of pixman and I don't see any SSE4.1
>> reference, and AFAIK, we don't use those instructions (only SSE2 and
>> SSSE3).
>> Is the above patch relevant for those as well ? because the tests in
>> configure.ac does *not* contain volatile.
>>
>> Oded
>
> It looks like this is indeed a problem in pixman as well with at least
> gcc 4.2 and 4.9. Running the pixman sse2 test on amd64 I only
> see xmm register use and movdqa in the generated assembly with -O0.
>
> The pixman configure tests passes on OpenBSD but the toolchain
> recognises sse2 instructions just not sse 4.1.
>
> Introducing volatile to the pixman test stops the xmm/movdqa
> use from being optimised out.

I adapted the patch to pixman's configure.ac and sent it to the ML.

Jonathan, thanks for the patch.

Oded
Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound
On 8 December 2015 at 16:35, Neil Roberts wrote: > If EGL_KHR_surfaceless_context is used then glViewport can be called > with NULL for the draw and read surfaces. This was previously causing > a crash because the i965 driver tries to use this point to invalidate > the surfaces and it was dereferencing the NULL pointer. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 > Cc: Nanley Chery > Cc: "11.1" Worth throwing in 11.0 as well ? > --- > > I've also posted a Piglit test for this here: > > http://patchwork.freedesktop.org/patch/67356/ > > src/mesa/drivers/dri/i965/brw_context.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index 7d7643c..5374bad 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -159,8 +159,10 @@ intel_viewport(struct gl_context *ctx) > __DRIcontext *driContext = brw->driContext; > > if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) { > - dri2InvalidateDrawable(driContext->driDrawablePriv); > - dri2InvalidateDrawable(driContext->driReadablePriv); > + if (driContext->driDrawablePriv) > + dri2InvalidateDrawable(driContext->driDrawablePriv); > + if (driContext->driReadablePriv) > + dri2InvalidateDrawable(driContext->driReadablePriv); Afaict i915 could use an identical fix ? -Emil
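The guard added by the patch above can be illustrated outside the driver with a minimal sketch. The types and functions below (fake_drawable, fake_context, fake_invalidate, invalidate_bound_surfaces) are hypothetical stand-ins, not the real __DRIcontext or dri2InvalidateDrawable definitions; the only point is that each surface pointer is checked before use, so a surfaceless context becomes a no-op instead of a crash.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DRI drawable. */
struct fake_drawable {
    unsigned invalidate_count;
};

/* Hypothetical stand-in for the DRI context. */
struct fake_context {
    struct fake_drawable *draw; /* may be NULL with a surfaceless context */
    struct fake_drawable *read; /* may be NULL with a surfaceless context */
};

/* Stand-in for the invalidate call: it dereferences its argument,
 * so it must never be handed a NULL pointer. */
static void fake_invalidate(struct fake_drawable *d)
{
    d->invalidate_count++;
}

/* Invalidate only the surfaces that are actually bound.
 * Returns the number of surfaces that were invalidated. */
static int invalidate_bound_surfaces(struct fake_context *ctx)
{
    int n = 0;

    if (ctx->draw) {
        fake_invalidate(ctx->draw);
        n++;
    }
    if (ctx->read) {
        fake_invalidate(ctx->read);
        n++;
    }
    return n;
}
```

With both pointers NULL the function touches nothing, which is the situation EGL_KHR_surfaceless_context creates.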
Re: [Mesa-dev] [PATCH 1/2] glsl: do not loose always_active_io when packing varyings
On Wed, 2015-12-09 at 09:48 +0200, Tapani Pälli wrote: > Otherwise packed and inactive varyings get optimized away. This needs > to be prevented when using separate shader objects where interface > needs to be preserved. > > Signed-off-by: Tapani Pälli Reviewed-by: Timothy Arceri > --- > src/glsl/lower_packed_varyings.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/glsl/lower_packed_varyings.cpp > b/src/glsl/lower_packed_varyings.cpp > index 037c27d..8d1eb17 100644 > --- a/src/glsl/lower_packed_varyings.cpp > +++ b/src/glsl/lower_packed_varyings.cpp > @@ -622,6 +622,7 @@ > lower_packed_varyings_visitor::get_packed_varying_deref( >packed_var->data.interpolation = unpacked_var > ->data.interpolation; >packed_var->data.location = location; >packed_var->data.precision = unpacked_var->data.precision; > + packed_var->data.always_active_io = unpacked_var > ->data.always_active_io; >unpacked_var->insert_before(packed_var); >this->packed_varyings[slot] = packed_var; > } else {
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On Wed, Dec 09, 2015 at 01:39:30PM +0200, Oded Gabbay wrote: > On Wed, Dec 9, 2015 at 1:09 PM, Emil Velikov wrote: > > On 9 December 2015 at 05:37, Jonathan Gray wrote: > >> Change the __m128i variables to be volatile so gcc 4.9 won't optimise > >> all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls > >> still get optimised out but now there is at least one SSE4.1 instruction > >> generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions > >> got optimised out the configure test would incorrectly pass when the > >> compiler supported the intrinsics and the assembler didn't support the > >> instructions. > >> > > Must admit that I was not expecting that one. Looks like pixman (the > > inspiration for this check) is missing volatile as well. Does that one > > build/run fine on OpenBSD ? > > > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 > >> Signed-off-by: Jonathan Gray > >> Cc: "11.0 11.1" > > Reviewed-by: Emil Velikov > > > > I'll pick this in a couple of days (barring any objections). > > > > Thanks > > Emil > > Adding pixman ML. > I must admit ignorance on this one. > I looked at configure.ac of pixman and I don't see any SSE4.1 > reference, and AFAIK, we don't use those instructions (only SSE2 and > SSSE3). > Is the above patch relevant for those as well ? because the tests in > configure.ac do *not* contain volatile. > > Oded It looks like this is indeed a problem in pixman as well, with at least gcc 4.2 and 4.9. Running the pixman SSE2 test on amd64, I only see xmm register use and movdqa in the generated assembly with -O0. The pixman configure test passes on OpenBSD, but the toolchain recognises SSE2 instructions, just not SSE4.1. Introducing volatile to the pixman test stops the xmm/movdqa use from being optimised out.
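The idea discussed in this thread can be sketched as follows. This is an illustration in the spirit of the described change, not a copy of the configure.ac test: the function name sse41_probe is made up, the target attribute is an assumption standing in for the -msse4.1 flag a real configure test would pass on the command line, and the non-x86 branch exists only so the sketch builds everywhere.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>

/* Assumption: target("sse4.1") replaces -msse4.1 on the command line. */
static __attribute__((target("sse4.1"))) int sse41_probe(int param)
{
    /* volatile stops the optimiser from folding the vectors away, so the
     * pmaxud generated for _mm_max_epu32 has to pass through the assembler. */
    volatile __m128i a = _mm_set1_epi32(param);
    volatile __m128i b = _mm_set1_epi32(param + 1);
    volatile __m128i c;

    c = _mm_max_epu32(a, b);       /* unsigned 32-bit maximum per lane */
    return _mm_cvtsi128_si32(c);   /* lowest lane of the result */
}
#else
/* Scalar fallback for non-x86 builds: same unsigned maximum, one lane. */
static int sse41_probe(int param)
{
    unsigned a = (unsigned)param;
    unsigned b = a + 1u;
    return (int)(a > b ? a : b);
}
#endif
```

Without volatile, a compiler is free to compute the result at build time or drop it entirely, in which case no SSE4.1 instruction is emitted and an assembler lacking SSE4.1 support would go unnoticed.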
Re: [Mesa-dev] [RFC] i965: Resolve color for all active shader images in intel_update_state().
Kristian Høgsberg writes: > On Sat, Sep 5, 2015 at 11:30 AM, Jordan Justen > wrote: >> From: Francisco Jerez >> >> Fixes >> arb_shader_image_load_store/execution/load-from-cleared-image.shader_test >> >> Cc: Chris Wilson >> Cc: Jason Ekstrand >> Tested-by: Jordan Justen > > This patch is required for correct behavior and looks straightforward > and correct to me. Let's fix the bug and optimize the CPU performance > regression (if there is one) later. > > Reviewed-by: Kristian Høgsberg Thanks, pushed. Chris, feel free to open a bug report and add me to the CC list if you can still reproduce a regression on master. > >> --- >> RE: i965: Perform an explicit flush after doing _mesa_meta_pbo_TexSubImage >> >> curro has some concerns about potential perf impact by this and >> wanted it to be checked on small-core w/CPU bound apps. >> Unfortunately, he is on vacation now. >> >> src/mesa/drivers/dri/i965/brw_context.c | 18 ++ >> 1 file changed, 18 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_context.c >> b/src/mesa/drivers/dri/i965/brw_context.c >> index f0ed891..a274a43 100644 >> --- a/src/mesa/drivers/dri/i965/brw_context.c >> +++ b/src/mesa/drivers/dri/i965/brw_context.c >> @@ -189,6 +189,24 @@ intel_update_state(struct gl_context * ctx, GLuint >> new_state) >>brw_render_cache_set_check_flush(brw, tex_obj->mt->bo); >> } >> >> + /* Resolve color for each active shader image. */ >> + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { >> + const struct gl_shader *shader = ctx->_Shader->CurrentProgram[i] ?
>> ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL; >> + >> + if (unlikely(shader && shader->NumImages)) { >> + for (unsigned j = 0; j < shader->NumImages; j++) { >> +struct gl_image_unit *u = >> &ctx->ImageUnits[shader->ImageUnits[j]]; >> +tex_obj = intel_texture_object(u->TexObj); >> + >> +if (tex_obj && tex_obj->mt) { >> + intel_miptree_resolve_color(brw, tex_obj->mt); >> + brw_render_cache_set_check_flush(brw, tex_obj->mt->bo); >> +} >> + } >> + } >> + } >> + >> _mesa_lock_context_textures(ctx); >> } >> >> -- >> 2.5.0
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On 9 December 2015 at 05:37, Jonathan Gray wrote: > Change the __m128i variables to be volatile so gcc 4.9 won't optimise > all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls > still get optimised out but now there is at least one SSE4.1 instruction > generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions > got optimised out the configure test would incorrectly pass when the > compiler supported the intrinsics and the assembler didn't support the > instructions. > Must admit that I was not expecting that one. Looks like pixman (the inspiration for this check) is missing volatile as well. Does that one build/run fine on OpenBSD ? > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 > Signed-off-by: Jonathan Gray > Cc: "11.0 11.1" Reviewed-by: Emil Velikov I'll pick this in a couple of days (barring any objections). Thanks Emil
Re: [Mesa-dev] [PATCH 2/8] glsl/fp64: add helper for dual slot double detection.
On Wed, Dec 9, 2015 at 8:06 AM, Dave Airlie wrote: > From: Dave Airlie > > The old function didn't work for matrices, and we need this > in other places to fix some other problems, so move to a helper > in glsl type and fix the one user so far. > > A dual slot double is one that has 3 or 4 components in its > base type. > > Signed-off-by: Dave Airlie > --- > src/glsl/ir_set_program_inouts.cpp | 10 +- > src/glsl/nir/glsl_types.h | 8 > 2 files changed, 9 insertions(+), 9 deletions(-) > > diff --git a/src/glsl/ir_set_program_inouts.cpp > b/src/glsl/ir_set_program_inouts.cpp > index 70d754f..782f1b1 100644 > --- a/src/glsl/ir_set_program_inouts.cpp > +++ b/src/glsl/ir_set_program_inouts.cpp > @@ -81,13 +81,6 @@ is_shader_inout(ir_variable *var) >var->data.mode == ir_var_system_value; > } > > -static inline bool > -is_dual_slot(ir_variable *var) > -{ > - const glsl_type *type = var->type->without_array(); > - return type == glsl_type::dvec4_type || type == glsl_type::dvec3_type; > -} > - > static void > mark(struct gl_program *prog, ir_variable *var, int offset, int len, > gl_shader_stage stage) > @@ -101,7 +94,6 @@ mark(struct gl_program *prog, ir_variable *var, int > offset, int len, > */ > > for (int i = 0; i < len; i++) { > - bool dual_slot = is_dual_slot(var); >int idx = var->data.location + var->data.index + offset + i; >bool is_patch_generic = var->data.patch && >idx != VARYING_SLOT_TESS_LEVEL_INNER && > @@ -123,7 +115,7 @@ mark(struct gl_program *prog, ir_variable *var, int > offset, int len, > else > prog->InputsRead |= bitfield; > > - if (dual_slot) > + if (var->type->without_array()->is_dual_slot_double()) > prog->DoubleInputsRead |= bitfield; > if (stage == MESA_SHADER_FRAGMENT) { > gl_fragment_program *fprog = (gl_fragment_program *) prog; > diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h > index d8a999a..26f25a1 100644 > --- a/src/glsl/nir/glsl_types.h > +++ b/src/glsl/nir/glsl_types.h > @@ -471,6 +471,14 @@ struct glsl_type { > } > >
/** > +* Query whether a double takes two slots. > +*/ > + bool is_dual_slot_double() const > + { > + return base_type == GLSL_TYPE_DOUBLE && vector_elements > 2; > + } > + > + /** > * Query whether or not a type is a non-array boolean type > */ > bool is_boolean() const > -- > 2.5.0 Reviewed-by: Oded Gabbay
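The helper in the quoted patch can be restated as a small standalone sketch. The enum and struct here (sketch_base_type, sketch_type) are hypothetical simplifications, not the real glsl_type; vector_elements is taken to mean the number of components in a vector or in one matrix column, which is presumably why the check also covers matrices where the old pointer comparison against dvec3/dvec4 did not.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of base types. */
enum sketch_base_type {
    SKETCH_TYPE_FLOAT,
    SKETCH_TYPE_DOUBLE
};

/* Hypothetical, reduced type description. */
struct sketch_type {
    enum sketch_base_type base_type;
    unsigned vector_elements; /* components per vector or matrix column */
};

/* A slot holds four 32-bit components. A double needs two of them, so
 * three or four doubles no longer fit in a single slot. */
static bool sketch_is_dual_slot_double(const struct sketch_type *t)
{
    return t->base_type == SKETCH_TYPE_DOUBLE && t->vector_elements > 2;
}
```

Under these assumptions a dvec2 stays in one slot, while a dvec3, a dvec4 and each column of a dmat3 or dmat4 take two.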
Re: [Mesa-dev] [PATCH 8/8] glsl: only divide left components when it is a dual slot double.
On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote: > From: Dave Airlie > > Signed-off-by: Dave Airlie > --- > src/glsl/lower_packed_varyings.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/glsl/lower_packed_varyings.cpp > b/src/glsl/lower_packed_varyings.cpp > index 037c27d..ec9af62 100644 > --- a/src/glsl/lower_packed_varyings.cpp > +++ b/src/glsl/lower_packed_varyings.cpp > @@ -472,7 +472,7 @@ > lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue, >char right_swizzle_name[4] = { 0, 0, 0, 0 }; > >left_components = 4 - fine_location % 4; > - if (rvalue->type->is_double()) { > + if (rvalue->type->is_dual_slot_double()) { The subject line says what the change is, but there is no explanation of why it was made. Can you add more detail to the commit message? To me the existing code *seems* correct, as all doubles take up twice the number of components. Why would we only divide by 2 when it's a dual slot double? > /* We might actually end up with 0 left components! */ > left_components /= 2; >} 
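The arithmetic being questioned here can be worked through with a small sketch. The function name left_components_for and its halve parameter are made up for illustration; which predicate should drive the halving (is_double or is_dual_slot_double) is exactly the open question in the review, so the sketch takes it as a plain flag rather than deciding it.

```c
#include <stdbool.h>

/* Number of components still free in the current four-component slot,
 * starting at fine_location. When halve is set the count is divided by
 * two, because each double occupies two 32-bit components. */
static unsigned left_components_for(unsigned fine_location, bool halve)
{
    unsigned left = 4 - fine_location % 4;

    if (halve)
        left /= 2; /* can legitimately become 0 */
    return left;
}
```

For example, at fine_location 5 there are 3 components left, which is room for one double; at fine_location 7 there is 1 component left, which is room for none, matching the "0 left components" comment in the quoted code.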
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On 9 December 2015 at 11:39, Oded Gabbay wrote: > On Wed, Dec 9, 2015 at 1:09 PM, Emil Velikov wrote: >> On 9 December 2015 at 05:37, Jonathan Gray wrote: >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise >>> all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls >>> still get optimised out but now there is at least one SSE4.1 instruction >>> generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions >>> got optimised out the configure test would incorrectly pass when the >>> compiler supported the intrinsics and the assembler didn't support the >>> instructions. >>> >> Must admit that I was not expecting that one. Looks like pixman (the >> inspiration for this check) is missing volatile as well. Does that one >> build/run fine on OpenBSD ? >> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 >>> Signed-off-by: Jonathan Gray >>> Cc: "11.0 11.1" >> Reviewed-by: Emil Velikov >> >> I'll pick this in a couple of days (barring any objections). >> >> Thanks >> Emil > > Adding pixman ML. > I must admit ignorance on this one. > I looked at configure.ac of pixman and I don't see any SSE4.1 > reference, and AFAIK, we don't use those instructions (only SSE2 and > SSSE3). > Is the above patch relevant for those as well ? because the tests in > configure.ac do *not* contain volatile. > True, there is no SSE4.1 detection in pixman, but the logic should still hold. Unless ... it is exclusive to SSE4.1, which I rather doubt. Unfortunately there is no easy way to get an older GCC on Arch, otherwise I would have tried it. The full patch for reference: http://patchwork.freedesktop.org/patch/67449/ -Emil
[Mesa-dev] [RFC PATCH 2/5] i965/fs: set execution size for SEND messages in generate_uniform_pull_constant_load_gen7
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 8d24883..36a7329 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1252,6 +1252,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst, brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); brw_set_default_mask_control(p, BRW_MASK_DISABLE); brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND); + brw_inst_set_exec_size(devinfo, send, dst.width); brw_pop_insn_state(p); brw_set_dest(p, send, dst); @@ -1283,6 +1284,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst, /* dst = send(payload, a0.0 | ) */ brw_inst *insn = brw_send_indirect_message( p, BRW_SFID_SAMPLER, dst, src, addr); + brw_inst_set_exec_size(devinfo, insn, dst.width); brw_set_sampler_message(p, insn, 0, 0, /* LD message ignores sampler unit */ -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 4/5] i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel
--- src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 13c8c36..78f2c8c 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -3360,11 +3360,14 @@ brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst) /* Overwrite the destination without and with execution masking to * find out which of the channels is active. */ + brw_push_insn_state(p); + brw_set_default_exec_size(p, BRW_EXECUTE_4); brw_MOV(p, brw_writemask(vec4(dst), WRITEMASK_X), brw_imm_ud(1)); inst = brw_MOV(p, brw_writemask(vec4(dst), WRITEMASK_X), brw_imm_ud(0)); + brw_pop_insn_state(p); brw_inst_set_mask_control(devinfo, inst, BRW_MASK_ENABLE); } } -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/10] gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager
On Dec 8, 2015 10:08 PM, "Nicolai Hähnle"wrote: > > On 06.12.2015 19:00, Marek Olšák wrote: >> >> From: Marek Olšák >> >> This simplified (basically duplicated) version of pb_cache_manager will >> allow removing some ugly hacks from radeon and amdgpu winsyses and >> flatten simplify their design. >> >> The difference is that winsyses must manually add buffers to the cache >> in "destroy" functions and the cache doesn't know about the buffers before >> that. The integration is therefore trivial and the impact on the winsys >> design is negligible. >> --- >> src/gallium/auxiliary/Makefile.sources | 1 + >> src/gallium/auxiliary/pipebuffer/pb_cache.c | 286 >> src/gallium/auxiliary/pipebuffer/pb_cache.h | 74 +++ >> 3 files changed, 361 insertions(+) >> create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.c >> create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.h >> >> diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources >> index 6160192..817308d 100644 >> --- a/src/gallium/auxiliary/Makefile.sources >> +++ b/src/gallium/auxiliary/Makefile.sources >> @@ -93,6 +93,7 @@ C_SOURCES := \ >> pipebuffer/pb_bufmgr_ondemand.c \ >> pipebuffer/pb_bufmgr_pool.c \ >> pipebuffer/pb_bufmgr_slab.c \ >> + pipebuffer/pb_cache.c \ > > > I believe pb_cache.h needs to be added as well. > > >> pipebuffer/pb_validate.c \ >> pipebuffer/pb_validate.h \ >> postprocess/filters.h \ >> diff --git a/src/gallium/auxiliary/pipebuffer/pb_cache.c b/src/gallium/auxiliary/pipebuffer/pb_cache.c >> new file mode 100644 >> index 000..45f600d >> --- /dev/null >> +++ b/src/gallium/auxiliary/pipebuffer/pb_cache.c > > ... 
> >> +/** >> + * \return 1 if compatible and can be reclaimed >> + * 0 if incompatible >> + *-1 if compatible and can't be reclaimed >> + */ >> +static int >> +pb_cache_is_buffer_compat(struct pb_cache_entry *entry, >> + pb_size size, unsigned alignment, unsigned usage) >> +{ >> + struct pb_buffer *buf = entry->buffer; >> + >> + if (usage & entry->mgr->bypass_usage) >> + return 0; > > > It should be possible to move this test to the top of pb_cache_reclaim_buffer, right? I don't know, maybe. I just copied the code as-is and I did notice the bypass_usage documentation doesn't match the code very well. I think VMware people added the flag, so I'll leave any possible cleanup to them to avoid the risk of breaking their driver. The flag can also be moved to the caller of pb_cache_reclaim_buffer. Marek > > >> + if (buf->size < size) >> + return 0; >> + >> + /* be lenient with size */ >> + if (buf->size > (unsigned) (entry->mgr->size_factor * size)) >> + return 0; >> + >> + if (!pb_check_alignment(alignment, buf->alignment)) >> + return 0; >> + >> + if (!pb_check_usage(usage, buf->usage)) >> + return 0; >> + >> + return entry->mgr->can_reclaim(buf) ? 1 : -1; >> +} >> + >> +/** >> + * Find a compatible buffer in the cache, return it, and remove it >> + * from the cache. 
>> + */ >> +struct pb_buffer * >> +pb_cache_reclaim_buffer(struct pb_cache *mgr, pb_size size, >> +unsigned alignment, unsigned usage) >> +{ >> + struct pb_cache_entry *entry; >> + struct pb_cache_entry *cur_entry; >> + struct list_head *cur, *next; >> + int64_t now; >> + int ret = 0; >> + >> + pipe_mutex_lock(mgr->mutex); >> + >> + entry = NULL; >> + cur = mgr->cache.next; >> + next = cur->next; >> + >> + /* search in the expired buffers, freeing them in the process */ >> + now = os_time_get(); >> + while (cur != &mgr->cache) { >> + cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head); >> + >> + if (!entry && (ret = pb_cache_is_buffer_compat(cur_entry, size, >> + alignment, usage) > 0)) >> + entry = cur_entry; >> + else if (os_time_timeout(cur_entry->start, cur_entry->end, now)) >> + destroy_buffer_locked(cur_entry); >> + else >> + /* This buffer (and all hereafter) are still hot in cache */ >> + break; >> + >> + /* the buffer is busy (and probably all remaining ones too) */ >> + if (ret == -1) >> + break; >> + >> + cur = next; >> + next = cur->next; >> + } >> + >> + /* keep searching in the hot buffers */ >> + if (!entry && ret != -1) { >> + while (cur != &mgr->cache) { >> + cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head); >> + ret = pb_cache_is_buffer_compat(cur_entry, size, alignment, usage); >> + >> + if (ret > 0) { >> +entry = cur_entry; >> +break; >> + } >> + if (ret == -1) >> +break; >> + /* no need to check the timeout here */ >> + cur = next; >> + next = cur->next; >> + } >> +
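One detail of the quoted loop that may be worth a second look is how C parses an expression of the shape `ret = f(...) > 0`: the comparison binds tighter than the assignment, so ret receives the result of the comparison (0 or 1), not the three-valued return of the function. The sketch below only demonstrates that language rule; compat_result is a hypothetical stand-in for pb_cache_is_buffer_compat, and whether the behaviour matters in the actual cache code is left open here.

```c
/* Hypothetical stand-in for a check that returns 1, 0 or -1. */
static int compat_result(int v)
{
    return v;
}

/* Same shape as `(ret = f(...) > 0)`: ret gets the comparison result. */
static int ret_without_inner_parens(int v)
{
    int ret;

    ret = compat_result(v) > 0;
    return ret;
}

/* With the assignment parenthesised, ret keeps the original value and
 * the comparison is applied afterwards. */
static int ret_with_inner_parens(int v)
{
    int ret;
    int is_positive;

    is_positive = (ret = compat_result(v)) > 0;
    (void)is_positive;
    return ret;
}
```

In the first form a return value of -1 is stored as 0, so a later `ret == -1` test cannot be true for a value assigned that way.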
Re: [Mesa-dev] [PATCH v4 15/44] i965: Work around L3 state leaks during context switches.
Jordan Justenwrites: > On 2015-12-08 08:43:53, Francisco Jerez wrote: >> This is going to require some rather intrusive kernel changes to fix >> properly, in the meantime (and forever on at least pre-v4.1 kernels) >> we'll have to restore the hardware defaults at the end of every batch >> in which the L3 configuration was changed to avoid interfering with >> the DDX and GL clients that use an older non-L3-aware version of Mesa. >> >> Reviewed-by: Samuel Iglesias Gonsálvez >> Reviewed-by: Kristian Høgsberg >> >> v4: Optimize look-up of the default configuration by assuming it's the >> first entry of the L3 config array in order to avoid an FPS >> regression in GpuTest Triangle and SynMark OglBatch2-7 on most >> affected platforms. >> --- >> src/mesa/drivers/dri/i965/brw_state.h | 4 +++ >> src/mesa/drivers/dri/i965/gen7_l3_state.c | 51 >> +++ >> src/mesa/drivers/dri/i965/intel_batchbuffer.c | 7 >> src/mesa/drivers/dri/i965/intel_batchbuffer.h | 6 +++- >> 4 files changed, 67 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_state.h >> b/src/mesa/drivers/dri/i965/brw_state.h >> index 49f301a..b7c0039 100644 >> --- a/src/mesa/drivers/dri/i965/brw_state.h >> +++ b/src/mesa/drivers/dri/i965/brw_state.h >> @@ -380,6 +380,10 @@ void gen7_update_binding_table_from_array(struct >> brw_context *brw, >> void gen7_disable_hw_binding_tables(struct brw_context *brw); >> void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw); >> >> +/* gen7_l3_state.c */ >> +void >> +gen7_restore_default_l3_config(struct brw_context *brw); >> + >> #ifdef __cplusplus >> } >> #endif >> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c >> b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> index 7956935..7fa7336 100644 >> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c >> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> @@ -520,3 +520,54 @@ const struct brw_tracked_state gen7_l3_state = { >> }, >> .emit = emit_l3_state >> }; >> + >> +/** >> + * Hack to restore the 
default L3 configuration. >> + * >> + * This will be called at the end of every batch in order to reset the L3 >> + * configuration to the default values for the time being until the kernel >> is >> + * fixed. Until kernel commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b >> + * (included in v4.1) we would set the MI_RESTORE_INHIBIT bit when >> submitting >> + * batch buffers for the default context used by the DDX, which meant that >> any >> + * context state changed by the GL would leak into the DDX, the assumption >> + * being that the DDX would initialize any state it cares about manually. >> The >> + * DDX is however not careful enough to program an L3 configuration >> + * explicitly, and it makes assumptions about it (URB size) which won't hold >> + * and cause it to misrender if we let our L3 set-up to leak into the DDX. >> + * >> + * Since v4.1 of the Linux kernel the default context is saved and restored >> + * normally, so it's far less likely for our L3 programming to interfere >> with >> + * other contexts -- In fact restoring the default L3 configuration at the >> end >> + * of the batch will be redundant most of the time. A kind of state leak is >> + * still possible though if the context making assumptions about L3 state is >> + * created immediately after our context was active (e.g. without the DDX >> + * default context being scheduled in between) because at present the DRM >> + * doesn't fully initialize the contents of newly created contexts and >> instead >> + * sets the MI_RESTORE_INHIBIT flag causing it to inherit the state from the >> + * last active context. 
>> + * >> + * It's possible to realize such a scenario if, say, an X server (or a GL >> + * application using an outdated non-L3-aware Mesa version) is started while >> + * another GL application is running and happens to have modified the L3 >> + * configuration, or if no X server is running at all and a GL application >> + * using a non-L3-aware Mesa version is started after another GL application >> + * ran and modified the L3 configuration -- The latter situation can >> actually >> + * be reproduced easily on IVB in our CI system. >> + */ >> +void >> +gen7_restore_default_l3_config(struct brw_context *brw) >> +{ >> + const struct brw_device_info *devinfo = brw->intelScreen->devinfo; >> + /* For efficiency assume that the first entry of the array matches the >> +* default configuration. >> +*/ >> + const struct brw_l3_config *const cfg = get_l3_configs(devinfo); >> + assert(cfg == get_l3_config(devinfo, >> + get_default_l3_weights(devinfo, false, >> false))); >> + >> + if (cfg != brw->l3.config && brw->can_do_pipelined_register_writes) { >> + setup_l3_config(brw, cfg); >> + update_urb_size(brw, cfg); >> + brw->l3.config = cfg; >> + } >>
[Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash
From: Marek Olšák I guess the crash was because of division by zero. Cc: 11.0 11.1 --- src/gallium/auxiliary/util/u_prim.h | 17 + src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/util/u_prim.h b/src/gallium/auxiliary/util/u_prim.h index 3668015..4926af6 100644 --- a/src/gallium/auxiliary/util/u_prim.h +++ b/src/gallium/auxiliary/util/u_prim.h @@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim) * For polygons, return the number of triangles. */ static inline unsigned -u_prims_for_vertices(unsigned prim, unsigned num) +u_prims_for_vertices(unsigned prim, unsigned num, unsigned vertices_per_patch) { - const struct u_prim_vertex_count *info = u_prim_vertex_count(prim); + struct u_prim_vertex_count info; - if (num < info->min) + if (prim == PIPE_PRIM_PATCHES) + info.min = info.incr = vertices_per_patch; + else if (prim < PIPE_PRIM_MAX) + info = *u_prim_vertex_count(prim); + else { + assert(!"invalid prim type"); + return 0; + } + + if (num < info.min) return 0; - return 1 + ((num - info->min) / info->incr); + return 1 + ((num - info.min) / info.incr); } static inline boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr ) diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index ee84a1f..4ac9d0a 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx, if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi && (info->indirect || (info->instance_count > 1 && - u_prims_for_vertices(info->mode, info->count) <= 1))) + u_prims_for_vertices(info->mode, info->count, + info->vertices_per_patch) <= 1))) sctx->b.flags |= SI_CONTEXT_VGT_FLUSH; return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) | -- 2.1.4
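The primitive-count formula in the patch above, 1 + (num - min) / incr, can be tried out in isolation with a short sketch. The names sketch_prim_count, sketch_patch_count and sketch_prims_for_vertices are hypothetical and the min/incr pairs used with them are chosen for illustration. The explicit incr == 0 check is an addition of this sketch, not something the quoted patch does; it is there because a patch size of zero would otherwise still divide by zero.

```c
/* Minimum vertex count for one primitive, and vertices per extra one. */
struct sketch_prim_count {
    unsigned min;
    unsigned incr;
};

/* Patches: every primitive consumes exactly vertices_per_patch vertices. */
static struct sketch_prim_count sketch_patch_count(unsigned vertices_per_patch)
{
    struct sketch_prim_count info = { vertices_per_patch, vertices_per_patch };
    return info;
}

/* Number of primitives that num vertices produce. Returns 0 rather than
 * dividing when incr is 0 (extra guard, an assumption of this sketch). */
static unsigned sketch_prims_for_vertices(struct sketch_prim_count info,
                                          unsigned num)
{
    if (info.incr == 0 || num < info.min)
        return 0;

    return 1 + (num - info.min) / info.incr;
}
```

With min = 3 and incr = 3 (a triangle list), 6 vertices give 2 primitives; with min = 3 and incr = 1 (a triangle strip), 5 vertices give 3; with 4-vertex patches, 8 vertices give 2.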
Re: [Mesa-dev] [PATCH shader-db 1/3] split-to-files: deal with minimum versions, other shader types
Hi, On 11/09/2015 08:47 PM, Matt Turner wrote: On Mon, Nov 9, 2015 at 10:46 AM, Ilia Mirkin wrote: I used this script in conjunction with ST_DUMP_SHADERS. What other way is there? Some local hack and we should probably finish and upstream. Did anything happen with this? I had to rewrite split-to-files because it didn't output all (ARB) shaders and it picks the wrong one [1] when application re-uses same program numbers. [1] It picked first one, although I think almost always the last one will be most interesting. Last one will also allow easily dumping different shader sets (that re-use same program numbers) from the same application. RFC patches for my changes are attached. - Eero From c52f02ff664af269fa5268627624fe94c647ad37 Mon Sep 17 00:00:00 2001 From: Eero Tamminen Date: Wed, 9 Dec 2015 16:29:43 +0200 Subject: [PATCH 1/2] Rewrite split-to-files.py to fix it This rewrite improves on previous version in following ways: * Improve recognition of shader end. * Remove extra lines after shader ends (in normal shaders anything after last line with '}' that closes main(), and in ARB shaders, lines after END). * Optimize parsing by using compiled regexps. * Calculate (md5) hashes for normalized (single line comments and white space removed) shader contents and identify duplicate shaders with those * If program gets a new shader, output the latest one. It should be more relevant one. It also allows dumping different shader sets e.g. shaders for startup / game menu vs. actual game play, just by running application further before killing it. * When application replaces ARB shaders, continue instead of claiming to be done & exiting. Same program numbers can be used if application removes previous programs. * Tell user which shaders were duplicates and which were replaced by which shaders. * Remove duplicate programs based on shader stage hashes (of their normalized sources) and tell user about this. * Output shader stage sources in 3D pipeline order. 
* Give ARB shaders different file name from normal shaders. --- split-to-files.py | 409 +++-- 1 file changed, 306 insertions(+), 103 deletions(-) diff --git a/split-to-files.py b/split-to-files.py index 151681e..7150622 100755 --- a/split-to-files.py +++ b/split-to-files.py @@ -2,122 +2,300 @@ import re import os +import hashlib import argparse +class ShaderBase: +def __init__(self, prog, stage): +self.lines = [] +self.progid = prog # latest +self.programs = {self.progid: True} # all +self.stage = stage +self.hash = None +self.hashed_len = 0 +self.done = False +self.replaced = False +# filled by subclasses +self.shadernum = 0 +self.req_start = None +self.req_end = None +self.warn = None + +def append_line(self, line): +assert not self.done +self.lines.append(line) + +def is_finished(self, line): +return False + +def get_source(self): +assert self.done +return "\n".join(self.lines) + "\n" + +def add_program(self, dup): +self.programs[dup.progid] = True + +def del_program(self, dup): +del(self.programs[dup.progid]) + +def get_hash(self): +if self.hash: +return self.hash +assert self.done + +# source without single line comments & whitespace +normalized = [] +for line in self.lines: +offset = line.find("//") +if offset >= 0: +line = line[:offset] +# Python2: line = line.translate(None, " \t") +# Python3: the translation table is keyed by code point +line = line.translate({ord(' '): None, ord('\t'): None}) +if line: +normalized.append(line) +normalized = "".join(normalized).encode() + +# create hash for normalized source +md5 = hashlib.md5() +self.hashed_len = len(normalized) +md5.update(normalized) +self.hash = md5.hexdigest() +return self.hash + +def check_conflict(self, dup): +assert self.done and dup.done +if self.hash == dup.hash and self.hashed_len != dup.hashed_len: +print("ERROR: hash collision with %s" % dup.get_info()) +exit(-1) +if self.stage != dup.stage: +# same shader for different stage, this isn't handled correctly at the moment +# all code assumes that each hash/shader represents just one shader stage +print("ERROR: 
duplicate is for different shader stage (%s)" % dup.stage) +exit(-1) + +def get_info(self): +assert None # must be subclassed + +def show_info(self): +print(self.get_info()) + + +class ShaderARB(ShaderBase): +def __init__(self, prog, stage): +ShaderBase.__init__(self, prog, stage) +self.progid = "%s-ARB_%s" %
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888 --- Comment #18 from Daniel Stone --- How about 'hooray, it's fixed'? :)
[Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin
Removes the public symbol _glapi_create_table_from_handle from libGL.so.1 on all plattforms except Darwin. Since the symbol is not used on other plattforms it makes sense to build gl_gentable.c only on Darwin. A little bit of history: _glapi_create_table_from_handle was introduced in commit 85937f4c0d4a78d3a11e3c1fa6148640f2a9ad7b Author: Jeremy HuddlestonDate: Thu Jun 9 16:59:49 2011 -0700 glapi: Add API that can create a _glapi_table from a dlfcn handle Example usage: void *handle = dlopen(opengl_library_path, RTLD_LOCAL); struct _glapi_table *disp = _glapi_create_table_from_handle(handle, "gl"); Signed-off-by: Jeremy Huddleston and the only user in mesa was added in commit f35913b96e743c5014e99220b1a1c5532a894d69 Author: Jeremy Huddleston Date: Thu Jun 9 17:29:51 2011 -0700 apple: Use _glapi_create_table_from_handle to initialize our dispatch table Signed-off-by: Jeremy Huddleston gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14. Cc: Jeremy Huddleston Signed-off-by: Andreas Boll --- XXX If we still want to distribute gl_gentable.c in the release tarball we could drop the changes in src/mapi/glapi/gen/Makefile.am src/mapi/Makefile.am | 6 +- src/mapi/glapi/gen/Makefile.am | 12 +--- src/mapi/glapi/glapi.h | 2 ++ 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am index 307e05d..ddd3daa 100644 --- a/src/mapi/Makefile.am +++ b/src/mapi/Makefile.am @@ -106,12 +106,16 @@ if HAVE_SPARC_ASM GLAPI_ASM_SOURCES = glapi/glapi_sparc.S endif -glapi_libglapi_la_SOURCES = glapi/glapi_gentable.c +glapi_libglapi_la_SOURCES = glapi_libglapi_la_CPPFLAGS = \ $(AM_CPPFLAGS) \ -I$(top_srcdir)/src/mapi/glapi \ -I$(top_srcdir)/src/mesa +if HAVE_APPLEDRI +glapi_libglapi_la_SOURCES += glapi/glapi_gentable.c +endif + if HAVE_SHARED_GLAPI glapi_libglapi_la_SOURCES += $(MAPI_BRIDGE_FILES) glapi/glapi_mapi_tmp.h glapi_libglapi_la_CPPFLAGS += \ diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index 
2da8f7d..25ea44a 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \ $(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \ $(MESA_GLAPI_DIR)/glprocs.h \ $(MESA_GLAPI_DIR)/glapitemp.h \ - $(MESA_GLAPI_DIR)/glapitable.h \ - $(MESA_GLAPI_DIR)/glapi_gentable.c + $(MESA_GLAPI_DIR)/glapitable.h + +if HAVE_APPLEDRI +MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c +endif MESA_GLAPI_ASM_OUTPUTS = if HAVE_X86_ASM @@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx XORG_GLAPI_OUTPUTS = \ $(XORG_GLAPI_DIR)/glprocs.h \ $(XORG_GLAPI_DIR)/glapitable.h \ - $(XORG_GLAPI_DIR)/dispatch.h \ + $(XORG_GLAPI_DIR)/dispatch.h + +if HAVE_APPLEDRI $(XORG_GLAPI_DIR)/glapi_gentable.c +endif XORG_OUTPUTS = \ $(XORG_GLAPI_OUTPUTS) \ diff --git a/src/mapi/glapi/glapi.h b/src/mapi/glapi/glapi.h index f269b17..3593c88 100644 --- a/src/mapi/glapi/glapi.h +++ b/src/mapi/glapi/glapi.h @@ -158,8 +158,10 @@ _GLAPI_EXPORT const char * _glapi_get_proc_name(unsigned int offset); +#ifdef GLX_USE_APPLEGL _GLAPI_EXPORT struct _glapi_table * _glapi_create_table_from_handle(void *handle, const char *symbol_prefix); +#endif _GLAPI_EXPORT void -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound
Emil Velikov writes:

> Worth throwing in 11.0 as well ?

Yeah, that would probably be sensible.

>>    if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
>> -     dri2InvalidateDrawable(driContext->driDrawablePriv);
>> -     dri2InvalidateDrawable(driContext->driReadablePriv);
>> +     if (driContext->driDrawablePriv)
>> +        dri2InvalidateDrawable(driContext->driDrawablePriv);
>> +     if (driContext->driReadablePriv)
>> +        dri2InvalidateDrawable(driContext->driReadablePriv);
>
> Afaict i915 could use an identical fix ?

Yes, I think you're right. However I don't have any way of testing it so
I feel a bit uncomfortable touching i915 driver.

Regards,
- Neil
Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.
On Dec 9, 2015 2:51 AM, "Kenneth Graunke" wrote:
>
> My tessellation branch has two additional remap functions. I don't want
> to replicate this logic there.
>
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c | 78 -
>  1 file changed, 50 insertions(+), 28 deletions(-)
>
> Hey Jason,
>
> If you like this patch, and haven't yet merged your NIR input reworks,
> feel free to just squash it into your changes. Or, we can land it
> separately after your changes. It's up to you.
>
> Separating this out allows me to reuse this in my new tessellation input
> and output remapping functions, and also means we don't need to add structs
> for the remap functions...we can just pass the builder, or inputs_read, or
> the VUE map...and not have to pack multiple things together.

Sure. It does make things simpler.

> --Ken
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index 14ad172..105a175 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -27,15 +27,19 @@
>  #include "glsl/nir/nir_builder.h"
>  #include "program/prog_to_nir.h"
>
> -struct remap_vs_attrs_state {
> -   nir_builder b;
> -   uint64_t inputs_read;
> -};
> -
> +/**
> + * In many cases, we just add the base and offset together, so there's no
> + * reason to keep them separate. Sometimes, combining them is essential:
> + * if a shader only accesses part of a compound variable (such as a matrix
> + * or array), the variable's base may not actually exist in the VUE map.
> + *
> + * This pass adds constant offsets to instr->const_index[0], and resets
> + * the offset source to 0. Non-constant offsets remain unchanged.
> + */
>  static bool
> -remap_vs_attrs(nir_block *block, void *void_state)
> +add_const_offset_to_base(nir_block *block, void *closure)
>  {
> -   struct remap_vs_attrs_state *state = void_state;
> +   nir_builder *b = closure;
>
>     nir_foreach_instr_safe(block, instr) {
>        if (instr->type != nir_instr_type_intrinsic)
> @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
>
>        nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>
> +      if (intrin->intrinsic == nir_intrinsic_load_input ||
> +          intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
> +          intrin->intrinsic == nir_intrinsic_load_output ||
> +          intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
> +          intrin->intrinsic == nir_intrinsic_store_output ||
> +          intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {

This seems a bit scorched-earth. It would be nice if the caller had a bit
more control.

> +         nir_src *offset = nir_get_io_offset_src(intrin);
> +         nir_const_value *const_offset = nir_src_as_const_value(*offset);
> +
> +         if (const_offset) {
> +            intrin->const_index[0] += const_offset->u[0];
> +            b->cursor = nir_before_instr(&intrin->instr);
> +            nir_instr_rewrite_src(&intrin->instr, offset,
> +                                  nir_src_for_ssa(nir_imm_int(b, 0)));
> +         }

Else??? It seems that you don't want to run this pass if you think you'll
ever hit an indirect. I guess it's harmless to just do this for all direct
things in our driver, but it doesn't sit well.

> +      }
> +   }
> +   return true;
> +
> +}
> +
> +static bool
> +remap_vs_attrs(nir_block *block, void *closure)
> +{
> +   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
> +
> +   nir_foreach_instr(block, instr) {
> +      if (instr->type != nir_instr_type_intrinsic)
> +         continue;
> +
> +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> +
>        if (intrin->intrinsic == nir_intrinsic_load_input) {
>           /* Attributes come in a contiguous block, ordered by their
>            * gl_vert_attrib value. That means we can compute the slot
>            * number for an attribute by masking out the enabled attributes
>            * before it and counting the bits.
>            */
> -         nir_const_value *const_offset = nir_src_as_const_value(intrin->src[0]);
> -
> -         /* We set EmitNoIndirect for VS inputs, so there are no indirects. */
> -         assert(const_offset);
> -
> -         int attr = intrin->const_index[0] + const_offset->u[0];
> -         int slot = _mesa_bitcount_64(state->inputs_read &
> -                                      BITFIELD64_MASK(attr));
> +         int attr = intrin->const_index[0];
> +         int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr));
>
> -         /* The NIR -> FS pass will just add the base and offset together, so
> -          * there's no reason to keep them separate. Just put it all in
> -          * const_index[0] and set the offset src[0] to load_const(0).
> -          */
>           intrin->const_index[0] = 4 * slot;
> -
> -         state->b.cursor =
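The slot computation described in the quoted comment (mask out the enabled attributes below the one of interest, then count the bits) can be sketched on its own. This is a minimal illustration, not the driver code: `low_bits_mask`, `popcount64`, `attr_to_slot` and `remapped_base` are hypothetical stand-ins for `BITFIELD64_MASK`, `_mesa_bitcount_64` and the remap step, and the factor of 4 is simply copied from the `4 * slot` assignment in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (hypothetical stand-in for
 * BITFIELD64_MASK). */
static uint64_t low_bits_mask(unsigned bits)
{
    assert(bits <= 64);
    return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Portable population count (hypothetical stand-in for
 * _mesa_bitcount_64): clears the lowest set bit on every iteration. */
static unsigned popcount64(uint64_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Slot of attribute `attr`: the number of enabled attributes that sort
 * before it in the inputs_read bitfield. */
static unsigned attr_to_slot(uint64_t inputs_read, unsigned attr)
{
    return popcount64(inputs_read & low_bits_mask(attr));
}

/* Base written back to const_index[0]; the multiplier of 4 mirrors the
 * patch and is not derived here. */
static unsigned remapped_base(uint64_t inputs_read, unsigned attr)
{
    return 4 * attr_to_slot(inputs_read, attr);
}
```

With attributes 0, 3 and 5 enabled (`inputs_read == 0x29`), attribute 5 has two enabled attributes below it, so it lands in slot 2 and its remapped base is 8.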
Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" wrote:
>
> This code in brw_set_dest adjusts the execution size of any instruction
> with a dst.width < 8. However, we don't want to do this with instructions
> operating on doubles, since these will have a width of 4, but still
> need an execution size of 8 (for SIMD8). Unfortunately, we can't just check
> the size of the operands involved to detect if we are doing an operation on
> doubles, because we can have instructions that do operations on double
> operands interpreted as UD, operating on any of its 2 32-bit components.
>
> Previous commits have made it so we never emit instructions with a horizontal
> width of 4 that don't have the correct execution size set for gen7/gen8, so
> we can skip it in this case, avoiding the conflicts with fp64 requirements.
>
> Expanding the same fix to other hardware generations requires many more
> changes but since we are not targeting fp64 support on them
> we don't really care for now.
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 78f2c8c..50a8771 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
>     /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
>      * or 16 (SIMD16), as that's normally correct. However, when dealing with
>      * small registers, we automatically reduce it to match the register size.
> +    *
> +    * In platforms that support fp64 we can emit instructions with a width of
> +    * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
> +    * cases we need to make sure that these instructions have their exec sizes
> +    * set properly when they are emitted and we can't rely on this code to fix
> +    * it.
>      */
> -   if (dest.width < BRW_EXECUTE_8)
> +   bool fix_exec_size;
> +   if (devinfo->gen == 7 || devinfo->gen == 8)

If we're going to take this approach, we definitely want to make it gen > 6
or something so we include future gens. Really gen > 4 is probably doable
since the only real problem is the legacy clipping code.

> +      fix_exec_size = dest.width < BRW_EXECUTE_4;
> +   else
> +      fix_exec_size = dest.width < BRW_EXECUTE_8;
> +
> +   if (fix_exec_size)
>        brw_inst_set_exec_size(devinfo, inst, dest.width);
>  }
>
> --
> 2.1.4
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888 --- Comment #19 from nerdopol...@verizon.net --- I didn't see anything in the changelog for 'egl' that looked like it might be a fix... Not 100% sure though
[Mesa-dev] [PATCH 4/4] st/osmesa: Fix a typo in a comment
s/suport/support/

Signed-off-by: Andreas Boll
---
 src/gallium/state_trackers/osmesa/osmesa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/osmesa/osmesa.c b/src/gallium/state_trackers/osmesa/osmesa.c
index 0285cb0..0f27ba8 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -32,7 +32,7 @@
  * may be set to "softpipe" or "llvmpipe" to override.
  *
  * With softpipe we could render directly into the user's buffer by using a
- * display target resource. However, softpipe doesn't suport "upside-down"
+ * display target resource. However, softpipe doesn't support "upside-down"
  * rendering which would be needed for the OSMESA_Y_UP=TRUE case.
  *
  * With llvmpipe we could only render directly into the user's buffer when its
--
2.1.4
[Mesa-dev] [PATCH 3/4] meta: Fix a typo in a print message
s/Unkown/Unknown/

Signed-off-by: Andreas Boll
---
 src/mesa/drivers/common/meta_blit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta_blit.c b/src/mesa/drivers/common/meta_blit.c
index c5faf61..4dbf0a7 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -325,7 +325,7 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
          }
          break;
       default:
-         _mesa_problem(ctx, "Unkown texture target %s\n",
+         _mesa_problem(ctx, "Unknown texture target %s\n",
                        _mesa_enum_to_string(target));
          shader_index = BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE;
       }
--
2.1.4
Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports
On 09.12.2015 at 05:16, Edward O'Callaghan wrote:
> This fixes my initial attempt so that piglit now passes 14/14. Thanks
> to a couple of tips from Roland in the previous patch I was able to
> fix the remaining issue. This should be golden now.
>

Great that you got it working! Please send the patches to the ml.

Roland
Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound
On 9 December 2015 at 14:57, Neil Roberts wrote:
> Emil Velikov writes:
>
>> Worth throwing in 11.0 as well ?
>
> Yeah, that would probably be sensible.
>
>>>    if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
>>> -     dri2InvalidateDrawable(driContext->driDrawablePriv);
>>> -     dri2InvalidateDrawable(driContext->driReadablePriv);
>>> +     if (driContext->driDrawablePriv)
>>> +        dri2InvalidateDrawable(driContext->driDrawablePriv);
>>> +     if (driContext->driReadablePriv)
>>> +        dri2InvalidateDrawable(driContext->driReadablePriv);
>>
>> Afaict i915 could use an identical fix ?
>
> Yes, I think you're right. However I don't have any way of testing it so
> I feel a bit uncomfortable touching i915 driver.
>
Considering it's a null-deref fix, one can just move the check into
dri2InvalidateDrawable, which would spare going through i915, i965,
radeon... ;-)
Just throwing it out there; it's up to you which route you want to take.

-Emil
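The alternative floated in the reply above, putting the NULL check inside the invalidate helper so that every caller is covered, might look roughly like the sketch below. The types and names (`struct drawable`, `struct context`, `invalidate_drawable`, `viewport_changed`) are hypothetical stand-ins, and bumping a stamp counter is only an assumption about what invalidation does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DRI drawable; only a stamp is modelled. */
struct drawable {
    unsigned stamp;
};

/* Hypothetical stand-in for the driver context's drawable pointers. */
struct context {
    struct drawable *draw;
    struct drawable *read;
};

/* Invalidate helper that tolerates a missing drawable, so callers in
 * every driver get the NULL check for free. */
static void invalidate_drawable(struct drawable *d)
{
    if (d == NULL)
        return;
    d->stamp++;
}

/* Caller no longer needs per-pointer checks of its own. */
static void viewport_changed(struct context *ctx)
{
    assert(ctx != NULL);
    invalidate_drawable(ctx->draw);
    invalidate_drawable(ctx->read);
}
```

The trade-off is the one hinted at in the thread: a single check in the helper covers all drivers at once, at the cost of silently accepting NULL from callers that might have been expected to pass a valid drawable.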
[Mesa-dev] [PATCH 2/4] mesa: Fix typos in print messages
s/inconsistant/inconsistent/
s/occurences/occurrences/

Signed-off-by: Andreas Boll
---
 src/mesa/main/teximage.c          | 2 +-
 src/mesa/main/transformfeedback.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 60fc7cc..73b3318 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2028,7 +2028,7 @@ compressed_texture_error_check(struct gl_context *ctx, GLint dimensions,
        * if <imageSize> is not consistent with the format, dimensions, and
        * contents of the specified image.
        */
-      reason = "imageSize inconsistant with width/height/format";
+      reason = "imageSize inconsistent with width/height/format";
       error = GL_INVALID_VALUE;
       goto error;
    }
diff --git a/src/mesa/main/transformfeedback.c b/src/mesa/main/transformfeedback.c
index 103011c..976b268 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -861,7 +861,7 @@ _mesa_TransformFeedbackVaryings(GLuint program, GLsizei count,
          if (buffers > ctx->Const.MaxTransformFeedbackBuffers) {
             _mesa_error(ctx, GL_INVALID_OPERATION,
                         "glTransformFeedbackVaryings(too many gl_NextBuffer "
-                        "occurences)");
+                        "occurrences)");
             return;
          }
       } else {
--
2.1.4
Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
On Wed, Dec 9, 2015 at 11:18 AM, Deve wrote:
> This patch indeed seems to not have a sense. I just added it to the bug
> report as a suggestion that it works for me after this modification. Emil
> Velikov said that I should send it to the mailing list.
>
> Here is how it works in Supertuxkart:
> We create rtt with following parameters:
>
> DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8,
> GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8);
>
> Then, during rendering scene, we do:
>
> glEnable(GL_FRAMEBUFFER_SRGB);
> glBindFramebuffer(GL_FRAMEBUFFER, 0);

OK, so this is the "winsys" framebuffer (GL has some term for it,
sorry, I don't remember what it is... perhaps it's even winsys). This
is created based on parameters of your selected GLX visual. For
example, when I run glxinfo, I see (on nouveau; the list on intel will
be different but comparable):

480 GLX Visuals
    visual  x   bf lv rg d st  colorbuffer  sr ax dp st accumbuffer  ms  cav
  id dep cl sp  sz l  ci b ro  r  g  b  a F gb bf th cl  r  g  b  a ns b eat
0x021 24 tc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
0x022 24 dc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
...
0x343 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
0x344 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow
0x345 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
0x346 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow

Note how some of them have srgb, others don't (and have various
differences in their various other properties). EGL has something
similar I believe, but tbh I don't remember the specifics. That's the
mesaVis->sRGBCapable bit below. If you need an sRGB-capable visual,
are you sure you're picking one? This would be somewhere well before
any actual rendering takes place.

If you're not sure whether you need an sRGB-capable visual or not, try
making sure you pick a *non-srgb* visual in a working configuration
and see if it breaks.

Cheers,

  -ilia

> (...)
> render();
> (...)
> glDisable(GL_FRAMEBUFFER_SRGB);
>
> It looks that glEnable(GL_FRAMEBUFFER_SRGB) doesn't work anymore. It's
> because of following lines in intel_screen.c in intelCreateBuffer()
> function:
>
>    if (mesaVis->redBits == 5)
>       rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
>    else if (mesaVis->sRGBCapable)
>       rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
>    else if (mesaVis->alphaBits == 0)
>       rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
>    else {
>       rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
>       fb->Visual.sRGBCapable = true;
>    }
>
> Previously MESA_FORMAT_B8G8R8X8_UNORM was not available, and thus
> MESA_FORMAT_B8G8R8A8_UNORM was handled as last case (using
> MESA_FORMAT_B8G8R8A8_SRGB format). Now it uses MESA_FORMAT_B8G8R8X8_UNORM
> format.
>
> Any ideas how it should be handled?
>
> Regards,
> Deve
>
> On 09.12.2015 at 03:00, Ilia Mirkin wrote:
>
>> On Mon, Dec 7, 2015 at 5:32 PM, Dawid Gan wrote:
>>>
>>> This format has been added in commit:
>>> 28090b30dd6b5977de085f48c620574214b6b4ba
>>> But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format.
>>> It was causing the screen in Supertuxkart to be darker than expected,
>>> see:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=92759
>>>
>>> Cc: Boyan Ding
>>> Cc: "11.0 11.1"
>>> Fixes: 28090b30dd6 "i965: Add XRGB format to
>>> intel_screen_make_configs"
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
>>> ---
>>>  src/mesa/drivers/dri/i965/intel_screen.c | 9 +
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
>>> b/src/mesa/drivers/dri/i965/intel_screen.c
>>> index cc90efe..75d5a65 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_screen.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
>>> @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
>>>            stencil_bits[2] = 8;
>>>            num_depth_stencil_bits = 3;
>>>         }
>>> +      } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
>>> +         depth_bits[1] = 24;
>>> +         stencil_bits[1] = 0;
>>
>>
>> Why would you want depth without stencil when using BGRX? I don't see
>> how the two are connected... Are you sure you're picking the right
>> visual?
>>
>>   -ilia
>>
>
[Mesa-dev] [PATCH 2/2] mesa: Fix a typo in a comment
s/suports/supports/

Signed-off-by: Andreas Boll
---
 src/mesa/main/extensions.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions.h b/src/mesa/main/extensions.h
index 1615e1c..b5e0350 100644
--- a/src/mesa/main/extensions.h
+++ b/src/mesa/main/extensions.h
@@ -88,7 +88,7 @@ enum {
 };
 
 
-/** Checks if the context suports a user-facing extension */
+/** Checks if the context supports a user-facing extension */
 #define EXT(name_str, driver_cap, ...) \
 static inline bool \
 _mesa_has_##name_str(const struct gl_context *ctx) \
--
2.1.4
[Mesa-dev] [PATCH 1/2] glx: Fix a typo in a comment
s/suports/supports/

Signed-off-by: Andreas Boll
---

Found two more "suports" typos.
I could squash all patches together if that's preferred.

 src/glx/dri2_glx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 27ea952..651915a 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1289,7 +1289,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
       __glXEnableDirectExtension(&psc->base, "GLX_OML_sync_control");
    }
 
-   /* DRI2 suports SubBuffer through DRI2CopyRegion, so it's always
+   /* DRI2 supports SubBuffer through DRI2CopyRegion, so it's always
     * available.*/
    psp->copySubBuffer = dri2CopySubBuffer;
    __glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");
--
2.1.4
Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
This patch indeed seems to not have a sense. I just added it to the bug report as a suggestion that it works for me after this modification. Emil Velikov said that I should send it to the mailing list. Here is how it works in Supertuxkart: We create rtt with following parameters: DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8); Then, during rendering scene, we do: glEnable(GL_FRAMEBUFFER_SRGB); glBindFramebuffer(GL_FRAMEBUFFER, 0); (...) render(); (...) glDisable(GL_FRAMEBUFFER_SRGB); It looks that glEnable(GL_FRAMEBUFFER_SRGB) doesn't work anymore. It's because of following lines in intel_screen.c in intelCreateBuffer() function: if (mesaVis->redBits == 5) rgbFormat = MESA_FORMAT_B5G6R5_UNORM; else if (mesaVis->sRGBCapable) rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; else if (mesaVis->alphaBits == 0) rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM; else { rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; fb->Visual.sRGBCapable = true; } Previously MESA_FORMAT_B8G8R8X8_UNORM was not available, and thus MESA_FORMAT_B8G8R8A8_UNORM was handled as last case (using MESA_FORMAT_B8G8R8A8_SRGB format). Now it uses MESA_FORMAT_B8G8R8X8_UNORM format. Any ideas how it should be handled? Regards, Deve W dniu 09.12.2015 o 03:00, Ilia Mirkin pisze: On Mon, Dec 7, 2015 at 5:32 PM, Dawid Ganwrote: This format has been added in commit: 28090b30dd6b5977de085f48c620574214b6b4ba But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format. 
It was causing the screen in Supertuxkart to be darker than expected, see: https://bugs.freedesktop.org/show_bug.cgi?id=92759 Cc: Boyan Ding Cc: "11.0 11.1" Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 --- src/mesa/drivers/dri/i965/intel_screen.c | 9 + 1 file changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index cc90efe..75d5a65 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen) stencil_bits[2] = 8; num_depth_stencil_bits = 3; } + } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) { + depth_bits[1] = 24; + stencil_bits[1] = 0; Why would you want depth without stencil when using BGRX? I don't see how the two are connected... Are you sure you're picking the right visual? -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
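The format selection chain quoted from intelCreateBuffer() can be reduced to a small decision function to see why a visual without alpha now ends up on a non-sRGB format. This is a sketch under stated assumptions, not the driver code: the enum, `struct visual` and `choose_rgb_format` are hypothetical, and the out-parameter stands in for the `fb->Visual.sRGBCapable = true` assignment in the last branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MESA_FORMAT_* values in the thread. */
enum fb_format {
    FMT_B5G6R5_UNORM,
    FMT_B8G8R8X8_UNORM,
    FMT_B8G8R8A8_SRGB
};

/* Hypothetical subset of the visual properties the chain looks at. */
struct visual {
    int red_bits;
    int alpha_bits;
    bool srgb_capable;
};

/* Same branch order as the quoted snippet. *fb_srgb_capable is only
 * written in the final branch, mirroring the quoted assignment. */
static enum fb_format choose_rgb_format(const struct visual *vis,
                                        bool *fb_srgb_capable)
{
    assert(vis != NULL && fb_srgb_capable != NULL);

    if (vis->red_bits == 5)
        return FMT_B5G6R5_UNORM;
    if (vis->srgb_capable)
        return FMT_B8G8R8A8_SRGB;
    if (vis->alpha_bits == 0)
        return FMT_B8G8R8X8_UNORM;   /* the newly added case */

    *fb_srgb_capable = true;
    return FMT_B8G8R8A8_SRGB;
}
```

In this model a 24-bit visual with no alpha and no sRGB flag falls into the X8 UNORM branch, which would be consistent with the reported change in behaviour of glEnable(GL_FRAMEBUFFER_SRGB) on the window-system framebuffer; an sRGB-capable visual still takes the second branch.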
Re: [Mesa-dev] [PATCH] Fix locking of GLsync objects
On Wed, Dec 09, 2015 at 10:35:25AM +, Emil Velikov wrote:
>> I was told that it's easier for people to review my patch if it comes in via
>> email than being stuck in the bug tracker; FWIW, this is for bug 120238.
> Which bugtracker is this ? bugs.fd.o does not like the number
> mentioned. Please add the full URL to the commit message with a
> Bugzilla: tag.

I mixed up some bug numbers, sorry.

  https://bugs.freedesktop.org/show_bug.cgi?id=92757

>> GLsync objects had a race condition when used from multiple threads
>> (which is the main point of the extension, really); it could be
>> validated as a sync object at the beginning of the function, and then
>> deleted by another thread before use, causing crashes. Fix this by
>> changing all casts from GLsync to struct gl_sync_object to a new
>> function _mesa_get_sync() that validates and increases the refcount.
> Might be worth keeping _mesa_ref_sync_object(), even if it's an inline
> wrapper around the above. As things get a bit confusing - foo_get vs
> foo_unref.

What about _mesa_get_and_ref_sync()?

> Alternatively one could even throw the locking (+extra checks) into
> the validate, use it in _mesa_IsSync(), while using the ref/unref
> combo elsewhere and drop the "amount" argument from unref.

I agree the amount argument is a bit icky. :-)

> Please mention if this commit fixes a certain game/program.

The motivating program is unreleased for now, but it will be released
(under GPLv3) in February.

> Can you also add the following tag. This way it'll be harder to miss
> the patch when picking things for the stable branch(es).
>
> Cc: "11.0 11.1"

Sure. (I'll send an updated patch when we have agreement on the issues above.)

/* Steinar */
--
Homepage: https://www.sesse.net/
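The race described in the quoted commit message (validate the handle, then have another thread delete the object before it is used) goes away if validation and taking a reference happen inside one critical section. The sketch below illustrates that idea with hypothetical names (`struct sync_registry`, `get_and_ref_sync`, `unref_sync`) and a fixed-size table; it is not Mesa's data structure, and the real patch may differ in every detail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYNCS 8

/* Hypothetical sync object: only the reference count is modelled. */
struct sync_object {
    int refcount;
};

/* Hypothetical registry of live sync objects, guarded by one mutex. */
struct sync_registry {
    pthread_mutex_t mutex;
    struct sync_object *live[MAX_SYNCS];
};

static void registry_init(struct sync_registry *reg)
{
    pthread_mutex_init(&reg->mutex, NULL);
    for (size_t i = 0; i < MAX_SYNCS; i++)
        reg->live[i] = NULL;
}

/* Register a new object with an initial reference owned by the caller. */
static bool registry_add(struct sync_registry *reg, struct sync_object *obj)
{
    bool added = false;
    pthread_mutex_lock(&reg->mutex);
    for (size_t i = 0; i < MAX_SYNCS && !added; i++) {
        if (reg->live[i] == NULL) {
            obj->refcount = 1;
            reg->live[i] = obj;
            added = true;
        }
    }
    pthread_mutex_unlock(&reg->mutex);
    return added;
}

/* Validate the handle and take a reference under the same lock. The
 * handle is only compared, never dereferenced, until it is known to be
 * live. Returns NULL for unknown or already deleted handles. */
static struct sync_object *get_and_ref_sync(struct sync_registry *reg,
                                            const void *handle)
{
    struct sync_object *found = NULL;
    pthread_mutex_lock(&reg->mutex);
    for (size_t i = 0; i < MAX_SYNCS; i++) {
        if (reg->live[i] != NULL && (const void *)reg->live[i] == handle) {
            found = reg->live[i];
            found->refcount++;
            break;
        }
    }
    pthread_mutex_unlock(&reg->mutex);
    return found;
}

/* Drop one reference. Returns true when it was the last one; the object
 * is then removed from the registry and the caller may free it. */
static bool unref_sync(struct sync_registry *reg, struct sync_object *obj)
{
    bool last = false;
    pthread_mutex_lock(&reg->mutex);
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        for (size_t i = 0; i < MAX_SYNCS; i++)
            if (reg->live[i] == obj)
                reg->live[i] = NULL;
        last = true;
    }
    pthread_mutex_unlock(&reg->mutex);
    return last;
}
```

Each API entry point would then pair one `get_and_ref_sync()` with one `unref_sync()`, which also sidesteps the "amount" argument discussed in the thread, since every call drops exactly one reference.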
[Mesa-dev] [PATCH 1/4] glsl: Fix a typo in a comment
s/suports/supports/

Signed-off-by: Andreas Boll
---
 src/glsl/glsl_parser_extras.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 6bded3e..a4bda77 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -97,7 +97,7 @@ struct _mesa_glsl_parse_state {
     * supports the feature.
     *
     * \param required_glsl_es_version is the GLSL ES version that is required
-    * to support the feature, or 0 if no version of GLSL ES suports the
+    * to support the feature, or 0 if no version of GLSL ES supports the
     * feature.
     */
    bool is_version(unsigned required_glsl_version,
--
2.1.4
Re: [Mesa-dev] [PATCH 1/2] glx: Fix a typo in a comment
On 12/09/2015 09:29 AM, Andreas Boll wrote:
> s/suports/supports/
>
> Signed-off-by: Andreas Boll
> ---
>
> Found two more "suports" typos.
> I could squash all patches together if that's preferred.
>
>  src/glx/dri2_glx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> index 27ea952..651915a 100644
> --- a/src/glx/dri2_glx.c
> +++ b/src/glx/dri2_glx.c
> @@ -1289,7 +1289,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
>        __glXEnableDirectExtension(&psc->base, "GLX_OML_sync_control");
>     }
>
> -   /* DRI2 suports SubBuffer through DRI2CopyRegion, so it's always
> +   /* DRI2 supports SubBuffer through DRI2CopyRegion, so it's always
>      * available.*/
>     psp->copySubBuffer = dri2CopySubBuffer;
>     __glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");
>

For both, Reviewed-by: Brian Paul

I'd say you don't really have to get reviews for fixing comment typos
though.

-Brian
Re: [Mesa-dev] [PATCH 1/4] glsl: Fix a typo in a comment
On 12/09/2015 09:20 AM, Andreas Boll wrote:
> s/suports/supports/
>
> Signed-off-by: Andreas Boll
> ---
>  src/glsl/glsl_parser_extras.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
> index 6bded3e..a4bda77 100644
> --- a/src/glsl/glsl_parser_extras.h
> +++ b/src/glsl/glsl_parser_extras.h
> @@ -97,7 +97,7 @@ struct _mesa_glsl_parse_state {
>      * supports the feature.
>      *
>      * \param required_glsl_es_version is the GLSL ES version that is required
> -    * to support the feature, or 0 if no version of GLSL ES suports the
> +    * to support the feature, or 0 if no version of GLSL ES supports the
>      * feature.
>      */
>     bool is_version(unsigned required_glsl_version,
>

For all 4, Reviewed-by: Brian Paul
Re: [Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin
On 9 December 2015 at 14:11, Andreas Boll wrote:
> Removes the public symbol _glapi_create_table_from_handle from
> libGL.so.1 on all plattforms except Darwin.
>
typo -> platforms

> Since the symbol is not used on other plattforms it makes sense to
ditto

> build gl_gentable.c only on Darwin.
>
Ideally we'll keep the dispatch as close to identical across all
platforms (i.e. we'll nuke this), although for now this will do.

Out of curiosity is there any noticeable difference in the build times?

...
> XXX If we still want to distribute gl_gentable.c in the release tarball
> we could drop the changes in src/mapi/glapi/gen/Makefile.am
>
Yes please. We want to ship all the generated sources, regardless of
the platform they're used on.

...
> index 2da8f7d..25ea44a 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \
>         $(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \
>         $(MESA_GLAPI_DIR)/glprocs.h \
>         $(MESA_GLAPI_DIR)/glapitemp.h \
> -       $(MESA_GLAPI_DIR)/glapitable.h \
> -       $(MESA_GLAPI_DIR)/glapi_gentable.c
> +       $(MESA_GLAPI_DIR)/glapitable.h
> +
> +if HAVE_APPLEDRI
> +MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c
> +endif
>
>  MESA_GLAPI_ASM_OUTPUTS =
>  if HAVE_X86_ASM
> @@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx
>  XORG_GLAPI_OUTPUTS = \
>         $(XORG_GLAPI_DIR)/glprocs.h \
>         $(XORG_GLAPI_DIR)/glapitable.h \
> -       $(XORG_GLAPI_DIR)/dispatch.h \
> +       $(XORG_GLAPI_DIR)/dispatch.h
> +
> +if HAVE_APPLEDRI
>         $(XORG_GLAPI_DIR)/glapi_gentable.c
Erm missing XORG_GLAPI_OUTPUTS += ?

Afaict even with the above makefile changes the file should still be
in the tarball. Am I missing something ?

Would be great if Jeremy (or someone else) has the chance to test
this, in case we're missing something.

-Emil
Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers
I'm probably just being dense... can you provide an exact sequence of calls that would cause this logic to fail? Seems like it should work as-is... On Sun, Dec 6, 2015 at 4:12 AM, Patrick Rudolph wrote: > In case a state tracker unbinds every slot by a separate > pipe->set_vertex_buffers() call, starting from slot zero, the number > of bound buffers would not reach zero at all. Unbinding all buffers > at once or starting at the top-most slot results in correct behaviour. > > Calculating the correct number of bound buffers fixes a NULL pointer > dereference in nvc0_validate_vertex_buffers_shared(). > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 > > Signed-off-by: Patrick Rudolph > --- > src/gallium/auxiliary/util/u_helpers.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/auxiliary/util/u_helpers.c > b/src/gallium/auxiliary/util/u_helpers.c > index 09619c1..09020b0 100644 > --- a/src/gallium/auxiliary/util/u_helpers.c > +++ b/src/gallium/auxiliary/util/u_helpers.c > @@ -81,7 +81,13 @@ void util_set_vertex_buffers_count(struct > pipe_vertex_buffer *dst, > const struct pipe_vertex_buffer *src, > unsigned start_slot, unsigned count) > { > - uint32_t enabled_buffers = (1ull << *dst_count) - 1; > + unsigned i; > + uint32_t enabled_buffers = 0; > + > + for (i = 0; i < *dst_count; i++) { > + if (dst[i].buffer || dst[i].user_buffer) > + enabled_buffers |= (1ull << i); > + } > > util_set_vertex_buffers_mask(dst, &enabled_buffers, src, start_slot, > count); > -- > 2.4.3
Re: [Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin
The general concept of this change seems fine to me. Given the desire to keep glapi as similar as possible across platforms, would it be better to just move this into glx/apple rather than leaving it in glapi? > On Dec 9, 2015, at 09:07, Emil Velikovwrote: > > On 9 December 2015 at 14:11, Andreas Boll wrote: >> Removes the public symbol _glapi_create_table_from_handle from >> libGL.so.1 on all plattforms except Darwin. >> > typo -> platforms > >> Since the symbol is not used on other plattforms it makes sense to > ditto > >> build gl_gentable.c only on Darwin. >> > Ideally we'll keep the dispatch as close to identical across all > platforms (i.e. we'll nuke this), although for now this will do. > > Out of curiosity is there any noticeable difference in the build times? > > ... >> XXX If we still want to distribute gl_gentable.c in the release tarball >> we could drop the changes in src/mapi/glapi/gen/Makefile.am >> > Yes please. We want to ship all the generated sources, regardless of > the platform they're used. > > ... >> index 2da8f7d..25ea44a 100644 >> --- a/src/mapi/glapi/gen/Makefile.am >> +++ b/src/mapi/glapi/gen/Makefile.am >> @@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \ >>$(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \ >>$(MESA_GLAPI_DIR)/glprocs.h \ >>$(MESA_GLAPI_DIR)/glapitemp.h \ >> - $(MESA_GLAPI_DIR)/glapitable.h \ >> - $(MESA_GLAPI_DIR)/glapi_gentable.c >> + $(MESA_GLAPI_DIR)/glapitable.h >> + >> +if HAVE_APPLEDRI >> +MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c >> +endif >> >> MESA_GLAPI_ASM_OUTPUTS = >> if HAVE_X86_ASM >> @@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx >> XORG_GLAPI_OUTPUTS = \ >>$(XORG_GLAPI_DIR)/glprocs.h \ >>$(XORG_GLAPI_DIR)/glapitable.h \ >> - $(XORG_GLAPI_DIR)/dispatch.h \ >> + $(XORG_GLAPI_DIR)/dispatch.h >> + >> +if HAVE_APPLEDRI >>$(XORG_GLAPI_DIR)/glapi_gentable.c > Erm missing XORG_GLAPI_OUTPUTS += ? > > Afaict even with the above makefile changes the file should still be > in the in the tarball. 
Am I missing something ? > > Would be great if Jeremy (or someone else) has the chance to test > this, in case we've missing something. > > -Emil smime.p7s Description: S/MIME cryptographic signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash
On 2015-12-10 01:47, Marek Olšák wrote:
> From: Marek Olšák
>
> I guess the crash was because of division by zero.
>
> Cc: 11.0 11.1
> ---
>  src/gallium/auxiliary/util/u_prim.h | 17 +
>  src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++-
>  2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_prim.h b/src/gallium/auxiliary/util/u_prim.h
> index 3668015..4926af6 100644
> --- a/src/gallium/auxiliary/util/u_prim.h
> +++ b/src/gallium/auxiliary/util/u_prim.h
> @@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim)
>   * For polygons, return the number of triangles.
>   */
>  static inline unsigned
> -u_prims_for_vertices(unsigned prim, unsigned num)
> +u_prims_for_vertices(unsigned prim, unsigned num, unsigned vertices_per_patch)
>  {
> -   const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
> +   struct u_prim_vertex_count info;
>
> -   if (num < info->min)
> +   if (prim == PIPE_PRIM_PATCHES)
> +      info.min = info.incr = vertices_per_patch;
> +   else if (prim < PIPE_PRIM_MAX)

We already do this check in u_prim_vertex_count(), which returns NULL if prim is out of bounds. Perhaps it would be better to avoid this extra else-if branch and instead make the call in the else branch and then assert on NULL.

> +      info = *u_prim_vertex_count(prim);
> +   else {
> +      assert(!"invalid prim type");
> +      return 0;
> +   }
> +
> +   if (num < info.min)
>        return 0;

Well, combining this with my previous patch,
http://lists.freedesktop.org/archives/mesa-dev/2015-December/102729.html
I think we should still have an assert(info.incr != 0); here.

> -   return 1 + ((num - info->min) / info->incr);
> +   return 1 + ((num - info.min) / info.incr);
>  }
>
>  static inline boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr )
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
> index ee84a1f..4ac9d0a 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
>     if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
>         (info->indirect ||
>          (info->instance_count > 1 &&
> -         u_prims_for_vertices(info->mode, info->count) <= 1)))
> +         u_prims_for_vertices(info->mode, info->count,
> +                              info->vertices_per_patch) <= 1)))
>        sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
>
>     return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
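The arithmetic under review can be tried out in isolation. The following is only a minimal sketch, not the Mesa code: `struct prim_count`, `prims_for_vertices()` and `patches_for_vertices()` are hypothetical stand-ins for the helpers in u_prim.h, and the assert on a zero increment is the guard suggested in the review, not something the patch itself contains.

```c
#include <assert.h>

/* Hypothetical stand-in for struct u_prim_vertex_count: the minimum
 * number of vertices for one primitive and the number of additional
 * vertices needed for each further primitive. */
struct prim_count {
   unsigned min;
   unsigned incr;
};

/* Number of complete primitives that num vertices produce. The assert
 * is the guard proposed in the review: a zero increment would otherwise
 * cause a division by zero. */
static unsigned
prims_for_vertices(struct prim_count info, unsigned num)
{
   assert(info.incr != 0);

   if (num < info.min)
      return 0;

   return 1 + ((num - info.min) / info.incr);
}

/* Patches have no fixed table entry; both fields come from the
 * per-draw vertices_per_patch value, as in the patch above. */
static unsigned
patches_for_vertices(unsigned num, unsigned vertices_per_patch)
{
   struct prim_count info = { vertices_per_patch, vertices_per_patch };

   return prims_for_vertices(info, num);
}
```

With a triangle-list style entry of min = 3 and incr = 3, seven vertices give two primitives; with four vertices per patch, twelve vertices give three patches and three vertices give none.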
Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
On Wed, Dec 9, 2015 at 11:23 AM, Ilia Mirkinwrote: > On Wed, Dec 9, 2015 at 11:18 AM, Deve wrote: >> This patch indeed seems to not have a sense. I just added it to the bug >> report as a suggestion that it works for me after this modification. Emil >> Velikov said that I should send it to the mailing list. >> >> Here is how it works in Supertuxkart: >> We create rtt with following parameters: >> >> DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8, >> GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8); >> >> Then, during rendering scene, we do: >> >> glEnable(GL_FRAMEBUFFER_SRGB); >> glBindFramebuffer(GL_FRAMEBUFFER, 0); > > OK, so this is the "winsys" framebuffer (GL has some term for it, > sorry, I don't remember what it is... perhaps it's even winsys). This > is created based on parameters of your selected GLX visual. For > example, when I run glxinfo, I see (on nouveau; the list on intel will > be different but comparable): > > 480 GLX Visuals > visual x bf lv rg d st colorbuffer sr ax dp st accumbuffer ms cav > id dep cl sp sz l ci b ro r g b a F gb bf th cl r g b a ns b eat > > 0x021 24 tc 0 32 0 r y . 8 8 8 8 . . 0 24 8 0 0 0 0 0 0 None > 0x022 24 dc 0 32 0 r y . 8 8 8 8 . . 0 24 8 0 0 0 0 0 0 None > ... > 0x343 24 tc 0 32 0 r . . 8 8 8 8 . s 0 0 0 0 0 0 0 0 0 None > 0x344 24 tc 0 32 0 r . . 8 8 8 8 . s 0 0 0 16 16 16 16 0 0 Slow > 0x345 24 tc 0 32 0 r y . 8 8 8 8 . s 0 0 0 0 0 0 0 0 0 None > 0x346 24 tc 0 32 0 r y . 8 8 8 8 . s 0 0 0 16 16 16 16 0 0 Slow > > Note how some of them have srgb, others don't (and have various > differences in their various other properties). EGL has something > similar I believe, but tbh I don't remember the specifics. That's the > mesaVis->sRGBCapable bit below. If you need an sRGB-capable visual, > are you sure you're picking one? This would be somewhere well before > any actual rendering takes place. 
> > If you're not sure whether you need an sRGB-capable visual or not, try > making sure you pick a *non-srgb* visual in a working configuration > and see if it breaks. Right, so looking at your code in more detail, you probably just require a stencil visual, which, in combination with this patch, makes you skip this one. However I noticed that i965/hsw and probably others don't expose *any* sRGB-capable GLX visuals/fb configs! Pretty sure that's not good... but GLX, sRGB, and visuals are not exactly my strong point, hopefully someone a bit more clued in can comment. Cheers, -ilia
Re: [Mesa-dev] [PATCH 1/2] nvc0: fix use after free of pipe_resource
I pushed a slightly modified version of this: http://cgit.freedesktop.org/mesa/mesa/commit/?id=432a798cf5c7fab18a3e32d4073840df7d0d37cb Thanks for the patch! I hope this will resolve some weird crashes people have seen with various buffers being null unexpectedly. On Sun, Dec 6, 2015 at 4:11 AM, Patrick Rudolph wrote: > Always reset the vertex bufctx to make sure there's no pointer to > an already freed pipe_resource left after unbinding buffers. > Fixes use after free crash in nvc0_bufctx_fence(). > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 > > Signed-off-by: Patrick Rudolph > --- > src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > index 5dce5f0..2aa90c9 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > @@ -1000,12 +1000,16 @@ nvc0_set_vertex_buffers(struct pipe_context *pipe, > struct nvc0_context *nvc0 = nvc0_context(pipe); > unsigned i; > > +if (nvc0->num_vtxbufs) > +nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_VTX); > + > util_set_vertex_buffers_count(nvc0->vtxbuf, &nvc0->num_vtxbufs, vb, >start_slot, count); > > if (!vb) { > nvc0->vbo_user &= ~(((1ull << count) - 1) << start_slot); > nvc0->constant_vbos &= ~(((1ull << count) - 1) << start_slot); > + nvc0->dirty |= NVC0_NEW_ARRAYS; > return; > } > > @@ -1025,7 +1029,6 @@ nvc0_set_vertex_buffers(struct pipe_context *pipe, > } > > nvc0->dirty |= NVC0_NEW_ARRAYS; > -nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_VTX); > } > > static void > -- > 2.4.3
[Mesa-dev] [Bug 90821] Segfault when calling glViewport on surfaceless EGL context without bound FBO
https://bugs.freedesktop.org/show_bug.cgi?id=90821

Nanley Chery changed:

           What    |Removed |Added
         Status    |NEW     |RESOLVED
     Resolution    |---     |DUPLICATE

--- Comment #5 from Nanley Chery ---
*** This bug has been marked as a duplicate of bug 93257 ***

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers
On Wed, Dec 9, 2015 at 2:01 PM, Patrick Rudolph wrote: > Ok, first of all bind some buffers: > > pipe->set_vertex_buffers(pipe, 0, 1, ); > pipe->set_vertex_buffers(pipe, 1, 1, ); > pipe->set_vertex_buffers(pipe, 2, 1, ); > > num_vtxbufs is now 3 as it should be. > > Now you are unbinding buffers, one after another starting at slot 0: > pipe->set_vertex_buffers(pipe, 0, 1, NULL); > > enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; > util_set_vertex_buffers_mask(...); > enabled_buffers = 0x6; > num_vtxbufs = util_last_bit(enabled_buffers) = 3 > > > pipe->set_vertex_buffers(pipe, 1, 1, NULL); > > enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; > util_set_vertex_buffers_mask(...); > enabled_buffers = 0x5; > num_vtxbufs = util_last_bit(enabled_buffers) = 3 > > > pipe->set_vertex_buffers(pipe, 2, 1, NULL); > > enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; > util_set_vertex_buffers_mask(...); > enabled_buffers = 0x3; > num_vtxbufs = util_last_bit(enabled_buffers) = 2 > > There are no buffers bound any more, but num_vtxbufs is now 2 instead of > 0. > > There would be no problem if you start at slot 2 going down to 0. > There would be no problem if you unbind all buffers at once. > > I hope this clarifies the problem. Right, thank you, yes it does. With a slightly fixed commit log including a bit more justification, this is also Reviewed-by: Ilia Mirkin Basically something mentioning that the current algorithm does not account for pre-existing holes in the buffer list.
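The unbind sequence quoted above can be replayed in a small standalone model. This is a sketch under stated assumptions, not the gallium code: `last_bit()`, `unbind_old()`, `unbind_new()` and the two `simulate_*()` drivers are hypothetical names, the bound state is modelled as a plain int array, and `last_bit()` is assumed to behave the way the thread describes util_last_bit (index of the highest set bit plus one, 0 for an empty mask).

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit plus one, 0 if the mask is empty. */
static unsigned
last_bit(uint32_t mask)
{
   unsigned n = 0;

   while (mask) {
      n++;
      mask >>= 1;
   }
   return n;
}

/* Old logic: assume every slot below count is bound, clear one slot,
 * and derive the new count from the resulting mask. */
static unsigned
unbind_old(unsigned count, unsigned slot)
{
   uint32_t enabled = (uint32_t)((1ull << count) - 1);

   enabled &= ~(1u << slot);
   return last_bit(enabled);
}

/* Patched logic: rebuild the mask from the slots that are really
 * bound, so holes left by earlier unbinds are taken into account. */
static unsigned
unbind_new(int bound[], unsigned count, unsigned slot)
{
   uint32_t enabled = 0;
   unsigned i;

   for (i = 0; i < count; i++) {
      if (bound[i])
         enabled |= 1u << i;
   }

   enabled &= ~(1u << slot);
   bound[slot] = 0;
   return last_bit(enabled);
}

/* Bind slots 0..2, then unbind them one at a time starting at slot 0. */
static unsigned
simulate_old(void)
{
   unsigned count = 3, slot;

   for (slot = 0; slot < 3; slot++)
      count = unbind_old(count, slot);
   return count;
}

static unsigned
simulate_new(void)
{
   int bound[3] = { 1, 1, 1 };
   unsigned count = 3, slot;

   for (slot = 0; slot < 3; slot++)
      count = unbind_new(bound, count, slot);
   return count;
}
```

In this model the old computation ends with a count of 2 although nothing is bound, matching the walkthrough, while the rebuilt mask reaches 0.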
Re: [Mesa-dev] [PATCH] configure.ac: fix test for SSE4.1 assembler support
On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray wrote: > Change the __m128i variables to be volatile so gcc 4.9 won't optimise > all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls > still get optimised out but now there is at least one SSE4.1 instruction > generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions > got optimised out the configure test would incorrectly pass when the > compiler supported the intrinsics and the assembler didn't support the > instructions. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 > Signed-off-by: Jonathan Gray > Cc: "11.0 11.1" > --- > configure.ac | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/configure.ac b/configure.ac > index 260934d..1d82e47 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS" > AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ > #include <smmintrin.h> > int main () { > -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; > +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; > c = _mm_max_epu32(a, b); > return 0; I would have extracted an int from the result of _mm_max_epu32 and returned that instead of 0.
Re: [Mesa-dev] [RFC PATCH 1/5] i965/eu: set correct execution size in brw_NOP
On Wed, Dec 9, 2015 at 4:15 AM, Iago Toral Quiroga wrote: > --- > src/mesa/drivers/dri/i965/brw_eu_emit.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index f8c0f80..9543d5e 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -1256,6 +1256,7 @@ brw_F16TO32(struct brw_codegen *p, struct brw_reg dst, > struct brw_reg src) > void brw_NOP(struct brw_codegen *p) > { > brw_inst *insn = next_insn(p, BRW_OPCODE_NOP); > + brw_inst_set_exec_size(p->devinfo, insn, BRW_EXECUTE_4); I don't follow this change. Was this implicitly set before? At least in newer documentation, NOP is defined to have nearly all fields 0 which would mean execution size must be 1.
Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports
Roland,

I could not; due to an ML size limit or something it just bounces, hence the pull request.

Cheers,
Edward.

On 2015-12-10 02:38, Roland Scheidegger wrote:
> On 09.12.2015 at 05:16, Edward O'Callaghan wrote:
>> This fixes my initial attempt so that piglit now passes 14/14. Thanks to a couple of tips from Roland in the previous patch I was able to fix the remaining issue. This should be golden now.
>
> Great that you got it working! Please send the patches to the ml.
>
> Roland
Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output
On Mon, Dec 7, 2015 at 8:42 PM, Brian Paul wrote: > When a buffer is created with GL_STATIC_DRAW, its contents should not > be changed frequently. But that's exactly what one application I'm > debugging does. This patch adds code to try to detect inefficient > buffer use in a couple places. The GL_ARB_debug_output mechanism is > used to report the issue. > > NVIDIA's driver detects these sort of things too. > > Other types of inefficient buffer use could also be detected in the > future. > --- > src/mesa/main/bufferobj.c | 55 > +++ > src/mesa/main/mtypes.h| 4 > 2 files changed, 59 insertions(+) > > diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c > index f985982..6bc1b5e 100644 > --- a/src/mesa/main/bufferobj.c > +++ b/src/mesa/main/bufferobj.c > @@ -51,6 +51,34 @@ > > > /** > + * We count the number of buffer modification calls to check for > + * inefficient buffer use. This is the number of such calls before we > + * issue a warning. > + */ > +#define BUFFER_WARNING_CALL_COUNT 4 > + > + > +/** > + * Helper to warn of possible performance issues, such as frequently > + * updating a buffer created with GL_STATIC_DRAW. > + */ > +static void > +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...) > +{ > + va_list args; > + GLuint msg_id = 0; This needs to be wrapped in a macro, with a 'static' id (at each macro invocation), otherwise a fresh id will get generated each time this is called, which is presumably not desirable. Same as what I did with pipe_debug_message/_pipe_debug_message. [I know you already pushed this... that's how I noticed... but should still get fixed.] -ilia > + > + va_start(args, fmt); > + _mesa_gl_vdebug(ctx, &msg_id, > + MESA_DEBUG_SOURCE_API, > + MESA_DEBUG_TYPE_PERFORMANCE, > + MESA_DEBUG_SEVERITY_MEDIUM, > + fmt, args); > + va_end(args); > +} > + > + > +/** > * Used as a placeholder for buffer objects between glGenBuffers() and > * glBindBuffer() so that glIsBuffer() can work correctly. 
> */ > @@ -1677,6 +1705,21 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct > gl_buffer_object *bufObj, > if (size == 0) >return; > > + bufObj->NumSubDataCalls++; > + > + if ((bufObj->Usage == GL_STATIC_DRAW || > +bufObj->Usage == GL_STATIC_COPY) && > + bufObj->NumSubDataCalls >= BUFFER_WARNING_CALL_COUNT) { > + /* If the application declared the buffer as static draw/copy or stream > + * draw, it should not be frequently modified with glBufferSubData. > + */ > + buffer_usage_warning(ctx, > + "using %s(buffer %u, offset %u, size %u) to " > + "update a %s buffer", > + func, bufObj->Name, offset, size, > + _mesa_enum_to_string(bufObj->Usage)); > + } > + > bufObj->Written = GL_TRUE; > > assert(ctx->Driver.BufferSubData); > @@ -2384,6 +2427,18 @@ _mesa_map_buffer_range(struct gl_context *ctx, >return NULL; > } > > + if (access & GL_MAP_WRITE_BIT) { > + bufObj->NumMapBufferWriteCalls++; > + if ((bufObj->Usage == GL_STATIC_DRAW || > + bufObj->Usage == GL_STATIC_COPY) && > + bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) { > + buffer_usage_warning(ctx, > + "using %s(buffer %u, offset %u, length %u) to " > + "update a %s buffer", > + func, bufObj->Name, offset, length, > + _mesa_enum_to_string(bufObj->Usage)); > + } > + } > > assert(ctx->Driver.MapBufferRange); > map = ctx->Driver.MapBufferRange(ctx, offset, length, access, bufObj, > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 1eb1e21..de54169 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -1275,6 +1275,10 @@ struct gl_buffer_object > GLboolean Immutable; /**< GL_ARB_buffer_storage */ > gl_buffer_usage UsageHistory; /**< How has this buffer been used so far? 
> */ > > + /** Counters used for buffer usage warnings */ > + GLuint NumSubDataCalls; > + GLuint NumMapBufferWriteCalls; > + > struct gl_buffer_mapping Mappings[MAP_COUNT]; > }; > > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers
Ok, first of all bind some buffers: pipe->set_vertex_buffers(pipe, 0, 1, ); pipe->set_vertex_buffers(pipe, 1, 1, ); pipe->set_vertex_buffers(pipe, 2, 1, ); num_vtxbufs is now 3 as it should be. Now you are unbinding buffers, one after another starting at slot 0: pipe->set_vertex_buffers(pipe, 0, 1, NULL); enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; util_set_vertex_buffers_mask(...); enabled_buffers = 0x6; num_vtxbufs = util_last_bit(enabled_buffers) = 3 pipe->set_vertex_buffers(pipe, 1, 1, NULL); enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; util_set_vertex_buffers_mask(...); enabled_buffers = 0x5; num_vtxbufs = util_last_bit(enabled_buffers) = 3 pipe->set_vertex_buffers(pipe, 2, 1, NULL); enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7; util_set_vertex_buffers_mask(...); enabled_buffers = 0x3; num_vtxbufs = util_last_bit(enabled_buffers) = 2 There are no buffers bound any more, but num_vtxbufs is now 2 instead of 0. There would be no problem if you start at slot 2 going down to 0. There would be no problem if you unbind all buffers at once. I hope this clarifies the problem. Kind Regards, Patrick On 2015-12-09 07:10 PM, Ilia Mirkin wrote: > I'm probably just being dense... can you provide an exact sequence of > calls that would cause this logic to fail? Seems like it should work > as-is... > > On Sun, Dec 6, 2015 at 4:12 AM, Patrick Rudolph wrote: >> In case a state tracker unbinds every slot by a separate >> pipe->set_vertex_buffers() call, starting from slot zero, the number >> of bound buffers would not reach zero at all. Unbinding all buffers >> at once or starting at the top-most slot results in correct behaviour. >> >> Calculating the correct number of bound buffers fixes a NULL pointer >> dereference in nvc0_validate_vertex_buffers_shared(). 
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 >> >> Signed-off-by: Patrick Rudolph >> --- >> src/gallium/auxiliary/util/u_helpers.c | 8 +++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/src/gallium/auxiliary/util/u_helpers.c >> b/src/gallium/auxiliary/util/u_helpers.c >> index 09619c1..09020b0 100644 >> --- a/src/gallium/auxiliary/util/u_helpers.c >> +++ b/src/gallium/auxiliary/util/u_helpers.c >> @@ -81,7 +81,13 @@ void util_set_vertex_buffers_count(struct >> pipe_vertex_buffer *dst, >> const struct pipe_vertex_buffer *src, >> unsigned start_slot, unsigned count) >> { >> - uint32_t enabled_buffers = (1ull << *dst_count) - 1; >> + unsigned i; >> + uint32_t enabled_buffers = 0; >> + >> + for (i = 0; i < *dst_count; i++) { >> + if (dst[i].buffer || dst[i].user_buffer) >> + enabled_buffers |= (1ull << i); >> + } >> >> util_set_vertex_buffers_mask(dst, &enabled_buffers, src, start_slot, >> count); >> -- >> 2.4.3
[Mesa-dev] Mesa 11.0.7
Mesa 11.0.7 is now available.

This release brings a substantial number of fixes in meta (affecting i965), and some driver fixes for i965, nouveau, r600 and llvm. The video encoding for Stoney has been disabled, as it isn't working properly. There are also build fixes for DragonFly and other *BSD platforms.

Chris Wilson (1):
      meta: Compute correct buffer size with SkipRows/SkipPixels

Daniel Stone (1):
      egl/wayland: Ignore rects from SwapBuffersWithDamage

Dave Airlie (4):
      texgetimage: consolidate 1D array handling code.
      r600: geometry shader gsvs itemsize workaround
      r600: rv670 use at least 16es/gs threads
      r600: workaround empty geom shader.

Emil Velikov (5):
      docs: add sha256 checksums for 11.0.6
      get-pick-list.sh: Require explicit "11.0" for nominating stable patches
      mesa; add get-extra-pick-list.sh script into bin/
      Update version to 11.0.7
      docs: add release notes for 11.0.7

François Tigeot (1):
      xmlconfig: Add support for DragonFly

Ian Romanick (22):
      mesa: Make bind_vertex_buffer avilable outside varray.c
      mesa: Refactor update_array_format to make _mesa_update_array_format_public
      mesa: Refactor enable_vertex_array_attrib to make _mesa_enable_vertex_array_attrib
      i965: Pass brw_context instead of gl_context to brw_draw_rectlist
      i965: Use DSA functions for VBOs in brw_meta_fast_clear
      i965: Use internal functions for buffer object access
      i965: Don't pollute the buffer object namespace in brw_meta_fast_clear
      meta: Use DSA functions for PBO in create_texture_for_pbo
      meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects
      i965: Use _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects
      meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objects
      meta: Track VBO using gl_buffer_object instead of GL API object handle
      meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objects
      meta: Use internal functions for buffer object and VAO access
      meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects
      meta: Partially convert _mesa_meta_DrawTex to DSA
      meta: Track VBO using gl_buffer_object instead of GL API object handle in _mesa_meta_DrawTex
      meta: Use internal functions for buffer object and VAO access in _mesa_meta_DrawTex
      meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex
      meta/TexSubImage: Don't pollute the buffer object namespace
      meta/generate_mipmap: Don't leak the framebuffer object
      glsl: Fix off-by-one error in array size check assertion

Ilia Mirkin (7):
      nvc0/ir: actually emit AFETCH on kepler
      nir: fix typo in idiv lowering, causing large-udiv-udiv failures
      nouveau: use the buffer usage to determine placement when no binding
      nv50,nvc0: properly handle buffer storage invalidation on dsa buffer
      nv50/ir: fix (un)spilling of 3-wide results
      mesa: support GL_RED/GL_RG in ES2 contexts when driver support exists
      nvc0/ir: start offset at texBindBase for txq, like regular texturing

Jonathan Gray (1):
      automake: fix some occurrences of hardcoded -ldl and -lpthread

Leo Liu (1):
      radeon/vce: disable Stoney VCE for 11.0

Marta Lofstedt (1):
      gles2: Update gl2ext.h to revision: 32120

Oded Gabbay (1):
      llvmpipe: disable VSX in ppc due to LLVM PPC bug

git tag: mesa-11.0.7

ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.gz
MD5: 73e2e04d02ab2cdddc635fcb261bde13  mesa-11.0.7.tar.gz
SHA1: ba017f72c5cfe08140b8b0c1a98035f76e19abae  mesa-11.0.7.tar.gz
SHA256: 07c27004ff68b288097d17b2faa7bdf15ec73c96b7e6c9835266e544adf0a62f  mesa-11.0.7.tar.gz
PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.gz.sig

ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.xz
MD5: 5bb515d4b0931b7a9e1bfec3da73f10f  mesa-11.0.7.tar.xz
SHA1: 7e96868bf104673509e30846ccb6f641231e8c5a  mesa-11.0.7.tar.xz
SHA256: e7e90a332ede6c8fd08eff90786a3fd1605a4e62ebf3a9b514047838194538cb  mesa-11.0.7.tar.xz
PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.xz.sig

-- Emil
Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output
On Wed, Dec 9, 2015 at 5:23 PM, Brian Paulwrote: > On 12/09/2015 11:43 AM, Ilia Mirkin wrote: >> >> On Mon, Dec 7, 2015 at 8:42 PM, Brian Paul wrote: >>> >>> When a buffer is created with GL_STATIC_DRAW, its contents should not >>> be changed frequently. But that's exactly what one application I'm >>> debugging does. This patch adds code to try to detect inefficient >>> buffer use in a couple places. The GL_ARB_debug_output mechanism is >>> used to report the issue. >>> >>> NVIDIA's driver detects these sort of things too. >>> >>> Other types of inefficient buffer use could also be detected in the >>> future. >>> --- >>> src/mesa/main/bufferobj.c | 55 >>> +++ >>> src/mesa/main/mtypes.h| 4 >>> 2 files changed, 59 insertions(+) >>> >>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c >>> index f985982..6bc1b5e 100644 >>> --- a/src/mesa/main/bufferobj.c >>> +++ b/src/mesa/main/bufferobj.c >>> @@ -51,6 +51,34 @@ >>> >>> >>> /** >>> + * We count the number of buffer modification calls to check for >>> + * inefficient buffer use. This is the number of such calls before we >>> + * issue a warning. >>> + */ >>> +#define BUFFER_WARNING_CALL_COUNT 4 >>> + >>> + >>> +/** >>> + * Helper to warn of possible performance issues, such as frequently >>> + * updating a buffer created with GL_STATIC_DRAW. >>> + */ >>> +static void >>> +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...) >>> +{ >>> + va_list args; >>> + GLuint msg_id = 0; >> >> >> This needs to be wrapped in a macro, with a 'static' id (at each macro >> invocation), otherwise a fresh id will get generated each time this is >> called, which is presumably not desirable. Same as what I did with >> pipe_debug_message/_pipe_debug_message. > > > Is the macro required, or can I just pass a pointer to a static GLuint from > each call site? I took a quick stab at the macro but I'm having trouble > with the varargs stuff. The macro is not required... 
but you can copy the one I used for pipe_debug_message. Passing in a different static variable from each callsite would work too -- that's precisely what the macro would be doing in the first place. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
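The per-call-site static id discussed above can be sketched in plain C. This is an illustration only, not the pipe_debug_message macro or the Mesa debug API: `debug_message()`, `BUFFER_USAGE_WARNING`, `warn_sub_data()` and `warn_map_range()` are hypothetical names, and the sketch assumes, as the review implies, that the reporting function allocates a fresh id whenever it is handed an id of 0.

```c
#include <stdarg.h>
#include <stdio.h>

static unsigned next_id = 1;      /* next unused message id */
static unsigned last_id_used;     /* id of the most recent message */
static char last_message[256];    /* text of the most recent message */

/* Hypothetical reporting function: allocates an id the first time a
 * given id variable is seen (value 0) and reuses it afterwards. */
static void
debug_message(unsigned *id, const char *fmt, ...)
{
   va_list args;

   if (*id == 0)
      *id = next_id++;
   last_id_used = *id;

   va_start(args, fmt);
   vsnprintf(last_message, sizeof(last_message), fmt, args);
   va_end(args);
}

/* Each expansion of the macro gets its own static id, so one call
 * site always reports under the same id while different call sites
 * get different ones. */
#define BUFFER_USAGE_WARNING(...)            \
   do {                                      \
      static unsigned id = 0;                \
      debug_message(&id, __VA_ARGS__);       \
   } while (0)

/* Two separate call sites, each returning the id it reported with. */
static unsigned
warn_sub_data(unsigned buffer)
{
   BUFFER_USAGE_WARNING("sub-data update of static buffer %u", buffer);
   return last_id_used;
}

static unsigned
warn_map_range(unsigned buffer)
{
   BUFFER_USAGE_WARNING("write mapping of static buffer %u", buffer);
   return last_id_used;
}
```

Passing a pointer to a separate static variable from each call site by hand, as suggested in the reply, has the same effect; the macro only saves writing out the declaration each time.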
[Mesa-dev] [PATCH] radeonsi: implement RB+ for Stoney (v2)
From: Marek Olšákv2: fix dual source blending --- src/gallium/drivers/radeon/r600_pipe_common.c | 1 + src/gallium/drivers/radeon/r600_pipe_common.h | 3 + src/gallium/drivers/radeon/r600_texture.c | 6 + src/gallium/drivers/radeonsi/si_state.c | 159 +- src/gallium/drivers/radeonsi/sid.h| 3 + 5 files changed, 170 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 8899ba4..ba541ac 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -375,6 +375,7 @@ static const struct debug_named_value common_debug_options[] = { { "check_vm", DBG_CHECK_VM, "Check VM faults and dump debug info." }, { "nodcc", DBG_NO_DCC, "Disable DCC." }, { "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." }, + { "norbplus", DBG_NO_RB_PLUS, "Disable RB+ on Stoney." }, DEBUG_NAMED_VALUE_END /* must be last */ }; diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 8c6c0c3..dd23ed5 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -86,6 +86,7 @@ #define DBG_CHECK_VM (1llu << 42) #define DBG_NO_DCC (1llu << 43) #define DBG_NO_DCC_CLEAR (1llu << 44) +#define DBG_NO_RB_PLUS (1llu << 45) #define R600_MAP_BUFFER_ALIGNMENT 64 @@ -250,6 +251,8 @@ struct r600_surface { unsigned cb_color_fmask_slice; /* EG and later */ unsigned cb_color_cmask;/* CB_COLORn_TILE (r600 only) */ unsigned cb_color_mask; /* R600 only */ + unsigned sx_ps_downconvert; /* Stoney only */ + unsigned sx_blend_opt_epsilon; /* Stoney only */ struct r600_resource *cb_buffer_fmask; /* Used for FMASK relocations. R600 only */ struct r600_resource *cb_buffer_cmask; /* Used for CMASK relocations. 
R600 only */ diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 774722f..8c145e5 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -1393,6 +1393,7 @@ void evergreen_do_fast_color_clear(struct r600_common_context *rctx, return; for (i = 0; i < fb->nr_cbufs; i++) { + struct r600_surface *surf; struct r600_texture *tex; unsigned clear_bit = PIPE_CLEAR_COLOR0 << i; @@ -1403,6 +1404,7 @@ void evergreen_do_fast_color_clear(struct r600_common_context *rctx, if (!(*buffers & clear_bit)) continue; + surf = (struct r600_surface *)fb->cbufs[i]; tex = (struct r600_texture *)fb->cbufs[i]->texture; /* 128-bit formats are unusupported */ @@ -1449,6 +1451,10 @@ void evergreen_do_fast_color_clear(struct r600_common_context *rctx, if (clear_words_needed) tex->dirty_level_mask |= 1 << fb->cbufs[i]->u.tex.level; } else { + /* RB+ doesn't work with CMASK fast clear. */ + if (surf->sx_ps_downconvert) + continue; + /* ensure CMASK is enabled */ r600_texture_alloc_cmask_separate(rctx->screen, tex); if (tex->cmask.size == 0) { diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 2ebfa1c..dcf4a7b 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -347,10 +347,54 @@ static uint32_t si_translate_blend_factor(int blend_fact) return 0; } +static uint32_t si_translate_blend_opt_function(int blend_func) +{ + switch (blend_func) { + case PIPE_BLEND_ADD: + return V_028760_OPT_COMB_ADD; + case PIPE_BLEND_SUBTRACT: + return V_028760_OPT_COMB_SUBTRACT; + case PIPE_BLEND_REVERSE_SUBTRACT: + return V_028760_OPT_COMB_REVSUBTRACT; + case PIPE_BLEND_MIN: + return V_028760_OPT_COMB_MIN; + case PIPE_BLEND_MAX: + return V_028760_OPT_COMB_MAX; + default: + return V_028760_OPT_COMB_BLEND_DISABLED; + } +} + +static uint32_t si_translate_blend_opt_factor(int blend_fact, bool is_alpha) +{ + switch 
(blend_fact) { + case PIPE_BLENDFACTOR_ZERO: + return V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_ALL; + case PIPE_BLENDFACTOR_ONE: + return V_028760_BLEND_OPT_PRESERVE_ALL_IGNORE_NONE; + case PIPE_BLENDFACTOR_SRC_COLOR: + return is_alpha ? V_028760_BLEND_OPT_PRESERVE_A1_IGNORE_A0 + :
[Mesa-dev] [PATCH] nvc0: optimize coherent buffer checking at draw time
Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% case because usually coherent buffers are not so used. Signed-off-by: Samuel Pitoiset--- I didn't test the patch, but I will run a full piglit tonight. src/gallium/drivers/nouveau/nv50/nv50_context.h | 1 + src/gallium/drivers/nouveau/nv50/nv50_state.c | 5 +++ src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 22 ++--- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 3 ++ src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 26 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 + 6 files changed, 45 insertions(+), 53 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h b/src/gallium/drivers/nouveau/nv50/nv50_context.h index 2cebcd9..dd5b5e3 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h @@ -134,6 +134,7 @@ struct nv50_context { struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS]; uint16_t constbuf_dirty[3]; uint16_t constbuf_valid[3]; + uint16_t constbuf_coherent[3]; struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS]; unsigned num_vtxbufs; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c index fd7c7cd..c08a63e 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c @@ -852,8 +852,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index, nv50->constbuf[s][i].offset = cb->buffer_offset; nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 0x1); nv50->constbuf_valid[s] |= 1 << i; + if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) + nv50->constbuf_coherent[s] |= 1 << i; + else + nv50->constbuf_coherent[s] &= ~(1 << i); } else { nv50->constbuf_valid[s] &= ~(1 << i); + nv50->constbuf_coherent[s] &= ~(1 << i); } nv50->constbuf_dirty[s] |= 1 << i; diff --git 
a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c index 85878d5..4ca8f11 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c @@ -791,27 +791,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) push->kick_notify = nv50_draw_vbo_kick_notify; - /* TODO: Instead of iterating over all the buffer resources looking for -* coherent buffers, keep track of a context-wide count. -*/ for (s = 0; s < 3 && !nv50->cb_dirty; ++s) { - uint32_t valid = nv50->constbuf_valid[s]; - - while (valid && !nv50->cb_dirty) { - const unsigned i = ffs(valid) - 1; - struct pipe_resource *res; - - valid &= ~(1 << i); - if (nv50->constbuf[s][i].user) -continue; - - res = nv50->constbuf[s][i].u.buf; - if (!res) -continue; - - if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) -nv50->cb_dirty = true; - } + if (nv50->constbuf_coherent[s]) + nv50->cb_dirty = true; } /* If there are any coherent constbufs, flush the cache */ diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h index 39b73ec..1219548 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h @@ -134,10 +134,12 @@ struct nvc0_context { struct nvc0_constbuf constbuf[6][NVC0_MAX_PIPE_CONSTBUFS]; uint16_t constbuf_dirty[6]; uint16_t constbuf_valid[6]; + uint16_t constbuf_coherent[6]; bool cb_dirty; struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS]; unsigned num_vtxbufs; + uint32_t vtxbufs_coherent; struct pipe_index_buffer idxbuf; uint32_t constant_vbos; uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */ @@ -149,6 +151,7 @@ struct nvc0_context { struct pipe_sampler_view *textures[6][PIPE_MAX_SAMPLERS]; unsigned num_textures[6]; uint32_t textures_dirty[6]; + uint32_t textures_coherent[6]; struct nv50_tsc_entry *samplers[6][PIPE_MAX_SAMPLERS]; unsigned num_samplers[6]; uint16_t 
samplers_dirty[6]; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index 5da0ea8..b498b5d 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -549,11 +549,18 @@ nvc0_stage_set_sampler_views(struct nvc0_context *nvc0, int s, for (i = 0; i < nr; ++i) { struct nv50_tic_entry *old = nv50_tic_entry(nvc0->textures[s][i]); + struct pipe_resource *res = views[i]->texture; if (views[i] == nvc0->textures[s][i]) continue;
Re: [Mesa-dev] [PATCH 14/26] i965: Add Gen8+ tessellation evaluation shader state (3DSTATE_DS).
On 2015-12-02 16:15:55, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/gen8_ds_state.c | 66 +++
>  1 file changed, 59 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_ds_state.c b/src/mesa/drivers/dri/i965/gen8_ds_state.c
> index 4ce4ab3..a79e8aa 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ds_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ds_state.c
> @@ -29,19 +29,71 @@
>  static void
>  gen8_upload_ds_state(struct brw_context *brw)
>  {
> -   /* Disable the DS Unit */
> -   int ds_pkt_len = brw->gen >= 9 ? 11 : 9;
> -   BEGIN_BATCH(ds_pkt_len);
> -   OUT_BATCH(_3DSTATE_DS << 16 | (ds_pkt_len - 2));
> -   for (int i = 0; i < ds_pkt_len - 1; i++)
> +   struct gl_context *ctx = &brw->ctx;
> +   const struct brw_stage_state *stage_state = &brw->tes.base;
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   bool active = brw->tess_eval_program;
> +   assert(!active || brw->tess_ctrl_program);
> +
> +   /* BRW_NEW_TES_PROG_DATA */
> +   const struct brw_tes_prog_data *tes_prog_data = brw->tes.prog_data;
> +   const struct brw_vue_prog_data *vue_prog_data = &tes_prog_data->base;
> +   const struct brw_stage_prog_data *prog_data = &vue_prog_data->base;
> +
> +   if (active) {
> +      BEGIN_BATCH(9);
> +      OUT_BATCH(_3DSTATE_DS << 16 | (9 - 2));
> +      OUT_BATCH(stage_state->prog_offset);
> +      OUT_BATCH(0);
> +      OUT_BATCH(SET_FIELD(DIV_ROUND_UP(stage_state->sampler_count, 4),
> +                          GEN7_DS_SAMPLER_COUNT) |
> +                SET_FIELD(prog_data->binding_table.size_bytes / 4,
> +                          GEN7_DS_BINDING_TABLE_ENTRY_COUNT));
> +      if (prog_data->total_scratch) {
> +         OUT_RELOC64(stage_state->scratch_bo,
> +                     I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
> +                     ffs(prog_data->total_scratch) - 11);
> +      } else {
> +         OUT_BATCH(0);
> +         OUT_BATCH(0);
> +      }
> +      OUT_BATCH(SET_FIELD(prog_data->dispatch_grf_start_reg,
> +                          GEN7_DS_DISPATCH_START_GRF) |
> +                SET_FIELD(vue_prog_data->urb_read_length,
> +                          GEN7_DS_URB_READ_LENGTH));
> +
> +      OUT_BATCH(GEN7_DS_ENABLE |
> +                GEN7_DS_STATISTICS_ENABLE |
> +                (brw->max_ds_threads - 1) << HSW_DS_MAX_THREADS_SHIFT |

SET_FIELD?

Reviewed-by: Jordan Justen

> +                (vue_prog_data->dispatch_mode == DISPATCH_MODE_SIMD8 ?
> +                 GEN7_DS_SIMD8_DISPATCH_ENABLE : 0) |
> +                (tes_prog_data->domain == BRW_TESS_DOMAIN_TRI ?
> +                 GEN7_DS_COMPUTE_W_COORDINATE_ENABLE : 0));
> +      OUT_BATCH(SET_FIELD(ctx->Transform.ClipPlanesEnabled,
> +                          GEN8_DS_USER_CLIP_DISTANCE));
> +      ADVANCE_BATCH();
> +   } else {
> +      BEGIN_BATCH(9);
> +      OUT_BATCH(_3DSTATE_DS << 16 | (9 - 2));
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
>        OUT_BATCH(0);
> -   ADVANCE_BATCH();
> +      ADVANCE_BATCH();
> +   }
> +   brw->tes.enabled = active;
>  }
>
>  const struct brw_tracked_state gen8_ds_state = {
>     .dirty = {
>        .mesa = 0,
> -      .brw = BRW_NEW_CONTEXT,
> +      .brw = BRW_NEW_BATCH |
> +             BRW_NEW_TESS_EVAL_PROGRAM |
> +             BRW_NEW_TES_PROG_DATA,
>     },
>     .emit = gen8_upload_ds_state,
>  };
> --
> 2.6.2
Re: [Mesa-dev] [PATCH 12/26] i965: Add tessellation shader sampler support.
12 & 13 Reviewed-by: Jordan Justen

On 2015-12-02 16:15:53, Kenneth Graunke wrote:
> Based on code by Chris Forbes and Fabian Bieler.
>
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/brw_context.h       |  2 +-
>  src/mesa/drivers/dri/i965/brw_sampler_state.c | 46 +++
>  src/mesa/drivers/dri/i965/brw_state.h         |  2 ++
>  src/mesa/drivers/dri/i965/brw_state_upload.c  |  2 ++
>  4 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 8de0e6f..c440c7d 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1228,7 +1228,7 @@ struct brw_context
>     } perfmon;
>
>     int num_atoms[BRW_NUM_PIPELINES];
> -   const struct brw_tracked_state render_atoms[71];
> +   const struct brw_tracked_state render_atoms[73];
>     const struct brw_tracked_state compute_atoms[9];
>
>     /* If (INTEL_DEBUG & DEBUG_BATCH) */
> diff --git a/src/mesa/drivers/dri/i965/brw_sampler_state.c b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> index 6d73444..3f29e2f 100644
> --- a/src/mesa/drivers/dri/i965/brw_sampler_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> @@ -55,6 +55,8 @@ gen7_emit_sampler_state_pointers_xs(struct brw_context *brw,
>  {
>     static const uint16_t packet_headers[] = {
>        [MESA_SHADER_VERTEX] = _3DSTATE_SAMPLER_STATE_POINTERS_VS,
> +      [MESA_SHADER_TESS_CTRL] = _3DSTATE_SAMPLER_STATE_POINTERS_HS,
> +      [MESA_SHADER_TESS_EVAL] = _3DSTATE_SAMPLER_STATE_POINTERS_DS,
>        [MESA_SHADER_GEOMETRY] = _3DSTATE_SAMPLER_STATE_POINTERS_GS,
>        [MESA_SHADER_FRAGMENT] = _3DSTATE_SAMPLER_STATE_POINTERS_PS,
>     };
> @@ -647,3 +649,47 @@ const struct brw_tracked_state brw_gs_samplers = {
>     },
>     .emit = brw_upload_gs_samplers,
>  };
> +
> +
> +static void
> +brw_upload_tcs_samplers(struct brw_context *brw)
> +{
> +   /* BRW_NEW_TESS_CTRL_PROGRAM */
> +   struct gl_program *tcs = (struct gl_program *) brw->tess_ctrl_program;
> +   if (!tcs)
> +      return;
> +
> +   brw_upload_sampler_state_table(brw, tcs, &brw->tcs.base);
> +}
> +
> +
> +const struct brw_tracked_state brw_tcs_samplers = {
> +   .dirty = {
> +      .mesa = _NEW_TEXTURE,
> +      .brw = BRW_NEW_BATCH |
> +             BRW_NEW_TESS_CTRL_PROGRAM,
> +   },
> +   .emit = brw_upload_tcs_samplers,
> +};
> +
> +
> +static void
> +brw_upload_tes_samplers(struct brw_context *brw)
> +{
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   struct gl_program *tes = (struct gl_program *) brw->tess_eval_program;
> +   if (!tes)
> +      return;
> +
> +   brw_upload_sampler_state_table(brw, tes, &brw->tes.base);
> +}
> +
> +
> +const struct brw_tracked_state brw_tes_samplers = {
> +   .dirty = {
> +      .mesa = _NEW_TEXTURE,
> +      .brw = BRW_NEW_BATCH |
> +             BRW_NEW_TESS_EVAL_PROGRAM,
> +   },
> +   .emit = brw_upload_tes_samplers,
> +};
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
> index 129de30..a332b92 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -72,6 +72,8 @@ extern const struct brw_tracked_state brw_state_base_address;
>  extern const struct brw_tracked_state brw_urb_fence;
>  extern const struct brw_tracked_state brw_vs_prog;
>  extern const struct brw_tracked_state brw_vs_samplers;
> +extern const struct brw_tracked_state brw_tcs_samplers;
> +extern const struct brw_tracked_state brw_tes_samplers;
>  extern const struct brw_tracked_state brw_gs_samplers;
>  extern const struct brw_tracked_state brw_vs_ubo_surfaces;
>  extern const struct brw_tracked_state brw_vs_abo_surfaces;
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index ba66886..e61fa6a 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -322,6 +322,8 @@ static const struct brw_tracked_state *gen8_render_atoms[] =
>
>     &brw_fs_samplers,
>     &brw_vs_samplers,
> +   &brw_tcs_samplers,
> +   &brw_tes_samplers,
>     &brw_gs_samplers,
>     &gen8_multisample_state,
>
> --
> 2.6.2
[Mesa-dev] [ANNOUNCE] mesa-demos 8.3.0
This new mesa-demos release fixes the build issue against mesa 10.6
(Bug #91643) and picks up the latest glxinfo changes.
For the misc changes see below.

Andreas.

Adam Jackson (1):
      glxinfo: Add support for GLX_MESA_query_renderer

Andreas Boll (3):
      demos: add missing binaries to .gitignore
      util: Remove unused glstate.[ch]
      demos: Bump version to 8.3.0 for release

Awais Belal (1):
      sharedtex_mt: fix rendering thread hang

Brian Paul (21):
      wglinfo: query and report multisample information
      glxinfo/wglinfo: move argument parsing into common code
      wglinfo: Add support for reporting core profile info
      wglinfo: print swap method in print_visual_attribs_verbose()
      wglinfo: adjust column spacing for pixel format info
      glxinfo/wglinfo: query/print GL_ARB_texture_multisample limits
      glxinfo/wglinfo: reverse order of the gl_versions[] array
      geom-stipple-lines: use VBO, specify number of verts on cmd line
      geom-wide-lines: use VBO, specify number of verts on cmd line
      line-sample: a new test that draws a sample of wide/stipple/smooth lines
      line-sample: use GL_LINE_STRIP instead of GL_LINE_LOOP
      glxinfo/wglinfo: add brief (-B) output mode
      glxinfo: pass the options object to print_screen_info()
      wglinfo: pass the options object to print_screen_info()
      demos: flush stdout after printing frame rate
      quad-offset-unfilled: fix GL_POLYGON_OFFSET_FILL/LINE mistake
      util: add fflush() in ValidateShaderProgram()
      tri-tex-stipple: trivial test of texturing with stippling
      line-sample: add flat/smooth and blend toggles
      rubberband: select line width with 1..4 keys
      rubberband: add keyboard option to test non-white drawing color

Dave Airlie (1):
      glxinfo: add 4.5 as a valid version

Dongwon Kim (1):
      torus.c: Lighting effect is distorted when object is scaled up/down

Jose Fonseca (3):
      wgl: Remove unnecessary #pragmas.
      line-sample: Ensure GL_ALIASED_LINE_WIDTH_RANGE is defined on Windows.
      xdemos/corender: Remove.

José Fonseca (5):
      cmake: Define HAVE_FREEGLUT when glutInitContextProfile symbol is present.
      cmake: Don't use gcc specific warnings with g++.
      tests,trival,fp,vp: Rename errno with errnum.
      wgl: Ensure PIXELFORMATDESCRIPTOR members are zeroed.
      s/Tungsten Graphics/VMware/

Julien Isorce (1):
      glxinfo: fix segfault when core profile is unavailable

Marc Dietrich (1):
      glxinfo: fix direct rendering context in glxinfo

Marek Olšák (2):
      eglinfo: print client extensions
      glxinfo: fix printing core profile extensions

Matt Turner (1):
      egl: Remove demos using EGL_MESA_screen_surface.

Michael Olbrich (1):
      opengles2: fix building without X11

Nathan Kidd (1):
      glxgears_pixmap: destroy correct pixmap id

Ross Burton (1):
      configure.ac: fix AC_WITH(glut) so that --without-glut works

git tag: mesa-demos-8.3.0

ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.bz2
MD5: 628e75c23c17394f11a316c36f8e4164 mesa-demos-8.3.0.tar.bz2
SHA1: 468a8f24938ab07e2e31828cf961515371d45b56 mesa-demos-8.3.0.tar.bz2
SHA256: c173154bbd0d5fb53d732471984def42fb1b14ac85fcb834138fb9518b3e0bef mesa-demos-8.3.0.tar.bz2
PGP: ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.bz2.sig

ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.gz
MD5: f42510108d9401ad1b4ee67957cc73d7 mesa-demos-8.3.0.tar.gz
SHA1: 2d46ef2d83030c978ba032eac14bf94007bd4a84 mesa-demos-8.3.0.tar.gz
SHA256: 6127c5511e63447b28c2df735739de06c5d221f68e7671cc0ee446e605e92357 mesa-demos-8.3.0.tar.gz
PGP: ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.gz.sig
[Mesa-dev] [PATCH] radeonsi: don't call of u_prims_for_vertices for patches and rectangles
From: Marek Olšák

Both caused a crash due to a division by zero in that function.
This is an alternative fix.

Cc: 11.0 11.1
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..e550011 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -216,6 +216,18 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
    radeon_emit(cs, tcs_out_layout | (num_tcs_output_cp << 26));
 }

+static unsigned si_num_prims_for_vertices(const struct pipe_draw_info *info)
+{
+   switch (info->mode) {
+   case PIPE_PRIM_PATCHES:
+      return info->count / info->vertices_per_patch;
+   case R600_PRIM_RECTANGLE_LIST:
+      return info->count / 3;
+   default:
+      return u_prims_for_vertices(info->mode, info->count);
+   }
+}
+
 static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
                                           const struct pipe_draw_info *info,
                                           unsigned num_patches)
@@ -320,7 +332,7 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
    if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
        (info->indirect ||
         (info->instance_count > 1 &&
-         u_prims_for_vertices(info->mode, info->count) <= 1)))
+         si_num_prims_for_vertices(info) <= 1)))
       sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;

    return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
--
2.1.4
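A minimal standalone sketch of the dispatch this patch describes, for readers who want to try the arithmetic outside the driver. Every `sketch_*` / `SKETCH_*` name is hypothetical and stands in for the real gallium enums and tables; the generic min/incr formula is assumed to match the shared helper, and `vertices_per_patch` is assumed to be nonzero whenever the mode is patches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gallium primitive enums. */
enum sketch_prim {
   SKETCH_TRIANGLES,
   SKETCH_PATCHES,
   SKETCH_RECTANGLE_LIST
};

/* min = vertices needed for the first primitive,
 * incr = vertices needed for each further primitive. */
struct sketch_vertex_count {
   unsigned min;
   unsigned incr;
};

/* Generic table lookup; returns NULL for primitives it does not know. */
static const struct sketch_vertex_count *
sketch_vertex_count(enum sketch_prim prim)
{
   static const struct sketch_vertex_count triangles = { 3, 3 };
   return prim == SKETCH_TRIANGLES ? &triangles : NULL;
}

/* Generic helper: asserts instead of dividing by zero. */
static unsigned
sketch_generic_prims(enum sketch_prim prim, unsigned num)
{
   const struct sketch_vertex_count *info = sketch_vertex_count(prim);

   assert(info);
   assert(info->incr != 0);

   if (num < info->min)
      return 0;
   return 1 + (num - info->min) / info->incr;
}

/* Driver-side wrapper: handle the primitives the generic table
 * does not cover before falling back to it. */
static unsigned
sketch_num_prims(enum sketch_prim prim, unsigned count,
                 unsigned vertices_per_patch)
{
   switch (prim) {
   case SKETCH_PATCHES:
      return count / vertices_per_patch;
   case SKETCH_RECTANGLE_LIST:
      return count / 3;
   default:
      return sketch_generic_prims(prim, count);
   }
}
```

The wrapper keeps the special cases in the driver, so the shared helper never sees a primitive type it has no table entry for.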
Re: [Mesa-dev] [PATCH 2/2] st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels
On Mon, Dec 7, 2015 at 5:36 PM, Brian Paulwrote: > On 12/06/2015 04:34 PM, Marek Olšák wrote: >> >> From: Marek Olšák >> >> Spotted by luck. The GLSL uniform storage is only associated once >> in LinkShader and can't be reallocated afterwards, because that would >> break the association. >> --- >> src/mesa/state_tracker/st_cb_bitmap.c | 6 +- >> src/mesa/state_tracker/st_cb_drawpixels.c | 6 -- >> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 ++ >> src/mesa/state_tracker/st_program.c| 17 - >> src/mesa/state_tracker/st_program.h| 1 - >> 5 files changed, 15 insertions(+), 21 deletions(-) >> >> diff --git a/src/mesa/state_tracker/st_cb_bitmap.c >> b/src/mesa/state_tracker/st_cb_bitmap.c >> index cbc6845..a4a48a6 100644 >> --- a/src/mesa/state_tracker/st_cb_bitmap.c >> +++ b/src/mesa/state_tracker/st_cb_bitmap.c >> @@ -287,7 +287,8 @@ draw_bitmap_quad(struct gl_context *ctx, GLint x, >> GLint y, GLfloat z, >> GLfloat colorSave[4]; >> COPY_4V(colorSave, ctx->Current.Attrib[VERT_ATTRIB_COLOR0]); >> COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], color); >> - st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT); >> + st_upload_constants(st, st->fp->Base.Base.Parameters, >> + PIPE_SHADER_FRAGMENT); >> COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], colorSave); >> } >> >> @@ -404,6 +405,9 @@ draw_bitmap_quad(struct gl_context *ctx, GLint x, >> GLint y, GLfloat z, >> cso_restore_stream_outputs(cso); >> >> pipe_resource_reference(, NULL); >> + >> + /* We uploaded modified constants, need to invalidate them. 
*/ >> + st->dirty.mesa |= _NEW_PROGRAM_CONSTANTS; >> } >> >> >> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c >> b/src/mesa/state_tracker/st_cb_drawpixels.c >> index 262ad80..e295f54 100644 >> --- a/src/mesa/state_tracker/st_cb_drawpixels.c >> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c >> @@ -1109,9 +1109,6 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint >> y, >> >> st->pixel_xfer.pixelmap_sampler_view); >>num_sampler_view++; >> } >> - >> - /* update fragment program constants */ >> - st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT); >> } >> >> /* Put glDrawPixels image into a texture */ >> @@ -1462,9 +1459,6 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, >> GLint srcy, >> >> st->pixel_xfer.pixelmap_sampler_view); >>num_sampler_view++; >> } >> - >> - /* update fragment program constants */ >> - st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT); >> } >> else { >> assert(type == GL_DEPTH); >> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >> index 40c7725..a32c4cf 100644 >> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >> @@ -5640,6 +5640,12 @@ get_mesa_program(struct gl_context *ctx, >> >> _mesa_reference_program(ctx, >Program, prog); >> >> + /* Avoid reallocation of the program parameter list, because the >> uniform >> +* storage is only associated with the original parameter list. >> +* This should be enough for Bitmap and DrawPixels constants. >> +*/ >> + _mesa_reserve_parameter_storage(prog->Parameters, 8); >> + >> /* This has to be done last. Any operation the can cause >> * prog->ParameterValues to get reallocated (e.g., anything that adds >> a >> * program constant) has to happen before creating this linkage. 
>> diff --git a/src/mesa/state_tracker/st_program.c >> b/src/mesa/state_tracker/st_program.c >> index 75ccaf2..39c54c2 100644 >> --- a/src/mesa/state_tracker/st_program.c >> +++ b/src/mesa/state_tracker/st_program.c >> @@ -112,8 +112,6 @@ delete_fp_variant(struct st_context *st, struct >> st_fp_variant *fpv) >> { >> if (fpv->driver_shader) >> cso_delete_fragment_shader(st->cso_context, fpv->driver_shader); >> - if (fpv->parameters) >> - _mesa_free_parameter_list(fpv->parameters); >> free(fpv); >> } >> >> @@ -914,8 +912,6 @@ st_create_fp_variant(struct st_context *st, >>if (tgsi.tokens != stfp->tgsi.tokens) >> tgsi_free_tokens(tgsi.tokens); >>tgsi.tokens = tokens; >> - variant->parameters = >> -_mesa_clone_parameter_list(stfp->Base.Base.Parameters); >> } else >>fprintf(stderr, "mesa: cannot create a shader for glBitmap\n"); >> } >> @@ -924,6 +920,7 @@ st_create_fp_variant(struct st_context *st, >> if (key->drawpixels) { >> const struct tgsi_token *tokens; >> unsigned scale_const = 0, bias_const = 0, texcoord_const = 0; >> + struct gl_program_parameter_list *params = >> stfp->Base.Base.Parameters; >> >> /* Find the first unused slot. */ >>
Re: [Mesa-dev] [PATCH 15/26] i965: Add Gen7+ tessellation engine state (3DSTATE_TE).
On 2015-12-02 16:15:56, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/gen7_te_state.c | 36 ---
>  1 file changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_te_state.c b/src/mesa/drivers/dri/i965/gen7_te_state.c
> index 95a5e98..2650fa5 100644
> --- a/src/mesa/drivers/dri/i965/gen7_te_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_te_state.c
> @@ -29,19 +29,39 @@
>  static void
>  upload_te_state(struct brw_context *brw)
>  {
> -   /* Disable the TE */
> -   BEGIN_BATCH(4);
> -   OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> -   OUT_BATCH(0);
> -   OUT_BATCH(0);
> -   OUT_BATCH(0);
> -   ADVANCE_BATCH();
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   bool active = brw->tess_eval_program;
> +   if (active)
> +      assert(brw->tess_ctrl_program);
> +
> +   const struct brw_tes_prog_data *tes_prog_data = brw->tes.prog_data;
> +
> +   if (active) {
> +      BEGIN_BATCH(4);
> +      OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> +      OUT_BATCH((tes_prog_data->partitioning << GEN7_TE_PARTITIONING_SHIFT) |
> +                (tes_prog_data->output_topology << GEN7_TE_OUTPUT_TOPOLOGY_SHIFT) |
> +                (tes_prog_data->domain << GEN7_TE_DOMAIN_SHIFT) |

Looks like we don't currently have the masks for SET_FIELD, but maybe
we should add them?

Reviewed-by: Jordan Justen

> +                GEN7_TE_ENABLE);
> +      OUT_BATCH_F(63.0);
> +      OUT_BATCH_F(64.0);
> +      ADVANCE_BATCH();
> +   } else {
> +      BEGIN_BATCH(4);
> +      OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> +      OUT_BATCH(0);
> +      OUT_BATCH_F(0);
> +      OUT_BATCH_F(0);
> +      ADVANCE_BATCH();
> +   }
>  }
>
>  const struct brw_tracked_state gen7_te_state = {
>     .dirty = {
>        .mesa = 0,
> -      .brw = BRW_NEW_CONTEXT,
> +      .brw = BRW_NEW_CONTEXT |
> +             BRW_NEW_TES_PROG_DATA |
> +             BRW_NEW_TESS_EVAL_PROGRAM,
>     },
>     .emit = upload_te_state,
>  };
> --
> 2.6.2
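The review comment in this message turns on the fact that a SET_FIELD-style macro needs both a _SHIFT and a _MASK define per field, whereas a bare shift needs only the former. A rough sketch of the difference follows. The `SKETCH_*` names and the bit positions are invented for illustration and are not the real GEN7_TE_* values; the actual i965 macro may also differ in detail, for example by asserting on overflow instead of truncating.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit field at bits 5:4 of a command dword. */
#define SKETCH_TE_DOMAIN_SHIFT 4
#define SKETCH_TE_DOMAIN_MASK  (0x3u << SKETCH_TE_DOMAIN_SHIFT)

/* Shift the value into place, then clip it to the field's mask so an
 * out-of-range value cannot spill into neighbouring fields. */
#define SKETCH_SET_FIELD(value, field) \
   (((uint32_t)(value) << field##_SHIFT) & field##_MASK)

/* Packing with shift and mask. */
static uint32_t
sketch_pack_domain(uint32_t domain)
{
   return SKETCH_SET_FIELD(domain, SKETCH_TE_DOMAIN);
}

/* Packing with a bare shift, as in the quoted patch. */
static uint32_t
sketch_pack_domain_unmasked(uint32_t domain)
{
   return domain << SKETCH_TE_DOMAIN_SHIFT;
}
```

For in-range values the two forms produce the same dword; they only diverge when the value is wider than the field, which is the case the mask guards against.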
[Mesa-dev] [PATCH v2] nv50, nvc0: optimize coherent buffer checking at draw time
Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% of cases, because coherent buffers are rarely used. Changes from v1: - forgot to apply some changes for nv50 (texture/vertex bufs) Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nv50/nv50_context.h | 3 ++ src/gallium/drivers/nouveau/nv50/nv50_state.c | 19 +++ src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 44 +++-- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 3 ++ src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 26 +++ src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +-- 6 files changed, 64 insertions(+), 72 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h b/src/gallium/drivers/nouveau/nv50/nv50_context.h index 2cebcd9..712d00e 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h @@ -134,9 +134,11 @@ struct nv50_context { struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS]; uint16_t constbuf_dirty[3]; uint16_t constbuf_valid[3]; + uint16_t constbuf_coherent[3]; struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS]; unsigned num_vtxbufs; + uint32_t vtxbufs_coherent; struct pipe_index_buffer idxbuf; uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */ uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */ @@ -148,6 +150,7 @@ struct nv50_context { struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS]; unsigned num_textures[3]; + uint32_t textures_coherent[3]; struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS]; unsigned num_samplers[3]; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c index fd7c7cd..b6e5c75 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c @@ -661,9 +661,16 @@ nv50_stage_set_sampler_views(struct nv50_context 
*nv50, int s, assert(nr <= PIPE_MAX_SAMPLERS); for (i = 0; i < nr; ++i) { struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]); + struct pipe_resource *res = views[i]->texture; if (old) nv50_screen_tic_unlock(nv50->screen, old); + if (res->target == PIPE_BUFFER && + (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)) + nv50->textures_coherent[s] |= 1 << i; + else + nv50->textures_coherent[s] &= ~(1 << i); + pipe_sampler_view_reference(>textures[s][i], views[i]); } @@ -852,8 +859,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index, nv50->constbuf[s][i].offset = cb->buffer_offset; nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 0x1); nv50->constbuf_valid[s] |= 1 << i; + if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) + nv50->constbuf_coherent[s] |= 1 << i; + else + nv50->constbuf_coherent[s] &= ~(1 << i); } else { nv50->constbuf_valid[s] &= ~(1 << i); + nv50->constbuf_coherent[s] &= ~(1 << i); } nv50->constbuf_dirty[s] |= 1 << i; @@ -1000,6 +1012,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe, if (!vb) { nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot); nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot); + nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot); return; } @@ -1012,9 +1025,15 @@ nv50_set_vertex_buffers(struct pipe_context *pipe, nv50->vbo_constant |= 1 << dst_index; else nv50->vbo_constant &= ~(1 << dst_index); + nv50->vtxbufs_coherent &= ~(1 << dst_index); } else { nv50->vbo_user &= ~(1 << dst_index); nv50->vbo_constant &= ~(1 << dst_index); + + if (vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) +nv50->vtxbufs_coherent |= (1 << dst_index); + else +nv50->vtxbufs_coherent &= ~(1 << dst_index); } } diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c index 85878d5..b6ba803 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c @@ -761,8 +761,7 @@ 
nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) { struct nv50_context *nv50 = nv50_context(pipe); struct nouveau_pushbuf *push = nv50->base.pushbuf; - bool tex_dirty = false; - int i, s; + int s; /* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */ nv50->vb_elt_first = info->min_index + info->index_bias; @@ -791,27 +790,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) push->kick_notify =
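The bookkeeping in this patch can be sketched in isolation: a per-stage bitmask is updated whenever a constant buffer is bound or unbound, so the draw path only has to test whether any mask is nonzero. This is a simplified sketch under stated assumptions, not the driver code; the `sketch_*` names, the flag value, and the three-stage layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_STAGES 3
#define SKETCH_FLAG_MAP_COHERENT (1u << 0) /* hypothetical flag value */

struct sketch_context {
   uint16_t constbuf_valid[SKETCH_NUM_STAGES];    /* slots with a buffer bound */
   uint16_t constbuf_coherent[SKETCH_NUM_STAGES]; /* slots bound to a coherent buffer */
};

/* Bind (bound == true) or unbind slot i of stage s, keeping both
 * masks in sync so nothing has to be recomputed at draw time. */
static void
sketch_set_constbuf(struct sketch_context *ctx, unsigned s, unsigned i,
                    bool bound, unsigned flags)
{
   const uint16_t bit = (uint16_t)(1u << i);

   if (bound) {
      ctx->constbuf_valid[s] |= bit;
      if (flags & SKETCH_FLAG_MAP_COHERENT)
         ctx->constbuf_coherent[s] |= bit;
      else
         ctx->constbuf_coherent[s] &= (uint16_t)~bit;
   } else {
      ctx->constbuf_valid[s] &= (uint16_t)~bit;
      ctx->constbuf_coherent[s] &= (uint16_t)~bit;
   }
}

/* Draw-time check: one comparison per stage instead of a walk over
 * every bound buffer. */
static bool
sketch_needs_cb_flush(const struct sketch_context *ctx)
{
   for (unsigned s = 0; s < SKETCH_NUM_STAGES; ++s) {
      if (ctx->constbuf_coherent[s])
         return true;
   }
   return false;
}
```

The cost moves from the draw call, which runs very often, to the bind call, which runs comparatively rarely; the important detail is clearing the coherent bit on every path that rebinds or unbinds a slot.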
Re: [Mesa-dev] [PATCH] gallium/aux../util: Make u_prims_for_vertices() safe
Pushed, thanks.

Marek

On Wed, Dec 9, 2015 at 10:07 AM, Edward O'Callaghan wrote:
> Let us avoid trapping in hardware from a SIGFPE and instead
> assert on a zero divisor.
>
> Hint: This can occur if a PIPE_PRIM_? is not handled in
> u_prim_vertex_count() that results in ' info ' not
> being initialized in the expected manner.
>
> Further, we also fix a possibly NULL pointer dereference
> from ' info ' being NULL from a u_prim_vertex_count() call.
>
> Signed-off-by: Edward O'Callaghan
> ---
>  src/gallium/auxiliary/util/u_prim.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_prim.h b/src/gallium/auxiliary/util/u_prim.h
> index 3668015..a09c315 100644
> --- a/src/gallium/auxiliary/util/u_prim.h
> +++ b/src/gallium/auxiliary/util/u_prim.h
> @@ -145,6 +145,9 @@ u_prims_for_vertices(unsigned prim, unsigned num)
>  {
>     const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
>
> +   assert(info);
> +   assert(info->incr != 0);
> +
>     if (num < info->min)
>        return 0;
>
> --
> 2.5.0
[Mesa-dev] [PATCH 3/4] winsys/radeon: clear the buffer cache on mmap failure and try again
From: Marek Olšák

---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index a5f8aeb..5ba01b9 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -361,9 +361,16 @@ void *radeon_bo_do_map(struct radeon_bo *bo)
     ptr = os_mmap(0, args.size, PROT_READ|PROT_WRITE, MAP_SHARED,
                   bo->rws->fd, args.addr_ptr);
     if (ptr == MAP_FAILED) {
-        pipe_mutex_unlock(bo->map_mutex);
-        fprintf(stderr, "radeon: mmap failed, errno: %i\n", errno);
-        return NULL;
+        /* Clear the cache and try again. */
+        pb_cache_release_all_buffers(&bo->rws->bo_cache);
+
+        ptr = os_mmap(0, args.size, PROT_READ|PROT_WRITE, MAP_SHARED,
+                      bo->rws->fd, args.addr_ptr);
+        if (ptr == MAP_FAILED) {
+            pipe_mutex_unlock(bo->map_mutex);
+            fprintf(stderr, "radeon: mmap failed, errno: %i\n", errno);
+            return NULL;
+        }
     }
     bo->ptr = ptr;
     bo->map_count = 1;
-- 
2.1.4
[Mesa-dev] [PATCH 2/4] winsys/amdgpu: clear the buffer cache on allocation failure and try again
From: Marek Olšák

---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 41efbcb..674482e 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -494,8 +494,13 @@ amdgpu_bo_create(struct radeon_winsys *rws,
 
    /* Create a new one. */
    bo = amdgpu_create_bo(ws, size, alignment, usage, domain, flags);
-   if (!bo)
-      return NULL;
+   if (!bo) {
+      /* Clear the cache and try again. */
+      pb_cache_release_all_buffers(&ws->bo_cache);
+      bo = amdgpu_create_bo(ws, size, alignment, usage, domain, flags);
+      if (!bo)
+         return NULL;
+   }
 
    bo->use_reusable_pool = use_reusable_pool;
    return &bo->base;
-- 
2.1.4
[Mesa-dev] [PATCH 4/4] winsys/amdgpu: clear the buffer cache on mmap failure and try again
From: Marek Olšák

---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 674482e..adea376 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -229,6 +229,11 @@ static void *amdgpu_bo_map(struct pb_buffer *buf,
       return bo->user_ptr;
 
    r = amdgpu_bo_cpu_map(bo->bo, &cpu);
+   if (r) {
+      /* Clear the cache and try again. */
+      pb_cache_release_all_buffers(&bo->ws->bo_cache);
+      r = amdgpu_bo_cpu_map(bo->bo, &cpu);
+   }
    return r ? NULL : cpu;
 }
 
-- 
2.1.4
[Mesa-dev] [PATCH 1/4] winsys/radeon: clear the buffer cache on allocation failure and try again
From: Marek Olšák

---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 3fd233c..a5f8aeb 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -766,8 +766,13 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
     }
 
     bo = radeon_create_bo(ws, size, alignment, usage, domain, flags);
-    if (!bo)
-        return NULL;
+    if (!bo) {
+        /* Clear the cache and try again. */
+        pb_cache_release_all_buffers(&ws->bo_cache);
+        bo = radeon_create_bo(ws, size, alignment, usage, domain, flags);
+        if (!bo)
+            return NULL;
+    }
 
     bo->use_reusable_pool = use_reusable_pool;
 
-- 
2.1.4
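The four patches in this series share one idea: idle buffers parked in the reuse cache still occupy memory, so a failed allocation or mapping is retried once after the cache has been emptied. The sketch below models that with a toy byte budget. `struct toy_winsys` and the `toy_*` functions are invented for illustration and do not correspond to the real pb_cache or winsys interfaces.

```c
#include <stddef.h>

/* Toy model: a device with a fixed memory budget and a cache of idle
 * buffers that still count against that budget. */
struct toy_winsys {
   size_t free_bytes;    /* memory available for new buffers */
   size_t cached_bytes;  /* memory held by idle buffers in the reuse cache */
};

/* Returns 1 on success, 0 when the budget is exhausted. */
static int
toy_create_bo(struct toy_winsys *ws, size_t size)
{
   if (size > ws->free_bytes)
      return 0;
   ws->free_bytes -= size;
   return 1;
}

/* Drop every idle buffer, handing its memory back to the budget. */
static void
toy_cache_release_all(struct toy_winsys *ws)
{
   ws->free_bytes += ws->cached_bytes;
   ws->cached_bytes = 0;
}

/* Try once; on failure clear the cache and try exactly one more time. */
static int
toy_bo_create_with_retry(struct toy_winsys *ws, size_t size)
{
   if (toy_create_bo(ws, size))
      return 1;

   toy_cache_release_all(ws);
   return toy_create_bo(ws, size);
}
```

Retrying only once keeps the failure path bounded: if the second attempt also fails, emptying the cache again cannot help, so the error is reported to the caller.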
Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports
On 09.12.2015 at 19:59, eocallag...@alterapraxis.com wrote:
> Roland,
>
> I could not due to ml size limit or something, it just bounces hence the
> pull request.

Due to size limit? Didn't look that big to me...
Otherwise, looks good to me, though the second commit should mention it
actually enables the extension.

Roland

> Cheers,
> Edward.
>
> On 2015-12-10 02:38, Roland Scheidegger wrote:
>> On 09.12.2015 at 05:16, Edward O'Callaghan wrote:
>>> This fixes my initial attempt so that piglit now passes 14/14. Thanks
>>> to a couple of tips from Roland in the previous patch I was able to
>>> fix the remaining issue. This should be golden now.
>>>
>>
>> Great that you got it working!
>> Please send the patches to the ml.
>>
>> Roland
>
Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output
On 12/09/2015 11:43 AM, Ilia Mirkin wrote: On Mon, Dec 7, 2015 at 8:42 PM, Brian Paulwrote: When a buffer is created with GL_STATIC_DRAW, its contents should not be changed frequently. But that's exactly what one application I'm debugging does. This patch adds code to try to detect inefficient buffer use in a couple places. The GL_ARB_debug_output mechanism is used to report the issue. NVIDIA's driver detects these sort of things too. Other types of inefficient buffer use could also be detected in the future. --- src/mesa/main/bufferobj.c | 55 +++ src/mesa/main/mtypes.h| 4 2 files changed, 59 insertions(+) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index f985982..6bc1b5e 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -51,6 +51,34 @@ /** + * We count the number of buffer modification calls to check for + * inefficient buffer use. This is the number of such calls before we + * issue a warning. + */ +#define BUFFER_WARNING_CALL_COUNT 4 + + +/** + * Helper to warn of possible performance issues, such as frequently + * updating a buffer created with GL_STATIC_DRAW. + */ +static void +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...) +{ + va_list args; + GLuint msg_id = 0; This needs to be wrapped in a macro, with a 'static' id (at each macro invocation), otherwise a fresh id will get generated each time this is called, which is presumably not desirable. Same as what I did with pipe_debug_message/_pipe_debug_message. Is the macro required, or can I just pass a pointer to a static GLuint from each call site? I took a quick stab at the macro but I'm having trouble with the varargs stuff. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.
On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote: > On Dec 9, 2015 2:51 AM, "Kenneth Graunke"wrote: > > > > My tessellation branch has two additional remap functions. I don't want > > to replicate this logic there. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/drivers/dri/i965/brw_nir.c | 78 > - > > 1 file changed, 50 insertions(+), 28 deletions(-) > > > > Hey Jason, > > > > If you like this patch, and haven't yet merged your NIR input reworks, > > feel free to just squash it into your changes. Or, we can land it > > separately after your changes. It's up to you. > > > > Separating this out allows me to reuse this in my new tessellation input > > and output remapping functions, and also means we don't need to add > structs > > for the remap functions...we can just pass the builder, or inputs_read, or > > the VUE map...and not have to pack multiple things together. > > Sure. It does make things simpler. > > > --Ken > > > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c > b/src/mesa/drivers/dri/i965/brw_nir.c > > index 14ad172..105a175 100644 > > --- a/src/mesa/drivers/dri/i965/brw_nir.c > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c > > @@ -27,15 +27,19 @@ > > #include "glsl/nir/nir_builder.h" > > #include "program/prog_to_nir.h" > > > > -struct remap_vs_attrs_state { > > - nir_builder b; > > - uint64_t inputs_read; > > -}; > > - > > +/** > > + * In many cases, we just add the base and offset together, so there's no > > + * reason to keep them separate. Sometimes, combining them is essential: > > + * if a shader only accesses part of a compound variable (such as a > matrix > > + * or array), the variable's base may not actually exist in the VUE map. > > + * > > + * This pass adds constant offsets to instr->const_index[0], and resets > > + * the offset source to 0. Non-constant offsets remain unchanged. 
> > + */ > > static bool > > -remap_vs_attrs(nir_block *block, void *void_state) > > +add_const_offset_to_base(nir_block *block, void *closure) > > { > > - struct remap_vs_attrs_state *state = void_state; > > + nir_builder *b = closure; > > > > nir_foreach_instr_safe(block, instr) { > >if (instr->type != nir_instr_type_intrinsic) > > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state) > > > >nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); > > > > + if (intrin->intrinsic == nir_intrinsic_load_input || > > + intrin->intrinsic == nir_intrinsic_load_per_vertex_input || > > + intrin->intrinsic == nir_intrinsic_load_output || > > + intrin->intrinsic == nir_intrinsic_load_per_vertex_output || > > + intrin->intrinsic == nir_intrinsic_store_output || > > + intrin->intrinsic == nir_intrinsic_store_per_vertex_output) { > > This seems a bit scortched-earth. It would be nice if the caller had a bit > more control. We could always add "do input"/"do output" options once we actually need them. It means adding the structs back, but that's fine. I suppose for now we could ignore output intrinsics here. > > + nir_src *offset = nir_get_io_offset_src(intrin); > > + nir_const_value *const_offset = nir_src_as_const_value(*offset); > > + > > + if (const_offset) { > > +intrin->const_index[0] += const_offset->u[0]; > > +b->cursor = nir_before_instr(>instr); > > +nir_instr_rewrite_src(>instr, offset, > > + nir_src_for_ssa(nir_imm_int(b, 0))); > > + } > > Else??? It seems that you don't want to run this pass if you think you'll > ever hit an indirect. I guess it's harmless to just do this for all direct > things in our driver, but it doesn't sit well. Else do nothing. The problem I'm trying to avoid with this logic is that inputs/outputs which take up multiple slots and are directly accessed may only have slots assigned for the sub-components that are actually used. So, I can't use the base offset, and need to move the base to the slot that's actually used. 
If something is accessed indirectly, we assign all of its slots, so using the base location is fine. > > + } > > + } > > + return true; > > + > > +} > > + > > +static bool > > +remap_vs_attrs(nir_block *block, void *closure) > > +{ > > + GLbitfield64 inputs_read = *((GLbitfield64 *) closure); > > + > > + nir_foreach_instr(block, instr) { > > + if (instr->type != nir_instr_type_intrinsic) > > + continue; > > + > > + nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); > > + > >if (intrin->intrinsic == nir_intrinsic_load_input) { > > /* Attributes come in a contiguous block, ordered by their > >* gl_vert_attrib value. That means we can compute the slot > >* number for an attribute by masking out the enabled attributes > >* before it and
Re: [Mesa-dev] [PATCH] mesa/varray: set double arrays to non-normalised.
...and it matches what we do for single precision.

Reviewed-by: Ian Romanick

Presumably this should also be a candidate for 11.0 and 11.1?

On 12/09/2015 04:37 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> Doesn't have any effect in practice I don't think, but
> CTS reads back using GetVertexAttrib.
>
> This fixes: GL41-CTS.vertex_attrib_64bit.get_vertex_attrib
>
> Signed-off-by: Dave Airlie
> ---
>  src/mesa/main/varray.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
> index 58f376b..c71e16a 100644
> --- a/src/mesa/main/varray.c
> +++ b/src/mesa/main/varray.c
> @@ -776,7 +776,7 @@ _mesa_VertexAttribLPointer(GLuint index, GLint size,
> GLenum type,
>
>     update_array(ctx, "glVertexAttribLPointer", VERT_ATTRIB_GENERIC(index),
>                  legalTypes, 1, 4,
> -                size, type, stride, GL_TRUE, GL_FALSE, GL_TRUE, ptr);
> +                size, type, stride, GL_FALSE, GL_FALSE, GL_TRUE, ptr);
>  }
>
>
>
Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.
On Wed, Dec 9, 2015 at 12:08 PM, Kenneth Graunkewrote: > On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote: >> On Dec 9, 2015 2:51 AM, "Kenneth Graunke" wrote: >> > >> > My tessellation branch has two additional remap functions. I don't want >> > to replicate this logic there. >> > >> > Signed-off-by: Kenneth Graunke >> > --- >> > src/mesa/drivers/dri/i965/brw_nir.c | 78 >> - >> > 1 file changed, 50 insertions(+), 28 deletions(-) >> > >> > Hey Jason, >> > >> > If you like this patch, and haven't yet merged your NIR input reworks, >> > feel free to just squash it into your changes. Or, we can land it >> > separately after your changes. It's up to you. >> > >> > Separating this out allows me to reuse this in my new tessellation input >> > and output remapping functions, and also means we don't need to add >> structs >> > for the remap functions...we can just pass the builder, or inputs_read, or >> > the VUE map...and not have to pack multiple things together. >> >> Sure. It does make things simpler. >> >> > --Ken >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c >> b/src/mesa/drivers/dri/i965/brw_nir.c >> > index 14ad172..105a175 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_nir.c >> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c >> > @@ -27,15 +27,19 @@ >> > #include "glsl/nir/nir_builder.h" >> > #include "program/prog_to_nir.h" >> > >> > -struct remap_vs_attrs_state { >> > - nir_builder b; >> > - uint64_t inputs_read; >> > -}; >> > - >> > +/** >> > + * In many cases, we just add the base and offset together, so there's no >> > + * reason to keep them separate. Sometimes, combining them is essential: >> > + * if a shader only accesses part of a compound variable (such as a >> matrix >> > + * or array), the variable's base may not actually exist in the VUE map. >> > + * >> > + * This pass adds constant offsets to instr->const_index[0], and resets >> > + * the offset source to 0. Non-constant offsets remain unchanged. 
>> > + */ >> > static bool >> > -remap_vs_attrs(nir_block *block, void *void_state) >> > +add_const_offset_to_base(nir_block *block, void *closure) >> > { >> > - struct remap_vs_attrs_state *state = void_state; >> > + nir_builder *b = closure; >> > >> > nir_foreach_instr_safe(block, instr) { >> >if (instr->type != nir_instr_type_intrinsic) >> > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state) >> > >> >nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); >> > >> > + if (intrin->intrinsic == nir_intrinsic_load_input || >> > + intrin->intrinsic == nir_intrinsic_load_per_vertex_input || >> > + intrin->intrinsic == nir_intrinsic_load_output || >> > + intrin->intrinsic == nir_intrinsic_load_per_vertex_output || >> > + intrin->intrinsic == nir_intrinsic_store_output || >> > + intrin->intrinsic == nir_intrinsic_store_per_vertex_output) { >> >> This seems a bit scortched-earth. It would be nice if the caller had a bit >> more control. > > We could always add "do input"/"do output" options once we actually need > them. It means adding the structs back, but that's fine. > > I suppose for now we could ignore output intrinsics here. > >> > + nir_src *offset = nir_get_io_offset_src(intrin); >> > + nir_const_value *const_offset = nir_src_as_const_value(*offset); >> > + >> > + if (const_offset) { >> > +intrin->const_index[0] += const_offset->u[0]; >> > +b->cursor = nir_before_instr(>instr); >> > +nir_instr_rewrite_src(>instr, offset, >> > + nir_src_for_ssa(nir_imm_int(b, 0))); >> > + } >> >> Else??? It seems that you don't want to run this pass if you think you'll >> ever hit an indirect. I guess it's harmless to just do this for all direct >> things in our driver, but it doesn't sit well. > > Else do nothing. The problem I'm trying to avoid with this logic is > that inputs/outputs which take up multiple slots and are directly > accessed may only have slots assigned for the sub-components that are > actually used. 
So, I can't use the base offset, and need to move the > base to the slot that's actually used. > > If something is accessed indirectly, we assign all of its slots, so > using the base location is fine. Ok, that makes sense. >> > + } >> > + } >> > + return true; >> > + >> > +} >> > + >> > +static bool >> > +remap_vs_attrs(nir_block *block, void *closure) >> > +{ >> > + GLbitfield64 inputs_read = *((GLbitfield64 *) closure); >> > + >> > + nir_foreach_instr(block, instr) { >> > + if (instr->type != nir_instr_type_intrinsic) >> > + continue; >> > + >> > + nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); >> > + >> >if (intrin->intrinsic == nir_intrinsic_load_input) { >> > /* Attributes come
[Mesa-dev] [PATCH 1/2] gallium/ddebug: add GALLIUM_DDEBUG_SKIP option
From: Nicolai HähnleWhen we know that hangs occur only very late in a reproducible run (e.g. apitrace), we can save a lot of debugging time by skipping the flush and hang detection for earlier draw calls. --- src/gallium/drivers/ddebug/dd_draw.c | 39 +- src/gallium/drivers/ddebug/dd_pipe.h | 3 +++ src/gallium/drivers/ddebug/dd_screen.c | 9 3 files changed, 36 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c index b443c5b..0778099 100644 --- a/src/gallium/drivers/ddebug/dd_draw.c +++ b/src/gallium/drivers/ddebug/dd_draw.c @@ -588,8 +588,11 @@ dd_context_flush(struct pipe_context *_pipe, static void dd_before_draw(struct dd_context *dctx) { - if (dd_screen(dctx->base.screen)->mode == DD_DETECT_HANGS && - !dd_screen(dctx->base.screen)->no_flush) + struct dd_screen *dscreen = dd_screen(dctx->base.screen); + + if (dscreen->mode == DD_DETECT_HANGS && + !dscreen->no_flush && + dctx->num_draw_calls >= dscreen->skip_count) dd_flush_and_handle_hang(dctx, NULL, 0, "GPU hang most likely caused by internal " "driver commands"); @@ -598,22 +601,28 @@ dd_before_draw(struct dd_context *dctx) static void dd_after_draw(struct dd_context *dctx, struct dd_call *call) { - switch (dd_screen(dctx->base.screen)->mode) { - case DD_DETECT_HANGS: - if (!dd_screen(dctx->base.screen)->no_flush && - dd_flush_and_check_hang(dctx, NULL, 0)) { - dd_dump_call(dctx, call, PIPE_DEBUG_DEVICE_IS_HUNG); + struct dd_screen *dscreen = dd_screen(dctx->base.screen); - /* Terminate the process to prevent future hangs. */ - dd_kill_process(); + if (dctx->num_draw_calls >= dscreen->skip_count) { + switch (dscreen->mode) { + case DD_DETECT_HANGS: + if (!dscreen->no_flush && +dd_flush_and_check_hang(dctx, NULL, 0)) { +dd_dump_call(dctx, call, PIPE_DEBUG_DEVICE_IS_HUNG); + +/* Terminate the process to prevent future hangs. 
*/ +dd_kill_process(); + } + break; + case DD_DUMP_ALL_CALLS: + dd_dump_call(dctx, call, 0); + break; + default: + assert(0); } - break; - case DD_DUMP_ALL_CALLS: - dd_dump_call(dctx, call, 0); - break; - default: - assert(0); } + + ++dctx->num_draw_calls; } static void diff --git a/src/gallium/drivers/ddebug/dd_pipe.h b/src/gallium/drivers/ddebug/dd_pipe.h index 34f5920..a045518 100644 --- a/src/gallium/drivers/ddebug/dd_pipe.h +++ b/src/gallium/drivers/ddebug/dd_pipe.h @@ -45,6 +45,7 @@ struct dd_screen unsigned timeout_ms; enum dd_mode mode; bool no_flush; + unsigned skip_count; }; struct dd_query @@ -110,6 +111,8 @@ struct dd_context struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS]; struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS]; float tess_default_levels[6]; + + unsigned num_draw_calls; }; diff --git a/src/gallium/drivers/ddebug/dd_screen.c b/src/gallium/drivers/ddebug/dd_screen.c index a776580..2716845 100644 --- a/src/gallium/drivers/ddebug/dd_screen.c +++ b/src/gallium/drivers/ddebug/dd_screen.c @@ -290,6 +290,9 @@ ddebug_screen_create(struct pipe_screen *screen) puts("$HOME/"DD_DIR"/ when a hang is detected."); puts("If 'noflush' is specified, only detect hangs in pipe->flush."); puts(""); + puts(" GALLIUM_DDEBUG_SKIP=[count]"); + puts("Skip flush and hang detection for the given initial number of draw calls."); + puts(""); exit(0); } @@ -349,5 +352,11 @@ ddebug_screen_create(struct pipe_screen *screen) assert(0); } + dscreen->skip_count = debug_get_num_option("GALLIUM_DDEBUG_SKIP", 0); + if (dscreen->skip_count > 0) { + fprintf(stderr, "Gallium debugger skipping the first %u draw calls.\n", + dscreen->skip_count); + } + return >base; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
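The skip logic in this patch amounts to a per-context draw counter compared against a threshold read once at screen creation. A minimal sketch of that shape follows; `struct toy_ctx`, `parse_skip_option()` and `toy_after_draw()` are hypothetical names, and `strtoul` stands in for whatever option parser the driver actually uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy context: counts draws and how many of them were actually checked. */
struct toy_ctx {
   unsigned skip_count;      /* draws to let through without checking */
   unsigned num_draw_calls;  /* draws seen so far */
   unsigned num_checks;      /* draws that ran the expensive check */
};

/* Parse the option value; an unset (NULL) option means "skip nothing". */
static unsigned
parse_skip_option(const char *value)
{
   return value ? (unsigned)strtoul(value, NULL, 10) : 0;
}

static void
toy_after_draw(struct toy_ctx *ctx)
{
   /* The expensive flush-and-check step only runs once the counter has
    * reached the threshold. */
   if (ctx->num_draw_calls >= ctx->skip_count)
      ctx->num_checks++;

   /* The counter advances on every draw, checked or not. */
   ++ctx->num_draw_calls;
}
```

With a skip value of 3 and five draws, only the fourth and fifth draws are checked. In a real program the value would come from `getenv("GALLIUM_DDEBUG_SKIP")` or an equivalent helper.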
[Mesa-dev] [PATCH 2/2] gallium/ddebug: regularly log the total number of draw calls
From: Nicolai Hähnle

This helps in the use of GALLIUM_DDEBUG_SKIP: first run a target application
with skip set to a very large number and note how many draw calls happen
before the bug. Then re-run, skipping the corresponding number of calls.
Despite the additional run, this can still be much faster than not skipping
anything.
---
 src/gallium/drivers/ddebug/dd_draw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c
index 0778099..0d7ee9a 100644
--- a/src/gallium/drivers/ddebug/dd_draw.c
+++ b/src/gallium/drivers/ddebug/dd_draw.c
@@ -623,6 +623,9 @@ dd_after_draw(struct dd_context *dctx, struct dd_call *call)
    }
 
    ++dctx->num_draw_calls;
+   if (dscreen->skip_count && dctx->num_draw_calls % 1 == 0)
+      fprintf(stderr, "Gallium debugger reached %u draw calls.\n",
+              dctx->num_draw_calls);
 }
 
 static void
-- 
2.5.0
[Mesa-dev] [PATCH] radeonsi: also print hexadecimal values for register fields in the IB parser
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_debug.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c
index cce665e..034acf5 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -61,13 +61,16 @@ static void print_spaces(FILE *f, unsigned num)
 static void print_value(FILE *file, uint32_t value, int bits)
 {
     /* Guess if it's int or float */
-    if (value <= (1 << 15))
-        fprintf(file, "%u\n", value);
-    else {
+    if (value <= (1 << 15)) {
+        if (value <= 9)
+            fprintf(file, "%u\n", value);
+        else
+            fprintf(file, "%u (0x%0*x)\n", value, bits / 4, value);
+    } else {
         float f = uif(value);
 
         if (fabs(f) < 10 && f*10 == floor(f*10))
-            fprintf(file, "%.1ff\n", f);
+            fprintf(file, "%.1ff (0x%0*x)\n", f, bits / 4, value);
         else
             /* Don't print more leading zeros than there are bits. */
             fprintf(file, "0x%0*x\n", bits / 4, value);
-- 
2.1.4
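For readers who want to try the int-or-float guess outside the driver, here is a standalone sketch that writes into a string instead of a FILE. `format_value()` and the local `uif()` are illustrative rewrites, not the si_debug.c code: the trailing newline is dropped, and the `fabs`/`floor` test is replaced by an equivalent integer round-trip so the example does not depend on the math library.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float. */
static float
uif(uint32_t v)
{
   float f;
   memcpy(&f, &v, sizeof(f));
   return f;
}

static void
format_value(char *buf, size_t len, uint32_t value, int bits)
{
   if (value <= (1u << 15)) {
      /* Small values are treated as integers; single digits read the
       * same in decimal and hex, so the hex form is omitted. */
      if (value <= 9)
         snprintf(buf, len, "%u", (unsigned)value);
      else
         snprintf(buf, len, "%u (0x%0*x)", (unsigned)value,
                  bits / 4, (unsigned)value);
   } else {
      float f = uif(value);

      /* "Looks like a float": magnitude below 10 with at most one
       * decimal digit. The range test also rejects NaN and makes the
       * int cast safe. */
      if (f > -10.0f && f < 10.0f && f * 10 == (float)(int)(f * 10))
         snprintf(buf, len, "%.1ff (0x%0*x)", f, bits / 4, (unsigned)value);
      else
         /* No more leading zeros than the field has bits. */
         snprintf(buf, len, "0x%0*x", bits / 4, (unsigned)value);
   }
}
```

Under these rules the 32-bit pattern 0x3f800000 is formatted as `1.0f (0x3f800000)`, while 16 in an 8-bit field is formatted as `16 (0x10)`.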
Re: [Mesa-dev] [PATCH] mesa: fix ID usage for buffer warnings
Reviewed-by: Ilia MirkinOn Wed, Dec 9, 2015 at 6:02 PM, Brian Paul wrote: > We need a different ID pointer for each call site. > --- > src/mesa/main/bufferobj.c | 18 -- > 1 file changed, 12 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c > index 6bc1b5e..e0639c8 100644 > --- a/src/mesa/main/bufferobj.c > +++ b/src/mesa/main/bufferobj.c > @@ -60,16 +60,16 @@ > > /** > * Helper to warn of possible performance issues, such as frequently > - * updating a buffer created with GL_STATIC_DRAW. > + * updating a buffer created with GL_STATIC_DRAW. Called via the macro > + * below. > */ > static void > -buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...) > +buffer_usage_warning(struct gl_context *ctx, GLuint *id, const char *fmt, > ...) > { > va_list args; > - GLuint msg_id = 0; > > va_start(args, fmt); > - _mesa_gl_vdebug(ctx, _id, > + _mesa_gl_vdebug(ctx, id, > MESA_DEBUG_SOURCE_API, > MESA_DEBUG_TYPE_PERFORMANCE, > MESA_DEBUG_SEVERITY_MEDIUM, > @@ -77,6 +77,12 @@ buffer_usage_warning(struct gl_context *ctx, const char > *fmt, ...) > va_end(args); > } > > +#define BUFFER_USAGE_WARNING(CTX, FMT, ...) \ > + do { \ > + static GLuint id = 0; \ > + buffer_usage_warning(CTX, , FMT, ##__VA_ARGS__); \ > + } while (0) > + > > /** > * Used as a placeholder for buffer objects between glGenBuffers() and > @@ -1713,7 +1719,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct > gl_buffer_object *bufObj, >/* If the application declared the buffer as static draw/copy or stream > * draw, it should not be frequently modified with glBufferSubData. 
> */ > - buffer_usage_warning(ctx, > + BUFFER_USAGE_WARNING(ctx, > "using %s(buffer %u, offset %u, size %u) to " > "update a %s buffer", > func, bufObj->Name, offset, size, > @@ -2432,7 +2438,7 @@ _mesa_map_buffer_range(struct gl_context *ctx, >if ((bufObj->Usage == GL_STATIC_DRAW || > bufObj->Usage == GL_STATIC_COPY) && >bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) { > - buffer_usage_warning(ctx, > + BUFFER_USAGE_WARNING(ctx, >"using %s(buffer %u, offset %u, length %u) to " >"update a %s buffer", >func, bufObj->Name, offset, length, > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] nv50, nvc0: optimize coherent buffer checking at draw time
On Wed, Dec 9, 2015 at 5:40 PM, Samuel Pitoisetwrote: > Instead of iterating over all the buffer resources looking for coherent > buffers, we keep track of a context-wide count. This will save some > iterations (and CPU cycles) in 99.99% case because usually coherent > buffers are not so used. > > Changes from v2: > - forgot to apply some changes for nv50 (texture/vertex bufs) > > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/nv50/nv50_context.h | 3 ++ > src/gallium/drivers/nouveau/nv50/nv50_state.c | 19 +++ > src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 44 > +++-- > src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 3 ++ > src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 26 +++ > src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +-- > 6 files changed, 64 insertions(+), 72 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h > b/src/gallium/drivers/nouveau/nv50/nv50_context.h > index 2cebcd9..712d00e 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h > +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h > @@ -134,9 +134,11 @@ struct nv50_context { > struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS]; > uint16_t constbuf_dirty[3]; > uint16_t constbuf_valid[3]; > + uint16_t constbuf_coherent[3]; > > struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS]; > unsigned num_vtxbufs; > + uint32_t vtxbufs_coherent; > struct pipe_index_buffer idxbuf; > uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */ > uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */ > @@ -148,6 +150,7 @@ struct nv50_context { > > struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS]; > unsigned num_textures[3]; > + uint32_t textures_coherent[3]; > struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS]; > unsigned num_samplers[3]; > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c > b/src/gallium/drivers/nouveau/nv50/nv50_state.c > index fd7c7cd..b6e5c75 100644 > --- 
a/src/gallium/drivers/nouveau/nv50/nv50_state.c > +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c > @@ -661,9 +661,16 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, > int s, > assert(nr <= PIPE_MAX_SAMPLERS); > for (i = 0; i < nr; ++i) { >struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]); > + struct pipe_resource *res = views[i]->texture; I'm moderately sure either views[i] or texture can be null. [Same for nvc0.] >if (old) > nv50_screen_tic_unlock(nv50->screen, old); > > + if (res->target == PIPE_BUFFER && > + (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)) > + nv50->textures_coherent[s] |= 1 << i; > + else > + nv50->textures_coherent[s] &= ~(1 << i); > + >pipe_sampler_view_reference(>textures[s][i], views[i]); > } > > @@ -852,8 +859,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint > shader, uint index, >nv50->constbuf[s][i].offset = cb->buffer_offset; >nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), > 0x1); >nv50->constbuf_valid[s] |= 1 << i; > + if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) > + nv50->constbuf_coherent[s] |= 1 << i; > + else > + nv50->constbuf_coherent[s] &= ~(1 << i); You also need to clear it out in the user case. 
> } else { >nv50->constbuf_valid[s] &= ~(1 << i); > + nv50->constbuf_coherent[s] &= ~(1 << i); > } > nv50->constbuf_dirty[s] |= 1 << i; > > @@ -1000,6 +1012,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe, > if (!vb) { >nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot); >nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot); > + nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot); >return; > } > > @@ -1012,9 +1025,15 @@ nv50_set_vertex_buffers(struct pipe_context *pipe, > nv50->vbo_constant |= 1 << dst_index; > else > nv50->vbo_constant &= ~(1 << dst_index); > + nv50->vtxbufs_coherent &= ~(1 << dst_index); >} else { > nv50->vbo_user &= ~(1 << dst_index); > nv50->vbo_constant &= ~(1 << dst_index); > + > + if (vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT) I wonder if vb[i].buffer might be null here... Not sure if that's allowed or not... > +nv50->vtxbufs_coherent |= (1 << dst_index); > + else > +nv50->vtxbufs_coherent &= ~(1 << dst_index); >} > } > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c > b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c > index 85878d5..b6ba803 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c > +++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c > @@ -761,8 +761,7
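The bookkeeping discussed in this review boils down to one bit per slot that is set or cleared whenever a binding changes, so that draw time only needs to test the mask against zero. The helpers below are a hedged sketch of that pattern with invented names (`set_slot_coherent`, `clear_slot_range`, `any_coherent`); they are not the nouveau functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Record whether the buffer bound at 'slot' is coherent. */
static void
set_slot_coherent(uint32_t *mask, unsigned slot, bool coherent)
{
   if (coherent)
      *mask |= 1u << slot;
   else
      *mask &= ~(1u << slot);
}

/* Clear 'count' consecutive bits starting at 'start', as done when a
 * whole range of bindings is removed at once. The 64-bit intermediate
 * keeps the shift well defined when count is 32. */
static void
clear_slot_range(uint32_t *mask, unsigned start, unsigned count)
{
   *mask &= ~(uint32_t)(((1ull << count) - 1) << start);
}

/* Draw-time check: a single compare instead of a loop over all slots. */
static bool
any_coherent(uint32_t mask)
{
   return mask != 0;
}
```

Every path that changes a binding has to update the mask, including unbinding and user-pointer buffers; a bit left stale in either direction either costs an unneeded barrier or skips a needed one.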
Re: [Mesa-dev] [PATCH 8/8] glsl: only divide left components when it is a dual slot double.
On 12/09/2015 04:07 AM, Timothy Arceri wrote:
> On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> Signed-off-by: Dave Airlie
>> ---
>>  src/glsl/lower_packed_varyings.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/lower_packed_varyings.cpp
>> b/src/glsl/lower_packed_varyings.cpp
>> index 037c27d..ec9af62 100644
>> --- a/src/glsl/lower_packed_varyings.cpp
>> +++ b/src/glsl/lower_packed_varyings.cpp
>> @@ -472,7 +472,7 @@
>> lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue,
>>        char right_swizzle_name[4] = { 0, 0, 0, 0 };
>>
>>        left_components = 4 - fine_location % 4;
>> -      if (rvalue->type->is_double()) {
>> +      if (rvalue->type->is_dual_slot_double()) {
>
> The subject line says what the change is but there is no explanation on
> why it was made. Can you add more detail to the comment?
>
> To me the existing code *seems* correct as all doubles take up twice
> the amount of components, why would we only divide by 2 when its a dual
> slot double?

Yeah... it seems like after this change it might try to pack a double in
the W component of a vec3.

>>           /* We might actually end up with 0 left components! */
>>           left_components /= 2;
>>        }
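The arithmetic under discussion is small enough to check by hand. The sketch below isolates it in a hypothetical helper, `left_components()`, whose `halve` flag stands in for whichever type test the pass ends up using; it is not the lower_packed_varyings code.

```c
#include <stdbool.h>

/* Number of components still free in the current vec4 slot, given a
 * location counted in 32-bit components. When 'halve' is set the value
 * is treated as 64-bit, so each component uses two 32-bit channels. */
static unsigned
left_components(unsigned fine_location, bool halve)
{
   unsigned left = 4 - fine_location % 4;

   if (halve) {
      /* Can legitimately reach 0: one free channel cannot hold a double. */
      left /= 2;
   }
   return left;
}
```

Starting at the W channel (`fine_location % 4 == 3`) leaves one 32-bit channel free. With halving that becomes zero, so nothing is packed there; without halving a 64-bit component would be assigned to a single channel, which is the case the reviewers are worried about.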
[Mesa-dev] [PATCH] mesa: fix ID usage for buffer warnings
We need a different ID pointer for each call site.
---
 src/mesa/main/bufferobj.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 6bc1b5e..e0639c8 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -60,16 +60,16 @@
 /**
  * Helper to warn of possible performance issues, such as frequently
- * updating a buffer created with GL_STATIC_DRAW.
+ * updating a buffer created with GL_STATIC_DRAW. Called via the macro
+ * below.
  */
 static void
-buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
+buffer_usage_warning(struct gl_context *ctx, GLuint *id, const char *fmt, ...)
 {
    va_list args;
-   GLuint msg_id = 0;

    va_start(args, fmt);
-   _mesa_gl_vdebug(ctx, &msg_id,
+   _mesa_gl_vdebug(ctx, id,
                    MESA_DEBUG_SOURCE_API,
                    MESA_DEBUG_TYPE_PERFORMANCE,
                    MESA_DEBUG_SEVERITY_MEDIUM,
@@ -77,6 +77,12 @@ buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
    va_end(args);
 }

+#define BUFFER_USAGE_WARNING(CTX, FMT, ...) \
+   do { \
+      static GLuint id = 0; \
+      buffer_usage_warning(CTX, &id, FMT, ##__VA_ARGS__); \
+   } while (0)
+

 /**
  * Used as a placeholder for buffer objects between glGenBuffers() and
@@ -1713,7 +1719,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct gl_buffer_object *bufObj,
       /* If the application declared the buffer as static draw/copy or stream
        * draw, it should not be frequently modified with glBufferSubData.
        */
-      buffer_usage_warning(ctx,
+      BUFFER_USAGE_WARNING(ctx,
                            "using %s(buffer %u, offset %u, size %u) to "
                            "update a %s buffer",
                            func, bufObj->Name, offset, size,
@@ -2432,7 +2438,7 @@ _mesa_map_buffer_range(struct gl_context *ctx,
    if ((bufObj->Usage == GL_STATIC_DRAW ||
         bufObj->Usage == GL_STATIC_COPY) &&
        bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) {
-      buffer_usage_warning(ctx,
+      BUFFER_USAGE_WARNING(ctx,
                            "using %s(buffer %u, offset %u, length %u) to "
                            "update a %s buffer",
                            func, bufObj->Name, offset, length,
-- 
1.9.1
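Why wrapping the call in a macro gives each call site its own ID can be shown with a small standalone sketch. Everything here is a stand-in: `report()` only imitates the assumed behaviour of `_mesa_gl_vdebug()` (assigning a fresh ID the first time it sees a zero-valued slot), and `REPORT`, `site_a()` and `site_b()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the debug-message backend. Assumption: a zero slot gets a
 * newly allocated ID on first use and keeps it afterwards. */
static unsigned next_dynamic_id = 1;

static unsigned
report(unsigned *id, const char *msg)
{
   assert(id != NULL);
   if (*id == 0)
      *id = next_dynamic_id++;
   printf("[id %u] %s\n", *id, msg);
   return *id;
}

/* Every expansion of the macro declares its own block-scoped static, so
 * each call site owns a separate, persistent ID slot. */
#define REPORT(MSG, OUT)            \
   do {                             \
      static unsigned id = 0;       \
      (OUT) = report(&id, (MSG));   \
   } while (0)

/* Two distinct call sites. */
static unsigned
site_a(void)
{
   unsigned used;
   REPORT("sub-data warning", used);
   return used;
}

static unsigned
site_b(void)
{
   unsigned used;
   REPORT("map-range warning", used);
   return used;
}
```

Calling `site_a()` repeatedly yields the same ID each time, while `site_b()` gets a different one. A single local `msg_id = 0` inside the helper, as in the code being replaced, would instead hand the backend a fresh zero on every call.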
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #20 from nerdopol...@verizon.net ---
With Mesa master, all the examples from qtbase/examples/opengl now work as well, even after exporting LIBGL_ALWAYS_SOFTWARE. That includes contextinfo, qopenglwidget, qopenglwindow, and threadedqopenglwindow.
[Mesa-dev] [PATCH] mesa/varray: set double arrays to non-normalised.
From: Dave Airlie

I don't think this has any effect in practice, but the CTS reads the value back using GetVertexAttrib.

This fixes:
GL41-CTS.vertex_attrib_64bit.get_vertex_attrib

Signed-off-by: Dave Airlie
---
 src/mesa/main/varray.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 58f376b..c71e16a 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -776,7 +776,7 @@ _mesa_VertexAttribLPointer(GLuint index, GLint size, GLenum type,
    update_array(ctx, "glVertexAttribLPointer", VERT_ATTRIB_GENERIC(index),
                 legalTypes, 1, 4,
-                size, type, stride, GL_TRUE, GL_FALSE, GL_TRUE, ptr);
+                size, type, stride, GL_FALSE, GL_FALSE, GL_TRUE, ptr);
 }
-- 
2.5.0
Re: [Mesa-dev] [PATCH 1/3] mesa: docs: Add link to planet.freedesktop.org
This patch is

Reviewed-by: Ian Romanick

On 12/07/2015 12:18 PM, Sarah Sharp wrote:
> The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or
> any of the graphics project wikis (including the DRI wiki) on
> freedesktop.org. Fix that by linking to them from the sidebar.
>
> Signed-off-by: Sarah Sharp
> ---
>  docs/contents.html | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/docs/contents.html b/docs/contents.html
> index 6612cbe..a683b07 100644
> --- a/docs/contents.html
> +++ b/docs/contents.html
> @@ -90,6 +90,7 @@
>  <a href="http://www.opengl.org" target="_parent">OpenGL website</a>
>  <a href="http://dri.freedesktop.org" target="_parent">DRI website</a>
>  <a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>
> +<a href="http://planet.freedesktop.org" target="_parent">Developer blogs</a>
>
>
>  Hosted by:
>
Re: [Mesa-dev] [PATCH 2/3] mesa: docs: i965: Use correct doxygen groupings syntax
On 12/07/2015 12:18 PM, Sarah Sharp wrote:
> When reading the source code, it's useful to indicate that a group of
> fields in a struct are related in some way. The convention in Mesa seems
> to be:
>
> struct foo {
>    /**
>     * Related fields description
>     * @{
>     */
>    int bar;
>    char baz;
>    */@} /*@}*/
>    long qux;
> }
>
> However, the doxygen syntax for grouping is:
>
> struct foo {
>    /**
>     * @defgroup group_name Related fields description

There are 193 uses of \name, but only one use of @defgroup. Assuming that
\name is also correct, and
https://www.stack.nl/~dimitri/doxygen/manual/grouping.html#memgroup
suggests that it is, we should stick with that. If it's not correct...
can a sed job convert the \name uses to \defgroup?

>     * @{
>     */
>    int bar;
>    char baz;
>    */@} /*@}*/
>    long qux;
> }
>
> https://www.stack.nl/~dimitri/doxygen/manual/grouping.html
>
> Without the group name definition, the fields don't get properly
> grouped. Instead, the group description is applied to the first field.
> You can see this in the current doxygen build, since there are no links
> to groups (doxygen calls them modules).
>
> Fix this. Use the brw_ prefix for the group name, since I'm pretty sure
> group names are global.
>
> Signed-off-by: Sarah Sharp
> ---
>  src/mesa/drivers/dri/i965/brw_device_info.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_device_info.h b/src/mesa/drivers/dri/i965/brw_device_info.h
> index 4911c23..89c1501 100644
> --- a/src/mesa/drivers/dri/i965/brw_device_info.h
> +++ b/src/mesa/drivers/dri/i965/brw_device_info.h
> @@ -49,8 +49,8 @@ struct brw_device_info
>     bool has_resource_streamer;
>
>     /**
> -    * Quirks:
> -    * @{
> +    * @defgroup brw_quirks Hardware quirks
> +    * @{
>      */
>     bool has_negative_rhw_bug;
>
> @@ -62,11 +62,11 @@ struct brw_device_info
>      * fragment shader instructions.
>      */
>     bool needs_unlit_centroid_workaround;
> -   /** @} */
> +   /* @} */
>
>     /**
> -    * GPU Limits:
> -    * @{
> +    * @defgroup brw_gpu_limits GPU hardware limitations
> +    * @{
>      */
>     unsigned max_vs_threads;
>     unsigned max_hs_threads;
> @@ -83,7 +83,7 @@ struct brw_device_info
>        unsigned max_ds_entries;
>        unsigned max_gs_entries;
>     } urb;
> -   /** @} */
> +   /* @} */
>  };
>
>  const struct brw_device_info *brw_get_device_info(int devid);
>
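For comparison, here is a minimal sketch of the `\name` member-group form that the reply suggests keeping, written in the C-style marker variant described in the Doxygen grouping manual linked above. The struct name `example_device_info` and the `gen` field are hypothetical; the other field names are borrowed from the quoted diff. Whether Doxygen renders this exactly as wanted for Mesa's build configuration is not verified here.

```c
#include <stdbool.h>

/* Illustrative only: not the real brw_device_info layout. */
struct example_device_info {
   int gen;

   /** \name Hardware quirks */
   /**@{*/
   bool has_negative_rhw_bug;
   bool needs_unlit_centroid_workaround;
   /**@}*/

   /** \name GPU limits */
   /**@{*/
   unsigned max_vs_threads;
   unsigned max_hs_threads;
   /**@}*/
};
```

Unlike `@defgroup`, a member group created with `\name` is local to the enclosing struct, so it would not need a globally unique identifier such as the `brw_` prefix proposed in the patch.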
Re: [Mesa-dev] [PATCH 03/15] i965/fs: Use MOV_INDIRECT for all indirect uniform loads
On Wed, Dec 9, 2015 at 8:23 PM, Jason Ekstrand wrote:
> Instead of using reladdr, this commit changes the FS backend to emit a
> MOV_INDIRECT whenever we need an indirect uniform load. We also have to
> rework some of the other bits of the backend to handle this new form of
> uniform load. The obvious change is that demote_pull_constants now acts
> more like a lowering pass when it hits a MOV_INDIRECT.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp     | 72 +++-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 53 ++-
>  2 files changed, 86 insertions(+), 39 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index bf446d2..7cc03c5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1945,8 +1945,8 @@ fs_visitor::assign_constant_locations()
>           if (inst->src[i].file != UNIFORM)
>              continue;
>
> -         if (inst->src[i].reladdr) {
> -            int uniform = inst->src[i].nr;
> +         if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
> +            int uniform = inst->src[0].nr;
>
>              /* If this array isn't already present in the pull constant buffer,
>               * add it.
> @@ -2028,49 +2028,63 @@ fs_visitor::assign_constant_locations()
>  void
>  fs_visitor::demote_pull_constants()
>  {
> -   foreach_block_and_inst (block, fs_inst, inst, cfg) {
> +   const unsigned index = stage_prog_data->binding_table.pull_constants_start;
> +
> +   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
> +      /* Set up the annotation tracking for new generated instructions. */
> +      const fs_builder ibld(this, block, inst);
> +
>        for (int i = 0; i < inst->sources; i++) {
>           if (inst->src[i].file != UNIFORM)
>              continue;
>
> -         int pull_index;
> +         /* We'll handle this case later */
> +         if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0)
> +            continue;
> +
>           unsigned location = inst->src[i].nr + inst->src[i].reg_offset;
> -         if (location >= uniforms) /* Out of bounds access */
> -            pull_index = -1;
> -         else
> -            pull_index = pull_constant_loc[location];
> +         if (location >= uniforms)
> +            continue; /* Out of bounds access */
> +
> +         int pull_index = pull_constant_loc[location];
>
>           if (pull_index == -1)
>              continue;
>
> -         /* Set up the annotation tracking for new generated instructions. */
> -         const fs_builder ibld(this, block, inst);
> -         const unsigned index = stage_prog_data->binding_table.pull_constants_start;
> -         fs_reg dst = vgrf(glsl_type::float_type);
> -
>           assert(inst->src[i].stride == 0);
>
> -         /* Generate a pull load into dst. */
> -         if (inst->src[i].reladdr) {
> -            VARYING_PULL_CONSTANT_LOAD(ibld, dst,
> -                                       brw_imm_ud(index),
> -                                       *inst->src[i].reladdr,
> -                                       pull_index * 4);
> -            inst->src[i].reladdr = NULL;
> -            inst->src[i].stride = 1;
> -         } else {
> -            const fs_builder ubld = ibld.exec_all().group(8, 0);
> -            struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & ~15);
> -            ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
> -                      dst, brw_imm_ud(index), offset);
> -            inst->src[i].set_smear(pull_index & 3);
> -         }
> -         brw_mark_surface_used(prog_data, index);
> +         fs_reg dst = vgrf(glsl_type::float_type);
> +         const fs_builder ubld = ibld.exec_all().group(8, 0);
> +         struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & ~15);
> +         ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
> +                   dst, brw_imm_ud(index), offset);
>
>           /* Rewrite the instruction to use the temporary VGRF. */
>           inst->src[i].file = VGRF;
>           inst->src[i].nr = dst.nr;
>           inst->src[i].reg_offset = 0;
> +         inst->src[i].set_smear(pull_index & 3);
> +
> +         brw_mark_surface_used(prog_data, index);
> +      }
> +
> +      if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT &&
> +          inst->src[0].file == UNIFORM) {
> +
> +         unsigned location = inst->src[0].nr + inst->src[0].reg_offset;
> +         if (location >= uniforms)
> +            continue; /* Out of bounds access */
> +
> +         int pull_index = pull_constant_loc[location];
> +         assert(pull_index >= 0); /* This had better be pull */
> +
> +         VARYING_PULL_CONSTANT_LOAD(ibld, inst->dst,
> +                                    brw_imm_ud(index),
> +                                    inst->src[1],
> +                                    pull_index * 4);
> +         inst->remove(block);
>
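The address arithmetic in the uniform (non-indirect) pull path can be checked with a small standalone sketch. Only the two expressions `(pull_index * 4) & ~15` and `pull_index & 3` come from the quoted patch; the `pull_load` struct and `pull_load_for()` are hypothetical names, and the reading that the load fetches an aligned 16-byte block from which one dword is then selected is an interpretation of the patch rather than something it states.

```c
/* Hypothetical description of one uniform pull load (not an i965 type). */
struct pull_load {
   unsigned block_offset; /* byte offset of the aligned 16-byte block */
   unsigned component;    /* dword within that block: 0..3 */
};

/* pull_index counts 32-bit slots in the pull constant buffer. */
static struct pull_load
pull_load_for(unsigned pull_index)
{
   struct pull_load load;

   /* Byte offset of the slot, rounded down to a 16-byte boundary. */
   load.block_offset = (pull_index * 4) & ~15u;

   /* Which of the four dwords in the block holds the value
    * (the argument passed to set_smear() in the patch). */
   load.component = pull_index & 3;

   return load;
}
```

For example, slot 6 lives at byte 24, so the load starts at byte 16 and the value is dword 2 of that block.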
Re: [Mesa-dev] [PATCH v3 44/44] docs: Add ARB_compute_shader to 11.2.0 release notes
On Tue, Dec 01, 2015 at 12:20:02AM -0800, Jordan Justen wrote:
> Signed-off-by: Jordan Justen
> Cc: Iago Toral Quiroga

For the series:

Reviewed-by: Kristian Høgsberg

Admittedly, this was a light review on patches 17-25, but Iago covered those and the overall refactoring looks sound. For the new shared load/store intrinsics, we'll want to avoid the _indirect versions, but that may be better done as an update to Jason's series.

Kristian

> ---
>  docs/relnotes/11.2.0.html | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/docs/relnotes/11.2.0.html b/docs/relnotes/11.2.0.html
> index c9c0c90..eb53e81 100644
> --- a/docs/relnotes/11.2.0.html
> +++ b/docs/relnotes/11.2.0.html
> @@ -45,6 +45,7 @@ Note: some of the new features are only available with certain drivers.
>
>
>  GL_ARB_base_instance on freedreno/a4xx
> +GL_ARB_compute_shader on i965
>  GL_ARB_texture_buffer_object_rgb32 on freedreno/a4xx
>  GL_ARB_texture_buffer_range on freedreno/a4xx
>  GL_ARB_texture_query_lod on freedreno/a4xx
> --
> 2.6.2