[Mesa-dev] [PATCH 15/17] i965: Make BLORP properly avoid batch wrapping.

2017-09-05 Thread Kenneth Graunke
We need to set brw->no_batch_wrap to actually avoid flushing in the middle of our BLORP operation, and instead grow the batchbuffer. --- src/mesa/drivers/dri/i965/genX_blorp_exec.c | 16 ++-- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git

[Mesa-dev] [PATCH 12/17] i965: Replace exit(1) with abort() when command submission fails.

2017-09-05 Thread Kenneth Graunke
Calling exit(1) when execbuffer fails is not necessarily safe. When running Piglit tests with a Mesa that submitted invalid commands to the GPU, I discovered the following problem: 1. do_flush_locked fails and calls exit(1)...invoking atexit handlers. 2. Piglit tries to clean up after itself, and

[Mesa-dev] [PATCH 16/17] i965: Delete BATCH_RESERVED handling.

2017-09-05 Thread Kenneth Graunke
Now that we can grow the batchbuffer if we absolutely need the extra space, we don't need to reserve space for the final do-or-die ending commands. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 11 +++ src/mesa/drivers/dri/i965/intel_batchbuffer.h | 26 -- 2

[Mesa-dev] [PATCH 08/17] i965: Move brw_state_batch code to intel_batchbuffer.c

2017-09-05 Thread Kenneth Graunke
The batch buffer and state buffer code is fairly tied together, and having it in one .c file will make refactoring easier. Also, drop some commentary above brw_state_batch. The "aperture checking performance hacks" are long since gone, so that paragraph makes little sense at this point. ---

[Mesa-dev] [PATCH 11/17] i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.

2017-09-05 Thread Kenneth Graunke
We'll need to read from both buffers when decoding state. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 102 +- 1 file changed, 52 insertions(+), 50 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c

[Mesa-dev] [PATCH 17/17] i965: Disentangle batch and state buffer flushing.

2017-09-05 Thread Kenneth Graunke
We now flush the batch when either the batchbuffer or statebuffer reaches the original intended batch size, instead of when the sum of the two reaches a certain size (which makes no sense now that they're separate buffers). With this change, we also need to update our "are we near the end?"
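
The change in flushing rule described above can be sketched as two predicates. This is only an illustration: the function names and the single `limit` parameter are hypothetical, and the `>=` comparison is an assumption about what "reaches" means, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old rule as described in the message: flush when the sum of the two
 * buffers' usage reaches a limit (hypothetical names throughout). */
static bool
sketch_should_flush_combined(uint32_t batch_used, uint32_t state_used,
                             uint32_t limit)
{
   return batch_used + state_used >= limit;
}

/* New rule as described in the message: flush when either buffer on its
 * own reaches the limit. */
static bool
sketch_should_flush_separate(uint32_t batch_used, uint32_t state_used,
                             uint32_t limit)
{
   return batch_used >= limit || state_used >= limit;
}
```

With a limit of 100, usages of 60 and 60 would trigger a flush under the combined rule but not under the separate one.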

[Mesa-dev] [PATCH 09/17] i965: Refactor relocs into a brw_reloc_list structure.

2017-09-05 Thread Kenneth Graunke
I'm planning on splitting batch and state into separate buffers, at which point we'll need two relocation lists. In preparation for that, this patch refactors the relocation stuff into a structure we can replicate...which looks a lot like anv_reloc_list. ---

[Mesa-dev] [PATCH 13/17] i965: Use a separate state buffer, but avoid changing flushing behavior.

2017-09-05 Thread Kenneth Graunke
Previously, we emitted GPU commands and indirect state into the same buffer, using a stack/heap like system where we filled in commands from the start of the buffer, and state from the end of the buffer. We then flushed before the two met in the middle. Meeting in the middle is fatal, so you
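
The stack/heap layout described above (commands from the start, state from the end, stop before they meet) can be sketched roughly as follows. All names here are hypothetical and the sketch tracks offsets only; it is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-ended buffer: commands grow up from offset 0,
 * indirect state grows down from the end. */
struct sketch_two_ended {
   uint32_t size;       /* total size in bytes */
   uint32_t cmd_used;   /* bytes of commands written so far */
   uint32_t state_top;  /* offset of the lowest state byte */
};

static void
sketch_two_ended_init(struct sketch_two_ended *b, uint32_t size)
{
   b->size = size;
   b->cmd_used = 0;
   b->state_top = size;
}

/* Reserve space for commands; fails instead of meeting the state area. */
static bool
sketch_alloc_cmd(struct sketch_two_ended *b, uint32_t bytes, uint32_t *offset)
{
   if (bytes > b->state_top - b->cmd_used)
      return false;
   *offset = b->cmd_used;
   b->cmd_used += bytes;
   return true;
}

/* Reserve space for state; fails instead of meeting the command area. */
static bool
sketch_alloc_state(struct sketch_two_ended *b, uint32_t bytes, uint32_t *offset)
{
   if (bytes > b->state_top - b->cmd_used)
      return false;
   b->state_top -= bytes;
   *offset = b->state_top;
   return true;
}
```

A caller that gets `false` back would have to flush first, which corresponds to the "flushed before the two met in the middle" behavior in the message.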

[Mesa-dev] [PATCH 10/17] i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.

2017-09-05 Thread Kenneth Graunke
brw_batch_reloc emits a relocation from the batchbuffer to elsewhere. brw_state_reloc emits a relocation from the statebuffer to elsewhere. For now, they do the same thing, but when we actually split the two buffers, we'll change brw_state_reloc to use the state buffer. ---

[Mesa-dev] [PATCH 01/17] i965: Don't special case the batchbuffer when reference counting.

2017-09-05 Thread Kenneth Graunke
We don't need to special case the batch - when we add the batch to the validation list, we can simply increase the refcount to 2, and when we make a new batch, we'll drop it back down to 1 (when unreferencing all buffers in the validation list). The final reference is still held by brw->batch.bo,
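
The reference counting described above can be sketched as below. The structures and function names are hypothetical stand-ins, not the i965 types: adding any buffer to the validation list takes a reference, and clearing the list drops one per entry, so the batch needs no special case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer object with a plain reference count. */
struct sketch_bo {
   int refcount;
};

#define SKETCH_MAX_EXEC_BOS 8

struct sketch_batch {
   struct sketch_bo *bo;                             /* holds one reference */
   struct sketch_bo *exec_bos[SKETCH_MAX_EXEC_BOS];  /* validation list */
   size_t exec_count;
};

/* Add a buffer to the validation list, taking a reference on first add. */
static void
sketch_add_exec_bo(struct sketch_batch *batch, struct sketch_bo *bo)
{
   for (size_t i = 0; i < batch->exec_count; i++) {
      if (batch->exec_bos[i] == bo)
         return;  /* already listed, no extra reference */
   }
   assert(batch->exec_count < SKETCH_MAX_EXEC_BOS);
   bo->refcount++;
   batch->exec_bos[batch->exec_count++] = bo;
}

/* Drop the list's reference on every buffer, the batch included. */
static void
sketch_clear_exec_list(struct sketch_batch *batch)
{
   for (size_t i = 0; i < batch->exec_count; i++) {
      batch->exec_bos[i]->refcount--;
      batch->exec_bos[i] = NULL;
   }
   batch->exec_count = 0;
}
```

In this sketch the batch buffer goes from a refcount of 1 to 2 when listed and back to 1 when the list is cleared, matching the counts given in the message.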

[Mesa-dev] [PATCH 14/17] i965: Grow the batch/state buffers if we need space and can't flush.

2017-09-05 Thread Kenneth Graunke
Previously, we would just assert fail and die in this case. The only safeguard is the "estimated max prim size" checks when starting a draw (or compute dispatch or BLORP operation)...which are woefully broken. Growing is fairly straightforward: 1. Allocate a new larger BO. 2. memcpy the
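
The first two growth steps visible above (allocate a larger buffer, copy the existing contents) can be sketched with ordinary heap memory. This is an assumption-laden illustration: `malloc` stands in for buffer-object allocation, all names are hypothetical, the doubling policy is invented for the example, and the remaining steps of the real patch are cut off in the snippet and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer backed by malloc instead of a GEM BO. */
struct sketch_grow_buf {
   unsigned char *map;
   size_t size;
   size_t used;
};

static bool
sketch_grow_buf_init(struct sketch_grow_buf *b, size_t size)
{
   b->map = malloc(size);
   b->size = b->map ? size : 0;
   b->used = 0;
   return b->map != NULL;
}

/* Step 1: allocate a new, larger backing store.
 * Step 2: memcpy the existing contents into it, then swap it in. */
static bool
sketch_grow_buf_grow(struct sketch_grow_buf *b, size_t new_size)
{
   if (new_size <= b->size)
      return true;

   unsigned char *new_map = malloc(new_size);
   if (!new_map)
      return false;

   memcpy(new_map, b->map, b->used);
   free(b->map);
   b->map = new_map;
   b->size = new_size;
   return true;
}

/* Append data, growing (by doubling, an arbitrary choice) when it does
 * not fit. The real driver only grows when it cannot flush. */
static bool
sketch_grow_buf_append(struct sketch_grow_buf *b, const void *data, size_t bytes)
{
   if (bytes > b->size - b->used) {
      size_t new_size = b->size ? b->size : 1;
      while (bytes > new_size - b->used)
         new_size *= 2;
      if (!sketch_grow_buf_grow(b, new_size))
         return false;
   }
   memcpy(b->map + b->used, data, bytes);
   b->used += bytes;
   return true;
}
```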

[Mesa-dev] [PATCH 04/17] i965: Use batch->bo->size in brw_emit_reloc assertion.

2017-09-05 Thread Kenneth Graunke
This makes the assertion safe against batchbuffers growing. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index

[Mesa-dev] [PATCH 06/17] i965: Drop a useless ret == 0 check.

2017-09-05 Thread Kenneth Graunke
Prior to the previous patch, we would pwrite the batchbuffer contents, and wanted to skip the execbuffer if that failed. Now, we write things directly to the map, so we don't need this check. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 40 --- 1 file changed, 18

[Mesa-dev] [PATCH 07/17] i965: Remove map fallback in INTEL_DEBUG=bat code.

2017-09-05 Thread Kenneth Graunke
This only made sense for the shadow copy of the batch. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index

[Mesa-dev] [PATCH 02/17] i965: Add an INTEL_DEBUG=flush option for printing batch statistics.

2017-09-05 Thread Kenneth Graunke
When a batch is flushed, INTEL_DEBUG=bat prints a message indicating which part of the code triggered the flush, and some statistics about the batch/state buffer utilization. It also decodes the batchbuffer in debug builds...which is so much output that it drowns out the utilization messages,

[Mesa-dev] [PATCH 05/17] i965: Drop CPU-side shadow copy of the batchbuffer for non-LLC systems.

2017-09-05 Thread Kenneth Graunke
Now that we have write-combining maps, our writes to the batch should be reasonably fast. (In the past, we only had uncached maps, which were slow...so we kept a CPU-side shadow copy for write combining purposes.) There are a few places that still read back a DWord or so from the batch, which

[Mesa-dev] [PATCH 00/17] i965: Growing the batch buffer, separate state buffers

2017-09-05 Thread Kenneth Graunke
Hello, This series separates GPU commands and indirect state into two distinct buffers - the batch buffer and the state buffer. It then adds support for growing the batch/state buffers, in case we need more space but are in a "critical section" where we can't safely "wrap" (flush) the batch.

[Mesa-dev] [PATCH 03/17] i965: Delete a batch size assertion that isn't very useful.

2017-09-05 Thread Kenneth Graunke
This assertion prevents you from doing intel_batchbuffer_require_space with a size so huge it won't fit in the batchbuffer. This doesn't seem like a common mistake, and I've never seen the assert to be useful. Soon, I hope to have batches grow, at which point this won't make sense. ---

[Mesa-dev] [PATCH 1/2] i965: Inline emit_reloc in __genx_combine_address

2017-09-01 Thread Kenneth Graunke
One less layer of baklava. --- src/mesa/drivers/dri/i965/genX_state_upload.c | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index b15829fb57c..4eb1a79bcd4 100644

[Mesa-dev] [PATCH 2/2] genxml: Make Border Color Pointer an address on Gen4-5, not an offset.

2017-09-01 Thread Kenneth Graunke
--- src/intel/genxml/gen4.xml | 2 +- src/intel/genxml/gen45.xml| 2 +- src/intel/genxml/gen5.xml | 2 +- src/mesa/drivers/dri/i965/genX_state_upload.c | 10 -- 4 files changed, 7 insertions(+), 9 deletions(-) diff --git

[Mesa-dev] [PATCH] i965: Fix crash in fallback GTT mapping.

2017-09-01 Thread Kenneth Graunke
We can't perf_debug without a context. Cc: mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index

[Mesa-dev] [PATCH 5/5] i965: Move BATCH_SZ define into intel_batchbuffer.c.

2017-08-31 Thread Kenneth Graunke
It's only used in one file. --- src/mesa/drivers/dri/i965/brw_context.h | 1 - src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index

[Mesa-dev] [PATCH 4/5] i965: Drop batch_size argument from brw_bufmgr_init().

2017-08-31 Thread Kenneth Graunke
This is dead code and hasn't been used in a long time. --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 2 +- src/mesa/drivers/dri/i965/brw_bufmgr.h | 3 +-- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git

[Mesa-dev] [PATCH 1/5] i965: Don't double count the batch in aperture_space.

2017-08-31 Thread Kenneth Graunke
intel_batchbuffer_reset calls add_exec_bo on the batch right away, which adds in the batch BO size. Fixes: 29ba502a4e28 ("i965: Use I915_EXEC_BATCH_FIRST when available.") --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Here are some misc.

[Mesa-dev] [PATCH 3/5] i965: Rename brw_bo::offset64 to presumed_offset.

2017-08-31 Thread Kenneth Graunke
We can drop the meaningless "64" suffix - libdrm_intel originally had an "offset" field that was an "unsigned long" which was the wrong size, and we couldn't remove/alter that field without breaking ABI, so we had to add a uint64_t "offset64" field. "presumed_offset" is a bit more descriptive

[Mesa-dev] [PATCH 2/5] i965: Drop the BRW_BATCH_STRUCT macro.

2017-08-31 Thread Kenneth Graunke
It's used in exactly one place these days, and not much simpler than just calling intel_batchbuffer_data directly. --- src/mesa/drivers/dri/i965/brw_state.h | 3 --- src/mesa/drivers/dri/i965/brw_urb.c | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git

Re: [Mesa-dev] [PATCH] i965: add 2xMSAA 16xMSAA modes to DRI configs.

2017-08-30 Thread Kenneth Graunke
On Wednesday, August 30, 2017 3:57:48 AM PDT kevin.rogo...@intel.com wrote: > From: Kevin Rogovin > > For Gen8, add 2xMSAA. For Gen9, add 2xMSAA and 16xMSAA. > Special thanks to Eero Tamminen for reporting rasterizer > numbers being twice what it should be for 2xMSAA

Re: [Mesa-dev] [PATCH 1/5] intel/isl: Set MOCS based on usage for surface states

2017-08-29 Thread Kenneth Graunke
On Tuesday, August 1, 2017 3:48:30 PM PDT Jason Ekstrand wrote: > This makes ISL now ignore the MOCS data provided by the caller and just > set it based on surface usage. > --- > src/intel/isl/isl_emit_depth_stencil.c | 12 > src/intel/isl/isl_genX_mocs.h | 53 >

Re: [Mesa-dev] [PATCH 0/5] intel/isl: Set MOCS based on view usage

2017-08-29 Thread Kenneth Graunke
If there was a vertex buffer usage bit and we could handle that there, I think that would be nice, even if it doesn't otherwise seem relevant to ISL. *shrug* Having ISL do this itself rather than passing in values makes sense. Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

[Mesa-dev] [PATCH 1/6] blorp: Turn anv_CmdCopyBuffer into a blorp_buffer_copy() helper.

2017-08-29 Thread Kenneth Graunke
Anvil already had code to copy between two buffer objects in the most efficient way possible, using large bpp copies, then smaller bpp copies. This patch moves that logic into BLORP as blorp_buffer_copy(), so we can reuse it in i965 as well. --- src/intel/blorp/blorp.h | 6 ++

[Mesa-dev] [PATCH 4/6] i965: Add PIPE_CONTROL_DATA_CACHE flush to brw_emit_mi_flush().

2017-08-29 Thread Kenneth Graunke
Although we're phasing out brw_emit_mi_flush(), we still use it in some places in order to "flush everything". In a number of those places, we write data to a buffer that we may then bind as an image surface, SSBO, or atomic buffer. Those usages require us to flush the data cache. ---

[Mesa-dev] [PATCH 5/6] i965: Always flush caches after blitting to a GL buffer object.

2017-08-29 Thread Kenneth Graunke
When we blit data into a buffer object, we may need to invalidate any caches that might contain stale data, so the new data becomes visible. For example, if the buffer object is bound as a vertex buffer, we need to invalidate the vertex fetch cache. While this flushing was missing, it usually

[Mesa-dev] [PATCH 2/6] blorp: Make blorp_buffer_copy work on Gen4-6.

2017-08-29 Thread Kenneth Graunke
Gen4-6 can only handle surfaces up to 8192. Only Gen7+ can do 16384. --- src/intel/blorp/blorp_blit.c | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index 998fd9b6d39..d3b51a18161 100644 ---
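
The limits stated above amount to a per-generation lookup. A minimal sketch, assuming a hypothetical helper name and an integer generation number as input:

```c
#include <assert.h>
#include <stdint.h>

/* Maximum surface dimension per the message: 8192 on Gen4-6,
 * 16384 on Gen7 and later. The function name is hypothetical. */
static uint32_t
sketch_max_surface_dim(int gen)
{
   return gen >= 7 ? 16384u : 8192u;
}
```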

[Mesa-dev] [PATCH 6/6] i965: Use BLORP for buffer object stall avoidance blits instead of BLT.

2017-08-29 Thread Kenneth Graunke
Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2: - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8). - Car Chase by 1.25607% +/- 0.291262% (n=5). --- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-)

[Mesa-dev] [PATCH 3/6] i965: Add a brw_blorp_copy_buffers() command.

2017-08-29 Thread Kenneth Graunke
This exposes the new blorp_copy_buffer() functionality to i965. It should be a drop-in replacement for intel_emit_linear_blit() (other than the arguments being backwards, for consistency with BLORP). --- src/mesa/drivers/dri/i965/brw_blorp.c | 18 ++

Re: [Mesa-dev] [PATCH 2/2] i915/fpc_translate: Remove a few unused variables

2017-08-27 Thread Kenneth Graunke
i915_translate_instruction(struct i915_fp_compile *p, > const struct i915_full_instruction *inst, > struct i915_fragment_shader *fs) > { > - uint writemask; > uint src0, src1, src2, flags; > uint tmp = 0; > > Patch 1 is

Re: [Mesa-dev] Question about implementing viewport transfer and const load in nir

2017-08-27 Thread Kenneth Graunke
On Saturday, August 26, 2017 6:40:14 AM PDT Qiang Yu wrote: > Hi guys, > > When working on lima gp compiler, I come across two problems about > inserting extra uniform > and instructions in nir for the driver and don't know where's the > right place to do it. So I'd like > to hear your opinion

Re: [Mesa-dev] [PATCH] st/mesa: fix handling of vertex array double inputs

2017-08-27 Thread Kenneth Graunke
var->data.mode == > ir_var_shader_in && var->type->without_array()->is_double()) >this->result.is_double_vertex_input = true; > if (!native_integers) > this->result.type = GLSL_TYPE_FLOAT; > Makes sense to me - structures don't exist

[Mesa-dev] [PATCH] i965: Use GEN_GEN and GEN_IS_HASWELL in genX_state_upload.c code.

2017-08-25 Thread Kenneth Graunke
We were using brw->gen, brw->is_haswell, and devinfo->gen in a few places, when we could just use GEN_GEN and GEN_IS_HASWELL, which are evaluated at compile time. --- src/mesa/drivers/dri/i965/genX_state_upload.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git

Re: [Mesa-dev] [PATCH 1/2] radv: Remove some intel comments from the resolve code.

2017-08-25 Thread Kenneth Graunke
esolve(cmd_buffer, > src_iview, >dest_iview, > Yeah! Down with the 3DSTATE_DRAWING_RECTANGLE comments, and bring on the S_028B50_DONUT_SPLIT and V_028B6C_DISTRIBUTION_MODE_DONUTS comments! :) Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

Re: [Mesa-dev] [PATCH 1/2] mesa: Implement GL_ARB_texture_filter_anisotropic

2017-08-25 Thread Kenneth Graunke
it, > GL_ARB_texture_env_dot3_bit, > + GL_ARB_texture_filter_anisotropic_bit, > GL_ARB_texture_mirrored_repeat_bit, > GL_ARB_texture_non_power_of_two_bit, > GL_ARB_texture_rectangle_bit, Hi Adam, I've never seen new GL extensions added to the GLX code like t

Re: [Mesa-dev] [PATCH] i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.

2017-08-25 Thread Kenneth Graunke
On Thursday, August 24, 2017 9:56:14 PM PDT Tapani Pälli wrote: > Hi; > > On 08/25/2017 12:30 AM, Kenneth Graunke wrote: > > On Thursday, August 24, 2017 4:16:39 AM PDT kevin.rogo...@intel.com wrote: > >> From: Kevin Rogovin <kevin.rogo...@intel.com> > >>

Re: [Mesa-dev] [PATCH] i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.

2017-08-24 Thread Kenneth Graunke
igned-off-by: Kevin Rogovin <kevin.rogo...@intel.com> Nice catch! Thanks for fixing this. Reviewed-by: Kenneth Graunke <kenn...@whitecape.org> Ian requested that I run this through a full CTS run before pushing, so that we actually hit all the new visuals, and make sure 2x/16x works as ex

Re: [Mesa-dev] [PATCH] i965: Simplify MOCS mashing in genX_state_upload.c.

2017-08-24 Thread Kenneth Graunke
On Thursday, August 24, 2017 4:04:26 AM PDT Lionel Landwerlin wrote: > Looks good, but it looks like you could replace an additional one in > upload_push_constant_packets(). That one is a bit weird - it uses 0 on Gen8+. I've wondered about that, actually - the docs claim that you must use 0 -

Re: [Mesa-dev] [PATCH v2] i965: Issue performance warnings when growing the program cache

2017-08-23 Thread Kenneth Graunke
On Wednesday, August 23, 2017 1:58:32 AM PDT Chris Wilson wrote: > Quoting Kenneth Graunke (2017-08-22 21:47:54) > > This involves a bunch of unnecessary copying, a batch flush, and > > state re-emission. > > > --- > > src/mesa/drivers/dri/i965/brw_program_cache.c

[Mesa-dev] [PATCH] i965: Simplify MOCS mashing in genX_state_upload.c.

2017-08-23 Thread Kenneth Graunke
Instead of having a proliferation of generation checks and MOCS values, we can just #define MOCS_ALL to the generation-specific value for "use as many caches as possible" and use that in various places. This should make it easier to change MOCS values, as there are fewer places that need

Re: [Mesa-dev] [PATCH] i965: Only set key->flat_shade if COL0/COL1 are written.

2017-08-23 Thread Kenneth Graunke
On Wednesday, August 23, 2017 12:04:40 PM PDT Ilia Mirkin wrote: > You might consider also including whether the interpolation method was > forced or not. i.e. if you have > > flat varying vec4 gl_Color; > > then it doesn't matter whether shade model is flat or not, it'll be > interpolated as

[Mesa-dev] New #intel-3d IRC channel on Freenode

2017-08-23 Thread Kenneth Graunke
Hello, The Intel Mesa team would like to welcome you to a new public IRC channel on Freenode: #intel-3d. The topic is Mesa development for Intel GPUs, in particular the "i965" OpenGL and "anv" Vulkan drivers. The open source graphics community has grown a lot over the last few years, and as a

[Mesa-dev] [PATCH] i965: Clean up brwNewProgram().

2017-08-22 Thread Kenneth Graunke
All shader stages do the exact same thing, so we don't need the switch statement, or the redundant FS case. I believe these used to be different before Tim eliminated the (e.g.) brw_vertex_program subclasses. --- src/mesa/drivers/dri/i965/brw_program.c | 33 + 1

[Mesa-dev] [PATCH] i965: Only set key->flat_shade if COL0/COL1 are written.

2017-08-22 Thread Kenneth Graunke
This may reduce some recompiles. --- src/mesa/drivers/dri/i965/brw_wm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index c9c45045902..e1555d60c56 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c

[Mesa-dev] [PATCH] i965: Record NOS dependencies for shader programs and only check those.

2017-08-22 Thread Kenneth Graunke
Previously, the state upload code listened to a broad set of dirty flags that corresponded to all possible state dependencies for a shader stage. This is somewhat overkill. For example, if a shader has no textures, there is no need to listen to _NEW_TEXTURE. Although these extra dependencies are
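
The idea of recording per-program dependencies and checking only those can be sketched with a bitmask. The flag names, structure, and helpers below are hypothetical placeholders rather than the driver's real dirty bits; the texture case mirrors the example given in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits; the real driver has many more. */
#define SKETCH_NEW_TEXTURE  (1u << 0)
#define SKETCH_NEW_PROGRAM  (1u << 1)
#define SKETCH_NEW_BUFFERS  (1u << 2)

struct sketch_program {
   uint32_t nos_mask;  /* dirty bits this program actually depends on */
};

/* Record dependencies once, based on what the shader uses. */
static void
sketch_record_nos(struct sketch_program *prog, bool uses_textures)
{
   prog->nos_mask = SKETCH_NEW_PROGRAM;
   if (uses_textures)
      prog->nos_mask |= SKETCH_NEW_TEXTURE;
}

/* At state upload time, only the recorded bits are consulted. */
static bool
sketch_needs_recheck(const struct sketch_program *prog, uint32_t dirty)
{
   return (dirty & prog->nos_mask) != 0;
}
```

A shader without textures then ignores a texture change entirely, while a program change is still noticed.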

[Mesa-dev] [PATCH 1/2] i965: Drop useless gen6_brw_upload_ff_gs_prog() wrapper.

2017-08-22 Thread Kenneth Graunke
gen6...brw? Drop some baklava layers. --- src/mesa/drivers/dri/i965/brw_ff_gs.c | 5 - src/mesa/drivers/dri/i965/brw_ff_gs.h | 1 - src/mesa/drivers/dri/i965/brw_gs.c| 2 +- 3 files changed, 1 insertion(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_ff_gs.c

[Mesa-dev] [PATCH 2/2] i965: Fix state flagging of Gen6 SOL programs.

2017-08-22 Thread Kenneth Graunke
It doesn't seem like the old code could possibly work. 1. brw_gs_state_dirty made us bail unless one of these flags were set: _NEW_TEXTURE, BRW_NEW_GEOMETRY_PROGRAM, BRW_NEW_TRANSFORM_FEEDBACK 2. If there was no geometry program, we called brw_upload_ff_gs_prog() 3. That checked

[Mesa-dev] [PATCH] i965: Drop Gen7+ nonsense from brw_ff_gs.c.

2017-08-22 Thread Kenneth Graunke
brw_ff_gs.c is about using the geometry shader to implement things that the fixed function ought to do, but doesn't on old hardware. Gen7+ does not need this. We should drop the misleading comment about Gen7 not using geometry shaders. --- src/mesa/drivers/dri/i965/brw_ff_gs.c | 7 +++ 1

[Mesa-dev] [PATCH 6/7] i965: Add a brw_wm_prog_data::has_render_target_reads field.

2017-08-22 Thread Kenneth Graunke
State upload code should use prog_data rather than poking at shader_info directly. --- src/intel/compiler/brw_compiler.h| 1 + src/intel/compiler/brw_fs.cpp| 2 ++ src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 6 ++ 3 files changed, 5 insertions(+), 4

[Mesa-dev] [PATCH 3/7] i965: Devirtualize update_renderbuffer_surface.

2017-08-22 Thread Kenneth Graunke
Replace piles of my own boilerplate with 1-2 lines of code. --- src/mesa/drivers/dri/i965/brw_context.c | 4 src/mesa/drivers/dri/i965/brw_context.h | 5 - src/mesa/drivers/dri/i965/brw_state.h| 4 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |

[Mesa-dev] [PATCH 7/7] i965: Stop using wm_prog_data->binding_table.render_target_start.

2017-08-22 Thread Kenneth Graunke
Render target surfaces always start at binding table index 0. This is required for us to use headerless FB writes, which we really want to do. So, we'll never change that. Given that, it's not necessary to look up a wm_prog_data field which we already know contains 0. We can drop the dependency

[Mesa-dev] [PATCH 1/7] i965: Make brw_update_renderbuffer_surface static.

2017-08-22 Thread Kenneth Graunke
Also rename it to gen6_update_renderbuffer_surface, as this is the function for Gen6+. Having functions named "brw_*" and "gen4_*" is confusing...if we're using gens, let's stick with those. --- src/mesa/drivers/dri/i965/brw_state.h| 5 -

[Mesa-dev] [PATCH 4/7] i965: Pass fb into emit_null_surface instead of dimensions.

2017-08-22 Thread Kenneth Graunke
We either want the framebuffer dimensions or 1x1x1. Passing fb and falling back to 1x1x1 lets us shorten some calls. --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 28 ++-- 1 file changed, 12 insertions(+), 16 deletions(-) diff --git
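
The fallback described above can be sketched as a small helper. This assumes the fallback is taken when no framebuffer is passed (a NULL pointer) and that the third dimension comes from a layer count; both are guesses, and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the framebuffer and the resulting size. */
struct sketch_fb {
   unsigned width, height, layers;
};

struct sketch_dims {
   unsigned width, height, depth;
};

/* Use the framebuffer's dimensions when one is given, else 1x1x1. */
static struct sketch_dims
sketch_null_surface_dims(const struct sketch_fb *fb)
{
   struct sketch_dims d = { 1, 1, 1 };
   if (fb) {
      d.width = fb->width;
      d.height = fb->height;
      d.depth = fb->layers;
   }
   return d;
}
```

Callers then pass either a framebuffer or NULL instead of spelling out three dimensions each time.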

[Mesa-dev] [PATCH 2/7] i965: Delete update_renderbuffer_surface flags.

2017-08-22 Thread Kenneth Graunke
We don't need yet another set of flags. The function already has access to both brw and the unit, so it can check brw->draw_aux_buffer_disabled itself in one line of code. The layered flag was only used to assert that Gen4-5 doesn't do layered rendering, which isn't that useful. ---

[Mesa-dev] [PATCH 5/7] i965: Inline brw_update_renderbuffer_surfaces().

2017-08-22 Thread Kenneth Graunke
Less baklava layers. --- src/mesa/drivers/dri/i965/brw_state.h| 5 --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 53 +--- 2 files changed, 20 insertions(+), 38 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state.h

[Mesa-dev] [PATCH v2] i965: Issue performance warnings when growing the program cache

2017-08-22 Thread Kenneth Graunke
This involves a bunch of unnecessary copying, a batch flush, and state re-emission. --- src/mesa/drivers/dri/i965/brw_program_cache.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c b/src/mesa/drivers/dri/i965/brw_program_cache.c index

[Mesa-dev] [PATCH] intel/blorp: Shrink the size of brw_blorp_blit_prog_key.

2017-08-22 Thread Kenneth Graunke
This shrinks the key from 64 bytes to 20 bytes. --- src/intel/blorp/blorp_priv.h | 44 ++-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/src/intel/blorp/blorp_priv.h b/src/intel/blorp/blorp_priv.h index 81bf8c66c66..f96a5e0e5ad 100644 ---

Re: [Mesa-dev] [PATCH] i965/clear: Quantize the depth clear value based on the format

2017-08-21 Thread Kenneth Graunke
This means that > the test basically always fails for anything other than 0.0f and 1.0f. > This caused a slight performance regression in Lightsmark 2008 because > it was using a depth clear value of 0.999 which can't be stored in a > 32-bit float so we were doing unneeded resolves. > >

Re: [Mesa-dev] [PATCH 2/2] i965/bufmgr: s/BO_ALLOC_FOR_RENDER/BO_ALLOC_BUSY/

2017-08-19 Thread Kenneth Graunke
f > you're planning to immediately render to it. If the flag really means > "alloc a busy BO" we should just call it that. Both are: Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

[Mesa-dev] [PATCH 1/2] i965: Issue performance warnings when growing the program cache

2017-08-19 Thread Kenneth Graunke
This involves a bunch of unnecessary copying, a batch flush, and state re-emission. --- src/mesa/drivers/dri/i965/brw_program_cache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c b/src/mesa/drivers/dri/i965/brw_program_cache.c index

[Mesa-dev] [PATCH 2/2] i965: Bump the initial program cache size from 4kB to 16kB.

2017-08-19 Thread Kenneth Graunke
Our initial size of 4kB is way too small to do anything useful, so we end up growing it at least a few times. We may as well start it larger. Some data points: - Dinoshade (from Mesa Demos): hit 8kB. - Chromium 60: hit 16kB after browsing a few things in Google Docs. - GFXBench4 TRex/Manhattan

Re: [Mesa-dev] [PATCH 2/3] i965: Use ISL for emitting null surface states.

2017-08-19 Thread Kenneth Graunke
On Friday, August 18, 2017 10:05:46 PM PDT Jason Ekstrand wrote: > On Thu, Aug 17, 2017 at 4:36 PM, Kenneth Graunke <kenn...@whitecape.org> > wrote: > > > We handle the Sandybridge multisampled 2D surface hack here, rather > > than in ISL, because it requires allocating

Re: [Mesa-dev] [PATCH 1/3] isl: Add a null surface fill function.

2017-08-18 Thread Kenneth Graunke
On Thursday, August 17, 2017 10:26:44 PM PDT Jason Ekstrand wrote: > On August 17, 2017 4:36:42 PM Kenneth Graunke <kenn...@whitecape.org> wrote: > > > ISL already offers functions to fill out most kinds of SURFACE_STATE, > > so why not handle null surfaces too? > >

Re: [Mesa-dev] [PATCH] i965: Stop looking at NewDriverState when emitting 3DSTATE_URB

2017-08-18 Thread Kenneth Graunke
otal URB size will trigger blorp to re-emit as well > because 0 < vs_entry_size. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102289 > Cc: Kenneth Graunke <kenn...@whitecape.org> > Cc: mesa-sta...@lists.freedesktop.org > --- > src/mesa/drivers/dri/i965/g

Re: [Mesa-dev] [PATCH] i965/miptree: Return NONE from texture_aux_usage when fully resolved

2017-08-17 Thread Kenneth Graunke
+ if (!intel_miptree_has_color_unresolved(mt, 0, INTEL_REMAINING_LEVELS, > + 0, INTEL_REMAINING_LEVELS)) This should be: 0, INTEL_REMAINING_LAYERS)) (they're the same so it works either way, but, should use th

Re: [Mesa-dev] [PATCH] mesa: count uniform against storage when its bindless

2017-08-17 Thread Kenneth Graunke
On Thursday, August 17, 2017 4:41:52 PM PDT Timothy Arceri wrote: > On 18/08/17 08:30, Kenneth Graunke wrote: > > On Tuesday, August 15, 2017 3:42:29 AM PDT Timothy Arceri wrote: > >> Gallium drivers use this code path so we need to account for > >> bindless after a

Re: [Mesa-dev] [PATCH 1/9] ralloc: Allow reparenting to a NULL context

2017-08-17 Thread Kenneth Graunke
On Thursday, August 17, 2017 4:54:05 PM PDT Timothy Arceri wrote: > > On 18/08/17 09:05, Kenneth Graunke wrote: > > On Thursday, August 17, 2017 10:22:15 AM PDT Jason Ekstrand wrote: > >> --- > >> src/util/ralloc.c | 2 +- > >> 1 file changed, 1 insertion

[Mesa-dev] [PATCH 1/3] isl: Add a null surface fill function.

2017-08-17 Thread Kenneth Graunke
ISL already offers functions to fill out most kinds of SURFACE_STATE, so why not handle null surfaces too? Null surfaces are simple, so we can just take the dimensions, rather than an entire fill structure. --- src/intel/isl/isl.c | 7 +++ src/intel/isl/isl.h |

[Mesa-dev] [PATCH 2/3] i965: Use ISL for emitting null surface states.

2017-08-17 Thread Kenneth Graunke
We handle the Sandybridge multisampled 2D surface hack here, rather than in ISL, because it requires allocating a BO, and is kind of messy. --- src/mesa/drivers/dri/i965/Makefile.sources| 2 - src/mesa/drivers/dri/i965/brw_context.c | 4 +-

[Mesa-dev] [PATCH 3/3] anv: Use ISL for emitting null surface states.

2017-08-17 Thread Kenneth Graunke
--- src/intel/vulkan/genX_cmd_buffer.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 280efcc2245..c5735b27e02 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++

Re: [Mesa-dev] [PATCH 1/9] ralloc: Allow reparenting to a NULL context

2017-08-17 Thread Kenneth Graunke
+++ b/src/util/ralloc.c > @@ -285,7 +285,7 @@ ralloc_steal(const void *new_ctx, void *ptr) >return; > > info = get_header(ptr); > - parent = get_header(new_ctx); > + parent = new_ctx ? get_header(new_ctx) : NULL; > > unlink_block(info); > >

Re: [Mesa-dev] [PATCH] mesa: count uniform against storage when its bindless

2017-08-17 Thread Kenneth Graunke
On Tuesday, August 15, 2017 3:42:29 AM PDT Timothy Arceri wrote: > Gallium drivers use this code path so we need to account for > bindless after all. Why do Gallium drivers use ir_to_mesa? That seems like a misfeature. i965 stopped using it years ago. --Ken

[Mesa-dev] [PATCH 1/2] i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit.

2017-08-17 Thread Kenneth Graunke
When changing fast clear colors, we need to emit new SURFACE_STATE with the updated color at the next draw call. Most things work today because the atoms that handle SURFACE_STATE for images (mutable images, textures, render targets) also listen to BRW_NEW_BLORP, causing us to re-emit these on

[Mesa-dev] [PATCH 2/2] i965: Drop BRW_NEW_BLORP from SURFACE_STATE setup code.

2017-08-17 Thread Kenneth Graunke
BLORP invalidates the binding tables, but it doesn't destroy any of the existing SURFACE_STATE entries in the statebuffer. We can reuse those. --- src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 4 src/mesa/drivers/dri/i965/brw_tcs_surface_state.c | 4

Re: [Mesa-dev] [PATCH 2/2] intel/isl: Replace switch statements of doom with a macro

2017-08-17 Thread Kenneth Graunke
case 4: > - case 5: > - /* Gen 4-5 are all the same when it comes to buffer surfaces */ This will make us actually use the gen4 function. That seems good. Thanks for removing this handwritten stuff. Both are: Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-12 Thread Kenneth Graunke
On Saturday, August 12, 2017 1:25:25 AM PDT Mauro Rossi wrote: > On Aug 11, 2017 5:14 PM, "Kenneth Graunke" <kenn...@whitecape.org> wrote: > > On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote: > > I do wonder what the target machine is (I haven't s

Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Kenneth Graunke
On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote: > I do wonder what the target machine is (I haven't seen one that would > not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling > back to memcpy makes perfect sense without USE_SSE4_1; > > Reviewed-by: Tapani Pälli

[Mesa-dev] [PATCH 1/2] i965: Clean up intel_batchbuffer_init().

2017-08-10 Thread Kenneth Graunke
Passing screen lets us get the kernel features, devinfo, and bufmgr, without needing container_of. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102062 Cc: Mauro Rossi Cc: Tapani Pälli --- src/mesa/drivers/dri/i965/brw_context.c |

[Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-10 Thread Kenneth Graunke
This should hopefully fix build issues on 32-bit Android-x86. Cc: Mauro Rossi Cc: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 --- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ 1 file changed, 2
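
The kind of guard the subject line describes might look like the sketch below. It is an assumption-laden illustration rather than the actual patch: `read_back()`, `streaming_copy()`, and the `want_streaming` parameter are hypothetical, and only the `USE_SSE41` macro name and the `memcpy` fallback come from the thread. When the macro is not defined, the SSE4.1 path is not compiled at all and the plain `memcpy` is always used.

```c
#include <stddef.h>
#include <string.h>

#ifdef USE_SSE41
#include <stdint.h>
#include <smmintrin.h>

/* Hypothetical streaming copy: non-temporal loads for the 16-byte
 * aligned part of the source, plain memcpy for the unaligned edges. */
static void streaming_copy(void *dst, const void *src, size_t len)
{
   char *d = dst;
   const char *s = src;

   /* Copy leading bytes until the source pointer is 16-byte aligned. */
   size_t head = (16 - ((uintptr_t)s & 15)) & 15;
   if (head > len)
      head = len;
   memcpy(d, s, head);
   d += head;
   s += head;
   len -= head;

   while (len >= 16) {
      __m128i v = _mm_stream_load_si128((__m128i *)s);
      _mm_storeu_si128((__m128i *)d, v);
      d += 16;
      s += 16;
      len -= 16;
   }

   /* Copy whatever is left after the last full 16-byte chunk. */
   memcpy(d, s, len);
}
#endif

/* Hypothetical read-back helper: takes the streaming path only when it
 * was compiled in and requested, otherwise falls back to memcpy. */
void read_back(void *dst, const void *src, size_t len, int want_streaming)
{
#ifdef USE_SSE41
   if (want_streaming) {
      streaming_copy(dst, src, len);
      return;
   }
#else
   (void)want_streaming;
#endif
   memcpy(dst, src, len);
}
```

Building with `USE_SSE41` defined also requires SSE4.1 code generation to be enabled (for example `-msse4.1` with GCC or Clang); without the macro the file needs no SSE support.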

Re: [Mesa-dev] [PATCH 6/8] glsl: stop cloning builtin fuctions _mesa_glsl_find_builtin_function()

2017-08-10 Thread Kenneth Graunke
shaders on radeonsi (which take 5min+ on > some machines). Looking just at the GLSL IR compiler the speed up > is ~40%. > > Cc: Kenneth Graunke <kenn...@whitecape.org> > Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com> > > Tested-by: Dieter Nützel <die..

Re: [Mesa-dev] [PATCH v2 5/8] glsl: pass mem_ctx to constant_expression_value(...) and friends

2017-08-10 Thread Kenneth Graunke
performance. This change along with the following patch > helps fix that performance regression. > > Other advantages are that we reduce the number of calls to > ralloc_parent(), and for loop unrolling we free constants after > they are used rather than leaving them hanging around.

Re: [Mesa-dev] [PATCH] isl: Validate row pitch of stencil surfaces.

2017-08-10 Thread Kenneth Graunke
On Wednesday, August 9, 2017 1:20:53 PM PDT Jason Ekstrand wrote: > On Wed, Aug 9, 2017 at 1:09 PM, Kenneth Graunke <kenn...@whitecape.org> > wrote: > > > Also, silence an obnoxious finishme that started occurring for all > > GL applications which use stencil af

Re: [Mesa-dev] [PATCH 5/8] glsl: clone builtin function constants

2017-08-09 Thread Kenneth Graunke
nd Divided shaders with shader db, and then > compiling them again after deleting mesa's shader cache > index file. This will cause optimisations to never be performed > on the IR and which presumably caused this issue to be hidden > under normal circumstances. > > Cc: Kenneth Grau

[Mesa-dev] [PATCH] isl: Validate row pitch of stencil surfaces.

2017-08-09 Thread Kenneth Graunke
Also, silence an obnoxious finishme that started occurring for all GL applications which use stencil after the i965 ISL conversion. --- src/intel/isl/isl.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c index

Re: [Mesa-dev] [PATCH] i965: Drop uniform buffer alignment back to 16 on Gen6.

2017-08-08 Thread Kenneth Graunke
On Tuesday, August 8, 2017 12:26:41 AM PDT Kenneth Graunke wrote: > We don't push UBOs on Gen6 currently, so there's no need for the > larger alignment value. > > Cc: "17.2" <mesa-sta...@lists.freedesktop.org> > --- > src/mesa/drivers/dri/i965/brw_context.c | 2

[Mesa-dev] [PATCH] i965: Drop uniform buffer alignment back to 16 on Gen6.

2017-08-08 Thread Kenneth Graunke
We don't push UBOs on Gen6 currently, so there's no need for the larger alignment value. Cc: "17.2" --- src/mesa/drivers/dri/i965/brw_context.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c

[Mesa-dev] [PATCH 1/3] i965: Get rid of KSP_ro

2017-08-07 Thread Kenneth Graunke
The GPU reads the shader kernel from the program cache BO. It never writes it, so using a read-write BO reference makes no sense. Just make KSP read-only, and drop KSP_ro. --- src/mesa/drivers/dri/i965/genX_state_upload.c | 19 --- 1 file changed, 4 insertions(+), 15

[Mesa-dev] [PATCH 3/3] i965: Don't use ggtt_bo for Gen8+ streamout offset buffer.

2017-08-07 Thread Kenneth Graunke
RELOC_NEEDS_GGTT is only meaningful on Sandybridge - it's skipped on other generations - so this has no purpose. Just use rw_bo(). --- src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c

[Mesa-dev] [PATCH 2/3] i965: Simplify *_bo() helpers.

2017-08-07 Thread Kenneth Graunke
With the reloc domains gone, most of these are basically the same, and the names don't make much sense anymore. Simplify them to ro_bo(), rw_bo(), and ggtt_bo(). --- src/mesa/drivers/dri/i965/genX_state_upload.c | 72 ++- 1 file changed, 25 insertions(+), 47 deletions(-)

Re: [Mesa-dev] [PATCH] i965/miptree: Set supports_fast_clear = false in make_shareable

2017-08-06 Thread Kenneth Graunke
7 @@ intel_miptree_make_shareable(struct brw_context *brw, > } > > mt->aux_usage = ISL_AUX_USAGE_NONE; > + mt->supports_fast_clear = false; > } > > > Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

Re: [Mesa-dev] No reloc for i965

2017-08-04 Thread Kenneth Graunke
On Friday, August 4, 2017 12:22:19 PM PDT Chris Wilson wrote: > Quoting Kenneth Graunke (2017-08-04 19:47:14) > > On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote: > > > Patch reordering from last time so that the cosmetic tweaks are done first > > > and

Re: [Mesa-dev] No reloc for i965

2017-08-04 Thread Kenneth Graunke
On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote: > Patch reordering from last time so that the cosmetic tweaks are done first > and out of the way. Kenneth has reviewed the core NO_RELOC patches, so > hopefully it doesn't look too bad and we can land at least as far as > there (patch

Re: [Mesa-dev] [PATCH] intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.

2017-08-03 Thread Kenneth Graunke
On Wednesday, August 2, 2017 8:50:54 PM PDT Dave Airlie wrote: > If dual instanced compile fails (as seems to happen with virgl a I think you mean "dual object" - it tries that first, then falls back to either "dual instance" or "single" mode. Thanks for f

Re: [Mesa-dev] [PATCH] i965/blit: Remember to include miptree buffer offset in relocs

2017-08-02 Thread Kenneth Graunke
On Monday, July 31, 2017 2:56:15 AM PDT Chris Wilson wrote: > Remember to add the offset to the start of the buffer in the relocation > or else we write 0xff into random bytes elsewhere. > --- > src/mesa/drivers/dri/i965/intel_blit.c | 4 ++-- >
