We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.
---
src/mesa/drivers/dri/i965/genX_blorp_exec.c | 16 ++--
1 file changed, 2 insertions(+), 14 deletions(-)
diff --git
Calling exit(1) when execbuffer fails is not necessarily safe.
When running Piglit tests with a Mesa that submitted invalid commands
to the GPU, I discovered the following problem:
1. do_flush_locked fails and calls exit(1)...invoking atexit handlers.
2. Piglit tries to clean up after itself, and
Now that we can grow the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 11 +++
src/mesa/drivers/dri/i965/intel_batchbuffer.h | 26 --
2
The batch buffer and state buffer code is fairly tied together,
and having it in one .c file will make refactoring easier.
Also, drop some commentary above brw_state_batch. The "aperture
checking performance hacks" are long since gone, so that paragraph
makes little sense at this point.
---
We'll need to read from both buffers when decoding state.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 102 +-
1 file changed, 52 insertions(+), 50 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).
With this change, we also need to update our "are we near the end?"
I'm planning on splitting batch and state into separate buffers, at
which point we'll need two relocation lists. In preparation for that,
this patch refactors the relocation stuff into a structure we can
replicate...which looks a lot like anv_reloc_list.
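A minimal sketch of what such a replicable structure might look like, assuming one growable array of relocation records per buffer. The names sketch_reloc, sketch_reloc_list, sketch_reloc_list_init and sketch_reloc_list_add are hypothetical and are not the names used in the patch; the real entries would be kernel relocation records.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one kernel relocation record. */
struct sketch_reloc {
   uint32_t offset;          /* where in the buffer the address is written */
   uint32_t target_handle;   /* which buffer object the address points at */
   uint64_t presumed_offset; /* the address we assumed when writing it */
};

/* All relocation bookkeeping for ONE buffer, so it can be instantiated
 * once for the batch and once for the state buffer.
 */
struct sketch_reloc_list {
   struct sketch_reloc *relocs;
   size_t count;
   size_t capacity;
};

/* Returns 1 on success, 0 if the initial allocation failed. */
static int
sketch_reloc_list_init(struct sketch_reloc_list *list, size_t capacity)
{
   assert(capacity > 0);
   list->relocs = malloc(capacity * sizeof(*list->relocs));
   list->count = 0;
   list->capacity = list->relocs ? capacity : 0;
   return list->relocs != NULL;
}

/* Append one entry, doubling the array when it is full.
 * Returns 1 on success, 0 if we ran out of memory.
 */
static int
sketch_reloc_list_add(struct sketch_reloc_list *list, uint32_t offset,
                      uint32_t target_handle, uint64_t presumed_offset)
{
   if (list->count == list->capacity) {
      size_t new_capacity = list->capacity ? list->capacity * 2 : 4;
      struct sketch_reloc *grown =
         realloc(list->relocs, new_capacity * sizeof(*grown));
      if (!grown)
         return 0;
      list->relocs = grown;
      list->capacity = new_capacity;
   }

   list->relocs[list->count++] = (struct sketch_reloc) {
      .offset = offset,
      .target_handle = target_handle,
      .presumed_offset = presumed_offset,
   };
   return 1;
}
```

With the bookkeeping in one structure, having two lists is just a matter of declaring two instances.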
---
Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer. We then
flushed before the two met in the middle.
Meeting in the middle is fatal, so you
brw_batch_reloc emits a relocation from the batchbuffer to elsewhere.
brw_state_reloc emits a relocation from the statebuffer to elsewhere.
For now, they do the same thing, but when we actually split the two
buffers, we'll change brw_state_reloc to use the state buffer.
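A rough sketch of the split described above, assuming both entry points forward to a shared helper. Everything prefixed sketch_ (and emit_reloc) is hypothetical; the real brw_batch_reloc and brw_state_reloc take different arguments.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_RELOCS 16

/* Hypothetical, fixed-size relocation list: only records offsets. */
struct sketch_relocs {
   uint32_t offsets[SKETCH_MAX_RELOCS];
   int count;
};

struct sketch_batch {
   struct sketch_relocs batch_relocs;
   struct sketch_relocs state_relocs; /* unused until the buffers split */
};

/* Shared helper: record the relocation and return the address to write. */
static uint64_t
emit_reloc(struct sketch_relocs *list, uint32_t offset, uint64_t target)
{
   assert(list->count < SKETCH_MAX_RELOCS);
   list->offsets[list->count++] = offset;
   return target;
}

/* Relocation from the batchbuffer to elsewhere. */
static uint64_t
sketch_batch_reloc(struct sketch_batch *batch, uint32_t offset,
                   uint64_t target)
{
   return emit_reloc(&batch->batch_relocs, offset, target);
}

/* Relocation from the statebuffer to elsewhere.  For now this does the
 * same thing as sketch_batch_reloc; once the buffers are split, only
 * this function would change, to pass &batch->state_relocs instead.
 */
static uint64_t
sketch_state_reloc(struct sketch_batch *batch, uint32_t offset,
                   uint64_t target)
{
   return emit_reloc(&batch->batch_relocs, offset, target);
}
```

Callers pick the entry point that matches where they are writing, so the later split does not have to touch them.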
---
We don't need to special case the batch - when we add the batch to the
validation list, we can simply increase the refcount to 2, and when we
make a new batch, we'll drop it back down to 1 (when unreferencing all
buffers in the validation list). The final reference is still held by
brw->batch.bo,
Previously, we would just assert fail and die in this case. The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.
Growing is fairly straightforward:
1. Allocate a new larger BO.
2. memcpy the
This makes the assertion safe against batchbuffers growing.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index
Prior to the previous patch, we would pwrite the batchbuffer contents,
and wanted to skip the execbuffer if that failed. Now, we write things
directly to the map, so we don't need this check.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 40 ---
1 file changed, 18
This only made sense for the shadow copy of the batch.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index
When a batch is flushed, INTEL_DEBUG=bat prints a message indicating
which part of the code triggered the flush, and some statistics about
the batch/state buffer utilization.
It also decodes the batchbuffer in debug builds...which is so much
output that it drowns out the utilization messages,
Now that we have write-combining maps, our writes to the batch should be
reasonably fast. (In the past, we only had uncached maps, which were
slow...so we kept a CPU-side shadow copy for write combining purposes.)
There are a few places that still read back a DWord or so from the
batch, which
Hello,
This series separates GPU commands and indirect state into two distinct
buffers - the batch buffer and the state buffer. It then adds support
for growing the batch/state buffers, in case we need more space but are
in a "critical section" where we can't safely "wrap" (flush) the batch.
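To illustrate the grow-versus-wrap decision, here is a minimal CPU-side sketch. It assumes a plain malloc'd buffer in place of a BO, and the names sketch_buffer, sketch_flush, sketch_grow and sketch_require_space are hypothetical; a real flush would submit the batch to the kernel rather than just resetting a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_buffer {
   uint8_t *map;
   size_t used;
   size_t size;
   bool no_wrap;    /* true while inside a "critical section" */
   int flush_count; /* how many times we wrapped (flushed) */
};

/* Stand-in for submitting the batch: just start over at the beginning. */
static void
sketch_flush(struct sketch_buffer *buf)
{
   buf->used = 0;
   buf->flush_count++;
}

/* Allocate a larger buffer, copy the existing contents, drop the old one. */
static bool
sketch_grow(struct sketch_buffer *buf, size_t needed)
{
   size_t new_size = buf->size ? buf->size : 64;
   while (new_size < buf->used + needed)
      new_size *= 2;

   uint8_t *new_map = malloc(new_size);
   if (!new_map)
      return false;

   if (buf->used > 0)
      memcpy(new_map, buf->map, buf->used);
   free(buf->map);
   buf->map = new_map;
   buf->size = new_size;
   return true;
}

/* Make sure `needed` more bytes fit: wrap if allowed, otherwise grow. */
static bool
sketch_require_space(struct sketch_buffer *buf, size_t needed)
{
   if (buf->used + needed <= buf->size)
      return true;

   if (buf->no_wrap)
      return sketch_grow(buf, needed);

   sketch_flush(buf);
   return needed <= buf->size || sketch_grow(buf, needed);
}
```

Outside a critical section, running out of space still results in a flush; growing is the fallback for when flushing is not safe.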
This assertion prevents you from doing intel_batchbuffer_require_space
with a size so huge it won't fit in the batchbuffer. This doesn't seem
like a common mistake, and I've never seen the assert be useful.
Soon, I hope to have batches grow, at which point this won't make sense.
---
One less layer of baklava.
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 17 +
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index b15829fb57c..4eb1a79bcd4 100644
---
src/intel/genxml/gen4.xml | 2 +-
src/intel/genxml/gen45.xml| 2 +-
src/intel/genxml/gen5.xml | 2 +-
src/mesa/drivers/dri/i965/genX_state_upload.c | 10 --
4 files changed, 7 insertions(+), 9 deletions(-)
diff --git
We can't perf_debug without a context.
Cc: mesa-sta...@lists.freedesktop.org
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index
It's only used in one file.
---
src/mesa/drivers/dri/i965/brw_context.h | 1 -
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 ++
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h
b/src/mesa/drivers/dri/i965/brw_context.h
index
This is dead code and hasn't been used in a long time.
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 2 +-
src/mesa/drivers/dri/i965/brw_bufmgr.h | 3 +--
src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
3 files changed, 3 insertions(+), 4 deletions(-)
diff --git
intel_batchbuffer_reset calls add_exec_bo on the batch right away,
which adds in the batch BO size.
Fixes: 29ba502a4e28 ("i965: Use I915_EXEC_BATCH_FIRST when available.")
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Here are some misc.
We can drop the meaningless "64" suffix - libdrm_intel originally had
an "offset" field that was an "unsigned long" which was the wrong size,
and we couldn't remove/alter that field without breaking ABI, so we had
to add a uint64_t "offset64" field.
"presumed_offset" is a bit more descriptive
It's used in exactly one place these days, and not much simpler than
just calling intel_batchbuffer_data directly.
---
src/mesa/drivers/dri/i965/brw_state.h | 3 ---
src/mesa/drivers/dri/i965/brw_urb.c | 2 +-
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git
On Wednesday, August 30, 2017 3:57:48 AM PDT kevin.rogo...@intel.com wrote:
> From: Kevin Rogovin
>
> For Gen8, add 2xMSAA. For Gen9, add 2xMSAA and 16xMSAA.
> Special thanks to Eero Tamminen for reporting rasterizer
> numbers being twice what it should be for 2xMSAA
On Tuesday, August 1, 2017 3:48:30 PM PDT Jason Ekstrand wrote:
> This makes ISL now ignore the MOCS data provided by the caller and just
> set it based on surface usage.
> ---
> src/intel/isl/isl_emit_depth_stencil.c | 12
> src/intel/isl/isl_genX_mocs.h | 53
>
If there was a vertex buffer usage
bit and we could handle that there, I think that would be nice, even if it
doesn't otherwise seem relevant to ISL. *shrug*
Having ISL do this itself rather than passing in values makes sense.
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
signature.asc
Anvil already had code to copy between two buffer objects in the most
efficient way possible, using large bpp copies, then smaller bpp copies.
This patch moves that logic into BLORP as blorp_buffer_copy(), so we
can reuse it in i965 as well.
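The "large copies first, then smaller copies" idea can be sketched on the CPU as follows. This is only an illustration under stated assumptions: it ignores alignment, uses memcpy where the real code issues GPU copies, and the function name sketch_buffer_copy and the 16-byte maximum block size are hypothetical choices, not taken from BLORP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `size` bytes using the widest blocks first (16 bytes), then
 * progressively narrower ones for whatever tail remains.
 *
 * Returns the number of passes, i.e. how many distinct block sizes
 * were actually needed.
 */
static int
sketch_buffer_copy(uint8_t *dst, const uint8_t *src, size_t size)
{
   int passes = 0;
   size_t done = 0;

   for (size_t block = 16; block >= 1; block /= 2) {
      size_t blocks = (size - done) / block;
      if (blocks == 0)
         continue;

      /* One pass: as many whole blocks of this width as still fit. */
      memcpy(dst + done, src + done, blocks * block);
      done += blocks * block;
      passes++;
   }

   assert(done == size);
   return passes;
}
```

For a 37-byte copy this takes three passes: two 16-byte blocks, one 4-byte block, and one single byte.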
---
src/intel/blorp/blorp.h | 6 ++
Although we're phasing out brw_emit_mi_flush(), we still use it in some
places in order to "flush everything". In a number of those places, we
write data to a buffer that we may then bind as an image surface, SSBO,
or atomic buffer. Those usages require us to flush the data cache.
---
When we blit data into a buffer object, we may need to invalidate any
caches that might contain stale data, so the new data becomes visible.
For example, if the buffer object is bound as a vertex buffer, we need
to invalidate the vertex fetch cache.
While this flushing was missing, it usually
Gen4-6 can only handle surfaces up to 8192. Only Gen7+ can do 16384.
---
src/intel/blorp/blorp_blit.c | 19 ++-
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 998fd9b6d39..d3b51a18161 100644
---
Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).
---
src/mesa/drivers/dri/i965/intel_buffer_objects.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
This exposes the new blorp_buffer_copy() functionality to i965.
It should be a drop-in replacement for intel_emit_linear_blit()
(other than the arguments being backwards, for consistency with BLORP).
---
src/mesa/drivers/dri/i965/brw_blorp.c | 18 ++
i915_translate_instruction(struct i915_fp_compile *p,
> const struct i915_full_instruction *inst,
> struct i915_fragment_shader *fs)
> {
> - uint writemask;
> uint src0, src1, src2, flags;
> uint tmp = 0;
>
>
Patch 1 is
On Saturday, August 26, 2017 6:40:14 AM PDT Qiang Yu wrote:
> Hi guys,
>
> When working on lima gp compiler, I come across two problems about
> inserting extra uniform
> and instructions in nir for the driver and don't know where's the
> right place to do it. So I'd like
> to hear your opinion
& var->data.mode ==
> ir_var_shader_in && var->type->without_array()->is_double())
>this->result.is_double_vertex_input = true;
> if (!native_integers)
> this->result.type = GLSL_TYPE_FLOAT;
>
Makes sense to me - structures don't exist
We were using brw->gen, brw->is_haswell, and devinfo->gen in a few
places, when we could just use GEN_GEN and GEN_IS_HASWELL, which are
evaluated at compile time.
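A small sketch of the difference, assuming the file is compiled once per hardware generation with the generation number supplied as a macro. SKETCH_GEN, sketch_devinfo and both functions are hypothetical stand-ins; the 8192/16384 limits are borrowed from the surface-size patch elsewhere in this listing, purely as an example of a generation-dependent value.

```c
#include <assert.h>

/* Assumption: the build system defines this differently for each
 * per-generation compile of the file.  Default to 8 for this sketch.
 */
#ifndef SKETCH_GEN
#define SKETCH_GEN 8
#endif

struct sketch_devinfo {
   int gen;
};

/* Runtime check: the comparison happens on every call. */
static int
sketch_max_surface_dim_runtime(const struct sketch_devinfo *devinfo)
{
   return devinfo->gen >= 7 ? 16384 : 8192;
}

/* Compile-time check: SKETCH_GEN is a constant, so the compiler can
 * fold the comparison and drop the untaken branch entirely.
 */
static int
sketch_max_surface_dim_compile_time(void)
{
   return SKETCH_GEN >= 7 ? 16384 : 8192;
}
```

Both functions return the same value for matching generations; the second simply needs no device lookup to do so.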
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git
esolve(cmd_buffer,
> src_iview,
>dest_iview,
>
Yeah! Down with the 3DSTATE_DRAWING_RECTANGLE comments, and bring on the
S_028B50_DONUT_SPLIT and V_028B6C_DISTRIBUTION_MODE_DONUTS comments! :)
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
it,
> GL_ARB_texture_env_dot3_bit,
> + GL_ARB_texture_filter_anisotropic_bit,
> GL_ARB_texture_mirrored_repeat_bit,
> GL_ARB_texture_non_power_of_two_bit,
> GL_ARB_texture_rectangle_bit,
Hi Adam,
I've never seen new GL extensions added to the GLX code like t
On Thursday, August 24, 2017 9:56:14 PM PDT Tapani Pälli wrote:
> Hi;
>
> On 08/25/2017 12:30 AM, Kenneth Graunke wrote:
> > On Thursday, August 24, 2017 4:16:39 AM PDT kevin.rogo...@intel.com wrote:
> >> From: Kevin Rogovin <kevin.rogo...@intel.com>
> >>
Signed-off-by: Kevin Rogovin <kevin.rogo...@intel.com>
Nice catch! Thanks for fixing this.
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
Ian requested that I run this through a full CTS run before pushing, so
that we actually hit all the new visuals, and make sure 2x/16x works as
ex
On Thursday, August 24, 2017 4:04:26 AM PDT Lionel Landwerlin wrote:
> Looks good, but it looks like you could replace an additional one in
> upload_push_constant_packets().
That one is a bit weird - it uses 0 on Gen8+. I've wondered about that,
actually - the docs claim that you must use 0 -
On Wednesday, August 23, 2017 1:58:32 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-08-22 21:47:54)
> > This involves a bunch of unnecessary copying, a batch flush, and
> > state re-emission.
>
> > ---
> > src/mesa/drivers/dri/i965/brw_program_cache.c
Instead of having a proliferation of generation checks and MOCS values,
we can just #define MOCS_ALL to the generation-specific value for "use
as many caches as possible" and use that in various places.
This should make it easier to change MOCS values, as there are fewer
places that need
On Wednesday, August 23, 2017 12:04:40 PM PDT Ilia Mirkin wrote:
> You might consider also including whether the interpolation method was
> forced or not. i.e. if you have
>
> flat varying vec4 gl_Color;
>
> then it doesn't matter whether shade model is flat or not, it'll be
> interpolated as
Hello,
The Intel Mesa team would like to welcome you to a new public IRC channel
on Freenode: #intel-3d. The topic is Mesa development for Intel GPUs, in
particular the "i965" OpenGL and "anv" Vulkan drivers.
The open source graphics community has grown a lot over the last few
years, and as a
All shader stages do the exact same thing, so we don't need the switch
statement, or the redundant FS case. I believe these used to be
different before Tim eliminated the (e.g.) brw_vertex_program
subclasses.
---
src/mesa/drivers/dri/i965/brw_program.c | 33 +
1
This may reduce some recompiles.
---
src/mesa/drivers/dri/i965/brw_wm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c
b/src/mesa/drivers/dri/i965/brw_wm.c
index c9c45045902..e1555d60c56 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
Previously, the state upload code listened to a broad set of dirty flags
that corresponded to all possible state dependencies for a shader stage.
This is somewhat overkill. For example, if a shader has no textures,
there is no need to listen to _NEW_TEXTURE. Although these extra
dependencies are
gen6...brw? Drop some baklava layers.
---
src/mesa/drivers/dri/i965/brw_ff_gs.c | 5 -
src/mesa/drivers/dri/i965/brw_ff_gs.h | 1 -
src/mesa/drivers/dri/i965/brw_gs.c| 2 +-
3 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ff_gs.c
It doesn't seem like the old code could possibly work.
1. brw_gs_state_dirty made us bail unless one of these flags was set:
_NEW_TEXTURE, BRW_NEW_GEOMETRY_PROGRAM, BRW_NEW_TRANSFORM_FEEDBACK
2. If there was no geometry program, we called brw_upload_ff_gs_prog().
3. That checked
brw_ff_gs.c is about using the geometry shader to implement things
that the fixed function ought to do, but doesn't on old hardware.
Gen7+ does not need this. We should drop the misleading comment
about Gen7 not using geometry shaders.
---
src/mesa/drivers/dri/i965/brw_ff_gs.c | 7 +++
1
State upload code should use prog_data rather than poking at shader_info
directly.
---
src/intel/compiler/brw_compiler.h| 1 +
src/intel/compiler/brw_fs.cpp| 2 ++
src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 6 ++
3 files changed, 5 insertions(+), 4
Replace piles of my own boilerplate with 1-2 lines of code.
---
src/mesa/drivers/dri/i965/brw_context.c | 4
src/mesa/drivers/dri/i965/brw_context.h | 5 -
src/mesa/drivers/dri/i965/brw_state.h| 4
src/mesa/drivers/dri/i965/brw_wm_surface_state.c |
Render target surfaces always start at binding table index 0.
This is required for us to use headerless FB writes, which we
really want to do. So, we'll never change that.
Given that, it's not necessary to look up a wm_prog_data field
which we already know contains 0. We can drop the dependency
Also rename it to gen6_update_renderbuffer_surface, as this is the
function for Gen6+. Having functions named "brw_*" and "gen4_*"
is confusing...if we're using gens, let's stick with those.
---
src/mesa/drivers/dri/i965/brw_state.h| 5 -
We either want the framebuffer dimensions or 1x1x1. Passing fb and
falling back to 1x1x1 lets us shorten some calls.
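A minimal sketch of that calling convention, assuming a nullable framebuffer pointer. The types sketch_fb and sketch_dims and the function sketch_null_surface_dims are hypothetical and only illustrate the fallback.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_fb {
   unsigned width, height, layers;
};

struct sketch_dims {
   unsigned width, height, depth;
};

/* Use the framebuffer's dimensions when one is supplied; otherwise fall
 * back to 1x1x1 so callers do not have to spell that out themselves.
 */
static struct sketch_dims
sketch_null_surface_dims(const struct sketch_fb *fb)
{
   if (fb == NULL)
      return (struct sketch_dims) { 1, 1, 1 };

   return (struct sketch_dims) { fb->width, fb->height, fb->layers };
}
```

Callers that have no framebuffer simply pass NULL instead of three explicit ones.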
---
src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 28 ++--
1 file changed, 12 insertions(+), 16 deletions(-)
diff --git
We don't need yet another set of flags. The function already has access
to both brw and the unit, so it can check brw->draw_aux_buffer_disabled
itself in one line of code. The layered flag was only used to assert
that Gen4-5 doesn't do layered rendering, which isn't that useful.
---
Fewer baklava layers.
---
src/mesa/drivers/dri/i965/brw_state.h| 5 ---
src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 53 +---
2 files changed, 20 insertions(+), 38 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_state.h
This involves a bunch of unnecessary copying, a batch flush, and
state re-emission.
---
src/mesa/drivers/dri/i965/brw_program_cache.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c
b/src/mesa/drivers/dri/i965/brw_program_cache.c
index
This shrinks the key from 64 bytes to 20 bytes.
---
src/intel/blorp/blorp_priv.h | 44 ++--
1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/src/intel/blorp/blorp_priv.h b/src/intel/blorp/blorp_priv.h
index 81bf8c66c66..f96a5e0e5ad 100644
---
This means that
> the test basically always fails for anything other than 0.0f and 1.0f.
> This caused a slight performance regression in Lightsmark 2008 because
> it was using a depth clear value of 0.999 which can't be stored in a
> 32-bit float so we were doing unneeded resolves.
>
>
f
> you're planning to immediately render to it. If the flag really means
> "alloc a busy BO" we should just call it that.
Both are:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
This involves a bunch of unnecessary copying, a batch flush, and
state re-emission.
---
src/mesa/drivers/dri/i965/brw_program_cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c
b/src/mesa/drivers/dri/i965/brw_program_cache.c
index
Our initial size of 4kB is way too small to do anything useful, so we
end up growing it at least a few times. We may as well start it larger.
Some data points:
- Dinoshade (from Mesa Demos): hit 8kB.
- Chromium 60: hit 16kB after browsing a few things in Google Docs.
- GFXBench4 TRex/Manhattan
On Friday, August 18, 2017 10:05:46 PM PDT Jason Ekstrand wrote:
> On Thu, Aug 17, 2017 at 4:36 PM, Kenneth Graunke <kenn...@whitecape.org>
> wrote:
>
> > We handle the Sandybridge multisampled 2D surface hack here, rather
> > than in ISL, because it requires allocating
On Thursday, August 17, 2017 10:26:44 PM PDT Jason Ekstrand wrote:
> On August 17, 2017 4:36:42 PM Kenneth Graunke <kenn...@whitecape.org> wrote:
>
> > ISL already offers functions to fill out most kinds of SURFACE_STATE,
> > so why not handle null surfaces too?
> >
total URB size will trigger blorp to re-emit as well
> because 0 < vs_entry_size.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102289
> Cc: Kenneth Graunke <kenn...@whitecape.org>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
> src/mesa/drivers/dri/i965/g
+ if (!intel_miptree_has_color_unresolved(mt, 0, INTEL_REMAINING_LEVELS,
> + 0, INTEL_REMAINING_LEVELS))
This should be: 0, INTEL_REMAINING_LAYERS))
(they're the same so it works either way, but, should use th
On Thursday, August 17, 2017 4:41:52 PM PDT Timothy Arceri wrote:
> On 18/08/17 08:30, Kenneth Graunke wrote:
> > On Tuesday, August 15, 2017 3:42:29 AM PDT Timothy Arceri wrote:
> >> Gallium drivers use this code path so we need to account for
> >> bindless after a
On Thursday, August 17, 2017 4:54:05 PM PDT Timothy Arceri wrote:
>
> On 18/08/17 09:05, Kenneth Graunke wrote:
> > On Thursday, August 17, 2017 10:22:15 AM PDT Jason Ekstrand wrote:
> >> ---
> >> src/util/ralloc.c | 2 +-
> >> 1 file changed, 1 insertion
ISL already offers functions to fill out most kinds of SURFACE_STATE,
so why not handle null surfaces too?
Null surfaces are simple, so we can just take the dimensions, rather
than an entire fill structure.
---
src/intel/isl/isl.c | 7 +++
src/intel/isl/isl.h |
We handle the Sandybridge multisampled 2D surface hack here, rather
than in ISL, because it requires allocating a BO, and is kind of messy.
---
src/mesa/drivers/dri/i965/Makefile.sources| 2 -
src/mesa/drivers/dri/i965/brw_context.c | 4 +-
---
src/intel/vulkan/genX_cmd_buffer.c | 20
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c
b/src/intel/vulkan/genX_cmd_buffer.c
index 280efcc2245..c5735b27e02 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++
+++ b/src/util/ralloc.c
> @@ -285,7 +285,7 @@ ralloc_steal(const void *new_ctx, void *ptr)
>return;
>
> info = get_header(ptr);
> - parent = get_header(new_ctx);
> + parent = new_ctx ? get_header(new_ctx) : NULL;
>
> unlink_block(info);
>
>
On Tuesday, August 15, 2017 3:42:29 AM PDT Timothy Arceri wrote:
> Gallium drivers use this code path so we need to account for
> bindless after all.
Why do Gallium drivers use ir_to_mesa? That seems like a misfeature.
i965 stopped using it years ago.
--Ken
When changing fast clear colors, we need to emit new SURFACE_STATE
with the updated color at the next draw call.
Most things work today because the atoms that handle SURFACE_STATE
for images (mutable images, textures, render targets) also listen to
BRW_NEW_BLORP, causing us to re-emit these on
BLORP invalidates the binding tables, but it doesn't destroy any of the
existing SURFACE_STATE entries in the statebuffer. We can reuse those.
---
src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 4
src/mesa/drivers/dri/i965/brw_tcs_surface_state.c | 4
case 4:
> - case 5:
> - /* Gen 4-5 are all the same when it comes to buffer surfaces */
This will make us actually use the gen4 function. That seems good.
Thanks for removing this handwritten stuff. Both are:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
On Saturday, August 12, 2017 1:25:25 AM PDT Mauro Rossi wrote:
> On Aug 11, 2017 5:14 PM, "Kenneth Graunke" <kenn...@whitecape.org> wrote:
>
> On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote:
> > I do wonder what the target machine is (I haven't s
On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote:
> I do wonder what the target machine is (I haven't seen one that would
> not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling
> back to memcpy makes perfect sense without USE_SSE4_1;
>
> Reviewed-by: Tapani Pälli
Passing screen lets us get the kernel features, devinfo, and bufmgr,
without needing container_of.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102062
Cc: Mauro Rossi
Cc: Tapani Pälli
---
src/mesa/drivers/dri/i965/brw_context.c |
This should hopefully fix build issues on 32-bit Android-x86.
Cc: Mauro Rossi
Cc: Tapani Pälli
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
---
src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
1 file changed, 2
shaders on radeonsi (which take 5min+ on
> some machines). Looking just at the GLSL IR compiler the speed up
> is ~40%.
>
> Cc: Kenneth Graunke <kenn...@whitecape.org>
> Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
>
> Tested-by: Dieter Nützel <die..
performance. This change along with the following patch
> helps fix that performance regression.
>
> Other advantages are that we reduce the number of calls to
> ralloc_parent(), and for loop unrolling we free constants after
> they are used rather than leaving them hanging around.
>
On Wednesday, August 9, 2017 1:20:53 PM PDT Jason Ekstrand wrote:
> On Wed, Aug 9, 2017 at 1:09 PM, Kenneth Graunke <kenn...@whitecape.org>
> wrote:
>
> > Also, silence an obnoxious finishme that started occurring for all
> > GL applications which use stencil af
nd Divided shaders with shader db, and then
> compiling them again after deleting mesa's shader cache
> index file. This will cause optimisations to never be performed
> on the IR and which presumably caused this issue to be hidden
> under normal circumstances.
>
> Cc: Kenneth Grau
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.
---
src/intel/isl/isl.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index
On Tuesday, August 8, 2017 12:26:41 AM PDT Kenneth Graunke wrote:
> We don't push UBOs on Gen6 currently, so there's no need for the
> larger alignment value.
>
> Cc: "17.2" <mesa-sta...@lists.freedesktop.org>
> ---
> src/mesa/drivers/dri/i965/brw_context.c | 2
We don't push UBOs on Gen6 currently, so there's no need for the
larger alignment value.
Cc: "17.2"
---
src/mesa/drivers/dri/i965/brw_context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_context.c
The GPU reads the shader kernel from the program cache BO. It never
writes it, so using a read-write BO reference makes no sense.
Just make KSP read-only, and drop KSP_ro.
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 19 ---
1 file changed, 4 insertions(+), 15
RELOC_NEEDS_GGTT is only meaningful on Sandybridge - it's skipped on
other generations - so this has no purpose. Just use rw_bo().
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
With the reloc domains gone, most of these are basically the same,
and the names don't make much sense anymore. Simplify them to ro_bo(),
rw_bo(), and ggtt_bo().
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 72 ++-
1 file changed, 25 insertions(+), 47 deletions(-)
7 @@ intel_miptree_make_shareable(struct brw_context *brw,
> }
>
> mt->aux_usage = ISL_AUX_USAGE_NONE;
> + mt->supports_fast_clear = false;
> }
>
>
>
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
On Friday, August 4, 2017 12:22:19 PM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-08-04 19:47:14)
> > On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > > Patch reordering from last time so that the cosmetic tweaks are done first
> > > and
On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> Patch reordering from last time so that the cosmetic tweaks are done first
> and out of the way. Kenneth has reviewed the core NO_RELOC patches, so
> hopefully it doesn't look too bad and we can land at least as far as
> there (patch
On Wednesday, August 2, 2017 8:50:54 PM PDT Dave Airlie wrote:
> If dual instanced compile fails (as seems to happen with virgl a
I think you mean "dual object" - it tries that first, then falls back to
either "dual instance" or "single" mode.
Thanks for f
On Monday, July 31, 2017 2:56:15 AM PDT Chris Wilson wrote:
> Remember to add the offset to the start of the buffer in the relocation
> or else we write 0xff into random bytes elsewhere.
> ---
> src/mesa/drivers/dri/i965/intel_blit.c | 4 ++--
>