On 2016-12-09 13:39:52, Rafael Antognolli wrote:
> This extension adds new query types which can be used to detect overflow
> of transform feedback buffers. The new query types are also accepted by
> conditional rendering commands.
>
> Signed-off-by: Rafael Antognolli
Hi Emil,
On 12/09/2016 11:20 PM, Emil Velikov wrote:
> On 9 December 2016 at 13:20, Alexandre Courbot wrote:
>> On 12/08/2016 04:16 PM, Alexandre Courbot wrote:
>>> On 11/30/2016 10:44 PM, Christian Gmeiner wrote:
This is a very lightweight library to add basic support for
Reviewed-by: Jordan Justen
On 2016-12-09 17:16:37, Kenneth Graunke wrote:
> GetFramebufferAttachmentParameteriv should return GL_LINEAR for the
> window system default framebuffer's GL_DEPTH or GL_STENCIL attachments
> when there are zero depth or stencil bits.
>
>
Hi Daniel,
On 12/09/2016 11:13 PM, Daniel Stone wrote:
> Hi Alexandre,
>
> On 9 December 2016 at 13:20, Alexandre Courbot wrote:
>> On 12/08/2016 04:16 PM, Alexandre Courbot wrote:
>>> First, setting the tiling works indeed just fine if we are using an
>>> ioctl for this.
Updated shader-db numbers for BDW with recent shader-db:
fills helped: shaders/closed/steam/deus-ex-mankind-divided/306.shader_test CS SIMD16: 56 -> 53 (-5.36%)
fills helped: shaders/closed/steam/deus-ex-mankind-divided/206.shader_test CS SIMD16: 56 -> 53 (-5.36%)
total instructions in
On Fri, 2016-12-09 at 17:23 -0800, Jason Ekstrand wrote:
> Wow! This is way better than the last time I read through it. Good
> work!
>
> Overall, I'm much happier with the code now. The structure is
> better, some of the crazy phi logic is gone, and clone_cf_list is
> helping a lot. That
A number of games have large arrays of constants, which we promote to
uniforms. This introduces copies from the uniform array to the original
temporary array. Normally, copy propagation eliminates those copies,
making everything refer to the uniform array directly.
A number of shaders in "Deus
On Fri, 2016-12-09 at 14:20 -0800, Jason Ekstrand wrote:
> On Mon, Dec 5, 2016 at 5:12 PM, Timothy Arceri wrote:
> > V2:
> > - updated to create a generic list clone helper nir_cf_list_clone()
> > - continue to assert on clone when fallback flag not set as
> >
On Sat, Dec 10, 2016 at 2:07 AM, Timothy Arceri
wrote:
> On Wed, 2016-12-07 at 18:33 +0100, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> ---
>> run.c | 6 +-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/run.c
Wow! This is way better than the last time I read through it. Good work!
Overall, I'm much happier with the code now. The structure is better, some
of the crazy phi logic is gone, and clone_cf_list is helping a lot. That
said... I still have a pile of comments. Most of them are cosmetic, one
GetFramebufferAttachmentParameteriv should return GL_LINEAR for the
window system default framebuffer's GL_DEPTH or GL_STENCIL attachments
when there are zero depth or stencil bits.
The GL 4.5 spec's GetFramebufferAttachmentParameteriv section says:
"If the value of
On Wed, 2016-12-07 at 18:33 +0100, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
> run.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/run.c b/run.c
> index 08fd543..ded224a 100644
> --- a/run.c
> +++ b/run.c
> @@ -656,28 +656,32 @@
This was causing my poor 8GB laptop to run out of memory.
---
run.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/run.c b/run.c
index 08fd543..6d635c1 100644
--- a/run.c
+++ b/run.c
@@ -670,7 +670,9 @@ main(int argc, char **argv)
On Friday, December 9, 2016 4:32:53 PM PST Chad Versace wrote:
> The inescapable vortex of HiZ finds me wherever I go...
>
> This series brings us one step closer to passing the Android N CTS.
>
> See https://bugs.freedesktop.org/show_bug.cgi?id=98329.
>
> Chad Versace (2):
> i965/mt: Disable
intel_miptree_make_shareable() discarded and disabled CCS. Fix it so
that it discards and disables HiZ too.
Fixes
dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer
on Skylake.
v2: Actually do what the commit message says. Discard the HiZ buffer.
Fixes:
intel_miptree_make_shareable() discarded and disabled CCS. Fix it so
that it discards and disables HiZ too.
Fixes
dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer
on Skylake.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Cc: Haixia Shi
The inescapable vortex of HiZ finds me wherever I go...
This series brings us one step closer to passing the Android N CTS.
See https://bugs.freedesktop.org/show_bug.cgi?id=98329.
Chad Versace (2):
i965/mt: Disable aux surfaces after making miptree shareable
i965/mt: Disable HiZ when
The entire goal of intel_miptree_make_shareable() is to permanently
disable the miptree's aux surfaces. So set
intel_mipmap_tree:disable_aux_buffers after the function's done with
discarding the aux surfaces.
References: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Cc: Haixia Shi
Emil Velikov writes:
> From: Emil Velikov
>
> All the information required is provided via the respective xcb packages.
I was confused by the commit message, as I thought you meant that xcb
had the depends. But what it is is that we don't
Hi,
If the state change doesn't require any state validation in mesa/main,
it shouldn't flag _NEW_MULTISAMPLE. Instead, a new flag should be
added to gl_driver_flags and used here. The final code:
FLUSH_VERTICES(ctx, 0);
ctx->NewDriverState |= ctx->DriverFlags.NewIntelConservativeRasterization;
For the rest:
Reviewed-by: Marek Olšák
Marek
On Tue, Dec 6, 2016 at 11:48 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
> src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
> 1 file changed, 1 insertion(+), 1
On Mon, Dec 5, 2016 at 5:12 PM, Timothy Arceri wrote:
> V2:
> - updated to create a generic list clone helper nir_cf_list_clone()
> - continue to assert on clone when fallback flag not set as suggested
> by Jason.
> ---
> src/compiler/nir/nir_clone.c| 58
On Tue, Dec 6, 2016 at 5:17 PM, Philipp Zabel wrote:
> Add resource_changed to the ddebug, rbug, and trace wrappers. Since it
> is optional, there is no need to add it to noop.
>
> Signed-off-by: Philipp Zabel
> Suggested-by: Nicolai Hähnle
This extension adds new query types which can be used to detect overflow
of transform feedback buffers. The new query types are also accepted by
conditional rendering commands.
Signed-off-by: Rafael Antognolli
---
docs/features.txt| 2 +-
Enable the use of a transform feedback overflow query with
glBeginConditionalRender. The render commands will only execute if the
query is true (i.e. if there was an overflow).
Use ARB_conditional_render_inverted to change this behavior.
Signed-off-by: Rafael Antognolli
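The predicate described in the message above can be sketched in a few lines of C. This is only an illustration of the stated behaviour (render when the overflow query is true, with the inverted mode flipping that); the helper name should_render() and its parameters are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical helper, not Mesa code.
 *
 * overflow_query_result: true if the transform feedback overflow query
 *                        reported an overflow.
 * inverted:              true if the conditional render was started in an
 *                        inverted mode (ARB_conditional_render_inverted).
 *
 * Returns true if the render commands should execute. */
static bool
should_render(bool overflow_query_result, bool inverted)
{
   return inverted ? !overflow_query_result : overflow_query_result;
}
```

With the default mode, drawing happens only after an overflow; with the inverted mode, drawing happens only when no overflow occurred.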
When querying for transform feedback overflow on one or all of the
streams, store information about number of generated and written
primitives. Then check whether generated == written.
v2:
- use only SO_PRIM_STORAGE_NEEDED, do not fallback to
CL_INVOCATION_COUNT. (Kenneth)
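As a rough sketch of the check described above, assuming the driver has already gathered per-stream counts of generated and written primitives: a stream overflowed when the two counts differ, and the "all streams" query is true when any stream overflowed. The struct and function names below are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream snapshot, not the driver's real layout. */
struct stream_counts {
   uint64_t generated; /* primitives the pipeline tried to emit */
   uint64_t written;   /* primitives that fit in the buffers */
};

/* A single stream overflowed if some generated primitives were not written. */
static bool
stream_overflowed(const struct stream_counts *c)
{
   return c->generated != c->written;
}

/* The "any stream" variant of the query: true if at least one of the
 * num_streams streams overflowed. */
static bool
any_stream_overflowed(const struct stream_counts *streams, size_t num_streams)
{
   for (size_t i = 0; i < num_streams; i++) {
      if (stream_overflowed(&streams[i]))
         return true;
   }
   return false;
}
```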
Updated version addressing things suggested by Kenneth Graunke.
The series is available on github here:
https://github.com/rantogno/mesa/tree/review/overflow_query-v02
There are also piglit tests available for it here:
https://github.com/rantogno/piglit/tree/review/overflow_query-v02
Regards,
Enable getting the results of a transform feedback overflow query with a
buffer object.
Signed-off-by: Rafael Antognolli
---
src/mesa/drivers/dri/i965/hsw_queryobj.c | 108 +++
1 file changed, 108 insertions(+)
diff --git
Predication needs cmd parser only on gen7. For newer platforms, it
should be available without it.
Signed-off-by: Rafael Antognolli
---
src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
1 file changed, 1 insertion(+)
diff --git
Add some basic types and storage for the queries of this extension.
v2:
- update date of extension (Kenneth)
Signed-off-by: Rafael Antognolli
---
src/mesa/main/extensions_table.h | 1 +
src/mesa/main/mtypes.h | 5 +
2 files changed, 6
Also update checks on conditional rendering.
Signed-off-by: Rafael Antognolli
---
src/mesa/main/condrender.c | 4 +++-
src/mesa/main/queryobj.c| 21 +
src/mesa/state_tracker/st_cb_queryobj.c | 6 ++
3 files
Asking the DC for less than one cacheline (4 owords) of data for
uniform pull constants is suboptimal because the DC cannot request
less than that from L3, resulting in wasted bandwidth and unnecessary
message dispatch overhead, and exacerbating the IVB L3 serialization
bug. The following table
This is a respin of a series I sent nearly two years ago
reimplementing uniform pull constant loads in terms of constant cache
block read messages instead of using sampler LD messages. The
motivation is that oword block read messages are able to fetch more
data with a single message than the
---
src/mesa/drivers/dri/i965/brw_disasm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
b/src/mesa/drivers/dri/i965/brw_disasm.c
index 5e51be7..5930e44 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
We'll need roughly the same logic in other places and it would be
annoying to duplicate it. Instead factor it out into a function-like
macro that takes the number of dwords per block (which will prove more
convenient than taking the same value in owords or some other unit).
---
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index e73f2ca..6565f4d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++
brw_set_dp_read_message already had a target_cache argument, but its
interpretation was rather convoluted (on Gen6 the render cache was
used if the caller asked for it, otherwise it was ignored using the
sampler cache instead), and the constant cache wasn't representable at
all.
Not used anymore. It was just a scalar MOV.
---
src/mesa/drivers/dri/i965/brw_defines.h| 1 -
src/mesa/drivers/dri/i965/brw_fs.h | 3 ---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 --
src/mesa/drivers/dri/i965/brw_shader.cpp | 2 --
Change the FS generator to ask the dataport for enough owords worth of
constants to fill the execution size of the instruction -- which means
that the visitor now needs to set the execution size correctly for
uniform pull constant load instructions, which we were kind of
neglecting until now.
---
This reverts to using the oword block read messages for uniform pull
constant loads, as used to be the case until
4c1fdae0a01b3f92ec03b61aac1d3df5. There are two important differences
though: Now the L3 cacheability bits are set up correctly for UBOs
(since
In order to make sure that the constant cache is coherent with
previous rendering when we start using it for pull constant loads.
---
src/mesa/drivers/dri/i965/brw_pipe_control.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c
On Wednesday, December 7, 2016 10:50:29 AM PST Rafael Antognolli wrote:
> Add some basic types and storage for the queries of this extension.
>
> Signed-off-by: Rafael Antognolli
> ---
> src/mesa/main/extensions_table.h | 1 +
> src/mesa/main/mtypes.h | 5
https://bugs.freedesktop.org/show_bug.cgi?id=94512
--- Comment #8 from Emil Velikov ---
Double-checking the logs - seems like TLS is built/used across the board.
One thing which comes to mind - can you try with --disable-asm. I'm fairly sure
that the code we have in
On Friday, December 9, 2016 9:41:51 AM PST Jason Ekstrand wrote:
> The formula we have used in the past is a trivial reduction from the
> definition by simply multiplying both the numerator and denominator of the
> formula by 2. However, multiplying by e^x, you can further reduce it.
> This
Unsurprisingly, the formula looks great to me :-).
I was actually wondering about accuracy. I believe the biggest issue
(both with the original formula and this one) is probably values around
zero - because that gets calculated as (~1 - 1) / 2 - so the closest
values to zero you can get (other
On Thu, Dec 8, 2016 at 5:50 PM, Kenneth Graunke
wrote:
> On Thursday, December 8, 2016 5:41:02 PM PST Haixia Shi wrote:
> > Clamp input scalar value to range [-10, +10] to avoid precision problems
> > when the absolute value of input is too large.
> >
> > Fixes
https://bugs.freedesktop.org/show_bug.cgi?id=60197
Emil Velikov changed:
What|Removed |Added
Resolution|--- |FIXED
The new implementation is more correct because it clamps the incoming value
to 10 to avoid floating-point overflow. It also uses a much reduced
version of the formula which only requires 1 exp() rather than 2. This
fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests.
---
The formula we have used in the past is a trivial reduction from the
definition by simply multiplying both the numerator and denominator of the
formula by 2. However, multiplying by e^x, you can further reduce it.
This allows us to get rid of one side of the clamp and two of exponential
functions
From: Marek Olšák
TGSI compute shaders don't have RW_BUFFERS, so use SGPR[0:1].
Graphics shaders use the first slot of RW_BUFFERS.
TODO: Dave's patch only implements the latter; fix the attribute names.
UNTESTED
---
src/gallium/drivers/radeonsi/si_compute.c | 27
On Fri, Dec 9, 2016 at 8:45 AM, Lionel Landwerlin wrote:
> On 08/12/16 19:19, Jason Ekstrand wrote:
>
> On Dec 8, 2016 8:48 AM, "Lionel Landwerlin" wrote:
>
> v2: add lod level argument (Jason)
> return 0 for any lod level > 0 (Jason)
>
v2: put enum directly in gl_API.xml (Ilia)
Signed-off-by: Lionel Landwerlin
Cc: Ilia Mirkin
---
src/mapi/glapi/gen/gl_API.xml | 4
1 file changed, 4 insertions(+)
diff --git a/src/mapi/glapi/gen/gl_API.xml
On 08/12/16 19:19, Jason Ekstrand wrote:
On Dec 8, 2016 8:48 AM, "Lionel Landwerlin" wrote:
v2: add lod level argument (Jason)
return 0 for any lod level > 0 (Jason)
return 0 for any surface not 3D (Jason)
I'd rather
Reviewed-by: Jason Ekstrand
On Dec 9, 2016 07:07, "Edward O'Callaghan"
wrote:
Following on from the spirit of commit 011e5570f.
Signed-off-by: Edward O'Callaghan
---
src/intel/vulkan/anv_private.h | 15
Following on from the spirit of commit 011e5570f.
Signed-off-by: Edward O'Callaghan
---
src/intel/vulkan/anv_private.h | 15 ---
1 file changed, 15 deletions(-)
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index
Hi
This no longer applies cleanly since radv/meta: cleanup image info setup.
71a9574ffa1463773ad7587262bacc50ed37c042
Regards
Mike
On Wed, 23 Nov 2016 at 05:29 Dave Airlie wrote:
> From: Dave Airlie
>
> This is kind of a gross hack, but vulkan
On 8 December 2016 at 22:11, Bas Nieuwenhuizen wrote:
> Leftovers from anv?
>
> Signed-off-by: Bas Nieuwenhuizen
> ---
> src/amd/vulkan/radv_private.h | 16
> 1 file changed, 16 deletions(-)
>
> diff --git
On 9 December 2016 at 13:20, Alexandre Courbot wrote:
> On 12/08/2016 04:16 PM, Alexandre Courbot wrote:
>> On 11/30/2016 10:44 PM, Christian Gmeiner wrote:
>>> This is a very lightweight library to add basic support for
>>> renderonly GPUs. It does all the magic regarding
Hi Alexandre,
On 9 December 2016 at 13:20, Alexandre Courbot wrote:
> On 12/08/2016 04:16 PM, Alexandre Courbot wrote:
>> First, setting the tiling works indeed just fine if we are using an
>> ioctl for this. However my impression was that the preferred way of
>> doing it
On 9 December 2016 at 10:54, Chris Wilson wrote:
> Before saving the current position of the pipeline for the render
> stream, we need to flush.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
> Testcase: piglit/arb_transform_feedback2-draw-auto
>
On 9 December 2016 at 10:54, Chris Wilson wrote:
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_pipelined_register.h
> +#ifndef BRW_PIPELINED_REGISTER_H
> +#define BRW_PIPELINED_REGISTER_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +void
On 12/08/2016 04:16 PM, Alexandre Courbot wrote:
> On 11/30/2016 10:44 PM, Christian Gmeiner wrote:
>> This is a very lightweight library to add basic support for
>> renderonly GPUs. It does all the magic regarding in/exporting
>> buffers etc. This library will likely break android support and
>>
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
Reorder the parameters to brw_store_register_mem32 and
brw_store_register_mem64 so that the offset into the buffer and its
identifier are paired. This brings the interface into line with
Some I915_GEM_DOMAIN_VERTEX are changed to I915_GEM_DOMAIN_INSTRUCTION,
which are treated the same way in the kernel. So I guess it doesn't matter.
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
The domains used are immaterial, and we
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
There are a few open-coded settings of single registers using
MI_LOAD_REGISTER_IMM; replace those with a call to
brw_load_register_imm32().
Signed-off-by: Chris Wilson
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
Rename brw_load_register_reg to include the width (32bits) similar to
all the other register routines.
Signed-off-by: Chris Wilson
---
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
My ulterior motive is to kill intel_batchbuffer.[ch] and moving
discrete pieces of functionality into their own files is a small step
towards that goal.
Signed-off-by: Chris Wilson
Reviewed-by: Lionel Landwerlin
On 09/12/16 10:54, Chris Wilson wrote:
Rather than emit the instructions directly, make use of the helpers
brw_store_register_mem32() and brw_load_register_mem().
Signed-off-by: Chris Wilson
---
Some DRI drivers pass multiple bits in the buffer_mask parameter
to droid_image_get_buffer(), more than the buffer type combinations
actually supported. In that case, go through all the bits and do not
return an error when an unsupported buffer is requested; only
return an error when the allocation
We need the enum somewhere in the xml files for patch 5 to support the
glGet*(). Otherwise get_hash_params.py will error out.
On 09/12/16 01:29, Ilia Mirkin wrote:
While I'm not against it, not sure that this has much use... mostly
this would be for _mesa_enum_to_string() to work AFAIK. Also,
Reorder the parameters to brw_store_register_mem32 and
brw_store_register_mem64 so that the offset into the buffer and its
identifier are paired. This brings the interface into line with
brw_load_register_mem.
Signed-off-by: Chris Wilson
---
The domains used are immaterial, and we should never be marking the read
from the buffer as a write, so stop passing them around from the caller
and choose the appropriate read domain when writing.
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/brw_compute.c
My ulterior motive is to kill intel_batchbuffer.[ch] and moving
discrete pieces of functionality into their own files is a small step
towards that goal.
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/Makefile.sources | 2 +
Rename brw_load_register_reg to include the width (32bits) similar to
all the other register routines.
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/brw_pipelined_register.c | 2 +-
src/mesa/drivers/dri/i965/brw_pipelined_register.h | 6 +++---
There are a few open-coded settings of single registers using
MI_LOAD_REGISTER_IMM; replace those with a call to
brw_load_register_imm32().
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/brw_draw.c | 6 +-
Rather than emit the instructions directly, make use of the helpers
brw_store_register_mem32() and brw_load_register_mem().
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/hsw_sol.c | 27 +--
1 file changed, 9 insertions(+), 18
Before saving the current position of the pipeline for the render
stream, we need to flush.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Testcase: piglit/arb_transform_feedback2-draw-auto
Signed-off-by: Chris Wilson
---