From: Roland Scheidegger srol...@vmware.com
This adds support for this to more drivers, in particular for all the special
ones useful for debugging.
HW drivers are left alone, some should be able to support it if they want but
they may not be interested at this point.
---
From: Roland Scheidegger srol...@vmware.com
Some rounding errors could crop up when calculating a0. Use a more accurate
method (barycentric interpolation essentially) to fix this, though to fix
the REAL problem (which is that our interpolation will give very bad results
with small triangles far
From: Roland Scheidegger srol...@vmware.com
In particular get rid of home-grown vector helpers which didn't add much.
And while here fix formatting a bit. No functional change.
---
src/gallium/drivers/llvmpipe/lp_state_setup.c | 183 +
1 file changed, 66 insertions(+),
From: Roland Scheidegger srol...@vmware.com
d3d10 requires us to convert NaNs to zero for any float-int conversion.
We don't really do that but mostly seems to work. In particular I suspect the
very common float-unorm8 path only really passes because it relies on sse2
pack intrinsics which just
From: Roland Scheidegger srol...@vmware.com
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first element.
(Copied straight from the same fix for temps.)
While here fix up a couple of broken
From: Roland Scheidegger srol...@vmware.com
There's only one minor functional change, for immediates the pixel offsets
are no longer added since the values are all the same for all elements in
any case (it might be better if those weren't stored as soa vectors in the
first place maybe).
---
From: Roland Scheidegger srol...@vmware.com
SSE can't handle true vector shifts (with variable shift count),
so llvm is turning them into a mess of extracts, scalar shifts and inserts.
It is however possible to emulate them in lp_build_minify with float muls,
which should be way faster (saves
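A scalar sketch of the float-multiply idea, under stated assumptions: it is not the lp_build_minify implementation, the function names are hypothetical, the level is assumed to be at most 31 and the size is assumed to fit into the 24-bit float mantissa so the multiply is exact.

```c
#include <stdint.h>
#include <string.h>

/* Reference: minify with a plain integer shift, never going below 1. */
static uint32_t minify_shift(uint32_t size, uint32_t level)
{
    uint32_t s = size >> level;
    return s > 1 ? s : 1;
}

/* Emulation: multiply by 2^-level in float and truncate back to int.
 * The scale factor is built directly from its exponent bits, which is
 * the kind of operation that vectorizes without variable shifts. */
static uint32_t minify_fmul(uint32_t size, uint32_t level)
{
    uint32_t bits = (127u - level) << 23;   /* bit pattern of 2^-level */
    float scale;
    uint32_t s;

    memcpy(&scale, &bits, sizeof scale);
    s = (uint32_t)((float)size * scale);    /* truncation == floor here */
    return s > 1 ? s : 1;
}
```

Scaling by a power of two only changes the exponent, so for sizes below 2^24 both helpers should agree.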
From: Roland Scheidegger srol...@vmware.com
The layer coming from GS needs to be clamped (not sure if that's actually
the correct error behavior but we need something) as the number can be higher
than the amount of layers in the fb. However, this code was using the layer
calculation from the
From: Roland Scheidegger srol...@vmware.com
This format, while still supported in OpenGL (but optional) and glx, is just
causing major nuisance everywhere and needs special code in some places,
because things like 1 depth_bits don't work.
It is also the reason why we chose (just like in GL)
From: Roland Scheidegger srol...@vmware.com
d3d10 requires that cube corners are filtered with accurate weights (that
is, the weight of the non-existing corner texel should be evenly distributed
to the other 3 texels). OpenGL does not require this (but recommends it).
This requires us to use
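The weight rule described above (the missing corner texel's weight is spread evenly over the other three) can be sketched as follows. The function name and the 4-element weight layout are assumptions for illustration only, not the llvmpipe sampling code.

```c
/* w[4]: bilinear weights of the 2x2 footprint (they sum to 1).
 * missing: index of the corner texel that does not exist.
 * Hypothetical helper for illustration. */
static void redistribute_corner_weight(float w[4], int missing)
{
    float share = w[missing] / 3.0f;
    int i;

    for (i = 0; i < 4; i++)
        w[i] = (i == missing) ? 0.0f : w[i] + share;
}
```

After the call the weights still sum to 1 (up to rounding), so the filtered result stays a proper weighted average of the three existing texels.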
From: Roland Scheidegger srol...@vmware.com
For seamless cube filtering it is necessary to determine new faces and new
coords per sample. The logic for this is _seriously_ complex (what needs
to happen is very asymmetric wrt face, x/y under/overflow), further
complicated by the fact that if the 4
From: Roland Scheidegger srol...@vmware.com
---
src/gallium/drivers/llvmpipe/lp_screen.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 723e40e..4c81022 100644
---
From: Roland Scheidegger srol...@vmware.com
The previous limit of 128*1024 was reported on IRC to cause frequent recompiles
in some apps due to shader variant thrashing, leading to noticeable lags.
Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible
to
From: Roland Scheidegger srol...@vmware.com
Fix coord wrapping (and face selection too) in case of edges.
Unfortunately, the coord wrapping is way more complicated than what
the code did, as it depends on the face and the direction where the
texel falls off the face (the logic needed to get this
From: Roland Scheidegger srol...@vmware.com
They need some special handling. Quite complicated.
Additionally, use the same code for implicit derivatives too if no_rho_approx
and no_quad_lod is set, because it seems while generally it should be ok
to use per quad lod for implicit derivatives
From: Roland Scheidegger srol...@vmware.com
There's two reasons for this:
1) even when ignoring rho approximation for cube maps, the result is still
not correct, but it's better as the max error at edges is now sqrt(2) instead
of 2 (which was a full mip level), same as it is for ordinary 2d maps
From: Roland Scheidegger srol...@vmware.com
Not used since ages, and it wouldn't work at all with explicit derivatives now
(not that it did before as it ignored them but now the code would just use
the derivs pre-projected which would be quite random numbers).
v2: also get rid of 3 helper
From: Roland Scheidegger srol...@vmware.com
There's two reasons for this:
1) even when ignoring rho approximation for cube maps, the result is still
not correct, but it's better as the max error at edges is now sqrt(2) instead
of 2 (which was a full mip level), same as it is for ordinary 2d maps
From: Roland Scheidegger srol...@vmware.com
They need some special handling. Quite complicated.
Additionally, use the same code for implicit derivatives too if no_rho_approx
and no_quad_lod is set, because it seems while generally it should be ok
to use per quad lod for implicit derivatives
From: Roland Scheidegger srol...@vmware.com
Not used since ages, and it wouldn't work at all with explicit derivatives now
(not that it did before as it ignored them but now the code would just use
the derivs pre-projected which would be quite random numbers).
---
From: Roland Scheidegger srol...@vmware.com
There's two reasons for this:
1) even when ignoring rho approximation for cube maps, the result is still
not correct, but it's better as the max error at edges is now sqrt(2) instead
of 2 (which was a full mip level), same as it is for ordinary 2d maps
From: Roland Scheidegger srol...@vmware.com
Technically without seamless filtering enabled GL allows any wrap mode, which
made sense when supporting true borders (can get seamless effect with border
and CLAMP_TO_BORDER), but gallium doesn't support borders and d3d9 requires
wrap modes to be
From: Roland Scheidegger srol...@vmware.com
Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a
correct implementation for nearest filtering, and it's way better than
using repeat wrap for instance for linear filtering (though obviously this
doesn't actually do seamless
From: Roland Scheidegger srol...@vmware.com
Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a
correct implementation for nearest filtering, and it's way better than
using repeat wrap for instance for linear filtering (though obviously this
doesn't actually do seamless
From: Roland Scheidegger srol...@vmware.com
Instead of crashing just return all zero.
---
src/gallium/auxiliary/tgsi/tgsi_exec.c |1 +
src/gallium/drivers/softpipe/sp_tex_sample.c | 30 +-
2 files changed, 26 insertions(+), 5 deletions(-)
diff --git
From: Roland Scheidegger srol...@vmware.com
Turns out we don't need to do much extra work for detecting this case,
since we are guaranteed to get an empty static texture state in this case,
hence just rely on format being 0 and return all zero then.
Previously needed dummy textures (would just
From: Roland Scheidegger srol...@vmware.com
pstipple/aaline stages used PIPE_MAX_SAMPLER instead of
PIPE_MAX_SHADER_SAMPLER_VIEWS when dealing with sampler views.
Now these stages can't actually handle sampler_unit != texture_unit anyway
(they cannot work with d3d10 shaders at all due to using
From: Roland Scheidegger srol...@vmware.com
No idea if this is working right but copied straight from llvmpipe.
(Not only does this check the so_target but also use buffer-data instead
of buffer for the mapping.)
Just trying to get rid of a segfault testing something else...
---
From: Roland Scheidegger srol...@vmware.com
Instead of enhancing the AoS path so it can deal with it, just use SoA. Fixing
AoS path wouldn't be all that difficult (use all the same logic as SoA) but
considered not worth it for now.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |7
From: Roland Scheidegger srol...@vmware.com
The only reason this was needed was because the fetch texel function had to
get the (dynamic) border color, but this is now done much earlier.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 62 -
1 file changed, 22
From: Roland Scheidegger srol...@vmware.com
This is just preparation for per-pixel (or per-quad in case of multiple quads)
min/mag filter since some assumptions about number of miplevels being equal
to number of lods no longer hold true.
This change does not change behavior yet (though
From: Roland Scheidegger srol...@vmware.com
While a sqrt here and there shouldn't hurt much (depending on the cpu) it is
possible to completely omit it since rho is only used for calculating lod and
there log2(x) == 0.5*log2(x^2). Depending on the exact path taken for
calculating lod this means
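The identity used above, written out. The symbols are notation assumed here (lambda for the lod, rho for the scale factor); the point is only that the square root can be folded into a constant factor after the logarithm.

```latex
\lambda \;=\; \log_2 \rho
        \;=\; \log_2 \sqrt{\rho^2}
        \;=\; \tfrac{1}{2}\,\log_2\!\left(\rho^2\right)
```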
From: Roland Scheidegger srol...@vmware.com
Since we can have per-pixel lod we should also honor the filter per-pixel
(in fact we didn't honor it per quad either in the multiple quad case).
Do this by running the linear path and simply beating the weights into shape
(the sample with the higher
From: Roland Scheidegger srol...@vmware.com
There's just no way resetting the counters is working with nested/overlapping
queries.
---
src/gallium/drivers/softpipe/sp_prim_vbuf.c |2 +-
src/gallium/drivers/softpipe/sp_query.c | 33 +--
2 files changed, 17
From: Roland Scheidegger srol...@vmware.com
There's just no way resetting the counters is working with nested/overlapping
queries.
---
src/gallium/drivers/llvmpipe/lp_query.c | 35 ++
src/gallium/drivers/llvmpipe/lp_query.h |1 -
From: Roland Scheidegger srol...@vmware.com
In particular no one is interested in the vertex count, so drop that,
and also drop the duplicated num_primitives_generated /
so.primitives_storage_needed variables in drivers. I am unable for now to figure
out if primitives_storage_needed in SO stats
From: Roland Scheidegger srol...@vmware.com
Previously, the min/mag switchover point when using nearest/none mip
filter was effectively -0.5 which can't be right. Looks like new OpenGL
thinks it's ok if it's always 0.0 (older versions required 0.5 in some
cases), let's hope everybody else thinks
From: Roland Scheidegger srol...@vmware.com
Detected this hunting some other bug, not sure if it really needs fixing but
it is definitely wrong.
---
src/gallium/auxiliary/gallivm/lp_bld_sample.c |8
src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |2 +-
From: Roland Scheidegger srol...@vmware.com
Was using wrong (undefined) vector element (the elements are at 0/2 position,
not 0/1).
---
src/gallium/auxiliary/gallivm/lp_bld_sample.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
From: Roland Scheidegger srol...@vmware.com
Except for explicit derivs with cube maps which are very bogus anyway.
Just like explicit lod this is only used if no_quad_lod is set in
GALLIVM_DEBUG env var.
Minification is terrible on cpus which don't support true vector shifts
(but should work
From: Roland Scheidegger srol...@vmware.com
block size depth is always 1 even for compressed formats (unless someone
invents true 3d compressed formats at least which we can't represent).
Nearest (and soa) path had it right.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |4 ++--
1
From: Roland Scheidegger srol...@vmware.com
Just a copy paste error.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409.
Note that the test passing before probably simply means it doesn't verify
clamping of the border color itself as required by the OpenGL spec.
---
From: Roland Scheidegger srol...@vmware.com
The (complicated!) math is all identical; there are just minimal differences in how the
sign bit is calculated plus there's an additional subtraction for the argument
going into the polynomial for cos.
The logic stays 100% the same (with a small exception, sign
From: Roland Scheidegger srol...@vmware.com
Going to need this soon (not going to bother with avx2 intrinsics at this time
but don't want to do workarounds for true vector shifts if llvm itself can use
them just fine and won't need the gazillion instruction emulation).
Not really tested other
From: Roland Scheidegger srol...@vmware.com
Need to check the wrap mode of the actually used coords not a fixed 2.
While checking more than necessary would only potentially disable aos and
not cause any harm I'm pretty sure for 3d textures it could have caused
assertion failures (if s,t coords
From: Roland Scheidegger srol...@vmware.com
Turns out it is actually very complicated to figure out what a format really
is wrt range, as using channel information for determining unorm/snorm etc.
doesn't work for a bunch of cases - namely compressed, subsampled, other.
Also while here add
From: Roland Scheidegger srol...@vmware.com
Turns out it is actually very complicated to figure out what a format really
is wrt range, as using channel information for determining unorm/snorm etc.
doesn't work for a bunch of cases - namely compressed, subsampled, other.
Also while here add
From: Roland Scheidegger srol...@vmware.com
This is a very well hidden bug found by accident (only the fixed glean
tstencil2 test so far seems to hit it).
We must use new mask with combined s_pass values and orig_mask values
for zpass/zfail stencil ops, otherwise both the sfail op and one of
From: Roland Scheidegger srol...@vmware.com
There's a new debug value used to disable per-quad lod optimizations
in fragment shader (ignored for vs/gs as the results are just too wrong
typically). Also trying to detect if a supplied lod value is really a
scalar (if it's coming from immediate or
From: Roland Scheidegger srol...@vmware.com
Instead of passing s,t,r coordinates pass a coord array - the reason is that
I need to pass more coords (in particular for shadow coord, future will also
need another one for cube map arrays) so just pass them as an array.
Also, to simplify things, use
From: Roland Scheidegger srol...@vmware.com
This makes things a bit nicer, and more importantly it fixes an issue
where a downgraded array texture (due to view reduced to 1 layer and
addressed with (non-array) samplec instruction) would use the wrong
coord as shadow reference value. (This could
From: Roland Scheidegger srol...@vmware.com
This makes things a bit nicer, and more importantly it fixes an issue
where a downgraded array texture (due to view reduced to 1 layer and
addressed with (non-array) samplec instruction) would use the wrong
coord as shadow reference value. (This could
From: Roland Scheidegger srol...@vmware.com
Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9)
and definitely required by d3d10.
This actually doesn't do it pre-filter but more in-filter as otherwise
need to push the comparisons even further down into fetch code and this
From: Roland Scheidegger srol...@vmware.com
untested.
---
src/gallium/drivers/ilo/shader/toy_tgsi.c | 20
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c
b/src/gallium/drivers/ilo/shader/toy_tgsi.c
index
From: Roland Scheidegger srol...@vmware.com
Also use ordered comparisons for old cmp instructions. Untested.
---
src/gallium/drivers/r600/r600_shader.c | 18 ---
.../drivers/radeon/radeon_setup_tgsi_llvm.c| 49
2 files changed, 48 insertions(+),
From: Roland Scheidegger srol...@vmware.com
untested.
---
.../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 17 +
1 file changed, 17 insertions(+)
diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
From: Roland Scheidegger srol...@vmware.com
Should get rid of some float-to-int conversions (with negation).
No piglit regressions (with llvmpipe).
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++--
1 file changed, 15 insertions(+), 30 deletions(-)
diff --git
From: Roland Scheidegger srol...@vmware.com
We need to put border color into texture format color space which
essentially means clamping for non-float, normalized formats (not entirely
sure if we're also meant to quantize the float but it's probably ok not to
do it thankfully).
For OpenGL we
From: Roland Scheidegger srol...@vmware.com
Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the
From: Roland Scheidegger srol...@vmware.com
Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the
From: Roland Scheidegger srol...@vmware.com
Because we must maintain an exec_mask even if there's currently nothing
on the mask stack, we can still have an exec_mask at the end of the program.
Effectively, this mask should be set back to default when returning from main.
Without relying on
From: Roland Scheidegger srol...@vmware.com
Newer graphic languages don't want messy float mask results but instead true
boolean mask results for float comparisons. Otherwise just need to convert
the floats back to integers. Need to keep the old opcodes however due to both
legacy (gl and d3d9)
From: Roland Scheidegger srol...@vmware.com
Also while here add a bunch of other forgotten (integer) instructions to
tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing
away unused input components), though it may still be incomplete.
---
From: Roland Scheidegger srol...@vmware.com
FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the
select.
And just for consistency use the same appropriate ordered/unordered comparisons
for the old opcodes as well.
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 81
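A scalar sketch of the relationship between the two opcode families, assuming the new opcodes produce an all-ones/all-zeros integer mask and the old ones produce 1.0/0.0 floats. The C function names are hypothetical; this is not the gallivm code.

```c
#include <stdint.h>
#include <string.h>

/* New-style comparison: integer mask result (~0 for true, 0 for false). */
static uint32_t fseq(float a, float b)
{
    return (a == b) ? 0xffffffffu : 0u;
}

/* Old-style comparison: same compare, plus a select between 1.0f and 0.0f.
 * The select is done here by masking the bit pattern of 1.0f. */
static float seq(float a, float b)
{
    const float one = 1.0f;
    uint32_t one_bits, res_bits;
    float res;

    memcpy(&one_bits, &one, sizeof one_bits);
    res_bits = fseq(a, b) & one_bits;       /* the extra select step */
    memcpy(&res, &res_bits, sizeof res);
    return res;
}
```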
From: Roland Scheidegger srol...@vmware.com
The old float comparison opcodes always return floats 0.0 and 1.0 (clarified
in docs these were really floats, was always the case) for legacy graphics.
But everybody else (opengl,opencl,d3d10) just has to work around their
return results (converting
From: Roland Scheidegger srol...@vmware.com
My previous attempt at doing so double-failed miserably (minification of
zero still gives one, and even if it would not the value was never written
anyway).
While here also rename the confusingly named int_vec bld as we have int vecs
of different sizes,
From: Roland Scheidegger srol...@vmware.com
Clearly the returned values need to be per-element if the lod is per element.
Does not actually change behavior yet.
---
src/gallium/auxiliary/draw/draw_llvm_sample.c |2 ++
src/gallium/auxiliary/gallivm/lp_bld_sample.h |1 +
From: Roland Scheidegger srol...@vmware.com
This opcode is quite problematic in tgsi, while it tries to mirror
d3d10 resinfo it can't really do what's stated there due to missing
the crazy return type modifiers. Hence specify this is ignored along
with the swizzle.
(Other options would be to have
From: Roland Scheidegger srol...@vmware.com
This is wrong both for OpenGL and d3d. (In fact clamping is a side effect
of converting to depth format, so this should really do quantization too
at least in d3d10 for the comparisons to be truly correct.)
---
From: Roland Scheidegger srol...@vmware.com
Clamping is only done for fixed-point formats as part of conversion to
texture format.
---
src/gallium/drivers/softpipe/sp_tex_sample.c | 44 +++---
1 file changed, 32 insertions(+), 12 deletions(-)
diff --git
From: Roland Scheidegger srol...@vmware.com
d3d10 specifies ordered comparisons for everything but not_equal which is
unordered
(http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050.aspx).
OpenGL probably doesn't care.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 20
From: Roland Scheidegger srol...@vmware.com
Specifically, must return 0 for non-existent mip levels (and non-existent
textures which is an unsolved problem) for everything but total mip count.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 35 -
1 file changed, 27
From: Roland Scheidegger srol...@vmware.com
d3d10 has no notion of distinct array resources at either the resource or
sampler view level. However, shader dcl of resources certainly has, and
d3d10 expects resinfo to return the values according to that - in particular
a resource might have been a
From: Roland Scheidegger srol...@vmware.com
For d3d10 and ARB_robust_buffer_access_behavior, we are required to return
0 for out-of-bounds coordinates (for which we can just enable the code that was
already there but disabled). Additionally, also need to return 0 for
out-of-bounds mip level and
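The out-of-bounds rule for coordinates can be sketched as below. The function, its parameters, and the single-channel 8-bit texel layout are all assumptions for illustration; the real fetch code works on vectors and arbitrary formats.

```c
#include <stdint.h>

/* Hypothetical texel fetch that returns 0 for out-of-bounds coordinates
 * instead of reading outside the texture. */
static uint8_t fetch_texel_or_zero(const uint8_t *texels,
                                   int width, int height,
                                   int x, int y)
{
    if (x < 0 || x >= width || y < 0 || y >= height)
        return 0;
    return texels[y * width + x];
}
```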
From: Roland Scheidegger srol...@vmware.com
Should be much faster, seems to work in softpipe.
While here (also it's now disabled) fix up the pow factor - the former value
is what is in GL core; it is, however, not actually accurate to fp32 standard
(as it is 1.0/2.4), and if someone would do all the
From: Roland Scheidegger srol...@vmware.com
I think it's actually not good enough now...
---
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c |6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c
From: Roland Scheidegger srol...@vmware.com
Should be much faster, seems to work in softpipe.
While here (also it's now disabled) fix up the pow factor - the former value
is what is in GL core; it is, however, not actually accurate to fp32 standard
(as it is 1.0/2.4), and if someone would do all the
From: Roland Scheidegger srol...@vmware.com
While so far this only causes some harmless test failures, there's lots more
cpus with DAZ. All 64bit capable ones can do it (particularly relevant for
AMD cpus as they supported sse3 very very late) but if really necessary we
can check support for that
From: Roland Scheidegger srol...@vmware.com
Previously we were using truncation, which gives the correct result
only for numbers in [0.5-1.0] range (because there's no mantissa bits
to do any rounding there).
This is frequently hit (and probably only used there) when converting
fragment depth to
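A generic illustration of why truncation is not enough when converting a float to an unsigned normalized integer. This uses 8 bits and a simple multiply for readability; it is an assumption-laden sketch, not the conversion path the patch touches. Inputs are assumed to already be in [0, 1].

```c
#include <stdint.h>

/* Truncating conversion: always rounds down. */
static uint8_t float_to_unorm8_trunc(float f)
{
    return (uint8_t)(f * 255.0f);
}

/* Round-to-nearest conversion: add 0.5 before truncating. */
static uint8_t float_to_unorm8_round(float f)
{
    return (uint8_t)(f * 255.0f + 0.5f);
}
```

For 0.5 the scaled value is 127.5, so truncation yields 127 while rounding yields 128.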
From: Roland Scheidegger srol...@vmware.com
Previously, nothing was said about what happens with shift counts exceeding
bit width of the values to shift. In theory 3 behaviors are possible:
1) undefined (classic c definition)
2) just shift out all bits (so result is zero, or -1 potentially for ashr)
3)
From: Roland Scheidegger srol...@vmware.com
llvm shifts are undefined for shift counts exceeding (or matching) bit width,
so need to apply a mask.
NOTE: there's internal callers using this which guarantee the shift count
is smaller than the type width. However, all of these use constant shift
From: Roland Scheidegger srol...@vmware.com
c shifts are undefined for shift counts exceeding (or matching) bit width,
so need to apply a mask (on x86 it actually would usually probably work as
shifts do masking on int domain shifts - unless some auto-vectorizer would
come along at last as simd
From: Roland Scheidegger srol...@vmware.com
llvm shifts are undefined for shift counts exceeding (or matching) bit width,
so need to apply a mask for the tgsi shift instructions.
v2: only use mask for the tgsi shift instructions, not for the build shift
helpers. None of the internal callers need
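The masking approach from the shift patches above, sketched for 32-bit values. The function names are hypothetical; the mask makes the shift count wrap modulo the bit width, which avoids the undefined case in both C and LLVM IR.

```c
#include <stdint.h>

/* Left shift with the count masked to the bit width (0..31). */
static uint32_t shl_masked(uint32_t value, uint32_t count)
{
    return value << (count & 31u);
}

/* Logical right shift with the count masked to the bit width (0..31). */
static uint32_t ushr_masked(uint32_t value, uint32_t count)
{
    return value >> (count & 31u);
}
```

With this definition a count of 32 behaves like 0 and a count of 33 behaves like 1.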
From: Roland Scheidegger srol...@vmware.com
unlike OpenGL, the texel swizzle is embedded in the instruction, so honor
that.
(Technically we now execute both the sampler_view swizzle and the
per-instruction swizzle but this should be quite ok.)
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
From: Roland Scheidegger srol...@vmware.com
Same as for gallivm (though these don't quite work correctly in softpipe,
so untested).
---
src/gallium/auxiliary/tgsi/tgsi_exec.c | 40
1 file changed, 35 insertions(+), 5 deletions(-)
diff --git
From: Roland Scheidegger srol...@vmware.com
I am not able to find _any_ rounding behavior specified for OpenGL for
float to half-float conversions. However, it is specified for fp11/fp10
which suggests round to next finite value but round-to-zero would also
be allowed, but finite values must not
From: Roland Scheidegger srol...@vmware.com
Just like the UNORM case we need to use round to nearest, not trunc.
(There's also another problem, we're using the formula for SNORM-float
which will produce a value below -1.0 for the most negative value which
according to both OpenGL and d3d10 would
From: Roland Scheidegger srol...@vmware.com
CPU detection is not really x86 specific, the ifdef in particular didn't
even catch x86_64.
Also move to draw context creation which seems a lot cleaner, and just
call it always (which seems like a better idea than rely on drivers doing this
especially
From: Roland Scheidegger srol...@vmware.com
Since disabling denorms in draw_vbo() we require the util_cpu_caps to be
initialized there. Hence add another util_cpu_detect() call in
draw_create_context() which should ensure this.
(There is another call in draw_get_option_use_llvm() which only gets
From: Roland Scheidegger srol...@vmware.com
The codeword must be unsigned (otherwise will shift in 1's from above when
merging low/high parts so some texels decode wrong).
This also affects gallium's util/u_format_rgtc.
---
src/mesa/main/texcompress_rgtc_tmp.h |6 +++---
1 file changed, 3
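A small sketch of the sign-extension problem described above. The 3-bit index layout used here (two bits from the top of one byte, one bit from the bottom of the next) is a hypothetical layout chosen for illustration, not a claim about the exact RGTC bit positions. Right-shifting a negative value is implementation-defined in C; common compilers use an arithmetic shift, which is what shifts in the 1's.

```c
#include <stdint.h>

/* Signed bytes: the promoted negative value drags 1's in from above. */
static unsigned index_from_signed(int8_t lo, int8_t hi)
{
    return (unsigned)((lo >> 6) | ((hi & 1) << 2));
}

/* Unsigned bytes: the shift brings in zeros, so the merge is clean. */
static unsigned index_from_unsigned(uint8_t lo, uint8_t hi)
{
    return (unsigned)((lo >> 6) | ((hi & 1) << 2));
}
```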
From: Roland Scheidegger srol...@vmware.com
We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen
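The factor formula quoted above, min(As, 1-Ad), as a scalar C sketch with a hypothetical function name. With inputs clamped to [0, 1] the result is never negative; with unclamped float inputs it can be.

```c
/* src_alpha_saturate blend factor: min(As, 1 - Ad). */
static float src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
    float inv_dst = 1.0f - dst_alpha;
    return src_alpha < inv_dst ? src_alpha : inv_dst;
}
```

For example, with Ad = 1 the factor is 0 for any clamped As, but a negative unclamped As passes straight through.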
From: Roland Scheidegger srol...@vmware.com
Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend but since this is still a fixed point renderbuffer we must
still clamp the inputs
From: Roland Scheidegger srol...@vmware.com
Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear
From: Roland Scheidegger srol...@vmware.com
Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit
From: Roland Scheidegger srol...@vmware.com
We had to disable fast rsqrt before because it wasn't precise enough etc.
However in situations when we know we're not going to need more precision
we can still use a fast rsqrt (which can be several times faster than
the quite expensive sqrt). Hence
From: Roland Scheidegger srol...@vmware.com
srgb-to-linear is using 3rd degree polynomial for now which should be _just_
good enough. Reverse is using some rational polynomials and is quite accurate,
though not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might
From: Roland Scheidegger srol...@vmware.com
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just
From: Roland Scheidegger srol...@vmware.com
b04a295a4a0cd2defe352b3193b5fa79ca8fc9fc removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization
From: Roland Scheidegger srol...@vmware.com
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just
From: Roland Scheidegger srol...@vmware.com
If there are queries active the opaque optimization resetting the bin needs to
be disabled.
(Not really tested since the bug was discovered by code inspection not
an actual test failure.)
---
src/gallium/drivers/llvmpipe/lp_query.c |2 ++