This will remove the need for unnecessary runtime checks for CPU features if
already supported by target CPU, resulting in smaller and less branchy code.
V2:
- Removed the SSSE3 related part for the not yet merged patch.
- Avoiding redefinition of macros.
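
To illustrate the idea in the description above, here is a minimal sketch, not the actual patch. The names example_cpu_has_sse41, runtime_has_sse41 and pick_copy_path are hypothetical; only the compiler-defined __SSE4_1__ macro is real. When the build already targets a CPU with the feature, the check becomes a constant and the compiler can drop the branch.

```c
/* Hypothetical runtime flag; a real program would fill this in from a
 * CPUID probe at start-up. */
static int runtime_has_sse41 = 0;

#ifdef __SSE4_1__
/* The target CPU is guaranteed to have SSE4.1 (e.g. built with -msse4.1),
 * so the check is a compile-time constant and the branch can be removed. */
#define example_cpu_has_sse41 1
#else
/* Otherwise fall back to the value detected at runtime. */
#define example_cpu_has_sse41 (runtime_has_sse41)
#endif

/* Returns the name of the code path that would be taken. */
static const char *pick_copy_path(void)
{
    if (example_cpu_has_sse41)
        return "sse4.1";
    return "generic";
}
```

Guarding the macro with #ifdef/#else in a single place also avoids defining it twice.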
---
src/mesa/x86/common_x86_features.h |
On Fri, 07 Nov 2014 11:32:04 -0800
Eric Anholt e...@anholt.net wrote:
Pekka Paalanen ppaala...@gmail.com writes:
On Thu, 06 Nov 2014 13:01:03 -0800
Ian Romanick i...@freedesktop.org wrote:
I thought Eric and Chad already NAKed it in bugzilla. The problem is
that applications ask for
https://bugs.freedesktop.org/show_bug.cgi?id=85419
José Fonseca jfons...@vmware.com changed:
What |Removed |Added
CC   |        |jfons...@vmware.com
V3:
- remove flag check from config
V2:
- remove unrequired #ifdef bit_SSSE3
- order flag check in config
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
src/mesa/x86/common_x86.c | 2 ++
src/mesa/x86/common_x86_features.h | 4 +++-
2 files changed, 5 insertions(+), 1
Callgrind cpu usage results from pts benchmarks:
For ytile_copy_faster()
Nexuiz 1.6.1: 2.48% -> 0.97%
V3:
- rather than putting the ssse3 code in a different file
in order to compile it, make use of the gcc pragma for per-function
optimisations. Results in improved performance and less
impact on those
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
src/mesa/x86/x86_function_opt.h | 42 +
1 file changed, 42 insertions(+)
create mode 100644 src/mesa/x86/x86_function_opt.h
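
As a rough sketch of per-function optimisation (not the contents of x86_function_opt.h, which are not shown here): GCC and Clang accept a target attribute, the per-function counterpart of "#pragma GCC target", so one function can use SSSE3 while the rest of the file is built for the baseline CPU. The function names and the EXAMPLE_HAVE_SSSE3_PATH macro below are made up for illustration; the intrinsics and __builtin_cpu_supports() are real GCC/Clang features on x86.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <tmmintrin.h>
#define EXAMPLE_HAVE_SSSE3_PATH 1

/* Only this function is compiled with SSSE3 enabled. */
__attribute__((target("ssse3")))
static void reverse16_ssse3(uint8_t *dst, const uint8_t *src)
{
    /* Shuffle control: output byte i takes input byte 15 - i. */
    const __m128i idx = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                     8, 9, 10, 11, 12, 13, 14, 15);
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, _mm_shuffle_epi8(v, idx));
}
#endif

/* Portable fallback; dst and src must not overlap. */
static void reverse16_generic(uint8_t *dst, const uint8_t *src)
{
    for (int i = 0; i < 16; i++)
        dst[i] = src[15 - i];
}

/* Reverses a 16-byte block, choosing the SSSE3 path at runtime if present. */
static void reverse16(uint8_t *dst, const uint8_t *src)
{
#ifdef EXAMPLE_HAVE_SSSE3_PATH
    if (__builtin_cpu_supports("ssse3")) {
        reverse16_ssse3(dst, src);
        return;
    }
#endif
    reverse16_generic(dst, src);
}
```

The runtime check stays in the caller, so the binary still runs on CPUs without SSSE3 and on non-x86 builds.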
Using a macro like this means we can easily enable runtime support in
I would rather not use compiler-specific hacks in mesa. If it were a
personal pet project it would make sense.
Best regards,
Siavash Eliasi.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Mesa 10.3.3 has been released. Mesa 10.3.3 is a bug fix release
fixing bugs since the 10.3.2 release, (see below for a list of
changes).
The tag in the git repository for Mesa 10.3.3 is 'mesa-10.3.3'.
Mesa 10.3.3 is available for download at
ftp://freedesktop.org/pub/mesa/10.3.3/
SHA-256
On Sat, Nov 8, 2014 at 4:59 AM, Siavash Eliasi siavashser...@gmail.com wrote:
I would rather not use compiler-specific hacks in mesa. If it were a personal
pet project it would make sense.
We use compiler-specific things all the time. That's not going to change.
On 08/11/14 11:12, Timothy Arceri wrote:
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
As long as it fixes odd combinations such as the following I'm all
in favour of using such an approach. It will save us quite a few
lovely details - split the file, configure checks etc...
On 08/11/14 08:35, Siavash Eliasi wrote:
This will remove the need for unnecessary runtime checks for CPU features if
already supported by target CPU, resulting in smaller and less branchy code.
A comment I could not withhold based on your earlier post - We require
a micro-benchmark for this
On Sat, 2014-11-08 at 18:13 +, Emil Velikov wrote:
On 08/11/14 11:12, Timothy Arceri wrote:
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
As long as it fixes odd combinations such as the following I'm all
in favour of using such an approach. It will save us quite a few
lovely
Requires evergreen/cayman, and updated radeon kernel module.
Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
See also kernel side patch sent to dri-de...@lists.freedesktop.org
docs/GL3.txt | 4 +-
docs/relnotes/10.4.html | 1 +
On Sat, 2014-11-08 at 16:29 +0330, Siavash Eliasi wrote:
I would rather not use compiler-specific hacks in mesa. If it were a
personal pet project it would make sense.
Best regards,
Siavash Eliasi.
Having to work around compiler differences is a real world problem. As
has been pointed out
On 06/11/14 21:29, Frank Henigman wrote:
From: Frank Henigman fjhenig...@chromium.org
Dri driver libs are not linked to pull in libglapi so gbm_create_device()
fails when it tries to dlopen them (unless the application is linked
with something that does pull in libglapi, like libGL).
Until
On Sat, 2014-11-08 at 18:25 +, Emil Velikov wrote:
On 08/11/14 08:35, Siavash Eliasi wrote:
This will remove the need for unnecessary runtime checks for CPU features if
already supported by target CPU, resulting in smaller and less branchy code.
A comment I could not withhold based on
I know that's a time saver for developers (gcc function multi
versioning), however I still prefer the approach (my own ^^ ) which
works on all setups regardless of hardware and compiler (well, any sane
compiler: ICC, GCC, Clang,...).
Best regards,
Siavash Eliasi.
On 11/08/2014 09:55 PM, Emil Velikov wrote:
A comment I could not withhold based on your earlier post - We require
a micro-benchmark for this code. It will take me hours to find why mesa is
so slow now :P
Which raises the question: why didn't you post to that thread/topic in
the first place instead
With a follow-up commit we'll split the vl static lib from the auxiliary one,
and choose the appropriate vl (galliumvl or galliumvl_stub) for the
respective targets to link against.
Cc: Christian König christian.koe...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
Hello all,
Here is a reworked version of a patch I sent a while back - it
creates a stub for the vl functions used directly by the gallium
drivers. At link time we use it for non-vl targets, while for vdpau and
friends we use a galliumvl static lib, which is split out of auxiliary.
Resulting
Will be used by the non-VL targets, to stub out the functions called
by the drivers. The entry points to those are within the VL
state-trackers, yet the compiler cannot determine that at link time.
Thus we'll need to stub them out to prevent unresolved symbols in the
dri, egl, gbm and pipe-loader
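
A minimal sketch of what such a stub can look like, assuming nothing about the real VL API: every name below (example_vl_create_decoder, example_context, example_driver_has_video) is hypothetical. The stub has the same signature as the real function but does nothing, so targets that never reach the video code can still link.

```c
#include <stddef.h>

/* Opaque types standing in for the real driver and codec structures. */
struct example_video_codec;
struct example_context;

/* Stub: same signature as the (hypothetical) real entry point, but it
 * never creates a decoder. Linked only into non-VL targets. */
struct example_video_codec *
example_vl_create_decoder(struct example_context *ctx)
{
    (void)ctx;   /* unused in the stub */
    return NULL;
}

/* Driver code references the symbol unconditionally; with the stub linked
 * in, the reference resolves and the call simply reports "no decoder". */
int example_driver_has_video(struct example_context *ctx)
{
    return example_vl_create_decoder(ctx) != NULL;
}
```

Targets that do need video would link the real implementation instead of this file, so the driver code itself stays unchanged.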
Rather than shoving all the VL code into non-VL targets, increasing
their size, just split it out and use it when needed. This gives us
the side effect of building vl_winsys_dri.c once, dropping a few
automake warnings, and reducing the size of the dri modules as below
text data bss
Otherwise we might end up automatically enabling the build, only to error
out a couple of lines after that.
Cc: Christian König christian.koe...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
configure.ac | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)
diff
Set a single VL_{CFLAGS,LIBS} for xcb and friends, and let each target
check for its relevant library alone. Required as with follow-up
commits we'll build aux/vl into a separate module, which needs VL_CFLAGS
Cleanup: add a couple of explicit LIBDRM_LIBS links, as aux/vl itself
requires libdrm,
On Sun, 2014-11-09 at 08:59 +1100, Timothy Arceri wrote:
On Sat, 2014-11-08 at 18:13 +, Emil Velikov wrote:
On 08/11/14 11:12, Timothy Arceri wrote:
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
As long as it fixes odd combinations such as the following I'm all
in favour
On Wednesday, October 29, 2014 02:10:12 PM Matt Turner wrote:
Most prominently helps Natural Selection 2, which has a surprising
number of shaders that do very complicated things before drawing black.
instructions in affected programs: 23824 -> 19570 (-17.86%)
---
On Wednesday, October 29, 2014 03:58:13 PM Matt Turner wrote:
On Wed, Oct 29, 2014 at 2:10 PM, Matt Turner matts...@gmail.com wrote:
---
.../drivers/dri/i965/brw_fs_live_variables.cpp | 35
++
src/mesa/drivers/dri/i965/brw_fs_live_variables.h | 5
2 files
On Wednesday, October 29, 2014 02:10:13 PM Matt Turner wrote:
Also while we're touching var_from_reg, just make it an inline function.
---
src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 8
src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp | 14
--
On Monday, November 03, 2014 11:58:06 AM Matt Turner wrote:
Dead code elimination now handles this.
---
Depends on the previously sent 5 patch series.
Nice!
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
On Monday, November 03, 2014 01:34:48 PM Matt Turner wrote:
---
.../drivers/dri/i965/brw_vec4_live_variables.cpp | 28
++
.../drivers/dri/i965/brw_vec4_live_variables.h | 5
2 files changed, 33 insertions(+)
Patch 1 is:
Reviewed-by: Kenneth Graunke
On Monday, November 03, 2014 01:34:49 PM Matt Turner wrote:
Improves 359 shaders by >=10%
114 shaders by >=20%
91 shaders by >=30%
82 shaders by >=40%
22 shaders by >=50%
4 shaders by >=60%
2 shaders by >=80%
total instructions in
On Sun, 2014-11-09 at 07:48 +0330, Siavash Eliasi wrote:
I know that's a time saver for developers (gcc function multi
versioning), however I still prefer the approach (my own ^^ ) which
works on all setups regardless of hardware and compiler (well, any sane
compiler: ICC, GCC, Clang,...).