Re: [Mesa-dev] [PATCH 0/8] swr: update rasterizer
Reviewed-by: Bruce Cherniak

> On Sep 5, 2017, at 1:57 PM, Tim Rowley wrote:
>
> Highlight is starting to unify the simd/simd16 code, removing lots of
> temporary code duplication.
>
> No piglit or vtk test regressions.
>
> Tim Rowley (8):
>   swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
>   swr: set caps for VB 4-byte alignment
>   swr/rast: Removed some trailing whitespace caught during review
>   swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
>   swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble
>   swr/rast: SIMD16 FE remove templated immediates workaround
>   swr/rast: Remove use of C++14 template variable
>   swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types
>
>  .../swr/rasterizer/codegen/gen_llvm_ir_macros.py   |    1 +
>  .../codegen/templates/gen_ar_eventhandlerfile.hpp  |    4 +-
>  src/gallium/drivers/swr/rasterizer/core/binner.cpp | 2312 ++--
>  src/gallium/drivers/swr/rasterizer/core/binner.h   |  192 +-
>  src/gallium/drivers/swr/rasterizer/core/clip.cpp   |   16 +-
>  src/gallium/drivers/swr/rasterizer/core/clip.h     | 1654 --
>  .../drivers/swr/rasterizer/core/conservativeRast.h |    1 +
>  src/gallium/drivers/swr/rasterizer/core/fifo.hpp   |    4 +-
>  .../drivers/swr/rasterizer/core/frontend.cpp       |    6 +-
>  src/gallium/drivers/swr/rasterizer/core/pa.h       |   20 +-
>  src/gallium/drivers/swr/rasterizer/core/state.h    |    7 +
>  src/gallium/drivers/swr/rasterizer/core/utils.h    |    8 +
>  .../drivers/swr/rasterizer/jitter/fetch_jit.cpp    |    7 +-
>  src/gallium/drivers/swr/swr_screen.cpp             |    9 +-
>  14 files changed, 1193 insertions(+), 3048 deletions(-)
>
> --
> 2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/8] swr: update rasterizer
Highlight is starting to unify the simd/simd16 code, removing lots of
temporary code duplication.

No piglit or vtk test regressions.

Tim Rowley (8):
  swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
  swr: set caps for VB 4-byte alignment
  swr/rast: Removed some trailing whitespace caught during review
  swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
  swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble
  swr/rast: SIMD16 FE remove templated immediates workaround
  swr/rast: Remove use of C++14 template variable
  swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types

 .../swr/rasterizer/codegen/gen_llvm_ir_macros.py   |    1 +
 .../codegen/templates/gen_ar_eventhandlerfile.hpp  |    4 +-
 src/gallium/drivers/swr/rasterizer/core/binner.cpp | 2312 ++--
 src/gallium/drivers/swr/rasterizer/core/binner.h   |  192 +-
 src/gallium/drivers/swr/rasterizer/core/clip.cpp   |   16 +-
 src/gallium/drivers/swr/rasterizer/core/clip.h     | 1654 --
 .../drivers/swr/rasterizer/core/conservativeRast.h |    1 +
 src/gallium/drivers/swr/rasterizer/core/fifo.hpp   |    4 +-
 .../drivers/swr/rasterizer/core/frontend.cpp       |    6 +-
 src/gallium/drivers/swr/rasterizer/core/pa.h       |   20 +-
 src/gallium/drivers/swr/rasterizer/core/state.h    |    7 +
 src/gallium/drivers/swr/rasterizer/core/utils.h    |    8 +
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp    |    7 +-
 src/gallium/drivers/swr/swr_screen.cpp             |    9 +-
 14 files changed, 1193 insertions(+), 3048 deletions(-)

--
2.7.4
Re: [Mesa-dev] [PATCH 0/8] swr: update rasterizer
> On Jun 26, 2017, at 7:41 AM, Emil Velikov wrote:
>
> On 22 June 2017 at 22:12, Tim Rowley wrote:
>> Highlights include splitting the heavily templated files into multiple
>> chunks to speed compile (2x for a large machine), and switching the
>> simd intrinsic usage from a macro-based header to a more c++ feeling
>> library.
>>
> Yay \o/. Out of curiosity - does the simd library bring much more
> apart from a C++ feel?

A couple of major intentions, mainly to produce better code for avx512:

* hide the differences in masking operations - avx/avx2 uses a normal
  ymm register for masking, while avx512 has separate mask registers
* allow reduced vector width operations to be implemented in terms of
  avx512 code, so that a larger register set and mask registers can be
  used

> Did you notice the errors in the Travis build [1]? For some reason
> they don't flag up when building locally, although a few C++17
> warnings did pop-up. Speaking of which since we're back to C++11 for
> SWR can we toggle back to GCC 4.8(.1) for Travis?
>
> Can you guys look at those, please... in case you haven't already.

Sorry, had a patch for this ready to go Friday, but we were working
through some other issues and I forgot to send it to the list. I've
done so now.
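The first point in the reply above (hiding the masking difference) can be illustrated with a small sketch. This is not the SWR simdlib API: `VectorMaskBackend`, `BitMaskBackend`, `cmplt`, `blend` and `lane_min` are hypothetical names, and plain arrays stand in for real ymm/zmm registers so the example builds on any machine. It only assumes what the reply states: avx/avx2 carry a mask in an ordinary vector register, while avx512 uses a separate mask register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical backend modelling avx/avx2-style masking: the mask is an
// ordinary vector whose lanes are all-ones (select b) or all-zeros (keep a).
struct VectorMaskBackend {
    static constexpr std::size_t kLanes = 8;
    using Vec  = std::array<std::int32_t, kLanes>;
    using Mask = std::array<std::int32_t, kLanes>;

    static Mask cmplt(const Vec& a, const Vec& b) {
        Mask m{};
        for (std::size_t i = 0; i < kLanes; ++i) m[i] = (a[i] < b[i]) ? -1 : 0;
        return m;
    }
    static Vec blend(const Vec& a, const Vec& b, const Mask& m) {
        Vec r{};
        // Like a variable blend, the sign bit of each mask lane selects.
        for (std::size_t i = 0; i < kLanes; ++i) r[i] = (m[i] < 0) ? b[i] : a[i];
        return r;
    }
};

// Hypothetical backend modelling avx512-style masking: the mask lives in a
// separate, compact value with one bit per lane.
struct BitMaskBackend {
    static constexpr std::size_t kLanes = 8;
    using Vec  = std::array<std::int32_t, kLanes>;
    using Mask = std::uint8_t;

    static Mask cmplt(const Vec& a, const Vec& b) {
        Mask m = 0;
        for (std::size_t i = 0; i < kLanes; ++i)
            if (a[i] < b[i]) m = static_cast<Mask>(m | (1u << i));
        return m;
    }
    static Vec blend(const Vec& a, const Vec& b, Mask m) {
        Vec r{};
        for (std::size_t i = 0; i < kLanes; ++i) r[i] = ((m >> i) & 1u) ? b[i] : a[i];
        return r;
    }
};

// Caller code is written once; the mask representation never leaks out
// because the mask only flows from cmplt() into blend().
template <typename Simd>
typename Simd::Vec lane_min(const typename Simd::Vec& a,
                            const typename Simd::Vec& b) {
    return Simd::blend(a, b, Simd::cmplt(b, a));
}
```

Calling `lane_min<VectorMaskBackend>(a, b)` and `lane_min<BitMaskBackend>(a, b)` gives the same per-lane minimum, which is the property a library-style interface has over macros that expose the mask type directly.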
Re: [Mesa-dev] [PATCH 0/8] swr: update rasterizer
Hi Tim,

On 22 June 2017 at 22:12, Tim Rowley wrote:
> Highlights include splitting the heavily templated files into multiple
> chunks to speed compile (2x for a large machine), and switching the
> simd intrinsic usage from a macro-based header to a more c++ feeling
> library.
>
Yay \o/. Out of curiosity - does the simd library bring much more
apart from a C++ feel?

> No regressions on piglit or vtk tests, passes "make dist"/compile.
>
Did you notice the errors in the Travis build [1]? For some reason
they don't flag up when building locally, although a few C++17
warnings did pop-up. Speaking of which since we're back to C++11 for
SWR can we toggle back to GCC 4.8(.1) for Travis?

Can you guys look at those, please... in case you haven't already.

Thanks
Emil

[1] https://travis-ci.org/evelikov/Mesa/jobs/244817060
[Mesa-dev] [PATCH 0/8] swr: update rasterizer
Highlights include splitting the heavily templated files into multiple
chunks to speed compile (2x for a large machine), and switching the
simd intrinsic usage from a macro-based header to a more c++ feeling
library.

No regressions on piglit or vtk tests, passes "make dist"/compile.

Tim Rowley (8):
  swr/rast: Split backend.cpp to improve compile time
  swr/rast: Support dynamically sized vertex layout
  swr/rast: Split rasterizer.cpp to improve compile times
  swr/rast: Fix unused variable warnings
  swr/rast: Switch intrinsic usage to SIMDLib
  swr/rast: Fix missing setup of psContext.pColorBuffer
  swr/rast: move default split size from driver to rasterizer
  swr/rast: increase number of possible draws in flight

 src/gallium/drivers/swr/Makefile.am                |   60 +-
 src/gallium/drivers/swr/Makefile.sources           |   19 +-
 src/gallium/drivers/swr/SConscript                 |   41 +-
 .../drivers/swr/rasterizer/codegen/gen_backends.py |   53 +-
 .../drivers/swr/rasterizer/codegen/gen_common.py   |    7 +
 .../drivers/swr/rasterizer/codegen/knob_defs.py    |    4 +-
 .../rasterizer/codegen/templates/gen_backend.cpp   |    1 +
 .../codegen/templates/gen_header_init.hpp          |   43 +
 .../codegen/templates/gen_rasterizer.cpp           |   42 +
 src/gallium/drivers/swr/rasterizer/common/intrin.h |  102 +-
 .../drivers/swr/rasterizer/common/simd16intrin.h   | 1223 ++---
 .../drivers/swr/rasterizer/common/simdintrin.h     | 1257 +++---
 .../drivers/swr/rasterizer/common/simdlib.hpp      |  550 ++
 .../swr/rasterizer/common/simdlib_128_avx.inl      |  545 ++
 .../swr/rasterizer/common/simdlib_128_avx2.inl     |   68 +
 .../swr/rasterizer/common/simdlib_128_avx512.inl   |  408 +
 .../swr/rasterizer/common/simdlib_256_avx.inl      |  757 +
 .../swr/rasterizer/common/simdlib_256_avx2.inl     |  234 +++
 .../swr/rasterizer/common/simdlib_256_avx512.inl   |  409 +
 .../swr/rasterizer/common/simdlib_512_avx512.inl   |  682
 .../rasterizer/common/simdlib_512_avx512_masks.inl |   27 +
 .../swr/rasterizer/common/simdlib_512_emu.inl      |  842 +
 .../rasterizer/common/simdlib_512_emu_masks.inl    |   28 +
 .../swr/rasterizer/common/simdlib_interface.hpp    |  428 +
 .../swr/rasterizer/common/simdlib_types.hpp        |  377 +
 src/gallium/drivers/swr/rasterizer/core/api.cpp    |    8 +-
 .../drivers/swr/rasterizer/core/backend.cpp        |  809 +
 src/gallium/drivers/swr/rasterizer/core/backend.h  | 1033 +--
 .../drivers/swr/rasterizer/core/backend_clear.cpp  |  281 +++
 .../drivers/swr/rasterizer/core/backend_impl.h     | 1067
 .../drivers/swr/rasterizer/core/backend_sample.cpp |  344
 .../swr/rasterizer/core/backend_singlesample.cpp   |  320
 src/gallium/drivers/swr/rasterizer/core/binner.cpp |  293 ++--
 src/gallium/drivers/swr/rasterizer/core/clip.cpp   |    6 +-
 src/gallium/drivers/swr/rasterizer/core/clip.h     |   50 +-
 src/gallium/drivers/swr/rasterizer/core/context.h  |    2 +-
 .../swr/rasterizer/core/format_conversion.h        |    8 +-
 .../drivers/swr/rasterizer/core/format_types.h     |   30 +-
 .../drivers/swr/rasterizer/core/format_utils.h     |  268 +--
 .../drivers/swr/rasterizer/core/frontend.cpp       |   16 +-
 src/gallium/drivers/swr/rasterizer/core/frontend.h |    4 +-
 .../drivers/swr/rasterizer/core/multisample.cpp    |   48 -
 src/gallium/drivers/swr/rasterizer/core/pa.h       |   16 +-
 src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp |  106 +-
 .../drivers/swr/rasterizer/core/rasterizer.cpp     | 1788 +++-
 .../drivers/swr/rasterizer/core/rasterizer.h       |   31 +-
 .../drivers/swr/rasterizer/core/rasterizer_impl.h  | 1376 +++
 src/gallium/drivers/swr/rasterizer/core/state.h    |   12 +
 .../drivers/swr/rasterizer/memory/StoreTile.h      |  156 +-
 src/gallium/drivers/swr/swr_screen.cpp             |    4 -
 src/gallium/drivers/swr/swr_shader.cpp             |    2 +
 src/gallium/drivers/swr/swr_state.cpp              |    2 +
 52 files changed, 10139 insertions(+), 6148 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/codegen/templates/gen_header_init.hpp
 create mode 100644 src/gallium/drivers/swr/rasterizer/codegen/templates/gen_rasterizer.cpp
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib.hpp
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx2.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx2.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl
 create mode 100644 src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl
 create mode 100644