> Hi,
>
> Patch optimizes non-subword vector compress and expand APIs for x86 AVX2-only targets.
> Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs support only the AVX2 instruction set.
> These APIs are used very frequently in columnar database filter operations.
>
> Implementation uses a lookup table to record permute indices. The table index is computed
> from the mask argument of the compress/expand operation.
>
> Following are the performance numbers of the JMH micro included with the patch.
>
>
> System : Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids)
>
> Baseline:
> Benchmark                                 (size)   Mode  Cnt     Score  Error   Units
> ColumnFilterBenchmark.filterDoubleColumn    1024  thrpt    2   142.767         ops/ms
> ColumnFilterBenchmark.filterDoubleColumn    2047  thrpt    2    71.436         ops/ms
> ColumnFilterBenchmark.filterDoubleColumn    4096  thrpt    2    35.992         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     1024  thrpt    2   182.151         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     2047  thrpt    2    91.096         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     4096  thrpt    2    44.757         ops/ms
> ColumnFilterBenchmark.filterIntColumn       1024  thrpt    2   184.099         ops/ms
> ColumnFilterBenchmark.filterIntColumn       2047  thrpt    2    91.981         ops/ms
> ColumnFilterBenchmark.filterIntColumn       4096  thrpt    2    45.170         ops/ms
> ColumnFilterBenchmark.filterLongColumn      1024  thrpt    2   148.017         ops/ms
> ColumnFilterBenchmark.filterLongColumn      2047  thrpt    2    73.516         ops/ms
> ColumnFilterBenchmark.filterLongColumn      4096  thrpt    2    36.844         ops/ms
>
> Withopt:
> Benchmark                                 (size)   Mode  Cnt     Score  Error   Units
> ColumnFilterBenchmark.filterDoubleColumn    1024  thrpt    2  2051.707         ops/ms
> ColumnFilterBenchmark.filterDoubleColumn    2047  thrpt    2   914.072         ops/ms
> ColumnFilterBenchmark.filterDoubleColumn    4096  thrpt    2   489.898         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     1024  thrpt    2  5324.195         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     2047  thrpt    2  2587.229         ops/ms
> ColumnFilterBenchmark.filterFloatColumn     4096  thrpt    2  1278.665         ops/ms
> ColumnFilterBenchmark.filterIntColumn       1024  thrpt    2  4149.384         ops/ms
> ColumnFilterBenchmark.filterIntColumn       2047  thrpt    2  1791.170         ops/ms
> ColumnFilterBenchmark.filterIntColumn       4096...

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review comments resolutions.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17261/files
  - new: https://git.openjdk.org/jdk/pull/17261/files/6bd9b0ad..ea0aa0b4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17261&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17261&range=01-02

  Stats: 49 lines in 4 files changed: 44 ins; 1 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/17261.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17261/head:pull/17261

PR: https://git.openjdk.org/jdk/pull/17261
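
For readers new to the technique, the lookup-table idea from the quoted description (a table of permute indices, indexed by the mask) can be modeled in scalar Java. This is only an illustrative sketch, not the code in the patch: the class name, the table layout, the 8-lane width and the `ZERO_LANE` sentinel are all assumptions made for the example, and a real AVX2 implementation would use a vector permute where this uses a loop.

```java
import java.util.Arrays;

// Scalar model of lookup-table-driven compress/expand over 8 int lanes.
// Every name below is hypothetical; this is not the HotSpot implementation.
final class CompressLutSketch {
    static final int LANES = 8;       // assumption: 8 x 32-bit lanes (one 256-bit vector)
    static final int ZERO_LANE = -1;  // assumption: sentinel meaning "write 0 to this lane"

    // COMPRESS_TABLE[mask][i] = source lane copied into destination lane i.
    static final int[][] COMPRESS_TABLE = new int[1 << LANES][LANES];
    // EXPAND_TABLE[mask][i] = source lane copied into destination lane i.
    static final int[][] EXPAND_TABLE = new int[1 << LANES][LANES];

    static {
        // Precompute the permute indices once for every possible mask value.
        for (int mask = 0; mask < (1 << LANES); mask++) {
            Arrays.fill(COMPRESS_TABLE[mask], ZERO_LANE);
            Arrays.fill(EXPAND_TABLE[mask], ZERO_LANE);
            int next = 0; // next packed (low) lane
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) != 0) {
                    COMPRESS_TABLE[mask][next] = lane; // selected lane moves down
                    EXPAND_TABLE[mask][lane] = next;   // packed lane moves up
                    next++;
                }
            }
        }
    }

    // Stand-in for a vector permute: gather lanes of src by index, zero elsewhere.
    static int[] permute(int[] src, int[] indices) {
        int[] dst = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            dst[i] = (indices[i] == ZERO_LANE) ? 0 : src[indices[i]];
        }
        return dst;
    }

    // Pack the lanes selected by mask into the low lanes; the rest become 0.
    static int[] compress(int[] src, int mask) {
        return permute(src, COMPRESS_TABLE[mask & ((1 << LANES) - 1)]);
    }

    // Spread the low lanes of src into the lanes selected by mask; the rest become 0.
    static int[] expand(int[] src, int mask) {
        return permute(src, EXPAND_TABLE[mask & ((1 << LANES) - 1)]);
    }

    public static void main(String[] args) {
        int[] src = {10, 11, 12, 13, 14, 15, 16, 17};
        int mask = 0b10100101; // lanes 0, 2, 5 and 7 are selected
        System.out.println(Arrays.toString(compress(src, mask)));
        System.out.println(Arrays.toString(expand(src, mask)));
    }
}
```

The point of the design is that the per-call work reduces to one table load plus one permute, with no data-dependent branching on the mask bits. With 8 lanes the table has 256 rows, which is why the approach suits int/float/long/double lanes but would not scale directly to subword lanes.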