> "`VectorSupport.indexVector()`" is used to compute a vector that contains the
> index values based on a given vector and a scale value
> (i.e. `index = vec + iota * scale`). This function is widely used in other
> APIs such as "`VectorMask.indexInRange`", which is useful for tail loop
> vectorization. It can also be implemented easily with vector instructions.
>
> This patch adds the vector intrinsic implementation of it. The steps are:
>
> 1) Load the const "iota" vector.
>
> We extend the "`vector_iota_indices`" stubs from byte to the other integral
> types. For floating point vectors, an additional vector cast is needed to get
> the right iota values.
>
> 2) Compute the indexes with "`vec + iota * scale`".
>
> Here are the performance results for the newly added micro benchmark on ARM NEON:
>
> Benchmark                                 Gain
> IndexVectorBenchmark.byteIndexVector      1.477
> IndexVectorBenchmark.doubleIndexVector    5.031
> IndexVectorBenchmark.floatIndexVector     5.342
> IndexVectorBenchmark.intIndexVector       5.529
> IndexVectorBenchmark.longIndexVector      3.177
> IndexVectorBenchmark.shortIndexVector     5.841
>
> Please help to review and share the feedback! Thanks in advance!
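
For readers following the thread, the lane-wise arithmetic quoted above (`index = vec + iota * scale`) can be modelled in plain scalar Java. This is only a sketch for `int` lanes: the class `IndexVectorSketch` and the helper `inRangeMask` are hypothetical names invented for illustration, and the code is neither the HotSpot intrinsic nor the Vector API implementation from the patch. The mask helper is a simplified reading of how an API like `VectorMask.indexInRange` could build on the index vector for a tail loop, assuming a lane stays set when `offset + lane` falls in `[0, limit)`.

```java
import java.util.Arrays;

// Scalar reference model only (assumption: int lanes); not the intrinsic itself.
final class IndexVectorSketch {

    // Lane i of the result is vec[i] + i * scale; the lane number i plays
    // the role of the constant "iota" vector {0, 1, 2, ...}.
    static int[] indexVector(int[] vec, int scale) {
        int[] result = new int[vec.length];
        for (int lane = 0; lane < vec.length; lane++) {
            result[lane] = vec[lane] + lane * scale;
        }
        return result;
    }

    // Hypothetical helper: a tail-loop style mask derived from the index vector.
    // A lane is set when offset + lane lies in [0, limit).
    static boolean[] inRangeMask(int offset, int limit, int lanes) {
        int[] broadcast = new int[lanes];
        Arrays.fill(broadcast, offset);            // vec = broadcast(offset)
        int[] index = indexVector(broadcast, 1);   // offset + iota * 1
        boolean[] mask = new boolean[lanes];
        for (int lane = 0; lane < lanes; lane++) {
            mask[lane] = index[lane] >= 0 && index[lane] < limit;
        }
        return mask;
    }

    public static void main(String[] args) {
        // prints [10, 12, 14, 16]
        System.out.println(Arrays.toString(indexVector(new int[] {10, 10, 10, 10}, 2)));
        // prints [true, true, false, false]: only lanes 6 and 7 are below limit 8
        System.out.println(Arrays.toString(inRangeMask(6, 8, 4)));
    }
}
```

With four lanes, an offset of 6 and a limit of 8, only the first two lanes remain active, which is the situation a vectorized tail loop has to handle on its last iteration.
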
Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Add the floating point support for VectorLoadConst and remove the VectorCast
 - Merge branch 'master' into JDK-8293409
 - 8293409: [vectorapi] Intrinsify VectorSupport.indexVector

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10332/files
  - new: https://git.openjdk.org/jdk/pull/10332/files/2ad157b6..53f042d3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10332&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10332&range=00-01

  Stats: 50675 lines in 1239 files changed: 30581 ins; 14395 del; 5699 mod
  Patch: https://git.openjdk.org/jdk/pull/10332.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/10332/head:pull/10332

PR: https://git.openjdk.org/jdk/pull/10332