On Thu, 13 Oct 2022 07:18:24 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> "`VectorSupport.indexVector()`" is used to compute a vector that contains
>> the index values based on a given vector and a scale value (`i.e. index =
>> vec + iota * scale`). This function is widely used in other APIs like
>> "`VectorMask.indexInRange`" which is useful to the tail loop vectorization.
>> And it can be easily implemented with the vector instructions.
>>
>> This patch adds the vector intrinsic implementation of it. The steps are:
>>
>> 1) Load the const "iota" vector.
>>
>> We extend the "`vector_iota_indices`" stubs from byte to other integral
>> types. For floating point vectors, it needs an additional vector cast to get
>> the right iota values.
>>
>> 2) Compute indexes with "`vec + iota * scale`"
>>
>> Here is the performance result to the new added micro benchmark on ARM NEON:
>>
>> Benchmark                                 Gain
>> IndexVectorBenchmark.byteIndexVector      1.477
>> IndexVectorBenchmark.doubleIndexVector    5.031
>> IndexVectorBenchmark.floatIndexVector     5.342
>> IndexVectorBenchmark.intIndexVector       5.529
>> IndexVectorBenchmark.longIndexVector      3.177
>> IndexVectorBenchmark.shortIndexVector     5.841
>>
>>
>> Please help to review and share the feedback! Thanks in advance!
>
> src/hotspot/share/opto/vectorIntrinsics.cpp line 2978:
>
>> 2976: case T_DOUBLE: {
>> 2977: scale = gvn().transform(new ConvI2LNode(scale));
>> 2978: scale = gvn().transform(new ConvL2DNode(scale));
>
> Any specific reason for not directly using ConvI2D for double case.

Good catch, I think it's ok to use ConvI2D here. I will change this. Thanks!

-------------

PR: https://git.openjdk.org/jdk/pull/10332
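
For readers following the thread, here is a minimal scalar sketch of the computation quoted above (`index = vec + iota * scale`), with iota taken to be the lane numbers 0, 1, 2, ... The class and method names are hypothetical and are not part of the patch or of the Vector API. It only models the arithmetic, not the intrinsic itself. The double variant converts the int scale to double in a single step, which yields the same value as going through long first.

```java
import java.util.Arrays;

// Hypothetical illustration only: a scalar model of the per-lane arithmetic.
public class IndexVectorSketch {

    // index[lane] = vec[lane] + lane * scale, where lane plays the role of iota.
    static int[] indexVector(int[] vec, int scale) {
        int[] result = new int[vec.length];
        for (int lane = 0; lane < vec.length; lane++) {
            result[lane] = vec[lane] + lane * scale;
        }
        return result;
    }

    // Double lanes: the int scale is widened to double directly.
    // Every int is exactly representable as a double, so converting
    // int -> long -> double gives the same value as int -> double.
    static double[] indexVector(double[] vec, int scale) {
        double s = scale;
        double[] result = new double[vec.length];
        for (int lane = 0; lane < vec.length; lane++) {
            result[lane] = vec[lane] + lane * s;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ints = {10, 10, 10, 10};
        System.out.println(Arrays.toString(indexVector(ints, 3)));
        // prints [10, 13, 16, 19]

        double[] doubles = {0.5, 0.5, 0.5, 0.5};
        System.out.println(Arrays.toString(indexVector(doubles, 2)));
        // prints [0.5, 2.5, 4.5, 6.5]
    }
}
```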