It certainly can't be called once - it's reading different data each time. There might be a faster way to do it, I don't know. Do you have ideas?
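To illustrate why the call cannot simply be hoisted as written, here is a minimal sketch using plain java.nio.ByteBuffer rather than Parquet's actual input stream class. The helper sliceNext is hypothetical: it stands in for in.slice(bitWidth), under the assumption that each call returns a view of the next bitWidth bytes and advances the stream. The second half shows one possible idea, untested against Spark itself: take a single slice covering the whole run and step an offset through it.

```java
import java.nio.ByteBuffer;

public class SliceDemo {
    // Hypothetical stand-in for in.slice(length): returns a view of the next
    // `length` bytes and advances the source position past them.
    public static ByteBuffer sliceNext(ByteBuffer in, int length) {
        ByteBuffer view = in.duplicate();
        view.limit(view.position() + length);
        in.position(in.position() + length);
        return view.slice();
    }

    public static void main(String[] args) {
        int bitWidth = 3;  // 8 values * 3 bits = 3 bytes per packed group
        int numGroups = 2;
        byte[] data = {1, 2, 3, 4, 5, 6};

        // Per-group slicing, as in the quoted loop: each call sees new bytes.
        ByteBuffer in = ByteBuffer.wrap(data);
        ByteBuffer first = sliceNext(in, bitWidth);
        ByteBuffer second = sliceNext(in, bitWidth);
        System.out.println(first.get(0) + " " + second.get(0));

        // Possible alternative: one slice for the whole run, then advance an
        // offset by bitWidth bytes per group instead of slicing again.
        ByteBuffer whole = sliceNext(ByteBuffer.wrap(data), numGroups * bitWidth);
        for (int g = 0; g < numGroups; g++) {
            int offset = whole.position() + g * bitWidth;
            System.out.println("group " + g + " starts with byte " + whole.get(offset));
        }
    }
}
```

Since unpack8Values already takes a buffer and a start offset, the single-slice variant might fit the existing call, but whether it is actually faster would need profiling.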
On Sun, Sep 13, 2020 at 9:25 PM Chang Chen <baibaic...@gmail.com> wrote:
>
> Hi experts,
>
> It looks like there is a hot spot in VectorizedRleValuesReader#readNextGroup():
>
>   case PACKED:
>     int numGroups = header >>> 1;
>     this.currentCount = numGroups * 8;
>
>     if (this.currentBuffer.length < this.currentCount) {
>       this.currentBuffer = new int[this.currentCount];
>     }
>     currentBufferIdx = 0;
>     int valueIndex = 0;
>     while (valueIndex < this.currentCount) {
>       // values are bit packed 8 at a time, so reading bitWidth will always work
>       ByteBuffer buffer = in.slice(bitWidth);
>       this.packer.unpack8Values(buffer, buffer.position(), this.currentBuffer,
>           valueIndex);
>       valueIndex += 8;
>     }
>
> According to my profiling, about 30% of the time in readNextGroup() is spent
> in slice. Why can't we call slice outside the loop?