It certainly can't be called just once as written - each iteration reads different data.
There might be a faster way to do it, but I don't know of one. Do you have ideas?

On Sun, Sep 13, 2020 at 9:25 PM Chang Chen <baibaic...@gmail.com> wrote:
>
> Hi experts,
>
> It looks like there is a hot spot in VectorizedRleValuesReader#readNextGroup():
>
> case PACKED:
>   int numGroups = header >>> 1;
>   this.currentCount = numGroups * 8;
>
>   if (this.currentBuffer.length < this.currentCount) {
>     this.currentBuffer = new int[this.currentCount];
>   }
>   currentBufferIdx = 0;
>   int valueIndex = 0;
>   while (valueIndex < this.currentCount) {
>     // values are bit packed 8 at a time, so reading bitWidth will always work
>     ByteBuffer buffer = in.slice(bitWidth);
>     this.packer.unpack8Values(buffer, buffer.position(), this.currentBuffer, valueIndex);
>     valueIndex += 8;
>   }
>
>
> Per my profile, the code spends 30% of the time in readNextGroup() on slice.
> Why can't we call slice outside of the loop?
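
For what it's worth, here is a rough sketch of one direction that might be worth profiling. Each iteration does read a different bitWidth-byte chunk, so the existing call cannot simply be hoisted. What might work is taking one larger slice that covers the whole packed run (numGroups * bitWidth bytes) and advancing an offset by bitWidth per group. This is standalone code, not Spark or Parquet code: the class name, both read methods, and the unpack8Values helper are hypothetical stand-ins written for illustration, and whether it is actually faster inside readNextGroup() is untested.

```java
import java.nio.ByteBuffer;

class PackedRunSketch {
    // Hypothetical stand-in for the real unpacker: decodes 8 values of bitWidth bits
    // each, least-significant bit first, starting at byte offset inPos.
    static void unpack8Values(ByteBuffer in, int inPos, int bitWidth, int[] out, int outPos) {
        int bitPos = 0;
        for (int i = 0; i < 8; i++) {
            int value = 0;
            for (int b = 0; b < bitWidth; b++, bitPos++) {
                int byteVal = in.get(inPos + (bitPos >>> 3)) & 0xFF;
                int bit = (byteVal >>> (bitPos & 7)) & 1;
                value |= bit << b;
            }
            out[outPos + i] = value;
        }
    }

    // Mirrors the quoted loop: a new ByteBuffer view is created for every 8 values.
    static int[] readSlicingPerGroup(ByteBuffer in, int numGroups, int bitWidth) {
        int[] out = new int[numGroups * 8];
        for (int g = 0; g < numGroups; g++) {
            ByteBuffer group = in.slice();
            group.limit(bitWidth);
            in.position(in.position() + bitWidth);
            unpack8Values(group, group.position(), bitWidth, out, g * 8);
        }
        return out;
    }

    // Assumed alternative: one view over the whole run, then walk it by offset.
    static int[] readSlicingOnce(ByteBuffer in, int numGroups, int bitWidth) {
        int[] out = new int[numGroups * 8];
        int runBytes = numGroups * bitWidth; // 8 values * bitWidth bits = bitWidth bytes per group
        ByteBuffer run = in.slice();
        run.limit(runBytes);
        in.position(in.position() + runBytes);
        for (int g = 0, offset = 0; g < numGroups; g++, offset += bitWidth) {
            unpack8Values(run, offset, bitWidth, out, g * 8);
        }
        return out;
    }
}
```

Both methods consume the same bytes and should produce the same values. The second one only works if the whole run can be exposed as a single contiguous buffer, which may not hold for the input stream used in the real reader, so that would need checking first.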
