andygrove commented on code in PR #2235:
URL: https://github.com/apache/datafusion-comet/pull/2235#discussion_r2304332108
##########
spark/src/main/scala/org/apache/spark/sql/comet/execution/shuffle/NativeBatchDecoderIterator.scala:
##########
@@ -182,14 +182,22 @@ case class NativeBatchDecoderIterator(
currentBatch = null
}
in.close()
+ resetDataBuf()
isClosed = true
}
}
}
}
object NativeBatchDecoderIterator {
+
+ private val INITIAL_BUFFER_SIZE = 128 * 1024
+
private val threadLocalDataBuf: ThreadLocal[ByteBuffer] =
ThreadLocal.withInitial(() => {
- ByteBuffer.allocateDirect(128 * 1024)
+ ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE)
})
+
+ private def resetDataBuf(): Unit = {
+ threadLocalDataBuf.set(ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))
Review Comment:
I tested locally with TPC-H q1 and saw this reallocation happen 353 times,
even though the buffer never grew beyond its initial allocation, so the
current implementation adds unnecessary overhead.
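
One possible way to avoid the redundant allocations, sketched here as a suggestion rather than as the change the PR ended up with, is to reallocate only when the thread-local buffer has actually grown past `INITIAL_BUFFER_SIZE`. The `DataBufSketch` object, the `ensureCapacity` helper, and the `Boolean` return value are hypothetical and exist only to make the example self-contained; the real growth path in `NativeBatchDecoderIterator` may look different.

```scala
import java.nio.ByteBuffer

// Hypothetical standalone sketch; not the actual Comet implementation.
object DataBufSketch {
  val INITIAL_BUFFER_SIZE: Int = 128 * 1024

  private val threadLocalDataBuf: ThreadLocal[ByteBuffer] =
    ThreadLocal.withInitial[ByteBuffer](() => ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))

  def currentCapacity: Int = threadLocalDataBuf.get().capacity()

  // Assumed stand-in for whatever code grows the buffer for a large batch.
  def ensureCapacity(required: Int): Unit = {
    if (threadLocalDataBuf.get().capacity() < required) {
      threadLocalDataBuf.set(ByteBuffer.allocateDirect(required))
    }
  }

  // Shrink back to the initial size only if the buffer has grown.
  // Returns true when a new buffer was allocated (returned for illustration only).
  def resetDataBuf(): Boolean = {
    if (threadLocalDataBuf.get().capacity() > INITIAL_BUFFER_SIZE) {
      threadLocalDataBuf.set(ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))
      true
    } else {
      false
    }
  }
}
```

With a guard like this, closing an iterator whose buffer never grew would skip the direct allocation entirely, while an oversized buffer would still be released.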
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]