wankunde commented on code in PR #41782: URL: https://github.com/apache/spark/pull/41782#discussion_r1299497876
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -487,6 +487,25 @@ object SQLConf {
       .intConf
       .createWithDefault(10000)
+  val VECTORIZED_HUGE_VECTOR_RESERVE_RATIO =
+    buildConf("spark.sql.inMemoryColumnarStorage.hugeVectorReserveRatio")
+      .doc("spark will reserve requiredCapacity * this ratio memory next time. This is only " +
+        "effective when spark.sql.inMemoryColumnarStorage.hugeVectorThreshold > 0 and required " +
+        "memory larger than that threshold.")
+      .version("3.5.0")
+      .doubleConf
+      .createWithDefault(1.2)
+
+  val VECTORIZED_HUGE_VECTOR_THRESHOLD =
+    buildConf("spark.sql.inMemoryColumnarStorage.hugeVectorThreshold")
+      .doc("When the in memory column vector is larger than this, spark will reserve " +
+        s"requiredCapacity * ${VECTORIZED_HUGE_VECTOR_RESERVE_RATIO.key} memory next time and " +
+        "free this column vector before reading next batch data. -1 means disabling the " +
+        "optimization.")
+      .version("3.5.0")
+      .bytesConf(ByteUnit.BYTE)
+      .createWithDefault(-1)

Review Comment:
   ![image](https://github.com/apache/spark/assets/3626747/7da2b853-e585-4244-9ddd-e445733d30e7)

   When VECTORIZED_HUGE_VECTOR_THRESHOLD = 1, there are two UT failures, as expected.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: reviews-h...@spark.apache.org