jiangjiangtian opened a new issue, #10451: URL: https://github.com/apache/incubator-gluten/issues/10451
### Description

In https://github.com/apache/incubator-gluten/blob/main/backends-velox/src/main/scala/org/apache/gluten/execution/ColumnarPartialProjectExec.scala, I noticed that the entire batch is exported to Arrow format. Gluten currently uses Arrow 15.0.0, which does not implement StringView in Java. So if the batch contains any string column, the export causes high memory pressure, because the strings have to be gathered into a buffer.

I found that Arrow 17.0.0 introduced a StringView implementation in Java:

https://arrow.apache.org/release/17.0.0.html
https://github.com/apache/arrow/issues/40339

Does Gluten have any plan to upgrade to Arrow 17.0.0? Alternatively, we could implement a class named ArrowConverter that controls the number of rows exported at a time, although that would have an impact on performance.

### Gluten version

None
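
To make the ArrowConverter idea from the description more concrete, here is a minimal sketch of the row-limiting part only. It assumes the converter can learn the byte length of each row's string data up front and then exports one row range at a time. The class `RowSlicer`, the method `slice`, and the `maxBytes` budget are hypothetical names for illustration. They are not part of Gluten or Arrow, and the actual export to Arrow is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Gluten or Arrow.
final class RowSlicer {
    private RowSlicer() {}

    /**
     * Splits rows [0, n) into consecutive [start, end) ranges whose summed
     * string byte lengths stay within maxBytes. A single row that is larger
     * than maxBytes is placed in a range of its own.
     */
    static List<int[]> slice(int[] rowByteLengths, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 0;   // first row of the range being built
        long bytes = 0;  // string bytes accumulated in that range
        for (int row = 0; row < rowByteLengths.length; row++) {
            int len = rowByteLengths[row];
            // Close the current range if adding this row would exceed the
            // budget, but never emit an empty range.
            if (row > start && bytes + len > maxBytes) {
                ranges.add(new int[] {start, row});
                start = row;
                bytes = 0;
            }
            bytes += len;
        }
        if (start < rowByteLengths.length) {
            ranges.add(new int[] {start, rowByteLengths.length});
        }
        return ranges;
    }

    public static void main(String[] args) {
        int[] lengths = {4, 4, 4, 4, 20, 1};
        for (int[] r : slice(lengths, 8)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Each returned range would be exported as its own smaller Arrow batch, so the gathered string buffer stays near the budget. The trade-off is the one mentioned above: more, smaller exports mean more per-batch overhead.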
