GitHub user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943

@henrify I took a look at how the ORC batch represents string/binary columns. The data is stored in a `byte[][]`, which is not one contiguous byte array, so we can't copy a whole column with a single copy. For better performance, I think we would need to use the low-level ORC reader API. We can consider that in the future.
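To illustrate the point about `byte[][]`, here is a minimal, dependency-free sketch of what packing such a column into one contiguous buffer looks like. It assumes each row is described by a backing array plus a start offset and a length, which is how I understand the public `vector`, `start`, and `length` fields of ORC's `BytesColumnVector` to work. The class name `ByteColumnCopySketch` and the method `packRows` are hypothetical and are not part of Spark or ORC.

```java
import java.nio.charset.StandardCharsets;

public class ByteColumnCopySketch {

    /**
     * Packs every row into one contiguous array.
     * offsetsOut must have vector.length + 1 entries; on return,
     * row i occupies packed[offsetsOut[i] .. offsetsOut[i + 1]).
     */
    static byte[] packRows(byte[][] vector, int[] start, int[] length, int[] offsetsOut) {
        int rowCount = vector.length;

        // First pass: compute where each row will land in the packed buffer.
        int total = 0;
        for (int i = 0; i < rowCount; i++) {
            offsetsOut[i] = total;
            total += length[i];
        }
        offsetsOut[rowCount] = total;

        // Second pass: one arraycopy per row. Rows may live in different
        // backing arrays, so a single bulk copy of the column is not possible.
        byte[] packed = new byte[total];
        for (int i = 0; i < rowCount; i++) {
            System.arraycopy(vector[i], start[i], packed, offsetsOut[i], length[i]);
        }
        return packed;
    }

    public static void main(String[] args) {
        // Two rows share one backing array; the third has its own.
        byte[] shared = "sparkorc".getBytes(StandardCharsets.UTF_8);
        byte[][] vector = { shared, shared, "batch".getBytes(StandardCharsets.UTF_8) };
        int[] start = { 0, 5, 0 };
        int[] length = { 5, 3, 5 };
        int[] offsets = new int[vector.length + 1];

        byte[] packed = packRows(vector, start, length, offsets);
        System.out.println(new String(packed, StandardCharsets.UTF_8));
    }
}
```

The per-row `System.arraycopy` in the second loop is the cost being discussed: with a `byte[][]` there is no way to replace that loop with one call.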