GitHub user rdblue commented on the issue:

https://github.com/apache/spark/pull/21070

@mswit-databricks, I wouldn't worry about that. We've limited the length of binary and string fields. In the next version of Parquet, we're planning to release page indexes, which store lower and upper bounds rather than exact min and max values. That gives us more flexibility to shorten values and avoids the case you're worried about.
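To illustrate why bounds are easier to shorten than exact min and max values, here is a minimal Python sketch. It is not Parquet's implementation: the function names `truncate_lower` and `truncate_upper` and the `max_len` parameter are hypothetical, and it assumes values are compared as unsigned bytes in lexicographic order.

```python
def truncate_lower(value: bytes, max_len: int) -> bytes:
    """Return a lower bound for `value` that is at most `max_len` bytes long."""
    # A prefix always sorts at or before the full value in byte order,
    # so it is a valid (if looser) lower bound.
    return value[:max_len]


def truncate_upper(value: bytes, max_len: int) -> bytes:
    """Return an upper bound for `value`, shortened to `max_len` bytes if possible."""
    if len(value) <= max_len:
        return value  # already short enough; keep the exact value
    prefix = bytearray(value[:max_len])
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while prefix and prefix[-1] == 0xFF:
        prefix.pop()
    if not prefix:
        # Nothing left to increment: no shorter upper bound exists
        # under this scheme, so fall back to the full value.
        return value
    # Incrementing the last remaining byte makes the prefix sort
    # strictly after every value that starts with the original prefix.
    prefix[-1] += 1
    return bytes(prefix)


# Hypothetical long min and max values for one page of a string column.
page_min = b"aardvark-with-a-very-long-suffix"
page_max = b"zebra-with-a-very-long-suffix"

lower = truncate_lower(page_min, 4)
upper = truncate_upper(page_max, 4)
```

The shortened bounds are no longer values that occur in the data, but every value in the page still falls between `lower` and `upper`, so a reader can still use them to skip pages that cannot match a filter.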