Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20013

    > I wonder which changes double the disk usage?

    It's the new indices, more explicitly the values, not the keys. I tried changing the disk layout to write all the indices in a new namespace with a very short key length, and that didn't change the resulting store size at all.
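To illustrate why shortening the keys would barely move the total when the values dominate, here is a rough back-of-the-envelope sketch. Every number in it is hypothetical and chosen only for illustration (none were measured on the actual store), and the model ignores compression and per-record overhead.

```python
def index_bytes(entries, key_len, value_len):
    """Approximate raw bytes for `entries` index records (no compression)."""
    return entries * (key_len + value_len)


# Hypothetical figures, not taken from the PR or from any real store.
ENTRIES = 1_000_000   # assumed number of index entries
VALUE_LEN = 400       # assumed bytes per index value
LONG_KEY = 40         # assumed key length in the original layout
SHORT_KEY = 4         # assumed key length in the short-key namespace

before = index_bytes(ENTRIES, LONG_KEY, VALUE_LEN)
after = index_bytes(ENTRIES, SHORT_KEY, VALUE_LEN)
saving = 1 - after / before  # fraction of bytes saved by shorter keys

print(f"before={before} bytes, after={after} bytes, saving={saving:.1%}")
```

Under these assumed sizes the keys account for only a small share of each record, so cutting them by 90% saves well under a tenth of the total. Any key compression in the underlying store would shrink that difference further.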