Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/20013
> I wonder which changes double the disk usage?
It's the new indices, and more specifically their values, not their keys. I
tried changing the disk layout to write all the indices in a new namespace with
very short keys, and that didn't change the resulting store size at all.