On 03/06/2015 12:27 AM, Jianshi Huang wrote:
Hi,
I understand that for columns where the value is null, Parquet skips it in
the encoding, so it doesn't take storage space.
Is that also the case in a Map column? I think the key will always be
encoded even when its value is null.
Is that correct?
That is correct. The keys must be encoded so that the map can be
reconstructed. { 'a': null } isn't the same as { }, so we have to
encode 'a' even if your application handles missing entries and null
values the same way.
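To illustrate the distinction, here is a small sketch that uses plain Python dicts as stand-ins for a map value (an assumption for illustration only; this is not Parquet's encoding or API). The key survives in one map and not in the other, which is why it has to be written out:

```python
# Plain Python dicts standing in for the value of a map column.
with_null_value = {"a": None}   # key 'a' is present, its value is null
empty_map = {}                  # key 'a' is absent

keys_with_null = list(with_null_value.keys())
keys_empty = list(empty_map.keys())
maps_are_equal = with_null_value == empty_map

print(keys_with_null)   # the key is still there
print(keys_empty)       # no keys at all
print(maps_are_equal)
```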
If you don't want to encode them, then you can remove the keys for null
values from your map before you store the map.
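As a rough sketch of that cleanup step, assuming the map is available as a Python dict before it is handed to whatever writer you use (the helper name drop_null_values is hypothetical, not part of any Parquet library):

```python
from typing import Any, Dict, Optional


def drop_null_values(m: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Return a copy of `m` without the entries whose value is None."""
    return {k: v for k, v in m.items() if v is not None}


record = {"a": None, "b": 1, "c": None}
cleaned = drop_null_values(record)
print(cleaned)
```

Note that after this step a reader can no longer tell a null value apart from a missing key, so it only fits applications that treat the two the same.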
rb
--
Ryan Blue
Software Engineer
Cloudera, Inc.