Ian Barfield created PARQUET-71:
-----------------------------------
Summary: column chunk page write store log message displays incorrect information
Key: PARQUET-71
URL: https://issues.apache.org/jira/browse/PARQUET-71
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Reporter: Ian Barfield
Priority: Minor
The log message prints the size of the dictionary (the number of keys) twice,
and labels the second occurrence the 'compressed byte count'. An accurate value
for that number would be very helpful when accounting for disk space usage. The
actual 'compressed byte count' is calculated close to that point in the code,
so I am guessing this is a simple mistake.
See:
https://github.com/apache/incubator-parquet-mr/blob/master/parquet-hadoop/src/main/java/parquet/hadoop/ColumnChunkPageWriteStore.java#L152
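
To illustrate the kind of mistake described above, here is a minimal Java sketch. It is not the actual ColumnChunkPageWriteStore code: the class name, method names, message wording, and sample values are all hypothetical and chosen only to show how passing the key count for both placeholders yields a misleading 'compressed byte count'.

```java
import java.util.Locale;

// Hypothetical sketch of the reported class of bug; not taken from parquet-mr.
class DictionaryLogSketch {

    // Buggy variant: the entry count is passed for both placeholders, so the
    // value labelled "compressed" is really the number of dictionary keys.
    static String buggyMessage(int dictionaryEntries, long compressedBytes) {
        return String.format(Locale.ROOT,
                "dictionary page: %d entries, %dB compressed",
                dictionaryEntries, dictionaryEntries);
    }

    // Fixed variant: the second placeholder receives the compressed byte count.
    static String fixedMessage(int dictionaryEntries, long compressedBytes) {
        return String.format(Locale.ROOT,
                "dictionary page: %d entries, %dB compressed",
                dictionaryEntries, compressedBytes);
    }

    public static void main(String[] args) {
        int entries = 42;        // assumed sample value
        long compressed = 1337L; // assumed sample value
        System.out.println(buggyMessage(entries, compressed));
        System.out.println(fixedMessage(entries, compressed));
    }
}
```

If the real code follows this pattern, the fix would presumably amount to swapping the second argument for the compressed size that is already computed nearby.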