Swapnil created PARQUET-196:
-------------------------------
Summary: parquet-tools command to get rowcount & size
Key: PARQUET-196
URL: https://issues.apache.org/jira/browse/PARQUET-196
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Affects Versions: parquet-mr_1.6.0
Reporter: Swapnil
Priority: Minor
Fix For: parquet-mr_1.6.0
Parquet files contain metadata about row count and file size. We should have
new commands to get the row count and size.
These commands can be added to parquet-tools:
1. rowcount : This should add up the number of rows in all footers to give the
total number of rows in the data.
2. size : This should give the compressed size in bytes and in human-readable
format.
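A rough sketch of the arithmetic behind the two commands, to make the proposal concrete. It is not parquet-tools code: the class and method names (FooterStats, sum, humanReadable) are hypothetical, the per-footer numbers are made-up sample values rather than anything read from a real file, and the choice of 1024-based units is an assumption.

```java
import java.util.Locale;

// Hypothetical helper, not part of parquet-mr or parquet-tools.
public class FooterStats {

    // Adds up per-footer values (row counts or compressed sizes in bytes).
    // In a real command these would come from each file's footer metadata.
    static long sum(long[] perFooterValues) {
        long total = 0;
        for (long value : perFooterValues) {
            total = Math.addExact(total, value); // throws on long overflow
        }
        return total;
    }

    // Formats a byte count using 1024-based units (an assumption; the issue
    // does not say which unit convention the size command should use).
    static String humanReadable(long bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB", "PB"};
        if (bytes < 1024) {
            return bytes + " B";
        }
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.1f %s", value, units[unit]);
    }

    public static void main(String[] args) {
        // Sample values standing in for three footers.
        long[] rowsPerFooter = {1000, 2500, 500};
        long[] compressedBytesPerFooter = {512, 1024, 0};

        long totalBytes = sum(compressedBytesPerFooter);
        System.out.println("Total rows: " + sum(rowsPerFooter));
        System.out.println("Total size: " + totalBytes + " bytes ("
                + humanReadable(totalBytes) + ")");
    }
}
```

With the sample values above, the row total is 4000 and the size total is 1536 bytes, which formats as 1.5 KB.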
These commands help us avoid parsing job logs or loading the data again to
find the number of rows. This comes in very handy in complex processes, stats
generation, QA, etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)