Ryan Blue created PARQUET-166:
---------------------------------
Summary: Validate parquet row group size and HDFS block size
Key: PARQUET-166
URL: https://issues.apache.org/jira/browse/PARQUET-166
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Reporter: Ryan Blue
The OutputFormat should verify that {{parquet.block.size < dfs.blocksize}},
because a row group larger than an HDFS block has to span more than one
block, which hurts read performance. In addition, we could check that
{{(dfs.blocksize % parquet.block.size) < 1MB}} to ensure that a whole number
of row groups approximately fills one HDFS block.
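
A minimal sketch of the two checks, assuming both sizes are already available
as byte counts. The class and method names ({{BlockSizeCheck}},
{{fitsInBlock}}, {{packsEvenly}}) are hypothetical and are not part of
parquet-mr; how the OutputFormat would read the two settings and whether it
should warn or fail is left open here.

```java
// Hypothetical helper for illustration only; not an existing parquet-mr class.
final class BlockSizeCheck {
    // Tolerance from the issue text: at most 1 MB may be left over per block.
    static final long ONE_MB = 1024L * 1024L;

    private BlockSizeCheck() {}

    /** First check: parquet.block.size < dfs.blocksize (both in bytes). */
    static boolean fitsInBlock(long rowGroupSize, long dfsBlockSize) {
        return rowGroupSize > 0 && rowGroupSize < dfsBlockSize;
    }

    /**
     * Second check: (dfs.blocksize % parquet.block.size) < 1MB, meaning a
     * whole number of row groups leaves less than 1 MB of the block unused.
     */
    static boolean packsEvenly(long rowGroupSize, long dfsBlockSize) {
        return rowGroupSize > 0 && (dfsBlockSize % rowGroupSize) < ONE_MB;
    }
}
```

For example, with a 128 MB HDFS block, a 64 MB row group passes both checks,
while a 100 MB row group passes the first but leaves a 28 MB remainder and
fails the second.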
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)