Ryan Blue created PARQUET-166:
---------------------------------

             Summary: Validate parquet row group size and HDFS block size
                 Key: PARQUET-166
                 URL: https://issues.apache.org/jira/browse/PARQUET-166
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
            Reporter: Ryan Blue


The OutputFormat should verify that {{parquet.block.size < dfs.blocksize}}, 
since a row group that does not fit within a single HDFS block leads to bad 
performance. In addition, we could check that 
{{(dfs.blocksize % parquet.block.size) < 1MB}} to ensure that a whole number 
of row groups fills an HDFS block almost exactly.
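
A minimal sketch of the two checks described above, written as plain Java 
with no Hadoop or Parquet dependency. The class name {{BlockSizeCheck}}, the 
method {{check}}, and the warning messages are hypothetical and exist only 
for illustration; they are not part of parquet-mr. The sketch follows the 
strict inequality as written above, so equal sizes are flagged; whether 
equality should be allowed is a separate decision.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of parquet-mr.
final class BlockSizeCheck {
    private static final long ONE_MB = 1024L * 1024L;

    // Returns a list of warnings; an empty list means both checks passed.
    // rowGroupSize stands in for parquet.block.size, dfsBlockSize for dfs.blocksize.
    static List<String> check(long rowGroupSize, long dfsBlockSize) {
        if (rowGroupSize <= 0 || dfsBlockSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        List<String> warnings = new ArrayList<>();
        if (rowGroupSize >= dfsBlockSize) {
            // Check 1: parquet.block.size < dfs.blocksize
            warnings.add("parquet.block.size (" + rowGroupSize
                + ") is not smaller than dfs.blocksize (" + dfsBlockSize + ")");
        } else if (dfsBlockSize % rowGroupSize >= ONE_MB) {
            // Check 2: (dfs.blocksize % parquet.block.size) < 1MB
            warnings.add("row groups of " + rowGroupSize
                + " bytes leave " + (dfsBlockSize % rowGroupSize)
                + " bytes unused per HDFS block of " + dfsBlockSize + " bytes");
        }
        return warnings;
    }

    public static void main(String[] args) {
        System.out.println(check(64 * ONE_MB, 128 * ONE_MB));
        System.out.println(check(100 * ONE_MB, 128 * ONE_MB));
        System.out.println(check(256 * ONE_MB, 128 * ONE_MB));
    }
}
```

With a 128 MB HDFS block, a 64 MB row group passes both checks, a 100 MB 
row group fails the second check (28 MB remainder), and a 256 MB row group 
fails the first.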



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
