ability to check partition size for dynamic partitions
------------------------------------------------------

                 Key: HIVE-1635
                 URL: https://issues.apache.org/jira/browse/HIVE-1635
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Ning Zhang
             Fix For: 0.7.0


With dynamic partitions, it becomes very easy to create partitions.

We have seen scenarios where corrupt data causes a large number of
partitions and files to be created: a single corrupt row can end up creating
a partition and a lot of files (one per mapper, if merge is false).

This puts a lot of load on the cluster and is a debugging nightmare.

It would be good to have a configuration parameter for the minimum number of
rows in a partition. If a partition has fewer rows than the threshold, the
partition need not be created. The default value of this parameter can be
zero for backward compatibility.
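
As a rough illustration of the proposed check (not Hive code), the sketch
below splits partitions into those to create and those to skip based on their
row counts. The function name, the min_rows argument, and the sample partition
names and counts are all hypothetical; the issue does not name the parameter.

```python
from typing import Dict, List, Tuple


def select_partitions(
    row_counts: Dict[str, int], min_rows: int = 0
) -> Tuple[List[str], List[str]]:
    """Split partitions into (to_create, to_skip) by row count.

    min_rows is a hypothetical threshold; the default of 0 keeps
    every partition, which matches the current behavior.
    """
    to_create: List[str] = []
    to_skip: List[str] = []
    for partition, rows in sorted(row_counts.items()):
        # Skip only when the count is strictly below the threshold.
        if rows < min_rows:
            to_skip.append(partition)
        else:
            to_create.append(partition)
    return to_create, to_skip


# Hypothetical row counts; the last entry stands in for a partition
# produced by a single corrupt row.
counts = {"ds=2010-09-13": 1200, "ds=2010-09-14": 980, "ds=#corrupt#": 1}

keep_default, skip_default = select_partitions(counts)
keep_strict, skip_strict = select_partitions(counts, min_rows=10)
```

With the default threshold of zero nothing is skipped, so existing queries
would behave as before. With a threshold of 10, only the one-row partition is
dropped.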

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.