ability to check partition size for dynamic partitions
------------------------------------------------------
Key: HIVE-1635
URL: https://issues.apache.org/jira/browse/HIVE-1635
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang
Fix For: 0.7.0
With dynamic partitions, it becomes very easy to create partitions.
We have seen scenarios where a large number of partitions and files get created
because of corrupt data: a single corrupt row can create a partition and many
files (as many as there are mappers, if merge is false).
This puts a lot of load on the cluster and is a debugging nightmare.
It would be good to have a configuration parameter for the minimum number of
rows in a partition. If the number of rows is less than the threshold, the
partition need not be created. The default value of this parameter can be zero
for backward compatibility.
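
A minimal sketch of the proposed check, to make the intended semantics concrete.
This is not Hive code: the class name, the method partitionsToCreate, the
minRows parameter, and the sample partition names are all hypothetical. It only
shows that partitions below the threshold are skipped and that a threshold of
zero keeps every partition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSizeCheck {

    /**
     * Returns the names of the partitions whose row count is at least minRows.
     * With minRows = 0 every partition is kept (the backward-compatible default).
     * Hypothetical helper; it does not correspond to an existing Hive API.
     */
    public static List<String> partitionsToCreate(Map<String, Long> rowCounts, long minRows) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Long> entry : rowCounts.entrySet()) {
            // Skip partitions that received fewer rows than the threshold.
            if (entry.getValue() >= minRows) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up row counts per dynamic partition; the last one stands in
        // for a partition produced by a single corrupt row.
        Map<String, Long> rowCounts = new LinkedHashMap<>();
        rowCounts.put("ds=2010-09-01", 120000L);
        rowCounts.put("ds=2010-09-02", 98000L);
        rowCounts.put("ds=garbage", 1L);

        // Threshold 0: all three partitions are kept.
        System.out.println(partitionsToCreate(rowCounts, 0));
        // Threshold 100: the one-row partition is dropped.
        System.out.println(partitionsToCreate(rowCounts, 100));
    }
}
```

In a real implementation the row counts would only be known after the rows have
been written, so the check would most likely run before the partitions are
committed; where exactly it belongs is left open here.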