Rajesh Balamohan created HIVE-24663:
---------------------------------------
Summary: Batch process in ColStatsProcessor
Key: HIVE-24663
URL: https://issues.apache.org/jira/browse/HIVE-24663
Project: Hive
Issue Type: Improvement
Reporter: Rajesh Balamohan

When a large number of partitions (>20K) are processed, ColStatsProcessor runs into DB issues. {{db.setPartitionColumnStatistics(request);}} gets stuck for hours, and in some cases Postgres stops processing.

It would be good to have ColStatsProcessor gather stats in small batches instead of issuing a single bulk update.

Ref:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
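
The batching proposed in the description could look roughly like the sketch below. It is only an illustration, not the actual patch: the class name {{BatchedStatsUpdate}}, the helper {{processInBatches}}, and the batch size of 1,000 are all hypothetical, and a plain {{Consumer}} stands in for building one request per batch and calling {{db.setPartitionColumnStatistics(request)}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper; not part of Hive.
class BatchedStatsUpdate {

    // Splits items into consecutive sublists of at most batchSize elements
    // and hands each sublist to the sink in order.
    // Returns the number of batches submitted.
    public static <T> int processInBatches(List<T> items, int batchSize,
                                           Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the subList view so the sink can safely hold on to the batch.
            sink.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for per-partition column statistics (assumption:
        // real code would hold Hive statistics objects here).
        List<String> partitionStats = new ArrayList<>();
        for (int i = 0; i < 25_000; i++) {
            partitionStats.add("part-" + i);
        }
        // Stand-in for one metastore update per batch; the batch size
        // of 1,000 is an arbitrary value chosen for illustration.
        int batches = processInBatches(partitionStats, 1_000,
                batch -> System.out.println("updating " + batch.size() + " partitions"));
        System.out.println("batches: " + batches);
    }
}
```

With 25,000 partitions and a batch size of 1,000, the sink is invoked 25 times instead of once. In practice the batch size would probably need to be configurable, and each batch may commit independently, so a failure partway through could leave earlier batches already applied.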