[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906608#comment-13906608 ]
Hive QA commented on HIVE-6455:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629854/HIVE-6455.4.patch

{color:red}ERROR:{color} -1 due to 70 failed/errored test(s), 5161 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity
org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testsequencefile
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_case
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_when
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1421/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1421/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 70 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629854

> Scalable dynamic partitioning and bucketing optimization
> --------------------------------------------------------
>
>                 Key: HIVE-6455
>                 URL: https://issues.apache.org/jira/browse/HIVE-6455
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: optimization
>         Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch,
> HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch
>
>
> The current implementation of dynamic partitioning works by keeping at least
> one record writer open per dynamic partition directory. With bucketing there
> can be multispray file writers, which further add to the number of open
> record writers. The record writers of column-oriented file formats (such as
> ORC and RCFile) keep in-memory buffers (value buffers or compression buffers)
> open all the time to buffer rows and compress them before flushing them to
> disk. Since these buffers are maintained on a per-column basis, the amount of
> constant memory required at runtime increases as the number of partitions and
> the number of columns per partition increase. This often leads to OutOfMemory
> (OOM) exceptions in mappers or reducers, depending on the number of open
> record writers. Users often tune the JVM heap size (runtime memory) to get
> around such OOM issues.
> With this optimization, the dynamic partition columns and bucketing columns
> (in the case of bucketed tables) are sorted before being fed to the reducers.
> Since the partitioning and bucketing columns are sorted, each reducer can
> keep only one record writer open at any time, thereby reducing the memory
> pressure on the reducers. This optimization scales well as the number of
> partitions and the number of columns per partition increase, at the cost of
> sorting the columns.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
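
The memory argument in the issue description can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Hive's implementation: the names `write_unsorted` and `write_sorted` are hypothetical, each "record writer" is modeled as an in-memory list, and rows are `(partition_key, value)` pairs. It compares the peak number of simultaneously open writers when rows arrive in arbitrary order with the peak when rows are sorted by partition key first.

```python
from itertools import groupby


def write_unsorted(rows):
    """Model the pre-optimization behavior: one writer stays open per
    partition key until all input has been consumed."""
    open_writers = {}  # partition key -> buffered values (hypothetical writer)
    peak = 0
    for key, value in rows:
        open_writers.setdefault(key, []).append(value)
        peak = max(peak, len(open_writers))
    return peak, open_writers


def write_sorted(rows):
    """Model the optimization: sort by partition key so that a writer can be
    closed as soon as the key changes, leaving one writer open at a time."""
    files = {}          # partition key -> values flushed on close
    open_writers = {}   # writers currently open
    peak = 0
    by_key = lambda row: row[0]
    # sorted() is stable, so values keep their input order within a key.
    for key, group in groupby(sorted(rows, key=by_key), key=by_key):
        open_writers[key] = [value for _, value in group]
        peak = max(peak, len(open_writers))
        files[key] = open_writers.pop(key)  # close before the next key starts
    return peak, files


if __name__ == "__main__":
    # 1000 rows spread round-robin over 50 hypothetical partitions.
    rows = [("ds=%d" % (i % 50), i) for i in range(1000)]
    print("unsorted peak open writers:", write_unsorted(rows)[0])
    print("sorted peak open writers:", write_sorted(rows)[0])
```

In this model the unsorted peak grows with the number of distinct partition keys, while the sorted peak stays constant; the trade-off, as the description notes, is the cost of the sort.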