[ https://issues.apache.org/jira/browse/HIVE-10036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372158#comment-14372158 ]
Hive QA commented on HIVE-10036:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706001/HIVE-10036.1.patch

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 7819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_delete
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_delete_own_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update_own_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_tmp_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_empty_files
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_after_multiple_inserts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_values_tmp_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_after_multiple_inserts
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestInStream.testCompressed
org.apache.hadoop.hive.ql.io.orc.TestInStream.testDisjointBuffers
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testMemoryManagementV12[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testMemoryManagementV12[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testEmptyStore[3]
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3095/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3095/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3095/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12706001 - PreCommit-HIVE-TRUNK-Build

> Writing ORC format big table causes OOM - too many fixed sized stream buffers
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-10036
>                 URL: https://issues.apache.org/jira/browse/HIVE-10036
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Selina Zhang
>            Assignee: Selina Zhang
>         Attachments: HIVE-10036.1.patch
>
>
> The ORC writer keeps multiple output streams for each column. Each output
> stream is allocated a fixed-size ByteBuffer (configurable, defaults to 256K).
> For a big table, the memory cost is unbearable, especially when HCatalog
> dynamic partitioning is involved: several hundred files may be open and
> written at the same time (FileSinkOperator has the same problem).
> The global ORC memory manager controls the buffer size, but it only kicks in
> at 5000-row intervals. An enhancement could be made there, but reducing the
> buffer size leads to worse compression and more I/O on the read path.
> Sacrificing read performance is never a good choice.
> I changed the fixed-size ByteBuffer to a dynamically growing buffer whose
> upper bound is the existing configurable buffer size. Most of the streams do
> not need a large buffer, so performance improved significantly. Compared to
> Facebook's hive-dwrf, I observed a 2x performance gain with this fix.
> Solving OOM for ORC completely may need a lot of effort, but this is
> definitely low-hanging fruit.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
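
For readers who want a concrete picture of the buffer change described in the issue, here is a minimal sketch of a growable buffer capped at a configured maximum. It is not taken from HIVE-10036.1.patch: the class name BoundedGrowableBuffer, its methods, and the doubling growth policy are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical illustration only; not the class from HIVE-10036.1.patch.
 * Starts with a small buffer and doubles it on demand, never exceeding
 * maxCapacity (standing in for the configured ORC buffer size).
 */
final class BoundedGrowableBuffer {
    private final int maxCapacity;
    private ByteBuffer buffer;

    BoundedGrowableBuffer(int initialCapacity, int maxCapacity) {
        if (initialCapacity <= 0 || initialCapacity > maxCapacity) {
            throw new IllegalArgumentException("need 0 < initialCapacity <= maxCapacity");
        }
        this.maxCapacity = maxCapacity;
        this.buffer = ByteBuffer.allocate(initialCapacity);
    }

    /** Copies as many bytes as fit under the cap and returns how many were written. */
    int write(byte[] src, int offset, int length) {
        ensureCapacity((long) buffer.position() + length);
        int n = Math.min(length, buffer.remaining());
        buffer.put(src, offset, n);
        return n;
    }

    /** Grows by doubling (an assumed policy) until the request fits or the cap is hit. */
    private void ensureCapacity(long needed) {
        if (needed <= buffer.capacity() || buffer.capacity() == maxCapacity) {
            return;
        }
        long target = buffer.capacity();
        while (target < needed && target < maxCapacity) {
            target *= 2;
        }
        ByteBuffer grown = ByteBuffer.allocate((int) Math.min(target, maxCapacity));
        buffer.flip();      // switch the old buffer to read mode
        grown.put(buffer);  // copy the bytes written so far
        buffer = grown;
    }

    int capacity() { return buffer.capacity(); }

    int size() { return buffer.position(); }

    boolean isFull() { return buffer.position() == maxCapacity; }
}
```

In this sketch a stream that only ever holds a few bytes keeps its small initial allocation, while a busy stream grows until it reaches the cap, at which point the caller would have to flush (the short return value from write signals that).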