[ https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111095#comment-16111095 ]
Hive QA commented on HIVE-16896:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880033/HIVE-16896.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11041 tests executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin] (batchId=90)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6232/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6232/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6232/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880033 - PreCommit-HIVE-Build

> move replication load related work in semantic analysis phase to execution phase using a task
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16896
>                 URL: https://issues.apache.org/jira/browse/HIVE-16896
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: anishek
>            Assignee: anishek
>         Attachments: HIVE-16896.1.patch, HIVE-16896.2.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while loading data. Currently we load all the files in the bootstrap dump location as {{FileStatus[]}} and then iterate over them to load objects. We should instead move to
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which would internally batch and return values.
>
> Additionally, since we can't hand off partial tasks from the analysis phase to the execution phase, we are going to move the whole repl load functionality to the execution phase so we can better control the creation/execution of tasks (not related to hive {{Task}}; we may get rid of ReplCopyTask).
>
> An additional consideration to take into account at the end of this jira is whether we want to specifically do a multi-threaded load of the bootstrap dump.
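
The move from an eagerly loaded {{FileStatus[]}} to an iterator, as described in the issue, can be illustrated with a minimal sketch. It is not the patch's code: `RemoteIterator` below is a local stand-in that only mirrors the two methods of `org.apache.hadoop.fs.RemoteIterator`, so the example runs on a plain JDK. The names `BatchedLoadSketch`, `BatchHandler`, `loadInBatches` and `fromIterator`, the batch size, and the file names are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class BatchedLoadSketch {

    /** Local stand-in with the same shape as org.apache.hadoop.fs.RemoteIterator. */
    interface RemoteIterator<T> {
        boolean hasNext() throws IOException;
        T next() throws IOException;
    }

    /** Hypothetical callback that turns one batch of dump entries into load work. */
    interface BatchHandler {
        void handle(List<String> batch) throws IOException;
    }

    /**
     * Drains the iterator while holding at most batchSize entries at a time,
     * instead of materializing the whole listing as an array first.
     * Returns the number of batches passed to the handler.
     */
    static int loadInBatches(RemoteIterator<String> files, int batchSize,
                             BatchHandler handler) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (files.hasNext()) {
            batch.add(files.next());
            if (batch.size() == batchSize) {
                handler.handle(batch);
                batches++;
                batch = new ArrayList<>(batchSize); // start a fresh batch
            }
        }
        if (!batch.isEmpty()) { // final, possibly smaller batch
            handler.handle(batch);
            batches++;
        }
        return batches;
    }

    /** Adapts an in-memory iterator so the sketch runs without a file system. */
    static <T> RemoteIterator<T> fromIterator(Iterator<T> it) {
        return new RemoteIterator<T>() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public T next() { return it.next(); }
        };
    }

    public static void main(String[] args) throws IOException {
        // Made-up entries standing in for files under a bootstrap dump location.
        List<String> dump = Arrays.asList("f1", "f2", "f3", "f4", "f5");
        int batches = loadInBatches(fromIterator(dump.iterator()), 2,
                batch -> System.out.println("load " + batch));
        System.out.println(batches + " batches");
    }
}
```

With the real Hadoop API, the iterator would presumably come from `FileSystem.listFiles(path, true)`; how the patch actually groups files into tasks is not shown in this message.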