[
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988629#comment-15988629
]
Hive QA commented on HIVE-16143:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865438/HIVE-16143.01.patch
{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10647 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[create_like] (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_0] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_1] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_2] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_3] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_batchsize] (batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repair] (batchId=32)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4917/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4917/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4917/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12865438 - PreCommit-HIVE-Build
> Improve msck repair batching
> ----------------------------
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch
>
>
> Currently, the {{msck repair table}} command batches the partitions it
> creates in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}.
> The following snippet shows the batching logic. There are a couple of
> possible improvements to this batching logic:
> {noformat}
>       int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>       if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>         int counter = 0;
>         for (CheckResult.PartitionResult part : partsNotInMs) {
>           counter++;
>           apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>           repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>               + ':' + part.getPartitionName());
>           if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>             db.createPartitions(apd);
>             apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>           }
>         }
>       } else {
>         for (CheckResult.PartitionResult part : partsNotInMs) {
>           apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>           repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>               + ':' + part.getPartitionName());
>         }
>         db.createPartitions(apd);
>       }
>     } catch (Exception e) {
>       LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>       repairOutput.clear();
>       msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
>     }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding
> partitions one by one, which is almost always very slow. Users can easily
> increase the batch size to a higher value hoping the command runs faster, but
> end up with worse performance because the code falls back to adding
> partitions one by one. Users are then expected to find, by trial and error,
> a batch size that works well for their environment. The code could handle
> this situation better by exponentially decaying the batch size instead of
> falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch
> succeeds and the second one fails, the code retries all the partitions
> one by one, irrespective of whether some of them were already added
> successfully. If we need to fall back to one by one, we should at least skip
> the ones we know for sure were already added.
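The two improvements above could be combined. Here is a minimal sketch (not Hive code; names such as {{addBatchToMetastore}} and the simulated server-side limit are illustrative assumptions) of adding partitions with an exponentially decaying batch size, where successfully committed batches are never retried:

```java
import java.util.ArrayList;
import java.util.List;

public class DecayingBatchRepair {

    // Simulated metastore call: pretend the server rejects batches larger
    // than some limit (an assumption for demonstration only).
    static final int SERVER_LIMIT = 25;
    static List<String> metastore = new ArrayList<>();

    static void addBatchToMetastore(List<String> batch) {
        if (batch.size() > SERVER_LIMIT) {
            throw new RuntimeException("batch too large: " + batch.size());
        }
        metastore.addAll(batch);
    }

    // Add partitions in batches; on failure, halve the batch size
    // (exponential decay) instead of immediately degrading to one-by-one
    // inserts. The `pos` index tracks committed progress, so partitions
    // already added are never retried.
    static void addPartitions(List<String> parts, int initialBatchSize) {
        int batchSize = Math.max(1, initialBatchSize);
        int pos = 0;  // index of the first partition not yet added
        while (pos < parts.size()) {
            int end = Math.min(pos + batchSize, parts.size());
            List<String> batch = parts.subList(pos, end);
            try {
                addBatchToMetastore(new ArrayList<>(batch));
                pos = end;                // commit progress; these are done
            } catch (RuntimeException e) {
                if (batchSize == 1) {
                    throw e;              // even single adds fail: give up
                }
                batchSize = Math.max(1, batchSize / 2);  // decay and retry
            }
        }
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            parts.add("ds=2017-04-" + i);
        }
        // Starts too aggressive (100), decays 100 -> 50 -> 25, then succeeds.
        addPartitions(parts, 100);
        System.out.println("added " + metastore.size() + " partitions");
    }
}
```

With this scheme a user who picks an overly large batch size pays only a few failed round trips before settling on a workable size, instead of degrading to one insert per partition.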
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)