[ https://issues.apache.org/jira/browse/HIVE-21071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762091#comment-16762091 ]
Hive QA commented on HIVE-21071:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12957784/HIVE-21071.9.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15772 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeOnTezEdges (batchId=327)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15969/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15969/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15969/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12957784 - PreCommit-HIVE-Build

> Improve getInputSummary
> -----------------------
>
>                 Key: HIVE-21071
>                 URL: https://issues.apache.org/jira/browse/HIVE-21071
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-21071.1.patch, HIVE-21071.2.patch,
> HIVE-21071.3.patch, HIVE-21071.4.patch, HIVE-21071.5.patch,
> HIVE-21071.6.patch, HIVE-21071.7.patch, HIVE-21071.8.patch, HIVE-21071.9.patch
>
>
> There is a global lock in the {{getInputSummary}} code, so it is important
> that it be fast. The current implementation has quite a bit of overhead that
> can be re-engineered.
> For example, the current implementation keeps a map of File Path to
> ContentSummary object. This map is populated by several threads
> concurrently. At the end, the method loops through the map in a single
> thread to add up all of the ContentSummary objects, ignoring the paths.
> The code can be re-engineered to not use a map, or any collection at all, to
> store the results and instead just keep a running tally. By keeping a tally,
> there is no {{O\(n)}} operation at the end to perform the addition.
> There are other things that can be improved. The method returns an object
> which is never used anywhere, so the method can be changed to a void return
> type.
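
The running-tally idea from the description can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name {{InputTally}}, its methods, and the sample sizes are all hypothetical, and plain {{long}} values stand in for Hadoop's ContentSummary so the example runs without Hive or Hadoop on the classpath. It assumes {{java.util.concurrent.atomic.LongAdder}} is an acceptable way to accumulate totals from several threads.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical thread-safe running tally; replaces a Path -> summary map. */
final class InputTally {
  private final LongAdder length = new LongAdder();
  private final LongAdder fileCount = new LongAdder();
  private final LongAdder directoryCount = new LongAdder();

  /** Called by a worker thread as soon as one path has been summarized. */
  void add(long len, long files, long dirs) {
    length.add(len);
    fileCount.add(files);
    directoryCount.add(dirs);
  }

  long totalLength() { return length.sum(); }
  long totalFileCount() { return fileCount.sum(); }
  long totalDirectoryCount() { return directoryCount.sum(); }

  public static void main(String[] args) throws InterruptedException {
    InputTally tally = new InputTally();
    ExecutorService pool = Executors.newFixedThreadPool(4);
    // Made-up per-path sizes; a real caller would get them from the file system.
    for (long size : List.of(100L, 200L, 300L)) {
      pool.execute(() -> tally.add(size, 1, 0));
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    // No per-path collection exists, so there is nothing to iterate over here.
    System.out.println(tally.totalLength() + " bytes in "
        + tally.totalFileCount() + " files");
  }
}
```

Because each worker adds its result directly to the totals, nothing is stored per path and no single-threaded pass over the results is needed afterwards. The three counters are updated independently, so the totals are only guaranteed to be mutually consistent once all workers have finished.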