[ https://issues.apache.org/jira/browse/HIVE-27056?focusedWorklogId=845207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-845207 ]
ASF GitHub Bot logged work on HIVE-27056:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Feb/23 18:49
            Start Date: 13/Feb/23 18:49
    Worklog Time Spent: 10m
      Work Description: sonarcloud[bot] commented on PR #4037:
URL: https://github.com/apache/hive/pull/4037#issuecomment-1428476259

   Kudos, SonarCloud Quality Gate passed!
   https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4037

   [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4037&resolved=false&types=BUG) (rating: A)
   [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4037&resolved=false&types=VULNERABILITY) (rating: A)
   [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=4037&resolved=false&types=SECURITY_HOTSPOT) (rating: A)
   [1 Code Smell](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4037&resolved=false&types=CODE_SMELL) (rating: A)

   [No Coverage information](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=4037&metric=coverage&view=list)
   [No Duplication information](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=4037&metric=duplicated_lines_density&view=list)


Issue Time Tracking
-------------------

    Worklog Id:     (was: 845207)
    Time Spent: 1.5h  (was: 1h 20m)

> Ensure that MR distcp goes to the same Yarn queue as other parts of the query
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-27056
>                 URL: https://issues.apache.org/jira/browse/HIVE-27056
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-2
>
>         Attachments: image (1).png
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Given the following plan for an EXPORT TABLE query:
> {code}
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | STAGE DEPENDENCIES: |
> |   Stage-2 is a root stage |
> |   Stage-3 depends on stages: Stage-2 |
> |   Stage-1 depends on stages: Stage-3 |
> |   Stage-6 depends on stages: Stage-1 |
> |   Stage-5 depends on stages: Stage-6 |
> |   Stage-7 depends on stages: Stage-5 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-2 |
> |     Tez |
> |       DagId: hive_20230124103803_b865e63c-4565-4b95-8664-8c983938697a:7 |
> |       DagName: hive_20230124103803_b865e63c-4565-4b95-8664-8c983938697a:7 |
> |       Vertices: |
> |         Map 1 |
> |             Map Operator Tree: |
> |                 TableScan |
> |                   alias: distcptest |
> |                   Statistics: Num rows: 2 Data size: 182658 Basic stats: COMPLETE Column stats: COMPLETE |
> |                   Select Operator |
> |                     expressions: id (type: int), txt (type: string) |
> |                     outputColumnNames: _col0, _col1 |
> |                     Statistics: Num rows: 2 Data size: 182658 Basic stats: COMPLETE Column stats: COMPLETE |
> |                     File Output Operator |
> |                       compressed: false |
> |                       Statistics: Num rows: 2 Data size: 182658 Basic stats: COMPLETE Column stats: COMPLETE |
> |                       table: |
> |                           input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
> |                           output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat |
> |                           serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde |
> |                           name: bigd35368.distcptest_ced9015c_852b_49a4_8c28_4d284d598523 |
> |             Execution mode: vectorized |
> | |
> |   Stage: Stage-3 |
> |     Dependency Collection |
> | |
> |   Stage: Stage-1 |
> |     Move Operator |
> |       tables: |
> |           replace: false |
> |           table: |
> |               input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
> |               output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat |
> |               serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde |
> |               name: bigd35368.distcptest_ced9015c_852b_49a4_8c28_4d284d598523 |
> | |
> |   Stage: Stage-6 |
> |     Set Properties |
> |       table name: bigd35368.distcptest_ced9015c_852b_49a4_8c28_4d284d598523 |
> |       properties: |
> |         transactional true |
> | |
> |   Stage: Stage-5 |
> |     Export Work |
> | |
> |   Stage: Stage-7 |
> |     Drop Table |
> |       table: bigd35368.distcptest_ced9015c_852b_49a4_8c28_4d284d598523 |
> | |
> +----------------------------------------------------+
> {code}
> here, Stage-5 Export Work starts MR distcp jobs
> the problem and solution are described in code comment, pasted below:
> {code}
> /**
>  * This method ensures if there is an explicit tez.queue.name set, the hadoop shim will submit jobs
>  * to the same yarn queue. This solves a security issue where e.g settings have the following values:
>  * tez.queue.name=sample
>  * hive.server2.tez.queue.access.check=true
>  * In this case, when a query submits Tez DAGs, the tez client layer checks whether the end user has access to
>  * the yarn queue 'sample' via YarnQueueHelper, but this is not respected in case of MR jobs that run
>  * even if the query execution engine is Tez. E.g. an EXPORT TABLE can submit DistCp MR jobs at some stages when
>  * certain criteria are met. We tend to restrict the setting of mapreduce.job.queuename in order to bypass this
>  * security flaw, and even the default queue is unexpected if we explicitly set tez.queue.name.
>  * Under the hood the desired behavior is to have DistCp jobs in the same yarn queue as other parts
>  * of the query. Most of the time, the user isn't aware that a query involves DistCp jobs, hence isn't aware
>  * of these details.
>  */
> {code}
> after the fix, the MR job went into the same yarn queue:
> https://issues.apache.org/jira/secure/attachment/13055215/image%20%281%29.png



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
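
For readers following the quoted code comment, the queue handling it describes can be illustrated with a minimal sketch. This is not the code from PR #4037: the class name QueueAlignmentSketch and the method alignMrQueueWithTez are hypothetical, and a plain java.util.Map stands in for the Hadoop configuration object so the example runs without Hadoop on the classpath. Only the two property names, tez.queue.name and mapreduce.job.queuename, come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; a Map stands in for the real job configuration.
class QueueAlignmentSketch {
    static final String TEZ_QUEUE_NAME = "tez.queue.name";
    static final String MR_QUEUE_NAME = "mapreduce.job.queuename";

    /**
     * If tez.queue.name is explicitly set, copy it into mapreduce.job.queuename
     * so that MR jobs (e.g. DistCp) are submitted to the same YARN queue.
     * Returns true when the MR queue was overwritten.
     */
    static boolean alignMrQueueWithTez(Map<String, String> conf) {
        String tezQueue = conf.get(TEZ_QUEUE_NAME);
        if (tezQueue == null || tezQueue.trim().isEmpty()) {
            // No explicit Tez queue: leave the MR queue setting untouched.
            return false;
        }
        conf.put(MR_QUEUE_NAME, tezQueue.trim());
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "sample");
        conf.put(MR_QUEUE_NAME, "default");

        alignMrQueueWithTez(conf);
        System.out.println(conf.get(MR_QUEUE_NAME)); // prints "sample"
    }
}
```

In this sketch the Tez queue always wins when it is set, which matches the comment's point that even the default MR queue is unexpected once tez.queue.name has been chosen explicitly.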