abstractdog commented on code in PR #4037:
URL: https://github.com/apache/hive/pull/4037#discussion_r1105583865


##########
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
##########
@@ -1211,6 +1207,39 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, String newSnapshot, Li
     return false;
   }
 
+  protected int runDistCpInternal(DistCp distcp, List<String> params) {
+    ensureMapReduceQueue(distcp.getConf());
+    return distcp.run(params.toArray(new String[0]));
+  }
+
+  /**
+   * This method ensures if there is an explicit tez.queue.name set, the hadoop shim will submit jobs
+   * to the same yarn queue. This solves a security issue where e.g settings have the following values:
+   * tez.queue.name=sample
+   * hive.server2.tez.queue.access.check=true
+   * In this case, when a query submits Tez DAGs, the tez client layer checks whether the end user has access to
+   * the yarn queue 'sample' via YarnQueueHelper, but this is not respected in case of MR jobs that run
+   * even if the query execution engine is Tez. E.g. an EXPORT TABLE can submit DistCp MR jobs at some stages when
+   * certain criteria are met. We tend to restrict the setting of mapreduce.job.queuename in order to bypass this
+   * security flaw, and even the default queue is unexpected if we explicitly set tez.queue.name.
+   * Under the hood the desired behavior is to have DistCp jobs in the same yarn queue as other parts
+   * of the query. Most of the time, the user isn't aware that a query involves DistCp jobs, hence isn't aware
+   * of these details.
+   */
+  protected void ensureMapReduceQueue(Configuration conf) {
+    String queueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
+    boolean isTez = conf.get("hive.execution.engine", "tez").equalsIgnoreCase("tez");

Review Comment:
   I don't have a strong opinion on this one, but the point made above makes sense, so I'll proceed with:
   ```
    boolean isTez = "tez".equalsIgnoreCase(conf.get("hive.execution.engine"));
   ```
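
For reference, the two forms are not equivalent when the key is absent: the version in the diff falls back to "tez", while the constant-first version evaluates to false. A minimal sketch of that difference, assuming `java.util.Properties` as a stand-in for Hadoop's `Configuration` so it runs without Hadoop on the classpath (the class and method names are hypothetical, not part of the PR):

```java
import java.util.Properties;

// Hypothetical stand-in: Properties plays the role of Hadoop's Configuration.
class EngineCheckSketch {
  static final String ENGINE_KEY = "hive.execution.engine";

  // Form from the diff: an unset key falls back to the default "tez".
  static boolean isTezWithDefault(Properties conf) {
    return conf.getProperty(ENGINE_KEY, "tez").equalsIgnoreCase("tez");
  }

  // Form from the review comment: null-safe, but an unset key yields false.
  static boolean isTezConstantFirst(Properties conf) {
    return "tez".equalsIgnoreCase(conf.getProperty(ENGINE_KEY));
  }

  public static void main(String[] args) {
    Properties unset = new Properties();

    Properties explicit = new Properties();
    explicit.setProperty(ENGINE_KEY, "TEZ");

    System.out.println("unset, with default:       " + isTezWithDefault(unset));
    System.out.println("unset, constant first:     " + isTezConstantFirst(unset));
    System.out.println("explicit, with default:    " + isTezWithDefault(explicit));
    System.out.println("explicit, constant first:  " + isTezConstantFirst(explicit));
  }
}
```

Whether the difference matters here depends on whether `hive.execution.engine` is always present in the configuration handed to DistCp; if it is, both forms behave the same.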



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

