ayushtkn commented on code in PR #4037:
URL: https://github.com/apache/hive/pull/4037#discussion_r1105571406
##########
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
##########
@@ -1211,6 +1207,39 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, String newSnapshot, Li
return false;
}
+ protected int runDistCpInternal(DistCp distcp, List<String> params) {
+ ensureMapReduceQueue(distcp.getConf());
+ return distcp.run(params.toArray(new String[0]));
+ }
+
+ /**
+ * This method ensures if there is an explicit tez.queue.name set, the hadoop shim will submit jobs
+ * to the same yarn queue. This solves a security issue where e.g settings have the following values:
+ * tez.queue.name=sample
+ * hive.server2.tez.queue.access.check=true
+ * In this case, when a query submits Tez DAGs, the tez client layer checks whether the end user has access to
+ * the yarn queue 'sample' via YarnQueueHelper, but this is not respected in case of MR jobs that run
+ * even if the query execution engine is Tez. E.g. an EXPORT TABLE can submit DistCp MR jobs at some stages when
+ * certain criteria are met. We tend to restrict the setting of mapreduce.job.queuename in order to bypass this
+ * security flaw, and even the default queue is unexpected if we explicitly set tez.queue.name.
+ * Under the hood the desired behavior is to have DistCp jobs in the same yarn queue as other parts
+ * of the query. Most of the time, the user isn't aware that a query involves DistCp jobs, hence isn't aware
+ * of these details.
+ */
+ protected void ensureMapReduceQueue(Configuration conf) {
+ String queueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
+ boolean isTez = conf.get("hive.execution.engine", "tez").equalsIgnoreCase("tez");
Review Comment:
Is there any case where the execution engine is not set? If so, this check
defaults to Tez, which I don't think we want.
We should only do this when we are sure the engine is Tez.
Can we change it to:
```
boolean isTez = "tez".equalsIgnoreCase(conf.get("hive.execution.engine"));
```
I believe the HiveConf default is `mr`, so alternatively the default here
could be changed to `mr` as well.
```
HIVE_EXECUTION_ENGINE("hive.execution.engine", "mr"
```
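To make the difference concrete, here is a minimal sketch comparing the two checks when the property is unset. It uses `java.util.Properties` as a stand-in for Hadoop's `Configuration` so that it runs without Hadoop on the classpath; the class name `EngineCheck` and its method names are hypothetical and not part of the patch.

```java
import java.util.Properties;

// Hypothetical stand-in for the check under review.
// Assumption: java.util.Properties replaces Hadoop's Configuration for illustration only.
final class EngineCheck {
    static final String ENGINE_KEY = "hive.execution.engine";

    // Variant from the patch: an unset property falls back to "tez".
    static boolean isTezWithDefault(Properties conf) {
        return conf.getProperty(ENGINE_KEY, "tez").equalsIgnoreCase("tez");
    }

    // Variant suggested in this review: an unset property yields null,
    // and "tez".equalsIgnoreCase(null) is false.
    static boolean isTezStrict(Properties conf) {
        return "tez".equalsIgnoreCase(conf.getProperty(ENGINE_KEY));
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        Properties explicitTez = new Properties();
        explicitTez.setProperty(ENGINE_KEY, "TEZ");

        System.out.println(isTezWithDefault(unset));       // true
        System.out.println(isTezStrict(unset));            // false
        System.out.println(isTezWithDefault(explicitTez)); // true
        System.out.println(isTezStrict(explicitTez));      // true
    }
}
```

The two variants only disagree when the property is missing, which is exactly the case raised above.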
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]