[ 
https://issues.apache.org/jira/browse/HIVE-20549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617565#comment-16617565
 ] 

mahesh kumar behera commented on HIVE-20549:
--------------------------------------------

I have a few observations; please check whether they can be accommodated.

{code}
1. In HiveConf.java, the description could say more about usage and 
limitations.

2. setApplicationTag needs to handle the case where a tag is already set for MR 
or Tez jobs. In that case we should append to the existing tag, so that the 
same job can be tagged with both the query id and the query tag.

3. "userid=" can be stored as a static final variable and used everywhere.

4. When "numCanceled == 0", YARN jobs should not be killed.

5. If multiple queries share the same tag, the tag should not be used to kill 
the YARN jobs; the query id can be used instead. This prevents an unauthorized 
user from killing YARN jobs.

6. Only an admin should be allowed to kill YARN jobs by tag when the 
corresponding operation is missing from the map.
{code}
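
To illustrate observations 2 and 3, here is a minimal sketch of what I have in mind. It is not taken from the patch: the class name QueryTagUtil and the methods appendTag and userTag are hypothetical, and I am assuming the application tags are kept as a single comma-separated string.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the HIVE-20549 patch.
final class QueryTagUtil {
    // Observation 3: keep the "userid=" prefix in one place.
    static final String USERID_PREFIX = "userid=";

    private QueryTagUtil() {
    }

    // Observation 2: append newTag to the existing comma-separated tags
    // instead of overwriting them. Order is preserved and duplicates are dropped.
    static String appendTag(String existingTags, String newTag) {
        Set<String> tags = new LinkedHashSet<>();
        if (existingTags != null) {
            for (String t : existingTags.split(",")) {
                String trimmed = t.trim();
                if (!trimmed.isEmpty()) {
                    tags.add(trimmed);
                }
            }
        }
        if (newTag != null && !newTag.trim().isEmpty()) {
            tags.add(newTag.trim());
        }
        return String.join(",", tags);
    }

    // Builds the user tag from the shared prefix.
    static String userTag(String user) {
        return USERID_PREFIX + user;
    }
}
```

With something like this, a job that already carries the query id keeps it when the query tag is added, so either value can later be used to find the YARN application.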

> Allow user set query tag, and kill query with tag
> -------------------------------------------------
>
>                 Key: HIVE-20549
>                 URL: https://issues.apache.org/jira/browse/HIVE-20549
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>         Attachments: HIVE-20549.1.patch, HIVE-20549.2.patch
>
>
> HIVE-19924 added the capability for a replication job to set a query tag and 
> kill the replication distcp job with that tag. Here I make it more general: 
> the user can set an arbitrary "hive.query.tag" in a SQL script and kill the 
> query with the tag. Hive will cancel the corresponding operation in HS2, along 
> with the Tez/MR application launched for the query. For example:
> {code}
> set hive.query.tag=mytag;
> select ..... -- long running query
> {code}
> In another session:
> {code}
> kill query 'mytag';
> {code}
> There are limitations in the implementation:
> 1. No tag duplication check. Nothing prevents the same user from reusing a 
> tag, and kill query will kill all queries that share the tag. However, kill 
> query will not kill queries from a different user unless it is issued by an 
> admin, so different users may share the same tag.
> 2. In a multiple-HS2 environment, the kill statement should be issued to every 
> HS2 to make sure the corresponding operation is canceled. When beeline/JDBC 
> connects to HS2 the regular way (ZooKeeper URL), the session connects to a 
> random HS2, which might differ from the HS2 where the query is running. Users 
> can use HiveConnection.getAllUrls or beeline --getUrlsFromBeelineSite 
> (HIVE-20507) to get a list of all HS2 instances.
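
For limitation 2 in the description, a client-side loop over all HS2 instances could look roughly like the sketch below. This is only an illustration under assumptions: the class KillQueryFanOut, the StatementRunner interface and the killOnAll method are hypothetical names, the list of JDBC URLs is assumed to have been obtained already (for example from HiveConnection.getAllUrls), and the JDBC runner needs the Hive JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper, not part of Hive.
final class KillQueryFanOut {
    // Runs one statement against one HS2 URL; pluggable so it can be faked in tests.
    interface StatementRunner {
        void run(String hs2Url, String sql) throws Exception;
    }

    // Plain JDBC runner: opens a connection, executes the statement, closes everything.
    static final StatementRunner JDBC = (url, sql) -> {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        }
    };

    private KillQueryFanOut() {
    }

    // Issues "kill query '<tag>'" to every URL and returns the URLs that failed,
    // so the caller can retry or report them.
    static List<String> killOnAll(List<String> hs2Urls, String tag, StatementRunner runner) {
        if (tag.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("tag must not contain a single quote");
        }
        String sql = "kill query '" + tag + "'";
        List<String> failed = new ArrayList<>();
        for (String url : hs2Urls) {
            try {
                runner.run(url, sql);
            } catch (Exception e) {
                failed.add(url);
            }
        }
        return failed;
    }
}
```

A caller would pass KillQueryFanOut.JDBC as the runner. Continuing past a failed instance is a deliberate choice here, because the query may be running on any of the remaining HS2 instances.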



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
