[ https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
mahesh kumar behera updated HIVE-19924:
---------------------------------------
    Status: Patch Available  (was: In Progress)

> Tag distcp jobs run by Repl Load
> --------------------------------
>
>                 Key: HIVE-19924
>                 URL: https://issues.apache.org/jira/browse/HIVE-19924
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: DR, replication
>             Fix For: 4.0.0, 3.2.0
>
>         Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch,
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch
>
>
> Add tags to the jobconf of distcp-related jobs started by replication. This
> will allow Hive to kill these jobs if Beacon retries, or if HS2 dies and
> Beacon issues a kill command.
> * One of the tags should definitely be the query_id that starts the job.
> With this flow, before retrying the bootstrap load, Beacon will issue a kill
> command to HS2 with the query id of the previously issued command. HS2 will
> then kill any running jobs on YARN tagged with that query_id.
> * To get around the additional failure point as mentioned above, the jobs
> can be tagged with an additional unique tag_id provided by Beacon in the WITH
> clause of the repl load command (to be used to tag distcp jobs). Enhance the
> kill API to take the tag as input and kill the jobs associated with that tag.
> The problem here is how to validate the association of the tag with a Hive
> query id, to make sure this API is not used to kill jobs run by other
> components. However, we can provide this capability to admins only, which
> should be OK in that case.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
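
The tagging and kill-by-tag flow proposed in the issue description can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code in the attached patches: the class `TaggedJobKillSketch`, its `Job` type, and the methods `submit` and `killByTag` are all hypothetical names, and the list of jobs stands in for applications running on YARN. It shows each job carrying the query id plus an optional caller-supplied tag, and a kill call that is restricted to admins because the tag cannot be validated against a Hive query id.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model; no Hive, Hadoop, or YARN API is used here.
public class TaggedJobKillSketch {

    /** Stand-in for a distcp job launched by a repl load query. */
    static final class Job {
        final String id;
        final Set<String> tags;
        boolean running = true;

        Job(String id, Set<String> tags) {
            this.id = id;
            this.tags = tags;
        }
    }

    private final List<Job> jobs = new ArrayList<>();

    /** Registers a job tagged with the query id and, if present, the extra tag. */
    Job submit(String jobId, String queryId, String extraTag) {
        Set<String> tags = new LinkedHashSet<>();
        tags.add(queryId);
        if (extraTag != null && !extraTag.isEmpty()) {
            tags.add(extraTag);
        }
        Job job = new Job(jobId, tags);
        jobs.add(job);
        return job;
    }

    /** Stops every running job that carries the tag; returns the ids stopped. */
    List<String> killByTag(String tag, boolean callerIsAdmin) {
        // The tag cannot be tied back to a query id, so only admins may use it.
        if (!callerIsAdmin) {
            throw new SecurityException("kill by tag is restricted to admins");
        }
        List<String> killed = new ArrayList<>();
        for (Job job : jobs) {
            if (job.running && job.tags.contains(tag)) {
                job.running = false;
                killed.add(job.id);
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        TaggedJobKillSketch cluster = new TaggedJobKillSketch();
        cluster.submit("job_1", "query_1", "beacon_tag_A");
        cluster.submit("job_2", "query_2", "beacon_tag_A");
        // Kill by query id first, then by the caller-supplied tag.
        System.out.println(cluster.killByTag("query_1", true));
        System.out.println(cluster.killByTag("beacon_tag_A", true));
    }
}
```

The second tag is what lets a caller stop the jobs of an earlier attempt without knowing that attempt's query id, which is the situation the second bullet describes.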