[ 
https://issues.apache.org/jira/browse/HIVE-16686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-16686:
------------------------------------
    Release Note: 
This introduces parsing of additional parameters that are not used directly by 
hive, but are passed on to distcp when hive invokes it. The hive "set" command 
can now be used to pass command-line arguments along to distcp.

Any parameter set as "set distcp.options.blah=''" will result in an extra 
"-blah" argument being passed to distcp. Likewise, any parameter set as "set 
distcp.options.foo='bar'" will result in an extra "-foo bar" argument being 
passed to distcp.
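The mapping described above can be sketched roughly as follows. This is an 
illustrative Python sketch, not hive's actual code: the function name 
distcp_args_from_conf, the dict standing in for the session configuration, and 
the "unrelated.setting" key are all hypothetical. It also assumes values arrive 
with their quotes already stripped and that properties are visited in insertion 
order.

```python
# Hypothetical sketch of the described mapping; not hive's implementation.
PREFIX = "distcp.options."

def distcp_args_from_conf(conf):
    """Turn distcp.options.* entries into a flat distcp argument list."""
    args = []
    for key, value in conf.items():
        if not key.startswith(PREFIX):
            continue  # only distcp.options.* entries are forwarded
        args.append("-" + key[len(PREFIX):])  # distcp.options.foo -> -foo
        if value:  # an empty value yields a bare flag with no operand
            args.append(value)
    return args

# Assumed stand-in for properties set via the "set" command.
conf = {
    "distcp.options.blah": "",
    "distcp.options.foo": "bar",
    "unrelated.setting": "x",
}
print(distcp_args_from_conf(conf))  # ['-blah', '-foo', 'bar']
```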

Currently, we always pass along "-update" and "-skipcrccheck" to distcp. These 
are retained as defaults if no distcp.options.* params are found. If any are 
found, then these options are not added by default, letting the user instead 
provide an explicit list.
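One consequence of the default handling is that a user who sets any 
distcp.options.* parameter and still wants "-update" has to request it 
explicitly. The Python sketch below illustrates that rule under stated 
assumptions: effective_distcp_args and the dict standing in for the session 
configuration are hypothetical, and "distcp.options.pugp" is only an example 
of a user-supplied option, not something named in this release note.

```python
# Hypothetical sketch of the default-option rule; not hive's implementation.
PREFIX = "distcp.options."
DEFAULT_ARGS = ["-update", "-skipcrccheck"]

def effective_distcp_args(conf):
    """Return user-supplied distcp arguments, or the defaults if there are none."""
    user_args = []
    for key, value in conf.items():
        if key.startswith(PREFIX):
            user_args.append("-" + key[len(PREFIX):])
            if value:  # empty value means a bare flag
                user_args.append(value)
    # Defaults apply only when no distcp.options.* parameter was found.
    return user_args if user_args else list(DEFAULT_ARGS)

# No distcp.options.* parameters: the defaults are used.
print(effective_distcp_args({}))

# Any explicit option replaces the defaults, so "-update" must be re-stated.
print(effective_distcp_args({
    "distcp.options.update": "",
    "distcp.options.pugp": "",
}))
```

In the second call "-skipcrccheck" is absent, because the explicit list 
replaces the defaults rather than extending them.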

Note that all of these properties affect how distcp runs when it is launched by 
hive, but are not directly hive settings. Instead, hive will allow setting them 
through the use of the "set" command.

  was:
This introduces parsing of additional parameters that are not directly used by 
hive, but are passed on to distcp when hive invokes it. We now introduce the 
ability to use the hive command to do "set" commands to pass along cli 
arguments to distcp.

Any parameter set as "set distcp.options.blah=''" will result in an extra 
"-blah" argument going into distcp, as well as any parameter set as "set 
distcp.options.foo='bar'" will result in an extra "-foo bar" argument going to 
distcp.

Currently, we always pass along "-update" and "-skipcrccheck" to distcp - that 
is retained as defaults if no distcp.options.* params are found. If they are 
found, then these options are not added by default, letting the user instead 
provide an explicit list.

In addition, one new special option parameter, "distcp.option.privilegedUser", 
is being added. It is not passed along to distcp. Instead, if this parameter is 
specified and the named user is different from the current user, hive will run 
distcp inside an impersonation context as that user. This, however, will 
require that the user running hive have impersonation proxy privileges 
(something that an HS2 instance typically will have, but not a regular 
end-user).

Note that all of these properties affect how distcp runs when it is launched by 
hive, but are not directly hive settings. Instead, hive will allow setting them 
through the use of the "set" command.


> repl invocations of distcp needs additional handling
> ----------------------------------------------------
>
>                 Key: HIVE-16686
>                 URL: https://issues.apache.org/jira/browse/HIVE-16686
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>              Labels: TODOC3.0
>         Attachments: HIVE-16686.1.patch, HIVE-16686.2.patch
>
>
> When REPL LOAD invokes distcp, there needs to be a way for the user invoking 
> REPL LOAD to pass on arguments to distcp. In addition, there is sometimes a 
> need for distcp to be invoked from within an impersonated context, such as 
> running as user "hdfs", asking distcp to preserve ownerships of individual 
> files.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
