[
https://issues.apache.org/jira/browse/OOZIE-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454346#comment-13454346
]
Alejandro Abdelnur commented on OOZIE-983:
------------------------------------------
Mona, I'm a bit confused by your comments.
* Oozie client currently does not include hadoop-common/hadoop-hdfs JARs (and
all its deps), so you cannot do FS calls via Hadoop FS API.
* Oozie client currently does not know about any Hadoop endpoint. Neither does
the Oozie server, as Oozie is 'agnostic' of the cluster (that is determined by
the job-tracker/name-node URIs you use in your jobs).
* Oozie client does not use Hadoop UGI for kerberos (no hadoop-common), it just
uses basic JDK GSS stuff (done in hadoop-auth, which is included in the client
lib/).
* I would assume that if you are not on a gateway machine you may not have
Kerberos authentication there but some other form of authentication.
If your idea is to have a tool that does 'oozie deploy <LOCALPATH>
<REMOTEPATH>', where LOCALPATH would be a local filesystem directory and
REMOTEPATH would be any filesystem URI supported by the Hadoop FileSystem API
(hdfs://, webhdfs://, viewfs://, etc), then this is just a wrapper on
'hadoop fs -put <LOCALPATH> <REMOTEPATH>', nothing else.
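
To make the "just a wrapper" point concrete, here is a minimal sketch of what
such a tool would reduce to. The function names (build_put_command, deploy) and
the dry_run flag are hypothetical and are not part of Oozie; the only real
command involved is 'hadoop fs -put', which must be on the PATH of a machine
with a configured Hadoop client.

```python
import shlex
import subprocess


def build_put_command(local_path, remote_path):
    """Return the argv for 'hadoop fs -put <LOCALPATH> <REMOTEPATH>'."""
    return ["hadoop", "fs", "-put", local_path, remote_path]


def deploy(local_path, remote_path, dry_run=True):
    """Hypothetical 'oozie deploy' helper: print or run the put command."""
    cmd = build_put_command(local_path, remote_path)
    if dry_run:
        # Only show what would be executed; nothing touches the cluster.
        print(" ".join(shlex.quote(part) for part in cmd))
        return 0
    # Assumes the hadoop CLI is installed and configured on this machine.
    return subprocess.call(cmd)


if __name__ == "__main__":
    deploy("myapp", "hdfs://namenode.example.com:8020/user/me/apps/myapp")
```

Any URI scheme accepted by the Hadoop FileSystem API could be passed as the
remote path; the wrapper itself adds nothing beyond the put.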
> [Design] Automatic Oozie application deployment using WebHDFS
> -------------------------------------------------------------
>
> Key: OOZIE-983
> URL: https://issues.apache.org/jira/browse/OOZIE-983
> Project: Oozie
> Issue Type: Bug
> Reporter: Mohammad Kamrul Islam
> Assignee: Mona Chitnis
>
> Problem:
> 1. A user can't upload the oozie application from his dev box. User needs
> access to a specialized box (such as gateway) to run those hadoop commands.
> This is inconvenient, as it requires following multiple steps and restrictions.
> 2. Automatic Oozie application versioning. If a user wants to deploy a new
> version of Oozie application, he needs to run multiple commands. In addition,
> there is no standard for this.
> Proposal:
> 1. Oozie will provide a tool that will automatically deploy the application
> and maintain a rigid version mechanism.
> 2. It could be a new script (e.g. oozie-deploy) or it can extend the existing
> oozie command (e.g. "oozie -deploy ..."). TBD
> 3. The new script will get the necessary information to launch a WebHDFS
> command from the user and upload the necessary files. It includes: WebHDFS
> end point, security token (for secured version), local application directory
> and remote application base path.
> 4. Using the appropriate WebHDFS REST API, the tool will deploy the
> application. User can choose whether to override an existing application
> path.
> 5. User can ask to upload a new version of application. The new version could
> be user provided or auto created by the script. For auto version selection,
> the oozie tool will check the existing application path for the pattern "v?"
> and then select the new version number.
> 6. For uploading a new application version, the oozie tool will first upload
> the application and then kill the old job (How to get the old job id?).
> Finally, it will submit the new application.
> Open question:
> 1. How to pass the kerberos token? Especially from a dev box.
> 2. Who will determine the new version? user or automatic?
> Other key points:
> 1. Only supported for Hadoop 1.0.2+
> 2. Need to use/develop some wrapper tools which can hide most of the WebHDFS
> details. There are already two such tools: a) for python :
> https://github.com/drelu/webhdfs-py b) for Ruby,
> https://github.com/zenja/webhdfs-ruby. At this point the options are:
> * Write a new Java wrapper class.
> * Write a new wrapper tool using pure shell commands.
> * Reuse python or Ruby libraries.
> Overall, we need to do it correctly from the beginning. The comments from
> others are highly appreciated.
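
For the quoted proposal, items 4 and 5 can be sketched as follows. This is an
illustration under stated assumptions, not the tool's design: the helper names
(next_version, webhdfs_create_url) are hypothetical, the "v?" pattern is read
here as "v" followed by an integer, and the host and port are placeholders.
The URL layout (/webhdfs/v1/<PATH>?op=CREATE&overwrite=...) follows the WebHDFS
REST API; the actual upload is an HTTP PUT to that URL, which the NameNode
redirects to a DataNode where the file content is then sent.

```python
import re
from urllib.parse import quote, urlencode

# Assumption: version directories are named v1, v2, v3, ...
VERSION_DIR = re.compile(r"^v(\d+)$")


def next_version(existing_names):
    """Pick the next version name given the entries under the app base path."""
    matches = (VERSION_DIR.match(name) for name in existing_names)
    numbers = [int(m.group(1)) for m in matches if m]
    return "v{}".format(max(numbers) + 1 if numbers else 1)


def webhdfs_create_url(endpoint, remote_path, overwrite=False):
    """Build the WebHDFS CREATE URL for an absolute remote file path."""
    query = urlencode({
        "op": "CREATE",
        "overwrite": "true" if overwrite else "false",
    })
    return "{}/webhdfs/v1{}?{}".format(
        endpoint.rstrip("/"), quote(remote_path), query)


if __name__ == "__main__":
    version = next_version(["v1", "v2", "lib"])
    print(webhdfs_create_url(
        "http://namenode.example.com:50070",
        "/user/me/apps/myapp/{}/workflow.xml".format(version)))
```

Listing the existing version directories and passing a security token are left
out on purpose, since both depend on the open questions above.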