[
https://issues.apache.org/jira/browse/OOZIE-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454265#comment-13454265
]
Mona Chitnis commented on OOZIE-983:
------------------------------------
Thanks Alejandro for your comment. The idea is to achieve the same things
you outlined, but by invoking a library via Oozie itself (Point #2 in the JIRA
description). The user then does not need to issue special 'hadoop fs'
commands himself or know the httpFS hostname etc., since the Oozie
client knows the WebHDFS endpoint from the hadoop config. Since there is
already a standard way of obtaining kerberos auth info from the user for Oozie
job submission, it will be a more uniform experience to do it the same way for
copying the application too.
This can be one alternative that addresses the common user's need. It does not
preclude users who wish to manage their app directories independently.
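To make the idea concrete, here is a minimal sketch of how a client-side
helper might build the WebHDFS REST URLs needed to create the application
directory and upload a file. It is not the proposed implementation. The
function name webhdfs_url, the host namenode.example.com, the port and the
paths are all hypothetical; only the /webhdfs/v1 prefix and the operation
names MKDIRS and CREATE come from the WebHDFS REST API.

```python
from urllib.parse import quote, urlencode

WEBHDFS_PREFIX = "/webhdfs/v1"  # path prefix used by the WebHDFS REST API


def webhdfs_url(endpoint, hdfs_path, op, **params):
    """Build a WebHDFS REST URL for one operation on one HDFS path.

    endpoint  -- base URL of the WebHDFS/HttpFS service (assumed here to be
                 looked up from the hadoop config by the caller)
    hdfs_path -- absolute HDFS path of the file or directory
    op        -- WebHDFS operation name, e.g. MKDIRS, CREATE, LISTSTATUS
    params    -- extra query parameters, e.g. overwrite="true"
    """
    path = "/" + hdfs_path.lstrip("/")
    query = urlencode({"op": op, **params})
    return endpoint.rstrip("/") + WEBHDFS_PREFIX + quote(path) + "?" + query


# Hypothetical values, for illustration only.
endpoint = "http://namenode.example.com:50070"
app_dir = "/user/joe/apps/demo/v1"

# MKDIRS and CREATE are both issued as HTTP PUT requests.
mkdirs_url = webhdfs_url(endpoint, app_dir, "MKDIRS")
# CREATE is a two-step call: the first PUT carries no data and is answered
# with a redirect to the location where the file content must be sent.
create_url = webhdfs_url(endpoint, app_dir + "/workflow.xml", "CREATE",
                         overwrite="false")

print(mkdirs_url)
print(create_url)
```

On a secured cluster the request would additionally need SPNEGO
authentication or a delegation token passed as the "delegation" query
parameter; how the tool obtains that token is left open here.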
> [Design] Automatic Oozie application deployment using WebHDFS
> -------------------------------------------------------------
>
> Key: OOZIE-983
> URL: https://issues.apache.org/jira/browse/OOZIE-983
> Project: Oozie
> Issue Type: Bug
> Reporter: Mohammad Kamrul Islam
> Assignee: Mona Chitnis
>
> Problem:
> 1. A user can't upload the oozie application from his dev box. The user needs
> access to a specialized box (such as a gateway) to run those hadoop commands.
> This is inconvenient, as it requires following multiple steps and
> restrictions.
> 2. There is no automatic Oozie application versioning. If a user wants to
> deploy a new version of an Oozie application, he needs to run multiple
> commands. In addition, there is no standard for this.
> Proposal:
> 1. Oozie will provide a tool that will automatically deploy the application
> and maintain a rigid versioning mechanism.
> 2. It could be a new script (e.g. oozie-deploy) or it could extend the
> existing oozie command (e.g. "oozie -deploy ..."). TBD
> 3. The new script will get the information needed to launch a WebHDFS
> command from the user and upload the necessary files. This information
> includes: the WebHDFS endpoint, the security token (for the secured
> version), the local application directory, and the remote application base
> path.
> 4. Using the appropriate WebHDFS REST API, the tool will deploy the
> application. The user can choose whether to overwrite an existing
> application path.
> 5. The user can ask to upload a new version of the application. The new
> version could be provided by the user or created automatically by the
> script. For automatic version selection, the oozie tool will check the
> existing application path for directories matching the pattern "v?" and
> then select the new version number.
> 6. To upload a new application version, the oozie tool will first upload
> the application and then kill the old job (How to get the old job id?).
> Finally, it will submit the new application.
> Open question:
> 1. How to pass the kerberos token? Especially from a dev box.
> 2. Who will determine the new version? The user, or the tool automatically?
> Other key points:
> 1. Only supported for Hadoop 1.0.2+
> 2. We need to use or develop wrapper tools that hide most of the WebHDFS
> details. There are already two such tools:
> a) for Python: https://github.com/drelu/webhdfs-py
> b) for Ruby: https://github.com/zenja/webhdfs-ruby
> At this point the options are:
> * Write a new Java wrapper class.
> * Write a new wrapper tool using pure shell commands.
> * Reuse python or Ruby libraries.
> Overall, we need to do it correctly from the beginning. Comments from
> others are highly appreciated.
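The automatic version selection in proposal item 5 could be sketched as
follows. This is only an illustration under assumptions the description does
not confirm: it reads the pattern "v?" as "v" followed by a decimal number,
starts numbering at v1, and takes the list of existing directory names as
input (in practice it might come from a WebHDFS LISTSTATUS call on the
remote application base path). The name next_version is hypothetical.

```python
import re

# Assumption: version directories are named "v" followed by digits (v1, v2, ...).
VERSION_DIR = re.compile(r"^v(\d+)$")


def next_version(existing_names):
    """Return the next version directory name given the existing entries.

    Entries that do not look like version directories (e.g. "lib") are ignored.
    When no version directory exists yet, numbering starts at v1.
    """
    versions = [
        int(match.group(1))
        for match in (VERSION_DIR.match(name) for name in existing_names)
        if match
    ]
    return "v" + str(max(versions, default=0) + 1)


# Hypothetical listing of the remote application base path.
print(next_version(["v1", "v2", "lib", "v10"]))
print(next_version([]))
```

Comparing the numeric part as an integer rather than as a string keeps v10
ordered after v2. A user-provided version would simply bypass this function.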