[ 
https://issues.apache.org/jira/browse/OOZIE-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476611#comment-13476611
 ] 

Mohammad Kamrul Islam commented on OOZIE-561:
---------------------------------------------

High-level Requirements:
•       Allow users to specify table/partition-based data dependencies, as 
used in HCatalog. 
•       Users should still be able to specify the existing directory-based 
data dependencies.
•       Users can use both types of dataset (directory-based and 
table-partition-based) in the same coordinator.
•       A single data source must be of one type (either table-partition or 
directory).
•       For table-partition-based datasets, let users apply the existing 
concepts defined through EL functions such as current(), latest(), future().
•       Oozie should allow the use of metadata from multiple HCatalog servers.
•       Users can optionally provide the HCatalog server end-points. Oozie 
should have a default HCatalog server.
•       Oozie will allow passing the input data locations as 
db/table/partition(filter) that can easily be used in an MR job or Pig script.
•       Oozie will allow passing the output data destination as 
db/table/partition(filter) that can easily be used in an MR job or Pig script.
•       Include a "remove partition"-like statement in the <prepare> block.
•       Include HCatalog Action. (Future)
•       In condition expressions (e.g. a <case> statement), support the same 
functionality provided for the directory-based system. Provide equivalent new 
EL functions with table-partition-based logic. (Future)
•       Support non-time-based data dependencies (asynchronous data 
processing). (Future)
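
As a rough illustration of the db/table/partition(filter) requirement above, 
the sketch below parses a hypothetical "db/table/key=value;key=value" string 
and renders the partition part as a filter expression that an MR job or Pig 
script could consume. The string format, the PartitionSpec class, and both 
function names are assumptions made for this example only; this comment does 
not define a concrete syntax, and none of this is Oozie or HCatalog API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PartitionSpec:
    """Hypothetical holder for a db/table/partition reference."""
    db: str
    table: str
    partition: Dict[str, str]


def parse_partition_spec(spec: str) -> PartitionSpec:
    """Parse an assumed 'db/table/key=value;key=value' string."""
    parts = spec.strip("/").split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected db/table[/partition], got {spec!r}")
    db, table = parts[0], parts[1]
    partition: Dict[str, str] = {}
    # The partition segment is optional; an absent one means "whole table".
    if len(parts) == 3 and parts[2]:
        for pair in parts[2].split(";"):
            key, sep, value = pair.partition("=")
            if not sep or not key:
                raise ValueError(f"bad partition key/value: {pair!r}")
            partition[key] = value
    return PartitionSpec(db, table, partition)


def to_filter(spec: PartitionSpec) -> str:
    """Render the partition keys as an AND-joined equality filter."""
    return " AND ".join(f"{k}='{v}'" for k, v in spec.partition.items())


if __name__ == "__main__":
    spec = parse_partition_spec("mydb/clicks/dt=20121015;region=us")
    print(spec.db, spec.table, to_filter(spec))
```

Keeping the database, table, and partition keys as separate fields means the 
same parsed value could be handed to a job either as individual properties or 
as a single filter string, whichever the final design settles on.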

The items marked (Future) will not be implemented as part of this JIRA.
However, they should be taken into account in the design.

                
> Integrate Oozie with HCatalog
> -----------------------------
>
>                 Key: OOZIE-561
>                 URL: https://issues.apache.org/jira/browse/OOZIE-561
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Santhosh Srinivasan
>            Assignee: Mona Chitnis
>
> With the incubation of HCatalog, we have a mechanism to abstract data and 
> storage on HDFS. A natural progression for Oozie is to interact with HCatalog 
> to facilitate the interplay between MapReduce, Pig and Hive. In addition, the 
> support for notification in HCatalog will alleviate (but not eliminate) the 
> need to poll HDFS for data sets represented as tables and partitions.

