[ https://issues.apache.org/jira/browse/OOZIE-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477504#comment-13477504 ]
Alejandro Abdelnur commented on OOZIE-561:
------------------------------------------

The design looks pretty good. Some comments/questions:

# dataset hcat:// URIs
## If a default host:port is allowed, then make sure three slashes (///) are required in that case, to be able to differentiate the authority from the path.
## Database & table: why not assume the database and table names are the first and second elements of the path? Then instead of *hcat:host:port/db/<DB>/table/<TABLE>/...* it would just be *hcat:host:port/<DB>/<TABLE>/...*
## To support the shortening in the previous bullet, I'd say <DB> is always required.
# Dataset URI handler: I'd assume you'll be introducing a scheme handler interface, and we'll have 2 implementations to start, one for HDFS:// and one for HCAT://, right?
# On prepare handling (removing partitions): wouldn't it make sense for the scheme handler to handle these operations?
# EL functions
## All the proposed EL functions should act on an event name, for consistency. From the event we can resolve the HCatalog authority, database and table.
## A single *getFilter()* EL function would do for both input and output events.
# How will input event ranges be resolved? I assume by the scheme handler, thus allowing single range queries if the metadata backend supports them.
# The scheme handler should also be the one checking for available dataset instances, since it understands the URIs, right? And the one processing incoming notifications.
# Where and how are the listeners to the JMS topics set up?
# How is authentication planned to be handled?

> Integrate Oozie with HCatalog
> -----------------------------
>
>                 Key: OOZIE-561
>                 URL: https://issues.apache.org/jira/browse/OOZIE-561
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Santhosh Srinivasan
>            Assignee: Mona Chitnis
>         Attachments: Oozie-HCatHighLevel.pptx
>
>
> With the incubation of HCatalog, we have a mechanism to abstract data and
> storage on HDFS.
> A natural progression for Oozie is to interact with HCatalog
> to facilitate the interplay between MapReduce, Pig and Hive. In addition, the
> support for notification in HCatalog will alleviate (and not eliminate) the
> need to poll HDFS for data sets represented as tables and partitions.
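
A minimal sketch of the shortened URI form suggested in the comment above, assuming the first two path segments are always <DB> and <TABLE> and that an omitted authority must be written as hcat:/// (three slashes). The names parse_hcat_uri and HCatDatasetUri are hypothetical, the remaining path segments are left uninterpreted because the comment does not define them, and Python is used only for brevity; none of this is Oozie code.

```python
from typing import NamedTuple, Optional, Tuple
from urllib.parse import urlsplit


class HCatDatasetUri(NamedTuple):
    """Hypothetical parsed form of an hcat:// dataset URI."""
    host: Optional[str]    # None means "fall back to a configured default server"
    port: Optional[int]
    database: str
    table: str
    rest: Tuple[str, ...]  # remaining path segments, not interpreted here


def parse_hcat_uri(uri: str) -> HCatDatasetUri:
    # Requiring "hcat://" means a URI without an authority has to be written
    # "hcat:///<DB>/<TABLE>/...", so the authority and path cannot be confused.
    if not uri.startswith("hcat://"):
        raise ValueError(f"expected an hcat:// URI, got {uri!r}")

    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]

    # Assumption from the comment: <DB> and <TABLE> are always present and
    # are always the first and second path elements.
    if len(segments) < 2:
        raise ValueError(f"URI must contain /<DB>/<TABLE>: {uri!r}")

    database, table, *rest = segments
    return HCatDatasetUri(parts.hostname, parts.port, database, table, tuple(rest))
```

With only two slashes, as in hcat://sales/clicks, the first element is read as the authority, which leaves a single path segment and is rejected. That is the ambiguity the three-slash rule is meant to avoid.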