[jira] [Commented] (OOZIE-561) Integrate Oozie with HCatalog

Alejandro Abdelnur (JIRA) Tue, 16 Oct 2012 17:50:06 -0700

    [ 
https://issues.apache.org/jira/browse/OOZIE-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477504#comment-13477504
 ]


Alejandro Abdelnur commented on OOZIE-561:
------------------------------------------

The design looks pretty good, some comments/questions:

# dataset hcat:// URIs
## if a default host:port is allowed, then make sure three slashes (///) are 
required in that case, in order to be able to differentiate the authority from the path.
## database & table: why not assume the database and table names are the 
first and second elements of the path? Then, instead of 
*hcat:host:port/db/<DB>/table/<TABLE>/...* it would just be 
*hcat:host:port/<DB>/<TABLE>/...*
## to support the shortening in the previous bullet, I'd say <DB> is always required
# dataset URI handler, I'd assume you'll be introducing a scheme handler 
interface, and we'll have 2 IMPLs to start, one for HDFS:// and one for 
HCAT://, right?
# On prepare handling (removing partitions), wouldn't it make sense for the scheme 
handler to handle these operations?
# EL functions
## All the proposed EL functions should act on an event name, for consistency, 
from the event we can resolve hcatalog authority, database and table.
## A single *getFilter()* EL function would do for both input and output events.
# How will input event ranges be resolved? I assume by the scheme handler, 
thus allowing single range queries if the metadata backend supports them.
# The scheme handler should also be the one checking for available dataset 
instances, since it understands the URIs, right? And the one processing incoming 
notifications.
# Where and how are the listeners for the JMS topics implemented?
# How is authentication planned to be handled?
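
To illustrate the URI points in the first item, here is a minimal sketch of how the shortened layout could be parsed with java.net.URI. It assumes the *hcat://host:port/<DB>/<TABLE>/...* form suggested above, which is only a proposal in this comment and not an agreed format. The class name HCatUriSketch and its fields are hypothetical and are not part of Oozie or HCatalog.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for the parts of an hcat:// dataset URI (not an Oozie class). */
final class HCatUriSketch {
    final String host;        // null when the default server is implied (hcat:///...)
    final int port;           // -1 when no port is given
    final String database;    // first path element, always required in this layout
    final String table;       // second path element
    final List<String> rest;  // remaining path elements, e.g. partition specs

    private HCatUriSketch(String host, int port, String database,
                          String table, List<String> rest) {
        this.host = host;
        this.port = port;
        this.database = database;
        this.table = table;
        this.rest = rest;
    }

    /** Parses the assumed layout hcat://[host[:port]]/DB/TABLE/... */
    static HCatUriSketch parse(String text) {
        URI uri = URI.create(text); // throws IllegalArgumentException on bad syntax
        if (!"hcat".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not an hcat URI: " + text);
        }
        String path = uri.getPath();
        if (path == null) { // opaque URI such as "hcat:something"
            throw new IllegalArgumentException("missing path: " + text);
        }
        // Split the path and drop empty elements (the leading "/" yields one).
        List<String> parts = new ArrayList<>();
        for (String element : path.split("/")) {
            if (!element.isEmpty()) {
                parts.add(element);
            }
        }
        // With only two slashes ("hcat://db/table") the database name is parsed
        // as the authority, so the path is one element short and is rejected here.
        if (parts.size() < 2) {
            throw new IllegalArgumentException(
                "expected /<DB>/<TABLE>/... in path: " + text);
        }
        List<String> rest = Collections.unmodifiableList(
            new ArrayList<>(parts.subList(2, parts.size())));
        return new HCatUriSketch(uri.getHost(), uri.getPort(),
                                 parts.get(0), parts.get(1), rest);
    }
}
```

With this sketch, HCatUriSketch.parse("hcat:///mydb/mytable/dt=20121016") yields a null host (default server), while "hcat://mydb/mytable" is rejected because "mydb" is taken as the authority. That is the ambiguity the three-slash requirement is meant to avoid.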


                
> Integrate Oozie with HCatalog
> -----------------------------
>
>                 Key: OOZIE-561
>                 URL: https://issues.apache.org/jira/browse/OOZIE-561
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Santhosh Srinivasan
>            Assignee: Mona Chitnis
>         Attachments: Oozie-HCatHighLevel.pptx
>
>
> With the incubation of HCatalog, we have a mechanism to abstract data and 
> storage on HDFS. A natural progression for Oozie is to interact with HCatalog 
> to facilitate the interplay between MapReduce, Pig and Hive. In addition, the 
> support for notification in HCatalog will alleviate (and not eliminate) the 
> need to poll HDFS for data sets represented as tables and partitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
