[
https://issues.apache.org/jira/browse/FALCON-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093159#comment-14093159
]
Venkatesh Seetharam commented on FALCON-310:
--------------------------------------------
It should be quite straightforward to create external tables in Hive that
point to data on HDFS. It should work OOTB.
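
For illustration, a minimal sketch of what such an external table definition
might look like (the database, table, column and HDFS path names here are
hypothetical, not taken from the issue):

    CREATE EXTERNAL TABLE IF NOT EXISTS falcon_db.clicks_feed (
        user_id STRING,
        url     STRING,
        ts      BIGINT
    )
    PARTITIONED BY (feed_date STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    -- Points at the directory already used by the existing HDFS feed,
    -- so no data is moved or rewritten.
    LOCATION 'hdfs://namenode:8020/data/clicks';

Because the table is EXTERNAL, dropping it later removes only the metadata,
not the underlying HDFS data that existing processes depend on.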
> Allow existing processes to work out-of-box when existing HDFS feeds are
> configured in HCatalog
> -----------------------------------------------------------------------------------------------
>
> Key: FALCON-310
> URL: https://issues.apache.org/jira/browse/FALCON-310
> Project: Falcon
> Issue Type: Improvement
> Reporter: Satish Mittal
> Assignee: Shwetha G S
>
> After HCatalog integration, one can configure new Falcon feeds based on
> HCatalog tables and then write processes that read/write HCat-based feeds.
> However, the expectation is that these processes will be implemented using
> HCatalog interfaces (HCatInputFormat/HCatOutputFormat in the case of M/R
> jobs, or HCatLoader/HCatStorer in the case of Pig scripts). This is easy for
> new processes.
> However, there may be existing processes running in production that are
> based on HDFS feeds and may not be rewritten to use HCat interfaces. For
> such processes, one might simply want to configure HCatalog tables around
> the HDFS feeds and allow the existing processes to continue running as if
> they were still working with HDFS feeds.
> Behind the scenes, Falcon should be able to find new partitions to
> read/write, get their corresponding locations, populate the corresponding
> workflow variables, and register/drop partitions, etc., as part of the
> pre/post-processing steps.
--
This message was sent by Atlassian JIRA
(v6.2#6252)