[ 
https://issues.apache.org/jira/browse/FLUME-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502341#comment-13502341
 ] 

Mike Percy commented on FLUME-1734:
-----------------------------------

Hi Roshan,
Cool! A couple of aspects to consider as you mull this over:
* If a Flume {{Transaction}} is committed by the sink, then the data must be 
persisted. We need to avoid getting into states where the events from any 
committed {{Channel.take()}} could be lost. One way to do that today (it 
requires some setup, though) is to write to an external Hive table and then 
periodically run a LOAD via Oozie or a similar scheduler, which would move 
the files out of the external table and into the desired partitions.
* If the HCat APIs don't work with secure metastores or secure HDFS yet, it 
might be worth considering other APIs for now. However, if it can 
navigate the necessary Hive & Hadoop security features to partition and write 
the data, it sounds great to me! This is just my opinion; of course, you are 
welcome to take it or leave it.
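
To make the first point concrete, here is a minimal sketch of the "persist 
first, commit second" ordering. It does not use the real Flume classes: 
{{SketchChannel}}, {{DurableStore}} and {{drainBatch}} are hypothetical 
stand-ins invented for illustration, and the store is assumed to return from 
{{persist()}} only once the batch is durable. A real sink would use Flume's 
own {{Channel}} and {{Transaction}} instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative only: simplified stand-ins, NOT Flume's actual API. */
class CommitAfterPersistSketch {

    /** Hypothetical durable destination, e.g. files under an external Hive table. */
    interface DurableStore {
        void persist(List<String> batch) throws Exception;
    }

    /** Simplified stand-in for a channel with a single open transaction. */
    static final class SketchChannel {
        private final Deque<String> queue = new ArrayDeque<>();
        private final List<String> inFlight = new ArrayList<>();

        void put(String event) {
            queue.addLast(event);
        }

        /** Removes the next event but remembers it until commit or rollback. */
        String take() {
            String event = queue.pollFirst();
            if (event != null) {
                inFlight.add(event);
            }
            return event;
        }

        /** Forgets the taken events for good; only safe once they are durable. */
        void commit() {
            inFlight.clear();
        }

        /** Puts the taken events back at the head of the queue, in order. */
        void rollback() {
            for (int i = inFlight.size() - 1; i >= 0; i--) {
                queue.addFirst(inFlight.get(i));
            }
            inFlight.clear();
        }

        int size() {
            return queue.size();
        }
    }

    /** Takes up to batchSize events and commits only after persist() succeeds. */
    static boolean drainBatch(SketchChannel channel, DurableStore store, int batchSize) {
        List<String> batch = new ArrayList<>();
        try {
            String event;
            while (batch.size() < batchSize && (event = channel.take()) != null) {
                batch.add(event);
            }
            store.persist(batch); // must be durable before the commit below
            channel.commit();
            return true;
        } catch (Exception e) {
            channel.rollback();   // nothing was committed, so nothing is lost
            return false;
        }
    }

    public static void main(String[] args) {
        SketchChannel channel = new SketchChannel();
        channel.put("a");
        channel.put("b");

        // First attempt: the store fails, so the events stay in the channel.
        boolean first = drainBatch(channel, batch -> {
            throw new Exception("store unavailable");
        }, 10);
        System.out.println("committed=" + first + " remaining=" + channel.size());

        // Second attempt: the store succeeds, so the events are committed.
        List<String> stored = new ArrayList<>();
        boolean second = drainBatch(channel, stored::addAll, 10);
        System.out.println("committed=" + second + " stored=" + stored);
    }
}
```

The point of the ordering is that a failure anywhere before {{commit()}} 
leaves the events in the channel for a retry, at the cost of possible 
duplicates if the store had already written part of the batch.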
                
> Create a HCatalog Sink 
> -----------------------
>
>                 Key: FLUME-1734
>                 URL: https://issues.apache.org/jira/browse/FLUME-1734
>             Project: Flume
>          Issue Type: New Feature
>          Components: Sinks+Sources
>    Affects Versions: v1.2.0
>            Reporter: Roshan Naik
>            Assignee: Roshan Naik
>              Labels: features
>
> Create a sink that would stream data into HCatalog partitions. The primary 
> goal is that once the data is loaded into Hadoop, it should be 
> automatically queryable (using, say, Hive or Pig) without requiring 
> additional post-processing steps on the part of users. The sink should 
> manage the creation of new partitions and commit them periodically. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
