Mark Sun created FLUME-2701:
-------------------------------

             Summary: Adding WebHDFS support
                 Key: FLUME-2701
                 URL: https://issues.apache.org/jira/browse/FLUME-2701
             Project: Flume
          Issue Type: New Feature
            Reporter: Mark Sun


I'm using HttpFs as an HDFS web gateway to receive data from Flume agents in 
another datacenter over the Internet or a WAN. In my case a gateway is 
necessary to minimize the footprint required to access HDFS, but the WebHDFS 
API does not support hsync(), which Flume requires.

HDFS syncs all data and metadata to DataNode (DN) disk before a file is 
closed, and this also holds through the WebHDFS API. It seems to me that we 
can rely on this guarantee to keep data safe when hsync() is unavailable. 
Personally, I think this is much easier than adding hsync() support to 
WebHDFS/HttpFs.

Basically, the idea is to keep the transaction open until the file is rolled 
if the scheme of the HDFS URI is "webhdfs".
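
As a rough illustration of the scheme check described above, here is a 
minimal sketch. The class and method names (WebHdfsSchemeCheck, 
deferCommitUntilRoll) and the example hosts are hypothetical and are not part 
of Flume. The scheme is read with plain string handling rather than 
java.net.URI, on the assumption that a configured sink path may contain escape 
sequences such as %Y that are not valid URI percent-encoding. Only the 
"webhdfs" scheme named above is matched; whether "swebhdfs" should be treated 
the same way is left open.

```java
final class WebHdfsSchemeCheck {

    // Hypothetical helper: returns true when the configured path uses the
    // "webhdfs" scheme, i.e. when the sink would keep the transaction open
    // until the file is rolled and closed instead of calling hsync().
    public static boolean deferCommitUntilRoll(String hdfsPath) {
        if (hdfsPath == null) {
            return false;
        }
        // Take everything before the first ':' as the scheme. Paths without
        // a scheme (for example "/flume/events") yield an empty string.
        int colon = hdfsPath.indexOf(':');
        String scheme = colon > 0 ? hdfsPath.substring(0, colon) : "";
        return scheme.equalsIgnoreCase("webhdfs");
    }

    public static void main(String[] args) {
        // Example hosts and ports are placeholders.
        System.out.println(deferCommitUntilRoll(
                "webhdfs://gateway.example.com:14000/flume/%Y-%m-%d"));
        System.out.println(deferCommitUntilRoll(
                "hdfs://namenode.example.com:8020/flume/events"));
    }
}
```

A sink could evaluate this once at configuration time and choose its commit 
behavior from the result, so that ordinary "hdfs" paths keep using hsync().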



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
