Mark Sun created FLUME-2701:
-------------------------------
Summary: Adding WebHDFS support
Key: FLUME-2701
URL: https://issues.apache.org/jira/browse/FLUME-2701
Project: Flume
Issue Type: New Feature
Reporter: Mark Sun
I'm using HttpFs as an HDFS web gateway to receive data from Flume running in
another datacenter over the Internet or a WAN. In my case a gateway is necessary
to minimize the footprint required to access HDFS. However, the WebHDFS API does
not support hsync(), which Flume requires.
HDFS syncs all data and metadata to DataNode (DN) disk before a file is closed,
and this also holds when going through the WebHDFS API. It seems to me that we
can rely on this guarantee to keep data safe when hsync() is unavailable.
Personally, I think this is much easier than adding hsync() support to
WebHDFS/HttpFs.
Basically, the idea is to keep the transaction open until a roll occurs, if we
find that the scheme of the HDFS URI is “webhdfs”.
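To make the scheme check concrete, here is a minimal sketch of how it might
look. The class and method names (WebHdfsSchemeCheck, isWebHdfs) are
hypothetical and are not existing Flume code. The scheme is read by hand
rather than with java.net.URI because a configured sink path may contain
escape sequences such as %Y, which a strict URI parser rejects.

```java
public class WebHdfsSchemeCheck {

    // Hypothetical helper, not part of Flume: returns true when the
    // configured path uses the "webhdfs" scheme, i.e. when hsync() cannot
    // be relied on and data is only known to be durable after file close.
    public static boolean isWebHdfs(String path) {
        int sep = path.indexOf("://");
        if (sep <= 0) {
            // No scheme present, e.g. a bare "/flume/events" path.
            return false;
        }
        // URI schemes are case-insensitive.
        return path.substring(0, sep).equalsIgnoreCase("webhdfs");
    }

    public static void main(String[] args) {
        // Example paths are made up for illustration.
        System.out.println(isWebHdfs("webhdfs://gateway.example.com:14000/flume/%Y-%m-%d"));
        System.out.println(isWebHdfs("hdfs://namenode.example.com:8020/flume/events"));
    }
}
```

A sink could call such a check once at configuration time and, when it returns
true, defer committing the transaction until the file has been rolled and
closed.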
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)