[ 
https://issues.apache.org/jira/browse/STORM-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929183#comment-15929183
 ] 

Roshan Naik edited comment on STORM-2355 at 3/17/17 9:23 PM:
-------------------------------------------------------------

First of all... thanks for your work on this.

I took a look at the HDFS inotify side of things and also spoke to an HDFS 
committer.  These over-arching concerns came up (partly alluded to earlier by 
you):

1. Currently inotify is restricted to HDFS admins because it doesn't scale (wrt 
the namenode). So in its current state it seems unsuitable for the HDFS Spout 
kind of use case... even if we (unrealistically) asked users to run Storm 
workers as an HDFS admin user.
2. The proposal for scaling inotify and opening it up to end users appears to 
have stalled for some time now. 

Although we are seeing some improvements (that you noted) in resource 
utilization from the Storm side, it does not seem advisable from the HDFS 
namenode perspective. I think this feature in HDFS Spout would be useful once a 
scalable inotify solution is made publicly available by HDFS.

The other option is to get this into Storm now and not use it till HDFS 
implements their scalable inotify. My concern with that is we can't bet with 
certainty that the final inotify will still work as we now expect it to 
(although the intent is there) ... it may even change in an incompatible way. 

Either way it appears to be a feature that can't be used until the scalable 
inotify happens (if it happens).


was (Author: roshan_naik):
First of all... thanks for your work on this.

I took a look at the HDFS inotify side of things and also spoke to an HDFS 
committer.  These over-arching concerns came up (partly alluded to earlier by 
you):

1. Currently inotify is restricted to HDFS admins because it doesn't scale (wrt 
the namenode). So in its current state it seems unsuitable for the HDFS Spout 
kind of use case... even if we (unrealistically) asked users to run Storm 
workers as an HDFS admin user.
2. The proposal for scaling inotify and opening it up to end users appears to 
have stalled for some time now. 

Although we are seeing some improvements (that you noted) in resource 
utilization from the Storm side, it does not seem advisable from the HDFS 
namenode perspective. I think this feature in HDFS would be useful once a 
scalable inotify solution is made publicly available by HDFS.

The other option is to get this into Storm now and not use it till HDFS 
implements their scalable inotify. My concern with that is we can't bet with 
certainty that the final inotify will still work as we now expect it to 
(although the intent is there) ... it may even change in an incompatible way. 

Either way it appears to be a feature that can't be used until the scalable 
inotify happens (if it happens).

> Storm-HDFS: inotify support
> ---------------------------
>
>                 Key: STORM-2355
>                 URL: https://issues.apache.org/jira/browse/STORM-2355
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-hdfs
>            Reporter: Tibor Kiss
>            Assignee: Tibor Kiss
>             Fix For: 2.0.0, 1.1.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> This is a proposal to implement inotify based watch dir monitoring in 
> Storm-HDFS Spout.
> *Motivation*
> Storm-HDFS's HdfsSpout currently polls the Spout’s input directory using 
> Hadoop's {{FileSystem.listFiles()}}. This operation is expensive since it 
> returns the block locations and all stat information of the files inside the 
> watch directory. Moreover, HdfsSpout currently uses only one element of the 
> returned Path list, which is inefficient as the rest of the entries are thrown 
> away without processing.
> The proposed design provides greater efficiency through the inotify interface 
> and also enables easier extension of the original ({{listFiles()}} based) 
> monitoring with buffering (see Further work section below). 
> *High level design*
> The goal is to leverage the [HDFS inotify 
> API|http://hadoop.apache.org/docs/current/api//org/apache/hadoop/hdfs/DFSInotifyEventInputStream.html]
>  to monitor new file arrivals in HdfsSpout's input directory.
> The inotify based monitoring is an addition to the original 
> {{FileSystem.listFiles()}} based implementation; the default behavior of the 
> spout is unchanged by this modification.
> To unify the two monitoring methods and enable buffering, an iterator based 
> class ({{HdfsDirectoryMonitor}}) is created.
> To retain backward compatibility, the HdfsSpout's default monitoring behavior 
> is unchanged; inotify based monitoring can be enabled through a parameter.
> As inotify requires administrative privileges (see Caveat section below), a 
> fallback mechanism is implemented in HdfsSpout to use the original 
> {{listFiles()}} based monitoring if initialization fails for inotify based 
> monitoring.
> *Implementation details*
> As inotify provides only a delta of the filesystem events from a given Tx Id 
> (of the HDFS Edit Log), a {{FileSystem.listFiles()}} based collection is 
> required during the Spout's initialization to ensure that any leftover 
> files are processed.
> The inotify based implementation uses HdfsAdmin's 
> [{{DFSInotifyEventInputStream.poll()}}|http://hadoop.apache.org/docs/current/api//org/apache/hadoop/hdfs/DFSInotifyEventInputStream.html#poll--]
>  method to fetch and buffer the list of new files created since the provided 
> Tx Id to {{newFileList}} buffer.
> During each {{HdfsSpout.nextTuple()}} call, one element is taken from the 
> {{newFileList}} buffer and processed by the spout.
> The {{newFileList}} buffer is extended with the result of the 
> {{DFSInotifyEventInputStream.poll(lastTxId)}} call in every nextTuple() call.
> Since HdfsSpout is able to create its own {{HdfsAdmin()}} instance, there 
> will be no need for the user to do additional initialization for the spout 
> even if inotify is enabled.
> *Caveat*
> HDFS inotify is currently available to the HDFS administrator user only, but 
> there is ongoing discussion in the Hadoop community to extend its support to 
> users. See: HDFS-8940 
> *Further work*
> 1) The number of calls to {{DFSInotifyEventInputStream.poll(lastTxId)}} could 
> be further reduced if the locking directory is moved away from the input 
> directory. With the current design, updates on the lock dir are also included 
> in the {{newFileList}} buffer, hence the buffer will never get completely 
> empty.
> 2) The original {{listFiles()}} based solution could be improved through 
> {{HdfsDirectoryMonitor}} to buffer and use all the returned items from the 
> work directory, similarly to inotify based monitoring. Such an improvement 
> would reduce the number of calls made to the namenode. 
> These improvements are currently not part of this ticket.
> *Error scenarios*
>  - The HdfsAdmin instance cannot be created (e.g. lack of privileges):
>    The spout falls back to the original {{listFiles()}} based method.
>  - Namenode's edit log is not yet open for write during {{HdfsSpout.open()}}:
>    The initialization will be postponed to the {{HdfsSpout.nextTuple()}} 
> call(s).
>  - HDFS gets disconnected while the topology is running:
>    HdfsSpout reports an error and retries on the next nextTuple() call. 
>    No data will be skipped as the update will be requested from the last 
> known Tx Id.
>  
> *Testing related changes*
> The {{TestHdfsSpout}} testcase should be parametrized to check both the 
> poll and inotify based solutions.
> Additional testcases are added to ensure that inotify is able to pick up any 
> leftover files.
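
For readers following the design in the quoted description, here is a minimal sketch of the buffering and fallback logic it describes. It is not the code from the patch. It assumes plain Java with no Hadoop dependency, and every type in it (DirectoryLister, CreateEventSource, EventSourceFactory, EventBatch, BufferedDirectoryMonitor) is a hypothetical stand-in: the lister plays the role of a listFiles() style scan and the event source plays the role of an inotify style poll from a given Tx Id. The sketch also assumes the caller moves or locks processed files, so a repeated listing in fallback mode does not hand out the same file forever.

```java
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a listFiles() style scan of the input directory. */
interface DirectoryLister {
    List<String> list();
}

/** Hypothetical stand-in for an inotify style stream of file-creation events. */
interface CreateEventSource {
    /** Returns the paths created after lastTxId; the batch may be empty. */
    EventBatch poll(long lastTxId);
}

/** Hypothetical factory; open() may fail, e.g. when admin privileges are missing. */
interface EventSourceFactory {
    CreateEventSource open();
}

/** Newly created paths plus the highest Tx Id covered by the batch. */
final class EventBatch {
    final long txId;
    final List<String> createdPaths;

    EventBatch(long txId, List<String> createdPaths) {
        this.txId = txId;
        this.createdPaths = createdPaths;
    }
}

final class BufferedDirectoryMonitor {
    private final Deque<String> newFileList = new ArrayDeque<>();
    private final DirectoryLister lister;
    private final CreateEventSource events; // null means: fall back to listing
    private long lastTxId;

    private BufferedDirectoryMonitor(DirectoryLister lister, CreateEventSource events,
                                     long startTxId) {
        this.lister = lister;
        this.events = events;
        this.lastTxId = startTxId;
        // Events only describe changes after startTxId, so an initial listing
        // is needed to pick up files that were already in the directory.
        newFileList.addAll(lister.list());
    }

    /** Tries the event source first and falls back to listing if it cannot be opened. */
    static BufferedDirectoryMonitor create(DirectoryLister lister, EventSourceFactory factory,
                                           long startTxId) {
        CreateEventSource events;
        try {
            events = factory.open();
        } catch (UncheckedIOException e) {
            events = null; // e.g. lack of privileges: use the listing based path
        }
        return new BufferedDirectoryMonitor(lister, events, startTxId);
    }

    boolean usesEvents() {
        return events != null;
    }

    /** Returns the next file to process, or null if none is known right now. */
    String next() {
        if (events != null) {
            // If poll() throws, lastTxId is left untouched, so the next call
            // asks again from the last known Tx Id and no event is skipped.
            EventBatch batch = events.poll(lastTxId);
            newFileList.addAll(batch.createdPaths);
            lastTxId = Math.max(lastTxId, batch.txId);
        } else if (newFileList.isEmpty()) {
            // Fallback: re-list only when the buffer has run dry.
            newFileList.addAll(lister.list());
        }
        return newFileList.poll();
    }
}
```

In this sketch a spout would call create() once when it opens and next() from each nextTuple() call. Re-listing only when the buffer is empty corresponds to item 2 under Further work, which the description says is not part of the ticket.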



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
