[ 
https://issues.apache.org/jira/browse/FLINK-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229754#comment-14229754
 ] 

ASF GitHub Bot commented on FLINK-1081:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/incubator-flink/pull/226#issuecomment-65061398
  
    I think it's okay to do it this way. I forgot that Avro's 
FSDataInputStreamWrapper implements an Avro-specific interface.
    
    Sorry for asking so many questions. Your code is very well written; 
I'm only asking to get a better understanding of your changes.
    
    Have you tested the code on a cluster with HDFS? I was wondering what 
happens if you run the code in a distributed setup, with the source 
running multiple times in the cluster.
    If the source runs `n` times in the cluster, all `n` instances will 
"see" the new or updated file and start to process it, so the same file 
would be processed `n` times.
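
    To make the concern concrete, here is a minimal, hypothetical sketch. It is 
not code from the pull request and does not use any Flink API. The class name, 
the method names, and the example path are assumptions made up for illustration. 
It counts how many of `n` parallel instances would pick up one file when every 
instance accepts every file, and compares that with one possible alternative in 
which each file is assigned to exactly one instance by hashing its path.

```java
// Hypothetical illustration only: none of these names exist in Flink.
class FileSplitAssignment {

    // Naive behaviour: every parallel source instance accepts every file it sees.
    static boolean naiveAccepts(String path, int instanceIndex, int parallelism) {
        return true;
    }

    // One possible alternative (an assumption, not the PR's approach):
    // a file belongs to exactly one instance, chosen by hashing its path.
    static boolean hashedAccepts(String path, int instanceIndex, int parallelism) {
        // floorMod keeps the result in [0, parallelism) even for negative hash codes.
        return Math.floorMod(path.hashCode(), parallelism) == instanceIndex;
    }

    // Counts how many of the parallel instances would process the given file.
    static int countProcessors(String path, int parallelism, boolean hashed) {
        int count = 0;
        for (int i = 0; i < parallelism; i++) {
            boolean accepts = hashed
                    ? hashedAccepts(path, i, parallelism)
                    : naiveAccepts(path, i, parallelism);
            if (accepts) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String path = "hdfs:///data/in/part-0001.avro"; // made-up example path
        int n = 4;
        System.out.println("naive:  " + countProcessors(path, n, false)); // prints "naive:  4"
        System.out.println("hashed: " + countProcessors(path, n, true));  // prints "hashed: 1"
    }
}
```

    With the naive check the file is handled once per instance; with the hashed 
check exactly one instance handles it, whatever the value of `n`.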


> Add HDFS file-stream source for streaming
> -----------------------------------------
>
>                 Key: FLINK-1081
>                 URL: https://issues.apache.org/jira/browse/FLINK-1081
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 0.7.0-incubating
>            Reporter: Gyula Fora
>            Assignee: Chiwan Park
>              Labels: starter
>
> Add a data stream source that will monitor a selected directory on HDFS (or 
> other filesystems as well) and will process all newly created files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
