Hi,

On Mon, Aug 25, 2014 at 9:56 AM, Dean Chen <deanch...@gmail.com> wrote:

>  We are using HDFS for log storage where logs are flushed to HDFS every
> minute, with a new file created for each hour. We would like to consume
> these logs using spark streaming.
>
> The docs state that new HDFS files will be picked up, but does Spark
> Streaming support HDFS appends?
>
>

I don't think so. The docs at
http://spark.apache.org/docs/1.0.0/api/scala/index.html#org.apache.spark.streaming.StreamingContext
say that even for new files, "Files must be written to the monitored
directory by 'moving' them from another location within the same file
system." So appending to your existing files won't be picked up; you would
have to write each completed file elsewhere and then move it into the
monitored directory.

Tobias
