I'm not sure exactly what you are trying to do, but the HDFS sink already
appends: it keeps writing to the current file until a roll-over is
triggered. You just have to decide what your roll-over strategy will be.
Instead of rolling every few minutes, you can set hdfs.rollInterval=0
(which disables time-based rolling) and set hdfs.rollSize to however large,
in bytes, you want a file to get before the sink rolls over to a new one.
You can also use hdfs.rollCount to roll over after a certain number of
events. I use rollSize for my roll-over strategy.
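
As a rough sketch, a size-based roll-over could be configured something
like this. The agent name (a1), sink name (k1), the path, and the 128 MB
threshold are placeholders I made up for illustration, not anything from
your setup. hdfs.rollCount is set to 0 here as well because, as far as I
know, the sink otherwise also rolls after its default of 10 events.

```properties
# Hypothetical agent "a1" with an HDFS sink "k1"; names and path are placeholders.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events

# 0 disables time-based rolling (the default is every 30 seconds).
a1.sinks.k1.hdfs.rollInterval = 0

# Roll once the current file reaches about 128 MB (value is in bytes).
a1.sinks.k1.hdfs.rollSize = 134217728

# 0 disables event-count-based rolling (the default is 10 events).
a1.sinks.k1.hdfs.rollCount = 0
```

With all three set this way, file size should be the only roll trigger, so
pick rollSize with your HDFS block size in mind.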


On Tue, Apr 8, 2014 at 8:35 PM, Pritchard, Charles X. -ND <
[email protected]> wrote:

> Exploring the idea of using "append" instead of creating new files with
> HDFS every few minutes.
> Are there particular design decisions / considerations?
>
> There's certainly a history of append with HDFS, mainly, earlier versions
> of Hadoop warn strongly against using file append semantics.
>
>
> -Charles
