Spark Streaming: is writing results to S3 a good idea?

Andy Davidson Tue, 23 Feb 2016 17:28:34 -0800

Currently our streaming apps write their results to HDFS. We are running
into problems with HDFS becoming corrupted and running out of space. It
seems like a better solution might be to write the results directly to S3.
Is this a good idea?


We plan to continue writing our checkpoints to HDFS.
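
For illustration only, the split described above (results to S3, checkpoints
on HDFS) could look roughly like the sketch below. Everything named here is a
hypothetical placeholder and not our real setup: the bucket, the NameNode
address, the app name, and the helper functions. The s3n:// scheme is an
assumption based on the Hadoop 1.x native S3 filesystem. The `ssc` and
`dstream` arguments are assumed to be an existing PySpark StreamingContext and
DStream.

```python
from datetime import datetime

# Hypothetical placeholders, not taken from a real deployment.
RESULTS_BUCKET = "example-results-bucket"
CHECKPOINT_DIR = "hdfs://namenode.example:8020/spark/checkpoints"


def batch_output_path(bucket, app_name, batch_time, scheme="s3n"):
    """Build a distinct output directory for each batch time."""
    stamp = batch_time.strftime("%Y/%m/%d/%H%M%S")
    return "{}://{}/{}/{}".format(scheme, bucket, app_name, stamp)


def attach_output(ssc, dstream, app_name="stream-app"):
    """Keep checkpoints on HDFS and write each non-empty batch to S3."""
    ssc.checkpoint(CHECKPOINT_DIR)

    def save(batch_time, rdd):
        # PySpark passes the batch time as a datetime when the callback
        # takes two arguments.
        if not rdd.isEmpty():
            rdd.saveAsTextFile(
                batch_output_path(RESULTS_BUCKET, app_name, batch_time))

    dstream.foreachRDD(save)


if __name__ == "__main__":
    # Show the path one batch would be written to.
    print(batch_output_path(RESULTS_BUCKET, "stream-app",
                            datetime(2016, 2, 23, 17, 28, 34)))
```

Giving every batch its own directory means a later batch never tries to write
into a path that already exists.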

Are there any issues to be aware of, such as performance or anything else
we should watch out for?

This is our first S3 project. Does storage just grow on demand?

Kind regards

Andy


P.S. It turns out we are using an old version of Hadoop (v1.0.4).
