Currently our stream apps write results to HDFS. We are running into problems with HDFS becoming corrupted and running out of space. It seems like a better solution might be to write directly to S3. Is this a good idea?
We plan to continue to write our checkpoints to HDFS. Are there any issues to be aware of? Maybe performance or something else to watch out for? This is our first S3 project. Does storage just grow on demand?

Kind regards,
Andy

P.S. It turns out we are using an old version of Hadoop (v1.0.4).