Data has to live somewhere -- how do you store more data without adding storage? Alluxio is not persistent storage, and S3 isn't on your premises.
On Sun, Feb 12, 2017 at 4:29 AM Benjamin Kim <bbuil...@gmail.com> wrote:
> Has anyone got some advice on how to remove the reliance on HDFS for
> storing persistent data. We have an on-premise Spark cluster. It seems like
> a waste of resources to keep adding nodes because of a lack of storage
> space only. I would rather add more powerful nodes due to the lack of
> processing power at a less frequent rate, than add less powerful nodes at a
> more frequent rate just to handle the ever growing data. Can anyone point
> me in the right direction? Is Alluxio a good solution? S3? I would like to
> hear your thoughts.
>
> Cheers,
> Ben