Data has to live somewhere. How do you store more data without adding
storage? Alluxio is not persistent storage, and S3 isn't on your premises.

On Sun, Feb 12, 2017 at 4:29 AM Benjamin Kim <bbuil...@gmail.com> wrote:

> Has anyone got some advice on how to remove the reliance on HDFS for
> storing persistent data? We have an on-premise Spark cluster. It seems like
> a waste of resources to keep adding nodes because of a lack of storage
> space only. I would rather add more powerful nodes due to the lack of
> processing power at a less frequent rate, than add less powerful nodes at a
> more frequent rate just to handle the ever growing data. Can anyone point
> me in the right direction? Is Alluxio a good solution? S3? I would like to
> hear your thoughts.
>
> Cheers,
> Ben
