Re: Remove dependence on HDFS

2017-02-13 Thread Calvin Jia
Hi Ben, you can replace HDFS with a number of storage systems, since Spark is compatible with other storage such as S3. This would allow you to scale your compute nodes solely for the purpose of adding compute power and not disk space. You can deploy Alluxio on your compute nodes to offset the

Re: Remove dependence on HDFS

2017-02-13 Thread Saisai Shao
IIUC Spark doesn't strongly bind to HDFS. It uses a common FileSystem layer which supports different FS implementations, and HDFS is just one option. You could also use S3 as a backend FS; from Spark's point of view, the choice of FS implementation is transparent. On Sun, Feb 12, 2017 at 5:32 PM, ayan
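To make the point above concrete, here is a minimal Python sketch, not taken from the thread. It assumes the Hadoop S3A connector (the hadoop-aws module) is on Spark's classpath. The helper names `s3a_spark_conf`, `uri_scheme`, and `read_text`, and all hosts and bucket names, are hypothetical. The idea is that the URI scheme of a path selects the FileSystem implementation, while the Spark read call itself stays the same.

```python
from urllib.parse import urlparse


def s3a_spark_conf(endpoint, access_key, secret_key):
    """Build Spark settings for Hadoop's S3A connector (hypothetical helper).

    Keys prefixed with "spark.hadoop." are copied by Spark into the Hadoop
    Configuration that the FileSystem layer reads.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


def uri_scheme(path):
    """Return the URI scheme of a path (hypothetical helper).

    The scheme ("hdfs", "s3a", "file", ...) is what the FileSystem layer
    uses to pick an implementation. An empty string means the path has no
    scheme and would resolve against the configured default filesystem.
    """
    return urlparse(path).scheme


def read_text(spark, path):
    """Read text files through an existing SparkSession (hypothetical helper).

    The call is identical for every backend; only the path's scheme differs.
    """
    return spark.read.text(path)
```

With a running SparkSession, the settings from `s3a_spark_conf` could be passed to the session builder, after which `read_text(spark, "hdfs://namenode:8020/data/events")` and `read_text(spark, "s3a://example-bucket/data/events")` would use the same code path from the application's side.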

Re: Remove dependence on HDFS

2017-02-12 Thread ayan guha
How about adding more NFS storage?
On Sun, 12 Feb 2017 at 8:14 pm, Sean Owen wrote:
> Data has to live somewhere -- how do you not add storage but store more
> data? Alluxio is not persistent storage, and S3 isn't on your premises.
>
> On Sun, Feb 12, 2017 at 4:29 AM

Re: Remove dependence on HDFS

2017-02-12 Thread Sean Owen
Data has to live somewhere -- how do you not add storage but store more data? Alluxio is not persistent storage, and S3 isn't on your premises.
On Sun, Feb 12, 2017 at 4:29 AM Benjamin Kim wrote:
> Has anyone got some advice on how to remove the reliance on HDFS for
>

Re: Remove dependence on HDFS

2017-02-12 Thread Jörn Franke
You have to carefully consider whether your strategy makes sense given your users' workloads. Hence, I am not sure your reasoning makes sense. However, you can, for example, install OpenStack Swift as an object store and use this as storage. HDFS in this case can be used as a temporary store

Remove dependence on HDFS

2017-02-11 Thread Benjamin Kim
Has anyone got some advice on how to remove the reliance on HDFS for storing persistent data? We have an on-premises Spark cluster. It seems like a waste of resources to keep adding nodes only because of a lack of storage space. I would rather add more powerful nodes due to the lack of