Spark and HDFS ( Worker and Data Nodes Combination )

2015-06-22 Thread Ashish Soni

Hi All, What is the best way to install a Spark cluster alongside a Hadoop cluster? Any recommendation for the deployment topology below would be a great help. Also, is it necessary to put the Spark Worker on the DataNodes, since when it reads a block from HDFS it will be local to the server / Worker or

Re: Spark and HDFS ( Worker and Data Nodes Combination )

2015-06-22 Thread Akhil Das

Option 1 should be fine; Option 2 would be heavily bound by the network as the data grows over time. Thanks Best Regards On Mon, Jun 22, 2015 at 5:59 PM, Ashish Soni asoni.le...@gmail.com wrote: Hi All, What is the best way to install a Spark cluster alongside a Hadoop cluster? Any

Re: Spark and HDFS ( Worker and Data Nodes Combination )

2015-06-22 Thread ayan guha

I have a basic question: how does Spark assign partitions to executors? Does it respect data locality? Does this behaviour depend on the cluster manager, i.e. YARN vs. standalone? On 22 Jun 2015 22:45, Akhil Das ak...@sigmoidanalytics.com wrote: Option 1 should be fine; Option 2 would be heavily bound by the network as