Option 1 should be fine; Option 2 would become increasingly network-bound as the data grows over time, since the Spark Workers would have to pull every HDFS block over the network instead of reading it from the local DataNode.
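For Option 1, here is a minimal standalone-deployment sketch. The hostnames, the example.com domain and $SPARK_HOME are placeholders, and it assumes passwordless SSH from the master to each worker machine:

    # conf/slaves on Server 1 -- one Spark Worker hostname per line,
    # matching the DataNode machines so HDFS reads stay node-local
    server2.example.com
    server3.example.com
    server4.example.com

    # then, still on Server 1, start the master and all listed workers:
    $SPARK_HOME/sbin/start-all.sh

    # the workers register at spark://server1.example.com:7077, and tasks
    # that read HDFS blocks should mostly run at NODE_LOCAL locality

With Option 2, the same job would show mostly RACK_LOCAL / ANY tasks in the UI, which is exactly the network cost described above.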
Thanks,
Best Regards

On Mon, Jun 22, 2015 at 5:59 PM, Ashish Soni <asoni.le...@gmail.com> wrote:

> Hi All,
>
> What is the best way to install a Spark cluster alongside a Hadoop
> cluster? Any recommendation for the deployment topologies below would be
> a great help.
>
> *Also, is it necessary to put the Spark Workers on the DataNodes, so that
> blocks read from HDFS are local to the server/worker, or can I put the
> Workers on other nodes, and if I do, will that affect the performance of
> the Spark data processing?*
>
> Hadoop Option 1
>
> Server 1 - NameNode & Spark Master
> Server 2 - DataNode 1 & Spark Worker
> Server 3 - DataNode 2 & Spark Worker
> Server 4 - DataNode 3 & Spark Worker
>
> Hadoop Option 2
>
> Server 1 - NameNode
> Server 2 - Spark Master
> Server 3 - DataNode 1
> Server 4 - DataNode 2
> Server 5 - DataNode 3
> Server 6 - Spark Worker 1
> Server 7 - Spark Worker 2
> Server 8 - Spark Worker 3
>
> Thanks.