Re: Spark Master on Hadoop Job Tracker?

2014-01-21 Thread mharwida
Many thanks for the replies. The way I currently have my setup is as follows: 6 nodes running Hadoop, with each node holding approximately 5 GB of data. I launched a Spark master (and Shark via ./shark) on one of the Hadoop nodes and launched 5 Spark worker nodes on the remaining 5 Hadoop nodes. So
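
A minimal sketch of how a standalone cluster like this is typically brought up with the scripts that ship with Spark (the hostnames are hypothetical, and script names and locations have shifted between Spark versions, e.g. bin/ vs sbin/, start-slave vs start-worker):

    # on the node chosen as master (say hadoop-node1)
    ./sbin/start-master.sh                        # master listens on spark://hadoop-node1:7077 by default

    # on each of the other 5 Hadoop nodes, attach a worker to that master
    ./sbin/start-slave.sh spark://hadoop-node1:7077

Shark would then be pointed at the same spark://hadoop-node1:7077 master URL.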

RE: Spark Master on Hadoop Job Tracker?

2014-01-20 Thread Liu, Raymond
Not sure what you aim to solve. When you mention Spark Master, I guess you probably mean Spark standalone mode? In that case the Spark cluster is not necessarily coupled with the Hadoop cluster. While if you aim to achieve better data locality, then yes, running a Spark worker on each HDFS data node might
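
One common way to co-locate workers with the HDFS data, sketched here under the assumption that the standalone deploy scripts and a conf/slaves file are used (hostnames are hypothetical), is to list the data nodes as worker hosts and let the master node start everything over ssh:

    # conf/slaves on the master node: one HDFS data node hostname per line
    hadoop-node2
    hadoop-node3
    hadoop-node4
    hadoop-node5
    hadoop-node6

    # then, from the master node
    ./sbin/start-all.sh    # starts the master here and one worker on each host listed above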

Re: Spark Master on Hadoop Job Tracker?

2014-01-20 Thread Nick Pentreath
If you intend to run Hadoop MapReduce and Spark on the same cluster concurrently, and you have enough memory on the JobTracker master, then you can run the Spark master (for standalone mode, as Raymond mentions) on the same node. This is not necessary but more for convenience, so you only have to ssh
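
A hedged sketch of what that might look like in conf/spark-env.sh on the JobTracker host (the hadoop-jt hostname and the memory figures are assumptions; the variable names are from the standalone docs of that era):

    # conf/spark-env.sh on the JobTracker node
    export SPARK_MASTER_IP=hadoop-jt       # standalone master binds to the JobTracker host
    export SPARK_DAEMON_MEMORY=1g          # keep the master daemon's footprint small next to the JobTracker
    export SPARK_WORKER_MEMORY=4g          # per-node memory the workers may hand to applications

    # a single ssh to hadoop-jt then controls the whole Spark cluster
    ./sbin/start-all.sh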