Running Spark alongside Hadoop

2014-06-20 Thread Sameer Tilak
Dear Spark users, I have a small 4-node Hadoop cluster. Each node is a VM with 4 virtual cores, 8 GB of memory, and 500 GB of disk. I am currently running Hadoop on it, and I would like to run Spark (in standalone mode) alongside Hadoop on the same nodes. Given the configuration of my nodes, will that work?
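One way to make the two coexist on 8 GB nodes is to cap what the standalone Spark workers may use, leaving headroom for the Hadoop daemons. A minimal sketch of `conf/spark-env.sh` follows; the variable names are the standard Spark standalone settings, but the specific values are illustrative assumptions, not from the thread:

```shell
# conf/spark-env.sh -- cap the standalone worker so Hadoop daemons keep
# headroom on an 8 GB / 4-core VM (values are illustrative: roughly half
# the node for Spark, half for Hadoop and the OS).
SPARK_WORKER_CORES=2      # leave 2 cores for HDFS/MapReduce daemons
SPARK_WORKER_MEMORY=3g    # leave ~5 GB for Hadoop daemons and the OS
SPARK_DAEMON_MEMORY=512m  # the master/worker daemons themselves stay small
```

Note that `SPARK_WORKER_MEMORY` only bounds what executors on that node may claim in total; each application still requests its own executor memory at submit time.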

Re: Running Spark alongside Hadoop

2014-06-20 Thread Mayur Rustagi
The ideal way to do that is to use a cluster manager like YARN or Mesos; you can then control how many resources to give to each framework on each node. You should be able to run both together in standalone mode as well, but you may experience variable latency and performance in the cluster as both MapReduce and Spark compete for resources.
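Under YARN, the resource caps move from worker configuration to the submit command, and the scheduler arbitrates between Spark executors and MapReduce containers. A hedged sketch, where the resource sizes, the class name `com.example.MyApp`, and `myapp.jar` are all hypothetical placeholders:

```shell
# Illustrative: submit a Spark app to YARN with explicit resource caps,
# so YARN can arbitrate between Spark and MapReduce containers.
# com.example.MyApp and myapp.jar are placeholders, and the executor
# sizes assume the 8 GB VMs described in the question.
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 2g \
  --class com.example.MyApp \
  myapp.jar
```

(In the Spark versions current at the time of this thread, the master was spelled `yarn-client` or `yarn-cluster` rather than plain `yarn`.)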

Re: Running Spark alongside Hadoop

2014-06-20 Thread Koert Kuipers
For development/testing I think it's fine to run them side by side as you suggested, using Spark standalone. Just be realistic about what size of data you can load with limited RAM.
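For that kind of development use, the per-application footprint can also be kept small at launch time. A sketch, assuming a standalone master on a host here called `master-host` (hypothetical) and deliberately modest sizes for a shared 8 GB node:

```shell
# Illustrative: an interactive shell sized for development on a shared
# node -- small driver and executor footprints so Hadoop keeps running
# comfortably alongside. master-host is a placeholder.
spark-shell \
  --master spark://master-host:7077 \
  --driver-memory 1g \
  --executor-memory 1g \
  --total-executor-cores 4
```

With roughly 1 GB of executor memory per node, only a fraction of that is available for caching RDDs, which is the practical limit on "what size data you can load" that the reply refers to.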

Re: Running Spark alongside Hadoop

2014-06-20 Thread Ognen Duzlevski
I only ran HDFS on the same nodes as Spark, and that worked out great performance- and robustness-wise. However, I did not run Hadoop MapReduce computations/jobs on the same nodes. My expectation is that if you actually ran both at the same time with your configuration, the performance would suffer.