Thanks Matei.
On Tue, Jul 15, 2014 at 11:47 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
Yup, as mentioned in the FAQ, we are aware of multiple deployments running
jobs on over 1000 nodes. Some of our proofs of concept involved people
running a 2000-node job on EC2.
I wouldn't confuse
Hello Folks:
There is a lot of buzz in the Hadoop community around Spark's inability to
scale beyond 1 TB datasets (or 10-20 nodes). It is being regarded as
great tech for CPU-intensive workloads on smaller data (less than 1 TB) but
fails to scale and perform effectively on larger datasets. How
Hi Rohit,
I think the 3rd question on the FAQ may help you.
https://spark.apache.org/faq.html
Some other links that talk about building bigger clusters and processing
more data:
http://spark-summit.org/wp-content/uploads/2014/07/Building-1000-node-Spark-Cluster-on-EMR.pdf