Yup, as mentioned in the FAQ, we are aware of multiple deployments running jobs 
on over 1000 nodes. Some of our proofs of concept involved people running a 
2000-node job on EC2.

I wouldn't confuse buzz with FUD :).

Matei

On Jul 15, 2014, at 9:17 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

> Hi Rohit,
> 
> I think the 3rd question on the FAQ may help you.
> 
> https://spark.apache.org/faq.html
> 
> Some other links that talk about building bigger clusters and processing more 
> data: 
> 
> http://spark-summit.org/wp-content/uploads/2014/07/Building-1000-node-Spark-Cluster-on-EMR.pdf
> http://apache-spark-user-list.1001560.n3.nabble.com/Largest-Spark-Cluster-td3782.html
> 
> 
> 
> Best Regards,
> Sonal
> Nube Technologies 
> 
> On Wed, Jul 16, 2014 at 9:17 AM, Rohit Pujari <rpuj...@hortonworks.com> wrote:
> Hello Folks: 
> 
> There is a lot of buzz in the Hadoop community around Spark's inability to 
> scale beyond 1 TB datasets (or 10-20 nodes). It is being regarded as 
> great tech for CPU-intensive workloads on smaller data (less than a TB) but 
> fails to scale and perform effectively on larger datasets. How true is it?
> 
> Are there any customers who are running petabyte-scale workloads on Spark 
> in production? Are there any benchmarks performed by Databricks or other 
> companies to clear up this perception?
> 
> I'm a big fan of Spark. Knowing Spark is in its early stages, I'd like to 
> better understand the boundaries of the tech and recommend the right solution 
> for the right problem.
> 
> Thanks,
> Rohit Pujari
> Solutions Engineer, Hortonworks
> rpuj...@hortonworks.com
> 716-430-6899
