TeraSort on Spark

Rivera, Dario Mon, 18 Nov 2013 06:55:41 -0800

Hello Spark community,

I wanted to ask whether any work has been done on porting TeraSort
(TeraGen/TeraSort/TeraValidate) from Hadoop to Spark on EC2/EMR.

I am looking for guidance on lessons learned from this or similar efforts.
We are benchmarking some of the newer EC2 instances to determine how best to
optimize in-memory processing with Spark on these instances, for AWS customers
looking to move their data processing workloads to Spark.


Any guidance the community can provide on this effort would be greatly appreciated!

Thanks,

Dario Rivera
Solutions Architect
Cell: 571-205-2731
Email: dar...@amazon.com

