Hi Jonathan,

You could try BigDataBench: http://prof.ict.ac.cn/BigDataBench/ . It provides a wide range of workloads, including both Hadoop-based and Spark-based ones.
Zhen Jia

hodgesz wrote
> Hi Spark Experts,
>
> I am curious what people are using to benchmark their Spark clusters. We
> are about to start a build (bare metal) vs. buy (AWS/Google Cloud/Qubole)
> project to determine our Hadoop and Spark deployment selection. On the
> Hadoop side we will test live workloads as well as simulated ones with
> frameworks like TestDFSIO, TeraSort, MRBench, GridMix, etc.
>
> Do any equivalent benchmarking frameworks exist for Spark? A quick Google
> search yielded https://github.com/databricks/spark-perf which looks pretty
> interesting. It would be great to hear what others are doing here.
>
> Thanks for the help!
>
> Jonathan