Hi Mike,

This project contains some small synthetic benchmarks: 
https://github.com/amplab/spark-perf. Otherwise, for ML algorithms, look in 
mllib -- it comes with driver programs for K-means, logistic regression, matrix 
factorization, etc., as well as data generators for them.

Matei

On Aug 23, 2013, at 5:12 PM, Mike <[email protected]> wrote:

> I'm looking to put together some representative tests for Spark.  Where 
> can I find such data and code?  There must be some already existing.  
> Some tests (logistic regression, k-means, PageRank) are mentioned in the 
> RDD paper, for example.
