Hi,

I am training a RandomForest regression model on Spark 1.6.1 (EMR) and am
interested in how best to scale it - e.g. more CPUs per instance, more
memory per instance, more instances, etc.

I'm currently using 32 m3.xlarge instances for a training set with 2.5
million rows, 1,300 columns, and a total size of 31 GB (Parquet).
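
For context, the job is roughly the sketch below (spark.ml API; the S3
path, column names, and tree parameters are placeholders, and the
"features" column is assumed to be a vector already assembled from the
~1,300 raw columns):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.ml.regression.RandomForestRegressor

object TrainRF {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rf-regression"))
    val sqlContext = new SQLContext(sc)

    // Placeholder input path; the real set is ~31 GB of Parquet
    val training = sqlContext.read.parquet("s3://my-bucket/training.parquet")

    // Placeholder tree parameters; "features" is assumed to be a
    // Vector column (e.g. built with VectorAssembler)
    val rf = new RandomForestRegressor()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setNumTrees(100)
      .setMaxDepth(10)
      .setMaxBins(32)

    val model = rf.fit(training)
    sc.stop()
  }
}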

Thanks,

-- 
Franc
