nce tuning:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
mediate input size or avoid shuffling of data between stages. In my
case I'm basically using an "out-of-the-box" algorithm, which is written by
ML experts and *should* already be well tuned in this regard. My own code
that outputs the GBT model to S3 should take a trivial amount of time to r