Re: MLLib: LinearRegressionWithSGD performance

2014-11-24 Thread Yanbo
The metrics page shows that only two executors are working in parallel during each iteration, so you need to increase the degree of parallelism. Some tips that may help: increase spark.default.parallelism; use repartition() (or coalesce() with shuffle=true, since a plain coalesce() can only reduce the partition count) to increase the number of partitions. On Nov 22, 2014, at 3:18 AM, Sameer
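
The partitioning tip above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `suggested_partitions` and `ensure_min_partitions` are made up here, the executor and core counts are placeholder values, and the default of 2 tasks per core follows the general rule of thumb in the Spark tuning guide (2-3 tasks per CPU core). The second helper relies only on the `getNumPartitions()` and `repartition()` methods of a PySpark RDD.

```python
def suggested_partitions(num_executors, cores_per_executor, tasks_per_core=2):
    """Rough partition target: a few tasks for every available core.

    All three arguments are assumptions the caller supplies for their own
    cluster; nothing here is read from Spark itself.
    """
    return num_executors * cores_per_executor * tasks_per_core


def ensure_min_partitions(rdd, min_partitions):
    """Return an RDD with at least `min_partitions` partitions.

    repartition() always shuffles, so it is only called when the current
    partition count is below the target.
    """
    if rdd.getNumPartitions() < min_partitions:
        return rdd.repartition(min_partitions)
    return rdd


# Hypothetical usage with a live PySpark RDD named `training_rdd`:
#   target = suggested_partitions(num_executors=10, cores_per_executor=4)
#   training_rdd = ensure_min_partitions(training_rdd, target).cache()
```

Caching after repartitioning is a common choice for iterative algorithms such as SGD, because it avoids repeating the shuffle on every iteration. Whether it pays off depends on the available memory.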

Re: MLLib: LinearRegressionWithSGD performance

2014-11-21 Thread Jayant Shekhar
Hi Sameer, You can try increasing the number of executor-cores. -Jayant On Fri, Nov 21, 2014 at 11:18 AM, Sameer Tilak ssti...@live.com wrote: Hi All, I have been using MLLib's linear regression and I have some questions regarding the performance. We have a cluster of 10 nodes -- each

Re: MLLib: LinearRegressionWithSGD performance

2014-11-21 Thread Jayant Shekhar
Hi Sameer, You can also use repartition to create a higher number of tasks. -Jayant On Fri, Nov 21, 2014 at 12:02 PM, Jayant Shekhar jay...@cloudera.com wrote: Hi Sameer, You can try increasing the number of executor-cores. -Jayant On Fri, Nov 21, 2014 at 11:18 AM, Sameer Tilak