Hi Joseph,

Thank you for the suggestion!
It seems that instead of sample() it would be better to shuffle the data once and 
then access it sequentially in mini-batches. Could you suggest how to implement this?
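Something along these lines is what I have in mind (my own untested sketch, not 
existing MLlib code; it shuffles once up front via a random sort key and then 
serves batches by index range):

import scala.reflect.ClassTag
import scala.util.Random
import org.apache.spark.rdd.RDD

// Shuffle the data once, then serve sequential mini-batches by index
// range instead of re-sampling on every iteration.
def miniBatches[T: ClassTag](data: RDD[T], batchSize: Int): Iterator[RDD[T]] = {
  // Random sort key = one global shuffle; cache so it is not recomputed
  // (and thus reshuffled) for every batch.
  val shuffled = data.sortBy(_ => Random.nextDouble()).zipWithIndex().cache()
  val n = shuffled.count()
  (0L until n by batchSize.toLong).iterator.map { start =>
    shuffled.filter { case (_, i) => i >= start && i < start + batchSize }
            .map(_._1)
  }
}

The filter pass per batch still scans the whole RDD, so this only pays off if 
sample() is the dominant cost.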

With regard to aggregate (reduce), I am wondering why it is so slow in local 
mode. Could you elaborate on this? I do understand that in cluster mode network 
speed kicks in and could be blamed, but in local mode there is no network to blame.

Best regards, Alexander

From: Joseph Bradley [mailto:jos...@databricks.com]
Sent: Thursday, April 02, 2015 10:51 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Stochastic gradient descent performance

It looks like SPARK-3250 was applied to the sample() call that GradientDescent 
uses, and it should kick in for your minibatchFraction <= 0.4.  Based on your 
numbers, aggregation seems like the main issue, though I hesitate to optimize 
aggregation based on local tests with data sizes that small.

The first thing I'd check for is unnecessary object creation; after that, I'd 
profile in a cluster or larger-data setting.
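For instance (a generic sketch of the idea, not the actual GradientDescent code), 
mutating the accumulator inside seqOp avoids allocating a new vector for every record:

import org.apache.spark.rdd.RDD

// Sum feature vectors with treeAggregate, mutating the accumulator
// array in place so no intermediate object is allocated per record.
def sumFeatures(batch: RDD[Array[Double]], dim: Int): Array[Double] =
  batch.treeAggregate(new Array[Double](dim))(
    seqOp = (acc, v) => {
      var j = 0
      while (j < dim) { acc(j) += v(j); j += 1 }
      acc
    },
    combOp = (a, b) => {
      var j = 0
      while (j < dim) { a(j) += b(j); j += 1 }
      a
    })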

On Wed, Apr 1, 2015 at 10:09 AM, Ulanov, Alexander 
<alexander.ula...@hp.com> wrote:
Sorry to bother you again, but I think this is an important issue for the 
applicability of SGD in Spark MLlib. Could the Spark developers please comment 
on it?

-----Original Message-----
From: Ulanov, Alexander
Sent: Monday, March 30, 2015 5:00 PM
To: dev@spark.apache.org
Subject: Stochastic gradient descent performance

Hi,

It seems to me that there is an overhead in the "runMiniBatchSGD" function of 
MLlib's "GradientDescent". In particular, "sample" and "treeAggregate" might 
take time that is an order of magnitude greater than the actual gradient 
computation. For the MNIST dataset of 60K instances with miniBatchFraction = 
0.001 (i.e. 60 samples), it takes 0.15 s to sample and 0.3 s to aggregate in 
local mode with 1 data partition on a Core i5 processor. The actual gradient 
computation takes 0.002 s. I searched through the Spark JIRA and found that 
there was recently an update for more efficient sampling (SPARK-3250) that is 
already included in the Spark codebase. Is there a way to reduce the sampling 
time and the local treeAggregate time by an order of magnitude?
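
For reference, the sampling and aggregation overhead can be isolated with 
something like this (a simplified sketch of the measurement; "data" stands for 
the cached MNIST RDD and is my own placeholder):

import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.regression.LabeledPoint

// Crude wall-clock timing of the per-iteration pieces in local mode.
def time[A](label: String)(body: => A): A = {
  val t0 = System.nanoTime()
  val result = body
  println(f"$label: ${(System.nanoTime() - t0) / 1e9}%.3f s")
  result
}

def timeOneIteration(data: RDD[LabeledPoint]): Unit = {
  // sample() is lazy, so count() forces materialization before timing stops.
  val batch = time("sample") {
    val b = data.sample(withReplacement = false, fraction = 0.001, seed = 42)
    b.count()
    b
  }
  // A trivial aggregate isolates treeAggregate overhead from the gradient math.
  time("aggregate") {
    batch.treeAggregate(0.0)((acc, _) => acc + 1.0, _ + _)
  }
}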

Best regards, Alexander