[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2017-05-17 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014394#comment-16014394
 ] 

Joseph K. Bradley commented on SPARK-1503:
--

I'm fine with us closing it since there isn't momentum on it, but I could 
definitely see us reviving it in the future.  It could be worth exploring as a 
lightweight alternative to LBFGS.

> Implement Nesterov's accelerated first-order method
> ---
>
> Key: SPARK-1503
> URL: https://issues.apache.org/jira/browse/SPARK-1503
> Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>Reporter: Xiangrui Meng
>Assignee: Aaron Staple
> Attachments: linear_l1.png, linear.png, logistic_l2.png, logistic.png
>
>
> Nesterov's accelerated first-order method is a drop-in replacement for 
> steepest descent but it converges much faster. We should implement this 
> method and compare its performance with existing algorithms, including SGD 
> and L-BFGS.
> TFOCS (http://cvxr.com/tfocs/) is a reference implementation of Nesterov's 
> method and its variants on composite objectives.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2017-05-17 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013999#comment-16013999
 ] 

Nick Pentreath commented on SPARK-1503:
---

I think it's safe to say this won't go into Spark core, especially as there is 
(at least some) support in an external package.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-07-28 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644560#comment-14644560
 ] 

Xiangrui Meng commented on SPARK-1503:
--

There are a couple of stopping criteria in TFOCS (implemented in 
https://github.com/cvxr/TFOCS/blob/master/private/tfocs_iterate.m and 
documented in Section 2.4.4 of 
https://github.com/cvxr/TFOCS/raw/master/userguide.pdf).

Copied from 
https://github.com/cvxr/TFOCS/blob/e34c0daeb136935d23b8df506de8b7b191f6b0a3/userguide.tex#L536:

{code}
\subsubsection{Stopping criteria}

There are a variety of ways to decide when the algorithm should terminate:
\begin{trivlist}
\item \texttt{tol}: TFOCS terminates when the iterates satisfy
$\|x_{k+1}-x_k\|/\max\{1,\|x_{k+1}\|\}\leq\texttt{tol}$.
The default value is $10^{-8}$; if set to zero or a negative value,
this criterion will never be engaged.
\item \texttt{maxIts}: The maximum number of iterations the algorithm
should take; defaults to \verb@Inf@.
\item \texttt{maxCounts}: This option causes termination after a certain
number of function calls or linear operations are made; see 
\S\ref{sec:opcounts}
for details. It defaults to \verb@Inf@.
\item \texttt{stopCrit}: Choose from one of several stopping criteria.
By default, \texttt{stopCrit} is 1, which is our recommended stopping 
criteria
when not using the SCD model.
Setting this to 3 will use a stopping criteria applied to the dual value
(so this is only available in SCD models, where the dual is really the 
primal),
and setting this to 4 is similar but uses a relative error tolerance.
A value of 4 is recommended when using the SCD model with continuation.
For details, see the code in \verb@private/tfocs_iterate.m@.
\item \texttt{stopFcn}: This option allows you to supply one or more
stopping criteria of your own design. To use it, set \verb@stopFcn@
must be a function handle or a cell array of function handles. For
\verb@tfocs.m@, these function handles will be called as follows:
\begin{code_}
stop = stopFcn( f, x );
\end{code_}
where \verb@f@ is the function value and \verb@x@ is the current point.
\begin{code_}
stop = stopFcn( f, z, x );
\end{code_}
where \verb@f@ is the current \emph{dual} function value, \verb@z@ is
the current dual point, and  \verb@x@ is the current primal point.
The output should either be  \verb@true@ or \verb@false@; if
\verb@true@, the algorithm will stop.

Note that the standard stopping criteria still apply, so the algorithm will halt
when any of the stopping criteria are reached.  To ignore the standard stopping 
criteria,
set \texttt{stopCrit} to $\infty$.
\end{trivlist}
{code}
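The default `tol` criterion quoted above can be sketched as follows (my own minimal Python/NumPy illustration, not TFOCS code; the function name is mine). Note how it degrades to an absolute test when the iterate norm is at most 1, and how a non-positive `tol` disables the check:

```python
import numpy as np

def tfocs_tol_test(x_new, x_old, tol=1e-8):
    """Default TFOCS-style stopping test:
    ||x_new - x_old|| / max(1, ||x_new||) <= tol.
    Relative change in the iterate, switching to an absolute
    test when ||x_new|| <= 1."""
    if tol <= 0:
        return False  # zero or negative tol never engages this criterion
    dx = np.linalg.norm(x_new - x_old)
    return dx / max(1.0, np.linalg.norm(x_new)) <= tol
```

For example, a change of 5e-9 on a weight vector of norm 0.1 passes the default tolerance because the denominator is clamped to 1.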




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-07-01 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610018#comment-14610018
 ] 

Kai Sasaki commented on SPARK-1503:
---

[~staple] [~josephkb] Thank you for pinging and inspiring information! I'll 
rewrite current patch based on your logic and codes. Thanks a lot.





[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-07-01 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610720#comment-14610720
 ] 

Joseph K. Bradley commented on SPARK-1503:
--

I appreciate it!




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-06-24 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14600056#comment-14600056
 ] 

Joseph K. Bradley commented on SPARK-1503:
--

Switching to absolute tolerance sounds reasonable.

I agree about old vs. new weight norms not making much difference.  Let's go 
with what TFOCS does to facilitate your comparisons with the existing 
implementation.





[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-06-23 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598709#comment-14598709
 ] 

Aaron Staple commented on SPARK-1503:
-

I believe this stopping criterion was added after the paper was written. It is 
documented on page 8 of the userguide 
(https://github.com/cvxr/TFOCS/raw/master/userguide.pdf) but unfortunately no 
explanation is provided. (The userguide also documents this as a <= test, while 
the current code uses <.) And unfortunately I couldn’t find an explanation in 
the code or git history.

I think the switch to absolute tolerance may be because a relative difference 
measurement could be less useful when the weights are extremely small, and 1 is 
a convenient cutoff point. (Using 1, the equation is simple and the 
interpretation is clear.) I believe [~mengxr] alluded to switching to an 
absolute tolerance at 1 already 
(https://github.com/apache/spark/pull/3636#discussion_r22078041) so he might be 
able to provide more information.

With regard to using the new weight norms as the basis for measuring relative 
weight difference, I think that if the convergence test passes using either the 
old or new weight norms, then the old and new norms are going to be very 
similar, so it may not make a significant difference which test is used. (It 
may also be worth pointing out that when the tolerance tests with respect to 
the old and new weights return different results: if the tolerance wrt the new 
weights is met (and wrt the old weights is not), the weight norm increased 
slightly; if the tolerance wrt the old weights is met (and wrt the new weights 
is not), the weight norm decreased slightly.)

Finally, TFOCS adopts a policy of skipping the convergence test after the first 
iteration if the weights are unchanged. I believe this condition is based on 
implementation specific behavior and does not need to be adopted generally.
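To make the old-vs-new norm point concrete, here is a tiny numeric illustration (my own, in Python with NumPy; the numbers are invented): whenever either variant of the relative test passes, the old and new norms are nearly equal, so the two denominators give almost identical ratios.

```python
import numpy as np

def rel_diff(x_new, x_old, ref):
    """Relative change in the weight vector, measured against `ref`
    (either the old or the new weights)."""
    return np.linalg.norm(x_new - x_old) / np.linalg.norm(ref)

# A step small enough to pass a 1e-3 relative tolerance.
x_old = np.array([10.0, 0.0])
x_new = np.array([10.001, 0.0])

wrt_new = rel_diff(x_new, x_old, x_new)  # TFOCS-style: wrt new weights
wrt_old = rel_diff(x_new, x_old, x_old)  # SPARK-3382-style: wrt old weights
```

Both ratios come out near 1e-4 and differ only in the eighth decimal place, consistent with the argument above.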





[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-06-22 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596991#comment-14596991
 ] 

Aaron Staple commented on SPARK-1503:
-

[~josephkb] and [~lewuathe] Sure, happy to coordinate. So far I have just been 
duplicating the convergence tolerance check in TFOCS, the MATLAB package on 
which the accelerated gradient descent implementation is based. TFOCS also 
tests for convergence by checking if the relative change in the weight vector 
is below a specified threshold. But there are some differences from the 
SPARK-3382 implementation. For example in TFOCS the relative difference between 
new and old weight vectors is measured with respect to the new weight vector 
instead of the old. And if the new weight vector is smaller than the unit 
vector the convergence test is changed to be an absolute rather than relative 
difference between successive weight vectors. I am just describing the 
implementation here, happy to discuss further and potentially look at making 
changes.

Here is the relevant code if you are interested (there is also a separate 
condition when the weight vector does not change between iterations):
https://github.com/cvxr/TFOCS/blob/e34c0daeb136935d23b8df506de8b7b191f6b0a3/private/tfocs_iterate.m#L19-L24




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-06-22 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597195#comment-14597195
 ] 

Joseph K. Bradley commented on SPARK-1503:
--

Thanks!  [~lewuathe] I probably should have looked into this more carefully 
before your PR.  It would be good to understand the best criterion.  
Regardless, I believe most of your PR's changes should carry over.

[~staple]  [~mengxr]  Do you know what the basis of the TFOCS stopping 
criterion is?  I don't see that detail described in the paper.

Barring a better understanding...I guess I'd vote to mimic TFOCS.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-06-21 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595226#comment-14595226
 ] 

Joseph K. Bradley commented on SPARK-1503:
--

[~staple] [~lewuathe] Can you please coordinate about how convergence is 
measured, as in [SPARK-3382]?  The current implementation for [SPARK-3382] uses 
relative convergence w.r.t. the weight vector, where it measures relative to 
the weight vector from the previous iteration.  I figure we should use the same 
convergence criterion for both accelerated and non-accelerated gradient descent.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-03-10 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355883#comment-14355883
 ] 

Sean Owen commented on SPARK-1503:
--

I was reminded just today of this other issue, which takes issue with using a 
shrinking step size as opposed to the more principled line search in L-BFGS. 
Worth linking this up.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-03-06 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350812#comment-14350812
 ] 

Aaron Staple commented on SPARK-1503:
-

I recently created a PR for an implementation of accelerated gradient descent 
without backtracking, as discussed above.

I also ran some simple, small scale (single server) benchmarks to compare 
different optimization methods. (The benchmark result graphs are provided as 
images attached to this ticket.) The optimization methods tested were:

- gra: existing gradient descent implementation (using full batch)
- acc: accelerated descent (as implemented in the PR), but without automatic 
restart
- acc_r: accelerated descent, with automatic restart
- acc_b: accelerated descent, with backtracking, without automatic restart
- acc_rb: accelerated descent, with backtracking, with automatic restart
- lbfgs: existing lbfgs implementation

(Note that backtracking is not part of the PR; it was implemented only for these tests.)

The x axis shows the number of outer loop iterations of the optimization 
algorithm. (Note that for the backtracking implementations, the full cost of 
backtracking is not represented in this outer loop count. For the 
non-backtracking implementations, the number of outer loop iterations equals 
the number of spark map reduce jobs.) The y axis is the log of the difference 
from the best determined objective value.

The optimization test runs were:

- linear: A scaled up version of the test data from TFOCS’s test_LASSO.m 
example, with 1 observations on 1024 features. 512 of the features are 
actually correlated with result. Unregularized linear regression was used. (The 
Scala acceleration implementation was observed to be consistent with the TFOCS 
implementation on this dataset.)
- linear l1: The same as ‘linear’, but with l1 regularization
- logistic: Each feature of each observation is generated by summing a feature 
gaussian specific to the observation’s binary category with a noise gaussian. 
1 observations on 250 features. Unregularized logistic regression was used.
- logistic l2: Same as ‘logistic’, but using l2 regularization

Note that for each test run, all optimization methods were given the same 
initial step size.

Observations:
- Acceleration consistently converges more quickly than standard gradient 
descent, given the same initial step size.
- Automatic restart is helpful for acceleration convergence
- Backtracking can significantly boost convergence rates in some cases 
(measured in terms of outer loop iterations). But the full cost of backtracking 
was not measured in these runs.
- lbfgs generally outperformed accelerated gradient descent in these test runs. 
Accelerated gradient descent was competitive with lbfgs for linear l1 (lasso) 
regression. However as noted in the documentation, the L1Updater “will not 
work” for the lbfgs implementation. It seems l1 regularization is currently a 
weak spot for the lbfgs implementation.
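The acc_r variant benchmarked above can be sketched roughly as follows. This is my own illustrative Python, not the PR's Scala code; it uses a fixed step size and a function-value restart rule (reset the momentum whenever the objective increases), which is one common way to implement automatic restart and is an assumption on my part rather than a description of the PR:

```python
import numpy as np

def accelerated_descent(grad, x0, step, iters=100, f=None, restart=True):
    """Nesterov-style accelerated gradient descent with optional
    function-value restart. `grad` returns the gradient at a point;
    `f` (optional) returns the objective, used only for the restart test."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    f_prev = np.inf
    for _ in range(iters):
        x_new = y - step * grad(y)                       # gradient step at the extrapolated point
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # Nesterov momentum schedule
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)     # momentum extrapolation
        if restart and f is not None:
            f_cur = f(x_new)
            if f_cur > f_prev:                            # objective went up: reset momentum
                t_new, y = 1.0, x_new
            f_prev = f_cur
        x, t = x_new, t_new
    return x
```

On a simple quadratic this converges to the minimizer with a fixed step below 1/L; in a Spark setting each `grad` call would correspond to one distributed aggregation, which is the outer-loop count plotted in the benchmarks.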




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2015-03-06 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350809#comment-14350809
 ] 

Apache Spark commented on SPARK-1503:
-

User 'staple' has created a pull request for this issue:
https://github.com/apache/spark/pull/4934




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-11-26 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226958#comment-14226958
 ] 

Aaron Staple commented on SPARK-1503:
-

[~rezazadeh] Thanks for your feedback.

Your point about the communication cost of backtracking is well taken. Just to 
explain where I was coming from with the design proposal: As I was looking to 
come up to speed on accelerated gradient descent, I came across some scattered 
comments online suggesting that acceleration was difficult to implement well, 
was finicky, etc - especially when compared with standard gradient descent for 
machine learning. So, wary of these sorts of comments, I wrote up the proposal 
with the intention of duplicating TFOCS as closely as possible to start with, 
with the possibility of making changes from there based on performance. In 
addition, I’d interpreted an earlier comment in this ticket as suggesting that 
line search be implemented in the same manner as in TFOCS.

I am happy to implement a constant step size first, though. It may also be 
informative to run some performance tests in Spark both with and without 
backtracking. One (basic, not conclusive) data point I have now is that if I 
run TFOCS’ test_LASSO example it triggers 97 iterations of the outer AT loop 
and 106 iterations of the inner backtracking loop. For this one example, the 
backtracking iteration overhead is only about 10%. Though keep in mind that in 
Spark, removing backtracking entirely would mean only one distributed 
aggregation per iteration rather than two - a huge improvement in 
communication cost, assuming the convergence rate remains good.
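To illustrate where the extra aggregations come from, here is a generic Armijo-style backtracking line search (my own sketch, not TFOCS's exact rule; the function name and parameters are invented). Every call to `f` inside the loop would be a distributed pass over the data in Spark, on top of the one needed to compute the gradient:

```python
import numpy as np

def backtracking_step(f, grad_x, x, step0=1.0, beta=0.5, c=1e-4, max_tries=30):
    """Armijo backtracking: shrink the step until the sufficient-decrease
    condition holds. Each f() evaluation here would correspond to one
    distributed aggregation in Spark."""
    fx = f(x)
    step = step0
    for _ in range(max_tries):
        x_new = x - step * grad_x
        # Sufficient decrease: f must drop by at least c * step * ||grad||^2.
        if f(x_new) <= fx - c * step * np.dot(grad_x, grad_x):
            return step, x_new
        step *= beta
    return step, x_new
```

In the well-conditioned case the initial step is accepted immediately (one extra evaluation per outer iteration, matching the roughly 10% overhead observed above); in the bad case each rejected step costs another full pass.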

Incidentally, are there any specific learning benchmarks for Spark that you 
would recommend?

I’ll do a bit of research to identify the best ways to manage the Lipschitz 
estimate / step size in the absence of backtracking. I’ve also noticed some 
references online to distributed implementations of accelerated methods. It may 
be informative to learn more about them - if you’ve heard of any particularly 
good distributed optimizers using acceleration, please let me know.

Thanks,
Aaron

PS Yes, I’ll make sure to follow the lbfgs example so the accelerated 
implementation can be easily substituted.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-11-26 Thread Reza Zadeh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227023#comment-14227023
 ] 

Reza Zadeh commented on SPARK-1503:
---

Thanks Aaron. From an implementation perspective, it's probably easier to 
implement a constant step size first. From there you can see if there is any 
finicky behavior and compare to the unaccelerated proximal gradient already in 
Spark. If it works well enough, we should commit the first PR without 
backtracking and then experiment with backtracking afterwards; otherwise, if 
you see strange behavior, you can decide whether backtracking would solve it.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-11-24 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224132#comment-14224132
 ] 

Aaron Staple commented on SPARK-1503:
-

[~mengxr] [~rezazadeh] Ok, thanks for the heads up. Let me know if there’s 
anything about the spec that should be handled differently. I covered most of 
the mathematics informally (the details are already covered formally in the 
references). And in addition, the proposal describes a method of implementing 
TFOCS functionality distributively but does not investigate existing 
distributed optimization systems.




[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-11-21 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221233#comment-14221233
 ] 

Aaron Staple commented on SPARK-1503:
-

[~mengxr] Sorry for the delay. I wrote up a design proposal for the initial 
implementation. Let me know what you think, and whether you'd like me to 
clarify anything.

https://docs.google.com/document/d/1L50O66LnBfVopFjptbet2ZTQRzriZTjKvlIILZwKsno/edit?usp=sharing


[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-10 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166465#comment-14166465
 ] 

Xiangrui Meng commented on SPARK-1503:
--

[~staple] Thanks for picking up this JIRA! TFOCS is a good place to start. We 
can support the AT (Auslender-Teboulle) update, line search, and restart in the 
first version. It would also be nice to support generic composite objective 
functions.

Please note that this could become a big task. We definitely need to go through 
the design first.
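The core acceleration-plus-restart idea can be sketched in a few lines for the smooth, single-machine case. This is an illustrative sketch only, not the proposed Spark implementation: the AT update, line search, and composite terms are omitted, and the gradient-based restart test used here is one common heuristic (O'Donoghue and Candès), not necessarily what TFOCS defaults to.

```python
import numpy as np

def nesterov_descent(grad, x0, step, iters=100, restart=True):
    """Nesterov accelerated gradient descent with optional adaptive restart.
    grad: gradient of a smooth objective; step: fixed step size (~1/L)."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    t = 1.0
    for _ in range(iters):
        x_new = y - step * grad(y)                    # gradient step at lookahead point
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y_new = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        # Adaptive restart heuristic: if the update moves against the
        # gradient direction, reset the momentum to avoid oscillation.
        if restart and grad(x_new) @ (x_new - x) > 0:
            t_new, y_new = 1.0, x_new.copy()
        x, y, t = x_new, y_new, t_new
    return x

# Minimize f(x) = 0.5 * x^T A x for an ill-conditioned quadratic (kappa = 100);
# this is where acceleration pays off most over plain steepest descent.
A = np.diag([1.0, 100.0])
x_star = nesterov_descent(lambda x: A @ x, x0=[1.0, 1.0], step=1.0 / 100.0, iters=500)
```

Plain gradient descent with the same step size contracts the slow eigendirection only by a factor of (1 - 1/kappa) per iteration; the accelerated scheme improves this to roughly (1 - 1/sqrt(kappa)), which is the practical motivation for this JIRA.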


[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-10 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167868#comment-14167868
 ] 

Aaron Staple commented on SPARK-1503:
-

[~mengxr] Thanks for the heads up! I’ll definitely go through TFOCS and am 
happy to work carefully and collaboratively on design.


[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-09 Thread Aaron Staple (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166168#comment-14166168
 ] 

Aaron Staple commented on SPARK-1503:
-

Hi, I’d like to try working on this ticket. If you’d like to assign it to me, I 
can write a short spec and then work on a PR.
