[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4934
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-125656909

We refactored the implementation. You can find the latest version at https://github.com/databricks/spark-tfocs. We will send a new PR when the implementation is ready. @staple Could you close this PR for now?
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-125644133

Likewise, is this one stale? I'm not sure this is going to move forward.
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user staple commented on a diff in the pull request: https://github.com/apache/spark/pull/4934#discussion_r26186113

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/AcceleratedGradientDescent.scala ---
@@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.optimization
+
+import scala.collection.mutable.ArrayBuffer
+
+import breeze.linalg.{DenseVector => BDV, norm}
+
+import org.apache.spark.Logging
+import org.apache.spark.annotation.DeveloperApi
+import org.apache.spark.mllib.linalg.{Vector, Vectors}
+import org.apache.spark.rdd.RDD
+
+/**
+ * :: DeveloperApi ::
+ * This class optimizes a vector of weights via accelerated (proximal) gradient descent.
+ * The implementation is based on TFOCS [[http://cvxr.com/tfocs]], described in Becker, Candes,
+ * and Grant 2010.
+ * @param gradient Delegate that computes the loss function value and gradient for a vector of
+ *        weights.
+ * @param updater Delegate that updates weights in the direction of a gradient.
+ */
+@DeveloperApi
+class AcceleratedGradientDescent (private var gradient: Gradient, private var updater: Updater)
+  extends Optimizer {
+
+  private var stepSize: Double = 1.0
+  private var convergenceTol: Double = 1e-4
+  private var numIterations: Int = 100
+  private var regParam: Double = 0.0
+
+  /**
+   * Set the initial step size, used for the first step. Default 1.0.
+   * On subsequent steps, the step size will be adjusted by the acceleration algorithm.
+   */
+  def setStepSize(step: Double): this.type = {
+    this.stepSize = step
+    this
+  }
+
+  /**
+   * Set the optimization convergence tolerance. Default 1e-4.
+   * Smaller values will increase accuracy but require additional iterations.
+   */
+  def setConvergenceTol(tol: Double): this.type = {
+    this.convergenceTol = tol
+    this
+  }
+
+  /**
+   * Set the maximum number of iterations. Default 100.
+   */
+  def setNumIterations(iters: Int): this.type = {
+    this.numIterations = iters
+    this
+  }
+
+  /**
+   * Set the regularization parameter. Default 0.0.
+   */
+  def setRegParam(regParam: Double): this.type = {
+    this.regParam = regParam
+    this
+  }
+
+  /**
+   * Set a Gradient delegate for computing the loss function value and gradient.
+   */
+  def setGradient(gradient: Gradient): this.type = {
+    this.gradient = gradient
+    this
+  }
+
+  /**
+   * Set an Updater delegate for updating weights in the direction of a gradient.
+   * If regularization is used, the Updater will implement the regularization term's proximity
+   * operator. Thus the type of regularization penalty is configured by providing a corresponding
+   * Updater implementation.
+   */
+  def setUpdater(updater: Updater): this.type = {
+    this.updater = updater
+    this
+  }
+
+  /**
+   * Run accelerated gradient descent on the provided training data.
+   * @param data training data
+   * @param initialWeights initial weights
+   * @return solution vector
+   */
+  def optimize(data: RDD[(Double, Vector)], initialWeights: Vector): Vector = {
+    val (weights, _) = AcceleratedGradientDescent.run(
+      data,
+      gradient,
+      updater,
+      stepSize,
+      convergenceTol,
+      numIterations,
+      regParam,
+      initialWeights)
+    weights
+  }
+}
+
+/**
+ * :: DeveloperApi ::
+ * Top-level method to run accelerated (proximal) gradient descent.
+ */
+@DeveloperApi
+object AcceleratedGradientDescent extends Logging {
+  /**
+   * Run accelerated proximal gradient descent.
+   * The implementation is based on TFOCS [[http://cvxr.com/tfocs]], described in Becker, Candes,
+   * and Grant 2010. A li
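To make the proposed API concrete, here is a minimal usage sketch based on the class in the diff above. LeastSquaresGradient and SquaredL2Updater are existing MLlib delegates; `data` (labels paired with feature vectors) and `numFeatures` are placeholders assumed to be prepared elsewhere:

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.optimization.{AcceleratedGradientDescent, LeastSquaresGradient, SquaredL2Updater}
import org.apache.spark.rdd.RDD

// Usage sketch only. `data` and `numFeatures` are assumed to be prepared elsewhere.
def fit(data: RDD[(Double, Vector)], numFeatures: Int): Vector = {
  val agd = new AcceleratedGradientDescent(new LeastSquaresGradient(), new SquaredL2Updater())
    .setStepSize(1.0)        // initial step; adjusted by the acceleration algorithm
    .setConvergenceTol(1e-4) // smaller is more accurate but needs more iterations
    .setNumIterations(100)   // hard cap on iterations
    .setRegParam(0.01)       // L2 penalty, applied by SquaredL2Updater
  agd.optimize(data, Vectors.zeros(numFeatures))
}
```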
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-78192837

Hi, replying to some of the statements above:

> It seems @staple has already implemented backtracking (because he has results in the JIRA), but kept it out of this PR to keep it simple, so we can tackle that afterwards.

I wrote a backtracking implementation (and checked that it performs the same as the TFOCS implementation). Currently it is just a port of the TFOCS version. I'd need a little time to make it Scala / Spark idiomatic, but the turnaround would be pretty fast.

> For example, if we add a line search option, what are the semantics of `agd.setStepSize(1.0).useLineSearch()`?

TFOCS supports a suggested initial Lipschitz value (a variable named `L`), which is just a starting point for line search, so a corresponding behavior would be to use the step size as an initial suggestion only when line search is enabled. It may be desirable to use a parameter name like `L` instead of `stepSize` to make the meaning clearer.

In TFOCS you can disable backtracking line search by setting several parameters (L, Lexact, alpha, and beta) which individually control different aspects of the backtracking implementation. For Spark it may make sense to provide backtracking modes that are configured explicitly: for example a fixed Lipschitz bound (no backtracking), backtracking line search based on the TFOCS implementation, or possibly an alternative line search implementation that is more conservative about performing round trip aggregations. There could then be a setBacktrackingMode() setter to configure which mode is used. Moving forward there may be a need to support additional acceleration algorithms besides Auslender and Teboulle. These might be configurable via a setAlgorithm() function.

> Btw, I don't think we need to stick to the current `GradientDescent` API. The accelerated gradient method takes a smooth convex function which provides the gradient and, optionally, the Lipschitz constant. The implementation of Nesterov's method doesn't need to know about RDDs.

This is good to know. I had been assuming we would stick with the existing GradientDescent API, including the Gradient and Updater delegates. Currently the applySmooth and applyProjector functions (named the same as the corresponding TFOCS functions) serve as a bridge between the acceleration implementation (relatively unaware of RDDs) and Spark specific RDD aggregations.

This seems like a good time to mention that the backtracking implementation in TFOCS uses a system of caching the (expensive to compute) linear operator component of the objective function, which significantly reduces the cost of backtracking. A similar implementation is possible in Spark, though the performance benefit may not be as significant because two round trips would still be required per iteration. (See p. 3 of my design doc linked in the JIRA for some more detail.) One reason I suggested not implementing linear operator caching in the design doc is that it's incompatible with the existing Gradient interface. If we are considering an alternative interface, it may be worth revisiting this issue. The objective function "interface" used by TFOCS involves the functions applyLinear (the linear operator component of the objective), applySmooth (the smooth portion of the objective), and applyProjector (the nonsmooth portion of the objective), along with a number of numeric and categorical parameters. Theoretically we could adopt a similar interface (with or without applyLinear, depending) where RDD specific operations are encapsulated within the various apply* functions.

Finally, I wanted to mention that I'm in the Bay Area and am happy to meet in person to discuss this project if that would be helpful.
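To make the interface discussion above concrete, here is a hypothetical sketch of what a TFOCS-style objective interface could look like in Spark. The method names follow the TFOCS functions staple mentions (applyLinear, applySmooth, applyProjector); the trait name and signatures are assumptions, not code from the PR:

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Hypothetical sketch of a TFOCS-style objective interface. All RDD-specific
// aggregations live inside the apply* methods, so the acceleration loop itself
// never has to know about RDDs.
trait TfocsStyleObjective {
  /** Linear operator component A * x, computed in one pass and cacheable, so a
    * backtracking loop can retry step sizes without recomputing it. */
  def applyLinear(x: Vector): RDD[Double]

  /** Smooth portion of the objective: loss value and gradient, typically one
    * aggregation over the cached output of applyLinear. */
  def applySmooth(ax: RDD[Double]): (Double, Vector)

  /** Nonsmooth portion: the proximity operator of the regularizer at x with step
    * size t. Purely local vector arithmetic; no cluster round trip. */
  def applyProjector(x: Vector, t: Double): Vector
}
```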
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-78165360

Line search helps if you don't know the Lipschitz constant. With accelerated gradient methods, it is very easy to blow up if the step size is wrong. I'm okay with not having line search in this version, but we need to consider how the APIs are going to change after we add line search. For example, if we add a line search option, what are the semantics of `agd.setStepSize(1.0).useLineSearch()`?

Btw, I don't think we need to stick to the current `GradientDescent` API. The accelerated gradient method takes a smooth convex function which provides the gradient and, optionally, the Lipschitz constant. The implementation of Nesterov's method doesn't need to know about RDDs.
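Concretely, the abstraction described here could be as small as the following trait (a minimal sketch with assumed names, not the PR's API): Nesterov's method only needs the gradient and, optionally, a Lipschitz constant.

```scala
import breeze.linalg.{DenseVector => BDV}

// Minimal sketch (assumed names) of the "smooth convex function" abstraction:
// enough for Nesterov's method, with no RDDs in sight. A None Lipschitz constant
// would signal the optimizer to fall back to line search.
trait SmoothConvexFunction {
  def valueAndGradient(x: BDV[Double]): (Double, BDV[Double])
  def lipschitzConstant: Option[Double]
}
```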
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-77801646

Thank you for this PR @staple! @mengxr I suggested that @staple first implement this without backtracking, to keep the PR as simple as possible. According to his plots (see the JIRA), even without backtracking this PR achieves fewer iterations at the same cost per iteration. Note that backtracking requires several additional map-reduces per iteration, which makes it unclear when backtracking is best used. So I suggested first merging the case that is a clear win (fewer iterations at the same cost per iteration). I think we should merge this without backtracking, and then open another PR to properly evaluate how backtracking affects total cost, with the goal of also merging backtracking. It seems @staple has already implemented backtracking (because he has results in the JIRA), but kept it out of this PR to keep it simple, so we can tackle that afterwards.
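For readers weighing this tradeoff, here is a schematic of why backtracking costs extra passes. It is a textbook Armijo-style backtracking sketch, not the PR's implementation; on Spark, every call to `f` below would be a full aggregation over the data RDD, so each rejected candidate step costs one extra map-reduce round trip.

```scala
import breeze.linalg.{DenseVector => BDV, norm}

// Schematic backtracking line search (illustration only, not the PR's code).
def backtrack(
    f: BDV[Double] => Double,     // smooth objective (distributed evaluation on Spark)
    y: BDV[Double],               // current point
    fy: Double,                   // f(y), already computed this iteration
    grad: BDV[Double],            // gradient at y, already computed this iteration
    tInit: Double,                // suggested step size (TFOCS's "L" suggestion)
    beta: Double = 0.5): (BDV[Double], Double) = {
  var t = tInit
  var x = y - grad * t
  // Shrink t until the sufficient-decrease condition f(x) <= f(y) - (t/2)*||grad||^2 holds.
  while (f(x) > fy - t / 2.0 * math.pow(norm(grad), 2)) {
    t *= beta
    x = y - grad * t              // each retry triggers another full data pass via f(x)
  }
  (x, t)
}
```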
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user staple commented on a diff in the pull request: https://github.com/apache/spark/pull/4934#discussion_r25986928

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/AcceleratedGradientDescent.scala ---
+  /**
+   * Set the initial step size, used for the first step. Default 1.0.
+   * On subsequent steps, the step size will be adjusted by the acceleration algorithm.
+   */
+  def setStepSize(step: Double): this.type = {
--- End diff --

@mengxr Thanks for taking a look. I was advised by Reza Zadeh to implement a version without line search, at least for the initial implementation. Please see the discussion here: https://issues.apache.org/jira/browse/SPARK-1503?focusedCommentId=14225295, and in the following comments. I also attached some optimization benchmarks to the JIRA, which include the performance of both the backtracking line search and non line search implementations. Per your suggestion that it's hard to choose a proper stepSize, I can attest that, anecdotally, acceleration seems somewhat more sensitive to diverging with a nominal stepSize than the existing gradient descent.
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4934#discussion_r25985578

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/AcceleratedGradientDescent.scala ---
+  /**
+   * Set the initial step size, used for the first step. Default 1.0.
+   * On subsequent steps, the step size will be adjusted by the acceleration algorithm.
+   */
+  def setStepSize(step: Double): this.type = {
--- End diff --

It is quite hard to choose a proper `stepSize` in practice, because it depends on the Lipschitz constant, which is usually unknown. It may be better if we can implement a line search method.
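For context on why a good `stepSize` is hard to guess a priori (an illustration, not from the PR): for an objective with L-Lipschitz gradient the classic safe fixed step is 1/L, and even for plain least squares L depends on the data matrix.

```scala
import breeze.linalg.{svd, DenseMatrix}

// Illustration (assumption, not PR code): for least squares f(x) = ||Ax - b||^2 / (2n),
// the gradient's Lipschitz constant is L = sigma_max(A)^2 / n, so the safe fixed step
// 1/L is a property of the data itself. Hence the appeal of line search.
val A = DenseMatrix((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
val svd.SVD(_, s, _) = svd(A)          // s holds the singular values, descending
val n = A.rows.toDouble
val lipschitz = s(0) * s(0) / n        // L = sigma_max(A)^2 / n
val safeStep = 1.0 / lipschitz         // steps much larger than 1/L risk divergence
```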
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-77638297

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28355/
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-77638281

[Test build #28355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28355/consoleFull) for PR 4934 at commit [`a121bd0`](https://github.com/apache/spark/commit/a121bd0f6e5f2387bd502976940c08f6a4a2e4b1).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4934#issuecomment-77625694

[Test build #28355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28355/consoleFull) for PR 4934 at commit [`a121bd0`](https://github.com/apache/spark/commit/a121bd0f6e5f2387bd502976940c08f6a4a2e4b1).

* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1503][MLLIB] Initial AcceleratedGradien...
GitHub user staple opened a pull request:

    https://github.com/apache/spark/pull/4934

    [SPARK-1503][MLLIB] Initial AcceleratedGradientDescent implementation.

    An implementation of accelerated gradient descent, a first order optimization algorithm with faster asymptotic convergence than standard gradient descent. Design discussion and benchmark results at https://issues.apache.org/jira/browse/SPARK-1503

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/staple/spark SPARK-1503

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4934.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4934

commit a121bd0f6e5f2387bd502976940c08f6a4a2e4b1
Author: Aaron Staple
Date: 2015-02-13T22:24:07Z

    [SPARK-1503][MLLIB] Initial AcceleratedGradientDescent implementation.
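As background for the convergence claim above, here is a textbook sketch of Nesterov-style acceleration (an illustration on a generic smooth convex objective, not the code from this PR): it reaches O(1/k^2) suboptimality after k iterations versus O(1/k) for plain gradient descent, by taking each gradient step from an extrapolated point.

```scala
import breeze.linalg.DenseVector

// Textbook Nesterov/FISTA-style acceleration (illustration only, not PR code).
// Plain gradient descent converges at O(1/k); this scheme achieves O(1/k^2).
def nesterov(
    gradF: DenseVector[Double] => DenseVector[Double],
    x0: DenseVector[Double],
    step: Double,              // should be <= 1 / L for an L-smooth objective
    iters: Int): DenseVector[Double] = {
  var x = x0.copy
  var y = x0.copy              // extrapolated point
  var t = 1.0                  // momentum parameter
  for (_ <- 1 to iters) {
    val xNext = y - gradF(y) * step                  // gradient step from the extrapolated point
    val tNext = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
    y = xNext + (xNext - x) * ((t - 1.0) / tNext)    // extrapolate past the new iterate
    x = xNext
    t = tNext
  }
  x
}
```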