[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377357#comment-14377357 ] Debasish Das edited comment on SPARK-2426 at 3/24/15 3:23 PM: -- [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221Is not there is log term that multiplies with R to make it a KL divergence loss ? May be the log term can removed under non-negative and normalization constraints ? @mengxr any ideas here ? If we can do that we can target KL divergence loss from Lee's paper: http://hebb.mit.edu/people/seung/papers/ls-lponm-99.pdf For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much slower we get compared to stochastic matrix decomposition using ALS. MAP loss looks like a strong contender to LDA and can natively handle counts (does not need regression style datasets which is difficult to get in practical setup where people normally don't give any rating and satisfaction should be infered from viewing time etc) was (Author: debasish83): [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much slower we get compared to stochastic matrix decomposition using ALS. MAP loss looks like a strong contender to LDA and can natively handle counts (does not need regression style datasets which is difficult to get in practical setup where people normally don't give any rating and satisfaction should be infered from viewing time etc) Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377357#comment-14377357 ] Debasish Das edited comment on SPARK-2426 at 3/24/15 3:23 PM: -- [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 Is not there is log term that multiplies with R to make it a KL divergence loss ? May be the log term can removed under non-negative and normalization constraints ? @mengxr any ideas here ? If we can do that we can target KL divergence loss from Lee's paper: http://hebb.mit.edu/people/seung/papers/ls-lponm-99.pdf For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much slower we get compared to stochastic matrix decomposition using ALS. MAP loss looks like a strong contender to LDA and can natively handle counts (does not need regression style datasets which is difficult to get in practical setup where people normally don't give any rating and satisfaction should be infered from viewing time etc) was (Author: debasish83): [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221Is not there is log term that multiplies with R to make it a KL divergence loss ? May be the log term can removed under non-negative and normalization constraints ? @mengxr any ideas here ? If we can do that we can target KL divergence loss from Lee's paper: http://hebb.mit.edu/people/seung/papers/ls-lponm-99.pdf For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much slower we get compared to stochastic matrix decomposition using ALS. MAP loss looks like a strong contender to LDA and can natively handle counts (does not need regression style datasets which is difficult to get in practical setup where people normally don't give any rating and satisfaction should be infered from viewing time etc) Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377357#comment-14377357 ] Debasish Das edited comment on SPARK-2426 at 3/24/15 6:11 AM: -- [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. || . || stands for Frobenius norm (or l1)., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323...I am very curious how much slower we get compared to stochastic matrix decomposition using ALS was (Author: debasish83): [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ||.|| stands for Frobenius norm (or l1)., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323...I am very curious how much slower we get compared to stochastic matrix decomposition using ALS Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377357#comment-14377357 ] Debasish Das edited comment on SPARK-2426 at 3/24/15 6:11 AM: -- [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323...I am very curious how much slower we get compared to stochastic matrix decomposition using ALS was (Author: debasish83): [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. || . || stands for Frobenius norm (or l1)., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323...I am very curious how much slower we get compared to stochastic matrix decomposition using ALS Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377357#comment-14377357 ] Debasish Das edited comment on SPARK-2426 at 3/24/15 6:13 AM: -- [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323. I am very curious how much slower we get compared to stochastic matrix decomposition using ALS. MAP loss looks like a strong contender to LDA and can natively handle counts (does not need regression style datasets which is difficult to get in practical setup where people normally don't give any rating and satisfaction should be infered from viewing time etc) was (Author: debasish83): [~acopich] From your comment before Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ., could you please point me to a good reference with application to collaborative filtering/topic modeling ? Stochastic matrix decomposition is what we can do in this PR now https://github.com/apache/spark/pull/3221 For MAP loss, I will open up a PR in a week through JIRA https://issues.apache.org/jira/browse/SPARK-6323...I am very curious how much slower we get compared to stochastic matrix decomposition using ALS Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231375#comment-14231375 ] Valeriy Avanesov edited comment on SPARK-2426 at 12/2/14 11:47 AM: --- I'm not sure if I understand your question... As far as I can see, w_i stands for a row of the matrix w and h_j stands for a column of the matrix h. \sum_i \sum_j ( r_ij - w_i*h_j) -- is not a matrix norm. Probably, you either miss abs or square -- \sum_i \sum_j |r_ij - w_i*h_j| or \sum_i \sum_j ( r_ij - w_i*h_j)^2 It looks like l2 regularized stochastic matrix decomposition with respect to Frobenius (or l1) norm. But I don't understand why do you consider k optimization problems (do you? What does k \in {1 ... 25} stand for?). Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ||.|| stands for Frobenius norm (or l1). By the way: is the matrix of ranks r stochastic? Stochastic matrix decomposition doesn't seem reasonable if it's not. was (Author: acopich): I'm not sure if I understand your question... As far as I can see, w_i stands for a row of the matrix w and h_j stands for a column of the matrix h. \sum_i \sum_j ( r_ij - w_i*h_j) -- is not a matrix norm. Probably, you either miss abs or square -- \sum_i \sum_j |r_ij - w_i*h_j| or \sum_i \sum_j ( r_ij - w_i*h_j)^2 It looks like l2 regularized stochastic matrix decomposition with respect to Frobenius (or l1) norm. But I don't understand why do you consider k optimization problems (do you? What does k \in {1 ... 25} stand for?). Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ||..|| stands for Frobenius norm (or l1). By the way: is the matrix of ranks r stochastic? Stochastic matrix decomposition doesn't seem reasonable if it's not. Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231375#comment-14231375 ] Valeriy Avanesov edited comment on SPARK-2426 at 12/2/14 11:47 AM: --- I'm not sure if I understand your question... As far as I can see, w_i stands for a row of the matrix w and h_j stands for a column of the matrix h. \sum_i \sum_j ( r_ij - w_i*h_j) -- is not a matrix norm. Probably, you either miss abs or square -- \sum_i \sum_j |r_ij - w_i*h_j| or \sum_i \sum_j ( r_ij - w_i*h_j)^2 It looks like l2 regularized stochastic matrix decomposition with respect to Frobenius (or l1) norm. But I don't understand why do you consider k optimization problems (do you? What does k \in {1 ... 25} stand for?). Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. \||.|| stands for Frobenius norm (or l1). By the way: is the matrix of ranks r stochastic? Stochastic matrix decomposition doesn't seem reasonable if it's not. was (Author: acopich): I'm not sure if I understand your question... As far as I can see, w_i stands for a row of the matrix w and h_j stands for a column of the matrix h. \sum_i \sum_j ( r_ij - w_i*h_j) -- is not a matrix norm. Probably, you either miss abs or square -- \sum_i \sum_j |r_ij - w_i*h_j| or \sum_i \sum_j ( r_ij - w_i*h_j)^2 It looks like l2 regularized stochastic matrix decomposition with respect to Frobenius (or l1) norm. But I don't understand why do you consider k optimization problems (do you? What does k \in {1 ... 25} stand for?). Anyway, l2 regularized stochastic matrix decomposition problem is defined as follows Minimize w.r.t. W and H : ||R - W*H|| + \lambda(||W|| + ||H||) under non-negativeness and normalization constraints. ||.|| stands for Frobenius norm (or l1). By the way: is the matrix of ranks r stochastic? Stochastic matrix decomposition doesn't seem reasonable if it's not. Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.3.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit. http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results. For integration the detailed plan is as follows: 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization 2. Integrate QuadraticMinimizer in mllib ALS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218891#comment-14218891 ] Debasish Das edited comment on SPARK-2426 at 11/20/14 2:13 AM: --- With the MAP measures being added to examples.MovieLensALS through https://issues.apache.org/jira/browse/SPARK-4231 I compared the quality and runtime of the matrix completion formulations on MovieLens 1M dataset: Default: userConstraint L2, productConstraint L2 lambdaUser=lambdaProduct=0.065 rank=100 iterations 10 Test RMSE = 0.8436480113821955. Test users 6038 MAP 0.05860164548002782 Solver: Cholesky decomposition followed by forward-backward solves Per iteration runtime for baseline (solveTime in ms) 14/11/19 17:37:06 INFO ALS: usersOrProducts 924 slowConvergence 0 QuadraticMinimizer solveTime 362.813 Iters 0 14/11/19 17:37:06 INFO ALS: usersOrProducts 910 slowConvergence 0 QuadraticMinimizer solveTime 314.527 Iters 0 14/11/19 17:37:06 INFO ALS: usersOrProducts 927 slowConvergence 0 QuadraticMinimizer solveTime 265.75 Iters 0 14/11/19 17:37:06 INFO ALS: usersOrProducts 918 slowConvergence 0 QuadraticMinimizer solveTime 271.513 Iters 0 14/11/19 17:37:09 INFO ALS: usersOrProducts 1510 slowConvergence 0 QuadraticMinimizer solveTime 370.177 Iters 0 14/11/19 17:37:09 INFO ALS: usersOrProducts 1512 slowConvergence 0 QuadraticMinimizer solveTime 467.994 Iters 0 14/11/19 17:37:09 INFO ALS: usersOrProducts 1507 slowConvergence 0 QuadraticMinimizer solveTime 511.894 Iters 0 14/11/19 17:37:09 INFO ALS: usersOrProducts 1511 slowConvergence 0 QuadraticMinimizer solveTime 481.189 Iters 0 NMF: userConstraint POSITIVE, productConstraint POSITIVE, userLambda=productLambda=0.065 L2 regularization Got 1000209 ratings from 6040 users on 3706 movies. Training: 800670, test: 199539. Quadratic minimization userConstraint POSITIVE productConstraint POSITIVE Test RMSE = 0.8435335132641906. Test users 6038 MAP 0.056361816590625446 ALS iteration1 runtime: QuadraticMinimizer convergence profile: 14/11/19 17:46:46 INFO ALS: usersOrProducts 918 slowConvergence 0 QuadraticMinimizer solveTime 1936.281 Iters 73132 14/11/19 17:46:46 INFO ALS: usersOrProducts 927 slowConvergence 0 QuadraticMinimizer solveTime 1871.364 Iters 75219 14/11/19 17:46:46 INFO ALS: usersOrProducts 910 slowConvergence 0 QuadraticMinimizer solveTime 2067.735 Iters 73180 14/11/19 17:46:46 INFO ALS: usersOrProducts 924 slowConvergence 0 QuadraticMinimizer solveTime 2127.161 Iters 75546 14/11/19 17:46:53 INFO ALS: usersOrProducts 1507 slowConvergence 0 QuadraticMinimizer solveTime 3813.923 Iters 193207 14/11/19 17:46:54 INFO ALS: usersOrProducts 1511 slowConvergence 0 QuadraticMinimizer solveTime 3894.068 Iters 196882 14/11/19 17:46:54 INFO ALS: usersOrProducts 1510 slowConvergence 0 QuadraticMinimizer solveTime 3875.915 Iters 193987 14/11/19 17:46:54 INFO ALS: usersOrProducts 1512 slowConvergence 0 QuadraticMinimizer solveTime 3939.765 Iters 192471 NNLS convergence profile: 14/11/19 17:46:46 INFO ALS: NNLS solveTime 252.909 iters 7381 14/11/19 17:46:46 INFO ALS: NNLS solveTime 256.803 iters 7740 14/11/19 17:46:46 INFO ALS: NNLS solveTime 274.352 iters 7491 14/11/19 17:46:46 INFO ALS: NNLS solveTime 272.971 iters 7664 14/11/19 17:46:53 INFO ALS: NNLS solveTime 1487.262 iters 60338 14/11/19 17:46:54 INFO ALS: NNLS solveTime 1472.742 iters 61321 14/11/19 17:46:54 INFO ALS: NNLS solveTime 1489.863 iters 62228 14/11/19 17:46:54 INFO ALS: NNLS solveTime 1494.192 iters 60489 ALS iteration 10 Quadratic Minimizer convergence profile: 14/11/19 17:48:17 INFO ALS: usersOrProducts 924 slowConvergence 0 QuadraticMinimizer solveTime 1082.056 Iters 53724 14/11/19 17:48:17 INFO ALS: usersOrProducts 910 slowConvergence 0 QuadraticMinimizer solveTime 1180.601 Iters 50593 14/11/19 17:48:17 INFO ALS: usersOrProducts 927 slowConvergence 0 QuadraticMinimizer solveTime 1106.131 Iters 53069 14/11/19 17:48:17 INFO ALS: usersOrProducts 918 slowConvergence 0 QuadraticMinimizer solveTime 1108.478 Iters 50895 14/11/19 17:48:23 INFO ALS: usersOrProducts 1510 slowConvergence 0 QuadraticMinimizer solveTime 2262.193 Iters 116818 14/11/19 17:48:23 INFO ALS: usersOrProducts 1512 slowConvergence 0 QuadraticMinimizer solveTime 2293.64 Iters 116026 14/11/19 17:48:23 INFO ALS: usersOrProducts 1507 slowConvergence 0 QuadraticMinimizer solveTime 2241.491 Iters 116293 14/11/19 17:48:23 INFO ALS: usersOrProducts 1511 slowConvergence 0 QuadraticMinimizer solveTime 2372.957 Iters 118391 NNLS convergence profile: 14/11/19 17:48:17 INFO ALS: NNLS solveTime 623.031 iters 21611 14/11/19 17:48:17 INFO ALS: NNLS solveTime 553.493 iters 21732 14/11/19 17:48:17 INFO ALS: NNLS solveTime 559.9 iters 22511 14/11/19 17:48:17 INFO ALS: NNLS solveTime 556.654 iters 21330 14/11/19 17:48:23 INFO ALS: NNLS solveTime 1672.582 iters 86006 14/11/19 17:48:23 INFO ALS: NNLS solveTime
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191551#comment-14191551 ] Debasish Das edited comment on SPARK-2426 at 10/31/14 8:04 AM: --- [~mengxr] The matlab comparison scripts are open sourced over here: https://github.com/debasish83/ecos/blob/master/matlab/admm/qprandom.m https://github.com/debasish83/ecos/blob/master/matlab/pdco4/code/pdcotestQP.m The detailed comparisons are on the REAME.md. Please look at the section on Matlab comparisons. In a nutshell, for bounds MOSEK and ADMM are similar, for elastic net Proximal is 10X faster compared to MOSEK, for equality MOSEK is 2-3X faster than Proximal but both PDCO and ECOS produces much worse result as compared to ADMM. Accelerated ADMM also did not work as good as default ADMM. Increasing the over-relaxation parameter helps ADMM but I have not explored it yet. ADMM and PDCO are in Matlab but ECOS and MOSEK are both using mex files so they are expected to be more efficient. Next I will add the performance results of running positivity, box, sparse coding / regularized lsi and robust-plsa on MovieLens dataset and validate product recommendation using the MAP measure...In terms of RMSE, default positive sparse coding... What's the largest datasets LDA PRs are running? I would like to try that on sparse coding as well...From these papers sparse coding/RLSI should give results at par with LDA: https://www.cs.cmu.edu/~xichen/images/SLSA-sdm11-final.pdf http://web.stanford.edu/group/mmds/slides2012/s-hli.pdf The same randomized matrices can be generated and run in the PR as follows: ./bin/spark-class org.apache.spark.mllib.optimization.QuadraticMinimizer 1000 1 1.0 0.99 rank=1000, equality=1.0 lambda=1.0 beta=0.99 L1regularization = lambda*beta L2regularization = lambda*(1-beta) Generating randomized QPs with rank 1000 equalities 1 sparseQp 88.423 ms iterations 45 converged true posQp 181.369 ms iterations 121 converged true boundsQp 175.733 ms iterations 121 converged true Qp Equality 2805.564 ms iterations 2230 converged true was (Author: debasish83): [~mengxr] The matlab comparison scripts are open sourced over here: https://github.com/debasish83/ecos/blob/master/matlab/admm/qprandom.m https://github.com/debasish83/ecos/blob/master/matlab/pdco4/code/pdcotestQP.m The detailed comparisons are on the REAME.md. Please look at the section on Matlab comparisons. In a nutshell, for bounds MOSEK and ADMM are similar, for elastic net Proximal is 10X faster compared to MOSEK, for equality MOSEK is 2-3X faster than Proximal but both PDCO and ECOS produces much worse result as compared to ADMM. Accelerated ADMM also did not work as good as default ADMM. Increasing the over-relaxation parameter helped accelerated ADMM but I have not explored it yet. ADMM and PDCO are in Matlab but ECOS and MOSEK are both using mex files so they are expected to be more efficient. Next I will add the performance results of running positivity, box, sparse coding / regularized lsi and robust-plsa on MovieLens dataset and validate product recommendation using the MAP measure...In terms of RMSE, default positive sparse coding... What's the largest datasets LDA PRs are running? I would like to try that on sparse coding as well...From these papers sparse coding/RLSI should give results at par with LDA: https://www.cs.cmu.edu/~xichen/images/SLSA-sdm11-final.pdf http://web.stanford.edu/group/mmds/slides2012/s-hli.pdf The same randomized matrices can be generated and run in the PR as follows: ./bin/spark-class org.apache.spark.mllib.optimization.QuadraticMinimizer 1000 1 1.0 0.99 rank=1000, equality=1.0 lambda=1.0 beta=0.99 L1regularization = lambda*beta L2regularization = lambda*(1-beta) Generating randomized QPs with rank 1000 equalities 1 sparseQp 88.423 ms iterations 45 converged true posQp 181.369 ms iterations 121 converged true boundsQp 175.733 ms iterations 121 converged true Qp Equality 2805.564 ms iterations 2230 converged true Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.0.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares and nonnegative least squares. I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems: 1. ALS with bounds 2. ALS with L1 regularization 3. ALS with Equality constraint and bounds Initial runtime comparisons are presented at Spark Summit.
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14095232#comment-14095232 ] Debasish Das edited comment on SPARK-2426 at 10/31/14 4:20 PM: --- Hi Xiangrui, The branch is ready for an initial review. I will do lot of clean-up this week. I need some advice on whether we should bring the additional ALS features first or integrate NNLS with QuadraticMinimizer so that we can handle large ranks as well. https://github.com/debasish83/spark/commits/qp-als optimization/QuadraticMinimizer.scala is the placeholder for all QuadraticMinimization. Right now we support 5 features: 1. Least square 2. Quadratic minimization with positivity 3. Quadratic minimization with box : generalization of positivity 4. Quadratic minimization with elastic net :L1 is at 0.99, elastic net control is not given to users 5. Quadratic minimization with affine constraints and bounds There are lot many regularization in Proximal.scala which can be re-used in mllib updater...L1Updater in mllib is an example of Proximal algorithm... QuadraticMinimizer is optimized for direct solve right now (cholesky / lu based on problem we are solving) The CG core from Breeze will be used for iterative solve when ranks are high...I need a different variant of CG for Formulation 5 so Breeze CG is not sufficient for all the formulations this branch supports and needs to be extended.. Right now I am experimenting with ADMM rho and lambda values so that the NNLS iterations are at par with Least square with positivity. The idea for rho and lambda tuning are the following: 1. Derive an optimal value of lambda for quadratic problems, similar to idea of Nesterov's acceleration being used in algorithms like FISTA and accelerated ADMM from UCLA 2. Derive rho from approximate min and max eigenvalues of gram matrix For Matlab based experiments within PDCO, ECOS(IPM), MOSEK and ADMM variants, ADMM is faster with producing result quality within 1e-4 of MOSEK. I will publish the numbers and the matlab script through the ECOS jnilib open source (GPL licensed). I did not add any of ECOS code here so that everything stays Apache. For topic modeling use-case, I expect to produce sparse coding results (L1 on product factors, L2 on user factors) Example runs: NMF: ./bin/spark-submit --total-executor-cores 4 --master spark://localhost:7077 --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar --class org.apache.spark.examples.mllib.MovieLensALS ./examples/target/spark-examples_2.10-1.1.0-SNAPSHOT.jar --rank 20 --numIterations 10 --userConstraint POSITIVE --lambdaUser 0.065 --productConstraint POSITIVE --lambdaProduct 0.065 --kryo hdfs://localhost:8020/sandbox/movielens/ Sparse coding: ./bin/spark-submit --total-executor-cores 4 --master spark://localhost:7077 --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar --class org.apache.spark.examples.mllib.MovieLensALS ./examples/target/spark-examples_2.10-1.1.0-SNAPSHOT.jar --delimiter --rank 20 --numIterations 10 --userConstraint SMOOTH --lambdaUser 0.065 --productConstraint SPARSE --lambdaProduct 0.065 --kryo hdfs://localhost:8020/sandbox/movielens Robust PLSA with least square loss: ./bin/spark-submit --total-executor-cores 4 --master spark://localhost:7077 --jars ~/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar --class org.apache.spark.examples.mllib.MovieLensALS ./examples/target/spark-examples_2.10-1.1.0-SNAPSHOT.jar --delimiter --rank 20 --numIterations 10 --userConstraint EQUALITY --lambdaUser 0.065 --productConstraint EQUALITY --lambdaProduct 0.065 --kryo hdfs://localhost:8020/sandbox/movielens With this change, users can select to apply user and product specific constraint...basically positive factors for products (interpretability) and smooth for users to get more RMSE improvements. Thanks. Deb was (Author: debasish83): Hi Xiangrui, The branch is ready for an initial review. I will do lot of clean-up this week. I need some advice on whether we should bring the additional ALS features first or integrate NNLS with QuadraticMinimizer so that we can handle large ranks as well. https://github.com/debasish83/spark/commits/qp-als optimization/QuadraticMinimizer.scala is the placeholder for all QuadraticMinimization. Right now we support 5 features: 1. Least square 2. Least square with positivity 3. Least square with bounds : generalization of positivity 4. Least square with equality and positivity/bounds for LDA/PLSA 5. Least square + L1 constraint for sparse NMF There are lot many regularization in Proximal.scala which can be re-used in mllib updater...L1Updater in mllib is an example of Proximal algorithm... QuadraticMinimizer is optimized for direct solve right now (cholesky / lu based on problem we are solving)
[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14095232#comment-14095232 ] Debasish Das edited comment on SPARK-2426 at 8/13/14 3:31 PM: -- Hi Xiangrui, The branch is ready for an initial review. I will do lot of clean-up this week. I need some advice on whether we should bring the additional ALS features first or integrate NNLS with QuadraticMinimizer so that we can handle large ranks as well. https://github.com/debasish83/spark/commits/qp-als optimization/QuadraticMinimizer.scala is the placeholder for all QuadraticMinimization. Right now we support 5 features: 1. Least square 2. Least square with positivity 3. Least square with bounds : generalization of positivity 4. Least square with equality and positivity/bounds for LDA/PLSA 5. Least square + L1 constraint for sparse NMF There are lot many regularization in Proximal.scala which can be re-used in mllib updater...L1Updater in mllib is an example of Proximal algorithm... QuadraticMinimizer is optimized for direct solve right now (cholesky / lu based on problem we are solving) The CG core from NNLS should be used for iterative solve when ranks are high...I need a different variant of CG for Formulation 4 so NNLS CG is not sufficient for all the formulations this branch supports... Right now I am experimenting with ADMM rho and lambda values so that the NNLS iterations are at par with Least square with positivity. The idea for rho and lambda tuning are the following: 1. Derive an optimal value of lambda for quadratic problems, similar to idea of Nesterov's acceleration being used in algorithms like FISTA and accelerated ADMM from UCLA 2. Equilibrate/Scale the gram matrix such that rho can always be set at 1.0 For Matlab based experiments within PDCO, ECOS(IPM), MOSEK and ADMM variants, ADMM is faster with producing result quality within 1e-4 of MOSEK. I will publish the numbers and the matlab script through the ECOS jnilib open source (GPL licensed). I did not add any of ECOS code here so that everything stays Apache. For recommendation use-case, I expect to produce Jellylish L1 ball projection results on netflix/movielens dataset using Formulation 5. Example runs: Least square with equality and positivity for topic modeling, all topics sum to 1.0: bin/spark-submit --class org.apache.spark.examples.mllib.MovieLensALS \ | examples/target/scala-*/spark-examples-*.jar \ | --rank 25 --numIterations 10 --lambda 1.0 --kryo --qpProblem 4\ | data/mllib/sample_movielens_data.txt Least square with L1 regularization: bin/spark-submit --class org.apache.spark.examples.mllib.MovieLensALS \ | examples/target/scala-*/spark-examples-*.jar \ | --rank 25 --numIterations 10 --lambda 1.0 --lambdaL1 1e-2 --kryo --qpProblem 5\ | data/mllib/sample_movielens_data.txt Thanks. Deb was (Author: debasish83): Hi Xiangrui, The branch is ready for an initial review. I will do lot of clean-up this week. https://github.com/debasish83/spark/commits/qp-als optimization/QuadraticMinimizer.scala is the placeholder for all QuadraticMinimization. Right now we support 5 features: 1. Least square 2. Least square with positivity 3. Least square with bounds : generalization of positivity 4. Least square with equality and positivity/bounds for LDA/PLSA 5. Least square + L1 constraint for sparse NMF There are lot many regularization in Proximal.scala which can be re-used in mllib updater...L1Updater is an example of Proximal algorithm. I feel we should move NNLS into QuadraticMinimizer as well and clean ALS.scala as you have suggested before... QuadraticMinimizer is optimized for direct solve right now (cholesky / lu based on problem we are solving) The CG core from NNLS should be used for iterative solve when ranks are high...I need a different variant of CG for Formulation 4 so NNLS CG is not sufficient for all the formulations. Right now I am experimenting with ADMM rho and lambda values so that the NNLS iterations are at par with Least square with positivity. I will publish results from the comparisons. I will also publish comparisons with PDCO, ECOS (IPM) and MOSEK with ADMM variants used in this branch... For recommendation use-case, I expect to produce Jellylish L1 ball projection results on netflix/movielens dataset using Formulation 5. Thanks. Deb Quadratic Minimization for MLlib ALS Key: SPARK-2426 URL: https://issues.apache.org/jira/browse/SPARK-2426 Project: Spark Issue Type: New Feature Components: MLlib Affects Versions: 1.0.0 Reporter: Debasish Das Assignee: Debasish Das Original Estimate: 504h Remaining Estimate: 504h Current ALS supports least squares