[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74678564 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673838 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673711 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -944,8 +942,8 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673561 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673500 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673391 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74673210 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74672937 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74672274 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -944,13 +955,140 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74637489 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -932,8 +945,6 @@ class BinaryLogisticRegressionSummary

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-12 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/13796 @sethah Please merge master since there is a conflict. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74636502 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -944,13 +955,140 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74636388 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -944,13 +955,140 @@ class BinaryLogisticRegressionSummary

[GitHub] spark issue #14519: [SPARK-16933] [ML] Fix AFTAggregator in AFTSurvivalRegre...

2016-08-08 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14519 LGTM. It would be nice to see the comparison of shuffle write size; then it will be ready to merge. Thanks.

[GitHub] spark pull request #14519: [SPARK-16933] [ML] Fix AFTAggregator in AFTSurviv...

2016-08-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14519#discussion_r73922053 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala --- @@ -478,21 +482,23 @@ object AFTSurvivalRegressionModel

[GitHub] spark pull request #14519: [SPARK-16933] [ML] Fix AFTAggregator in AFTSurviv...

2016-08-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14519#discussion_r73833738 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala --- @@ -478,21 +482,23 @@ object AFTSurvivalRegressionModel

[GitHub] spark pull request #14519: [SPARK-16933] [ML] Fix AFTAggregator in AFTSurviv...

2016-08-08 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14519#discussion_r73832261 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala --- @@ -478,21 +482,23 @@ object AFTSurvivalRegressionModel

[GitHub] spark issue #14109: [SPARK-16404][ML] LeastSquaresAggregators serializes unn...

2016-08-08 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14109 It would be great to have LOR share a similar style and destroy the mean and variance after use. Thanks.

[GitHub] spark issue #14109: [SPARK-16404][ML] LeastSquaresAggregators serializes unn...

2016-08-08 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14109 LGTM. Merged into master. Great work! Thanks.

[GitHub] spark pull request #14109: [SPARK-16404][ML] LeastSquaresAggregators seriali...

2016-08-05 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14109#discussion_r73754722 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -996,19 +1013,24 @@ private class LeastSquaresCostFun

[GitHub] spark issue #14326: [SPARK-3181] [ML] Implement RobustRegression with huber ...

2016-08-05 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14326 I'm making the first pass now.

[GitHub] spark pull request #14109: [SPARK-16404][ML] LeastSquaresAggregators seriali...

2016-08-05 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14109#discussion_r73646330 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -862,27 +873,30 @@ class LinearRegressionSummary private

[GitHub] spark issue #14109: [SPARK-16404][ML] LeastSquaresAggregators serializes unn...

2016-08-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14109 @sethah In my opinion, using `@transient lazy val` is okay since there are only two places dereferencing the `lazy val`, and we don't use it in a tight loop. LGTM except one small comment

[GitHub] spark pull request #14109: [SPARK-16404][ML] LeastSquaresAggregators seriali...

2016-08-04 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14109#discussion_r73643402 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -862,27 +873,30 @@ class LinearRegressionSummary private

[GitHub] spark issue #13729: [SPARK-16008][ML] Remove unnecessary serialization in lo...

2016-07-05 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/13729 Hi @jodersky @sethah Could you test whether `@transient` helps performance in Linear Regression for the same serialization issue? https://github.com/apache/spark/blob/master/mllib

[GitHub] spark issue #8013: [SPARK-3181][MLLIB]: Add Robust Regression Algorithm with...

2016-07-05 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/8013 @yanboliang Sounds great! Thanks.

[GitHub] spark issue #8013: [SPARK-3181][MLLIB]: Add Robust Regression Algorithm with...

2016-07-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/8013 @rxin @mengxr I'm back in the US from leave and am going to revisit the PRs assigned to me. I worked with @MechCoder to implement the Huber estimator in Python's scikit-learn https://github.com/scikit-learn/scikit

[GitHub] spark issue #13729: [SPARK-16008][ML] Remove unnecessary serialization in lo...

2016-07-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/13729 @sethah Late comment. Great improvement for high-dimensional problems. I didn't test it out myself, and I wonder whether the `@transient` annotation works in the constructor of `LogisticAggregator

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-07-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/13796 @sethah I apologize for the delay. I just came back to the US. I'm going to make the first pass. Thanks.

[GitHub] spark issue #9457: [SPARK-11496][GRAPHX] Parallel implementation of personal...

2016-07-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/9457 Hello @moustaki, The work on standalone vectors and matrices was done in [SPARK-13944](https://issues.apache.org/jira/browse/SPARK-13944), and what you need to do is add

[GitHub] spark pull request: [SPARK-15413] [ML] [MLLIB] Change `toBreeze` t...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13198#issuecomment-220473337 Since this is a private API, this will not affect anything. Given that we are using `asML` now, I believe it is worth making them consistent.

[GitHub] spark pull request: [SPARK-15413] [ML] [MLLIB] Change `toBreeze` t...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13198#issuecomment-220438703 +cc @jkbradley @srowen

[GitHub] spark pull request: [SPARK-15411] [ML] Add @since to ml.stat.Multi...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13197#issuecomment-220422594 retest this please

[GitHub] spark pull request: [SPARK-15411] [ML] Add @since to ml.stat.Multi...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13197#issuecomment-220420285 Jenkins, test it again.

[GitHub] spark pull request: [SPARK-15413] [ML] [MLLIB] Change `toBreeze` t...

2016-05-19 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/13198 [SPARK-15413] [ML] [MLLIB] Change `toBreeze` to `asBreeze` in Vector and Matrix ## What changes were proposed in this pull request? We're using `asML` to convert the mllib vector/matrix

[GitHub] spark pull request: [SPARK-15411] [ML] Add @since to ml.stat.Multi...

2016-05-19 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/13197 [SPARK-15411] [ML] Add @since to ml.stat.MultivariateOnlineSummarizer.scala ## What changes were proposed in this pull request? Add @since to ml.stat.MultivariateOnlineSummarizer.scala

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13191#issuecomment-220404935 Thanks. Merged into master and 2.0 branch.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-05-19 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13191#issuecomment-22024 Test this please

[GitHub] spark pull request: [SPARK-14615][ML] Use the new ML Vector and Ma...

2016-05-18 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-220102877 Thank you to everyone who was involved in this work. I agree that the amount of work was underestimated, and some of it was actually hard to estimate given the issues

[GitHub] spark pull request: [SPARK-14615][ML] Use the new ML Vector and Ma...

2016-05-17 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12627#discussion_r63569421 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -967,4 +968,122 @@ object LogisticRegressionSuite

[GitHub] spark pull request: [SPARK-14906][ML] Move VectorUDT and MatrixUDT...

2016-05-16 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12870#issuecomment-219583049 Ping @mengxr

[GitHub] spark pull request: [SPARK-14615][ML] Use the new ML Vector and Ma...

2016-05-12 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-218949085 @viirya Can you try to merge my code with yours, and see if the python tests pass? Thanks.

[GitHub] spark pull request: [SPARK-14615][ML] Use the new ML Vector and Ma...

2016-05-12 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-218901313 @viirya How can we submit the two PRs independently? I thought your PR requires my changes as well. Thanks.

[GitHub] spark pull request: [SPARK-14615][ML] Use the new ML Vector and Ma...

2016-05-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-218659906 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-15268][SQL] Make JavaTypeInference work...

2016-05-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/13046#issuecomment-218383424 LGTM. Wait for test.

[GitHub] spark pull request: [SPARK-15268][SQL] Make JavaTypeInference work...

2016-05-11 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/13046#discussion_r62802522 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/VectorUDTSuite.scala --- @@ -17,9 +17,19 @@ package org.apache.spark.ml.linalg

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-05-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-218382765 Wait for https://github.com/apache/spark/pull/13046/files

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-05-06 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12627#discussion_r62402319 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -129,7 +131,9 @@ class KMeansModel private[ml] ( @Since("

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-05-05 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-217287176 @mengxr That can work, but we would need to import it everywhere. I can give it a shot.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-05-02 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-216393823 Currently, we're working on the 2.0 release. Please ping us again after the release.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-29 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215829721 @pravingadakh The changes in `modules.py` just help Jenkins understand the dependencies, and I don't get why this would break the build

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-29 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-215808336 @mengxr working on this now. Thanks.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215587478 Thanks. Merged into master.

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-215588256 @jkbradley The `@Since` annotation was merged https://github.com/apache/spark/pull/12416 Could you submit a followup PR? Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573982 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573910 Jenkins, test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573882 test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573656 Jenkins, please test it again.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564655 This seems very promising. Since the 2.0 window will close soon, it's unlikely to get into 2.0. Let's target 2.1.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561453 You may use some fake data to demonstrate the improvement from this PR. Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215553815 Can you also post the benchmark result with/without this PR for very sparse features? Thanks.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215553142 LGTM except one minor styling issue. Once that is updated, and tests pass, I'll go ahead and merge it. Thank you very much.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61496950 --- Diff: project/SparkBuild.scala --- @@ -50,10 +50,11 @@ object BuildCommons { ).map(ProjectRef(buildLocation, _)) val allProjects@Seq

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215545617 @pravingadakh Yes. Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539469 Make it build first, and then we can start to review the code. Thanks.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539228 You need to manually add it to the MiMa excludes.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215532091 Jenkins, ok to test.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215531609 @pravingadakh I think you need to update `dev/sparktestsupport/modules.py`. See https://github.com/apache/spark/commit/efaf7d18205f5ae3a1c767942ee7d7320f7410de

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-28 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61473084 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/UDTRegistration.scala --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-214923459 LGTM. Merged into master. Thanks.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-26 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61183805 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-26 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-214870752 Besides the comments from other people, I think you need to update the module dependencies in `sparktestsupport/modules.py` so that when tag is changed, all the modules

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-26 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61139825 --- Diff: dev/run-tests.py --- @@ -128,7 +128,7 @@ def determine_tags_to_exclude(changed_modules): tags = [] for m in modules.all_modules

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018699 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -274,6 +339,12 @@ class KMeans @Since("

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018493 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedGeneralTypeParams.scala --- @@ -0,0 +1,34 @@ +/* --- End diff -- I

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61018069 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -258,6 +290,27 @@ class KMeans @Since("1.5.0") ( @Si

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017723 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -198,6 +231,17 @@ object KMeansModel extends MLReadable[KMeansModel

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017539 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -171,12 +192,23 @@ object KMeansModel extends MLReadable[KMeansModel

[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r61017075 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -137,6 +138,17 @@ class KMeansModel private[ml] ( @Since("

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-214562802 I think `override def pyUDT: String = "pyspark.mllib.linalg.MatrixUDT"` has to be changed; otherwise, this will cause inconsistent results. Maybe this can

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61006720 --- Diff: project/SparkBuild.scala --- @@ -53,7 +53,7 @@ object BuildCommons { core, graphx, mllib, mllibLocal, repl, networkCommon, networkShuffle

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-214560261 @pravingadakh I think you need to update `dev/sparktestsupport/modules.py` for Jenkins build as well. Thanks.

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61004682 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala --- @@ -561,10 +589,11 @@ object DenseVector { * @param indices index

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61004014 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -415,13 +443,14 @@ object DenseMatrix

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61003884 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -154,11 +172,12 @@ sealed trait Matrix extends Serializable

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-214548556 @vanzin I guess using java doc `/* @since /*` style, it's harder to document the public variables in the constructor. Just my 0.02 cents.

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12593#issuecomment-214543499 Do you want to add the test code for both `MultivariateGaussian.scala` and `Utils.scala`?

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12593#discussion_r60996366 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/stat/distribution/MultivariateGaussian.scala --- @@ -0,0 +1,131 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14732][ML] spark.ml GaussianMixture sho...

2016-04-25 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12593#discussion_r60996174 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -17,17 +17,19 @@ package org.apache.spark.ml.clustering

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-22 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12627#issuecomment-213625305 Waiting for https://github.com/apache/spark/pull/12259 to be merged.

[GitHub] spark pull request: [SPARK-14615][ML][WIP] Use the new ML Vector a...

2016-04-22 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/12627 [SPARK-14615][ML][WIP] Use the new ML Vector and Matrix in the ML pipeline based algorithms ## What changes were proposed in this pull request? Once SPARK-14487 and SPARK-14549 are merged

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60811552 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/VectorUDTSuite.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60776248 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60776001 --- Diff: mllib/src/test/scala/org/apache/spark/ml/linalg/MatrixUDTSuite.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775630 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/VectorUDT.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775523 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60775018 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-22 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r60757277 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-22 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-213476037 Ping @pravingadakh for update. Thanks.

[GitHub] spark pull request: [SPARK-14734][ML][MLLIB] Added asML, fromML me...

2016-04-21 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12504#issuecomment-213162793 Thanks. Merged into master.
