spark git commit: [MINOR][ML][MLLIB] Remove work around for breeze sparse matrix.

2016-09-04 Thread yliang
Repository: spark Updated Branches: refs/heads/master cdeb97a8c -> 1b001b520 [MINOR][ML][MLLIB] Remove work around for breeze sparse matrix. ## What changes were proposed in this pull request? Since we have updated the breeze version to 0.12, we should remove the workaround for the bug in breeze sparse

spark git commit: [SPARK-17197][ML][PYSPARK] PySpark LiR/LoR supports tree aggregation level configurable.

2016-08-25 Thread yliang
Repository: spark Updated Branches: refs/heads/master e0b20f9f2 -> 6b8cb1fe5 [SPARK-17197][ML][PYSPARK] PySpark LiR/LoR supports tree aggregation level configurable. ## What changes were proposed in this pull request? [SPARK-17090](https://issues.apache.org/jira/browse/SPARK-17090) makes

spark git commit: [MINOR][DOC] Fix wrong ml.feature.Normalizer document.

2016-08-24 Thread yliang
Repository: spark Updated Branches: refs/heads/master 92c0eaf34 -> 45b786aca [MINOR][DOC] Fix wrong ml.feature.Normalizer document. ## What changes were proposed in this pull request? The ```ml.feature.Normalizer``` examples illustrate L1 norm rather than L2, we should correct corresponding

spark git commit: [SPARK-17090][FOLLOW-UP][ML] Add expert param support to SharedParamsCodeGen

2016-08-22 Thread yliang
Repository: spark Updated Branches: refs/heads/master 6d93f9e02 -> 37f0ab70d [SPARK-17090][FOLLOW-UP][ML] Add expert param support to SharedParamsCodeGen ## What changes were proposed in this pull request? Add expert param support to SharedParamsCodeGen where aggregationDepth is an expert param

spark git commit: [SPARK-16961][FOLLOW-UP][SPARKR] More robust test case for spark.gaussianMixture.

2016-08-21 Thread yliang
Repository: spark Updated Branches: refs/heads/master 61ef74f22 -> 7f08a60b6 [SPARK-16961][FOLLOW-UP][SPARKR] More robust test case for spark.gaussianMixture. ## What changes were proposed in this pull request? #14551 fixed an off-by-one bug in ```randomizeInPlace``` and some test failure

spark git commit: [SPARK-15018][PYSPARK][ML] Improve handling of PySpark Pipeline when used without stages

2016-08-20 Thread yliang
Repository: spark Updated Branches: refs/heads/master 45d40d9f6 -> 39f328ba3 [SPARK-15018][PYSPARK][ML] Improve handling of PySpark Pipeline when used without stages ## What changes were proposed in this pull request? When fitting a PySpark Pipeline without the `stages` param set, a

spark git commit: [SPARK-17141][ML] MinMaxScaler should remain NaN value.

2016-08-19 Thread yliang
Repository: spark Updated Branches: refs/heads/master 5377fc623 -> 864be9359 [SPARK-17141][ML] MinMaxScaler should remain NaN value. ## What changes were proposed in this pull request? In the existing code, ```MinMaxScaler``` handles ```NaN``` values indeterminately. * If a column has identity

spark git commit: [SPARK-16934][ML][MLLIB] Update LogisticCostAggregator serialization code to make it consistent with LinearRegression

2016-08-15 Thread yliang
Repository: spark Updated Branches: refs/heads/master ddf0d1e3f -> 3d8bfe7a3 [SPARK-16934][ML][MLLIB] Update LogisticCostAggregator serialization code to make it consistent with LinearRegression ## What changes were proposed in this pull request? Update LogisticCostAggregator serialization

spark git commit: [MINOR][ML] Rename TreeEnsembleModels to TreeEnsembleModel for PySpark

2016-08-12 Thread yliang
Repository: spark Updated Branches: refs/heads/master ac84fb64d -> ccc6dc0f4 [MINOR][ML] Rename TreeEnsembleModels to TreeEnsembleModel for PySpark ## What changes were proposed in this pull request? Fix the typo ```TreeEnsembleModels``` for PySpark; it should be ```TreeEnsembleModel```

spark git commit: [SPARK-16933][ML] Fix AFTAggregator in AFTSurvivalRegression serializes unnecessary data.

2016-08-09 Thread yliang
Repository: spark Updated Branches: refs/heads/master 511f52f84 -> 182e11904 [SPARK-16933][ML] Fix AFTAggregator in AFTSurvivalRegression serializes unnecessary data. ## What changes were proposed in this pull request? Similar to ```LeastSquaresAggregator``` in #14109, ```AFTAggregator```

spark git commit: [SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector

2016-08-02 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 9d9956e8f -> c5516ab60 [SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector ## What changes were proposed in this pull request? mllib.LDAExample uses ML pipeline and MLlib LDA

spark git commit: [SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector

2016-08-02 Thread yliang
Repository: spark Updated Branches: refs/heads/master d9e0919d3 -> dd8514fa2 [SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector ## What changes were proposed in this pull request? mllib.LDAExample uses ML pipeline and MLlib LDA algorithm.

spark git commit: [SPARK-16851][ML] Incorrect threshould length in 'setThresholds()' evoke Exception

2016-08-02 Thread yliang
Repository: spark Updated Branches: refs/heads/master a1ff72e1c -> d9e0919d3 [SPARK-16851][ML] Incorrect threshould length in 'setThresholds()' evoke Exception ## What changes were proposed in this pull request? Add a length check for the thresholds in method `setThresholds()` of

spark git commit: [PYSPARK] add picklable SparseMatrix in pyspark.ml.common

2016-07-24 Thread yliang
Repository: spark Updated Branches: refs/heads/master cc1d2dcb6 -> 37bed97de [PYSPARK] add picklable SparseMatrix in pyspark.ml.common ## What changes were proposed in this pull request? Add a `SparseMatrix` class which supports pickling. ## How was this patch tested? Existing test. Author:

spark git commit: [SPARK-16307][ML] Add test to verify the predicted variances of a DT on toy data

2016-07-06 Thread yliang
Repository: spark Updated Branches: refs/heads/master 7e28fabdf -> 909c6d812 [SPARK-16307][ML] Add test to verify the predicted variances of a DT on toy data ## What changes were proposed in this pull request? The current tests assume that `impurity.calculate()` returns the variance

spark git commit: [SPARK-16249][ML] Change visibility of Object ml.clustering.LDA to public for loading

2016-07-06 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 521fc7186 -> 25006c8bc [SPARK-16249][ML] Change visibility of Object ml.clustering.LDA to public for loading ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-16249 Change visibility

spark git commit: [SPARK-16249][ML] Change visibility of Object ml.clustering.LDA to public for loading

2016-07-06 Thread yliang
Repository: spark Updated Branches: refs/heads/master 5f342049c -> 5497242c7 [SPARK-16249][ML] Change visibility of Object ml.clustering.LDA to public for loading ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-16249 Change visibility of

spark git commit: [SPARK-16260][ML][EXAMPLE] PySpark ML Example Improvements and Cleanup

2016-07-04 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 0c6fd03fa -> 3ecee573c [SPARK-16260][ML][EXAMPLE] PySpark ML Example Improvements and Cleanup ## What changes were proposed in this pull request? 1). Remove unused import in Scala example; 2). Move spark session import outside example

spark git commit: [SPARK-16260][ML][EXAMPLE] PySpark ML Example Improvements and Cleanup

2016-07-04 Thread yliang
Repository: spark Updated Branches: refs/heads/master 262833397 -> a539b724c [SPARK-16260][ML][EXAMPLE] PySpark ML Example Improvements and Cleanup ## What changes were proposed in this pull request? 1). Remove unused import in Scala example; 2). Move spark session import outside example

spark git commit: [SPARK-16241][ML] model loading backward compatibility for ml NaiveBayes

2016-06-30 Thread yliang
Repository: spark Updated Branches: refs/heads/master 2c3d96134 -> b30a2dc7c [SPARK-16241][ML] model loading backward compatibility for ml NaiveBayes ## What changes were proposed in this pull request? model loading backward compatibility for ml NaiveBayes ## How was this patch tested?

spark git commit: [SPARK-16241][ML] model loading backward compatibility for ml NaiveBayes

2016-06-30 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 c8a7c2305 -> 1d274455c [SPARK-16241][ML] model loading backward compatibility for ml NaiveBayes ## What changes were proposed in this pull request? model loading backward compatibility for ml NaiveBayes ## How was this patch tested?

spark git commit: [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python)

2016-06-28 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 af70ad028 -> b349237e4 [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers for #13888 to convert

spark git commit: [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python)

2016-06-28 Thread yliang
Repository: spark Updated Branches: refs/heads/master f6b497fcd -> e158478a9 [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers for #13888 to convert old/new

spark git commit: [SPARK-15946][MLLIB] Conversion between old/new vector columns in a DataFrame (Python)

2016-06-17 Thread yliang
Repository: spark Updated Branches: refs/heads/master af2a4b082 -> edb23f9e4 [SPARK-15946][MLLIB] Conversion between old/new vector columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers for #13662 to convert old/new vector

spark git commit: [SPARK-15738][PYSPARK][ML] Adding Pyspark ml RFormula __str__ method similar to Scala API

2016-06-10 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 8b6742a37 -> 80b8711b3 [SPARK-15738][PYSPARK][ML] Adding Pyspark ml RFormula __str__ method similar to Scala API ## What changes were proposed in this pull request? Adding __str__ to RFormula and model that will show the set formula

spark git commit: [SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference

2016-06-07 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 9e16f23e7 -> e21a9ddef [SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference ## What changes were proposed in this pull request? When fitting ```LinearRegressionModel```(by "l-bfgs" solver)

spark git commit: [SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference

2016-06-07 Thread yliang
Repository: spark Updated Branches: refs/heads/master 890baaca5 -> 6ecedf39b [SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference ## What changes were proposed in this pull request? When fitting ```LinearRegressionModel```(by "l-bfgs" solver) and
