[GitHub] spark pull request #19185: [Spark-21854] Added LogisticRegressionTrainingSum...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19185#discussion_r138045280 --- Diff: python/pyspark/ml/classification.py --- @@ -529,8 +529,11 @@ def summary(self): """ if

[GitHub] spark pull request #19185: [Spark-21854] Added LogisticRegressionTrainingSum...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19185#discussion_r138045342 --- Diff: python/pyspark/ml/classification.py --- @@ -529,8 +529,11 @@ def summary(self): """ if

[GitHub] spark pull request #19185: [Spark-21854] Added LogisticRegressionTrainingSum...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19185#discussion_r138045070 --- Diff: python/pyspark/ml/classification.py --- @@ -529,8 +529,11 @@ def summary(self): """ if

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138025640 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138024385 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138025184 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138027427 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138021102 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138024573 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r138023290 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,437 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #19172: [SPARK-21856] Add probability and rawPrediction to MLPC ...

2017-09-11 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19172 LGTM2, merged into master. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #19172: [SPARK-21856] Add probability and rawPrediction t...

2017-09-10 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19172#discussion_r137975691 --- Diff: python/pyspark/ml/tests.py --- @@ -1655,6 +1655,26 @@ def test_multinomial_logistic_regression_with_bound(self): np.allclose

[GitHub] spark issue #19185: [Spark-21854] Added LogisticRegressionTrainingSummary fo...

2017-09-10 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19185 @gatorsmile Thanks.

[GitHub] spark issue #19176: [SPARK-21965] [SparkR] Add createOrReplaceGlobalTempView...

2017-09-10 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19176 Oh sorry, I didn't find that PR before. Let's discuss this issue in that JIRA, I'll close this PR. Thanks for all you

[GitHub] spark pull request #19176: [SPARK-21965] [SparkR] Add createOrReplaceGlobalT...

2017-09-10 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/19176

[GitHub] spark issue #19176: [SPARK-21965] [SparkR] Add createOrReplaceGlobalTempView...

2017-09-09 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19176 cc @felixcheung

[GitHub] spark pull request #19172: [SPARK-21856] Add probability and rawPrediction t...

2017-09-09 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19172#discussion_r137929197 --- Diff: python/pyspark/ml/tests.py --- @@ -1655,6 +1655,25 @@ def test_multinomial_logistic_regression_with_bound(self): np.allclose

[GitHub] spark pull request #19172: [SPARK-21856] Add probability and rawPrediction t...

2017-09-09 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19172#discussion_r137929052 --- Diff: python/pyspark/ml/classification.py --- @@ -1425,11 +1425,13 @@ class MultilayerPerceptronClassifier(JavaEstimator, HasFeaturesCol

[GitHub] spark pull request #19172: [SPARK-21856] Add probability and rawPrediction t...

2017-09-09 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19172#discussion_r137929080 --- Diff: python/pyspark/ml/classification.py --- @@ -1442,11 +1444,13 @@ def __init__(self, featuresCol="features", labelCol="label"

[GitHub] spark pull request #19176: [SPARK-21965] [SparkR] Add createOrReplaceGlobalT...

2017-09-09 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/19176 [SPARK-21965] [SparkR] Add createOrReplaceGlobalTempView and dropGlobalTempView for SparkR ## What changes were proposed in this pull request? Add ```createOrReplaceGlobalTempView``` and

[GitHub] spark pull request #19173: [Minor] [SQL] Correct PySpark DataFrame doc.

2017-09-09 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/19173 [Minor] [SQL] Correct PySpark DataFrame doc. ## What changes were proposed in this pull request? Correct PySpark DataFrame doc. ## How was this patch tested? Only doc change, no

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137278367 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to the

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137253446 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with the im...

2017-09-06 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18538 @mgaido91 I left some minor comments, otherwise, this looks good. Thanks.

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239650 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137240370 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to the

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137224923 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239744 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137242981 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to the

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137180832 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239772 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137178736 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137238642 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137226104 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137180329 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137226738 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239566 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239933 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137180194 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137178071 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137226969 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239478 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137226318 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137242127 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/ClusteringEvaluatorSuite.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to the

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137239906 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137175816 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-09-06 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r137178833 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #8883: [SPARK-10884] [ML] Support prediction on single in...

2017-09-05 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/8883

[GitHub] spark issue #19020: [SPARK-3181] [ML] Implement huber loss for LinearRegress...

2017-09-01 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19020 @MLnick @WeichenXu123 Thanks for your comments, also cc @jkbradley @hhbyyh @sethah, would you mind taking a look? Thanks. --- If your project is set up for it, you can reply to this email and

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-09-01 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19020#discussion_r136515821 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/HuberAggregator.scala --- @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136333104 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136332399 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136306135 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136304803 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136305238 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r136305819 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,379 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #17862: [SPARK-20602] [ML]Adding LBFGS optimizer and Squared_hin...

2017-08-30 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/17862 +1 @jkbradley for testing on large-scale datasets. @hhbyyh Do you have time to test it? If not, I can help. Thanks.

[GitHub] spark pull request #18610: [SPARK-21386] ML LinearRegression supports warm s...

2017-08-30 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18610#discussion_r136020810 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -72,6 +72,22 @@ private[regression] trait

[GitHub] spark pull request #18610: [SPARK-21386] ML LinearRegression supports warm s...

2017-08-30 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18610#discussion_r136018943 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -226,6 +246,12 @@ class LinearRegression @Since("

[GitHub] spark issue #18610: [SPARK-21386] ML LinearRegression supports warm start fr...

2017-08-29 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18610 @hhbyyh We got agreement that the initialModel should be of type ```[T <: Model[T]]``` at #9. I understand the scenario you mentioned, however, I think they are different scenarios:

[GitHub] spark issue #18998: [SPARK-21748][ML] Migrate the implementation of HashingT...

2017-08-29 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18998 @srowen You are right, mllib won't be removed before 3.0, but we don't want to leave the migration to the last minute. Thanks.

[GitHub] spark issue #18998: [SPARK-21748][ML] Migrate the implementation of HashingT...

2017-08-29 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18998 @srowen Besides what @facaiy mentioned above, we will remove the spark.mllib package in the future, so all implementations in spark.mllib should be copied to spark.ml. Actually lots of MLlib

[GitHub] spark pull request #17014: [SPARK-18608][ML] Fix double-caching in ML algori...

2017-08-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/17014#discussion_r135740464 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala --- @@ -85,6 +86,10 @@ abstract class Predictor[ M <: PredictionMo

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-25 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18902 @zhengruifeng It is a known issue that DataFrame-based operations are 2~3x slower than RDD-based operations, because of the deserialization cost. If we switch to an RDD-based method, we need to implement our

[GitHub] spark pull request #19029: [SPARK-21818][ML][MLLIB] Fix bug of MultivariateO...

2017-08-25 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19029#discussion_r135216154 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -440,7 +440,7 @@ private[ml] object WeightedLeastSquares

[GitHub] spark issue #18315: [SPARK-21108] [ML] convert LinearSVC to aggregator frame...

2017-08-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18315 Merged into master. Thanks for all.

[GitHub] spark pull request #18315: [SPARK-21108] [ML] convert LinearSVC to aggregato...

2017-08-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r135172322 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #19029: [SPARK-21818][ML][MLLIB] Fix bug of MultivariateO...

2017-08-23 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19029#discussion_r134816423 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -438,6 +438,10 @@ private[ml] object SummaryBuilderImpl extends Logging

[GitHub] spark pull request #18315: [SPARK-21108] [ML] convert LinearSVC to aggregato...

2017-08-23 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r134718512 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #18315: [SPARK-21108] [ML] convert LinearSVC to aggregato...

2017-08-23 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r134718447 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #18315: [SPARK-21108] [ML] convert LinearSVC to aggregato...

2017-08-23 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r134717404 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/HingeAggregator.scala --- @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #19011: [ML][MINOR] Make sharedParams update.

2017-08-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19011 Merged into master. Thanks.

[GitHub] spark issue #19020: [SPARK-3181] [ML] Implement huber loss for LinearRegress...

2017-08-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19020 @MLnick Yeah, I think we reached an agreement in the JIRA discussion.

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-08-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19020#discussion_r134476397 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -344,33 +408,58 @@ class LinearRegression @Since("

[GitHub] spark issue #14326: [SPARK-3181] [ML] Implement RobustRegression with huber ...

2017-08-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14326 Please go to #19020 for review and comments. Thanks.

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-08-22 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/19020 [SPARK-3181] [ML] Implement huber loss for LinearRegression. ## What changes were proposed in this pull request? The current implementation is a straightforward port of Python scikit
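For readers unfamiliar with the huber loss named in the pull request title above, here is a minimal sketch of the textbook definition in plain Python. It is not the code from the pull request; the function names `huber_loss` and `mean_huber_loss` and the parameter `delta` are hypothetical, and the scaled variant used by scikit-learn or Spark may differ in detail.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Textbook Huber loss for one residual (hypothetical helper, not Spark's API).

    Quadratic for |residual| <= delta, linear beyond it, so large outliers
    contribute less than they would under squared error.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    abs_r = abs(residual)
    if abs_r <= delta:
        return 0.5 * residual * residual
    return delta * (abs_r - 0.5 * delta)


def mean_huber_loss(y_true, y_pred, delta: float = 1.0) -> float:
    """Average Huber loss over paired labels and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    total = sum(huber_loss(t - p, delta) for t, p in zip(y_true, y_pred))
    return total / len(y_true)
```

The two branches meet at `|residual| == delta` with the same value and slope, which is what keeps the loss differentiable there.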

[GitHub] spark issue #14326: [SPARK-3181] [ML] Implement RobustRegression with huber ...

2017-08-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14326 I'll close this PR and open a new one. Feel free to review and comment. Thanks.

[GitHub] spark pull request #14326: [SPARK-3181] [ML] Implement RobustRegression with...

2017-08-22 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/14326

[GitHub] spark issue #18992: [SPARK-19762][ML][FOLLOWUP]Add necessary comments to L2R...

2017-08-21 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18992 Merged into master. Thanks.

[GitHub] spark pull request #19011: [ML][MINOR] Make sharedParams update.

2017-08-21 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19011#discussion_r134232769 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala --- @@ -154,7 +154,7 @@ private[ml] trait HasVarianceCol extends

[GitHub] spark pull request #19011: [ML][MINOR] Make sharedParams update.

2017-08-21 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/19011 [ML][MINOR] Make sharedParams update. ## What changes were proposed in this pull request? ```sharedParams.scala``` was generated by ```SharedParamsCodeGen```, but it's not updated curr

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-18 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133961918 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18992: [SPARK-19762][ML][FOLLOWUP]Add necessary comments to L2R...

2017-08-18 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18992 @sethah @srowen Thanks for your great contributions to #17094. I hope you don't mind; I found that this annotation section was missing. I think this is very important for users/developers to

[GitHub] spark pull request #18992: [SPARK-19762][ML][FOLLOWUP]Add necessary comments...

2017-08-18 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/18992 [SPARK-19762][ML][FOLLOWUP]Add necessary comments to L2Regularization. ## What changes were proposed in this pull request? MLlib LiR/LoR/SR always standardize the data during training to

[GitHub] spark issue #18980: Correct validateAndTransformSchema in GaussianMixture

2017-08-18 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18980 +1 @srowen, this is a bug. @sharp-pixel Would you mind fixing both ```GaussianMixture``` and ```AFTSurvivalRegression```? It's better to file a JIRA first and add some unit tests. T

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133876325 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133875990 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-17 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18902 @zhengruifeng What does _the RDD-based one_ mean? Is it the code on master or the code in your former commit? Thanks

[GitHub] spark issue #17862: [SPARK-20602] [ML]Adding LBFGS optimizer and Squared_hin...

2017-08-17 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/17862 +1 @WeichenXu123 IIRC softmax regression also includes a non-differentiable point, and we can use LBFGS to solve it as well. We can support _squared hinge loss_, which is a smooth function, in the future, so

[GitHub] spark issue #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to aggregator...

2017-08-17 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18315 @hhbyyh I think it's ready to remove ```WIP``` from the PR title.

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133671265 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133662950 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -173,7 +174,7 @@ class LinearSVCSuite extends

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133671815 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133665834 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -219,8 +219,17 @@ class LinearSVC @Since("

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133671409 --- Diff: mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/HingeAggregatorSuite.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the

[GitHub] spark pull request #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to agg...

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18315#discussion_r133664633 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/HingeAggregator.scala --- @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18902: [SPARK-21690][ML] one-pass imputer

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18902#discussion_r133649640 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -133,23 +133,45 @@ class Imputer @Since("2.2.0") (@Si

[GitHub] spark pull request #18902: [SPARK-21690][ML] one-pass imputer

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18902#discussion_r133649896 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -133,23 +133,45 @@ class Imputer @Since("2.2.0") (@Si

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-17 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18902 @hhbyyh @zhengruifeng I'm OK with the _convert to null_ method. I think there is no extra pass over the data if we handle it this way, and the DataFrame/RDD functions to compute _mean/me

[GitHub] spark pull request #18902: [SPARK-21690][ML] one-pass imputer

2017-08-17 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18902#discussion_r133647271 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -133,23 +133,45 @@ class Imputer @Since("2.2.0") (@Si
