[GitHub] spark pull request #21097: [SPARK-14682][ML] Provide evaluateEachIteration m...

2018-04-19 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/21097#discussion_r182829257 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala --- @@ -365,6 +365,20 @@ class GBTClassifierSuite extends

[GitHub] spark pull request #21097: [SPARK-14682][ML] Provide evaluateEachIteration m...

2018-04-18 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/21097#discussion_r182603253 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala --- @@ -365,6 +365,20 @@ class GBTClassifierSuite extends

[GitHub] spark pull request #21090: [SPARK-15784][ML] Add Power Iteration Clustering ...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/21090#discussion_r182254888 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,256 @@ +/* + * Licensed

[GitHub] spark issue #21090: [SPARK-15784][ML] Add Power Iteration Clustering to spar...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/21090 Took a quick look. Despite the style failure and a minor format issue, LGTM. --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #21090: [SPARK-15784][ML] Add Power Iteration Clustering ...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/21090#discussion_r182243819 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/PowerIterationClusteringSuite.scala --- @@ -0,0 +1,239 @@ +/* + * Licensed

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 closed the pull request at: https://github.com/apache/spark/pull/15770

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @jkbradley I am closing this one now. Thanks!

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2018-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @jkbradley Sorry for missing your comments. Anyway, I will close it now. I will choose another one to work on. Thanks

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2018-01-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 ping @yanboliang

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-11-21 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 ping @yanboliang

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-11-09 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @weichenXu123 Any other comments? Thanks!

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-11-01 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @WeichenXu123 Thanks for your review and reply! I agree with you that the helper can be discussed later for potential enhancement

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-10-31 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @WeichenXu123, for the graph helper, MLlib has a version that takes `Graph[Double, Double]` as a parameter for training. In ML, do we have to provide a `Dataset` of `Graph`? Can you specify

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078744 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-09-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 I will address the review comments soon. Thanks! @WeichenXu123

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-09-08 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 ping @WeichenXu123

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-09-08 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 ping @WeichenXu123

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-19 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @WeichenXu123 I have made changes based on your comments. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-16 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 Jenkins, retest this please.

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 [info] Main Scala API documentation successful. [error] (spark/javaunidoc:doc) javadoc returned nonzero exit code [error] Total time: 95 s, completed Aug 15, 2017 4:59:59 PM [error

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 retest please

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 Jenkins, retest please

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 Weird. The local style test passed. Anyway, I changed the order as required by Jenkins.

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r133271527 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,213 @@ +/* + * Licensed

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-08-15 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r133267575 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,213 @@ +/* + * Licensed

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-08-10 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @WeichenXu123 Thanks for reviewing! I will address the comments soon.

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-24 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 @felixcheung Can you take a look? Thanks!

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-20 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 @yanboliang I have made changes accordingly. Thanks!

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-18 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 @yanboliang Thanks for your reply! I will change the unit tests now.

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-17 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 @yanboliang after #18613, unit tests fail if "skip" is used. For example, data <- data.frame(clicked = base::sample(c(0, 1), 10, replace = TRUE), someString = b

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 Sure. I am reading the #18613 comments. I just came back from a business trip. Thanks!

[GitHub] spark issue #18613: [SPARK-20307][ML][SPARKR][FOLLOW-UP] RFormula should han...

2017-07-12 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18613 @felixcheung I agree. We should make the changes on the Scala side.

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-12 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 Trigger windows check.

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-12 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 Reopen for windows check

[GitHub] spark pull request #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleIn...

2017-07-12 Thread wangmiao1981
GitHub user wangmiao1981 reopened a pull request: https://github.com/apache/spark/pull/18605 [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid for classification algorithms ## What changes were proposed in this pull request? SPARK-20307 Added handleInvalid option

[GitHub] spark pull request #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleIn...

2017-07-12 Thread wangmiao1981
Github user wangmiao1981 closed the pull request at: https://github.com/apache/spark/pull/18605

[GitHub] spark issue #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid f...

2017-07-11 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18605 @felixcheung This is a follow-up PR of JIRA-20307.

[GitHub] spark pull request #18605: [SparkR][SPARK-21381]:SparkR: pass on setHandleIn...

2017-07-11 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/18605 [SparkR][SPARK-21381]:SparkR: pass on setHandleInvalid for classification algorithms ## What changes were proposed in this pull request? SPARK-20307 Added handleInvalid option

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-08 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 #14850 is the PR that prints the full stack. We can improve it by printing the cause instead of printing the stack.

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-08 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 I will review all classifiers to add the handleInvalid when necessary.

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-08 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 Actually, the udf in transform() of StringIndexer.scala will throw an exception in the action, but it doesn't stop the execution of collect(). val indexer = udf { label: String

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-07 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 I did a quick debug: In Dataset.scala def ofRows(sparkSession: SparkSession, logicalPlan: LogicalPlan): DataFrame = { val qe = sparkSession.sessionState.executePlan(logicalPlan

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-07 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 @felixcheung Yes. I think we can improve the Scala side. It only throws an exception when a `NULL` field is given. For unseen labels, as in the example above, it always fails at the same place `double

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-07-07 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15770 @yanboliang Can you take a look first? Thanks!

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-07-07 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r126198035 --- Diff: R/pkg/tests/fulltests/test_mllib_tree.R --- @@ -212,6 +212,23 @@ test_that("spark.randomForest", { expect_equal(length

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-07-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125954907 --- Diff: R/pkg/tests/fulltests/test_mllib_tree.R --- @@ -212,6 +212,23 @@ test_that("spark.randomForest", { expect_equal(length

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-07-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125802201 --- Diff: R/pkg/tests/fulltests/test_mllib_tree.R --- @@ -212,6 +212,23 @@ test_that("spark.randomForest", { expect_equal(length

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-07-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125703030 --- Diff: R/pkg/tests/fulltests/test_mllib_tree.R --- @@ -212,6 +212,23 @@ test_that("spark.randomForest", { expect_equal(length

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-07-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125702340 --- Diff: R/pkg/R/mllib_tree.R --- @@ -374,6 +374,10 @@ setMethod("write.ml", signature(object = "GBTClassificationModel&

[GitHub] spark pull request #18518: [MINOR][SparkR]: ignore Rplots.pdf test output af...

2017-07-03 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/18518 [MINOR][SparkR]: ignore Rplots.pdf test output after running R tests ## What changes were proposed in this pull request? After running R tests in local build, it outputs Rplots.pdf

[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-01 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18496 I will fix it tonight. It is weird. In my local test, it passed. It seems that my new change doesn't apply to the test. Anyway, I will fix the failure first.

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-06-30 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125154756 --- Diff: R/pkg/R/mllib_tree.R --- @@ -409,7 +413,7 @@ setMethod("spark.randomForest", signature(data = "SparkDataFrame&

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-06-30 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18496#discussion_r125154735 --- Diff: R/pkg/R/mllib_tree.R --- @@ -374,6 +374,10 @@ setMethod("write.ml", signature(object = "GBTClassificationModel&

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-06-30 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 @jiangxb1987 The original PR has some issues that are not correctly handled. I will open a new PR when I figure out the right fix. I intended to close this PR. Thanks for closing

[GitHub] spark pull request #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleIn...

2017-06-30 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/18496 [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer ## What changes were proposed in this pull request? For randomForest classifier

[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-13 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18128 ping @yanboliang

[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18128 @felixcheung If I remove `as.integer`, the backend doesn't recognize it as `integer`.

[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18128 Local test passed. Let me check it tonight.

[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18128 Jenkins retest this please

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119978881 --- Diff: R/pkg/R/mllib_classification.R --- @@ -239,21 +253,64 @@ function(object, path, overwrite = FALSE) { setMethod("spark.logit",

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119911447 --- Diff: R/pkg/R/mllib_classification.R --- @@ -239,21 +253,57 @@ function(object, path, overwrite = FALSE) { setMethod("spark.logit",

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119911006 --- Diff: R/pkg/R/mllib_classification.R --- @@ -239,21 +253,57 @@ function(object, path, overwrite = FALSE) { setMethod("spark.logit",

[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-05-31 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/18128 @yanboliang Can you take a look? Thanks!

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-05-27 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/18128 [SPARK-20906][SparkR]:Constrained Logistic Regression for SparkR ## What changes were proposed in this pull request? PR https://github.com/apache/spark/pull/17715 Added Constrained

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345383 --- Diff: R/pkg/DESCRIPTION --- @@ -42,6 +42,7 @@ Collate: 'functions.R' 'install.R' 'jvm.R' +'mllib_wrapper.R

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345166 --- Diff: R/pkg/R/mllib_regression.R --- @@ -360,6 +338,7 @@ setMethod("spark.isoreg", signature(data = "SparkDataFrame"

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345323 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345209 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345283 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116344992 --- Diff: R/pkg/R/mllib_classification.R --- @@ -22,29 +22,36 @@ #' #' @param jobj a Java object reference to the backing Scala

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116344933 --- Diff: R/pkg/R/generics.R --- @@ -1535,9 +1535,7 @@ setGeneric("spark.freqItemsets", function(object) { standardGeneric(&q

[GitHub] spark issue #17808: [SPARK-20533][SparkR]:SparkR Wrappers Model should be pr...

2017-04-29 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17808 I think we don't have to back-port. This is a small improvement/optimization of the original code.

[GitHub] spark pull request #17808: [SPARK-20533][SparkR]:SparkR Wrappers Model shoul...

2017-04-29 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17808 [SPARK-20533][SparkR]:SparkR Wrappers Model should be private and value should be lazy ## What changes were proposed in this pull request? MultilayerPerceptronClassifierWrapper model

[GitHub] spark issue #17805: [SPARK-20477][SparkR][DOC]: Document R bisecting k-means...

2017-04-29 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17805 cc @felixcheung This is a similar documentation change.

[GitHub] spark pull request #17805: [SparkR][DOC][SPARK-20477]: Document R bisecting ...

2017-04-28 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17805 [SparkR][DOC][SPARK-20477]: Document R bisecting k-means in R programming guide ## What changes were proposed in this pull request? Add a hyperlink in the SparkR programming guide

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113974703 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113972686 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark issue #17797: [SparkR][DOC]:Document LinearSVC in R programming guide

2017-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17797 @felixcheung As I checked the SparkR programming guide, it seems that all machine learning parts are links to existing documents. So I just added the link to the Linear SVM document and tested

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113853483 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark pull request #17797: [SparkR][DOC]:Document LinearSVC in R programming...

2017-04-27 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17797 [SparkR][DOC]:Document LinearSVC in R programming guide ## What changes were proposed in this pull request? add link to svmLinear in the SparkR programming document. ## How

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113823246 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-26 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113586516 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-26 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113585851 --- Diff: R/pkg/R/serialize.R --- @@ -83,6 +83,7 @@ writeObject <- function(con, object, writeType = TRUE) { Date = writeDate(con, obj

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113362108 --- Diff: R/pkg/inst/tests/testthat/test_Serde.R --- @@ -28,6 +28,10 @@ test_that("SerDe of primitive types", { expect_

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113358460 --- Diff: R/pkg/inst/tests/testthat/test_Serde.R --- @@ -28,6 +28,10 @@ test_that("SerDe of primitive types", { expect_

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17640#discussion_r113358355 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -3043,6 +3043,23 @@ test_that("catalog APIs, currentDatabase, setCurrentDat

[GitHub] spark pull request #17754: [FollowUp][SPARK-18901][ML]: Require in LR Logist...

2017-04-24 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17754 [FollowUp][SPARK-18901][ML]: Require in LR LogisticAggregator is redundant ## What changes were proposed in this pull request? This is a follow-up PR of #17478. ## How

[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...

2017-04-24 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17478 @yanboliang I will do it. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-23 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 @felixcheung I just came back from vacation. I will make changes now. Thanks!

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-17 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 I am adding more tests right now.

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-16 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 Based on my understanding, it does not directly solve the 12360. This one just solves the serialization of a specific type `bigint` in a struct field.

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-16 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 For `Inf` case, I used a very large number

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-15 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 If I use a very big number, then the sparkR shell gives the following output: > collect(df1) a b c d 1 Inf 1 1 Inf So the overf

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 cc @felixcheung

[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17640 I will add some bound checks and error handling.

[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-14 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17640 [SPARK-17608][SPARKR]:Long type has incorrect serialization/deserialization ## What changes were proposed in this pull request? `bigint` is not supported in schema and the serialization

[GitHub] spark issue #17611: [SPARK-20298][SparkR][MINOR] fixed spelling mistake "cha...

2017-04-11 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17611 LGTM

[GitHub] spark issue #17611: [SPARK-20298][SparkR][MINOR] fixed spelling mistake "cha...

2017-04-11 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17611 Jenkins, test this please.

[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...

2017-03-30 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/17478 @sethah Thanks for your reply! Your suggestion makes sense to me. My intention was to close the JIRA with a simple fix. How about we add a test for these checks and close the original JIRA? Or you
