[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-22 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123426656 --- Diff: R/pkg/R/functions.R --- @@ -2414,20 +2396,23 @@ setMethod("from_json", signature(x = "Column",

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-06-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @HyukjinKwon Great catch. Fixed all issues you pointed out. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request #18371: [SPARK-20889][SparkR] Grouped documentation for M...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18371#discussion_r123425200 --- Diff: R/pkg/R/functions.R --- @@ -34,6 +34,30 @@ NULL #' df <- createDataFrame(cbind(model = rownames(mtcars), mtcars))} N

[GitHub] spark issue #18371: [SPARK-20889][SparkR] Grouped documentation for MATH col...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18371 Made another commit that addresses your comments.

[GitHub] spark pull request #18371: [SPARK-20889][SparkR] Grouped documentation for M...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18371#discussion_r123425179 --- Diff: R/pkg/R/functions.R --- @@ -1405,18 +1309,12 @@ setMethod("sha1", column(jc) })

[GitHub] spark issue #18366: [SPARK-20889][SparkR] Grouped documentation for STRING c...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18366 jenkins, retest this please

[GitHub] spark issue #18366: [SPARK-20889][SparkR] Grouped documentation for STRING c...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18366 Jenkins, retest this please

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung Any idea what this message means? `This patch adds the following public classes (experimental): #' @Param x For class`

[GitHub] spark issue #18371: [SPARK-20889][SparkR] Grouped documentation for MATH col...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18371 jenkins, retest this please

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328728 --- Diff: R/pkg/R/functions.R --- @@ -2774,27 +2724,16 @@ setMethod("format_string", signature(format = "characte

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328753 --- Diff: R/pkg/R/functions.R --- @@ -2774,27 +2724,16 @@ setMethod("format_string", signature(format = "characte

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328685 --- Diff: R/pkg/R/functions.R --- @@ -2458,111 +2441,78 @@ setMethod("instr", signature(y = "Column", x = "character"

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328197 --- Diff: R/pkg/R/functions.R --- @@ -34,6 +34,58 @@ NULL #' df <- createDataFrame(cbind(model = rownames(mtcars), mtcars))} N

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328228 --- Diff: R/pkg/R/functions.R --- @@ -546,18 +598,20 @@ setMethod("hash", column(jc) }) -#'

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123328162 --- Diff: R/pkg/R/functions.R --- @@ -34,6 +34,58 @@ NULL #' df <- createDataFrame(cbind(model = rownames(mtcars), mtcars))} N

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung Thanks so much for the review and comments. Super helpful! I fixed all the issues you have pointed out in the new commit.

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123327948 --- Diff: R/pkg/R/functions.R --- @@ -2348,26 +2336,18 @@ setMethod("n", signature(x = "Column"),

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r123327963 --- Diff: R/pkg/R/functions.R --- @@ -1801,29 +1819,18 @@ setMethod("to_json", signature(x = "Column"),

[GitHub] spark issue #18367: [SQL][Doc] Fix documentation of lpad

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18367 OK. Updated the doc as suggested. Thanks.

[GitHub] spark issue #18366: [SPARK-20889][SparkR] Grouped documentation for STRING c...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18366 jenkins, retest this please

[GitHub] spark issue #18371: [SPARK-20889][SparkR] Grouped documentation for MATH col...

2017-06-21 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18371 jenkins, retest this please

[GitHub] spark issue #18371: [SPARK-20889][SparkR] Grouped documentation for MATH col...

2017-06-20 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18371 @felixcheung @HyukjinKwon This one is also fairly straightforward. See screenshots below. ![image](https://user-images.githubusercontent.com/11082368/27358300-d4980cbe-55ca-11e7

[GitHub] spark pull request #18371: [SPARK-20889][SparkR] Grouped documentation for M...

2017-06-20 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18371 [SPARK-20889][SparkR] Grouped documentation for MATH column methods ## What changes were proposed in this pull request? Grouped documentation for math column methods. You can merge

[GitHub] spark issue #18367: [SQL][Doc] Fix documentation of lpad

2017-06-20 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18367 jenkins, retest this please

[GitHub] spark pull request #18367: [SQL][Doc] Fix documentation of lpad

2017-06-20 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18367 [SQL][Doc] Fix documentation of lpad ## What changes were proposed in this pull request? Fix incomplete documentation for `lpad`. You can merge this pull request into a Git repository

[GitHub] spark issue #18366: [SPARK-20889][SparkR] Grouped documentation for STRING c...

2017-06-20 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18366 @felixcheung @HyukjinKwon This one is pretty straightforward. See the screenshot below. ![image](https://user-images.githubusercontent.com/11082368/27346356-c80b4f02-55a1-11e7

[GitHub] spark pull request #18366: [SPARK-20889][SparkR] Grouped documentation for S...

2017-06-20 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18366 [SPARK-20889][SparkR] Grouped documentation for STRING column methods ## What changes were proposed in this pull request? Grouped documentation for string column methods. You can

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-06-20 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 Oh, great. Did that, and the checks pass now.

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 For the `column_datetime_diff_functions`: ![image](https://user-images.githubusercontent.com/11082368/27315654-9ba01c08-552f-11e7-973e-f8351cb50aae.png) ![image](https://user

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 For the date time functions, I create two groups: one for arithmetic functions that work with two columns `column_datetime_diff_functions`, and the other for functions that work with only one

[GitHub] spark pull request #18140: [SPARK-20917][ML][SparkR] SparkR supports string ...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang closed the pull request at: https://github.com/apache/spark/pull/18140

[GitHub] spark pull request #18140: [SPARK-20917][ML][SparkR] SparkR supports string ...

2017-06-19 Thread actuaryzhang
GitHub user actuaryzhang reopened a pull request: https://github.com/apache/spark/pull/18140 [SPARK-20917][ML][SparkR] SparkR supports string encoding consistent with R ## What changes were proposed in this pull request? Add `stringIndexerOrderType` to `spark.glm

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 How do I do that?

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 OK. Updated the doc for the cov method for SparkDataFrame.

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-06-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 Thanks for the comments. Fixed them all in the new commit.

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 This is what the doc for column_aggregate_functions looks like (only a snapshot of the main parts): ![image](https://user-images.githubusercontent.com/11082368/27269174-85df12fa-5469

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122617531 --- Diff: R/pkg/R/stats.R --- @@ -52,22 +52,17 @@ setMethod("crosstab", collect(dat

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122617405 --- Diff: R/pkg/R/stats.R --- @@ -52,22 +52,17 @@ setMethod("crosstab", collect(dat

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122616787 --- Diff: R/pkg/R/stats.R --- @@ -52,22 +52,17 @@ setMethod("crosstab", collect(dat

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122609690 --- Diff: R/pkg/R/functions.R --- @@ -361,10 +361,13 @@ setMethod("column", #' #' @rdname corr #' @name corr -#' @f

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 @felixcheung It's up to date now. Any additional comments on this one?

[GitHub] spark issue #18291: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-06-18 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18291 @felixcheung Anything else needed for this PR?

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-17 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @HyukjinKwon Thanks for catching this. They were incorrectly labeled as math functions instead of aggregate functions in SparkR. And that's why I did not change them. New commit fixed

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-16 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @felixcheung Could you take another look and let me know if there is anything else needed?

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @felixcheung Your comments are all addressed now. Please let me know if there is anything else needed.

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122358689 --- Diff: R/pkg/R/generics.R --- @@ -919,10 +920,9 @@ setGeneric("array_contains", function(x, value) { standardGeneric("array_contain

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122356625 --- Diff: R/pkg/R/generics.R --- @@ -1403,20 +1416,25 @@ setGeneric("unix_timestamp", function(x, format) { standardGeneric("

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122322352 --- Diff: R/pkg/R/functions.R --- @@ -2254,18 +2198,12 @@ setMethod("approxCountDistinct",

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122318874 --- Diff: R/pkg/R/functions.R --- @@ -85,17 +100,20 @@ setMethod("acos", column(jc) }) -

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r122312290 --- Diff: R/pkg/R/functions.R --- @@ -85,17 +100,20 @@ setMethod("acos", column(jc) }) -

[GitHub] spark pull request #18291: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18291#discussion_r122298526 --- Diff: R/pkg/NAMESPACE --- @@ -357,6 +357,7 @@ exportMethods("%<=>%", "to_utc_timestamp",

[GitHub] spark issue #18291: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18291 Added your suggested change. Thanks!

[GitHub] spark pull request #18291: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-06-15 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18291#discussion_r122270206 --- Diff: R/pkg/NAMESPACE --- @@ -357,6 +357,7 @@ exportMethods("%<=>%", "to_utc_timestamp",

[GitHub] spark issue #18291: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-06-13 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18291 jenkins, retest this please

[GitHub] spark issue #18291: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-06-13 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18291 @felixcheung @zero323

[GitHub] spark issue #18116: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-06-13 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18116 Sorry, I messed up git. Closing this and reopening in another PR.

[GitHub] spark pull request #18291: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-06-13 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18291 [SPARK-20892][SparkR] Add SQL trunc function to SparkR ## What changes were proposed in this pull request? Add SQL trunc function ## How was this patch tested? standard

[GitHub] spark pull request #18116: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-06-13 Thread actuaryzhang
Github user actuaryzhang closed the pull request at: https://github.com/apache/spark/pull/18116

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-06-01 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @HyukjinKwon Thanks very much for the review. The new commit fixes the issues you pointed out.

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-01 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r119538499 --- Diff: R/pkg/R/functions.R --- @@ -1630,18 +1609,12 @@ setMethod("sqrt", column(jc) })

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-06-01 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r119538471 --- Diff: R/pkg/R/functions.R --- @@ -1081,19 +1098,12 @@ setMethod("md5", column(jc) })

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-05-31 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 Thanks for the update. Look forward to your feedback.

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-05-31 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 @felixcheung Yes, the first one is the default.

[GitHub] spark pull request #18140: [SPARK-20917][ML][SparkR] SparkR supports string ...

2017-05-31 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18140#discussion_r119438348 --- Diff: R/pkg/R/mllib_regression.R --- @@ -110,7 +125,8 @@ setClass("IsotonicRegressionModel", representation(jobj = "jobj"))

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-05-31 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 Simple example to illustrate: ``` > df <- createDataFrame(as.data.frame(Titanic, stringsAsFactors = FALSE)) > rModel <- stats::glm(Freq ~ Sex + Age, family = "

[GitHub] spark pull request #18140: [SPARK-20917][ML][SparkR] SparkR supports string ...

2017-05-31 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18140#discussion_r119285081 --- Diff: R/pkg/inst/tests/testthat/test_mllib_regression.R --- @@ -379,6 +379,49 @@ test_that("glm save/load", { unlink

[GitHub] spark issue #18116: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-05-30 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18116 Thanks @zero323. Anything else needed for this one?

[GitHub] spark issue #18116: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-05-30 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18116 @dongjoon-hyun Thanks for pointing this out. Fixed now. I thought the `@export` tag will instruct roxygen to export this method automatically in the namespace. Or was this namespace file

[GitHub] spark pull request #18140: [SPARK-20917][ML][SparkR] SparkR supports string ...

2017-05-30 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18140#discussion_r119029879 --- Diff: R/pkg/R/mllib_regression.R --- @@ -110,7 +125,8 @@ setClass("IsotonicRegressionModel", representation(jobj = "jobj"))

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-05-30 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 Thanks for the comments. Addressed them in the new commit.

[GitHub] spark issue #18140: [SPARK-20917][ML][SparkR] SparkR supports string encodin...

2017-05-29 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18140 @felixcheung Please take a look. Thanks.

[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

2017-05-29 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18122 @yanboliang I have moved the tests to the test file. Please let me know if there is anything else needed. Thanks.

[GitHub] spark pull request #18140: Spark r formula

2017-05-29 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18140 Spark r formula ## What changes were proposed in this pull request? Add `stringIndexerOrderType` to `spark.glm` and `spark.survreg` to support string encoding that is consistent

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-05-27 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 Jenkins, retest this please

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung The new commit addresses your concern by splitting methods with two arguments into a separate doc.

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for D...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r118803557 --- Diff: R/pkg/R/functions.R --- @@ -2095,26 +2061,28 @@ setMethod("atan2", signature(y = "Column"),

[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18122#discussion_r118796569 --- Diff: python/pyspark/ml/feature.py --- @@ -3043,26 +3055,35 @@ class RFormula(JavaEstimator, HasFeaturesCol, HasLabelCol, JavaMLReadable, JavaM

[GitHub] spark pull request #18122: [SPARK-20899][PySpark] PySpark supports stringInd...

2017-05-26 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18122 [SPARK-20899][PySpark] PySpark supports stringIndexerOrderType in RFormula ## What changes were proposed in this pull request? PySpark supports stringIndexerOrderType in RFormula

[GitHub] spark issue #18122: [SPARK-20899][PySpark] PySpark supports stringIndexerOrd...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18122 @felixcheung @yanboliang @viirya

[GitHub] spark issue #18025: [SPARK-20889][SparkR] Grouped documentation for AGGREGAT...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 Jenkins, retest this please

[GitHub] spark pull request #18025: [SPARK-20889][SparkR] Grouped documentation for A...

2017-05-26 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18025#discussion_r118643715 --- Diff: R/pkg/R/generics.R --- @@ -1403,20 +1416,25 @@ setGeneric("unix_timestamp", function(x, format) { standardGeneric("

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for DATETIME...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung Thank you. This is a great suggestion. I will split it into two help files, which should make the doc much cleaner without changing the functions.

[GitHub] spark pull request #18116: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18116#discussion_r118634921 --- Diff: R/pkg/R/functions.R --- @@ -4015,3 +4015,29 @@ setMethod("input_file_name", signature("missing"),

[GitHub] spark pull request #18116: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18116#discussion_r118634873 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1404,6 +1404,8 @@ test_that("column functions", { c20 <- t

[GitHub] spark pull request #18116: [SPARK-20892][SparkR] Add SQL trunc function to S...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18116#discussion_r118612551 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1404,6 +1404,8 @@ test_that("column functions", { c20 <- t

[GitHub] spark pull request #18116: [SPARK-20892][SparkR] Add SQL trunc function

2017-05-25 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18116 [SPARK-20892][SparkR] Add SQL trunc function ## What changes were proposed in this pull request? Add SQL trunc function ## How was this patch tested? standard test You can

[GitHub] spark issue #18116: [SPARK-20892][SparkR] Add SQL trunc function to SparkR

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18116 @felixcheung @wangmiao1981

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for d...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r118605422 --- Diff: R/pkg/R/functions.R --- @@ -2476,24 +2430,27 @@ setMethod("from_json", signature(x = "Column",

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for d...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r118605244 --- Diff: R/pkg/R/functions.R --- @@ -2095,26 +2061,28 @@ setMethod("atan2", signature(y = "Column"),

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for datetime...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung Created this PR to update the doc for the date time methods, similar to #18114. About 27 date time methods are documented into one page. I'm attaching the snapshot

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for d...

2017-05-25 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18114 [SPARK-20889][SparkR] Grouped documentation for datetime column methods ## What changes were proposed in this pull request? Grouped documentation for datetime column methods. You

[GitHub] spark pull request #17864: [SPARK-20604][ML] Allow imputer to handle numeric...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/17864#discussion_r118600408 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -94,12 +94,13 @@ private[feature] trait ImputerParams extends Params

[GitHub] spark issue #17864: [SPARK-20604][ML] Allow imputer to handle numeric types

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17864 @MLnick Thanks much for your comments. Yes, I think always returning Double is consistent with Python and R and also other transformers in ML. Plus, as @hhbyyh mentioned, this makes

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18051 That makes sense!

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18051 @zero323 I really like your thoughts on the docs. As @felixcheung mentioned above, we are doing some cleaning in #18025, which will improve readability and fix the SeeAlso issue

[GitHub] spark issue #18025: [SPARK-20889][SparkR][WIP] Grouped documentation for agg...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 Opened a JIRA. We would need several PRs to fix all doc issues. Also, not sure why Jenkins failed as the error msg is not clear and all tests passed on my computer.

[GitHub] spark issue #18025: [WIP][SparkR] Grouped documentation for sql functions

2017-05-24 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @felixcheung All comments are addressed now and I think this is ready for review.

[GitHub] spark issue #18025: [WIP][SparkR] Grouped documentation for sql functions

2017-05-24 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 - New commit now resolves the Name issue. `@title` does not work, which is the header in the second line `\title{Aggregate functions for Column operations}`. The solution is to use `@name NULL

[GitHub] spark issue #18025: [WIP][SparkR] Grouped documentation for sql functions

2017-05-24 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 @felixcheung - The links to `stddev_samp` etc are already removed in the latest commit. - About collecting all the example into one, I think that'll work for this particular one

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-24 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17967 @felixcheung @yanboliang I'm fine with either the ASCII table or the HTML table. It's your call. Hope to get over this minor doc issue and get this PR in soon. I can update the doc later
