[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-11-11 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13690 @shivaram I will update this today. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request #15365: [SPARK-17157][SPARKR]: Add multiclass logistic re...

2016-10-07 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/15365#discussion_r82497112 --- Diff: R/pkg/R/mllib.R --- @@ -117,7 +124,7 @@ NULL #' @export #' @seealso \link{spark.glm}, \link{glm}, #' @seealso

[GitHub] spark pull request #15365: [SPARK-17157][SPARKR]: Add multiclass logistic re...

2016-10-07 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/15365#discussion_r82497107 --- Diff: R/pkg/R/mllib.R --- @@ -105,7 +112,7 @@ setClass("KSTest", representation(jobj = "jobj")) #' @seealso \l

[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-10-06 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13690 @felixcheung @shivaram @junyangq It's ready for review.

[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...

2016-10-06 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/15365 @wangmiao1981 I saw a similar error on Jenkins, and I have the same question as you. Regarding `e1071`, I think we only need to install that package locally.

[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-09-29 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13690 @felixcheung I'll update the changes in the next two days.

[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-08-23 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13690 @junyangq I have started working on the random forest wrapper. I will open a PR as soon as possible. Also, I'll update this PR very soon. Thanks.

[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-08-13 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13690 Yes, sure. But I'm on vacation this week. I will keep working on this and update as soon as possible when I get back next week. On Thu, Aug 11, 2016, 19:46 Felix Cheung

[GitHub] spark pull request #13922: [SPARK-11938][PySpark] Expose numFeatures in all ...

2016-07-18 Thread vectorijk
Github user vectorijk closed the pull request at: https://github.com/apache/spark/pull/13922

[GitHub] spark issue #13922: [SPARK-11938][PySpark] Expose numFeatures in all ML Pred...

2016-07-18 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13922 @MLnick Thanks! I will close this PR.

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-07-14 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/14136 We also need to remove the line here: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala#L240.

[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...

2016-07-14 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r70834638 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala --- @@ -475,4 +475,20 @@ class DataFrameAggregateSuite extends

[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...

2016-07-13 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/14182 @wangmiao1981 TODO: `summary()`?

[GitHub] spark pull request #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression w...

2016-07-13 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/14182#discussion_r70682078 --- Diff: R/pkg/R/mllib.R --- @@ -53,6 +53,13 @@ setClass("AFTSurvivalRegressionModel", representation(jobj = "jobj")) #'

[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...

2016-07-11 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r70294379 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala --- @@ -0,0 +1,148 @@ +/* + * Licensed

[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...

2016-07-11 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r70285450 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala --- @@ -0,0 +1,148 @@ +/* + * Licensed

[GitHub] spark issue #13922: [SPARK-11938][PySpark] Expose numFeatures in all ML Pred...

2016-06-27 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13922 cc @jkbradley @yanboliang @Lewuathe

[GitHub] spark issue #9936: [SPARK-11938][ML] Expose numFeatures in all ML Prediction...

2016-06-27 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/9936 @Lewuathe Thanks! I opened a new PR #13922 here for this issue. Would you mind closing this PR later?

[GitHub] spark pull request #13922: [SPARK-11938][PySpark] Expose numFeatures in all ...

2016-06-27 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/13922 [SPARK-11938][PySpark] Expose numFeatures in all ML PredictionModel for PySpark ## What changes were proposed in this pull request? JIRA: [https://issues.apache.org/jira/browse/SPARK-11938

[GitHub] spark issue #13248: [SPARK-15194] [ML] Add Python ML API for MultivariateGau...

2016-06-24 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13248 ping @praveendareddy21 Is this still active? If not, I could help with this.

[GitHub] spark issue #9936: [SPARK-11938][ML] Expose numFeatures in all ML Prediction...

2016-06-24 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/9936 ping @Lewuathe Is this still active? If not, I could help with this.

[GitHub] spark pull request #13820: [SPARK-16107] [R] group glm methods in documentat...

2016-06-21 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13820#discussion_r67968653 --- Diff: R/pkg/R/mllib.R --- @@ -99,10 +114,8 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"

[GitHub] spark pull request #13820: [SPARK-16107] [R] group glm methods in documentat...

2016-06-21 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13820#discussion_r67968505 --- Diff: R/pkg/R/mllib.R --- @@ -99,10 +114,8 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-21 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67945982 --- Diff: docs/sparkr.md --- @@ -262,6 +262,83 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-20 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67782797 --- Diff: docs/sparkr.md --- @@ -262,6 +262,79 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-20 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13660 Jenkins test this again.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-19 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK I am not quite sure. Maybe you could create a new JIRA for gapply's programming guide.

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-18 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13660 @jkbradley @shivaram @felixcheung addressed comments.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-17 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK Cool~ I think it is better to open a separate PR to track the `gapply` programming guide.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-17 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK How do you want to include the programming guide for `gapply`: in a separate PR or in #13660?

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-15 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r67259794 --- Diff: R/pkg/R/mllib.R --- @@ -402,6 +406,8 @@ setMethod("spark.naiveBayes", signature(data = "SparkDataFrame", formula = "for

[GitHub] spark pull request #13690: [SPARK-15767][R][ML][WIP] Decision Tree Regressio...

2016-06-15 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/13690 [SPARK-15767][R][ML][WIP] Decision Tree Regression wrapper in SparkR ## What changes were proposed in this pull request? Implement a wrapper in SparkR to support decision tree regression. R&#

[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-06-15 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13394 Thanks! @jkbradley @felixcheung @shivaram Sure. How about using the title `Predicted values based on model object` instead of `predict` (like [https://stat.ethz.ch/R-manual/R-devel/library/stats

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67020817 --- Diff: docs/sparkr.md --- @@ -262,6 +262,67 @@ head(df) {% endhighlight %} +### Applying User-defined Function

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13660 cc @jkbradley @shivaram

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/13660 [SPARK-15672][R][DOC] R programming guide update ## What changes were proposed in this pull request? Guide for - UDFs with dapply, dapplyCollect - spark.lapply for running parallel R

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-13 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66917105 --- Diff: R/pkg/R/mllib.R --- @@ -197,7 +201,7 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) { invisibl

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-12 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66719273 --- Diff: R/pkg/R/DataFrame.R --- @@ -851,6 +849,8 @@ setMethod("nrow", count(x) }) +#' ncol

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-12 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66719192 --- Diff: R/pkg/R/DataFrame.R --- @@ -2766,18 +2780,21 @@ setMethod("histogram", return(histStats) })

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-12 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66718939 --- Diff: R/pkg/R/DataFrame.R --- @@ -2766,18 +2780,21 @@ setMethod("histogram", return(histStats) })

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-12 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66718189 --- Diff: R/pkg/R/mllib.R --- @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) { invisibl

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-06 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65849144 --- Diff: R/pkg/R/DataFrame.R --- @@ -628,8 +628,6 @@ setMethod("repartition", #' #' @param x A SparkDataFrame #'

[GitHub] spark issue #13488: [MINOR][R][DOC] Fix R documentation generation instructi...

2016-06-05 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13488 Thanks On Sun, Jun 5, 2016, 13:05 asfgit wrote: > Closed #13488 <https://github.com/apache/spark/pull/13488> via 8a91105 > <https://github.com/apac

[GitHub] spark issue #13248: [SPARK-15194] [ML] Add Python ML API for MultivariateGau...

2016-06-04 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13248 @praveendareddy21 For generating documentation for this API correctly, you could include this in `spark/python/docs/pyspark.ml.rst` ``` pyspark.ml.stat module

[GitHub] spark pull request #13248: [SPARK-15194] [ML] Add Python ML API for Multivar...

2016-06-04 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13248#discussion_r65798896 --- Diff: python/pyspark/ml/stat/distribution.py --- @@ -0,0 +1,267 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request #13248: [SPARK-15194] [ML] Add Python ML API for Multivar...

2016-06-04 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13248#discussion_r65798890 --- Diff: python/pyspark/ml/stat/distribution.py --- @@ -0,0 +1,267 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-03 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65691008 --- Diff: R/pkg/R/DataFrame.R --- @@ -1069,6 +1079,8 @@ setMethod("first", #' #' @param x A SparkDataFrame #' +

[GitHub] spark issue #13488: [MINOR][R][DOC] Fix R documentation generation instructi...

2016-06-02 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/13488 cc @jkbradley @shivaram @felixcheung You can see the result [here](https://github.com/vectorijk/spark/tree/R-Readme/R)

[GitHub] spark pull request #13488: [MINOR][R][DOC] Fix R documentation generation in...

2016-06-02 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/13488 [MINOR][R][DOC] Fix R documentation generation instruction. ## What changes were proposed in this pull request? changes in R/README.md - Make step of generating SparkR document more

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-01 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65418448 --- Diff: R/pkg/R/DataFrame.R --- @@ -2514,7 +2529,9 @@ setMethod("attach", #' environment. Then, the given expression is evalua

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65284357 --- Diff: R/pkg/R/DataFrame.R --- @@ -2514,7 +2529,9 @@ setMethod("attach", #' environment. Then, the given expression is evalua

[GitHub] spark pull request: [SPARK-15177] [SparkR] [ML] SparkR 2.0 QA: New R APIs an...

2016-05-31 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13023 As suggested by this [comment](https://github.com/apache/spark/pull/13394#issuecomment-222560187), I was wondering if we also need to update the docs for k-means and naive bayes in [http

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13394 @shivaram For updating the programming guide, I'd love to do this in a separate PR.

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65163283 --- Diff: R/pkg/R/stats.R --- @@ -19,12 +19,11 @@ setOldClass("jobj") -#' crosstab -#' #' Compu

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r65162041 --- Diff: R/pkg/R/DataFrame.R --- @@ -1069,7 +1080,10 @@ setMethod("first", #' #' @param x A SparkDataFrame #'

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

2016-05-29 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13394#issuecomment-222385824 Jenkins test this please

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

2016-05-29 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13394#issuecomment-222378580 Jenkins test this again please.

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

2016-05-29 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13394#issuecomment-222354451 cc @felixcheung @shivaram @sun-rui

[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R API...

2016-05-29 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/13394 [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API docs for non-MLib changes ## What changes were proposed in this pull request? R Docs changes include typos, format, layout

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221486634 > @shivaram I will make the change with R version check. @wangmiao1981 FYI, I switched to R 3.1.3 on Mac before, and it seems to fail too. Do you mind

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221485421 We may also investigate why the unit tests run differently on Jenkins and locally.

[GitHub] spark pull request: [SPARK-10592] [ML] [PySpark] Deprecate weights...

2016-05-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/9311#issuecomment-221444959 @bharath-official I don't think it's really necessary because it was changed in Spark 2.0

[GitHub] spark pull request: [SPARK-10592] [ML] [PySpark] Deprecate weights...

2016-05-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/9311#issuecomment-221440730 @bharath-official You're right. Aha. Like you said, it should work fine if this warning message only shows while calling the model.

[GitHub] spark pull request: [SPARK-10592] [ML] [PySpark] Deprecate weights...

2016-05-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/9311#issuecomment-221437455 @bharath-official Try to use `coefficients` instead. This is only a warning. Did you get an error message?

[GitHub] spark pull request: [SPARK-928][CORE] Add support for Unsafe-based...

2016-05-04 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12913#issuecomment-217031852 @techaddict According to [this](https://github.com/EsotericSoftware/kryo#-disclaimer-about-using-unsafe-based-io-), I am just wondering what will happen when Unsafe

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61508511 --- Diff: python/pyspark/ml/tests.py --- @@ -586,10 +589,13 @@ def test_fit_maximize_metric(self): tvsModel = tvs.fit(dataset

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61508260 --- Diff: python/pyspark/ml/tests.py --- @@ -616,6 +622,7 @@ def test_save_load(self): tvsModel.save(tvsModelPath) loadedModel

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61507716 --- Diff: python/pyspark/ml/tuning.py --- @@ -613,7 +615,9 @@ def copy(self, extra=None): """ i

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12464#discussion_r61491169 --- Diff: python/pyspark/ml/tests.py --- @@ -461,6 +461,31 @@ def _fit(self, dataset): class CrossValidatorTests(PySparkTestCase

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-27 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12464#issuecomment-215067357 I also notice that `validationMetrics` in `TrainValidationSplitModel` should also be supported in Python. Should we support that after this PR? --- If your

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-27 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12464#issuecomment-215065890 @jkbradley 25959e5 this commit is trying to - update `metrics` in `CrossValidator` to float list (like [0.0] * number) . - use

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-27 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12464#discussion_r61247126 --- Diff: python/pyspark/ml/tests.py --- @@ -534,6 +534,8 @@ def test_save_load(self): cvModel.save(cvModelPath) loadedModel

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-19 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12464#discussion_r60221706 --- Diff: python/pyspark/ml/tuning.py --- @@ -367,7 +368,9 @@ def copy(self, extra=None): """ i

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-17 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12464#issuecomment-211210916 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-17 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12464#issuecomment-211205576 cc @feynmanliang @jkbradley @mengxr

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-17 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/12464 [SPARK-12810][PySpark] PySpark CrossValidatorModel should support avgMetrics ## What changes were proposed in this pull request? support avgMetrics in CrossValidatorModel with Python

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-11 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11468#issuecomment-208559614 @jkbradley I have addressed all the comments. Could you review this again?

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-08 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11468#issuecomment-207689752 @yanboliang Thanks! I have addressed your comments.

[GitHub] spark pull request: [SPARK-14373] [PySpark] PySpark RandomForestCl...

2016-04-08 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12238#issuecomment-207639513 Thanks! @holdenk @jkbradley @yanboliang

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-08 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/11468#discussion_r59099605 --- Diff: python/pyspark/ml/regression.py --- @@ -934,6 +935,146 @@ def predict(self, features): return self._call_java("predict"

[GitHub] spark pull request: [SPARK-14373] [PySpark] PySpark RandomForestCl...

2016-04-07 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/12238#issuecomment-206906967 cc @jkbradley @mengxr @yanboliang It is ready for review.

[GitHub] spark pull request: [SPARK-14373] [PySpark] PySpark RandomForestCl...

2016-04-07 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/12238 [SPARK-14373] [PySpark] PySpark RandomForestClassifier, Regressor support export/import ## What changes were proposed in this pull request? supporting `RandomForest{Classifier, Regressor

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/11468#discussion_r58511272 --- Diff: python/pyspark/ml/regression.py --- @@ -934,6 +935,146 @@ def predict(self, features): return self._call_java("predict"

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11468#issuecomment-205730049 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-12461] [SQL] Add ExpressionDescription ...

2016-04-01 Thread vectorijk
Github user vectorijk closed the pull request at: https://github.com/apache/spark/pull/10489

[GitHub] spark pull request: [SPARK-12461] [SQL] Add ExpressionDescription ...

2016-04-01 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/10489#issuecomment-204528226 I will close this PR at this time. If needed, I might re-open this.

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-01 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11468#issuecomment-204452107 Sorry about the late response. Yes, I will catch this today. On Fri, Apr 1, 2016, 09:02 Yanbo Liang wrote: > @vectorijk <https://github.com/vec

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-03-02 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/11468#discussion_r54834301 --- Diff: python/pyspark/ml/regression.py --- @@ -857,6 +858,146 @@ def predict(self, features): return self._call_java("predict"

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-03-02 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11468#issuecomment-191228337 cc @mengxr @yanboliang

[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-03-02 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/11468 [SPARK-13597][PySpark][ML] Python API for GeneralizedLinearRegression ## What changes were proposed in this pull request? Python API for GeneralizedLinearRegression JIRA: https

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-24 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11321#issuecomment-188658913 @mengxr, Thanks for replying! Definitely, I will post a rough draft proposal on JIRA later. On Wed, Feb 24, 2016 at 11:31 PM, Xiangrui Meng

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-24 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/11321#discussion_r53924161 --- Diff: python/pyspark/mllib/fpm.py --- @@ -40,6 +41,11 @@ class FPGrowthModel(JavaModelWrapper): >>> model = FPGrowth.train(rd

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-23 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11321#issuecomment-188006126 @mengxr Thanks, I didn't notice that. Addressed the comment. Also, I took a look at 9ca79c1 and it only moved the temp-file cleanup code for doctest under ml dire

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-23 Thread vectorijk
GitHub user vectorijk opened a pull request: https://github.com/apache/spark/pull/11321 [SPARK-7106][MLlib][PySpark] Support model save/load in Python's FPGrowth ## What changes were proposed in this pull request? Python API supports model save/load in FPGrowth

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-23 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11321#issuecomment-187676196 cc @mengxr @yanboliang Could you take a look at this?

[GitHub] spark pull request: [SPARK-13037][ML][PySpark] PySpark ml.recommen...

2016-02-22 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/11044#issuecomment-187588613 _off-topic_ Hi Spark Devs! I was wondering if the Spark community would be interested in mentoring students for Google Summer of Code (GSoC) under Apache

[GitHub] spark pull request: [SPARK-12567][SQL] Add aes_{encrypt,decrypt} U...

2016-02-20 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/10527#issuecomment-186738840 Thanks so much for the suggestion! I will open a new PR for the update.

[GitHub] spark pull request: [SPARK-12567][SQL] Add aes_{encrypt,decrypt} U...

2016-02-19 Thread vectorijk
Github user vectorijk commented on the pull request: https://github.com/apache/spark/pull/10527#issuecomment-186523636 OK, sure. I will do that.

[GitHub] spark pull request: [SPARK-12567][SQL] Add aes_{encrypt,decrypt} U...

2016-02-19 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/10527#discussion_r53545904 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -1931,6 +1931,42 @@ object functions extends LegacyFunctions { new

[GitHub] spark pull request: [SPARK-12567][SQL] Add aes_{encrypt,decrypt} U...

2016-02-19 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/10527#discussion_r53442582 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MiscFunctionsSuite.scala --- @@ -132,4 +132,88 @@ class
