[GitHub] spark issue #21465: [SPARK-24333][ML][PYTHON]Add fit with validation set to ...

2018-12-07 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21465 @BryanCutler Thank you very much for your help! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #23256: [SPARK-24207][R] follow-up PR for SPARK-24207 to ...

2018-12-07 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/23256 [SPARK-24207][R] follow-up PR for SPARK-24207 to fix code style problems ## What changes were proposed in this pull request? follow-up PR for SPARK-24207 to fix code style problems You

[GitHub] spark pull request #21465: [SPARK-24333][ML][PYTHON]Add fit with validation ...

2018-12-07 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21465#discussion_r239904265 --- Diff: python/pyspark/ml/param/shared.py --- @@ -814,3 +814,25 @@ def getDistanceMeasure(self): """ return se

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-06 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r239626824 --- Diff: R/pkg/R/mllib_clustering.R --- @@ -610,3 +616,58 @@ setMethod("write.ml", signature(object = "LDAModel"

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-06 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r239626871 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,44 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark issue #21465: [SPARK-24333][ML][PYTHON]Add fit with validation set to ...

2018-12-05 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21465 @BryanCutler Thank you very much for your review! I will submit changes soon.

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-05 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r239250376 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,44 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark issue #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-05 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/23072 @dongjoon-hyun Thank you very much for your review. I will make the changes soon.

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-05 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r239250335 --- Diff: R/pkg/R/mllib_clustering.R --- @@ -610,3 +616,58 @@ setMethod("write.ml", signature(object = "LDAModel"

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-05 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r239238873 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,44 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark pull request #21465: [SPARK-24333][ML][PYTHON]Add fit with validation ...

2018-12-05 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21465#discussion_r239173098 --- Diff: python/pyspark/ml/classification.py --- @@ -1174,9 +1165,31 @@ def trees(self): return [DecisionTreeClassificationModel(m) for m

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-11-30 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r237966508 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/FPGrowthExample.scala --- @@ -64,4 +64,3 @@ object FPGrowthExample { spark.stop

[GitHub] spark pull request #23168: [SPARK-26207][doc]add PowerIterationClustering (P...

2018-11-29 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23168#discussion_r237661961 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,38 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark pull request #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-29 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23161#discussion_r237628719 --- Diff: R/pkg/R/DataFrame.R --- @@ -2732,13 +2732,24 @@ setMethod("union", dataFrame(unioned) })

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-11-28 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r237332601 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,44 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark issue #23168: [SPARK-26207][doc]add PowerIterationClustering (PIC) doc...

2018-11-28 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/23168 @srowen It's not in master yet. The PR is here https://github.com/apache/spark/pull/23072

[GitHub] spark issue #23168: [SPARK-26207][doc]add PowerIterationClustering (PIC) doc...

2018-11-28 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/23168 @felixcheung Could you please review?

[GitHub] spark pull request #23168: [SPARK-26207][doc]add PowerIterationClustering (P...

2018-11-28 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/23168 [SPARK-26207][doc]add PowerIterationClustering (PIC) doc in 2.4 branch ## What changes were proposed in this pull request? Add PIC doc in 2.4 ## How was this patch tested

[GitHub] spark pull request #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/23161 [SPARK-26189][R]Fix unionAll doc in SparkR ## What changes were proposed in this pull request? Fix unionAll doc in SparkR ## How was this patch tested? Manually ran

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-11-27 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r236787704 --- Diff: docs/ml-clustering.md --- @@ -265,3 +265,44 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details

[GitHub] spark pull request #23157: [SPARK-26185][PYTHON]add weightCol in python Mult...

2018-11-27 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/23157 [SPARK-26185][PYTHON]add weightCol in python MulticlassClassificationEvaluator ## What changes were proposed in this pull request? add weightCol for python version

[GitHub] spark pull request #21465: [SPARK-24333][ML][PYTHON]Add fit with validation ...

2018-11-21 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21465#discussion_r235488017 --- Diff: python/pyspark/ml/classification.py --- @@ -1176,8 +1176,8 @@ def trees(self): @inherit_doc class GBTClassifier(JavaEstimator

[GitHub] spark issue #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-11-19 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/23072 retest this please

[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...

2018-11-19 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20442 retest this please

[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-11-17 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/23072 [SPARK-19827][R]spark.ml R API for PIC ## What changes were proposed in this pull request? Add PowerIterationCluster (PIC) in R ## How was this patch tested? Add test case You

[GitHub] spark issue #22996: [SPARK-25997][ML]add Python example code for Power Itera...

2018-11-09 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22996 @holdenk Yes, it is. I will include the examples in ml-clustering.md.

[GitHub] spark pull request #22996: add Python example code for Power Iteration Clust...

2018-11-09 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22996 add Python example code for Power Iteration Clustering in spark.ml ## What changes were proposed in this pull request? Add python example for Power Iteration Clustering in spark.ml

[GitHub] spark issue #22788: [SPARK-25769][SQL]make UnresolvedAttribute.sql escape ne...

2018-11-06 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22788 @cloud-fan @dongjoon-hyun Because of the above test failures in ```ExpressionTypeCheckingSuite```, shall I revert to the previous change? ``` override def sql: String

[GitHub] spark issue #22788: [SPARK-25769][SQL]make UnresolvedAttribute.sql escape ne...

2018-11-02 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22788 I have a question regarding the test failure in ```ExpressionTypeCheckingSuite```. Most of the tests in this suite failed after I changed ```UnresolvedAttribute.sql = UnresolvedAttribute.name

[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-31 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r229583440 --- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out --- @@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1

[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r229479893 --- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out --- @@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1

[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r229354247 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...

2018-10-27 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22863 Thanks @felixcheung

[GitHub] spark pull request #22863: [SPARK-25859][ML]add scala/java/python example an...

2018-10-27 Thread huaxingao
Github user huaxingao closed the pull request at: https://github.com/apache/spark/pull/22863

[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...

2018-10-27 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22863 @felixcheung

[GitHub] spark pull request #22863: [SPARK-25859][ML]add scala/java/python example an...

2018-10-27 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22863 [SPARK-25859][ML]add scala/java/python example and doc for PrefixSpan ## What changes were proposed in this pull request? add scala/java/python example and doc for PrefixSpan in branch

[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-26 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21710 @felixcheung I am terribly sorry that I missed your comment about the ml doc and example for 2.4. Is there still time to merge into 2.4? I saw one of my PRs got merged into 2.4 last night. I can submit

[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...

2018-10-26 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22295 Thank you very much for your help! @holdenk @HyukjinKwon

[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...

2018-10-25 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22790#discussion_r228274470 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala --- @@ -109,7 +109,7 @@ class BisectingKMeansModel private

[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...

2018-10-23 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22790 I added a regression test in ```org.apache.spark.mllib.clustering.BisectingKMeansSuite``` I could add the following test in ml package. ``` test("SPARK-25793") {

[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...

2018-10-23 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22790 retest this please

[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...

2018-10-22 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22790#discussion_r227229331 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala --- @@ -126,7 +126,7 @@ object BisectingKMeansModel extends

[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-22 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r227152273 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2702,7 +2702,7 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #22793: [SPARK-25793][ML]Call SaveLoadV2_0.load for class...

2018-10-22 Thread huaxingao
Github user huaxingao closed the pull request at: https://github.com/apache/spark/pull/22793

[GitHub] spark issue #22793: [SPARK-25793][ML]Call SaveLoadV2_0.load for classNameV2_...

2018-10-22 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22793 @WeichenXu123 I created two PRs for this JIRA. I had trouble creating the first one, so I created another one. I will close this PR. Please use the other one. Thanks

[GitHub] spark pull request #22793: [SPARK-25793][ML]Call SaveLoadV2_0.load for class...

2018-10-21 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22793 [SPARK-25793][ML]Call SaveLoadV2_0.load for classNameV2_0 ## What changes were proposed in this pull request? The wrong version of load is called in BisectingKMeansModel.load

[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...

2018-10-21 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22790 [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_0 ## What changes were proposed in this pull request? The following code in BisectingKMeansModel.load calls the wrong version of load

[GitHub] spark pull request #22788: [SPARK-25769][SQL]change nested columns from `a.b...

2018-10-21 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r226872842 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala --- @@ -98,8 +98,18 @@ case class

[GitHub] spark pull request #22788: [SPARK-25769][SQL]change nested columns from `a.b...

2018-10-21 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22788 [SPARK-25769][SQL]change nested columns from `a.b` to `a`.`b` ## What changes were proposed in this pull request? Currently, ```$"a.b".expr.asInstanceOf[UnresolvedAttr

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-18 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r226178191 --- Diff: python/pyspark/sql/tests.py --- @@ -3863,6 +3863,145 @@ def test_jvm_default_session_already_set(self): spark.stop

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-18 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r226178127 --- Diff: python/pyspark/sql/tests.py --- @@ -3863,6 +3863,145 @@ def test_jvm_default_session_already_set(self): spark.stop

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-18 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r226178054 --- Diff: python/pyspark/sql/functions.py --- @@ -2713,6 +2713,25 @@ def from_csv(col, schema, options={}): return Column(jc

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-16 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r225667299 --- Diff: python/pyspark/sql/tests.py --- @@ -3654,6 +3654,109 @@ def test_jvm_default_session_already_set(self): spark.stop

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-16 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r225667174 --- Diff: python/pyspark/sql/functions.py --- @@ -2633,6 +2633,23 @@ def sequence(start, stop, step=None): _to_java_column(start

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-16 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r225666954 --- Diff: python/pyspark/sql/session.py --- @@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None): or SparkSession

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-09 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r223760003 --- Diff: examples/src/main/python/ml/prefixspan_example.py --- @@ -0,0 +1,48 @@ +# --- End diff -- @felixcheung I don't think the doc

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-05 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r223165392 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +255,20 @@ def newSession(self): """ return self.__class__

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-28 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r221394694 --- Diff: python/pyspark/sql/session.py --- @@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None): or SparkSession

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-27 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r221089916 --- Diff: python/pyspark/sql/session.py --- @@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None): or SparkSession

[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...

2018-09-27 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22295 I just saw this fix [SPARK-25525][SQL][PYSPARK] Do not update conf for existing SparkContext in SparkSession.getOrCreate. #22545 I will remove ```test_create_SparkContext_then_SparkSession

[GitHub] spark issue #22537: [SPARK-21291][R] add R partitionBy API in DataFrame

2018-09-25 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22537 Thanks! @HyukjinKwon @felixcheung

[GitHub] spark pull request #22537: [SPARK-21291][R] add R partitionBy API in DataFra...

2018-09-24 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22537 [SPARK-21291][R] add R partitionBy API in DataFrame ## What changes were proposed in this pull request? add R partitionBy API in write.df I didn't add bucketBy in write.df. The last

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-17 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r218237306 --- Diff: python/pyspark/sql/session.py --- @@ -231,6 +231,7 @@ def __init__(self, sparkContext, jsparkSession=None): or SparkSession

[GitHub] spark issue #21649: [SPARK-23648][R][SQL]Adds more types for hint in SparkR

2018-09-17 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21649 @felixcheung Thanks for your comments. I changed ```stopifnot```. At L3925 I could add ``` hintList <- list("hint2", "hint3", "hint4") h

[GitHub] spark pull request #21649: [SPARK-23648][R][SQL]Adds more types for hint in ...

2018-09-11 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21649#discussion_r216727458 --- Diff: R/pkg/R/DataFrame.R --- @@ -3939,7 +3929,15 @@ setMethod("hint", signature(x = "SparkDataFrame"

[GitHub] spark pull request #21649: [SPARK-23648][R][SQL]Adds more types for hint in ...

2018-09-10 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21649#discussion_r216413819 --- Diff: R/pkg/R/DataFrame.R --- @@ -3905,6 +3905,16 @@ setMethod("rollup", group

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-07 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r216115581 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +252,16 @@ def newSession(self): """ return self.__class__

[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-09-05 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21710 @felixcheung Are there any other things I need to change? If not, could this PR be merged in 2.4? Thanks

[GitHub] spark issue #21649: [SPARK-23648][R][SQL]Adds more types for hint in SparkR

2018-09-05 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21649 @felixcheung Are there any other things I need to change? If not, could this PR be merged in 2.4? Thanks

[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...

2018-09-04 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20442 Any more comments? @MLnick @jkbradley

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-04 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r215022091 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +252,16 @@ def newSession(self): """ return self.__class__

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-04 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r215022059 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +252,16 @@ def newSession(self): """ return self.__class__

[GitHub] spark pull request #22228: [SPARK-25124][ML]VectorSizeHint setSize and getSi...

2018-09-04 Thread huaxingao
Github user huaxingao closed the pull request at: https://github.com/apache/spark/pull/22228

[GitHub] spark pull request #22291: [SPARK-25007][R]Add array_intersect/array_except/...

2018-08-31 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22291#discussion_r214472480 --- Diff: R/pkg/R/generics.R --- @@ -799,10 +807,18 @@ setGeneric("array_sort", function(x) { standardGeneric("array_sort") }

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-08-30 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22295 [SPARK-25255][PYTHON]Add getActiveSession to SparkSession in PySpark ## What changes were proposed in this pull request? add getActiveSession in session.py ## How

[GitHub] spark issue #22291: [SPARK-25007][R]Add array_intersect/array_except/array_u...

2018-08-30 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22291 @felixcheung @HyukjinKwon Sorry I couldn't figure out how to make the ```sequence``` work in the other PR. I will work on this one first

[GitHub] spark pull request #22291: [SPARK-25007][R]Add array_intersect/array_except/...

2018-08-30 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22291 [SPARK-25007][R]Add array_intersect/array_except/array_union/shuffle to SparkR ## What changes were proposed in this pull request? Add the R version of array_intersect/array_except

[GitHub] spark issue #22228: [SPARK-25124][ML]VectorSizeHint setSize and getSize don'...

2018-08-24 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22228 @jkbradley backport to 2.3.

[GitHub] spark pull request #22228: [SPARK-25124][ML]VectorSizeHint setSize and getSi...

2018-08-24 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22228 [SPARK-25124][ML]VectorSizeHint setSize and getSize don't return values backport to 2.3 ## What changes were proposed in this pull request? In feature.py, VectorSizeHint setSize and getSize

[GitHub] spark pull request #22136: [SPARK-25124][ML]VectorSizeHint setSize and getSi...

2018-08-22 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22136#discussion_r212088986 --- Diff: python/pyspark/ml/tests.py --- @@ -844,6 +844,28 @@ def test_string_indexer_from_labels(self): .select

[GitHub] spark pull request #22136: [SPARK-25124][ML]VectorSizeHint setSize and getSi...

2018-08-17 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/22136 [SPARK-25124][ML]VectorSizeHint setSize and getSize don't return values ## What changes were proposed in this pull request? In feature.py, VectorSizeHint setSize and getSize don't return

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-08-15 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21835#discussion_r210326849 --- Diff: R/pkg/R/functions.R --- @@ -3320,7 +3321,7 @@ setMethod("explode", #' @aliases sequence sequence,Column-method #' @note sequ

[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

2018-08-13 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21439 Sure. I will work on it. Thanks for letting me know. @viirya

[GitHub] spark pull request #21925: [SPARK-24973][PYTHON]Add numIter to Python Cluste...

2018-07-30 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/21925 [SPARK-24973][PYTHON]Add numIter to Python ClusteringSummary ## What changes were proposed in this pull request? Add numIter to Python version of ClusteringSummary ## How

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-07-26 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21835#discussion_r205609369 --- Diff: R/pkg/R/functions.R --- @@ -3320,7 +3321,7 @@ setMethod("explode", #' @aliases sequence sequence,Column-method #' @note sequ

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-07-26 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21835#discussion_r205538890 --- Diff: R/pkg/R/functions.R --- @@ -3320,7 +3321,7 @@ setMethod("explode", #' @aliases sequence sequence,Column-method #' @note sequ

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-07-24 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21835#discussion_r204861059 --- Diff: R/pkg/tests/fulltests/test_context.R --- @@ -21,10 +21,11 @@ test_that("Check masked functions", { # Check that we are not m

[GitHub] spark issue #21835: [SPARK-24779]Add sequence / map_concat / map_from_entrie...

2018-07-23 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21835 @HyukjinKwon @felixcheung Could you please review? Thank you very much in advance!

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-07-21 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/21835 [SPARK-24779]Add sequence / map_concat / map_from_entries / an option in months_between UDF to disable rounding-off ## What changes were proposed in this pull request? Add

[GitHub] spark issue #21820: [SPARK-24868][PYTHON]add sequence function in Python

2018-07-20 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21820 @HyukjinKwon Thanks!

[GitHub] spark pull request #21820: [SPARK-24868][PYTHON]add sequence function in Pyt...

2018-07-19 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21820#discussion_r203934505 --- Diff: python/pyspark/sql/functions.py --- @@ -2551,6 +2551,27 @@ def map_concat(*cols): return Column(jc) +@since(2.4

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-18 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r203526021 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/PrefixSpanWrapper.scala --- @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-18 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r203481597 --- Diff: R/pkg/R/generics.R --- @@ -1415,6 +1415,13 @@ setGeneric("spark.freqItemsets", function(object) { standardGeneric("spark.freqI

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-17 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r203229835 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/PrefixSpanWrapper.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-17 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r203229794 --- Diff: R/pkg/tests/fulltests/test_mllib_fpm.R --- @@ -82,4 +82,26 @@ test_that("spark.fpGrowth", { }) +test_that("s

[GitHub] spark pull request #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-17 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21710#discussion_r203229733 --- Diff: R/pkg/R/generics.R --- @@ -1415,6 +1415,13 @@ setGeneric("spark.freqItemsets", function(object) { standardGeneric("spark.freqI

[GitHub] spark issue #21645: [SPARK-24537][R]Add array_remove / array_zip / map_from_...

2018-07-12 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21645 Thanks! @HyukjinKwon @felixcheung

[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-07-11 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21710 @felixcheung Can I open a new jira for code example and documentation?

[GitHub] spark pull request #21645: [SPARK-24537][R]Add array_remove / array_zip / ma...

2018-07-11 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21645#discussion_r201827579 --- Diff: R/pkg/R/functions.R --- @@ -3071,6 +3085,19 @@ setMethod("array_position", column(jc) }) +#

[GitHub] spark issue #21678: [SPARK-23461][R]vignettes should include model predictio...

2018-07-11 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/21678 @felixcheung Thanks a lot for your help!
