[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18164 ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the

[GitHub] spark pull request #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep U...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18014#discussion_r119808220 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java --- @@ -386,6 +425,35 @@ public void putArray(int rowId,

[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18183#discussion_r119807397 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the

[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18183#discussion_r119807273 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the

[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18183#discussion_r119807101 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/Dictionary.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18164 @HyukjinKwon Yes, I think it's okay to add this.

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119805698 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119805293 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119804978 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18164 @ueshin, do you think it is okay to add this? I want to help review here if so.

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119800727 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119800139 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119798778 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18181 **[Test build #77673 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77673/testReport)** for PR 18181 at commit

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/18181 retest this please

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/18181 Unfortunately, rolling back parquet-mr to 1.8.1 brings back [PARQUET-389][1], which breaks multiple test cases involving schema evolution (add a new column to a Parquet table and filter on that

[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18183#discussion_r119796465 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the

[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18164#discussion_r119791032 --- Diff: python/pyspark/sql/tests.py --- @@ -1697,40 +1697,56 @@ def test_fillna(self): schema = StructType([

[GitHub] spark issue #18130: [Web UI] Remove no need loop in JobProgressListener

2017-06-02 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18130 There's a [JIRA](https://issues.apache.org/jira/browse/SPARK-20650) planning to remove this `JobProgressListener`, so I'd suggest not changing this deprecated code unnecessarily.

[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18185 **[Test build #77672 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77672/testReport)** for PR 18185 at commit

[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18185 Jenkins, retest this please.

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18184 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77670/ Test FAILed. ---

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18184 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77668/ Test FAILed. ---

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18181 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77669/ Test FAILed. ---

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18184 Merged build finished. Test FAILed.

[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18185 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77671/ Test FAILed. ---

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18181 Merged build finished. Test FAILed.

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18184 Merged build finished. Test FAILed.

[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18185 Merged build finished. Test FAILed.

[GitHub] spark issue #18183: [SPARK-20961][SQL] generalize the dictionary in ColumnVe...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18183 Merged build finished. Test PASSed.

[GitHub] spark issue #18183: [SPARK-20961][SQL] generalize the dictionary in ColumnVe...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18183 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77667/ Test PASSed. ---

[GitHub] spark issue #18183: [SPARK-20961][SQL] generalize the dictionary in ColumnVe...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18183 **[Test build #77667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77667/testReport)** for PR 18183 at commit

[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18185 **[Test build #77671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77671/testReport)** for PR 18185 at commit

[GitHub] spark pull request #18185: [SPARK-20962][SQL] Support subquery column aliase...

2017-06-02 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/18185 [SPARK-20962][SQL] Support subquery column aliases in FROM clause ## What changes were proposed in this pull request? This pr added parsing rules to support subquery column aliases in FROM

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18184 **[Test build #77670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77670/testReport)** for PR 18184 at commit

[GitHub] spark issue #18174: [SPARK-20950][CORE]Improve Serializerbuffersize configur...

2017-06-02 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18174 I don't think that addresses my question. When would you set this separately?

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18181 **[Test build #77669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77669/testReport)** for PR 18181 at commit

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18184 **[Test build #77668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77668/testReport)** for PR 18184 at commit

[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18184 cc @cloud-fan @sameeragarwal @ueshin

[GitHub] spark issue #18070: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-06-02 Thread liyichao
Github user liyichao commented on the issue: https://github.com/apache/spark/pull/18070 How about letting TaskCommitDenied and TaskKilled extend the same trait (for example, TaskKilledReason)? This way, when accounting for metrics, TaskCommitDenied and TaskKilled are all contributing to

[GitHub] spark pull request #18184: [MINOR] [SQL] Update the description of spark.sql...

2017-06-02 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/18184 [MINOR] [SQL] Update the description of spark.sql.files.ignoreCorruptFiles ### What changes were proposed in this pull request? When the file does not exist, we will issue the error

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/18181 @viirya Thanks for reminding! I'm reverting that one.

[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/18181 @dongjoon-hyun I already reverted PR #16751 manually but forgot to mention it in the PR description.

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119788377 --- Diff: R/pkg/inst/tests/testthat/test_mllib_classification.R --- @@ -225,6 +225,32 @@ test_that("spark.logit", { model2 <- spark.logit(df2,

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119787497 --- Diff: R/pkg/R/mllib_classification.R --- @@ -239,21 +253,57 @@ function(object, path, overwrite = FALSE) { setMethod("spark.logit",

[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-02 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18128#discussion_r119788169 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/LogisticRegressionWrapper.scala --- @@ -97,7 +97,15 @@ private[r] object LogisticRegressionWrapper

[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18148 Merged build finished. Test PASSed.

[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77666/ Test PASSed. ---

[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18148 **[Test build #77666 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77666/testReport)** for PR 18148 at commit

[GitHub] spark issue #18118: SPARK-20199 : Provided featureSubsetStrategy to GBTClass...

2017-06-02 Thread pralabhkumar
Github user pralabhkumar commented on the issue: https://github.com/apache/spark/pull/18118 12d83aa is successful. Please review the pull request. @MLnick @sethah @mpjlu @srowen

[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns

2017-06-02 Thread catlain
Github user catlain commented on the issue: https://github.com/apache/spark/pull/14783 I still have this issue when the input data is an array column whose vectors have different lengths, like: ``` test1 key value 1 4dda7d68a202e9e3
