[GitHub] spark issue #17009: [SPARK-19674][SQL]Ignore non-existing driver accumulator...

2017-02-22 Thread carsonwang
Github user carsonwang commented on the issue: https://github.com/apache/spark/pull/17009 Thanks @cloud-fan . `driver accumulators don't belong to this execution` is more appropriate. I'll update the words. --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request #17009: [SPARK-19674][SQL]Ignore non-existing driver accu...

2017-02-22 Thread carsonwang
Github user carsonwang commented on a diff in the pull request: https://github.com/apache/spark/pull/17009#discussion_r102656311 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala --- @@ -147,6 +147,10 @@ class SQLListenerSuite extends

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/17036 lgtm

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102656121 --- Diff: python/pyspark/sql/readwriter.py --- @@ -193,8 +193,9 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17036 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73332/ Test FAILed. ---

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17036 **[Test build #73332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73332/testReport)** for PR 17036 at commit

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17036 Merged build finished. Test FAILed.

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17036 **[Test build #73332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73332/testReport)** for PR 17036 at commit

[GitHub] spark issue #17034: [SPARK-19704][ML] AFTSurvivalRegression should support n...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17034 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73330/ Test PASSed. ---

[GitHub] spark issue #17034: [SPARK-19704][ML] AFTSurvivalRegression should support n...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17034 **[Test build #73330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73330/testReport)** for PR 17034 at commit

[GitHub] spark issue #17034: [SPARK-19704][ML] AFTSurvivalRegression should support n...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17034 Merged build finished. Test PASSed.

[GitHub] spark pull request #17036: [SPARK-19706][pyspark] add Column.contains in pys...

2017-02-22 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/17036 [SPARK-19706][pyspark] add Column.contains in pyspark ## What changes were proposed in this pull request? To be consistent with the Scala API, we should also add `contains` to `Column`

[GitHub] spark issue #17036: [SPARK-19706][pyspark] add Column.contains in pyspark

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17036 cc @davies

[GitHub] spark issue #17035: [SPARK-19705][SQL] Preferred location supporting HDFS ca...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17035 Can one of the admins verify this patch?

[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73325/ Test PASSed. ---

[GitHub] spark issue #15821: [SPARK-13534][WIP][PySpark] Using Apache Arrow to increa...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15821 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73321/ Test FAILed. ---

[GitHub] spark issue #15821: [SPARK-13534][WIP][PySpark] Using Apache Arrow to increa...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15821 Merged build finished. Test FAILed.

[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17001 Merged build finished. Test PASSed.

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102654066 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +221,25 @@ private[csv] class

[GitHub] spark pull request #17035: [SPARK-19705][SQL] Preferred location supporting ...

2017-02-22 Thread tanejagagan
GitHub user tanejagagan opened a pull request: https://github.com/apache/spark/pull/17035 [SPARK-19705][SQL] Preferred location supporting HDFS cache for FileS… …canRDD Added support of HDFS cache using TaskLocation.inMemoryLocationTag NewHadoopRDD and HadoopRDD

[GitHub] spark issue #15821: [SPARK-13534][WIP][PySpark] Using Apache Arrow to increa...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15821 **[Test build #73321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73321/testReport)** for PR 15821 at commit

[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17001 **[Test build #73325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73325/testReport)** for PR 17001 at commit

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102653357 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +212,41 @@ private[csv] class

[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16928 **[Test build #73331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73331/testReport)** for PR 16928 at commit

[GitHub] spark issue #17009: [SPARK-19674][SQL]Ignore non-existing driver accumulator...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17009 The change looks reasonable

[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

2017-02-22 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17028 @HyukjinKwon @hvanhovell How about the latest fix?

[GitHub] spark pull request #17009: [SPARK-19674][SQL]Ignore non-existing driver accu...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17009#discussion_r102652094 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala --- @@ -147,6 +147,10 @@ class SQLListenerSuite extends

[GitHub] spark issue #17017: [SPARK-19682][SparkR] Issue warning (or error) when subs...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17017 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73329/ Test PASSed. ---

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r102650051 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -0,0 +1,256 @@ +/* + * Licensed to

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r102649816 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -0,0 +1,256 @@ +/* + * Licensed to

[GitHub] spark issue #17017: [SPARK-19682][SparkR] Issue warning (or error) when subs...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17017 Merged build finished. Test PASSed.

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17033 cc @vanzin

[GitHub] spark issue #17017: [SPARK-19682][SparkR] Issue warning (or error) when subs...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17017 **[Test build #73329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73329/testReport)** for PR 17017 at commit

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r102649654 --- Diff: python/test_support/sql/ages_newlines.csv --- @@ -0,0 +1,6 @@ +Joe,20,"Hi, +I am Jeo" +Tom,30,"My name is Tom" +Hyukjin,25,"I am

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r102651748 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -233,3 +236,28 @@ private[csv] class

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread windpiger
Github user windpiger commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r102650536 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #16923: [SPARK-19038][Hive][YARN] Correctly figure out keytab fi...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73326/ Test PASSed. ---

[GitHub] spark issue #16923: [SPARK-19038][Hive][YARN] Correctly figure out keytab fi...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16923 Merged build finished. Test PASSed.

[GitHub] spark issue #16923: [SPARK-19038][Hive][YARN] Correctly figure out keytab fi...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16923 **[Test build #73326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73326/testReport)** for PR 16923 at commit

[GitHub] spark issue #17034: [SPARK-19704][ML] AFTSurvivalRegression should support n...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17034 **[Test build #73330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73330/testReport)** for PR 17034 at commit

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r102648988 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-22 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 To a very limited extent. It can bring some useful information in IPython / Jupyter (maybe some other tools as well) but won't work with built-in `help` / `pydoc.help`. You can compare:

[GitHub] spark pull request #17034: [SPARK-19704][ML] AFTSurvivalRegression should su...

2017-02-22 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/17034 [SPARK-19704][ML] AFTSurvivalRegression should support numeric censorCol ## What changes were proposed in this pull request? make `AFTSurvivalRegression` support numeric censorCol ##

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16594 LGTM except one comment

[GitHub] spark issue #17017: [SPARK-19682][SparkR] Issue warning (or error) when subs...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17017 **[Test build #73329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73329/testReport)** for PR 17017 at commit

[GitHub] spark pull request #17017: [SPARK-19682][SparkR] Issue warning (or error) wh...

2017-02-22 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/17017#discussion_r102648326 --- Diff: R/pkg/R/DataFrame.R --- @@ -1776,6 +1780,10 @@ setMethod("[[", signature(x = "SparkDataFrame", i = "numericOrcharacter"), #' @note [[<-

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102648290 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +221,25 @@ private[csv] class

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102648114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102648089 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +221,25 @@ private[csv] class

[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16928 **[Test build #73328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73328/testReport)** for PR 16928 at commit

[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17028 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73320/ Test PASSed. ---

[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16928 Merged build finished. Test PASSed.

[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16928 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73322/ Test PASSed. ---

[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17028 Merged build finished. Test PASSed.

[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16928 **[Test build #73322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73322/testReport)** for PR 16928 at commit

[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17028 **[Test build #73320 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73320/testReport)** for PR 17028 at commit

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102647844 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16594#discussion_r102647596 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -794,6 +795,7 @@ EXPLAIN: 'EXPLAIN'; FORMAT: 'FORMAT';

[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16938 @tejasapatil Spark doesn't need to be exactly the same as Hive; we follow Hive's behavior if it's reasonable, or use our own logic if Hive's behavior doesn't make sense.

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102646572 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +221,25 @@ private[csv] class

[GitHub] spark issue #11211: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not propgated t...

2017-02-22 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/11211 @holdenk description is updated.

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102646336 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -202,21 +221,25 @@ private[csv] class

[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/16938 I looked into the code. It looks like that version is merely for picking the Hive shim and metastore interactions and has nothing to do with the semantics of SQL operations. So you are most likely

[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-02-22 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/16971 Yes, my point was that returning null is not very idiomatic in Scala. It is better to return an Option or an empty collection. Option doesn't work for Java compatibility, so an empty Array is best in this case, I believe.
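The preference for an empty collection over null can be sketched as follows. Python is used here only for illustration; the function name `approx_quantiles` and its nearest-rank logic are hypothetical and are not Spark's `approxQuantile` implementation. The sketch only shows the shape of the contract being discussed: missing values are dropped, and an input with nothing left yields an empty list rather than a null-like value.

```python
import math
from typing import List, Optional, Sequence


def approx_quantiles(values: Sequence[Optional[float]],
                     probabilities: Sequence[float]) -> List[float]:
    """Hypothetical helper: nearest-rank quantiles over the non-missing values."""
    # Drop None and NaN up front so missing data is handled in one place.
    clean = sorted(v for v in values if v is not None and not math.isnan(v))
    if not clean:
        # Return an empty list instead of None: callers can iterate over the
        # result or take len() of it without a separate null check.
        return []
    last = len(clean) - 1
    # Nearest-rank index for probability p is ceil(p * n) - 1, clamped to [0, n - 1].
    return [clean[min(last, max(0, math.ceil(p * len(clean)) - 1))]
            for p in probabilities]
```

With this contract, a caller can write `for q in approx_quantiles(col, [0.5]): ...` and the loop simply does nothing for an empty or all-NaN column.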

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102646058 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -190,8 +208,9 @@ private[csv] class

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102646065 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102645418 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-02-22 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16971 @thunterdb Good point. I will check the `sampled` in `def query`. @MLnick @gatorsmile I prefer an empty array as the result for an empty dataset or for columns that only contain NA. And,

[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-22 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16938 @tejasapatil In my opinion, the test in Hive 2.0.0 just serves as a comparison with Spark; the goal is to determine these behaviors in Spark, not to make them consistent with Hive 2.0.0 or Hive 1.2.1, isn't it?

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102645047 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102645010 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/16938 @windpiger : I realised that you are checking the hive behavior against Hive 2.0.0. Spark is expected to support semantics for Hive 1.2.1 :

[GitHub] spark issue #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN s...

2017-02-22 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/16841 @gatorsmile sure, I will do that. Thanks.

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102644680 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17033 cc @srowen

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102644140 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -45,24 +45,41 @@ private[csv] class

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102643961 --- Diff: python/pyspark/sql/readwriter.py --- @@ -367,10 +368,18 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark issue #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN s...

2017-02-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16841 To make the results consistent between big endian and little endian, we can improve the queries with extra ORDER BY clauses. @robbinspg Which queries failed? @kevinyu98 Can you
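The suggestion to add ordering clauses can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module rather than Spark, and the table `t` and its columns are made up for illustration. The point is only that an explicit sort on a unique key pins down the row order, so stored expected-output files compare equal regardless of platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(3, "c"), (1, "a"), (2, "b")])

# Without ORDER BY, SQL gives no guarantee about the order of returned rows,
# so a byte-for-byte comparison against a golden file may fail spuriously.
unordered = conn.execute(
    "SELECT id, name FROM t WHERE id IN (SELECT id FROM t)"
).fetchall()

# With ORDER BY on a unique key, the row order is fully determined.
ordered = conn.execute(
    "SELECT id, name FROM t WHERE id IN (SELECT id FROM t) ORDER BY id"
).fetchall()

conn.close()
```

Both queries return the same set of rows; only `ordered` is safe to compare directly against a fixed expected result.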

[GitHub] spark pull request #16928: [SPARK-18699][SQL] Put malformed tokens into a ne...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16928#discussion_r102643746 --- Diff: python/pyspark/sql/readwriter.py --- @@ -367,10 +368,18 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark issue #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN s...

2017-02-22 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/16841 Hello Pete: Thanks for running the test case. Can you send me the failing test case file? I can also provide new test files with the output files; can you help test them on your platforms? Thanks.

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-22 Thread lw-lin
Github user lw-lin closed the pull request at: https://github.com/apache/spark/pull/16987

[GitHub] spark issue #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-22 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/16987 Using deterministic file names sounds great. Thanks! I'm closing this.

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17033 Merged build finished. Test PASSed.

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17033 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73327/ Test PASSed. ---

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17033 **[Test build #73327 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73327/testReport)** for PR 17033 at commit

[GitHub] spark pull request #17023: [SPARK-19695][SQL] Throw an exception if a `colum...

2017-02-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17023

[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73316/ Test PASSed. ---

[GitHub] spark issue #17023: [SPARK-19695][SQL] Throw an exception if a `columnNameOf...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17023 thanks, merging to master!

[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16998 Merged build finished. Test PASSed.

[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16998 **[Test build #73316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73316/testReport)** for PR 16998 at commit

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r102642601 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #17033: [DOCS] application environment rest api

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17033 **[Test build #73327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73327/testReport)** for PR 17033 at commit

[GitHub] spark pull request #17033: [DOCS] application environment rest api

2017-02-22 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17033 [DOCS] application environment rest api ## What changes were proposed in this pull request? application environment rest api ## How was this patch tested? jenkins You

[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16938 Thank you for your work! Maybe the last question. ``` **2. CREATE TABLE ...PARTITIONED BY ... LOCATION path AS SELECT ...** a) path exists hive(external) ->

[GitHub] spark issue #16923: [SPARK-19038][Hive][YARN] Correctly figure out keytab fi...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16923 **[Test build #73326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73326/testReport)** for PR 16923 at commit

[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17001 **[Test build #73325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73325/testReport)** for PR 17001 at commit

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16826 Merged build finished. Test FAILed.

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16826 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73319/ Test FAILed. ---

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16826 **[Test build #73319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73319/testReport)** for PR 16826 at commit

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13599 **[Test build #73324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73324/testReport)** for PR 13599 at commit
