[GitHub] spark issue #14256: [SPARK-16620][CORE] Add back the tokenization process in...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14256 Hi, @lw-lin . This seems to resolve SPARK-16613 , too. Could you check that? If possible, please add SPARK-16613 into the title, too. --- If your project is set up for it, you can reply to t

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14251 **[Test build #62511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62511/consoleFull)** for PR 14251 at commit [`90d6851`](https://github.com/apache/spark/commit/9

[GitHub] spark pull request #14253: [Doc] improve python doc for rdd.histogram and da...

2016-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14253

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14251 Now, `findTightestCommonTypeToString` is public, and the test case has been moved and reduced.

[GitHub] spark issue #14255: [MINOR] Fix Java Linter `LineLength` errors

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14255 For easy comparison, `lint-java` results are here. - https://travis-ci.org/dongjoon-hyun/spark/jobs/145738728 (Current master: [SPARK-16303][DOCS][EXAMPLES] ...) - https://travis-c

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram and dataframe...

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14253 Merging in master/2.0. Thanks.

[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14222 Ok.

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram and dataframe...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14253 **[Test build #3188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3188/consoleFull)** for PR 14253 at commit [`6d8c9aa`](https://github.com/apache/spark/commit

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71278812 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the A

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13382 **[Test build #62510 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62510/consoleFull)** for PR 13382 at commit [`e19ec3d`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14014 Let's also update the description.

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71277693 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2965,4 +2965,32 @@ class SQLQuerySuite extends QueryTest with Share

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71277607 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -146,6 +151,15 @@ case class CatalogTable( requi

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread dafrista
Github user dafrista commented on the issue: https://github.com/apache/spark/pull/13382 Thanks, @ericl. I've added that information to the class comment.

[GitHub] spark issue #14255: [MINOR] Fix Java Linter `LineLength` errors

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14255 **[Test build #62509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62509/consoleFull)** for PR 14255 at commit [`c44a8a0`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #14256: [SPARK-16620][CORE] Add back the tokenization process in...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14256 **[Test build #62508 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62508/consoleFull)** for PR 14256 at commit [`8517465`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #14255: [MINOR] Fix Java Linter `LineLength` errors

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14255 Rebased to resolve conflicts.

[GitHub] spark pull request #14200: [SPARK-16528][SQL] Fix NPE problem in HiveClientI...

2016-07-18 Thread jacek-lewandowski
Github user jacek-lewandowski commented on a diff in the pull request: https://github.com/apache/spark/pull/14200#discussion_r71277387 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -320,7 +320,7 @@ private[hive] class HiveClientImpl

[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14222 @viirya I'm going to take over the PR and play with the API a little bit.

[GitHub] spark pull request #14256: [SPARK-16620][CORE] Add back tokenization process...

2016-07-18 Thread lw-lin
GitHub user lw-lin opened a pull request: https://github.com/apache/spark/pull/14256 [SPARK-16620][CORE] Add back tokenization process in RDD.pipe(command: String) ## What changes were proposed in this pull request? Currently `RDD.pipe(command: String)`: - works only wi
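The PR description above is cut off in this archive, so the exact behaviour it restores is not visible here. As a rough illustration of what "tokenizing" a command string means in general, the following Python sketch splits a command on whitespace into an argument vector before launching a process. The names `tokenize` and `pipe_lines` are hypothetical, and this is not Spark's implementation.

```python
import subprocess
from typing import Iterable, List


def tokenize(command: str) -> List[str]:
    """Split a command string on runs of whitespace (hypothetical helper)."""
    return command.split()


def pipe_lines(command: str, lines: Iterable[str]) -> List[str]:
    """Feed lines to an external process on stdin and return its stdout lines.

    Assumption: the tokenized command names an executable on the PATH.
    """
    argv = tokenize(command)  # e.g. "wc -l" becomes ["wc", "-l"]
    payload = "".join(line + "\n" for line in lines)
    result = subprocess.run(
        argv, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Without the tokenization step, a string such as "wc -l" would be treated as the name of a single executable, which is typically why a command with arguments fails to launch.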

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71277158 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2965,4 +2965,32 @@ class SQLQuerySuite extends QueryTest with SharedSQLConte

[GitHub] spark pull request #14014: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14014#discussion_r71277147 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -442,13 +445,23 @@ private[parquet] clas

[GitHub] spark pull request #14200: [SPARK-16528][SQL] Fix NPE problem in HiveClientI...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14200#discussion_r71277056 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -320,7 +320,7 @@ private[hive] class HiveClientImpl(

[GitHub] spark pull request #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java ex...

2016-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14245

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71277066 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2965,4 +2965,32 @@ class SQLQuerySuite extends QueryTest with Share

[GitHub] spark pull request #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceG...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14222#discussion_r71277037 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/expressions/ReduceAggregatorSuite.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Soft

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71276984 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2965,4 +2965,32 @@ class SQLQuerySuite extends QueryTest with Share

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71276925 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -100,7 +100,8 @@ object TypeCoercion { }

[GitHub] spark issue #14255: [MINOR] Fix Java Linter `LineLength` errors

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14255 **[Test build #62507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62507/consoleFull)** for PR 14255 at commit [`8cf8c78`](https://github.com/apache/spark/commit/8

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71276731 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2965,4 +2965,32 @@ class SQLQuerySuite extends QueryTest with SharedSQLConte

[GitHub] spark pull request #14251: [SPARK-16602][SQL] `Nvl` function should support ...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14251#discussion_r71276614 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -100,7 +100,8 @@ object TypeCoercion { }

[GitHub] spark pull request #14255: [MINOR] Fix Java Linter `LineLength` errors

2016-07-18 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/14255 [MINOR] Fix Java Linter `LineLength` errors ## What changes were proposed in this pull request? This PR fixes four java linter `LineLength` errors. Those are all `LineLength` errors,

[GitHub] spark pull request #14014: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14014#discussion_r71276489 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRecordMaterializer.scala --- @@ -30,10 +30,11 @@ import org.apache

[GitHub] spark pull request #14227: [SPARK-16582][SQL] Explicitly define isNull = fal...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14227#discussion_r71276500 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -377,6 +377,7 @@ abstract class UnaryExpression ex

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71276368 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram and dataframe...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14253 **[Test build #3188 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3188/consoleFull)** for PR 14253 at commit [`6d8c9aa`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram and dataframe...

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14253 Jenkins, test this please.

[GitHub] spark pull request #14247: [MINOR] Remove unused arg in als.py

2016-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14247

[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...

2016-07-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14245 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #14254: [SPARK-16619] Add shuffle service metrics entry in monit...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14254 Can one of the admins verify this patch?

[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14247 Merging in master. Thanks.

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71275973 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71275955 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #14254: Add shuffle service metrics entry in monitoring d...

2016-07-18 Thread lovexi
GitHub user lovexi opened a pull request: https://github.com/apache/spark/pull/14254 Add shuffle service metrics entry in monitoring docs ## What changes were proposed in this pull request? Add shuffle service metrics entry in currently supporting metrics list in monitoring

[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13778 ping @cloud-fan Can you check if this is good for you now? It has been a while. Thanks.

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71275696 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to t

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71275743 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1441,6 +1441,12 @@ object PushPredicateThroughJoin e

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14207 > when the data/files are changed by external system (e.g., appended by a streaming system), the stored schema can be inconsistent with the actual schema of the data. I think this problem

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14207 @gatorsmile Yea. I meant that as you use the stored schema without inferred schema for table, when the data/files are changed by external system (e.g., appended by a streaming system), the stored sch

[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable token manager for...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62506/consoleFull)** for PR 14065 at commit [`b8eeb28`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14207 @viirya Schema inference is time-consuming, especially when the number of files is huge. Thus, we should avoid refreshing it every time. That is one of the major reasons why we have a metadata ca
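The trade-off discussed in this thread (inference is expensive, so the result is stored, but a stored schema can go stale when an external system changes the files) can be sketched with a small cache that only re-infers on an explicit refresh. This is a minimal Python illustration under stated assumptions: `SchemaCache`, its methods, and the `infer` callback are hypothetical names, not Spark APIs.

```python
from typing import Callable, Dict, List


class SchemaCache:
    """Caches an inferred schema per table until refresh() is called."""

    def __init__(self, infer: Callable[[str], List[str]]) -> None:
        self._infer = infer  # the expensive inference step (hypothetical)
        self._cache: Dict[str, List[str]] = {}
        self.infer_calls = 0  # counts how often inference actually ran

    def schema(self, table: str) -> List[str]:
        # Infer only on a cache miss; later reads reuse the stored result.
        if table not in self._cache:
            self.infer_calls += 1
            self._cache[table] = self._infer(table)
        return self._cache[table]

    def refresh(self, table: str) -> None:
        # Drop the stored schema so that the next read re-infers it.
        self._cache.pop(table, None)
```

With this design, a table whose files gain a column keeps reporting the old schema until `refresh` is called, which is the inconsistency raised in the thread; the benefit is that repeated reads do not pay the inference cost again.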

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14253 Can one of the admins verify this patch?

[GitHub] spark pull request #14253: [Doc] improve python doc for rdd.histogram

2016-07-18 Thread mortada
GitHub user mortada opened a pull request: https://github.com/apache/spark/pull/14253 [Doc] improve python doc for rdd.histogram ## What changes were proposed in this pull request? doc change only ## How was this patch tested? doc change only

[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14222 ping @rxin Is the change OK for you? Thanks.

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14207 @gatorsmile When the data/files are input by an external system, and Spark is just used to process them in batch. Does it mean that schema can be inconsistent? Or it should call refresh every time it

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Merged build finished. Test PASSed.

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62504/ Test PASSed.

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62504/consoleFull)** for PR 13704 at commit [`cbcfd56`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14251 Hi, @rxin . Could you review this `Nvl` PR again? I can solve that by only replacing `findTightestCommonTypeOfTwo` with `findTightestCommonTypeToString`.

[GitHub] spark pull request #13990: [SPARK-16287][SQL] Implement str_to_map SQL funct...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13990#discussion_r71273378 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -393,3 +394,56 @@ case class CreateNamedSt

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14251 **[Test build #62505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62505/consoleFull)** for PR 14251 at commit [`53ae02f`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71273081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -146,6 +151,15 @@ case class CatalogTable( requireSu

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272934 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -303,6 +303,7 @@ object CreateDataSourceTableUtil

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13382 Merged build finished. Test PASSed.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13382 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62503/ Test PASSed.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13382 **[Test build #62503 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62503/consoleFull)** for PR 13382 at commit [`0fe4bc8`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -313,18 +313,48 @@ class SparkSqlAstBuilder(conf: SQLConf) extends

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62502/ Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Merged build finished. Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62502/consoleFull)** for PR 14132 at commit [`404a322`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -146,6 +151,15 @@ case class CatalogTable( requireSu

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Merged build finished. Test PASSed.

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62501/ Test PASSed.

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #62501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62501/consoleFull)** for PR 14102 at commit [`cfe6bed`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62504/consoleFull)** for PR 13704 at commit [`cbcfd56`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Merged build finished. Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62500/ Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62500/consoleFull)** for PR 14132 at commit [`5ba2ad7`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13382 Cool, @JoshRosen I'll leave this for you to merge.

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71266677 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -27,8 +27,8 @@ import org.apache.spark.util.Utils /**

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71266605 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala --- @@ -351,6 +353,44 @@ class CatalogImpl(sparkSession: SparkSession) e

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71266596 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -487,6 +487,10 @@ object DDLUtils { isDatasourceTable(t

[GitHub] spark issue #10881: [SPARK-12967][Netty] Avoid NettyRpc error message during...

2016-07-18 Thread JerryLead
Github user JerryLead commented on the issue: https://github.com/apache/spark/pull/10881 This bug still exists in the latest Spark 1.6.2. How about merging it into branch-1.6? @nishkamravi2 @zsxwing

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13382 **[Test build #62503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62503/consoleFull)** for PR 13382 at commit [`0fe4bc8`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread dafrista
Github user dafrista commented on the issue: https://github.com/apache/spark/pull/13382 Jenkins, test this please.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread dafrista
Github user dafrista commented on the issue: https://github.com/apache/spark/pull/13382 Thanks @ericl. I pushed a commit addressing your comments.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62502/consoleFull)** for PR 14132 at commit [`404a322`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Right, there is the `table` API, too. Thank you, I'll add that as well. By the way, I'm still downtown and need to go home for dinner. I'll take care of that tonight. Thank you again.

[GitHub] spark pull request #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level t...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14054#discussion_r71265164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -158,25 +159,41 @@ object JdbcUtils extends Logg

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71265020 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 For your reference, below is a simple case if users want to do it using dataframe ```Scala sql("CREATE TABLE tab1(c1 int)") val df = spark.read.table("tab1") df.join(broadcast(df))

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Yep. I made that case. Thanks!

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 What I mean is: how do we currently broadcast the Hive table `tab1`? I'm writing the test case.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 Is it related? This is the most basic test case, right?
```SQL
CREATE TABLE tab1(c1 int)
select * from tab1, tab1
```

[GitHub] spark issue #13990: [SPARK-16287][SQL] Implement str_to_map SQL function

2016-07-18 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13990 @cloud-fan Comment addressed, test passed 👍

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Does this work in the `DataFrame` API, too?

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 Not all the joins have the operators `SubqueryAlias`. For example, below is a self join against Hive tables: ``` == Analyzed Logical Plan == c1: int, c1: int Project [c1#7, c1#8]

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13382 This LGTM with some minor comments.

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262947 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262912 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(
