[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram and dataframe...

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14253 Jenkins, test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark pull request #14247: [MINOR] Remove unused arg in als.py

2016-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14247

[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...

2016-07-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14245 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #14254: [SPARK-16619] Add shuffle service metrics entry in monit...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14254 Can one of the admins verify this patch?

[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14247 Merging in master. Thanks.

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71275955 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71275973 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non

[GitHub] spark pull request #14254: Add shuffle service metrics entry in monitoring d...

2016-07-18 Thread lovexi
GitHub user lovexi opened a pull request: https://github.com/apache/spark/pull/14254 Add shuffle service metrics entry in monitoring docs ## What changes were proposed in this pull request? Add a shuffle service metrics entry to the list of currently supported metrics in

[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13778 ping @cloud-fan Can you check if this is good for you now? It has been a while. Thanks.

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71275696 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71275743 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1441,6 +1441,12 @@ object PushPredicateThroughJoin

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14207 > when the data/files are changed by external system (e.g., appended by a streaming system), the stored schema can be inconsistent with the actual schema of the data. I think this

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14207 @gatorsmile Yea. I meant that as you use the stored schema without inferred schema for table, when the data/files are changed by external system (e.g., appended by a streaming system), the stored

[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable token manager for...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62506/consoleFull)** for PR 14065 at commit

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14207 @viirya Schema inference is time-consuming, especially when the number of files is huge. Thus, we should avoid refreshing it every time. That is one of the major reasons why we have a metadata

[GitHub] spark issue #14253: [Doc] improve python doc for rdd.histogram

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14253 Can one of the admins verify this patch?

[GitHub] spark pull request #14253: [Doc] improve python doc for rdd.histogram

2016-07-18 Thread mortada
GitHub user mortada opened a pull request: https://github.com/apache/spark/pull/14253 [Doc] improve python doc for rdd.histogram ## What changes were proposed in this pull request? doc change only ## How was this patch tested? doc change only

[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14222 ping @rxin Is the change OK for you? Thanks.

[GitHub] spark issue #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Schemas int...

2016-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14207 @gatorsmile When the data/files are input by an external system and Spark is just used to process them in batch, does it mean that the schema can be inconsistent? Or should it call refresh every time

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Merged build finished. Test PASSed.

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62504/ Test PASSed.

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62504/consoleFull)** for PR 13704 at commit

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14251 Hi, @rxin. Could you review this `Nvl` PR again? I can solve that by only replacing `findTightestCommonTypeOfTwo` with `findTightestCommonTypeToString`.

[GitHub] spark pull request #13990: [SPARK-16287][SQL] Implement str_to_map SQL funct...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13990#discussion_r71273378 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -393,3 +394,56 @@ case class

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support numeric...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14251 **[Test build #62505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62505/consoleFull)** for PR 14251 at commit

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71273081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -146,6 +151,15 @@ case class CatalogTable(

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272934 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -303,6 +303,7 @@ object

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13382 Merged build finished. Test PASSed.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13382 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62503/ Test PASSed.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13382 **[Test build #62503 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62503/consoleFull)** for PR 13382 at commit

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -313,18 +313,48 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62502/ Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Merged build finished. Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62502/consoleFull)** for PR 14132 at commit

[GitHub] spark pull request #14155: [SPARK-16498][SQL][WIP] move hive hack for data s...

2016-07-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r71272290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -146,6 +151,15 @@ case class CatalogTable(

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Merged build finished. Test PASSed.

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62501/ Test PASSed.

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #62501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62501/consoleFull)** for PR 14102 at commit

[GitHub] spark issue #13704: [SPARK-15985][SQL] Eliminate redundant cast from an arra...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62504/consoleFull)** for PR 13704 at commit

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Merged build finished. Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62500/ Test PASSed.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62500/consoleFull)** for PR 14132 at commit

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13382 Cool, @JoshRosen I'll leave this for you to merge.

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71266677 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -27,8 +27,8 @@ import org.apache.spark.util.Utils /**

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71266605 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala --- @@ -351,6 +353,44 @@ class CatalogImpl(sparkSession: SparkSession)

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71266596 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -487,6 +487,10 @@ object DDLUtils {

[GitHub] spark issue #10881: [SPARK-12967][Netty] Avoid NettyRpc error message during...

2016-07-18 Thread JerryLead
Github user JerryLead commented on the issue: https://github.com/apache/spark/pull/10881 This bug still exists in latest Spark 1.6.2. How about merging it to branch-1.6? @nishkamravi2 @zsxwing

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13382 **[Test build #62503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62503/consoleFull)** for PR 13382 at commit

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread dafrista
Github user dafrista commented on the issue: https://github.com/apache/spark/pull/13382 Jenkins, test this please.

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread dafrista
Github user dafrista commented on the issue: https://github.com/apache/spark/pull/13382 Thanks @ericl. I pushed a commit addressing your comments.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62502/consoleFull)** for PR 14132 at commit

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Right, there is a `table` API, too. Thank you, I'll add that, too. By the way, I'm still downtown. I need to go home for dinner. I'll take care of that tonight. Thank you again.

[GitHub] spark pull request #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level t...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14054#discussion_r71265164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -158,25 +159,41 @@ object JdbcUtils extends

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71265020 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 For your reference, below is a simple case if users want to do it using dataframe ```Scala sql("CREATE TABLE tab1(c1 int)") val df = spark.read.table("tab1")

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Yep. I made that case. Thanks!

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 What I mean is: how do we currently broadcast the Hive table `tab1`? I'm making the test case.

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 Is it related? This is the most basic test case, right? ```SQL CREATE TABLE tab1(c1 int) select * from tab1, tab1 ```

[GitHub] spark issue #13990: [SPARK-16287][SQL] Implement str_to_map SQL function

2016-07-18 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13990 @cloud-fan Comment addressed, test passed 👍

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 Does this work in the `DataFrame` API, too?

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14132 Not all the joins have the operators `SubqueryAlias`. For example, below is a self join against Hive tables: ``` == Analyzed Logical Plan == c1: int, c1: int Project [c1#7, c1#8]

[GitHub] spark issue #13382: [SPARK-5581][Core] When writing sorted map output file, ...

2016-07-18 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13382 This LGTM with some minor comments.

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262947 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262912 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262847 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark pull request #13382: [SPARK-5581][Core] When writing sorted map output...

2016-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/13382#discussion_r71262784 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -46,102 +46,145 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-07-18 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 No, go ahead and submit one :)

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71262073 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -270,6 +291,11 @@ case class

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71261575 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/InferSchema.scala --- @@ -60,13 +60,13 @@ private[sql] object

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71261455 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable token manager for...

2016-07-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/14065 Thanks @tgravescs for your comments, I will add the docs about it.

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #62501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62501/consoleFull)** for PR 14102 at commit

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-07-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 @yhuai the commits I pushed include the changes below: - Reverts the changes in `JSONOptions` about `columnNameOfCorruptRecord`

[GitHub] spark pull request #14207: [SPARK-16552] [SQL] [WIP] Store the Inferred Sche...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r71261112 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -270,6 +291,11 @@ case class

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71261042 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -425,6 +452,49 @@ class SQLBuilder(logicalPlan: LogicalPlan)

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71260947 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...

2016-07-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r71260242 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala --- @@ -0,0 +1,119 @@ +/* + * Licensed to

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71260158 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -425,6 +452,49 @@ class SQLBuilder(logicalPlan: LogicalPlan)

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71259834 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -87,6 +87,7 @@ class Analyzer(

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14132 I see. For `NormalizeBroadcastHint`, I will try to minimize the cases.

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71259759 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14132 **[Test build #62500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62500/consoleFull)** for PR 14132 at commit

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71259501 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -425,6 +452,49 @@ class SQLBuilder(logicalPlan: LogicalPlan)

[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14241 @ericl I was talking with @marmbrus -- it'd be better to create an API in the physical scan operator that accepts a list of filters, and then do pruning there. That is to say, we also want to move all

[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14132 I was referring to NormalizeBroadcastHint -- there are many cases in there and it seems error prone against future changes. Do we need all those rules?

[GitHub] spark pull request #14252: [SPARK-16615][SQL] Expose sqlContext in SparkSess...

2016-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14252

[GitHub] spark issue #14252: [SPARK-16615][SQL] Expose sqlContext in SparkSession

2016-07-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14252 Merging in master/2.0.

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support various...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62499/ Test PASSed.

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support various...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14251 Merged build finished. Test PASSed.

[GitHub] spark issue #14251: [SPARK-16602][SQL] `Nvl` function should support various...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14251 **[Test build #62499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62499/consoleFull)** for PR 14251 at commit

[GitHub] spark pull request #14235: [SPARK-16590][SQL] Improve LogicalPlanToSQLSuite ...

2016-07-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71257848 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,15 +17,33 @@ package

[GitHub] spark pull request #14235: [SPARK-16590][SQL] Improve LogicalPlanToSQLSuite ...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71257655 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,15 +17,33 @@ package

[GitHub] spark pull request #14235: [SPARK-16590][SQL] Improve LogicalPlanToSQLSuite ...

2016-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71257369 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,15 +17,33 @@ package

[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62498/ Test PASSed.

[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14174 Merged build finished. Test PASSed.

[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14174 **[Test build #62498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62498/consoleFull)** for PR 14174 at commit

[GitHub] spark issue #14252: [SPARK-16615][SQL] Expose sqlContext in SparkSession

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14252 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62496/ Test PASSed.

[GitHub] spark issue #14252: [SPARK-16615][SQL] Expose sqlContext in SparkSession

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14252 Merged build finished. Test PASSed.

[GitHub] spark issue #14252: [SPARK-16615][SQL] Expose sqlContext in SparkSession

2016-07-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14252 **[Test build #62496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62496/consoleFull)** for PR 14252 at commit

[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...

2016-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14174 Merged build finished. Test PASSed.
