[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16958 cc @hvanhovell @bogdanrdc --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark pull request #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved...

2017-02-16 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16958 [SPARK-13721][SQL] Make GeneratorOuter unresolved. ## What changes were proposed in this pull request? This is a small change to make GeneratorOuter always unresolved. It is mostly no-op change

[GitHub] spark pull request #16956: [SPARK-19598][SQL]Remove the alias parameter in U...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16956#discussion_r101530187 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala --- @@ -54,10 +54,6 @@ object ResolveHints

spark git commit: [SPARK-19607][HOTFIX] Finding QueryExecution that matches provided executionId

2017-02-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3755da76c -> 59dc26e37 [SPARK-19607][HOTFIX] Finding QueryExecution that matches provided executionId ## What changes were proposed in this pull request? #16940 adds a test case which does not stop the spark job. It causes many failures o

[GitHub] spark issue #16943: [SPARK-19607][HOTFIX] Finding QueryExecution that matche...

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16943 Merging in master.

spark git commit: [SPARK-16475][SQL] broadcast hint for SQL queries - disallow space as the delimiter

2017-02-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master a8a139820 -> acf71c63c [SPARK-16475][SQL] broadcast hint for SQL queries - disallow space as the delimiter ## What changes were proposed in this pull request? A follow-up to disallow space as the delimiter in broadcast hint. ## How was t

[GitHub] spark issue #16941: [SPARK-16475][SQL] broadcast hint for SQL queries - disa...

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16941 Merging in master.

[GitHub] spark pull request #16941: [SPARK-16475][SQL] broadcast hint for SQL queries...

2017-02-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16941#discussion_r101329235 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala --- @@ -524,7 +530,7 @@ class PlanParserSuite extends

spark git commit: [SPARK-19607] Finding QueryExecution that matches provided executionId

2017-02-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3973403d5 -> b55563c17 [SPARK-19607] Finding QueryExecution that matches provided executionId ## What changes were proposed in this pull request? Implementing a mapping between executionId and corresponding QueryExecution in SQLExecution.

[GitHub] spark issue #16940: [SPARK-19607] Finding QueryExecution that matches provid...

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16940 Merging in master.

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101289645 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -374,6 +374,16 @@ querySpecification windows

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101289574 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -374,6 +374,16 @@ querySpecification windows

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101288304 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -374,6 +374,16 @@ querySpecification windows

[GitHub] spark issue #16920: [MINOR][DOCS] Add jira url in pull request description

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16920 Yea the only issue is that it requires another manual update. Why not use the chrome plugin I sent?

[GitHub] spark issue #16940: [SPARK-19607] Finding QueryExecution that matches provid...

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16940 LGTM (pending Jenkins).

[GitHub] spark issue #16920: [MINOR][DOCS] Add jira url in pull request description

2017-02-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16920 Why is this necessary? It seems like an extra step needed and doesn't provide any real information. I suggest you use this: https://chrome.google.com/webstore/detail/j

[GitHub] spark pull request #16939: [SPARK-16475][SQL] broadcast hint for SQL queries...

2017-02-15 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16939 [SPARK-16475][SQL] broadcast hint for SQL queries - follow up ## What changes were proposed in this pull request? A small update to https://github.com/apache/spark/pull/16925 1. Rename

[GitHub] spark issue #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16925 the latest commit hasn't finished running tests yet ... but probably fine given the small change.

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101137229 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteHintsSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101129634 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteHints.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101129594 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteHints.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101129453 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteHintsSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark issue #16925: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16925 cc @dongjoon-hyun, @cloud-fan, @gatorsmile and @hvanhovell This should be ready for review. Note that the semantics is different from the earlier versions.

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16925#discussion_r101088496 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteHints.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the

[GitHub] spark issue #16925: [SPARK-16475][SQL] Broadcast Hint for SQL Queries - WIP

2017-02-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16925 Actually I'm going to completely rewrite this. I don't think the current implementation makes sense.

[GitHub] spark pull request #16925: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2017-02-14 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16925 [SPARK-16475][SQL] Broadcast Hint for SQL Queries ## What changes were proposed in this pull request? This PR aims to achieve the following two goals in Spark SQL. 1. Generic Hint

[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14426 Actually I have some time. I will submit a pr based on this.

[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2017-02-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14426 @dongjoon-hyun do you have time to update the pull request now the view canonicalization work is done? Basically we can remove all the SQL generation stuff.

spark git commit: [SPARK-19514] Enhancing the test for Range interruption.

2017-02-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1c4d10b10 -> 0417ce878 [SPARK-19514] Enhancing the test for Range interruption. Improve the test for SPARK-19514, so that it's clear which stage is being cancelled. Author: Ala Luszczak Closes #16914 from ala/fix-range-test. Project:

[GitHub] spark issue #16914: [SPARK-19514] Enhancing the test for Range interruption.

2017-02-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16914 Merging in master.

[GitHub] spark issue #16914: [SPARK-19514] Enhancing the test for Range interruption.

2017-02-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16914 LGTM pending jenkins.

[GitHub] spark pull request #16872: [SPARK-19514] Making range interruptible.

2017-02-13 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16872#discussion_r100789955 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -127,4 +133,28 @@ class DataFrameRangeSuite extends QueryTest with

[GitHub] spark issue #16888: [WIP] [SPARK-19552] [BUILD] Upgrade Netty version to 4.1...

2017-02-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16888 Are there specific benefits brought by updating to 4.1 of Netty? Netty is so core to Spark that any bug in it would be extremely difficult to debug (yes we have found bugs in Netty and helped fix

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r100687458 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -48,69 +47,110 @@ class JacksonParser( // A

spark git commit: [SPARK-19549] Allow providing reason for stage/job cancelling

2017-02-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3a43ae7c0 -> d785217b7 [SPARK-19549] Allow providing reason for stage/job cancelling ## What changes were proposed in this pull request? This change adds an optional argument to `SparkContext.cancelStage()` and `SparkContext.cancelJob()` f

[GitHub] spark issue #16887: [SPARK-19549] Allow providing reason for stage/job cance...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16887 Merging in master!

[GitHub] spark issue #16888: [SPARK-19552] [BUILD] Upgrade Netty version to 4.1.8 fin...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16888 BTW for Netty we shouldn't just bump to the highest version. We should use the maintenance branches.

[GitHub] spark issue #16888: [SPARK-19552] [BUILD] Upgrade Netty version to 4.1.8 fin...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16888 Shouldn't we use netty-4.0.44.Final rather than 4.1.x?

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 Yea we should fix that.

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 Actually @cloud-fan are you sure it is a problem right now? DataSource.write itself creates the commands, and if the information is propagated correctly, the QueryExecution object should have a

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 Basically I see no reason to add some specific parameter to a listener API that is meant to be generic which already contains reference to QueryExecution. What are you going to do if next time you

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 I think that's a separate "bug" we should fix, i.e. DataFrameWriter should use InsertIntoDataSourceCommand so we can consolidate the two paths.

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 Well it does. It contains the entire plan.

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 That's probably because you are not familiar with the SQL component. The existing API already has references to the QueryExecution object, which actually includes all of the information

spark git commit: Encryption of shuffle files

2017-02-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8640dc082 -> c5a66356d Encryption of shuffle files Hello According to my understanding of commits 4b4e329e49f8af28fa6301bd06c48d7097eaf9e6 & 8b325b17ecdf013b7a6edcb7ee3773546bd914df, one may now encrypt shuffle files regardless of the c

[GitHub] spark issue #16885: Encryption of shuffle files

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16885 Thanks - merging in master.

[GitHub] spark issue #16887: [SPARK-19549] Allow providing reason for stage/job cance...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16887 LGTM pending Jenkins.

[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16664#discussion_r100565585 --- Diff: docs/sql-programming-guide.md --- @@ -1300,10 +1300,28 @@ Configuration of in-memory caching can be done using the `setConf` method on `Sp

[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16664#discussion_r100565522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala --- @@ -44,27 +44,50 @@ trait QueryExecutionListener

[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16664#discussion_r100564925 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -218,7 +247,14 @@ final class DataFrameWriter[T] private[sql](ds: Dataset

[GitHub] spark issue #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener callback...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16664 Sorry I'm really confused, probably because I haven't kept track of this pr. But the diff doesn't match the pr description. Are we fixing a bug here or introducing a bunch of new

[GitHub] spark pull request #16887: [SPARK-19549] Allow providing reason for stage/jo...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16887#discussion_r100552660 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -696,9 +696,9 @@ class DAGScheduler( /** * Cancel a job that

[GitHub] spark pull request #16887: [SPARK-19549] Allow providing reason for stage/jo...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16887#discussion_r100552370 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2207,20 +2207,22 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100503141 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -81,6 +81,11 @@ int getVersionNumber() { public abstract

spark git commit: [SPARK-19512][BACKPORT-2.1][SQL] codegen for compare structs fails #16852

2017-02-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 a3d5300a0 -> ff5818b8c [SPARK-19512][BACKPORT-2.1][SQL] codegen for compare structs fails #16852 ## What changes were proposed in this pull request? Set currentVars to null in GenerateOrdering.genComparisons before genCode is called.

[GitHub] spark issue #16875: [BACKPORT-2.1][SPARK-19512][SQL] codegen for compare str...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16875 Merging in branch-2.1.

[GitHub] spark issue #16875: [BACKPORT-2.1][SPARK-19512][SQL] codegen for compare str...

2017-02-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16875 @bogdanrdc can you close this? It won't auto close because it is not merged in master.

[GitHub] spark pull request #16872: [SPARK-19514] Making range interruptible.

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16872#discussion_r100396033 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -127,4 +133,28 @@ class DataFrameRangeSuite extends QueryTest with

[GitHub] spark issue #16864: [SPARK-19527][Core] Approximate Size of Intersection of ...

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16864 I meant just union, but createUnion ...

spark git commit: [SPARK-19514] Making range interruptible.

2017-02-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3fc8e8caf -> 4064574d0 [SPARK-19514] Making range interruptible. ## What changes were proposed in this pull request? Previously range operator could not be interrupted. For example, using DAGScheduler.cancelStage(...) on a query with rang

[GitHub] spark issue #16872: [SPARK-19514] Making range interruptible.

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16872 I'm going to merge this in master. If we find a way to optimize the test we can do a follow-up pr.

[GitHub] spark issue #16872: [SPARK-19514] Making range interruptible.

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16872 LGTM

[GitHub] spark pull request #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16871#discussion_r100287048 --- Diff: build/mvn --- @@ -22,7 +22,7 @@ _DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" # Preserve t

[GitHub] spark pull request #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16871#discussion_r100287082 --- Diff: core/src/test/java/org/apache/spark/Java8RDDAPISuite.java --- @@ -15,7 +15,7 @@ * limitations under the License. */ -package

[GitHub] spark pull request #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16871#discussion_r100284451 --- Diff: core/src/test/java/org/apache/spark/Java8RDDAPISuite.java --- @@ -15,7 +15,7 @@ * limitations under the License. */ -package

[GitHub] spark pull request #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16871#discussion_r100284373 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1910,31 +1908,7 @@ private[spark] object Utils extends Logging { * @return

[GitHub] spark issue #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 support

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16871 With this, what's the behavior if users use a Java 7 runtime to run Spark? What kind of errors do we generate?

[GitHub] spark pull request #16871: [SPARK-19493][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16871#discussion_r100284098 --- Diff: build/mvn --- @@ -22,7 +22,7 @@ _DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" # Preserve t

[GitHub] spark issue #16864: [SPARK-19527][Core] Approximate Size of Intersection of ...

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16864 cc @mengxr / @tjhunter / @jkbradley is this good to have?

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100261227 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -81,6 +81,11 @@ int getVersionNumber() { public abstract

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100261151 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -148,6 +153,20 @@ int getVersionNumber() { public abstract

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100261088 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/IncompatibleUnionException.java --- @@ -0,0 +1,24 @@ +/* + * Licensed to the

[GitHub] spark issue #16826: Fork SparkSession with option to inherit a copy of the S...

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16826 @kunalkhamar you should create a JIRA ticket for this. In addition, I'm not a big fan of the design to pass a base session in. It'd be simpler if there is just a clone method on se

[GitHub] spark pull request #16826: Fork SparkSession with option to inherit a copy o...

2017-02-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r100255729 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala --- @@ -213,6 +218,24 @@ class SparkSession private( new SparkSession

[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16856 I think the issue is that the programming guide should probably switch over to the DataFrame one as the primary one, and then the RDD one as an RDD programming guide. cc @matei for his input

[GitHub] spark issue #16810: [SPARK-19464][CORE][YARN][test-hadoop2.6] Remove support...

2017-02-08 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16810 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-hadoop-2.6/3810/

[GitHub] spark issue #16810: [SPARK-19464][CORE][YARN][test-hadoop2.6] Remove support...

2017-02-08 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16810 Did we break the build?

[GitHub] spark issue #16837: [SPARK-19359][SQL] renaming partition should not leave u...

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16837 Does this change not require changing the other external catalog?

spark git commit: [SPARK-19495][SQL] Make SQLConf slightly more extensible

2017-02-07 Thread rxin
the build* functions. ## How was this patch tested? N/A - there are no logic changes and everything should be covered by existing unit tests. Author: Reynold Xin Closes #16835 from rxin/SPARK-19495. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/re

[GitHub] spark issue #16835: [SPARK-19495][SQL] Make SQLConf slightly more extensible

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16835 Merging in master.

[GitHub] spark pull request #16835: [SPARK-19495][SQL] Make SQLConf slightly more ext...

2017-02-07 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16835 [SPARK-19495][SQL] Make SQLConf slightly more extensible ## What changes were proposed in this pull request? This pull request makes SQLConf slightly more extensible by removing the visibility

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16594 ok here is an idea how about ``` explain stats xxx ``` as the way to add stats?

spark git commit: [SPARK-19447] Fixing input metrics for range operator.

2017-02-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master e99e34d0f -> 6ed285c68 [SPARK-19447] Fixing input metrics for range operator. ## What changes were proposed in this pull request? This change introduces a new metric "number of generated rows". It is used exclusively for Range, which is a

[GitHub] spark issue #16829: [SPARK-19447] Fixing input metrics for range operator.

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16829 Merging in master. Thanks.

[GitHub] spark issue #16829: [SPARK-19447] Fixing input metrics for range operator.

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16829 LGTM

[GitHub] spark issue #16832: [SPARK-19490][SQL] change hive column names to lower cas...

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16832 hm is it safe to just do this change?

[GitHub] spark issue #16826: Fork SparkSession with option to inherit a copy of the S...

2017-02-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16826 What is the semantics? Do functions/settings on the base SparkSession affect the new forked?

spark git commit: [SPARK-19409][SPARK-17213] Cleanup Parquet workarounds/hacks due to bugs of old Parquet versions

2017-02-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master 65b10ffb3 -> 7730426cb [SPARK-19409][SPARK-17213] Cleanup Parquet workarounds/hacks due to bugs of old Parquet versions ## What changes were proposed in this pull request? We've already upgraded parquet-mr to 1.8.2. This PR does some furt

[GitHub] spark issue #16791: [SPARK-19409][SPARK-17213] Cleanup Parquet workarounds/h...

2017-02-06 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16791 Merging in master.

[GitHub] spark pull request #16796: [SPARK-10063] Follow-up: remove dead code related...

2017-02-03 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16796 [SPARK-10063] Follow-up: remove dead code related to an old output committer. ## What changes were proposed in this pull request? DirectParquetOutputCommitter was removed from Spark as it was

[GitHub] spark issue #16792: [SPARK-19453][PYTHON][SQL][DOC] Correct and extend DataF...

2017-02-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16792 lgtm

spark git commit: [SPARK-19411][SQL] Remove the metadata used to mark optional columns in merged Parquet schema for filter predicate pushdown

2017-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master c86a57f4d -> bf493686e [SPARK-19411][SQL] Remove the metadata used to mark optional columns in merged Parquet schema for filter predicate pushdown ## What changes were proposed in this pull request? There is a metadata introduced before t

[GitHub] spark issue #16756: [SPARK-19411][SQL] Remove the metadata used to mark opti...

2017-02-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16756 Merging in master.

[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2

2017-01-31 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16751 can you put rest of the cleanups in one place?

spark git commit: [SPARK-19409][BUILD] Bump parquet version to 1.8.2

2017-01-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master be7425e26 -> 26a4cba3f [SPARK-19409][BUILD] Bump parquet version to 1.8.2 ## What changes were proposed in this pull request? According to the discussion on #16281 which tried to upgrade toward Apache Parquet 1.9.0, Apache Spark community

[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2

2017-01-31 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16751 Merging in master.

spark git commit: [SPARK-19403][PYTHON][SQL] Correct pyspark.sql.column.__all__ list.

2017-01-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master ade075aed -> 06fbc3554 [SPARK-19403][PYTHON][SQL] Correct pyspark.sql.column.__all__ list. ## What changes were proposed in this pull request? This removes from the `__all__` list class names that are not defined (visible) in the `pyspark
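As a side note on why stale `__all__` entries matter: in Python, `from module import *` looks up every name listed in `__all__`, so a name that the module does not define makes the star import fail. The sketch below illustrates this with two throwaway modules built in memory. The module names `demo_column_ok` and `demo_column_broken`, the stand-in `Column` class, and the entry `NotDefinedHere` are hypothetical and are not taken from `pyspark.sql.column`.

```python
import sys
import types


def make_module(name, all_names):
    """Create an in-memory module that defines only `Column` and register it."""
    module = types.ModuleType(name)
    module.Column = type("Column", (), {})  # stand-in class, not pyspark's Column
    module.__all__ = list(all_names)
    sys.modules[name] = module
    return module


def star_import(module_name):
    """Run `from <module_name> import *` in a fresh namespace; return bound names."""
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    # exec adds __builtins__ to the namespace, so dunder names are filtered out.
    return sorted(name for name in namespace if not name.startswith("__"))


# __all__ lists only names that exist: the star import succeeds.
make_module("demo_column_ok", ["Column"])
ok_names = star_import("demo_column_ok")

# __all__ lists a name the module never defines: the star import raises.
make_module("demo_column_broken", ["Column", "NotDefinedHere"])
try:
    star_import("demo_column_broken")
    broken_error = None
except AttributeError as exc:
    broken_error = exc

print("ok module exports:", ok_names)
print("broken module error:", type(broken_error).__name__)
```

Removing the undefined entries from `__all__`, as the commit describes, is the direct way to keep star imports of such a module working.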

[GitHub] spark issue #16742: [SPARK-19403][PYTHON][SQL] Correct pyspark.sql.column.__...

2017-01-30 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16742 Merging in master.

[GitHub] spark issue #16742: [SPARK-19403][PYTHON][SQL] Correct pyspark.sql.column.__...

2017-01-30 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16742 LGTM, but can you update your description: ``` This removes from the __all__ list class names that are not defined (visible) in the pyspark.sql.column. ``` Your current

[GitHub] spark issue #16731: [SPARK-19393][SQL] Add `approx_percentile` Dataset/DataF...

2017-01-30 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16731 to be honest I really hate it with Scala/Java when we need to add so many functions just for a single function. Can we just tell users to use `expr("approx_percentile(...)")`?
