[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112373858 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112373805 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112370906 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112370321 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112368956 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112368367 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112365872 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1635,21 +1636,49 @@ def toDF(self, *cols): return DataFrame(jdf, self.sql_ctx

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112365773 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1635,21 +1636,49 @@ def toDF(self, *cols): return DataFrame(jdf, self.sql_ctx

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112365501 --- Diff: python/pyspark/serializers.py --- @@ -182,6 +182,23 @@ def loads(self, obj): raise NotImplementedError +class

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15821 @BryanCutler Are you going to update this for arrow 0.3? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15821 Please move ArrowConverters.scala somewhere else that's not top level, e.g. execution.arrow

[GitHub] spark issue #17678: [SPARK-20381][SQL] Add SQL metrics of numOutputRows for ...

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17678 Jenkins, test this please.

[GitHub] spark issue #17678: [SPARK-20381][SQL] Add SQL metrics of numOutputRows for ...

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17678 Is there a codegen version we need to worry about?

spark git commit: Fixed typos in docs

2017-04-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master dd6d55d5d -> bdc605691 Fixed typos in docs ## What changes were proposed in this pull request? Typos at a couple of places in the docs. ## How was this patch tested? build including docs Please review

spark git commit: Fixed typos in docs

2017-04-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 e6bbdb0c5 -> 8d658b90b Fixed typos in docs ## What changes were proposed in this pull request? Typos at a couple of places in the docs. ## How was this patch tested? build including docs Please review

[GitHub] spark issue #17690: Fixed typos in docs

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17690 Thanks - merging in master/branch-2.2.

spark git commit: [SPARK-20398][SQL] range() operator should include cancellation reason when killed

2017-04-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 af9f18c31 -> e6bbdb0c5 [SPARK-20398][SQL] range() operator should include cancellation reason when killed ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-19820 adds a reason field for

spark git commit: [SPARK-20398][SQL] range() operator should include cancellation reason when killed

2017-04-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 39e303a8b -> dd6d55d5d [SPARK-20398][SQL] range() operator should include cancellation reason when killed ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-19820 adds a reason field for why

[GitHub] spark issue #17692: [SPARK-20398] [SQL] range() operator should include canc...

2017-04-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17692 Merging in master/branch-2.2.

[GitHub] spark issue #17648: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17648 Can we just do a logical rewrite to turn them into "condA + condB + condC > 0" (for Some/Any) ?
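
The rewrite suggested above can be illustrated with a minimal sketch. It assumes each condition contributes 1 when true and 0 when false, so that ANY/SOME becomes "sum > 0". The function name `any_via_sum` is hypothetical, SQL NULL semantics are ignored, and this is not the code from the pull request.

```python
def any_via_sum(conds):
    """ANY/SOME expressed as: (condA + condB + condC + ...) > 0.

    Each condition counts as 1 when true and 0 when false.
    Assumption: no NULL handling, unlike a real SQL aggregate.
    """
    return sum(1 if c else 0 for c in conds) > 0


# A few sample groups of conditions, compared against Python's built-in any().
samples = [[], [False, False], [False, True, False], [True, True]]
results = [any_via_sum(s) for s in samples]
```

On these samples the rewritten form agrees with the built-in `any()`, which is the property a logical rewrite of this kind would need to preserve.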

spark git commit: [TEST][MINOR] Replace repartitionBy with distribute in CollapseRepartitionSuite

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0075562dd -> 33ea908af [TEST][MINOR] Replace repartitionBy with distribute in CollapseRepartitionSuite ## What changes were proposed in this pull request? Replace non-existent `repartitionBy` with `distribute` in

[GitHub] spark issue #17657: [TEST][MINOR] Replace repartitionBy with distribute in C...

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17657 Merging in master

spark git commit: [SPARK-20349][SQL][REVERT-BRANCH2.1] ListFunctions returns duplicate functions after using persistent functions

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 622d7a8bf -> 3808b4728 [SPARK-20349][SQL][REVERT-BRANCH2.1] ListFunctions returns duplicate functions after using persistent functions Revert the changes of https://github.com/apache/spark/pull/17646 made in Branch 2.1, because it

[GitHub] spark issue #17661: [SPARK-20349] [SQL] [REVERT-Branch2.1] ListFunctions ret...

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17661 Merging in branch-2.1. Can you close your PR?

spark git commit: Typo fix: distitrbuted -> distributed

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master e5fee3e4f -> 0075562dd Typo fix: distitrbuted -> distributed ## What changes were proposed in this pull request? Typo fix: distitrbuted -> distributed ## How was this patch tested? Existing tests Author: Andrew Ash

[GitHub] spark issue #17664: Typo fix: distitrbuted -> distributed

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17664 Thanks - merging in master.

[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15398 I pushed a commit. Hopefully that fixes it.

spark git commit: [HOTFIX] Fix compilation.

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 db9517c16 -> 622d7a8bf [HOTFIX] Fix compilation. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/622d7a8b Tree:

spark git commit: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns.

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 7aad057b0 -> db9517c16 [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns. This patch fixes a bug in the way LIKE patterns are translated to Java regexes. The bug causes any character following an escaped backslash to be

[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...

2017-04-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15398 I've resolved the conflict and merged this in master/branch-2.1. Thanks.

spark git commit: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns.

2017-04-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 01ff0350a -> e5fee3e4f [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns. ## What changes were proposed in this pull request? This patch fixes a bug in the way LIKE patterns are translated to Java regexes. The bug causes any
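
To make the escaping issue concrete, here is a minimal sketch of translating a LIKE pattern to a regex. It is written in Python rather than Java, is not the code from the patch, and makes these assumptions: backslash is the escape character, `%` matches any run of characters, `_` matches one character, and a trailing lone backslash is treated as a literal. The names `like_to_regex` and `like` are hypothetical.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a LIKE pattern into a regex string (illustrative only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escape sequence: emit the next character literally and consume
            # both characters, so whatever follows is interpreted afresh.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")   # any run of characters
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    # (?s) lets "." also match newlines.
    return "(?s)" + "".join(out)


def like(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the LIKE pattern."""
    return re.fullmatch(like_to_regex(pattern), value) is not None
```

With this scheme, a pattern consisting of `a`, an escaped backslash, and `_` matches `a\b`, because the `_` after the escaped backslash is still read as a wildcard rather than being swallowed by the escape.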

[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17630 Thanks for the explanation.

[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17630 Wait - are we storing UTF8Strings directly in the catalog for statistics? That doesn't make sense ... if we are not, then we are not using internal types. In that case we should document clearly

[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-04-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17633 Then it should work.

[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-04-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17633 Does this work for non-Hive tables?

[GitHub] spark pull request #17623: [SPARK-20292][SQL][WIP] Clean up string represent...

2017-04-13 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17623#discussion_r111505420 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -149,7 +149,7 @@ case class Cast(child: Expression

[GitHub] spark pull request #17196: [SPARK-19855][SQL] Create an internal FilePartiti...

2017-04-13 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/17196

[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17630 When we update Spark and change the internal format, we'd still need to keep the current implementation.

[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17630 hm this means we will forever need to be able to read the internal format, doesn't it?

spark git commit: [SPARK-20302][SQL] Short circuit cast when from and to types are structurally the same

2017-04-12 Thread rxin
lly the same (having the same structure but different field names), we should be able to skip the actual cast. ## How was this patch tested? Added unit tests for the newly introduced functions. Author: Reynold Xin <r...@databricks.com> Closes #17614 from rxin/SPARK-20302. Project: http:
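
The idea of "same structure but different field names" can be sketched as a recursive comparison that ignores names. The type encoding below (atomic types as strings, `("array", element)` and `("struct", [(name, type), ...])` as tuples) and the function `same_structure` are hypothetical and chosen only for illustration; they are not Spark's `DataType` API, and nullability is not modeled.

```python
def same_structure(a, b):
    """Compare two toy type descriptions, ignoring struct field names."""
    # Atomic types are plain strings and must match exactly.
    if isinstance(a, str) or isinstance(b, str):
        return a == b
    kind_a, body_a = a
    kind_b, body_b = b
    if kind_a != kind_b:
        return False
    if kind_a == "array":
        return same_structure(body_a, body_b)
    if kind_a == "struct":
        # Same number of fields, and each pair of field types matches;
        # the field names themselves are deliberately not compared.
        return len(body_a) == len(body_b) and all(
            same_structure(ta, tb) for (_, ta), (_, tb) in zip(body_a, body_b)
        )
    return False


t1 = ("struct", [("x", "int"), ("y", ("array", "string"))])
t2 = ("struct", [("a", "int"), ("b", ("array", "string"))])
t3 = ("struct", [("a", "long"), ("b", ("array", "string"))])
```

Here `t1` and `t2` differ only in field names, so a cast between them could be skipped under this check, while `t3` changes a field type and would still require a real cast.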

[GitHub] spark issue #17614: [SPARK-20302][SQL] Short circuit cast when from and to t...

2017-04-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17614 Merging in master.

[GitHub] spark pull request #17616: [SPARK-20304][SQL] AssertNotNull should not inclu...

2017-04-12 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17616 [SPARK-20304][SQL] AssertNotNull should not include path in string representation ## What changes were proposed in this pull request? AssertNotNull's toString/simpleString dumps the entire

[GitHub] spark issue #17616: [SPARK-20304][SQL] AssertNotNull should not include path...

2017-04-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17616 cc @cloud-fan

[GitHub] spark pull request #17614: [SPARK-20302][SQL] Short circuit cast when from a...

2017-04-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17614#discussion_r111064001 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala --- @@ -288,4 +288,30 @@ object DataType { case (fromDataType

[GitHub] spark issue #17604: [SPARK-20289][SQL] Use StaticInvoke to box primitive typ...

2017-04-11 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17604 Merging in master.

spark git commit: [SPARK-20289][SQL] Use StaticInvoke to box primitive types

2017-04-11 Thread rxin
tor). Instead, it'd be slightly more idiomatic in Java to use PrimitiveType.valueOf, which can be invoked using StaticInvoke expression. ## How was this patch tested? The change should be covered by existing tests for Dataset encoders. Author: Reynold Xin <r...@databricks.com> Closes #17604

[GitHub] spark pull request #17604: [SPARK-20289][SQL] Use StaticInvoke to box primit...

2017-04-11 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17604 [SPARK-20289][SQL] Use StaticInvoke to box primitive types ## What changes were proposed in this pull request? Dataset typed API currently uses NewInstance to box primitive types (i.e. calling

spark git commit: [SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay

2017-04-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 f40e44de8 -> 8eb71b81f [SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay ## What changes were proposed in this pull request? This PR fixes the following failure: ``` sbt.ForkMain$ForkError:

spark git commit: [SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay

2017-04-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 379b0b0bb -> 734dfbfcf [SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay ## What changes were proposed in this pull request? This PR fixes the following failure: ``` sbt.ForkMain$ForkError:

[GitHub] spark issue #17599: [SPARK-17564][Tests]Fix flaky RequestTimeoutIntegrationS...

2017-04-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17599 Merging in master/branch-2.1.

[GitHub] spark issue #17599: [SPARK-17564][Tests]Fix flaky RequestTimeoutIntegrationS...

2017-04-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17599 LGTM pending Jenkins.

[GitHub] spark issue #17596: [SPARK-12837][SQL] reduce the serialized size of accumul...

2017-04-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17596 BTW a potential, better way to solve this is to combine all the metrics into a single accumulator.

[GitHub] spark pull request #17596: [SPARK-12837][SQL] reduce the serialized size of ...

2017-04-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17596#discussion_r110765367 --- Diff: core/src/main/scala/org/apache/spark/util/InternalLongAccumulator.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software

spark git commit: [SPARK-20283][SQL] Add preOptimizationBatches

2017-04-10 Thread rxin
hes so the optimizer debugging extensions are symmetric. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #17595 from rxin/SPARK-20283. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

[GitHub] spark issue #17595: [SPARK-20283][SQL] Add preOptimizationBatches

2017-04-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17595 Merging this since as long as it compiles the change should be fine.

[GitHub] spark pull request #17595: [SPARK-20283][SQL] Add preOptimizationBatches

2017-04-10 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17595 [SPARK-20283][SQL] Add preOptimizationBatches ## What changes were proposed in this pull request? We currently have postHocOptimizationBatches, but not preOptimizationBatches. This patch adds

[GitHub] spark issue #17592: [SPARK-20243][TESTS] DebugFilesystem.assertNoOpenStreams...

2017-04-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17592 Should this go into branch-2.1 as well?

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17574 Meh let's not bother. There isn't any harm in the current setup since it's already a transitive dependency. Why waste time on those?

[GitHub] spark pull request #17574: [SPARK-20264][SQL] asm should be non-test depende...

2017-04-07 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17574 [SPARK-20264][SQL] asm should be non-test dependency in sql/core ## What changes were proposed in this pull request? sql/core module currently declares asm as a test scope dependency. Transitively

[GitHub] spark pull request #17573: [SPARK-20262][SQL] AssertNotNull should throw Nul...

2017-04-07 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17573 [SPARK-20262][SQL] AssertNotNull should throw NullPointerException ## What changes were proposed in this pull request? AssertNotNull currently throws RuntimeException. It should throw

spark git commit: [SPARK-20255] Move listLeafFiles() to InMemoryFileIndex

2017-04-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1ad73f0a2 -> 589f3edb8 [SPARK-20255] Move listLeafFiles() to InMemoryFileIndex ## What changes were proposed in this pull request Trying to get a grip on the `FileIndex` hierarchy, I was confused by the following inconsistency: On the

[GitHub] spark issue #17570: [SPARK-20255] Move listLeafFiles() to InMemoryFileIndex

2017-04-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17570 Merging in master/branch-2.1.

[GitHub] spark issue #17570: [SPARK-20255] Move listLeafFiles() to InMemoryFileIndex

2017-04-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17570 LGTM pending Jenkins.

[GitHub] spark issue #17570: [SPARK-20255] Move listLeafFiles() to InMemoryFileIndex

2017-04-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17570 Jenkins, add to whitelist.

[GitHub] spark pull request #17555: [SPARK-19495][SQL] Make SQLConf slightly more ext...

2017-04-06 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17555 [SPARK-19495][SQL] Make SQLConf slightly more extensible - addendum ## What changes were proposed in this pull request? This is a tiny addendum to SPARK-19495 to remove the private visibility

spark git commit: [MINOR][DOCS] Fix typo in Hive Examples

2017-04-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master a4491626e -> 8129d59d0 [MINOR][DOCS] Fix typo in Hive Examples ## What changes were proposed in this pull request? Fix typo in hive examples from "DaraFrames" to "DataFrames" ## How was this patch tested? N/A Please review

[GitHub] spark issue #17554: [MINOR][DOCS] Fix typo in Hive Examples

2017-04-06 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17554 Thanks - merging in master.

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-05 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110013198 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/broadcastMode.scala --- @@ -26,10 +26,7 @@ import

spark git commit: Small doc fix for ReuseSubquery.

2017-04-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master c1b8b6675 -> b6e71032d Small doc fix for ReuseSubquery. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b6e71032 Tree:

[GitHub] spark issue #17471: [SPARK-3577] Report Spill size on disk for UnsafeExterna...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17471 cc @cloud-fan / @ueshin / @sameeragarwal can you review this?

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17521 @nsyca can you look into it?

[2/2] spark git commit: [SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias

2017-04-04 Thread rxin
[SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias ## What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/17285 . ## How was this patch tested? existing tests Author: Wenchen Fan Closes

[1/2] spark git commit: [SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias

2017-04-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0e2ee8204 -> 402bf2a50 http://git-wip-us.apache.org/repos/asf/spark/blob/402bf2a5/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/PlanTest.scala -- diff

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17521 Merging in master.

spark git commit: [SPARK-18278][SCHEDULER] Documentation to point to Kubernetes cluster scheduler

2017-04-04 Thread rxin
out-of-repo in https://github.com/apache-spark-on-k8s/spark cc rxin srowen tnachen ash211 mccheah erikerlandson ## How was this patch tested? Docs only change Author: Anirudh Ramanathan <fox...@users.noreply.github.com> Author: foxish <ramanath...@google.com> Closes #17522 from foxi

[GitHub] spark issue #17522: [SPARK-18278] [Scheduler] Documentation to point to Kube...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17522 Thanks - merging in master.

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17521 LGTM

[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-04-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17499 Great - please close this. Thanks!

spark git commit: [SPARK-20145] Fix range case insensitive bug in SQL

2017-04-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 703c42c39 -> 58c9e6e77 [SPARK-20145] Fix range case insensitive bug in SQL ## What changes were proposed in this pull request? Range in SQL should be case insensitive ## How was this patch tested? unit test Author: samelamin

[GitHub] spark issue #17487: [Spark-20145] Fix range case insensitive bug in SQL

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17487 Merging in master.

[GitHub] spark pull request #17505: [SPARK-20187][SQL] Replace loadTable with moveFil...

2017-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17505#discussion_r109553390 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -242,6 +251,16 @@ private[client] class Shim_v0_12 extends Shim

[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17499 Maybe Hive can do it in Hive.

[GitHub] spark issue #17521: [SPARK-20204][SQL] separate SQLConf into catalyst confs ...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17521 To be clear, I don't think we should have two separate places to define config entries. If this is what the pr is doing, I strongly veto.

[GitHub] spark issue #17522: [SPARK-18278] [Scheduler] Documentation to point to Kube...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17522 Seems fine to me, since the number of external resource managers are small. We should definitely make it clear there is no firm commitment currently to merge this into Spark though.

[GitHub] spark issue #17518: [SPARK-20198] [SQL] Remove the inconsistency in table/fu...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17518 Is this an API change or just a documentation change? The title suggests you are changing public facing APIs?

spark git commit: [SPARK-20151][SQL] Account for partition pruning in scan metadataTime metrics

2017-03-31 Thread rxin
I'm not sure if this is worth it. Author: Reynold Xin <r...@databricks.com> Closes #17476 from rxin/SPARK-20151. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a8a765b3 Tree: http://git-wip-us.apache.org/repos/asf/s

[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...

2017-03-31 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17476 Merging in master.

[GitHub] spark issue #17490: [SPARK-20167]In SqlBase.g4,some of the comments is not c...

2017-03-31 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17490 I don't think the change makes sense ...

[GitHub] spark pull request #17476: [SPARK-20151][SQL] Account for partition pruning ...

2017-03-30 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17476#discussion_r109092194 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileIndex.scala --- @@ -72,4 +72,14 @@ trait FileIndex { /** Schema

[GitHub] spark pull request #17476: [SPARK-20151][SQL] Account for partition pruning ...

2017-03-30 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17476#discussion_r109092246 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CatalogFileIndex.scala --- @@ -111,7 +113,8 @@ private class

[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17476 cc @ericl, @bogdanrdc, @adrian-ionescu, @cloud-fan

[GitHub] spark pull request #17476: [SPARK-20151][SQL] Account for partition pruning ...

2017-03-29 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17476 [SPARK-20151][SQL] Account for partition pruning in scan metadataTime metrics ## What changes were proposed in this pull request? After SPARK-20136, we report metadata timing metrics in scan

spark git commit: [SPARK-20148][SQL] Extend the file commit API to allow subscribing to task commit messages

2017-03-29 Thread rxin
tch tested? Unit tests. cc rxin Author: Eric Liang <e...@databricks.com> Closes #17475 from ericl/file-commit-api-ext. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/79636054 Tree: http://git-wip-us.apache.org/repos/

[GitHub] spark issue #17475: [SPARK-20148] [SQL] Extend the file commit API to allow ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17475 Merging in master.

[GitHub] spark issue #17465: [SPARK-20136][SQL] Add num files and metadata operation ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17465 Let me merge this now. I will send a follow-up PR to take the logical planning time into account (otherwise in the vast majority of cases, i.e. pruned partitions, the metadata operation time

[GitHub] spark issue #17465: [SPARK-20136][SQL] Add num files and metadata operation ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17465 Merging in master.

spark git commit: [SPARK-20136][SQL] Add num files and metadata operation timing to scan operator metrics

2017-03-29 Thread rxin
tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #17465 from rxin/SPARK-20136. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/60977889 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/60977889 Diff: ht

[GitHub] spark issue #17470: [SPARK-20146][SQL] fix comment missing issue for thrift ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17470 Merging in master. Thanks.

spark git commit: [SPARK-20146][SQL] fix comment missing issue for thrift server

2017-03-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master dd2e7d528 -> 22f07fefe [SPARK-20146][SQL] fix comment missing issue for thrift server ## What changes were proposed in this pull request? The column comment was missing while constructing the Hive TableSchema. This fix will preserve the

[GitHub] spark issue #17475: [SPARK-20148] [SQL] Extend the file commit API to allow ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17475 LGTM
