[GitHub] spark issue #13483: [SPARK-15688][SQL] RelationalGroupedDataset.toDF should ...

2016-06-08 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13483 I haven't looked at this PR in detail, but I'm the originator of this suggestion. It seemed odd to me that we were auto adding columns when I was already specifying them. I agree if the user

[GitHub] spark issue #13147: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-06-08 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13147 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #13147: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-06-08 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13147 LGTM, I'm going to merge this into master after it passes tests once more (just want to check to make sure it's not stale)

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-08 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r66347756 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PartitioningUtilsSuite.scala --- @@ -0,0 +1,36 @@ +/* + * Licensed

[GitHub] spark pull request #13486: [SPARK-15743][SQL] Prevent saving with all-column...

2016-06-08 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13486#discussion_r66347539 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -339,7 +339,7 @@ private[sql] object

[GitHub] spark issue #13484: [SPARK-15742][SQL] Reduce temp collections allocations i...

2016-06-02 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13484 LGTM How big of a difference in the benchmarks?

[GitHub] spark issue #13342: [SPARK-15593][SQL]Add DataFrameWriter.foreach to allow t...

2016-06-01 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13342 I'm okay with this interface, though I think we will probably need to add a less batch-oriented version in the future. Can you update the description?

[GitHub] spark pull request #13342: [SPARK-15593][SQL]Add DataFrameWriter.foreach to ...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13342#discussion_r65412780 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -401,6 +381,52 @@ final class DataFrameWriter private[sql](df

[GitHub] spark pull request #13342: [SPARK-15593][SQL]Add DataFrameWriter.foreach to ...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13342#discussion_r65412579 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala --- @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #13335: [SPARK-15580][SQL]Add ContinuousQueryInfo to make...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r65411829 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Offset.scala --- @@ -17,11 +17,14 @@ package

[GitHub] spark pull request #13335: [SPARK-15580][SQL]Add ContinuousQueryInfo to make...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r65410185 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Offset.scala --- @@ -17,11 +17,14 @@ package

[GitHub] spark pull request #13335: [SPARK-15580][SQL]Add ContinuousQueryInfo to make...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r65409940 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ContinuousQueryListenerBus.scala --- @@ -71,15 +70,15 @@ class

[GitHub] spark pull request #13335: [SPARK-15580][SQL]Add ContinuousQueryInfo to make...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r65409815 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ContinuousQueryListenerBus.scala --- @@ -69,14 +69,15 @@ class

[GitHub] spark pull request #13335: [SPARK-15580][SQL]Add ContinuousQueryInfo to make...

2016-06-01 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r65409474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/ContinuousQueryListener.scala --- @@ -69,27 +72,27 @@ abstract class

spark git commit: [SPARK-6320][SQL] Move planLater method into GenericStrategy.

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 a780848af -> 71e8aaeaa [SPARK-6320][SQL] Move planLater method into GenericStrategy. ## What changes were proposed in this pull request? This PR is the minimal version of #13147 for `branch-2.0`. ## How was this patch tested? Picked

[GitHub] spark issue #13426: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-06-01 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13426 @ueshin can you close this (PRs not against master don't auto close)

[GitHub] spark issue #13426: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-06-01 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13426 I'm going to go ahead and merge this so we get the API changes in before we cut any RCs. Thanks!

[1/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 9406a3c9a -> a780848af http://git-wip-us.apache.org/repos/asf/spark/blob/a780848a/sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --

[2/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
[SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming ## What changes were proposed in this pull request? This patch moves all user-facing structured streaming classes into sql.streaming. As part of this, I also added some since version annotation to methods and classes that

[1/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d5012c274 -> a71d1364a http://git-wip-us.apache.org/repos/asf/spark/blob/a71d1364/sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala -- diff

[GitHub] spark issue #13429: [SPARK-15686][SQL] Move user-facing streaming classes in...

2016-06-01 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13429 LGTM, merging to master and 2.0

[GitHub] spark pull request: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-05-31 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13426 @rxin thoughts on including this in 2.0? Seems safe to me.

spark git commit: [SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming

2016-05-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 8657942ce -> df4f87106 [SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming ## What changes were proposed in this pull request? Currently structured streaming only supports append output mode.

spark git commit: [SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming

2016-05-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master dfe2cbeb4 -> 90b11439b [SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming ## What changes were proposed in this pull request? Currently structured streaming only supports append output mode. This PR

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for complete o...

2016-05-31 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13286 LGTM

[GitHub] spark pull request: [SPARK-15443][SQL][Streaming] Properly explain continuou...

2016-05-31 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13221 > but still it cannot reflect the real plan in the run-time I think you could get really close to the actual plan in most cases by just substituting dummy nodes. We can indic

[GitHub] spark pull request: [SPARK-6320][SQL] Move planLater method into GenericStra...

2016-05-31 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13147 Quick question: Since we are really close to Spark 2.0 this is probably too big of a change to merge into `branch-2.0` (though @rxin should probably make that call). However, even though

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64976281 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15443][SQL][Streaming] Properly explain...

2016-05-27 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13221#issuecomment-69157 > Also a better solution is to find out a good solution to get the plan without really executing the query. Yeah this seems like the best solution to

[GitHub] spark pull request: [SPARK-15550][SQL] Dataset.show() should show ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13331#discussion_r64973277 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -436,20 +435,6 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request: [WIP][SPARK-6320][SQL] Move planLater method i...

2016-05-27 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13147#issuecomment-63996 This is great! In order to commit this it would be good to explain what is happening in detail (I think I follow, but it's very dense). I'd also like to see some

[GitHub] spark pull request: [WIP][SPARK-6320][SQL] Move planLater method i...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13147#discussion_r64972461 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/QueryPlanner.scala --- @@ -47,17 +55,27 @@ abstract class QueryPlanner

[GitHub] spark pull request: [WIP][SPARK-6320][SQL] Move planLater method i...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13147#discussion_r64972400 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/QueryPlanner.scala --- @@ -47,17 +55,27 @@ abstract class QueryPlanner

[GitHub] spark pull request: [WIP][SPARK-6320][SQL] Move planLater method i...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13147#discussion_r64972060 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/QueryPlanner.scala --- @@ -47,17 +55,27 @@ abstract class QueryPlanner

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64967749 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64967228 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,33 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13286#issuecomment-54170 LGTM with a few comments.

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64966959 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala --- @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64966038 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala --- @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64962858 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala --- @@ -237,15 +237,15 @@ class

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64962236 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -77,7 +77,47 @@ final class DataFrameWriter private[sql](df: DataFrame

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64962057 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64961894 --- Diff: python/pyspark/sql/readwriter.py --- @@ -500,6 +500,26 @@ def mode(self, saveMode): self._jwrite = self._jwrite.mode(saveMode

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13335#issuecomment-42009 Can you include the JSON produced as a sanity check?

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64960270 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,33 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64960190 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,33 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64960049 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,33 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64959957 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,33 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15593][SQL]Add DataFrameWriter.foreach ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13342#discussion_r64957648 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15593][SQL]Add DataFrameWriter.foreach ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13342#discussion_r64955801 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15593][SQL]Add DataFrameWriter.foreach ...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13342#discussion_r64954717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64954076 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,23 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64953837 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,23 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64953741 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,23 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64953510 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,23 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15580][SQL]Add ContinuousQueryInfo to m...

2016-05-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13335#discussion_r64953212 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala --- @@ -65,11 +65,23 @@ object ContinuousQueryListener

[GitHub] spark pull request: [SPARK-15515] [SQL] Error Handling in Running ...

2016-05-26 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13283#discussion_r64814173 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -448,6 +448,14 @@ class Analyzer

[GitHub] spark pull request: [SPARK-6320][SQL] Move planLater method into G...

2016-05-26 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13147#issuecomment-221968384 Unfortunately, I don't have the code anymore, but I can try to sketch out what I think the right solution looks like. Basically, the problem is that `planLater

[GitHub] spark pull request: [SPARK-15515] [SQL] Error Handling in Running ...

2016-05-26 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13283#discussion_r64795297 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -448,6 +448,14 @@ class Analyzer

[GitHub] spark pull request: [SPARK-15443][SQL][Streaming] Properly explain...

2016-05-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13221#issuecomment-221745872 I think it's unfortunate that you have to actually start the query before you can see what the physical plan looks like; that seems counter to the goal of explain

[GitHub] spark pull request: [SPARK-15543][SQL] Rename DefaultSources to ma...

2016-05-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13311#issuecomment-221745234 LGTM

[GitHub] spark pull request: [SPARK-15483][SQL] IncrementalExecution should...

2016-05-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13261#issuecomment-221673769 Thanks, merging to master and 2.0

spark git commit: [SPARK-15483][SQL] IncrementalExecution should use extra strategies.

2016-05-25 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 409eb28f7 -> 20cc2eb1b [SPARK-15483][SQL] IncrementalExecution should use extra strategies. ## What changes were proposed in this pull request? Extra strategies do not work for streams because `IncrementalExecution` uses modified

spark git commit: [SPARK-15483][SQL] IncrementalExecution should use extra strategies.

2016-05-25 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1cb347fbc -> 4b8806741 [SPARK-15483][SQL] IncrementalExecution should use extra strategies. ## What changes were proposed in this pull request? Extra strategies do not work for streams because `IncrementalExecution` uses modified

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13286#discussion_r64498668 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/OutputMode.java --- @@ -15,9 +15,10 @@ * limitations under the License

[GitHub] spark pull request: [SPARK-15458][SQL][STREAMING] Disable schema i...

2016-05-24 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13238#issuecomment-221402223 LGTM

[GitHub] spark pull request: [SPARK-15458][SQL][STREAMING] Disable schema i...

2016-05-23 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13238#discussion_r64291854 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -140,6 +140,18 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request: [SPARK-15458][SQL][STREAMING] Disable schema i...

2016-05-23 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13238#discussion_r64291689 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -165,19 +177,32 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request: [MINOR][SQL][DOCS] Add notes of the determinis...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-221099791 Good thing I got distracted :) I would have just changed the title while merging though.

spark git commit: [MINOR][SQL][DOCS] Add notes of the deterministic assumption on UDF functions

2016-05-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 c55a39c97 -> 80bf4ce30 [MINOR][SQL][DOCS] Add notes of the deterministic assumption on UDF functions ## What changes were proposed in this pull request? Spark assumes that UDF functions are deterministic. This PR adds explicit notes

spark git commit: [MINOR][SQL][DOCS] Add notes of the deterministic assumption on UDF functions

2016-05-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2585d2b32 -> 37c617e4f [MINOR][SQL][DOCS] Add notes of the deterministic assumption on UDF functions ## What changes were proposed in this pull request? Spark assumes that UDF functions are deterministic. This PR adds explicit notes

[GitHub] spark pull request: [SPARK-15458][SQL][STREAMING] Disable schema i...

2016-05-23 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13238#discussion_r64290699 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -186,6 +187,15 @@ case class DataSource

[GitHub] spark pull request: [SPARK-15458][SQL][STREAMING] Disable schema i...

2016-05-23 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13238#discussion_r64290791 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -186,6 +187,15 @@ case class DataSource

[GitHub] spark pull request: [SPARK-15282][SQL][DOCS] Add notes of the dete...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-221095790 merging to master and 2.0

[GitHub] spark pull request: [SPARK-15282][SQL][DOCS] Add notes of the dete...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-221095595 Sure, that's fine.

[GitHub] spark pull request: [SPARK-15282][SQL][DOCS] Add notes of the dete...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-221064956 LGTM pending tests. @linbojin, we should also handle your use case though maybe that should be its own JIRA. Perhaps you could open one with the information

spark git commit: [SPARK-15471][SQL] ScalaReflection cleanup

2016-05-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 80091b8a6 -> 07c36a2f0 [SPARK-15471][SQL] ScalaReflection cleanup ## What changes were proposed in this pull request? 1. simplify the logic of deserializing option type. 2. simplify the logic of serializing array type, and remove

[GitHub] spark pull request: [SPARK-15471][SQL] ScalaReflection cleanup

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13250#issuecomment-221050583 Thanks, merging to master and 2.0.

[GitHub] spark pull request: [SPARK-14557][SQL] Reading textfile (created t...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/12356#issuecomment-221046569 Is it possible to write unit tests for this?

[GitHub] spark pull request: [SPARK-14557][SQL] Reading textfile (created t...

2016-05-23 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/12356#issuecomment-221046599 ok to test

[GitHub] spark pull request: [SPARK-15282][SQL][DOCS] Add notes of the dete...

2016-05-23 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13087#discussion_r64256143 --- Diff: python/pyspark/sql/functions.py --- @@ -1756,6 +1756,7 @@ def __call__(self, *cols): @since(1.3) def udf(f, returnType=StringType

[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...

2016-05-21 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13226#issuecomment-220796372 LGTM

[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13210#issuecomment-220732096 LGTM pending grammar fixes.

spark git commit: [SPARK-10216][SQL] Revert "[] Avoid creating empty files during overwrit…

2016-05-20 Thread marmbrus
Michael Armbrust <mich...@databricks.com> Closes #13181 from marmbrus/revert12855. (cherry picked from commit 2ba3ff044900d16d5f6331523526f785864c1e62) Signed-off-by: Michael Armbrust <mich...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://g

spark git commit: [SPARK-10216][SQL] Revert "[] Avoid creating empty files during overwrit…

2016-05-20 Thread marmbrus
Michael Armbrust <mich...@databricks.com> Closes #13181 from marmbrus/revert12855. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2ba3ff04 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2ba3ff04 Diff: http://git-wip

[GitHub] spark pull request: Revert "[SPARK-10216][SQL] Avoid creating empt...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13181#issuecomment-220704424 I'm going to go ahead and merge this, but please ping me on follow up issues that try to add this back.

[GitHub] spark pull request: [SPARK-15454][SQL] Filter out files starting w...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13227#issuecomment-220701184 LGTM, pending tests.

[GitHub] spark pull request: [SPARK-15282][SQL] PushDownPredicate should no...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-220700839 My main question remains: does this actually make a difference in runtime? Or is execution smart enough already to do this optimization (even if to the user

[GitHub] spark pull request: [SPARK-15190][SQL]Support using SQLUserDefined...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/12965#issuecomment-220699835 Thanks, merging to master and 2.0

spark git commit: [SPARK-15190][SQL] Support using SQLUserDefinedType for case classes

2016-05-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 e99b22080 -> 42e63c35a [SPARK-15190][SQL] Support using SQLUserDefinedType for case classes ## What changes were proposed in this pull request? Right now inferring the schema for case classes happens before searching the

spark git commit: [SPARK-15190][SQL] Support using SQLUserDefinedType for case classes

2016-05-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 22947cd02 -> dfa61f7b1 [SPARK-15190][SQL] Support using SQLUserDefinedType for case classes ## What changes were proposed in this pull request? Right now inferring the schema for case classes happens before searching the

[GitHub] spark pull request: [SPARK-15282][SQL] Make ScalaUDF nondeterminis...

2016-05-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13087#issuecomment-220651359 I think we can have a way to mark a UDF as non-deterministic, but that is too large of a change to make it the default. Also, is this an actual performance

[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

2016-05-19 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13200#issuecomment-220516804 LGTM

[GitHub] spark pull request: [SPARK-15425][SQL] Disallow cartesian joins by...

2016-05-19 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13209#issuecomment-220504465 Yeah, that's a good idea. They might not even know which join is getting planned that way otherwise.

[GitHub] spark pull request: [SPARK-15425][SQL] Disallow cartesian joins by...

2016-05-19 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13209#discussion_r63984891 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -178,7 +178,12 @@ private[sql] abstract class SparkStrategies

spark git commit: [SPARK-15416][SQL] Display a better message for not finding classes removed in Spark 2.0

2016-05-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 664367781 -> 16ba71aba [SPARK-15416][SQL] Display a better message for not finding classes removed in Spark 2.0 ## What changes were proposed in this pull request? If finding `NoClassDefFoundError` or `ClassNotFoundException`, check if

spark git commit: [SPARK-15416][SQL] Display a better message for not finding classes removed in Spark 2.0

2016-05-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 e53a8f218 -> 7e25131a9 [SPARK-15416][SQL] Display a better message for not finding classes removed in Spark 2.0 ## What changes were proposed in this pull request? If finding `NoClassDefFoundError` or `ClassNotFoundException`, check

[GitHub] spark pull request: [SPARK-15416][SQL]Display a better message for...

2016-05-19 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/13201#issuecomment-220495636 Thanks! Merging to master and 2.0

[GitHub] spark pull request: [SPARK-15416][SQL]Display a better message for...

2016-05-19 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13201#discussion_r63973282 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -93,26 +101,45 @@ case class DataSource

[GitHub] spark pull request: [SPARK-15416][SQL]Display a better message for...

2016-05-19 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/13201#discussion_r63973212 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -93,26 +101,45 @@ case class DataSource
