spark git commit: [SPARK-11453][SQL][FOLLOW-UP] remove DecimalLit

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master bc5d6c038 -> 253e87e8a [SPARK-11453][SQL][FOLLOW-UP] remove DecimalLit A cleanup for https://github.com/apache/spark/pull/9085. The `DecimalLit` is very similar to `FloatLit`, we can just keep one of them. Also added low level unit test at `SqlParserSuite`

spark git commit: [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 253e87e8a -> cf69ce136 [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used Just ignored `InputDStream`s that have null `rememberDuration` in `DStreamGraph.getMaxInputStreamRememberDuration`. Author: Shixiong Zhu Closes #94

spark git commit: [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 1cfad7d55 -> 0a430f04e [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used Just ignored `InputDStream`s that have null `rememberDuration` in `DStreamGraph.getMaxInputStreamRememberDuration`. Author: Shixiong Zhu Closes

spark git commit: [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.5 b8b1fbfc8 -> dc058f2ff [SPARK-11511][STREAMING] Fix NPE when an InputDStream is not used Just ignored `InputDStream`s that have null `rememberDuration` in `DStreamGraph.getMaxInputStreamRememberDuration`. Author: Shixiong Zhu Closes
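The null-safe maximum described in these entries can be modeled in plain Python (an illustrative sketch, not Spark's actual Scala code in `DStreamGraph`):

```python
# Illustrative model of the fix: when computing the max remember duration,
# skip input streams whose duration was never set, instead of hitting an NPE.
def max_remember_duration(durations):
    # None stands in for a null rememberDuration on an unused InputDStream.
    valid = [d for d in durations if d is not None]
    return max(valid) if valid else None

print(max_remember_duration([None, 30, 120]))  # 120
print(max_remember_duration([None, None]))     # None
```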

spark git commit: [SPARK-9162] [SQL] Implement code generation for ScalaUDF

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/master cf69ce136 -> 574141a29 [SPARK-9162] [SQL] Implement code generation for ScalaUDF JIRA: https://issues.apache.org/jira/browse/SPARK-9162 Currently ScalaUDF extends CodegenFallback and doesn't provide code generation implementation. This pa

spark git commit: [SPARK-9162] [SQL] Implement code generation for ScalaUDF

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 0a430f04e -> d69bc9e47 [SPARK-9162] [SQL] Implement code generation for ScalaUDF JIRA: https://issues.apache.org/jira/browse/SPARK-9162 Currently ScalaUDF extends CodegenFallback and doesn't provide code generation implementation. Thi

spark git commit: [SPARK-10978][SQL][FOLLOW-UP] More comprehensive tests for PR #9399

2015-11-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 d69bc9e47 -> fba48e733 [SPARK-10978][SQL][FOLLOW-UP] More comprehensive tests for PR #9399 This PR adds test cases that test various column pruning and filter push-down cases. Author: Cheng Lian Closes #9468 from liancheng/spark-109

spark git commit: [SPARK-10978][SQL][FOLLOW-UP] More comprehensive tests for PR #9399

2015-11-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 574141a29 -> c048929c6 [SPARK-10978][SQL][FOLLOW-UP] More comprehensive tests for PR #9399 This PR adds test cases that test various column pruning and filter push-down cases. Author: Cheng Lian Closes #9468 from liancheng/spark-10978.f

spark git commit: [SPARK-9858][SQL] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins (follow-up)

2015-11-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c048929c6 -> 8211aab07 [SPARK-9858][SQL] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins (follow-up) https://issues.apache.org/jira/browse/SPARK-9858 This PR is the follow-up work of h

spark git commit: [SPARK-9858][SQL] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins (follow-up)

2015-11-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 fba48e733 -> 089ba81d1 [SPARK-9858][SQL] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins (follow-up) https://issues.apache.org/jira/browse/SPARK-9858 This PR is the follow-up work

[4/4] spark git commit: [SPARK-11453][SQL][FOLLOW-UP] remove DecimalLit

2015-11-06 Thread rxin
[SPARK-11453][SQL][FOLLOW-UP] remove DecimalLit A cleanup for https://github.com/apache/spark/pull/9085. The `DecimalLit` is very similar to `FloatLit`, we can just keep one of them. Also added low level unit test at `SqlParserSuite` Author: Wenchen Fan Closes #9482 from cloud-fan/parser. (ch

spark git commit: [SPARK-11457][STREAMING][YARN] Fix incorrect AM proxy filter conf recovery from checkpoint

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 089ba81d1 -> 9d6238859 [SPARK-11457][STREAMING][YARN] Fix incorrect AM proxy filter conf recovery from checkpoint Currently Yarn AM proxy filter configuration is recovered from checkpoint file when Spark Streaming application is resta

[2/4] spark git commit: [SPARK-11540][SQL] API audit for QueryExecutionListener.

2015-11-06 Thread rxin
[SPARK-11540][SQL] API audit for QueryExecutionListener. Author: Reynold Xin Closes #9509 from rxin/SPARK-11540. (cherry picked from commit 3cc2c053b5d68c747a30bd58cf388b87b1922f13) Signed-off-by: Reynold Xin Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-u

[1/4] spark git commit: [SPARK-11538][BUILD] Force guava 14 in sbt build.

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 9d6238859 -> e86499954 [SPARK-11538][BUILD] Force guava 14 in sbt build. sbt's version resolution code always picks the most recent version, and we don't want that for guava. Author: Marcelo Vanzin Closes #9508 from vanzin/SPARK-1153

[3/4] spark git commit: [SPARK-11541][SQL] Break JdbcDialects.scala into multiple files and mark various dialects as private.

2015-11-06 Thread rxin
[SPARK-11541][SQL] Break JdbcDialects.scala into multiple files and mark various dialects as private. Author: Reynold Xin Closes #9511 from rxin/SPARK-11541. (cherry picked from commit bc5d6c03893a9bd340d6b94d3550e25648412241) Signed-off-by: Reynold Xin Project: http://git-wip-us.apache.org

spark git commit: Typo fixes + code readability improvements

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 8211aab07 -> 62bb29077 Typo fixes + code readability improvements Author: Jacek Laskowski Closes #9501 from jaceklaskowski/typos-with-style. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org

spark git commit: [SPARK-10116][CORE] XORShiftRandom.hashSeed is random in high bits

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 e86499954 -> 7755b50b4 [SPARK-10116][CORE] XORShiftRandom.hashSeed is random in high bits https://issues.apache.org/jira/browse/SPARK-10116 This is really trivial, just happened to notice it -- if `XORShiftRandom.hashSeed` is really s

spark git commit: [SPARK-10116][CORE] XORShiftRandom.hashSeed is random in high bits

2015-11-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 62bb29077 -> 49f1a8203 [SPARK-10116][CORE] XORShiftRandom.hashSeed is random in high bits https://issues.apache.org/jira/browse/SPARK-10116 This is really trivial, just happened to notice it -- if `XORShiftRandom.hashSeed` is really suppo
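The gist of the fix, as far as the commit description goes, is deriving the low and high 32-bit halves of the seed hash separately so the full 64-bit result is well mixed. A rough plain-Python model (hashlib stands in for the hash Spark actually uses; `hash_seed` here is illustrative, not Spark's implementation):

```python
import hashlib

def hash_seed(seed: int) -> int:
    # Hash the seed's 8 bytes twice with different salts and combine the
    # two 32-bit digests, so both halves of the 64-bit result are mixed.
    data = seed.to_bytes(8, "big", signed=True)
    low = int.from_bytes(hashlib.sha256(data).digest()[:4], "big")
    high = int.from_bytes(hashlib.sha256(data + b"hi").digest()[:4], "big")
    return (high << 32) | low
```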

spark git commit: [SPARK-11450] [SQL] Add Unsafe Row processing to Expand

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 49f1a8203 -> f328fedaf [SPARK-11450] [SQL] Add Unsafe Row processing to Expand This PR enables the Expand operator to process and produce Unsafe Rows. Author: Herman van Hovell Closes #9414 from hvanhovell/SPARK-11450. Project: http://

spark git commit: [SPARK-11450] [SQL] Add Unsafe Row processing to Expand

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 7755b50b4 -> efa1e4a25 [SPARK-11450] [SQL] Add Unsafe Row processing to Expand This PR enables the Expand operator to process and produce Unsafe Rows. Author: Herman van Hovell Closes #9414 from hvanhovell/SPARK-11450. Project: htt

spark git commit: [SPARK-11561][SQL] Rename text data source's column name to value.

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 efa1e4a25 -> 52e921c7c [SPARK-11561][SQL] Rename text data source's column name to value. Author: Reynold Xin Closes #9527 from rxin/SPARK-11561. (cherry picked from commit 3a652f691b220fada0286f8d0a562c5657973d4d) Signed-off-by: Rey

spark git commit: [SPARK-11561][SQL] Rename text data source's column name to value.

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master f328fedaf -> 3a652f691 [SPARK-11561][SQL] Rename text data source's column name to value. Author: Reynold Xin Closes #9527 from rxin/SPARK-11561. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-11217][ML] save/load for non-meta estimators and transformers

2015-11-06 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 3a652f691 -> c447c9d54 [SPARK-11217][ML] save/load for non-meta estimators and transformers This PR implements the default save/load for non-meta estimators and transformers using the JSON serialization of param values. The saved metadata

spark git commit: [SPARK-11217][ML] save/load for non-meta estimators and transformers

2015-11-06 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 52e921c7c -> e7e3bfda3 [SPARK-11217][ML] save/load for non-meta estimators and transformers This PR implements the default save/load for non-meta estimators and transformers using the JSON serialization of param values. The saved metad
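The persistence scheme described here, metadata as JSON-serialized param values, can be sketched as follows (hypothetical helper names, not Spark's actual save/load API):

```python
import json
import os
import tempfile

def save_metadata(path, uid, param_map):
    # Persist the estimator's uid and its param values as JSON metadata.
    with open(path, "w") as f:
        json.dump({"uid": uid, "paramMap": param_map}, f)

def load_metadata(path):
    # Restore the uid and param values from the JSON metadata file.
    with open(path) as f:
        meta = json.load(f)
    return meta["uid"], meta["paramMap"]

path = os.path.join(tempfile.mkdtemp(), "metadata.json")
save_metadata(path, "binarizer_4f2a", {"threshold": 0.5})
uid, params = load_metadata(path)
```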

spark git commit: [SPARK-11555] spark on yarn spark-class --num-workers doesn't work

2015-11-06 Thread vanzin
Repository: spark Updated Branches: refs/heads/master c447c9d54 -> f6680cdc5 [SPARK-11555] spark on yarn spark-class --num-workers doesn't work I tested the various options with both spark-submit and spark-class of specifying number of executors in both client and cluster mode where it applie

spark git commit: [SPARK-11555] spark on yarn spark-class --num-workers doesn't work

2015-11-06 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.6 e7e3bfda3 -> b58f1ce5b [SPARK-11555] spark on yarn spark-class --num-workers doesn't work I tested the various options with both spark-submit and spark-class of specifying number of executors in both client and cluster mode where it ap

spark git commit: [SPARK-11555] spark on yarn spark-class --num-workers doesn't work

2015-11-06 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.5 dc058f2ff -> 8fb6696cd [SPARK-11555] spark on yarn spark-class --num-workers doesn't work I tested the various options with both spark-submit and spark-class of specifying number of executors in both client and cluster mode where it ap

spark git commit: [SPARK-11269][SQL] Java API support & test cases for Dataset

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master f6680cdc5 -> 7e9a9e603 [SPARK-11269][SQL] Java API support & test cases for Dataset This simply brings https://github.com/apache/spark/pull/9358 up-to-date. Author: Wenchen Fan Author: Reynold Xin Closes #9528 from rxin/dataset-java.

spark git commit: [SPARK-11269][SQL] Java API support & test cases for Dataset

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 b58f1ce5b -> 02748c953 [SPARK-11269][SQL] Java API support & test cases for Dataset This simply brings https://github.com/apache/spark/pull/9358 up-to-date. Author: Wenchen Fan Author: Reynold Xin Closes #9528 from rxin/dataset-java

spark git commit: [SPARK-11410] [PYSPARK] Add python bindings for repartition and sortW…

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/master 7e9a9e603 -> 1ab72b086 [SPARK-11410] [PYSPARK] Add python bindings for repartition and sortWithinPartitions. Author: Nong Li Closes #9504 from nongli/spark-11410. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

spark git commit: [SPARK-11410] [PYSPARK] Add python bindings for repartition and sortW…

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 02748c953 -> 40a5db561 [SPARK-11410] [PYSPARK] Add python bindings for repartition and sortWithinPartitions. Author: Nong Li Closes #9504 from nongli/spark-11410. (cherry picked from commit 1ab72b08601a1c8a674bdd3fab84d980489
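What `sortWithinPartitions` gives you, each partition sorted independently with no total order across partitions, can be modeled in plain Python (a conceptual sketch, not the PySpark API itself):

```python
def sort_within_partitions(partitions):
    # Sort each partition on its own; rows never move between partitions,
    # so no shuffle is needed and no global ordering is produced.
    return [sorted(p) for p in partitions]

parts = [[5, 3, 8], [1, 9, 2]]
print(sort_within_partitions(parts))  # [[3, 5, 8], [1, 2, 9]]
```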

spark git commit: [SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1ab72b086 -> 6d0ead322 [SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule The second PR for SPARK-9241, this adds support for multiple distinct columns to the new aggregation code path. This PR solves the multiple

spark git commit: [SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 40a5db561 -> 162a7704c [SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule The second PR for SPARK-9241, this adds support for multiple distinct columns to the new aggregation code path. This PR solves the mult
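The expand-style rewrite these entries refer to can be illustrated in miniature: duplicate each row once per DISTINCT aggregate, tag it with a group id, de-duplicate per group, then aggregate. This is a conceptual sketch, not the actual Catalyst rule:

```python
# COUNT(DISTINCT col0), COUNT(DISTINCT col1) over the same input, one pass.
rows = [(1, "a"), (1, "a"), (2, "b"), (2, "c")]

# Expand: one tagged copy of each row per distinct aggregate (gid = which one).
expanded = [(gid, row[gid]) for row in rows for gid in (0, 1)]

# De-duplicate per group, then count the survivors of each group.
distinct = set(expanded)
counts = [sum(1 for g, _ in distinct if g == i) for i in (0, 1)]
print(counts)  # [2, 3]
```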

spark git commit: [SPARK-11546] Thrift server makes too many logs about result schema

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 6d0ead322 -> 1c80d66e5 [SPARK-11546] Thrift server makes too many logs about result schema SparkExecuteStatementOperation logs result schema for each getNextRowSet() calls which is by default every 1000 rows, overwhelming whole log file.

spark git commit: [SPARK-11546] Thrift server makes too many logs about result schema

2015-11-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 162a7704c -> 9bf77d5c3 [SPARK-11546] Thrift server makes too many logs about result schema SparkExecuteStatementOperation logs result schema for each getNextRowSet() calls which is by default every 1000 rows, overwhelming whole log fil
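The de-duplication idea behind this fix, logging the result schema once per statement rather than on every getNextRowSet() batch, can be sketched like so (illustrative names, not Spark's actual classes):

```python
class StatementOperation:
    def __init__(self, schema):
        self.schema = schema
        self.schema_log_count = 0  # how many times the schema was emitted

    def next_row_set(self):
        # Emit the schema only for the first batch, not every 1000 rows.
        if self.schema_log_count == 0:
            print(f"Result schema: {self.schema}")
            self.schema_log_count += 1
        return []  # placeholder for the actual batch of rows
```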

spark git commit: [HOTFIX] Fix python tests after #9527

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1c80d66e5 -> 105732dcc [HOTFIX] Fix python tests after #9527 #9527 missed updating the python tests. Author: Michael Armbrust Closes #9533 from marmbrus/hotfixTextValue. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

spark git commit: [HOTFIX] Fix python tests after #9527

2015-11-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 9bf77d5c3 -> 68c1d9fa6 [HOTFIX] Fix python tests after #9527 #9527 missed updating the python tests. Author: Michael Armbrust Closes #9533 from marmbrus/hotfixTextValue. (cherry picked from commit 105732dcc6b651b9779f4a5773a759c5b4f

[1/2] spark git commit: [SPARK-11389][CORE] Add support for off-heap memory to MemoryManager

2015-11-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 105732dcc -> 30b706b7b http://git-wip-us.apache.org/repos/asf/spark/blob/30b706b7/core/src/test/scala/org/apache/spark/memory/StaticMemoryManagerSuite.scala -- diff --git a

[2/2] spark git commit: [SPARK-11389][CORE] Add support for off-heap memory to MemoryManager

2015-11-06 Thread joshrosen
[SPARK-11389][CORE] Add support for off-heap memory to MemoryManager In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory. ## User-facing changes This PR introduces a new configuratio

[2/2] spark git commit: [SPARK-11389][CORE] Add support for off-heap memory to MemoryManager

2015-11-06 Thread joshrosen
[SPARK-11389][CORE] Add support for off-heap memory to MemoryManager In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory. ## User-facing changes This PR introduces a new configuratio

[1/2] spark git commit: [SPARK-11389][CORE] Add support for off-heap memory to MemoryManager

2015-11-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 68c1d9fa6 -> e2546c227 http://git-wip-us.apache.org/repos/asf/spark/blob/e2546c22/core/src/test/scala/org/apache/spark/memory/StaticMemoryManagerSuite.scala -- diff --gi
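The bookkeeping these MemoryManager entries describe, separate accounting for on-heap and off-heap bytes, can be modeled minimally as follows (hypothetical class and method names, not Spark's API):

```python
class SimpleMemoryManager:
    def __init__(self, on_heap_max, off_heap_max):
        # Independent pools with independent limits per memory mode.
        self.limits = {"on_heap": on_heap_max, "off_heap": off_heap_max}
        self.used = {"on_heap": 0, "off_heap": 0}

    def acquire(self, mode, num_bytes):
        # Grant only if that pool has room; the other pool is unaffected.
        if self.used[mode] + num_bytes <= self.limits[mode]:
            self.used[mode] += num_bytes
            return True
        return False

    def release(self, mode, num_bytes):
        self.used[mode] = max(0, self.used[mode] - num_bytes)
```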

spark git commit: [SPARK-11112] DAG visualization: display RDD callsite

2015-11-06 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 30b706b7b -> 7f741905b [SPARK-11112] DAG visualization: display RDD callsite (screenshot: https://cloud.githubusercontent.com/assets/2133137/10870343/2a8cd070-807d-11e5-857a-4ebcace77b5b.png) mateiz sarutak Author: Andrew Or Closes #9398 from andre

spark git commit: [SPARK-8467] [MLLIB] [PYSPARK] Add LDAModel.describeTopics() in Python

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/master 7f741905b -> 2ff0e79a8 [SPARK-8467] [MLLIB] [PYSPARK] Add LDAModel.describeTopics() in Python Could jkbradley and davies review it? - Create a wrapper class: `LDAModelWrapper` for `LDAModel`. Because we can't deal with the return value of

spark git commit: [SPARK-8467] [MLLIB] [PYSPARK] Add LDAModel.describeTopics() in Python

2015-11-06 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 e2546c227 -> aede729a9 [SPARK-8467] [MLLIB] [PYSPARK] Add LDAModel.describeTopics() in Python Could jkbradley and davies review it? - Create a wrapper class: `LDAModelWrapper` for `LDAModel`. Because we can't deal with the return valu