spark git commit: [TESTS][SQL] Setup testdata at the beginning for tests to run independently

2017-01-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 256a3a801 -> 9effc2cdc [TESTS][SQL] Setup testdata at the beginning for tests to run independently ## What changes were proposed in this pull request? In CachedTableSuite, we are not setting up the test data at the beginning. Some tests

spark git commit: [SPARK-18020][STREAMING][KINESIS] Checkpoint SHARD_END to finish reading closed shards

2017-01-25 Thread brkyvz
Repository: spark Updated Branches: refs/heads/master 233845126 -> 256a3a801 [SPARK-18020][STREAMING][KINESIS] Checkpoint SHARD_END to finish reading closed shards ## What changes were proposed in this pull request? This PR fixes an issue that occurred when resharding Kinesis streams; the

spark git commit: [SPARK-18495][UI] Document meaning of green dot in DAG visualization

2017-01-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 47d5d0ddb -> 233845126 [SPARK-18495][UI] Document meaning of green dot in DAG visualization ## What changes were proposed in this pull request? A green dot in the DAG visualization apparently means that the referenced RDD is cached. This

spark git commit: [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD

2017-01-25 Thread tdas
Repository: spark Updated Branches: refs/heads/master 965c82d8c -> 47d5d0ddb [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD ## What changes were proposed in this pull request? EdgeRDD/VertexRDD overrides checkpoint() and isCheckpointed() to forward these to the internal

spark git commit: [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD

2017-01-25 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 a5c10ff23 -> 0d7e38524 [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD ## What changes were proposed in this pull request? EdgeRDD/VertexRDD overrides checkpoint() and isCheckpointed() to forward these to the

spark git commit: [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD

2017-01-25 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 00a48075a -> 48a8dc8b8 [SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD ## What changes were proposed in this pull request? EdgeRDD/VertexRDD overrides checkpoint() and isCheckpointed() to forward these to the

spark git commit: [SPARK-19064][PYSPARK] Fix pip installing of sub components

2017-01-25 Thread holden
Repository: spark Updated Branches: refs/heads/branch-2.1 97d3353ef -> a5c10ff23 [SPARK-19064][PYSPARK] Fix pip installing of sub components ## What changes were proposed in this pull request? Fix installation of mllib and ml sub components, and more eagerly clean up cache files during test

spark git commit: [SPARK-19064][PYSPARK] Fix pip installing of sub components

2017-01-25 Thread holden
Repository: spark Updated Branches: refs/heads/master 92afaa93a -> 965c82d8c [SPARK-19064][PYSPARK] Fix pip installing of sub components ## What changes were proposed in this pull request? Fix installation of mllib and ml sub components, and more eagerly clean up cache files during test

[1/2] spark git commit: [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers.

2017-01-25 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.0 886f73737 -> 00a48075a [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers. That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was

[2/2] spark git commit: [SPARK-18750][YARN] Follow up: move test to correct directory in 2.1 branch.

2017-01-25 Thread vanzin
[SPARK-18750][YARN] Follow up: move test to correct directory in 2.1 branch. Author: Marcelo Vanzin Closes #16704 from vanzin/SPARK-18750_2.1. (cherry picked from commit 97d3353ef16a6e6edc93d8177b08442a03e19eee) Signed-off-by: Marcelo Vanzin

spark git commit: [SPARK-18750][YARN] Follow up: move test to correct directory in 2.1 branch.

2017-01-25 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.1 c9f075abb -> 97d3353ef [SPARK-18750][YARN] Follow up: move test to correct directory in 2.1 branch. Author: Marcelo Vanzin Closes #16704 from vanzin/SPARK-18750_2.1. Project:

spark git commit: [SPARK-19307][PYSPARK] Make sure user conf is propagated to SparkContext.

2017-01-25 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.1 af9545538 -> c9f075abb [SPARK-19307][PYSPARK] Make sure user conf is propagated to SparkContext. The code was failing to propagate the user conf in the case where the JVM was already initialized, which happens when a user submits a

spark git commit: [SPARK-19307][PYSPARK] Make sure user conf is propagated to SparkContext.

2017-01-25 Thread vanzin
Repository: spark Updated Branches: refs/heads/master f6480b146 -> 92afaa93a [SPARK-19307][PYSPARK] Make sure user conf is propagated to SparkContext. The code was failing to propagate the user conf in the case where the JVM was already initialized, which happens when a user submits a python

spark git commit: [SPARK-19311][SQL] fix UDT hierarchy issue

2017-01-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f1ddca5fc -> f6480b146 [SPARK-19311][SQL] fix UDT hierarchy issue ## What changes were proposed in this pull request? acceptType() in UDT will not only accept the same type but also all base types ## How was this patch tested? Manual test
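
A rough illustration of the idea in the entry above, not the actual Spark code: the sketch models a type wrapper whose acceptance check admits both the same user class and classes related to it by inheritance. The names `SketchUDT`, `accepts_type`, `Shape`, and `Circle` are hypothetical. The direction of the check (a wrapper for a base class accepts a wrapper for a derived class) is an assumption about what the commit message means.

```python
class SketchUDT:
    """Hypothetical stand-in for a user-defined type wrapper; not Spark's API."""

    def __init__(self, user_class):
        # The plain class that this wrapper describes.
        self.user_class = user_class

    def accepts_type(self, other):
        # Anything that is not a SketchUDT is rejected outright.
        if not isinstance(other, SketchUDT):
            return False
        # issubclass(X, X) is True, so the same-type case is covered,
        # and any class derived from our user class is accepted as well.
        return issubclass(other.user_class, self.user_class)


class Shape:
    pass


class Circle(Shape):
    pass


shape_udt = SketchUDT(Shape)
circle_udt = SketchUDT(Circle)
```

Under these assumptions, `shape_udt.accepts_type(circle_udt)` holds while `circle_udt.accepts_type(shape_udt)` does not. A check based on exact class equality would reject both.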

spark git commit: [SPARK-18863][SQL] Output non-aggregate expressions without GROUP BY in a subquery does not yield an error

2017-01-25 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 f391ad2c8 -> af9545538 [SPARK-18863][SQL] Output non-aggregate expressions without GROUP BY in a subquery does not yield an error ## What changes were proposed in this pull request? This PR will report proper error messages when a

spark git commit: [SPARK-18863][SQL] Output non-aggregate expressions without GROUP BY in a subquery does not yield an error

2017-01-25 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 0e821ec6f -> f1ddca5fc [SPARK-18863][SQL] Output non-aggregate expressions without GROUP BY in a subquery does not yield an error ## What changes were proposed in this pull request? This PR will report proper error messages when a

spark git commit: [SPARK-19313][ML][MLLIB] GaussianMixture should limit the number of features

2017-01-25 Thread yliang
Repository: spark Updated Branches: refs/heads/master 76db394f2 -> 0e821ec6f [SPARK-19313][ML][MLLIB] GaussianMixture should limit the number of features ## What changes were proposed in this pull request? The following test will fail on current master scala test("gmm fails on high

spark git commit: [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers.

2017-01-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 3fdce8143 -> 76db394f2 [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers. That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested