spark git commit: Tiny style improvement.

2016-12-19 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master f923c849e -> 150d26cad

Tiny style improvement.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/150d26ca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/150d26ca

spark git commit: [SPARK-18899][SPARK-18912][SPARK-18913][SQL] refactor the error checking when append data to an existing table

2016-12-19 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master fa829ce21 -> f923c849e

[SPARK-18899][SPARK-18912][SPARK-18913][SQL] refactor the error checking when append data to an existing table

## What changes were proposed in this pull request?

When we append data to an existing table with `DataF…

spark git commit: [SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors

2016-12-19 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/master 5857b9ac2 -> fa829ce21

[SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors

## What changes were proposed in this pull request?

Spark's current task cancellation / task killing mechanism is "best effort" because …
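The "reaper" idea described above can be sketched as a watchdog that escalates when a best-effort kill is ignored. The sketch below is illustrative only (the class and method names are hypothetical, not Spark's actual `TaskReaper` implementation):

```scala
// Hypothetical sketch of a "task reaper": a monitor that polls a killed
// task's thread and escalates if the task fails to exit within a timeout.
class TaskReaperSketch(taskThread: Thread, timeoutMs: Long) extends Runnable {
  override def run(): Unit = {
    taskThread.interrupt()     // best-effort cancellation first
    taskThread.join(timeoutMs) // give the task a grace period to exit
    if (taskThread.isAlive) {
      // Escalate: e.g. log a thread dump, or kill the JVM if configured to,
      // rather than silently leaving a zombie task running.
      System.err.println(s"Task thread ${taskThread.getName} ignored kill request")
    }
  }
}
```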

spark git commit: [SPARK-18928] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter

2016-12-19 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/master 4cb49412d -> 5857b9ac2

[SPARK-18928] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter

## What changes were proposed in this pull request?

In order to respond to task cancellation, Spark tasks must periodically check …
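The polling pattern this patch applies to the named RDDs can be sketched as a wrapper iterator that checks the kill flag between records. This is an illustrative pattern, not the exact code from the commit (Spark itself throws a `TaskKilledException` rather than `InterruptedException`):

```scala
// Illustrative pattern: a long-running record iterator that polls the task's
// kill flag on each hasNext() call, so cancellation takes effect promptly
// instead of only at task boundaries.
import org.apache.spark.TaskContext

def interruptible[T](underlying: Iterator[T]): Iterator[T] = new Iterator[T] {
  private val context = TaskContext.get() // null outside a running task

  override def hasNext: Boolean = {
    if (context != null && context.isInterrupted()) {
      throw new InterruptedException("Task was killed") // stop promptly
    }
    underlying.hasNext
  }

  override def next(): T = underlying.next()
}
```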

spark git commit: [SPARK-18928] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter

2016-12-19 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 c1a26b458 -> f07e989c0

[SPARK-18928] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter

## What changes were proposed in this pull request?

In order to respond to task cancellation, Spark tasks must periodically …

spark git commit: [SPARK-18836][CORE] Serialize one copy of task metrics in DAGScheduler

2016-12-19 Thread kayousterhout
Repository: spark
Updated Branches:
  refs/heads/master 70d495dce -> 4cb49412d

[SPARK-18836][CORE] Serialize one copy of task metrics in DAGScheduler

## What changes were proposed in this pull request?

Right now we serialize the empty task metrics once per task. Since this is shared across …
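The optimization described above amounts to serializing the shared object once and handing each task a cheap view of the bytes. A minimal sketch, assuming a Spark `SerializerInstance` (this is illustrative, not the actual `DAGScheduler` code):

```scala
// Sketch: serialize the shared (empty) metrics object a single time and
// reuse the resulting bytes for every task, instead of re-serializing the
// same object once per task.
import java.nio.ByteBuffer
import org.apache.spark.serializer.SerializerInstance

def serializeOnce(metrics: AnyRef,
                  ser: SerializerInstance,
                  numTasks: Int): Seq[ByteBuffer] = {
  val once: ByteBuffer = ser.serialize(metrics) // one serialization pass
  // duplicate() shares the underlying bytes; each task gets its own
  // independent position/limit, so this is cheap per task.
  Seq.fill(numTasks)(once.duplicate())
}
```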

spark git commit: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-19 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/master 7a75ee1c9 -> 70d495dce

[SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

## What changes were proposed in this pull request?

Currently `ImplicitTypeCasts` doesn't handle casts between `ArrayType`s, this is not convenient, we should …
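The kind of query this affects can be illustrated as follows; the example is an assumption about the intended behavior (casting between array types by widening the element type), not taken from the commit itself:

```scala
// Illustrative only: an expression expecting a common array type receives
// array<int> and array<double>. With casts between ArrayTypes supported,
// the analyzer can widen array(1, 2) : array<int> to array<double> instead
// of failing analysis. (spark is an assumed SparkSession.)
spark.sql("SELECT coalesce(array(1, 2), array(1.5D, 2.5D))").show()
```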

spark git commit: [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

2016-12-19 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 fc1b25660 -> c1a26b458

[SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

## What changes were proposed in this pull request?

It's weird that we use `Hive.getDatabase` to check the existence of …

spark git commit: [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

2016-12-19 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/master 24482858e -> 7a75ee1c9

[SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

## What changes were proposed in this pull request?

It's weird that we use `Hive.getDatabase` to check the existence of a database …
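The before/after contrast can be sketched as follows, assuming the Hive metastore client class `org.apache.hadoop.hive.ql.metadata.Hive` (helper names are hypothetical):

```scala
// Illustrative contrast of the two patterns (helper names hypothetical).
import org.apache.hadoop.hive.ql.metadata.Hive

// Before: infer "absent" from a null result (or an exception) out of a
// call whose real job is to fetch the database's metadata.
def existsViaGet(client: Hive, db: String): Boolean =
  try { client.getDatabase(db) != null }
  catch { case _: Exception => false }

// After: ask the existence question directly.
def existsDirect(client: Hive, db: String): Boolean =
  client.databaseExists(db)
```

Besides reading better, the direct call avoids paying for a full metadata fetch (and exception handling) just to answer a yes/no question.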

spark git commit: Fix test case for SubquerySuite.

2016-12-19 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 b41668349 -> 2a5ab1490

Fix test case for SubquerySuite.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2a5ab149
Tree: http://git-wip-us.apache.org/repos/asf/spark/…

spark git commit: [SPARK-18700][SQL] Add StripedLock for each table's relation in cache

2016-12-19 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 3080f995c -> fc1b25660

[SPARK-18700][SQL] Add StripedLock for each table's relation in cache

## What changes were proposed in this pull request?

As the scenario described in [SPARK-18700](https://issues.apache.org/jira/browse/SPARK-187…

spark git commit: [SPARK-18700][SQL] Add StripedLock for each table's relation in cache

2016-12-19 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/master 7db09abb0 -> 24482858e

[SPARK-18700][SQL] Add StripedLock for each table's relation in cache

## What changes were proposed in this pull request?

As the scenario described in [SPARK-18700](https://issues.apache.org/jira/browse/SPARK-18700), …
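A striped lock keyed by table name serializes concurrent loads of the *same* table's cached relation without making unrelated tables contend on one global lock. A minimal sketch using Guava's `Striped` utility (the helper name is hypothetical; the stripe count is arbitrary):

```scala
// Sketch: a bounded pool of locks keyed by table name. Two loads of the
// same table take the same lock; loads of different tables usually take
// different locks and proceed in parallel.
import com.google.common.util.concurrent.Striped
import java.util.concurrent.locks.Lock

val tableLocks: Striped[Lock] = Striped.lazyWeakLock(1024)

def withTableLock[A](tableName: String)(body: => A): A = {
  val lock = tableLocks.get(tableName)
  lock.lock()
  try body finally lock.unlock()
}
```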

spark git commit: [SPARK-18356][ML] KMeans should cache RDD before training

2016-12-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 1e5c51f33 -> 7db09abb0

[SPARK-18356][ML] KMeans should cache RDD before training

## What changes were proposed in this pull request?

At the request of Mr. Joseph Bradley, I did this update of my PR https://github.com/apache/spark/p…
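The rationale for caching here is that k-means is iterative: every iteration re-reads the full training set, so an uncached input is recomputed from source on each pass. A minimal caller-side sketch with the RDD-based `KMeans` API (`sc` is an assumed `SparkContext`; the file path is hypothetical):

```scala
// Illustrative: persist the training vectors before the iterative k-means
// loop, so each of the maxIterations passes reuses the materialized data
// instead of recomputing the parse from the source file.
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.storage.StorageLevel

val data = sc.textFile("data.txt")                      // hypothetical input
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .persist(StorageLevel.MEMORY_AND_DISK)                // cache before iterating

val model = KMeans.train(data, 2, 20)                   // k = 2, maxIterations = 20
```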