spark git commit: [SPARK-11216][SQL][FOLLOW-UP] add encoder/decoder for external row

2015-10-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f6d06adf0 -> 42d225f44 [SPARK-11216][SQL][FOLLOW-UP] add encoder/decoder for external row Addresses comments in https://github.com/apache/spark/pull/9184 Author: Wenchen Fan Closes #9212 from cloud-fan/encoder.

spark git commit: [SPARK-11232][CORE] Use 'offer' instead of 'put' to make sure calling send won't be interrupted

2015-10-22 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 42d225f44 -> 7bb6d31cf [SPARK-11232][CORE] Use 'offer' instead of 'put' to make sure calling send won't be interrupted The current `NettyRpcEndpointRef.send` can be interrupted because it uses `LinkedBlockingQueue.put`, which may hang
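The distinction the commit relies on can be seen with plain `java.util.concurrent` (illustration only, not Spark's `NettyRpcEndpointRef` code):

```java
import java.util.concurrent.LinkedBlockingQueue;

public class OfferVsPut {
    public static void main(String[] args) {
        LinkedBlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        // put() blocks when a bounded queue is full and throws
        // InterruptedException if the waiting thread is interrupted.
        // offer() never waits: it returns a boolean immediately, and on an
        // unbounded LinkedBlockingQueue it always succeeds, so the sending
        // thread can never be parked (and thus interrupted) mid-send.
        boolean accepted = inbox.offer("message-1");
        System.out.println(accepted + " " + inbox.size());  // prints "true 1"
    }
}
```

Since the queue backing the endpoint is unbounded, swapping `put` for `offer` loses nothing while removing the interruption point.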

spark git commit: [SPARK-11121][CORE] Correct the TaskLocation type

2015-10-22 Thread srowen
Repository: spark Updated Branches: refs/heads/master 1d9733271 -> c03b6d115 [SPARK-11121][CORE] Correct the TaskLocation type Correct the logic to return an `HDFSCacheTaskLocation` instance when the input `str` is an in-memory location. Author: zhichao.li Closes #9096
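The shape of the fix can be sketched as prefix-tagged parsing (the tag prefixes and the helper name here are assumptions for illustration, not Spark's exact code):

```java
public class TaskLocationSketch {
    // A location string carrying the HDFS-cache tag should parse to the
    // cached-block location type, not fall through to a plain host location.
    static String taskLocationType(String str) {
        if (str.startsWith("hdfs_cache_")) return "HDFSCacheTaskLocation";
        if (str.startsWith("executor_"))   return "ExecutorCacheTaskLocation";
        return "HostTaskLocation";
    }

    public static void main(String[] args) {
        System.out.println(taskLocationType("hdfs_cache_host1"));
        // prints "HDFSCacheTaskLocation"
    }
}
```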

spark git commit: [SPARK-9735][SQL] Respect the user-specified schema over the inferred partition schema for HadoopFsRelation

2015-10-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 3535b91dd -> d4950e6be [SPARK-9735][SQL] Respect the user-specified schema over the inferred partition schema for HadoopFsRelation To enable the unit test of `hadoopFsRelationSuite.Partition column type casting`. It previously threw

spark git commit: [SPARK-11116][SQL] First Draft of Dataset API

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 188ea348f -> 53e83a3a7 [SPARK-11116][SQL] First Draft of Dataset API *This PR adds a new experimental API to Spark, tentatively named Datasets.* A `Dataset` is a strongly-typed collection of objects that can be transformed in parallel
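What "strongly typed and transformed in parallel" buys can be illustrated with plain `java.util.stream` (this is an analogy only, not the new Spark API):

```java
import java.util.List;
import java.util.stream.Collectors;

public class TypedParallelSketch {
    public static void main(String[] args) {
        // "Strongly typed": the compiler checks each transformation's element
        // type (String -> int here), unlike untyped Row-based operations.
        // "In parallel": the runtime may split the work across threads, while
        // collect() still preserves the encounter order of the source.
        List<Integer> lengths = List.of("ab", "abc", "a").parallelStream()
                .map(String::length)
                .collect(Collectors.toList());
        System.out.println(lengths);  // prints "[2, 3, 1]"
    }
}
```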

[2/2] spark git commit: [SPARK-10812] [YARN] Fix shutdown of token renewer.

2015-10-22 Thread vanzin
[SPARK-10812] [YARN] Fix shutdown of token renewer. A recent change to fix the referenced bug caused this exception in the `SparkContext.stop()` path: org.apache.spark.SparkException: YarnSparkHadoopUtil is not available in non-YARN mode! at

[1/2] spark git commit: [SPARK-10812] [YARN] Spark hadoop util support switching to yarn

2015-10-22 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.5 f9ad0e543 -> e405c2a1f [SPARK-10812] [YARN] Spark hadoop util support switching to yarn While this is likely not a huge issue for real production systems, for test systems which may set up a Spark Context and tear it down and stand up

spark git commit: [SPARK-11242][SQL] In conf/spark-env.sh.template SPARK_DRIVER_MEMORY is documented incorrectly

2015-10-22 Thread srowen
Repository: spark Updated Branches: refs/heads/master d4950e6be -> 188ea348f [SPARK-11242][SQL] In conf/spark-env.sh.template SPARK_DRIVER_MEMORY is documented incorrectly Minor fix on the comment Author: guoxi Closes #9201 from xguo27/SPARK-11242. Project:

spark git commit: [SPARK-7021] Add JUnit output for Python unit tests

2015-10-22 Thread davies
Repository: spark Updated Branches: refs/heads/master 53e83a3a7 -> 163d53e82 [SPARK-7021] Add JUnit output for Python unit tests WIP Author: Gábor Lipták Closes #8323 from gliptak/SPARK-7021. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11251] Fix page size calculation in local mode

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 e405c2a1f -> a76cf51ed [SPARK-11251] Fix page size calculation in local mode ``` // My machine only has 8 cores $ bin/spark-shell --master local[32] scala> val df = sc.parallelize(Seq((1, 1), (2, 2))).toDF("a", "b") scala>

spark git commit: [SPARK-11251] Fix page size calculation in local mode

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 163d53e82 -> 34e71c6d8 [SPARK-11251] Fix page size calculation in local mode ``` // My machine only has 8 cores $ bin/spark-shell --master local[32] scala> val df = sc.parallelize(Seq((1, 1), (2, 2))).toDF("a", "b") scala>
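The shape of the bug is visible in a hedged sketch of the calculation (the constants, clamping range, and method name here are assumptions for illustration, not Spark's exact code): the per-task page size shrinks as the core count grows, so it must be derived from the cores the user requested (`local[32]`), not from the machine's physical core count (8).

```java
public class PageSizeSketch {
    static long pageSizeBytes(long maxMemory, int cores) {
        long safetyFactor = 16;                 // assumed head-room divisor
        long size = maxMemory / cores / safetyFactor;
        long min = 1L << 20, max = 64L << 20;   // clamp to 1 MB .. 64 MB
        // round down to a power of two within the clamped range
        return Math.max(min, Math.min(max, Long.highestOneBit(size)));
    }

    public static void main(String[] args) {
        // With 1 GB of memory: sizing for 8 cores vs the requested 32 cores
        // yields very different pages; using the wrong count oversizes them.
        System.out.println(pageSizeBytes(1L << 30, 8));   // prints "8388608"
        System.out.println(pageSizeBytes(1L << 30, 32));  // prints "2097152"
    }
}
```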

[2/2] spark git commit: Preparing development version 1.5.3-SNAPSHOT

2015-10-22 Thread pwendell
Preparing development version 1.5.3-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/be3e3434 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/be3e3434 Diff:

[1/2] spark git commit: Preparing Spark release v1.5.2-rc1

2015-10-22 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.5 a76cf51ed -> be3e34345 Preparing Spark release v1.5.2-rc1 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ad6ade12 Tree:

Git Push Summary

2015-10-22 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.2-rc1 [created] ad6ade124

spark git commit: [SPARK-11098][CORE] Add Outbox to cache the sending messages to resolve the message disorder issue

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 34e71c6d8 -> a88c66ca8 [SPARK-11098][CORE] Add Outbox to cache the sending messages to resolve the message disorder issue The current NettyRpc has a message order issue because it uses a thread pool to send messages. E.g., running the
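The Outbox idea can be sketched in a few lines (class and method names here are assumptions, not Spark's code): messages for one remote peer go into a per-peer FIFO queue, and at most one caller drains it at a time, so a thread pool can never reorder two sends to the same peer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class OutboxSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean draining = false;

    void send(String msg, List<String> wire) {
        synchronized (this) {
            pending.add(msg);
            if (draining) return;   // the active drainer will flush this too
            draining = true;
        }
        while (true) {
            String next;
            synchronized (this) {
                next = pending.poll();
                if (next == null) { draining = false; return; }
            }
            wire.add(next);         // stands in for the real network write
        }
    }

    public static void main(String[] args) {
        OutboxSketch outbox = new OutboxSketch();
        List<String> wire = new ArrayList<>();
        outbox.send("a", wire);
        outbox.send("b", wire);
        System.out.println(wire);   // prints "[a, b]"
    }
}
```

Because only the thread that flipped `draining` writes to the wire, per-peer ordering holds even when many pool threads call `send` concurrently.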

spark git commit: [SPARK-11134][CORE] Increase LauncherBackendSuite timeout.

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master a88c66ca8 -> fa6a4fbf0 [SPARK-11134][CORE] Increase LauncherBackendSuite timeout. This test can take a little while to finish on slow / loaded machines. Author: Marcelo Vanzin Closes #9235 from vanzin/SPARK-11134.

spark git commit: Fix a (very tiny) typo

2015-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master fa6a4fbf0 -> b1c1597e3 Fix a (very tiny) typo Author: Jacek Laskowski Closes #9230 from jaceklaskowski/utils-seconds-typo. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

[4/4] spark git commit: [SPARK-10708] Consolidate sort shuffle implementations

2015-10-22 Thread joshrosen
[SPARK-10708] Consolidate sort shuffle implementations There's a lot of duplication between SortShuffleManager and UnsafeShuffleManager. Given that these now provide the same set of functionality (UnsafeShuffleManager now supports large records), I think that we should replace

[3/4] spark git commit: [SPARK-10708] Consolidate sort shuffle implementations

2015-10-22 Thread joshrosen
http://git-wip-us.apache.org/repos/asf/spark/blob/f6d06adf/core/src/main/java/org/apache/spark/shuffle/unsafe/UnsafeShuffleExternalSorter.java -- diff --git

[2/4] spark git commit: [SPARK-10708] Consolidate sort shuffle implementations

2015-10-22 Thread joshrosen
http://git-wip-us.apache.org/repos/asf/spark/blob/f6d06adf/core/src/main/scala/org/apache/spark/util/collection/PartitionedSerializedPairBuffer.scala -- diff --git

[1/4] spark git commit: [SPARK-10708] Consolidate sort shuffle implementations

2015-10-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 94e2064fa -> f6d06adf0 http://git-wip-us.apache.org/repos/asf/spark/blob/f6d06adf/core/src/test/scala/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriterSuite.scala