git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 1bed0a386 -> 00362dac9 [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" 9b225ac3072de522b40b46aba6df1f1c231f13ef has been causing GraphX tests to fail nondeterministically, which is blocking development for others. A

git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.0 8dd7690e2 -> 4d3ab2925 [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" 9b225ac3072de522b40b46aba6df1f1c231f13ef has been causing GraphX tests to fail nondeterministically, which is blocking development for others

git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 f41c45a75 -> 8c40ab5c0 [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" 9b225ac3072de522b40b46aba6df1f1c231f13ef has been causing GraphX tests to fail nondeterministically, which is blocking development for others

git commit: [SPARK-3372] [MLlib] MLlib doesn't pass maven build / checkstyle due to multi-byte character contained in Gradient.scala

2014-09-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.1 3111501ea -> f41c45a75 [SPARK-3372] [MLlib] MLlib doesn't pass maven build / checkstyle due to multi-byte character contained in Gradient.scala Author: Kousuke Saruta Closes #2248 from sarutak/SPARK-3372 and squashes the following co

git commit: [SPARK-3372] [MLlib] MLlib doesn't pass maven build / checkstyle due to multi-byte character contained in Gradient.scala

2014-09-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 7c6e71f05 -> 1bed0a386 [SPARK-3372] [MLlib] MLlib doesn't pass maven build / checkstyle due to multi-byte character contained in Gradient.scala Author: Kousuke Saruta Closes #2248 from sarutak/SPARK-3372 and squashes the following commit

git commit: [SPARK-2435] Add shutdown hook to pyspark

2014-09-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master c5cbc4923 -> 7c6e71f05 [SPARK-2435] Add shutdown hook to pyspark Author: Matthew Farrellee Closes #2183 from mattf/SPARK-2435 and squashes the following commits: ee0ee99 [Matthew Farrellee] [SPARK-2435] Add shutdown hook to pyspark Pro

git commit: [SPARK-3335] [SQL] [PySpark] support broadcast in Python UDF

2014-09-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 248067adb -> c5cbc4923 [SPARK-3335] [SQL] [PySpark] support broadcast in Python UDF After this patch, broadcast can be used in Python UDF. Author: Davies Liu Closes #2243 from davies/udf_broadcast and squashes the following commits: 7b8

git commit: [SPARK-2961][SQL] Use statistics to prune batches within cached partitions

2014-09-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f48420fde -> 248067adb [SPARK-2961][SQL] Use statistics to prune batches within cached partitions This PR is based on #1883 authored by marmbrus. Key differences: 1. Batch pruning instead of partition pruning When #1883 was authored, b
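
As a rough illustration of the batch-pruning idea named in this commit, here is a minimal sketch in plain Python. It is not the Spark SQL implementation: the `Batch` class, the helper names, and the choice of per-batch minimum and maximum as the statistics are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Batch:
    """Hypothetical cached batch carrying simple min/max statistics."""
    rows: List[int]
    lo: int  # smallest value in the batch
    hi: int  # largest value in the batch


def make_batches(values: List[int], batch_size: int) -> List[Batch]:
    """Split values into fixed-size batches and record statistics for each."""
    batches = []
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        batches.append(Batch(chunk, min(chunk), max(chunk)))
    return batches


def scan_greater_than(batches: List[Batch], threshold: int) -> Tuple[List[int], int]:
    """Return rows with value > threshold and the number of batches read."""
    scanned = 0
    matches: List[int] = []
    for batch in batches:
        # If the largest value cannot satisfy the predicate, skip the batch
        # without looking at any of its rows.
        if batch.hi <= threshold:
            continue
        scanned += 1
        matches.extend(v for v in batch.rows if v > threshold)
    return matches, scanned


batches = make_batches(list(range(100)), batch_size=10)
result, scanned = scan_greater_than(batches, threshold=84)
print(len(batches), scanned, len(result))
```

Only batches whose statistics allow a match are read; the rest are skipped without touching their rows.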

git commit: [SPARK-2973][SQL] Lightweight SQL commands without distributed jobs when calling .collect()

2014-09-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4bba10c41 -> f48420fde [SPARK-2973][SQL] Lightweight SQL commands without distributed jobs when calling .collect() By overriding `executeCollect()` in physical plan classes of all commands, we can avoid kicking off a distributed job when

git commit: [SPARK-3233] Executor never stops its SparkEnv, BlockManager, ConnectionManager etc.

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master e08ea7393 -> 4bba10c41 [SPARK-3233] Executor never stops its SparkEnv, BlockManager, ConnectionManager etc. Author: Kousuke Saruta Closes #2138 from sarutak/SPARK-3233 and squashes the following commits: c0205b7 [Kousuke Saruta] Merge br

git commit: [SPARK-3303][core] fix SparkContextSchedulerCreationSuite test error

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master a52240792 -> e08ea7393 [SPARK-3303][core] fix SparkContextSchedulerCreationSuite test error Run the test on the master branch with this command when the Mesos native lib is set: sbt/sbt -Phive "test-only org.apache.spark.SparkContextSchedulerCreat

git commit: [SPARK-2419][Streaming][Docs] Updates to the streaming programming guide

2014-09-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master 996b7434e -> a52240792 [SPARK-2419][Streaming][Docs] Updates to the streaming programming guide Updated the main streaming programming guide, and also added source-specific guides for Kafka, Flume, Kinesis. Author: Tathagata Das Author:

git commit: [SPARK-2419][Streaming][Docs] Updates to the streaming programming guide

2014-09-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.1 37b10086b -> 3111501ea [SPARK-2419][Streaming][Docs] Updates to the streaming programming guide Updated the main streaming programming guide, and also added source-specific guides for Kafka, Flume, Kinesis. Author: Tathagata Das Auth

git commit: [SPARK-3345] Do correct parameters for ShuffleFileGroup

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 2784822e4 -> 996b7434e [SPARK-3345] Do correct parameters for ShuffleFileGroup In the method `newFileGroup` of class `FileShuffleBlockManager`, the parameters for creating a new `ShuffleFileGroup` object are in the wrong order. Because in curren

git commit: [Minor] Fix outdated Spark version

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master f2b5b619a -> 2784822e4 [Minor] Fix outdated Spark version This is causing the event logs to include a file called SPARK_VERSION_1.0.0, which is not accurate. Author: Andrew Or Author: andrewor14 Closes #2255 from andrewor14/spark-versi

git commit: [SPARK-3388] Expose application ID in ApplicationStart event, use it in history server.

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master ccc69e26e -> f2b5b619a [SPARK-3388] Expose application ID in ApplicationStart event, use it in history server. This change exposes the application ID generated by the Spark Master, Mesos or Yarn via the SparkListenerApplicationStart event.

git commit: [SPARK-2845] Add timestamps to block manager events.

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master e5d376801 -> ccc69e26e [SPARK-2845] Add timestamps to block manager events. These are not used by the UI but are useful when analysing the logs from a Spark job. Author: Marcelo Vanzin Closes #654 from vanzin/bm-event-tstamp and squashes

git commit: [SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGraph in PR #720

2014-09-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 6481d2742 -> e5d376801 [SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGraph in PR #720 PR #720 made multiple changes to GraphGenerator.logNormalGraph including: * Replacing the call to functions for generating random ver

git commit: [SPARK-3216] [SPARK-3232] Spark-shell is broken in branch-1.0 / Backport SPARK-3006 into branch-1.0

2014-09-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.0 d47581638 -> 8dd7690e2 [SPARK-3216] [SPARK-3232] Spark-shell is broken in branch-1.0 / Backport SPARK-3006 into branch-1.0 Author: Kousuke Saruta Author: Andrew Or Closes #2136 from sarutak/SPARK-3216 and squashes the following comm

git commit: [SPARK-3309] [PySpark] Put all public API in __all__

2014-09-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 6a72a3694 -> 6481d2742 [SPARK-3309] [PySpark] Put all public API in __all__ Put all public API in __all__, and also put them all in pyspark.__init__.py, so that we can get all the documentation for the public API with `pydoc pyspark`. It can also be used
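
For readers unfamiliar with `__all__`, the sketch below shows the standard Python behaviour this commit relies on: a module-level `__all__` list decides which names `from module import *` exports. The module `demo_pkg` and its two classes are hypothetical and are built in memory only to keep the example self-contained; they are not part of PySpark.

```python
import sys
import types

# Source of a hypothetical module; only PublicThing is listed in __all__.
SOURCE = '''
__all__ = ["PublicThing"]

class PublicThing:
    """Part of the declared public API."""

class HelperThing:
    """Defined in the module but not listed in __all__."""
'''

# Build the module in memory and register it so it can be imported by name.
demo = types.ModuleType("demo_pkg")
exec(SOURCE, demo.__dict__)
sys.modules["demo_pkg"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

`HelperThing` is still reachable as `demo_pkg.HelperThing`; `__all__` only controls what a star import exports and what tools that honour it treat as public.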

git commit: [SPARK-3187] [yarn] Cleanup allocator code.

2014-09-03 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c64cc435e -> 6a72a3694 [SPARK-3187] [yarn] Cleanup allocator code. Move all shared logic to the base YarnAllocator class, and leave the version-specific logic in the version-specific module. Author: Marcelo Vanzin Closes #2169 from vanzi