spark git commit: [SPARK-10576] [BUILD] Move .java files out of src/main/scala

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 16b6d1861 -> 4e2242bb4 [SPARK-10576] [BUILD] Move .java files out of src/main/scala Move .java files in `src/main/scala` to `src/main/java` root, except for `package-info.java` (to stay next to package.scala) Author: Sean Owen

spark git commit: [SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test (round 2)

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 eb0cb25bb -> 5db51f911 [SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test (round 2) This is a follow-up patch to #8723. I missed one case there. Author: Andrew Or Closes #8727
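
The failure mode named in the title is that an assertion raised inside a worker thread never reaches the thread running the test. A minimal Python sketch of the usual remedy follows: capture the exception in the worker and re-raise it in the caller. It is an analogy, not code from the patch; `run_in_thread` and `failing_check` are hypothetical names.

```python
import threading


def run_in_thread(body):
    """Run body() in a worker thread and re-raise any exception in the caller."""
    failure = []  # filled by the worker if body() raises

    def target():
        try:
            body()
        except BaseException as exc:  # AssertionError is included here
            failure.append(exc)

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    if failure:
        # Without this re-raise the failure would stay inside the worker
        # thread and the surrounding test would still pass.
        raise failure[0]


def failing_check():
    # Hypothetical check that fails inside the worker thread.
    if 1 + 1 != 3:
        raise AssertionError("assertion inside worker thread")


try:
    run_in_thread(failing_check)
    propagated = False
except AssertionError:
    propagated = True
```

With the re-raise in place, `propagated` ends up `True`, so a test harness wrapped around `run_in_thread` would see the failure.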

spark git commit: [SPARK-10549] scala 2.11 spark on yarn with security - Repl doesn't work

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 4e2242bb4 -> ffbbc2c58 [SPARK-10549] scala 2.11 spark on yarn with security - Repl doesn't work Make this lazy so that it can set the yarn mode before creating the securityManager. Author: Tom Graves Author:
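
The idea of making a value lazy so that configuration can be set before the value is created can be sketched as below. This is a Python illustration of lazy initialization under assumed names (`Repl`, `SecurityManager`, `yarn_mode` are hypothetical stand-ins), not the Scala change itself.

```python
class SecurityManager:
    """Hypothetical stand-in that records the mode it was created under."""

    def __init__(self, yarn_mode):
        self.yarn_mode = yarn_mode


class Repl:
    """Hypothetical owner that builds its security manager on first access."""

    def __init__(self):
        self.yarn_mode = False
        self._security_manager = None  # not created yet

    @property
    def security_manager(self):
        # Created lazily, so it sees whatever yarn_mode is set at first use.
        if self._security_manager is None:
            self._security_manager = SecurityManager(self.yarn_mode)
        return self._security_manager


repl = Repl()
repl.yarn_mode = True             # configure first
manager = repl.security_manager   # the manager is only created here
```

Had the manager been built eagerly in `__init__`, it would have captured `yarn_mode = False`.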

spark git commit: [SPARK-10543] [CORE] Peak Execution Memory Quantile should be Per-task Basis

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master ffbbc2c58 -> fd1e8cddf [SPARK-10543] [CORE] Peak Execution Memory Quantile should be Per-task Basis Read `PEAK_EXECUTION_MEMORY` using `update` to get per task partial value instead of cumulative value. I tested with this workload:

spark git commit: [SPARK-10543] [CORE] Peak Execution Memory Quantile should be Per-task Basis

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 0e1c9d9ff -> eb0cb25bb [SPARK-10543] [CORE] Peak Execution Memory Quantile should be Per-task Basis Read `PEAK_EXECUTION_MEMORY` using `update` to get per task partial value instead of cumulative value. I tested with this workload:

spark git commit: [SPARK-6981] [SQL] Factor out SparkPlanner and QueryExecution from SQLContext

2015-09-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7e32387ae -> 64f04154e [SPARK-6981] [SQL] Factor out SparkPlanner and QueryExecution from SQLContext Alternative to PR #6122; in this case the refactored out classes are replaced by inner classes with the same name for backwards binary

spark git commit: [SPARK-9996] [SPARK-9997] [SQL] Add local expand and NestedLoopJoin operators

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 64f04154e -> 217e49644 [SPARK-9996] [SPARK-9997] [SQL] Add local expand and NestedLoopJoin operators This PR is in conflict with #8535 and #8573. Will update this one when they are merged. Author: zsxwing Closes

spark git commit: [SPARK-10549] scala 2.11 spark on yarn with security - Repl doesn't work

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 a0d564a10 -> 0e1c9d9ff [SPARK-10549] scala 2.11 spark on yarn with security - Repl doesn't work Make this lazy so that it can set the yarn mode before creating the securityManager. Author: Tom Graves Author:

spark git commit: [SPARK-10594] [YARN] Remove reference to --num-executors, add --properties-file

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 217e49644 -> 16b6d1861 [SPARK-10594] [YARN] Remove reference to --num-executors, add --properties-file `ApplicationMaster` no longer has the `--num-executors` flag, and had an undocumented `--properties-file` configuration option. cc

spark git commit: [SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test (round 2)

2015-09-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master fd1e8cddf -> 7b6c85636 [SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test (round 2) This is a follow-up patch to #8723. I missed one case there. Author: Andrew Or Closes #8727 from

spark git commit: [SPARK-10542] [PYSPARK] fix serialize namedtuple

2015-09-14 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 5db51f911 -> d5c0361e7 [SPARK-10542] [PYSPARK] fix serialize namedtuple Author: Davies Liu Closes #8707 from davies/fix_namedtuple. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-10273] Add @since annotation to pyspark.mllib.feature

2015-09-14 Thread meng
Repository: spark Updated Branches: refs/heads/master 4ae4d5479 -> 610971ecf [SPARK-10273] Add @since annotation to pyspark.mllib.feature Duplicated the since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added since to methods +
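
A decorator of this kind, one that appends a version note to a docstring and tolerates functions that have none, might look roughly like the following. This is a minimal sketch and not the PySpark implementation; the functions `no_doc` and `with_doc` are hypothetical.

```python
import re


def since(version):
    """Return a decorator that appends a versionadded note to a docstring."""
    indent_pattern = re.compile(r"\n( +)")

    def decorator(func):
        doc = func.__doc__ or ""  # functions without a docstring have None
        indents = indent_pattern.findall(doc)
        # Match the smallest indentation used in the existing docstring.
        indent = " " * (min(len(i) for i in indents) if indents else 0)
        func.__doc__ = doc.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return func

    return decorator


@since("1.5.0")
def no_doc(x):
    return x


@since("1.5.0")
def with_doc(x):
    """Return x unchanged.

    More detail.
    """
    return x
```

The `or ""` fallback is the part that handles functions without docstrings; without it, calling `findall` on `None` would raise a `TypeError`.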

spark git commit: [SPARK-10275] [MLLIB] Add @since annotation to pyspark.mllib.random

2015-09-14 Thread meng
Repository: spark Updated Branches: refs/heads/master 610971ecf -> a2249359d [SPARK-10275] [MLLIB] Add @since annotation to pyspark.mllib.random Author: Yu ISHIKAWA Closes #8666 from yu-iskw/SPARK-10275. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-9851] Support submitting map stages individually in DAGScheduler

2015-09-14 Thread matei
Repository: spark Updated Branches: refs/heads/master 7b6c85636 -> 1a0955250 [SPARK-9851] Support submitting map stages individually in DAGScheduler This patch adds support for submitting map stages in a DAG individually so that we can make downstream decisions after seeing statistics about

spark git commit: [SPARK-10542] [PYSPARK] fix serialize namedtuple

2015-09-14 Thread davies
Repository: spark Updated Branches: refs/heads/master 1a0955250 -> 552041810 [SPARK-10542] [PYSPARK] fix serialize namedtuple Author: Davies Liu Closes #8707 from davies/fix_namedtuple. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-9793] [MLLIB] [PYSPARK] PySpark DenseVector, SparseVector implement __eq__ and __hash__ correctly

2015-09-14 Thread meng
Repository: spark Updated Branches: refs/heads/master 552041810 -> 4ae4d5479 [SPARK-9793] [MLLIB] [PYSPARK] PySpark DenseVector, SparseVector implement __eq__ and __hash__ correctly PySpark DenseVector, SparseVector `__eq__` method should use semantic equality, and DenseVector can
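
Semantic equality here means that a dense and a sparse vector holding the same values compare equal, and equal objects must then hash equally. A minimal sketch under that reading follows; the classes `Dense` and `Sparse` are hypothetical and are not the PySpark classes.

```python
class _VectorBase:
    """Shared equality and hashing based on size and non-zero entries."""

    def size(self):
        raise NotImplementedError

    def nonzeros(self):
        raise NotImplementedError

    def __eq__(self, other):
        if not isinstance(other, _VectorBase):
            return NotImplemented
        return self.size() == other.size() and self.nonzeros() == other.nonzeros()

    def __hash__(self):
        # Uses the same data as __eq__, so equal vectors hash equally.
        return hash((self.size(), self.nonzeros()))


class Dense(_VectorBase):
    def __init__(self, values):
        self.values = [float(v) for v in values]

    def size(self):
        return len(self.values)

    def nonzeros(self):
        return tuple((i, v) for i, v in enumerate(self.values) if v != 0.0)


class Sparse(_VectorBase):
    def __init__(self, size, entries):
        self._size = size
        self.entries = {int(i): float(v) for i, v in entries.items()}

    def size(self):
        return self._size

    def nonzeros(self):
        return tuple((i, v) for i, v in sorted(self.entries.items()) if v != 0.0)


dense = Dense([1, 0, 2])
sparse = Sparse(3, {0: 1, 2: 2})
```

Both classes reduce themselves to the same canonical form (size plus sorted non-zero entries), which keeps `__eq__` and `__hash__` consistent across representations.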

spark git commit: [SPARK-10522] [SQL] Nanoseconds of Timestamp in Parquet should be positive

2015-09-14 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 5b7067c91 -> a0d564a10 [SPARK-10522] [SQL] Nanoseconds of Timestamp in Parquet should be positive Otherwise Hive can't read it back correctly. Thanks to vanzin for reporting this. Author: Davies Liu Closes #8674 from
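
One way a negative nanosecond field can arise is when a pre-epoch timestamp is split into a day and a time-of-day part using division that truncates toward zero. The sketch below contrasts that with floor division. It assumes a Julian-day-plus-nanoseconds layout and the constant 2440588 for the Julian day of 1970-01-01; the function names are hypothetical and the code is not taken from the patch.

```python
MICROS_PER_DAY = 24 * 60 * 60 * 1000 * 1000
JULIAN_DAY_OF_EPOCH = 2440588  # assumed Julian day number of 1970-01-01


def split_truncating(micros):
    """Split with truncation toward zero, as JVM '/' and '%' behave."""
    sign = 1 if micros >= 0 else -1
    days = sign * (abs(micros) // MICROS_PER_DAY)
    rem = micros - days * MICROS_PER_DAY  # negative for pre-epoch input
    return JULIAN_DAY_OF_EPOCH + days, rem * 1000


def split_flooring(micros):
    """Split with floor division, so the time-of-day part is never negative."""
    days, rem = divmod(micros, MICROS_PER_DAY)
    return JULIAN_DAY_OF_EPOCH + days, rem * 1000


before_epoch = -1  # one microsecond before 1970-01-01T00:00:00
truncated = split_truncating(before_epoch)
floored = split_flooring(before_epoch)
```

For the input above, the truncating split yields a nanosecond field of -1000 on the epoch day, while the flooring split yields the previous day with a positive nanosecond field.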