spark git commit: [SPARK-10497] [BUILD] [TRIVIAL] Handle both locations for JIRAError with python-jira

2015-09-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 1dc7548c5 -> 48817cc11 [SPARK-10497] [BUILD] [TRIVIAL] Handle both locations for JIRAError with python-jira Location of JIRAError has moved between old and new versions of python-jira package. Longer term it probably makes sense to pin
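
A class that moved between package versions can be looked up by trying each known location in turn. The following is a minimal sketch, not the code from the commit: the helper name `resolve_first` is hypothetical, and the two module paths listed for `JIRAError` are assumptions about the old and new python-jira layouts.

```python
import importlib


def resolve_first(candidates):
    """Return the first attribute found among (module_path, attr_name) pairs."""
    for module_path, attr_name in candidates:
        try:
            module = importlib.import_module(module_path)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            continue  # not at this location; try the next one
    return None


# Assumed locations of JIRAError in old and new python-jira releases.
JIRA_ERROR_LOCATIONS = [
    ("jira.exceptions", "JIRAError"),
    ("jira.utils", "JIRAError"),
]

# None when python-jira is not installed at all.
JIRAError = resolve_first(JIRA_ERROR_LOCATIONS)
```

Returning `None` instead of raising lets a script decide for itself whether a missing package is fatal.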

spark git commit: [SPARK-10065] [SQL] avoid the extra copy when generate unsafe array

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 48817cc11 -> 4f1daa1ef [SPARK-10065] [SQL] avoid the extra copy when generate unsafe array The reason for this extra copy is that we iterate the array twice: once to calculate the elements' data size and once to copy the elements to the array buffer. A simple

spark git commit: [SPARK-7142] [SQL] Minor enhancement to BooleanSimplification Optimizer rule

2015-09-10 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4f1daa1ef -> f892d927d [SPARK-7142] [SQL] Minor enhancement to BooleanSimplification Optimizer rule Use these in the optimizer as well: A and (not(A) or B) => A and B not(A and B) => not(A) or not(B)
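
The two rewrites listed above can be checked by enumerating every combination of inputs. This is a small sketch with hypothetical function names. It covers two-valued logic only and says nothing about how the optimizer treats SQL NULL.

```python
from itertools import product


def rule_one_holds(a, b):
    # A and (not(A) or B)  =>  A and B
    return (a and ((not a) or b)) == (a and b)


def rule_two_holds(a, b):
    # not(A and B)  =>  not(A) or not(B)   (De Morgan's law)
    return (not (a and b)) == ((not a) or (not b))


# Every (A, B) pair over {False, True}.
ALL_INPUTS = list(product([False, True], repeat=2))

both_rules_hold = all(
    rule_one_holds(a, b) and rule_two_holds(a, b) for a, b in ALL_INPUTS
)
```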

spark git commit: [SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master af3bc59d1 -> f0562e8cd [SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores This is a regression introduced in #4960; this commit fixes it and adds a test. tnachen andrewor14 please review, this should be an easy

spark git commit: [SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 bff05aaa0 -> 8cf16191f [SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores This is a regression introduced in #4960; this commit fixes it and adds a test. tnachen andrewor14 please review, this should be an

spark git commit: [SPARK-10514] [MESOS] waiting for min no of total cores acquired by Spark by implementing the sufficientResourcesRegistered method

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master f0562e8cd -> a5ef2d060 [SPARK-10514] [MESOS] waiting for min no of total cores acquired by Spark by implementing the sufficientResourcesRegistered method spark.scheduler.minRegisteredResourcesRatio configuration parameter works for YARN

spark git commit: [SPARK-6931] [PYSPARK] Cast Python time float values to int before serialization

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.2 7029cd12b -> 4862a80d2 [SPARK-6931] [PYSPARK] Cast Python time float values to int before serialization Python time values are floating point and must be cast to an integer before being serialized with struct.pack('!q', value)
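
The cast matters because the `'!q'` format expects an integer. Below is a minimal sketch of the idea, not the code from the commit; the helper name `pack_timestamp` is hypothetical.

```python
import struct
import time


def pack_timestamp(value):
    """Pack a time value as a big-endian signed 64-bit integer."""
    return struct.pack("!q", int(value))


now = time.time()  # a float: seconds since the epoch
packed = pack_timestamp(now)

try:
    struct.pack("!q", now)  # Python 3 rejects a float for the 'q' format
    float_accepted = True
except struct.error:
    float_accepted = False
```

`int()` truncates toward zero, so any sub-second part of the value is dropped.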

spark git commit: [SPARK-6931] [PYSPARK] Cast Python time float values to int before serialization

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.3 9fcd8310e -> d0d7ada9f [SPARK-6931] [PYSPARK] Cast Python time float values to int before serialization Python time values are floating point and must be cast to an integer before being serialized with struct.pack('!q', value)

spark git commit: [SPARK-10466] [SQL] UnsafeRow SerDe exception with data spill

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 49da38e5f -> e04811137 [SPARK-10466] [SQL] UnsafeRow SerDe exception with data spill Data Spill with UnsafeRow causes assert failure. ``` java.lang.AssertionError: assertion failed at scala.Predef$.assert(Predef.scala:165)

spark git commit: [SPARK-10466] [SQL] UnsafeRow SerDe exception with data spill

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 5e06d41a4 -> bc70043c8 [SPARK-10466] [SQL] UnsafeRow SerDe exception with data spill Data Spill with UnsafeRow causes assert failure. ``` java.lang.AssertionError: assertion failed at scala.Predef$.assert(Predef.scala:165)

spark git commit: [SPARK-10469] [DOC] Try and document the three options

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master e04811137 -> a76bde9da [SPARK-10469] [DOC] Try and document the three options From JIRA: Add documentation for tungsten-sort. From the mailing list: "I saw a new "spark.shuffle.manager=tungsten-sort" implemented in

spark git commit: [SPARK-8167] Make tasks that fail from YARN preemption not fail job

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master a76bde9da -> af3bc59d1 [SPARK-8167] Make tasks that fail from YARN preemption not fail job The architecture is that, in YARN mode, if the driver detects that an executor has disconnected, it asks the ApplicationMaster why the executor

spark git commit: [SPARK-10469] [DOC] Try and document the three options

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 bc70043c8 -> bff05aaa0 [SPARK-10469] [DOC] Try and document the three options From JIRA: Add documentation for tungsten-sort. From the mailing list: "I saw a new "spark.shuffle.manager=tungsten-sort" implemented in

spark git commit: [SPARK-9990] [SQL] Create local hash join operator

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master a5ef2d060 -> d88abb7e2 [SPARK-9990] [SQL] Create local hash join operator This PR includes the following changes: - Add SQLConf to LocalNode - Add HashJoinNode - Add ConvertToUnsafeNode and ConvertToSafeNode.scala to test unsafe hash join.

spark git commit: Revert "[SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores"

2015-09-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 8cf16191f -> 89d351b5a Revert "[SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores" This reverts commit 8cf16191f3e3b0562f22d44b0381bea35ba511d7. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-9043] Serialize key, value and combiner classes in ShuffleDependency

2015-09-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 89562a172 -> 0eabea8a0 [SPARK-9043] Serialize key, value and combiner classes in ShuffleDependency ShuffleManager implementations are currently not given type information for the key, value and combiner classes. Serialization of shuffle

spark git commit: [SPARK-10023] [ML] [PySpark] Unified DecisionTreeParams checkpointInterval between Scala and Python API.

2015-09-10 Thread meng
Repository: spark Updated Branches: refs/heads/master 0eabea8a0 -> 339a52714 [SPARK-10023] [ML] [PySpark] Unified DecisionTreeParams checkpointInterval between Scala and Python API. "checkpointInterval" is a member of DecisionTreeParams in the Scala API, which is inconsistent with the Python API; we

spark git commit: [SPARK-10027] [ML] [PySpark] Add Python API missing methods for ml.feature

2015-09-10 Thread meng
Repository: spark Updated Branches: refs/heads/master 339a52714 -> a140dd77c [SPARK-10027] [ML] [PySpark] Add Python API missing methods for ml.feature The missing methods of ml.feature are listed here: ```StringIndexer``` lacks the parameter ```handleInvalid```. ```StringIndexerModel``` lacks

spark git commit: [SPARK-10049] [SPARKR] Support collecting data of ArraryType in DataFrame.

2015-09-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d88abb7e2 -> 45e3be5c1 [SPARK-10049] [SPARKR] Support collecting data of ArraryType in DataFrame. This PR: 1. Enhances reflection in RBackend, automatically matching a Java array to a Scala Seq when finding methods. Util functions like

spark git commit: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types.Row implements __getitem__

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 420475771 -> 89562a172 [SPARK-7544] [SQL] [PySpark] pyspark.sql.types.Row implements __getitem__ pyspark.sql.types.Row implements ```__getitem__``` Author: Yanbo Liang Closes #8333 from yanboliang/spark-7544.
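
As a rough illustration of what a `__getitem__` on a row type can offer, the sketch below accepts either a position or a field name. `NamedRow` is a hypothetical stand-in written for this example. It is not the `pyspark.sql.types.Row` implementation, and sorting the fields by name is an assumption made to keep positions deterministic.

```python
class NamedRow(tuple):
    """Hypothetical stand-in for a row: a tuple that also knows its field names."""

    def __new__(cls, **fields):
        names = sorted(fields)  # fixed order so positions are deterministic
        row = super().__new__(cls, [fields[n] for n in names])
        row._names = names
        return row

    def __getitem__(self, item):
        # Positions and slices behave exactly as on a plain tuple.
        if isinstance(item, (int, slice)):
            return super().__getitem__(item)
        # Anything else is treated as a field name.
        try:
            return super().__getitem__(self._names.index(item))
        except ValueError:
            raise KeyError(item)


row = NamedRow(name="Alice", age=11)
by_name = row["name"]
by_position = row[0]
```

Subclassing `tuple` keeps iteration, unpacking and positional indexing intact, with name lookup added on top.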

spark git commit: [SPARK-10443] [SQL] Refactor SortMergeOuterJoin to reduce duplication

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 45e3be5c1 -> 3db72554b [SPARK-10443] [SQL] Refactor SortMergeOuterJoin to reduce duplication `LeftOutputIterator` and `RightOutputIterator` are symmetrically identical and can share a lot of code. If someone makes a change in one but

spark git commit: Add 1.5 to master branch EC2 scripts

2015-09-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 3db72554b -> 420475771 Add 1.5 to master branch EC2 scripts This change brings it on par with `branch-1.5` (and the 1.5.0 release) Author: Shivaram Venkataraman Closes #8704 from shivaram/ec2-1.5-update. Project:

spark git commit: [SPARK-10301] [SPARK-10428] [SQL] Addresses comments of PR #8583 and #8509 for master

2015-09-10 Thread davies
Repository: spark Updated Branches: refs/heads/master f892d927d -> 49da38e5f [SPARK-10301] [SPARK-10428] [SQL] Addresses comments of PR #8583 and #8509 for master Author: Cheng Lian Closes #8670 from liancheng/spark-10301/address-pr-comments. Project: