Git Push Summary

2015-08-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150803 [deleted] 35264204b - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-9263] Added flags to exclude dependencies when using --packages

2015-08-03 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.5 73c863ac8 - 34335719a [SPARK-9263] Added flags to exclude dependencies when using --packages While the functionality is there to exclude packages, there are no flags that allow users to exclude dependencies, in case of dependency

spark git commit: [SPARK-9263] Added flags to exclude dependencies when using --packages

2015-08-03 Thread vanzin
Repository: spark Updated Branches: refs/heads/master b79b4f5f2 - 1633d0a26 [SPARK-9263] Added flags to exclude dependencies when using --packages While the functionality is there to exclude packages, there are no flags that allow users to exclude dependencies, in case of dependency

[2/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/6bd12e81/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala -- diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala

[2/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/a2409d1c/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala -- diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala

Git Push Summary

2015-08-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150803 [deleted] 4c4f638c7

Git Push Summary

2015-08-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150803 [created] 7e7147f3b

Git Push Summary

2015-08-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150803 [created] 35264204b

[2/2] spark git commit: Preparing development version 1.5.0-SNAPSHOT

2015-08-03 Thread pwendell
Preparing development version 1.5.0-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/73fab884 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/73fab884 Diff:

[1/2] spark git commit: Preparing Spark release v1.5.0-snapshot-20150803

2015-08-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.5 6bd12e819 - 73fab8849 Preparing Spark release v1.5.0-snapshot-20150803 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/35264204 Tree: http://git-wip-us.apache.org

spark git commit: [SPARK-8416] highlight and topping the executor threads in thread dumping page

2015-08-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.5 34335719a - 93076ae39 [SPARK-8416] highlight and topping the executor threads in thread dumping page https://issues.apache.org/jira/browse/SPARK-8416 To facilitate debugging, I made this patch with three changes: * render the

spark git commit: [SPARK-8416] highlight and topping the executor threads in thread dumping page

2015-08-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 1633d0a26 - 3b0e44490 [SPARK-8416] highlight and topping the executor threads in thread dumping page https://issues.apache.org/jira/browse/SPARK-8416 To facilitate debugging, I made this patch with three changes: * render the

[1/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b2e4b85d2 - a2409d1c8 http://git-wip-us.apache.org/repos/asf/spark/blob/a2409d1c/sql/hive/src/test/resources/golden/parenthesis_star_by-5-6888c7f7894910538d82eefa23443189

[3/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
[SPARK-8064] [SQL] Build against Hive 1.2.1 Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork. Tests not run yet: that's what the machines are for Author: Steve

[1/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 db5832708 - 6bd12e819 http://git-wip-us.apache.org/repos/asf/spark/blob/6bd12e81/sql/hive/src/test/resources/golden/parenthesis_star_by-5-6888c7f7894910538d82eefa23443189

[3/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
[SPARK-8064] [SQL] Build against Hive 1.2.1 Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork. Tests not run yet: that's what the machines are for Author: Steve

Git Push Summary

2015-08-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150803 [created] 4c4f638c7

[2/2] spark git commit: Preparing development version 1.5.0-SNAPSHOT

2015-08-03 Thread pwendell
Preparing development version 1.5.0-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bc49ca46 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bc49ca46 Diff:

[1/2] spark git commit: Preparing Spark release v1.5.0-snapshot-20150803

2015-08-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.5 acda9d954 - bc49ca468 Preparing Spark release v1.5.0-snapshot-20150803 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4c4f638c Tree: http://git-wip-us.apache.org

spark git commit: [SPARK-9577][SQL] Surface concrete iterator types in various sort classes.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3b0e44490 - 5eb89f67e [SPARK-9577][SQL] Surface concrete iterator types in various sort classes. We often return abstract iterator types in various sort-related classes (e.g. UnsafeKVExternalSorter). It is actually better to return a more

spark git commit: [SPARK-9577][SQL] Surface concrete iterator types in various sort classes.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 93076ae39 - ebe42b98c [SPARK-9577][SQL] Surface concrete iterator types in various sort classes. We often return abstract iterator types in various sort-related classes (e.g. UnsafeKVExternalSorter). It is actually better to return a

spark git commit: [SPARK-8874] [ML] Add missing methods in Word2Vec

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master a2409d1c8 - 13675c742 [SPARK-8874] [ML] Add missing methods in Word2Vec Add missing methods 1. getVectors 2. findSynonyms to W2Vec scala and python API mengxr Author: MechCoder manojkumarsivaraj...@gmail.com Closes #7263 from

[2/2] spark git commit: Preparing development version 1.5.0-SNAPSHOT

2015-08-03 Thread pwendell
Preparing development version 1.5.0-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/74792e71 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/74792e71 Diff:

[1/2] spark git commit: Preparing Spark release v1.5.0-snapshot-20150803

2015-08-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.5 bc49ca468 - 74792e71c Preparing Spark release v1.5.0-snapshot-20150803 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7e7147f3 Tree: http://git-wip-us.apache.org

spark git commit: Add a prerequisites section for building docs

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 13675c742 - 7abaaad5b Add a prerequisites section for building docs This puts all the install commands that need to be run in one section instead of being spread over many paragraphs cc rxin Author: Shivaram Venkataraman

spark git commit: [SPARK-9483] Fix UTF8String.getPrefix for big-endian.

2015-08-03 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 74792e71c - 73c863ac8 [SPARK-9483] Fix UTF8String.getPrefix for big-endian. Previous code assumed little-endian. Author: Matthew Brandyberry mbra...@us.ibm.com Closes #7902 from mtbrandy/SPARK-9483 and squashes the following commits:

spark git commit: [SPARK-9483] Fix UTF8String.getPrefix for big-endian.

2015-08-03 Thread davies
Repository: spark Updated Branches: refs/heads/master 7abaaad5b - b79b4f5f2 [SPARK-9483] Fix UTF8String.getPrefix for big-endian. Previous code assumed little-endian. Author: Matthew Brandyberry mbra...@us.ibm.com Closes #7902 from mtbrandy/SPARK-9483 and squashes the following commits:
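
The endianness problem named in SPARK-9483 can be illustrated with a minimal Python sketch. This is not the Spark code: the function name `prefix_from_native_read` and the `native_order` parameter are hypothetical, and zero-padding strings shorter than 8 bytes is an assumption made for the illustration. The idea is that reading the first 8 bytes as a native 64-bit word gives a value that orders like the original bytes only on a big-endian machine, so a byte swap is needed on little-endian machines and must be skipped on big-endian ones.

```python
import struct


def prefix_from_native_read(data: bytes, native_order: str) -> int:
    """Return an 8-byte sort prefix, simulating a native-order 64-bit read.

    native_order is "little" or "big" and stands in for the platform's
    byte order (hypothetical parameter, for illustration only).
    """
    # Assumption: strings shorter than 8 bytes are padded with zero bytes.
    padded = data[:8].ljust(8, b"\x00")

    # Simulate what a raw 64-bit load would see on each kind of machine.
    fmt = "<Q" if native_order == "little" else ">Q"
    raw = struct.unpack(fmt, padded)[0]

    if native_order == "little":
        # On little-endian the first byte of the string ended up in the
        # least significant position, so reverse the byte order.
        raw = int.from_bytes(raw.to_bytes(8, "little"), "big")
    # On big-endian the loaded value already orders like the bytes.
    return raw
```

With the conditional swap, both simulated platforms produce the same prefix, and prefixes compare in the same order as the byte strings they came from.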

spark git commit: Revert [SPARK-9372] [SQL] Filter nulls in join keys

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 29756ff11 - db5832708 Revert [SPARK-9372] [SQL] Filter nulls in join keys This reverts commit 687c8c37150f4c93f8e57d86bb56321a4891286b. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert [SPARK-9372] [SQL] Filter nulls in join keys

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 702aa9d7f - b2e4b85d2 Revert [SPARK-9372] [SQL] Filter nulls in join keys This reverts commit 687c8c37150f4c93f8e57d86bb56321a4891286b. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-8874] [ML] Add missing methods in Word2Vec

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 73fab8849 - acda9d954 [SPARK-8874] [ML] Add missing methods in Word2Vec Add missing methods 1. getVectors 2. findSynonyms to W2Vec scala and python API mengxr Author: MechCoder manojkumarsivaraj...@gmail.com Closes #7263 from

spark git commit: [SPARK-9521] [DOCS] Addendum. Require Maven 3.3.3+ in the build

2015-08-03 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.5 ebe42b98c - 1f7dbcd6f [SPARK-9521] [DOCS] Addendum. Require Maven 3.3.3+ in the build Follow on for #7852: Building Spark doc needs to refer to new Maven requirement too Author: Sean Owen so...@cloudera.com Closes #7905 from

spark git commit: [SPARK-9521] [DOCS] Addendum. Require Maven 3.3.3+ in the build

2015-08-03 Thread sarutak
Repository: spark Updated Branches: refs/heads/master 5eb89f67e - 0afa6fbf5 [SPARK-9521] [DOCS] Addendum. Require Maven 3.3.3+ in the build Follow on for #7852: Building Spark doc needs to refer to new Maven requirement too Author: Sean Owen so...@cloudera.com Closes #7905 from

[1/2] spark git commit: [SPARK-8735] [SQL] Expose memory usage for shuffles, joins and aggregations

2015-08-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.5 e7329ab31 - 29756ff11 http://git-wip-us.apache.org/repos/asf/spark/blob/29756ff1/core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala -- diff --git

[1/2] spark git commit: [SPARK-8735] [SQL] Expose memory usage for shuffles, joins and aggregations

2015-08-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master e4765a468 - 702aa9d7f http://git-wip-us.apache.org/repos/asf/spark/blob/702aa9d7/core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala -- diff --git

[2/2] spark git commit: [SPARK-8735] [SQL] Expose memory usage for shuffles, joins and aggregations

2015-08-03 Thread joshrosen
[SPARK-8735] [SQL] Expose memory usage for shuffles, joins and aggregations This patch exposes the memory used by internal data structures on the SparkUI. This tracks memory used by all spilling operations and SQL operators backed by Tungsten, e.g. `BroadcastHashJoin`, `ExternalSort`,

spark git commit: [SPARK-8873] [MESOS] Clean up shuffle files if external shuffle service is used

2015-08-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 1ebd41b14 - 95dccc633 [SPARK-8873] [MESOS] Clean up shuffle files if external shuffle service is used This patch builds directly on #7820, which is largely written by tnachen. The only addition is one commit for cleaning up the code.

spark git commit: [SPARK-9404][SPARK-9542][SQL] unsafe array data and map data

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 687c8c371 - 608353c8e [SPARK-9404][SPARK-9542][SQL] unsafe array data and map data This PR adds an UnsafeArrayData. Currently we encode it this way: the first 4 bytes hold the number of elements, then each subsequent 4 bytes hold the start offset of an element,
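
A rough Python sketch of the layout described so far (a 4-byte element count followed by one 4-byte start offset per element) might look like the following. The snippet above is cut off, so everything beyond those two fields is an assumption: the helper names `encode_array` and `decode_array` are hypothetical, offsets are taken relative to the start of the buffer, integers are little-endian, and element data is simply concatenated after the offsets. The actual UnsafeArrayData format may differ in all of these respects.

```python
import struct


def encode_array(elements):
    """Pack byte-string elements as: count, per-element offsets, data.

    Assumed layout (illustration only):
      [int32 count][int32 offset] * count [element bytes...]
    """
    n = len(elements)
    header_size = 4 + 4 * n  # count field plus one offset per element
    offsets = []
    pos = header_size
    for element in elements:
        offsets.append(pos)  # offset measured from the start of the buffer
        pos += len(element)
    return struct.pack(f"<i{n}i", n, *offsets) + b"".join(elements)


def decode_array(buf):
    """Inverse of encode_array under the same assumed layout."""
    (n,) = struct.unpack_from("<i", buf, 0)
    offsets = list(struct.unpack_from(f"<{n}i", buf, 4))
    # An element ends where the next one starts; the last ends at the buffer end.
    ends = offsets[1:] + [len(buf)]
    return [buf[start:end] for start, end in zip(offsets, ends)]
```

Storing offsets up front is what allows an element to be located without scanning the elements before it.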

spark git commit: [SPARK-9549][SQL] fix bugs in expressions

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 608353c8e - 98d6d9c7a [SPARK-9549][SQL] fix bugs in expressions JIRA: https://issues.apache.org/jira/browse/SPARK-9549 This PR fixes the following bugs: 1. `UnaryMinus`'s codegen version would fail to compile when the input is

spark git commit: [SPARK-9372] [SQL] Filter nulls in join keys

2015-08-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 4cdd8ecd6 - 687c8c371 [SPARK-9372] [SQL] Filter nulls in join keys This PR adds an optimization rule, `FilterNullsInJoinKey`, to add `Filter` before join operators to filter out rows having null values for join keys. This optimization is
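
The reasoning behind filtering null join keys can be shown with a small Python sketch. It is not the `FilterNullsInJoinKey` rule itself: the helpers `sql_equal`, `inner_join` and `filter_null_keys` are hypothetical, rows are modeled as plain dicts, and only an inner equi-join is considered. Because a SQL equality comparison involving NULL never evaluates to true, rows whose join key is null cannot contribute to the result, so dropping them before the join leaves the output unchanged while shrinking the inputs.

```python
def sql_equal(a, b):
    """SQL-style equality for join keys: a comparison with NULL never matches."""
    return a is not None and b is not None and a == b


def inner_join(left, right, key):
    """Naive nested-loop inner equi-join on a single key column."""
    return [(l, r) for l in left for r in right if sql_equal(l[key], r[key])]


def filter_null_keys(rows, key):
    """Drop rows whose join key is NULL (None) before joining."""
    return [row for row in rows if row[key] is not None]
```

Joining the filtered inputs gives the same pairs as joining the originals; the benefit is that fewer rows reach the join operator.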

[2/2] spark git commit: [SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row

2015-08-03 Thread rxin
[SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row This PR adds a base aggregation iterator `AggregationIterator`, which is used to create `SortBasedAggregationIterator` (for sort-based aggregation) and `UnsafeHybridAggregationIterator` (first it tries hash-based aggregation and

[1/2] spark git commit: [SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 98d6d9c7a - 1ebd41b14 http://git-wip-us.apache.org/repos/asf/spark/blob/1ebd41b1/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/sortBasedIterators.scala

spark git commit: [SPARK-5133] [ML] Added featureImportance to RandomForestClassifier and Regressor

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 703e44bff - ff9169a00 [SPARK-5133] [ML] Added featureImportance to RandomForestClassifier and Regressor Added featureImportance to RandomForestClassifier and Regressor. This follows the scikit-learn implementation here:

spark git commit: [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 4de833e9e - 5452e93f0 [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes. Author: Reynold Xin r...@databricks.com Closes #7897 from rxin/calculateBitSetWidthInBytes and squashes the following commits: 2e73b3a [Reynold Xin]

spark git commit: [SPARK-9554] [SQL] Enables in-memory partition pruning by default

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 5452e93f0 - 6d46e9b7c [SPARK-9554] [SQL] Enables in-memory partition pruning by default Author: Cheng Lian l...@databricks.com Closes #7895 from liancheng/spark-9554/enable-in-memory-partition-pruning and squashes the following

spark git commit: [SPARK-9554] [SQL] Enables in-memory partition pruning by default

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7a9d09f0b - 703e44bff [SPARK-9554] [SQL] Enables in-memory partition pruning by default Author: Cheng Lian l...@databricks.com Closes #7895 from liancheng/spark-9554/enable-in-memory-partition-pruning and squashes the following commits:

spark git commit: [SPARK-5133] [ML] Added featureImportance to RandomForestClassifier and Regressor

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 6d46e9b7c - b3117d312 [SPARK-5133] [ML] Added featureImportance to RandomForestClassifier and Regressor Added featureImportance to RandomForestClassifier and Regressor. This follows the scikit-learn implementation here:

Git Push Summary

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 [created] b41a32718

spark git commit: [SPARK-9511] [SQL] Fixed Table Name Parsing

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b41a32718 - dfe7bd168 [SPARK-9511] [SQL] Fixed Table Name Parsing The issue was that the tokenizer was parsing 1one into the numeric 1 using the code on line 110. I added another case to accept strings that start with a number and then
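
The tokenizer problem described in SPARK-9511 can be sketched in Python as follows. This is not the Spark SQL lexer: the `TOKEN` pattern and the `tokenize` function are hypothetical, and the exact rule the patch added is cut off above, so the identifier rule here (optional leading digits followed by a letter or underscore) is an assumption. The point is only that the identifier alternative has to be tried before the numeric one, otherwise `1one` is split into the number `1` and a separate identifier.

```python
import re

# Alternatives are tried left to right, so "ident" gets the first chance.
TOKEN = re.compile(
    r"(?P<ident>[0-9]*[A-Za-z_][A-Za-z0-9_]*)"  # may start with digits
    r"|(?P<number>[0-9]+)"                      # digits only
    r"|(?P<ws>\s+)"                             # whitespace, skipped
)


def tokenize(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A purely numeric token such as `123` still falls through to the `number` alternative, because the identifier rule requires at least one letter or underscore.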

spark git commit: [SPARK-9528] [ML] Changed RandomForestClassifier to extend ProbabilisticClassifier

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 8be198c86 - 69f5a7c93 [SPARK-9528] [ML] Changed RandomForestClassifier to extend ProbabilisticClassifier RandomForestClassifier now outputs rawPrediction based on tree probabilities, plus probability column computed from normalized

spark git commit: [SPARK-1855] Local checkpointing

2015-08-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master 69f5a7c93 - b41a32718 [SPARK-1855] Local checkpointing Certain use cases of Spark involve RDDs with long lineages that must be truncated periodically (e.g. GraphX). The existing way of doing it is through `rdd.checkpoint()`, which is

spark git commit: [SPARK-9551][SQL] add a cheap version of copy for UnsafeRow to reuse a copy buffer

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 95dccc633 - 137f47865 [SPARK-9551][SQL] add a cheap version of copy for UnsafeRow to reuse a copy buffer Author: Wenchen Fan cloud0...@outlook.com Closes #7885 from cloud-fan/cheap-copy and squashes the following commits: 0900ca1

spark git commit: Two minor comments from code review on 191bf2689.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 191bf2689 - 8be198c86 Two minor comments from code review on 191bf2689. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8be198c8 Tree:

spark git commit: [SPARK-7563] (backport for 1.3) OutputCommitCoordinator.stop() should only run on the driver

2015-08-03 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.3 cc5f711c0 - 265ec35bc [SPARK-7563] (backport for 1.3) OutputCommitCoordinator.stop() should only run on the driver Backport of [SPARK-7563] OutputCommitCoordinator.stop() should only run on the driver for 1.3 Author: Sean Owen

spark git commit: [SPARK-9518] [SQL] cleanup generated UnsafeRowJoiner and fix bug

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 137f47865 - 191bf2689 [SPARK-9518] [SQL] cleanup generated UnsafeRowJoiner and fix bug Currently, when copying the bitsets, we do not consider that row1 may not sit at the beginning of the byte array. cc rxin Author: Davies Liu

spark git commit: [SPARK-9191] [ML] [Doc] Add ml.PCA user guide and code examples

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master ba1c4e138 - 8ca287ebb [SPARK-9191] [ML] [Doc] Add ml.PCA user guide and code examples Add ml.PCA user guide document and code examples for Scala/Java/Python. Author: Yanbo Liang yblia...@gmail.com Closes #7522 from yanboliang/ml-pca-md

spark git commit: [SPARK-9544] [MLLIB] add Python API for RFormula

2015-08-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 444058d91 - dc0c8c982 [SPARK-9544] [MLLIB] add Python API for RFormula Add Python API for RFormula. Similar to other feature transformers in Python. This is just a thin wrapper over the Scala implementation. ericl MechCoder Author:

spark git commit: [SPARK-9544] [MLLIB] add Python API for RFormula

2015-08-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 8ca287ebb - e4765a468 [SPARK-9544] [MLLIB] add Python API for RFormula Add Python API for RFormula. Similar to other feature transformers in Python. This is just a thin wrapper over the Scala implementation. ericl MechCoder Author:

spark git commit: [SPARK-9191] [ML] [Doc] Add ml.PCA user guide and code examples

2015-08-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 dc0c8c982 - e7329ab31 [SPARK-9191] [ML] [Doc] Add ml.PCA user guide and code examples Add ml.PCA user guide document and code examples for Scala/Java/Python. Author: Yanbo Liang yblia...@gmail.com Closes #7522 from

spark git commit: [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master ff9169a00 - ba1c4e138 [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults. The memory defaults of the master and slave in Standalone mode and of the History Server are now 1g, not 512m. So let's update the docs. Author: Kousuke

spark git commit: [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 b3117d312 - 444058d91 [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults. The memory defaults of the master and slave in Standalone mode and of the History Server are now 1g, not 512m. So let's update the docs. Author: Kousuke