spark git commit: [SPARK-14722][SQL] Rename upstreams() -> inputRDDs() in WholeStageCodegen

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4eae1dbd7 -> 6f8800689 [SPARK-14722][SQL] Rename upstreams() -> inputRDDs() in WholeStageCodegen ## What changes were proposed in this pull request? Per rxin's suggestions, this patch renames `upstreams()` to `inputRDDs()` in

[1/2] spark git commit: [SPARK-14718][SQL] Avoid mutating ExprCode in doGenCode

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master ed2de0299 -> 4eae1dbd7 http://git-wip-us.apache.org/repos/asf/spark/blob/4eae1dbd/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

[2/2] spark git commit: [SPARK-14718][SQL] Avoid mutating ExprCode in doGenCode

2016-04-18 Thread rxin
[SPARK-14718][SQL] Avoid mutating ExprCode in doGenCode ## What changes were proposed in this pull request? The `doGenCode` method currently takes in an `ExprCode`, mutates it and returns the java code to evaluate the given expression. It should instead just return a new `ExprCode` to avoid

spark git commit: [SPARK-14719] WriteAheadLogBasedBlockHandler should ignore BlockManager put errors

2016-04-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master 5e92583d3 -> ed2de0299 [SPARK-14719] WriteAheadLogBasedBlockHandler should ignore BlockManager put errors WriteAheadLogBasedBlockHandler will currently throw exceptions if its BlockManager `put()` calls fail, even though those calls are

spark git commit: [SPARK-14667] Remove HashShuffleManager

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4b3d1294a -> 5e92583d3 [SPARK-14667] Remove HashShuffleManager ## What changes were proposed in this pull request? The sort shuffle manager has been the default since Spark 1.2. It is time to remove the old hash shuffle manager. ## How

spark git commit: [SPARK-13227] Risky apply() in OpenHashMap

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2b151b6b9 -> 4b3d1294a [SPARK-13227] Risky apply() in OpenHashMap https://issues.apache.org/jira/browse/SPARK-13227 It might confuse the future developers when they use OpenHashMap.apply() with a numeric value type.

spark git commit: [SPARK-13227] Risky apply() in OpenHashMap

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 58dfba66e -> cd812143d [SPARK-13227] Risky apply() in OpenHashMap https://issues.apache.org/jira/browse/SPARK-13227 It might confuse the future developers when they use OpenHashMap.apply() with a numeric value type.

spark git commit: [SPARK-14711][BUILD] Examples jar not a part of distribution.

2016-04-18 Thread vanzin
Repository: spark Updated Branches: refs/heads/master d29e429ee -> 2b151b6b9 [SPARK-14711][BUILD] Examples jar not a part of distribution. ## What changes were proposed in this pull request? Move the spark-examples.jar from being in examples/target to examples/target/scala-2.11/jars ## How

spark git commit: [SPARK-14714][ML][PYTHON] Fixed issues with non-kwarg typeConverter arg for Param constructor

2016-04-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 9bfb35da1 -> d29e429ee [SPARK-14714][ML][PYTHON] Fixed issues with non-kwarg typeConverter arg for Param constructor ## What changes were proposed in this pull request? PySpark Param constructors need to pass the TypeConverter argument

spark git commit: [SPARK-14515][DOC] Add python example for ChiSqSelector

2016-04-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 602734084 -> 9bfb35da1 [SPARK-14515][DOC] Add python example for ChiSqSelector ## What changes were proposed in this pull request? Add the missing python example for ChiSqSelector ## How was this patch tested? manual tests Author: Zheng

[2/2] spark git commit: [SPARK-14628][CORE][FOLLLOW-UP] Always tracking read/write metrics

2016-04-18 Thread rxin
[SPARK-14628][CORE][FOLLLOW-UP] Always tracking read/write metrics ## What changes were proposed in this pull request? This PR is a follow up for https://github.com/apache/spark/pull/12417, now we always track input/output/shuffle metrics in spark JSON protocol and status API. Most of the line

[1/2] spark git commit: [SPARK-14628][CORE][FOLLLOW-UP] Always tracking read/write metrics

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6ff043585 -> 602734084 http://git-wip-us.apache.org/repos/asf/spark/blob/60273408/core/src/test/resources/HistoryServerExpectations/stage_task_list_w__offset___length_expectation.json

spark git commit: [SPARK-14713][TESTS] Fix the flaky test NettyBlockTransferServiceSuite

2016-04-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 68450c8c6 -> 6ff043585 [SPARK-14713][TESTS] Fix the flaky test NettyBlockTransferServiceSuite ## What changes were proposed in this pull request? When there are multiple tests running, "NettyBlockTransferServiceSuite.can bind to a

spark git commit: [SPARK-14504][SQL] Enable Oracle docker tests

2016-04-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f1a11976d -> 68450c8c6 [SPARK-14504][SQL] Enable Oracle docker tests ## What changes were proposed in this pull request? Enable Oracle docker tests ## How was this patch tested? Existing tests Author: Luciano Resende

spark git commit: [SPARK-14674][SQL] Move HiveContext.hiveconf to HiveSessionState

2016-04-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8bd812132 -> f1a11976d [SPARK-14674][SQL] Move HiveContext.hiveconf to HiveSessionState ## What changes were proposed in this pull request? This is just cleanup. This allows us to remove HiveContext later without inflating the diff too

[2/2] spark git commit: [SPARK-14710][SQL] Rename gen/genCode to genCode/doGenCode to better reflect the semantics

2016-04-18 Thread rxin
[SPARK-14710][SQL] Rename gen/genCode to genCode/doGenCode to better reflect the semantics ## What changes were proposed in this pull request? Per rxin's suggestions, this patch renames `s/gen/genCode` and `s/genCode/doGenCode` to better reflect the semantics of these 2 function calls. ## How

[1/2] spark git commit: [SPARK-14710][SQL] Rename gen/genCode to genCode/doGenCode to better reflect the semantics

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6fc1e72d9 -> 8bd812132 http://git-wip-us.apache.org/repos/asf/spark/blob/8bd81213/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala

spark git commit: [MINOR] Revert removing explicit typing (changed in some examples and StatFunctions)

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8c62edb70 -> 6fc1e72d9 [MINOR] Revert removing explicit typing (changed in some examples and StatFunctions) ## What changes were proposed in this pull request? This PR reverts some changes in https://github.com/apache/spark/pull/12413.

spark git commit: [SPARK-14299][EXAMPLES] Remove duplications for scala.examples.ml

2016-04-18 Thread meng
Repository: spark Updated Branches: refs/heads/master f31a62d1b -> 8c62edb70 [SPARK-14299][EXAMPLES] Remove duplications for scala.examples.ml ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14299 Delete duplications in scala/examples/ml.

spark git commit: [SPARK-14440][PYSPARK] Remove pipeline specific reader and writer

2016-04-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 28ee15702 -> f31a62d1b [SPARK-14440][PYSPARK] Remove pipeline specific reader and writer ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14440 Remove * PipelineMLWriter * PipelineMLReader

spark git commit: [SPARK-14647][SQL] Group SQLContext/HiveContext state into SharedState

2016-04-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e4ae97429 -> 28ee15702 [SPARK-14647][SQL] Group SQLContext/HiveContext state into SharedState ## What changes were proposed in this pull request? This patch adds a SharedState that groups state shared across multiple SQLContexts. This is

spark git commit: [HOTFIX] Fix Scala 2.10 compilation break.

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3d66a2ce9 -> e4ae97429 [HOTFIX] Fix Scala 2.10 compilation break. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e4ae9742 Tree:

spark git commit: [SPARK-14564][ML][MLLIB][PYSPARK] Python Word2Vec missing setWindowSize method

2016-04-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master d280d1da1 -> 3d66a2ce9 [SPARK-14564][ML][MLLIB][PYSPARK] Python Word2Vec missing setWindowSize method ## What changes were proposed in this pull request? Added windowSize getter/setter to ML/MLlib ## How was this patch tested? Added test

spark git commit: [SPARK-14580][SPARK-14655][SQL] Hive IfCoercion should preserve predicate.

2016-04-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master b64482f49 -> d280d1da1 [SPARK-14580][SPARK-14655][SQL] Hive IfCoercion should preserve predicate. ## What changes were proposed in this pull request? Currently, `HiveTypeCoercion.IfCoercion` removes all predicates whose return-type are

spark git commit: [SPARK-14306][ML][PYSPARK] PySpark ml.classification OneVsRest support export/import

2016-04-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 775cf17ea -> b64482f49 [SPARK-14306][ML][PYSPARK] PySpark ml.classification OneVsRest support export/import ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14306 Add PySpark OneVsRest

svn commit: r1739802 - /spark/site/news/spark-summit-june-2016-agenda-posted.html

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 18:05:47 2016 New Revision: 1739802 URL: http://svn.apache.org/viewvc?rev=1739802&view=rev Log: Fix the link for previous commit (Add news for Spark Summit (June 6, 2016) agenda) again Added: spark/site/news/spark-summit-june-2016-agenda-posted.html Added:

svn commit: r1739801 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 18:04:55 2016 New Revision: 1739801 URL: http://svn.apache.org/viewvc?rev=1739801&view=rev Log: Fix the link for previous commit (Add news for Spark Summit (June 6, 2016) agenda) Added: spark/news/_posts/2016-04-17-spark-summit-june-2016-agenda-posted.md Modified:

svn commit: r1739801 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Modified: spark/site/releases/spark-release-1-2-2.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-2-2.html?rev=1739801&r1=1739800&r2=1739801&view=diff == --- spark/site/releases/spark-release-1-2-2.html

spark git commit: [SPARK-14473][SQL] Define analysis rules to catch operations not supported in streaming

2016-04-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 432d1399c -> 775cf17ea [SPARK-14473][SQL] Define analysis rules to catch operations not supported in streaming ## What changes were proposed in this pull request? There are many operations that are currently not supported in the

svn commit: r1739799 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Modified: spark/site/news/spark-version-0-6-0-released.html URL: http://svn.apache.org/viewvc/spark/site/news/spark-version-0-6-0-released.html?rev=1739799&r1=1739798&r2=1739799&view=diff == ---

svn commit: r1739799 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 17:56:44 2016 New Revision: 1739799 URL: http://svn.apache.org/viewvc?rev=1739799&view=rev Log: Add news for Spark Summit (June 6, 2016) agenda Added: spark/news/_posts/2016-04-17-submit-summit-agenda-posted.md Modified: spark/site/community.html

spark git commit: [SPARK-14614] [SQL] Add `bround` function

2016-04-18 Thread davies
Repository: spark Updated Branches: refs/heads/master d6fb485de -> 432d1399c [SPARK-14614] [SQL] Add `bround` function ## What changes were proposed in this pull request? This PR aims to add `bround` function (aka Banker's round) by extending current `round` implementation. [Hive supports
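
For readers unfamiliar with Banker's rounding, the sketch below contrasts it with ordinary half-up rounding. It is not Spark's implementation: it uses Python's standard `decimal` module, and the helper names `round_half_up_sketch` and `bround_sketch` are hypothetical, chosen only for this illustration. The only assumption about the commit is what its summary states, namely that the new function rounds ties to the nearest even digit.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def _quantum(scale: int) -> Decimal:
    # 10 ** -scale as an exact Decimal, e.g. scale=2 -> Decimal("0.01")
    return Decimal(1).scaleb(-scale)


def round_half_up_sketch(value: str, scale: int = 0) -> Decimal:
    # Hypothetical helper: ties always go away from zero (2.5 -> 3).
    return Decimal(value).quantize(_quantum(scale), rounding=ROUND_HALF_UP)


def bround_sketch(value: str, scale: int = 0) -> Decimal:
    # Hypothetical helper: ties go to the nearest even digit (2.5 -> 2, 3.5 -> 4).
    return Decimal(value).quantize(_quantum(scale), rounding=ROUND_HALF_EVEN)


if __name__ == "__main__":
    for v in ("2.5", "3.5", "0.125"):
        scale = 2 if v == "0.125" else 0
        print(v, round_half_up_sketch(v, scale), bround_sketch(v, scale))
```

The inputs are passed as strings so that the tie cases are represented exactly; a binary float such as `0.125` happens to be exact, but most decimal fractions are not, and that would obscure the tie-breaking behaviour the example is meant to show.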

spark git commit: [SPARK-14423][YARN] Avoid same name files added to distributed cache again

2016-04-18 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 1a3966472 -> d6fb485de [SPARK-14423][YARN] Avoid same name files added to distributed cache again ## What changes were proposed in this pull request? In the current implementation of assembly-free spark deployment, jars under

spark git commit: [SPARK-14696][SQL] Add implicit encoders for boxed primitive types

2016-04-18 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 2f1d0320c -> 1a3966472 [SPARK-14696][SQL] Add implicit encoders for boxed primitive types ## What changes were proposed in this pull request? We currently only have implicit encoders for scala primitive types. We should also add implicit

spark git commit: [SPARK-13363][SQL] support Aggregator in RelationalGroupedDataset

2016-04-18 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 7de06a646 -> 2f1d0320c [SPARK-13363][SQL] support Aggregator in RelationalGroupedDataset ## What changes were proposed in this pull request? set the input encoder for `TypedColumn` in `RelationalGroupedDataset.agg`. ## How was this patch