[1/2] spark git commit: [SPARK-14508][BUILD] Add a new ScalaStyle Rule `OmitBracesInCase`

2016-04-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 678b96e77 -> b0f5497e9 http://git-wip-us.apache.org/repos/asf/spark/blob/b0f5497e/streaming/src/main/scala/org/apache/spark/streaming/dstream/StateDStream.scala -- diff --gi

[2/2] spark git commit: [SPARK-14508][BUILD] Add a new ScalaStyle Rule `OmitBracesInCase`

2016-04-12 Thread rxin
[SPARK-14508][BUILD] Add a new ScalaStyle Rule `OmitBracesInCase` ## What changes were proposed in this pull request? According to the [Spark Code Style Guide](https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide) and [Scala Style Guide](http://docs.scala-lang.org/style/con

spark git commit: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPORARY TABLE ... USING ... AS SELECT" shouldn't create persisted table

2016-04-12 Thread lian
Repository: spark Updated Branches: refs/heads/master b0f5497e9 -> 124cbfb68 [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPORARY TABLE ... USING ... AS SELECT" shouldn't create persisted table ## What changes were proposed in this pull request? When planning logical plan node `CreateTableUsing

spark git commit: [SPARK-3724][ML] RandomForest: More options for feature subset size.

2016-04-12 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 124cbfb68 -> da60b34d2 [SPARK-3724][ML] RandomForest: More options for feature subset size. ## What changes were proposed in this pull request? This PR tries to support more options for feature subset size in RandomForest implementation.

spark git commit: [SPARK-12566][SPARK-14324][ML] GLM model family, link function support in SparkR:::glm

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/master 6bf692147 -> 75e05a5a9 [SPARK-12566][SPARK-14324][ML] GLM model family, link function support in SparkR:::glm * SparkR glm supports families and link functions which match R's signature for family. * SparkR glm API refactor. The comparati

spark git commit: [SPARK-14474][SQL] Move FileSource offset log into checkpointLocation

2016-04-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master da60b34d2 -> 6bf692147 [SPARK-14474][SQL] Move FileSource offset log into checkpointLocation ## What changes were proposed in this pull request? Now that we have a single location for storing checkpointed state, this PR just propagates th

spark git commit: [SPARK-13322][ML] AFTSurvivalRegression supports feature standardization

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/master 75e05a5a9 -> 101663f1a [SPARK-13322][ML] AFTSurvivalRegression supports feature standardization ## What changes were proposed in this pull request? AFTSurvivalRegression should support feature standardization; it will improve the convergen

spark git commit: [SPARK-13597][PYSPARK][ML] Python API for GeneralizedLinearRegression

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/master 101663f1a -> 7f024c474 [SPARK-13597][PYSPARK][ML] Python API for GeneralizedLinearRegression ## What changes were proposed in this pull request? Python API for GeneralizedLinearRegression JIRA: https://issues.apache.org/jira/browse/SPARK-1

spark git commit: [SPARK-14563][ML] use a random table name instead of __THIS__ in SQLTransformer

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/master 7f024c474 -> 1995c2e64 [SPARK-14563][ML] use a random table name instead of __THIS__ in SQLTransformer ## What changes were proposed in this pull request? Use a random table name instead of `__THIS__` in SQLTransformer, and add a test for

spark git commit: [SPARK-14563][ML] use a random table name instead of __THIS__ in SQLTransformer

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 663a492f0 -> 2554c35e7 [SPARK-14563][ML] use a random table name instead of __THIS__ in SQLTransformer ## What changes were proposed in this pull request? Use a random table name instead of `__THIS__` in SQLTransformer, and add a test

spark git commit: [SPARK-14147][ML][SPARKR] SparkR predict should not output feature column

2016-04-12 Thread meng
Repository: spark Updated Branches: refs/heads/master 1995c2e64 -> 111a62474 [SPARK-14147][ML][SPARKR] SparkR predict should not output feature column ## What changes were proposed in this pull request? SparkR does not support the vector type, which is the default type of the feature column in ML.

spark git commit: [SPARK-14556][SQL] Code clean-ups for package o.a.s.sql.execution.streaming.state

2016-04-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 111a62474 -> 852bbc6c0 [SPARK-14556][SQL] Code clean-ups for package o.a.s.sql.execution.streaming.state ## What changes were proposed in this pull request? - `StateStoreConf.**max**DeltasForSnapshot` was renamed to `StateStoreConf.**min

spark git commit: [SPARK-14562] [SQL] improve constraints propagation in Union

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/master 852bbc6c0 -> 85e68b4be [SPARK-14562] [SQL] improve constraints propagation in Union ## What changes were proposed in this pull request? Currently, Union only takes the intersection of the constraints from its children, all others are dropped, w

spark git commit: [SPARK-14414][SQL] improve the error message class hierarchy

2016-04-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 85e68b4be -> bcd207627 [SPARK-14414][SQL] improve the error message class hierarchy ## What changes were proposed in this pull request? Previously we were using `AnalysisException`, `ParseException`, `NoSuchFunctionException` etc when a parsin

spark git commit: [SPARK-14513][CORE] Fix threads left behind after stopping SparkContext

2016-04-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master bcd207627 -> 3e53de4bd [SPARK-14513][CORE] Fix threads left behind after stopping SparkContext ## What changes were proposed in this pull request? Shutting down `QueuedThreadPool` used by Jetty `Server` to avoid thread leakage after Spar

spark git commit: [SPARK-14544] [SQL] improve performance of SQL UI tab

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/master 3e53de4bd -> 1ef5f8cfa [SPARK-14544] [SQL] improve performance of SQL UI tab ## What changes were proposed in this pull request? This PR improves the performance of SQL UI by: 1) remove the details column in all executions page (the first

spark git commit: [SPARK-14544] [SQL] improve performance of SQL UI tab

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 2554c35e7 -> 582ed8a6e [SPARK-14544] [SQL] improve performance of SQL UI tab ## What changes were proposed in this pull request? This PR improves the performance of SQL UI by: 1) remove the details column in all executions page (the fi

spark git commit: [SPARK-14547] Avoid DNS resolution for reusing connections

2016-04-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1ef5f8cfa -> c439d88e9 [SPARK-14547] Avoid DNS resolution for reusing connections ## What changes were proposed in this pull request? This patch changes the connection creation logic in the network client module to avoid DNS resolution whe

spark git commit: [SPARK-14363] Fix executor OOM due to memory leak in the Sorter

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/master c439d88e9 -> d187e7dea [SPARK-14363] Fix executor OOM due to memory leak in the Sorter ## What changes were proposed in this pull request? Fix memory leak in the Sorter. When the UnsafeExternalSorter spills the data to disk, it does not f

spark git commit: [SPARK-14363] Fix executor OOM due to memory leak in the Sorter

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 582ed8a6e -> 413d0600e [SPARK-14363] Fix executor OOM due to memory leak in the Sorter Fix memory leak in the Sorter. When the UnsafeExternalSorter spills the data to disk, it does not free up the underlying pointer array. As a result,

spark git commit: [SPARK-14578] [SQL] Fix codegen for CreateExternalRow with nested wide schema

2016-04-12 Thread davies
Repository: spark Updated Branches: refs/heads/master d187e7dea -> 372baf047 [SPARK-14578] [SQL] Fix codegen for CreateExternalRow with nested wide schema ## What changes were proposed in this pull request? With a wide schema, the expressions of fields will be split into multiple functions, b

spark git commit: [SPARK-14579][SQL] Fix a race condition in StreamExecution.processAllAvailable

2016-04-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 372baf047 -> 768b3d623 [SPARK-14579][SQL] Fix a race condition in StreamExecution.processAllAvailable ## What changes were proposed in this pull request? There is a race condition in `StreamExecution.processAllAvailable`. Here is an execu

spark git commit: [MINOR][SQL] Remove some unused imports in datasources.

2016-04-12 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 768b3d623 -> 587cd554a [MINOR][SQL] Remove some unused imports in datasources. ## What changes were proposed in this pull request? It looks like several recent commits for datasources (maybe while removing old `HadoopFsRelation` interface) mis

spark git commit: [SPARK-14554][SQL][FOLLOW-UP] use checkDataset to check the result

2016-04-12 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 587cd554a -> a5f8c9b15 [SPARK-14554][SQL][FOLLOW-UP] use checkDataset to check the result ## What changes were proposed in this pull request? address this comment: https://github.com/apache/spark/pull/12322#discussion_r59417359 ## How wa

spark git commit: [SPARK-13992][CORE][PYSPARK][FOLLOWUP] Update OFF_HEAP semantics for Java api and Python api

2016-04-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master a5f8c9b15 -> 23f93f559 [SPARK-13992][CORE][PYSPARK][FOLLOWUP] Update OFF_HEAP semantics for Java api and Python api ## What changes were proposed in this pull request? - updated `OFF_HEAP` semantics for `StorageLevels.java` - updated `OFF