spark git commit: [SPARK-15803] [PYSPARK] Support with statement syntax for SparkSession

2016-06-17 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 2066258ef -> feeef497d [SPARK-15803] [PYSPARK] Support with statement syntax for SparkSession ## What changes were proposed in this pull request? Support with statement syntax for SparkSession in pyspark ## How was this patch tested?

spark git commit: [SPARK-15803] [PYSPARK] Support with statement syntax for SparkSession

2016-06-17 Thread davies
Repository: spark Updated Branches: refs/heads/master 4c64e88d5 -> 898cb6525 [SPARK-15803] [PYSPARK] Support with statement syntax for SparkSession ## What changes were proposed in this pull request? Support with statement syntax for SparkSession in pyspark ## How was this patch tested?
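Under the hood, SPARK-15803 amounts to implementing Python's context-manager protocol (`__enter__`/`__exit__`) on the session so that `stop()` runs automatically when the `with` block exits. A minimal sketch of that protocol, using a stand-in class rather than the real `pyspark.sql.SparkSession` (which assumes a live Spark installation):

```python
# Sketch of the context-manager protocol the PR adds to SparkSession.
# FakeSession is a hypothetical stand-in for illustration only.
class FakeSession:
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def __enter__(self):
        # Returning self lets `with ... as spark:` bind the session.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # The session is stopped even if the block raises.
        self.stop()

with FakeSession() as spark:
    assert not spark.stopped  # still live inside the block

assert spark.stopped  # stop() ran automatically on block exit
```

Because `__exit__` returns `None`, exceptions raised inside the block still propagate after the session is stopped.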

spark git commit: [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/master d0ac0e6f4 -> 4c64e88d5 [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis ## What changes were proposed in this pull request? The check on the end parenthesis of the expression to parse was using the wrong
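For context, PySpark can parse a `SparseVector` from a string such as `(3,[0,2],[1.0,3.0])`, and the bug concerned the assertion that the trimmed expression really ends with a closing parenthesis. A hedged sketch of that kind of delimiter check (an illustrative helper, not the actual pyspark parser):

```python
def check_parens(s):
    # Strip surrounding whitespace before inspecting the delimiters;
    # checking the wrong position on the untrimmed string is the kind
    # of bug this class of assertion is prone to.
    s = s.strip()
    if not s.startswith("("):
        raise ValueError("expression must start with '('")
    if not s.endswith(")"):
        raise ValueError("expression must end with ')'")
    return s[1:-1]  # inner payload: "size,[indices],[values]"

inner = check_parens("  (3,[0,2],[1.0,3.0])  ")
```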

spark git commit: [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 fd053892c -> 3f1d730e9 [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis ## What changes were proposed in this pull request? The check on the end parenthesis of the expression to parse was using the wrong

spark git commit: [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/branch-2.0 2859ea3ec -> 2066258ef [SPARK-16035][PYSPARK] Fix SparseVector parser assertion for end parenthesis ## What changes were proposed in this pull request? The check on the end parenthesis of the expression to parse was using the wrong

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8c198e246 -> d0ac0e6f4 [SPARK-16020][SQL] Fix complete mode aggregation with console sink ## What changes were proposed in this pull request? We cannot use `limit` on DataFrame in ConsoleSink because it will use the wrong planner. This PR

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8b7e56121 -> 2859ea3ec [SPARK-16020][SQL] Fix complete mode aggregation with console sink ## What changes were proposed in this pull request? We cannot use `limit` on DataFrame in ConsoleSink because it will use the wrong planner. This

spark git commit: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 0a8fd2eb8 -> 8b7e56121 [SPARK-15159][SPARKR] SparkR SparkSession API ## What changes were proposed in this pull request? This PR introduces the new SparkSession API for SparkR. `sparkR.session.getOrCreate()` and

spark git commit: [SPARK-15946][MLLIB] Conversion between old/new vector columns in a DataFrame (Python)

2016-06-17 Thread yliang
Repository: spark Updated Branches: refs/heads/master af2a4b082 -> edb23f9e4 [SPARK-15946][MLLIB] Conversion between old/new vector columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements Python wrappers for #13662 to convert old/new vector

spark git commit: [SPARK-15129][R][DOC] R API changes in ML

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/branch-2.0 57feaa572 -> f0de45cb1 [SPARK-15129][R][DOC] R API changes in ML ## What changes were proposed in this pull request? Make user guide changes to SparkR documentation for all changes that happened in 2.0 to Machine Learning APIs

spark git commit: [SPARK-15129][R][DOC] R API changes in ML

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/master 10b671447 -> af2a4b082 [SPARK-15129][R][DOC] R API changes in ML ## What changes were proposed in this pull request? Make user guide changes to SparkR documentation for all changes that happened in 2.0 to Machine Learning APIs Author:

spark git commit: [SPARK-15892][ML] Backport correctly merging AFTAggregators to branch 1.6

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 e530823dd -> fd053892c [SPARK-15892][ML] Backport correctly merging AFTAggregators to branch 1.6 ## What changes were proposed in this pull request? This PR backports https://github.com/apache/spark/pull/13619. The original test

spark git commit: [SPARK-16033][SQL] insertInto() can't be used together with partitionBy()

2016-06-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ebb9a3b6f -> 10b671447 [SPARK-16033][SQL] insertInto() can't be used together with partitionBy() ## What changes were proposed in this pull request? When inserting into an existing partitioned table, partitioning columns should always be

spark git commit: [SPARK-15916][SQL] JDBC filter push down should respect operator precedence

2016-06-17 Thread lian
Repository: spark Updated Branches: refs/heads/master 7d65a0db4 -> ebb9a3b6f [SPARK-15916][SQL] JDBC filter push down should respect operator precedence ## What changes were proposed in this pull request? This PR fixes the problem that the precedence order is messed up when pushing
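The precedence hazard is easy to see when a filter tree is compiled to a SQL WHERE string: without parentheses, `Or(a, And(b, c))` and `And(Or(a, b), c)` both flatten to `a OR b AND c`, and SQL's precedence (AND binds tighter than OR) silently changes one of them. A hedged sketch of parenthesized compilation (illustrative tuples, not Spark's actual filter classes):

```python
# Compile a tiny filter tree to a SQL WHERE clause, wrapping each
# compound operand in parentheses so the tree shape survives.
def compile_filter(f):
    kind = f[0]
    if kind == "eq":
        _, col, val = f
        return f"{col} = {val}"
    op = " AND " if kind == "and" else " OR "
    left, right = compile_filter(f[1]), compile_filter(f[2])
    return f"({left}){op}({right})"

# (a = 1 OR b = 2) AND c = 3 -- parentheses preserve the intended grouping
tree = ("and", ("or", ("eq", "a", 1), ("eq", "b", 2)), ("eq", "c", 3))
where = compile_filter(tree)
```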

spark git commit: [SPARK-15916][SQL] JDBC filter push down should respect operator precedence

2016-06-17 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 ca0802fd5 -> b22b20db6 [SPARK-15916][SQL] JDBC filter push down should respect operator precedence ## What changes were proposed in this pull request? This PR fixes the problem that the precedence order is messed up when pushing

spark git commit: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread shivaram
Repository: spark Updated Branches: refs/heads/master ef3cc4fc0 -> 7d65a0db4 [SPARK-16005][R] Add `randomSplit` to SparkR ## What changes were proposed in this pull request? This PR adds `randomSplit` to SparkR for API parity. ## How was this patch tested? Pass the Jenkins tests (with new
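`randomSplit` divides a DataFrame according to a vector of weights. Its semantics can be sketched in plain Python: normalize the weights into cumulative probabilities and assign each row to one split independently (an illustration only, not how Spark implements the sampling):

```python
import random

def random_split(rows, weights, seed=None):
    # Normalize weights into cumulative probability bounds.
    rng = random.Random(seed)
    total = float(sum(weights))
    bounds, acc = [], 0.0
    for w in weights:
        acc += w / total
        bounds.append(acc)
    bounds[-1] = 1.0  # guard against floating-point rounding
    # Assign each row to the first split whose bound covers its draw.
    splits = [[] for _ in weights]
    for row in rows:
        r = rng.random()
        for i, b in enumerate(bounds):
            if r <= b:
                splits[i].append(row)
                break
    return splits

# An 80/20 split; weights need not sum to 1.
train, test = random_split(list(range(1000)), [8, 2], seed=42)
```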

spark git commit: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 d4bb9a3ff -> ca0802fd5 [SPARK-16005][R] Add `randomSplit` to SparkR ## What changes were proposed in this pull request? This PR adds `randomSplit` to SparkR for API parity. ## How was this patch tested? Pass the Jenkins tests (with

spark git commit: [SPARK-15925][SPARKR] R DataFrame add back registerTempTable, add tests

2016-06-17 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 2e5211ff9 -> d4bb9a3ff [SPARK-15925][SPARKR] R DataFrame add back registerTempTable, add tests ## What changes were proposed in this pull request? Add registerTempTable to DataFrame, marked as deprecated ## How was this patch tested? unit

spark git commit: [SPARK-15925][SPARKR] R DataFrame add back registerTempTable, add tests

2016-06-17 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 1a65e62a7 -> ef3cc4fc0 [SPARK-15925][SPARKR] R DataFrame add back registerTempTable, add tests ## What changes were proposed in this pull request? Add registerTempTable to DataFrame, marked as deprecated ## How was this patch tested? unit

spark git commit: [SPARK-16014][SQL] Rename optimizer rules to be more consistent

2016-06-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 62d8fe208 -> 1a65e62a7 [SPARK-16014][SQL] Rename optimizer rules to be more consistent ## What changes were proposed in this pull request? This small patch renames a few optimizer rules to make the naming more consistent, e.g. class name

spark git commit: [SPARK-16014][SQL] Rename optimizer rules to be more consistent

2016-06-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 0701b8d95 -> 2e5211ff9 [SPARK-16014][SQL] Rename optimizer rules to be more consistent ## What changes were proposed in this pull request? This small patch renames a few optimizer rules to make the naming more consistent, e.g. class

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 3457497e0 -> 0701b8d95 [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver ## What changes were proposed in this pull request? [SPARK-15395](https://issues.apache.org/jira/browse/SPARK-15395) changes the

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 298c4ae81 -> 62d8fe208 [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver ## What changes were proposed in this pull request? [SPARK-15395](https://issues.apache.org/jira/browse/SPARK-15395) changes the

spark git commit: [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in NodeManager

2016-06-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 269b715e4 -> 3457497e0 [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in NodeManager ## What changes were proposed in this pull request? Shade the netty.io namespace so that we can use it in shuffle independent of the

spark git commit: [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in NodeManager

2016-06-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c8809db5a -> 298c4ae81 [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in NodeManager ## What changes were proposed in this pull request? Shade the netty.io namespace so that we can use it in shuffle independent of the dependencies

spark git commit: [SPARK-16008][ML] Remove unnecessary serialization in logistic regression

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/branch-2.0 de964e419 -> 269b715e4 [SPARK-16008][ML] Remove unnecessary serialization in logistic regression JIRA: [SPARK-16008](https://issues.apache.org/jira/browse/SPARK-16008) ## What changes were proposed in this pull request?

spark git commit: [SPARK-16008][ML] Remove unnecessary serialization in logistic regression

2016-06-17 Thread meng
Repository: spark Updated Branches: refs/heads/master 34d6c4cd1 -> 1f0a46958 [SPARK-16008][ML] Remove unnecessary serialization in logistic regression JIRA: [SPARK-16008](https://issues.apache.org/jira/browse/SPARK-16008) ## What changes were proposed in this pull request?

spark git commit: Remove non-obvious conf settings from TPCDS benchmark

2016-06-17 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master ef43b4ed8 -> 34d6c4cd1 Remove non-obvious conf settings from TPCDS benchmark ## What changes were proposed in this pull request? My fault -- these 2 conf entries are mysteriously hidden inside the benchmark code and make it non-obvious

spark git commit: [SPARK-15811][SQL] fix the Python UDF in Scala 2.10

2016-06-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 b82abde06 -> 3c3865d0b [SPARK-15811][SQL] fix the Python UDF in Scala 2.10 ## What changes were proposed in this pull request? Iterator can't be serialized in Scala 2.10; we should force it into an array to make sure it can be serialized. ## How

spark git commit: [SPARK-15811][SQL] fix the Python UDF in Scala 2.10

2016-06-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master e5d703bca -> ef43b4ed8 [SPARK-15811][SQL] fix the Python UDF in Scala 2.10 ## What changes were proposed in this pull request? Iterator can't be serialized in Scala 2.10; we should force it into an array to make sure it can be serialized. ## How was
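The same failure mode exists in Python: a live iterator or generator cannot be pickled, so serialization code must materialize it first. A hedged analogy, with Python's `pickle` standing in for Scala 2.10's Java serialization:

```python
import pickle

def rows():
    yield from (1, 2, 3)

# A live generator carries execution state and cannot be pickled,
# much like a Scala 2.10 Iterator is not serializable.
try:
    pickle.dumps(rows())
    pickling_failed = False
except TypeError:
    pickling_failed = True

# Materializing the iterator into a concrete sequence makes it
# serializable, mirroring the patch's approach of forcing the
# Iterator into an Array before shipping it.
restored = pickle.loads(pickle.dumps(list(rows())))
```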