spark git commit: [SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R

2016-06-20 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 4fc4eb943 -> dbf7f48b6 [SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R ## What changes were proposed in this pull request? Update doc as per discussion in PR #13592 ## How was this patch

spark git commit: [SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R

2016-06-20 Thread lian
Repository: spark Updated Branches: refs/heads/master 07367533d -> 58f6e27dd [SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R ## What changes were proposed in this pull request? Update doc as per discussion in PR #13592 ## How was this patch
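The two entries above track the same documentation change landing on branch-2.0 and master: the SQL programming guide now describes the Spark 2.0 sparkSession entry point for R. As a hedged illustration of that entry point (shown in PySpark rather than SparkR; the app name is made up):

```python
from pyspark.sql import SparkSession

# Spark 2.0 entry point; replaces the SQLContext/HiveContext pattern from 1.x.
# The app name is illustrative.
spark = (SparkSession.builder
         .appName("programming-guide-example")
         .getOrCreate())

spark.range(10).show()  # small sanity check that the session works
```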

spark git commit: [SPARK-16025][CORE] Document OFF_HEAP storage level in 2.0

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 12f00b6ed -> 4fc4eb943 [SPARK-16025][CORE] Document OFF_HEAP storage level in 2.0 This has changed from 1.6: data is now stored off-heap using Spark's own off-heap support instead of in Tachyon. Author: Eric Liang

spark git commit: [SPARK-16025][CORE] Document OFF_HEAP storage level in 2.0

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4f7f1c436 -> 07367533d [SPARK-16025][CORE] Document OFF_HEAP storage level in 2.0 This has changed from 1.6: data is now stored off-heap using Spark's own off-heap support instead of in Tachyon. Author: Eric Liang
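Both branches pick up the same doc fix: OFF_HEAP now means Spark's built-in off-heap memory rather than an external Tachyon store. A minimal PySpark sketch of using the level, assuming off-heap memory is enabled (the 1g size is illustrative):

```python
from pyspark import StorageLevel
from pyspark.sql import SparkSession

# Off-heap memory must be switched on explicitly; 1g is an illustrative size.
spark = (SparkSession.builder
         .config("spark.memory.offHeap.enabled", "true")
         .config("spark.memory.offHeap.size", "1g")
         .getOrCreate())

df = spark.range(1000000)
df.persist(StorageLevel.OFF_HEAP)  # held via Spark's own off-heap support
df.count()
```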

spark git commit: [SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9d513b8d2 -> 12f00b6ed [SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD ## What changes were proposed in this pull request? This PR makes `input_file_name()` function return the file

spark git commit: [SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 18a8a9b1f -> 4f7f1c436 [SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD ## What changes were proposed in this pull request? This PR makes `input_file_name()` function return the file paths
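The fix (applied to both branches above) makes input_file_name() return real file paths for sources built on NewHadoopRDD instead of empty strings. A short PySpark usage sketch with a placeholder input path:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import input_file_name

spark = SparkSession.builder.getOrCreate()

# Attach the originating file path to every row; "data/events/*.json" is a placeholder.
df = (spark.read.json("data/events/*.json")
      .withColumn("source_file", input_file_name()))
df.select("source_file").distinct().show(truncate=False)
```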

spark git commit: [SPARK-16074][MLLIB] expose VectorUDT/MatrixUDT in a public API

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master d9a3a2a0b -> 18a8a9b1f [SPARK-16074][MLLIB] expose VectorUDT/MatrixUDT in a public API ## What changes were proposed in this pull request? Both VectorUDT and MatrixUDT are private APIs, because UserDefinedType itself is private in Spark.

spark git commit: [SPARK-16074][MLLIB] expose VectorUDT/MatrixUDT in a public API

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 b998c33c0 -> 9d513b8d2 [SPARK-16074][MLLIB] expose VectorUDT/MatrixUDT in a public API ## What changes were proposed in this pull request? Both VectorUDT and MatrixUDT are private APIs, because UserDefinedType itself is private in
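The patch itself targets the Scala side, exposing the vector/matrix UDTs publicly. As a loose PySpark analogue only (not the API added here), a vector column's SQL type can be used when building a schema by hand:

```python
from pyspark.ml.linalg import Vectors, VectorUDT
from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType

spark = SparkSession.builder.getOrCreate()

# Declare a DataFrame schema that carries an ML vector column.
schema = StructType([StructField("features", VectorUDT(), nullable=False)])
df = spark.createDataFrame([(Vectors.dense([1.0, 2.0, 3.0]),)], schema)
df.printSchema()
```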

spark git commit: [SPARK-16056][SPARK-16057][SPARK-16058][SQL] Fix Multiple Bugs in Column Partitioning in JDBC Source

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 603424c16 -> b998c33c0 [SPARK-16056][SPARK-16057][SPARK-16058][SQL] Fix Multiple Bugs in Column Partitioning in JDBC Source ## What changes were proposed in this pull request? This PR is to fix the following bugs: **Issue 1: Wrong

spark git commit: [SPARK-16056][SPARK-16057][SPARK-16058][SQL] Fix Multiple Bugs in Column Partitioning in JDBC Source

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master c775bf09e -> d9a3a2a0b [SPARK-16056][SPARK-16057][SPARK-16058][SQL] Fix Multiple Bugs in Column Partitioning in JDBC Source ## What changes were proposed in this pull request? This PR is to fix the following bugs: **Issue 1: Wrong
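The bugs concern column-based partitioning of JDBC reads. A hedged sketch of the reader API involved; the connection URL, table, and bounds below are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Split the read into 4 partitions on the numeric column "id";
# the connection details are placeholders.
df = spark.read.jdbc(
    url="jdbc:postgresql://dbhost:5432/shop",
    table="orders",
    column="id",
    lowerBound=1,
    upperBound=100000,
    numPartitions=4,
    properties={"user": "reader", "password": "secret"},
)
print(df.rdd.getNumPartitions())
```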

spark git commit: [SPARK-13792][SQL] Limit logging of bad records in CSV data source

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 10c476fc8 -> 603424c16 [SPARK-13792][SQL] Limit logging of bad records in CSV data source ## What changes were proposed in this pull request? This pull request adds a new option (maxMalformedLogPerPartition) in CSV reader to limit the

spark git commit: [SPARK-13792][SQL] Limit logging of bad records in CSV data source

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 217db56ba -> c775bf09e [SPARK-13792][SQL] Limit logging of bad records in CSV data source ## What changes were proposed in this pull request? This pull request adds a new option (maxMalformedLogPerPartition) in CSV reader to limit the
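The new maxMalformedLogPerPartition option caps how many malformed CSV records are logged per partition. A minimal sketch with an illustrative path and limit:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Log at most 10 malformed records per partition instead of flooding the logs;
# "data/input.csv" is a placeholder path.
df = (spark.read
      .option("header", "true")
      .option("maxMalformedLogPerPartition", "10")
      .csv("data/input.csv"))
df.show()
```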

spark git commit: [SPARK-15294][R] Add `pivot` to SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 087bd2799 -> 10c476fc8 [SPARK-15294][R] Add `pivot` to SparkR ## What changes were proposed in this pull request? This PR adds `pivot` function to SparkR for API parity. Since this PR is based on

spark git commit: [SPARK-15294][R] Add `pivot` to SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master a46553cba -> 217db56ba [SPARK-15294][R] Add `pivot` to SparkR ## What changes were proposed in this pull request? This PR adds `pivot` function to SparkR for API parity. Since this PR is based on
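The commit adds pivot to SparkR for parity with the existing DataFrame API. For reference, the equivalent PySpark call on a grouped DataFrame (column names and values are made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(2015, "Java", 20000), (2015, "Python", 22000), (2016, "Java", 25000)],
    ["year", "course", "earnings"],
)

# One output column per course, summing earnings within each year.
df.groupBy("year").pivot("course").sum("earnings").show()
```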

spark git commit: [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6)

2016-06-20 Thread davies
Repository: spark Updated Branches: refs/heads/master e2b7eba87 -> a46553cba [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6) Fix the bug for Python UDF that does not have any arguments. Added regression tests. Author: Davies Liu Closes #13793 from

spark git commit: [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6)

2016-06-20 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 f57317690 -> 087bd2799 [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6) Fix the bug for Python UDF that does not have any arguments. Added regression tests. Author: Davies Liu Closes #13793 from

spark git commit: [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6)

2016-06-20 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 1891e04a6 -> 6001138fd [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6) ## What changes were proposed in this pull request? Fix the bug for Python UDF that does not have any arguments. ## How was this patch tested?

spark git commit: [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6)

2016-06-20 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 db86e7fd2 -> abe36c53d [SPARK-16086] [SQL] fix Python UDF without arguments (for 1.6) ## What changes were proposed in this pull request? Fix the bug for Python UDF that does not have any arguments. ## How was this patch tested?
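The four entries above backport the same fix (master, branch-2.0, branch-1.5, branch-1.6): a Python UDF that takes no arguments now works. A minimal reproduction-style sketch; the constant it returns is arbitrary:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# A UDF with zero arguments -- the case the fix covers.
constant_label = udf(lambda: "generated", StringType())

df = spark.range(3).withColumn("label", constant_label())
df.show()
```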

spark git commit: remove duplicated docs in dapply

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 c7006538a -> f57317690 remove duplicated docs in dapply ## What changes were proposed in this pull request? Removed unnecessary duplicated documentation in dapply and dapplyCollect. In this pull request I created separate R docs for

spark git commit: remove duplicated docs in dapply

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master a42bf5553 -> e2b7eba87 remove duplicated docs in dapply ## What changes were proposed in this pull request? Removed unnecessary duplicated documentation in dapply and dapplyCollect. In this pull request I created separate R docs for

spark git commit: [SPARK-16079][PYSPARK][ML] Added missing import for DecisionTreeRegressionModel used in GBTClassificationModel

2016-06-20 Thread meng
Repository: spark Updated Branches: refs/heads/branch-2.0 b40663541 -> c7006538a [SPARK-16079][PYSPARK][ML] Added missing import for DecisionTreeRegressionModel used in GBTClassificationModel ## What changes were proposed in this pull request? Fixed missing import for

spark git commit: [SPARK-16079][PYSPARK][ML] Added missing import for DecisionTreeRegressionModel used in GBTClassificationModel

2016-06-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 6daa8cf1a -> a42bf5553 [SPARK-16079][PYSPARK][ML] Added missing import for DecisionTreeRegressionModel used in GBTClassificationModel ## What changes were proposed in this pull request? Fixed missing import for

spark git commit: [SPARK-16061][SQL][MINOR] The property "spark.streaming.stateStore.maintenanceInterval" should be renamed to "spark.sql.streaming.stateStore.maintenanceInterval"

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master b99129cc4 -> 6daa8cf1a [SPARK-16061][SQL][MINOR] The property "spark.streaming.stateStore.maintenanceInterval" should be renamed to "spark.sql.streaming.stateStore.maintenanceInterval" ## What changes were proposed in this pull request?

spark git commit: [SPARK-16061][SQL][MINOR] The property "spark.streaming.stateStore.maintenanceInterval" should be renamed to "spark.sql.streaming.stateStore.maintenanceInterval"

2016-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 54001cb12 -> b40663541 [SPARK-16061][SQL][MINOR] The property "spark.streaming.stateStore.maintenanceInterval" should be renamed to "spark.sql.streaming.stateStore.maintenanceInterval" ## What changes were proposed in this pull
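Because the property moved under the spark.sql. namespace, code tuning the state store maintenance interval should switch to the new key. An illustrative setting; the 60-second value is arbitrary:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Renamed key: the old spark.streaming.stateStore.maintenanceInterval no longer applies.
spark.conf.set("spark.sql.streaming.stateStore.maintenanceInterval", "60s")
```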

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8159da20e -> 54001cb12 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. -

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6df8e3886 -> b99129cc4 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. - `text()`
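The harmonization makes text/csv/json/parquet/orc treat input paths consistently. A hedged PySpark sketch of passing several paths to one reader call; the paths are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Multiple input locations handed to the same reader call; paths are placeholders.
parquet_df = spark.read.parquet("data/day=2016-06-19", "data/day=2016-06-20")
json_df = spark.read.json(["logs/a.json", "logs/b.json"])

print(parquet_df.count(), json_df.count())
```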

spark git commit: [SPARK-15863][SQL][DOC] Initial SQL programming guide update for Spark 2.0

2016-06-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 54aef1c14 -> 8159da20e [SPARK-15863][SQL][DOC] Initial SQL programming guide update for Spark 2.0 ## What changes were proposed in this pull request? Initial SQL programming guide update for Spark 2.0. Contents like 1.6 to 2.0

spark git commit: [SPARK-15863][SQL][DOC] Initial SQL programming guide update for Spark 2.0

2016-06-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d0eddb80e -> 6df8e3886 [SPARK-15863][SQL][DOC] Initial SQL programming guide update for Spark 2.0 ## What changes were proposed in this pull request? Initial SQL programming guide update for Spark 2.0. Contents like 1.6 to 2.0 migration

[1/2] spark git commit: [SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 f90b2ea1d -> 54aef1c14 http://git-wip-us.apache.org/repos/asf/spark/blob/54aef1c1/R/pkg/R/mllib.R -- diff --git a/R/pkg/R/mllib.R b/R/pkg/R/mllib.R index

[1/2] spark git commit: [SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 92514232e -> d0eddb80e http://git-wip-us.apache.org/repos/asf/spark/blob/d0eddb80/R/pkg/R/mllib.R -- diff --git a/R/pkg/R/mllib.R b/R/pkg/R/mllib.R index 2127dae..d6ff2aa

[2/2] spark git commit: [SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods

2016-06-20 Thread shivaram
[SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods ## What changes were proposed in this pull request? This PR adds `since` tags to Roxygen documentation according to the previous documentation archive. https://home.apache.org/~dongjoon/spark-2.0.0-docs/api/R/ ##

[2/2] spark git commit: [SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods

2016-06-20 Thread shivaram
[SPARK-14995][R] Add `since` tag in Roxygen documentation for SparkR API methods ## What changes were proposed in this pull request? This PR adds `since` tags to Roxygen documentation according to the previous documentation archive. https://home.apache.org/~dongjoon/spark-2.0.0-docs/api/R/ ##

spark git commit: [MINOR] Closing stale pull requests.

2016-06-20 Thread srowen
Repository: spark Updated Branches: refs/heads/master 359c2e827 -> 92514232e [MINOR] Closing stale pull requests. Closes #13114 Closes #10187 Closes #13432 Closes #13550 Author: Sean Owen Closes #13781 from srowen/CloseStalePR. Project:

spark git commit: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, programming guide, example updates

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 45c41aa33 -> f90b2ea1d [SPARK-15159][SPARKR] SparkSession roxygen2 doc, programming guide, example updates ## What changes were proposed in this pull request? roxygen2 doc, programming guide, example updates ## How was this patch

spark git commit: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, programming guide, example updates

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master b0f2fb5b9 -> 359c2e827 [SPARK-15159][SPARKR] SparkSession roxygen2 doc, programming guide, example updates ## What changes were proposed in this pull request? roxygen2 doc, programming guide, example updates ## How was this patch

spark git commit: [SPARK-16053][R] Add `spark_partition_id` in SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 dfa920204 -> 45c41aa33 [SPARK-16053][R] Add `spark_partition_id` in SparkR ## What changes were proposed in this pull request? This PR adds `spark_partition_id` virtual column function in SparkR for API parity. The following is just

spark git commit: [SPARK-16053][R] Add `spark_partition_id` in SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master aee1420ec -> b0f2fb5b9 [SPARK-16053][R] Add `spark_partition_id` in SparkR ## What changes were proposed in this pull request? This PR adds `spark_partition_id` virtual column function in SparkR for API parity. The following is just an
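spark_partition_id exposes the partition a row lives in; this commit adds the SparkR binding. The PySpark equivalent, shown for reference:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import spark_partition_id

spark = SparkSession.builder.getOrCreate()

# Tag each row with the ID of the partition that holds it.
df = spark.range(8).repartition(4).withColumn("pid", spark_partition_id())
df.groupBy("pid").count().show()
```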

spark git commit: [SPARK-15613] [SQL] Fix incorrect days to millis conversion due to Daylight Saving Time

2016-06-20 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 16b7f1dfc -> db86e7fd2 [SPARK-15613] [SQL] Fix incorrect days to millis conversion due to Daylight Saving Time Internally, we use Int to represent a date (the days since 1970-01-01), when we convert that into unix timestamp
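The underlying bug: treating a date as days-since-1970 times 86,400,000 ms ignores DST, so dates near a transition shift when converted to timestamps. A toy plain-Python sketch of the drift (not Spark's internal code; run it in a timezone that observes DST):

```python
import time
from datetime import date, datetime

EPOCH = date(1970, 1, 1)

def days_to_millis_naive(d):
    # Pretends every day is exactly 86,400,000 ms long.
    return (d - EPOCH).days * 86400000

def days_to_millis_local(d):
    # Goes through the local calendar, so DST transitions are honoured.
    return int(time.mktime(datetime(d.year, d.month, d.day).timetuple())) * 1000

# Around the 2016-03-13 US transition the two conversions drift by one hour.
for d in (date(2016, 3, 12), date(2016, 3, 14)):
    print(d, days_to_millis_naive(d) - days_to_millis_local(d))
```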

spark git commit: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 d2c94e6a4 -> dfa920204 [SPARKR] fix R roxygen2 doc for count on GroupedData ## What changes were proposed in this pull request? fix code doc ## How was this patch tested? manual shivaram Author: Felix Cheung

spark git commit: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 46d98e0a1 -> aee1420ec [SPARKR] fix R roxygen2 doc for count on GroupedData ## What changes were proposed in this pull request? fix code doc ## How was this patch tested? manual shivaram Author: Felix Cheung

spark git commit: [SPARK-16028][SPARKR] spark.lapply can work with active context

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 ead872e49 -> d2c94e6a4 [SPARK-16028][SPARKR] spark.lapply can work with active context ## What changes were proposed in this pull request? spark.lapply and setLogLevel ## How was this patch tested? unit test shivaram thunterdb

spark git commit: [SPARK-16028][SPARKR] spark.lapply can work with active context

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master c44bf137c -> 46d98e0a1 [SPARK-16028][SPARKR] spark.lapply can work with active context ## What changes were proposed in this pull request? spark.lapply and setLogLevel ## How was this patch tested? unit test shivaram thunterdb Author:

spark git commit: [SPARK-16051][R] Add `read.orc/write.orc` to SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 36e812d4b -> c44bf137c [SPARK-16051][R] Add `read.orc/write.orc` to SparkR ## What changes were proposed in this pull request? This issue adds `read.orc/write.orc` to SparkR for API parity. ## How was this patch tested? Pass the Jenkins

spark git commit: [SPARK-16051][R] Add `read.orc/write.orc` to SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 5b22e34e9 -> ead872e49 [SPARK-16051][R] Add `read.orc/write.orc` to SparkR ## What changes were proposed in this pull request? This issue adds `read.orc/write.orc` to SparkR for API parity. ## How was this patch tested? Pass the
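The commit adds ORC read/write to SparkR. The matching PySpark calls, with a placeholder output path (ORC in 2.0 generally needs a Hive-enabled session):

```python
from pyspark.sql import SparkSession

# ORC support in Spark 2.0 typically requires Hive support to be enabled.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

df = spark.range(100).withColumnRenamed("id", "value")
df.write.mode("overwrite").orc("data/example_orc")  # placeholder output path
round_trip = spark.read.orc("data/example_orc")
round_trip.show(5)
```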

spark git commit: [SPARK-16029][SPARKR] SparkR add dropTempView and deprecate dropTempTable

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 bb80d1c24 -> 5b22e34e9 [SPARK-16029][SPARKR] SparkR add dropTempView and deprecate dropTempTable ## What changes were proposed in this pull request? Add dropTempView and deprecate dropTempTable ## How was this patch tested? unit

spark git commit: [SPARK-16029][SPARKR] SparkR add dropTempView and deprecate dropTempTable

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 961342489 -> 36e812d4b [SPARK-16029][SPARKR] SparkR add dropTempView and deprecate dropTempTable ## What changes were proposed in this pull request? Add dropTempView and deprecate dropTempTable ## How was this patch tested? unit tests
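dropTempView supersedes the deprecated dropTempTable, matching the temporary-view terminology used since 2.0. The PySpark counterpart of the pattern, with an illustrative view name:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(5)
df.createOrReplaceTempView("numbers")           # register under a temporary name
spark.sql("SELECT count(*) AS n FROM numbers").show()
spark.catalog.dropTempView("numbers")           # preferred over dropTempTable
```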

spark git commit: [SPARK-16059][R] Add `monotonically_increasing_id` function in SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 363db9f8b -> bb80d1c24 [SPARK-16059][R] Add `monotonically_increasing_id` function in SparkR ## What changes were proposed in this pull request? This PR adds `monotonically_increasing_id` column function in SparkR for API parity.

spark git commit: [SPARK-16059][R] Add `monotonically_increasing_id` function in SparkR

2016-06-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 5cfabec87 -> 961342489 [SPARK-16059][R] Add `monotonically_increasing_id` function in SparkR ## What changes were proposed in this pull request? This PR adds `monotonically_increasing_id` column function in SparkR for API parity. After
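monotonically_increasing_id generates IDs that are unique and increasing but not consecutive across partitions; this commit adds the SparkR binding. PySpark equivalent for reference:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import monotonically_increasing_id

spark = SparkSession.builder.getOrCreate()

# IDs are unique and monotonically increasing, but not gap-free across partitions.
df = spark.range(6).repartition(3).withColumn("row_id", monotonically_increasing_id())
df.show()
```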

spark git commit: [SPARK-16050][TESTS] Remove the flaky test: ConsoleSinkSuite

2016-06-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 905f774b7 -> 5cfabec87 [SPARK-16050][TESTS] Remove the flaky test: ConsoleSinkSuite ## What changes were proposed in this pull request? ConsoleSinkSuite just collects content from stdout and compare them with the expected string.

spark git commit: [SPARK-16050][TESTS] Remove the flaky test: ConsoleSinkSuite

2016-06-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 0b0b5fe54 -> 363db9f8b [SPARK-16050][TESTS] Remove the flaky test: ConsoleSinkSuite ## What changes were proposed in this pull request? ConsoleSinkSuite just collects content from stdout and compare them with the expected string.

spark git commit: [SPARK-14391][LAUNCHER] Fix launcher communication test, take 2.

2016-06-20 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.6 208348595 -> 16b7f1dfc [SPARK-14391][LAUNCHER] Fix launcher communication test, take 2. There's actually a race here: the state of the handler was changed before the connection was set, so the test code could be notified of the state

spark git commit: [SPARK-16030][SQL] Allow specifying static partitions when inserting to data source tables

2016-06-20 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 19397caab -> 0b0b5fe54 [SPARK-16030][SQL] Allow specifying static partitions when inserting to data source tables ## What changes were proposed in this pull request? This PR adds the static partition support to INSERT statement when

spark git commit: [SPARK-16030][SQL] Allow specifying static partitions when inserting to data source tables

2016-06-20 Thread lian
Repository: spark Updated Branches: refs/heads/master 6d0f921ae -> 905f774b7 [SPARK-16030][SQL] Allow specifying static partitions when inserting to data source tables ## What changes were proposed in this pull request? This PR adds the static partition support to INSERT statement when the
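With this change, an INSERT into a partitioned data source table can pin some partition columns to constants. A hedged sketch issued through spark.sql; the table, column, and date literal are made up and the table is assumed to already exist:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Assumes a partitioned data source table `events(msg STRING)` partitioned by `ds`
# already exists; every inserted row lands in the static partition ds='2016-06-20'.
spark.sql("""
  INSERT INTO TABLE events PARTITION (ds = '2016-06-20')
  SELECT 'hello' AS msg
""")
```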