svn commit: r30445 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_26_18_03-313a1f0-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Sat Oct 27 01:19:12 2018 New Revision: 30445 Log: Apache Spark 2.4.1-SNAPSHOT-2018_10_26_18_03-313a1f0 docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50 parts, so it was shortened to this summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r30444 - in /dev/spark/2.3.3-SNAPSHOT-2018_10_26_18_02-3afb3a2-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Sat Oct 27 01:17:32 2018 New Revision: 30444 Log: Apache Spark 2.3.3-SNAPSHOT-2018_10_26_18_02-3afb3a2 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 parts, so it was shortened to this summary.]
svn commit: r30434 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_26_16_02-e9b71c8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 23:17:18 2018 New Revision: 30434 Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_26_16_02-e9b71c8 docs [This commit notification would consist of 1473 parts, which exceeds the limit of 50 parts, so it was shortened to this summary.]
spark git commit: [SPARK-25828][K8S] Bumping Kubernetes-Client version to 4.1.0
Repository: spark Updated Branches: refs/heads/master ca545f794 -> e9b71c8f0 [SPARK-25828][K8S] Bumping Kubernetes-Client version to 4.1.0 ## What changes were proposed in this pull request? Changed the `kubernetes-client` version and refactored code that broke as a result ## How was this patch tested? Unit and Integration tests Closes #22820 from ifilonenko/SPARK-25828. Authored-by: Ilan Filonenko Signed-off-by: Erik Erlandson Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e9b71c8f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e9b71c8f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e9b71c8f Branch: refs/heads/master Commit: e9b71c8f017d2da3b9ae586017b2e5a040f023d2 Parents: ca545f7 Author: Ilan Filonenko Authored: Fri Oct 26 15:59:12 2018 -0700 Committer: Erik Erlandson Committed: Fri Oct 26 15:59:12 2018 -0700 -- dev/deps/spark-deps-hadoop-2.7 | 6 +++--- dev/deps/spark-deps-hadoop-3.1 | 6 +++--- docs/running-on-kubernetes.md | 3 ++- resource-managers/kubernetes/core/pom.xml | 2 +- .../scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 6 +++--- .../spark/deploy/k8s/features/MountVolumesFeatureStep.scala| 3 ++- .../spark/deploy/k8s/submit/LoggingPodStatusWatcher.scala | 2 +- .../scheduler/cluster/k8s/ExecutorLifecycleTestUtils.scala | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- 9 files changed, 17 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/e9b71c8f/dev/deps/spark-deps-hadoop-2.7 -- diff --git a/dev/deps/spark-deps-hadoop-2.7 b/dev/deps/spark-deps-hadoop-2.7 index 537831e..0703b5b 100644 --- a/dev/deps/spark-deps-hadoop-2.7 +++ b/dev/deps/spark-deps-hadoop-2.7 @@ -132,13 +132,13 @@ jta-1.1.jar jtransforms-2.4.0.jar jul-to-slf4j-1.7.16.jar kryo-shaded-4.0.2.jar -kubernetes-client-3.0.0.jar -kubernetes-model-2.0.0.jar +kubernetes-client-4.1.0.jar +kubernetes-model-4.1.0.jar leveldbjni-all-1.8.jar 
libfb303-0.9.3.jar libthrift-0.9.3.jar log4j-1.2.17.jar -logging-interceptor-3.8.1.jar +logging-interceptor-3.9.1.jar lz4-java-1.5.0.jar machinist_2.11-0.6.1.jar macro-compat_2.11-1.1.1.jar http://git-wip-us.apache.org/repos/asf/spark/blob/e9b71c8f/dev/deps/spark-deps-hadoop-3.1 -- diff --git a/dev/deps/spark-deps-hadoop-3.1 b/dev/deps/spark-deps-hadoop-3.1 index bc4ef31..5139868 100644 --- a/dev/deps/spark-deps-hadoop-3.1 +++ b/dev/deps/spark-deps-hadoop-3.1 @@ -147,13 +147,13 @@ kerby-pkix-1.0.1.jar kerby-util-1.0.1.jar kerby-xdr-1.0.1.jar kryo-shaded-4.0.2.jar -kubernetes-client-3.0.0.jar -kubernetes-model-2.0.0.jar +kubernetes-client-4.1.0.jar +kubernetes-model-4.1.0.jar leveldbjni-all-1.8.jar libfb303-0.9.3.jar libthrift-0.9.3.jar log4j-1.2.17.jar -logging-interceptor-3.8.1.jar +logging-interceptor-3.9.1.jar lz4-java-1.5.0.jar machinist_2.11-0.6.1.jar macro-compat_2.11-1.1.1.jar http://git-wip-us.apache.org/repos/asf/spark/blob/e9b71c8f/docs/running-on-kubernetes.md -- diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index 60c9279..7093ee5 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -45,7 +45,8 @@ logs and remains in "completed" state in the Kubernetes API until it's eventuall Note that in the completed state, the driver pod does *not* use any computational or memory resources. -The driver and executor pod scheduling is handled by Kubernetes. It is possible to schedule the +The driver and executor pod scheduling is handled by Kubernetes. Communication to the Kubernetes API is done via fabric8, and we are +currently running kubernetes-client version 4.1.0. Make sure that when you are making infrastructure additions that you are aware of said version. It is possible to schedule the driver and executor pods on a subset of available nodes through a [node selector](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector) using the configuration property for it. 
It will be possible to use more advanced scheduling hints like [node/pod affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity) in a future release. http://git-wip-us.apache.org/repos/asf/spark/blob/e9b71c8f/resource-managers/kubernetes/core/pom.xml -- diff --git a/resource-managers/kubernetes/core/pom.xml b/resource-managers/kubernetes/core/pom.xml
spark git commit: [SPARK-25821][SQL] Remove SQLContext methods deprecated in 1.4
Repository: spark Updated Branches: refs/heads/master d325ffbf3 -> ca545f794 [SPARK-25821][SQL] Remove SQLContext methods deprecated in 1.4 ## What changes were proposed in this pull request? Remove SQLContext methods deprecated in 1.4 ## How was this patch tested? Existing tests. Closes #22815 from srowen/SPARK-25821. Authored-by: Sean Owen Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ca545f79 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ca545f79 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ca545f79 Branch: refs/heads/master Commit: ca545f79410a464ef24e3986fac225f53bb2ef02 Parents: d325ffb Author: Sean Owen Authored: Fri Oct 26 16:49:48 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:49:48 2018 -0500 -- R/pkg/NAMESPACE | 2 - R/pkg/R/SQLContext.R| 61 +--- R/pkg/tests/fulltests/test_sparkSQL.R | 25 +- docs/sparkr.md | 6 +- .../scala/org/apache/spark/sql/SQLContext.scala | 283 --- 5 files changed, 8 insertions(+), 369 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/ca545f79/R/pkg/NAMESPACE -- diff --git a/R/pkg/NAMESPACE b/R/pkg/NAMESPACE index 36d7a9b..5a5dc20 100644 --- a/R/pkg/NAMESPACE +++ b/R/pkg/NAMESPACE @@ -420,13 +420,11 @@ export("as.DataFrame", "currentDatabase", "dropTempTable", "dropTempView", - "jsonFile", "listColumns", "listDatabases", "listFunctions", "listTables", "loadDF", - "parquetFile", "read.df", "read.jdbc", "read.json", http://git-wip-us.apache.org/repos/asf/spark/blob/ca545f79/R/pkg/R/SQLContext.R -- diff --git a/R/pkg/R/SQLContext.R b/R/pkg/R/SQLContext.R index c819a7d..3f89ee9 100644 --- a/R/pkg/R/SQLContext.R +++ b/R/pkg/R/SQLContext.R @@ -343,7 +343,6 @@ setMethod("toDF", signature(x = "RDD"), #' path <- "path/to/file.json" #' df <- read.json(path) #' df <- read.json(path, multiLine = TRUE) -#' df <- jsonFile(path) #' } #' @name read.json #' @method read.json default @@ -363,51 +362,6 @@ 
read.json <- function(x, ...) { dispatchFunc("read.json(path)", x, ...) } -#' @rdname read.json -#' @name jsonFile -#' @method jsonFile default -#' @note jsonFile since 1.4.0 -jsonFile.default <- function(path) { - .Deprecated("read.json") - read.json(path) -} - -jsonFile <- function(x, ...) { - dispatchFunc("jsonFile(path)", x, ...) -} - -#' JSON RDD -#' -#' Loads an RDD storing one JSON object per string as a SparkDataFrame. -#' -#' @param sqlContext SQLContext to use -#' @param rdd An RDD of JSON string -#' @param schema A StructType object to use as schema -#' @param samplingRatio The ratio of simpling used to infer the schema -#' @return A SparkDataFrame -#' @noRd -#' @examples -#'\dontrun{ -#' sparkR.session() -#' rdd <- texFile(sc, "path/to/json") -#' df <- jsonRDD(sqlContext, rdd) -#'} - -# TODO: remove - this method is no longer exported -# TODO: support schema -jsonRDD <- function(sqlContext, rdd, schema = NULL, samplingRatio = 1.0) { - .Deprecated("read.json") - rdd <- serializeToString(rdd) - if (is.null(schema)) { -read <- callJMethod(sqlContext, "read") -# samplingRatio is deprecated -sdf <- callJMethod(read, "json", callJMethod(getJRDD(rdd), "rdd")) -dataFrame(sdf) - } else { -stop("not implemented") - } -} - #' Create a SparkDataFrame from an ORC file. #' #' Loads an ORC file, returning the result as a SparkDataFrame. @@ -434,6 +388,7 @@ read.orc <- function(path, ...) { #' Loads a Parquet file, returning the result as a SparkDataFrame. #' #' @param path path of file to read. A vector of multiple paths is allowed. +#' @param ... additional external data source specific named properties. #' @return SparkDataFrame #' @rdname read.parquet #' @name read.parquet @@ -454,20 +409,6 @@ read.parquet <- function(x, ...) { dispatchFunc("read.parquet(...)", x, ...) } -#' @param ... argument(s) passed to the method. 
-#' @rdname read.parquet -#' @name parquetFile -#' @method parquetFile default -#' @note parquetFile since 1.4.0 -parquetFile.default <- function(...) { - .Deprecated("read.parquet") - read.parquet(unlist(list(...))) -} - -parquetFile <- function(x, ...) { - dispatchFunc("parquetFile(...)", x, ...) -} - #' Create a SparkDataFrame from a text file. #' #' Loads text files and returns a SparkDataFrame whose schema starts with http://git-wip-us.apache.org/repos/asf/spark/blob/ca545f79/R/pkg/tests/fulltests/test_sparkSQL.R
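The wrappers this commit removes (`jsonFile`, `parquetFile`) all followed the same deprecation-shim shape: emit a deprecation warning, then delegate to the supported reader (`read.json`, `read.parquet`). A minimal Python sketch of that pattern — illustrative names only, not Spark's actual API:

```python
import warnings

def read_json(path):
    """Stand-in for the supported reader (analogous to read.json)."""
    return {"source": path, "format": "json"}

def json_file(path):
    """Deprecated shim: warn, then delegate to the replacement --
    the shape of the R wrappers this commit finally removes."""
    warnings.warn("json_file is deprecated; use read_json instead",
                  DeprecationWarning, stacklevel=2)
    return read_json(path)

# Demonstrate that the shim still works but warns the caller.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = json_file("path/to/file.json")

print(df["format"])                  # json
print(caught[0].category.__name__)   # DeprecationWarning
```

Such shims are kept for a few releases (here, since 1.4) and then deleted outright, as this commit does, so callers get a hard error instead of a silent redirect.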
spark git commit: [SPARK-25851][SQL][MINOR] Fix deprecated API warning in SQLListener
Repository: spark Updated Branches: refs/heads/master 6aa506394 -> d325ffbf3 [SPARK-25851][SQL][MINOR] Fix deprecated API warning in SQLListener ## What changes were proposed in this pull request? In https://github.com/apache/spark/pull/21596, Jackson is upgraded to 2.9.6. There are some deprecated API warnings in SQLListener. Create a trivial PR to fix them. ``` [warn] SQLListener.scala:92: method uncheckedSimpleType in class TypeFactory is deprecated: see corresponding Javadoc for more information. [warn] val objectType = typeFactory.uncheckedSimpleType(classOf[Object]) [warn] [warn] SQLListener.scala:93: method constructSimpleType in class TypeFactory is deprecated: see corresponding Javadoc for more information. [warn] typeFactory.constructSimpleType(classOf[(_, _)], classOf[(_, _)], Array(objectType, objectType)) [warn] [warn] SQLListener.scala:97: method uncheckedSimpleType in class TypeFactory is deprecated: see corresponding Javadoc for more information. [warn] val longType = typeFactory.uncheckedSimpleType(classOf[Long]) [warn] [warn] SQLListener.scala:98: method constructSimpleType in class TypeFactory is deprecated: see corresponding Javadoc for more information. [warn] typeFactory.constructSimpleType(classOf[(_, _)], classOf[(_, _)], Array(longType, longType)) ``` ## How was this patch tested? Existing unit tests. Closes #22848 from gengliangwang/fixSQLListenerWarning. 
Authored-by: Gengliang Wang Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d325ffbf Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d325ffbf Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d325ffbf Branch: refs/heads/master Commit: d325ffbf3a6b3555cbe5a3004ffb4dde41bff363 Parents: 6aa5063 Author: Gengliang Wang Authored: Fri Oct 26 16:45:56 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:45:56 2018 -0500 -- .../org/apache/spark/sql/execution/ui/SQLListener.scala | 8 1 file changed, 4 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d325ffbf/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala index c04a31c..03d75c4 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala @@ -89,12 +89,12 @@ private class LongLongTupleConverter extends Converter[(Object, Object), (Long, } override def getInputType(typeFactory: TypeFactory): JavaType = { -val objectType = typeFactory.uncheckedSimpleType(classOf[Object]) -typeFactory.constructSimpleType(classOf[(_, _)], classOf[(_, _)], Array(objectType, objectType)) +val objectType = typeFactory.constructType(classOf[Object]) +typeFactory.constructSimpleType(classOf[(_, _)], Array(objectType, objectType)) } override def getOutputType(typeFactory: TypeFactory): JavaType = { -val longType = typeFactory.uncheckedSimpleType(classOf[Long]) -typeFactory.constructSimpleType(classOf[(_, _)], classOf[(_, _)], Array(longType, longType)) +val longType = typeFactory.constructType(classOf[Long]) +typeFactory.constructSimpleType(classOf[(_, _)], Array(longType, longType)) } } - To unsubscribe, 
e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown
Repository: spark Updated Branches: refs/heads/branch-2.2 8906696ac -> 5b1396596 [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown the final line in the mvn helper script in build/ attempts to shut down the zinc server. due to the zinc server being set up w/a 30min timeout, by the time the mvn test instantiation finishes, the server times out. this means that when the mvn script tries to shut down zinc, it returns w/an exit code of 1. this will then automatically fail the entire build (even if the build passes). i set up a test build: https://amplab.cs.berkeley.edu/jenkins/job/sknapp-testing-spark-branch-2.4-test-maven-hadoop-2.7/ Closes #22854 from shaneknapp/fix-mvn-helper-script. Authored-by: shane knapp Signed-off-by: Sean Owen (cherry picked from commit 6aa506394958bfb30cd2a9085a5e8e8be927de51) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5b139659 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5b139659 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5b139659 Branch: refs/heads/branch-2.2 Commit: 5b13965967c603c1bd2491c7c123e40439bf3071 Parents: 8906696 Author: shane knapp Authored: Fri Oct 26 16:37:36 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:42:31 2018 -0500 -- build/mvn | 12 1 file changed, 8 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/5b139659/build/mvn -- diff --git a/build/mvn b/build/mvn index 6a7d4db..941209d 100755 --- a/build/mvn +++ b/build/mvn @@ -144,7 +144,7 @@ if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`"${ZINC_BIN}" -status -port ${ZINC_PORT}` export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"} "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} "${ZINC_BIN}" -start -port ${ZINC_PORT} \ --server 127.0.0.1 -idle-timeout 30m \ +-server 127.0.0.1 -idle-timeout 3h \ -scala-compiler "${SCALA_COMPILER}" \ -scala-library "${SCALA_LIBRARY}" &>/dev/null fi @@ -154,8 +154,12 
@@ export MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"} echo "Using \`mvn\` from path: $MVN_BIN" 1>&2 -# Last, call the `mvn` command as usual -${MVN_BIN} -DzincPort=${ZINC_PORT} "$@" +# call the `mvn` command as usual +# SPARK-25854 +"${MVN_BIN}" -DzincPort=${ZINC_PORT} "$@" +MVN_RETCODE=$? -# Try to shut down zinc explicitly +# Try to shut down zinc explicitly if the server is still running. "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} + +exit $MVN_RETCODE - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown
Repository: spark Updated Branches: refs/heads/branch-2.3 0a05cf917 -> 3afb3a20e [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown the final line in the mvn helper script in build/ attempts to shut down the zinc server. due to the zinc server being set up w/a 30min timeout, by the time the mvn test instantiation finishes, the server times out. this means that when the mvn script tries to shut down zinc, it returns w/an exit code of 1. this will then automatically fail the entire build (even if the build passes). i set up a test build: https://amplab.cs.berkeley.edu/jenkins/job/sknapp-testing-spark-branch-2.4-test-maven-hadoop-2.7/ Closes #22854 from shaneknapp/fix-mvn-helper-script. Authored-by: shane knapp Signed-off-by: Sean Owen (cherry picked from commit 6aa506394958bfb30cd2a9085a5e8e8be927de51) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3afb3a20 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3afb3a20 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3afb3a20 Branch: refs/heads/branch-2.3 Commit: 3afb3a20e670c73677ab96d6fe5fcb3380800f33 Parents: 0a05cf9 Author: shane knapp Authored: Fri Oct 26 16:37:36 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:40:56 2018 -0500 -- build/mvn | 12 1 file changed, 8 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/3afb3a20/build/mvn -- diff --git a/build/mvn b/build/mvn index 7951e10..5c7f2ca 100755 --- a/build/mvn +++ b/build/mvn @@ -144,7 +144,7 @@ if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`"${ZINC_BIN}" -status -port ${ZINC_PORT}` export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"} "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} "${ZINC_BIN}" -start -port ${ZINC_PORT} \ --server 127.0.0.1 -idle-timeout 30m \ +-server 127.0.0.1 -idle-timeout 3h \ -scala-compiler "${SCALA_COMPILER}" \ -scala-library "${SCALA_LIBRARY}" &>/dev/null fi @@ -154,8 +154,12 
@@ export MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"} echo "Using \`mvn\` from path: $MVN_BIN" 1>&2 -# Last, call the `mvn` command as usual -${MVN_BIN} -DzincPort=${ZINC_PORT} "$@" +# call the `mvn` command as usual +# SPARK-25854 +"${MVN_BIN}" -DzincPort=${ZINC_PORT} "$@" +MVN_RETCODE=$? -# Try to shut down zinc explicitly +# Try to shut down zinc explicitly if the server is still running. "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} + +exit $MVN_RETCODE - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown
Repository: spark Updated Branches: refs/heads/branch-2.4 cb2827d28 -> 313a1f0a7 [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown ## What changes were proposed in this pull request? the final line in the mvn helper script in build/ attempts to shut down the zinc server. due to the zinc server being set up w/a 30min timeout, by the time the mvn test instantiation finishes, the server times out. this means that when the mvn script tries to shut down zinc, it returns w/an exit code of 1. this will then automatically fail the entire build (even if the build passes). ## How was this patch tested? i set up a test build: https://amplab.cs.berkeley.edu/jenkins/job/sknapp-testing-spark-branch-2.4-test-maven-hadoop-2.7/ Closes #22854 from shaneknapp/fix-mvn-helper-script. Authored-by: shane knapp Signed-off-by: Sean Owen (cherry picked from commit 6aa506394958bfb30cd2a9085a5e8e8be927de51) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/313a1f0a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/313a1f0a Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/313a1f0a Branch: refs/heads/branch-2.4 Commit: 313a1f0a7aa325ea4038530fc12fad695c7d9809 Parents: cb2827d Author: shane knapp Authored: Fri Oct 26 16:37:36 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:37:50 2018 -0500 -- build/mvn | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/313a1f0a/build/mvn -- diff --git a/build/mvn b/build/mvn index b60ea64..3816993 100755 --- a/build/mvn +++ b/build/mvn @@ -153,7 +153,7 @@ if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`"${ZINC_BIN}" -status -port ${ZINC_PORT}` export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"} "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} "${ZINC_BIN}" -start -port ${ZINC_PORT} \ --server 127.0.0.1 -idle-timeout 30m \ +-server 127.0.0.1 -idle-timeout 3h \ -scala-compiler 
"${SCALA_COMPILER}" \ -scala-library "${SCALA_LIBRARY}" &>/dev/null fi @@ -163,8 +163,12 @@ export MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"} echo "Using \`mvn\` from path: $MVN_BIN" 1>&2 -# Last, call the `mvn` command as usual +# call the `mvn` command as usual +# SPARK-25854 "${MVN_BIN}" -DzincPort=${ZINC_PORT} "$@" +MVN_RETCODE=$? -# Try to shut down zinc explicitly +# Try to shut down zinc explicitly if the server is still running. "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} + +exit $MVN_RETCODE - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown
Repository: spark Updated Branches: refs/heads/master d367bdcf5 -> 6aa506394 [SPARK-25854][BUILD] fix `build/mvn` not to fail during Zinc server shutdown ## What changes were proposed in this pull request? the final line in the mvn helper script in build/ attempts to shut down the zinc server. due to the zinc server being set up w/a 30min timeout, by the time the mvn test instantiation finishes, the server times out. this means that when the mvn script tries to shut down zinc, it returns w/an exit code of 1. this will then automatically fail the entire build (even if the build passes). ## How was this patch tested? i set up a test build: https://amplab.cs.berkeley.edu/jenkins/job/sknapp-testing-spark-branch-2.4-test-maven-hadoop-2.7/ Closes #22854 from shaneknapp/fix-mvn-helper-script. Authored-by: shane knapp Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6aa50639 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6aa50639 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6aa50639 Branch: refs/heads/master Commit: 6aa506394958bfb30cd2a9085a5e8e8be927de51 Parents: d367bdc Author: shane knapp Authored: Fri Oct 26 16:37:36 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 16:37:36 2018 -0500 -- build/mvn | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6aa50639/build/mvn -- diff --git a/build/mvn b/build/mvn index b60ea64..3816993 100755 --- a/build/mvn +++ b/build/mvn @@ -153,7 +153,7 @@ if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`"${ZINC_BIN}" -status -port ${ZINC_PORT}` export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"} "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} "${ZINC_BIN}" -start -port ${ZINC_PORT} \ --server 127.0.0.1 -idle-timeout 30m \ +-server 127.0.0.1 -idle-timeout 3h \ -scala-compiler "${SCALA_COMPILER}" \ -scala-library "${SCALA_LIBRARY}" &>/dev/null fi @@ -163,8 +163,12 @@ export 
MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"} echo "Using \`mvn\` from path: $MVN_BIN" 1>&2 -# Last, call the `mvn` command as usual +# call the `mvn` command as usual +# SPARK-25854 "${MVN_BIN}" -DzincPort=${ZINC_PORT} "$@" +MVN_RETCODE=$? -# Try to shut down zinc explicitly +# Try to shut down zinc explicitly if the server is still running. "${ZINC_BIN}" -shutdown -port ${ZINC_PORT} + +exit $MVN_RETCODE - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
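The SPARK-25854 fix captures Maven's exit status *before* the Zinc shutdown runs, so the cleanup step can no longer mask a build result in either direction. The same run-then-cleanup pattern, sketched in Python with hypothetical stand-in commands (not Spark's actual scripts):

```python
import subprocess

def run_with_cleanup(build_cmd, cleanup_cmd):
    """Run a build, then always run cleanup, but report the build's
    exit status -- the cleanup's status must not replace it, which is
    exactly the bug the build/mvn change fixes."""
    rc = subprocess.run(build_cmd).returncode  # capture before cleanup
    subprocess.run(cleanup_cmd)                # result deliberately ignored
    return rc

# `true`/`false` stand in for `mvn ...` and `zinc -shutdown`:
# a passing build whose cleanup fails should still report success,
# and a failing build stays a failure even if cleanup succeeds.
print(run_with_cleanup(["true"], ["false"]))   # 0
print(run_with_cleanup(["false"], ["true"]))   # non-zero
```

Before the fix, the script's last command was the Zinc shutdown itself, so its exit code (1 once the 30-minute idle timeout had already stopped the server) became the script's exit code and failed otherwise-green builds.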
svn commit: r30431 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_26_12_02-d367bdc-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 19:16:52 2018 New Revision: 30431 Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_26_12_02-d367bdc docs [This commit notification would consist of 1473 parts, which exceeds the limit of 50 parts, so it was shortened to this summary.]
svn commit: r30425 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_26_10_03-cb2827d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 17:17:38 2018 New Revision: 30425 Log: Apache Spark 2.4.1-SNAPSHOT-2018_10_26_10_03-cb2827d docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50 parts, so it was shortened to this summary.]
[2/2] spark git commit: Preparing development version 2.4.1-SNAPSHOT
Preparing development version 2.4.1-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cb2827d2 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cb2827d2 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cb2827d2 Branch: refs/heads/branch-2.4 Commit: cb2827d286fe2cd8026d95ae1ae55c16c6331699 Parents: 4a7ead4 Author: Wenchen Fan Authored: Fri Oct 26 16:47:05 2018 + Committer: Wenchen Fan Committed: Fri Oct 26 16:47:05 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 44 deletions(-) -- 
http://git-wip-us.apache.org/repos/asf/spark/blob/cb2827d2/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index f52d785..714b6f1 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.0 +Version: 2.4.1 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/cb2827d2/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 63ab510..ee0de73 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.0 +2.4.1-SNAPSHOT ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/cb2827d2/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b10e118..b89e0fe 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.0 +2.4.1-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/cb2827d2/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml
[1/2] spark git commit: Preparing Spark release v2.4.0-rc5
Repository: spark Updated Branches: refs/heads/branch-2.4 1757a603f -> cb2827d28 Preparing Spark release v2.4.0-rc5 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4a7ead48 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4a7ead48 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4a7ead48 Branch: refs/heads/branch-2.4 Commit: 4a7ead480ac8ddb07e34e9ff5360b0c07973c95e Parents: 1757a60 Author: Wenchen Fan Authored: Fri Oct 26 16:47:00 2018 + Committer: Wenchen Fan Committed: Fri Oct 26 16:47:00 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 
44 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/4a7ead48/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 714b6f1..f52d785 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.1 +Version: 2.4.0 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/4a7ead48/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index ee0de73..63ab510 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.1-SNAPSHOT +2.4.0 ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/4a7ead48/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b89e0fe..b10e118 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.1-SNAPSHOT +2.4.0 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/4a7ead48/common/network-common/pom.xml
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.4.0-rc5 [created] 4a7ead480 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: HOT-FIX pyspark import
Repository: spark Updated Branches: refs/heads/branch-2.4 d868dc2b8 -> 1757a603f HOT-FIX pyspark import Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1757a603 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1757a603 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1757a603 Branch: refs/heads/branch-2.4 Commit: 1757a603fa123e3a81a7bfc06f9b58ee328f11b0 Parents: d868dc2 Author: Wenchen Fan Authored: Sat Oct 27 00:43:16 2018 +0800 Committer: Wenchen Fan Committed: Sat Oct 27 00:43:16 2018 +0800 -- python/pyspark/sql/functions.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/1757a603/python/pyspark/sql/functions.py -- diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py index a59d5c9..9583a98 100644 --- a/python/pyspark/sql/functions.py +++ b/python/pyspark/sql/functions.py @@ -27,7 +27,7 @@ if sys.version < "3": from pyspark import since, SparkContext from pyspark.rdd import ignore_unicode_prefix, PythonEvalType -from pyspark.sql.column import Column, _to_java_column, _to_seq +from pyspark.sql.column import Column, _to_java_column, _to_seq, _create_column_from_literal from pyspark.sql.dataframe import DataFrame from pyspark.sql.types import StringType, DataType # Keep UserDefinedFunction import for backwards compatible import; moved in SPARK-22409 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25255][PYTHON] Add getActiveSession to SparkSession in PySpark
Repository: spark Updated Branches: refs/heads/master f1891ff1e -> d367bdcf5 [SPARK-25255][PYTHON] Add getActiveSession to SparkSession in PySpark ## What changes were proposed in this pull request? add getActiveSession in session.py ## How was this patch tested? add doctest Closes #22295 from huaxingao/spark25255. Authored-by: Huaxin Gao Signed-off-by: Holden Karau Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d367bdcf Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d367bdcf Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d367bdcf Branch: refs/heads/master Commit: d367bdcf521f564d2d7066257200be26b27ea926 Parents: f1891ff Author: Huaxin Gao Authored: Fri Oct 26 09:40:13 2018 -0700 Committer: Holden Karau Committed: Fri Oct 26 09:40:13 2018 -0700 -- python/pyspark/sql/session.py | 30 python/pyspark/sql/tests.py | 151 + 2 files changed, 181 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d367bdcf/python/pyspark/sql/session.py -- diff --git a/python/pyspark/sql/session.py b/python/pyspark/sql/session.py index 079af8c..6f4b327 100644 --- a/python/pyspark/sql/session.py +++ b/python/pyspark/sql/session.py @@ -192,6 +192,7 @@ class SparkSession(object): """A class attribute having a :class:`Builder` to construct :class:`SparkSession` instances""" _instantiatedSession = None +_activeSession = None @ignore_unicode_prefix def __init__(self, sparkContext, jsparkSession=None): @@ -233,7 +234,9 @@ class SparkSession(object): if SparkSession._instantiatedSession is None \ or SparkSession._instantiatedSession._sc._jsc is None: SparkSession._instantiatedSession = self +SparkSession._activeSession = self self._jvm.SparkSession.setDefaultSession(self._jsparkSession) +self._jvm.SparkSession.setActiveSession(self._jsparkSession) def _repr_html_(self): return """ @@ -255,6 +258,29 @@ class SparkSession(object): """ return self.__class__(self._sc, 
self._jsparkSession.newSession()) +@classmethod +@since(3.0) +def getActiveSession(cls): +""" +Returns the active SparkSession for the current thread, returned by the builder. +>>> s = SparkSession.getActiveSession() +>>> l = [('Alice', 1)] +>>> rdd = s.sparkContext.parallelize(l) +>>> df = s.createDataFrame(rdd, ['name', 'age']) +>>> df.select("age").collect() +[Row(age=1)] +""" +from pyspark import SparkContext +sc = SparkContext._active_spark_context +if sc is None: +return None +else: +if sc._jvm.SparkSession.getActiveSession().isDefined(): +SparkSession(sc, sc._jvm.SparkSession.getActiveSession().get()) +return SparkSession._activeSession +else: +return None + @property @since(2.0) def sparkContext(self): @@ -671,6 +697,8 @@ class SparkSession(object): ... Py4JJavaError: ... """ +SparkSession._activeSession = self +self._jvm.SparkSession.setActiveSession(self._jsparkSession) if isinstance(data, DataFrame): raise TypeError("data is already a DataFrame") @@ -826,7 +854,9 @@ class SparkSession(object): self._sc.stop() # We should clean the default session up. See SPARK-23228. 
self._jvm.SparkSession.clearDefaultSession() +self._jvm.SparkSession.clearActiveSession() SparkSession._instantiatedSession = None +SparkSession._activeSession = None @since(2.0) def __enter__(self): http://git-wip-us.apache.org/repos/asf/spark/blob/d367bdcf/python/pyspark/sql/tests.py -- diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py index 82dc5a6..ad04270 100644 --- a/python/pyspark/sql/tests.py +++ b/python/pyspark/sql/tests.py @@ -3985,6 +3985,157 @@ class SparkSessionTests(PySparkTestCase): spark.stop() +class SparkSessionTests2(unittest.TestCase): + +def test_active_session(self): +spark = SparkSession.builder \ +.master("local") \ +.getOrCreate() +try: +activeSession = SparkSession.getActiveSession() +df = activeSession.createDataFrame([(1, 'Alice')], ['age', 'name']) +self.assertEqual(df.collect(), [Row(age=1, name=u'Alice')]) +finally: +spark.stop() + +def test_get_active_session_when_no_active_session(self): +active =
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.4.0-rc5 [deleted] 075447b39
[2/2] spark git commit: Preparing development version 2.4.1-SNAPSHOT
Preparing development version 2.4.1-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d868dc2b Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d868dc2b Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d868dc2b Branch: refs/heads/branch-2.4 Commit: d868dc2b819da75cdb16fee6d5779f9d1e575f87 Parents: 075447b Author: Wenchen Fan Authored: Fri Oct 26 16:26:36 2018 + Committer: Wenchen Fan Committed: Fri Oct 26 16:26:36 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 44 deletions(-) -- 
http://git-wip-us.apache.org/repos/asf/spark/blob/d868dc2b/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index f52d785..714b6f1 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.0 +Version: 2.4.1 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/d868dc2b/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 63ab510..ee0de73 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.0 +2.4.1-SNAPSHOT ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/d868dc2b/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b10e118..b89e0fe 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.0 +2.4.1-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/d868dc2b/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.4.0-rc5 [created] 075447b39
[1/2] spark git commit: Preparing Spark release v2.4.0-rc5
Repository: spark Updated Branches: refs/heads/branch-2.4 40ed093b7 -> d868dc2b8 Preparing Spark release v2.4.0-rc5 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/075447b3 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/075447b3 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/075447b3 Branch: refs/heads/branch-2.4 Commit: 075447b3965489ffba4e6afb2b120880bc307505 Parents: 40ed093 Author: Wenchen Fan Authored: Fri Oct 26 16:26:31 2018 + Committer: Wenchen Fan Committed: Fri Oct 26 16:26:31 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 
44 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/075447b3/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 714b6f1..f52d785 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.1 +Version: 2.4.0 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/075447b3/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index ee0de73..63ab510 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.1-SNAPSHOT +2.4.0 ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/075447b3/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b89e0fe..b10e118 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.1-SNAPSHOT +2.4.0 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/075447b3/common/network-common/pom.xml
svn commit: r30424 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_26_08_02-f1891ff-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 15:17:08 2018 New Revision: 30424 Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_26_08_02-f1891ff docs [This commit notification would consist of 1473 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25760][DOCS][FOLLOWUP] Add note about AddJar return value change in migration guide
Repository: spark Updated Branches: refs/heads/master 33e337c11 -> f1891ff1e [SPARK-25760][DOCS][FOLLOWUP] Add note about AddJar return value change in migration guide ## What changes were proposed in this pull request? Add note about AddJar return value change in migration guide ## How was this patch tested? n/a Closes #22826 from srowen/SPARK-25760.2. Authored-by: Sean Owen Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f1891ff1 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f1891ff1 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f1891ff1 Branch: refs/heads/master Commit: f1891ff1e3f03668ac21b352b009bfea5e3c2b7f Parents: 33e337c Author: Sean Owen Authored: Fri Oct 26 09:48:17 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 09:48:17 2018 -0500 -- docs/sql-migration-guide-upgrade.md | 2 ++ 1 file changed, 2 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/f1891ff1/docs/sql-migration-guide-upgrade.md -- diff --git a/docs/sql-migration-guide-upgrade.md b/docs/sql-migration-guide-upgrade.md index 38c03d3..c9685b8 100644 --- a/docs/sql-migration-guide-upgrade.md +++ b/docs/sql-migration-guide-upgrade.md @@ -15,6 +15,8 @@ displayTitle: Spark SQL Upgrading Guide - Since Spark 3.0, the `from_json` functions supports two modes - `PERMISSIVE` and `FAILFAST`. The modes can be set via the `mode` option. The default mode became `PERMISSIVE`. In previous versions, behavior of `from_json` did not conform to either `PERMISSIVE` nor `FAILFAST`, especially in processing of malformed JSON records. For example, the JSON string `{"a" 1}` with the schema `a INT` is converted to `null` by previous versions but Spark 3.0 converts it to `Row(null)`. + - The `ADD JAR` command previously returned a result set with the single value 0. It now returns an empty result set. 
+ ## Upgrading From Spark SQL 2.3 to 2.4 - In Spark version 2.3 and earlier, the second parameter to array_contains function is implicitly promoted to the element type of first array type parameter. This type promotion can be lossy and may cause `array_contains` function to return wrong result. This problem has been addressed in 2.4 by employing a safer type promotion mechanism. This can cause some change in behavior and are illustrated in the table below. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's input json as literal only
Repository: spark Updated Branches: refs/heads/master 7d44bc264 -> 33e337c11 [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's input json as literal only ## What changes were proposed in this pull request? The main purpose of `schema_of_json` is the usage of combination with `from_json` (to make up the leak of schema inference) which takes its schema only as literal; however, currently `schema_of_json` allows JSON input as non-literal expressions (e.g, column). This was mistakenly allowed - we don't have to take other usages rather then the main purpose into account for now. This PR makes a followup to only allow literals for `schema_of_json`'s JSON input. We can allow non literal expressions later when it's needed or there are some usecase for it. ## How was this patch tested? Unit tests were added. Closes #22775 from HyukjinKwon/SPARK-25447-followup. Lead-authored-by: hyukjinkwon Co-authored-by: Hyukjin Kwon Signed-off-by: Wenchen Fan Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/33e337c1 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/33e337c1 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/33e337c1 Branch: refs/heads/master Commit: 33e337c1180a12edf1ae97f0221e389f23192461 Parents: 7d44bc2 Author: hyukjinkwon Authored: Fri Oct 26 22:14:43 2018 +0800 Committer: Wenchen Fan Committed: Fri Oct 26 22:14:43 2018 +0800 -- python/pyspark/sql/functions.py | 22 ++-- .../catalyst/expressions/jsonExpressions.scala | 21 +--- .../scala/org/apache/spark/sql/functions.scala | 24 + .../sql-tests/inputs/json-functions.sql | 6 +++- .../sql-tests/results/json-functions.sql.out| 36 +++- .../apache/spark/sql/JsonFunctionsSuite.scala | 2 +- 6 files changed, 87 insertions(+), 24 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/33e337c1/python/pyspark/sql/functions.py -- diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py index 739496b..ca2a256 100644 
--- a/python/pyspark/sql/functions.py +++ b/python/pyspark/sql/functions.py @@ -2335,30 +2335,32 @@ def to_json(col, options={}): @ignore_unicode_prefix @since(2.4) -def schema_of_json(col, options={}): +def schema_of_json(json, options={}): """ -Parses a column containing a JSON string and infers its schema in DDL format. +Parses a JSON string and infers its schema in DDL format. -:param col: string column in json format +:param json: a JSON string or a string literal containing a JSON string. :param options: options to control parsing. accepts the same options as the JSON datasource .. versionchanged:: 3.0 It accepts `options` parameter to control schema inferring. ->>> from pyspark.sql.types import * ->>> data = [(1, '{"a": 1}')] ->>> df = spark.createDataFrame(data, ("key", "value")) ->>> df.select(schema_of_json(df.value).alias("json")).collect() -[Row(json=u'struct')] +>>> df = spark.range(1) >>> df.select(schema_of_json(lit('{"a": 0}')).alias("json")).collect() [Row(json=u'struct')] ->>> schema = schema_of_json(lit('{a: 1}'), {'allowUnquotedFieldNames':'true'}) +>>> schema = schema_of_json('{a: 1}', {'allowUnquotedFieldNames':'true'}) >>> df.select(schema.alias("json")).collect() [Row(json=u'struct')] """ +if isinstance(json, basestring): +col = _create_column_from_literal(json) +elif isinstance(json, Column): +col = _to_java_column(json) +else: +raise TypeError("schema argument should be a column or string") sc = SparkContext._active_spark_context -jc = sc._jvm.functions.schema_of_json(_to_java_column(col), options) +jc = sc._jvm.functions.schema_of_json(col, options) return Column(jc) http://git-wip-us.apache.org/repos/asf/spark/blob/33e337c1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala index e966924..77af590 
100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala @@ -742,7 +742,7 @@ case class StructsToJson( case class SchemaOfJson( child: Expression, options: Map[String, String]) - extends UnaryExpression with String2StringExpression with CodegenFallback { + extends UnaryExpression
spark git commit: [SPARK-25835][K8S] Create kubernetes-tests profile and use the detected SCALA_VERSION
Repository: spark Updated Branches: refs/heads/branch-2.4 b47b8271d -> 26e1d3ef8 [SPARK-25835][K8S] Create kubernetes-tests profile and use the detected SCALA_VERSION - Fixes the scala version propagation issue. - Disables the tests under the k8s profile, now we will run them manually. Adds a test specific profile otherwise tests will not run if we just remove the module from the kubernetes profile (quickest solution I can think of). Manually by running the tests with different versions of scala. Closes #22838 from skonto/propagate-scala2.12. Authored-by: Stavros Kontopoulos Signed-off-by: Sean Owen (cherry picked from commit 7d44bc26408b2189804fd305797afcefb7b2b0e0) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/26e1d3ef Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/26e1d3ef Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/26e1d3ef Branch: refs/heads/branch-2.4 Commit: 26e1d3ef8223e8caca32d42060212dd12dad6d64 Parents: b47b827 Author: Stavros Kontopoulos Authored: Fri Oct 26 08:49:27 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 08:54:04 2018 -0500 -- pom.xml | 7 +++ .../integration-tests/dev/dev-run-integration-tests.sh| 3 ++- 2 files changed, 9 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/26e1d3ef/pom.xml -- diff --git a/pom.xml b/pom.xml index 28804be..349de83 100644 --- a/pom.xml +++ b/pom.xml @@ -2716,6 +2716,13 @@ kubernetes resource-managers/kubernetes/core + + + + + + kubernetes-integration-tests + resource-managers/kubernetes/integration-tests http://git-wip-us.apache.org/repos/asf/spark/blob/26e1d3ef/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh -- diff --git a/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh b/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh index b28b8b8..cb5cf69 100755 --- 
a/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh +++ b/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh @@ -28,6 +28,7 @@ NAMESPACE= SERVICE_ACCOUNT= INCLUDE_TAGS="k8s" EXCLUDE_TAGS= +SCALA_VERSION="$($TEST_ROOT_DIR/build/mvn org.apache.maven.plugins:maven-help-plugin:2.1.1:evaluate -Dexpression=scala.binary.version | grep -v '\[' )" # Parse arguments while (( "$#" )); do @@ -103,4 +104,4 @@ then properties=( ${properties[@]} -Dtest.exclude.tags=$EXCLUDE_TAGS ) fi -$TEST_ROOT_DIR/build/mvn integration-test -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pkubernetes -Phadoop-2.7 ${properties[@]} +$TEST_ROOT_DIR/build/mvn integration-test -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pscala-$SCALA_VERSION -Pkubernetes -Pkubernetes-integration-tests -Phadoop-2.7 ${properties[@]} - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25835][K8S] Create kubernetes-tests profile and use the detected SCALA_VERSION
Repository: spark Updated Branches: refs/heads/master 6fd5ff395 -> 7d44bc264 [SPARK-25835][K8S] Create kubernetes-tests profile and use the detected SCALA_VERSION ## What changes were proposed in this pull request? - Fixes the scala version propagation issue. - Disables the tests under the k8s profile, now we will run them manually. Adds a test specific profile otherwise tests will not run if we just remove the module from the kubernetes profile (quickest solution I can think of). ## How was this patch tested? Manually by running the tests with different versions of scala. Closes #22838 from skonto/propagate-scala2.12. Authored-by: Stavros Kontopoulos Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7d44bc26 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7d44bc26 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7d44bc26 Branch: refs/heads/master Commit: 7d44bc26408b2189804fd305797afcefb7b2b0e0 Parents: 6fd5ff3 Author: Stavros Kontopoulos Authored: Fri Oct 26 08:49:27 2018 -0500 Committer: Sean Owen Committed: Fri Oct 26 08:49:27 2018 -0500 -- pom.xml | 7 +++ .../integration-tests/dev/dev-run-integration-tests.sh| 3 ++- 2 files changed, 9 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/7d44bc26/pom.xml -- diff --git a/pom.xml b/pom.xml index 92934c1..597fb2f 100644 --- a/pom.xml +++ b/pom.xml @@ -2656,6 +2656,13 @@ kubernetes resource-managers/kubernetes/core + + + + + + kubernetes-integration-tests + resource-managers/kubernetes/integration-tests http://git-wip-us.apache.org/repos/asf/spark/blob/7d44bc26/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh -- diff --git a/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh b/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh index e26c0b3..c3c843e 100755 --- 
a/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh +++ b/resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh @@ -28,6 +28,7 @@ NAMESPACE= SERVICE_ACCOUNT= INCLUDE_TAGS="k8s" EXCLUDE_TAGS= +SCALA_VERSION="$($TEST_ROOT_DIR/build/mvn org.apache.maven.plugins:maven-help-plugin:2.1.1:evaluate -Dexpression=scala.binary.version | grep -v '\[' )" # Parse arguments while (( "$#" )); do @@ -103,4 +104,4 @@ then properties=( ${properties[@]} -Dtest.exclude.tags=$EXCLUDE_TAGS ) fi -$TEST_ROOT_DIR/build/mvn integration-test -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pkubernetes ${properties[@]} +$TEST_ROOT_DIR/build/mvn integration-test -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pscala-$SCALA_VERSION -Pkubernetes -Pkubernetes-integration-tests ${properties[@]} - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r30422 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_26_04_02-6fd5ff3-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 11:16:53 2018 New Revision: 30422 Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_26_04_02-6fd5ff3 docs [This commit notification would consist of 1473 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25797][SQL][DOCS] Add migration doc for solving issues caused by view canonicalization approach change
Repository: spark Updated Branches: refs/heads/master 89d748b33 -> 6fd5ff395 [SPARK-25797][SQL][DOCS] Add migration doc for solving issues caused by view canonicalization approach change ## What changes were proposed in this pull request? Since Spark 2.2, view definitions are stored in a different way from prior versions. This may cause Spark unable to read views created by prior versions. See [SPARK-25797](https://issues.apache.org/jira/browse/SPARK-25797) for more details. Basically, we have 2 options. 1) Make Spark 2.2+ able to get older view definitions back. Since the expanded text is buggy and unusable, we have to use original text (this is possible with [SPARK-25459](https://issues.apache.org/jira/browse/SPARK-25459)). However, because older Spark versions don't save the context for the database, we cannot always get correct view definitions without view default database. 2) Recreate the views by `ALTER VIEW AS` or `CREATE OR REPLACE VIEW AS`. This PR aims to add migration doc to help users troubleshoot this issue by above option 2. ## How was this patch tested? N/A. Docs are generated and checked locally ``` cd docs SKIP_API=1 jekyll serve --watch ``` Closes #22846 from seancxmao/SPARK-25797. 
Authored-by: seancxmao Signed-off-by: Wenchen Fan Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6fd5ff39 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6fd5ff39 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6fd5ff39 Branch: refs/heads/master Commit: 6fd5ff3951ed9ac7c0b20f2666d8bc39929bfb5c Parents: 89d748b Author: seancxmao Authored: Fri Oct 26 18:53:55 2018 +0800 Committer: Wenchen Fan Committed: Fri Oct 26 18:53:55 2018 +0800 -- docs/sql-migration-guide-upgrade.md | 2 ++ 1 file changed, 2 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6fd5ff39/docs/sql-migration-guide-upgrade.md -- diff --git a/docs/sql-migration-guide-upgrade.md b/docs/sql-migration-guide-upgrade.md index dfa35b8..38c03d3 100644 --- a/docs/sql-migration-guide-upgrade.md +++ b/docs/sql-migration-guide-upgrade.md @@ -304,6 +304,8 @@ displayTitle: Spark SQL Upgrading Guide - Since Spark 2.2.1 and 2.3.0, the schema is always inferred at runtime when the data source tables have the columns that exist in both partition schema and data schema. The inferred schema does not have the partitioned columns. When reading the table, Spark respects the partition values of these overlapping columns instead of the values stored in the data source files. In 2.2.0 and 2.1.x release, the inferred schema is partitioned but the data of the table is invisible to users (i.e., the result set is empty). + - Since Spark 2.2, view definitions are stored in a different way from prior versions. This may cause Spark unable to read views created by prior versions. In such cases, you need to recreate the views using `ALTER VIEW AS` or `CREATE OR REPLACE VIEW AS` with newer Spark versions. + ## Upgrading From Spark SQL 2.0 to 2.1 - Datasource tables now store partition metadata in the Hive metastore. This means that Hive DDLs such as `ALTER TABLE PARTITION ... 
SET LOCATION` are now available for tables created with the Datasource API. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r30419 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_26_02_02-f37bcea-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 09:17:08 2018 New Revision: 30419 Log: Apache Spark 2.4.1-SNAPSHOT-2018_10_26_02_02-f37bcea docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r30418 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_26_00_02-89d748b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Oct 26 07:17:49 2018 New Revision: 30418 Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_26_00_02-89d748b docs [This commit notification would consist of 1473 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]