svn commit: r30000 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_10_16_02-80813e1-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Wed Oct 10 23:16:48 2018
New Revision: 30000

Log:
Apache Spark 3.0.0-SNAPSHOT-2018_10_10_16_02-80813e1 docs

[This commit notification would consist of 1482 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r29997 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_10_12_02-6df2345-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Wed Oct 10 19:16:46 2018
New Revision: 29997

Log:
Apache Spark 3.0.0-SNAPSHOT-2018_10_10_12_02-6df2345 docs

[This commit notification would consist of 1482 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6
Repository: spark
Updated Branches:
  refs/heads/master 6df234579 -> 80813e198

[SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

## What changes were proposed in this pull request?

Remove Hadoop 2.6 references and make 2.7 the default. Obviously, this is for master/3.0.0 only. After this we can also get rid of the separate test jobs for Hadoop 2.6.

## How was this patch tested?

Existing tests

Closes #22615 from srowen/SPARK-25016.

Authored-by: Sean Owen
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80813e19
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80813e19
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80813e19

Branch: refs/heads/master
Commit: 80813e198033cd63cc6100ee6ffe7d1eb1dff27b
Parents: 6df2345
Author: Sean Owen
Authored: Wed Oct 10 12:07:53 2018 -0700
Committer: Sean Owen
Committed: Wed Oct 10 12:07:53 2018 -0700

----------------------------------------------------------------------
 dev/appveyor-install-dependencies.ps1           |   3 +-
 dev/create-release/release-build.sh             |  43 ++--
 dev/deps/spark-deps-hadoop-2.6                  | 198 ---
 dev/run-tests.py                                |  15 +-
 dev/test-dependencies.sh                        |   1 -
 docs/building-spark.md                          |  11 +-
 docs/index.md                                   |   3 -
 docs/running-on-yarn.md                         |   3 +-
 hadoop-cloud/pom.xml                            |  59 +++---
 pom.xml                                         |  14 +-
 .../dev/dev-run-integration-tests.sh            |   2 +-
 .../org/apache/spark/deploy/yarn/Client.scala   |  13 +-
 .../org/apache/spark/sql/hive/TableReader.scala |   2 +-
 .../sql/hive/client/IsolatedClientLoader.scala  |  11 +-
 14 files changed, 68 insertions(+), 310 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/80813e19/dev/appveyor-install-dependencies.ps1
----------------------------------------------------------------------
diff --git a/dev/appveyor-install-dependencies.ps1 b/dev/appveyor-install-dependencies.ps1
index 8a04b62..c918828 100644
--- a/dev/appveyor-install-dependencies.ps1
+++ b/dev/appveyor-install-dependencies.ps1
@@ -95,7 +95,8 @@ $env:MAVEN_OPTS = "-Xmx2g -XX:ReservedCodeCacheSize=512m"
 Pop-Location

 # ========================== Hadoop bin package
-$hadoopVer = "2.6.4"
+# This must match the version at https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1
+$hadoopVer = "2.7.1"
 $hadoopPath = "$tools\hadoop"
 if (!(Test-Path $hadoopPath)) {
     New-Item -ItemType Directory -Force -Path $hadoopPath | Out-Null

http://git-wip-us.apache.org/repos/asf/spark/blob/80813e19/dev/create-release/release-build.sh
----------------------------------------------------------------------
diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index cce5f8b..89593cf 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -191,9 +191,19 @@ if [[ "$1" == "package" ]]; then
   make_binary_release() {
     NAME=$1
     FLAGS="$MVN_EXTRA_OPTS -B $BASE_RELEASE_PROFILES $2"
+    # BUILD_PACKAGE can be "withpip", "withr", or both as "withpip,withr"
     BUILD_PACKAGE=$3
     SCALA_VERSION=$4

+    PIP_FLAG=""
+    if [[ $BUILD_PACKAGE == *"withpip"* ]]; then
+      PIP_FLAG="--pip"
+    fi
+    R_FLAG=""
+    if [[ $BUILD_PACKAGE == *"withr"* ]]; then
+      R_FLAG="--r"
+    fi
+
     # We increment the Zinc port each time to avoid OOM's and other craziness if multiple builds
     # share the same Zinc server.
     ZINC_PORT=$((ZINC_PORT + 1))
@@ -217,18 +227,13 @@ if [[ "$1" == "package" ]]; then
     # Get maven home set by MVN
     MVN_HOME=`$MVN -version 2>&1 | grep 'Maven home' | awk '{print $NF}'`

+    echo "Creating distribution"
+    ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz \
+      $PIP_FLAG $R_FLAG $FLAGS \
+      -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
+    cd ..

-    if [ -z "$BUILD_PACKAGE" ]; then
-      echo "Creating distribution without PIP/R package"
-      ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz $FLAGS \
-        -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
-      cd ..
-    elif [[ "$BUILD_PACKAGE" == "withr" ]]; then
-      echo "Creating distribution with R package"
-      ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz --r $FLAGS \
-        -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
-      cd ..
-
+    if [[ -n $R_FLAG ]]; then
       echo "Copying and signing R source package"
       R_DIST_NAME=SparkR_$SPARK_VERSION.tar.gz
       cp spark-$SPARK_VERSION-bin-$NAME/R/$R_DIST_NAME .
@@ -239,12 +244,9 @@ if [[ "$1" == "package" ]]; then
       echo
spark git commit: [SPARK-25699][SQL] Partially push down conjunctive predicates in ORC
Repository: spark
Updated Branches:
  refs/heads/master 8a7872dc2 -> 6df234579

[SPARK-25699][SQL] Partially push down conjunctive predicates in ORC

## What changes were proposed in this pull request?

Inspired by https://github.com/apache/spark/pull/22574 .
We can partially push down top-level conjunctive predicates to ORC. This PR improves ORC predicate push down in both the SQL and Hive modules.

## How was this patch tested?

New unit test.

Closes #22684 from gengliangwang/pushOrcFilters.

Authored-by: Gengliang Wang
Signed-off-by: DB Tsai

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6df23457
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6df23457
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6df23457

Branch: refs/heads/master
Commit: 6df2345794614c33c95fa453cabac755cf94d131
Parents: 8a7872d
Author: Gengliang Wang
Authored: Wed Oct 10 18:18:56 2018 +0000
Committer: DB Tsai
Committed: Wed Oct 10 18:18:56 2018 +0000

----------------------------------------------------------------------
 .../execution/datasources/orc/OrcFilters.scala  | 69 +++-
 .../datasources/orc/OrcFilterSuite.scala        | 37 ++-
 .../apache/spark/sql/hive/orc/OrcFilters.scala  | 69 +++-
 .../spark/sql/hive/orc/HiveOrcFilterSuite.scala | 45 -
 4 files changed, 186 insertions(+), 34 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/6df23457/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
index dbafc46..2b17b47 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
@@ -138,6 +138,23 @@ private[sql] object OrcFilters {
       dataTypeMap: Map[String, DataType],
       expression: Filter,
       builder: Builder): Option[Builder] = {
+    createBuilder(dataTypeMap, expression, builder, canPartialPushDownConjuncts = true)
+  }
+
+  /**
+   * @param dataTypeMap a map from the attribute name to its data type.
+   * @param expression the input filter predicates.
+   * @param builder the input SearchArgument.Builder.
+   * @param canPartialPushDownConjuncts whether a subset of conjuncts of predicates can be pushed
+   *                                    down safely. Pushing ONLY one side of AND down is safe to
+   *                                    do at the top level or when none of its ancestors is NOT or OR.
+   * @return the builder so far.
+   */
+  private def createBuilder(
+      dataTypeMap: Map[String, DataType],
+      expression: Filter,
+      builder: Builder,
+      canPartialPushDownConjuncts: Boolean): Option[Builder] = {
     def getType(attribute: String): PredicateLeaf.Type =
       getPredicateLeafType(dataTypeMap(attribute))

@@ -145,32 +162,52 @@ private[sql] object OrcFilters {
     expression match {
       case And(left, right) =>
-        // At here, it is not safe to just convert one side if we do not understand the
-        // other side. Here is an example used to explain the reason.
+        // Here, it is not safe to just convert one side and remove the other side
+        // if we do not understand what the parent filters are.
+        //
+        // Here is an example used to explain the reason.
         // Let's say we have NOT(a = 2 AND b in ('1')) and we do not understand how to
         // convert b in ('1'). If we only convert a = 2, we will end up with a filter
         // NOT(a = 2), which will generate wrong results.
-        // Pushing one side of AND down is only safe to do at the top level.
-        // You can see ParquetRelation's initializeLocalJobFunc method as an example.
-        for {
-          _ <- buildSearchArgument(dataTypeMap, left, newBuilder)
-          _ <- buildSearchArgument(dataTypeMap, right, newBuilder)
-          lhs <- buildSearchArgument(dataTypeMap, left, builder.startAnd())
-          rhs <- buildSearchArgument(dataTypeMap, right, lhs)
-        } yield rhs.end()
+        //
+        // Pushing one side of AND down is only safe to do at the top level or in the child
+        // AND before hitting NOT or OR conditions, and in this case, the unsupported predicate
+        // can be safely removed.
+        val leftBuilderOption =
+          createBuilder(dataTypeMap, left, newBuilder, canPartialPushDownConjuncts)
+        val rightBuilderOption =
+          createBuilder(dataTypeMap, right, newBuilder, canPartialPushDownConjuncts)
+        (leftBuilderOption, rightBuilderOption) match {
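Editor's note: the crux of this change is that dropping an unconvertible conjunct only *relaxes* a predicate, which is safe under AND but not once a NOT or OR sits above it. A standalone Scala sketch of that rule (an illustration with stand-in filter classes, not Spark's OrcFilters code):

```scala
// Stand-ins for org.apache.spark.sql.sources filters, for illustration only.
sealed trait Filter
case class And(left: Filter, right: Filter) extends Filter
case class Or(left: Filter, right: Filter) extends Filter
case class Not(child: Filter) extends Filter
case class Leaf(name: String, convertible: Boolean) extends Filter

// Returns the pushable filter (possibly a relaxed subset), or None.
def push(f: Filter, canPartial: Boolean): Option[Filter] = f match {
  case And(l, r) =>
    (push(l, canPartial), push(r, canPartial)) match {
      case (Some(pl), Some(pr)) => Some(And(pl, pr))
      // Dropping one conjunct relaxes the predicate; safe only while no
      // enclosing NOT/OR can turn that relaxation into extra filtering.
      case (Some(pl), None) if canPartial => Some(pl)
      case (None, Some(pr)) if canPartial => Some(pr)
      case _ => None
    }
  case Or(l, r) =>
    // Under OR, both sides must convert fully; partial conversion is unsafe.
    for {
      pl <- push(l, canPartial = false)
      pr <- push(r, canPartial = false)
    } yield Or(pl, pr)
  case Not(c)     => push(c, canPartial = false).map(Not)
  case leaf: Leaf => if (leaf.convertible) Some(leaf) else None
}
```

With this rule, NOT(a = 2 AND b IN ('1')) pushes nothing when b IN ('1') cannot be converted, instead of producing the incorrect NOT(a = 2) from the commit message's example.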
svn commit: r29994 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_10_10_02-cd40655-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Wed Oct 10 17:17:08 2018
New Revision: 29994

Log:
Apache Spark 2.4.1-SNAPSHOT-2018_10_10_10_02-cd40655 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25636][CORE] spark-submit cuts off the failure reason when there is an error connecting to master
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 71b8739fe -> cd4065596

[SPARK-25636][CORE] spark-submit cuts off the failure reason when there is an error connecting to master

## What changes were proposed in this pull request?

The cause of the error is wrapped in a SparkException; this change finds the cause inside the wrapped exception and throws the cause instead of the wrapper.

## How was this patch tested?

Verified manually by checking the cause of the error; it gives the error as shown below.

### Without the PR change
```
[apache-spark]$ ./bin/spark-submit --verbose --master spark://**
Error: Exception thrown in awaitResult:
Run with --help for usage help or --verbose for debug output
```

### With the PR change
```
[apache-spark]$ ./bin/spark-submit --verbose --master spark://**
Exception in thread "main" org.apache.spark.SparkException: Exception thrown in awaitResult:
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Failed to connect to devaraj-pc1/10.3.66.65:7077
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: devaraj-pc1/10.3.66.65:7077
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	... 1 more
Caused by: java.net.ConnectException: Connection refused
	... 11 more
```

Closes #22623 from devaraj-kavali/SPARK-25636.

Authored-by: Devaraj K
Signed-off-by: Marcelo Vanzin
(cherry picked from commit 8a7872dc254710f9b29fdfdb2915a949ef606871)
Signed-off-by: Marcelo Vanzin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd406559
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cd406559
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cd406559

Branch: refs/heads/branch-2.4
Commit: cd40655965072051dfae65eabd979edff0e4d398
Parents: 71b8739
Author: Devaraj K
Authored: Wed Oct 10 09:24:36 2018 -0700
Committer: Marcelo Vanzin
Committed: Wed Oct 10 09:24:50 2018 -0700

----------------------------------------------------------------------
 .../org/apache/spark/deploy/SparkSubmit.scala      |  2 --
 .../org/apache/spark/deploy/SparkSubmitSuite.scala | 17 +++--
 2 files changed, 11 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cd406559/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index cf902db..1d32d96 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -925,8 +925,6 @@ object SparkSubmit extends CommandLineUtils with Logging {
     } catch {
       case e: SparkUserAppException =>
         exitFn(e.exitCode)
-      case e: SparkException =>
-        printErrorAndExit(e.getMessage())
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/cd406559/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 9eae360..652c36f 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -74,20 +74,25 @@ trait TestPrematureExit {
     @volatile var exitedCleanly = false
     mainObject.exitFn = (_) => exitedCleanly = true

+    @volatile var exception: Exception = null
     val thread = new Thread {
       override def run() = try {
         mainObject.main(input)
       } catch {
-        // If exceptions occur after the "exit" has happened, fine to ignore them.
-        // These represent code paths not reachable during normal execution.
-        case e: Exception => if (!exitedCleanly) throw e
+        // Capture the exception to check whether the exception contains searchString or not
+        case e: Exception => exception = e
       }
     }
     thread.start()
     thread.join()
-    val
spark git commit: [SPARK-25636][CORE] spark-submit cuts off the failure reason when there is an error connecting to master
Repository: spark
Updated Branches:
  refs/heads/master 3528c08be -> 8a7872dc2

[SPARK-25636][CORE] spark-submit cuts off the failure reason when there is an error connecting to master

## What changes were proposed in this pull request?

The cause of the error is wrapped in a SparkException; this change finds the cause inside the wrapped exception and throws the cause instead of the wrapper.

## How was this patch tested?

Verified manually by checking the cause of the error; it gives the error as shown below.

### Without the PR change
```
[apache-spark]$ ./bin/spark-submit --verbose --master spark://**
Error: Exception thrown in awaitResult:
Run with --help for usage help or --verbose for debug output
```

### With the PR change
```
[apache-spark]$ ./bin/spark-submit --verbose --master spark://**
Exception in thread "main" org.apache.spark.SparkException: Exception thrown in awaitResult:
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Failed to connect to devaraj-pc1/10.3.66.65:7077
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: devaraj-pc1/10.3.66.65:7077
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	... 1 more
Caused by: java.net.ConnectException: Connection refused
	... 11 more
```

Closes #22623 from devaraj-kavali/SPARK-25636.

Authored-by: Devaraj K
Signed-off-by: Marcelo Vanzin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8a7872dc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8a7872dc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8a7872dc

Branch: refs/heads/master
Commit: 8a7872dc254710f9b29fdfdb2915a949ef606871
Parents: 3528c08
Author: Devaraj K
Authored: Wed Oct 10 09:24:36 2018 -0700
Committer: Marcelo Vanzin
Committed: Wed Oct 10 09:24:36 2018 -0700

----------------------------------------------------------------------
 .../org/apache/spark/deploy/SparkSubmit.scala      |  2 --
 .../org/apache/spark/deploy/SparkSubmitSuite.scala | 17 +++--
 2 files changed, 11 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/8a7872dc/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index d5f2865..61b379f 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -927,8 +927,6 @@ object SparkSubmit extends CommandLineUtils with Logging {
     } catch {
       case e: SparkUserAppException =>
         exitFn(e.exitCode)
-      case e: SparkException =>
-        printErrorAndExit(e.getMessage())
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/8a7872dc/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 9eae360..652c36f 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -74,20 +74,25 @@ trait TestPrematureExit {
     @volatile var exitedCleanly = false
     mainObject.exitFn = (_) => exitedCleanly = true

+    @volatile var exception: Exception = null
     val thread = new Thread {
       override def run() = try {
         mainObject.main(input)
       } catch {
-        // If exceptions occur after the "exit" has happened, fine to ignore them.
-        // These represent code paths not reachable during normal execution.
-        case e: Exception => if (!exitedCleanly) throw e
+        // Capture the exception to check whether the exception contains searchString or not
+        case e: Exception => exception = e
       }
     }
     thread.start()
     thread.join()
-    val joined = printStream.lineBuffer.mkString("\n")
-    if (!joined.contains(searchString)) {
-      fail(s"Search
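Editor's note: the fix simply removes the `case e: SparkException` handler, so the full exception, causes included, reaches the user as in the "With the PR change" trace above. For reference, a hedged sketch of walking a wrapped exception's cause chain to its root (an illustration, not code from this patch; `rootCause` is a made-up name):

```scala
import scala.annotation.tailrec

// Follows getCause links to the root failure, e.g. the java.net.ConnectException
// buried under the SparkException in the trace above. Illustrative only.
@tailrec
def rootCause(t: Throwable): Throwable =
  if (t.getCause == null || (t.getCause eq t)) t else rootCause(t.getCause)

// Example: rootCause(new RuntimeException(new java.net.ConnectException("refused")))
// returns the ConnectException rather than the wrapper.
```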
spark git commit: [SPARK-25611][SPARK-25612][SQL][TESTS] Improve test run time of CompressionCodecSuite
Repository: spark
Updated Branches:
  refs/heads/master eaafcd8a2 -> 3528c08be

[SPARK-25611][SPARK-25612][SQL][TESTS] Improve test run time of CompressionCodecSuite

## What changes were proposed in this pull request?

Reduced the combination of codecs from 9 to 3 to improve the test runtime.

## How was this patch tested?

This is a test fix.

Closes #22641 from dilipbiswal/SPARK-25611.

Authored-by: Dilip Biswal
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3528c08b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3528c08b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3528c08b

Branch: refs/heads/master
Commit: 3528c08bebbcad3dee7557945ddcd31c99deb50e
Parents: eaafcd8
Author: Dilip Biswal
Authored: Wed Oct 10 08:51:16 2018 -0700
Committer: Sean Owen
Committed: Wed Oct 10 08:51:16 2018 -0700

----------------------------------------------------------------------
 .../spark/sql/hive/CompressionCodecSuite.scala | 54
 1 file changed, 21 insertions(+), 33 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/3528c08b/sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala
index 1bd7e52..398f4d2 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala
@@ -229,8 +229,8 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
       tableCompressionCodecs: List[String])
       (assertionCompressionCodec: (Option[String], String, String, Long) => Unit): Unit = {
     withSQLConf(getConvertMetastoreConfName(format) -> convertMetastore.toString) {
-      tableCompressionCodecs.foreach { tableCompression =>
-        compressionCodecs.foreach { sessionCompressionCodec =>
+      tableCompressionCodecs.zipAll(compressionCodecs, null, "SNAPPY").foreach {
+        case (tableCompression, sessionCompressionCodec) =>
           withSQLConf(getSparkCompressionConfName(format) -> sessionCompressionCodec) {
             // 'tableCompression = null' means no table-level compression
             val compression = Option(tableCompression)
@@ -240,7 +240,6 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
               compression, sessionCompressionCodec, realCompressionCodec, tableSize)
           }
         }
-      }
     }
   }
 }
@@ -262,7 +261,10 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
     }
   }

-  def checkForTableWithCompressProp(format: String, compressCodecs: List[String]): Unit = {
+  def checkForTableWithCompressProp(
+      format: String,
+      tableCompressCodecs: List[String],
+      sessionCompressCodecs: List[String]): Unit = {
     Seq(true, false).foreach { isPartitioned =>
       Seq(true, false).foreach { convertMetastore =>
         Seq(true, false).foreach { usingCTAS =>
@@ -271,10 +273,10 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
             isPartitioned,
             convertMetastore,
             usingCTAS,
-            compressionCodecs = compressCodecs,
-            tableCompressionCodecs = compressCodecs) {
+            compressionCodecs = sessionCompressCodecs,
+            tableCompressionCodecs = tableCompressCodecs) {
             case (tableCodec, sessionCodec, realCodec, tableSize) =>
-              val expectCodec = tableCodec.get
+              val expectCodec = tableCodec.getOrElse(sessionCodec)
               assert(expectCodec == realCodec)
               assert(checkTableSize(
                 format, expectCodec, isPartitioned, convertMetastore, usingCTAS, tableSize))
@@ -284,36 +286,22 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
     }
   }

-  def checkForTableWithoutCompressProp(format: String, compressCodecs: List[String]): Unit = {
-    Seq(true, false).foreach { isPartitioned =>
-      Seq(true, false).foreach { convertMetastore =>
-        Seq(true, false).foreach { usingCTAS =>
-          checkTableCompressionCodecForCodecs(
-            format,
-            isPartitioned,
-            convertMetastore,
-            usingCTAS,
-            compressionCodecs = compressCodecs,
-            tableCompressionCodecs = List(null)) {
-            case (tableCodec, sessionCodec, realCodec, tableSize) =>
-              // Always expect session-level take effect
-              assert(sessionCodec == realCodec)
-              assert(checkTableSize(
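Editor's note: the heart of the change above is replacing the nested `foreach` (a 3 x 3 cross product) with `zipAll`, which pairs the two codec lists element-wise. A quick illustration of the semantics (the codec names are examples, not a claim about what the suite actually configures):

```scala
// zipAll pairs elements positionally, padding the shorter list with the given
// defaults, so n table codecs + n session codecs yield n cases instead of n*n.
val tableCodecs: List[String]   = List(null, "SNAPPY", "GZIP") // null = no table-level codec
val sessionCodecs: List[String] = List("UNCOMPRESSED", "SNAPPY", "GZIP")

val pairs = tableCodecs.zipAll(sessionCodecs, null, "SNAPPY")
// pairs == List((null, "UNCOMPRESSED"), ("SNAPPY", "SNAPPY"), ("GZIP", "GZIP"))
assert(pairs.size == 3) // the old nested loops would have run 9 combinations
```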
spark git commit: [SPARK-25605][TESTS] Alternate take. Run cast string to timestamp tests for a subset of timezones
Repository: spark
Updated Branches:
  refs/heads/master 3caab872d -> eaafcd8a2

[SPARK-25605][TESTS] Alternate take. Run cast string to timestamp tests for a subset of timezones

## What changes were proposed in this pull request?

Test timezones in parallel in CastSuite, instead of random sampling. See also #22631.

## How was this patch tested?

Existing test.

Closes #22672 from srowen/SPARK-25605.2.

Authored-by: Sean Owen
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eaafcd8a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eaafcd8a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eaafcd8a

Branch: refs/heads/master
Commit: eaafcd8a22db187e87f09966826dcf677c4c38ea
Parents: 3caab87
Author: Sean Owen
Authored: Wed Oct 10 08:25:12 2018 -0700
Committer: Sean Owen
Committed: Wed Oct 10 08:25:12 2018 -0700

----------------------------------------------------------------------
 .../org/apache/spark/sql/catalyst/expressions/CastSuite.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/eaafcd8a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
index 90c0bf7..94dee7e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
@@ -112,7 +112,7 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
   }

   test("cast string to timestamp") {
-    for (tz <- Random.shuffle(ALL_TIMEZONES).take(50)) {
+    ALL_TIMEZONES.par.foreach { tz =>
       def checkCastStringToTimestamp(str: String, expected: Timestamp): Unit = {
         checkEvaluation(cast(Literal(str), TimestampType, Option(tz.getID)), expected)
       }
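Editor's note: `.par` turns the timezone list into a parallel collection (available without extra modules on Scala 2.12, which Spark used at the time), so every timezone can be exercised without the wall-clock cost that motivated the earlier random sampling. A hedged standalone sketch of the pattern:

```scala
import java.util.TimeZone

// Run a per-timezone check over every available timezone in parallel.
// The assertion body is a placeholder for the real checkCastStringToTimestamp calls.
val allTimeZones: Seq[TimeZone] =
  TimeZone.getAvailableIDs.toSeq.map(TimeZone.getTimeZone)

allTimeZones.par.foreach { tz =>
  assert(tz.getID.nonEmpty) // placeholder per-timezone assertion
}
```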
svn commit: r29990 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_10_08_02-3caab87-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Wed Oct 10 15:17:21 2018
New Revision: 29990

Log:
Apache Spark 3.0.0-SNAPSHOT-2018_10_10_08_02-3caab87 docs

[This commit notification would consist of 1482 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r29989 - in /dev/spark/v2.4.0-rc3-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark
Author: wenchen
Date: Wed Oct 10 14:49:52 2018
New Revision: 29989

Log:
Apache Spark v2.4.0-rc3 docs

[This commit notification would consist of 1474 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r29988 - /dev/spark/v2.4.0-rc3-bin/
Author: wenchen
Date: Wed Oct 10 14:30:18 2018
New Revision: 29988

Log:
Apache Spark v2.4.0-rc3

Added:
    dev/spark/v2.4.0-rc3-bin/
    dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz   (with props)
    dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.asc
    dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.sha512
    dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz   (with props)
    dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz.asc
    dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz.sha512
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.6.tgz   (with props)
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.6.tgz.asc
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.6.tgz.sha512
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.7.tgz   (with props)
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.7.tgz.asc
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-hadoop2.7.tgz.sha512
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz   (with props)
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz.asc
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz.sha512
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop.tgz   (with props)
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop.tgz.asc
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0-bin-without-hadoop.tgz.sha512
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0.tgz   (with props)
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0.tgz.asc
    dev/spark/v2.4.0-rc3-bin/spark-2.4.0.tgz.sha512

Added: dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.asc
==============================================================================
--- dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.asc (added)
+++ dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.asc Wed Oct 10 14:30:18 2018
@@ -0,0 +1,17 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJbvglgAAoJEGuscolPT9yKGUAP/jin9W23RsTESt5iJ1UmtKyF
+iEsSLvfjhnkA2hpbyYmWLbn7/NxW8xXpSkypOfOBht16DBTOdYF02hl4nk1Ydrsm
+pRlPaiV8IgzDptT4HKIRF3QG6m+sTntoEBwiFGjsFSjdM585YZDiIv/H5T+Y8pKH
+jzBE69MI1HcMOZlgIMpsR6H3ZxAqpZncYh2SY9nFvvlhjKrcG9fQTPfuoG+0Q62F
+FSCMW36Rzt7DusN6dtlhbCTGW66I0oXbKddT4aoK/lqRXgc3esFcIe8UyGFELRQw
+5tPdyWPy5YpgKu9fHZZjhZmh1AJQzB+/i3Szh1yAXlSkqgLdvA7wGjIIKO3cyspf
+l4FTAl5LMQKF6fhnplon3vdC1x8UX89Ip1pwhYFwHex8fOGFREyp5w/B7A2IflhR
+id/U71w1vdi9xWANoyKVhAYDTZpE9AMGEvh5ACY+jpnw14b6omlqI+zhv+/Gmibi
+dJE6FlpmrI25xxN7t48+Qj59YlXx06C+2JIUvs0LrJUT7M/yFuosJLNPHn3gTamE
+28ZjhiJ0co5JLXcCkuVUfIlnej5B5rjjqanQAN/mibil8invXSVn7Kddn2CVveyt
+vIeD2h/W7WwruzANwAsoTKlYn76S+chDD0biPtK60BfWddcOTRTTZT+PvtfD2fjp
+hrp0mF1QcvGE+c77bmnn
+=hCFv
+-----END PGP SIGNATURE-----

Added: dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.sha512
==============================================================================
--- dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.sha512 (added)
+++ dev/spark/v2.4.0-rc3-bin/SparkR_2.4.0.tar.gz.sha512 Wed Oct 10 14:30:18 2018
@@ -0,0 +1,3 @@
+SparkR_2.4.0.tar.gz: 1530EB56 B6FC9627 1CBEFA2B 918A6D1B A901299E FCC6B396
+                     74319B4D C5063ABA A91DB157 5DBD6299 E28E2D02 126EE70B
+                     A1166CA6 09C903C8 F9A14DF5 C657346E

Added: dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz.asc
==============================================================================
--- dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz.asc (added)
+++ dev/spark/v2.4.0-rc3-bin/pyspark-2.4.0.tar.gz.asc Wed Oct 10 14:30:18 2018
@@ -0,0 +1,17 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJbvgahAAoJEGuscolPT9yKU04P/2N8ZrNc9OhmhqUfTrgcoP7w
+xBby+wWsr1LgT4onxToZnRCMGsMVUFsUibFYCvGj+GJknuHLFPn2C6mceXetZpim
+jYdbIZSWOFOBHfoVPwWqjZiRWhN11wMJnf5O2ZDm+LBKVd8uG1h+bzBkIJz9nlT3
+f7y5JvHf9g8F3imSMhdE1MNJttQMMhKR+4mMWbIlWnvGcMU7+R8Qf7I4ycq0Oam4
+IUdJfxFtpg0YquC12WZ1i5zbq/B/4mCa/LMb6pjYpxH3ifVgFgejIbMKMZbZ4ngQ
+3GcxZHunxD/2EYZJeDoY72m4c9xAHx2aXtgmadBq75hRrdGO2U/QDklyju5VxCnt
+O1F6jlLNGmvsJSJ7+G8IFlzYH87KcdGJMSAIuxEska5B4dPH4dlh+r8w+I4X/37k
+q6Z/sT55eDXA5URhWBe6PmZT7GYHJmkaZQtt72Pvem40btYt1Q9I9xr/elbBzt0P
+KEEzxg4UQZRge3m9s4uzPwNcstPenoELpK7lPmNFlix3cAECJqGDU4ct7bL1qVnk
+tOCLJrLfudAd86enr/Urxi04tL1eHJ1VdRHOgdolNKuw0LavN5PVFcZR2X1jJPDH
+3JPx4mM8qt9BGtkwI5HZzXp6LeLrk6/zOw68f2QBHVZeuf+EE0YzqOP83czNc/rA
+EcduOQbg7TAuPOmdaGcM
+=PSYJ
+-----END PGP SIGNATURE-----

Added:
[spark] Git Push Summary
Repository: spark
Updated Tags:  refs/tags/v2.4.0-rc3 [created] 8e4a99bd2
[2/2] spark git commit: Preparing development version 2.4.1-SNAPSHOT
Preparing development version 2.4.1-SNAPSHOT

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/71b8739f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/71b8739f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/71b8739f

Branch: refs/heads/branch-2.4
Commit: 71b8739fe0f6d63775ee799e5867295ff6637c8c
Parents: 8e4a99b
Author: Wenchen Fan
Authored: Wed Oct 10 13:26:16 2018 +0000
Committer: Wenchen Fan
Committed: Wed Oct 10 13:26:16 2018 +0000

----------------------------------------------------------------------
 R/pkg/DESCRIPTION                                      | 2 +-
 assembly/pom.xml                                       | 2 +-
 common/kvstore/pom.xml                                 | 2 +-
 common/network-common/pom.xml                          | 2 +-
 common/network-shuffle/pom.xml                         | 2 +-
 common/network-yarn/pom.xml                            | 2 +-
 common/sketch/pom.xml                                  | 2 +-
 common/tags/pom.xml                                    | 2 +-
 common/unsafe/pom.xml                                  | 2 +-
 core/pom.xml                                           | 2 +-
 docs/_config.yml                                       | 4 ++--
 examples/pom.xml                                       | 2 +-
 external/avro/pom.xml                                  | 2 +-
 external/docker-integration-tests/pom.xml              | 2 +-
 external/flume-assembly/pom.xml                        | 2 +-
 external/flume-sink/pom.xml                            | 2 +-
 external/flume/pom.xml                                 | 2 +-
 external/kafka-0-10-assembly/pom.xml                   | 2 +-
 external/kafka-0-10-sql/pom.xml                        | 2 +-
 external/kafka-0-10/pom.xml                            | 2 +-
 external/kafka-0-8-assembly/pom.xml                    | 2 +-
 external/kafka-0-8/pom.xml                             | 2 +-
 external/kinesis-asl-assembly/pom.xml                  | 2 +-
 external/kinesis-asl/pom.xml                           | 2 +-
 external/spark-ganglia-lgpl/pom.xml                    | 2 +-
 graphx/pom.xml                                         | 2 +-
 hadoop-cloud/pom.xml                                   | 2 +-
 launcher/pom.xml                                       | 2 +-
 mllib-local/pom.xml                                    | 2 +-
 mllib/pom.xml                                          | 2 +-
 pom.xml                                                | 2 +-
 python/pyspark/version.py                              | 2 +-
 repl/pom.xml                                           | 2 +-
 resource-managers/kubernetes/core/pom.xml              | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml                        | 2 +-
 resource-managers/yarn/pom.xml                         | 2 +-
 sql/catalyst/pom.xml                                   | 2 +-
 sql/core/pom.xml                                       | 2 +-
 sql/hive-thriftserver/pom.xml                          | 2 +-
 sql/hive/pom.xml                                       | 2 +-
 streaming/pom.xml                                      | 2 +-
 tools/pom.xml                                          | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/71b8739f/R/pkg/DESCRIPTION
----------------------------------------------------------------------
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index f52d785..714b6f1 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.4.0
+Version: 2.4.1
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/71b8739f/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 63ab510..ee0de73 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.0</version>
+    <version>2.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/71b8739f/common/kvstore/pom.xml
----------------------------------------------------------------------
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index b10e118..b89e0fe 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.0</version>
+    <version>2.4.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/71b8739f/common/network-common/pom.xml
[1/2] spark git commit: Preparing Spark release v2.4.0-rc3
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 404c84039 -> 71b8739fe

Preparing Spark release v2.4.0-rc3

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8e4a99bd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8e4a99bd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8e4a99bd

Branch: refs/heads/branch-2.4
Commit: 8e4a99bd201b9204fec52580f19ae70a229ed94e
Parents: 404c840
Author: Wenchen Fan
Authored: Wed Oct 10 13:26:12 2018 +0000
Committer: Wenchen Fan
Committed: Wed Oct 10 13:26:12 2018 +0000

----------------------------------------------------------------------
 R/pkg/DESCRIPTION                                      | 2 +-
 assembly/pom.xml                                       | 2 +-
 common/kvstore/pom.xml                                 | 2 +-
 common/network-common/pom.xml                          | 2 +-
 common/network-shuffle/pom.xml                         | 2 +-
 common/network-yarn/pom.xml                            | 2 +-
 common/sketch/pom.xml                                  | 2 +-
 common/tags/pom.xml                                    | 2 +-
 common/unsafe/pom.xml                                  | 2 +-
 core/pom.xml                                           | 2 +-
 docs/_config.yml                                       | 4 ++--
 examples/pom.xml                                       | 2 +-
 external/avro/pom.xml                                  | 2 +-
 external/docker-integration-tests/pom.xml              | 2 +-
 external/flume-assembly/pom.xml                        | 2 +-
 external/flume-sink/pom.xml                            | 2 +-
 external/flume/pom.xml                                 | 2 +-
 external/kafka-0-10-assembly/pom.xml                   | 2 +-
 external/kafka-0-10-sql/pom.xml                        | 2 +-
 external/kafka-0-10/pom.xml                            | 2 +-
 external/kafka-0-8-assembly/pom.xml                    | 2 +-
 external/kafka-0-8/pom.xml                             | 2 +-
 external/kinesis-asl-assembly/pom.xml                  | 2 +-
 external/kinesis-asl/pom.xml                           | 2 +-
 external/spark-ganglia-lgpl/pom.xml                    | 2 +-
 graphx/pom.xml                                         | 2 +-
 hadoop-cloud/pom.xml                                   | 2 +-
 launcher/pom.xml                                       | 2 +-
 mllib-local/pom.xml                                    | 2 +-
 mllib/pom.xml                                          | 2 +-
 pom.xml                                                | 2 +-
 python/pyspark/version.py                              | 2 +-
 repl/pom.xml                                           | 2 +-
 resource-managers/kubernetes/core/pom.xml              | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml                        | 2 +-
 resource-managers/yarn/pom.xml                         | 2 +-
 sql/catalyst/pom.xml                                   | 2 +-
 sql/core/pom.xml                                       | 2 +-
 sql/hive-thriftserver/pom.xml                          | 2 +-
 sql/hive/pom.xml                                       | 2 +-
 streaming/pom.xml                                      | 2 +-
 tools/pom.xml                                          | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/8e4a99bd/R/pkg/DESCRIPTION
----------------------------------------------------------------------
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 714b6f1..f52d785 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.4.1
+Version: 2.4.0
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/8e4a99bd/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index ee0de73..63ab510 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.1-SNAPSHOT</version>
+    <version>2.4.0</version>
     <relativePath>../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/8e4a99bd/common/kvstore/pom.xml
----------------------------------------------------------------------
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index b89e0fe..b10e118 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.1-SNAPSHOT</version>
+    <version>2.4.0</version>
     <relativePath>../../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/8e4a99bd/common/network-common/pom.xml
spark git commit: [SPARK-20946][SPARK-25525][SQL][FOLLOW-UP] Update the migration guide.
Repository: spark
Updated Branches:
  refs/heads/master faf73dcd3 -> 3caab872d

[SPARK-20946][SPARK-25525][SQL][FOLLOW-UP] Update the migration guide.

## What changes were proposed in this pull request?

This is a follow-up PR of #18536 and #22545 to update the migration guide.

## How was this patch tested?

Build and check the doc locally.

Closes #22682 from ueshin/issues/SPARK-20946_25525/migration_guide.

Authored-by: Takuya UESHIN
Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3caab872
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3caab872
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3caab872

Branch: refs/heads/master
Commit: 3caab872db22246c9ab5f3395498f05cb097c142
Parents: faf73dc
Author: Takuya UESHIN
Authored: Wed Oct 10 21:07:59 2018 +0800
Committer: Wenchen Fan
Committed: Wed Oct 10 21:07:59 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 6 ++
 1 file changed, 6 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/3caab872/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index a1d7b11..0d29357 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1890,6 +1890,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

 # Migration Guide

+## Upgrading From Spark SQL 2.4 to 3.0
+
+  - In PySpark, when creating a `SparkSession` with `SparkSession.builder.getOrCreate()`, if there is an existing `SparkContext`, the builder was trying to update the `SparkConf` of the existing `SparkContext` with configurations specified to the builder, but the `SparkContext` is shared by all `SparkSession`s, so we should not update them. Since 3.0, the builder no longer updates the configurations. This is the same behavior as the Java/Scala API in 2.3 and above. If you want to update them, you need to do so prior to creating a `SparkSession`.
+
 ## Upgrading From Spark SQL 2.3 to 2.4

   - In Spark version 2.3 and earlier, the second parameter to array_contains function is implicitly promoted to the element type of first array type parameter. This type promotion can be lossy and may cause `array_contains` function to return wrong result. This problem has been addressed in 2.4 by employing a safer type promotion mechanism. This can cause some change in behavior and are illustrated in the table below.
@@ -2135,6 +2139,8 @@ working with timestamps in `pandas_udf`s to get the best performance, see

   - In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
   - Un-aliased subquery's semantic has not been well defined with confusing behaviors. Since Spark 2.3, we invalidate such confusing cases, for example: `SELECT v.i from (SELECT i FROM v)`, Spark will throw an analysis exception in this case because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for more details.

+  - When creating a `SparkSession` with `SparkSession.builder.getOrCreate()`, if there is an existing `SparkContext`, the builder was trying to update the `SparkConf` of the existing `SparkContext` with configurations specified to the builder, but the `SparkContext` is shared by all `SparkSession`s, so we should not update them. Since 2.3, the builder no longer updates the configurations. If you want to update them, you need to do so prior to creating a `SparkSession`.
+
 ## Upgrading From Spark SQL 2.1 to 2.2

   - Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schema have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time-consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the
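Editor's note: the documented behavior change, illustrated in Scala (an illustration based on the migration-guide text above, not code from the patch; the config key is made up for the example):

```scala
import org.apache.spark.sql.SparkSession

// Core configs must be set before the first SparkContext exists; once it does,
// builder-supplied configs no longer mutate the shared SparkConf.
val spark1 = SparkSession.builder()
  .config("spark.app.example.key", "v1") // hypothetical key, set pre-context: takes effect
  .getOrCreate()

val spark2 = SparkSession.builder()
  .config("spark.app.example.key", "v2") // context already exists: shared conf is NOT updated
  .getOrCreate()

// Per the guide, spark2.sparkContext.getConf.get("spark.app.example.key")
// is still expected to return "v1"; update configs before creating any session.
```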