svn commit: r28649 - in /dev/spark/v2.3.2-rc4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark

2018-08-09 Thread jshao
Author: jshao
Date: Fri Aug 10 05:50:52 2018
New Revision: 28649

Log:
Apache Spark v2.3.2-rc4 docs


[This commit notification would consist of 1446 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



svn commit: r28648 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_22_01-e66f3f9-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Fri Aug 10 05:15:36 2018
New Revision: 28648

Log:
Apache Spark 2.3.3-SNAPSHOT-2018_08_09_22_01-e66f3f9 docs


[This commit notification would consist of 1443 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



svn commit: r28647 - /dev/spark/v2.3.2-rc4-bin/

2018-08-09 Thread jshao
Author: jshao
Date: Fri Aug 10 04:58:55 2018
New Revision: 28647

Log:
Apache Spark v2.3.2-rc4

Added:
dev/spark/v2.3.2-rc4-bin/
dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz   (with props)
dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc
dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512
dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz   (with props)
dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc
dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz   (with props)
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz.asc
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz.sha512
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz   (with props)
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz.asc
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz.sha512
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz   (with props)
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz.asc
dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz.sha512
dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz   (with props)
dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz.asc
dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz.sha512

Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc
==
--- dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc (added)
+++ dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc Fri Aug 10 04:58:55 2018
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIcBAABCgAGBQJbbRLtAAoJENsLIaASlz/QWPAP/RcLNtpDzKSx4/Egl7h+VCNp
+u1j1pKBIZF/I2lNNWPj87JJCoV9JDUcCU8ktzFTVM7sl5EQ+YzgmvnhkVu3QmZPH
+r+kI5wSQIb5OEUytqLo+aEImaW1T3rvQA3SGXFaVXhAOJlCO71HbBJyrGRdjuJ18
++PNi6/riIuCLX2Sd2UHMF+MLpGZGoRbKemg8+/3+CYw7aq+1WNaZJDY2x5yED0Ey
+kFzSc/eV9TlkJSRKX9r/zTrEIbZ4/QLZbplf4lZt+XvAA+0O49VkRKND0IOYUNPQ
+ZIeHOrrqbDefH0Kzx/eQJLtrLBDoKI+olZNVNIL0zNcj47QZNUYUFvXRiFElaDCk
+ks/WXsV4etQbhtxBbFzLRXky3OrSjZKD+X1jSO5ADpch8ePFoemCjiftWqF+D5oy
+h3Ex9O+DNCTGwsVj7DmIaqsDGC6PRRps8zyx5WPVJ+vUY5m9osgsMC/QxbRN9MI5
+rSzo4YqU5FYoAmpdYD1vPX/y7k4oARNi4tcw57ZQi7awJsi/jxFMSinwrAN81WpC
+8mosXpYtRli1uDob1TjY3D0D/gFRdYy8lduUm7tD7IGiYnpT9tmK/md77W6VM4L3
+6cfIqoEBuAfpi/xSdc6arDllro2VFG3mY6j/G5qta5bVfzyc9xqLOZns+mDlN88h
+mYRBkZzOHBoN2eCHPIWE
+=PaR1
+-END PGP SIGNATURE-

Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512
==
--- dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 (added)
+++ dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 Fri Aug 10 04:58:55 2018
@@ -0,0 +1,3 @@
+SparkR_2.3.2.tar.gz: 7EB37D66 8E5826F2 CC1B25AC E51E7C71 D477A379 63676728
+ 2777AA32 E6DAF5C6 690BFF9E CC770A22 0B4DF04E D5D87832
+ FDCE7EBF 76561358 6962F46C 83084A5E

Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc
==
--- dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc (added)
+++ dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc Fri Aug 10 04:58:55 2018
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIcBAABCgAGBQJbbRcNAAoJENsLIaASlz/QMfgQAL5TawmaMQnBJN4NBEoj5DTB
+Tk/iGHPcr6nvzxFVffoSappW4Lfw6kcvXimU7CaYk3qAG+ssdOXP6RtcPz02aybM
+hjCUzbQIJfZUlmeuMAs/Eh6m40bUHMQMTRmY4Bq96MPUEv053Og2c/W08VBbnZjL
+D5fK6MT0xVKzq9aQ5vA1TrR+nDqR+bPkabWiWUCGKCjhKil2ltkKWdw4gflvFzaR
+Un7ItbwlxKb7pQSiLdBkO/aj4XhKVEwJVl2K929OS066fwoPSEslSjqo/K7TtapQ
+uL2i1Sb9P312HcMhDc8ja0y2YlYgIMCxjc5ZyMczHzUaIFbMlwitrfUlDitywhBL
+PIPQpWzvsHkbHLsLjGeV8e10RRgh2PjaDPFFKrJsRSlpEy9pVyuRcGEzIrV/ZAfv
+t6nBCKp96SZwqpCl6cfjUNDgDgVLO9J8My48I45Vhutp69XZvJDDV3OsAPmNERqA
+AuNOWVf1wJEUNPejeMK+HiPbITNSey7DS1fMN77kz8dapZbL0p9NYNhBur0zlXip
+tChlQKuM7TxdtoL1OCCrdNnzqABz6Z1ccR5vOlgj7cIPCA9z1KCyUuOUyIwPtEc4
+FGiTwoEC6rz8BQrk0gezPf8EI/kBgBy+mdGRlNuZvWaTJio6Jj8puMe+E/KNEjcR
+HenOiJR4yCzvr1AcdAzY
+=6eHj
+-END PGP SIGNATURE-

Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512
==
--- dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 (added)
+++ dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 Fri Aug 10 04:58:55 
2018
@@ -0,0 +1,3 @@

spark git commit: [SPARK-24855][SQL][EXTERNAL] Built-in AVRO support should support specified schema on write

2018-08-09 Thread dbtsai
Repository: spark
Updated Branches:
  refs/heads/master bdd27961c -> 0cea9e3cd


[SPARK-24855][SQL][EXTERNAL] Built-in AVRO support should support specified 
schema on write

## What changes were proposed in this pull request?

Allows the `avroSchema` option to be specified on write, letting a user supply a 
schema in cases where this is required.  A trivial use case is reading in an Avro 
dataset, making a small adjustment to one or more columns, and writing it out 
using the same schema.  Implicit schema creation from the SQL struct results in a 
schema that is, for the most part, functionally similar, but not necessarily 
compatible.

Also allows the Avro `fixed` field type to be used for records written with a 
specified `avroSchema`.
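
A minimal usage sketch, assuming Spark 2.4 with the built-in `avro` data source on 
the classpath; the paths and the schema string below are placeholders:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("avro-schema-on-write").getOrCreate()

// Read an existing Avro dataset.
val df = spark.read.format("avro").load("/data/events")

// Avro schema JSON obtained elsewhere (placeholder path to the original .avsc file).
val originalSchema: String = {
  val src = scala.io.Source.fromFile("/data/events.avsc")
  try src.mkString finally src.close()
}

// Make a small adjustment and write back out with the exact same Avro schema,
// so enum/fixed types are preserved instead of being re-derived from the
// Catalyst schema.
df.filter("status = 'active'")
  .write
  .format("avro")
  .option("avroSchema", originalSchema)
  .save("/data/events_active")
```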

## How was this patch tested?

Unit tests in AvroSuite are extended to test this with enum and fixed types.

Please review http://spark.apache.org/contributing.html before opening a pull 
request.

Closes #21847 from lindblombr/specify_schema_on_write.

Lead-authored-by: Brian Lindblom 
Co-authored-by: DB Tsai 
Signed-off-by: DB Tsai 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0cea9e3c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0cea9e3c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0cea9e3c

Branch: refs/heads/master
Commit: 0cea9e3cd0a92799bdcc0f9bc2cf96259c343a30
Parents: bdd2796
Author: Brian Lindblom 
Authored: Fri Aug 10 03:35:29 2018 +
Committer: DB Tsai 
Committed: Fri Aug 10 03:35:29 2018 +

--
 .../apache/spark/sql/avro/AvroFileFormat.scala  |   6 +-
 .../apache/spark/sql/avro/AvroSerializer.scala  |  40 +++-
 .../org/apache/spark/sql/avro/AvroSuite.scala   | 228 ++-
 3 files changed, 257 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0cea9e3c/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala
--
diff --git 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala 
b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala
index 6ffcf37..6df23c9 100755
--- 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala
+++ 
b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala
@@ -113,8 +113,10 @@ private[avro] class AvroFileFormat extends FileFormat
   options: Map[String, String],
   dataSchema: StructType): OutputWriterFactory = {
 val parsedOptions = new AvroOptions(options, 
spark.sessionState.newHadoopConf())
-val outputAvroSchema = SchemaConverters.toAvroType(dataSchema, nullable = 
false,
-  parsedOptions.recordName, parsedOptions.recordNamespace, 
parsedOptions.outputTimestampType)
+val outputAvroSchema: Schema = parsedOptions.schema
+  .map(new Schema.Parser().parse)
+  .getOrElse(SchemaConverters.toAvroType(dataSchema, nullable = false,
+parsedOptions.recordName, parsedOptions.recordNamespace))
 
 AvroJob.setOutputKeySchema(job, outputAvroSchema)
 

http://git-wip-us.apache.org/repos/asf/spark/blob/0cea9e3c/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala
--
diff --git 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala 
b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala
index 9885826..216c52a 100644
--- 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala
+++ 
b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala
@@ -23,8 +23,8 @@ import scala.collection.JavaConverters._
 
 import org.apache.avro.LogicalTypes.{TimestampMicros, TimestampMillis}
 import org.apache.avro.Schema
-import org.apache.avro.Schema.Type.NULL
-import org.apache.avro.generic.GenericData.Record
+import org.apache.avro.Schema.Type
+import org.apache.avro.generic.GenericData.{EnumSymbol, Fixed, Record}
 import org.apache.avro.util.Utf8
 
 import org.apache.spark.sql.catalyst.InternalRow
@@ -87,10 +87,36 @@ class AvroSerializer(rootCatalystType: DataType, 
rootAvroType: Schema, nullable:
 (getter, ordinal) => getter.getDouble(ordinal)
   case d: DecimalType =>
 (getter, ordinal) => getter.getDecimal(ordinal, d.precision, 
d.scale).toString
-  case StringType =>
-(getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes)
-  case BinaryType =>
-(getter, ordinal) => ByteBuffer.wrap(getter.getBinary(ordinal))
+  case StringType => avroType.getType match {
+case Type.ENUM =>
+  import scala.collection.JavaConverters._
+  val enumSymbols: Set[String] = avroType.getEnumSymbols.asScala.toSet
+  (getter, ordinal) =>
+

svn commit: r28646 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_20_02-6c7bb57-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Fri Aug 10 03:15:55 2018
New Revision: 28646

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_09_20_02-6c7bb57 docs


[This commit notification would consist of 1476 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24251][SQL] Add analysis tests for AppendData.

2018-08-09 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 6c7bb575b -> bdd27961c


[SPARK-24251][SQL] Add analysis tests for AppendData.

## What changes were proposed in this pull request?

This is a follow-up to #21305 that adds a test suite for AppendData analysis.

This also fixes the following problems uncovered by these tests:
* Incorrect order of data types passed to `canWrite` is fixed
* The field check calls `canWrite` first to ensure all errors are found
* `AppendData#resolved` must check resolution of the query's attributes
* Column names are quoted so that empty names show up in error messages (see the sketch below)
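
A tiny illustration (not taken from the patch) of why the quoting matters when a 
column name is empty:

```scala
val tableCols = Seq("id", "", "value")   // hypothetical column names, one of them empty

// Unquoted, the empty name is invisible in the error message:
println(tableCols.mkString(", "))                      // id, , value
// Quoted, as the analyzer now does, the empty name shows up clearly:
println(tableCols.map(c => s"'$c'").mkString(", "))    // 'id', '', 'value'
```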

## How was this patch tested?

This PR adds a test suite for AppendData analysis.

Closes #22043 from rdblue/SPARK-24251-add-append-data-analysis-tests.

Authored-by: Ryan Blue 
Signed-off-by: Wenchen Fan 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bdd27961
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bdd27961
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bdd27961

Branch: refs/heads/master
Commit: bdd27961c870a3c443686cdbb6dd0eee3ad32012
Parents: 6c7bb57
Author: Ryan Blue 
Authored: Fri Aug 10 11:10:23 2018 +0800
Committer: Wenchen Fan 
Committed: Fri Aug 10 11:10:23 2018 +0800

--
 .../spark/sql/catalyst/analysis/Analyzer.scala  |  16 +-
 .../plans/logical/basicLogicalOperators.scala   |  15 +-
 .../analysis/DataSourceV2AnalysisSuite.scala| 379 +++
 3 files changed, 397 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bdd27961/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index a7cd96e..d00b82d 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -2258,8 +2258,8 @@ class Analyzer(
   if (expected.size < query.output.size) {
 throw new AnalysisException(
   s"""Cannot write to '$tableName', too many data columns:
- |Table columns: ${expected.map(_.name).mkString(", ")}
- |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+ |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", 
")}
+ |Data columns: ${query.output.map(c => 
s"'${c.name}'").mkString(", ")}""".stripMargin)
   }
 
   val errors = new mutable.ArrayBuffer[String]()
@@ -2278,8 +2278,9 @@ class Analyzer(
 if (expected.size > query.output.size) {
   throw new AnalysisException(
 s"""Cannot write to '$tableName', not enough data columns:
-   |Table columns: ${expected.map(_.name).mkString(", ")}
-   |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+   |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", 
")}
+   |Data columns: ${query.output.map(c => 
s"'${c.name}'").mkString(", ")}"""
+.stripMargin)
 }
 
 query.output.zip(expected).flatMap {
@@ -2301,12 +2302,15 @@ class Analyzer(
 queryExpr: NamedExpression,
 addError: String => Unit): Option[NamedExpression] = {
 
+  // run the type check first to ensure type errors are present
+  val canWrite = DataType.canWrite(
+queryExpr.dataType, tableAttr.dataType, resolver, tableAttr.name, 
addError)
+
   if (queryExpr.nullable && !tableAttr.nullable) {
 addError(s"Cannot write nullable values to non-null column 
'${tableAttr.name}'")
 None
 
-  } else if (!DataType.canWrite(
-  tableAttr.dataType, queryExpr.dataType, resolver, tableAttr.name, 
addError)) {
+  } else if (!canWrite) {
 None
 
   } else {

http://git-wip-us.apache.org/repos/asf/spark/blob/bdd27961/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 0d31c6f..a6631a8 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -363,13 +363,14 @@ case class AppendData(
   override def output: Seq[Attribute] = Seq.empty
 
   override lazy val 

[1/2] spark git commit: Preparing Spark release v2.3.2-rc4

2018-08-09 Thread jshao
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 b426ec583 -> e66f3f9b1


Preparing Spark release v2.3.2-rc4


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6930f488
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6930f488
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6930f488

Branch: refs/heads/branch-2.3
Commit: 6930f4885356eaec2c1e85896be3c93a80ce779c
Parents: b426ec5
Author: Saisai Shao 
Authored: Fri Aug 10 02:06:28 2018 +
Committer: Saisai Shao 
Committed: Fri Aug 10 02:06:28 2018 +

--
 R/pkg/DESCRIPTION | 2 +-
 assembly/pom.xml  | 2 +-
 common/kvstore/pom.xml| 2 +-
 common/network-common/pom.xml | 2 +-
 common/network-shuffle/pom.xml| 2 +-
 common/network-yarn/pom.xml   | 2 +-
 common/sketch/pom.xml | 2 +-
 common/tags/pom.xml   | 2 +-
 common/unsafe/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 docs/_config.yml  | 4 ++--
 examples/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml | 2 +-
 external/flume-assembly/pom.xml   | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-0-10-assembly/pom.xml  | 2 +-
 external/kafka-0-10-sql/pom.xml   | 2 +-
 external/kafka-0-10/pom.xml   | 2 +-
 external/kafka-0-8-assembly/pom.xml   | 2 +-
 external/kafka-0-8/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml | 2 +-
 external/kinesis-asl/pom.xml  | 2 +-
 external/spark-ganglia-lgpl/pom.xml   | 2 +-
 graphx/pom.xml| 2 +-
 hadoop-cloud/pom.xml  | 2 +-
 launcher/pom.xml  | 2 +-
 mllib-local/pom.xml   | 2 +-
 mllib/pom.xml | 2 +-
 pom.xml   | 2 +-
 python/pyspark/version.py | 2 +-
 repl/pom.xml  | 2 +-
 resource-managers/kubernetes/core/pom.xml | 2 +-
 resource-managers/mesos/pom.xml   | 2 +-
 resource-managers/yarn/pom.xml| 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 41 files changed, 42 insertions(+), 42 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/R/pkg/DESCRIPTION
--
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 6ec4966..8df2635 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.3.3
+Version: 2.3.2
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index f8b15cc..57485fc 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.3-SNAPSHOT
+2.3.2
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/kvstore/pom.xml
--
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index e412a47..53e58c2 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.3-SNAPSHOT
+2.3.2
 ../../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/network-common/pom.xml
--
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index d8f9a3d..d05647c 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.3-SNAPSHOT
+2.3.2
 ../../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/network-shuffle/pom.xml
--
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index a1a4f87..8d46761 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml

[spark] Git Push Summary

2018-08-09 Thread jshao
Repository: spark
Updated Tags:  refs/tags/v2.3.2-rc4 [created] 6930f4885

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[2/2] spark git commit: Preparing development version 2.3.3-SNAPSHOT

2018-08-09 Thread jshao
Preparing development version 2.3.3-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e66f3f9b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e66f3f9b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e66f3f9b

Branch: refs/heads/branch-2.3
Commit: e66f3f9b11f37261ec6cbcb6bb2ebeb34e56a968
Parents: 6930f48
Author: Saisai Shao 
Authored: Fri Aug 10 02:06:37 2018 +
Committer: Saisai Shao 
Committed: Fri Aug 10 02:06:37 2018 +

--
 R/pkg/DESCRIPTION | 2 +-
 assembly/pom.xml  | 2 +-
 common/kvstore/pom.xml| 2 +-
 common/network-common/pom.xml | 2 +-
 common/network-shuffle/pom.xml| 2 +-
 common/network-yarn/pom.xml   | 2 +-
 common/sketch/pom.xml | 2 +-
 common/tags/pom.xml   | 2 +-
 common/unsafe/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 docs/_config.yml  | 4 ++--
 examples/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml | 2 +-
 external/flume-assembly/pom.xml   | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-0-10-assembly/pom.xml  | 2 +-
 external/kafka-0-10-sql/pom.xml   | 2 +-
 external/kafka-0-10/pom.xml   | 2 +-
 external/kafka-0-8-assembly/pom.xml   | 2 +-
 external/kafka-0-8/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml | 2 +-
 external/kinesis-asl/pom.xml  | 2 +-
 external/spark-ganglia-lgpl/pom.xml   | 2 +-
 graphx/pom.xml| 2 +-
 hadoop-cloud/pom.xml  | 2 +-
 launcher/pom.xml  | 2 +-
 mllib-local/pom.xml   | 2 +-
 mllib/pom.xml | 2 +-
 pom.xml   | 2 +-
 python/pyspark/version.py | 2 +-
 repl/pom.xml  | 2 +-
 resource-managers/kubernetes/core/pom.xml | 2 +-
 resource-managers/mesos/pom.xml   | 2 +-
 resource-managers/yarn/pom.xml| 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 41 files changed, 42 insertions(+), 42 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/R/pkg/DESCRIPTION
--
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 8df2635..6ec4966 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.3.2
+Version: 2.3.3
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 57485fc..f8b15cc 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.2
+2.3.3-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/kvstore/pom.xml
--
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 53e58c2..e412a47 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.2
+2.3.3-SNAPSHOT
 ../../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/network-common/pom.xml
--
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index d05647c..d8f9a3d 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.11
-2.3.2
+2.3.3-SNAPSHOT
 ../../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/network-shuffle/pom.xml
--
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 8d46761..a1a4f87 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.11
-

svn commit: r28644 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_18_02-b426ec5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Fri Aug 10 01:15:33 2018
New Revision: 28644

Log:
Apache Spark 2.3.3-SNAPSHOT-2018_08_09_18_02-b426ec5 docs


[This commit notification would consist of 1443 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24886][INFRA] Fix the testing script to increase timeout for Jenkins build (from 300m to 340m)

2018-08-09 Thread gurwls223
Repository: spark
Updated Branches:
  refs/heads/master 9b8521e53 -> 6c7bb575b


[SPARK-24886][INFRA] Fix the testing script to increase timeout for Jenkins 
build (from 300m to 340m)

## What changes were proposed in this pull request?

Currently, it looks like we hit the time limit from time to time. It seems better 
to increase the timeout a bit.

For instance, please see https://github.com/apache/spark/pull/21822

For clarification, the current Jenkins timeout is 400m. This PR just proposes to 
fix the test script to increase its timeout correspondingly.

*This PR does not target to change the build configuration*

## How was this patch tested?

Jenkins tests.

Closes #21845 from HyukjinKwon/SPARK-24886.

Authored-by: hyukjinkwon 
Signed-off-by: hyukjinkwon 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6c7bb575
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6c7bb575
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6c7bb575

Branch: refs/heads/master
Commit: 6c7bb575bf8b0bfc26f23e0ef449aaded77d3789
Parents: 9b8521e
Author: hyukjinkwon 
Authored: Fri Aug 10 09:12:17 2018 +0800
Committer: hyukjinkwon 
Committed: Fri Aug 10 09:12:17 2018 +0800

--
 dev/run-tests-jenkins.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6c7bb575/dev/run-tests-jenkins.py
--
diff --git a/dev/run-tests-jenkins.py b/dev/run-tests-jenkins.py
index 3960a0d..16af97c 100755
--- a/dev/run-tests-jenkins.py
+++ b/dev/run-tests-jenkins.py
@@ -181,8 +181,8 @@ def main():
 short_commit_hash = ghprb_actual_commit[0:7]
 
 # format: http://linux.die.net/man/1/timeout
-# must be less than the timeout configured on Jenkins (currently 350m)
-tests_timeout = "300m"
+# must be less than the timeout configured on Jenkins (currently 400m)
+tests_timeout = "340m"
 
 # Array to capture all test names to run on the pull request. These tests 
are represented
 # by their file equivalents in the dev/tests/ directory.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



svn commit: r28641 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_16_02-9b8521e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Thu Aug  9 23:16:16 2018
New Revision: 28641

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_09_16_02-9b8521e docs


[This commit notification would consist of 1476 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 b2e0f68f6 -> 42229430f


[SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails 
w/java 8 181-b13

- Updated DateTimeUtilsSuite so that multiple skipped dates can be specified when 
testing round-tripping in daysToMillis and millisToDays.
- Updated the test so that both New Year's Eve 2014 and New Year's Day 2015 are 
skipped for Kiribati time zones (see the java.time sketch below). This is necessary 
because Java versions before 181-b13 considered New Year's Day 2015 to be skipped, 
while subsequent versions corrected this to New Year's Eve.
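
For context, a small `java.time` illustration (not the patch's code) of why such a 
day cannot round-trip: epoch day 15338 is 2011-12-30, which Samoa (Pacific/Apia) 
skipped entirely when it crossed the International Date Line:

```scala
import java.time.{LocalDate, ZoneId}

val apia = ZoneId.of("Pacific/Apia")
val skipped = LocalDate.ofEpochDay(15338)        // 2011-12-30
// The whole day falls into the offset-transition gap, so resolving its start of
// day jumps forward to 2011-12-31T00:00 local time.
val resolved = skipped.atStartOfDay(apia)
println(resolved.toLocalDate.toEpochDay)         // 15339 -- the round trip does not return 15338
```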

Unit tests

Author: Chris Martin 

Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures.

(cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/42229430
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/42229430
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/42229430

Branch: refs/heads/branch-2.1
Commit: 42229430f94ba33fb614628b9438e699b4922099
Parents: b2e0f68
Author: Chris Martin 
Authored: Sat Jul 28 10:40:10 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 17:31:10 2018 -0500

--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala  | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/42229430/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
index e0a9a0c..a62a3d0 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
@@ -538,19 +538,19 @@ class DateTimeUtilsSuite extends SparkFunSuite {
 
   test("daysToMillis and millisToDays") {
 // There are some days are skipped entirely in some timezone, skip them 
here.
-val skipped_days = Map[String, Int](
-  "Kwajalein" -> 8632,
-  "Pacific/Apia" -> 15338,
-  "Pacific/Enderbury" -> 9131,
-  "Pacific/Fakaofo" -> 15338,
-  "Pacific/Kiritimati" -> 9131,
-  "Pacific/Kwajalein" -> 8632,
-  "MIT" -> 15338)
+val skipped_days = Map[String, Set[Int]](
+  "Kwajalein" -> Set(8632),
+  "Pacific/Apia" -> Set(15338),
+  "Pacific/Enderbury" -> Set(9130, 9131),
+  "Pacific/Fakaofo" -> Set(15338),
+  "Pacific/Kiritimati" -> Set(9130, 9131),
+  "Pacific/Kwajalein" -> Set(8632),
+  "MIT" -> Set(15338))
 for (tz <- DateTimeTestUtils.ALL_TIMEZONES) {
   DateTimeTestUtils.withDefaultTimeZone(tz) {
-val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue)
+val skipped = skipped_days.getOrElse(tz.getID, Set.empty)
 (-2 to 2).foreach { d =>
-  if (d != skipped) {
+  if (!skipped.contains(d)) {
 assert(millisToDays(daysToMillis(d)) === d,
   s"Round trip of ${d} did not work in tz ${tz}")
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 9bfc55b1b -> b426ec583


[SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails 
w/java 8 181-b13

## What changes were proposed in this pull request?

- Updated DateTimeUtilsSuite so that multiple skipped dates can be specified when 
testing round-tripping in daysToMillis and millisToDays.
- Updated the test so that both New Year's Eve 2014 and New Year's Day 2015 are 
skipped for Kiribati time zones. This is necessary because Java versions before 
181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions 
corrected this to New Year's Eve.

## How was this patch tested?
Unit tests

Author: Chris Martin 

Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures.

(cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b426ec58
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b426ec58
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b426ec58

Branch: refs/heads/branch-2.3
Commit: b426ec583fb5176461c5b0c7112d2194af66d93d
Parents: 9bfc55b
Author: Chris Martin 
Authored: Sat Jul 28 10:40:10 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 17:24:24 2018 -0500

--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala  | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b426ec58/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
index 625ff38..b025b85 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
@@ -650,18 +650,18 @@ class DateTimeUtilsSuite extends SparkFunSuite {
 assert(daysToMillis(16800, TimeZoneGMT) === c.getTimeInMillis)
 
 // There are some days are skipped entirely in some timezone, skip them 
here.
-val skipped_days = Map[String, Int](
-  "Kwajalein" -> 8632,
-  "Pacific/Apia" -> 15338,
-  "Pacific/Enderbury" -> 9131,
-  "Pacific/Fakaofo" -> 15338,
-  "Pacific/Kiritimati" -> 9131,
-  "Pacific/Kwajalein" -> 8632,
-  "MIT" -> 15338)
+val skipped_days = Map[String, Set[Int]](
+  "Kwajalein" -> Set(8632),
+  "Pacific/Apia" -> Set(15338),
+  "Pacific/Enderbury" -> Set(9130, 9131),
+  "Pacific/Fakaofo" -> Set(15338),
+  "Pacific/Kiritimati" -> Set(9130, 9131),
+  "Pacific/Kwajalein" -> Set(8632),
+  "MIT" -> Set(15338))
 for (tz <- DateTimeTestUtils.ALL_TIMEZONES) {
-  val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue)
+  val skipped = skipped_days.getOrElse(tz.getID, Set.empty)
   (-2 to 2).foreach { d =>
-if (d != skipped) {
+if (!skipped.contains(d)) {
   assert(millisToDays(daysToMillis(d, tz), tz) === d,
 s"Round trip of ${d} did not work in tz ${tz}")
 }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 53ac8504b -> b283c1f05


[SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails 
w/java 8 181-b13

## What changes were proposed in this pull request?

- Updated DateTimeUtilsSuite so that multiple skipped dates can be specified when 
testing round-tripping in daysToMillis and millisToDays.
- Updated the test so that both New Year's Eve 2014 and New Year's Day 2015 are 
skipped for Kiribati time zones. This is necessary because Java versions before 
181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions 
corrected this to New Year's Eve.

## How was this patch tested?
Unit tests

Author: Chris Martin 

Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures.

(cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b283c1f0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b283c1f0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b283c1f0

Branch: refs/heads/branch-2.2
Commit: b283c1f055521e4090a9829924e5c63810bb0c89
Parents: 53ac850
Author: Chris Martin 
Authored: Sat Jul 28 10:40:10 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 17:24:43 2018 -0500

--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala  | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b283c1f0/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
index c8cf16d..deaf2f9 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
@@ -580,18 +580,18 @@ class DateTimeUtilsSuite extends SparkFunSuite {
 assert(daysToMillis(16800, TimeZoneGMT) === c.getTimeInMillis)
 
 // There are some days are skipped entirely in some timezone, skip them 
here.
-val skipped_days = Map[String, Int](
-  "Kwajalein" -> 8632,
-  "Pacific/Apia" -> 15338,
-  "Pacific/Enderbury" -> 9131,
-  "Pacific/Fakaofo" -> 15338,
-  "Pacific/Kiritimati" -> 9131,
-  "Pacific/Kwajalein" -> 8632,
-  "MIT" -> 15338)
+val skipped_days = Map[String, Set[Int]](
+  "Kwajalein" -> Set(8632),
+  "Pacific/Apia" -> Set(15338),
+  "Pacific/Enderbury" -> Set(9130, 9131),
+  "Pacific/Fakaofo" -> Set(15338),
+  "Pacific/Kiritimati" -> Set(9130, 9131),
+  "Pacific/Kwajalein" -> Set(8632),
+  "MIT" -> Set(15338))
 for (tz <- DateTimeTestUtils.ALL_TIMEZONES) {
-  val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue)
+  val skipped = skipped_days.getOrElse(tz.getID, Set.empty)
   (-2 to 2).foreach { d =>
-if (d != skipped) {
+if (!skipped.contains(d)) {
   assert(millisToDays(daysToMillis(d, tz), tz) === d,
 s"Round trip of ${d} did not work in tz ${tz}")
 }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-25068][SQL] Add exists function.

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master fec67ed7e -> 9b8521e53


[SPARK-25068][SQL] Add exists function.

## What changes were proposed in this pull request?

This PR adds the `exists` function, which tests whether a predicate holds for one 
or more elements in the array.

```sql
> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
 true
```
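
The same function from the Scala side, assuming a Spark 2.4 `spark-shell` session 
where `spark` is predefined; the DataFrame here is a made-up example:

```scala
import spark.implicits._

val df = Seq(Seq(1, 2, 3), Seq(1, 3, 5)).toDF("xs")

// The new SQL higher-order function is reachable through an expression string.
df.selectExpr("exists(xs, x -> x % 2 == 0) AS has_even").show()
// +--------+
// |has_even|
// +--------+
// |    true|
// |   false|
// +--------+
```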

## How was this patch tested?

Added tests.

Closes #22052 from ueshin/issues/SPARK-25068/exists.

Authored-by: Takuya UESHIN 
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9b8521e5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9b8521e5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9b8521e5

Branch: refs/heads/master
Commit: 9b8521e53e56a53b44c02366a99f8a8ee1307bbf
Parents: fec67ed
Author: Takuya UESHIN 
Authored: Thu Aug 9 14:41:59 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 14:41:59 2018 -0700

--
 .../catalyst/analysis/FunctionRegistry.scala|  1 +
 .../expressions/higherOrderFunctions.scala  | 47 ++
 .../expressions/HigherOrderFunctionsSuite.scala | 37 
 .../sql-tests/inputs/higher-order-functions.sql |  6 ++
 .../results/higher-order-functions.sql.out  | 18 
 .../spark/sql/DataFrameFunctionsSuite.scala | 96 
 6 files changed, 205 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index 390debd..15543c9 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -444,6 +444,7 @@ object FunctionRegistry {
 expression[ArrayTransform]("transform"),
 expression[MapFilter]("map_filter"),
 expression[ArrayFilter]("filter"),
+expression[ArrayExists]("exists"),
 expression[ArrayAggregate]("aggregate"),
 CreateStruct.registryEntry,
 

http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
index d206733..7f8203a 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
@@ -357,6 +357,53 @@ case class ArrayFilter(
 }
 
 /**
+ * Tests whether a predicate holds for one or more elements in the array.
+ */
+@ExpressionDescription(usage =
+  "_FUNC_(expr, pred) - Tests whether a predicate holds for one or more 
elements in the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3), x -> x % 2 == 0);
+   true
+  """,
+  since = "2.4.0")
+case class ArrayExists(
+input: Expression,
+function: Expression)
+  extends ArrayBasedSimpleHigherOrderFunction with CodegenFallback {
+
+  override def nullable: Boolean = input.nullable
+
+  override def dataType: DataType = BooleanType
+
+  override def expectingFunctionType: AbstractDataType = BooleanType
+
+  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => 
LambdaFunction): ArrayExists = {
+val elem = HigherOrderFunction.arrayArgumentType(input.dataType)
+copy(function = f(function, elem :: Nil))
+  }
+
+  @transient lazy val LambdaFunction(_, Seq(elementVar: NamedLambdaVariable), 
_) = function
+
+  override def nullSafeEval(inputRow: InternalRow, value: Any): Any = {
+val arr = value.asInstanceOf[ArrayData]
+val f = functionForEval
+var exists = false
+var i = 0
+while (i < arr.numElements && !exists) {
+  elementVar.value.set(arr.get(i, elementVar.dataType))
+  if (f.eval(inputRow).asInstanceOf[Boolean]) {
+exists = true
+  }
+  i += 1
+}
+exists
+  }
+
+  override def prettyName: String = "exists"
+}
+
+/**
  * Applies a binary operator to a start value and all elements in the array.
  */
 @ExpressionDescription(

http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala

spark git commit: [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 7d465d8f4 -> 9bfc55b1b


[SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession

## What changes were proposed in this pull request?

When a `SparkSession` is stopped, `SQLConf.get` should use the fallback conf to 
avoid weird issues like
```
sbt.ForkMain$ForkError: java.lang.IllegalStateException: LiveListenerBus is 
stopped.
at 
org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97)
at 
org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80)
at 
org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:93)
at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120)
at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120)
at scala.Option.getOrElse(Option.scala:121)
...
```

## How was this patch tested?

a new test suite
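
A rough sketch of the shape such a test can take (the builder settings and 
assertions here are illustrative assumptions, not the exact suite added by this 
patch, and it assumes the internal `SQLConf` API is accessible, as it is for 
suites living under `org.apache.spark.sql`):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.internal.SQLConf

val spark = SparkSession.builder().master("local[1]").appName("sqlconf-getter").getOrCreate()

// While the session is active, the conf getter resolves to the session's conf.
assert(SQLConf.get eq spark.sessionState.conf)

spark.stop()

// After stop(), the getter must not hand back the stopped session's conf;
// it should fall back to the default conf instead of touching the stopped context.
assert(SQLConf.get ne spark.sessionState.conf)
```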

Closes #22056 from cloud-fan/session.

Authored-by: Wenchen Fan 
Signed-off-by: Xiao Li 
(cherry picked from commit fec67ed7e95483c5ea97a7b263ad4bea7d3d42b5)
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9bfc55b1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9bfc55b1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9bfc55b1

Branch: refs/heads/branch-2.3
Commit: 9bfc55b1b0aae269320bb978027a800fd1878149
Parents: 7d465d8
Author: Wenchen Fan 
Authored: Thu Aug 9 14:38:58 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 14:40:09 2018 -0700

--
 .../org/apache/spark/sql/SparkSession.scala |  3 +-
 .../apache/spark/sql/LocalSparkSession.scala|  9 ++
 .../spark/sql/internal/SQLConfGetterSuite.scala | 33 
 3 files changed, 37 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index b699ccd..adc7143 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -92,7 +92,8 @@ class SparkSession private(
 
   // If there is no active SparkSession, uses the default SQL conf. Otherwise, 
use the session's.
   SQLConf.setSQLConfGetter(() => {
-
SparkSession.getActiveSession.map(_.sessionState.conf).getOrElse(SQLConf.getFallbackConf)
+
SparkSession.getActiveSession.filterNot(_.sparkContext.isStopped).map(_.sessionState.conf)
+  .getOrElse(SQLConf.getFallbackConf)
   })
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
index cbef1c7..6b90f20 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
@@ -36,19 +36,14 @@ trait LocalSparkSession extends BeforeAndAfterEach with 
BeforeAndAfterAll { self
 
   override def afterEach() {
 try {
-  resetSparkContext()
+  LocalSparkSession.stop(spark)
   SparkSession.clearActiveSession()
   SparkSession.clearDefaultSession()
+  spark = null
 } finally {
   super.afterEach()
 }
   }
-
-  def resetSparkContext(): Unit = {
-LocalSparkSession.stop(spark)
-spark = null
-  }
-
 }
 
 object LocalSparkSession {

http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
new file mode 100644
index 000..bb79d3a
--- /dev/null
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *   

spark git commit: [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master bd6db1505 -> fec67ed7e


[SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession

## What changes were proposed in this pull request?

When a `SparkSession` is stopped, `SQLConf.get` should use the fallback conf to 
avoid weird issues like
```
sbt.ForkMain$ForkError: java.lang.IllegalStateException: LiveListenerBus is 
stopped.
at 
org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97)
at 
org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80)
at 
org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:93)
at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120)
at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120)
at scala.Option.getOrElse(Option.scala:121)
...
```

## How was this patch tested?

a new test suite

Closes #22056 from cloud-fan/session.

Authored-by: Wenchen Fan 
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fec67ed7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fec67ed7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fec67ed7

Branch: refs/heads/master
Commit: fec67ed7e95483c5ea97a7b263ad4bea7d3d42b5
Parents: bd6db15
Author: Wenchen Fan 
Authored: Thu Aug 9 14:38:58 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 14:38:58 2018 -0700

--
 .../org/apache/spark/sql/SparkSession.scala |  3 +-
 .../apache/spark/sql/LocalSparkSession.scala|  9 ++
 .../spark/sql/internal/SQLConfGetterSuite.scala | 33 
 3 files changed, 37 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index 565042f..d9278d8 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -92,7 +92,8 @@ class SparkSession private(
 
   // If there is no active SparkSession, uses the default SQL conf. Otherwise, 
use the session's.
   SQLConf.setSQLConfGetter(() => {
-
SparkSession.getActiveSession.map(_.sessionState.conf).getOrElse(SQLConf.getFallbackConf)
+
SparkSession.getActiveSession.filterNot(_.sparkContext.isStopped).map(_.sessionState.conf)
+  .getOrElse(SQLConf.getFallbackConf)
   })
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
index cbef1c7..6b90f20 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala
@@ -36,19 +36,14 @@ trait LocalSparkSession extends BeforeAndAfterEach with 
BeforeAndAfterAll { self
 
   override def afterEach() {
 try {
-  resetSparkContext()
+  LocalSparkSession.stop(spark)
   SparkSession.clearActiveSession()
   SparkSession.clearDefaultSession()
+  spark = null
 } finally {
   super.afterEach()
 }
   }
-
-  def resetSparkContext(): Unit = {
-LocalSparkSession.stop(spark)
-spark = null
-  }
-
 }
 
 object LocalSparkSession {

http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
new file mode 100644
index 000..bb79d3a
--- /dev/null
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to 

svn commit: r28640 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_14_02-7d465d8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Thu Aug  9 21:15:35 2018
New Revision: 28640

Log:
Apache Spark 2.3.3-SNAPSHOT-2018_08_09_14_02-7d465d8 docs


[This commit notification would consist of 1443 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-25077][SQL] Delete unused variable in WindowExec

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master eb9a696dd -> bd6db1505


[SPARK-25077][SQL] Delete unused variable in WindowExec

## What changes were proposed in this pull request?

Just delete the unused variable `inputFields` in WindowExec, to avoid confusing 
readers of the code.

## How was this patch tested?

Existing UT.

Closes #22057 from xuanyuanking/SPARK-25077.

Authored-by: liyuanjian 
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bd6db150
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bd6db150
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bd6db150

Branch: refs/heads/master
Commit: bd6db1505fb68737fa1782bd457ddc52eae6652d
Parents: eb9a696
Author: liyuanjian 
Authored: Thu Aug 9 13:43:07 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 13:43:07 2018 -0700

--
 .../scala/org/apache/spark/sql/execution/window/WindowExec.scala   | 2 --
 1 file changed, 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bd6db150/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
index 626f39d..fede0f3 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
@@ -323,8 +323,6 @@ case class WindowExec(
 fetchNextRow()
 
 // Manage the current partition.
-val inputFields = child.output.length
-
 val buffer: ExternalAppendOnlyUnsafeRowArray =
   new ExternalAppendOnlyUnsafeRowArray(inMemoryThreshold, 
spillThreshold)
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



svn commit: r28638 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_12_02-eb9a696-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Thu Aug  9 19:16:15 2018
New Revision: 28638

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_09_12_02-eb9a696 docs


[This commit notification would consist of 1476 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [MINOR][BUILD] Update Jetty to 9.3.24.v20180605

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 9fb70f458 -> 7d465d8f4


[MINOR][BUILD] Update Jetty to 9.3.24.v20180605

Update Jetty to 9.3.24.v20180605 to pick up a security fix

Existing tests.

Closes #22055 from srowen/Jetty9324.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
(cherry picked from commit eb9a696dd6f138225708d15bb2383854ed8a6dab)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7d465d8f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7d465d8f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7d465d8f

Branch: refs/heads/branch-2.3
Commit: 7d465d8f4ad982fbdcfc0129ff9a4952a384bb17
Parents: 9fb70f4
Author: Sean Owen 
Authored: Thu Aug 9 13:04:03 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 13:05:26 2018 -0500

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7d465d8f/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 76e8363..3ff0408 100644
--- a/pom.xml
+++ b/pom.xml
@@ -133,7 +133,7 @@
 1.4.4
 nohive
 1.6.0
-9.3.20.v20170531
+9.3.24.v20180605
 3.1.0
 0.8.4
 2.4.0


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [MINOR][BUILD] Update Jetty to 9.3.24.v20180605

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master d36539741 -> eb9a696dd


[MINOR][BUILD] Update Jetty to 9.3.24.v20180605

## What changes were proposed in this pull request?

Update Jetty to 9.3.24.v20180605 to pick up a security fix

## How was this patch tested?

Existing tests.

Closes #22055 from srowen/Jetty9324.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eb9a696d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eb9a696d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eb9a696d

Branch: refs/heads/master
Commit: eb9a696dd6f138225708d15bb2383854ed8a6dab
Parents: d365397
Author: Sean Owen 
Authored: Thu Aug 9 13:04:03 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 13:04:03 2018 -0500

--
 dev/deps/spark-deps-hadoop-3.1 | 4 ++--
 pom.xml| 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/eb9a696d/dev/deps/spark-deps-hadoop-3.1
--
diff --git a/dev/deps/spark-deps-hadoop-3.1 b/dev/deps/spark-deps-hadoop-3.1
index 90602fc..fb42adf 100644
--- a/dev/deps/spark-deps-hadoop-3.1
+++ b/dev/deps/spark-deps-hadoop-3.1
@@ -120,8 +120,8 @@ jersey-guava-2.22.2.jar
 jersey-media-jaxb-2.22.2.jar
 jersey-server-2.22.2.jar
 jets3t-0.9.4.jar
-jetty-webapp-9.3.20.v20170531.jar
-jetty-xml-9.3.20.v20170531.jar
+jetty-webapp-9.3.24.v20180605.jar
+jetty-xml-9.3.24.v20180605.jar
 jline-2.14.3.jar
 joda-time-2.9.3.jar
 jodd-core-3.5.2.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/eb9a696d/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 8abdb70..b89713f 100644
--- a/pom.xml
+++ b/pom.xml
@@ -134,7 +134,7 @@
 1.5.2
 nohive
 1.6.0
-9.3.20.v20170531
+9.3.24.v20180605
 3.1.0
 0.8.4
 2.4.0


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-24626][SQL] Improve location size calculation in Analyze Table command

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 2949a835f -> d36539741


[SPARK-24626][SQL] Improve location size calculation in Analyze Table command

## What changes were proposed in this pull request?

Currently, the Analyze Table command computes the table size sequentially, one partition at a time. 
We can parallelize the size calculation across partitions (a minimal sketch of the idea follows the timing results below).

Results: Tested on a table with 100 partitions and data stored in S3.
With changes:
- 10.429s
- 10.557s
- 10.439s
- 9.893s


Without changes:
- 110.034s
- 99.510s
- 100.743s
- 99.106s
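
As a rough illustration of the approach, and not the actual CommandUtils code from this patch, here is a minimal Scala sketch that sums per-partition sizes in parallel instead of sequentially; `listFileSizesInBytes` is a hypothetical stand-in for a Hadoop `FileSystem` listing call:

```scala
// Minimal sketch, not Spark's actual implementation: compute the total size of a
// table by listing its partition locations in parallel rather than one at a time.
// `listFileSizesInBytes` is a hypothetical helper standing in for a FileSystem call.
import java.net.URI

def totalTableSizeInBytes(
    partitionLocations: Seq[URI],
    listFileSizesInBytes: URI => Seq[Long]): Long = {
  partitionLocations.par                                   // fan listing calls out over a thread pool
    .map(location => listFileSizesInBytes(location).sum)   // size of one partition
    .sum                                                   // total size across all partitions
}
```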

## How was this patch tested?

Simple unit test.

Closes #21608 from Achuth17/improveAnalyze.

Lead-authored-by: Achuth17 
Co-authored-by: arajagopal17 
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3653974
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3653974
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d3653974

Branch: refs/heads/master
Commit: d36539741ff6a12a6acde9274e9992a66cdd36e7
Parents: 2949a83
Author: Achuth17 
Authored: Thu Aug 9 08:29:24 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 08:29:24 2018 -0700

--
 docs/sql-programming-guide.md   |  2 ++
 .../org/apache/spark/sql/internal/SQLConf.scala | 12 
 .../command/AnalyzeColumnCommand.scala  |  2 +-
 .../execution/command/AnalyzeTableCommand.scala |  2 +-
 .../sql/execution/command/CommandUtils.scala| 30 +++-
 .../execution/datasources/DataSourceUtils.scala | 10 +++
 .../datasources/InMemoryFileIndex.scala |  2 +-
 .../apache/spark/sql/hive/StatisticsSuite.scala | 23 ++-
 8 files changed, 72 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d3653974/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index a1e019c..9adb86a 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1892,6 +1892,8 @@ working with timestamps in `pandas_udf`s to get the best 
performance, see
   - In version 2.3 and earlier, Spark converts Parquet Hive tables by default 
but ignores table properties like `TBLPROPERTIES (parquet.compression 'NONE')`. 
This happens for ORC Hive table properties like `TBLPROPERTIES (orc.compress 
'NONE')` in case of `spark.sql.hive.convertMetastoreOrc=true`, too. Since Spark 
2.4, Spark respects Parquet/ORC specific table properties while converting 
Parquet/ORC Hive tables. As an example, `CREATE TABLE t(id int) STORED AS 
PARQUET TBLPROPERTIES (parquet.compression 'NONE')` would generate Snappy 
parquet files during insertion in Spark 2.3, and in Spark 2.4, the result would 
be uncompressed parquet files.
   - Since Spark 2.0, Spark converts Parquet Hive tables by default for better 
performance. Since Spark 2.4, Spark converts ORC Hive tables by default, too. 
It means Spark uses its own ORC support by default instead of Hive SerDe. As an 
example, `CREATE TABLE t(id int) STORED AS ORC` would be handled with Hive 
SerDe in Spark 2.3, and in Spark 2.4, it would be converted into Spark's ORC 
data source table and ORC vectorization would be applied. To set `false` to 
`spark.sql.hive.convertMetastoreOrc` restores the previous behavior.
   - In version 2.3 and earlier, CSV rows are considered as malformed if at 
least one column value in the row is malformed. CSV parser dropped such rows in 
the DROPMALFORMED mode or outputs an error in the FAILFAST mode. Since Spark 
2.4, CSV row is considered as malformed only when it contains malformed column 
values requested from CSV datasource, other values can be ignored. As an 
example, CSV file contains the "id,name" header and one row "1234". In Spark 
2.4, selection of the id column consists of a row with one column value 1234 
but in Spark 2.3 and earlier it is empty in the DROPMALFORMED mode. To restore 
the previous behavior, set `spark.sql.csv.parser.columnPruning.enabled` to 
`false`.
+  - Since Spark 2.4, File listing for compute statistics is done in parallel 
by default. This can be disabled by setting 
`spark.sql.parallelFileListingInStatsComputation.enabled` to `False`.
+  - Since Spark 2.4, Metadata files (e.g. Parquet summary files) and temporary 
files are not counted as data files when calculating table size during 
Statistics computation.
 
 ## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above
 

http://git-wip-us.apache.org/repos/asf/spark/blob/d3653974/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala 

svn commit: r28636 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_08_02-1a7e747-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Thu Aug  9 15:16:22 2018
New Revision: 28636

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_09_08_02-1a7e747 docs


[This commit notification would consist of 1476 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotNull

2018-08-09 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 1a7e747ce -> 2949a835f


[SPARK-25063][SQL] Rename class KnowNotNull to KnownNotNull

## What changes were proposed in this pull request?

Correct the class name typo (`KnowNotNull` to `KnownNotNull`) that was checked in through SPARK-24891.

## How was this patch tested?

Passed all existing tests.
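
For context, the renamed class is used by the Analyzer rule shown in the diff below, which wraps primitive-typed UDF inputs so the null-check rule is not applied twice. A purely illustrative sketch of that wrapping, using simplified stand-in types rather than Spark's expression classes:

```scala
// Illustration only: simplified stand-ins, not Spark's Expression hierarchy.
sealed trait Expr { def nullable: Boolean }
final case class Input(name: String, nullable: Boolean) extends Expr
final case class KnownNotNull(child: Expr) extends Expr { def nullable: Boolean = false }

// Wrap inputs that still need a null check; inputs already wrapped in KnownNotNull
// are skipped, which is what keeps the rule from firing repeatedly.
def wrapInputs(inputs: Seq[Expr], isPrimitive: Seq[Boolean]): Seq[Expr] =
  inputs.zip(isPrimitive).map {
    case (e, true) if !e.isInstanceOf[KnownNotNull] => KnownNotNull(e)
    case (e, _) => e
  }
```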

Closes #22049 from maryannxue/known-not-null.

Authored-by: maryannxue 
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2949a835
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2949a835
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2949a835

Branch: refs/heads/master
Commit: 2949a835fae3f4ac6e3dae6f18cd8b6543b74601
Parents: 1a7e747
Author: maryannxue 
Authored: Thu Aug 9 08:11:30 2018 -0700
Committer: Xiao Li 
Committed: Thu Aug 9 08:11:30 2018 -0700

--
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala  | 4 ++--
 .../spark/sql/catalyst/expressions/constraintExpressions.scala | 2 +-
 .../org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index d23d43b..a7cd96e 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -2157,7 +2157,7 @@ class Analyzer(
   // trust the `nullable` information.
   // (cls, expr) => cls.isPrimitive && expr.nullable
   val needsNullCheck = (cls: Class[_], expr: Expression) =>
-cls.isPrimitive && !expr.isInstanceOf[KnowNotNull]
+cls.isPrimitive && !expr.isInstanceOf[KnownNotNull]
   val inputsNullCheck = parameterTypes.zip(inputs)
 .filter { case (cls, expr) => needsNullCheck(cls, expr) }
 .map { case (_, expr) => IsNull(expr) }
@@ -2167,7 +2167,7 @@ class Analyzer(
   // branch of `If` will be called if any of these checked inputs is 
null. Thus we can
   // prevent this rule from being applied repeatedly.
   val newInputs = parameterTypes.zip(inputs).map{ case (cls, expr) =>
-if (needsNullCheck(cls, expr)) KnowNotNull(expr) else expr }
+if (needsNullCheck(cls, expr)) KnownNotNull(expr) else expr }
   inputsNullCheck
 .map(If(_, Literal.create(null, udf.dataType), udf.copy(children = 
newInputs)))
 .getOrElse(udf)

http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
index 53936aa..2917b0b 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
@@ -21,7 +21,7 @@ import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, 
ExprCode, FalseLiteral}
 import org.apache.spark.sql.types.DataType
 
-case class KnowNotNull(child: Expression) extends UnaryExpression {
+case class KnownNotNull(child: Expression) extends UnaryExpression {
   override def nullable: Boolean = false
   override def dataType: DataType = child.dataType
 

http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
index ba44484..a1c976d 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
@@ -319,7 +319,7 @@ class AnalysisSuite extends AnalysisTest with Matchers {
 // only primitive parameter needs special null handling
 val udf2 = ScalaUDF((s: String, d: Double) => "x", StringType, string :: 
double :: Nil)
  

spark git commit: [SPARK-25047][ML] Can't assign SerializedLambda to scala.Function1 in deserialization of BucketedRandomProjectionLSHModel

2018-08-09 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master b2950cef3 -> 1a7e747ce


[SPARK-25047][ML] Can't assign SerializedLambda to scala.Function1 in 
deserialization of BucketedRandomProjectionLSHModel

## What changes were proposed in this pull request?

Convert two function-valued fields in ML classes into plain methods to avoid an odd 
SerializedLambda deserialization problem.
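
A minimal sketch of the pattern, using a hypothetical toy class rather than the actual LSH or regression models touched by the patch: a `Function1` stored in a `val` is serialized as a lambda and can fail to come back as `scala.Function1`, whereas a plain method carries no lambda of its own and can still be eta-expanded into a function where one is needed:

```scala
// Sketch only (hypothetical toy classes, not the ML models changed in this commit).
class BeforeModel extends Serializable {
  // Function-valued field: the lambda itself becomes part of the serialized state.
  val hash: Double => Double = x => math.floor(x / 4.0)
}

class AfterModel extends Serializable {
  // Plain method: nothing lambda-shaped is stored on the instance.
  def hash(x: Double): Double = math.floor(x / 4.0)

  // Eta-expand to a Function1 only at the point of use.
  def hashFn: Double => Double = hash(_: Double)
}
```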

## How was this patch tested?

Existing tests.

Closes #22032 from srowen/SPARK-25047.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1a7e747c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1a7e747c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1a7e747c

Branch: refs/heads/master
Commit: 1a7e747ce4f8c5253c5923045d23c62e43a6566b
Parents: b2950ce
Author: Sean Owen 
Authored: Thu Aug 9 08:07:46 2018 -0500
Committer: Sean Owen 
Committed: Thu Aug 9 08:07:46 2018 -0500

--
 .../feature/BucketedRandomProjectionLSH.scala   | 14 ++
 .../scala/org/apache/spark/ml/feature/LSH.scala |  4 ++--
 .../apache/spark/ml/feature/MinHashLSH.scala| 20 +---
 .../GeneralizedLinearRegression.scala   | 15 +++
 4 files changed, 24 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala
 
b/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala
index a906e95..0554455 100644
--- 
a/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala
@@ -82,14 +82,12 @@ class BucketedRandomProjectionLSHModel private[ml](
   override def setOutputCol(value: String): this.type = super.set(outputCol, 
value)
 
   @Since("2.1.0")
-  override protected[ml] val hashFunction: Vector => Array[Vector] = {
-key: Vector => {
-  val hashValues: Array[Double] = randUnitVectors.map({
-randUnitVector => Math.floor(BLAS.dot(key, randUnitVector) / 
$(bucketLength))
-  })
-  // TODO: Output vectors of dimension numHashFunctions in SPARK-18450
-  hashValues.map(Vectors.dense(_))
-}
+  override protected[ml] def hashFunction(elems: Vector): Array[Vector] = {
+val hashValues = randUnitVectors.map(
+  randUnitVector => Math.floor(BLAS.dot(elems, randUnitVector) / 
$(bucketLength))
+)
+// TODO: Output vectors of dimension numHashFunctions in SPARK-18450
+hashValues.map(Vectors.dense(_))
   }
 
   @Since("2.1.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala
index a70931f..b208523 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala
@@ -75,7 +75,7 @@ private[ml] abstract class LSHModel[T <: LSHModel[T]]
* The hash function of LSH, mapping an input feature vector to multiple 
hash vectors.
* @return The mapping of LSH function.
*/
-  protected[ml] val hashFunction: Vector => Array[Vector]
+  protected[ml] def hashFunction(elems: Vector): Array[Vector]
 
   /**
* Calculate the distance between two different keys using the distance 
metric corresponding
@@ -97,7 +97,7 @@ private[ml] abstract class LSHModel[T <: LSHModel[T]]
 
   override def transform(dataset: Dataset[_]): DataFrame = {
 transformSchema(dataset.schema, logging = true)
-val transformUDF = udf(hashFunction, DataTypes.createArrayType(new 
VectorUDT))
+val transformUDF = udf(hashFunction(_: Vector), 
DataTypes.createArrayType(new VectorUDT))
 dataset.withColumn($(outputCol), transformUDF(dataset($(inputCol
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala
index a043033..21cde66 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala
@@ -60,18 +60,16 @@ class MinHashLSHModel private[ml](
   override def setOutputCol(value: String): this.type = 

spark git commit: Revert "[SPARK-24648][SQL] SqlMetrics should be threadsafe"

2018-08-09 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 386fbd3af -> b2950cef3


Revert "[SPARK-24648][SQL] SqlMetrics should be threadsafe"

This reverts commit 5264164a67df498b73facae207eda12ee133be7d.
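
For reference, the change being reverted had replaced the metric's plain `var` with a `java.util.concurrent.atomic.LongAdder`, as visible in the diff below. A minimal sketch of the two shapes, assuming concurrent `add` calls, and not Spark's actual `SQLMetric`:

```scala
import java.util.concurrent.atomic.LongAdder

// Sketch only: a var-backed counter can lose updates when several tasks call add()
// concurrently, while a LongAdder-backed counter tolerates concurrent adds.
class VarBackedCounter {
  private var total = 0L
  def add(v: Long): Unit = total += v   // read-modify-write, not atomic
  def value: Long = total
}

class AdderBackedCounter {
  private val total = new LongAdder
  def add(v: Long): Unit = total.add(v) // thread-safe
  def value: Long = total.sum()
}
```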


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b2950cef
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b2950cef
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b2950cef

Branch: refs/heads/master
Commit: b2950cef3c898f59a2c92e8800ff134c44263b9a
Parents: 386fbd3
Author: Wenchen Fan 
Authored: Thu Aug 9 20:33:59 2018 +0800
Committer: Wenchen Fan 
Committed: Thu Aug 9 20:33:59 2018 +0800

--
 .../spark/sql/execution/metric/SQLMetrics.scala | 33 +++---
 .../sql/execution/metric/SQLMetricsSuite.scala  | 36 +---
 2 files changed, 14 insertions(+), 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b2950cef/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
index 98f58a3..cbf707f 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
@@ -19,7 +19,6 @@ package org.apache.spark.sql.execution.metric
 
 import java.text.NumberFormat
 import java.util.Locale
-import java.util.concurrent.atomic.LongAdder
 
 import org.apache.spark.SparkContext
 import org.apache.spark.scheduler.AccumulableInfo
@@ -33,45 +32,40 @@ import org.apache.spark.util.{AccumulatorContext, 
AccumulatorV2, Utils}
  * on the driver side must be explicitly posted using 
[[SQLMetrics.postDriverMetricUpdates()]].
  */
 class SQLMetric(val metricType: String, initValue: Long = 0L) extends 
AccumulatorV2[Long, Long] {
-
   // This is a workaround for SPARK-11013.
   // We may use -1 as initial value of the accumulator, if the accumulator is 
valid, we will
   // update it at the end of task and the value will be at least 0. Then we 
can filter out the -1
   // values before calculate max, min, etc.
-  private[this] val _value = new LongAdder
-  private val _zeroValue = initValue
-  _value.add(initValue)
+  private[this] var _value = initValue
+  private var _zeroValue = initValue
 
   override def copy(): SQLMetric = {
-val newAcc = new SQLMetric(metricType, initValue)
-newAcc.add(_value.sum())
+val newAcc = new SQLMetric(metricType, _value)
+newAcc._zeroValue = initValue
 newAcc
   }
 
-  override def reset(): Unit = this.set(_zeroValue)
+  override def reset(): Unit = _value = _zeroValue
 
   override def merge(other: AccumulatorV2[Long, Long]): Unit = other match {
-case o: SQLMetric => _value.add(o.value)
+case o: SQLMetric => _value += o.value
 case _ => throw new UnsupportedOperationException(
   s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
   }
 
-  override def isZero(): Boolean = _value.sum() == _zeroValue
+  override def isZero(): Boolean = _value == _zeroValue
 
-  override def add(v: Long): Unit = _value.add(v)
+  override def add(v: Long): Unit = _value += v
 
   // We can set a double value to `SQLMetric` which stores only long value, if 
it is
   // average metrics.
   def set(v: Double): Unit = SQLMetrics.setDoubleForAverageMetrics(this, v)
 
-  def set(v: Long): Unit = {
-_value.reset()
-_value.add(v)
-  }
+  def set(v: Long): Unit = _value = v
 
-  def +=(v: Long): Unit = _value.add(v)
+  def +=(v: Long): Unit = _value += v
 
-  override def value: Long = _value.sum()
+  override def value: Long = _value
 
   // Provide special identifier as metadata so we can tell that this is a 
`SQLMetric` later
   override def toInfo(update: Option[Any], value: Option[Any]): 
AccumulableInfo = {
@@ -159,7 +153,7 @@ object SQLMetrics {
   Seq.fill(3)(0L)
 } else {
   val sorted = validValues.sorted
-  Seq(sorted.head, sorted(validValues.length / 2), 
sorted(validValues.length - 1))
+  Seq(sorted(0), sorted(validValues.length / 2), 
sorted(validValues.length - 1))
 }
 metric.map(v => numberFormat.format(v.toDouble / baseForAvgMetric))
   }
@@ -179,8 +173,7 @@ object SQLMetrics {
   Seq.fill(4)(0L)
 } else {
   val sorted = validValues.sorted
-  Seq(sorted.sum, sorted.head, sorted(validValues.length / 2),
-sorted(validValues.length - 1))
+  Seq(sorted.sum, sorted(0), sorted(validValues.length / 2), 
sorted(validValues.length - 1))
 }
 metric.map(strFormat)
   }


spark git commit: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSparkSubmitSuite correct and stable

2018-08-09 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 56e9e9707 -> 386fbd3af


[SPARK-23415][SQL][TEST] Make behavior of BufferHolderSparkSubmitSuite correct 
and stable

## What changes were proposed in this pull request?

This PR addresses two issues in `BufferHolderSparkSubmitSuite`.

1. While `BufferHolderSparkSubmitSuite` tried to allocate a large object several times, 
it actually allocated the object once and kept reusing it.
2. `BufferHolderSparkSubmitSuite` could fail due to a timeout.

Allocating a small object before each large allocation solved issue 1 by preventing reuse. 
Increasing the heap size from 4g to 7g solved issue 2, and it also avoids OOM 
once issue 1 is fixed.
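
Separately from the test changes, the accompanying `BufferHolder` hunk below rounds buffer sizes up to a word boundary via `ByteArrayMethods.roundNumberOfBytesToNearestWord`. A minimal sketch of that rounding, assuming an 8-byte word (illustration only, not the ByteArrayMethods code):

```scala
// Round a byte count up to the next multiple of 8 so word-aligned reads stay valid.
def roundUpToWord(numBytes: Int): Int = {
  val remainder = numBytes & 0x07        // numBytes % 8
  if (remainder == 0) numBytes else numBytes + (8 - remainder)
}

assert(roundUpToWord(13) == 16)
assert(roundUpToWord(16) == 16)
```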

## How was this patch tested?

Updated existing `BufferHolderSparkSubmitSuite`

Closes #20636 from kiszk/SPARK-23415.

Authored-by: Kazuaki Ishizaki 
Signed-off-by: Wenchen Fan 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/386fbd3a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/386fbd3a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/386fbd3a

Branch: refs/heads/master
Commit: 386fbd3aff95ce919567b1b94d5b19c5bcef266a
Parents: 56e9e97
Author: Kazuaki Ishizaki 
Authored: Thu Aug 9 20:28:14 2018 +0800
Committer: Wenchen Fan 
Committed: Thu Aug 9 20:28:14 2018 +0800

--
 .../expressions/codegen/BufferHolder.java   | 13 +--
 .../codegen/BufferHolderSparkSubmitSuite.scala  | 36 
 .../expressions/codegen/BufferHolderSuite.scala | 10 +++---
 3 files changed, 36 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/386fbd3a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
--
diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
index 537ef24..6a52a5b 100644
--- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
@@ -35,6 +35,7 @@ final class BufferHolder {
 
   private static final int ARRAY_MAX = 
ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH;
 
+  // buffer is guarantee to be word-aligned since UnsafeRow assumes each field 
is word-aligned.
   private byte[] buffer;
   private int cursor = Platform.BYTE_ARRAY_OFFSET;
   private final UnsafeRow row;
@@ -52,7 +53,8 @@ final class BufferHolder {
   "too many fields (number of fields: " + row.numFields() + ")");
 }
 this.fixedSize = bitsetWidthInBytes + 8 * row.numFields();
-this.buffer = new byte[fixedSize + initialSize];
+int roundedSize = 
ByteArrayMethods.roundNumberOfBytesToNearestWord(fixedSize + initialSize);
+this.buffer = new byte[roundedSize];
 this.row = row;
 this.row.pointTo(buffer, buffer.length);
   }
@@ -61,8 +63,12 @@ final class BufferHolder {
* Grows the buffer by at least neededSize and points the row to the buffer.
*/
   void grow(int neededSize) {
+if (neededSize < 0) {
+  throw new IllegalArgumentException(
+"Cannot grow BufferHolder by size " + neededSize + " because the size 
is negative");
+}
 if (neededSize > ARRAY_MAX - totalSize()) {
-  throw new UnsupportedOperationException(
+  throw new IllegalArgumentException(
 "Cannot grow BufferHolder by size " + neededSize + " because the size 
after growing " +
   "exceeds size limitation " + ARRAY_MAX);
 }
@@ -70,7 +76,8 @@ final class BufferHolder {
 if (buffer.length < length) {
   // This will not happen frequently, because the buffer is re-used.
   int newLength = length < ARRAY_MAX / 2 ? length * 2 : ARRAY_MAX;
-  final byte[] tmp = new byte[newLength];
+  int roundedSize = 
ByteArrayMethods.roundNumberOfBytesToNearestWord(newLength);
+  final byte[] tmp = new byte[roundedSize];
   Platform.copyMemory(
 buffer,
 Platform.BYTE_ARRAY_OFFSET,

http://git-wip-us.apache.org/repos/asf/spark/blob/386fbd3a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala
index 85682cf..d2862c8 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala
+++ 

spark git commit: [MINOR][DOC] Fix typo

2018-08-09 Thread gurwls223
Repository: spark
Updated Branches:
  refs/heads/master 519e03d82 -> 56e9e9707


[MINOR][DOC] Fix typo

## What changes were proposed in this pull request?

This PR fixes typos of the form `auxiliary verb + verb[s]`. It is a follow-up to 
#21956.

## How was this patch tested?

N/A

Closes #22040 from kiszk/spellcheck1.

Authored-by: Kazuaki Ishizaki 
Signed-off-by: hyukjinkwon 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/56e9e970
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/56e9e970
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/56e9e970

Branch: refs/heads/master
Commit: 56e9e97073cf1896e301371b3941c9307e42ff77
Parents: 519e03d
Author: Kazuaki Ishizaki 
Authored: Thu Aug 9 20:10:17 2018 +0800
Committer: hyukjinkwon 
Committed: Thu Aug 9 20:10:17 2018 +0800

--
 .../main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java   | 2 +-
 .../util/collection/unsafe/sort/UnsafeSorterSpillMerger.java | 2 +-
 .../src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala | 2 +-
 .../test/java/test/org/apache/spark/JavaSparkContextSuite.java   | 2 +-
 .../scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala  | 2 +-
 .../org/apache/spark/ml/classification/LogisticRegression.scala  | 2 +-
 python/pyspark/sql/types.py  | 2 +-
 .../apache/spark/sql/catalyst/analysis/DecimalPrecision.scala| 2 +-
 .../expressions/CodeGeneratorWithInterpretedFallback.scala   | 2 +-
 .../spark/sql/catalyst/expressions/ExpectsInputTypes.scala   | 2 +-
 .../spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala | 4 ++--
 .../spark/sql/catalyst/encoders/EncoderResolutionSuite.scala | 2 +-
 .../scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala | 2 +-
 .../apache/spark/sql/execution/streaming/FileStreamSource.scala  | 2 +-
 .../apache/spark/sql/execution/streaming/ProgressReporter.scala  | 2 +-
 .../src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala  | 2 +-
 .../sql/hive/execution/CreateHiveTableAsSelectCommand.scala  | 2 +-
 .../org/apache/spark/sql/hive/execution/HiveQuerySuite.scala | 2 +-
 18 files changed, 19 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
--
diff --git 
a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java 
b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
index 9a767dd..9b6cbab 100644
--- a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
+++ b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
@@ -662,7 +662,7 @@ public final class BytesToBytesMap extends MemoryConsumer {
  * It is only valid to call this method immediately after calling 
`lookup()` using the same key.
  * 
  * 
- * The key and value must be word-aligned (that is, their sizes must 
multiples of 8).
+ * The key and value must be word-aligned (that is, their sizes must be a 
multiple of 8).
  * 
  * 
  * After calling this method, calls to `get[Key|Value]Address()` and 
`get[Key|Value]Length`

http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java
--
diff --git 
a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java
 
b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java
index ff0dcc2..ab80028 100644
--- 
a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java
+++ 
b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java
@@ -51,7 +51,7 @@ final class UnsafeSorterSpillMerger {
 if (spillReader.hasNext()) {
   // We only add the spillReader to the priorityQueue if it is not empty. 
We do this to
   // make sure the hasNext method of UnsafeSorterIterator returned by 
getSortedIterator
-  // does not return wrong result because hasNext will returns true
+  // does not return wrong result because hasNext will return true
   // at least priorityQueue.size() times. If we allow n spillReaders in the
   // priorityQueue, we will have n extra empty records in the result of 
UnsafeSorterIterator.
   spillReader.loadNext();

http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala 

svn commit: r28629 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_00_02-519e03d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-09 Thread pwendell
Author: pwendell
Date: Thu Aug  9 07:16:59 2018
New Revision: 28629

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_09_00_02-519e03d docs


[This commit notification would consist of 1476 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org