svn commit: r28649 - in /dev/spark/v2.3.2-rc4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark
Author: jshao Date: Fri Aug 10 05:50:52 2018 New Revision: 28649 Log: Apache Spark v2.3.2-rc4 docs [This commit notification would consist of 1446 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r28648 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_22_01-e66f3f9-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Aug 10 05:15:36 2018 New Revision: 28648 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_09_22_01-e66f3f9 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r28647 - /dev/spark/v2.3.2-rc4-bin/
Author: jshao Date: Fri Aug 10 04:58:55 2018 New Revision: 28647 Log: Apache Spark v2.3.2-rc4 Added: dev/spark/v2.3.2-rc4-bin/ dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz (with props) dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz (with props) dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz (with props) dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz.asc dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.6.tgz.sha512 dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz (with props) dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz.asc dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-hadoop2.7.tgz.sha512 dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz (with props) dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz.asc dev/spark/v2.3.2-rc4-bin/spark-2.3.2-bin-without-hadoop.tgz.sha512 dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz (with props) dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz.asc dev/spark/v2.3.2-rc4-bin/spark-2.3.2.tgz.sha512 Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz == Binary file - no diff available. 
Propchange: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc == --- dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc (added) +++ dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.asc Fri Aug 10 04:58:55 2018 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIcBAABCgAGBQJbbRLtAAoJENsLIaASlz/QWPAP/RcLNtpDzKSx4/Egl7h+VCNp +u1j1pKBIZF/I2lNNWPj87JJCoV9JDUcCU8ktzFTVM7sl5EQ+YzgmvnhkVu3QmZPH +r+kI5wSQIb5OEUytqLo+aEImaW1T3rvQA3SGXFaVXhAOJlCO71HbBJyrGRdjuJ18 ++PNi6/riIuCLX2Sd2UHMF+MLpGZGoRbKemg8+/3+CYw7aq+1WNaZJDY2x5yED0Ey +kFzSc/eV9TlkJSRKX9r/zTrEIbZ4/QLZbplf4lZt+XvAA+0O49VkRKND0IOYUNPQ +ZIeHOrrqbDefH0Kzx/eQJLtrLBDoKI+olZNVNIL0zNcj47QZNUYUFvXRiFElaDCk +ks/WXsV4etQbhtxBbFzLRXky3OrSjZKD+X1jSO5ADpch8ePFoemCjiftWqF+D5oy +h3Ex9O+DNCTGwsVj7DmIaqsDGC6PRRps8zyx5WPVJ+vUY5m9osgsMC/QxbRN9MI5 +rSzo4YqU5FYoAmpdYD1vPX/y7k4oARNi4tcw57ZQi7awJsi/jxFMSinwrAN81WpC +8mosXpYtRli1uDob1TjY3D0D/gFRdYy8lduUm7tD7IGiYnpT9tmK/md77W6VM4L3 +6cfIqoEBuAfpi/xSdc6arDllro2VFG3mY6j/G5qta5bVfzyc9xqLOZns+mDlN88h +mYRBkZzOHBoN2eCHPIWE +=PaR1 +-END PGP SIGNATURE- Added: dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 == --- dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 (added) +++ dev/spark/v2.3.2-rc4-bin/SparkR_2.3.2.tar.gz.sha512 Fri Aug 10 04:58:55 2018 @@ -0,0 +1,3 @@ +SparkR_2.3.2.tar.gz: 7EB37D66 8E5826F2 CC1B25AC E51E7C71 D477A379 63676728 + 2777AA32 E6DAF5C6 690BFF9E CC770A22 0B4DF04E D5D87832 + FDCE7EBF 76561358 6962F46C 83084A5E Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz == Binary file - no diff available. 
Propchange: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc == --- dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc (added) +++ dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.asc Fri Aug 10 04:58:55 2018 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIcBAABCgAGBQJbbRcNAAoJENsLIaASlz/QMfgQAL5TawmaMQnBJN4NBEoj5DTB +Tk/iGHPcr6nvzxFVffoSappW4Lfw6kcvXimU7CaYk3qAG+ssdOXP6RtcPz02aybM +hjCUzbQIJfZUlmeuMAs/Eh6m40bUHMQMTRmY4Bq96MPUEv053Og2c/W08VBbnZjL +D5fK6MT0xVKzq9aQ5vA1TrR+nDqR+bPkabWiWUCGKCjhKil2ltkKWdw4gflvFzaR +Un7ItbwlxKb7pQSiLdBkO/aj4XhKVEwJVl2K929OS066fwoPSEslSjqo/K7TtapQ +uL2i1Sb9P312HcMhDc8ja0y2YlYgIMCxjc5ZyMczHzUaIFbMlwitrfUlDitywhBL +PIPQpWzvsHkbHLsLjGeV8e10RRgh2PjaDPFFKrJsRSlpEy9pVyuRcGEzIrV/ZAfv +t6nBCKp96SZwqpCl6cfjUNDgDgVLO9J8My48I45Vhutp69XZvJDDV3OsAPmNERqA +AuNOWVf1wJEUNPejeMK+HiPbITNSey7DS1fMN77kz8dapZbL0p9NYNhBur0zlXip +tChlQKuM7TxdtoL1OCCrdNnzqABz6Z1ccR5vOlgj7cIPCA9z1KCyUuOUyIwPtEc4 +FGiTwoEC6rz8BQrk0gezPf8EI/kBgBy+mdGRlNuZvWaTJio6Jj8puMe+E/KNEjcR +HenOiJR4yCzvr1AcdAzY +=6eHj +-END PGP SIGNATURE- Added: dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 == --- dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 (added) +++ dev/spark/v2.3.2-rc4-bin/pyspark-2.3.2.tar.gz.sha512 Fri Aug 10 04:58:55 2018 @@ -0,0 +1,3 @@
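The `.sha512` files added alongside these artifacts use the grouped-uppercase-hex layout shown above. A small Python sketch of how such a digest can be reproduced for verification (the grouping convention is an assumption read off the format above; `sha512_grouped` is an illustrative name, not part of any Spark release tooling):

```python
import hashlib


def sha512_grouped(path: str) -> str:
    """Compute a SHA-512 digest and format it like the .sha512 files above:
    uppercase hex split into 8-character groups (line wrapping aside)."""
    with open(path, "rb") as f:
        digest = hashlib.sha512(f.read()).hexdigest().upper()
    return " ".join(digest[i:i + 8] for i in range(0, len(digest), 8))
```

Comparing this output, whitespace aside, against the published `.sha512` file checks the integrity of a downloaded artifact; the `.asc` files additionally carry a PGP signature for verifying authenticity.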
spark git commit: [SPARK-24855][SQL][EXTERNAL] Built-in AVRO support should support specified schema on write
Repository: spark Updated Branches: refs/heads/master bdd27961c -> 0cea9e3cd [SPARK-24855][SQL][EXTERNAL] Built-in AVRO support should support specified schema on write ## What changes were proposed in this pull request? Allows the `avroSchema` option to be specified on write, letting a user supply a schema in cases where this is required. A trivial use case is reading in an Avro dataset, making some small adjustment to one or more columns, and writing out using the same schema. Implicit schema creation from a SQL struct results in a schema that, while functionally similar for the most part, is not necessarily compatible. Also allows the `fixed` field type to be used for records with a specified `avroSchema` ## How was this patch tested? Unit tests in AvroSuite are extended to test this with enum and fixed types. Please review http://spark.apache.org/contributing.html before opening a pull request. Closes #21847 from lindblombr/specify_schema_on_write. Lead-authored-by: Brian Lindblom Co-authored-by: DB Tsai Signed-off-by: DB Tsai Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0cea9e3c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0cea9e3c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0cea9e3c Branch: refs/heads/master Commit: 0cea9e3cd0a92799bdcc0f9bc2cf96259c343a30 Parents: bdd2796 Author: Brian Lindblom Authored: Fri Aug 10 03:35:29 2018 + Committer: DB Tsai Committed: Fri Aug 10 03:35:29 2018 + -- .../apache/spark/sql/avro/AvroFileFormat.scala | 6 +- .../apache/spark/sql/avro/AvroSerializer.scala | 40 +++- .../org/apache/spark/sql/avro/AvroSuite.scala | 228 ++- 3 files changed, 257 insertions(+), 17 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/0cea9e3c/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala -- diff --git a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala 
b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala index 6ffcf37..6df23c9 100755 --- a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala +++ b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala @@ -113,8 +113,10 @@ private[avro] class AvroFileFormat extends FileFormat options: Map[String, String], dataSchema: StructType): OutputWriterFactory = { val parsedOptions = new AvroOptions(options, spark.sessionState.newHadoopConf()) -val outputAvroSchema = SchemaConverters.toAvroType(dataSchema, nullable = false, - parsedOptions.recordName, parsedOptions.recordNamespace, parsedOptions.outputTimestampType) +val outputAvroSchema: Schema = parsedOptions.schema + .map(new Schema.Parser().parse) + .getOrElse(SchemaConverters.toAvroType(dataSchema, nullable = false, +parsedOptions.recordName, parsedOptions.recordNamespace)) AvroJob.setOutputKeySchema(job, outputAvroSchema) http://git-wip-us.apache.org/repos/asf/spark/blob/0cea9e3c/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala -- diff --git a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala index 9885826..216c52a 100644 --- a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala +++ b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala @@ -23,8 +23,8 @@ import scala.collection.JavaConverters._ import org.apache.avro.LogicalTypes.{TimestampMicros, TimestampMillis} import org.apache.avro.Schema -import org.apache.avro.Schema.Type.NULL -import org.apache.avro.generic.GenericData.Record +import org.apache.avro.Schema.Type +import org.apache.avro.generic.GenericData.{EnumSymbol, Fixed, Record} import org.apache.avro.util.Utf8 import org.apache.spark.sql.catalyst.InternalRow @@ -87,10 +87,36 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: 
(getter, ordinal) => getter.getDouble(ordinal) case d: DecimalType => (getter, ordinal) => getter.getDecimal(ordinal, d.precision, d.scale).toString - case StringType => -(getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes) - case BinaryType => -(getter, ordinal) => ByteBuffer.wrap(getter.getBinary(ordinal)) + case StringType => avroType.getType match { +case Type.ENUM => + import scala.collection.JavaConverters._ + val enumSymbols: Set[String] = avroType.getEnumSymbols.asScala.toSet + (getter, ordinal) => +
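Two behaviors from this patch are easy to state in miniature: a user-supplied `avroSchema` takes precedence over a schema converted from the Catalyst struct, and string values written to an Avro `enum` are validated against the declared symbols. An illustrative Python sketch of the second (the function name and schema-dict shapes here are hypothetical, not Spark's or Avro's API):

```python
def make_string_writer(avro_type):
    """Sketch of the serializer dispatch in the diff above: ENUM targets
    validate each value against the declared symbols; any other target
    accepts the string as-is."""
    if avro_type.get("type") == "enum":
        symbols = set(avro_type["symbols"])

        def write_enum(value):
            if value not in symbols:
                raise ValueError(
                    f"Cannot write {value!r} as enum: expected one of {sorted(symbols)}")
            return value

        return write_enum
    return lambda value: value


suit = {"type": "enum", "name": "Suit",
        "symbols": ["SPADES", "HEARTS", "CLUBS", "DIAMONDS"]}
writer = make_string_writer(suit)
```

In the real serializer this dispatch happens once per field while building the writer, so the symbol set is not rebuilt per row — the same reason the sketch returns a closure.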
svn commit: r28646 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_20_02-6c7bb57-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Aug 10 03:15:55 2018 New Revision: 28646 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_09_20_02-6c7bb57 docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-24251][SQL] Add analysis tests for AppendData.
Repository: spark Updated Branches: refs/heads/master 6c7bb575b -> bdd27961c [SPARK-24251][SQL] Add analysis tests for AppendData. ## What changes were proposed in this pull request? This is a follow-up to #21305 that adds a test suite for AppendData analysis. This also fixes the following problems uncovered by these tests: * Incorrect order of data types passed to `canWrite` is fixed * The field check calls `canWrite` first to ensure all errors are found * `AppendData#resolved` must check resolution of the query's attributes * Column names are quoted to show empty names ## How was this patch tested? This PR adds a test suite for AppendData analysis. Closes #22043 from rdblue/SPARK-24251-add-append-data-analysis-tests. Authored-by: Ryan Blue Signed-off-by: Wenchen Fan Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bdd27961 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bdd27961 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bdd27961 Branch: refs/heads/master Commit: bdd27961c870a3c443686cdbb6dd0eee3ad32012 Parents: 6c7bb57 Author: Ryan Blue Authored: Fri Aug 10 11:10:23 2018 +0800 Committer: Wenchen Fan Committed: Fri Aug 10 11:10:23 2018 +0800 -- .../spark/sql/catalyst/analysis/Analyzer.scala | 16 +- .../plans/logical/basicLogicalOperators.scala | 15 +- .../analysis/DataSourceV2AnalysisSuite.scala| 379 +++ 3 files changed, 397 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/bdd27961/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index a7cd96e..d00b82d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -2258,8 +2258,8 @@ class Analyzer( if (expected.size < query.output.size) { throw new AnalysisException( s"""Cannot write to '$tableName', too many data columns: - |Table columns: ${expected.map(_.name).mkString(", ")} - |Data columns: ${query.output.map(_.name).mkString(", ")}""".stripMargin) + |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")} + |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}""".stripMargin) } val errors = new mutable.ArrayBuffer[String]() @@ -2278,8 +2278,9 @@ class Analyzer( if (expected.size > query.output.size) { throw new AnalysisException( s"""Cannot write to '$tableName', not enough data columns: - |Table columns: ${expected.map(_.name).mkString(", ")} - |Data columns: ${query.output.map(_.name).mkString(", ")}""".stripMargin) + |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")} + |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}""" +.stripMargin) } query.output.zip(expected).flatMap { @@ -2301,12 +2302,15 @@ class Analyzer( queryExpr: NamedExpression, addError: String => Unit): Option[NamedExpression] = { + // run the type check first to ensure type errors are present + val canWrite = DataType.canWrite( +queryExpr.dataType, tableAttr.dataType, resolver, tableAttr.name, addError) + if (queryExpr.nullable && !tableAttr.nullable) { addError(s"Cannot write nullable values to non-null column '${tableAttr.name}'") None - } else if (!DataType.canWrite( - tableAttr.dataType, queryExpr.dataType, resolver, tableAttr.name, addError)) { + } else if (!canWrite) { None } else { http://git-wip-us.apache.org/repos/asf/spark/blob/bdd27961/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala index 0d31c6f..a6631a8 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala @@ -363,13 +363,14 @@ case class AppendData( override def output: Seq[Attribute] = Seq.empty override lazy val
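The ordering fix in the diff above — run `DataType.canWrite` unconditionally before the nullability check — means a column with both a type mismatch and a nullability mismatch reports both errors in one pass. A simplified Python sketch of that error-accumulation pattern (the names and the equality-based type check are illustrative, not the real `canWrite` logic):

```python
def resolve_output_column(table_attr, query_attr, errors):
    # Run the type check first, unconditionally, so its error is recorded
    # even when the nullability check below also fails.
    can_write = query_attr["type"] == table_attr["type"]
    if not can_write:
        errors.append(
            f"Cannot safely cast '{table_attr['name']}': "
            f"{query_attr['type']} to {table_attr['type']}")
    if query_attr["nullable"] and not table_attr["nullable"]:
        errors.append(
            f"Cannot write nullable values to non-null column '{table_attr['name']}'")
        return None
    return query_attr if can_write else None


errors = []
resolved = resolve_output_column(
    {"name": "id", "type": "long", "nullable": False},   # table column
    {"name": "id", "type": "string", "nullable": True},  # query column
    errors)
```

Here `resolved` is `None` and `errors` holds two messages — under the pre-fix ordering, the nullability branch would have returned early and the type error would have been silently dropped.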
[1/2] spark git commit: Preparing Spark release v2.3.2-rc4
Repository: spark Updated Branches: refs/heads/branch-2.3 b426ec583 -> e66f3f9b1 Preparing Spark release v2.3.2-rc4 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6930f488 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6930f488 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6930f488 Branch: refs/heads/branch-2.3 Commit: 6930f4885356eaec2c1e85896be3c93a80ce779c Parents: b426ec5 Author: Saisai Shao Authored: Fri Aug 10 02:06:28 2018 + Committer: Saisai Shao Committed: Fri Aug 10 02:06:28 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 41 files changed, 42 insertions(+), 42 deletions(-) -- 
http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 6ec4966..8df2635 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.3.3 +Version: 2.3.2 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index f8b15cc..57485fc 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index e412a47..53e58c2 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d8f9a3d..d05647c 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/6930f488/common/network-shuffle/pom.xml -- diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index a1a4f87..8d46761 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.3.2-rc4 [created] 6930f4885
[2/2] spark git commit: Preparing development version 2.3.3-SNAPSHOT
Preparing development version 2.3.3-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e66f3f9b Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e66f3f9b Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e66f3f9b Branch: refs/heads/branch-2.3 Commit: e66f3f9b11f37261ec6cbcb6bb2ebeb34e56a968 Parents: 6930f48 Author: Saisai Shao Authored: Fri Aug 10 02:06:37 2018 + Committer: Saisai Shao Committed: Fri Aug 10 02:06:37 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 41 files changed, 42 insertions(+), 42 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 
8df2635..6ec4966 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.3.2 +Version: 2.3.3 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 57485fc..f8b15cc 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 53e58c2..e412a47 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d05647c..d8f9a3d 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/e66f3f9b/common/network-shuffle/pom.xml -- diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 8d46761..a1a4f87 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -
svn commit: r28644 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_18_02-b426ec5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Fri Aug 10 01:15:33 2018 New Revision: 28644 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_09_18_02-b426ec5 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-24886][INFRA] Fix the testing script to increase timeout for Jenkins build (from 300m to 340m)
Repository: spark Updated Branches: refs/heads/master 9b8521e53 -> 6c7bb575b [SPARK-24886][INFRA] Fix the testing script to increase timeout for Jenkins build (from 300m to 340m) ## What changes were proposed in this pull request? Currently, it looks like we hit the time limit from time to time; it seems better to increase it a bit. For instance, please see https://github.com/apache/spark/pull/21822 For clarification, the current Jenkins timeout is 400m. This PR only proposes to fix the test script to increase its timeout correspondingly. *This PR does not change the build configuration.* ## How was this patch tested? Jenkins tests. Closes #21845 from HyukjinKwon/SPARK-24886. Authored-by: hyukjinkwon Signed-off-by: hyukjinkwon Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6c7bb575 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6c7bb575 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6c7bb575 Branch: refs/heads/master Commit: 6c7bb575bf8b0bfc26f23e0ef449aaded77d3789 Parents: 9b8521e Author: hyukjinkwon Authored: Fri Aug 10 09:12:17 2018 +0800 Committer: hyukjinkwon Committed: Fri Aug 10 09:12:17 2018 +0800 -- dev/run-tests-jenkins.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6c7bb575/dev/run-tests-jenkins.py -- diff --git a/dev/run-tests-jenkins.py b/dev/run-tests-jenkins.py index 3960a0d..16af97c 100755 --- a/dev/run-tests-jenkins.py +++ b/dev/run-tests-jenkins.py @@ -181,8 +181,8 @@ def main(): short_commit_hash = ghprb_actual_commit[0:7] # format: http://linux.die.net/man/1/timeout -# must be less than the timeout configured on Jenkins (currently 350m) -tests_timeout = "300m" +# must be less than the timeout configured on Jenkins (currently 400m) +tests_timeout = "340m" # Array to capture all test names to run on the pull request. These tests are represented # by their file equivalents in the dev/tests/ directory. 
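The constraint this patch maintains — the script-level `timeout(1)` value must stay below the CI-level limit, leaving headroom for reporting and cleanup — can be sketched as follows (values from the patch; `run_with_timeout` is a hypothetical helper, not Spark's actual script):

```python
import subprocess

JENKINS_TIMEOUT_MIN = 400  # enforced by the Jenkins job configuration
TESTS_TIMEOUT_MIN = 340    # script-level limit; kept below Jenkins's so the
                           # script can still post a clean failure report


def run_with_timeout(cmd):
    # Kill the test run before Jenkins would, mirroring the script's use of
    # the timeout(1) wrapper around the test command.
    return subprocess.run(cmd, timeout=TESTS_TIMEOUT_MIN * 60)
```

If the inner limit ever exceeds the outer one, Jenkins kills the whole job first and the script never gets to report why the build failed — which is why the two numbers move together.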
svn commit: r28641 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_16_02-9b8521e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Aug 9 23:16:16 2018 New Revision: 28641 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_09_16_02-9b8521e docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13
Repository: spark Updated Branches: refs/heads/branch-2.1 b2e0f68f6 -> 42229430f [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13 - Update DateTimeUtilsSuite so that when testing round-tripping in daysToMillis and millisToDays, multiple skip dates can be specified. - Updated test so that both New Year's Eve 2014 and New Year's Day 2015 are skipped for Kiribati time zones. This is necessary as Java versions before 181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions corrected this to New Year's Eve. Unit tests Author: Chris Martin Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures. (cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/42229430 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/42229430 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/42229430 Branch: refs/heads/branch-2.1 Commit: 42229430f94ba33fb614628b9438e699b4922099 Parents: b2e0f68 Author: Chris Martin Authored: Sat Jul 28 10:40:10 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 17:31:10 2018 -0500 -- .../sql/catalyst/util/DateTimeUtilsSuite.scala | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/42229430/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index e0a9a0c..a62a3d0 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -538,19 +538,19 @@ class DateTimeUtilsSuite extends SparkFunSuite { test("daysToMillis and 
millisToDays") { // There are some days are skipped entirely in some timezone, skip them here. -val skipped_days = Map[String, Int]( - "Kwajalein" -> 8632, - "Pacific/Apia" -> 15338, - "Pacific/Enderbury" -> 9131, - "Pacific/Fakaofo" -> 15338, - "Pacific/Kiritimati" -> 9131, - "Pacific/Kwajalein" -> 8632, - "MIT" -> 15338) +val skipped_days = Map[String, Set[Int]]( + "Kwajalein" -> Set(8632), + "Pacific/Apia" -> Set(15338), + "Pacific/Enderbury" -> Set(9130, 9131), + "Pacific/Fakaofo" -> Set(15338), + "Pacific/Kiritimati" -> Set(9130, 9131), + "Pacific/Kwajalein" -> Set(8632), + "MIT" -> Set(15338)) for (tz <- DateTimeTestUtils.ALL_TIMEZONES) { DateTimeTestUtils.withDefaultTimeZone(tz) { -val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue) +val skipped = skipped_days.getOrElse(tz.getID, Set.empty) (-2 to 2).foreach { d => - if (d != skipped) { + if (!skipped.contains(d)) { assert(millisToDays(daysToMillis(d)) === d, s"Round trip of ${d} did not work in tz ${tz}") }
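The essence of the change above is widening the skip table from one day per zone (`Map[String, Int]`) to a set of days (`Map[String, Set[Int]]`), because JDK builds before and after 181-b13 disagree on which day around the Kiribati date-line transition is the one that never existed. The lookup pattern, sketched in Python with the epoch-day numbers taken from the diff:

```python
# Days (as epoch-day numbers) skipped entirely in some zones. For the
# Kiribati zones, pre-181-b13 JDKs and later builds disagree on which of
# the two candidate days is the missing one, so both are listed.
SKIPPED_DAYS = {
    "Kwajalein": {8632},
    "Pacific/Apia": {15338},
    "Pacific/Enderbury": {9130, 9131},
    "Pacific/Fakaofo": {15338},
    "Pacific/Kiritimati": {9130, 9131},
    "Pacific/Kwajalein": {8632},
    "MIT": {15338},
}


def should_roundtrip(tz_id, day):
    # A days -> millis -> days round-trip assertion only makes sense for
    # days that actually exist in the zone.
    return day not in SKIPPED_DAYS.get(tz_id, set())
```

Listing both candidate days makes the round-trip assertion pass against either version of the JDK's time-zone data, which is exactly why the single-`Int` map was insufficient.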
spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13
Repository: spark Updated Branches: refs/heads/branch-2.3 9bfc55b1b -> b426ec583 [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13 ## What changes were proposed in this pull request? - Update DateTimeUtilsSuite so that when testing round-tripping in daysToMillis and millisToDays, multiple skip dates can be specified. - Updated test so that both New Year's Eve 2014 and New Year's Day 2015 are skipped for Kiribati time zones. This is necessary as Java versions before 181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions corrected this to New Year's Eve. ## How was this patch tested? Unit tests Author: Chris Martin Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures. (cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b426ec58 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b426ec58 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b426ec58 Branch: refs/heads/branch-2.3 Commit: b426ec583fb5176461c5b0c7112d2194af66d93d Parents: 9bfc55b Author: Chris Martin Authored: Sat Jul 28 10:40:10 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 17:24:24 2018 -0500 -- .../sql/catalyst/util/DateTimeUtilsSuite.scala | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b426ec58/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index 625ff38..b025b85 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -650,18 
+650,18 @@ class DateTimeUtilsSuite extends SparkFunSuite { assert(daysToMillis(16800, TimeZoneGMT) === c.getTimeInMillis) // There are some days are skipped entirely in some timezone, skip them here. -val skipped_days = Map[String, Int]( - "Kwajalein" -> 8632, - "Pacific/Apia" -> 15338, - "Pacific/Enderbury" -> 9131, - "Pacific/Fakaofo" -> 15338, - "Pacific/Kiritimati" -> 9131, - "Pacific/Kwajalein" -> 8632, - "MIT" -> 15338) +val skipped_days = Map[String, Set[Int]]( + "Kwajalein" -> Set(8632), + "Pacific/Apia" -> Set(15338), + "Pacific/Enderbury" -> Set(9130, 9131), + "Pacific/Fakaofo" -> Set(15338), + "Pacific/Kiritimati" -> Set(9130, 9131), + "Pacific/Kwajalein" -> Set(8632), + "MIT" -> Set(15338)) for (tz <- DateTimeTestUtils.ALL_TIMEZONES) { - val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue) + val skipped = skipped_days.getOrElse(tz.getID, Set.empty) (-2 to 2).foreach { d => -if (d != skipped) { +if (!skipped.contains(d)) { assert(millisToDays(daysToMillis(d, tz), tz) === d, s"Round trip of ${d} did not work in tz ${tz}") }
spark git commit: [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13
Repository: spark Updated Branches: refs/heads/branch-2.2 53ac8504b -> b283c1f05 [SPARK-24950][SQL] DateTimeUtilsSuite daysToMillis and millisToDays fails w/java 8 181-b13 ## What changes were proposed in this pull request? - Update DateTimeUtilsSuite so that, when testing round-tripping in daysToMillis and millisToDays, multiple skipped dates can be specified. - Updated the test so that both New Year's Eve 2014 and New Year's Day 2015 are skipped for Kiribati time zones. This is necessary because Java versions before 181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions corrected this to New Year's Eve. ## How was this patch tested? Unit tests Author: Chris Martin Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures. (cherry picked from commit c5b8d54c61780af6e9e157e6c855718df972efad) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b283c1f0 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b283c1f0 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b283c1f0 Branch: refs/heads/branch-2.2 Commit: b283c1f055521e4090a9829924e5c63810bb0c89 Parents: 53ac850 Author: Chris Martin Authored: Sat Jul 28 10:40:10 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 17:24:43 2018 -0500 -- .../sql/catalyst/util/DateTimeUtilsSuite.scala | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b283c1f0/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index c8cf16d..deaf2f9 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -580,18 
+580,18 @@ class DateTimeUtilsSuite extends SparkFunSuite { assert(daysToMillis(16800, TimeZoneGMT) === c.getTimeInMillis) // There are some days are skipped entirely in some timezone, skip them here. -val skipped_days = Map[String, Int]( - "Kwajalein" -> 8632, - "Pacific/Apia" -> 15338, - "Pacific/Enderbury" -> 9131, - "Pacific/Fakaofo" -> 15338, - "Pacific/Kiritimati" -> 9131, - "Pacific/Kwajalein" -> 8632, - "MIT" -> 15338) +val skipped_days = Map[String, Set[Int]]( + "Kwajalein" -> Set(8632), + "Pacific/Apia" -> Set(15338), + "Pacific/Enderbury" -> Set(9130, 9131), + "Pacific/Fakaofo" -> Set(15338), + "Pacific/Kiritimati" -> Set(9130, 9131), + "Pacific/Kwajalein" -> Set(8632), + "MIT" -> Set(15338)) for (tz <- DateTimeTestUtils.ALL_TIMEZONES) { - val skipped = skipped_days.getOrElse(tz.getID, Int.MinValue) + val skipped = skipped_days.getOrElse(tz.getID, Set.empty) (-2 to 2).foreach { d => -if (d != skipped) { +if (!skipped.contains(d)) { assert(millisToDays(daysToMillis(d, tz), tz) === d, s"Round trip of ${d} did not work in tz ${tz}") } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-25068][SQL] Add exists function.
Repository: spark Updated Branches: refs/heads/master fec67ed7e -> 9b8521e53 [SPARK-25068][SQL] Add exists function. ## What changes were proposed in this pull request? This pr adds `exists` function which tests whether a predicate holds for one or more elements in the array. ```sql > SELECT exists(array(1, 2, 3), x -> x % 2 == 0); true ``` ## How was this patch tested? Added tests. Closes #22052 from ueshin/issues/SPARK-25068/exists. Authored-by: Takuya UESHIN Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9b8521e5 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9b8521e5 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9b8521e5 Branch: refs/heads/master Commit: 9b8521e53e56a53b44c02366a99f8a8ee1307bbf Parents: fec67ed Author: Takuya UESHIN Authored: Thu Aug 9 14:41:59 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 14:41:59 2018 -0700 -- .../catalyst/analysis/FunctionRegistry.scala| 1 + .../expressions/higherOrderFunctions.scala | 47 ++ .../expressions/HigherOrderFunctionsSuite.scala | 37 .../sql-tests/inputs/higher-order-functions.sql | 6 ++ .../results/higher-order-functions.sql.out | 18 .../spark/sql/DataFrameFunctionsSuite.scala | 96 6 files changed, 205 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala index 390debd..15543c9 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala @@ -444,6 +444,7 @@ object FunctionRegistry { expression[ArrayTransform]("transform"), 
expression[MapFilter]("map_filter"), expression[ArrayFilter]("filter"), +expression[ArrayExists]("exists"), expression[ArrayAggregate]("aggregate"), CreateStruct.registryEntry, http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala index d206733..7f8203a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala @@ -357,6 +357,53 @@ case class ArrayFilter( } /** + * Tests whether a predicate holds for one or more elements in the array. + */ +@ExpressionDescription(usage = + "_FUNC_(expr, pred) - Tests whether a predicate holds for one or more elements in the array.", + examples = """ +Examples: + > SELECT _FUNC_(array(1, 2, 3), x -> x % 2 == 0); + true + """, + since = "2.4.0") +case class ArrayExists( +input: Expression, +function: Expression) + extends ArrayBasedSimpleHigherOrderFunction with CodegenFallback { + + override def nullable: Boolean = input.nullable + + override def dataType: DataType = BooleanType + + override def expectingFunctionType: AbstractDataType = BooleanType + + override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): ArrayExists = { +val elem = HigherOrderFunction.arrayArgumentType(input.dataType) +copy(function = f(function, elem :: Nil)) + } + + @transient lazy val LambdaFunction(_, Seq(elementVar: NamedLambdaVariable), _) = function + + override def nullSafeEval(inputRow: InternalRow, value: Any): Any = { +val arr = value.asInstanceOf[ArrayData] +val f = functionForEval +var exists = false +var i = 0 +while (i < arr.numElements && !exists) { + 
elementVar.value.set(arr.get(i, elementVar.dataType)) + if (f.eval(inputRow).asInstanceOf[Boolean]) { +exists = true + } + i += 1 +} +exists + } + + override def prettyName: String = "exists" +} + +/** * Applies a binary operator to a start value and all elements in the array. */ @ExpressionDescription( http://git-wip-us.apache.org/repos/asf/spark/blob/9b8521e5/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala
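The `ArrayExists.nullSafeEval` loop above short-circuits: it stops scanning as soon as the predicate returns true for some element. A rough Python equivalent of just that loop (the semantics, not Spark's API):

```python
def array_exists(arr, pred):
    # Mirrors the while-loop in ArrayExists.nullSafeEval: iterate until the
    # predicate holds for some element, then stop early.
    exists = False
    i = 0
    while i < len(arr) and not exists:
        if pred(arr[i]):
            exists = True
        i += 1
    return exists

# SQL: SELECT exists(array(1, 2, 3), x -> x % 2 == 0)  -- true
result = array_exists([1, 2, 3], lambda x: x % 2 == 0)
```

Note that an empty array yields false, since the loop body never runs.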
spark git commit: [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession
Repository: spark Updated Branches: refs/heads/branch-2.3 7d465d8f4 -> 9bfc55b1b [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession ## What changes were proposed in this pull request? When a `SparkSession` is stopped, `SQLConf.get` should use the fallback conf to avoid weird issues like ``` sbt.ForkMain$ForkError: java.lang.IllegalStateException: LiveListenerBus is stopped. at org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97) at org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:93) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120) at scala.Option.getOrElse(Option.scala:121) ... ``` ## How was this patch tested? a new test suite Closes #22056 from cloud-fan/session. Authored-by: Wenchen Fan Signed-off-by: Xiao Li (cherry picked from commit fec67ed7e95483c5ea97a7b263ad4bea7d3d42b5) Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9bfc55b1 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9bfc55b1 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9bfc55b1 Branch: refs/heads/branch-2.3 Commit: 9bfc55b1b0aae269320bb978027a800fd1878149 Parents: 7d465d8 Author: Wenchen Fan Authored: Thu Aug 9 14:38:58 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 14:40:09 2018 -0700 -- .../org/apache/spark/sql/SparkSession.scala | 3 +- .../apache/spark/sql/LocalSparkSession.scala| 9 ++ .../spark/sql/internal/SQLConfGetterSuite.scala | 33 3 files changed, 37 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala index b699ccd..adc7143 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala @@ -92,7 +92,8 @@ class SparkSession private( // If there is no active SparkSession, uses the default SQL conf. Otherwise, use the session's. SQLConf.setSQLConfGetter(() => { - SparkSession.getActiveSession.map(_.sessionState.conf).getOrElse(SQLConf.getFallbackConf) + SparkSession.getActiveSession.filterNot(_.sparkContext.isStopped).map(_.sessionState.conf) + .getOrElse(SQLConf.getFallbackConf) }) /** http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala index cbef1c7..6b90f20 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala @@ -36,19 +36,14 @@ trait LocalSparkSession extends BeforeAndAfterEach with BeforeAndAfterAll { self override def afterEach() { try { - resetSparkContext() + LocalSparkSession.stop(spark) SparkSession.clearActiveSession() SparkSession.clearDefaultSession() + spark = null } finally { super.afterEach() } } - - def resetSparkContext(): Unit = { -LocalSparkSession.stop(spark) -spark = null - } - } object LocalSparkSession { http://git-wip-us.apache.org/repos/asf/spark/blob/9bfc55b1/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala new file mode 100644 index 000..bb79d3a --- /dev/null +++ b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala @@ -0,0 +1,33 @@ 
+/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *
spark git commit: [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession
Repository: spark Updated Branches: refs/heads/master bd6db1505 -> fec67ed7e [SPARK-25076][SQL] SQLConf should not be retrieved from a stopped SparkSession ## What changes were proposed in this pull request? When a `SparkSession` is stopped, `SQLConf.get` should use the fallback conf to avoid weird issues like ``` sbt.ForkMain$ForkError: java.lang.IllegalStateException: LiveListenerBus is stopped. at org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97) at org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:93) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:120) at scala.Option.getOrElse(Option.scala:121) ... ``` ## How was this patch tested? a new test suite Closes #22056 from cloud-fan/session. Authored-by: Wenchen Fan Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fec67ed7 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fec67ed7 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fec67ed7 Branch: refs/heads/master Commit: fec67ed7e95483c5ea97a7b263ad4bea7d3d42b5 Parents: bd6db15 Author: Wenchen Fan Authored: Thu Aug 9 14:38:58 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 14:38:58 2018 -0700 -- .../org/apache/spark/sql/SparkSession.scala | 3 +- .../apache/spark/sql/LocalSparkSession.scala| 9 ++ .../spark/sql/internal/SQLConfGetterSuite.scala | 33 3 files changed, 37 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala index 565042f..d9278d8 100644 --- 
a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala @@ -92,7 +92,8 @@ class SparkSession private( // If there is no active SparkSession, uses the default SQL conf. Otherwise, use the session's. SQLConf.setSQLConfGetter(() => { - SparkSession.getActiveSession.map(_.sessionState.conf).getOrElse(SQLConf.getFallbackConf) + SparkSession.getActiveSession.filterNot(_.sparkContext.isStopped).map(_.sessionState.conf) + .getOrElse(SQLConf.getFallbackConf) }) /** http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala index cbef1c7..6b90f20 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/LocalSparkSession.scala @@ -36,19 +36,14 @@ trait LocalSparkSession extends BeforeAndAfterEach with BeforeAndAfterAll { self override def afterEach() { try { - resetSparkContext() + LocalSparkSession.stop(spark) SparkSession.clearActiveSession() SparkSession.clearDefaultSession() + spark = null } finally { super.afterEach() } } - - def resetSparkContext(): Unit = { -LocalSparkSession.stop(spark) -spark = null - } - } object LocalSparkSession { http://git-wip-us.apache.org/repos/asf/spark/blob/fec67ed7/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala new file mode 100644 index 000..bb79d3a --- /dev/null +++ b/sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfGetterSuite.scala @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license 
agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to
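The one-line fix in this commit filters out a session whose SparkContext has stopped before reading its conf, falling back to the default conf instead of touching the stopped session's internals (which is what raised the `LiveListenerBus is stopped` error). The guard pattern, sketched in Python with a hypothetical session object (not Spark's actual classes):

```python
FALLBACK_CONF = {"spark.sql.shuffle.partitions": "200"}  # stand-in for SQLConf.getFallbackConf

class Session:
    """Hypothetical stand-in for SparkSession for illustration."""
    def __init__(self, conf, stopped=False):
        self.conf = conf
        self.stopped = stopped

active_session = None  # set by the application, like SparkSession.getActiveSession

def get_conf():
    # Mirrors getActiveSession.filterNot(_.sparkContext.isStopped):
    # a stopped session must not be consulted, or its internals
    # (listener bus, shared state) may throw IllegalStateException.
    if active_session is not None and not active_session.stopped:
        return active_session.conf
    return FALLBACK_CONF

active_session = Session({"spark.sql.shuffle.partitions": "8"})
live_conf = get_conf()       # session is live: its own conf
active_session.stopped = True
stopped_conf = get_conf()    # session stopped: fallback conf
```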
svn commit: r28640 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_09_14_02-7d465d8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Aug 9 21:15:35 2018 New Revision: 28640 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_09_14_02-7d465d8 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25077][SQL] Delete unused variable in WindowExec
Repository: spark Updated Branches: refs/heads/master eb9a696dd -> bd6db1505 [SPARK-25077][SQL] Delete unused variable in WindowExec ## What changes were proposed in this pull request? Just delete the unused variable `inputFields` in WindowExec, avoid making others confused while reading the code. ## How was this patch tested? Existing UT. Closes #22057 from xuanyuanking/SPARK-25077. Authored-by: liyuanjian Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bd6db150 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bd6db150 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bd6db150 Branch: refs/heads/master Commit: bd6db1505fb68737fa1782bd457ddc52eae6652d Parents: eb9a696 Author: liyuanjian Authored: Thu Aug 9 13:43:07 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 13:43:07 2018 -0700 -- .../scala/org/apache/spark/sql/execution/window/WindowExec.scala | 2 -- 1 file changed, 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/bd6db150/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala index 626f39d..fede0f3 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala @@ -323,8 +323,6 @@ case class WindowExec( fetchNextRow() // Manage the current partition. -val inputFields = child.output.length - val buffer: ExternalAppendOnlyUnsafeRowArray = new ExternalAppendOnlyUnsafeRowArray(inMemoryThreshold, spillThreshold) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r28638 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_12_02-eb9a696-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Aug 9 19:16:15 2018 New Revision: 28638 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_09_12_02-eb9a696 docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [MINOR][BUILD] Update Jetty to 9.3.24.v20180605
Repository: spark Updated Branches: refs/heads/branch-2.3 9fb70f458 -> 7d465d8f4 [MINOR][BUILD] Update Jetty to 9.3.24.v20180605 Update Jetty to 9.3.24.v20180605 to pick up security fix Existing tests. Closes #22055 from srowen/Jetty9324. Authored-by: Sean Owen Signed-off-by: Sean Owen (cherry picked from commit eb9a696dd6f138225708d15bb2383854ed8a6dab) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7d465d8f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7d465d8f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7d465d8f Branch: refs/heads/branch-2.3 Commit: 7d465d8f4ad982fbdcfc0129ff9a4952a384bb17 Parents: 9fb70f4 Author: Sean Owen Authored: Thu Aug 9 13:04:03 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 13:05:26 2018 -0500 -- pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/7d465d8f/pom.xml -- diff --git a/pom.xml b/pom.xml index 76e8363..3ff0408 100644 --- a/pom.xml +++ b/pom.xml @@ -133,7 +133,7 @@ 1.4.4 nohive 1.6.0 -9.3.20.v20170531 +9.3.24.v20180605 3.1.0 0.8.4 2.4.0 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [MINOR][BUILD] Update Jetty to 9.3.24.v20180605
Repository: spark Updated Branches: refs/heads/master d36539741 -> eb9a696dd [MINOR][BUILD] Update Jetty to 9.3.24.v20180605 ## What changes were proposed in this pull request? Update Jetty to 9.3.24.v20180605 to pick up security fix ## How was this patch tested? Existing tests. Closes #22055 from srowen/Jetty9324. Authored-by: Sean Owen Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eb9a696d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eb9a696d Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eb9a696d Branch: refs/heads/master Commit: eb9a696dd6f138225708d15bb2383854ed8a6dab Parents: d365397 Author: Sean Owen Authored: Thu Aug 9 13:04:03 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 13:04:03 2018 -0500 -- dev/deps/spark-deps-hadoop-3.1 | 4 ++-- pom.xml| 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/eb9a696d/dev/deps/spark-deps-hadoop-3.1 -- diff --git a/dev/deps/spark-deps-hadoop-3.1 b/dev/deps/spark-deps-hadoop-3.1 index 90602fc..fb42adf 100644 --- a/dev/deps/spark-deps-hadoop-3.1 +++ b/dev/deps/spark-deps-hadoop-3.1 @@ -120,8 +120,8 @@ jersey-guava-2.22.2.jar jersey-media-jaxb-2.22.2.jar jersey-server-2.22.2.jar jets3t-0.9.4.jar -jetty-webapp-9.3.20.v20170531.jar -jetty-xml-9.3.20.v20170531.jar +jetty-webapp-9.3.24.v20180605.jar +jetty-xml-9.3.24.v20180605.jar jline-2.14.3.jar joda-time-2.9.3.jar jodd-core-3.5.2.jar http://git-wip-us.apache.org/repos/asf/spark/blob/eb9a696d/pom.xml -- diff --git a/pom.xml b/pom.xml index 8abdb70..b89713f 100644 --- a/pom.xml +++ b/pom.xml @@ -134,7 +134,7 @@ 1.5.2 nohive 1.6.0 -9.3.20.v20170531 +9.3.24.v20180605 3.1.0 0.8.4 2.4.0 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-24626][SQL] Improve location size calculation in Analyze Table command
Repository: spark Updated Branches: refs/heads/master 2949a835f -> d36539741 [SPARK-24626][SQL] Improve location size calculation in Analyze Table command ## What changes were proposed in this pull request? Currently, the Analyze Table command calculates table size sequentially for each partition. We can parallelize the size calculations over partitions. Results: tested on a table with 100 partitions and data stored in S3. With changes: 10.429s, 10.557s, 10.439s, 9.893s. Without changes: 110.034s, 99.510s, 100.743s, 99.106s. ## How was this patch tested? Simple unit test. Closes #21608 from Achuth17/improveAnalyze. Lead-authored-by: Achuth17 Co-authored-by: arajagopal17 Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3653974 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3653974 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d3653974 Branch: refs/heads/master Commit: d36539741ff6a12a6acde9274e9992a66cdd36e7 Parents: 2949a83 Author: Achuth17 Authored: Thu Aug 9 08:29:24 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 08:29:24 2018 -0700 -- docs/sql-programming-guide.md | 2 ++ .../org/apache/spark/sql/internal/SQLConf.scala | 12 .../command/AnalyzeColumnCommand.scala | 2 +- .../execution/command/AnalyzeTableCommand.scala | 2 +- .../sql/execution/command/CommandUtils.scala| 30 +++- .../execution/datasources/DataSourceUtils.scala | 10 +++ .../datasources/InMemoryFileIndex.scala | 2 +- .../apache/spark/sql/hive/StatisticsSuite.scala | 23 ++- 8 files changed, 72 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d3653974/docs/sql-programming-guide.md -- diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md index a1e019c..9adb86a 100644 --- a/docs/sql-programming-guide.md +++ b/docs/sql-programming-guide.md @@ -1892,6 +1892,8 @@ working with timestamps in `pandas_udf`s to get the best performance, see - In 
version 2.3 and earlier, Spark converts Parquet Hive tables by default but ignores table properties like `TBLPROPERTIES (parquet.compression 'NONE')`. This happens for ORC Hive table properties like `TBLPROPERTIES (orc.compress 'NONE')` in case of `spark.sql.hive.convertMetastoreOrc=true`, too. Since Spark 2.4, Spark respects Parquet/ORC specific table properties while converting Parquet/ORC Hive tables. As an example, `CREATE TABLE t(id int) STORED AS PARQUET TBLPROPERTIES (parquet.compression 'NONE')` would generate Snappy parquet files during insertion in Spark 2.3, and in Spark 2.4, the result would be uncompressed parquet files. - Since Spark 2.0, Spark converts Parquet Hive tables by default for better performance. Since Spark 2.4, Spark converts ORC Hive tables by default, too. It means Spark uses its own ORC support by default instead of Hive SerDe. As an example, `CREATE TABLE t(id int) STORED AS ORC` would be handled with Hive SerDe in Spark 2.3, and in Spark 2.4, it would be converted into Spark's ORC data source table and ORC vectorization would be applied. To set `false` to `spark.sql.hive.convertMetastoreOrc` restores the previous behavior. - In version 2.3 and earlier, CSV rows are considered as malformed if at least one column value in the row is malformed. CSV parser dropped such rows in the DROPMALFORMED mode or outputs an error in the FAILFAST mode. Since Spark 2.4, CSV row is considered as malformed only when it contains malformed column values requested from CSV datasource, other values can be ignored. As an example, CSV file contains the "id,name" header and one row "1234". In Spark 2.4, selection of the id column consists of a row with one column value 1234 but in Spark 2.3 and earlier it is empty in the DROPMALFORMED mode. To restore the previous behavior, set `spark.sql.csv.parser.columnPruning.enabled` to `false`. + - Since Spark 2.4, File listing for compute statistics is done in parallel by default. 
This can be disabled by setting `spark.sql.parallelFileListingInStatsComputation.enabled` to `False`. + - Since Spark 2.4, Metadata files (e.g. Parquet summary files) and temporary files are not counted as data files when calculating table size during Statistics computation. ## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above http://git-wip-us.apache.org/repos/asf/spark/blob/d3653974/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
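The speedup reported in this commit (roughly 100s down to 10s on 100 S3 partitions) comes from issuing the per-partition file listings concurrently instead of one by one, since remote-listing latency dominates. A hedged sketch of the idea with a thread pool; the partition paths and sizes here are stubbed stand-ins, and Spark's real implementation goes through the Hadoop FileSystem listing API:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub: pretend each partition location reports a size after a remote call.
partition_sizes = {f"s3://bucket/table/part={i}": 1024 * i for i in range(100)}

def location_size(path):
    # In Spark this would be a FileSystem listing against S3, whose
    # per-call latency is what makes the sequential version slow.
    return partition_sizes[path]

def total_table_size(paths, parallelism=8):
    # Issue the per-partition listings concurrently and sum the results.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return sum(pool.map(location_size, paths))

size = total_table_size(list(partition_sizes))
```

Threads (rather than processes) fit here because the work is I/O-bound waiting on the object store, not CPU-bound.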
svn commit: r28636 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_08_02-1a7e747-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Aug 9 15:16:22 2018 New Revision: 28636 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_09_08_02-1a7e747 docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotNull
Repository: spark Updated Branches: refs/heads/master 1a7e747ce -> 2949a835f [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotNull ## What changes were proposed in this pull request? Correct the class name typo checked in through SPARK-24891 ## How was this patch tested? Passed all existing tests. Closes #22049 from maryannxue/known-not-null. Authored-by: maryannxue Signed-off-by: Xiao Li Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2949a835 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2949a835 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2949a835 Branch: refs/heads/master Commit: 2949a835fae3f4ac6e3dae6f18cd8b6543b74601 Parents: 1a7e747 Author: maryannxue Authored: Thu Aug 9 08:11:30 2018 -0700 Committer: Xiao Li Committed: Thu Aug 9 08:11:30 2018 -0700 -- .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala | 4 ++-- .../spark/sql/catalyst/expressions/constraintExpressions.scala | 2 +- .../org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index d23d43b..a7cd96e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -2157,7 +2157,7 @@ class Analyzer( // trust the `nullable` information. 
// (cls, expr) => cls.isPrimitive && expr.nullable val needsNullCheck = (cls: Class[_], expr: Expression) => -cls.isPrimitive && !expr.isInstanceOf[KnowNotNull] +cls.isPrimitive && !expr.isInstanceOf[KnownNotNull] val inputsNullCheck = parameterTypes.zip(inputs) .filter { case (cls, expr) => needsNullCheck(cls, expr) } .map { case (_, expr) => IsNull(expr) } @@ -2167,7 +2167,7 @@ class Analyzer( // branch of `If` will be called if any of these checked inputs is null. Thus we can // prevent this rule from being applied repeatedly. val newInputs = parameterTypes.zip(inputs).map{ case (cls, expr) => -if (needsNullCheck(cls, expr)) KnowNotNull(expr) else expr } +if (needsNullCheck(cls, expr)) KnownNotNull(expr) else expr } inputsNullCheck .map(If(_, Literal.create(null, udf.dataType), udf.copy(children = newInputs))) .getOrElse(udf) http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala index 53936aa..2917b0b 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala @@ -21,7 +21,7 @@ import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode, FalseLiteral} import org.apache.spark.sql.types.DataType -case class KnowNotNull(child: Expression) extends UnaryExpression { +case class KnownNotNull(child: Expression) extends UnaryExpression { override def nullable: Boolean = false override def dataType: DataType = child.dataType 
http://git-wip-us.apache.org/repos/asf/spark/blob/2949a835/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala index ba44484..a1c976d 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala @@ -319,7 +319,7 @@ class AnalysisSuite extends AnalysisTest with Matchers { // only primitive parameter needs special null handling val udf2 = ScalaUDF((s: String, d: Double) => "x", StringType, string :: double :: Nil)
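Beyond the rename, the diff above shows why the wrapper exists at all: `KnownNotNull` marks an input that has already been null-checked, so the analyzer rule (`!expr.isInstanceOf[KnownNotNull]`) does not wrap it again on later passes. A minimal sketch of that idempotence trick, using hypothetical Java stand-ins rather than Spark's real Catalyst classes:

```java
// Toy expression tree illustrating the pattern behind KnownNotNull: once a
// rule has null-checked an input, it wraps the input in a marker expression
// whose `nullable` is false, so the rule skips it on subsequent passes.
// All names here are illustrative, not Spark's actual Catalyst API.
interface Expr { boolean nullable(); }

class Column implements Expr {
    final String name;
    Column(String name) { this.name = name; }
    public boolean nullable() { return true; }   // assume columns may be null
}

class KnownNotNull implements Expr {
    final Expr child;
    KnownNotNull(Expr child) { this.child = child; }
    public boolean nullable() { return false; }  // marker: already checked
}

public class NullCheckRule {
    // The rule fires only for expressions not already wrapped, which makes it
    // idempotent across repeated analyzer passes.
    static Expr apply(Expr e) {
        return (e instanceof KnownNotNull) ? e : new KnownNotNull(e);
    }

    public static void main(String[] args) {
        Expr once = apply(new Column("a"));
        Expr twice = apply(once);          // second pass is a no-op
        System.out.println(once == twice); // prints "true"
    }
}
```

Without such a marker, a fixed-point analyzer would keep re-wrapping the same input on every iteration; the marker is what lets the rule converge.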
spark git commit: [SPARK-25047][ML] Can't assign SerializedLambda to scala.Function1 in deserialization of BucketedRandomProjectionLSHModel
Repository: spark Updated Branches: refs/heads/master b2950cef3 -> 1a7e747ce [SPARK-25047][ML] Can't assign SerializedLambda to scala.Function1 in deserialization of BucketedRandomProjectionLSHModel ## What changes were proposed in this pull request? Convert two function fields in ML classes to simple functions to avoid odd SerializedLambda deserialization problem ## How was this patch tested? Existing tests. Closes #22032 from srowen/SPARK-25047. Authored-by: Sean Owen Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1a7e747c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1a7e747c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1a7e747c Branch: refs/heads/master Commit: 1a7e747ce4f8c5253c5923045d23c62e43a6566b Parents: b2950ce Author: Sean Owen Authored: Thu Aug 9 08:07:46 2018 -0500 Committer: Sean Owen Committed: Thu Aug 9 08:07:46 2018 -0500 -- .../feature/BucketedRandomProjectionLSH.scala | 14 ++ .../scala/org/apache/spark/ml/feature/LSH.scala | 4 ++-- .../apache/spark/ml/feature/MinHashLSH.scala| 20 +--- .../GeneralizedLinearRegression.scala | 15 +++ 4 files changed, 24 insertions(+), 29 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala index a906e95..0554455 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala @@ -82,14 +82,12 @@ class BucketedRandomProjectionLSHModel private[ml]( override def setOutputCol(value: String): this.type = super.set(outputCol, value) @Since("2.1.0") - override protected[ml] val hashFunction: Vector =>
Array[Vector] = { -key: Vector => { - val hashValues: Array[Double] = randUnitVectors.map({ -randUnitVector => Math.floor(BLAS.dot(key, randUnitVector) / $(bucketLength)) - }) - // TODO: Output vectors of dimension numHashFunctions in SPARK-18450 - hashValues.map(Vectors.dense(_)) -} + override protected[ml] def hashFunction(elems: Vector): Array[Vector] = { +val hashValues = randUnitVectors.map( + randUnitVector => Math.floor(BLAS.dot(elems, randUnitVector) / $(bucketLength)) +) +// TODO: Output vectors of dimension numHashFunctions in SPARK-18450 +hashValues.map(Vectors.dense(_)) } @Since("2.1.0") http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala index a70931f..b208523 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala @@ -75,7 +75,7 @@ private[ml] abstract class LSHModel[T <: LSHModel[T]] * The hash function of LSH, mapping an input feature vector to multiple hash vectors. * @return The mapping of LSH function. 
*/ - protected[ml] val hashFunction: Vector => Array[Vector] + protected[ml] def hashFunction(elems: Vector): Array[Vector] /** * Calculate the distance between two different keys using the distance metric corresponding @@ -97,7 +97,7 @@ private[ml] abstract class LSHModel[T <: LSHModel[T]] override def transform(dataset: Dataset[_]): DataFrame = { transformSchema(dataset.schema, logging = true) -val transformUDF = udf(hashFunction, DataTypes.createArrayType(new VectorUDT)) +val transformUDF = udf(hashFunction(_: Vector), DataTypes.createArrayType(new VectorUDT)) dataset.withColumn($(outputCol), transformUDF(dataset($(inputCol } http://git-wip-us.apache.org/repos/asf/spark/blob/1a7e747c/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala index a043033..21cde66 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala @@ -60,18 +60,16 @@ class MinHashLSHModel private[ml]( override def setOutputCol(value: String): this.type =
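The fix above replaces `val hashFunction: Vector => Array[Vector]` (a function-valued field) with a plain `def`, so no function object has to survive Java serialization when the model is shipped to executors. The sketch below is not Spark's code; it illustrates the adjacent Java-side hazard under the same design choice: a serializable model holding a lambda in a field fails to round-trip (plain lambdas are not `Serializable`), while the same behavior expressed as a method serializes without trouble.

```java
import java.io.*;
import java.util.function.IntUnaryOperator;

// Function stored in a field: serializing the model must serialize the lambda.
class FunctionFieldModel implements Serializable {
    final IntUnaryOperator hash = x -> x * 31; // a plain lambda is NOT Serializable
}

// Behavior expressed as a method: nothing extra is captured in serialized form.
class MethodModel implements Serializable {
    int hash(int x) { return x * 31; }
}

public class LambdaSerDemo {
    // Serialize then deserialize; report whether the round trip succeeded.
    static boolean roundTrips(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(o); // throws NotSerializableException for the lambda field
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                ois.readObject();
            }
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("function field round-trips: " + roundTrips(new FunctionFieldModel()));
        System.out.println("method model round-trips:   " + roundTrips(new MethodModel()));
    }
}
```

Spark's actual failure involved Scala's `SerializedLambda` being assigned to a `scala.Function1` field on deserialization; the general lesson is the same — function-valued fields couple a model's serialized form to lambda implementation details, while methods do not.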
spark git commit: Revert "[SPARK-24648][SQL] SqlMetrics should be threadsafe"
Repository: spark Updated Branches: refs/heads/master 386fbd3af -> b2950cef3 Revert "[SPARK-24648][SQL] SqlMetrics should be threadsafe" This reverts commit 5264164a67df498b73facae207eda12ee133be7d. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b2950cef Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b2950cef Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b2950cef Branch: refs/heads/master Commit: b2950cef3c898f59a2c92e8800ff134c44263b9a Parents: 386fbd3 Author: Wenchen Fan Authored: Thu Aug 9 20:33:59 2018 +0800 Committer: Wenchen Fan Committed: Thu Aug 9 20:33:59 2018 +0800 -- .../spark/sql/execution/metric/SQLMetrics.scala | 33 +++--- .../sql/execution/metric/SQLMetricsSuite.scala | 36 +--- 2 files changed, 14 insertions(+), 55 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b2950cef/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala index 98f58a3..cbf707f 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala @@ -19,7 +19,6 @@ package org.apache.spark.sql.execution.metric import java.text.NumberFormat import java.util.Locale -import java.util.concurrent.atomic.LongAdder import org.apache.spark.SparkContext import org.apache.spark.scheduler.AccumulableInfo @@ -33,45 +32,40 @@ import org.apache.spark.util.{AccumulatorContext, AccumulatorV2, Utils} * on the driver side must be explicitly posted using [[SQLMetrics.postDriverMetricUpdates()]]. */ class SQLMetric(val metricType: String, initValue: Long = 0L) extends AccumulatorV2[Long, Long] { - // This is a workaround for SPARK-11013. 
// We may use -1 as initial value of the accumulator, if the accumulator is valid, we will // update it at the end of task and the value will be at least 0. Then we can filter out the -1 // values before calculate max, min, etc. - private[this] val _value = new LongAdder - private val _zeroValue = initValue - _value.add(initValue) + private[this] var _value = initValue + private var _zeroValue = initValue override def copy(): SQLMetric = { -val newAcc = new SQLMetric(metricType, initValue) -newAcc.add(_value.sum()) +val newAcc = new SQLMetric(metricType, _value) +newAcc._zeroValue = initValue newAcc } - override def reset(): Unit = this.set(_zeroValue) + override def reset(): Unit = _value = _zeroValue override def merge(other: AccumulatorV2[Long, Long]): Unit = other match { -case o: SQLMetric => _value.add(o.value) +case o: SQLMetric => _value += o.value case _ => throw new UnsupportedOperationException( s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}") } - override def isZero(): Boolean = _value.sum() == _zeroValue + override def isZero(): Boolean = _value == _zeroValue - override def add(v: Long): Unit = _value.add(v) + override def add(v: Long): Unit = _value += v // We can set a double value to `SQLMetric` which stores only long value, if it is // average metrics. 
def set(v: Double): Unit = SQLMetrics.setDoubleForAverageMetrics(this, v) - def set(v: Long): Unit = { -_value.reset() -_value.add(v) - } + def set(v: Long): Unit = _value = v - def +=(v: Long): Unit = _value.add(v) + def +=(v: Long): Unit = _value += v - override def value: Long = _value.sum() + override def value: Long = _value // Provide special identifier as metadata so we can tell that this is a `SQLMetric` later override def toInfo(update: Option[Any], value: Option[Any]): AccumulableInfo = { @@ -159,7 +153,7 @@ object SQLMetrics { Seq.fill(3)(0L) } else { val sorted = validValues.sorted - Seq(sorted.head, sorted(validValues.length / 2), sorted(validValues.length - 1)) + Seq(sorted(0), sorted(validValues.length / 2), sorted(validValues.length - 1)) } metric.map(v => numberFormat.format(v.toDouble / baseForAvgMetric)) } @@ -179,8 +173,7 @@ object SQLMetrics { Seq.fill(4)(0L) } else { val sorted = validValues.sorted - Seq(sorted.sum, sorted.head, sorted(validValues.length / 2), -sorted(validValues.length - 1)) + Seq(sorted.sum, sorted(0), sorted(validValues.length / 2), sorted(validValues.length - 1)) } metric.map(strFormat) }
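The reverted patch (SPARK-24648) had replaced `SQLMetric`'s plain `var _value` with a `java.util.concurrent.atomic.LongAdder` because an unsynchronized `+=` on a shared counter is a read-modify-write race that can lose updates under concurrent tasks. A standalone sketch of that difference (the demo class and method names are my own, not Spark's):

```java
import java.util.concurrent.atomic.LongAdder;

public class MetricRaceDemo {
    static long plain = 0;                         // unsynchronized counter (like var _value)
    static final LongAdder safe = new LongAdder(); // what the reverted patch used

    // Increment both counters from `threads` threads, `n` times each,
    // and return their final values as {plain, LongAdder sum}.
    static long[] run(int threads, int n) throws InterruptedException {
        plain = 0;
        safe.reset();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < n; i++) {
                    plain++;      // read-modify-write: concurrent updates can be lost
                    safe.add(1);  // LongAdder is designed for contended accumulation
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) t.join(); // join gives us a consistent final read
        return new long[] { plain, safe.sum() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = run(4, 100_000);
        System.out.println("plain counter: " + r[0]); // often less than 400000
        System.out.println("LongAdder sum: " + r[1]); // always 400000
    }
}
```

The revert trades that safety back for the simpler `var`, presumably because of concerns with the patch itself rather than with `LongAdder`; the commit message gives no rationale beyond the revert.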
spark git commit: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSparkSubmitSuite correct and stable
Repository: spark Updated Branches: refs/heads/master 56e9e9707 -> 386fbd3af [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSparkSubmitSuite correct and stable ## What changes were proposed in this pull request? This PR addresses two issues in `BufferHolderSparkSubmitSuite`. 1. Although `BufferHolderSparkSubmitSuite` was meant to allocate a large object several times, it actually allocated the object once and reused it. 2. `BufferHolderSparkSubmitSuite` may fail due to timeout. Assigning a small object before each large allocation solved issue 1 by preventing reuse. Increasing the heap size from 4g to 7g solved issue 2, and also avoids OOM once issue 1 is fixed. ## How was this patch tested? Updated the existing `BufferHolderSparkSubmitSuite`. Closes #20636 from kiszk/SPARK-23415. Authored-by: Kazuaki Ishizaki Signed-off-by: Wenchen Fan Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/386fbd3a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/386fbd3a Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/386fbd3a Branch: refs/heads/master Commit: 386fbd3aff95ce919567b1b94d5b19c5bcef266a Parents: 56e9e97 Author: Kazuaki Ishizaki Authored: Thu Aug 9 20:28:14 2018 +0800 Committer: Wenchen Fan Committed: Thu Aug 9 20:28:14 2018 +0800 -- .../expressions/codegen/BufferHolder.java | 13 +-- .../codegen/BufferHolderSparkSubmitSuite.scala | 36 .../expressions/codegen/BufferHolderSuite.scala | 10 +++--- 3 files changed, 36 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/386fbd3a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java -- diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java index 537ef24..6a52a5b 100644 ---
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java @@ -35,6 +35,7 @@ final class BufferHolder { private static final int ARRAY_MAX = ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH; + // buffer is guarantee to be word-aligned since UnsafeRow assumes each field is word-aligned. private byte[] buffer; private int cursor = Platform.BYTE_ARRAY_OFFSET; private final UnsafeRow row; @@ -52,7 +53,8 @@ final class BufferHolder { "too many fields (number of fields: " + row.numFields() + ")"); } this.fixedSize = bitsetWidthInBytes + 8 * row.numFields(); -this.buffer = new byte[fixedSize + initialSize]; +int roundedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(fixedSize + initialSize); +this.buffer = new byte[roundedSize]; this.row = row; this.row.pointTo(buffer, buffer.length); } @@ -61,8 +63,12 @@ final class BufferHolder { * Grows the buffer by at least neededSize and points the row to the buffer. */ void grow(int neededSize) { +if (neededSize < 0) { + throw new IllegalArgumentException( +"Cannot grow BufferHolder by size " + neededSize + " because the size is negative"); +} if (neededSize > ARRAY_MAX - totalSize()) { - throw new UnsupportedOperationException( + throw new IllegalArgumentException( "Cannot grow BufferHolder by size " + neededSize + " because the size after growing " + "exceeds size limitation " + ARRAY_MAX); } @@ -70,7 +76,8 @@ final class BufferHolder { if (buffer.length < length) { // This will not happen frequently, because the buffer is re-used. int newLength = length < ARRAY_MAX / 2 ? 
length * 2 : ARRAY_MAX; - final byte[] tmp = new byte[newLength]; + int roundedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(newLength); + final byte[] tmp = new byte[roundedSize]; Platform.copyMemory( buffer, Platform.BYTE_ARRAY_OFFSET, http://git-wip-us.apache.org/repos/asf/spark/blob/386fbd3a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala index 85682cf..d2862c8 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala +++
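Two details of the `BufferHolder` change above are easy to restate in isolation: buffer sizes are rounded up to a whole number of 8-byte words (since `UnsafeRow` assumes word alignment), and growth doubles the length until it approaches the array cap. A sketch mirroring that arithmetic — the `ARRAY_MAX` value here is an assumption standing in for `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH`, and `roundToWord` mirrors `roundNumberOfBytesToNearestWord`:

```java
public class WordAlignDemo {
    // Assumed cap on rounded array lengths, standing in for
    // ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH in the patch above.
    static final int ARRAY_MAX = Integer.MAX_VALUE - 15;

    // Round a byte count up to the next multiple of 8 (one word).
    static int roundToWord(int numBytes) {
        int remainder = numBytes & 0x07;                // numBytes % 8
        return remainder == 0 ? numBytes : numBytes + (8 - remainder);
    }

    // The doubling policy from BufferHolder.grow: double until the cap,
    // then round the result to a word boundary before allocating.
    static int grownLength(int length) {
        int newLength = length < ARRAY_MAX / 2 ? length * 2 : ARRAY_MAX;
        return roundToWord(newLength);
    }

    public static void main(String[] args) {
        System.out.println(roundToWord(13)); // 16
        System.out.println(roundToWord(16)); // 16
        System.out.println(grownLength(24)); // 48
    }
}
```

Rounding at allocation time (rather than trusting callers to pass aligned sizes) is what lets the patch keep the word-alignment guarantee stated in the new comment on the `buffer` field.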
spark git commit: [MINOR][DOC] Fix typo
Repository: spark Updated Branches: refs/heads/master 519e03d82 -> 56e9e9707 [MINOR][DOC] Fix typo ## What changes were proposed in this pull request? This PR fixes typo regarding `auxiliary verb + verb[s]`. This is a follow-on of #21956. ## How was this patch tested? N/A Closes #22040 from kiszk/spellcheck1. Authored-by: Kazuaki Ishizaki Signed-off-by: hyukjinkwon Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/56e9e970 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/56e9e970 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/56e9e970 Branch: refs/heads/master Commit: 56e9e97073cf1896e301371b3941c9307e42ff77 Parents: 519e03d Author: Kazuaki Ishizaki Authored: Thu Aug 9 20:10:17 2018 +0800 Committer: hyukjinkwon Committed: Thu Aug 9 20:10:17 2018 +0800 -- .../main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java | 2 +- .../util/collection/unsafe/sort/UnsafeSorterSpillMerger.java | 2 +- .../src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala | 2 +- .../test/java/test/org/apache/spark/JavaSparkContextSuite.java | 2 +- .../scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala | 2 +- .../org/apache/spark/ml/classification/LogisticRegression.scala | 2 +- python/pyspark/sql/types.py | 2 +- .../apache/spark/sql/catalyst/analysis/DecimalPrecision.scala| 2 +- .../expressions/CodeGeneratorWithInterpretedFallback.scala | 2 +- .../spark/sql/catalyst/expressions/ExpectsInputTypes.scala | 2 +- .../spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala | 4 ++-- .../spark/sql/catalyst/encoders/EncoderResolutionSuite.scala | 2 +- .../scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala | 2 +- .../apache/spark/sql/execution/streaming/FileStreamSource.scala | 2 +- .../apache/spark/sql/execution/streaming/ProgressReporter.scala | 2 +- .../src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala | 2 +- .../sql/hive/execution/CreateHiveTableAsSelectCommand.scala | 2 
+- .../org/apache/spark/sql/hive/execution/HiveQuerySuite.scala | 2 +- 18 files changed, 19 insertions(+), 19 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java -- diff --git a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java index 9a767dd..9b6cbab 100644 --- a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java +++ b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java @@ -662,7 +662,7 @@ public final class BytesToBytesMap extends MemoryConsumer { * It is only valid to call this method immediately after calling `lookup()` using the same key. * * - * The key and value must be word-aligned (that is, their sizes must multiples of 8). + * The key and value must be word-aligned (that is, their sizes must be a multiple of 8). * * * After calling this method, calls to `get[Key|Value]Address()` and `get[Key|Value]Length` http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java -- diff --git a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java index ff0dcc2..ab80028 100644 --- a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java +++ b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java @@ -51,7 +51,7 @@ final class UnsafeSorterSpillMerger { if (spillReader.hasNext()) { // We only add the spillReader to the priorityQueue if it is not empty. 
We do this to // make sure the hasNext method of UnsafeSorterIterator returned by getSortedIterator - // does not return wrong result because hasNext will returns true + // does not return wrong result because hasNext will return true // at least priorityQueue.size() times. If we allow n spillReaders in the // priorityQueue, we will have n extra empty records in the result of UnsafeSorterIterator. spillReader.loadNext(); http://git-wip-us.apache.org/repos/asf/spark/blob/56e9e970/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
svn commit: r28629 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_09_00_02-519e03d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Aug 9 07:16:59 2018 New Revision: 28629 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_09_00_02-519e03d docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]