svn commit: r30509 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_29_20_02-5bd5e1b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Tue Oct 30 03:17:06 2018
New Revision: 30509

Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_29_20_02-5bd5e1b docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50, so it was shortened to this summary.]
spark git commit: [MINOR][SQL] Avoid hardcoded configuration keys in SQLConf's `doc`
Repository: spark
Updated Branches:
  refs/heads/master 5e5d886a2 -> 5bd5e1b9c

[MINOR][SQL] Avoid hardcoded configuration keys in SQLConf's `doc`

## What changes were proposed in this pull request?

This PR proposes to avoid hardcoded configuration keys in SQLConf's `doc`.

## How was this patch tested?

Manually verified.

Closes #22877 from HyukjinKwon/minor-conf-name.

Authored-by: hyukjinkwon
Signed-off-by: hyukjinkwon

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5bd5e1b9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5bd5e1b9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5bd5e1b9
Branch: refs/heads/master
Commit: 5bd5e1b9c84b5f7d4d67ab94e02d49ebdd02f177
Parents: 5e5d886
Author: hyukjinkwon
Authored: Tue Oct 30 07:38:26 2018 +0800
Committer: hyukjinkwon
Committed: Tue Oct 30 07:38:26 2018 +0800

----------------------------------------------------------------------
 .../org/apache/spark/sql/internal/SQLConf.scala | 41 +++-
 1 file changed, 23 insertions(+), 18 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/5bd5e1b9/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 4edffce..535ec51 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -408,7 +408,8 @@ object SQLConf {
   val PARQUET_FILTER_PUSHDOWN_DATE_ENABLED = buildConf("spark.sql.parquet.filterPushdown.date")
     .doc("If true, enables Parquet filter push-down optimization for Date. " +
-      "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+      s"This configuration only has an effect when '${PARQUET_FILTER_PUSHDOWN_ENABLED.key}' is " +
+      "enabled.")
     .internal()
     .booleanConf
     .createWithDefault(true)
@@ -416,7 +417,7 @@ object SQLConf {
   val PARQUET_FILTER_PUSHDOWN_TIMESTAMP_ENABLED =
     buildConf("spark.sql.parquet.filterPushdown.timestamp")
       .doc("If true, enables Parquet filter push-down optimization for Timestamp. " +
-        "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is " +
+        s"This configuration only has an effect when '${PARQUET_FILTER_PUSHDOWN_ENABLED.key}' is " +
         "enabled and Timestamp stored as TIMESTAMP_MICROS or TIMESTAMP_MILLIS type.")
       .internal()
       .booleanConf
@@ -425,7 +426,8 @@ object SQLConf {
   val PARQUET_FILTER_PUSHDOWN_DECIMAL_ENABLED =
     buildConf("spark.sql.parquet.filterPushdown.decimal")
       .doc("If true, enables Parquet filter push-down optimization for Decimal. " +
-        "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+        s"This configuration only has an effect when '${PARQUET_FILTER_PUSHDOWN_ENABLED.key}' is " +
+        "enabled.")
       .internal()
       .booleanConf
       .createWithDefault(true)
@@ -433,7 +435,8 @@ object SQLConf {
   val PARQUET_FILTER_PUSHDOWN_STRING_STARTSWITH_ENABLED =
     buildConf("spark.sql.parquet.filterPushdown.string.startsWith")
     .doc("If true, enables Parquet filter push-down optimization for string startsWith function. " +
-      "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+      s"This configuration only has an effect when '${PARQUET_FILTER_PUSHDOWN_ENABLED.key}' is " +
+      "enabled.")
     .internal()
     .booleanConf
     .createWithDefault(true)
@@ -444,7 +447,8 @@ object SQLConf {
       "Large threshold won't necessarily provide much better performance. " +
       "The experiment argued that 300 is the limit threshold. " +
       "By setting this value to 0 this feature can be disabled. " +
-      "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+      s"This configuration only has an effect when '${PARQUET_FILTER_PUSHDOWN_ENABLED.key}' is " +
+      "enabled.")
     .internal()
     .intConf
     .checkValue(threshold => threshold >= 0, "The threshold must not be negative.")
@@ -459,14 +463,6 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)

-  val PARQUET_RECORD_FILTER_ENABLED = buildConf("spark.sql.parquet.recordLevelFilter.enabled")
-    .doc("If true, enables Parquet's native record-level filtering using the pushed down " +
-      "filters. This configuration only has an effect when 'spark.sql.parquet.filterPushdown' " +
-      "is enabled and the vectorized reader is not used. You can ensure the vectorized reader " +
-      "is not used by setting
svn commit: r30501 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_29_12_09-5e5d886-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 19:24:15 2018
New Revision: 30501

Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_29_12_09-5e5d886 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50, so it was shortened to this summary.]
spark git commit: [SPARK-25856][SQL][MINOR] Remove AverageLike and CountLike classes
Repository: spark
Updated Branches:
  refs/heads/master 7fe5cff05 -> 5e5d886a2

[SPARK-25856][SQL][MINOR] Remove AverageLike and CountLike classes

## What changes were proposed in this pull request?

These two base classes were added for the regr_* expression support (SPARK-23907). Those expressions have since been removed, so we can remove the base classes and inline the logic in the concrete classes.

## How was this patch tested?

Existing tests.

Closes #22856 from dilipbiswal/average_cleanup.

Authored-by: Dilip Biswal
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5e5d886a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5e5d886a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5e5d886a
Branch: refs/heads/master
Commit: 5e5d886a2bc291a707cf4a6c70ecc6de6f8e990d
Parents: 7fe5cff
Author: Dilip Biswal
Authored: Mon Oct 29 12:56:06 2018 -0500
Committer: Sean Owen
Committed: Mon Oct 29 12:56:06 2018 -0500

----------------------------------------------------------------------
 .../expressions/aggregate/Average.scala | 33 +---
 .../catalyst/expressions/aggregate/Count.scala  | 28 +++--
 2 files changed, 25 insertions(+), 36 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/5e5d886a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
index 5ecb77b..8dd80dc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
@@ -23,9 +23,21 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.util.TypeUtils
 import org.apache.spark.sql.types._

-abstract class AverageLike(child: Expression) extends DeclarativeAggregate {
+@ExpressionDescription(
+  usage = "_FUNC_(expr) - Returns the mean calculated from values of a group.")
+case class Average(child: Expression) extends DeclarativeAggregate with ImplicitCastInputTypes {
+
+  override def prettyName: String = "avg"
+
+  override def children: Seq[Expression] = child :: Nil
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(NumericType)
+
+  override def checkInputDataTypes(): TypeCheckResult =
+    TypeUtils.checkForNumericExpr(child.dataType, "function average")

   override def nullable: Boolean = true

+  // Return data type.
   override def dataType: DataType = resultType
@@ -63,28 +75,11 @@ abstract class AverageLike(child: Expression) extends DeclarativeAggregate {
     sum.cast(resultType) / count.cast(resultType)
   }

-  protected def updateExpressionsDef: Seq[Expression] = Seq(
+  override lazy val updateExpressions: Seq[Expression] = Seq(
     /* sum = */
     Add(
       sum,
       coalesce(child.cast(sumDataType), Literal(0).cast(sumDataType))),
     /* count = */ If(child.isNull, count, count + 1L)
   )
-
-  override lazy val updateExpressions = updateExpressionsDef
-}
-
-@ExpressionDescription(
-  usage = "_FUNC_(expr) - Returns the mean calculated from values of a group.")
-case class Average(child: Expression)
-  extends AverageLike(child) with ImplicitCastInputTypes {
-
-  override def prettyName: String = "avg"
-
-  override def children: Seq[Expression] = child :: Nil
-
-  override def inputTypes: Seq[AbstractDataType] = Seq(NumericType)
-
-  override def checkInputDataTypes(): TypeCheckResult =
-    TypeUtils.checkForNumericExpr(child.dataType, "function average")
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/5e5d886a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
index 8cab8e4..d402f2d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
@@ -21,10 +21,17 @@ import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.types._

-/**
- * Base class for all counting aggregators.
- */
-abstract class CountLike extends DeclarativeAggregate {
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+    _FUNC_(*) - Returns the total number of retrieved rows,
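[Editor's note] For readers skimming the diff: the inlined Average still follows the standard declarative-aggregate recipe of a (sum, count) state pair. A minimal self-contained sketch of that recipe in plain Scala values, not Spark's expression tree (the class and method names below are illustrative):

// The aggregate state is a (sum, count) pair; the final value is sum / count.
// update mirrors updateExpressions above: a null input leaves count unchanged.
case class AvgState(sum: Double, count: Long) {
  def update(value: Option[Double]): AvgState =
    value.fold(this)(v => AvgState(sum + v, count + 1L))
  def merge(other: AvgState): AvgState =
    AvgState(sum + other.sum, count + other.count)
  def evaluate: Option[Double] =
    if (count == 0L) None else Some(sum / count)  // avg over no rows is null
}

For example, Seq(Some(1.0), None, Some(3.0)).foldLeft(AvgState(0.0, 0L))(_.update(_)).evaluate yields Some(2.0): the null row is counted in neither the sum nor the count.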
svn commit: r30500 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_29_10_09-5cc2987-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 17:23:31 2018
New Revision: 30500

Log: Apache Spark 2.4.1-SNAPSHOT-2018_10_29_10_09-5cc2987 docs

[This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]
spark git commit: [SPARK-25767][SQL] Fix lazily evaluated stream of expressions in code generation
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 22bec3c6d -> 5cc2987db

[SPARK-25767][SQL] Fix lazily evaluated stream of expressions in code generation

## What changes were proposed in this pull request?

Code generation is incorrect if the `outputVars` parameter of the `consume` method in `CodegenSupport` contains a lazily evaluated stream of expressions. This PR fixes the issue by forcing the evaluation of `inputVars` before generating the code for UnsafeRow.

## How was this patch tested?

Tested with the sample program provided in https://issues.apache.org/jira/browse/SPARK-25767

Closes #22789 from peter-toth/SPARK-25767.

Authored-by: Peter Toth
Signed-off-by: Herman van Hovell
(cherry picked from commit 7fe5cff0581ca9d8221533215098f40f69362018)
Signed-off-by: Herman van Hovell

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5cc2987d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5cc2987d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5cc2987d
Branch: refs/heads/branch-2.4
Commit: 5cc2987dbba609d99df0b367abe25238c9498cba
Parents: 22bec3c
Author: Peter Toth
Authored: Mon Oct 29 16:47:50 2018 +0100
Committer: Herman van Hovell
Committed: Mon Oct 29 16:48:06 2018 +0100

----------------------------------------------------------------------
 .../spark/sql/execution/WholeStageCodegenExec.scala  |  5 -
 .../spark/sql/execution/WholeStageCodegenSuite.scala | 11 +++
 2 files changed, 15 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/5cc2987d/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
index 1fc4de9..ded8dd3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
@@ -146,7 +146,10 @@ trait CodegenSupport extends SparkPlan {
     if (outputVars != null) {
       assert(outputVars.length == output.length)
       // outputVars will be used to generate the code for UnsafeRow, so we should copy them
-      outputVars.map(_.copy())
+      outputVars.map(_.copy()) match {
+        case stream: Stream[ExprCode] => stream.force
+        case other => other
+      }
     } else {
       assert(row != null, "outputVars and row cannot both be null.")
       ctx.currentVars = null

http://git-wip-us.apache.org/repos/asf/spark/blob/5cc2987d/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
index b714dcd..09ad0fd 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
@@ -319,4 +319,15 @@ class WholeStageCodegenSuite extends QueryTest with SharedSQLContext {
       assert(df.limit(1).collect() === Array(Row("bat", 8.0)))
     }
   }
+
+  test("SPARK-25767: Lazy evaluated stream of expressions handled correctly") {
+    val a = Seq(1).toDF("key")
+    val b = Seq((1, "a")).toDF("key", "value")
+    val c = Seq(1).toDF("key")
+
+    val ab = a.join(b, Stream("key"), "left")
+    val abc = ab.join(c, Seq("key"), "left")
+
+    checkAnswer(abc, Row(1, "a"))
+  }
 }
spark git commit: [SPARK-25767][SQL] Fix lazily evaluated stream of expressions in code generation
Repository: spark
Updated Branches:
  refs/heads/master 409d688fb -> 7fe5cff05

[SPARK-25767][SQL] Fix lazily evaluated stream of expressions in code generation

## What changes were proposed in this pull request?

Code generation is incorrect if the `outputVars` parameter of the `consume` method in `CodegenSupport` contains a lazily evaluated stream of expressions. This PR fixes the issue by forcing the evaluation of `inputVars` before generating the code for UnsafeRow.

## How was this patch tested?

Tested with the sample program provided in https://issues.apache.org/jira/browse/SPARK-25767

Closes #22789 from peter-toth/SPARK-25767.

Authored-by: Peter Toth
Signed-off-by: Herman van Hovell

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7fe5cff0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7fe5cff0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7fe5cff0
Branch: refs/heads/master
Commit: 7fe5cff0581ca9d8221533215098f40f69362018
Parents: 409d688
Author: Peter Toth
Authored: Mon Oct 29 16:47:50 2018 +0100
Committer: Herman van Hovell
Committed: Mon Oct 29 16:47:50 2018 +0100

----------------------------------------------------------------------
 .../spark/sql/execution/WholeStageCodegenExec.scala  |  5 -
 .../spark/sql/execution/WholeStageCodegenSuite.scala | 11 +++
 2 files changed, 15 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/7fe5cff0/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
index f5aee62..5f81b6f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
@@ -146,7 +146,10 @@ trait CodegenSupport extends SparkPlan {
     if (outputVars != null) {
       assert(outputVars.length == output.length)
       // outputVars will be used to generate the code for UnsafeRow, so we should copy them
-      outputVars.map(_.copy())
+      outputVars.map(_.copy()) match {
+        case stream: Stream[ExprCode] => stream.force
+        case other => other
+      }
     } else {
       assert(row != null, "outputVars and row cannot both be null.")
       ctx.currentVars = null

http://git-wip-us.apache.org/repos/asf/spark/blob/7fe5cff0/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
index b714dcd..09ad0fd 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
@@ -319,4 +319,15 @@ class WholeStageCodegenSuite extends QueryTest with SharedSQLContext {
       assert(df.limit(1).collect() === Array(Row("bat", 8.0)))
     }
   }
+
+  test("SPARK-25767: Lazy evaluated stream of expressions handled correctly") {
+    val a = Seq(1).toDF("key")
+    val b = Seq((1, "a")).toDF("key", "value")
+    val c = Seq(1).toDF("key")
+
+    val ab = a.join(b, Stream("key"), "left")
+    val abc = ab.join(c, Seq("key"), "left")
+
+    checkAnswer(abc, Row(1, "a"))
+  }
 }
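[Editor's note] The root cause is easy to reproduce outside Spark: `Stream.map` evaluates only the head eagerly, so a side-effecting transformation (here standing in for code generation) silently skips the tail until something forces it. A minimal standalone sketch of the failure mode and the `force` fix, not Spark code:

object StreamLazinessDemo {
  def main(args: Array[String]): Unit = {
    var evaluated = 0
    val exprs: Seq[Int] = Stream(1, 2, 3)

    // The side effect stands in for emitting generated code per expression.
    val copied = exprs.map { e => evaluated += 1; e + 1 }
    println(evaluated)  // prints 1: only the head of the stream was mapped eagerly

    // Forcing the stream evaluates every element, as the patch above does.
    copied match {
      case s: Stream[_] => s.force
      case _ =>
    }
    println(evaluated)  // prints 3: all side effects have now run
  }
}

Because the value is statically typed as Seq, the caller has no hint that laziness is in play, which is exactly why the fix pattern-matches on Stream at the point of use.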
svn commit: r30498 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_29_08_05-409d688-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 15:19:30 2018
New Revision: 30498

Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_29_08_05-409d688 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50, so it was shortened to this summary.]
spark git commit: [SPARK-25864][SQL][TEST] Make main args accessible for BenchmarkBase's subclass
Repository: spark
Updated Branches:
  refs/heads/master fbaf15050 -> 409d688fb

[SPARK-25864][SQL][TEST] Make main args accessible for BenchmarkBase's subclass

## What changes were proposed in this pull request?

Set the main args correctly in BenchmarkBase, to make them accessible to its subclasses. It will benefit:
- BuiltInDataSourceWriteBenchmark
- AvroWriteBenchmark

## How was this patch tested?

Manual tests.

Closes #22872 from yucai/main_args.

Authored-by: yucai
Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/409d688f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/409d688f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/409d688f
Branch: refs/heads/master
Commit: 409d688fb6169ecbc41f6296c6341ae3ed7d1ec8
Parents: fbaf150
Author: yucai
Authored: Mon Oct 29 20:00:31 2018 +0800
Committer: Wenchen Fan
Committed: Mon Oct 29 20:00:31 2018 +0800

----------------------------------------------------------------------
 .../test/scala/org/apache/spark/benchmark/BenchmarkBase.scala    | 4 ++--
 .../test/scala/org/apache/spark/serializer/KryoBenchmark.scala   | 2 +-
 .../apache/spark/mllib/linalg/UDTSerializationBenchmark.scala    | 2 +-
 .../src/test/scala/org/apache/spark/sql/HashBenchmark.scala      | 2 +-
 .../test/scala/org/apache/spark/sql/HashByteArrayBenchmark.scala | 2 +-
 .../scala/org/apache/spark/sql/UnsafeProjectionBenchmark.scala   | 2 +-
 .../src/test/scala/org/apache/spark/sql/DatasetBenchmark.scala   | 2 +-
 .../spark/sql/execution/benchmark/AggregateBenchmark.scala       | 2 +-
 .../spark/sql/execution/benchmark/BloomFilterBenchmark.scala     | 2 +-
 .../spark/sql/execution/benchmark/DataSourceReadBenchmark.scala  | 2 +-
 .../spark/sql/execution/benchmark/FilterPushdownBenchmark.scala  | 2 +-
 .../org/apache/spark/sql/execution/benchmark/JoinBenchmark.scala | 2 +-
 .../org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala | 2 +-
 .../spark/sql/execution/benchmark/PrimitiveArrayBenchmark.scala  | 2 +-
 .../apache/spark/sql/execution/benchmark/RangeBenchmark.scala    | 2 +-
 .../org/apache/spark/sql/execution/benchmark/SortBenchmark.scala | 2 +-
 .../spark/sql/execution/benchmark/UnsafeArrayDataBenchmark.scala | 2 +-
 .../spark/sql/execution/benchmark/WideSchemaBenchmark.scala      | 2 +-
 .../columnar/compression/CompressionSchemeBenchmark.scala        | 2 +-
 .../spark/sql/execution/vectorized/ColumnarBatchBenchmark.scala  | 2 +-
 .../execution/benchmark/ObjectHashAggregateExecBenchmark.scala   | 2 +-
 .../scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala   | 2 +-
 22 files changed, 23 insertions(+), 23 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/409d688f/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala b/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala
index 89e927e..24e596e 100644
--- a/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala
+++ b/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala
@@ -30,7 +30,7 @@ abstract class BenchmarkBase {
    * Implementations of this method are supposed to use the wrapper method `runBenchmark`
    * for each benchmark scenario.
    */
-  def runBenchmarkSuite(): Unit
+  def runBenchmarkSuite(mainArgs: Array[String]): Unit

   final def runBenchmark(benchmarkName: String)(func: => Any): Unit = {
     val separator = "=" * 96
@@ -51,7 +51,7 @@ abstract class BenchmarkBase {
       output = Some(new FileOutputStream(file))
     }

-    runBenchmarkSuite()
+    runBenchmarkSuite(args)

     output.foreach { o =>
       if (o != null) {

http://git-wip-us.apache.org/repos/asf/spark/blob/409d688f/core/src/test/scala/org/apache/spark/serializer/KryoBenchmark.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/serializer/KryoBenchmark.scala b/core/src/test/scala/org/apache/spark/serializer/KryoBenchmark.scala
index 8a52c13..d7730f2 100644
--- a/core/src/test/scala/org/apache/spark/serializer/KryoBenchmark.scala
+++ b/core/src/test/scala/org/apache/spark/serializer/KryoBenchmark.scala
@@ -39,7 +39,7 @@ import org.apache.spark.serializer.KryoTest._
 object KryoBenchmark extends BenchmarkBase {
   val N = 100

-  override def runBenchmarkSuite(): Unit = {
+  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
     val name = "Benchmark Kryo Unsafe vs safe Serialization"
     runBenchmark(name) {
       val benchmark = new Benchmark(name, N, 10, output = output)

http://git-wip-us.apache.org/repos/asf/spark/blob/409d688f/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala
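[Editor's note] A self-contained sketch of what this change enables. The base class is stubbed here and the subclass name is illustrative, not Spark's: a benchmark object can now branch on the command-line arguments that `main` receives, for example to run only the data-source formats named on the command line.

// Minimal stub of the BenchmarkBase contract shown in the diff above.
abstract class BenchmarkBaseStub {
  def runBenchmarkSuite(mainArgs: Array[String]): Unit

  final def runBenchmark(benchmarkName: String)(func: => Any): Unit = {
    println("=" * 40)
    println(benchmarkName)
    println("=" * 40)
    func
  }

  // main passes its args through, which is the essence of the change.
  final def main(args: Array[String]): Unit = runBenchmarkSuite(args)
}

// Hypothetical subclass: restrict the run to formats given as arguments.
object ToyWriteBenchmark extends BenchmarkBaseStub {
  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
    val formats = if (mainArgs.isEmpty) Seq("parquet", "orc", "json") else mainArgs.toSeq
    formats.foreach { fmt =>
      runBenchmark(s"$fmt writer") {
        // ... set up and run the benchmark for this format ...
      }
    }
  }
}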
svn commit: r30489 - in /dev/spark/2.4.1-SNAPSHOT-2018_10_29_02_03-22bec3c-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 09:19:09 2018
New Revision: 30489

Log: Apache Spark 2.4.1-SNAPSHOT-2018_10_29_02_03-22bec3c docs

[This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]
svn commit: r30488 - in /dev/spark/2.3.3-SNAPSHOT-2018_10_29_02_02-632c0d9-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 09:17:42 2018
New Revision: 30488

Log: Apache Spark 2.3.3-SNAPSHOT-2018_10_29_02_02-632c0d9 docs

[This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]
svn commit: r30485 - in /dev/spark/v2.4.0-rc5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark
Author: wenchen
Date: Mon Oct 29 07:29:17 2018
New Revision: 30485

Log: Apache Spark v2.4.0-rc5 docs

[This commit notification would consist of 1479 parts, which exceeds the limit of 50, so it was shortened to this summary.]
svn commit: r30484 - in /dev/spark/3.0.0-SNAPSHOT-2018_10_29_00_03-fbaf150-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Mon Oct 29 07:18:06 2018
New Revision: 30484

Log: Apache Spark 3.0.0-SNAPSHOT-2018_10_29_00_03-fbaf150 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50, so it was shortened to this summary.]
svn commit: r30479 - /dev/spark/v2.4.0-rc5-bin/
Author: wenchen
Date: Mon Oct 29 07:10:51 2018
New Revision: 30479

Log: Apache Spark v2.4.0-rc5

Added:
    dev/spark/v2.4.0-rc5-bin/
    dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz   (with props)
    dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.asc
    dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.sha512
    dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz   (with props)
    dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz.asc
    dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz.sha512
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.6.tgz   (with props)
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.6.tgz.asc
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.6.tgz.sha512
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.7.tgz   (with props)
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.7.tgz.asc
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-hadoop2.7.tgz.sha512
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz   (with props)
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz.asc
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop-scala-2.12.tgz.sha512
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop.tgz   (with props)
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop.tgz.asc
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0-bin-without-hadoop.tgz.sha512
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0.tgz   (with props)
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0.tgz.asc
    dev/spark/v2.4.0-rc5-bin/spark-2.4.0.tgz.sha512

Added: dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.asc
==============================================================================
--- dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.asc (added)
+++ dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.asc Mon Oct 29 07:10:51 2018
@@ -0,0 +1,17 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJb1qptAAoJEGuscolPT9yKjC8P/jLUuT2yPoRev6BYHyFy6Wx2
+TunYnpp3tYck4tUWvh90Jq3TzMjebo96qFI3D1etOeYRIFnbTYlkDbqn85A58nR/
+bG15JIcd4DkU5JZIImaGxwnAKqEURpzi4WKkgYXU1ZWOJYzlmgt6MWVdTGoD+bEB
+OoAG8A6qrSVaxSDSMeQOneCfWC7p19IP3/K02zmtrV/eoIjx2B68Hm+zc6S09qeT
+VHaBthX1sKU+stPDRSaH+ppo8N1UI6H6dkTJ3/04a3a1ExQo3m56CE1A08lSAasW
+mW0zKcyzaZkOzWO+A4d9F1R5zRbgk3jSXL83Nnp3Eoc8SPc2qpy4x4fx0rzjp+L3
+Ar+GrKwtoQe+SI7JyOZdtCdrRGnIPOFlGBKhC5l7VnYPI1hBtaj3daManj6bZpVj
+BUiUEDSQeoIpSHC1b5oFXwpqyrJFxrVXvzXbENZHAxLZ48/wtiTM0SW+IRo+Pkd6
+AV9Sbg1MbuqPhSpcit4irBO5pYjPDr7JUcoItd/LDFIv4YWmroMr9LdHZMfc8mN3
+U2440EM++b4K70P3/MBS9lOkR5coARr/A7oF5nC0ztz0/ItwwPWVsYja6AxDUrMz
+5fV+8Gf7jYvEK+/OQ7xQ97QNJJNCZAEGODc8qXuhol1jVkg4+4SGa4JeZnRoLUpZ
+jyr0uumTzK9eUW/mUPt3
+=GMSq
+-----END PGP SIGNATURE-----

Added: dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.sha512
==============================================================================
--- dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.sha512 (added)
+++ dev/spark/v2.4.0-rc5-bin/SparkR_2.4.0.tar.gz.sha512 Mon Oct 29 07:10:51 2018
@@ -0,0 +1,3 @@
+SparkR_2.4.0.tar.gz: 2D8D5DED E396D8CD C958BE33 C108C663 5CA15B4B 185C3B2C
+                     954A87A7 2A98DA76 150C2BE9 3F871DC7 992F5B53 4A72B7F0
+                     DECB4CAF 45A52C38 7F269361 22B51630

Added: dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz.asc
==============================================================================
--- dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz.asc (added)
+++ dev/spark/v2.4.0-rc5-bin/pyspark-2.4.0.tar.gz.asc Mon Oct 29 07:10:51 2018
@@ -0,0 +1,17 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJb1qptAAoJEGuscolPT9yKHncP/3/sjzafE/pNVca58RZZXSGv
+n6bawYnXa7K+lJIiLFKO6IBVlghiaSBpnRmyp6+8y7Ys3YBv0efys6H5use/RnXy
+pcZLw9MFjqf1D67EF9a2IouuKPoMw20xuw7gz8s1WNmS3fbeym1+FktnwhyPLdXT
+S6m7a3QHd8SR62WnL7MobcWulp+yvvBviFBoA46di9AvczVT3SHT5hi1X8MW1WDs
+qb9jCJJYuHVr0omcML/szz3lPTxUfC8phiepWix7HFf0Pr9FjVzR58R6up0W3uxa
+pkhuPIQ8LN4YsBag7vE8hITSienNVVMrAA8wHUR1DqcaKesyZFtU7JUxP/PoNGK3
+u5oK3xrAacUiYQMznbLakjav1A7WXjlKSb0lPiQQP6pK/WOmxohvpSoptbm4mJvo
+AeEkRvbRHVoDK948SKmnNCT1qvyknpiGWeOenDvH897hIqUvkU2YDcRqJtjh8bFY
+Lw8JZyTF9Uzgomp/40l9b/qTBveXu2Rd2FsIzVSVcHgq/uAYpeJEp5IIa3Ab2V6z
+i30CX2WQB3+MjuqBvt6KjSBCAUu9fhYBaWLSnsrxmabjr3FRCXLjYV9qZsX2RWvQ
+vc7ixkzQRnK+ZcgRx8UYrbt6jSurylt1W/+vc0y2p2NcKgILyj75YqnJh9DKI3j9
+08FCbwpz+2WxEOp3ug6a
+=2edh
+-----END PGP SIGNATURE-----

Added:
[spark] Git Push Summary
Repository: spark
Updated Tags:
  refs/tags/v2.4.0-rc5 [created] 0a4c03f7d
[2/2] spark git commit: Preparing development version 2.4.1-SNAPSHOT
Preparing development version 2.4.1-SNAPSHOT

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/22bec3c6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/22bec3c6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/22bec3c6
Branch: refs/heads/branch-2.4
Commit: 22bec3c6dab1147eee0342993aa8f64202603a8d
Parents: 0a4c03f
Author: Wenchen Fan
Authored: Mon Oct 29 06:15:33 2018 +0000
Committer: Wenchen Fan
Committed: Mon Oct 29 06:15:33 2018 +0000

----------------------------------------------------------------------
 R/pkg/DESCRIPTION                                      | 2 +-
 assembly/pom.xml                                       | 2 +-
 common/kvstore/pom.xml                                 | 2 +-
 common/network-common/pom.xml                          | 2 +-
 common/network-shuffle/pom.xml                         | 2 +-
 common/network-yarn/pom.xml                            | 2 +-
 common/sketch/pom.xml                                  | 2 +-
 common/tags/pom.xml                                    | 2 +-
 common/unsafe/pom.xml                                  | 2 +-
 core/pom.xml                                           | 2 +-
 docs/_config.yml                                       | 4 ++--
 examples/pom.xml                                       | 2 +-
 external/avro/pom.xml                                  | 2 +-
 external/docker-integration-tests/pom.xml              | 2 +-
 external/flume-assembly/pom.xml                        | 2 +-
 external/flume-sink/pom.xml                            | 2 +-
 external/flume/pom.xml                                 | 2 +-
 external/kafka-0-10-assembly/pom.xml                   | 2 +-
 external/kafka-0-10-sql/pom.xml                        | 2 +-
 external/kafka-0-10/pom.xml                            | 2 +-
 external/kafka-0-8-assembly/pom.xml                    | 2 +-
 external/kafka-0-8/pom.xml                             | 2 +-
 external/kinesis-asl-assembly/pom.xml                  | 2 +-
 external/kinesis-asl/pom.xml                           | 2 +-
 external/spark-ganglia-lgpl/pom.xml                    | 2 +-
 graphx/pom.xml                                         | 2 +-
 hadoop-cloud/pom.xml                                   | 2 +-
 launcher/pom.xml                                       | 2 +-
 mllib-local/pom.xml                                    | 2 +-
 mllib/pom.xml                                          | 2 +-
 pom.xml                                                | 2 +-
 python/pyspark/version.py                              | 2 +-
 repl/pom.xml                                           | 2 +-
 resource-managers/kubernetes/core/pom.xml              | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml                        | 2 +-
 resource-managers/yarn/pom.xml                         | 2 +-
 sql/catalyst/pom.xml                                   | 2 +-
 sql/core/pom.xml                                       | 2 +-
 sql/hive-thriftserver/pom.xml                          | 2 +-
 sql/hive/pom.xml                                       | 2 +-
 streaming/pom.xml                                      | 2 +-
 tools/pom.xml                                          | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/22bec3c6/R/pkg/DESCRIPTION
----------------------------------------------------------------------
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index f52d785..714b6f1 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.4.0
+Version: 2.4.1
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/22bec3c6/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 63ab510..ee0de73 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-parent_2.11</artifactId>
-  <version>2.4.0</version>
+  <version>2.4.1-SNAPSHOT</version>
   <relativePath>../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/22bec3c6/common/kvstore/pom.xml
----------------------------------------------------------------------
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index b10e118..b89e0fe 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-parent_2.11</artifactId>
-  <version>2.4.0</version>
+  <version>2.4.1-SNAPSHOT</version>
   <relativePath>../../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/22bec3c6/common/network-common/pom.xml
[1/2] spark git commit: Preparing Spark release v2.4.0-rc5
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 7f4fce426 -> 22bec3c6d

Preparing Spark release v2.4.0-rc5

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0a4c03f7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0a4c03f7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0a4c03f7
Branch: refs/heads/branch-2.4
Commit: 0a4c03f7d084f1d2aa48673b99f3b9496893ce8d
Parents: 7f4fce4
Author: Wenchen Fan
Authored: Mon Oct 29 06:15:29 2018 +0000
Committer: Wenchen Fan
Committed: Mon Oct 29 06:15:29 2018 +0000

----------------------------------------------------------------------
 R/pkg/DESCRIPTION                                      | 2 +-
 assembly/pom.xml                                       | 2 +-
 common/kvstore/pom.xml                                 | 2 +-
 common/network-common/pom.xml                          | 2 +-
 common/network-shuffle/pom.xml                         | 2 +-
 common/network-yarn/pom.xml                            | 2 +-
 common/sketch/pom.xml                                  | 2 +-
 common/tags/pom.xml                                    | 2 +-
 common/unsafe/pom.xml                                  | 2 +-
 core/pom.xml                                           | 2 +-
 docs/_config.yml                                       | 4 ++--
 examples/pom.xml                                       | 2 +-
 external/avro/pom.xml                                  | 2 +-
 external/docker-integration-tests/pom.xml              | 2 +-
 external/flume-assembly/pom.xml                        | 2 +-
 external/flume-sink/pom.xml                            | 2 +-
 external/flume/pom.xml                                 | 2 +-
 external/kafka-0-10-assembly/pom.xml                   | 2 +-
 external/kafka-0-10-sql/pom.xml                        | 2 +-
 external/kafka-0-10/pom.xml                            | 2 +-
 external/kafka-0-8-assembly/pom.xml                    | 2 +-
 external/kafka-0-8/pom.xml                             | 2 +-
 external/kinesis-asl-assembly/pom.xml                  | 2 +-
 external/kinesis-asl/pom.xml                           | 2 +-
 external/spark-ganglia-lgpl/pom.xml                    | 2 +-
 graphx/pom.xml                                         | 2 +-
 hadoop-cloud/pom.xml                                   | 2 +-
 launcher/pom.xml                                       | 2 +-
 mllib-local/pom.xml                                    | 2 +-
 mllib/pom.xml                                          | 2 +-
 pom.xml                                                | 2 +-
 python/pyspark/version.py                              | 2 +-
 repl/pom.xml                                           | 2 +-
 resource-managers/kubernetes/core/pom.xml              | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml                        | 2 +-
 resource-managers/yarn/pom.xml                         | 2 +-
 sql/catalyst/pom.xml                                   | 2 +-
 sql/core/pom.xml                                       | 2 +-
 sql/hive-thriftserver/pom.xml                          | 2 +-
 sql/hive/pom.xml                                       | 2 +-
 streaming/pom.xml                                      | 2 +-
 tools/pom.xml                                          | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/0a4c03f7/R/pkg/DESCRIPTION
----------------------------------------------------------------------
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 714b6f1..f52d785 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.4.1
+Version: 2.4.0
 Title: R Frontend for Apache Spark
 Description: Provides an R Frontend for Apache Spark.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),

http://git-wip-us.apache.org/repos/asf/spark/blob/0a4c03f7/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index ee0de73..63ab510 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-parent_2.11</artifactId>
-  <version>2.4.1-SNAPSHOT</version>
+  <version>2.4.0</version>
   <relativePath>../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/0a4c03f7/common/kvstore/pom.xml
----------------------------------------------------------------------
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index b89e0fe..b10e118 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-parent_2.11</artifactId>
-  <version>2.4.1-SNAPSHOT</version>
+  <version>2.4.0</version>
   <relativePath>../../pom.xml</relativePath>

http://git-wip-us.apache.org/repos/asf/spark/blob/0a4c03f7/common/network-common/pom.xml
spark git commit: [SPARK-25179][PYTHON][DOCS] Document BinaryType support in Arrow conversion
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 b6ba0dd47 -> 7f4fce426

[SPARK-25179][PYTHON][DOCS] Document BinaryType support in Arrow conversion

## What changes were proposed in this pull request?

This PR documents binary type support in "Apache Arrow in Spark".

## How was this patch tested?

Manually built the documentation and checked.

Closes #22871 from HyukjinKwon/SPARK-25179.

Authored-by: hyukjinkwon
Signed-off-by: gatorsmile
(cherry picked from commit fbaf150507a289ec0ac02fdbf4009c42cd9bc164)
Signed-off-by: gatorsmile

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7f4fce42
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7f4fce42
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7f4fce42
Branch: refs/heads/branch-2.4
Commit: 7f4fce426025d54f41d8e87928582563a8ad689e
Parents: b6ba0dd
Author: hyukjinkwon
Authored: Sun Oct 28 23:01:35 2018 -0700
Committer: gatorsmile
Committed: Sun Oct 28 23:02:09 2018 -0700

----------------------------------------------------------------------
 docs/sql-pyspark-pandas-with-arrow.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/7f4fce42/docs/sql-pyspark-pandas-with-arrow.md
----------------------------------------------------------------------
diff --git a/docs/sql-pyspark-pandas-with-arrow.md b/docs/sql-pyspark-pandas-with-arrow.md
index e8e9f55..d04b955 100644
--- a/docs/sql-pyspark-pandas-with-arrow.md
+++ b/docs/sql-pyspark-pandas-with-arrow.md
@@ -127,8 +127,9 @@ For detailed usage, please see [`pyspark.sql.functions.pandas_udf`](api/python/p

 ### Supported SQL Types

-Currently, all Spark SQL data types are supported by Arrow-based conversion except `BinaryType`, `MapType`,
-`ArrayType` of `TimestampType`, and nested `StructType`.
+Currently, all Spark SQL data types are supported by Arrow-based conversion except `MapType`,
+`ArrayType` of `TimestampType`, and nested `StructType`. `BinaryType` is supported only when
+installed PyArrow is equal to or higher then 0.10.0.

 ### Setting Arrow Batch Size
spark git commit: [SPARK-25179][PYTHON][DOCS] Document BinaryType support in Arrow conversion
Repository: spark
Updated Branches:
  refs/heads/master 4e990d9dd -> fbaf15050

[SPARK-25179][PYTHON][DOCS] Document BinaryType support in Arrow conversion

## What changes were proposed in this pull request?

This PR documents binary type support in "Apache Arrow in Spark".

## How was this patch tested?

Manually built the documentation and checked.

Closes #22871 from HyukjinKwon/SPARK-25179.

Authored-by: hyukjinkwon
Signed-off-by: gatorsmile

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fbaf1505
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fbaf1505
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fbaf1505
Branch: refs/heads/master
Commit: fbaf150507a289ec0ac02fdbf4009c42cd9bc164
Parents: 4e990d9
Author: hyukjinkwon
Authored: Sun Oct 28 23:01:35 2018 -0700
Committer: gatorsmile
Committed: Sun Oct 28 23:01:35 2018 -0700

----------------------------------------------------------------------
 docs/sql-pyspark-pandas-with-arrow.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/fbaf1505/docs/sql-pyspark-pandas-with-arrow.md
----------------------------------------------------------------------
diff --git a/docs/sql-pyspark-pandas-with-arrow.md b/docs/sql-pyspark-pandas-with-arrow.md
index e8e9f55..d04b955 100644
--- a/docs/sql-pyspark-pandas-with-arrow.md
+++ b/docs/sql-pyspark-pandas-with-arrow.md
@@ -127,8 +127,9 @@ For detailed usage, please see [`pyspark.sql.functions.pandas_udf`](api/python/p

 ### Supported SQL Types

-Currently, all Spark SQL data types are supported by Arrow-based conversion except `BinaryType`, `MapType`,
-`ArrayType` of `TimestampType`, and nested `StructType`.
+Currently, all Spark SQL data types are supported by Arrow-based conversion except `MapType`,
+`ArrayType` of `TimestampType`, and nested `StructType`. `BinaryType` is supported only when
+installed PyArrow is equal to or higher then 0.10.0.

 ### Setting Arrow Batch Size