[GitHub] spark pull request #21236: [SPARK-23935][SQL] Adding map_entries function
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21236#discussion_r186371562

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -118,6 +118,162 @@ case class MapValues(child: Expression)
   override def prettyName: String = "map_values"
 }

+/**
+ * Returns an unordered array of all entries in the given map.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(map) - Returns an unordered array of all entries in the given map.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(map(1, 'a', 2, 'b'));
+       [(1,"a"),(2,"b")]
+  """,
+  since = "2.4.0")
+case class MapEntries(child: Expression) extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(MapType)
+
+  lazy val childDataType: MapType = child.dataType.asInstanceOf[MapType]
+
+  override def dataType: DataType = {
+    ArrayType(
+      StructType(
+        StructField("key", childDataType.keyType, false) ::
+        StructField("value", childDataType.valueType, childDataType.valueContainsNull) ::
+        Nil),
+      false)
+  }
+
+  override protected def nullSafeEval(input: Any): Any = {
+    val childMap = input.asInstanceOf[MapData]
+    val keys = childMap.keyArray()
+    val values = childMap.valueArray()
+    val length = childMap.numElements()
+    val resultData = new Array[AnyRef](length)
+    var i = 0;
+    while (i < length) {
+      val key = keys.get(i, childDataType.keyType)
+      val value = values.get(i, childDataType.valueType)
+      val row = new GenericInternalRow(Array[Any](key, value))
+      resultData.update(i, row)
+      i += 1
+    }
+    new GenericArrayData(resultData)
+  }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(ctx, ev, c => {
+      val numElements = ctx.freshName("numElements")
+      val keys = ctx.freshName("keys")
+      val values = ctx.freshName("values")
+      val isKeyPrimitive = CodeGenerator.isPrimitiveType(childDataType.keyType)
+      val isValuePrimitive = CodeGenerator.isPrimitiveType(childDataType.valueType)
+      val code = if (isKeyPrimitive && isValuePrimitive) {
+        genCodeForPrimitiveElements(ctx, keys, values, ev.value, numElements)
+      } else {
+        genCodeForAnyElements(ctx, keys, values, ev.value, numElements)
+      }
+      s"""
+         |final int $numElements = $c.numElements();
+         |final ArrayData $keys = $c.keyArray();
+         |final ArrayData $values = $c.valueArray();
+         |$code
+       """.stripMargin
+    })
+  }
+
+  private def getKey(varName: String) = CodeGenerator.getValue(varName, childDataType.keyType, "z")
+
+  private def getValue(varName: String) = {
+    CodeGenerator.getValue(varName, childDataType.valueType, "z")
+  }
+
+  private def genCodeForPrimitiveElements(
+      ctx: CodegenContext,
+      keys: String,
+      values: String,
+      arrayData: String,
+      numElements: String): String = {
+    val byteArraySize = ctx.freshName("byteArraySize")
+    val data = ctx.freshName("byteArray")
+    val unsafeRow = ctx.freshName("unsafeRow")
+    val structSize = ctx.freshName("structSize")
+    val unsafeArrayData = ctx.freshName("unsafeArrayData")
+    val structsOffset = ctx.freshName("structsOffset")
+    val calculateArraySize = "UnsafeArrayData.calculateSizeOfUnderlyingByteArray"
+    val calculateHeader = "UnsafeArrayData.calculateHeaderPortionInBytes"
+
+    val baseOffset = Platform.BYTE_ARRAY_OFFSET
+    val longSize = LongType.defaultSize
+    val keyTypeName = CodeGenerator.primitiveTypeName(childDataType.keyType)
+    val valueTypeName = CodeGenerator.primitiveTypeName(childDataType.keyType)
+
+    val valueAssignment = s"$unsafeRow.set$valueTypeName(1, ${getValue(values)});"
+    val valueAssignmentChecked = if (childDataType.valueContainsNull) {
+      s"""
+         |if ($values.isNullAt(z)) {
+         |  $unsafeRow.setNullAt(1);
+         |} else {
+         |  $valueAssignment
+         |}
+       """.stripMargin
+    } else {
+      valueAssignment
+    }
+
+    s"""
+       |final int $structSize = ${UnsafeRow.calculateBitSetWidthInBytes(2) + longSize * 2};
--- End diff --

We can calculate `structSize` beforehand and inline it?
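A minimal sketch of the inlining being suggested, reusing names from the diff above; the `pointTo` call is a hypothetical use site, not the PR's actual code:

```scala
// The entry struct always has two word-aligned fields, so its size is a
// codegen-time constant: compute it once in Scala and splice the literal
// into the generated Java instead of declaring a `final int` local.
val structSize = UnsafeRow.calculateBitSetWidthInBytes(2) + longSize * 2
s"""
   |$unsafeRow.pointTo($data, $baseOffset + $structsOffset + z * $structSize, $structSize);
 """.stripMargin
```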
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90308/ Test FAILed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Merged build finished. Test FAILed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21028

**[Test build #90308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90308/testReport)** for PR 21028 at commit [`4e37975`](https://github.com/apache/spark/commit/4e37975ba3ce361009a83d248ad7d0b758f86f4c).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21254: [SPARK-23094][SPARK-23723][SPARK-23724][SQL][FOLLOW-UP] ...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21254

> Do we have any behavior change after the previous PR: #20937?

The PR brought the `encoding` (and `charset`) option, but we didn't change behavior when `encoding` is not specified. As @HyukjinKwon wrote above, the PR #21247 eliminates restrictions in write, but those restrictions don't break the previous behavior (before #20937) in any case.
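For context, a hedged sketch of the option under discussion; the path and charset here are illustrative only:

```scala
// Reading JSON with an explicit encoding. When the option is omitted,
// the pre-#20937 auto-detection behavior applies, so defaults are unchanged.
val df = spark.read
  .option("encoding", "UTF-16LE")
  .json("/tmp/events-utf16.json")
```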
[GitHub] spark pull request #14083: [SPARK-16406][SQL] Improve performance of Logical...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14083
[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14083 Merging to master. Thanks for all the reviews!
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21250 Merged build finished. Test PASSed.
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90310/ Test PASSed.
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21250

**[Test build #90310 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90310/testReport)** for PR 21250 at commit [`dd6c329`](https://github.com/apache/spark/commit/dd6c329733924a4fe625473593c7a87b90f2280e).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14083 Done.
[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186368498

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,21 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))

-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
--- End diff --

Seems we don't have a test for `reverse` on strings. Can you add one for it too?
[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186369103

--- Diff: R/pkg/R/functions.R ---
@@ -209,6 +209,7 @@ NULL
 #' head(select(tmp, array_max(tmp$v1), array_min(tmp$v1)))
 #' head(select(tmp, array_position(tmp$v1, 21)))
 #' head(select(tmp, flatten(tmp$v1)))
+#' head(select(tmp, reverse(tmp$v1)))
--- End diff --

Also add `concat` here?
[GitHub] spark pull request #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migratio...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21249#discussion_r186365513

--- Diff: docs/sparkr.md ---
@@ -664,6 +664,6 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
 - A warning can be raised if versions of SparkR package and the Spark JVM do not match.

-## Upgrading to Spark 2.4.0
+## Upgrading to SparkR 2.3.1 and above

- - The `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. It has been fixed so the `start` parameter of `substr` method is now 1-base, e.g., therefore to get the same result as `substr(df$a, 2, 5)`, it should be changed to `substr(df$a, 1, 4)`.
+ - In SparkR 2.3.0 and earlier, the `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. In version 2.3.1 and later, it has been fixed so the `start` parameter of `substr` method is now 1-base. As an example, `substr(lit('abcdef'), 2, 4))` would result to `abc` in SparkR 2.3.0, and the result would be `bcd` in SparkR 2.3.1.
--- End diff --

ok. :)
[GitHub] spark pull request #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migratio...
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21249#discussion_r186364537

--- Diff: docs/sparkr.md ---
@@ -664,6 +664,6 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
 - A warning can be raised if versions of SparkR package and the Spark JVM do not match.

-## Upgrading to Spark 2.4.0
+## Upgrading to SparkR 2.3.1 and above

- - The `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. It has been fixed so the `start` parameter of `substr` method is now 1-base, e.g., therefore to get the same result as `substr(df$a, 2, 5)`, it should be changed to `substr(df$a, 1, 4)`.
+ - In SparkR 2.3.0 and earlier, the `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. In version 2.3.1 and later, it has been fixed so the `start` parameter of `substr` method is now 1-base. As an example, `substr(lit('abcdef'), 2, 4))` would result to `abc` in SparkR 2.3.0, and the result would be `bcd` in SparkR 2.3.1.
--- End diff --

I think it's fine since it's an example ...
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21249 Merged build finished. Test PASSed.
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90309/ Test PASSed.
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21249

**[Test build #90309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90309/testReport)** for PR 21249 at commit [`6c4743a`](https://github.com/apache/spark/commit/6c4743a8f33138431c2f3ce3ddd9f2512d72bc66).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21250 Merged build finished. Test PASSed.
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2993/ Test PASSed.
[GitHub] spark pull request #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migratio...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21249#discussion_r186361818

--- Diff: docs/sparkr.md ---
@@ -664,6 +664,6 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
 - A warning can be raised if versions of SparkR package and the Spark JVM do not match.

-## Upgrading to Spark 2.4.0
+## Upgrading to SparkR 2.3.1 and above

- - The `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. It has been fixed so the `start` parameter of `substr` method is now 1-base, e.g., therefore to get the same result as `substr(df$a, 2, 5)`, it should be changed to `substr(df$a, 1, 4)`.
+ - In SparkR 2.3.0 and earlier, the `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. In version 2.3.1 and later, it has been fixed so the `start` parameter of `substr` method is now 1-base. As an example, `substr(lit('abcdef'), 2, 4))` would result to `abc` in SparkR 2.3.0, and the result would be `bcd` in SparkR 2.3.1.
--- End diff --

nit: ```the result would be `bcd` in SparkR 2.3.1 and above.```
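For readers cross-checking the 1-based semantics, SQL's `substring(str, pos, len)` (1-based start, length rather than stop index) gives a reference point; this sketch assumes a SparkSession `spark` in scope:

```scala
// pos = 2, len = 3 picks "bcd" — the same result the fixed SparkR
// substr(lit('abcdef'), 2, 4) produces in 2.3.1 and above.
spark.sql("SELECT substring('abcdef', 2, 3)").show()
```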
[GitHub] spark pull request #21193: [SPARK-24121][SQL] Add API for handling expressio...
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/21193#discussion_r186361480

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/javaCode.scala ---
@@ -112,6 +112,112 @@ object JavaCode {
   def isNullExpression(code: String): SimpleExprValue = {
     expression(code, BooleanType)
   }
+
+  def block(code: String): Block = {
+    CodeBlock(codeParts = Seq(code), blockInputs = Seq.empty)
+  }
+}
+
+/**
+ * A trait representing a block of java code.
+ */
+trait Block extends JavaCode {
+
+  // The expressions to be evaluated inside this block.
+  def exprValues: Seq[ExprValue]
+
+  // This will be called during string interpolation.
+  override def toString: String = _marginChar match {
+    case Some(c) => code.stripMargin(c)
+    case _ => code
+  }
+
+  var _marginChar: Option[Char] = None
+
+  def stripMargin(c: Char): this.type = {
+    _marginChar = Some(c)
+    this
+  }
+
+  def stripMargin: this.type = {
+    _marginChar = Some('|')
+    this
+  }
+
+  def + (other: Block): Block
+}
+
+object Block {
+  implicit def blockToString(block: Block): String = block.toString
+
+  implicit def blocksToBlock(blocks: Seq[Block]): Block = Blocks(blocks)
+
+  implicit class BlockHelper(val sc: StringContext) extends AnyVal {
+    def code(args: Any*): Block = {
+      sc.checkLengths(args)
+      if (sc.parts.length == 0) {
+        EmptyBlock
+      } else {
+        args.foreach {
+          case _: ExprValue =>
+          case _: Int | _: Long | _: Float | _: Double | _: String =>
+          case _: Block =>
+          case other => throw new IllegalArgumentException(
+            s"Can not interpolate ${other.getClass.getName} into code block.")
--- End diff --

+10
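A hedged usage sketch of the `code` interpolator being added above; `JavaCode.isNullVariable` and `JavaCode.variable` are assumed to be factories from the same file:

```scala
// Splicing ExprValues into a code block keeps them trackable via
// `exprValues`; interpolating any other type throws IllegalArgumentException.
val isNull = JavaCode.isNullVariable("isNull_0")
val value = JavaCode.variable("value_0", IntegerType)
val block: Block =
  code"""
       |boolean $isNull = false;
       |int $value = 42;
     """.stripMargin
```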
[GitHub] spark issue #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21250 **[Test build #90310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90310/testReport)** for PR 21250 at commit [`dd6c329`](https://github.com/apache/spark/commit/dd6c329733924a4fe625473593c7a87b90f2280e).
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21249 Merged build finished. Test PASSed.
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2992/ Test PASSed.
[GitHub] spark issue #21249: [SPARK-23291][R][FOLLOWUP] Update SparkR migration note ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21249 **[Test build #90309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90309/testReport)** for PR 21249 at commit [`6c4743a`](https://github.com/apache/spark/commit/6c4743a8f33138431c2f3ce3ddd9f2512d72bc66).
[GitHub] spark pull request #21193: [SPARK-24121][SQL] Add API for handling expressio...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21193#discussion_r186359947

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/ExprValueSuite.scala ---
@@ -17,6 +17,8 @@

 package org.apache.spark.sql.catalyst.expressions.codegen

+import scala.collection.mutable
--- End diff --

My bad, I forgot to remove it.
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21255 Merged build finished. Test PASSed.
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21255

**[Test build #90304 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90304/testReport)** for PR 21255 at commit [`3985285`](https://github.com/apache/spark/commit/3985285089673e42a85a5d1ba3cd7419a6948909).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21255 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90304/ Test PASSed.
[GitHub] spark pull request #21185: [SPARK-23894][CORE][SQL] Defensively clear Active...
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21185#discussion_r186358733

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -229,6 +229,23 @@ private[spark] class Executor(
     ManagementFactory.getGarbageCollectorMXBeans.asScala.map(_.getCollectionTime).sum
   }

+  /**
+   * Only in local mode, we have to prevent the driver from setting the active SparkSession
+   * in the executor threads. See SPARK-23894.
+   */
+  private lazy val clearActiveSparkSessionMethod = if (Utils.isLocalMaster(conf)) {
--- End diff --

I've added this check in https://github.com/apache/spark/pull/21190
[GitHub] spark pull request #21070: [SPARK-23972][BUILD][SQL] Update Parquet to 1.10....
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21070#discussion_r186358096

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ---
@@ -63,115 +59,157 @@ public final void readBooleans(int total, WritableColumnVector c, int rowId) {
     }
   }

+  private ByteBuffer getBuffer(int length) {
+    try {
+      return in.slice(length).order(ByteOrder.LITTLE_ENDIAN);
--- End diff --

Previously we only called `.order(ByteOrder.LITTLE_ENDIAN)` if it's a big-endian platform. Is it OK to always call it?
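On the endianness question, a small self-contained check: `ByteBuffer.order` only sets how later reads interpret multi-byte values and performs no copy, so calling it unconditionally should be behaviorally safe:

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Bytes 01 00 00 00 decode to the int 1 under little-endian order,
// regardless of the host platform's native endianness.
val buf = ByteBuffer.wrap(Array[Byte](1, 0, 0, 0)).order(ByteOrder.LITTLE_ENDIAN)
assert(buf.getInt == 1)
```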
[GitHub] spark pull request #21208: [SPARK-23925][SQL] Add array_repeat collection fu...
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21208#discussion_r186357798

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -1229,3 +1229,140 @@ case class Flatten(child: Expression) extends UnaryExpression {
   override def prettyName: String = "flatten"
 }
+
+/**
+ * Returns the array containing the given input value (left) count (right) times.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(element, count) - Returns the array containing element count times.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_('123', 2);
+       ['123', '123']
+  """)
+case class ArrayRepeat(left: Expression, right: Expression)
+  extends BinaryExpression with ExpectsInputTypes {
+
+  override def dataType: ArrayType = ArrayType(left.dataType, left.nullable)
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType, IntegerType)
+
+  override def nullable: Boolean = right.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val count = right.eval(input)
+    if (count == null) {
+      null
+    } else {
+      new GenericArrayData(List.fill(count.asInstanceOf[Int])(left.eval(input)))
+    }
+  }
+
+  override def prettyName: String = "array_repeat"
+
+  override def nullSafeCodeGen(ctx: CodegenContext,
+      ev: ExprCode,
+      f: (String, String) => String): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val resultCode = f(leftGen.value, rightGen.value)
+
+    if (nullable) {
+      val nullSafeEval =
+        leftGen.code +
+        rightGen.code + ctx.nullSafeExec(right.nullable, rightGen.isNull) {
+          s"""
+            ${ev.isNull} = false;
+            $resultCode
+          """
+        }
+
+      ev.copy(code =
+        s"""
+           | boolean ${ev.isNull} = true;
+           | ${CodeGenerator.javaType(dataType)} ${ev.value} =
+           |   ${CodeGenerator.defaultValue(dataType)};
+           | $nullSafeEval
+         """.stripMargin
+      )
+    } else {
+      ev.copy(code =
+        s"""
+           | boolean ${ev.isNull} = false;
+           | ${leftGen.code}
+           | ${rightGen.code}
+           | ${CodeGenerator.javaType(dataType)} ${ev.value} =
+           |   ${CodeGenerator.defaultValue(dataType)};
+           | $resultCode
+         """.stripMargin, isNull = FalseLiteral)
+    }
+
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+
+    nullSafeCodeGen(ctx, ev, (l, r) => {
+      val et = dataType.elementType
+      val isPrimitive = CodeGenerator.isPrimitiveType(et)
+
+      val arrayDataName = ctx.freshName("arrayData")
+      val arrayName = ctx.freshName("arrayObject")
+      val numElements = ctx.freshName("numElements")
+
+      val genNumElements =
+        s"""
+           | int $numElements = 0;
+           | if ($r > 0) {
+           |   $numElements = $r;
+           | }
+         """.stripMargin
+
+      val initialization = if (isPrimitive) {
+        val arrayName = ctx.freshName("array")
+        val baseOffset = Platform.BYTE_ARRAY_OFFSET
+        s"""
+           | int numBytes = ${et.defaultSize} * $numElements;
+           | int unsafeArraySizeInBytes =
+           |   UnsafeArrayData.calculateHeaderPortionInBytes($numElements)
+           |   + org.apache.spark.unsafe.array.ByteArrayMethods
+           |     .roundNumberOfBytesToNearestWord(numBytes);
+           | byte[] $arrayName = new byte[unsafeArraySizeInBytes];
--- End diff --

Maybe we can use `ctx.createUnsafeArray()` now?
[GitHub] spark pull request #21070: [SPARK-23972][BUILD][SQL] Update Parquet to 1.10....
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21070#discussion_r186357714

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ---
@@ -63,115 +59,157 @@ public final void readBooleans(int total, WritableColumnVector c, int rowId) {
     }
   }

+  private ByteBuffer getBuffer(int length) {
+    try {
+      return in.slice(length).order(ByteOrder.LITTLE_ENDIAN);
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read " + length + " bytes", e);
+    }
+  }
+
   @Override
   public final void readIntegers(int total, WritableColumnVector c, int rowId) {
-    c.putIntsLittleEndian(rowId, total, buffer, offset - Platform.BYTE_ARRAY_OFFSET);
-    offset += 4 * total;
+    int requiredBytes = total * 4;
+    ByteBuffer buffer = getBuffer(requiredBytes);
+
+    if (buffer.hasArray()) {
--- End diff --

shall we assert `buffer.hasArray()` is always true?
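Context for the `hasArray` question: heap buffers expose a backing array while direct (off-heap) buffers do not, so an assert would only be valid if the Parquet input stream is guaranteed to hand back heap-backed buffers:

```scala
import java.nio.ByteBuffer

// hasArray distinguishes heap-backed buffers from direct ones.
assert(ByteBuffer.allocate(4).hasArray)
assert(!ByteBuffer.allocateDirect(4).hasArray)
```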
[GitHub] spark pull request #21070: [SPARK-23972][BUILD][SQL] Update Parquet to 1.10....
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21070#discussion_r186357371

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ---
@@ -63,115 +59,157 @@ public final void readBooleans(int total, WritableColumnVector c, int rowId) {
     }
   }

+  private ByteBuffer getBuffer(int length) {
+    try {
+      return in.slice(length).order(ByteOrder.LITTLE_ENDIAN);
--- End diff --

Does `in.slice(length)` do a copy?
[GitHub] spark pull request #21208: [SPARK-23925][SQL] Add array_repeat collection fu...
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21208#discussion_r186356981

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -1229,3 +1229,140 @@ case class Flatten(child: Expression) extends UnaryExpression {
   override def prettyName: String = "flatten"
 }
+
+/**
+ * Returns the array containing the given input value (left) count (right) times.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(element, count) - Returns the array containing element count times.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_('123', 2);
+       ['123', '123']
+  """)
+case class ArrayRepeat(left: Expression, right: Expression)
+  extends BinaryExpression with ExpectsInputTypes {
+
+  override def dataType: ArrayType = ArrayType(left.dataType, left.nullable)
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType, IntegerType)
+
+  override def nullable: Boolean = right.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val count = right.eval(input)
+    if (count == null) {
+      null
+    } else {
+      new GenericArrayData(List.fill(count.asInstanceOf[Int])(left.eval(input)))
+    }
+  }
+
+  override def prettyName: String = "array_repeat"
+
+  override def nullSafeCodeGen(ctx: CodegenContext,
--- End diff --

Yes, overriding `nullSafeCodeGen` is not suitable for this usage. So I think it would be good to put all the code in `doGenCode`, or to create another method instead of overriding `nullSafeCodeGen`.
[GitHub] spark pull request #21193: [SPARK-24121][SQL] Add API for handling expressio...
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/21193#discussion_r186356233

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/ExprValueSuite.scala ---
@@ -17,6 +17,8 @@

 package org.apache.spark.sql.catalyst.expressions.codegen

+import scala.collection.mutable
--- End diff --

???
[GitHub] spark issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait ...
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21164 @gatorsmile Could you please give some comments when you have time? Thanks so much. In addition, I think this is a critical bug!!!
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186355622

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3039,6 +3039,16 @@ object functions {
     ArrayContains(column.expr, Literal(value))
   }

+  /**
+   * Returns `true` if `a1` and `a2` have at least one non-null element in common. If not and
+   * any of the arrays contains a `null`, it returns `null`. It returns `false` otherwise.
+   * @group collection_funcs
+   * @since 2.4.0
+   */
+  def arrays_overlap(a1: Column, a2: Column): Column = withExpr {
+    ArraysOverlap(a1.expr, a2.expr)
+  }
--- End diff --

nit: indent
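A hedged usage sketch of the API being added above, assuming a SparkSession `spark` in scope:

```scala
import org.apache.spark.sql.functions._

// Evaluates to true: the arrays share the non-null element 2.
val df = spark.range(1).select(
  arrays_overlap(array(lit(1), lit(2)), array(lit(2), lit(3))).as("overlap"))
df.show()
```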
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186355288

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -530,6 +560,155 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }

+/**
+ * Checks if the two arrays contain at least one common element.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2. If the arrays have no common element and either of them contains a null element null is returned, false otherwise.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5));
+       true
+  """, since = "2.4.0")
+// scalastyle:off line.size.limit
+case class ArraysOverlap(left: Expression, right: Expression)
+  extends BinaryArrayExpressionWithImplicitCast {
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = {
+    left.nullable || right.nullable || left.dataType.asInstanceOf[ArrayType].containsNull ||
+      right.dataType.asInstanceOf[ArrayType].containsNull
+  }
+
+  override def nullSafeEval(a1: Any, a2: Any): Any = {
+    var hasNull = false
+    val arr1 = a1.asInstanceOf[ArrayData]
+    val arr2 = a2.asInstanceOf[ArrayData]
+    val (biggestArr, smallestArr) = if (arr1.numElements() > arr2.numElements()) {
--- End diff --

it's just 2 arrays, `smaller` and `bigger` should be better
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186355096

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -530,6 +560,155 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }

+/**
+ * Checks if the two arrays contain at least one common element.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2. If the arrays have no common element and either of them contains a null element null is returned, false otherwise.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5));
+       true
+  """, since = "2.4.0")
+// scalastyle:off line.size.limit
+case class ArraysOverlap(left: Expression, right: Expression)
+  extends BinaryArrayExpressionWithImplicitCast {
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = {
+    left.nullable || right.nullable || left.dataType.asInstanceOf[ArrayType].containsNull ||
+      right.dataType.asInstanceOf[ArrayType].containsNull
+  }
+
+  override def nullSafeEval(a1: Any, a2: Any): Any = {
+    var hasNull = false
+    val arr1 = a1.asInstanceOf[ArrayData]
+    val arr2 = a2.asInstanceOf[ArrayData]
+    val (biggestArr, smallestArr) = if (arr1.numElements() > arr2.numElements()) {
+      (arr1, arr2)
+    } else {
+      (arr2, arr1)
+    }
+    if (smallestArr.numElements() > 0) {
+      val smallestSet = new mutable.HashSet[Any]
+      smallestArr.foreach(elementType, (_, v) =>
+        if (v == null) {
+          hasNull = true
+        } else {
+          smallestSet += v
+        })
+      biggestArr.foreach(elementType, (_, v1) =>
+        if (v1 == null) {
+          hasNull = true
+        } else if (smallestSet.contains(v1)) {
+          return true
+        }
+      )
+    } else if (containsNull(biggestArr, right.dataType.asInstanceOf[ArrayType])) {
--- End diff --

`right.dataType.asInstanceOf[ArrayType]` may not match the `biggestArr`
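A minimal sketch of the fix the comment points at — pair each array with its own ArrayType instead of hard-coding `right.dataType` (names follow the diff above):

```scala
// Carry the matching ArrayType so containsNull checks the nullability
// of the array actually being scanned.
val (biggestArr, biggestType) = if (arr1.numElements() > arr2.numElements()) {
  (arr1, left.dataType.asInstanceOf[ArrayType])
} else {
  (arr2, right.dataType.asInstanceOf[ArrayType])
}
// ... later: } else if (containsNull(biggestArr, biggestType)) {
```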
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2991/ Test PASSed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Merged build finished. Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2990/ Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test PASSed.
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186354007

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -530,6 +560,155 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }

+/**
+ * Checks if the two arrays contain at least one common element.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2. If the arrays have no common element and either of them contains a null element null is returned, false otherwise.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5));
+       true
+  """, since = "2.4.0")
+// scalastyle:off line.size.limit
+case class ArraysOverlap(left: Expression, right: Expression)
+  extends BinaryArrayExpressionWithImplicitCast {
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = {
+    left.nullable || right.nullable || left.dataType.asInstanceOf[ArrayType].containsNull ||
+      right.dataType.asInstanceOf[ArrayType].containsNull
+  }
+
+  override def nullSafeEval(a1: Any, a2: Any): Any = {
+    var hasNull = false
+    val arr1 = a1.asInstanceOf[ArrayData]
+    val arr2 = a2.asInstanceOf[ArrayData]
+    val (biggestArr, smallestArr) = if (arr1.numElements() > arr2.numElements()) {
+      (arr1, arr2)
+    } else {
+      (arr2, arr1)
+    }
+    if (smallestArr.numElements() > 0) {
+      val smallestSet = new mutable.HashSet[Any]
+      smallestArr.foreach(elementType, (_, v) =>
+        if (v == null) {
+          hasNull = true
+        } else {
+          smallestSet += v
+        })
+      biggestArr.foreach(elementType, (_, v1) =>
+        if (v1 == null) {
+          hasNull = true
+        } else if (smallestSet.contains(v1)) {
+          return true
+        }
+      )
+    } else if (containsNull(biggestArr, right.dataType.asInstanceOf[ArrayType])) {
+      hasNull = true
+    }
+    if (hasNull) {
+      null
+    } else {
+      false
+    }
+  }
+
+  def containsNull(arr: ArrayData, dt: ArrayType): Boolean = {
+    if (dt.containsNull) {
+      arr.foreach(elementType, (_, v) =>
--- End diff --

```
var i = 0
var hasNull = false
while (i < arr.numElements && !hasNull) {
  hasNull = arr.isNullAt(i)
  i += 1
}
hasNull
```
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186353058

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -28,6 +30,34 @@ import org.apache.spark.unsafe.Platform
 import org.apache.spark.unsafe.array.ByteArrayMethods
 import org.apache.spark.unsafe.types.{ByteArray, UTF8String}

+/**
+ * Base trait for [[BinaryExpression]]s with two arrays of the same element type and implicit
+ * casting.
+ */
+trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
+  with ImplicitCastInputTypes {
+
+  protected lazy val elementType: DataType = inputTypes.head.asInstanceOf[ArrayType].elementType
--- End diff --

this can be a `def`
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21028#discussion_r186353005

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -28,6 +30,34 @@ import org.apache.spark.unsafe.Platform
 import org.apache.spark.unsafe.array.ByteArrayMethods
 import org.apache.spark.unsafe.types.{ByteArray, UTF8String}

+/**
+ * Base trait for [[BinaryExpression]]s with two arrays of the same element type and implicit
+ * casting.
+ */
+trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
+  with ImplicitCastInputTypes {
+
+  protected lazy val elementType: DataType = inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    TypeCoercion.findWiderTypeForTwo(left.dataType, right.dataType) match {
--- End diff --

Does Presto allow implicitly casting to string for these collection functions? e.g. can `ArraysOverlap` work for an array of int and an array of string?
[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21040
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21252#discussion_r186352935

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)

+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
+      .intConf
+      .createWithDefault(2000)
--- End diff --

Yeah, I agree.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2989/ Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16677

**[Test build #90306 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90306/testReport)** for PR 16677 at commit [`062b8fd`](https://github.com/apache/spark/commit/062b8fd58ae13f252b1e6f61c70b69ed05521715).

* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90306/ Test FAILed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21028 **[Test build #90308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90308/testReport)** for PR 21028 at commit [`4e37975`](https://github.com/apache/spark/commit/4e37975ba3ce361009a83d248ad7d0b758f86f4c).
[GitHub] spark issue #21040: [SPARK-23930][SQL] Add slice function
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/21040 Thanks! merging to master.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2988/ Test PASSed.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Merged build finished. Test PASSed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21028 retest this please
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16478 **[Test build #90307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90307/testReport)** for PR 16478 at commit [`ae00de1`](https://github.com/apache/spark/commit/ae00de13dd779a2a09b142c54a2fcc144d7f8c23).
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16677 **[Test build #90306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90306/testReport)** for PR 16677 at commit [`062b8fd`](https://github.com/apache/spark/commit/062b8fd58ae13f252b1e6f61c70b69ed05521715).
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21255 Merged build finished. Test FAILed.
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21255 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2987/ Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16677 retest this please.
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21255 **[Test build #90304 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90304/testReport)** for PR 21255 at commit [`3985285`](https://github.com/apache/spark/commit/3985285089673e42a85a5d1ba3cd7419a6948909).
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21240 **[Test build #90305 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90305/testReport)** for PR 21240 at commit [`4ab3af0`](https://github.com/apache/spark/commit/4ab3af0c1abfd0ac078c968dbe589bf96091).
[GitHub] spark pull request #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21256
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21240 retest this please.
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21252#discussion_r186349524

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)

+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
+      .intConf
+      .createWithDefault(2000)
--- End diff --

I would suggest `Int.Max` as the default value, which preserves the previous behavior. Users can tune it w.r.t. their workload.
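A sketch of the suggested change, reusing the conf builder from the diff above:

```scala
val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
  buildConf("spark.sql.limit.sortInMemThreshold")
    .internal()
    .doc("In sql like 'select x from t order by y limit m', if m is under this " +
      "threshold, sort in memory, otherwise do a global sort with disk.")
    .intConf
    .createWithDefault(Int.MaxValue)  // preserves the existing in-memory behavior by default
```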
[GitHub] spark issue #21255: [SPARK-24186][SparR][SQL]change reverse and concat to co...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21255 retest this please
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21252#discussion_r186348657

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)

+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
+      .intConf
+      .createWithDefault(2000)
--- End diff --

Isn't 2000 too small for this?
[GitHub] spark issue #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21256 thanks! I'm merging it to unblock the build, since it already passes the compilation.
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21252#discussion_r186348278

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)

+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
+      .intConf
+      .createWithDefault(2000)
--- End diff --

Oh, yeah, reasonable.
[GitHub] spark issue #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21256 LGTM
[GitHub] spark pull request #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr shou...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21250#discussion_r186348079
--- Diff: docs/sparkr.md ---
@@ -663,3 +663,7 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 - The `stringsAsFactors` parameter was previously ignored with `collect`, for example, in `collect(createDataFrame(iris), stringsAsFactors = TRUE))`. It has been corrected.
 - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
 - A warning can be raised if versions of SparkR package and the Spark JVM do not match.
+
+## Upgrading to Spark 2.3.1 and above
+
+ - The `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. It has been fixed so the `start` parameter of `substr` method is now 1-base, e.g., therefore to get the same result as `substr(df$a, 2, 5)`, it should be changed to `substr(df$a, 1, 4)`.
--- End diff --
we should mention the version more explicitly, e.g.
```
In SparkR 2.3.0 and earlier, the `start` parameter ...
In version 2.3.1 and later, ...
As an example, `substr(lit('abcdef'), 2, 5)` would result in `abc` in SparkR 2.3.0, and in SparkR 2.3.1, the result would be ...
```
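For comparison, Spark SQL's own `substring(str, pos, len)` function is 1-based, which is the behavior the fixed SparkR `substr` now matches (note that SparkR's `substr` takes a `stop` index rather than a length). A minimal Scala sketch, assuming a running `SparkSession` named `spark`:

```
import org.apache.spark.sql.functions.{lit, substring}

// substring is 1-based: position 2, length 4 of "abcdef" yields "bcde"
spark.range(1)
  .select(substring(lit("abcdef"), 2, 4).as("s"))
  .collect()  // Array([bcde])
```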
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2986/ Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test PASSed.
[GitHub] spark pull request #21250: [SPARK-23291][SQL][R][BRANCH-2.3] R's substr shou...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21250#discussion_r186347093
--- Diff: docs/sparkr.md ---
@@ -663,3 +663,7 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 - The `stringsAsFactors` parameter was previously ignored with `collect`, for example, in `collect(createDataFrame(iris), stringsAsFactors = TRUE))`. It has been corrected.
 - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
 - A warning can be raised if versions of SparkR package and the Spark JVM do not match.
+
+## Upgrading to Spark 2.3.1 and above
--- End diff --
`Spark` -> `SparkR`
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21252#discussion_r186346827
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
+
+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
--- End diff --
`with disk` -> `which spills to disk if necessary`
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21252#discussion_r186346747
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
+
+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
--- End diff --
`spark.sql.execution.combineLimitAfterSortThreshold`?
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test PASSed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2985/ Test PASSed.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90303/ Test FAILed.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21240
**[Test build #90303 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90303/testReport)** for PR 21240 at commit [`4ab3af0`](https://github.com/apache/spark/commit/4ab3af0c1abfd0ac078c968dbe589bf96091).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Merged build finished. Test FAILed.
[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21252#discussion_r186345913
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
+
+  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+    buildConf("spark.sql.limit.sortInMemThreshold")
+      .internal()
+      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
+        "sort in memory, otherwise do a global sort with disk.")
+      .intConf
+      .createWithDefault(2000)
--- End diff --
What if users have only a few queries with a large limit and want to disable the top-N sort for just those? I feel this config is more flexible than a boolean flag.
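As a usage sketch of the flexibility being described (config name as proposed in this PR and subject to the rename suggested above): a workload with a few very large limits can set the threshold between its small and large limits, so only the large-limit queries switch to the disk-backed global sort.

```
// Hypothetical session-level tuning; the config name follows this PR's proposal.
// Limits above the threshold take the disk-backed global sort; smaller limits
// keep the in-memory top-N sort.
spark.conf.set("spark.sql.limit.sortInMemThreshold", "10000")
spark.sql("SELECT x FROM t ORDER BY y LIMIT 1000000")  // global sort path
spark.sql("SELECT x FROM t ORDER BY y LIMIT 100")      // top-N path
```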
[GitHub] spark issue #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21256 cc @cloud-fan @JoshRosen @jinxing64
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2984/ Test PASSed.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21240 Merged build finished. Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16478
**[Test build #90302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90302/testReport)** for PR 16478 at commit [`ae00de1`](https://github.com/apache/spark/commit/ae00de13dd779a2a09b142c54a2fcc144d7f8c23).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test FAILed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90302/ Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16677
**[Test build #90301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90301/testReport)** for PR 16677 at commit [`062b8fd`](https://github.com/apache/spark/commit/062b8fd58ae13f252b1e6f61c70b69ed05521715).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90301/ Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test FAILed.
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21240 **[Test build #90303 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90303/testReport)** for PR 21240 at commit [`4ab3af0`](https://github.com/apache/spark/commit/4ab3af0c1abfd0ac078c968dbe589bf96091).
[GitHub] spark issue #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21256 Merged build finished. Test PASSed.
[GitHub] spark issue #21256: [SPARK-24160][FOLLOWUP] Fix compilation failure
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21256 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2983/ Test PASSed.