[GitHub] spark pull request #21169: [SPARK-23715][SQL] the input of to/from_utc_times...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21169#discussion_r184731765 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1208,6 +1208,13 @@ object SQLConf { .stringConf .createWithDefault("") + val REJECT_TIMEZONE_IN_STRING = buildConf("spark.sql.function.rejectTimezoneInString") + .internal() + .doc("If true, `to_utc_timestamp` and `from_utc_timestamp` return null if the input string " + + "contains a timezone part, e.g. `2000-10-10 00:00:00+00:00`.") + .booleanConf + .createWithDefault(true) + --- End diff -- Existing workloads may depend on the previous behavior and assume it is correct. It's safer to provide an internal conf and tell users about it when they complain about the behavior change. It's an internal conf and is invisible to end users. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
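A minimal sketch of the behavior this conf gates (hypothetical names, not Spark's actual implementation): when the flag is on, a timestamp string that carries its own zone offset yields null instead of being silently reinterpreted.

```python
import re

# Hypothetical stand-in for spark.sql.function.rejectTimezoneInString (default true).
REJECT_TIMEZONE_IN_STRING = True

def parse_for_utc_conversion(ts: str):
    """Return None when the string carries an explicit zone offset and the flag is on,
    mirroring to_utc_timestamp/from_utc_timestamp returning null."""
    has_zone = bool(re.search(r"(?:[+-]\d{2}:\d{2}|Z)$", ts))
    if has_zone and REJECT_TIMEZONE_IN_STRING:
        return None
    return ts

print(parse_for_utc_conversion("2000-10-10 00:00:00+00:00"))  # None
print(parse_for_utc_conversion("2000-10-10 00:00:00"))        # 2000-10-10 00:00:00
```

Flipping the hypothetical flag to `False` restores the previous (arguably buggy) behavior, which is exactly the escape hatch the comment argues for.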
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21028 **[Test build #89930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89930/testReport)** for PR 21028 at commit [`076fc69`](https://github.com/apache/spark/commit/076fc698d4054b757e5afb14d1d6bfc190c2c6f7).
[GitHub] spark issue #21182: [SPARK-24068] Propagating DataFrameReader's options to T...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21182 **[Test build #89929 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89929/testReport)** for PR 21182 at commit [`8a8ff3f`](https://github.com/apache/spark/commit/8a8ff3f5bfdfaee7ec73e362cfa34261d199f407).
[GitHub] spark issue #21182: [SPARK-24068] Propagating DataFrameReader's options to T...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21182 Can one of the admins verify this patch?
[GitHub] spark issue #21181: [SPARK-23736][SQL][FOLLOWUP] Error message should contai...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21181 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2721/ Test PASSed.
[GitHub] spark issue #21181: [SPARK-23736][SQL][FOLLOWUP] Error message should contai...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21181 Merged build finished. Test PASSed.
[GitHub] spark issue #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully method ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21175 **[Test build #89928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89928/testReport)** for PR 21175 at commit [`fb527c8`](https://github.com/apache/spark/commit/fb527c87a1f4ddb05eb601038736aeb4ec3f7223). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully method ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21175 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89928/ Test FAILed.
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r184730604 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -19,14 +19,41 @@ package org.apache.spark.sql.catalyst.expressions import java.util.Comparator import org.apache.spark.sql.catalyst.InternalRow -import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.{TypeCheckResult, TypeCoercion} import org.apache.spark.sql.catalyst.expressions.codegen._ import org.apache.spark.sql.catalyst.util.{ArrayData, GenericArrayData, MapData, TypeUtils} import org.apache.spark.sql.types._ import org.apache.spark.unsafe.Platform import org.apache.spark.unsafe.array.ByteArrayMethods import org.apache.spark.unsafe.types.{ByteArray, UTF8String} +/** + * Base trait for [[BinaryExpression]]s with two arrays of the same element type and implicit + * casting. + */ +trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression --- End diff -- The `ImplicitCastInputTypes` trait is able to work with any number of children. Would it be possible to implement this trait to behave in the same way?
[GitHub] spark pull request #21182: [SPARK-24068] Propagating DataFrameReader's optio...
GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/21182 [SPARK-24068] Propagating DataFrameReader's options to Text datasource on schema inferring

## What changes were proposed in this pull request?

While reading CSV or JSON files, DataFrameReader's options are converted to Hadoop's parameters, for example here: https://github.com/apache/spark/blob/branch-2.3/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L302 but the options are not propagated to the Text datasource on schema inference, for instance: https://github.com/apache/spark/blob/branch-2.3/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala#L184-L188

The PR proposes propagating the user's options to the Text datasource on schema inference, in the same way the options are converted to Hadoop parameters when a schema is specified.

## How was this patch tested?

The changes were tested manually by using https://github.com/twitter/hadoop-lzo:

```
hadoop-lzo> mvn clean package
hadoop-lzo> ln -s ./target/hadoop-lzo-0.4.21-SNAPSHOT.jar ./hadoop-lzo.jar
```

Create 2 test files in JSON and CSV format and compress them:

```shell
$ cat test.csv
col1|col2
a|1
$ lzop test.csv
$ cat test.json
{"col1":"a","col2":1}
$ lzop test.json
```

Run `spark-shell` with hadoop-lzo:

```
bin/spark-shell --jars ~/hadoop-lzo/hadoop-lzo.jar
```

Reading compressed CSV and JSON without a schema:

```scala
spark.read.option("io.compression.codecs", "com.hadoop.compression.lzo.LzopCodec").option("inferSchema",true).option("header",true).option("sep","|").csv("test.csv.lzo").show()
+----+----+
|col1|col2|
+----+----+
|   a|   1|
+----+----+
```

```scala
spark.read.option("io.compression.codecs", "com.hadoop.compression.lzo.LzopCodec").option("multiLine", true).json("test.json.lzo").printSchema
root
 |-- col1: string (nullable = true)
 |-- col2: long (nullable = true)
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/MaxGekk/spark-1 text-options Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21182.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21182 commit 8a8ff3f5bfdfaee7ec73e362cfa34261d199f407 Author: Maxim Gekk Date: 2018-04-27T13:23:40Z Propagating DataFrameReader's options to the text datasource on schema inferring
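The core of the fix is a merge of reader options into the underlying (Hadoop) job configuration on the schema-inference path as well. A hedged sketch of that idea, with hypothetical names (not Spark's actual code):

```python
def with_reader_options(hadoop_conf: dict, reader_options: dict) -> dict:
    """Merge DataFrameReader options (e.g. io.compression.codecs) into the job
    configuration used for both schema inference and reading; user options win."""
    merged = dict(hadoop_conf)   # copy, so the base configuration is untouched
    merged.update(reader_options)
    return merged

base = {"mapreduce.input.fileinputformat.split.minsize": "0"}
opts = {"io.compression.codecs": "com.hadoop.compression.lzo.LzopCodec"}
conf = with_reader_options(base, opts)
print(conf["io.compression.codecs"])  # com.hadoop.compression.lzo.LzopCodec
```

Before the PR, only the read path performed this merge; inference ran against the bare configuration, so codec options like the LZO one in the test above never reached it.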
[GitHub] spark issue #21181: [SPARK-23736][SQL][FOLLOWUP] Error message should contai...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21181 cc @mn-mikke @ueshin
[GitHub] spark issue #21181: [SPARK-23736][SQL][FOLLOWUP] Error message should contai...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21181 **[Test build #89927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89927/testReport)** for PR 21181 at commit [`5660de8`](https://github.com/apache/spark/commit/5660de8d93ff96aa4dd840e74a1b87294e543838).
[GitHub] spark issue #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully method ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21175 **[Test build #89928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89928/testReport)** for PR 21175 at commit [`fb527c8`](https://github.com/apache/spark/commit/fb527c87a1f4ddb05eb601038736aeb4ec3f7223).
[GitHub] spark issue #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully method ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21175 ok to test
[GitHub] spark pull request #21181: [SPARK-23736][SQL][FOLLOWUP] Error message should...
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/21181 [SPARK-23736][SQL][FOLLOWUP] Error message should contains SQL types ## What changes were proposed in this pull request? In the error messages we should return the SQL types (like `string`) rather than the internal types (like `StringType`). ## How was this patch tested? added UT You can merge this pull request into a Git repository by running: $ git pull https://github.com/mgaido91/spark SPARK-23736_followup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21181.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21181 commit 5660de8d93ff96aa4dd840e74a1b87294e543838 Author: Marco Gaido Date: 2018-04-27T15:45:21Z [SPARK-23736][SQL][FOLLOWUP] Error message should contains SQL types
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user manbuyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184728954 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -63,10 +63,12 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) { */ def writeFully(channel: WritableByteChannel): Unit = { for (bytes <- getChunks()) { - while (bytes.remaining() > 0) { + val curChunkLimit = bytes.limit() + while (bytes.hasRemaining) { val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize) bytes.limit(bytes.position() + ioSize) channel.write(bytes) +bytes.limit(curChunkLimit) --- End diff -- I have committed this change.
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user manbuyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184727560 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -63,10 +63,12 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) { */ def writeFully(channel: WritableByteChannel): Unit = { for (bytes <- getChunks()) { - while (bytes.remaining() > 0) { + val curChunkLimit = bytes.limit() + while (bytes.hasRemaining) { val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize) bytes.limit(bytes.position() + ioSize) channel.write(bytes) +bytes.limit(curChunkLimit) --- End diff -- Right, when `channel.write` throws an `IOException`.
[GitHub] spark issue #21141: [SPARK-23853][PYSPARK][TEST] Run Hive-related PySpark te...
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21141 @dongjoon-hyun @HyukjinKwon My PR is no longer addressing the issue described in its associated Jira [(SPARK-23776)](https://issues.apache.org/jira/browse/SPARK-23776), which is that developers don't know what to do when they run the pyspark tests and get a failure with a UDF registration error (or a Hive assembly missing error), as Holden experienced earlier. My PR morphed into a "skip the tests for missing components" change. After these "skip tests" PRs go through, I will revisit this. In the meantime, feel free to use/ignore whatever is in #20909.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2720/ Test PASSed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Merged build finished. Test PASSed.
[GitHub] spark pull request #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) qu...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21177#discussion_r184725980 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala --- @@ -78,7 +81,7 @@ object TPCDSQueryBenchmark extends Logging { } val numRows = queryRelations.map(tableSizes.getOrElse(_, 0L)).sum val benchmark = new Benchmark(s"TPCDS Snappy", numRows, 5) - benchmark.addCase(name) { i => + benchmark.addCase(s"$name$nameSuffix") { _ => --- End diff -- how about ``` benchmark.addCase(s"$name$nameSuffix")(_ => spark.sql(queryString).collect()) ```
[GitHub] spark pull request #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) qu...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21177#discussion_r184724132 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala --- @@ -87,10 +90,20 @@ object TPCDSQueryBenchmark extends Logging { } } + def filterQueries( + origQueries: Seq[String], + args: TPCDSQueryBenchmarkArguments): Seq[String] = { +if (args.queryFilter.nonEmpty) { + origQueries.filter { case queryName => args.queryFilter.contains(queryName) } --- End diff -- ``` origQueries.filter(args.queryFilter.contains) ``` maybe better? :)
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89926/ Test FAILed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21028 Merged build finished. Test FAILed.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21028 **[Test build #89926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89926/testReport)** for PR 21028 at commit [`1dbcd0c`](https://github.com/apache/spark/commit/1dbcd0c68171ee5375e54a320b4314741a135fbd). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21028 **[Test build #89926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89926/testReport)** for PR 21028 at commit [`1dbcd0c`](https://github.com/apache/spark/commit/1dbcd0c68171ee5375e54a320b4314741a135fbd).
[GitHub] spark pull request #21136: [SPARK-24061][SS]Add TypedFilter support for cont...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21136#discussion_r184719162 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala --- @@ -771,6 +778,16 @@ class UnsupportedOperationsSuite extends SparkFunSuite { } } + /** Assert that the logical plan is supportd for continuous procsssing mode */ --- End diff -- nit:`supportd` -> `supported`
[GitHub] spark pull request #21136: [SPARK-24061][SS]Add TypedFilter support for cont...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21136#discussion_r184718790 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala --- @@ -840,4 +857,9 @@ class UnsupportedOperationsSuite extends SparkFunSuite { def this(attribute: Attribute) = this(Seq(attribute)) override def isStreaming: Boolean = true } + + case class StreamingRelationV2(output: Seq[Attribute]) extends LeafNode { --- End diff -- nit: rename this to `TestStreamingRelationV2`
[GitHub] spark pull request #20946: [SPARK-23565] [SS] New error message for structur...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20946
[GitHub] spark issue #20946: [SPARK-23565] [SS] New error message for structured stre...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/20946 Thanks! Merging to master.
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184713695 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -63,10 +63,12 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) { */ def writeFully(channel: WritableByteChannel): Unit = { for (bytes <- getChunks()) { - while (bytes.remaining() > 0) { + val curChunkLimit = bytes.limit() + while (bytes.hasRemaining) { val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize) bytes.limit(bytes.position() + ioSize) channel.write(bytes) +bytes.limit(curChunkLimit) --- End diff -- I would rewrite this using: ``` try { val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize) bytes.limit(bytes.position() + ioSize) channel.write(bytes) } finally { bytes.limit(curChunkLimit) } ``` to be safe.
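A Python model of the pattern under review (the `ByteBuf` class and names here are hypothetical stand-ins for `java.nio.ByteBuffer`, not Spark's Scala code): the buffer's limit is temporarily shrunk so each write is bounded, and the `finally` restores it so an exception from `write` cannot leave the buffer truncated.

```python
import io

BUFFER_WRITE_CHUNK_SIZE = 4  # tiny stand-in for Spark's bufferWriteChunkSize, so the loop iterates

class ByteBuf:
    """Minimal java.nio.ByteBuffer-like object: data plus a mutable position and limit."""
    def __init__(self, data: bytes):
        self.data, self.position, self.limit = data, 0, len(data)
    def remaining(self) -> int:
        return self.limit - self.position

def write_fully(bytes_buf: ByteBuf, out: io.BytesIO) -> None:
    cur_chunk_limit = bytes_buf.limit
    while bytes_buf.remaining() > 0:
        try:
            io_size = min(bytes_buf.remaining(), BUFFER_WRITE_CHUNK_SIZE)
            bytes_buf.limit = bytes_buf.position + io_size
            out.write(bytes_buf.data[bytes_buf.position:bytes_buf.limit])
            bytes_buf.position = bytes_buf.limit
        finally:
            # the fix being discussed: restore the limit even if write() raises,
            # otherwise the next iteration (or the caller) sees a truncated buffer
            bytes_buf.limit = cur_chunk_limit

buf = ByteBuf(b"hello, chunked world")
out = io.BytesIO()
write_fully(buf, out)
print(out.getvalue().decode())  # hello, chunked world
```

Without the restore, a chunk larger than `BUFFER_WRITE_CHUNK_SIZE` would only ever have its first slice written, which is the SPARK-24107 bug.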
[GitHub] spark issue #21180: [SPARK-22674][PYTHON] Disabled _hack_namedtuple for pick...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21180 Can one of the admins verify this patch?
[GitHub] spark pull request #21180: [SPARK-22674][PYTHON] Disabled _hack_namedtuple f...
GitHub user superbobry opened a pull request: https://github.com/apache/spark/pull/21180 [SPARK-22674][PYTHON] Disabled _hack_namedtuple for picklable namedtuples

## What changes were proposed in this pull request?

Prior to this PR ``_hack_namedtuple`` was applied to all namedtuples, regardless of whether they were defined at the top level of some module, and are therefore picklable using the default ``__reduce__`` implementation, or not. The PR ensures that only the non-picklable namedtuples are hacked, i.e. the ones defined in the REPL or locally in a function or method. Note that a namedtuple might be defined locally but still be picklable without the hack applied:

    def define():
        global Foo
        Foo = namedtuple("Foo", [])

The current implementation does not cover such cases and will apply the hack anyway.

Sidenote: the PR also fixes the module name of the hacked namedtuples. Due to an extra layer of indirection added by ``_hijack_namedtuple``, all hacked namedtuples had "collections" as ``__module__``. This behaviour is no longer the case.

## How was this patch tested?

SerializationTestCase and RDDTests.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/criteo-forks/spark hijack-non-importable-namedtuple Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21180.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21180 commit 37b0f6d14fcd48b9bd05b6f43b5cfb6284200367 Author: Sergei Lebedev Date: 2018-04-27T14:34:12Z [SPARK-22674][PYTHON] Disabled _hack_namedtuple for picklable namedtuples Prior to this PR ``_hack_namedtuple`` was applied to all namedtuples, regardless of whether they were defined at the top level of some module, and are therefore picklable using the default ``__reduce__`` implementation, or not. The PR ensures that only the non-picklable namedtuples are hacked, i.e. the ones defined in the REPL or locally in a function or method. Note that the namedtuple might be defined locally but still be picklable without the hack applied. def define(): global Foo Foo = namedtuple("Foo", []) The current implementation does not cover such cases, and will apply the hack anyway. Sidenote: the PR also fixes the module name of the hacked namedtuples. Due to an extra layer of indirection added by ``_hijack_namedtuple``, all hacked namedtuples had "collections" as ``__module__``. This behaviour is no longer the case. SerializationTestCase and RDDTests.
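The distinction the PR description relies on can be demonstrated directly (a sketch, not PySpark's actual `_hack_namedtuple` code; `is_picklable` and `make_local` are hypothetical helpers): a namedtuple defined at module top level round-trips through the default pickle machinery, while one defined locally and never published under its name does not.

```python
import pickle
from collections import namedtuple

# Top-level definition: pickle can serialize instances by re-importing the class by name.
Point = namedtuple("Point", ["x", "y"])

def is_picklable(obj) -> bool:
    """The check implied by the PR: only namedtuples that fail this need the hack."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError):
        return False

def make_local():
    # Defined locally and NOT published via `global`, so the by-name lookup fails.
    Local = namedtuple("Local", ["a"])
    return Local(a=1)

print(is_picklable(Point(1, 2)))   # True
print(is_picklable(make_local()))  # False
```

The `global Foo` example in the description is the gap: such a namedtuple is local by definition site but still importable by name, so a purely syntactic "was it defined at top level?" test misclassifies it.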
[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...
Github user superbobry commented on the issue: https://github.com/apache/spark/pull/21157 Closing in favour of #21180.
[GitHub] spark pull request #21157: [SPARK-22674][PYTHON] Removed the namedtuple pick...
Github user superbobry closed the pull request at: https://github.com/apache/spark/pull/21157
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21133 cc @cloud-fan
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r184706765 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -288,6 +288,114 @@ case class ArrayContains(left: Expression, right: Expression) override def prettyName: String = "array_contains" } +/** + * Checks if the two arrays contain at least one common element. + */ +@ExpressionDescription( + usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2.", + examples = """ +Examples: + > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5)); + true + """, since = "2.4.0") +case class ArraysOverlap(left: Expression, right: Expression) + extends BinaryExpression with ImplicitCastInputTypes { + + private lazy val elementType = inputTypes.head.asInstanceOf[ArrayType].elementType + + override def dataType: DataType = BooleanType + + override def inputTypes: Seq[AbstractDataType] = left.dataType match { --- End diff -- Implicit type casting is allowed in Presto. I am pushing a proposal for the trait here; let me know what you think about it. Thanks.
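The documented semantics of `arrays_overlap` can be sketched outside Spark (this is a plain-Python illustration of the docstring above, ignoring Spark's null handling and codegen): build a hash set from the smaller array and probe it with the larger one.

```python
def arrays_overlap(a1, a2):
    """True if the two arrays share at least one element.
    Building the set from the smaller side keeps it O(len(a1) + len(a2))."""
    small, large = (a1, a2) if len(a1) <= len(a2) else (a2, a1)
    probe = set(small)
    return any(e in probe for e in large)

print(arrays_overlap([1, 2, 3], [3, 4, 5]))  # True, matching the SELECT example above
print(arrays_overlap([1, 2], [4, 5]))        # False
```

The sized-based choice of which side to hash is a standard trick; the real expression additionally has to decide the result when either array contains null.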
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user manbuyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184706100 --- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala --- @@ -56,6 +56,15 @@ class ChunkedByteBufferSuite extends SparkFunSuite { assert(chunkedByteBuffer.getChunks().head.position() === 0) } + test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") { +val bufferWriteChunkSize = Option(SparkEnv.get).map(_.conf.get(config.BUFFER_WRITE_CHUNK_SIZE)) + .getOrElse(config.BUFFER_WRITE_CHUNK_SIZE.defaultValue.get).toInt --- End diff -- OK, I have added it. Please check.
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21153 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89924/ Test PASSed.
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21153 Merged build finished. Test PASSed.
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21133 Merged build finished. Test PASSed.
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21153 **[Test build #89924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89924/testReport)** for PR 21153 at commit [`526fa4a`](https://github.com/apache/spark/commit/526fa4a96b61f4b5adb6606a92ed440879747a28). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89921/ Test PASSed.
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21133 **[Test build #89921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89921/testReport)** for PR 21133 at commit [`2fa8da7`](https://github.com/apache/spark/commit/2fa8da744b1726284577deca6c70d184cdae3579). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21021 **[Test build #89925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89925/testReport)** for PR 21021 at commit [`175d981`](https://github.com/apache/spark/commit/175d98195fc172655584b0dcf4087014e1377d12).
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2719/ Test PASSed.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Merged build finished. Test PASSed.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21021 retest this please
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r184700686

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -288,6 +288,114 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }
+
+/**
+ * Checks if the two arrays contain at least one common element.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5));
+       true
+  """, since = "2.4.0")
+case class ArraysOverlap(left: Expression, right: Expression)
+  extends BinaryExpression with ImplicitCastInputTypes {
+
+  private lazy val elementType = inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def dataType: DataType = BooleanType
+
+  override def inputTypes: Seq[AbstractDataType] = left.dataType match {
--- End diff --

@mgaido91 Sorry, I should have been more explicit. I've been referring to the below case that I added into `FunctionArgumentConversion` due to enabling type coercion of array types.
```
case c @ Concat(children)
    if children.forall(c => ArrayType.acceptsType(c.dataType)) && !haveSameType(children) =>
  val types = children.map(_.dataType)
  findWiderCommonType(types) match {
    case Some(finalDataType) => Concat(children.map(Cast(_, finalDataType)))
    case None => c
  }
```
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21153 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2718/ Test PASSed.
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21153 Merged build finished. Test PASSed.
[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21153 **[Test build #89924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89924/testReport)** for PR 21153 at commit [`526fa4a`](https://github.com/apache/spark/commit/526fa4a96b61f4b5adb6606a92ed440879747a28).
[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21153#discussion_r184693332

--- Diff: python/pyspark/ml/util.py ---
@@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
     """
     uid = instance.uid
     cls = instance.__module__ + '.' + instance.__class__.__name__
-    params = instance.extractParamMap()
+
+    # User-supplied param values
+    params = instance._paramMap
     jsonParams = {}
     if paramMap is not None:
         jsonParams = paramMap
     else:
         for p in params:
             jsonParams[p.name] = params[p]
+
+    # Default param values
+    jsonDefaultParams = {}
+    for p in instance._defaultParamMap:
+        jsonDefaultParams[p.name] = instance._defaultParamMap[p]
--- End diff --

ditto.
[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21153#discussion_r184693307

--- Diff: python/pyspark/ml/util.py ---
@@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
     """
     uid = instance.uid
     cls = instance.__module__ + '.' + instance.__class__.__name__
-    params = instance.extractParamMap()
+
+    # User-supplied param values
+    params = instance._paramMap
     jsonParams = {}
     if paramMap is not None:
         jsonParams = paramMap
     else:
         for p in params:
             jsonParams[p.name] = params[p]
--- End diff --

`_paramMap`'s keys are `Param` objects, not strings.
[GitHub] spark pull request #21169: [SPARK-23715][SQL] the input of to/from_utc_times...
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21169#discussion_r184687634

--- Diff: sql/core/src/test/resources/sql-tests/results/datetime.sql.out ---
@@ -82,9 +82,138 @@ struct
 1 2 2 3
+
 -- !query 9
 select weekday('2007-02-03'), weekday('2009-07-30'), weekday('2017-05-27'), weekday(null), weekday('1582-10-15 13:10:15')
--- !query 3 schema
+-- !query 9 schema
 struct
--- !query 3 output
+-- !query 9 output
 5 3 5 NULL 4
+
+
+-- !query 10
+select from_utc_timestamp('2015-07-24 00:00:00', 'PST')
+-- !query 10 schema
+struct
+-- !query 10 output
+2015-07-23 17:00:00
+
+
+-- !query 11
+select from_utc_timestamp('2015-01-24 00:00:00', 'PST')
+-- !query 11 schema
+struct
+-- !query 11 output
+2015-01-23 16:00:00
+
+
+-- !query 12
+select from_utc_timestamp(null, 'PST')
+-- !query 12 schema
+struct
+-- !query 12 output
+NULL
+
+
+-- !query 13
+select from_utc_timestamp('2015-07-24 00:00:00', null)
+-- !query 13 schema
+struct
+-- !query 13 output
+NULL
+
+
+-- !query 14
+select from_utc_timestamp(null, null)
+-- !query 14 schema
+struct
+-- !query 14 output
+NULL
+
+
+-- !query 15
+select from_utc_timestamp(cast(0 as timestamp), 'PST')
+-- !query 15 schema
+struct
+-- !query 15 output
+1969-12-31 08:00:00
--- End diff --

Since we're adding new SQLConf settings anyway, we could have a "timestamp.hive.compatibility" (or something like that) that is true by default and allows `select from_utc_timestamp(cast(0 as timestamp), 'PST')` to continue producing the above answer. However, when false, it would treat 0 as 1970-01-01T00:00:00 UTC, so the above would instead produce the answer '1969-12-31 16:00:00' (which we both agree in the Jira is probably the more correct answer). Just a thought.
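The two candidate behaviors discussed above can be sketched with plain `datetime` arithmetic. This is an illustration only, not Spark's implementation: it assumes a fixed UTC-8 offset for 'PST' and ignores DST.

```python
from datetime import datetime, timedelta

def shift_from_utc(ts_utc, offset_hours):
    # Shift a wall-clock timestamp assumed to be in UTC into a fixed-offset zone.
    return ts_utc + timedelta(hours=offset_hours)

epoch = datetime(1970, 1, 1)  # epoch second 0, read as UTC

# "More correct" reading: 0 is 1970-01-01T00:00:00 UTC, shifted once to UTC-8.
print(shift_from_utc(epoch, -8))  # 1969-12-31 16:00:00

# Hive-compatible reading: the cast first renders 0 in the Pacific session zone
# (1969-12-31 16:00:00), and from_utc_timestamp then shifts that rendering again.
session_rendering = epoch + timedelta(hours=-8)
print(shift_from_utc(session_rendering, -8))  # 1969-12-31 08:00:00
```

The double shift in the second case is where the two answers in the test output diverge.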
[GitHub] spark issue #21040: [SPARK-23930][SQL] Add slice function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21040 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2717/ Test PASSed.
[GitHub] spark issue #21040: [SPARK-23930][SQL] Add slice function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21040 Merged build finished. Test PASSed.
[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20604 ping @squito
[GitHub] spark issue #21040: [SPARK-23930][SQL] Add slice function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21040 **[Test build #89923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89923/testReport)** for PR 21040 at commit [`72ed607`](https://github.com/apache/spark/commit/72ed607c7d59b1f7f821f48f74e19d06b73758a7).
[GitHub] spark pull request #21169: [SPARK-23715][SQL] the input of to/from_utc_times...
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21169#discussion_r184685378

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1208,6 +1208,13 @@ object SQLConf {
     .stringConf
     .createWithDefault("")
 
+  val REJECT_TIMEZONE_IN_STRING = buildConf("spark.sql.function.rejectTimezoneInString")
+    .internal()
+    .doc("If true, `to_utc_timestamp` and `from_utc_timestamp` return null if the input string " +
+      "contains a timezone part, e.g. `2000-10-10 00:00:00+00:00`.")
+    .booleanConf
+    .createWithDefault(true)
+
--- End diff --

Why would we need this? Currently, if a user passes '2000-10-10 00:00:00+00:00' to _utc_timestamp, they get the wrong answer. If they switch off this setting, they will continue to get the wrong answer rather than null. Are we accommodating the users who experienced this bug and are manually shifting the result?
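For intuition, the check the new conf would gate — does a timestamp string carry an explicit timezone part? — can be sketched in a few lines. This uses Python's `datetime.fromisoformat`, not Spark's parser, so it is only an illustration of the idea:

```python
from datetime import datetime

def has_timezone_part(s):
    # True when the string carries an explicit offset such as '+00:00'.
    try:
        return datetime.fromisoformat(s).tzinfo is not None
    except ValueError:
        return False

print(has_timezone_part("2000-10-10 00:00:00+00:00"))  # True
print(has_timezone_part("2000-10-10 00:00:00"))        # False
```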
[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r184683126

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -288,6 +288,114 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }
+
+/**
+ * Checks if the two arrays contain at least one common element.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Returns true if a1 contains at least an element present also in a2.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(3, 4, 5));
+       true
+  """, since = "2.4.0")
+case class ArraysOverlap(left: Expression, right: Expression)
+  extends BinaryExpression with ImplicitCastInputTypes {
+
+  private lazy val elementType = inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def dataType: DataType = BooleanType
+
+  override def inputTypes: Seq[AbstractDataType] = left.dataType match {
--- End diff --

@mn-mikke I am not sure; it is quite a strange case, since it also allows string and byte. I am not sure we can do this with implicit type cast.
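Semantically, the expression being reviewed reduces to a set-intersection test. A minimal Python sketch of the documented behavior (ignoring nulls and the Catalyst type-coercion questions raised above):

```python
def arrays_overlap(a1, a2):
    # True if the two arrays share at least one common element.
    return not set(a1).isdisjoint(a2)

print(arrays_overlap([1, 2, 3], [3, 4, 5]))  # True
print(arrays_overlap([1, 2], [4, 5]))        # False
```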
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21178 Merged build finished. Test PASSed.
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21178 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89922/ Test PASSed.
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21178 **[Test build #89922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89922/testReport)** for PR 21178 at commit [`684fb26`](https://github.com/apache/spark/commit/684fb26b2de91b0c0e33874e06e9b93fa338b2cf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21178 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2716/ Test PASSed.
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21178 Merged build finished. Test PASSed.
[GitHub] spark issue #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeyt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21178 **[Test build #89922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89922/testReport)** for PR 21178 at commit [`684fb26`](https://github.com/apache/spark/commit/684fb26b2de91b0c0e33874e06e9b93fa338b2cf).
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184675304

--- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala ---
@@ -56,6 +56,15 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
     assert(chunkedByteBuffer.getChunks().head.position() === 0)
   }
 
+  test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") {
+    val bufferWriteChunkSize = Option(SparkEnv.get).map(_.conf.get(config.BUFFER_WRITE_CHUNK_SIZE))
+      .getOrElse(config.BUFFER_WRITE_CHUNK_SIZE.defaultValue.get).toInt
--- End diff --

How about setting this value via `spark.buffer.write.chunkSize`?
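The property this test pins down — a buffer larger than the configured write chunk size must still be written in full, one bounded slice at a time — can be sketched outside Spark. This is an illustrative stand-in, not `ChunkedByteBuffer.writeFully` itself:

```python
import io

def write_fully(target, buf, chunk_size):
    # Write buf to target in slices of at most chunk_size bytes,
    # so no single write call exceeds the configured chunk size.
    view = memoryview(buf)
    for pos in range(0, len(view), chunk_size):
        target.write(view[pos:pos + chunk_size])

out = io.BytesIO()
write_fully(out, b"x" * (5 * 1024 + 3), chunk_size=1024)
print(len(out.getvalue()))  # 5123: every byte written even though buf > chunk_size
```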
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89920/ Test FAILed.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Merged build finished. Test FAILed.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21021 **[Test build #89920 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89920/testReport)** for PR 21021 at commit [`175d981`](https://github.com/apache/spark/commit/175d98195fc172655584b0dcf4087014e1377d12). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait ArraySortUtil extends ExpectsInputTypes ` * `case class ArraySort(child: Expression) extends UnaryExpression with ArraySortUtil `
[GitHub] spark issue #21106: [SPARK-23711][SQL][WIP] Add fallback logic for UnsafePro...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21106 A sql conf sounds good to me. @hvanhovell What do you think?
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184672313

--- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i
     }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) {
+    return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) ||
+      !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+    Class<?> clz = UserGroupInformation.class;
+    try {
+      synchronized (clz) {
+        Field field = clz.getDeclaredField("keytabFile");
+        field.setAccessible(true);
+        return (String) field.get(null);
+      }
+    } catch (NoSuchFieldException e) {
+      try {
+        synchronized (clz) {
+          // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method
+          // getKeytab().
+          Method method = clz.getDeclaredMethod("getKeytab");
+          method.setAccessible(true);
+          return (String) method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

What is the purpose of moving both the field and the method out of this method? I'm not sure there is any difference.
[GitHub] spark issue #21121: [SPARK-24042][SQL] Collection function: zip_with_index
Github user mn-mikke commented on the issue: https://github.com/apache/spark/pull/21121 @ueshin What about combining `zip_with_index` with [`map_from_entries`](https://issues.apache.org/jira/browse/SPARK-23934)?
[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...
Github user manbuyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184668664

--- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala ---
@@ -56,6 +56,13 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
     assert(chunkedByteBuffer.getChunks().head.position() === 0)
   }
 
+  test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") {
+    val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(80 * 1024 * 1024)))
--- End diff --

I have modified it. Please check.
[GitHub] spark issue #21121: [SPARK-24042][SQL] Collection function: zip_with_index
Github user lokm01 commented on the issue: https://github.com/apache/spark/pull/21121 @ueshin Currently we use our own implementation of zipWithIndex when we do explode and need to preserve the ordering of the array elements (especially if there is a shuffle involved in the subsequent transformation). Sure, once transform becomes available, it will be much better and more performant to use that, but since we're dealing with production applications, we would like to start rewriting these jobs with those small "drop-in" replacements for functions such as zipWithIndex before going for a major rewrite with HOFs in spark SQL. I've seen many threads in the community, which recommend the same approach when dealing with these difficult array cases - I'm pretty sure it will benefit other users.
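The zip-with-index-then-explode pattern described above amounts to pairing each element with its ordinal position before flattening, so ordering survives a later shuffle. A toy Python sketch of the idea (Spark's built-in `posexplode` covers the same ground in one step on actual DataFrames):

```python
def zip_with_index(arr):
    # Pair each element with its ordinal position.
    return list(enumerate(arr))

def explode_with_pos(arr):
    # One (pos, value) row per element; pos lets downstream code restore order.
    return [{"pos": i, "value": v} for i, v in zip_with_index(arr)]

print(explode_with_pos(["a", "b", "c"]))
# [{'pos': 0, 'value': 'a'}, {'pos': 1, 'value': 'b'}, {'pos': 2, 'value': 'c'}]
```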
[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r184666357

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -287,3 +287,101 @@ case class ArrayContains(left: Expression, right: Expression)
   override def prettyName: String = "array_contains"
 }
+
+
+/**
+ * Slices an array according to the requested start index and length
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(a1, a2) - Subsets array x starting from index start (or starting from the end if start is negative) with the specified length.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3, 4), 2, 2);
+       [2,3]
+      > SELECT _FUNC_(array(1, 2, 3, 4), -2, 2);
+       [3,4]
+  """, since = "2.4.0")
+// scalastyle:on line.size.limit
+case class Slice(x: Expression, start: Expression, length: Expression)
+  extends TernaryExpression with ImplicitCastInputTypes {
+
+  override def dataType: DataType = x.dataType
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType, IntegerType, IntegerType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def children: Seq[Expression] = Seq(x, start, length)
+
+  override def nullSafeEval(xVal: Any, startVal: Any, lengthVal: Any): Any = {
+    val startInt = startVal.asInstanceOf[Int]
+    val lengthInt = lengthVal.asInstanceOf[Int]
+    val arr = xVal.asInstanceOf[ArrayData]
+    val startIndex = if (startInt == 0) {
+      throw new RuntimeException(
+        s"Unexpected value for start in function $prettyName: SQL array indices start at 1.")
+    } else if (startInt < 0) {
+      startInt + arr.numElements()
+    } else {
+      startInt - 1
+    }
+    if (lengthInt < 0) {
+      throw new RuntimeException(s"Unexpected value for length in function $prettyName: " +
+        s"length must be greater than or equal to 0.")
+    }
+    // this can happen if start is negative and its absolute value is greater than the
+    // number of elements in the array
+    if (startIndex < 0) {
+      return new GenericArrayData(Array.empty[AnyRef])
+    }
+    val elementType = x.dataType.asInstanceOf[ArrayType].elementType
+    val data = arr.toArray[AnyRef](elementType)
+    new GenericArrayData(data.slice(startIndex, startIndex + lengthInt))
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val elementType = x.dataType.asInstanceOf[ArrayType].elementType
+    nullSafeCodeGen(ctx, ev, (x, start, length) => {
+      val arrayClass = classOf[GenericArrayData].getName
+      val values = ctx.freshName("values")
+      val i = ctx.freshName("i")
+      val startIdx = ctx.freshName("startIdx")
+      val resLength = ctx.freshName("resLength")
+      val defaultIntValue = CodeGenerator.defaultValue(CodeGenerator.JAVA_INT, false)
+      s"""
+         |${CodeGenerator.JAVA_INT} $startIdx = $defaultIntValue;
+         |${CodeGenerator.JAVA_INT} $resLength = $defaultIntValue;
+         |if ($start == 0) {
+         |  throw new RuntimeException("Unexpected value for start in function $prettyName: "
+         |    + "SQL array indices start at 1.");
+         |} else if ($start < 0) {
+         |  $startIdx = $start + $x.numElements();
+         |} else {
+         |  // arrays in SQL are 1-based instead of 0-based
+         |  $startIdx = $start - 1;
+         |}
+         |if ($length < 0) {
+         |  throw new RuntimeException("Unexpected value for length in function $prettyName: "
+         |    + "length must be greater than or equal to 0.");
+         |} else if ($length > $x.numElements() - $startIdx) {
+         |  $resLength = $x.numElements() - $startIdx;
+         |} else {
+         |  $resLength = $length;
+         |}
+         |Object[] $values;
+         |if ($startIdx < 0) {
+         |  $values = new Object[0];
+         |} else {
+         |  $values = new Object[$resLength];
+         |  for (int $i = 0; $i < $resLength; $i ++) {
+         |    $values[$i] = ${CodeGenerator.getValue(x, elementType, s"$i + $startIdx")};
--- End diff --

You are right, I am not sure why I missed it... maybe I checked outdated code. Sorry, I am fixing it, thanks.
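The 1-based, negative-start semantics under review can be mirrored in a few lines of Python. This is a sketch of the intended behavior, not the Catalyst implementation:

```python
def sql_slice(arr, start, length):
    # 1-based start; negative start counts from the end, as in the proposed SQL function.
    if start == 0:
        raise ValueError("SQL array indices start at 1")
    if length < 0:
        raise ValueError("length must be greater than or equal to 0")
    idx = start + len(arr) if start < 0 else start - 1
    if idx < 0:
        # |start| exceeds the array size: empty result
        return []
    return arr[idx:idx + length]

print(sql_slice([1, 2, 3, 4], 2, 2))   # [2, 3]
print(sql_slice([1, 2, 3, 4], -2, 2))  # [3, 4]
```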
[GitHub] spark issue #21174: [SPARK-24085][SQL] Query returns UnsupportedOperationExc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89918/ Test PASSed.
[GitHub] spark issue #21174: [SPARK-24085][SQL] Query returns UnsupportedOperationExc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21174 Merged build finished. Test PASSed.
[GitHub] spark issue #21174: [SPARK-24085][SQL] Query returns UnsupportedOperationExc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21174 **[Test build #89918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89918/testReport)** for PR 21174 at commit [`e6e9397`](https://github.com/apache/spark/commit/e6e9397b42c1ad39d58d7b1c11f7cb152f019c82). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184662359

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala ---
@@ -279,4 +282,11 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSQLContext {
       checkAnswer(query, expected)
     }
   }
+
+  test("SPARK-24013: unneeded compress can cause performance issues with sorted input") {
+    failAfter(30 seconds) {
+      checkAnswer(sql("select approx_percentile(id, array(0.1)) from range(1000)"),
+        Row(Array(999160)))
--- End diff --

Ok. Yeah, looking at the other tests in this suite it's definitely fine :-).
[GitHub] spark issue #21143: [SPARK-24072][SQL] clearly define pushed filters
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21143 Merged build finished. Test PASSed.
[GitHub] spark issue #21143: [SPARK-24072][SQL] clearly define pushed filters
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21143 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89919/ Test PASSed.
[GitHub] spark issue #21143: [SPARK-24072][SQL] clearly define pushed filters
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21143 **[Test build #89919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89919/testReport)** for PR 21143 at commit [`172dca0`](https://github.com/apache/spark/commit/172dca0f86f041ab9c53041f0715b7e5a682a89a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184658151

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala ---
@@ -279,4 +282,11 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSQLContext {
       checkAnswer(query, expected)
     }
   }
+
+  test("SPARK-24013: unneeded compress can cause performance issues with sorted input") {
+    failAfter(30 seconds) {
+      checkAnswer(sql("select approx_percentile(id, array(0.1)) from range(1000)"),
+        Row(Array(999160)))
--- End diff --

it is not the only place where an exact answer is checked, so I don't think it is an issue; a small change would anyway require changing many test cases' answers. What do you think?
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184656896

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala ---
@@ -279,4 +282,11 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSQLContext {
       checkAnswer(query, expected)
     }
   }
+
+  test("SPARK-24013: unneeded compress can cause performance issues with sorted input") {
+    failAfter(30 seconds) {
+      checkAnswer(sql("select approx_percentile(id, array(0.1)) from range(1000)"),
+        Row(Array(999160)))
--- End diff --

nit: with the approximate nature of the algorithm, couldn't the exact answer get flaky through some small changes in code or config? (like e.g. the split of the range into tasks, and then different merging of partial aggregations producing slightly different results) Maybe just asserting on collect().length == 1 would do?
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2715/ Test PASSed.
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21133 Merged build finished. Test PASSed.
[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21133 **[Test build #89921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89921/testReport)** for PR 21133 at commit [`2fa8da7`](https://github.com/apache/spark/commit/2fa8da744b1726284577deca6c70d184cdae3579).
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184654618 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -238,12 +238,6 @@ object ApproximatePercentile { summaries = summaries.insert(value) // The result of QuantileSummaries.insert is un-compressed isCompressed = false - - // Currently, QuantileSummaries ignores the construction parameter compressThresHold, - // which may cause QuantileSummaries to occupy unbounded memory. We have to hack around here - // to make sure QuantileSummaries doesn't occupy infinite memory. - // TODO: Figure out why QuantileSummaries ignores construction parameter compressThresHold - if (summaries.sampled.length >= compressThresHoldBufferLength) compress() --- End diff -- no problem at all, thanks for checking this :) I addressed your comment on the test. Any more comments?
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184654132 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -238,12 +238,6 @@ object ApproximatePercentile { summaries = summaries.insert(value) // The result of QuantileSummaries.insert is un-compressed isCompressed = false - - // Currently, QuantileSummaries ignores the construction parameter compressThresHold, - // which may cause QuantileSummaries to occupy unbounded memory. We have to hack around here - // to make sure QuantileSummaries doesn't occupy infinite memory. - // TODO: Figure out why QuantileSummaries ignores construction parameter compressThresHold - if (summaries.sampled.length >= compressThresHoldBufferLength) compress() --- End diff -- Sorry, it's my fault for not reading the description attentively :-).
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Merged build finished. Test PASSed.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2714/ Test PASSed.
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184653021 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala --- @@ -279,4 +282,10 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSQLContext { checkAnswer(query, expected) } } + + test("SPARK-24013: unneeded compress can cause performance issues with sorted input") { +failAfter(20 seconds) { + assert(sql("select approx_percentile(id, array(0.1)) from range(1000)").count() == 1) --- End diff -- nice catch, thanks. I started using collect during my tests, then I moved to count, but that was a mistake; I am fixing it, thanks.
[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184652876 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -238,12 +238,6 @@ object ApproximatePercentile { summaries = summaries.insert(value) // The result of QuantileSummaries.insert is un-compressed isCompressed = false - - // Currently, QuantileSummaries ignores the construction parameter compressThresHold, - // which may cause QuantileSummaries to occupy unbounded memory. We have to hack around here - // to make sure QuantileSummaries doesn't occupy infinite memory. - // TODO: Figure out why QuantileSummaries ignores construction parameter compressThresHold - if (summaries.sampled.length >= compressThresHoldBufferLength) compress() --- End diff -- Yes, the TODO was resolved in SPARK-17439. I thought I had clearly stated that in the description, but if it is not clear or you have any suggestion on how to improve the description, I am happy to do so.
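To make the reasoning behind the removal concrete: once the summary's own insert path honors its compression threshold (as fixed in SPARK-17439), a caller-side size check followed by an explicit compress() is redundant. A deliberately simplified sketch — not the real QuantileSummaries implementation; the class name, the halving "compress" step, and the threshold value are illustrative only:

```scala
// Toy stand-in for a quantile summary whose insert honors its own
// compression threshold, so callers never need an external size guard.
final class TinySummary(compressThreshold: Int) {
  private var sampled: Vector[Double] = Vector.empty
  var compressions: Int = 0

  def insert(value: Double): Unit = {
    sampled = sampled :+ value
    // The threshold check lives inside insert itself: the buffer can never
    // grow unboundedly between calls, which is exactly why the removed
    // `if (summaries.sampled.length >= ...) compress()` in the caller
    // no longer serves any purpose.
    if (sampled.length >= compressThreshold) compress()
  }

  private def compress(): Unit = {
    compressions += 1
    // Keep every other sample as a stand-in for the real merge/compress logic.
    sampled = sampled.zipWithIndex.collect { case (v, i) if i % 2 == 0 => v }
  }

  def size: Int = sampled.length
}
```

With this structure, every insert past the threshold triggers compression from the inside, so wrapping each call in an external guard only repeats work — and, as this PR found, an unconditional external compress on every insert can dominate runtime on sorted input.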
[GitHub] spark issue #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) queries i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21177 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89916/ Test PASSed.
[GitHub] spark issue #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) queries i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21177 Merged build finished. Test PASSed.
[GitHub] spark issue #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) queries i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21177 **[Test build #89916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89916/testReport)** for PR 21177 at commit [`0ef1b28`](https://github.com/apache/spark/commit/0ef1b28b72f072ff22f750c6263c7fdc46eec831). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21021: [SPARK-23921][SQL] Add array_sort function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21021 **[Test build #89920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89920/testReport)** for PR 21021 at commit [`175d981`](https://github.com/apache/spark/commit/175d98195fc172655584b0dcf4087014e1377d12).