[GitHub] spark pull request #22141: [SPARK-25154][SQL] Support NOT IN sub-queries ins...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/22141#discussion_r211835129

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---

    @@ -137,13 +137,21 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
           plan: LogicalPlan): (Option[Expression], LogicalPlan) = {
         var newPlan = plan
         val newExprs = exprs.map { e =>
    -      e transformUp {
    +      e transformDown {
             case Exists(sub, conditions, _) =>
               val exists = AttributeReference("exists", BooleanType, nullable = false)()
               // Deduplicate conflicting attributes if any.
               newPlan = dedupJoin(
                 Join(newPlan, sub, ExistenceJoin(exists), conditions.reduceLeftOption(And)))
               exists
    +        case (Not(InSubquery(values, ListQuery(sub, conditions, _, _)))) =>
    +          val exists = AttributeReference("exists", BooleanType, nullable = false)()
    +          val inConditions = values.zip(sub.output).map(EqualTo.tupled)
    +          val nullAwareJoinConds = inConditions.map(c => Or(c, IsNull(c)))

--- End diff --

@liwensun I tried all five queries and they work fine. I verified the results with another database just to make sure. I briefly looked at the plans and they look OK to me. I have also added all five tests in my last commit. Please take a look and let me know if anything is amiss. Thanks a lot.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
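The `Or(c, IsNull(c))` condition in the diff above reflects SQL three-valued logic: a row passes `NOT IN` only when no subquery row compares as true or unknown. The following is a minimal sketch of those semantics only, not of the optimizer rule itself. `NotInSketch`, `sqlEq`, `nullAwareMatch` and `keptByNotIn` are hypothetical names introduced for illustration, and `None` stands in for SQL NULL.

```scala
object NotInSketch {
  // SQL equality under three-valued logic: None models NULL (unknown).
  def sqlEq(a: Option[Int], b: Option[Int]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Analogue of Or(c, IsNull(c)): a subquery row "matches" when the
  // comparison is true or when its result is unknown.
  def nullAwareMatch(a: Option[Int], b: Option[Int]): Boolean =
    sqlEq(a, b).getOrElse(true)

  // A row is kept by a NOT IN filter only if no subquery row matches.
  def keptByNotIn(value: Option[Int], subquery: Seq[Option[Int]]): Boolean =
    !subquery.exists(nullAwareMatch(value, _))
}
```

Under this model a single NULL in the subquery output filters out every outer row, and a NULL outer value is kept only when the subquery is empty.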
[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22141 **[Test build #95086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95086/testReport)** for PR 22141 at commit [`844a3ff`](https://github.com/apache/spark/commit/844a3ff82a688e7398bb130a44750aec78420698).
[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22141 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2426/ Test PASSed.
[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22141 Merged build finished. Test PASSed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95078/ Test FAILed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test FAILed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #95078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95078/testReport)** for PR 17400 at commit [`e288288`](https://github.com/apache/spark/commit/e288288081db14d218277ebacf4094f55ca11d1d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21899#discussion_r211833522

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---

    @@ -118,12 +119,20 @@ case class BroadcastExchangeExec(
               // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
               // will catch this exception and re-throw the wrapped fatal throwable.
               case oe: OutOfMemoryError =>
    -            throw new SparkFatalException(
    +            val sizeMessage = if (dataSize != -1) {
    +              s"${SparkLauncher.DRIVER_MEMORY} by at least the estimated size of the " +
    +                s"relation ($dataSize bytes)"

--- End diff --

Hmmm.. good question. I will check.
[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22183 Can one of the admins verify this patch?
[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21859 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95070/ Test FAILed.
[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21859 Merged build finished. Test FAILed.
[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22165 I'll make one pass of this later today :) Thanks for taking this task!
[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22079 LGTM, thanks!
[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21859 **[Test build #95070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95070/testReport)** for PR 21859 at commit [`6f52f1f`](https://github.com/apache/spark/commit/6f52f1fde3d4df9384e1c99d08b930953843bcde). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive...
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22183

[SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet

## What changes were proposed in this pull request?

This is a backport of https://github.com/apache/spark/pull/22148

Spark SQL returns NULL for a column whose Hive metastore schema and Parquet schema are in different letter cases, regardless of whether spark.sql.caseSensitive is set to true or false. This PR aims to add case-insensitive field resolution for ParquetFileFormat.

* Do case-insensitive resolution only if Spark is in case-insensitive mode.
* Field resolution should fail if there is ambiguity, i.e. more than one field is matched.

## How was this patch tested?

Unit tests added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/seancxmao/spark SPARK-25132-2.3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22183.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22183

commit 28315888eaae5a9c9160ea53eb6eb9a9af712958
Author: seancxmao
Date: 2018-08-21T02:34:23Z

    [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet
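The two resolution rules in this pull request description (match case-insensitively only in case-insensitive mode, and fail on ambiguity) can be illustrated with a small sketch. This is not the code from the patch: `FieldResolutionSketch` and `resolveField` are hypothetical names, field names are plain strings rather than Parquet types, and the exception type and message are assumptions.

```scala
object FieldResolutionSketch {
  // Hypothetical helper: returns the Parquet field name that the requested
  // column resolves to, or None when nothing matches.
  def resolveField(
      requested: String,
      parquetFields: Seq[String],
      caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) {
      // Case-sensitive mode: only an exact match counts.
      parquetFields.find(_ == requested)
    } else {
      // Case-insensitive mode: collect every field that matches ignoring case.
      val matches = parquetFields.filter(_.equalsIgnoreCase(requested))
      if (matches.size > 1) {
        // More than one candidate is ambiguous, so resolution fails.
        throw new RuntimeException(
          s"Ambiguous field '$requested' in case-insensitive mode: " +
            matches.mkString("[", ", ", "]"))
      }
      matches.headOption
    }
  }
}
```

For example, `resolveField("id", Seq("ID", "name"), caseSensitive = false)` resolves to the `ID` field, while the same lookup in case-sensitive mode finds nothing.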
[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22165 @xuanyuanking thanks for helping the test coverage!
[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22165 cc @jiangxb1987 @mengxr
[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22152
[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 cc @jiangxb1987
[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22154
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22152 Thanks! Merged to master.
[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22154 LGTM Thanks! Merged to master.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16478 **[Test build #95085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95085/testReport)** for PR 16478 at commit [`8b83ec7`](https://github.com/apache/spark/commit/8b83ec7242fe44847485c0591c90bc41dbdfea4a).
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2425/ Test PASSed.
[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22154 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95069/ Test PASSed.
[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22154 Merged build finished. Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.
[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22154 **[Test build #95069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95069/testReport)** for PR 22154 at commit [`129c25d`](https://github.com/apache/spark/commit/129c25d689bf4fad8d018b7391ce73937d765a12). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22182 **[Test build #95084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95084/testReport)** for PR 22182 at commit [`319990f`](https://github.com/apache/spark/commit/319990ff60ad7b6fad6fd0cea5cada0b22e3f3c9).
[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22176 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95066/ Test PASSed.
[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22176 Merged build finished. Test PASSed.
[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22182 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2424/ Test PASSed.
[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22182 Merged build finished. Test PASSed.
[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22176 **[Test build #95066 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95066/testReport)** for PR 22176 at commit [`c2223c3`](https://github.com/apache/spark/commit/c2223c3862619d2191ea787f3a2ee3c0d8d67ff2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22182: [SPARK-25184][SS] Fixed race condition in StreamE...
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/22182

[SPARK-25184][SS] Fixed race condition in StreamExecution that caused flaky test in FlatMapGroupsWithState

## What changes were proposed in this pull request?

The race condition that caused the test failure is between two threads:
- The MicroBatchExecution thread, which processes inputs to produce answers and then generates progress events.
- The test thread, which generates some input data, checks the answer, and then verifies that the query generated a progress event.

The synchronization structure between these threads is as follows:
1. The MicroBatchExecution thread, in every batch, does the following in order:
   a. Processes the batch input to generate the answer.
   b. Signals `awaitProgressLockCondition` to wake up threads waiting for progress using `awaitOffset`.
   c. Generates the progress event.
2. The test execution thread:
   a. Calls `awaitOffset` to wait for progress, which waits on `awaitProgressLockCondition`.
   b. As soon as `awaitProgressLockCondition` is signaled, moves on in the test to check the answer.
   c. Finally, verifies the last generated progress event.

What can happen is the following sequence of events: 2a -> 1a -> 1b -> 2b -> 2c -> 1c. In other words, the progress event may be generated after the test tries to verify it.

The solution has two steps:
1. Signal the waiting thread after the progress event has been generated, that is, after `finishTrigger()`.
2. Increase the timeout of `awaitProgressLockCondition.await(100 ms)` to a large value.

The latter ensures that the test thread keeps waiting on `awaitProgressLockCondition` until the MicroBatchExecution thread explicitly signals it. With the existing small timeout of 100 ms, the following sequence can occur:
- The MicroBatchExecution thread updates the committed offsets.
- The test thread waiting on `awaitProgressLockCondition` accidentally times out after 100 ms, finds that the committed offsets have been updated, and therefore returns from `awaitOffset` and moves on to the progress event tests.
- The MicroBatchExecution thread then generates the progress event and signals. But the test thread has already attempted to verify the event and failed.

By increasing the timeout to a large value (e.g., `streamingTimeoutMs = 60 seconds`, similar to `awaitInitialization`), this type of race condition is also avoided.

## How was this patch tested?

Ran locally many times.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-25184

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22182.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22182

commit 319990ff60ad7b6fad6fd0cea5cada0b22e3f3c9
Author: Tathagata Das
Date: 2018-08-22T04:44:59Z

    [SC-12136][SS][HOTFIX] Fixed race condition in StreamExecution that caused flaky test in FlatMapGroupsWithState
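The ordering fix described in this pull request (publish the progress event first, signal afterwards, and let the waiter block with a long timeout) can be sketched with a plain `java.util.concurrent` lock and condition. This is a simplified illustration, not the StreamExecution code: `ProgressSignalSketch`, `runBatch`, and the string-valued progress event are hypothetical, and the sketch updates all shared state while holding the lock, which is stricter than what the description above requires.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.locks.ReentrantLock

object ProgressSignalSketch {
  private val lock = new ReentrantLock()
  private val progressed = lock.newCondition()
  // Both fields are only read and written while holding `lock`.
  private var committedOffset = 0L
  private var lastProgressEvent: Option[String] = None

  // Batch thread: commit the offset, record the progress event, and only
  // then wake up waiters, so a woken waiter always sees the event.
  def runBatch(offset: Long): Unit = {
    lock.lock()
    try {
      committedOffset = offset
      lastProgressEvent = Some(s"progress@$offset")
      progressed.signalAll()
    } finally lock.unlock()
  }

  // Test thread: block until the target offset is committed or the
  // (deliberately long) timeout expires; returns the event it observes.
  def awaitOffset(target: Long, timeoutMs: Long): Option[String] = {
    var remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs)
    lock.lock()
    try {
      while (committedOffset < target && remaining > 0) {
        remaining = progressed.awaitNanos(remaining)
      }
      if (committedOffset >= target) lastProgressEvent else None
    } finally lock.unlock()
  }
}
```

Because the waiter re-checks the committed offset in a loop, a spurious wake-up or an early timeout cannot let it proceed before the event exists.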
[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21977 Build finished. Test FAILed.
[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21977 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95064/ Test FAILed.
[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21977 **[Test build #95064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95064/testReport)** for PR 21977 at commit [`505f2eb`](https://github.com/apache/spark/commit/505f2eb09d60c695a80c7f62bde9a19a0e677357). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22152 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95068/ Test PASSed.
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22152 Merged build finished. Test PASSed.
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22152 **[Test build #95068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95068/testReport)** for PR 22152 at commit [`95ec4d7`](https://github.com/apache/spark/commit/95ec4d7f196a20a0b6461244523a9418021677f6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user sujith71955 commented on the issue: https://github.com/apache/spark/pull/20611 @srowen Fixed the pending comments. Kindly recheck. Thanks
[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...
Github user rezasafi commented on a diff in the pull request: https://github.com/apache/spark/pull/21899#discussion_r211825377

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---

    @@ -118,12 +119,20 @@ case class BroadcastExchangeExec(
               // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
               // will catch this exception and re-throw the wrapped fatal throwable.
               case oe: OutOfMemoryError =>
    -            throw new SparkFatalException(
    +            val sizeMessage = if (dataSize != -1) {
    +              s"${SparkLauncher.DRIVER_MEMORY} by at least the estimated size of the " +
    +                s"relation ($dataSize bytes)"

--- End diff --

How accurate is `dataSize`? I am just worried that it could be misleading.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95067/ Test FAILed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test FAILed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16478 **[Test build #95067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95067/testReport)** for PR 16478 at commit [`8b83ec7`](https://github.com/apache/spark/commit/8b83ec7242fe44847485c0591c90bc41dbdfea4a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21546 **[Test build #95083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95083/testReport)** for PR 21546 at commit [`89d7836`](https://github.com/apache/spark/commit/89d78364d93490b1b301c5ec766e4390bdc0b8a7).
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2422/ Test PASSed.
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2423/ Test PASSed.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21770 **[Test build #95082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95082/testReport)** for PR 21770 at commit [`5a70a7c`](https://github.com/apache/spark/commit/5a70a7cb33c6fbdf114b39fc8f0196b8d01f8582).
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Merged build finished. Test PASSed.
[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21546 Merged build finished. Test PASSed.
[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22171 **[Test build #95081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95081/testReport)** for PR 22171 at commit [`5e2fb96`](https://github.com/apache/spark/commit/5e2fb96b6f28f59fb265dbd909d55ee15778bc71).
[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22171 Merged build finished. Test PASSed.
[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2421/ Test PASSed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #95080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95080/testReport)** for PR 17400 at commit [`c67d11a`](https://github.com/apache/spark/commit/c67d11ab8671e0d07ac1dbcc6308f0866cc403ef).
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test PASSed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2420/ Test PASSed.
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #95079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95079/testReport)** for PR 17400 at commit [`ec3e6d9`](https://github.com/apache/spark/commit/ec3e6d9ad2b3b07a261e5ad6b308fd619f054236). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95079/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #95079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95079/testReport)** for PR 17400 at commit [`ec3e6d9`](https://github.com/apache/spark/commit/ec3e6d9ad2b3b07a261e5ad6b308fd619f054236). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2419/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22175: [MINOR] Added import to fix compilation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22175 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22175: [MINOR] Added import to fix compilation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22175 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95059/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22175: [MINOR] Added import to fix compilation
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22175 **[Test build #95059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95059/testReport)** for PR 22175 at commit [`f40c600`](https://github.com/apache/spark/commit/f40c600bb9630cccbfc8b6e62530c8ee3e4ee6a7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/22180 cc @gatorsmile @vanzin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #95078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95078/testReport)** for PR 17400 at commit [`e288288`](https://github.com/apache/spark/commit/e288288081db14d218277ebacf4094f55ca11d1d). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2418/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22178 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95062/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22152 **[Test build #95077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95077/testReport)** for PR 22152 at commit [`23dfcda`](https://github.com/apache/spark/commit/23dfcda279d0a854b0e64263a109dfd8d0b98b93). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22178 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22152 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2417/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22180 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95073/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22152 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22178 **[Test build #95062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95062/testReport)** for PR 22178 at commit [`533a536`](https://github.com/apache/spark/commit/533a53637ead10a8b8432cfd960947b218088ced). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22180 **[Test build #95073 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95073/testReport)** for PR 22180 at commit [`8f5b67a`](https://github.com/apache/spark/commit/8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22180 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22152#discussion_r211815985

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala ---
    @@ -69,10 +70,17 @@ private[sql] object JsonInferSchema {
           }.reduceOption(typeMerger).toIterator
         }

    -    // Here we get RDD local iterator then fold, instead of calling `RDD.fold` directly, because
    -    // `RDD.fold` will run the fold function in DAGScheduler event loop thread, which may not have
    -    // active SparkSession and `SQLConf.get` may point to the wrong configs.
    -    val rootType = mergedTypesFromPartitions.toLocalIterator.fold(StructType(Nil))(typeMerger)
    +    // Here we manually submit a fold-like Spark job, so that we can set the SQLConf when running
    +    // the fold functions in the scheduler event loop thread.
    +    val existingConf = SQLConf.get
    +    var rootType: DataType = StructType(Nil)
    +    val foldPartition = (iter: Iterator[DataType]) => iter.fold(StructType(Nil))(typeMerger)
    +    val mergeResult = (index: Int, taskResult: DataType) => {
    +      rootType = SQLConf.withExistingConf(existingConf) {
    --- End diff --

    the schema can be very complex (e.g. very wide and deep schema).

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
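The idea in the diff above (capture the caller's conf, then re-install it around a merge function that runs on a different thread) can be sketched without Spark. This is only an illustration under stated assumptions: `Conf`, `withExistingConf`, `merge`, and `foldOnOtherThread` are hypothetical stand-ins written for this example, not Spark's `SQLConf` API, and a "schema" is reduced to a set of field names.

```scala
object ConfCaptureSketch {
  // Hypothetical stand-in for a thread-local session config (NOT Spark's SQLConf).
  final case class Conf(caseSensitive: Boolean)

  private val active = new ThreadLocal[Conf] {
    // Threads that never set a conf see this default.
    override def initialValue(): Conf = Conf(caseSensitive = false)
  }

  def get: Conf = active.get()

  // Run `body` with `conf` installed on the current thread, then restore the old one.
  def withExistingConf[T](conf: Conf)(body: => T): T = {
    val previous = active.get()
    active.set(conf)
    try body finally active.set(previous)
  }

  // Toy "type merger": its result depends on whichever conf is active on this thread.
  def merge(a: Set[String], b: Set[String]): Set[String] =
    if (get.caseSensitive) a ++ b else (a ++ b).map(_.toLowerCase)

  // Fold the per-partition results on another thread (standing in for a scheduler
  // thread). If `captured` is defined, it is re-installed around every merge call.
  def foldOnOtherThread(parts: Seq[Set[String]], captured: Option[Conf]): Set[String] = {
    var result = Set.empty[String]
    val worker = new Thread(new Runnable {
      def run(): Unit = {
        result = parts.foldLeft(Set.empty[String]) { (acc, p) =>
          captured match {
            case Some(c) => withExistingConf(c)(merge(acc, p))
            case None    => merge(acc, p)
          }
        }
      }
    })
    worker.start()
    worker.join() // join() makes the write to `result` visible to the caller
    result
  }
}
```

Without the captured conf, the worker thread falls back to the default and merges `"A"` and `"a"` into one field; with it, the caller's case-sensitive setting is honoured and both fields survive.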
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22009 **[Test build #95076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95076/testReport)** for PR 22009 at commit [`51cda76`](https://github.com/apache/spark/commit/51cda76897353344427aaa666e29be408263eeb1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22181 **[Test build #95075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95075/testReport)** for PR 22181 at commit [`77e108a`](https://github.com/apache/spark/commit/77e108a18788502d05b1b3dacc21c3e72eac4264). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22181 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2415/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2416/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22181 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.col...
GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/22181

    [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection.ExternalAppendOnlyMapSuiteCheck

    ## What changes were proposed in this pull request?

    The `ExternalAppendOnlyMapSuiteCheck` test is flaky because the spill status may be checked before all events posted to the listener bus have been processed. We should check the spill status only after all events are processed.

    ## How was this patch tested?

    Unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-25163

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22181.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22181

commit 77e108a18788502d05b1b3dacc21c3e72eac4264
Author: Liang-Chi Hsieh
Date: 2018-08-22T02:41:49Z

    Check spill status after processing all events.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
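The race described in this pull request can be illustrated with a small, Spark-free sketch: events are handled asynchronously, so a counter read straight after posting may lag behind. The names `AsyncBus`, `drain`, and `spillCountAfterDrain` are hypothetical and exist only for this example; they are not the listener bus API that the actual patch uses.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object ListenerBusSketch {
  // Hypothetical asynchronous event bus: events are handled on a single worker thread.
  final class AsyncBus {
    private val executor = Executors.newSingleThreadExecutor()
    val spills = new AtomicInteger(0)

    def post(spilled: Boolean): Unit = executor.execute(new Runnable {
      def run(): Unit = {
        Thread.sleep(5) // simulate slow event handling
        if (spilled) spills.incrementAndGet()
      }
    })

    // Block until every event posted so far has been handled. The executor is
    // FIFO with one thread, so this no-op task finishes after all earlier events.
    def drain(): Unit = {
      executor.submit(new Runnable { def run(): Unit = () }).get(10, TimeUnit.SECONDS)
    }

    def stop(): Unit = executor.shutdown()
  }

  // Post all events, wait for them to be processed, and only then read the counter.
  def spillCountAfterDrain(events: Seq[Boolean]): Int = {
    val bus = new AsyncBus
    try {
      events.foreach(e => bus.post(e))
      bus.drain()
      bus.spills.get()
    } finally bus.stop()
  }
}
```

Reading `spills` before calling `drain()` could observe any value between zero and the final count, which is the kind of flakiness the pull request describes.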
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22163 **[Test build #95074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95074/testReport)** for PR 22163 at commit [`f91e18c`](https://github.com/apache/spark/commit/f91e18c7d4b8eab53c4983320a0eab0403c37a48). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22163 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22180 **[Test build #95073 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95073/testReport)** for PR 22180 at commit [`8f5b67a`](https://github.com/apache/spark/commit/8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22180 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2413/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22163 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2414/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22180 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22180: [SPARK-25174][YARN]Limit the size of diagnostic m...
GitHub user yaooqinn opened a pull request:

    https://github.com/apache/spark/pull/22180

    [SPARK-25174][YARN]Limit the size of diagnostic message for am to unregister itself from rm

    ## What changes were proposed in this pull request?

    With an older Spark release, a use case generated a huge code-gen file that hit the limitation `Constant pool has grown past JVM limit of 0x`. In this situation the application should fail immediately. However, the diagnostic message sent to the RM was too large, so the ApplicationMaster was suspended and the RM's ZKStateStore crashed. For Spark 2.3 and later, this code-gen limitation has been removed, but other uncaught exceptions that carry oversized error messages could still cause the same problem. This PR aims to cut down the size of the diagnostic message.

    ## How was this patch tested?

    Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yaooqinn/spark SPARK-25174

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22180.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22180

commit 8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d
Author: Kent Yao
Date: 2018-08-22T02:01:28Z

    limit the size for am to unregister itself from rm

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
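As a rough illustration of what "cutting down the diagnostic message size" might look like, here is a minimal sketch. The helper `truncateDiagnostics`, its `maxChars` parameter, and the `...` marker are assumptions made for this example; the message does not show how the patch itself implements the limit.

```scala
object DiagnosticsSketch {
  // Hypothetical helper: cap a diagnostic message at `maxChars` characters,
  // ending with "..." when text was dropped and there is room for the marker.
  def truncateDiagnostics(message: String, maxChars: Int): String = {
    require(maxChars >= 0, "maxChars must be non-negative")
    val marker = "..."
    if (message.length <= maxChars) message
    else if (maxChars <= marker.length) message.take(maxChars)
    else message.take(maxChars - marker.length) + marker
  }
}
```

For instance, a stack trace of several megabytes would be reduced to at most `maxChars` characters before being handed to the resource manager, while short messages pass through unchanged.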
[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/22165 cc @gatorsmile @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211812298

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -110,7 +116,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         }

         val writeField = writeElement(ctx, input.value, index.toString, dt, rowWriter)
    -    if (input.isNull == FalseLiteral) {
    +    if (input.isNull == FalseLiteral || !nullable) {
    --- End diff --

    `input.isNull == FalseLiteral || ` is not needed?

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org