[jira] [Created] (SPARK-38681) Support nested generic case classes
Emil Ejbyfeldt created SPARK-38681:
-----------------------------------

Summary: Support nested generic case classes
Key: SPARK-38681
URL: https://issues.apache.org/jira/browse/SPARK-38681
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.0, 3.4.0
Reporter: Emil Ejbyfeldt

Spark fails to derive schemas when using nested case classes with generic parameters. For example,

{code:java}
case class GenericData[A](
  genericField: A)
{code}

will derive a correct schema for `GenericData[Int]`, but if the classes are nested, e.g.

{code:java}
case class NestedGeneric[T](
  generic: GenericData[T])
{code}

it will fail to derive a schema for `NestedGeneric[Int]`.
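[Editor's illustration, not part of the original report: a minimal sketch of how the failure can be reproduced, assuming a Spark 3.3 spark-shell session. Using `Encoders.product` to force schema derivation is this sketch's choice, not necessarily the reporter's exact repro.]

{code:scala}
import org.apache.spark.sql.{Encoder, Encoders}

case class GenericData[A](genericField: A)
case class NestedGeneric[T](generic: GenericData[T])

// Works: the type argument Int is resolved directly on the top-level class.
val flat: Encoder[GenericData[Int]] = Encoders.product[GenericData[Int]]

// Fails per the report: the type argument is not propagated into the
// nested GenericData[T] field during schema derivation.
val nested: Encoder[NestedGeneric[Int]] = Encoders.product[NestedGeneric[Int]]
{code}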
[jira] [Created] (SPARK-38682) Complex calculations lead to driver OOM
JacobZheng created SPARK-38682:
-------------------------------

Summary: Complex calculations lead to driver OOM
Key: SPARK-38682
URL: https://issues.apache.org/jira/browse/SPARK-38682
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.0
Reporter: JacobZheng

My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.
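[Editor's note, not from the report: a sketch of UI-retention settings that are sometimes lowered while investigating listener memory growth. The config names are real Spark settings; whether they bound the memory held via SQLAppStatusListener for this particular job is an assumption, not a confirmed fix.]

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch only: lower how many stages/executions the UI status listeners
// retain, to cap listener-side memory during the investigation.
val spark = SparkSession.builder()
  .appName("oom-investigation")
  .config("spark.ui.retainedStages", "200")          // default 1000
  .config("spark.sql.ui.retainedExecutions", "100")  // default 1000
  .getOrCreate()
{code}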
[jira] [Updated] (SPARK-38682) Complex calculations lead to driver OOM
[ https://issues.apache.org/jira/browse/SPARK-38682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JacobZheng updated SPARK-38682:
-------------------------------
Attachment: screenshot-1.png

> Complex calculations lead to driver OOM
> ---------------------------------------
>
> Key: SPARK-38682
> URL: https://issues.apache.org/jira/browse/SPARK-38682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: JacobZheng
> Priority: Major
> Attachments: 20220329164645.jpg, screenshot-1.png
>
> My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.
[jira] [Updated] (SPARK-38682) Complex calculations lead to driver OOM
[ https://issues.apache.org/jira/browse/SPARK-38682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JacobZheng updated SPARK-38682:
-------------------------------
Attachment: 20220329164645.jpg

> Complex calculations lead to driver OOM
> ---------------------------------------
>
> Key: SPARK-38682
> URL: https://issues.apache.org/jira/browse/SPARK-38682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: JacobZheng
> Priority: Major
> Attachments: 20220329164645.jpg, screenshot-1.png
>
> My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.
[jira] [Updated] (SPARK-38682) Complex calculations lead to driver OOM
[ https://issues.apache.org/jira/browse/SPARK-38682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JacobZheng updated SPARK-38682:
-------------------------------
Description:
My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.

!screenshot-1.png!

was:
My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.

> Complex calculations lead to driver OOM
> ---------------------------------------
>
> Key: SPARK-38682
> URL: https://issues.apache.org/jira/browse/SPARK-38682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: JacobZheng
> Priority: Major
> Attachments: 20220329164645.jpg, screenshot-1.png
>
> My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.
> !screenshot-1.png!
[jira] [Updated] (SPARK-38682) Complex calculations lead to driver OOM
[ https://issues.apache.org/jira/browse/SPARK-38682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JacobZheng updated SPARK-38682:
-------------------------------
Attachment: (was: 20220329164645.jpg)

> Complex calculations lead to driver OOM
> ---------------------------------------
>
> Key: SPARK-38682
> URL: https://issues.apache.org/jira/browse/SPARK-38682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: JacobZheng
> Priority: Major
> Attachments: screenshot-1.png
>
> My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.
> !screenshot-1.png!
[jira] [Created] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
weixiuli created SPARK-38683:
-----------------------------

Summary: It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
Key: SPARK-38683
URL: https://issues.apache.org/jira/browse/SPARK-38683
Project: Spark
Issue Type: Bug
Components: Shuffle
Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0
Reporter: weixiuli

It is unnecessary to release the ShuffleManagedBufferIterator, ShuffleChunkManagedBufferIterator, or ManagedBufferIterator buffers when the client channel's connection is terminated; skipping this reduces I/O operations and improves performance for the External Shuffle Service.
[jira] [Created] (SPARK-38684) Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
Jungtaek Lim created SPARK-38684:
---------------------------------

Summary: Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
Key: SPARK-38684
URL: https://issues.apache.org/jira/browse/SPARK-38684
Project: Spark
Issue Type: Bug
Components: Structured Streaming
Affects Versions: 3.2.1, 3.3.0
Reporter: Jungtaek Lim

We found that stream-stream join has the same issue as SPARK-38320 on the appended iterators. Since the root cause is the same as SPARK-38320, this is only reproducible with the RocksDB state store provider; but even with the HDFS-backed state store provider, correct behavior is not guaranteed by the interface contract and hence may depend on the JVM vendor, version, etc.

I can easily construct a “data loss” scenario in the state store. The conditions are:

* Use a stream-stream time-interval outer join
** a left outer join has the issue on the left side, a right outer join on the right side, and a full outer join on both sides
* At batch N, produce non-late row(s) on the problematic side
* In the same batch (batch N), some row(s) on the problematic side are evicted by the watermark condition

When these conditions are fulfilled, keyToNumValues goes out of sync between the state and the iterator in the eviction phase. If eviction happens for the grouping key (updating keyToNumValues), the eviction phase “overwrites” keyToNumValues in the state with the value it computes. Since the eviction phase “does not know” about the new rows (keyToNumValues is out of sync), this effectively discards all rows added to the state in batch N.
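[Editor's illustration, not part of the ticket: a sketch of a query shape matching the conditions above — a stream-stream time-interval left outer join, where the left side is the problematic one. Source, column, and interval choices are illustrative.]

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().appName("ss-outer-join-sketch").getOrCreate()

val left = spark.readStream.format("rate").load()
  .selectExpr("value AS leftId", "timestamp AS leftTime")
  .withWatermark("leftTime", "10 seconds")

val right = spark.readStream.format("rate").load()
  .selectExpr("value AS rightId", "timestamp AS rightTime")
  .withWatermark("rightTime", "10 seconds")

// Per the description: with a left outer join, rows appended to the left
// side's state in batch N can be discarded if eviction also runs in batch N.
val joined = left.join(
  right,
  expr("""
    leftId = rightId AND
    rightTime >= leftTime - INTERVAL 5 SECONDS AND
    rightTime <= leftTime + INTERVAL 5 SECONDS
  """),
  "leftOuter")
{code}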
[jira] [Commented] (SPARK-38684) Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
[ https://issues.apache.org/jira/browse/SPARK-38684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513949#comment-17513949 ]

Jungtaek Lim commented on SPARK-38684:
--------------------------------------

Will submit a PR soon.

> Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
> ----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38684
> URL: https://issues.apache.org/jira/browse/SPARK-38684
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Jungtaek Lim
> Priority: Blocker
> Labels: correctness
>
> We found that stream-stream join has the same issue as SPARK-38320 on the appended iterators. Since the root cause is the same as SPARK-38320, this is only reproducible with the RocksDB state store provider; but even with the HDFS-backed state store provider, correct behavior is not guaranteed by the interface contract and hence may depend on the JVM vendor, version, etc.
>
> I can easily construct a “data loss” scenario in the state store. The conditions are:
> * Use a stream-stream time-interval outer join
> ** a left outer join has the issue on the left side, a right outer join on the right side, and a full outer join on both sides
> * At batch N, produce non-late row(s) on the problematic side
> * In the same batch (batch N), some row(s) on the problematic side are evicted by the watermark condition
>
> When these conditions are fulfilled, keyToNumValues goes out of sync between the state and the iterator in the eviction phase. If eviction happens for the grouping key (updating keyToNumValues), the eviction phase “overwrites” keyToNumValues in the state with the value it computes. Since the eviction phase “does not know” about the new rows (keyToNumValues is out of sync), this effectively discards all rows added to the state in batch N.
[jira] [Commented] (SPARK-38561) Add doc for "Customized Kubernetes Schedulers"
[ https://issues.apache.org/jira/browse/SPARK-38561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513962#comment-17513962 ]

Apache Spark commented on SPARK-38561:
--------------------------------------

User 'Yikun' has created a pull request for this issue:
https://github.com/apache/spark/pull/36001

> Add doc for "Customized Kubernetes Schedulers"
> ----------------------------------------------
>
> Key: SPARK-38561
> URL: https://issues.apache.org/jira/browse/SPARK-38561
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, Kubernetes
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Yikun Jiang
> Priority: Major
> Fix For: 3.3.0
[jira] [Assigned] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
[ https://issues.apache.org/jira/browse/SPARK-38683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38683:
------------------------------------
Assignee: Apache Spark

> It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38683
> URL: https://issues.apache.org/jira/browse/SPARK-38683
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
> Reporter: weixiuli
> Assignee: Apache Spark
> Priority: Major
>
> It is unnecessary to release the ShuffleManagedBufferIterator, ShuffleChunkManagedBufferIterator, or ManagedBufferIterator buffers when the client channel's connection is terminated; skipping this reduces I/O operations and improves performance for the External Shuffle Service.
[jira] [Assigned] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
[ https://issues.apache.org/jira/browse/SPARK-38683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38683:
------------------------------------
Assignee: (was: Apache Spark)

> It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38683
> URL: https://issues.apache.org/jira/browse/SPARK-38683
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
> Reporter: weixiuli
> Priority: Major
>
> It is unnecessary to release the ShuffleManagedBufferIterator, ShuffleChunkManagedBufferIterator, or ManagedBufferIterator buffers when the client channel's connection is terminated; skipping this reduces I/O operations and improves performance for the External Shuffle Service.
[jira] [Commented] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
[ https://issues.apache.org/jira/browse/SPARK-38683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513963#comment-17513963 ]

Apache Spark commented on SPARK-38683:
--------------------------------------

User 'weixiuli' has created a pull request for this issue:
https://github.com/apache/spark/pull/36000

> It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38683
> URL: https://issues.apache.org/jira/browse/SPARK-38683
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
> Reporter: weixiuli
> Priority: Major
>
> It is unnecessary to release the ShuffleManagedBufferIterator, ShuffleChunkManagedBufferIterator, or ManagedBufferIterator buffers when the client channel's connection is terminated; skipping this reduces I/O operations and improves performance for the External Shuffle Service.
[jira] [Assigned] (SPARK-38684) Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
[ https://issues.apache.org/jira/browse/SPARK-38684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38684:
------------------------------------
Assignee: (was: Apache Spark)

> Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
> ----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38684
> URL: https://issues.apache.org/jira/browse/SPARK-38684
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Jungtaek Lim
> Priority: Blocker
> Labels: correctness
>
> We found that stream-stream join has the same issue as SPARK-38320 on the appended iterators. Since the root cause is the same as SPARK-38320, this is only reproducible with the RocksDB state store provider; but even with the HDFS-backed state store provider, correct behavior is not guaranteed by the interface contract and hence may depend on the JVM vendor, version, etc.
>
> I can easily construct a “data loss” scenario in the state store. The conditions are:
> * Use a stream-stream time-interval outer join
> ** a left outer join has the issue on the left side, a right outer join on the right side, and a full outer join on both sides
> * At batch N, produce non-late row(s) on the problematic side
> * In the same batch (batch N), some row(s) on the problematic side are evicted by the watermark condition
>
> When these conditions are fulfilled, keyToNumValues goes out of sync between the state and the iterator in the eviction phase. If eviction happens for the grouping key (updating keyToNumValues), the eviction phase “overwrites” keyToNumValues in the state with the value it computes. Since the eviction phase “does not know” about the new rows (keyToNumValues is out of sync), this effectively discards all rows added to the state in batch N.
[jira] [Commented] (SPARK-38684) Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
[ https://issues.apache.org/jira/browse/SPARK-38684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513994#comment-17513994 ]

Apache Spark commented on SPARK-38684:
--------------------------------------

User 'HeartSaVioR' has created a pull request for this issue:
https://github.com/apache/spark/pull/36002

> Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
> ----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38684
> URL: https://issues.apache.org/jira/browse/SPARK-38684
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Jungtaek Lim
> Priority: Blocker
> Labels: correctness
>
> We found that stream-stream join has the same issue as SPARK-38320 on the appended iterators. Since the root cause is the same as SPARK-38320, this is only reproducible with the RocksDB state store provider; but even with the HDFS-backed state store provider, correct behavior is not guaranteed by the interface contract and hence may depend on the JVM vendor, version, etc.
>
> I can easily construct a “data loss” scenario in the state store. The conditions are:
> * Use a stream-stream time-interval outer join
> ** a left outer join has the issue on the left side, a right outer join on the right side, and a full outer join on both sides
> * At batch N, produce non-late row(s) on the problematic side
> * In the same batch (batch N), some row(s) on the problematic side are evicted by the watermark condition
>
> When these conditions are fulfilled, keyToNumValues goes out of sync between the state and the iterator in the eviction phase. If eviction happens for the grouping key (updating keyToNumValues), the eviction phase “overwrites” keyToNumValues in the state with the value it computes. Since the eviction phase “does not know” about the new rows (keyToNumValues is out of sync), this effectively discards all rows added to the state in batch N.
[jira] [Assigned] (SPARK-38684) Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
[ https://issues.apache.org/jira/browse/SPARK-38684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38684:
------------------------------------
Assignee: Apache Spark

> Stream-stream outer join has a possible correctness issue due to weak read consistency on outer iterators
> ----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38684
> URL: https://issues.apache.org/jira/browse/SPARK-38684
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Jungtaek Lim
> Assignee: Apache Spark
> Priority: Blocker
> Labels: correctness
>
> We found that stream-stream join has the same issue as SPARK-38320 on the appended iterators. Since the root cause is the same as SPARK-38320, this is only reproducible with the RocksDB state store provider; but even with the HDFS-backed state store provider, correct behavior is not guaranteed by the interface contract and hence may depend on the JVM vendor, version, etc.
>
> I can easily construct a “data loss” scenario in the state store. The conditions are:
> * Use a stream-stream time-interval outer join
> ** a left outer join has the issue on the left side, a right outer join on the right side, and a full outer join on both sides
> * At batch N, produce non-late row(s) on the problematic side
> * In the same batch (batch N), some row(s) on the problematic side are evicted by the watermark condition
>
> When these conditions are fulfilled, keyToNumValues goes out of sync between the state and the iterator in the eviction phase. If eviction happens for the grouping key (updating keyToNumValues), the eviction phase “overwrites” keyToNumValues in the state with the value it computes. Since the eviction phase “does not know” about the new rows (keyToNumValues is out of sync), this effectively discards all rows added to the state in batch N.
[jira] [Updated] (SPARK-38682) Complex calculations lead to driver OOM
[ https://issues.apache.org/jira/browse/SPARK-38682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JacobZheng updated SPARK-38682:
-------------------------------
Description:
My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more tasks, causing the driver to run out of memory, or whether there is some other reason.

!screenshot-1.png!

was:
My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more stages, causing the driver to run out of memory, or whether there is some other reason.

!screenshot-1.png!

> Complex calculations lead to driver OOM
> ---------------------------------------
>
> Key: SPARK-38682
> URL: https://issues.apache.org/jira/browse/SPARK-38682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: JacobZheng
> Priority: Major
> Attachments: screenshot-1.png
>
> My Spark job works fine in version 3.0.1. After I upgraded to 3.2, the driver hangs at runtime due to OOM. The dump file shows that the stageMetrics in SQLAppStatusListener are taking up a lot of memory. I'm wondering whether this is related to the SPARK-33016 change, or whether the execution plan change has created more tasks, causing the driver to run out of memory, or whether there is some other reason.
> !screenshot-1.png!
[jira] [Resolved] (SPARK-38670) Add offset commit time to streaming query listener
[ https://issues.apache.org/jira/browse/SPARK-38670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved SPARK-38670.
----------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 35985
https://github.com/apache/spark/pull/35985

> Add offset commit time to streaming query listener
> --------------------------------------------------
>
> Key: SPARK-38670
> URL: https://issues.apache.org/jira/browse/SPARK-38670
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.2.1
> Reporter: Boyang Jerry Peng
> Assignee: Boyang Jerry Peng
> Priority: Major
> Fix For: 3.4.0
>
> A major portion of the batch duration is committing offsets at the end of the micro-batch. The timing for this operation is missing from the durationMs metrics. Let's add this metric to have a more complete picture of where the time goes during the processing of a micro-batch.
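[Editor's illustration, not part of the ticket: a sketch of where the new timing would surface for users — the durationMs map reported through StreamingQueryListener. The exact metric key the patch adds is not shown here, since it is not stated in the ticket.]

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

val spark = SparkSession.builder().appName("listener-sketch").getOrCreate()

// durationMs already breaks the batch down into phases (e.g. addBatch,
// getBatch, walCommit); the ticket adds offset-commit timing to this map.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = {}
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {}
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val durations = event.progress.durationMs // java.util.Map[String, java.lang.Long]
    println(s"batch ${event.progress.batchId} durations: $durations")
  }
})
{code}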
[jira] [Assigned] (SPARK-38670) Add offset commit time to streaming query listener
[ https://issues.apache.org/jira/browse/SPARK-38670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim reassigned SPARK-38670:
------------------------------------
Assignee: Boyang Jerry Peng

> Add offset commit time to streaming query listener
> --------------------------------------------------
>
> Key: SPARK-38670
> URL: https://issues.apache.org/jira/browse/SPARK-38670
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.2.1
> Reporter: Boyang Jerry Peng
> Assignee: Boyang Jerry Peng
> Priority: Major
>
> A major portion of the batch duration is committing offsets at the end of the micro-batch. The timing for this operation is missing from the durationMs metrics. Let's add this metric to have a more complete picture of where the time goes during the processing of a micro-batch.
[jira] [Created] (SPARK-38685) Improve the implementation of percentile_cont
jiaan.geng created SPARK-38685:
-------------------------------

Summary: Improve the implementation of percentile_cont
Key: SPARK-38685
URL: https://issues.apache.org/jira/browse/SPARK-38685
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.3.0
Reporter: jiaan.geng
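[Editor's illustration, not from the ticket: a usage sketch of the function being improved. `percentile_cont` with the ANSI WITHIN GROUP syntax was added in the Spark 3.3 line; table and column names are illustrative.]

{code:scala}
// Continuous percentile: the median of salary, per the WITHIN GROUP syntax.
spark.sql("""
  SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
  FROM employees
""").show()
{code}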
[jira] [Assigned] (SPARK-38685) Improve the implementation of percentile_cont
[ https://issues.apache.org/jira/browse/SPARK-38685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38685:
------------------------------------
Assignee: Apache Spark

> Improve the implementation of percentile_cont
> ----------------------------------------------
>
> Key: SPARK-38685
> URL: https://issues.apache.org/jira/browse/SPARK-38685
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: jiaan.geng
> Assignee: Apache Spark
> Priority: Major
[jira] [Assigned] (SPARK-38685) Improve the implementation of percentile_cont
[ https://issues.apache.org/jira/browse/SPARK-38685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38685:
------------------------------------
Assignee: (was: Apache Spark)

> Improve the implementation of percentile_cont
> ----------------------------------------------
>
> Key: SPARK-38685
> URL: https://issues.apache.org/jira/browse/SPARK-38685
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: jiaan.geng
> Priority: Major
[jira] [Commented] (SPARK-38685) Improve the implementation of percentile_cont
[ https://issues.apache.org/jira/browse/SPARK-38685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514062#comment-17514062 ]

Apache Spark commented on SPARK-38685:
--------------------------------------

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/36003

> Improve the implementation of percentile_cont
> ----------------------------------------------
>
> Key: SPARK-38685
> URL: https://issues.apache.org/jira/browse/SPARK-38685
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: jiaan.geng
> Priority: Major
[jira] [Commented] (SPARK-38681) Support nested generic case classes
[ https://issues.apache.org/jira/browse/SPARK-38681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514064#comment-17514064 ]

Apache Spark commented on SPARK-38681:
--------------------------------------

User 'eejbyfeldt' has created a pull request for this issue:
https://github.com/apache/spark/pull/36004

> Support nested generic case classes
> -----------------------------------
>
> Key: SPARK-38681
> URL: https://issues.apache.org/jira/browse/SPARK-38681
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0, 3.4.0
> Reporter: Emil Ejbyfeldt
> Priority: Major
>
> Spark fails to derive schemas when using nested case classes with generic parameters. For example,
> {code:java}
> case class GenericData[A](
>   genericField: A)
> {code}
> will derive a correct schema for `GenericData[Int]`, but if the classes are nested, e.g.
> {code:java}
> case class NestedGeneric[T](
>   generic: GenericData[T])
> {code}
> it will fail to derive a schema for `NestedGeneric[Int]`.
[jira] [Assigned] (SPARK-38681) Support nested generic case classes
[ https://issues.apache.org/jira/browse/SPARK-38681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38681:
------------------------------------
Assignee: (was: Apache Spark)

> Support nested generic case classes
> -----------------------------------
>
> Key: SPARK-38681
> URL: https://issues.apache.org/jira/browse/SPARK-38681
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0, 3.4.0
> Reporter: Emil Ejbyfeldt
> Priority: Major
>
> Spark fails to derive schemas when using nested case classes with generic parameters. For example,
> {code:java}
> case class GenericData[A](
>   genericField: A)
> {code}
> will derive a correct schema for `GenericData[Int]`, but if the classes are nested, e.g.
> {code:java}
> case class NestedGeneric[T](
>   generic: GenericData[T])
> {code}
> it will fail to derive a schema for `NestedGeneric[Int]`.
[jira] [Assigned] (SPARK-38681) Support nested generic case classes
[ https://issues.apache.org/jira/browse/SPARK-38681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38681:
------------------------------------
Assignee: Apache Spark

> Support nested generic case classes
> -----------------------------------
>
> Key: SPARK-38681
> URL: https://issues.apache.org/jira/browse/SPARK-38681
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0, 3.4.0
> Reporter: Emil Ejbyfeldt
> Assignee: Apache Spark
> Priority: Major
>
> Spark fails to derive schemas when using nested case classes with generic parameters. For example,
> {code:java}
> case class GenericData[A](
>   genericField: A)
> {code}
> will derive a correct schema for `GenericData[Int]`, but if the classes are nested, e.g.
> {code:java}
> case class NestedGeneric[T](
>   generic: GenericData[T])
> {code}
> it will fail to derive a schema for `NestedGeneric[Int]`.
[jira] [Resolved] (SPARK-38674) Remove useless deduplication in SubqueryBroadcastExec
[ https://issues.apache.org/jira/browse/SPARK-38674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-38674.
---------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 35989
https://github.com/apache/spark/pull/35989

> Remove useless deduplication in SubqueryBroadcastExec
> ------------------------------------------------------
>
> Key: SPARK-38674
> URL: https://issues.apache.org/jira/browse/SPARK-38674
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Assignee: Yuming Wang
> Priority: Major
> Fix For: 3.4.0
>
> Distinct performance: https://github.com/apache/spark/pull/29642#discussion_r511606498
[jira] [Assigned] (SPARK-38674) Remove useless deduplication in SubqueryBroadcastExec
[ https://issues.apache.org/jira/browse/SPARK-38674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-38674:
-----------------------------------
Assignee: Yuming Wang

> Remove useless deduplication in SubqueryBroadcastExec
> ------------------------------------------------------
>
> Key: SPARK-38674
> URL: https://issues.apache.org/jira/browse/SPARK-38674
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Assignee: Yuming Wang
> Priority: Major
>
> Distinct performance: https://github.com/apache/spark/pull/29642#discussion_r511606498
[jira] [Assigned] (SPARK-38506) Push partial aggregation through join
[ https://issues.apache.org/jira/browse/SPARK-38506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38506:
------------------------------------
Assignee: Apache Spark

> Push partial aggregation through join
> --------------------------------------
>
> Key: SPARK-38506
> URL: https://issues.apache.org/jira/browse/SPARK-38506
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Assignee: Apache Spark
> Priority: Major
>
> Please see https://docs.teradata.com/r/Teradata-VantageTM-SQL-Request-and-Transaction-Processing/March-2019/Join-Planning-and-Optimization/Partial-GROUP-BY-Block-Optimization for more details.
[jira] [Assigned] (SPARK-38506) Push partial aggregation through join
[ https://issues.apache.org/jira/browse/SPARK-38506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38506:
------------------------------------
Assignee: (was: Apache Spark)

> Push partial aggregation through join
> --------------------------------------
>
> Key: SPARK-38506
> URL: https://issues.apache.org/jira/browse/SPARK-38506
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Priority: Major
>
> Please see https://docs.teradata.com/r/Teradata-VantageTM-SQL-Request-and-Transaction-Processing/March-2019/Join-Planning-and-Optimization/Partial-GROUP-BY-Block-Optimization for more details.
[jira] [Commented] (SPARK-38506) Push partial aggregation through join
[ https://issues.apache.org/jira/browse/SPARK-38506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514133#comment-17514133 ]

Apache Spark commented on SPARK-38506:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/36005

> Push partial aggregation through join
> --------------------------------------
>
> Key: SPARK-38506
> URL: https://issues.apache.org/jira/browse/SPARK-38506
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Priority: Major
>
> Please see https://docs.teradata.com/r/Teradata-VantageTM-SQL-Request-and-Transaction-Processing/March-2019/Join-Planning-and-Optimization/Partial-GROUP-BY-Block-Optimization for more details.
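[Editor's illustration, not from the ticket: a hedged sketch of the general shape of the optimization described in the linked Teradata docs — a partial GROUP BY is evaluated below the join to shrink the join input, with the final aggregation above it. The actual Spark rule and its applicability conditions live in the linked PR; table and column names are illustrative.]

{code:scala}
// Original query:
//   SELECT t1.k, SUM(t2.v) FROM t1 JOIN t2 ON t1.k = t2.k GROUP BY t1.k
// Equivalent form with a partial aggregation pushed below the join:
spark.sql("""
  SELECT t1.k, SUM(p.sv)
  FROM t1
  JOIN (SELECT k, SUM(v) AS sv FROM t2 GROUP BY k) p
    ON t1.k = p.k
  GROUP BY t1.k
""")
{code}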
[jira] [Resolved] (SPARK-38562) Add doc for Volcano scheduler
[ https://issues.apache.org/jira/browse/SPARK-38562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-38562.
-----------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 35870
https://github.com/apache/spark/pull/35870

> Add doc for Volcano scheduler
> ------------------------------
>
> Key: SPARK-38562
> URL: https://issues.apache.org/jira/browse/SPARK-38562
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, Kubernetes
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Yikun Jiang
> Priority: Major
> Fix For: 3.4.0
[jira] [Assigned] (SPARK-38562) Add doc for Volcano scheduler
[ https://issues.apache.org/jira/browse/SPARK-38562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-38562:
-------------------------------------
Assignee: Yikun Jiang

> Add doc for Volcano scheduler
> ------------------------------
>
> Key: SPARK-38562
> URL: https://issues.apache.org/jira/browse/SPARK-38562
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, Kubernetes
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Yikun Jiang
> Priority: Major
[jira] [Updated] (SPARK-38562) Add doc for Volcano scheduler
[ https://issues.apache.org/jira/browse/SPARK-38562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-38562:
----------------------------------
Fix Version/s: 3.3.0
               (was: 3.4.0)

> Add doc for Volcano scheduler
> ------------------------------
>
> Key: SPARK-38562
> URL: https://issues.apache.org/jira/browse/SPARK-38562
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, Kubernetes
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Yikun Jiang
> Priority: Major
> Fix For: 3.3.0
[jira] [Created] (SPARK-38686) Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
Xinrong Meng created SPARK-38686:
---------------------------------

Summary: Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
Key: SPARK-38686
URL: https://issues.apache.org/jira/browse/SPARK-38686
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.4.0
Reporter: Xinrong Meng

Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`.
[jira] [Resolved] (SPARK-37982) Use error classes in the execution errors related to unsupported input type
[ https://issues.apache.org/jira/browse/SPARK-37982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-37982.
------------------------------
Fix Version/s: 3.4.0
               (was: 3.3.0)
Resolution: Fixed

Issue resolved by pull request 35274
https://github.com/apache/spark/pull/35274

> Use error classes in the execution errors related to unsupported input type
> ----------------------------------------------------------------------------
>
> Key: SPARK-37982
> URL: https://issues.apache.org/jira/browse/SPARK-37982
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: leesf
> Assignee: leesf
> Priority: Major
> Fix For: 3.4.0
[jira] [Assigned] (SPARK-37982) Use error classes in the execution errors related to unsupported input type
[ https://issues.apache.org/jira/browse/SPARK-37982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk reassigned SPARK-37982:
--------------------------------
Assignee: leesf

> Use error classes in the execution errors related to unsupported input type
> ----------------------------------------------------------------------------
>
> Key: SPARK-37982
> URL: https://issues.apache.org/jira/browse/SPARK-37982
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: leesf
> Assignee: leesf
> Priority: Major
> Fix For: 3.3.0
[jira] [Commented] (SPARK-38686) Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
[ https://issues.apache.org/jira/browse/SPARK-38686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514275#comment-17514275 ]

Apache Spark commented on SPARK-38686:
--------------------------------------

User 'xinrong-databricks' has created a pull request for this issue:
https://github.com/apache/spark/pull/36006

> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
> -------------------------------------------------------------------
>
> Key: SPARK-38686
> URL: https://issues.apache.org/jira/browse/SPARK-38686
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Xinrong Meng
> Priority: Major
>
> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`.
[jira] [Assigned] (SPARK-38686) Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
[ https://issues.apache.org/jira/browse/SPARK-38686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38686:
------------------------------------
Assignee: (was: Apache Spark)

> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
> -------------------------------------------------------------------
>
> Key: SPARK-38686
> URL: https://issues.apache.org/jira/browse/SPARK-38686
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Xinrong Meng
> Priority: Major
>
> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`.
[jira] [Assigned] (SPARK-38686) Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
[ https://issues.apache.org/jira/browse/SPARK-38686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38686:
------------------------------------
Assignee: Apache Spark

> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
> -------------------------------------------------------------------
>
> Key: SPARK-38686
> URL: https://issues.apache.org/jira/browse/SPARK-38686
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Xinrong Meng
> Assignee: Apache Spark
> Priority: Major
>
> Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`.
[jira] [Updated] (SPARK-38687) Use error classes in the compilation errors of generators
[ https://issues.apache.org/jira/browse/SPARK-38687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38687:
-----------------------------
Affects Version/s: 3.4.0
                   (was: 3.3.0)

> Use error classes in the compilation errors of generators
> ----------------------------------------------------------
>
> Key: SPARK-38687
> URL: https://issues.apache.org/jira/browse/SPARK-38687
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryCompilationErrors:
> * windowSpecificationNotDefinedError
> * windowAggregateFunctionWithFilterNotSupportedError
> * windowFunctionInsideAggregateFunctionNotAllowedError
> * expressionWithoutWindowExpressionError
> * expressionWithMultiWindowExpressionsError
> * windowFunctionNotAllowedError
> * cannotSpecifyWindowFrameError
> * windowFrameNotMatchRequiredFrameError
> * windowFunctionWithWindowFrameNotOrderedError
> * multiTimeWindowExpressionsNotSupportedError
> * sessionWindowGapDurationDataTypeError
> * invalidLiteralForWindowDurationError
> * emptyWindowExpressionError
> * foundDifferentWindowFunctionTypeError
> to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
> *Feel free to split this into sub-tasks.*
[jira] [Created] (SPARK-38687) Use error classes in the compilation errors of generators
Max Gekk created SPARK-38687:
-----------------------------

Summary: Use error classes in the compilation errors of generators
Key: SPARK-38687
URL: https://issues.apache.org/jira/browse/SPARK-38687
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: Max Gekk

Migrate the following errors in QueryCompilationErrors:
* windowSpecificationNotDefinedError
* windowAggregateFunctionWithFilterNotSupportedError
* windowFunctionInsideAggregateFunctionNotAllowedError
* expressionWithoutWindowExpressionError
* expressionWithMultiWindowExpressionsError
* windowFunctionNotAllowedError
* cannotSpecifyWindowFrameError
* windowFrameNotMatchRequiredFrameError
* windowFunctionWithWindowFrameNotOrderedError
* multiTimeWindowExpressionsNotSupportedError
* sessionWindowGapDurationDataTypeError
* invalidLiteralForWindowDurationError
* emptyWindowExpressionError
* foundDifferentWindowFunctionTypeError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.

*Feel free to split this into sub-tasks.*
[jira] [Updated] (SPARK-38687) Use error classes in the compilation errors of generators
[ https://issues.apache.org/jira/browse/SPARK-38687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38687:
-----------------------------
Description:
Migrate the following errors in QueryCompilationErrors:
* nestedGeneratorError
* moreThanOneGeneratorError
* generatorOutsideSelectError
* generatorNotExpectedError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.

was:
Migrate the following errors in QueryCompilationErrors:
* windowSpecificationNotDefinedError
* windowAggregateFunctionWithFilterNotSupportedError
* windowFunctionInsideAggregateFunctionNotAllowedError
* expressionWithoutWindowExpressionError
* expressionWithMultiWindowExpressionsError
* windowFunctionNotAllowedError
* cannotSpecifyWindowFrameError
* windowFrameNotMatchRequiredFrameError
* windowFunctionWithWindowFrameNotOrderedError
* multiTimeWindowExpressionsNotSupportedError
* sessionWindowGapDurationDataTypeError
* invalidLiteralForWindowDurationError
* emptyWindowExpressionError
* foundDifferentWindowFunctionTypeError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
*Feel free to split this into sub-tasks.*

> Use error classes in the compilation errors of generators
> ----------------------------------------------------------
>
> Key: SPARK-38687
> URL: https://issues.apache.org/jira/browse/SPARK-38687
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryCompilationErrors:
> * nestedGeneratorError
> * moreThanOneGeneratorError
> * generatorOutsideSelectError
> * generatorNotExpectedError
> to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
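[Editor's illustration, not from the ticket: a sketch of a query that currently hits nestedGeneratorError; the exact message text after the error-class migration is up to the patch.]

{code:scala}
// Nesting one generator inside another is not supported, so analysis fails
// with (roughly) "Generators are not supported when it's nested in
// expressions"; after the migration this should carry an error class.
spark.sql("SELECT explode(explode(array(array(1, 2))))").show()
{code}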
[jira] [Updated] (SPARK-38688) Use error classes in the compilation errors of deserializer
[ https://issues.apache.org/jira/browse/SPARK-38688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38688:
-----------------------------
Description:
Migrate the following errors in QueryCompilationErrors:
* dataTypeMismatchForDeserializerError
* fieldNumberMismatchForDeserializerError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.

was:
Migrate the following errors in QueryCompilationErrors:
* nestedGeneratorError
* moreThanOneGeneratorError
* generatorOutsideSelectError
* generatorNotExpectedError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.

> Use error classes in the compilation errors of deserializer
> ------------------------------------------------------------
>
> Key: SPARK-38688
> URL: https://issues.apache.org/jira/browse/SPARK-38688
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryCompilationErrors:
> * dataTypeMismatchForDeserializerError
> * fieldNumberMismatchForDeserializerError
> to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
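[Editor's illustration, not from the ticket: sketches of inputs believed to hit the two deserializer errors; the exact triggers and post-migration messages are assumptions, and a `spark` session with implicits imported is assumed.]

{code:scala}
import spark.implicits._

// fieldNumberMismatchForDeserializerError: one remaining column cannot be
// mapped to a two-field tuple, so the field count does not line up.
Seq(("a", 1)).toDF("c1", "c2").drop("c2").as[(String, Int)]

// dataTypeMismatchForDeserializerError (assumed trigger): the deserializer
// needs an array field but the column is an int.
Seq(1).toDF("value").as[Array[Int]]
{code}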
[jira] [Created] (SPARK-38688) Use error classes in the compilation errors of deserializer
Max Gekk created SPARK-38688:
-----------------------------

Summary: Use error classes in the compilation errors of deserializer
Key: SPARK-38688
URL: https://issues.apache.org/jira/browse/SPARK-38688
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Max Gekk

Migrate the following errors in QueryCompilationErrors:
* nestedGeneratorError
* moreThanOneGeneratorError
* generatorOutsideSelectError
* generatorNotExpectedError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
[jira] [Updated] (SPARK-38689) Use error classes in the compilation errors of not allowed DESC PARTITION
[ https://issues.apache.org/jira/browse/SPARK-38689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38689:
-----------------------------
Affects Version/s: 3.4.0
                   (was: 3.3.0)

> Use error classes in the compilation errors of not allowed DESC PARTITION
> --------------------------------------------------------------------------
>
> Key: SPARK-38689
> URL: https://issues.apache.org/jira/browse/SPARK-38689
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryCompilationErrors:
> * unsupportedIfNotExistsError
> * nonPartitionColError
> * missingStaticPartitionColumn
> * alterV2TableSetLocationWithPartitionNotSupportedError
> * invalidPartitionSpecError
> * partitionNotSpecifyLocationUriError
> * describeDoesNotSupportPartitionForV2TablesError
> * tableDoesNotSupportPartitionManagementError
> * tableDoesNotSupportAtomicPartitionManagementError
> * alterTableRecoverPartitionsNotSupportedForV2TablesError
> * partitionColumnNotSpecifiedError
> * invalidPartitionColumnError
> * multiplePartitionColumnValuesSpecifiedError
> * cannotUseDataTypeForPartitionColumnError
> * cannotUseAllColumnsForPartitionColumnsError
> * partitionColumnNotFoundInSchemaError
> * mismatchedTablePartitionColumnError
> to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
[jira] [Created] (SPARK-38689) Use error classes in the compilation errors of not allowed DESC PARTITION
Max Gekk created SPARK-38689:
-----------------------------

Summary: Use error classes in the compilation errors of not allowed DESC PARTITION
Key: SPARK-38689
URL: https://issues.apache.org/jira/browse/SPARK-38689
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: Max Gekk

Migrate the following errors in QueryCompilationErrors:
* unsupportedIfNotExistsError
* nonPartitionColError
* missingStaticPartitionColumn
* alterV2TableSetLocationWithPartitionNotSupportedError
* invalidPartitionSpecError
* partitionNotSpecifyLocationUriError
* describeDoesNotSupportPartitionForV2TablesError
* tableDoesNotSupportPartitionManagementError
* tableDoesNotSupportAtomicPartitionManagementError
* alterTableRecoverPartitionsNotSupportedForV2TablesError
* partitionColumnNotSpecifiedError
* invalidPartitionColumnError
* multiplePartitionColumnValuesSpecifiedError
* cannotUseDataTypeForPartitionColumnError
* cannotUseAllColumnsForPartitionColumnsError
* partitionColumnNotFoundInSchemaError
* mismatchedTablePartitionColumnError
to use error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite.
[jira] [Updated] (SPARK-38689) Use error classes in the compilation errors of not allowed DESC PARTITION
[ https://issues.apache.org/jira/browse/SPARK-38689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38689: - Description: Migrate the following errors in QueryCompilationErrors: * descPartitionNotAllowedOnTempView * descPartitionNotAllowedOnView onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. was: Migrate the following errors in QueryCompilationErrors: * unsupportedIfNotExistsError * nonPartitionColError * missingStaticPartitionColumn * alterV2TableSetLocationWithPartitionNotSupportedError * invalidPartitionSpecError * partitionNotSpecifyLocationUriError * describeDoesNotSupportPartitionForV2TablesError * tableDoesNotSupportPartitionManagementError * tableDoesNotSupportAtomicPartitionManagementError * alterTableRecoverPartitionsNotSupportedForV2TablesError * partitionColumnNotSpecifiedError * invalidPartitionColumnError * multiplePartitionColumnValuesSpecifiedError * cannotUseDataTypeForPartitionColumnError * cannotUseAllColumnsForPartitionColumnsError * partitionColumnNotFoundInSchemaError * mismatchedTablePartitionColumnError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. > Use error classes in the compilation errors of not allowed DESC PARTITION > - > > Key: SPARK-38689 > URL: https://issues.apache.org/jira/browse/SPARK-38689 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * descPartitionNotAllowedOnTempView > * descPartitionNotAllowedOnView > onto error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38689) Use error classes in the compilation errors of not allowed DESC PARTITION
[ https://issues.apache.org/jira/browse/SPARK-38689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38689: - Description: Migrate the following errors in QueryCompilationErrors: * descPartitionNotAllowedOnTempView * descPartitionNotAllowedOnView * descPartitionNotAllowedOnViewError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. was: Migrate the following errors in QueryCompilationErrors: * descPartitionNotAllowedOnTempView * descPartitionNotAllowedOnView onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. > Use error classes in the compilation errors of not allowed DESC PARTITION > - > > Key: SPARK-38689 > URL: https://issues.apache.org/jira/browse/SPARK-38689 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * descPartitionNotAllowedOnTempView > * descPartitionNotAllowedOnView > * descPartitionNotAllowedOnViewError > onto error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38690) Use error classes in the compilation errors of SHOW CREATE TABLE
Max Gekk created SPARK-38690: Summary: Use error classes in the compilation errors of SHOW CREATE TABLE Key: SPARK-38690 URL: https://issues.apache.org/jira/browse/SPARK-38690 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Migrate the following errors in QueryCompilationErrors: * descPartitionNotAllowedOnTempView * descPartitionNotAllowedOnView * descPartitionNotAllowedOnViewError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38690) Use error classes in the compilation errors of SHOW CREATE TABLE
[ https://issues.apache.org/jira/browse/SPARK-38690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38690: - Description: Migrate the following errors in QueryCompilationErrors: * showCreateTableAsSerdeNotSupportedForV2TablesError * showCreateTableNotSupportedOnTempView * showCreateTableFailToExecuteUnsupportedFeatureError * showCreateTableNotSupportTransactionalHiveTableError * showCreateTableFailToExecuteUnsupportedConfError * showCreateTableAsSerdeNotAllowedOnSparkDataSourceTableError * showCreateTableOrViewFailToExecuteUnsupportedFeatureError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. was: Migrate the following errors in QueryCompilationErrors: * descPartitionNotAllowedOnTempView * descPartitionNotAllowedOnView * descPartitionNotAllowedOnViewError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. > Use error classes in the compilation errors of SHOW CREATE TABLE > > > Key: SPARK-38690 > URL: https://issues.apache.org/jira/browse/SPARK-38690 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * showCreateTableAsSerdeNotSupportedForV2TablesError > * showCreateTableNotSupportedOnTempView > * showCreateTableFailToExecuteUnsupportedFeatureError > * showCreateTableNotSupportTransactionalHiveTableError > * showCreateTableFailToExecuteUnsupportedConfError > * showCreateTableAsSerdeNotAllowedOnSparkDataSourceTableError > * showCreateTableOrViewFailToExecuteUnsupportedFeatureError > onto error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38691) Use error classes in the compilation errors of column/attr resolving
Max Gekk created SPARK-38691: Summary: Use error classes in the compilation errors of column/attr resolving Key: SPARK-38691 URL: https://issues.apache.org/jira/browse/SPARK-38691 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Migrate the following errors in QueryCompilationErrors: * showCreateTableAsSerdeNotSupportedForV2TablesError * showCreateTableNotSupportedOnTempView * showCreateTableFailToExecuteUnsupportedFeatureError * showCreateTableNotSupportTransactionalHiveTableError * showCreateTableFailToExecuteUnsupportedConfError * showCreateTableAsSerdeNotAllowedOnSparkDataSourceTableError * showCreateTableOrViewFailToExecuteUnsupportedFeatureError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38691) Use error classes in the compilation errors of column/attr resolving
[ https://issues.apache.org/jira/browse/SPARK-38691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38691: - Description: Migrate the following errors in QueryCompilationErrors: * cannotResolveUserSpecifiedColumnsError * cannotResolveStarExpandGivenInputColumnsError * cannotResolveAttributeError * cannotResolveColumnGivenInputColumnsError * cannotResolveColumnNameAmongAttributesError * cannotResolveColumnNameAmongFieldsError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. was: Migrate the following errors in QueryCompilationErrors: * showCreateTableAsSerdeNotSupportedForV2TablesError * showCreateTableNotSupportedOnTempView * showCreateTableFailToExecuteUnsupportedFeatureError * showCreateTableNotSupportTransactionalHiveTableError * showCreateTableFailToExecuteUnsupportedConfError * showCreateTableAsSerdeNotAllowedOnSparkDataSourceTableError * showCreateTableOrViewFailToExecuteUnsupportedFeatureError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. > Use error classes in the compilation errors of column/attr resolving > > > Key: SPARK-38691 > URL: https://issues.apache.org/jira/browse/SPARK-38691 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * cannotResolveUserSpecifiedColumnsError > * cannotResolveStarExpandGivenInputColumnsError > * cannotResolveAttributeError > * cannotResolveColumnGivenInputColumnsError > * cannotResolveColumnNameAmongAttributesError > * cannotResolveColumnNameAmongFieldsError > onto error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38692) Use error classes in the compilation errors of function args
Max Gekk created SPARK-38692: Summary: Use error classes in the compilation errors of function args Key: SPARK-38692 URL: https://issues.apache.org/jira/browse/SPARK-38692 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Migrate the following errors in QueryCompilationErrors: * cannotResolveUserSpecifiedColumnsError * cannotResolveStarExpandGivenInputColumnsError * cannotResolveAttributeError * cannotResolveColumnGivenInputColumnsError * cannotResolveColumnNameAmongAttributesError * cannotResolveColumnNameAmongFieldsError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38692) Use error classes in the compilation errors of function args
[ https://issues.apache.org/jira/browse/SPARK-38692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38692: - Description: Migrate the following errors in QueryCompilationErrors: * invalidFunctionArgumentsError * invalidFunctionArgumentNumberError * functionAcceptsOnlyOneArgumentError * secondArgumentNotDoubleLiteralError * functionCannotProcessInputError * v2FunctionInvalidInputTypeLengthError * secondArgumentInFunctionIsNotBooleanLiteralError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. was: Migrate the following errors in QueryCompilationErrors: * cannotResolveUserSpecifiedColumnsError * cannotResolveStarExpandGivenInputColumnsError * cannotResolveAttributeError * cannotResolveColumnGivenInputColumnsError * cannotResolveColumnNameAmongAttributesError * cannotResolveColumnNameAmongFieldsError onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryCompilationErrorsSuite. > Use error classes in the compilation errors of function args > > > Key: SPARK-38692 > URL: https://issues.apache.org/jira/browse/SPARK-38692 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * invalidFunctionArgumentsError > * invalidFunctionArgumentNumberError > * functionAcceptsOnlyOneArgumentError > * secondArgumentNotDoubleLiteralError > * functionCannotProcessInputError > * v2FunctionInvalidInputTypeLengthError > * secondArgumentInFunctionIsNotBooleanLiteralError > onto error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
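Each ticket in this series also asks for a test per migrated error in QueryCompilationErrorsSuite. A rough sketch of what one such test can look like, assuming the suite's usual `SharedSparkSession` helpers; the SQL statement and the error class name are illustrative only, not the tickets' actual choices:

{code:scala}
import org.apache.spark.sql.AnalysisException

test("nested generators are reported with an error class") {
  // Nesting explode() inside explode() trips nestedGeneratorError at analysis time.
  val e = intercept[AnalysisException] {
    sql("SELECT explode(explode(array(array(1, 2))))")
  }
  // After migration the exception is a SparkThrowable, so the assertion can
  // target the error class instead of brittle message text.
  assert(e.getErrorClass === "UNSUPPORTED_GENERATOR") // hypothetical class name
}
{code}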
[jira] [Created] (SPARK-38693) Spark does not use SessionManager
Brad Solomon created SPARK-38693: Summary: Spark does not use SessionManager Key: SPARK-38693 URL: https://issues.apache.org/jira/browse/SPARK-38693 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 3.2.1 Reporter: Brad Solomon Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [`org.keycloak.adapters.servlet.KeycloakOIDCFilter`](https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter) as the `spark.ui.filters` class. Sample logs: ``` spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager ``` Configuration: ``` spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json ``` This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38693) Spark does not use SessionManager
[ https://issues.apache.org/jira/browse/SPARK-38693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Solomon updated SPARK-38693: - Description: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [`org.keycloak.adapters.servlet.KeycloakOIDCFilter`]([https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]) as the `spark.ui.filters` class. Sample logs: {code:java} spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager{code} Configuration: {code:java} spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json {code} This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. was: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [`org.keycloak.adapters.servlet.KeycloakOIDCFilter`](https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter) as the `spark.ui.filters` class. Sample logs: ``` spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager ``` Configuration: ``` spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json ``` This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. > Spark does not use SessionManager > - > > Key: SPARK-38693 > URL: https://issues.apache.org/jira/browse/SPARK-38693 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Brad Solomon >Priority: Blocker > > Spark's failure to use a `SessionManager` causes > `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI > from being used with > [`org.keycloak.adapters.servlet.KeycloakOIDCFilter`]([https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]) > as the `spark.ui.filters` class. 
> > Sample logs: > > {code:java} > spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from > http://REDACTED/auth/realms/master/.well-known/openid-configuration > spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / > spark_1 | java.lang.IllegalStateException: No SessionManager{code} > > Configuration: > > > {code:java} > spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter > spark.acls.enable=true > spark.admin.acls=* > spark.ui.view.acls=* > spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json > > {code} > > This exception emanates from Jetty: > > [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] > > It appears that Spark's `ServletContextHandler` has the ability to use a > `SessionManager` but doesn't. This seems to be a blocker that prevents > integration with Keycloak entirely. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38693) Spark does not use SessionManager
[ https://issues.apache.org/jira/browse/SPARK-38693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Solomon updated SPARK-38693: - Description: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [org.keycloak.adapters.servlet.KeycloakOIDCFilter|[https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]] as the `spark.ui.filters` class. Sample logs: {code:java} spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager{code} Configuration: {code:java} spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json {code} This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. was: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [`org.keycloak.adapters.servlet.KeycloakOIDCFilter`]([https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]) as the `spark.ui.filters` class. Sample logs: {code:java} spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager{code} Configuration: {code:java} spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json {code} This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. > Spark does not use SessionManager > - > > Key: SPARK-38693 > URL: https://issues.apache.org/jira/browse/SPARK-38693 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Brad Solomon >Priority: Blocker > > Spark's failure to use a `SessionManager` causes > `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI > from being used with > [org.keycloak.adapters.servlet.KeycloakOIDCFilter|[https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]] > as the `spark.ui.filters` class. 
> > Sample logs: > > {code:java} > spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from > http://REDACTED/auth/realms/master/.well-known/openid-configuration > spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / > spark_1 | java.lang.IllegalStateException: No SessionManager{code} > > Configuration: > > > {code:java} > spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter > spark.acls.enable=true > spark.admin.acls=* > spark.ui.view.acls=* > spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json > > {code} > > This exception emanates from Jetty: > > [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] > > It appears that Spark's `ServletContextHandler` has the ability to use a > `SessionManager` but doesn't. This seems to be a blocker that prevents > integration with Keycloak entirely. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38693) Spark does not use SessionManager
[ https://issues.apache.org/jira/browse/SPARK-38693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Solomon updated SPARK-38693: - Description: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [org.keycloak.adapters.servlet.KeycloakOIDCFilter|https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter] as the `spark.ui.filters` class. Sample logs: {code:java} spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager{code} Configuration: {code:java} spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json {code} The `spark-keycloak.json` above contains configuration generated in the Keycloak admin console. We can see that Spark gets as far as allowing the KeycloakOIDCFilter class to read this file and initiate communication with Keycloak. This IllegalStateException emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. was: Spark's failure to use a `SessionManager` causes `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI from being used with [org.keycloak.adapters.servlet.KeycloakOIDCFilter|[https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter]] as the `spark.ui.filters` class. Sample logs: {code:java} spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from http://REDACTED/auth/realms/master/.well-known/openid-configuration spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / spark_1 | java.lang.IllegalStateException: No SessionManager{code} Configuration: {code:java} spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter spark.acls.enable=true spark.admin.acls=* spark.ui.view.acls=* spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json {code} This exception emanates from Jetty: [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] It appears that Spark's `ServletContextHandler` has the ability to use a `SessionManager` but doesn't. This seems to be a blocker that prevents integration with Keycloak entirely. > Spark does not use SessionManager > - > > Key: SPARK-38693 > URL: https://issues.apache.org/jira/browse/SPARK-38693 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Brad Solomon >Priority: Blocker > > Spark's failure to use a `SessionManager` causes > `java.lang.IllegalStateException: No SessionManager` that prevents Spark UI > from being used with > [org.keycloak.adapters.servlet.KeycloakOIDCFilter|https://www.keycloak.org/docs/latest/securing_apps/#_servlet_filter_adapter] > as the `spark.ui.filters` class. 
> > Sample logs: > > {code:java} > spark_1 | 22/03/29 18:43:24 INFO KeycloakDeployment: Loaded URLs from > http://REDACTED/auth/realms/master/.well-known/openid-configuration > spark_1 | 22/03/29 18:43:24 WARN HttpChannel: / > spark_1 | java.lang.IllegalStateException: No SessionManager{code} > > Configuration: > > > {code:java} > spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter > spark.acls.enable=true > spark.admin.acls=* > spark.ui.view.acls=* > spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/opt/bitnami/spark/conf/spark-keycloak.json > > {code} > > The `spark-keycloak.json` above contains configuration generated in the Keycloak > admin console. We can see that Spark gets as far as allowing the > KeycloakOIDCFilter class to read this file and initiate communication with > Keycloak. > > This IllegalStateException emanates from Jetty: > > [https://github.com/eclipse/jetty.project/blob/ae5c8e34e7dd4f5cce5f649e48469ba3bbc51d91/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L1524] > > It appears that Spark's `ServletContextHandler` has the ability to use a > `SessionManager` but doesn't. This seems to be a blocker that prevents > integration with Keycloak entirely. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
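For context on the mechanism the reporter is pointing at: a Jetty servlet context only supports `request.getSession()` if a session handler is attached to it, and Jetty's `Request.getSession` (the line linked above) otherwise throws the quoted IllegalStateException. A minimal standalone Jetty sketch of the missing piece, assuming Jetty 9.x; this is not Spark's actual JettyUtils code, which builds its handlers differently, and the port is arbitrary:

{code:scala}
import org.eclipse.jetty.server.Server
import org.eclipse.jetty.server.session.SessionHandler
import org.eclipse.jetty.servlet.ServletContextHandler

object SessionSketch {
  def main(args: Array[String]): Unit = {
    val server = new Server(4040)
    val context = new ServletContextHandler()
    context.setContextPath("/")
    // Without this line, any servlet filter that calls request.getSession()
    // -- such as KeycloakOIDCFilter -- fails with the "No SessionManager"
    // IllegalStateException shown in the logs above.
    context.setSessionHandler(new SessionHandler())
    server.setHandler(context)
    server.start()
    server.join()
  }
}
{code}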
[jira] [Updated] (SPARK-35803) Spark SQL does not support creating views using DataSource v2 based data sources
[ https://issues.apache.org/jira/browse/SPARK-35803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Rabinowitz updated SPARK-35803: - Issue Type: Bug (was: New Feature) > Spark SQL does not support creating views using DataSource v2 based data > sources > > > Key: SPARK-35803 > URL: https://issues.apache.org/jira/browse/SPARK-35803 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.8, 3.1.2 >Reporter: David Rabinowitz >Assignee: Pablo Langa Blanco >Priority: Major > Fix For: 3.3.0 > > > When a temporary view is created in Spark SQL using an external data source, > Spark then tries to create the relevant relation using > DataSource.resolveRelation() method. Unlike DataFrameReader.load(), > resolveRelation() does not check if the provided DataSource implements the > DataSourceV2 interface and instead tries to use the RelationProvider trait in > order to generate the Relation. > Furthermore, DataSourceV2Relation is not a subclass of BaseRelation, so it > cannot be used in resolveRelation(). > Last, I tried to implement the RelationProvider trait in my Java > implementation of DataSourceV2, but the match inside resolveRelation() did > not detect it as RelationProvider. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
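A note on the shape of the problem described above: `CREATE TEMPORARY VIEW ... USING` goes through `DataSource.resolveRelation()`, which only understands the v1 `RelationProvider`/`SchemaRelationProvider` traits and must return a `BaseRelation`. A hedged sketch of the v1 entry point a connector can expose alongside its v2 implementation; the class name is hypothetical and the body is elided:

{code:scala}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.sources.{BaseRelation, RelationProvider}

// Hypothetical connector: keeping a v1 RelationProvider entry point lets
// DataSource.resolveRelation() build the BaseRelation that view creation
// needs, even when the main read path is DataSource V2.
class MyConnectorDefaultSource extends RelationProvider {
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation = {
    // Build a BaseRelation backed by the same scan logic as the v2 path.
    ???
  }
}
{code}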
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514343#comment-17514343 ] Dongjoon Hyun commented on SPARK-38652: --- Any update, [~dcoliversun]? > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception Message is as follow > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOpera
[jira] [Commented] (SPARK-33349) ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed
[ https://issues.apache.org/jira/browse/SPARK-33349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514364#comment-17514364 ] Dongjoon Hyun commented on SPARK-33349: --- Hi, all. Could you try the latest release? This area is moving rapidly. - The latest Apache Spark is 3.2.1 with `kubernetes-client 5.4.1`. - In addition, Apache Spark 3.3.0 is currently under testing to build a release candidate with `kubernetes-client 5.12.1`. > ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed > -- > > Key: SPARK-33349 > URL: https://issues.apache.org/jira/browse/SPARK-33349 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.1, 3.0.2, 3.1.0 >Reporter: Nicola Bova >Priority: Critical > > I launch my spark application with the > [spark-on-kubernetes-operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] > with the following yaml file: > {code:yaml} > apiVersion: sparkoperator.k8s.io/v1beta2 > kind: SparkApplication > metadata: > name: spark-kafka-streamer-test > namespace: kafka2hdfs > spec: > type: Scala > mode: cluster > image: /spark:3.0.2-SNAPSHOT-2.12-0.1.0 > imagePullPolicy: Always > timeToLiveSeconds: 259200 > mainClass: path.to.my.class.KafkaStreamer > mainApplicationFile: spark-kafka-streamer_2.12-spark300-assembly.jar > sparkVersion: 3.0.1 > restartPolicy: > type: Always > sparkConf: > "spark.kafka.consumer.cache.capacity": "8192" > "spark.kubernetes.memoryOverheadFactor": "0.3" > deps: > jars: > - my > - jar > - list > hadoopConfigMap: hdfs-config > driver: > cores: 4 > memory: 12g > labels: > version: 3.0.1 > serviceAccount: default > javaOptions: > "-Dlog4j.configuration=file:///opt/spark/log4j/log4j.properties" > executor: > instances: 4 > cores: 4 > memory: 16g > labels: > version: 3.0.1 > javaOptions: > "-Dlog4j.configuration=file:///opt/spark/log4j/log4j.properties" > {code} > I have tried with both Spark `3.0.1` and `3.0.2-SNAPSHOT` with the ["Restart > the watcher when we receive a version changed from > k8s"|https://github.com/apache/spark/pull/29533] patch. > This is the driver log: > {code} > 20/11/04 12:16:02 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > ... // my app log, it's a structured streaming app reading from kafka and > writing to hdfs > 20/11/04 13:12:12 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has > been closed (this is expected if the application is shutting down.) 
> io.fabric8.kubernetes.client.KubernetesClientException: too old resource > version: 1574101276 (1574213896) > at > io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:259) > at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323) > at > okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219) > at > okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105) > at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown > Source) > at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source) > at java.base/java.lang.Thread.run(Unknown Source) > {code} > The error above appears after roughly 50 minutes. > After the exception above, no more logs are produced and the app hangs. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
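For background on the failure mode: fabric8 closes the watch with the "too old resource version" error when the watcher's resourceVersion falls outside the API server's retention window (HTTP 410 Gone), and the patch referenced in the description re-establishes the watch instead of leaving the snapshot source dead. A rough sketch of that restart-on-close idea against the fabric8 4.x-era `Watcher` API; this is not Spark's actual ExecutorPodsWatchSnapshotSource code:

{code:scala}
import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.{KubernetesClient, KubernetesClientException, Watcher}

object WatchRestartSketch {
  def startWatch(client: KubernetesClient): Unit = {
    client.pods().watch(new Watcher[Pod] {
      override def eventReceived(action: Watcher.Action, pod: Pod): Unit = {
        // Feed pod updates into the executor snapshot store here.
      }
      override def onClose(cause: KubernetesClientException): Unit = {
        // HTTP 410 means our resourceVersion is too old: re-list and re-watch
        // rather than stopping silently, which is the hang described above.
        if (cause != null && cause.getCode == 410) startWatch(client)
      }
    })
  }
}
{code}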
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514379#comment-17514379 ] Dongjoon Hyun commented on SPARK-38652: --- BTW, [~dcoliversun], the K8s IT suite itself doesn't fail on either the Apache Spark `master` branch or `branch-3.3` in my environment. Do you mean the test case fails when you do `spark-submit`? {code} $ build/sbt -Psparkr -Pkubernetes -Pvolcano -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube -Dspark.kubernetes.test.deployMode=docker-for-desktop "kubernetes-integration-tests/test" ... [info] KubernetesSuite: [info] - Run SparkPi with no resources (8 seconds, 527 milliseconds) [info] - Run SparkPi with no resources & statefulset allocation (8 seconds, 323 milliseconds) [info] - Run SparkPi with a very long application name. (8 seconds, 386 milliseconds) [info] - Use SparkLauncher.NO_RESOURCE (8 seconds, 425 milliseconds) [info] - Run SparkPi with a master URL without a scheme. (8 seconds, 385 milliseconds) [info] - Run SparkPi with an argument. (8 seconds, 328 milliseconds) [info] - Run SparkPi with custom labels, annotations, and environment variables. (8 seconds, 384 milliseconds) [info] - All pods have the same service account by default (8 seconds, 342 milliseconds) [info] - Run extraJVMOptions check on driver (4 seconds, 327 milliseconds) [info] - Run SparkRemoteFileTest using a remote data file (8 seconds, 429 milliseconds) ... {code} > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception Message is as follow > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... 
> at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSub
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514390#comment-17514390 ] qian commented on SPARK-38652: -- [~dongjoon] Hi. DepsTestsSuite has the following tests: * Launcher client dependencies * SPARK-33615: Launcher client archives * SPARK-33748: Launcher python client respecting PYSPARK_PYTHON * ... These tests use the spark-submit command, so I think that is where DepsTestsSuite blocks. Could you please check whether these tests run? Maybe the `-Dtest.exclude.tags` option doesn't need the `minikube` value. > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception Message is as follow > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514398#comment-17514398 ] Dongjoon Hyun commented on SPARK-38652: --- Got it, [~dcoliversun]. > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception Message is as follow > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation
[jira] [Updated] (SPARK-38320) (flat)MapGroupsWithState can time out groups which just received inputs in the same microbatch
[ https://issues.apache.org/jira/browse/SPARK-38320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-38320: -- Labels: correctness (was: ) > (flat)MapGroupsWithState can time out groups which just received inputs in the > same microbatch > - > > Key: SPARK-38320 > URL: https://issues.apache.org/jira/browse/SPARK-38320 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 3.2.1 >Reporter: Alex Balikov >Assignee: Alex Balikov >Priority: Major > Labels: correctness > Fix For: 3.3.0, 3.2.2 > > > We have identified an issue where the RocksDB state store iterator will not > pick up store updates made after its creation. As a result of this, the > _timeoutProcessorIter_ in > [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala] > will not pick up state changes made during _newDataProcessorIter_ input > processing. The user-observed behavior is that a group state may receive > input records and also be called with timeout in the same microbatch. This > contradicts the public documentation for GroupState - > [https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/streaming/GroupState.html] > * The timeout is reset every time the function is called on a group, that > is, when the group has new data, or the group has timed out. So the user has > to set the timeout duration every time the function is called, otherwise, > there will not be any timeout set. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
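The contract quoted above means a (flat)MapGroupsWithState function must re-arm its timeout on every invocation, including calls that deliver new data. A small Scala sketch of that usage pattern; the event type, the running-sum aggregation, and the 10-second duration are illustrative:

{code:scala}
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

case class Event(user: String, value: Long)

def updateState(
    user: String,
    events: Iterator[Event],
    state: GroupState[Long]): Iterator[(String, Long)] = {
  if (state.hasTimedOut) {
    // No call for this group within the gap: emit the total and drop the state.
    val out = Iterator((user, state.get))
    state.remove()
    out
  } else {
    state.update(state.getOption.getOrElse(0L) + events.map(_.value).sum)
    // Per the GroupState contract, the timeout must be re-set on every call;
    // the bug above made a group time out even when it had just received input.
    state.setTimeoutDuration("10 seconds")
    Iterator.empty
  }
}

// Wiring, given ds: Dataset[Event] backed by a streaming source:
// ds.groupByKey(_.user).flatMapGroupsWithState(
//   OutputMode.Update, GroupStateTimeout.ProcessingTimeTimeout)(updateState _)
{code}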
[jira] [Created] (SPARK-38694) Simplify Java UT code with JUnit `assertThrows`
Yang Jie created SPARK-38694: Summary: Simplify Java UT code with JUnit `assertThrows` Key: SPARK-38694 URL: https://issues.apache.org/jira/browse/SPARK-38694 Project: Spark Issue Type: Improvement Components: Tests Affects Versions: 3.4.0 Reporter: Yang Jie There are some code patterns in Java UTs: {code:java} @Test public void testAuthReplay() throws Exception { try { doSomeOperation(); fail("Should have failed"); } catch (Exception e) { assertTrue(checkException(e)); } } {code} or {code:java} @Test(expected = SomeException.class) public void testAuthReplay() throws Exception { try { doSomeOperation(); fail("Should have failed"); } catch (Exception e) { assertTrue(checkException(e)); throw e; } } {code} We can use JUnit `assertThrows` to simplify these patterns. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
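For reference, the simplified form of the first pattern above using JUnit 4.13's `Assert.assertThrows`; `doSomeOperation` and `checkException` are the same placeholders as in the ticket's snippets, and this lives inside the test class:

{code:java}
import static org.junit.Assert.assertThrows;
import static org.junit.Assert.assertTrue;

// assertThrows fails the test when nothing is thrown and returns the caught
// exception, replacing the try/fail/catch boilerplate in one line.
@Test
public void testAuthReplay() {
  Exception e = assertThrows(Exception.class, () -> doSomeOperation());
  assertTrue(checkException(e));
}
{code}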
[jira] [Commented] (SPARK-38694) Simplify Java UT code with JUnit `assertThrows`
[ https://issues.apache.org/jira/browse/SPARK-38694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514412#comment-17514412 ] Apache Spark commented on SPARK-38694: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/36008 > Simplify Java UT code with JUnit `assertThrows` > --- > > Key: SPARK-38694 > URL: https://issues.apache.org/jira/browse/SPARK-38694 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > There are some code patterns in Java UTs: > {code:java} > @Test > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > } > } > {code} > or > > {code:java} > @Test(expected = SomeException.class) > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > throw e; > } > } {code} > We can use JUnit `assertThrows` to simplify these patterns. > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38694) Simplify Java UT code with JUnit `assertThrows`
[ https://issues.apache.org/jira/browse/SPARK-38694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38694: Assignee: Apache Spark > Simplify Java UT code with JUnit `assertThrows` > --- > > Key: SPARK-38694 > URL: https://issues.apache.org/jira/browse/SPARK-38694 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > > There are some code patterns in Java UTs: > {code:java} > @Test > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > } > } > {code} > or > > {code:java} > @Test(expected = SomeException.class) > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > throw e; > } > } {code} > We can use JUnit `assertThrows` to simplify these patterns. > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38694) Simplify Java UT code with Junit `assertThrows`
[ https://issues.apache.org/jira/browse/SPARK-38694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514413#comment-17514413 ] Apache Spark commented on SPARK-38694: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/36008 > Simplify Java UT code with Junit `assertThrows` > --- > > Key: SPARK-38694 > URL: https://issues.apache.org/jira/browse/SPARK-38694 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > There are some code patterns in Java UTs: > {code:java} > @Test > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > } > } > {code} > or > > {code:java} > @Test(expected = SomeException.class) > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > throw e; > } > } {code} > We can use JUnit's assertThrows to simplify these patterns. > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38694) Simplify Java UT code with Junit `assertThrows`
[ https://issues.apache.org/jira/browse/SPARK-38694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38694: Assignee: (was: Apache Spark) > Simplify Java UT code with Junit `assertThrows` > --- > > Key: SPARK-38694 > URL: https://issues.apache.org/jira/browse/SPARK-38694 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > There are some code patterns in Java UTs: > {code:java} > @Test > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > } > } > {code} > or > > {code:java} > @Test(expected = SomeException.class) > public void testAuthReplay() throws Exception { > try { > doSomeOperation(); > fail("Should have failed"); > } catch (Exception e) { > assertTrue(checkException(e)); > throw e; > } > } {code} > We can use JUnit's assertThrows to simplify these patterns. > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38349) No need to filter events when sessionwindow gapDuration greater than 0
[ https://issues.apache.org/jira/browse/SPARK-38349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-38349: Assignee: nyingping > No need to filter events when sessionwindow gapDuration greater than 0 > -- > > Key: SPARK-38349 > URL: https://issues.apache.org/jira/browse/SPARK-38349 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.1 >Reporter: nyingping >Assignee: nyingping >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38349) No need to filter events when sessionwindow gapDuration greater than 0
[ https://issues.apache.org/jira/browse/SPARK-38349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-38349. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 35680 [https://github.com/apache/spark/pull/35680] > No need to filter events when sessionwindow gapDuration greater than 0 > -- > > Key: SPARK-38349 > URL: https://issues.apache.org/jira/browse/SPARK-38349 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.1 >Reporter: nyingping >Assignee: nyingping >Priority: Trivial > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
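For context on what gapDuration is, a hedged sketch of the session-window aggregation shape this optimization targets, using the session_window function introduced in Spark 3.2; the events Dataset and its userId/eventTime columns are illustrative assumptions, not taken from the ticket:

{code:java}
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.session_window;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// With a static gap duration that is known to be positive ("5 minutes"),
// every event maps to a valid session window, so a defensive filter on
// window validity adds nothing.
Dataset<Row> sessions = events
    .withWatermark("eventTime", "10 minutes")
    .groupBy(col("userId"), session_window(col("eventTime"), "5 minutes"))
    .agg(count("*").alias("numEvents"));
{code}

When the gap is a dynamic per-row expression it can still evaluate to a non-positive value, which is presumably why the filter can only be skipped for a static positive gap.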
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514431#comment-17514431 ] Dongjoon Hyun commented on SPARK-38652: --- I also confirmed this regression and raised this issue as a blocker. Thank you, [~dcoliversun]. > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception message is as follows: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at 
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at > org.apache.hadoop.fs.s
[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-38652: -- Priority: Blocker (was: Major) > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Blocker > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception message is as follows: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at 
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226) > > at > org.apache.had
[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-38652: -- Component/s: (was: Tests) > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.3.0 >Reporter: qian >Priority: Blocker > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception message is as follows: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at 
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226) > > at > org.apache.hadoop.fs.s3
[jira] [Resolved] (SPARK-38605) Retrying on file manager operation in HDFSMetadataLog
[ https://issues.apache.org/jira/browse/SPARK-38605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-38605. - Resolution: Won't Fix > Retrying on file manager operation in HDFSMetadataLog > - > > Key: SPARK-38605 > URL: https://issues.apache.org/jira/browse/SPARK-38605 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.4.0 >Reporter: L. C. Hsieh >Priority: Major > > Currently, HDFSMetadataLog uses CheckpointFileManager for file operations > such as opening the metadata file. These operations are easily affected by > network blips, which cause the streaming query to fail. Although we can > restart the streaming query, recovery takes more time. > Such file operations should be made resilient to these situations by retrying. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
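To make the proposal concrete, a minimal illustrative sketch of a retry wrapper of the kind described (the withRetry name, attempt count, and backoff values are assumptions for illustration, not Spark's actual implementation):

{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;

public final class RetryUtils {
  // Retries an I/O operation (e.g. opening a metadata file through the
  // checkpoint file manager) a bounded number of times, so that a transient
  // network blip does not immediately fail the streaming query.
  public static <T> T withRetry(int maxAttempts, long backoffMs, Callable<T> op)
      throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      } catch (IOException e) {
        if (attempt >= maxAttempts) {
          throw e; // out of attempts: surface the original failure
        }
        Thread.sleep(backoffMs); // pause before the next attempt
      }
    }
  }
}
{code}

Usage would look like withRetry(3, 1000L, () -> fileManager.open(metadataFile)), where fileManager and metadataFile are placeholders for the checkpoint file manager and the batch metadata path.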