[jira] [Commented] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750959#comment-17750959 ] Madhukar commented on SPARK-44670: -- Raised a PR for using openpyxl instead of xlrd -

[jira] [Updated] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhukar updated SPARK-44670: - Description: With python3.7 and openpyxl installed got error:

[jira] [Resolved] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44582. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42206

[jira] [Assigned] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-44582: Assignee: Wan Kun > JVM crash caused by SMJ and WindowExec >

[jira] [Created] (SPARK-44671) Retry ExecutePlan in case initial request didn't reach server in Python client

2023-08-03 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-44671: Summary: Retry ExecutePlan in case initial request didn't reach server in Python client Key: SPARK-44671 URL: https://issues.apache.org/jira/browse/SPARK-44671

[jira] [Updated] (SPARK-44009) Support profiler for Python UDTFs

2023-08-03 Thread Allison Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allison Wang updated SPARK-44009: - Summary: Support profiler for Python UDTFs (was: Support memory_profiler for UDTFs ) >

[jira] [Updated] (SPARK-44663) Disable arrow optimization by default for Python UDTFs

2023-08-03 Thread Allison Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allison Wang updated SPARK-44663: - Summary: Disable arrow optimization by default for Python UDTFs (was: Disable arrow

[jira] [Updated] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhukar updated SPARK-44670: - Description: With python3.7 and openpyxl installed got error:

[jira] [Updated] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhukar updated SPARK-44670: - Affects Version/s: 3.4.1 (was: 3.4.0) > Fix the `test_to_excel` tests for

[jira] [Created] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
Madhukar created SPARK-44670: Summary: Fix the `test_to_excel` tests for python3.7 Key: SPARK-44670 URL: https://issues.apache.org/jira/browse/SPARK-44670 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-44670) Fix the `test_to_excel` tests for python3.7

2023-08-03 Thread Madhukar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhukar updated SPARK-44670: - Description: (was: So far, we've been skipping the `read_excel` test in pandas API on Spark:

[jira] [Closed] (SPARK-44668) ObjectMapper are threadsafe, we can reuse it in Object

2023-08-03 Thread Jia Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jia Fan closed SPARK-44668. --- > ObjectMapper are threadsafe, we can reuse it in Object >

[jira] [Resolved] (SPARK-44668) ObjectMapper are threadsafe, we can reuse it in Object

2023-08-03 Thread Jia Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jia Fan resolved SPARK-44668. - Resolution: Invalid > ObjectMapper are threadsafe, we can reuse it in Object >

[jira] [Created] (SPARK-44669) Parquet/ORC files written using Hive Serde should has file extension

2023-08-03 Thread Cheng Pan (Jira)
Cheng Pan created SPARK-44669: - Summary: Parquet/ORC files written using Hive Serde should has file extension Key: SPARK-44669 URL: https://issues.apache.org/jira/browse/SPARK-44669 Project: Spark

[jira] [Created] (SPARK-44668) ObjectMapper are threadsafe, we can reuse it in Object

2023-08-03 Thread Jia Fan (Jira)
Jia Fan created SPARK-44668: --- Summary: ObjectMapper are threadsafe, we can reuse it in Object Key: SPARK-44668 URL: https://issues.apache.org/jira/browse/SPARK-44668 Project: Spark Issue Type:

[jira] [Created] (SPARK-44667) Uninstall large ML libraries for non-ML jobs

2023-08-03 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-44667: - Summary: Uninstall large ML libraries for non-ML jobs Key: SPARK-44667 URL: https://issues.apache.org/jira/browse/SPARK-44667 Project: Spark Issue Type:

[jira] [Created] (SPARK-44666) Uninstall CodeQL/Go/Node in non-container jobs

2023-08-03 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-44666: - Summary: Uninstall CodeQL/Go/Node in non-container jobs Key: SPARK-44666 URL: https://issues.apache.org/jira/browse/SPARK-44666 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-44653) non-trivial DataFrame unions should not break caching

2023-08-03 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-44653. - Fix Version/s: 3.3.3 3.5.0 3.4.2 Resolution: Fixed

[jira] [Assigned] (SPARK-44653) non-trivial DataFrame unions should not break caching

2023-08-03 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-44653: --- Assignee: Wenchen Fan > non-trivial DataFrame unions should not break caching >

[jira] [Assigned] (SPARK-44624) Spark Connect reattachable Execute when initial ExecutePlan didn't reach server

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-44624: Assignee: Juliusz Sompolski > Spark Connect reattachable Execute when initial

[jira] [Resolved] (SPARK-44624) Spark Connect reattachable Execute when initial ExecutePlan didn't reach server

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44624. -- Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull

[jira] [Resolved] (SPARK-44664) Release the execute when closing the iterator in Python client

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44664. -- Fix Version/s: 3.5.0 4.0.0 Assignee: Hyukjin Kwon

[jira] [Resolved] (SPARK-44619) Free up disk space for container jobs

2023-08-03 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-44619. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42253

[jira] [Assigned] (SPARK-43562) Enable DataFrameTests.test_append for pandas 2.0.0.

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-43562: Assignee: Haejoon Lee > Enable DataFrameTests.test_append for pandas 2.0.0. >

[jira] [Resolved] (SPARK-43870) Enable SeriesTests for pandas 2.0.0.

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-43870. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42268

[jira] [Assigned] (SPARK-43870) Enable SeriesTests for pandas 2.0.0.

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-43870: Assignee: Haejoon Lee > Enable SeriesTests for pandas 2.0.0. >

[jira] [Resolved] (SPARK-43562) Enable DataFrameTests.test_append for pandas 2.0.0.

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-43562. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42268

[jira] [Resolved] (SPARK-43873) Enable DataFrameSlowTests for pandas 2.0.0.

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-43873. -- Fix Version/s: 4.0.0 Assignee: Haejoon Lee Resolution: Fixed Fixed in

[jira] [Resolved] (SPARK-44640) Improve error messages for Python UDTF returning non iterable

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44640. -- Fix Version/s: 4.0.0 Assignee: Allison Wang Resolution: Fixed Fixed in

[jira] [Updated] (SPARK-44548) Add support for pandas-on-Spark DataFrame assertDataFrameEqual

2023-08-03 Thread Amanda Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amanda Liu updated SPARK-44548: --- Summary: Add support for pandas-on-Spark DataFrame assertDataFrameEqual (was: Add support for

[jira] [Created] (SPARK-44665) Add support for pandas DataFrame assertDataFrameEqual

2023-08-03 Thread Amanda Liu (Jira)
Amanda Liu created SPARK-44665: -- Summary: Add support for pandas DataFrame assertDataFrameEqual Key: SPARK-44665 URL: https://issues.apache.org/jira/browse/SPARK-44665 Project: Spark Issue

[jira] [Created] (SPARK-44664) Release the execute when closing the iterator in Python client

2023-08-03 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-44664: Summary: Release the execute when closing the iterator in Python client Key: SPARK-44664 URL: https://issues.apache.org/jira/browse/SPARK-44664 Project: Spark

[jira] [Assigned] (SPARK-44642) ExecutePlanResponseReattachableIterator should release all after error

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-44642: Assignee: Juliusz Sompolski > ExecutePlanResponseReattachableIterator should release all

[jira] [Resolved] (SPARK-44642) ExecutePlanResponseReattachableIterator should release all after error

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44642. -- Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull

[jira] [Resolved] (SPARK-44652) Raise error when only one df is None

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44652. -- Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull

[jira] [Assigned] (SPARK-44652) Raise error when only one df is None

2023-08-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-44652: Assignee: Amanda Liu > Raise error when only one df is None >

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Created] (SPARK-44663) Disable arrow optimization by default

2023-08-03 Thread Allison Wang (Jira)
Allison Wang created SPARK-44663: Summary: Disable arrow optimization by default Key: SPARK-44663 URL: https://issues.apache.org/jira/browse/SPARK-44663 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-44661) getMapOutputLocation should not throw NPE

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-44661: -- Priority: Minor (was: Major) > getMapOutputLocation should not throw NPE >

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Resolved] (SPARK-44661) getMapOutputLocation should not throw NPE

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-44661. --- Fix Version/s: 3.5.0 4.0.0 3.4.2 Resolution:

[jira] [Updated] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-44662: - Description: h2. *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.* On

[jira] [Assigned] (SPARK-44661) getMapOutputLocation should not throw NPE

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-44661: - Assignee: Dongjoon Hyun > getMapOutputLocation should not throw NPE >

[jira] [Commented] (SPARK-44646) Migrate Log4j 2.x in Spark 3.4.1 to Logback

2023-08-03 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750895#comment-17750895 ] L. C. Hsieh commented on SPARK-44646: - I have not used it but maybe you can try

[jira] [Created] (SPARK-44662) SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-03 Thread Asif (Jira)
Asif created SPARK-44662: Summary: SPIP: Improving performance of BroadcastHashJoin queries with stream side join key on non partition columns Key: SPARK-44662 URL: https://issues.apache.org/jira/browse/SPARK-44662

[jira] [Commented] (SPARK-44646) Migrate Log4j 2.x in Spark 3.4.1 to Logback

2023-08-03 Thread Yu Tian (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750891#comment-17750891 ] Yu Tian commented on SPARK-44646: - Hi [~viirya] Could you please check this thread? It is a question

[jira] [Assigned] (SPARK-44658) ShuffleStatus.getMapStatus should return None instead of Some(null)

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-44658: - Assignee: Dongjoon Hyun > ShuffleStatus.getMapStatus should return None instead of

[jira] [Resolved] (SPARK-44658) ShuffleStatus.getMapStatus should return None instead of Some(null)

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-44658. --- Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by

[jira] [Commented] (SPARK-44660) Relax constraint for columnar shuffle check in AQE

2023-08-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750881#comment-17750881 ] Chao Sun commented on SPARK-44660: -- In fact the check is necessary, but it seems {code}

[jira] [Updated] (SPARK-44641) SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-44641: -- Affects Version/s: 3.4.0 > SPJ: Results duplicated when SPJ partial-cluster and pushdown

[jira] [Updated] (SPARK-44641) SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

2023-08-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-44641: - Priority: Blocker (was: Major) > SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled

[jira] [Created] (SPARK-44661) getMapOutputLocation should not throw NPE

2023-08-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-44661: - Summary: getMapOutputLocation should not throw NPE Key: SPARK-44661 URL: https://issues.apache.org/jira/browse/SPARK-44661 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-44660) Relax constraint for columnar shuffle check in AQE

2023-08-03 Thread Chao Sun (Jira)
Chao Sun created SPARK-44660: Summary: Relax constraint for columnar shuffle check in AQE Key: SPARK-44660 URL: https://issues.apache.org/jira/browse/SPARK-44660 Project: Spark Issue Type:

[jira] [Updated] (SPARK-44658) ShuffleStatus.getMapStatus should return None instead of Some(null)

2023-08-03 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-44658: -- Summary: ShuffleStatus.getMapStatus should return None instead of Some(null) (was:

[jira] [Updated] (SPARK-44659) SPJ: Include keyGroupedPartitioning in StoragePartitionJoinParams equality check

2023-08-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-44659: - Summary: SPJ: Include keyGroupedPartitioning in StoragePartitionJoinParams equality check (was:

[jira] [Updated] (SPARK-44641) SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

2023-08-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-44641: - Summary: SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

[jira] [Created] (SPARK-44659) Include keyGroupedPartitioning in StoragePartitionJoinParams equality check

2023-08-03 Thread Chao Sun (Jira)
Chao Sun created SPARK-44659: Summary: Include keyGroupedPartitioning in StoragePartitionJoinParams equality check Key: SPARK-44659 URL: https://issues.apache.org/jira/browse/SPARK-44659 Project: Spark

[jira] [Updated] (SPARK-44641) Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

2023-08-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-44641: - Parent: SPARK-37375 Issue Type: Sub-task (was: Bug) > Results duplicated when SPJ

[jira] [Created] (SPARK-44658) ShuffleStatus.getMapStatus should return None

2023-08-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-44658: - Summary: ShuffleStatus.getMapStatus should return None Key: SPARK-44658 URL: https://issues.apache.org/jira/browse/SPARK-44658 Project: Spark Issue Type:

[jira] [Commented] (SPARK-43496) Have a separate config for Memory limits for kubernetes pods

2023-08-03 Thread Laurenceau Julien (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750791#comment-17750791 ] Laurenceau Julien commented on SPARK-43496: --- Hi, I would like to suggest going beyond that:

[jira] [Commented] (SPARK-44654) In subquery cannot perform partition pruning

2023-08-03 Thread 7mming7 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750743#comment-17750743 ] 7mming7 commented on SPARK-44654: - [~yumwang] This is also possible, but if it is the case of multiple

[jira] [Commented] (SPARK-44654) In subquery cannot perform partition pruning

2023-08-03 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750739#comment-17750739 ] Yuming Wang commented on SPARK-44654: - Another way is to convert the join to a filter if the maximum number of

[jira] [Created] (SPARK-44657) Incorrect limit handling and config parsing in Arrow collect

2023-08-03 Thread Venkata Sai Akhil Gudesa (Jira)
Venkata Sai Akhil Gudesa created SPARK-44657: Summary: Incorrect limit handling and config parsing in Arrow collect Key: SPARK-44657 URL: https://issues.apache.org/jira/browse/SPARK-44657

[jira] [Updated] (SPARK-44656) Close dangling iterators in SparkResult too (Spark Connect Scala)

2023-08-03 Thread Juliusz Sompolski (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juliusz Sompolski updated SPARK-44656: -- Epic Link: SPARK-43754 > Close dangling iterators in SparkResult too (Spark Connect

[jira] [Created] (SPARK-44656) Close dangling iterators in SparkResult too (Spark Connect Scala)

2023-08-03 Thread Alice Sayutina (Jira)
Alice Sayutina created SPARK-44656: -- Summary: Close dangling iterators in SparkResult too (Spark Connect Scala) Key: SPARK-44656 URL: https://issues.apache.org/jira/browse/SPARK-44656 Project: Spark

[jira] [Updated] (SPARK-44619) Free up disk space for container jobs

2023-08-03 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng updated SPARK-44619: -- Summary: Free up disk space for container jobs (was: Free up disk space for pyspark

[jira] [Created] (SPARK-44655) make the code cleaner about static and dynamic data/partition filters

2023-08-03 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-44655: --- Summary: make the code cleaner about static and dynamic data/partition filters Key: SPARK-44655 URL: https://issues.apache.org/jira/browse/SPARK-44655 Project: Spark

[jira] [Updated] (SPARK-44654) In subquery cannot perform partition pruning

2023-08-03 Thread 7mming7 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 7mming7 updated SPARK-44654: Description: The following SQL cannot perform partition pruning {code:java} SELECT * FROM parquet_part

[jira] [Commented] (SPARK-40927) Memory issue with Structured streaming

2023-08-03 Thread Iain Morrison (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750660#comment-17750660 ] Iain Morrison commented on SPARK-40927: --- In our case I found the following settings greatly

[jira] [Updated] (SPARK-44654) In subquery cannot perform partition pruning

2023-08-03 Thread 7mming7 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 7mming7 updated SPARK-44654: Attachment: image-2023-08-03-17-22-53-981.png > In subquery cannot perform partition pruning >

[jira] [Created] (SPARK-44654) In subquery cannot perform partition pruning

2023-08-03 Thread 7mming7 (Jira)
7mming7 created SPARK-44654: --- Summary: In subquery cannot perform partition pruning Key: SPARK-44654 URL: https://issues.apache.org/jira/browse/SPARK-44654 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-44575) Implement Error Translation

2023-08-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750659#comment-17750659 ] ASF GitHub Bot commented on SPARK-44575: User 'heyihong' has created a pull request for this

[jira] [Commented] (SPARK-44619) Free up disk space for pyspark container jobs

2023-08-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750657#comment-17750657 ] ASF GitHub Bot commented on SPARK-44619: User 'zhengruifeng' has created a pull request for this

[jira] [Commented] (SPARK-44581) ShutdownHookManager get wrong hadoop user group information

2023-08-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750656#comment-17750656 ] ASF GitHub Bot commented on SPARK-44581: User 'liangyu-1' has created a pull request for this

[jira] [Commented] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-08-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750652#comment-17750652 ] ASF GitHub Bot commented on SPARK-44649: User 'beliefer' has created a pull request for this

[jira] [Commented] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-08-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750653#comment-17750653 ] ASF GitHub Bot commented on SPARK-44649: User 'beliefer' has created a pull request for this

[jira] [Commented] (SPARK-42375) Point out the user-facing documentation in Spark Connect server startup

2023-08-03 Thread Junyao Huang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750586#comment-17750586 ] Junyao Huang commented on SPARK-42375: -- Hi [~gurwls223], do you mean we directly add the

[jira] [Commented] (SPARK-42729) Update Submitting Applications page for Spark Connect

2023-08-03 Thread Junyao Huang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750585#comment-17750585 ] Junyao Huang commented on SPARK-42729: -- Hi [~gurwls223], I think these pages are different

[jira] [Created] (SPARK-44653) non-trivial DataFrame unions should not break caching

2023-08-03 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-44653: --- Summary: non-trivial DataFrame unions should not break caching Key: SPARK-44653 URL: https://issues.apache.org/jira/browse/SPARK-44653 Project: Spark Issue