[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292.  I

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292.  I

[jira] [Resolved] (SPARK-40469) Upgrade Scala to 2.12.17

2022-09-15 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-40469. Resolution: Duplicate > Upgrade Scala to 2.12.17

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292.  I

[jira] [Created] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40470: Summary: arrays_zip output unexpected alias column names when using Map Key: SPARK-40470 URL: https://issues.apache.org/jira/browse/SPARK-40470 Project: Spark

[jira] [Assigned] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40468: Assignee: (was: Apache Spark) > Column pruning is not handled correctly in CSV when

[jira] [Assigned] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40468: Assignee: Apache Spark > Column pruning is not handled correctly in CSV when

[jira] [Commented] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605616#comment-17605616 ] Apache Spark commented on SPARK-40468: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Commented] (SPARK-39696) Uncaught exception in thread executor-heartbeater java.util.ConcurrentModificationException: mutation occurred during iteration

2022-09-15 Thread Jiri Humpolicek (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605615#comment-17605615 ] Jiri Humpolicek commented on SPARK-39696: - same for me. Is it possible to extend any timeout? >

[jira] [Created] (SPARK-40469) Upgrade Scala to 2.12.17

2022-09-15 Thread Yuming Wang (Jira)
Yuming Wang created SPARK-40469: --- Summary: Upgrade Scala to 2.12.17 Key: SPARK-40469 URL: https://issues.apache.org/jira/browse/SPARK-40469 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-40175) Converting Tuple2 to Scala Map via `.toMap` is slow

2022-09-15 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40175: --- Assignee: Yang Jie > Converting Tuple2 to Scala Map via `.toMap` is slow >

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Resolved] (SPARK-40175) Converting Tuple2 to Scala Map via `.toMap` is slow

2022-09-15 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40175. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37876

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Created] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40468: Summary: Column pruning is not handled correctly in CSV when _corrupt_record is used Key: SPARK-40468 URL: https://issues.apache.org/jira/browse/SPARK-40468 Project:

[jira] [Commented] (SPARK-40196) Consolidate `lit` function with NumPy scalar in sql and pandas module

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605605#comment-17605605 ] Apache Spark commented on SPARK-40196: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40196) Consolidate `lit` function with NumPy scalar in sql and pandas module

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40196: Assignee: Xinrong Meng > Consolidate `lit` function with NumPy scalar in sql and pandas

[jira] [Resolved] (SPARK-40196) Consolidate `lit` function with NumPy scalar in sql and pandas module

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40196. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37888

[jira] [Commented] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605595#comment-17605595 ] Apache Spark commented on SPARK-40467: -- User 'HeartSaVioR' has created a pull request for this

[jira] [Commented] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605593#comment-17605593 ] Apache Spark commented on SPARK-40467: -- User 'HeartSaVioR' has created a pull request for this

[jira] [Commented] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605594#comment-17605594 ] Hyukjin Kwon commented on SPARK-40439: -- Setting {{spark.sql.storeAssignmentPolicy}} to {{LEGACY}}

[jira] [Assigned] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40467: Assignee: Apache Spark > Split FlatMapGroupsWithState down to multiple test suites >

[jira] [Assigned] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40467: Assignee: (was: Apache Spark) > Split FlatMapGroupsWithState down to multiple test

[jira] [Resolved] (SPARK-40446) Rename `_MissingPandasXXX` as `MissingPandasXXX`

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40446. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37898

[jira] [Updated] (SPARK-40441) With PANDAS_UDF, data from tasks on the same physical node is aggregated into one task execution, resulting in concurrency not being fully utilized

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40441: - Component/s: SQL (was: Scheduler) > With PANDAS_UDF, data from tasks on

[jira] [Commented] (SPARK-40441) With PANDAS_UDF, data from tasks on the same physical node is aggregated into one task execution, resulting in concurrency not being fully utilized

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605591#comment-17605591 ] Hyukjin Kwon commented on SPARK-40441: -- [~SimonAries] Spark 2.4 is EOL. Mind trying 3.1+? > With

[jira] [Commented] (SPARK-40457) upgrade jackson data mapper to latest

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605590#comment-17605590 ] Hyukjin Kwon commented on SPARK-40457: -- [~bilna123] which Jackson version do you mean? > upgrade

[jira] [Updated] (SPARK-40441) With PANDAS_UDF, data from tasks on the same physical node is aggregated into one task execution, resulting in concurrency not being fully utilized

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40441: - Component/s: (was: Pandas API on Spark) > With PANDAS_UDF, data from tasks on the same

[jira] [Updated] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40466: - Fix Version/s: (was: 3.4.0) > Improve the error message if the DSv2 source is disabled but

[jira] [Updated] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40467: - Priority: Minor (was: Major) > Split FlatMapGroupsWithState down to multiple test suites >

[jira] [Created] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40467: Summary: Split FlatMapGroupsWithState down to multiple test suites Key: SPARK-40467 URL: https://issues.apache.org/jira/browse/SPARK-40467 Project: Spark

[jira] [Assigned] (SPARK-40446) Rename `_MissingPandasXXX` as `MissingPandasXXX`

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40446: Assignee: Ruifeng Zheng > Rename `_MissingPandasXXX` as `MissingPandasXXX` >

[jira] [Updated] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-15 Thread Huanli Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huanli Wang updated SPARK-40466: Shepherd: (was: Huanli Wang) > Improve the error message if the DSv2 source is disabled but

[jira] [Updated] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-15 Thread Huanli Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huanli Wang updated SPARK-40466: Shepherd: Huanli Wang > Improve the error message if the DSv2 source is disabled but DSv1

[jira] [Created] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-15 Thread Huanli Wang (Jira)
Huanli Wang created SPARK-40466: --- Summary: Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available Key: SPARK-40466 URL: https://issues.apache.org/jira/browse/SPARK-40466

[jira] [Created] (SPARK-40465) Refactor Decimal so as we can use Int128 as underlying implementation

2022-09-15 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40465: -- Summary: Refactor Decimal so as we can use Int128 as underlying implementation Key: SPARK-40465 URL: https://issues.apache.org/jira/browse/SPARK-40465 Project: Spark

[jira] [Assigned] (SPARK-40463) Update gpg's keyserver

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40463: Assignee: (was: Apache Spark) > Update gpg's keyserver

[jira] [Assigned] (SPARK-40463) Update gpg's keyserver

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40463: Assignee: Apache Spark > Update gpg's keyserver

[jira] [Commented] (SPARK-40463) Update gpg's keyserver

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605587#comment-17605587 ] Apache Spark commented on SPARK-40463: -- User 'wangyum' has created a pull request for this issue:

[jira] [Created] (SPARK-40464) Support automatic data format conversion for shuffle state db

2022-09-15 Thread Yang Jie (Jira)
Yang Jie created SPARK-40464: Summary: Support automatic data format conversion for shuffle state db Key: SPARK-40464 URL: https://issues.apache.org/jira/browse/SPARK-40464 Project: Spark Issue

[jira] [Commented] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605578#comment-17605578 ] Apache Spark commented on SPARK-40460: -- User 'Yaohua628' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40460: Assignee: Apache Spark > Streaming metrics is zero when select _metadata >

[jira] [Assigned] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40460: Assignee: (was: Apache Spark) > Streaming metrics is zero when select _metadata >

[jira] [Updated] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-15 Thread Yaohua Zhao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yaohua Zhao updated SPARK-40460: Description: Streaming metrics report all 0 (`processedRowsPerSecond`, etc) when selecting

[jira] [Created] (SPARK-40463) Update gpg's keyserver

2022-09-15 Thread Yuming Wang (Jira)
Yuming Wang created SPARK-40463: --- Summary: Update gpg's keyserver Key: SPARK-40463 URL: https://issues.apache.org/jira/browse/SPARK-40463 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for Python linter

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40461: - Fix Version/s: 3.1.4 3.4.0 3.3.1 3.2.3

[jira] [Resolved] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for Python linter

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40461. -- Resolution: Fixed Fixed in https://github.com/apache/spark/pull/37904 > Set upperbound for

[jira] [Assigned] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40459: - Assignee: Dongjoon Hyun > recoverDiskStore should not stop by existing recomputed

[jira] [Resolved] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40459. --- Fix Version/s: 3.3.2 3.2.3 3.4.0 Resolution:

[jira] [Commented] (SPARK-40462) Support np.ndarray for functions.lit

2022-09-15 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605563#comment-17605563 ] Haejoon Lee commented on SPARK-40462: - Let me take a look > Support np.ndarray for functions.lit >

[jira] [Created] (SPARK-40462) Support np.ndarray for functions.lit

2022-09-15 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-40462: --- Summary: Support np.ndarray for functions.lit Key: SPARK-40462 URL: https://issues.apache.org/jira/browse/SPARK-40462 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for Python linter

2022-09-15 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40461: - Summary: Set upperbound for pyzmq 24.0.0 for Python linter (was: Set upperbound for pyzmq

[jira] [Assigned] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for linters

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40461: Assignee: Apache Spark > Set upperbound for pyzmq 24.0.0 for linters >

[jira] [Commented] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for linters

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605562#comment-17605562 ] Apache Spark commented on SPARK-40461: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for linters

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40461: Assignee: (was: Apache Spark) > Set upperbound for pyzmq 24.0.0 for linters >

[jira] [Created] (SPARK-40461) Set upperbound for pyzmq 24.0.0 for linters

2022-09-15 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-40461: Summary: Set upperbound for pyzmq 24.0.0 for linters Key: SPARK-40461 URL: https://issues.apache.org/jira/browse/SPARK-40461 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-15 Thread Yaohua Zhao (Jira)
Yaohua Zhao created SPARK-40460: --- Summary: Streaming metrics is zero when select _metadata Key: SPARK-40460 URL: https://issues.apache.org/jira/browse/SPARK-40460 Project: Spark Issue Type:

[jira] [Commented] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605559#comment-17605559 ] Apache Spark commented on SPARK-40459: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40459: Assignee: (was: Apache Spark) > recoverDiskStore should not stop by existing

[jira] [Updated] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-40459: -- Description: {code:java} org.apache.commons.io.FileExistsException: File element in parameter

[jira] [Assigned] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40459: Assignee: Apache Spark > recoverDiskStore should not stop by existing recomputed files >

[jira] [Commented] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605558#comment-17605558 ] Apache Spark commented on SPARK-40459: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Created] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-15 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-40459: - Summary: recoverDiskStore should not stop by existing recomputed files Key: SPARK-40459 URL: https://issues.apache.org/jira/browse/SPARK-40459 Project: Spark

[jira] [Assigned] (SPARK-40432) Introduce GroupStateImpl and GroupStateTimeout in PySpark

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40432: Assignee: Jungtaek Lim > Introduce GroupStateImpl and GroupStateTimeout in PySpark >

[jira] [Resolved] (SPARK-40432) Introduce GroupStateImpl and GroupStateTimeout in PySpark

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40432. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37889

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-09-15 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605509#comment-17605509 ] Drew commented on SPARK-40286:

[jira] [Comment Edited] (SPARK-40286) Load Data from S3 deletes data source file

2022-09-15 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605509#comment-17605509 ] Drew edited comment on SPARK-40286 at 9/15/22 8:04 PM:

[jira] [Commented] (SPARK-40359) Migrate JSON type check failures onto error classes

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605496#comment-17605496 ] Apache Spark commented on SPARK-40359: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40359) Migrate JSON type check failures onto error classes

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40359: Assignee: (was: Apache Spark) > Migrate JSON type check failures onto error classes

[jira] [Commented] (SPARK-40359) Migrate JSON type check failures onto error classes

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605495#comment-17605495 ] Apache Spark commented on SPARK-40359: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40359) Migrate JSON type check failures onto error classes

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40359: Assignee: Apache Spark > Migrate JSON type check failures onto error classes >

[jira] [Created] (SPARK-40458) Bump Kubernetes Client Version to 6.1.1

2022-09-15 Thread Attila Zsolt Piros (Jira)
Attila Zsolt Piros created SPARK-40458: -- Summary: Bump Kubernetes Client Version to 6.1.1 Key: SPARK-40458 URL: https://issues.apache.org/jira/browse/SPARK-40458 Project: Spark Issue

[jira] [Updated] (SPARK-40429) Only set KeyGroupedPartitioning when the referenced column is in the output

2022-09-15 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-40429: -- Fix Version/s: 3.3.2 > Only set KeyGroupedPartitioning when the referenced column is in the

[jira] [Commented] (SPARK-19335) Spark should support doing an efficient DataFrame Upsert via JDBC

2022-09-15 Thread Kboh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605440#comment-17605440 ] Kboh commented on SPARK-19335: -- Also interested in this. ty > Spark should support doing an efficient

[jira] [Commented] (SPARK-40429) Only set KeyGroupedPartitioning when the referenced column is in the output

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605430#comment-17605430 ] Apache Spark commented on SPARK-40429: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Commented] (SPARK-40429) Only set KeyGroupedPartitioning when the referenced column is in the output

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605432#comment-17605432 ] Apache Spark commented on SPARK-40429: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Created] (SPARK-40457) upgrade jackson data mapper to latest

2022-09-15 Thread Bilna (Jira)
Bilna created SPARK-40457: - Summary: upgrade jackson data mapper to latest Key: SPARK-40457 URL: https://issues.apache.org/jira/browse/SPARK-40457 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-39059) When using multiple SparkSessions, DataFrame.resolve uses configuration from the wrong session

2022-09-15 Thread Furcy Pin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Furcy Pin updated SPARK-39059: -- Description: We encountered unexpected error when using SparkSession.newSession and the

[jira] [Updated] (SPARK-39059) When using multiple SparkSessions, DataFrame.resolve uses configuration from the wrong session

2022-09-15 Thread Furcy Pin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Furcy Pin updated SPARK-39059: -- Description: We encountered unexpected error when using SparkSession.newSession and the

[jira] [Commented] (SPARK-40456) PartitionIterator.hasNext should be cheap to call repeatedly

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605358#comment-17605358 ] Apache Spark commented on SPARK-40456: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40456) PartitionIterator.hasNext should be cheap to call repeatedly

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40456: Assignee: Wenchen Fan (was: Apache Spark) > PartitionIterator.hasNext should be cheap

[jira] [Commented] (SPARK-40456) PartitionIterator.hasNext should be cheap to call repeatedly

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605357#comment-17605357 ] Apache Spark commented on SPARK-40456: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40456) PartitionIterator.hasNext should be cheap to call repeatedly

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40456: Assignee: Apache Spark (was: Wenchen Fan) > PartitionIterator.hasNext should be cheap

[jira] [Created] (SPARK-40456) PartitionIterator.hasNext should be cheap to call repeatedly

2022-09-15 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-40456: --- Summary: PartitionIterator.hasNext should be cheap to call repeatedly Key: SPARK-40456 URL: https://issues.apache.org/jira/browse/SPARK-40456 Project: Spark

[jira] [Assigned] (SPARK-40387) Improve the implementation of Spark Decimal

2022-09-15 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40387: --- Assignee: jiaan.geng > Improve the implementation of Spark Decimal >

[jira] [Resolved] (SPARK-40387) Improve the implementation of Spark Decimal

2022-09-15 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40387. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37830

[jira] [Assigned] (SPARK-40433) Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40433: Assignee: Jungtaek Lim > Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row

[jira] [Resolved] (SPARK-40433) Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40433. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37891

[jira] [Commented] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605294#comment-17605294 ] Apache Spark commented on SPARK-40455: -- User 'caican00' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40455: Assignee: Apache Spark > Abort result stage directly when it failed caused by

[jira] [Commented] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605295#comment-17605295 ] Apache Spark commented on SPARK-40455: -- User 'caican00' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40455: Assignee: (was: Apache Spark) > Abort result stage directly when it failed caused by

[jira] [Updated] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caican updated SPARK-40455: --- Description: Here's a very serious bug: When result stage failed caused by FetchFailedException,  the

[jira] [Updated] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caican updated SPARK-40455: --- Description: Here's a very serious bug: When result stage failed caused by FetchFailedException,  the

[jira] [Updated] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caican updated SPARK-40455: --- Description: Here's a very serious bug: When result stage failed caused by `FetchFailedException`,  the

[jira] [Updated] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caican updated SPARK-40455: --- Description: Here's a very serious bug: When result stage failed caused by `FetchFailedException`,  the

[jira] [Updated] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caican updated SPARK-40455: --- Description: Here's a very serious bug:   > Abort result stage directly when it failed caused by

[jira] [Created] (SPARK-40455) Abort result stage directly when it failed caused by FetchFailed

2022-09-15 Thread caican (Jira)
caican created SPARK-40455: -- Summary: Abort result stage directly when it failed caused by FetchFailed Key: SPARK-40455 URL: https://issues.apache.org/jira/browse/SPARK-40455 Project: Spark Issue
