[jira] [Assigned] (SPARK-40592) Implement `min_count` in `GroupBy.max`

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40592: Assignee: (was: Apache Spark) > Implement `min_count` in `GroupBy.max` >

[jira] [Assigned] (SPARK-40592) Implement `min_count` in `GroupBy.max`

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40592: Assignee: Apache Spark > Implement `min_count` in `GroupBy.max` >

[jira] [Commented] (SPARK-40592) Implement `min_count` in `GroupBy.max`

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610348#comment-17610348 ] Apache Spark commented on SPARK-40592: -- User 'zhengruifeng' has created a pull request for this

[jira] [Created] (SPARK-40592) Implement `min_count` in `GroupBy.max`

2022-09-27 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40592: - Summary: Implement `min_count` in `GroupBy.max` Key: SPARK-40592 URL: https://issues.apache.org/jira/browse/SPARK-40592 Project: Spark Issue Type:

[jira] [Updated] (SPARK-40584) Incorrect Count when reading CSV file

2022-09-27 Thread Tarique Anwer (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarique Anwer updated SPARK-40584: -- Priority: Major (was: Minor) > Incorrect Count when reading CSV file >

[jira] [Commented] (SPARK-40584) Incorrect Count when reading CSV file

2022-09-27 Thread Tarique Anwer (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610343#comment-17610343 ] Tarique Anwer commented on SPARK-40584: --- Thank You for the workaround. It worked. Would you mind

[jira] [Updated] (SPARK-40490) `YarnShuffleIntegrationSuite` no longer verifies `registeredExecFile` reload after SPARK-17321

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40490: Fix Version/s: 3.3.1 (was: 3.3.2) > `YarnShuffleIntegrationSuite` no

[jira] [Updated] (SPARK-40385) Classes with companion object constructor fails interpreted path

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40385: Fix Version/s: 3.3.1 (was: 3.3.2) > Classes with companion object

[jira] [Updated] (SPARK-40508) Treat unknown partitioning as UnknownPartitioning

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40508: Fix Version/s: 3.3.1 > Treat unknown partitioning as UnknownPartitioning >

[jira] [Updated] (SPARK-38803) Set minio cpu to 250m (0.25) in K8s IT

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-38803: Fix Version/s: 3.3.1 (was: 3.3.2) > Set minio cpu to 250m (0.25) in K8s IT

[jira] [Updated] (SPARK-38802) Support spark.kubernetes.test.(driver|executor)RequestCores

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-38802: Fix Version/s: 3.3.1 (was: 3.3.2) > Support

[jira] [Updated] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40460: Fix Version/s: 3.3.1 (was: 3.3.2) > Streaming metrics is zero when select

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40468: Fix Version/s: 3.3.1 (was: 3.3.2) > Column pruning is not handled

[jira] [Updated] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40459: Fix Version/s: 3.3.1 (was: 3.3.2) > recoverDiskStore should not stop by

[jira] [Updated] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-38017: Fix Version/s: 3.3.1 (was: 3.3.0) > Fix the API doc for window to say it

[jira] [Updated] (SPARK-40429) Only set KeyGroupedPartitioning when the referenced column is in the output

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40429: Fix Version/s: 3.3.1 (was: 3.3.2) > Only set KeyGroupedPartitioning when

[jira] [Updated] (SPARK-40423) Add explicit YuniKorn queue submission test coverage

2022-09-27 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-40423: Fix Version/s: 3.3.1 (was: 3.3.2) > Add explicit YuniKorn queue submission

[jira] [Updated] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-40591: - Labels: correctness (was: ) > ignoreCorruptFiles results data loss >

[jira] [Commented] (SPARK-40589) Fix `DataFrame.corr_with`

2022-09-27 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610321#comment-17610321 ] Haejoon Lee commented on SPARK-40589: - Reported issue for pandas community:

[jira] [Commented] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610320#comment-17610320 ] Apache Spark commented on SPARK-40591: -- User 'yaooqinn' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40591: Assignee: (was: Apache Spark) > ignoreCorruptFiles results data loss >

[jira] [Assigned] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40591: Assignee: Apache Spark > ignoreCorruptFiles results data loss >

[jira] [Updated] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40582: - Priority: Major (was: Critical) > NullPointerException: Cannot invoke >

[jira] [Commented] (SPARK-40584) Incorrect Count when reading CSV file

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610319#comment-17610319 ] Hyukjin Kwon commented on SPARK-40584: -- As a workaround, you can do: {code} df_inputfile = ...

[jira] [Resolved] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40579. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38017

[jira] [Updated] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-40591: - Description: Let's take a look at the case below, the left and the right are visiting the same table

[jira] [Assigned] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40579: Assignee: Ruifeng Zheng > `GroupBy.first` should skip nulls >

[jira] [Resolved] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40578. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38016

[jira] [Assigned] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40578: Assignee: Haejoon Lee > Fix `IndexesTest.test_to_frame` when pandas 1.5.0 >

[jira] [Updated] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-40591: - Description: !image-2022-09-28-09-20-21-693.png! was: !image-2022-09-28-09-19-45-522.png!

[jira] [Updated] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-40591: - Attachment: image-2022-09-28-09-20-21-693.png > ignoreCorruptFiles results data loss >

[jira] [Created] (SPARK-40591) ignoreCorruptFiles results data loss

2022-09-27 Thread Kent Yao (Jira)
Kent Yao created SPARK-40591: Summary: ignoreCorruptFiles results data loss Key: SPARK-40591 URL: https://issues.apache.org/jira/browse/SPARK-40591 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-40590) Fix `ps.read_parquet` when pandas_metadata is True

2022-09-27 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610313#comment-17610313 ] Haejoon Lee commented on SPARK-40590: - Let me take a look > Fix `ps.read_parquet` when

[jira] [Created] (SPARK-40590) Fix `ps.read_parquet` when pandas_metadata is True

2022-09-27 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-40590: --- Summary: Fix `ps.read_parquet` when pandas_metadata is True Key: SPARK-40590 URL: https://issues.apache.org/jira/browse/SPARK-40590 Project: Spark Issue Type:

[jira] [Commented] (SPARK-40589) Fix `DataFrame.corr_with`

2022-09-27 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610312#comment-17610312 ] Haejoon Lee commented on SPARK-40589: - I'm fixing > Fix `DataFrame.corr_with` >

[jira] [Created] (SPARK-40589) Fix `DataFrame.corr_with`

2022-09-27 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-40589: --- Summary: Fix `DataFrame.corr_with` Key: SPARK-40589 URL: https://issues.apache.org/jira/browse/SPARK-40589 Project: Spark Issue Type: Sub-task

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610301#comment-17610301 ] Yang Jie edited comment on SPARK-40582 at 9/28/22 12:10 AM: Yes, Scala

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610301#comment-17610301 ] Yang Jie commented on SPARK-40582: -- Yes, Scala 2.13.9 fixes this issue, but 2.13.9 has an [incompatible

[jira] [Assigned] (SPARK-40574) Add PURGE to DROP TABLE doc

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40574: - Assignee: Yuming Wang > Add PURGE to DROP TABLE doc > --- > >

[jira] [Resolved] (SPARK-40574) Add PURGE to DROP TABLE doc

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40574. --- Fix Version/s: 3.3.1 3.2.3 3.4.0 Resolution:

[jira] [Assigned] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40583: - Assignee: Daniel Ranchal > Documentation error in "Integration with Cloud

[jira] [Resolved] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40583. --- Fix Version/s: 3.2.3 3.4.0 3.3.1 Resolution:

[jira] [Created] (SPARK-40588) Sorting issue with AQE turned on

2022-09-27 Thread Swetha Baskaran (Jira)
Swetha Baskaran created SPARK-40588: --- Summary: Sorting issue with AQE turned on Key: SPARK-40588 URL: https://issues.apache.org/jira/browse/SPARK-40588 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-40587) SELECT * shouldn't be empty project list in proto.

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40587: Assignee: Apache Spark > SELECT * shouldn't be empty project list in proto. >

[jira] [Commented] (SPARK-40587) SELECT * shouldn't be empty project list in proto.

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610259#comment-17610259 ] Apache Spark commented on SPARK-40587: -- User 'amaliujia' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40587) SELECT * shouldn't be empty project list in proto.

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40587: Assignee: (was: Apache Spark) > SELECT * shouldn't be empty project list in proto. >

[jira] [Created] (SPARK-40587) SELECT * shouldn't be empty project list in proto.

2022-09-27 Thread Rui Wang (Jira)
Rui Wang created SPARK-40587: Summary: SELECT * shouldn't be empty project list in proto. Key: SPARK-40587 URL: https://issues.apache.org/jira/browse/SPARK-40587 Project: Spark Issue Type:

[jira] [Created] (SPARK-40586) Decouple plan transformation and validation on server side

2022-09-27 Thread Rui Wang (Jira)
Rui Wang created SPARK-40586: Summary: Decouple plan transformation and validation on server side Key: SPARK-40586 URL: https://issues.apache.org/jira/browse/SPARK-40586 Project: Spark Issue

[jira] [Assigned] (SPARK-40585) Support double-quoted identifiers

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40585: Assignee: Apache Spark > Support double-quoted identifiers >

[jira] [Assigned] (SPARK-40585) Support double-quoted identifiers

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40585: Assignee: (was: Apache Spark) > Support double-quoted identifiers >

[jira] [Commented] (SPARK-40585) Support double-quoted identifiers

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610191#comment-17610191 ] Apache Spark commented on SPARK-40585: -- User 'srielau' has created a pull request for this issue:

[jira] [Commented] (SPARK-40585) Support double-quoted identifiers

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610192#comment-17610192 ] Apache Spark commented on SPARK-40585: -- User 'srielau' has created a pull request for this issue:

[jira] [Created] (SPARK-40585) Support double-quoted identifiers

2022-09-27 Thread Serge Rielau (Jira)
Serge Rielau created SPARK-40585: Summary: Support double-quoted identifiers Key: SPARK-40585 URL: https://issues.apache.org/jira/browse/SPARK-40585 Project: Spark Issue Type: New Feature

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610178#comment-17610178 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:50 PM: Will

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610178#comment-17610178 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:49 PM: Will

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610178#comment-17610178 ] Garret Wilson commented on SPARK-40582: --- Will Spark be updated to use the newer version of Scala

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610176#comment-17610176 ] Garret Wilson commented on SPARK-40582: --- That's awesome to hear, Yang! Thanks for the good news.

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610170#comment-17610170 ] Yang Jie edited comment on SPARK-40582 at 9/27/22 5:39 PM: --- It looks like a

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610170#comment-17610170 ] Yang Jie commented on SPARK-40582: -- It looks like a known bug of Scala 2.13.8 and fixed by SPARK-39553, 

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610163#comment-17610163 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:16 PM:

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610163#comment-17610163 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:16 PM:

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610163#comment-17610163 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:15 PM:

[jira] [Comment Edited] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610163#comment-17610163 ] Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:14 PM:

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610163#comment-17610163 ] Garret Wilson commented on SPARK-40582: --- {quote}Do you use Scala 2.13?{quote} Yes. (Sorry; I

[jira] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582 ] Garret Wilson deleted comment on SPARK-40582: --- was (Author: garretwilson): {quote}Do you use Scala 2.13?{quote} No. As per the description, I am using Spark 3.3.0 with Java 17 on Windows

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610162#comment-17610162 ] Garret Wilson commented on SPARK-40582: --- {quote}Do you use Scala 2.13?{quote} No. As per the

[jira] [Commented] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610161#comment-17610161 ] Yang Jie commented on SPARK-40582: -- Do you use Scala 2.13? [~garretwilson] > NullPointerException:

[jira] [Created] (SPARK-40584) Incorrect Count when reading CSV file

2022-09-27 Thread Tarique Anwer (Jira)
Tarique Anwer created SPARK-40584: - Summary: Incorrect Count when reading CSV file Key: SPARK-40584 URL: https://issues.apache.org/jira/browse/SPARK-40584 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40583: Assignee: (was: Apache Spark) > Documentation error in "Integration with Cloud

[jira] [Commented] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610128#comment-17610128 ] Apache Spark commented on SPARK-40583: -- User 'danitico' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40583: Assignee: Apache Spark > Documentation error in "Integration with Cloud Infrastructures"

[jira] [Created] (SPARK-40583) Documentation error in "Integration with Cloud Infrastructures"

2022-09-27 Thread Daniel Ranchal (Jira)
Daniel Ranchal created SPARK-40583: -- Summary: Documentation error in "Integration with Cloud Infrastructures" Key: SPARK-40583 URL: https://issues.apache.org/jira/browse/SPARK-40583 Project: Spark

[jira] [Updated] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Garret Wilson updated SPARK-40582: -- Description: I'm running a simple little Spark 3.3.0 pipeline on Windows 10 using Java 17

[jira] [Created] (SPARK-40582) NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null

2022-09-27 Thread Garret Wilson (Jira)
Garret Wilson created SPARK-40582: - Summary: NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null Key: SPARK-40582 URL:

[jira] [Resolved] (SPARK-38433) Add Shell Code Style Check Action

2022-09-27 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-38433. -- Resolution: Won't Fix > Add Shell Code Style Check Action > -

[jira] [Commented] (SPARK-39877) Unpivot / melt function for PySpark

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610045#comment-17610045 ] Apache Spark commented on SPARK-39877: -- User 'EnricoMi' has created a pull request for this issue:

[jira] [Commented] (SPARK-39877) Unpivot / melt function for PySpark

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610043#comment-17610043 ] Apache Spark commented on SPARK-39877: -- User 'EnricoMi' has created a pull request for this issue:

[jira] [Commented] (SPARK-38864) Unpivot / melt function for Dataset API

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610029#comment-17610029 ] Apache Spark commented on SPARK-38864: -- User 'EnricoMi' has created a pull request for this issue:

[jira] [Commented] (SPARK-27339) Decimal up cast to higher scale fails while reading parquet to Dataset

2022-09-27 Thread sam (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609958#comment-17609958 ] sam commented on SPARK-27339: - [~hyukjin.kwon][~wrschneider99] [~ksbalas]. We are working on a work around

[jira] [Assigned] (SPARK-40562) Add spark.sql.legacy.groupingIdWithAppendedUserGroupBy

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40562: - Assignee: Dongjoon Hyun > Add spark.sql.legacy.groupingIdWithAppendedUserGroupBy >

[jira] [Resolved] (SPARK-40562) Add spark.sql.legacy.groupingIdWithAppendedUserGroupBy

2022-09-27 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40562. --- Fix Version/s: 3.3.1 3.2.3 3.4.0 Resolution:

[jira] [Resolved] (SPARK-40571) Construct a test case to verify fault-tolerance semantic with random python worker failures

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40571. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38008

[jira] [Assigned] (SPARK-40571) Construct a test case to verify fault-tolerance semantic with random python worker failures

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40571: Assignee: Jungtaek Lim > Construct a test case to verify fault-tolerance semantic with

[jira] [Created] (SPARK-40581) Improving testability of GroupState in applyInPandasWithState

2022-09-27 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40581: Summary: Improving testability of GroupState in applyInPandasWithState Key: SPARK-40581 URL: https://issues.apache.org/jira/browse/SPARK-40581 Project: Spark

[jira] [Commented] (SPARK-40580) Update the document for DataFrame.to_orc

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609898#comment-17609898 ] Apache Spark commented on SPARK-40580: -- User 'itholic' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40580) Update the document for DataFrame.to_orc

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40580: Assignee: Apache Spark > Update the document for DataFrame.to_orc >

[jira] [Assigned] (SPARK-40580) Update the document for DataFrame.to_orc

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40580: Assignee: (was: Apache Spark) > Update the document for DataFrame.to_orc >

[jira] [Resolved] (SPARK-35800) Improving testability of GroupState in streaming flatMapGroupsWithState

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-35800. -- Fix Version/s: 3.2.0 Resolution: Fixed Looks like Li Zhang didn't create an ASF Jira

[jira] [Assigned] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40578: Assignee: (was: Apache Spark) > Fix `IndexesTest.test_to_frame` when pandas 1.5.0 >

[jira] [Commented] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609886#comment-17609886 ] Apache Spark commented on SPARK-40578: -- User 'itholic' has created a pull request for this issue:

[jira] [Commented] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609887#comment-17609887 ] Apache Spark commented on SPARK-40578: -- User 'itholic' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40578: Assignee: Apache Spark > Fix `IndexesTest.test_to_frame` when pandas 1.5.0 >

[jira] [Commented] (SPARK-40580) Update the document for DataFrame.to_orc

2022-09-27 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609883#comment-17609883 ] Haejoon Lee commented on SPARK-40580: - Let me take a look > Update the document for

[jira] [Created] (SPARK-40580) Update the document for DataFrame.to_orc

2022-09-27 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-40580: --- Summary: Update the document for DataFrame.to_orc Key: SPARK-40580 URL: https://issues.apache.org/jira/browse/SPARK-40580 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40579: Assignee: (was: Apache Spark) > `GroupBy.first` should skip nulls >

[jira] [Commented] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609877#comment-17609877 ] Apache Spark commented on SPARK-40579: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40579: Assignee: Apache Spark > `GroupBy.first` should skip nulls >

[jira] [Commented] (SPARK-40577) Fix CategoricalIndex.append

2022-09-27 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609875#comment-17609875 ] Apache Spark commented on SPARK-40577: -- User 'itholic' has created a pull request for this issue:

[jira] [Created] (SPARK-40579) `GroupBy.first` should skip nulls

2022-09-27 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40579: - Summary: `GroupBy.first` should skip nulls Key: SPARK-40579 URL: https://issues.apache.org/jira/browse/SPARK-40579 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-40578) Fix `IndexesTest.test_to_frame` when pandas 1.5.0

2022-09-27 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-40578: Summary: Fix `IndexesTest.test_to_frame` when pandas 1.5.0 (was: Fix Index.to_frame) > Fix
