[jira] [Created] (SPARK-48690) SPJ: Support auto-shuffle one side + less join keys than partition keys

2024-06-21 Thread Szehon Ho (Jira)
Szehon Ho created SPARK-48690: - Summary: SPJ: Support auto-shuffle one side + less join keys than partition keys Key: SPARK-48690 URL: https://issues.apache.org/jira/browse/SPARK-48690 Project: Spark

[jira] [Updated] (SPARK-48689) Reading lengthy JSON results in a corrupted record.

2024-06-21 Thread Yuxiang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuxiang Wei updated SPARK-48689: Description: When reading a data frame from a JSON file including a very long string, spark will

[jira] [Created] (SPARK-48689) Reading lengthy JSON results in a corrupted record.

2024-06-21 Thread Yuxiang Wei (Jira)
Yuxiang Wei created SPARK-48689: --- Summary: Reading lengthy JSON results in a corrupted record. Key: SPARK-48689 URL: https://issues.apache.org/jira/browse/SPARK-48689 Project: Spark Issue

[jira] [Created] (SPARK-48688) Return reasonable error when calling SQL to_avro and from_avro functions but Avro is not loaded by default

2024-06-21 Thread Daniel (Jira)
Daniel created SPARK-48688: -- Summary: Return reasonable error when calling SQL to_avro and from_avro functions but Avro is not loaded by default Key: SPARK-48688 URL: https://issues.apache.org/jira/browse/SPARK-48688

[jira] [Commented] (SPARK-48687) Add changes to implement state schema validation in planning phase on driver for stateful streaming queries

2024-06-21 Thread Anish Shrigondekar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856883#comment-17856883 ] Anish Shrigondekar commented on SPARK-48687: PR here -

[jira] [Created] (SPARK-48687) Add changes to implement state schema validation in planning phase on driver for stateful streaming queries

2024-06-21 Thread Anish Shrigondekar (Jira)
Anish Shrigondekar created SPARK-48687: -- Summary: Add changes to implement state schema validation in planning phase on driver for stateful streaming queries Key: SPARK-48687 URL:

[jira] [Created] (SPARK-48686) Improve performance of ParserUtils.unescapeSQLString

2024-06-21 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-48686: -- Summary: Improve performance of ParserUtils.unescapeSQLString Key: SPARK-48686 URL: https://issues.apache.org/jira/browse/SPARK-48686 Project: Spark Issue Type:

[jira] [Created] (SPARK-48685) PySpark MinHashLSH when used with CountVectorizer doesn't meet requirements

2024-06-21 Thread Etienne Soulard-Geoffrion (Jira)
Etienne Soulard-Geoffrion created SPARK-48685: - Summary: PySpark MinHashLSH when used with CountVectorizer doesn't meet requirements Key: SPARK-48685 URL:

[jira] [Resolved] (SPARK-48545) Create to_avro and from_avro SQL functions to match PySpark equivalent

2024-06-21 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-48545. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46977

[jira] [Assigned] (SPARK-48545) Create to_avro and from_avro SQL functions to match PySpark equivalent

2024-06-21 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-48545: -- Assignee: Daniel > Create to_avro and from_avro SQL functions to match PySpark

[jira] [Assigned] (SPARK-48655) SPJ: Add tests for shuffle skipping for aggregate queries

2024-06-21 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-48655: Assignee: Szehon Ho > SPJ: Add tests for shuffle skipping for aggregate queries >

[jira] [Commented] (SPARK-48463) MLLib function unable to handle nested data

2024-06-21 Thread Chhavi Bansal (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856832#comment-17856832 ] Chhavi Bansal commented on SPARK-48463: --- Thank you for the update. > MLLib function unable to

[jira] [Assigned] (SPARK-48675) Cache table doesn't work with collated column

2024-06-21 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-48675: --- Assignee: Nikola Mandic > Cache table doesn't work with collated column >

[jira] [Resolved] (SPARK-48675) Cache table doesn't work with collated column

2024-06-21 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-48675. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47045

[jira] [Updated] (SPARK-48680) Add char/varchar doc to language specific tables

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48680: --- Labels: pull-request-available (was: ) > Add char/varchar doc to language specific tables

[jira] [Updated] (SPARK-48664) Add official image Dockerfile for Apache Spark 4.0.0-preview1

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48664: --- Labels: pull-request-available (was: ) > Add official image Dockerfile for Apache Spark

[jira] [Updated] (SPARK-48656) ArrayIndexOutOfBoundsException in CartesianRDD getPartitions

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48656: --- Labels: pull-request-available (was: ) > ArrayIndexOutOfBoundsException in CartesianRDD

[jira] [Updated] (SPARK-48655) SPJ: Add tests for shuffle skipping for aggregate queries

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48655: --- Labels: pull-request-available (was: ) > SPJ: Add tests for shuffle skipping for aggregate

[jira] [Resolved] (SPARK-48684) Print related JIRA summary before proceeding merge

2024-06-21 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-48684. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47057

[jira] [Updated] (SPARK-48684) Print related JIRA summary before proceeding merge

2024-06-21 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-48684: - Component/s: Project Infra (was: SQL) > Print related JIRA summary before

[jira] [Updated] (SPARK-48684) Print related JIRA summary before proceeding merge

2024-06-21 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-48684: - Priority: Minor (was: Major) > Print related JIRA summary before proceeding merge >

[jira] [Created] (SPARK-48684) Print related JIRA summary before proceeding merge

2024-06-21 Thread Kent Yao (Jira)
Kent Yao created SPARK-48684: Summary: Print related JIRA summary before proceeding merge Key: SPARK-48684 URL: https://issues.apache.org/jira/browse/SPARK-48684 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-48662) Fix StructsToXml expression with collations

2024-06-21 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-48662: Assignee: Mihailo Milosevic > Fix StructsToXml expression with collations >

[jira] [Resolved] (SPARK-48662) Fix StructsToXml expression with collations

2024-06-21 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-48662. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47053

[jira] [Commented] (SPARK-48463) MLLib function unable to handle nested data

2024-06-21 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856713#comment-17856713 ] Weichen Xu commented on SPARK-48463: I will try to do it this sprint. (and then cherrypick it to

[jira] [Assigned] (SPARK-47258) Assign error classes to SHOW CREATE TABLE errors

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-47258: -- Assignee: Apache Spark > Assign error classes to SHOW CREATE TABLE errors >

[jira] [Assigned] (SPARK-47258) Assign error classes to SHOW CREATE TABLE errors

2024-06-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-47258: -- Assignee: (was: Apache Spark) > Assign error classes to SHOW CREATE TABLE errors

[jira] [Comment Edited] (SPARK-48666) A filter should not be pushed down if it contains Unevaluable expression

2024-06-21 Thread Yokesh NK (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856433#comment-17856433 ] Yokesh NK edited comment on SPARK-48666 at 6/21/24 9:11 AM: During

[jira] [Updated] (SPARK-48683) Schema evolution with `df.mergeInto` losing `when` clauses

2024-06-21 Thread Pengfei Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengfei Xu updated SPARK-48683: --- Component/s: SQL (was: Spark Core) > Schema evolution with `df.mergeInto`

[jira] [Created] (SPARK-48683) Schema evolution with `df.mergeInto` losing `when` clauses

2024-06-21 Thread Pengfei Xu (Jira)
Pengfei Xu created SPARK-48683: -- Summary: Schema evolution with `df.mergeInto` losing `when` clauses Key: SPARK-48683 URL: https://issues.apache.org/jira/browse/SPARK-48683 Project: Spark Issue

[jira] [Resolved] (SPARK-48659) Unify v1 and v2 ALTER TABLE .. SET TBLPROPERTIES tests

2024-06-21 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-48659. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47018

[jira] [Assigned] (SPARK-48659) Unify v1 and v2 ALTER TABLE .. SET TBLPROPERTIES tests

2024-06-21 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-48659: --- Assignee: BingKun Pan > Unify v1 and v2 ALTER TABLE .. SET TBLPROPERTIES tests >

[jira] [Created] (SPARK-48682) Use ICU in InitCap expression (UTF8_BINARY collation)

2024-06-21 Thread Uroš Bojanić (Jira)
Uroš Bojanić created SPARK-48682: Summary: Use ICU in InitCap expression (UTF8_BINARY collation) Key: SPARK-48682 URL: https://issues.apache.org/jira/browse/SPARK-48682 Project: Spark Issue

[jira] [Created] (SPARK-48681) Use ICU in Lower/Upper expressions (UTF8_BINARY collation)

2024-06-21 Thread Uroš Bojanić (Jira)
Uroš Bojanić created SPARK-48681: Summary: Use ICU in Lower/Upper expressions (UTF8_BINARY collation) Key: SPARK-48681 URL: https://issues.apache.org/jira/browse/SPARK-48681 Project: Spark

[jira] [Created] (SPARK-48680) Add char/varchar doc to language specific tables

2024-06-21 Thread Kent Yao (Jira)
Kent Yao created SPARK-48680: Summary: Add char/varchar doc to language specific tables Key: SPARK-48680 URL: https://issues.apache.org/jira/browse/SPARK-48680 Project: Spark Issue Type: