[jira] [Assigned] (SPARK-40946) Introduce a new DataSource V2 interface SupportsPushDownClusterKeys

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40946: Assignee: Apache Spark > Introduce a new DataSource V2 interface SupportsPushDownClusterK

[jira] [Commented] (SPARK-40946) Introduce a new DataSource V2 interface SupportsPushDownClusterKeys

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626007#comment-17626007 ] Apache Spark commented on SPARK-40946: -- User 'huaxingao' has created a pull request

[jira] [Assigned] (SPARK-40946) Introduce a new DataSource V2 interface SupportsPushDownClusterKeys

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40946: Assignee: (was: Apache Spark) > Introduce a new DataSource V2 interface SupportsPushD

[jira] [Updated] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YUBI LEE updated SPARK-40964: - Description: Since SPARK-33212, Spark uses shaded client jars from Hadoop 3.x+. If you try to start Spar

[jira] [Updated] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YUBI LEE updated SPARK-40964: - Description: Since SPARK-33212, Spark uses shaded client jars from Hadoop 3.x+. If you try to start Spar

[jira] [Updated] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YUBI LEE updated SPARK-40964: - Description: Since SPARK-33212, Spark uses shaded client jars from Hadoop 3.x+. If you try to start Spar

[jira] [Updated] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YUBI LEE updated SPARK-40964: - Description: Since SPARK-33212, Spark uses shaded client jars from Hadoop 3.x+. In this situation, if yo

[jira] [Updated] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YUBI LEE updated SPARK-40964: - Description: Since SPARK-33212, Spark uses shaded client jars from Hadoop 3.x+. In this situation, if yo

[jira] [Created] (SPARK-40964) Cannot run spark history server with shaded hadoop jar

2022-10-28 Thread YUBI LEE (Jira)
YUBI LEE created SPARK-40964: Summary: Cannot run spark history server with shaded hadoop jar Key: SPARK-40964 URL: https://issues.apache.org/jira/browse/SPARK-40964 Project: Spark Issue Type: Bu

[jira] [Updated] (SPARK-40963) containsNull in array type attributes is not updated from child output

2022-10-28 Thread Bruce Robbins (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-40963: -- Affects Version/s: 3.1.3 > containsNull in array type attributes is not updated from child out

[jira] [Updated] (SPARK-40963) containsNull in array type attributes is not updated from child output

2022-10-28 Thread Bruce Robbins (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-40963: -- Affects Version/s: 3.2.2 > containsNull in array type attributes is not updated from child out

[jira] [Updated] (SPARK-40963) containsNull in array type attributes is not updated from child output

2022-10-28 Thread Bruce Robbins (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-40963: -- Affects Version/s: 3.3.1 > containsNull in array type attributes is not updated from child out

[jira] [Commented] (SPARK-40963) containsNull in array type attributes is not updated from child output

2022-10-28 Thread Bruce Robbins (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625955#comment-17625955 ] Bruce Robbins commented on SPARK-40963: --- I'll take a stab at fixing this in the ne

[jira] [Created] (SPARK-40963) containsNull in array type attributes is not updated from child output

2022-10-28 Thread Bruce Robbins (Jira)
Bruce Robbins created SPARK-40963: - Summary: containsNull in array type attributes is not updated from child output Key: SPARK-40963 URL: https://issues.apache.org/jira/browse/SPARK-40963 Project: Spa

[jira] [Commented] (SPARK-40939) Release a shaded version of Apache Spark / shade jars on main jar

2022-10-28 Thread Erik Krogen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625953#comment-17625953 ] Erik Krogen commented on SPARK-40939: - As a reference for prior work, there is also

[jira] [Assigned] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40943: Assignee: Apache Spark > Make MSCK optional in MSCK REPAIR TABLE commands > -

[jira] [Assigned] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40943: Assignee: (was: Apache Spark) > Make MSCK optional in MSCK REPAIR TABLE commands > --

[jira] [Commented] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625952#comment-17625952 ] Apache Spark commented on SPARK-40943: -- User 'ben-zhang' has created a pull request

[jira] [Commented] (SPARK-36124) Support set operators to be on correlation paths

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625881#comment-17625881 ] Apache Spark commented on SPARK-36124: -- User 'jchen5' has created a pull request fo

[jira] [Commented] (SPARK-36124) Support set operators to be on correlation paths

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625879#comment-17625879 ] Apache Spark commented on SPARK-36124: -- User 'jchen5' has created a pull request fo

[jira] [Assigned] (SPARK-36124) Support set operators to be on correlation paths

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36124: Assignee: Apache Spark > Support set operators to be on correlation paths > -

[jira] [Assigned] (SPARK-36124) Support set operators to be on correlation paths

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36124: Assignee: (was: Apache Spark) > Support set operators to be on correlation paths > --

[jira] [Updated] (SPARK-39319) Make query context as part of SparkThrowable

2022-10-28 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-39319: --- Parent: SPARK-38615 Issue Type: Sub-task (was: Improvement) > Make query context as

[jira] [Updated] (SPARK-39365) Truncate fragment of query context if it is too long

2022-10-28 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-39365: --- Parent: (was: SPARK-39319) Issue Type: Task (was: Sub-task) > Truncate fragment

[jira] [Commented] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625853#comment-17625853 ] Apache Spark commented on SPARK-40956: -- User 'carlfu-db' has created a pull request

[jira] [Assigned] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40956: Assignee: (was: Apache Spark) > SQL Equivalent for Dataframe overwrite command >

[jira] [Assigned] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40956: Assignee: Apache Spark > SQL Equivalent for Dataframe overwrite command > ---

[jira] [Commented] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625851#comment-17625851 ] Apache Spark commented on SPARK-40956: -- User 'carlfu-db' has created a pull request

[jira] [Updated] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chengyan fu updated SPARK-40956: Description: Proposing syntax {code:java} INSERT INTO tbl REPLACE whereClause identifierList{code

[jira] [Updated] (SPARK-38615) SQL Error Attribution Framework

2022-10-28 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38615: --- Summary: SQL Error Attribution Framework (was: Provide error context for runtime ANSI failu

[jira] [Updated] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chengyan fu updated SPARK-40956: Description: Proposing syntax {code:java} INSERT INTO tbl REPLACE whereClause identifierList{code

[jira] (SPARK-40686) Support data masking and redacting built-in functions

2022-10-28 Thread Vinod KC (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686 ] Vinod KC deleted comment on SPARK-40686: -- was (Author: vinodkc): I'm working on these 6 sub-tasks > Support data masking and redacting built-in functions > -

[jira] [Commented] (SPARK-40686) Support data masking and redacting built-in functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625827#comment-17625827 ] Daniel commented on SPARK-40686: > Daniel , Yes, please add them here. Let us change the

[jira] [Comment Edited] (SPARK-40686) Support data masking and redacting built-in functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625827#comment-17625827 ] Daniel edited comment on SPARK-40686 at 10/28/22 5:40 PM: -- bq.

[jira] [Commented] (SPARK-40918) Mismatch between ParquetFileFormat and FileSourceScanExec in # columns for WSCG.isTooManyFields when using _metadata

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625826#comment-17625826 ] Apache Spark commented on SPARK-40918: -- User 'juliuszsompolski' has created a pull

[jira] [Created] (SPARK-40962) Support data masking built-in function 'diff_privacy'

2022-10-28 Thread Daniel (Jira)
Daniel created SPARK-40962: -- Summary: Support data masking built-in function 'diff_privacy' Key: SPARK-40962 URL: https://issues.apache.org/jira/browse/SPARK-40962 Project: Spark Issue Type: Sub-tas

[jira] [Created] (SPARK-40961) Support data masking built-in functions for HIPAA compliant age/date/birthday/zipcode display

2022-10-28 Thread Daniel (Jira)
Daniel created SPARK-40961: -- Summary: Support data masking built-in functions for HIPAA compliant age/date/birthday/zipcode display Key: SPARK-40961 URL: https://issues.apache.org/jira/browse/SPARK-40961 Pro

[jira] [Updated] (SPARK-40686) Support data masking and redacting built-in functions

2022-10-28 Thread Vinod KC (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod KC updated SPARK-40686: - Description: Support built-in data masking and redacting functions (was: Support built-in data masking

[jira] [Created] (SPARK-40960) Support data masking built-in functions 'bcrypt'

2022-10-28 Thread Daniel (Jira)
Daniel created SPARK-40960: -- Summary: Support data masking built-in functions 'bcrypt' Key: SPARK-40960 URL: https://issues.apache.org/jira/browse/SPARK-40960 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-40959) Support data masking built-in function 'mask_default'

2022-10-28 Thread Daniel (Jira)
Daniel created SPARK-40959: -- Summary: Support data masking built-in function 'mask_default' Key: SPARK-40959 URL: https://issues.apache.org/jira/browse/SPARK-40959 Project: Spark Issue Type: Sub-tas

[jira] [Created] (SPARK-40958) Support data masking built-in function 'null'

2022-10-28 Thread Daniel (Jira)
Daniel created SPARK-40958: -- Summary: Support data masking built-in function 'null' Key: SPARK-40958 URL: https://issues.apache.org/jira/browse/SPARK-40958 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-40686) Support data masking and redacting built-in functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel updated SPARK-40686: --- Summary: Support data masking and redacting built-in functions (was: Support data masking built-in function

[jira] [Commented] (SPARK-40957) Add in memory cache in HDFSMetadataLog

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625821#comment-17625821 ] Apache Spark commented on SPARK-40957: -- User 'jerrypeng' has created a pull request

[jira] [Assigned] (SPARK-40957) Add in memory cache in HDFSMetadataLog

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40957: Assignee: Apache Spark > Add in memory cache in HDFSMetadataLog > ---

[jira] [Commented] (SPARK-40957) Add in memory cache in HDFSMetadataLog

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625820#comment-17625820 ] Apache Spark commented on SPARK-40957: -- User 'jerrypeng' has created a pull request

[jira] [Assigned] (SPARK-40957) Add in memory cache in HDFSMetadataLog

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40957: Assignee: (was: Apache Spark) > Add in memory cache in HDFSMetadataLog >

[jira] [Commented] (SPARK-40686) Support data masking built-in functions

2022-10-28 Thread Vinod KC (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625819#comment-17625819 ] Vinod KC commented on SPARK-40686: -- [~dtenedor] , Yes, please add them here. Let us cha

[jira] [Created] (SPARK-40957) Add in memory cache in HDFSMetadataLog

2022-10-28 Thread Boyang Jerry Peng (Jira)
Boyang Jerry Peng created SPARK-40957: - Summary: Add in memory cache in HDFSMetadataLog Key: SPARK-40957 URL: https://issues.apache.org/jira/browse/SPARK-40957 Project: Spark Issue Type:

[jira] [Updated] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chengyan fu updated SPARK-40956: Description: Proposing syntax ```INSERT INTO tbl REPLACE whereClause identifierList``` to the spa

[jira] [Updated] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chengyan fu updated SPARK-40956: Description: Proposing syntax ```INSERT INTO tbl REPLACE whereClause identif

[jira] [Updated] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chengyan fu updated SPARK-40956: Description: Proposing syntax ```INSERT INTO tbl REPLACE whereClause identifierList``` to the spa

[jira] [Created] (SPARK-40956) SQL Equivalent for Dataframe overwrite command

2022-10-28 Thread chengyan fu (Jira)
chengyan fu created SPARK-40956: --- Summary: SQL Equivalent for Dataframe overwrite command Key: SPARK-40956 URL: https://issues.apache.org/jira/browse/SPARK-40956 Project: Spark Issue Type: New

[jira] [Comment Edited] (SPARK-34827) Support fetching shuffle blocks in batch with i/o encryption

2022-10-28 Thread Nikhil Sharma (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625809#comment-17625809 ] Nikhil Sharma edited comment on SPARK-34827 at 10/28/22 4:54 PM: -

[jira] [Comment Edited] (SPARK-34827) Support fetching shuffle blocks in batch with i/o encryption

2022-10-28 Thread Nikhil Sharma (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625809#comment-17625809 ] Nikhil Sharma edited comment on SPARK-34827 at 10/28/22 4:53 PM: -

[jira] [Commented] (SPARK-34827) Support fetching shuffle blocks in batch with i/o encryption

2022-10-28 Thread Nikhil Sharma (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625809#comment-17625809 ] Nikhil Sharma commented on SPARK-34827: --- Thank you for sharing such good informati

[jira] [Commented] (SPARK-40686) Support data masking built-in functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625805#comment-17625805 ] Daniel commented on SPARK-40686: In addition to the functions listed in subtasks here so

[jira] [Commented] (SPARK-40686) Support data masking built-in functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625803#comment-17625803 ] Daniel commented on SPARK-40686: I closed https://issues.apache.org/jira/browse/SPARK-40

[jira] [Resolved] (SPARK-40623) Add new SQL built-in functions to help with redacting data

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel resolved SPARK-40623. Resolution: Duplicate closing as a dup of https://issues.apache.org/jira/browse/SPARK-40686 > Add new SQL

[jira] [Commented] (SPARK-40652) Add MASK_PHONE and TRY_MASK_PHONE functions

2022-10-28 Thread Daniel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625800#comment-17625800 ] Daniel commented on SPARK-40652: I am going to close this as there is a duplicate effort

[jira] [Updated] (SPARK-40955) allow DSV2 Predicate pushdown in FileScanBuilder.pushedDataFilter

2022-10-28 Thread RJ Marcus (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] RJ Marcus updated SPARK-40955: -- Description: overall: Allow FileScanBuilder to push `Predicate` instead of `Filter` for data

[jira] [Created] (SPARK-40955) allow DSV2 Predicate pushdown in FileScanBuilder.pushedDataFilter

2022-10-28 Thread RJ Marcus (Jira)
RJ Marcus created SPARK-40955: - Summary: allow DSV2 Predicate pushdown in FileScanBuilder.pushedDataFilter Key: SPARK-40955 URL: https://issues.apache.org/jira/browse/SPARK-40955 Project: Spark

[jira] [Updated] (SPARK-40954) Kubernetes integration tests stuck forever on Mac M1 with Minikube + Docker

2022-10-28 Thread Anton Ippolitov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Ippolitov updated SPARK-40954: Description: h2. Description I tried running Kubernetes integration tests with the Miniku

[jira] [Updated] (SPARK-40954) Kubernetes integration tests stuck forever on Mac M1 with Minikube + Docker

2022-10-28 Thread Anton Ippolitov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Ippolitov updated SPARK-40954: Attachment: TestProcess.scala > Kubernetes integration tests stuck forever on Mac M1 with

[jira] [Created] (SPARK-40954) Kubernetes integration tests stuck forever on Mac M1 with Minikube + Docker

2022-10-28 Thread Anton Ippolitov (Jira)
Anton Ippolitov created SPARK-40954: --- Summary: Kubernetes integration tests stuck forever on Mac M1 with Minikube + Docker Key: SPARK-40954 URL: https://issues.apache.org/jira/browse/SPARK-40954 Pro

[jira] [Commented] (SPARK-40800) Always inline expressions in OptimizeOneRowRelationSubquery

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625783#comment-17625783 ] Apache Spark commented on SPARK-40800: -- User 'allisonwang-db' has created a pull re

[jira] [Commented] (SPARK-40800) Always inline expressions in OptimizeOneRowRelationSubquery

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625782#comment-17625782 ] Apache Spark commented on SPARK-40800: -- User 'allisonwang-db' has created a pull re

[jira] [Commented] (SPARK-40950) isRemoteAddressMaxedOut performance overhead on scala 2.13

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625712#comment-17625712 ] Apache Spark commented on SPARK-40950: -- User 'eejbyfeldt' has created a pull reques

[jira] [Assigned] (SPARK-40950) isRemoteAddressMaxedOut performance overhead on scala 2.13

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40950: Assignee: (was: Apache Spark) > isRemoteAddressMaxedOut performance overhead on scala

[jira] [Commented] (SPARK-40950) isRemoteAddressMaxedOut performance overhead on scala 2.13

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625710#comment-17625710 ] Apache Spark commented on SPARK-40950: -- User 'eejbyfeldt' has created a pull reques

[jira] [Assigned] (SPARK-40950) isRemoteAddressMaxedOut performance overhead on scala 2.13

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40950: Assignee: Apache Spark > isRemoteAddressMaxedOut performance overhead on scala 2.13 > ---

[jira] [Resolved] (SPARK-40932) Barrier: messages for allGather will be overridden by the following barrier APIs

2022-10-28 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40932. - Fix Version/s: 3.3.2 3.4.0 Resolution: Fixed Issue resolved by pull re

[jira] [Assigned] (SPARK-40932) Barrier: messages for allGather will be overridden by the following barrier APIs

2022-10-28 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40932: --- Assignee: Bobby Wang > Barrier: messages for allGather will be overridden by the following

[jira] [Assigned] (SPARK-40912) Overhead of Exceptions in DeserializationStream

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40912: Assignee: Apache Spark > Overhead of Exceptions in DeserializationStream > -

[jira] [Assigned] (SPARK-40912) Overhead of Exceptions in DeserializationStream

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40912: Assignee: (was: Apache Spark) > Overhead of Exceptions in DeserializationStream > --

[jira] [Commented] (SPARK-40912) Overhead of Exceptions in DeserializationStream

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625695#comment-17625695 ] Apache Spark commented on SPARK-40912: -- User 'eejbyfeldt' has created a pull reques

[jira] [Updated] (SPARK-40952) Exception when handling timestamp data in PySpark Structured Streaming

2022-10-28 Thread Kai-Michael Roesner (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai-Michael Roesner updated SPARK-40952: Description: I'm trying to process data that contains timestamps in PySpark "Struc

[jira] [Commented] (SPARK-40749) Migrate type check failures of generators onto error classes

2022-10-28 Thread BingKun Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625674#comment-17625674 ] BingKun Pan commented on SPARK-40749: - I work on it. > Migrate type check failures

[jira] [Assigned] (SPARK-40953) Add missing `limit(n)` in DataFrame.head

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40953: Assignee: (was: Apache Spark) > Add missing `limit(n)` in DataFrame.head > --

[jira] [Commented] (SPARK-40953) Add missing `limit(n)` in DataFrame.head

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625662#comment-17625662 ] Apache Spark commented on SPARK-40953: -- User 'zhengruifeng' has created a pull requ

[jira] [Commented] (SPARK-40953) Add missing `limit(n)` in DataFrame.head

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625664#comment-17625664 ] Apache Spark commented on SPARK-40953: -- User 'zhengruifeng' has created a pull requ

[jira] [Assigned] (SPARK-40953) Add missing `limit(n)` in DataFrame.head

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40953: Assignee: Apache Spark > Add missing `limit(n)` in DataFrame.head > -

[jira] [Created] (SPARK-40953) Add missing `limit(n)` in DataFrame.head

2022-10-28 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40953: - Summary: Add missing `limit(n)` in DataFrame.head Key: SPARK-40953 URL: https://issues.apache.org/jira/browse/SPARK-40953 Project: Spark Issue Type: Sub-ta

[jira] [Resolved] (SPARK-40889) Check error classes in PlanResolutionSuite

2022-10-28 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-40889. -- Resolution: Fixed Issue resolved by pull request 38421 [https://github.com/apache/spark/pull/38421] >

[jira] [Assigned] (SPARK-40889) Check error classes in PlanResolutionSuite

2022-10-28 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40889: Assignee: BingKun Pan > Check error classes in PlanResolutionSuite >

[jira] [Resolved] (SPARK-40936) Refactor `AnalysisTest#assertAnalysisErrorClass` by reusing the `SparkFunSuite#checkError`

2022-10-28 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-40936. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38413 [https://github.com

[jira] [Assigned] (SPARK-40936) Refactor `AnalysisTest#assertAnalysisErrorClass` by reusing the `SparkFunSuite#checkError`

2022-10-28 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40936: Assignee: Yang Jie > Refactor `AnalysisTest#assertAnalysisErrorClass` by reusing the > `SparkFun

[jira] [Updated] (SPARK-40952) Exception when handling timestamp data in PySpark Structured Streaming

2022-10-28 Thread Kai-Michael Roesner (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai-Michael Roesner updated SPARK-40952: Description: I'm trying to process data that contains timestamps in PySpark "Struc

[jira] [Updated] (SPARK-40952) Exception when handling timestamp data in PySpark Structured Streaming

2022-10-28 Thread Kai-Michael Roesner (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai-Michael Roesner updated SPARK-40952: Description: I'm trying to process data that contains timestamps in PySpark "Struc

[jira] [Created] (SPARK-40952) Exception when handling timestamp data in PySpark Structured Streaming

2022-10-28 Thread Kai-Michael Roesner (Jira)
Kai-Michael Roesner created SPARK-40952: --- Summary: Exception when handling timestamp data in PySpark Structured Streaming Key: SPARK-40952 URL: https://issues.apache.org/jira/browse/SPARK-40952

[jira] [Assigned] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40951: Assignee: Apache Spark > pyspark-connect tests should be skipped if pandas doesn't exist

[jira] [Commented] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625566#comment-17625566 ] Apache Spark commented on SPARK-40951: -- User 'dongjoon-hyun' has created a pull req

[jira] [Assigned] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40951: Assignee: (was: Apache Spark) > pyspark-connect tests should be skipped if pandas doe

[jira] [Created] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist

2022-10-28 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-40951: - Summary: pyspark-connect tests should be skipped if pandas doesn't exist Key: SPARK-40951 URL: https://issues.apache.org/jira/browse/SPARK-40951 Project: Spark

[jira] [Commented] (SPARK-40229) Re-enable excel I/O test for pandas API on Spark.

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625550#comment-17625550 ] Apache Spark commented on SPARK-40229: -- User 'dongjoon-hyun' has created a pull req

[jira] [Updated] (SPARK-40950) isRemoteAddressMaxedOut performance overhead on scala 2.13

2022-10-28 Thread Emil Ejbyfeldt (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emil Ejbyfeldt updated SPARK-40950: --- Summary: isRemoteAddressMaxedOut performance overhead on scala 2.13 (was: On scala 2.13 isR

[jira] [Created] (SPARK-40950) On scala 2.13 isRemoteAddressMaxedOut performance overhead

2022-10-28 Thread Emil Ejbyfeldt (Jira)
Emil Ejbyfeldt created SPARK-40950: -- Summary: On scala 2.13 isRemoteAddressMaxedOut performance overhead Key: SPARK-40950 URL: https://issues.apache.org/jira/browse/SPARK-40950 Project: Spark

[jira] [Assigned] (SPARK-40949) Implement `DataFrame.sortWithinPartitions`

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40949: Assignee: (was: Apache Spark) > Implement `DataFrame.sortWithinPartitions` >

[jira] [Commented] (SPARK-40949) Implement `DataFrame.sortWithinPartitions`

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625526#comment-17625526 ] Apache Spark commented on SPARK-40949: -- User 'zhengruifeng' has created a pull requ

[jira] [Assigned] (SPARK-40949) Implement `DataFrame.sortWithinPartitions`

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40949: Assignee: Apache Spark > Implement `DataFrame.sortWithinPartitions` > ---

[jira] [Commented] (SPARK-40949) Implement `DataFrame.sortWithinPartitions`

2022-10-28 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625524#comment-17625524 ] Apache Spark commented on SPARK-40949: -- User 'zhengruifeng' has created a pull requ
