[jira] [Updated] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37570: -- Description: Mypy breaks on a project with pyspark 3.2.0 dependency (worked fine for 3.1.2),

[jira] [Updated] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37575: - Affects Version/s: 3.2.0 > Empty strings and null values are both saved as quoted empty Strings

[jira] [Commented] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454989#comment-17454989 ] Guo Wei commented on SPARK-37575: - Spark 3.2.0 has the same behavior. > Empty strings and null values

[jira] [Commented] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454976#comment-17454976 ] Hyukjin Kwon commented on SPARK-37575: -- Spark 2.4.X is EOL so it won't likely be fixed. Does it

[jira] [Comment Edited] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454974#comment-17454974 ] Guo Wei edited comment on SPARK-37575 at 12/8/21, 6:02 AM: --- As default

[jira] [Comment Edited] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454974#comment-17454974 ] Guo Wei edited comment on SPARK-37575 at 12/8/21, 6:02 AM: --- As default

[jira] [Commented] (SPARK-37551) Argument 1 to "rename" of "DataFrame" has incompatible type in pandas.LocIndexerLike.__getitem__

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454975#comment-17454975 ] Hyukjin Kwon commented on SPARK-37551: -- cc [~XinrongM] too FYI > Argument 1 to "rename" of

[jira] [Commented] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454974#comment-17454974 ] Guo Wei commented on SPARK-37575: - As default writerSettings in CSVOptions, nullValue is "",

[jira] [Commented] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454970#comment-17454970 ] Hyukjin Kwon commented on SPARK-37570: -- cc [~itholic] [~XinrongM] [~zero323] FYI > mypy breaks on

[jira] [Assigned] (SPARK-37576) Support built-in K8s executor roll plugin

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37576: Assignee: (was: Apache Spark) > Support built-in K8s executor roll plugin >

[jira] [Commented] (SPARK-37576) Support built-in K8s executor roll plugin

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454969#comment-17454969 ] Apache Spark commented on SPARK-37576: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-37576) Support built-in K8s executor roll plugin

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37576: Assignee: Apache Spark > Support built-in K8s executor roll plugin >

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37572: - Priority: Major (was: Critical) > Flexible ways of launching executors >

[jira] [Commented] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454968#comment-17454968 ] Hyukjin Kwon commented on SPARK-37575: -- can you set nullValue and emptyValue options? > Empty

[jira] [Created] (SPARK-37576) Support built-in K8s executor roll plugin

2021-12-07 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-37576: - Summary: Support built-in K8s executor roll plugin Key: SPARK-37576 URL: https://issues.apache.org/jira/browse/SPARK-37576 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454966#comment-17454966 ] Guo Wei commented on SPARK-37575: - related issues: https://issues.apache.org/jira/browse/SPARK-17916

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-37571: --- Affects Version/s: 3.3.0 (was: 3.2.0) > decouple amplab jenkins

[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454960#comment-17454960 ] Kousuke Saruta commented on SPARK-37568: [~yoda-mon] OK, please go ahead. > Support 2-arguments

[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Leona Yoda (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454958#comment-17454958 ] Leona Yoda commented on SPARK-37568: I would like to work on this. > Support 2-arguments by the

[jira] [Created] (SPARK-37575) Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values)

2021-12-07 Thread Guo Wei (Jira)
Guo Wei created SPARK-37575: --- Summary: Empty strings and null values are both saved as quoted empty Strings "" rather than "" (for empty strings) and nothing(for null values) Key: SPARK-37575 URL:

[jira] [Assigned] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-12-07 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-37392: --- Assignee: Wenchen Fan > Catalyst optimizer very time-consuming and memory-intensive with

[jira] [Resolved] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-12-07 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-37392. - Fix Version/s: 3.3.0 3.2.1 3.1.3 Resolution: Fixed

[jira] [Assigned] (SPARK-37516) Uses Python's standard string formatter for SQL API in PySpark

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-37516: Assignee: Hyukjin Kwon > Uses Python's standard string formatter for SQL API in PySpark

[jira] [Resolved] (SPARK-37516) Uses Python's standard string formatter for SQL API in PySpark

2021-12-07 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37516. -- Fix Version/s: 3.3.0 Resolution: Fixed Fixed in

[jira] [Commented] (SPARK-37574) Simplify fetchBlocks w/o retry

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454950#comment-17454950 ] Apache Spark commented on SPARK-37574: -- User 'pan3793' has created a pull request for this issue:

[jira] [Commented] (SPARK-37574) Simplify fetchBlocks w/o retry

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454949#comment-17454949 ] Apache Spark commented on SPARK-37574: -- User 'pan3793' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37574) Simplify fetchBlocks w/o retry

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37574: Assignee: Apache Spark > Simplify fetchBlocks w/o retry > --

[jira] [Assigned] (SPARK-37574) Simplify fetchBlocks w/o retry

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37574: Assignee: (was: Apache Spark) > Simplify fetchBlocks w/o retry >

[jira] [Created] (SPARK-37574) Simplify fetchBlocks w/o retry

2021-12-07 Thread Cheng Pan (Jira)
Cheng Pan created SPARK-37574: - Summary: Simplify fetchBlocks w/o retry Key: SPARK-37574 URL: https://issues.apache.org/jira/browse/SPARK-37574 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454947#comment-17454947 ] Kousuke Saruta commented on SPARK-37568: cc: [~yoda-mon] [~YActs] Do you want to work on this?

[jira] [Commented] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454946#comment-17454946 ] Apache Spark commented on SPARK-37573: -- User 'AngersZh' has created a pull request for this

[jira] [Assigned] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37573: Assignee: Apache Spark > IsolatedClient fallbackVersion should be build in version, not

[jira] [Commented] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454945#comment-17454945 ] Apache Spark commented on SPARK-37573: -- User 'AngersZh' has created a pull request for this

[jira] [Assigned] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37573: Assignee: (was: Apache Spark) > IsolatedClient fallbackVersion should be build in

[jira] [Created] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-07 Thread angerszhu (Jira)
angerszhu created SPARK-37573: - Summary: IsolatedClient fallbackVersion should be build in version, not always 2.7.4 Key: SPARK-37573 URL: https://issues.apache.org/jira/browse/SPARK-37573 Project:

[jira] [Assigned] (SPARK-37445) Update hadoop-profile

2021-12-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37445: Assignee: angerszhu > Update hadoop-profile > - > > Key:

[jira] [Resolved] (SPARK-37445) Update hadoop-profile

2021-12-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37445. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34715

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running commands [1],

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running commands [1],

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running commands [1],

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running commands [1],

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running commands [1],

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running a command

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running a command

[jira] [Updated] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dagang Wei updated SPARK-37572: --- Description: Currently Spark launches executor processes by constructing and running a command

[jira] [Created] (SPARK-37572) Flexible ways of launching executors

2021-12-07 Thread Dagang Wei (Jira)
Dagang Wei created SPARK-37572: -- Summary: Flexible ways of launching executors Key: SPARK-37572 URL: https://issues.apache.org/jira/browse/SPARK-37572 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454925#comment-17454925 ] Shane Knapp commented on SPARK-37571: - this is gonna take a while... nearly a decade later,

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Attachment: audit.txt > decouple amplab jenkins from spark website, builds and tests >

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Attachment: spark-repo-to-be-audited.txt > decouple amplab jenkins from spark website, builds and

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Attachment: spark-repo-to-be-audited.txt > decouple amplab jenkins from spark website, builds and

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Attachment: (was: spark-repo-to-be-audited.txt) > decouple amplab jenkins from spark website,

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Summary: decouple amplab jenkins from spark website, builds and tests (was: decouple jenkins

[jira] [Updated] (SPARK-37571) decouple amplab jenkins from spark website, builds and tests

2021-12-07 Thread Shane Knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Knapp updated SPARK-37571: Description: we will be turning off jenkins on dec 23rd, and we need to decouple the build infra

[jira] [Created] (SPARK-37571) decouple jenkins from spark builds and tests

2021-12-07 Thread Shane Knapp (Jira)
Shane Knapp created SPARK-37571: --- Summary: decouple jenkins from spark builds and tests Key: SPARK-37571 URL: https://issues.apache.org/jira/browse/SPARK-37571 Project: Spark Issue Type:

[jira] [Created] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-07 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37570: - Summary: mypy breaks on pyspark.pandas.plot.core.Bucketizer Key: SPARK-37570 URL: https://issues.apache.org/jira/browse/SPARK-37570 Project: Spark Issue

[jira] [Created] (SPARK-37569) View Analysis incorrectly marks nested fields as nullable

2021-12-07 Thread Shardul Mahadik (Jira)
Shardul Mahadik created SPARK-37569: --- Summary: View Analysis incorrectly marks nested fields as nullable Key: SPARK-37569 URL: https://issues.apache.org/jira/browse/SPARK-37569 Project: Spark

[jira] [Commented] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454800#comment-17454800 ] Apache Spark commented on SPARK-23607: -- User 'thejdeep' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23607: Assignee: Apache Spark > Use HDFS extended attributes to store application summary to

[jira] [Assigned] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23607: Assignee: (was: Apache Spark) > Use HDFS extended attributes to store application

[jira] [Commented] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454799#comment-17454799 ] Apache Spark commented on SPARK-23607: -- User 'thejdeep' has created a pull request for this issue:

[jira] [Commented] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Thejdeep Gudivada (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454797#comment-17454797 ] Thejdeep Gudivada commented on SPARK-23607: --- Posted a preview PR for this, will be adding

[jira] [Reopened] (SPARK-23607) Use HDFS extended attributes to store application summary to improve the Spark History Server performance

2021-12-07 Thread Thejdeep Gudivada (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejdeep Gudivada reopened SPARK-23607: --- > Use HDFS extended attributes to store application summary to improve the > Spark

[jira] [Assigned] (SPARK-37556) Deser void class fail with Java serialization

2021-12-07 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-37556: Assignee: Daniel Dai > Deser void class fail with Java serialization >

[jira] [Resolved] (SPARK-37556) Deser void class fail with Java serialization

2021-12-07 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-37556. -- Fix Version/s: 3.3.0 3.0.4 3.2.1

[jira] [Commented] (SPARK-37515) minRatePerPartition works as "max messages per partition per a batch" (it should be per seconds)

2021-12-07 Thread Sungpeo Kook (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454691#comment-17454691 ] Sungpeo Kook commented on SPARK-37515: -- [~apachespark] Nobody check this issue? >

[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454634#comment-17454634 ] Max Gekk commented on SPARK-37568: -- [~beliefer] [~sarutak] [~angerszhuuu] [~xiaopenglei] Would you like

[jira] [Created] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Max Gekk (Jira)
Max Gekk created SPARK-37568: Summary: Support 2-arguments by the convert_timezone() function Key: SPARK-37568 URL: https://issues.apache.org/jira/browse/SPARK-37568 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-12-07 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-37478. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34819

[jira] [Assigned] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-12-07 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-37478: --- Assignee: dch nguyen > Unify v1 and v2 DROP NAMESPACE tests >

[jira] [Commented] (SPARK-32225) Parquet footer information is read twice

2021-12-07 Thread Stijn De Haes (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454622#comment-17454622 ] Stijn De Haes commented on SPARK-32225: --- Could this be the reason that when you read a Parquet

[jira] [Updated] (SPARK-32225) Parquet footer information is read twice

2021-12-07 Thread Stijn De Haes (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stijn De Haes updated SPARK-32225: -- Attachment: image-2021-12-07-13-37-12-197.png > Parquet footer information is read twice >

[jira] [Assigned] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37566: Assignee: (was: Apache Spark) > The sampling job will lead to the wrong statistics >

[jira] [Assigned] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37566: Assignee: Apache Spark > The sampling job will lead to the wrong statistics >

[jira] [Commented] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454534#comment-17454534 ] Apache Spark commented on SPARK-37566: -- User 'cfmcgrady' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37566: Assignee: Apache Spark > The sampling job will lead to the wrong statistics >

[jira] [Commented] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454524#comment-17454524 ] junbiao chen commented on SPARK-37567: -- Hi,[~davies], Is this a reuse bug? > reuse Exchange failed

[jira] [Updated] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated SPARK-37567: - Description: use case:query2 in TPC-DS.There are three exchange subquery will scan the same

[jira] [Updated] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated SPARK-37567: - Attachment: execution stage(1)-query2.png > reuse Exchange failed > -- > >

[jira] [Updated] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated SPARK-37567: - Attachment: physical plan-query2.png > reuse Exchange failed > -- > >

[jira] [Updated] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated SPARK-37567: - Description: use case:query2 in TPC-DS.There are three exchange subquery will scan the same

[jira] [Updated] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated SPARK-37567: - Attachment: execution stage-query2.png > reuse Exchange failed > -- > >

[jira] [Created] (SPARK-37567) reuse Exchange failed

2021-12-07 Thread junbiao chen (Jira)
junbiao chen created SPARK-37567: Summary: reuse Exchange failed Key: SPARK-37567 URL: https://issues.apache.org/jira/browse/SPARK-37567 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Fu Chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454497#comment-17454497 ] Fu Chen commented on SPARK-37566: - The expected value of `number of output rows` is 10 > The sampling

[jira] [Updated] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Fu Chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fu Chen updated SPARK-37566: Description: code for reproduce {code:java} spark.range(0, 10) .repartitionByRange(10,

[jira] [Updated] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Fu Chen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fu Chen updated SPARK-37566: Attachment: 截屏2021-12-07 下午5.17.12.png > The sampling job will lead to the wrong statistics >

[jira] [Created] (SPARK-37566) The sampling job will lead to the wrong statistics

2021-12-07 Thread Fu Chen (Jira)
Fu Chen created SPARK-37566: --- Summary: The sampling job will lead to the wrong statistics Key: SPARK-37566 URL: https://issues.apache.org/jira/browse/SPARK-37566 Project: Spark Issue Type: Bug