[jira] [Commented] (SPARK-40502) Support dataframe API use jdbc data source in PySpark

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607447#comment-17607447 ] Hyukjin Kwon commented on SPARK-40502: -- {quote} For some reasons, i can't using DataFrame API, only

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-40499: - Priority: Major (was: Blocker) > Spark 3.2.1 percentlie_approx query much slower than Spark

[jira] [Created] (SPARK-40509) Construct an example of applyInPandasWithState in examples directory

2022-09-20 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40509: Summary: Construct an example of applyInPandasWithState in examples directory Key: SPARK-40509 URL: https://issues.apache.org/jira/browse/SPARK-40509 Project: Spark

[jira] [Assigned] (SPARK-40491) Remove too old TODO for JdbcRDD

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40491: Assignee: jiaan.geng > Remove too old TODO for JdbcRDD > ---

[jira] [Commented] (SPARK-40332) Implement `GroupBy.quantile`.

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607489#comment-17607489 ] Apache Spark commented on SPARK-40332: -- User 'zhengruifeng' has created a pull request for this

[jira] [Updated] (SPARK-40501) Enhance 'SpecialLimits' to support project(..., limit(...))

2022-09-20 Thread BingKun Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BingKun Pan updated SPARK-40501: Summary: Enhance 'SpecialLimits' to support project(..., limit(...)) (was: Add

[jira] [Commented] (SPARK-40510) Implement `ddof` in `Series.cov`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607483#comment-17607483 ] Apache Spark commented on SPARK-40510: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40510) Implement `ddof` in `Series.cov`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40510: Assignee: (was: Apache Spark) > Implement `ddof` in `Series.cov` >

[jira] [Assigned] (SPARK-40510) Implement `ddof` in `Series.cov`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40510: Assignee: Apache Spark > Implement `ddof` in `Series.cov` >

[jira] [Commented] (SPARK-40510) Implement `ddof` in `Series.cov`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607481#comment-17607481 ] Apache Spark commented on SPARK-40510: -- User 'zhengruifeng' has created a pull request for this

[jira] [Created] (SPARK-40510) Implement `ddof` in `Series.cov`

2022-09-20 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40510: - Summary: Implement `ddof` in `Series.cov` Key: SPARK-40510 URL: https://issues.apache.org/jira/browse/SPARK-40510 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-40500: Assignee: Ruifeng Zheng > Use `pd.items` instead of `pd.iteritems` >

[jira] [Resolved] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40500. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37947

[jira] [Resolved] (SPARK-40491) Remove too old TODO for JdbcRDD

2022-09-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-40491. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37937

[jira] [Commented] (SPARK-40332) Implement `GroupBy.quantile`.

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607488#comment-17607488 ] Apache Spark commented on SPARK-40332: -- User 'zhengruifeng' has created a pull request for this

[jira] [Updated] (SPARK-40506) Spark Streaming metrics name don't need application name

2022-09-20 Thread Jira
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 王俊博 updated SPARK-40506: Description: Spark StreamingSource Metrics sourceName is inappropriate. The label now looks like

[jira] [Created] (SPARK-40511) Upgrade slf4j to 2.x

2022-09-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-40511: Summary: Upgrade slf4j to 2.x Key: SPARK-40511 URL: https://issues.apache.org/jira/browse/SPARK-40511 Project: Spark Issue Type: Improvement Components:

[jira] [Resolved] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40496. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37942

[jira] [Assigned] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40496: --- Assignee: Ivan Sadikov > Configs to control "enableDateTimeParsingFallback" are

[jira] [Assigned] (SPARK-40511) Upgrade slf4j to 2.x

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40511: Assignee: Apache Spark > Upgrade slf4j to 2.x > > >

[jira] [Commented] (SPARK-40511) Upgrade slf4j to 2.x

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607522#comment-17607522 ] Apache Spark commented on SPARK-40511: -- User 'LuciferYang' has created a pull request for this

[jira] [Assigned] (SPARK-40511) Upgrade slf4j to 2.x

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40511: Assignee: (was: Apache Spark) > Upgrade slf4j to 2.x > > >

[jira] [Commented] (SPARK-40502) Support dataframe API use jdbc data source in PySpark

2022-09-20 Thread CaoYu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607523#comment-17607523 ] CaoYu commented on SPARK-40502: --- I am a teacher Recently designed Python language basic course, big data

[jira] [Created] (SPARK-40497) Upgrade Scala to 2.13.9

2022-09-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-40497: Summary: Upgrade Scala to 2.13.9 Key: SPARK-40497 URL: https://issues.apache.org/jira/browse/SPARK-40497 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Attachment: spark2.4-shuffle-data.png > Spark 3.2.1 percentlie_approx query much slower than Spark

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Description: spark.sql(       s"""          |SELECT          | Info ,          |

[jira] [Commented] (SPARK-40419) Integrate Grouped Aggregate Pandas UDFs into *.sql test cases

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607022#comment-17607022 ] Apache Spark commented on SPARK-40419: -- User 'itholic' has created a pull request for this issue:

[jira] [Commented] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606963#comment-17606963 ] Apache Spark commented on SPARK-40496: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40497) Upgrade Scala to 2.13.9

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40497: Assignee: (was: Apache Spark) > Upgrade Scala to 2.13.9 > --- >

[jira] [Assigned] (SPARK-40497) Upgrade Scala to 2.13.9

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40497: Assignee: Apache Spark > Upgrade Scala to 2.13.9 > --- > >

[jira] [Commented] (SPARK-40497) Upgrade Scala to 2.13.9

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606966#comment-17606966 ] Apache Spark commented on SPARK-40497: -- User 'LuciferYang' has created a pull request for this

[jira] [Commented] (SPARK-40497) Upgrade Scala to 2.13.9

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606968#comment-17606968 ] Apache Spark commented on SPARK-40497: -- User 'LuciferYang' has created a pull request for this

[jira] [Created] (SPARK-40498) Implement `kendall` and `min_periods` in `Series.corr`

2022-09-20 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40498: - Summary: Implement `kendall` and `min_periods` in `Series.corr` Key: SPARK-40498 URL: https://issues.apache.org/jira/browse/SPARK-40498 Project: Spark

[jira] [Commented] (SPARK-40419) Integrate Grouped Aggregate Pandas UDFs into *.sql test cases

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607021#comment-17607021 ] Apache Spark commented on SPARK-40419: -- User 'itholic' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40495: Assignee: Apache Spark > Add additional tests to StreamingSessionWindowSuite >

[jira] [Assigned] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40495: Assignee: (was: Apache Spark) > Add additional tests to StreamingSessionWindowSuite

[jira] [Commented] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606940#comment-17606940 ] Apache Spark commented on SPARK-40495: -- User 'WweiL' has created a pull request for this issue:

[jira] [Resolved] (SPARK-40486) Implement `spearman` and `kendall` in `DataFrame.corrwith`

2022-09-20 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-40486. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37929

[jira] [Assigned] (SPARK-40486) Implement `spearman` and `kendall` in `DataFrame.corrwith`

2022-09-20 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-40486: - Assignee: Ruifeng Zheng > Implement `spearman` and `kendall` in `DataFrame.corrwith` >

[jira] [Created] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40496: Summary: Configs to control "enableDateTimeParsingFallback" are incorrectly swapped Key: SPARK-40496 URL: https://issues.apache.org/jira/browse/SPARK-40496 Project:

[jira] [Assigned] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40496: Assignee: (was: Apache Spark) > Configs to control "enableDateTimeParsingFallback"

[jira] [Assigned] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40496: Assignee: Apache Spark > Configs to control "enableDateTimeParsingFallback" are

[jira] [Commented] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606962#comment-17606962 ] Apache Spark commented on SPARK-40496: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Commented] (SPARK-40489) Spark 3.3.0 breaks with SFL4J 2.

2022-09-20 Thread Piotr Karwasz (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606973#comment-17606973 ] Piotr Karwasz commented on SPARK-40489: --- [~kabhwan], The code snippet I posted above (using

[jira] [Updated] (SPARK-40491) Remove too old TODO for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Summary: Remove too old TODO for JdbcRDD (was: Remove too old todo comments for JdbcRDD) > Remove

[jira] [Commented] (SPARK-40498) Implement `kendall` and `min_periods` in `Series.corr`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607006#comment-17607006 ] Apache Spark commented on SPARK-40498: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40498) Implement `kendall` and `min_periods` in `Series.corr`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40498: Assignee: Apache Spark > Implement `kendall` and `min_periods` in `Series.corr` >

[jira] [Commented] (SPARK-40498) Implement `kendall` and `min_periods` in `Series.corr`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607004#comment-17607004 ] Apache Spark commented on SPARK-40498: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40498) Implement `kendall` and `min_periods` in `Series.corr`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40498: Assignee: (was: Apache Spark) > Implement `kendall` and `min_periods` in

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Attachment: spark3.2-shuffle-data.png > Spark 3.2.1 percentlie_approx query much slower than Spark

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Environment: hadoop: 3.0.0  spark:  2.4.0 / 3.2.1 shuffle:spark 2.4.0 was: hadoop 3.0.0 

[jira] [Commented] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-20 Thread Wei Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606937#comment-17606937 ] Wei Liu commented on SPARK-40495: - Hi there, this is my first time to contribute to OSS spark. I'm not

[jira] [Created] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-20 Thread Wei Liu (Jira)
Wei Liu created SPARK-40495: --- Summary: Add additional tests to StreamingSessionWindowSuite Key: SPARK-40495 URL: https://issues.apache.org/jira/browse/SPARK-40495 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-40491) Remove too old todo comments for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Summary: Remove too old todo comments for JdbcRDD (was: Expose a jdbcRDD function in SparkContext)

[jira] [Updated] (SPARK-40491) Remove too old todo comments for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Description: According to the legacy document of JdbcRDD, we need to expose a jdbcRDD function in

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Environment: hadoop 3.0.0  spark2.4.0 / spark3.2.1 shuffle: spark2.4.0

[jira] [Created] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
xuanzhiang created SPARK-40499: -- Summary: Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0 Key: SPARK-40499 URL: https://issues.apache.org/jira/browse/SPARK-40499 Project: Spark

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Priority: Blocker (was: Minor) > Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Priority: Major (was: Blocker) > Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

[jira] [Updated] (SPARK-40501) Add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread BingKun Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BingKun Pan updated SPARK-40501: Summary: Add PushProjectionThroughLimit for Optimizer (was: add PushProjectionThroughLimit for

[jira] [Commented] (SPARK-40505) Remove min heap setting in Kubernetes Dockerfile entrypoint

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607149#comment-17607149 ] Apache Spark commented on SPARK-40505: -- User 'bryanck' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40505) Remove min heap setting in Kubernetes Dockerfile entrypoint

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40505: Assignee: Apache Spark > Remove min heap setting in Kubernetes Dockerfile entrypoint >

[jira] [Assigned] (SPARK-40505) Remove min heap setting in Kubernetes Dockerfile entrypoint

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40505: Assignee: (was: Apache Spark) > Remove min heap setting in Kubernetes Dockerfile

[jira] [Commented] (SPARK-40505) Remove min heap setting in Kubernetes Dockerfile entrypoint

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607151#comment-17607151 ] Apache Spark commented on SPARK-40505: -- User 'bryanck' has created a pull request for this issue:

[jira] [Created] (SPARK-40506) Spark Streaming Metrics SourceName is unsuitable

2022-09-20 Thread Jira
王俊博 created SPARK-40506: --- Summary: Spark Streaming Metrics SourceName is unsuitable Key: SPARK-40506 URL: https://issues.apache.org/jira/browse/SPARK-40506 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-40506) Spark Streaming metrics name don't need application name

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40506: Assignee: (was: Apache Spark) > Spark Streaming metrics name don't need application

[jira] [Commented] (SPARK-40506) Spark Streaming metrics name don't need application name

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607167#comment-17607167 ] Apache Spark commented on SPARK-40506: -- User 'Kwafoor' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40506) Spark Streaming metrics name don't need application name

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40506: Assignee: Apache Spark > Spark Streaming metrics name don't need application name >

[jira] [Updated] (SPARK-40501) Add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread BingKun Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BingKun Pan updated SPARK-40501: Description: h4. It took a long time to fetch out, still running after 20 minutes... when run as

[jira] [Created] (SPARK-40505) Remove min heap setting in Kubernetes Dockerfile entrypoint

2022-09-20 Thread Bryan Keller (Jira)
Bryan Keller created SPARK-40505: Summary: Remove min heap setting in Kubernetes Dockerfile entrypoint Key: SPARK-40505 URL: https://issues.apache.org/jira/browse/SPARK-40505 Project: Spark

[jira] [Updated] (SPARK-40506) Spark Streaming Metrics SourceName is unsuitable

2022-09-20 Thread Jira
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 王俊博 updated SPARK-40506: Description: Spark StreamingSource Metrics sourceName is inappropriate. The label now looks like

[jira] [Updated] (SPARK-40506) Spark Streaming metrics name don't need application name

2022-09-20 Thread Jira
[ https://issues.apache.org/jira/browse/SPARK-40506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 王俊博 updated SPARK-40506: Summary: Spark Streaming metrics name don't need application name (was: Spark Streaming Metrics SourceName is

[jira] [Commented] (SPARK-40327) Increase pandas API coverage for pandas API on Spark

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607079#comment-17607079 ] Apache Spark commented on SPARK-40327: -- User 'zhengruifeng' has created a pull request for this

[jira] [Created] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread zhengchenyu (Jira)
zhengchenyu created SPARK-40504: --- Summary: Make yarn appmaster load config from client Key: SPARK-40504 URL: https://issues.apache.org/jira/browse/SPARK-40504 Project: Spark Issue Type:

[jira] [Created] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40500: - Summary: Use `pd.items` instead of `pd.iteritems` Key: SPARK-40500 URL: https://issues.apache.org/jira/browse/SPARK-40500 Project: Spark Issue Type:

[jira] [Created] (SPARK-40501) add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread BingKun Pan (Jira)
BingKun Pan created SPARK-40501: --- Summary: add PushProjectionThroughLimit for Optimizer Key: SPARK-40501 URL: https://issues.apache.org/jira/browse/SPARK-40501 Project: Spark Issue Type:

[jira] [Updated] (SPARK-40499) Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

2022-09-20 Thread xuanzhiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhiang updated SPARK-40499: --- Priority: Blocker (was: Major) > Spark 3.2.1 percentlie_approx query much slower than Spark 2.4.0

[jira] [Commented] (SPARK-40457) upgrade jackson data mapper to latest

2022-09-20 Thread Bilna (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607082#comment-17607082 ] Bilna commented on SPARK-40457: --- [~hyukjin.kwon] it is org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13

[jira] [Updated] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated SPARK-40504: Description: In yarn federation mode, config in client side and nm side may be different.

[jira] [Updated] (SPARK-40501) add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread BingKun Pan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BingKun Pan updated SPARK-40501: Description: h4. It took a long time to fetch out when run as follow code in spark-shell:

[jira] [Assigned] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40500: Assignee: (was: Apache Spark) > Use `pd.items` instead of `pd.iteritems` >

[jira] [Assigned] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40500: Assignee: Apache Spark > Use `pd.items` instead of `pd.iteritems` >

[jira] [Assigned] (SPARK-40501) add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40501: Assignee: (was: Apache Spark) > add PushProjectionThroughLimit for Optimizer >

[jira] [Commented] (SPARK-40501) add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607033#comment-17607033 ] Apache Spark commented on SPARK-40501: -- User 'panbingkun' has created a pull request for this

[jira] [Commented] (SPARK-40500) Use `pd.items` instead of `pd.iteritems`

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607034#comment-17607034 ] Apache Spark commented on SPARK-40500: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40501) add PushProjectionThroughLimit for Optimizer

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40501: Assignee: Apache Spark > add PushProjectionThroughLimit for Optimizer >

[jira] [Created] (SPARK-40502) Support dataframe API use jdbc data source in PySpark

2022-09-20 Thread CaoYu (Jira)
CaoYu created SPARK-40502: - Summary: Support dataframe API use jdbc data source in PySpark Key: SPARK-40502 URL: https://issues.apache.org/jira/browse/SPARK-40502 Project: Spark Issue Type: New

[jira] [Created] (SPARK-40503) Add resampling to API references

2022-09-20 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-40503: - Summary: Add resampling to API references Key: SPARK-40503 URL: https://issues.apache.org/jira/browse/SPARK-40503 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40504: Assignee: Apache Spark > Make yarn appmaster load config from client >

[jira] [Commented] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607092#comment-17607092 ] Apache Spark commented on SPARK-40504: -- User 'zhengchenyu' has created a pull request for this

[jira] [Commented] (SPARK-40491) Remove too old TODO for JdbcRDD

2022-09-20 Thread CaoYu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607045#comment-17607045 ] CaoYu commented on SPARK-40491: --- Maybe we can just not remove these. I have already created

[jira] [Commented] (SPARK-40327) Increase pandas API coverage for pandas API on Spark

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607077#comment-17607077 ] Apache Spark commented on SPARK-40327: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-40327) Increase pandas API coverage for pandas API on Spark

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40327: Assignee: (was: Apache Spark) > Increase pandas API coverage for pandas API on Spark

[jira] [Assigned] (SPARK-40327) Increase pandas API coverage for pandas API on Spark

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40327: Assignee: Apache Spark > Increase pandas API coverage for pandas API on Spark >

[jira] [Assigned] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40504: Assignee: (was: Apache Spark) > Make yarn appmaster load config from client >

[jira] [Comment Edited] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame

2022-09-20 Thread xsys (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607314#comment-17607314 ] xsys edited comment on SPARK-40439 at 9/20/22 5:17 PM: --- [~hyukjin.kwon]: Thank you

[jira] [Comment Edited] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame

2022-09-20 Thread xsys (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607314#comment-17607314 ] xsys edited comment on SPARK-40439 at 9/20/22 5:18 PM: --- [~hyukjin.kwon]: Thank you

[jira] [Commented] (SPARK-31404) file source backward compatibility after calendar switch

2022-09-20 Thread Sachit (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607326#comment-17607326 ] Sachit commented on SPARK-31404: Hi [~cloud_fan] , Could you please confirm if we need to use below

[jira] [Comment Edited] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame

2022-09-20 Thread xsys (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607314#comment-17607314 ] xsys edited comment on SPARK-40439 at 9/20/22 5:20 PM: --- [~hyukjin.kwon]: Thank you

[jira] [Comment Edited] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame

2022-09-20 Thread xsys (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607314#comment-17607314 ] xsys edited comment on SPARK-40439 at 9/20/22 5:20 PM: --- [~hyukjin.kwon]: Thank you
