[jira] [Commented] (SPARK-36905) Reading Hive view without explicit column names fails in Spark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427451#comment-17427451 ] Apache Spark commented on SPARK-36905: -- User 'linhongliu-db' has created a pull request for this

[jira] [Assigned] (SPARK-36905) Reading Hive view without explicit column names fails in Spark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36905: Assignee: (was: Apache Spark) > Reading Hive view without explicit column names

[jira] [Assigned] (SPARK-36905) Reading Hive view without explicit column names fails in Spark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36905: Assignee: Apache Spark > Reading Hive view without explicit column names fails in Spark

[jira] [Commented] (SPARK-36905) Reading Hive view without explicit column names fails in Spark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427450#comment-17427450 ] Apache Spark commented on SPARK-36905: -- User 'linhongliu-db' has created a pull request for this

[jira] [Commented] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427449#comment-17427449 ] Apache Spark commented on SPARK-36981: -- User 'sarutak' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36981: Assignee: Kousuke Saruta (was: Apache Spark) > Upgrade joda-time to 2.10.12 >

[jira] [Commented] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427448#comment-17427448 ] Apache Spark commented on SPARK-36981: -- User 'sarutak' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36981: Assignee: Apache Spark (was: Kousuke Saruta) > Upgrade joda-time to 2.10.12 >

[jira] [Updated] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-36981: --- Description: joda-time 2.10.12 seems to support an updated TZDB.

[jira] [Created] (SPARK-36981) Upgrade joda-time to 2.10.12

2021-10-11 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-36981: -- Summary: Upgrade joda-time to 2.10.12 Key: SPARK-36981 URL: https://issues.apache.org/jira/browse/SPARK-36981 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-36980) Insert support query with CTE

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427440#comment-17427440 ] Apache Spark commented on SPARK-36980: -- User 'AngersZh' has created a pull request for this

[jira] [Assigned] (SPARK-36980) Insert support query with CTE

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36980: Assignee: Apache Spark > Insert support query with CTE > - >

[jira] [Assigned] (SPARK-36980) Insert support query with CTE

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36980: Assignee: (was: Apache Spark) > Insert support query with CTE >

[jira] [Commented] (SPARK-36980) Insert support query with CTE

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427439#comment-17427439 ] Apache Spark commented on SPARK-36980: -- User 'AngersZh' has created a pull request for this

[jira] [Created] (SPARK-36980) Insert support query with CTE

2021-10-11 Thread angerszhu (Jira)
angerszhu created SPARK-36980: - Summary: Insert support query with CTE Key: SPARK-36980 URL: https://issues.apache.org/jira/browse/SPARK-36980 Project: Spark Issue Type: Task

[jira] [Assigned] (SPARK-36973) Deduplicate prepare data method for HistogramPlotBase and KdePlotBase

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36973: Assignee: Apache Spark > Deduplicate prepare data method for HistogramPlotBase and

[jira] [Assigned] (SPARK-36973) Deduplicate prepare data method for HistogramPlotBase and KdePlotBase

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36973: Assignee: (was: Apache Spark) > Deduplicate prepare data method for

[jira] [Commented] (SPARK-36973) Deduplicate prepare data method for HistogramPlotBase and KdePlotBase

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427425#comment-17427425 ] Apache Spark commented on SPARK-36973: -- User 'dchvn' has created a pull request for this issue:

[jira] [Commented] (SPARK-36971) Query files directly with SQL is broken (with Glue)

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427419#comment-17427419 ] Hyukjin Kwon commented on SPARK-36971: -- Is this an issue in Apache Spark? Doesn't look specific to

[jira] [Updated] (SPARK-36971) Query files directly with SQL is broken (with Glue)

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36971: - Priority: Major (was: Critical) > Query files directly with SQL is broken (with Glue) >

[jira] [Commented] (SPARK-36900) "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427416#comment-17427416 ] Apache Spark commented on SPARK-36900: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-3563) Shuffle data not always be cleaned

2021-10-11 Thread wangkang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427417#comment-17427417 ] wangkang commented on SPARK-3563: - I have run into the same problem. A user runs a Spark Streaming job, the

[jira] [Commented] (SPARK-36900) "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427415#comment-17427415 ] Apache Spark commented on SPARK-36900: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-36979) Add RewriteLateralSubquery rule into nonExcludableRules

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427412#comment-17427412 ] Apache Spark commented on SPARK-36979: -- User 'ulysses-you' has created a pull request for this

[jira] [Assigned] (SPARK-36979) Add RewriteLateralSubquery rule into nonExcludableRules

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36979: Assignee: (was: Apache Spark) > Add RewriteLateralSubquery rule into

[jira] [Assigned] (SPARK-36979) Add RewriteLateralSubquery rule into nonExcludableRules

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36979: Assignee: Apache Spark > Add RewriteLateralSubquery rule into nonExcludableRules >

[jira] [Created] (SPARK-36979) Add RewriteLateralSubquery rule into nonExcludableRules

2021-10-11 Thread XiDuo You (Jira)
XiDuo You created SPARK-36979: - Summary: Add RewriteLateralSubquery rule into nonExcludableRules Key: SPARK-36979 URL: https://issues.apache.org/jira/browse/SPARK-36979 Project: Spark Issue

[jira] [Resolved] (SPARK-36958) Reading of legacy timestamps from Parquet confusing in Spark 3, related config values don't seem working

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36958. -- Resolution: Not A Problem > Reading of legacy timestamps from Parquet confusing in Spark 3,

[jira] [Assigned] (SPARK-36647) Push down filter by partition column for Aggregate (Min/Max/Count) for Parquet

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36647: Assignee: (was: Apache Spark) > Push down filter by partition column for Aggregate

[jira] [Assigned] (SPARK-36647) Push down filter by partition column for Aggregate (Min/Max/Count) for Parquet

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36647: Assignee: Apache Spark > Push down filter by partition column for Aggregate

[jira] [Commented] (SPARK-36647) Push down filter by partition column for Aggregate (Min/Max/Count) for Parquet

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427400#comment-17427400 ] Apache Spark commented on SPARK-36647: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36977: Assignee: Maciej Szymkiewicz > Update docs to reflect that Python 3.6 is no longer

[jira] [Resolved] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36977. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34242

[jira] [Resolved] (SPARK-36885) Inline type hints for python/pyspark/sql/dataframe.py

2021-10-11 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36885. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34225

[jira] [Created] (SPARK-36978) InferConstraints rule should create IsNotNull constraints on the nested field instead of the root nested type

2021-10-11 Thread Utkarsh Agarwal (Jira)
Utkarsh Agarwal created SPARK-36978: --- Summary: InferConstraints rule should create IsNotNull constraints on the nested field instead of the root nested type Key: SPARK-36978 URL:

[jira] [Commented] (SPARK-36794) Ignore duplicated join keys when building relation for SEMI/ANTI hash join

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427356#comment-17427356 ] Apache Spark commented on SPARK-36794: -- User 'c21' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36546) Make unionByName null-filling behavior work with array of struct columns

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36546: Assignee: Apache Spark > Make unionByName null-filling behavior work with array of

[jira] [Assigned] (SPARK-36546) Make unionByName null-filling behavior work with array of struct columns

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36546: Assignee: (was: Apache Spark) > Make unionByName null-filling behavior work with

[jira] [Commented] (SPARK-36546) Make unionByName null-filling behavior work with array of struct columns

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427353#comment-17427353 ] Apache Spark commented on SPARK-36546: -- User 'Kimahriman' has created a pull request for this

[jira] [Commented] (SPARK-33277) Python/Pandas UDF right after off-heap vectorized reader could cause executor crash.

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427295#comment-17427295 ] Apache Spark commented on SPARK-33277: -- User 'ankurdave' has created a pull request for this issue:

[jira] [Commented] (SPARK-33277) Python/Pandas UDF right after off-heap vectorized reader could cause executor crash.

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427293#comment-17427293 ] Apache Spark commented on SPARK-33277: -- User 'ankurdave' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36794) Ignore duplicated join keys when building relation for SEMI/ANTI hash join

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36794: Assignee: Apache Spark (was: Cheng Su) > Ignore duplicated join keys when building

[jira] [Assigned] (SPARK-36794) Ignore duplicated join keys when building relation for SEMI/ANTI hash join

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36794: Assignee: Cheng Su (was: Apache Spark) > Ignore duplicated join keys when building

[jira] [Assigned] (SPARK-36867) Misleading Error Message with Invalid Column and Group By

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36867: Assignee: (was: Apache Spark) > Misleading Error Message with Invalid Column and

[jira] [Commented] (SPARK-36867) Misleading Error Message with Invalid Column and Group By

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427212#comment-17427212 ] Apache Spark commented on SPARK-36867: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36867) Misleading Error Message with Invalid Column and Group By

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36867: Assignee: Apache Spark > Misleading Error Message with Invalid Column and Group By >

[jira] [Commented] (SPARK-36853) Code failing on checkstyle

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427208#comment-17427208 ] Apache Spark commented on SPARK-36853: -- User 'Shockang' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36853) Code failing on checkstyle

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36853: Assignee: (was: Apache Spark) > Code failing on checkstyle >

[jira] [Commented] (SPARK-36853) Code failing on checkstyle

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427207#comment-17427207 ] Apache Spark commented on SPARK-36853: -- User 'Shockang' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36853) Code failing on checkstyle

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36853: Assignee: Apache Spark > Code failing on checkstyle > -- > >

[jira] [Commented] (SPARK-36949) Fix CREATE TABLE AS SELECT of ANSI intervals

2021-10-11 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427195#comment-17427195 ] Max Gekk commented on SPARK-36949: -- I wasn't able to create a table with intervals: {code:sql} 0:

[jira] [Commented] (SPARK-36962) Make HiveSerDe.serdeMap extensible

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427172#comment-17427172 ] Wenchen Fan commented on SPARK-36962: - Seems fine to make it extensible, but we need to carefully

[jira] [Commented] (SPARK-36949) Fix CREATE TABLE AS SELECT of ANSI intervals

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427171#comment-17427171 ] Wenchen Fan commented on SPARK-36949: - Interesting. Does this work in Hive natively? CTAS with

[jira] [Assigned] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36977: Assignee: Apache Spark > Update docs to reflect that Python 3.6 is no longer supported >

[jira] [Assigned] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36977: Assignee: (was: Apache Spark) > Update docs to reflect that Python 3.6 is no longer

[jira] [Commented] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427150#comment-17427150 ] Apache Spark commented on SPARK-36977: -- User 'zero323' has created a pull request for this issue:

[jira] [Created] (SPARK-36977) Update docs to reflect that Python 3.6 is no longer supported

2021-10-11 Thread Maciej Szymkiewicz (Jira)
Maciej Szymkiewicz created SPARK-36977: -- Summary: Update docs to reflect that Python 3.6 is no longer supported Key: SPARK-36977 URL: https://issues.apache.org/jira/browse/SPARK-36977 Project:

[jira] [Resolved] (SPARK-36540) AM should not just finish with Success when dissconnected

2021-10-11 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-36540. --- Fix Version/s: 3.3.0 Assignee: angerszhu Resolution: Fixed > AM should not

[jira] [Created] (SPARK-36976) Add max_by/min_by API to SparkR

2021-10-11 Thread Leona Yoda (Jira)
Leona Yoda created SPARK-36976: -- Summary: Add max_by/min_by API to SparkR Key: SPARK-36976 URL: https://issues.apache.org/jira/browse/SPARK-36976 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Leona Yoda (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leona Yoda updated SPARK-36972: --- Priority: Minor (was: Major) > Add max_by/min_by API to PySpark >

[jira] [Commented] (SPARK-36877) Calling ds.rdd with AQE enabled leads to jobs being run, eventually causing reruns

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427089#comment-17427089 ] Wenchen Fan commented on SPARK-36877: - > shouldn't it reuse the result from previous stages? One

[jira] [Commented] (SPARK-36877) Calling ds.rdd with AQE enabled leads to jobs being run, eventually causing reruns

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427087#comment-17427087 ] Wenchen Fan commented on SPARK-36877: - > Should calling df.rdd trigger actual job execution when AQE

[jira] [Commented] (SPARK-36861) Partition columns are overly eagerly parsed as dates

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427085#comment-17427085 ] Wenchen Fan commented on SPARK-36861: - I think partition value parsing needs to be stricter. cc

[jira] [Commented] (SPARK-36678) Migrate SHOW TABLES to use V2 command by default

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427079#comment-17427079 ] Apache Spark commented on SPARK-36678: -- User 'imback82' has created a pull request for this issue:

[jira] [Resolved] (SPARK-36678) Migrate SHOW TABLES to use V2 command by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36678. - Fix Version/s: 3.3.0 Assignee: Terry Kim Resolution: Fixed > Migrate SHOW

[jira] [Commented] (SPARK-36678) Migrate SHOW TABLES to use V2 command by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427077#comment-17427077 ] Wenchen Fan commented on SPARK-36678: - resolved by https://github.com/apache/spark/pull/34137 >

[jira] [Reopened] (SPARK-36588) Use v2 commands by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reopened SPARK-36588: - > Use v2 commands by default > -- > > Key: SPARK-36588 >

[jira] [Issue Comment Deleted] (SPARK-36588) Use v2 commands by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-36588: Comment: was deleted (was: Issue resolved by pull request 34137

[jira] [Resolved] (SPARK-36588) Use v2 commands by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36588. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34137

[jira] [Assigned] (SPARK-36588) Use v2 commands by default

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-36588: --- Assignee: Terry Kim > Use v2 commands by default > -- > >

[jira] [Commented] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427071#comment-17427071 ] Apache Spark commented on SPARK-36975: -- User 'AngersZh' has created a pull request for this

[jira] [Assigned] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36975: Assignee: (was: Apache Spark) > Refactor HiveClientImpl collect hive client call

[jira] [Commented] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427070#comment-17427070 ] Apache Spark commented on SPARK-36975: -- User 'AngersZh' has created a pull request for this

[jira] [Assigned] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36975: Assignee: Apache Spark > Refactor HiveClientImpl collect hive client call logic >

[jira] [Created] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread angerszhu (Jira)
angerszhu created SPARK-36975: - Summary: Refactor HiveClientImpl collect hive client call logic Key: SPARK-36975 URL: https://issues.apache.org/jira/browse/SPARK-36975 Project: Spark Issue Type:

[jira] [Commented] (SPARK-36975) Refactor HiveClientImpl collect hive client call logic

2021-10-11 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427053#comment-17427053 ] angerszhu commented on SPARK-36975: --- Will raise a PR soon > Refactor HiveClientImpl collect hive client

[jira] [Resolved] (SPARK-36974) Try to raise memory and parallelism again for GA

2021-10-11 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu resolved SPARK-36974. --- Resolution: Duplicate > Try to raise memory and parallelism again for GA >

[jira] [Resolved] (SPARK-35531) Can not insert into hive bucket table if create table with upper case schema

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-35531. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34218

[jira] [Created] (SPARK-36974) Try to raise memory and parallelism again for GA

2021-10-11 Thread angerszhu (Jira)
angerszhu created SPARK-36974: - Summary: Try to raise memory and parallelism again for GA Key: SPARK-36974 URL: https://issues.apache.org/jira/browse/SPARK-36974 Project: Spark Issue Type: Task

[jira] [Created] (SPARK-36973) Deduplicate prepare data method for HistogramPlotBase and KdePlotBase

2021-10-11 Thread dch nguyen (Jira)
dch nguyen created SPARK-36973: -- Summary: Deduplicate prepare data method for HistogramPlotBase and KdePlotBase Key: SPARK-36973 URL: https://issues.apache.org/jira/browse/SPARK-36973 Project: Spark

[jira] [Assigned] (SPARK-35531) Can not insert into hive bucket table if create table with upper case schema

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-35531: --- Assignee: angerszhu > Can not insert into hive bucket table if create table with upper

[jira] [Reopened] (SPARK-36794) Ignore duplicated join keys when building relation for SEMI/ANTI hash join

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reopened SPARK-36794: - > Ignore duplicated join keys when building relation for SEMI/ANTI hash join >

[jira] [Assigned] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36972: Assignee: (was: Apache Spark) > Add max_by/min_by API to PySpark >

[jira] [Assigned] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36972: Assignee: Apache Spark > Add max_by/min_by API to PySpark >

[jira] [Commented] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427029#comment-17427029 ] Apache Spark commented on SPARK-36972: -- User 'yoda-mon' has created a pull request for this issue:

[jira] [Commented] (SPARK-36932) Misuse "merge schema" when mapGroups

2021-10-11 Thread chendihao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427005#comment-17427005 ] chendihao commented on SPARK-36932: --- It is not related to join operation. I have a simpler case to

[jira] [Updated] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Leona Yoda (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leona Yoda updated SPARK-36972: --- Description: Related issues - https://issues.apache.org/jira/browse/SPARK-27653 *

[jira] [Updated] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Leona Yoda (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leona Yoda updated SPARK-36972: --- Description: Related issues - https://issues.apache.org/jira/browse/SPARK-27653 - 

[jira] [Created] (SPARK-36972) Add max_by/min_by API to PySpark

2021-10-11 Thread Leona Yoda (Jira)
Leona Yoda created SPARK-36972: -- Summary: Add max_by/min_by API to PySpark Key: SPARK-36972 URL: https://issues.apache.org/jira/browse/SPARK-36972 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36059: Assignee: (was: Apache Spark) > Add the ability to specify a scheduler & queue >

[jira] [Assigned] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36059: Assignee: Apache Spark > Add the ability to specify a scheduler & queue >

[jira] [Assigned] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36059: Assignee: Apache Spark > Add the ability to specify a scheduler & queue >

[jira] [Commented] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426985#comment-17426985 ] Apache Spark commented on SPARK-36059: -- User 'Yikun' has created a pull request for this issue:

[jira] [Commented] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-10-11 Thread Yikun Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426984#comment-17426984 ] Yikun Jiang commented on SPARK-36059: - We have the ability to specify a scheduler on executor

[jira] [Updated] (SPARK-36969) Inline type hints for SparkContext

2021-10-11 Thread dch nguyen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36969: --- Summary: Inline type hints for SparkContext (was: Inline type hints for python/pyspark/context.py)

[jira] [Assigned] (SPARK-36969) Inline type hints for python/pyspark/context.py

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36969: Assignee: (was: Apache Spark) > Inline type hints for python/pyspark/context.py >

[jira] [Commented] (SPARK-36969) Inline type hints for python/pyspark/context.py

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426980#comment-17426980 ] Apache Spark commented on SPARK-36969: -- User 'dchvn' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36969) Inline type hints for python/pyspark/context.py

2021-10-11 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36969: Assignee: Apache Spark > Inline type hints for python/pyspark/context.py >

[jira] [Resolved] (SPARK-36876) Support Dynamic Partition pruning for HiveTableScanExec

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36876. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34139

[jira] [Assigned] (SPARK-36876) Support Dynamic Partition pruning for HiveTableScanExec

2021-10-11 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-36876: --- Assignee: angerszhu > Support Dynamic Partition pruning for HiveTableScanExec >
