[jira] [Created] (SPARK-47518) Skip transfer the last spilled shuffle data

2024-03-22 Thread Wan Kun (Jira)
Wan Kun created SPARK-47518: --- Summary: Skip transfer the last spilled shuffle data Key: SPARK-47518 URL: https://issues.apache.org/jira/browse/SPARK-47518 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-46942) Ignore --num-executors config if DYN_ALLOCATION_ENABLED is true and allow remove idle executors

2024-02-01 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-46942: Component/s: Spark Core (was: YARN) > Ignore --num-executors config if DYN_ALLOCATION

[jira] [Created] (SPARK-46942) Ignore --num-executors config if DYN_ALLOCATION_ENABLED is true and allow remove idle executors

2024-02-01 Thread Wan Kun (Jira)
Wan Kun created SPARK-46942: --- Summary: Ignore --num-executors config if DYN_ALLOCATION_ENABLED is true and allow remove idle executors Key: SPARK-46942 URL: https://issues.apache.org/jira/browse/SPARK-46942

[jira] [Commented] (SPARK-33144) Connot insert overwite multiple partition, get exception "get partition: Value for key name is null or empty"

2023-12-18 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798232#comment-17798232 ] Wan Kun commented on SPARK-33144: - Speculative task also can cause this issue: Spark

[jira] [Updated] (SPARK-46403) Decode parquet binary with getBytesUnsafe method

2023-12-14 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-46403: Description: Now spark will get a parquet binary object with getBytes() method. The *Binary.getBytes()* m

[jira] [Updated] (SPARK-46403) Decode parquet binary with getBytesUnsafe method

2023-12-14 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-46403: Description: Now spark will get a parquet binary object with getBytes() method. The *Binary.getBytes()* m

[jira] [Updated] (SPARK-46403) Decode parquet binary with getBytesUnsafe method

2023-12-14 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-46403: Attachment: image-2023-12-14-16-30-39-104.png > Decode parquet binary with getBytesUnsafe method > ---

[jira] [Updated] (SPARK-46403) Decode parquet binary with getBytesUnsafe method

2023-12-14 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-46403: Summary: Decode parquet binary with getBytesUnsafe method (was: Decode parquet binary dictionary with get

[jira] [Created] (SPARK-46403) Decode parquet binary dictionary with getBytesUnsafe method

2023-12-14 Thread Wan Kun (Jira)
Wan Kun created SPARK-46403: --- Summary: Decode parquet binary dictionary with getBytesUnsafe method Key: SPARK-46403 URL: https://issues.apache.org/jira/browse/SPARK-46403 Project: Spark Issue Type

[jira] [Created] (SPARK-46364) Push calculation from Aggregate through Expand

2023-12-11 Thread Wan Kun (Jira)
Wan Kun created SPARK-46364: --- Summary: Push calculation from Aggregate through Expand Key: SPARK-46364 URL: https://issues.apache.org/jira/browse/SPARK-46364 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-46069) Support unwrap timestamp type to date type

2023-11-23 Thread Wan Kun (Jira)
Wan Kun created SPARK-46069: --- Summary: Support unwrap timestamp type to date type Key: SPARK-46069 URL: https://issues.apache.org/jira/browse/SPARK-46069 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-45607) Collapse repartition operators with a project

2023-10-19 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45607: Summary: Collapse repartition operators with a project (was: Collapse repartition operators with project)

[jira] [Updated] (SPARK-45607) Collapse repartition operators with project

2023-10-19 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45607: Summary: Collapse repartition operators with project (was: Collapse repartitions with project) > Collaps

[jira] [Updated] (SPARK-45607) Collapse repartitions with project

2023-10-19 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45607: Summary: Collapse repartitions with project (was: Collapse repartition with project) > Collapse repartit

[jira] [Created] (SPARK-45607) Collapse repartition with project

2023-10-19 Thread Wan Kun (Jira)
Wan Kun created SPARK-45607: --- Summary: Collapse repartition with project Key: SPARK-45607 URL: https://issues.apache.org/jira/browse/SPARK-45607 Project: Spark Issue Type: Improvement Com

[jira] [Updated] (SPARK-45594) Auto repartition before writing data into partitioned or bucket table

2023-10-18 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45594: Description: Now, when writing data into partitioned table, there will be at least *dynamicPartitions * S

[jira] [Created] (SPARK-45594) Auto repartition before writing data into partitioned or bucket table

2023-10-18 Thread Wan Kun (Jira)
Wan Kun created SPARK-45594: --- Summary: Auto repartition before writing data into partitioned or bucket table Key: SPARK-45594 URL: https://issues.apache.org/jira/browse/SPARK-45594 Project: Spark

[jira] [Updated] (SPARK-45230) Adjust sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45230: Summary: Adjust sorter for Aggregate after SMJ (was: Plan sorter for Aggregate after SMJ) > Adjust sorte

[jira] [Updated] (SPARK-45230) Plan sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45230: Description: If there is an aggregate operator after the SMJ and the grouping expressions of aggregate op

[jira] [Created] (SPARK-45230) Plan sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
Wan Kun created SPARK-45230: --- Summary: Plan sorter for Aggregate after SMJ Key: SPARK-45230 URL: https://issues.apache.org/jira/browse/SPARK-45230 Project: Spark Issue Type: Improvement C

[jira] [Updated] (SPARK-45230) Plan sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45230: Description: If the aggregate operator comes after SMJ and the grouping expressions of aggregate operator

[jira] [Updated] (SPARK-45230) Plan sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45230: Description: If there is an aggregate operator after the SMJ and the grouping expressions of aggregate op

[jira] [Updated] (SPARK-45230) Plan sorter for Aggregate after SMJ

2023-09-20 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-45230: Description: If the aggregate operator comes after SMJ and the grouping expressions of aggregate operator

[jira] [Created] (SPARK-44870) Convert HashAggregate to SortAggregate if all grouping expressions are in child output orderings

2023-08-18 Thread Wan Kun (Jira)
Wan Kun created SPARK-44870: --- Summary: Convert HashAggregate to SortAggregate if all grouping expressions are in child output orderings Key: SPARK-44870 URL: https://issues.apache.org/jira/browse/SPARK-44870

[jira] [Updated] (SPARK-44804) SortMergeJoin should respect the streamed side ordering

2023-08-14 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44804: Description: In each partition, SortMergeJoin will compute one by one from the streamed side, so we could

[jira] [Created] (SPARK-44804) SortMergeJoin should respect the streamed side ordering

2023-08-14 Thread Wan Kun (Jira)
Wan Kun created SPARK-44804: --- Summary: SortMergeJoin should respect the streamed side ordering Key: SPARK-44804 URL: https://issues.apache.org/jira/browse/SPARK-44804 Project: Spark Issue Type: Imp

[jira] [Created] (SPARK-44773) Code-gen CodegenFallback expression in WholeStageCodegen if possible

2023-08-11 Thread Wan Kun (Jira)
Wan Kun created SPARK-44773: --- Summary: Code-gen CodegenFallback expression in WholeStageCodegen if possible Key: SPARK-44773 URL: https://issues.apache.org/jira/browse/SPARK-44773 Project: Spark I

[jira] [Updated] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-07-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44582: Description: After inner SMJ early cleanup offheap memory, when the SMJ call the left Window next metho

[jira] [Updated] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-07-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44582: Attachment: screenshot-2.png > JVM crash caused by SMJ and WindowExec > --

[jira] [Updated] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-07-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44582: Attachment: screenshot-1.png > JVM crash caused by SMJ and WindowExec > --

[jira] [Created] (SPARK-44582) JVM crash caused by SMJ and WindowExec

2023-07-28 Thread Wan Kun (Jira)
Wan Kun created SPARK-44582: --- Summary: JVM crash caused by SMJ and WindowExec Key: SPARK-44582 URL: https://issues.apache.org/jira/browse/SPARK-44582 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-44243) Add a parameter to determine the locality of local shuffle reader

2023-06-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44243: Summary: Add a parameter to determine the locality of local shuffle reader (was: Local shuffle reader sho

[jira] [Created] (SPARK-44243) Local shuffle reader should not respect SHUFFLE_REDUCE_LOCALITY_ENABLE

2023-06-29 Thread Wan Kun (Jira)
Wan Kun created SPARK-44243: --- Summary: Local shuffle reader should not respect SHUFFLE_REDUCE_LOCALITY_ENABLE Key: SPARK-44243 URL: https://issues.apache.org/jira/browse/SPARK-44243 Project: Spark

[jira] [Updated] (SPARK-44239) Free memory allocated by large vectors when vectors are reset

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Summary: Free memory allocated by large vectors when vectors are reset (was: Free memory allocated by hug

[jira] [Updated] (SPARK-44239) Free memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Description: When spark reads a data file into a WritableColumnVector, the memory allocated by the Writab

[jira] [Updated] (SPARK-44239) Free memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Description: When spark reads a data file into a WritableColumnVector, the memory allocated by the Writab

[jira] [Updated] (SPARK-44239) Free memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Summary: Free memory allocated by huge column vector (was: Reclaim memory allocated by huge column vector

[jira] [Updated] (SPARK-44239) Reclaim memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Attachment: image-2023-06-29-13-03-15-470.png > Reclaim memory allocated by huge column vector > -

[jira] [Updated] (SPARK-44239) Reclaim memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Description: When spark reads a data file into a WritableColumnVector, the memory allocated by the Writab

[jira] [Updated] (SPARK-44239) Reclaim memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Description: When spark reads a data file into a WritableColumnVector, the memory allocated by the Writab

[jira] [Updated] (SPARK-44239) Reclaim memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-44239: Attachment: image-2023-06-29-12-58-12-256.png > Reclaim memory allocated by huge column vector > -

[jira] [Created] (SPARK-44239) Reclaim memory allocated by huge column vector

2023-06-28 Thread Wan Kun (Jira)
Wan Kun created SPARK-44239: --- Summary: Reclaim memory allocated by huge column vector Key: SPARK-44239 URL: https://issues.apache.org/jira/browse/SPARK-44239 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-44109) Remove duplicate preferred locations of each RDD partition

2023-06-20 Thread Wan Kun (Jira)
Wan Kun created SPARK-44109: --- Summary: Remove duplicate preferred locations of each RDD partition Key: SPARK-44109 URL: https://issues.apache.org/jira/browse/SPARK-44109 Project: Spark Issue Type:

[jira] [Created] (SPARK-43876) Enable fast hashmap for distinct queries

2023-05-29 Thread Wan Kun (Jira)
Wan Kun created SPARK-43876: --- Summary: Enable fast hashmap for distinct queries Key: SPARK-43876 URL: https://issues.apache.org/jira/browse/SPARK-43876 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-43593) Support the minimum number of range shuffle partitions

2023-05-19 Thread Wan Kun (Jira)
Wan Kun created SPARK-43593: --- Summary: Support the minimum number of range shuffle partitions Key: SPARK-43593 URL: https://issues.apache.org/jira/browse/SPARK-43593 Project: Spark Issue Type: Impr

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-05-08 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* h2. How to support more subexpressions elimination cases * Get all commo

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-05-08 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* h2. How to support more subexpressions elimination cases * Get all commo

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-05-08 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* h2. How to support more subexpressions elimination cases * Get all commo

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-03-23 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* * Get all common expressions from input expressions. Recursively visits

[jira] [Created] (SPARK-42897) Avoid evaluate more than once for the variables from the left side in the FullOuter SMJ condition

2023-03-22 Thread Wan Kun (Jira)
Wan Kun created SPARK-42897: --- Summary: Avoid evaluate more than once for the variables from the left side in the FullOuter SMJ condition Key: SPARK-42897 URL: https://issues.apache.org/jira/browse/SPARK-42897

[jira] [Updated] (SPARK-42831) Show result expressions in AggregateExec

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42831: Description: If the result expressions in AggregateExec are not empty, we should display them. Or we will

[jira] [Updated] (SPARK-42831) Show result expressions in AggregateExec

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42831: Description: If the result expressions in AggregateExec is non-empty, we should show them. Or we will be

[jira] [Created] (SPARK-42831) Show result expressions in AggregateExec

2023-03-16 Thread Wan Kun (Jira)
Wan Kun created SPARK-42831: --- Summary: Show result expressions in AggregateExec Key: SPARK-42831 URL: https://issues.apache.org/jira/browse/SPARK-42831 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* * Get all common expressions from input expressions. Recursively visits

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* * Get all common expressions from input expressions. Recursively visits

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: h1. *Design Sketch* * Get all common expressions from input expressions. Recursively visits

[jira] [Updated] (SPARK-42551) Support more subexpression elimination cases

2023-03-16 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Summary: Support more subexpression elimination cases (was: Support subexpression elimination in FilterEx

[jira] [Updated] (SPARK-42551) Support subexpression elimination in FilterExec and JoinExec

2023-03-07 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Summary: Support subexpression elimination in FilterExec and JoinExec (was: Support subexpression elimina

[jira] [Updated] (SPARK-42551) Support subexpression elimination in FilterExec

2023-02-23 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: Just like SPARK-33092, We can support subexpression elimination in FilterExec in Whole-stage

[jira] [Updated] (SPARK-42551) Support subexpression elimination in FilterExec

2023-02-23 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42551: Description: Just like SPARK-33092, We can support subexpression elimination in FilterExec in Whole-stage

[jira] [Created] (SPARK-42551) Support subexpression elimination in FilterExec

2023-02-23 Thread Wan Kun (Jira)
Wan Kun created SPARK-42551: --- Summary: Support subexpression elimination in FilterExec Key: SPARK-42551 URL: https://issues.apache.org/jira/browse/SPARK-42551 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-42360) Transform LeftOuter join with IsNull filter on right side to Anti join

2023-02-06 Thread Wan Kun (Jira)
Wan Kun created SPARK-42360: --- Summary: Transform LeftOuter join with IsNull filter on right side to Anti join Key: SPARK-42360 URL: https://issues.apache.org/jira/browse/SPARK-42360 Project: Spark

[jira] [Created] (SPARK-42270) Sort merge join may oom when right match rows are very large

2023-01-31 Thread Wan Kun (Jira)
Wan Kun created SPARK-42270: --- Summary: Sort merge join may oom when right match rows are very large Key: SPARK-42270 URL: https://issues.apache.org/jira/browse/SPARK-42270 Project: Spark Issue Typ

[jira] [Updated] (SPARK-42223) Remove duplicate branches in CASE_WHEN and COALESCE function

2023-01-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42223: Summary: Remove duplicate branches in CASE_WHEN and COALESCE function (was: Remove duplicate branch in CA

[jira] [Updated] (SPARK-42223) Remove duplicate branch in CASE_WHEN and COALESCE function

2023-01-28 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-42223: Summary: Remove duplicate branch in CASE_WHEN and COALESCE function (was: Remove duplicat branch in CASE_

[jira] [Created] (SPARK-42223) Remove duplicat branch in CASE_WHEN and COALESCE function

2023-01-28 Thread Wan Kun (Jira)
Wan Kun created SPARK-42223: --- Summary: Remove duplicat branch in CASE_WHEN and COALESCE function Key: SPARK-42223 URL: https://issues.apache.org/jira/browse/SPARK-42223 Project: Spark Issue Type: I

[jira] [Created] (SPARK-42025) Optimize logs for removeBlocks and removeShuffleMerge PRC

2023-01-12 Thread Wan Kun (Jira)
Wan Kun created SPARK-42025: --- Summary: Optimize logs for removeBlocks and removeShuffleMerge PRC Key: SPARK-42025 URL: https://issues.apache.org/jira/browse/SPARK-42025 Project: Spark Issue Type: I

[jira] [Created] (SPARK-41981) Collapse percentile functions if possible

2023-01-11 Thread Wan Kun (Jira)
Wan Kun created SPARK-41981: --- Summary: Collapse percentile functions if possible Key: SPARK-41981 URL: https://issues.apache.org/jira/browse/SPARK-41981 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-41940) Infer IsNotNull constraints for complex join expressions

2023-01-08 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-41940: Description: Infer IsNotNull constraints for complex join expressions could help filter a lot of rows bef

[jira] [Updated] (SPARK-41940) Infer IsNotNull constraints for complex join expressions

2023-01-08 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-41940: Summary: Infer IsNotNull constraints for complex join expressions (was: Infer IsNotNull constraints for c

[jira] [Created] (SPARK-41940) Infer IsNotNull constraints for complex expression join keys

2023-01-08 Thread Wan Kun (Jira)
Wan Kun created SPARK-41940: --- Summary: Infer IsNotNull constraints for complex expression join keys Key: SPARK-41940 URL: https://issues.apache.org/jira/browse/SPARK-41940 Project: Spark Issue Typ

[jira] [Updated] (SPARK-41805) Reuse expressions in WindowSpecDefinition

2023-01-01 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-41805: Description: For complex expressions in window spec definition, we extract it and replace it with an Attr

[jira] [Created] (SPARK-41805) Reuse expressions in WindowSpecDefinition

2023-01-01 Thread Wan Kun (Jira)
Wan Kun created SPARK-41805: --- Summary: Reuse expressions in WindowSpecDefinition Key: SPARK-41805 URL: https://issues.apache.org/jira/browse/SPARK-41805 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-41416) Rewrite self join in in predicate to aggregate

2022-12-06 Thread Wan Kun (Jira)
Wan Kun created SPARK-41416: --- Summary: Rewrite self join in in predicate to aggregate Key: SPARK-41416 URL: https://issues.apache.org/jira/browse/SPARK-41416 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-41368) Reorder the window partition expressions by expression stats

2022-12-02 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-41368: Description: We can reorder window partition expressions by the distinct values stats. Sorting with high

[jira] [Created] (SPARK-41368) Reorder the window partition expressions by expression stats

2022-12-02 Thread Wan Kun (Jira)
Wan Kun created SPARK-41368: --- Summary: Reorder the window partition expressions by expression stats Key: SPARK-41368 URL: https://issues.apache.org/jira/browse/SPARK-41368 Project: Spark Issue Typ

[jira] [Created] (SPARK-41355) Workaround hive table name validation issue

2022-12-01 Thread Wan Kun (Jira)
Wan Kun created SPARK-41355: --- Summary: Workaround hive table name validation issue Key: SPARK-41355 URL: https://issues.apache.org/jira/browse/SPARK-41355 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-41167) Optimize LikeSimplification rule to improve multi like performance

2022-11-16 Thread Wan Kun (Jira)
Wan Kun created SPARK-41167: --- Summary: Optimize LikeSimplification rule to improve multi like performance Key: SPARK-41167 URL: https://issues.apache.org/jira/browse/SPARK-41167 Project: Spark Iss

[jira] [Created] (SPARK-41159) Optimize like any and like all expressions

2022-11-16 Thread Wan Kun (Jira)
Wan Kun created SPARK-41159: --- Summary: Optimize like any and like all expressions Key: SPARK-41159 URL: https://issues.apache.org/jira/browse/SPARK-41159 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-41132) Convert LikeAny and NotLikeAny to InSet if no pattern contains wildcards

2022-11-13 Thread Wan Kun (Jira)
Wan Kun created SPARK-41132: --- Summary: Convert LikeAny and NotLikeAny to InSet if no pattern contains wildcards Key: SPARK-41132 URL: https://issues.apache.org/jira/browse/SPARK-41132 Project: Spark

[jira] [Created] (SPARK-40715) Support preferring shuffled hash join thought LocalMapThreshold is less than advisory partition size

2022-10-09 Thread Wan Kun (Jira)
Wan Kun created SPARK-40715: --- Summary: Support preferring shuffled hash join thought LocalMapThreshold is less than advisory partition size Key: SPARK-40715 URL: https://issues.apache.org/jira/browse/SPARK-40715

[jira] [Created] (SPARK-40480) Remove push-based shuffle data after query finished

2022-09-18 Thread Wan Kun (Jira)
Wan Kun created SPARK-40480: --- Summary: Remove push-based shuffle data after query finished Key: SPARK-40480 URL: https://issues.apache.org/jira/browse/SPARK-40480 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-40306) Support more than Integer.MAX_VALUE of the same join key

2022-09-01 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-40306: Description: For SMJ, the number of the same join key records of the right table is greater than Integer.

[jira] [Updated] (SPARK-40306) Support more than Integer.MAX_VALUE of the same join key

2022-09-01 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-40306: Attachment: image-2022-09-01-23-02-15-955.png > Support more than Integer.MAX_VALUE of the same join key >

[jira] [Created] (SPARK-40306) Support more than Integer.MAX_VALUE of the same join key

2022-09-01 Thread Wan Kun (Jira)
Wan Kun created SPARK-40306: --- Summary: Support more than Integer.MAX_VALUE of the same join key Key: SPARK-40306 URL: https://issues.apache.org/jira/browse/SPARK-40306 Project: Spark Issue Type: Bu

[jira] [Created] (SPARK-40164) The partitionSpec should be distinct keys after filter one row of row_number

2022-08-21 Thread Wan Kun (Jira)
Wan Kun created SPARK-40164: --- Summary: The partitionSpec should be distinct keys after filter one row of row_number Key: SPARK-40164 URL: https://issues.apache.org/jira/browse/SPARK-40164 Project: Spark

[jira] [Created] (SPARK-40159) Aggregate should be group only after collapse project to aggregate

2022-08-20 Thread Wan Kun (Jira)
Wan Kun created SPARK-40159: --- Summary: Aggregate should be group only after collapse project to aggregate Key: SPARK-40159 URL: https://issues.apache.org/jira/browse/SPARK-40159 Project: Spark Iss

[jira] [Created] (SPARK-40096) Finalize shuffle merge slow due to connection creation fails

2022-08-15 Thread Wan Kun (Jira)
Wan Kun created SPARK-40096: --- Summary: Finalize shuffle merge slow due to connection creation fails Key: SPARK-40096 URL: https://issues.apache.org/jira/browse/SPARK-40096 Project: Spark Issue Typ

[jira] [Updated] (SPARK-39893) Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-07-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Description: If all group expressions are foldable, the result of this aggregate will always be OneRowRel

[jira] [Updated] (SPARK-39893) Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-07-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Description: If all group expressions are foldable, the result of this aggregate will always be OneRowRel

[jira] [Updated] (SPARK-39893) Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-07-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Description: If all group expressions are foldable, the result of this aggregate will always be OneRowRel

[jira] [Updated] (SPARK-39893) Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-07-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Description: If all group expressions are foldable, the result of this aggregate will always be OneRowRel

[jira] [Updated] (SPARK-39893) Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-07-29 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Summary: Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are

[jira] [Updated] (SPARK-39893) Remove redundant aggregate if it is group only and all grouping and aggregate expressions are foldable

2022-07-27 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Summary: Remove redundant aggregate if it is group only and all grouping and aggregate expressions are fol

[jira] [Updated] (SPARK-39893) Remove Aggregate if it is group only and all grouping and aggregate expressions are foldable

2022-07-27 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Summary: Remove Aggregate if it is group only and all grouping and aggregate expressions are foldable (wa

[jira] [Updated] (SPARK-39893) Remote Aggregate if it is group only and all grouping and aggregate expressions are foldable

2022-07-27 Thread Wan Kun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wan Kun updated SPARK-39893: Description: If all groupingExpressions and aggregateExpressions in a aggregate are foldable, we can remo

[jira] [Created] (SPARK-39893) Remote Aggregate if it is group only and all grouping and aggregate expressions are foldable

2022-07-27 Thread Wan Kun (Jira)
Wan Kun created SPARK-39893: --- Summary: Remote Aggregate if it is group only and all grouping and aggregate expressions are foldable Key: SPARK-39893 URL: https://issues.apache.org/jira/browse/SPARK-39893 Pr

[jira] [Created] (SPARK-39325) Improve MapOutputTracker convertMapStatuses performance

2022-05-27 Thread Wan Kun (Jira)
Wan Kun created SPARK-39325: --- Summary: Improve MapOutputTracker convertMapStatuses performance Key: SPARK-39325 URL: https://issues.apache.org/jira/browse/SPARK-39325 Project: Spark Issue Type: Imp

[jira] [Created] (SPARK-39080) Optimize shuffle error handler

2022-04-30 Thread Wan Kun (Jira)
Wan Kun created SPARK-39080: --- Summary: Optimize shuffle error handler Key: SPARK-39080 URL: https://issues.apache.org/jira/browse/SPARK-39080 Project: Spark Issue Type: Improvement Compon

[jira] [Created] (SPARK-39072) Fast Fail the remaining push blocks if shuffle stage finalized

2022-04-29 Thread Wan Kun (Jira)
Wan Kun created SPARK-39072: --- Summary: Fast Fail the remaining push blocks if shuffle stage finalized Key: SPARK-39072 URL: https://issues.apache.org/jira/browse/SPARK-39072 Project: Spark Issue T
