[jira] [Updated] (SPARK-48580) Add consistency check and fallback for mapIds in push-merged block meta

2024-06-11 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Parent: SPARK-33235 Issue Type: Sub-task (was: Bug) > Add consistency check and fallback fo

[jira] [Updated] (SPARK-48580) Add consistency check and fallback for mapIds in push-merged block meta

2024-06-11 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Summary: Add consistency check and fallback for mapIds in push-merged block meta (was: MergedBlock

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Description: When push-based shuffle is enabled, 0.03% of the Spark applications in our cluster experie

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Description: When push-based shuffle is enabled, 0.03% of the Spark applications in our cluster experie

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Attachment: (was: image-2024-06-11-10-19-22-284.png) > MergedBlock read by reduce have missing c

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Attachment: image-2024-06-11-10-19-57-227.png > MergedBlock read by reduce have missing chunks, lead

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Attachment: image-2024-06-11-10-19-22-284.png > MergedBlock read by reduce have missing chunks, lead

[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-48580: --- Summary: MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data (was:

[jira] [Created] (SPARK-48580) The merge blocks read by reduce have missing chunks, leading to inconsistent shuffle data

2024-06-10 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-48580: -- Summary: The merge blocks read by reduce have missing chunks, leading to inconsistent shuffle data Key: SPARK-48580 URL: https://issues.apache.org/jira/browse/SPARK-48580

[jira] [Commented] (SPARK-42694) Data duplication and loss occur after executing 'insert overwrite...' in Spark 3.1.1

2024-05-14 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846293#comment-17846293 ] gaoyajun02 commented on SPARK-42694: Have you enabled push-based shuffle? > Data du

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-17 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Priority: Critical (was: Blocker) > Data duplication may occur when fallback to origin shuffle bloc

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-17 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Priority: Blocker (was: Critical) > Data duplication may occur when fallback to origin shuffle bloc

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-17 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Priority: Critical (was: Major) > Data duplication may occur when fallback to origin shuffle block

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-17 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Affects Version/s: 3.5.0 3.4.0 3.3.0 > Data duplicatio

[jira] [Commented] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-12 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764130#comment-17764130 ] gaoyajun02 commented on SPARK-45134: Hi, [~csingh] [~vsowrirajan] [~mshen] , Can you

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-12 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Description: One possible situation that has been found is that, during the process of requesting m

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-12 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Description: One possible situation that has been found is that, during the process of requesting m

[jira] [Updated] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-12 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-45134: --- Description: One possible situation that has been found is that, during the process of requesting m

[jira] [Created] (SPARK-45134) Data duplication may occur when fallback to origin shuffle block

2023-09-12 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-45134: -- Summary: Data duplication may occur when fallback to origin shuffle block Key: SPARK-45134 URL: https://issues.apache.org/jira/browse/SPARK-45134 Project: Spark

[jira] [Commented] (SPARK-42203) JsonProtocol should skip logging of push-based shuffle read metrics when push-based shuffle is disabled

2023-07-23 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746225#comment-17746225 ] gaoyajun02 commented on SPARK-42203: ping [~thejdeep], when can you resolve it? > J

[jira] [Commented] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-06-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728724#comment-17728724 ] gaoyajun02 commented on SPARK-43864: It looks like a series of test package dependen

[jira] [Commented] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-30 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727827#comment-17727827 ] gaoyajun02 commented on SPARK-43864: @[~srowen] thank you for your reply. It doesn't

[jira] [Commented] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727064#comment-17727064 ] gaoyajun02 commented on SPARK-43864: Can you take a look, @[~yangjie01]? > Version

[jira] [Updated] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-43864: --- Attachment: (was: image-2023-05-29-18-18-44-132.png) > Versions of the package net.sourceforge.h

[jira] [Updated] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-43864: --- Attachment: image-2023-05-29-18-18-44-132.png > Versions of the package net.sourceforge.htmlunit:htm

[jira] [Updated] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-43864: --- Description: CVE-2023-26119 Detail: [https://nvd.nist.gov/vuln/detail/CVE-2023-26119] It is recomme

[jira] [Updated] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-43864: --- Description: CVE-2023-26119 Detail: [https://nvd.nist.gov/vuln/detail/CVE-2023-26119] It is recomme

[jira] [Created] (SPARK-43864) Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL

2023-05-29 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-43864: -- Summary: Versions of the package net.sourceforge.htmlunit:htmlunit from 0 and before 3.0.0 are vulnerable to Remote Code Execution (RCE) via XSTL Key: SPARK-43864 URL: https://issues.

[jira] [Updated] (SPARK-40872) Fallback to original shuffle block when a push-merged shuffle chunk is zero-size

2022-10-21 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-40872: --- Parent: SPARK-33235 Issue Type: Sub-task (was: Improvement) > Fallback to original shuffle

[jira] [Created] (SPARK-40872) Fallback to original shuffle block when a push-merged shuffle chunk is zero-size

2022-10-21 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-40872: -- Summary: Fallback to original shuffle block when a push-merged shuffle chunk is zero-size Key: SPARK-40872 URL: https://issues.apache.org/jira/browse/SPARK-40872 Project:

[jira] [Resolved] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-25 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 resolved SPARK-38010. Resolution: Fixed > Push-based shuffle disabled due to insufficient mergeLocations > -

[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-25 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-38010: --- Issue Type: Question (was: Brainstorming) > Push-based shuffle disabled due to insufficient mergeLo

[jira] [Commented] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481614#comment-17481614 ] gaoyajun02 commented on SPARK-38010: https://issues.apache.org/jira/browse/SPARK-348

[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-38010: --- Parent: (was: SPARK-33235) Issue Type: Brainstorming (was: Technical task) > Push-based

[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-38010: --- Issue Type: Technical task (was: Sub-task) > Push-based shuffle disabled due to insufficient mergeL

[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-38010: --- Parent: SPARK-33235 Issue Type: Sub-task (was: Improvement) > Push-based shuffle disabled d

[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-38010: --- Description: The current shuffle merger locations is obtained based on the host of the active or de

[jira] [Created] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

2022-01-24 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-38010: -- Summary: Push-based shuffle disabled due to insufficient mergeLocations Key: SPARK-38010 URL: https://issues.apache.org/jira/browse/SPARK-38010 Project: Spark I

[jira] [Updated] (SPARK-36964) Reuse CachedDNSToSwitchMapping for yarn container requests

2021-10-17 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36964: --- Affects Version/s: 3.3.0 3.2.0 > Reuse CachedDNSToSwitchMapping for yarn con

[jira] [Updated] (SPARK-36964) Reuse CachedDNSToSwitchMapping for yarn container requests

2021-10-14 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36964: --- Description: Similar to SPARK-13704, In some cases, YarnAllocator add container requests with loca

[jira] [Updated] (SPARK-36964) Reuse CachedDNSToSwitchMapping for yarn container requests

2021-10-14 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36964: --- Description: Similar to SPARK-13704, In some cases, YarnAllocator add or remove container requests

[jira] [Updated] (SPARK-36964) Reuse CachedDNSToSwitchMapping for yarn container requests

2021-10-14 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36964: --- Description: Similar to SPARK-13704, In some cases, YarnAllocator add or remove container requests

[jira] [Created] (SPARK-36964) Reuse CachedDNSToSwitchMapping for yarn container requests

2021-10-09 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-36964: -- Summary: Reuse CachedDNSToSwitchMapping for yarn container requests Key: SPARK-36964 URL: https://issues.apache.org/jira/browse/SPARK-36964 Project: Spark Issu

[jira] [Updated] (SPARK-36815) Found duplicate rewrite attributes

2021-09-22 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36815: --- Description: We are using Spark version 3.0.2 in production and some ETLs contain multi-level CTEs

[jira] [Comment Edited] (SPARK-36815) Found duplicate rewrite attributes

2021-09-21 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418064#comment-17418064 ] gaoyajun02 edited comment on SPARK-36815 at 9/21/21, 2:19 PM:

[jira] [Comment Edited] (SPARK-36815) Found duplicate rewrite attributes

2021-09-21 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418064#comment-17418064 ] gaoyajun02 edited comment on SPARK-36815 at 9/21/21, 12:17 PM: ---

[jira] [Commented] (SPARK-36815) Found duplicate rewrite attributes

2021-09-21 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418064#comment-17418064 ] gaoyajun02 commented on SPARK-36815: https://issues.apache.org/jira/browse/SPARK-332

[jira] [Updated] (SPARK-36815) Found duplicate rewrite attributes

2021-09-21 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36815: --- Description: We are using Spark version 3.0.2 in production and some ETLs contain multi-level CETs

[jira] [Created] (SPARK-36815) Found duplicate rewrite attributes

2021-09-21 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-36815: -- Summary: Found duplicate rewrite attributes Key: SPARK-36815 URL: https://issues.apache.org/jira/browse/SPARK-36815 Project: Spark Issue Type: Bug Comp

[jira] [Updated] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36630: --- Parent: (was: SPARK-33828) Issue Type: Question (was: Sub-task) > Add the option to use

[jira] [Issue Comment Deleted] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36630: --- Comment: was deleted (was: close it) > Add the option to use physical statistics to avoid large tab

[jira] [Closed] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 closed SPARK-36630. -- close it > Add the option to use physical statistics to avoid large tables being > broadcast > -

[jira] [Resolved] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 resolved SPARK-36630. Resolution: Fixed > Add the option to use physical statistics to avoid large tables being > broad

[jira] [Commented] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408901#comment-17408901 ] gaoyajun02 commented on SPARK-36630: Found my issue is similar to https://issues.apa

[jira] [Updated] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-02 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36630: --- Parent: SPARK-33828 Issue Type: Sub-task (was: Improvement) > Add the option to use physica

[jira] [Updated] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-09-01 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36630: --- Description: Currently when AQE's queryStage is not materialized, it uses the stats of the logical p

[jira] [Updated] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-08-31 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36630: --- Description: Currently when AQE's queryStage is not materialized, it uses the stats of the logical p

[jira] [Created] (SPARK-36630) Add the option to use physical statistics to avoid large tables being broadcast

2021-08-31 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-36630: -- Summary: Add the option to use physical statistics to avoid large tables being broadcast Key: SPARK-36630 URL: https://issues.apache.org/jira/browse/SPARK-36630 Project:

[jira] [Updated] (SPARK-36339) aggsBuffer should collect AggregateExpression in the map range

2021-07-28 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36339: --- Description: show demo for this ISSUE: {code:java} // SQL without error SELECT name, count(name) c

[jira] [Created] (SPARK-36339) aggsBuffer should collect AggregateExpression in the map range

2021-07-28 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-36339: -- Summary: aggsBuffer should collect AggregateExpression in the map range Key: SPARK-36339 URL: https://issues.apache.org/jira/browse/SPARK-36339 Project: Spark I

[jira] [Commented] (SPARK-36121) Write data loss caused by stage retry when enable v2 FileOutputCommitter

2021-07-14 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380646#comment-17380646 ] gaoyajun02 commented on SPARK-36121: Hi [~hyukjin.kwon], Thank you very much. It se

[jira] [Updated] (SPARK-36121) Write data loss caused by stage retry when enable v2 FileOutputCommitter

2021-07-13 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36121: --- Description: All our ETL scenarios are configured: mapreduce.fileoutputcommitter.algorithm.version=

[jira] [Updated] (SPARK-36121) Write data loss caused by stage retry when enable v2 FileOutputCommitter

2021-07-13 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaoyajun02 updated SPARK-36121: --- Description: All our ETL scenarios are configured: mapreduce.fileoutputcommitter.algorithm.version=

[jira] [Created] (SPARK-36121) Write data loss caused by stage retry when enable v2 FileOutputCommitter

2021-07-13 Thread gaoyajun02 (Jira)
gaoyajun02 created SPARK-36121: -- Summary: Write data loss caused by stage retry when enable v2 FileOutputCommitter Key: SPARK-36121 URL: https://issues.apache.org/jira/browse/SPARK-36121 Project: Spark

[jira] [Comment Edited] (SPARK-35920) Upgrade to Chill 0.10.0

2021-06-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371213#comment-17371213 ] gaoyajun02 edited comment on SPARK-35920 at 6/29/21, 9:31 AM:

[jira] [Commented] (SPARK-35920) Upgrade to Chill 0.10.0

2021-06-29 Thread gaoyajun02 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371213#comment-17371213 ] gaoyajun02 commented on SPARK-35920: The spark core module loses this dependency in