[jira] [Updated] (SPARK-37473) BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present

2021-11-26 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-37473: -- Description: !image-2021-11-27-15-07-41-516.png|width=715,height=484! We think it has no data when

[jira] [Updated] (SPARK-37473) BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present

2021-11-26 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-37473: -- Attachment: image-2021-11-27-15-07-41-516.png > BypassMergeSortShuffleWriter may loss data when disk

[jira] [Updated] (SPARK-37473) BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present

2021-11-26 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-37473: -- Description: !image-2021-11-27-15-07-41-516.png|width=715,height=484! We think it has no data when

[jira] [Created] (SPARK-37473) BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present

2021-11-26 Thread haiyangyu (Jira)
haiyangyu created SPARK-37473: - Summary: BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present Key: SPARK-37473 URL: https://issues.apache.org/jira/browse/SPARK-37473

[jira] [Updated] (SPARK-34536) zstd-jni lead to read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Summary: zstd-jni lead to read less shuffle data (was: zstd-jni lead read less shuffle data) >

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Description: h2. Background I found a rare case that leads some partitions to read less data when use

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Description: h2. Background I found a rare case that leads some partitions to read less data when use

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Description: h2. Background I found a rare case that leads some partitions to read less data when use

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Description: h2. Background I found a rare case that leads some partitions to read less data when use

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Description: h2. Background I found a rare case that leads some partitions to read less data when use

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Attachment: image-2021-02-25-17-51-49-998.png > zstd-jni lead read less shuffle data >

[jira] [Updated] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34536: -- Attachment: image-2021-02-25-17-50-49-427.png > zstd-jni lead read less shuffle data >

[jira] [Created] (SPARK-34536) zstd-jni lead read less shuffle data

2021-02-25 Thread haiyangyu (Jira)
haiyangyu created SPARK-34536: - Summary: zstd-jni lead read less shuffle data Key: SPARK-34536 URL: https://issues.apache.org/jira/browse/SPARK-34536 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Description: We will build a new rpc message `FetchShuffleBlocks` when `OneForOneBlockFetcher` init

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Description: We will build a new rpc message `FetchShuffleBlocks` when `OneForOneBlockFetcher` init

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Attachment: image-2021-02-25-11-31-59-110.png > New protocol FetchShuffleBlocks in

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Attachment: image-2021-02-25-11-30-03-834.png > New protocol FetchShuffleBlocks in

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Attachment: image-2021-02-25-11-28-31-255.png > New protocol FetchShuffleBlocks in

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Attachment: image-2021-02-25-11-27-34-429.png > New protocol FetchShuffleBlocks in

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Description: We will build a new rpc message `FetchShuffleBlocks` when `OneForOneBlockFetcher` init

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Attachment: image-2021-02-25-11-17-12-714.png > New protocol FetchShuffleBlocks in

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Description: We will build a new rpc message  {code:java} FetchShuffleBlocks{code} when {code:java}

[jira] [Updated] (SPARK-34534) New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34534: -- Summary: New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

[jira] [Created] (SPARK-34534) FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness

2021-02-24 Thread haiyangyu (Jira)
haiyangyu created SPARK-34534: - Summary: FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness Key: SPARK-34534 URL: https://issues.apache.org/jira/browse/SPARK-34534 Project:

[jira] [Updated] (SPARK-34242) Use getPartitionByNames to filter partition to avoid partition scan

2021-01-26 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34242: -- Description: In HiveShim, `getPartitionByFilters` will lead to a partition scan in special , ec: 

[jira] [Updated] (SPARK-34242) Use getPartitionByNames to filter partition to avoid partition scan

2021-01-26 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-34242: -- Description: In HiveShim, {code:java} getPartitionByFilters{code} will lead to a partition scan in special

[jira] [Created] (SPARK-34242) Use getPartitionByNames to filter partition to avoid partition scan

2021-01-26 Thread haiyangyu (Jira)
haiyangyu created SPARK-34242: - Summary: Use getPartitionByNames to filter partition to avoid partition scan Key: SPARK-34242 URL: https://issues.apache.org/jira/browse/SPARK-34242 Project: Spark

[jira] [Updated] (SPARK-30325) markPartitionCompleted cause task status inconsistent

2019-12-25 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Summary: markPartitionCompleted cause task status inconsistent (was: Stage retry and executor

[jira] [Commented] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001661#comment-17001661 ] haiyangyu commented on SPARK-30325: --- [https://github.com/apache/spark/pull/26975] > Stage retry and

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Description: h3. Corner case The bug occurs in the corner case as follows: # The stage occurs for

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Attachment: image-2019-12-21-17-17-42-244.png > Stage retry and executor crashed cause app hung up

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Attachment: image-2019-12-21-17-16-40-998.png > Stage retry and executor crashed cause app hung up

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Attachment: image-2019-12-21-17-15-51-512.png > Stage retry and executor crashed cause app hung up

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Attachment: image-2019-12-21-17-11-38-565.png > Stage retry and executor crashed cause app hung up

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Description: h3. Corner case The bug occurs in the corner case as follows: # The stage occurs for

[jira] [Created] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
haiyangyu created SPARK-30325: - Summary: Stage retry and executor crashed cause app hung up forever Key: SPARK-30325 URL: https://issues.apache.org/jira/browse/SPARK-30325 Project: Spark Issue

[jira] [Updated] (SPARK-30325) Stage retry and executor crashed cause app hung up forever

2019-12-21 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30325: -- Description: Kill tasks which succeeded in the original stage when a new retry stage has started the same

[jira] [Comment Edited] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999237#comment-16999237 ] haiyangyu edited comment on SPARK-30297 at 12/18/19 2:53 PM: -

[jira] [Commented] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999237#comment-16999237 ] haiyangyu commented on SPARK-30297: --- [~r...@databricks.com] [~AMateenM] [~dongjoon] please look this

[jira] [Commented] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999235#comment-16999235 ] haiyangyu commented on SPARK-30297: --- pr here [https://github.com/apache/spark/pull/26938] > Executor

[jira] [Updated] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30297: -- Description: h3. *Background* The driver can't sense this executor was lost through the network

[jira] [Updated] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30297: -- Description: h3. *Background* The driver can't sense this executor was lost through the network

[jira] [Updated] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haiyangyu updated SPARK-30297: -- Description: h3. *Background* The driver can't sense this executor was lost through the network

[jira] [Created] (SPARK-30297) Executor heartbeat expired cause app hung up forever

2019-12-18 Thread haiyangyu (Jira)
haiyangyu created SPARK-30297: - Summary: Executor heartbeat expired cause app hung up forever Key: SPARK-30297 URL: https://issues.apache.org/jira/browse/SPARK-30297 Project: Spark Issue Type: