[jira] [Updated] (SPARK-46052) Remove unnecessary TaskScheduler.killAllTaskAttempts

2023-11-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-46052: - Description: Spark has two functions to kill all tasks in a Stage: * `cancelTasks`: Not only kill all the

[jira] [Created] (SPARK-46052) Remove unnecessary TaskScheduler.killAllTaskAttempts

2023-11-22 Thread wuyi (Jira)
wuyi created SPARK-46052: Summary: Remove unnecessary TaskScheduler.killAllTaskAttempts Key: SPARK-46052 URL: https://issues.apache.org/jira/browse/SPARK-46052 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774756#comment-17774756 ] wuyi commented on SPARK-45527: -- cc [~wbo4958]   [~tgraves]  > Task fraction resource request is not

[jira] [Created] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)
wuyi created SPARK-45527: Summary: Task fraction resource request is not expected Key: SPARK-45527 URL: https://issues.apache.org/jira/browse/SPARK-45527 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-45527: - Description:   {code:java} test("SPARK-XXX") { import org.apache.spark.resource.{ResourceProfileBuilder,

[jira] [Commented] (SPARK-45057) Deadlock caused by rdd replication level of 2

2023-09-27 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769443#comment-17769443 ] wuyi commented on SPARK-45057: -- In the case of "Received UploadBlock request from T1 (blocked by T4)",

[jira] [Created] (SPARK-45310) Mapstatus location type changed from external shuffle service to executor after decommission migration

2023-09-25 Thread wuyi (Jira)
wuyi created SPARK-45310: Summary: Mapstatus location type changed from external shuffle service to executor after decommission migration Key: SPARK-45310 URL: https://issues.apache.org/jira/browse/SPARK-45310

[jira] [Created] (SPARK-42577) A large stage could run indefinitely due to executor lost

2023-02-25 Thread wuyi (Jira)
wuyi created SPARK-42577: Summary: A large stage could run indefinitely due to executor lost Key: SPARK-42577 URL: https://issues.apache.org/jira/browse/SPARK-42577 Project: Spark Issue Type:

[jira] [Created] (SPARK-41958) Disallow arbitrary custom classpath with proxy user in cluster mode

2023-01-09 Thread wuyi (Jira)
wuyi created SPARK-41958: Summary: Disallow arbitrary custom classpath with proxy user in cluster mode Key: SPARK-41958 URL: https://issues.apache.org/jira/browse/SPARK-41958 Project: Spark Issue

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2023-01-02 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17653781#comment-17653781 ] wuyi commented on SPARK-41497: -- > If I am not wrong, SQL makes very heavy use of accumulators, and so most

[jira] [Updated] (SPARK-41848) Tasks are over-scheduled with TaskResourceProfile

2023-01-02 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-41848: - Priority: Blocker (was: Major) > Tasks are over-scheduled with TaskResourceProfile >

[jira] [Commented] (SPARK-41848) Tasks are over-scheduled with TaskResourceProfile

2023-01-02 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17653739#comment-17653739 ] wuyi commented on SPARK-41848: -- cc [~ivoson]  > Tasks are over-scheduled with TaskResourceProfile >

[jira] [Created] (SPARK-41848) Tasks are over-scheduled with TaskResourceProfile

2023-01-02 Thread wuyi (Jira)
wuyi created SPARK-41848: Summary: Tasks are over-scheduled with TaskResourceProfile Key: SPARK-41848 URL: https://issues.apache.org/jira/browse/SPARK-41848 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-39853) Support stage level schedule for standalone cluster when dynamic allocation is disabled

2023-01-02 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-39853: - Fix Version/s: 3.4.0 > Support stage level schedule for standalone cluster when dynamic allocation > is

[jira] [Comment Edited] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646971#comment-17646971 ] wuyi edited comment on SPARK-41497 at 12/14/22 7:31 AM: I'm thinking if we could

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646971#comment-17646971 ] wuyi commented on SPARK-41497: -- I'm thinking if we could improve the improved Option 4 by changing the rdd

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646969#comment-17646969 ] wuyi commented on SPARK-41497: -- > do we have a way to do that ?   [~mridulm80]  Currently, we only have

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646919#comment-17646919 ] wuyi commented on SPARK-41497: -- [~mridulm80]  For b) and c), shouldn't we allow T2 to use the result of

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646639#comment-17646639 ] wuyi commented on SPARK-41497: -- [~mridulm80] Sounds like a better idea than option 4. But I think it still

[jira] [Updated] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-41497: - Description: Accumulator could be undercounted when the retried task has rdd cache.  See the example below and

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646097#comment-17646097 ] wuyi commented on SPARK-41497: -- [~mridulm80] [~tgraves] [~attilapiros] [~ivoson] any good ideas? >

[jira] [Updated] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-41497: - Description: Accumulator could be undercounted when the retried task has rdd cache.  See the example below and

[jira] [Created] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-12 Thread wuyi (Jira)
wuyi created SPARK-41497: Summary: Accumulator undercounting in the case of retry task with rdd cache Key: SPARK-41497 URL: https://issues.apache.org/jira/browse/SPARK-41497 Project: Spark Issue

[jira] [Created] (SPARK-41469) Task rerun on decommissioned executor can be avoided if shuffle data has migrated

2022-12-09 Thread wuyi (Jira)
wuyi created SPARK-41469: Summary: Task rerun on decommissioned executor can be avoided if shuffle data has migrated Key: SPARK-41469 URL: https://issues.apache.org/jira/browse/SPARK-41469 Project: Spark

[jira] [Created] (SPARK-41460) Introduce IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint

2022-12-08 Thread wuyi (Jira)
wuyi created SPARK-41460: Summary: Introduce IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint Key: SPARK-41460 URL: https://issues.apache.org/jira/browse/SPARK-41460 Project: Spark

[jira] [Updated] (SPARK-41360) Avoid BlockManager re-registration if the executor has been lost

2022-12-01 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-41360: - Summary: Avoid BlockManager re-registration if the executor has been lost (was: Avoid BlockMananger

[jira] [Created] (SPARK-41360) Avoid BlockMananger re-registration if the executor has been lost

2022-12-01 Thread wuyi (Jira)
wuyi created SPARK-41360: Summary: Avoid BlockMananger re-registration if the executor has been lost Key: SPARK-41360 URL: https://issues.apache.org/jira/browse/SPARK-41360 Project: Spark Issue

[jira] [Updated] (SPARK-35011) False active executor in UI that caused by BlockManager reregistration

2022-12-01 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-35011: - Summary: False active executor in UI that caused by BlockManager reregistration (was: Avoid Block Manager

[jira] [Resolved] (SPARK-40596) Populate ExecutorDecommission with more informative messages

2022-10-10 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-40596. -- Assignee: Bo Zhang Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/38030 >

[jira] [Resolved] (SPARK-39853) Support stage level schedule for standalone cluster when dynamic allocation is disabled

2022-09-29 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-39853. -- Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/37268 > Support stage level

[jira] [Assigned] (SPARK-39853) Support stage level schedule for standalone cluster when dynamic allocation is disabled

2022-09-29 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-39853: Assignee: huangtengfei > Support stage level schedule for standalone cluster when dynamic allocation >

[jira] [Commented] (SPARK-40320) When the Executor plugin fails to initialize, the Executor shows active but does not accept tasks forever, just like being hung

2022-09-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17608268#comment-17608268 ] wuyi commented on SPARK-40320: -- I see. Thanks for the explanation. > When the Executor plugin fails to

[jira] [Commented] (SPARK-40320) When the Executor plugin fails to initialize, the Executor shows active but does not accept tasks forever, just like being hung

2022-09-06 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600777#comment-17600777 ] wuyi commented on SPARK-40320: -- > Actually the  `CoarseGrainedExecutorBackend` JVM process  is active but

[jira] [Assigned] (SPARK-39957) Delay onDisconnected to enable Driver receives ExecutorExitCode

2022-08-24 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-39957: Assignee: Kai-Hsun Chen > Delay onDisconnected to enable Driver receives ExecutorExitCode >

[jira] [Resolved] (SPARK-39957) Delay onDisconnected to enable Driver receives ExecutorExitCode

2022-08-24 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-39957. -- Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/37400 > Delay onDisconnected to

[jira] [Updated] (SPARK-39957) Delay onDisconnected to enable Driver receives ExecutorExitCode

2022-08-24 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-39957: - Fix Version/s: 3.4.0 > Delay onDisconnected to enable Driver receives ExecutorExitCode >

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572506#comment-17572506 ] wuyi commented on SPARK-34788: -- > Why don't you ensure enough disks from the beginning? On the write side,

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572478#comment-17572478 ] wuyi commented on SPARK-34788: -- [~leo wen] are you able to reproduce in your environment? If yes, we could

[jira] [Resolved] (SPARK-39062) Add Standalone backend support for Stage Level Scheduling

2022-07-04 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-39062. -- Fix Version/s: 3.4.0 Assignee: huangtengfei Resolution: Fixed Issue resolved by

[jira] [Updated] (SPARK-32170) Improve the speculation for the inefficient tasks by the task metrics.

2022-06-28 Thread wuyi (Jira)
wuyi updated an

[jira] [Resolved] (SPARK-32170) Improve the speculation for the inefficient tasks by the task metrics.

2022-06-28 Thread wuyi (Jira)
wuyi resolved as

[jira] [Resolved] (SPARK-39152) StreamCorruptedException cause job failure for disk persisted RDD

2022-06-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-39152. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/36512 >

[jira] [Resolved] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection

2022-03-31 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-38683. -- Fix Version/s: 3.4.0 Assignee: weixiuli Resolution: Fixed Issue resolved by

[jira] [Updated] (SPARK-38683) It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection

2022-03-31 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-38683: - Issue Type: Improvement (was: Bug) > It is unnecessary to release the ShuffleManagedBufferIterator or >

[jira] [Commented] (SPARK-38468) Use error classes in org.apache.spark.metrics

2022-03-15 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506754#comment-17506754 ] wuyi commented on SPARK-38468: -- Shall we close this one? Seem it's duplicate with

[jira] [Commented] (SPARK-37481) Disappearance of skipped stages mislead the bug hunting

2022-03-14 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506673#comment-17506673 ] wuyi commented on SPARK-37481: -- Backport fix to 3.1/3.0 also done.   > Disappearance of skipped stages

[jira] [Resolved] (SPARK-38266) UnresolvedException: Invalid call to dataType on unresolved object caused by GetDateFieldOperations

2022-02-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-38266. -- Resolution: Fixed > UnresolvedException: Invalid call to dataType on unresolved object caused by >

[jira] [Assigned] (SPARK-38266) UnresolvedException: Invalid call to dataType on unresolved object caused by GetDateFieldOperations

2022-02-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-38266: Assignee: wuyi > UnresolvedException: Invalid call to dataType on unresolved object caused by >

[jira] [Commented] (SPARK-38266) UnresolvedException: Invalid call to dataType on unresolved object caused by GetDateFieldOperations

2022-02-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17495285#comment-17495285 ] wuyi commented on SPARK-38266: -- Issue resolved by https://github.com/apache/spark/pull/35568 >

[jira] [Created] (SPARK-38266) UnresolvedException: Invalid call to dataType on unresolved object caused by GetDateFieldOperations

2022-02-20 Thread wuyi (Jira)
wuyi created SPARK-38266: Summary: UnresolvedException: Invalid call to dataType on unresolved object caused by GetDateFieldOperations Key: SPARK-38266 URL: https://issues.apache.org/jira/browse/SPARK-38266

[jira] [Resolved] (SPARK-37580) Optimize current TaskSetManager abort logic when task failed count reach the threshold

2022-01-18 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-37580. -- Assignee: wangshengjie Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/34834

[jira] [Resolved] (SPARK-37695) Skip diagnosis ob merged blocks from push-based shuffle

2021-12-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-37695. -- Fix Version/s: 3.2.1 3.3.0 Assignee: Cheng Pan Resolution: Fixed Issue

[jira] [Updated] (SPARK-37695) Skip diagnosis ob merged blocks from push-based shuffle

2021-12-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-37695: - Description: Shuffle  {code:java} 21/12/19 18:46:37 WARN TaskSetManager: Lost task 166.0 in stage 1921.0 (TID

[jira] [Updated] (SPARK-37695) Skip diagnosis ob merged blocks from push-based shuffle

2021-12-20 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-37695: - Description: Shuffle corruption diagnosis for push-based shuffle hasn't been supported yet. So we should skip

[jira] [Created] (SPARK-37695) Skip diagnosis ob merged blocks from push-based shuffle

2021-12-20 Thread wuyi (Jira)
wuyi created SPARK-37695: Summary: Skip diagnosis ob merged blocks from push-based shuffle Key: SPARK-37695 URL: https://issues.apache.org/jira/browse/SPARK-37695 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-37060) Report driver status does not handle response from backup masters

2021-12-15 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-37060: - Fix Version/s: 3.1.3 3.2.1 > Report driver status does not handle response from backup

[jira] [Assigned] (SPARK-37060) Report driver status does not handle response from backup masters

2021-12-15 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-37060: Assignee: Mohamadreza Rostami > Report driver status does not handle response from backup masters >

[jira] [Resolved] (SPARK-37060) Report driver status does not handle response from backup masters

2021-12-15 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-37060. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/34331 >

[jira] [Resolved] (SPARK-37300) TaskSchedulerImpl should ignore task finished event if its task was already finished state

2021-12-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-37300. -- Fix Version/s: 3.3.0 Assignee: hujiahua Resolution: Fixed Issue resolved by

[jira] [Commented] (SPARK-37481) Disappearance of skipped stages mislead the bug hunting

2021-12-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458131#comment-17458131 ] wuyi commented on SPARK-37481: -- Backport fix to 3.1/3.0 is still in progress. > Disappearance of skipped

[jira] [Resolved] (SPARK-37481) Disappearance of skipped stages mislead the bug hunting

2021-12-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-37481. -- Fix Version/s: 3.3.0 3.2.0 Assignee: Kent Yao Resolution: Fixed Issue

[jira] [Commented] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-10 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441796#comment-17441796 ] wuyi commented on SPARK-36575: -- FYI: the fix is reverted due to test issues. > Executor lost may cause

[jira] [Assigned] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-36575: Assignee: hujiahua > Executor lost may cause spark stage to hang >

[jira] [Assigned] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-36575: Assignee: (was: wuyi) > Executor lost may cause spark stage to hang >

[jira] [Commented] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441486#comment-17441486 ] wuyi commented on SPARK-36575: -- Issue resolved by https://github.com/apache/spark/pull/33872. To

[jira] [Updated] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36575: - Issue Type: Improvement (was: Bug) > Executor lost may cause spark stage to hang >

[jira] [Updated] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36575: - Fix Version/s: 3.3.0 > Executor lost may cause spark stage to hang >

[jira] [Assigned] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi reassigned SPARK-36575: Assignee: wuyi > Executor lost may cause spark stage to hang >

[jira] [Resolved] (SPARK-36575) Executor lost may cause spark stage to hang

2021-11-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36575. -- Resolution: Fixed > Executor lost may cause spark stage to hang > ---

[jira] [Commented] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2021-09-29 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421971#comment-17421971 ] wuyi commented on SPARK-18105: -- [~vladimir.prus] Hi, could you also file a sub-task under

[jira] [Resolved] (SPARK-36700) BlockManager re-registration is broken due to deferred removal of BlockManager

2021-09-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36700. -- Fix Version/s: 3.3.0 3.0.4 3.1.3 3.2.0

[jira] [Commented] (SPARK-36700) BlockManager re-registration is broken due to deferred removal of BlockManager

2021-09-12 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413865#comment-17413865 ] wuyi commented on SPARK-36700: -- Reverted by [https://github.com/apache/spark/pull/33942] and backported to

[jira] [Commented] (SPARK-36700) BlockManager re-registration is broken due to deferred removal of BlockManager

2021-09-08 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412290#comment-17412290 ] wuyi commented on SPARK-36700: -- I'm working on the fix. > BlockManager re-registration is broken due to

[jira] [Commented] (SPARK-36700) BlockManager re-registration is broken due to deferred removal of BlockManager

2021-09-08 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412285#comment-17412285 ] wuyi commented on SPARK-36700: -- cc [~mridulm80] [~sumeet.gajjar] [~attilapiros] cc [~gengliang] for the

[jira] [Created] (SPARK-36700) BlockManager re-registration is broken due to deferred removal of BlockManager

2021-09-08 Thread wuyi (Jira)
wuyi created SPARK-36700: Summary: BlockManager re-registration is broken due to deferred removal of BlockManager Key: SPARK-36700 URL: https://issues.apache.org/jira/browse/SPARK-36700 Project: Spark

[jira] [Created] (SPARK-36614) Executor loss reason shows "worker lost" rather "Executor decommission"

2021-08-30 Thread wuyi (Jira)
wuyi created SPARK-36614: Summary: Executor loss reason shows "worker lost" rather "Executor decommission" Key: SPARK-36614 URL: https://issues.apache.org/jira/browse/SPARK-36614 Project: Spark

[jira] [Updated] (SPARK-36614) Executor loss reason shows "worker lost" rather "Executor decommission"

2021-08-30 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36614: - Attachment: WeChat13c9f1345a096ff83d193e4e9853b165.png > Executor loss reason shows "worker lost" rather

[jira] [Comment Edited] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2021-08-30 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406622#comment-17406622 ] wuyi edited comment on SPARK-18105 at 8/30/21, 8:31 AM: FYI, for users who hit

[jira] [Commented] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2021-08-30 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406622#comment-17406622 ] wuyi commented on SPARK-18105: -- FYI, for users who hit the "Stream is corrupted" error, please try to apply

[jira] [Commented] (SPARK-36196) Spark FetchFailedException Stream is corrupted Error

2021-08-30 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406620#comment-17406620 ] wuyi commented on SPARK-36196: -- Hi [~arghya18] Did you try to apply the fix of 

[jira] [Commented] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-24 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403556#comment-17403556 ] wuyi commented on SPARK-36558: -- Discussed with [~vsowrirajan] offline. The issue won't happen in the

[jira] [Resolved] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-24 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36558. -- Resolution: Won't Fix > Stage has all tasks finished but with ongoing finalization can cause job hang >

[jira] [Updated] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-23 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36558: - Description:   For a stage that all tasks are finished but with ongoing finalization can lead to job hang.

[jira] [Updated] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-23 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36558: - Description:   For a stage that all tasks are finished but with ongoing finalization can lead to job hang.

[jira] [Updated] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-23 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36558: - Description:   For a stage that all tasks are finished but with ongoing finalization can lead to job hang.

[jira] [Commented] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-23 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403456#comment-17403456 ] wuyi commented on SPARK-36558: -- [~vsowrirajan] Sorry, I missed one tweaked change in `MyRDD`. We should

[jira] [Updated] (SPARK-36564) LiveRDDDistribution.toApi throws NullPointerException

2021-08-23 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36564: - Summary: LiveRDDDistribution.toApi throws NullPointerException (was: LiveRDD.doUpdate throws

[jira] [Created] (SPARK-36564) LiveRDD.doUpdate throws NullPointerException

2021-08-23 Thread wuyi (Jira)
wuyi created SPARK-36564: Summary: LiveRDD.doUpdate throws NullPointerException Key: SPARK-36564 URL: https://issues.apache.org/jira/browse/SPARK-36564 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36558: - Description: A stage whose tasks have all finished but whose finalization is still ongoing can cause the job to hang.

[jira] [Updated] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-36558: - Description: A stage whose tasks have all finished but whose finalization is still ongoing can cause the job to hang.

[jira] [Commented] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-22 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402942#comment-17402942 ] wuyi commented on SPARK-36558: -- cc [~mridul] [~mshen] [~csingh] > Stage has all tasks finished but with

[jira] [Created] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-22 Thread wuyi (Jira)
wuyi created SPARK-36558: Summary: Stage has all tasks finished but with ongoing finalization can cause job hang Key: SPARK-36558 URL: https://issues.apache.org/jira/browse/SPARK-36558 Project: Spark

[jira] [Created] (SPARK-36543) Decommission logs too frequent when waiting migration to finish

2021-08-18 Thread wuyi (Jira)
wuyi created SPARK-36543: Summary: Decommission logs too frequent when waiting migration to finish Key: SPARK-36543 URL: https://issues.apache.org/jira/browse/SPARK-36543 Project: Spark Issue Type:

[jira] [Created] (SPARK-36532) Deadlock in CoarseGrainedExecutorBackend.onDisconnected

2021-08-17 Thread wuyi (Jira)
wuyi created SPARK-36532: Summary: Deadlock in CoarseGrainedExecutorBackend.onDisconnected Key: SPARK-36532 URL: https://issues.apache.org/jira/browse/SPARK-36532 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36530. -- Resolution: Won't Fix > Avoid finalizing when there's no push at all in a shuffle >

[jira] [Commented] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400118#comment-17400118 ] wuyi commented on SPARK-36530: -- SGTM. Could you update SPARK-33701 to include this part? Then, I'll close

[jira] [Commented] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400089#comment-17400089 ] wuyi commented on SPARK-36530: -- cc [~mridulm80] [~mshen] any thoughts? > Avoid finalizing when there's no

[jira] [Created] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread wuyi (Jira)
wuyi created SPARK-36530: Summary: Avoid finalizing when there's no push at all in a shuffle Key: SPARK-36530 URL: https://issues.apache.org/jira/browse/SPARK-36530 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-11 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36378. -- Fix Version/s: 3.3.0 3.2.0 Assignee: Min Shen Resolution: Fixed Issue

[jira] [Resolved] (SPARK-36332) Cleanup RemoteBlockPushResolver log messages

2021-08-09 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-36332. -- Fix Version/s: 3.3.0 3.2.0 Assignee: Venkata krishnan Sowrirajan
