[ https://issues.apache.org/jira/browse/SPARK-28183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lantao Jin updated SPARK-28183:
-------------------------------
    Description: 
We have a scenario where our application needs to query failed tasks via the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. In a large stage, this may mean filtering out dozens of failed tasks from hundreds of thousands of total tasks, which consumes significant unnecessary memory and time on both the Spark and application sides.

  was:
We have a scenario where our application needs to query failed tasks via the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. A large stage may contain hundreds of thousands of tasks in total. Although the API offers pagination via {{?offset=[offset]&length=[len]}}, this still has two disadvantages:
1. The application still has to query all tasks, consuming significant unnecessary memory and time on both the Spark and application sides.
2. Pagination via {{?offset=[offset]&length=[len]}} makes the logic much more complex, and the application still has to handle all tasks.


> Add a task status filter for taskList in REST API
> -------------------------------------------------
>
>                 Key: SPARK-28183
>                 URL: https://issues.apache.org/jira/browse/SPARK-28183
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Web UI
>    Affects Versions: 3.0.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> We have a scenario where our application needs to query failed tasks via the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. In a large stage, this may mean filtering out dozens of failed tasks from hundreds of thousands of total tasks, which consumes significant unnecessary memory and time on both the Spark and application sides.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
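A minimal sketch of the difference this issue describes. The endpoint path and the pagination parameters come from the issue text; the in-memory "server", the `status` parameter name, and all helper functions below are assumptions for illustration only, not the actual Spark implementation:

```python
from typing import Dict, List, Optional

# Stand-in for the tasks a large stage might hold on the Spark side
# (hypothetical data; one failed task per 10,000).
ALL_TASKS: List[Dict] = [
    {"taskId": i, "status": "FAILED" if i % 10_000 == 0 else "SUCCESS"}
    for i in range(100_000)
]

def task_list(offset: int, length: int, status: Optional[str] = None) -> List[Dict]:
    """Mimics /applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList
    with ?offset=[offset]&length=[len], plus the proposed status filter."""
    tasks = ALL_TASKS
    if status is not None:  # proposed server-side filter
        tasks = [t for t in tasks if t["status"] == status]
    return tasks[offset:offset + length]

def failed_tasks_client_side(page: int = 1000) -> List[Dict]:
    """Today's approach: page through every task and filter on the app side."""
    failed: List[Dict] = []
    offset = 0
    while True:
        batch = task_list(offset, page)
        if not batch:
            break
        failed.extend(t for t in batch if t["status"] == "FAILED")
        offset += page
    return failed

# With the proposed filter, one small request returns just the failures,
# instead of the app transferring and scanning all 100,000 tasks.
failed_via_filter = task_list(0, 1000, status="FAILED")
assert failed_via_filter == failed_tasks_client_side()
```

The point of the sketch is the transfer volume: the client-side loop moves all 100,000 task records over the wire, while the filtered request returns only the handful of failed ones.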