[jira] [Assigned] (SPARK-32921) Extend MapOutputTracker to support tracking and serving the metadata about each merged shuffle partitions for a given shuffle in push-based shuffle scenario
[ https://issues.apache.org/jira/browse/SPARK-32921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mridul Muralidharan reassigned SPARK-32921:
-------------------------------------------

    Assignee: Venkata krishnan Sowrirajan

> Extend MapOutputTracker to support tracking and serving the metadata about
> each merged shuffle partitions for a given shuffle in push-based shuffle
> scenario
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-32921
>                 URL: https://issues.apache.org/jira/browse/SPARK-32921
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Shuffle, Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Min Shen
>            Assignee: Venkata krishnan Sowrirajan
>            Priority: Major
>
> Similar to MapStatus, which tracks the metadata about each map task's shuffle
> output, we also need to track the metadata about each merged shuffle
> partition with push-based shuffle. We currently term this as MergeStatus.
> Since MergeStatus tracks metadata from the perspective of reducer tasks, it's
> not efficient to break up the metadata tracked in a MergeStatus and spread it
> across multiple MapStatus.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
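The distinction the issue description draws between MapStatus and MergeStatus can be illustrated with a small sketch. This is not Spark's implementation: the field names, the `ToyOutputTracker` class, the `plan_fetch` method, and the string locations are all hypothetical, chosen only to show why metadata keyed by reduce partition is kept separate from metadata keyed by map task. The sketch assumes a reducer first reads the merged partition and then fetches only the map outputs that were not merged into it.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class MapStatus:
    """Map-side view: one record per map task (hypothetical fields)."""
    location: str                      # where this map task's output lives
    sizes_by_reduce: Tuple[int, ...]   # bytes written for each reduce partition


@dataclass(frozen=True)
class MergeStatus:
    """Reduce-side view: one record per merged shuffle partition (hypothetical fields)."""
    location: str                      # where the merged partition file lives
    merged_map_ids: FrozenSet[int]     # map tasks whose blocks were merged in
    total_size: int                    # size of the merged partition in bytes


class ToyOutputTracker:
    """Illustrative tracker holding both kinds of metadata per shuffle."""

    def __init__(self) -> None:
        # shuffle_id -> map_id -> MapStatus
        self._map_statuses: Dict[int, Dict[int, MapStatus]] = {}
        # shuffle_id -> reduce_id -> MergeStatus
        self._merge_statuses: Dict[int, Dict[int, MergeStatus]] = {}

    def register_map_output(self, shuffle_id: int, map_id: int, status: MapStatus) -> None:
        self._map_statuses.setdefault(shuffle_id, {})[map_id] = status

    def register_merge_result(self, shuffle_id: int, reduce_id: int, status: MergeStatus) -> None:
        self._merge_statuses.setdefault(shuffle_id, {})[reduce_id] = status

    def plan_fetch(
        self, shuffle_id: int, reduce_id: int
    ) -> Tuple[Optional[MergeStatus], List[Tuple[int, str, int]]]:
        """Return the merged partition (if any) plus the unmerged, non-empty
        map blocks as (map_id, location, size) that must still be fetched."""
        maps = self._map_statuses.get(shuffle_id, {})
        merged = self._merge_statuses.get(shuffle_id, {}).get(reduce_id)
        covered = merged.merged_map_ids if merged is not None else frozenset()
        remaining = [
            (map_id, status.location, status.sizes_by_reduce[reduce_id])
            for map_id, status in sorted(maps.items())
            if map_id not in covered and status.sizes_by_reduce[reduce_id] > 0
        ]
        return merged, remaining


# Usage: three map tasks, two reduce partitions, one merged partition.
tracker = ToyOutputTracker()
tracker.register_map_output(0, 0, MapStatus("exec-1", (10, 20)))
tracker.register_map_output(0, 1, MapStatus("exec-2", (30, 0)))
tracker.register_map_output(0, 2, MapStatus("exec-3", (5, 7)))
tracker.register_merge_result(0, 0, MergeStatus("merger-1", frozenset({0, 1}), 40))

merged_0, remaining_0 = tracker.plan_fetch(0, 0)
merged_1, remaining_1 = tracker.plan_fetch(0, 1)
```

In this toy setup a single lookup by reduce partition answers the reducer's question. Storing the same information inside each MapStatus would instead require scanning every map task's record to reassemble it, which is the inefficiency the description points to.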
[jira] [Assigned] (SPARK-32921) Extend MapOutputTracker to support tracking and serving the metadata about each merged shuffle partitions for a given shuffle in push-based shuffle scenario
[ https://issues.apache.org/jira/browse/SPARK-32921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-32921:
------------------------------------

    Assignee: Apache Spark
[jira] [Assigned] (SPARK-32921) Extend MapOutputTracker to support tracking and serving the metadata about each merged shuffle partitions for a given shuffle in push-based shuffle scenario
[ https://issues.apache.org/jira/browse/SPARK-32921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-32921:
------------------------------------

    Assignee:     (was: Apache Spark)