[ https://issues.apache.org/jira/browse/SPARK-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kostas Sakellis updated SPARK-5225:
-----------------------------------
    Description:
Currently, if a task reads data from more than one block and the blocks come from different read methods, we ignore the bytes read by the second read method. For example:
{noformat}
      CoalescedRDD
           |
         Task1
       /   |   \
  hadoop hadoop cached
{noformat}
If Task1 starts reading from the hadoop blocks first, then the input metrics for Task1 will contain only the input metrics from the hadoop blocks and ignore the input metrics from the cached block. We need to change the way we collect input metrics so that a task reports a collection of input metrics rather than a single value.

  was:
Currently, if a task reads data from more than one block and the blocks come from different read methods, we ignore the bytes read by the second read method. For example:
      CoalescedRDD
           |
         Task1
       /   |   \
      /    |    \
  hadoop hadoop cached
If Task1 starts reading from the hadoop blocks first, then the input metrics for Task1 will contain only the input metrics from the hadoop blocks and ignore the input metrics from the cached block. We need to change the way we collect input metrics so that a task reports a collection of input metrics rather than a single value.

> Support coalesced Input Metrics from different sources
> -----------------------------------------------------
>
>                 Key: SPARK-5225
>                 URL: https://issues.apache.org/jira/browse/SPARK-5225
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Kostas Sakellis
>
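To make the difference concrete, here is a minimal sketch of the two bookkeeping strategies described above. It is plain Python, not Spark code: `ReadMethod`, `SingleValueInputMetrics`, and `PerMethodInputMetrics` are hypothetical names invented for illustration, and the block sizes are made up. The first class models the reported behavior as the description states it (only the first read method is counted); the second keeps one running total per read method, which is one possible reading of "a collection of input metrics".

```python
from collections import defaultdict
from enum import Enum


class ReadMethod(Enum):
    # Hypothetical labels for where a block was read from.
    HADOOP = "hadoop"
    CACHED = "cached"


class SingleValueInputMetrics:
    """Models the reported behavior: only the first read method is kept."""

    def __init__(self):
        self.read_method = None
        self.bytes_read = 0

    def record(self, method, nbytes):
        if self.read_method is None:
            self.read_method = method
        if method is self.read_method:
            self.bytes_read += nbytes
        # Bytes from any other read method are silently dropped.


class PerMethodInputMetrics:
    """Sketch of the proposal: one running total per read method."""

    def __init__(self):
        self.bytes_by_method = defaultdict(int)

    def record(self, method, nbytes):
        self.bytes_by_method[method] += nbytes

    def total_bytes(self):
        return sum(self.bytes_by_method.values())


# Task1 reads two hadoop blocks and then one cached block (sizes are made up).
reads = [
    (ReadMethod.HADOOP, 100),
    (ReadMethod.HADOOP, 50),
    (ReadMethod.CACHED, 30),
]

single = SingleValueInputMetrics()
per_method = PerMethodInputMetrics()
for method, nbytes in reads:
    single.record(method, nbytes)
    per_method.record(method, nbytes)

print("single value:", single.read_method.value, single.bytes_read)
print("per method:", {m.value: b for m, b in per_method.bytes_by_method.items()})
```

With these inputs the single-value holder reports only the hadoop bytes, while the per-method holder also retains the bytes read from the cached block. Whether the real fix should key on read method or keep a list of separate metrics objects is a design choice this sketch does not settle.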