Github user squito commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-74293360

Hi @CodingCat, thanks for working on this, and sorry the review has been dragging out. This will be a nice addition, but I have some concerns similar to the other reviewers' -- let me explain in a little more detail.

I don't necessarily see anything wrong with your approach for getting the Broadcast block info by itself, but it seems like it ought to share a code path with the tracking of RDD blocks. RDD blocks get updated in the `SparkListenerTaskEnd` event, through `event.taskMetrics.updatedBlocks`. This makes its way to the current `StorageTab` in [`StorageListener`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala#L59), which then [filters down to the RDD blocks](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala#L50) via `BlockId.asRDDId`.

So this makes me wonder about two possible alternatives for unifying the code paths:

1) Can you get the Broadcast block ids into the same `SparkListenerTaskEnd` event, which would eliminate the need to create a new event? (In fact, maybe those block ids are already there? I am still looking into the code for how those events get created ...)

2) If we can't get Broadcast block ids into the same `SparkListenerTaskEnd`, does it make sense to change the RDD block tracking logic to do the same thing as your patch? Yes, this would be a bigger change, but it seems strange to have these two kinds of blocks tracked by such different mechanisms, so unifying them should lead to simpler code overall. (But really that pushes me back to option #1.)
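For context, the filtering step in `StorageListener` boils down to a pattern like the following. This is a simplified, self-contained sketch: the case-class names mirror Spark's `BlockId` hierarchy, but the definitions here are stand-ins, not the actual Spark classes.

```scala
// Simplified stand-ins for Spark's BlockId hierarchy
// (org.apache.spark.storage.BlockId and friends) -- not the real classes.
sealed trait BlockId
case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId
case class BroadcastBlockId(broadcastId: Long) extends BlockId

// Mirrors the role of BlockId.asRDDId: Some(...) only for RDD blocks.
def asRDDId(id: BlockId): Option[RDDBlockId] = id match {
  case r: RDDBlockId => Some(r)
  case _             => None
}

// Hypothetical contents of event.taskMetrics.updatedBlocks for one task.
val updatedBlocks: Seq[BlockId] =
  Seq(RDDBlockId(0, 1), BroadcastBlockId(2L), RDDBlockId(0, 2))

// StorageListener keeps only the RDD blocks; a unified listener could
// branch here to also track the BroadcastBlockId entries instead of
// consuming a separate event.
val rddBlocks = updatedBlocks.flatMap(asRDDId)
```

The point of the sketch: if the Broadcast block ids arrived through the same `updatedBlocks` channel, tracking them would be one extra branch in this filter rather than a whole new event type.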