Ryan Williams created SPARK-11156:
-------------------------------------

             Summary: Web UI doesn't count or show info about replicated blocks
                 Key: SPARK-11156
                 URL: https://issues.apache.org/jira/browse/SPARK-11156
             Project: Spark
          Issue Type: Bug
          Components: Web UI
    Affects Versions: 1.5.1
            Reporter: Ryan Williams

When executors receive a replica of a block, they [notify the driver with a {{UpdateBlockInfo}} message|https://github.com/apache/spark/blob/4ee2cea2a43f7d04ab8511d9c029f80c5dadd48e/core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala#L59-L61], which [sends a {{SparkListenerBlockUpdated}} event to SparkListeners|https://github.com/apache/spark/blob/4ee2cea2a43f7d04ab8511d9c029f80c5dadd48e/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala#L67].

However, the web UI (via its {{BlockStatusListener}}) [ignores {{SparkListenerBlockUpdated}} events for non-streaming blocks|https://github.com/apache/spark/blob/4ee2cea2a43f7d04ab8511d9c029f80c5dadd48e/core/src/main/scala/org/apache/spark/storage/BlockStatusListener.scala#L57-L60].

As a result, in non-streaming apps:
* The "Executors" column on the RDD page doesn't show executors housing replicas; it can only show the executor that initially computed (and initiated replication of) the block.
* The executor-memory-usage and related stats displayed throughout the web interface are undercounted, because block replicas are never recorded.

For example, here is the Storage tab for a simple app with 3 identical RDDs cached with replication equal to 1, 2, and 3:

!http://f.cl.ly/items/3m3B2v2k2J23350I3t1c/Screen%20Shot%202015-10-16%20at%2012.30.54%20AM.png!

These were generated with:

{code}
// Assumes a spark-shell session, where `sc` is predefined.
import org.apache.spark.storage.StorageLevel

// StorageLevel(useDisk, useMemory, useOffHeap, deserialized, replication)
val bar1 = sc.parallelize(1 to 100000000, 100).map(_ % 100 -> 1).reduceByKey(_+_, 100).setName("bar1").persist(StorageLevel(false, true, false, true, 1))
bar1.count

val bar2 = sc.parallelize(1 to 100000000, 100).map(_ % 100 -> 1).reduceByKey(_+_, 100).setName("bar2").persist(StorageLevel(false, true, false, true, 2))
bar2.count

val bar3 = sc.parallelize(1 to 100000000, 100).map(_ % 100 -> 1).reduceByKey(_+_, 100).setName("bar3").persist(StorageLevel(false, true, false, true, 3))
bar3.count
{code}

Note that the reported memory usage is identical across the three, despite the different replication factors.

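As a rough illustration of the bookkeeping the UI is missing (a sketch, not a proposed patch), a user-side listener could record every executor that reports a copy of an RDD block. {{ReplicaTrackingListener}} and its two query methods are hypothetical names, not part of Spark, and the sketch assumes that replica updates reach user-registered listeners as {{SparkListenerBlockUpdated}} events in the way described above:

```scala
import scala.collection.mutable

import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockUpdated}
import org.apache.spark.storage.RDDBlockId

// Hypothetical helper, not part of Spark: records every executor that
// reports holding a copy of an RDD block, replicas included.
class ReplicaTrackingListener extends SparkListener {

  // (block, executor id) -> bytes of memory that executor reports for the block
  private val memByBlockAndExecutor =
    mutable.Map.empty[(RDDBlockId, String), Long]

  override def onBlockUpdated(event: SparkListenerBlockUpdated): Unit = synchronized {
    val info = event.blockUpdatedInfo
    info.blockId match {
      case rddBlock: RDDBlockId =>
        val key = (rddBlock, info.blockManagerId.executorId)
        if (info.storageLevel.isValid) {
          // Assumption: a valid storage level means this executor holds a copy.
          memByBlockAndExecutor(key) = info.memSize
        } else {
          // Assumption: an invalid level means the copy was dropped.
          memByBlockAndExecutor -= key
        }
      case _ => // non-RDD blocks are ignored in this sketch
    }
  }

  // Executors holding each RDD block, counting replicas.
  def executorsByBlock: Map[RDDBlockId, Set[String]] = synchronized {
    memByBlockAndExecutor.keys
      .groupBy(_._1)
      .map { case (block, keys) => block -> keys.map(_._2).toSet }
  }

  // Total reported memory per RDD id, summed over every copy of every block.
  def memoryByRdd: Map[Int, Long] = synchronized {
    memByBlockAndExecutor.toSeq
      .groupBy(_._1._1.rddId)
      .map { case (rddId, entries) => rddId -> entries.map(_._2).sum }
  }
}
```

Registering an instance with {{sc.addSparkListener}} before running the snippets above should, under that assumption, report more than one executor per block for bar2 and bar3. Listener events are delivered asynchronously, so the maps may briefly lag behind a finished job.
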
Here is the RDD page for the 3x-replicated RDD above:

!http://f.cl.ly/items/0t0H1o2S2g140s1A0X0k/Screen%20Shot%202015-10-16%20at%2012.31.24%20AM.png!

Note that only one executor is listed for each partition.