[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087235#comment-14087235 ]
Carlos Fuertes commented on SPARK-2016:
---------------------------------------

I have run some very simple benchmarks against the current master, and the UI is still unresponsive with big tables (a high number of blocks), even after the change in SPARK-2316. However, if you switch to a solution where the table data is served as JSON and the HTML table is built with Javascript, the UI remains responsive.

Here is a rough benchmark, run on an old MacBook laptop in local mode with Chrome rendering the UI. I gathered the stats using the dev tools included in Chrome:

> sc.parallelize(1 to 1000000, 50000).count()

The time to load '/storage/rdd/?id=0' is:

- Current master takes ~11 secs, but once the page finishes loading it is completely unusable, since scrolling up or down takes forever. The size of the page is 14.4MB.
- With the modified CSS style, the page loads a couple of seconds faster but remains unresponsive after it loads. That corresponds to running my pull request with "spark.ui.jsRenderingEnabled false".
- With the JSON solution, the page without the blocks table appears instantly, while the blocks table takes ~15 secs to load. After that, however, the page is totally responsive.

From my limited tests, I would say that using Javascript with JSON to render the page is a win, since the page remains responsive and usable after loading big tables.

> rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-2016
>                 URL: https://issues.apache.org/jira/browse/SPARK-2016
>             Project: Spark
>          Issue Type: Sub-task
>            Reporter: Reynold Xin
>              Labels: starter
>
> Try run
> {code}
> sc.parallelize(1 to 100, 1000000).cache().count()
> {code}
> And open the storage UI for this RDD. It takes forever to load the page.
> When the number of partitions is very large, I think there are a few alternatives:
> 0. Only show the top 1000.
> 1. Pagination
> 2. Instead of grouping by RDD blocks, group by executors
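
For illustration only, here is a minimal TypeScript sketch of how the JSON-based rendering from the comment could be combined with the pagination alternative from the issue description. It is not the code from the pull request: the `BlockRow` fields, the function names `pageOf` and `renderBlocksTable`, and the example payload are all hypothetical, since the thread does not show the actual JSON format served by the UI.

```typescript
// Hypothetical shape of one row in the JSON payload. The real field names
// used by the Spark UI are not given in this thread.
interface BlockRow {
  blockName: string;
  storageLevel: string;
  sizeInMemory: number; // bytes
  executor: string;
}

// Escape text before placing it into HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Return the rows that belong to one page (page numbers start at 0).
function pageOf<T>(allRows: T[], page: number, pageSize: number): T[] {
  const start = page * pageSize;
  return allRows.slice(start, start + pageSize);
}

// Build the table markup as one string, so the caller can hand it to the
// browser in a single assignment instead of appending row by row.
function renderBlocksTable(pageRows: BlockRow[]): string {
  const header =
    "<thead><tr><th>Block Name</th><th>Storage Level</th>" +
    "<th>Size in Memory</th><th>Executor</th></tr></thead>";
  const body = pageRows
    .map((r) => {
      const cells = [
        r.blockName,
        r.storageLevel,
        String(r.sizeInMemory),
        r.executor,
      ];
      return "<tr>" + cells.map((c) => `<td>${escapeHtml(c)}</td>`).join("") + "</tr>";
    })
    .join("");
  return `<table>${header}<tbody>${body}</tbody></table>`;
}

// Made-up payload standing in for what a JSON endpoint might return.
const examplePayload =
  '[{"blockName":"rdd_0_0","storageLevel":"Memory","sizeInMemory":16,"executor":"localhost"},' +
  '{"blockName":"rdd_0_1","storageLevel":"Memory","sizeInMemory":16,"executor":"localhost"}]';
const exampleRows = JSON.parse(examplePayload) as BlockRow[];

// Render only the first page of at most 1000 rows.
const firstPageHtml: string = renderBlocksTable(pageOf(exampleRows, 0, 1000));
```

In a browser, the resulting string would typically be assigned to a container element's `innerHTML`. Capping each page at a fixed number of rows keeps the size of the rendered table bounded regardless of how many blocks the RDD has.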