[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642441#comment-14642441 ]

Apache Spark commented on SPARK-2016:
-------------------------------------

User 'carsonwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/7692

> rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-2016
>                 URL: https://issues.apache.org/jira/browse/SPARK-2016
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>            Reporter: Reynold Xin
>              Labels: starter
>
> Try run
> {code}
> sc.parallelize(1 to 100, 100).cache().count()
> {code}
> And open the storage UI for this RDD. It takes forever to load the page.
> When the number of partitions is very large, I think there are a few alternatives:
> 0. Only show the top 1000.
> 1. Pagination
> 2. Instead of grouping by RDD blocks, group by executors

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
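For illustration only, the three alternatives listed in the issue description can be sketched as pure functions over a list of block rows. This is not code from Spark or from any of the linked pull requests: `BlockRow`, `StoragePageSketch`, and every field and method name below are hypothetical, and ranking "top" blocks by memory size is an assumption, since the issue does not say how they would be ranked.

```scala
// Hypothetical row type; the storage page's real data model is not shown in this thread.
final case class BlockRow(blockId: String, executor: String, memBytes: Long)

object StoragePageSketch {
  /** Alternative 0: keep only the `n` largest blocks (ranking by size is an assumption). */
  def topN(rows: Seq[BlockRow], n: Int): Seq[BlockRow] =
    rows.sortBy(r => -r.memBytes).take(n)

  /** Alternative 1: return a single page of rows; `pageNo` is 1-based. */
  def page(rows: Seq[BlockRow], pageNo: Int, pageSize: Int): Seq[BlockRow] = {
    require(pageNo >= 1 && pageSize >= 1, "pageNo and pageSize must be positive")
    // slice clamps to the sequence bounds, so a page past the end is simply empty.
    rows.slice((pageNo - 1) * pageSize, pageNo * pageSize)
  }

  /** Alternative 2: one summary per executor, as (block count, total bytes). */
  def byExecutor(rows: Seq[BlockRow]): Map[String, (Int, Long)] =
    rows.groupBy(_.executor).map { case (exec, rs) =>
      (exec, (rs.size, rs.map(_.memBytes).sum))
    }
}
```

In each case the page would emit a bounded number of table rows regardless of how many partitions the RDD has, which is the property all three alternatives share.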
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243153#comment-14243153 ]

Andrew Or commented on SPARK-2016:
----------------------------------

This was filed before SPARK-2316 (https://github.com/apache/spark/pull/1679) was fixed. At least on the backend side, this should be much quicker than before. I don't know if we need to do some CSS magic to make the frontend side blazing fast too. Is this still reproducible?
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243147#comment-14243147 ]

Reynold Xin commented on SPARK-2016:
------------------------------------

cc [~andrewor14] can you comment on this?
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243109#comment-14243109 ]

Sean Owen commented on SPARK-2016:
----------------------------------

Are this and SPARK-2017 now subsumed by SPARK-3644? The PRs for this and SPARK-2017 are closed, and the discussion suggested it was to be continued in SPARK-3644.
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162405#comment-14162405 ]

Apache Spark commented on SPARK-2016:
-------------------------------------

User 'carlosfuertes' has created a pull request for this issue:
https://github.com/apache/spark/pull/1682
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087235#comment-14087235 ]

Carlos Fuertes commented on SPARK-2016:
---------------------------------------

I have done some very simple benchmarks against the current master, and the UI is still unresponsive with big tables (a high number of blocks) even after the change in SPARK-2316. However, if you switch to a solution where the table data is served as JSON and the HTML table is built with JavaScript, the UI remains responsive.

Here is a rough benchmark, run on an old MacBook laptop in local mode with Chrome rendering the UI. I gathered the stats using the dev tools included in Chrome:

> sc.parallelize(1 to 100, 5).count()

The time to load '/storage/rdd/?id=0' is:

- Current master takes ~11 secs, but once the page finishes loading it is completely unusable, since it takes forever to scroll up or down. The size of the page is 14.4MB.
- With the modified CSS style, the page loads a couple of seconds faster but remains unresponsive after it loads. That corresponds to running my pull request with "spark.ui.jsRenderingEnabled false".
- With the JSON solution, the page appears instantly without the blocks table, while the blocks table takes ~15 secs to load. After that, however, the page is totally responsive.

From my limited tests I would say that using JavaScript with JSON to render the page is a win, since the page remains responsive and usable after loading big tables.
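As a rough illustration of the JSON approach discussed in this comment, the server side could emit the block rows as a JSON array that a script on the page then turns into table rows. The sketch below is not the code from pull request 1682: `BlockInfo`, `BlocksJsonSketch`, and the JSON field names are hypothetical, and the hand-rolled escaping only stands in for whatever JSON library an actual implementation would use.

```scala
// Hypothetical block row; field names are illustrative only.
final case class BlockInfo(blockId: String, storageLevel: String, memBytes: Long, executor: String)

object BlocksJsonSketch {
  /** Escape quotes, backslashes and control characters for use inside a JSON string. */
  private def esc(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'          => sb.append("\\\"")
      case '\\'         => sb.append("\\\\")
      case c if c < ' ' => sb.append("\\" + "u%04x".format(c.toInt))
      case c            => sb.append(c)
    }
    sb.toString
  }

  /** Serialize one block as a JSON object. */
  def toJson(b: BlockInfo): String =
    s"""{"blockId":"${esc(b.blockId)}","storageLevel":"${esc(b.storageLevel)}","memBytes":${b.memBytes},"executor":"${esc(b.executor)}"}"""

  /** Serialize all blocks as a JSON array, the payload a page script would fetch. */
  def toJsonArray(blocks: Seq[BlockInfo]): String =
    blocks.map(toJson).mkString("[", ",", "]")
}
```

The page itself would then ship without the blocks table and fill it in once the array arrives, which matches the "page appears instantly, table loads later" behaviour described in the benchmark.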
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084399#comment-14084399 ]

Patrick Wendell commented on SPARK-2016:
----------------------------------------

I think a major part of this was on the computation side as well: there were huge inefficiencies on this page. They were fixed in SPARK-2316. It would be interesting to see whether the test you give here is still unresponsive in that case (to delineate whether this was a UI issue or not).
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083263#comment-14083263 ]

Carlos Fuertes commented on SPARK-2016:
---------------------------------------

The real problem with the unresponsiveness of the browser is the CSS style used for the tables. In that same pull request I have updated the style of the table, and that solves the issue without having to modify how the data is currently presented (although we could still add any of the other suggested options). See SPARK-2017.
[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080542#comment-14080542 ]

Carlos Fuertes commented on SPARK-2016:
---------------------------------------

I have created a pull request, https://github.com/apache/spark/pull/1682, that deals with this issue. The idea follows the discussion of issue SPARK-2017, where the data for the tables is served as JSON and later rendered with JavaScript. See https://issues.apache.org/jira/browse/SPARK-2017 for all the discussion.