Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers
> The reason WebWorkers don't have access to the DOM is concurrency. For example, to loop through a list of children I first need to read the number of children, then run a for loop which starts at 0 and ends at length-1. If two threads can access the DOM concurrently, then one could change the number of children while the other is looping through the list, which would cause bugs in the program. The only way to fix this is to make the DOM a monitor or to introduce semaphores, but then you would have to change the way the DOM is accessed in HTML5, breaking backwards compatibility, which is not a good idea.
>
> A better solution to your problem is to load fragments of the entire document using AJAX and then insert those fragments into the main document when they are needed. You rarely need to see the entire document at once anyway.
>
> Marius Gundersen

>> One good way I have found would be to cut the whole page into several parts (on the server side, which is already done for the multi-page version) and to launch several workers. Each worker gets one part of the whole page in the background and could give it to the browsing context, which will append the right part at the right place.
>
> As others have noted, the slowness turns out to not be parsing, but to be a bunch of scripts that are doing various things such as adding the sidebar annotations, setting up the dfn cross-references, and generating the short table of contents.
>
> Plus, since browsers don't have thread-safe DOM implementations, we actually can't expose the DOM in workers. Maybe one day. :-)
>
> -- Ian Hickson

I'm sorry for the misunderstanding. I shouldn't have said "the DOM API". To be as accurate as I can: I want to provide the DOMImplementation interface (http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-102161490) to workers. As I am going to explain, the point is to be able to create a document and then a documentFragment. I will explain my point through another use case.
(Sorry for the confusion with the HTML5 one-page version.)

Let us imagine that I want to build a single page from several non-HTML sources of information. They can be in different formats (RSS, data obtained from XML-RPC requests, any other kind of XML file, JSON...). Suppose that each source is a different JSON file with a different structure (different properties, different nesting). Each source needs its own treatment.

As I said in my first e-mail, there are three main steps before the fully loaded page can be displayed. For each source of content, we have to:
(1) get the content
(2) transform it into a DOM tree (as a documentFragment, or as a string that represents an HTML fragment, for example)
(3) append this to the main document at the right place (which triggers graphical rendering)

This last step is either an appendChild or an .innerHTML= and must be done in the main browsing context; there is no choice.

Let us imagine that I want one worker per source. For the moment, WebWorkers can do step (1) independently (thanks to XMLHttpRequest). When each worker receives its JSON string, this string must be transformed into an HTML DOM tree (2) (let's say a table, for example). Because none of the DOM Core API is currently available to WebWorkers, we have two ways to turn the JSON string received in (1) into an HTML DOM tree:

(2.1) Send the JSON string (or the resulting object, whatever) to the main document, which will create a documentFragment, walk through the JSON object and append the table, tbody, trs, tds and their contents to this fragment, for all the sources.

(2.2) Each worker creates a string which looks like <table id="blabla"><tbody><tr class="blibli"><td>1</td><td>2</td></tr><tr class="blibli"><td>3</td><td>37</td></tr></tbody></table> using += while walking through the JSON object. It then sends the string through postMessage() and the main browsing context can do rightPlace.innerHTML = e.data (where e.data is the string).
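Option (2.2) above can be sketched roughly as follows. This is a minimal illustration, not code from this thread: the helper names escapeHtml and rowsToTableHtml are hypothetical, and the input is assumed to be an array of rows, each an array of cell values. Escaping is included because plain += concatenation otherwise breaks as soon as the data contains characters such as < or &.

```javascript
// Hypothetical helpers (names are assumptions, not from the thread).
// This is pure string work and needs no DOM, so it could run in a worker.

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn an array of rows (each an array of cell values) into table markup.
function rowsToTableHtml(id, rowClass, rows) {
  let html = '<table id="' + escapeHtml(id) + '"><tbody>';
  for (const row of rows) {
    html += '<tr class="' + escapeHtml(rowClass) + '">';
    for (const cell of row) {
      html += "<td>" + escapeHtml(cell) + "</td>";
    }
    html += "</tr>";
  }
  html += "</tbody></table>";
  return html;
}

// In a worker the result would be sent with postMessage(tableHtml); the page
// would then assign e.data to rightPlace.innerHTML in its message handler.
const tableHtml = rowsToTableHtml("blabla", "blibli", [[1, 2], [3, 37]]);
```

Even in this small form, every opening tag has to be matched by hand with its closing tag, which is the weakness of this approach.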
(2.1) We have the document/documentFragment/Element/Node abstraction, but we lose all the parallelism, because the browsing context handles all the sources of information (creating a documentFragment and doing all the appending for each source).

(2.2) We have the parallelism, because each worker handles one source. However, we lose the DOM abstraction. I hope I have made the string ridiculously long enough to convince you that it is not a good solution. For complicated examples, in my experience, using += and .innerHTML is always a source of errors, especially because of closing tags. These problems don't occur when developing with the DOM abstraction.

My proposal is:

(2.3) Assuming that we have access to the DOMImplementation interface, we can create an object implementing the document interface which is DIFFERENT from the main document object, and I insist on this point. I am NOT proposing to provide access to the main document (the one which created the workers). Thanks to this document, we can create a
Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers
On Thu, 12 Nov 2009, David Bruant wrote:
> I was waiting for Firefox to stop freezing on the HTML5 spec page (it freezes for about one minute each time I visit the one-page version) and I tried to think of a way to design this page so that it wouldn't freeze my browser.

The easiest way is to disable the scripts, which you can do by appending ?slow-browser to the page's URL, as in:

http://www.whatwg.org/specs/web-apps/current-work/?slow-browser

> One good way I have found would be to cut the whole page into several parts (on the server side, which is already done for the multi-page version) and to launch several workers. Each worker gets one part of the whole page in the background and could give it to the browsing context, which will append the right part at the right place.

As others have noted, the slowness turns out to not be parsing, but to be a bunch of scripts that are doing various things such as adding the sidebar annotations, setting up the dfn cross-references, and generating the short table of contents.

Plus, since browsers don't have thread-safe DOM implementations, we actually can't expose the DOM in workers. Maybe one day. :-)

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
[whatwg] [WebWorkers] Advocation to provide the DOM API to the workers
Hi,

I was waiting for Firefox to stop freezing on the HTML5 spec page (it freezes for about one minute each time I visit the one-page version) and I tried to think of a way to design this page so that it wouldn't freeze my browser.

One good way I have found would be to cut the whole page into several parts (on the server side, which is already done for the multi-page version) and to launch several workers. Each worker gets one part of the whole page in the background and could give it to the browsing context, which will append the right part at the right place.

But what does "give" mean here? Without the DOM API, "give" means sending a string through the postMessage() method and "append" means rightPlace.innerHTML = stringContainingAPartOfThePage. However, with the DOM API, each worker could independently build its documentFragment and send it to the browsing context, which would append it (appendChild) at the right place.

Building the page requires 3 main operations:
- getting the content (can be parallelized with workers, which can make XMLHttpRequests)
- building a DOM tree from the content
- rendering (cannot be parallelized because it must occur in the browsing context)

Without the DOM API, the second step cannot be parallelized in WebWorkers.

I understand that the whole DOM API is not useful for WebWorkers, but could a reduced API (sufficient to describe the tree structure of a document) be made available to workers, so that building the tree from the content can be parallelized (and its cost thereby reduced)?

Thanks,

David
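One way to picture the "reduced API" asked for above is a plain-data description of the tree that a worker builds and posts to the page. The sketch below is only an illustration under assumptions: the el and toHtml names and the { tag, attrs, children } node shape are hypothetical and are not part of any Worker or DOM specification. Serializing to markup is used here only to show that the description carries enough information to rebuild the tree; a page could equally walk the description and call createElement/appendChild.

```javascript
// Hypothetical node shape: { tag, attrs, children }, where each child is
// either another node or a string. Plain objects like this can be sent
// through postMessage(), unlike real DOM nodes.
function el(tag, attrs = {}, children = []) {
  return { tag, attrs, children };
}

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Serialize a description to markup, closing every tag automatically.
function toHtml(node) {
  if (typeof node === "string") return escapeHtml(node);
  const attrs = Object.entries(node.attrs)
    .map(([name, value]) => " " + name + '="' + escapeHtml(value) + '"')
    .join("");
  const inner = node.children.map((child) => toHtml(child)).join("");
  return "<" + node.tag + attrs + ">" + inner + "</" + node.tag + ">";
}

// A worker would build this description from the content it fetched.
const part = el("section", { id: "part-1" }, [
  el("h2", {}, ["Workers"]),
  el("p", {}, ["Built off the main thread & sent as data."]),
]);
const markup = toHtml(part);
```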
Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers
On 11/12/09 9:21 PM, David Bruant wrote:
> I was waiting for Firefox to stop freezing on the HTML5 spec page (it freezes for about one minute each time I visit the one-page version) and I tried to think of a way to design this page so that it wouldn't freeze my browser.

Two easy ways to do this:

1) Take out the script at the end of the page that goes and messes with the DOM.
2) Fix the O(N^2) algorithm in the web browser that this script happens to trigger (https://bugzilla.mozilla.org/show_bug.cgi?id=526394; should be checked in pretty soon unless something goes drastically wrong).

> One good way I have found would be to cut the whole page into several parts (on the server side, which is already done for the multi-page version) and to launch several workers. Each worker gets one part of the whole page in the background and could give it to the browsing context, which will append the right part at the right place.

I'm not sure what you mean, exactly... what would the worker give, exactly?

> But what does "give" mean here? Without the DOM API, "give" means sending a string through the postMessage() method and "append" means rightPlace.innerHTML = stringContainingAPartOfThePage. However, with the DOM API, each worker could independently build its documentFragment and send it to the browsing context, which would append it (appendChild) at the right place.

The problem here is that of a script making certain DOM mutations after the DOM is completely built and reflecting those mutations into the rendering tree, not of initial DOM construction. That is, even if this proposal were implemented it would not eliminate the hang you're seeing without item 2 above being addressed.

> Building the page requires 3 main operations:
> - getting the content (can be parallelized with workers, which can make XMLHttpRequests)
> - building a DOM tree from the content
> - rendering (cannot be parallelized because it must occur in the browsing context)

And in this case the slowness you seem to be trying to address is in the rendering part.

-Boris
Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers
On Fri, Nov 13, 2009 at 1:46 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 11/12/09 9:21 PM, David Bruant wrote:
>> I was waiting for Firefox to stop freezing on the HTML5 spec page (it freezes for about one minute each time I visit the one-page version) and I tried to think of a way to design this page so that it wouldn't freeze my browser.
>
> Two easy ways to do this:
>
> 1) Take out the script at the end of the page that goes and messes with the DOM.
> 2) Fix the O(N^2) algorithm in the web browser that this script happens to trigger (https://bugzilla.mozilla.org/show_bug.cgi?id=526394; should be checked in pretty soon unless something goes drastically wrong).
>
>> One good way I have found would be to cut the whole page into several parts (on the server side, which is already done for the multi-page version) and to launch several workers. Each worker gets one part of the whole page in the background and could give it to the browsing context, which will append the right part at the right place.
>
> I'm not sure what you mean, exactly... what would the worker give, exactly?
>
>> But what does "give" mean here? Without the DOM API, "give" means sending a string through the postMessage() method and "append" means rightPlace.innerHTML = stringContainingAPartOfThePage. However, with the DOM API, each worker could independently build its documentFragment and send it to the browsing context, which would append it (appendChild) at the right place.
>
> The problem here is that of a script making certain DOM mutations after the DOM is completely built and reflecting those mutations into the rendering tree, not of initial DOM construction. That is, even if this proposal were implemented it would not eliminate the hang you're seeing without item 2 above being addressed.
>
>> Building the page requires 3 main operations:
>> - getting the content (can be parallelized with workers, which can make XMLHttpRequests)
>> - building a DOM tree from the content
>> - rendering (cannot be parallelized because it must occur in the browsing context)
>
> And in this case the slowness you seem to be trying to address is in the rendering part.
>
> -Boris

The reason WebWorkers don't have access to the DOM is concurrency. For example, to loop through a list of children I first need to read the number of children, then run a for loop which starts at 0 and ends at length-1. If two threads can access the DOM concurrently, then one could change the number of children while the other is looping through the list, which would cause bugs in the program. The only way to fix this is to make the DOM a monitor or to introduce semaphores, but then you would have to change the way the DOM is accessed in HTML5, breaking backwards compatibility, which is not a good idea.

A better solution to your problem is to load fragments of the entire document using AJAX and then insert those fragments into the main document when they are needed. You rarely need to see the entire document at once anyway.

Marius Gundersen
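The hazard described above (one thread reading the child count, another changing the list mid-loop) can be illustrated with a small simulation. This is only a sketch under assumptions: an ordinary array stands in for a node's child list, and since script on a page runs on a single thread, the interleaving is made explicit with a generator that pauses after each read. The name readChildren is hypothetical.

```javascript
// Simulation only: a generator pauses after each read so that a second
// "thread" can be interleaved by hand between the steps.
function* readChildren(children) {
  const length = children.length; // step 1: read the number of children once
  const seen = [];
  for (let i = 0; i < length; i++) {
    seen.push(children[i]); // step 2: read child i
    yield; // the other "thread" may run here
  }
  return seen;
}

const children = ["a", "b", "c"]; // stand-in for a node's child list
const reader = readChildren(children);

reader.next(); // reader caches length 3 and reads "a"
children.pop(); // the other "thread" removes the last child
reader.next(); // reader reads "b"
reader.next(); // reader reads index 2, which no longer exists
const seenByReader = reader.next().value; // loop ends, collected reads returned

console.log(seenByReader);
```

The reader still performs three reads because it trusted the length it cached at the start, so its last read falls off the end of the list.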