Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers

2009-12-07 Thread David Bruant
 The reason WebWorkers don't have access to the DOM is concurrency. For
 example, to loop through a list of children I need to first read the
 number of children, then have a for loop which starts at 0 and ends
 at length-1. If you have two threads that can access the DOM
 concurrently, then one could change the number of children while the
 other was looping through the list, which would cause bugs in the
 program. The only way to fix this is to make the DOM a monitor or
 introduce semaphores, but then you would have to change the way the
 DOM is accessed in HTML5, breaking backwards compatibility, which is
 not a good idea.

 A better solution to your problem is to load fragments of the entire
 document using AJAX and then insert those fragments into the main
 document, when they are needed. You rarely need to see the entire
 document at once anyways.

 Marius Gundersen
 One good way I have found would be to cut the whole page into several 
 parts (on the server side, which is already done in the multi-page
 version) and to launch several workers. Each worker gets one part of the 
 whole page in the background and could give it to the browsing context 
 which will append the right part at the right place.
 

 As others have noted, the slowness turns out to not be parsing, but to be 
 a bunch of scripts that are doing various things such as adding the 
 sidebar annotations, setting up the dfn cross-references, and generating 
 the short table of contents.

 Plus, since browsers don't have thread-safe DOM implementations, we 
 actually can't expose the DOM in workers. Maybe one day. :-)
   
 -- Ian Hickson
I'm sorry for the misunderstanding. I shouldn't have said "the DOM
API". To be as accurate as I can, what I want is to provide the
DOMImplementation interface
(http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-102161490) to
workers. As I'm going to explain, the point is to be able to create a
document and then a documentFragment.

I will explain my point through another use case. (Sorry for the
confusion with the HTML5 one-page version.)

Let's imagine that I want to build a single page from several non-HTML
sources of information. They can be in different formats (RSS, data
obtained from XML-RPC requests, any other kind of XML file, JSON...).
Suppose that each source is a different JSON file with a different
structure (different properties, different nesting). Each source needs
its own treatment. As I said in my first e-mail, there are 3 main
steps before I can see my page fully loaded. For each source of
content, we have to:
(1) get the content
(2) transform it into a DOM tree (as a documentFragment, or as a string
that is the representation of an HTML fragment, for example)
(3) append this to the main document at the right place (which triggers
graphical rendering).
This last step is either an appendChild or a .innerHTML = and must be
done in the main browsing context; there is no choice.


Let's imagine that I want one worker per source.
For the moment, WebWorkers can do step (1) independently (thanks to
XMLHttpRequest).

When each worker receives its JSON string, this string must be
transformed into an HTML DOM tree (2) (let's say a table, for example).
Because none of the DOM Core API is currently available to
WebWorkers, we have two solutions to turn the JSON string received in
(1) into an HTML DOM tree:
(2.1) Send the JSON string (or the resulting object, whatever) to the
main document, which will create a documentFragment, run through the
JSON object and append the <table>, <tbody>, <tr>s and <td>s and their
contents to this fragment, for all the sources.
(2.2) Each worker creates a string which looks like <table
id="blabla"><tbody><tr class="blibli"><td>1</td><td>2</td></tr><tr
class="blibli"><td>3</td><td>37</td></tr></tbody></table> with +=
while running through the JSON object. It then sends the string through
postMessage() and the main browsing context can do a
rightPlace.innerHTML = e.data (where e.data is the string).
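
As a rough illustration of option (2.2), the worker-side string building
might look like the sketch below. The function names (escapeHtml,
rowsToTableHtml), the id and class values, and the input shape (an array
of arrays) are assumptions made up for the example, not part of any
existing API.

```javascript
// Hypothetical helper: escape text so that data cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical worker-side builder: turns rows (an array of arrays) into
// an HTML string using +=, as described in option (2.2).
function rowsToTableHtml(rows) {
  let html = '<table id="blabla"><tbody>';
  for (const row of rows) {
    html += '<tr class="blibli">';
    for (const cell of row) {
      html += "<td>" + escapeHtml(cell) + "</td>";
    }
    html += "</tr>"; // every opening tag needs its closing tag by hand
  }
  html += "</tbody></table>";
  return html;
}

const tableHtml = rowsToTableHtml([[1, 2], [3, 37]]);
// In a worker, the string would then be sent with postMessage(tableHtml),
// and the page would do rightPlace.innerHTML = e.data.
```

Every closing tag here has to be written and kept in sync manually, which
is the weak point of this approach.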

(2.1) We have the document/documentFragment/Element/Node abstraction,
but we lose all the parallelism, because the browsing context is
handling all the sources of information (creating a documentFragment
and doing all the appending for each source).

(2.2) We have the parallelism, because each worker handles a source.
However, we lose the DOM abstraction. I hope that I have made the
string ridiculously long enough to convince you that it is not a good
solution. For complicated examples, in my experience, using += and
.innerHTML is always a source of errors, especially because of closing
tags. These problems don't occur when developing with the DOM abstraction.

My proposal is:
(2.3) Assuming that we have access to the DOMImplementation interface,
we can create an object implementing the Document interface which is
DIFFERENT from the main document object, and I insist on this point. I am
NOT proposing to provide access to the main document (the one which
created the workers).
Thanks to this document, we can create a 

Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers

2009-12-01 Thread Ian Hickson
On Thu, 12 Nov 2009, David Bruant wrote:

 I was waiting for Firefox to stop freezing on the HTML5 spec page (it 
 freezes about one minute each time I visit the one-page version) and I 
 tried to think of a way to design this page in a way that wouldn't 
 freeze my browser.

The easiest way is to disable the scripts, which you can do by appending 
?slow-browser to the page's URL, as in:

   http://www.whatwg.org/specs/web-apps/current-work/?slow-browser


 One good way I have found would be to cut the whole page into several 
 parts (on the server side, which is already done in the multi-page
 version) and to launch several workers. Each worker gets one part of the 
 whole page in the background and could give it to the browsing context 
 which will append the right part at the right place.

As others have noted, the slowness turns out to not be parsing, but to be 
a bunch of scripts that are doing various things such as adding the 
sidebar annotations, setting up the dfn cross-references, and generating 
the short table of contents.

Plus, since browsers don't have thread-safe DOM implementations, we 
actually can't expose the DOM in workers. Maybe one day. :-)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] [WebWorkers] Advocation to provide the DOM API to the workers

2009-11-12 Thread David Bruant
Hi,

I was waiting for Firefox to stop freezing on the HTML5 spec page (it
freezes about one minute each time I visit the one-page version) and I
tried to think of a way to design this page in a way that wouldn't
freeze my browser.
One good way I have found would be to cut the whole page into several
parts (on the server side, which is already done in the multi-page
version) and to launch several workers. Each worker gets one part of the
whole page in the background and could give it to the browsing context
which will append the right part at the right place.

But what does "give" mean here? Without the DOM API, "give" means
sending a string through the postMessage() method and "append"
means rightPlace.innerHTML = stringContainingAPartOfThePage.
However, with the DOM API, each worker could independently build its
documentFragment and send it to the browsing context, which would append
it (appendChild) in the right place.

Building the page requires 3 main operations :
- getting the content (can be parallelized with the workers which can
make XMLHttpRequests)
- building a DOM tree from the content
- rendering (cannot be parallelized because must occur in the browsing
context)

Without the DOM API, the second step cannot be parallelized in the
WebWorkers.

I understand that the whole DOM API is not useful for WebWorkers,
but could a reduced API (sufficient to describe the tree structure of a
document) be made available to workers, so that there is a chance to
parallelize building the tree structure of the content (and thereby
reduce its cost)?
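
To make the idea of "an API sufficient to describe the tree structure"
more concrete, here is a minimal sketch of one shape it could take with
what workers already have: the worker describes the tree as plain data
and the page turns that description into nodes. The helper names (el,
materialize) and the message format are hypothetical, and this is not
the DOM API asked for above. The page still has to create the real
nodes, so only the traversal of the source data moves into the worker.

```javascript
// Hypothetical worker-side helper: describes an element as plain data.
function el(tag, attrs, children) {
  return { tag: tag, attrs: attrs || {}, children: children || [] };
}

// Hypothetical page-side helper: turns a description into real nodes.
// `doc` is expected to offer createElement and createTextNode, as the
// page's `document` does.
function materialize(node, doc) {
  if (typeof node === "string") {
    return doc.createTextNode(node);
  }
  const element = doc.createElement(node.tag);
  for (const name of Object.keys(node.attrs)) {
    element.setAttribute(name, node.attrs[name]);
  }
  for (const child of node.children) {
    element.appendChild(materialize(child, doc));
  }
  return element;
}

// Worker side: build the description and serialize it for postMessage().
const description = el("table", { id: "blabla" }, [
  el("tbody", {}, [
    el("tr", { class: "blibli" }, [el("td", {}, ["1"]), el("td", {}, ["2"])]),
  ]),
]);
const message = JSON.stringify(description);

// Page side (inside an onmessage handler, with e.data being the string):
//   rightPlace.appendChild(materialize(JSON.parse(e.data), document));
```

Closing tags cannot be forgotten with this shape, but it keeps node
creation on the page, which is exactly the cost the request above hopes
to move into workers.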

Thanks,

David


Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers

2009-11-12 Thread Boris Zbarsky

On 11/12/09 9:21 PM, David Bruant wrote:

I was waiting for Firefox to stop freezing on the HTML5 spec page (it
freezes about one minute each time I visit the one-page version) and I
tried to think of a way to design this page in a way that wouldn't
freeze my browser.


Two easy ways to do this:

1)  Take out the script at the end of the page that goes and messes
with the DOM.
2)  Fix the O(N^2) algorithm in the web browser that this script
happens to trigger
(https://bugzilla.mozilla.org/show_bug.cgi?id=526394; should be
checked in pretty soon unless something goes drastically wrong).


One good way I have found would be to cut the whole page into several
parts (on the server side, which is already done in the multi-page
version) and to launch several workers. Each worker gets one part of the
whole page in the background and could give it to the browsing context
which will append the right part at the right place.


I'm not sure what you mean, exactly... what would the worker give, 
exactly?



But what does "give" mean here? Without the DOM API, "give" means
sending a string through the postMessage() method and "append"
means rightPlace.innerHTML = stringContainingAPartOfThePage.
However, with the DOM API, each worker could independently build its
documentFragment and send it to the browsing context, which would append
it (appendChild) in the right place.


The problem here is that of a script making certain DOM mutations after 
the DOM is completely built and reflecting those mutations into the 
rendering tree, not of initial DOM construction.


That is, even if this proposal were implemented it would not eliminate 
the hang you're seeing without item 2 above being addressed.



Building the page requires 3 main operations :
- getting the content (can be parallelized with the workers which can
make XMLHttpRequests)
- building a DOM tree from the content
- rendering (cannot be parallelized because must occur in the browsing
context)


And in this case the slowness you seem to be trying to address is in the 
rendering part.


-Boris


Re: [whatwg] [WebWorkers] Advocation to provide the DOM API to the workers

2009-11-12 Thread Marius Gundersen
On Fri, Nov 13, 2009 at 1:46 PM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 11/12/09 9:21 PM, David Bruant wrote:

 I was waiting for Firefox to stop freezing on the HTML5 spec page (it
 freezes about one minute each time I visit the one-page version) and I
 tried to think of a way to design this page in a way that wouldn't
 freeze my browser.


 Two easy ways to do this:

 1)  Take out the script at the end of the page that goes and messes
with the DOM.
 2)  Fix the O(N^2) algorithm in the web browser that this script
happens to trigger
(https://bugzilla.mozilla.org/show_bug.cgi?id=526394; should be
checked in pretty soon unless something goes drastically wrong).


  One good way I have found would be to cut the whole page into several
 parts (on the server side, which is already done in the multi-page
 version) and to launch several workers. Each worker gets one part of the
 whole page in the background and could give it to the browsing context
 which will append the right part at the right place.


 I'm not sure what you mean, exactly... what would the worker give,
 exactly?


  But what does "give" mean here? Without the DOM API, "give" means
 sending a string through the postMessage() method and "append"
 means rightPlace.innerHTML = stringContainingAPartOfThePage.
 However, with the DOM API, each worker could independently build its
 documentFragment and send it to the browsing context, which would append
 it (appendChild) in the right place.


 The problem here is that of a script making certain DOM mutations after the
 DOM is completely built and reflecting those mutations into the rendering
 tree, not of initial DOM construction.

 That is, even if this proposal were implemented it would not eliminate the
 hang you're seeing without item 2 above being addressed.


  Building the page requires 3 main operations :
 - getting the content (can be parallelized with the workers which can
 make XMLHttpRequests)
 - building a DOM tree from the content
 - rendering (cannot be parallelized because must occur in the browsing
 context)


 And in this case the slowness you seem to be trying to address is in the
 rendering part.

 -Boris



The reason WebWorkers don't have access to the DOM is concurrency. For
example, to loop through a list of children I need to first read the number
of children, then have a for loop which starts at 0 and ends at length-1.
If you have two threads that can access the DOM concurrently, then one could
change the number of children while the other was looping through the list,
which would cause bugs in the program. The only way to fix this is to make
the DOM a monitor or introduce semaphores, but then you would have to change
the way the DOM is accessed in HTML5, breaking backwards compatibility,
which is not a good idea.
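
The race described above can be imitated in a single thread. In the
sketch below, a plain array stands in for a node's child list and a
generator stands in for the looping thread, with each yield marking a
point where another thread could run. Both are assumptions made for the
illustration; real DOM child lists and real threads are not involved.

```javascript
// Stand-in for a node's child list (an assumption for the example).
const children = ["a", "b", "c"];

// Reader: reads the length once, then loops from 0 to length - 1.
// Each yield marks a point where another thread could run.
function* readChildren(list) {
  const length = list.length;
  for (let i = 0; i < length; i++) {
    yield list[i];
  }
}

const reader = readChildren(children);
const seen = [];
seen.push(reader.next().value); // reader sees "a"
children.pop();                 // "other thread" removes the last child
seen.push(reader.next().value); // reader sees "b"
seen.push(reader.next().value); // index 2 no longer exists: undefined
```

The reader trusted a length that was already stale, so its last step
reads a child that is no longer there.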

A better solution to your problem is to load fragments of the entire
document using AJAX and then insert those fragments into the main document,
when they are needed. You rarely need to see the entire document at once
anyways.

Marius Gundersen