https://bugzilla.wikimedia.org/show_bug.cgi?id=53410

       Web browser: ---
            Bug ID: 53410
           Summary: Don't modify the document after passing it to
                    callbacks in mediawiki.HTML5TreeBuilder.node.js
           Product: Parsoid
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: DOM
          Assignee: gwi...@wikimedia.org
          Reporter: gwi...@wikimedia.org
                CC: ssas...@wikimedia.org
    Classification: Unclassified
   Mobile Platform: ---

When we process a document to an HTML DOM, the document that is passed to
callbacks is later modified by mediawiki.HTML5TreeBuilder.node.js when the
pipeline is reused for another parse. Callers do not expect a document they
have already received to change, so this should be fixed.
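
To make the hazard concrete, here is a minimal sketch of the aliasing problem.
ReusingTreeBuilder and its plain-object "document" are hypothetical stand-ins
invented for illustration; they are not the Parsoid or HTML5 parser code. The
sketch only assumes that the builder keeps one document object and refills it
on every parse.

```javascript
// Hypothetical stand-in for a tree builder that owns a single document
// object and reuses it across parses. Not the actual Parsoid code.
class ReusingTreeBuilder {
  constructor() {
    this.doc = { children: [] };
  }

  parse(tokens, callback) {
    // Reuse: empty the one shared document in place, then refill it.
    this.doc.children.length = 0;
    for (const t of tokens) {
      this.doc.children.push(t);
    }
    // The callback receives a reference to the shared document, not a copy.
    callback(this.doc);
  }
}

const builder = new ReusingTreeBuilder();
let firstDoc = null;

// First parse: the caller keeps the document it was handed.
builder.parse(['p', 'div'], (doc) => { firstDoc = doc; });
const before = firstDoc.children.slice();

// Second parse on the same pipeline.
builder.parse(['span'], () => {});

// The document from the first parse now holds the second parse's content.
const clobbered = firstDoc.children.join(',') !== before.join(',');
console.log('first document clobbered:', clobbered);
```

Any caller that holds on to the first document past the next parse (for
example for asynchronous post-processing) would see it change underneath it.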

The main reason for this is that there seems to be no efficient way to
re-initialize an existing HTML5 parser with a new document. Creating a new
HTML5 parser for each parse would work, but is (or at least was) quite
expensive, because a lot of work is done in the constructor.

So we should probably:

1) measure how expensive it currently is to create a new HTML5 parser from
scratch, and if that is too slow,

2) create a way to efficiently reset the HTML5 parser with a new document
without clobbering the old one.
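
The two steps above could be sketched roughly as follows. SketchParser, its
reset() method and timeConstruction() are hypothetical names invented for this
illustration, and the loop in the constructor merely stands in for expensive
setup work; none of this reflects the real HTML5 parser's API.

```javascript
// Hypothetical parser. The constructor loop stands in for costly setup.
class SketchParser {
  constructor() {
    this.table = new Map();
    for (let i = 0; i < 1000; i++) {
      this.table.set('state' + i, i);
    }
    this.reset();
  }

  // Step 2: attach a brand-new document and keep the expensive tables.
  // The previous document is left untouched for whoever still holds it.
  reset() {
    this.doc = { children: [] };
  }

  parse(tokens) {
    this.reset();
    for (const t of tokens) {
      this.doc.children.push(t);
    }
    return this.doc;
  }
}

// Step 1: measure the average cost of constructing a parser from scratch.
function timeConstruction(iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    new SketchParser();
  }
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6 / iterations; // milliseconds per parser
}

const msPerParser = timeConstruction(1000);
console.log('ms per construction:', msPerParser);

// With reset(), each parse yields its own document object.
const parser = new SketchParser();
const docA = parser.parse(['p']);
const docB = parser.parse(['span', 'div']);
console.log('distinct documents:', docA !== docB);
```

If the measured construction cost turns out to be negligible compared to a
parse, step 2 may be unnecessary and a new parser per parse would be the
simpler fix.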

See also https://gerrit.wikimedia.org/r/#/c/81251/ for an earlier partial fix
and bug 53407 for an example illustrating why modifying the document is a
problem.
