Subramanya Sastry has submitted this change and it was merged.

Change subject: Refactor for DOM-based serializer support in WikitextSerializer
......................................................................


Refactor for DOM-based serializer support in WikitextSerializer

This patch adds rough support for DOM-based handlers to the
WikitextSerializer. This is intended to simplify the SelectiveSerializer and
to fix several long-standing issues around separators (mainly newlines)
between modified and unmodified content. DOM-based handlers have full sibling
information available, which makes it easier to decide which parts of the
document can be serialized by reusing the original source, and which
separators and modified content must be represented with WikitextSerializer
output.

* Replaced token collector uses and token-based handlers for figures and links
  with DOM-based serializer handlers.

* Added loadDataAttrib, loadDataParsoid, saveDataAttribs methods to DOMUtils,
  which decode data-<name> JSON attributes into node.data.<name> for efficient
  access when operating on the DOM. saveDataAttribs serializes the loaded data
  attribs back into their data-<name> variants.

  Using these methods more widely in the DOMPostProcessor should improve
  performance a bit by avoiding repeated loading / saving of JSON information.

* Added DOM-based getAttributeShadowInfo to DOMUtils, which works just like
  the token equivalent.
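
The data attribute round trip described above can be sketched roughly as
follows. This is a minimal illustration, not the code from this patch: the
method names come from the commit message, but their signatures, the
defaultVal parameter, and the FakeElement stand-in (used so the sketch runs
without a real DOM) are assumptions.

```javascript
// Hypothetical stand-in for a DOM element; only the attribute calls
// used in this sketch are implemented.
class FakeElement {
  constructor(attrs) {
    this.attrs = Object.assign({}, attrs);
    this.data = undefined;
  }
  getAttribute(name) {
    return Object.prototype.hasOwnProperty.call(this.attrs, name)
      ? this.attrs[name]
      : null;
  }
  setAttribute(name, value) {
    this.attrs[name] = String(value);
  }
}

// Decode the data-<name> JSON attribute into node.data[name] once, and
// return the cached object on later calls. Signature is an assumption.
function loadDataAttrib(node, name, defaultVal) {
  if (!node.data) {
    node.data = {};
  }
  if (node.data[name] === undefined) {
    const raw = node.getAttribute('data-' + name);
    let parsed = defaultVal;
    if (raw !== null) {
      try {
        parsed = JSON.parse(raw);
      } catch (e) {
        // Malformed JSON: fall back to the default value.
        parsed = defaultVal;
      }
    }
    node.data[name] = parsed;
  }
  return node.data[name];
}

// Convenience wrapper for the data-parsoid attribute.
function loadDataParsoid(node) {
  return loadDataAttrib(node, 'parsoid', {});
}

// Serialize every loaded entry in node.data back to its data-<name> attribute.
function saveDataAttribs(node) {
  for (const name of Object.keys(node.data || {})) {
    node.setAttribute('data-' + name, JSON.stringify(node.data[name]));
  }
}

// Usage: load once, mutate the plain object, save once at the end.
const el = new FakeElement({ 'data-parsoid': '{"stx":"html"}' });
const dp = loadDataParsoid(el);
dp.stx = 'wiki';
saveDataAttribs(el);
```

The point of the cache in node.data is that several passes can read and
modify the same object, with JSON parsing and stringification happening only
once per node.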

No changes in parser test results, but some changes in failing test output:

* Some link content seemed to emit an array of tokens previously, which left
  commas in the resulting wikitext. This seems to be fixed now.

* Some output previously contained data-parsoid and span data. That content
  is now serialized, so this data no longer appears in the output.

TODO:

* Pass callbacks down the call chain for cleaner internal interfaces, and
  switch callbacks when selser is enabled. Switching will avoid some of the
  indirection we currently have with serializeInfo and a selser handler state
  machine. The generic chunk callback can still be stored in state.chunkCB,
  but should no longer be directly called.

* Convert more handlers to be DOM-based. The separator handler in particular
  would be a good target.
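
The callback-passing idea in the first TODO item could look roughly like the
sketch below. It is only an illustration of the approach under stated
assumptions: state.chunkCB is named in the commit message, but the node shape
({ text, src, modified, children }), the selserCB wrapper, and the serialize
and serializeNode functions are hypothetical.

```javascript
// Handlers only ever call the callback they were given; they never reach
// for state.chunkCB directly.
function serializeNode(node, cb) {
  if (node.children) {
    for (const child of node.children) {
      serializeNode(child, cb);
    }
  } else {
    cb(node.text, node);
  }
}

function serialize(root, options) {
  const state = { out: [] };

  // Generic chunk callback, still stored on the state object.
  state.chunkCB = (chunk) => {
    state.out.push(chunk);
  };

  // Hypothetical selser callback: reuse the original source for unmodified
  // nodes, and fall back to freshly serialized output otherwise.
  const selserCB = (chunk, node) => {
    const reuse = !node.modified && node.src !== undefined;
    state.chunkCB(reuse ? node.src : chunk);
  };

  // Switch callbacks once at the top instead of routing every chunk
  // through a handler state machine.
  serializeNode(root, options.selser ? selserCB : state.chunkCB);
  return state.out.join('');
}

// Usage: the same tree serialized with and without selser.
const doc = {
  children: [
    { text: "''a''", src: '<i>a</i>', modified: false },
    { text: 'b', src: 'x', modified: true }
  ]
};
const plainOut = serialize(doc, { selser: false });
const selserOut = serialize(doc, { selser: true });
```

Choosing the callback once at the entry point keeps the handlers unaware of
whether selser is active.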

Change-Id: I889e153cf001df14266724f651f25b5a2e8d65c9
---
M js/lib/mediawiki.DOMUtils.js
M js/lib/mediawiki.SelectiveSerializer.js
M js/lib/mediawiki.WikitextSerializer.js
3 files changed, 504 insertions(+), 221 deletions(-)

Approvals:
  Subramanya Sastry: Verified; Looks good to me, approved
  jenkins-bot: Checked


--
To view, visit https://gerrit.wikimedia.org/r/47039
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I889e153cf001df14266724f651f25b5a2e8d65c9
Gerrit-PatchSet: 7
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: GWicke <[email protected]>
Gerrit-Reviewer: GWicke <[email protected]>
Gerrit-Reviewer: Subramanya Sastry <[email protected]>
Gerrit-Reviewer: jenkins-bot

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
