Subramanya Sastry has submitted this change and it was merged.

Change subject: Refactor for DOM-based serializer support in WikitextSerializer
......................................................................

Refactor for DOM-based serializer support in WikitextSerializer

This patch adds rough support for DOM-based handlers to the
WikitextSerializer. This is intended to simplify the SelectiveSerializer and
to fix several long-standing issues around separators (mainly newlines)
between modified and unmodified content. DOM-based handlers have full sibling
information available, which simplifies decisions on which parts of the
document to serialize by reusing the original source, and which separators and
modified content to represent with WikitextSerializer output.

* Replaced token collector uses and token-based handlers for figures and links
  with DOM-based serializer handlers.

* Added loadDataAttrib, loadDataParsoid, saveDataAttribs methods to DOMUtils,
  which decode data-<name> JSON attributes into node.data.<name> for efficient
  access when operating on the DOM. saveDataAttribs serializes the loaded data
  attribs back into their data-<name> variants. Using this more widely in the
  DOMPostProcessor should improve performance a bit by avoiding repeated
  loading / saving of JSON information.

* Added DOM-based getAttributeShadowInfo to DOMUtils, which works just like
  the token equivalent.

No changes in parser test results, but some changes in failing test output:

* Some link content seemed to emit an array of tokens previously, which left
  commas in the resulting wikitext. This seems to be fixed now.

* Some output previously contained data-parsoid and span data. It is now
  serialized, so this data is stripped out.

TODO:

* Pass callbacks down the call chain for cleaner internal interfaces, and
  switch callbacks when selser is enabled. Switching will avoid some of the
  indirection we currently have with serializeInfo and a selser handler state
  machine. The generic chunk callback can still be stored in state.chunkCB,
  but should no longer be directly called.

* Convert more handlers to be DOM-based. The separator handler in particular
  would be a good target.
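
For readers who want a concrete picture of the data-<name> round trip described above, here is a minimal sketch. It is not the code from this patch: the three function names come from the commit message, but their signatures, the defaultVal parameter, the caching behaviour, and the FakeElement stand-in (used so the sketch runs without a DOM library) are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for a DOM element, just enough for this sketch to run.
function FakeElement(attrs) {
    this.attrs = Object.assign({}, attrs);
    this.data = undefined;
}
FakeElement.prototype.getAttribute = function (name) {
    return Object.prototype.hasOwnProperty.call(this.attrs, name) ?
        this.attrs[name] : null;
};
FakeElement.prototype.setAttribute = function (name, value) {
    this.attrs[name] = String(value);
};

// Decode the data-<name> JSON attribute into node.data[name].
// Assumed behaviour: decode only once, fall back to defaultVal when the
// attribute is missing or is not valid JSON.
function loadDataAttrib(node, name, defaultVal) {
    if (!node.data) {
        node.data = {};
    }
    if (node.data[name] === undefined) {
        var raw = node.getAttribute('data-' + name);
        var parsed = defaultVal;
        if (raw !== null && raw !== '') {
            try {
                parsed = JSON.parse(raw);
            } catch (e) {
                parsed = defaultVal;
            }
        }
        node.data[name] = parsed;
    }
    return node.data[name];
}

// Convenience wrapper for the data-parsoid attribute (assumed default: {}).
function loadDataParsoid(node) {
    return loadDataAttrib(node, 'parsoid', {});
}

// Serialize everything under node.data back into data-<name> attributes.
function saveDataAttribs(node) {
    Object.keys(node.data || {}).forEach(function (name) {
        node.setAttribute('data-' + name, JSON.stringify(node.data[name]));
    });
}

// Usage: load once, mutate the decoded object freely, save once at the end.
var link = new FakeElement({ 'data-parsoid': '{"src":"[[Foo]]"}' });
var dp = loadDataParsoid(link);
dp.stx = 'simple';
saveDataAttribs(link);
```

The point of the pattern is that handlers working on the same node share one decoded object, so JSON is parsed and stringified once per node rather than once per access.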

Change-Id: I889e153cf001df14266724f651f25b5a2e8d65c9
---
M js/lib/mediawiki.DOMUtils.js
M js/lib/mediawiki.SelectiveSerializer.js
M js/lib/mediawiki.WikitextSerializer.js
3 files changed, 504 insertions(+), 221 deletions(-)

Approvals:
  Subramanya Sastry: Verified; Looks good to me, approved
  jenkins-bot: Checked

--
To view, visit https://gerrit.wikimedia.org/r/47039
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I889e153cf001df14266724f651f25b5a2e8d65c9
Gerrit-PatchSet: 7
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: GWicke <[email protected]>
Gerrit-Reviewer: GWicke <[email protected]>
Gerrit-Reviewer: Subramanya Sastry <[email protected]>
Gerrit-Reviewer: jenkins-bot

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
