Hi Luca,

We are working on somewhat related issues in Parsoid [1][2]. The
modified HTML DOM is diffed vs. the original DOM on the way in. Each
modified node is annotated with the base revision. We don't store this
information yet; right now we use it to selectively serialize modified
parts of the page back to wikitext. We will however soon store the HTML
along with wikitext for each revision, which should make it possible to
display a coarse blame map.
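
To make the idea concrete, here is a minimal sketch of a positional DOM
diff that annotates modified nodes with a base revision. It is not
Parsoid's actual code: the Node class, the base_rev field and the
annotate_modified function are hypothetical names made up for this
illustration, and children are compared purely by position, so there is
no move detection.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy DOM node (hypothetical; not Parsoid's data model)."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    base_rev: Optional[int] = None  # set only on nodes detected as modified


def annotate_modified(old: Optional[Node], new: Node, base_rev: int) -> bool:
    """Mark nodes in `new` that differ from `old`; return True on any change."""
    if old is None or old.tag != new.tag or old.text != new.text:
        # Treat the whole subtree as modified and stop descending.
        new.base_rev = base_rev
        return True
    changed = False
    for i, child in enumerate(new.children):
        # Naive positional pairing of children.
        old_child = old.children[i] if i < len(old.children) else None
        if annotate_modified(old_child, child, base_rev):
            changed = True
    if len(old.children) != len(new.children):
        # A child was added or removed, so the parent itself changed.
        new.base_rev = base_rev
        changed = True
    return changed


old = Node("body", children=[Node("p", "Hello"), Node("p", "World")])
new = Node("body", children=[Node("p", "Hello"), Node("p", "World!")])
annotate_modified(old, new, base_rev=42)
modified = [c.text for c in new.children if c.base_rev is not None]
```

Only the second paragraph ends up annotated, so a serializer could reuse
the original wikitext for everything else.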

There are several limitations:

* We don't preserve blame information on wikitext edits yet. This should
become possible with the incremental re-parsing optimization which is on
our roadmap for this summer.

* Our DOM diff algorithm is extremely simplistic. We are considering
porting XyDiff for better move detection.

* The information is pretty coarse at a node level. Refining this to a
word level would require an efficient encoding for that information,
possibly as length/revision pairs associated with the wrapping element.

* We have not moved metadata from attributes to a metadata section with
efficient encoding yet.
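
The length/revision pairs mentioned in the third point could look
roughly like the sketch below. This is only an illustration under my own
assumptions: the function names are hypothetical, and it assumes one
revision id per word inside a wrapping element, run-length encoded into
(length, revision) pairs.

```python
from itertools import groupby
from typing import List, Tuple

Run = Tuple[int, int]  # (number of consecutive words, revision id)


def encode_runs(word_revs: List[int]) -> List[Run]:
    """Collapse consecutive words with the same revision into one pair."""
    return [(len(list(group)), rev) for rev, group in groupby(word_revs)]


def decode_runs(runs: List[Run]) -> List[int]:
    """Expand (length, revision) pairs back to one revision per word."""
    out: List[int] = []
    for length, rev in runs:
        out.extend([rev] * length)
    return out


def revision_at(runs: List[Run], word_index: int) -> int:
    """Look up the revision of a single word without full decoding."""
    for length, rev in runs:
        if word_index < length:
            return rev
        word_index -= length
    raise IndexError("word index out of range")


# "The quick brown fox": the third word was changed in revision 12.
runs = encode_runs([10, 10, 12, 10])
```

A paragraph touched by few revisions compresses to a handful of pairs,
which is what would make storing them on the wrapping element cheap.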

We don't currently plan to work on blame maps ourselves. Maybe there are
opportunities for collaboration?

Gabriel

[1]: http://www.mediawiki.org/wiki/Parsoid
[2]: http://www.mediawiki.org/wiki/Parsoid/Roadmap

-- 
Gabriel Wicke
Senior Software Engineer
Wikimedia Foundation

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
