On 20 November 2014 09:30, Gergo Tisza <gti...@wikimedia.org> wrote:

> On Mon, Nov 17, 2014 at 11:03 AM, James Forrester <
> jforres...@wikimedia.org>
> wrote:
>
> > Moving to character-level rather than paragraph-level diffing might help
> > here, potentially. I vaguely remember that we attempted that and abandoned
> > it because it caused more issues than it solved back in ?2004, though.
> >
>
> A paragraph-level diff means that you only get an edit conflict if two
> people change the same paragraph. A character-level diff would mean, then,
> that you only get a conflict if they change the same character? That sounds
> a bit excessive. (Stupid example: if I change "sixty-three" to "sixty-five"
> and someone else changes it to "seventy-three", that should probably be a
> conflict, but a character-level diff would happily merge them into
> "seventy-five".)


Sure, but wikitext "paragraphs" are significantly more extensive and
diverse than the NLP concept; to give an example:

Original wikitext:

There are six [[alpaca]] shearers on [[Sunningdale Acers|the farm]].


My changes:

There are six [[*Alpaca fiber|*alpaca]] shearers on [[Sunningdale Acr*e*s|the farm]].


Their changes:

There are six [[alpaca]] shearers on [[Sunningdale Acers|the farm*stead*]].



Merging these two changes requires character-level merging (or something
that natively understands wikitext at a subtle level). The first change would
go through as a word-level diff (but not at sentence-level); the second
wouldn't go through even then. Of course, we could prompt people to review
the diff after saving if we're auto-merging, but that might be something we
should be doing even now?
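
To make the trade-off concrete, here is a minimal sketch of a character-level
three-way merge. It assumes Python's standard difflib for the diffing; the
names merge_chars, edits and MergeConflict are made up for illustration, and
this is not how MediaWiki's merge actually works. Edits from the two sides
conflict only if their spans in the original text overlap or touch.

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    """Raised when both sides edit overlapping or touching spans."""


def edits(base, changed):
    """Return (start, end, replacement) edits that turn base into changed.

    start/end are character offsets into base.
    """
    matcher = SequenceMatcher(None, base, changed, autojunk=False)
    return [
        (i1, i2, changed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge_chars(base, mine, theirs):
    """Three-way merge at character granularity (illustrative only)."""
    mine_edits = edits(base, mine)
    their_edits = edits(base, theirs)

    # Identical edits on both sides are fine; anything else that overlaps
    # or touches in base coordinates is treated as a conflict.
    for m in mine_edits:
        for t in their_edits:
            if m != t and m[0] <= t[1] and t[0] <= m[1]:
                raise MergeConflict(f"{m!r} vs {t!r}")

    # Apply from the end of the text backwards so earlier offsets stay valid.
    merged = base
    for start, end, text in sorted(set(mine_edits + their_edits), reverse=True):
        merged = merged[:start] + text + merged[end:]
    return merged
```

With this sketch, the alpaca example above (minus the emphasis markers) merges
cleanly into a sentence carrying all three changes. It also reproduces Gergo's
objection: merge_chars("sixty-three", "sixty-five", "seventy-three") returns
"seventy-five" without complaint, which is why a review-the-diff prompt would
matter if we auto-merged at this granularity.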




> Another low-hanging fruit would be to special-case the situation when
> editor A adds text to the end of a section but does not start a new
> section, while editor B adds a new section to the same place. This is
> currently a conflict as they both try to insert to the same "slot" between
> paragraphs, so a generic merge tool cannot figure out whether those
> additions conflict and what would be the right order if they don't;
> however, knowing the semantics of wikitext, inserting the text from A first
> and the one from B after that seems a pretty safe bet. This kind of
> conflict is very typical on talk pages where people almost always edit the
> end of a section, and the few "hot topic" sections get the majority of the
> edits.


That seems like a sensible idea. Filed:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73667
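
As a rough sketch of the rule described above (the function names and the
heading regex are my own assumptions, not existing MediaWiki code): when two
insertions land in the same slot, put the one that continues the current
section first and the one that opens a new section after it, and still report
a conflict in every other case.

```python
import re

# A wikitext heading line such as "== Title ==" (same number of "=" each side).
HEADING = re.compile(r"^(=+)[^=].*\1\s*$", re.MULTILINE)


def contains_heading(text):
    """True if any line of text is a wikitext heading."""
    return HEADING.search(text) is not None


def starts_new_section(text):
    """True if the first non-blank line of text is a wikitext heading."""
    first = next((line for line in text.splitlines() if line.strip()), "")
    return HEADING.match(first) is not None


def order_same_slot_insertions(a, b):
    """Order two insertions made at the same slot, or return None.

    Returns (first, second) when one insertion only continues the current
    section and the other opens a new section; returns None when the rule
    does not apply and the edit conflict should stand.
    """
    if not contains_heading(a) and starts_new_section(b):
        return (a, b)
    if not contains_heading(b) and starts_new_section(a):
        return (b, a)
    return None
```

Two replies appended to the same section would still conflict under this
sketch, since neither starts a new section and their relative order is a
guess.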



> (Of course, using unstructured wikitext for talk pages is a bad
> thing in general, but that's a long-term problem, and this kind of edit
> conflict could be prevented quickly.)


Indeed!


J.
-- 
James D. Forrester
Product Manager, Editing
Wikimedia Foundation, Inc.

jforres...@wikimedia.org | @jdforrester
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
