On 20 November 2014 09:30, Gergo Tisza <gti...@wikimedia.org> wrote:

> On Mon, Nov 17, 2014 at 11:03 AM, James Forrester <jforres...@wikimedia.org>
> wrote:
>
> > Moving to character-level rather than paragraph-level diffing might help
> > here, potentially. I vaguely remember that we attempted that and abandoned
> > it because it caused more issues than it solved back in ?2004, though.
>
> A paragraph-level diff means that you only get an edit conflict if two
> people change the same paragraph. A character-level diff would mean, then,
> that you only get a conflict if they change the same character? That sounds
> a bit excessive. (Stupid example: if I change "sixty-three" to "sixty-five"
> and someone else changes it to "seventy-three", that should probably be a
> conflict, but a character-level diff would happily merge them into
> "seventy-five".)
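For illustration, the "sixty-three" case above can be reproduced with a naive character-level three-way merge. This is only a sketch built on Python's standard difflib; the helper names `edits` and `char_merge` and the conflict rule (two edits conflict only when their ranges in the base text touch or overlap) are assumptions made for the example, not how MediaWiki merges edits.

```python
from difflib import SequenceMatcher


def edits(base, changed):
    """Return (start, end, replacement) edits that turn base into changed."""
    matcher = SequenceMatcher(None, base, changed, autojunk=False)
    return [
        (i1, i2, changed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def char_merge(base, mine, theirs):
    """Naive character-level three-way merge; returns None on conflict."""
    my_edits = edits(base, mine)
    their_edits = edits(base, theirs)
    # Assumed conflict rule: ranges in the base text that touch or overlap.
    for m_start, m_end, _ in my_edits:
        for t_start, t_end, _ in their_edits:
            if m_start <= t_end and t_start <= m_end:
                return None
    # Apply all edits from right to left so earlier offsets stay valid.
    merged = base
    for start, end, replacement in sorted(my_edits + their_edits, reverse=True):
        merged = merged[:start] + replacement + merged[end:]
    return merged


print(char_merge("sixty-three", "sixty-five", "seventy-three"))  # seventy-five
```

The two edits touch different characters of the base string, so this kind of merge accepts both without complaint, which is exactly the objection raised in the quote.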
Sure, but wikitext "paragraphs" are significantly more extensive and diverse
than the NLP concept; to give an example:

Original wikitext:
There are six [[alpaca]] shearers on [[Sunningdale Acers|the farm]].

My changes:
There are six [[*Alpaca fiber|*alpaca]] shearers on [[Sunningdale Acr*e*s|the farm]].

Their changes:
There are six [[alpaca]] shearers on [[Sunningdale Acers|the farm*stead*]].

Merging these two changes requires character-level merging (or something that
natively understands wikitext at a subtle level). The first change would go
through as a word-level diff (but not at sentence-level); the second wouldn't
go through even then. Of course, we could prompt people to review the diff
after saving if we're auto-merging, but that might be something we should be
doing even now?

> Another low-hanging fruit would be to special-case the situation when
> editor A adds text to the end of a section but does not start a new
> section, while editor B adds a new section to the same place. This is
> currently a conflict as they both try to insert to the same "slot" between
> paragraphs, so a generic merge tool cannot figure out whether those
> additions conflict and what would be the right order if they don't;
> however, knowing the semantics of wikitext, inserting the text from A first
> and the one from B after that seems a pretty safe bet. This kind of
> conflict is very typical on talk pages where people almost always edit the
> end of a section, and the few "hot topic" sections get the majority of the
> edits.

That seems like a sensible idea. Filed:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73667

> (Of course, using unstructured wikitext for talk pages is a bad
> thing in general, but that's a long-term problem, and this kind of edit
> conflict could be prevented quickly.)

Indeed!

J.
--
James D. Forrester
Product Manager, Editing
Wikimedia Foundation, Inc.
jforres...@wikimedia.org | @jdforrester
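
As a rough illustration of the end-of-section special case quoted above: when two edits insert text into the same slot and exactly one of them opens a new section, the text that continues the existing section can go first. The sketch below is hypothetical; `starts_new_section`, `order_same_slot_inserts`, and the heading pattern are assumptions for the example and are not taken from MediaWiki.

```python
import re

# Assumed pattern for a wikitext heading line such as "== Title ==".
HEADING = re.compile(r"(={1,6})[^=\n].*?\1[ \t]*")


def starts_new_section(text):
    """True if the first non-blank line of text is a section heading."""
    first_line = text.lstrip("\n").split("\n", 1)[0]
    return HEADING.fullmatch(first_line) is not None


def order_same_slot_inserts(a, b):
    """Order two inserts that target the same slot between paragraphs.

    Returns the inserts in the order they should appear, or None when the
    heuristic does not apply (both or neither open a new section) and the
    conflict should be left for a human to resolve.
    """
    a_new = starts_new_section(a)
    b_new = starts_new_section(b)
    if a_new == b_new:
        return None
    # The text that continues the current section comes first.
    return [b, a] if a_new else [a, b]


reply = ":I agree with the proposal. ~~~~\n"
new_section = "== Another topic ==\nWhat about templates? ~~~~\n"
print(order_same_slot_inserts(new_section, reply) == [reply, new_section])  # True
```

The heuristic declines to guess when both edits continue the section or both start a new one, since the order is not obvious from the wikitext alone in those cases.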