Not only is the current markup a barrier to participation, it's a barrier to development. As I argued on Wikien-l, starting over with a markup that can be syntactically validated, preferably one that is XML-based, would reap huge rewards in the safety and effectiveness of automated tools. Authors of tools like AWB have just as much trouble making software handle the corner cases in wikitext markup as new editors have understanding it.
It's no wonder really, as the current markup is the result of bolting on feature after feature, often to handle various corner cases, which is basically a really undisciplined approach to development. I honestly wouldn't be surprised if there were security issues waiting to be found in the parser, because of the way it's been thrown together. The switching cost to a new markup and the development cost of an editor for it would probably be comparable to the cost of writing a WYSIWYG editor that understands the current one. A new markup gives us some huge advantages. A new parser could potentially handle different output formats (print, web, mobile, PDF, etc.) seamlessly, could largely eliminate the accessibility issues currently created by ad-hoc formatting capabilities, and would let us start employing tools that actually understand the markup. Making that parser XML-based would mean we can reuse the existing ecosystem of libraries and tools for parsing and manipulating XML, and XSLT would give us a much smoother transition into any future markup changes. XML also lets us do validation, possibly at the time an edit is saved, so we can stop broken markup rather than having to fix it later. It also allows us to completely remove presentation from the hands of casual editors: the presentation of articles could be controlled at site level, with changes to the site templates going through existing consensus processes similar to those used to change major templates (the article template basically becomes part of the interface at this stage). Transition will be ugly regardless of whether we keep the current parser or replace it.
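To make the save-time validation idea concrete, here is a rough sketch in Python using only the standard library. The element names in ALLOWED_TAGS and the rule against "style" attributes are my own assumptions for illustration, not a proposed schema; a real deployment would more likely validate against a proper schema language.

```python
import xml.etree.ElementTree as ET

# Hypothetical element vocabulary, invented for this sketch only.
ALLOWED_TAGS = {"article", "section", "title", "p", "ref", "template", "param"}


def validate_edit(markup):
    """Return a list of problems; an empty list means the edit may be saved."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as exc:
        # Broken markup is rejected before it ever reaches the database.
        return ["not well-formed: %s" % exc]

    problems = []
    for elem in root.iter():
        if elem.tag not in ALLOWED_TAGS:
            problems.append("unknown element <%s>" % elem.tag)
        if "style" in elem.attrib:
            # Assumed rule: presentation is controlled at site level only.
            problems.append("presentation attribute on <%s>" % elem.tag)
    return problems
```

A save handler could call validate_edit() on the submitted text and refuse the save (showing the list to the editor) whenever the result is non-empty.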
There are a lot of complex features in the current markup that need to go because a WYSIWYG editor wouldn't understand them, and it would also be ideal going forward to "flatten" the formatting capabilities into a subset that assures consistency, so ad-hoc HTML and CSS formatting would likely have to go. I honestly think WYSIWYM is a more realistic target once the more problematic features in the current markup are gone. The editing experience is largely the same, with the key difference being that what you see in the editor doesn't have to look exactly like what you see in the rendered article. By aiming for WYSIWYM, some things would render in the editor in a way that makes them easier to understand and edit. For example, templates could render in the editor as tables, or as a block that loads the template parameters into a sidebar when clicked. The same could be done for references. This has a very shallow learning curve, and a tremendous advantage over WYSIWYG in that elements that don't lend themselves well to editing in a WYSIWYG environment are presented in the manner easiest to edit, rather than in the manner in which they appear. It may take one or two "second looks" to figure out what's happening, but after that it's smooth sailing, and we avoid the downsides of editing complicated parts of a page at the cost of only a little initial confusion. WYSIWYM editors can be friendly to experienced and new users alike. Take LyX as a good example of such an editor: things that are naturally complex and unwieldy in WYSIWYG mode become easy, because the interface is built to provide a visual understanding of what is going on rather than 1:1 fidelity with the final document. As a result, you spend more time editing and less time worrying about pixel-perfect formatting, because you can trust that the underlying formatting engine will handle things right when you do go to render the finished document.
As far as the current markup goes, a creative solution would be a fork of the parser into three parts, with a corresponding fork in namespaces as well. The resulting parts would be an article parser, a template parser, and a parser for a new layout namespace. Initially, the existing parser is lumped into the article parser, and class inheritance is used so that the template parser inherits all markup from the article parser, and in turn, the layout parser inherits all markup from the template parser. (I'll get more into layouts below.) This gives us a framework to do several things that improve usability and consistency. Once the initial "split" is set up, we begin to move markup features from the article parser into the template and layout parsers, both to remove ugly markup from general use and to restrict formatting capabilities to a subset that allows a WYSIWYG or WYSIWYM editor to work correctly and allows some level of consistency to be enforced at a site level. Layouts would be a new form of template, designed to apply as a block-level outline to an article, providing a framework to build a particular type of article and defining the formatting for that article in a way that templates and article markup would no longer be permitted to do. It's likely that layouts would be treated like highly used templates and the interface itself, with the ability to create and change a layout restricted by a permission bit. Layouts would be one to an article, so the interface to select one would probably be just selecting it from a dropdown or typing its name. Every layout would have at least an article body, and could define one or more additional "blocks". For basics you could have the default layout (a flat article), an article-with-infobox layout, and a list layout.
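As a rough illustration of the inheritance chain, here is a Python sketch of the state after some features have been moved out of the article parser. The class names follow the three parsers described above, but the feature names, the namespace labels, and the can_use() helper are all hypothetical; this says nothing about how the actual parser code is organised.

```python
class ArticleParser:
    # Hypothetical "safe" subset left for general article editing.
    features = frozenset({"bold", "italic", "link", "heading", "reference"})

    def allows(self, feature):
        return feature in self.features


class TemplateParser(ArticleParser):
    # Inherits everything articles can do, plus features moved out of
    # article space (names invented for this sketch).
    features = ArticleParser.features | {"parser_function", "parameter"}


class LayoutParser(TemplateParser):
    # Inherits everything templates can do, plus the formatting
    # capabilities reserved for layouts.
    features = TemplateParser.features | {"raw_html", "inline_css", "block"}


# Hypothetical mapping from namespace to the parser that handles it.
PARSER_FOR_NAMESPACE = {
    "Article": ArticleParser,
    "Template": TemplateParser,
    "Layout": LayoutParser,
}


def can_use(namespace, feature):
    """Check whether markup in a namespace may use a given feature."""
    return PARSER_FOR_NAMESPACE[namespace]().allows(feature)
```

With this arrangement, moving a feature out of article space is just a matter of moving its name from one class to a subclass: can_use("Article", "raw_html") is False while can_use("Layout", "raw_html") is True, and everything an article can do still works in templates and layouts.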
The end result of this solution is that ad-hoc formatting using HTML and CSS is gone as far as most editors are concerned, complicated or easily misused markup features that might cause problems are removed from the parser that most users will interact with, and the most problematic markup is effectively reserved for experienced editors. A new interface can then be built at a fraction of the development cost, because the "hard stuff" is out of article space. This makes sense both for usability now and as a possible first step if we are ever going to change parsers, because the things that we probably couldn't convert would be "contained" rather than spread across articles.

-Steph