[Wikitech-l] Big problem to solve: good WYSIWYG on WMF wikis
[crossposted to foundation-l and wikitech-l]

"There has to be a vision though, of something better. Maybe something that is an actual wiki, quick and easy, rather than the template coding hell Wikipedia's turned into."

- something Fred Bauder just said on wikien-l.

Our current markup is one of our biggest barriers to participation. AIUI, edit rates are about half what they were in 2005, even as our fame has gone from popular through famous to part of the structure of the world. I submit that this is not a good or healthy thing in any way and needs fixing.

People who can handle wikitext really just do not understand how offputting the computer guacamole is to people who can cope with text they can see. We know this is a problem; WYSIWYG that works is something that's been wanted here forever.

There are various hideous technical nightmares in its way, which make this a big and hairy problem, of the sort where the hair has hair. However, I submit that it's important enough that we need to attack it with actual resources anyway.

This is just one data point, where a Canadian government office got *EIGHT TIMES* the participation in their intranet wiki by putting in a (heavily locally patched) copy of FCKeditor:

http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034062.html

"I have to disagree with you given my experience. In one government department where MediaWiki was installed we saw the active user base spike from about 1000 users to about 8000 users within a month of having enabled FCKeditor. FCKeditor definitely has its warts, but it very closely matches the experience non-technical people have gotten used to while using Word or WordPerfect. Leveraging skills people already have cuts down on training costs and allows them to be productive almost immediately."
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034071.html

"Since a plethora of intelligent people with no desire to learn WikiCode can now add content, the quality of posts has been in line with the adoption of wiki use by these people. Thus one would say it has gone up. In the beginning there were some hard-core users that learned WikiCode; for the most part they have indicated that when the WYSIWYG fails, they are able to switch to WikiCode mode to address the problem. This usually occurs with complex table nesting, which is something that few of the users do anyway. Most document layouts are kept simple.

Additionally, we have a multilingual English/French wiki. As a result the browser spell-check is insufficient for the most part (not to mention it has issues with WikiCode). To address this, a second spellcheck button was added to the interface so that both English and French spellcheck could be available within the same interface (via an aspell backend)."

So, the payoffs could be ridiculously huge: eight times the number of smart and knowledgeable people even being able to *fix typos* on material they care about.

Here are some problems. (Off the top of my head; please do add more, all you can think of.)

- The problem:

* Fidelity with the existing body of wikitext. No conversion flag day. The current body exploits every possible edge case in the regular-expression guacamole we call a parser. Tim said a few years ago that any solution has to account for the existing body of text.
* Two-way fidelity. Those who know wikitext will demand to keep it and will bitterly resist any attempt to take it away from them.
* FCKeditor (now CKEditor) in MediaWiki is all but unmaintained.
* There is no specification for wikitext. Well, there almost is - compiled as C, it runs a bit slower than the existing PHP parser. But it's a start!
http://lists.wikimedia.org/pipermail/wikitext-l/2010-August/000318.html

- Attempting to solve it:

* The best brains around Wikipedia, MediaWiki and WMF have dashed their foreheads against this problem for at least the past five years and have got *nowhere*. Tim has a whole section in the SVN repository for new parser attempts. Sheer brilliance isn't going to solve this one.
* Tim doesn't scale. Most of our other technical people don't scale. *We have no resources and still run on almost nothing*. ($14m might sound like enough money to run a popular website, but for comparison, I work as a sysadmin at a tiny, tiny publishing company with more money and staff just in our department than that, to do *almost nothing* compared to what WMF achieves. WMF is an INCREDIBLY efficient organisation.)

- Other attempts:

* Starting from a clear field makes it ridiculously easy. The government example quoted above is one. Wikia wrote a good WYSIWYG that works really nicely on new wikis (I'm speaking here as an experienced wikitext user who happily fixes random typos on Wikia). Of course, as I noted, we can't start from a clear field - we have an existing body of wikitext.

So, specification of the problem:

* We need good WYSIWYG. The government example suggests that a simple word-processor-like
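[The "two-way fidelity" requirement above has a precise shape: converting wikitext to an editable representation and back must be the identity on anything the user didn't touch. A toy illustration in Python - hypothetical code covering only a two-construct inline subset; real wikitext, with its edge cases, is vastly harder:]

```python
import re

def wikitext_to_html(text):
    # Tiny inline subset only: '''bold''' and ''italic''.
    text = re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)
    text = re.sub(r"''(.+?)''", r"<i>\1</i>", text)
    return text

def html_to_wikitext(html):
    # Inverse mapping for the same subset.
    html = re.sub(r"<b>(.+?)</b>", r"'''\1'''", html)
    html = re.sub(r"<i>(.+?)</i>", r"''\1''", html)
    return html

def round_trips(text):
    # The fidelity requirement: editing nothing must change nothing.
    return html_to_wikitext(wikitext_to_html(text)) == text
```

[Any real candidate editor would need this round-trip property over the entire existing corpus - which is exactly the "no conversion flag day" constraint.]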
Re: [Wikitech-l] dataset1, xml dumps
On Wed, Dec 15, 2010 at 4:56 PM, Ariel T. Glenn ar...@wikimedia.org wrote:

> We want people besides us to host it. We expect to put a copy at the new data center (at least), as well.

Does anyone know if the Wikipedia XML Data AWS Public Dataset [1] is being routinely updated? It's showing a last update of September 29, 2009 1:09 AM GMT, but perhaps that's just the last update to the dataset metadata? I guess I could mount the EBS volume to check myself... It might be nice if the database dumps were included as well, I guess.

//Ed

[1] http://aws.amazon.com/datasets/2506

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Sentence-level editing usability videos
Hi all,

A short note for everyone interested in usability testing: the videos of the usability testing of sentence-level editing are available on Wikimedia Commons. Videos are in Dutch; transcripts/notes are in English.

http://commons.wikimedia.org/wiki/Category:Sentence-level_editing

Best wishes,
Jan Paul
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
On 28 December 2010 16:06, Victor Vasiliev vasi...@gmail.com wrote:

> I have thought about a WYSIWYG editor for Wikipedia and found it technically impossible. The main and key problem of WYSIWYG is templates. You have to understand that templates are not a single element of Wikipedia syntax; they are an integral part of page markup. You do not insert an infobox template, you insert the infobox *itself*, and from what I heard the templates were the main concern of many editors who were scared of wikitext. Now think of how many templates there are in Wikipedia, how frequently they are changed and how much time it would take to implement their editing.

Yes. So how do we sensibly - usably - deal with templates in a word-processor-like layout? Is there a way that passes usability muster for non-geeks?

How do others do it? Do their methods actually work? e.g. Wikia has WYSIWYG editing and templates. They have a sort of solution to template editing in WYSIWYG. It's not great, but people sort of cope. How did they get there? What can be done to make it better, *conceptually*?

What I'm saying there is that we don't start from the assumption that we know nothing and have to start from scratch, forming our answers only from pure application of personal brilliance; we should start from the assumption that we actually know quite a bit, if we only know who to ask and where. Does it require throwing out all previous work? etc., etc.

And this is the sort of question that requires actual expense on resources to answer. Given that considerable work has gone on already, what would we do with resources to apply to the problem?

- d.
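[One conceptual answer to the template problem discussed above can be sketched: treat each template invocation as an opaque, atomic node in the editor's document model, shown as a placeholder and never parsed into the visual text. This is a hypothetical Python sketch, not Wikia's or MediaWiki's actual code, and it ignores nested templates:]

```python
import re

# Top-level template calls only; nested {{...}} inside parameters
# would need a real brace-matching parser.
TEMPLATE_RE = re.compile(r"\{\{[^{}]*\}\}")

def to_document_model(wikitext):
    """Split wikitext into editable text nodes and atomic template nodes."""
    nodes, pos = [], 0
    for m in TEMPLATE_RE.finditer(wikitext):
        if m.start() > pos:
            nodes.append(("text", wikitext[pos:m.start()]))
        nodes.append(("template", m.group()))  # kept verbatim, not editable
        pos = m.end()
    if pos < len(wikitext):
        nodes.append(("text", wikitext[pos:]))
    return nodes

def to_wikitext(nodes):
    # Serialization is lossless because template nodes were never parsed.
    return "".join(value for _, value in nodes)
```

[The design point: the WYSIWYG surface only has to be faithful for the text *between* templates; double-clicking a placeholder could open a parameter form, while the raw template wikitext survives untouched.]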
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
On Tue, Dec 28, 2010 at 8:43 AM, David Gerard dger...@gmail.com wrote:

> e.g. Wikia has WYSIWYG editing and templates. They have a sort of solution to template editing in WYSIWYG. It's not great, but people sort of cope. How did they get there? What can be done to make it better, *conceptually*? What I'm saying there is that we don't start from the assumption that we know nothing and have to start from scratch, forming our answers only from pure application of personal brilliance; we should start from the assumption that we actually know quite a bit, if we only know who to ask and where. Does it require throwing out all previous work? etc., etc. And this is the sort of question that requires actual expense on resources to answer. Given that considerable work has gone on already, what would we do with resources to apply to the problem?

My primary interest at the moment in this area is to reframe the question a bit; rather than "how do we make good WYSIWYG that works on the way Wikipedia pages' markup and templates are structured now" -- which we know has been extremely hard to get going -- to instead consider "how do we make good WYSIWYG that does the sorts of things we currently use markup and templates for, plus the things we wish we could do that we can't?"

We have indeed learned a *huge* amount from the last decade of Wikipedia and friends, among them:

* authors and readers crave advanced systems for data format-sharing (eg putting structured info into infoboxes) and interactive features (even just sticking a marker on a map!)
* most authors prefer simplicity of editing (keep the complicated stuff out of the way until you need it)
* some authors will happily dive into hardcore coding to create the tools they need (templates, user/site JS, gadgets)
* many other authors will very happily use those tools once they're created
* the less the guts of those tools are exposed, the easier it is for other people to reuse them

The incredible creativity of Wikimedians in extending the frontend capabilities of MediaWiki through custom JavaScript, and the markup system through templates, has been blowing my mind for years. I want to find a way to point that creativity straight forward, as it were, and use it to kick some ass. :)

Within the Wikimedia ecosystem, we can roughly divide the world into Wikipedia and all the other projects. MediaWiki was created for Wikipedia, based on previous software that had been adapted to the needs of Wikipedia; and while the editing and template systems are sometimes awkward, they work. Our other projects like Commons, Wiktionary, Wikibooks, Wikiversity, and Wikinews have *never* been as well served. The freeform markup model -- which works very well for body text on Wikipedia even if it's icky for creating tables, diagrams and information sets -- has been a poorer fit, and little effort has been spent on actually creating ways to support them well.

Commons needs better tools for annotating and grouping media resources. Wiktionary needs structured data with editing and search tools geared towards it. Wikibooks needs a structure model that's based on groups of pages and media resources, instead of just standalone freetext articles which may happen to link to each other. Wikiversity needs all those, and more interactive features and the ability for users to group themselves socially and work together.
Getting anything done that would work on the huge, well-developed, wildly popular Wikipedia has always been a non-starter, because it has to deal with 10 years of backwards compatibility from the get-go. I think it's going to be a *lot* easier to get things going on those smaller projects which are now so poorly served that most people don't even know they exist. :)

This isn't a problem specific to Wikimedia; established organizations of all sorts have a very difficult time getting new ideas over that hump from "not good enough for our core needs" to "*bam* slap it everywhere". By concentrating on the areas that aren't served at all well by the current system, we can make much greater headway in the early stages of development; Clayton Christensen's The Innovator's Dilemma calls this "competing against non-consumption".

For the Wikipedia case, we need to incubate the next generation of templating up to the point that it can actually undercut and replace today's wikitext templates, or I worry we're just going to be sitting around going "gosh I wish we could replace these templates and have markup that works cleanly in WYSIWYG" forever.

My current thoughts are to concentrate on a few areas:

1) create a widget/gadget/template/extension/plugin model built around embedding blocks of information within a larger context...
2) ...where the data and rendering can be reasonably separate... (eg, not having to pull tricks where you manually mix different levels of table templates to make the infobox work right)
3) ...and the rendering can be as simple, or as fancy, as your imagination and HTML5 allow.
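[Brion's "data and rendering reasonably separate" point can be made concrete. A hedged Python sketch with invented names - one possible shape for a widget model, not any actual MediaWiki design: the infobox is plain structured data, and the renderer is a separate, swappable function rather than table markup mixed into the template call:]

```python
# The data block: what an editor (or a WYSIWYG form) would actually edit.
infobox_data = {
    "type": "settlement",
    "name": "Example City",
    "population": 12345,
}

def render_infobox_html(data):
    # One possible renderer; a mobile skin or an editor preview could
    # supply a different one without touching the data at all.
    rows = "".join(
        "<tr><th>%s</th><td>%s</td></tr>" % (key, value)
        for key, value in data.items()
        if key != "type"  # "type" selects the renderer, it isn't displayed
    )
    return '<table class="infobox">%s</table>' % rows
```

[The point of the separation: today's infobox templates entangle the data with nested table-markup tricks, so neither a WYSIWYG editor nor a reuser can get at the data without re-parsing the presentation.]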
[Wikitech-l] StringFunctions on enwiki?
At some point in the past, it was determined that the StringFunctions extension (now part of the ParserFunctions extension) would be disabled on enwiki. I once saw a comment to the effect of: "if StringFunctions was turned on, it would only encourage people to start writing parsers in wikicode".

Maybe other people were already aware, but not me, that we have a set of hacked-up string functions on enwiki, for example [[Template:Str len]]. There's a whole category of them at [[Category:String manipulation templates]].

I'd like to know the current opinion of the server ops about these things. Is there any chance of StringFunctions being enabled? If not, should we feel free to work around it as these templates do? I'm writing to this list so it will be possible to link from enwiki to the mailing list archives, so responses on-list would be best.

- Carl
Re: [Wikitech-l] StringFunctions on enwiki?
I too don't understand precisely why string functions are so discouraged. I saw extremely complex templates built just to do (with a high server load, I suppose, in my ignorance...) what could be obtained with an extremely simple string function.

Alex
Re: [Wikitech-l] StringFunctions on enwiki?
On Tue, Dec 28, 2010 at 3:14 PM, Alex Brollo alex.bro...@gmail.com wrote:

> I too don't understand precisely why string functions are so discouraged. I saw extremely complex templates built just to do (with a high server load I suppose in my ignorance...) what could be obtained with an extremely simple string function.

This seems like it comes up every few months. I think the prevailing opinion on why StringFuncs wasn't ever going to be enabled was that wikimarkup has been bastardized enough as is, and StringFuncs would send the wiki into the next circle of markup syntax hell, as it would be giving editors more rope to hang themselves with.
Re: [Wikitech-l] StringFunctions on enwiki?
Alex Brollo wrote:

> I too don't understand precisely why string functions are so discouraged. [...]

https://bugzilla.wikimedia.org/show_bug.cgi?id=6455#c92 (and subsequent comments)

MZMcBride
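[The asymmetry this subthread is arguing about can be caricatured in Python. This is a hedged analogue only - [[Template:Str len]]'s actual on-wiki mechanics (padleft tricks and friends) differ in detail - but it illustrates why a built-in string function is one native call while the template workaround burns an expansion per character probed:]

```python
def len_builtin(s):
    # What a parser-function approach amounts to: one native operation.
    return len(s)

def len_template_style(s, limit=500):
    # Rough analogue of the on-wiki workaround: with no string
    # primitives, length is found by probing one position at a time
    # up to a fixed limit -- O(limit) template expansions per call.
    for i in range(limit + 1):
        if s[i:i + 1] == "":
            return i
    return limit
```

[Both sides of the thread agree on the facts here; they disagree on whether handing editors the cheap version encourages "writing parsers in wikicode".]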
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
On 28 December 2010 16:54, Stephanie Daugherty sdaughe...@gmail.com wrote:

> Not only is the current markup a barrier to participation, it's a barrier to development. As I argued on Wikien-l, starting over with a markup that can be syntactically validated, preferably one that is XML-based, would reap huge rewards in the safety and effectiveness of automated tools - authors of tools like AWB have just as much trouble making software handle the corner cases in wikitext markup as new editors have understanding it.

In every discussion so far, throwing out wikitext and replacing it with something that isn't a crawling horror has been considered a non-starter, given ten years and terabytes of legacy wikitext. If you think you can swing throwing out wikitext and barring the actual code from human editing - XML is not safely human-editable in any circumstances - then good luck to you, but I don't like your chances.

- d.
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
On Tue, Dec 28, 2010 at 3:43 PM, David Gerard dger...@gmail.com wrote:

> On 28 December 2010 16:54, Stephanie Daugherty sdaughe...@gmail.com wrote:
>> Not only is the current markup a barrier to participation, it's a barrier to development. [...]
>
> In every discussion so far, throwing out wikitext and replacing it with something that isn't a crawling horror has been considered a non-starter, given ten years and terabytes of legacy wikitext. If you think you can swing throwing out wikitext and barring the actual code from human editing - XML is not safely human editable in any circumstances - then good luck to you, but I don't like your chances.

That is true - "We can't do away with wikitext" has always been the intermediate conclusion (in between "My god, we need to do something about this problem" and "This is hopeless, we give up" again). Perhaps it's time to start some exercises in non-Euclidean wiki development: just assume the opposite and see what happens.

--
-george william herbert
george.herb...@gmail.com
Re: [Wikitech-l] Big problem to solve: good WYSIWYG on WMF wikis
There are some things that we know:

1) As Brion says, MediaWiki currently only presents content in one way: as wikitext run through the parser. He may well be right that there is a bigger fish to be caught than WYSIWYG editing by saying that MW should present data in other new and exciting ways, but that's actually a separate question. *If* you wish to solve WYSIWYG editing, your baseline is wikitext and the parser.

2) "Guacamole" is one of the more unusual descriptors I've heard for the parser, but it's far from the worst. We all agree that it's horribly messy, and most developers treat it like either a sleeping dragon or a *very* grumpy neighbour. I'd say that the two biggest problems with it are that a) it's buried so deep in the codebase that literally the only way to get your wikitext parsed is to fire up the whole of the rest of MediaWiki around it to give it somewhere comfy to live in, and b) there is, as David says, no way of explaining what it's supposed to be doing except saying "follow the code; whatever it does is what it's supposed to do". It seems to be generally accepted that it is *impossible* to represent everything the parser does in any standard grammar.

Those are all standard gripes, and nothing new or exciting. There are also, to quote a much-abused former world leader, some known unknowns:

1) we don't know how to explain What You See when you parse wikitext except by prodding an exceedingly grumpy hundred thousand lines of PHP and *asking What it thinks* You Get.
2) we don't know how to create a WYSIWYG editor for wikitext.

Now, I'd say we have some unknown unknowns:

1) *is* it because of wikitext's idiosyncrasies that WYSIWYG is so difficult? Is wikitext *by its nature* not amenable to WYSIWYG editing?
2) would a wikitext which *was* representable in a standard grammar be amenable to WYSIWYG editing?
3) would a wikitext which had an alternative parser, one that was not buried in the depths of MW (perhaps a full JS library that could be called in real time on the client), be amenable to WYSIWYG editing?
4) are questions 2 and 3 synonymous?

--HM

David Gerard dger...@gmail.com wrote in message news:aanlktimthux-undo1ctnexcrqbpp89t2m-pvha6fk...@mail.gmail.com...

> [crossposted to foundation-l and wikitech-l] There has to be a vision though, of something better. [...]
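[HM's question 3 - a parser not buried in the depths of MW - is easy to sketch for a tiny, regular subset of wikitext, which is precisely what makes the full problem deceptive. A hypothetical, standalone Python tokenizer (JavaScript would be the real client-side target; templates, nesting and the apostrophe edge cases are exactly what this ignores):]

```python
import re

# A tiny, *regular* wikitext subset: [[link|label]] and '''bold'''.
TOKEN_RE = re.compile(
    r"(?P<link>\[\[(?P<target>[^|\]]+)(?:\|(?P<label>[^\]]+))?\]\])"
    r"|(?P<bold>'''(?P<btext>.+?)''')"
)

def parse_inline(text):
    """Return a flat list of (kind, value) tokens, no MediaWiki required."""
    tokens, pos = [], 0
    for m in TOKEN_RE.finditer(text):
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        if m.group("link"):
            tokens.append(("link", m.group("label") or m.group("target")))
        else:
            tokens.append(("bold", m.group("btext")))
        pos = m.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

[A token stream like this is trivially amenable to WYSIWYG; the open question in the thread is whether the *real* grammar, with its context-dependent apostrophe counting and template expansion, can ever be reduced to anything this shape.]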
Re: [Wikitech-l] Big problem to solve: good WYSIWYG on WMF wikis
Hi,

When this topic was raised a few years ago (I don't remember which time; it's been continually discussed throughout the years) I found one idea especially interesting, but it got buried in the mass. From memory and imagination:

The idea is to write a new parser that is not buried deep in MediaWiki, and can therefore be used apart from MediaWiki and is fairly easy to translate to, for example, JavaScript. This parser accepts similar input to what we have now (ie. '''bold''', {{template}}, [[link|text]] etc.), however totally rewritten and with more logical behaviour. Call it a "2.0" parser, without any worries about compatibility with old wikitext that (ab)uses the edge cases of the current parser. This would become the default in MediaWiki for newly created pages, indicated by an int in the revision table (ie. rev_pv, for "parser version").

A WYSIWYG editor can be written for this in JavaScript, and it's great. So what about articles with the old parser (ie. rev_pv=NULL / rev_pv=1)? No problem: the old parser sticks around for a while, and such articles simply don't have a WYSIWYG editor. Editing articles with the old parser will show a small notice on top (like the one for pages larger than x bytes, due to old browser limits) offering an option to 'switch'. That would result in previewing the page's wikitext with the new parser. The user can then make adjustments as needed to make it look good again (if necessary at all) and save the page (which saves the new revision with rev_pv=2, like it would do for new articles).

Since there are lots of articles which will likely produce the same HTML output and require no modification whatsoever, there could be a script written (either as a user bot for the end user or as a maintenance script) that would automatically check all pages that have the old rev_pv, compare them to the output of the new parser, and automatically update the rev_pv field if it matches.
All others would be visible on a special page listing pages whose last revision has an older version of the parser, with a link to a MediaWiki.org page giving an overview of a few things that regulars may want to know (ie. the most common differences).

Just an idea :)

-- Krinkle
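[Krinkle's maintenance-script idea reduces to a simple loop. A hedged Python sketch with stand-in parser callables and an in-memory revision list - not real MediaWiki APIs: pages whose old-parser and new-parser output already match get rev_pv bumped automatically; the rest are queued for the special page.]

```python
def migrate(revisions, parse_v1, parse_v2):
    """Auto-upgrade revisions whose rendering is identical under both parsers.

    revisions: list of dicts with "id", "text", "rev_pv" keys (stand-in
    for the revision table). parse_v1/parse_v2: old and new parsers.
    """
    upgraded, needs_review = [], []
    for rev in revisions:
        if rev["rev_pv"] == 2:
            continue  # already on the new parser
        if parse_v1(rev["text"]) == parse_v2(rev["text"]):
            rev["rev_pv"] = 2  # identical output: safe to switch silently
            upgraded.append(rev["id"])
        else:
            needs_review.append(rev["id"])  # list on the special page
    return upgraded, needs_review
```

[The attraction of the scheme is that the expensive part - human review - is only spent on the pages where the parsers actually disagree.]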
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
Hi Brion,

Thanks for laying out the problem so clearly! I agree wholeheartedly that we need to avoid thinking about this problem too narrowly as a user interface issue on top of existing markup+templates. More inline:

On Tue, Dec 28, 2010 at 9:27 AM, Brion Vibber br...@pobox.com wrote:

> This isn't a problem specific to Wikimedia; established organizations of all sorts have a very difficult time getting new ideas over that hump from "not good enough for our core needs" to "*bam* slap it everywhere". By concentrating on the areas that aren't served at all well by the current system, we can make much greater headway in the early stages of development; Clayton Christensen's The Innovator's Dilemma calls this "competing against non-consumption".

Thankfully, at least we're not trying to defend a business model and cost structure that's fundamentally incompatible with making a change here. However, I know that's not the part you're highlighting, and I agree that Christensen's "competing against non-consumption" concept is well worth learning about in this context[1], as are the concepts of disruptive innovation vs continuous innovation[2].

As you've said, we've learned a lot in the past decade of Wikipedia about how people use the technology. A new editing model that incorporates that learning will almost certainly take a while to reach full parity in flexibility, power, and performance. The current editor base of English Wikipedia probably won't be patient with any changes that result in a loss of flexibility, power and performance. Furthermore, many (perhaps even most) things we'd be inclined to try would *not* have a measurable and traceable impact on new editor acquisition and retention, which will further diminish patience. A mature project like Wikipedia is a hard place to hunt for willing guinea pigs.
> For the Wikipedia case, we need to incubate the next generation of templating up to the point that it can actually undercut and replace today's wikitext templates, or I worry we're just going to be sitting around going "gosh I wish we could replace these templates and have markup that works cleanly in wysiwyg" forever. My current thoughts are to concentrate on a few areas:
>
> 1) create a widget/gadget/template/extension/plugin model built around embedding blocks of information within a larger context...
> 2) ...where the data and rendering can be reasonably separate... (eg, not having to pull tricks where you manually mix different levels of table templates to make the infobox work right)
> 3) ...and the rendering can be as simple, or as fancy, as your imagination and HTML5 allow.

Let me riff on what you're saying here (partly just to confirm that I understand fully what you're saying). It'd be very cool to have the ability to declare a single article, or probably more helpfully, a single revision of an article, to use a completely different syntax. There's already, technically, a kludgy model for that now: wrap the whole thing in a tag, and put the parser for the new syntax in a tag extension.

That said, it would probably exacerbate our problems if we allowed intermixing of old syntax and new syntax in a single revision. The goal should be to move articles irreversibly toward a new model, and I don't think it'd be possible to do this without tools to prevent us from backsliding (for example, tools that allow editors to convert an article from old syntax to new syntax, and also tools that allow administrators to lock down the syntax choice for an article without locking down the article).

Still, it's pretty alluring to think about the upgrade of syntax as an incremental problem within an article. We could figure out how to solve one little corner of the data/rendering separation problem and then move on to the next.
For example, we could start with citations, and make sure it's possible to insert citations easily and cleanly, and to extract citations from an article without relying on scraping the HTML to get them. Or maybe we do that for certain types of infoboxes instead, and then gradually get more general. We can take advantage of the fact that we've got millions of articles to help us choose which particular types of data will benefit from a targeted approach, tailor extensions to very specific data problems, and then generalize after we sort out what works/doesn't work with a few specific cases.

So, which problem first?

Rob

[1] Those with an aversion to business-speak will require steely fortitude even to click on the URL, let alone actually read the article, but it's still worth extracting the non-business points from it: http://businessinnovationfactory.com/weblog/christensen_worldinnovationforum
[2] While there is a Wikipedia article describing this[3], a better description of the important bits is here: http://www.mail-archive.com/haskell@haskell.org/msg18498.html
[3] Whee, a footnote to a footnote!
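[Rob's citation example - getting citations out of an article without scraping the rendered HTML - is tractable at the wikitext level even today, for the simple cases. A hypothetical Python sketch that pulls {{cite ...}} calls and their named parameters straight from the markup; it deliberately ignores nested templates inside parameter values, which a real tool would have to handle:]

```python
import re

# Top-level {{cite <type>|...}} calls with no nested braces inside.
CITE_RE = re.compile(r"\{\{\s*cite\s+(\w+)\s*\|([^{}]*)\}\}", re.IGNORECASE)

def extract_citations(wikitext):
    """Return citations as structured data: [{"type": ..., "params": {...}}]."""
    citations = []
    for kind, body in CITE_RE.findall(wikitext):
        params = {}
        for part in body.split("|"):
            if "=" in part:
                key, _, value = part.partition("=")
                params[key.strip()] = value.strip()
        citations.append({"type": kind.lower(), "params": params})
    return citations
```

[The point of picking citations first is visible here: the data is already almost structured, so a targeted extension could own this one corner cleanly before anyone tries to generalize.]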
Re: [Wikitech-l] [Foundation-l] Big problem to solve: good WYSIWYG on WMF wikis
(Lying on the ground in the foetal position sobbing gently ... poor poor Wiksource, forgotten again.) Wikisource - we have tried to get the source and structure by regulating the spaces that we can, however, formalising template fields to forms would be great ... * extension for DynamicPageList (previously rejected) * search engines that work with transcluded text * extension for music notation (Lilypond?) * pdf text extraction tool to be implemented * good metadata tools * bibliographic tools, especially tools that allow sister cross-references * book-making tools that work with transcluded text * tools that allow What links here across all of WMF ... Hell, I could even see that text from WS references could be framed and transcluded to WP, and provide a ready link back to the works at the sites. Same for WQ to transclude quotes from a WS reference text, ready links from Wiktionary to usage in WS books. That should be the value of a wiki and sister sites. Regards, Andrew On 28 Dec 2010 at 9:27, Brion Vibber wrote: On Tue, Dec 28, 2010 at 8:43 AM, David Gerard dger...@gmail.com wrote: e.g. Wikia has WYSIWYG editing and templates. They have a sort of solution to template editing in WYSIWYG. It's not great, but people sort of cope. How did they get there? What can be done to make it better, *conceptually*? What I'm saying there is that we don't start from the assumption that we know nothing and have to start from scratch, forming our answers only from pure application of personal brilliance; we should start from the assumption that we know actually quite a bit, if we only know who to ask and where. Does it require throwing out all previous work? etc., etc. And this is the sort of question that requires actual expense on resources to answer. Given that considerable work has gone on already, what would we do with resources to apply to the problem? 
My primary interest at the moment in this area is to reframe the question a bit; rather than how do we make good WYSIWYG that works on the way Wikipedia pages' markup and templates are structured now -- which we know has been extremely hard to get going -- to instead consider how do we make good WYSIWYG that does the sorts of things we currently use markup and templates for, plus the things we wish we could do that we can't? We have indeed learned a *huge* amount from the last decade of Wikipedia and friends, among them:
* authors and readers crave advanced systems for data format-sharing (e.g. putting structured info into infoboxes) and interactive features (even just sticking a marker on a map!)
* most authors prefer simplicity of editing (keep the complicated stuff out of the way until you need it)
* some authors will happily dive into hardcore coding to create the tools they need (templates, user/site JS, gadgets)
* many other authors will very happily use those tools once they're created
* the less the guts of those tools are exposed, the easier it is for other people to reuse them
The incredible creativity of Wikimedians in extending the frontend capabilities of MediaWiki through custom JavaScript, and the markup system through templates, has been blowing my mind for years. I want to find a way to point that creativity straight forward, as it were, and use it to kick some ass. :) Within the Wikimedia ecosystem, we can roughly divide the world into Wikipedia and all the other projects. MediaWiki was created for Wikipedia, based on previous software that had been adapted to the needs of Wikipedia; and while the editing and template systems are sometimes awkward, they work. Our other projects like Commons, Wiktionary, Wikibooks, Wikiversity, and Wikinews have *never* been as well served.
The freeform markup model -- which works very well for body text on Wikipedia even if it's icky for creating tables, diagrams and information sets -- has been a poorer fit, and little effort has been spent on actually creating ways to support them well. Commons needs better tools for annotating and grouping media resources. Wiktionary needs structured data with editing and search tools geared towards it. Wikibooks needs a structure model that's based on groups of pages and media resources, instead of just standalone freetext articles which may happen to link to each other. Wikiversity needs all those, and more interactive features and the ability for users to group themselves socially and work together. Getting anything done that would work on the huge, well-developed, wildly-popular Wikipedia has always been a non-starter because it has to deal with 10 years of backwards-compatibility from the get-go. I think it's going to be a *lot* easier to get things going on those smaller projects which are now so poorly served that most people don't even know they exist. :) This isn't a problem specific to Wikimedia; established organizations of all sorts
[Wikitech-l] Does anybody have the 20080726 dump version?
Hi all, I have looked through the web for the 20080726 version of the dump file pages-articles.xml.bz2, but I can't find any result. Can anybody provide me a download link? Thanks a lot! Following is a summary of the other versions of this file that I have found so far; I hope they are useful to you.
2010-10-11 http://download.wikimedia.org/enwiki/20101011/
2010-09-16 http://download.wikimedia.org/enwiki/20100916/
2010-09-04 http://download.wikimedia.org/enwiki/20100904/
2010-08-17 http://download.wikimedia.org/enwiki/20100904/ enwiki-20100817-pages-articles.xml.bz2 (6.06 GiB) on monova.org: http://www.monova.org/details/3873361/enwiki-20100817-pages-articles.xml.bz2.html
2010-07-30 enwiki-20100730-pages-articles.xml.bz2 (6.07 GiB) on monova.org: http://www.monova.org/details/3869561/enwiki-2010730-pages-articles.xml.bz2.html
2010-05-14 enwiki-20100514-pages-articles.xml.bz2 (5.87 GiB) on monova.org: http://www.monova.org/details/3780808/enwiki-20100514-pages-articles.xml.bz2.html and http://dumps.wikimedia.org/archive/enwiki/20100514/
2010-03-12 http://dumps.wikimedia.org/archive/enwiki/20100312/
2010-01-30 http://download.wikimedia.org/enwiki/20100130/
2009-10-09 http://jeffkubina.org/data/download.wikimedia.org/enwiki/20091009/
2009-06-18 http://download.wikimedia.org/enwiki/20100130/ The Pirate Bay (http://thepiratebay.org/torrent/4978482) has enwiki-20090618-pages-articles.xml.bz2, 4.9 GiB (5258589574 bytes): http://torrents.thepiratebay.org/4978482/enwiki-20090618-pages-articles.xml.bz2.4978482.TPB.torrent
2008-10-08 http://jeffkubina.org/data/download.wikimedia.org/enwiki/20081008/
2008-06-21 http://www.torrentportal.com/details/4621368/Wikipedia+Wiki+Static+HTML+Dump+-+English+-+2008-06-21+-+wikipedia-en-static-html.tar.7z.html
2008-01-03 http://jeffkubina.org/data/download.wikimedia.org/enwiki/20080103/ (English Wikipedia dump from 2008-01-03: http://www.archive.org/details/enwiki-20080103)
Best, Monica ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Does anybody have the 20080726 dump version?
On Wed, Dec 29, 2010 at 12:16 AM, Monica shu monicashu...@gmail.com wrote: Hi all, I have looked through the web for the 20080726 version of the dump file pages-articles.xml.bz2. But I can't find any result. Can anybody provide me a download link? Thank a lot! True story: I used to have a copy of the 20080726 dump. I deleted it like a year ago because I didn't need it anymore and I didn't know it had gone missing at the time. I should ask next time :( -Chad
[Wikitech-l] How would you disrupt Wikipedia?
I've been inspired by the discussion David Gerard and Brion Vibber kicked off, and I think they are headed in the right direction. But I just want to ask a separate, but related question. Let's imagine you wanted to start a rival to Wikipedia. Assume that you are motivated by money, and that venture capitalists promise you can be paid gazillions of dollars if you can do one, or many, of the following: 1 - Become a more attractive home to the WP editors. Get them to work on your content. 2 - Take the free content from WP, and use it in this new system. But make it much better, in a way Wikipedia can't match. 3 - Attract even more readers, or perhaps a niche group of super-passionate readers that you can use to build a new community. In other words, if you had no legacy, and just wanted to build something from zero, how would you go about creating an innovation that was disruptive to Wikipedia, in fact something that made Wikipedia look like Friendster or Myspace compared to Facebook? And there's a followup question to this -- but you're all smart people and can guess what it is. -- Neil Kandalgaonkar ( ne...@wikimedia.org
Re: [Wikitech-l] Big problem to solve: good WYSIWYG on WMF wikis
On 29 December 2010 02:07, Happy-melon happy-me...@live.com wrote: There are some things that we know:

1) as Brion says, MediaWiki currently only presents content in one way: as wikitext run through the parser. He may well be right that there is a bigger fish which could be caught than WYSIWYG editing by saying that MW should present data in other new and exciting ways, but that's actually a separate question. *If* you wish to solve WYSIWYG editing, your baseline is wikitext and the parser.

Specifically, it only presents content as HTML. It's not really a parser, because it doesn't create an AST (Abstract Syntax Tree); it's a wikitext-to-HTML converter. The flavour of the HTML can be somewhat modulated by the skin, but it could never output directly to something totally different like RTF or PDF.

2) guacamole is one of the more unusual descriptors I've heard for the parser, but it's far from the worst. We all agree that it's horribly messy, and most developers treat it like either a sleeping dragon or a *very* grumpy neighbour. I'd say that the two biggest problems with it are that a) it's buried so deep in the codebase that literally the only way to get your wikitext parsed is to fire up the whole of the rest of MediaWiki around it to give it somewhere comfy to live in,

I have started to advocate the isolation of the parser from the rest of the innards of MediaWiki for just this reason: https://bugzilla.wikimedia.org/show_bug.cgi?id=25984

Free it up so that anybody can embed it in their code and get exactly the same rendering that Wikipedia et al get, guaranteed. We have to find all the edges where the parser calls other parts of MediaWiki and all the edges where other parts of MediaWiki call the parser. We then define these edges as interfaces so that we can drop an alternative parser into MediaWiki, and drop the current parser into, say, an offline viewer or whatever.
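The "define the edges as interfaces" idea can be sketched in miniature. This is a hypothetical illustration in Python, not real MediaWiki code: every class and method name here is invented. The parser talks to its host only through a narrow interface, so any conforming parser can be dropped into the wiki, and the wiki's parser can be dropped into an offline viewer.

```python
import re
from abc import ABC, abstractmethod

class WikiEnvironment(ABC):
    """Everything the parser is allowed to ask the host application for.
    In real MediaWiki this edge would cover template lookup, config,
    magic words, etc.; one method stands in for all of that here."""
    @abstractmethod
    def get_template(self, name: str) -> str: ...

class WikitextParser(ABC):
    """The other edge: what the host is allowed to ask the parser for."""
    @abstractmethod
    def render(self, wikitext: str, env: WikiEnvironment) -> str: ...

class NaiveParser(WikitextParser):
    """Toy stand-in implementation: handles only '''bold''' markup."""
    def render(self, wikitext: str, env: WikiEnvironment) -> str:
        return re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)

class EmptyEnv(WikiEnvironment):
    """An environment with no templates -- e.g. an offline viewer."""
    def get_template(self, name: str) -> str:
        return ""

print(NaiveParser().render("'''MediaWiki''' rocks", EmptyEnv()))
# → <b>MediaWiki</b> rocks
```

The design payoff is that the parser's dependencies become an explicit, testable contract rather than calls scattered through the codebase, which is exactly what makes unit testing and alternative implementations feasible.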
With a freed-up parser, more people will hack on it, more people will come to grok it and come up with strategies to address some of its problems. It should also be a boon for unit testing. (I have a very rough prototype working, by the way, with lots of stub classes.)

and b) there is, as David says, no way of explaining what it's supposed to be doing except saying follow the code; whatever it does is what it's supposed to do. It seems to be generally accepted that it is *impossible* to represent everything the parser does in any standard grammar.

I've thought a lot about this too. It certainly is not any type of standard grammar. But on the other hand it is a pretty common kind of nonstandard grammar. I call it a recursive text replacement grammar. Perhaps this type of grammar has some useful characteristics we can discover and document. It may be possible to follow the code flow and document each text replacement in sequence as a kind of parser spec, rather than trying and failing again to shoehorn it into a standard LALR grammar. If it is possible to extract such a spec, it would then be possible to implement it in other languages. Some research may even find that it is possible to transform such a grammar deterministically into an LALR grammar... But even if not, I'm certain it would demystify what happens in the parser so that problems and edge cases would be easier to locate. Andrew Dunbar (hippietrail)

Those are all standard gripes, and nothing new or exciting. There are also, to quote a much-abused former world leader, some known unknowns: 1) we don't know how to explain What You See when you parse wikitext except by prodding an exceedingly grumpy hundred thousand lines of PHP and *asking What it thinks* You Get. 2) We don't know how to create a WYSIWYG editor for wikitext. Now, I'd say we have some unknown unknowns. 1) *is* it because of wikitext's idiosyncrasies that WYSIWYG is so difficult? Is wikitext *by its nature* not amenable to WYSIWYG editing?
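A "recursive text replacement grammar" in the sense Andrew describes can be shown in a few lines: instead of building a syntax tree, apply substitution passes to the text repeatedly until it stops changing. This toy Python sketch uses invented template names and a simplified `{{...}}` syntax; the real parser's replacement sequence is far more involved.

```python
import re

# Invented templates for the illustration. Note that expanding "greet"
# produces text that itself contains a template call -- this is what
# makes the replacement recursive.
templates = {
    "greet": "Hello, {{name}}!",
    "name": "world",
}

def expand(text: str, max_passes: int = 20) -> str:
    """Repeatedly replace {{tmpl}} calls until a fixpoint is reached."""
    for _ in range(max_passes):
        new = re.sub(
            r"\{\{(\w+)\}\}",
            lambda m: templates.get(m.group(1), m.group(0)),  # unknown: leave as-is
            text,
        )
        if new == text:   # fixpoint: nothing left to replace
            return new
        text = new
    return text           # pass budget exhausted, mirroring expansion-depth limits

print(expand("{{greet}}"))
# → Hello, world!
```

The fixpoint loop is also why such a grammar resists standard LALR treatment: what a token "means" can depend on text produced by an earlier replacement pass, not just on the original input.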
2) would a wikitext which *was* representable in a standard grammar be amenable to WYSIWYG editing? 3) would a wikitext which had an alternative parser, one that was not buried in the depths of MW (perhaps a full JS library that could be called in real-time on the client), be amenable to WYSIWYG editing? 4) are questions 2 and 3 synonymous? --HM David Gerard dger...@gmail.com wrote in message news:aanlktimthux-undo1ctnexcrqbpp89t2m-pvha6fk...@mail.gmail.com... [crossposted to foundation-l and wikitech-l] There has to be a vision though, of something better. Maybe something that is an actual wiki, quick and easy, rather than the template coding hell Wikipedia's turned into. - something Fred Bauder just said on wikien-l. Our current markup is one of our biggest barriers to participation. AIUI, edit rates are about half what they were in 2005, even as our fame has gone from popular through famous to part of the structure of the world. I submit that this is not a good or healthy thing in any way and needs fixing.
Re: [Wikitech-l] Does anybody have the 20080726 dump version?
@_...@... Thanks anyway :) Anyone else, hands up? On Wed, Dec 29, 2010 at 3:18 PM, Chad innocentkil...@gmail.com wrote: On Wed, Dec 29, 2010 at 12:16 AM, Monica shu monicashu...@gmail.com wrote: Hi all, I have looked through the web for the 20080726 version of the dump file pages-articles.xml.bz2. But I can't find any result. Can anybody provide me a download link? Thank a lot! True story: I used to have a copy of the 20080726 dump. I deleted it like a year ago because I didn't need it anymore and I didn't know it had gone missing at the time. I should ask next time :( -Chad