Ah, and I see that people did receive the original: it's just the archive that is broken. Thanks for that.
Cheers,
Karl

On Tue, Feb 1, 2011 at 11:19 PM, Andreas Jonsson <[email protected]> wrote:
>
> 2011-02-02 01:48, Karl Matthias wrote:
>> Apologies... even the second attempt was truncated, it seems. Here's
>> one final try.
>
> You were hit by the same problem I was a few days ago on this list. You
> have a line that starts with "From your" in the text.
>
> /Andreas
>
>> Karl
>> -----------
>> Alan Post wrote:
>> > Interesting. Is the PEG grammar available for this parser?
>> >
>> > -Alan
>>
>> It's at https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg
>>
>> Get peg/leg from http://piumarta.com/software/peg/
>>
>> I just tried it and already found a bug on the first Hello World (it
>> surrounds headers inside paragraphs). It strangely converts templates
>> into underscored words. They may be expecting some other parser piece
>> to restore them. I'm pretty sure there are corner cases in the
>> preprocessor (e.g. just looking at the peg file, they don't handle
>> mixed-case noincludes), but I don't think that should need to be
>> handled by the parser itself.
>>
>> The grammar looks elegant. I doubt it can really handle full wikitext.
>> But it would be so nice if it did...
>>
>> I'm one of the authors of the Kiwi parser and will be presenting it at
>> the Data Summit on Friday. The parser is pretty complete, but we could
>> certainly use some community support, and we encourage feedback and
>> participation! It is a highly functional tool already, but it could
>> use some polish. It does handle most wikitext, though not absolutely
>> everything.
>>
>> From your post I can see that you are experiencing a couple of design
>> decisions we made in writing this parser. We did not set out to match
>> the exact HTML output of MediaWiki, only to output something that will
>> look the same in the browser. This might not be the best approach,
>> but right now this is the case.
>> Our site doesn't have the same needs as Wikipedia, so when in doubt we
>> leaned toward what suited our needs rather than ultimate tolerance of
>> poor syntax (though it is somewhat flexible). Another design decision
>> is that everything you put in comes out wrapped in paragraph tags.
>> Usually this wraps the whole document, so if your whole document was
>> just a heading, then yes, it is wrapped in paragraph tags. This is
>> probably not the best way to handle it, but it's what the parser
>> currently does. Feel free to contribute a different solution.
>>
>> Templates, as you probably know, require full integration with an
>> application to work the way MediaWiki handles them, because they
>> require access to the data store, and possibly other configuration
>> information. We built a parser that works independently of the data
>> store (indeed, even on the command line, in a somewhat degenerate
>> form). In order to do that, we had to decouple template retrieval
>> from the parse. If you take a look at the Ruby FFI examples, you will
>> see a more elegant handling of templates (though it needs work). When
>> a document is parsed, the parser library makes available a list of
>> the templates that were found, the arguments passed to each template,
>> and the unique replacement tag in the document for inserting the
>> template once rendered. Those underscored tags that come out are not
>> a bug; they are those unique tags. There is a switch to disable
>> templates, in which case the parser just swallows them instead. So
>> the template handling workflow (simplistically) is:
>>
>> 1. Parse the original document and generate the list of templates,
>>    arguments, and replacement tags
>> 2. Fetch the first template; if no recursion is needed, insert it
>>    into the original document
>> 3. Fetch the next template, etc.
>>
>> We currently recurse 6 templates deep in the bindings we built for
>> AboutUs.org (sysop-only at the moment).
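[Editor's note: the workflow quoted above can be sketched roughly as below. This is a hypothetical Python sketch, not the Kiwi API; `parse()`, `fetch_template()`, and the `__tpl_0001__` tag format are illustrative stand-ins for what the real library and Ruby FFI bindings provide.]

```python
# Hypothetical sketch of the decoupled template workflow described above.
# The parser returns rendered output plus a list of templates it found;
# the application fetches each template body and substitutes its tag.

MAX_DEPTH = 6  # the AboutUs.org bindings recurse 6 templates deep


def parse(wikitext):
    """Stand-in for the Kiwi parse: returns (output, templates) where
    templates is a list of (name, args, replacement_tag) tuples."""
    # Fake a single template hit for illustration.
    if "{{Stub}}" in wikitext:
        out = wikitext.replace("{{Stub}}", "__tpl_0001__")
        return out, [("Stub", [], "__tpl_0001__")]
    return wikitext, []


def fetch_template(name):
    """Stand-in for a data-store lookup; the parser itself never does this."""
    store = {"Stub": "<em>This article is a stub.</em>"}
    return store.get(name, "")


def expand(wikitext, depth=0):
    """Step 1: parse and collect templates. Steps 2-3: fetch each template,
    expand it recursively, and splice it in at its replacement tag."""
    html, templates = parse(wikitext)
    if depth >= MAX_DEPTH:
        return html
    for name, args, tag in templates:
        rendered = expand(fetch_template(name), depth + 1)
        html = html.replace(tag, rendered)
    return html


print(expand("Hello {{Stub}} world"))
# Hello <em>This article is a stub.</em> world
```

With templates disabled (the switch mentioned above), a real binding would simply skip the fetch-and-substitute loop.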
>> Template arguments don't work right now, but it's fairly trivial to
>> implement; we just haven't done it yet.
>>
>> Like templates, images require some different solutions if the parser
>> is to be decoupled. Our parser does not re-size images, store them,
>> etc. It just works with image URLs. If your application requires
>> images to be regularized, you would need to implement resizing at
>> upload time, lazily at load time, or whatever works in your scenario.
>> More work is needed in this area, though if you check out
>> http://kiwi.drasticcode.com you can see that most image support is
>> working (no resizing). You can also experiment with the parser there
>> as needed.
>>
>> Hope that at least helps explain what we've done. Again, feedback and
>> particularly code contributions are appreciated!
>>
>> Cheers,
>> Karl
>>
>> _______________________________________________
>> Wikitext-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikitext-l
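[Editor's note: since the parser only emits image URLs, one way an application could regularize sizes lazily is to rewrite each source URL into a thumbnail URL served by a resizer endpoint. The sketch below is purely illustrative; the URL scheme and function name are assumptions, not part of Kiwi.]

```python
# Hypothetical sketch: map a source image URL and a requested width to a
# thumbnail URL that a lazy resizing endpoint would serve on first request.

def thumb_url(src_url, width):
    """Illustrative only: derive a thumbnail URL from a full-size image URL.
    The /thumb/<width>px- convention here is an assumption for the sketch."""
    base, _, name = src_url.rpartition("/")
    return f"{base}/thumb/{width}px-{name}"


print(thumb_url("http://example.com/images/Foo.png", 220))
# http://example.com/images/thumb/220px-Foo.png
```

Resizing at upload time instead would precompute these variants once, trading storage for request-time work.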
