> You're not going to get 100% compatibility moving from the multiple
> search/replace method into a single parse. 
> 
> Hooks embedded within the parser, like InternalParseBeforeLinks and
> ParserBeforeTidy, become impossible to do. 

True. I was thinking of "clean" tag hooks and parser functions. These should
continue to work with a minimum of modification. I don't mind the black magic
breaking.
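A "clean" tag hook in the sense above only sees the tag's contents and returns replacement text, without reaching into parser internals. A minimal sketch of that registration style (in Python rather than PHP; the names `register_tag_hook` and `run_hook` are illustrative, not MediaWiki identifiers):

```python
# Registry mapping tag names to callbacks. A "clean" hook receives only
# the literal contents and attributes of its tag and returns a string;
# it never touches the parser's internal state.
tag_hooks = {}

def register_tag_hook(tag, callback):
    """Register a callback for <tag>...</tag>."""
    tag_hooks[tag] = callback

def run_hook(tag, content, attrs=None):
    """Invoke the hook registered for `tag` on the literal contents."""
    return tag_hooks[tag](content, attrs or {})

# Example hook: render the contents of <math> as an image placeholder.
register_tag_hook("math", lambda content, attrs: f"<img alt='{content}'/>")
```

Because the hook's contract is just "string in, string out", it survives a parser rewrite unchanged; only hooks that depend on the parser's intermediate state (the black magic) break.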

>> That is, the grammar should NOT know about <ref>, not what it 
>> does, not even that it exists. It should simply have a 
>> facility that allows externam (php) code to handle the 
>> characters (unchanged!) between (some specific) tags.
> 
> Agreed, the grammar should know how to parse and correct tag-soup style
> HTML/XML that gets handed off to it.

Yes, though for the parser, there are three cases to consider for HTML/XML style
tags:

1) (Whitelisted) HTML tags, which can occur "soupy" and are more or less passed
through (or "tidied" into valid XHTML).
2) Other tags (potentially handled by an extension), which must match in pairs
exactly and cause the parser to take anything *in between* LITERALLY and pass it
to the extension for processing.
3) In case no such extension exists, the parser needs to go back, read the
*tags* literally, and then parse the text between the tags.
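The three cases above can be sketched as a single dispatch function. This is a hypothetical illustration in Python, not MediaWiki code; `HTML_WHITELIST`, `extension_hooks`, and `handle_tag` are made-up names for the purpose of the example:

```python
# Case 1: whitelisted HTML tags, passed through (contents still parsed).
HTML_WHITELIST = {"b", "i", "div", "span"}

# Case 2: extension tags; the callback gets the *literal* contents.
extension_hooks = {"ref": lambda text: f"[footnote: {text}]"}

def handle_tag(name, inner_text, parse):
    """Decide how to treat a matched <name>...</name> pair."""
    if name in HTML_WHITELIST:
        # Case 1: pass/tidy the tag through; its contents are wikitext.
        return f"<{name}>{parse(inner_text)}</{name}>"
    hook = extension_hooks.get(name)
    if hook is not None:
        # Case 2: hand the unchanged characters between the tags to the
        # extension, which returns replacement output.
        return hook(inner_text)
    # Case 3: no extension registered; emit the tags themselves literally
    # (escaped) and parse the text between them as normal wikitext.
    return f"&lt;{name}&gt;{parse(inner_text)}&lt;/{name}&gt;"
```

For example, `handle_tag("ref", "see page 3", lambda t: t)` takes the case-2 branch, while an unknown tag like `<foo>` falls through to case 3 and survives as escaped literal text.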

There's even a fourth case, namely magic tags like <nowiki> that have to be
known to the parser for special handling - these may also include <includeonly>,
<onlyinclude> and <noinclude>, though those might be handled by the
preprocessor; I'm not sure about that.

In the case of (some!) parser functions, it has to be considered that the
*output* of the extension would itself have to be parsed, inline. But that is
probably handled by the preprocessor - if that is indeed the case, there's
nothing to worry about.
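The "output must be parsed too" case amounts to re-expanding the function's result until it is stable. A toy sketch, assuming a made-up `{{name:arg}}` call syntax and invented functions (`echo_twice`, `wrap`) purely for illustration:

```python
import re

# Hypothetical parser functions. Note that `wrap` returns text that
# itself contains a function call, so its output must be expanded again.
FUNCS = {
    "echo_twice": lambda arg: arg + arg,
    "wrap": lambda arg: "{{echo_twice:" + arg + "}}",
}

# Matches the innermost {{name:arg}} calls (no nested braces in arg).
PATTERN = re.compile(r"\{\{(\w+):([^{}]*)\}\}")

def expand(text, limit=16):
    """Repeatedly expand parser-function calls until none remain.

    The depth limit guards against functions that expand forever,
    much like a preprocessor's expansion-depth limit.
    """
    for _ in range(limit):
        new = PATTERN.sub(lambda m: FUNCS[m.group(1)](m.group(2)), text)
        if new == text:
            return text
        text = new
    raise RecursionError("expansion too deep")
```

Here `expand("{{wrap:ab}}")` first produces `{{echo_twice:ab}}` and only on the second pass yields the final text, which is exactly the "output parsed inline" behaviour a preprocessor would provide.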

-- Daniel

_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
