Re: [Wikitech-l] Enabling some string functions
> Brian wrote:
>> You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.
>
> Then you shouldn't be presenting examples of how it can't be implemented reasonably in template programming. Almost any _reasonable programming language_ allows you to write ugly code if you so want. That doesn't prove the language is ugly. Nonetheless... it's ugly :)

Ugly or not, having a kind of scripting inside the pages can be very useful. It extends the flexibility of sites built on top of MediaWiki, and is probably one of the reasons MediaWiki is becoming more popular as a website engine around the world. Maybe a better syntax and a restriction to the template namespace would be a good thing, though. I personally liked the idea of a pre-parsed and checked limited subset of PHP operators for performance, though security may be an issue.

Dmitriy
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Enabling some string functions
Brian wrote:
> You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.

Then you shouldn't be presenting examples of how it can't be implemented reasonably in template programming. Almost any _reasonable programming language_ allows you to write ugly code if you so want. That doesn't prove the language is ugly. Nonetheless... it's ugly :)
Re: [Wikitech-l] Enabling some string functions
Aryeh Gregor wrote:
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw <roan.katt...@gmail.com> wrote:
>> The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.
>
> I thought this was fixed ages ago with the new preprocessor.

Yes it was fixed in 1.12 (late 2007), as I have repeatedly told this list. The new if parser function is passed a placeholder object which can be expanded on demand.

-- Tim Starling
Re: [Wikitech-l] Enabling some string functions
Brian wrote:
> They want the functionality and they are willing to sacrifice usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat out unreadable. We should not facilitate the writing of unreadable code.
>
> As an example, yesterday I wrote some code that basically says, check the doi and http template parameters and check to make sure they begin with http, and if not add it. In any reasonable sort of language that lends itself to a reasonable sort of implementation. But not with Parser and String Functions.
>
> #[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http://dx.doi.org/{{{5} {{#if:{{{2}}}| {{{2}{{#if:{{{4}}}|]|{{#if:{{{5}}}|] {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}
>
> There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.

I have seen this before. People use #if for everything even when there is a better way. Look at what you're doing:

{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}

{{#if:{{{5}}}|{{{5}}}}} means show parameter 5 if it is set, or show parameter 5 if it is not blank. In either case, {{{5|}}} would do the job. The parent #if is similar: parameter 4 if set, else parameter 5. {{{4| {{{5|}}} }}} would do the job.

Template default parameters were here long before ParserFunctions. But people prefer using ugly #ifs, making the syntax more unreadable (and counting against the preprocessor limits).

Another common abuse is to do:

{{#if: {{{Foo}}}| <tr><td>Foo: </td><td>{{{Foo}}} </td></tr> }}

I'd like to have a feature in the parser to mark a section to be skipped if the inner parameter is not set, without having to use #ifs everywhere.
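The difference between the #if form and the default-parameter form can be sketched with a toy Python model. This is not MediaWiki code; `param`, `pf_if`, `with_if` and `with_defaults` are hypothetical names, and the model only assumes two documented behaviours: a parameter default applies when the parameter was not passed at all, and #if tests whether its first argument is non-empty after trimming.

```python
def param(args, name, default=None):
    """Model {{{name|default}}}: the default is used only when the
    parameter was not passed at all. Without a default, wikitext
    leaves the literal text {{{name}}} in place."""
    if name in args:
        return args[name]
    if default is None:
        return "{{{" + name + "}}}"
    return default


def pf_if(test, then="", otherwise=""):
    """Model {{#if:test|then|otherwise}}: true when test is non-blank."""
    return then if test.strip() else otherwise


def with_if(args):
    # {{#if:{{{4|}}}|{{{4|}}}|{{{5|}}}}}
    return pf_if(param(args, "4", ""), param(args, "4", ""), param(args, "5", ""))


def with_defaults(args):
    # {{{4|{{{5|}}}}}}
    return param(args, "4", param(args, "5", ""))


if __name__ == "__main__":
    for call in ({"4": "a", "5": "b"}, {"5": "b"}, {}, {"4": "", "5": "b"}):
        print(call, repr(with_if(call)), repr(with_defaults(call)))
```

The two forms agree except in the last case: a parameter passed explicitly as empty counts as set for the default syntax but as blank for #if, so the nested-default form is shorter but not a strict drop-in replacement.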
Re: [Wikitech-l] Enabling some string functions
You seem confused. You seem to think that I care about the proper way to program using templates and parser functions. That's not true, they are an ugly hack and I recognize that. I have absolutely no desire to learn how to use something so hideously inefficient in an efficient manner.

On Sat, Jun 27, 2009 at 4:43 PM, Platonides <platoni...@gmail.com> wrote:
> Brian wrote:
>> They want the functionality and they are willing to sacrifice usability and quality of implementation in order to get it, plain and simple. ParserFunctions combined with StringFunctions is flat out unreadable. We should not facilitate the writing of unreadable code.
>>
>> As an example, yesterday I wrote some code that basically says, check the doi and http template parameters and check to make sure they begin with http, and if not add it. In any reasonable sort of language that lends itself to a reasonable sort of implementation. But not with Parser and String Functions.
>>
>> #[[{{{1}}}]]. {{#if:{{{4}}}|[|{{#if:{{{5}}}|[{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|{{#if:{{{4}}}| http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http://dx.doi.org/{{{5}}} }} {{#if:{{{2}}}| {{{2}{{#if:{{{4}}}|]|{{#if:{{{5}}}|] {{#ifexist: File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}| ''{{{3}}}.''}}
>>
>> There is some extra stuff in there, but you get my point. Just because a few people really, really want extra functionality at any cost doesn't mean much.
>
> I have seen this before. People use #if for everything even when there is a better way. Look at what you're doing:
>
> {{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}}}}}
>
> {{#if:{{{5}}}|{{{5}}}}} means show parameter 5 if it is set, or show parameter 5 if it is not blank. In either case, {{{5|}}} would do the job. The parent #if is similar: parameter 4 if set, else parameter 5. {{{4| {{{5|}}} }}} would do the job.
>
> Template default parameters were here long before ParserFunctions. But people prefer using ugly #ifs, making the syntax more unreadable (and counting against the preprocessor limits).
>
> Another common abuse is to do:
>
> {{#if: {{{Foo}}}| <tr><td>Foo: </td><td>{{{Foo}}} </td></tr> }}
>
> I'd like to have a feature in the parser to mark a section to be skipped if the inner parameter is not set, without having to use #ifs everywhere.
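For comparison, the behaviour Brian describes in prose ("check the doi and http template parameters and check to make sure they begin with http, and if not add it") might look like this in a conventional language. This is a minimal Python sketch of one reading of that description, not a translation of the garbled template: the function name, the keyword names, and the choice to prefer the doi parameter are assumptions; only the `http://dx.doi.org/` prefix comes from the template itself.

```python
DOI_RESOLVER = "http://dx.doi.org/"  # prefix that appears in the quoted template


def normalize_link(doi=None, url=None):
    """Return a full URL built from whichever parameter is set.

    Assumption: the doi parameter wins when both are given, and a
    value that already starts with "http" is passed through unchanged.
    """
    value = (doi or url or "").strip()
    if not value:
        return ""
    if value.startswith("http"):
        return value
    return DOI_RESOLVER + value


if __name__ == "__main__":
    print(normalize_link(doi="10.1000/182"))
    print(normalize_link(url="http://example.org/paper"))
```

The whole check fits in one prefix test, which is the contrast with the nested #if and #pos calls that the quoted template needs.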
Re: [Wikitech-l] Enabling some string functions
On Thu, Jun 25, 2009 at 10:35 PM, Tim Starling <tstarl...@wikimedia.org> wrote:
> <snip>
> The community of people who work on such templates is an extremely small, self-selected subset of the community of editors. It is that tiny segment of the community that can code in this accidental programming language, who are not deterred by its density, inconsistency or performance limitations.

There is some truth to this. However, I believe the community of people who would like to see string functions is much, much larger than just the community of template coders. Most Wikipedians can use templates even if they don't feel comfortable creating them, and many of them have at one time or another encountered practical problems that could be solved with basic string functionality.

> <snip>
> Introducing a scripting language will not make those accumulated contributions disappear. The task of deciphering them, and converting them to a more accessible form, will remain.

Do you actually have a plan for introducing a scripting language?

Lua, which seems to be your favored strategy, was recently LATER-ed on bugzilla by Brion, and suffers from several serious problems. For example, the dependency on compiled binaries is highly undesirable. The relative power of a full programming language would require limiting its resources to avoid bad code consuming all memory or flooding MediaWiki with output, and that is only the starting point for considering the risks of malicious or overtaxing code. Not to mention that the comments at Extension talk:Lua suggest several people have failed in attempts to get the extension working at all.

Even if one gets past that, Lua brings its own grammar, set of function keywords, and methodologies, which will again create a high barrier to participation for people wanting to work with it.

Frankly Lua feels like it creates at least as many usability and portability problems as it solves, and is still a long ways off.

Werdna's suggestion to adapt the AbuseFilter parser into a home-grown MediaWiki scripting language feels a lot more natural in terms of control and ability to affect an integrated presentation, but that would also seem quite distant.

If one is going to say no string functions until the template coding problem is solved, then I'd like to know if there is really a serious strategy for doing that.

-Robert Rohde
Re: [Wikitech-l] Enabling some string functions
2009/6/26 Stephen Bain <stephen.b...@gmail.com>:
> In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.
>
> Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

Maybe it's the mindset that should be changed then? For one thing, {{link}} used to use {{substr}} to check if the first argument started with http:// , https:// or ftp:// and produced an internal link if not, despite the fact that the documentation for {{link}} clearly states that it creates an *external* link, which means people shouldn't be using it to create internal links. If people try to use a template for something it's not intended for, they should be told to use a different template; currently, it seems like the template is just extended with new functionality, leading to unnecessary {{#if: , {{#switch: and {{substr}} uses that serve only the users' laziness.

To get back to {{cite}}: the template itself contains no more than some logic to choose between {{Citation/core}} and {{Citation/patent}} based on the presence/absence of certain parameters, and {{Citation/core}} does the same thing to choose between books and periodicals. What's wrong with breaking up this template into, say, {{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly, other multifunctional templates could be broken up as well.

The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook. This means both {{foo}} and {{bar}} get expanded, one of which in vain. Of course this is even worse for complex systems of nested #if/#ifeq statements and/or #switch statements, in which every possible 'code' path is evaluated before a decision is made. In practice, this means that for every call to {{cite}}, which seems to have three major modes, the preprocessor will spend about 2/3 of its time expanding stuff it's gonna throw away anyway.

To fix this, control flow parser functions such as #if could be put in a special class of parser functions that take their arguments unexpanded. They could then call the parser to expand their first argument and return a value based on that. Whether these functions are expected to return expanded or unexpanded wikitext doesn't really matter from a performance standpoint. (Disclaimer: I'm hardly a parser expert, Tim is; he should of course be the judge of the feasibility of this proposal.)

As an aside, lazy evaluation of #if statements would also improve performance for stuff like:

{{#if:{{{param1|}}}|Do something with param1}}
{{#if:{{{param2|}}}|Do something with param2}}
...
{{#if:{{{param9|}}}|Do something with param9}}

Roan Kattouw (Catrope)
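The cost difference Roan describes can be illustrated with a small Python sketch. It is a toy model only, not MediaWiki's preprocessor: `make_template`, `if_eager` and `if_lazy` are hypothetical names, and a "template" here is just a function that records when it gets expanded.

```python
def make_template(name, log):
    """Return a zero-argument 'expand' function that records each expansion."""
    def expand():
        log.append(name)
        return "[" + name + "]"
    return expand


def if_eager(test, then_text, else_text):
    """#if receiving already-expanded text: both branches were paid for."""
    return then_text if test.strip() else else_text


def if_lazy(test, then_node, else_node):
    """#if receiving unexpanded nodes: only the chosen branch is expanded."""
    chosen = then_node if test.strip() else else_node
    return chosen()


# Eager: the caller expands {{foo}} and {{bar}} before #if runs.
eager_log = []
foo, bar = make_template("foo", eager_log), make_template("bar", eager_log)
eager_result = if_eager("someparam", foo(), bar())

# Lazy: #if is handed the unexpanded nodes and expands one on demand.
lazy_log = []
foo, bar = make_template("foo", lazy_log), make_template("bar", lazy_log)
lazy_result = if_lazy("someparam", foo, bar)

if __name__ == "__main__":
    print(eager_result, eager_log)
    print(lazy_result, lazy_log)
```

Both calls produce the same text, but the eager log records two expansions and the lazy log one. With three mutually exclusive modes, as in the {{cite}} example, eager evaluation would expand three branches to keep one, which is where the "about 2/3" figure comes from.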
Re: [Wikitech-l] Enabling some string functions
Roan Kattouw wrote:
> To get back to {{cite}}: the template itself contains no more than some logic to choose between {{Citation/core}} and {{Citation/patent}} based on the presence/absence of certain parameters, and {{Citation/core}} does the same thing to choose between books and periodicals. What's wrong with breaking up this template into, say, {{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly, other multifunctional templates could be broken up as well.

While this is not a comment on the merits of string functions in general, the following things are wrong with that approach:

- It is easier for users to remember the name of just a single template.
- Multiple templates that are separately maintained will diverge over time; for example, the same parameters might end up being named differently.
- A new feature in one template can't be easily applied to another template.
Re: [Wikitech-l] Enabling some string functions
On Thu, Jun 25, 2009 at 11:33 PM, Tim Starling <tstarl...@wikimedia.org> wrote:
> Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.
>
> The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

Well, then at least let's be consistent and cripple padleft/padright.

Also, while I disagree with Robert's skepticism about the comparative usability of a real scripting language, I'd be interested to hear what your ideas are for actually implementing that.

Come to think of it, the easiest scripting language to implement would be . . . PHP! Just run it through the built-in PHP parser, carefully sanitize the tokens so that it's safe (possibly banning things like function definitions), and eval()! We could even dump the scripts into lots of little files and use includes, so APC can cache them. That would probably be the easiest thing to do, if we need to keep pure PHP support for the sake of third parties. It's kind of horrible, of course . . .

How much of Wikipedia is your random shared-hosted site going to be able to mirror anyway, though? Couldn't we at least require working exec() to get infoboxes to work? People on shared hosting could use Special:ExpandTemplates to get a copy of the article with no dependencies, too (albeit with rather messy source code).

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw <roan.katt...@gmail.com> wrote:
> The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.
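The "parse, whitelist, then eval" pattern suggested above for PHP can be sketched in Python, using the standard `ast` module as a stand-in for PHP's tokenizer. This is an illustration of the pattern only: the whitelist, the `safe_eval` name and the restriction to arithmetic are assumptions made for the example, and nothing here should be read as a complete sandbox.

```python
import ast

# Hypothetical whitelist: plain arithmetic on literals and nothing else.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)


def safe_eval(source):
    """Parse first, reject any syntax that is not whitelisted, then eval."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    code = compile(tree, "<template-script>", "eval")
    # Empty builtins as a second line of defence; names are rejected above anyway.
    return eval(code, {"__builtins__": {}}, {})


if __name__ == "__main__":
    print(safe_eval("2 + 3 * 4"))
    try:
        safe_eval("__import__('os').system('true')")
    except ValueError as err:
        print("rejected:", err)
```

Function calls, names and attribute access never reach `eval` because their node types are not on the list. A whitelist of syntax does not bound resource use, though: an expression such as `'a' * 1000000000` passes this check, so memory and time limits would still have to be enforced separately.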
Re: [Wikitech-l] Enabling some string functions
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw <roan.katt...@gmail.com> wrote:
>> The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.
>
> I thought this was fixed ages ago with the new preprocessor.

I asked Domas whether it was and he said no; Tim, can you chip in on this?

Roan Kattouw (Catrope)
Re: [Wikitech-l] Enabling some string functions
On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain <stephen.b...@gmail.com> wrote:
> In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.
>
> Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.

The good old days are long gone. If you believe there is never a valid case for basic programming constructs such as conditionals you should have objected when ParserFunctions were first implemented.
Re: [Wikitech-l] Enabling some string functions
On 26/06/2009, at 3:32 PM, Brian wrote:
> On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain <stephen.b...@gmail.com> wrote:
>> In the good old days someone would have solved the same problem by mentioning in the template's documentation that the parameter should use full URLs. Both the template and instances of it would be readable.
>>
>> Template programmers are not going to create accessible templates because they have a programming mindset, and set out to solve problems in ways like Brian's code above.
>
> The good old days are long gone. If you believe there is never a valid case for basic programming constructs such as conditionals you should have objected when ParserFunctions were first implemented.

The fact that we, at some stage, made the mistake of adding programming-like functions does not oblige us to complete the job. If we could make ParserFunctions go away, we would. ParserFunctions is there now, and there's too much code dependent on it to remove it right now. That analysis does not apply to StringFunctions.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us
Re: [Wikitech-l] Enabling some string functions
On Fri, Jun 26, 2009 at 7:16 AM, Aryeh Gregor <simetrical+wikil...@gmail.com> wrote:
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw <roan.katt...@gmail.com> wrote:
>> The reason I believe breaking up templates improves performance is this: they're typically of the form {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see that this is a parser function call with three arguments, and expand all three of them before it runs the #if hook.
>
> I thought this was fixed ages ago with the new preprocessor.

My understanding has been that the PREprocessor expands all branches, by looking up and substituting transcluded templates and similar things, but that the actual processor only evaluates the branches that it needs. That's a lot faster than actually evaluating all branches (which is how things originally worked), but not quite as effective as if the dead branches were ignored entirely.

(I could be totally wrong however.)

-Robert Rohde
Re: [Wikitech-l] Enabling some string functions
2009/6/26 Robert Rohde <raro...@gmail.com>:
> My understanding has been that the PREprocessor expands all branches, by looking up and substituting transcluded templates and similar things, but that the actual processor only evaluates the branches that it needs. That's a lot faster than actually evaluating all branches (which is how things originally worked), but not quite as effective as if the dead branches were ignored entirely.
>
> (I could be totally wrong however.)

You're right that dead code never reaches the parser (your processor), but ideally the preprocessor wouldn't bother expanding it either. I have a vague recollection that it was fixed with the new preprocessor, as Simetrical said, but I have no idea how much truth there is in that.

Roan Kattouw (Catrope)
[Wikitech-l] Enabling some string functions
A while ago, StringFunctions got merged in with ParserFunctions. Tim disabled them by default before scapping, with the following comment:

/**
 * Enable string functions.
 *
 * Set this to true if you want your users to be able to implement their own
 * parsers in the ugliest, most inefficient programming language known to man:
 * MediaWiki wikitext with ParserFunctions.
 *
 * WARNING: enabling this may have an adverse impact on the sanity of your users.
 * An alternative, saner solution for embedding complex text processing in
 * MediaWiki templates can be found at: http://www.mediawiki.org/wiki/Extension:Lua
 */

I'm sure we all agree that wikitext is terrible syntax. But some of the string functions already are at least partially replicated (with horrifying inefficiency, and significant limitations in some cases) on enwiki anyway. Specifically:

* #len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.
* #pos is implemented by [[Template:Str find]]. Trying to find "b" in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.
* #substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.

Is there any good reason not to enable these three string functions, at least?
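For readers unfamiliar with the three functions, here is a rough Python sketch of what each one computes, run on the same 250-character test string used in the measurements above. The function names are hypothetical, zero-based positions and the empty-string result for "not found" are assumptions about the extension's conventions, and negative arguments and other edge cases are not modelled.

```python
def str_len(s):
    """Rough equivalent of #len: number of characters in s."""
    return len(s)


def str_pos(s, needle, offset=0):
    """Rough equivalent of #pos: zero-based position of needle in s.

    Assumption: an empty string stands for 'not found'.
    """
    index = s.find(needle, offset)
    return "" if index < 0 else index


def str_sub(s, start=0, length=None):
    """Rough equivalent of #substr: `length` characters from `start`."""
    return s[start:] if length is None else s[start:start + length]


if __name__ == "__main__":
    sample = "a" * 250
    print(str_len(sample))
    print(repr(str_pos(sample, "b")))
    print(str_sub(sample, 30, 20))
```

Each of these is a single built-in operation in a general-purpose language, which is the contrast with the preprocessor node counts in the hundreds or thousands reported for the template versions.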
Re: [Wikitech-l] Enabling some string functions
Aryeh Gregor wrote:
> * #len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.
> * #pos is implemented by [[Template:Str find]]. Trying to find "b" in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.
> * #substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.
>
> Is there any good reason not to enable these three string functions, at least?

Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.

The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

-- Tim Starling
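The thread does not spell out how templates such as Str len depend on padleft/padright. One plausible mechanism, sketched in Python below, is that padding an empty string with a multi-character pad string returns a prefix of the pad string. This is a toy model resting on an assumption: that padleft repeats the pad string and truncates it to fill the requested width. The function names are hypothetical.

```python
def padleft(s, length, pad="0"):
    """Model {{padleft:s|length|pad}}.

    Assumption: a multi-character pad string is repeated and then
    truncated so that the result is exactly `length` characters long.
    """
    missing = length - len(s)
    if missing <= 0 or not pad:
        return s
    filler = (pad * (missing // len(pad) + 1))[:missing]
    return filler + s


def first_chars(text, n):
    """Padding an empty string with `text` yields the first n characters
    of `text` (repeated if text is shorter than n)."""
    return padleft("", n, text)


if __name__ == "__main__":
    print(padleft("7", 3))
    print(first_chars("abcdef", 3))
```

Under that assumption, a prefix extractor is enough to build character-by-character comparisons, and restricting the pad string (for example to a single character) would remove the building block.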
Re: [Wikitech-l] Enabling some string functions
On Thu, Jun 25, 2009 at 8:33 PM, Tim Starling <tstarl...@wikimedia.org> wrote:
> Aryeh Gregor wrote:
>> * #len is implemented by [[Template:Str len]]. Running {{str len}} on a string of 250 a's gives preprocessor node count 152, post-expand include size 4597 bytes, template argument size 7430 bytes.
>> * #pos is implemented by [[Template:Str find]]. Trying to find "b" in a string of 250 a's gives preprocessor node count 1354, post-expand include size 5740 bytes, template argument size 50320 bytes.
>> * #substr is implemented by [[Template:Str sub]]. Using the same string of a's, with start 30 and length 20, gives preprocessor node count 1534, post-expand include size 13400 bytes, template argument size 44578 bytes.
>>
>> Is there any good reason not to enable these three string functions, at least?
>
> Those templates can be defeated by reducing the functionality of padleft/padright, and I think that would be a better course of action than enabling the string functions.
>
> The set of string functions you describe are not the most innocuous ones, they're the ones I most want to keep out of Wikipedia, at least until we have a decent server-side scripting language in parallel.

Could you offer a bit more beyond "I don't like it"?

A few devs, and you in particular, have expressed dismay over what string functions would do to wiki template code. However, most devs are rarely if ever involved with writing wiki templates. By contrast, the community of people who do work on such templates has been asking for these functions for literally years and doesn't seem the least bit afraid that the marginal impact of adding a few more parser functions will bring the house down.

It is hard for me to figure why this case is so peculiar that the devs should block the wishes of the community. Nor do I see why the existence of basic string functionality should be dependent on someone overhauling or replacing the template coding scheme.

-Robert Rohde