Re: [Wikitech-l] On templates and programming languages

2009-07-08 Thread Tim Starling
Aryeh Gregor wrote:
 On Fri, Jul 3, 2009 at 3:13 AM, Tim Starling <tstarl...@wikimedia.org> wrote:
 Loops are essential for readable code. There is no problem with
 allowing loops in conjunction with time limits, that we don't have
 already with complex templates. In fact, time limits for complex
 templates would be an improvement over the system of expansion limits
 we have at the moment.
 
 But time limits are inconsistent.  Whether a template hits the limit
 might depend on whether it happens to be running on an Apache with a
 Pentium IV, an Opteron, a Xeon, . . .

That's the reason I went with expansion limits when I wrote the code.
But I think it was the wrong choice, because the code is complex and
there are lots of ways to run over the 30s time limit set in php.ini,
or to exceed the memory limit, even with the expansion limits in
place. It's hard to find all the potential performance problems during
code review, especially when new parser functions are constantly added.

I didn't say either method was perfect, just that time limits are better.

 Recursion can give a long running time even if the depth is limited.
 By calling the function multiple times from its own body, you can have
 exponential time order in the recursion depth.
 
 You can also have exponential time with loops.

Without the time limit, the worst case running time for a JavaScript
script is infinity with finite input, so the time order is O(∞). With
the time limit, it's O(1). That's the whole point: a time limit lets
you ignore algorithmic complexities.

If you measure script execution times, instead of trying to guess them
in advance, then you can concentrate developer effort on quotas,
access control, profiling tools, etc., which I think are more
tractable problems than analysing the performance of every possible
thing the parser can do and limiting it in advance.

-- Tim Starling


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] On templates and programming languages

2009-07-08 Thread Tei
Another idea thrown against the wall:

Can template scripts be precalculated?
I think most people here are talking about scripts interpreted on demand. But
what if a script's output were refreshed only every 100 uses, and
asynchronously? That way a script that takes 10 minutes to finish is not a
problem: its output is simply updated every 10 minutes (or every hour, if the
servers prefer).

I mean serving outdated template output that is only updated when the server
is able to, not every time the server uses the template.
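
A minimal Python sketch of this refresh-every-N-uses idea, under stated
assumptions: the class name StaleCache and its parameters are invented for
illustration, and the recomputation is done inline here, whereas the proposal
above would hand it to a background job.

```python
class StaleCache:
    """Serve a cached result and recompute it only every `refresh_every` uses."""

    def __init__(self, compute, refresh_every=100):
        self.compute = compute            # the expensive template script
        self.refresh_every = refresh_every
        self.uses = 0
        self.value = None
        self.filled = False

    def get(self):
        if not self.filled or self.uses >= self.refresh_every:
            # In a real system this would be queued and run asynchronously.
            self.value = self.compute()
            self.uses = 0
            self.filled = True
        self.uses += 1
        return self.value


runs = []
cache = StaleCache(lambda: runs.append(1) or len(runs), refresh_every=100)
for _ in range(250):
    cache.get()
print("script ran", len(runs), "times for 250 page views")
```

The script runs on the 1st, 101st and 201st use only; every other page view
gets the previously computed (possibly outdated) output.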


Note: I have no idea whether this message just adds noise and makes the
signal-to-noise ratio worse. Sorry if it does.



-- 
--
ℱin del ℳensaje.

Re: [Wikitech-l] On templates and programming languages

2009-07-08 Thread Tim Starling
Steve Sanbeg wrote:
 On Fri, 03 Jul 2009 17:13:45 +1000, Tim Starling wrote:
 Recursion can give a long running time even if the depth is limited.
 By calling the function multiple times from its own body, you can have
 exponential time order in the recursion depth.

 
 All those calls still end up on the same stack; even if it could be a tree
 in theory, the stack only grows one way, and execution time would only be
 linear.

That's an interesting theory.

 I found some documentation on the example I'd thought of emulating, which
 may clarify a little:
 
 http://www.delorie.com/gnu/docs/elisp-manual-21/elisp_123.html

I thought I would try it.

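  ;; Compute 5^n using only additions of 1: each call makes five
  ;; recursive calls, so the call tree has 5^n leaves.
  (defun pow5 (n)
    (if (= n 0)
        1
      (+
       (pow5 (1- n))
       (pow5 (1- n))
       (pow5 (1- n))
       (pow5 (1- n))
       (pow5 (1- n)))))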

It calculates 5 to the power of n by adding 1+1+1+1+1... I found that
with a stack depth limit of 25, I was able to calculate 5^6 = 15625.
That's plainly not an O(N) execution time in stack depth.
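
For readers without Emacs at hand, the same experiment can be sketched in
Python. This is an illustration, not the elisp that was run above; the helper
name pow5_stats and the depth bookkeeping are assumptions made for the example.
It counts how often the function is entered and how deep the logical call
stack gets.

```python
def pow5_stats(n):
    """Return (value, total_calls, max_depth) for the five-way recursive pow5."""
    stats = {"calls": 0, "max_depth": 0}

    def pow5(k, depth):
        stats["calls"] += 1
        stats["max_depth"] = max(stats["max_depth"], depth)
        if k == 0:
            return 1
        # Five recursive calls per level, as in the elisp version.
        return sum(pow5(k - 1, depth + 1) for _ in range(5))

    value = pow5(n, 1)
    return value, stats["calls"], stats["max_depth"]


value, calls, depth = pow5_stats(6)
print(f"5^6 = {value}, calls = {calls}, max depth = {depth}")
```

For n = 6 the logical depth is only 7, yet the function is entered
1 + 5 + 25 + ... + 5^6 = 19531 times: the work grows exponentially while the
stack grows linearly.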

-- Tim Starling




Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Tei
On Fri, Jul 3, 2009 at 4:18 AM, Steve Bennett stevag...@gmail.com wrote:

 On Fri, Jul 3, 2009 at 7:38 AM, Brion Vibber <br...@wikimedia.org> wrote:
  Since iteration over a set is frequently desired/needed, assume it will
  exist in a sensible programming language.
 
  As already noted in this thread, horrible hacks for limited-depth
  looping are already in use.

 So:
 1) The chosen language will support iteration over finite sets
 2) Could it support general iteration, recursion etc?
 3) If so, are there any good mechanisms for limiting the
 destructiveness of an infinite loop?

 That is, is it practical to say you can iterate all you like, but
 you're only getting 10ms to do it?  Sounds like it could be an
 interesting property of a template, where a suitably authorised person
 could allow certain templates longer execution times.



Another option is to compile the language to an intermediate language
that is interpreted, and to give the interpreter a limit on the number of
instructions a program may execute. Say, a budget of 90,000 opcodes. If a
script breaks that barrier, it is stopped (the interpreter returns) and the
script is marked as dirty.
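
A minimal sketch of such an instruction budget, written in Python. The
three-opcode stack machine below is invented for illustration and has nothing
to do with the real QCVM; only the 90,000 figure comes from the message above.

```python
class BudgetExceeded(Exception):
    """Raised when a script uses up its instruction budget."""


def run(program, budget):
    """Interpret a list of (opcode, argument) pairs on a tiny stack machine.

    Every executed instruction costs one unit of budget, whether or not it
    sits inside a loop, so the interpreter needs no knowledge of the
    script's structure.
    """
    stack, pc, executed = [], 0, 0
    while pc < len(program):
        if executed >= budget:
            raise BudgetExceeded(f"stopped after {executed} instructions")
        executed += 1
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = arg
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# A well-behaved script: 2 + 3.
print(run([("push", 2), ("push", 3), ("add", None)], budget=90_000))

# An infinite loop is cut off instead of hanging the server.
try:
    run([("jump", 0)], budget=90_000)
except BudgetExceeded as exc:
    print("marked dirty:", exc)
```

Because the count is a property of the interpreter rather than of the clock,
the same script hits the limit at the same point on every machine.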

A bad example follows:
QuakeC is compiled to QC (a fake assembly language), which is interpreted by
the QCVM (the Quake virtual machine). The interpreter includes some
limitations (on the stock QCVM, the depth of recursion).

A good side effect of this is that a Quake mod works on any OS.

Trivia: Quake 3 and others have a setup like this one, but using C. It
is probably not usable for Wikipedia, since C is a bad language for working
with strings. Too bad, because it is fast and cross-platform, and there are
lots of tools for it and existing programmers.





-- 
--
ℱin del ℳensaje.

Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Bryan Tong Minh
On Thu, Jul 2, 2009 at 7:32 PM, Brion Vibber <br...@wikimedia.org> wrote:

 I'd really _like_ to be able to avoid having to require external
 executables either, if it can be managed, but that's harder since it
 means having a pure PHP implementation of the scripting language. (ouch!)

 -- brion

We could always have a default implementation in PHP, and optionally
provide the same functionality but faster in a C module.
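
The general shape of that arrangement, sketched in Python rather than PHP. The
module name _fast_template_engine is hypothetical (no such package exists), and
the toy evaluate() function only stands in for a real interpreter.

```python
try:
    # Hypothetical compiled accelerator; used only when it is installed.
    from _fast_template_engine import evaluate
    BACKEND = "c"
except ImportError:
    BACKEND = "pure"

    def evaluate(expression, variables):
        """Portable fallback: add up the variables named in 'a+b+c'."""
        return sum(variables[name] for name in expression.split("+"))


print(BACKEND, evaluate("a+b", {"a": 1, "b": 2}))
```

Both backends have to expose exactly the same function with the same
semantics, so that templates behave identically wherever they run.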


Bryan



Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Dmitriy Sintsov
* Andrew Garrett agarr...@wikimedia.org [Fri, 3 Jul 2009 13:03:03 
+0100]:

 Writing an interpreter for a language is not trivial. Writing it in C,
 and then porting to PHP is even worse.

Many languages resemble the syntax of C (curly braces, 
plusplus/minusminus and so on), but with loose typing (numerical 
strings transparently mixed with numbers, floats mixed with integers). 
Add a dollar sign prefix to the variable names in JS code, and lots of 
_simple_ JS code would look really similar to PHP. I wonder whether that 
helps with translation?
BTW, one of the Logo interpreter distributions (Logo is an educational 
functional language) includes a partial (incomplete but working) Pascal 
interpreter in just about 20-30 KB! It's amazing! It seems that writing 
an interpreter in another interpreted language is much easier than using 
a low-level language like C. I am not sure whether Logo is suitable, 
though - I am not an expert in translation in any way.
Dmitriy



Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Platonides
Tim Starling wrote:
 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can 
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter 
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.
 
 PHP can be secured against arbitrary execution using token_get_all(),
 there's a proof-of-principle validator of this kind in the master
 switch script project. But there are problems with attempting a
 single-process PHP-in-PHP sandbox:
 
 * The poor support for signals in PHP makes it difficult to limit the
 execution time of a script snippet. Ticks only occur at the end of
 each statement, so you can defeat them by making a single statement
 that runs forever.

Inject a check_limits() call into each looping structure. If it detects
that the script has been running for more than $maxTime, it times the
script out. Can you defeat that?


 * Apart from blacklisting function definition, there is no way to
 protect against infinite recursion, which exhausts the process stack
 and causes a segfault.

Also inject the same call into functions.
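
What the injected calls could look like, sketched in Python instead of PHP.
The Limits class, its parameters and the two "user scripts" are illustrative
assumptions; only the name check_limits() comes from the suggestion above.

```python
import time


class LimitExceeded(Exception):
    """Raised by check_limits() when the script overruns a limit."""


class Limits:
    def __init__(self, max_seconds, max_depth):
        # CPU time (user + system) of this process, not wall-clock time.
        self.deadline = time.process_time() + max_seconds
        self.max_depth = max_depth
        self.depth = 0

    def check_limits(self):
        # The call a compiler would inject at the top of every loop body
        # and every function body.
        if time.process_time() > self.deadline:
            raise LimitExceeded("time limit exceeded")
        if self.depth > self.max_depth:
            raise LimitExceeded("recursion limit exceeded")


def user_loop(limits):
    # The user wrote: while True: pass
    while True:
        limits.check_limits()      # injected


def user_recursion(limits, n=0):
    # The user wrote: def f(n): return f(n + 1)
    limits.depth += 1              # injected
    try:
        limits.check_limits()      # injected
        return user_recursion(limits, n + 1)
    finally:
        limits.depth -= 1          # injected


for script in (user_loop, user_recursion):
    try:
        script(Limits(max_seconds=0.2, max_depth=50))
    except LimitExceeded as exc:
        print(f"{script.__name__} aborted: {exc}")
```

The objection quoted above still applies to this sketch: a check only runs
between statements, so a single long-running statement (for example one
expensive built-in call) is not interrupted until it returns.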


 * Memory limits are implemented on a per-request basis, and there's no
 way to recover from exceeding the memory limit, the request is just
 killed.

Call memory_get_usage() beforehand and again inside check_limits() to check
that the script stays within its memory limit. Abort if it gets near PHP's
memory limit (I'd expect the script's memory to be much lower than PHP's).
However, that check is much easier to bypass.





Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Sergey Chernyshev
I think you're confusing the simple logic of ParserFunctions in templates
with a full scripting language like PHP.

That's why I proposed looking at something simplified, like Smarty or similar.

Thank you,

Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/


On Tue, Jun 30, 2009 at 11:46 PM, Aryeh Gregor
<simetrical+wikil...@gmail.com> wrote:

 On Tue, Jun 30, 2009 at 10:45 PM, Sergey
 Chernyshev <sergey.chernys...@gmail.com> wrote:
  I don't know about scripting languages for the templating, it might be
  overkill.

 People are using ParserFunctions as a scripting language already.
 That's not feasibly going to be removed at this point.  So the only
 way to go is to replace it with a better scripting language, which is
 what we're talking about.




Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Aryeh Gregor
On Fri, Jul 3, 2009 at 1:27 PM, Sergey
Chernyshev <sergey.chernys...@gmail.com> wrote:
 I think you're confusing the simple logic of ParserFunctions in templates
 with a full scripting language like PHP.

In what way is the logic of ParserFunctions simple?  If you ignore
the limitations on parse length, it's Turing-complete.

 That's why I proposed looking at something simplified, like Smarty or similar.

Hmm. Smarty looks interesting, at a quick glance.  I suspect it's not
designed to be secure against DoS, so it would need some kind of
sandboxing.  Hopefully less than some of the other solutions we're
contemplating, though!  I'd think it might serve okay, if we wrote
enough custom functions to replace the existing ParserFunctions.  I'm
not sure.



Re: [Wikitech-l] On templates and programming languages

2009-07-03 Thread Aryeh Gregor
On Fri, Jul 3, 2009 at 1:37 PM, Petr Kadlec <petr.kad...@gmail.com> wrote:
 And they might depend on the server load at the time

Probably not too much, if you count user+system instead of real time.
But yes, that could be an issue too (more context switches, etc.).
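
The difference between the two clocks can be seen with a short Python sketch.
The helper name measure() is an assumption made for the example;
time.process_time() reports user + system CPU time of the current process,
while time.perf_counter() reports elapsed real time.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent running fn()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()    # user + system CPU time
    fn()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


# Sleeping stands in for waiting on a busy machine: real time passes,
# but almost no CPU time is charged to the process.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"waiting:   wall={wall:.3f}s cpu={cpu:.3f}s")

# A computation is charged on both clocks.
wall, cpu = measure(lambda: sum(i * i for i in range(200_000)))
print(f"computing: wall={wall:.3f}s cpu={cpu:.3f}s")
```

A limit based on CPU time is therefore largely insensitive to how long the
process sits waiting for its turn, though it still varies with the speed of
the processor it happens to run on.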



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Tim Landscheidt
Michael Daly michael.d...@kayakwiki.org wrote:

 [...]
 Since $ doesn't have a close, that makes things like {{{xxx|default
 value}}} slightly problematic, since $xxx|$default_value is slightly
 more awkward to parse.  But that only shows how templates are also
 overly reliant on the pipe (|) symbol - as anyone who has tried to use
 tables in templates has discovered.
 [...]

bash (don't know if standard POSIX) has:

- ${parameter}
- ${parameter:-default}
- ${parameter:?error}

and even string functions:

- ${#parameter}
- ${parameter:offset:length}
- etc.

  Personally, whatever programming language is chosen,
I really like Aryeh's approach of sanitizing and compiling the
template to PHP. It could be used everywhere MediaWiki runs,
it is no hassle to set up compared to installing other
interpreters, and the performance should be the best that PHP
has to offer (and *all* templates could be compiled to that
code). From a distance, I think it would even be easier to
have the file/memory/CPU restrictions hacked into the main
PHP interpreter rather than to cook our own soup.

Tim




Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Brion Vibber
Brian wrote:
 There are lots of usability improvements that can be made to the
 templating system. First and foremost the new system should allow
 advanced wiki users to perform programmatic operations on article data
 without the requirement that the data in the article be made
 unreadable.
 
 If we only focus our efforts on making the template namespace more
 complicated by giving it a more advanced programming language and we
 leave the article namespace as it is then we have not even touched the
 usability issue. We have just made it worse.

These are totally orthogonal issues, and paying attention to one doesn't 
mean ignoring the other.


The ideal markup situation for the article namespace is that markup 
shouldn't even *be* exposed to most users. A long-term goal is migration 
to a more WYSIWYG-like editing experience -- to which one of the 
potential stumbling blocks has been "but how will we do templates, which 
currently are built with our horrifying wiki markup?"

Most editors will never know or care about the internal implementation 
of templates, just as they don't know or care about it today. Cleaning 
them up to allow the power-users who *write* templates to make them 
functional and useful *and* maintainable is a win for template writers, 
while having no direct impact on general editors.

(Indirectly, it will mean they're provided with better tools to use in 
their articles.)


<not the subject of this thread>
For the general article editing experience, the issues are very 
different, and that's the area the Wikipedia Usability Initiative is 
concentrating on.

In the very short term, we're working on general look & feel, workflow, 
and making it easier to figure out what you're supposed to do (such as 
making the markup cheat-sheet available without leaving the editing window).

In the medium term, we hope to be able to fold up things that are 
particularly ugly in markup such as images/media, template invocations 
and tables, and provide friendlier widgets for adding and editing them.

In the long term, we might hope to be able to drop the front-end markup 
entirely... but that's still a harder problem with several possible 
trade-offs.
</not the subject of this thread>

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Brion Vibber
Michael Daly wrote:
 Chad wrote:
 
 Unless we plan on trying to mass-convert not only years of old revisions
 but change years-old behavior that millions of users have come to expect?
 I would expect _any_ change to keep {{sometemplate}} always working,
 even if the mechanics behind it change.
 
 Why not switch the template syntax for articles to match the syntax for 
 tags (which in turn is based on XML or whatever syntax that comes from 
 ultimately)?

For the meantime, assume there will be no changes whatsoever in how 
markup in article space is written. A hypothetical change to template 
invocation syntax is unrelated to how templates are implemented, and 
clouds the current discussion.

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Brion Vibber
Aryeh Gregor wrote:
 I was assuming it would just return wikitext, and that would be
 integrated into the page and parsed, following all limits on wikitext
 (including size) -- just as with current parser functions.

That's one simple way to implement, but we may wish to consider working 
with a document tree structure instead to help future-proof it against 
future syntax changes (or dropping out the wiki syntax entirely). Things 
to consider... :)

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Tim Landscheidt
I wrote:

 [...]
   Personally, whatever programming language is chosen,
 I really like Aryeh's approach of sanitizing and compiling the
 template to PHP. It could be used everywhere MediaWiki runs,
 it is no hassle to set up compared to installing other
 interpreters, and the performance should be the best that PHP
 has to offer (and *all* templates could be compiled to that
 code). From a distance, I think it would even be easier to
 have the file/memory/CPU restrictions hacked into the main
 PHP interpreter rather than to cook our own soup.

Come to think of it, it would also fit very well with
profiling individual templates.

Tim




Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Brion Vibber
Gregory Maxwell wrote:
 So: any thoughts on how you address the universal problem of the DoS
 attack script?
[snip]
 I'm of the impression that simply setting limits on CPU and memory
 isn't sufficient to address this, because the reasonable limit will be
 high enough to be dangerous when the object is added to 100k pages,
 while a limit low enough to be safe everywhere will be far too
 constraining and likely to fail at random depending on overall system
 load.

It's never an easy problem. :)

But there are some interesting potential things to poke at, such as 
having per-template limits, per-cluster limits, etc -- we could in 
theory shut down some template rendering while still spitting out the 
rest of a page on a timely basis.

 Disadvantage: Like PHP, Python is difficult to lock down securely.
 
 I don't know that difficult is really the right description here.
 People willing to spend far more effort on this than you probably are
 have tried to sandbox python and failed.  I don't believe there is any
 real production grade support for the level of lockdown required for
 either PHP or Python. And I'd worry that any PHP implementations of
 the sandboxed languages might lose the battle tested sandboxing.
 
 It's acceptable for MediaWiki to fall back to lower performing
 alternatives when C modules can't be used, but I doubt it's acceptable
 to fall back to less secure ones!

Indeed. :)

 Is execution in environments where C modules are not possible actually
 a hard requirement? If it is I think this is a non-starter.

Since requiring custom PHP modules would pretty much rule out all casual 
third-party use of MediaWiki, not requiring them is definitely a hard 
requirement. :)

I'd really _like_ to be able to avoid having to require external 
executables either, if it can be managed, but that's harder since it 
means having a pure PHP implementation of the scripting language. (ouch!)

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Dmitriy Sintsov
 I'd really _like_ to be able to avoid having to require external
 executables either, if it can be managed, but that's harder since it
 means having a pure PHP implementation of the scripting language.
 (ouch!)

Maybe translate only a subset of JS or Lua to PHP. The engine itself 
is written in PHP, anyway. Moving to C/Java modules would dramatically 
reduce the popularity of the engine. For example, right now I am having 
difficulty compiling ffmpeg on an old FreeBSD host. I imagine a custom PHP 
module could run into similar difficulties.
Dmitriy



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Michael Daly
Brion Vibber wrote:

 The ideal markup situation for the article namespace is that markup 
 shouldn't even *be* exposed to most users. A long-term goal is migration 
 to a more WYSIWYG-like editing experience -- to which one of the 
 potential stumbling blocks has been "but how will we do templates, which 
 currently are built with our horrifying wiki markup?"

Since templates forbid looping, whatever manages them can't be considered 
a programming language (iteration is missing from {sequence, selection, 
iteration}).

Perhaps we should consider this a markup problem and not a programming 
problem.  If templates have styles (not necessarily to be confused with 
CSS concepts), then we just describe the template instead of programming 
the template.  {{{var}}} can become something like content: in CSS. 
Conditionals are... interesting.  Apply style x if the condition is 
satisfied, else apply style y (e.g. display: nicely; vs display: none;).

Just an idea...

Mike




Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Marco Schuster
On Fri, Jul 3, 2009 at 4:22 AM, Aryeh Gregor
<simetrical+wikil...@gmail.com> wrote:

 On Thu, Jul 2, 2009 at 10:18 PM, Steve Bennett <stevag...@gmail.com> wrote:
  So:
  1) The chosen language will support iteration over finite sets
  2) Could it support general iteration, recursion etc?
  3) If so, are there any good mechanisms for limiting the
  destructiveness of an infinite loop?

 You don't really need an infinite loop.  DoS would work fine if you
 can have any loop.  Even with just foreach:

 foreach(array(1,2)as $x1)foreach(array(1,2)as $x2)

 A few dozen of those in a row will give you a nice short bit of code
 that may as well run forever.

You can make some kind of counter, which gets incremented on each
foreach/while/for iteration. If it reaches 200 (or whatever), execution is
stopped.
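
A rough Python sketch of such a shared counter, applied to the nested-foreach
example quoted above. The class and function names are invented for
illustration, and the nesting is emulated with recursion so that its depth can
be a parameter.

```python
class LoopLimitExceeded(Exception):
    """Raised once the shared iteration counter passes its cap."""


class LoopCounter:
    def __init__(self, cap=200):
        self.cap = cap
        self.count = 0

    def tick(self):
        # Called once per iteration of any loop, however deeply nested.
        self.count += 1
        if self.count > self.cap:
            raise LoopLimitExceeded(f"more than {self.cap} loop iterations")


def nested_foreach(depth, counter):
    """Emulate `depth` nested `foreach (array(1, 2) as $x)` loops.

    Returns how many times the innermost body would have run.
    """
    if depth == 0:
        return 1
    total = 0
    for _ in (1, 2):
        counter.tick()
        total += nested_foreach(depth - 1, counter)
    return total


counter = LoopCounter(cap=200)
print(nested_foreach(5, counter), counter.count)   # a small nest finishes

try:
    nested_foreach(36, LoopCounter(cap=200))       # a few dozen levels
except LoopLimitExceeded as exc:
    print("stopped:", exc)
```

Because the counter is shared by all loops rather than kept per loop, nesting
does not multiply the allowance. It still counts only loop iterations, so
work done outside loops is not covered by it.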

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Robert Rohde
On Thu, Jul 2, 2009 at 7:25 PM, Marco
Schuster <ma...@harddisk.is-a-geek.org> wrote:
 On Fri, Jul 3, 2009 at 4:22 AM, Aryeh Gregor
 <simetrical+wikil...@gmail.com> wrote:

 On Thu, Jul 2, 2009 at 10:18 PM, Steve Bennett <stevag...@gmail.com> wrote:
  So:
  1) The chosen language will support iteration over finite sets
  2) Could it support general iteration, recursion etc?
  3) If so, are there any good mechanisms for limiting the
  destructiveness of an infinite loop?

 You don't really need an infinite loop.  DoS would work fine if you
 can have any loop.  Even with just foreach:

 foreach(array(1,2)as $x1)foreach(array(1,2)as $x2)

 A few dozen of those in a row will give you a nice short bit of code
 that may as well run forever.

 You can make some kind of counter, which gets incremented each
 foreach/while/for loop. If it reaches 200 (or whatever), execution is
 stopped.

Really, the ideal solution is to say the user is allowed X number of
basic operations, Y amount of memory, and Z amount of execution time,
and write an interpreter that is agnostic about how those resources
are used.  If all you do is add limits to loops, then someone will add
loops of loops, or even flat stacks, to get around it.

-Robert Rohde



Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Steve Bennett
On Fri, Jul 3, 2009 at 12:25 PM, Marco
Schuster <ma...@harddisk.is-a-geek.org> wrote:
 You can make some kind of counter, which gets incremented each
 foreach/while/for loop. If it reaches 200 (or whatever), execution is
 stopped.

Yes, but that implies:
1) We're writing an interpreter, or getting heavily involved in the
codebase of an existing one
2) We're anticipating every possible DoS and thwarting it.

I was wondering if there was a more general solution using a black box
interpreter. But without knowing the language or interpreter, that may
not be a very meaningful question.

Steve



Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Gregory Maxwell
On Wed, Jul 1, 2009 at 1:42 AM, Dmitriy Sintsov <ques...@rambler.ru> wrote:
 XSLT itself is way too locked down - even simple things like
 substring manipulation and loops aren't so easy to perform. Well, maybe
 I am too stupid for XSLT, but in my experience bringing tag syntax into a
 programming language makes the code poorly readable and bloated. I've
 used XSLT for just one of my projects.

Juniper Networks (my day job) uses XSLT as the primary scripting
language on their routing devices, and chose to do so primarily
because of sandboxing and the ease of XML tree manipulation with xpath
(JunOS configuration has a complete and comprehensive XML
representation).  To facilitate that usage we defined an alternative
syntax for XSLT called SLAX (http://code.google.com/p/libslax/),
though it hasn't seen widespread adoption outside of Juniper yet.
(SLAX can be mechanically converted to XSLT and vice versa.)

SLAX pretty much resolves your readability concern, although the
conceptual barriers for people coming from procedural languages to
any strongly functional programming language still remain.

You don't loop in XSLT, you recurse or iterate over a structure (i.e.
map/reduce).

I've grown rather fond of XSLT but wouldn't personally recommend it
for this application. It lacks the high speed bytecoded execution
environments available for other languages, and I don't see many
scripts on the site doing extensive document tree manipulation (it's
hard for me to express how awesome xpath is at that)... and I would
also guess that there are probably more adept MediaWiki template
language coders today than there are people who are really fluent in
XSLT.



Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread William Allen Simpson
Haven't read the entire thread yet, so hopefully nobody has said this:

Perl, write-once, poor choice for uncontrolled environment.

Lisp, at least the computer science types will know. Haven't used it
myself since the early '80s.

Lua, don't know whether it's improved in the past few years, but freeciv
had serious problems with migrating to 5.1. Personally, I've given up on
it, but my 14 y-o nephew seems to like it for various game modification.

Javascript, OMG don't go there.

Everybody seems to be going the python direction lately, but I've only
minimal experience with it, so cannot make a recommendation.

I'd worry less about providing extensive functionality (we certainly
don't have much now, so anything more would be gravy), but rather
ease of integration, scalability, and security.



Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Gregory Maxwell
On Wed, Jul 1, 2009 at 3:50 AM, William Allen
Simpson <william.allen.simp...@gmail.com> wrote:
 Javascript, OMG don't go there.

Don't be so quick to dismiss JavaScript.  If we were making a scorecard
it would likely tick most of the checkboxes:

* Availability of reliable, battle-tested sandboxes (and probably the only
option discussed other than x-in-JVM meeting this criterion)
* Availability of fast execution engines
* Widely known by the existing technical userbase   (JS beats the
other options hands down here)
* Already used by many MediaWiki developers
* Doesn't inflate the number of languages used in the operation of the site
* Possibility of reuse between server-executed and client-executed
(Only JS of the named options meets this criterion)
* Can easily write clear and readable code
* Modern high level language features (dynamic arrays, hash tables, etc)

There may exist great reasons why another language is a better choice,
but JS is far from the first thing that should be eliminated.

Python is a fine language but it fails all the criteria I listed above
except the last two.



Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Hay (Husky)
JavaScript might have gotten a bad name in the past because of 14-year-olds
who used it to display 'Welcome to my website!' alerts on their
Geocities homepages, but that's really unfair. JavaScript is a very
flexible and dynamic language that can be written very elegantly.

I urge everyone who still thinks JavaScript is a toy language to read
Douglas Crockford's excellent article:

http://javascript.crockford.com/javascript.html

-- Hay




Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Jared Williams
 

 -Original Message-
 From: wikitech-l-boun...@lists.wikimedia.org 
 [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of 
 Brion Vibber
 Sent: 30 June 2009 17:17
 To: Wikimedia developers
 Subject: [Wikitech-l] On templates and programming languages
 
 As many folks have noted, our current templating system works 
 ok for simple things, but doesn't scale well -- even 
 moderately complex conditionals or text-munging will quickly 
 turn your template source into what appears to be line noise.
 
 And we all thought Perl was bad! ;)
 
 There's been talk of Lua as an embedded templating language 
 for a while, and there's even an extension implementation.
 
 One advantage of Lua over other languages is that its 
 implementation is optimized for use as an embedded language, 
 and it looks kind of pretty.
 
 An _inherent_ disadvantage is that it's a fairly rarely-used 
 language, so still requires special learning on potential 
 template programmers' part.
 
 An _implementation_ disadvantage is that it currently is 
 dependent on an external Lua binary installation -- something 
 that probably won't be present on third-party installs, 
 meaning Lua templates couldn't be easily copied to 
 non-Wikimedia wikis.
 
 
 There are perhaps three primary alternative contenders that 
 don't involve making up our own scripting language (something 
 I'd dearly like to avoid):
 
 * PHP
 
 Advantage: Lots of webbish people have some experience with 
 PHP or can easily find references.
 
 Advantage: we're pretty much guaranteed to have a PHP 
 interpreter available. :)
 
 Disadvantage: PHP is difficult to lock down for secure execution.
 
 
 * JavaScript
 
 Advantage: Even more folks have been exposed to JavaScript 
 programming, including Wikipedia power-users.
 
 Disadvantage: Server-side interpreter not guaranteed to be 
 present. Like Lua, would either restrict our portability or 
 would require an interpreter reimplementation. :P
 
 
 * Python
 
 Advantage: A Python interpreter will be present on most web 
 servers, though not necessarily all. (Windows-based servers 
 especially.)
 
 Wash: Python is probably better known than Lua, but not as 
 well as PHP or JS.
 
 Disadvantage: Like PHP, Python is difficult to lock down securely.
 
 
 Any thoughts? Does anybody happen to have a PHP 
 implementation of a Lua or JavaScript interpreter? ;)
 

Would you want the interpreter to first translate the template into a PHP
array of opcodes, so that the result could be dumped into APC/memcached?
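
For illustration, a minimal sketch of that idea follows. It is written in
Python rather than PHP purely for brevity, it assumes a toy template syntax
with nothing but {{{param}}} placeholders, and it uses a plain dict as a
stand-in for APC/memcached. The names compile_template, get_opcodes and run
are made up for this example and are not part of MediaWiki.

```python
import hashlib

# Hypothetical in-memory stand-in for APC/memcached: source hash -> opcodes.
_opcode_cache = {}


def compile_template(source):
    """Turn 'Hello {{{name}}}!' into a flat list of (op, arg) tuples."""
    ops = []
    pos = 0
    while True:
        start = source.find("{{{", pos)
        if start == -1:
            break
        end = source.find("}}}", start)
        if end == -1:
            break  # unterminated placeholder: keep the rest as literal text
        if start > pos:
            ops.append(("TEXT", source[pos:start]))
        ops.append(("PARAM", source[start + 3:end].strip()))
        pos = end + 3
    if pos < len(source):
        ops.append(("TEXT", source[pos:]))
    return ops


def get_opcodes(source):
    """Compile once per distinct source text, then serve from the cache."""
    key = hashlib.sha1(source.encode("utf-8")).hexdigest()
    if key not in _opcode_cache:
        _opcode_cache[key] = compile_template(source)
    return _opcode_cache[key]


def run(ops, params):
    """Execute the opcode list against a dict of parameter values."""
    out = []
    for op, arg in ops:
        if op == "TEXT":
            out.append(arg)
        else:  # "PARAM": missing parameters expand to the empty string
            out.append(str(params.get(arg, "")))
    return "".join(out)
```

The point of the split is that parsing happens once per template text, while
run() only walks a flat list; whether that would pay off for real templates
is something that would have to be measured.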

Jared


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Trevor Parscal
I'm glad to see I'm not alone. JavaScript can indeed evoke bad
memories of fragile scripts running in IE5, which were long and awkward
due to limitations in browser technology at the time. However, anyone
who has used a modern library like jQuery on a supported browser will
tell you it's very powerful and intuitive while being simple,
straightforward and actually fun. Any language capable of supporting
this experience is worth seriously considering as an option for us.

- Trevor

Sent from my iPod


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread William Allen Simpson
William Allen Simpson wrote:
 I run Firefox with JS off by default for all wikimedia sites, because of
 serious problems in the not so recent past!
 
s/recent/distant/


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Daniel Schwen
 I run Firefox with JS off by default for all wikimedia sites, because of
 serious problems in the not so recent past!
 s/recent/distant/

Hooray JavaScript FUD!


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Gregory Maxwell
On Wed, Jul 1, 2009 at 11:21 AM, William Allen
Simpson <william.allen.simp...@gmail.com> wrote:
 * Doesn't inflate the number of languages used in the operation of the site

 This is the important checkbox, as far as integration with the project (my
 first criterion), but is the server side code already running JavaScript?
 For serving pages?

No, but MediaWiki and the sites are already chock-full of client-side code in JS.

You basically can't do advanced development for MediaWiki or the
Wikimedia sites without a degree of familiarity with JavaScript due to
client compatibility considerations.

 My general rule: coming over the network, presume it's bad data.

In this case we're not talking about the language MediaWiki is written
in; we're talking about a language used for server-side content
automation (templates).  In that case we'd be assuming the inputs are
toxic just like in the client-side case, since everything, including
the code itself, came in over the network.

I'll concede that there likely wouldn't be much code reuse, but I'd
attribute that more to the starkly different purpose and the fact that
the server version would have a different API (no DOM, but instead
functions for pulling data out of MediaWiki).


 And we have far too many examples of existing JS
 already being used in horrid templates, being promulgated in important
 areas such as large categories, that don't seem to work consistently, and
 don't work at all with JavaScript turned off.
 I run Firefox with JS off by default for all wikimedia sites, because of
 serious problems in the not so recent past!

Fortunately this is a non-issue here: better server-side scripting
enhances the site's ability to operate without requiring scripting on
the client.


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread William Allen Simpson
Hay (Husky) wrote:
 I'm sorry that you seem to have such bad experiences with JavaScript.
 Still, i don't think your comments are really valid in today's world.

You mean like the {{hidden}} template series?  How long that took to
finally work?

Worse, folks trying to use the classes directly, resulting in the contents
being centered, with the bullets and numbering removed:
<div class="NavFrame collapsed">
<div class="NavHead">Categories</div>
<div class="NavContent">

https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Wikipedia:Categories_for_discussion/Log/2009_June_6&diff=294783332&oldid=294782497
or
http://en.wikipedia.org/w/index.php?title=Wikipedia:Categories_for_discussion/Log/2009_June_6&diff=294783332&oldid=294782497

Believe me, user edits relying on JS, even where the JS isn't directly
accessible, are not really ready for prime time today.


 Take a look at  'web 2.0-style' applications, such as Gmail or Google
 Maps. Stuff like that would simply be impossible in a web browser
 without depending on proprietary technology such as Flash. 

Sure, and do you know how many months it took to get that to work, or
how many folks from the application security group it took to review it?


Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Steve Sanbeg
On Wed, 01 Jul 2009 09:42:31 +0400, Dmitriy Sintsov wrote:


 XSLT itself is far too locked down - even simple things like
 substring manipulation and loops aren't so easy to perform. Well, maybe
 I am too stupid for XSLT, but in my experience bringing tag syntax into a
 programming language makes the code poorly readable and bloated. I've
 used XSLT for just one of my projects.
 

I'd assume we want locked down.  Loops would be hard in any locked-down
environment; I don't recall seeing any recommendation in this thread on
how that would be done.  Recursion is much simpler: just track the depth,
and throw an exception if it goes too deep; Emacs Lisp already uses this
mechanism.
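
As a rough sketch of what "track the depth and throw" could look like, here
is a toy expander in Python. The {{name}} call syntax, the MAX_DEPTH value
and the names expand and DepthLimitExceeded are all assumptions made up for
this example, not anything from the MediaWiki parser.

```python
class DepthLimitExceeded(Exception):
    """Raised when template calls nest deeper than MAX_DEPTH."""


MAX_DEPTH = 40  # arbitrary value chosen for this sketch


def expand(name, templates, depth=0):
    """Expand templates[name], recursively expanding any {{other}} calls."""
    if depth > MAX_DEPTH:
        raise DepthLimitExceeded("%r nested deeper than %d" % (name, MAX_DEPTH))
    body = templates[name]
    out = []
    pos = 0
    while True:
        start = body.find("{{", pos)
        if start == -1:
            break
        end = body.find("}}", start)
        if end == -1:
            break  # unterminated call: treat the rest as plain text
        out.append(body[pos:start])
        # The depth counter travels with the call, one level per nesting.
        out.append(expand(body[start + 2:end].strip(), templates, depth + 1))
        pos = end + 2
    out.append(body[pos:])
    return "".join(out)
```

A self-referencing template such as {"loop": "{{loop}}"} then fails with
DepthLimitExceeded instead of running forever. Note that this bounds the
nesting depth only, not the total amount of work done.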

Some of those things may not be as easy as in other languages, but the string
functions that this thread was started over are built into XPath 2.0, so
it would solve the problem at hand.

 Deeply nested braces of lisp remind me of current MediaWiki parser.
 

Superficially, sure; but IMHO the real problem with the current parser is
the ambiguity, that when you see a construct begin like {something...
you need to keep reading before you can parse it.  With lisp, it's trivial
to parse, so we could do our own parsing if needed.
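
To give a feel for how small such a parser can be, here is a sketch of an
S-expression reader in Python. It assumes atoms are plain whitespace-separated
words (no strings, quoting or comments), and the names tokenize and parse are
made up for this example.

```python
def tokenize(text):
    """Split '(a (b c))' into ['(', 'a', '(', 'b', 'c', ')', ')']."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one atom or nested list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token
```

For example, parse(tokenize("(if (eq a b) yes no)")) gives the nested list
['if', ['eq', 'a', 'b'], 'yes', 'no']. One token of lookahead is enough to
decide what comes next, which is the contrast with the {something...
case above.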


 Lua was highly valued here at computer lab, also Ocaml (not sure of 
 proper spelling).
 Dmitriy

It seems like there are benefits there, but it's less clear how to
implement that sufficiently locked down, and how that would interface with
the rest of the parser, for callbacks, magic words, etc.



Re: [Wikitech-l] On templates and programming languages

2009-07-01 Thread Steve Sanbeg
On Tue, 30 Jun 2009 22:53:36 +0100, Thomas Dalton wrote:

 2009/6/30 Steve Sanbeg <ssan...@ask.com>:
 On Tue, 30 Jun 2009 21:38:07 +0100, Thomas Dalton wrote:

 2009/6/30 Michael Daly <michael.d...@kayakwiki.org>:

 How does that work with anonymous variables? Do all $[NUMBER] style
 names count as auto-declared?


 They're not anonymous, they're just named sequentially.
 
 They are anonymous when you call the template, though. The names are
 determined by the order in the call rather than written explicitly.
 They do need to be considered separately.

Anonymous would mean they don't have names, which isn't the case.  They
are named, but those names may, or may not, be implicit.  Currently, they
aren't handled separately; the parser names any unnamed arguments prior to
calling the template, which has no way of knowing how they were named. To
the template, they're all just named arguments: {{t|a|b}} is the same as
{{t|2=b|1=a}} or even {{t|2=b|a}}.
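
A small sketch of that naming step, in Python for illustration only. It
assumes unnamed arguments are numbered 1, 2, 3... in order of appearance,
counting only the unnamed ones, and it ignores nesting and escaping entirely.
The function name name_arguments is made up for this example.

```python
def name_arguments(call):
    """Map a call like '{{t|a|2=b}}' to (template name, {param name: value})."""
    inner = call.strip()[2:-2]  # drop the surrounding {{ and }}
    parts = inner.split("|")
    name, raw_args = parts[0].strip(), parts[1:]
    args = {}
    next_index = 1  # counts unnamed arguments only
    for raw in raw_args:
        if "=" in raw:
            key, value = raw.split("=", 1)
            args[key.strip()] = value.strip()
        else:
            args[str(next_index)] = raw
            next_index += 1
    return name, args
```

Under that assumption, {{t|a|b}}, {{t|2=b|1=a}} and {{t|2=b|a}} all produce
the same mapping {'1': 'a', '2': 'b'}, so the template body cannot tell them
apart.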



[Wikitech-l] On templates and programming languages

2009-06-30 Thread Brion Vibber
As many folks have noted, our current templating system works ok for 
simple things, but doesn't scale well -- even moderately complex 
conditionals or text-munging will quickly turn your template source into 
what appears to be line noise.

And we all thought Perl was bad! ;)

There's been talk of Lua as an embedded templating language for a while, 
and there's even an extension implementation.

One advantage of Lua over other languages is that its implementation is 
optimized for use as an embedded language, and it looks kind of pretty.

An _inherent_ disadvantage is that it's a fairly rarely-used language, 
so still requires special learning on potential template programmers' part.

An _implementation_ disadvantage is that it currently is dependent on an 
external Lua binary installation -- something that probably won't be 
present on third-party installs, meaning Lua templates couldn't be 
easily copied to non-Wikimedia wikis.


There are perhaps three primary alternative contenders that don't 
involve making up our own scripting language (something I'd dearly like 
to avoid):

* PHP

Advantage: Lots of webbish people have some experience with PHP or can 
easily find references.

Advantage: we're pretty much guaranteed to have a PHP interpreter 
available. :)

Disadvantage: PHP is difficult to lock down for secure execution.


* JavaScript

Advantage: Even more folks have been exposed to JavaScript programming, 
including Wikipedia power-users.

Disadvantage: Server-side interpreter not guaranteed to be present. Like 
Lua, would either restrict our portability or would require an 
interpreter reimplementation. :P


* Python

Advantage: A Python interpreter will be present on most web servers, 
though not necessarily all. (Windows-based servers especially.)

Wash: Python is probably better known than Lua, but not as well as PHP 
or JS.

Disadvantage: Like PHP, Python is difficult to lock down securely.


Any thoughts? Does anybody happen to have a PHP implementation of a Lua 
or JavaScript interpreter? ;)

-- brion


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Chad
On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber <br...@wikimedia.org> wrote:
 An _implementation_ disadvantage is that it currently is dependent on an
 external Lua binary installation -- something that probably won't be
 present on third-party installs, meaning Lua templates couldn't be
 easily copied to non-Wikimedia wikis.



I haven't tried it, but there seems to be a Lua Pecl extension.

-Chad


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 9:27 AM, Chad <innocentkil...@gmail.com> wrote:
 I haven't tried it, but there seems to be a Lua Pecl extension.

The Lua Pecl says:

We should emphasize that is still under development and is completely
experimental.

That was nearly two years ago and there doesn't appear to have been
any real work on it since.  Someone would probably need to look at it
carefully to make sure it is adequately functional before considering
that path.

-Robert Rohde


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Victor Vasiliev
Brion Vibber wrote:
 There are perhaps three primary alternative contenders that don't 
 involve making up our own scripting language (something I'd dearly like 
 to avoid):

   
I'm working on rewriting the AbuseFilter parser so that it's suitable for
embedding in wikitext. It's half-done and will be ready soon.
--


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 9:16 AM, Brion Vibber <br...@wikimedia.org> wrote:
<snip>

 There are perhaps three primary alternative contenders that don't
 involve making up our own scripting language (something I'd dearly like
 to avoid):

<snip>

In the Lua Bugzilla thread (#19298), there was some extended
discussion about using the AbuseFilter parser as the basis for a
MediaWiki scripting language.  From your comment, should I assume we
are taking that option off the table?

There are advantages to that approach in terms of integration and
flexibility, though rolling our own scripting language would obviously
be a quite complex (and probably long-term) undertaking.

-Robert Rohde


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brion Vibber
Robert Rohde wrote:
 On Tue, Jun 30, 2009 at 9:27 AM, Chadinnocentkil...@gmail.com wrote:
 I haven't tried it, but there seems to be a Lua Pecl extension.
 
 The Lua Pecl says:
 
 We should emphasize that is still under development and is completely
 experimental.
 
 That was nearly two years ago and there doesn't appear to have been
 any real work on it since.  Someone would probably need to look at it
 carefully to make sure it is adequately functional before considering
 that path.

A PECL extension wouldn't be a compatibility improvement over shelling 
out to a Lua binary; it still requires compilation and installation on 
the server. (Though it could be a performance win by having the Lua 
interpreter available in-process.)

-- brion


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brion Vibber
Robert Rohde wrote:
 In the Lua Bugzilla thread (#19298), there was some extended
 discussion about using the AbuseFilter parser as the basis for a
 Mediawiki scripting language.  From your comment should I assume we
 are taking that option off the table?
 
 There are advantages to that approach in terms of integration and
 flexibility, though rolling our own scripting language would obviously
 be a quite complex (and probably long-term) undertaking.

Right, that's exactly what I don't want to have to do.

I'd honestly rather implement a JS interpreter in PHP than create and 
maintain our own programming language, if it came to that. :)

-- brion


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 10:03 AM, Trevor Parscal <tpars...@wikimedia.org> wrote:
 On 6/30/09 9:16 AM, Brion Vibber wrote:
 Any thoughts? Does anybody happen to have a PHP implementation of a Lua
 or JavaScript interpreter? ;)

 -- brion


 GPL, Alpha software, seems to be abandoned in 2005
 http://j4p5.sourceforge.net/

 Perhaps this could be tested, considered, brought back to life, etc?

Google also turns up
http://phpjs.berlios.de/

Like J4P5 it also seems to be an abandoned alpha.

For stand-alone JavaScript interpreters there are some
well-supported projects, like Jaxer:
http://www.aptana.com/jaxer

However, this again gets back to separately compiled code, and would
not easily be able to interact with PHP.

-Robert Rohde


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Victor Vasiliev
Brion Vibber wrote:
 Victor Vasiliev wrote:
   
 I'm working on rewriting abuse filter parser so it's suitable for 
 embedding in wikitext. It's half-done and will be ready soon.
 

 Eh, I'd rather replace the AbuseFilter scripting with 
 JS/Lua/Python/whatever too. :)

 -- brion

We'll have to remove for() and while() from it anyway, and restrict it in
other ways.
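
As a sketch of what stripping loops out of a language could look like, the
following uses Python's own ast module to reject scripts containing for or
while statements before they are ever run. This is an analogy in Python, not
AbuseFilter code, and the name check_no_loops is made up for this example.

```python
import ast

# Statement-level loop constructs to reject.
FORBIDDEN = (ast.For, ast.While, ast.AsyncFor)


def check_no_loops(source):
    """Parse source and raise ValueError if it contains a loop statement."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN):
            raise ValueError("loops are not allowed (line %d)" % node.lineno)
    return tree
```

This only covers loop statements; comprehensions and recursion can still
repeat work and would need their own restrictions, which is presumably part
of the "other ways".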
--vvv


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
So far my favorite idea is to use a restricted subset of PHP.

I would like to broach an important topic however: How can we convert all of
the existing ParserFunctions and difficult-to-read template code to this new
language automatically? Are we really talking about the dream of getting rid
of templates entirely? The end of {{||}} ?

How difficult would it be to modify the parser to spit out some of its data
structures in the new language as opposed to HTML, etc.?

This seems to be the more difficult part of the project.

There is a more practical/pragmatic approach which is to deprecate the
current syntax similar to the way languages sometimes deprecate language
features. I fear that the conversion is a superhuman task, however.


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 10:15 AM, Brian <brian.min...@colorado.edu> wrote:
 So far my favorite idea is to use a restricted subset of PHP.

 I would like to broach an important topic however: How can we convert all of
 the existing ParserFunctions and difficult-to-read template code to this new
 language automatically? Are we really talking about the dream of getting rid
 of templates entirely? The end of {{||}} ?

 How difficult would it be to modify the parser to spit out some of its data
 structures in the new language as opposed to HTML etc.. ?

 This seems to be the more difficult part of the project.

 There is a more practical/pragmatic approach which is to deprecate the
 current syntax similar to the way languages sometimes deprecate language
 features. I fear that the conversion is a superhuman task, however.

You couldn't ever turn template syntax off without making old
revisions unrenderable.  The best one could likely do is encourage
people to upgrade and provide tools to make that easier.  However,
given the nastiness of template syntax, I would expect no end of wiki
authors willing to help convert the commonly used stuff.

-Robert Rohde


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
On Tue, Jun 30, 2009 at 11:20 AM, Robert Rohde <raro...@gmail.com> wrote:

 You couldn't ever turn template syntax off without making old
 revisions unrenderable.  The best one could likely do is encourage
 people to upgrade and provide tools to make that easier.  However,
 given the nastiness of template syntax, I would expect no end of wiki
 authors willing to help convert the commonly used stuff.

 -Robert Rohde


The solution (no doubt first developed on this list many years ago) is to
mark revisions that still trigger the template/parser-function parser
as using that syntax, and to mark revisions that have moved on as using the
new language.
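
Roughly, as a sketch in Python: each revision carries a syntax marker and the
renderer is chosen from it. The Revision class, the "legacy"/"scripted" marker
values and the two render functions are all hypothetical names invented for
this example; nothing like this exists in MediaWiki as far as this thread
says.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    text: str
    # Hypothetical marker; revisions saved before the switch keep the default.
    syntax: str = "legacy"


def render_legacy(text):
    """Placeholder for the existing template/parser-function parser."""
    return "[legacy parser] " + text


def render_scripted(text):
    """Placeholder for whatever the new language's renderer would be."""
    return "[new language] " + text


RENDERERS = {"legacy": render_legacy, "scripted": render_scripted}


def render(revision):
    """Pick the renderer from the revision's own syntax marker."""
    try:
        renderer = RENDERERS[revision.syntax]
    except KeyError:
        raise ValueError("unknown syntax marker: %r" % revision.syntax)
    return renderer(revision.text)
```

Old revisions never need converting under this scheme; they simply keep
selecting the old parser.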

Moreover, old revisions already are unrenderable. They may look like they
render correctly but in fact they don't. This is because MediaWiki has no
notion of the fact that a particular revision of an article also uses
particular revisions of templates, etc.

At any rate, I don't see how this nitpick is a difficult problem
technically.


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 10:45 AM, Amir E. Aharoni <amir.ahar...@gmail.com> wrote:
 On Tue, Jun 30, 2009 at 20:42, Robert Rohde <raro...@gmail.com> wrote:
 On Tue, Jun 30, 2009 at 10:22 AM, Bryan Tong
 Also for Python you really will want an editor that supports
 indenting. Web browsers are not suitable for programming Python.

 Though indenting is mandatory for Python, the use of reasonable
 indenting is pretty much necessary to produce readable code in any
 language.

 That said, I don't see any reason we couldn't use two or three
 consecutive spaces to indicate indentations.

 Four!

Four spaces is the conventional indent size in Python, but indents aren't
actually required to be that size.  If we are required to type
consecutive spaces to format code, I'd actually prefer a smaller
default size.
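
A quick way to check this, as a sketch: Python's built-in compile() accepts
any indent width as long as each block is indented consistently. The sample
sources and the helper names load and is_valid are made up for this example.

```python
# The same function written with two-space and four-space indents.
TWO_SPACE = "def f(x):\n  if x:\n    return 'yes'\n  return 'no'\n"
FOUR_SPACE = "def f(x):\n    if x:\n        return 'yes'\n    return 'no'\n"

# A dedent that matches no enclosing block is what Python actually rejects.
INCONSISTENT = "def f(x):\n    a = 1\n  b = 2\n"


def load(source):
    """Compile and execute source, returning the function f it defines."""
    namespace = {}
    exec(compile(source, "<template>", "exec"), namespace)
    return namespace["f"]


def is_valid(source):
    """Return True if source compiles, False on an indentation error."""
    try:
        compile(source, "<template>", "exec")
        return True
    except IndentationError:
        return False
```

Both TWO_SPACE and FOUR_SPACE compile to functions that behave identically;
only INCONSISTENT is refused.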

-Robert Rohde


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Steve Sanbeg
On Tue, 30 Jun 2009 09:16:41 -0700, Brion Vibber wrote:

 As many folks have noted, our current templating system works ok for 
 simple things, but doesn't scale well -- even moderately complex 
 conditionals or text-munging will quickly turn your template source into 
 what appears to be line noise.
 
 And we all thought Perl was bad! ;)
 
 There's been talk of Lua as an embedded templating language for a while, 
 and there's even an extension implementation.
 
 One advantage of Lua over other languages is that its implementation is 
 optimized for use as an embedded language, and it looks kind of pretty.
 
 An _inherent_ disadvantage is that it's a fairly rarely-used language, 
 so still requires special learning on potential template programmers' part.
 
 An _implementation_ disadvantage is that it currently is dependent on an 
 external Lua binary installation -- something that probably won't be 
 present on third-party installs, meaning Lua templates couldn't be 
 easily copied to non-Wikimedia wikis.
 
 
 There are perhaps three primary alternative contenders that don't 
 involve making up our own scripting language (something I'd dearly like 
 to avoid):
 

I was thinking about something similar this weekend, although I'd thought
about different languages:

1 - XSLT

  Since the syntax is XML (like the extension tags) and XPath (vaguely
  similar to template syntax, although it's XML that calls XPath, the
  opposite of what we have), it would be reasonably consistent with current
  syntax.  It should also already be fairly well locked down, and the
  interface seems fairly clear - present template parameters as stylesheet
  parameters, and other magic words as an input document.  We may just need
  a few simplifications to make it easier to use.

2 - Lisp/Scheme

  Should be easy to write a parser for if needed, since the grammar is so
  simple, and it should be relatively simple to lock down or extend as needed.

Of course, those are both a bit more esoteric than your recommendations.
Perl is nice for getting useful results from short code, if we're not
bothered by one parser with no grammar specification calling another one. Tcl
may be a reasonable compromise; a less esoteric, imperative language which is
often used as an extension language.
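
As a rough illustration of the claim that a Lisp-style grammar is easy to
parse, here is a minimal sketch of an S-expression reader. It is written
in Python purely for illustration; it handles only parentheses, integers,
and bare symbols (no strings, quoting, or comments), and the function
names are invented for this example.

```python
def tokenize(text):
    # Pad parentheses with spaces so split() separates them from atoms.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    # Consume tokens from the front and build nested Python lists.
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol

def parse(text):
    tokens = tokenize(text)
    tree = read(tokens)
    if tokens:
        raise SyntaxError("trailing tokens after expression")
    return tree
```

For example, parse("(if (eq x 1) yes no)") gives a nested list with the
inner call as a sub-list. Locking the language down would then be a
question of which symbols the evaluator agrees to look up.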
 

  






Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 10:22 AM, Bryan Tong
Minhbryan.tongm...@gmail.com wrote:
 On Tue, Jun 30, 2009 at 6:16 PM, Brion Vibberbr...@wikimedia.org wrote:

 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.

 Also for Python you really will want an editor that supports
 indenting. Web browsers are not suitable for programming Python.

Though indenting is mandatory for Python, the use of reasonable
indenting is pretty much necessary to produce readable code in any
language.

That said, I don't see any reason we couldn't use two or three
consecutive spaces to indicate indentations.

-Robert Rohde



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Amir E. Aharoni
On Tue, Jun 30, 2009 at 20:42, Robert Rohderaro...@gmail.com wrote:
 On Tue, Jun 30, 2009 at 10:22 AM, Bryan Tong
 Also for Python you really will want an editor that supports
 indenting. Web browsers are not suitable for programming Python.

 Though indenting is mandatory for Python, the use of reasonable
 indenting is pretty much necessary to produce readable code in any
 language.

 That said, I don't see any reason we couldn't use two or three
 consecutive spaces to indicate indentations.

Four!

-- 
Amir Elisha Aharoni

http://aharoni.wordpress.com

We're living in pieces,
 I want to live in peace. - T. Moore



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibberbr...@wikimedia.org wrote:
 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.

I think it would be easy to provide a very simple locked-down version,
with most of the features gone.  You could, for instance, only permit
variable assignment, use of built-in operators, a small whitelist of
functions, and conditionals.  You could omit loops, function
definitions, and abusable functions like str_repeat() (let alone
exec(), eval(), etc.) from a first pass.  This would still be vastly
more powerful, more readable, and faster than ParserFunctions.
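
To make the whitelisting idea concrete, here is a minimal sketch of the
same approach using Python's own ast module as a stand-in for a PHP
tokenizer. The set of permitted constructs mirrors the list above
(assignment, operators, conditionals, a few functions); the contents of
ALLOWED_FUNCTIONS are an arbitrary assumption, and this is not a reviewed
security boundary.

```python
import ast

# Constructs permitted in a first pass: no loops, no function definitions,
# no attribute access, no imports.
ALLOWED_NODES = (
    ast.Module, ast.Assign, ast.Expr, ast.If,
    ast.Name, ast.Constant, ast.Call,
    ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare,
    ast.operator, ast.unaryop, ast.boolop, ast.cmpop,
    ast.expr_context,
)
ALLOWED_FUNCTIONS = {"len", "str", "int"}  # hypothetical whitelist

def check(source):
    """Return a list of reasons to reject the source (empty means accepted)."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED_NODES):
            problems.append("disallowed construct: " + type(node).__name__)
        elif isinstance(node, ast.Call):
            func = node.func
            # Only direct calls to whitelisted names are accepted.
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_FUNCTIONS):
                problems.append("disallowed call")
    return problems
```

Anything not explicitly listed is rejected, so new language features are
closed by default rather than open by default.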

Hopefully, we could make this secure enough for your average
shared-host website to run it by default with no special measures
taken and without much risk.  Installations with more access and
higher security requirements, like Wikimedia, could shell out to a
process that's sandboxed on the OS level to be on the safe side.  I'd
like to hear what Tim thinks about the possibility of securing PHP
like this.

Of course, PHP is evil, and supporting it sucks.  :(  But if we
*really* *really* need to support users who can't shell out to other
programs, I think it's the only real language that's a feasible
solution.


I'd encourage you to consider requiring exec() support for full use of
Wikipedia templates, though.  Many really big shared hosts allow it,
like 1and1.com.  Anyone big enough to include much Wikipedia content
will likely be on at least a VPS anyway.  And if your host doesn't
support exec(), then at *worst* you can still get the articles in a
totally usable form -- just run Special:ExpandTemplates on all the
article's templates.  You can then transclude those on a per-article
basis; we could update Special:Export to make this easier.  The only
problem in this case would be that you can't easily change the
formatting of all the templates at once -- but such a small site would
likely have few enough articles to do it by hand, if they even want
to.

I think saying that users without exec() support get to use Wikipedia
content in a somewhat less usable form would be just fine, and it
would *really* open up our options.  We could support basically any
programming language in that case.

 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.

It doesn't matter whether it's present, does it?  If the user has
exec() support, they could download a binary interpreter for *any*
language to their webspace and run it from there regardless of whether
the language is supported on the host.  So Python is on exactly the
same level as Lua here.

Much though I love Python, Lua looks like the better option.  First of
all, it's *very* small.  sudo apt-get install lua50 on my machine uses
up only 180 KB of disk space, and the package is 30 KB gzipped.  Our
current tarballs are 10 MB; we could easily just chuck in Lua binaries
for Linux x86-32 and Windows without even noticing the size increase,
and allow users to enable it with one line in LocalSettings.php.  By
contrast, python2.6 is around 10 MB uncompressed, 2.5 MB compressed.
Perl is twice that size.  Windows users, or users with exec() allowed
but open_basedir preventing access to /usr/bin, would have to obtain
Python/Perl/etc. themselves.

It looks to me like Lua would be a lot easier to sandbox.  It seems
pretty simple to deny all I/O within the language itself, so you'd
(hopefully) just need memory and CPU limits.  Both of those could be
implemented on Linux with hard setrlimit() values plus nice.  Similar
things exist on Windows, hopefully accessible by command line somehow.
 If we're shipping binaries with MediaWiki, we could even hack the
code if necessary, to use whatever sandboxing mechanisms the OS makes
available, although hopefully that would be unneeded.
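
A minimal sketch of the setrlimit()-plus-nice approach described above,
assuming a POSIX host. It uses Python's resource and subprocess modules
to cap CPU time and address space for a child process; the helper names
and the default limits are illustrative choices, not anything MediaWiki
ships.

```python
import os
import resource
import subprocess
import sys

def _set_hard_limit(kind, value):
    # Never try to raise an existing hard limit; unprivileged processes cannot.
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))

def run_limited(argv, cpu_seconds=1, memory_bytes=1024 ** 3, timeout=10):
    """Run argv with CPU-time and address-space caps and lowered priority."""
    def apply_limits():
        # Runs in the child between fork() and exec(), so the parent is unaffected.
        _set_hard_limit(resource.RLIMIT_CPU, cpu_seconds)
        _set_hard_limit(resource.RLIMIT_AS, memory_bytes)
        os.nice(10)

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout,  # wall-clock backstop in case the child sleeps
    )
```

A script that loops forever is terminated by a signal once it uses up its
CPU allowance, while a well-behaved one returns its output normally. The
wall-clock timeout is a separate backstop, since RLIMIT_CPU does not count
time spent sleeping.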

I don't think we should fixate too much on how many people know the
language.  It's not hard to pick up a new language if you already know
one, and Lua has the reputation of being simple (although I haven't
tried to learn it).  I think Lua is the best option here.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Michael Daly
Brion Vibber wrote:
  Any thoughts? Does anybody happen to have a PHP implementation of a
  Lua or JavaScript interpreter?

Rather than reinventing the wheel, why not look at fixing the existing 
template syntax?

The biggest problem that I see is the excessive dependence on the 
braces { and }.  In a moderately complex template, you've got a mix 
of double {{...}} and triple {{{...}}} braces, occasionally nested, 
that result in an unreadable mess.

If {{{xxx}}} was replaced with a local-variable-like syntax, say $xxx 
(where xxx is whatever name you wish, $1, $2... for unnamed), then the 
mess is reduced from something like:

{{blah|{{{xxx}}}|{{{yyy}}}}}{{#if: {{{ggg}}}|{{{h}}}|{{{4}}}}}{{{5}}}

to:

{{blah|$xxx|$yyy}}{{#if: $ggg|$h|$4}}$5

which is somewhat more tolerable.  (whether or not the above makes real 
sense is not my objective - I'm just trying to show how removing the 
blizzard of {{{}}} reduces visual clutter).
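
As a rough sketch of how mechanical that rewrite is, the following Python
snippet converts plain {{{name}}} references to $name. The pattern and
function name are invented for illustration, and references that carry a
default value are deliberately left untouched.

```python
import re

# A parameter reference with no default: three braces, a name, three braces.
# Names containing braces or a pipe are excluded on purpose.
PARAM = re.compile(r"\{\{\{([^{}|]+)\}\}\}")

def to_dollar_syntax(wikitext):
    # Replace each plain parameter reference with $name.
    return PARAM.sub(lambda match: "$" + match.group(1).strip(), wikitext)
```

Applied to a line mixing template calls and parameters, only the
triple-brace references change; {{...}} calls and {{#if: ...}} stay as
they are.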

Since $ doesn't have a close, that makes things like {{{xxx|default 
value}}} slightly problematic, since $xxx|$default_value is slightly 
more awkward to parse.  But that only shows how templates are also 
overly reliant on the pipe (|) symbol - as anyone who has tried to use 
tables in templates has discovered.

If parsing templates allows the semi-restricted use of a couple of 
symbols (unlike parsing other pages - I know... don't go there), then 
both {{{}}} and | could be replaced with $ and 
I-don't-care-what-make-a-choice.  Then templates become a tad more 
readable and we get rid of kludges like {{!}} and other clutter or 
confusion in tables, parser functions, etc.

As an aside - obliging template writers to declare variables used in the 
template, say, as a definition of the input format at the top of the 
template definition, would make parsing the variables out later a tad 
easier.  If it's declared, it's a variable; if not, it's not a variable 
and is treated as plain text.  Thus the first line of a template would 
be the example of its use:

Template:foobar
--
{{Foobar|$var1|$var2|$andAnotherVar}}
...(implementation)...
--

But what do I know, I've only implemented one OO language compiler in my 
life and that was 20 years ago.

Mike




Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brion Vibber
Aryeh Gregor wrote:
 On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibberbr...@wikimedia.org wrote:
 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.
 
 I think it would be easy to provide a very simple locked-down version,
 with most of the features gone.  You could, for instance, only permit
 variable assignment, use of built-in operators, a small whitelist of
 functions, and conditionals.  You could omit loops, function
 definitions, and abusable functions like str_repeat() (let alone
 exec(), eval(), etc.) from a first pass.  This would still be vastly
 more powerful, more readable, and faster than ParserFunctions.

IMO by the time you've implemented your whitelisting parser you might as 
well just interpret it rather than eval()ing. (And of course, eval() 
might be disabled on the server. :)

Looping constructs are also extremely valuable -- at a minimum in a 
foreach() kind of way.

 I'd encourage you to consider requiring exec() support for full use of
 Wikipedia templates, though.  Many really big shared hosts allow it,
 like 1and1.com.  Anyone big enough to include much Wikipedia content
 will likely be on at least a VPS anyway.

It's not about Wikipedia content, but about being able to grab things 
you see on another wiki and use or adapt them to your own needs. We get 
lots of questions from people trying to grab some particular template 
off Wikipedia to use on their own site for their own needs.

 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.
 
 It doesn't matter whether it's present, does it?  If the user has
 exec() support, they could download a binary interpreter for *any*
 language to their webspace and run it from there regardless of whether
 the language is supported on the host.

Considering the amount of trouble people have getting texvc working, I 
wouldn't want to force that on people just to use templates.

 Much though I love Python, Lua looks like the better option.  First of
 all, it's *very* small.  sudo apt-get install lua50 on my machine uses
 up only 180 KB of disk space, and the package is 30 KB gzipped.

Python comes with batteries included, which is to say it's got a huge 
standard library (most of which of course wouldn't be available in a 
restricted environment). Lua's bare interpreter of course wins in an 
embedded-shipping contest. :D

 Our
 current tarballs are 10 MB; we could easily just chuck in Lua binaries
 for Linux x86-32 and Windows without even noticing the size increase,
 and allow users to enable it with one line in LocalSettings.php.

Hmm... it might be interesting to experiment with something like this, 
if it can _really_ be compiled standalone. (Linux binary distribution is 
a hellhole of incompatible linked library versions!)

 It looks to me like Lua would be a lot easier to sandbox.  It seems
 pretty simple to deny all I/O within the language itself, so you'd
 (hopefully) just need memory and CPU limits.

*nod* being designed as an embedded language is a win. :D

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Marco Schuster
On Tue, Jun 30, 2009 at 10:25 PM, Brion Vibber br...@wikimedia.org wrote:

 Aryeh Gregor wrote:
  Our
  current tarballs are 10 MB; we could easily just chuck in Lua binaries
  for Linux x86-32 and Windows without even noticing the size increase,
  and allow users to enable it with one line in LocalSettings.php.

 Hmm... it might be interesting to experiment with something like this,
 if it can _really_ be compiled standalone. (Linux binary distribution is
 a hellhole of incompatible linked library versions!)

Static compiling the stuff? How would this affect the binary size? (And: is
static linking working across different libc versions?)

BTW, what about Mac OS / FreeBSD hosts?

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Thomas Dalton
2009/6/30 Michael Daly michael.d...@kayakwiki.org:
 Brion Vibber wrote:
   Any thoughts? Does anybody happen to have a PHP implementation of a
   Lua or JavaScript interpreter?

 Rather than reinventing the wheel, why not look at fixing the existing
 template syntax?

I would support that. We really don't need a Turing-complete template system.

 As an aside - obliging template writers to declare variables used in the
 template, say, as a definition of the input format at the top of the
 template definition, would make parsing the variables out later a tad
 easier.  If it's declared, it's a variable; if not, it's not a variable
 and is treated as plain text.  Thus the first line of a template would
 be the example of its use:

 Template:foobar
 --
 {{Foobar|$var1|$var2|$andAnotherVar}}
 ...(implementation)...
 --

How does that work with anonymous variables? Do all $[NUMBER] style
names count as auto-declared?


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 4:25 PM, Brion Vibberbr...@wikimedia.org wrote:
 IMO by the time you've implemented your whitelisting parser you might as
 well just interpret it rather than eval()ing.

I don't think so.  You'd only have to do the whitelisting once, on
page save.  After that you could just execute with no extra overhead.
Even better, you could write it to a file and include() the file; this
would be a huge win if you have an opcode cache.  Of course, parsing
PHP within PHP should be much easier than parsing another language
within PHP: just use token_get_all() to do most of the work.

 (And of course, eval()
 might be disabled on the server. :)

Does anyone actually do this?  It would break a lot of major web apps,
surely.  If anyone does do this, it would still work if you could
write to a file and then include it.

 Looping constructs are also extremely valuable -- at a minimum in a
 foreach() kind of way.

Right, but we could live without them in an initial version.  They
could be added later.

 It's not about Wikipedia content, but about being able to grab things
 you see on another wiki and use or adapt them to your own needs. We get
 lots of questions from people trying to grab some particular template
 off Wikipedia to use on their own site for their own needs.

Sure.  The point still holds.  Some third parties would be unable to
use Wikipedia templates, yes.  But given the tangle of dependencies
the major ones have, and how complicated they are, I'm guessing most
small third-party wikis don't bother in the end anyway.  Requiring
exec() for full use of content is viable IMO.

 Considering the amount of trouble people have getting texvc working, I
 wouldn't want to force that on people just to use templates.

The problem with texvc is installing dependencies and compiling it.  A
much better analogy is things like diff3 -- which we shell out to out
of the box, with zero configuration, if they exist and shelling out
works.  (We'd probably want scripting off by default, of course, but
we could require just a single config line.)

 Python comes with batteries included, which is to say it's got a huge
 standard library (most of which of course wouldn't be available in a
 restricted environment). Lua's bare interpreter of course wins in an
 embedded-shipping contest. :D

Yep, but that's a big advantage.  It means Windows users don't have to
do any extra work.  It also lets us ensure a specific version is
reliably available.  Imagine Wikimedia using Python 2.6, and someone
trying to run that on some shared host running Fedora 8 or God knows
what, with Python 2.2 or something.  (Someone actually came into
#mediawiki a few months ago for help and it turned out their VPS was
something like Fedora 7 or 8.  And horribly overpriced at that!)

 Hmm... it might be interesting to experiment with something like this,
 if it can _really_ be compiled standalone. (Linux binary distribution is
 a hellhole of incompatible linked library versions!)

I hadn't thought of libraries, you're right.  It should work pretty
reliably on Linux (and hopefully not be too much bigger) if it's
statically linked, though, right?



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 4:33 PM, Marco
Schusterma...@harddisk.is-a-geek.org wrote:
 Static compiling the stuff? How would this affect the binary size?

Hopefully not too badly if you use the right options.  libc is huge,
but the linker should be able to throw out most of it if statically
linking, since Lua likely doesn't use most libc functions.

Alternatively, is the libc ABI stable enough that we could dynamically
link libc, and statically link everything else?  The other libraries
required are very small.

 (And: is
 static linking working across different libc versions?)

Yes, it should work fine, AFAIK.  If you statically link everything
you're just using the kernel ABIs, which are supposed to be very
stable (especially for reasonably common stuff).

 BTW, what about Mac OS / FreeBSD hosts?

Are there any shared webhosts you know of that run Mac or BSD?  At
worst, they can fall into the same group as the no-exec() camp, able
to use Wikipedia content but not 100%.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Jared Williams
 

 -Original Message-
 From: wikitech-l-boun...@lists.wikimedia.org 
 [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of 
 Aryeh Gregor
 Sent: 30 June 2009 20:56
 To: Wikimedia developers
 Subject: Re: [Wikitech-l] On templates and programming languages
 
 On Tue, Jun 30, 2009 at 12:16 PM, Brion 
 Vibberbr...@wikimedia.org wrote:
  * PHP
 
  Advantage: Lots of webbish people have some experience with 
 PHP or can 
  easily find references.
 
  Advantage: we're pretty much guaranteed to have a PHP interpreter 
  available. :)
 
  Disadvantage: PHP is difficult to lock down for secure execution.
 
 I think it would be easy to provide a very simple locked-down 
 version, with most of the features gone.  You could, for 
 instance, only permit variable assignment, use of built-in 
 operators, a small whitelist of functions, and conditionals.  
 You could omit loops, function definitions, and abusable 
 functions like str_repeat() (let alone exec(), eval(), etc.) 
 from a first pass.  This would still be vastly more powerful, 
 more readable, and faster than ParserFunctions.

It's a pity there is no way to lock code execution down to a single
namespace (thinking ahead to PHP 5.3):

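namespace Template
{
    // Shadow built-ins inside the namespace: unqualified calls made from
    // Template code resolve to these before falling back to the global ones.
    function strlen($string) { return \strlen($string) * 2; }
    function exec() { throw new \Exception(); }

    class Template
    {
        function paint()
        {
            // Redirect \ namespace to Template, so \exec() is also caught.
            echo strlen('data');
        }
    }
}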

Jared




Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Marco Schuster
On Tue, Jun 30, 2009 at 10:45 PM, Aryeh Gregor 
simetrical+wikil...@gmail.com wrote:

 Alternatively, is the libc ABI stable enough that we could dynamically
 link libc, and statically link everything else?  The other libraries
 required are very small.

I wouldn't count on this... at least we should provide a dyn-linked version
for those wanting less storage/memory/whatever consumption.

How do statically compiled programs for x86 platforms behave on x64, btw?
And what about more exotic platforms like ARM (which can also be
multi-endian, IXP4xx is an example) / SPARC (Toolserver!!!) or PowerPC? Are
they actually supported by Lua?



  BTW, what about Mac OS / FreeBSD hosts?

 Are there any shared webhosts you know of that run Mac or BSD?  At
 worst, they can fall into the same group as the no-exec() camp, able
 to use Wikipedia content but not 100%.


The webhoster hosting our school's homepage does, for example... They host
all schools in Munich, and I think they're a bit security-paranoid. We don't
have any issues hosting a MediaWiki there, actually. (OK, we never imported
WP content.)


Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 5:11 PM, Marco
Schusterma...@harddisk.is-a-geek.org wrote:
 How do statically compiled programs for x86 platforms behave on x64, btw?

I'm pretty sure they work fine.  Someone with more knowledge of Linux
binaries needs to comment on how we could best do this, though.

 And what about more exotic platforms like ARM (which can also be
 multi-endian, IXP4xx is an example) / SPARC (Toolserver!!!) or PowerPC? Are
 they actually supported by Lua?

Lua is designed to be extremely portable IIRC, across both
architectures and compilers.

 The webhoster hosting our school's homepage does, for example... They host
 all schools in Munich, and I think they're a bit security-paranoid.

That's not a shared host.  They can easily install Lua themselves.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Steve Sanbeg
On Tue, 30 Jun 2009 21:38:07 +0100, Thomas Dalton wrote:

 2009/6/30 Michael Daly michael.d...@kayakwiki.org:

 How does that work with anonymous variables? Do all $[NUMBER] style
 names count as auto-declared?
 

They're not anonymous, they're just named sequentially.  Most languages
should have some method of accessing/declaring those, e.g.

XSL:
  <xsl:param name="1">default</xsl:param>
  <xsl:value-of select="$1"/>

perl:
  my $p=$ARG{1};
  print $p;

etc...

If we do roll our own, it should have similar functionality.







Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Thomas Dalton
2009/6/30 Steve Sanbeg ssan...@ask.com:
 On Tue, 30 Jun 2009 21:38:07 +0100, Thomas Dalton wrote:

 2009/6/30 Michael Daly michael.d...@kayakwiki.org:

 How does that work with anonymous variables? Do all $[NUMBER] style
 names count as auto-declared?


 They're not anonymous, they're just named sequentially.

They are anonymous when you call the template, though. The names are
determined by the order in the call rather than written explicitly.
They do need to be considered separately.
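
A small sketch of the distinction, in Python for illustration only: when
a call is parsed, unnamed arguments get their names from their position,
while named arguments carry their names explicitly. The function
parse_call is hypothetical and handles only flat calls with no nested
templates or links.

```python
def parse_call(call):
    """Split a flat call such as {{Foobar|a|b|key=c}} into a name and parameters."""
    if not (call.startswith("{{") and call.endswith("}}")):
        raise ValueError("not a template call")
    name, *args = call[2:-2].split("|")
    params = {}
    position = 0
    for arg in args:
        key, separator, value = arg.partition("=")
        if separator:
            # Named argument: the caller wrote the name explicitly.
            params[key.strip()] = value.strip()
        else:
            # Unnamed argument: the name is determined by call order.
            position += 1
            params[str(position)] = arg
    return name.strip(), params
```

Inside the template both kinds look alike ($1 and $size are simply
names), but only the call site knows which ones were written out, which
is why a declaration syntax would need to treat them separately.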



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 12:56 PM, Aryeh
Gregorsimetrical+wikil...@gmail.com wrote:
 On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibberbr...@wikimedia.org wrote:
 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.

 I think it would be easy to provide a very simple locked-down version,
 with most of the features gone.  You could, for instance, only permit
 variable assignment, use of built-in operators, a small whitelist of
 functions, and conditionals.  You could omit loops, function
 definitions, and abusable functions like str_repeat() (let alone
 exec(), eval(), etc.) from a first pass.  This would still be vastly
 more powerful, more readable, and faster than ParserFunctions.

 Hopefully, we could make this secure enough for your average
 shared-host website to run it by default with no special measures
 taken and without much risk.  Installations with more access and
 higher security requirements, like Wikimedia, could shell out to a
 process that's sandboxed on the OS level to be on the safe side.  I'd
 like to hear what Tim thinks about the possibility of securing PHP
 like this.

 Of course, PHP is evil, and supporting it sucks.  :(  But if we
 *really* *really* need to support users who can't shell out to other
 programs, I think it's the only real language that's a feasible
 solution.


 I'd encourage you to consider requiring exec() support for full use of
 Wikipedia templates, though.  Many really big shared hosts allow it,
 like 1and1.com.  Anyone big enough to include much Wikipedia content
 will likely be on at least a VPS anyway.  And if your host doesn't
 support exec(), then at *worst* you can still get the articles in a
 totally usable form -- just run Special:ExpandTemplates on all the
 article's templates.  You can then transclude those on a per-article
 basis; we could update Special:Export to make this easier.  The only
 problem in this case would be that you can't easily change the
 formatting of all the templates at once -- but such a small site would
 likely have few enough articles to do it by hand, if they even want
 to.

 I think saying that users without exec() support get to use Wikipedia
 content in a somewhat less usable form would be just fine, and it
 would *really* open up our options.  We could support basically any
 programming language in that case.

 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.

 It doesn't matter whether it's present, does it?  If the user has
 exec() support, they could download a binary interpreter for *any*
 language to their webspace and run it from there regardless of whether
 the language is supported on the host.  So Python is on exactly the
 same level as Lua here.

 Much though I love Python, Lua looks like the better option.  First of
 all, it's *very* small.  sudo apt-get install lua50 on my machine uses
 up only 180 KB of disk space, and the package is 30 KB gzipped.  Our
 current tarballs are 10 MB; we could easily just chuck in Lua binaries
 for Linux x86-32 and Windows without even noticing the size increase,
 and allow users to enable it with one line in LocalSettings.php.  By
 contrast, python2.6 is around 10 MB uncompressed, 2.5 MB compressed.
 Perl is twice that size.  Windows users, or users with exec() allowed
 but open_basedir preventing access to /usr/bin, would have to obtain
 Python/Perl/etc. themselves.

 It looks to me like Lua would be a lot easier to sandbox.  It seems
 pretty simple to deny all I/O within the language itself, so you'd
 (hopefully) just need memory and CPU limits.  Both of those could be
 implemented on Linux with hard setrlimit() values plus nice.  Similar
 things exist on Windows, hopefully accessible by command line somehow.
  If we're shipping binaries with MediaWiki, we could even hack the
 code if necessary, to use whatever sandboxing mechanisms the OS makes
 available, although hopefully that would be unneeded.

 I don't think we should fixate too much on how many people know the
 language.  It's not hard to pick up a new language if you already know
 one, and Lua has the reputation of being simple (although I haven't
 tried to learn it).  I think Lua is the best option here.



In addition to resource limits, any scheme better make sure what's
passed into the programming language and what's passed out makes
sense.  For example, you shouldn't have it generating raw HTML and
probably shouldn't 

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
On Tue, Jun 30, 2009 at 11:20 AM, Robert Rohde raro...@gmail.com wrote:

 However,
 given the nastiness of template syntax, I would expect no end of wiki
 authors willing to help convert the commonly used stuff.

 -Robert Rohde

I was curious just how terrible a task conversion can be expected
to be. This is just a heuristic I came up with.

# Simple English parser functions
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
22,211

# Simple English templates
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{' | wc -l
416,126 - 22,211 = 393,915

# English parser functions
$  bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
430,980

# English templates
$  bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{' | wc -l
44,928,358 - 430,980 = 44,497,378
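
For anyone without bunzip2 and grep at hand, the same heuristic can be
sketched in Python. This is only an illustration: count_braces is a
made-up name, and it assumes the dump is UTF-8 text that can be read line
by line.

```python
import bz2


def count_braces(dump_path):
    """Count '{{' and '{{#' occurrences in a bzip2-compressed dump."""
    total = 0
    parser_functions = 0
    with bz2.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            # str.count() counts non-overlapping matches, like grep -o.
            total += line.count("{{")
            parser_functions += line.count("{{#")
    return {
        "parser_functions": parser_functions,
        # Same subtraction as above: every '{{#' was also counted as '{{'.
        "templates": total - parser_functions,
    }
```

Like the shell version, this counts calls in article text, not the
number of distinct templates.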



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Andrew Garrett

On 30/06/2009, at 9:42 PM, Aryeh Gregor wrote:

 On Tue, Jun 30, 2009 at 4:25 PM, Brion Vibberbr...@wikimedia.org  
 wrote:
 IMO by the time you've implemented your whitelisting parser you  
 might as
 well just interpret it rather than eval()ing.

 I don't think so.  You'd only have to do the whitelisting once, on
 page save.  After that you could just execute with no extra overhead.

That's just scary. We'd definitely want to do the validation as close  
as possible to the actual eval()ing, to minimise backdoors like  
Special:Import et al.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us






Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Hay (Husky)
I would opt for Javascript.

PHP and Python are intended for large and complex applications and
come with a huge standard library people probably expect to be
available. Security concerns are a problem too, so a subset would
probably be necessary. So, in essence you get a crippled-down language
that isn't really useful for templates.

Making our own language, either by 'fixing' the template language or
by inventing something new would only mean we introduce a new language
that'll be specific to our own platform and nobody knows outside of
MediaWiki developers.

XSLT is not meant to be written or read by humans. It's a
Turing-complete language stuffed into horrendous XML statements. Let's
not go down that road.

That leaves us to Lua and Javascript, which are both small and
efficient languages meant to solve tasks like this. Remember, I'm
talking about 'core' Javascript here, not with all DOM methods and
stuff. If you strip that all out (take a look at the 1.5. core
reference at Mozilla.com:
https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference) you
get a pretty nice and simple language that isn't very large. Both
would require a new parser and/or installed compilers on the
server-side. Compared to the disadvantages of other options, that
seems like a pretty small loss for a great win.

Javascript is a widely understood and implemented language, with lots
of efforts to get it even faster in modern browsers. Every Wikipedia
user has a copy of it implemented in their browser and can start
experimenting without the need for installing a compiler or a web
server. Many people program in Javascript, so you have a huge
potential number of people who could start programming MediaWiki
templates. And it's already closely tied to the web, so you don't have
to invent new ways of dealing with web-specific stuff.

So, let's choose Javascript as our new template programming language.

Regards,
-- Hay

On Tue, Jun 30, 2009 at 6:16 PM, Brion Vibberbr...@wikimedia.org wrote:
 As many folks have noted, our current templating system works ok for
 simple things, but doesn't scale well -- even moderately complex
 conditionals or text-munging will quickly turn your template source into
 what appears to be line noise.

 And we all thought Perl was bad! ;)

 There's been talk of Lua as an embedded templating language for a while,
 and there's even an extension implementation.

 One advantage of Lua over other languages is that its implementation is
 optimized for use as an embedded language, and it looks kind of pretty.

 An _inherent_ disadvantage is that it's a fairly rarely-used language,
 so still requires special learning on potential template programmers' part.

 An _implementation_ disadvantage is that it currently is dependent on an
 external Lua binary installation -- something that probably won't be
 present on third-party installs, meaning Lua templates couldn't be
 easily copied to non-Wikimedia wikis.


 There are perhaps three primary alternative contenders that don't
 involve making up our own scripting language (something I'd dearly like
 to avoid):

 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.


 * JavaScript

 Advantage: Even more folks have been exposed to JavaScript programming,
 including Wikipedia power-users.

 Disadvantage: Server-side interpreter not guaranteed to be present. Like
 Lua, would either restrict our portability or would require an
 interpreter reimplementation. :P


 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.


 Any thoughts? Does anybody happen to have a PHP implementation of a Lua
 or JavaScript interpreter? ;)

 -- brion



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Robert Rohde
On Tue, Jun 30, 2009 at 4:01 PM, Brianbrian.min...@colorado.edu wrote:
 On Tue, Jun 30, 2009 at 11:20 AM, Robert Rohde raro...@gmail.com wrote:

 However,
 given the nastiness of template syntax, I would expect no end of wiki
 authors willing to help convert the commonly used stuff.

 -Robert Rohde

 I was curious just how terrible of a task conversion can be expected
 to be. This is just a heuristic I came up with..

 # Simple English parser functions
 $ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
 22,211

 # Simple English templates
 $ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{' | wc -l
 416,126 - 22,211 = 393,915

 # English parser functions
 $  bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
 430,980

 # English templates
 $  bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{' | wc -l
 44,928,358 - 430,980 = 44,497,378

I assume we are primarily talking about replacing template code and
not template calls, per se.

In other words, I assume things like {{fact}} and {{msg | foo is
bar }} will be basically unchanged on the article side but
rewritten on the implementation side in Template: space.  If that is
correct, it would be more useful to simply ask how large Template:
space is rather than counting all the template calls.
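
A rough sketch of how one might measure that from the same dumps, in
Python. The function name is invented for the example, and it assumes
the pages-articles layout (page elements containing title and text) and
an English "Template:" prefix; other wikis localize the namespace name.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def template_space_size(dump_path, prefix="Template:"):
    """Return (page count, total characters of wikitext) for Template: pages."""
    pages = 0
    chars = 0
    with bz2.open(dump_path, "rb") as dump:
        # iterparse streams the file, so the whole dump is never in memory.
        for _, elem in ET.iterparse(dump, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title = ""
            text = ""
            for node in elem.iter():
                name = local_name(node.tag)
                if name == "title":
                    title = node.text or ""
                elif name == "text":
                    text = node.text or ""
            if title.startswith(prefix):
                pages += 1
                chars += len(text)
            elem.clear()  # drop the finished page's children
    return pages, chars
```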

-Robert Rohde



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
On Tue, Jun 30, 2009 at 6:09 PM, Robert Rohderaro...@gmail.com wrote:
 In other words, I assume things like {{fact}} and {{msg | foo is
 bar }} will be basically unchanged on the article side but
 rewritten on the implementation side in Template: space.  If that is
 correct, it would be more useful to simply ask how large Template:
 space is rather than counting all the template calls.

 -Robert Rohde

Mixing the new language with existing wikicode? With a new language I
would like to see the old language go out the door. The end of double
braces.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brion Vibber
Aryeh Gregor wrote:
 On Tue, Jun 30, 2009 at 4:25 PM, Brion Vibberbr...@wikimedia.org wrote:
 It's not about Wikipedia content, but about being able to grab things
 you see on another wiki and use or adapt them to your own needs. We get
 lots of questions from people trying to grab some particular template
 off Wikipedia to use on their own site for their own needs.
 
 Sure.  The point still holds.  Some third parties would be unable to
 use Wikipedia templates, yes.  But given the tangle of dependencies
 the major ones have, and how complicated they are, I'm guessing most
 small third-party wikis don't bother in the end anyway.

That's why we want to fix it! :)

It *should* be fairly trivial to fetch a template/plugin sort of thing 
off of one wiki and put it on another. Consider this as one of our goals 
for next-gen templating.

-- brion



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Thomas Dalton
2009/7/1 Brian brian.min...@colorado.edu:
 On Tue, Jun 30, 2009 at 6:09 PM, Robert Rohderaro...@gmail.com wrote:
 In other words, I assume things like {{fact}} and {{msg | foo is
 bar }} will be basically unchanged on the article side but
 rewritten on the implementation side in Template: space.  If that is
 correct, it would be more useful to simply ask how large Template:
 space is rather than counting all the template calls.

 -Robert Rohde

 Mixing the new language with existing wikicode? With a new language I
 would like to see the old language go out the door. The end of double
 braces.

What would you replace them with? The wikitext used by regular editors
should be as simple as possible, we don't want to require PHP or
Javascript to be used by anyone wanting to add an infobox to an
article.


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Chad
On Tue, Jun 30, 2009 at 8:28 PM, Brianbrian.min...@colorado.edu wrote:
 On Tue, Jun 30, 2009 at 6:09 PM, Robert Rohderaro...@gmail.com wrote:
 In other words, I assume things like {{fact}} and {{msg | foo is
 bar }} will be basically unchanged on the article side but
 rewritten on the implementation side in Template: space.  If that is
 correct, it would be more useful to simply ask how large Template:
 space is rather than counting all the template calls.

 -Robert Rohde

 Mixing the new language with existing wikicode? With a new language I
 would like to see the old language go out the door. The end of double
 braces.



Unless we plan on trying to mass-convert not only years of old revisions
but change years-old behavior that millions of users have come to expect?
I would expect _any_ change to keep {{sometemplate}} always working,
even if the mechanics behind it change.

-Chad


Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
On Tue, Jun 30, 2009 at 6:34 PM, Thomas Daltonthomas.dal...@gmail.com wrote:
 What would you replace them with? The wikitext used by regular editors
 should be as simple as possible, we don't want to require PHP or
 Javascript to be used by anyone wanting to add an infobox to an
 article.

There is nothing in the OP that indicates that we are keeping the
current template code or even that it would be desirable. Whatever
facilities the language we choose has for including other files and
passing arguments to functions is 100% sufficient.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Brian
On Tue, Jun 30, 2009 at 6:43 PM, Thomas Daltonthomas.dal...@gmail.com wrote:
 There is no proposal to replace wikitext with PHP (it wouldn't even
 work, PHP isn't a markup language, ditto Javascript, Python, etc.),
 the proposal is to replace the template code, ie. the code on the
 template pages.

The OP does not say it is a recommendation to replace ParserFunctions;
it says "our current templating system". In my mind that absolutely
includes the use of templates in the article namespace.

There are lots of usability improvements that can be made to the
templating system. First and foremost the new system should allow
advanced wiki users to perform programmatic operations on article data
without the requirement that the data in the article be made
unreadable.

If we only focus our efforts on making the template namespace more
complicated by giving it a more advanced programming language and we
leave the article namespace as it is then we have not even touched the
usability issue. We have just made it worse.

I do of course have some specific ideas about how to achieve this
goal, but I'm kind of in shock and awe that it's not seen as the
main reason for improving the template system!



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread randomcoder1
http://jtemplates.tpython.com/ ? :)

Trevor Parscal wrote:
 Seems like JSON syntax is pretty simple and could be a big improvement  
 to how templates are currently invoked.

 Bottom line, a well defined syntax like JavaScript is going to be more  
 user friendly than a syntax which is only defined by the behavior of a  
 parser with no standardization at all.

 - Trevor

 Sent from my iPod

 On Jun 30, 2009, at 5:34 PM, Thomas Dalton thomas.dal...@gmail.com  
 wrote:

   
 2009/7/1 Brian brian.min...@colorado.edu:
 
 On Tue, Jun 30, 2009 at 6:09 PM, Robert Rohderaro...@gmail.com  
 wrote:
   
 In other words, I assume things like {{fact}} and {{msg | foo is
 bar }} will be basically unchanged on the article side but
 rewritten on the implementation side in Template: space.  If that is
 correct, it would be more useful to simply ask how large Template:
 space is rather than counting all the template calls.

 -Robert Rohde
 
 Mixing the new language with existing wikicode? With a new language I
 would like to see the old language go out the door. The end of double
 braces.
   
 What would you replace them with? The wikitext used by regular editors
 should be as simple as possible, we don't want to require PHP or
 Javascript to be used by anyone wanting to add an infobox to an
 article.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 6:08 PM, Robert Rohderaro...@gmail.com wrote:
 In addition to resource limits, any scheme better make sure what's
 passed into the programming language and what's passed out makes
 sense.  For example, you shouldn't have it generating raw HTML and
 probably shouldn't let it mess with strip markers.  Some of this may
 be automatic depending how it's integrated into the parser.  One would
 probably also want to limit the size of an allowed output (e.g. don't
 let it send 5 MB to the user).  Depending on the integration there may
 be other control sequences that one needs to catch when it returns as
 well.

I was assuming it would just return wikitext, and that would be
integrated into the page and parsed, following all limits on wikitext
(including size) -- just as with current parser functions.

 On a separate point, one of the limitations of stand-alone type
 sandboxes is that it would make it harder for the code to call other
 template pages.  One of the few virtues of the current template code
 is that it is relatively modular, with more complex templates being
 built out of less complex ones.  If this programming language is meant
 to replace that then it would also need to be able to reference the
 results of other template pages.  One solution is to pre-expand those
 sections (similar to what is done now, I believe), but that can get
 rather delicate once one has programming constructs like variable
 assignments, looping, and recursion since the template parameters
 won't necessarily be fixed at the Preprocessor stage.

I'd assume we'd support some kind of includes.  One rudimentary way to
do it would be to run Lua stuff after or during preprocessing, so you
could just include Lua code macro-style using templates.  A better way
would probably be to support the include features of the language
itself (I don't know how they work offhand, for Lua).

On Tue, Jun 30, 2009 at 6:12 PM, Jared
Williamsjared.willia...@ntlworld.com wrote:
 Yeah, would also need time & mem use restrictions.

Which is impossible for in-process use.  You'd have to shell out if
you do that, which defeats the entire point of using PHP instead of
something else to begin with.

On Tue, Jun 30, 2009 at 7:16 PM, Andrew Garrettagarr...@wikimedia.org wrote:
 That's just scary. We'd definitely want to do the validation as close
 as possible to the actual eval()ing, to minimise backdoors like
 Special:Import et al.

You'd be saving the code to a file on disk somewhere, probably named
using a hash of the input.  The only thing saving the code would be
the code that sanitizes it.  There's no way anything could go wrong
unless an attacker gains filesystem write access, in which case you're
hosed anyway.  Parsing PHP on every page view when you could cache it
in APC is crazy.
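
To make the idea concrete, here is a small sketch in Python rather than
PHP. All names are hypothetical, and is_safe stands in for whatever
whitelisting validator would actually be used. The file is named after a
hash of the code, and the loader re-hashes the file, which is an extra
step not described above and would add a cost that plain APC caching
avoids.

```python
import hashlib
import os


def cache_path(cache_dir, source):
    """Name the cache file after the SHA-256 of the template code."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".cache")


def store_validated(cache_dir, source, is_safe):
    """Write source to the cache only if the validator accepts it."""
    if not is_safe(source):
        raise ValueError("template code rejected by validator")
    path = cache_path(cache_dir, source)
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(source)
    return path


def load_checked(path):
    """Return cached code, or None if the file no longer matches its name."""
    with open(path, encoding="utf-8", newline="") as handle:
        source = handle.read()
    expected = os.path.basename(path).split(".")[0]
    actual = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return source if actual == expected else None
```

The re-hash only catches files that were modified in place; it does
nothing against an attacker who can choose what gets looked up.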

On Tue, Jun 30, 2009 at 7:24 PM, Hay (Husky)hus...@gmail.com wrote:
 That leaves us to Lua and Javascript, which are both small and
 efficient languages meant to solve tasks like this. Remember, I'm
 talking about 'core' Javascript here, not with all DOM methods and
 stuff. If you strip that all out (take a look at the 1.5. core
 reference at Mozilla.com:
 https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference) you
 get a pretty nice and simple language that isn't very large. Both
 would require a new parser and/or installed compilers on the
 server-side. Compared to the disadvantages of other options, that
 seems like a pretty small loss for a great win.

Reasonable enough, yeah.  Sandboxing might be easier too.  What are some
standalone JavaScript interpreters we could use?  Ideally we'd use a
heavily-optimized JIT compiler, like V8 or TraceMonkey, but I don't
know if those work standalone.

On Tue, Jun 30, 2009 at 8:33 PM, Brion Vibberbr...@wikimedia.org wrote:
 That's why we want to fix it! :)

 It *should* be fairly trivial to fetch a template/plugin sort of thing
 off of one wiki and put it on another. Consider this as one of our goals
 for next-gen templating.

Eh.  Then that really ties our hands.  If we have to have support for
shared hosts without exec() support, then I don't see any viable
option except sanitized PHP.

On Tue, Jun 30, 2009 at 8:37 PM, Brion Vibberbr...@wikimedia.org wrote:
 Executing PHP from apache-writable files saved on disk is also a
 security danger.

 The original implementation of the MonoBook skin used the TAL templating
 language, which was compiled into executable PHP at runtime and stored
 in /tmp so it could be cached for the next view.

 In addition to difficulties with hosts which had misconfigured /tmp
 directories, we found that people sharing their hosts with
 poorly-secured WordPress installations would end up finding their wikis
 hacked -- worms exploiting vulnerabilities in other PHP apps would hop
 around the system modifying any .php files they could write to...
 including the cached PHPTAL templates.

It could be eval()ed by default, but the performance wins from using
APC would surely be huge. 

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Sergey Chernyshev
I don't know about scripting languages for the templating, it might be an
overkill.

When I was picking an underlying language for the MediaWiki Widgets
extension, I looked at popular PHP templating systems and ended up
picking Smarty (http://smarty.net/): it can be security locked, and it
has a few useful features.

You can see Widget code here:
http://www.mediawikiwidgets.org/w/index.php?title=Widget:Google_Calendar&action=edit
and a widget is called using a parser function like this:
{{widget: Name|param=val|param2=val2}}.

Double curly braces are far from perfect, but there are not that many
good alternatives. XML is probably the only good one, because it's
universal and used by many, many tools out there. I can't say that I'm
an expert in templating languages, though, especially when we're
talking about power users and not developers.

Thank you,

Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/


On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber br...@wikimedia.org wrote:

 As many folks have noted, our current templating system works ok for
 simple things, but doesn't scale well -- even moderately complex
 conditionals or text-munging will quickly turn your template source into
 what appears to be line noise.

 And we all thought Perl was bad! ;)

 There's been talk of Lua as an embedded templating language for a while,
 and there's even an extension implementation.

 One advantage of Lua over other languages is that its implementation is
 optimized for use as an embedded language, and it looks kind of pretty.

 An _inherent_ disadvantage is that it's a fairly rarely-used language,
 so still requires special learning on potential template programmers' part.

 An _implementation_ disadvantage is that it currently is dependent on an
 external Lua binary installation -- something that probably won't be
 present on third-party installs, meaning Lua templates couldn't be
 easily copied to non-Wikimedia wikis.


 There are perhaps three primary alternative contenders that don't
 involve making up our own scripting language (something I'd dearly like
 to avoid):

 * PHP

 Advantage: Lots of webbish people have some experience with PHP or can
 easily find references.

 Advantage: we're pretty much guaranteed to have a PHP interpreter
 available. :)

 Disadvantage: PHP is difficult to lock down for secure execution.


 * JavaScript

 Advantage: Even more folks have been exposed to JavaScript programming,
 including Wikipedia power-users.

 Disadvantage: Server-side interpreter not guaranteed to be present. Like
 Lua, would either restrict our portability or would require an
 interpreter reimplementation. :P


 * Python

 Advantage: A Python interpreter will be present on most web servers,
 though not necessarily all. (Windows-based servers especially.)

 Wash: Python is probably better known than Lua, but not as well as PHP
 or JS.

 Disadvantage: Like PHP, Python is difficult to lock down securely.


 Any thoughts? Does anybody happen to have a PHP implementation of a Lua
 or JavaScript interpreter? ;)

 -- brion



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Aryeh Gregor
On Tue, Jun 30, 2009 at 10:45 PM, Sergey
Chernyshevsergey.chernys...@gmail.com wrote:
 I don't know about scripting languages for the templating, it might be an
 overkill.

People are using ParserFunctions as a scripting language already.
That's not feasibly going to be removed at this point.  So the only
way to go is to replace it with a better scripting language, which is
what we're talking about.



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Tim Starling
Brion Vibber wrote:
 There's been talk of Lua as an embedded templating language for a while, 
 and there's even an extension implementation.
 
 One advantage of Lua over other languages is that its implementation is 
 optimized for use as an embedded language, and it looks kind of pretty.
 
 An _inherent_ disadvantage is that it's a fairly rarely-used language, 
 so still requires special learning on potential template programmers' part.
 
 An _implementation_ disadvantage is that it currently is dependent on an 
 external Lua binary installation -- something that probably won't be 
 present on third-party installs, meaning Lua templates couldn't be 
 easily copied to non-Wikimedia wikis.

There are problems with all the shell-based solutions. MediaWiki
callbacks, like template expansion, {{VARIABLES}} and ifexist, are
commonly used in templates on Wikipedia, and a scripting language
without these would suffer from poor community buy-in. You could
implement them from the shell using IPC, but IPC in PHP is rather
cumbersome. The interface between the parser and the scripting engine
would be performance-sensitive, because users would write templates
that invoked the scripting engine hundreds of times in the course of
rendering an article. So there's a case there for a persistent
scripting engine with a command-based interface over a pipe.
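
As a rough illustration of that last idea (a sketch, not a proposed
implementation): the Python below starts one long-lived worker process
and sends it one JSON command per line over a pipe, so the startup cost
is paid once rather than per invocation. The worker, the command names
"upper" and "repeat", and the PipeEngine class are all invented for the
example.

```python
import json
import subprocess
import sys

# Stand-in for the scripting engine: reads one JSON request per line and
# writes one JSON reply per line until stdin is closed.
WORKER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    if request["op"] == "upper":
        result = request["arg"].upper()
    elif request["op"] == "repeat":
        result = request["arg"] * request["times"]
    else:
        result = None
    sys.stdout.write(json.dumps({"result": result}) + "\n")
    sys.stdout.flush()
'''


class PipeEngine:
    """Keeps one worker process alive and talks to it over its pipes."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1)

    def call(self, **request):
        # One request line out, one reply line back.
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())["result"]

    def close(self):
        self.proc.stdin.close()  # EOF ends the worker's loop
        self.proc.stdout.close()
        self.proc.wait()
```

Callbacks from the engine back into MediaWiki would need the protocol to
run in both directions, which this sketch does not attempt.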

The reason I like Lua is because of the potential to embed it in PHP
as an extension, with fast setup and fast callbacks to MediaWiki. It
does all its memory allocation via a callback to the application,
including VM stack space, which means that it's possible to control
the memory usage without killing the process when the limit is
exceeded. But its standard library is unsuitable for running untrusted
scripts, since it contains all the usual process control and file
read/write functions.

The current PECL extension doesn't have any of the features that make
Lua attractive: it does not have support for callbacks to PHP, or for
replacing the standard library with something more sensible, or for
limiting memory without killing the request when the limit is
exceeded. Obviously the distributed standalone does not have these
features either.

I had imagined the task of embedding Lua in MediaWiki as being
primarily a C project, writing the necessary glue code between the
embedded interpreter and PHP. I had hoped that banging the drum for
Lua might encourage someone to look at these issues and start work on
that project.


 * PHP
 
 Advantage: Lots of webbish people have some experience with PHP or can 
 easily find references.
 
 Advantage: we're pretty much guaranteed to have a PHP interpreter 
 available. :)
 
 Disadvantage: PHP is difficult to lock down for secure execution.

PHP can be secured against arbitrary execution using token_get_all(),
there's a proof-of-principle validator of this kind in the master
switch script project. But there are problems with attempting a
single-process PHP-in-PHP sandbox:

* The poor support for signals in PHP makes it difficult to limit the
execution time of a script snippet. Ticks only occur at the end of
each statement, so you can defeat them by making a single statement
that runs forever.

* Apart from blacklisting function definition, there is no way to
protect against infinite recursion, which exhausts the process stack
and causes a segfault.

* Memory limits are implemented on a per-request basis, and there's no
way to recover from exceeding the memory limit, the request is just
killed.
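
The token-whitelisting approach mentioned before this list can be shown
by analogy. The sketch below uses Python's ast module in place of PHP's
token_get_all(), so it is not the validator from the master switch
script project, and the allowed names are hypothetical template
parameters. It assumes Python 3.8 or later.

```python
import ast

# Only these syntax node types may appear anywhere in the expression.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
    ast.Name, ast.Load,
)
ALLOWED_NAMES = {"width", "count"}  # hypothetical template parameters


def is_whitelisted(source):
    """True if source is an expression built only from allowed constructs."""
    try:
        tree = ast.parse(source, mode="eval")
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            return False
    return True
```

A check like this decides what code may run at all; it does nothing
about the time, recursion and memory problems listed above.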

 * JavaScript
 
 Advantage: Even more folks have been exposed to JavaScript programming, 
 including Wikipedia power-users.
 
 Disadvantage: Server-side interpreter not guaranteed to be present. Like 
 Lua, would either restrict our portability or would require an 
 interpreter reimplementation. :P
 
 
 * Python
 
 Advantage: A Python interpreter will be present on most web servers, 
 though not necessarily all. (Windows-based servers especially.)
 
 Wash: Python is probably better known than Lua, but not as well as PHP 
 or JS.
 
 Disadvantage: Like PHP, Python is difficult to lock down securely.
 
 
 Any thoughts? Does anybody happen to have a PHP implementation of a Lua 
 or JavaScript interpreter? ;)

SpiderMonkey and Python both lack control over memory usage. Python
lacks a sandbox mode, the rexec module has been removed. SpiderMonkey
isn't embedded in any useful kind of standalone, so you'd have to
start with a C development project, like you would for Lua.

I think Rhino would be an easier path to JavaScript execution than
SpiderMonkey. You can pass an -Xmx option to the java VM, and it'll
throw an OutOfMemory exception when it hits that limit, allowing you
to implement per-snippet memory limits without killing the
interpreter. You could do wall-clock time limits using
java.util.Timer, or CPU time limits using a JNI hack to poll clock().
You could turn off LiveConnect by making your own ClassShutter,
leaving what (on initial impressions) is 

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Thomas Dalton
2009/7/1 Michael Daly michael.d...@kayakwiki.org:
 Why not switch the template syntax for articles to match the syntax for
 tags (which in turn is based on XML or whatever syntax that comes from
 ultimately)?

What is wrong with the current syntax for calling templates? At least,
what is wrong with it that would be improved by that change?



Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Dmitriy Sintsov
 1 - XSLT

   Since the syntax is XML (like the extension tags) and XPath (vaguely
   similar to template syntax, although it's XML that calls XPath, the
   opposite of what we have), it would be reasonably consistent with
   current syntax.  It should also already be fairly well locked down,
   and the interface seems fairly clear: present template parameters as
   stylesheet parameters, and other magic words as an input document.
   We may just need a few simplifications to make it easier to use.

XSLT itself is way too locked down: even simple things like substring
manipulation and loops aren't easy to perform. Well, maybe I am too
stupid for XSLT, but in my experience bringing tag syntax into a
programming language makes the code poorly readable and bloated. I've
used XSLT for just one of my projects.

 2 - lisp/scheme

   Should be easy to write a parser for if needed, since the grammar is
   so simple, and it should be relatively simple to lock down or extend
   as needed.

Deeply nested braces of lisp remind me of current MediaWiki parser.

 Of course, those are both a bit more esoteric than your
 recommendations.  Perl is nice for getting useful results from short
 code, if we're not bothered by one parser with no grammar
 specification calling another one.  Tcl may be a reasonable
 compromise: a less esoteric, imperative language which is often used
 as an extension language.

Lua was highly valued here at computer lab, also Ocaml (not sure of 
proper spelling).
Dmitriy
