Re: [Wikitech-l] zh.wikipedia including a JavaScript from Google

2009-06-26 Thread Tim Starling
Petr Kadlec wrote:
> Hi, folks.
> 
> Recently, the problem of user tracking via third party companies has
> been debated on mailing lists. I wonder if the inclusion of the jQuery
> library linked directly from Google servers (!) does not qualify as a
> bad idea, too… (Even though no user tracking has obviously been
> intended, and due to caching, privacy violation is extremely limited.)
> See http://zh.wikipedia.org/wiki/MediaWiki:Common.js (added on
> 2009-05-22 http://zh.wikipedia.org/w/index.php?diff=10119416&diffonly=1).
> 

Yes, it is a bad idea. A number of extensions hosted on Wikimedia have
a copy of jQuery; you can easily find one, e.g.

http://zh.wikipedia.org/w/extensions/Collection/collection/jquery.js
http://zh.wikipedia.org/w/extensions/UsabilityInitiative/Resources/jquery.js
http://zh.wikipedia.org/w/extensions/OpenID/skin/jquery-1.3.2.min.js

-- Tim Starling


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Tim Starling
Aryeh Gregor wrote:
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw wrote:
>> The reason I believe breaking up templates improves performance is
>> this: they're typically of the form
>> {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see
>> that this is a parser function call with three arguments, and expand
>> all three of them before it runs the #if hook.
> 
> I thought this was fixed ages ago with the new preprocessor.

Yes it was fixed in 1.12 (late 2007), as I have repeatedly told this
list. The new "if" parser function is passed a placeholder object
which can be expanded on demand.

-- Tim Starling


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Current events-related overloads

2009-06-26 Thread Domas Mituzas

> This is a very good idea, and sounds much better than having those

The major problem with all dirty caching is that we have more than one
caching layer, and of course, things abort.

The fact that people would be shown dirty versions instead of the proper
article leads to a situation where, in the case of vandal fighting etc.,
people will see stale versions instead of waiting a few seconds and
getting the real one.

In theory, update flow could look like this:

1. Set "I'm working on this" in a parallelism coordinator or lock  
manager
2. Do all database transactions & commit
3. Parse
4. Set memcached object
5. Invalidate squid objects

Now, whether we parse, block or serve stale could be dynamic: e.g. if we
detect more than x parallel parses we fall back to blocking for a few
seconds, and once we detect more than y blocked threads on the task, or
the block expires and there's no fresh content yet (or there's a new
copy...), then stale stuff can be served.
In a perfect world that would call for specialized software :)
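For illustration, steps 1-5 could look roughly like this (a minimal sketch,
not MediaWiki code, using memcached's atomic add() as the "I'm working on
this" marker; key names and the 30-second lock TTL are made-up assumptions):

<?php
// Sketch only, using the PHP Memcached extension; add() is atomic, so it
// doubles as a cheap lock. $parse stands in for the expensive parse step.
function renderWithLock( Memcached $mc, $pageKey, $parse ) {
    $lockKey = "parse-lock:$pageKey";
    $htmlKey = "parsed:$pageKey";

    // 1. Set "I'm working on this"; add() fails if another thread holds it.
    if ( !$mc->add( $lockKey, getmypid(), 30 ) ) {
        // Someone else is parsing: block for a few seconds, or fall back
        // to whatever (possibly stale) copy is still cached.
        return $mc->get( $htmlKey );
    }

    // 2. + 3. Database transactions, commit and parse happen here.
    $html = call_user_func( $parse );

    // 4. Set the memcached object.
    $mc->set( $htmlKey, $html, 86400 );

    // 5. Squid invalidation would go here; finally, release the lock.
    $mc->delete( $lockKey );
    return $html;
}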

Do note that for the past quite a few years we did lots and lots of work
to avoid serving stale content. I would not see dirty serving as
something we should be proud of ;-)

Domas

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Sergey Chernyshev
It probably depends on how getTimestamp() is implemented for non-local repos.
The important thing is not to have it return new values too often, and to
have it return the real "version" of the image.

If this is already the case, can someone apply this patch then? I don't want
to be responsible for such an important change ;)

Sergey


On Fri, Jun 26, 2009 at 3:52 PM, Chad  wrote:

> You're patching already-existing functionality at the File level, so it
> should be ok to just plop it in there. I'm not sure how this will affect
> the ForeignApi interface, so it'd be worth testing there too.
>
> From what I can tell at a (very) quick glance, it shouldn't adversely
> affect anything from a client perspective on the API, as we just
> rely on whatever URL was provided to us to begin with.
>
> -Chad
>
> On Fri, Jun 26, 2009 at 3:31 PM, Sergey
> Chernyshev wrote:
> > Which of all those files should I change to apply my patch only to files in the default
> > repository? Currently my patch is applied to File.php
> >
> > http://bug-attachment.wikimedia.org/attachment.cgi?id=5833
> >
> > If you just point me in the right direction, I'll update the patch and upload
> > it myself.
> >
> > Thank you,
> >
> >Sergey
> >
> >
> > --
> > Sergey Chernyshev
> > http://www.sergeychernyshev.com/
> >
> >
> > On Fri, Jun 26, 2009 at 3:17 PM, Chad  wrote:
> >
> >> The structure is LocalRepo extends FSRepo extends
> >> FileRepo. ForeignApiRepo extends FileRepo directly, and
> >> ForeignDbRepo extends LocalRepo.
> >>
> >> -Chad
> >>
> >> On Jun 26, 2009 3:15 PM, "Sergey Chernyshev" <
> sergey.chernys...@gmail.com>
> >> wrote:
> >>
> >> It's probably worth mentioning that this bug is still open:
> >> https://bugzilla.wikimedia.org/show_bug.cgi?id=17577
> >>
> >> This will save not only traffic on subsequent page views (in this case:
> >> http://www.webpagetest.org/result/090218_132826127ab7f254499631e3e688b24b/1/details/cached/
> >> it's about 50K), but also improve performance dramatically.
> >>
> >> I wonder if anything can be done to at least make it work for local files -
> >> I have a hard time understanding File vs. LocalFile vs. FSRepo relationships
> >> to enable this just for the local file system.
> >>
> >> It's probably also wise to figure out a way for it to be implemented on
> >> non-local repositories too so Wikimedia projects can use it, but I'm
> >> completely out of the league here ;)
> >>
> >> Thank you,
> >>
> >>   Sergey
> >>
> >>
> >> --
> >> Sergey Chernyshev
> >> http://www.sergeychernyshev.com/
> >>
> >> On Fri, Jun 26, 2009 at 11:42 AM, Robert Rohde 
> wrote:
> >> >
> >> I'm going to mention ...
> >> ___
> >> Wikitech-l mailing list
> >> Wikitech-l@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Andrew Dunbar
2009/6/26 Robert Rohde :
> I'm going to mention this here, because it might be of interest on the
> Wikimedia cluster (or it might not).
>
> Last night I deposited Extension:Minify which is essentially a
> lightweight wrapper for the YUI CSS compressor and JSMin JavaScript
> compressor.  If installed it automatically captures all content
> exported through action=raw and precompresses it by removing comments,
> formatting, and other human readable elements.  All of the helpful
> elements still remain on the Mediawiki: pages, but they just don't get
> sent to users.
>
> Currently each page served to anons references 6 CSS/JS pages
> dynamically prepared by Mediawiki, of which 4 would be needed in the
> most common situation of viewing content online (i.e. assuming
> media="print" and media="handheld" are not downloaded in the typical
> case).
>
> These 4 pages, Mediawiki:Common.css, Mediawiki:Monobook.css, gen=css,
> and gen=js comprise about 60 kB on the English Wikipedia.  (I'm using
> enwiki as a benchmark, but Commons and dewiki also have similar
> numbers to those discussed below.)
>
> After gzip compression, which I assume is available on most HTTP
> transactions these days, they total 17039 bytes.  The comparable
> numbers if Minify is applied are 35 kB raw and 9980 after gzip, for a
> savings of 7 kB or about 40% of the total file size.
>
> Now in practical terms 7 kB could shave ~1.5s off a 36 kbps dialup
> connection.  Or given Erik Zachte's observation that action=raw is
> called 500 million times per day, and assuming up to 7 kB / 4 savings
> per call, could shave up to 900 GB off of Wikimedia's daily traffic.
> (In practice, it would probably be somewhat less.  900 GB seems to be
> slightly under 2% of Wikimedia's total daily traffic if I am reading
> the charts correctly.)
>
>
> Anyway, that's the use case (such as it is): slightly faster initial
> downloads and a small but probably measurable impact on total
> bandwidth.  The trade-off of course being that users receive CSS and
> JS pages from action=raw that are largely unreadable.  The extension
> exists if Wikimedia is interested, though to be honest I primarily
> created it for use with my own more tightly bandwidth constrained
> sites.

This sounds great but I have a problem with making action=raw return
something that is not raw. For MediaWiki I think it would be better to
add a new action=minify.

What would the pluses and minuses of that be?

Andrew Dunbar (hippietrail)


> -Robert Rohde
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>



-- 
http://wiktionarydev.leuksman.com http://linguaphile.sf.net

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Michael Dale
Aryeh Gregor wrote:
> Any given image is not included on every single page on the wiki.
> Purging a few thousand pages from Squid on an image reupload (should
> be rare for such a heavily-used image) is okay.  Purging every single
> page on the wiki is not.
>
>   
Yeah... we are just talking about adding image.jpg?image_revision_id to
all the image src attributes at page render time; that should never purge
everything on the wiki ;)
> No.  We don't purge Squid on these events, we just let people see old
> copies.  Of course, this doesn't normally apply to registered users
> (who usually [always?] get Squid misses), or to pages that aren't
> cached (edit, history, . . .).
>   
Okay, that's basically what I understood. That makes sense... although it
would be nice to think about a job or process that purges pages with
outdated language msgs, or pages that are referencing outdated scripts,
style-sheets, or image urls.

We ~do~ add jobs to purge for template updates. Are other things like
language msg & code updates candidates for job purge tasks? ... I guess
it's not too big a deal to get an old page until someone updates it.

--michael

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 5:24 PM, Michael Dale wrote:
> The difference in the context of the script-loader is we would read the
> version from the mediaWiki js pages that are being included and the
> $wgStyleVersion var. (avoiding the need to shift reload) ... in the
> context of rendering a normal page with dozens of template lookups I
> don't see this a particularly costly. Its a few extra getLatestRevID
> title calls.

It's not costly unless we have to purge Squid for everything, which
probably we don't.  People could just use old versions, it's not
*that* dangerous.

> Likewise we should do this for images so we can send the
> cache forever header (bug 17577) avoiding a bunch of 304 requests.

Any given image is not included on every single page on the wiki.
Purging a few thousand pages from Squid on an image reupload (should
be rare for such a heavily-used image) is okay.  Purging every single
page on the wiki is not.

> One part I am not completely clear on is how we avoid lots of
> simultaneous requests to the scriptLoader when it first generates the
> JavaScript to be cached on the squids, but other stuff must be throttled
> too, no? Like when we update any code, language msgs, or local-settings,
> that does not result in immediately purging all of Wikipedia, does it?

No.  We don't purge Squid on these events, we just let people see old
copies.  Of course, this doesn't normally apply to registered users
(who usually [always?] get Squid misses), or to pages that aren't
cached (edit, history, . . .).

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] subst'ing #if parser functions loses line breaks, and other oddities

2009-06-26 Thread Gerard Meijssen
Hoi,
In the past the existence of templates in one wiki has been used as an
argument to not accept an extension. With extensions you have functionality
that is indeed intended to be external to ordinary users but you are talking
about functionality that can be tested. With templates you have stuff that
can and does severely impact performance and is at the same time not usable
on other systems.

While it may be so that you cannot effectively contribute to an article on
something as esoteric as the "Ten Commandments in Roman Catholicism", it might
be possible for you to translate it into another language if you have the
language skills. With the way templates are I would not touch them with a
barge pole if I can help it. Templates are however the only tool we consider
for things like info boxes and stuff. They are as a result quite important
from a functional point of view. From a usability point of view they are
horrible.

In conclusion, templates are used and they prove to be problematic. The best
proof of this is the recent performance issues we had.
Thanks,
   GerardM

2009/6/26 Gregory Maxwell 

> On Fri, Jun 26, 2009 at 12:01 PM, Gerard
> Meijssen wrote:
> > Hoi,
> > At some stage Wikipedia was this thing that everybody can edit... I can
> not
> > and will not edit this shit so what do you expect from the average Joe ??
>
> I can not (effectively) contribute to
> http://en.wikipedia.org/wiki/Ten_Commandments_in_Roman_Catholicism
>
> Does this mean Wikipedia is a failure?
>
> I don't think so.  Not everyone needs to be able to do everything.
> That's one reason projects have communities: Other people can do the
> work which I'm not interested in or not qualified for.  Not everyone
> needs to make templates— and there are some people who'd have nothing
> else to do but add fart jokes to science articles if the site didn't
> have plenty of template mongering that needed doing.
>
> Unfortunately the existing system is needlessly exclusive. The
> existing parser function based solutions are so byzantine that even many
> people with the right interest and knowledge are significantly put off
> from it.
>
> The distinction between this and a general "easy to use" is a very
> critical one.
>
> It's also the case that the existing system's problems spill past its
> borders due to its own limitations: Regular users need to deal with
> things like weird whitespace handling and templates which MUST be
> substed (or can't be substed; at random from the user's perspective).
> This makes the system harder even for the vast majority of people who
> should never need to worry about the internals of the templates.
>
> I think this is the most important issue, and it's one with real
> usability impacts,  but it's not due to the poor syntax. On this
> point, the template language could be INTERCAL but still leave most
> users completely free to ignore the messy insides. The existing system
> doesn't because there is no clear boundary between the page and the
> templates (among other reasons, like the limitations of the existing
> 'string' manipulation functions).
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Minify

2009-06-26 Thread Michael Dale
Correct me if I am wrong, but that's how we presently update js and css:
we have $wgStyleVersion, and when that gets updated we send out fresh
pages with html pointing to js with $wgStyleVersion appended.
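(A sketch of that mechanism; the version number and the exact path are
illustrative, not claimed to match what the cluster emits today:)

<?php
// Bumping $wgStyleVersion changes every emitted script/style URL, so
// clients re-fetch on the next page view even with long cache lifetimes.
$wgStyleVersion = 235;   // illustrative value
$wgScriptPath   = '/w';
$url = "$wgScriptPath/skins/common/wikibits.js?$wgStyleVersion";
echo $url;   // /w/skins/common/wikibits.js?235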

The difference in the context of the script-loader is we would read the 
version from the mediaWiki js pages that are being included and the 
$wgStyleVersion var. (avoiding the need to shift reload) ... in the 
context of rendering a normal page with dozens of template lookups I 
don't see this as particularly costly. It's a few extra getLatestRevID
title calls. Likewise we should do this for images so we can send the 
cache forever header (bug 17577) avoiding a bunch of 304 requests.

One part I am not completely clear on is how we avoid lots of 
simultaneous requests to the scriptLoader when it first generates the 
JavaScript to be cached on the squids, but other stuff must be throttled 
too, no? Like when we update any code, language msgs, or local-settings,
that does not result in immediately purging all of Wikipedia, does it?

--michael

Gregory Maxwell wrote:
> On Fri, Jun 26, 2009 at 4:33 PM, Michael Dale wrote:
>   
>> I would quickly add that the script-loader / new-upload branch also
>> supports minify along with associating unique IDs, grouping & gzipping.
>>
>> So all your mediaWiki page includes are tied to their version numbers
>> and can be cached forever without 304 requests by the client or _shift_
>> reload to get new js.
>> 
>
> Hm. Unique ids?
>
> Does this mean the every page on the site must be purged from the
> caches to cause all requests to see a new version number?
>
> Is there also some pending squid patch to let it jam in a new ID
> number on the fly for every request? Or have I misunderstood what this
> does?
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>   


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 4:49 PM, Gregory Maxwell wrote:
> Hm. Unique ids?
>
> Does this mean the every page on the site must be purged from the
> caches to cause all requests to see a new version number?
>
> Is there also some pending squid patch to let it jam in a new ID
> number on the fly for every request? Or have I misunderstood what this
> does?

We already have version numbers on static CSS/JS, and we just don't
bother purging the HTML.  So any old Squid hits might see the old
include, or the new one.  It's not often noticeable in practice, even
if you get the old HTML with the new scripts/styles.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Gregory Maxwell
On Fri, Jun 26, 2009 at 4:33 PM, Michael Dale wrote:
> I would quickly add that the script-loader / new-upload branch also
> supports minify along with associating unique IDs, grouping & gzipping.
>
> So all your mediaWiki page includes are tied to their version numbers
> and can be cached forever without 304 requests by the client or _shift_
> reload to get new js.

Hm. Unique ids?

Does this mean the every page on the site must be purged from the
caches to cause all requests to see a new version number?

Is there also some pending squid patch to let it jam in a new ID
number on the fly for every request? Or have I misunderstood what this
does?

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] zh.wikipedia including a JavaScript from Google

2009-06-26 Thread Michael Dale
We should have a copy of jQuery in MediaWiki ~soon~ ... although it's a
good point that it would be nice to centrally locate all our static
MediaWiki files for improved cache-ability across sites. If we could tie
the version number to the request then we could just set it to never expire.
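Roughly: if the request carries an explicit version we can answer with
far-future caching headers, otherwise fall back to a short TTL. A sketch
(the 'version' parameter name and the TTLs are illustrative, not what any
extension actually uses):

<?php
// Versioned requests never change, so they may be cached ~forever;
// unversioned ones get a conservative five-minute TTL.
$maxAge = isset( $_GET['version'] ) ? 365 * 24 * 3600 : 300;
header( 'Cache-Control: public, max-age=' . $maxAge );
header( 'Expires: ' . gmdate( 'D, d M Y H:i:s', time() + $maxAge ) . ' GMT' );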

--michael

Petr Kadlec wrote:
> Hi, folks.
>
> Recently, the problem of user tracking via third party companies has
> been debated on mailing lists. I wonder if the inclusion of the jQuery
> library linked directly from Google servers (!) does not qualify as a
> bad idea, too… (Even though no user tracking has obviously been
> intended, and due to caching, privacy violation is extremely limited.)
> See http://zh.wikipedia.org/wiki/MediaWiki:Common.js (added on
> 2009-05-22 http://zh.wikipedia.org/w/index.php?diff=10119416&diffonly=1).
>
> -- [[cs:User:Mormegil | Petr Kadlec]]
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] subst'ing #if parser functions loses line breaks, and other oddities

2009-06-26 Thread Gregory Maxwell
On Fri, Jun 26, 2009 at 12:01 PM, Gerard
Meijssen wrote:
> Hoi,
> At some stage Wikipedia was this thing that everybody can edit... I can not
> and will not edit this shit so what do you expect from the average Joe ??

I can not (effectively) contribute to
http://en.wikipedia.org/wiki/Ten_Commandments_in_Roman_Catholicism

Does this mean Wikipedia is a failure?

I don't think so.  Not everyone needs to be able to do everything.
That's one reason projects have communities: Other people can do the
work which I'm not interested in or not qualified for.  Not everyone
needs to make templates— and there are some people who'd have nothing
else to do but add fart jokes to science articles if the site didn't
have plenty of template mongering that needed doing.

Unfortunately the existing system is needlessly exclusive. The
existing parser function based solutions are so byzantine that even many
people with the right interest and knowledge are significantly put off
from it.

The distinction between this and a general "easy to use" is a very
critical one.

It's also the case that the existing system's problems spill past its
borders due to its own limitations: Regular users need to deal with
things like weird whitespace handling and templates which MUST be
substed (or can't be substed; at random from the user's perspective).
This makes the system harder even for the vast majority of people who
should never need to worry about the internals of the templates.

I think this is the most important issue, and it's one with real
usability impacts,  but it's not due to the poor syntax. On this
point, the template language could be INTERCAL but still leave most
users completely free to ignore the messy insides. The existing system
doesn't because there is no clear boundary between the page and the
templates (among other reasons, like the limitations of the existing
'string' manipulation functions).

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Minify

2009-06-26 Thread Michael Dale
I would quickly add that the script-loader / new-upload branch also 
supports minify along with associating unique IDs, grouping & gzipping.

So all your mediaWiki page includes are tied to their version numbers 
and can be cached forever without 304 requests by the client or _shift_ 
reload to get new js.

Plus it works with all the static file based js includes as well. If a 
given set of files is constantly requested we can group them to avoid 
server round trips. And finally it lets us localize msgs and package them
in the JS (again avoiding separate trips for javascript interface msgs).

for more info see the ~slightly outdated~ document:  
http://www.mediawiki.org/wiki/Extension:ScriptLoader

peace,
michael
 
Robert Rohde wrote:
> I'm going to mention this here, because it might be of interest on the
> Wikimedia cluster (or it might not).
>
> Last night I deposited Extension:Minify which is essentially a
> lightweight wrapper for the YUI CSS compressor and JSMin JavaScript
> compressor.  If installed it automatically captures all content
> exported through action=raw and precompresses it by removing comments,
> formatting, and other human readable elements.  All of the helpful
> elements still remain on the Mediawiki: pages, but they just don't get
> sent to users.
>
> Currently each page served to anons references 6 CSS/JS pages
> dynamically prepared by Mediawiki, of which 4 would be needed in the
> most common situation of viewing content online (i.e. assuming
> media="print" and media="handheld" are not downloaded in the typical
> case).
>
> These 4 pages, Mediawiki:Common.css, Mediawiki:Monobook.css, gen=css,
> and gen=js comprise about 60 kB on the English Wikipedia.  (I'm using
> enwiki as a benchmark, but Commons and dewiki also have similar
> numbers to those discussed below.)
>
> After gzip compression, which I assume is available on most HTTP
> transactions these days, they total 17039 bytes.  The comparable
> numbers if Minify is applied are 35 kB raw and 9980 after gzip, for a
> savings of 7 kB or about 40% of the total file size.
>
> Now in practical terms 7 kB could shave ~1.5s off a 36 kbps dialup
> connection.  Or given Erik Zachte's observation that action=raw is
> called 500 million times per day, and assuming up to 7 kB / 4 savings
> per call, could shave up to 900 GB off of Wikimedia's daily traffic.
> (In practice, it would probably be somewhat less.  900 GB seems to be
> slightly under 2% of Wikimedia's total daily traffic if I am reading
> the charts correctly.)
>
>
> Anyway, that's the use case (such as it is): slightly faster initial
> downloads and a small but probably measurable impact on total
> bandwidth.  The trade-off of course being that users receive CSS and
> JS pages from action=raw that are largely unreadable.  The extension
> exists if Wikimedia is interested, though to be honest I primarily
> created it for use with my own more tightly bandwidth constrained
> sites.
>
> -Robert Rohde
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>   


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Chad
You're patching already-existing functionality at the File level, so it
should be ok to just plop it in there. I'm not sure how this will affect
the ForeignApi interface, so it'd be worth testing there too.

From what I can tell at a (very) quick glance, it shouldn't adversely
affect anything from a client perspective on the API, as we just
rely on whatever URL was provided to us to begin with.

-Chad

On Fri, Jun 26, 2009 at 3:31 PM, Sergey
Chernyshev wrote:
> Which of all those files should I change to apply my patch only to files in the default
> repository? Currently my patch is applied to File.php
>
> http://bug-attachment.wikimedia.org/attachment.cgi?id=5833
>
> If you just point me in the right direction, I'll update the patch and upload
> it myself.
>
> Thank you,
>
>        Sergey
>
>
> --
> Sergey Chernyshev
> http://www.sergeychernyshev.com/
>
>
> On Fri, Jun 26, 2009 at 3:17 PM, Chad  wrote:
>
>> The structure is LocalRepo extends FSRepo extends
>> FileRepo. ForeignApiRepo extends FileRepo directly, and
>> ForeignDbRepo extends LocalRepo.
>>
>> -Chad
>>
>> On Jun 26, 2009 3:15 PM, "Sergey Chernyshev" 
>> wrote:
>>
>> It's probably worth mentioning that this bug is still open:
>> https://bugzilla.wikimedia.org/show_bug.cgi?id=17577
>>
>> This will save not only traffic on subsequent page views (in this case:
>>
>> http://www.webpagetest.org/result/090218_132826127ab7f254499631e3e688b24b/1/details/cached/
>> it's about 50K), but also improve performance dramatically.
>>
>> I wonder if anything can be done to at least make it work for local files -
>> I have a hard time understanding File vs. LocalFile vs. FSRepo relationships
>> to enable this just for the local file system.
>>
>> It's probably also wise to figure out a way for it to be implemented on
>> non-local repositories too so Wikimedia projects can use it, but I'm
>> completely out of the league here ;)
>>
>> Thank you,
>>
>>       Sergey
>>
>>
>> --
>> Sergey Chernyshev
>> http://www.sergeychernyshev.com/
>>
>> On Fri, Jun 26, 2009 at 11:42 AM, Robert Rohde  wrote:
>> >
>> I'm going to mention ...
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Minify

2009-06-26 Thread Sergey Chernyshev
Which of all those files should I change to apply my patch only to files in the default
repository? Currently my patch is applied to File.php

http://bug-attachment.wikimedia.org/attachment.cgi?id=5833

If you just point me in the right direction, I'll update the patch and upload
it myself.

Thank you,

Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/


On Fri, Jun 26, 2009 at 3:17 PM, Chad  wrote:

> The structure is LocalRepo extends FSRepo extends
> FileRepo. ForeignApiRepo extends FileRepo directly, and
> ForeignDbRepo extends LocalRepo.
>
> -Chad
>
> On Jun 26, 2009 3:15 PM, "Sergey Chernyshev" 
> wrote:
>
> It's probably worth mentioning that this bug is still open:
> https://bugzilla.wikimedia.org/show_bug.cgi?id=17577
>
> This will save not only traffic on subsequent page views (in this case:
>
> http://www.webpagetest.org/result/090218_132826127ab7f254499631e3e688b24b/1/details/cached/
> it's about 50K), but also improve performance dramatically.
>
> I wonder if anything can be done to at least make it work for local files -
> I have a hard time understanding File vs. LocalFile vs. FSRepo relationships
> to enable this just for the local file system.
>
> It's probably also wise to figure out a way for it to be implemented on
> non-local repositories too so Wikimedia projects can use it, but I'm
> completely out of the league here ;)
>
> Thank you,
>
>   Sergey
>
>
> --
> Sergey Chernyshev
> http://www.sergeychernyshev.com/
>
> On Fri, Jun 26, 2009 at 11:42 AM, Robert Rohde  wrote:
> >
> I'm going to mention ...
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Chad
The structure is LocalRepo extends FSRepo extends
FileRepo. ForeignApiRepo extends FileRepo directly, and
ForeignDbRepo extends LocalRepo.
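In skeleton form (bodies omitted; the real classes live in includes/filerepo/):

<?php
class FileRepo {}
class FSRepo extends FileRepo {}
class LocalRepo extends FSRepo {}
class ForeignApiRepo extends FileRepo {}
class ForeignDbRepo extends LocalRepo {}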

-Chad

On Jun 26, 2009 3:15 PM, "Sergey Chernyshev" 
wrote:

It's probably worth mentioning that this bug is still open:
https://bugzilla.wikimedia.org/show_bug.cgi?id=17577

This will save not only traffic on subsequent page views (in this case:
http://www.webpagetest.org/result/090218_132826127ab7f254499631e3e688b24b/1/details/cached/
it's about 50K), but also improve performance dramatically.

I wonder if anything can be done to at least make it work for local files -
I have a hard time understanding File vs. LocalFile vs. FSRepo relationships
to enable this just for the local file system.

It's probably also wise to figure out a way for it to be implemented on
non-local repositories too so Wikimedia projects can use it, but I'm
completely out of the league here ;)

Thank you,

   Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/

On Fri, Jun 26, 2009 at 11:42 AM, Robert Rohde  wrote: >
I'm going to mention ...
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Minify

2009-06-26 Thread Sergey Chernyshev
It's probably worth mentioning that this bug is still open:
https://bugzilla.wikimedia.org/show_bug.cgi?id=17577

This will save not only traffic on subsequent page views (in this case:
http://www.webpagetest.org/result/090218_132826127ab7f254499631e3e688b24b/1/details/cached/
it's about 50K), but also improve performance dramatically.

I wonder if anything can be done to at least make it work for local files -
I have a hard time understanding File vs. LocalFile vs. FSRepo relationships
to enable this just for the local file system.

It's probably also wise to figure out a way for it to be implemented on
non-local repositories too so Wikimedia projects can use it, but I'm
completely out of the league here ;)

Thank you,

Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/


On Fri, Jun 26, 2009 at 11:42 AM, Robert Rohde  wrote:

> I'm going to mention this here, because it might be of interest on the
> Wikimedia cluster (or it might not).
>
> Last night I deposited Extension:Minify which is essentially a
> lightweight wrapper for the YUI CSS compressor and JSMin JavaScript
> compressor.  If installed it automatically captures all content
> exported through action=raw and precompresses it by removing comments,
> formatting, and other human readable elements.  All of the helpful
> elements still remain on the Mediawiki: pages, but they just don't get
> sent to users.
>
> Currently each page served to anons references 6 CSS/JS pages
> dynamically prepared by Mediawiki, of which 4 would be needed in the
> most common situation of viewing content online (i.e. assuming
> media="print" and media="handheld" are not downloaded in the typical
> case).
>
> These 4 pages, Mediawiki:Common.css, Mediawiki:Monobook.css, gen=css,
> and gen=js comprise about 60 kB on the English Wikipedia.  (I'm using
> enwiki as a benchmark, but Commons and dewiki also have similar
> numbers to those discussed below.)
>
> After gzip compression, which I assume is available on most HTTP
> transactions these days, they total 17039 bytes.  The comparable
> numbers if Minify is applied are 35 kB raw and 9980 after gzip, for a
> savings of 7 kB or about 40% of the total file size.
>
> Now in practical terms 7 kB could shave ~1.5s off a 36 kbps dialup
> connection.  Or given Erik Zachte's observation that action=raw is
> called 500 million times per day, and assuming up to 7 kB / 4 savings
> per call, could shave up to 900 GB off of Wikimedia's daily traffic.
> (In practice, it would probably be somewhat less.  900 GB seems to be
> slightly under 2% of Wikimedia's total daily traffic if I am reading
> the charts correctly.)
>
>
> Anyway, that's the use case (such as it is): slightly faster initial
> downloads and a small but probably measurable impact on total
> bandwidth.  The trade-off of course being that users receive CSS and
> JS pages from action=raw that are largely unreadable.  The extension
> exists if Wikimedia is interested, though to be honest I primarily
> created it for use with my own more tightly bandwidth constrained
> sites.
>
> -Robert Rohde
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Roan Kattouw
2009/6/26 Robert Rohde :
> My understanding has been that the PREprocessor expands all branches,
> by looking up and substituting transcluded templates and similar
> things, but that the actual processor only evaluates the branches that
> it needs.  That's a lot faster than actually evaluating all branches
> (which is how things originally worked), but not quite as effective as
> if the dead branches were ignored entirely.
>
> (I could be totally wrong however.)
>
You're right that dead code never reaches the parser (your
"processor"), but ideally the preprocessor wouldn't bother expanding
it either. I have a vague recollection that it was fixed with the new
preprocessor, as Simetrical said, but I have no idea how much truth
there is in that.

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] subst'ing #if parser functions loses line breaks, and other oddities

2009-06-26 Thread Gerard Meijssen
Hoi,
At some stage Wikipedia was this thing that everybody can edit... I can not
and will not edit this shit so what do you expect from the average Joe ??
Thanks,
  Gerard

2009/6/25 Tisza Gergő 

> Tim Starling  wikimedia.org> writes:
>
> > {{subst:!}} no longer works as a separator between parser function
> > parameters, it just works as a literal character. Welcome to MediaWiki
> > 1.12.
>
> Seems like it was intended to be the | in [[category:foo|bar]], except that
> someone forgot a | from the code. Correctly it would be:
>
> {{{{{subst|}}}#if:{{{par1|}}}|[[Category:{{{par1}}}{{{{{subst|}}}#if:
> {{{key1|}}}|{{{{{subst|}}}!}}{{{key1}}}}}]]
> 
> }}{{{{{subst|}}}#if:{{{par2|}}}|[[Category:{{{par2}}}{{{{{subst|}}}#if:
> {{{key2|}}}|{{{{{subst|}}}!}}{{{key2}}}}}]]
> 
> }}{{{{{subst|}}}#if:{{{par3|}}}|[[Category:{{{par3}}}{{{{{subst|}}}#if:
> {{{key3|}}}|{{{{{subst|}}}!}}{{{key3}}}}}]]
> 
> }}
>
> (Note that I added extra linebreaks after #if: so that gmane doesn't complain
> about lines being too long.)
>
> > The workarounds that come to mind for the line break issue are fairly
> > obscure and complex. If I were you I'd just put the categories on the
> > same line and be done with it.
>
> Just put the templates on separate lines and wrap the whole thing in
> another #if
> to discard additional newlines at the end:
>
> {{#if:1|
> {{{{{subst|}}}#if:{{{par1|1}}}|[[Category:{{{par1}}}{{{{{subst|}}}#if:
> {{{key1|1}}}|{{{{{subst|}}}!}}{{{key1}}}}}]]}}
> {{{{{subst|}}}#if:{{{par2|}}}|[[Category:{{{par2}}}{{{{{subst|}}}#if:
> {{{key2|}}}|{{{{{subst|}}}!}}{{{key2}}}}}]]}}
> {{{{{subst|}}}#if:{{{par3|}}}|[[Category:{{{par3}}}{{{{{subst|}}}#if:
> {{{key3|}}}|{{{{{subst|}}}!}}{{{key3}}}}}]]}}
> }}
>
> (This assumes that whenever par2 is missing, par3 is missing too.)
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Robert Rohde
On Fri, Jun 26, 2009 at 7:16 AM, Aryeh
Gregor wrote:
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw wrote:
>> The reason I believe breaking up templates improves performance is
>> this: they're typically of the form
>> {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see
>> that this is a parser function call with three arguments, and expand
>> all three of them before it runs the #if hook.
>
> I thought this was fixed ages ago with the new preprocessor.

My understanding has been that the PREprocessor expands all branches,
by looking up and substituting transcluded templates and similar
things, but that the actual processor only evaluates the branches that
it needs.  That's a lot faster than actually evaluating all branches
(which is how things originally worked), but not quite as effective as
if the dead branches were ignored entirely.

(I could be totally wrong however.)

-Robert Rohde

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Extending wikilinks syntax

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 11:46 AM, Andrew Garrett wrote:
> They already can, with Javascript, so there's no XSS issue.

That ability may be removed in the future, and restricted to a smaller
and more select group.  Witness the problems we've been having with
admins including tracking software.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Andrew Garrett

On 26/06/2009, at 3:32 PM, Brian wrote:

> On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain  
> wrote:
>
>> In the good old days someone would have solved the same problem by
>> mentioning in the template's documentation that the parameter should
>> use full URLs. Both the template and instances of it would be
>> readable.
>>
>> Template programmers are not going to create accessible templates
>> because they have a  programming mindset, and set out to solve
>> problems in ways like Brian's code above.
>
> The good old days are long gone. If you believe there is never a  
> valid case
> for basic programming constructs such as conditionals you should have
> objected  when ParserFunctions were first implemented.


The fact that we, at some stage, made the mistake of adding  
programming-like functions does not oblige us to complete the job.

If we could make ParserFunctions go away, we would. ParserFunctions is  
there now, and there's too much code dependent on it to remove it  
right now. That analysis does not apply to StringFunctions.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Extending wikilinks syntax

2009-06-26 Thread Andrew Garrett

On 26/06/2009, at 3:21 PM, Aryeh Gregor wrote:

> On Fri, Jun 26, 2009 at 8:22 AM, Steve Bennett  
> wrote:
>> 3) A limited number of admin-controlled special templates can use an
>> even wider range of features, including raw HTML.
>
> Admins are not going to be allowed to insert raw HTML.  At least, not
> ordinary admins.


They already can, with Javascript, so there's no XSS issue.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Minify

2009-06-26 Thread Robert Rohde
I'm going to mention this here, because it might be of interest on the
Wikimedia cluster (or it might not).

Last night I deposited Extension:Minify which is essentially a
lightweight wrapper for the YUI CSS compressor and JSMin JavaScript
compressor.  If installed it automatically captures all content
exported through action=raw and precompresses it by removing comments,
formatting, and other human readable elements.  All of the helpful
elements still remain on the Mediawiki: pages, but they just don't get
sent to users.
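To give a feel for what gets stripped, here is a naive CSS-only illustration;
this is not what Extension:Minify actually does (it defers to the YUI
compressor and JSMin, which handle many edge cases this ignores):

<?php
// Toy minifier: drop /* ... */ comments, collapse whitespace, and tighten
// space around punctuation. Real minifiers are much more careful.
function naiveMinifyCss( $css ) {
    $css = preg_replace( '!/\*.*?\*/!s', '', $css );         // strip comments
    $css = preg_replace( '/\s+/', ' ', $css );                // collapse whitespace
    $css = preg_replace( '/\s*([{};:,])\s*/', '$1', $css );   // tighten punctuation
    return trim( $css );
}

echo naiveMinifyCss( "/* hide the site notice */\n#siteNotice {\n    display: none;\n}\n" );
// prints: #siteNotice{display:none;}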

Currently each page served to anons references 6 CSS/JS pages
dynamically prepared by Mediawiki, of which 4 would be needed in the
most common situation of viewing content online (i.e. assuming
media="print" and media="handheld" are not downloaded in the typical
case).

These 4 pages, Mediawiki:Common.css, Mediawiki:Monobook.css, gen=css,
and gen=js comprise about 60 kB on the English Wikipedia.  (I'm using
enwiki as a benchmark, but Commons and dewiki also have similar
numbers to those discussed below.)

After gzip compression, which I assume is available on most HTTP
transactions these days, they total 17039 bytes.  The comparable
numbers if Minify is applied are 35 kB raw and 9980 after gzip, for a
savings of 7 kB or about 40% of the total file size.

Now in practical terms 7 kB could shave ~1.5s off a 36 kbps dialup
connection.  Or given Erik Zachte's observation that action=raw is
called 500 million times per day, and assuming up to 7 kB / 4 savings
per call, could shave up to 900 GB off of Wikimedia's daily traffic.
(In practice, it would probably be somewhat less.  900 GB seems to be
slightly under 2% of Wikimedia's total daily traffic if I am reading
the charts correctly.)


Anyway, that's the use case (such as it is): slightly faster initial
downloads and a small but probably measurable impact on total
bandwidth.  The trade-off of course being that users receive CSS and
JS pages from action=raw that are largely unreadable.  The extension
exists if Wikimedia is interested, though to be honest I primarily
created it for use with my own more tightly bandwidth constrained
sites.

-Robert Rohde

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Domas Mituzas
>>
> I asked Domas whether it was and he said no; Tim, can you chip in on  
> this?

where did I say no, and what was my 'no' about?

-- domas

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Brian
On Fri, Jun 26, 2009 at 2:44 AM, Stephen Bain wrote:

> In the good old days someone would have solved the same problem by
> mentioning in the template's documentation that the parameter should
> use full URLs. Both the template and instances of it would be
> readable.
>
> Template programmers are not going to create accessible templates
> because they have a  programming mindset, and set out to solve
> problems in ways like Brian's code above.
>

The good old days are long gone. If you believe there is never a valid case
for basic programming constructs such as conditionals you should have
objected  when ParserFunctions were first implemented.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Roan Kattouw
> On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw wrote:
>> The reason I believe breaking up templates improves performance is
>> this: they're typically of the form
>> {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see
>> that this is a parser function call with three arguments, and expand
>> all three of them before it runs the #if hook.
>
> I thought this was fixed ages ago with the new preprocessor.
>
I asked Domas whether it was and he said no; Tim, can you chip in on this?

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] PHP 5.3.0 coming soon!

2009-06-26 Thread Roan Kattouw
2009/6/26 Chad :
> I could be completely off here, but I thought the lowest supported
> release was 5.1.x. Or that there was talk (somewhere?) of making
> that the case.
>
Officially, MediaWiki supports PHP 5.0.x, but using it is not recommended
because it has some buggy array handling functions (I think those bugs
only existed on 64-bit platforms, not sure though).

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Current events-related overloads

2009-06-26 Thread Roan Kattouw
2009/6/26 Aryeh Gregor :
> But this sounds like a good idea.  If a process is already parsing the
> page, why don't we just have other processes display an old cached
> version of the page instead of waiting or trying to reparse
> themselves?  The worst that would happen is some users would get old
> views for a couple of minutes.
>
This is a very good idea, and sounds much better than having those
other processes wait for the first process to finish parsing. It would
also reduce the severity of the deadlocks occurring when a process
gets stuck on a parse or dies in the middle of it: instead of
deadlocking, the other processes would just display stale versions
instead of wasting time. If we design these parser cache locks to
expire after a few minutes or so, it should work just fine.

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Extending wikilinks syntax

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 8:22 AM, Steve Bennett wrote:
> 3) A limited number of admin-controlled special templates can use an
> even wider range of features, including raw HTML.

Admins are not going to be allowed to insert raw HTML.  At least, not
ordinary admins.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Aryeh Gregor
On Thu, Jun 25, 2009 at 11:33 PM, Tim Starling wrote:
> Those templates can be defeated by reducing the functionality of
> padleft/padright, and I think that would be a better course of action
> than enabling the string functions.
>
> The set of string functions you describe are not the most innocuous
> ones, they're the ones I most want to keep out of Wikipedia, at least
> until we have a decent server-side scripting language in parallel.

Well, then at least let's be consistent and cripple padleft/padright.

Also, while I disagree with Robert's skepticism about the comparative
usability of a real scripting language, I'd be interested to hear what
your ideas are for actually implementing that.

Come to think of it, the easiest scripting language to implement would
be . . . PHP!  Just run it through the built-in PHP parser, carefully
sanitize the tokens so that it's safe (possibly banning things like
function definitions), and eval()!  We could even dump the scripts
into lots of little files and use includes, so APC can cache them.
That would probably be the easiest thing to do, if we need to keep
pure PHP support for the sake of third parties.  It's kind of
horrible, of course . . .
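A toy sketch of that approach, using the built-in tokenizer (token_get_all());
the banned-token and banned-function lists are purely illustrative, and this
is nowhere near a real sandbox:

<?php
// Tokenize the snippet, refuse anything on the (incomplete!) blacklists,
// then eval(). Horrible, as noted, but it shows the shape of the idea.
function looksSafeEnough( $code ) {
    $bannedTokens = array( T_FUNCTION, T_EVAL, T_INCLUDE, T_INCLUDE_ONCE,
        T_REQUIRE, T_REQUIRE_ONCE, T_EXIT );
    $bannedCalls  = array( 'exec', 'system', 'passthru', 'shell_exec',
        'fopen', 'unlink', 'file_get_contents' );

    foreach ( token_get_all( "<?php " . $code ) as $token ) {
        if ( is_array( $token ) ) {
            list( $id, $text ) = $token;
            if ( in_array( $id, $bannedTokens ) ) {
                return false;
            }
            if ( $id === T_STRING && in_array( strtolower( $text ), $bannedCalls ) ) {
                return false;
            }
        } elseif ( $token === '`' ) {   // shell backticks
            return false;
        }
    }
    return true;
}

$snippet = '$box = "Population: " . number_format( 1234567 );';
if ( looksSafeEnough( $snippet ) ) {
    eval( $snippet );
    echo $box;   // Population: 1,234,567
}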

How much of Wikipedia is your random shared-hosted site going to be
able to mirror anyway, though?  Couldn't we at least require working
exec() to get infoboxes to work?  People on shared hosting could use
Special:ExpandTemplates to get a copy of the article with no
dependencies, too (albeit with rather messy source code).

On Fri, Jun 26, 2009 at 6:33 AM, Roan Kattouw wrote:
> The reason I believe breaking up templates improves performance is
> this: they're typically of the form
> {{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see
> that this is a parser function call with three arguments, and expand
> all three of them before it runs the #if hook.

I thought this was fixed ages ago with the new preprocessor.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] PHP 5.3.0 coming soon!

2009-06-26 Thread Chad
On Fri, Jun 26, 2009 at 9:48 AM, Aryeh
Gregor wrote:
> On Fri, Jun 26, 2009 at 6:24 AM, Andrew Garrett wrote:
>> Hooray for closures!
>>
>> Do we have plans to update the cluster?
>
> Does it matter if MediaWiki still has to work on PHP 5.0?
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>

I could be completely off here, but I thought the lowest supported
release was 5.1.x. Or that there was talk (somewhere?) of making
that the case.

-Chad

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Current events-related overloads

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 6:33 AM, Thomas Dalton wrote:
> Of course, the fact that everyone's first port of call after hearing
> such news is to check the Wikipedia page is a fantastic thing, so it
> would be really unfortunate if we have to stop people doing that.

He didn't say we'd shut down views for the article, just we'd shut
down reparsing or cache invalidation or something.  This is the live
hack that was applied yesterday:

Index: includes/parser/ParserCache.php
===================================================================
--- includes/parser/ParserCache.php (revision 52359)
+++ includes/parser/ParserCache.php (working copy)
@@ -63,6 +63,7 @@
 		if ( is_object( $value ) ) {
 			wfDebug( "Found.\n" );
 			# Delete if article has changed since the cache was made
+			if( $article->mTitle->getPrefixedText() != 'Michael Jackson' ) { // temp hack!
 			$canCache = $article->checkTouched();
 			$cacheTime = $value->getCacheTime();
 			$touched = $article->mTouched;
@@ -82,6 +83,7 @@
 			}
 			wfIncrStats( "pcache_hit" );
 		}
+			} // temp hack!
 	} else {
 		wfDebug( "Parser cache miss.\n" );
 		wfIncrStats( "pcache_miss_absent" );

It just meant that people were seeing outdated versions of the article.

> Would it be possible, perhaps, to direct all requests for a certain
> page through one server so the rest can continue to serve the rest of
> the site unaffected?

Every page view involves a number of servers, and they're not all
interchangeable, so this doesn't make a lot of sense.

> Or perhaps excessively popular pages could be
> rendered (for anons) as part of the editing process, rather than the
> viewing process, since that would mean each version of the article is
> rendered only once (for anons) and would just slow down editing
> slightly (presumably by a fraction of a second), which we can live
> with.

You think that parsing a large page takes a fraction of a second?  Try
twenty or thirty seconds.

But this sounds like a good idea.  If a process is already parsing the
page, why don't we just have other processes display an old cached
version of the page instead of waiting or trying to reparse
themselves?  The worst that would happen is some users would get old
views for a couple of minutes.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] PHP 5.3.0 coming soon!

2009-06-26 Thread Aryeh Gregor
On Fri, Jun 26, 2009 at 6:24 AM, Andrew Garrett wrote:
> Hooray for closures!
>
> Do we have plans to update the cluster?

Does it matter if MediaWiki still has to work on PHP 5.0?

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Extending wikilinks syntax

2009-06-26 Thread Steve Bennett
On Fri, Jun 26, 2009 at 12:07 PM, Aryeh
Gregor wrote:
>
> From the editor's point of view.  Not from the view of the HTML
> source, which is what the original proposal was looking at.

I guess.

I'm starting to get the initial pangs of an idea that we should have
different kinds of syntax:

1) Article pages should only be allowed simplified syntax: no parser
functions, nothing funky at all. You want to use weird features, you
must wrap it in a template
2) Normal templates can use the full range of existing syntax
3) A limited number of admin-controlled special templates can use an
even wider range of features, including raw HTML.

Then, if you really need specific HTML for a very specific, widely used
template, you could have it, without opening up any cans of worms.

[The benefit from 1) above is less unreadable wikitext in article
space, though I suspect that's fairly limited already, and unreadable
wikitext is mostly from <ref> tags and massive templates like {{cite}} ]

Steve

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] zh.wikipedia including a JavaScript from Google

2009-06-26 Thread Petr Kadlec
Hi, folks.

Recently, the problem of user tracking via third party companies has
been debated on mailing lists. I wonder if the inclusion of the jQuery
library linked directly from Google servers (!) does not qualify as a
bad idea, too… (Even though no user tracking has obviously been
intended, and due to caching, privacy violation is extremely limited.)
See http://zh.wikipedia.org/wiki/MediaWiki:Common.js (added on
2009-05-22 http://zh.wikipedia.org/w/index.php?diff=10119416&diffonly=1).

-- [[cs:User:Mormegil | Petr Kadlec]]

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Nikola Smolenski
Roan Kattouw wrote:
> To get back to {{cite}}: the template itself contains no more than
> some logic to choose between {{Citation/core}} and {{Citation/patent}}
> based on the presence/absence of certain parameters, and
> {{Citation/core}} does the same thing to choose between books and
> periodicals. What's wrong with breaking up this template in, say,
> {{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly,
> other multifunctional templates could be broken up as well.

While this is not a comment on merits of string functions in general, 
there are following wrong things with that approach:

- It is easier for users to remember the name of just a single template.

- Multiple templates that are separately maintained will diverge over 
time, for example same parameters might end being named differently.

- A new feature in one template can't be easily applied to another template.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Roan Kattouw
2009/6/26 Stephen Bain :
> In the good old days someone would have solved the same problem by
> mentioning in the template's documentation that the parameter should
> use full URLs. Both the template and instances of it would be
> readable.
>
> Template programmers are not going to create accessible templates
> because they have a  programming mindset, and set out to solve
> problems in ways like Brian's code above.
>
Maybe it's the mindset that should be changed then? For one thing,
{{link}} used to use {{substr}} to check if the first argument started
with http:// , https:// or ftp:// and produced an internal link if
not, despite the fact that the documentation for {{link}} clearly
states that it creates an *external* link, which means people
shouldn't be using it to create internal links. If people try to use a
template for something it's not intended for, they should be told to
use a different template; currently, it seems like the template is
just extended with new functionality, leading to unnecessary {{#if:}},
{{#switch:}} and {{substr}} uses that serve only the users' laziness.

To get back to {{cite}}: the template itself contains no more than
some logic to choose between {{Citation/core}} and {{Citation/patent}}
based on the presence/absence of certain parameters, and
{{Citation/core}} does the same thing to choose between books and
periodicals. What's wrong with breaking up this template in, say,
{{cite patent}}, {{cite book}} and {{cite periodical}}? Similarly,
other multifunctional templates could be broken up as well.

The reason I believe breaking up templates improves performance is
this: they're typically of the form
{{#if:{{{someparam|}}}|{{foo}}|{{bar}}}}. The preprocessor will see
that this is a parser function call with three arguments, and expand
all three of them before it runs the #if hook. This means both {{foo}}
and {{bar}} get expanded, one of which in vain. Of course this is even
worse for complex systems of nested #if/#ifeq statements and/or
#switch statements, in which every possible 'code' path is evaluated
before a decision is made. In practice, this means that for every call
to {{cite}}, which seems to have three major modes, the preprocessor
will spend about 2/3 of its time expanding stuff it's gonna throw away
anyway.

To fix this, control flow parser functions such as #if could be put in
a special class of parser functions that take their arguments
unexpanded. They could then call the parser to expand their first
argument and return a value based on that. Whether these functions are
expected to return expanded or unexpanded wikitext doesn't really
matter from a performance standpoint. (Disclaimer: I'm hardly a parser
expert, Tim is; he should of course be the judge of the feasibility of
this proposal.)
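
For illustration, here is a rough sketch of what such a lazily-evaluating
hook could look like, assuming an SFH_OBJECT_ARGS-style registration where
the callback receives the frame plus unexpanded argument nodes; the names
and details are only illustrative, this is not a patch:

$parser->setFunctionHook( 'lazyif', 'wfLazyIfHook', SFH_OBJECT_ARGS );

// The arguments arrive as unexpanded nodes; only the branch we actually
// need ever gets expanded.
function wfLazyIfHook( $parser, $frame, $args ) {
    $test = isset( $args[0] ) ? trim( $frame->expand( $args[0] ) ) : '';
    if ( $test !== '' ) {
        // Condition is non-empty: expand only the "then" branch.
        return isset( $args[1] ) ? trim( $frame->expand( $args[1] ) ) : '';
    }
    // Otherwise expand only the "else" branch; the other one is never touched.
    return isset( $args[2] ) ? trim( $frame->expand( $args[2] ) ) : '';
}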

As an aside, lazy evaluation of #if statements would also improve
performance for stuff like:

{{#if:{{{param1|}}}|Do something with param1}}
{{#if:{{{param2|}}}|Do something with param2}}
...
{{#if:{{{param9|}}}|Do something with param9}}

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Current events-related overloads

2009-06-26 Thread Thomas Dalton
2009/6/26 Brion Vibber :
> Tim Starling wrote:
>> It's quite a complex feature. If you have a server that deadlocks or
>> is otherwise extremely slow, then it will block rendering for all
>> other attempts, meaning that the article can not be viewed at all.
>> That scenario could even lead to site-wide downtime, since threads
>> waiting for the locks could consume all available apache threads, or
>> all available DB connections.
>>
>> It's a reasonable idea, but implementing it would require a careful
>> design, and possibly some other concepts like per-article thread count
>> limits.
>
> *nod* We should definitely ponder the issue since it comes up
> intermittently but regularly with big news events like this. At the
> least if we can have some automatic threshold that temporarily disables
> or reduces hits on stampeded pages that'd be spiffy...

Of course, the fact that everyone's first port of call after hearing
such news is to check the Wikipedia page is a fantastic thing, so it
would be really unfortunate if we have to stop people doing that.
Would it be possible, perhaps, to direct all requests for a certain
page through one server so the rest can continue to serve the rest of
the site unaffected? Or perhaps excessively popular pages could be
rendered (for anons) as part of the editing process, rather than the
viewing process, since that would mean each version of the article is
rendered only once (for anons) and would just slow down editing
slightly (presumably by a fraction of a second), which we can live
with. There must be something we can do that allows people to continue
viewing the page wherever possible.
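
To make that concrete, here is a purely hypothetical sketch of the "each
version is rendered only once" idea, done with a memcached lock. $wgMemc
and wfMemcKey() are the existing globals; renderArticle() is just a
stand-in for the expensive parse, not a real function:

function getRenderedHtml( $pageId, $revId ) {
    global $wgMemc;
    $htmlKey = wfMemcKey( 'demo-html', $pageId, $revId );
    $lockKey = wfMemcKey( 'demo-render-lock', $pageId, $revId );

    $html = $wgMemc->get( $htmlKey );
    if ( $html !== false ) {
        return $html; // already rendered by the edit (or an earlier view)
    }
    if ( $wgMemc->add( $lockKey, 1, 10 ) ) {
        // We won the race: parse once, cache for everyone else, drop the lock.
        $html = renderArticle( $pageId, $revId );
        $wgMemc->set( $htmlKey, $html, 3600 );
        $wgMemc->delete( $lockKey );
        return $html;
    }
    // Somebody else is already parsing: wait a moment and serve their result
    // instead of starting yet another parse.
    sleep( 2 );
    $html = $wgMemc->get( $htmlKey );
    return $html !== false ? $html : renderArticle( $pageId, $revId );
}

If the same function were called at the end of the editing request, then in
the common case anonymous views would never trigger a parse at all.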

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] PHP 5.3.0 coming soon!

2009-06-26 Thread Andrew Garrett

On 25/06/2009, at 4:07 PM, Brion Vibber wrote:

> Quick note to all -- PHP 5.3.0 final release is scheduled for June 30.
> Everybody don't be shy about testing out your code with the release
> candidates! :)
>
> -- brion vibber (brion @ wikimedia.org)


Hooray for closures!
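
For anyone who hasn't tried the release candidates yet, a quick
illustrative example of the new anonymous function syntax (a hypothetical
snippet, not MediaWiki code):

$prefix = 'User:';
$makeTitle = function ( $name ) use ( $prefix ) {
    // $prefix is captured from the enclosing scope via the use clause.
    return $prefix . $name;
};
echo $makeTitle( 'Example' ); // prints "User:Example"

// Closures can also be passed straight to callback-taking functions:
$lengths = array_map( function ( $s ) { return strlen( $s ); }, array( 'foo', 'ab' ) );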

Do we have plans to update the cluster?

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Enabling some string functions

2009-06-26 Thread Stephen Bain
On Fri, Jun 26, 2009 at 2:07 PM, Brian wrote:
>
> As an example, yesterday I wrote some code that basically says, "check the
> doi and http template parameters and check to make sure they begin with
> http, and if not add it." In any reasonable sort of language that lends
> itself to a reasonable sort of implementation. But not with Parser and
> String Functions.
>
> #[[{{{1}}}]].
> {{#if:{{{4}}}|[|{{#if:{{{5}}}|[{{#if:{{#pos:{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|http|}}|{{#if:{{{4}}}|{{{4}}}|{{#if:{{{5}}}|{{{5}}}|{{#if:{{{4}}}|
> http://dx.doi.org/{{{4}}}|{{#if:{{{5}}}|http://dx.doi.org/{{{5}
> {{#if:{{{2}}}| {{{2}{{#if:{{{4}}}|]|{{#if:{{{5}}}|] {{#ifexist:
> File:{{{1}}}.pdf |[{{filepath:{{{1}}}.pdf}} (PDF)]|}} {{#if:{{{3}}}|
> ''{{{3}}}.''}}

On Fri, Jun 26, 2009 at 3:35 PM, Tim Starling wrote:
>
> While some template authors might attempt to make their templates
> accessible, the nature of Wikipedia is such that less-accessible
> contributions tend to accumulate.

In the good old days someone would have solved the same problem by
mentioning in the template's documentation that the parameter should
use full URLs. Both the template and instances of it would be
readable.

Template programmers are not going to create accessible templates
because they have a programming mindset, and set out to solve
problems in ways like Brian's code above.

-- 
Stephen Bain
stephen.b...@gmail.com

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l