Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-09-27 Thread jidanni
>>>>> "AG" == Aryeh Gregor <simetrical+wikil...@gmail.com> writes:

AG> Facebook...

Speaking of which, I hear they compile their PHP for extra speed. Anyway,
http://www.useit.com/alertbox/response-times.html mentions the pain of
reading slow sites.



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-13 Thread Tei
On 12 August 2010 00:01, Domas Mituzas midom.li...@gmail.com wrote:
...

 I'm sorry to disappoint you, but none of the issues you wrote down here is
 new.
 If after reading any books or posts you think we have deficiencies, it is
 mostly for one of two reasons: either we were lazy and didn't implement it,
 or it is something we need in order to maintain the wiki model.


I am not disappointed.  The wiki model makes it hard, because
everything can be modified, because the whole thing is gigantic and
has inertia, and because it needs to support a list of languages that
would make the United Nations look timid.  And I know you guys are an
awesome bunch, and lots of eyes have been put on these problems.

This makes MediaWiki an ideal scenario for thinking about techniques
to make the web faster.

Here's a cookie: a really nice plugin for Firebug to check speed.
http://code.google.com/p/page-speed/


-- 
--
ℱin del ℳensaje.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-13 Thread Aryeh Gregor
On Fri, Aug 13, 2010 at 6:55 AM, Tei oscar.vi...@gmail.com wrote:
 I am not disappointed.  The wiki model makes it hard, because
 everything can be modified, because the whole thing is gigantic and
 has inertia, and because it needs to support a list of languages that
 would make the United Nations look timid.

Actually, wikis are much easier to optimize than most other classes of
apps.  The pages change only rarely compared to something like
Facebook or Google, which really have to regenerate every single page
customized to the viewer.  That's why we get by with so little money
compared to real organizations.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Tei
On 2 August 2010 15:24, Roan Kattouw roan.katt...@gmail.com wrote:
 2010/8/2 Tei oscar.vi...@gmail.com:
 Maybe a theme can get the individual icons that the theme uses, and
 combine them all in a single PNG file.

 This technique is called spriting, and the single combined image file
 is called a sprite. We've done this with e.g. the enhanced toolbar
 buttons, but it doesn't work in all cases.

 Maybe the idea that resource=file must die in the 2011 internet :-/

 The resourceloader branch contains work in progress on aggressively
 combining and minifying JavaScript and CSS. The mapping of one
 resource = one file will be preserved, but the mapping of one resource
 = one REQUEST will die: it'll be possible, and encouraged, to obtain
 multiple resources in one request.




A friend recommended an excellent book to me (yes, books are still
useful in this digital age).  It's called "Even Faster Websites".
Everyone should make their company buy this book.  It's excellent.

Reading this book has scarred me for life.  There are things that are
worse than I thought.  JS forces everything onto a single thread (even
stopping the download of new resources!)... while it downloads... and
while it executes.  How about this: 90% of the code is not needed at
onload, but is loaded before onload anyway.  It is probably a much
better idea to read that book than my post (that's a good line, I will
end my email with it).

Some comments on Wikipedia speed:


1)
This is not a website: "http://en.wikipedia.org" is a redirection to this:
http://en.wikipedia.org/wiki/Main_Page
Can't "http://en.wikipedia.org/wiki/Main_Page" be served from
"http://en.wikipedia.org"?

Wait... this will break relative links on the front page, but... these
are absolute!  <a href="/wiki/Wikipedia"
title="Wikipedia">Wikipedia</a>

2)
The CSS loads fine.  \o/
Probably the combining effort will save speed anyway.

3)
Probably the CSS rules can be optimized for speed )-:
Probably not.

4)
A bunch of JS files!  And they load one after another, sequentially.
This is worse than a C program reading a file from disk byte by byte!!
Combining will probably save a lot.  Or using a strategy to force the
browser to download these files concurrently but execute them in
order.

5)
There are a lot of img files.  Does the page really need that many?  Spriting?

Total: 13.63 seconds.


You guys want to make this faster with cache optimization.  But maybe
the problem is not bandwidth, but latency.  Latency accumulates even
with HEAD requests that result in 302s.  All the 302s in the world
will not make the page feel smooth if it already accumulates into 3+
seconds territory.   ...Or am I wrong?

It is probably a much better idea to read that book than my post.

-- 
--
ℱin del ℳensaje.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Roan Kattouw
2010/8/11 Tei oscar.vi...@gmail.com:
 1)
 This is not a website: "http://en.wikipedia.org" is a redirection to this:
 http://en.wikipedia.org/wiki/Main_Page
 Can't "http://en.wikipedia.org/wiki/Main_Page" be served from
 "http://en.wikipedia.org"?

 Wait... this will break relative links on the front page, but... these
 are absolute!  <a href="/wiki/Wikipedia"
 title="Wikipedia">Wikipedia</a>

That would get complicated with Squid cache AFAIK. One redirect (which
is also a 301 Moved Permanently, which means clients may cache the
redirect destination) isn't that bad, right?
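(For illustration, a cacheable permanent redirect amounts to something like
the sketch below in plain PHP -- not Wikimedia's actual redirect setup, and
the lifetime is just an example value.)

<?php
// Sketch of a server-side redirect that clients are allowed to remember.
header( 'HTTP/1.1 301 Moved Permanently' );
header( 'Location: http://en.wikipedia.org/wiki/Main_Page' );
// Explicitly allow caches to keep the redirect itself for a while (example: 1 hour),
// so repeat visitors skip the extra round trip entirely.
header( 'Cache-Control: public, max-age=3600' );
exit;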

 4)
 A bunch of JS files!  And they load one after another, sequentially.
 This is worse than a C program reading a file from disk byte by byte!!
 Combining will probably save a lot.  Or using a strategy to force the
 browser to download these files concurrently but execute them in
 order.

I'll quote my own post from this thread:
 The resourceloader branch contains work in progress on aggressively
 combining and minifying JavaScript and CSS. The mapping of one
 resource = one file will be preserved, but the mapping of one resource
 = one REQUEST will die: it'll be possible, and encouraged, to obtain
 multiple resources in one request.
We're aware of this problem, or we wouldn't be spending paid
developers' time on this resource loader project.
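(To make the one-request idea concrete, a toy combining endpoint could look
roughly like the sketch below -- invented file names and parameter, and
emphatically not the actual resourceloader code.)

<?php
// combine.php?files=foo.js|bar.js  -- hypothetical endpoint: one request, many resources.
$allowed = array( 'foo.js', 'bar.js', 'baz.js' );   // example whitelist of known resources
$wanted  = explode( '|', isset( $_GET['files'] ) ? $_GET['files'] : '' );

header( 'Content-Type: text/javascript; charset=utf-8' );
header( 'Cache-Control: public, max-age=2592000' ); // long client-side expiry, example value

foreach ( $wanted as $file ) {
    if ( in_array( $file, $allowed, true ) ) {
        // Concatenate the files in the order requested; minification would go here too.
        echo file_get_contents( __DIR__ . '/js/' . $file ) . "\n";
    }
}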

 You guys want to make this faster with cache optimization.  But maybe
 the problem is not bandwidth, but latency.  Latency accumulates even
 with HEAD requests that result in 302s.  All the 302s in the world
 will not make the page feel smooth if it already accumulates into 3+
 seconds territory.   ...Or am I wrong?

I'm assuming you mean 304 (Not Modified)? 302 (Found) means the same
as 301 except it's not cacheable.

We're not intending to do many requests resulting in 304s; we're
intending to reduce the number of requests made and to keep the long
client-side cache expiry times (Cache-Control: max-age=<large number>)
that we already use.

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Aryeh Gregor
On Wed, Aug 11, 2010 at 12:51 PM, Tei oscar.vi...@gmail.com wrote:
 Reading this book has scarred me for life.  There are things that are
 worse than I thought.  JS forces everything onto a single thread (even
 stopping the download of new resources!)... while it downloads... and
 while it executes.

In newer browsers this is no longer the case.  They can fetch other
resources while script is loading.  They can't begin rendering further
until the script finishes executing, but this isn't such a big issue,
since scripts usually don't do much work at the point of inclusion.
(As Roan says, work is underway to improve this, but I thought I'd
point out that it's not quite as bad as you say.)

 There are a lot of img files.  Does the page really need that many?  Spriting?

 Total: 13.63 seconds.

Some usability stuff is sprited, I think.  Overall, though, spriting
is a pain in the neck, and we don't load enough images that it's
necessarily worth it to sprite too aggressively.  Image loads don't
block page layout, so it's not a huge deal.  I think script
optimization is much more important right now.

 You guys want to make this faster with cache optimization.  But maybe
 the problem is not bandwidth, but latency.  Latency accumulates even
 with HEAD requests that result in 302s.  All the 302s in the world
 will not make the page feel smooth if it already accumulates into 3+
 seconds territory.   ...Or am I wrong?

I've noticed that when browsing from my phone, the redirect to m. is a
noticeable delay, sometimes a second or more.  We don't serve many
redirects other than that, though, AFAIK.

 It is probably a much better idea to read that book than my post.

I've read Steve Souders' High-Performance Websites, which is probably
pretty similar in content.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Roan Kattouw
2010/8/11 Aryeh Gregor simetrical+wikil...@gmail.com:
 I've noticed that when browsing from my phone, the redirect to m. is a
 noticeable delay, sometimes a second or more.  We don't serve many
 redirects other than that, though, AFAIK.

Is that a server-side redirect, or is it done in JS? In the latter
case, it taking long would make sense, and would actually be slowed
down by moving all script tags to the bottom (incidentally, that's
what I've done today in the resourceloader branch).

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Domas Mituzas
Hi!

<3 enthusiasm :)

 1)
 This is not a website: "http://en.wikipedia.org" is a redirection to this:
 http://en.wikipedia.org/wiki/Main_Page
 Can't "http://en.wikipedia.org/wiki/Main_Page" be served from
 "http://en.wikipedia.org"?

Our major entrance is usually not via the main page, so this would be a niche
optimization that does not really matter that much (well, ~2% of article views
go to the main page, and only 15% of those load http://en.wikipedia.org/,
and... :)

 2)
 The CSS loads fine.  \o/

No, they don't, at least not on first pageview. 

 Probably the combining effort will save speed anyway.

Yes. We have way too many separate css assets. 

 A bunch of JS files!  And they load one after another, sequentially.
 This is worse than a C program reading a file from disk byte by byte!!

Actually, if a program reads byte by byte, the whole page is already cached by
the OS, so it is not that expensive ;-)
And yes, we know that we have a bit too many JS files loaded, and there's work 
to fix that (Roan wrote about that). 
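(The byte-by-byte point is easy to check yourself; a throwaway sketch, with a
made-up file path:)

<?php
// Compare reading a file byte by byte against reading it in one call.
$path = '/tmp/example.js';                       // hypothetical file

$t = microtime( true );
$fh = fopen( $path, 'rb' );
$bytes = 0;
while ( fgetc( $fh ) !== false ) {               // one byte at a time; still cheap, because
    $bytes++;                                    // the data sits in the OS page cache
}
fclose( $fh );
printf( "byte-by-byte: %d bytes in %.4f s\n", $bytes, microtime( true ) - $t );

$t = microtime( true );
$data = file_get_contents( $path );              // single read of the whole file
printf( "single read:  %d bytes in %.4f s\n", strlen( $data ), microtime( true ) - $t );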

 Combining will probably save a lot.  Or using a strategy to force the
 browser to download these files concurrently but execute them in order.

:-) Thanks for stating the obvious. 

 
 5)
 There are a lot of img files.  Does the page really need that many?  Spriting?

It is a PITA to sprite (not sprint) community-uploaded images, and again, that
would work only for the front page, which is not our main target. The skin
should of course be sprited.

 Total: 13.63 seconds.

Quite slow connection you've got there. I get 1s rendering times with 
cross-atlantic trips (and much better times if I get served by European caches 
:)

 You guys want to make this faster with cache optimization.  But maybe
 the problem is not bandwidth, but latency.  Latency accumulates even
 with HEAD requests that result in 302s.  All the 302s in the world
 will not make the page feel smooth if it already accumulates into 3+
 seconds territory.   ...Or am I wrong?

You are. First of all, skin assets are not doing IMS (If-Modified-Since)
requests; they are all cached.
We force browsers to do IMS on page views so that browsers pick up edits
(it is a wiki).
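(For readers unfamiliar with IMS: the browser re-sends the URL with an
If-Modified-Since header, and the server answers 304 with no body if nothing
changed. A bare-bones sketch of that exchange -- not MediaWiki's actual logic,
and renderPage() is a made-up stand-in for producing the HTML:)

<?php
// Answer a conditional (If-Modified-Since) request for a page.
$lastModified = strtotime( '2010-08-11 12:00:00' );   // example value; really the page's last edit

header( 'Last-Modified: ' . gmdate( 'D, d M Y H:i:s', $lastModified ) . ' GMT' );

if ( isset( $_SERVER['HTTP_IF_MODIFIED_SINCE'] ) &&
    strtotime( $_SERVER['HTTP_IF_MODIFIED_SINCE'] ) >= $lastModified
) {
    // Nothing changed since the browser's copy: send headers only, no body.
    header( 'HTTP/1.1 304 Not Modified' );
    exit;
}

echo renderPage();   // hypothetical function that produces the full HTML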

 It is probably a much better idea to read that book than my post.

I'm sorry to disappoint you, but none of the issues you wrote down here is
new.
If after reading any books or posts you think we have deficiencies, it is
mostly for one of two reasons: either we were lazy and didn't implement it,
or it is something we need in order to maintain the wiki model.

Though of course, while it is all fresh and scarred you for life, we've been
doing this for life. ;-)

Domas


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Aryeh Gregor
On Wed, Aug 11, 2010 at 5:55 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 Is that a server-side redirect, or is it done in JS? In the latter
 case, it taking long would make sense, and would actually be slowed
 down by moving all script tags to the bottom (incidentally, that's
 what I've done today in the resourceloader branch).

I have no idea.  If it were server-side, it would have to be done in
Squid, presumably.  A JS redirect would explain a lot of the slowness
-- an HTTP redirect shouldn't be that slow.



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-11 Thread Platonides
Aryeh Gregor wrote:
 On Wed, Aug 11, 2010 at 5:55 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 Is that a server-side redirect, or is it done in JS? In the latter
 case, it taking long would make sense, and would actually be slowed
 down by moving all script tags to the bottom (incidentally, that's
 what I've done today in the resourceloader branch).
 
 I have no idea.  If it were server-side, it would have to be done in
 Squid, presumably.  A JS redirect would explain a lot of the slowness
 -- an HTTP redirect shouldn't be that slow.

It is a JavaScript redirect (see extensions/WikimediaMobile).
Plus, those browsers won't be too optimized.




Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-03 Thread Domas Mituzas
Hi!

 Couldn't you just tag every internal link with
 a separate class for the length of the target article,

Great idea! How come no one ever came up with this? I even have a stylesheet
ready; here it is (do note, even though it looks big in text, gzip gets it down
to 10%, so we can support this kind of granularity even up to a megabyte :)

Domas

a { color: blue }
a.1_byte_article { color: red; }
a.2_byte_article { color: red; }
a.3_byte_article { color: red; }
a.4_byte_article { color: red; }
a.5_byte_article { color: red; }
a.6_byte_article { color: red; }
a.7_byte_article { color: red; }
a.8_byte_article { color: red; }
a.9_byte_article { color: red; }
a.10_byte_article { color: red; }
a.11_byte_article { color: red; }
a.12_byte_article { color: red; }
a.13_byte_article { color: red; }
a.14_byte_article { color: red; }
a.15_byte_article { color: red; }
a.16_byte_article { color: red; }
a.17_byte_article { color: red; }
a.18_byte_article { color: red; }
a.19_byte_article { color: red; }
a.20_byte_article { color: red; }
a.21_byte_article { color: red; }
a.22_byte_article { color: red; }
a.23_byte_article { color: red; }
a.24_byte_article { color: red; }
a.25_byte_article { color: red; }
a.26_byte_article { color: red; }
a.27_byte_article { color: red; }
a.28_byte_article { color: red; }
a.29_byte_article { color: red; }
a.30_byte_article { color: red; }
a.31_byte_article { color: red; }
a.32_byte_article { color: red; }
a.33_byte_article { color: red; }
a.34_byte_article { color: red; }
a.35_byte_article { color: red; }
a.36_byte_article { color: red; }
a.37_byte_article { color: red; }
a.38_byte_article { color: red; }
a.39_byte_article { color: red; }
a.40_byte_article { color: red; }
a.41_byte_article { color: red; }
a.42_byte_article { color: red; }
a.43_byte_article { color: red; }
a.44_byte_article { color: red; }
a.45_byte_article { color: red; }
a.46_byte_article { color: red; }
a.47_byte_article { color: red; }
a.48_byte_article { color: red; }
a.49_byte_article { color: red; }
a.50_byte_article { color: red; }
a.51_byte_article { color: red; }
a.52_byte_article { color: red; }
a.53_byte_article { color: red; }
a.54_byte_article { color: red; }
a.55_byte_article { color: red; }
a.56_byte_article { color: red; }
a.57_byte_article { color: red; }
a.58_byte_article { color: red; }
a.59_byte_article { color: red; }
a.60_byte_article { color: red; }
a.61_byte_article { color: red; }
a.62_byte_article { color: red; }
a.63_byte_article { color: red; }
a.64_byte_article { color: red; }
a.65_byte_article { color: red; }
a.66_byte_article { color: red; }
a.67_byte_article { color: red; }
a.68_byte_article { color: red; }
a.69_byte_article { color: red; }
a.70_byte_article { color: red; }
a.71_byte_article { color: red; }
a.72_byte_article { color: red; }
a.73_byte_article { color: red; }
a.74_byte_article { color: red; }
a.75_byte_article { color: red; }
a.76_byte_article { color: red; }
a.77_byte_article { color: red; }
a.78_byte_article { color: red; }
a.79_byte_article { color: red; }
a.80_byte_article { color: red; }
a.81_byte_article { color: red; }
a.82_byte_article { color: red; }
a.83_byte_article { color: red; }
a.84_byte_article { color: red; }
a.85_byte_article { color: red; }
a.86_byte_article { color: red; }
a.87_byte_article { color: red; }
a.88_byte_article { color: red; }
a.89_byte_article { color: red; }
a.90_byte_article { color: red; }
a.91_byte_article { color: red; }
a.92_byte_article { color: red; }
a.93_byte_article { color: red; }
a.94_byte_article { color: red; }
a.95_byte_article { color: red; }
a.96_byte_article { color: red; }
a.97_byte_article { color: red; }
a.98_byte_article { color: red; }
a.99_byte_article { color: red; }
a.100_byte_article { color: red; }
a.101_byte_article { color: red; }
a.102_byte_article { color: red; }
a.103_byte_article { color: red; }
a.104_byte_article { color: red; }
a.105_byte_article { color: red; }
a.106_byte_article { color: red; }
a.107_byte_article { color: red; }
a.108_byte_article { color: red; }
a.109_byte_article { color: red; }
a.110_byte_article { color: red; }
a.111_byte_article { color: red; }
a.112_byte_article { color: red; }
a.113_byte_article { color: red; }
a.114_byte_article { color: red; }
a.115_byte_article { color: red; }
a.116_byte_article { color: red; }
a.117_byte_article { color: red; }
a.118_byte_article { color: red; }
a.119_byte_article { color: red; }
a.120_byte_article { color: red; }
a.121_byte_article { color: red; }
a.122_byte_article { color: red; }
a.123_byte_article { color: red; }
a.124_byte_article { color: red; }
a.125_byte_article { color: red; }
a.126_byte_article { color: red; }
a.127_byte_article { color: red; }
a.128_byte_article { color: red; }
a.129_byte_article { color: red; }
a.130_byte_article { color: red; }
a.131_byte_article { color: red; }
a.132_byte_article { color: red; }
a.133_byte_article { color: red; }
a.134_byte_article { color: red; }
a.135_byte_article { color: red; }
a.136_byte_article 

Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-03 Thread Liangent
On 8/3/10, Lars Aronsson l...@aronsson.se wrote:
 Couldn't you just tag every internal link with
 a separate class for the length of the target article,
 and then use different personal CSS to set the
 threshold? The generated page would be the same
 for all users:

So if a page is changed, all pages linking to it need to be parsed
again. Will this cost even more?



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-03 Thread K. Peachey
Would something like what is shown below get it even further down?

a { color: blue }
a.1_byte_article, a.2_byte_article, a.3_byte_article,
a.4_byte_article, a.5_byte_article, a.6_byte_article,
a.7_byte_article, a.8_byte_article, a.9_byte_article,
a.10_byte_article,a.11_byte_article, a.12_byte_article,
a.13_byte_article, a.14_byte_article, a.15_byte_article,
a.16_byte_article, a.17_byte_article, a.18_byte_article,
a.19_byte_article, a.20_byte_article, a.21_byte_article,
a.22_byte_article, a.23_byte_article, a.24_byte_article,
a.25_byte_article, a.26_byte_article, a.27_byte_article,
a.28_byte_article, a.29_byte_article, a.30_byte_article,
a.31_byte_article, a.32_byte_article, a.33_byte_article,
a.34_byte_article, a.35_byte_article, a.36_byte_article,
a.37_byte_article, a.38_byte_article, a.39_byte_article,
a.40_byte_article, a.41_byte_article, a.42_byte_article,
a.43_byte_article, a.44_byte_article, a.45_byte_article,
a.46_byte_article, a.47_byte_article, a.48_byte_article,
a.49_byte_article, a.50_byte_article, a.51_byte_article,
a.52_byte_article, a.53_byte_article, a.54_byte_article,
a.55_byte_article, a.56_byte_article, a.57_byte_article,
a.58_byte_article, a.59_byte_article, a.60_byte_article,
a.61_byte_article, a.62_byte_article, a.63_byte_article,
a.64_byte_article, a.65_byte_article, a.66_byte_article,
a.67_byte_article, a.68_byte_article, a.69_byte_article,
a.70_byte_article, a.71_byte_article, a.72_byte_article,
a.73_byte_article, a.74_byte_article, a.75_byte_article,
a.76_byte_article, a.77_byte_article, a.78_byte_article,
a.79_byte_article, a.80_byte_article, a.81_byte_article,
a.82_byte_article, a.83_byte_article, a.84_byte_article,
a.85_byte_article, a.86_byte_article, a.87_byte_article,
a.88_byte_article, a.89_byte_article, a.90_byte_article,
a.91_byte_article, a.92_byte_article, a.93_byte_article,
a.94_byte_article, a.95_byte_article { color: red }
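(Whether the grouped selectors actually win after compression is easy to
measure; a quick sketch using PHP's zlib support, assuming the two variants
are saved under the made-up file names below:)

<?php
// Compare raw and gzipped sizes of the two stylesheet variants.
foreach ( array( 'one-rule-per-class.css', 'grouped-selectors.css' ) as $file ) {
    $css = file_get_contents( $file );
    printf( "%-25s raw: %6d bytes  gzipped: %6d bytes\n",
        $file, strlen( $css ), strlen( gzencode( $css, 9 ) ) );
}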



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-03 Thread Platonides
Lars Aronsson wrote:
 On 08/01/2010 10:55 PM, Aryeh Gregor wrote:
 One easy hack to reduce this problem is just to only provide a few
 options for stub threshold, as we do with thumbnail size.  Although
 this is only useful if we cache pages with nonzero stub threshold . .
 . why don't we do that?  Too much fragmentation due to the excessive
 range of options?
 
 Couldn't you just tag every internal link with
 a separate class for the length of the target article,
 and then use different personal CSS to set the
 threshold? The generated page would be the same
 for all users:
 
 <a href="My_Article" class="134_byte_article">My Article</a>

That would be workable, e.g. one class for articles smaller than 50
bytes, another for 100, 200, 250, 300, 400, 500, 600, 700, 800, 1000,
2000, 2500, 5000, 10000, if it weren't for having to update all those
classes whenever the page changes.

It would work to add it as a separate stylesheet for stubs, though.
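(A sketch of the bucketing idea: the parser emits one coarse class per link
instead of the exact byte count, so a small edit rarely moves an article into
a different bucket. The bucket boundaries and helper name are invented:)

<?php
// Map an article's byte length to a coarse stub-size class (hypothetical helper).
function stubSizeClass( $bytes ) {
    $buckets = array( 50, 100, 200, 250, 300, 400, 500, 600, 700, 800,
        1000, 2000, 2500, 5000, 10000 );
    foreach ( $buckets as $limit ) {
        if ( $bytes < $limit ) {
            return "stub_under_{$limit}_bytes";
        }
    }
    return 'not_a_stub';
}

// A user with a 300-byte threshold then hides/recolors stubs with personal CSS
// targeting .stub_under_50_bytes ... .stub_under_300_bytes only.
echo stubSizeClass( 98 ), "\n";   // stub_under_100_bytes
echo stubSizeClass( 102 ), "\n";  // stub_under_200_bytes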




Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-03 Thread Happy-melon

Oldak Quill oldakqu...@gmail.com wrote in message 
news:aanlktik8sqmaetwvg8eta+ca49i08rfbrmvicsms+...@mail.gmail.com...
 On 2 August 2010 12:13, Oldak Quill oldakqu...@gmail.com wrote:
 On 28 July 2010 20:13,  jida...@jidanni.org wrote:
 Seems to me playing the role of the average dumb user, that
 en.wikipedia.org is one of the rather slow websites of the many websites
 I browse.

 No matter what browser, it takes more seconds from the time I click on a
 link to the time when the first bytes of the HTTP response start flowing
 back to me.

 Seems facebook is more zippy.

 Maybe Mediawiki is not optimized.


 For what it's worth, Alexa.com lists the average load time of the
 websites they catalogue. I'm not sure what the metrics they use are,
 and I would guess they hit the squid cache and are in the United
 States.

 Alexa.com list the following average load times as of now:

 wikipedia.org: Fast (1.016 Seconds), 74% of sites are slower.
 facebook.com: Average (1.663 Seconds), 50% of sites are slower.


 An addendum to the above message:

 According to the Alexa.com help page "Average Load Times: Speed
 Statistics" (http://www.alexa.com/help/viewtopic.php?f=6&t=1042):
 "The Average Load Time ... [is] based on load times experienced by
 Alexa users, and measured by the Alexa Toolbar, during their regular
 web browsing."

 So although US browsers might be overrepresented in this sample (I'm
 just guessing, I have no figures to support this statement), the Alexa
 sample should include many non-US browsers, assuming that the figure
 reported by Alexa.com is reflective of its userbase.

And the average Alexa toolbar user on Facebook is logged in and using it to
see what their friends were up to last night, with masses of personalised
content; while the average Alexa toolbar user on Wikipedia is a reader seeing
the same page as everyone else.  We definitely have the theoretical advantage.

--HM
 





Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Domas Mituzas
Hi!

 I.e., only about a quarter of users have been ported to
 user_properties.  Why wasn't a conversion script run here?

In theory, if all properties are at defaults, the user shouldn't be there. The
actual check should be against the blob field.

Domas


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Andrew Garrett
On Mon, Aug 2, 2010 at 5:35 PM, Domas Mituzas midom.li...@gmail.com wrote:
 Hi!

 I.e., only about a quarter of users have been ported to
 user_properties.  Why wasn't a conversion script run here?

 In theory if all properties are at defaults, user shouldn't be there. The 
 actual check should be against the blob field.

That's what he did. Read the query.

-- 
Andrew Garrett
http://werdn.us/


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Domas Mituzas
 That's what he did. Read the query.

;-) that's what happens when email gets ahead of coffee.

Domas



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Tei
On 28 July 2010 21:13,  jida...@jidanni.org wrote:
 Seems to me playing the role of the average dumb user, that
 en.wikipedia.org is one of the rather slow websites of the many websites
 I browse.

 No matter what browser, it takes more seconds from the time I click on a
 link to the time when the first bytes of the HTTP response start flowing
 back to me.

 Seems facebook is more zippy.

It seems fast here: 130ms.

The first load of the homepage can be slow:
http://zerror.com/unorganized/wika/lader1.png
http://en.wikipedia.org/wiki/Main_Page
(I need a bigger monitor; the escalator doesn't fit on my screen)



-- 
--
ℱin del ℳensaje.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Oldak Quill
On 2 August 2010 12:13, Oldak Quill oldakqu...@gmail.com wrote:
 On 28 July 2010 20:13,  jida...@jidanni.org wrote:
 Seems to me playing the role of the average dumb user, that
 en.wikipedia.org is one of the rather slow websites of the many websites
 I browse.

 No matter what browser, it takes more seconds from the time I click on a
 link to the time when the first bytes of the HTTP response start flowing
 back to me.

 Seems facebook is more zippy.

 Maybe Mediawiki is not optimized.


 For what it's worth, Alexa.com lists the average load time of the
 websites they catalogue. I'm not sure what the metrics they use are,
 and I would guess they hit the squid cache and are in the United
 States.

 Alexa.com list the following average load times as of now:

 wikipedia.org: Fast (1.016 Seconds), 74% of sites are slower.
 facebook.com: Average (1.663 Seconds), 50% of sites are slower.


An addendum to the above message:

According to the Alexa.com help page "Average Load Times: Speed
Statistics" (http://www.alexa.com/help/viewtopic.php?f=6&t=1042):
"The Average Load Time ... [is] based on load times experienced by
Alexa users, and measured by the Alexa Toolbar, during their regular
web browsing."

So although US browsers might be overrepresented in this sample (I'm
just guessing, I have no figures to support this statement), the Alexa
sample should include many non-US browsers, assuming that the figure
reported by Alexa.com is reflective of its userbase.

-- 
Oldak Quill (oldakqu...@gmail.com)


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Roan Kattouw
2010/8/2 Tei oscar.vi...@gmail.com:
 Maybe a theme can get the individual icons that the theme uses, and
 combine them all in a single PNG file.

This technique is called spriting, and the single combined image file
is called a sprite. We've done this with e.g. the enhanced toolbar
buttons, but it doesn't work in all cases.
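(For anyone curious what building a sprite involves, a minimal sketch using
PHP's GD extension -- invented icon names, fixed-size icons, horizontal
packing only, and none of the edge cases that make real spriting painful:)

<?php
// Pack a few same-size icons into one horizontal strip and print the CSS offsets.
$icons = array( 'edit.png', 'talk.png', 'history.png' );   // example files
$size  = 22;                                               // assume 22x22 icons

$sheet = imagecreatetruecolor( $size * count( $icons ), $size );
imagealphablending( $sheet, false );
imagesavealpha( $sheet, true );
imagefill( $sheet, 0, 0, imagecolorallocatealpha( $sheet, 0, 0, 0, 127 ) ); // transparent background

foreach ( $icons as $i => $file ) {
    $icon = imagecreatefrompng( $file );
    imagecopy( $sheet, $icon, $i * $size, 0, 0, 0, $size, $size );
    // Each icon is shown by shifting the shared background image left.
    printf( ".icon-%s { background: url(sprite.png) -%dpx 0 no-repeat; }\n",
        basename( $file, '.png' ), $i * $size );
}
imagepng( $sheet, 'sprite.png' );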

 Maybe the idea that resource=file must die in the 2011 internet :-/

The resourceloader branch contains work in progress on aggressively
combining and minifying JavaScript and CSS. The mapping of one
resource = one file will be preserved, but the mapping of one resource
= one REQUEST will die: it'll be possible, and encouraged, to obtain
multiple resources in one request.

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread John Vandenberg
On Mon, Aug 2, 2010 at 11:24 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 The resourceloader branch contains work in progress on aggressively
 combining and minifying JavaScript and CSS. The mapping of one
 resource = one file will be preserved, but the mapping of one resource
 = one REQUEST will die: it'll be possible, and encouraged, to obtain
 multiple resources in one request.

Does that approach gain much over HTTP pipelining?

--
John Vandenberg



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Tei
On 2 August 2010 15:24, Roan Kattouw roan.katt...@gmail.com wrote:
...
 Maybe the idea that resource=file must die in the 2011 internet :-/

 The resourceloader branch contains work in progress on aggressively
 combining and minifying JavaScript and CSS. The mapping of one
 resource = one file will be preserved, but the mapping of one resource
 = one REQUEST will die: it'll be possible, and encouraged, to obtain
 multiple resources in one request.


:-O

That is an awesome solution, considering the complexity of the real-world
problems.  Elegant, and as a side effect it will probably remove some bloat.

-- 
--
ℱin del ℳensaje.


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Aryeh Gregor
On Mon, Aug 2, 2010 at 10:50 AM, John Vandenberg jay...@gmail.com wrote:
 Does that approach gain much over HTTP pipelining?

Yes, because browsers don't HTTP pipeline in practice, because
transparent proxies at ISPs cause sites to break if they do that, and
there's no reliable way to detect them.  Opera does pipelining and
blacklists bad ISPs or something, I think.  See:

https://bugzilla.mozilla.org/show_bug.cgi?id=264354



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-02 Thread Lars Aronsson
On 08/01/2010 10:55 PM, Aryeh Gregor wrote:
 One easy hack to reduce this problem is just to only provide a few
 options for stub threshold, as we do with thumbnail size.  Although
 this is only useful if we cache pages with nonzero stub threshold . .
 . why don't we do that?  Too much fragmentation due to the excessive
 range of options?

Couldn't you just tag every internal link with
a separate class for the length of the target article,
and then use different personal CSS to set the
threshold? The generated page would be the same
for all users:

<a href="My_Article" class="134_byte_article">My Article</a>



-- 
   Lars Aronsson (l...@aronsson.se)
   Aronsson Datateknik - http://aronsson.se





Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Platonides
Aryeh Gregor wrote:
 Look, this is just not a useful solution, period.  It would be
 extremely ineffective.  If you extended the permitted staleness level
 so much that it would be moderately effective, it would be useless,
 because you'd be seeing hours- or days-old articles.  On the other
 hand, for a comparable amount of effort you could implement a solution
 that actually is effective, like adding an extra postprocessing stage.

Yes, I have some ideas on how to improve it.


 On Fri, Jul 30, 2010 at 1:32 PM, John Vandenberg jay...@gmail.com wrote:
 Someone who sets their stub threshold to 357 is their own performance enemy.

In fact, setting the stub threshold to anything disables the parser
cache. You can only hit it when it is set to 0.

Aryeh, can you do some statistics about the frequency of the different
stub thresholds? Perhaps restricted to people who edited this year, to
discard unused accounts.




Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Roan Kattouw
2010/8/1 Platonides platoni...@gmail.com:
 Aryeh, can you do some statistics about the frequency of the different
 stub thresholds? Perhaps restricted to people who edited this year, to
 discard unused accounts.

He can't, but I can.  I ran a couple of queries and put the result at
http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Aryeh Gregor
On Sun, Aug 1, 2010 at 4:43 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 He can't, but I can.  I ran a couple of queries and put the result at
 http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

I can too -- I'm a toolserver root, so I have read-only access to
pretty much the whole database (minus some omitted
databases/tables/columns, mainly IP addresses and maybe private
wikis).  But no need, since you already did it.  :)  The data isn't
complete because not all users have been ported to user_properties,
right?

One easy hack to reduce this problem is just to only provide a few
options for stub threshold, as we do with thumbnail size.  Although
this is only useful if we cache pages with nonzero stub threshold . .
. why don't we do that?  Too much fragmentation due to the excessive
range of options?


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Roan Kattouw
2010/8/1 Aryeh Gregor simetrical+wikil...@gmail.com:
 On Sun, Aug 1, 2010 at 4:43 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 He can't, but I can.  I ran a couple of queries and put the result at
 http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold

 I can too -- I'm a toolserver root, so I have read-only access to
 pretty much the whole database (minus some omitted
 databases/tables/columns, mainly IP addresses and maybe private
 wikis).
Ah yes, I forgot about that. I was assuming you'd need access to the
live DB for this.

 But no need, since you already did it.  :)  The data isn't
 complete because not all users have been ported to user_properties,
 right?

I don't know. Cursory inspection seems to indicate user_properties is
relatively complete, but comprehensive count queries are too slow for
me to dare run them on the cluster. Maybe you could run something
along the lines of SELECT COUNT(DISTINCT up_user) FROM
user_properties; on the toolserver and compare it with SELECT COUNT(*)
FROM user;

 One easy hack to reduce this problem is just to only provide a few
 options for stub threshold, as we do with thumbnail size.  Although
 this is only useful if we cache pages with nonzero stub threshold . .
 . why don't we do that?  Too much fragmentation due to the excessive
 range of options?
Maybe; but the fact that the field is present but set to 0 in the
parser cache key is very weird. SVN blame should probably be able to
tell who did this and hopefully why.

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Platonides
Roan Kattouw wrote:
 One easy hack to reduce this problem is just to only provide a few
 options for stub threshold, as we do with thumbnail size.  Although
 this is only useful if we cache pages with nonzero stub threshold . .
 . why don't we do that?  Too much fragmentation due to the excessive
 range of options?
 Maybe; but the fact that the field is present but set to 0 in the
 parser cache key is very weird. SVN blame should probably be able to
 tell who did this and hopefully why.
 
 Roan Kattouw (Catrope)

Look at Article::getParserOutput() for how $wgUser->getOption(
'stubthreshold' ) is explicitly checked to be 0 before enabling the
parser cache.
* There are several other entry points to the ParserCache in Article;
it's a bit mixed.


Note that we do offer several options, not only the free-text field. I
think that the underlying problem is that when changing an article from
98 bytes to 102, we would need to invalidate all pages linking to it for
a stub threshold of 100 bytes.

Since the pages are reparsed, custom values are not a problem now.
I think that to cache for the stub thresholds, we would need to cache
just before replaceLinkHolders() and perform the replacement at request
time.




Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Platonides
Roan Kattouw wrote:
 2010/8/1 Platonides:
 Aryeh, can you do some statistics about the frequency of the different
 stub thresholds? Perhaps restricted to people who edited this year, to
 discard unused accounts.

 He can't, but I can.  I ran a couple of queries and put the result at
 http://www.mediawiki.org/wiki/User:Catrope/Stub_threshold
 
 Roan Kattouw (Catrope)

Thanks, Roan.
I think that the condition should have been the inverse (users with
recent edits, not users who don't have old edits), but anyway it shows
that with a few (8-10) values we could please almost everyone.

Also, it shows that people don't understand how to disable it. The tail
has many extremely large values, which can only mean "don't treat stubs
differently".




Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Aryeh Gregor
On Sun, Aug 1, 2010 at 5:03 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 I don't know. Cursory inspection seems to indicate user_properties is
 relatively complete, but comprehensive count queries are too slow for
 me to dare run them on the cluster. Maybe you could run something
 along the lines of SELECT COUNT(DISTINCT up_user) FROM
 user_properties; on the toolserver and compare it with SELECT COUNT(*)
 FROM user;

That won't work, because it won't count users whose settings are all
default.  However, we can tell who's switched because user_options
will be empty.

On Sun, Aug 1, 2010 at 5:48 PM, Platonides platoni...@gmail.com wrote:
 Note that we do offer several options, not only the free-text field. I
 think that the underlying problem is that when changing an article from
 98 bytes to 102, we would need to invalidate all pages linking to it for
 stubthresholds of 100 bytes.

Aha, that must be it.  Any stub threshold would require extra page
invalidation, which we don't do because it would be pointlessly
expensive.  Postprocessing would fix the problem.

 Since the pages are reparsed, custom values are not a problem now.
 I think that to cache for the stubthresholds, we would need to cache
 just before the replaceLinkHolders() and perform the replacement at the
 user request.

Yep.  Or parse further, but leave markers lingering in the output
somehow.  We don't need to cache the actual wikitext, either way.  We
just need to cache at some point after all the heavy lifting has been
done, and everything that's left can be done in a couple of
milliseconds.
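(Something like the sketch below is what "leave markers lingering in the
output" could mean: the expensive parse emits a neutral placeholder, and a
cheap per-user pass rewrites it according to the viewer's stub threshold. The
marker syntax and function name are invented:)

<?php
// Cheap per-user pass over already-parsed HTML containing invented markers like
//   <!--LINK My_Article|134-->Some text<!--/LINK-->
function applyStubThreshold( $cachedHtml, $threshold ) {
    return preg_replace_callback(
        '/<!--LINK (.+?)\|(\d+)-->(.*?)<!--\/LINK-->/s',
        function ( $m ) use ( $threshold ) {
            list( , $target, $bytes, $text ) = $m;
            $class = ( $threshold > 0 && $bytes < $threshold ) ? 'stub' : '';
            return "<a href=\"/wiki/$target\" class=\"$class\">$text</a>";
        },
        $cachedHtml
    );
}

// One cached parse serves every user; only this millisecond-level pass differs.
echo applyStubThreshold( '<!--LINK My_Article|134-->My Article<!--/LINK-->', 500 );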



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Roan Kattouw
2010/8/1 Platonides platoni...@gmail.com:
 I think that the condition should have been the inverse (users with
 recent edits, not users which don't have old edits)
Oops. I thought I had reversed the condition correctly, but as you
point out I hadn't. I'll run the corrected queries tomorrow.

Roan Kattouw (Catrope)



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-08-01 Thread Aryeh Gregor
On Sun, Aug 1, 2010 at 6:24 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 That won't work, because it won't count users whose settings are all
 default.  However, we can tell who's switched because user_options
 will be empty.

SELECT COUNT(*) FROM user WHERE user_options = ''; SELECT COUNT(*) FROM user;
+----------+
| COUNT(*) |
+----------+
|  3491404 |
+----------+
1 row in set (10 min 20.11 sec)

+----------+
| COUNT(*) |
+----------+
| 12822573 |
+----------+
1 row in set (7 min 47.87 sec)

I.e., only about a quarter of users have been ported to
user_properties.  Why wasn't a conversion script run here?


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-07-31 Thread Daniel Kinzler
Aryeh Gregor wrote:
 As soon as you're logged in, you're missing Squid cache, because we
 have to add your name to the top, attach your user CSS/JS, etc.  You
 can't be served the same HTML as an anonymous user.  If you want to be
 served the same HTML as an anonymous user, log out.

This is a few years old, but I guess it's still relevant:
http://brightbyte.de/page/Client-side_skins_with_XSLT I experimented a bit
with ways to do all the per-user preference stuff on the client side, with XSLT.

-- daniel



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-07-30 Thread Alex Brollo
2010/7/30 Daniel Friesen li...@nadir-seen-fire.com


 That's pretty much the purpose of the caching servers.


Yes, but I presume that a big advantage could come from having a
simplified, unique, JS-free version of the pages online, completely devoid
of user preferences, to avoid any need to parse them again when loaded by
different users with different preference profiles. Nevertheless I say
again: it's only a complete layman's idea.

-- 
Alex


Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-07-30 Thread John Vandenberg
On Fri, Jul 30, 2010 at 6:23 AM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 On Thu, Jul 29, 2010 at 4:07 PM, Strainu strain...@gmail.com wrote:
 Could you please elaborate on that? Thanks.

 When pages are parsed, the parsed version is cached, since parsing can
 take a long time (sometimes > 10 s).  Some preferences change how
 pages are parsed, so different copies need to be stored based on those
 preferences.  If these settings are all default for you, you'll be
 using the same parser cache copies as anonymous users, so you're
 extremely likely to get a parser cache hit.  If any of them is
 non-default, you'll only get a parser cache hit if someone with your
 exact parser-related preferences viewed the page since it was last
 changed; otherwise it will have to reparse the page just for you,
 which will take a long time.

 This is probably a bad thing.

Could we add a logged-in-reader mode, for people who are infrequent
contributors but wish to be logged in for the prefs?

They could be served a slightly old cached version of the page when
one is available for their prefs, e.g. if the cached version is less
than a minute old.
The downside is that if they see an error, it may already be fixed.
OTOH, if the page is being revised frequently, the same is likely to
happen anyway.  The text could be stale before it hits the wire due to
parsing delay.

For pending changes, the pref 'Always show the latest accepted
revision (if there is one) of a page by default' could be enabled by
default.  Was there any discussion about the default setting for this
pref?

--
John Vandenberg



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-07-29 Thread Aryeh Gregor
On Thu, Jul 29, 2010 at 4:07 PM, Strainu strain...@gmail.com wrote:
 Could you please elaborate on that? Thanks.

When pages are parsed, the parsed version is cached, since parsing can
take a long time (sometimes > 10 s).  Some preferences change how
pages are parsed, so different copies need to be stored based on those
preferences.  If these settings are all default for you, you'll be
using the same parser cache copies as anonymous users, so you're
extremely likely to get a parser cache hit.  If any of them is
non-default, you'll only get a parser cache hit if someone with your
exact parser-related preferences viewed the page since it was last
changed; otherwise it will have to reparse the page just for you,
which will take a long time.

This is probably a bad thing.  I'd think that most of the settings
that fragment the parser cache should be implementable in a
post-processing stage, which should be more than fast enough to run on
parser cache hits as well as misses.  But we don't have such a thing.
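(Conceptually, the fragmentation comes from the cache key itself; it is shaped
roughly like the sketch below -- not MediaWiki's real key format, just an
illustration of why any non-default preference lands you in a different,
usually cold, slot:)

<?php
// Illustrative only: the parser cache key mixes in every parser-affecting preference,
// so any non-default value sends the user to a different (usually cold) cache slot.
function parserCacheKey( $pageId, $touched, array $prefs ) {
    return 'pcache:' . $pageId . ':' . $touched
        . ':stub=' . $prefs['stubthreshold']      // 0 for anonymous users
        . ':thumb=' . $prefs['thumbsize'];
}

echo parserCacheKey( 15580374, '20100811120000',
    array( 'stubthreshold' => 0, 'thumbsize' => 2 ) ), "\n";
echo parserCacheKey( 15580374, '20100811120000',
    array( 'stubthreshold' => 357, 'thumbsize' => 2 ) ), "\n";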



Re: [Wikitech-l] wikipedia is one of the slower sites on the web

2010-07-29 Thread Domas Mituzas
 This is probably a bad thing.  I'd think that most of the settings
 that fragment the parser cache should be implementable in a
 post-processing stage, which should be more than fast enough to run on
 parser cache hits as well as misses.  But we don't have such a thing.

some of which can even be done with CSS/JS, I guess. 
I'm all for simplifying whatever processing backend has to do :-) 

Domas
