Re: [Wikitech-l] [Toolserver-l] Crawling deWP
Marco Schuster wrote:
...
>> But by then, I do hope we have revision flags in the dumps, because that
>> would be The Right Thing to use.
> Still, using the dumps would require me to get the full history dump
> because I only want flagged revisions and not current revisions
> without the flag.

Including the latest revision that is flagged "good" would be an obvious
feature that should be implemented along with including the revision flags. So
the "current" dump would have 1-3 revisions per page.

-- daniel

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Crawling deWP
Marco Schuster wrote:
> Rolf Lampa wrote:
>>
>> Don't the XML dumps contain the flag for flagged revs?
>
> The XML dumps are nothing for me, way too much overhead (especially,
> they are old, and I want to use single files, it's easier to process
> these than one huge XML file). And they don't contain flagged
> revisions flags :(

I traverse the latest enwiki dump (last revision only) in 15 minutes (or the
Swedish svwiki in < 3 min) with my stream tool (written in Delphi Pascal). On
the go I can copy the whole thing (it takes no longer), and while at it I can
create the "big three" SQL tables (page, revision & text) out of the XML dump
as well, in less than 20 minutes.

I like XML dumps. :)

I'd love, however, to see the flagged rev status as an attribute in one of the
tags, for example

Regards,

// Rolf Lampa
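The single-pass traversal described above can be sketched in Python. This is not Rolf's Delphi tool, only a minimal illustration of stream parsing. The inline sample, the name `iter_pages`, and the simplified element layout are assumptions made for the example; real dumps also declare an XML namespace, which the sample leaves out.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a "last revision only" dump (assumed, simplified layout).
SAMPLE_DUMP = b"""<mediawiki>
  <page>
    <title>Foo</title>
    <revision><id>11</id><text>first page</text></revision>
  </page>
  <page>
    <title>Bar</title>
    <revision><id>22</id><text>second page</text></revision>
  </page>
</mediawiki>"""


def iter_pages(stream):
    """Yield (title, revision_id, text) for each <page>, one page at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "page":
            continue
        rev = elem.find("revision")
        yield elem.findtext("title"), int(rev.findtext("id")), rev.findtext("text")
        elem.clear()  # release the finished page's children before moving on


pages = list(iter_pages(io.BytesIO(SAMPLE_DUMP)))
for title, rev_id, text in pages:
    print(title, rev_id, len(text))
```

Because each page is handled and released as soon as its closing tag arrives, the same loop could in principle write out SQL rows or a copy of the dump while it reads, which is the kind of "on the go" processing the message describes.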
Re: [Wikitech-l] Crawling deWP
Rolf Lampa wrote:
> I'd love, however, to see the flagged rev status as an attribute in one
> of the tags, for example
>
> Regards,

Naw, it's more complex than that. You can have any number of different flags.
It would probably have to be

foobar

-- daniel
Re: [Wikitech-l] MediaWiki Slow, what to look for?
Thank you Platonides,

It seems I now get the error:

"xcache.var_size is either 0 or too small to enable var data caching in
/var/www/includes/BagOStuff.php on line 643"

Googling hasn't provided much info on how to fix this. Does anyone know?

2009/1/28 Platonides

> Dawson wrote:
> > Modified config file as follows:
> >
> > $wgUseDatabaseMessage = false;
> > $wgUseFileCache = true;
> > $wgMainCacheType = "CACHE_ACCEL";
>
> This should be $wgMainCacheType = CACHE_ACCEL; (constant) not
> $wgMainCacheType = "CACHE_ACCEL"; (string)
Re: [Wikitech-l] MediaWiki Slow, what to look for?
On Wed, Jan 28, 2009 at 5:33 AM, Dawson wrote:
> Seems now I get the error: "xcache.var_size is either 0 or too small to
> enable var data caching in /var/www/includes/BagOStuff.php on line 643"
>
> Googling hasn't provided much info on how to fix this, anyone know?

Add this to php.ini:

xcache.var_size = 32M

Or pick whatever size you like, depending on how much RAM you have available.
You can check the amount of RAM used (and other things) using the xcache-admin
stuff that should have been provided when you installed XCache. You might want
to tweak the other options too:

http://xcache.lighttpd.net/wiki/PhpIni
Re: [Wikitech-l] hosting wikipedia
I want to offer something like reference.com, where the results are formatted
in a manner consistent with the look and feel of my website. Try a search on
this site and you will see what I mean.

So is this a live mirror? If ads are on the site, is the revenue shared?

----- Original Message -----
From: Aryeh Gregor
To: Wikimedia developers
Sent: Tuesday, January 27, 2009 7:41:24 PM
Subject: Re: [Wikitech-l] hosting wikipedia

On Tue, Jan 27, 2009 at 7:37 PM, George Herbert wrote:
> Right, but a live mirror is a very different thing than a search box link.

Well, as far as I can tell, we have no idea whether the original poster
meant either of those, or perhaps something else altogether. Obviously
nobody minds a search box link, that's just a *link*. You can't stop
people from linking to you.
Re: [Wikitech-l] Enwiki dump crawling since 10/15/2008
"Brion Vibber" wrote in message news:497f9c35.9050...@wikimedia.org...
> On 1/27/09 2:55 PM, Robert Rohde wrote:
>> On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber wrote:
>>> On 1/27/09 2:35 PM, Thomas Dalton wrote:
>>>> The way I see it, what we need is to get a really powerful server
>>> Nope, it's a software architecture issue. We'll restart it with the new
>>> arch when it's ready to go.
>> The simplest solution is just to kill the current dump job if you have
>> faith that a new architecture can be put in place in less than a year.
>
> We'll probably do that.
>
> -- brion

FWIW, I'll add my vote for aborting the current dump *now* if we don't expect
it ever to actually be finished, so we can at least get a fresh dump of the
current pages.

Russ
Re: [Wikitech-l] Enwiki dump crawling since 10/15/2008
Probably wise to poke in a hack to skip the history first. :)

-- brion vibber (brion @ wikimedia.org)

On Jan 28, 2009, at 7:34, "Russell Blau" wrote:
> "Brion Vibber" wrote in message
> news:497f9c35.9050...@wikimedia.org...
>> On 1/27/09 2:55 PM, Robert Rohde wrote:
>>> On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber wrote:
>>>> On 1/27/09 2:35 PM, Thomas Dalton wrote:
>>>>> The way I see it, what we need is to get a really powerful server
>>>> Nope, it's a software architecture issue. We'll restart it with the new
>>>> arch when it's ready to go.
>>> The simplest solution is just to kill the current dump job if you
>>> have faith that a new architecture can be put in place in less than a
>>> year.
>>
>> We'll probably do that.
>>
>> -- brion
>
> FWIW, I'll add my vote for aborting the current dump *now* if we don't
> expect it ever to actually be finished, so we can at least get a
> fresh dump of the current pages.
>
> Russ
[Wikitech-l] enable $wgAllowCopyUploads follow-up
Revising the $wgAllowCopyUploads request ... The thread ended here:
http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040942.html

Any updates on this, or ideas on how we could support client-initiated
importing of media assets over HTTP?

--michael
[Wikitech-l] Secure Server IPs?
On enwiki, the secure server (i.e. secure.wikimedia.org) is currently
written down as using:

66.230.192.0–66.230.239.255

It seems unlikely that the server really uses or needs such a large range.

In addition, we received a report that 66.230.230.230 is operating as
a TOR exit node. Since Wikipedia policy is to prohibit anon editing
and account creation from TOR nodes, it would be nice to clarify this.

Thanks.

-Robert Rohde
Re: [Wikitech-l] Make upload headings changeable
You're hitting on a core issue here, which is the lack of support for
multilingual projects. MediaWiki does not currently support this. Using hacks
such as uselang has helped hide the issue, but it's far from ideal. I would
venture that multilingual content could be handled with the user's language
setting, headers, or uselang param helping to select the appropriate content.
Until that happens, each project only has one content language. In cases like
the ones you mentioned, this happens to be English.

Let's suppose I use the French Wikipedia with an Arabic interface. I would
find it very odd that the content is not in French, even though I use Arabic
as my interface language.

On multilingual projects, it's OK to present in your user language. On
single-language projects it is not. Using uselang for content is an icky hack
anyway. Multilingual projects need to be supported in core, or we're just
going to perpetuate these hacks.

Basically, I figured we should support the majority of cases (single-language
projects) rather than the minority (multilingual projects). The former get
the benefit of the hack, the latter see no change.

-Chad

On Jan 27, 2009 4:08 PM, "Marcus Buck" wrote:

Chad wrote:
> Should be done with a wiki's content language as of r46372.
>
> -Chad

Thanks! That's already a big improvement, but why content language? As I
pointed out in response to your question, it needs to be user language on
Meta, Incubator, Wikispecies, Beta Wikiversity, old Wikisource, and all the
multilingual wikis of third-party users. It's not actually necessary on
non-multilingual wikis, but it does no harm either. So why content language?
This could be solved with a setting in LocalSettings.php, "isMultilingual",
but that's another affair, and as long as that does not exist, we should use
user language.

Marcus Buck
Re: [Wikitech-l] Enwiki dump crawling since 10/15/2008
That would be great. I second this notion wholeheartedly.

On Jan 28, 2009, at 7:34 AM, Russell Blau wrote:
> "Brion Vibber" wrote in message
> news:497f9c35.9050...@wikimedia.org...
>> On 1/27/09 2:55 PM, Robert Rohde wrote:
>>> On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber wrote:
>>>> On 1/27/09 2:35 PM, Thomas Dalton wrote:
>>>>> The way I see it, what we need is to get a really powerful server
>>>> Nope, it's a software architecture issue. We'll restart it with the new
>>>> arch when it's ready to go.
>>> The simplest solution is just to kill the current dump job if you
>>> have faith that a new architecture can be put in place in less than a
>>> year.
>>
>> We'll probably do that.
>>
>> -- brion
>
> FWIW, I'll add my vote for aborting the current dump *now* if we don't
> expect it ever to actually be finished, so we can at least get a
> fresh dump of the current pages.
>
> Russ
Re: [Wikitech-l] Make upload headings changeable
Chad wrote:
> You're hitting on a core issue here, which is the lack of
> support for multilingual projects. MediaWiki does not
> currently support this. Using hacks such as uselang has
> helped hide the issue, but it's far from ideal. I would
> venture that multilingual content could be handled
> with the user's language setting/headers/uselang
> param being helpful to show the appropriate content.
> Until that happens, each project only has one content
> language. In cases like the ones you mentioned, this
> happens to be English.

The facts are correct, but if you thereby imply that English should be
regarded as a valid output for non-English users of those projects, I don't
agree. That implication is wrong.

> Let's suppose I use the French
> Wikipedia with Arabic interface. I would find it very
> odd that the content is not in French, even though I
> use Arabic as my interface language.

The average user with a non-technical approach does not perceive a strict
distinction between "interface" (served by the PHP scripts) and "content"
(rendered from database content), especially on file description pages (file
history and file links, for example, appear as headings in just the same way
as the content headings). It wouldn't seem odd to me.

> On multilingual projects, it's ok to present in your user
> language. On single-language projects it is not. Using
> uselang for content is an icky hack anyway. Multilingual
> projects need to be supported in core, or we're just
> going to perpetuate these hacks.

The ways of achieving and accessing this may change in the future, but you
will never have a clear separation of "content" and localizable elements.
Multilang support can be as core as imaginable, but you will still have
localizable elements stored in "content" areas.

> Basically, I figured support the majority of cases (single
> language projects) rather than the minority (multi-
> language projects). The former get the benefit of the
> hack, the latter see no change.
>
> -Chad

Well, you could put it in other terms and the majority/minority thing
switches: content lang allows localization for monolang projects only,
whereas user lang allows it for _all_ projects. So content lang is the
minority.

Whether Arabic file description pages for users of the French Wikipedia who
prefer Arabic are a good or a bad thing is not decided and not even
decidable. There are some points for content lang, but no strong points.
There are some points for user lang, but no strong points either. If there
are equally good points for both solutions, this supports my interpretation
of the majority/minority relation. Your interpretation is based on the
assumption that content lang on monolang projects is _obviously_ a good
thing.

Marcus Buck
Re: [Wikitech-l] Secure Server IPs?
On 1/28/09 10:18 AM, Robert Rohde wrote:
> On enwiki, the secure server (i.e. secure.wikimedia.org) is currently
> written down as using: 66.230.192.0–66.230.239.255
>
> It seems unlikely that the server really uses or needs such a large range.

Indeed, this is wildly incorrect. :)

Our *old* public IP address space in Tampa was 66.230.200.0/24 -- this covers
the range from 66.230.200.1 to 66.230.200.255, much smaller than the range
listed by Pilotguy in 2007:

http://en.wikipedia.org/w/index.php?title=MediaWiki:Blockiptext&diff=131647237&oldid=126717109

Most likely, Pilotguy did a lookup and picked out the result for the parent IP
space, which would cover many other customers of our provider, rather than the
specific result for our network.

Further, this range is no longer being actively routed on the internet. Our
current public IP address space in Tampa is 208.80.152.0/22, which covers
208.80.152.0 to 208.80.155.255.

Note that while edits made through the secure server would list that server in
their proxy forwarding headers, which would be visible in CheckUser results,
it would not be visible as the final public IP unless there were a
misconfiguration of our proxy whitelist.

> In addition, we received a report that 66.230.230.230 is operating as
> a TOR exit node. Since Wikipedia policy is to prohibit anon editing
> and account creation from TOR nodes, it would be nice to clarify this.

This IP is not and never has been in our address range. It's probably in the
same building, though. :)

-- brion
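The range arithmetic above can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch: the constant and function names are made up for the example, and the addresses are the ones quoted in the thread. Note that the module counts a /24 from its network address, so it reports 66.230.200.0 rather than .1 as the first address.

```python
import ipaddress

OLD_TAMPA = ipaddress.ip_network("66.230.200.0/24")
CURRENT_TAMPA = ipaddress.ip_network("208.80.152.0/22")
SUSPECT = ipaddress.ip_address("66.230.230.230")

# The range written down on enwiki, expressed as the CIDR blocks covering it.
LISTED = list(ipaddress.summarize_address_range(
    ipaddress.ip_address("66.230.192.0"),
    ipaddress.ip_address("66.230.239.255"),
))


def span(net):
    """Return (first address, last address, address count) for a network."""
    return str(net[0]), str(net[-1]), net.num_addresses


listed_size = sum(net.num_addresses for net in LISTED)
in_listed = any(SUSPECT in net for net in LISTED)
in_wikimedia = SUSPECT in OLD_TAMPA or SUSPECT in CURRENT_TAMPA

print(span(OLD_TAMPA))
print(span(CURRENT_TAMPA))
print(listed_size // OLD_TAMPA.num_addresses)  # how many /24s the listed range spans
print(in_listed, in_wikimedia)
```

Under these inputs the reported Tor exit node falls inside the overly broad listed range but outside both the old /24 and the current /22, which is consistent with the explanation that the listed range belonged to the provider's parent allocation.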
Re: [Wikitech-l] enable $wgAllowCopyUploads follow-up
On 1/28/09 8:57 AM, Michael Dale wrote:
> Revising the $wgAllowCopyUploads request ... The thread ended here:
> http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040942.html
>
> Any updates on this;

I'm poking poor Mark about it. :)

(And dude, I'm sitting right next to you. ;)

> or ideas on how we could support client initiated
> importing of media assets over http?

I don't think that can be done without a browser plugin, and it would be
kinda crappy anyway -- the poor client would be using up both their upstream
and downstream bandwidth for the whole file.

-- brion
[Wikitech-l] Wikimedia IdeaTorrent?
If you haven't seen it yet, Ubuntu is running an interesting
brainstorming software called IdeaTorrent to think collectively about
common problems and solutions:

http://brainstorm.ubuntu.com/

The software:

http://www.ideatorrent.org/

I wonder - would people consider it useful to set up something like
brainstorm.wikimedia.org using this software, or would it be too
duplicative of BugZilla and listservs? The benefit of IdeaTorrent is
that it's very straightforward for non-technical users to contribute
ideas and solutions. And, of course, it could be used for
non-technical problems as well.

-- 
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
Re: [Wikitech-l] Crawling deWP
Daniel Kinzler wrote:
> Rolf Lampa wrote:
>> I'd love, however, to see the flagged rev status as an attribute in one
>> of the tags, for example
>>
>> Regards,
>
> Naw, it's more complex than that. You can have any number of different
> flags. It would probably have to be
> foobar
>
> -- daniel

It would be "", child of , just as
Re: [Wikitech-l] Crawling deWP
2009/1/28 Platonides :
> Daniel Kinzler wrote:
>> Rolf Lampa wrote:
>>> I'd love, however, to see the flagged rev status as an attribute in one
>>> of the tags, for example
>>>
>>> Regards,
>>
>> Naw, it's more complex than that. You can have any number of different
>> flags. It would probably have to be
>> foobar
>>
>> -- daniel
>
> It would be "", child of , just as

But, as daniel said, "flagged" isn't enough; you need to know which flag.
Re: [Wikitech-l] [Toolserver-l] Crawling deWP
2009/1/28 Daniel Kinzler :
> Marco Schuster wrote:
> ...
>>> But by then, i do hope we have revision flags in the dumps. because that
>>> would be The Right Thing to use.
>> Still, using the dumps would require me to get the full history dump
>> because I only want flagged revisions and not current revisions
>> without the flag.
>
> Including the latest revision which is flagged "good" would be an obvious
> feature that should be implemented along with including the revision flags.
> So the "current" dump would have 1-3 revisions per page.

The extension is highly customisable, so different projects will have
different flags available. Would you include the latest revision with each
flag? The latest revision with any flag? The latest revision with a
particular flag chosen for each project?
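The three selection policies asked about above can be compared with a small Python sketch. Everything here is hypothetical: the `Revision` type, the flag names "sighted" and "quality", and the sample history are assumptions for illustration, not the extension's data model or any dump format.

```python
from typing import NamedTuple


class Revision(NamedTuple):
    rev_id: int
    flags: frozenset  # names of the flags set on this revision (hypothetical)


# Hypothetical page history, oldest first; 103 is the current, unflagged revision.
HISTORY = [
    Revision(101, frozenset({"sighted", "quality"})),
    Revision(102, frozenset({"sighted"})),
    Revision(103, frozenset()),
]


def latest_per_flag(history):
    """Policy 1: the newest revision carrying each individual flag."""
    latest = {}
    for rev in history:
        for flag in rev.flags:
            if flag not in latest or rev.rev_id > latest[flag].rev_id:
                latest[flag] = rev
    return latest


def latest_with_any_flag(history):
    """Policy 2: the newest revision that carries at least one flag."""
    flagged = [rev for rev in history if rev.flags]
    return max(flagged, key=lambda rev: rev.rev_id, default=None)


def latest_with_flag(history, flag):
    """Policy 3: the newest revision carrying one project-chosen flag."""
    matching = [rev for rev in history if flag in rev.flags]
    return max(matching, key=lambda rev: rev.rev_id, default=None)


per_flag = {flag: rev.rev_id for flag, rev in latest_per_flag(HISTORY).items()}
any_flag = latest_with_any_flag(HISTORY).rev_id
chosen = latest_with_flag(HISTORY, "quality").rev_id
dumped = {HISTORY[-1].rev_id, *per_flag.values()}  # current revision plus policy 1
print(per_flag, any_flag, chosen, sorted(dumped))
```

With this sample history, policy 1 plus the current revision yields three distinct revisions for the page, while policies 2 and 3 each add only one, so the choice of policy directly affects how many revisions a "current" dump would carry per page.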
Re: [Wikitech-l] Wikimedia IdeaTorrent?
On 1/28/09 12:22 PM, Erik Moeller wrote:
> If you haven't seen it yet, Ubuntu is running an interesting
> brainstorming software called IdeaTorrent to think collectively about
> common problems and solutions:
>
> http://brainstorm.ubuntu.com/
>
> The software:
>
> http://www.ideatorrent.org/
>
> I wonder - would people consider it useful to set up something like
> brainstorm.wikimedia.org using this software, or would it be too
> duplicative of BugZilla and listservs? The benefit of IdeaTorrent is
> that it's very straightforward for non-technical users to contribute
> ideas and solutions. And, of course, it could be used for
> non-technical problems as well.

Taking a quick peek, it's giving me a warm fuzzy feeling. :)

Not sure how best to integrate things, but it's definitely worth
investigating.

-- brion