Re: [Foundation-l] dumps

2009-02-25 Thread Samuel Klein
Compression allowing random access is definitely the way to go for large selections. Ángel, that's an interesting reader you wrote. I cc: a list for offline wikireaders (most designed around mediawiki). A similar idea is in use by schools across Peru[1] to provide offline access to the Spanish W

Re: [Foundation-l] [Commons-l] dumps

2009-02-25 Thread Samuel Klein
Excellent -- following up offlist. SJ On Tue, Feb 24, 2009 at 12:45 PM, Brion Vibber wrote: > On 2/23/09 5:31 PM, Samuel Klein wrote: >> Copying the Commons list. >> >> I am interested in hosting (and running some scripts on) copies of the >> commons media dump on offline regional servers for o

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Andrew Gray
2009/2/25 Gerard Meijssen : > Hoi, > When the use case of the Simple Wikipedia is better understood, it may even > make room for more simple projects as in simple projects in the biggest > languages. This is quite an interesting thought. The language used by Simple English is (apparently) derived

Re: [Foundation-l] dumps

2009-02-25 Thread Ángel
Samuel Klein wrote: > Compression allowing random access is definitely the way to go for > large selections. > > Ángel, that's an interesting reader you wrote. I cc: a list for > offline wikireaders (most designed around mediawiki). Subscribed. > A similar idea > is in use by schools across Per

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Lars Aronsson
Andrew Gray wrote: > This is quite an interesting thought. The language used by > Simple English is (apparently) derived from two defined > "simplified versions" of English which were deliberately > designed - have there been projects to do the same for, say, > French or Spanish, or would we h

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Samuel Klein
i agree that there are many problems with a discussion or vote on one project impacting another. community participation and language / context barriers are one. having people discussing who themselves aren't editors or readers is another. privileging "having edited 100 articles in any one wikipe

[Foundation-l] Amazon Public Data includes Wikipedia

2009-02-25 Thread Nathan
http://www.nytimes.com/external/readwriteweb/2009/02/25/25readwriteweb-amazon_exposes_1_terrabyte_of.html According to this, a new project by Amazon that makes a terabyte of public data available includes a full dump of Wikipedia. It also includes the complete dbpedia - so it seems like there are

Re: [Foundation-l] Amazon Public Data includes Wikipedia

2009-02-25 Thread Thomas Dalton
2009/2/25 Nathan : > http://www.nytimes.com/external/readwriteweb/2009/02/25/25readwriteweb-amazon_exposes_1_terrabyte_of.html > > According to this, a new project by Amazon that makes a terabyte of public > data available includes a full dump of Wikipedia. It also includes the > complete dbpedia -

Re: [Foundation-l] Amazon Public Data includes Wikipedia

2009-02-25 Thread Samuel Klein
thread convergence! It didn't include wikipedia-proper when I looked yesterday, but this was suggested... On Tue, Feb 24, 2009 at 11:26 PM, Brian wrote: > Why not make the uncompressed dump available as an Amazon Public > Dataset? http://aws.amazon.com/publicdatasets/ On Wed, Feb 25, 2009 at 1

Re: [Foundation-l] dumps

2009-02-25 Thread Anthony
On Tue, Feb 24, 2009 at 11:26 PM, Brian wrote: > Why not make the uncompressed dump available as an Amazon Public > Dataset? http://aws.amazon.com/publicdatasets/ > Which uncompressed dump? The full history English Wikipedia dump doesn't exist, and there doesn't seem to be any demand for this a

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Jussi-Ville Heiskanen
Andrew Gray wrote: > 2009/2/25 Gerard Meijssen : > >> Hoi, >> When the use case of the Simple Wikipedia is better understood, it may even >> make room for more simple projects as in simple projects in the biggest >> languages. >> > > This is quite an interesting thought. The language used b

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Gerard Meijssen
No, Absolutely not. The Incubator is a vital resource that can easily accommodate any language, any project. If anything I would make the Incubator compulsory for ANY project. The reason for this is obvious; the Incubator works. Thanks. GerardM 2009/2/25 Jussi-Ville Heiskanen > Andrew Gray w

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Jussi-Ville Heiskanen
"No, Absolutely not." Eh? "No, Absolutely not." to what precisely? You say incubator should be a phase for projects? I said simple should incubate for even larger languages. Where is the "No, Absolutely not." directed at? Gerard Meijssen wrote: > No, > Absolutely not. The Incubator is a vital re

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Gerard Meijssen
Hoi, There is no room nor need for a simple Incubator. One suffices. Thanks. GerardM 2009/2/25 Jussi-Ville Heiskanen > "No, > Absolutely not." > > Eh? "No, Absolutely not." to what precisely? You say incubator should be a > phase for projects? I said simple should incubate for even larger >

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Chad
I could be wrong, but I think you're misreading the point here. It's that we should Incubate more Simples in more languages, not that we need a Simple Incubator... -Chad On Wed, Feb 25, 2009 at 12:59 PM, Gerard Meijssen wrote: > Hoi, > There is no room nor need for a simple Incubator. One suffi

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Jussi-Ville Heiskanen
Do you have a substantive opinion on the essence of my suggestion, that is that even large language simple projects should pass through the incubator? Yours, Jussi-Ville Heiskanen Gerard Meijssen wrote: > Hoi, > There is no room nor need for a simple Incubator. One suffices. > Thanks. > Ge

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Gerard Meijssen
Hoi, Possibly. Now for some cold water. At this moment the policy is explicit. We do not accept any new Simple projects in any language. What I said was that it would be good when there are some good numbers that prove the value of a simple project. Once this is more clear, we may reconsider. Thank

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Andrew Gray
2009/2/25 Gerard Meijssen : > No, > Absolutely not. The Incubator is a vital resource that can easily accommodate > any language, any project. If anything I would make the Incubator compulsory > for ANY project. The reason for this is obvious; the Incubator works. I think this is exactly what he was

Re: [Foundation-l] dumps

2009-02-25 Thread Brian
What has led you to believe there is no demand for a full dump of the english wikipedia? On Wed, Feb 25, 2009 at 9:26 AM, Anthony wrote: > On Tue, Feb 24, 2009 at 11:26 PM, Brian wrote: > >> Why not make the uncompressed dump available as an Amazon Public >> Dataset? http://aws.amazon.com/public

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Chad
Not cold water for me. I must confess I've never been a fan of Simple. -Chad On Wed, Feb 25, 2009 at 1:19 PM, Gerard Meijssen wrote: > Hoi, > Possibly. > Now for some cold water. At this moment the policy is explicit. We do not > accept any new Simple projects in any language. What I said was th

Re: [Foundation-l] dumps

2009-02-25 Thread Thomas Dalton
2009/2/25 Brian : > What has led you to believe there is no demand for a full dump of the > english wikipedia? He didn't say there was no demand, he said there was no demand for having it on Amazon.

[Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread John at Darkstar
Our "national lexicon" here in Norway, Store Norske Leksikon, went online with its new free edition today. The new edition has user contributed articles. The chief editor says some of the reason for the new edition is the harsh competition from Wikipedia, especially no.wikipedia.org which outnumber

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread Finn Rindahl
Since our (WMF) aim is to provide free knowledge, I would say that SNL making a free online edition is a proof of our success more than a new "competitor". They have a lot to learn from us, we have a lot to learn from them. And whoever is seeking free knowledge in Norwegian on the web will have mor

Re: [Foundation-l] dumps

2009-02-25 Thread Felipe Ortega
--- On Wed, 25/2/09, Anthony wrote: > From: Anthony > Subject: Re: [Foundation-l] dumps > To: "Wikimedia Foundation Mailing List" > Date: Wednesday, 25 February, 2009 5:26 > On Tue, Feb 24, 2009 at 11:26 PM, Brian > wrote: > > Which uncompressed dump? The full history English > Wikipedi

Re: [Foundation-l] dumps

2009-02-25 Thread Brian
Ahh ok. Anyone who wants to do processing on the full history (and there are a lot of these people who exist!) by definition *has* to be willing to throw some money at it. It simply doesn't fit on commercial drives. In fact, it would hardly fit on either of the two raid clusters I have access to. M

Re: [Foundation-l] dumps

2009-02-25 Thread Thomas Dalton
2009/2/25 Brian : > Ahh ok. Anyone who wants to do processing on the full history (and there are > a lot of these people who exist!) by definition *has* to be willing to throw > some money at it. It simply doesn't fit on commercial drives. In fact, it > would hardly fit on either of the two raid cl

Re: [Foundation-l] dumps

2009-02-25 Thread Felipe Ortega
--- On Thu, 26/2/09, Brian wrote: > From: Brian > Subject: Re: [Foundation-l] dumps > To: "Wikimedia Foundation Mailing List" > Date: Thursday, 26 February, 2009 12:33 > Ahh ok. Anyone who wants to do processing on the full > history (and there are > a lot of these people who exist!) by defi

Re: [Foundation-l] dumps

2009-02-25 Thread Brian
One of the academics I am speaking of wrote the textbook on natural language processing. He has a 3TB raid cluster. Of course, for about a thousand dollars you can create a bigger raid cluster than that using the new 2TB drives, but funding comes and goes. Our 26 node cluster has a 26 20GB drives i

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread Ian A. Holton
But is it free as in free beer or freedom? --Ian [[User:Poeloq]] On Thu, Feb 26, 2009 at 12:04 AM, Finn Rindahl wrote: > Since our (WMF) aim is to provide free knowledge, I would say that SNL > making a free online edition is a proof of our success more than a new > "competitor". They have a lot

Re: [Foundation-l] dumps

2009-02-25 Thread Brian
I went ahead and submitted it to Amazon. I'll leave the file up for a week or so if anyone else wants it (18GB): http://mist.colorado.edu/enwiki-20080103-pages-meta-history.xml.7z Just to emphasize my point - I have never been able to unpack this file. I've got no place to put it!! On Wed, Feb 2

Re: [Foundation-l] dumps

2009-02-25 Thread Chad
I can't even ping the host. Typo? -Chad On Wed, Feb 25, 2009 at 7:24 PM, Brian wrote: > I went ahead and submitted it to Amazon. I'll leave the file up for a week > or so if anyone else wants it (18GB): > > http://mist.colorado.edu/enwiki-20080103-pages-meta-history.xml.7z > > Just to emphasize

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread John at Darkstar
There is no formal license, so I would say "free beer" for now. John Ian A. Holton wrote: > But is it free as in free beer or freedom? > > --Ian > [[User:Poeloq]] > > On Thu, Feb 26, 2009 at 12:04 AM, Finn Rindahl wrote: > >> Since our (WMF) aim is to provide free knowledge, I would say that

Re: [Foundation-l] dumps

2009-02-25 Thread Delirium
Brian wrote: > Ahh ok. Anyone who wants to do processing on the full history (and there are > a lot of these people who exist!) by definition *has* to be willing to throw > some money at it. It simply doesn't fit on commercial drives. I've personally never found much of a compelling reason to actua

Re: [Foundation-l] dumps

2009-02-25 Thread Brian
Yep a typo, here is the right link: http://grey.colorado.edu/enwiki-20080103-pages-meta-history.xml.7z On Wed, Feb 25, 2009 at 5:35 PM, Chad wrote: > I can't even ping the host. Typo? > -Chad > > On Wed, Feb 25, 2009 at 7:24 PM, Brian wrote: > > > I went ahead and submitted it to Amazon. I'll

Re: [Foundation-l] dumps

2009-02-25 Thread Mark Wagner
On Wed, Feb 25, 2009 at 15:48, Brian wrote: > One of the academics I am speaking of wrote the textbook on natural language > processing. He has a 3TB raid cluster. Of course, for about a thousand > dollars you can create a bigger raid cluster than that using the new 2TB > drives, but funding comes

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Austin Hair
On Wed, Feb 25, 2009 at 5:34 AM, Samuel Klein wrote: > For the record, lots of people who use simple: are devs or researchers > who need a good small simple testbed, or people who only intend to > read and use in contexts away from the original editable wiki. I > would bet, though with lower odds

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread Jon Harald Søby
Their license text indicates that they are aiming to be free as in freedom, but they do not have a proper license as of yet (all it says is that "you can use our stuff in the same way as with Wikipedia's stuff", and a bunch of articles are marked with "free license", but without specifying that any

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Geoffrey Plourde
I didn't know the language committee was empowered to decide on whether or not Simples were made. I thought your job was to determine valid languages. I absolutely cannot support the continued existence of this body due to these unknown powers and will make my voice known the next time someone o

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Jon Harald Søby
We are not, and you're misinterpreting Gerard's post; what he says is that we do not allow any more simple projects; deciding over existing projects is not something we do, and not something we even *want* to do. 2009/2/26 Geoffrey Plourde > I didn't know the language committee was empowered to

Re: [Foundation-l] Simple English Encyclopedia

2009-02-25 Thread Gerard Meijssen
Hoi, The language committee is empowered to decide on all new projects. It has been this way since its start. Nothing new here. Thanks, GerardM 2009/2/26 Geoffrey Plourde > I didn't know the language committee was empowered to decide on whether or > not Simples were made. I thought your jo

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread John at Darkstar
The release has been given a lot of press coverage, and some comparisons between the encyclopedias have been made. Two of them, in Dagbladet[1] and Dagsavisen[2], have concluded that Wikipedia is best. According to Aftenposten the new edition will cost Kunskapsforlaget and their owners Aschehoug og G

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread Andre Engels
On Thu, Feb 26, 2009 at 8:17 AM, John at Darkstar wrote: > The release has been given a lot of press coverage, and some comparisons > between the encyclopedias have been made. Two of them, in Dagbladet[1] > and Dagsavisen[2], have concluded that Wikipedia is best. According to > Aftenposten the new

Re: [Foundation-l] Free edition of Norways national encyklopedia Store Norske Leksikon

2009-02-25 Thread Andre Engels
Free as in beer, of course, but still, that's the main part of our mission, or at least what I see as our mission: As I see it, our mission is to ensure that the _knowledge and information_ are _available_ to everyone. For that, free as in beer is the important step. On Thu, Feb 26, 2009 a