Re: [Wikitech-l] How to find the version of a dump

2010-12-17 Thread Monica shu
Finally Thank you all a lot Monica On Thu, Dec 16, 2010 at 11:50 PM, emijrp wrote: > Hi Monica; > > You dump is this one, with date 2010-03-12:[1][2] > > a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2 > > There are some old English Wikipedia dumps and

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread Ariel T. Glenn
I have no idea about the 2006 one; the other ones I know to be incomplete one way or another. Working with the Jan and March 2010 run, in conjunction with the earlier dumps, you can get complete info, see http://techblog.wikimedia.org/2010/05/ In addition the September 2010 run http://dumps.wikim

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread emijrp
All? The 2006 one too? 2010/12/16 Ariel T. Glenn > The dumps in the archive are there because they are incomplete, by the > way. > > Ariel > > Στις 16-12-2010, ημέρα Πεμ, και ώρα 16:50 +0100, ο/η emijrp έγραψε: > > Hi Monica; > > > > You dump is this one, with date 2010-03-12:[1][2] > > > > a3a5

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread Ariel T. Glenn
The dumps in the archive are there because they are incomplete, by the way. Ariel Στις 16-12-2010, ημέρα Πεμ, και ώρα 16:50 +0100, ο/η emijrp έγραψε: > Hi Monica; > > You dump is this one, with date 2010-03-12:[1][2] > > a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2 >

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread emijrp
Hi Monica; Your dump is this one, with date 2010-03-12:[1][2] a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2 There are some old English Wikipedia dumps and md5sum files in a directory called "archive"[3]. Regards, emijrp [1] http://download.wikimedia.org/archive/enwiki
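A minimal sketch of the lookup described in this message, assuming the md5sum files in the archive directory use the usual `md5sum` output layout (checksum, whitespace, filename). The helper names `parse_md5sums` and `dump_date` are hypothetical and only illustrate the idea; the checksum and filename in the usage note are the ones quoted above.

```python
import re


def parse_md5sums(text):
    """Map each checksum in md5sum-style text to its filename.

    Assumes one "<checksum> <filename>" pair per line; lines that do not
    split into exactly two fields are skipped.
    """
    table = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        checksum, name = parts
        # md5sum marks binary-mode entries with a leading "*" on the name.
        table[checksum.lower()] = name.strip().lstrip("*")
    return table


def dump_date(filename):
    """Extract the YYYYMMDD stamp from a dump filename as YYYY-MM-DD.

    Returns None for names without a date stamp, such as "enwiki-latest-...".
    """
    match = re.search(r"-(\d{4})(\d{2})(\d{2})-", filename)
    return "-".join(match.groups()) if match else None
```

Looking up `a3a5ee062abc16a79d111273d4a1a99a` in a table built from a line naming `enwiki-20100312-pages-articles.xml.bz2` and passing the result to `dump_date` gives `2010-03-12`, the date reported in this message.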

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread emijrp
Hi James; download.wikimedia.org is available again, so, you can download that file from http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2 (6.2 GB). Regards, emijrp 2010/12/14 James Linden > On Mon, Dec 13, 2010 at 7:09 PM, Michael Gurlitz > wrote: > > I grabbe

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread Monica shu
Totally agree! And also I think an info page listing all past versions will also be helpful:) Monica On Tue, Dec 14, 2010 at 5:11 PM, Andrew Dunbar wrote: > On 14 December 2010 20:04, Andrew Dunbar wrote: > > On 14 December 2010 01:57, Monica shu wrote: > >> Thanks Diederik and Waksman, > >>

Re: [Wikitech-l] How to find the version of a dump

2010-12-16 Thread Monica shu
Sorry Andrew, I just notice this reply Can you give me the url of this search page? Thanks! Shu On Tue, Dec 14, 2010 at 5:04 PM, Andrew Dunbar wrote: > On 14 December 2010 01:57, Monica shu wrote: > > Thanks Diederik and Waksman, > > > > It seems that I need to do parse the dump for articl

Re: [Wikitech-l] How to find the version of a dump

2010-12-14 Thread Platonides
Monica shu wrote: > Hi emijrp, > > Here is my dump's info: > > *enwiki-latest-pages-articles.xml.bz2 * > *a3a5ee062abc16a79d111273d4a1a99a* > > Thanks~ I can't find such md5 on any dump. Here are the md5s of the latest enwiki pages-articles: a9506e8aedd3b830e059b7c8a3c0dbcd enwiki-2010090

Re: [Wikitech-l] How to find the version of a dump

2010-12-14 Thread James Linden
On Mon, Dec 13, 2010 at 7:09 PM, Michael Gurlitz wrote: > I grabbed the following files in the days before the server broke, and > I can set up a torrent file if anyone's interested, or I could FTP > them to a server. 2010-10-11 was the last full Wikipedia dump that was > completed. > 6652983189 (

Re: [Wikitech-l] How to find the version of a dump

2010-12-14 Thread Andrew Dunbar
On 14 December 2010 20:04, Andrew Dunbar wrote: > On 14 December 2010 01:57, Monica shu wrote: >> Thanks Diederik and Waksman, >> >> It seems that I need to do parse the dump for article data to get this piece >> of information... >> Yes, this will be the last choice, but I think there maybe some

Re: [Wikitech-l] How to find the version of a dump

2010-12-14 Thread Andrew Dunbar
On 14 December 2010 01:57, Monica shu wrote: > Thanks Diederik and Waksman, > > It seems that I need to do parse the dump for article data to get this piece > of information... > Yes, this will be the last choice, but I think there maybe some easier > way... > > I just got home and checked the dum

Re: [Wikitech-l] How to find the version of a dump

2010-12-13 Thread Monica shu
Hi emijrp, Here is my dump's info: *enwiki-latest-pages-articles.xml.bz2 * *a3a5ee062abc16a79d111273d4a1a99a* Thanks~ On Mon, Dec 13, 2010 at 10:00 PM, emijrp wrote: > Hi; > > It would be better if you can give us the md5sum of the file. If you are on > Linux, use the command "md5sum file

Re: [Wikitech-l] How to find the version of a dump

2010-12-13 Thread Michael Gurlitz
I grabbed the following files in the days before the server broke, and I can set up a torrent file if anyone's interested, or I could FTP them to a server. 2010-10-11 was the last full Wikipedia dump that was completed. 6652983189 (6.2GB) enwiki-20101011-pages-articles.xml.bz2 12823734687 (12GB) en

Re: [Wikitech-l] How to find the version of a dump

2010-12-13 Thread Monica shu
Thanks Diederik and Waksman, It seems that I need to parse the dump for article data to get this piece of information... Yes, this will be the last choice, but I think there may be some easier way... I just got home and checked the dump I've downloaded. It was downloaded on June 10, 2010, the si

Re: [Wikitech-l] How to find the version of a dump

2010-12-13 Thread emijrp
Hi; It would be better if you can give us the md5sum of the file. If you are on Linux, use the command "md5sum filename" (you have to install it with apt-get). If you are on Windows search for a tutorial. Also, the file size and the project language and family (wikipedia, wiktionary...) would be
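For readers without the `md5sum` command at hand, the same checksum can be computed with Python's standard `hashlib` module. This is only a sketch: the function name `md5_of_file` and the 1 MiB chunk size are illustrative choices, not something from this thread. Reading in chunks keeps memory use flat even for a multi-gigabyte dump.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string has the same form as the first column of `md5sum` output, so it can be compared directly against a published checksum list.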

Re: [Wikitech-l] How to find the version of a dump

2010-12-12 Thread Shaun Waksman
Hi Monica, The file sizes of the EN pages dumps that are available today are: 5204823166 enwiki-20100312-pages-articles.xml.7z 5983814213 enwiki-20100130-pages-articles.xml.bz2 Note that the former is in 7z and the latter is in bz2. Does this help? Shaun On Mon, Dec 13, 2010 at 8:45 AM, Moni

Re: [Wikitech-l] How to find the version of a dump

2010-12-12 Thread Diederik van Liere
Hi Monica, I don't think there is such a place, what you could do is parse the file and look for the date of the most recent edit. That will give you a fairly accurate estimate of the date that the dump was generated. Best, Diederik On 2010-12-12, at 10:45 PM, Monica shu wrote: > Hi all, >
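The suggestion above could be sketched roughly as follows, assuming the dump is a bz2-compressed MediaWiki XML export whose revisions carry `<timestamp>` elements in ISO 8601 UTC form (for example `2010-03-11T23:59:00Z`). The function names are hypothetical, and scanning a full enwiki dump this way would take a long time.

```python
import bz2
import xml.etree.ElementTree as ET


def latest_edit_timestamp(stream):
    """Return the greatest <timestamp> text in an XML export stream, or None."""
    latest = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Export tags are namespaced ("{namespace-uri}timestamp"); keep the local name.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "timestamp" and elem.text:
            # Timestamps in one fixed ISO 8601 UTC format sort correctly as strings.
            if latest is None or elem.text > latest:
                latest = elem.text
        elif tag == "page":
            # Drop the finished page's children so memory stays bounded.
            elem.clear()
    return latest


def latest_edit_in_dump(path):
    """Stream-decompress a .bz2 dump and return its most recent edit timestamp."""
    with bz2.open(path, "rb") as stream:
        return latest_edit_timestamp(stream)
```

The result is only an estimate of the generation date: a dump run cannot contain edits made after it started, so the newest timestamp should fall shortly before the run date.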

[Wikitech-l] How to find the version of a dump

2010-12-12 Thread Monica shu
Hi all, I downloaded a dump several months ago. I accidentally lost the version info of this dump, so I don't know when it was generated. Is there any place that lists info about past dumps (such as size...)? Thanks! Monica