Finally
Thank you all very much!
Monica
On Thu, Dec 16, 2010 at 11:50 PM, emijrp wrote:
> Hi Monica;
>
> Your dump is this one, with date 2010-03-12:[1][2]
>
> a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2
>
> There are some old English Wikipedia dumps and
I have no idea about the 2006 one; the other ones I know to be
incomplete one way or another. Working with the Jan and March 2010 runs,
in conjunction with the earlier dumps, you can get complete info, see
http://techblog.wikimedia.org/2010/05/
In addition, the September 2010 run
http://dumps.wikim
All? The 2006 one too?
2010/12/16 Ariel T. Glenn
> The dumps in the archive are there because they are incomplete, by the
> way.
>
> Ariel
>
> On Thu, 16-12-2010, at 16:50 +0100, emijrp wrote:
> > Hi Monica;
> >
> > Your dump is this one, with date 2010-03-12:[1][2]
> >
> > a3a5
The dumps in the archive are there because they are incomplete, by the
way.
Ariel
On Thu, 16-12-2010, at 16:50 +0100, emijrp wrote:
> Hi Monica;
>
> Your dump is this one, with date 2010-03-12:[1][2]
>
> a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2
>
Hi Monica;
Your dump is this one, with date 2010-03-12:[1][2]
a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2
There are some old English Wikipedia dumps and md5sum files in a directory
called "archive"[3].
Regards,
emijrp
[1]
http://download.wikimedia.org/archive/enwiki
Hi James;
download.wikimedia.org is available again, so you can download that file
from
http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2
(6.2 GB).
Regards,
emijrp
2010/12/14 James Linden
> On Mon, Dec 13, 2010 at 7:09 PM, Michael Gurlitz
> wrote:
> > I grabbe
Totally agree!
An info page listing all past versions would also be helpful :)
Monica
On Tue, Dec 14, 2010 at 5:11 PM, Andrew Dunbar wrote:
> On 14 December 2010 20:04, Andrew Dunbar wrote:
> > On 14 December 2010 01:57, Monica shu wrote:
> >> Thanks Diederik and Waksman,
> >>
Sorry Andrew, I just noticed this reply.
Can you give me the URL of that search page?
Thanks!
Shu
On Tue, Dec 14, 2010 at 5:04 PM, Andrew Dunbar wrote:
> On 14 December 2010 01:57, Monica shu wrote:
> > Thanks Diederik and Waksman,
> >
> > It seems that I need to parse the dump for articl
Monica shu wrote:
> Hi emijrp,
>
> Here is my dump's info:
>
> *enwiki-latest-pages-articles.xml.bz2 *
> *a3a5ee062abc16a79d111273d4a1a99a*
>
> Thanks~
I can't find that md5 for any dump.
Here are the md5s of the latest enwiki pages-articles:
a9506e8aedd3b830e059b7c8a3c0dbcd enwiki-2010090
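The md5sums listings published alongside the dumps use the standard `md5sum` output format of `hash  filename` lines. A small sketch for checking which dump (if any) a computed digest belongs to; the function name `find_dump_by_md5` is my own, not part of any dump tooling:

```python
def find_dump_by_md5(md5sums_text, digest):
    """Search md5sum-format lines ("<hash>  <filename>") for a digest
    and return the matching filename, or None if no line matches."""
    for line in md5sums_text.splitlines():
        parts = line.split()
        # A valid md5sum line has exactly two fields: hash, then filename.
        if len(parts) == 2 and parts[0].lower() == digest.lower():
            return parts[1]
    return None
```

Feeding it the downloaded md5sums file for each dump directory would quickly confirm (or rule out) a match for Monica's hash.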
On Mon, Dec 13, 2010 at 7:09 PM, Michael Gurlitz
wrote:
> I grabbed the following files in the days before the server broke, and
> I can set up a torrent file if anyone's interested, or I could FTP
> them to a server. 2010-10-11 was the last full Wikipedia dump that was
> completed.
> 6652983189 (
On 14 December 2010 20:04, Andrew Dunbar wrote:
> On 14 December 2010 01:57, Monica shu wrote:
>> Thanks Diederik and Waksman,
>>
>> It seems that I need to parse the dump for article data to get this piece
>> of information...
>> Yes, this will be the last resort, but I think there may be some
On 14 December 2010 01:57, Monica shu wrote:
> Thanks Diederik and Waksman,
>
> It seems that I need to parse the dump for article data to get this piece
> of information...
> Yes, this will be the last resort, but I think there may be some easier
> way...
>
> I just got home and checked the dum
Hi emijrp,
Here is my dump's info:
*enwiki-latest-pages-articles.xml.bz2 *
*a3a5ee062abc16a79d111273d4a1a99a*
Thanks~
On Mon, Dec 13, 2010 at 10:00 PM, emijrp wrote:
> Hi;
>
> It would be better if you can give us the md5sum of the file. If you are on
> Linux, use the command "md5sum file
I grabbed the following files in the days before the server broke, and
I can set up a torrent file if anyone's interested, or I could FTP
them to a server. 2010-10-11 was the last full Wikipedia dump that was
completed.
6652983189 (6.2GB) enwiki-20101011-pages-articles.xml.bz2
12823734687 (12GB) en
Thanks Diederik and Waksman,
It seems that I need to parse the dump for article data to get this piece
of information...
Yes, this will be the last resort, but I think there may be some easier
way...
I just got home and checked the dump I've downloaded.
It was downloaded on June 10, 2010; the si
Hi;
It would be better if you could give us the md5sum of the file. If you are on
Linux, use the command "md5sum filename" (it is part of coreutils; install it
with apt-get if missing). If you are on Windows, search for a tutorial.
Also, the file size and the project language and family (wikipedia,
wiktionary...) would be
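For Windows users (or anyone without `md5sum`), a cross-platform sketch in Python does the same job; it reads in chunks so a multi-gigabyte dump never has to fit in memory:

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading 1 MB at a time
    so that multi-gigabyte dump files don't need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string can be compared directly against the hashes in the published md5sums files.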
Hi Monica,
The file sizes of the EN pages dumps that are available today are:
5204823166 enwiki-20100312-pages-articles.xml.7z
5983814213 enwiki-20100130-pages-articles.xml.bz2
Note that the former is in 7z and the latter is in bz2.
Does this help?
Shaun
On Mon, Dec 13, 2010 at 8:45 AM, Moni
Hi Monica,
I don't think there is such a place; what you could do is parse the file and
look for the date of the most recent edit. That will give you a fairly accurate
estimate of the date that the dump was generated.
Best,
Diederik
On 2010-12-12, at 10:45 PM, Monica shu wrote:
> Hi all,
>
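Diederik's suggestion above can be sketched as a streaming scan: MediaWiki XML exports record each revision's `<timestamp>` as an ISO 8601 string, which sorts chronologically, so the largest one seen approximates the dump date. This is a minimal sketch assuming that export format; decompress the bz2 on the fly rather than unpacking first:

```python
import xml.etree.ElementTree as ET

def latest_timestamp(stream):
    """Stream a MediaWiki XML export and return the largest
    <timestamp> text seen (ISO 8601 strings sort chronologically)."""
    latest = None
    for _, elem in ET.iterparse(stream, events=("end",)):
        # Export tags may carry an XML namespace prefix; match the suffix.
        if elem.tag.endswith("timestamp"):
            if latest is None or elem.text > latest:
                latest = elem.text
        elem.clear()  # discard processed elements to keep memory flat
    return latest
```

Usage against a dump would be along the lines of `latest_timestamp(bz2.open("enwiki-latest-pages-articles.xml.bz2", "rt"))`, though a full scan of a 6 GB archive will take a while.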
Hi all,
I downloaded a dump several months ago.
I accidentally lost the version info of this dump, so I don't know when
it was generated.
Is there any place that lists info about past dumps (such as size...)?
Thanks!
Monica