Date: Tue, 16 Feb 2010 09:34:41 -0800
From: Brion Vibber <br...@pobox.com>
Subject: Re: [Wikitech-l] [mwdumper] new maintainer?
To: wikitech-l@lists.wikimedia.org
Message-ID: <hlekvf$nl...@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2/16/10 7:03 AM, Jamie Morken wrote:
> Ok, the simple question: how many people prefer XML or sql dumps?

I think we have a FAQ on this...

http://meta.wikimedia.org/wiki/Download#What_happened_to_the_SQL_dumps.3F


You *do* realize that such "SQL dumps" would have to be invented from 
whole cloth and couldn't just be dumped from the actual databases, right?

The raw databases include dozens of alternate clusters and have data 
from different revisions compressed together, including deleted items 
and private data, and can't simply be released by WMF even if someone 
actually wanted to figure out how to replicate Wikimedia's exact storage 
cluster layout to do a data import.

Most likely if they were created they'd simply be created by running the 
xml through a tool like mwdumper...
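
As a rough illustration of what "running the xml through a tool" involves, here is a minimal Python sketch. It is not mwdumper's actual code: the pages(title, body) table and the function names are made up for illustration, and the real MediaWiki schema spreads this data across the page, revision and text tables. It streams the XML so that memory use stays flat, and emits one INSERT per page.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def sql_quote(value):
    """Quote a string as a SQL literal by doubling single quotes.

    Illustrative only; a real tool must follow the target database's
    full escaping rules.
    """
    return "'" + value.replace("'", "''") + "'"


def xml_to_sql(stream):
    """Yield one INSERT per <page> for a hypothetical pages(title, body) table."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                # With several revisions per page, the last one wins here.
                text = child.text or ""
        yield "INSERT INTO pages (title, body) VALUES (%s, %s);" % (
            sql_quote(title),
            sql_quote(text),
        )
        elem.clear()  # release the parsed page; real dumps are many gigabytes


# A tiny made-up sample in the general shape of a MediaWiki XML export.
SAMPLE = b"""<mediawiki>
  <page>
    <title>Example</title>
    <revision><text>It's a test.</text></revision>
  </page>
</mediawiki>"""

statements = list(xml_to_sql(io.BytesIO(SAMPLE)))
for statement in statements:
    print(statement)
```

The point of the sketch is only that the SQL is derived from the public XML, not read out of the production databases.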

-- brion



Hi Brion,

I have not tried mwdumper yet.  I have been looking at the various XML-to-SQL 
conversion tools and reading about people's experiences with them, and I will 
give it a try to see for myself, but recreating an SQL database from XML seems 
like an overly complex task to me.  Also, when the Wikimedia dumps were in SQL 
format I think there were fewer dump problems than there are now, although 
maybe the main issue is the growth of the file sizes.  I bet it is simpler to 
make an SQL dump than an XML dump.  To produce SQL dumps directly from the 
Wikimedia databases, I think the process would be to do SQL database merges 
and then make sure the private data is erased?  That might be simpler than 
converting to XML and then using mwdumper to get back to SQL.  Also, there is 
a bottleneck somewhere in the dump system (dumps failing, etc.); maybe it is 
the XML part?  I will get back to you after I try mwdumper and/or:

php importDump.php <17gigabytefail> :)

cheers,
Jamie


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
