Hi, I'm a student planning on doing GSoC this year on MediaWiki.
Specifically, I'd like to work on data dumps.

I'm writing to gauge what would be useful to the research
community. Ideas floated so far include:
1. JSON Dumps
2. SQLite Dumps
3. Daily dumps of revisions in last 24 hours
4. Dumps optimized for very fast import into various external storage
systems, and for smaller size (diffs)
5. JSON/CSV for Special:Import and Special:Export

Would any of these be useful? Or is there anything I'm missing
that you would consider much more useful?

Feedback would be invaluable :)

Thanks :)
-- 
Yuvi Panda T
http://yuvi.in/blog

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l