I'd like to mirror just the category structure of the English Wikipedia, and
I'm wondering which of the dump files I need to start with.

I don't need the page content, just the page names, and only for the
current revision.  I need the categories and their members, and I'd like
to exclude hidden categories.  I also need to distinguish redirects, because
I don't want to treat them as separate pages.  Where possible I'd like
to work with the SQL files, but I can crunch through XML if necessary.

So which files do I need to download?  I may also need some help
understanding the schemas.

Thanks,

Robert

_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l