Mohamed Magdy wrote:
> I don't remember if I already mentioned this: you can split the xml
> file* into smaller pieces then import it using importDump.php.
>
> Use a loop to make a file like this and then run it:
> #!/bin/bash
> php maintenance/importDump.php < /path/pagexml.1
> wait
> php maintenance/importDump.php < /path/pagexml.2
> ...
>
> I haven't tried to start many php importDump.php processes working on
> different xml files simultaneously, will it work?
>
> * = http://blog.prashanthellina.com/2007/10/17/ways-to-process-and-use-wikipedia-dumps/
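[For reference, the splitting step could look something like the sketch below. This is my own illustration, not the script from the linked blog post: an awk one-liner that cuts the dump only at </page> boundaries, copies the <mediawiki>/<siteinfo> header into every chunk, and closes each chunk with its own </mediawiki>, so that each piece is a well-formed file for importDump.php. The file names (pages.xml, pagexml.N) and the pages-per-chunk count are assumptions for the demo.]

```shell
#!/bin/bash
# Hypothetical sketch: split a MediaWiki XML dump at </page>
# boundaries into chunks of (here) 2 pages each, so every chunk
# is itself a well-formed <mediawiki> document.

# Tiny sample dump, just so the demo is self-contained; a real
# dump would be the downloaded pages.xml.
cat > pages.xml <<'EOF'
<mediawiki>
  <siteinfo><sitename>Demo</sitename></siteinfo>
  <page><title>A</title></page>
  <page><title>B</title></page>
  <page><title>C</title></page>
</mediawiki>
EOF

awk -v per=2 '
    !body && /<page>/  { body = 1 }            # header ends at first <page>
    !body              { hdr = hdr $0 "\n"; next }  # collect header lines
    /<\/mediawiki>/    { next }                # drop the dump closer; we add our own
    {
        if (out == "") {                       # start a new chunk file
            out = "pagexml." ++n
            printf "%s", hdr > out             # replay the header into it
        }
        print > out
    }
    /<\/page>/ && (++c % per == 0) {           # chunk is full: close it
        print "</mediawiki>" > out
        close(out)
        out = ""
    }
    END { if (out != "") print "</mediawiki>" > out }  # close a partial last chunk
' pages.xml
```

With the 3-page sample above this writes pagexml.1 (pages A and B) and pagexml.2 (page C), each carrying the header and a closing </mediawiki>.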
Thanks Mohamed – this is a good suggestion, but I am a bit wary of trying
it, because if I run into problems later, I would not be sure whether they
were caused by using this script to split the XML files. I understand that
the script looks OK, in that it simply splits the XML files at the
“</page>” boundaries – but I don’t know much about how this would affect
the final result.

Thanks again,
O. O.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l