Andreas Rindler wrote:
> Hi,
> we are trying to extract all URLs in wiki articles from our Mediawiki
> installation. We have tried Grep, Perl and Sed on mysql dumps, but it
> is very difficult to get the URLs only, without some
> garbage/text/comments before or after them.
>
> Does anyone know of a better way to achieve this?
>

Use the externallinks table; it holds every external URL that appears in your wiki pages. If the externallinks table is empty or incomplete, you can rebuild it with a maintenance script (I don't remember the name offhand).
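A rough sketch of what querying that table could look like, in Python. It assumes the older schema, where externallinks has an el_from column (the page ID) and an el_to column (the full URL), joined against page.page_id. Newer MediaWiki releases changed these columns, so check the schema for your version first. The in-memory SQLite database and the helper names (build_demo_db, fetch_external_links) are stand-ins invented for illustration; on a real wiki you would run the same SELECT against your MySQL database, adding your table prefix if you use one. The rebuild script mentioned above is probably refreshLinks.php in the maintenance/ directory, though that is worth confirming for your installation.

```python
import sqlite3

# Query against the (assumed) older MediaWiki schema:
# externallinks.el_from -> page.page_id, externallinks.el_to -> full URL.
URL_QUERY = """
SELECT p.page_title, el.el_to
FROM externallinks AS el
JOIN page AS p ON p.page_id = el.el_from
ORDER BY p.page_title, el.el_to
"""


def fetch_external_links(conn):
    """Return (page_title, url) pairs, one per external link."""
    return conn.execute(URL_QUERY).fetchall()


def build_demo_db():
    """Create a tiny in-memory stand-in for the wiki database (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE externallinks (el_from INTEGER, el_to TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?)",
        [(1, "Main_Page"), (2, "Help")],
    )
    conn.executemany(
        "INSERT INTO externallinks VALUES (?, ?)",
        [
            (1, "https://www.mediawiki.org/"),
            (1, "https://example.org/a"),
            (2, "https://example.org/b"),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for title, url in fetch_external_links(demo):
        print(title, url)
```

Because the table stores one clean URL per row, there is no surrounding wikitext to strip, which is the problem the grep/sed approach on dumps runs into.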
Roan Kattouw (Catrope)