Hi Paul,

You can dump the CrawlDb as CSV with

 $NUTCH_HOME/bin/nutch readdb my_crawl/crawldb/ -dump dump_crawldb/ -format csv

Then in dump_crawldb/ you'll find a CSV file listing every URL in your CrawlDb.
One column gives the status name. Keep only the records with "db_fetched"
and you'll have your list.
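
If you want to do the filtering on the command line, something like the
sketch below might work. It assumes each data row starts with the URL in
double quotes and carries the status name as a quoted field (i.e.
"<url>",<status code>,"db_fetched",...), so please check a few lines of your
own dump first. The function name fetched_urls and the output file name are
just placeholders, and the dump file is usually named something like part-*.

```shell
# Print the URL of every record whose status name is db_fetched.
# Assumed row layout: "<url>",<status code>,"<status name>",...
# Splitting on the double quote makes field 2 the URL, even if the
# URL itself contains commas. The header line never matches.
fetched_urls() {
  awk -F'"' '/,"db_fetched",/ { print $2 }' "$@"
}

# Example call (file names are assumptions, adjust to your dump):
#   fetched_urls dump_crawldb/part-* > fetched_urls.txt
```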

Sebastian
