You can dump segment info to a directory, let's say "tmps":

    $NUTCH_HOME/bin/nutch readseg -dump $segment tmps -nocontent
Then go to that directory, where you should see a file named "dump":

    grep outlink: dump | cut -f5 -d" " > outlinks

On Fri, 2009-07-17 at 18:43 +0200, reinhard schwab wrote:
> is any tool available to dump all outlinks (filtered outlinks included)?
> (i know the tools to dump crawldb, linkdb and segments)
> or do i have to implement such a tool and if, how?
> i want to know them to adapt/manage the url filters.
> parse the contents with urlfilters disabled?
>
> reinhard
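
Since the goal is to tune the url filters, a de-duplicated list may be easier to read than the raw grep/cut output at the top of this message. Here is a rough sketch of that variant. It assumes each outlink line in the dump looks like `  outlink: toUrl: <url> anchor: <text>`, which is the layout the `cut -f5` command implies but which I have not checked against every Nutch version. The function name `extract_outlinks` is made up for illustration and is not part of Nutch.

```shell
#!/bin/sh
# Sketch: print the unique outlink URLs found in a readseg dump file.
# Assumed line layout (not verified for every Nutch version):
#   "  outlink: toUrl: http://example.com/ anchor: some text"

# extract_outlinks is a hypothetical helper name, not a Nutch command.
extract_outlinks() {
    # awk splits on runs of whitespace, so the URL is field 3 no matter
    # how many spaces lead the line; sort -u drops duplicate URLs.
    awk '$1 == "outlink:" && $2 == "toUrl:" { print $3 }' "$1" | sort -u
}

# Usage, after the readseg dump into tmps has finished:
#   extract_outlinks tmps/dump > outlinks
```

Matching on both `outlink:` and `toUrl:` skips the `Outlinks: N` summary lines and any page text that merely contains the word "outlink".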
