Hi,

I am trying to configure my Nutch crawler with the runbot script from the wiki: http://wiki.apache.org/nutch/Crawl

I added regular expressions to regex-urlfilter.txt and to crawl-urlfilter.txt, but they do not seem to take effect. I cannot tell whether my regex is wrong or whether Nutch is not reading the urlfilter file I think it is. Is there a way to find out which file Nutch is currently using?
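
To rule out the regex itself, the rules could be checked outside Nutch with a rough sketch like the one below. It assumes the usual urlfilter layout: one rule per line, a leading + (accept) or - (reject) followed by a regular expression, the first matching rule decides, and a URL that matches no rule is rejected. The sample rules and the example.org URLs are made up for illustration, and Python's regex syntax is not identical to the Java regex syntax Nutch uses, so this only approximates the real filter.

```python
import re


def parse_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """The first rule whose pattern is found in the URL decides; no match means reject."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False


# Hypothetical rules, not taken from any real configuration.
SAMPLE_RULES = r"""
# skip common image suffixes
-\.(gif|jpg|png)$
# accept anything under the made-up host example.org
+^http://([a-z0-9]*\.)*example\.org/
"""

rules = parse_rules(SAMPLE_RULES)
for url in (
    "http://www.example.org/index.html",
    "http://www.example.org/logo.png",
    "http://other.example.com/",
):
    print(url, "->", "accepted" if accepts(rules, url) else "rejected")
```

If the rules behave as expected here, the regex is probably fine and the open question is only which file Nutch actually reads.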

Regards
    Klemens
