Hi

Write a regex URL filter that rejects the unwanted URLs and use it the next 
time you update the db; those URLs will then be dropped. Be sure to back up 
the db first in case your regex catches valid URLs. Nutch 1.5 will have an 
option to keep the previous version of the DB after update.
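
Since a regex that is too broad would also throw away valid URLs, it may help 
to dry-run the rules against a few known URLs before touching the db. Below is 
a minimal Python sketch, not Nutch code. It assumes the rules follow the 
"+regex / -regex" style of regex-urlfilter.txt, that the first matching rule 
decides, and that a URL matching no rule is rejected. The host names and the 
helper names (parse_rules, accepts) are made up for illustration.

```python
import re

# Hypothetical rules in the "+regex / -regex" style of regex-urlfilter.txt.
# bad.example.com stands in for the accidentally injected URLs.
RULES = r"""
# reject the accidentally injected host
-^https?://bad\.example\.com/
# accept everything else
+.
"""

def parse_rules(text):
    """Turn rule lines into (accept, compiled_pattern) pairs, skipping comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Assumed semantics: the first rule whose pattern is found in the URL decides."""
    for keep, pattern in rules:
        if pattern.search(url):
            return keep
    return False  # assumption: no matching rule means the URL is rejected

if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in ("http://bad.example.com/page", "http://good.example.org/page"):
        print(url, "kept" if accepts(rules, url) else "dropped")
```

If every URL you want to keep comes out as kept, the same rule lines can go 
into the filter file. As far as I recall, updatedb only applies the URL 
filters when asked to (a -filter option), so check the usage output of 
bin/nutch updatedb on your version.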

cheers

> We accidentally injected some urls into the crawl database and I need to go
> remove them.  From what I understand, in 1.4 I can view and modify the urls
> and indexes.  But I can't seem to find any information on how to do this.
> 
> Is there anything regarding this available?
