The webdb and the segments are two separate things. The fetcher uses the webdb to keep track of the status of each URL (for example, the last fetch time and whether there was an error). The segments contain the data from the fetches themselves, and also the index of that data, which is what searches run against.
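To make the separation concrete, here is a toy sketch in Python. It is not Nutch code: the two dicts and the helper names (`fetch`, `search`, `delete_from_webdb`, `prune_index`) are made up for illustration and only mimic the idea that URL status and the searchable index live in different places.

```python
# Toy model only: these dicts stand in for the webdb and a segment index.
# None of these names come from Nutch itself.
webdb = {}          # url -> fetch status (last fetch time, error)
segment_index = {}  # url -> indexed text, the only thing search looks at


def fetch(url, text):
    """Record fetch status in the webdb and index the fetched text."""
    webdb[url] = {"last_fetch": 0, "error": None}
    segment_index[url] = text


def search(term):
    """Search consults the index only, never the webdb."""
    return sorted(url for url, text in segment_index.items() if term in text)


def delete_from_webdb(url):
    """Remove the status record; the index is left untouched."""
    webdb.pop(url, None)


def prune_index(url):
    """Remove the page from the index, which is what hides it from search."""
    segment_index.pop(url, None)


fetch("http://example.com/a", "nutch crawler notes")
fetch("http://example.com/b", "unrelated page")

delete_from_webdb("http://example.com/a")
print(webdb.get("http://example.com/a"))  # None, like the WebDBReader check
print(search("nutch"))                    # the page is still found

prune_index("http://example.com/a")
print(search("nutch"))                    # now the page is gone
```

In this model, a lookup in the webdb returns nothing after the delete, yet the search still finds the page until the index entry is pruned as well.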
So you've deleted the page from the webdb, but it is still in the index, which is why it still shows up in searches. Now you need to remove it from the index as well. You can use the PruneIndexTool to do this. Here's a link for more info:

http://lucene.apache.org/nutch/apidocs/org/apache/nutch/tools/PruneIndexTool.html

Howie

>Does anyone know how to force a page to be deleted? I have run the
>WebDBWriter class and removed the page from the database but it still
>shows on the search. Further checks using WebDBReader give a 'null'
>response when looking for the page.
