I'd like to crawl the whole web. By applying my own IndexingFilter plugin, I can select the URLs that are of interest to me. Then, using the 'Prune' method, I keep the searcher focused on those pages which are of interest.

So far so good, it works. But I would like to recrawl with different strategies, say different frequencies:

1. weekly, for the sites which contain interesting data
2. monthly, for the sites which do not contain interesting data

I tried to look into the scoring plug-in, but I didn't find a way to use it for that purpose. Any suggestions?

Basically, there seems to be no mechanism to feed information back from the indexing phase into the crawling phase. With such a mechanism, we could focus the crawling based on content and not on URL filtering, which is a poor reflection of the content.

-Raymond-
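
P.S. To make the idea concrete, here is a rough sketch of the kind of feedback I have in mind. It is plain Java and does not use any Nutch API: the class RecrawlFeedback, its methods, and the in-memory map standing in for the crawl database are all hypothetical names made up for illustration. The indexing side records whether a site turned out to be interesting, and the crawling side asks for the recrawl interval.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in Nutch. */
class RecrawlFeedback {
    static final long WEEK_SECONDS = 7L * 24 * 60 * 60;
    static final long MONTH_SECONDS = 30L * 24 * 60 * 60;

    // site -> recrawl interval in seconds (stand-in for the crawl database)
    private final Map<String, Long> fetchIntervals = new HashMap<>();

    /** Indexing phase: store the verdict of the content check for a site. */
    public void recordVerdict(String site, boolean interesting) {
        fetchIntervals.put(site, interesting ? WEEK_SECONDS : MONTH_SECONDS);
    }

    /** Crawling phase: sites with no verdict yet default to monthly. */
    public long intervalFor(String site) {
        return fetchIntervals.getOrDefault(site, MONTH_SECONDS);
    }

    /** True when at least one full interval has passed since the last fetch. */
    public boolean isDue(String site, long lastFetchEpochSeconds, long nowEpochSeconds) {
        return nowEpochSeconds - lastFetchEpochSeconds >= intervalFor(site);
    }
}
```

The open question is where such a verdict could be stored so that the next generate/fetch cycle actually sees it.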
