Hi Yossi,

What you say makes sense if you run Nutch in the "whole Internet crawling" mode. In other words, you don't specify the set of hosts you want to crawl, and the crawl can go on without bound.
Our case is different. We crawl a specific set of hosts for each country (around 200,000). For each host we set up a stop condition in generate, with an expression based on the number of pages fetched per host, let's say db_fetched < 100 (see https://issues.apache.org/jira/browse/NUTCH-2368). The problem is that for really deep websites this condition can be hard to satisfy (in practice, it may never be). As an illustration, imagine a website with the following structure: 1-10-15-5-1-1-1 - ... Therefore I want a mechanism to stop crawling such a host at a specific point, even though the db_fetched condition has not been satisfied yet.

Semyon.
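
P.S. A rough sketch of the illustration above, under two assumptions that are not stated in the thread: that the numbers 1-10-15-5-1-1-1 are the counts of new pages found at each link depth, and that one generate/fetch round fetches one depth level. The function name and the max_rounds cap are hypothetical and are not Nutch settings; this only simulates the arithmetic.

```python
from itertools import chain, repeat


def rounds_until_stop(pages_per_depth, fetched_limit, max_rounds=None):
    """Simulate one host; return (rounds_run, pages_fetched).

    pages_per_depth: iterable of new pages found at each depth (assumed model).
    fetched_limit:   stop once this many pages are fetched (like db_fetched < 100).
    max_rounds:      hypothetical extra cap on the number of rounds per host.
    """
    fetched = 0
    rounds = 0
    for new_pages in pages_per_depth:
        if fetched >= fetched_limit:
            break  # the db_fetched-style condition is finally satisfied
        if max_rounds is not None and rounds >= max_rounds:
            break  # the additional stop point kicks in first
        fetched += new_pages
        rounds += 1
    return rounds, fetched


def deep_site():
    # 1-10-15-5-1-1-1-...: a few wide levels, then one new page per level.
    return chain([1, 10, 15, 5], repeat(1))


if __name__ == "__main__":
    print(rounds_until_stop(deep_site(), fetched_limit=100))
    print(rounds_until_stop(deep_site(), fetched_limit=100, max_rounds=10))
```

Under this model the host needs 73 rounds to reach 100 fetched pages, while a cap of 10 rounds stops it at 37 pages.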