I agree, maybe we can get this to work by using groups. We can have some workers in a fetch group and let them do the fetching.

What about running one fetcher on each node 24/7? Each fetcher would take segments from a global queue. Other parts of the system would not have to wait until the to-fetch queue is depleted before doing the DB update and new segment generation. So basically, adding a queue will allow pipelining of the time-consuming work, namely fetching, DB update and segment generation. And we will not end up waiting for one or two fetchers to finish their job.
Besides the fetch group, we have the preprocessing group that does the rest.
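
To make the queue idea a bit more concrete, here is a rough sketch in Python of a fetch group and a separate update worker connected by queues. It is only an illustration under my own assumptions: the names run_pipeline, to_fetch and fetched are made up, threads stand in for nodes, and the "fetch" and "DB update" steps are placeholders rather than anything Nutch actually does.

```python
import queue
import threading


def run_pipeline(segments, num_fetchers=3):
    """Fetch segments from a shared queue while a separate worker
    processes results as they arrive (hypothetical stand-in steps)."""
    to_fetch = queue.Queue()   # global queue of segments waiting to be fetched
    fetched = queue.Queue()    # fetched segments waiting for the DB update
    updated = []               # stand-in for the updated DB

    def fetcher():
        # Each fetcher keeps pulling segments until it sees the stop marker.
        while True:
            segment = to_fetch.get()
            if segment is None:
                break
            fetched.put("fetched:" + segment)  # placeholder for real fetching

    def updater():
        # The update worker starts as soon as the first result arrives;
        # it does not wait for the to-fetch queue to be depleted.
        while True:
            item = fetched.get()
            if item is None:
                break
            updated.append(item)  # placeholder for the DB update

    fetch_group = [threading.Thread(target=fetcher) for _ in range(num_fetchers)]
    update_worker = threading.Thread(target=updater)
    for t in fetch_group:
        t.start()
    update_worker.start()

    for segment in segments:
        to_fetch.put(segment)
    for _ in fetch_group:
        to_fetch.put(None)      # one stop marker per fetcher
    for t in fetch_group:
        t.join()
    fetched.put(None)           # all fetchers are done, stop the updater
    update_worker.join()
    return updated


if __name__ == "__main__":
    print(sorted(run_pipeline(["seg-1", "seg-2", "seg-3", "seg-4"])))
```

The point of the two queues is that a slow fetcher only delays its own segment; the update worker keeps consuming whatever the other fetchers have already finished.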
Does that make sense?
Stefan
