What about running one fetcher on each node 24/7? Each fetcher would
take segments from a global queue. Other parts of the system would not
have to wait until the to-fetch queue is depleted before doing the DB
update and new segment generation. So basically, adding a queue would
allow pipelining of the time-consuming work, namely fetching, DB
update, and segment generation. And we would not end up waiting for one
or two fetchers to finish their job.

I agree, maybe we can get this to work by using groups. We can have some workers in a fetch group and let them do the fetching.
Besides the fetch group, we would have a preprocessing group that does the rest.
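
To make the idea a bit more concrete, here is a rough sketch of the two groups connected by queues. It is only an illustration, not existing Nutch code: the class name FetchPipelineSketch, the run method, the string segment names and the ":fetched" / ":updated" suffixes are all made up, and the real fetching and DB update are replaced by placeholders. It also assumes everything runs in one JVM with plain thread pools, whereas the real groups would be spread over nodes.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class FetchPipelineSketch {
    // Hypothetical sentinel that tells a worker of the second group to stop.
    private static final String POISON = "__STOP__";

    /** Runs both groups and returns the segments that passed through the whole pipeline. */
    public static List<String> run(List<String> segments, int fetchers, int processors)
            throws InterruptedException {
        // Global to-fetch queue; pre-filled here, refilled by segment generation in a real system.
        BlockingQueue<String> toFetch = new LinkedBlockingQueue<>(segments);
        // Hand-over queue between the fetch group and the group that does the rest.
        BlockingQueue<String> fetched = new LinkedBlockingQueue<>();
        List<String> done = new CopyOnWriteArrayList<>();

        ExecutorService fetchGroup = Executors.newFixedThreadPool(fetchers);
        ExecutorService restGroup = Executors.newFixedThreadPool(processors);

        // Fetch group: every worker keeps pulling segments until the queue is empty.
        for (int i = 0; i < fetchers; i++) {
            fetchGroup.submit(() -> {
                String segment;
                while ((segment = toFetch.poll()) != null) {
                    fetched.add(segment + ":fetched"); // placeholder for the actual fetch
                }
            });
        }

        // Second group: starts on a segment as soon as it has been fetched,
        // without waiting for the slowest fetcher to finish.
        for (int i = 0; i < processors; i++) {
            restGroup.submit(() -> {
                try {
                    while (true) {
                        String segment = fetched.take();
                        if (segment.equals(POISON)) {
                            break;
                        }
                        done.add(segment + ":updated"); // placeholder for the DB update
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Wait for the fetch group, then send one stop marker per worker of the second group.
        fetchGroup.shutdown();
        fetchGroup.awaitTermination(1, TimeUnit.MINUTES);
        for (int i = 0; i < processors; i++) {
            fetched.add(POISON);
        }
        restGroup.shutdown();
        restGroup.awaitTermination(1, TimeUnit.MINUTES);
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> done = run(List.of("seg-1", "seg-2", "seg-3", "seg-4"), 2, 1);
        System.out.println(done.size() + " segments processed");
    }
}
```

The point of the second queue is that the group doing the rest never has to wait for a whole fetch round to finish; how many workers go into each group is something we would have to tune.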


Does that make sense?

Stefan



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
