I am running Nutch on a powerful server with 1 GB of RAM and a 3 GHz
Intel processor. I want to know what the optimum number of threads would
be for crawling an intranet with around 100 sites.

If I use too many threads (say, -threads 100) while crawling, won't the
context-switching overhead hamper performance?

Please share your experiences, such as the number of threads that has
worked well for you.

You may also share other settings, such as your "-depth" and "-topN" values.
