I am running Nutch on a powerful server with 1 GB of RAM and a 3 GHz Intel processor. I want to know what the optimum number of threads would be to crawl an intranet with around 100 sites.
If I use too many threads (say -threads 100) while crawling, won't the context-switching overhead hamper performance? Please share your experiences, such as what number of threads has worked well for you. You may also share other settings, such as the "-depth" and "-topN" values you used.