With MapReduce, the only limits will be hardware limits.
Crawling ~500 million pages with Nutch 0.7 is a pain, since the db update may take more than one week.

Stefan

On 25.10.2005 at 02:29, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:

Hi,

Does anybody know the maximum number of pages that have ever been
fetched and indexed with Nutch? I know Yahoo Research fetched 100M pages
about 3 years ago, but they stopped after that. Is there any really
large-scale (Google- or Yahoo-sized) WebDB out there that has been
fetched by Nutch?

Thanks, Nima




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
