With MapReduce, the only limits will be hardware limits.
Crawling ~500 million pages with Nutch 0.7 is a pain, since the db
update may take more than a week.
Stefan
On 25.10.2005 at 02:29, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
Hi,
Does anybody know the maximum number of pages that have ever been
fetched and indexed with Nutch? I know Yahoo Research fetched 100M
pages about 3 years ago, but they stopped after that. Is there any
really large-scale (like Google and Yahoo) WebDB out there that has
been fetched by Nutch?
Thanks, Nima
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general