Dennis, that's awesomely interesting. Thank you,
Mark
On Tue, Nov 24, 2009 at 10:01 AM, Dennis Kubes <[email protected]> wrote:

> Hi Mark,
>
> I just put this up on the wiki. Hope it helps:
>
> http://wiki.apache.org/nutch/OptimizingCrawls
>
> Dennis
>
>
> Mark Kerzner wrote:
>
>> Hi, guys,
>>
>> my goal is to do my crawls at 100 fetches per second, observing, of course,
>> polite crawling. But, when URLs are all different domains, what
>> theoretically would stop some software from downloading from 100 domains at
>> once, achieving the desired speed?
>>
>> But, whatever I do, I can't make Nutch crawl at that speed. Even if it
>> starts at a few dozen URLs/second, it slows down at the end (as discussed
>> by many and by Krugler).
>>
>> Should I write something of my own, or are there fast crawlers?
>>
>> Thanks!
>>
>> Mark
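For anyone tuning this, the relevant knobs live in conf/nutch-site.xml. A minimal
sketch, assuming a Nutch 1.x setup of that era (property names as in
conf/nutch-default.xml; the values are only illustrative and both names and
defaults are worth checking against your own version):

  <configuration>
    <!-- Fetch many hosts in parallel; politeness is still enforced per host. -->
    <property>
      <name>fetcher.threads.fetch</name>
      <value>100</value>
    </property>
    <!-- Seconds between successive requests to the same host (the politeness delay). -->
    <property>
      <name>fetcher.server.delay</name>
      <value>5.0</value>
    </property>
    <!-- Cap URLs per host in each fetch list so a few big hosts don't dominate the tail. -->
    <property>
      <name>generate.max.per.host</name>
      <value>100</value>
    </property>
  </configuration>

With the per-host delay left in place, throughput comes from breadth across many
hosts rather than hitting any single one harder, and capping per-host URLs at
generate time is one way to keep the end of a fetch cycle from bogging down on a
handful of large hosts, which is the slowdown described above.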
