Hi,

Somebody (Paul?) mentioned using Droids for a 50M-page crawl. Is anyone else using Droids for crawls of that size?

I'm asking because I need to do a "semi-vertical" crawl of up to 10K domains, and I'm considering Droids vs. Nutch. That may translate to several times as many distinct servers, say 100K, and in turn to a few hundred million web pages. Too big for Droids without a persistent link queue, right?

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
