On Wed, 2008-08-27 at 17:26 -0400, Grant Ingersoll wrote:
> Is there a feature list for Droids anywhere?
>
I still need to write one up.

Droids is controlled entirely by its build.properties:
https://svn.apache.org/repos/asf/labs/droids/trunk/default.properties

These properties are then picked up by the Spring configuration:
https://svn.apache.org/repos/asf/labs/droids/trunk/src/core/java/org/apache/droids/droids-core-context.xml

One highlight of the Spring configuration is its use of the
cocoon-configurator and its dynamic registry support, which makes
extending Droids a pleasure:
http://cocoon.apache.org/subprojects/configuration/1.0/spring-configurator/2.0/1400_1_1.html

In general, the architecture is that a robot (e.g. DefaultDroid)
controls several workers (threads) that do the actual work.

> Or, can it do:
>
> 1. Honor robots.txt

Yes, by default it honors robots.txt. However, you can turn on the
hostile mode of a droid (droids.protocol.http.force=true).

> 2. Crawl throttling

You can configure the number of concurrent threads that a droid can
distribute to its workers (droids.maxThreads=5) and the delay between
requests (droids.delay.request=500). Alternatively, you can use one of
the new Delay components (see LABS-139):

* SimpleDelayTimer
* RandomDelayTimer
* GaussianRandomDelayTimer

> 3. Distributed crawling (i.e. give a bunch of links to it and some
> distributed compute resources and have it go to town)

If you mean Hadoop style, no. However, you can start several droids on
different systems.

salu2

> Thanks,
> Grant
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
-- 
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions
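
To make the crawl throttling answer a bit more concrete, here is a rough sketch of what fixed, uniformly random, and Gaussian request delays could look like. This is NOT the Droids API: the interface and class names below (DelaySketch, FixedDelay, UniformRandomDelay, GaussianDelay, DelayDemo) are hypothetical, and the numeric parameters are illustrative assumptions. Only the general idea of a fixed delay versus randomized delays comes from the thread.

```java
import java.util.Random;

// Hypothetical interface (not part of Droids): how long to wait before the next request.
interface DelaySketch {
    long nextDelayMillis();
}

// Constant delay, similar in spirit to a single configured delay value.
final class FixedDelay implements DelaySketch {
    private final long millis;

    FixedDelay(long millis) {
        if (millis < 0) throw new IllegalArgumentException("delay must be >= 0");
        this.millis = millis;
    }

    public long nextDelayMillis() {
        return millis;
    }
}

// Uniformly distributed delay in the range [min, min + spread).
final class UniformRandomDelay implements DelaySketch {
    private final long min;
    private final int spread;
    private final Random rng;

    UniformRandomDelay(long min, int spread, Random rng) {
        if (min < 0 || spread <= 0) throw new IllegalArgumentException("need min >= 0 and spread > 0");
        this.min = min;
        this.spread = spread;
        this.rng = rng;
    }

    public long nextDelayMillis() {
        return min + rng.nextInt(spread);
    }
}

// Normally distributed delay around a mean, clamped so it never drops below min.
final class GaussianDelay implements DelaySketch {
    private final long min;
    private final double mean;
    private final double stdDev;
    private final Random rng;

    GaussianDelay(long min, double mean, double stdDev, Random rng) {
        if (min < 0 || stdDev < 0) throw new IllegalArgumentException("need min >= 0 and stdDev >= 0");
        this.min = min;
        this.mean = mean;
        this.stdDev = stdDev;
        this.rng = rng;
    }

    public long nextDelayMillis() {
        long sample = Math.round(mean + rng.nextGaussian() * stdDev);
        return Math.max(min, sample);
    }
}

// Small demo: a worker would sleep for the returned delay between requests.
class DelayDemo {
    public static void main(String[] args) throws InterruptedException {
        DelaySketch timer = new GaussianDelay(100, 500.0, 150.0, new Random());
        for (int i = 0; i < 3; i++) {
            long wait = timer.nextDelayMillis();
            System.out.println("request " + i + ": waiting " + wait + " ms");
            Thread.sleep(wait);
        }
    }
}
```

Randomized delays make the request pattern less regular than a fixed interval, and clamping the Gaussian sample at a floor keeps the crawler from hitting a host with back-to-back requests when the random draw comes out very low.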
