On Wed, 2008-08-27 at 17:26 -0400, Grant Ingersoll wrote:
> Is there a feature list for Droids anywhere?
> 

Not yet, I still need to put one together.

Droids is completely controlled by its build.properties; the defaults are here:
https://svn.apache.org/repos/asf/labs/droids/trunk/default.properties

These properties then get picked up by the Spring configuration:
https://svn.apache.org/repos/asf/labs/droids/trunk/src/core/java/org/apache/droids/droids-core-context.xml

One highlight of the Spring configuration is the use of the
cocoon-configurator and its dynamic registry support (which makes
extending Droids a pleasure).
http://cocoon.apache.org/subprojects/configuration/1.0/spring-configurator/2.0/1400_1_1.html

In general, the architecture is that a robot (e.g. DefaultDroid) controls
several workers (threads) that do the actual work.
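
To make the robot/worker split more concrete, here is a rough sketch in
plain Java. It is not the Droids code: DroidSketch, crawl() and the two
constructor arguments are made-up names, and I am assuming the delay is
in milliseconds. The "droid" only hands links to a fixed pool of worker
threads and pauses between hand-offs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, NOT the Droids API: one controller, many worker threads.
public class DroidSketch {
    private final int maxThreads;   // plays the role of droids.maxThreads
    private final long delayMillis; // plays the role of droids.delay.request (unit assumed)

    public DroidSketch(int maxThreads, long delayMillis) {
        this.maxThreads = maxThreads;
        this.delayMillis = delayMillis;
    }

    // Hands every link to a worker thread and returns the links that were handled.
    // The order of the result is not guaranteed, because workers run concurrently.
    public List<String> crawl(List<String> links) {
        Queue<String> handled = new ConcurrentLinkedQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(maxThreads);
        try {
            for (String link : links) {
                // A real worker would fetch and parse the link here.
                workers.execute(() -> handled.add(link));
                // Throttle: wait before handing out the next link.
                Thread.sleep(delayMillis);
            }
            workers.shutdown();
            workers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            workers.shutdownNow();
        }
        return new ArrayList<>(handled);
    }
}
```

Something like new DroidSketch(5, 500).crawl(links) would then mirror the
example values mentioned further down in this mail.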

> Or, can it do:
> 
> 1. Honor robots.txt

Yes, by default it honors robots.txt. However, you can turn on the
hostile mode of a droid (droids.protocol.http.force=true).
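
Roughly, the idea behind such a force switch could look like the sketch
below. This is only an illustration under my own assumptions, not how
Droids implements it: RobotsSketch and both methods are hypothetical, and
the parser is deliberately simplified (it only looks at Disallow lines in
a "User-agent: *" group and ignores comments, Allow lines and wildcards).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, NOT the Droids API: a very small robots.txt check.
public class RobotsSketch {

    // Collects the Disallow path prefixes that apply to all agents ("User-agent: *").
    public static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        boolean appliesToAll = false;
        for (String raw : robotsTxt.split("\n")) {
            String line = raw.trim();
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("user-agent:")) {
                // Simplification: every User-agent line starts a new group.
                appliesToAll = line.substring("user-agent:".length()).trim().equals("*");
            } else if (appliesToAll && lower.startsWith("disallow:")) {
                String path = line.substring("disallow:".length()).trim();
                if (!path.isEmpty()) {
                    prefixes.add(path); // an empty Disallow means "nothing is forbidden"
                }
            }
        }
        return prefixes;
    }

    // With force == true the rules are ignored, as in the "hostile mode" idea.
    public static boolean isAllowed(String path, List<String> disallowed, boolean force) {
        if (force) {
            return true;
        }
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```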

> 2. Crawl throttling

You can configure the number of concurrent threads that a droid can
distribute to its workers (droids.maxThreads=5) and the delay between
requests (droids.delay.request=500).

Or you can use one of the new Delay components (see LABS-139):
* SimpleDelayTimer
* RandomDelayTimer
* GaussianRandomDelayTimer
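
To illustrate the difference between the three delay styles, here is a
small sketch. It is not taken from LABS-139: DelaySketch, its Delay
interface and the factory methods are hypothetical names, and the
millisecond unit and the parameters (minimum/range, mean/standard
deviation) are my assumptions about what such timers would need.

```java
import java.util.Random;

// Hypothetical sketch, NOT the Droids delay components: three ways to pick a delay.
public class DelaySketch {

    // One method: how long to wait before the next request (milliseconds assumed).
    public interface Delay {
        long nextDelayMillis();
    }

    // Always the same delay.
    public static Delay simple(long millis) {
        return () -> millis;
    }

    // Uniformly distributed in [minMillis, minMillis + rangeMillis).
    public static Delay random(long minMillis, long rangeMillis, Random rnd) {
        return () -> minMillis + (long) (rnd.nextDouble() * rangeMillis);
    }

    // Normally distributed around meanMillis, clamped so it is never negative.
    public static Delay gaussian(long meanMillis, long stdDevMillis, Random rnd) {
        return () -> Math.max(0L, Math.round(meanMillis + rnd.nextGaussian() * stdDevMillis));
    }
}
```

A randomized delay makes the request pattern less regular than a fixed
one; the Gaussian variant keeps most delays close to the mean while still
varying them.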

> 3. Distributed crawling (i.e. give a bunch of links to it and some  
> distributed compute resources and have it go to town)

If you mean Hadoop style, no. However, you can start several droids on
different systems.

salu2

> 
> Thanks,
> Grant
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions

