This is a proposal to enter the incubator.
See http://wiki.apache.org/incubator/DroidsProposal for the most
up-to-date version.
As Champion we have Grant Ingersoll <gsingers at apache dot org> from
the ASF.
Droids is an Apache Labs project and we are still looking for some
mentors for this proposal.
We look forward to comments and discussion.
= Droids, an intelligent standalone robot framework =
=== Abstract ===
Droids aims to be an intelligent standalone robot framework that
allows users to create new droids (robots) and extend existing ones.
=== Proposal ===
As a standalone robot framework, Droids will offer infrastructure code
to create new robots and extend existing ones. In the future it will
also offer a web-based administration application to manage and
control the different droids, which will communicate with this
application.
Droids makes it very easy to extend existing robots or write a new one
from scratch, which can automatically seek out relevant online
information based on the user's specifications. Thanks to its flexible
design, it can directly reuse any custom business logic that is
written in Java.
In the long run it should become an umbrella for specialized droids
that are hosted as sub-projects. An ultimate goal is to integrate an
artificial intelligence that can control a swarm of droids and
actively plan for and react to different tasks.
=== Background ===
The initial idea for the Droids project was voiced in February 2007 by
Thorsten Scherler, mainly out of personal curiosity, and it was
developed as a Labs project. The background of his work was that
Cocoon trunk (2.2) no longer provided a crawler and Forrest was based
on it, meaning we could no longer update until we found a crawler
replacement. Getting more involved in Solr and Nutch, he saw the
demand for a generic standalone crawler.
For the first version he took Nutch, then ripped out and modified the
plugin/extension framework. However, the second version was no longer
based on it but used Spring instead. The main reason was that Spring
has become a standard and helped to make Droids as extensible as
possible. Soon the first plugins and sample droids were added to the
code base.
=== Rationale ===
There is ever more demand for tools that automatically perform
specific tasks. Search engines such as Nutch are normally very focused
on a specific functionality and are not focused on extensibility.
Furthermore, they are mainly focused on crawling, that is, requesting
certain pages and extracting links to other pages, which in our
opinion is only one small area for automated robots. While there are a
number of existing crawler libraries for various tasks, each of them
comes with a custom API, and there is no generic interface for
automatically determining which crawler (droid) to use for a specific
task.
The Droids project attempts to remove this duplication of effort. We
believe that by pooling the efforts of multiple projects we will be
able to create a generic robot framework that exceeds the capabilities
and quality of the custom solutions of any single project. The focus
of Droids is not a single crawler but rather a set of reusable
components that custom droids (robots) can use to automate certain
tasks. An intelligent standalone robot framework project will provide
common ground not only for the developers of crawlers but also for the
developers of any other automated application (robot) libraries.
=== Initial Goals ===
The initial goals of the proposed project are:
* Viable community around the Droids codebase
* Active relationships and possible cooperation with related projects and communities (e.g. reusing Tika for text extraction)
* Generic robot API for crawling, extracting structured text content and/or new tasks, filtering tasks and handling the content
* Flexible extension and plugin development to create a wide range of functionality
* Fuel the development of various droids and bring the current wget style crawler to a state-of-the-art level
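As a rough illustration of the generic robot API goal, a wget style
droid loop could be sketched in Java as shown here. This is only a
sketch under stated assumptions: the names DroidSketch, run,
linkExtractor and taskFilter are hypothetical and are not taken from
the Droids code base, and an in-memory map stands in for real page
fetching.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names come from the Droids code base.
public class DroidSketch {

    /**
     * Runs one droid: take a task from the queue, handle it, extract
     * follow-up tasks (links) and queue those that pass the filter.
     */
    public static List<String> run(String seed,
                                   Function<String, List<String>> linkExtractor,
                                   Predicate<String> taskFilter) {
        Queue<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> handled = new ArrayList<>();
        queue.add(seed);
        seen.add(seed);
        while (!queue.isEmpty()) {
            String task = queue.remove();
            handled.add(task); // stands in for the "handle the content" step
            for (String next : linkExtractor.apply(task)) {
                // Queue a new task only if it is allowed and not yet seen.
                if (taskFilter.test(next) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        // In-memory "web" so the example needs no network access.
        Map<String, List<String>> web = Map.of(
                "a", List.of("b", "c", "x"),
                "b", List.of("a", "c"),
                "c", List.of());
        List<String> visited = run(
                "a",
                page -> web.getOrDefault(page, List.of()),
                page -> !page.equals("x")); // filter rejects page "x"
        System.out.println(visited);
    }
}
```

Keeping the link extractor and the task filter as separate pluggable
functions mirrors the extension and plugin goal in the list: a
different droid would swap in its own extractor, filter or handler
without touching the loop.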
== Current Status ==
=== Meritocracy ===
All the initial committers are familiar with the meritocracy
principles of Apache and have already worked on the various source
codebases. We will apply the normal meritocracy rules to other
potential contributors as well.
=== Community ===
There is not yet a clear Droids community. Instead we have a number of
people and related projects with an understanding that an intelligent
standalone robot framework project would best serve everyone's
interests. The primary goal of the incubating project is to build a
self-sustaining community around this shared vision.
=== Core Developers ===
The initial set of developers comes from various backgrounds, with
different but compatible needs for the proposed project.
=== Alignment ===
As a generic robot framework Droids will likely be widely used by
various open source and commercial projects both together with and
independent of other Apache tools. Apache projects like Cocoon, Lenya
and Forrest are potential candidates for using different droids as an
embedded component.
== Known Risks ==
=== Orphaned products ===
Until now only one company is known to use Droids in a production
environment; however, various Apache committers have expressed
constant interest in a generic robot framework. For many potential
users the existing tools are too complicated or too focused on a
specific use case, which will help Droids gain a bigger user base.
Once the project gets started we can quickly bring the wget style
droids to the feature level of existing tools, based on plugin
development that reuses code from the sources mentioned below. After
that we believe we will be able to quickly grow the developer and user
communities based on the benefits of a generic framework offering
reusable plugins and different droids over custom alternatives.
=== Inexperience with Open Source ===
All the initial developers have worked on open source before and many
are committers and PMC members within other Apache projects.
=== Homogenous Developers ===
The initial developers come from a variety of backgrounds and with a
variety of needs for the proposed toolkit.
=== Reliance on Salaried Developers ===
Some of the developers are paid to develop certain functionality for
Droids, but the proposed project is not the primary task for anyone.
=== Relationships with Other Apache Products ===
TBN
=== An Excessive Fascination with the Apache Brand ===
All of us are familiar with Apache and we have participated in Apache
projects as contributors, committers, and PMC members. We feel that
the Apache Software Foundation is a natural home for a project like
this.
== Documentation ==
The main documentation is distributed with the code:
* [http://svn.apache.org/viewvc/labs/droids/trunk/docs/ Docu]
* [http://people.apache.org/~thorsten/droids/ DocuDeployed]
== Initial Source ==
Droids will start with the code base that has been developed in the
Apache Labs project:
* [http://svn.apache.org/viewvc/labs/droids/trunk/ code base]
== Source and Intellectual Property Submission Plan ==
All seed code and other contributions will be handled through the
normal Apache contribution process.
We will also contact other related efforts for possible cooperation
and contributions.
== External Dependencies ==
Droids will mainly depend on the Spring core distribution.
== Cryptography ==
Droids itself will not use cryptography, but it is possible that some
of the external libraries will include cryptographic code to handle
different features.
== Required Resources ==
Mailing lists
* [EMAIL PROTECTED]
* [EMAIL PROTECTED]
* [EMAIL PROTECTED]
Subversion Directory
* https://svn.apache.org/repos/asf/incubator/droids
Issue Tracking
* JIRA Droids (DROIDS)
Other Resources
* none
== Initial Committers ==
|| '''Name''' || '''Email''' || '''CLA''' ||
|| Thorsten Scherler || thorsten at apache dot org || yes ||
|| Ryan !McKinley || ryan at apache dot org || yes ||
|| Grant Ingersoll || gsingers at apache dot org || yes ||
== Affiliations ==
|| '''Name''' || '''Affiliation''' ||
|| Thorsten Scherler || Freelancer ||
== Sponsors ==
Champion
Grant Ingersoll
Nominated Mentors
TBN
Sponsoring Entity
* [http://hc.apache.org/ Apache HttpComponents]
* [http://lucene.apache.org/ Apache Lucene]
--
Thorsten Scherler
thorsten.at.apache.org
Open Source Java consulting, training and
solutions