On Aug 27, 2008, at 1:46 PM, Thorsten Scherler wrote:
On Wed, 2008-08-27 at 18:52 +0200, Bernd Fondermann wrote:
On Wed, Aug 27, 2008 at 14:06, Thorsten Scherler
<[EMAIL PROTECTED]> wrote:
On Wed, 2008-03-12 at 16:21 +0100, Thorsten Scherler wrote:
...
The Apache HttpComponents project is willing to sponsor the
project.
Why HC and not, say, Lucene/Nutch?
The HC project expressed their interest in droids from the beginning
[3]. They planned to provide an HTTP spider, as you can see from [4], but
nobody found the time to implement it. These are the reasons for choosing HC.
Nutch/Lucene did not express any interest. Furthermore, droids is not
about search engines at all. It is a robot framework, so Lucene/Nutch
does not fit; I do not want to limit the focus of droids to search engines.
FWIW, Lucene goes well beyond search... Mahout is machine learning,
Tika is content extraction. Nutch is a crawler and search. I've used
just the Nutch crawler before, as have others.
I don't recall Lucene being asked, but I would think it is possible
there is interest. I, for one, am interested in a good, scalable crawler.
Not trying to convince you either way, just saying there may be options.
-Grant