Hello!
I think that really depends on what you want to achieve and what parts
of your current system you would like to reuse. If it is only HTML
processing, I would let Nutch and Solr do that. Of course you can
extend Nutch (it has a plugin API) and implement the custom logic you
need as a Nutch pl
Thanks Rafał and Markus for your comments.
I think Droids has a serious problem with URL parameters in the current version
(0.2.0) from Maven Central:
https://issues.apache.org/jira/browse/DROIDS-144
I knew about Nutch, but I haven't been able to implement a crawler with it.
Have you done that or