: I'm curious, is there a spot / patch for the latest on Nutch / Solr
: integration, I've found a few pages (a few outdated it seems), it would be nice
: (?) if it worked as a DataSource type to DataImportHandler, but not sure if
: that fits w/ how it works. Either way a nice contrib patch the way the DIH is
: already setup would be nice to have.
...
: Is there currently work ongoing on this? Seems like it belongs in either / or
: project and not both.
My understanding is that previous work on bridging Nutch crawling with Solr indexing involved patching Nutch and using a Nutch-specific schema.xml, plus the client code which has since been committed as "SolrJ". Most of the discussion seemed to take place on the Nutch list (which makes sense, since Nutch required the patching), so you may want to start there.

I'm not sure if Nutch integration would make sense as a DIH plugin (it seems like the Nutch crawler could "push" the data much more easily than DIH could pull it from the crawler), but if there is any advantage to having plugin code running in Solr to support this, then that would absolutely make sense in the new /contrib area of Solr (which I believe Otis already created/committed). Any Nutch "plugins" or modifications would obviously need to be made in Nutch.

-Hoss
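
To illustrate the "push" direction mentioned above, here is a minimal sketch of a crawler-side client that builds a Solr XML update message and POSTs it to the update handler. It is not the code from the Nutch patch, and it uses plain HTTP rather than SolrJ only to stay self-contained. The class name `SolrPushSketch`, the URL `http://localhost:8983/solr/update`, and the field names `id`, `title`, and `content` are all assumptions; the fields would have to match whatever schema.xml you actually use.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrPushSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build a Solr XML update message containing a single document.
    static String buildAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    // POST an XML message to the update handler and return the HTTP status code.
    static int post(String updateUrl, String xml) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws Exception {
        // Assumed URL and field names; adjust to your Solr install and schema.xml.
        String updateUrl = args.length > 0 ? args[0] : "http://localhost:8983/solr/update";
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "http://example.com/");
        doc.put("title", "Example page");
        doc.put("content", "Text extracted by the crawler");
        System.out.println(post(updateUrl, buildAddXml(doc)));
        // Documents become searchable only after a commit.
        System.out.println(post(updateUrl, "<commit/>"));
    }
}
```

The point of the sketch is that the crawler already holds each fetched page when it is parsed, so sending it on to Solr is one HTTP request per document (or per batch), whereas a DIH-style pull would need some way to read the crawler's output after the fact.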