[jira] Updated: (NUTCH-442) Integrate Solr/Nutch

Guillaume Smet (JIRA) Tue, 05 Aug 2008 00:59:16 -0700

     [ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Guillaume Smet updated NUTCH-442:
---------------------------------

    Attachment: NUTCH-442_v7.patch.txt

Here is an updated patch synced with SVN trunk.

A few thoughts and comments:
- In Indexer.java, I added a try {} catch {} finally {} around
JobClient.runJob(job); so that we're sure to clean up the temp directory.
- Why did you remove the try {} catch {} in the *IndexingFilter classes?
- Otherwise, it seems to work as expected and posts the documents to Solr (be
sure to add a string field called boost to your Solr schema).
- From what I've seen, it doesn't handle removing documents from the index
once they are gone.
- Perhaps using Solrj to communicate with Solr, instead of a custom class,
would be better.
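
To illustrate the first point above, here is a minimal sketch of the
try/finally cleanup pattern. It is not the code from the patch: the Job
interface, the runWithCleanup and deleteRecursively helpers, and the
"index-tmp-" prefix are hypothetical stand-ins, and job.run(...) only takes
the place of JobClient.runJob(job). It uses plain java.nio on the local
filesystem rather than the Hadoop FileSystem API, purely to stay
self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class TempDirCleanupSketch {

    /** Hypothetical stand-in for the real job submission (JobClient.runJob(job)). */
    interface Job {
        void run(Path workDir) throws IOException;
    }

    /**
     * Creates a temp directory, runs the job against it, and removes the
     * directory whether the job succeeds or throws. Returns the (now deleted)
     * path so callers can log it.
     */
    static Path runWithCleanup(Job job) throws IOException {
        Path tmp = Files.createTempDirectory("index-tmp-"); // assumed prefix
        try {
            job.run(tmp);
        } finally {
            // Runs on both the normal and the exceptional path.
            deleteRecursively(tmp);
        }
        return tmp;
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // A job that writes one file and then fails, to exercise the finally path.
        final Path[] seen = new Path[1];
        try {
            runWithCleanup(dir -> {
                seen[0] = dir;
                Files.createFile(dir.resolve("part-00000"));
                throw new IOException("simulated job failure");
            });
        } catch (IOException e) {
            System.out.println("job failed: " + e.getMessage());
        }
        System.out.println("temp dir still exists: " + Files.exists(seen[0]));
    }
}
```

The original exception still propagates to the caller; the finally block only
guarantees that the temporary directory does not outlive a failed job.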

Thanks for your work.

> Integrate Solr/Nutch
> --------------------
>
>                 Key: NUTCH-442
>                 URL: https://issues.apache.org/jira/browse/NUTCH-442
>             Project: Nutch
>          Issue Type: New Feature
>         Environment: Ubuntu linux
>            Reporter: rubdabadub
>         Attachments: Crawl.patch, Indexer.patch, NUTCH-442_v4.patch, 
> NUTCH-442_v5.patch, NUTCH-442_v6.patch.txt, NUTCH-442_v7.patch.txt, 
> NUTCH_442_v3.patch, RFC_multiple_search_backends.patch, schema.xml
>
>
> Hi:
> I tried out Sami's patch for Solr/Nutch integration, which can be found here
> (http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html),
> and I can confirm it worked :-) That led me to request the following:
> I would be very grateful if this could be included in Nutch 0.9, as I am
> trying to eliminate my Python-based crawler, which posts documents to Solr.
> As I am in a corporate environment, I can't install the trunk version in the
> production environment, so I am asking for this to be included in the 0.9
> release. I hope my wish will be granted.
> I look forward to getting some feedback.
> Thank you.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
