Possibility to update already stored documents.
-----------------------------------------------

                 Key: NUTCH-664
                 URL: https://issues.apache.org/jira/browse/NUTCH-664
             Project: Nutch
          Issue Type: New Feature
            Reporter: Sergey Khilkov


We have a huge index of stored documents. Refetching a page and merging 
indexes every time some information about the page changes is an expensive 
procedure. The information can change 1-3 times per day. At the moment we have 
to store the changed info in a database, but in that case we have lots of 
problems with sorting, search restrictions and so on. Lucene itself allows 
deleting a single document and adding a new one to an existing index. But 
there is a problem with Hadoop: as I understand it, the Hadoop filesystem does 
not support writing at random positions. Still, it would be a great feature if 
Nutch were able to update an already created index.
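
To illustrate the delete-and-add style of update mentioned above, here is a 
minimal sketch. It is a toy in-memory model, not Lucene or Nutch code: the 
class ToyIndex, its methods, and the "url" and "info" field names are all 
hypothetical. In Lucene itself the corresponding call is 
IndexWriter.updateDocument(Term, Document), which deletes the documents 
matching the term and then adds the new one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of "delete by key, then add". NOT Lucene's API;
// all names here are hypothetical and exist only for illustration.
class ToyIndex {
    // Documents are only ever appended; each one is a field -> value map.
    private final List<Map<String, String>> docs = new ArrayList<>();
    // Tombstones: deleted.get(i) is true once docs.get(i) has been deleted.
    private final List<Boolean> deleted = new ArrayList<>();

    // Append a copy of the document as a new live entry.
    void add(Map<String, String> doc) {
        docs.add(new LinkedHashMap<>(doc));
        deleted.add(false);
    }

    // Mark every live document whose field equals value as deleted.
    // Returns the number of documents that were marked.
    int delete(String field, String value) {
        int marked = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && value.equals(docs.get(i).get(field))) {
                deleted.set(i, true);
                marked++;
            }
        }
        return marked;
    }

    // An update is a delete of the old version followed by an add of the new one.
    void update(String field, String value, Map<String, String> doc) {
        delete(field, value);
        add(doc);
    }

    // Return the live documents whose field equals value.
    List<Map<String, String>> search(String field, String value) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && value.equals(docs.get(i).get(field))) {
                hits.add(docs.get(i));
            }
        }
        return hits;
    }

    // Number of documents that have not been deleted.
    int liveCount() {
        int live = 0;
        for (boolean isDeleted : deleted) {
            if (!isDeleted) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(Map.of("url", "http://example.com/a", "info", "old"));
        index.update("url", "http://example.com/a",
                Map.of("url", "http://example.com/a", "info", "new"));
        System.out.println(index.liveCount());
        System.out.println(index.search("url", "http://example.com/a").get(0).get("info"));
    }
}
```

The old version is only marked as deleted and the new version is appended, so 
nothing already stored is overwritten in place. Whether this maps cleanly onto 
the Hadoop filesystem is exactly the open question of this issue, and the 
sketch does not try to answer it.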

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
