[jira] Issue Comment Edited: (NUTCH-664) Possibility to update already stored documents.

2008-12-02 Thread Sergey Khilkov (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651458#action_12651458
 ] 

skhil edited comment on NUTCH-664 at 12/2/08 1:29 AM:
---

Good news! So, I'll wait until 1.0 and prepare the project for hbase-solr!

  was (Author: skhil):
Good news! So, I'll wait until 1.0 and prepare the project for hbase-solr/katta/etc!
  
 Possibility to update already stored documents.
 ---

 Key: NUTCH-664
 URL: https://issues.apache.org/jira/browse/NUTCH-664
 Project: Nutch
  Issue Type: Wish
Reporter: Sergey Khilkov
Priority: Minor

 We have a huge index of stored documents. Re-fetching a page and merging
 indexes every time some information about the page changes is expensive,
 and that information can change 1-3 times per day. At the moment we have to
 store the changed info in a database, but that causes a lot of problems with
 sorting, search restrictions and so on. Lucene itself allows deleting a single
 document and adding a new one to an existing index. But there is a problem
 with Hadoop... As I understand it, the Hadoop filesystem cannot write at
 random positions. Still, it would be a great feature if Nutch were able to
 update an already created index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (NUTCH-664) Possibility to update already stored documents.

2008-11-27 Thread Sergey Khilkov (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651458#action_12651458
 ] 

Sergey Khilkov commented on NUTCH-664:
--

Good news! So, I'll wait until 1.0 and prepare the project for hbase-solr/katta/etc!





[jira] Commented: (NUTCH-664) Possibility to update already stored documents.

2008-11-26 Thread Sergey Khilkov (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650912#action_12650912
 ] 

Sergey Khilkov commented on NUTCH-664:
--

Yes, it would be great to have a changeDocument() method in the IndexWriter
class. Hope it's possible )





[jira] Created: (NUTCH-664) Possibility to update already stored documents.

2008-11-25 Thread Sergey Khilkov (JIRA)
Possibility to update already stored documents.
---

 Key: NUTCH-664
 URL: https://issues.apache.org/jira/browse/NUTCH-664
 Project: Nutch
  Issue Type: New Feature
Reporter: Sergey Khilkov


We have a huge index of stored documents. Re-fetching a page and merging
indexes every time some information about the page changes is expensive,
and that information can change 1-3 times per day. At the moment we have to
store the changed info in a database, but that causes a lot of problems with
sorting, search restrictions and so on. Lucene itself allows deleting a single
document and adding a new one to an existing index. But there is a problem
with Hadoop... As I understand it, the Hadoop filesystem cannot write at
random positions. Still, it would be a great feature if Nutch were able to
update an already created index.
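
To illustrate the request, here is a minimal sketch of how "delete one document and add a new one" could work on storage that cannot be written at random positions. It is a toy model in Python, not Nutch or Lucene code: the names AppendOnlyIndex, Segment, add_segment, update and live_docs are all hypothetical. The assumption is that existing data is never rewritten; an update only records a deletion marker and appends a new write-once segment.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A write-once batch of documents; never modified after creation."""
    docs: tuple  # tuple of (doc_id, fields) pairs


@dataclass
class AppendOnlyIndex:
    """Hypothetical toy index: updates never rewrite existing segments."""
    segments: list = field(default_factory=list)
    # (segment_no, position) pairs marking documents that were superseded.
    deleted: set = field(default_factory=set)

    def add_segment(self, docs):
        # Appending a whole new segment needs no random-position writes.
        self.segments.append(Segment(tuple(docs)))

    def update(self, doc_id, fields):
        # Step 1: mark every existing copy of the document as deleted.
        for seg_no, seg in enumerate(self.segments):
            for pos, (existing_id, _) in enumerate(seg.docs):
                if existing_id == doc_id:
                    self.deleted.add((seg_no, pos))
        # Step 2: append the new version as its own small segment.
        self.add_segment([(doc_id, fields)])

    def live_docs(self):
        # A search would only see documents not marked as deleted.
        return {
            doc_id: fields
            for seg_no, seg in enumerate(self.segments)
            for pos, (doc_id, fields) in enumerate(seg.docs)
            if (seg_no, pos) not in self.deleted
        }


index = AppendOnlyIndex()
index.add_segment([("page-1", {"rank": 1}), ("page-2", {"rank": 5})])
index.update("page-1", {"rank": 3})  # info about page-1 changed
print(index.live_docs())
```

In this sketch the first segment still holds the old copy of page-1 after the update; it is only hidden by the deletion marker. Something like a later merge would be needed to reclaim that space, and that merge is exactly the expensive step the issue asks to avoid on every change.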
