Updating Lucene Index with Unstored fields

2008-01-30 Thread philipc

Hi,

I'm trying to add a new field to every document in a Lucene index.
After searching around, I found that the only way to update a document
is to retrieve the old document, modify it, delete it from the index,
and then re-add it.

However, this preserves only the stored fields;
all the unstored fields are lost from the re-added documents.
Is there any way to keep the unstored fields as well?

Or is there any way to work around the problem,
e.g. exporting the entire index to a CSV file,
updating the CSV, and then importing it back?

 - Philip
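
The loss described above follows from how an inverted index keeps its data: a retrieved document carries only its stored values, while an unstored field exists only as terms in the postings. The following is a toy model in plain Java to illustrate that point. It is not the Lucene API; ToyIndex, ToyField, and every method name are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only, NOT the Lucene API: every name here is hypothetical.
final class ToyIndex {
    /** A field value plus flags saying whether it is stored and/or indexed. */
    static final class ToyField {
        final String name;
        final String value;
        final boolean stored;
        final boolean indexed;

        ToyField(String name, String value, boolean stored, boolean indexed) {
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    // docId -> stored field values (all that a "retrieve document" call can return)
    private final List<Map<String, String>> storedValues = new ArrayList<>();
    // "field:term" -> ids of the documents containing that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    int add(List<ToyField> fields) {
        int docId = storedValues.size();
        Map<String, String> stored = new LinkedHashMap<>();
        for (ToyField f : fields) {
            if (f.stored) {
                stored.put(f.name, f.value);
            }
            if (f.indexed) {
                for (String term : f.value.toLowerCase().split("\\s+")) {
                    postings.computeIfAbsent(f.name + ":" + term, k -> new TreeSet<>()).add(docId);
                }
            }
        }
        storedValues.add(stored);
        return docId;
    }

    int size() {
        return storedValues.size();
    }

    Map<String, String> document(int docId) {
        return storedValues.get(docId);
    }

    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.<Integer>emptySet());
    }

    /** Retrieve, modify, re-add: the update strategy described in the post. */
    static ToyIndex rebuildWithExtraField(ToyIndex old, ToyField extra) {
        ToyIndex rebuilt = new ToyIndex();
        for (int docId = 0; docId < old.size(); docId++) {
            List<ToyField> fields = new ArrayList<>();
            // Only stored values come back; terms of unstored fields are not available here.
            for (Map.Entry<String, String> e : old.document(docId).entrySet()) {
                fields.add(new ToyField(e.getKey(), e.getValue(), true, true));
            }
            fields.add(extra);
            rebuilt.add(fields);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        ToyIndex original = new ToyIndex();
        original.add(Arrays.asList(
                new ToyField("title", "Index basics", true, true),
                new ToyField("body", "lucene segments and postings", false, true)));
        ToyIndex rebuilt = rebuildWithExtraField(original, new ToyField("category", "search", true, true));

        System.out.println(original.search("body", "lucene")); // [0]
        System.out.println(rebuilt.search("body", "lucene"));  // []
        System.out.println(rebuilt.search("title", "basics")); // [0]
    }
}
```

In this toy model the rebuilt index can still find the document by its stored title, but the search on the unstored body field comes back empty, which mirrors the behaviour reported in the post.
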
-- 
View this message in context: 
http://www.nabble.com/Updating-Lucene-Index-with-Unstored-fields-tp15188818p15188818.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Updating Lucene Index with Unstored fields

2008-02-01 Thread philipc

Thanks for your quick reply.
I'm trying to use your method, but I'm running into a NullPointerException in
IndexWriter.addIndexes().

Code sample:

isearcher is an IndexSearcher,
newValues is an IndexWriter on a RAMDirectory
   
ParallelReader preader = new ParallelReader();
preader.add(isearcher.getIndexReader());
preader.add(new IndexSearcher(newValues.getDirectory()).getIndexReader());

int numdoc = preader.numDocs();

// Print every document of the merged view
for (int i = 0; i < numdoc; i++) {
    Document d = preader.document(i);
    System.out.println(d.toString());
}
writer.addIndexes(new IndexReader[]{preader});
   
This code works fine up to the addIndexes line:
it prints the merged documents correctly.
But addIndexes throws a NullPointerException.



java.lang.NullPointerException
    at org.apache.lucene.index.ParallelReader$ParallelTermPositions.seek(ParallelReader.java:358)
    at org.apache.lucene.index.ParallelReader$ParallelTermDocs.seek(ParallelReader.java:320)
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:327)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:298)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:272)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:236)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:89)
    at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:605)

I was using Lucene 1.9.1, but there is a bug for this, so I updated to
Lucene 2.0.0, but I still get the same error.

Thanks in advance,
  Philip




Andrzej Bialecki wrote:
> 
> Here's an idea: create an index consisting of documents with just this 
> field, adding documents in exactly the same order as they are in the 
> other index. Then use ParallelReader to access both indexes at the same 
> time - ParallelReader will present a merged view of both indexes. You 
> can also use IndexWriter.addIndexes() to create a merged index.
> 
> 
> -- 
> Best regards,
> Andrzej Bialecki <><
>   ___. ___ ___ ___ _ _   __
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
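
The idea quoted above can be pictured with a toy model: two indexes whose documents are in the same order share document ids, so a combined view can route each field to the index that owns it, and the terms of unstored fields stay searchable because nothing is re-analyzed. The sketch below is plain Java and not Lucene's ParallelReader; ToyParallelView, Part, and their methods are hypothetical names. The size check is modelled on the requirement, as I understand it from the ParallelReader documentation, that all sub-indexes contain the same number of documents.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only, NOT Lucene's ParallelReader: every name here is hypothetical.
final class ToyParallelView {
    /** One toy index: a document count plus postings grouped by field. */
    static final class Part {
        final int docCount;
        final Map<String, Map<String, Set<Integer>>> postingsByField = new HashMap<>();

        Part(int docCount) {
            this.docCount = docCount;
        }

        /** Records that the given document contains the term in the field. */
        Part post(String field, String term, int docId) {
            if (docId < 0 || docId >= docCount) {
                throw new IllegalArgumentException("docId out of range: " + docId);
            }
            postingsByField.computeIfAbsent(field, f -> new HashMap<>())
                           .computeIfAbsent(term, t -> new TreeSet<>())
                           .add(docId);
            return this;
        }
    }

    private final Map<String, Part> fieldToPart = new HashMap<>();
    private int docCount = -1;

    /** Adds an index to the view; all parts must describe the same documents in the same order. */
    void add(Part part) {
        if (docCount == -1) {
            docCount = part.docCount;
        } else if (part.docCount != docCount) {
            throw new IllegalArgumentException(
                    "expected " + docCount + " documents but got " + part.docCount);
        }
        // The first part that declares a field owns it.
        for (String field : part.postingsByField.keySet()) {
            fieldToPart.putIfAbsent(field, part);
        }
    }

    /** Looks the term up in whichever part owns the field. */
    Set<Integer> search(String field, String term) {
        Part owner = fieldToPart.get(field);
        if (owner == null) {
            return Collections.<Integer>emptySet();
        }
        return owner.postingsByField.get(field)
                    .getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        // Existing index: "body" is searchable but was never stored.
        Part existing = new Part(2).post("body", "lucene", 0).post("body", "index", 1);
        // New index: only the added field, documents in the same order.
        Part added = new Part(2).post("category", "search", 0).post("category", "storage", 1);

        ToyParallelView view = new ToyParallelView();
        view.add(existing);
        view.add(added);

        System.out.println(view.search("body", "lucene"));      // [0]
        System.out.println(view.search("category", "storage")); // [1]
    }
}
```

The design point is that the combined view never goes through the stored documents, so it does not matter whether a field was stored. The price is that both indexes must stay aligned document by document; in this sketch a mismatch in document count is rejected when a part is added.
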

-- 
View this message in context: 
http://www.nabble.com/Updating-Lucene-Index-with-Unstored-fields-tp15188818p15236124.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

