Unfortunately, updateDocument replaces the *entire* previous document with the new one.
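
To make the failure mode concrete, here is a minimal sketch in plain Java. It is a toy model, not the Lucene API: ToyIndex, Field, retrieve and isIndexed are hypothetical names. The only assumptions it encodes are that a retrieved document carries just its stored fields, and that an update replaces the old document wholesale.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names are Lucene classes or methods.
final class ToyIndex {

    // A field value plus a flag saying whether the value is kept for retrieval.
    static final class Field {
        final String value;
        final boolean stored;

        Field(String value, boolean stored) {
            this.value = value;
            this.stored = stored;
        }
    }

    // Document id -> (field name -> field). Every field in here counts as indexed.
    private final Map<String, Map<String, Field>> docs = new HashMap<>();

    // Models updateDocument: the new document replaces the old one wholesale.
    void updateDocument(String id, Map<String, Field> doc) {
        docs.put(id, new LinkedHashMap<>(doc));
    }

    // Models fetching a hit: only stored fields come back.
    Map<String, Field> retrieve(String id) {
        Map<String, Field> result = new LinkedHashMap<>();
        for (Map.Entry<String, Field> entry : docs.get(id).entrySet()) {
            if (entry.getValue().stored) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    // True if the named field is still present (searchable) for this document.
    boolean isIndexed(String id, String fieldName) {
        return docs.get(id).containsKey(fieldName);
    }

    // Runs the read-modify-update cycle and reports what happened to "body".
    static String demo() {
        ToyIndex index = new ToyIndex();
        Map<String, Field> doc = new LinkedHashMap<>();
        doc.put("unique-id", new Field("42", true));          // indexed and stored
        doc.put("body", new Field("some long text", false));  // indexed, not stored
        index.updateDocument("42", doc);
        boolean before = index.isIndexed("42", "body");

        // Fetch the document, add one field, write it back.
        Map<String, Field> fetched = index.retrieve("42");
        fetched.put("newField", new Field("newValue", true));
        index.updateDocument("42", fetched);
        boolean after = index.isIndexed("42", "body");

        return "body indexed before=" + before + ", after=" + after;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the unstored "body" field disappears after the round trip, because the fetched copy never contained it and the update discards whatever the old document had.
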
The ability to update a single indexed field (either replace that
field entirely, or change only certain token occurrences within it),
while leaving all other indexed fields in the document unaffected, has
been a long-requested, big missing feature in Lucene.

We call it "incremental field updates".

There have been some healthy discussions on the dev list that have
worked out a good rough design (e.g. see
http://markmail.org/thread/lsfjhpiblzymkfcn).

Also, recent improvements in how buffered deletes are handled should
make it a lot easier for updates to "piggyback" using that same packet
stream approach.

So... I think there is hope that some day we'll get this into Lucene.

Mike

http://blog.mikemccandless.com

On Fri, Apr 8, 2011 at 11:00 AM, Ian Lea <ian....@gmail.com> wrote:
> Unfortunately you just can't do this. Might be possible if all fields
> were stored but evidently they are not in your index. For unstored
> fields, the Document object will not contain the data that was passed
> in when the doc was originally added.
>
> I believe there might be a way of recreating some of the missing data
> via TermFreqVector but that has always sounded dodgy and lossy to me.
>
> The safest way is to reindex, however painful it might be. Maybe you
> could take the opportunity to upgrade lucene at the same time!
>
>
> --
> Ian.
>
>
> On Fri, Apr 8, 2011 at 3:44 PM, Chris Bamford
> <chris.bamf...@talktalk.net> wrote:
>> Hi,
>>
>> I recently discovered that I need to add a single field to every document in
>> an existing (very large) index. Reindexing from scratch is not an option I
>> want to consider right now, so I wrote a utility to add the field by
>> rewriting the index - but this seemed to lose some of the fields (indexed,
>> but not stored?). In fact, it shrunk a 12Gb index down to 4.2Gb - clearly
>> not what I wanted. :-)
>> What am I doing wrong?
>>
>> My technique was:
>>
>> Analyzer analyser = new StandardAnalyzer();
>> IndexSearcher searcher = new IndexSearcher(indexPath);
>> IndexWriter indexWriter = new IndexWriter(indexPath, analyser);
>> Hits hits = matchAllDocumentsFromIndex(searcher);
>>
>> for (int i=0; i < hits.length(); i++) {
>>     Document doc = hits.doc(i);
>>     String id = doc.get("unique-id");
>>     doc.add(new Field("newField", newValue, Field.Store.YES,
>>         Field.Index.UN_TOKENIZED));
>>     indexWriter.updateDocument(new Term("unique-id", id), doc);
>> }
>>
>> searcher.close();
>> indexWriter.optimize();
>> indexWriter.close();
>>
>> Note that my matchAllDocumentsFromIndex() does get the right number of hits
>> from the index - i.e. the same number as held in the index.
>>
>>
>> Thanks for any ideas!
>> BTW I am using Lucene 2.3.2.
>>
>> - Chris
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org